<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1018"> <Title>Consonant Spreading in Arabic Stems</Title> <Section position="7" start_page="121" end_page="122" type="concl"> <SectionTitle> 6 System Status </SectionTitle> <Paragraph position="0"> The current morphological analyzer is based on dictionaries and rules licensed from an earlier project at ALPNET (Beesley, 1990), rebuilt completely using Xerox finite-state technology (Beesley, 1996; Beesley, 1998a). The current dictionaries contain 4930 roots, each one hand-coded to indicate the subset of patterns with which it legally combines (Buckwalter, 1990). Roots and patterns are intersected (Beesley, 1998b) at compile time to yield 90,000 stems. Various combinations of prefixes and suffixes, concatenated to the stems, yield over 72,000,000 abstract words. Sixty-six finite-state variation rules map these abstract strings into fully-voweled orthographical strings, and additional trivial rules are then applied to optionally delete short vowels and other diacritics, allowing the system to analyze unvoweled, partially voweled, and fully-voweled orthographical strings.</Paragraph> <Paragraph position="1"> The full system, including a Java interface that displays both input and output in Arabic script, is available for testing on the Internet at http://www.xrce.xerox.com/research/mltt/arabic/.</Paragraph> </Section> </Paper>