<?xml version="1.0" standalone="yes"?> <Paper uid="P97-1042"> <Title>Compiling Regular Formalisms with Rule Features into Finite-State Automata</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The past few years have witnessed an increased interest in applying finite-state methods to language and speech problems. This in turn has generated interest in devising algorithms for compiling rules which describe regular languages/relations into finite-state automata.</Paragraph> <Paragraph position="1"> It has long been proposed that regular formalisms (e.g., rewrite rules, two-level formalisms) accommodate rule features which provide for finer and more elegant descriptions (Bear, 1988). Without such a mechanism, writing complex grammars (say two-level grammars for Syriac or Arabic morphology) would be difficult, if not impossible. Algorithms which compile regular grammars into automata (Kaplan and Kay, 1994; Mohri and Sproat, 1996; Grimley-Evans, Kiraz, and Pulman, 1996) do not make use of this important mechanism. This paper presents a method for incorporating rule features in the resulting automata.</Paragraph> <Paragraph position="2"> The following Syriac example is used here, with the infamous Semitic root {ktb} 'notion of writing'. The verbal pa&quot;el measure, /katteb/ 'wrote' (on Syriac spirantization, see (Kiraz, 1995)), is derived from the following</Paragraph> <Paragraph position="3"> morphemes: the pattern {cvcvc} 'verbal pattern', the above-mentioned root, and the vocalism {ae} 'ACTIVE'. The morphemes produce the underlying form */kateb/, in which the vocalism {ae} fills the V slots and the root {ktb} fills the C slots of the pattern C V C V C. /katteb/ is then derived by the gemination, implying CAUSATIVE, of the middle consonant, [t]. The current work assumes knowledge of regular relations (Kaplan and Kay, 1994). The following convention has been adopted.
Lexical forms (e.g., morphemes in morphology) appear in braces, { }, phonological segments in square brackets, [ ], and elements of tuples in angle brackets, ⟨ ⟩.</Paragraph> <Paragraph position="4"> Section 2 describes a regular formalism with rule features. Section 3 introduces a number of mathematical operators used in the compilation process. Sections 4 and 5 present our algorithm. Finally, Section 6 provides an evaluation and some concluding remarks.</Paragraph> </Section> </Paper>