File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/94/c94-1071_intro.xml
Size: 3,158 bytes
Last Modified: 2025-10-06 14:05:36
<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1071"> <Title>Two Parsing Algorithms by Means of Finite State Transducers</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> INTRODUCTION </SectionTitle> <Paragraph position="0"> Finite state devices have recently attracted a lot of interest in computational linguistics. Computational efficiency has been drastically improved for morphological analysis by representing large dictionaries with Finite State Automata (FSA) and by representing two-level rules and lexical information with finite-state transducers [8, 4]. More recently, [11] has achieved parsing with low level lexical sensitivity by means of finite state automata. Finite state approximation of context-free grammars also proved both useful and efficient for certain applications [9].</Paragraph> <Paragraph position="1"> One common motivation of all this work is to improve efficiency dramatically, both in terms of time and space. These results often provide programs orders of magnitude faster than more traditional implementations. Moreover, FSAs are a natural way to express lexical sensitivity, which has always been a requirement in morphology and which has proved crucial in syntax. The grammar we used for French, called Lexicon-Grammar (see [6] [7] [2] [3] [10] for instance), pushes the lexicalization very far and it is our belief that this lexicalization trend will amplify itself and that it will result in grammars several orders of magnitude larger than today's representations. 
This uncovers the need for new methods that will be able to handle such large scale grammars. *Supported by a DRET-Ecole Polytechnique contract, this work was done at the Institut Gaspard Monge and at the LADL.</Paragraph> <Paragraph position="2"> However, a main drawback of the finite state approach to syntax is the difficulty of representing hierarchical data; this partly explains why FSA-based programs only do incomplete parsing. This paper presents a new parsing approach based on finite-state transducers, a device that has already been used in morphology [8] but not yet in syntax, and that provides both hierarchical representations and efficiency in a simple and natural way. The representation is very compact, which makes it possible to implement large lexical grammars.</Paragraph> <Paragraph position="3"> Two new parsing algorithms illustrate the approach presented here. The first one uses a finite state transducer and computes a fixed point. Finite state transducers, unlike FSAs, cannot in general be made deterministic; however, a bidirectional device called a bimachine [1] can indirectly make them deterministic. This leads to the second algorithm presented here. The very high efficiency of this approach can be seen in the experiments on French. Sentences can be parsed with a grammar containing more than 200,000 lexical rules; this grammar is, we think, the largest grammar ever implemented.</Paragraph> </Section> </Paper>