<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-1071">
<Title>Two Parsing Algorithms by Means of Finite State Transducers</Title>
<Section position="3" start_page="0" end_page="431" type="metho">
<SectionTitle> PRINCIPLES </SectionTitle>
<Paragraph position="0"> The concept of Finite-State Transducer. The basic concept here, since we not only match but also add markers, is the concept of finite-state transducer. This device has already proved very efficient in morphological analysis [8]. It can deal with very large amounts of data, namely morphological dictionaries containing more than 500,000 entries.</Paragraph>
<Paragraph position="1"> A finite-state transducer is simply an FSA except that, while following a path, symbols are emitted. A finite-state transducer can also simply be seen as a graph where the vertices, called states, are linked through oriented arrows, called transitions. The transitions are labeled by pairs (input_label, output_label). Footnote: By lexical rule we basically mean a sentence structure, as for example Nhum say to Nhum that S, where Nhum and S respectively stand for human nominal and sentence. Thus the rules we deal with can roughly be seen as sentence structures where at least one element is lexical. This will be developed in section .</Paragraph>
<Paragraph position="2"> The parser in terms of rational transduction. In our parser, the grammar is a rational transduction f, represented by a transducer T. The input of the parser is the set s_0 containing as its only element the input sentence bounded by the phrase marker [P], i.e.</Paragraph>
<Paragraph position="3"> s_0 = {[P] sentence [P]}. The analysis consists in computing s_1 = f(s_0), s_2 = f(s_1), and so on until a fixed point is reached, i.e. s_p = f(s_p).
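The iteration just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: sets of bracketed strings stand in for the DAGs, the two regular-expression rules in TOY_RULES are hypothetical stand-ins for a real grammar transducer, and the names toy_f and fixed_point are not taken from the paper. A string matched by no rule is mapped to itself, so the loop stops as soon as f(s) equals s.

```python
import re
from typing import Callable, FrozenSet

StringSet = FrozenSet[str]

# Hypothetical lexical rules (pattern, replacement); not the paper's grammar.
TOY_RULES = [
    (r"\[P\] (\w+) said that (.+) \[P\]",
     r"(\1)N0 (said)V0 (that [P] \2 [P] )ThatS"),
    (r"\[P\] (\w+) left \[P\]", r"(\1)N0 (left)V0"),
]


def toy_f(strings: StringSet) -> StringSet:
    """One parsing step: rewrite every string with every rule that matches it."""
    out = set()
    for s in strings:
        rewritten = [re.sub(pat, rep, s, count=1)
                     for pat, rep in TOY_RULES if re.search(pat, s)]
        # Strings outside the domain of all rules are kept unchanged.
        out.update(rewritten if rewritten else [s])
    return frozenset(out)


def fixed_point(f: Callable[[StringSet], StringSet],
                s0: StringSet, max_steps: int = 100) -> StringSet:
    """Compute s1 = f(s0), s2 = f(s1), ... until f(s) == s."""
    current = s0
    for _ in range(max_steps):
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt
    raise RuntimeError("no fixed point reached within max_steps")


s0 = frozenset({"[P] John said that Mary left [P]"})
print(sorted(fixed_point(toy_f, s0)))
```

With these toy rules the embedded clause is rewritten one level per step. The max_steps bound is a safeguard added for this sketch, since an arbitrary function on string sets need not reach a fixed point.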
The set s_p contains trees represented by bracketed strings; it is the set of grammatical analyses of the sentence, and it contains more than one element when the input is syntactically ambiguous. Each set s_i is represented by a Directed Acyclic Graph (DAG) A_i; thus the computation consists in applying the transducer T to the DAGs A_i.</Paragraph>
<Paragraph position="4"> We shall write this A_{i+1} = T(A_i).</Paragraph>
<Paragraph position="5"> In the next section we give two complete examples of that.</Paragraph>
</Section>
<Section position="4" start_page="431" end_page="431" type="metho">
<SectionTitle> TWO SIMPLE EXAMPLES </SectionTitle>
<Paragraph position="0"> An example of a Top-Down analysis. The graph in figure 1 describes the analysis of the sentence s_1 = John said that Mary left. The graph in this figure has to be read in the following way: the input sentence is represented by the DAG A_1 in the upper left corner; the subset of the grammar required for the analysis of this sentence is the transducer f on the right-hand side of figure 1. The analysis is then computed in the following way: we apply the transducer f to A_1, that is, we compute A_2 = f(A_1); this represents one step of a Top-Down analysis of the sentence. The box with a star inside represents this operation, namely applying a transducer to a DAG. If we then apply f to this result (i.e. A_2), we obtain A_3 = f(A_2) = f^2(A_1), represented under A_2. If this operation is applied once more, one gets A_4 = f(A_3) = f^3(A_1). This last result, A_4, is a fixed point of the transducer f, i.e. f(A_4) = A_4. A_4 is a DAG that represents a finite set PSet(A_4) of strings. Here, this set contains only one element, namely PSet(A_4) = { (John)N0 (said)V0 (that (Mary)N0 (left)V0 )ThatS }. Each element is a bracketed representation of an analysis.
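As a rough illustration of how a DAG stands for a finite set of strings, the sketch below enumerates the label sequences of all paths from the initial state to a final state. The dictionary encoding of the DAG, the function name pset and the toy labels are assumptions made for this example, not the paper's data structures.

```python
from typing import Dict, List, Set, Tuple

# Assumed encoding: state -> list of (label, next_state) transitions.
Dag = Dict[int, List[Tuple[str, int]]]


def pset(dag: Dag, initial: int, finals: Set[int]) -> Set[str]:
    """Return the strings spelled by all paths from initial to a final state."""
    results = set()
    stack = [(initial, [])]
    while stack:  # terminates because the graph is acyclic
        state, labels = stack.pop()
        if state in finals:
            results.add(" ".join(labels))
        for label, nxt in dag.get(state, []):
            stack.append((nxt, labels + [label]))
    return results


# A toy DAG with two paths, i.e. two strings in its set.
toy_dag = {0: [("a", 1)], 1: [("b", 2), ("c", 2)], 2: []}
print(sorted(pset(toy_dag, 0, {2})))
```

A DAG with a single path from the initial state to a final state yields a one-element set, as in the example of figure 1.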
Here the analysis is unique.</Paragraph>
<Paragraph position="1"> An example of a simultaneous Top-Down Bottom-Up analysis. The previous example might give the impression that computing a fixed point of a transducer automatically leads to simulating a top-down context-free analysis.</Paragraph>
<Paragraph position="2"> However, we shall now see that the flexibility of manipulating transducers, namely being able to compute the composition and the union of two transducers, allows a context-sensitive parsing which is simultaneously Top-Down and Bottom-Up, with the possibility of choosing which kind of rule should be parsed Bottom-Up. Suppose one wants to analyze the sentence s_2 = Max bought a little bit more than five hundred share certificates. Suppose one has the following small functions, each one being specialized in the analysis of an atomic fact (i.e. each function is a lexical rule):
* f_1 : w a little bit more than w' → w (pred a little bit more than pred) w' ; w, w' ∈ A+
* f_2 : w five hundred w' → w (num five hundred num) w' where w ∈ A* and w' ∈ A* - {NUMERAL}
* f_3 : w share certificates w' → w (cn share certificates cn) w' where w, w' ∈ A*
* f_4 : [P] w bought w' [P] → [N w N] bought [N w' N] where w, w' ∈ A+
* f_5 : w [N Max N] w' → w Max w' ; w, w' ∈ A*
* f_6 : w_1 [N (pred w_2 pred) (num w_3 num) (cn w_4 cn) N] w_5 → w_1 (N w_2 w_3 w_4 N) w_5 where w_1, w_2, w_3, w_4, w_5 ∈ A*
* f_7 : w → w ; w ∈ A* - Dom(f_1 ∪ f_2 ∪ f_3 ∪ f_4 ∪ f_5) (footnote 4)
If we precompute the transducer representing the rational transduction f = (f_4 ∘ f_3 ∘ f_2 ∘ f_1) ∪ (f_5 ∘ f_6) ∪ f_7, then the analysis of the sentence is a two-step application of f, namely f([P] Max bought a little bit more than five hundred share certificates [P]) = [N Max N] bought [N (pred a little bit more than pred) (num five hundred num) (cn share certificates cn) N]</Paragraph>
<Paragraph position="4"> than five hundred share certificates N) which is the analysis of the
sentence (footnote 5).</Paragraph>
</Section>
<Section position="5" start_page="431" end_page="432" type="metho">
<SectionTitle> FORMAL DESCRIPTION </SectionTitle>
<Paragraph position="0"> The algorithm. Formally, a transducer T is defined by a 6-tuple (A, Q, i, F, d, δ) where A is a finite alphabet, Q is a finite set of states, i ∈ Q is the initial state, F ⊂ Q is the set of terminal states, d, the transition function, maps Q × A to the set of subsets of Q, and δ, the emission function, maps Q × A × Q to A.</Paragraph>
<Paragraph position="1"> The core of the procedure consists in applying a transducer to an FSA; the algorithm is well known, and we give it here for the sake of readability.</Paragraph>
<Paragraph position="3"> Footnote 3: Here f_2 simulates a context-sensitive analysis because of w' ∈ A+ - {NUMERAL}. Footnote 4: Dom(f) stands for the domain of f.</Paragraph>
<Paragraph position="4"> Footnote 5: Note that it is always possible to keep more information along the analysis and to keep track, for instance, of the position of the determiners.</Paragraph>
<Paragraph position="5"> It should be pointed out that, given a Context-Free Grammar, it is always possible to build a transducer such that this method applies. In other words, any context-free grammar can be translated into a transducer such that the algorithm parses the language described by this grammar. Moreover, the operation that transforms a CFG into its related transducer is itself a rational transduction. Although this cannot be developed here due to the lack of space, this result comes naturally when looking at the example of section 3.1.</Paragraph>
<Paragraph position="6"> Moreover, the method has much more expressive power than CFGs; in fact, computing a fixed point of a rational transduction has the same power as applying a Turing Machine (although there might not be
any practical interest for that).</Paragraph>
</Section>
<Section position="6" start_page="432" end_page="432" type="metho">
<SectionTitle> THE SECOND ALGORITHM: A DETERMINISTIC DEVICE </SectionTitle>
<Paragraph position="0"> Given a transducer representing the grammar, there are two different ways of obtaining new parsing programs.</Paragraph>
<Paragraph position="1"> The first solution is to build a transducer T' equivalent to T from the viewpoint of their fixed points, T ≡_fixed-point T'. Namely, T ≡_fixed-point T' iff for each x ∈ A*, T(x) = x ⇔ T'(x) = x. For instance, if T is such that for each x ∈ A*, T^n(x) converges, then T^2 ≡_fixed-point T. The second approach is to try using a different representation of T or to apply it differently. In this section, we shall give an algorithm illustrating this second approach. The basic idea is to transform the finite-state transducer into a deterministic device called a bimachine [1]. We will detail that later but, basically, a bimachine stands for a left sequential function (i.e. deterministic from left to right) composed with a right sequential function (i.e. deterministic from right to left). Such a decomposition always exists.</Paragraph>
<Paragraph position="2"> The interest of this concept appears when one looks at how the algorithm ApplyTransducer performs.</Paragraph>
<Paragraph position="3"> In fact, the output DAG of this algorithm has a lot of states that lead to nothing, i.e.
states that are not coaccessible; thus the PRUNE function (called on line 14 of the ApplyTransducer function) has to remove most of the states (around 90% in our parser of French).</Paragraph>
<Paragraph position="4"> Let us for instance consider the following example: suppose the transducer T_a is the one represented in figure 2 and that we want to compute T_a(A) where A is The PRUNE function then has to remove 3 of the six states to give the DAG A&quot; of figure 3. A way to avoid the overhead of computing unnecessary states is to first apply a left sequential transducer T_aa (that is, a transducer deterministic in terms of its input when read from left to right), given in figure 4, and then apply a right sequential transducer T_ab (i.e. deterministic in terms of its input when read from right to left), given in figure 4. We shall call the pair B_a = (T_aa, T_ab) the bimachine functionally equivalent to T_a, i.e. B_a ≡_function T_a. With the same input A we first obtain A_a = T_aa(A) of figure 5 and then A_b =</Paragraph>
<Paragraph position="6"> It should be pointed out that both T_aa and T_ab are deterministic in terms of their input, i.e. their left labels, which was not the case for T_a. Just like for FSAs, the fact that a device is deterministic implies that it can be applied faster (and sometimes much faster) than nondeterministic devices; on the other hand, the size of the bimachine might be, in the worst case, exponential in terms of the size of the original transducer. The following algorithm formalizes the analysis by means of a bimachine (footnote 7).</Paragraph>
</Section>
<Section position="7" start_page="432" end_page="434" type="metho">
<SectionTitle> IMPLEMENTATION AND RESULTS </SectionTitle>
<Paragraph position="0"> The main motivation for this work comes from the linguistic claim that the syntactic rules, roughly the sentence structures, are mostly lexical.
The grammar of French we had at our disposal was so large that none of the available parsers could handle it.</Paragraph>
<Paragraph position="1"> Although the implemented part of the grammar is still incomplete, it already describes 2,878 sentential verbs (coming from [6]), that is, verbs that can take a sentence as argument, leading to 201,722 lexical rules; 1,359 intransitive verbs [2] leading to 3,153 lexical rules; 2,109 transitive verbs [3] leading to 9,785 lexical rules; 2,920 frozen expressions (coming from [7]) leading to 9,342 lexical rules; and 1,213 partly frozen adverbials leading to 5,032 lexical rules. Thus, the grammar describes 10,479 entries and 229,035 lexical rules. Footnote 7: The FSA reverse(A) is A where the transitions have been reversed and the initial and final states exchanged.</Paragraph>
<Paragraph position="2"> Footnote: For a verb like étonner, the set of rules includes Nhum0 étonner Nhum1 as well as Nhum0 avoir étonné Nhum1, Nhum0 être étonné par Nhum1 or Nhum0 s'étonne auprès de Nhum1 de ce QuP2, which gives an idea of how these complex verbs generate an average of 100 rules, or sentence structures, even if no embedding is involved at this stage.</Paragraph>
<Paragraph position="3"> This grammar is represented by one transducer of 13,408 states and 47,119 transitions stored in roughly 900 Kb. The following input: Jean est agacé par le fait que son ami, dans la crainte d'être puni par ses parents, ne leur ait pas avoué ses mauvaises notes.</Paragraph>
<Paragraph position="4"> is parsed in the following way in 0.95s with a program implementing the algorithm ANALYSE_I.</Paragraph>
<Paragraph position="5"> (N Jean )N est &amp;Vpp0 agacé par le_fait_QuP le fait (QuP que (N son n ami ami )N , (ADV dans la crainte de
(V0W N0 être &amp;Vpp0 puni par (N ses n parent parents )N V0W) ADV) , leur &amp;Nhum2 avoir ait (op ne -pas op) &amp;Vpp0 avoué (N ses mauvaises n note notes )N QuP) Typical time spending varies from 0.05 second for a ten-word sentence to 5 seconds for a hundred-word sentence under the current implementation. A key point about this method is that the time spending is quite insensitive to the size of the grammar; this is crucial for scaling up the program to much larger grammars. For instance, the preceding example is analyzed in 0.93s (instead of 0.95s) for a grammar of half its size (around 100,000 lexical rules).</Paragraph>
<Paragraph position="6"> The coverage of this grammar still has to be extended; not all data we had at our disposal are yet encoded in the transducer (around 50% remain). Thus, given an arbitrary text, whereas most of the simple short sentences (five to fifteen words) are analyzed, the probability of having all lexical descriptions for longer sentences decreases rapidly. However, since all the lexical rules have been checked by hand one by one, the accuracy of the analysis is higher than what can be expected with less lexicalized grammars. This means two things:
* whenever the analysis is found, and unless the sentence is syntactically ambiguous, the analysis is unique;
* incorrect sentences are systematically rejected.
Thus the set of sentences defined by the parser is a subset of the set of correct sentences. This property is very difficult to achieve through non-lexicalized or less lexicalized grammars.</Paragraph>
</Section>
</Paper>