<?xml version="1.0" standalone="yes"?> <Paper uid="E99-1020"> <Title>Tabular Algorithms for TAG Parsing</Title> <Section position="4" start_page="150" end_page="152" type="metho"> <SectionTitle> 2 A CYK-like Algorithm </SectionTitle> <Paragraph position="0"> We have chosen the CYK-like algorithm for TAG described in (Vijay-Shanker and Joshi, 1985) as our starting point. Due to the intrinsic limitations of this pure bottom-up algorithm, the grammars it can deal with are restricted to those whose nodes have at most two children.</Paragraph> <Paragraph position="1"> The tabular interpretation of this algorithm works with items of the form $[N^\gamma, i, j \mid p, q \mid adj]$ such that $N^\gamma \Rightarrow^* a_{i+1} \ldots a_p \, F^\gamma \, a_{q+1} \ldots a_j \Rightarrow^* a_{i+1} \ldots a_j$ if and only if $(p, q) \neq (-, -)$, and $N^\gamma \Rightarrow^* a_{i+1} \ldots a_j$ if and only if $(p, q) = (-, -)$, where $N^\gamma$ is a node of an elementary tree with a label belonging to $V_N$.</Paragraph> <Paragraph position="2"> The two indices $i$ and $j$ with respect to the input string indicate the portion of the input string that has been derived from $N^\gamma$. If $\gamma \in A$, $p$ and $q$ are two indices with respect to the input string that indicate the part of the input string recognized by the foot node of $\gamma$. Otherwise $p = q = -$, representing that they are undefined. The element $adj$ indicates whether adjunction has taken place on node $N^\gamma$.</Paragraph> <Paragraph position="3"> The introduction of the element $adj$, taking its value from the set $\{true, false\}$, corrects the items previously proposed for this kind of algorithm in (Vijay-Shanker and Joshi, 1985) in order to avoid several adjunctions on a node. A value of $true$ indicates that an adjunction has taken place on the node $N^\gamma$ and therefore further adjunctions on the same node are forbidden. A value of $false$ indicates that no adjunction was performed on that node. In this case, during future processing this item can play the role of the item recognizing the excised part of an elementary tree to be attached to the foot node of an auxiliary tree. 
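The effect of the $adj$ element can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Item` and `adjoin` are hypothetical, and the test that the auxiliary tree may be adjoined on the node is left out. The function mirrors the Adj step of Schema 1: the foot span of the auxiliary root item must equal the span of the node item, and the node item must carry $adj = false$.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Item:
    """Item [N, i, j | p, q | adj]; the field names are hypothetical."""
    node: str
    i: int
    j: int
    foot: Optional[Tuple[int, int]]  # (p, q), or None when p = q = -
    adj: bool


def adjoin(root: Item, node: Item) -> Optional[Item]:
    """Combine the root item of an auxiliary tree with an adjunction-node item."""
    if node.adj:
        return None  # an adjunction already took place on this node
    if root.foot != (node.i, node.j):
        return None  # the foot span must equal the span derived by the node
    # The node now spans what the auxiliary root spans; adj becomes true.
    return Item(node.node, root.i, root.j, node.foot, True)


root = Item("R_beta", 0, 5, (1, 3), False)
node = Item("N_gamma", 1, 3, None, False)
once = adjoin(root, node)    # first adjunction succeeds
outer = Item("R_beta", 0, 6, (0, 5), False)
twice = adjoin(outer, once)  # rejected: adj is already true
```

The second call returns `None` because the item produced by the first adjunction carries $adj = true$.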
As a consequence, only one adjunction can take place on an elementary node, as is prescribed by the tree adjoining grammar formalism (Schabes and Shieber, 1994). As an additional advantage, the algorithm does not need the restriction that every auxiliary tree must have at least one terminal symbol in its frontier (Vijay-Shanker and Joshi, 1985).</Paragraph> <Paragraph position="4"> Schema 1 The parsing system $\mathbb{P}_{CYK}$ corresponding to the CYK-like algorithm for a tree adjoining grammar $G$ and an input string $a_1 \ldots a_n$ is defined as follows:
$\mathcal{I}_{CYK} = \{ [N^\gamma, i, j \mid p, q \mid adj] \}$ such that $N^\gamma \in \mathcal{P}(\gamma)$, $label(N^\gamma) \in V_N$, $\gamma \in I \cup A$, $0 \le i \le j$, $(p, q) \le (i, j)$, $adj \in \{true, false\}$
$\mathcal{H}_{CYK} = \{ [a, i-1, i] \mid a = a_i, 1 \le i \le n \}$
$\mathcal{D}^{Scan}_{CYK} = \frac{[a, i-1, i]}{[N^\gamma, i-1, i \mid -, - \mid false]} \quad N^\gamma \to a$
$\mathcal{D}^{\epsilon}_{CYK} = \frac{}{[N^\gamma, i, i \mid -, - \mid false]} \quad N^\gamma \to \epsilon$
$\mathcal{D}^{Foot}_{CYK} = \frac{}{[F^\gamma, i, j \mid i, j \mid false]}$
$\mathcal{D}^{LeftDom}_{CYK} = \frac{[M^\gamma, i, k \mid p, q \mid adj], \; [P^\gamma, k, j \mid -, - \mid adj]}{[N^\gamma, i, j \mid p, q \mid false]}$ such that $N^\gamma \to M^\gamma P^\gamma \in \mathcal{P}(\gamma)$, $M^\gamma \in spine(\gamma)$
$\mathcal{D}^{RightDom}_{CYK} = \frac{[M^\gamma, i, k \mid -, - \mid adj], \; [P^\gamma, k, j \mid p, q \mid adj]}{[N^\gamma, i, j \mid p, q \mid false]}$ such that $N^\gamma \to M^\gamma P^\gamma \in \mathcal{P}(\gamma)$, $P^\gamma \in spine(\gamma)$
$\mathcal{D}^{NoDom}_{CYK} = \frac{[M^\gamma, i, k \mid -, - \mid adj], \; [P^\gamma, k, j \mid -, - \mid adj]}{[N^\gamma, i, j \mid -, - \mid false]}$ such that $N^\gamma \to M^\gamma P^\gamma \in \mathcal{P}(\gamma)$, $M^\gamma, P^\gamma \notin spine(\gamma)$
$\mathcal{D}^{Unary}_{CYK} = \frac{[M^\gamma, i, j \mid p, q \mid adj]}{[N^\gamma, i, j \mid p, q \mid false]} \quad N^\gamma \to M^\gamma \in \mathcal{P}(\gamma)$
$\mathcal{D}^{Adj}_{CYK} = \frac{[R^\beta, i', j' \mid i, j \mid adj], \; [N^\gamma, i, j \mid p, q \mid false]}{[N^\gamma, i', j' \mid p, q \mid true]}$ such that $\beta \in A$, $\beta \in adj(N^\gamma)$
$\mathcal{D}_{CYK} = \mathcal{D}^{Scan}_{CYK} \cup \mathcal{D}^{\epsilon}_{CYK} \cup \mathcal{D}^{Foot}_{CYK} \cup \mathcal{D}^{LeftDom}_{CYK} \cup \mathcal{D}^{RightDom}_{CYK} \cup \mathcal{D}^{NoDom}_{CYK} \cup \mathcal{D}^{Unary}_{CYK} \cup \mathcal{D}^{Adj}_{CYK}$
$\mathcal{F}_{CYK} = \{ [R^\alpha, 0, n \mid -, - \mid adj] \mid \alpha \in I \}$
The hypotheses defined for this parsing system are the standard ones and therefore they will be omitted in the next parsing systems described in this paper.</Paragraph> <Paragraph position="5"> The key steps in the parsing system $\mathbb{P}_{CYK}$ are $\mathcal{D}^{Foot}_{CYK}$ and $\mathcal{D}^{Adj}_{CYK}$, which are in charge of the recognition of adjunctions. The other steps are in charge of the bottom-up traversal of elementary trees and, in the case of auxiliary trees, the propagation of the information corresponding to the part of the input string recognized by the foot node.</Paragraph> <Paragraph position="6"> The set of deductive steps $\mathcal{D}^{Foot}_{CYK}$ makes it possible to start the bottom-up traversal of each auxiliary tree, as it predicts all possible parts of the input string that can be recognized by the foot nodes. Several parses can exist for an auxiliary tree which differ only in the part of the input string that was predicted for the foot node. Not all of them need take part in a derivation, only those with a predicted foot compatible with an adjunction. The compatibility between the adjunction node and the foot node of the adjoined tree is checked by a deductive step in $\mathcal{D}^{Adj}_{CYK}$: when the root of an auxiliary tree $\beta$ has been reached, it checks for the existence of a subtree of an elementary tree rooted by a node $N^\gamma$ which satisfies the following conditions: 1. $\beta$ can be adjoined on $N^\gamma$. 2. 
$N^\gamma$ derives the same part of the input string as that derived from the foot node of $\beta$.</Paragraph> <Paragraph position="7"> If the conditions are satisfied, further adjunctions on $N^\gamma$ are forbidden and the parsing process continues a bottom-up traversal of the rest of the elementary tree $\gamma$ containing $N^\gamma$.</Paragraph> </Section> <Section position="5" start_page="152" end_page="152" type="metho"> <SectionTitle> 3 A Bottom-up Earley-like Algorithm </SectionTitle> <Paragraph position="0"> To overcome the limitation of binary branching in trees imposed by CYK-like algorithms, we define a bottom-up Earley-like parsing algorithm for TAG.</Paragraph> <Paragraph position="1"> As a first step we need to introduce dotted rules into items, which are of the form $[N^\gamma \to \delta \bullet \nu, i, j \mid p, q]$ such that $\delta \Rightarrow^* a_{i+1} \ldots a_p \, F^\gamma \, a_{q+1} \ldots a_j \Rightarrow^* a_{i+1} \ldots a_j$ if and only if $(p, q) \neq (-, -)$, and $\delta \Rightarrow^* a_{i+1} \ldots a_j$ if and only if $(p, q) = (-, -)$.</Paragraph> <Paragraph position="2"> The items of the new parsing schema, denoted $buE_1$, are obtained by refining the items of CYK.</Paragraph> <Paragraph position="3"> The dotted rules eliminate the need for the element $adj$ indicating whether the node in the left-hand side of the production has been used as adjunction node.</Paragraph> <Paragraph position="4"> Schema 2 The parsing system $\mathbb{P}_{buE_1}$ corresponding to the bottom-up Earley-like parsing algorithm, given a tree adjoining grammar $G$ and an input string $a_1 \ldots a_n$, is defined as follows:</Paragraph> <Paragraph position="6"> The deduction steps of $\mathbb{P}_{buE_1}$ are obtained from the steps in $\mathbb{P}_{CYK}$ by applying the following refinement: * LeftDom, RightDom and NoDom deductive steps have been split into steps Init and Comp.</Paragraph> <Paragraph position="7"> * Unary and $\epsilon$ steps are no longer necessary, due to the uniform treatment of all productions independently of the length of the production. 
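The split into Init and Comp can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the schema itself: the names `Dotted`, `init` and `comp` are hypothetical, the foot indices $p, q$ are left out, and adjunction is ignored, so the items behave like plain context-free dotted rules. It shows why one pair of steps covers productions of any length, including empty and unary ones.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Dotted:
    """Dotted item [lhs -> rhs with a dot, i, j]; foot indices are omitted."""
    lhs: str
    rhs: Tuple[str, ...]
    dot: int
    i: int
    j: int

    def complete(self) -> bool:
        return self.dot == len(self.rhs)


def init(lhs: str, rhs: Sequence[str], i: int) -> Dotted:
    """Init: start recognizing a production at position i, dot leftmost."""
    return Dotted(lhs, tuple(rhs), 0, i, i)


def comp(active: Dotted, passive: Dotted) -> Optional[Dotted]:
    """Comp: advance the dot of `active` over a completed, adjacent child."""
    if active.complete() or not passive.complete():
        return None
    if active.rhs[active.dot] != passive.lhs or active.j != passive.i:
        return None
    return Dotted(active.lhs, active.rhs, active.dot + 1, active.i, passive.j)


# A production with three children needs no binarization.
item = init("N", ["A", "B", "C"], 0)
for child, (k, j) in zip("ABC", [(0, 1), (1, 3), (3, 4)]):
    item = comp(item, Dotted(child, ("x",), 1, k, j))

# An empty production is complete as soon as Init creates it.
empty = init("N", [], 2)
```

After the loop, `item` is a completed item for `N` spanning positions 0 to 4, and `empty` is already complete, which is why no separate steps for empty or unary productions are needed in this sketch.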
The algorithm performs a bottom-up recognition of the auxiliary trees by applying the steps $\mathcal{D}^{Comp}_{buE_1}$. During the traversal of auxiliary trees, information about the part of the input string recognized by the foot is propagated bottom-up. A set of deductive steps $\mathcal{D}^{Init}_{buE_1}$ is in charge of starting the recognition process, predicting all possible start positions for each rule.</Paragraph> <Paragraph position="8"> A filter has been applied to the parsing system $\mathbb{P}_{CYK}$, contracting the deductive steps Adj and Comp into a single AdjComp, as the item generated by a deductive step Adj can only be used to advance the dot in the rule which has been used to predict the left-hand side of its production.</Paragraph> </Section> <Section position="6" start_page="152" end_page="153" type="metho"> <SectionTitle> 4 An Earley-like Algorithm </SectionTitle> <Paragraph position="0"> An Earley-like parsing algorithm for TAG can be obtained by incorporating top-down prediction.</Paragraph> <Paragraph position="1"> To do so, two dynamic filters must be applied to $\mathbb{P}_{buE_1}$: * The deductive steps in $\mathcal{D}^{Init}_{E}$ will only consider productions having the root of an initial tree as left-hand side.</Paragraph> <Paragraph position="2"> * A new set $\mathcal{D}^{Pred}_{E}$ of predictive steps will be in charge of controlling the generation of new items, considering only those new items which are potentially useful for the parsing process.</Paragraph> <Paragraph position="3"> Schema 3 The parsing system $\mathbb{P}_{E}$ corresponding to an Earley-like parsing algorithm for TAG without the valid prefix property, given a tree adjoining grammar $G$ and an input string $a_1 \ldots a_n$, is defined as follows:</Paragraph> <Paragraph position="5"/> <Paragraph position="7"> $\mathcal{D}_{E} = \mathcal{D}^{Init}_{E} \cup \mathcal{D}^{Scan}_{buE_1} \cup \mathcal{D}^{Pred}_{E} \cup \mathcal{D}^{Comp}_{E} \cup \mathcal{D}^{AdjPred}_{E} \cup \mathcal{D}^{FootPred}_{E} \cup \mathcal{D}^{FootComp}_{E} \cup \ldots$</Paragraph> <Paragraph position="9"> Parsing begins by creating the item corresponding to a production having the root of an initial tree as left-hand side and the dot in the leftmost position of the right-hand side. Then, the sets of deductive steps $\mathcal{D}^{Pred}_{E}$ and $\mathcal{D}^{Comp}_{E}$ traverse each elementary tree. A step in $\mathcal{D}^{AdjPred}_{E}$ predicts the adjunction of an auxiliary tree $\beta$ in a node of an elementary tree $\gamma$ and starts the traversal of $\beta$. Once the foot of $\beta$ has been reached, the traversal of $\beta$ is momentarily suspended by a step in $\mathcal{D}^{FootPred}_{E}$, which takes up the subtree of $\gamma$ that must be attached to the foot of $\beta$. At this moment, there is no information available about the node in which the adjunction of $\beta$ has been performed, so all possible nodes are predicted. When the traversal of a predicted subtree has finished, a step in $\mathcal{D}^{FootComp}_{E}$ resumes the traversal of $\beta$, continuing at the foot node. When the traversal of $\beta$ is completely finished, a deduction step in $\mathcal{D}^{AdjComp}_{E}$ checks whether the subtree attached to the foot of $\beta$ corresponds with the adjunction node. With respect to steps in $\mathcal{D}^{AdjComp}_{E}$, $p$ and $q$ are instantiated if and only if the adjunction node is in the spine of $\gamma$.</Paragraph> </Section> </Paper>