<?xml version="1.0" standalone="yes"?> <Paper uid="C96-2160"> <Title>Computing Phrasal-signs in HPSG prior to Parsing</Title> <Section position="4" start_page="949" end_page="953" type="metho"> <SectionTitle> 3 Lexical Entry Automata </SectionTitle> <Paragraph position="0"> This section presents a Lexical Entry Automaton (LA). The inefficiency of parsing in HPSG is due to the fact that what kind of constituents phrasal-signs will become is invisible until the whole sequence of applications of rule schemata is completed. Consider the parse tree in Figure 3. The phrasal-signs S1 and S2 are invisible until a parser creates the feature structures describing them, using expensive unification.</Paragraph> <Paragraph position="1"> Our parsing method avoids this on-line construction of phrasal-signs by computing the skeletal part of parse trees prior to parsing. In Figure 3, our compiler generates S1 and S2 only from the lexical entry &quot;wrote,&quot; without specifying the non-head daughters indicated by the triangles in Figure 3. Since the non-head daughters are token-identical with subcat values of the lexical entry for &quot;wrote&quot;, the obtained skeletal parse tree contains the information that S1 takes a noun phrase as its object and S2 selects another noun phrase.</Paragraph> <Paragraph position="2"> Parsing can then be done by unifying those non-head daughters with actual signs constructed from the input. An LA expresses a set of such skeletal parse trees. A state in an LA corresponds to a phrasal-sign such as S1 and S2. These are called core-structures. A transition arc is a domination link between a phrasal-sign and its head daughter, and its condition for transition on input is a non-head daughter, such as the signs tagged [1] and [2] in Figure 3. Kasper et al. presented an idea similar to this offline raising in their work on an HPSG-TAG compiler (Kasper et al., 1995).
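The automaton just described can be sketched as a small data structure; this is an illustrative rendering only (the class and attribute names are not part of our formalism, and the DFS component D of each arc is omitted for brevity). States stand for core-structures and arcs for domination links labeled with their non-head-daughter condition N and rule schema R:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Arc:
    source: str       # q_d, the root state of the arc
    target: str       # q_m, the destination state
    condition: str    # N, the non-head daughter licensing the transition
    schema: str       # R, the rule schema that created the arc

@dataclass
class LexicalEntryAutomaton:
    states: set = field(default_factory=set)
    arcs: list = field(default_factory=list)
    initial: str = ""

    def successors(self, state, non_head):
        """States reachable from `state` when the adjacent constituent
        matches the arc's non-head-daughter condition."""
        return [a.target for a in self.arcs
                if a.source == state and a.condition == non_head]

# The skeletal tree for "wrote" in Figure 3, as a chain L -> S1 -> S2,
# each step consuming one NP non-head daughter.
la = LexicalEntryAutomaton(states={"L", "S1", "S2"}, initial="L")
la.arcs.append(Arc("L", "S1", "NP", "head-complement"))
la.arcs.append(Arc("S1", "S2", "NP", "head-subject"))
```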
The difference is that our algorithm is based on substitution, not adjoining. Furthermore, it is not clear in their work how offline raising is used to improve efficiency of parsing.</Paragraph> <Paragraph position="3"> Before giving the definition of LAs, we define the notion of a quasi-sign, which is part of a sign and constitutes LAs.</Paragraph> <Paragraph position="4"> Definition 3 (quasi-sign(n)) For a given integer n, a feature structure S is a quasi-sign(n) if it has some of the following four attributes: syn, sem, head-dtr, non-head-dtr, and does not have values for the paths (head-dtr + non-head-dtr)^(n+1).</Paragraph> <Paragraph position="5"> A quasi-sign(n) cannot represent a parse tree whose height is more than n, while a sign can express a parse tree of any height. Through the rest of this paper, we often extract a quasi-sign(n) S from a sign or a quasi-sign(n') S' where n < n'. This operation is denoted by S = ex(S', n).</Paragraph> <Paragraph position="6"> This means that S is equivalent to S' except for the attributes head-dtr and non-head-dtr whose root is the (head-dtr + non-head-dtr)^n value in S'. Note that S and S' are completely different entities. In other words, S and S' pose different scopes on structure-sharing tags. In addition, we also extract a feature structure F reached by a path or an attribute p in a feature structure F'.</Paragraph> <Paragraph position="7"> We denote this by F = val(F', p) and regard F and F' as different entities.</Paragraph> <Paragraph position="8"> Definition 4 (Lexical Entry Automaton (LA)) A Lexical Entry Automaton is a tuple (Q, A, q0) where: Q: a set of states, where a state is a quasi-sign(0).</Paragraph> <Paragraph position="9"> A: a set of transition arcs between states, where a transition arc is a tuple (qd, qm, N, D, R) where qd, qm
are in Q, N is a quasi-sign(0), D is a quasi-sign(1) and R is a rule schema.</Paragraph> <Paragraph position="10"> q0 : the initial state, which corresponds to a lexical entry.</Paragraph> <Paragraph position="11"> In a transition arc (qd, qm, N, D, R), qm denotes the destination of the transition arc, and qd is the root of the arc. The N is a non-head daughter of a phrasal-sign, i.e., the destination state of the transition, and expresses the input condition for the transition. The D is used to represent the dependency between the mother sign and the daughters through structure sharings. This is called a</Paragraph> <Section position="1" start_page="950" end_page="953" type="sub_section"> <SectionTitle> Dependency Feature Structure (DFS) of the </SectionTitle> <Paragraph position="0"> transition arc, the role of which will be discussed in Section 4. R is the rule schema used to create this arc.</Paragraph> <Paragraph position="1"> An LA is generated from a lexical entry l by the following recursive procedure: 1. Let S be {l}, A be an empty set and sd = l. 2. For each rule schema R, and for each of its resolution sequences (r1, ..., rn), obtain</Paragraph> <Paragraph position="3"> a feature structure D by unifying fs(R) with sd as its head-dtr value, obtain sm = ex(D, 0) and</Paragraph> <Paragraph position="5"> N = val(D, non-head-dtr). 3. If D is a feature structure, * If there is a state s'm in S such that s'm is equivalent to sm, let sm be s'm. Otherwise, add sm to S. * If there is no arc tr = (sd, sm, N'', D'', R) in A such that N is equivalent to N'' and D is equivalent to D'', then add the tuple (sd, sm, N, D, R) to A.</Paragraph> <Paragraph position="6"> 4. If the new quasi-sign(0) (sm) was added to S in the previous step, let sd be sm and go to Step 2.</Paragraph> <Paragraph position="7"> When this terminates, (S, A, l) is the LA for l. The major difference between Step 2 and the normal application of a rule schema is that non-head-dtr values are not specified in Step 2.
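The generation procedure above can be sketched as a fixpoint loop. This is a schematic illustration, not our implementation: unification of fs(R) with sd as head-dtr, and the ex/val extractions, are abstracted into a single callback `apply_schema` that returns (mother, non-head condition, DFS) or None when unification fails; the toy callback below treats categories as strings that saturate one "/NP" per application:

```python
def generate_la(lexical_entry, schemata, apply_schema):
    states = {lexical_entry}          # S, initialised to {l}
    arcs = set()                      # A
    agenda = [lexical_entry]          # states whose arcs are not yet built
    while agenda:
        s_d = agenda.pop()
        for schema in schemata:
            result = apply_schema(schema, s_d)
            if result is None:        # D is not a feature structure
                continue
            s_m, non_head, dfs = result
            if s_m not in states:     # no equivalent state yet: add it (Step 3)
                states.add(s_m)
                agenda.append(s_m)    # and return to Step 2 with s_d = s_m
            arcs.add((s_d, s_m, non_head, dfs, schema))
    return states, arcs, lexical_entry

# Toy stand-in for rule application: each step removes one "/NP",
# yielding the chain V/NP/NP -> V/NP -> V with two arcs.
def toy_apply(schema, s_d):
    return (s_d[:-3], "NP", "dfs") if s_d.endswith("/NP") else None

states, arcs, initial = generate_la("V/NP/NP", ["head-comp"], toy_apply)
```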
In spite of this underspecification, certain parts of the non-head-dtr are instantiated because they are token-identical with certain values of the head-dtr domain. By unifying non-head-dtr values with actual signs constructed from input sentences, a parser can obtain parsing results. For a more intuitive explanation, see (Torisawa and Tsujii, 1996).</Paragraph> <Paragraph position="8"> However, this simple LA generation algorithm has a termination problem. There are two potential causes of non-termination. The first is the generative capacity of the feature structure of a rule schema, i.e., a rule schema can generate an infinite variety of signs. The second is non-termination of the execution of DCPs in Step 2 because of the lack of concrete non-head daughters.</Paragraph> <Paragraph position="9"> For the first case, consider a rule schema with the following feature structure.</Paragraph> <Paragraph position="10"> [ syn [ counter (bar . [1]) ] head-dtr [ syn [ counter [1] ] ] ] This can generate an infinite sequence of signs, each of which contains a part [ counter (bar, bar, ..., bar) ] and is not equivalent to any previously generated sign. In order to resolve this difficulty, we apply the restriction (Shieber, 1985) to rule schemata and lexical entries, and split the feature structure F = fs(R) of a rule schema R, or F = l for a lexical entry l, into two parts, namely core(F) and sub(F), such that F = core(F) U sub(F).
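The core/sub split can be illustrated over feature structures encoded as nested dicts; this is a sketch under that encoding, not the actual implementation. core(F) drops every path prefixed by a path in the restriction schema rs, and sub(F) keeps exactly the dropped part, so that the two recombine into F:

```python
def restrict(fs, rs, prefix=()):
    """Split `fs` into (core, sub) under restriction schema `rs`,
    a list of paths given as attribute tuples."""
    core, sub = {}, {}
    for attr, value in fs.items():
        path = prefix + (attr,)
        if any(path[:len(r)] == r for r in rs):
            sub[attr] = value                  # eliminated node goes to sub(F)
        elif isinstance(value, dict):
            c, s = restrict(value, rs, path)
            core[attr] = c
            if s:
                sub[attr] = s
        else:
            core[attr] = value
    return core, sub

# Restricting away syn|counter, the source of the infinite sequence:
fs = {"syn": {"counter": ["bar"], "head": "verb"}}
core, sub = restrict(fs, [("syn", "counter")])
# core keeps syn|head; sub holds the potentially unbounded counter value
```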
The definition of the restriction here is given as follows.</Paragraph> <Paragraph position="11"> Definition 5 (paths) For any node n in a feature structure F, paths(n, F) is the set of all the paths that reach n from the root of F.</Paragraph> <Paragraph position="12"> Definition 6 (Restriction Schema) A restriction schema rs is a set of paths.</Paragraph> <Paragraph position="13"> Definition 7 (Res) F' = Res(F, rs) is a maximal feature structure such that each node n in F' satisfies the following conditions.</Paragraph> <Paragraph position="14"> * There is a node n0 in F such that paths(n0, F) = paths(n, F') and type(n) = type(n0).</Paragraph> <Paragraph position="15"> * For any p in paths(n, F'), there is no path pr in rs which prefixes p.</Paragraph> <Paragraph position="16"> Res eliminates the feature structure nodes which are specified by a restriction schema. For a given restriction schema rs, core(fs(R)) = Res(fs(R), rs), and sub(fs(R)) is a minimal feature structure such that core(fs(R)) U sub(fs(R)) = fs(R). The nodes eliminated by Res must appear in sub(fs(R)). In the example above, if we add (syn, counter) to the restriction schema and replace fs(R) with core(fs(R)) in the algorithm for generating LAs, the termination problem does not occur, because LAs can contain a loop and equivalent signs are reduced to one state in LAs. The sub(fs(R)) contains the syn|counter value, which is treated at Phase 2. The other problem, i.e., termination of DCPs, often occurs because of underspecification of the non-head-dtr values. Consider the rule schema in Figure 1. The append does not terminate at Phase 2 because the indices value of the non-head daughters is unspecified. (Consider the case of executing append(X, [b], Y) in Prolog.) We introduce the freeze functor of Prolog, which delays the evaluation of its second argument if the first argument is not instantiated.
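The behaviour of freeze can be mimicked outside Prolog; the following is a minimal analogue for illustration only, with variables modeled as one-slot boxes. freeze(X, goal) runs the goal immediately if X is bound, and queues it on X otherwise; binding X wakes the delayed goals:

```python
class Var:
    def __init__(self):
        self.value = None          # None means uninstantiated
        self.frozen = []           # goals waiting on this variable

    def bind(self, value):
        self.value = value
        pending, self.frozen = self.frozen, []
        for goal in pending:       # wake the delayed goals
            goal(value)

def freeze(var, goal):
    if var.value is not None:
        goal(var.value)
    else:
        var.frozen.append(goal)

# freeze(X, append(X, [b], Z)): the append is delayed until X is bound.
X, Z = Var(), Var()
freeze(X, lambda xs: Z.bind(xs + ["b"]))
assert Z.value is None             # nothing computed yet
X.bind(["a"])                      # instantiating X wakes the goal
```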
For instance, freeze(X, append(X, [b], Z)) means to delay the evaluation of append until X is instantiated. We introduce the functor in the following form.</Paragraph> <Paragraph position="17"> [ goals ( [ freeze argl [1] arg2 ... arg3 ... ] ) ] This means the resolution of this query is not performed if [1] is uninstantiated. The delayed evaluation is resumed later, when the non-head-dtr values are instantiated by an actual sign. Note that this change does not affect the discussion on the correctness of our parsing method, because the difference amounts only to a change in the order of unifications.</Paragraph> <Paragraph position="18"> Now, the two phases of our parsing algorithm can be described in more detail.</Paragraph> <Paragraph position="19"> Phase 1 : Enumerate possible parses, or edges in a chart, only with unifiability checking, in a bottom-up chart-parsing-like manner.</Paragraph> <Paragraph position="20"> Phase 2 : For completed parse trees, compute sub-structures by DFSs, sub(fs(R)) for each schema R, and frozen DCP programs.</Paragraph> <Paragraph position="21"> Note that, in Phase 1, unification is replaced with unifiability checking, which is more efficient than unification in terms of space and time. The intended side effects of unification, such as building up logical forms in sem values, are computed at Phase 2. An edge is a tuple (m, n, PS, Dep), where m and n are positions in the input, PS is a quasi-sign(0) and Dep is a set of tuples (D, eh, en, R) where eh and en are edges, D is a quasi-sign(1) and R is a rule schema.</Paragraph> <Paragraph position="22"> The intuition behind this definition is: * PS plays the role of a non-terminal in CFG, though it is actually a quasi-sign(0).</Paragraph> <Paragraph position="23"> * eh and en denote a head daughter edge and a non-head daughter edge, respectively.</Paragraph> <Paragraph position="24"> * Dep represents the dependency between an edge and its daughter edges. Where (D, eh, en, R) is in Dep, D is a DFS of a transition arc.
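The edge tuples can be rendered concretely as follows. This is an illustrative sketch, not our implementation: LA states and signs are reduced to atoms, the unifiability check is a callback, and the chart bookkeeping of merging Dep sets into an existing covering edge is omitted:

```python
from dataclasses import dataclass, field

@dataclass
class Edge:
    start: int
    end: int
    state: str                                # PS: behaves like a CFG non-terminal
    dep: list = field(default_factory=list)   # Dep: [(D, head_edge, nonhead_edge, R)]

def combine(head: Edge, non_head: Edge, arcs, unifiable):
    """Try every transition arc out of the head edge's state; on a
    successful unifiability check, build a covering edge whose Dep
    records how it was derived."""
    new = []
    for (q_d, q_m, N, D, R) in arcs:
        if q_d == head.state and unifiable(N, non_head.state):
            m = min(head.start, non_head.start)
            n = max(head.end, non_head.end)
            new.append(Edge(m, n, q_m, [(D, head, non_head, R)]))
    return new

# "wrote" spanning 2-3 in state L, object NP spanning 3-6:
arcs = [("L", "S1", "NP", "D1", "head-comp")]
new_edges = combine(Edge(2, 3, "L"), Edge(3, 6, "NP"),
                    arcs, lambda n, s: n == s)
```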
Basically, Phase 1 parsing creates these tuples, and Phase 2 parsing uses them.</Paragraph> <Paragraph position="25"> The Phase 1 parsing consists of the following steps. Assume that a word in the input has a lexical entry Li and that an LA (Qi, Ai, qi) generated from Li is attached to the word: 1. Create an edge li = (ji, ji + 1, qi, {}) in the chart for each Li, for the appropriate ji.</Paragraph> <Paragraph position="26"> 2. For an edge e1 whose state is q1 in the chart, pick up an edge e2 which is adjacent to e1 and whose state is q2.</Paragraph> <Paragraph position="27"> 3. For a transition arc (q1, q, N, D, R), check if N is unifiable with q2.</Paragraph> <Paragraph position="28"> 4. If the unifiability check is successful, find an edge d = (md, nd, q, Depd) strictly covering e1 and e2.</Paragraph> <Paragraph position="29"> 5. If there is one, replace d with a new edge (md, nd, q, Depd U {(D, e1, e2, R)}) in the</Paragraph> <Paragraph position="30"> chart.</Paragraph> <Paragraph position="31"> 6. Otherwise, create a new edge (m, n, q, {(D, e1, e2, R)}) strictly covering e1 and e2.</Paragraph> <Paragraph position="32"> 7. Go to Step 2.</Paragraph> <Paragraph position="33"> 4 Phase 2 Parsing The algorithm of Phase 2 parsing is a recursive procedure which takes an edge as input and builds up sub-structures, which are differential feature structures representing modifications to core-structures, in a bottom-up manner. The obtained sub-structures are unified with core-structures when 1) the input edge covers a whole input or 2) the edge is a non-head daughter edge of some other edge. Note that the sub-structure treats sub(fs(R)), a feature structure eliminated by the restriction in the generation of LAs (the (A) part in Figure 4), and frozen goals of DCPs, by additional evaluation of DCPs
(the (B) part). Here, we use two techniques: one is dependency analysis, which is embodied by the function dep, and the other is partial unification, expressed by p_unify in the figure.</Paragraph> <Paragraph position="34"> The dependency analysis is represented with the function dep(F, rs), where F is a DFS and rs is a restriction schema used in the generation of LAs: Definition 9 (dep) For a feature structure F' and the restriction schema rs, F = dep(F', rs) is a maximal feature structure such that any node n in F satisfies the conjunction of the following conditions: 1. There is a node n' in F' such that paths(n, F) = paths(n', F') and type(n) = type(n').</Paragraph> <Paragraph position="35"> 2. Where A) nd = n or B) nd is a descendant of n, paths(nd, F) contains a path prefixed by one of (head-dtr), (non-head-dtr) and (goals). 3. The disjunction of the following three conditions is satisfied, where A) nd = n or B) nd is a descendant of n.</Paragraph> <Paragraph position="36"> * For some p in paths(nd, F), there is a path pr in rs which prefixes p.</Paragraph> <Paragraph position="37"> * Some p in paths(nd, F) is prefixed by (goals). * There is no node ns in F such that i) there are paths p1, p2 in paths(ns, F) such that p1 is prefixed by (syn) or (sem) and p2 is prefixed by (head-dtr) or (non-head-dtr), and ii) for any p in paths(nd, F) there is ps in paths(ns, F) which prefixes p.</Paragraph> <Paragraph position="38"> Roughly, dep eliminates 1) the descendant nodes of a node which appears both in the syn/sem domains and the head-dtr/non-head-dtr domains, and 2) the nodes appearing only in the syn/sem domains, except for the nodes which appear in sub(fs(R)) or the goals domains.
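The effect of dep can be approximated at the level of individual paths; the following sketch is a deliberate simplification of Definition 9 (it ignores structure sharing and node types, and the path lists are invented for illustration). A path survives only if it leads under head-dtr, non-head-dtr, or goals, or is covered by the restriction schema; paths living purely in the syn/sem domain have already been raised into core-structures and are dropped:

```python
def dep_keep(path, rs):
    """Decide whether a path of a DFS survives dependency analysis.
    `path` and the members of `rs` are attribute tuples."""
    kept_roots = {"head-dtr", "non-head-dtr", "goals"}
    if path and path[0] in kept_roots:
        return True                            # condition 2 of Definition 9
    return any(path[:len(r)] == r for r in rs) # restricted-away sub(fs(R)) part

paths = [
    ("syn", "head"),                 # already raised -> dropped
    ("sem", "indices"),              # already raised -> dropped
    ("head-dtr", "sem", "indices"),  # needed to instantiate the mother -> kept
    ("goals", "arg1"),               # frozen DCP goal -> kept
    ("syn", "counter"),              # in the restriction schema -> kept for Phase 2
]
rs = [("syn", "counter")]
kept = [p for p in paths if dep_keep(p, rs)]
```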
In other words, it removes the feature structures that have already been raised to core-structures or other DFSs, except for the structure sharings, and leaves those which will be required by DCPs or sub(fs(R)).</Paragraph> <Paragraph position="39"> p_unify(F1, F2, rs) is a partial unification routine, where F1 and F2 are feature structures and rs is a restriction schema used in the generation of LAs. Roughly, it performs unification of F1 and F2 only for the common part of F1 and F2. Here, a node n1 is a descendant of n2 in a feature structure F iff n1 is not n2 and there are paths p1 in paths(n1, F) and p2 in paths(n2, F) such that p2 prefixes p1. More precisely, it produces the unification results for a node n in F1 such that * there is a path p in paths(n, F1) such that the node reached by p is also defined in F2, or * there is a path p in paths(n, F1) prefixed by some pr in rs or by (goals).</Paragraph> <Paragraph position="40"> Note that a node is also unified if its structure-shared part has a counterpart in F2. Intuitively, the routine produces unified results for the part of F1 instantiated by F2. The other part, which is not produced by p_unify, is not required at Phase 2 because it is already computed in a state or DFSs in LAs when the LAs are generated. Then, a sign can be obtained by unifying a sub-structure and the corresponding core-structure.</Paragraph> </Section> </Section> <Section position="5" start_page="953" end_page="954" type="metho"> <SectionTitle> 5 Example </SectionTitle> <Paragraph position="0"> This section describes the parsing process for the sentence &quot;My colleague wrote a good paper.&quot; The LA generated from the lexical entry for &quot;wrote&quot; in Figure 5 is given in Figure 6. The transition arc T1 between the states L and S1 is generated by the rule schema in Figure 1.
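The partial unification routine described in Section 4 can be sketched over dict-encoded feature structures. This is an illustrative simplification (no structure sharing, None marking an uninstantiated value): output is produced only where F1 and F2 overlap, or where the restriction schema or a goals path forces it, and nothing else is copied, since the untouched part of F1 already lives in a core-structure or another DFS:

```python
def p_unify(f1, f2, rs, prefix=()):
    out = {}
    for attr, v1 in f1.items():
        path = prefix + (attr,)
        forced = path[0] == "goals" or any(path[:len(r)] == r for r in rs)
        if attr in f2:
            v2 = f2[attr]
            if isinstance(v1, dict) and isinstance(v2, dict):
                out[attr] = p_unify(v1, v2, rs, path)
            elif v1 == v2 or v1 is None:
                out[attr] = v2            # F2 instantiates F1's node
            elif v2 is None:
                out[attr] = v1
            else:
                raise ValueError(f"clash at {path}")
        elif forced:
            out[attr] = v1                # kept for Phase 2 even without a counterpart
    return out

# The sem part gets instantiated by F2; the frozen goal is forced into
# the output; the syn part, already raised into a core-structure, is not.
f1 = {"sem": {"object": None}, "goals": {"arg1": "frozen"}, "syn": {"head": "verb"}}
f2 = {"sem": {"object": "good_paper"}}
result = p_unify(f1, f2, rs=[])
```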
Note that the query to the DCP, freeze([1], append([1], [2], [3])), is used to obtain the union of the indices values of the daughters, and the result is written to the indices value of the mother sign. During the generation of the transition arc, since the first argument of the query is uninstantiated, it is frozen. The core-structures and the dependency-analyzed DFSs that augment the LA are shown in Figure 7. We assume that we do not use any restriction, i.e., for any lexical entry l and rule schema R, sub(l) and sub(fs(R)) are empty feature structures.</Paragraph> <Paragraph position="1"> Note that, in the DFSs, the already raised feature structures are eliminated, and that the DFS of the transition arc T1 contains the frozen query as its goals value.</Paragraph> <Paragraph position="2"> Assume that the noun phrases &quot;My colleague&quot; and &quot;a good paper&quot; are already recognized by a parser. At Phase 1, they are checked to see if they are unifiable with the conditions of the transition arcs T1 and T2, i.e., the NPs which are non-head daughters in the lexical entry for &quot;wrote&quot; (Figure 5). Since the unifiability checks are successful, Phase 1 parsing produces the parse tree whose form is presented in Figure 3. The Phase 2 parsing produces the sub-structures in Figure 8: the sub-structure for S2 is [ content [ agent [1]my_colleague object [2]good_paper ] indices ([1]my_colleague, [2]good_paper) ], and the sub-structure for S1 is [ sem [ object [1]good_paper ] indices ([1]good_paper) ], with the goals, head-dtr and non-head-dtr values omitted. Note that the frozen goals are evaluated and the indices values have appropriate values.
A parsing result is obtained by unifying the sub-structure for S2 with the corresponding core-structure.</Paragraph> <Paragraph position="3"> The number of feature structure nodes generated during parsing is reduced compared to the case of the naive application of rule schemata presented in Section 2. The important point is that the sub-structures contain only either the part of the DFSs that was instantiated by head daughters' sub-structures and non-head daughters' core-structures and sub-structures, or the part that contributes to the DCPs' evaluation. A feature structure that does not appear in a sub-structure appears in the corresponding core-structure. See Figure 7. Because of these properties, the correctness of our parsing method is guaranteed (Torisawa and Tsujii, 1996).</Paragraph> </Section> <Section position="6" start_page="954" end_page="954" type="metho"> <SectionTitle> 7 Conclusion </SectionTitle> <Paragraph position="0"> We have presented a two-phase parsing method for HPSG. In the first phase, our parser produces parse trees using Lexical Entry Automata compiled from lexical entries. In the second phase, only the feature structures which must be computed dynamically are computed. As a result, the amount of feature structures unified at parsing time is reduced. We also showed the effect of our optimization techniques by a series of experiments on a real-world text.</Paragraph> <Paragraph position="1"> It can be noticed that each transition arc of the compiled LAs can be seen as a rewriting rule in CFG (or a dotted notation in a chart parser). We believe this can open the way to integrating several methods developed for CFG, including the inside-outside algorithm for grammar learning or disambiguation, into an HPSG framework.
We also believe that, by pursuing this direction for optimizing HPSG parsers, we can reach the point where grammar learning from corpora can be done with a concise and linguistically well-defined core grammar.</Paragraph> </Section> </Paper>