File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/04/w04-0302_metho.xml
Size: 17,774 bytes
Last Modified: 2025-10-06 14:09:05
<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0302"> <Title>Stochastically Evaluating the Validity of Partial Parse Trees in Incremental Parsing</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 TAG-based Incremental Parsing </SectionTitle> <Paragraph position="0"> Our incremental parsing is based on tree adjoining grammar (TAG) (Joshi, 1985). This section proposes a TAG-based incremental parsing method.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 TAG for Incremental Parsing </SectionTitle> <Paragraph position="0"> First, we propose an incremental-parsing-oriented TAG (ITAG). Like a TAG, an ITAG comprises two sets of elementary trees: initial trees and auxiliary trees. ITAG differs from TAG in the form of its elementary trees. Every ITAG initial tree is leftmost-expanded. A tree is leftmost-expanded if it has one of the following forms: 1. $[t]_X$, where $t$ is a terminal symbol and $X$ is a nonterminal symbol.</Paragraph> <Paragraph position="1"> 2. $[\sigma X_1 \cdots X_k]_X$, where $\sigma$ is a leftmost-expanded tree and $X_1, \ldots, X_k, X$ are nonterminal symbols.</Paragraph> <Paragraph position="2"> On the other hand, every ITAG auxiliary tree has the following form: $[X^* \sigma X_1 \cdots X_k]_X$, where $\sigma$ is a leftmost-expanded tree and $X, X_1, \ldots, X_k$ are nonterminal symbols. $X^*$ is called a foot node. Figure 1 shows examples of ITAG elementary trees.</Paragraph> <Paragraph position="3"> These elementary trees can be combined by using two operations: substitution and adjunction. Substitution: The substitution operation replaces the leftmost nonterminal leaf of a partial parse tree $\sigma$ with an initial tree $\alpha$ having the same nonterminal symbol at its root. 
We write $s_\alpha$ for the operation of substituting $\alpha$ and $s_\alpha(\sigma)$ for the result of applying $s_\alpha$ to $\sigma$.</Paragraph> <Paragraph position="4"> Adjunction: The adjunction operation splits a partial parse tree $\sigma$ at a nonterminal node having no nonterminal leaf, and inserts an auxiliary tree $\beta$ having the same nonterminal symbol at its root. We write $a_\beta$ for the operation of adjoining $\beta$ and $a_\beta(\sigma)$ for the result of applying $a_\beta$ to $\sigma$.</Paragraph> <Paragraph position="5"> The substitution operation is similar to rule expansion in top-down incremental parsing (Matsubara et al., 1997; Roark, 2001). Furthermore, by introducing the adjunction operation to incremental parsing, we can expect the local ambiguity of left-recursive structures to decrease (Lombardo and Sturt, 1997).</Paragraph> <Paragraph position="6"> Our proposed incremental parsing is based on ITAG. When the $i$-th word $w_i$ is scanned, the parser combines elementary trees for $w_i$ with partial parse trees for $w_1 \cdots w_{i-1}$ to construct the partial parse trees for $w_1 \cdots w_{i-1} w_i$.</Paragraph> <Paragraph position="7"> As an example, let us consider incremental parsing of the following sentence using the ITAG shown in Figure 1: I found a dime in the wood. (1) Table 1 shows the process of tree construction for sentence (1). When the word &quot;found&quot; is scanned, partial parse trees #3, #4 and #5 are constructed by applying substitution operations to partial parse tree #2 for the initial fragment &quot;I&quot;. When the word &quot;in&quot; is scanned, partial parse trees #12 and #13 are constructed by applying adjunction operations to partial parse tree #10 for the initial fragment &quot;I found a dime&quot;. 
This example shows that ITAG-based incremental parsing can construct partial parse trees of the initial fragment at every word input.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 ITAG Extraction from Treebank </SectionTitle> <Paragraph position="0"> Here, we propose a method for extracting an ITAG from a treebank to realize broad-coverage incremental parsing. Our method decomposes the parse trees in a treebank to obtain ITAG elementary trees. The decomposition is as follows: • For each node $\eta_1$ having no left sibling, if the parent $\eta_p$ has the same nonterminal symbol as $\eta_1$, split the parse tree at $\eta_1$ and $\eta_p$, and combine the upper tree and the lower tree. $\eta_1$ of the intermediate tree is a foot node.</Paragraph> <Paragraph position="1"> • For each node $\eta_2$ having only one left sibling, if the parent $\eta_p$ does not have the same nonterminal symbol as the left sibling $\eta_1$ of $\eta_2$, split the parse tree at $\eta_2$.</Paragraph> <Paragraph position="2"> • For every other node $\eta$ in the parse tree, split the parse tree at $\eta$.</Paragraph> <Paragraph position="3"> For example, the initial trees $\alpha_1$, $\alpha_2$, $\alpha_5$, $\alpha_7$, $\alpha_8$ and $\alpha_{10}$ and the auxiliary tree $\beta_2$ are extracted from the parse tree #18 in Table 1.</Paragraph> <Paragraph position="4"> Our proposed tree extraction is similar to the TAG extraction methods proposed in the literature (Chen and Vijay-Shanker, 2000; Chiang, 2003; Xia, 1999).</Paragraph> <Paragraph position="5"> The main difference between these methods and ours is the position of the nodes at which parse trees are split. While the methods in the literature (Chen and Vijay-Shanker, 2000; Chiang, 2003; Xia, 1999) use a head percolation rule to split the parse trees at complement nodes, our method splits the parse trees at left-recursive nodes and at nodes having a left sibling. The elementary trees extracted by our method have the forms described in Section 2.1, and can be combined from left to right on a word-by-word basis. 
This property is suitable for incremental parsing. On the other hand, the elementary trees obtained by the method based on head information do not necessarily have this property (footnote 1).</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 Probabilistic ITAG </SectionTitle> <Paragraph position="0"> This section describes probabilistic ITAG (PITAG), which is used to evaluate partial parse trees in incremental parsing. PITAG assigns a probability to the event that an elementary tree is combined with another tree by substitution or adjunction.</Paragraph> <Paragraph position="1"> We induce the probability by maximum likelihood estimation. Let $\alpha$ be an initial tree and $X$ be the root symbol of $\alpha$. The probability that $\alpha$ is substituted is calculated as follows:</Paragraph> <Paragraph position="3"> $P(s_\alpha) = \frac{C(s_\alpha)}{\sum_{\alpha' \in I(X)} C(s_{\alpha'})}$ (2) where $C(s_\alpha)$ is the number of times the substitution $s_\alpha$ is applied in the treebank, and $I(X)$ is the set of initial trees whose root is labeled with $X$. Footnote 1: Head-based extraction splits the parse tree #18 at the node labeled with dt to obtain the elementary tree $[a]_{dt}$ for &quot;a&quot;. However, the tree $[a]_{dt}$ cannot be combined with the partial parse tree for &quot;I found&quot;, since the substitution node labeled with dt exists in the initial tree $[dt\ [dime]_{nn}]_{np}$ for &quot;dime&quot; and not in the partial parse trees for &quot;I found&quot;.</Paragraph> <Paragraph position="4"> Let $\beta$ be an auxiliary tree and $X$ be the root symbol of $\beta$. The probability that $\beta$ is adjoined is calculated as follows:</Paragraph> <Paragraph position="6"> $P(a_\beta) = \frac{C(a_\beta)}{C(X)}$ (3) where $C(a_\beta)$ is the number of times the adjunction $a_\beta$ is applied in the treebank and $C(X)$ is the number of occurrences of the symbol $X$. 
The probability that adjunction is not applied is calculated as follows:</Paragraph> <Paragraph position="8"> $P(nil_X) = 1 - \sum_{\beta \in A(X)} P(a_\beta)$ (4) where $nil_X$ means that no adjunction is applied to a node labeled with $X$, and $A(X)$ is the set of all auxiliary trees whose root is labeled with $X$.</Paragraph> <Paragraph position="9"> In this PITAG formalism, the probability that elementary trees are combined at each node depends only on the nonterminal symbol of that node (footnote 2).</Paragraph> <Paragraph position="10"> The probability of a parse tree is the product of the probabilities of the operations used to construct the parse tree. For example, suppose that the probability of each operation is given as shown in Table 2. The probability of the partial parse tree #12, which is constructed by using $s_{\alpha_1}$, $s_{\alpha_2}$, $s_{\alpha_5}$, $s_{\alpha_7}$, $nil_{NP}$ and $a_{\beta_2}$, is $1 \times 0.7 \times 0.3 \times 0.5 \times 0.7 \times 0.7 = 0.05145$.</Paragraph> <Paragraph position="11"> We write $P(\sigma)$ for the probability of a partial parse tree $\sigma$.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.4 Parsing Strategies </SectionTitle> <Paragraph position="0"> To improve the efficiency of parsing, we adopt the following two parsing strategies: • If two partial parse trees have the same sequence of nodes to which ITAG operations are applicable, then the tree with the lower probability can be safely discarded.</Paragraph> <Paragraph position="1"> • The parser keeps only the n-best partial parse trees.</Paragraph> </Section> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Validity of Partial Parse Trees </SectionTitle> <Paragraph position="0"> This section gives some definitions concerning the validity of a partial parse tree. Before describing the validity of a partial parse tree, we define the subsumption relation between partial parse trees.</Paragraph> <Paragraph position="1"> Definition 1 (subsumption relation) Let $\sigma$ and $\tau$ be partial parse trees. 
Then we write $\sigma \triangleleft \tau$ if $s_\alpha(\sigma) = \tau$ for some initial tree $\alpha$, or $a_\beta(\sigma) = \tau$ for some auxiliary tree $\beta$. Let $\triangleleft^*$ be the reflexive transitive closure of $\triangleleft$. We say that $\sigma$ subsumes $\tau$ if $\sigma \triangleleft^* \tau$. □ That $\sigma$ subsumes $\tau$ means that $\tau$ is the result of applying zero or more substitution or adjunction operations to $\sigma$. Figure 2 shows the subsumption relation between the partial parse trees constructed for sentence (1).</Paragraph> <Paragraph position="2"> If a partial parse tree for an initial fragment represents a syntactic relation correctly, the partial parse tree subsumes the correct parse tree for the input sentence. We say that such a partial parse tree is valid. The validity of a partial parse tree is defined as follows: Definition 2 (valid partial parse tree) Let $\sigma$ be a partial parse tree and $w_1 \cdots w_n$ be an input sentence. We say that $\sigma$ is valid for $w_1 \cdots w_n$ if $\sigma$ subsumes the correct parse tree for $w_1 \cdots w_n$. □ For example, assume that #18 is the correct parse tree for sentence (1). Then partial parse tree #3 is valid for sentence (1), because #3 $\triangleleft^*$ #18. On the other hand, partial parse trees #4 and #5 are not valid for (1). Figure 3 shows the valid partial parse trees for sentence (1).</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Evaluating the Validity of Partial Parse Tree </SectionTitle> <Paragraph position="0"> The validity of a partial parse tree for an initial fragment depends on the rest of the sentence. For example, the validity of the partial parse trees #3, #4 and #5 depends on the remaining input that follows the word &quot;found&quot;. This means that the validity changes dynamically with every word input. 
We define the conditional validity of a partial parse tree as follows:</Paragraph> <Paragraph position="2"> where $\sigma$ is a partial parse tree for an initial fragment $w_1 \cdots w_i$ ($i \leq j$), $T(w_1 \cdots w_j)$ is the set of constructed partial parse trees for the initial fragment $w_1 \cdots w_j$, and $Sub(\sigma, w_1 \cdots w_j)$ is the subset of $T(w_1 \cdots w_j)$ whose elements are subsumed by $\sigma$.</Paragraph> <Paragraph position="3"> Equation (5) represents the validity of $\sigma$ given $w_1 \cdots w_j$. $\sigma$ is valid for the input sentence if and only if some partial parse tree for $w_1 \cdots w_j$ subsumed by $\sigma$ is valid. Equation (5) is the ratio of the partial parse trees subsumed by $\sigma$ to all the constructed partial parse trees.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Output Partial Parse Trees </SectionTitle> <Paragraph position="0"> Kato et al. proposed a method that delays the decision of which partial parse trees should be returned as the output until the validity of the partial parse trees is guaranteed (Kato et al., 2000). The idea of delaying the decision of the output is interesting.</Paragraph> <Paragraph position="1"> However, delaying the decision until validity is guaranteed may cause the parser to lose its incrementality.</Paragraph> <Paragraph position="2"> To solve this problem, the incremental parser in our method returns partial parse trees of high validity rather than partial parse trees whose validity is guaranteed.</Paragraph> <Paragraph position="3"> When the $j$-th word $w_j$ is scanned, our incremental parser returns the following partial parse tree: $\mathop{\mathrm{argmax}}_{\{\sigma : V(\sigma, w_1 \cdots w_j) \geq \theta\}} l(\sigma)$ (6) where $\theta$ is a threshold in $[0, 1]$ and $l(\sigma)$ is the length of the initial fragment yielded by $\sigma$. 
The output partial parse tree is the one for the longest initial fragment among the partial parse trees whose validity is at least the threshold $\theta$.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 An Example </SectionTitle> <Paragraph position="0"> Let us consider a parsing example for sentence (1). We assume that the threshold $\theta = 0.8$.</Paragraph> <Paragraph position="1"> Let us consider when the partial parse tree #3, which is valid for (1), is returned as the output. When the word &quot;found&quot; is scanned, partial parse trees #3, #4 and #5 are constructed. That is, $T(\text{I found}) = \{\#3, \#4, \#5\}$. As shown in Figure</Paragraph> <Paragraph position="3"> #3 is not returned as the output at this point. The parser only keeps #3 as a candidate partial parse tree.</Paragraph> <Paragraph position="4"> When the next word &quot;a&quot; is scanned, partial parse trees #6, #7, #8 and #9 are constructed, where</Paragraph> <Paragraph position="6"> Since Validity(#3, I found a) $\geq \theta$, partial parse tree #3 is returned as the output.</Paragraph> <Paragraph position="7"> Table 3 shows the output partial parse tree for every word input.</Paragraph> <Paragraph position="8"> As this example shows, our incremental parser delays the decision of the output.</Paragraph> </Section> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Experimental Results </SectionTitle> <Paragraph position="0"> To evaluate the performance of our proposed method, we performed a parsing experiment. The parser was implemented in GNU Common Lisp on a Linux PC. In the experiment, the inputs of the incremental parser are POS sequences rather than word sequences. We used 47247 initial trees and 2931 auxiliary trees for the experiment. 
The elementary trees were extracted from the parse trees in sections 02-21 of the Wall Street Journal in the Penn Treebank (Marcus et al., 1993), which were transformed by using parent-child annotation and left factoring (Roark and Johnson, 1999). We set the beam width to 500.</Paragraph> <Paragraph position="1"> The labeled precision and recall of the parsing are 80.8% and 78.5%, respectively, on section 23 of the Penn Treebank. For the following evaluation, we used the set of sentences for which the outputs of the incremental parser are identical to the correct parse trees in the Penn Treebank. The number of these sentences is 451. The average length of these sentences is 13.5 words.</Paragraph> <Paragraph position="2"> We measured the delays and the precisions for validity thresholds 0.5, 0.6, 0.7, 0.8, 0.9 and 1.0.</Paragraph> <Paragraph position="3"> We define the degree of delay as follows: Let $s = w_1 \cdots w_n$ be an input sentence and $o_j(s)$ be the partial parse tree that is the output when the $j$-th word $w_j$ is scanned. We define the degree of delay when the $j$-th word is scanned as follows:</Paragraph> <Paragraph position="5"> We define the maximum delay $D_{max}(s)$ and the average delay $D_{ave}(s)$ as follows:</Paragraph> <Paragraph position="7"> The precision is defined as the percentage of valid partial parse trees in the output.</Paragraph> <Paragraph position="8"> Moreover, we measured the precision of a parser whose delay is always 0 and which returns the partial parse tree having the highest probability. We call it the baseline.</Paragraph> <Paragraph position="9"> Table 4 shows the precisions and delays. Figure 4 illustrates the relation between the precisions and delays.</Paragraph> <Paragraph position="10"> The experimental result demonstrates that there is a precision/delay trade-off. Our proposed method increases the precision in comparison with the baseline, at the cost of delaying the output. 
When $\theta = 1$, it is guaranteed that the output partial parse trees are valid; that is, our method is similar to the method in the literature (Kato et al., 2000). In comparison with this case, our method with $\theta &lt; 1$ dramatically decreases the delay.</Paragraph> <Paragraph position="11"> Although the result does not necessarily demonstrate that our method is the best one, it achieves both high accuracy and short delay to a certain extent.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 6 Concluding Remarks </SectionTitle> <Paragraph position="0"> In this paper, we have proposed a method of evaluating the validity of partial parse trees constructed in incremental parsing. The method is based on probabilistic incremental parsing. When a word is scanned, the method incrementally calculates the validity of each partial parse tree and returns the partial parse tree for the longest initial fragment whose validity reaches a threshold. Our method delays the decision of which partial parse tree should be returned.</Paragraph> <Paragraph position="1"> To evaluate the performance of our method, we conducted a parsing experiment using the Penn Treebank. The experimental result shows that our method improves the accuracy of incremental parsing. The experiment demonstrated a precision/delay trade-off. To evaluate the overall performance of incremental parsing, we would like to investigate a single measure that combines delay and precision.</Paragraph> </Section> </Paper>
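The validity measure and the output rule of Section 4 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' Common Lisp implementation. The body of Equation (5) is missing from this extract, so the sketch assumes that validity is the probability mass of the newest partial parse trees subsumed by a candidate, divided by the probability mass of all newest partial parse trees; a count-based ratio would change only the function `validity`. The class `PartialTree`, its fields, the function names, and every tree label and probability below are hypothetical and are not taken from Table 1 or Table 2.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class PartialTree:
    name: str                    # illustrative label, not a tree from Table 1
    length: int                  # l(sigma): words in the initial fragment covered
    prob: float                  # P(sigma): product of operation probabilities
    subsumed_by: FrozenSet[str]  # names of all trees that subsume this one


def subsumes(sigma: PartialTree, tau: PartialTree) -> bool:
    """Reflexive-transitive subsumption, read off the stored ancestor names."""
    return sigma.name == tau.name or sigma.name in tau.subsumed_by


def validity(sigma: PartialTree, frontier: List[PartialTree]) -> float:
    """ASSUMED form of V: mass of frontier trees subsumed by sigma,
    divided by the mass of the whole frontier T(w_1 ... w_j)."""
    total = sum(t.prob for t in frontier)
    if total == 0.0:
        return 0.0
    return sum(t.prob for t in frontier if subsumes(sigma, t)) / total


def select_output(candidates: Iterable[PartialTree],
                  frontier: List[PartialTree],
                  theta: float) -> Optional[PartialTree]:
    """Longest candidate whose validity is at least theta; None if none passes."""
    scored = [(c, validity(c, frontier)) for c in candidates]
    passing = [(c, v) for c, v in scored if v >= theta]
    if not passing:
        return None
    # Ties on length are broken in favour of the higher validity.
    return max(passing, key=lambda cv: (cv[0].length, cv[1]))[0]


# Hypothetical trees and probabilities, chosen only to exercise the functions.
root = PartialTree("r", 1, 1.0, frozenset())
s1 = PartialTree("s1", 2, 0.7, frozenset({"r"}))
s2 = PartialTree("s2", 2, 0.2, frozenset({"r"}))
s3 = PartialTree("s3", 2, 0.1, frozenset({"r"}))
t1 = PartialTree("t1", 3, 0.30, frozenset({"r", "s1"}))
t2 = PartialTree("t2", 3, 0.15, frozenset({"r", "s1"}))
t3 = PartialTree("t3", 3, 0.04, frozenset({"r", "s2"}))
t4 = PartialTree("t4", 3, 0.01, frozenset({"r", "s3"}))

frontier2 = [s1, s2, s3]          # trees built after the second word
frontier3 = [t1, t2, t3, t4]      # trees built after the third word
kept2 = [root] + frontier2        # every tree still available as output
kept3 = kept2 + frontier3

if __name__ == "__main__":
    for j, (kept, frontier) in enumerate([(kept2, frontier2), (kept3, frontier3)], start=2):
        out = select_output(kept, frontier, theta=0.8)
        print(j, out.name if out else None)
```

With these invented numbers and a threshold of 0.8, the tree `s1` is withheld after the second word and returned after the third, which mirrors the delayed output described in Section 4.2. Storing the subsuming trees as a set of names is a design shortcut for the sketch; a real parser would more likely derive subsumption from the derivation history.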