<?xml version="1.0" standalone="yes"?> <Paper uid="J82-3004"> <Title>Coping with Syntactic Ambiguity or How to Put the Block in the Box on the Table 1</Title> <Section position="5" start_page="5" end_page="5" type="metho"> <SectionTitle> 4. Table Lookup </SectionTitle> <Paragraph position="0"> We could improve EQSP's performance on PPs if we could find a more efficient way to compute Catalan numbers than chart parsing, the method currently employed by EQSP. Let us propose two alternatives: table lookup and evaluating expression (10) directly.</Paragraph> <Paragraph position="1"> Both are very efficient over practical ranges of n, say no more than 20 phrases or so. 8 In both cases, the ambiguity of a sentence in grammar (5a) can be determined by counting the number of occurrences of &quot;and John&quot; and then retrieving the Catalan of that number.</Paragraph> <Paragraph position="2"> These approaches both take linear time (over practical ranges of n), 9 a significant improvement over chart parsing, which requires cubic time to parse sentences in these grammars.</Paragraph> <Paragraph position="3"> So far we have shown how to compute in linear time the number of ambiguous interpretations of a sentence in an &quot;every way ambiguous&quot; grammar.</Paragraph> <Paragraph position="4"> However, we are really interested in finding parse trees, not just the number of ambiguous interpretations. We could extend the table lookup algorithm to find trees rather than ambiguity coefficients, by modifying the table to store trees instead of numbers. For parsing purposes, Cat i can be thought of as a pointer to the i th entry of the table. 
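The counting scheme described so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names catalan, CAT_TABLE, and ambiguity are hypothetical and are not taken from EQSP, the closed form is the standard binomial expression for Catalan numbers (assumed here to agree with expression (10)), and inputs beyond the table fall back to the closed form instead of a full parser.

```python
from math import comb

def catalan(n):
    """Standard closed form for the n-th Catalan number: C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

# Precomputed table covering the practical range (about 20 phrases).
CAT_TABLE = [catalan(n) for n in range(21)]

def ambiguity(sentence, marker="and John"):
    """Count occurrences of the marker phrase and look up the Catalan number."""
    n = sentence.count(marker)
    if n >= len(CAT_TABLE):
        # Assumed fallback for this sketch; the source suggests a traditional parser.
        return catalan(n)
    return CAT_TABLE[n]

print(ambiguity("John and John and John"))
```

The lookup itself is a single index operation once the marker phrases have been counted, which is where the linear-time behavior over practical ranges of n comes from.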
So, for a sentence in grammar (5a), for example, the machine could count the number of occurrences of &quot;and John&quot; and then retrieve the table entry for that number.</Paragraph> <Paragraph position="5"> index trees</Paragraph> </Section> <Section position="6" start_page="5" end_page="5" type="metho"> <SectionTitle> 0 {[John]} 1 {[John and John]} 2 {[[John and John] and John], </SectionTitle> <Paragraph position="0"> [John and [John and John]]} The table would be more general if it did not specify the lexical items at the leaves. Let us replace the table and assume the machine can bind the x's to the appropriate lexical items.</Paragraph> <Paragraph position="1"> There is a real problem with this table lookup machine. The parse trees may not be exactly correct because the power series computation assumed that multiplication was associative, which is an appropriate assumption for computing ambiguity, but inappropriate for constructing trees. For example, we observed that prepositional phrases and conjunction are both &quot;every way ambiguous&quot; grammars because their ambiguity coefficients are Catalan numbers. However, it is not the case that they generate exactly the same parse trees.</Paragraph> <Paragraph position="2"> Nevertheless we present the table lookup pseudo-parser here because it seems to be a speculative new approach with considerable promise. It is often more efficient than a real parser, and the trees that it finds may be just as useful as the correct ones for many practical purposes. For example, many speech recognition projects employ a parser to filter out syntactically inappropriate hypotheses. However, a full parser is not really necessary for this task; a recognizer such as this table lookup pseudo-parser may be perfectly adequate. Furthermore, it is often possible to recover the correct trees from the output of the pseudo-parser. 
In particular, the difference between prepositional phrases and conjunction could be accounted for by modifying the interpretation of the PP category label, so that the trees would be interpreted correctly even though they are not exactly correct.</Paragraph> <Paragraph position="3"> 8 The table lookup scheme ought to have a way to handle the theoretical possibility that there are an unlimited number of prepositional phrases. The table lookup routine will employ a more traditional parsing algorithm (e.g., Earley's algorithm) when the number of phrases in the input sentence is not stored in the table. 9 The linear time result depends on the assumption that table lookup (or closed form computation) can be performed in constant time. This may be a fair assumption over practical ranges of n, but it is not true in general. 142 American Journal of Computational Linguistics, Volume 8, Number 3-4, July-December 1982 Kenneth Church and Ramesh Patil Coping with Syntactic Ambiguity The table lookup approach works for primitive grammars. The next two sections show how to decompose composite grammars into series and parallel combinations of primitive grammars.</Paragraph> <Paragraph position="5"/> </Section> <Section position="7" start_page="5" end_page="5" type="metho"> <SectionTitle> 5. Parallel Decomposition </SectionTitle> <Paragraph position="0"> Parallel decomposition can be very useful for dealing with lexical ambiguity, as in (13) ...to total with products near profits...</Paragraph> <Paragraph position="1"> where &quot;total&quot; can be taken as a noun or as a verb, as in: (14a) The accountant brought the daily sales to total with products near profits organized according to the new law. (noun) (14b) The daily sales were ready for the accountant to total with products near profits organized according to the new law. (verb) The analysis of these sentences makes use of the additivity property of linear systems. 
That is, each case, (14a) and (14b), is treated separately, and then the results are added together. Assuming &quot;total&quot; is a noun, there are three prepositional phrases contributing Cat 3 bracketings, and assuming it is a verb, there are two prepositional phrases for Cat 2 ambiguities. Combining the two cases produces Cat 3 + Cat 2 = 5 + 2 = 7 parses. Adding another prepositional phrase yields Cat 4 + Cat 3 = 14 + 5 = 19 parses. (EQSP behaved as predicted in both cases.) This behavior is generalized by the following power series:</Paragraph> <Paragraph position="3"> This observation can be incorporated into the table lookup pseudo-parser outlined above. Recall that Cat i is interpreted as the i th index in a table containing all binary trees dominating i leaves. Similarly, Cat i + Cat i+1 will be interpreted as an instruction to &quot;append&quot; the i th entry and i+1 th entry of the table. 10</Paragraph> <Paragraph position="5"> Let us consider a system where syntactic processing strictly precedes semantic and pragmatic processing.</Paragraph> <Paragraph position="6"> In such a system, how could we incorporate semantic and pragmatic heuristics once we have already parsed the input sentence and found that it was the sum of two Catalans? The parser can simply subtract the inappropriate interpretations. If the oracle says that &quot;total&quot; is a verb, then (16a) would be subtracted from the combined sum, and if the oracle says that &quot;total&quot; is a noun, then (16b) would be subtracted.</Paragraph> <Paragraph position="7"> 10 This can be implemented efficiently, given an appropriate representation of sets of trees.</Paragraph> <Paragraph position="8"> On the other hand, our analysis is also useful in a system that interleaves syntactic processing with semantic and pragmatic processing. Suppose that we had a semantic routine that could disambiguate &quot;total,&quot; but only at a very high cost in execution time. 
We need a way to estimate the usefulness of executing the semantic routine so that we don't spend the time if it is not likely to pay off. The analysis above provides a very simple way to estimate the benefit of disambiguating &quot;total.&quot; If it turns out to be a verb, then (16a) trees have been ruled out, and if it turns out to be a noun, then (16b) trees have been ruled out. We prefer our declarative algebraic approach over procedural heuristic search strategies (e.g., Kaplan 1972) because we do not have to specify the order of evaluation. We can delay the binding of decisions until the most opportune moment.</Paragraph> </Section> <Section position="8" start_page="5" end_page="5" type="metho"> <SectionTitle> 6. Series Decomposition </SectionTitle> <Paragraph position="0"> Suppose we have a non-terminal S that is a series combination of two other non-terminals, NP and VP.</Paragraph> <Paragraph position="1"> By inspection, the power series of S is:</Paragraph> </Section> <Section position="9" start_page="5" end_page="5" type="metho"> <SectionTitle> (18) S = NP · VP </SectionTitle> <Paragraph position="0"> This result is easily verified when there is an unmistakable dividing point between the subject and the predicate. For example, the verb &quot;is&quot; separates the PPs in the subject from those in the predicate in (19a), but not in (19b).</Paragraph> <Paragraph position="1"> (19a) The number of products over sales of ... is near the number of sales under ... (clearly divided) (19b) Is the number of products over sales of ... near the number of sales under ...? (not clearly divided) In (19a), the total number of parse trees is the product of the number of ways of parsing the subject times the number of ways of parsing the predicate. Both the subject and the predicate produce a Catalan number of parses, and hence the result is the product of two Catalan numbers, which was verified by EQSP (Martin, Church, and Patil 1981, p. 53). 
This result can be formalized in terms of the power series: The power series says that the ambiguity of a particular sentence is the product of Cat i and Cat j, where i is the number of PPs before &quot;is&quot; and j is the number after &quot;is.&quot; This could be incorporated in the table lookup parser as an instruction to &quot;multiply&quot; the i th entry in the table by the j th entry. Multiplication is a cross-product operation; L × R generates the set of binary trees whose left sub-tree l is from L and whose right sub-tree r is from R,</Paragraph> <Paragraph position="3"> This is a formal definition. For practical purposes, it may be more useful for the parser to output the list in the factored form:</Paragraph> <Paragraph position="5"> which is much more concise than a list of trees. It is possible, for example, that semantic processing can take advantage of factoring, capturing a semantic generalization that holds across all subjects or all predicates. Imagine, for example, that there is a semantic agreement constraint between predicates and arguments. For example, subjects and predicates might have to agree on the feature +human. Suppose that we were given sentences where this constraint was violated by all ambiguous interpretations of the sentence. In this case, it would be more efficient to employ a feature vector scheme (Dostert and Thompson 1971) which propagates the features in factored form.</Paragraph> <Paragraph position="6"> That is, it computes a feature vector for the union of all possible subjects, and a vector for the union of all possible VPs, and then compares (intersects) these vectors to check if there are any interpretations that meet the constraint. A system such as this, which keeps the parses in factored form, is much more efficient than one that multiplies them out. 
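To make the contrast between the factored and the expanded representation concrete, here is a minimal Python sketch. The function names (trees, factored, expand) and the nested-tuple encoding of trees are illustrative assumptions; they are not part of EQSP or of the pseudo-parser described in the text.

```python
def trees(leaves):
    """Return all binary bracketings of a tuple of leaves as nested tuples."""
    if len(leaves) == 1:
        return [leaves[0]]
    result = []
    for k in range(1, len(leaves)):          # every possible dividing point
        for left in trees(leaves[:k]):
            for right in trees(leaves[k:]):
                result.append((left, right))
    return result

def factored(left_set, right_set):
    """Keep the product unexpanded: simply pair the two sets of alternatives."""
    return (left_set, right_set)

def expand(pair):
    """Multiply out a factored pair into an explicit list of trees."""
    left_set, right_set = pair
    return [(l, r) for l in left_set for r in right_set]

subject = trees(("a", "b", "c"))              # 2 bracketings
predicate = trees(("d", "e", "f", "g"))       # 5 bracketings
pair = factored(subject, predicate)
print(len(pair[0]) + len(pair[1]), len(expand(pair)))
```

With these inputs the factored pair stores 2 + 5 = 7 trees, while the expanded list holds 2 * 5 = 10; the gap widens quickly as the two factors grow.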
Even if semantics cannot take advantage of the factoring, there is no harm in keeping the representation in factored form, because it is straightforward to expand (23) into a list of trees (though it may be somewhat slow).</Paragraph> <Paragraph position="7"> This example is relatively simple because &quot;is&quot; helps the parser determine the value of i and j. Now let us return to example (19b) where &quot;is&quot; does not separate the two strings of PPs. Again, we determine the power series by multiplying the two subcases:</Paragraph> <Paragraph position="9"> However, this form is not so useful for parsing because the parser cannot easily determine i and j, the number of prepositional phrases in the subject and the number in the predicate. It appears the parser will have to compute the product of two Catalans for each way of picking i and j, which is somewhat expensive. 11 Fortunately, the Catalan function has some special properties so that it is possible algebraically to remove the references to i and j. In the next section we show how this expression can be reformulated in terms of n, the total number of PPs.</Paragraph> <Section position="1" start_page="5" end_page="5" type="sub_section"> <SectionTitle> 6.1 Auto-Convolution of Catalan Grammars </SectionTitle> <Paragraph position="0"> Some readers may have noticed that expression (24) is in convolution form. We will make use of this in the reformulation. Notice that the Catalan series is a fixed point under auto-convolution (except for a shift); that is, multiplying a Catalan power series (i.e.,</Paragraph> <Paragraph position="2"> produces another polynomial with Catalan coefficients. 
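The fixed-point claim can be checked numerically. The following Python sketch assumes the standard closed form for Catalan numbers; the names catalan and auto_convolution are hypothetical. It multiplies the Catalan series by itself term by term and compares each coefficient with the Catalan number one index higher, which is the shift mentioned above.

```python
from math import comb

def catalan(n):
    """Standard closed form for the n-th Catalan number: C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def auto_convolution(n):
    """Coefficient of x**n in the square of the Catalan power series."""
    return sum(catalan(i) * catalan(n - i) for i in range(n + 1))

# Compare the first few coefficients with the shifted Catalan numbers.
for n in range(10):
    assert auto_convolution(n) == catalan(n + 1)
```

The check covers only the first ten coefficients, so it illustrates the identity without proving it.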
12 The multiplication is worked out for the first few terms.</Paragraph> <Paragraph position="4"> This property can be summarized as: (25) $\sum_i \mathrm{Cat}_i x^i \cdot \sum_j \mathrm{Cat}_j x^j = \sum_n \mathrm{Cat}_{n+1} x^n$, where n equals i+j.</Paragraph> <Paragraph position="5"> Intuitively, this equation says that if we have two &quot;every way ambiguous&quot; (Catalan) constructions, and we combine them in every possible way (convolution), the result is an &quot;every way ambiguous&quot; (Catalan) construction. With this observation, equation (24) reduces to: (26) $\text{is}\,(N \sum_i \mathrm{Cat}_i (P\,N)^i)(\sum_j \mathrm{Cat}_j (P\,N)^j) = \text{is}\; N \sum_n \mathrm{Cat}_{n+1} (P\,N)^n$ Hence the number of parses in the auxiliary-inverted case is the Catalan of one more than in the non-inverted cases. As predicted, EQSP found the following inverted sentences to be more ambiguous than their non-inverted counterparts (previously discussed on page 142) by one Catalan number.</Paragraph> <Paragraph position="6"> of products of products.</Paragraph> <Paragraph position="7"> How could this result be incorporated into the table lookup pseudo-parser? Recall that the pseudo-parser implements Catalan grammars by returning an index into the Catalan table. For example, if there were i PPs, the parser would return: (CAT-TABLE i). We now extend the indexing scheme so that the parser implements a series connection of two Catalan grammars by returning one higher index than it would for a simple Catalan grammar. That is, if there were n PPs, the parser would return (CAT-TABLE (+ n 1)).</Paragraph> <Paragraph position="8"> Series connections of Catalan grammars are very common in everyday natural language, as illustrated by the following two sentences, which have received considerable attention in the literature because the parser cannot separate the direct object from the prepositional complement.</Paragraph> <Paragraph position="9"> (27a) I saw the man on the hill with a telescope ... 
(27b) Put the block in the box on the table in the kitchen ...</Paragraph> <Paragraph position="10"> Both examples have a Catalan number of ambiguities because the auto-convolution of a Catalan series yields another Catalan series. 13 This result can improve parsing performance because it suggests ways to re-organize (compile) the grammar so that there will be fewer references to quantities that are not readily available. This re-organization will reap benefits that chart parsers (e.g., Earley's algorithm) do not currently achieve because the re-organization is taking advantage of a number of combinatoric regularities, especially convolution, that are not easily encoded into a chart. Section 9 presents an example of the reorganization. 13 There is a difference between these two sentences because &quot;put&quot; subcategorizes for two objects unlike &quot;see.&quot; Suppose we analyze &quot;see&quot; as lexically ambiguous between two senses, one that selects for exactly two objects like &quot;put&quot; and one that selects for exactly one object as in &quot;I saw it.&quot; The first sense contributes the same number of parses as &quot;put&quot; and the second sense contributes an additional Catalan factor.</Paragraph> </Section> <Section position="2" start_page="5" end_page="5" type="sub_section"> <SectionTitle> 6.2 Chart Parsing </SectionTitle> <Paragraph position="0"> Perhaps it is worthwhile to reformulate chart parsing in our terms in order to show which of the above results can be captured by such an approach and which cannot. Traditionally, chart parsers maintain a chart (or matrix) M, whose entries Mij contain the set of category labels that span from position i to position j in the input sentence. This is accomplished by finding a position k between i and j such that there is a phrase from i to k that can combine with another phrase from k to j. 
An implementation of the inner loop looks something like: (28) Mij := { }; loop for k from i to j do Mij := Mij ∪ Mik * Mkj. Essentially, then, a chart parser is maintaining the invariant (29) $M_{ij} = \sum_k M_{ik} \cdot M_{kj}$ where addition and multiplication of matrix elements are related to parallel and series combination. Thus chart parsers are able to process very ambiguous sentences in polynomial time, as opposed to exponential (or Catalan) time.</Paragraph> <Paragraph position="1"> However, the examples above illustrate cases where chart parsers are not as efficient as they might be. In particular, chart parsers implement convolution the &quot;long way,&quot; by picking each possible dividing point k, and parsing from i to k and from k to j; they do not reduce the convolution of two Catalans as we did above. Similarly, chart parsers do not make use of the &quot;every way ambiguous&quot; generalization; given a Catalan grammar, chart parsers will eventually enumerate all possible values of i, j, and k.</Paragraph> <Paragraph position="2"> 7. Computing the Power Series Directly from the Grammar Thus far, most of our derivations have been justified in terms of successive approximation. It is also possible to derive some interesting (and well-known) results directly from the grammar itself. Suppose, for the sake of discussion, that we choose to analyze adjuncts with a right branching grammar. 14 (By convention, terminal symbols appear in lower case.) (30) ADJS → adj ADJS | Λ First we translate the grammar into an equation in the usual way. That is, ADJS is modeled as a parallel combination of two subgrammars, adj ADJS and Λ. 14 A similar analysis of adjuncts is adopted in Kaplan and Bresnan 1981. This analysis can also be defended on performance grounds as an efficiency approximation. (This approximation is in the spirit of pseudo-attachment (Church 1980).)</Paragraph> <Paragraph position="3"> (Λ, the empty string, is modeled as 1 because it is the 
identity element under series combination, i.e., multiplication.) (31a) ADJS → adj ADJS | Λ (31b) $\mathrm{ADJS} = \mathrm{adj} \cdot \mathrm{ADJS} + 1$ We can simplify (31b) so the right hand side is expressed in terminal symbols alone, with no references to non-terminals. This is very useful for processing because it is much easier for the parser to determine the presence or absence of terminals than of nonterminals. That is, it is easier for the parser to determine, for example, whether a word is an adj, than it is to decide whether a substring is an ADJS phrase. The simplification moves all references to ADJS to the left hand side, by subtracting from both sides, (31c) $\mathrm{ADJS} - \mathrm{adj} \cdot \mathrm{ADJS} = 1$ factoring the left hand side, (31d) $(1 - \mathrm{adj})\,\mathrm{ADJS} = 1$ and dividing from both sides, (31e) $\mathrm{ADJS} = (1 - \mathrm{adj})^{-1}$ By performing the long division, we observe that (31) has unit coefficients.</Paragraph> <Paragraph position="5"> Grammars like ADJS will sometimes be referred to as a step, by analogy to a unit step function in electrical engineering.</Paragraph> <Paragraph position="6"> 8. Computing the Power Series from the ATN This section will re-derive the power series for the unit step grammar directly from the ATN representation by treating the networks as flow graphs (Oppenheim 1975). The graph transformations presented here are directly analogous to the algebraic simplifications employed in the previous section.</Paragraph> <Paragraph position="7"> First we translate the grammar into an ATN in the usual way (Woods 1970).</Paragraph> <Paragraph position="8"> This graph can be simplified by performing a compiler optimization called tail recursion (Church and Kaplan 1981 and references therein). 
This transformation replaces the final push arc with a jump: Tail recursion corresponds directly to the algebraic operations of moving the ADJS term to the left hand side, factoring out the ADJS, and dividing from both sides.</Paragraph> <Paragraph position="9"> Then we remove the top jump arc by series reduction. This step corresponds to multiplying by 1 since a jump arc is the ATN representation for the identity where the zero-th term corresponds to zero iterations around the loop, the first term corresponds to a single iteration, the second term to two iterations, and so on. Recall that (36) is equivalent to: (37) $\frac{1}{1 - \mathrm{adj}}$</Paragraph> </Section> </Section> <Section position="10" start_page="5" end_page="5" type="metho"> <SectionTitle> 1 - adj </SectionTitle> <Paragraph position="0"> With this observation, it is possible to open the loop: (38) ADJS: [ATN diagram: a single arc labeled 1/(1 - adj), followed by Pop] After one final series reduction, the ATN is equivalent to expression (31e) above.</Paragraph> <Paragraph position="1"> (38g) ADJS: [ATN diagram: a single arc labeled 1/(1 - adj), followed by Pop] Intuitively, an ATN loop (or step grammar) is a division operator. We now have composition operators for parallel composition (addition), series composition (multiplication), and loops (division).</Paragraph> <Paragraph position="2"> An ATN loop can be implemented in terms of the table lookup scheme discussed above. First we reformulate the loop as an infinite sum: (In this example we will assume no lexical ambiguity among N, V, P, and adj.) By inspection, we notice that NP and PP are Catalan grammars and that ADJS is a Step grammar.</Paragraph> <Paragraph position="4"> With these observations, the parser can process PPs, NPs, and ADJSs by counting the number of occurrences of terminal symbols and looking up those numbers in the appropriate tables. 
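A rough Python sketch of this counting step follows. It assumes that the input has already been reduced to a list of category tags, in keeping with the no-lexical-ambiguity assumption; the tag names and the helpers cat_coeff, step_coeff, and count_pn are hypothetical. A Catalan grammar indexes the Catalan numbers by the count, while a step grammar contributes a coefficient of 1 for any count.

```python
from math import comb

def cat_coeff(n):
    """Catalan grammar: coefficient for n repetitions (standard closed form)."""
    return comb(2 * n, n) // (n + 1)

def step_coeff(n):
    """Step grammar: every coefficient is 1, whatever the count."""
    return 1

def count_pn(tags):
    """Count adjacent (P, N) pairs in a list of category tags."""
    return sum(1 for a, b in zip(tags, tags[1:]) if (a, b) == ("P", "N"))

tags = ["N", "P", "N", "P", "N"]   # a noun followed by two P N pairs
n = count_pn(tags)                 # n is 2 here, so the Catalan coefficient is 2
print(n, cat_coeff(n), step_coeff(tags.count("adj")))
```

No tree is built at this stage; the parser only needs the counts to select the right table entries.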
We now substitute (41a-c) into (40c).</Paragraph> <Paragraph position="6"> and simplify the convolution of the two Catalan functions</Paragraph> <Paragraph position="8"> so that the parser can also find VPs by just counting occurrences of terminal symbols. Now we simplify (40a-b) so that S phrases can also be parsed by just counting occurrences of terminal symbols. First, translate (40a-b) into the equation: The entire example grammar has now been compiled into a form that is easier for parsing. This formula says that sentences are all of the form: Furthermore, the number of parse trees for a given input sentence can be found by multiplying three numbers: (a) the Catalan of the number of P N's before the verb, (b) the Catalan of one more than the number of P N's after the verb, and (c) the ramp of the number of adj's. For example, the sentence (53) The man on the hill saw the boy with a telescope yesterday in the morning.</Paragraph> <Paragraph position="9"> has Cat 1 * Cat 2 * 3 = 6 parses. 
That is, there is one way to parse &quot;the man on the hill,&quot; two ways to parse &quot;saw the boy with a telescope&quot; (&quot;telescope&quot; is either a complement of &quot;see&quot; as in (54a-c) or is attached to &quot;boy&quot; as in (54d-f)), and three ways to parse the adjuncts (they could both attach to the S (54a,d), or they could both attach to the VP (54b,e), or they could split (54c,f)).</Paragraph> <Paragraph position="10"> scope] [yesterday in the morning.]] The man on the hill [[saw the boy with a telescope] [yesterday in the morning.]] The man on the hill [[saw the boy with a telescope] yesterday] in the morning.</Paragraph> <Paragraph position="11"> [The man on the hill saw [the boy with a telescope] [yesterday in the morning.]] The man on the hill [saw [the boy with a telescope] [yesterday in the morning.]] The man on the hill [saw [the boy with a telescope] yesterday] in the morning.</Paragraph> <Paragraph position="12"> All and only these possibilities are permitted by the grammar.</Paragraph> </Section> </Paper>