<?xml version="1.0" standalone="yes"?> <Paper uid="E91-1044"> <Title>AN ALGORITHM FOR GENERATING NON-REDUNDANT QUANTIFIER SCOPINGS</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> AN OUTSIDE-IN ALGORITHM </SectionTitle> <Paragraph position="0"> The usual way to generate scopings is to do it inside-out: Quantifiers of a subformula are either applied to the subformula or lifted to be applied at a higher level.</Paragraph> <Paragraph position="1"> On the approach presented here, generation is done outside-in, i.e. by first choosing the outermost quantifier of the formula to be generated. The motivation behind this unorthodox move is rather pragmatic: It makes it possible, as we shall see below, to implement non-redundancy and sorting in an easy and understandable way. It is also easy to treat examples like the following, presented by Hobbs & Shieber (87): (4) Every man I know a child of has arrived where &quot;a child of...&quot; cannot be scoped outside of &quot;Every man&quot;, since it (presumably) contains a variable that &quot;Every man&quot; binds. Building formulas outside-in, it is trivial to check that a formula only contains variables that are already bound.</Paragraph> <Paragraph position="3"> There may be other good reasons for choosing an outside-in approach; e.g. if anaphora resolution is going to be integrated into the algorithm, or if scope generation is to be done incrementally: Usually, the first NP of a sentence contains the quantifier that by default has the widest scope, so an outside-in algorithm is just the right match for an incremental parser.</Paragraph> <Paragraph position="4"> The outside-in generation works in this way: 1. Select one of the quantifiers returned by get-quants.</Paragraph> <Paragraph position="5"> 2. Generate all possible restrictions of this quantifier by recursively scoping the restrictions. 3. 
Recursively generate all possible scopes of the quantifier by applying the scoping function to the input structure with the selected quantifier (and thereby the quantifiers in its restriction) removed. Note that get-quants is called anew for each subscoping, but it will only find quantifiers which have not yet been applied. 4. Finally, construct a set of formulas by combining the quantifier with all the possible restrictions and scopes.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> THE BASIC ALGORITHM </SectionTitle> <Paragraph position="0"> I will not formulate a precise definition of the algorithm in some formal programming language, but I will in the following give a half-formal definition of the main functions of the algorithm as it works in its basic version, i.e. with neither removal of logical redundancy nor ordering of scopings integrated into the algorithm: The main function is scopings which takes an input</Paragraph> <Paragraph position="1"> form of (almost) any format and returns a set of scoped formulas:</Paragraph> <Paragraph position="3"> where form(get-var(q)/q) means form with get-var(q) substituted for q. The purpose of this substitution is to mark the quantifier as &quot;already bound&quot; by replacing it with the variable it binds. The variable is then used by build-main in the main formula.</Paragraph> <Paragraph position="4"> The function scope-restrictions is defined by</Paragraph> <Paragraph position="6"> where the role of combine-restrictions is to combine scopings when there are several restrictions to a quantifier, e.g. both a relative clause and a prepositional phrase. 
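The outside-in scheme described above can be sketched in Python. This is a hypothetical reconstruction, not the paper's Common Lisp implementation: the input is simplified to a chain of nested quantifiers (quantifier k's restriction contains quantifiers k+1..n), and get-quants, build-quant etc. are folded into one function.

```python
def scopings(chain):
    """Outside-in scoping for a chain of nested quantifiers.
    Step 1: pick any quantifier in the chain as outermost.
    Step 2: recursively scope the quantifiers inside its restriction.
    Step 3: recursively scope the remaining quantifiers below it.
    Step 4: combine into (quantifier, restriction, body) tuples."""
    if not chain:
        return ["T"]                 # atomic formula placeholder
    out = []
    for k, q in enumerate(chain):
        inner = chain[k + 1:]        # quantifiers inside q's restriction
        outer = chain[:k]            # quantifiers left for the body
        for r in scopings(inner):    # step 2
            for s in scopings(outer):  # step 3
                out.append((q, r, s))  # step 4
    return out

# A chain of 3 nested quantifiers yields 5 scopings.
print(len(scopings(["most-reps", "most-depts", "most-companies"])))  # 5
```

For an un-nested sentence the same scheme would enumerate all n! permutations; the nesting is what cuts the count down, as discussed later in the paper.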
Roughly, combine-restrictions works by using the application-defined function build-conjunction to conjoin one element from each of the sets in its argument set.</Paragraph> <Paragraph position="7"> This is the whole algorithm in its most basic version4, provided, of course, that the functions build-main, build-quant, build-conjunction, get-quants, get-var and get-restrictions are defined. These may be defined to fit almost any kind of input and output structure5</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> REMOVING LOGICAL REDUNDANCY </SectionTitle> <Paragraph position="0"> We now turn to the enhancements which are the main concern of this paper. We first look at the most important, the removal of logically redundant scopings. To give a precise formulation of the kind of logical redundancy that we want to avoid, we first need some definitions: algorithm of Hobbs & Shieber (87) does when &quot;opaque operators&quot; are left out.</Paragraph> <Paragraph position="1"> 5In the actual Common Lisp implementation, substitution of variables for quantifiers is done by destructive list manipulation. This means that quantifiers must be cons cells, and that the occurrence of a quantifier in the list returned by get-quants(form) must share with the occurrence of the same quantifier in form.</Paragraph> <Paragraph position="2"> It is easily seen that both existential and universal determiners are scope-commutative, and that existential, but not universal, determiners are restrictor-commutative. In natural language, this means that e.g. A representative of a company arrived is not ambiguous, in contrast to Every representative of every company arrived. 
Typical generalized quantifiers like most are neither restrictor-commutative nor scope-commutative6.</Paragraph> <Paragraph position="3"> Since quantifiers are selected outside-in, it is now easy to equip the algorithm with a mechanism to remove redundant scopings: If the surrounding quantifier has a scope-commutative determiner, quantifiers with the same determiner and which precede the surrounding quantifier in the default ordering are not selected.</Paragraph> <Paragraph position="4"> For example, this means that in Every man loves every woman, &quot;every man&quot; has to be selected before &quot;every woman&quot;. The algorithm will also try &quot;every woman&quot; as the first quantifier, but will then discard that alternative because &quot;every man&quot; cannot be selected in the next step - it precedes &quot;every woman&quot; in the default ordering. For more complex sentences, this discarding may give a significant time saving, which will be discussed below.</Paragraph> <Paragraph position="5"> The algorithm also takes care of the restrictor-commutativity of existential determiners by using the same technique of comparing with the surrounding quantifier when restrictions on quantifiers are recursively scoped.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> PARTIALLY ORDERING THE SCOPINGS </SectionTitle> <Paragraph position="0"> Generating outside-in, one has a &quot;global&quot; view of the generation process, which may be an advantage when trying to integrate ordering of scopings according to preference into the algorithm. 
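The redundancy check described in the previous section can be sketched as a small predicate. This is a hypothetical Python rendering (the paper does not give this code): a quantifier is modelled as a (determiner, default-position) pair, and the set of scope-commutative determiners is an assumption of the sketch.

```python
# Assumed set of scope-commutative determiners for this sketch:
# universal and existential, per the paper's discussion.
SCOPE_COMMUTATIVE = {"every", "a"}

def selectable(q, surrounding):
    """May quantifier q be selected directly below `surrounding`?
    If the surrounding quantifier has a scope-commutative determiner,
    a quantifier with the same determiner that precedes it in the
    default ordering is not selected (it would only reproduce a
    logically equivalent scoping)."""
    if surrounding is None:          # q is the outermost quantifier
        return True
    det, pos = q
    s_det, s_pos = surrounding
    if s_det in SCOPE_COMMUTATIVE and det == s_det and pos < s_pos:
        return False
    return True

# "Every man loves every woman": below ("every", 1) the earlier
# ("every", 0) is blocked, so only the default order survives...
print(selectable(("every", 0), ("every", 1)))  # False
print(selectable(("every", 1), ("every", 0)))  # True
# ...while "most" permits both orders, since it is not scope-commutative.
print(selectable(("most", 0), ("most", 1)))    # True
```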
As an example, the implemented algorithm provides a very simple kind of preference ordering: A scoping is considered &quot;better&quot; than another scoping if the number of quantifiers occurring in a non-default position is lower.</Paragraph> <Paragraph position="1"> It is supposed that the input comes with a default ordering, and that the application-specific function get-quants takes care of this. This default order may reflect several heuristics for scope generation; e.g. that the of-complements of NPs usually take scope over the whole NP (and thus should be lifted by default).</Paragraph> <Paragraph position="2"> The trick is now to assign a &quot;penalty&quot; number to every sub-scoping. Every time several quantifiers can be chosen at a given step, the penalty is increased by 1 if a quantifier different from the default one is chosen. And every time a quantifier is constructed, its penalty is set to the sum of the penalties of the restrictor and scope subformulas. Thus, the penalty counts the number of quantifier displacements (compared to the default scoping). The main function of the Common Lisp implementation thus looks like this7: Here prefer is a function which increases the penalty of each of the scopings in its second list, and calls merge-scopings on the two lists. Merge-scopings merges the two lists with the penalty as ordering criterion. This function is used whenever needed by the algorithm, such that one never needs to re-order the scoping list. From the last function-call above, one can also see how the coding of penalties is done: Atomic formulas are marked with a zero in their car. 
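The prefer / merge-scopings behaviour just described might look like this in Python. This is a hypothetical rendering of the mechanism, not the paper's Common Lisp code, with each scoping modelled as a (penalty, formula) pair:

```python
def merge_scopings(xs, ys):
    """Merge two lists of (penalty, formula) pairs that are already
    sorted by penalty (lowest, i.e. most preferred, first)."""
    out = []
    i = j = 0
    while i < len(xs) and j < len(ys):
        if xs[i][0] <= ys[j][0]:
            out.append(xs[i]); i += 1
        else:
            out.append(ys[j]); j += 1
    return out + xs[i:] + ys[j:]

def prefer(default_scopings, displaced_scopings):
    """Add 1 to the penalty of every scoping in the second list (one
    extra quantifier displacement) and merge the two ordered lists,
    so the result never needs re-sorting."""
    bumped = [(p + 1, f) for p, f in displaced_scopings]
    return merge_scopings(default_scopings, bumped)

# Scopings built in default order keep their penalty; displaced ones
# accumulate one displacement per non-default choice.
print(prefer([(0, "default"), (1, "shifted")], [(0, "lifted")]))
# [(0, 'default'), (1, 'shifted'), (1, 'lifted')]
```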
This number is later removed; the penalty is always stored only in the car of the whole scoped formula.</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> SCOPE OF RELATIVE CLAUSE QUANTIFIERS </SectionTitle> <Paragraph position="0"> Whether it is a general constraint on English may be questionable, but at least for practical purposes it seems reasonable to assume that no other quantifiers than the existential quantifier may be extracted out of a relative clause.</Paragraph> <Paragraph position="1"> The algorithm makes it easy to implement such a constraint. Since the quantifiers that can be used at a given step are given by the application-defined function get-quants, it is easy for any implementation of get-quants to filter out all non-existential quantifiers when looking for quantifiers inside a relative clause. Here some of the burden is put on the grammar: The parts of the input structures that correspond to relative clauses must be marked to be distinguishable from e.g. PP complements8.</Paragraph> <Paragraph position="2"> 6To prove non-scope-commutativity of most, construct an actual example where Most men love most women holds, but Most women are loved by most men does not hold (with the default scopings)! 7For clarity, the mechanism for removing logical redundancy is left out here.</Paragraph> <Paragraph position="3"> 8One could also put all the burden on the grammar, if one wanted the structures to contain the quantifier list as a</Paragraph> </Section> <Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> THE NUMBER OF SCOPINGS </SectionTitle> <Paragraph position="0"> Hobbs and Shieber (87) point out that just by avoiding those scopings that are structurally impossible, the number of scopings generated is significantly lower than n!. 
For the following sentence, the reduction is from 8! = 40320 to &quot;only&quot; 2988: (5) A representative of a department of a company gave a friend of a director of a company a sample of a product.</Paragraph> <Paragraph position="1"> Of course, the sentence has only one &quot;real&quot; scoping! Since the algorithm presented here removes logical redundancy by looking at the default order already when a quantifier is selected for the generation of a subformula, the gain for sentences like (5) is tremendous9.</Paragraph> <Paragraph position="2"> The above suggests that the complexity of scoping algorithms is a function of both the number of quantifiers in the input, and of the structure of the input. The highest number of scopings is obtained when the input contains n quantifiers, none of which is contained in a restriction of one of the others. An example of this is Most women give most men a flower. In such cases, no quantifier permutations can be sorted out on structural grounds, so the number of scopings is n!.</Paragraph> <Paragraph position="3"> For more complex sentences, the picture is fairly complex. The easiest task is to look at the case where the lowest number of scopings is obtained (disregarding logical redundancy), when all quantifiers are nested inside each other, e.g.</Paragraph> <Paragraph position="4"> (6) Most representatives of most departments of most companies of most cities sighed.</Paragraph> <Paragraph position="5"> It is easy to see that if N is the function that counts the number of scopings in such a sentence, then N(n) = N(n-1)N(0) + N(n-2)N(1) + ... + N(0)N(n-1).</Paragraph> <Paragraph position="7"> Here N(n-k)N(k-1) is the number of subscopings generated if quantifier number k is selected as the outermost, the factors being the numbers of subscopings of the restriction and scope of that quantifier, respectively. Of course, N(0) = 1.</Paragraph> <Paragraph position="8"> 9For this particular sentence, the single scoping is generated in less than 1/200 of the time required to generate the 2988 scopings of the same sentence with 'most' substituted for 'a'.</Paragraph> <Paragraph position="9"> substructure. This seems difficult to do with a pure unification grammar, however.</Paragraph> <Paragraph position="10"> It can be shown that10 N(n) is of order 4^n n^(-3/2). The important observation here is that the number of scopings of the completely nested sentences no longer is of factorial order, but of &quot;only&quot; exponential order. This gives us a mathematical confirmation of the suspicion that the number of scopings of such sentences is significantly lower than the number of permutations of quantifiers. For sentences which contain two argument NPs and the rest of the quantifiers nested inside each of these, the number of scopings is also N(n). For sentences with three argument NPs, it is somewhat higher, but still of exponential order.</Paragraph> </Section> <Section position="10" start_page="0" end_page="0" type="metho"> <SectionTitle> COMPUTATIONAL COMPLEXITY </SectionTitle> <Paragraph position="0"> What is the optimal way to generate (an explicit representation of) the n! scopings of the worst case? The absolute lower bound of the time complexity will necessarily be at least as bad as the lower bound on space complexity. And the absolute lower bound on space complexity is given by the size of an optimally structure-sharing direct representation of the n! scopings. Such a representation will only contain one instance of each possible subscoping, but it has to contain all subscopings as substructures. This makes a total of n + n(n-1) + ... + n! subscopings.</Paragraph> <Paragraph position="1"> Factoring out n!, we get n!(1 + 1/1! + 1/2! + ... + 1/(n-1)!). 
Readers trained in elementary calculus will recognize the latter sum as the Taylor polynomial of degree n-1 around 0 of the exponential function, applied to argument 1, i.e. the sum converges to the number e. This means that the total number of subscopings - and hence the lower bound on space complexity - is of order n!.</Paragraph> <Paragraph position="2"> Without any structure-sharing, the number of subscopings generated will of course be n·n!. This is exactly what happens here: The algorithm presented is O(n^2·n!) in time and space (provided that no redundancy occurs). This estimate presupposes that get-quants is of order n in both time and space, even when less than n quantifiers are left (presumably this figure will be better for some implementations of get-quants). 10See e.g. Jacobsen (51), p. 19.</Paragraph> <Paragraph position="4"> By comparison, the Hobbs & Shieber algorithm is O(n!), by using optimal structure sharing.</Paragraph> <Paragraph position="5"> Does this mean that the outside-in approach should be rejected? Note that we above only considered the non-nested case. In the nested case, the algorithm presented here gains somewhat, while the Hobbs & Shieber algorithm loses somewhat. In both cases, scoping of restrictions has to be redone for every new application of the quantifier they restrict. This means that in the general case, the Hobbs & Shieber algorithm no longer provides optimal structure sharing, while the algorithm presented here provides a modest structure sharing. Now, both algorithms can of course be equipped with a hash table (or even a plain array) for storing sets of subscopings (by the quantifiers left to be bound). This has been successfully tried out with the algorithm presented here. It brings the complexity down to the optimal: O(n!) in the worst case, and similarly to O(4^n n^(-3/2)) in the completely nested case. 
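The nested-case count N from the previous section, and its exponential rather than factorial growth, can be checked numerically. The memoized recurrence below plays the role of the hash table of subscoping sets just mentioned (a sketch; only the counts are computed, not the scopings themselves):

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def N(n):
    """Number of scopings of a completely nested chain of n quantifiers:
    N(n) = sum over k of N(n-k) * N(k-1), with N(0) = 1.
    These are the Catalan numbers, of order 4^n * n^(-3/2)."""
    if n == 0:
        return 1
    return sum(N(n - k) * N(k - 1) for k in range(1, n + 1))

# Exponential order, far below n! already for modest n:
print([N(n) for n in range(7)])   # [1, 1, 2, 5, 14, 42, 132]
print(N(10), factorial(10))       # 16796 vs 3628800
```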
So, there is, at least in theory, nothing to be lost in efficiency by using an outside-in algorithm.</Paragraph> </Section> <Section position="11" start_page="0" end_page="0" type="metho"> <SectionTitle> THE SINGLE-SCOPING CASE </SectionTitle> <Paragraph position="0"> What about the promised reduction of complexity due to redundancy checking? We consider the case where a sentence contains n un-nested existential quantifiers. Then the complexity is given by the number of times the algorithm tries to generate a subscoping, multiplied by the complexity of get-quants. When quantifier number k is selected as the outermost, n-k quantifiers are left applicable in the resulting recursive call to the algorithm. Let S be the function that counts the number of subscopings considered. We have: S(n) = S(n-1) + S(n-2) + ... + S(0), with S(0) = 1, which gives S(n) = 2^(n-1).</Paragraph> <Paragraph position="2"> Thus, in the single-scoping case the algorithm is O(n·2^n) for input with un-nested quantifiers (and even lower for nested quantifiers).</Paragraph> <Paragraph position="3"> Although the savings will be somewhat less spectacular for sentences with more than 1 scoping, this nevertheless shows that removing logical redundancy not only is worthwhile in its own right, but also gives a significant reduction of the complexity of the algorithm.</Paragraph> </Section> <Section position="12" start_page="0" end_page="0" type="metho"> <SectionTitle> MODULAR THEORIES OF LINGUISTICS </SectionTitle> <Paragraph position="0"> The algorithm presented here is related to the work of Johnson & Kay (90) by its modular nature.</Paragraph> <Paragraph position="1"> As mentioned, the interface with the syntax (parse output) is through a small set of access functions (get-quants, get-restrictions, get-var, and quant-type) and the interface with the semantics (the output of the algorithm) is through a small set of</Paragraph> <Paragraph position="2"> constructor functions (build-conjunction, build-main and build-quant). 
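One possible binding of this access/constructor interface might look as follows. This is a hypothetical Python sketch of the "software glue" idea, assuming a dict-based input format that is not the paper's; only the function names come from the paper.

```python
# Access functions (syntax side). The dict format is an assumption
# made for this sketch.
def get_quants(form):
    """Quantifiers of the input form that are not yet bound."""
    return [q for q in form["quants"] if not q.get("bound")]

def get_var(q):
    """The variable a quantifier binds."""
    return q["var"]

def get_restrictions(q):
    """The sub-forms restricting the quantifier (relative clauses, PPs)."""
    return q.get("restrictions", [])

def quant_type(q):
    """The determiner of the quantifier, e.g. 'every' or 'a'."""
    return q["det"]

# Constructor functions (semantics side), building plain tuples here.
def build_quant(q, restriction, scope):
    return (q["det"], q["var"], restriction, scope)

def build_conjunction(conjuncts):
    return ("and",) + tuple(conjuncts)

def build_main(form):
    return ("pred", form["pred"])

np = {"det": "every", "var": "x", "restrictions": []}
print(build_quant(np, ("pred", "man"), ("pred", "arrived")))
# ('every', 'x', ('pred', 'man'), ('pred', 'arrived'))
```

Swapping in a different parse-output format or a different target logic only means rewriting these eight small functions, which is the modularity claim made in the text.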
The implementation thus is a convenient &quot;software glue&quot; which allows a high degree of freedom in the choice of both syntactic and semantic framework.</Paragraph> <Paragraph position="3"> This approach is not as &quot;nice&quot; as that of Johnson & Kay (90) or Halvorsen & Kaplan (88), and may on such grounds be rejected as a theory of the syntactic/semantic interface. But the question is whether it is possible to state any relationship between syntax and semantics which satisfies my four initial requirements (non-redundancy, ordering, special treatment of sub-clauses and modularity), and which still is &quot;beautiful&quot; or &quot;simple&quot; according to some standard.</Paragraph> </Section> </Paper>