<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1089"> <Title>Parsing Parallel Grammatical Representations</Title> <Section position="3" start_page="0" end_page="546" type="metho"> <SectionTitle> 2 A Multidimensional Approach to </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="545" type="sub_section"> <SectionTitle> Quantifier Scoping 2.1 The Autolexical Model </SectionTitle> <Paragraph position="0"> The framework of Autolexical Grammar treats a language as the intersection of numerous independent CF-PSGs, or hierarchies, each of which corresponds to a specific structural or functional aspect of the language. Semantic, syntactic, morphological, discourse-functional and many other hierarchies have been introduced in the literature, but this project focuses on the interactions among only three major hierarchies: and Operator Scope Structure.</Paragraph> <Paragraph position="1"> The surface syntactic hierarchy is a feature-based grammar expressing those generalizations about a sentence which are most clearly syntactic in nature, such as agreement, case, and syntactic valency. The function-argument hierarchy expresses that (formal) semantic information about a sentence which does not involve scope resolution, e.g., semantic valency and association of referential terms with argument positions, as in Park (1995). The operator scope hierarchy, naturally, imposes a scope ordering on the quantifiers and operators found in the expression. Two other, minor hierarchies are employed in this implementation. The linear ordering of words in the surface string is treated as a hierarchy, and a lexical hierarchy is introduced in order to express the differing lexical &quot;strength&quot; of quantifiers.</Paragraph> <Paragraph position="2"> Each hierarchy can be represented as a tree in which the terminal nodes are not ordered with respect to one another. This implies that, for example, \[John \[saw Mary\]\] and \[Mary \[saw John\]\] will both be acceptable syntactic representations for the surface string Mary saw John. The optimal set of hierarchies for a string consists of the candidate hierarchies for each level of representation which together are most structurally congruous. The structural similarity between hierarchies is determined in Autolexical Grammar by means of an Alignment Constraint, which in the implementation described here counts the number of overlapping constituents in the two trees. Thus, while structures similar to \[Mary \[saw John\]\] and \[John \[saw Mary\]\] will both be acceptable as syntactic and function-argument structure representations, the alignment constraint will strongly favor a pairing in which both hierarchies share the same representation. Structural hierarchies are additionally evaluated by means of a Contiguity Constraint, which requires that the terminal nodes of each constituent of a hierarchy should be together in the surface string, or at least as close together as possible.</Paragraph> </Section> <Section position="2" start_page="545" end_page="546" type="sub_section"> <SectionTitle> 2.2 Quantifier Ordering Heuristics </SectionTitle> <Paragraph position="0"> The main constraints which this model places on the relative scope of quantifiers and operators are the alignment of the operator scope hierarchy with syntax, function-argument structure, and the lexical hierarchy of quantifier strength. 
The first of these constraints reflects &quot;the principle that left-to-right order at the same syntactic level is preserved in the quantifier order&quot; and accounts for syntactic extraction restrictions. The second will favor operator scope structures in which scope-taking elements are raised as little as possible from their base argument positions. The last takes account of the scope preferences of individual quantifiers, such as the fact that each tends to have wider scope than all other quantifiers (Hobbs and Shieber, 1987; Moran, 1988).</Paragraph>
<Paragraph position="1"> As an example of the sort of syntactically-based restrictions on quantifier ordering which this model can implement, consider the generalization listed in Moran (1988), that &quot;a quantifier cannot be raised across more than one major clause boundary.&quot; Because the approach pursued here already has a general constraint which penalizes candidate parses according to the degree of discrepancy between their syntax and scope hierarchies, we do not need to accord a privileged theoretical status to &quot;major clause boundaries.&quot; Figure 1 illustrates the approximate optimal structure accorded to the sentence Some patients believe all doctors are competent on the syntactic and scopal hierarchies, in which an extracted quantifier crosses one major clause boundary. It will be given a misalignment index of 4 (considering for the moment only the interaction of these two levels), because of the four overlapping constituents on the two hierarchies.</Paragraph>
<Paragraph position="2"> This example would be misaligned only to degree 2 if the other quantifier order were chosen, and depending on the exact sentence type considered, an example with a scope-taking element crossing two major clause boundaries should be misaligned to about degree 8.</Paragraph>
<Paragraph position="3"> The fact that the difference between the primary and secondary scopings of this sentence is 2 degrees of alignment, while the difference between crossing one clause boundary and two clause boundaries is 4 degrees of alignment, corresponds with generally accepted assumptions about the acceptability of this example. While the reading in which the scope of quantifiers mirrors their order in surface structure is certainly preferred, the other ordering is possible as well. If the extraction crosses another clause boundary, however, as in Some patients believe Mary thinks all doctors are competent, the reversed scoping is considerably more unlikely.</Paragraph> </Section>
<Section position="3" start_page="546" end_page="546" type="sub_section"> <SectionTitle> 2.3 Lexical Properties of Quantifiers </SectionTitle> <Paragraph position="0"> In addition to ranking the possible scopings of a sentence based on the surface syntactic positions of its quantifiers and operators, the parsing and alignment algorithm employed in this project takes into account the &quot;strength&quot; of different scope-taking elements.
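One simple way to encode such a strength ranking, offered here purely as an illustration (the particular ordering and the predicate names below are assumptions, not the paper's actual lexical grammar), is as a small tree in which quantifiers that prefer wide scope dominate the weaker ones:

% Illustrative sketch only.  Quantifiers that prefer wide scope sit
% higher (shallower) in the toy hierarchy; the ranking is an assumption.
lexical_hierarchy([each, [every, [most, [some, a]]]]).

% strength(+Quantifier, -Depth): depth in the lexical hierarchy;
% a smaller depth means a stronger preference for wide scope.
strength(Q, Depth) :-
    lexical_hierarchy(H),
    depth_of(Q, H, 0, Depth).

depth_of(Q, Node, D, D) :-
    atomic(Node), Node == Q.
depth_of(Q, Tree, D0, D) :-
    is_list(Tree),
    D1 is D0 + 1,
    member(Sub, Tree),
    depth_of(Q, Sub, D1, D).

% prefers_wide_scope_over(+Q1, +Q2): Q1 sits above Q2 in the hierarchy.
prefers_wide_scope_over(Q1, Q2) :-
    strength(Q1, D1),
    strength(Q2, D2),
    D2 > D1.

% ?- prefers_wide_scope_over(each, some).   % succeeds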
By introducing a lexical hierarchy of quantifier strength, in which those elements more likely to take wide scope are found higher in the tree, we are able to use the same mechanism of the alignment constraint to model the facts which other approaches treat with stipulative heuristics.</Paragraph>
<Paragraph position="1"> For example, in Some patient paid each doctor, the preferred reading is the one in which each takes wide scope, contrary to our expectations based on the generalization that the primary scoping tends to mirror surface syntactic order. An approach employing some variant of Cooper storage would have to account for this by assigning to each pair of quantifiers a likelihood that one will be raised past the other.</Paragraph>
<Paragraph position="2"> In this case, it would be highly likely for each to be raised past some. The autolexical approach, however, allows us to achieve the same effect without introducing an additional device.</Paragraph>
<Paragraph position="3"> Given a proper weighting of the result of aligning the scope hierarchy with this lexical hierarchy, it is a simple matter to settle on the correct candidates.</Paragraph> </Section> </Section>
<Section position="4" start_page="546" end_page="547" type="metho"> <SectionTitle> 3 The Algorithm </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="546" end_page="547" type="sub_section"> <SectionTitle> 3.1 Parsing Strategy </SectionTitle> <Paragraph position="0"> This implementation of the Autolexical account of quantifier scoping is written for SWI-Prolog, and inherits much of its feature-based grammatical formalism from the code listings of Gazdar and Mellish (1989), including dagunify.pl, by Bob Carpenter. The general strategy employed by the program is first to find all parses which each hierarchy's grammar permits for the string, and then to pass these lists of structures to functions which implement the alignment and contiguity constraints. These functions perform a pairwise evaluation of the agreement between structures, eventually converging on the optimal set of hierarchies.</Paragraph>
<Paragraph position="1"> The same parsing engine is used to generate structures for each of the major hierarchies contributing to the representation of a string. It is based on the left-corner parser of pro_patr.pl in Gazdar and Mellish (1989), attributed originally to Pereira and Shieber (1987). This parser has been extended to store intermediate results for lookup in a hash table.</Paragraph>
<Paragraph position="2"> At present, the parsing of each hierarchy is independent of that of the other hierarchies, but ultimately it would be preferable to allow, e.g., edges from the syntactic parse to contribute to the function-argument parsing process. Such a development would allow us to express categorial prototypes in a natural way. For example, the proposition that &quot;syntactic NPs tend to denote semantic arguments&quot; could be modeled as a default rule for incorporating syntactic edges into a function-argument structure parse.</Paragraph>
<Paragraph position="3"> The &quot;generate and test&quot; mechanism employed here to maximize the congruity of representations on different levels is certainly somewhat inefficient. Some of the structures which it considers will be bizarre by all accounts. To a certain degree, this profligacy is held in check by heuristic cutoffs which exclude a combination from consideration as soon as it becomes apparent that it is misaligned to an unacceptable degree.
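The skeleton below, again purely illustrative (its predicate names are invented, and it assumes a pairwise scorer such as the misalignment/3 predicate sketched earlier), shows the shape of such a cutoff: the running misalignment total is checked each time a candidate hierarchy is added, and a partial combination is abandoned as soon as the threshold is exceeded.

% Illustrative sketch only.  Candidates is a list of candidate-parse
% lists, one per hierarchy; Cutoff is the largest total misalignment we
% are willing to tolerate.
combination_within(Candidates, Cutoff, Combination, Score) :-
    pick(Candidates, [], 0, Cutoff, Combination, Score).

pick([], Chosen, Score, _Cutoff, Combination, Score) :-
    reverse(Chosen, Combination).
pick([Options|Rest], Chosen, Score0, Cutoff, Combination, Score) :-
    member(Candidate, Options),              % choose a parse for this hierarchy
    cost_against(Candidate, Chosen, Added),  % score it against earlier choices
    Score1 is Score0 + Added,
    Cutoff >= Score1,                        % heuristic cutoff: abandon early
    pick(Rest, [Candidate|Chosen], Score1, Cutoff, Combination, Score).

% cost_against(+Tree, +Trees, -Total): summed pairwise misalignment,
% using the misalignment/3 scorer assumed above.
cost_against(_, [], 0).
cost_against(T, [U|Us], Total) :-
    misalignment(T, U, N),
    cost_against(T, Us, Rest),
    Total is N + Rest.

% e.g. combination_within([SyntaxParses, FAParses, ScopeParses], 6, Combo, Score)
% enumerates only combinations whose total misalignment is at most 6.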
Ultimately, however, the solution may lie in some sort of parallel approach. A development of this program designed either for parallel Prolog or for a truly parallel architecture could effect a further restriction on the candidate set of representations by implementing constraints on parallel parsing processes, rather than (or in addition to) on the output of such processes.</Paragraph> </Section>
<Section position="2" start_page="547" end_page="547" type="sub_section"> <SectionTitle> 3.2 Alignment </SectionTitle> <Paragraph position="0"> The alignment constraint (applied by the align/3 predicate here) compares two trees (Prolog lists), returning the total number of overlapping constituents in both trees as a measure of their misalignment. Constituents are said to overlap if the sets of terminal nodes which they dominate intersect, but neither is a subset of the other.</Paragraph>
<Paragraph position="1"> In rough outline, the predicate operates as follows. First, both trees being compared are &quot;pruned&quot; so that neither contains any terminal nodes not found in the other. The terminal elements of each of the tree's constituents are then recorded in lists. Once those constituents which occur in both trees are removed, the sum of the length of these two lists is the total number of overlapping constituents.</Paragraph> </Section>
<Section position="3" start_page="547" end_page="547" type="sub_section"> <SectionTitle> 3.3 Contiguity </SectionTitle> <Paragraph position="0"> While the alignment constraint evaluates the similarity of two trees, the contiguity constraint (contig/3 in this project) calculates the degree of fit between a hierarchy and a string (in this case, the surface string). The relevant measure of &quot;goodness of fit&quot; is taken here to be the minimal number of crossing branches the structure entails. It is true that this approach makes the contiguity constraint dependent on the particular grammatical rules of each representational level. However, since an Autolexical model does not attempt to handle syntax directly in the semantic representation, or morphology in the syntactic representation, there is no real danger of proliferating nonterminal nodes on any particular level.</Paragraph>
<Paragraph position="1"> The definition of the contig predicate is somewhat more complex than that for align, because it must find the minimum number of crossing branches in a structure. It works by maintaining a chart (based on the contval predicate) of the number of branches &quot;covering&quot; each constituent, as it works its way up the tree.</Paragraph>
<Paragraph position="2"> The contmin predicate keeps track of the current lowest contiguity violation for the structure, so that worse alternatives can be abandoned as soon as they cross this threshold.</Paragraph>
<Paragraph position="3"> contig([],_,0).</Paragraph>
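<Paragraph position="4"> As a rough, self-contained illustration of the contiguity idea (this is not the paper's contig/contval/contmin implementation, and its predicate names are invented), the sketch below scores a tree against the surface string by counting, for every constituent, how many times its terminals are interrupted by outside material. It reuses the constituents/2 and yield/2 helpers from the earlier alignment sketch and assumes the terminals of the tree are distinct words of the surface string.

% Illustrative sketch only, not the chart-based contig/contval/contmin
% machinery described above.
contiguity_penalty(Tree, Surface, Penalty) :-
    constituents(Tree, Cs),
    foldl(add_interruptions(Surface), Cs, 0, Penalty).

add_interruptions(Surface, Constituent, Acc0, Acc) :-
    yield(Constituent, Terms),
    findall(I, (nth0(I, Surface, W), memberchk(W, Terms)), Positions),
    interruptions(Positions, N),
    Acc is Acc0 + N.

% interruptions(+Positions, -N): number of breaks between consecutive
% occupied string positions (Positions are produced in ascending order).
interruptions([], 0).
interruptions([_], 0).
interruptions([P, Q|Rest], N) :-
    interruptions([Q|Rest], N0),
    (   Q =:= P + 1
    ->  N = N0
    ;   N is N0 + 1
    ).

% A tree whose constituents are all contiguous in the surface string
% scores 0; each interruption adds 1.
</Paragraph> </Section> </Section> </Paper>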