<?xml version="1.0" standalone="yes"?>
<Paper uid="P92-1026">
  <Title>HANDLING LINEAR PRECEDENCE CONSTRAINTS BY UNIFICATION</Title>
  <Section position="3" start_page="201" end_page="201" type="metho">
    <SectionTitle>
LINGUISTIC MOTIVATION
</SectionTitle>
    <Paragraph position="0"> This section presents the linguistic motivation for our approach. LP statements in GPSG (Gazdar et al. 1985) constrain the possibility of linearizing immediate dominance (ID) rules.</Paragraph>
    <Paragraph position="1"> By taking the right-hand sides of ID rules as their domain, they allow only the ordering of sibling constituents. Consequently, grammars must be designed in such a way that all constituents which are to be ordered by LP constraints are dominated by one node in the tree, so that "flat" phrase structures result, as illustrated in figure 1.

[Figure 1: flat phrase structure in which V^max immediately dominates sollte (should), NP[nom] der Kurier (the courier), ADV nachher (later), NP[dat] einem Spion (a spy), NP[acc] den Brief (the letter) and V^0 zustecken (slip). Translation: "The courier was later supposed to slip a spy the letter."]

Uszkoreit (1986) argues that such flat structures are not well suited for the description of languages such as German and Dutch. The main reason is so-called complex fronting, i.e., the fronting of a non-finite verb together with some of its complements and adjuncts, as shown in (1). Since it is a well-established fact that only one constituent can be fronted, the flat structure can account for the German examples in (1), but not for the ones in (2).

[Examples (1) and (2) are only partly preserved; the surviving example reads: nachher einem Spion den Brief zustecken sollte der Kurier]

In the hierarchical tree structure in figure 2, the boxed constituents can be fronted, accounting for the examples in (1) and (2).</Paragraph>
    <Paragraph position="2">  But with this tree structure, LP constraints can no longer be enforced over siblings. The new domain for linear order is a head domain, defined as follows: A head domain consists of the lexical head of a phrase, and its complements and adjuncts. LP constraints must be respected within a head domain.</Paragraph>
    <Paragraph position="3"> An LP-constraint is an ordered pair &lt;A,B&gt; of category descriptions, such that whenever a node α subsumed by A and a node β subsumed by B occur within the domain of an LP-rule (in the case of GPSG a local tree, in our case a head domain), α precedes β.</Paragraph>
    <Paragraph position="4"> An LP constraint &lt;A,B&gt; is conventionally written as A &lt; B. It follows from the definition that B can never precede A in an LP domain. In the next section, we will show how this property is exploited in our encoding of LP constraints.</Paragraph>
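    <Paragraph>The definition above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: category descriptions are treated as plain atomic labels rather than feature structures, subsumption is reduced to equality, and the names LP_RULES and violates_lp are hypothetical, introduced only for this illustration.

```python
# Illustrative sketch only: categories are atomic labels, not feature structures.
# An LP constraint (A, B) states that A precedes B inside one LP domain.
LP_RULES = {("A", "B")}


def violates_lp(domain, rules=LP_RULES):
    """Return True if the ordered domain contains some B before some A."""
    for i, left in enumerate(domain):
        for right in domain[i + 1:]:
            # left precedes right in the domain; this is illegal exactly
            # when a rule demands the opposite order (right before left).
            if (right, left) in rules:
                return True
    return False


print(violates_lp(["A", "X", "B"]))  # A before B: the rule is respected
print(violates_lp(["B", "A"]))       # B before A: the rule is violated
```

The check only ever asks whether a B stands before an A, which is the property the encoding in the next section relies on.
    </Paragraph>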
  </Section>
  <Section position="4" start_page="201" end_page="202" type="metho">
    <SectionTitle>
ENCODING OF LP CONSTRAINTS
</SectionTitle>
    <Paragraph position="0"> From a formal point of view, we want to encode LP constraints in such a way that  * violation of an LP constraint results in unification failure, and * LP constraints, which operate on head domains, can be enforced in local trees by checking sibling nodes.</Paragraph>
    <Paragraph position="1"> The last condition can be ensured if every node in a projection carries information about which constituents are contained in its head domain. An LP constraint A &lt; B implies that it can never be the case that B precedes A. We make use of this fact by the following additions to the grammar: * Every category A carries the information that B must not occur to its left.</Paragraph>
    <Paragraph position="2"> * Every category B carries the information that A must not occur to its right.</Paragraph>
    <Paragraph position="3"> This duplication of encoding is necessary because only the complements/adjuncts check whether the projection with which they are combined contains something that is incompatible with the LP constraints. A projection contains only information about which constituents are contained in its head domain, but no restrictions on its left and right context (see footnote 2).</Paragraph>
    <Paragraph position="4"> In the following example, we assume the LP-rules A&lt;B and B&lt;C. The lexical head of the tree is X^0, and the projections are X and X^max. The complements are A, B and C. Each projection contains information about the constituents contained in it, and each complement contains information about what must not occur to its left and right. A complement is only combined with a projection if the projection does not contain any category that the complement prohibits on its right or left, depending on which side the projection is added.</Paragraph>
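    <Paragraph>The example with the rules "A precedes B" and "B precedes C" can be sketched in Python as follows. This is a rough sketch, not the encoding itself: categories are atomic strings, the contents of a projection are a plain set, and the helper names forbidden_left, forbidden_right and attach are hypothetical.

```python
# Hypothetical sketch: atomic categories, rules written as (first, second) pairs.
LP_RULES = [("A", "B"), ("B", "C")]


def forbidden_left(cat, rules=LP_RULES):
    """Categories that must not occur to the left of cat."""
    return {second for first, second in rules if first == cat}


def forbidden_right(cat, rules=LP_RULES):
    """Categories that must not occur to the right of cat."""
    return {first for first, second in rules if second == cat}


def attach(cat, projection, side):
    """Attach complement cat on the given side ('left' or 'right') of a projection.

    projection is the set of categories already inside the head domain.
    Returns the enlarged set, or None if an LP rule would be violated.
    """
    # A complement attached on the left has the whole projection to its right.
    banned = forbidden_right(cat) if side == "left" else forbidden_left(cat)
    if banned.intersection(projection):
        return None
    return projection | {cat}


# Build the order A B C X^0 from the head outwards.
domain = attach("C", set(), "left")
domain = attach("B", domain, "left")
domain = attach("A", domain, "left")
print(sorted(domain))

# Putting B to the left of a projection that already contains A is rejected.
print(attach("B", {"A"}, "left"))
```

Only the complement consults its prohibitions; the projection merely records what it contains, which mirrors the division of labour described above.
    </Paragraph>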
    <Paragraph position="6"> Having now roughly sketched our approach, we will turn to the questions of how a violation of LP constraints results in unification failure, how the information associated with the projections is built up, and what to do if LP constraints operate on feature structures rather than on atomic categories.</Paragraph>
    <Paragraph position="7"> Footnote 2: Alternatively, the projections of the head could as well accumulate the ordering restrictions while the arguments and adjuncts only carry information about their own LP-relevant features. The choice between the alternatives has no linguistic implications since it only affects the grammar compiled for processing and not the one written by the linguist.</Paragraph>
  </Section>
  <Section position="5" start_page="202" end_page="203" type="metho">
    <SectionTitle>
VIOLATION OF LP-CONSTRAINTS
AS UNIFICATION FAILURE
</SectionTitle>
    <Paragraph position="0"> As a conceptual starting point, we take a number of LP constraints. For the expository purposes of this paper, we oversimplify and assume just the following four LP constraints:
nom &lt; dat (nominative case precedes dative case)
nom &lt; acc (nominative case precedes accusative case)
dat &lt; acc (dative case precedes accusative case)
pro &lt; nonpro (pronominal constituents precede non-pronominal constituents)
Note that nom, dat, acc, pro and nonpro are not syntactic categories, but rather values of syntactic features. A constituent, for example the pronoun ihn (him), may be both pronominal and in the accusative case. For each of the above values, we introduce an extra boolean feature, as illustrated in figure 5.

Arguments encode in their feature structures what must not occur to their left and right sides. The dative NP einem Spion (a spy), for example, must not have any accusative constituent to its left, and no nominative or pronominal constituent to its right, as encoded in the following feature structure. The feature structures that constrain the left and right contexts of arguments only use '-' as a value for the LP-relevant features.</Paragraph>
    <Paragraph position="1">  Lexical heads and projections of the head contain a feature LP-STORE, which records the LP-relevant information occurring within their head domain (figure 7).</Paragraph>
    <Paragraph position="3"> In our example, where the verbal lexical head is not affected by any LP constraints, the LP-STORE contains the information that no LP-relevant features are present.</Paragraph>
    <Paragraph position="4"> For a projection like einen Brief zusteckt (a letter[acc] slips), we get the following LP-STORE.  The NP einem Spion (figure 6) can be combined with the projection einen Brief zusteckt (figure 8) to form the projection einem Spion einen Brief zusteckt (a spy[dat] a letter[acc] slips) because the RIGHT feature of einem Spion and the LP-STORE of einen Brief zusteckt do not contain incompatible information, i.e., they can be unified. This is how violations of LP constraints are checked by unification. The projection einem Spion einen Brief zusteckt has the following LP-STORE.</Paragraph>
    <Paragraph position="5">  The constituent ihn zusteckt (figure 10) could not be combined with the non-pronominal NP einem Spion. In this case, the value of the RIGHT feature of the argument einem Spion is not unifiable with the LP-STORE of the head projection ihn zusteckt because the feature PRO has two different atoms (+ and -) as its value. This is an example of a violation of an LP constraint leading to unification failure.</Paragraph>
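    <Paragraph>The two cases just discussed can be replayed with a small Python sketch. It assumes flat feature structures represented as dictionaries with the atoms '+' and '-', which is far simpler than full feature-structure unification. The function unify and the concrete LP-STORE values are reconstructions from the running text, not copies of the figures.

```python
# Sketch: flat feature structures as dicts with atomic values '+' or '-'.
# A missing key means that the feature is unspecified.
def unify(fs1, fs2):
    """Return the unification of two flat feature structures, or None on a clash."""
    result = dict(fs1)
    for feature, value in fs2.items():
        if feature in result and result[feature] != value:
            return None  # two different atoms for one feature: failure
        result[feature] = value
    return result


# RIGHT context of the dative NP "einem Spion": no nominative and no
# pronominal constituent may follow it.
einem_spion_right = {"NOM": "-", "PRO": "-"}

# Assumed LP-STOREs of the two head projections discussed in the text.
einen_brief_zusteckt = {"NOM": "-", "DAT": "-", "ACC": "+", "PRO": "-", "NONPRO": "+"}
ihn_zusteckt = {"NOM": "-", "DAT": "-", "ACC": "+", "PRO": "+", "NONPRO": "-"}

# Compatible: "einem Spion einen Brief zusteckt" is licensed.
print(unify(einem_spion_right, einen_brief_zusteckt) is not None)

# Clash on PRO: "einem Spion ihn zusteckt" is ruled out.
print(unify(einem_spion_right, ihn_zusteckt) is None)
```
    </Paragraph>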
    <Paragraph position="6"> In the next section, we show how LP-STOREs are manipulated.</Paragraph>
  </Section>
  <Section position="6" start_page="203" end_page="204" type="metho">
    <SectionTitle>
MANIPULATION OF THE LP-STORE
</SectionTitle>
    <Paragraph position="0"> Since information about constituents is added to the LP-STORE, it would be tempting to add this information by unification, and to leave the initial LP-STORE unspecified for all features. This is not possible because violation of LP constraints is also checked by unification. In the process of this unification, values for features are added that may lead to unwanted unification failure when information about a constituent is added higher up in the tree.</Paragraph>
    <Paragraph position="1"> Instead, the relation between the LP-STORE of a projection and the LP-STORE of its mother node is encoded in the argument that is added to the projection. In this way, the argument "changes" the LP-STORE by "adding" information about itself. Arguments therefore have the additional features LP-IN and LP-OUT.</Paragraph>
    <Paragraph position="2"> When an argument is combined with a projection, the projection's LP-STORE is unified with the argument's LP-IN, and the argument's LP-OUT is the mother node's LP-STORE. The relation between LP-IN and LP-OUT is specified in the feature structure of the argument, as illustrated in figure 11 for the accusative pronoun ihn, which is responsible for changing figure 7 into figure 10. No matter what the value for the features ACC and PRO may be in the projection that the argument combines with, it is '+' for both features in the mother node. All other features are left unchanged.

Note that only a '+' is added as value for LP-relevant features in LP-OUT, never a '-'. In this way, only positive information is accumulated, while negative information is "removed". Positive information is never "removed".</Paragraph>
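    <Paragraph>The relation between LP-IN and LP-OUT can be approximated by a function from one store to the next. The following Python sketch makes that assumption explicit: in the grammar the relation is stated declaratively through coreferences inside the argument's feature structure, whereas here it is modelled procedurally. The names FEATURES, empty_store and lp_out are hypothetical.

```python
# The five boolean LP-relevant features assumed in the running example.
FEATURES = ("NOM", "DAT", "ACC", "PRO", "NONPRO")


def empty_store():
    """LP-STORE of a lexical head that carries no LP-relevant information."""
    return {feature: "-" for feature in FEATURES}


def lp_out(lp_in, own_features):
    """Compute the mother's LP-STORE from the projection's LP-STORE.

    Features contributed by the argument become '+', whatever their value
    was before; all other features are passed through unchanged.
    """
    out = dict(lp_in)  # copy, so the daughter's store is left untouched
    for feature in own_features:
        out[feature] = "+"
    return out


# The accusative pronoun "ihn" combines with the bare verb "zusteckt".
verb_store = empty_store()
ihn_zusteckt = lp_out(verb_store, ["ACC", "PRO"])
print(ihn_zusteckt)
```

Because the function only ever writes '+', a value that is already '+' stays '+' when further arguments are added, which corresponds to the observation that positive information is never removed.
    </Paragraph>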
    <Paragraph position="3"> Even though an argument or adjunct constituent may have an LP-STORE, resulting from LP constraints that are relevant within the constituent, it is ignored when the constituent becomes argument or adjunct to some head. Our encoding ensures that LP constraints apply to all head domains in a given sentence, but not across head domains.</Paragraph>
    <Paragraph position="4"> It still remains to be explained how complex phrases that become arguments receive their LP-IN, LP-OUT, RIGHT and LEFT features. These are specified in the lexical entry of the head of the phrase, but they are ignored until the maximal projection of the head becomes argument or adjunct to some other head. They must, however, be passed on unchanged from the lexical head to its maximal projection. When the maximal projection becomes an argument/adjunct, they are used to check LP constraints and "change" the LP-STORE of the head's projection.</Paragraph>
    <Paragraph position="5"> Footnote 3: Coreference variables are indicated by boxed numbers. [ ] is the feature structure that contains no information (TOP) and can be unified with any other feature structure.</Paragraph>
    <Paragraph position="6"> Our method also allows for the description of head-initial and head-final constructions. In German, for example, we find prepositions (e.g. für), postpositions (e.g. halber) and some words that can be both pre- and postpositions (e.g. wegen).</Paragraph>
    <Paragraph position="7"> The LP-rules would state that a postposition follows everything else, and that a preposition precedes everything else.</Paragraph>
    <Paragraph position="9"> The information about whether something is a preposition or a postposition is encoded in the lexical entry of the preposition or postposition. In the following figure, the LP-STORE of the lexical head [remainder of sentence and figure not preserved]. A word that can be both a preposition and a postposition is given a disjunction of the two lexical entries [figure not preserved]. The manipulation of the LP-STORE by the features LP-IN and LP-OUT works as usual.</Paragraph>
    <Paragraph position="10"> The above example illustrates that our method of encoding LP constraints works not only for verbal domains, but for any projection of a lexical head. The order of quantifiers and adjectives in a noun phrase can be described by LP constraints.</Paragraph>
  </Section>
  <Section position="7" start_page="204" end_page="206" type="metho">
    <SectionTitle>
INTEGRATION INTO HPSG
</SectionTitle>
    <Paragraph position="0"> In this section, our encoding of LP constraints is incorporated into HPSG (Pollard &amp; Sag 1987). We deviate from the standard HPSG grammar in the following respects: * The features mentioned above for the encoding of LP-constraints are added.</Paragraph>
    <Paragraph position="1"> * Only binary branching grammar rules are used. * Two new principles for handling LP-constraints are added to the grammar.</Paragraph>
    <Paragraph position="2"> Further we shall assume a set-valued SUBCAT feature as introduced by Pollard (1990) for the description of German. Using sets instead of lists as the values of SUBCAT ensures that the order of the complements is only constrained by LP-statements. In the following figure, the attributes needed for the handling of LP-constraints are assigned their place in the HPSG feature system.</Paragraph>
    <Paragraph position="4"> The paths SYNSEM|LOC|HEAD|{LP-IN,LP-OUT,RIGHT,LEFT} contain information that is relevant when the constituent becomes an argument/adjunct. They are HEAD features so that they can be specified in the lexical head of the constituent and are percolated via the Head Feature Principle to the maximal projection. The path SYNSEM|LOC|LP-STORE contains information about LP-relevant features contained in the projection dominated by the node described by the feature structure. LP-STORE cannot be a head feature because it is "changed" when an argument or adjunct is added to the projection.</Paragraph>
    <Paragraph position="5"> In figures 18 and 19, the principles that enforce LP-constraints are given. Depending on whether the head is to the right or to the left of the complement/adjunct, two versions of the principle are distinguished. This distinction is necessary because linear order is crucial. Note that neither the HEAD features of the head nor the LP-STORE of the complement or adjunct are used in checking LP constraints.</Paragraph>
    <Paragraph position="6">  In the following examples, we make use of the parametrized type notation used in the grammar formalism STUF (Dörre 1991). A parametrized type has one or more parameters instantiated with feature structures. The name of the type (with its parameters) is given to the left of the := sign, the feature structure to the right.</Paragraph>
    <Paragraph position="7"> In the following we define the parametrized types nom(X,Y), dat(X,Y), pro(X,Y), and non-pro(X,Y), where X is the incoming LP-STORE and Y is the outgoing LP-STORE. [Figures with the type definitions are not preserved.]

The above type definitions can be used in the definition of lexical entries. Since the word ihm, whose lexical entry is given in figure 24, is both dative case and pronominal, it must contain both types. While the restrictions on the left and right context invoked by dat/2 and pro/2 can be unified, matters are not that simple for the LP-IN and LP-OUT features. Since their purpose is to "change" rather than to "add" information, simple unification is not possible. Instead, LP-IN of ihm becomes the incoming LP-STORE of dat/2, the outgoing LP-STORE of dat/2 becomes the incoming LP-STORE of pro/2, and the outgoing LP-STORE of pro/2 becomes LP-OUT of ihm, such that the effect of both changes is accumulated.</Paragraph>
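    <Paragraph>The chaining of dat/2 and pro/2 amounts to composing two store-to-store relations. A minimal Python sketch, assuming each parametrized type can be approximated by a function from the incoming LP-STORE to the outgoing one; make_type and lexical_lp_out are hypothetical helpers and not part of STUF:

```python
def make_type(feature):
    """Build a stand-in for a parametrized type such as dat/2 or pro/2."""
    def apply(store_in):
        store_out = dict(store_in)  # the incoming store is not modified
        store_out[feature] = "+"    # record the feature contributed by the word
        return store_out
    return apply


dat = make_type("DAT")
pro = make_type("PRO")


def lexical_lp_out(lp_in, types):
    """Thread the store through each type in turn, as described for 'ihm'."""
    store = lp_in
    for lp_type in types:
        store = lp_type(store)
    return store


start = {"NOM": "-", "DAT": "-", "ACC": "-", "PRO": "-", "NONPRO": "-"}
ihm_out = lexical_lp_out(start, [dat, pro])
print(ihm_out)

# The order of the two types does not matter in this sketch.
print(ihm_out == lexical_lp_out(start, [pro, dat]))
```

Each step only sets its own feature to '+', so the steps commute and either order of chaining yields the same outgoing store.
    </Paragraph>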
    <Paragraph position="9"> Figure 24: lexical entry for ihm
After expansion of the types, the following feature structure results. Exactly the same feature structure would have resulted if dat/2 and pro/2 had been exchanged in the above lexical entry. The next figure shows the lexical entry for a non-pronominal NP, with a disjunction of three cases.</Paragraph>
  </Section>
  <Section position="8" start_page="206" end_page="206" type="metho">
    <SectionTitle>
COMPILATION OF THE ENCODING
</SectionTitle>
    <Paragraph position="0"> As the encoding of LP constraints presented above is intended for processing rather than grammar writing, a compilation step will initialize the lexical entries automatically according to a given grammar that includes a separate list of LP-constraints. Consequently, the violation of LP-constraints results in unification failure. For reasons of space we only present the basic idea.</Paragraph>
    <Paragraph position="1"> The compilation step is based on the assumption that the features of the LP-constraints are morphologically motivated, i.e. appear in the lexicon. If this is not the case (for example for focus, thematic roles) we introduce the feature with a disjunction of its possible values. This drawback we hope to overcome by employing functional dependencies instead of LP-IN and LP-OUT features.</Paragraph>
    <Paragraph position="2"> For each side of an LP-constraint we introduce boolean features. For example, for [A: v] &lt; [B: w] we introduce the features a_v and b_w. This also works for LP-constraints involving more than one feature, such as a constraint whose left-hand side is [PRO +, CASE acc] (the right-hand side is garbled in the source). For encoding the possible combinations of values for the participating features, we introduce binary auxiliary features such as pro_plus_case_acc, because we need to encode that there is at least a single constituent which is both pronominal and accusative.</Paragraph>
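    <Paragraph>The naming scheme for the boolean and auxiliary features can be sketched as below. Only the names a_v, b_w and pro_plus_case_acc come from the text; the functions boolean_feature and auxiliary_feature, and the spelling "minus" for the value '-', are assumptions made for this illustration.

```python
def boolean_feature(attribute, value):
    """Name of the boolean feature for one attribute-value pair, e.g. a_v."""
    spelled = {"+": "plus", "-": "minus"}  # "minus" is an assumed spelling
    return attribute.lower() + "_" + spelled.get(value, value.lower())


def auxiliary_feature(description):
    """Name of the auxiliary feature for a side with several attribute-value pairs."""
    return "_".join(boolean_feature(attr, val) for attr, val in description)


print(boolean_feature("A", "v"))
print(auxiliary_feature([("PRO", "+"), ("CASE", "acc")]))
```

A compiler along these lines would generate one such name per side of every LP-constraint and then add the corresponding boolean feature to the LP-relevant feature structures.
    </Paragraph>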
    <Paragraph position="3"> Each lexical entry has to be modified as follows:  1. A lexical entry that can serve as the head of a phrase receives the additional feature LP-STORE.</Paragraph>
    <Paragraph position="4"> 2. An entry that can serve as the head of a phrase  and bears LP-relevant information, i.e. a projection of it is subsumed by one side of some LP-constraint, has to be extended by the features LP-IN, LP-OUT, LEFT, RIGHT.</Paragraph>
    <Paragraph position="5"> 3. The remaining entries percolate the LP information unchanged by passing through the information via LP-IN and LP-OUT.</Paragraph>
    <Paragraph position="6"> The values of the features LEFT and RIGHT follow from the LP-constraints and the LP-relevant information of the considered lexical entry.</Paragraph>
    <Paragraph position="7"> The values of LP-STORE, LP-IN and LP-OUT depend on whether the considered lexical entry bears the information that is represented by the boolean feature (attribute A with value v for boolean feature a_v). [Table not preserved; its two columns were headed "entry bears the information" and "entry doesn't bear the information".]</Paragraph>
  </Section>
</Paper>