<?xml version="1.0" standalone="yes"?>
<Paper uid="C88-1017">
  <Title>A Constructive View of GPSG or How to Make It Work</Title>
  <Section position="3" start_page="77" end_page="78" type="metho">
    <SectionTitle>
2 Problems With the Implementation of GPSG
</SectionTitle>
    <Paragraph position="0"> In this section we justify why we had to develop a constructive version of the GPSG formalism, although it might seem that the &quot;classical&quot; version (as defined in \[GKPS\]) can be implemented. We want to show that this is true only in theory, not in practice.</Paragraph>
    <Paragraph position="1"> What would it really amount to if we tried to implement the axiomatic version of GPSG in a straightforward way? In order to find all admissible trees corresponding to a given sentence, we would have to do the following things for every local tree (i.e. every tree of depth 1):
      * build every possible extension for every category in an ID rule, which means that every feature that is not specified in the rule may be either absent or specified by any of its values,
      * filter out the illegal categories with the aid of the FCRs,
      * build all the possible projections of ID rules with the remaining legal categories, thereby creating every possible order of the daughters,
      * filter out those combinations of categories that are inadmissible according to the Foot Feature Principle (FFP), Control Agreement Principle (CAP) or Head Feature Convention (HFC),
      * filter out those projections that are unacceptable because of some category contradicting a Feature Specification Default (FSD),
      * filter out all those projections that contradict any LP statement applicable to the daughters.</Paragraph>
    <Paragraph position="2"> After this, the subset of admissible local trees has to be identified which yields the desired complex structures in the following way: two (locally) admissible trees may be combined into a larger tree iff one of the daughters of one of them is identical with the mother of the other one.</Paragraph>
    <Paragraph position="3"> The whole process can be regarded as divided up into three major steps. The first step consists in constructing all the possible projections (possible according to ID rules and FCRs). The second step consists in filtering out local trees that are not admissible according to the restrictions imposed on them by the FIPs, the FSDs and the LP statements. Though these devices are not filters in the Chomskyan sense, 2 they behave in an analogous way by preventing previously generated structures from becoming locally admissible trees. The last step consists in forming complex structures out of locally admissible trees.</Paragraph>
    <Paragraph position="4"> In order to show the complexity of such an approach, it is necessary to give a rough idea of what the first step really amounts to; it yields a combinatorial explosion of the set of categories. Assuming the 25 atomic and the 4 category-valued features defined for the English grammar in \[GKPS\], a lower bound for the number of categories to be checked by the FCRs is 10^774 \[Ristad 1986\]. (Footnote 2: This was pointed out to us by John Nerbonne (electronic mail).)</Paragraph>
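    <Paragraph> The growth behind this explosion can be illustrated with a small calculation. The following Python sketch is not part of the original system; it assumes, purely for illustration, that every atomic feature is binary (so that it may be absent, + or -) and it ignores category-valued features altogether, so it is far smaller than the bound cited above. The names extensions and count_extensions are hypothetical.

```python
from itertools import product

ABSENT = None  # marker meaning "the feature is not present in the category"


def extensions(category, features, values=("+", "-")):
    """Yield every extension of `category` over `features` (generate-and-test view)."""
    free = [f for f in features if f not in category]
    for choice in product((ABSENT,) + tuple(values), repeat=len(free)):
        ext = dict(category)
        for feat, val in zip(free, choice):
            if val is not ABSENT:
                ext[feat] = val
        yield ext


def count_extensions(n_free, n_values=2):
    """Count extensions when each of n_free features is absent or takes one of n_values."""
    return (n_values + 1) ** n_free


toy = list(extensions({"n": "+"}, ["n", "v", "plu"]))
print(len(toy))              # 9: two free binary features, three options each
print(count_extensions(25))  # 847288609443, i.e. 3**25
```

    Even this simplified count grows exponentially with the number of unspecified features, which is why enumerating all extensions before filtering is not a practical strategy.</Paragraph>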
    <Paragraph position="5"> The second of the above-mentioned steps is not trivial either, though its problems might be solvable after all. For a purely axiomatic view of the GPSG formalism it may be permissible to neglect the order in which the different filtering components are to be applied, although there seem to be some problems with the definitions of the different FIPs with respect to their logical independence of each other. For an effective implementation, however, the ordering problem becomes crucial. There are some hints in \[GKPS\] referring to interdependencies between the different filters, but they are not fully specified. The most problematic case is the order in which the HFC and the CAP have to be applied:
      * the HFC seems to presuppose the effects of the CAP (and of the FFP) because it must not force feature specifications that are excluded by the CAP on categories in local trees;
      * the CAP presupposes the HFC in the sense that it is based on semantic types, which are dependent on HEAD features, the distribution of which is in turn governed by the HFC.</Paragraph>
    <Paragraph position="6"> One possible way out of this dilemma is suggested in \[Shieber 1986\], but it is based on the assumption that HEAD features may be split up into two disjoint sets: those HEAD features which are prerequisites for the assignment of semantic types and thus for the applicability of the CAP, and those HEAD features that can safely be applied after the CAP has done its work. However, it is not clear whether such a distribution is possible. Of course, you can always make your ID rules much more informative with respect to feature specifications than is suggested in \[GKPS\] and thereby guarantee a proper functioning of the FIPs; but that would probably not be in the spirit of GPSG, where the main point is to capture the universal as well as the language-specific generalizations.</Paragraph>
    <Paragraph position="7"> There are a number of problems with the CAP; we want to outline just one of them, which has led us to modify this principle. The definition of control in \[GKPS\] implicitly restricts the functioning of the CAP to structures where the functor has no more than one argument (with the exception of those very special cases of control mediators). This cannot be seen from the definition of control \[GKPS:88\] alone, but may be derived from the interaction of this definition with the conditions on correct type assignment that are imposed on syntactic structures by the principle of functional realization \[GKPS, chapter 10\]: it follows from both parts of the theory taken together that a functor can be controlled by its argument only in the case where there is no further argument; otherwise the functor would have to be of a type that differs from what is assumed in the definition of control (intuitively, the type of a functor depends on how many arguments the functor takes).</Paragraph>
    <Paragraph position="8"> This difficulty seems quite hard to cope with; if we assume rather flat structures (as we do, on independent grounds, in our German syntax \[Preuß 1987\], see also \[Uszkoreit 1984\]), then it is not clear which of the different arguments of a functor is to control it; in the case of subject-predicate agreement in German, the subject would have to be marked as the controller, which can hardly be done on the basis of the semantic types alone (because there seems to be no semantic reason to distinguish subjects and objects by their semantic type unless we treat subjects as functors operating on VPs as arguments, which would reverse the control relation between them and thus cause all sorts of other problems). The only possibility we can conceive of would be analogous to the concept of argument order as defined in \[GKPS\] in order to obtain correctly the interpretations of direct and indirect objects, but this is a language-particular concept (cf. \[GKPS:214\]), which would not fit into a universal principle.</Paragraph>
  </Section>
  <Section position="4" start_page="78" end_page="78" type="metho">
    <SectionTitle>
3 A Constructive View of GPSG
</SectionTitle>
    <Paragraph position="0"> As the previous section shows, the GPSG formalism in its original version is not suitable for computer implementation. From a processing point of view, it is an obvious requirement that the components of GPSG should only construct the well-formed categories and trees, i.e. no garbage should be produced. In order to utilize GPSG for parsing and generation in a computer system, a change in perspective becomes necessary; instead of deciding for all fully specified categories and all local trees whether they are legal or admissible respectively, we start from a highly underspecified local tree that is admitted by an ID rule and gather information by subsequently applying FCRs and FIPs.</Paragraph>
    <Paragraph position="1"> Eventually we shall have a fully specified local tree that is admissible by definition.</Paragraph>
    <Paragraph position="2"> We shall call this view of GPSG constructive since it allows for the construction rather than the selection of a syntactic structure. In a constructive version of GPSG, FCRs and FIPs mainly act as principles of feature transport rather than of feature distribution.</Paragraph>
    <Paragraph position="3"> One of the most important questions for the constructive version is in what order the components of GPSG have to be applied. Since each of them may add further feature specifications to a category in a local tree, the order of application ought to depend on what information must be present for a component to work properly. This can be determined in general by using a monotonic operation such as unification for making categories more and more specific.</Paragraph>
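    <Paragraph> To make the role of unification concrete, here is a minimal Python sketch. It assumes a simplified representation that is not prescribed by the paper: a category is a flat dictionary from feature names to atomic values, and category-valued features are left out. The function name unify and the feature names are hypothetical.

```python
def unify(cat_a, cat_b):
    """Return the unification of two flat categories, or None if they clash."""
    result = dict(cat_a)
    for feature, value in cat_b.items():
        if feature in result and result[feature] != value:
            return None          # conflicting values: unification fails
        result[feature] = value  # information is only ever added, never removed
    return result


np = {"n": "+", "v": "-"}
print(unify(np, {"cas": "nom"}))  # {'n': '+', 'v': '-', 'cas': 'nom'}
print(unify(np, {"v": "+"}))      # None
```

    The operation is monotonic in the sense used above: a successful result contains every feature specification of both inputs, so applying a component can only make a category more specific.</Paragraph>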
    <Paragraph position="4"> This has led us to dispense with any assertions about categories as they are often used in \[GKPS\]. For instance, the predicate ~ with the meaning that some feature is undefined (i.e. it is not contained in the category) is replaced by a feature value, ~, which is subject to unification. We shall thus say that a feature f is undefined if it is specified as &lt;f, ~&gt;.</Paragraph>
    <Paragraph position="5"> The predicative character of FCRs is also modified towards a functional one by including the assignment of values to features. Formally, an FCR is written cat1 ⊃ cat2, where cat1 and cat2 are categories. An FCR applies to a category C iff C is an extension of cat1. In that case C must unify with cat2; otherwise C is not legal.</Paragraph>
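    <Paragraph> The functional reading of FCRs can be sketched as follows. This is an illustration under simplifying assumptions rather than the implementation of the system: categories are flat dictionaries, and the helper names unify, is_extension and apply_fcr are hypothetical. The example rule, finite verb form implies agr +, is one of the FCRs mentioned later in the text.

```python
def unify(cat_a, cat_b):
    """Unify two flat categories (dicts); return None on a value clash."""
    result = dict(cat_a)
    for feature, value in cat_b.items():
        if result.get(feature, value) != value:
            return None
        result[feature] = value
    return result


def is_extension(cat, base):
    """True if cat contains every feature specification of base."""
    return all(f in cat and cat[f] == v for f, v in base.items())


def apply_fcr(cat, cat1, cat2):
    """Apply the FCR 'cat1 implies cat2' to cat.

    Returns the (possibly more specific) category, or None if cat is not legal.
    """
    if not is_extension(cat, cat1):
        return dict(cat)      # the FCR does not apply; cat stays as it is
    return unify(cat, cat2)   # None signals an illegal category


finite_verb = {"v": "+", "vform": "fin"}
print(apply_fcr(finite_verb, {"vform": "fin"}, {"agr": "+"}))
# {'v': '+', 'vform': 'fin', 'agr': '+'}
print(apply_fcr({"vform": "fin", "agr": "-"}, {"vform": "fin"}, {"agr": "+"}))
# None
```

    Note that the rule both checks and assigns: a category that is silent about agr receives the value, while one that contradicts it is rejected.</Paragraph>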
    <Paragraph position="6"> Let us now discuss the role of the FIPs in a constructive version. We shall start with HFC. In \[GKPS\], HFC is based on the free feature specification sets, which are utilized to prevent HFC from rejecting local trees because of HEAD features specified differently at the mother and the head daughter(s) by virtue of ID rules, FCRs, the FFP, or the CAP. To generate these sets would again require all possible projections from an ID rule to be produced. As was shown in the previous section, this must be avoided if a computer implementation is to be supplied.</Paragraph>
    <Paragraph position="7"> From the constructive point of view we suggest that the effect of using the free feature specification sets can be attained by ensuring that for a local tree, the work of the FCRs, the FFP and the CAP has been completed before HFC comes into play.</Paragraph>
    <Paragraph position="8"> HFC then ensures that the so far unspecified HEAD features at the mother are identical with the corresponding HEAD feature specifications at the head daughter(s) and vice versa, thereby never rejecting a local tree (see footnote 3). HFC proceeds as follows: every head daughter that can unify with its mother with respect to the set of HEAD features will do so. Typically, HEAD includes features for verb form or clause structure. A constituent is marked as head by a binary feature, head, which is specified in the ID rules, thus replacing the meta-notation H in \[GKPS\], the meaning of which is completely dependent on its context.</Paragraph>
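    <Paragraph> The procedure just described might be sketched as follows. The sketch is an assumption-laden illustration, not the implementation of the system: categories are flat dictionaries, the contents of HEAD and the label feature cat are hypothetical, and head daughters are processed one at a time, so the coordination case with several heads is not modelled.

```python
HEAD = ("vform", "aux")  # hypothetical HEAD feature set


def hfc(mother, daughters, head_features=HEAD):
    """Let each head daughter share HEAD features with the mother if they unify.

    A clash leaves both categories untouched; the local tree is never rejected.
    """
    mother = dict(mother)
    result = []
    for daughter in daughters:
        daughter = dict(daughter)
        if daughter.get("head") == "+":
            m_head = {f: mother[f] for f in head_features if f in mother}
            d_head = {f: daughter[f] for f in head_features if f in daughter}
            clash = any(d_head[f] != v for f, v in m_head.items() if f in d_head)
            if not clash:
                shared = {**m_head, **d_head}
                mother.update(shared)
                daughter.update(shared)
        result.append(daughter)
    return mother, result


vp, (verb, obj) = hfc({"cat": "VP"},
                      [{"cat": "V", "head": "+", "vform": "fin"}, {"cat": "NP"}])
print(vp)   # {'cat': 'VP', 'vform': 'fin'}
print(obj)  # {'cat': 'NP'}
```

    In the example the verb form of the head daughter is passed up to the mother, while the non-head daughter is left alone.</Paragraph>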
    <Paragraph position="9"> In this way HFC is supposed to work in an equally general, but much simpler, fashion than was possible with the definition in \[GKPS\]. Moreover, HFC is capable of coping with multiple heads used for the treatment of certain coordination phenomena; if conflicting feature specifications are found in the coordinated head daughters, the HEAD feature in question has to be undefined at the mother. This parallels the way multiple heads are treated in \[GKPS\].</Paragraph>
    <Paragraph position="10"> The requirement that the CAP be prior to HFC raises, however, the problem that the CAP cannot be based on semantic types anymore because it is HFC which might provide the major feature specifications necessary to determine the type of a constituent. Moreover, to be applicable to local trees with more than one argument (in those cases where no control mediator is present), the CAP had</Paragraph>
  </Section>
  <Section position="5" start_page="78" end_page="80" type="metho">
    <SectionTitle>
3 After HFC has been applied to a local tree, FCRs may become applicable that
</SectionTitle>
    <Paragraph position="0"> were not before, which in turn should cause the HFC to resume its work etc.</Paragraph>
    <Paragraph position="1"> until nothing further is specified. Whether this repetition must actually occur depends on how the grammar is formulated.</Paragraph>
    <Paragraph position="2">  to be reformulated, and its place is taken by a purely syntactic mechanism, the Agreement Principle (AP), which is defined as follows \[Weisweber 1987\]: every daughter in a local tree that is marked for agreement must unify with its mother with respect to a subset of features, called AGR. If an AGR feature is undefined, it is ignored by the AP. Any local tree violating the AP is rejected. AGR typically contains features for case, gender, person, or number. A constituent is marked for agreement by a binary feature, agr, that is specified through FCRs, e.g. {&lt;cas, nom&gt;} ⊃ {&lt;agr, +&gt;} and {&lt;vform, fin&gt;} ⊃ {&lt;agr, +&gt;}. The AP together with HFC provides for subject-verb agreement on the basis of these FCRs. This way of coping with agreement phenomena dispenses with any notion of control. There are no semantic types involved; what agrees with what need not be stated explicitly, it is simply the consequence of the interplay of FCRs, AP, HFC, and the definition of the feature sets AGR and HEAD. Note that the AP does not presuppose HEAD feature specifications and can thus be prior to HFC.</Paragraph>
    <Paragraph position="3"> However, the AP as defined above cannot account for the fact that a category may participate in some agreement relations, but not in others (in 'raising' constructions a direct object may have to agree with a reflexive pronoun, but not with the finite verb). A more sophisticated version of the AP, which is presently being developed, is based on different kinds of agr values (e.g. agr1 and agr2 instead of +). A direct object, as well as the reflexive, is then specified with &lt;agr, agr2&gt; whereas subject and finite verb both have &lt;agr, agr1&gt;. The revised AP requires categories containing the same agr specification to unify with respect to AGR as described above.</Paragraph>
    <Paragraph position="5"> This approach allows a category to contain feature specifications arising from different agreement relations. An important hypothesis underlying the revised AP is that this will only be necessary if that category contains, by virtue of an ID rule, category-valued features, which can by themselves be specified for agr. These features are also inspected by the revised AP in order to find members of some agreement relation in a local tree. Figure 1 contains a local tree, (3), with the feature slash (denoted by /) specified at the mother as an accusative NP by an ID rule. This expresses the fact that a direct object is missing in local tree (3). The revised AP uses the AGR specifications of the slash value to establish agreement between the direct object and the reflexive pronoun.</Paragraph>
    <Paragraph position="6"> The AGR specifications of the S, on the other hand, are used to ensure subject-verb agreement.</Paragraph>
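    <Paragraph> The basic Agreement Principle (with the single marking agr +, not the revised version with several agr values) could be sketched along the following lines. This is an illustrative reading, not the implementation of the system: categories are flat dictionaries, the set AGR is cut down to two hypothetical features, the label feature cat is invented for readability, and an AGR feature counts as ignored when it is either absent or explicitly specified as ~.

```python
AGR = ("per", "num")  # hypothetical, reduced AGR feature set
UNDEFINED = "~"       # the value that marks a feature as undefined


def agreement_principle(mother, daughters, agr_features=AGR):
    """Unify the mother with all daughters marked agr + with respect to AGR.

    Returns (mother, daughters) with shared AGR values filled in,
    or None if the local tree violates the principle.
    """
    marked = [d for d in daughters if d.get("agr") == "+"]
    shared = {}
    for cat in [mother] + marked:
        for f in agr_features:
            if f not in cat or cat[f] == UNDEFINED:
                continue  # features without a usable value are ignored
            if shared.setdefault(f, cat[f]) != cat[f]:
                return None  # two members disagree: reject the local tree

    def fill(cat):
        extra = {f: v for f, v in shared.items() if f not in cat}
        return {**cat, **extra}

    return fill(mother), [fill(d) if d.get("agr") == "+" else dict(d)
                          for d in daughters]


subject = {"cat": "NP", "agr": "+", "per": "3", "num": "sg"}
predicate = {"cat": "VP", "agr": "+", "num": "sg"}
s, (np, vp) = agreement_principle({"cat": "S"}, [subject, predicate])
print(vp["per"], s["num"])  # 3 sg
print(agreement_principle({"cat": "S"}, [subject, {**predicate, "num": "pl"}]))  # None
```

    Nothing in the sketch states that the subject agrees with the verb; the agreement falls out of both daughters being marked agr + and sharing values with the mother.</Paragraph>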
    <Paragraph position="7"> Note that this way of including category-valued features specified in ID rules is independent of which syntactic structures are used to describe a language; rather, the function of category-valued features as indicators of long distance relations is utilized.</Paragraph>
    <Paragraph position="8"> The feature agr can still be specified by virtue of FCRs, though there seem to be some characteristic exceptions where the value is better provided within the ID rules. For instance, a VP should not always contain &lt;agr, agr2&gt;, as in figure 1, because in the case of 'equi' verbs it would have to agree with the subject. (Footnote 4: This relational information cannot be derived from the different subcategorizations of 'raising' and 'equi' verbs alone.)</Paragraph>
    <Paragraph position="9"> Let us conclude the discussion of the FIPs with the FFP, the functioning of which has by and large been taken over from \[GKPS\]. A special treatment is necessary for the value ~. All daughters unify with the mother with respect to a set of FOOT features, provided that the values are not specified in the ID rules. Daughters that are undefined with respect to some FOOT feature are ignored by the FFP unless the FOOT feature is undefined at every daughter or at the mother; in that case the FFP requires all constituents to be undefined with respect to that FOOT feature. If a local tree violates the FFP, it is rejected.</Paragraph>
    <Paragraph position="10"> The FFP depends only on the ID rules and can thus be the first FIP to apply. It is in fact prior to the AP since its results may trigger FCRs that specify the agr feature. FCRs have to be applied at each step where a feature might have been specified in a local tree, namely after the FFP, the AP, and the HFC. LP statements can only be guaranteed to apply properly on fully specified categories (see footnote 5). Thus they operate in the last place (cf. figure 2; see footnote 6).</Paragraph>
    <Paragraph position="11"> The next question to be addressed is how complex structures are built from local trees. Since in the constructional version nothing forces a daughter of one local tree and the mother of another one to have the same set of features specified with the same values, the two categories are not required to be identical, as in \[GKPS\]; rather, they must unify in order to be combined into a larger tree.</Paragraph>
    <Paragraph position="12"> For each of the two categories involved in the unification, additional features may be specified. This specification by construction, when combined with the application of FCRs and FIPs, makes the results of transporting feature specifications within local trees immediately available to other local trees.</Paragraph>
    <Paragraph position="13"> The precise way of interaction with FCRs and FIPs depends on the strategy adopted for tree formation. In order to clarify the point, we shall look at two rather obvious strategies, one of which is used in the Berlin GPSG system \[Hauenschild/Busemann 1988\] for parsing and the other for generation.</Paragraph>
    <Paragraph position="14"> The first one constructs the tree in a bottom-up manner, thereby reducing admissible local trees by unifying their mothers with the daughters of another local tree. The bottom-up strategy starts from lexical categories, which are admitted by the lexicon. Each reduction step is followed by the application of the FIPs to the newly created local tree. Thus the information contained in the lexical categories is percolated to higher levels of the tree, thereby constraining the set of further reduction steps allowed by the grammar. This strategy is used within the parser in the GPSG system \[Weisweber 1987\].</Paragraph>
    <Paragraph position="15"> Footnote 5: For parsing, LP statements work as filters whereas for generation, they constructively order the branches in a local tree \[Busemann 1987\]. Footnote 6: Note that a similar ordering discovered by Shieber \[Shieber 1986\] results from investigations of underlying assumptions of \[GKPS\].</Paragraph>
    <Paragraph position="16"> The second strategy consists of top-down tree formation. With this type of process, local trees are expanded by unifying their daughters with mothers of other local trees.</Paragraph>
    <Paragraph position="17"> The top-down strategy starts from a local tree (with mother S, for instance), the categories of which have feature specifications by virtue of an ID rule only. FCRs and FIPs cannot be applied during tree expansion because there is too little information available for deciding upon e.g. the value of agr (for the same reason, FCRs and FIPs are not applied to ID rules directly); rather, they apply in a bottom-up manner, as with the first strategy, after the lexical insertion has been completed.</Paragraph>
    <Paragraph position="18"> The latter strategy is utilized within the generation component \[Busemann 1987\] in the GPSG system, which has to introduce, for instance, number and case information into the structure that it is about to generate. This takes place in the course of tree expansion by adding relevant feature specifications to categories in the tree (to an NP mother, for instance). This information is usually not available in local trees at a deeper level, especially at local trees with lexical categories. Therefore the lexicon should contain word stems (rather than word forms) and, correspondingly, categories that are unspecified for e.g. number and case. 7 This makes a situation possible that has not been discussed yet; namely, that when FIPs apply to local trees at these deeper levels they may have to cope with unspecified features. There is indeed no requirement that AGR or HEAD features must have a value in order to unify. We should like the FIPs to work properly even if features have not yet received a value. In these cases, the feature values in question are co-specified, i.e. they will have the same value as soon as one of them is specified. In our example, number and case specifications are spread over the sub-structure dominated by the NP as soon as the FIPs apply to the local tree where they have been introduced.</Paragraph>
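    <Paragraph> Co-specification can be pictured as several feature slots sharing one not yet filled cell. The Python sketch below illustrates this idea with hypothetical names (Var, unify_values); it is similar in spirit to logical variables in Prolog, the implementation language of the system, but how the system itself realizes co-specification is not described here, so the sketch should be read as one possible model only.

```python
class Var:
    """A feature value that may still be unspecified (value None)."""

    def __init__(self, value=None):
        self.value = value
        self.ref = None  # link to another Var once the two are co-specified

    def find(self):
        """Follow the links to the representative cell."""
        node = self
        while node.ref is not None:
            node = node.ref
        return node


def unify_values(a, b):
    """Co-specify two values; return False if both are specified and differ."""
    ra, rb = a.find(), b.find()
    if ra is rb:
        return True
    if ra.value is not None and rb.value is not None:
        return ra.value == rb.value
    if ra.value is None:
        ra.ref = rb  # the unspecified cell now points at the other one
    else:
        rb.ref = ra
    return True


np_num, det_num, noun_num = Var(), Var(), Var()
unify_values(np_num, noun_num)   # nothing has a value yet
unify_values(np_num, det_num)
unify_values(np_num, Var("pl"))  # number is introduced at the NP mother
print(det_num.find().value, noun_num.find().value)  # pl pl
print(unify_values(noun_num, Var("sg")))             # False
```

    Once number is specified at the NP, the determiner and the noun receive it without any further step, which corresponds to the spreading of number and case over the sub-structure described above.</Paragraph>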
    <Paragraph position="19"> However, such a delayed specification makes it more difficult to maintain control over whether a category is still legal and whether a local tree still complies with the LP statements. For an elegant solution see \[Weisweber 1988\], in this volume.</Paragraph>
    <Paragraph position="20"> In our present version of GPSG, we use neither metarules nor FSDs. However, the linguist ought still to have the possibility of elegantly expressing language-particular generalizations with the aid of metarules. They will be realized in a preprocessing component in order to avoid having to apply them during parsing or during generation.</Paragraph>
    <Paragraph position="21"> As for FSDs, we adopt the working hypothesis that they are superfluous if lexical entries are sufficiently specified and free feature instantiation (in the sense of \[GKPS\]) is not allowed. FSDs are needed in the GPSG version of \[GKPS\] because free feature instantiation may assign nonsensical values to features, which would never occur if the structure had been built in an orderly way on the basis of sufficient lexical information. In the long run it might be desirable to use the device of FSDs in a constructional version of GPSG, too; namely, for those cases where features have not been specified, though the whole structure has been completed. However, we shall have to avoid the complexity of FSDs as defined in \[GKPS\]; a simplified solution might be analogous to our version of HFC, for HFC, too, is a default device in the final analysis.</Paragraph>
    <Paragraph position="22"> The constructional version of GPSG presented here constitutes the linguistic basis for parsing and generation of English and German sentences within the Berlin GPSG system.</Paragraph>
    <Paragraph position="23"> The system is fully implemented in Waterloo Core Prolog using the set of predicates defined by the KIT-CORE Prolog standard \[Bittkau et al. 1987\], which makes it possible to run it with several other Prolog dialects, too (e.g. Symbolics Prolog). At present, it runs on an IBM 4381 under VM/SP, on a Symbolics 3640 Lisp machine, and on an IBM AT.</Paragraph>
  </Section>
</Paper>