<?xml version="1.0" standalone="yes"?> <Paper uid="P95-1011"> <Title>Encoding Lexicalized Tree Adjoining Grammars with a Nonmonotonic Inheritance Hierarchy</Title> <Section position="4" start_page="77" end_page="79" type="metho"> <SectionTitle> 2 Representing LTAG trees </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> The principal unit of (syntactic) information associated with an LTAG entry is a tree structure in which the tree nodes are labeled with syntactic categories and feature information and there is at least one leaf node labeled with a lexical category (such lexical leaf nodes are known as anchors). For example, the canonical tree for a ditransitive verb such as give is shown in figure 1. Following LTAG conventions (for the time being), the node labels here are gross syntactic category specifications to which additional featural information may be added,5 and are annotated to indicate node type: <> indicates an anchor node, and ↓ indicates a substitution node (where a fully specified tree with a compatible root label may be attached).6</Paragraph> <Paragraph position="3"> 3See, for example, Bleiching (1992; 1994), Brown & Hippisley (1994), Corbett & Fraser (1993), Cahill (1990; 1993), Cahill & Evans (1990), Fraser & Corbett (in press), Gibbon (1992), Kilgarriff (1993), Kilgarriff & Gazdar (1995), Reinhard & Gibbon (1991).</Paragraph> <Paragraph position="4"> 4See, for example, Andry et al. (1992) on compilation, Kilbury et al. (1991) on coding DAGs, Duda & Gebhardi (1994) on dynamic querying, Langer (1994) on reverse querying, and Barg (1994), Light (1994), Light et al. (1993) and Kilbury et al. (1994) on automatic acquisition. And there are at least a dozen different DATR implementations available, on various platforms and in various programming languages.</Paragraph> <Paragraph position="5"> 5In fact, LTAG commonly distinguishes two sets of features at each node (top and bottom), but for simplicity we shall assume just one set in this paper.</Paragraph> <Paragraph position="6"> In representing such a tree in DATR, we do two things. First, in keeping with the radically lexicalist character of LTAG, we describe the tree structure from its (lexical) anchor upwards,7 using a variant of Kilbury's (1990) bottom-up encoding of trees. In this encoding, a tree is described relative to a particular distinguished leaf node (here the anchor node), using binary relations parent, left and right, which relate the node to the subtrees associated with its parent and its immediate left and right sisters, each encoded in the same way. Second, we embed the resulting tree structure (i.e., the node relations and type information) in the feature structure, so that the tree relations (left, right and parent) become features.</Paragraph> <Paragraph position="7"> The obvious analogy here is the use of first/rest features to encode subcategorisation lists in frameworks like HPSG.</Paragraph> <Paragraph position="8"> Thus the syntactic feature information directly associated with the entry for give relates to the label for the V node (for example, the value of its cat feature is v, the value of type is anchor), while specifications of subfeatures of parent relate to the label of the VP node.
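To make the first/rest analogy concrete: in an HPSG-style DATR lexicon, the subcategorisation list of a ditransitive might be encoded with paths like the following (a purely hypothetical illustration, not part of the paper's fragment; the node name and the terminator value none are invented):

Give-HPSG:
    <subcat first cat> == np
    <subcat rest first cat> == np
    <subcat rest rest first cat> == pp
    <subcat rest rest rest> == none.

Just as first and rest turn a list into feature paths here, parent, left and right turn a tree into feature paths below.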
A simple bottom-up DATR representation for the whole tree (apart from the node type information) follows:

Give:
    <cat> == v
    <parent cat> == vp
    <parent left cat> == np
    <parent parent cat> == s
    <right cat> == np
    <right right cat> == p
    <right right parent cat> == pp
    <right right right cat> == np.</Paragraph> <Paragraph position="9"> This says that Give is a verb, with a VP as its parent, an S as its grandparent and an NP to the left of its parent. It also has an NP to its right, and a tree rooted in a P to the right of that, with a PP parent and an NP right sister. The implied bottom-up tree structure is shown graphically in figure 2. Here the nodes are laid out just as in figure 1, but related via parent, left and right links, rather than the more usual (implicitly ordered) daughter links.</Paragraph> <Paragraph position="10"> Notice in particular that the right link from the object noun-phrase node points to the preposition node, not its phrasal parent - this whole subtree is itself encoded bottom-up. Nevertheless, the full tree structure is completely and accurately represented by this encoding.</Paragraph> <Paragraph position="11"> 6LTAG's other tree-building operation is adjunction, which allows a tree fragment to be spliced into the body of a tree. However, we only need to concern ourselves here with the representation of the trees involved, not with the substitution/adjunction distinction.</Paragraph> <Paragraph position="12"> 7The tree in figure 1 has more than one anchor - in such cases it is generally easy to decide which anchor is the most appropriate root for the tree (here, the verb anchor).</Paragraph> <Paragraph position="13"> Once we adopt this representational strategy, writing an LTAG lexicon in DATR becomes similar to writing any other type of lexicalist grammar's lexicon in an inheritance-based LKRL. In HPSG, for example, subcategorisation frames are coded as lists of categories, whilst in LTAG they are coded as trees. But in both cases the problem is one of concisely describing the feature structures associated with lexical entries, and the relationships between lexical entries. The same kinds of generalization arise and the same techniques are applicable. Of course, the presence of complete trees and the fully lexicalized approach provide scope for capturing generalizations lexically that are not available to approaches that only identify parent and sibling nodes, say, in the lexical entries.</Paragraph> <Paragraph position="14">

3 Encoding lexical entries

Following conventional models of lexicon organisation, we would expect Give to have a minimal syntactic specification itself, since syntactically it is a completely regular ditransitive verb. In fact, none of the information introduced so far is specific to Give. So rather than providing a completely explicit DATR definition for Give, as we did above, a more plausible account uses an inheritance hierarchy defining abstract intransitive, transitive and ditransitive verbs to support Give (among others), as shown in figure 3. This basic organisational structure can be expressed as the following DATR fragment:8</Paragraph> <Paragraph position="15"> 8To gain the intuitive sense of this fragment, read a line such as <> == VERB as "inherit everything from the definition of VERB", and a line such as <parent> == PPTREE:<> as "inherit the parent subtree from the definition of PPTREE".</Paragraph>
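The fragment announced above can be reconstructed from the surrounding prose and footnote 8; the following is a hedged sketch, not the authors' verbatim text (details beyond the node names mentioned in the prose are assumptions):

VERB:
    <> == TREENODE
    <type> == anchor
    <cat> == v
    <parent> == VPTREE:<>.

VERB+NP:
    <> == VERB
    <right> == NPCOMP:<>.

VERB+NP+PP:
    <> == VERB+NP
    <right right> == PTREE:<>.

Die:
    <> == VERB.

Give:
    <> == VERB+NP+PP.

The tree-node definitions that this fragment references, and which the text goes on to discuss, might similarly be sketched as follows (again an assumption-laden reconstruction; the values under and substitution follow the terminology used later in the paper):

TREENODE:
    <> == under
    <type> == internal.

VPTREE:
    <> == TREENODE
    <cat> == vp
    <parent> == STREE:<>
    <left> == NPCOMP:<>.

STREE:
    <> == TREENODE
    <cat> == s.

NPCOMP:
    <> == TREENODE
    <cat> == np
    <type> == substitution.

PPTREE:
    <> == TREENODE
    <cat> == pp.

PTREE:
    <> == TREENODE
    <type> == anchor
    <cat> == p
    <parent> == PPTREE:<>
    <right> == NPCOMP:<>.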
Inheritance in DATR is always by default - locally defined feature specifications take priority over inherited ones.</Paragraph> <Paragraph position="16"> Ignoring for the moment the references to TREENODE, VPTREE, NPCOMP and PTREE (which we shall define shortly), we see that VERB defines basic features for all verb entries (and can be used directly for intransitives such as Die), VERB+NP inherits from VERB but adds an NP complement to the right of the verb (for transitives), VERB+NP+PP inherits from VERB+NP but adds a further PP complement, and so on. Entries for regular verb lexemes are then minimal - syntactically they just inherit everything from the abstract definitions.</Paragraph> <Paragraph position="17"> This DATR fragment is incomplete, because it neglects to define the internal structure of the TREENODE and the various subtree nodes in the lexical hierarchy. Each such node is a description of an LTAG tree at some degree of abstraction.9 The DATR statements defining this internal structure (sketched above) complete the fragment. Here, TREENODE represents an abstract node in an LTAG tree and provides a (default) type of internal. Notice that VERB is itself a TREENODE (but with the nondefault type anchor), and the other definitions define the remaining tree nodes that arise in our small lexicon: VPTREE is the node for VERB's parent, STREE for VERB's grandparent, NPCOMP defines the structure needed for NP complement substitution nodes, etc.10</Paragraph> <Paragraph position="18"> 10Thus, for example, VPTREE inherits from TREENODE, STREE and NPCOMP, but all such multiple inheritance is orthogonal in DATR: no path can inherit from more than one node.</Paragraph> <Paragraph position="19"> Taken together, these definitions provide a specification for Give just as we had it before, but with the addition of type and root features. They also support some other verbs too, and it should be clear that the basic technique extends readily to a wide range of other verbs and other parts of speech. Also, although the trees we have described are all initial trees (in LTAG terminology), we can describe auxiliary trees, which include a leaf node of type foot, just as easily. A simple example is provided by the following definition for auxiliary verbs:</Paragraph> <Paragraph position="20">

    <parent cat> == vp
    <right cat> == vp
    <right type> == foot.</Paragraph> </Section> <Section position="5" start_page="79" end_page="81" type="metho"> <SectionTitle> 4 Lexical rules </SectionTitle> <Paragraph position="0"> Having established a basic structure for our LTAG lexicon, we now turn our attention towards capturing other kinds of relationship among trees. We noted above that lexical entries are actually associated with tree families, and that these group together trees that are related to each other. Thus in the same family as a standard ditransitive verb, we might find the full passive, the agentless passive, the dative alternation, the various relative clauses, and so forth. It is clear that these families correspond closely to the outputs of transformations or metarules in other frameworks, but the XTAG system currently has no formal component for describing the relationships among families, nor mechanisms for generating them. And so far we have said nothing about them either - we have only characterized single trees.</Paragraph> <Paragraph position="1"> However, LTAG's large domain of locality means that all such relationships can be viewed as directly lexical, and thus expressible by lexical rules.
In fact we can go further than this: because we have embedded the domain of these lexical rules, namely the LTAG tree structures, within the feature structures, we can view such lexical rules as covariation constraints within feature structures, in much the same way that the covariation of, say, syntactic and morphological form is treated. In particular, we can use the mechanisms that DATR already provides for feature covariation, rather than having to invoke in addition some special-purpose lexical rule machinery.</Paragraph> <Paragraph position="2"> We consider six construction types found in the XTAG grammar: passive, dative, subject-auxiliary inversion, wh-questions, relative clauses and topicalisation. Our basic approach to each of these is the same. Lexical rules are specified by defining a derived output tree structure in terms of an input tree structure, where each of these structures is a set of feature specifications of the sort defined above. Each lexical rule has a name, and the input and output tree structures for rule foo are referenced by prefixing feature paths of the sort given above with <input foo ...> or <output foo ...>. So, for example, the category of the parent tree node of the output of the passive rule might be referenced as <output passive parent cat>. We define a very general default, stating that the output is the same as the input, so that lexical relationships need only concern themselves with components they modify.</Paragraph> <Paragraph position="3"> This approach to formulating lexical rules in DATR is quite general and in no way restricted to LTAG: it can be readily adapted for application in the context of any feature-based lexicalist grammar formalism.</Paragraph> <Paragraph position="4"> Using this approach, the dative lexical rule can be given a minimalist implementation by the addition of the following single line to VERB+NP+PP, defined above:</Paragraph> <Paragraph position="6">

    <output dative right right> == NPCOMP:<>.</Paragraph> <Paragraph position="7"> This causes the second complement of a ditransitive verb in the dative alternation to be an NP, rather than a PP as in the unmodified case. Subject-auxiliary inversion can be achieved similarly, by just specifying the output tree structure without reference to the input structure (note the addition here of a form feature specifying verb form):</Paragraph> <Paragraph position="9">

    <output auxinv parent cat> == s
    <output auxinv right cat> == s.</Paragraph> <Paragraph position="10"> Passive is slightly more complex, in that it has to modify the given input tree structure rather than simply overwriting part of it. The definitions for passive occur at the VERB+NP node since, by default, any transitive or subclass of transitive has a passive form. Individual transitive verbs, or whole subclasses, can override this default, leaving their passive tree structure undefined if required. For agentless passives, the necessary additions to the VERB+NP node are as follows11:</Paragraph> <Paragraph position="12">

    <output passive form> == passive
    <output passive right> == "<input passive right right>".</Paragraph> <Paragraph position="13"> Here, the first line stipulates the form of the verb in the output tree to be passive, while the second line redefines the complement structure: the output of passive has as its first complement the second complement of its input, thereby discarding the first complement of its input. Since complements are daisy-chained, all the others move up too.</Paragraph>
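Everything the passive additions do not mention carries over unchanged, thanks to the general output-defaults-to-input statement described earlier. One way to state that default in DATR - a sketch of a possible formulation, not necessarily the authors' - relies on the fact that a DATR equation for a path applies by default to all extensions of that path, and on quoted (global) inheritance so that the input is evaluated at the querying lexeme (cf. footnote 11):

VERB:
    % Any <output rule ...> path with no more specific definition
    % inherits from the corresponding <input rule ...> path,
    % evaluated globally at the original query node.
    <output> == "<input>".

With this in place, <output passive parent cat>, for example, simply returns the value of <input passive parent cat> unless some rule says otherwise.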
<Paragraph position="14"> 11Oversimplifying slightly, the double quotes in "<input passive right right>" mean that that DATR path will not be evaluated locally (i.e., at the VERB+NP node), but rather at the relevant lexeme node (e.g., Eat or Give).</Paragraph> <Paragraph position="15"> Wh-questions, relative clauses and topicalisation are slightly different, in that the application of the lexical rule causes structure to be added to the top of the tree (above the S node). Although these constructions involve unbounded dependencies, the unboundedness is taken care of by the LTAG adjunction mechanism: for lexical purposes the dependency is local. Since the relevant lexical rules can apply to sentences that contain any kind of verb, they need to be stated at the VERB node. Thus, for example, topicalisation and wh-questions can be defined as follows:</Paragraph> <Paragraph position="16">

    <output whq parent parent left form> == wh.</Paragraph> <Paragraph position="17"> Here an additional NP and S are attached above the original S node to create a topicalised structure. The wh-rule inherits from the topicalisation rule, changing just one thing: the form of the new NP is marked as wh, rather than as normal. In the full fragment,12 the NP added by these rules is also syntactically cross-referenced to a specific NP marked as null in the input tree. However, space does not permit presentation or discussion of the DATR code that achieves this here.</Paragraph> <Paragraph position="18">

5 Applying lexical rules

As explained above, each lexical rule is defined to operate on its own notion of an input and produce its own output. In order for the rules to have an effect, the various input and output paths have to be linked together using inheritance, creating a chain of inheritances between the base - that is, the canonical definitions we introduced in section 3 - and the surface tree structures of the lexical entry. For example, to 'apply' the dative rule to our Give definition, we could construct a definition along the lines of the Give-dat sketched below.</Paragraph> <Paragraph position="20"> Values for paths prefixed with surface inherit from the output of the dative rule. The input of the dative rule inherits from the base (unprefixed) case, which inherits from Give. The dative rule definition (just the one line introduced above, plus the default that output inherits from input) thus mediates between Give and the surface of Give-dat. This chain can be extended by inserting additional inheritance specifications (such as passive). Note that surface defaults to the base case, so all entries have a surface defined.</Paragraph> <Paragraph position="21"> However, in our full fragment, additional support is provided to achieve and constrain this rule chaining. Word definitions include boolean features indicating which rules to apply, and the presence of these features triggers inheritance between appropriate input and output paths and the base and surface specifications at the ends of the chain. For example, Word1 is an alternative way of specifying the dative alternant of Give, but results in inheritance linking equivalent to that found in Give-dat:
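A hedged reconstruction of these definitions, inferred from the prose description rather than quoted from the paper (the exact paths used to link surface and input are assumptions):

Give-dat:
    <> == Give
    <input dative> == <>
    <surface> == <output dative>.

Word1:
    <> == Give
    <alt dative> == true.

Word2:
    <> == Give
    <alt whq> == true
    <parent left form> == null.

On this reading, an <alt ...> flag set to true triggers exactly the input/output/surface linking that Give-dat states explicitly.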
12The full version of this DATR fragment includes all the components discussed above in a single coherent, but slightly more complex account. It is available on request from the authors.</Paragraph> <Paragraph position="22"> Notice the final line of Word2, <parent left form> == null, which specifies the location of the 'extracted' NP (the subject, in this case) by marking it as null. As noted above, the full version of the whq lexical rule uses this to specify a cross-reference relationship between the wh-NP and the null NP.</Paragraph> <Paragraph position="23"> We can, if we wish, encode constraints on the applicability of rules in the mapping from boolean flags to actual inheritance specifications. Thus, for example, whq, rel and topic are mutually exclusive. If such constraints are violated, then no value for surface gets defined. Thus Word3 improperly attempts topicalisation in addition to wh-question formation and, as a result, will fail to define a surface tree structure at all:</Paragraph> <Paragraph position="25">

Word3:
    <alt whq> == true
    <alt topic> == true
    <alt dative> == true
    <alt passive> == true
    <parent left form> == null.</Paragraph> <Paragraph position="26"> This approach to lexical rules allows them to be specified at the appropriate point in the lexical hierarchy, but overridden or modified in subclasses or lexemes as appropriate. It also allows default generalisation over the lexical rules themselves, and control over their application. The last section showed how the whq lexical rule could be built by a single minor addition to that for topicalisation. However, it is worth noting that, in common with other DATR specifications, the lexical rules presented here are rule instances which can only be applied once to any given lexeme - multiple application could be supported, by making multiple instances inherit from some common rule specification, but in our current treatment such instances would require different rule names.</Paragraph> </Section> <Section position="6" start_page="81" end_page="82" type="metho"> <SectionTitle> 6 Comparison with related work </SectionTitle> <Paragraph position="0"> As noted above, Vijay-Shanker & Schabes (1992) have also proposed an inheritance-based approach to this problem. They use monotonic inheritance to build up partial descriptions of trees: each description is a finite set of dominance, immediate dominance and linear precedence statements about tree nodes in a tree description language developed by Rogers & Vijay-Shanker (1992), and category information is located in the node labels.</Paragraph> <Paragraph position="1"> This differs from our approach in a number of ways. First, our use of nonmonotonic inheritance allows us to manipulate total instead of partial descriptions of trees. The abstract verb class in the Vijay-Shanker & Schabes account subsumes both intransitive and transitive verb classes but is not identical to either - a minimal-satisfying-model step is required to map partial tree descriptions into actual trees. In our analysis, VERB is the intransitive verb class, with complements specifically marked as undefined: thus VERB's <right> == under is inherited from TREENODE, and VERB+NP just overrides this complement specification to add an NP complement. Second, we describe trees using only local tree relations (between adjacent nodes in the tree), while Vijay-Shanker & Schabes also use a nonlocal dominance relation.</Paragraph> <Paragraph position="2"> Both these properties are crucial to our embedding of the tree structure in the feature structure.
We want the category information at each tree node to be partial in the conventional sense, so that in actual use such categories can be extended (by unification or whatever). So the feature structures that we associate with lexical entries must be viewed as partial. But we do not want the tree structure to be extendible in the same way: we do not want an intransitive verb to be applicable in a transitive context, by unifying in a complement NP. So the tree structures we define must be total descriptions.13 And of course, our use of only local relations allows a direct mapping from tree structure to feature path, which would not be possible at all if nonlocal relations were present.</Paragraph> <Paragraph position="3"> 13Note that the simplified fragment presented here does not get this right: it makes all feature specifications total descriptions. To correct this, we would need to change TREENODE so that only the values of <right>, <left> and <parent> default to under.</Paragraph> <Paragraph position="4"> So while these differences may seem small, they allow us to take this significant representational step - significant because it is the tree structure embedding that allows us to view lexical rules as feature covariation constraints. The result is that while Vijay-Shanker & Schabes use a tree description language, a category description language and a further formalism for lexical rules, we can capture everything in one framework, all of whose components (nonmonotonicity, covariation constraint handling, etc.) have already been independently motivated for other aspects of lexical description.14</Paragraph> <Paragraph position="5"> 14As in the work cited in footnote 3, above.</Paragraph> <Paragraph position="6"> Becker's recent work (1993; 1994) is also directed at exactly the problem we address in the present paper. Like him, we have employed an inheritance hierarchy. And, like him, we have employed a set of lexical rules (corresponding to his metarules). The key differences between our account and his are (i) that we have been able to use an existing lexical knowledge representation language, rather than designing a formal system that is specific to LTAG, and (ii) that we have expressed our lexical rules in exactly the same language as that we have used to define the hierarchy, rather than invoking two quite different formal systems.</Paragraph> <Paragraph position="7"> Becker's sharp distinction between his metarules and his hierarchy gives rise to some problems that our approach avoids. Firstly, he notes that his metarules are subject to lexical exceptions and proposes to deal with these by stating "for each entry in the (syntactic) lexicon ... which metarules are applicable for this entry" (1993, 126). We have no need to carry over this use of (meta)rule features since, in our account, lexical rules are not distinct from any other kind of property in the inheritance hierarchy. They can be stated at the most inclusive relevant node and can then be overridden at the exceptional descendant nodes. Nothing specific needs to be said about the nonexceptional nodes.</Paragraph> <Paragraph position="8"> Secondly, his metarules may themselves be more or less similar to each other, and he suggests (1994, 11) that these similarities could be captured if the metarules were also to be organized in a hierarchy.
However, our approach allows us to deal with any such similarities in the main lexical hierarchy itself,15 rather than by setting up a separate hierarchical component just for metarules (which appears to be what Becker has in mind).</Paragraph> <Paragraph position="9"> Thirdly, as he himself notes (1993, 128), because his metarules map from elementary trees that are in the inheritance hierarchy to elementary trees that are outside it, most of the elementary trees actually used are not directly connected to the hierarchy (although their derived status with respect to it can be reconstructed). Our approach keeps all elementary trees, whether or not they have been partly defined by a lexical rule, entirely within the lexical hierarchy. In fact, Becker himself considers the possibility of capturing all the significant generalizations by using just one of the two mechanisms that he proposes: "one might want to reconsider the usage of one mechanism for phenomena in both dimensions" (1993, 135). But, as he goes on to point out, his existing type of inheritance network is not up to taking on the task performed by his metarules, because the former is monotonic whilst his metarules are not.</Paragraph> <Paragraph position="10"> However, he does suggest a way in which the hierarchy could be completely replaced by metarules, but argues against adopting it (1993, 136).</Paragraph> <Paragraph position="11"> As will be apparent from the earlier sections of this paper, we believe that Becker's insights about the organization of an LTAG lexicon can be better expressed if the metarule component is replaced by an encoding of (largely equivalent) lexical rules that are an integral part of a nonmonotonic inheritance hierarchy that stands as a description of all the elementary trees.</Paragraph> <Paragraph position="12"> 15As illustrated by the way in which the whq lexical rule inherits from that for topicalisation in the example given above.</Paragraph> </Section> </Paper>