Encoding Lexicalized Tree Adjoining Grammars with a 
Nonmonotonic Inheritance Hierarchy 
Roger Evans 
Information Technology 
Research Institute 
University of Brighton 
rpe@itri.bton.ac.uk
Gerald Gazdar 
School of Cognitive &
Computing Sciences 
University of Sussex 
geraldg@cogs.susx.ac.uk
David Weir 
School of Cognitive &
Computing Sciences 
University of Sussex 
davidw@cogs.susx.ac.uk
Abstract 
This paper shows how DATR, a widely used 
formal language for lexical knowledge representation,
can be used to define an LTAG
lexicon as an inheritance hierarchy with in- 
ternal lexical rules. A bottom-up featu- 
ral encoding is used for LTAG trees and 
this allows lexical rules to be implemen- 
ted as covariation constraints within fea- 
ture structures. Such an approach elimina- 
tes the considerable redundancy otherwise 
associated with an LTAG lexicon. 
1 Introduction 
The Tree Adjoining Grammar (TAG) formalism was
first introduced two decades ago (Joshi et al., 1975),
and since then there has been a steady stream of
theoretical work using the formalism. But it is
only more recently that grammars of non-trivial size
have been developed: Abeillé, Bishop, Cote & Schabes
(1990) describe a feature-based Lexicalized Tree
Adjoining Grammar (LTAG) for English which subsequently
became the basis for the grammar used in
the XTAG system, a wide-coverage LTAG parser (Doran
et al., 1994b; Doran et al., 1994a; XTAG Research
Group, 1995). The advent of such large grammars
gives rise to questions of efficient representation,
and the fully lexicalized character of the LTAG
formalism suggests that recent research into lexical
representation might be a place to look for answers
(see for example Briscoe et al. (1993); Daelemans &
Gazdar (1992)). In this paper we explore this suggestion
by showing how the lexical knowledge representation
language (LKRL) DATR (Evans & Gazdar,
1989a; Evans & Gazdar, 1989b) can be used to formulate
a compact, hierarchical encoding of an LTAG.
The issue of efficient representation for LTAG 1 is
discussed by Vijay-Shanker & Schabes (1992), who 
1 As with all fully lexicalized grammar formalisms,
there is really no conceptual distinction to be drawn in
LTAG between the lexicon and the grammar: the grammatical
rules are just lexical properties.
draw attention to the considerable redundancy inherent
in LTAG lexicons that are expressed in a flat
manner with no sharing of structure or properties 
across the elementary trees. For example, XTAG cur- 
rently includes over 100,000 lexemes, each of which 
is associated with a family of trees (typically around 
20) drawn from a set of over 500 elementary trees. 
Many of these trees have structure in common, many 
of the lexemes have the same tree families, and many 
of the trees within families are systematically rela- 
ted in ways which other formalisms capture using 
transformations or metarules. However, the LTAG
formalism itself does not provide any direct support 
for capturing such regularities. 
Vijay-Shanker & Schabes address this problem by 
introducing a hierarchical lexicon structure with mo- 
notonic inheritance and lexical rules, using an ap- 
proach loosely based on that of Flickinger (1987) 
but tailored for LTAG trees rather than HPSG subcategorization
lists. Becker (1993; 1994) proposes a
slightly different solution, combining an inheritance 
component and a set of metarules 2. We share their 
perception of the problem and agree that adopting 
a hierarchical approach provides the best available 
solution to it. However, rather than creating a hierarchical
lexical formalism that is specific to the LTAG
problem, we have used DATR, an LKRL that is already
quite widely known and used. From an LTAG
perspective, it makes sense to use an already available
LKRL that was specifically designed to address
these kinds of representational issues. From a DATR
perspective, LTAG presents interesting problems arising
from its radically lexicalist character: all grammatical
relations, including unbounded dependency
constructions, are represented lexically and are thus 
open to lexical generalization. 
There are also several further benefits to be gai- 
ned from using an established general purpose LKRL 
such as DATR. First, it makes it easier to compare 
the resulting LTAG lexicon with those associated with
other types of lexical syntax: there are existing DATR
2 See Section 6 for further discussion of these
approaches.
lexicon fragments for HPSG, PATR and Word Gram- 
mar, among others. Second, DATR is not restricted 
to syntactic description, so one can take advantage 
of existing analyses of other levels of lexical descrip- 
tion, such as phonology, prosody, morphology, com- 
positional semantics and lexical semantics 3. Third, 
one can exploit existing formal and implementation 
work on the language 4. 
2 Representing LTAG trees 
[S NP↓ [VP V◊ NP↓ [PP P◊ NP↓]]]
Figure 1: An example LTAG tree for give
The principal unit of (syntactic) information asso- 
ciated with an LTAG entry is a tree structure in which 
the tree nodes are labeled with syntactic categories 
and feature information and there is at least one 
leaf node labeled with a lexical category (such lexi- 
cal leaf nodes are known as anchors). For example, 
the canonical tree for a ditransitive verb such as give 
is shown in figure 1. Following LTAG conventions 
(for the time being), the node labels here are gross 
syntactic category specifications to which additional 
featural information may be added 5, and are annotated
to indicate node type: ◊ indicates an anchor
node, and ↓ indicates a substitution node (where a
3 See, for example, Bleiching (1992; 1994), Brown &
Hippisley (1994), Corbett & Fraser (1993), Cahill (1990;
1993), Cahill & Evans (1990), Fraser & Corbett (in
press), Gibbon (1992), Kilgarriff (1993), Kilgarriff & 
Gazdar (1995), Reinhard & Gibbon (1991). 
4 See, for example, Andry et al. (1992) on compilation,
Kilbury et al. (1991) on coding DAGs, Duda & Gebhardi
(1994) on dynamic querying, Langer (1994) on reverse
querying, and Barg (1994), Light (1994), Light et
al. (1993) and Kilbury et al. (1994) on automatic ac- 
quisition. And there are at least a dozen different DATR 
implementations available, on various platforms and pro- 
gramming languages. 
5 In fact, LTAG commonly distinguishes two sets of
features at each node (top and bottom), but for simplicity
we shall assume just one set in this paper.
fully specified tree with a compatible root label may 
be attached) 6. 
In representing such a tree in DATR, we do two 
things. First, in keeping with the radically lexica- 
list character of LTAG, we describe the tree structure 
from its (lexical) anchor upwards 7, using a variant 
of Kilbury's (1990) bottom-up encoding of trees. In 
this encoding, a tree is described relative to a parti- 
cular distinguished leaf node (here the anchor node), 
using binary relations parent, left and right, relating
the node to the subtrees associated with its
parent, and immediate-left and -right sisters, enco- 
ded in the same way. Second, we embed the resulting 
tree structure (i.e., the node relations and type in- 
formation) in the feature structure, so that the tree 
relations (left, right and parent) become features. 
The obvious analogy here is the use of first/rest 
features to encode subcategorisation lists in frame- 
works like HPSG. 
Thus the syntactic feature information directly as- 
sociated with the entry for give relates to the label 
for the v node (for example, the value of its cat feature
is v, the value of type is anchor), while specifications
of subfeatures of parent relate to the label
of the vp node. A simple bottom-up DATR representation
for the whole tree (apart from the node type
information) follows: 
Give:
<cat> = v
<parent cat> = vp
<parent left cat> = np
<parent parent cat> = s
<right cat> = np
<right right cat> = p
<right right parent cat> = pp
<right right right cat> = np.
This says that Give is a verb, with vp as its pa- 
rent, an s as its grandparent and an NP to the left 
of its parent. It also has an NP to its right, and a 
tree rooted in a P to the right of that, with a PP 
parent and NP right sister. The implied bottom-up 
tree structure is shown graphically in figure 2. Here 
the nodes are laid out just as in figure 1, but rela- 
ted via parent, left and right links, rather than 
the more usual (implicitly ordered) daughter links. 
Notice in particular that the right link from the 
object noun-phrase node points to the preposition 
node, not its phrasal parent - this whole subtree is 
itself encoded bottom-up. Nevertheless, the full tree 
structure is completely and accurately represented 
by this encoding. 
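The claim that the encoding loses no structure can be checked mechanically. The following Python sketch is an illustration only and is not part of the DATR account; the names GIVE, to_tree and show are assumptions introduced here. It rebuilds the conventional top-down tree of figure 1 from the bottom-up paths of the Give definition above, treating each left or right sister as the distinguished leaf of its own bottom-up subtree.

```python
# Bottom-up feature paths for Give, copied from the DATR definition above.
GIVE = {
    "cat": "v",
    "parent cat": "vp",
    "parent left cat": "np",
    "parent parent cat": "s",
    "right cat": "np",
    "right right cat": "p",
    "right right parent cat": "pp",
    "right right right cat": "np",
}


def to_tree(features):
    """Rebuild a (label, children) tree from bottom-up feature paths."""
    # Index category labels by the path of the node that carries them.
    cats = {tuple(path.split())[:-1]: value
            for path, value in features.items()
            if path.split()[-1] == "cat"}

    def climb(node):
        # Build the subtree whose distinguished leaf is `node` by following
        # parent links upwards; also return the path of its topmost node.
        tree = (cats[node], [])
        while node + ("parent",) in cats:
            children = chain(node, "left") + [tree] + chain(node, "right")
            node = node + ("parent",)
            tree = (cats[node], children)
        return tree, node

    def chain(node, direction):
        # Collect the sisters on one side of `node`, in left-to-right order.
        # Each sister is itself encoded bottom-up, so the chain continues
        # from the top of the subtree that the sister belongs to.
        sister = node + (direction,)
        if sister not in cats:
            return []
        tree, top = climb(sister)
        rest = chain(top, direction)
        return rest + [tree] if direction == "left" else [tree] + rest

    return climb(())[0]


def show(tree):
    """Render a (label, children) tree in bracketed notation."""
    label, children = tree
    if not children:
        return label
    return "[" + label + " " + " ".join(show(c) for c in children) + "]"


print(show(to_tree(GIVE)))
```

Note how the right link from the object NP leads to the P node, and the PP above it is only discovered by climbing P's own parent link, exactly as described in the text.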
6 LTAG's other tree-building operation is adjunction,
which allows a tree-fragment to be spliced into the body 
of a tree. However, we only need to concern ourselves 
here with the representation of the trees involved, not 
with the substitution/adjunction distinction. 
7 The tree in figure 1 has more than one anchor - in
such cases it is generally easy to decide which anchor is 
the most appropriate root for the tree (here, the verb 
anchor). 
Figure 2: Bottom-up encoding for Give
Once we adopt this representational strategy, wri- 
ting an LTAG lexicon in DATR becomes similar to 
writing any other type of lexicalist grammar's le- 
xicon in an inheritance-based LKRL. In HPSG, for 
example, the subcategorisation frames are coded as 
lists of categories, whilst in LTAG they are coded as 
trees. But, in both cases, the problem is one of con- 
cisely describing feature structures associated with 
lexical entries and relationships between lexical ent- 
ries. The same kinds of generalization arise and the 
same techniques are applicable. Of course, the pre- 
sence of complete trees and the fully lexicalized ap- 
proach provide scope for capturing generalizations 
lexically that are not available to approaches that 
only identify parent and sibling nodes, say, in the 
lexical entries. 
3 Encoding lexical entries 
Following conventional models of lexicon organisa- 
tion, we would expect Give to have a minimal syn- 
tactic specification itself, since syntactically it is a 
completely regular ditransitive verb. In fact none 
of the information introduced so far is specific to 
Give. So rather than providing a completely expli- 
cit DATR definition for Give, as we did above, a more 
plausible account uses an inheritance hierarchy defi- 
ning abstract intransitive, transitive and ditransitive 
verbs to support Give (among others), as shown in 
figure 3. 
This basic organisational structure can be expressed
as the following DATR fragment 8:
8 To gain the intuitive sense of this fragment, read
a line such as <> == VERB as "inherit everything from
the definition of VERB", and a line such as <parent> ==
PPTREE:<> as "inherit the parent subtree from the definition
of PPTREE". Inheritance in DATR is always by
default - locally defined feature specifications take priority
over inherited ones.
VERB
    Die
    VERB+NP
        Eat
        VERB+NP+PP
            Give
        VERB+NP+NP
            Spare
Figure 3: The principal lexical hierarchy
VERB:
<> == TREENODE
<cat> == v
<type> == anchor
<parent> == VPTREE:<>.
VERB+NP:
<> == VERB
<right> == NPCOMP:<>.
VERB+NP+PP:
<> == VERB+NP
<right right> == PTREE:<>
<right right root> == to.
VERB+NP+NP:
<> == VERB+NP
<right right> == NPCOMP:<>.
Die:
<> == VERB
<root> == die.
Eat:
<> == VERB+NP
<root> == eat.
Give:
<> == VERB+NP+PP
<root> == give.
Spare:
<> == VERB+NP+NP
<root> == spare.
Ignoring for the moment the references to 
TREENODE, VPTREE, NPCOMP and PTREE (which we 
shall define shortly), we see that VERB defines basic 
features for all verb entries (and can be used directly 
for intransitives such as Die), VERB+NP inherits from
VERB but adds an NP complement to the right of
the verb (for transitives), VERB+NP+PP inherits from
VERB+NP but adds a further PP complement and so
on. Entries for regular verb lexemes are then mi- 
nimal - syntactically they just inherit everything 
from the abstract definitions. 
This DATR fragment is incomplete, because it neglects
to define the internal structure of the TREENODE
and the various subtree nodes in the lexical hierar- 
chy. Each such node is a description of an LTAG tree 
at some degree of abstraction 9. The following DATR 
statements complete the fragment, by providing de- 
finitions for this internal structure: 
TREENODE:
<> == undef
<type> == internal.
STREE:
<> == TREENODE
<cat> == s.
VPTREE:
<> == TREENODE
<cat> == vp
<parent> == STREE:<>
<left> == NPCOMP:<>.
NPCOMP:
<> == TREENODE
<cat> == np
<type> == substitution.
PPTREE:
<> == TREENODE
<cat> == pp.
PTREE:
<> == TREENODE
<cat> == p
<type> == anchor
<parent> == PPTREE:<>.
Here, TREENODE represents an abstract node in an 
LTAG tree and provides a (default) type of internal. 
Notice that VERB is itself a TREENODE (but with the 
nondefault type anchor), and the other definitions 
here define the remaining tree nodes that arise in 
our small lexicon: VPTREE is the node for VERB's parent,
STREE for VERB's grandparent, NPCOMP defines
the structure needed for NP complement substitution
nodes, etc. 10
Taken together, these definitions provide a speci- 
fication for Give just as we had it before, but with 
the addition of type and root features. They also 
support some other verbs too, and it should be clear 
that the basic technique extends readily to a wide 
range of other verbs and other parts of speech. Also, 
although the trees we have described are all initial 
9 Even the lexeme nodes are abstract - individual
word forms might be represented by further more specific 
nodes attached below the lexemes in the hierarchy. 
10 Our example makes much use of multiple inheritance
(thus, for example, VPTREE inherits from TREENODE,
STREE and NPCOMP) but all such multiple inheritance is
orthogonal in DATR: no path can inherit from more than
one node.
trees (in LTAG terminology), we can describe auxi- 
liary trees, which include a leaf node of type foot 
just as easily. A simple example is provided by the 
following definition for auxiliary verbs: 
AUXVERB:
<> == TREENODE
<cat> == v
<type> == anchor
<parent cat> == vp
<right cat> == vp
<right type> == foot.
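The way values are found in the verb hierarchy of this section can be approximated outside DATR. The Python sketch below is an illustration under simplifying assumptions, not a DATR implementation: it supports only atoms and node:path references, it resolves a query by taking the longest defined prefix of the path and carrying the unmatched remainder over to the inherited node, and it returns None for undef. The names THEORY and lookup are introduced here for the example.

```python
UNDEF = "undef"

# Each node maps a path prefix to an atom (a string) or to a (node, path)
# pair to inherit from; "<> == VERB" is written as ("VERB", ()).
THEORY = {
    "TREENODE": {(): UNDEF, ("type",): "internal"},
    "STREE": {(): ("TREENODE", ()), ("cat",): "s"},
    "VPTREE": {(): ("TREENODE", ()), ("cat",): "vp",
               ("parent",): ("STREE", ()), ("left",): ("NPCOMP", ())},
    "NPCOMP": {(): ("TREENODE", ()), ("cat",): "np",
               ("type",): "substitution"},
    "PPTREE": {(): ("TREENODE", ()), ("cat",): "pp"},
    "PTREE": {(): ("TREENODE", ()), ("cat",): "p", ("type",): "anchor",
              ("parent",): ("PPTREE", ())},
    "VERB": {(): ("TREENODE", ()), ("cat",): "v", ("type",): "anchor",
             ("parent",): ("VPTREE", ())},
    "VERB+NP": {(): ("VERB", ()), ("right",): ("NPCOMP", ())},
    "VERB+NP+PP": {(): ("VERB+NP", ()), ("right", "right"): ("PTREE", ()),
                   ("right", "right", "root"): "to"},
    "VERB+NP+NP": {(): ("VERB+NP", ()), ("right", "right"): ("NPCOMP", ())},
    "Die": {(): ("VERB", ()), ("root",): "die"},
    "Eat": {(): ("VERB+NP", ()), ("root",): "eat"},
    "Give": {(): ("VERB+NP+PP", ()), ("root",): "give"},
    "Spare": {(): ("VERB+NP+NP", ()), ("root",): "spare"},
}


def lookup(node, path):
    """Evaluate a space-separated path such as "parent left cat" at a node."""
    return _lookup(node, tuple(path.split()))


def _lookup(node, path):
    rules = THEORY[node]
    # Default inheritance: the longest defined prefix of the path wins.
    for n in range(len(path), -1, -1):
        prefix, suffix = path[:n], path[n:]
        if prefix in rules:
            value = rules[prefix]
            if isinstance(value, str):
                return None if value == UNDEF else value
            target, target_path = value
            # Carry the unmatched suffix over to the inherited definition.
            return _lookup(target, target_path + suffix)
    return None


for verb in ("Die", "Eat", "Give", "Spare"):
    print(verb, lookup(verb, "cat"), lookup(verb, "right cat"),
          lookup(verb, "right right cat"))
```

Because the most specific prefix is always tried first, a local statement such as `<right> == NPCOMP:<>` at VERB+NP overrides the undefined complement that VERB would otherwise inherit from TREENODE, which is the nonmonotonic behaviour the fragment relies on.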
4 Lexical rules 
Having established a basic structure for our LTAG 
lexicon, we now turn our attention towards captu- 
ring other kinds of relationship among trees. We 
noted above that lexical entries are actually associa- 
ted with tree families, and that these group to- 
gether trees that are related to each other. Thus in 
the same family as a standard ditransitive verb, we 
might find the full passive, the agentless passive, the 
dative alternation, the various relative clauses, and 
so forth. It is clear that these families correspond 
closely to the outputs of transformations or metaru- 
les in other frameworks, but the XTAG system cur- 
rently has no formal component for describing the 
relationships among families nor mechanisms for ge- 
nerating them. And so far we have said nothing 
about them either - we have only characterized sin- 
gle trees. 
However, LTAG's large domain of locality means 
that all such relationships can be viewed as directly 
lexical, and thus expressible by lexical rules. In fact
we can go further than this: because we have em- 
bedded the domain of these lexical rules, namely the 
LTAG tree structures, within the feature structures, 
we can view such lexical rules as covariation cons- 
traints within feature structures, in much the same 
way that the covariation of, say, syntactic and mor- 
phological form is treated. In particular, we can use 
the mechanisms that DATR already provides for fea- 
ture covariation, rather than having to invoke in ad- 
dition some special purpose lexical rule machinery. 
We consider six construction types found in the 
XTAG grammar: passive, dative, subject-auxiliary 
inversion, wh-questions, relative clauses and topica- 
lisation. Our basic approach to each of these is the 
same. Lexical rules are specified by defining a deri- 
ved output tree structure in terms of an input tree 
structure, where each of these structures is a set of 
feature specifications of the sort defined above. Each 
lexical rule has a name, and the input and output 
tree structures for rule foo are referenced by pre- 
fixing feature paths of the sort given above with 
<input foo . .> or <output foo . .>. So for ex- 
ample, the category of the parent tree node of the 
output of the passive rule might be referenced as 
<output passive parent cat>. We define a very 
general default, stating that the output is the same 
as the input, so that lexical relationships need only 
concern themselves with components they modify. 
This approach to formulating lexical rules in DATR
is quite general and in no way restricted to LTAG: it
can be readily adapted for application in the context 
of any feature-based lexicalist grammar formalism. 
Using this approach, the dative lexical rule can be 
given a minimalist implementation by the addition 
of the following single line to VERB+NP+PP, defined 
above. 
VERB+NP+PP : 
<output dative right right> == NPCOMP:<>. 
This causes the second complement to a ditran- 
sitive verb in the dative alternation to be an NP, 
rather than a PP as in the unmodified case. Subject- 
auxiliary inversion can be achieved similarly by just 
specifying the output tree structure without refe- 
rence to the input structure (note the addition here 
of a form feature specifying verb form): 
AUXVERB : 
<output auxinv form> == finite-inv 
<output auxinv parent cat> == s 
<output auxinv right cat> == s. 
Passive is slightly more complex, in that it has to 
modify the given input tree structure rather than 
simply overwriting part of it. The definitions for pas- 
sive occur at the VERB+NP node, since by default, any 
transitive or subclass of transitive has a passive form. 
Individual transitive verbs, or whole subclasses, can 
override this default, leaving their passive tree struc- 
ture undefined if required. For agentless passives, 
the necessary additions to the VERB+NP node are as 
follows 11:
VERB+NP : 
<output passive form> == passive 
<output passive right> == 
"<input passive right right>". 
Here, the first line stipulates the form of the verb 
in the output tree to be passive, while the second line 
redefines the complement structure: the output of 
passive has as its first complement the second com- 
plement of its input, thereby discarding the first 
complement of its input. Since complements are 
daisy-chained, all the others move up too. 
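The effect of the dative and agentless passive rules on a tree can be pictured procedurally. The Python sketch below is a rough approximation and not the DATR mechanism itself: each rule is written as a function from an input set of feature paths to an output set, the output starts as a copy of the input (the general default), and only the paths named by the rule are changed. The helper names paths, subtree and replace_subtree, and the flat dictionaries EAT and GIVE, are assumptions made for the example.

```python
def paths(spec):
    """Turn {"right cat": "np"} into {("right", "cat"): "np"}."""
    return {tuple(key.split()): value for key, value in spec.items()}


def subtree(features, prefix):
    """Return the features below `prefix`, with the prefix stripped."""
    n = len(prefix)
    return {p[n:]: v for p, v in features.items() if p[:n] == prefix}


def replace_subtree(features, prefix, new):
    """Copy `features`, swapping everything below `prefix` for `new`."""
    n = len(prefix)
    out = {p: v for p, v in features.items() if p[:n] != prefix}
    out.update({prefix + p: v for p, v in new.items()})
    return out


NPCOMP = paths({"cat": "np", "type": "substitution"})


def dative(tree):
    # <output dative right right> == NPCOMP:<>
    return replace_subtree(tree, ("right", "right"), NPCOMP)


def passive(tree):
    # <output passive right> == "<input passive right right>"
    out = replace_subtree(tree, ("right",), subtree(tree, ("right", "right")))
    # <output passive form> == passive
    out[("form",)] = "passive"
    return out


EAT = paths({"cat": "v", "parent cat": "vp", "right cat": "np"})
GIVE = paths({
    "cat": "v", "parent cat": "vp", "parent left cat": "np",
    "parent parent cat": "s", "right cat": "np", "right right cat": "p",
    "right right parent cat": "pp", "right right right cat": "np",
})

print(sorted(passive(EAT).items()))
print(sorted(passive(dative(GIVE)).items()))
```

Passivising Eat leaves no complement at all, while passivising Give promotes the PP subtree into first complement position, which is the daisy-chain effect noted above.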
Wh-questions, relative clauses and topicalisation 
are slightly different, in that the application of the 
lexical rule causes structure to be added to the top 
of the tree (above the s node). Although these constructions
involve unbounded dependencies, the unboundedness
is taken care of by the LTAG adjunction
mechanism: for lexical purposes the dependency is 
local. Since the relevant lexical rules can apply to 
sentences that contain any kind of verb, they need 
to be stated at the VERB node. Thus, for exam- 
ple, topicalisation and wh-questions can be defined 
as follows: 
11 Oversimplifying slightly, the double quotes in
"<input passive right right>" mean that that DATR 
path will not be evaluated locally (i.e., at the VERB+NP 
node), but rather at the relevant lexeme node (e.g., Eat 
or Give). 
VERB:
<output topic parent parent parent cat> == s
<output topic parent parent left cat> == np
<output topic parent parent left form>
== normal
<output whq> == "<output topic>"
<output whq parent parent left form> == wh.
Here an additional NP and s are attached above 
the original s node to create a topicalised struc- 
ture. The wh-rule inherits from the topicalisation 
rule, changing just one thing: the form of the new 
NP is marked as wh, rather than as normal. In the 
full fragment 12, the NP added by these rules is also 
syntactically cross-referenced to a specific NP mar- 
ked as null in the input tree. However, space does 
not permit presentation or discussion of the DATR 
code that achieves this here. 
5 Applying lexical rules 
As explained above, each lexical rule is defined to 
operate on its own notion of an input and produce 
its own output. In order for the rules to have an ef- 
fect, the various input and output paths have to be 
linked together using inheritance, creating a chain of 
inheritances between the base, that is, the canonical 
definitions we introduced in section 3, and surface 
tree structures of the lexical entry. For example, to 
'apply' the dative rule to our Give definition, we 
could construct a definition such as this: 
Give-dat : 
<> == Give
<input dative> == <> 
<surface> == <output dative>. 
Values for paths prefixed with surface inherit 
from the output of the dative rule. The input of 
the dative rule inherits from the base (unprefixed) 
case, which inherits from Give. The dative rule definition
(just the one line introduced above, plus the
default that output inherits from input) thus mediates
between Give and the surface of Give-dat. This
chain can be extended by inserting additional in- 
heritance specifications (such as passive). Note that 
surface defaults to the base case, so all entries have 
a surface defined. 
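The chain of inheritances can be traced with a small path-rewriting evaluator. The Python sketch below is a single-node simplification that ignores DATR's distinction between local and quoted (global) paths, so it should be read as an illustration of the linking only. A tuple value means "re-evaluate at this path, keeping the unmatched remainder", None marks an undefined subtree, and the names BASE, GIVE_DAT, GIVE_DAT_PASS and evaluate are assumptions introduced here.

```python
def evaluate(rules, path):
    """Resolve `path` by longest-prefix match, following path references."""
    for n in range(len(path), -1, -1):
        prefix, suffix = path[:n], path[n:]
        if prefix in rules:
            value = rules[prefix]
            if isinstance(value, tuple):
                # A reference to another path: carry the suffix across.
                return evaluate(rules, value + suffix)
            return value  # an atom, or None for an undefined subtree
    return None


# Base (unprefixed) tree for Give, plus the two general defaults.
BASE = {
    ("cat",): "v",
    ("parent", "cat"): "vp",
    ("right", "cat"): "np",
    ("right", "right", "cat"): "p",
    ("right", "right", "parent", "cat"): "pp",
    ("right", "right", "right", "cat"): "np",
    ("output",): ("input",),   # default: output is the same as input
    ("surface",): (),          # default: surface is the base case
}

# Give-dat: link base -> input dative -> output dative -> surface.
GIVE_DAT = dict(BASE)
GIVE_DAT.update({
    ("input", "dative"): (),
    ("output", "dative", "right", "right"): None,   # drop the PP subtree
    ("output", "dative", "right", "right", "cat"): "np",
    ("surface",): ("output", "dative"),
})

# Extending the chain: the passive rule takes the dative output as input.
GIVE_DAT_PASS = dict(GIVE_DAT)
GIVE_DAT_PASS.update({
    ("input", "passive"): ("output", "dative"),
    ("output", "passive", "form"): "passive",
    ("output", "passive", "right"): ("input", "passive", "right", "right"),
    ("surface",): ("output", "passive"),
})

for name, entry in [("Give", BASE), ("Give-dat", GIVE_DAT),
                    ("Give-dat-pass", GIVE_DAT_PASS)]:
    print(name,
          evaluate(entry, ("surface", "right", "cat")),
          evaluate(entry, ("surface", "right", "right", "cat")))
```

Each surface query walks back through the output and input paths of every rule in the chain until either a rule or the base definition supplies a value, so a rule only has to mention the paths it changes.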
However, in our full fragment, additional support 
is provided to achieve and constrain this rule chai- 
ning. Word definitions include boolean features in- 
dicating which rules to apply, and the presence of 
these features triggers inheritance between appropriate
input and output paths and the base and
surface specifications at the ends of the chain. For 
example, Word1 is an alternative way of specifying
the dative alternant of Give, but results in inhe- 
ritance linking equivalent to that found in Give-dat 
above: 
12 The full version of this DATR fragment includes all
the components discussed above in a single coherent, but 
slightly more complex account. It is available on request 
from the authors. 
Word1:
<> == Give 
<alt dative> == true. 
More interestingly, Word2 properly describes a wh-question
based on the agentless passive of the dative
of Give. 
Word2:
<> == Give
<alt whq> == true
<alt dative> == true
<alt passive> == true
<parent left form> == null.
Notice here the final line of Word2 which specifies
the location of the 'extracted' NP (the subject, in this 
case), by marking it as null. As noted above, the full 
version of the whq lexical rule uses this to specify a 
cross-reference relationship between the wh-NP and 
the null NP. 
We can, if we wish, encode constraints on the app- 
licability of rules in the mapping from boolean flags 
to actual inheritance specifications. Thus, for example,
whq, rel, and topic are mutually exclusive.
If such constraints are violated, then no value for 
surface gets defined. Thus Word3 improperly att- 
empts topicalisation in addition to wh-question for- 
mation, and, as a result, will fail to define a surface 
tree structure at all: 
Word3:
<> == Give
<alt whq> == true
<alt topic> == true
<alt dative> == true
<alt passive> == true
<parent left form> == null.
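How boolean flags might be turned into inheritance links can be sketched as follows. The full fragment's actual mechanism is not given in this paper, so everything in this Python example is an assumption made for illustration: the fixed rule ordering in RULE_ORDER (only dative before passive before whq is suggested by the Word2 example), the function name chain_links, and the use of None to stand for "no surface defined".

```python
# Assumed order of application; only dative < passive < whq is suggested
# by the Word2 example, the positions of topic and rel are guesses.
RULE_ORDER = ["dative", "passive", "topic", "whq", "rel"]

# Rules stated in the text to be mutually exclusive.
EXCLUSIVE = {"whq", "rel", "topic"}


def chain_links(flags):
    """Map rule flags to path-to-path inheritance links, or None.

    The result links each rule's input to the previous rule's output,
    starting from the base case () and ending at the surface path.
    """
    active = [rule for rule in RULE_ORDER if flags.get(rule)]
    if len(EXCLUSIVE.intersection(active)) > 1:
        return None  # constraint violated: no surface gets defined
    links = {}
    previous = ()  # the base (unprefixed) definitions
    for rule in active:
        links[("input", rule)] = previous
        previous = ("output", rule)
    links[("surface",)] = previous
    return links


WORD1 = {"dative": True}
WORD2 = {"whq": True, "dative": True, "passive": True}
WORD3 = {"whq": True, "topic": True, "dative": True, "passive": True}

for name, flags in [("Word1", WORD1), ("Word2", WORD2), ("Word3", WORD3)]:
    print(name, chain_links(flags))
```

Under these assumptions Word1 yields the same two links that Give-dat states by hand, Word2 chains dative, passive and whq in that order, and Word3 yields no links at all because it requests both whq and topic.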
This approach to lexical rules allows them to be 
specified at the appropriate point in the lexical hierarchy,
but overridden or modified in subclasses or
lexemes as appropriate. It also allows default gene- 
ralisation over the lexical rules themselves, and con- 
trol over their application. The last section showed 
how the whq lexical rule could be built by a single mi- 
nor addition to that for topicalisation. However, it is 
worth noting that, in common with other DATR spe- 
cifications, the lexical rules presented here are rule 
instances which can only be applied once to any 
given lexeme - multiple application could be sup- 
ported, by making multiple instances inherit from 
some common rule specification, but in our current 
treatment such instances would require different rule 
names. 
6 Comparison with related work 
As noted above, Vijay-Shanker & Schabes (1992) 
have also proposed an inheritance-based approach 
to this problem. They use monotonic inheritance to 
build up partial descriptions of trees: each descrip- 
tion is a finite set of dominance, immediate domi- 
nance and linear precedence statements about tree 
nodes in a tree description language developed by 
Rogers & Vijay-Shanker (1992), and category infor- 
mation is located in the node labels. 
This differs from our approach in a number of 
ways. First, our use of nonmonotonic inheritance 
allows us to manipulate total instead of partial de- 
scriptions of trees. The abstract verb class in the 
Vijay-Shanker & Schabes account subsumes both in- 
transitive and transitive verb classes but is not iden- 
tical to either - a minimal-satisfying-model step is 
required to map partial tree descriptions into actual 
trees. In our analysis, VERB is the intransitive verb 
class, with complements specifically marked as undefined:
thus VERB : <right> == undef is inherited
from TREENODE and VERB+NP just overrides this com- 
plement specification to add an NP complement. Se- 
cond, we describe trees using only local tree relations 
(between adjacent nodes in the tree), while Vijay-Shanker
& Schabes also use a nonlocal dominance
relation. 
Both these properties are crucial to our embed- 
ding of the tree structure in the feature structure. 
We want the category information at each tree node 
to be partial in the conventional sense, so that in 
actual use such categories can be extended (by uni- 
fication or whatever). So the feature structures that 
we associate with lexical entries must be viewed as 
partial. But we do not want the tree structure to 
be extendible in the same way: we do not want an 
intransitive verb to be applicable in a transitive con- 
text, by unifying in a complement NP. So the tree 
structures we define must be total descriptions 13. 
And of course, our use of only local relations al- 
lows a direct mapping from tree structure to feature 
path, which would not be possible at all if nonlocal 
relations were present. 
So while these differences may seem small, they al- 
low us to take this significant representational step - 
significant because it is the tree structure embedding 
that allows us to view lexical rules as feature cova- 
riation constraints. The result is that while Vijay- 
Shanker & Schabes use a tree description language, 
a category description language and a further for- 
malism for lexical rules, we can capture everything 
in one framework all of whose components (non- 
monotonicity, covariation constraint handling, etc.) 
have already been independently motivated for other 
aspects of lexical description 14. 
Becker's recent work (1993; 1994) is also directed
at exactly the problem we address in the present 
paper. Like him, we have employed an inheritance 
hierarchy. And, like him, we have employed a set of 
lexical rules (corresponding to his metarules). The 
key differences between our account and his are (i) 
13 Note that the simplified fragment presented here does
not get this right. It makes all feature specifications total
descriptions. To correct this we would need to change
TREENODE so that only the values of <right>, <left> and
<parent> default to undef.
14 As in the work cited in footnote 3, above.
that we have been able to use an existing lexical 
knowledge representation language, rather than designing
a formal system that is specific to LTAG, and
(ii) that we have expressed our lexical rules in ex- 
actly the same language as that we have used to 
define the hierarchy, rather than invoking two quite 
different formal systems. 
Becker's sharp distinction between his metarules
and his hierarchy gives rise to some problems that 
our approach avoids. Firstly, he notes that his meta- 
rules are subject to lexical exceptions and proposes 
to deal with these by stating "for each entry in the 
(syntactic) lexicon .. which metarules are applica- 
ble for this entry" (1993,126). We have no need to 
carry over this use of (meta)rule features since, in
our account, lexical rules are not distinct from any 
other kind of property in the inheritance hierarchy. 
They can be stated at the most inclusive relevant 
node and can then be overridden at the exceptional 
descendant nodes. Nothing specific needs to be said 
about the nonexceptional nodes. 
Secondly, his metarules may themselves be more 
or less similar to each other and he suggests 
(1994,11) that these similarities could be captured 
if the metarules were also to be organized in a hier- 
archy. However, our approach allows us to deal with 
any such similarities in the main lexical hierarchy 
itself 15 rather than by setting up a separate hierar- 
chical component just for metarules (which appears 
to be what Becker has in mind).
Thirdly, as he himself notes (1993,128), because 
his metarules map from elementary trees that are in 
the inheritance hierarchy to elementary trees that 
are outside it, most of the elementary trees actually 
used are not directly connected to the hierarchy (alt- 
hough their derived status with respect to it can be 
reconstructed). Our approach keeps all elementary 
trees, whether or not they have been partly defined 
by a lexical rule, entirely within the lexical hierarchy. 
In fact, Becker himself considers the possibility 
of capturing all the significant generalizations by 
using just one of the two mechanisms that he pro- 
poses: "one might want to reconsider the usage of 
one mechanism for phenomena in both dimensions" 
(1993,135). But, as he goes on to point out, his exi- 
sting type of inheritance network is not up to taking 
on the task performed by his metarules because the 
former is monotonic whilst his metarules are not. 
However, he does suggest a way in which the hierar- 
chy could be completely replaced by metarules but 
argues against adopting it (1993,136). 
As will be apparent from the earlier sections of 
this paper, we believe that Becker's insights about 
the organization of an LTAG lexicon can be better
expressed if the metarule component is replaced by 
15 As illustrated by the way in which the whq lexical
rule inherits from that for topicalisation in the example 
given above. 
an encoding of (largely equivalent) lexical rules that 
are an integral part of a nonmonotonic inheritance 
hierarchy that stands as a description of all the ele- 
mentary trees. 
Acknowledgements 
A precursor of this paper was presented at the September
1994 TAG+ Workshop in Paris. We thank
the referees for that event and the ACL-95 referees 
for a number of helpful comments. We are also grateful
to Aravind Joshi, Bill Keller, Owen Rambow,
K. Vijay-Shanker and The XTAG Group. This research
was partly supported by grants to Evans from
SERC/EPSRC (UK) and to Gazdar from ESRC (UK).

References 
Anne Abeillé, Kathleen Bishop, Sharon Cote, & Yves
Schabes. 1990. A lexicalized tree adjoining grammar
for English. Technical Report MS-CIS-90-24, Department
of Computer & Information Science, Univ. of
Pennsylvania. 
Francois Andry, Norman Fraser, Scott McGlashan, Si- 
mon Thornton, & Nick Youd. 1992. Making DATR 
work for speech: lexicon compilation in SUNDIAL.
Comput. Ling., 18(3):245-267. 
Petra Barg. 1994. Automatic acquisition of DATR theories
from observations. Theorie des Lexikons: Arbeiten
des Sonderforschungsbereichs 282, Heinrich-Heine
Univ. of Duesseldorf, Duesseldorf.
Tilman Becker. 1993. HyTAG: A new type of Tree Ad- 
joining Grammar for hybrid syntactic representation 
of free word order languages. Ph.D. thesis, Univ. des 
Saarlandes. 
Tilman Becker. 1994. Patterns in metarules. In Proceedings
of the Third International Workshop on Tree
Adjoining Grammars, 9-11. 
Doris Bleiching. 1992. Prosodisches wissen in lexicon. In 
G. Goerz, ed., KONVENS-92, 59-68. Springer-Verlag. 
Doris Bleiching. 1994. Integration von morphophonologie
und prosodie in ein hierarchisches lexicon. In
H. Trost, ed., Proceedings of KONVENS-94, 32-41.
Ted Briscoe, Valeria de Paiva, & Ann Copestake. 1993. 
Inheritance, Defaults, & the Lexicon. CUP.
Dunstan Brown & Andrew Hippisley. 1994. Conflict in
Russian genitive plural assignment: A solution represented
in DATR. J. of Slavic Linguistics, 2(1):48-76.
Lynne Cahill & Roger Evans. 1990. An application of 
DATR: the TIC lexicon. In ECAI-90, 120-125. 
Lynne Cahill. 1990. Syllable-based morphology. In 
COLING-90, volume 3, 48-53. 
Lynne Cahill. 1993. Morphonology in the lexicon. In 
EACL-93, 37-96.
Greville Corbett & Norman Fraser. 1993. Network mor- 
phology: a DATR account of Russian nominal inflec- 
tion. J. of Linguistics, 29:113-142. 
Walter Daelemans & Gerald Gazdar, eds. 1992. Special 
issues on inheritance. Comput. Ling., 18(2 & 3).
Christy Doran, Dania Egedi, Beth Ann Hockey, & B. Sri- 
nivas. 1994a. Status of the XTAG system. In Pro- 
ceedings of the Third International Workshop on Tree 
Adjoining Grammars, 20-23. 
Christy Doran, Dania Egedi, Beth Ann Hockey, B. Srinivas,
& Martin Zaidel. 1994b. XTAG system -- a
wide coverage grammar for English. In COLING-94,
922-928. 
Markus Duda & Gunter Gebhardi. 1994. DUTR - a
DATR-PATR interface formalism. In H. Trost, ed.,
Proceedings of KONVENS-94, 411-414.
Roger Evans & Gerald Gazdar. 1989a. Inference in 
DATR. In EACL-89, 66-71.
Roger Evans & Gerald Gazdar. 1989b. The semantics 
of DATR. In AISB-89, 79-87. 
Daniel P. Flickinger. 1987. Lexical Rules in the Hierarchical
Lexicon. Ph.D. thesis, Stanford Univ.
Norman Fraser & Greville Corbett. in press. Gender,
animacy, & declensional class assignment: a unified
account for Russian. In Geert Booij & Jaap van Marle,
eds., Yearbook of Morphology 1994. Kluwer, Dordrecht.
Dafydd Gibbon. 1992. ILEX: a linguistic approach to
computational lexica. In Ursula Klenk, ed., Computatio
Linguae: Aufsätze zur algorithmischen und
quantitativen Analyse der Sprache (Zeitschrift für
Dialektologie und Linguistik, Beiheft 73), 32-53. Franz
Steiner Verlag, Stuttgart.
A. K. Joshi, L. S. Levy, & M. Takahashi. 1975. Tree 
adjunct grammars. J. Comput. Syst. Sci., 10(1):136-163.
James Kilbury, Petra [Barg] Naerger, & Ingrid Renz.
1991. DATR as a lexical component for PATR. In 
EACL-91, 137-142. 
James Kilbury, Petra Barg, & Ingrid Renz. 1994. Simulation
lexikalischen erwerbs. In Christopher Habel,
Gert Rickheit, & Sascha W. Felix, eds., Kognitive Linguistik:
Repraesentation und Prozesse, 251-271. Westdeutscher
Verlag, Opladen.
James Kilbury. 1990. Encoding constituent structure in 
feature structures. Unpublished manuscript, Univ. of 
Duesseldorf, Duesseldorf. 
Adam Kilgarriff & Gerald Gazdar. 1995. Polysemous
relations. In Frank Palmer, ed., Grammar & meaning:
essays in honour of Sir John Lyons, 1-25. CUP.
Adam Kilgarriff. 1993. Inheriting verb alternations. In 
EACL-93, 213-221. 
Hagen Langer. 1994. Reverse queries in DATR. In 
COLING-94, 1089-1095. 
Marc Light, Sabine Reinhard, & Marie Boyle-Hinrichs. 
1993. INSYST: an automatic inserter system for hier- 
archical lexica. In EACL-93, page 471. 
Marc Light. 1994. Classification in feature-based default 
inheritance hierarchies. In H. Trost, ed., Proceedings
of KONVENS-94, 220-229.
Sabine Reinhard & Dafydd Gibbon. 1991. Prosodic in- 
heritance & morphological generalisations. In EACL- 
91, 131-136. 
James Rogers & K. Vijay-Shanker. 1992. Reasoning 
with descriptions of trees. In ACL-92, 72-80.
K. Vijay-Shanker & Yves Schabes. 1992. Structure 
sharing in lexicalized tree-adjoining grammar. In 
COLING-92, 205-211. 
The XTAG Research Group. 1995. A lexicalized tree ad- 
joining grammar for English. Technical Report IRCS 
Report 95-03, The Institute for Research in Cognitive 
Science, Univ. of Pennsylvania. 
