Integrating compositional semantics into a verb lexicon 
Hoa Trang Dang, Karin Kipper and Martha Palmer 
Department of Computer and Information Sciences 
University of Pennsylvania 
200 South 33rd Street 
Philadelphia, PA 19104, USA 
{htd,kipper, mpahner} @linc.cis.upenn.edu 
Abstract 
We present a class-based approach to building a 
verb lexicon that makes explicit the close asso- 
ciation between syntax and semantics for Levin 
classes. We have used Lexicalized Tree Adjoin- 
ing Grammars to capture the syntax associated with 
each verb class and have augmented the trees to in- 
clude selectional restrictions. In addition, semantic 
predicates are associated with each tree, which al- 
low for a colnpositional interpretation. 
1 Introduction 
The difficulty o1' achieving adequate hand-crafted 
semantic representations has limited the lield of 
natural  t)rocessing to applications that 
can be contained within well-deIined sub-domains. 
Despite many different lexicon development ap- 
proaches (Mel'cuk, 1988; Copestakc and Sanfil- 
ippo, 1993; Lowe et al., 1997), the field has yet 
to develop a clear conseusus on guidelines for a 
colnputational lexicon. One of the most controver- 
sial areas in building such a lexicon is polyselny: 
how senses can be computationally distinguished 
and characterized. We address this problem hy em- 
ploying compositional semantics and the adjunction 
of syntactic l)hrases to support regular verb sense 
extensions. This differs l'rom the Lexical Concep- 
tual Structure (LCS) approach exemplified by Voss 
(1996), which requires a separate LCS representa- 
tion for each possible sense extension. In this pa- 
per we describe the construction of VerbNet, a verb 
lexicon with explicitly stated syntactic and seman- 
tic information for individual lexical items, using 
Levin verb classes (Levin, 1993) to systematically 
construct lexical entries. We use Lexicalized Tree 
Adjoining Grammar (LTAG) (Joshi, 1987; Schabes, 
1990) to capture the syntax for each verb class, and 
associate semantic predicates with each tree. 
Althougla similar ideas have been explored for 
w:rb sense extension (Pusteiovsky, 1995; Goldberg, 
1995), our approach of applying LTAG to the prob- 
lem of composing and extending verb senses is 
novel. LTAGs have an extended domain of local- 
ity that captures the arguments of a verb in a local 
manner. The association of semantic predicates to a 
tree yields a complete semantics for the verb. More- 
ovel, the operation of adjunction in LTAGs provides 
a mechanism for extending verb senses. 
2 Levin classes 
Levin verb classes are based on the ability of a verb 
to occur in diathesis alternations, which are pairs 
of syntactic frames that are in some sense meaning 
preserving. The fundalnental assulnption is that the 
syntactic frames arc a direct reflection of the under- 
lying semantics, ltowever, Levin classes exhibit in- 
consistencies that have hampered researchers' abil- 
ity to reference them directly in applications. Many 
verbs ale listed in multiple classes, some of which 
have confl icting sets of syntactic frames. Dang et al. 
(1998) showed that multiple listings could in some 
cases be interpreted as regular sense extensions, and 
defined intersective Levin classes, which are a more 
line-grained, syntactically and semantically coher- 
ent refinement of basic Levin classes. We represent 
these verb classes and their regular sense extensions 
in the LTAG forlnalism. 
3 Lexicalized rI?ee Adjoining Grammars 
3.1 Overview of formalism 
Lexicatized Tree Adjoining Granunars consist of a 
finite set of initial and auxiliary elementary hees, 
and two operations to combine them. The min- 
imal, non-recursive linguistic structures of a lan- 
guage, such as a verb and its complements, are cap- 
tured by initial trees. Recursive structures of a lan- 
guage, such as prepositional modifiers which result 
in syntactically embedded VPs, are represented by 
auxiliary trees. 
1011 
Elementary trees are combined by the operations 
of substitution and adjunction. Substitution is a sim- 
ple operation that replaces a leaf of a tree with a new 
tree. Adjunction is a splicing operation that replaces 
an internal node of an elementary tree with an aux- 
iliary tree. Eveu tree is associated with a lexical 
item of the , called the anchor of the tree. 
Tile tree represents the domain over which the lex- 
ical item can directly specify syntactic constraints, 
such as subject-verb number agreement, or seman- 
tic constraints, such as selectional restrictions, all of 
which are implemented as features. 
LTAGs are more powerful than context free gram- 
mars (CFG), allowing localization of so-called un- 
bounded dependencies that cannot be handled by 
CFGs. There are critical benefits to lexical seman- 
tics that are provided by the extended domain of 
locality of the lexicalized trees. Each lexical en- 
try corresponds to a tree. If the lexical item is a 
verb, the conesponding tree is a skeleton for an en- 
tire sentence with the verb already present, anchor- 
ing the tree as a terminal symbol. The other parts 
of the sentence will be substituted or adjoined in at 
appropriate places in the skeleton tree in the course 
of the derivation. The composition of trees during 
parsing is recorded in a derivation tree. The deriva- 
tion tree nodes correspond to lexically anchored el- 
ementary trees, and the arcs are labeled with infor- 
mation about how these trees were combined to pro- 
duce the parse. Since each lexically anchored initial 
tree corresponds to a semantic unit, the derivation 
tree closely resembles a semantic-dependency rep- 
resentation. 
3.2 Semantics for TAGs 
There is a range of previous work in incorporating 
semantics into TAG trees. Stone and Doran (1997) 
describe a system used for generation that simul- 
taneously constructs the semantics and syntax of 
a sentence using LTAGs. Joshi and Vijay-Shanker 
(1999), and Kallmeyer and Joshi (1999), describe 
the semantics of a derivation tree as a set of attach- 
ments of trees. The semantics of these attachments 
is given as a conjunction of formulae in a flat seman- 
tic representation. They provide a specific method- 
ology for composing semantic representations much 
like Candito and Kahane (1998), where the direc- 
tionality of dominance in the derivation tree should 
be interpreted according to the operations used to 
build it. Kallmeyer and Joshi also use a flat semantic 
representation to handle scope phenomena involv- 
ing quantifiers. 
4 Description of the verb lexicon 
VerbNet can be viewed in both a static and a dy- 
namic way. Tile static aspect refers to the verb and 
class entries and how they are organized, providing 
the characteristic descriptions of a verb sense or a 
verb class (Kipper et al., 2000). The dynamic as- 
pect of the lexicon constrains the entries to allow 
a compositional interpretation in LTAG derivation 
trees, representing extended verb meanings by in- 
corporating adjuncts. 
Verb classes allow us to capture generalizations 
about verb behavioL Each verb class lists the tlle- 
mafic roles that the predicate-argument structure of 
its members allows, and provides descriptions of 
the syntactic fi'ames conesponding to licensed con- 
structions, with selectional restrictions defined for 
each argument in each frame, l Each frame also 
includes semantic predicates describing the partic- 
ipants at various stages of the event described by 
the frame. 
Verb classes are hierarchically organized, ensur- 
ing that each class is coherent - that is, all its mem- 
bers have common semantic elements and share a 
common set of thematic roles and basic syntactic 
frames. This requires some manual restructuring of 
the original Levin classes, which is facilitated by us- 
ing intersective Levin classes. 
5 Compositional Semantics 
We use TAG elementary trees for the description 
of allowable frames and associate semantic predi- 
cates with each tree, as was done by Stone and Do- 
ran. The semantic predicates are primitive enough 
so that many may be reused in different trees. By 
using TAGs we get the additional benefit of an ex- 
isting parser that yields derivations and derived trees 
fiom which we can construct the compositional se- 
mantics of a given sentence. 
We decompose each event E into a tripar- 
tite structure in a manner similar to Moens and 
Steedman (1988), introducing a time function for 
each predicate to specify whether the predicate is 
true in the preparatory (d~ring(E)), cuhnination 
(er~d(E)), or consequent (res~ll:(E)) stage of an 
event. 
hfitial trees capture tile semantics of the basic 
senses of verbs in each class. For example, many 
IThese restrictions are more like preferences that generate a 
preferred reading of a sentence. They may be relaxed depend- 
ing on the domain of a particular application. 
1012 
S \[ cvcnt=E \] S \[ event=E2 \] 
NP.,.sH$ VP \[ cvcnt=E \] NParqo$ VP \[ event=E1 \] 
\[ +aninmtc \] \] \[ +animale \] 
V V NI),~,.ql$ 
I \] \[ +animate \] 
1"1.11| rUll 
motion(during(E), Xa,.al ) motion(during(El), Xargl ) 
Figure 1 : Induced action alternation for the Run verbs 
verbs in the Run class can occur in the induced ac- 
tion alternation, in which the subject of the inmmsi- 
tive sentence has the same thematic role as the direct 
object in the transitive sentence. Figure l shows the 
initial trees for the transitive and intransitive vari- 
ants for the Run class, along with their semantic 
predicates. The entity in motion is given by argl, 
associated with the syntactic subject of the intransi- 
tive tree and the direct object of the transitive tree. 
The event denoted by the transitive variant is a com- 
position of two subevents: E1 refers to the event of 
av.ql running, and E2 refers to the event of an entity 
(argO) causing event El. 
Predicates are associated with not only the verb 
trees, but also the auxiliary trees. We use a flat 
semantic rcpmsentatiou like that of Kalhncycr and 
Joshi, and the semantics of a sentence is the con- 
junction of the semantic predicates of the trees used 
to derive the sentence. Figure 2 shows au auxiliary 
tree for a path prepositional pllrase headed by "to", 
along with its associated semantic predicate. 
When the PP tree for "to the park" is adjoiued into 
the intransitive tree for "John ran", the semantic in- 
terpretation is the conjunction of the two predicates 
motion(during(E),john) A goal(end(E),john, park); 
adjunction into the transitive tree for "Bill ran 
the horse" yields cause(during(E2),bilI, El) A mo- 
tion(during(El), horse) A goal(end(El), horse, park). 
In both cases, the argument X,rs¢0.,rgl (john or 
horse) for the anxiliary tree is noulocal and colnes 
from the adjunction site. 2 The arguments are re- 
covered from the derivation tree, following Candito 
and Kahane. When an initial tree is substituted into 
another tree, the dependency mirrors the derivation 
structure, so the variables associated with the sub- 
2X.,..qo,.,.ga is the variable associated with the cntity in mo- 
tion (ar91) in the tree to which tile PP a(Uoins (argO). 
stituting tree can be referenced as arguments in the 
host tree's predicates. When an auxiliary tree is 
adjoined, the dependency for the adjunction is re- 
versed, so that variables associated with the host 
tree can be referenced as arguments in the adjoin- 
ing tree's predicates. 
VP 
VPar:jO* PP 
\[ evc,~t=l~ \] 
I' NP.rql$ 
I lo 
qoal (end(E), X,.,.;jo.,,.r11, Xa,.~j1) 
Figure 2: Auxiliary path PP tree 
The tripartite event structure allows us to express 
the semantics of classes of verbs like change of 
state verbs whose description requires reference to 
a complex event structure. In the case of a verb 
such as "break", it is important to make a distinc- 
tion between the state of the object before the end 
of the action and the new state that results after- 
wards. This event structure also handles the eona- 
tive construction, in which there is an intention of 
a goal during the event, that is not achieved at 
the end of the event. The example of the cona- 
rive construction shown in Figure 3 expresses the 
intention of hitting something. Because the in- 
tention is not satisfied the semantics do not in- 
clude the predicates manner(end(E),fi~rcefuI, X, rgo ) 
A conmct(end(E),X, rgo,Xc~rgO, that express the 
completion of the contact with impact event. 
The ability of verbs to take on extended senses 
in sentences based on their adjuncts is captured in a 
1013 
S \[ event=E \] 
NPa~.qO$ VP \[ evcnt=E \] 
V NPargl$ 
I hit 
manner(during(E), direetedmotion, Xa,..qo )A 
contact(end(E), Xar~O, Xar~l ) A 
7naT,,nel' (end(E), f of'ee f '~l, Xar9O ) 
s \[ cvcnt=r~: \] 
NParq0$ VP 
V VP 
I hit V PP 
I 
I at 
manner (during(E), direct, edmotion, X~r:io ) 
Figure 3: Syntax and semantics of transitive and conative construction for Hit verbs 
natural way by the TAG operation of adjunction and 
our conjunction of semantic predicates. The orig- 
inal Hit verb class does not include movement of 
the direct object as part of the meaning of hit; only 
sudden contact has to be established. By adjoining 
a path PP such as "across NP", we get an extended 
meaning, and a change in Levin class membership 
to the Throw class. Figure 4 shows the class-specific 
auxiliary tree anchored by the preposition "across" 
together with its semantic predicates, introducing a 
motion event that immediately follows (meets) the 
contact event. 
VP \[ evenI:E \] 
VP.rf/o*\[ cvcnt=EargO \] PP 
P NPargl.~ 
I 
aCI'OSS 
meets ( E,,..,jo, E) A 
motion(during(E), X~m~,0.,,.91 )A 
via(during(E), X~r,jo.~r~l , Xa,.,j1) 
Figure 4: Auxiliary tree for "across" 
oll the LTAG formalism, for which we already have 
a large English grammar. Palmer et al. (1998) de- 
fined compositional semantics for classes of verbs 
implemented in LTAG, representing general seman- 
tic components (e.g., motion, manner) as features 
on the nodes of the trees. Ore" use of separate log- 
ical forms gives a more detailed semantics for the 
sentence, so that for an event involving motion, it is 
possible to know not only that the event has a motion 
semantic component, but also which entity is actu- 
ally in motion. This level of detail is necessary for 
applications such as animatiou of natural  
instructions (Bindiganavale et al., 2000). Another 
important contribution of this work is that by divid- 
ing each event into a tripartite structure, we permit a 
more precise definition of the associated semantics. 
Finally, the operation of adjunction in TAGs pro- 
vides a principled approach to representing the type 
of regular polysemy that has been a major obstacle 
in buildiug verb lexicons. 
Researching whether a TAG grammar for Verb- 
Net can be automatically constructed by using de- 
velopment tools such as Xia et al. (1999) or Candito 
(1996) is part of our next step. We also expect to be 
able to factor out some class-specific auxiliary trees 
to be used across several verb classes. 
6 Conclusion 
We have presented a class-based approach to build- 
ing a verb lexicon that makes explicit and imple- 
ments the close association between syntax and se- 
mantics, as postulated by Levin. The power of the 
lexicon comes from its dynamic aspect that is based 
7 Acknowledgments 
The authors would like to thank the anonymous re- 
viewers for their valuable comments. This researeh 
was partially supported by NSF grants IIS-9800658 
and IIS-9900297 and CAPES grant 0914-95. 
1014 

References 

Ralna Bindiganawde, Willianl Schuler, Jan M. All- 
beck, Nornlan I. Badler, Aravind K. Joshi, and 
Martha Pahner. 2000. Dynamically Altering 
Agent Behaviors Using Natural Language In- 
structions. Fourth International Cot!ference on 
Autonomous Agents, June. 

Marie-Hdl~ne Candito and Sylwtin Kahane. 1998. 
Can the TAG derivation tree represent a senlan- 
tic graph? An answer ill the light of Meaning- 
Text Theory. In Piz)ceedhtgs of the Fourth 77~G+ 
Workshop, pages 21-24, Philadelphia, PA, Au- 
gust. 

Marie-Hdl~ne Candito. 1996. A Principle-Based 
Hierarchical Representation of LTAGs. In Pro- 
ceedings of COLING-96, Copenhagen, Denlnark. 

Aim Copestake and Antonio Sanfilippo. 1993. 
Multilingual lexical representation. In Proceed- 
ings ,2.1" the AAAI Spring Symposium: Bttilding 
Lexicons.for Machine Translation, Stanford, Cal- 
ifornia. 

Hoa Trang l)ang, Karin Kipper, Martha Pahner, and 
Joseph Rosenzweig. 1998. hwestigating regu- 
lar sense extensions based on intersective Levin 
classes . In Proceedings of COLING-ACL98, 
Montreal, Canada, August. 

Adele E. Goldberg. 1995. C'onslruclions. A Con- 
struction Grammar Approach 1o k rgument Slrttc- 
lure. University of Chicago Press, Chicago, Ill. 

Aravind K. Joshi and K. Vijay-Shanker. 1999. 
Compositional semantics wilh Lexicalized 
Tlee-ac\[joilfing Granllnar: How nlueh under- 
specification is necessary? In Proceedings of 
the Third International Worksho I) on Conq)tt- 
rational Semantics (IWCS-3), pages 131-145, 
Tilburg, The Netherlands, January. 

Aravind K. Joshi. 1987. An introduction to tree ad- 
joining grannnars. In A. Manaster-Ramer, editor, 
Mathematics of Language. Jolm 13elljamins, Am- 
sterdaln. 

Laura Kalhneyer and Anwind Joshi. 1999. Under- 
specified selnantics with LTAG. In Proceedhlgs 
of Amsterdam Colloquium on Semantics. 

Karin Kippm, Hoa Trang Dang, and Martha Palmer. 
2000. Class-based construction of a verb lexi- 
COll. Ill Pn)ceedings of the Seventh National Con- 
ference on Art!\[icial httelligence (AAAI-2000), 
Austin, TX, July-August. 

Beth Levin. 1993. English Verb Classes and A Iter- 
nation, A Preliminary hwestigation. Tile Univer- 
sity of Chicago Press. 

J.B. Lowe, C.E Baker, and C.J. Filhnore. 1997. A 
fnnne-semantic approach to semantic annotation. 
Ill Proceedings 1997 Siglex WorksholJ/ANLP97, 
Washington, I).C. 

I. A. Mel'cuk. 1988. Semantic description of lex- 
ical units ill an explanatory combinatorial dic- 
tionary: Basic plilmiples and heuristic criteria. 
International ,lournal of Lexicography, I:3:165- 
188. 

M. Moens and M. Steedman. 1988. Telnporal on- 
tology and tenlporal refel'ence. Computational 
Linguistics, 14:15-38. 

Martha l~allnel ", Joseph Rosenzweig, and Willialn 
Schuler. 1998. Capturing Motion Verb General- 
izations in Sylmhronous TAG. Ill Patrick Saint- 
l)izim, editol, Predicative Forms in Natural Lan- 
guage and in Lexical Knowledge Bases. Kluwer 
Press. 

James Pustejovsky. 1995. The Generative Lexicon. 
MIT Press, Cambridge, Massachusetts, USA. 

Yves Sehabes. 1990. Mathematical and Computa- 
tional Aspects of Lexicalized Grammars. Ph.D. 
thesis, Compnter Science Department, University 
of Pennsylvania. 

Matthew Stone and Christine Doran. 1997. Sell- 
tence Plalming as l)escription using Tree Adjoin- 
ing Grammar. Ill Proceedings of ACL-EACL '97, 
Madrid, Spain. 

Clare Voss. 1996. lnlerlinxua-l)ased Machine 
7'ranslation o\[" &~atial Expressions. PI'LD. the- 
sis, University of Maryland, Depamnent of Com- 
puter Science. 

Fei Xia, Martha Pahnei, and K. Vijay-Shanker. 
1999. Toward senli-autolnating grannnar 
development. Ill Proceedings of the 51h 
Natural Language Proces'sing Pacific Rim 
3),mposium(NLPRS-99), Beijing, China. 
