NL Understanding with a Grammar of Constructions 
Wlodek Zadrozny and Martin Szummer and Stanislaw Jarecki 
and David E. Johnson and Leora Morgenstern *
IBM Research Division
T.J. Watson Research Lab
Yorktown Heights, NY 10598 USA 
Abstract 
We present an approach to natural language understanding
based on a computable grammar of constructions.
A construction consists of a set of features
of form and a description of meaning in a context. A
grammar is a set of constructions. This kind of grammar
is the key element of MINCAL, an implemented
natural language speech-enabled interface to an
on-line calendar system. The architecture has two
key aspects: (a) the use of constructions, integrating
descriptions of form, meaning and context into one
whole; and (b) the separation of domain knowledge
(about calendars) from application knowledge (about
the particular on-line calendar).
1 Introduction: an overview of the system
We present an approach to natural language understanding
based on a computable grammar of constructions.
A construction consists of a set of features
of form and a description of meaning in a context.
A grammar is a set of constructions. This kind of
grammar is the key element of MINCAL, an implemented
natural language speech-enabled interface to
an on-line calendar system.
The system consists of an NL grammar, a parser,
an on-line calendar, a domain knowledge base (about
dates, times and meetings), an application knowledge
base (about the calendar), a speech recognizer,
and a speech generator.
In this paper we describe two key aspects of the
system architecture: (a) the use of constructions,
where instead of separating NL processing into the
phases of syntax, semantics and pragmatics, we integrate
descriptions of form, meaning and context into
one whole, and use a parser that takes into account
all this information (see [10] for details); (b) the
separation of the domain knowledge (about calendars)
and the application knowledge (about the particular
on-line calendar).
* M. Szummer and S. Jarecki are also from MIT.
The dialogs 
The system allows users to engage in dialogs like: 
Schedule a meeting with Bob!
At what time and date?
On August 30th.
At what time?
At 8.
Morning or afternoon?
In the evening.
The parser recognizes "Schedule a meeting with Bob"
as an instance of sent(imp), the imperative construction
consisting of a verb and an NP, here np(event).
The context is used to prevent another reading in
which "with Bob" modifies "schedule", as in "Dance a
tango with Bob!". That is, a contextual rule is used
which says that for calendar applications, people do
not modify actions or places. Context also plays an
important role in understanding answers, e.g. "At 8."
This is understood as a time expression (and not a
place or rate or something else) only because of the
context.
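How the pending question can disambiguate a short answer such as "At 8." may be illustrated with a small sketch. It is not MINCAL's code: the table EXPECTED_TYPE, the function interpret_fragment and the "Where?" entry are hypothetical names introduced only for illustration, and the matching is deliberately naive.

```python
from typing import Optional

# Hypothetical table: the semantic type expected of an answer,
# given the question the system asked last.
EXPECTED_TYPE = {
    "At what time?": "time",
    "Where?": "place",
}

def interpret_fragment(fragment: str, last_question: Optional[str]) -> dict:
    """Interpret a short answer such as 'At 8.' relative to the pending question."""
    words = fragment.rstrip(".").split()
    if len(words) == 2 and words[0].lower() == "at":
        expected = EXPECTED_TYPE.get(last_question)
        if expected == "time" and words[1].isdigit():
            return {"type": "time(hour)", "hour": int(words[1])}
        if expected == "place":
            return {"type": "place", "den": words[1]}
    # Without a supporting context the fragment is left uninterpreted.
    return {"type": "unknown", "text": fragment}

answer = interpret_fragment("At 8.", "At what time?")
```

The same string receives no time reading when no question is pending, which is the point of treating context as part of the construction rather than as a later filter.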
The parameters of a meeting can be given in many
ways, e.g. synonyms or different constructions can be
used, users can include as many parameters in a sentence
as they wish, and the parameters can be given
in any order. As a result there are about 10,000 ways
of scheduling meetings (with a given set of parameters).
How are the dialogs understood 
With respect to parsing, grammars of constructions
can be parsed like "standard" grammars, except that
the set of features is richer. Given a string (representing
a sentence, a fragment of a discourse or a paragraph),
the parser assigns it a construction. From
this viewpoint, the situation is similar to "regular"
parsing, and the possible algorithms are similar. We
have implemented a prototype chart parser for construction
grammars, discussed further in Section 3.
But, clearly, having understood the sentence as a
linguistic entity in isolation is not the ultimate goal.
Here the message of an utterance must be understood
in the context of an intended action. This is done in
two steps. First, the system determines the intended
action and its parameters, using domain knowledge
(meetings+time+places). Second, once all the parameters
have been extracted from the dialog, the
system executes the action. To do this, the program
uses application-specific knowledge to translate the
action and its parameters into a form that can be
executed by the application (Xdiary).
2 Constructions as data structures
A construction is given by the matrix:

    N : name_of_construction
    C : context
    V : structure
    M : message

The vehicle V consists of formulas describing the
presence (or perhaps absence) of certain taxemes, or
features of form, within the structure of the construction.
Such a structure is given by a list of subconstructions
and the way they have been put together
(in all our examples this is concatenation, but there
are other possibilities, e.g. wrapping). The context,
C, consists of a set of semantic and pragmatic constraints
limiting the application of the construction.
It can be viewed as a set of preconditions that must
be satisfied in order for a construction to be used in
parsing. The message, M, describes the meaning of
the construction, via a set of syntactic, semantic and
pragmatic constraints.
To make this concrete, let us consider a few ex- 
amples. We begin with a simple "command con- 
struction" consisting of an action verb followed by 
its argument. 

    N : sent(cmnd, v.np)
    C : [ <hr attends> = sr ]
    V : struc = (V.NP)
        <V cons_n> = verb
        <V M v_type> = action_verb
        <NP cons_n> = np
    M : sem_cat = command
        a_type = <V M sem_type>
        a_obj = <NP M sem_type>
        agent = hr

The context of the construction describes all situations
in which the hearer hr (human or machine)
is paying attention to the speaker sr (a "ready"
state). The feature struc is a list of variables and/or
words/tokens; it is used to describe the structure of a
construction, and its role is similar to a rule in a generative
grammar. (We will write names of variables
in capital letters, e.g. NP, inside matrices of constructions.)
The attribute cons_n gives the name of
a construction that could be assigned to a string. We
use it here to say that the form of the construction
can be described as a concatenation of two strings,
of which one is a verb (construction) and the other
an np (construction). Furthermore, the verb type
<V M v_type> is "action_verb". (The expression
<V M v_type> should be read "the v_type of the
message of V".)
The message M describes the meaning of the construction
as that of a command in which the type
of action is described by the meaning of the verb,
and the object of the action is given by the meaning
of the noun phrase. The attribute sem_type
stands for the "semantic type" and we identify it
currently with the word sense. Thus "erase the file"
is understood as a command to delete the file, if
<erase M sem_type> = delete, but "erase the
picture" might refer to the type of action associated
with rub_out. In both cases the hearer hr is supposed
to be the agent of the action.
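The N/C/V/M matrix maps naturally onto a record type. The following Python sketch is only an illustration under our own assumptions, not the data structure used in MINCAL: the class Construction, the function command and the flattened context key hr_attends are hypothetical. It builds the message of the command construction from the messages of its verb and noun phrase, for the "erase the file" example.

```python
from dataclasses import dataclass, field

@dataclass
class Construction:
    name: str                                     # N: name of the construction
    context: dict = field(default_factory=dict)   # C: preconditions
    vehicle: dict = field(default_factory=dict)   # V: features of form
    message: dict = field(default_factory=dict)   # M: meaning

erase = Construction("verb(erase)", vehicle={"struc": ("erase",)},
                     message={"sem_type": "delete", "v_type": "action_verb"})
the_file = Construction("np", vehicle={"struc": ("the", "file")},
                        message={"sem_type": "file"})

def command(v, np, situation):
    """Return sent(cmnd, v.np) if C and V are satisfied, else None."""
    if situation.get("hr_attends") != "sr":           # C: hearer attends to speaker
        return None
    if v.name.split("(")[0] != "verb":                # <V cons_n> = verb
        return None
    if v.message.get("v_type") != "action_verb":      # <V M v_type> = action_verb
        return None
    if np.name != "np":                               # <NP cons_n> = np
        return None
    return Construction(
        name="sent(cmnd, v.np)",
        context={"hr_attends": "sr"},
        vehicle={"struc": (v, np)},                   # concatenation V.NP
        message={"sem_cat": "command",
                 "a_type": v.message["sem_type"],
                 "a_obj": np.message["sem_type"],
                 "agent": "hr"},
    )

result = command(erase, the_file, {"hr_attends": "sr"})
```

Because the context is checked before the message is built, the same word sequence yields no command reading when the hearer is not in the "ready" state.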
Constructions: from words to discourse 
Words, phrases, and fragments of discourse can be
analyzed as constructions. We view languages as collections
of constructions which range from words to
discourse. We claim that the same representation
scheme can be used for all constructions.
The examples we are going to present have been
developed with a specific purpose in mind, namely
for scheduling calendar events. In other papers ([10]
and [6]), we have presented examples showing that
we can give good descriptions of non-standard constructions.
However, in either case descriptions of
meanings and contexts are general, and hence applicable
to other tasks.
We now turn our attention to words. The verb
"cancel" can be represented as follows:

    N : verb(cancel)
    C : [ lang_code = english
          lang_channel = text ]
    V : struc = (cancel)
    M : sem_type = delete
        v_type = action_verb

Notice that even simple words require context to be
(properly) interpreted. In C we say that English
text is expected (but in other cases it could also be
French text, or French speech, etc.). Some aspects
of context do not have to be explicitly specified and
can be replaced by defaults.
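One possible reading of "replaced by defaults" is that a lexical entry may omit context features, which are then filled in from a default context before being checked against the current situation. The sketch below follows that reading; DEFAULT_CONTEXT, context_satisfied and the French entry are hypothetical and serve only as an illustration.

```python
# Assumed defaults: English text input (the values used in the entry for "cancel").
DEFAULT_CONTEXT = {"lang_code": "english", "lang_channel": "text"}

def context_satisfied(entry_context, situation, defaults=DEFAULT_CONTEXT):
    """Check the context C of a construction against the current situation.

    Features missing from entry_context are taken from defaults.
    """
    required = {**defaults, **entry_context}
    return all(situation.get(key) == value for key, value in required.items())

# The entry for "cancel" with its context left implicit.
CANCEL = {
    "N": "verb(cancel)",
    "C": {},
    "V": {"struc": ("cancel",)},
    "M": {"sem_type": "delete", "v_type": "action_verb"},
}

# A hypothetical French entry that overrides only the language code.
ANNULER = {"N": "verb(annuler)", "C": {"lang_code": "french"},
           "V": {"struc": ("annuler",)},
           "M": {"sem_type": "delete", "v_type": "action_verb"}}

english_text = {"lang_code": "english", "lang_channel": "text"}
french_text = {"lang_code": "french", "lang_channel": "text"}
```

With these definitions, `context_satisfied(CANCEL["C"], english_text)` holds, while the same entry is not applicable to French input.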
Although the vehicle and the message are both
very simple in this example, the simplicity of the
message is a result of deliberate simplification. We
have restricted it to the specification of the semantic
type, identified with one sense of the word, and
to describing the verb type of "cancel" as a verb of
action. Notice that the other sense of "cancel", "offset,
balance out", would appear in another entry.
Of course, in reality, the lexical meaning of any
word is a much more complicated matter [1]. For
instance, in our lexicon the messages of words may
contain many of the attributes that appear in the
explanatory combinatorial dictionary of Mel'čuk [7].
Discourse constructions: To illustrate discourse
constructions, we consider the following dialog:
Have you arranged the room yet?
No, but I'll do it right away.
We view the pattern of the answer no.but.S as a discourse
construction. It can be represented by the following
array of features:

    N : sent(assrt, no.but.S)
    C : [ <p_utter cons_n> = sent(ques, *) ]
    V : struc = (no.but.S)
        <S cons_n> = sent(assrt, *)
    M : <p_sent truth_value> = 0
        <S M>

As we can see, the construction applies only in the
context of a previously asked question, and its message
says that the answer to the question is negative,
after which it elaborates the answer with a sentence
S.
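The role of the context field in this discourse construction can be illustrated with a short sketch. The function no_but_s and the dictionary layout of the previous utterance are hypothetical; the tokens are assumed to be already stripped of punctuation, and the message of S is assumed to have been computed elsewhere.

```python
def no_but_s(tokens, s_message, previous):
    """Apply sent(assrt, no.but.S) if its context and vehicle are satisfied.

    tokens    -- words of the answer, punctuation removed
    s_message -- message already computed for the embedded sentence S
    previous  -- description of the previous utterance, or None
    """
    # C: the previous utterance must be a question, sent(ques, *).
    if previous is None or not previous["cons_n"].startswith("sent(ques"):
        return None
    # V: the answer must have the form no.but.S.
    if [t.lower() for t in tokens[:2]] != ["no", "but"] or len(tokens) < 3:
        return None
    # M: the questioned sentence is false, elaborated by the message of S.
    return {"p_sent_truth_value": 0, "S": s_message}

question = {"cons_n": "sent(ques, yes_no)"}
answer_tokens = ["No", "but", "I'll", "do", "it", "right", "away"]
s_msg = {"sem_cat": "assertion", "a_type": "arrange", "agent": "sr"}
msg = no_but_s(answer_tokens, s_msg, question)
```

If the previous utterance is not a question, the construction does not apply and the function returns None, so the same words would have to be analyzed by some other construction.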
3 System Architecture 
The parts 
MINCAL consists of an NL grammar, a parser, a domain
knowledge base (about dates, times and meetings),
an on-line calendar (Xdiary), an application
knowledge base (about Xdiary), a continuous speech
recognizer (IBM, ICSS), a speech generator (Speech
Plus, Text to Speech Converter), and the interfaces.
At present, the grammar consists of a few hundred
lexical constructions, and about 120 "productions",
i.e. constructions describing combinations of
other constructions.¹ It covers the basic forms
of assertive sentences, but it emphasizes commands.
Thus a command can, for example, be given either
by v.np (also with "please", or "kindly"), or by an
assertive sentence recognized as an indirect speech act
("I'd like to ...", "Leora wants you to ...", etc.). The
next large group of constructions covers PPs, with
particular emphasis on time and places. Finally, it
covers a few discourse constructions, since it is important
to deal with sentence fragments in dialogs,
e.g. understanding "evening" as "in the evening",
when it is an answer to the question "when?".
The interaction of the modules
The calendar and the application knowledge
base: Xdiary is an on-line calendar for which we
have not written a complete interface, but have focused
on the three most important functions:
scheduling, moving, and canceling appointments.
Other functions, such as "to do" lists, window management,
listing somebody's appointments, etc., can
be dealt with in a similar fashion, and we plan to extend
the interface to deal with them. At this point
the application knowledge base is very simple. It
consists of rules that say how to interpret the data
given by the semantic interpreter, for instance the
rules for formatting parameters and renaming slots
(e.g. event_duration -> duration). Such rules are
necessary, if the distinction between application and
domain knowledge is to be maintained.

¹ These are the constructions we used in MINCAL. In addition,
in various experiments we have used a few dozen other constructions,
e.g. those covering "open idioms" (see Section 4).
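Rules of this kind can be pictured as a thin translation layer between the domain interpreter and the calendar. In the sketch below only the renaming event_duration -> duration comes from the text; the event_place entry, the time formatter and the function to_application are hypothetical, and the actual input format of Xdiary is not assumed.

```python
# Slot renaming: event_duration -> duration is the example given in the text;
# the event_place entry is hypothetical.
RENAME_SLOTS = {"event_duration": "duration", "event_place": "place"}

def format_time(value):
    """Hypothetical formatter: {'hour': 17, 'minute': 0} -> '17:00'."""
    return f"{value['hour']:02d}:{value['minute']:02d}"

# Slots whose values need reformatting for the application (assumed).
FORMATTERS = {"event_time": format_time}

def to_application(slots):
    """Translate domain-level slots into application-level parameters."""
    out = {}
    for name, value in slots.items():
        if name in FORMATTERS:
            value = FORMATTERS[name](value)
        out[RENAME_SLOTS.get(name, name)] = value
    return out

domain_slots = {
    "action_name": "schedule",
    "event_duration": 60,
    "event_time": {"hour": 17, "minute": 0},
}
app_params = to_application(domain_slots)
```

Keeping such rules in a separate table means that replacing the calendar program would require changing this table but not the domain knowledge about dates, times and meetings.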
The domain knowledge base: This has two kinds
of facts: (1) background ontology, i.e. basic facts
about time and places, and (2) linguistic knowledge
associated with the domain. The former includes
such obvious facts as the number of days in a month,
which month follows the other, that offices are places
etc. The latter includes facts about how the language
is used, for example, the filters saying that places do
not modify people, so that "I want to meet my manager
in the cafeteria" can be unambiguously parsed,
with "cafeteria" being a meeting place, and not an
attribute of the manager.
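Such filters can be sketched as a table of excluded (modifier type, head type) pairs. The three pairs below restate the rules mentioned in this paper (places do not modify people; people do not modify actions or places); the function allowed_attachments and the type labels are hypothetical.

```python
# (modifier_type, head_type) pairs ruled out in the calendar domain.
FORBIDDEN = {
    ("place", "person"),   # places do not modify people
    ("person", "action"),  # people do not modify actions
    ("person", "place"),   # people do not modify places
}

def allowed_attachments(modifier_type, candidate_heads):
    """Keep only the heads that the modifier may attach to.

    candidate_heads is a list of (head, semantic_type) pairs.
    """
    return [head for head, head_type in candidate_heads
            if (modifier_type, head_type) not in FORBIDDEN]

# "I want to meet my manager in the cafeteria": the PP denotes a place.
cafeteria = allowed_attachments("place", [("meet", "action"),
                                          ("my manager", "person")])

# "Schedule a meeting with Bob": the PP denotes a person.
with_bob = allowed_attachments("person", [("schedule", "action"),
                                          ("a meeting", "event")])
```

In both examples exactly one candidate survives, so the filter removes the ambiguity before any structural description would have to be built.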
The organization of knowledge: The issue of
the organization of knowledge has been discussed
at length in [8] and [9] and the formal model developed
there is applicable in the present context.
At this point, however, this formal model has only
been implemented very crudely. Still the model is
worth briefly discussing, because the conceptual distinctions
made guide our work and have important
practical consequences. The most important thing
about it is that we discard the model of background
knowledge as a logical theory, and replace it by a
model consisting of a collection of theories and mechanisms
for putting them together depending on circumstances.
Thus, the usual, two-part logical structures,
consisting of a metalevel and an object level,
are augmented by a third level, a referential level.
The referential level is a partially ordered collection
of theories; it encodes background knowledge in a
way resembling a dictionary or an encyclopedia.²
Parser, construction grammar and linguistic 
knowledge 
Parser: The parser does not produce (syntactic)
structural descriptions of sentences. Instead, it computes
meaning representations. For example, it converts
adjuncts directly into attributes of place, time,
participant etc., once they can be computed, and
thus the message of the sentence does not contain
any information about how these attributes were
expressed or about the attachment of PPs that appear
in it. For example, the sentence "I want you to
arrange a conference in my office at 5" is analyzed as
sent(assert, svoc), an assertive sentence consisting
of a subject, a verb, an object and a complement.
² As usual, current situations are described on the object
level, and the metalevel is a place for rules that can eliminate
some of the models permitted by the object level and the
referential level.
The latter and the message of the imperative that 
is passed to sent(assert, svoc) does not contain any 
structural information about the attachment of the 
PPs. This message is combined with the messages of 
the verb and the noun, yielding 
[ [ den want(other_agent) ]
  [ agent hearer ]
  [ mental_agent
    [ [ type person ]
      [ den speaker ] ] ]
  [ action
    [ [ den arrange ]
      [ action_object
        [ [ type event ]
          [ den conference ]
          [ number 1 ]
          [ mods
            [ [ det a ] ] ] ] ]
      [ pp_msg
        [ [ [ prep at ]
            [ type time(hour) ]
            [ den
              [ [ hour
                  [ 5 am_or_pm ] ]
                [ minute 0 ] ] ] ]
          [ [ prep in ]
            [ type place ]
            [ den office ]
            [ mods
              [ [ det my ] ] ] ] ] ] ] ] ]
This result of parsing is then interpreted by the 
domain interpreter to produce: 
***Slots:
[ [ action_name schedule ]
  [ event_name
    [ a conference ] ]
  [ event_time
    [ [ minute 0 ]
      [ hour
        [ 5 am_or_pm ] ] ] ]
  [ event_place
    [ my office ] ] ]
Application-specific defaults then produce yet another
interpretation where, in addition to filling the
slots of Xdiary, [ hour [ 5 am_or_pm ] ] is interpreted
as [ hour [ 17 ] ].
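The text states only the outcome of this default (5 becomes 17), not the rule behind it. One plausible rule, given here purely as an assumption, is to prefer the reading that falls within working hours; the function resolve_hour and the interval 8 to 18 are hypothetical.

```python
def resolve_hour(hour, marker="am_or_pm", workday=(8, 18)):
    """Resolve a 12-hour clock value to a 24-hour value.

    marker  -- "am", "pm", or "am_or_pm" when the dialog left it open
    workday -- assumed working hours [start, end); a hypothetical default
    """
    morning, afternoon = hour % 12, hour % 12 + 12
    if marker == "am":
        return morning
    if marker == "pm":
        return afternoon
    # Ambiguous case: prefer the reading inside the assumed working hours.
    for candidate in (morning, afternoon):
        if workday[0] <= candidate < workday[1]:
            return candidate
    return morning

meeting_hour = resolve_hour(5)
```

Under this assumption "at 5" is read as 17, matching the example, while "at 10" stays 10; an explicit "am" or "pm" from the dialog always overrides the default.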
The parser is a chart parser, working left to right,
with no lookahead. The grammar is L-attributed,
i.e., it has both synthesized and inherited attributes,
but each inherited attribute depends only
on inherited attributes of the parent or attributes of
the sisters to the left. Hence, although the parser
does not have a lookahead step at present, such a
step can be added following [2].
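To make the idea of chart parsing with constructions concrete, here is a minimal bottom-up chart parser that scans the input left to right and stores messages rather than syntactic structures. It is a generic CKY-style sketch under our own assumptions, not the MINCAL parser: the lexicon, the two binary rules and all function names are hypothetical, and only synthesized attributes are used.

```python
from collections import defaultdict

# Toy lexicon: word -> (construction name, message).
LEXICON = {
    "cancel": ("verb", {"sem_type": "delete", "v_type": "action_verb"}),
    "the": ("det", {"det": "the"}),
    "meeting": ("noun", {"sem_type": "meeting"}),
}

def np_rule(det, noun):
    return {"sem_type": noun["sem_type"], "mods": [det]}

def cmnd_rule(v, np):
    if v.get("v_type") != "action_verb":      # vehicle constraint on the verb
        return None
    return {"sem_cat": "command", "a_type": v["sem_type"],
            "a_obj": np["sem_type"], "agent": "hr"}

# Binary concatenation rules: (left, right) -> (name, message builder).
RULES = {
    ("det", "noun"): ("np", np_rule),
    ("verb", "np"): ("sent(cmnd, v.np)", cmnd_rule),
}

def parse(tokens):
    """Return all (construction name, message) pairs spanning the input."""
    chart = defaultdict(list)                 # (start, end) -> [(name, message)]
    for end in range(1, len(tokens) + 1):     # left to right, no lookahead
        word = tokens[end - 1]
        if word in LEXICON:
            chart[(end - 1, end)].append(LEXICON[word])
        for start in range(end - 2, -1, -1):  # longer spans ending at `end`
            for mid in range(start + 1, end):
                for lname, lmsg in chart[(start, mid)]:
                    for rname, rmsg in chart[(mid, end)]:
                        rule = RULES.get((lname, rname))
                        if rule is None:
                            continue
                        name, build = rule
                        msg = build(lmsg, rmsg)
                        if msg is not None:   # only the message is kept
                            chart[(start, end)].append((name, msg))
    return chart[(0, len(tokens))]

parses = parse(["cancel", "the", "meeting"])
```

Note that each chart entry holds only a construction name and a message; no record of the daughters is kept once the message has been computed.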
4 Comparisons with related work
Linguistic arguments for construction-based grammars
have been worked out chiefly by Ch. Fillmore
and his colleagues (cf. [3]). Their motivation for advocating
such an approach comes from the fact that
typical generative theories of grammar cannot deal
properly with open idioms illustrated by constructions
such as:
The more carefully you work, the easier it will
get.
Why not fix it yourself?
Much as I like Ronnie, I don't approve of anything
he does.
It's time you brushed your teeth.
Him be a doctor?
The same is true about even so-called robust parsers
of English. The reason for this failure can be attributed
to the fact that expressions like these "exhibit
properties that are not fully predictable from
independently known properties of its lexical make-up
and its grammatical structure" [3], p. 511. However,
we do not need a list of "strange" constructions
to conclude that thoroughly integrating syntax
with semantics and pragmatics could provide
us with a better handle on natural language understanding.
On a closer examination "normal" constructions
exhibit enough complexity to warrant the
new approach (see [10] for details).
Jurafsky [4] has independently come up with a proposal
for a computable grammar of constructions.
We compare our work with his in [10]. Here, we
limit ourselves to a few remarks. What is common
in both approaches is the centrality of the concept
of grammatical construction as a data structure that
represents lexical, semantic and syntactic knowledge.
However, there are important differences between the
two formalisms. First, the actual data structures
used to represent constructions are different. The
most important difference has to do with the presence
of the context field in our version of the construction
grammar. This allows us to account for
the importance of pragmatics in representing many
constructions, and to deal with discourse constructions.
Secondly, while Jurafsky acknowledges the need
for abstract constructions (pp. 43-51), his abstract
constructions (weak constructions) are not first class
citizens: they are defined only extensionally, by
specifying the set of constructions they abstract over,
and their abstract meaning (e.g. entity for NOUN).
They are used to simplify descriptions of constituents
of other constructions. However, because they do not
have a separate vehicle part, they cannot be used to
assign default meanings. For instance, since verb is
defined as a collection of all verbs, is + read + cancel
+ know + look-up + ..., it cannot be assigned
a feature action_verb without introducing a contradiction;
its semantics is therefore given as RELATION/PROCESS.
For us the important feature of "abstract"
constructions is not that they simplify descriptions
of other constructions, but that they have
default meanings. (A similar critique of [5] can be
found in [10].)
5 Summary of results 
Our approach to NLU is based both on linguistic
arguments and on our dissatisfaction with the state
of the art. State of the art systems typically are
too "syntax-driven", failing to take context into account
in determining the intended meaning of sentences.
A related further weakness is that such systems
are typically "sentence oriented", rather than
"conversation/discourse oriented". In our view, this
makes even the most robust systems "brittle" and
ultimately impractical.
To test whether a construction-based approach
is feasible, we built a "complete" working system that
would include a representation for constructions. To
do this, we focused on the "calendar domain", a domain
with limited complexity and simple but not uninteresting
semantics. We have chosen to deal with
simple actions, and not e.g. with question answering,
where deeper understanding would be necessary.³
Our contributions: 
1. We have proposed a new kind of grammar, computable
construction grammars, which are neither semantic
nor syntactic. Instead, their "productions"
combine lexical, syntactic, semantic and pragmatic
information.⁴
2. We have described data structures for constructions,
and have shown that they can be effectively
used by the parser. Note that the same data structure
is used to encode the lexicon and the "syntactic"
forms.
3. We have shown how to parse with constructions.
We have implemented a simple chart parsing algorithm,
which can be easily extended to an Earley-like
parser, as long as the construction grammar remains
L-attributed. We have found that even a simple
parser of constructions can be quite efficient. This
is partly due to the fact that it does not require copying
of all syntactic and semantic information from
daughters to mothers; the goal of parsing consists
in producing an interpretation, and structural information
can be discarded once an interpretation of
a phrase is produced. It is also worth emphasizing
that invoking domain semantics drastically reduces
the number of parses constructed.
4. We have proposed a modular architecture for NL
interfaces based on the division between linguistic
knowledge, domain knowledge base, and application
knowledge base. Based on our experience, we believe
that this architecture should work in general
for speech-enabled interfaces for restricted domains.

³ We have also thought about another possibility, that is,
enhancing an IR system, e.g. with the understanding of date
expressions.
⁴ In what sense are they "computable"? Although this adjective
might suggest a formal model with computational complexity
results, etc., what we have in mind is pretty trivial:
(1) the system actually computes the messages of grammatical
constructions; (2) the grammars and constructions are well
defined data structures, and parsing (combining all associated
constructions in all possible ways) is decidable.
References 
[1] Computational Linguistics. Special Issue on the
Lexicon, 1987. 13: 3-4.
[2] Nelson Correa. An extension of Earley's algorithm
for S- and L-attributed grammars. In Proc.
Intl. Conf. on Current Issues in Computational
Linguistics, pages 248-261. Penang, Malaysia,
1991.
[3] Charles J. Fillmore, Paul Kay, and Mary
Catherine O'Connor. Regularity
and idiomaticity in grammatical constructions.
Language, 64(3):501-538, 1988.
[4] D. Jurafsky. An On-line Computational Model
of Sentence Interpretation. PhD thesis, University
of California, Berkeley, 1992. Report No.
UCB/CSD 92/676.
[5] Paul Kay. Even. Linguistics and Philosophy,
13(1):59-111, 1990.
[6] M. McCord, Arendse Bernth, Shalom Lappin,
and Wlodek Zadrozny. Natural language processing
technology based on slot grammar. International
Journal on Artificial Intelligence
Tools, 1(2):229-297, 1992.
[7] I. Mel'čuk. Dependency Syntax: Theory and
Practice. State University of New York Press,
Albany, NY, 1988.
[8] Wlodek Zadrozny. Reasoning with background
knowledge: a three-level theory. Computational
Intelligence, 10(2), 1994. To appear.
[9] Wlodek Zadrozny and Karen Jensen. Semantics
of paragraphs. Computational Linguistics,
17(2):171-210, 1991.
[10] Wlodek Zadrozny and Alexis Manaster-Ramer.
The significance of constructions. Submitted to
Computational Linguistics, 1994.

