An Improved Heuristic for Ellipsis Processing* 
Ralph M. Weischedel 
Department of Computer & Information Sciences 
University of Delaware 
Newark, Delaware 19711 
and Norman K. Sondheimer 
Software Research 
Sperry Univac MS 2G3 
Blue Bell, Pennsylvania 19424 
1. Introduction 
Robust response to ellipsis (fragmen- 
tary sentences) is essential to acceptable 
natural language interfaces. For in- 
stance, an experiment with the REL English 
query system showed 10% elliptical input 
(Thompson, 1980). 
In Quirk, et al. (1972), three types 
of contextual ellipsis have been identi- 
fied: 
1. repetition, if the utterance is a 
fragment of the previous sentence. 
2. replacement, if the input replaces a 
structure in the previous sentence. 
3. expansion, if the input adds a new 
type of structure to those used in the 
previous sentence. 
Instances of the three types appear in 
the following example. 
Were you angry? 
a) I was.                 (repetition with change in person) 
b) Furious.               (replacement) 
c) Probably.              (expansion) 
d) For a time.            (expansion) 
e) Very.                  (expansion) 
f) I did not want to be.  (expansion) 
g) Yesterday I was.       (expansion & repetition) 
In addition to appearing as answers fol- 
lowing questions, any of the three types 
can appear in questions following state- 
ments, statements following statements, or 
in the utterances of a single speaker. 
This paper presents a method of au- 
tomatically interpreting ellipsis based on 
dialogue context. Our method expands on 
previous work by allowing for expansion 
ellipsis and by allowing for all combina- 
tions of statement following question, 
question following statement, question 
following question, etc. 
*This material is based upon work partially sup- 
ported by the National Science Foundation under 
Grant No. IST-8009673. 
2. Related Work 
Several natural language systems 
(e.g., Bobrow et al., 1977; Hendrix et 
al., 1978; Kwasny and Sondheimer, 1979) 
include heuristics for replacement and 
repetition ellipsis, but not expansion 
ellipsis. One general strategy has been 
to substitute fragments into the analysis 
of the previous input, e.g., substituting 
parse trees of the elliptical input into 
the parse trees of the previous input in 
LIFER (Hendrix, et al., 1978). This only 
applies to inputs of the same type, e.g., 
repeated questions. 
Allen (1979) deals with some examples 
of expansion ellipsis, by fitting a parsed 
elliptical input into a model of the 
speaker's plan. This is similar to other 
methods that interpret fragments by plac- 
ing them into prepared fields in frames or 
case slots (Schank et al., 1980; Hayes and 
Mouradian, 1980; Waltz, 1978). This ap- 
proach seems most applicable to limited- 
domain systems. 
3. The Heuristic 
There are three aspects to our solu- 
tion: a mechanism for repetition and 
replacement ellipsis, an extension for 
inputs of different types, such as frag- 
mentary answers to questions, and an ex- 
tension for expansion ellipsis. 
3.1 Repetition and Replacement 
As noted above, repetition and re- 
placement ellipsis can be viewed as sub- 
stitution in the previous form. We have 
implemented this notion in an augmented 
transition network (ATN) grammar inter- 
preter with the assumption that the "pre- 
vious form" is the complete ATN path that 
parsed the previous input and that the 
lexical items consumed along that path are 
associated with the arcs that consumed 
them. In ellipsis mode, the ATN inter- 
preter executes the path using the ellipt- 
ical input in the following way: 
1. Words from the elliptical input, 
i.e., the current input, may be con- 
sumed along the path at any point. 
2. Any arc requiring a word not found 
in the current input may be 
traversed using the lexical item 
associated with the arc from the 
previous input. 
3. However, once the path consumes the 
first word from the elliptical 
input, all words from the elliptical 
input must be consumed before an arc 
can use a word from the previous 
input. 
4. Traversing a PUSH arc may be accom- 
plished either by following the sub- 
path of the previous input or by 
finding any constituent of the re- 
quired type in the current input. 
The entire ATN can be used in these 
cases. 
Suppose that the path for "Were you 
angry?" is given by Table 1. Square 
brackets are used to indicate subpaths 
resulting from PUSHes. "..." indicates 
tests and actions which are irrelevant to 
the current discussion. 
State   Arc                          Old Lexical Item 
S       (CAT COPULA ... (TO Sx))     "were" 
Sx      (PUSH NP ... (TO Sy)) 
[NP     (CAT PRO ... (TO NPa))       "you" 
NPa     (POP ...) ] 
Sy      (CAT ADJ ... (TO Sz))        "angry" 
Sz      (POP ...) 
Table 1 
An ATN Path for "Were you angry?" 
An elliptical input of "Was he?" fol- 
lowing "Were you angry?" could be under- 
stood by traversing all of the arcs as in 
Table 1. Following point 1 above, "was" 
and "he" would be substituted for "were" 
and "you". Following point 3, in travers- 
ing the arc (CAT ADJ ... (TO Sz)) the lex- 
ical item "angry" from the previous input 
would be used. Item 4 is illustrated by 
an elliptical input of "Was the old man?"; 
this is understood by traversing the arcs 
at the S level of Table 1, but using the 
appropriate path in the NP network to 
parse "the old man". 
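The four traversal rules of this section can be illustrated with a minimal sketch. It is not the ATN interpreter itself: the path is flattened so that each PUSH arc is a single step, a PUSH is satisfied only by a whole constituent from the current input (partial reuse of the old subpath is omitted), and the toy lexicon, the NP recognizer, and the names `interpret`, `consume`, and `np_length` are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Toy lexicon; the words and categories are illustrative assumptions.
LEXICON = {
    "were": "COPULA", "was": "COPULA",
    "you": "PRO", "he": "PRO",
    "the": "DET", "old": "ADJ", "angry": "ADJ", "furious": "ADJ",
    "man": "N",
}

def np_length(words: List[str]) -> Optional[int]:
    """Return how many leading words form an NP (PRO | DET ADJ* N), else None."""
    if not words:
        return None
    if LEXICON.get(words[0]) == "PRO":
        return 1
    if LEXICON.get(words[0]) == "DET":
        i = 1
        while i < len(words) and LEXICON.get(words[i]) == "ADJ":
            i += 1
        if i < len(words) and LEXICON.get(words[i]) == "N":
            return i + 1
    return None

# Each arc: (kind, label, lexical items it consumed in the previous input).
Arc = Tuple[str, str, List[str]]
PATH: List[Arc] = [
    ("CAT", "COPULA", ["were"]),
    ("PUSH", "NP", ["you"]),
    ("CAT", "ADJ", ["angry"]),
]

def consume(arc: Arc, words: List[str]) -> Optional[int]:
    """Number of words this arc can consume from the front of `words`."""
    kind, label, _ = arc
    if kind == "PUSH":
        # Rule 4: any constituent of the required type in the current input.
        return np_length(words) if label == "NP" else None
    return 1 if words and LEXICON.get(words[0]) == label else None

def interpret(path: List[Arc], fragment: List[str]) -> Optional[List[str]]:
    """Complete `fragment` against `path`, consuming it as one contiguous block."""
    for start in range(len(path)):
        out: List[str] = []
        for arc in path[:start]:
            out += arc[2]              # rule 2: reuse old words before the fragment
        rest = list(fragment)
        ok = True
        for arc in path[start:]:
            if rest:
                n = consume(arc, rest)  # rules 1 and 3: keep consuming the input
                if n is None:
                    ok = False
                    break
                out += rest[:n]
                rest = rest[n:]
            else:
                out += arc[2]          # rule 2: old words after the fragment ends
        if ok and not rest:
            return out
    return None
```

Under these assumptions, `interpret(PATH, ["was", "he"])` fills in "angry" from the previous input, and `interpret(PATH, ["was", "the", "old", "man"])` parses a new NP in place of "you".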
3.2 Transformations of the Previous Form 
While the approach illustrated in 
Section 3.1 is useful in a data base query 
environment where elliptical input typi- 
cally is a modification of the previous 
query, it does not account for elliptical 
statements following questions, elliptical 
questions following statements, etc. Our 
approach to the problem is to write a set 
of transformations which map the parse 
path of a question (e.g., Table 1) into an 
expected parse path for a declarative 
response, and the parse path for a de- 
clarative into a path for an expected 
question, etc. 
The left-hand side of a transforma- 
tion is a pattern which is matched against 
the ATN path of the previous utterance. 
Pattern elements include literals refer- 
ring to arcs, variables which match a sin- 
gle arc or embedded path, variables which 
match zero or more arcs, and sets of al- 
ternatives. It is straightforward to con- 
struct a discrimination net corresponding 
to all left-hand sides for efficiently 
finding what patterns match the ATN path 
of the previous sentence. The right-hand 
side of a transformation is a pattern 
which constructs an expected path. The 
form of the pattern on the right-hand side 
is a list of references to states, arcs, 
and lexical entries. Such references can 
be made through items matched on the 
left-hand side or by explicit construction 
of literal path elements. 
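The four kinds of left-hand-side pattern elements can be sketched as a small recursive matcher. The encoding is an assumption made for illustration: a plain string is a literal arc, "?x" matches exactly one arc or embedded path, "*x" matches zero or more arcs, and a frozenset is a set of alternatives. The arc label "CAT AUX" is hypothetical. The sketch matches one pattern at a time and does not build the discrimination net mentioned above.

```python
from typing import Dict, List, Optional, Union

Elem = Union[str, frozenset]   # literal, "?var", "*var", or set of alternatives
Item = Union[str, list]        # an arc label, or a nested list for an embedded path

def match(pattern: List[Elem], path: List[Item],
          binds: Optional[Dict[str, object]] = None) -> Optional[Dict[str, object]]:
    """Return variable bindings if `pattern` matches all of `path`, else None."""
    binds = dict(binds or {})
    if not pattern:
        return binds if not path else None
    head, tail = pattern[0], pattern[1:]
    if isinstance(head, str) and head.startswith("*"):
        # Zero-or-more variable: try every split point, shortest first.
        for k in range(len(path) + 1):
            result = match(tail, path[k:], {**binds, head[1:]: path[:k]})
            if result is not None:
                return result
        return None
    if not path:
        return None
    first = path[0]
    if isinstance(head, str) and head.startswith("?"):
        # Single-element variable: binds one arc or one embedded path.
        return match(tail, path[1:], {**binds, head[1:]: first})
    if isinstance(head, (set, frozenset)):
        ok = isinstance(first, str) and first in head
        return match(tail, path[1:], binds) if ok else None
    return match(tail, path[1:], binds) if first == head else None

# The path of Table 1, with the NP subpath as a nested list.
QUESTION_PATH = ["CAT COPULA", ["PUSH NP", "CAT PRO", "POP"], "CAT ADJ", "POP"]
bindings = match([frozenset({"CAT COPULA", "CAT AUX"}), "?subj", "*rest"],
                 QUESTION_PATH)
```

Here `bindings` maps `subj` to the embedded NP path and `rest` to the remaining arcs, which is the information a right-hand side would need in order to reassemble them in declarative order.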
Our technique is to restrict the map- 
ping such that any expected parse path is 
generated by applying only one transforma- 
tion and applying it only once. A special 
feature of our transformational system is 
the automatic allowance for dialogue 
deixis. An expected parse path for the 
answer to "Were you angry?" is given in 
Table 2. Note that in Table 2, "you" has be- 
come "I" and "were" has become "was". 
State   Arc                          Old Lexical Item 
S       (PUSH NP ... (TO Sa)) 
[NP     (CAT PRO ... (TO NPa))       "I" 
NPa     (POP ...) ] 
Sa      (CAT COPULA ... (TO Sy))     "was" 
Sy      (CAT ADJ ... (TO Sz))        "angry" 
Sz      (POP ...) 
Table 2 
Declarative for the expected answer 
for "Were you angry?". 
Using this path, the ellipsis interpreter 
described in Section 3.1 would understand 
the ellipses in "a)" and "b)" below in 
the same way as "a')" and "b')". 
a) I was. 
a') I was angry. 
b) My spouse was. 
b') My spouse was angry. 
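A single question-to-declarative transformation with dialogue deixis might look like the following sketch. It assumes a flattened path in which the subject NP is one (arc, word) pair, it omits the state names, and it handles copula agreement with a small hypothetical table; `question_to_declarative`, `DEIXIS`, and `COPULA_FOR` are illustrative names, not part of the system described here.

```python
from typing import List, Tuple

Arc = Tuple[str, str]  # (arc label, old lexical item)

# Speaker/hearer swaps; the agreement table is a simplifying assumption.
DEIXIS = {"you": "I", "I": "you", "my": "your", "your": "my"}
COPULA_FOR = {"I": "was", "you": "were"}

def question_to_declarative(path: List[Arc]) -> List[Arc]:
    """Map COPULA NP X... to NP COPULA X..., swapping deictic pronouns.

    The transformation is applied once to the whole path, in keeping with
    the one-transformation restriction described above.
    """
    if (len(path) < 2
            or not path[0][0].startswith("CAT COPULA")
            or not path[1][0].startswith("PUSH NP")):
        raise ValueError("left-hand side does not match this path")
    copula, subject, rest = path[0], path[1], path[2:]
    new_subject = DEIXIS.get(subject[1], subject[1])
    new_copula = COPULA_FOR.get(new_subject, copula[1])
    return [(subject[0], new_subject), (copula[0], new_copula)] + list(rest)

question = [("CAT COPULA", "were"), ("PUSH NP", "you"), ("CAT ADJ", "angry")]
expected = question_to_declarative(question)
```

The lexical items of `expected` read "I was angry", the same word sequence as in Table 2; an ellipsis interpreter run over that path can then complete "I was." or "My spouse was."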
3.3 Expansions 
A large class of expansions are sim- 
ple adjuncts, such as examples c, d, e, 
and g in Section 1. We have handled this 
by building our ellipsis interpreter to 
allow departing from the base path at 
designated states to consume an adjunct 
from the input string. We mark states in 
the grammar where adjuncts can occur. For 
each such state, we list a set of linear 
(though possibly cyclic) paths, called 
"expansion paths". Our interpreter as 
implemented allows departures from the 
base path at any state so marked in the 
grammar; it follows expansion paths by 
consuming words from the input string, and 
must return to a state on the base form. 
Each of the examples in c, d, e, and g of 
Section 1 can be handled by expansion 
paths only one arc long. They are given 
in Table 3. 
Initial 
State    Expansion Path 
         (PUSH ADVERB ... (TO S)) 
             Probably (I was angry). 
         (PUSH PP ... (TO S)) 
             For a time (I was angry). 
         (PUSH NP 
           (* this includes a test 
              that the NP is one 
              of time or place) 
           ... (TO S)) 
             Yesterday (I was angry). 
Sy       (PUSH INTENSIFIER-ADVERB 
           ... (TO Sy)) 
             (I was) very (angry). 
Table 3 
Example Expansion Paths 
Since this is an extension to the ellipsis 
interpreter, combinations of repetition, 
replacement, and expansion can all be han- 
dled by the one mechanism. For instance, 
in response to "Were you angry?", "Yester- 
day you were (angry)" would be treated 
using the expansion and replacement 
mechanisms. 
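One-arc expansion paths can be sketched as a table from marked states to the adjunct categories allowed there. The sketch below is a simplification under stated assumptions: it handles only single-word adjuncts against the base path for "I was angry", the assignment of the sentence-initial adjuncts to state S is assumed, and `expand`, `BASE`, `EXPANSIONS`, and the category name TIME-NP are illustrative.

```python
from typing import Dict, List, Optional, Tuple

# Base path for "I was angry": (state before the arc, old lexical item).
BASE: List[Tuple[str, str]] = [("S", "I"), ("Sa", "was"), ("Sy", "angry")]

# One-arc expansion paths per marked state. Which state carries which
# path is an assumption of this sketch.
EXPANSIONS: Dict[str, List[str]] = {
    "S": ["ADVERB", "TIME-NP"],
    "Sy": ["INTENSIFIER-ADVERB"],
}

# Toy lexicon for the adjuncts; illustrative only.
LEXICON = {
    "probably": "ADVERB",
    "yesterday": "TIME-NP",
    "very": "INTENSIFIER-ADVERB",
}

def expand(adjunct: str) -> Optional[List[str]]:
    """Insert a one-word adjunct at the first marked state that accepts it."""
    cat = LEXICON.get(adjunct.lower())
    out: List[str] = []
    used = False
    for state, old_word in BASE:
        if not used and cat in EXPANSIONS.get(state, []):
            out.append(adjunct)   # depart from the base path, then return to it
            used = True
        out.append(old_word)      # the rest is filled in from the previous form
    return out if used else None
```

For example, `expand("very")` places the intensifier between "was" and "angry", while `expand("Probably")` places the adverb before "I". Multi-word adjuncts such as "for a time" would need a constituent parser at the departure point.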
4. Special Cases and Limitations 
The ideal model of contextual el- 
lipsis would correctly predict what are 
appropriate elliptical forms in context, 
what their interpretation is, and what 
forms are not meaningful in context. We 
believe this requires structural restric- 
tions, semantic constraints, and a model 
of the goals of the speaker. Our heuris- 
tic does not meet these criteria in a 
number of cases. 
Only two classes of structural con- 
straints are captured. One relates the 
ellipsis to the previous form as a combi- 
nation of repetition, replacement, and 
expansion. The other constraint is that 
the input must be consumed as a contiguous 
string. This constraint is violated, for 
instance, in "I was (angry) yesterday" as 
a response to "Were you angry?" 
Nevertheless, the constraint is computa- 
tionally useful, since allowing arbitrary 
gaps in consuming the elliptical input 
produces a very large space of correct 
interpretations. A ludicrous example is 
the following question and elliptical 
response: 
Has the boss given our mutual friend a 
raise? 
A fat raise. 
Allowing arbitrary gaps between the sub- 
strings of the ellipsis allows an in- 
terpretation such as "A (boss has given 
our) fat (friend a) raise." 
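A rough way to see the growth in candidates is to count how many ways an elliptical input can be cut into separately placed substrings, before even choosing where each substring goes. The sketch below only enumerates those cuts; the function name `splits` is an assumption, and the count it produces (2 to the power k-1 for k words) is a lower bound on the search space rather than a measurement of any implemented system.

```python
from typing import List

def splits(words: List[str]) -> List[List[List[str]]]:
    """All ways to cut `words` into contiguous, non-empty substrings."""
    if not words:
        return [[]]
    if len(words) == 1:
        return [[words]]
    result = []
    for rest in splits(words[1:]):
        # Either cut after the first word, or attach it to the next substring.
        result.append([[words[0]]] + rest)
        result.append([[words[0]] + rest[0]] + rest[1:])
    return result

candidates = splits(["a", "fat", "raise"])
```

With the contiguity constraint only the single uncut substring "a fat raise" is considered; without it, `candidates` also contains the three-way cut behind the ludicrous reading above.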
While it may be possible to view all 
contextual ellipsis as combinations of the 
operations repetition, replacement, and 
expansion applied to something, our model 
makes the strong assumption that these 
operations may be viewed as applying to an 
ATN path rather straightforwardly related 
to the previous utterance. Not all expan- 
sions can be viewed that way, as example f 
in Section 1 illustrates. Also, answers 
of "No" require special processing; that 
response in answer to "Were you angry?" 
should not be interpreted as "No, I was 
angry." One should be able to account for 
such examples within the heuristic 
described in this paper, perhaps by allow- 
ing the transformation system described in 
section 3.2 to be completely general rath- 
er than strongly restricted to one and 
only one transformation application. Row- 
ever, we propose handling such cases by 
special purpose rules we are developing. 
These rules for the special cases, plus 
the mechanism described in section 3 to- 
gether will be formally equivalent in 
predictive power to a grammar for ellipti- 
cal forms. 
Though the heuristic is independent 
of the individual grammar, designating 
expansion paths and transformations obvi- 
ously is not. The grammar may make this 
an easy or difficult task. For instance, 
in the grammar we are using, a subnetwork 
that collects all tense, aspect, and mo- 
dality elements would simplify some of the 
transformations and expansion paths. 
Naturally, semantics must play an 
important part in ellipsis processing. 
Consider the utterance pair below: 
Did the boss have a martini at lunch? 
Some wine. 
Syntactically, this could be interpreted 
as "Some wine (did have a martini at 
lunch)", "(The boss did have) some wine 
(at lunch)", or "(The boss did have a 
martini at) some wine". Semantics 
should prefer the second reading. We are 
testing our heuristic using the RUS gram- 
mar (Bobrow, 1978) which has frequent 
calls from the grammar requesting that the 
semantic component decide whether to build 
a semantic interpretation for the partial 
parse found or to veto that partial parse. 
This should aid performance. 
5. Summary and Conclusion 
There are three aspects to our 
solution: a mechanism for repetition and 
replacement ellipsis, an extension for 
inputs of different types, such as frag- 
mentary answers to questions, and an ex- 
tension for expansion ellipsis. 
Our heuristic deals with the three 
types of contextual ellipsis as follows: 
Repetition ellipsis is processed by re- 
peating specific parts of a transformed 
previous path using the same phrases as in 
the transformed form ("I was angry"). 
Replacement ellipsis is processed by sub- 
stituting the elliptical input for contig- 
uous constituents on a transformed previ- 
ous path. Expansion ellipsis may be pro- 
cessed by taking specially marked paths 
that detour from a given state in that 
path. Combinations of the three types of 
ellipsis are represented by combinations 
of the three variations in a transformed 
previous path. 
There are two contributions of the 
work. First, our method allows for expan- 
sion ellipsis. Second, it accounts for 
combinations of previous sentence form and 
elided form, e.g., statement following 
question, question following statement, 
question following question. Furthermore, 
the method works without any constraints 
on the ATN grammar. The heuristics carry 
over to formalisms similar to the ATN, 
such as context-free grammars and augment- 
ed phrase structure grammars. 
Our study of ellipsis is part of a 
much broader framework we are developing 
for processing syntactically and/or 
semantically ill-formed input; see 
Weischedel and Sondheimer (1981). 
References 
Allen, James F., "A Plan-Based Approach to 
Speech Act Recognition," Ph.D. Thesis, 
Dept. of Computer Science, University of 
Toronto, Toronto, Canada, 1979. 
Bobrow, D., R. Kaplan, M. Kay, D. Norman, 
H. Thompson and T. Winograd, "GUS, A 
Frame-driven Dialog System", Artificial 
Intelligence, 8, (1977), 155-173. 
Bobrow, R., "The RUS System", in Research 
in Natural Language Understanding, by B. 
Webber and R. Bobrow, BBN Report No. 3878, 
Bolt Beranek and Newman, Inc., Cambridge, 
MA, 1978. 
Hayes, P. and G. Mouradian, "Flexible 
Parsing", in Proc. of the 18th Annual 
Meeting of the Assoc. for Comp. Ling., 
Philadelphia, June, 1980, 97-103. 
Hendrix, G., E. Sacerdoti, D. Sagalowicz 
and J. Slocum, "Developing a Natural 
Language Interface to Complex Data", ACM 
Trans. on Database Sys., 3, 2, (1978), 
105-147. 
Kwasny, S. and N. Sondheimer, "Ungrammati- 
cality and Extragrammaticality in Natural 
Language Understanding Systems", in Proc. 
of the 17th Annual Meeting of the Assoc. 
for Comp. Ling., San Diego, August, 1979, 
19-23. 
Quirk, R., S. Greenbaum, G. Leech and J. 
Svartvik, A Grammar of Contemporary English, 
Seminar Press, New York, 1972. 
Schank, R., M. Lebowitz and L. Birnbaum, 
"An Integrated Understander", American 
Journal of Comp. Ling., 6, 1, (1980), 
13-30. 
Thompson, B. H., "Linguistic Analysis of 
Natural Language Communication with Com- 
puters", Proceedings of the Eighth 
International Conference on Computational 
Linguistics, Tokyo, October, 1980, 
190-201. 
Waltz, D., "An English Language Question 
Answering System for a Large Relational 
Database", Comm. ACM, 21, 7, (1978), 
526-559. 
Weischedel, Ralph M. and Norman K. Son- 
dheimer, "A Framework for Processing Ill- 
Formed Input", Technical Report, Dept. of 
Computer & Information Sciences, Universi- 
ty of Delaware, Newark, DE, 1981. 
Acknowledgement 
Much credit is due to Amir Razi for 
his programming assistance. 
