A unification-based approach to multiple VP Ellipsis resolution* 
Claire Gardent 
GRIL, Université de Clermont-Ferrand (France) and
Department of Computational Linguistics, Universiteit van Amsterdam 
Spuistraat 134, 1012 VB Amsterdam (The Netherlands) 
E-mail: claire@mars.let.uva.nl
Abstract 
An assumption shared by many theories of 
discourse is that discourse structure con- 
strains anaphora resolution (cf. \[Grosz and 
Sidner 1986\] for definite NPs, \[Lascarides 
and Asher 1991\], \[Nakhimovsky 1988\] 
for temporal anaphora, \[Webber 1990\] 
for deictic pronouns and \[Gardent 1991\], 
\[Prüst and Scha 1990\] for VP ellipsis). The
aim of this paper is (i) to show that this as- 
sumption also applies to multiple VP ellip- 
sis (VPE), (ii) to argue that other levels of 
linguistic information (such as syntax and 
semantics) interact with discourse structure 
in determining multiple VPE acceptability 
and (iii) to make these intuitions precise
by providing a unification-based account of 
multiple VPE resolution. 
1 Introduction 
\[Klein and Stainton-Ellis 1989\] convincingly argue 
that VPE need not resolve to the nearest possible 
antecedent. The most intricate examples they give 
to support this claim involve what they dubbed
multiple VPE and can be illustrated by the following
discourses (square brackets surround antecedent
VPs, ∅i indicates VP ellipses and indices represent
anaphoric dependencies) 1 :
*The work reported here was partially carried out in 
the LRE Project 61-062, Towards a declarative theory of 
discourse. 
1 Although this data often raises suspicion among lin- 
guistic audiences as to its credibility, the facts are that 
(1) it is real life data and (2) it can be understood and it 
is usually understood in an unambiguous fashion. Hence 
(1) I promised myself I \[1 wouldn't go to Manchester\]
unless I first \[2 opened a big stack of
mail\]. I didn't ∅2, so I didn't ∅1. (Nesting)
(2) If you \[1 work hard, make the right choices
and keep your nose clean\], you \[2 get ahead\].
If you don't ∅1, you don't ∅2. (Crossing)
(3) I was \[1 really thin\] then, and I tried some ski-pants
that \[2 looked really good on me\], and
I \[3 should have bought them\]. But I didn't
∅3, and now I'm not ∅1 and they wouldn't ∅2. (Mixed)
As these examples show, there is not one pattern 
relating multiple VPEs to their antecedents, but at 
least three: nesting, crossing and mixed. Nesting 
and crossing can be represented as follows (where 
VPi and ∅i represent antecedent and elliptical VPs
respectively):
Nesting: VP1 ... VPn ∅n ... ∅1
Crossing: VP1 ... VPn ∅1 ... ∅n
while a mixed pattern simply is a configuration in 
which both crossing and nesting occur. According 
to this terminology, (1) illustrates a nesting pattern, 
(2) shows a crossing pattern and (3) a mixed pattern. 
Thus, it is clear that no unique dependency config- 
uration constrains the resolution of multiple VPEs. 
On the contrary, it appears that all patterns are pos- 
sible and thus that any configurational restriction on 
VPE resolution is doomed to failure. Interestingly 
however, despite the multiple ways in which each of 
the VPEs could be resolved, there is in actual fact 
no ambiguity as to how the global discourse should 
be understood. This suggests that some strong constraints
come into play to help the hearer resolve
multiple VPEs adequately. The question is then:
What is it that constrains multiple VPE
resolution in such a way that these "exotic" discourses
are in fact intelligible?
In what follows, I argue that discourse
structure (rather than surface ordering) is one of the
main constraints regulating multiple VPE resolution.
2 Discourse grammar and VPE 
resolution 
The discourse grammar used builds 
on \[Polanyi and Scha 1984\]. More specifically, I as- 
sume that discourse is a tree structured entity whose 
well formedness can be described by a unification 
based discourse grammar. Under such a grammar, 
a discourse constituent is either a discourse relation, 
a clause or a discourse relation together with one 
or more discourse constituent(s). The grammar as- 
sociates with each constituent a complex category 
which for the purpose of this paper, I will assume 
to consist of the six main attributes PHO, CAT, SEM, 
IN, OUT and RESTR. PHO, CAT, SEM unsurprisingly 
denote the phonology, the category and the seman- 
tic representation of the constituent described by the 
complex category. IN and OUT are attributes which 
represent the flow of anaphoric information, that is, 
IN represents the in-going context (where a context 
is a sequence of potential antecedents i.e. a sequence 
of VP categories) and OUT, the out-going context. 
Finally RESTR is short for restriction and takes as 
value a constraint which must evaluate to true for 
the category to be well-formed. 
Conventions: In what follows I will omit any in- 
formation that is not relevant to the purpose of the 
discussion. In particular, I shall omit irrelevant at- 
tributes in categories and any anaphoric information 
not pertaining to VPE (i.e. anaphoric pronominal information
is ignored). Furthermore, the values of IN
and OUT attributes (which should be VP categories) 
will be abbreviated to the SEM values of these cat- 
egories. Finally, I will use the term a-clause as an 
abbreviation for antecedent clause and e-clause, for 
elliptical clause. 
A simple example will illustrate the workings of 
the discourse grammar with respect to VPE resolu- 
tion. Consider the discourse in (4). 
(4) (a) Jon \[1 likes Mary\] (b) and (c) Peter does
∅1 too.
As indicated by the bracketed letters, this dis- 
course includes three basic discourse constituents: 
the two clauses (a) and (c) and the discourse con- 
nective and. Consider first the category associated 
with (a). Ignoring irrelevant attributes, this category 
can be represented as follows2: 
2 For expository purposes, I assume here a sentential
(rather than a discourse) semantics. In practice, however,
the analysis is to be based on a discourse semantics and,
most importantly, the definitions of structural identity
and of equivalence classes over relations (see below) are
to apply to discourse semantics representations and to
discourse relations respectively.
  \[ SEM  like:\[j,m\]
    IN   _
    OUT  \[like:\[m\]\] \]
With regard to VPE resolution, two points are rel- 
evant. First the IN value is a don't-care-value (sym- 
bolized here by the anonymous variable), thus sig- 
naling the fact that incoming anaphoric information 
is irrelevant in the case of non-elliptical clauses. Sec- 
ond, the OUT value contains the information asso- 
ciated with the sentence main VP thus signalling 
the fact that non elliptical clauses update the cur- 
rent outgoing context with new information. Note 
that anaphoric information concerning VPs is here 
assumed not to be cumulative, that is, the OUT value
of (a) is not "added" to the IN value; rather, it
constitutes the sole output of (a) independent of the
preceding context. The intuition formalised here is 
that the discourse entity providing the interpreta- 
tion of an elided VP is not as persistent as an indi- 
vidual discourse entity and thus should remain local 
to the discourse constituent that introduced it (al- 
though in some particular cases such as e.g. paral- 
lelism, anaphoric information pertaining to VPs can 
be percolated by the discourse grammar rules). For 
more details on this point, the reader is referred to 
\[Gardent 1991\], pages 141-142. 
Now consider the category assigned by the dis- 
course grammar to the elliptical clause (c). Again 
ignoring irrelevant attributes, this category can be 
represented as: 
  \[ SEM  R:\[p | As\]
    IN   \[R:As\]
    OUT  \[R:As\] \]
where R and As are unification variables over re- 
lations and arguments respectively. The important 
point to note here is that the variables R and As are 
shared by the IN value on the one hand, and by the 
SEM value on the other. This in effect implements 
VPE resolution. To see this, suppose that we have a 
discourse rule of the following form (AND abbreviates
the category for and):
  \[SEM Sem1, IN In, OUT Out1\] , AND , \[SEM Sem2, IN Out1, OUT Out2\]
    ==> \[SEM and:\[Sem1,Sem2\], IN In, OUT \[Out1, Out2\]\]
Application of this rule to the categories of (a), (b)
and (c) above will trigger the unification of Out1
with \[like:\[m\]\] on the one hand, and \[R:As\] on the
other. Thus \[R:As\] is unified with \[like:\[m\]\] and the
semantic representation R:\[p | As\] of (c) will become
like:\[p,m\], just as required.
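The flow of anaphoric information in example (4) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: unification is modelled by passing values directly, the context holds a single antecedent VP, and the names VP, non_elliptical, elliptical and and_rule are hypothetical helpers that do not come from the grammar itself.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class VP:
    """A potential antecedent, abbreviated to its SEM value (relation + non-subject args)."""
    rel: str
    args: Tuple[str, ...]

def non_elliptical(subject, rel, args):
    """Clause like (a): the IN value is ignored, OUT holds only the clause's own VP."""
    vp = VP(rel, tuple(args))
    return {"SEM": (rel, (subject,) + vp.args), "OUT": [vp]}

def elliptical(subject, context):
    """Clause like (c): R and As in SEM are shared with the incoming context."""
    if not context:
        raise ValueError("no antecedent VP in the incoming context")
    vp = context[0]  # stands in for unifying [R:As] with the incoming VP
    return {"SEM": (vp.rel, (subject,) + vp.args), "OUT": [vp]}

def and_rule(left, build_right):
    """Sketch of the 'and' rule: the OUT value of the left daughter is the IN value of the right one."""
    right = build_right(left["OUT"])
    return {"SEM": ("and", (left["SEM"], right["SEM"])),
            "OUT": [left["OUT"], right["OUT"]]}

clause_a = non_elliptical("j", "like", ["m"])
result = and_rule(clause_a, lambda ctx: elliptical("p", ctx))
print(result["SEM"])
```

The elliptical clause receives like:\[m\] through its IN value, so its semantics comes out as like:\[p,m\], as in the text.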
3 Discourse structure and Multiple 
VPE resolution 
3.1 Some data 
The claim this paper makes about multiple VPE res- 
olution is that the same discourse relation must hold 
between the multiple VP ellipses on the one hand and 
the multiple antecedents on the other. The purpose of the present
section is to substantiate this claim. As
a first case in point, consider the following example. 
(5) I never go swimming because I don't look 
good in a swimming suit. (causal) 
a. I might if I did. (causal)
b. If I did, I probably would. (causal) 
c. Sarah does and so she does. (causal) 
d. ? I might after I did. (temporal) 
e. ? I might but I did. (contrast) 
Example (5) gives a case of a-clauses which are 
related by a causal relation. Several possible contin- 
uations are then given, some of them are acceptable, 
some of them are not. The relevant observation is 
that in those cases where the relation holding be- 
tween e-clauses also is a causal one, the continuation 
is acceptable; however, in those cases where the relation
holding between e-clauses is of a different nature,
the continuation is unacceptable.
As a second case in point, consider example (6): 
(6) I was thin then and the trousers looked good
on me and I should have bought them,
(THIN → LG) ∧ BT
a. but I didn't and now I am not and they
wouldn't.
¬BT ∧ (¬THIN → ¬LG)
b. but now I am not and they wouldn't and
anyway I didn't.
(¬THIN → ¬LG) ∧ ¬BT
c. ? but now I am not and I didn't and they
wouldn't.
¬THIN ∧ ¬BT ∧ ¬LG
Here the antecedent discourse unit consists of three 
clauses, the first two can be said to be related by 
a causal relation (because I was thin, the trousers 
looked good on me) whereas the third clause is con- 
joined to the first two. Again, several possible contin- 
uations are given, some of them are acceptable, some 
of them are not. This time, the observation is that 
in the case where no causal relation can be estab- 
lished between the appropriate e-clauses (i.e. when 
those clauses corresponding to the cause and to the 
result of the cause are not adjacent), the continuation
is unacceptable 3. That is, in the case where
an identical relational pattern cannot be established
for e- and a-clauses, multiple VPE becomes hard to
understand, if not unacceptable.
3 This observation was originally made in \[Stainton-Ellis
1988\], page 75.
In what follows, I take these examples to suggest 
that the same discourse relation must hold between
a- and e-clauses respectively. I characterise this
observation in terms of parallelism, make this notion
precise and show how it interacts with other grammar
components (e.g. syntax and semantics) to determine
multiple VPE resolution. It should be stressed
however that the approach can only be as precise 
as the definition of discourse relations and unfortu- 
nately, this notion is notoriously elusive. Nonethe- 
less the hope is that this paper captures an impor- 
tant intuition about multiple VPE resolution namely, 
the intuition that parallelism constitutes one of the 
(many) factors affecting multiple VPE acceptability 
and interpretation. 
3.2 Formal analysis 
Assuming a discourse grammar of the type described 
in section (2), the claims this paper makes about 
multiple VPE resolution are (i) that whenever a dis- 
course contains multiple VPEs, the clauses contain- 
ing the VPEs and those containing the antecedent 
VPs form two complex discourse constituents which 
are related together by the relation of parallelism 
and (ii) that parallelism constrains VPE resolution 
in that each VPE will resolve to the "parallel VP" in 
the complex discourse constituents formed by the a- 
clauses. We now make these claims precise. First, we 
define the semantic representation language ℒ used
by the grammar described in section (2). ℒ consists
of the wffs described by the following syntax:
wff       → { term, formula, polarity:rel:\[wff1 ... wffn\] }
term      → { variable, constant }
formula   → polarity:predicate:\[arg1 ... argn\]
arg       → { term, formula }
rel       → constant
predicate → constant
polarity  → { 1, 0 }
The intuition is that ℒ is a quantifier-free language
where variables are unification variables and
polarity (i.e. absence or presence of negation) is
always explicit (that is, non-negated wffs are described
as positive, i.e. marked with 1). Thus for
instance, the expression 0:and:\[0:p, 1:q\] is a wff of
ℒ, which one can think of as the more traditional
propositional logic formula ¬(¬p ∧ q). We call PROP
the set of wffs of the form polarity:predicate:\[arg1 ...
argn\] 4. Given this language ℒ, the discourse relation
of parallelism is said to hold between two propositions
represented by the ℒ wffs Φ and Ψ (written
parallelism(Φ, Ψ)) iff Φ is structurally identical with
Ψ. Structural identity is defined as follows:
4Note that contrary to tradition negated propositions 
are assumed to be atomic wffs. 
Definition 1 (Structural identity between ℒ
formulae)
If φ, ψ ∈ ℒ, then φ is structurally identical (or
s-identical) with ψ (written φ =s ψ) if:
(i) φ, ψ ∈ PROP
or (ii) φ = \[φ1 ... φn\], ψ = \[ψ1 ... ψn\],
φ1 =s ψ1 and ... φn =s ψn
or (iii) φ = p1:φ1, ψ = p2:ψ1,
p1 = p2 and φ1 =s ψ1
That is, structural identity is identity up to propo- 
sitional level (where negation is taken to be part of 
propositional information). To give two simple ex- 
amples: 
1:p =s 0:q
and
1:implies\[1:p, 0:q\] =s 1:implies\[1:r, 0:s\]
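Structural identity can be illustrated with a small recursive check. The sketch below is one possible reading of Definition 1, not the grammar's own implementation: the classes Prop and Rel and the function s_identical are hypothetical names, and the prefix compared in clause (iii) is assumed to be the polarity together with the relation name.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Prop:
    """An atomic proposition polarity:predicate:[args]."""
    pol: int
    pred: str
    args: Tuple[str, ...] = ()

@dataclass(frozen=True)
class Rel:
    """A complex wff polarity:rel:[wff1 ... wffn]."""
    pol: int
    rel: str
    parts: Tuple[Union["Prop", "Rel"], ...]

def s_identical(a, b):
    # Clause (i): any two members of PROP are s-identical, whatever their polarity.
    if isinstance(a, Prop) and isinstance(b, Prop):
        return True
    if isinstance(a, Rel) and isinstance(b, Rel):
        # Clause (iii): equal prefixes (assumed here: polarity and relation name) ...
        same_prefix = (a.pol, a.rel) == (b.pol, b.rel)
        # ... and clause (ii): argument lists of equal length, pairwise s-identical.
        same_parts = len(a.parts) == len(b.parts) and all(
            s_identical(x, y) for x, y in zip(a.parts, b.parts))
        return same_prefix and same_parts
    return False

print(s_identical(Prop(1, "p"), Prop(0, "q")))
print(s_identical(Rel(1, "implies", (Prop(1, "p"), Prop(0, "q"))),
                  Rel(1, "implies", (Prop(1, "r"), Prop(0, "s")))))
```

Both calls correspond to the two examples given above and succeed under this reading; a proposition compared with a complex wff fails.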
To state the constraint regulating multiple VPE res- 
olution, we first define the notion of a yield. 
Definition 2 (Yield)
If φ ∈ ℒ, then the yield of this semantic representation
φ, written Y(φ), is:
If φ ∈ PROP, Y(φ) = ⟨φ⟩
If φ = \[φ1, ... φn\], Y(φ) = Y(φ1) · ... · Y(φn) where · denotes
sequence concatenation
If φ = p:φ1, Y(φ) = Y(φ1)
Thus the yield of an ℒ wff φ consists of the sequence
of atomic propositions contained in φ. Finally, we
state the constraint as follows: 
Definition 3 (Constraint on multiple VPE
resolution)
Let φ be the semantic representation associated with
the discourse segment formed by the a-clauses and ψ
be that associated with the e-clauses. Then, if
Y(φ) = ⟨Pol1:P1:\[s1 | ss1\], ...
, Pol2:Pn:\[sn | ssn\]⟩ and
Y(ψ) = ⟨Pol3:O1:\[t1 | tt1\], ...
, Pol4:On:\[tn | ttn\]⟩,
then for 1 ≤ i ≤ n, Oi = Pi and ssi = tti.
That is, each elided predicate Oi and argument list
tti 5 in Y(ψ) resolves to the parallel predicate Pi and
argument list ssi in Y(φ). To see how this constraint
works, consider example (1). Suppose that the dis- 
course grammar assigns to the a- and the e- part 
of this discourse the following (simplified) semantic 
representations: 
A-clauses: 0:and:\[0:OM:\[i\], 1:GtM:\[i\]\]
E-clauses: 1:and:\[0:R1:\[i\], 0:R2:\[i\]\]
5The first argument in the list corresponds to the sub- 
ject NP and is thus ignored. 
Then definition 3 adequately predicts that R1 
=OM and R2 = GtM. That is, the constraint embod- 
ied in definition 3 implements the fact that multiple 
VPE resolution is sensitive to the semantic- rather 
than to the surface-ordering of the antecedents. 
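The yield of Definition 2 and the positional matching of Definition 3 can be sketched as follows for example (1). This is an illustrative reading only: Prop, Rel, yield_of and resolve are hypothetical names, elided predicates are represented by the placeholder strings R1 and R2, and the first argument of each proposition is dropped as the subject.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Prop:
    pol: int
    pred: str
    args: Tuple[str, ...]

@dataclass(frozen=True)
class Rel:
    pol: int
    rel: str
    parts: Tuple[Union["Prop", "Rel"], ...]

def yield_of(wff):
    """Definition 2: the sequence of atomic propositions, read left to right."""
    if isinstance(wff, Prop):
        return [wff]
    out = []
    for part in wff.parts:
        out.extend(yield_of(part))
    return out

def resolve(a_sem, e_sem):
    """Definition 3: the i-th elided predicate takes the i-th antecedent predicate
    and its non-subject arguments."""
    ys_a, ys_e = yield_of(a_sem), yield_of(e_sem)
    if len(ys_a) != len(ys_e):
        raise ValueError("a- and e-clauses have yields of different length")
    return {e.pred: (a.pred, a.args[1:]) for a, e in zip(ys_a, ys_e)}

# Example (1), with the simplified representations given in the text.
a_clauses = Rel(0, "and", (Prop(0, "OM", ("i",)), Prop(1, "GtM", ("i",))))
e_clauses = Rel(1, "and", (Prop(0, "R1", ("i",)), Prop(0, "R2", ("i",))))
bindings = resolve(a_clauses, e_clauses)
print(bindings)
```

Because matching follows the order of the semantic representation, R1 is bound to OM and R2 to GtM even though the antecedent VPs occur in the opposite order on the surface.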
3.3 Implementation 
The above analysis can be implemented in the dis- 
course grammar described in section (2) as follows. 
The parallelism rule will be: 
  \[SEM SEM1, IN IN, OUT OUT1, RESTR _\] , \[SEM SEM2, IN OUT1, OUT OUT2, RESTR _\]
    ==> \[SEM 1:parallelism:\[SEM1,SEM2\], IN IN, OUT \[OUT1, OUT2\],
         RESTR SEM1 =s SEM2\]
This rule has two effects. First, it requires that 
the semantic representations of the constituting dis- 
course constituents be s-identical - this implements 
the restriction stated in defining parallelism. Sec- 
ond, it unifies the OUT value of the first discourse 
constituent with the IN value of the second - this en- 
sures that the antecedents provided by the first (pos- 
sibly complex) discourse constituent are accessible to 
any VPEs occurring in the second constituent. Now
consider the rule for the connective unless (where 
UNLESS abbreviates the category associated with un- 
less): 
  \[SEM SEM1, IN IN1, OUT OUT1\] , UNLESS , \[SEM SEM2, IN IN2, OUT OUT2\]
    ==> \[SEM 0:and:\[0:SEM2, 0:SEM1\], IN \[IN1, IN2\], OUT \[OUT2, OUT1\]\]
Note that the order of the resulting OUT value is 
\[Out2, Out1\] (and not \[Out1, Out2\] as suggested by 
the surface ordering). This reflects the fact that mul- 
tiple VPE resolution is sensitive to the logical- rather 
than the surface-ordering of its antecedents. Application
of the UNLESS rule to the a-clauses 6 (I wouldn't
go to Manchester unless I open my mail) in example
(1) will yield the category (recall that irrelevant
attributes and attribute values are omitted):
  \[ SEM  \[1\] 0:and:\[0:open:\[i,mail\], 0:go:\[i,toM\]\]
    IN
    OUT  \[\[open:\[mail\]\], \[go:\[toM\]\]\] \]
6Here, I do not consider the problem raised by the 
embedding clause I promised myself that. 
Similarly, the e-clauses (I didn't so I didn't) will be 
assigned the category: 
  \[ SEM  \[2\] 1:and:\[0:P1:\[i | As1\], 0:P2:\[i | As2\]\]
    IN   \[\[P1:As1\], \[P2:As2\]\]
    OUT \]
Finally, application of the parallelism rule to these 
two categories will yield: 
  \[ SEM  1:parallelism:\[\[1\], \[2\]\]
    IN
    OUT \]
where \[1\] =s \[2\] and thus,
\[2\] = 1:and:\[0:open:\[i,mail\], 0:go:\[i,toM\]\]
That is, the uninstantiated variables P1, P2, As1
and As2 in \[2\] have been assigned a value by means of
unification in such a way as to implement the restriction
on multiple VPE resolution stated in definition
3, and with the result that the semantic representation
of the overall discourse is the expected one.
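The effect of the UNLESS rule on the outgoing context can be sketched as below. The sketch makes several assumptions: categories are plain dictionaries, the polarities in SEM are copied from the resulting category shown in the text rather than computed, unification of the e-clauses' IN value with the a-clauses' OUT value is modelled by pairing the two sequences position by position, and unless_rule and bind_ellipses are hypothetical helper names.

```python
def unless_rule(first, second):
    """'first UNLESS second': SEM and OUT list the second clause before the first,
    i.e. in logical rather than surface order."""
    return {"SEM": (0, "and", [second["SEM"], first["SEM"]]),
            "OUT": [second["OUT"], first["OUT"]]}

def bind_ellipses(antecedent_out, ellipsis_in):
    """Pair each open slot of the e-clauses' IN value with the antecedent
    occupying the same position in the a-clauses' OUT value."""
    if len(antecedent_out) != len(ellipsis_in):
        raise ValueError("contexts of different length cannot be unified")
    return dict(zip(ellipsis_in, antecedent_out))

# Surface order of the a-clauses in (1): 'go to Manchester' comes before 'open mail'.
go_clause = {"SEM": (0, "go", ["i", "toM"]), "OUT": ("go", ("toM",))}
open_clause = {"SEM": (0, "open", ["i", "mail"]), "OUT": ("open", ("mail",))}

a_clauses = unless_rule(go_clause, open_clause)
bindings = bind_ellipses(a_clauses["OUT"], ["P1:As1", "P2:As2"])
print(a_clauses["OUT"])
print(bindings)
```

The first ellipsis (P1:As1) ends up paired with open:\[mail\] and the second with go:\[toM\], which is the nesting pattern of example (1).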
4 Structural identity and semantic 
equivalence 
The approach proposed above relies on the syntactic 
notion of structural identity. However it is a well- 
known fact that syntactically distinct logical formu- 
lae may be semantically equivalent. For instance, 
(7) p → q ≡ ¬(p ∧ ¬q) ≡ ¬p ∨ q
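The equivalences in (7) can be checked mechanically with a truth table. The following sketch writes out material implication explicitly and compares the three forms on every assignment; the names IMPLIES and forms are illustrative only.

```python
from itertools import product

# Truth table of material implication, written out explicitly.
IMPLIES = {(True, True): True, (True, False): False,
           (False, True): True, (False, False): True}

def forms(p, q):
    """The three syntactically distinct formulae of (7), evaluated at (p, q)."""
    return (IMPLIES[(p, q)], not (p and not q), (not p) or q)

all_equivalent = all(len(set(forms(p, q))) == 1
                     for p, q in product([False, True], repeat=2))
print(all_equivalent)
```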
Now given these logical equivalences, it is un- 
clear how the semantics of natural language discourse 
should be represented. Suppose for instance, that we 
have a discourse of the form If P, Q. Then there is a 
choice as to how this discourse should be represented, 
namely should it be represented as p → q, ¬(p ∧ ¬q)
or ¬p ∨ q (where p and q represent the semantic content
of the discourses P and Q related by if)? Traditionally,
it is assumed that such a discourse will
translate to what could be called the canonical form
i.e. p → q. However, the data on multiple ellipses
(and the analysis proposed here) suggests that this 
should not always be the only possibility. As a case 
in point, consider example (8). 
(8) If he is \[1 lucky\], he has \[2 ordered his software
from a house that can help\]. If he hasn't ∅2,
he isn't ∅1 and may the gods be with him
because he will need it.
Suppose that both a- and e-clauses translate to the 
canonical form, we then have the following semantic 
representations 7: 
A-clauses: λx.lucky(x)(i) → λx.buy(x, sw, fhtch)(i)
E-clauses: ¬𝒫1(i) → ¬𝒫2(i)
And definition 3 will yield the (wrong) prediction:
𝒫1 = λx.lucky(x)
𝒫2 = λx.buy(x, sw, fhtch)
Now suppose that the semantics of the e-clauses
(i.e. ¬𝒫1(i) → ¬𝒫2(i)) is replaced by the semantically
equivalent:
𝒫2(i) → 𝒫1(i)
Definition 3 will then yield the (correct) prediction:
𝒫2 = λx.lucky(x)
𝒫1 = λx.buy(x, sw, fhtch)
So it seems that a given natural language con- 
nective should be allowed to be ambiguous between 
several semantically equivalent but syntactically dis- 
tinct discourse relations (for instance, if could be as- 
signed all translations given in (7) above). But if this 
is so, the question then arises as to how this ambigu- 
ity can be resolved. The claim I want to make is that 
both the resolution of this ambiguity and the reso- 
lution of multiple VP ellipses result from a complex 
interaction between syntax, semantics and pragmat- 
ics. The following section provides some evidence in 
support of this claim. 
5 The interaction of parallelism with
other levels of linguistic
information
So far I have argued that multiple VPE resolution is 
subject to the discourse constraint that the proposi- 
tions expressed by e- and a-clauses must be related 
by the discourse relation of parallelism. I have then
shown that due to semantic equivalence, there might 
be several parallel configurations potentially holding 
between a- and e-clauses. However the actual data 
shows little ambiguity: in most cases, the hearer can 
single out the (unique) intended reading. In this sec- 
tion, I argue that the discourse constraint of paral- 
lelism interacts with other sources of linguistic infor- 
mation to determine this unique reading. In particu- 
lar, I argue that syntax, semantics and pragmatics all 
contribute to solve the ambiguity raised by semantic 
equivalences between discourse relations. 
7 To improve readability, I use here (and in the rest
of this section) an informal notation to describe the
semantics of discourse. 𝒫i represent the semantics of VPEs
where i indicates surface ordering.
5.1 Syntax 
Consider again example (8) where the discourse 
formed by the e-clauses is of the form If P, Q and 
the associated semantic representation may be either
p → q or ¬q → ¬p. Now look at the syntax
of antecedent and elliptical VPs. The first
elliptical VP is the perfective auxiliary has and thus
subcategorises for a past participle whereas the sec- 
ond ellipsis consists of copula be and thus selects a 
predicative phrase. Correspondingly, the antecedent 
VPs are (1) a predicative phrase (lucky) and (2) a 
past participle (ordered his software from a house 
that can help). If we assume that VPE acceptability 
is sensitive to the syntactic information associated 
with the antecedent, then the above observations ex- 
plain why the discourse relation holding between a- 
and e-clauses must be ¬q → ¬p rather than p → q.
For in the first case hasn't indeed resolves to a past 
participle (namely ordered his software from a house 
that can help) and isn't to a predicative phrase (i.e. 
lucky); whereas in the second case, the subcategori- 
sation requirements of the auxiliaries are systemati- 
cally violated. Thus if we assume that the (or at least 
some) syntactic properties of the antecedent VPs are 
relevant in determining VPE acceptability, then we 
can account for the fact that despite the ambiguity
introduced by semantic equivalences between
discourse relations, there is only one reading for (8) 
i.e. the reading which is compatible both with the 
discourse requirement of parallelism between a- and 
e-clauses and with the syntactic constraints between
antecedent and elliptical VP. As already mentioned 
(cf. section 2), the present discourse grammar makes 
precisely this assumption since it takes anaphoric information
to be sequences of VP categories, i.e. feature
structures containing inter alia syntactic information 
about admissible antecedent VPs. 
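The way subcategorisation singles out one of the equivalent representations for (8) can be sketched as a simple filter. Everything in the sketch is a simplifying assumption: antecedents are reduced to a label and a syntactic category, each candidate representation of the e-clauses is reduced to the order in which it lists the two ellipsis auxiliaries, and the names A_YIELD, SUBCAT, CANDIDATES and compatible_lfs are hypothetical.

```python
# Yield of the a-clauses of (8): 'lucky' first, then 'ordered his software ...'.
A_YIELD = [("lucky", "predicative"), ("ordered", "past_participle")]

# Category each ellipsis auxiliary subcategorises for.
SUBCAT = {"hasn't": "past_participle", "isn't": "predicative"}

# Candidate representations of 'If he hasn't, he isn't', given by the order
# in which their yield lists the two ellipses.
CANDIDATES = {
    "p -> q": ["hasn't", "isn't"],
    "not q -> not p": ["isn't", "hasn't"],
}

def compatible_lfs(a_yield, candidates, subcat):
    """Keep the candidates in which every ellipsis is paired, position by position,
    with an antecedent of the category its auxiliary selects."""
    kept = []
    for name, e_yield in candidates.items():
        if len(e_yield) == len(a_yield) and all(
                subcat[aux] == category
                for aux, (_, category) in zip(e_yield, a_yield)):
            kept.append(name)
    return kept

print(compatible_lfs(A_YIELD, CANDIDATES, SUBCAT))
```

Only the contraposed form survives the filter, since it is the only one that pairs isn't with the predicative phrase and hasn't with the past participle.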
5.2 Semantics 
\[Sag 1980\] argues that VPE is subject to a con- 
straint on semantic representations, which is dubbed 
the alphabetical variant constraint. The analysis is 
convincing in that it accounts for a wide range of 
facts about VPE and its interaction with other linguistic
phenomena such as quantification, extraction,
pseudo-clefts, ready constructions and equi-sentences.
For instance, the alphabetic variant constraint
will account for the unacceptability of (9) 8:
(9) If every boy thinks that Mary is in love with
him, the party will be a success. * If they
don't, it won't.
Note that in this case, discourse parallelism does
hold between a- and e-clauses. So if discourse parallelism
(as defined in this paper) were taken to be the
only constraint regulating VPE acceptability, this
(ill-formed) discourse could not be rejected by the
grammar. However, if Sag's constraint is assumed
then the ill-formedness of (9) can be accounted for
as follows. Sag's constraint states that VPE is acceptable
iff the semantic representation of the antecedent
VP (which he assumes to be a lambda abstraction
over individuals) is identical up to renaming
of bound variables with the semantic representation
of the ellipsis and furthermore, all occurrences
of a free variable occurring both in the representation
of the antecedent and of the ellipsis are bound by
the same operator. Given this, the ill-formedness of
(9) is explained by the fact that the pronoun him
is represented by a variable (say, y) which is free in
the semantic representation associated with the antecedent
VP (i.e. λz.think(z, love(m, y))) and cannot
be bound by the same operator (i.e. the universal
quantifier introduced by the subject NP every boy)
when occurring in the semantic representation of the
elliptical VP (because it occurs outside the scope of
every).
8 To be compared with the well-formed: If every boy
brings a bottle, the party will be a success. If they don't,
it won't.
Here again, the assumption that the antecedent 
of a VPE is represented by a monostratal category 
means that Sag's alphabetic variant constraint can 
easily be integrated in the present account. This can 
be done in two ways. The first possibility consists 
in adopting Sag's view and adding a constraint in 
the category associated with VP ellipsis auxiliaries 
to the effect that the semantic representation of the 
antecedent VP and that of the elided VP must be
alphabetic variants of each other. This has the incon- 
venience of requiring a global check over the semantic 
representation of the whole discourse segment con- 
taining a- and e-clauses, a check which is essentially 
non compositional in nature 9. A second possibility is 
to adopt a dynamic semantics (i.e. a semantics where 
meaning is taken to be a relation between contexts 
and where a context contains information about pro- 
noun denotations). Under such an assumption, it can 
be shown that the unacceptability of any discourse violating
the alphabetic variant constraint comes out
as a failure to interpret this discourse (model theo- 
retic interpretation simply fails) so that the seman- 
tic representation of a- and e-clauses need not be 
checked upon. Such an approach is described in 
\[Gardent 1990\] and could easily be integrated in the 
present framework: it suffices to replace the static 
semantics whose syntax is described in section 3 by the dynamic
semantics given in \[Gardent 1990\].
5.3 Pragmatics 
Just like syntax and semantics, pragmatics can interact
with discourse constraints to determine multiple
VPE acceptability. A particularly clear illustration 
of this interaction comes from the pragmatics of dis- 
course connectives i.e. words such as but, unless, etc. 
Consider for instance the discourse in (10). 
9For more details concerning this point, see 
\[Gardent 1990\]. 
(10) I gave her some questions to ask you if you 
rang her. 
a. I did but she didn't. 
b. * I did but she did.
Although both continuations can be viewed as parallel
to the a-clauses (cf. section 6), only continuation
(a) is acceptable. Continuation (b) is unacceptable
because the pragmatics of but (which requires
some contrastive relation to hold between the propo- 
sitions it relates) is violated. 
The discourse grammar sketched here does not in- 
tegrate pragmatic information and thus cannot ac- 
count for the difference in acceptability between (a) 
and (b). Whether it can be extended to do so re- 
mains an open question although recent work in 
pragmatics (such as \[Elhadad and McKeown 1990\])
suggests that the monostratal, unification based ap- 
proach to discourse grammar is fully compatible with 
a comprehensive treatment of the semantics and 
pragmatics of discourse connectives. 
6 Taking stock 
While section (3) argues that multiple VPE resolu- 
tion is subject to the discourse constraint of paral- 
lelism, section (5) shows that it is also sensitive to 
other linguistic components such as syntax and se- 
mantics. The present section (i) discusses how the 
resulting overall analysis accounts for the examples 
given so far, (ii) introduces some additional data and 
(iii) summarises how the various linguistic modules 
interact in determining VPE acceptability for the set 
of cases presented throughout the paper. 
We start by examining the examples given so far. 
Examples (2) and (5a) are simple cases of discourse
parallelism where a- and e-clauses translate to the 
same canonical LF and no extraneous factor blocks 
resolution so that each VPE resolves to the paral- 
lel element in the antecedent discourse constituent. 
Example (3) is more intricate and can actually be 
explained in two different ways. A first possibility 
is to assume that I should have bought them and but 
I didn't form a discourse constituent and, I was re- 
ally thin and ... the ski-pants looked really good on 
me and now I'm not and they wouldn't another (the 
intuition here would be that discourse constituents 
reflect the temporal structure of discourse, that is, 
temporally related events must be part of the same 
discourse constituent). Under this first hypothesis, 
we have on the one hand a case of (single) VP ellipsis 
where but I didn't resolves to I didn't buy them and on 
the other hand a simple case of parallelism between 
complex discourse constituents 10. The second possibility
is to consider that the three a-clauses form a
discourse constituent which is parallel with the discourse
constituent formed by the three e-clauses. In
this case, the semantic representations of a- and
e-clauses can be symbolised as:
A-clauses: (T → LG) ∧ BT
E-clauses: ∅1 ∧ (∅2 → ∅3)
This clearly does not obey parallelism. In this
case, syntax imposes the choice of an equivalent LF
(i.e. (∅2 → ∅3) ∧ ∅1). As in (8), this syntactic constraint
stems from the subcategorisation requirement
of a VPE auxiliary, namely 'm not which requires a
predicative phrase as antecedent.
10 Thanks to an anonymous referee for pointing out
this possible interpretation.
For completeness, consider now the following ad- 
ditional examples. 
(11) I gave her some questions to \[1 ask you\] if you
\[2 rang her\]. I did ∅2 but she didn't ∅1.
(12) It was preposterous. It \[1 couldn't possibly
work\]. There \[2 must have been some other
precautions\]. But there weren't ∅2 and it did
∅1.
(13) Xenophobia pestis, like the hard native perennial
it is, bourgeons as lordly young Mediterranean
male cyclists sail into oncoming traffic
with such signorial arrogance that even as
we swear and skid, we look round wildly for
street signs to see if he \[1 's right\], and we
\[2 are wrong\] and the one-way system \[3 's
undergone one of its periodic reversals\]. (He
isn't ∅1. We aren't ∅2. It hasn't ∅3.)
(11) illustrates a case where parallelism constrains 
the choice of an alternative semantic representation 
with the result that the a-clauses semantics is rep- 
resented by a wff of the form --(p A --q) rather than 
the canonical semantic translation for discourses of 
the form If P, Q i.e. p ---* q. Example (12) pro- 
vides one more illustration of the interaction of syn- 
tax with discourse in determining multiple VPE reso- 
lution whereas example (13) illustrates a simple case 
of discourse parallelism. 
The following table summarises these observations.
The first column (Ex.) gives the number of the
example together with the linguistic module, if any,
which forced the choice of an equivalent semantic
representation: D stands for Discourse and S for Syntax.
The second column (Canonical LF) gives the "canonical"
semantic representations (or Logical Forms) of the
a- and e-clauses: a-clauses are represented by
capital-letter abbreviations which are mnemonic for their
propositional content, whereas the semantics of
elliptical clauses is represented by ∅i, where i reflects
surface ordering. Finally, the third column gives
an equivalent semantic representation for the e- or
a-clauses (or none when this is superfluous). The
intuition is that this column also indicates anaphoric
dependencies: for each ellipsis, it shows which
element is parallel to it in the final semantic
representation of the a-clauses. To take an example,
consider the discourse in (1). For this discourse
the table indicates that discourse forces the choice
of a non-canonical semantic representation for the
a-clauses. That is, the choice of the non-canonical
semantic representation is determined in this case
by the discourse requirement that a- and e-clauses
stand in a parallelism relation. As a result, each
ellipsis resolves to its parallel element in the equivalent
LF (rather than the canonical one), i.e. ∅1 resolves to
OM (i.e. open a big stack of mail) and ∅2 to GtM
(i.e. go to Manchester).
Ex.   Canonical LF          Equivalent LF
1D    ¬OM → ¬GtM            ¬(¬OM ∧ GtM)
      ∅1 ∧ ∅2               ∅1 ∧ ∅2
2     WH → GA
      ∅1 → ∅2
3S    (T → LG) ∧ BT         (T → LG) ∧ BT
      ∅1 ∧ (∅2 → ∅3)        (∅2 → ∅3) ∧ ∅1
5a    LG → GS
      ∅2 → ∅1
6S    L → OS                L → OS
      ∅1 → ∅2               ∅2 → ∅1
11D   RH → AY               ¬(RH ∧ ¬AY)
      ∅1 ∧ ¬∅2              ∅1 ∧ ¬∅2
12S   W ∧ P                 W ∧ P
      ∅1 ∧ ∅2               ∅2 ∧ ∅1
13    R ∧ W ∧ UPR
      ∅1 ∧ ∅2 ∧ ∅3
7 Problems and further research 
A first problem concerns the propagation of 
anaphoric information throughout the discourse tree. 
To see what the problem is, consider the discourse in 
(14). 
(14) Jon won't dance unless Mary does. 
In the absence of any additional context, the an- 
tecedent of the VPE in the second clause is the VP of 
the first clause, i.e. dance. Now let us examine again
the discourse rule for unless sketched in section 3. 
For this rule, the distribution of anaphoric informa- 
tion can be pictured as follows: 
[I1, I2 ~ O1, O2]
Note that anaphoric information is only shared between
mother and daughters, not between sisters.
This means that the rule sketched in section 3.3 will
fail to resolve the VPE in example (14) because in
this case, resolution can only obtain if O1 = I2, i.e. if
anaphoric information is shared between sisters. An
obvious fix would be to modify the unless rule so that
I2 unifies not only with the IN value of the rightmost
daughter but also with the OUT value of the leftmost
daughter. The modified rule would then be:
[ SEM SEM1 ]          [ SEM SEM2 ]
[ IN  I1   ]  UNLESS  [ IN  [1]  ]
[ OUT [1]  ]          [ OUT O2   ]
[ SEM 0:and:[0:SEM2, 0:SEM1]  IN [I1, [1]I2]  OUT [O2, O1] ]
However, although this would solve the problem 
raised by example (14), it would still fail to account 
for cases such as (15). 
(15) (a) Jon won't [1 dance] unless (b) Mary does
∅1 and (c) Bob won't [2 come] unless (d) Sarah
does ∅2.
Here the problem is that the new unless rule re- 
quires the IN value of (d) to unify both with the OUT 
value of (b) i.e. dance and with the OUT value of (c) 
i.e. come. Clearly unification fails and thus example 
(15), although perfectly well-formed, is rejected by 
the grammar. 
In more general terms, the problem is that 
anaphoric information can come to be instantiated 
both in a top-down and in a bottom-up fashion (i.e. 
through sharing of information between mother and 
daughter or through sharing of information between 
sisters) 11. When the two types of information con- 
flict, unification fails and a perfectly well formed dis- 
course may be rejected by the grammar. In other 
words, the grammar will undergenerate. 
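The conflict can be made concrete with a minimal Python sketch of unification over atomic values. This is an illustration only, not the grammar implemented for this paper: the class Var and the functions deref and unify are hypothetical names, and feature structures are reduced to single variables. The IN value of clause (d) in example (15) is a variable which the modified rule asks to unify with two different atoms.

```python
class Var:
    """An unbound variable that can be instantiated once."""

    def __init__(self, name: str):
        self.name = name
        self.value = None


def deref(x):
    """Follow variable bindings until an atom or an unbound variable is reached."""
    while isinstance(x, Var) and x.value is not None:
        x = x.value
    return x


def unify(x, y) -> bool:
    """Unify two terms (variables or atoms); return False on a clash."""
    x, y = deref(x), deref(y)
    if x is y:
        return True
    if isinstance(x, Var):
        x.value = y
        return True
    if isinstance(y, Var):
        y.value = x
        return True
    return x == y


# Example (15): the IN value of (d) must unify with the OUT value of (c)
# (sharing between sisters) and with the OUT value of (b) (sharing through
# the mother nodes).
in_d = Var("IN_d")
out_b, out_c = "dance", "come"

print(unify(in_d, out_c))  # local, bottom-up information
print(unify(in_d, out_b))  # top-down information clashes with it
```

The first call instantiates the variable to come, so the second call fails, which corresponds to the rejection of the well-formed discourse (15).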
There are several possible solutions to this problem.
A first one would be to privilege one source
of information over the other, say by means of priority
union. In this way, one anaphoric flow would
overwrite the other. But apart from the computational
problems involved in using such rewrite operations
at run time, it is also unclear which information
should be privileged. Thus although in (15), bottom-up
(or local) information seems to prevail, example
(16) shows that in some cases, top-down information
may be the stronger:
(16) (a) Jon won't go to Manchester unless (b) he 
opens his mail and (c) Bob won't go to Paris 
unless (d) he does. 
11 The first type of anaphoric flow is top-down in that
anaphoric information on the mother may be required to 
unify with the anaphoric information of some other node 
higher up in the discourse tree, whereas the second type 
is bottom-up because the anaphoric information specified 
on the sisters may in turn be required to unify with the 
anaphoric information carried by some other node lower 
down in the discourse tree. 
Here, there is at least one reading where the el- 
lipsis in (d) resolves to the parallel element (b) (i.e. 
opens his mail) rather than to the immediately pre- 
ceding VP (i.e. go to Paris). Furthermore it is easy 
to find cases where the overall discourse is ambigu- 
ous between a "top-down reading" and a "bottom- 
up" one. Thus perhaps a better solution would be 
to always allow both possibilities and to let the vari- 
ous modules of the grammar decide which reading is 
actually available. The details and the adequacy of 
such an approach, I leave here as an open research 
question. 
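The two options just considered can be contrasted in a short Python sketch. It rests on strong simplifying assumptions: feature structures are plain dictionaries, priority_union is a hypothetical helper in which the preferred structure overwrites the default one on conflict, and the two anaphoric flows for clause (d) of example (16) are hard-coded rather than computed by a grammar.

```python
def priority_union(preferred: dict, default: dict) -> dict:
    """Merge two feature structures; on conflict the preferred value wins."""
    result = dict(default)
    for key, value in preferred.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = priority_union(value, result[key])
        else:
            result[key] = value
    return result


# Example (16), clause (d): candidate antecedents supplied by the two flows.
top_down = {"IN": "open his mail"}  # parallel element (b)
bottom_up = {"IN": "go to Paris"}   # immediately preceding VP in (c)

# Option 1: privilege one source of information over the other.
print(priority_union(top_down, bottom_up)["IN"])
print(priority_union(bottom_up, top_down)["IN"])

# Option 2: keep both readings and leave the choice to other modules.
readings = {top_down["IN"], bottom_up["IN"]}
print(sorted(readings))
```

Whichever flow is given priority, one of the two readings is lost, whereas the second option keeps both candidates available for later filtering.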
A second problem concerns the definition
of discourse relations and of
equivalence classes over discourse relations. Here it 
is perhaps worth stressing that although logical con- 
nectives have been used throughout the paper to rep- 
resent discourse relations, these are definitely not a 
sufficient means of characterization. As a simple case 
in point, consider a natural language discourse of the 
form P so Q. In section 3.2, such a discourse is trans- 
lated as p A q (where p and q represent the proposi- 
tional content of the natural language discourses P 
and Q respectively). Clearly this translation does 
not exhaust the meaning of the discourse connective 
so: for instance, the causal link between p and q is 
not accounted for. More generally, it is clear that 
much work remains to be done on the semantics of 
discourse relations before the present analysis of mul- 
tiple VPE resolution can be adequately tested. 
Finally, a third question involves the interaction 
of discourse grammar with anaphora resolution in 
general. As already mentioned, the resolution of 
most types of anaphora can be argued to be influenced
by discourse structure. It would be interesting
to investigate how far the various mechanisms
developed to express this constraint are compatible.
More specifically, it would be interesting to see
whether the discourse grammar sketched in section 
2 could be made to account for the complex interac- 
tion of VPE with other anaphoric phenomena such 
as strict/sloppy identity, pronominal and temporal 
anaphora. 
8 Conclusion 
A model has been proposed of how discourse struc- 
ture influences multiple VPE resolution. However, 
the suggestion is that the analysis generalises to all 
cases of VPE, that is, that discourse structure is one 
of the main factors determining VPE resolution in 
general. In this sense, the analysis proposed here fits 
well with one of the mainstream ideas in discourse
theory, which is that discourse structure constrains 
anaphora resolution. It should also be pointed out 
that this analysis includes a treatment of parallelism 
similar to that developed in [Asher forthcoming] and
is as such likely to be compatible with the treatment 
of sloppy/strict ambiguity proposed there. 
The model proposed is characterised by two main
properties: reversibility and monostratality. It is 
reversible because it is characterised in a purely 
declarative manner. Note in particular that the def- 
inition of structural identity is entirely independent 
of any notion of processing and is as such strictly 
declarative. In practical terms, this means that this 
model can be used both for analysis and for generation.
Monostratality (i.e. the fact that different
levels of linguistic information can be stated within a 
category) is another important aspect of the model 
in that it allows for different knowledge sources to 
interact in determining VPE acceptability and res- 
olution. A typical example of this interaction is in- 
volved in the treatment of cases of multiple VPE 
involving semantically equivalent wffs: in such cases, 
syntax often interacts with discourse information to 
determine the correct resolution. More generally, it 
can be argued that VPE is a phenomenon which 
simultaneously involves phonology, syntax, semantics
and discourse (cf. [Lappin and McCord 1990],
[Gardent 1991]). The present model allows for such
a simultaneous interaction and thus improves on se- 
rial models of VPE resolution (i.e. models where the 
various levels of linguistic information interact in a 
serial rather than a simultaneous fashion) such as 
[Webber 1978].
The model described in this paper has been implemented
in SICStus Prolog and runs on a SUN
4 computer. It has been tested in analysis as well as
in generation mode.
Acknowledgements: I would like to thank Martin
van den Berg, Patrick Blackburn, Remko Scha
and Henk Zeevat for many helpful comments and 
suggestions. 
References 
[Asher forthcoming] Asher, N.: forthcoming, Reference
to abstract objects in English: a philosophical
semantics for Natural Language metaphysics.
Book ms.
[Elhadad and McKeown 1990] Elhadad, M. and
McKeown, K.R.: 1990, Generating connectives.
Proceedings of COLING-90, Helsinki.
[Gardent 1990] Gardent, C.: 1990, Dynamic Semantics
and VP Ellipsis. In Proceedings of the European
Workshop on Logics for Artificial Intelligence,
J. van Eijck (ed.), Amsterdam.
[Gardent 1991] Gardent, C.: 1991, Gapping and VP
Ellipsis in a Unification-Based Grammar. PhD
thesis, University of Edinburgh.
[Grosz and Sidner 1986] Grosz, B. and Sidner, C.:
1986, Attention, Intentions and the Structure
of Discourse. Computational Linguistics, 12(3),
July-September 1986, 175-204.
[Klein and Stainton-Ellis 1989] Klein, E. and
Stainton-Ellis, K.: 1989, A note on multiple VP
ellipsis. Centre for Cognitive Science, University
of Edinburgh, Research Paper EUCCS/RP-30.
[Lascarides and Asher 1991] Lascarides, A. and
Asher, N.: 1991, Discourse relations and defeasible
knowledge. Proceedings of the 29th Annual Meeting
of the Association for Computational Linguistics,
55-63.
[Nakhimovsky 1988] Nakhimovsky, A.: 1988, Aspect,
aspectual class and the temporal structure
of narrative. Computational Linguistics, 14(2),
29-43.
[Polanyi and Scha 1984] Polanyi, L. and Scha, R.:
1984, A syntactic approach to discourse semantics.
Proceedings of the 10th International Conference
on Computational Linguistics and the
22nd Annual Meeting of the Association for
Computational Linguistics, Stanford University,
413-419.
[Prüst and Scha 1990] Prüst, H. and Scha, R.: 1990,
A discourse approach to Verb Phrase Anaphora.
Proceedings of ECAI.
[Sag 1980] Sag, I.A.: 1980, Deletion and Logical
Form. New York and London: Garland Publishing.
[Lappin and McCord 1990] Lappin, S. and McCord,
M.: 1990, Anaphora Resolution in Slot Grammar.
Computational Linguistics, 16(4).
[Stainton-Ellis 1988] Stainton-Ellis, C.S.: 1988, A
processing perspective on Verb Phrase Ellipsis.
MPhil dissertation, University of Edinburgh.
[Webber 1978] Webber, B.: 1978, A formal approach
to discourse anaphora. PhD thesis, Harvard
University.
[Webber 1990] Webber, B.: 1990, Structure and ostension
in the interpretation of discourse deixis.
To appear in Language and Cognitive Processes,
1991. Research report MS-CIS-90-58, University
of Pennsylvania, Philadelphia.
