Resolving Ellipsis in Clarification
Jonathan Ginzburg
Dept of Computer Science
King’s College, London
The Strand, London WC2R 2LS
UK
ginzburg@dcs.kcl.ac.uk
Robin Cooper
Dept of Linguistics
Göteborg University
Box 200, 405 30 Göteborg,
Sweden
cooper@ling.gu.se
Abstract
We offer a computational analysis of
the resolution of ellipsis in certain cases
of dialogue clarification. We show that
this goes beyond standard techniques
used in anaphora and ellipsis resolu-
tion and requires operations on highly
structured, linguistically heterogeneous
representations. We characterize these
operations and the representations on
which they operate. We offer an analy-
sis couched in a version of Head-Driven
Phrase Structure Grammar combined
with a theory of information states (IS)
in dialogue. We sketch an algorithm for
the process of utterance integration in
ISs which leads to grounding or clarifi-
cation.
1 Introduction
Instances of clarification ellipsis (CE), nonsentential
elliptical queries such as (1a(i),(ii)), are commonplace
in human conversation. Two common
readings/understandings of CE are exemplified in
(1b,c): the clausal reading is commonly used sim-
ply to confirm the content of a particular subutter-
ance. The main function of the constituent read-
ing is to elicit an alternative description or osten-
sion to the content (referent or predicate etc) in-
tended by the original speaker of the reprised sub-
utterance.
(1) a. A: Did Bo finagle a raise?
B: (i) Bo?/ (ii) finagle?
b. Clausal reading: Are you asking if
BO (of all people) finagled a raise/Bo
FINAGLED a raise (of all actions)?
c. Constituent reading: Who is
Bo?/What does it mean to finagle?
The issue of whether CE involves an ambi-
guity or is simply vague is an important one.1,2
Clearly, pragmatic reasoning plays an important
role in understanding CEs. Some considerations
do, nonetheless, favour the existence of an ambi-
guity. First, the BNC provides numerous exam-
ples of misunderstandings concerning CE inter-
pretation,3 where a speaker intends one reading,
is misunderstood, and clarifies his original inter-
pretation:
(2) a. A: ... you always had er er say every foot
he had with a piece of spunyarn in the
wire/B: Spunyarn?/A: Spunyarn, yes/B:
What’s spunyarn?
b. A: Have a laugh and joke with Dick./
B: Dick?/A: Have a laugh and joke with
Dick./B: Who’s Dick?
1An anonymous ACL reviewer proposed to us that all CE
could be analyzed in terms of a single reading along the lines
of “I thought I heard you say Bo, and I don’t know why you
would do so?”.
2Closely related to this issue is the issue of what other
readings/understandings CE exhibits. We defer discussion
of the latter issue to (Purver et al., 2001), which provides a
detailed analysis of the frequency of CEs and their under-
standings among clarification utterances in the British Na-
tional Corpus (BNC).
3This confirms our (non-instrumentally tested) impres-
sion that these understandings are not on the whole disam-
biguated intonationally. All our CE data from the BNC was
found using SCoRE, Matt Purver’s dialogue oriented BNC
search engine (Purver, 2001).
More crucially, the clausal and constituent
readings involve distinct syntactic and phonological
parallelism conditions. The constituent reading
seems actually to require phonological identity.
With the resolution associated with clausal
readings, there is no such requirement. However,
partial syntactic parallelism does obtain: an
XP used to clarify an antecedent sub-utterance u_1
must match u_1 categorially, though there is no
requirement of phonological identity:
(3) a. A: I phoned him. B: him? / #he?
b. A: Did he adore the book? B: adore? /
#adored?
c. A: We’re leaving? B: You?
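The two parallelism conditions can be made concrete with a minimal sketch. It assumes, purely for illustration, that a sub-utterance is reduced to a phonology string and a category label; the class `SubUtterance`, the two predicate names, and the category labels such as `NP[acc]` are hypothetical and not part of the analysis itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubUtterance:
    phon: str  # segmental phonology, e.g. "him"
    cat: str   # category label (assumed notation), e.g. "NP[acc]"


def licenses_clausal(fragment: SubUtterance, antecedent: SubUtterance) -> bool:
    """Clausal reading: categories must match; phonology is free."""
    return fragment.cat == antecedent.cat


def licenses_constituent(fragment: SubUtterance, antecedent: SubUtterance) -> bool:
    """Constituent reading: the fragment must repeat the antecedent's phonology."""
    return fragment.phon.lower() == antecedent.phon.lower()


# The sub-utterances of (3a) and (3c), with assumed case labels.
him = SubUtterance("him", "NP[acc]")
he = SubUtterance("he", "NP[nom]")
we = SubUtterance("we", "NP[nom]")
you = SubUtterance("you", "NP[nom]")
```

Under these assumptions, `licenses_clausal(he, him)` fails, mirroring #he? in (3a), while `licenses_clausal(you, we)` succeeds although the two phrases differ phonologically, as in (3c).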
We are used to systems that will confirm the
user’s utterances by repeating part of them. These
presuppose no sophisticated linguistic analysis.
However, it is not usual for a system to be able to
process CEs produced by the user. It would be a
great advantage in negotiative dialogues, where,
for example, the system and the user might be
discussing several options and the system may
make alternative suggestions, for a system to be
able to recognize and interpret a CE. Consider
the following (constructed) dialogue in the route-
planning domain:
(4) Sys: Would you like to make that trip via
Malvern? User: Malvern?
At this point the system has to consider a number
of possible interpretations for the user’s utterance,
all of which involve recognizing that this is a
clarification request concerning the system’s last
utterance.
Appropriate responses might be (5a-c); the sys-
tem should definitely not say (5d), as it might if it
does not recognize that the user is trying to clarify
its previous utterance.
(5) a. Yes, Malvern
b. Malvern – M-A-L-V-E-R-N
c. Going via Malvern is the quickest
route
d. So, you would like to make that trip
via Malvern instead of Malvern?
In this paper we examine the interpretation
of CEs. CE is a singularly complex ellip-
sis/anaphoric phenomenon which cannot be han-
dled by standard techniques such as first order
unification (as anaphora often is) or by higher or-
der unification (HOU) on logical forms (see e.g.
(Pulman, 1997)). For a start, in order to cap-
ture the syntactic and phonological parallelism
exemplified in (3), logical forms are simply in-
sufficient. Moreover, although an HOU account
could, given a theory of dialogue that structures
context appropriately, generate the clausal read-
ing, the constituent reading cannot be so gener-
ated. Clark (e.g. (Clark, 1996)) initiated work
on the grounding of an utterance (for computa-
tional and formal work see e.g. (Traum, 1994;
Poesio and Traum, 1997)). However, existing
work, while spelling out in great detail what updates
arise in an IS as a result of grounding, does not
offer a characterization of the clarification possibilities
spawned by a given utterance. A sketch
of such a characterization is provided in this paper.
On the basis of this we offer an analysis
of CE, integrated into a large existing grammar
framework, Head-Driven Phrase Structure Gram-
mar (HPSG) (specifically the version developed
in (Ginzburg and Sag, 2000)). We start by infor-
mally describing the grounding/clarification pro-
cesses and the representations on which they op-
erate. We then provide the requisite background
on HPSG and on the KOS framework (Ginzburg,
1996; Bohlin et al., 1999), in which our analy-
sis of ISs is couched. We sketch an algorithm for
the process of utterance integration which leads to
grounding or clarification. Finally, we formalize
the operations which underpin clarification and
sketch a grammatical analysis of CE.
2 Utterance Representation: grounding
and clarification
We start by offering an informal description of
how an utterance u such as (6) can get grounded
or spawn a clarification by an addressee B:
(6) A: Did Bo leave?
A is attempting to convey to B her question
whether the property she has referred to with her
utterance of leave holds of the person she has
referred to with the name Bo. B is required to
try to find values for these references. Finding
values is, with an important caveat, a necessary
condition for B to ground A’s utterance, thereby
signalling that its content has been integrated in
B’s IS.4 Modelling this condition for successful
grounding provides one obvious constraint on
the representation of utterance types: such a
representation must involve a function from, or a
λ-abstract over, a set of certain parameters (the
contextual parameters) to contents. This much is
familiar already from early work on context
dependence by (Montague, 1974) et seq. What happens
when B cannot instantiate a contextual parameter i
in his IS, or is at least uncertain as to how he should
do so? In such a case B needs to do at least
the following: (1) perform a partial update of the
existing context with the successfully processed
components of the utterance; (2) pose a clarification
question that involves reference to the
sub-utterance u_i from which i emanates. Since the
original speaker, A, can coherently integrate a
clarification question once she hears it, it follows
that, for a given utterance, there is a predictable
range of ⟨partial updates + consequent clarification
questions⟩. These we take to be specified by
a set of coercion operations on utterance
representations.5 Indeed we assume that a component
of dialogue competence is knowledge of these
coercion operations.
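The idea that an utterance type is a function from contextual parameters to contents, and that grounding requires finding values for those parameters, can be sketched as follows. This is only an illustration under simplifying assumptions: contents are plain strings, B's context is a dictionary, and the names `try_ground`, `did_bo_leave`, and `PARAMS` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Assignment = Dict[str, str]
Meaning = Callable[[Assignment], str]

# Contextual parameters of (6), following the informal description above.
PARAMS: List[str] = ["bo", "leave"]


def did_bo_leave(a: Assignment) -> str:
    """Content of (6) once its contextual parameters are instantiated."""
    return f"ask({a['leave']}({a['bo']}))"


def try_ground(
    params: List[str], meaning: Meaning, context: Assignment
) -> Tuple[Optional[str], Assignment, List[str]]:
    """Return (content, partial assignment, parameters still needing clarification)."""
    found = {p: context[p] for p in params if p in context}
    missing = [p for p in params if p not in context]
    if missing:
        # Only a partial update is possible; each missing parameter
        # is a candidate target for a clarification question.
        return None, found, missing
    return meaning(found), found, missing
```

If B's context supplies both values, the content is returned and the utterance can be grounded; if the value for `bo` is missing, the sketch returns the partial assignment together with the problematic parameter, which is the input a coercion operation would need.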
CE gives us some indication concerning both
the input and required output of these operations.
One such operation, which we will refer to as
parameter identification, essentially involves as
output a question paraphrasable as what is the
intended reference of sub-utterance u_i?. The partially
updated context in which such a clarification
takes place is such that simply repeating the
segmental phonology of u_i using rising intonation
enables that question to be expressed. Another
existent coercion operation is one which we
will refer to as parameter focussing. This
involves a (partially updated) context in which the
issue under discussion is a question that arises by
instantiating all contextual parameters except for
i and abstracting over i. In such a context, one
4The caveat is, of course, that the necessity is goal driven.
Relative to certain goals, one might decide simply to existen-
tially quantify the problematic referent. For this operation on
meanings see (Cooper, 1998). We cannot enter here into a
discussion of how to integrate the view developed here in a
plan based view of understanding, but see (Ginzburg, (forth-
coming)) for this.
5The term coercion operation is inspired by work on ut-
terance representation within a type theoretic framework re-
ported in (Cooper, 1998).
can confirm that i gets the value B suspects it has
by uttering with rising intonation any apparently
co-referential phrase whose syntactic category is
identical to u_1’s.
From this discussion, it becomes clear that co-
ercion operations (and by extension the ground-
ing process) cannot be defined simply on mean-
ings. Rather, given the syntactic and phonologi-
cal parallelism encoded in clarification contexts,
these operations need to be defined on repre-
sentations that encode in parallel for each sub-
utterance down to the word level phonological,
syntactic, semantic, and contextual information.
With some minor modifications, signs as con-
ceived in HPSG are exactly such a representa-
tional format and, hence, we will use them to de-
fine coercion operations.6 More precisely, given
that an addressee might not be able to come up
with a unique or a complete parse, due to lexi-
cal ignorance or a noisy environment, we need to
utilize some ‘underspecified’ entity (see e.g. (Mil-
ward, 2000)). For simplicity we will use descrip-
tions of signs. An example of the format for signs
we employ is given in (7):7
6We make two minor modifications to the version of
HPSG described in (Ginzburg and Sag, 2000). First, we
revamp the existing treatment of the feature C-INDICES. This
will now encode the entire inventory of contextual parameters
of an utterance (proper names, deictic pronouns, indexicals),
not merely information about speaker/hearer/utterance-time,
as standardly. Indeed, in principle, relation names
should also be included, since they vary with context and are
subject to clarification as well. Such a step involves a significant
change to how argument roles are handled in existing
HPSG. Hence, we do not make such a move here. This modification
of C-INDICES will allow signs to play a role akin to
the role associated with ‘meanings’, i.e. to function as abstracts
with roles that need to be instantiated. The second
modification we make concerns the encoding of phrasal constituency.
Standardly, the feature DTRS is used to encode immediate
phrasal constituency. To facilitate statement of coercion
operations, we need access to all phrasal constituents,
given that contextual parameters emanating from deeply
embedded constituents are as clarifiable as those emanating
from immediate constituents. We posit a set valued feature
CONSTIT(UENT)S whose value is the set of all constituents,
immediate or otherwise, of a given sign (cf. the mother-daughter
predicates used in (Gregory and Lappin, 1999)). In fact, having posited
CONSTITS one could eliminate DTRS by making the
value of CONSTITS a set of sets whose first level elements
are the immediate constituents. For current purposes, we
stick with tradition and tolerate the redundancy of both DTRS
and CONSTITS.
7Within the phrasal type system of (Ginzburg and Sag,
2000) root-cl constitutes the ‘start’ symbol of the grammar.
In particular, phrases of this type have as their content an
illocutionary operator embedding the appropriate semantic
(7) [ root-cl
      PHON       did bo leave
      CAT        V[+fin]
      C-INDICES  { [1], [2], [3], i, j }
      CONT       [ ASK-REL
                   ASKER    i
                   ASKED    j
                   MSG-ARG  [ question
                              PARAMS    { }
                              PROP|SOA  [ leave-rel
                                          AGT   [1]
                                          TIME  [2] ] ] ]
      CTXT|BCKGRD  { utt-time([3]), precede([2],[3]), named(bo)([1]) }
      CONSTITS     { [4][PHON Did], [5][PHON Bo], [6][PHON leave],
                     [7][PHON Did Bo leave] } ]
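A sign of this kind can be approximated by a small record type. The sketch below is a deliberately reduced illustration, not the HPSG feature geometry: it keeps only PHON, CAT, the contextual index contributed by a constituent, C-INDICES, and a flat CONSTITS list. The names `Sign` and `source_of` are hypothetical, the category labels on the daughters are assumptions, and the root sign is left out of its own CONSTITS list to avoid a cyclic structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sign:
    phon: str
    cat: str
    cont: Optional[str] = None  # contextual index contributed by this sign, if any
    c_indices: List[str] = field(default_factory=list)
    constits: List["Sign"] = field(default_factory=list)  # all constituents, immediate or not


def source_of(sign: Sign, parameter: str) -> Optional[Sign]:
    """Find the sub-utterance from which a contextual parameter emanates."""
    for constituent in sign.constits:
        if constituent.cont == parameter:
            return constituent
    return None


# A reduced counterpart of (7); the daughters' CAT values are assumed.
did = Sign("did", "V[+fin, +aux]")
bo = Sign("bo", "NP", cont="1")
leave = Sign("leave", "V[base]")
root = Sign(
    "did bo leave",
    "V[+fin]",
    c_indices=["1", "2", "3", "i", "j"],
    constits=[did, bo, leave],
)
```

Because CONSTITS is flat, `source_of(root, "1")` reaches the sub-utterance Bo directly, however deeply it might be embedded in the phrase structure.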
Before we can explain how these representa-
tions can feature in dialogue reasoning and the
resolution of CE, we need to sketch briefly the
approach to dialogue ellipsis that we assume.
3 Contextual evolution and ellipsis
We adopt the situation semantics based theory
of dialogue context developed in the KOS frame-
work (Ginzburg, 1996; Ginzburg, (forthcoming);
Bohlin et al., 1999). The common ground com-
ponent of ISs is assumed to be structured as fol-
lows:8
(8) [ FACTS        set of facts
      LATEST-MOVE  (illocutionary) fact
      QUD          p.o. set of questions ]
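As a rough illustration of (8), the common ground can be held in a three-field record. The sketch makes two simplifying assumptions that are not part of the KOS framework itself: facts, moves, and questions are plain strings, and the partial order on QUD is collapsed into a list whose last element is taken to be maximal. `InformationState` and `raise_question` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class InformationState:
    facts: Set[str] = field(default_factory=set)
    latest_move: Optional[str] = None
    qud: List[str] = field(default_factory=list)  # assumed: last element is maximal

    @property
    def max_qud(self) -> Optional[str]:
        """The maximal question under discussion, if there is one."""
        return self.qud[-1] if self.qud else None

    def raise_question(self, question: str) -> None:
        """Record a query move and make its question QUD-maximal."""
        self.latest_move = f"ask({question})"
        self.qud.append(question)
```

A full model would need a genuine partial order and the update rules for assertions and answers, which this sketch leaves out.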
In (Ginzburg and Sag, 2000) this framework is
integrated into HPSG (Pollard and Sag, 1994);
(Ginzburg and Sag, 2000) define two new at-
tributes within the CONTEXT (CTXT) feature
structure: Maximal Question Under Discussion
(MAX-QUD), whose value is of sort question;9
object (an assertion embedding a proposition, a query embedding
a question etc.). Here and throughout we omit various
features (e.g. STORE, SLASH etc.) that have no bearing on
current issues wherever possible.
8Here FACTS corresponds to the set of commonly accepted
assumptions; QUD (‘questions under discussion’) is
a set consisting of the currently discussable questions, partially
ordered by ≻ (‘takes conversational precedence’);
LATEST-MOVE represents information about the content
and structure of the most recent accepted illocutionary move.
9Questions are represented as semantic objects compris-
ing a set of parameters—empty for a polar question—and a
and Salient Utterance (SAL-UTT), whose value
is a set (singleton or empty) of elements of type
sign. In information structure terms, SAL-UTT
can be thought of as a means of underspecifying
the subsequent focal (sub)utterance or as a poten-
tial parallel element. MAX-QUD corresponds to
the ground of the dialogue at a given point. Since
SAL-UTT is a sign, it enables one to encode syn-
tactic categorial parallelism and, as we will see
below, also phonological parallelism. SAL-UTT
is computed as the (sub)utterance associated with
the role bearing widest scope within MAX-QUD.10
Below, we will show how to extend this account
of parallelism to clarification queries.
To account for elliptical constructions such as
short answers and sluicing, Ginzburg and Sag
posit a phrasal type headed-fragment-phrase (hd-
frag-ph)—a subtype of hd-only-ph—governed by
the constraint in (9). The various fragments ana-
lyzed here will be subtypes of hd-frag-ph or else
will contain such a phrase as a head daughter.11
(9) [ HEAD          v
      CTXT|SAL-UTT  [ CAT         [1]
                      CONT|INDEX  [2] ]
      HD-DTR        [ CAT         [1][HEAD nominal]
                      CONT|INDEX  [2] ] ]
This constraint coindexes the head daughter
with the SAL-UTT. This will have the effect of
‘unifying in’ the content of the former into a con-
textually provided content. A subtype of hd-frag-
ph relevant to the current paper is (decl-frag-cl)—
also a subtype of decl-cl—used to analyze short
answers:
proposition. This is the feature structure counterpart of the
λ-abstract λπ(. . . π . . .).
10For Wh-questions, SAL-UTT is the wh-phrase associated
with the PARAMS set of the question; otherwise, its possible
values are either the empty set or the utterance associated
with the widest scoping quantifier in MAX-QUD.
11In the (Ginzburg and Sag, 2000) version of HPSG infor-
mation about phrases is encoded by cross-classifying them
in a multi-dimensional type hierarchy. Phrases are classi-
fied not only in terms of their phrase structure schema or
X-bar type, but also with respect to a further informational
dimension of CLAUSALITY. Clauses are divided into inter
alia declarative clauses (decl-cl), which denote propositions,
and interrogative clauses (inter-cl) denoting questions. Each
maximal phrasal type inherits from both these dimensions.
This classification allows specification of systematic corre-
lations between clausal construction types and types of se-
mantic content.
(10) [ STORE    Σ_1
       CONT     [ proposition
                  SIT  [2]
                  SOA  [ QUANTS  order(Σ_2) ⊕ A
                         NUCL    [5] ] ]
       MAX-QUD  [ question
                  PARAMS  neset
                  PROP    [ proposition
                            SIT  [2]
                            SOA  [ QUANTS  A
                                   NUCL    [5] ] ] ]
       HD-DTR|STORE  Σ_2 ⊎ Σ_1   set(param) ]
The content of this phrasal type is a proposition:
whereas in most headed clauses the content is en-
tirely (or primarily) derived from the head daugh-
ter, here it is constructed for the most part from
the contextually salient question. This provides
the concerned situation and the nucleus, whereas
if the fragment is (or contains) a quantifier, that
quantifier must outscope any quantifiers already
present in the contextually salient question.
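The way a short answer draws its content from the contextually salient question can be illustrated with a minimal sketch. It assumes a flat tuple representation of propositions and replaces the coindexation of (9) and (10) by simple substitution; quantifier scope and STORE are ignored. `Question` and `short_answer` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Question:
    params: Tuple[str, ...]  # empty for a polar question
    prop: Tuple[str, ...]    # relation name followed by its arguments


def short_answer(
    max_qud: Question, sal_utt_index: str, fragment_index: str
) -> Tuple[str, ...]:
    """Build the proposition a fragment expresses relative to MAX-QUD."""
    if sal_utt_index not in max_qud.params:
        raise ValueError("SAL-UTT must correspond to a parameter of MAX-QUD")
    # The nucleus comes from the question; the fragment only fills the
    # role that SAL-UTT is associated with.
    return tuple(
        fragment_index if term == sal_utt_index else term for term in max_qud.prop
    )


# "Who left?" answered by the fragment "Bo".
who_left = Question(params=("x",), prop=("leave", "x"))
```

Here `short_answer(who_left, "x", "bo")` yields `("leave", "bo")`, and a polar question, whose PARAMS set is empty, is rejected, in line with the `neset` requirement in (10).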
4 Integrating Utterances in Information
States
Before we turn to formalizing the coercion opera-
tions and describing CE, we need to explain how
on our view utterances get integrated in an agent’s
IS. The basic protocol we assume is given in (11)
below.12
(11) Utterance processing protocol
For an agent B with IS I: if an utterance u is maximal in
PENDING:
(a) Try to:
  (1) find an assignment f in I for σ, where σ is the (maximal
      description available for the) sign associated with u;
  (2) update LATEST-MOVE with u:
      1. If LATEST-MOVE is grounded, then FACTS :=
         FACTS + LATEST-MOVE;
      2. LATEST-MOVE := ⟨σ, f⟩;
  (3) react to content(u) according to querying/assertion protocols;
  (4) if successful, remove u from PENDING.
(b) Else: repeat from stage (a) with MAX-QUD
and SAL-UTT obtaining the various values of
coe_i(τ)|MAX-QUD/SAL-UTT, where τ is the sign
associated with LATEST-MOVE and coe_i is one of the
available coercion operations.
(c) Else: make an utterance appropriate for a context such
that MAX-QUD and SAL-UTT get values according to the
specification in coe_i(u, σ), where coe_i is one of the available
coercion operations.
12In this protocol, PENDING is a stack whose elements
are (unintegrated) utterances.
The protocol involves the assumption that an
agent always initially tries to integrate an utter-
ance by assuming it constitutes an adjacency pair
with the existing LATEST-MOVE. If this route
is blocked somehow, because the current utter-
ance cannot be grounded or the putative resolu-
tion leads to incoherence, only then does she try
to repair by assuming the previous utterance is a
clarification generated in accordance with the existing
coercion operations. If that too fails,
she herself generates a clarification. Thus, the
prediction made by this protocol is that A will
tend to initially interpret (12(2)) as a response to
her question, not as a clarification:
(12) A(1): Who do you think is the only per-
son that admires Mary? B(2): Mary?
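The three-way fallback of the protocol in (11) can be sketched in a few lines. This is an illustration under strong assumptions: moves and questions are strings, the grammar is abstracted into a `resolve` callback that returns a content or `None`, and each coercion operation is a function from LATEST-MOVE to a new MAX-QUD. The names `State`, `integrate`, `Resolver`, and `Coercion` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class State:
    facts: List[str] = field(default_factory=list)
    latest_move: Optional[str] = None
    max_qud: Optional[str] = None
    pending: List[str] = field(default_factory=list)  # stack, top is last


Resolver = Callable[[str, Optional[str]], Optional[str]]  # (utterance, MAX-QUD) -> content
Coercion = Callable[[str], str]  # LATEST-MOVE -> coerced MAX-QUD


def integrate(state: State, resolve: Resolver, coercions: List[Coercion]) -> str:
    utterance = state.pending[-1]
    # (a) Read the utterance as forming an adjacency pair with LATEST-MOVE.
    content = resolve(utterance, state.max_qud)
    if content is not None:
        if state.latest_move is not None:
            state.facts.append(state.latest_move)  # LATEST-MOVE counts as grounded
        state.latest_move = content
        state.pending.pop()
        return "response"
    # (b) Retry with MAX-QUD supplied by coercing LATEST-MOVE; the earlier
    # move is not grounded here, so FACTS is left untouched.
    if state.latest_move is not None:
        for coerce in coercions:
            coerced_qud = coerce(state.latest_move)
            content = resolve(utterance, coerced_qud)
            if content is not None:
                state.max_qud = coerced_qud
                state.latest_move = content
                state.pending.pop()
                return "clarification"
    # (c) Nothing worked: the agent must pose a clarification herself.
    return "clarify"
```

The ordering of the branches encodes the prediction discussed above: an utterance such as (12(2)) is first tried as a response, and is only reanalysed as a clarification request if that reading cannot be obtained.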
5 Sign Coercion and an Analysis of CE
We now turn to formalizing the coercion op-
erations we specified informally in section 2.
The first operation we define is parameter fo-
cussing:
(13) parameter focussing i:
[ root-cl
  CTXT-INDICES  [1] { . . . i . . . }
  CONSTITS      { . . . [2][CONT i] . . . }
  CONTENT       [3] ]
⇒
[ CONTENT|MSG-ARG  [ question
                     PROP  [3] ]
  SAL-UTT  [2]
  MAX-QUD  [ question
             PARAMS  { i }
             PROP    [3] ] ]
This is to be understood as follows: given an utterance
(whose associated sign is one) which satisfies
the specification in the LHS of the rule, a
conversational participant (CP)
may respond with any utterance which satisfies
the specification in the RHS of the rule.13 More
specifically, the input of the rule singles out a
13The fact that both the LHS and the RHS of the rule are
of type root-cl ensures that the rule applies only to signs as-
sociated with complete utterances.
contextual parameter i, which is the content of an
element [2] of the daughter set of the utterance.
Intuitively, i is a parameter whose value is problematic
or lacking. The sub-utterance [2] is specified
to constitute the value of the feature SAL-UTT
associated with the context of the clarification
utterance cu_0. The descriptive content of cu_0 is
a question, any question whose open proposition
[3] (given in terms of the feature PROP) is identical
to the (uninstantiated) content of the clarified
utterance. MAX-QUD associated with the clarification
is fully specified as a question whose open
proposition is [3] and whose PARAMS set consists
of the ‘problematic’ parameter i.
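Read procedurally, parameter focussing maps a sign and a problematic parameter to a clarification context. The following sketch assumes string contents and a flat constituent list; `Constituent`, `Question`, `ClarificationContext`, and `parameter_focussing` are hypothetical names, and the content string used in the example is an ad hoc rendering of the content of (7).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Constituent:
    phon: str
    cont: Optional[str] = None  # contextual index contributed, if any


@dataclass(frozen=True)
class Question:
    params: Tuple[str, ...]
    prop: str


@dataclass(frozen=True)
class ClarificationContext:
    sal_utt: Constituent
    max_qud: Question


def parameter_focussing(
    c_indices: List[str], constits: List[Constituent], content: str, i: str
) -> ClarificationContext:
    """Coerce an utterance into a context that abstracts over parameter i."""
    if i not in c_indices:
        raise ValueError(f"{i!r} is not a contextual parameter of the utterance")
    for constituent in constits:
        if constituent.cont == i:
            # MAX-QUD: the utterance's own content with i as the sole
            # queried parameter; SAL-UTT: the sub-utterance contributing i.
            return ClarificationContext(constituent, Question((i,), content))
    raise ValueError(f"no constituent contributes parameter {i!r}")
```

Applied to a reduced version of (7) with `i = "1"`, the result has the sub-utterance Bo as SAL-UTT and a MAX-QUD whose open proposition is the content of the original query.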
We can exemplify the effect of parameter focussing
with respect to clarifying an utterance of
(7). The output this yields, when applied to Bo’s
index [1], is the partial specification in (14). Such
an utterance will have as its MAX-QUD a question
cq_0 paraphrasable as who_t, named Bo, are
you asking if t left, whereas its SAL-UTT is the
sub-utterance of Bo. The content is underspecified:
(14) [ CONT|MSG-ARG  [ question
                       PROP  [3] ]
       SAL-UTT  [5]
       MAX-QUD  [ question
                  PARAMS    { [1] }
                  PROP|SOA  [3][ ASK-REL
                                 ASKER    i
                                 ASKED    j
                                 MSG-ARG  [ question
                                            PARAMS    { }
                                            PROP|SOA  [ leave-rel
                                                        AGT   [1]
                                                        TIME  [2] ] ] ] ] ]
This (partial) specification allows for clarifica-
tion questions such as the following:
(15) a. Did WHO leave?
b. WHO?
c. BO? (= Are you asking if BO left?)
Given space constraints, we restrict ourselves
to explaining how the clausal CE, (15c), gets analyzed.
This involves direct application of the type
decl-frag-cl discussed above for short answers.
The QUD-maximality of cq_0 allows us to analyze
the fragment as a ‘short answer’ to cq_0, using
the type bare-decl-cl. And out of the proposition
which emerges courtesy of bare-decl-cl a (polar)
question is constructed using the type dir-is-int-cl.14
(16) S [ dir-is-int-cl
         CONT  [ question
                 PARAMS  { }
                 PROP    [3][ ask-rel
                              ASKER  i
                              ASKED  j
                              [ question
                                PARAMS    { }
                                PROP|SOA  [ leave-rel
                                            AGT   [1]
                                            TIME  [2] ] ] ] ] ]
       |
     S [ decl-frag-cl
         CONT  [3]
         CTXT  [ MAX-QUD  [ question
                            PARAMS  { [INDEX [2]] }
                            PROP    [3] ]
                 SAL-UTT  [ CAT         [7]
                            CONT|INDEX  [2] ] ] ]
       |
       [ CAT         [7] NP
         CONT|INDEX  [2] ]
       |
       Bo
The second coercion operation we discussed
previously is parameter identification: for a
given problematic contextual parameter its out-
put is a partial specification for a sign whose con-
tent and MAX-QUD involve a question querying
the content of that utterance parameter:
14The phrasal type dir-is-int-cl which constitutes the type
of the mother node in (16) is a type that inter alia enables a
polar question to be built from a head daughter whose con-
tent is propositional. See (Ginzburg and Sag, 2000) for de-
tails.
(17) parameter identification i:
[ root-cl
  CTXT-INDICES  { . . . i . . . }
  CONSTITS      { . . . [2][CONT i] . . . }
  . . . ]
⇒
[ CONTENT|MSG-ARG  [ question
                     PROP  [3] ]
  SAL-UTT  [2]
  MAX-QUD  [ question
             PARAMS  { [INDEX [4]] }
             PROP    [3][ SOA  [ content-rel
                                 SIGN  [2]
                                 CONT  [4] ] ] ] ]
To exemplify: when this operation is applied to
(7), it will yield as output the partial specification
in (18):
(18) [ CONT|MSG-ARG  [ question
                       PROP  [6] ]
       SAL-UTT  [5][ PHON         bo
                     CAT          NP
                     CONT|INDEX   i
                     CTXT|BCKGRD  { named(Bo)(i) } ]
       MAX-QUD  [ question
                  PARAMS  { [INDEX [8]] }
                  PROP    [6][ SOA  [ content-rel
                                      SIGN  [5]
                                      CONT  [8] ] ] ] ]
This specification allows for clarification
questions such as the following:
(19) a. Who do you mean BO?
b. WHO? (= who is Bo)
c. Bo? (= who is Bo)
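Parameter identification can be sketched in the same procedural style as a function from a sign and a problematic parameter to a clarification context whose MAX-QUD asks for the content of the sub-utterance itself. The representation is an assumption made for illustration: the open proposition is a tuple headed by `content-rel`, and the queried index is a fresh variable named `x` by default. All type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Constituent:
    phon: str
    cont: Optional[str] = None  # contextual index contributed, if any


@dataclass(frozen=True)
class Question:
    params: Tuple[str, ...]
    prop: Tuple[object, ...]


@dataclass(frozen=True)
class ClarificationContext:
    sal_utt: Constituent
    max_qud: Question


def parameter_identification(
    c_indices: List[str], constits: List[Constituent], i: str, fresh: str = "x"
) -> ClarificationContext:
    """Coerce an utterance into a context asking what a sub-utterance means."""
    if i not in c_indices:
        raise ValueError(f"{i!r} is not a contextual parameter of the utterance")
    for constituent in constits:
        if constituent.cont == i:
            # MAX-QUD: for which x does content-rel(sign, x) hold?
            question = Question((fresh,), ("content-rel", constituent, fresh))
            return ClarificationContext(constituent, question)
    raise ValueError(f"no constituent contributes parameter {i!r}")
```

Unlike parameter focussing, the resulting MAX-QUD makes no reference to the content of the original query; it mentions only the sign being clarified, which is why the phonology of that sign matters for readings such as (19c).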
We restrict attention to (19c), which is the most
interesting but also tricky example. The tricky
part arises from the fact that in a case such as this,
in contrast to all previous examples, the fragment
does not contribute its conventional content to the
clausal content. Rather, as we suggested earlier,
the semantic function of the fragment is merely
to serve as an anaphoric element to the phonologically
identical to-be-clarified sub-utterance.
The content derives entirely from MAX-QUD.
Such utterances can still be analyzed as subtypes
of head-frag-ph, though not as decl-frag-cl, the
short-answer/reprise sluice phrasal type we have
been appealing to extensively. Thus, we posit
constit(uent)-clar(ification)-int-cl, a new phrasal
subtype of head-frag-ph and of inter-cl which en-
capsulates the two idiosyncratic facets of such
utterances, namely the phonological parallelism
and the max-qud/content identity:
(20) [ CONT  [1]
       CTXT  [ MAX-QUD       [1]
               SAL-UTT|PHON  [2] ] ]  →  H[PHON [2]]
Given this, (19c) receives the following analy-
sis:
(21) [ constit-repr-int-cl
       CONT    [1][ question
                    PARAMS  { [2] }
                    PROP    content([3],[2]) ]
       CTXT    [ MAX-QUD  [1]
                 SAL-UTT  [3][ PHON  [5]
                               CAT   [4] NP ] ]
       HD-DTR  [ PHON  [5]
                 CAT   [4] ] ]
6 Summary and Future Work
In this paper we offered an analysis of the types of
representations needed to analyze CE, the requi-
site operations thereon, and how these update ISs
during grounding and clarification.
Systems which respond appropriately to CEs
in general will need a great deal of background
knowledge. Even choosing among the responses
in (5) might be a pretty knowledge intensive busi-
ness. However, there are some clear strategies
that might be pursued. For example, if Malvern
has been discussed previously in the dialogue and
understood then (5a,b) would not be appropriate
responses. In order to be able to build dialogue
systems that can handle even some restricted as-
pects of CEs we need to understand more about
what the possible interpretations are and this is
what we have attempted to do in this paper. We
are currently working on a system which inte-
grates SHARDS (see (Ginzburg et al., 2001), a
system which processes dialogue ellipses) with
GoDiS (see (Bohlin et al., 1999), a dialogue system
developed using TRINDIKIT, which makes
use of ISs modelled on those suggested in the KOS
framework). Our aim in the near future is to incorporate
simple aspects of negotiative dialogue
including CEs in a GoDiS-like system employing
SHARDS.
Acknowledgements
For very useful discussion and comments we
would like to thank Pat Healey, Howard Gre-
gory, Shalom Lappin, Dimitra Kolliakou, David
Milward, Matt Purver and three anonymous ACL
reviewers. We would also like to thank Matt
Purver for help in using SCoRE. Earlier versions
of this work were presented at colloquia at ITRI,
Brighton, Queen Mary and Westfield College,
London, and at the Computer Lab, Cambridge.
The research described here is funded by grant
number R00022269 from the Economic and So-
cial Research Council of the United Kingdom, by
INDI (Information Exchange in Dialogue), Riks-
bankens Jubileumsfond 1997-0134, and by grant
number GR/R04942/01 from the Engineering and
Physical Sciences Research Council of the United
Kingdom.
References
Peter Bohlin, Robin Cooper, Elisabet Engdahl, and
Staffan Larsson. 1999. Information states and di-
alogue move engines. Gothenburg Papers in Com-
putational Linguistics.
Herbert Clark. 1996. Using Language. Cambridge
University Press, Cambridge.
Robin Cooper. 1998. Mixing situation theory and
type theory to formalize information states in di-
alogue exchanges. In J. Hulstijn and A. Nijholt,
editors, Proceedings of TwenDial 98, 13th Twente
workshop on Language Technology. Twente Uni-
versity, Twente.
Jonathan Ginzburg and Ivan Sag. 2000. Interrogative
Investigations: the form, meaning and use of En-
glish Interrogatives. Number 123 in CSLI Lecture
Notes. CSLI Publications, Stanford: California.
Jonathan Ginzburg, Howard Gregory, and Shalom
Lappin. 2001. Shards: Fragment resolution in di-
alogue. In H. Bunt, editor, Proceedings of the 1st
International Workshop on Computational Seman-
tics. ITK, Tilburg University, Tilburg.
Jonathan Ginzburg. 1996. Interrogatives: Ques-
tions, facts, and dialogue. In Shalom Lappin, ed-
itor, Handbook of Contemporary Semantic Theory.
Blackwell, Oxford.
Jonathan Ginzburg. forthcoming. Semantics
and Interaction in Dialogue. CSLI Publi-
cations and Cambridge University Press, Stan-
ford: California. Draft chapters available from
http://www.dcs.kcl.ac.uk/staff/ginzburg.
Howard Gregory and Shalom Lappin. 1999. An-
tecedent contained ellipsis in HPSG. In Gert We-
belhuth, Jean Pierre Koenig, and Andreas Kathol,
editors, Lexical and Constructional Aspects of Lin-
guistic Explanation, pages 331–356. CSLI Publica-
tions, Stanford.
David Milward. 2000. Distributing representation for
robust interpretation of dialogue utterances. ACL.
Richard Montague. 1974. Pragmatics. In Rich-
mond Thomason, editor, Formal Philosophy. Yale
UP, New Haven.
Massimo Poesio and David Traum. 1997. Conversa-
tional actions and discourse situations. Computa-
tional Intelligence, 13:309–347.
Carl Pollard and Ivan Sag. 1994. Head Driven Phrase
Structure Grammar. University of Chicago Press
and CSLI, Chicago.
Stephen Pulman. 1997. Focus and higher order unifi-
cation. Linguistics and Philosophy, 20.
Matthew Purver, Jonathan Ginzburg, and Patrick
Healey. 2001. On the means for clarification in di-
alogue. Technical report, King’s College, London.
Matthew Purver. 2001. Score: Searching a corpus
for regular expressions. Technical report, King’s
College, London.
David Traum. 1994. A Computational Theory
of Grounding in Natural Language Conversations.
Ph.D. thesis, University of Rochester.
