<?xml version="1.0" standalone="yes"?>
<Paper uid="P01-1031">
  <Title>Resolving Ellipsis in Clarification</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Utterance Representation: grounding and clarification
</SectionTitle>
    <Paragraph position="0"> We start by offering an informal description of how an utterance u such as (6) can get grounded or spawn a clarification by an addressee B:  (6) A: Did Bo leave?  A is attempting to convey to B her question whether the property she has referred to with her utterance of leave holds of the person she has referred to with the name Bo. B is required to try and find values for these references. Finding values is, with an important caveat, a necessary condition for B to ground A's utterance, thereby signalling that its content has been integrated in B's IS.4 Modelling this condition for successful grounding provides one obvious constraint on the representation of utterance types: such a representation must involve a function from, or λ-abstract over, a set of certain parameters (the contextual parameters) to contents. This much is familiar already from early work on context dependence by (Montague, 1974) et seq. What happens when B cannot instantiate a contextual parameter i in his IS, or is at least uncertain as to how he should? In such a case B needs to do at least the following: (1) perform a partial update of the existing context with the successfully processed components of the utterance; (2) pose a clarification question that involves reference to the sub-utterance u_i from which i emanates. Since the original speaker, A, can coherently integrate a clarification question once she hears it, it follows that, for a given utterance, there is a predictable range of ⟨partial updates + consequent clarification questions⟩. These we take to be specified by a set of coercion operations on utterance representations.5 Indeed we assume that a component of dialogue competence is knowledge of these coercion operations.</Paragraph>
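    <Paragraph> The idea that an utterance type is a function from contextual parameters to contents, and that a missing value triggers partial update plus clarification, can be pictured with a minimal sketch. The class name Meaning, the string template and the parameter names "b" and "rel" are illustrative assumptions, not notation from the analysis itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meaning:
    """An utterance meaning: a content template abstracted over contextual parameters."""
    params: tuple    # names of the contextual parameters (hypothetical labels)
    template: str    # content with one named slot per parameter

    def instantiate(self, assignment):
        """Return (content, unresolved); content is None if any parameter lacks a value."""
        unresolved = [p for p in self.params if p not in assignment]
        if unresolved:
            return None, unresolved
        return self.template.format(**assignment), []


# Hypothetical encoding of (6) "Did Bo leave?"
did_bo_leave = Meaning(params=("b", "rel"), template="ask(A, ?{rel}({b}))")

# B can find values for both references: grounding can proceed.
content, missing = did_bo_leave.instantiate({"b": "bo_smith", "rel": "leave"})

# B cannot resolve "Bo": the unresolved parameter points at the sub-utterance to clarify.
partial, unresolved = did_bo_leave.instantiate({"rel": "leave"})
```

    In the second call only the parameter tied to the name Bo is left open, which is the situation in which B would clarify that sub-utterance rather than the whole utterance.</Paragraph>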
    <Paragraph position="1"> CE gives us some indication concerning both the input and required output of these operations. One such operation, which we will refer to as parameter identification, essentially involves as output a question paraphrasable as: what is the intended reference of sub-utterance u_i? The partially updated context in which such a clarification takes place is such that simply repeating the segmental phonology of u_i using rising intonation enables that question to be expressed. Another existent coercion operation is one which we will refer to as parameter focussing. This involves a (partially updated) context in which the issue under discussion is a question that arises by instantiating all contextual parameters except for  4 Relative to certain goals, one might decide simply to existentially quantify the problematic referent. For this operation on meanings see (Cooper, 1998). We cannot enter here into a discussion of how to integrate the view developed here in a plan based view of understanding, but see (Ginzburg, (forthcoming)) for this.</Paragraph>
    <Paragraph position="2"> 5The term coercion operation is inspired by work on utterance representation within a type theoretic framework reported in (Cooper, 1998).</Paragraph>
    <Paragraph position="3"> can confirm that i gets the value B suspects it has by uttering with rising intonation any apparently co-referential phrase whose syntactic category is identical to that of the sub-utterance being clarified.</Paragraph>
    <Paragraph position="4"> From this discussion, it becomes clear that coercion operations (and by extension the grounding process) cannot be defined simply on meanings. Rather, given the syntactic and phonological parallelism encoded in clarification contexts, these operations need to be defined on representations that encode in parallel for each sub-utterance down to the word level phonological, syntactic, semantic, and contextual information.</Paragraph>
    <Paragraph position="5"> With some minor modifications, signs as conceived in HPSG are exactly such a representational format and, hence, we will use them to define coercion operations.6 More precisely, given that an addressee might not be able to come up with a unique or a complete parse, due to lexical ignorance or a noisy environment, we need to utilize some 'underspecified' entity (see e.g. (Milward, 2000)). For simplicity we will use descriptions of signs. An example of the format for signs  HPSG described in (Ginzburg and Sag, 2000)). First, we revamp the existing treatment of the feature C-INDICES. This will now encode the entire inventory of contextual parameters of an utterance (proper names, deictic pronouns, indexicals), not merely information about speaker/hearer/utterance-time, as is standard. Indeed, in principle, relation names should also be included, since they vary with context and are subject to clarification as well. Such a step involves a significant change to how argument roles are handled in existing HPSG. Hence, we do not make such a move here. This modification of C-INDICES will allow signs to play a role akin to the role associated with 'meanings', i.e. to function as abstracts with roles that need to be instantiated. The second modification we make concerns the encoding of phrasal constituency. Standardly, the feature DTRS is used to encode immediate phrasal constituency. To facilitate statement of coercion operations, we need access to all phrasal constituents, given that contextual parameters emanating from deeply embedded constituents are as clarifiable as those of immediate constituents. We posit a set valued feature CONSTIT(UENT)S whose value is the set of all constituents, immediate or otherwise, of a given sign. (Cf. the mother-daughter predicates used in (Gregory and Lappin, 1999).)
In fact, having posited CONSTITS one could eliminate DTRS, by making the value of CONSTITS a set of sets whose first level elements are the immediate constituents. For current purposes, we stick with tradition and tolerate the redundancy of both DTRS and CONSTITS.</Paragraph>
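    <Paragraph> The relation between DTRS and CONSTITS can be sketched as a recursive closure over immediate daughters. This is only an illustration under stated assumptions: the Sign class, its field names and the toy parse of "did Bo leave" are hypothetical, and the sketch assumes that a sign is not a member of its own CONSTITS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sign:
    phon: str                            # phonological form
    cat: str                             # syntactic category
    c_indices: frozenset = frozenset()   # contextual parameters of the sign
    dtrs: tuple = ()                     # immediate constituents only (DTRS)


def constits(sign):
    """All constituents of a sign, immediate or otherwise (the sign itself excluded)."""
    found = set()
    for dtr in sign.dtrs:
        found.add(dtr)
        found |= constits(dtr)
    return found


# Toy parse: the verb "leave" is embedded inside a VP, so it is not an immediate daughter.
bo = Sign("Bo", "NP", frozenset({"b"}))
did = Sign("did", "V")
leave = Sign("leave", "V")
vp = Sign("leave", "VP", dtrs=(leave,))
s = Sign("did Bo leave", "S", frozenset({"b"}), dtrs=(did, bo, vp))

all_constituents = constits(s)
```

    Here the embedded verb is reachable through constits(s) although it is absent from s.dtrs, which is the access that a statement of coercion operations over arbitrary sub-utterances requires.</Paragraph>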
    <Paragraph position="6"> 7Within the phrasal type system of (Ginzburg and Sag, 2000) root-cl constitutes the 'start' symbol of the grammar. In particular, phrases of this type have as their content an illocutionary operator embedding the appropriate semantic  Before we can explain how these representations can feature in dialogue reasoning and the resolution of CE, we need to sketch briefly the approach to dialogue ellipsis that we assume.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Contextual evolution and ellipsis
</SectionTitle>
    <Paragraph position="0"> We adopt the situation semantics based theory of dialogue context developed in the KOS framework (Ginzburg, 1996; Ginzburg, (forthcoming); Bohlin et al., 1999). The common ground component of ISs is assumed to be structured as follows:  In (Ginzburg and Sag, 2000) this framework is integrated into HPSG (Pollard and Sag, 1994); (Ginzburg and Sag, 2000) define two new attributes within the CONTEXT (CTXT) feature structure: Maximal Question Under Discussion (MAX-QUD), whose value is of sort question;9 object (an assertion embedding a proposition, a query embedding a question etc.). Here and throughout we omit various features (e.g. STORE, SLASH etc.) that have no bearing on current issues wherever possible.</Paragraph>
    <Paragraph position="1"> 8Here FACTS corresponds to the set of commonly accepted assumptions; QUD ('questions under discussion') is a set consisting of the currently discussable questions, partially ordered by ≺ ('takes conversational precedence'); LATEST-MOVE represents information about the content and structure of the most recent accepted illocutionary move. 9Questions are represented as semantic objects comprising a set of parameters--empty for a polar question--and a and Salient Utterance (SAL-UTT), whose value is a set (singleton or empty) of elements of type sign. In information structure terms, SAL-UTT can be thought of as a means of underspecifying the subsequent focal (sub)utterance or as a potential parallel element. MAX-QUD corresponds to the ground of the dialogue at a given point. Since SAL-UTT is a sign, it enables one to encode syntactic categorial parallelism and, as we will see below, also phonological parallelism. SAL-UTT is computed as the (sub)utterance associated with the role bearing widest scope within MAX-QUD.10 Below, we will show how to extend this account of parallelism to clarification queries.</Paragraph>
    <Paragraph position="2"> To account for elliptical constructions such as short answers and sluicing, Ginzburg and Sag posit a phrasal type headed-fragment-phrase (hd-frag-ph), a subtype of hd-only-ph, governed by the constraint in (9). The various fragments analyzed here will be subtypes of hd-frag-ph or else will contain such a phrase as a head daughter.11  This constraint coindexes the head daughter with the SAL-UTT. This will have the effect of 'unifying in' the content of the former into a contextually provided content. A subtype of hd-frag-ph relevant to the current paper is decl-frag-cl, also a subtype of decl-cl, used to analyze short answers: proposition. This is the feature structure counterpart of the</Paragraph>
    <Paragraph position="4"> with the PARAMS set of the question; otherwise, its possible values are either the empty set or the utterance associated with the widest scoping quantifier in MAX-QUD.</Paragraph>
    <Paragraph position="5"> 11In the (Ginzburg and Sag, 2000) version of HPSG information about phrases is encoded by cross-classifying them in a multi-dimensional type hierarchy. Phrases are classified not only in terms of their phrase structure schema or X-bar type, but also with respect to a further informational dimension of CLAUSALITY. Clauses are divided into inter alia declarative clauses (decl-cl), which denote propositions, and interrogative clauses (inter-cl) denoting questions. Each maximal phrasal type inherits from both these dimensions.</Paragraph>
    <Paragraph position="6"> This classification allows specification of systematic correlations between clausal construction types and types of semantic content.</Paragraph>
    <Paragraph position="7">  The content of this phrasal type is a proposition: whereas in most headed clauses the content is entirely (or primarily) derived from the head daughter, here it is constructed for the most part from the contextually salient question. This provides the concerned situation and the nucleus, whereas if the fragment is (or contains) a quantifier, that quantifier must outscope any quantifiers already present in the contextually salient question.</Paragraph>
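    <Paragraph> How a short answer draws most of its content from the contextually salient question can be sketched as follows. The dictionary layout, the role name "leaver" and the function name are assumptions made for illustration; quantifier scope is left out entirely.

```python
def resolve_short_answer(fragment, max_qud, sal_utt):
    """Build a proposition from MAX-QUD, with the fragment's content filling the open role."""
    if sal_utt is None or fragment["cat"] != sal_utt["cat"]:
        return None   # no categorial parallelism with SAL-UTT: fragment not licensed
    param = sal_utt["index"]
    if param not in max_qud["params"]:
        return None   # SAL-UTT is not tied to a parameter of the question
    prop = max_qud["prop"]
    # Coindexation: the fragment's content is unified in for the queried parameter.
    args = {role: (fragment["content"] if value == param else value)
            for role, value in prop["args"].items()}
    return {"rel": prop["rel"], "args": args}


# Hypothetical context after "Who left?"
who_left = {"params": {"x"}, "prop": {"rel": "leave", "args": {"leaver": "x"}}}
who = {"phon": "who", "cat": "NP", "index": "x"}

answer = resolve_short_answer({"phon": "Bo", "cat": "NP", "content": "bo"}, who_left, who)
mismatch = resolve_short_answer({"phon": "to Bo", "cat": "PP", "content": "bo"}, who_left, who)
```

    The relation and the argument structure come from MAX-QUD, while the fragment contributes only the filler, mirroring the description of the content of decl-frag-cl above.</Paragraph>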
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Integrating Utterances in Information
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
States
</SectionTitle>
      <Paragraph position="0"> Before we turn to formalizing the coercion operations and describing CE, we need to explain how on our view utterances get integrated in an agent's IS. The basic protocol we assume is given in (11):
(11) Utterance processing protocol. For an agent B with IS I: if an utterance u is Maximal in PENDING:
(a) Try to:
  (1) find an assignment f in I for σ, where σ is (the maximal description available for) the sign associated with u;
  (2) update LATEST-MOVE with u: 1. If LATEST-MOVE is grounded, then FACTS := FACTS + LATEST-MOVE; 2. LATEST-MOVE := ⟨σ, f⟩;
  (3) react to content(u) according to querying/assertion protocols;
  (4) if successful, u is removed from PENDING.
(b) Else: repeat from stage (a) with MAX-QUD and SAL-UTT obtaining the various values of coe_i(τ), where τ is the sign associated with LATEST-MOVE and coe_i is one of the available coercion operations.
12In this protocol, PENDING is a stack whose elements are (unintegrated) utterances.</Paragraph>
      <Paragraph position="1"> (c) Else: make an utterance appropriate for a context such that MAX-QUD and SAL-UTT get values according to the specification in coe_i(u, σ), where coe_i is one of the available coercion operations.</Paragraph>
      <Paragraph position="2"> The protocol involves the assumption that an agent always initially tries to integrate an utterance by assuming it constitutes an adjacency pair with the existing LATEST-MOVE. Only if this route is blocked, because the current utterance cannot be grounded or the putative resolution leads to incoherence, does she try to repair by assuming the current utterance is a clarification of the previous one, generated in accordance with the existing coercion operations. If that too fails, she herself generates a clarification. Thus, the prediction made by this protocol is that A will tend to initially interpret (12(2)) as a response to her question, not as a clarification: (12) A(1): Who do you think is the only person that admires Mary? B(2): Mary?</Paragraph>
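      <Paragraph> The three-way fallback of the protocol can be sketched in code. This is a simplified illustration, not the protocol itself: the function names, the dictionary-based IS, and the toy resolver and coercion are assumptions. Step (a)(3), reacting to the content, is omitted, and LATEST-MOVE is treated as grounded exactly when the new utterance is read as a response to it.

```python
def integrate(utterance, state, find_assignment, coercions):
    """Sketch of the three routes of protocol (11); returns a label for the route taken."""

    def try_ground(context, latest_is_grounded):
        # Stage (a): look for an assignment; on success update the IS.
        assignment = find_assignment(utterance, context)
        if assignment is None:
            return False
        if latest_is_grounded and state["latest_move"] is not None:
            state["facts"].append(state["latest_move"])
        state["latest_move"] = (utterance, assignment)
        state["pending"].pop()   # PENDING is a stack with the utterance on top
        return True

    # (a) Default: read the utterance against the current MAX-QUD and SAL-UTT.
    current = {"max_qud": state["max_qud"], "sal_utt": state["sal_utt"]}
    if try_ground(current, latest_is_grounded=True):
        return "integrated"
    # (b) Repair: reread it as a clarification of LATEST-MOVE under some coercion.
    if state["latest_move"] is not None:
        previous_sign = state["latest_move"][0]
        for coerce in coercions:
            if try_ground(coerce(previous_sign), latest_is_grounded=False):
                return "integrated as clarification"
    # (c) Otherwise the agent poses a clarification of her own.
    return "pose clarification"


def toy_find(utt, context):
    # Hypothetical resolver: succeeds only if the utterance fits the question under discussion.
    return {} if context["max_qud"] in utt["fits"] else None


def toy_identification(sign):
    # Hypothetical coercion: always targets the sub-utterance "Bo" of the previous sign.
    return {"max_qud": "who is Bo?", "sal_utt": "Bo"}


question = {"phon": "Did Bo leave?", "fits": set()}
reprise = {"phon": "Bo?", "fits": {"who is Bo?"}}
state = {"facts": [], "pending": [reprise], "latest_move": (question, {}),
         "max_qud": "did Bo leave?", "sal_utt": None}
route = integrate(reprise, state, toy_find, [toy_identification])
```

      In this toy run the reprise "Bo?" cannot be resolved against the current question, so route (a) fails and route (b) succeeds. An utterance that fits neither context falls through to route (c).</Paragraph>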
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Sign Coercion and an Analysis of CE
</SectionTitle>
    <Paragraph position="0"> We now turn to formalizing the coercion operations we specified informally in section 2.</Paragraph>
    <Paragraph position="1"> The first operation we define is parameter focussing:  This is to be understood as follows: given an utterance (whose associated sign is one) which satisfies the specification in the LHS of the rule, a CP may respond with any utterance which satisfies the specification in the RHS of the rule.13 More specifically, the input of the rule singles out a contextual parameter i, which is the content of an element [2] of the daughter set of the utterance.</Paragraph>
    <Paragraph position="2"> 13The fact that both the LHS and the RHS of the rule are of type root-cl ensures that the rule applies only to signs associated with complete utterances.</Paragraph>
    <Paragraph position="3"> Intuitively, i is a parameter whose value is problematic or lacking. The sub-utterance [2] is specified to constitute the value of the feature SAL-UTT associated with the context of the clarification utterance. The descriptive content of the clarification utterance is a question, any question whose open proposition [3] (given in terms of the feature PROP) is identical to the (uninstantiated) content of the clarified utterance. MAX-QUD associated with the clarification is fully specified as a question whose open proposition is [3] and whose PARAMS set consists of the 'problematic' parameter i.</Paragraph>
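    <Paragraph> The input and output of parameter focussing, as just described, can be sketched as a function from a sign and one of its sub-utterances to a partial context specification. The dictionary keys and the toy sign for "Did Bo leave?" are hypothetical; the content of the clarification itself is left unspecified, as in the prose.

```python
def parameter_focussing(utterance, sub_utt):
    """Return the partial context licensed for clarifying sub_utt within utterance."""
    if sub_utt not in utterance["constits"]:
        raise ValueError("sub-utterance is not a constituent of the utterance")
    param = sub_utt["content"]
    if param not in utterance["c_indices"]:
        raise ValueError("constituent does not contribute a contextual parameter")
    return {
        # SAL-UTT: the sub-utterance whose parameter is problematic
        "sal_utt": sub_utt,
        # MAX-QUD: the clarified content as open proposition, abstracting over the parameter
        "max_qud": {"params": {param}, "prop": utterance["content"]},
    }


# Toy sign for "Did Bo leave?" with a single contextual parameter b
bo = {"phon": "Bo", "cat": "NP", "content": "b"}
did_bo_leave = {
    "phon": "Did Bo leave?",
    "content": "ask(A, ?leave(b))",
    "c_indices": {"b"},
    "constits": [bo],
}

spec = parameter_focussing(did_bo_leave, bo)
```

    The resulting MAX-QUD keeps the uninstantiated content of the clarified utterance and queries only the parameter contributed by "Bo", which also serves as SAL-UTT.</Paragraph>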
    <Paragraph position="4"> We can exemplify the effect of parameter focussing with respect to clarifying an utterance of (7). The output this yields, when applied to Bo's index [1], is the partial specification in (14). Such an utterance will have as its MAX-QUD a question cq paraphrasable as who, named Bo, are you asking if t left, whereas its SAL-UTT is the sub-utterance of Bo. The content is underspecified:  This (partial) specification allows for clarification questions such as the following:  (15) a. Did WHO leave? b. WHO? c. BO? (= Are you asking if BO left?)  Given space constraints, we restrict ourselves to explaining how the clausal CE, (15c), gets analyzed. This involves direct application of the type decl-frag-cl discussed above for short answers.</Paragraph>
    <Paragraph position="5"> The QUD-maximality of cq allows us to analyze the fragment as a 'short answer' to cq, using the type bare-decl-cl. And out of the proposition which emerges courtesy of bare-decl-cl a (polar) question is constructed using the type dir-is-int-cl.  The second coercion operation we discussed previously is parameter identification: for a given problematic contextual parameter its output is a partial specification for a sign whose content and MAX-QUD involve a question querying the content of that utterance parameter: 14The phrasal type dir-is-int-cl which constitutes the type of the mother node in (16) is a type that inter alia enables a polar question to be built from a head daughter whose content is propositional. See (Ginzburg and Sag, 2000) for details.  This specification allows for clarification questions such as the following:  (19) a. Who do you mean BO? b. WHO? (= who is Bo) c. Bo? (= who is Bo)  We restrict attention to (19c), which is the most interesting but also the trickiest example. The tricky part arises from the fact that in a case such as this, in contrast to all previous examples, the fragment does not contribute its conventional content to the clausal content. Rather, as we suggested earlier, the semantic function of the fragment is merely to serve as an anaphoric element to the phonologically identical to-be-clarified sub-utterance. The content derives entirely from MAX-QUD.</Paragraph>
    <Paragraph position="6"> Such utterances can still be analyzed as subtypes of head-frag-ph, though not as decl-frag-cl, the short-answer/reprise sluice phrasal type we have been appealing to extensively. Thus, we posit constit(uent)-clar(ification)-int-cl, a new phrasal subtype of head-frag-ph and of inter-cl which encapsulates the two idiosyncratic facets of such utterances, namely the phonological parallelism and the max-qud/content identity:</Paragraph>
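    <Paragraph> The two facets named here, phonological parallelism and identity of content with MAX-QUD, can be pictured with a small sketch. It is not the constraint on the new phrasal type; the function names, the relation name "intended_reference" and the dictionary layout are assumptions made for illustration.

```python
def parameter_identification(utterance, sub_utt):
    """Context for asking what the speaker intended to refer to with sub_utt."""
    if sub_utt not in utterance["constits"]:
        raise ValueError("sub-utterance is not a constituent of the utterance")
    question = {
        "params": {"x"},   # the queried referent (hypothetical parameter name)
        "prop": ("intended_reference", utterance["speaker"], sub_utt["phon"], "x"),
    }
    return {"sal_utt": sub_utt, "max_qud": question}


def resolve_reprise_fragment(fragment, context):
    """Analyse a fragment such as "Bo?" as a constituent clarification, if licensed."""
    if fragment["phon"] != context["sal_utt"]["phon"]:
        return None               # facet 1: phonological parallelism with SAL-UTT fails
    return context["max_qud"]     # facet 2: the content is identical to MAX-QUD


# Toy sign for A's utterance "Did Bo leave?"
bo = {"phon": "Bo", "cat": "NP", "content": "b"}
did_bo_leave = {"phon": "Did Bo leave?", "speaker": "A", "constits": [bo]}

context = parameter_identification(did_bo_leave, bo)
reading = resolve_reprise_fragment({"phon": "Bo"}, context)
no_reading = resolve_reprise_fragment({"phon": "Jo"}, context)
```

    The fragment contributes nothing beyond its phonology: it only has to match the to-be-clarified sub-utterance, and the question it expresses is taken over wholesale from the context.</Paragraph>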
  </Section>
</Paper>