<?xml version="1.0" standalone="yes"?>
<Paper uid="P05-1029">
  <Title>Scaling up from Dialogue to Multilogue: some principles and benchmarks</Title>
  <Section position="4" start_page="231" end_page="232" type="metho">
    <SectionTitle>
2 Long Distance Resolution of NSUs in Dialogue and Multilogue: some benchmarks
</SectionTitle>
    <Paragraph position="0"> The work we present in this paper is based on empirical evidence provided by corpus data extracted from the British National Corpus (BNC).</Paragraph>
    <Section position="1" start_page="231" end_page="232" type="sub_section">
      <SectionTitle>
2.1 The Corpus
</SectionTitle>
      <Paragraph position="0"> Our current corpus is a sub-portion of the BNC conversational transcripts consisting of 14,315 sentences. The corpus was created by randomly excerpting a 200-speaker-turn section from each of 54 BNC files. Of these files, 29 are transcripts of conversations between two dialogue participants, and 25 files are multilogue transcripts.</Paragraph>
      <Paragraph position="1"> A total of 1285 NSUs were found in our sub-corpus.</Paragraph>
      <Paragraph position="2"> Table 1 shows the raw counts of NSUs found in the dialogue and multilogue transcripts, respectively.</Paragraph>
      <Paragraph position="3">  All NSUs encountered within the corpus were classified according to the NSU typology presented in (Fernández and Ginzburg, 2002). Additionally, the distance from their antecedent was measured (see footnote 1). Table 2 shows the distribution of NSU categories and their antecedent separation distance. The classes of NSU which feature in our discussion below are boldfaced.</Paragraph>
      <Paragraph position="4"> The BNC annotation includes tagging of units approximating to sentences, as identified by the CLAWS segmentation scheme (Garside, 1987). Each sentence unit is assigned an identifier number. By default it is assumed that sentences are non-overlapping and that their numeration indicates temporal sequence. When this is not the case because speakers overlap, the tagging scheme encodes synchronous speech by means of an alignment map used to synchronize points within the transcription. However, even though information about simultaneous speech is available, overlapping sentences are annotated with different sentence numbers.</Paragraph>
      <Paragraph position="5"> In order to be able to measure the distance between the NSUs encountered and their antecedents, all instances were tagged with the sentence number of their antecedent utterance. The distance we report is therefore measured in terms of sentence numbers. It should however be noted that taking into account synchronous speech would not change the data reported in Table 2 in any significant way, as manual examination of all NSUs at more than distance 3 reveals that the transcription portion between antecedent and NSU does not contain any completely synchronous sentences in such cases.</Paragraph>
      <Paragraph position="6"> Footnote 1: This classification was done by one expert annotator. To assess its reliability, a pilot study of the taxonomy was performed using two additional non-expert coders. These annotated 50 randomly selected NSUs (containing a minimum of 2 instances of each NSU class, as labelled by the expert annotator). The agreement achieved by the three coders is reasonably good, yielding a kappa score of κ = 0.76. We also assessed the accuracy of the coders' choices in choosing the antecedent utterance using the expert annotator's annotation as a gold standard. Given this, one coder's accuracy was 92%, whereas the other coder's was 96%.</Paragraph>
      <Paragraph position="7"> In the examples throughout the paper we shall use italics to indicate speech overlap. When italics are not used, utterances take place sequentially.</Paragraph>
    </Section>
    <Section position="2" start_page="232" end_page="232" type="sub_section">
      <SectionTitle>
2.2 NSU-Antecedent Separation Distance
</SectionTitle>
      <Paragraph position="0"> The last row in Table 2 shows the distribution of NSU-antecedent separation distances as percentages of the total number of NSUs found. This allows us to see that about 87% of NSUs have a distance of 1 sentence (i.e. the antecedent was the immediately preceding sentence), and that the vast majority (about 96%) have a distance of 3 sentences or less.</Paragraph>
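      <Paragraph> The distance measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each NSU is represented as a hypothetical pair (sentence number of the NSU, sentence number of its antecedent), and the function names are invented for the sketch rather than taken from the paper. The sample pairs in the usage note are toy values, not corpus data.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def distance_distribution(pairs: Iterable[Tuple[int, int]]) -> Dict[int, float]:
    """Share of NSUs at each NSU-antecedent distance.

    Each pair is (nsu_sentence_number, antecedent_sentence_number); the
    distance is the difference between the two sentence numbers.
    """
    distances = [nsu - antecedent for nsu, antecedent in pairs]
    if not distances:
        raise ValueError("no NSU-antecedent pairs supplied")
    counts = Counter(distances)
    total = len(distances)
    return {d: counts[d] / total for d in sorted(counts)}


def share_within(distribution: Dict[int, float], max_distance: int) -> float:
    """Cumulative share of NSUs whose distance is at most max_distance."""
    return sum(share for d, share in distribution.items() if max_distance >= d)
```

With the toy pairs [(12, 11), (20, 19), (31, 30), (45, 41)], three of the four NSUs are at distance 1 and one is at distance 4, so share_within(distribution, 3) gives 0.75.</Paragraph>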
      <Paragraph position="1"> Although the proportion of NSUs found in dialogue and multilogue is roughly the same (see Table 1 above), when taking into account the distance of NSUs from their antecedent, the proportion of long distance NSUs in multilogue increases radically: the longer the distance, the higher the proportion of NSUs that were found in multilogue. In fact, as Table 3 shows, NSUs that have a distance of 7 sentences or more appear exclusively in multilogue transcripts. These differences are significant (χ² = 62.24, p ≤ 0.001).</Paragraph>
      <Paragraph position="2"> Adjacency of grounding and affirmation utterances. The data in Table 2 highlights a fundamental characteristic of the remaining majoritarian classes of NSUs,</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="232" end_page="234" type="metho">
    <SectionTitle>
Ack(nowledgements), Affirmative Answer, CE (clarification ellipsis),
</SectionTitle>
    <Paragraph position="0"> Repeated Ack(nowledgements), and Rejection. These are used either in grounding interaction, or to affirm/reject propositions (see footnote 2). The overwhelming adjacency to their antecedent underlines the locality of these interactions.</Paragraph>
    <Paragraph position="1"> Long distance potential for short answers. One striking result exhibited in Table 2 is the uneven distribution of long distance NSUs across categories. With a few exceptions, NSUs that have a distance of 3 sentences or more are exclusively short answers. Not only is the long distance phenomenon almost exclusively restricted to short answers, but the frequency of long distance short answers stands in strong contrast to the other NSU classes; indeed, over 44% of short answers have a distance greater than 1, and over 24% have distance 4 or more, like the last answer in the following example:  (1) Allan: How much do you think? Cynthia: Three hundred pounds.</Paragraph>
    <Paragraph position="2"> Sue: More.</Paragraph>
    <Paragraph position="3"> Cynthia: A thousand pounds.</Paragraph>
    <Paragraph position="4"> Allan: More.</Paragraph>
    <Paragraph position="5"> Unknown: &lt;unclear&gt;  Allan: Eleven hundred quid apparently.</Paragraph>
    <Paragraph position="6"> [BNC, G4X] Long distance short answers primarily a multilogue effect. Table 4 shows the total number of short answers found in dialogue and multilogue respectively, and the proportions sorted by distance over those totals. From this it emerges that short answers are more common in multilogue than in dialogue: 134 (71%) v. 54 (29%). Also, the distance pattern exhibited by these two groups is strikingly different: only 18% of short answers found in dialogue have a distance of more than 1 sentence, with all of them having a distance of at most 3, like the short answer in (2).</Paragraph>
    <Paragraph position="7"> Footnote 2: Acknowledgements and acceptances are, in principle, distinct acts: the former involves indication that an utterance has been understood, whereas the latter indicates that an assertion is accepted. In practice, though, acknowledgements in the form of NSUs commonly simultaneously signal acceptances. Given this, corpus studies of NSUs (e.g. (Fernández and Ginzburg, 2002)) often conflate the two.</Paragraph>
    <Paragraph position="8"> (2) Malcolm: [...] cos what's three hundred and sixty divided by seven?  This dialogue/multilogue asymmetry argues against reductive views of multilogue as sequential dialogue. Long distance short answers and group size. As Table 4 shows, all short answers at more than distance 3 appear in multilogues. Following (Fay et al., 2000), we distinguish between small groups (those with 3 to 5 participants) and large groups (those with more than 5 participants). The size of the group is determined by the number of participants that are active when a particular short answer is uttered. We count as active those participants who have made a contribution within a window of 30 turns back from the turn where the short answer was uttered.</Paragraph>
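    <Paragraph> The group-size criterion just described can be made concrete with a short Python sketch. It assumes a transcript represented as a list of speaker labels, one per turn, which is a hypothetical representation chosen for illustration. Whether the turn containing the short answer itself counts towards the window is not stated in the text; the sketch includes it, and that choice is an assumption.

```python
from typing import List, Optional, Set

WINDOW = 30  # number of turns looked back from the short answer's turn


def active_participants(turns: List[str], answer_index: int,
                        window: int = WINDOW) -> Set[str]:
    """Speakers who contributed within `window` turns back from the answer.

    `turns[i]` is the speaker label of turn i. The answer's own turn is
    included in the window (an assumption of this sketch).
    """
    start = max(0, answer_index - window)
    return set(turns[start:answer_index + 1])


def group_size_class(n_active: int) -> Optional[str]:
    """Small group: 3 to 5 active participants; large group: more than 5."""
    if n_active > 5:
        return "large"
    if n_active >= 3:
        return "small"
    return None  # fewer than 3 active participants: not a group in this sense
```

A speaker who last spoke more than 30 turns before the short answer is therefore not counted, so the same transcript can yield different group sizes for different short answers.</Paragraph>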
    <Paragraph position="9"> Table 5 shows the distribution of long distance short answers (distance > 3) in small and large groups respectively. This indicates that long distance short answers are significantly more frequent in large groups (χ² = 22.17, p ≤ 0.001), though still reasonably common in small groups. A pragmatic account correlating group size and frequency of long distance short answers is offered in the [...]  Large group multilogues in the corpus are all transcripts of tutorials, training sessions or seminars, which exhibit a rather particular structure. The general pattern involves a question being asked by the tutor or session leader, with the other participants then taking turns to answer that question. The tutor or leader acts as turn manager. She assigns the turn explicitly, usually by addressing participants by name, without needing to repeat the question under discussion. An example is shown in (3):  (3) Anon1: How important is those three components and what value would you put on them [...]  Small group multilogues, on the other hand, have a less constrained structure: after a question is asked, the participants tend to answer freely. Answers by different participants can follow one after the other without explicit acknowledgements or turn management, as in (4):</Paragraph>
    <Section position="1" start_page="233" end_page="234" type="sub_section">
      <SectionTitle>
2.3 Two Benchmarks of multilogue
</SectionTitle>
      <Paragraph position="0"> The data we have seen above leads in particular to the following two benchmarks for protocols for querying, assertion, and grounding interaction in multilogue: (5) a. Multilogue Long Distance short answers (MLDSA): querying protocols for multilogue must license short answers an unbounded number of turns from the original query.</Paragraph>
      <Paragraph position="1"> b. Multilogue adjacency of grounding/acceptance (MAG): assertion and grounding protocols for multilogue should license grounding/clarification/acceptance moves only adjacently to their antecedent utterance.</Paragraph>
      <Paragraph position="2"> MLDSA and MAG have a somewhat different status: whereas MLDSA is a direct generalization from the data, MAG is a negative constraint, posited given the paucity of positive instances. As such MAG is more open to doubt and we shall treat it as such in the sequel.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="234" end_page="235" type="metho">
    <SectionTitle>
3 Issue-based Dialogue Management: basic principles
</SectionTitle>
    <Paragraph position="0"> In this section we outline some of the basic principles of Issue-based Dialogue Management, which we use as a basis for our subsequent investigations of multilogue interaction.</Paragraph>
    <Paragraph position="1"> Information States. We assume information states of the kind developed in the KoS framework (e.g. (Ginzburg, 1996, forthcoming), (Larsson, 2002)) and implemented in systems such as GODIS, IBIS, and CLARIE (see e.g. (Larsson, 2002; Purver, 2004)). On this view each dialogue participant's view of the common ground, their Dialogue Gameboard (DGB), is structured by a number of attributes including the following three: FACTS: a set of facts representing the shared assumptions of the conversational participants (CPs); LatestMove: the most recent grounded move; and QUD ('questions under discussion'): a partially ordered set--often taken to be structured as a stack--consisting of the currently discussable questions. Querying and Assertion. Both querying and assertion involve a question becoming maximal in the querier/asserter's QUD: the posed question q for a query where q is posed, the polar question p? for an assertion where p is asserted. Roughly, the responder can subsequently either choose to start a discussion (of q or p?) or, in the case of assertion, to update her FACTS structure with p. A dialogue participant can downdate q/p? from QUD when, as far as her (not necessarily public) goals dictate, sufficient information has been accumulated in FACTS. The querying/assertion protocols (in their most basic form) are summarized as follows:</Paragraph>
    <Paragraph position="3"> Querying: A: push q onto QUD; release turn. B: push q onto QUD; [...] Assertion: A: push p? onto QUD; release turn. B: push p? onto QUD; [...]</Paragraph>
    <Paragraph position="5"> B: increment FACTS with p; pop p? from QUD. A: increment FACTS with p; pop p? from QUD. Following (Larsson, 2002; Cooper, 2004), one can decompose interaction protocols into conversational update rules--functions from DGBs into DGBs--using Type Theory with Records (TTR). This allows simple interfacing with the grammar, a Constraint-based Grammar closely modelled on HPSG but formulated in TTR (see (Ginzburg, forthcoming)). Footnote: [...] max-qud or a question q1 on which max-qud Depends. For the latter see footnote 7. If one assumes QUD to be a stack, then 'max-qud-specific' will in this case reduce to 'q-specific'. But the more general formulation will be important below.</Paragraph>
    <Paragraph position="6"> Grounding Interaction. Grounding an utterance u : T ('the sign associated with u is of type T') is modelled as involving the following interaction. (a) Addressee B tries to anchor the contextual parameters of T. If successful, B acknowledges u (directly, gesturally or implicitly) and responds to the content of u. (b) If unsuccessful, B poses a Clarification Request (CR), which arises via utterance coercion (see (Ginzburg and Cooper, 2001)). For reasons of space we do not formulate an explicit protocol here; the structure of such a protocol resembles the assertion protocol. Our subsequent discussion of assertion can be modified mutatis mutandis to grounding.</Paragraph>
    <Paragraph position="7"> NSU Resolution. We assume the account of NSU resolution developed in (Ginzburg and Sag, 2000). The essential idea they develop is that NSUs get their main predicates from context, specifically via unification with the question that is currently under discussion, an entity dubbed the maximal question under discussion (MAX-QUD). NSU resolution is, consequently, tied to conversational topic, viz. the MAX-QUD (see footnote 5). Distance effects in dialogue short answers. If one assumes QUD to be a stack, this affords the potential for non-adjacent short answers in dialogue. These, as discussed in section 2, are relatively infrequent. Two commonly observed dialogue conditions will jointly enforce adjacency between short answers and their interrogative antecedents: (a) Questions have a simple, one phrase answer. (b) Questions can be answered immediately, without preparatory or subsequent discussion. For multilogue (or at least certain genres thereof), both these conditions are less likely to be maintained: different CPs can supply different answers, even assuming that relative to each CP there is a simple, one phrase answer. The more CPs there are in a conversation, the smaller their common ground and the more likely the need for clarificatory interaction. A pragmatic account of this type for the frequency of adjacency in dialogue short answers seems clearly preferable to any actual mechanism that would rule out long distance short answers. These can be perfectly felicitous--see e.g. example (1) above, which would work fine if the turn uttered by Sue had been uttered by Allan instead. Moreover, such a pragmatic account leads to the expectation that the frequency of long distance antecedents is correlated with group size, as indeed indicated by the data in Table 5.</Paragraph>
    <Paragraph position="8"> Footnote 5: The resolution of NSUs, on the approach of (Ginzburg and Sag, 2000), involves one other parameter, an antecedent sub-utterance they dub the salient-utterance (SAL-UTT). This plays a role similar to the role played by the parallel element in higher order unification-based approaches to ellipsis resolution (see e.g. (Pulman, 1997)). For current purposes, we limit attention to the MAX-QUD as the nucleus of NSU resolution.</Paragraph>
  </Section>
  <Section position="7" start_page="235" end_page="236" type="metho">
    <SectionTitle>
4 Scaling up Protocols
</SectionTitle>
    <Paragraph position="0"> Goffman (1981) introduced the distinction between ratified participants and overhearers in a conversation.</Paragraph>
    <Paragraph position="1"> The former comprise the speaker and the participants whom she takes into account in her utterance design: the intended addressee(s) of a given utterance, as well as side participants. In this section we consider three possible principles of protocol extension, each of which can be viewed as adding roles for participants from one of Goffman's categories. We evaluate the protocol that results from the application of each such principle relative to the benchmarks we introduced in section 2.3.</Paragraph>
    <Paragraph position="2"> Seen in this light, the final principle we consider, Add Side Participants (ASP), arguably yields the best results. Nonetheless, these three principles would appear to be complementary--the most general protocol for multilogue will involve, minimally, application of all three (see footnote 6). We state the principles informally and framework-independently as transformations on operational construals of the protocols. In a more extended presentation we will formulate these as functions on TTR conversational update rules.</Paragraph>
    <Paragraph position="3"> The simplest principle is Add Overhearers (AOV).</Paragraph>
    <Paragraph position="4"> This involves adding participants who merely observe the interaction. They keep track of facts concerning a particular interaction, but their context is not set up to let them participate: (7) Given a dialogue protocol π, add roles C1, ..., Cn where each Ci is a silent participant: given an utterance u0 classified as being of type T0, Ci updates Ci.DGB.FACTS with the proposition u0 : T0.</Paragraph>
    <Paragraph position="5"> Applying AOV essentially yields multilogues which are sequences of dialogues. A special case of this is moderated multilogues, where all dialogues involve a designated individual (who is also responsible for turn assignment). Restricting scaling up to applications of AOV is not sufficient since, inter alia, this will not fulfill the MLDSA benchmark.</Paragraph>
    <Paragraph position="6"> A far stronger principle is Duplicate Responders (DR): (8) Given a dialogue protocol π, add roles C1, ..., Cn which duplicate the responder role.</Paragraph>
    <Paragraph position="7"> Footnote 6: We thank an anonymous reviewer for ACL for convincing us of this point.</Paragraph>
    <Paragraph position="8"> Applying DR to the querying protocol yields the following protocol: (9) Querying with multiple responders 1. LatestMove = Ask(A,q) 2. A: push q onto QUD; release turn 3. Resp1: push q onto QUD; take turn; make max-qud-specific utterance; release turn 4. Resp2: push q onto QUD; take turn; make max-qud-specific utterance; release turn 5. . . .</Paragraph>
    <Paragraph position="9"> 6. Respn: push q onto QUD; take turn; make max-qud-specific utterance; release turn  This yields interactions such as (4) above. The querying protocol in (9) licenses long distance short answers, so it satisfies the MLDSA benchmark. On the other hand, the contextual updates it enforces will not enable it to deal with the following (constructed) variant on (4); in other words, it does not allow responders to comment on previous responders, as opposed to the original querier:  (10) A: Who should we invite for the conference? B: Svetlanov.</Paragraph>
    <Paragraph position="10"> C: No (= Not Svetlanov), Zhdanov. D: No (= Not Zhdanov, ≠ Not Svetlanov), Gergev. Applying DR to the assertion protocol will yield the following protocol: (11) Assertion with multiple responders 1. LatestMove = Assert(A,p) 2. A: push p? onto QUD; release turn 3. Resp1: push p? onto QUD; take turn; &lt; Option 1: Discuss p?, Option 2: Accept p&gt; 4. Resp2: push p? onto QUD; take turn; &lt; Option 1: Discuss p?, Option 2: Accept p&gt; 5. . . .</Paragraph>
    <Paragraph position="11"> 6. Respn: push p? onto QUD; take turn; &lt; Option 1: [...]  One arguable problem with this protocol--equally applicable to the corresponding DRed grounding protocol--is that it licenses long distance acceptance and is, thus, inconsistent with the MAG benchmark. On the other hand, it is potentially useful for interactions where there is explicitly more than one direct addressee. A principle intermediate between AOV and DR is Add</Paragraph>
    <Section position="1" start_page="235" end_page="236" type="sub_section">
      <SectionTitle>
Side Participants (ASP):
</SectionTitle>
      <Paragraph position="0"> (12) Given a dialogue protocol π, add roles C1, ..., Cn, which effect the same contextual update as the interaction initiator.</Paragraph>
      <Paragraph position="1"> Applying ASP to the dialogue assertion protocol yields the following protocol:  (13) Assertion for a conversation involving {A, B, C1, ..., Cn}  1. LatestMove = Assert(A,p) 2. A: push p? onto QUD; release turn 3. Ci: push p? onto QUD; 4. B: push p? onto QUD; take turn; &lt;Option 1: Accept p, Option 2: Discuss p?&gt; (14) 1. LatestMove = Accept(B,p) 2. B: increment FACTS with p; pop p? from QUD; 3. Ci: increment FACTS with p; pop p? from QUD; 4. A: increment FACTS with p; pop p? from QUD;  This protocol satisfies the MAG benchmark in that acceptance is strictly local. This is because it enforces communal acceptance--acceptance by one CP can count as acceptance by all other addressees of an assertion. There is an obvious rational motivation for this, given the difficulty of a CP constantly monitoring an entire audience (when this consists of more than one addressee) for acceptance signals--it is well known that the effect of visual access on turn taking is highly significant (Dabbs and Ruback, 1987). It also enforces quick reaction to an assertion--anyone wishing to dissent from p must get their reaction in early, i.e. immediately following the assertion, since further discussion of p? is not countenanced if acceptance takes place. The latter can of course happen as a consequence of a dissenter not being quick on their feet; accommodating such cases on this protocol would require some type of backtracking.</Paragraph>
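      <Paragraph> The communal-acceptance behaviour of (13) and (14) can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the TTR formulation the paper refers to: propositions are plain strings, QUD is a list used as a stack, and the names DGB, polar, assert_move and accept_move are hypothetical names introduced for this illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class DGB:
    """One participant's Dialogue Gameboard, reduced to three attributes."""
    facts: Set[str] = field(default_factory=set)
    qud: List[str] = field(default_factory=list)  # stack: last item is maximal
    latest_move: Tuple[str, ...] = ()


def polar(p: str) -> str:
    """The polar question p? raised by asserting p (string encoding assumed)."""
    return p + "?"


def assert_move(boards: Dict[str, DGB], speaker: str, p: str) -> None:
    """Protocol (13): A, every Ci and B each push p? onto their QUD."""
    for dgb in boards.values():
        dgb.latest_move = ("Assert", speaker, p)
        dgb.qud.append(polar(p))


def accept_move(boards: Dict[str, DGB], acceptor: str, p: str) -> None:
    """Protocol (14): one acceptance updates FACTS and pops p? for everyone."""
    issue = polar(p)
    # Check every board first so a failed acceptance leaves no partial update.
    for dgb in boards.values():
        if not dgb.qud or dgb.qud[-1] != issue:
            raise ValueError("p? is not maximal in QUD, so acceptance is not local")
    for dgb in boards.values():
        dgb.latest_move = ("Accept", acceptor, p)
        dgb.facts.add(p)
        dgb.qud.pop()
```

After assert_move(boards, "A", p) followed by accept_move(boards, "B", p), every participant's FACTS contains p and p? is gone from every QUD, so a later acceptance or discussion of p? by a side participant is no longer licensed. This mirrors the observation that a dissenter must react immediately.</Paragraph>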
      <Paragraph position="2"> Applying ASP to the dialogue querying protocol yields the following protocol:  (15) Querying for a conversation involving {A, B, C1, ..., Cn} 1. LatestMove = Ask(A,q) 2. A: push q onto QUD; release turn 3. Ci: push q onto QUD; 4. B: push q onto QUD; take turn; make max-qud-specific utterance.</Paragraph>
      <Paragraph position="3"> This improves on the DR-generated protocol because it does allow responders to comment on previous responders--the context is modified as in the dialogue protocol. Nonetheless, as it stands, this protocol won't fully deal with examples such as (4)--the issue introduced by each successive participant takes precedence given that QUD is assumed to be a stack. This can be remedied by slightly modifying this latter assumption: we will assume that when a question q is pushed onto QUD it doesn't subsume all existing questions in QUD, but rather only those on which q does not depend (see footnote 7): (16) q is QUD-mod(dependence) maximal iff for any q0 in QUD such that ¬Depend(q, q0): q ≻ q0.</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="236" end_page="236" type="metho">
    <SectionTitle>
7 The notion of dependence we assume here is one common
</SectionTitle>
    <Paragraph position="0"> in work on questions, e.g. (Ginzburg and Sag, 2000), intuitively corresponding to the notion of 'is a subquestion of'. q1 depends on q2 iff any proposition p such that p resolves q2 also satisfies p is about q1.</Paragraph>
    <Paragraph position="1"> This is conceptually attractive because it reinforces that the order in QUD has an intuitive semantic basis.</Paragraph>
    <Paragraph position="2"> One effect this has is to ensure that any polar question p? introduced into QUD, whether by an assertion or by a query, subsequent to a wh-question q on which p? depends does not subsume q. Hence, q will remain accessible as an antecedent for NSUs, as long as no new unrelated topic has been introduced. Assuming this modification to QUD is implemented in the above ASP-generated protocols, both MLDSA and MAG benchmarks are fulfilled.</Paragraph>
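    <Paragraph> The effect of the dependence-sensitive QUD can be sketched in Python. The sketch rests on several assumptions: questions are plain strings, the Depend relation is supplied by the caller as a function, and a question is treated as accessible when no other question in QUD outranks it. That undominated reading is a simplification chosen for illustration and is not a restatement of definition (16); the class and method names are hypothetical.

```python
from typing import Callable, List, Set, Tuple

Depends = Callable[[str, str], bool]  # depends(q, q0): does q depend on q0?


class QUD:
    """Questions under discussion, ordered modulo dependence."""

    def __init__(self, depends: Depends) -> None:
        self.depends = depends
        self.questions: List[str] = []
        self.outranks: Set[Tuple[str, str]] = set()  # (higher, lower) pairs

    def push(self, q: str) -> None:
        """A new q takes precedence only over questions it does not depend on."""
        for q0 in self.questions:
            if q0 != q and not self.depends(q, q0):
                self.outranks.add((q, q0))
        if q not in self.questions:
            self.questions.append(q)

    def accessible(self) -> List[str]:
        """Questions that no other question outranks, in push order."""
        dominated = {low for (_high, low) in self.outranks}
        return [q for q in self.questions if q not in dominated]
```

For instance, if "Svetlanov?" is declared to depend on "Who should we invite?", pushing the polar question leaves the wh-question accessible, so a later short answer can still be resolved against it. Pushing an unrelated question afterwards outranks both, which corresponds to the proviso that no new unrelated topic has been introduced.</Paragraph>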
  </Section>
</Paper>