<?xml version="1.0" standalone="yes"?> <Paper uid="W99-0314"> <Title>Automatically Extracting</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> from BF Tags </SectionTitle> <Paragraph position="0"> contribution as the presentation of a proposition by one dialogue participant, as well as all subsequent related utterances until there is adequate evidence that the initial utterance was understood or abandoned (CS89). Discourse units are the level of granularity at which other dialogue tags, such as the problem-solving acts described in (SA97), are applied.</Paragraph> <Paragraph position="1"> Annotating dialogues can be a time-consuming and error-prone undertaking. To make the annotation process easier and more reliable, care should be taken to avoid manually tagging information that can be derived from other tags or that can be automatically extracted. This paper explores how we can automatically annotate dialogues with grounding tags, given a corpus that has been annotated with the BF tags. Once grounding has been marked, we can automatically segment the dialogue into discourse units, using Traum's model. In order to tag with BF tags or grounding tags, a dialogue must be segmented into utterances, a problem that is discussed briefly in section 2.</Paragraph> <Paragraph position="2"> Section 3 gives an overview of the BF tags and grounding tags, section 4 discusses the mapping from BF tags to grounding tags, and section 5 presents a comparison of the automatic mapping to a human annotator.</Paragraph> </Section> <Section position="4" start_page="0" end_page="109" type="metho"> <SectionTitle> 2 Segmenting dialogues into </SectionTitle> <Paragraph position="0"> utterances</Paragraph> <Paragraph position="1"> Dialogues need to be segmented into utterances before annotation with the BF tags. Unfortunately, there are no widely accepted criteria for identifying utterances. 
Traum's approach to utterance segmentation is to segment utterances based on the presence of prosodic evidence such as pauses and boundary tones, and on changes of speaker. The benefit of this approach is that it can be done automatically given prosodic annotation. However, we have found this approach to be somewhat problematic since very often the resulting utterance units need to be combined or split when assigning the BF tags. Traum uses a special grounding tag, CONTINUE, when a prosodically-segmented utterance is not an independent grounding act, but rather part of the same grounding act as a previous utterance by the same speaker.</Paragraph> <Paragraph position="2"> Another possible approach to utterance segmentation for BF tagging is to allow the annotator to segment the dialogue and label it for BF tags at the same time. The problem with this approach is that different annotators may segment the same dialogue differently, making it difficult to compare annotations. One way of dealing with this problem is to have subsequent annotators use the first annotator's segmentation. A drawback of this solution is that the first annotator's segmentation may influence subsequent BF labeling. Despite this drawback, we are assuming the second approach in order to avoid the need to split or join utterances, and therefore do not need Traum's CONTINUE tag.</Paragraph> </Section> <Section position="5" start_page="109" end_page="110" type="metho"> <SectionTitle> 3 Overview of Tag Sets </SectionTitle> <Paragraph position="0"> Table 1 shows the illocutionary act features included in the BF tagging scheme, along with the tags for each feature. Actions performed during the grounding process are shown in Table 2. In Traum's annotation scheme for grounding, the tags are not mutually exclusive.</Paragraph> <Paragraph position="1"> The BF scheme has four main layers: communicative status, information level, forward communicative function, and backward communicative function. 
Communicative status is used to label utterances that cannot be understood, are broken off, or are not directed at other conversational participants. Information level is used to differentiate between utterances discussing the topic at hand (TASK and TASK-MGMT) and utterances whose sole purpose is to manage the conversation (COMMUNICATION-MANAGEMENT).</Paragraph> <Paragraph position="2"> COMMUNICATION-MANAGEMENT utterances can be simple acknowledgments (okay) or explicit comments on the communication process (I didn't hear that). Forward communicative functions are aspects of an utterance that directly address future actions. Requests and suggestions are included in the world are included in STATEMENT.</Paragraph> <Paragraph position="3"> OTHER-FORWARD-FUNCTION identifies utterances that have a turn-taking function but no other forward communicative function. The second utterance below is an example of OTHER-FORWARD-FUNCTION: utt1 u: and that would be the fastest utt2 okay okay tun utt3 we're done Backward communicative functions include comments on the content of previous utterances (AGREEMENT) as well as utterances that signal whether previous material was understood or not (UNDERSTANDING). Examples of UNDERSTANDING include SIGNAL-NON-UNDERSTANDING as well as various types of showing understanding: simple ACKNOWLEDGMENTS, acknowledgment through repetition/paraphrase (SU-REPEAT-REPHRASE), acknowledgment through correction (CORRECT-MISSPEAKING), and acknowledgment through elaboration/completion (SU-COMPLETION). 
The grounding acts of Traum are INITIATE, REPAIR, REQUEST-REPAIR, ACKNOWLEDGE, REQUEST-ACKNOWLEDGE, and CANCEL. An INITIATE is the initial presentation of a proposition; a REPAIR is a modification to the content or presentation of the current proposition under consideration; a REQUEST-REPAIR is a request that the other participant perform a REPAIR; an ACKNOWLEDGE is evidence that a previous utterance has been understood; a REQUEST-ACKNOWLEDGE is a request that the other participant perform an ACKNOWLEDGE; and a CANCEL is an abandonment of the proposition under consideration. Dialogue participants use these actions to form discourse units as they converse. INITIATES start discourse units. A discourse unit is terminated either through an ACKNOWLEDGE, in which case the discourse unit is considered grounded, or through a CANCEL, in which case the discourse unit is not grounded. Acknowledgments may be either explicit or implicit. Explicit acknowledgments can be requested by performing a REQUEST-ACKNOWLEDGE, such as Did you get that?. Once an initial presentation is made, either participant may make a REPAIR, or enter into a repair subdialogue by performing a REQUEST-REPAIR.</Paragraph> </Section> <Section position="8" start_page="110" end_page="110" type="metho"> <SectionTitle> 4 Mapping from BF tags to grounding tags </SectionTitle> <Paragraph position="0"> In general, any utterance tagged as having a forward communicative function in the BF scheme initiates a new discourse unit and should be given an INITIATE grounding tag. Exceptions are utterances that only perform a turn-taking act. These are tagged as OTHER-FORWARD-FUNCTION in the BF scheme, but have no content that requires acknowledgment. 
Utterances that have both a turn-taking function and some other forward communicative function, such as Give me a second. (tagged as an ACTION-DIRECTIVE and</Paragraph> </Section> <Section position="9" start_page="110" end_page="111" type="metho"> <SectionTitle> OTHER-FORWARD-FUNCTION at the </SectionTitle> <Paragraph position="0"> COMMUNICATION-MANAGEMENT level) do have content that can be acknowledged and should be tagged as INITIATE. Other exceptions, found frequently in dialogues from collaborative task-oriented domains, are utterances that are tagged as COMMIT because they ACCEPT an ACTION-DIRECTIVE. Utterances 2 and 4 in the following dialogue excerpt are examples of COMMITS that are not INITIATES.</Paragraph> <Paragraph position="1"> utt1 u: pick up two tankers in Corning</Paragraph> <Paragraph position="3"> The BF tag SU-COMPLETION is interesting since an utterance having this tag should be both INITIATE and ACKNOWLEDGE in Traum's scheme, despite the fact that completions are not labeled with forward communicative functions. The completion has an implicit forward communicative function, which is taken to be the same as that of the utterance (by another speaker) that it is completing.</Paragraph> <Paragraph position="4"> Repairs are attempts to fix an utterance through correction or clarification. Corrections reject an utterance and offer a replacement. Clarifications provide additional information about an utterance.</Paragraph> <Paragraph position="5"> Because of the level of granularity at which the BF tags are applied, self-repairs made mid-utterance are not included.</Paragraph> <Paragraph position="6"> An utterance B should be given a REPAIR grounding tag with respect to utterance A if B is a response to A and any of the following patterns of BF tags is seen: 1. Utterance B is tagged as SU-CORRECT-MISSPEAKING.</Paragraph> <Paragraph position="7"> 2. 
Utterance B is tagged with COMMUNICATION-MANAGEMENT and either REJECT or REJECT-PART, and a forward communicative function. In this case, the dialogue participant is making an unsolicited repair of their previous utterances.</Paragraph> <Paragraph position="8"> 3. Utterance A has the tag SIGNAL-NON-UNDERSTANDING and utterance B has a forward communicative function and does not have REJECT or REJECT-PART tags.</Paragraph> <Paragraph position="9"> In this case, the dialogue participant is making a solicited repair.</Paragraph> <Paragraph position="10"> An utterance is given a REQUEST-ACKNOWLEDGE grounding tag when it has either of the following patterns of BF tags: 1. The utterance is tagged as CHECK. These are check-questions, also known as tag-questions, and include examples such as we will take the top route right?.</Paragraph> <Paragraph position="11"> 2. The utterance is tagged as both</Paragraph> </Section> <Section position="10" start_page="111" end_page="111" type="metho"> <SectionTitle> COMMUNICATION-MANAGEMENT and </SectionTitle> <Paragraph position="0"> INFO-REQUEST, and is not tagged as SIGNAL-NON-UNDERSTANDING. Examples of utterances of this type are Did you get that? and Are you listening? Utterances that are tagged as ABANDONED in the BF scheme will be tagged as CANCEL in Traum's grounding scheme. Sometimes a dialogue participant CANCELS an open discourse unit by saying something like Forget it or Never mind in response to a repair initiation such as What did you say? In the BF scheme, these CANCELs appear as REJECTs at the COMMUNICATION-MANAGEMENT level, responding to SIGNAL-NON-UNDERSTANDINGs.</Paragraph> <Paragraph position="1"> In the BF scheme, acknowledgments are utterances that explicitly indicate that a previous utterance was understood. In Traum's scheme, acknowledgments can either explicitly or implicitly signal understanding. 
Explicit acknowledgments occur when a dialogue participant repeats, paraphrases, or completes what was said or when they use an acknowledgment term such as okay.</Paragraph> <Paragraph position="2"> Implicit acknowledgments occur when a dialogue participant continues the dialogue in a way that is consistent with what has been said previously in the dialogue.</Paragraph> <Paragraph position="3"> An utterance B should be tagged as an ACKNOWLEDGE to utterance A in Traum's scheme under any of the following conditions: 1. Utterance B is tagged as SU-ACKNOWLEDGE in the BF scheme, with the Response-to field set to A. These utterances are examples of acknowledgment terms such as okay.</Paragraph> <Paragraph position="4"> 2. Utterance B is tagged as SU-REPEAT-REPHRASE or SU-COMPLETION in the BF scheme, with the Response-to field set to A. These utterances are examples of explicit acknowledgments by paraphrase, repetition, or completion.</Paragraph> <Paragraph position="5"> 3. Utterance B is tagged with an agreement tag with the Response-to field set to A, and the combination of BF tags has not already been determined to indicate CANCEL or REPAIR.</Paragraph> <Paragraph position="6"> These utterances implicitly acknowledge A by indicating agreement with the propositional content of A.</Paragraph> <Paragraph position="7"> 4. Utterance B is tagged as either WH-ANS, ASSERT or REASSERT, with the Response-to field set to A, and A was tagged as INFO-REQUEST. Such utterances implicitly show acknowledgment of a previous utterance by answering a question posed in the previous utterance.</Paragraph> <Paragraph position="8"> Problems arise when an interlocutor implicitly acknowledges an initiator's presentation either by continued attention or by initiating a new contribution that is consistent with and relevant to the previous presentation. 
The following dialogue segment is an example of such an exchange: utt1 u: our task is to get two tankers of orange juice to Corning by</Paragraph> <Section position="1" start_page="111" end_page="111" type="sub_section"> <SectionTitle> Coming </SectionTitle> <Paragraph position="0"> The reason that this case is somewhat problematic to our scheme is that it is not clear that utterance 2 should be tagged as an ACCEPT of utterance 1 in the BF scheme, and if the BF annotators fail to tag utterance 2 as an ACCEPT, it will not be identified as an ACKNOWLEDGE. (In the BF scheme, the Understanding feature is only tagged when an explicit acknowledgment or signal of non-understanding is made.)</Paragraph> </Section> </Section> </Paper>