<?xml version="1.0" standalone="yes"?> <Paper uid="E99-1006"> <Title>Resolving Discourse Deictic Anaphora in Dialogues</Title> <Section position="2" start_page="0" end_page="37" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Most anaphora resolution algorithms are designed to deal with the co-indexing relation between anaphors and NP-antecedents. In the spoken language corpus we examined, the Switchboard corpus of telephone conversations (LDC, 1993), this type of link accounts for only 45.1% of all anaphoric references. Another 22.6% are anaphors whose referents are not individual, concrete entities but events, facts and propositions, e.g., (1) B.7: [We never know what they're thinking]_i.</Paragraph> <Paragraph position="1"> A.8: That_i's right. [I don't trust them]_j, maybe I guess it_j's because of what happened over there with their own people, how they threw them out of power. (sw3241) Whilst there have been attempts to classify abstract objects and the rules governing anaphoric reference to them (Webber, 1991; Asher, 1993; Dahl and Hellman, 1995), there have been no exhaustive, empirical studies using actual resolution algorithms. Such algorithms have so far been applied only to written corpora. However, the high frequency of abstract object anaphora in dialogues means that any attempt to resolve anaphors in spoken language cannot succeed without taking this into account.</Paragraph> <Paragraph position="2"> Summarised below are some issues specific to anaphora resolution in spoken dialogues (see also Byron and Stent (1998), who mention some of these problems in their account of the Centering model (Grosz et al., 1995)).</Paragraph> <Paragraph position="3"> Center of attention in multi-party discourse. In spontaneous speech, the participants of a dialogue may not be focussing on the same entity at a given point in the discourse.</Paragraph> <Paragraph position="4"> Utterances with no discourse entities. 
E.g., Uhhuh; yeah; right. Byron and Stent (1998) and Walker (1998) assign no importance to such utterances in their models. We assume that these can also be used to acknowledge a preceding utterance.</Paragraph> <Paragraph position="5"> Abandoned or partial utterances. Speakers may interrupt each other or make speech repairs, e.g., (2) Uh, our son_i has this kind of, you know, he_i's, well he_i started out going Stephen F Austin (sw3117) Self-corrected speech cannot be ignored, as can be seen from the fact that the entity referred to by the NP our son is subsequently referred to by a pronoun and must therefore have entered the discourse model. Determination of utterance boundaries. Most anaphor resolution algorithms rely on a syntactic definition of utterance, which spoken dialogue cannot provide because there is no punctuation to mark complete sentences.</Paragraph> <Paragraph position="6"> These issues are dealt with by our method of segmenting dialogues into dialogue acts with specified discourse functions. In addition, our approach presents a simple classification of individual and abstract object anaphors and uses separate algorithms for each class. We build on the recall rate of state-of-the-art pronoun resolution algorithms, but we achieve far higher precision than these algorithms would achieve if applied directly to spoken language, because the classification of anaphors prevents the algorithm from co-indexing discourse deictic anaphora with individual antecedents. Section 2 gives definitions and frequency of occurrence of the different anaphor types. Section 3 describes the segmentation of the dialogues into dialogue acts and the influence of these on the entities in the discourse model. Section 4 presents the method we use for resolving anaphors and the corresponding algorithm. In Section 5, we report on the corpus annotation and the evaluation of the algorithm.</Paragraph> </Section> </Paper>