<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0713">
<Title>An Algorithm for Resolving Individual and Abstract Anaphora in Danish Texts and Dialogues</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> Most intersentential anaphor resolution algorithms exclusively account for pronominal anaphors with individual nominal antecedents (henceforth IPAs) in texts. Less attention has been given to pronominal anaphors which refer to abstract entities evoked by verbal phrases, clauses or discourse segments (henceforth APAs). However, APAs are quite common in English dialogues, see i.a. (Byron and Allen, 1998). Recently, two algorithms for resolving APAs and IPAs in specific English dialogues have been proposed: Eckert and Strube's (2000) es00 and Byron's (2002) phora. APAs are also frequent in Danish. We found that APAs made up 15% of all pronominal anaphors in our texts and 48% of the anaphors in the analysed dialogues. Furthermore, third-person singular pronouns in neuter gender, which can be IPAs or APAs, were APAs in two-thirds of the cases in both texts and dialogues. In this paper we describe an algorithm, called dar, for resolving intersentential IPAs and APAs in Danish.1 Unlike es00 and phora, dar applies to both texts and dialogues.</Paragraph>
<Paragraph position="1"> Unlike most resolution algorithms, dar correctly accounts for the resolution of pronouns referring to newly introduced information, as is the case in examples (1) and (2).
(1) [Chefen]i fik kun [en søn]k og [han]k gad i hvert fald ikke videreføre familieforetagendet. [pid]
([The boss]i had only [one son]k and [he]k surely did not want to carry on the family business.)
(2) A: hvem...hvem arbejdede [din mor]i med?
(with whom... whom did [your mother]i work)
B: [Hun]i arbejdede med [vores nabo]k
([She]i worked with [our neighbour]k)
[Hun]k var enke ...
havde tre sønner [bysoc]
([She]k was a widow... had three sons)
In (1) the antecedent of the pronoun han (he) is the indefinite object and not the more &quot;given&quot; definite subject. In (2) the antecedent of the second occurrence of the pronoun hun (she) is the object vores nabo (our neighbour), which provides the information requested in the preceding question. Most salience models assign this nominal lower prominence than the subject pronoun hun (she). To account for this type of data, the dar-algorithm uses a novel strategy combining two apparently contrasting accounts of the salience of entities (Navarretta, 2002a). The first account, e.g. (Grosz et al., 1995), assigns the highest degree of salience to the most known (topical) entities in the discourse model; the second assigns the highest degree of salience to entities in the focal part of utterances in Information Structure terms, which often represent new information (Hajičová et al., 1990).
1 dar presupposes that intrasentential anaphors are correctly resolved. At present no resolution algorithm accounts for all uses of Danish intrasentential pronouns.</Paragraph>
<Paragraph position="2"> dar was developed on the basis of the uses of pronouns in three text collections and three corpora of naturally-occurring dialogues. The texts comprise computer manuals (henceforth edb), novels and newspaper articles. The dialogue collections are sl (Duncker and Hermann, 1996), consisting of recorded conversations between GPs and their patients, and the bysoc corpus (Gregersen and Pedersen, 1991) and the pid corpus (Jensen, 1989), both containing recorded conversations about everyday subjects.</Paragraph>
<Paragraph position="3"> In the paper we first present related work (section 2), then we discuss the background for our proposal (section 3). In section 4 the dar-algorithm is described.
In section 5 we present some tests of the algorithm, evaluate it and compare its performance with that of other known algorithms. Finally, in section 6, we make some concluding remarks.</Paragraph>
</Section>
</Paper>