<?xml version="1.0" standalone="yes"?> <Paper uid="J94-4002"> <Title>An Algorithm for Pronominal Anaphora Resolution</Title> <Section position="2" start_page="0" end_page="536" type="abstr"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> We present an algorithm for identifying both intrasentential and intersentential antecedents of pronouns in text. We refer to this algorithm as RAP (Resolution of Anaphora Procedure). RAP applies to the syntactic structures of McCord's (1990, 1993, in press) Slot Grammar parser, and like the parser, it is implemented in Prolog. It relies on measures of salience derived from syntactic structure and a simple dynamic model of attentional state to select the antecedent noun phrase (NP) of a pronoun from a list of candidates. It does not employ semantic conditions (beyond those implicit in grammatical number and gender agreement) or real-world knowledge in evaluating The second author's work on this paper was done while he was a visiting scientist at the IBM Germany Scientific Center.</Paragraph> <Paragraph position="1"> © 1994 Association for Computational Linguistics Computational Linguistics Volume 20, Number 4 In Section 2 we present RAP and discuss its main properties. We provide examples of its output for different sorts of cases in Section 3. Most of these examples are taken from the computer manual texts on which we trained the algorithm. We give the results of a blind test in Section 4, as well as an analysis of the relative contributions of the algorithm's components to the overall success rate. In Section 5 we discuss a procedure developed by Dagan (1992) for using statistically measured lexical preference patterns to reevaluate RAP's salience rankings of antecedent candidates. We present the results of a comparative blind test of RAP and this procedure. 
Finally, in Section 6 we compare RAP to several other approaches to anaphora resolution that have been proposed in the computational literature.</Paragraph> <Paragraph position="2"> mention, proximity, and sentence recency) for an NP. (Earlier versions of these procedures are presented in Leass and Schwall 1991.) This procedure employs a grammatical role hierarchy according to which the evaluation rules assign higher salience weights to (i) subject over non-subject NPs, (ii) direct objects over other complements, (iii) arguments of a verb over adjuncts and objects of prepositional phrase (PP) adjuncts of the verb, and (iv) head nouns over complements of head nouns. 1 * A procedure for identifying anaphorically linked NPs as an equivalence class for which a global salience value is computed as the sum of the salience values of its elements.</Paragraph> <Paragraph position="3"> * A decision procedure for selecting the preferred element of a list of antecedent candidates for a pronoun.</Paragraph> <Paragraph position="4"> 1 This hierarchy is more or less identical to the NP accessibility hierarchy proposed by Keenan and Comrie (1977). Johnson (1977) uses a similar grammatical role hierarchy to specify a set of constraints on syntactic relations, including reflexive binding. Lappin (1985) employs it as a salience hierarchy to state a non-coreference constraint for pronouns. Guenthner and Lehmann (1983) use a similar salience ranking of grammatical roles to formulate rules of anaphora resolution. Centering approaches to anaphora resolution use similar hierarchies as well (Brennan, Friedman, and Pollard 1987; Walker, Iida, and Cote 1990).</Paragraph> <Section position="1" start_page="536" end_page="536" type="sub_section"> <SectionTitle> 2.1 Some Preliminary Details </SectionTitle> <Paragraph position="0"> RAP has been implemented for both ESG and GSG (English and German Slot Grammars); we will limit ourselves here to a discussion of the English version. 
The differences between the two versions are at present minimal, primarily owing to the fact that we have devoted most of our attention to analysis of English. As with Slot Grammar systems in general (McCord 1989b, 1993, in press), an architecture was adopted that &quot;factors out&quot; language-specific elements of the algorithm.</Paragraph> <Paragraph position="1"> We have integrated RAP into McCord's (1989a, 1989b) Logic-Based Machine Translation System (LMT). (We are grateful to Michael McCord and Ulrike Schwall for their help in implementing this integration.) When the algorithm identifies the antecedent of a pronoun in the source language, the agreement features of the head of the NP corresponding to the antecedent in the target language are used to generate the pronoun in the target language. Thus, for example, neuter third person pronouns in English are mapped into pronouns with the correct gender feature in German, in which inanimate nouns are marked for gender.</Paragraph> <Paragraph position="2"> RAP operates primarily on a clausal representation of the Slot Grammar analysis of the current sentence in a text (McCord et al. 1992). The clausal representation consists of a set of Prolog unit clauses that provide information on the head-argument and head-adjunct relations of the phrase structure that the Slot Grammar assigns to a sentence (phrase). Clausal representations of the previous four sentences in the text are retained in the Prolog workspace. The discourse representation used by our algorithm consists of these clausal representations, together with additional unit clauses declaring discourse referents evoked by NPs in the text and specifying anaphoric links among discourse referents. 2 All information pertaining to a discourse referent or its evoking NP is accessed via an identifier (ID), a Prolog term containing two integers. 
The first integer identifies the sentence in which the evoking NP occurs, with the sentences in a text being numbered consecutively. The second integer indicates the position of the NP's head word in the sentence.</Paragraph> <Paragraph position="3"> conditions for NP-pronoun non-coreference within a sentence. To state these conditions, we use the following terminology. The agreement features of an NP are its number, person, and gender features. We will say that a phrase P is in the argument domain of a phrase N iff P and N are both arguments of the same head. We will say that P is in the adjunct domain of N iff N is an argument of a head H, P is the object of a preposition PREP, and PREP is an adjunct of H. P is in the NP domain of N iff N is the determiner of a noun Q and (i) P is an argument of Q, or (ii) P is the object of a preposition PREP and PREP is an adjunct of Q. A phrase P is contained in a phrase Q iff (i) P is either an argument or an adjunct of Q, i.e., P is immediately contained in Q, or (ii) P is immediately contained in some phrase R, and R is contained in Q.</Paragraph> <Paragraph position="4"> A pronoun P is non-coreferential with a (non-reflexive or non-reciprocal) noun phrase N if any of the following conditions hold: 1. P and N have incompatible agreement features.</Paragraph> <Paragraph position="5"> 2. P is in the argument domain of N.</Paragraph> <Paragraph position="6"> 3. P is in the adjunct domain of N.</Paragraph> </Section> </Section> </Paper>
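<!--
The first three non-coreference conditions stated in Section 2.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not RAP's Prolog implementation: the Phrase class, its relation labels ("arg", "adjunct", "pobj"), the feature values, the example sentences, and all function names are hypothetical. Treating an unspecified feature as compatible with any value is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True, eq=False)
class Phrase:
    """A phrase in a toy head/argument/adjunct analysis (hypothetical structure)."""
    text: str
    head: Optional["Phrase"] = None   # phrase this one attaches to, if any
    relation: Optional[str] = None    # "arg", "adjunct", or "pobj" (object of a preposition)
    number: Optional[str] = None      # e.g. "sg" or "pl"
    person: Optional[int] = None      # 1, 2, or 3
    gender: Optional[str] = None      # e.g. "masc", "fem", "neut"


def agreement_compatible(p: Phrase, n: Phrase) -> bool:
    """Number, person, and gender clash only when both are specified and differ."""
    for feature in ("number", "person", "gender"):
        a, b = getattr(p, feature), getattr(n, feature)
        if a is not None and b is not None and a != b:
            return False
    return True


def in_argument_domain(p: Phrase, n: Phrase) -> bool:
    """P and N are both arguments of the same head."""
    return (p is not n
            and p.relation == "arg" and n.relation == "arg"
            and p.head is not None and p.head is n.head)


def in_adjunct_domain(p: Phrase, n: Phrase) -> bool:
    """N is an argument of H, P is the object of PREP, and PREP is an adjunct of H."""
    prep = p.head
    return (p.relation == "pobj" and prep is not None
            and prep.relation == "adjunct"
            and n.relation == "arg"
            and prep.head is not None and prep.head is n.head)


def non_coreferential(p: Phrase, n: Phrase) -> bool:
    """Conditions 1 to 3 only; the remaining conditions are not modeled here."""
    return (not agreement_compatible(p, n)
            or in_argument_domain(p, n)
            or in_adjunct_domain(p, n))


FEM = dict(number="sg", person=3, gender="fem")

# "She likes her." (both pronouns are arguments of "likes")
likes = Phrase("likes")
she = Phrase("She", head=likes, relation="arg", **FEM)
her = Phrase("her", head=likes, relation="arg", **FEM)

# "She sat near her." ("her" is the object of "near", an adjunct of "sat")
sat = Phrase("sat")
she2 = Phrase("She", head=sat, relation="arg", **FEM)
near = Phrase("near", head=sat, relation="adjunct")
her2 = Phrase("her", head=near, relation="pobj", **FEM)

# "She said that he likes her." ("her" and "She" have different heads)
said = Phrase("said")
she3 = Phrase("She", head=said, relation="arg", **FEM)
likes3 = Phrase("likes", head=said, relation="arg")
he3 = Phrase("he", head=likes3, relation="arg", number="sg", person=3, gender="masc")
her3 = Phrase("her", head=likes3, relation="arg", **FEM)

for p, n in [(her, she), (her2, she2), (he3, she3), (her3, she3)]:
    print(p.text, n.text, non_coreferential(p, n))
```

In this toy analysis, the first pair is blocked by condition 2, the second by condition 3, and the third by condition 1, while the last pair passes all three checks and would remain a candidate. Phrases are compared by identity rather than by value so that two occurrences of the same word in a sentence are never conflated.
-->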