<?xml version="1.0" standalone="yes"?>
<Paper uid="J94-4002">
  <Title>An Algorithm for Pronominal Anaphora Resolution</Title>
  <Section position="3" start_page="536" end_page="538" type="metho">
    <SectionTitle>
2 The number of sentences whose syntactic representations are retained is a parametrically specified
</SectionTitle>
    <Paragraph position="0"> value of the algorithm. Our decision to set this value at four is motivated by our experience with the technical texts we have been working with.</Paragraph>
    <Paragraph position="1"> Condition 1: The woman_i said that he_i is funny.</Paragraph>
    <Paragraph position="2"> Condition 2: She_i likes her_i.</Paragraph>
    <Paragraph position="3"> John_i seems to want to see him_i.</Paragraph>
    <Paragraph position="4"> Condition 3: She_i sat near her_i.</Paragraph>
    <Paragraph position="5"> Condition 4: He_i believes that the man_i is amusing. This is the man_i he_i said John_j wrote about.
Condition 5: John_i's portrait of him_i is interesting.
Condition 6: His_i portrait of John_i is interesting. His_i description of the portrait by John_i is interesting.
Figure 1. Conditions on NP-pronoun non-coreference (examples).
2.1.2 Test for Pleonastic Pronouns. The tests are partly syntactic and partly lexical. A class of modal adjectives is specified. It includes the following items (and their corresponding morphological negations, as well as comparative and superlative forms): necessary, possible, certain, likely, important, good, useful, advisable, convenient, sufficient, economical, easy, desirable, difficult, legal. A class of cognitive verbs with the following elements is also specified: recommend, think, believe, know, anticipate, assume, expect. It appearing in the constructions of Figure 2 is considered pleonastic (Cogv-ed = passive participle of cognitive verb); syntactic variants of these constructions (It is not/may be Modaladj ..., Wouldn't it be Modaladj ..., etc.) are recognized as well. The constructions of Figure 2 are:
It is Modaladj that S
It is Modaladj (for NP) to VP
It is Cogv-ed that S
It seems/appears/means/follows (that) S
NP makes/finds it Modaladj (for NP) to VP
It is time to VP
It is thanks to NP that S
To our knowledge, no other computational treatment of pronominal anaphora resolution has addressed the problem of pleonastic pronouns. It could be argued that recognizing pleonastic uses of pronouns is a task for levels of syntactic/semantic analysis that precede anaphora resolution. With the help of semantic classes defined in the lexicon, it should be possible to include exhaustive tests for these constructions in analysis grammars. 3
The notion of a higher argument slot used in the following formulation of the binding algorithm is defined by the following hierarchy of argument slots: subj &gt; agent &gt; obj &gt; (iobj|pobj). Here subj is the surface subject slot, agent is the deep subject slot of a verb heading a passive VP, obj is the direct object slot, iobj is the indirect object slot, and pobj is the object of a PP complement of a verb, as in put NP on NP. We assume the definitions of argument domain, adjunct domain, and NP domain given above. A noun phrase N is a possible antecedent binder for a lexical anaphor (i.e., reciprocal or reflexive pronoun) A iff N and A do not have incompatible agreement features, and one of the following five conditions holds.</Paragraph>
    <Paragraph position="7"> A is in the argument domain of N, and N fills a higher argument slot than A.</Paragraph>
    <Paragraph position="8"> A is in the adjunct domain of N.</Paragraph>
    <Paragraph position="9"> A is in the NP domain of N.</Paragraph>
    <Paragraph position="10"> N is an argument of a verb V, there is an NP Q in the argument domain or the adjunct domain of N such that Q has no noun determiner, and (i) A is an argument of Q, or (ii) A is an argument of a preposition PREP and PREP is an adjunct of Q.</Paragraph>
    <Paragraph position="11"> A is a determiner of a noun Q, and (i) Q is in the argument domain of N and N fills a higher argument slot than Q, or (ii) Q is in the adjunct domain of N.</Paragraph>
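    <Paragraph> The slot hierarchy and the first of the five binding conditions can be sketched as follows. This is only an illustration under stated assumptions: the names SLOT_RANK, fills_higher_slot, and licensed_by_argument_domain are hypothetical, the argument-domain test is assumed to be supplied by the parser as a boolean, and the agreement check is left out.

```python
# Hypothetical encoding of the hierarchy subj > agent > obj > (iobj|pobj).
# A smaller rank means a higher argument slot; iobj and pobj share a rank.
SLOT_RANK = {"subj": 0, "agent": 1, "obj": 2, "iobj": 3, "pobj": 3}


def fills_higher_slot(slot_n: str, slot_a: str) -> bool:
    """Return True if slot_n is strictly higher in the hierarchy than slot_a."""
    return SLOT_RANK[slot_a] > SLOT_RANK[slot_n]


def licensed_by_argument_domain(in_argument_domain: bool,
                                slot_n: str, slot_a: str) -> bool:
    """First binding condition: A is in the argument domain of N,
    and N fills a higher argument slot than A.

    in_argument_domain is assumed to come from the syntactic analysis.
    """
    return in_argument_domain and fills_higher_slot(slot_n, slot_a)
```

    Under this encoding a subject can bind a reflexive direct object in its argument domain, while an indirect object cannot bind a PP-complement object, because the two slots are unordered with respect to each other.</Paragraph>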
    <Paragraph position="12"> Examples of bindings licensed by these conditions are given in Figure 3. 2.1.4 Salience Weighting. Salience weighting is accomplished using salience factors. A given salience factor is associated with one or more discourse referents. These discourse referents are said to be in the factor's scope. A weight is associated with each</Paragraph>
  </Section>
  <Section position="4" start_page="538" end_page="540" type="metho">
    <SectionTitle>
3 ESG does, in fact, recognize some pleonastic uses of it, viz. in constructions involving extraposed
</SectionTitle>
    <Paragraph position="0"> sentential subjects, as in It surprised me that he was there. A special slot, subj(it), is used. We expect that enhancements to ESG and to the Slot Grammar English lexicon will ultimately render our tests for pleonastic pronouns redundant.</Paragraph>
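    <Paragraph> A rough surface-string approximation of the lexical tests of Section 2.1.2 is sketched below. It is not the authors' implementation, which operates on Slot Grammar parses; the regular expressions, the inflected verb forms, and the function name looks_pleonastic are assumptions made for illustration.

```python
import re

# Word classes listed in Section 2.1.2 (negations and comparatives omitted).
MODAL_ADJ = ("necessary possible certain likely important good useful "
             "advisable convenient sufficient economical easy desirable "
             "difficult legal").split()
# Passive participles of the cognitive verbs (forms supplied here by hand).
COG_VERB_ED = ("recommended thought believed known anticipated "
               "assumed expected").split()

modal = "|".join(MODAL_ADJ)
cogv = "|".join(COG_VERB_ED)

# One pattern per construction type; each is a loose string-level stand-in.
PATTERNS = [
    re.compile(rf"\bit is (?:not )?(?:{modal})\b.*\b(?:that|to)\b", re.I),
    re.compile(rf"\bit is (?:{cogv}) that\b", re.I),
    re.compile(r"\bit (?:seems|appears|means|follows)\b", re.I),
    re.compile(rf"\b(?:makes|finds) it (?:{modal})\b.*\bto\b", re.I),
    re.compile(r"\bit is time to\b", re.I),
    re.compile(r"\bit is thanks to\b.*\bthat\b", re.I),
]


def looks_pleonastic(sentence: str) -> bool:
    """Return True if the sentence matches one of the pleonastic patterns."""
    return any(p.search(sentence) for p in PATTERNS)
```

    String matching of this kind will misfire on sentences that a parse would disambiguate, so it should be read as a demonstration of the word classes rather than as a substitute for the syntactic tests.</Paragraph>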
    <Paragraph position="1">  Conditions for antecedent NP-lexical anaphor binding.</Paragraph>
    <Paragraph position="2"> factor, reflecting its relative contribution to the total salience of individual discourse referents. Initial weights are degraded in the course of processing.</Paragraph>
    <Paragraph position="3"> The use of salience factors in our algorithm is based on Alshawi's (1987) context mechanism. Other than sentence recency, the factors used in RAP differ from Alshawi's and are more specific to the task of pronominal anaphora resolution. Alshawi's framework is designed to deal with a broad class of language interpretation problems, including reference resolution, word sense disambiguation, and the interpretation of implicit relations. While Alshawi does propose emphasis factors for memory entities that are &quot;referents for noun phrases playing syntactic roles regarded as foregrounding the referent&quot; (Alshawi 1987, p. 17), only topics of sentences in the passive voice and the agents of certain be clauses receive such emphasis in his system. Our emphasis salience factors realize a much more detailed measure of structural salience.</Paragraph>
    <Paragraph position="4"> Degradation of salience factors occurs as the first step in processing a new sentence in the text. All salience factors that have been assigned prior to the appearance of this sentence have their weights degraded by a factor of two. When the weight of a given salience factor reaches zero, the factor is removed.</Paragraph>
    <Paragraph position="5"> A sentence recency salience factor is created for the current sentence. Its scope is all discourse referents introduced by the current sentence.</Paragraph>
    <Paragraph position="6"> The discourse referents evoked by the current sentence are tested to see whether other salience factors should apply. If at least one discourse referent 4 satisfies the conditions for a given factor type, a new salience factor of that type is created, with the appropriate discourse referents in its scope.</Paragraph>
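    <Paragraph> The per-sentence bookkeeping described above (degrade, remove, then create a sentence recency factor) might look like the following sketch. The weight values are illustrative assumptions, since Table 1 is not reproduced in this section, and integer halving is an assumed way of letting a weight reach zero; SalienceFactor and start_new_sentence are hypothetical names.

```python
from dataclasses import dataclass, field

# Illustrative initial weights (assumed for this sketch, not taken from
# the text of this section).
INITIAL_WEIGHT = {"sentence_recency": 100, "subject": 80, "existential": 70,
                  "accusative": 50, "indirect_oblique": 40,
                  "head_noun": 80, "non_adverbial": 50}


@dataclass
class SalienceFactor:
    kind: str
    weight: int
    scope: set = field(default_factory=set)  # IDs of discourse referents


def start_new_sentence(factors, new_referents):
    """Halve existing weights, drop exhausted factors, add sentence recency."""
    kept = []
    for factor in factors:
        factor.weight = factor.weight // 2  # degrade by a factor of two
        if factor.weight > 0:               # a factor at zero is removed
            kept.append(factor)
    kept.append(SalienceFactor("sentence_recency",
                               INITIAL_WEIGHT["sentence_recency"],
                               set(new_referents)))
    return kept
```

    The other factor types would be created in the same call once the referents of the new sentence have been tested against their conditions.</Paragraph>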
    <Paragraph position="7"> In addition to sentence recency, the algorithm employs the following salience factors:
Subject emphasis.
Existential emphasis: predicate nominal in an existential construction, as in There are only a few restrictions on LQL query construction for WordSmith.</Paragraph>
    <Paragraph position="8"> 4 In this paper we do not distinguish between properties of a discourse referent and properties of the NP that evokes it.</Paragraph>
    <Paragraph position="9"> Accusative emphasis: direct object (i.e., verbal complement in accusative case).
Indirect object and oblique complement emphasis.
Head noun emphasis: any NP not contained in another NP, using the Slot Grammar notion of &quot;containment within a phrase&quot; (see Section 2.1.1). This factor increases the salience value of an NP that is not embedded within another NP (as its complement or adjunct). Examples of NPs not receiving head noun emphasis are:
the configuration information copied by Backup configuration
the assembly in bay C
the connector labeled P3 on the flat cable
Non-adverbial emphasis: any NP not contained in an adverbial PP demarcated by a separator. Like head noun emphasis, this factor penalizes NPs in certain embedded constructions. Examples of NPs not receiving non-adverbial emphasis are:
Throughout the first section of this guide, these symbols are also used ...</Paragraph>
    <Paragraph position="10"> In the Panel definition panel, select the &quot;Specify&quot; option from the action bar.</Paragraph>
    <Paragraph position="11"> The initial weights for each of the above factor types are given in Table 1. Note that the relative weighting of some of these factors realizes a hierarchy of grammatical roles. 5
2.1.5 Equivalence Classes. We treat the antecedent-anaphor relation in much the same way as the &quot;equality&quot; condition of Discourse Representation Theory (DRT) (Kamp 1981), as in u = y.</Paragraph>
    <Paragraph position="12"> This indicates that the discourse referent u, evoked by an anaphoric NP, is anaphorically linked to a previously introduced discourse referent y. To avoid confusion with</Paragraph>
  </Section>
  <Section position="5" start_page="540" end_page="553" type="metho">
    <SectionTitle>
5 The specific values of the weights are arbitrary. The significance of the weighting procedure is in the
</SectionTitle>
    <Paragraph position="0"> comparative relations among the factors as defined by the weights. We have determined the efficacy of this relational structure of salience factors (and refined it) experimentally (see Section 4.2).</Paragraph>
    <Paragraph position="1">  Computational Linguistics Volume 20, Number 4 mathematical equality (which, unlike the relation discussed here, is symmetric), we represent the relation between an anaphor u and its antecedent y by y antecedes u.</Paragraph>
    <Paragraph position="2"> Two discourse referents u and y are said to be co-referential, 6 written as coref(u,y), if any of the following holds:</Paragraph>
    <Paragraph position="4"> Also, coref(u,u) is true for any discourse referent u. The coref relation defines equivalence classes of discourse referents, with all discourse referents in an &quot;anaphoric chain&quot; forming one class:</Paragraph>
    <Paragraph position="6"> Each equivalence class of discourse referents (some of which consist of only one member) has a salience weight associated with it. This weight is the sum of the current weight of all salience factors in whose scope at least one member of the equivalence class lies.</Paragraph>
    <Paragraph position="7"> Equivalence classes, along with the sentence recency factor and the salience degradation mechanism, constitute a dynamic system for computing the relative attentional prominence of denotational NPs in text.</Paragraph>
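    <Paragraph> The equivalence-class weighting can be sketched as below, assuming that antecedent-anaphor links are given as pairs and that each salience factor is a (weight, scope) tuple. The union-find grouping is one convenient way to build the classes and is not claimed to be the method used in RAP; the function names are hypothetical.

```python
def equivalence_classes(referents, antecedes):
    """Group referents into classes from (antecedent, anaphor) links."""
    parent = {r: r for r in referents}

    def find(r):
        # Follow parent pointers to the class representative.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for antecedent, anaphor in antecedes:
        parent[find(anaphor)] = find(antecedent)

    classes = {}
    for r in referents:
        classes.setdefault(find(r), set()).add(r)
    return list(classes.values())


def class_salience(equivalence_class, factors):
    """Sum the weights of all factors whose scope contains at least one
    member of the class; each factor is counted once."""
    return sum(weight for weight, scope in factors
               if not scope.isdisjoint(equivalence_class))
```

    For example, if y antecedes u, a factor of weight 50 with y in its scope and a factor of weight 100 with u in its scope give the class containing u and y a salience weight of 150.</Paragraph>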
    <Section position="1" start_page="541" end_page="544" type="sub_section">
      <SectionTitle>
2.2 The Resolution Procedure
</SectionTitle>
      <Paragraph position="0"> RAP's procedure for identifying antecedents of pronouns is as follows.</Paragraph>
      <Paragraph position="2"> Create a list of IDs for all NPs in the current sentence and classify them as to their type (definite NP, pleonastic pronoun, other pronoun, indefinite NP).</Paragraph>
      <Paragraph position="3"> Examine all NPs occurring in the current sentence.</Paragraph>
      <Paragraph position="4"> Distinguish among NPs that evoke new discourse referents, those that evoke discourse referents which are presumably coreferential with already listed discourse referents, and NPs that are used non-referentially.</Paragraph>
      <Paragraph position="5"> Apply salience factors to the discourse referents evoked in the previous step as appropriate.</Paragraph>
      <Paragraph position="6"> Apply the syntactic filter and reflexive binding algorithm (first phase).</Paragraph>
      <Paragraph position="7">  Shalom Lappin and Herbert J. Leass An Algorithm for Pronominal Anaphora Resolution d.</Paragraph>
      <Paragraph position="8"> (i) If the current sentence contains any personal or possessive pronouns, a list of pairs of IDs from the current sentence is generated. This list contains the pronoun-NP pairs in the sentence for which coreference can be ruled out on syntactic grounds (using the conditions stated above).</Paragraph>
      <Paragraph position="9"> (ii) If the current sentence contains any lexical anaphors (i.e., reciprocal or reflexive pronouns), a list of ID pairs is generated. Each lexical anaphor is paired with all of its possible antecedent binders.</Paragraph>
      <Paragraph position="10"> If any non-pleonastic pronouns are present in the current sentence, attempt to identify their antecedents. Resolution is attempted in the order of pronoun occurrence in the sentence. In the case of lexical anaphors (reflexive or reciprocal pronouns), the possible antecedent binders were identified by the anaphor binding algorithm. If more than one candidate was found, the one with the highest salience weight was chosen (see second example of Section 3.1).</Paragraph>
      <Paragraph position="11"> In the case of third person pronouns, resolution proceeds as follows:  1. A list of possible antecedent candidates is created. It contains the most recent discourse referent of each equivalence class. The salience weight of each candidate is calculated and included in the list. The salience weight of a candidate can be modified in several ways: a. If a candidate follows the pronoun, its salience weight is reduced substantially (i.e., cataphora is strongly penalized). b. If a candidate fills the same slot as the pronoun, its weight is  increased slightly (i.e., parallelism of grammatical roles is rewarded).</Paragraph>
      <Paragraph position="12"> It is important to note that, unlike the salience factors described in Section 2.1.4, these modifications of the salience weights of candidates are local to the resolution of a particular pronoun. 2. A salience threshold is applied; only those candidates whose salience weight is above the threshold are considered further. 3. The possible agreement features (number and gender) for the pronoun are determined. The possible sg (singular) and pl (plural) genders are determined; either of these can be a disjunction or nil. Pronominal forms in many languages are ambiguous as to number and gender; such ambiguities are taken into account by RAP's morphological filter and by the algorithm as a whole. The search splits to consider singular and plural antecedents separately (Steps 4-6) to allow a general treatment of number ambiguity (as in the Spanish possessive pronoun su or the German pronoun sie occurring as an accusative object).</Paragraph>
      <Paragraph position="13">  4. The best sg candidate (if any) is selected: a. If no sg genders were determined for the pronoun, proceed to Step 5.</Paragraph>
      <Paragraph position="14"> b. Otherwise, apply the morphological filter.</Paragraph>
      <Paragraph position="15"> c. The syntactic filter is applied, using the list of disjoint pronoun-NP pairs generated earlier. The filter excludes any candidate paired in the list with the pronoun being resolved, as well as any candidate that is anaphorically linked to an NP paired with the pronoun.</Paragraph>
      <Paragraph position="16"> d. If more than one candidate remains, choose the candidate with the highest salience weight. If several candidates have (exactly) the highest weight, choose the candidate closest to the anaphor.  Proximity is measured on the surface string and is not directional.</Paragraph>
      <Paragraph position="17"> e. The remaining candidate is considered the best sg candidate. 5. The best pl candidate (if any) is selected. The procedure parallels that outlined above for the best sg candidate: a. If no pl gender is specified for the pronoun, proceed to Step 6. b. Otherwise, apply the morphological filter.</Paragraph>
      <Paragraph position="18"> c. Apply the syntactic filter.</Paragraph>
      <Paragraph position="19"> d. If more than one candidate remains, choose the candidate with the highest salience weight; if several candidates have the highest weight, choose the candidate closest to the anaphor. e. The remaining candidate is considered the best pl candidate. 6. Given the best sg and pl candidates, find the best overall candidate: a. If a sg candidate was found, but no pl candidate, or vice versa, choose that candidate as the antecedent.</Paragraph>
      <Paragraph position="20"> b. If both a sg and a pl candidate were found, choose the candidate  with the greater salience weight (this will never arise in analysis of English text, as all English pronominal forms are unambiguous as to number).</Paragraph>
      <Paragraph position="21"> 7. The selected candidate is declared to be the antecedent of the pronoun. The following properties of RAP are worth noting. First, it applies a powerful syntactic and morphological filter to lists of pronoun-NP pairs to reduce the set of possible NP antecedents for each pronoun. Second, NP salience measures are specified largely in terms of syntactic properties and relations (as well as frequency of occurrence). These include a hierarchy of grammatical roles, level of phrasal embedding, and parallelism of grammatical role. Semantic constraints and real-world knowledge play no role in filtering or salience ranking. Third, proximity of an NP relative to a pronoun is used to select an antecedent in cases in which several candidates have equal salience weighting. Fourth, intrasentential antecedents are preferred to intersentential candidates. This preference is achieved by three mechanisms:  equal salience values.</Paragraph>
      <Paragraph position="22"> The fifth property which we note is that anaphora is strongly preferred to cataphora.
3. Examples of RAP's Output
RAP generates the list of non-coreferential pronoun-NP pairs for the current sentence, the list of pleonastic pronouns, if any, in the current sentence, the list of possible antecedent NP-lexical anaphor pairs, if any, for the current sentence, and the list of pronoun-antecedent NP pairs that it has identified, for which antecedents may appear in preceding sentences in the text. Each NP appearing in any of the first three lists is represented by its lexical head followed by the integer that corresponds to its position in the sequence of tokens in the input string of the current sentence. The NPs in the pairs of the pronoun-antecedent list are represented by their lexical heads followed by their IDs, displayed as a list of two integers. 7</Paragraph>
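      <Paragraph> The candidate selection of Section 2.2 (threshold, filtering, highest salience, proximity as tie-breaker, then the choice between the best sg and pl candidates) can be sketched as follows. This is a simplified illustration: the Candidate record, the single-sentence token positions, and the salience values in the usage note are hypothetical; gender is ignored; and the syntactic filter is reduced to a set of excluded heads.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    head: str
    position: int   # token offset on the surface string (assumed encoding)
    salience: float
    number: str     # "sg" or "pl"


def best_candidate(candidates, pronoun_position, excluded, threshold=0.0):
    """Apply the threshold and the filter, then pick by salience and proximity."""
    pool = [c for c in candidates
            if c.salience > threshold and c.head not in excluded]
    if not pool:
        return None
    # Highest salience first; exact ties go to the candidate closest to the
    # pronoun, measured non-directionally on the surface string.
    return min(pool, key=lambda c: (-c.salience,
                                    abs(c.position - pronoun_position)))


def resolve(candidates, pronoun_position, pronoun_numbers,
            excluded=frozenset()):
    """Find the best sg and pl candidates, then the better of the two."""
    best = [best_candidate([c for c in candidates if c.number == n],
                           pronoun_position, excluded)
            for n in pronoun_numbers]
    best = [c for c in best if c is not None]
    return max(best, key=lambda c: c.salience, default=None)
```

      With two singular candidates of equal salience at positions 14 and 23 and a pronoun at position 26, resolve returns the candidate at position 23, which mirrors the proximity tie-break described in Step 4d.</Paragraph>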
    </Section>
    <Section position="2" start_page="544" end_page="544" type="sub_section">
      <SectionTitle>
3.1 Lexical Anaphors
</SectionTitle>
      <Paragraph position="0"> After installation of the option, the backup copy of the Reference Diskette was started for the computer to automatically configure itself.</Paragraph>
      <Paragraph position="1"> Antecedent NP--lexical anaphor pairs.</Paragraph>
    </Section>
    <Section position="3" start_page="544" end_page="545" type="sub_section">
      <SectionTitle>
3.2 Pleonastic and Non-Lexical Anaphoric Pronouns in the Same Sentence
</SectionTitle>
      <Paragraph position="0"> Most of the copyright notices are embedded in the EXEC, but this keyword makes it possible for a user-supplied function to have its own copyright notice.</Paragraph>
      <Paragraph position="1"> Non-coreferential pronoun--NP pairs.</Paragraph>
      <Paragraph position="3"> 7 Recall that the first integer identifies the sentence in which the NP appears, and the second indicates the position of its head word in the sentence.</Paragraph>
      <Paragraph position="4"> Pleonastic Pronouns.</Paragraph>
      <Paragraph position="5"> it.16 Anaphor--Antecedent links.</Paragraph>
      <Paragraph position="6"> its.(1.26) to function.(1.23)
function.(1.23) and keyword.(1.14) share the highest salience weight of all candidates that pass the morphological and syntactic filters; they are both subjects and therefore higher in salience than the third candidate, EXEC.(1.10). function.(1.23) is then selected as the antecedent owing to its proximity to the anaphor.</Paragraph>
    </Section>
    <Section position="4" start_page="545" end_page="545" type="sub_section">
      <SectionTitle>
3.3 Multiple Cases of Intrasentential Anaphora
</SectionTitle>
      <Paragraph position="0"> Because of this, MicroEMACS cannot process an incoming ESC until it knows what character follows it.</Paragraph>
      <Paragraph position="1">  ESC.(1.10). In addition, MicroEMACS.(1.4) is rewarded because it fills the same grammatical role as the anaphor being resolved.</Paragraph>
      <Paragraph position="2"> In the case of it.(1.17), the parallelism reward works in favor of ESC.(1.10), causing it to be chosen, despite the general preference for subjects over objects.</Paragraph>
    </Section>
    <Section position="5" start_page="545" end_page="545" type="sub_section">
      <SectionTitle>
3.4 Intersentential and Intrasentential Anaphora in the Same Sentence
</SectionTitle>
      <Paragraph position="0"> At this point, emacs is waiting for a command.</Paragraph>
      <Paragraph position="1"> It is prepared to see if the variable keys are TRUE, and executes some lines if they are.</Paragraph>
    </Section>
    <Section position="6" start_page="545" end_page="549" type="sub_section">
      <SectionTitle>
3.5 Displaying Discourse Referents
</SectionTitle>
      <Paragraph position="0"> The discourse referents currently defined can be displayed with their salience weights.</Paragraph>
      <Paragraph position="1"> The display for the two-sentence text of Section 3.4 is as follows: the members of an equivalence class are displayed on one line. Since salience factors from previous sentences are degraded by a factor of two when each new sentence is processed, discourse referents from earlier sentences that are not members of anaphoric chains extending into the current sentence rapidly become &quot;uncompetitive.&quot;
This example illustrates the strong preference for intrasentential antecedents. printer.(2.10) is selected, despite the fact that it is much lower on the hierarchy of grammatical roles than the other candidate, file.(1.7), which also benefits from the parallelism reward. Degradation of salience weight for the candidate from the previous sentence is substantial enough to offset these factors.</Paragraph>
      <Paragraph position="2"> The PARTNUM tag prints a part number on the document.</Paragraph>
      <Paragraph position="3"> &amp;name.'s initial setting places it on the back cover.</Paragraph>
      <Paragraph position="4">  Four candidates receive a similar salience weighting in this example. Two potential intrasentential candidates that would have received a high salience ranking, setting.(2.4) and cover.(2.10), are ruled out by the syntactic filter. The remaining intrasentential candidate, scsym(name).(2.1), 8 ranks relatively low, as it is a possessive determiner; it scores lower than two candidates from the previous sentence. The parallelism reward causes number.(1.7) to be preferred.</Paragraph>
      <Paragraph position="5"> 4. Testing of RAP on Manual Texts
We tuned RAP on a corpus of five computer manuals containing a total of approximately 82,000 words. From this corpus we extracted sentences with 560 occurrences of third person pronouns (including reflexives and reciprocals) and their antecedents. 9
8 &amp;name. is a document formatting symbol: it is replaced by a predefined character string when the text is formatted. ESG treats such symbols as being unspecified for number and gender; number may be assigned during parsing, owing to agreement constraints.</Paragraph>
      <Paragraph position="6"> In the training phase, we refined our tests for pleonastic pronouns and experimented extensively with salience weighting. Our goal was, of course, to optimize RAP's success rate with the training corpus. We proceeded heuristically, analyzing cases of failure and attempting to eliminate them in as general a manner as possible. The parallelism reward was introduced at this time, as it seemed to make a substantial contribution to the overall success rate. A salience factor that was originally present, viz. matrix emphasis, was revised to become the non-adverbial emphasis factor. In its original form, this factor contributed to the salience of any NP not contained in a subordinate clause or in an adverbial PP demarcated by a separator. This was found to be too general, especially since the relative positions of a given pronoun and its antecedent candidates are not taken into account. The revised factor could be thought of as an adverbial penalty factor, since it in effect penalizes NPs occurring in adverbial PPs. 10 We also experimented with the initial weights for the various factors and with the size of the parallelism reward and cataphora penalty, again attempting to optimize RAP's overall success rate. A value of 35 was chosen for the parallelism reward; this is just large enough to offset the preference for subjects over accusative objects. A much larger value (175) was found to be necessary for the cataphora penalty. The final results that we obtained for the training corpus are given in Table 2.</Paragraph>
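      <Paragraph> The two pronoun-local adjustments can be written down directly from the values reported above (35 and 175). The function name, the slot labels, and the use of token positions are assumptions for this sketch, as are the emphasis weights of 80 and 50 used in the usage note.

```python
PARALLELISM_REWARD = 35   # value reported in the text
CATAPHORA_PENALTY = 175   # value reported in the text


def adjusted_salience(base, candidate_slot, pronoun_slot,
                      candidate_position, pronoun_position):
    """Apply the pronoun-local adjustments to a candidate's salience weight.

    The result is used only while resolving one pronoun; the stored
    salience factors are not changed.
    """
    weight = base
    if candidate_position > pronoun_position:  # candidate follows the pronoun
        weight -= CATAPHORA_PENALTY
    if candidate_slot == pronoun_slot:         # same grammatical role
        weight += PARALLELISM_REWARD
    return weight
```

      If subject emphasis were 80 and accusative emphasis 50 (assumed values), an object candidate competing for an object pronoun would score 50 + 35 = 85 on these two components, just above a subject candidate's 80, which is the sense in which the reward offsets the preference for subjects.</Paragraph>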
      <Paragraph position="7"> Interestingly, the syntactic-morphological filter reduces the set of possible antecedents to a single NP, or identifies the pronoun as pleonastic in 163 of the 475 cases (34%) that the algorithm resolves correctly. 11 It significantly restricts the size of the candidate list in most of the other cases, in which the antecedent is selected on the basis of salience ranking and proximity. This indicates the importance of a powerful syntactic-morphological filtering component in an anaphora resolution system.</Paragraph>
      <Paragraph position="8"> We then performed a blind test of RAP on a test set of 345 sentences randomly selected from a corpus of 48 computer manuals containing 1.25 million words. 12 The results which we obtained for the test corpus (without any further modifications of RAP) are given in Table 3. 13 This blind test provides the basis for a comparative evaluation of RAP and Dagan's (1992) system, RAPSTAT, which employs both RAP's salience weighting mechanism and statistically measured lexical preferences, as well as for a detailed analysis of the relative contributions of the various elements of RAP's salience weighting mechanism to its overall success rate. We will discuss the blind test in greater detail in the following sections.</Paragraph>
      <Paragraph position="9"> 9 These sentences and those used in the blind test were edited slightly to overcome parse inaccuracies. Rather than revise the lexicon, we made lexical substitutions to improve parses. In some cases constructions had to be simplified. However, such changes did not alter the syntactic relations among the pronoun and its possible antecedents. For a discussion of ESG's parsing accuracy, see McCord (1993).</Paragraph>
      <Paragraph position="10"> 10 See comments at the end of Section 4 about refining RAP's measures of structural salience.
11 Forty-three of the pronoun occurrences in the training corpus (~ 8%) were pleonastic; a random sample of 245 pronoun occurrences extracted from our test corpus included 15 pleonastic pronouns (~ 6%).
12 The test set was filtered in order to satisfy the conditions of our experiments on the role of statistically measured lexical preference in enhancing RAP's performance. See Section 5.1 for a discussion of these conditions.</Paragraph>
    </Section>
    <Section position="7" start_page="549" end_page="550" type="sub_section">
      <SectionTitle>
4.1 Limitations of the Current Algorithm
</SectionTitle>
      <Paragraph position="0"> Several classes of errors that RAP makes are worthy of discussion. The first occurs with many cases of intersentential anaphora, such as the following: This green indicator is lit when the controller is on.</Paragraph>
      <Paragraph position="1"> It shows that the DC power supply voltages are at the correct levels.</Paragraph>
      <Paragraph position="2"> Morphological and syntactic filtering exclude all possible intrasentential candidates. Because the level of sentential embedding does not contribute to RAP's salience weighting mechanism, indicator.(1.3) and controller.(1.8) are ranked equally, since both are subjects. RAP then erroneously chooses controller.(1.8) as the antecedent, since it is closer to the pronoun than the other candidate.</Paragraph>
      <Paragraph position="3"> The next class of errors involves antecedents that receive a low salience weighting owing to the fact that the evoking NP is embedded in a matrix NP or is in another structurally nonprominent position (such as object of an adverbial PP).</Paragraph>
      <Paragraph position="4"> The users you enroll may not necessarily be new to the system and may already have a user profile and a system distribution directory entry.</Paragraph>
      <Paragraph position="5"> &amp;ofc. checks for the existence of these objects and only creates them as necessary.</Paragraph>
      <Paragraph position="6"> Despite the general preference for intrasentential candidates, user.(1.2) is selected as the antecedent, since the only factor contributing to the salience weight of object.(2.8) is sentence recency. Selectional restrictions or statistically measured lexical preferences (see Section 5) could clearly help in at least some of these cases.</Paragraph>
      <Paragraph position="7"> In another class of cases, RAP fails because semantic/pragmatic information is required to identify the correct antecedent.</Paragraph>
      <Paragraph position="8"> conditions. 13 Proper resolution was determined by a consensus of three opinions, including that of the first author.  Again, the Migration Aid produces an exception report automatically at the end of every migration run.</Paragraph>
      <Paragraph position="9"> As you did with the function, use it to verify that the items have been restored to your system successfully.</Paragraph>
      <Paragraph position="10"> function.(2.6) is selected as the antecedent, rather than aid.(1.5).</Paragraph>
    </Section>
    <Section position="8" start_page="550" end_page="552" type="sub_section">
      <SectionTitle>
4.2 The Relative Contributions of the Salience Weighting Mechanisms
</SectionTitle>
      <Paragraph position="0"> Using the test corpus of our blind test, we conducted experiments with modified versions of RAP, in which various elements of the salience weighting mechanism were switched off. We present the results in Table 4 and discuss their significance.</Paragraph>
      <Paragraph position="1"> Ten variants are presented in Table 4; they are as follows:
I &quot;standard&quot; RAP (as used in the blind test)
II parallelism reward deactivated
III non-adverbial and head emphasis deactivated
IV matrix emphasis used instead of non-adverbial emphasis
V cataphora penalty deactivated
VI subject, existential, accusative, and indirect object/oblique complement emphasis (i.e., hierarchy of grammatical roles) deactivated
VII equivalence classes deactivated
VIII sentence recency and salience degradation deactivated
IX all &quot;structural&quot; salience weighting deactivated (II + III + V + VI)
X all salience weighting and degradation deactivated
The single most important element of the salience weighting mechanism is the recency preference (sentence recency factor and salience degradation; see VIII). This is not surprising, given the relative scarcity of intersentential anaphora in our test corpus (less than 20% of the pronoun occurrences had antecedents in the preceding sentence). Deactivating the equivalence class mechanism also led to a significant deterioration in RAP's performance; in this variant (VII), only the salience factors applying to a particular NP contribute to its salience weight, without any contribution from other anaphorically linked NPs. The performance of the syntactic filter is degraded somewhat in this variant as well, since NPs that are anaphorically linked to an NP fulfilling the criteria for disjoint reference will no longer be rejected as antecedent candidates. The results for VII and VIII indicate that attentional state plays a significant role in pronominal anaphora resolution and that even a simple model of attentional state can be quite effective.</Paragraph>
      <Paragraph position="2"> Deactivating the syntax-based elements of the salience weighting mechanism individually led to relatively small deteriorations in the overall success rate (II, III, IV, V, and VI). Eliminating the hierarchy of grammatical roles (VI), for example, led to a deterioration of less than 4%. Despite the comparatively small degradation in performance that resulted from turning off these elements individually, their combined effect is quite significant, as the results of IX show. This suggests that the syntactic salience factors operate in a complex and highly interdependent manner for anaphora resolution.</Paragraph>
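The structural variants can be pictured as switching individual salience factors on and off before the weights are summed. The following is a minimal sketch under stated assumptions, not the authors' implementation: the factor names, the numeric weights, and the names WEIGHTS, DISABLED, and salience are placeholders chosen for illustration (this section does not list the weights), and only variants I, II, III, V, VI, and IX are modeled.

```python
# Placeholder weights: illustrative assumptions, not taken from this section.
WEIGHTS = {
    "sentence_recency": 100,
    "subject": 80,
    "existential": 70,
    "accusative": 50,
    "indirect_object_oblique": 40,
    "head_noun": 80,
    "non_adverbial": 50,
    "parallelism": 35,
    "cataphora": -175,  # a penalty, so it enters the sum with a negative sign
}

ROLE_HIERARCHY = {"subject", "existential", "accusative", "indirect_object_oblique"}

# Factors switched off in each structural variant described in the text.
DISABLED = {
    "I": set(),
    "II": {"parallelism"},
    "III": {"non_adverbial", "head_noun"},
    "V": {"cataphora"},
    "VI": set(ROLE_HIERARCHY),
}
# Variant IX is defined in the text as II + III + V + VI.
DISABLED["IX"] = DISABLED["II"] | DISABLED["III"] | DISABLED["V"] | DISABLED["VI"]


def salience(factors, variant="I"):
    """Sum the weights of a candidate's factors that the variant leaves active."""
    off = DISABLED[variant]
    return sum(WEIGHTS[f] for f in factors if f not in off)


# A hypothetical subject NP in the current sentence.
subject_np = ["sentence_recency", "subject", "head_noun", "non_adverbial"]
for variant in ("I", "III", "VI", "IX"):
    print(variant, salience(subject_np, variant))
```

Under these placeholder weights, the candidate keeps only its recency weight in variant IX, which is one way to see why turning off all structural factors together costs more than turning off any single one.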
      <Paragraph position="3"> X relies solely on syntactic/morphological filtering and proximity to choose an antecedent. Note that the sentence pairs of the blind test set were selected so that, for each pronoun occurrence, at least two antecedent candidates remained after syntactic/morphological filtering (see Section 5.1). In the 17 cases in which X correctly disagreed with RAP, the proper antecedent happened to be the most proximate candidate.
      We suspect that RAP's overall success rate can be improved (perhaps by 5% or more) by refining its measures of structural salience. Other measures of embeddedness, or perhaps of "distance" between anaphor and candidate measured in terms of clausal and NP boundaries, may be more effective than the current mechanisms for non-adverbial and head emphasis. 14 Empirical studies of patterns of pronominal anaphora in corpora (ideally in accurately and uniformly parsed corpora) could be helpful in defining the most effective measures of structural salience. One might use such studies to obtain statistical data for determining the reliability of each proposed measure as a predictor of the antecedent-anaphor relation and the orthogonality (independence) of all proposed measures.</Paragraph>
      <Paragraph position="4"> 5. Salience and Statistically Measured Lexical Preference
      Dagan (1992) constructs a procedure, which he refers to as RAPSTAT, for using statistically measured lexical preference patterns to reevaluate RAP's salience rankings of antecedent candidates. RAPSTAT assigns a statistical score to each element of a candidate list that RAP generates; this score is intended to provide a measure (relative to a corpus) of the preference that lexical semantic/pragmatic factors impose upon the candidate as a possible antecedent for a given pronoun. 15
      14 Such a distance measure is reminiscent of Hobbs' (1978) tree search procedure. See Section 6.1 for a discussion of Hobbs' algorithm and its limitations.</Paragraph>
      <Paragraph position="5"> The results for IV confirm our suspicions from the training phase that matrix emphasis (rewarding NPs not contained in a subordinate clause) does not contribute significantly to successful resolution.
      15 Assume that P is a non-pleonastic and non-reflexive pronoun in a sentence such that RAP generates the non-empty list L of antecedent candidates for P. Let H be the lexical head (generally a verb or a noun) of which P is an argument or an adjunct in the sentence. RAPSTAT computes a statistical score for each element Ci of L, on the basis of the frequency, in a corpus, with which Ci occurs in the same grammatical relation with H as P occurs with H in the sentence. The statistical score that RAPSTAT assigns to Ci is intended to model the probability of the event where Ci stands in the relevant grammatical relation to H, given the occurrence of Ci (but taken independently of the other elements of L).</Paragraph>
      <Paragraph position="6"> RAPSTAT reevaluates RAP's ranking of the elements of the antecedent candidate list L in a way that combines both the statistical scores and the salience values of the candidates. The elements of L appear in descending order of salience value. RAPSTAT processes L as follows. Initially, it considers the first two elements C1 and C2 of L. If (i) the difference in salience scores between C1 and C2 does not exceed a parametrically specified value (the salience difference threshold) and (ii) the statistical score of C2 is significantly greater than that of C1, then RAPSTAT substitutes C2 for C1 as the currently preferred candidate and proceeds to compare it with the next element of L. If conditions (i) and (ii) do not both hold, RAPSTAT confirms RAP's selection of C1 as the preferred antecedent. It repeats this procedure for each successive pair of candidates in L until either (i) or (ii) fails or the list is completed. In either case, the last currently preferred candidate is selected as the antecedent.</Paragraph>
      <Paragraph position="7"> An example of a case in which RAPSTAT overrules RAP is the following.</Paragraph>
      <Paragraph position="8"> The Send Message display is shown, allowing you to enter your message and specify where it will be sent.</Paragraph>
      <Paragraph position="9"> The two top candidates in the list that RAP generates for it.(1.17) are display.(1.4) with a salience value of 345 and message.(1.13), which has a salience value of 315. In the corpus that we used for testing RAPSTAT, the verb-object pair send-display appears only once, whereas send-message occurs 289 times. As a result, message receives a considerably higher statistical score than display. The salience difference threshold that we used for the test is 100, and conditions (i) and (ii) hold for these two candidates. The difference between the salience value of message and the third element of the candidate list is greater than 100. Therefore, RAPSTAT correctly selects message as the antecedent of it.</Paragraph>
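The RAPSTAT selection loop and the example above can be sketched as follows. This is a hedged illustration, not the published implementation: the text does not say how "significantly greater" is tested, so significantly_greater uses an assumed ratio test; raw tuple counts stand in for the statistical scores; and the third candidate, with its salience value of 200, is hypothetical (the text says only that it lies more than 100 below message).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    salience: int      # RAP salience value
    stat_score: float  # lexical preference score (here: a raw tuple count)


def significantly_greater(new, old, ratio=2.0):
    """Assumed test for condition (ii): new is at least `ratio` times old."""
    return new > old and new >= ratio * old


def rapstat_select(candidates, salience_threshold=100):
    """Walk the salience-ordered list, letting statistics overrule close calls."""
    ranked = sorted(candidates, key=lambda c: c.salience, reverse=True)
    preferred = ranked[0]
    for challenger in ranked[1:]:
        # Condition (i): the salience gap does not exceed the threshold.
        close = salience_threshold >= preferred.salience - challenger.salience
        # Condition (ii): the challenger is statistically much preferred.
        better = significantly_greater(challenger.stat_score, preferred.stat_score)
        if not (close and better):
            break  # (i) or (ii) failed: keep the current preference
        preferred = challenger
    return preferred


candidates = [
    Candidate("display", 345, 1),    # send-display occurs once
    Candidate("message", 315, 289),  # send-message occurs 289 times
    Candidate("third", 200, 0),      # hypothetical third element of the list
]
print(rapstat_select(candidates).name)  # message
```

With the threshold of 100 used in the test, message replaces display in the first comparison, and the loop stops at the third candidate because the salience gap is too large.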
    </Section>
    <Section position="9" start_page="552" end_page="553" type="sub_section">
      <SectionTitle>
5.1 A Blind Test of RAP and RAPSTAT
</SectionTitle>
      <Paragraph position="0"> Dagan et al. (in press) report a comparative blind test of RAP and RAPSTAT. To construct a database of grammatical relation counts for RAPSTAT, we applied the Slot Grammar parser to a corpus of 1.25 million words of text from 48 computer manuals. We automatically extracted all lexical tuples and recorded their frequencies in the parsed corpus. We then constructed a test set of pronouns by randomly selecting from the corpus sentences containing at least one non-pleonastic third person pronoun occurrence. For each such sentence in the set, we included the sentence that immediately precedes it in the text (when the preceding sentence does not contain a pronoun). 16</Paragraph>
      <Paragraph position="1"> See Dagan (1992) and Dagan et al. (in press) for a discussion of this lexical statistical approach to ranking antecedent candidates and possible alternatives.
      16 In the interests of simplicity and uniformity, we discarded sentence pairs in which the first sentence contains a pronoun. We decided to limit the text preceding the sentence containing the pronoun to one sentence because we found that in the manuals which we used to tune the algorithm, almost all cases of intersentential anaphora involved an antecedent in the immediately preceding sentence. Moreover, the progressive decline in the salience values of antecedent candidates in previous sentences ensures that a candidate appearing in a sentence which is more than one sentence prior to the current one will be selected only if no candidates exist in either the current or the preceding sentence. As such cases are relatively rare in the type of text we studied, we limited our test set to textual units containing the current and the preceding sentence.</Paragraph>
      <Paragraph position="2"> We filtered the test set so that for each pronoun occurrence in the set, (i) RAP generates a candidate list with at least two elements, (ii) the actual antecedent NP appears in the candidate list, and (iii) there is a total tuple frequency greater than 1 for the candidate list (in most cases, it was considerably larger). 17 The test set contains 345 sentence pairs with a total of 360 pronoun occurrences. The results of the blind test for RAP and RAPSTAT are as follows. 18</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="553" end_page="558" type="metho">
    <SectionTitle>
RAP RAPSTAT
</SectionTitle>
    <Paragraph position="0"> When we further analyzed the results of the blind test, we found that RAPSTAT's success depends in large part on its use of salience information. If RAPSTAT's statistically based lexical preference scores are used as the only criterion for selecting an antecedent, the statistical selection procedure disagrees with RAP in 151 out of 338 instances. RAP is correct in 120 (79%) of these cases and the statistical decision in 31 (21%) of the cases. When salience is factored into RAPSTAT's decision procedure, the rate of disagreement between RAP and RAPSTAT declines sharply, and RAPSTAT's performance slightly surpasses that of RAP, yielding the results that we obtained in the blind test.</Paragraph>
    <Paragraph position="1"> In general, RAPSTAT is a conservative statistical extension of RAP. It permits statistically measured lexical preference to overturn salience-based decisions only in cases in which the difference between the salience values of two candidates is small and the statistical preference for the less salient candidate is comparatively large. 19 The comparative blind test indicates that incorporating statistical information on lexical preference patterns into a salience-based anaphora resolution procedure can yield a modest improvement in performance relative to a system that relies only on syntactic salience for antecedent selection. Our analysis of these results also shows that statistically measured lexical preference patterns alone provide a far less efficient basis for anaphora resolution than an algorithm based on syntactic and attentional measures of salience. 20
    6. Comparison with Other Approaches to Anaphora Resolution
    We will briefly compare our algorithm with several other approaches to anaphora resolution that have been suggested.</Paragraph>
    <Paragraph position="2"> 17 In previous tests of RAP we found that it generates a candidate list that includes the correct antecedent of the pronoun in approximately 98% of the cases to which it applies.</Paragraph>
    <Paragraph position="3"> 18 We take RAPSTAT as deciding a case when it considers at least two candidates rather than deferring to RAP after the initial candidate because of a large salience difference between this candidate and the next one in the list. In cases in which RAPSTAT does not make an independent decision, it endorses RAP's selection. RAPSTAT's total success rate includes both sorts of cases.</Paragraph>
    <Paragraph position="4"> 19 John Justeson did the statistical analysis of the comparative blind test of RAP and RAPSTAT. These results are described in Dagan et al. (in press).</Paragraph>
    <Paragraph position="5"> 20 Dagan (1992) reaches a similar conclusion on the basis of a much smaller experiment.</Paragraph>
    <Section position="1" start_page="554" end_page="555" type="sub_section">
      <SectionTitle>
6.1 Hobbs' Algorithm
</SectionTitle>
      <Paragraph position="0"> Hobbs' (1978) algorithm relies on a simple tree search procedure formulated in terms of depth of embedding and left-right order. By contrast, RAP uses a multi-dimensional measure of salience that invokes a variety of syntactic properties specified in terms of the head-argument structures of Slot Grammar, as well as a model of attentional state.</Paragraph>
      <Paragraph position="1"> Hobbs' tree search procedure selects the first candidate encountered by a left-right depth first search of the tree outside of a minimal path to the pronoun that satisfies certain configurational constraints. The algorithm chooses as the antecedent of a pronoun P the first NPi in the tree obtained by left-to-right breadth-first traversal of the branches to the left of the path T such that (i) T is the path from the NP dominating P to the first NP or S dominating this NP, (ii) T contains an NP or S node N that contains the NP dominating P, and (iii) N does not contain NPi. If an antecedent satisfying this condition is not found in the sentence containing P, the algorithm selects the first NP obtained by a left-to-right breadth first search of the surface structures of preceding sentences in the text.</Paragraph>
      <Paragraph position="2"> We have implemented a version of Hobbs' algorithm for Slot Grammar. The original formulation of the algorithm encodes syntactic constraints on pronominal anaphora in the definition of the domain to which the search for an antecedent NP applies. In our implementation of the algorithm, we have factored out the search procedure and substituted RAP's syntactic-morphological filter for Hobbs' procedural filter. Let the Mods (modifiers) of a head H be the sisters of H in the Slot Grammar representation of the phrase that H heads. Our specification of Hobbs' algorithm for Slot Grammar is as follows:
      1. Find a node N1 such that (i) N1 contains the pronoun P; (ii) N1 is an S or NP; and (iii) it is not the case that there is a node N1' such that N1 contains N1' and N1' satisfies (i) and (ii).</Paragraph>
      <Paragraph position="3"> 2. Check the list of Mods of N1 left to right for NPs that are not elements of the list of pairs &lt;P-NP&gt; identified by the syntactic-morphological filter as noncoreferential and that occur to the left of P.
      3. Select the leftmost NP in the filtered list of NP Mods of N1.</Paragraph>
      <Paragraph position="4"> 4. If this list is nil, then repeat steps 2 and 3 recursively for each Mod in the list of Mods of N1, each Mod in this second list of Mods, etc., until an NP antecedent is found.</Paragraph>
      <Paragraph position="5"> 5. If no NP antecedent is found by applying step 4, then identify a node N2 that is the first NP/S containing N1.</Paragraph>
      <Paragraph position="6"> 6. If N2 is an NP and is not an element of the list of pairs &lt;P-NP&gt; identified by the filter, propose it as the antecedent.</Paragraph>
      <Paragraph position="7"> 7. Otherwise, apply steps 2-4 to N2.</Paragraph>
      <Paragraph position="8"> 8. If no antecedent NP is found, continue to apply steps 5 and 6 and then steps 2-4 to progressively higher NP/S nodes.</Paragraph>
      <Paragraph position="9"> 9. If no antecedent NPs are found at the highest S of the sentence, then take N1 to be the highest S node of the immediately preceding sentence and apply steps 2-4 to N1.</Paragraph>
      <Paragraph position="10">  We ran this version of Hobbs' algorithm on the test set that we used for the blind test of RAP and RAPSTAT; the results appear in Table 5.</Paragraph>
      <Paragraph position="11"> It is important to note that the test set does not include pleonastic pronouns or lexical anaphors (reflexive or reciprocal pronouns), neither of which are dealt with by Hobbs' algorithm. Moreover, our Slot Grammar implementation of the algorithm gives it the full advantage of RAP's syntactic-morphological filter, which is more powerful than the configurational filter built into the original specification of the algorithm. Therefore, the test results provide a direct comparison of RAP's salience metric and Hobbs' search procedure.</Paragraph>
      <Paragraph position="12"> Hobbs' algorithm was more successful than RAP in resolving intersentential anaphora (87% versus 74% correct). 21 Because intersentential anaphora is relatively rare in our corpus of computer manual texts and because RAP's success rate for intrasentential anaphora is higher than Hobbs' (89% versus 81%), RAP's overall success rate on the blind test set is 4% higher than that of our version of Hobbs' algorithm. This indicates that RAP's salience metric provides a more reliable basis for antecedent selection than Hobbs' search procedure for the text domain on which we tested both algorithms. It is clear from the relatively high rate of agreement between RAP and Hobbs' algorithm on the test set (they agree in 83% of the cases) that there is a significant degree of convergence between salience as measured by RAP and the configurational prominence defined by Hobbs' search procedure. This is to be expected in English, in which grammatical roles are identified by means of phrase order. However, in languages in which grammatical roles are case marked and word order is relatively free, we expect that there will be greater divergence in the predictions of the two algorithms. The salience measures used by RAP have application to a wider class of languages than Hobbs' order-based search procedure. This procedure relies on a correspondence of grammatical roles and linear precedence relations that holds for a comparatively small class of languages.</Paragraph>
    </Section>
    <Section position="2" start_page="555" end_page="557" type="sub_section">
      <SectionTitle>
6.2 Discourse Based Methods
</SectionTitle>
      <Paragraph position="0"> Most of the work in this area seeks to formulate general principles of discourse structure and interpretation and to integrate methods of anaphora resolution into a computational model of discourse interpretation (and sometimes of generation as well).</Paragraph>
      <Paragraph position="1"> Sidner (1981, 1983), Grosz, Joshi, and Weinstein (1983, 1986), Grosz and Sidner (1986), Brennan, Friedman, and Pollard (1987), and Webber (1988) present different versions of this approach. Dynamic properties of discourse, especially coherence and focusing, are invoked as the primary basis for identifying antecedent candidates; selecting a candidate as the antecedent of a pronoun in discourse involves additional constraints of a syntactic, semantic, and pragmatic nature.
      21 The difficulty that RAP encounters with such cases was discussed in Section 4.1. We are experimenting with refinements in RAP's scoring mechanism to improve its performance in these and other cases.</Paragraph>
      <Paragraph position="2"> In developing our algorithm, we have not attempted to consider elements of discourse structure beyond the simple model of attentional state realized by equivalence classes of discourse referents, salience degradation, and the sentence recency salience factor. The results of our experiments with computer manual texts (see Section 4.2) indicate that, at least for certain text domains, relatively simple models of discourse structure can be quite useful in pronominal anaphora resolution. We suspect that many aspects of discourse models discussed in the literature will remain computationally intractable for quite some time, at least for broad-coverage systems.</Paragraph>
      <Paragraph position="3"> A more extensive treatment of discourse structure would no doubt improve the performance of a structurally based algorithm such as RAP. At the very least, formatting information concerning paragraph and section boundaries, list elements, etc., should be taken into account. A treatment of definite NP resolution would also presumably lead to more accurate resolution of pronominal anaphora, since it would improve the reliability of the salience weighting mechanism.</Paragraph>
      <Paragraph position="4"> However, some current discourse-based approaches to anaphora resolution assign too dominant a role to coherence and focus in antecedent selection. As a result, they establish a strong preference for intersentential over intrasentential anaphora resolution. This is the case with the anaphora resolution algorithm described by Brennan, Friedman, and Pollard (1987). This algorithm is based on the centering approach to modeling attentional structure in discourse (Grosz, Joshi, and Weinstein 1983, 1986). 22 Constraints and rules for centering are applied by the algorithm as part of the selection procedure for identifying the antecedents of pronouns in a discourse. The algorithm strongly prefers intersentential antecedents that preserve the center or maximize continuity in center change, to intrasentential antecedents that cause radical center shifts. This strong preference for intersentential antecedents is inappropriate for at least some text domains--in our corpus of computer manual texts, for example, we estimate that less than 20% of referentially used third person pronouns have intersentential antecedents. 23 There is a second difficulty with the Brennan et al. centering algorithm. It uses a hierarchy of grammatical roles quite similar to that of RAP, but this role hierarchy does not directly influence antecedent selection. Whereas the hierarchy in RAP contributes to a multi-dimensional measure of the relative salience of all antecedent candidates, in Brennan et al. 1987, it is used only to constrain the choice of the backward-looking center, Cb, of an utterance. It does not serve as a general preference measure for antecedence. The items in the forward center list, Cf, are ranked according to the hierarchy of grammatical roles. For an utterance Un, Cb(Un) is required to be the highest ranked element of Cf(Un-1) that is realized in Un.</Paragraph>
      <Paragraph position="5"> 22 "A discourse segment consists of a sequence of utterances U1, ..., Um. With each utterance, Un is associated with a list of forward-looking centers, Cf(Un), consisting of those discourse entities that are directly realized or realized by linguistic expressions in that utterance. Ranking of an entity on this list corresponds roughly to the likelihood that it will be the primary focus of subsequent discourse; the first entity on this list is the preferred center, Cp(Un). Un actually centers, or is 'about,' only one entity at a time, the backward-looking center, Cb(Un). The backward center is a confirmation of an entity that has already been introduced into the discourse; more specifically, it must be realized in the immediately preceding utterance, Un-1" (Brennan, Friedman, and Pollard 1987, p. 155).
      23 This estimate is based on the small random sample used in our blind test (see Section 5.1).</Paragraph>
      <Paragraph position="6"> If an element E in the list of possible forward centers, Cf(Un-1), is identified as the antecedent of a pronoun in Un, then E is realized in Un. The Brennan et al. centering algorithm does not require that the highest ranked element of Cf(Un-1) actually be realized in Un, but only that Cb(Un) be the highest ranked element of Cf(Un-1) which is, in fact, realized in Un. Antecedent selection is constrained by rules that sustain cohesion in the relations between the backward centers of successive utterances in a discourse, but it is not determined directly by the role hierarchy used to rank the forward centers of a previous utterance. Therefore, an NP in Un-1 that is relatively low in the hierarchy of grammatical roles can serve as an antecedent of a pronoun in Un, provided that no higher ranked NP in Un-1 is taken as the antecedent of some other pronoun or definite NP in Un. 24 An example will serve to illustrate the problem with this approach.</Paragraph>
      <Paragraph position="7"> The display shows you the status of all the printers.</Paragraph>
      <Paragraph position="8"> It also provides options that control printers.</Paragraph>
      <Paragraph position="9"> The (ranked) forward center list for the first sentence is as follows: ([DISPLAY] [STATUS] [YOU] [PRINTERS]).</Paragraph>
      <Paragraph position="10"> Applying the filters and ranking mechanism of Brennan, Friedman, and Pollard (1987) yields two possible anchors. 25 Each anchor determines a choice of Cb(Un) and the antecedent of it. One anchor identifies both with display, whereas the second takes both to be status. The hierarchy of grammatical roles is not used to select display over status. Nothing in the algorithm rules out the choice of status as the backward center for the second sentence and as the antecedent of it. If this selection is made, display is not realized in the second sentence, and so Cb(Un) is status, which is then the highest ranked element of Cf(Un-1) that is realized in Un, as required by constraint 3 of the Brennan et al. centering algorithm.</Paragraph>
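The point about constraint 3 can be checked mechanically. The sketch below is an illustration only, not the Brennan et al. algorithm: backward_center and anchors are hypothetical names, and it is assumed that agreement filtering leaves DISPLAY and STATUS as the candidates for it and that the second sentence also realizes PRINTERS and introduces OPTIONS.

```python
CF_PREV = ["DISPLAY", "STATUS", "YOU", "PRINTERS"]  # ranked Cf(Un-1), from the text


def backward_center(cf_prev, realized):
    """Constraint 3: Cb(Un) is the highest ranked element of Cf(Un-1) realized in Un."""
    for entity in cf_prev:
        if entity in realized:
            return entity
    return None


def anchors(cf_prev, pronoun_candidates, other_realized):
    """Pair each candidate antecedent of the pronoun with the Cb it induces."""
    result = []
    for antecedent in pronoun_candidates:
        realized = set(other_realized) | {antecedent}
        result.append((antecedent, backward_center(cf_prev, realized)))
    return result


# Assumed inputs: two surviving candidates for "it"; other entities realized in Un.
print(anchors(CF_PREV, ["DISPLAY", "STATUS"], ["OPTIONS", "PRINTERS"]))
# [('DISPLAY', 'DISPLAY'), ('STATUS', 'STATUS')]
```

Both pairs satisfy the constraint, because choosing status as the antecedent removes display from the set of entities realized in the second sentence. The ranking of Cf(Un-1) therefore never arbitrates between the two candidates.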
      <Paragraph position="11"> In general, we agree with Alshawi (1987, p. 62) that an algorithm/model relying on the relative salience of all entities evoked by a text, with a mechanism for removing or filtering entities whose salience falls below a threshold, is preferable to models that "make assumptions about a single (if shifting) focus of attention." 26</Paragraph>
    </Section>
    <Section position="3" start_page="557" end_page="558" type="sub_section">
      <SectionTitle>
6.3 Mixed Models
</SectionTitle>
      <Paragraph position="0"> This approach seeks to combine a variety of syntactic, semantic, and discourse factors into a multi-dimensional metric for ranking antecedent candidates. On this view, the score of a candidate is a composite of several distinct scoring procedures, each of which reflects the prominence of the candidate with respect to a specific type of information or property. The systems described by Asher and Wada (1988), Carbonell and Brown (1988), and Rich and LuperFoy (1988) are examples of this mixed evaluation strategy.</Paragraph>
      <Paragraph position="1"> In general, these systems use composite scoring procedures that assign a global rank to an antecedent candidate on the basis of the scores that it receives from several evaluation metrics. Each such metric scores the likelihood of the candidate relative to a distinct informational factor. Thus, for example, Rich and LuperFoy (1988) propose a system that computes the global preference value of a candidate from the scores provided by a set of constraint source modules, in which each module invokes different sorts of conditions for ranking the antecedent candidate. The set of modules includes (among others) syntactic and morphological filters for checking agreement and syntactic conditions on disjoint reference, a procedure for applying semantic selection restrictions to a verb and its arguments, a component that uses contextual and real-world knowledge, and modules that represent both the local and global focus of discourse. The global ranking of an antecedent candidate is a function of the scores that it receives from each of the constraint source modules.</Paragraph>
      <Paragraph position="2"> 24 Other factors, such as level of embedding, may also be considered in generating an ordering for the list of forward-looking centers. Walker, Iida, and Cote (1990) discuss ordering conditions appropriate for Japanese.
      25 An anchor is an association between a backward-looking center, Cb, and a list of forward-looking centers, Cf, for an utterance. An anchor establishes a link between a pronoun and its antecedent by associating the reference marker of the antecedent with that of the pronoun in the Cf list of the utterance.
      26 See Walker (1989) for a comparison of the algorithm of Brennan, Friedman, and Pollard (1987) with that of Hobbs (1978) based on a hand simulation.</Paragraph>
      <Paragraph position="3"> Our algorithm also uses a mixed evaluation strategy. We have taken inspiration from the discussions of scoring procedures in the works cited above, but we have avoided constraint sources involving complex inferencing mechanisms and real-world knowledge, typically required for evaluating the semantic/pragmatic suitability of antecedent candidates or for determining details of discourse structure. In general, it seems to us that reliable large-scale modelling of real-world and contextual factors is beyond the capabilities of current computational systems. Even constructing a comprehensive, computationally viable system of semantic selection restrictions and an associated type hierarchy for a natural language is an exceedingly difficult problem, which, to our knowledge, has yet to be solved. Moreover, our experiments with statistically based lexical preference information cast doubt on the efficacy of relatively inexpensive (and superficial) methods for capturing semantic and pragmatic factors for purposes of anaphora resolution. Our results suggest that scoring procedures which rely primarily on tractable syntactic and attentional (recency) properties can yield a broad-coverage anaphora resolution system that achieves a good level of performance.</Paragraph>
    </Section>
  </Section>
</Paper>