<?xml version="1.0" standalone="yes"?> <Paper uid="W99-0207"> <Title>Corpus-Based Anaphora Resolution Towards Antecedent Preference</Title> <Section position="2" start_page="0" end_page="47" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Coreference information is relevant for numerous NLP systems. Our interest in anaphora resolution stems from the demand on machine translation systems to translate (possibly omitted) anaphoric expressions in agreement with the morphosyntactic characteristics of the referred object, in order to prevent contextual misinterpretations.</Paragraph> <Paragraph position="1"> So far, various approaches to anaphora resolution have been proposed (see section 4 for a more detailed comparison with related research). In this paper, a machine learning approach (a decision tree) is combined with a preference selection method based on the frequency information of non-/coreferential pairs tagged in the corpus as well as on distance features within the current discourse.</Paragraph> <Paragraph position="2"> The advantage of machine learning approaches is that they result in modular anaphora resolution systems that are automatically trainable from a corpus with no or only a minimal amount of human intervention.</Paragraph> <Paragraph position="3"> In the case of decision trees, we do have to provide information about possible antecedent indicators (syntactic, semantic, and pragmatic features) contained in the corpus, but the relevance of these features for the resolution task is extracted automatically from the training data.</Paragraph>
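<Paragraph position="4"> As a minimal sketch (assuming, for illustration only, the scikit-learn library, invented feature names, and toy data; this is not the implementation described in this paper), such a pairwise decision tree could be trained as follows:

# Illustrative sketch: train a decision tree on anaphor-candidate pairs.
# The library (scikit-learn), the feature names, and the toy data are
# assumptions made for illustration.
from sklearn.tree import DecisionTreeClassifier

# Each instance encodes one anaphor-candidate pair by indicators of the
# kind mentioned above: syntactic agreement, semantic class, and
# primitive discourse features such as sentence distance.
# Columns: [number_agree, gender_agree, same_semantic_class, distance]
X = [
    [1, 1, 1, 0],
    [1, 0, 1, 2],
    [0, 1, 0, 5],
    [1, 1, 0, 1],
]
y = [1, 0, 0, 1]  # 1 = coreferential pair, 0 = non-coreferential

# The relevance of each feature for the resolution task is determined
# automatically during training; no manual weighting is supplied.
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(tree.predict([[1, 1, 1, 1]]))  # classify an unseen pair
</Paragraph>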
<Paragraph position="5"> Machine learning approaches using decision trees proposed so far have focused on preference selection criteria directly derived from the decision tree results. The work described in (Connolly et al., 1994) utilized a decision tree capable of judging which of two given anaphor-antecedent pairs is &quot;better&quot;. However, due to the lack of a strong &quot;transitivity&quot; assumption, this sorting algorithm behaves more like a greedy heuristic search and may be unable to find the &quot;best&quot; solution.</Paragraph> <Paragraph position="6"> The preference selection for a single antecedent in (Aone and Bennett, 1995) is based on the maximization of confidence values returned from a pruned decision tree for given anaphor-candidate pairs. However, decision trees learn specific features independently, i.e., relations between single attributes cannot be obtained automatically. Accordingly, the use of dependency factors for preference selection during decision tree training requires that artificially created attributes expressing these dependencies be defined beforehand. This not only extends human intervention into the automatic learning procedure (i.e., deciding which dependencies are important), but can also hamper the contextual adaptation of preference selection methods.</Paragraph> <Paragraph position="7"> The preference selection in our approach is based on the combination of statistical frequency information and distance features in the discourse. Therefore, our decision tree is not applied directly to the task of preference selection, but aims at the elimination of irrelevant candidates based on the knowledge obtained from the training data.</Paragraph> <Paragraph position="8"> The decision tree is trained on syntactic (lexical word attributes), semantic, and primitive discourse (distance, frequency) information and determines the coreferential relation between an anaphor and an antecedent candidate in the given context. Irrelevant antecedent candidates are filtered out, achieving a noise reduction for the preference selection algorithm. A preference value is assigned to each potential anaphor-candidate pair depending on the proportion of non-/coreferential occurrences of the pair in the training corpus (frequency ratio) and the relative position of both elements in the discourse (distance). The candidate with the maximal preference value is resolved as the antecedent of the anaphoric expression.</Paragraph>
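<Paragraph position="9"> A minimal sketch of this selection procedure follows (the helper names and the linear combination of frequency ratio and distance penalty are illustrative assumptions, not the exact formula used in our system):

# Illustrative sketch of filtering plus preference selection; all names
# and the scoring formula are assumptions made for illustration.

def frequency_ratio(pair, coref_freq, noncoref_freq):
    """Proportion of coreferential occurrences of an anaphor-candidate
    pair among all its tagged occurrences in the training corpus."""
    pos = coref_freq.get(pair, 0)
    neg = noncoref_freq.get(pair, 0)
    return pos / (pos + neg) if pos + neg else 0.0

def resolve(anaphor, candidates, tree_filter, coref_freq, noncoref_freq,
            distance_penalty=0.1):
    # Step 1: the decision tree eliminates irrelevant candidates
    # (noise reduction for the preference selection algorithm).
    relevant = [c for c in candidates if tree_filter(anaphor, c)]

    # Step 2: assign each surviving pair a preference value combining
    # the corpus frequency ratio and the distance in the discourse.
    def preference(c):
        ratio = frequency_ratio((anaphor["lemma"], c["lemma"]),
                                coref_freq, noncoref_freq)
        return ratio - distance_penalty * c["distance"]

    # Step 3: the candidate with the maximal preference value is
    # resolved as the antecedent of the anaphoric expression.
    return max(relevant, key=preference, default=None)
</Paragraph> </Section> </Paper>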