<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1033"> <Title>An NP-Cluster Based Approach to Coreference Resolution</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Coreference resolution is the process of linking, as a cluster, multiple expressions that refer to the same entity in a document. In recent years, supervised machine learning approaches have been applied to this problem and have achieved considerable success (e.g. Aone and Bennett (1995); McCarthy and Lehnert (1995); Soon et al. (2001); Ng and Cardie (2002b)). The main idea of most supervised learning approaches is to recast this task as a binary classification problem. Specifically, a classifier is learned and then used to determine whether or not two NPs in a document are co-referring. Clusters are formed by linking coreferential NP pairs according to a certain selection strategy. In this way, the identification of coreferential clusters in text is reduced to the identification of coreferential NP pairs.</Paragraph> <Paragraph position="1"> One problem with such a reduction, however, is that an individual NP usually lacks adequate descriptive information about the entity it refers to. Consequently, it is often difficult to judge whether or not two NPs are talking about the same entity simply from the properties of the pair alone. As an example, consider the pair of a non-pronoun and its pronominal antecedent candidate. The pronoun itself gives few clues for determining the reference. Using such NP pairs would have a negative influence on rule learning and subsequent resolution. So far, several efforts (Harabagiu et al., 2001; Ng and Cardie, 2002a; Ng and Cardie, 2002b) have attempted to address this problem by discarding the &quot;hard&quot; pairs and selecting only the confident ones from the NP-pair pool. 
Nevertheless, this elimination strategy still cannot guarantee that the NPs in &quot;confident&quot; pairs carry the necessary descriptive information about their referents.</Paragraph> <Paragraph position="2"> In this paper, we present a supervised learning-based approach to coreference resolution. Rather than attempting to mine the reference relationships between NP pairs, our approach performs resolution by determining the links between NPs and the existing coreferential clusters. In our approach, a classifier is trained on instances formed by an NP and one of its possible antecedent clusters, and is then applied during resolution to select the proper cluster to which an encountered NP should be linked. As a coreferential cluster offers richer information to describe an entity than a single NP in the cluster, we can expect that such an NP-Cluster framework would enhance the resolution capability of the system. Our experiments were done on the MEDLINE data set. Compared with the baseline approach based on the NP-NP framework, our approach improves recall by 4.6% while still gaining 1.3% in precision. These results indicate that the NP-Cluster based approach is effective for the coreference resolution task.</Paragraph> <Paragraph position="3"> The remainder of this paper is organized as follows. Section 2 introduces the NP-NP based approach as the baseline, while Section 3 presents our NP-Cluster based approach in detail. Section 4 reports and discusses the experimental results. Section 5 describes related research work.</Paragraph> <Paragraph position="4"> Finally, the conclusion is given in Section 6.</Paragraph> </Section> </Paper>