<?xml version="1.0" standalone="yes"?> <Paper uid="P05-1020"> <Title>Machine Learning for Coreference Resolution: From Local Classification to Global Ranking</Title> <Section position="2" start_page="0" end_page="157" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Recent research in coreference resolution -- the problem of determining which noun phrases (NPs) in a text or dialogue refer to which real-world entity -- has exhibited a shift from knowledge-based approaches to data-driven approaches, yielding learning-based coreference systems that rival their hand-crafted counterparts in performance (e.g., Soon et al. (2001), Ng and Cardie (2002b), Strube et al. (2002), Yang et al. (2003), Luo et al. (2004)). The central idea behind the majority of these learning-based approaches is to recast coreference resolution as a binary classification task. Specifically, a classifier is first trained to determine whether two NPs in a document are coreferent. A separate clustering mechanism then coordinates the possibly contradictory pairwise coreference classification decisions and constructs a partition of the given set of NPs, with one cluster for each set of coreferent NPs.</Paragraph> <Paragraph position="1"> Though reasonably successful, this &quot;standard&quot; approach is not as robust as one might think. 
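The two-step pipeline just described can be sketched as follows. This is an illustrative sketch only: the closest-first linking rule and the form of the pairwise decision function are simplifying assumptions, not the specific systems cited above.

```python
# Sketch of the "standard" learning-based coreference pipeline:
# (1) a trained binary classifier decides, for each NP pair, whether they corefer;
# (2) a greedy clustering step links each NP to its closest positive antecedent.
# The closest-first rule is one common (assumed) choice of clustering procedure.

def cluster_closest_first(nps, corefer):
    """corefer(i, j) -> bool: pairwise decision from a trained classifier."""
    cluster_of = {}  # NP index -> cluster id
    next_id = 0
    for j in range(len(nps)):
        antecedent = None
        # Scan candidate antecedents right-to-left; link to the closest positive one.
        for i in range(j - 1, -1, -1):
            if corefer(i, j):
                antecedent = i
                break
        if antecedent is None:
            cluster_of[j] = next_id  # NP j starts a new entity
            next_id += 1
        else:
            cluster_of[j] = cluster_of[antecedent]
    # Return the partition: one list of NP indices per entity.
    clusters = {}
    for np_idx, cid in cluster_of.items():
        clusters.setdefault(cid, []).append(np_idx)
    return list(clusters.values())
```

For instance, with a trivial exact-string-match stand-in for the classifier, two identical mentions end up in one cluster and all other NPs become singletons. Note how the greedy linking commits to each decision immediately, which is exactly the weakness discussed next.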
First, design decisions such as the choice of the learning algorithm and the clustering procedure are apparently critical to system performance, but are often made in an ad hoc, unprincipled manner that may be empirically suboptimal.</Paragraph> <Paragraph position="2"> Second, this approach makes no attempt to search through the space of possible partitions when given a set of NPs to be clustered, employing instead a greedy clustering procedure to construct a partition that may be far from optimal.</Paragraph> <Paragraph position="3"> Another potential weakness of this approach concerns its inability to directly optimize for clustering-level accuracy: the coreference classifier is trained and optimized independently of the clustering procedure to be used, and hence improvements in classification accuracy do not guarantee corresponding improvements in clustering-level accuracy.</Paragraph> <Paragraph position="4"> Our goal in this paper is to improve the robustness of the standard approach by addressing the above weaknesses. Specifically, we propose the following procedure for coreference resolution: given a set of NPs to be clustered, (1) use n pre-selected learning-based coreference systems to generate n candidate partitions of the NPs, and then (2) apply an automatically acquired ranking model to rank these candidate hypotheses, selecting the best one to be the final partition. The key features of this approach are: Minimal human decision making. In contrast to the standard approach, our method obviates, to a large extent, the need to make tough or potentially suboptimal design decisions.1 For instance, if we cannot decide whether learner A is better to use than learner B in a coreference system, we can simply create two copies of the system, one employing A and the other B, and then add both to our pre-selected set of coreference systems. 
1We still need to determine the n coreference systems to be employed in our framework, however. Fortunately, the choice of n is flexible, and can be as large as we want, subject to available computing resources.</Paragraph> <Paragraph position="5"> Generation of multiple candidate partitions. Although an exhaustive search for the best partition is not computationally feasible even for a document with a moderate number of NPs, our approach explores a larger portion of the search space than the standard approach by generating multiple hypotheses, making it possible to find a potentially better partition of the NPs under consideration.</Paragraph> <Paragraph position="6"> Optimization for clustering-level accuracy via ranking. As mentioned above, the standard approach trains and optimizes a coreference classifier without necessarily optimizing for clustering-level accuracy. In contrast, we attempt to optimize our ranking model with respect to the target coreference scoring function, essentially by training it in such a way that a higher-scored candidate partition (according to the scoring function) is assigned a higher rank (see Section 3.2 for details).</Paragraph> <Paragraph position="7"> Perhaps even more importantly, our approach provides a general framework for coreference resolution. Instead of committing ourselves to a particular resolution method as in previous approaches, our framework makes it possible to leverage the strengths of different methods by allowing them to participate in the generation of candidate partitions.</Paragraph> <Paragraph position="8"> We evaluate our approach on three standard coreference data sets using two different scoring metrics. 
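In outline, the generate-and-rank procedure described above reduces to the following sketch; the candidate systems and the learned scoring model shown here are hypothetical stand-ins for the components detailed in Section 3.

```python
# Sketch of the generate-and-rank framework: run n pre-selected coreference
# systems to produce n candidate partitions, then let a learned ranking model
# select the best one. Both arguments are hypothetical stand-ins.

def resolve_by_ranking(nps, systems, rank_score):
    """systems: list of callables, each mapping NPs to a candidate partition.
    rank_score: learned model scoring a partition; higher is better."""
    candidates = [system(nps) for system in systems]  # step (1): generate
    return max(candidates, key=rank_score)            # step (2): rank and select
```

Because the final output is chosen by `rank_score`, training that model against the target coreference scoring function ties system optimization directly to clustering-level accuracy, which is the point of the framework.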
In our experiments, our approach compares favorably to two state-of-the-art coreference systems that adopt the standard machine learning approach, outperforming them by as much as 4-7% on the three data sets for one of the performance metrics.</Paragraph> </Section> </Paper>