<?xml version="1.0" standalone="yes"?>
<Paper uid="X96-1034">
  <Title>AN EVALUATION OF COREFERENCE RESOLUTION STRATEGIES FOR ACQUIRING ASSOCIATED INFORMATION</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> As part of our TIPSTER research program [Contract Number 94-F133200-000], we have developed a variety of strategies to resolve coreferences within a free text document. Coreference is typically defined to mean the identification of noun phrases that refer to the same object. This paper investigates a more general view of coreference in which our automatic system identifies not only coreferential phrases, but also phrases that additionally describe an object. Coreference has been found to be an important component of many applications.</Paragraph>
    <Paragraph position="1"> The following example illustrates a general view of coreference.</Paragraph>
    <Paragraph position="2"> American Express, the large financial institution, also known as Amex, will open an office in Peking.</Paragraph>
    <Paragraph position="3"> In this example, we would like to associate the following information about American Express: its name is American Express; an alias for it is Amex; its location is Peking, China; and it can be described as the large financial institution.</Paragraph>
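    <Paragraph> The associated information in this example can be pictured as a single record per entity. The following is a minimal sketch only, assuming a hypothetical EntityRecord container whose class and field names are illustrative and are not taken from our system or from the MUC6 template definitions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EntityRecord:
    """Hypothetical container for the information associated with one entity."""
    name: str
    aliases: List[str] = field(default_factory=list)      # name variations
    descriptors: List[str] = field(default_factory=list)  # descriptive phrases
    location: Optional[str] = None                         # location information


# Values taken from the American Express example sentence.
amex = EntityRecord(
    name="American Express",
    aliases=["Amex"],
    descriptors=["the large financial institution"],
    location="Peking, China",
)
```

 Keeping aliases and descriptors as lists reflects that a document may contain several coreferring phrases for the same entity.</Paragraph>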
    <Paragraph position="4"> In the work described in this paper, our goal was to evaluate the contributions of various techniques for associating an entity with three types of information:  1. Name Variations  2. Descriptive Phrases  3. Location Information</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Data Set
</SectionTitle>
      <Paragraph position="0"> The MUC6 Template Element task is typical of what our applications often require; it encapsulates information about one entity within the Template Element. Since we have a way to evaluate our performance on this task via the MUC6 data, we used it to conduct our experiments. The corpus for the MUC6 Template Element task consists of approximately 200 documents for development (pre- and post-dry-run) and 100 documents for scoring. The scoring set had previously been held blind, but it has been released for the purposes of a thorough evaluation of our methods.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Scoring
</SectionTitle>
      <Paragraph position="0"> Scores discussed in this paper measure the performance of experimental system reconfigurations run on the 100 documents used for the final MUC6 evaluation.</Paragraph>
      <Paragraph position="1"> These scores were generated for inter-experiment comparison purposes, using the MUC6 scoring program, v 1.3. Scores reported here are relevant only as relative measures within this paper and are not meant to represent official performance measures. Official MUC6 scores were generated using a later version of the scoring program. Furthermore, the scoring program results can vary depending on how the mapping between response and answer key is done. For example, if an automatic system has failed to make the link between a descriptor and a name, it may create two objects, one for each. The scoring system must then decide which object to map to the answer key.</Paragraph>
      <Paragraph position="2">  The scoring program tries to optimize the scores during mapping but, if two objects would score equally, the scoring program chooses arbitrarily, thus, in effect, sacrificing a slot as a penalty for coreference failure. In the following example, the slot can be either NAME or DESCRIPTOR, depending on the mapping.</Paragraph>
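      <Paragraph> The tie described above can be illustrated with a toy sketch. It is not the MUC6 scoring program and not the example from the original paper; the functions matching_slots and map_to_key are hypothetical, the slot fills are borrowed from the American Express sentence, and the first-of-equals behaviour of max() merely stands in for the arbitrary choice.

```python
def matching_slots(response, answer_key):
    """Count slots whose fill in the response equals the fill in the key."""
    return sum(1 for slot, fill in response.items() if answer_key.get(slot) == fill)


def map_to_key(responses, answer_key):
    """Pick the response object with the most matching slots.

    max() returns the first of several equally good candidates, which stands
    in for the arbitrary tie-break described in the text (an assumption).
    """
    return max(responses, key=lambda r: matching_slots(r, answer_key))


# One key object holding both slots.
answer_key = {
    "NAME": "American Express",
    "DESCRIPTOR": "the large financial institution",
}

# A system that missed the coreference link emits two separate objects.
split_responses = [
    {"NAME": "American Express"},
    {"DESCRIPTOR": "the large financial institution"},
]

scores = [matching_slots(r, answer_key) for r in split_responses]
chosen = map_to_key(split_responses, answer_key)
```

 Both response objects match exactly one slot, so whichever one is mapped, only one of the two key slots can be credited.</Paragraph>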
      <Paragraph position="3">  Additionally, the answer key contains optional objects which are included in the scoring calculations only if they have been mapped to a response object. This sometimes causes a fluctuation in the number of possible correct answers, as reported by the scoring program. The scores, therefore, do not represent an absolute measure of performance.</Paragraph>
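      <Paragraph> The effect of optional objects on the number of possible correct answers can be sketched as follows. This is an illustration under stated assumptions, not the scoring program's actual bookkeeping: the function possible_count, the dictionary layout, and the object identifiers are all hypothetical.

```python
def possible_count(key_objects, mapped_ids):
    """Count the key slots that enter the score.

    Required objects always count; optional objects count only when a
    response object was mapped to them.
    """
    total = 0
    for obj in key_objects:
        if not obj["optional"] or obj["id"] in mapped_ids:
            total += len(obj["slots"])
    return total


# Hypothetical answer key: one required object and one optional object.
answer_key = [
    {"id": "org-1", "optional": False, "slots": ["NAME", "DESCRIPTOR"]},
    {"id": "org-2", "optional": True, "slots": ["NAME"]},
]

# Run A maps nothing to the optional object; run B does.
without_optional = possible_count(answer_key, mapped_ids={"org-1"})
with_optional = possible_count(answer_key, mapped_ids={"org-1", "org-2"})
```

 In this sketch the two runs are scored against different totals, which is why scores from different configurations are comparable only as relative measures.</Paragraph>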
      <Paragraph position="4"> Scores reported here use the following abbrevi-</Paragraph>
    </Section>
  </Section>
</Paper>