<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-3022">
  <Title>Combining Lexical, Syntactic, and Semantic Features with Maximum Entropy Models for Extracting Relations</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Maximum Entropy models for extracting relations
</SectionTitle>
    <Paragraph position="0"> extracting relations We built Maximum Entropy models for predicting the type of relation (if any) between every pair of mentions within each sentence. We only model explicit relations, because of poor inter-annotator agreement in the annotation of implicit relations. Table 1 lists the types and subtypes of relations for the ACE RDC task, along with their frequency of occurence in the ACE training data2. Note that only 6 of these 24 relation types are symmetric: 2The reader is referred to (Strassel et al., 2003) or LDC's web site for more details of the data.</Paragraph>
    <Paragraph position="1"> &amp;quot;relative-location&amp;quot;, &amp;quot;associate&amp;quot;, &amp;quot;other-relative&amp;quot;, &amp;quot;other-professional&amp;quot;, &amp;quot;sibling&amp;quot;, and &amp;quot;spouse&amp;quot;. We only model the relation subtypes, after making them unique by concatenating the type where appropriate (e.g. &amp;quot;OTHER&amp;quot; became &amp;quot;OTHER-PART&amp;quot; and &amp;quot;OTHER-ROLE&amp;quot;). We explicitly model the argument order of mentions. Thus, when comparing mentions a0a2a1 anda0a4a3 , we distinguish between the case where a0 a1 -citizen-Of-a0 a3 and a0 a3 -citizen-Of-a0 a1 . We thus model the extraction as a classification problem with 49 classes, two for each relation subtype and a &amp;quot;NONE&amp;quot; class for the case where the two mentions are not related.</Paragraph>
    <Paragraph position="2"> For each pair of mentions, we compute several feature streams shown below. All the syntactic features are derived from the syntactic parse tree and the dependency tree that we compute using a statistical parser trained on the PennTree Bank using the  tions.</Paragraph>
    <Paragraph position="3"> Overlap The number of words (if any) separating the two mentions, the number of other mentions in between, flags indicating whether the two mentions are in the same noun phrase, verb phrase or prepositional phrase.</Paragraph>
    <Paragraph position="4"> Dependency The words and part-of-speech and chunk labels of the words on which the mentions are dependent in the dependency tree derived from the syntactic parse tree.</Paragraph>
    <Paragraph position="5"> Parse Tree The path of non-terminals (removing duplicates) connecting the two mentions in the parse tree, and the path annotated with head words.</Paragraph>
    <Paragraph position="6"> Here is an example. For the sentence fragment, been the chairman of its board ...</Paragraph>
    <Paragraph position="7"> the corresponding syntactic parse tree is shown in Figure 1 and the dependency tree is shown in Figure 2. For the pair of mentions chairman and board, the feature streams are shown below.</Paragraph>
    <Paragraph position="8">  derived from the path shown in bold in Figure 1).</Paragraph>
    <Paragraph position="9"> We trained Maximum Entropy models using features derived from the feature streams described above.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Experimental results
</SectionTitle>
    <Paragraph position="0"> We divided the ACE training data provided by LDC into separate training and development sets. The training set contained around 300K words, and 9752 instances of relations and the development set contained around 46K words, and 1679 instances of relations. null  We report results in two ways. To isolate the perfomance of relation extraction, we measure the performance of relation extraction models on &amp;quot;true&amp;quot; mentions with &amp;quot;true&amp;quot; chaining (i.e. as annotated by LDC annotators). We also measured performance of models run on the deficient output of mention detection and mention chaining modules.</Paragraph>
    <Paragraph position="1"> We report both the F-measure3 and the ACE value of relation extraction. The ACE value is a NIST metric that assigns 0% value for a system which produces no output and 100% value for a system that extracts all the relations and produces no false alarms. We count the misses; the true relations not extracted by the system, and the false alarms; the spurious relations extracted by the system, and obtain the ACE value by subtracting from 1.0, the normalized weighted cost of the misses and false alarms. The ACE value counts each relation only once, even if it was expressed many times in a document in different ways. The reader is referred to the ACE web site (ACE, 2004) for more details.</Paragraph>
    <Paragraph position="2"> We built several models to compare the relative utility of the feature streams described in the previous section. Table 2 shows the results we obtained when running on &amp;quot;truth&amp;quot; for the development set and Table 3 shows the results we obtained when running on the output of mention detection and mention chaining modules. Note that a model trained with only words as features obtains a very high precision and a very low recall. For example, for the mention pair his and wife with no words in between, the lexical features together with the fact that there are no words in between is sufficient (though not necessary) to extract the relationship between the two entities. The addition of entity types, mention levels and especially, the word proximity features (&amp;quot;overlap&amp;quot;) boosts the recall at the expense of the very  sets with true (T) and system output (S) mentions and entities.</Paragraph>
    <Paragraph position="3"> high precision. Adding the parse tree and dependency tree based features gives us our best result by exploiting the consistent syntactic patterns exhibited between mentions for some relations. Note that the trends of contributions from different feature streams is consistent for the &amp;quot;truth&amp;quot; and system output runs. As expected, the numbers are significantly lower for the system output runs due to errors made by the mention detection and mention chaining modules.</Paragraph>
    <Paragraph position="4"> We ran the best model on the official ACE Feb'2002 and ACE Sept'2003 evaluation sets. We obtained competitive results shown in Table 4. The rules of the ACE evaluation prohibit us from disclosing our final ranking and the results of other participants. null</Paragraph>
  </Section>
class="xml-element"></Paper>