<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-2015">
  <Title>Semantic Role Labeling for Coreference Resolution</Title>
  <Section position="5" start_page="144" end_page="145" type="evalu">
    <SectionTitle>
3 Experiments
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="144" end_page="144" type="sub_section">
      <SectionTitle>
3.1 Performance Metrics
</SectionTitle>
      <Paragraph position="0"> We report in the following tables the MUC score (Vilain et al., 1995). Scores in Table 2 are computed for all noun phrases appearing in either the key or the system response, whereas Tables 3 and 4 refer to scoring only those phrases which appear in both the key and the response. We therefore discard those responses not present in the key, as we are interested here in establishing the upper limit of the improvements given by SRL.</Paragraph>
      <Paragraph position="1"> We also report the accuracy score for all three types of ACE mentions, namely pronouns, common nouns and proper names. Accuracy is the number of REs of a given mention type that are correctly resolved, divided by the total number of REs of the same type given in the key, expressed as a percentage. An RE is said to be correctly resolved when both it and its direct antecedent are in the same key coreference class.</Paragraph>
      <Paragraph position="2"> In all experiments, the REs given to the classifier are noun phrases automatically extracted by a pipeline of pre-processing components (i.e. PoS tagger, NP chunker, Named Entity Recognizer).</Paragraph>
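      <Paragraph position="3"> The accuracy score defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scorer used in the experiments: the function name accuracy_by_type, the dictionary-based inputs and the toy mention identifiers are all hypothetical, and the sketch assumes that every RE listed in mention_type counts towards the denominator.

```python
from collections import defaultdict


def accuracy_by_type(key_class, mention_type, antecedent):
    """Per-type accuracy in percent (hypothetical helper, not the paper's scorer).

    key_class:    maps an RE id to its coreference class id in the key
    mention_type: maps each key RE to be scored to its mention type
    antecedent:   maps an RE id to the direct antecedent chosen by the system
                  (REs the system left unresolved are simply absent)
    """
    total = defaultdict(int)
    correct = defaultdict(int)
    for re_id, m_type in mention_type.items():
        total[m_type] += 1
        ante = antecedent.get(re_id)
        # Correct only if the RE and its direct antecedent share a key class.
        if ante is not None and ante in key_class:
            if key_class[ante] == key_class.get(re_id):
                correct[m_type] += 1
    return {t: 100.0 * correct[t] / total[t] for t in total}


# Toy data, invented purely for illustration.
key_class = {"m1": "A", "m2": "A", "m3": "B", "m4": "A", "m5": "B"}
mention_type = {"m2": "pronoun", "m4": "pronoun", "m5": "common_noun"}
antecedent = {"m2": "m1", "m4": "m3", "m5": "m3"}

print(accuracy_by_type(key_class, mention_type, antecedent))
```

In this toy input one of the two pronouns is linked to an antecedent from the wrong key class, so pronoun accuracy is 50.0, while the single common noun is linked correctly and scores 100.0.</Paragraph>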
    </Section>
    <Section position="2" start_page="144" end_page="144" type="sub_section">
      <SectionTitle>
3.2 Results
</SectionTitle>
      <Paragraph position="0"> Table 2 compares the results between our duplicated Soon baseline and the original system.</Paragraph>
      <Paragraph position="1"> The systems show a similar performance w.r.t. F-measure. We speculate that the result improvements are due to the use of current pre-processing components and another classifier.</Paragraph>
      <Paragraph position="2"> Tables 3 and 4 show a comparison of the performance between our baseline system and the one incremented with SRL. Performance improvements are highlighted in bold. The tables show that SRL tends to improve system recall, rather than acting as a 'semantic filter' improving precision. Semantic roles therefore seem to trigger a response in cases where more shallow features do not seem to suffice (see example (1)).</Paragraph>
      <Paragraph position="3"> The RE types which are most positively affected by SRL are pronouns and common nouns. On the other hand, SRL information has a limited or even worsening effect on the performance on proper names, where features such as string matching and alias seem to suffice. This suggests that SRL plays a role in pronoun and common noun resolution, where surface features cannot account for complex preferences and semantic knowledge is required.</Paragraph>
    </Section>
    <Section position="3" start_page="144" end_page="145" type="sub_section">
      <SectionTitle>
3.3 Feature Evaluation
</SectionTitle>
      <Paragraph position="0"> We investigated the contribution of the different features in the learning process. Table 5 shows the chi-square statistic (normalized in the [0,1] interval) for each feature occurring in the training data of the MERGED dataset. SRL features show a high χ² value, ranking immediately after string matching and alias, which indicates a high correlation of these features with the decision classes.</Paragraph>
      <Paragraph position="1"> The importance of SRL is also indicated by the analysis of the contribution of individual features to the overall performance. Table 6 shows the performance variations obtained by leaving out each feature in turn. Again, it can be seen that removing both I and J SEMROLE induces a relatively high performance degradation when compared to other features. Their removal ranks 5th out of 12, following only essential features such as string matching, alias, pronoun and number. Similarly to Table 5, the semantic role of the anaphor ranks higher than the one of the antecedent. This relates to the improved performance on pronouns, as it indicates that SRL helps for linking anaphoric pronouns to preceding REs. Finally, it should be noted that SRL provides much more solid and noise-free semantic features when compared to the WordNet class feature, whose removal always induces a lower performance degradation.</Paragraph>
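      <Paragraph position="2"> The leave-one-out analysis behind Table 6 can be sketched as follows. This is a generic illustration under stated assumptions rather than the procedure actually run for the paper: the function leave_one_out, the evaluate callback, the placeholder feature names and the integer weights are all hypothetical, and toy_evaluate merely stands in for retraining and scoring a classifier on the reduced feature set.

```python
def leave_one_out(features, evaluate):
    """Rank features by the performance drop caused by removing each in turn.

    features: list of feature names
    evaluate: callable mapping a list of feature names to a score
              (hypothetical stand-in for a full train-and-score run)
    Returns (feature, drop) pairs, largest drop first.
    """
    full_score = evaluate(features)
    drops = []
    for left_out in features:
        reduced = [f for f in features if f != left_out]
        drops.append((left_out, full_score - evaluate(reduced)))
    # Larger drop means the feature mattered more to overall performance.
    drops.sort(key=lambda pair: pair[1], reverse=True)
    return drops


# Arbitrary toy weights, invented for illustration; not results from the paper.
TOY_WEIGHTS = {"feat_a": 30, "feat_b": 5, "feat_c": 15}


def toy_evaluate(active_features):
    # Pretend the score is simply the sum of the weights of the active features.
    return sum(TOY_WEIGHTS[f] for f in active_features)


for name, drop in leave_one_out(list(TOY_WEIGHTS), toy_evaluate):
    print(name, drop)
```

With these toy weights the ranking is feat_a, feat_c, feat_b. In a real setting evaluate would wrap a complete training and scoring run, so the loop costs one such run per feature plus one for the full feature set.</Paragraph>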
    </Section>
  </Section>
</Paper>