<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1079">
  <Title>Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution</Title>
  <Section position="7" start_page="628" end_page="631" type="evalu">
    <SectionTitle>
5 Experiments
</SectionTitle>
    <Paragraph position="0"> We conducted an evaluation of our method using Japanese newspaper articles. The following four models were compared:  1. BM: Ng and Cardie (2002a)'s model, which identify antecedents by the candidate-wise classification model, and determine anaphoricity using the one-step model.</Paragraph>
    <Paragraph position="1"> 6http://chasen.org/~taku/software/bact/ 2. BM STR: BM with the syntactic features such as those in Figure 1(c).</Paragraph>
    <Paragraph position="2"> 3. SCM: The selection-then-classification model explained in Section 3.</Paragraph>
    <Paragraph position="3"> 4. SCM STR: SCM with all types of syntactic features shown in Figure 2.</Paragraph>
    <Section position="1" start_page="628" end_page="628" type="sub_section">
      <SectionTitle>
5.1 Setting
</SectionTitle>
      <Paragraph position="0"> We created an anaphoric relation-tagged corpus consisting of 197 newspaper articles (1,803 sentences), 137 articles annotated by two annotators and 60 by one. The agreement ratio between two annotators on the 197 articles was 84.6%, which indicated that the annotation was sufficiently reliable. null In the experiments, we removed from the above data set the zero-pronouns to which the two annotators did not agree. Consequently, the data set contained 995 intra-sentential anaphoric zero-pronouns, 754 inter-sentential anaphoric zero-pronouns, and 603 non-anaphoric zero-pronouns (2,352 zero-pronouns in total), with each anaphoric zero-pronoun annotated to be linked to its antecedent. For each of the following experiments, we conducted five-fold cross-validation over 2,352 zero-pronouns so that the set of the zero-pronouns from a single text was not divided into the training and test sets.</Paragraph>
      <Paragraph position="1"> In the experiments, all the features were automatically acquired with the help of the following NLP tools: the Japanese morphological analyzer ChaSen7 and the Japanese dependency structure analyzer CaboCha8, which also carried out named-entity chunking.</Paragraph>
    </Section>
    <Section position="2" start_page="628" end_page="629" type="sub_section">
      <SectionTitle>
5.2 Results on intra-sentential zero-anaphora resolution
</SectionTitle>
      <Paragraph position="0"> resolution In both intra-anaphoricity determination and antecedent identification, we investigated the effect of introducing the syntactic features for improving the performance. First, the results of antecedent identification are shown in Table 1. The comparison between BM (SCM) with BM STR (SCM STR) indicates that introducing the structural information effectively contributes to this task. In addition, the large improvement from BM STR to SCM STR indicates that the use of the preference-based model has significant impact on intra-sentential antecedent identification. This  zero-pronoun respectively.</Paragraph>
      <Paragraph position="1"> finding may well contribute to semantic role labeling because these two tasks have a large overlap as discussed in Section 1.</Paragraph>
      <Paragraph position="2"> Second, to evaluate the performance of intra-sentential zero-anaphora resolution, we plotted recall-precision curves altering threshold parameter and thinter for intra-anaphoricity determination as shown in Figure 5, where recall R and precision P were calculated by:</Paragraph>
      <Paragraph position="4"> The curves indicate the upperbound of the performance of these models; in practical settings, the parameters have to be trained beforehand.</Paragraph>
      <Paragraph position="5"> Figure 5 shows that BM STR (SCM STR) out-performs BM (SCM), which indicates that incorporating syntactic pattern features works remarkably well for intra-sentential zero-anaphora  resolution. Futhermore, SCM STR is significantly better than BM STR. This result supports that the former has an advantage of learning non-anaphoric zero-pronouns (181 instances) as negative training instances in intra-sentential anaphoricity determination, which enables it to reject non-anaphoric zero-pronouns more accurately than the others.</Paragraph>
    </Section>
    <Section position="3" start_page="629" end_page="629" type="sub_section">
      <SectionTitle>
5.3 Discussion
</SectionTitle>
      <Paragraph position="0"> Our error analysis reveals that a majority of errors can be attributed to the current way of handling quoted phrases and sentences. Figure 6 shows the difference in resolution accuracy between zero-pronouns appearing in a quotation  pronouns), where &amp;quot;IN Q&amp;quot; denotes the former (inquote zero-pronouns) and &amp;quot;OUT Q&amp;quot; the latter. The accuracy on the IN Q problems is considerably lower than that on the OUT Q cases, which indicates that we should deal with in-quote cases with a separate model so that it can take into account the nested structure of discourse segments introduced by quotations.</Paragraph>
    </Section>
    <Section position="4" start_page="629" end_page="631" type="sub_section">
      <SectionTitle>
5.4 Impact on overall zero-anaphora resolution
</SectionTitle>
      <Paragraph position="0"> resolution We next evaluated the effects of introducing the proposed model on overall zero-anaphora resolution including inter-sentential cases. As a baseline model, we implemented the original SCM, designed to resolve intra-sentential zero-anaphora and inter-sentential zero-anaphora simultaneously with no syntactic pattern features. Here, we adopted Support Vector Machines (Vapnik, 1998) to train the classifier on the baseline  model and the inter-sentential zero-anaphora resolution in the SCM using structural information. For the proposed model, we plotted several recall-precision curves by selecting different value for threshold parameters thintra and thinter. The results are shown in Figure 7, which indicates that the proposed model significantly outperforms the original SCM if thintra is appropriately chosen.</Paragraph>
      <Paragraph position="1"> We then investigated the feasibility of parameter selection for thintra by plotting the AUC values for different thintra values. Here, each AUC value is the area under a recall-precision curve. The results are shown in Figure 8. Since the original SCM does not use thintra, the AUC value of it is constant, depicted by the SCM. As shown in the Figure 8, the AUC-value curve of the proposed model is not peaky, which indicates the selection of parameter thintra is not difficult.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>