<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1206">
  <Title>Enhancement of a Chinese Discourse Marker Tagger with C4.5</Title>
  <Section position="6" start_page="42" end_page="43" type="evalu">
    <SectionTitle>
5 Evaluation
</SectionTitle>
    <Paragraph position="0"> To evaluate the tagging system in terms of the percentage of discourse markers that it tags correctly, we chose 80 tagged editorials from Ming Pao, a Chinese-language newspaper in Hong Kong, published between December 1995 and January 1996, to form the training data set. We then randomly selected 20 editorials from Mainland China and Hong Kong newspapers for the system to tag automatically, and manually checked the results.</Paragraph>
    <Paragraph position="1"> The training data set contains 4764 CDMs in total, of which 2116 are RDMs and 2648 are ADMs. The distribution of INTER-sentence relations, INTRA-sentence relations, and NULL marker pairs is shown below.</Paragraph>
    <Paragraph position="2"> [Table: distribution of INTER-sentence relations, INTRA-sentence relations, and NULL marker pairs in the training data set] Our evaluation is based on counting the number of discourse markers that are tagged correctly and incorrectly.</Paragraph>
    <Paragraph position="3"> The test data set contains 1134 CDMs in total, of which 563 are RDMs and 571 are ADMs. The distribution of INTER-sentence relations, INTRA-sentence relations, and NULL marker pairs in the test data set is shown in Table 3.</Paragraph>
    <Paragraph position="4"> The test results in Table 4 show that most of the errors are caused by misclassification of the CDMs. An example of an &quot;Other error&quot; is shown below. The following sentence is from an editorial:  In the above sentence, the first &quot;R&quot; is matched with the NULL marker, but the second &quot;R&quot; is left as an ADM. This causes an &quot;Other error&quot; and an &quot;ADM/RDM classification error&quot;.</Paragraph>
    <Paragraph position="5"> The Gross Accuracy (GA), as defined in T'sou et al. (1999), is: GA = (correctly tagged discourse markers) / (total number of discourse markers) = 95.38%. This is a substantial improvement over the original GA of 68.89%. The overgeneration problem (tagged 415, actual 424) is caused by the mismatch of CDMs as RDM pairs, or by the misclassification of CDMs as RDMs.</Paragraph>
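The GA ratio above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the names `gross_accuracy` and `counts_are_consistent` are hypothetical, and the counts passed to `gross_accuracy` in the usage lines are made up for illustration, because the per-category counts of Table 4 are not reproduced in this section. Only the CDM totals come from the text.

```python
def gross_accuracy(correct, total):
    """Return GA as a percentage: correctly tagged markers over all markers."""
    if total == 0:
        raise ValueError("total number of discourse markers must be nonzero")
    return 100.0 * correct / total


# CDM counts reported in the text, split into RDMs and ADMs.
TRAIN = {"rdm": 2116, "adm": 2648, "total": 4764}
TEST = {"rdm": 563, "adm": 571, "total": 1134}


def counts_are_consistent(counts):
    """Check that the RDM and ADM counts add up to the reported CDM total."""
    return counts["rdm"] + counts["adm"] == counts["total"]


if __name__ == "__main__":
    print(counts_are_consistent(TRAIN), counts_are_consistent(TEST))
    # Hypothetical counts, for illustration only (not taken from the paper).
    print(round(gross_accuracy(95, 100), 2))
```

The consistency check only confirms that the reported RDM and ADM counts sum to the reported totals; it says nothing about which counts enter the GA figure itself.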
    <Paragraph position="6"> Two examples follow.</Paragraph>
    <Paragraph position="7">  In this example, &quot;~tl ~&quot; could have matched &lt;:~,*,55&gt;, &lt;,~,*,56&gt;, or &lt;~,*,58&gt;. Only &lt;:~,*,55&gt; and &lt;~,*,58&gt; can be eliminated from the candidates according to the &quot;simple rules&quot; described in Section 4.1. The system then has to choose between &lt;~,*,56&gt; and &lt;}J~,*,57&gt; to match with &quot;~zn~&quot;. Fortunately, the system made the right choice here.</Paragraph>
    <Paragraph position="9"> The two &quot;~&quot; are misclassified as RDMs, which causes a mismatched RDM pair. Such errors are difficult for an automatic system to avoid. Without further syntactic or semantic analysis, we can only hope that the ML algorithm will find a solution given more training data.</Paragraph>
  </Section>
</Paper>