<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-1807">
  <Title>A Knowledge Based Approach to Identification of Serial Verb Construction in Chinese-to-Korean Machine Translation System</Title>
  <Section position="6" start_page="6" end_page="6" type="evalu">
    <SectionTitle>
6. Evaluation
</SectionTitle>
    <Paragraph position="0"> We randomly selected 1,000 SVC sentences from the 1998 People's Daily newspaper. Each sentence contains exactly two verbs, since our dependency parser is still being improved to handle sentences with multiple embedded clauses. Table 6 shows the distribution of each type of SVC and the precision.</Paragraph>
    <Paragraph position="1">  The precision is 94.4%. Some of the errors originate from the tagger, so some of the selected sentences are not actually SVCs. The remaining errors result from missing information in the knowledge bases. For the sentence Guo Jia Zhu Xi Jiang Ze Min Chu Xi Jiang Hua, the relation (Hj20,Hj12) is not in the low-level adjacent list, but (Hj,Hi) is 1 in the middle-level matrix, so the sentence is assigned to the restrictive case. For the sentence Ta Dai Biao Shan Xi Sheng Chu Xi Liao Zuo Tan Hui, (Hi17,Hj20) is in the low-level adjacent list, so the search stops and the sentence is likewise assigned to the restrictive case. The sentence Ta Zai Fan Dian Chi Fan He Cha does not satisfy all of the conditions, so it is detected as quasi-coordinate.</Paragraph>
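    <Paragraph> The lookup order implied by the error analysis above can be sketched as follows. This is only an illustrative reading of the text, not the system's actual implementation: the function name, the concept codes, and the toy resources are hypothetical, and the final fallback to quasi-coordinate is an assumption, since the full set of conditions is not restated in this section.

```python
from typing import Dict, Set, Tuple

ConceptPair = Tuple[str, str]

# Toy resources with made-up concept codes (NOT taken from the real knowledge bases).
LOW_LEVEL_ADJACENT: Set[ConceptPair] = {("X1", "Y2")}
MIDDLE_LEVEL_MATRIX: Dict[ConceptPair, int] = {("X", "Y"): 1}


def classify_pair(
    low_pair: ConceptPair,
    mid_pair: ConceptPair,
    low_level_adjacent: Set[ConceptPair],
    middle_level_matrix: Dict[ConceptPair, int],
) -> Tuple[str, str]:
    """Return (label, resource used) for the concept codes of the two verbs.

    Both the low-level pair and the middle-level pair are passed in
    explicitly, so no assumption is made about how one maps to the other.
    """
    # Step 1: a hit in the low-level adjacent list stops the search.
    if low_pair in low_level_adjacent:
        return ("restrictive", "low-level list")
    # Step 2: otherwise consult the middle-level matrix; 1 marks a related pair.
    if middle_level_matrix.get(mid_pair, 0) == 1:
        return ("restrictive", "middle-level matrix")
    # Step 3 (assumption): no resource matched, so fall back to quasi-coordinate.
    return ("quasi-coordinate", "none")


first = classify_pair(("X1", "Y2"), ("X", "Y"), LOW_LEVEL_ADJACENT, MIDDLE_LEVEL_MATRIX)
second = classify_pair(("X1", "Y3"), ("X", "Y"), LOW_LEVEL_ADJACENT, MIDDLE_LEVEL_MATRIX)
third = classify_pair(("X1", "Z9"), ("X", "Z"), LOW_LEVEL_ADJACENT, MIDDLE_LEVEL_MATRIX)
```

Returning the resource name together with the label makes it possible to tally how often each resource was accessed, and it shows why an incomplete low-level list or matrix leads directly to a wrong label.</Paragraph>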
    <Paragraph position="2">  Separate Event SVC.</Paragraph>
    <Paragraph position="3"> GKBCC and VLVI. We need a complete list of verbs that take a clause as their subject. These verbs will be collected gradually in future work.</Paragraph>
    <Paragraph position="4"> The evaluation table for the separate event SVCs is provided in Table 7.</Paragraph>
    <Paragraph position="5">  The precision of identifying the separate event category is 95.3%. The errors arise in the circumstantial case for two reasons: our heuristic is too restrictive to detect all cases and may need further revision, and the low-level concept lists are not yet complete. The low-level concept lists will be continuously updated to increase coverage during the tuning stage of the machine translation system.</Paragraph>
    <Paragraph position="6"> Table 8 shows the distribution of the subcategories of restrictive separate events for Korean transfer.</Paragraph>
    <Paragraph position="7">  Table 9 lists the frequency of each type of accessed resource. Note that most restrictive separate event SVCs are recognized in the middle-level matrix. The two cases found through collocation are both quasi-coordinative, so the total is greater than 153.</Paragraph>
    <Paragraph position="8">  Figure 5 illustrates a demo system of TOTAL-CK. The input Chinese SVC sentence is displayed at the top of the right-most window, and the corresponding Korean sentence follows in the next row. The tagged results, the chunking segments, and the indented Chinese dependency tree are shown in the windows from left to right.</Paragraph>
    <Paragraph position="9"> Conclusion and Future Work. In this paper, we formally defined serial verb constructions and classified SVCs into several categories. These categories are related to the analysis stage and the transfer stage of TOTAL-CK. We provided a resolution algorithm that detects SVCs at each step. Finally, we showed promising experimental results at each stage. Further research is needed to better resolve the conditional separate event SVC and the purposive separate event SVC.</Paragraph>
  </Section>
</Paper>