<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-4002">
  <Title>Evaluation of a Japanese CFG Derived from a Syntactically Annotated Corpus with Respect to Dependency Measures</Title>
  <Section position="7" start_page="45" end_page="45" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> We have been building a large-scale, syntactically annotated Japanese corpus. In this paper, we evaluated a CFG derived from the corpus with respect to dependency measures. We assume that parse results created by our CFG are re-analyzed in subsequent processing using semantic information, and the results show that parsing accuracy will increase when semantic information is incorporated.</Paragraph>
    <Paragraph position="1"> We also compared our results with those of other dependency analyzers, KNP and CaboCha. Although the dependency accuracy of our CFG cannot reach that of KNP and CaboCha if only the PGLR model is used for disambiguation, it would exceed them if disambiguation in the subsequent processing were done correctly.</Paragraph>
    <Paragraph position="2"> As future work, since we assume that the parse results created by our CFG are re-analyzed in the subsequent processing, we need to integrate that processing into the current framework. Collins proposed a method for re-ranking the output of an initial statistical parser (Collins, 2000). However, re-ranking alone is not sufficient for us, because we represent some ambiguous cases as the same structure, so the ambiguity contained within each parse result must also be considered. Our policy covers several types of ambiguity: compound noun structure, adnominal phrase attachment, adverbial phrase attachment and conjunctive structure. We plan to provide a method for each type individually and then integrate them into a single process.</Paragraph>
    <Paragraph position="3"> Although we attempt re-analysis after parsing, it seems that some problems should be solved before parsing. For example, ellipsis often occurs in Japanese. It is difficult to deal with ellipsis (especially of postpositions and verbs) in a CFG framework, and this results in higher ambiguity. It would be helpful if the positions where words are omitted in a sentence were detected and marked in advance.</Paragraph>
  </Section>
</Paper>