<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-1029">
  <Title>Dynamic Programming Method for Analyzing Conjunctive Structures in Japanese</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Analysis of a long Japanese sentence is one of many difficult problems that have not been solved despite the continuing efforts of many researchers and remain abandoned. It is difficult to obtain a proper analysis of a sentence longer than fifty Japanese characters, and almost all analyses fail for sentences of more than eighty characters. Clarifying why is also very difficult, because the failures have a variety of causes. People sometimes say that a long sentence admits too many possible modifier/modifyee relations between phrases, but no deeper consideration has ever been given to the reasons for analysis failure. Analysis failure here means not only that no correct analysis is included in the multiple analysis results, which are caused by the intrinsic ambiguity of a sentence and also by inaccurate grammatical rules, but also that the analysis fails in the middle of the analysis process. We have been claiming that many (more than two) linguistic components must be seen at the same time in a sentence for proper parsing, and also that tree-to-tree transformation is necessary for reliable analysis of a sentence. Popular grammar rules which merge two linguistic components into one are quite insufficient to describe the delicate relationships among components in a long sentence.</Paragraph>
    <Paragraph position="1"> Language is complex. It often happens that components which are far apart in a long sentence co-occur or have certain relationships. Such relations are sometimes purely semantic, but often they are grammatical or structural, although they are not definite but very subtle.</Paragraph>
    <Paragraph position="2"> A long sentence, particularly in Japanese, very often contains parallel structures. They are either conjunctive noun phrases or conjunctive predicative clauses. The latter is called &quot;Renyoh chuushi-ho&quot;. Such clauses appear in an embedded sentence to modify nouns, and are also used to connect two or more sentences. This form is very often used in Japanese, and is a main cause of structural ambiguity. Many major sentential components are omitted in the posterior part of Renyoh chuushi expressions, and this makes the analysis more difficult.</Paragraph>
    <Paragraph position="3"> For the successful analysis of a long Japanese sentence, these parallel phrases and clauses, including Renyoh chuushi-ho, must be recognized correctly.</Paragraph>
    <Paragraph position="4"> This is a key point, and it must be achieved by a method completely different from ordinary syntactic analysis methods, because they generally fail in the analysis of a long sentence.</Paragraph>
    <Paragraph position="5"> We have introduced an assumption that these parallel phrases/clauses have a certain similarity, and have developed an algorithm which finds the most plausible pair of word series that can be considered parallel by calculating a similarity measure between two arbitrary series of words. This is realized by using the dynamic programming method. The results were exceedingly good. We achieved a score of about 80% in the detection of various types of parallel series of words in long Japanese sentences.</Paragraph>
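    <Paragraph position="6"> The idea of scoring two series of words by dynamic programming can be illustrated with a minimal sketch. It is not the algorithm of this paper: the word-level score, the gap penalty, the length limit, and the function names word_similarity, series_similarity and best_parallel_pair are all assumptions made for illustration, and the toy input is English rather than Japanese. The sketch uses a standard global alignment recurrence and then searches for the best pair of series on either side of a given conjunction-like word.

```python
def word_similarity(a, b):
    """Hypothetical word-level score: 2 for identical words, 0 otherwise."""
    return 2 if a == b else 0


def series_similarity(left, right, gap=-1):
    """Global alignment score of two word series, filled in by dynamic programming."""
    rows, cols = len(left) + 1, len(right) + 1
    # score[i][j] holds the best score for aligning left[:i] with right[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + word_similarity(left[i - 1], right[j - 1]),
                score[i - 1][j] + gap,  # skip a word of the left series
                score[i][j - 1] + gap,  # skip a word of the right series
            )
    return score[-1][-1]


def best_parallel_pair(words, key_index, max_len=5):
    """Return (score, left, right) for the most similar pair of series around
    words[key_index], or None when no pair exists. The left series ends just
    before the key word and the right series starts just after it."""
    best = None
    longest_left = min(max_len, key_index)
    longest_right = min(max_len, len(words) - key_index - 1)
    for n_left in range(1, longest_left + 1):
        left = words[key_index - n_left:key_index]
        for n_right in range(1, longest_right + 1):
            right = words[key_index + 1:key_index + 1 + n_right]
            candidate = (series_similarity(left, right), left, right)
            # Keep the first candidate that reaches the highest score.
            if best is None or candidate[0] > best[0]:
                best = candidate
    return best


if __name__ == "__main__":
    sentence = ["red", "wine", "and", "white", "wine", "arrived"]
    print(best_parallel_pair(sentence, key_index=2))
```

    A realistic similarity measure would presumably draw on richer evidence than exact word identity, such as parts of speech or a thesaurus; the exact-match score here only keeps the sketch short.</Paragraph>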
  </Section>
</Paper>