<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-1029">
  <Title>Dynamic Programming Method for Analyzing Conjunctive Structures in Japanese</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Types of Conjunctive Structures and Their Ambiguities
</SectionTitle>
    <Paragraph position="0"> First, we will explain what kinds of conjunctive structures (hereafter abbreviated as 'CS') appear in Japanese [1].</Paragraph>
    <Paragraph position="1"> The first type is conjunctive noun phrases. We can find these phrases by the words for conjunction listed in Table 1(a).</Paragraph>
    <Paragraph position="2"> Each conjunctive noun sometimes has adjectival modifiers (Table 2(ii)) or clause modifiers (Table 2(iii)).</Paragraph>
    <Paragraph position="3"> The second type is conjunctive predicative clauses, in which two or more predicates (see footnote 1) in a sentence form a coordination. We can find these clauses by the Renyoh-forms (see footnote 2) of predicates (Renyoh chuushi-ho: Table 2(iv)) or by the predicates accompanying one of the words in Table 1(b) (Table 2(v)). The third type is CSs consisting of parts of conjunctive predicative clauses. We call this type conjunctive incomplete structures. We can find these structures by the correspondence of postpositional particles (Table 2(vi)) or by the words in Table 1(c) which indicate CSs explicitly (Table 2(vii)).</Paragraph>
    <Paragraph position="4"> For all of these types, it is relatively easy to find the existence of a CS by detecting a distinctive key bunsetsu (see footnote 3; we call this bunsetsu 'KB') which accompanies the words explained above. The KB lies last in the prior part of a CS, but it is difficult to determine which bunsetsu sequences on both sides of the KB constitute a CS. That is, it is not easy to determine which bunsetsu to the left of a KB is the leftmost element of the prior part of a CS, and which bunsetsu to the right of a KB is the rightmost element of the posterior part of a CS. The bunsetsus between these two extreme elements constitute the scope of the CS. Particularly in detecting this scope of a CS, it is essential to find the last bunsetsu in the posterior part of the CS, which corresponds to the KB. There are many candidates for it in a sentence; e.g., in a conjunctive noun phrase all nouns after a KB are candidates. We call such a candidate bunsetsu 'CB'. It is almost impossible to solve this problem merely by using rules based on phrase structure grammar.</Paragraph>
    <Paragraph position="5"> 1 In addition to verbs and adjectives, assertive words (kinds of postpositions) &quot;だ&quot; (da), &quot;である&quot; (dearu), &quot;です&quot; (desu) and so on, which follow directly after nouns, can be predicates in Japanese.</Paragraph>
    <Paragraph position="6"> 2 The ending forms of inflectional words which can modify a verb, adjective, or assertive word are called Renyoh-forms in Japanese.</Paragraph>
    <Paragraph position="7"> 3 A bunsetsu is the smallest meaningful block consisting of an independent word (IW; nouns, verbs, adjectives, etc.) and accompanying words (AW; postpositional particles, auxiliary verbs, etc.).</Paragraph>
    <Paragraph position="9"/>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Analysis of Conjunctive Structures
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <Paragraph position="0"> We detect the scope of CSs by using a wide range of information around a KB (see footnote 4). An input sentence is first divided into bunsetsus by conventional morphological analysis. Then we calculate similarities of all pairs of bunsetsus in a sentence, and calculate a sum of similarities between a series of bunsetsus on the left of a KB and a series of bunsetsus on the left of a CB. Of all the pairs of the two series of bunsetsus, the pair which has the greatest sum of similarities is determined as the scope of the CS. We will explain this process in detail in the following.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Similarities between Bunsetsus
</SectionTitle>
      <Paragraph position="0"> An appropriate similarity value between bunsetsus is given by the following process.</Paragraph>
      <Paragraph position="1"> * If the parts of speech of IWs (independent words) are equal, give 2 points as the similarity value.</Paragraph>
      <Paragraph position="2"> Then go to the next stage and further add the following points.</Paragraph>
      <Paragraph position="3">  1. If IWs match each other exactly (at the character level), add 10 points, skip the next two steps, and go to step 4. If IWs are inflected, infinitives are compared.</Paragraph>
      <Paragraph position="4"> 2. If both IWs are nouns and they match partially at the character level, add the number of matching characters x 2 points.</Paragraph>
      <Paragraph position="5"> 4 We do not handle conjunctive predicative clauses created by the Renyoh-forms of predicates (Renyoh chuushi-ho) which do not accompany a comma, because almost all of these predicates modify the next nearest predicate and there is no need to check the possibility of conjunction.</Paragraph>
      <Paragraph position="6"> Actes de COLING-92, Nantes, 23-28 août 1992. Proc. of COLING-92, Nantes, Aug. 23-28, 1992.</Paragraph>
      <Paragraph position="8"> Figure 1: A path. (Labels: partial matrix; similarity value; A = (a(i,j)).)</Paragraph>
      <Paragraph position="9"> 3. Add points for semantic similarities by using the thesaurus 'Bunrui Goi Hyou' (BGH) [3].</Paragraph>
      <Paragraph position="10"> BGH has a six-layer abstraction hierarchy, and more than 60,000 words are assigned to its leaves. If the most specific common layer between two IWs is the k-th layer and k is greater than 2, add (k - 2) x 2 points. If either or both IWs are not contained in BGH, no addition is made. Matches in the two most generic layers are ignored to prevent overly vague matching in a broad sense.</Paragraph>
      <Paragraph position="11">  4. If some of the AWs (accompanying words) match, add the number of matching AWs x 3 points.</Paragraph>
      <Paragraph position="12"> The maximum sum of the similarity values which can be added by steps 2 and 3 above is limited to 10 points.</Paragraph>
      <Paragraph position="13"> * Even if the parts of speech of IWs are not equal, give 2 points if both bunsetsus can be predicates (see footnote 1).</Paragraph>
      <Paragraph position="14"> For example, the similarity between the bunsetsu glossed &quot;(low level language) + ,&quot; and the bunsetsu glossed &quot;(high level language) + (and)&quot; is calculated as 2 (match of parts of speech) + 8 (match of four characters) = 10 points. The similarity between &quot;(revision) + (do) + ,&quot; and &quot;(detection) + (do)&quot; is 2 (match of parts of speech) + 2 (match by BGH) + 3 (match of one AW) = 7 points.</Paragraph>
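      <Paragraph> The scoring rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Bunsetsu record, its bgh_path field (a root-first category path standing in for a BGH lookup), and the reading of "matching characters" as the longest common substring are all hypothetical choices made for the example, as is returning 0 when the parts of speech differ and the two bunsetsus cannot both be predicates.</Paragraph>
      <Paragraph><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Bunsetsu:
    iw: str                          # independent word, in its base form
    pos: str                         # part of speech of the IW
    aws: Tuple[str, ...] = ()        # accompanying words
    can_be_predicate: bool = False
    # Hypothetical thesaurus path, most generic layer first (None = not in BGH).
    bgh_path: Optional[Tuple[str, ...]] = None


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest run of characters shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def common_layer(p: Tuple[str, ...], q: Tuple[str, ...]) -> int:
    """Depth k of the most specific layer shared by two thesaurus paths."""
    k = 0
    for x, y in zip(p, q):
        if x != y:
            break
        k += 1
    return k


def similarity(x: Bunsetsu, y: Bunsetsu) -> int:
    if x.pos != y.pos:
        # Second bullet: different parts of speech, but both can be predicates.
        return 2 if (x.can_be_predicate and y.can_be_predicate) else 0
    score = 2                                    # same part of speech
    if x.iw == y.iw:
        score += 10                              # step 1: exact match, skip steps 2 and 3
    else:
        extra = 0
        if x.pos == "noun":                      # step 2: partial character match
            extra += 2 * longest_common_substring(x.iw, y.iw)
        if x.bgh_path is not None and y.bgh_path is not None:
            k = common_layer(x.bgh_path, y.bgh_path)
            if k > 2:                            # step 3: ignore the two generic layers
                extra += 2 * (k - 2)
        score += min(extra, 10)                  # steps 2 and 3 are capped at 10 points
    score += 3 * len(set(x.aws) & set(y.aws))    # step 4: matching AWs
    return score
```
]]></Paragraph>
      <Paragraph> With placeholder strings that share four characters (for instance "aWXYZ" and "bWXYZ" as two nouns with no common AW), the function returns 2 + 8 = 10, which mirrors the first worked example; two nouns with no shared characters, a common thesaurus layer of 3, and one shared AW give 2 + 2 + 3 = 7, which mirrors the second.</Paragraph>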
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Similarities between Two Series of
Bunsetsus
</SectionTitle>
      <Paragraph position="0"> Our method detects the scope of a CS by two series of bunsetsus which have the greatest similarity. These two series of bunsetsus are searched for on a triangular matrix $A = (a(i,j))$ (Figure 1), whose diagonal element $a(i,i)$ is the i-th bunsetsu in a sentence and whose element $a(i,j)$ $(i \lt j)$ is the similarity value between bunsetsu $a(i,i)$ and bunsetsu $a(j,j)$.</Paragraph>
      <Paragraph position="1"> We call the rectangular matrix A' a partial matrix, where $A' = (a(i,j))\ (0 \le i \le n;\ n+1 \le j \le l)$. (Figure 2: An ignored element.)</Paragraph>
      <Paragraph position="2"> In the following, l indicates the number of bunsetsus and a(n, n) is a KB. We define a path as a series of elements from a non-zero element in the lowest row to an element in the leftmost column of a partial matrix (Figure 1).</Paragraph>
      <Paragraph position="3"> $\mathit{path} ::= (a(p_1, m), a(p_2, m-1), \ldots, a(p_{m-n}, n+1))$, where $n+1 \le m \le l$, $a(p_1, m) \ne 0$, $p_1 = n$, $p_i \ge p_{i+1}$ $(1 \le i \le m-n-1)$.</Paragraph>
      <Paragraph position="4"> The starting element of a path shows the correspondence of a KB to a CB. A path has only one element from each column and extends towards the upper left. We calculate the similarity between the series of bunsetsus on the left side of the path (sb1 in Figure 1) and the series under the path (sb2 in Figure 1) as a path score by the following four criteria: 1. Basically the score of a path is the sum of the points of each element on the path. But if a part of the path is horizontal, (a(i,j), a(i,j-1)), as shown in Figure 2, which makes one element a(i,i) correspond to two elements a(j-1,j-1) and a(j,j), the points of element a(i,j-1) are not added to the path score.</Paragraph>
      <Paragraph position="5"> 2. Since a pair of conjunctive phrases/clauses often appears as a similar structure, it is likely that both conjunctive phrases/clauses contain nearly the same number of bunsetsus. Therefore, we impose penalty points on the pair of elements in the path which causes a one-to-plural bunsetsu correspondence, so as to give priority to CSs of the same size. The penalty for a pair $(a(p_i, j), a(p_{i+1}, j-1))$ is calculated by the formula (Figure 3) $|p_i - p_{i+1} - 1| \times 2$. The penalty points are subtracted from the path score.</Paragraph>
      <Paragraph position="6"> Table 3 (conditions for the separating levels, in the order listed): Being the KB of a conjunctive predicative clause, or accompanying a topic-marking postpositional particle &quot;は&quot; and a comma. Accompanying a postpositional particle not creating a conjunctive noun phrase and a comma, or being an adverb accompanying a comma. Being the Renyoh-form of a predicate which does not accompany a comma, or accompanying a topic-marking postpositional particle &quot;は&quot;. Being the KB of a conjunctive noun phrase accompanying a comma. Accompanying a comma, or being the KB of a conjunctive noun phrase not accompanying a comma.</Paragraph>
      <Paragraph position="10"> 3. Since each phrase in the CS has a certain coherency of meaning, special words which separate the meaning in a sentence often limit the scope of a CS. If a path includes such words, we impose penalty points on the path so that the possibility of including them is reduced. We define five 'separating-levels' (SLs) for bunsetsus, which express the strength of separating a sentence meaning (Table 3, cf. Table 1). If the bunsetsus on the left side of the path and under it include a bunsetsu whose SL is equal to or higher than the KB's SL, we reduce the path score by (SL of the bunsetsu - KB's SL + 1) x 7.</Paragraph>
      <Paragraph position="11"> However, two high-SL bunsetsus corresponding to each other often exist in a CS, and those do not limit the scope of the CS. For example, topic-marking postpositional particles can correspond to each other within a CS.</Paragraph>
      <Paragraph position="13"> Therefore, when two high-SL bunsetsus correspond in a CS, that is, the path includes the element which indicates their similarity, and they are of the 'same type', the penalty points on them are not applied to the path score. We define same-type bunsetsus as two bunsetsus which satisfy the following two conditions.</Paragraph>
      <Paragraph position="14"> * Their IWs are of the same part of speech, and they have identical inflection when they are inflectional words.</Paragraph>
      <Paragraph position="15">  bunsetsu in a CS or the IW following it. These words thus signal the end of the CS. Such words are shown in Table 4. Bonus points (6 points) are given to the path which indicates the CS ending with one of the words in Table 4, as that path should be preferred.</Paragraph>
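      <Paragraph> The path-score criteria can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the paper's program: a path is represented as the list of its row indices p_1, p_2, ... for columns m, m-1, ..., the separating levels of the bunsetsus in scope are passed in by the caller (the same-type exemption is left to the caller, who simply omits exempted bunsetsus), and the names structural_score, separating_penalty, and path_score are hypothetical.</Paragraph>
      <Paragraph><![CDATA[
```python
from typing import Sequence


def structural_score(a: Sequence[Sequence[int]], rows: Sequence[int], start_col: int) -> int:
    """Criteria 1 and 2: sum of element points minus size-mismatch penalties.

    a[i][j] is the similarity of bunsetsus i and j; rows[k] is the row of the
    path element in column start_col - k.
    """
    score = 0
    for k, row in enumerate(rows):
        col = start_col - k
        horizontal = k > 0 and rows[k - 1] == row
        if not horizontal:
            score += a[row][col]        # criterion 1: skip a(i, j-1) on a horizontal step
        if k > 0:
            score -= abs(rows[k - 1] - row - 1) * 2   # criterion 2: |p_i - p_{i+1} - 1| x 2
    return score


def separating_penalty(sl: int, kb_sl: int) -> int:
    """Criterion 3: (SL of the bunsetsu - KB's SL + 1) x 7 when SL is at least KB's SL."""
    return (sl - kb_sl + 1) * 7 if sl >= kb_sl else 0


def path_score(a, rows, start_col, kb_sl, scope_sls=(), closing_word_bonus=False) -> int:
    score = structural_score(a, rows, start_col)
    score -= sum(separating_penalty(sl, kb_sl) for sl in scope_sls)
    if closing_word_bonus:
        score += 6                      # criterion 4: CS ends with a word from Table 4
    return score
```
]]></Paragraph>
      <Paragraph> A diagonal step (the row index decreases by exactly one) carries no size penalty, while a horizontal step or a jump of two rows each costs 2 points, which is how the formula favours prior and posterior parts of equal length.</Paragraph>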
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Finding the Conjunctive Structure Scope
</SectionTitle>
      <Paragraph position="0"> For each non-zero element in the lowest row of a partial matrix A' in Figure 1, we search for the best path from it, i.e., the path with the greatest path score, by a dynamic programming technique. Calculation is performed column by column in the left direction from a non-zero element. For each element in a column, the best partial path including it is found by extending the partial paths from the previous column and by choosing the path with the greatest score.</Paragraph>
      <Paragraph position="2"> Then among the paths to the leftmost column, the path which has the greatest score becomes the best path from the non-zero element (Figure 4).</Paragraph>
      <Paragraph position="4"> (Figure 4 labels: the column now being calculated; the path with the greatest score.)</Paragraph>
      <Paragraph position="5"> Of all the best paths from non-zero elements, the path which has the maximum path score defines the scope of the CS; i.e., the series of bunsetsus on the left side of the maximum path and the series of bunsetsus under it are conjunctive (Figure 5).</Paragraph>
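      <Paragraph> The column-by-column search can be sketched as the following dynamic program in Python. It is a simplified illustration under explicit assumptions: only the element sum and the size-mismatch penalty are scored (the separating-level penalty and the bonus points are omitted), the matrix is a plain list of lists indexed from 0, the prior part is taken to run from the topmost row reached by the path down to the KB, and the function names are hypothetical.</Paragraph>
      <Paragraph><![CDATA[
```python
import math


def best_path_from(a, n, m):
    """Best path that starts at a[n][m] (KB row n, CB column m) and ends in column n + 1."""
    neg = -math.inf
    # best[i] = (score, rows) of the best partial path ending at row i of the current column
    best = [(neg, [])] * (n + 1)
    best[n] = (a[n][m], [n])
    for col in range(m - 1, n, -1):              # move leftwards, one column at a time
        new = [(neg, [])] * (n + 1)
        for i in range(n + 1):
            for prev in range(i, n + 1):         # paths only extend towards the upper left
                prev_score, prev_rows = best[prev]
                if prev_score == neg:
                    continue
                gain = 0 if prev == i else a[i][col]    # horizontal step: element ignored
                cand = prev_score + gain - abs(prev - i - 1) * 2
                if cand > new[i][0]:
                    new[i] = (cand, prev_rows + [i])
        best = new
    return max(best, key=lambda t: t[0])


def find_scope(a, n):
    """Try every non-zero element of the KB row and keep the best-scoring path."""
    last = len(a) - 1
    candidates = []
    for m in range(n + 1, last + 1):
        if a[n][m] != 0:
            score, rows = best_path_from(a, n, m)
            candidates.append((score, m, rows))
    if not candidates:
        return None
    score, m, rows = max(candidates, key=lambda t: t[0])
    return {"score": score, "rows": rows, "prior": (rows[-1], n), "posterior": (n + 1, m)}
```
]]></Paragraph>
      <Paragraph> Because each column keeps only the best partial path per row, the search for one starting element costs on the order of (number of columns) x (number of rows squared) steps instead of enumerating every path.</Paragraph>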
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Experiments and Discussion
</SectionTitle>
    <Paragraph position="0"> We illustrate the effectiveness of our method by the analysis of 180 Japanese sentences. Sixty sentences which are longer and more complex than average sentences were collected from each of the following three sources: the Encyclopedic Dictionary of Computer Science (EDCS) published by Iwanami Publishing Co., abstracts of papers of the Japan Information Center of Science and Technology (JICST), and the popular science journal &quot;Science&quot;, translated into Japanese (Vol. 17, No. 12, &quot;Advanced Computing for Science&quot;).</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <Paragraph position="0"> Each group of 60 sentences consists of 20 sentences from 30 to 50 characters, 20 sentences from 50 to 80 characters, and 20 sentences over 80 characters.</Paragraph>
      <Paragraph position="1"> As described in the preceding sections, many factors have effects on the analysis of CSs, and it is very important to adjust the weights for each factor. The method of calculating the path score was adjusted during the experiments on 30 sentences out of 60 sentences from EDCS. Then the other 150 sentences are analyzed by these parameters. As the analyses were successful as shown in the following, this method can be regarded as properly representing the balanced weights on each factor.</Paragraph>
      <Paragraph position="2"> This method defines where the CS ends, that is, which bunsetsu corresponds to the KB. However, for conjunctive noun phrases containing clause modifiers and for conjunctive predicative clauses, it is almost impossible to find out exactly where the CS starts, because many bunsetsus which modify right-hand bunsetsus exist in each part of the CSs and usually they do not correspond exactly. Thus it is necessary to revise the starting position of the CS obtained by this method. We treat the actual prior part of a CS as extending to bunsetsus which modify a bunsetsu in the prior part obtained by this method, unless they contain a comma or the topic-marking postpositional particle &quot;は&quot; (ha).</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 Examples of Correct Analysis
</SectionTitle>
      <Paragraph position="0"> Examples of correct analysis are shown in Figures 6-8. The revisions of CS scopes are shown in the notes of each figure. Chains of letters attached to matrix elements show the maximum path concerning the KB marked by the same letter and '&gt;'.</Paragraph>
      <Paragraph position="1"> In the case of example (a) in Figure 6, the conjunctive noun phrase, in which eight nouns are conjoined (chains of 'a', 'b', ... 'g'), is analyzed rightly thanks to the penalty points by the SLs of every comma between nouns. Thus, a CS consisting of more than two parts is expressed by the repetition of the combination of CSs consisting of two parts. In this example, the conjunctive predicative clause is also analyzed rightly (chains of 'h').</Paragraph>
      <Paragraph position="2"> [Figure 6: similarity matrix and analyzed paths for example (a)]</Paragraph>
      <Paragraph position="3"> In the case of example (b) in Figure 7, the CS which consists of three noun phrases containing modifier clauses is detected as the combination of two consecutive CSs, as in example (a) (chains of 'a' and 'b'). In the case of example (c) in Figure 8, the conjunctive noun phrase and the conjunctive predicative clause containing it are analyzed rightly. In this example, the successful analysis is due to the penalty points by the SL of the topic-marking postpositional particle &quot;は&quot; in the bunsetsus glossed &quot;(a computational experiment)&quot; and &quot;(in that)&quot;, which are outside the CS, and to the bonus points by the AW in the last bunsetsu of the CS.</Paragraph>
      <Paragraph position="4"> [Figures 7 and 8: similarity matrices and analyzed paths for examples (b) and (c)]</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 Experimental Evaluation
</SectionTitle>
      <Paragraph position="0"> We evaluated the analysis results of the 180 Japanese sentences by hand. The results of evaluating every sentence by each CS type are shown in Table 5. If two or more CSs of the same type exist in a sentence, the analysis is regarded as a success only when all of them are analyzed rightly.</Paragraph>
      <Paragraph position="1"> There are 144 conjunctive noun phrases in the 180 sentences, and 119 phrases among them are analyzed rightly. The success ratio is 83%. There are 118 conjunctive predicative clauses in the 180 sentences, and 94 clauses among them are analyzed rightly. The success ratio is 80%. There are 3 pairs of conjunctive incomplete structures, and all of them are analyzed rightly.</Paragraph>
      <Paragraph position="2"> As shown in Table 5, the success rate for the sentences from JICST abstracts is worse than that for the sentences from the other sources. The reason for the failures is that these sentences are often very ambiguous and confusing even for a human, because they pack too much content into one sentence in order to satisfy the limit on document size.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 Examples of Incorrect Analysis and Solutions for Them
</SectionTitle>
      <Paragraph position="0"> We give examples of failures of analysis (Table 6, Figure 9), and indicate solutions for them. In Table 6, underlined parts show the KBs, and two kinds of bracket marks show the wrongly analyzed scope and the right scope, respectively.</Paragraph>
      <Paragraph position="1"> * It is essential in this method to define an appropriate similarity between words. Thus changing the similarity points for more detailed groups of parts of speech (e.g., nouns can be divided into numerals, proper nouns, common nouns, and action nouns, which become verbs in combination with the verb glossed &quot;(do)&quot;) can improve the accuracy of the analysis. For example, example (i) in Table 6 may be analyzed rightly if the similarity between the action noun glossed &quot;(extension)&quot; and the action noun glossed &quot;(maintenance)&quot; is greater than that between the action noun &quot;(extension)&quot; and the common noun glossed &quot;(difficulty)&quot;.</Paragraph>
      <Paragraph position="2"> * Semantic similarities between words are currently calculated only by using BGH, which does not contain technical terms. If similarity points between technical terms can be given by a thesaurus, the accuracy of the analysis will be improved.</Paragraph>
      <Paragraph position="3"> Example (ii) will be analyzed rightly if greater points are given to the similarity between the terms glossed &quot;(Active Chart Parsing)&quot; and &quot;HPSG (Head-driven Phrase Structure Grammar)&quot;. * By the additional use of relatively simple syntactic conditions, some sentences which are analyzed wrongly by this method will be analyzed rightly. For example, because Japanese modifier/modifyee relations, including the relation between a verb and its case frame elements, do not cross each other, the modifier/modifyee relations in noun phrases and predicative clauses do not spread beyond each phrase or clause, except for the relation concerning their last bunsetsu.</Paragraph>
      <Paragraph position="4"> This condition is not satisfied by the analyzed CS in example (iii), whose prior noun phrase contains no verb related to the case frame element glossed &quot;(grammar)&quot;. By this condition it can be estimated that only the sequences glossed &quot;(natural language) (analysis and)&quot; or &quot;(analysis and)&quot; can be the prior part of the CS. We are planning to do such a correction in the next stage of the syntactic analysis, which analyzes all modifier/modifyee relations in a sentence using the CS scopes detected by this method.</Paragraph>
      <Paragraph position="5"> * In example (iv), the KB in the beginning part of a sentence corresponds to the last CB. That is, a short part of a sentence corresponds to the following long part. It is very difficult to analyze such an extremely unbalanced CS because this method gives priority to similar CSs. In order to analyze example (iv), the causal relationship between the bunsetsus glossed &quot;(using)&quot; and &quot;(create)&quot; will be necessary.</Paragraph>
      <Paragraph position="6"> * Some sentences analyzed incorrectly are too subtle even for a human to find the right CSs. Example (v) cannot be analyzed rightly without expert knowledge.</Paragraph>
      <Paragraph position="7"> * This method cannot handle CSs in which the prior part contains some modifiers and the posterior part contains nothing corresponding to them (example (vi), Figure 9). For these structures we must consider a path extending upward in a partial matrix, but this is impossible with the criteria about word similarities alone.</Paragraph>
      <Paragraph position="8"> CSs such as example (v) and example (vi) cannot be analyzed correctly without semantic information. However, such expressions are very few in actual text.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Concluding Remarks
</SectionTitle>
    <Paragraph position="0"> We have shown that a variety of parallel structures in Japanese sentences can be detected by the method explained in this paper. As a result, a long sentence can be reduced to a short one, and the success rate of syntactic analysis of these long sentences will become very high.</Paragraph>
    <Paragraph position="1"> There are still some conjunctive expressions which cannot be recognized by the proposed method, and we are tempted to rely on semantic information to get proper analyses for these remaining cases. Semantic information, however, is not as reliable as syntactic information, and we have to make further efforts to find syntactic rather than semantic relations in these difficult cases. We think that this is possible. One thing which is certain is that we have to look at many more components simultaneously over a wider range of word strings in a long sentence.</Paragraph>
  </Section>
</Paper>