<?xml version="1.0" standalone="yes"?> <Paper uid="P03-2032"> <Title>Extraction and Verification of KO-OU Expressions from Large Corpora</Title> <Section position="3" start_page="0" end_page="1" type="metho"> <SectionTitle> 2 The Previous Works and How to Position this Study </SectionTitle> <Paragraph position="0"> (Ohno, 1993) pointed out that some expressions give advance notice, at an early stage of an utterance that unfolds over time, of whether the sentence is affirmative, negative, or interrogative. Ohno suggested that certain adverbs have replaced the KAKARI-JOSI of archaic Japanese.</Paragraph> <Paragraph position="1"> (Masuoka, 1991) described the KO-OU relation between sentence elements. According to Masuoka, some sentences contain KO-OU expressions such as those shown in Table 1.</Paragraph> <Paragraph position="2"> However, this description has the following weaknesses.</Paragraph> <Paragraph position="3"> The KO and OU elements in a KO-OU relation are placed together in the same category, and there is no description of the OU element. Furthermore, only a limited number of elements are listed, and the objectivity of the KO and OU elements is not guaranteed.</Paragraph> <Paragraph position="4"> (Footnote: KAKARI-JOSI is a type of Japanese particle.)</Paragraph> <Paragraph position="5"> KO-OU expression data is useful as basic data for resolving ambiguity in parsing and for deciding modification relations. To be useful as basic data, however, it must first be sufficiently large. Second, it must be objective.
Therefore, we attempted to extract KO-OU relations automatically from a large-scale corpus.</Paragraph> <Paragraph position="6"> Table 1 Masuoka's KO-OU expression data</Paragraph> </Section> <Section position="4" start_page="1" end_page="1" type="metho"> <SectionTitle> 3 Assumed Usage of KO-OU Expression Data </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 3.1 To Resolve Ambiguity </SectionTitle> <Paragraph position="0"> KO-OU expression data is useful for resolving ambiguity in parsing. It is also useful for deciding modification relations (Figure 1).</Paragraph> </Section> <Section position="2" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 3.2 Gradual Understanding </SectionTitle> <Paragraph position="0"> KO-OU expression data enables the reader to guess how a sentence will end while still midway through it, because the appearance of a KO element makes it possible to predict the OU elements that follow (Figure 2). The data can therefore serve as basic data for understanding sentences. In addition, this technology can be used to guess the point in the minutes of a meeting at which the speakers [...] [Table 1 content, KO element | OU element: nee, oi | te-kudasai, naa; tabun, doumo | daro-u, rasii, you-da; kessite, kanarazu-si-mo | nai] [Figure 2: kamisimereba kitto sake no aji ga suru ni-chiga-inai (If you chew it, you will certainly taste salmon.); label: conviction]</Paragraph> </Section> <Section position="3" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 4.1 Method </SectionTitle> <Paragraph position="0"> (Yamamoto and Umemura, 2002) studied the estimation of one-to-many relations between entities in a corpus. They carried out experiments on extracting one-to-many relations between phenomena from a corpus using the complementary similarity measure (CSM), which copes well with inclusion relations between appearance patterns.
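The CSM computation mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of (Yamamoto and Umemura, 2002) or of this paper: the formula is the one commonly given for CSM over a 2x2 co-occurrence table and is assumed here because the text does not state it, and the names csm, count_table, and TOY_SENTENCES, as well as the romanized toy sentences, are hypothetical.

```python
import math

# Hypothetical toy corpus: each sentence is a list of morphemes.
TOY_SENTENCES = [
    ['kitto', 'utsuru', 'hazu', 'da'],
    ['kitto', 'kuru', 'hazu'],
    ['kitto', 'kintyou', 'suru', 'daro', 'u'],
    ['asu', 'kuru', 'hazu', 'da'],
    ['kyou', 'ha', 'ame', 'da'],
    ['asu', 'ha', 'hareru'],
]


def count_table(sentences, ko, ou):
    """Count the 2x2 co-occurrence table for a KO and an OU candidate.

    a: sentences containing both candidates
    b: sentences containing only the KO candidate
    c: sentences containing only the OU candidate
    d: sentences containing neither
    """
    a = b = c = d = 0
    for tokens in sentences:
        has_ko = ko in tokens
        has_ou = ou in tokens
        if has_ko and has_ou:
            a += 1
        elif has_ko:
            b += 1
        elif has_ou:
            c += 1
        else:
            d += 1
    return a, b, c, d


def csm(a, b, c, d):
    """Assumed CSM formula: (ad - bc) / sqrt((a + c) * (b + d))."""
    denominator = math.sqrt((a + c) * (b + d))
    if denominator == 0:
        # Degenerate table; return 0.0 rather than divide by zero.
        return 0.0
    return (a * d - b * c) / denominator


print(csm(*count_table(TOY_SENTENCES, 'kitto', 'hazu')))  # prints 1.0
```

On this toy data the counts are a=2, b=1, c=1, d=2. The sketch tests only whether each morpheme is present in a sentence; the word-order condition described in the process flow is left out for brevity.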
The KO-OU relation in this research can be regarded as a type of one-to-many relation.</Paragraph> </Section> <Section position="4" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 4.2 Data Used </SectionTitle> <Paragraph position="0"> In this paper, we dealt with FUKU-JOSI, KAKARI-JOSI, and some adverbs, shown below.</Paragraph> </Section> </Section> <Section position="5" start_page="1" end_page="2" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> We proceeded on the assumption that these are the KO elements in the KO-OU relation. For our research, we used newspaper articles from the Mainichi Shimbun, Nihon Keizai Shimbun, and Yomiuri Shimbun issued between 1991 and 2000.</Paragraph> <Paragraph position="1"> [Target words] koso, sika, sae, ha, mo, bakari, nomi, sura, nara, kurai, dake, nannte, kessite, osoraku, tabun, zehi, marude, mosi, kitto [Figure 3: Process flow]</Paragraph> <Section position="1" start_page="2" end_page="2" type="sub_section"> <SectionTitle> 4.3 Process Flow </SectionTitle> <Paragraph position="0"> (3) From the pairs in (2), we extracted those whose words appeared in the order KO element, then OU element.</Paragraph> <Paragraph position="1"> (We judge the pairs based on this word order.)
(4) We judged the pairs based on reliability.</Paragraph> <Paragraph position="2"> As a result of this process, we obtained 14 pairs with &quot;kessite&quot; as the KO element, 16 with &quot;sae,&quot; and 23 with &quot;ha.&quot; Approximately 20 pairs were obtained per target word.</Paragraph> </Section> </Section> <Section position="6" start_page="2" end_page="2" type="metho"> <SectionTitle> 5 Verification of KO-OU Expression Data </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="2" end_page="2" type="sub_section"> <SectionTitle> 5.1 Necessity to Give Meaning/Information </SectionTitle> <Paragraph position="0"> If the KO-OU expression data is to be used for gradual understanding of sentences, the data must be given meaning/information. When a KO element appears, it will then be possible to grasp or guess the content of the sentence by referring to the KO-OU expression data (Figure 2).</Paragraph> <Paragraph position="1"> However, it is difficult to give meaning/information to the data obtained from the process in Chapter 4, because morphological analysis breaks the data down into individual morphemes and each element is too short.</Paragraph> <Paragraph position="2"> In Japanese sentences, a sequence of particles and auxiliary verbs often forms the predicate. This sequence plays an important role in determining the event that the sentence describes. Particles and auxiliary verbs are function words, so the meaning of some of them cannot be determined when they appear in isolation. Furthermore, in some cases they change their meaning when combined with another word.</Paragraph> <Paragraph position="3"> Table 2 shows the OU elements obtained by the procedure in Chapter 4 for the KO element &quot;kitto&quot;.
&quot;Da&quot; in Table 2 has an assertive meaning in a sentence such as &quot;kyou wa ame da . (It is raining today.)&quot; In contrast, it has an inferential meaning in &quot;asu wa hareru daro-u . (It should be fine tomorrow.)&quot; In addition, although &quot;nai&quot; is a negative auxiliary verb, the negative meaning disappears when it is combined as in &quot;ka-mo-shire-nai (may be)&quot; and &quot;chigai-nai (must be),&quot; and the combination as a whole expresses guess or conviction.</Paragraph> <Paragraph position="4"> [Table 2, KO element | OU element (part of speech): kitto | da (auxiliary); kitto | chigai (noun); kitto | to (particle); kitto | ka (particle); kitto | omou (verb); kitto | ne (particle); kitto | nai (auxiliary); kitto | you (noun); kitto | hazu (noun); ...]</Paragraph> </Section> <Section position="2" start_page="2" end_page="2" type="sub_section"> <SectionTitle> 5.2 Verification of OU Element Using &quot;Kitto&quot; </SectionTitle> <Paragraph position="0"> In this section, we present an example analysis of the OU elements for the KO element &quot;kitto (certainly).&quot; The OU elements obtained from the procedure in Chapter 4 can be classified as follows: (a) those that can be an OU element by themselves, (b) those that become an OU element when combined with others, (c) those that cannot become an OU element.</Paragraph> <Paragraph position="1"> No words of type (c) were found among the OU elements obtained for the KO element &quot;kitto.&quot; In the following, we describe (a) and (b) in detail. (a) OU element by itself Of the OU elements for the KO element &quot;kitto&quot; in Table 2, &quot;hazu&quot; can be an OU element by itself. [1] koudaina umibe de miru sakuhin wa kitto miryokuteki ni utsuru hazu da .</Paragraph> <Paragraph position="2"> (Works that you see at the open seaside should look attractive.) This is the only sentence with an independent OU element for &quot;kitto&quot; in the data obtained from the process in Chapter 4.
The same holds for the data on KO elements other than &quot;kitto.&quot; Morphological analysis has shortened the character strings, so few elements can be regarded as an OU element by themselves, and the meaning cannot be determined from such an element alone.</Paragraph> <Paragraph position="3"> (b) OU element when paired with others When &quot;chigai&quot; is combined with &quot;ni&quot; and &quot;nai&quot; to make &quot;ni-chigai-nai (must be),&quot; it becomes an OU element. Similarly, combining &quot;da&quot; with &quot;u&quot; yields the OU element &quot;daro-u (perhaps).&quot; &quot;Da&quot; is the base form of &quot;daro,&quot; which becomes &quot;daro-u&quot; when combined with &quot;u.&quot; [2] kitto kintyou suru daro-u .</Paragraph> <Paragraph position="4"> (It is certain that one will be nervous.) [3] kamisimereba , kitto sake no aji ga suru ni-chigai-nai . (If you chew it, you will certainly taste salmon.) Looking at each combination above as a whole, we can assign it a meaning such as guess or conviction.</Paragraph> </Section> </Section> <Section position="7" start_page="2" end_page="2" type="metho"> <SectionTitle> 6 Questions for the Future </SectionTitle> <Paragraph position="0"> As described in Chapter 5, multiple elements must be combined before meaning/information can be given. We are currently pursuing the automatic generation of such combinations of multiple elements, and we are carrying out experiments on calculating a similarity measure for them. These experiments will give us automatically generated combinations of elements together with their similarity measures. This should be useful data for resolving ambiguity (Figure 1).</Paragraph> </Section> </Paper>