<?xml version="1.0" standalone="yes"?> <Paper uid="C80-1019"> <Title>CONCEPTUAL TAXONOMY OF JAPANESE VERBS FOR UNDERSTANDING NATURAL LANGUAGE AND PICTURE PATTERNS</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 3. Criterion for classification </SectionTitle> <Paragraph position="0"> In machine processing it must be shown why a word is classified into such and such a term. Ordinary thesauruses do not stress the criteria.</Paragraph> <Paragraph position="1"> Concepts of verbs are the core of the system from the linguistic viewpoint. We classify almost all concepts of verbs in daily Japanese by association of natural language with the real world, answering the above-mentioned problems.</Paragraph> <Paragraph position="2"> As for problem 1, a working hierarchy along an abstraction process is constructed in the system. As for problem 2, case frames are shown in &quot;simple matter concept,&quot; and connecting relations among elementary matter concepts are shown in &quot;non-simple matter concept.&quot; As for problem 3, an algorithm is introduced into the classification.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Preliminary Considerations </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Meaning Common to Natural Language and Picture Patterns </SectionTitle> <Paragraph position="0"> Putting aside what the meaning of a picture pattern is, let's first discuss how it can be understood. When a picture pattern or picture pattern sequence is given, an infinite number of static or dynamic events can generally be observed within it. Suppose that the meaning of each event is described in natural language--in fact, one can express almost all events in natural language apart from the question of efficiency--then these descriptive sentences will amount to an infinite number. 
An ordinary sentence is reduced into simple sentences, each of which is governed syntactically and semantically by a verb. Since there is a finite number of verbs in each language, the meanings of an infinite number of the events involved are roughly divided into the meanings of those verbs and their interrelations.</Paragraph> <Paragraph position="1"> Now, what is the meaning of picture patterns? In the case of circuit diagrams or chemical structural formulas, we can think of the semantics because they have signs and syntactic relations. In the case of real world picture patterns, however, there exist neither signs nor syntactic relations. Here we observe real world objects named by human beings. If we consider them something like signs, we can think of the syntax, and then the semantics, too. The meanings are common to natural language and picture patterns, although their syntactic structures differ largely from each other.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Paradigms for Interpretation and Understanding </SectionTitle> <Paragraph position="0"> In order to clarify the notions of interpretation and understanding, first, we propose a working hierarchy of knowledge along the abstraction process, as follows:</Paragraph> <Paragraph position="2"> Level 1. Raw data: Data close to copies of things and events in the real world. Image-like data.</Paragraph> <Paragraph position="3"> Level 2. Data of visual features: Features extracted from raw data.</Paragraph> <Paragraph position="4"> Level 3. Data of conceptual features: Symbolic data associated with visual features. Some of them correspond to Chomsky's syntactic features in the lexicon. 6 Level 4. Concept data: Data obtained by organizing conceptual features. Most data have names as words. In the case of verbs, they roughly correspond to Minsky's surface semantic frames. 7 Level 5. Interconnected concept data: Networks of concept data. 
A concept can be interconnected with other concepts from various viewpoints.</Paragraph> <Paragraph position="5"> Schank's scripts can be regarded as one example of this type. 8 Some networks have names as words.</Paragraph> <Paragraph position="6"> Fig. 1 shows the hierarchy. &quot;Interpretation&quot; is considered as an association of the data at one level with another level. (Here input images are considered as level zero data.) Since the knowledge system has several levels and each level has many domains, interpretation is possible in many ways. If an interpretation is performed under a certain control system that specifies which level and which domain the input data should be associated with, it is called &quot;understanding.&quot; As the level number increases, a level becomes higher because abstractions of concepts proceed. But which is deeper, level 1 or level 5? In natural language understanding, input sentences will probably be interpreted initially at level 4 or 5, then the interpretation may descend to level 1, where level 1 might be deeper than either level 4 or 5. However, if the interpretation of a picture pattern proceeds from level 1 to 5, we regard level 5 as the deeper level.</Paragraph> <Paragraph position="7"> The knowledge system is so massive and complicated that it is necessary to make systematic analyses. Since the number of verbs is finite, concepts of verbs at level 4 provide a clue to systematic and exhaustive analyses of knowledge from the linguistic viewpoint.</Paragraph> <Paragraph position="8"> The concepts of verbs are divided into two large classes: &quot;simple matter concepts&quot; and &quot;non-simple matter concepts.&quot; 2,3</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="129" type="metho"> <SectionTitle> 3 Simple Matter Concepts </SectionTitle> <Paragraph position="0"> A ,. with the real world e I : It has a roof. c I : Time lapse</Paragraph> <Paragraph position="2"> Fig. 
1 Hierarchy of the knowledge system. The simple matter concepts are not reduced into any more elementary matter concepts, while the non-simple ones are. Most of them are so concrete that they are well analyzed by direct association with the real world. That which is identified by a verb is called &quot;matter.&quot; Unlike things, matter does not occur alone. It arises accompanied by things, events, and attributes, which are called &quot;constituents,&quot; so this concept can be regarded as the concept of a dynamic or static relation among constituents and be expressed by v(s, o, o_f, o_t, o_m, o_s, o_w, o_c, p, t, r, ...) (A) where each symbol in parentheses represents a constituent specified below.</Paragraph> <Paragraph position="3"> s: subjective concept; o: objective concept; o_f: starting point in action, or initial state of change; o_t: finishing or target point in action, or final state of change; o_m: opponent in mutual action; o_s: standard or reference; o_w: way or means (including instrument); o_c: concept which supplements attributive</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> aspects </SectionTitle> <Paragraph position="0"> p, t, r, ...: place, time, cause (or reason), ...</Paragraph> <Paragraph position="1"> Out of these, the eight constituents s through o_c are obligatory because they are indispensable for the recognition of matter. In Japanese sentences, the obligatory constituents are often accompanied by such postpositional words as s-ga, o-o, o_f-kara, o_t-ni, o_m-to, o_s-ni, o_w-de, and o_c-to. But it is difficult to decide the case of a constituent only by such postpositional words.</Paragraph> <Paragraph position="2"> The combination of obligatory constituents decides the basic frame of matter concepts. Table 1 was obtained after an elaborate investigation of more than 1,500 simple matter concepts. Two comments must be added to Table 1. First, optional constituents participate fairly freely in matter. 
Table 1 says nothing about this problem. Next, some obligatory constituents are not obligatory in every case.</Paragraph> <Paragraph position="4"> (A button) comes off (the shirt).</Paragraph> <Paragraph position="5"> In M1, o_f eda (branch) is optional because ochiru is recognized by observing the vertical movement of a leaf, while in M2, o_f shatsu (shirt) is obligatory because toreru is not recognized without the existence of a shirt. Constituents o_f, o_t, o_w, and o_c belong to such a group.</Paragraph> </Section> <Section position="2" start_page="0" end_page="129" type="sub_section"> <SectionTitle> 3.2 Semantic Contents </SectionTitle> <Paragraph position="0"> In the case of semantic contents, it is difficult to classify them by examining the combination of constituents, so we adopted a trial-and-error method of extracting features for classification from the concepts. Letting the set of simple matter concepts under consideration be C, the feature extraction from C is performed by the following recursive procedure: Step 1. Select several elements having similar contents from C and extract from them a feature (c_1) which makes them similar.</Paragraph> <Paragraph position="1"> Step n (n ≥ 2). Let the features extracted up to step (n-1) be c_1, c_2, ..., c_{n-1}. Extract a feature (c_n) in the same way as step 1. (The elements so far selected may be adopted in the extraction.) Then compare c_n with each c_i (1 ≤ i ≤ n-1).</Paragraph> <Paragraph position="2"> 1) If c_n is independent of each c_i, adopt it as a feature and go to step (n+1).</Paragraph> <Paragraph position="3"> 2) Otherwise, 2.1) if the contents of c_n/c_i contain those of c_i/c_n, adopt c_n as an upper/lower-grade feature of c_i and go to step (n+1).</Paragraph> <Paragraph position="4"> 2.2) Otherwise, make c_n a special feature and go to step (n+1).</Paragraph> <Paragraph position="6"> ents).</Paragraph> <Paragraph position="7"> (hanako-ga ringo-o) taberu. (Hanako) eats (an apple). (untensyu-ga tsumini-o kuruma-kara) orosu. 
(A driver) unloads (baggage from the car).</Paragraph> <Paragraph position="8"> (seito-ga kaban-ni kyōkasyo-o) ireru. (Pupils) put (textbooks into knapsacks).</Paragraph> <Paragraph position="9"> (hikōshi-ga kanseitō-to shingō-o) kawasu. (A pilot) exchanges (information with a control tower).</Paragraph> <Paragraph position="10"> (hito-ga saji-de satō-o) sukuu. (One) scoops (sugar with a spoon). (hito-ga soyokaze-o suzushiku) kanjiru. There are 1,209 different concepts in the classified concepts.</Paragraph> <Paragraph position="11"> This method was applied to the set of concepts described in Sect. 3.1 and the result is tabulated in Table 2. Here the distribution was obtained by the classification of Chapter 5. In Table 2, the first digits 0, 1, and 2 in the classification numbers roughly represent movement, change, and state, respectively.</Paragraph> </Section> </Section> <Section position="6" start_page="129" end_page="129" type="metho"> <SectionTitle> 4 Non-Simple Matter Concepts </SectionTitle> <Paragraph position="0"> Generally, non-simple matter concepts are so abstract in comparison with simple ones that it is hard to show a clear association of natural language with the real world. 
We emphasize the analysis of how they are composed of simple ones.</Paragraph> <Section position="1" start_page="129" end_page="129" type="sub_section"> <SectionTitle> 4.1 Complex Concept A </SectionTitle> <Paragraph position="0"> If two elementary matter concepts v_i and v_j (not necessarily simple ones) are connected according to one of the rules shown in Table 3 and the connected concept is expressed by a Japanese complex word of two verbs for v_i and v_j, it is called a &quot;complex concept of A.&quot; The rules in Table 3 were obtained from the investigation of about 900 matter concepts which consist of two matter concepts and are expressed by a Japanese complex word.</Paragraph> <Paragraph position="1"> In rule XXI.I, v_j (deru) is an upper-grade concept of v_i (afureru) and contains the contents of v_i. Rule XXI.I is concerned with the whole and a part of the same matter, while rule XXI.II is concerned with two different matters. The former is considered as a special case of the latter in which two matters coincide with each other.</Paragraph> <Paragraph position="2"> Rules XXI and XXII are logical while rule XXIII is linguistic. As &quot;cause&quot; is one of the constituents in (A) in Sect. 3.1, XXI may be considered as a part of XXIII.</Paragraph> <Paragraph position="3"> The semantic contents of complex concept A consist of the contents of v_i and v_j and their connecting relation.</Paragraph> </Section> <Section position="2" start_page="129" end_page="129" type="sub_section"> <SectionTitle> 4.2 Complex Concept B </SectionTitle> <Paragraph position="0"> Complex concept B consists of several elementary matter concepts and is usually expressed by a Japanese simple word. However, no general rule can be found to connect elementary matter concepts, so a hierarchical analysis was made for a small number of complex concepts of B as shown in Fig. 
2 and Table 4.</Paragraph> <Paragraph position="2"> From the diachronic point of view, there seems to be a reason why a complex concept of B is expressed by a simple word. The relation among elementary matter concepts cannot be expressed well by enumerating each verb as in the case of complex concept A. When one is going to designate matter in the real world without the verb identifying it, one must utter several sentences. If such necessities often arise and the relationship is conceptualized, it will be efficient to give it a name. [Figure/table labels: Poor practice, failure; Be ill able to do; Lose a chance to do; Fail to do in part; Fail to do.]</Paragraph> <Paragraph position="3"> As for semantic contents, elementary matter concepts and their relationship form the surface contents. Approximately 1,000 complex concepts of B were investigated according to the feature extraction method in Sect. 3.2 and the result is tabulated in Table 5.</Paragraph> </Section> <Section position="3" start_page="129" end_page="129" type="sub_section"> <SectionTitle> 4.3 Derivative Concept </SectionTitle> <Paragraph position="0"> Some concepts possess a function of deriving a new concept by operating on others. 
Matter concepts derived from operative concepts with both morphemic structures and derivative information, as shown in Tables 6 and 7, respectively, are called &quot;derivative concepts.&quot; Table 7 was obtained from the investigation of about 700 matter concepts, most of which are expressed by a complex word in which one concept is operative on the other.</Paragraph> <Paragraph position="1"> The derivative information is very similar to the modal information of auxiliary verbs, but it differs in that some matter concepts are operated upon and those operations are fixed.</Paragraph> </Section> </Section> <Section position="7" start_page="129" end_page="129" type="metho"> <SectionTitle> 5 Classification </SectionTitle> <Paragraph position="0"> In order to determine whether the analyses in Chapters 3 and 4 are good or not, we classified about 4,700 basic matter concepts in daily Japanese, which are listed in &quot;Word List by Semantic Principles&quot; edited by the National Language Research Institute in Japan. 4</Paragraph> <Section position="1" start_page="129" end_page="129" type="sub_section"> <SectionTitle> 5.1 Algorithm of Classification </SectionTitle> <Paragraph position="0"> An algorithm is introduced into the classification, referring to Figs. 3 and 4. The elements or members of V_x (x = T, U, ...) are denoted by V_xi (i = 1, 2, ...), and the sum and difference in set theory are denoted by + and -, respectively.</Paragraph> <Paragraph position="1"> 1) Preprocessing. For each V_Ti of V_T, 1.1) examine whether V_Ti functions with others or by itself. If it functions with others, then it is excluded from V_T.</Paragraph> <Paragraph position="2"> Example. -garu; 1.2) examine whether there is V_Th (h &lt; i) which has the same contents as V_Ti and is expressed by the same verb as V_Ti. 
If there is such V_Th, V_Ti is excluded from V_T.</Paragraph> <Paragraph position="3"> Let's denote the class of concepts excluded by 1.1) and 1.2) by V_p and let V_U = V_T - V_p.</Paragraph> <Paragraph position="4"> 2) Classification of derivative concepts. For each V_Ui of V_U, 2.1) if V_Ui is expressed by a derivative word, it is classified as a member of term L in Table 6. It is further classified in more detail according to Table 7; 2.2) if V_Ui is expressed by a complex word of two verbs and one of these verbs is affixal, then it is regarded as a member of term LI in Table 6; 2.3) if V_Ui is expressed by neither a derivative word nor a complex word, but it is regarded as a member of one of the terms in Table 7, it is classified into that term. At the same time, it is classified into term L~ in Table 6. Let the class of concepts thus obtained be V_D.</Paragraph> <Paragraph position="5"> 3) Classification of complex concepts of A. For each V_Ui (≠ V_Dj) of V_U, if V_Ui is expressed by a complex word of two verbs and each concept functions by itself, it is considered as a complex concept of A and classified according to Table 3. The class thus obtained is denoted by V_A. 4) Classification of complex concepts of B. For each V_Ui (≠ V_Dj, V_Ak) of V_U, if its contents do not belong to any term in Table 2, it is regarded as a complex concept of B. The class thus obtained is denoted by V_B and is subject to the following process: For each V_Bi, 4.1) examine its surface structure and classify it according to Table 1; 4.2) examine its surface contents and classify it according to Table 5.</Paragraph> <Paragraph position="6"> Let V~ = V_D + V_A + V_B and V_S = V_U - V~.</Paragraph> <Paragraph position="7"> 5) Classification of similar concepts. In class V_S of simple matter concepts, if there is a group with similar contents, choose a concept as the standard, then classify the remainder as similar concepts.</Paragraph> <Paragraph position="8"> Example. Korogeru (roll), korobu (roll), marobu (roll), etc. 
are similar concepts for standard concept korogaru (roll).</Paragraph> <Paragraph position="9"> Counter-example. Saezuru (chirp), hoeru (bark), unaru (roar), inanaku (neigh), etc. are not similar concepts for standard concept naku (cry). Here, it is assumed that if a certain concept is a standard concept, it is not a similar concept for another standard one at the same time.</Paragraph> <Paragraph position="10"> The class of similar concepts thus obtained is denoted by V_s, and let V_b = V_S - V_s.</Paragraph> </Section> </Section> <Section position="8" start_page="129" end_page="485" type="metho"> <SectionTitle> 6) Classification of standard concepts </SectionTitle> <Paragraph position="0"> For each V_bi of V_b, 6.1) examine its structural pattern and classify it according to Table 1; 6.2) examine its semantic contents and classify it according to Table 2.</Paragraph> <Paragraph position="1"> In the above processes 2) through 6), one concept can be classified into two or more terms if necessary.</Paragraph> <Section position="1" start_page="129" end_page="485" type="sub_section"> <SectionTitle> 5.2 Results and Discussion </SectionTitle> <Paragraph position="0"> First, let's discuss the relation among the obtained classes along the abstraction process.</Paragraph> <Paragraph position="1"> There are two kinds of abstraction processes: (i) extracting common features from concepts, as in bulldog → dog → animal → living thing → thing, and (ii) connecting several concepts to form a new concept, as shown in complex concept B. From the latter viewpoint, the relation among classes is schematized as indicated in Fig. 5.</Paragraph> <Paragraph position="2"> Fig. 5 Relation among obtained classes. Simple matter concepts (V_b) are regarded as the base of matter concepts in the sense that V_b covers the concepts of real world matter at a minimum and every other matter concept is derived from V_b by a rule. 
Two simple matter concepts are connected by a rule and form a slightly abstract concept, a complex concept of A. Several matter concepts are organized by a fairly complicated rule into a new abstract concept, a complex concept of B. One of the elementary concepts in a complex concept of A changes its meaning diachronically and becomes a derivative operator. So, the system of Japanese verb concepts has its own nature--although it is a fact that a large part of the system is universal--and is not manipulated at one level.</Paragraph> <Paragraph position="3"> Next, Table 8 indicates the distribution of all matter concepts (V_T = V_U + V_p: 4,740). The minute distribution in each class has been shown in Tables 2, 5, and 7, respectively. Table 8 is instructive in investigating the human competence in organizing the language system. For example, if class V_b is regarded as &quot;primitive&quot; concepts, the number 1,209 of V_b does not side with Schank's classification, 9 but with Minsky's idea. 7 From Tables 2, 5, and 7, we can measure the degree of human concern about real world matter. For example, term 0.0 in Table 2 shows that human beings are most interested in displacements of objects.</Paragraph> <Paragraph position="4"> Finally, we consider that every matter concept under consideration was classified satisfactorily, which supports our analyses.</Paragraph> </Section> </Section> </Paper>