<?xml version="1.0" standalone="yes"?>
<Paper uid="C82-1067">
  <Title>MAN-ASSISTED MACHINE CONSTRUCTION OF A SEMANTIC DICTIONARY FOR NATURAL LANGUAGE PROCESSING</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
Academia, 1982
MAN-ASSISTED MACHINE CONSTRUCTION OF A SEMANTIC DICTIONARY
FOR NATURAL LANGUAGE PROCESSING
</SectionTitle>
    <Paragraph position="0"> Fukuoka 812, Japan Nagasaki 852, Japan Fukuoka 812, Japan This is a report on the semantic dictionary for natural language processing that we are now constructing. This paper explains how to obtain the semantic information for the dictionary from an ordinary Japanese language dictionary of about 60,000 items (which had already been put into machine-readable form), and what frame should be used to represent the meaning of each item (word). It then discusses a man-assisted machine procedure that embeds the semantic graph of each headword of the ordinary dictionary into the frame of that headword.</Paragraph>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> There are two obstacles to the construction of a semantic dictionary of practical size for natural language processing. One is that the number of words is so large that it is unknown from what source, and how, to obtain the semantic information necessary for the dictionary (tens of thousands of general words should be accommodated). The other is that it is unknown how to capture and represent the meaning of each word.</Paragraph>
    <Paragraph position="1"> We are going to settle the problems as follows. For the first problem, we obtain the information necessary for the semantic dictionary from an ordinary dictionary of the Japanese language by analyzing the sentences that define headwords. For the second problem, we represent the meaning of a word using the frame, whose structure we have studied extensively for a couple of decades (YOSHIDA (1982)).</Paragraph>
    <Paragraph position="2"> Figure 1 is the outline of the construction steps of the semantic dictionary.</Paragraph>
    <Paragraph position="3"> In this report we discuss, as a preliminary stage of the construction, the framing of the contents of the sentences that define words in the ordinary dictionary. The discussion covers the following.</Paragraph>
    <Paragraph position="4">  (1) Features of the structure and the contents of the sentence defining (describing the meaning of) a headword.</Paragraph>
    <Paragraph position="5"> (2) The use of the semantic graph as a scheme for representing the meaning (structure) given by the sentence that defines a headword in the ordinary dictionary. (3) The frame to represent the meaning of a word.</Paragraph>
    <Paragraph position="6"> (4) The way of embedding the information obtained from the semantic graph in the frame given by (3).</Paragraph>
    <Paragraph position="7"> (5) The way of obtaining the meaning (definition) of a word by unifying the &quot;partial definitions&quot; (given by (4)) concerning the word.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="419" type="metho">
    <SectionTitle>
2. INFORMATION OBTAINED FROM AN ORDINARY JAPANESE LANGUAGE DICTIONARY AND
FEATURES OF THE SENTENCES THAT DEFINE HEADWORDS
</SectionTitle>
    <Paragraph position="0"> S. YOSHIDA, H. TSURUMARU and T. HITAKA We have chosen a medium-size Japanese language dictionary (Shinmeikai Kokugo jiten, published by Sanseido) for constructing the semantic dictionary. The reasons for choosing this dictionary are as follows. (1) The number of headwords is adequate (about 60,000 items). (2) It had already been put into machine-readable form. (3) It puts emphasis on describing the meaning (signification) of a word in sentences and avoids mere restatement in other words.</Paragraph>
    <Paragraph position="1"> (4) It tries to classify the meanings of words into several large groups and avoids needlessly small groups.</Paragraph>
    <Paragraph position="2"> The information that can be obtained from the dictionary is as follows. (1) Syntactic information such as part of speech, inflection and conjugation. (2) The described meaning of a headword.</Paragraph>
    <Paragraph position="3"> (3) The several classified meanings of a multivocal (polysemous) word.</Paragraph>
    <Paragraph position="4"> (4) Detailed meaning in context and subtle shades of meaning among synonyms. (5) Synonyms, antonyms and idioms (idiomatic phrases).</Paragraph>
    <Paragraph position="5"> (6) Selection of significant words.</Paragraph>
    <Paragraph position="6"> The features of the word definitions in the dictionary are as follows. (1) Symbols and abbreviations are used in place of sentences. e.g. ==~&gt;... : make reference to ... ~ : the five conjugations of verbs (--)...: antonym is ... ~ : adjective * .. : noun form is ... ~deg : German (language) (2) P-N type (noun modified by predicate) and N-N type (noun modified by noun) phrases are used extensively.</Paragraph>
    <Paragraph position="7"> (3) Words are omitted from the definition sentences, into which the meaning is condensed, because the dictionary assumes a human reader.</Paragraph>
    <Paragraph position="8"> (4) The meaning of a headword is defined by viewing it from some aspects or from its upper words (or synonyms).</Paragraph>
    <Paragraph position="9"> (5) When the upper words (or synonyms) of a headword are looked up in turn, the chain of explanations may break off, skip levels, or fall into a cycle.</Paragraph>
    <Paragraph position="10"> 3. SEMANTIC GRAPH REPRESENTATION OF A WORD DEFINITION  The definition of a word should be given as accurately as possible in order to extract (meaning) information for the semantic dictionary from it. The semantic graph is therefore introduced to represent the word definition. The semantic graph, which is a kind of semantic network, consists of nodes (written &lt;...&gt;) and arcs (or links, drawn as arrows) connecting the nodes. The semantic graph with respect to a word is the conceptual dependency graph of the sentence defining the word (called the semantic graph of the sentence). Nodes express conceptual words (nouns, verbs, adverbs and adjectives), and arcs represent the relations between them. These relations are mainly given by the functions (called link information) expressed by relational words such as particles.</Paragraph>
    <Paragraph position="11"> Examples of relations or link information are: 'a kind of relation', 'case relation', 'cause-effect relation', 'synonym-antonym relation', 'mode', 'attribute' and 'state'. An example of the semantic graph, derived from the definition sentence for 'drawing paper' (&quot;white and slightly thick paper for drawing a picture&quot;), is shown in Fig.2. In Fig.2, the relational words that express the link information in the semantic graph are shown explicitly in Japanese. When a more detailed relation between words is needed, the relational information expressed by a relational word is transformed into a compound relation between words, as in Fig.4. In Fig.4, a relational word (prep. 'for': used for the purpose of) is transformed into a relation that contains an additional word ('use' in this case) and the relations 'object' and 'for the purpose of'. In the process of obtaining the semantic graph from a sentence, it frequently becomes necessary to supply additional information (expressed by additional words and relations) to the sentence.</Paragraph>
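    <Paragraph position="12"> The node-and-arc structure described above can be sketched as a small data structure. This is only an illustrative sketch, not the authors' implementation: the class name SemanticGraph, its methods, and the English arc labels are assumptions made here (the figures label arcs with Japanese relational words), and the arcs are a hand-built reading of the 'drawing paper' definition, including the compound relation through 'use'.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticGraph:
    """Nodes are conceptual words; arcs are labelled relations between them."""
    nodes: set = field(default_factory=set)
    arcs: list = field(default_factory=list)  # (source, relation, target)

    def add_arc(self, source, relation, target):
        """Record one labelled arc and register both endpoints as nodes."""
        self.nodes.update((source, target))
        self.arcs.append((source, relation, target))

    def relations_of(self, word):
        """Return (relation, target) pairs for the arcs leaving `word`."""
        return [(rel, tgt) for src, rel, tgt in self.arcs if src == word]


def drawing_paper_graph():
    """Hand-built graph for 'drawing paper' (labels are assumptions)."""
    g = SemanticGraph()
    g.add_arc("drawing paper", "a kind of", "paper")
    g.add_arc("paper", "attribute (color)", "white")
    g.add_arc("paper", "attribute (shape)", "slightly thick")
    # The relational word "for" expanded into a compound relation via "use".
    g.add_arc("use", "object", "paper")
    g.add_arc("use", "for the purpose of", "draw")
    g.add_arc("draw", "object", "picture")
    return g
```

    Calling drawing_paper_graph().relations_of("paper") lists the attribute arcs of the upper word, which is the information the later framing step would read off the graph.
    </Paragraph>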
  </Section>
  <Section position="4" start_page="419" end_page="419" type="metho">
    <SectionTitle>
4. THE FRAME REPRESENTATION OF A WORD DEFINITION
</SectionTitle>
    <Paragraph position="0"> What sorts of information about the meaning of a headword are extracted from the semantic graph derived from its definition sentence in the ordinary Japanese dictionary? In the case of the semantic graph of the word 'drawing paper' in Fig.3, the following information is extracted by looking up the upper word 'paper' of 'drawing paper' and the relations linking 'paper' to the other words: ① drawing paper is a kind of paper, ② slightly thick, ③ white, ④ used for drawing a picture. This semantic information can be extracted from the semantic graph by viewing the word 'paper' through the views 'a kind of', 'attribute (shape)', 'attribute (color)' and 'purpose of use'. Here, 'a kind of' is a relation between 'drawing paper' and 'paper'. Furthermore, from ④, the semantic information that 'drawing paper' is the 'object (place)' for 'drawing a picture' is obtained. This information denotes a case relationship between 'draw' and 'drawing paper'. From a different point of view, the above means that the meaning of the word 'drawing paper' is defined (described) from views such as 'a kind of' (relation), 'case' (relation), attribute (e.g. shape, color) and 'purpose of use' (basic viewpoint). We aim to construct the frame of a word definition (called the frame of a word or the frame of definition) by means of these views. What sorts of views should be prepared for the frame of a word? Concerning the frame, we have clarified the following: (i) Words (to be exact, concepts) are classified into four types, namely concrete object, event, attribute (state) and abstract thing, which roughly correspond to concrete noun, verb, adjective/adverb and abstract word. Each word has the frame corresponding to its type.</Paragraph>
    <Paragraph position="1"> (ii) The frame is made up of semantic relations (such as a-kind-of relationships, case relationships, cause-effect relationships and whole-part relationships) and basic views. As basic views for a 'product' thing (a kind of concrete object), we prepared the ones shown in Table 3.</Paragraph>
    <Paragraph position="2"> (iii) The frame for an 'event' word is mainly made up of case relationships, cause-effect relationships, a-kind-of relationships and views of 'mode' (e.g. tense, possibility, in progress). These relationships are shown in Tables 1 and 2.</Paragraph>
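    <Paragraph position="3"> As a rough illustration of embedding graph information into a frame, the sketch below treats a frame as a record with a concept type and a set of slots keyed by relation or view name. The dictionary layout, the function names make_frame, embed and drawing_paper_frame, and the rule that arcs leaving the upper word are copied into the frame of the headword are all assumptions made for this example; the paper does not specify the frame at this level of detail.

```python
# The four concept types named in item (i).
CONCEPT_TYPES = ("concrete object", "event", "attribute (state)", "abstract thing")


def make_frame(head_word, concept_type):
    """Create an empty frame for a headword of the given concept type."""
    if concept_type not in CONCEPT_TYPES:
        raise ValueError("unknown concept type: " + concept_type)
    return {"head_word": head_word, "type": concept_type, "slots": {}}


def embed(frame, arcs, upper_word):
    """Fill frame slots from semantic-graph arcs (source, relation, target).

    Assumption: arcs leaving the headword or its upper word in the
    definition sentence describe the headword, so their targets are
    stored under the relation (view) name.
    """
    for source, relation, target in arcs:
        if source in (frame["head_word"], upper_word):
            frame["slots"].setdefault(relation, []).append(target)
    return frame


def drawing_paper_frame():
    """Partial definition of 'drawing paper' from hand-written arcs."""
    arcs = [
        ("drawing paper", "a kind of", "paper"),
        ("paper", "attribute (shape)", "slightly thick"),
        ("paper", "attribute (color)", "white"),
        ("paper", "purpose of use", "drawing a picture"),
    ]
    frame = make_frame("drawing paper", "concrete object")
    return embed(frame, arcs, "paper")
```

    The result is a partial definition in the sense used above: it records only what this one definition sentence says, leaving inherited information to the unification step.
    </Paragraph>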
  </Section>
  <Section position="5" start_page="419" end_page="419" type="metho">
    <SectionTitle>
5. UNIFICATION PROCEDURE
</SectionTitle>
    <Paragraph position="0"> To give the full definition of a word, the partial definitions given in the form of frames are unified into a hierarchical structure by the unification procedure.</Paragraph>
    <Paragraph position="1"> The definitions of the comparatively upper words in the hierarchical classification of words should be given in fine detail, while lower words are given only their specific meanings, because a definition inherits from the upper words. An example of the definition of the word 'drawing paper' (and also 'paper') is shown in Figs. 5 and 6, and the definition of the word 'fire extinguishing' is shown in Fig.7. In Figs. 5, 6 and 7, the semantic information connected by dotted lines is inherited. In Fig.6, some words (e.g. 'stationery', 'instrument', 'product') are skipped between 'paper' and 'thing'. Therefore, if these words are looked up and related to 'paper', the semantic information extracted from them should be added to the definition. Some gaps cannot be filled by the unification procedure alone, so fine meanings should be given, with human assistance, to the comparatively upper words. For example, in Fig.7 the agent of 'remove' is 'person' and the result of 'remove' is 'vanish'; therefore the agent of 'extinguish', 'fire extinguishing' and so on can be given automatically once this information has been supplied in the fine definition.</Paragraph>
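    <Paragraph position="2"> The inheritance behind the unification procedure can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used by the authors: partial definitions are flattened to one value per view, the function name unify and both table names are hypothetical, a slot given on a lower word is assumed to override the inherited one, and the walk simply stops if the chain of upper words falls into a cycle.

```python
def unify(word, partial_defs, upper_of):
    """Merge a word's own slots with those inherited from its upper words.

    partial_defs maps a word to {view: value}; upper_of maps a word to its
    upper word. Slots given lower in the hierarchy take precedence.
    """
    chain, seen = [], set()
    while word is not None and word not in seen:  # stop on cycles
        seen.add(word)
        chain.append(word)
        word = upper_of.get(word)
    full = {}
    for w in reversed(chain):  # most general first, so specifics override
        full.update(partial_defs.get(w, {}))
    return full


# Hypothetical tables loosely following the 'fire extinguishing' example.
UPPER_OF = {"fire extinguishing": "extinguish", "extinguish": "remove"}
PARTIAL_DEFS = {
    "remove": {"agent": "person", "result": "vanish"},
    "extinguish": {"object": "useless thing such as fire, sound, poison"},
    "fire extinguishing": {"object": "burning fire"},
}
```

    With these tables, unify("fire extinguishing", PARTIAL_DEFS, UPPER_OF) picks up the agent and result from 'remove' while keeping the more specific object, which mirrors the point that fine definitions of upper words propagate to the words below them.
    </Paragraph>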
  </Section>
  <Section position="6" start_page="419" end_page="419" type="metho">
    <SectionTitle>
6. CONCLUDING REMARKS
</SectionTitle>
    <Paragraph position="0"> Although our preliminary investigation shows good prospects for using the ordinary dictionary in the construction of the semantic dictionary, many problems remain to be investigated, including the following.</Paragraph>
    <Paragraph position="1"> (1) Determination of the frames for abstract nouns and attributes (states). (2) Semantic level of the semantic graph. (3) Definitions of individual things and events.</Paragraph>
    <Paragraph position="2"> (4) Development of the programs and supporting programs.</Paragraph>
    <Paragraph position="3"> (5) Definition of the technical terms.</Paragraph>
    <Paragraph position="4"> [Figure residue, outline of the construction steps: definition of a headword (the sentence describing the meaning of a headword); structure and semantic analysis; semantic graph with respect to the sentence defining the headword; abstraction of the semantic information from the semantic graph and embedding of the meaning of the headword in the frame (partial definition of the headword). Case relation labels appearing alongside: agent, object, dative, source, destination, instrument.]</Paragraph>
    <Paragraph position="6"> [Figure residue, sentences defining the headwords in the ordinary dictionary: fire extinguish: (to extinguish a burning fire); extinguish: (to remove a useless thing such as fire, sound, poison).]</Paragraph>
  </Section>
</Paper>