<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-4202">
  <Title>HIERARCHICAL LEXICAL STRUCTURE AND INTERPRETIVE MAPPING IN MACHINE TRANSLATION</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
HIERARCHICAL LEXICAL STRUCTURE AND INTERPRETIVE
MAPPING IN MACHINE TRANSLATION
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Large-scale knowledge-based machine translation requires significant amounts of lexical knowledge in order to map syntactic structures to conceptual structures. This paper presents a framework in which lexical knowledge is separated into different levels of representation, which are arranged in a hierarchical model based on principles of knowledge representation and lexical semantics. The proposed methodology is language-independent, and has been used to organize lexical knowledge for both English and Japanese.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The basic premise of knowledge-based machine translation is that accurate, high-quality translation requires a complete semantic interpretation of the input text (Carbonell and Tomita, 1987). Therefore, the analysis and generation components of a knowledge-based MT system must have at least the following functional parts: a grammar for the language, a lexicon for the language, a shared set of domain concepts, and rules that map syntactic structures onto semantic structures (or vice-versa for generation).</Paragraph>
    <Paragraph position="1"> The goal of our work has been to develop a methodology for the hierarchical organization of lexical knowledge (lexical entries and mapping rules) for knowledge-based MT (Goodman and Nirenburg, 1991; Mitamura, 1989). Interpretive Mapping refers to the relationship between predicate conceptual structures and syntactic structures, and involves two kinds of processes: one is a mapping between grammatical functions (e.g., subject, object) and semantic roles (e.g., agent, theme); the other is a mapping between words (e.g., naguru 'hit') and domain concepts (e.g., *HIT).</Paragraph>
    <Paragraph position="2"> We have developed a shared hierarchical structure for lexical knowledge which can capture significant linguistic generalizations, eliminate redundancy, and facilitate both knowledge acquisition and efficient processing. We have implemented our hierarchy using FrameKit, an AI knowledge representation language that supports frames and multiple inheritance (Nyberg, 1988).</Paragraph>
    <Paragraph position="3"> Our system demonstrates the integration of a linguistic formalism with a frame-based knowledge representation system.</Paragraph>
    <Paragraph position="4"> We have analyzed a large corpus of Japanese verbs and created a set of lexical frames, mapping rules, and an inheritance hierarchy for use in a working translation system.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Linguistic Motivation
</SectionTitle>
    <Paragraph position="0"> Our methodology is based in part on recent work in lexical semantics (Jackendoff 1983, 1987; Levin, B. 1985, 1987, 1989; Hale and Keyser 1986; Fukui, Miyagawa, and Tenny 1985; Rappaport and Levin, B. 1986). The field of lexical semantics is concerned with the representation of syntactically relevant aspects of word meaning, especially the properties of argument-taking words like verbs.</Paragraph>
    <Paragraph position="1"> Many researchers have noticed that semantically similar predicates tend to be syntactically similar, too. B. Levin (1987, 1989) examines many systematic semantic-syntactic correspondences, including linking regularities and transitivity alternations. Linking refers to associations between semantic arguments and grammatical relations. Common correspondences between semantic arguments and grammatical relations are called linking regularities.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Linking and Alternation
</SectionTitle>
      <Paragraph position="0"> For example, in the causative use of break (e.g., John broke the vase), the subject John is linked to the agent semantic role, and the object vase maps to the theme semantic role. Break can be classified as a change-of-state verb, and the same pattern is observed in the causative use of other change-of-state verbs, such as crack and melt. Moreover, it is important to note that this pattern also holds for other classes of verbs (e.g., change-of-possession verbs like give).</Paragraph>
      <Paragraph position="1"> It is also the case that the same verb can have more than one way of linking syntactic functions with semantic roles.</Paragraph>
      <Paragraph position="2"> These different linkings are called valency alternations, which include both transitivity alternations and alternate linkings of semantic arguments with syntax.</Paragraph>
      <Paragraph position="3"> For example, break can also appear in sentences like The vase broke, where the verb assigns the theme semantic role to the syntactic subject. This is in contrast to the causative use of break, described above, where the verb assigns the agent semantic role to the syntactic subject and a theme semantic role to the syntactic object. This alternation, known as the Causative/Inchoative alternation, is also associated with change-of-position verbs like drop (John dropped the ball vs.</Paragraph> <!-- Page footer: Actes de COLING-92, Nantes, 23-28 août 1992 / 1254 / Proc. of COLING-92, Nantes, Aug. 23-28, 1992 -->
      <Paragraph position="4"> The ball dropped), change-of-psychological-state verbs like worry (John worried vs. Bill worried John), etc. (B. Levin, 1989).</Paragraph>
      <Paragraph position="5"> With some verbs, the mapping of one syntactic function may remain constant while others alternate. For example, in the sentence John cut the meat, the patient semantic role is assigned to the syntactic object; in John cut at the meat, the goal semantic role is assigned to the prepositional object.</Paragraph>
      <Paragraph position="6"> In both sentences, the agent semantic role is assigned to the syntactic subject.</Paragraph>
      <Paragraph position="7"> Classes of verbs which undergo the same alternation tend to be semantically similar. Verbs like hack and slash, which belong to the same verb class as cut, undergo the same alternation mentioned above. However, semantically different verbs like break do not exhibit the same alternation: (1) a. He broke the cup.</Paragraph>
      <Paragraph position="8"> b. *He broke at the cup.</Paragraph>
      <Paragraph position="9"> Linking regularities and transitivity alternations are used to identify the semantic roles of arguments and the semantic classes of verbs. That is, an argument which displays the same linking regularities as another argument might be assigned the same thematic role, and verbs which have the same transitivity alternations can be placed in the same class.</Paragraph>
      <Paragraph position="10"> Transitivity alternations in English are marked in various ways. Many of them involve the alternation of an argument between object and prepositional phrase. In Japanese, however, valency alternations (including transitivity alternations) are usually indicated by different case markers borne by the arguments of the verb. Every noun phrase in Japanese is marked postpositionally by a particle, such as ga, o, ni, and de. These markers indicate the case or other grammatical function of the nominals they are associated with.</Paragraph>
      <Paragraph position="11"> For example, the o/de alternation appears with verbs like oyogu (swim), sanposuru (take a walk), and hashiru (run) 1: (2) a. Taro ga kawa o oyoida 'Taro swam down the river' b. Taro ga kawa de oyoida 'Taro swam in the river'</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Lexical Mapping
</SectionTitle>
      <Paragraph position="0"> Another part of building a lexical semantic representation is to formulate links from lexical items to conceptual meanings; these links are called lexical mappings. Since the semantic properties of relations and objects (which are crucial in stating subcategorization restrictions) reside most naturally in a semantic domain model, it is necessary for a system to integrate the lexical level and the domain model so that semantic restrictions can be satisfied during parsing and generation.</Paragraph>
      <Paragraph position="1"> In some cases, a lexical item may be linked to more than just a semantic head. For example, in the sentence The pencil rolled off the table, the meaning of roll must be represented by both a semantic head (e.g., *MOVE) and a semantic modifier indicating the manner of motion (e.g., (manner *ROTATION)) 2. As a result, lexical mapping may also require semantic feature assignment.</Paragraph>
      <Paragraph position="2"> 1 For further detail and examples, see (Mitamura, 1989).</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 Summary
</SectionTitle>
      <Paragraph position="0"> The motivation for our work has been the following set of observations, drawn from the linguistic phenomena mentioned in this section. An appropriate lexical representation must be able to represent the following:
* The linking of a particular syntactic function with a particular semantic role;
* A set of linking rules that indicate a particular alternation;
* A group of alternations that capture the general behavior of a class of verbs;
* An explicit representation of verb classes, to which particular lexical items may be linked;
* A set of lexical items, which contain both links to verb classes and links to semantic concepts in the domain conceptual hierarchy.</Paragraph>
      <Paragraph position="1"> 3 The Lexical Hierarchy
Our lexical hierarchy has five levels of representation, each corresponding to a linguistically meaningful unit of structure:
(1) Mapping Rule Frames, which capture a particular correspondence between a syntactic function and a semantic role;
(2) Mapping Pattern Frames, which capture a particular set of mapping rules, which correspond to one way of linking the arguments of a particular verb;
(3) Mapping Type Frames, which capture the set of alternations (mapping patterns) allowed by a particular class of verbs;
(4) Verb Class Frames, in which the generalization in verb linking behavior is captured;
(5) Lexical Frames, in which particular lexical items (verbs) are represented as frames which are linked both to appropriate verb class frames and to conceptual frames in the domain concept hierarchy.</Paragraph>
      <Paragraph position="2"> Figure 1 illustrates the inheritance relations between mapping rules, mapping patterns, mapping types, verb classes, and lexical frames in English.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Mapping Rule Frames
</SectionTitle>
      <Paragraph position="0"> The mapping rule frames each map one grammatical function, such as subject or object, onto a semantic role, such as agent or theme. Each mapping rule is specified in a separate frame, as in the following:</Paragraph>
      <Paragraph position="2"/>
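The frame listing that belongs here did not survive extraction. As a rough illustration only, a mapping rule frame can be sketched in Python as a record that pairs one grammatical function with one semantic role. The class `MappingRule`, the helper `apply_rule`, and the frame names `*subj-agent`, `*obj-theme`, and `*subj-theme` are hypothetical; they are not FrameKit syntax and not the frames used in the system described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRule:
    """One mapping rule frame: a single grammatical function paired with a single semantic role."""
    name: str
    function: str  # grammatical function, e.g. "subject"
    role: str      # semantic role, e.g. "agent"


# Hypothetical frame names; the asterisk prefix follows the paper's naming convention.
SUBJ_AGENT = MappingRule("*subj-agent", "subject", "agent")
OBJ_THEME = MappingRule("*obj-theme", "object", "theme")
SUBJ_THEME = MappingRule("*subj-theme", "subject", "theme")


def apply_rule(rule, f_structure):
    """Return {role: filler} if the f-structure supplies the rule's function, else {}."""
    if rule.function in f_structure:
        return {rule.role: f_structure[rule.function]}
    return {}


# "John broke the vase", reduced to a flat dictionary standing in for an f-structure.
print(apply_rule(SUBJ_AGENT, {"subject": "John", "object": "vase"}))
```

Because each rule is defined once as its own object, any number of higher-level frames can refer to it without restating the function and role pair.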
    </Section>
    <Section position="5" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Mapping Pattern Frames
</SectionTitle>
      <Paragraph position="0"> The mapping pattern frames represent particular bundles of mapping rules. For example, a mapping pattern frame which contains the agent-subject mapping and the theme-object mapping represents one mapping pattern, whereas a frame which contains just the theme-subject mapping represents another mapping pattern 3 (cf. Figure 1).</Paragraph>
      <Paragraph position="1"> Syntactic constraint rules can be written in a mapping pattern frame to indicate that the associated mapping rules can apply only when these constraints are satisfied. Some examples of mapping pattern frames are shown below:</Paragraph>
      <Paragraph position="3"> The frame *mapping-pattern1 captures one way of mapping the syntactic arguments of a verb. The subject is mapped to the semantic agent and the object is mapped to the semantic theme. The *mapping-pattern2 frame indicates a mapping where the verb has one argument, the subject, and maps the subject to the semantic agent.</Paragraph>
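To make the bundling idea concrete, the following Python sketch treats a mapping pattern as a set of function-to-role rules guarded by a simple syntactic constraint. It is an assumption-laden illustration, not the FrameKit frames of the system: the helpers `make_pattern` and `apply_pattern`, the pattern names, and the encoding of constraints as required and forbidden grammatical functions are all hypothetical.

```python
def make_pattern(name, rules, required, forbidden=()):
    """Bundle (function, role) rules with a simple syntactic constraint."""
    return {
        "name": name,
        "rules": dict(rules),
        "required": set(required),    # functions that must be present
        "forbidden": set(forbidden),  # functions that must be absent
    }


def apply_pattern(pattern, f_structure):
    """Apply every rule in the bundle, or return None if the constraint fails."""
    present = set(f_structure)
    if not pattern["required"].issubset(present):
        return None
    if pattern["forbidden"].intersection(present):
        return None
    return {role: f_structure[fn] for fn, role in pattern["rules"].items()}


# Hypothetical pattern names for the two bundles described in the text.
AGENT_SUBJ_THEME_OBJ = make_pattern(
    "*agent-subj-theme-obj",
    [("subject", "agent"), ("object", "theme")],
    required=["subject", "object"],
)
THEME_SUBJ = make_pattern(
    "*theme-subj",
    [("subject", "theme")],
    required=["subject"],
    forbidden=["object"],
)

print(apply_pattern(AGENT_SUBJ_THEME_OBJ, {"subject": "John", "object": "vase"}))
print(apply_pattern(THEME_SUBJ, {"subject": "vase"}))
```

All rules in a pattern apply together or not at all, which is the conjunctive behaviour the text attributes to the contain link between patterns and rules.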
    </Section>
    <Section position="6" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Mapping Type Frames
</SectionTitle>
      <Paragraph position="0"> Mapping type frames contain sets of mapping rule patterns, and have the ability to capture both transitivity alternations in English and case alternations in Japanese (Mitamura, 1989). The two mapping patterns we mentioned earlier, 1) the agent-subject and the theme-object mapping, and 2) the theme-subject mapping, can be generalized as the causative-inchoative verb mapping type. In Figure 1, the causative-inchoative alternation is represented by *causative-inchoative. In Japanese, the alternation between an oblique argument with particle o and an oblique argument with particle de is captured by *obl-o/obl-de.</Paragraph>
      <Paragraph position="1"> An example of a mapping type frame is shown below:</Paragraph>
      <Paragraph position="3"> The *causative-inchoative frame contains two mapping pattern frames, indicated by a contain link that includes *mapping-pattern1 and *mapping-pattern2.</Paragraph>
      <Paragraph position="4"> 3 This is similar to the notion of lexical forms in lexical mapping theory (Bresnan and Kanerva, 1989), but the difference is that we incorporate case assignment rules into argument mapping rules to make the mapping a one-step operation for use in generating or parsing sentences. In LFG, cases are assigned in each lexical entry through grammatical encoding theory, which identifies and assigns an appropriate case for a grammatical function in each lexical entry.</Paragraph>
    </Section>
    <Section position="7" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.4 Verb Class Frames
</SectionTitle>
      <Paragraph position="0"> Verb class frames generalize over verbs with a common core sense and common syntactic behavior. Some example verb class frames (*verbs-of-breaking, *motion-path-verbs) are illustrated in Figure 1. The *verbs-of-breaking frame has an is-a link to the *causative-inchoative mapping type, indicating that verbs in the *verbs-of-breaking class can undergo the causative-inchoative alternation.</Paragraph>
    </Section>
    <Section position="8" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.5 Lexical Frames
</SectionTitle>
      <Paragraph position="0"> Lexical frames represent the language-dependent lexicon, and include pointers to corresponding conceptual frames. These frames also have is-a relations which link them to verb class frames, which are organized hierarchically according to the particular language. The SEMANTICS slot in the lexical frame contains references to the conceptual frames associated with the lexical item. Particular restrictions on the meaning of the lexical item are captured by semantic role or feature assignment rules that may appear along with each SEMANTICS pointer.</Paragraph>
      <Paragraph position="1"> For example, the SEMANTICS slot shown below for the verb roll points to the conceptual frame *MOVE. Included with the pointer to *MOVE is an assignment rule which indicates that the manner of *MOVE must have the meaning indicated by the conceptual frame *ROTATION. The *roll-1 frame has an is-a relation to the verb class frame, *motion-verbs.</Paragraph>
      <Paragraph position="3"> More examples of lexical frames are shown in Figure 1.</Paragraph>
      <Paragraph position="4"> In Figure 1, *break-1 is a lexical frame, corresponding to the semantic notion *BREAK, which is a member of the *verbs-of-breaking verb class.</Paragraph>
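The frame listing for roll is missing from this copy of the text. A minimal Python sketch of what the description implies is given below, assuming a dictionary encoding in which each SEMANTICS pointer names a conceptual head and may carry feature assignments. The slot keys (`is-a`, `semantics`, `head`, `assign`) and the function `interpret_head` are hypothetical and do not reproduce FrameKit notation.

```python
# Hypothetical encoding of the *roll-1 lexical frame described in the text.
ROLL_1 = {
    "name": "*roll-1",
    "is-a": ["*motion-verbs"],
    "semantics": [
        # Pointer to the conceptual frame, plus a feature assignment rule.
        {"head": "*MOVE", "assign": {"manner": "*ROTATION"}},
    ],
}

# Hypothetical encoding of *break-1, which needs no extra feature assignment.
BREAK_1 = {
    "name": "*break-1",
    "is-a": ["*verbs-of-breaking"],
    "semantics": [{"head": "*BREAK"}],
}


def interpret_head(lexical_frame):
    """Build one partial conceptual structure per SEMANTICS pointer."""
    readings = []
    for pointer in lexical_frame["semantics"]:
        concept = {"concept": pointer["head"]}
        concept.update(pointer.get("assign", {}))
        readings.append(concept)
    return readings


print(interpret_head(ROLL_1))
print(interpret_head(BREAK_1))
```

Keeping the feature assignment next to the pointer lets several verbs share one conceptual head such as *MOVE while differing only in the features they add.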
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 The Domain Conceptual Hierarchy
</SectionTitle>
    <Paragraph position="0"> Conceptual frames represent knowledge of the world that is language-independent: for example, general concepts such as *EVENT and *PHYSICAL-OBJECT, as well as more specific concepts, like *BREAK and *SWIM 4. Conceptual frames are organized hierarchically using inheritance relations. Selectional restrictions can be specified in conceptual frames, and appear as the fillers of semantic role slots.</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Multiple Inheritance and Interpretive Mapping in Machine Translation
</SectionTitle>
    <Paragraph position="0"> Our operational goals in constructing this hierarchy and its inheritance relations include the following:
* Support of rapid, straightforward acquisition of large amounts of lexical knowledge in an interactive environment;
* Elimination of unnecessary (and costly) redundancy in the representation of lexical knowledge.
4 An asterisk prefix is used to indicate frame names. Upper case frame names (e.g., *BREAK) indicate conceptual frames. Lower case is used for [...]</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 Efficient Knowledge Acquisition
</SectionTitle>
      <Paragraph position="0"> Productivity in the knowledge acquisition task is greatly enhanced by this hierarchical methodology. Rather than editing an ASCII file containing redundant mapping rule definitions for each lexical entry, the person entering new lexical concepts utilizes a 2-dimensional browsing and editing tool to add new knowledge to the system (Kaufmann, 1991).</Paragraph>
      <Paragraph position="1"> Once the initial mapping rules, alternations and verb classes are specified, the user can easily link new lexical frames to existing verb classes, perhaps refining some of the knowledge in the upper portions of the hierarchy, but in general taking advantage of the compact nature of the hierarchy to avoid redundant data entry.</Paragraph>
      <Paragraph position="2"> The frame representation presented here has a great advantage for the development of large-scale NLP systems, namely, that each mapping rule need only be defined once, and is thereafter inherited by all the lexical frames that require it. By positing intermediate levels of structure (mapping types and mapping patterns), significant generalizations can be captured which further enhance the compactness of the representation and the ease of knowledge acquisition.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 Multiple Inheritance
</SectionTitle>
      <Paragraph position="0"> The definition of containment, however, is not as straightforward as a simple is-a relation in traditional frame-based knowledge representation. The containment relation that obtains between mapping patterns and mapping rules is the usual conjunctive (multiple) type of inheritance, since a mapping pattern contains each and every mapping rule that it is linked to via a contain link. On the other hand, the containment relation that holds between mapping types and mapping patterns is disjunctive, since a mapping type contains different types of alternations, only one of which can be active at a given time for a particular verb. As a result, inheritance is performed in a different manner at these two levels in the hierarchy.</Paragraph>
      <Paragraph position="1"> By default, FrameKit supports only conjunctive inheritance, which is most common in systems where inheritance hierarchies are built using simple is-a links. We have developed user-defined inheritance methods for FrameKit that perform the appropriate inheritance operations at each level in the mapping hierarchy. When all of the possible subcategorization/mapping pairs must be retrieved for a given lexical frame, these inheritance methods perform the appropriate conjunctive inheritance, bundling the mapping rules together into mapping patterns, followed by disjunctive inheritance of mapping types to create any alternative readings of the lexical item. Simply speaking, the inheritance methods must re-create the explicit structure that is implicit in the inheritance hierarchy when it is necessary to represent distinct mappings for verbs at system run-time.</Paragraph>
      <Paragraph position="2"> An example of how inheritance works at run time is illustrated in Figure 2. The two frames shown in the figure are instantiated by the inheritance methods from the lexical frame *break-1, and represent the two possible alternations of break (the causative reading and the inchoative reading).</Paragraph>
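The two inheritance modes can be sketched in Python as follows. This is only an approximation of the behaviour described above, not the user-defined FrameKit methods themselves: the `FRAMES` table, its slot keys, the rule and pattern names (`*subj-agent`, `*obj-theme`, `*subj-theme`, `*causative`, `*inchoative`), and the functions `mapping_types` and `readings` are assumptions made for illustration. Only *break-1, *verbs-of-breaking, *causative-inchoative, and *BREAK are names taken from the text.

```python
# A toy frame store; every slot layout here is an assumption for illustration.
FRAMES = {
    "*subj-agent": {"kind": "rule", "map": ("subject", "agent")},
    "*obj-theme": {"kind": "rule", "map": ("object", "theme")},
    "*subj-theme": {"kind": "rule", "map": ("subject", "theme")},
    "*causative": {"kind": "pattern", "contain": ["*subj-agent", "*obj-theme"]},
    "*inchoative": {"kind": "pattern", "contain": ["*subj-theme"]},
    "*causative-inchoative": {"kind": "type", "contain": ["*causative", "*inchoative"]},
    "*verbs-of-breaking": {"kind": "class", "is-a": ["*causative-inchoative"]},
    "*break-1": {"kind": "lexical", "is-a": ["*verbs-of-breaking"], "semantics": "*BREAK"},
}


def mapping_types(name):
    """Follow is-a links upward until mapping type frames are reached."""
    frame = FRAMES[name]
    if frame["kind"] == "type":
        return [name]
    found = []
    for parent in frame.get("is-a", []):
        found.extend(mapping_types(parent))
    return found


def readings(lexical_name):
    """Instantiate one explicit mapping structure per alternative reading."""
    head = FRAMES[lexical_name]["semantics"]
    result = []
    for type_name in mapping_types(lexical_name):
        # Disjunctive step: each contained pattern yields a separate reading.
        for pattern_name in FRAMES[type_name]["contain"]:
            # Conjunctive step: all contained rules are bundled into one mapping.
            rules = [FRAMES[rule]["map"] for rule in FRAMES[pattern_name]["contain"]]
            result.append({"head": head, "pattern": pattern_name, "mapping": dict(rules)})
    return result


for reading in readings("*break-1"):
    print(reading)
```

Under these assumptions the call for `*break-1` yields two structures, one per alternation, which parallels the pair of instantiated frames that Figure 2 is said to show.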
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.3 Interpretive Mapping
</SectionTitle>
      <Paragraph position="0"> The architecture in Figure 3 illustrates how our lexical hierarchy fits into the overall machine translation system. During parsing, the lexical entries stored in the source lexical hierarchy are accessed by the LFG parser; during the mapping of source f-structures to interlingua representations, the mapping rules in the lexical hierarchy are accessed by the mapper via instantiated mapping structures like those shown in Figure 2. During generation, the target language lexical hierarchy is utilized in a similar fashion. First, instantiated mapping structures are used to create target f-structures, and then target lexical entries are utilized by the LFG generator to produce target language strings.</Paragraph>
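The point that one instantiated mapping structure can serve both directions can be sketched in Python under strong simplifications. Here f-structures and interlingua frames are flat dictionaries, role fillers are plain strings rather than concepts, and the names `CAUSATIVE`, `analyze`, and `generate` are hypothetical; none of this reflects the actual mapper or the LFG components.

```python
# One instantiated mapping structure: grammatical function -> semantic role.
CAUSATIVE = {"subject": "agent", "object": "theme"}


def analyze(f_structure, mapping, head):
    """Analysis direction: source f-structure to an interlingua frame."""
    frame = {"concept": head}
    for function, role in mapping.items():
        frame[role] = f_structure[function]
    return frame


def generate(frame, mapping, verb):
    """Generation direction: interlingua frame to a target f-structure."""
    f_structure = {"pred": verb}
    for function, role in mapping.items():
        f_structure[function] = frame[role]
    return f_structure


source = {"pred": "break", "subject": "John", "object": "vase"}
interlingua = analyze(source, CAUSATIVE, "*BREAK")
print(interlingua)
# Generating with the same mapping and verb recovers the source f-structure.
print(generate(interlingua, CAUSATIVE, "break"))
```

Because the mapping is a plain table rather than a procedure, reading it left to right gives analysis and reading it right to left gives generation; in a real system the generation step would draw on the target language hierarchy instead of reusing the source mapping.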
    </Section>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
6 Status
</SectionTitle>
    <Paragraph position="0"> We have developed an extensive interpretive mapping hierarchy for Japanese, which includes 36 mapping rule frames, 45 mapping pattern frames, 37 mapping type frames, 54 verb class frames, and 100 lexical frames. Hundreds of additional lexical frames could be added to the hierarchy without modification of the existing hierarchical structure. We believe that our mapping frame hierarchy accounts for the syntactic behavior of a significant number of Japanese verb classes. The hierarchy is based on data for about 1000 verbs, taken from (Ishiwata and Ogino, 1983) and the IPAL report on basic Japanese verbs (IPAL, 1987). We have also developed an initial mapping hierarchy for English verbs. The English and Japanese lexical hierarchies were utilized in the KBMT-89 system for the interpretation of Japanese sentences (Mitamura, et al., 1991). We are currently integrating our hierarchical structure into a large-scale system for translation of service manuals from English to Japanese. Since the argument mapping knowledge represented in our hierarchy is declarative rather than procedural, it can be used either in analysis or generation (cf. Figure 3).</Paragraph>
  </Section>
</Paper>