<?xml version="1.0" standalone="yes"?>
<Paper uid="C86-1155">
  <Title>Future Directions of Machine Translation</Title>
  <Section position="3" start_page="656" end_page="658" type="metho">
    <SectionTitle>
3 Basic Approaches
</SectionTitle>
    <Paragraph position="0"> One of the recurring controversies among MT researchers has been between the adoption of the transfer approach and the adoption of the interlingual approach, and this seems extremely relevant to various issues of the possible relation-ships between 'understanding' and 'translation' in future MT systems. The transfer approach, originally proposed by GETA \[Vauquois1979\] and adopted by many research and development groups including the MU project, EUROTRA \[King 1981\] \[Johnson 1985\], TAUM \[Kittredge 1976\] \[Isabelle 1985\], METAL \[S\]ocum 1982\] \[Bennet 1985\] , PAHO-ENGSPA \[Vasconcellos 1985\] , ASCOF\[Biewer 1985\] etc., is an approach in which translation is carried out essentially in three phases: analysis, transfer and generation. The second phase, transfer, is a contrastive phase where lexical items, stereotyped expressions, and the syntactic and semantic structures of two languages are compared so that both lexical items and certain levels of the linguistic structures of the source languages may be transferred to their 'equivalents' in the target languages.</Paragraph>
    <Paragraph position="1"> The interlingual or pivot approach, which has been repeatedly advocated by researchers originally interested in natural language understanding (NLU) who take machine translation as one possible application \[Muraki 1982, 1986\] \[Lytinen 1982\], instead performs translation through two phases, understanding and paraphrasing. The results of the first phase in this approach are supposed to be represented in the form of expressions of interlingua, from which the second phase may generate the target sentences. The expressions of interlingua are language universal in the sense that the second phase can generate target sentences from them without considering what the source language is. It is claimed that this approach is superior to the transfer approach because of the follow ing advantages.</Paragraph>
    <Paragraph position="2"> l.Multi-Lingual Translation: Because this approach does not have any phases dependent on language pairs, only two kinds of modules for transforming sentences of individual languages to expressions of interlingua and vice versa are necessary for multi-lingual translations.</Paragraph>
    <Paragraph position="3"> 2.High quality Translation: Because this approach first understands source sentences and then paraphrases the 'understanding' in the target languages, the translation results are natural and easy to understand.</Paragraph>
    <Paragraph position="4"> Fig.\]. is a schematic figure often used for explaining the relationship between the transfer approach and the interlingual approach \[Vauquois 1979\] \[Tucker \]985\]. This figure shows that there is an abstraction hierarchy of descriptions such as surface word sequences, surface syntactic structures, deep syntactic structures, semantic structures, conceptual structures, etc. where, at the deeper levels, the descriptions of sentences of different individual \]anguages become closer and finally, at the deepest level (the level of understanding), converge.</Paragraph>
    <Paragraph position="5">  Which is often used but quite misleading This figure, however, is often misleading in that it suggests an interpretation where each level of the hierarchy may replace the shallower levels of description. This is to interpret the figure as showing that each \]eve\] Jn the hierarchy can express in its own descriptive framework all aspects of the information conveyed by source sentences: once a description at the deeper \].evel is achieved, it can replace the sha\] lower, more surface-oriented levels of description. This imp\] ies that the transfer approach is more a tentative approach only adopted until we develop technologies for 'understanding' texts and the frameworks for expressing the result of understanding, that is, interlingua.</Paragraph>
    <Paragraph position="6"> The early experiences of CETA, however, show that this naive view does not work well. The surface syntactic structures of sentences, fo~ example, cannot be replaced fully by their deep case structures, because surface strnctures convey extra-information concerned with, for example, the focus of the discourse, the distinction of old/new information in the context, emphasized e\].ements or phrases, etc., and such extra-information is also relevant to the determination of the target sentence structures. Generally speaking, for translation, we have to extract from texts, not only what is described (the extra-linguistic aspects of texts) but also how Jt is described and how the texts are organized(the linguistic aspects of texts) .</Paragraph>
    <Paragraph position="7"> The early, naive interlingual approach tended to put emphasis on just what is described. The same tendency may also be observed in some parts of linguistics and recent knowledge-based approaches to MT. Fillmore's initial notion of cases \[Fillmore 1968\], for example, was proposed for retaining identities of events in the real world which are expressed differently in surface sentences, so that the sentences John opened the door with the key.</Paragraph>
    <Paragraph position="8"> The key opened the door.</Paragraph>
    <Paragraph position="9"> The door opened.</Paragraph>
    <Paragraph position="10"> are all reduced to the same case structures. However, even if they describe the same real world events, they describe those events from different view points. At least, the sentences may play different roles in discourse, and so, when they are put in a certain context, some of them may violate discourse coherency and be less natural than others. One could claim, as researchers of know ledge based approaches often do, that, because discourse roles of sentences should be determined during the generation (paraphrasing) phase by 'inte\].ligent' text generators, the analysis (understanding) phase need not extract factors re\].evant to discourse from source texts. It is probably true that some dJ scourse factors and so some parts of surface linguistic structure should be determined during the generation phase of target texts. EIowever, because the same sequences of events in the real world can usually be described by a number of different texts, each having its own coherent discourse structure, MT systems should be able to select one of them dynamJ eally based on the text structures of the source languages. Certain factors concerned with the text organization of the source languages should be extracted during the analysis phase to facilitate such selection. Otherwise, however intellJ gent the text generalors might be, they may always generate the same texts as translations of different\] y organized source texts whenever 'essentially' the same sequences of events are described, albeit from different view points and attitudes.</Paragraph>
    <Paragraph position="11"> Although there are certain types of texts, such as 'factual' newsreporti.ng articles of newspapers on terrorism \[Ishizaki \[986\] \[Lytinen 11982\] in which only what events occured in the real wor\]d and in what order are J mportaut, there are, of course, far more varied types of texts to be translated. \[Tucker \].984\] also notes this point as follows.</Paragraph>
    <Paragraph position="12"> 'In spite of its initial appeal, the knowldge based approach -- raises some weighty questions, for example, .... To what degree are the scripts of know\]edge based machine translation well suited to 'non-story' texts such as conference proceedings, scientific artic\]es, and budget documents ?' There is, however, another possible interpretation for Fig.l. Here the hierarchy is taken as a hierarchy of the depth of processing during the analysis phase, according to what kinds of information are being explicitly extracted from source sentences at each level \[Boitet 1984\]. In this view, an analysis program which performs processing to a certain level gives as its output certain structural descriptions (or sets of structural descriptions) which contain explicit representations of information up to that level. An analysis program which processes sentences to the level of deep case structures, for example, outputs certain descriptions from which the other program, the transfer program, can retrieve information of, not only deep case relationships, but also surface syntactic structures and surface ordering of the words of input sentences, without any further linguistic processing. The current transfer-based MT systems usually stand on this view, where, based on the deep case structures and surface syntactic structures of source sentences revealed during the analysis phase, the transfer programs compute the most appropriate corresponding descriptions of target sentences. In the cuurent transfer-based systems, however, discourse factors are not usually expressed explicitly in the descriptions but are implicitly preserved in the surface syntactic structures which preserve the surface orderings of phrases. The surface syntactic structures are then preserved during the transfer phase as much as possibJ.e so that discourse ro\].es of elements in the sentences are presumably transferred to the target descriptions. This principle of 'using source sentences as mou\]ds of target sentences' works rather well in translation among languages with many similarities because the syntactic notions of one language such as syntactic subject often play almost the same discourse roles in the other languages.</Paragraph>
    <Paragraph position="13">  However, though the same principle works to a certain extent in the translation between Japanese and Indo-European languages, it does not work so well. In the translation of such a language pair, because surface syntactic structures of source sentences often have to be drastically changed in order to realize the deep case relations in the target, the principle itself becomes hard to follow. In addition, though the principle is based on the assumption that syntactic notions such as syntactic subject etc. play the same roles in the two languages, the assumption is not valid. The principle, therefore, tends to produce either understandable but unnatural translations, or to make the transfer component ad-hoc, complex, and difficult to maintain when we attempt to get natural translations. Furthermore, as can easily be seen, the principle is not even satisfactory for the translation of similar languages when we want to get high quality translations. It is obvious that we have to extract explicitly more kinds of information from source texts than deep case structures and utilize these to compute descriptions of the target sentences. null Note that 'to extract more kinds of information explicitly during the analysis' does not, in fact, necessarily mean 'to express such kinds of information in a language independent framework' nor does it imply that such extracted information can fully replace the shallower levels of description. Indeed, because the linguistic aspects concerned with 'hew things are described', 'how texts are organized' etc.</Paragraph>
    <Paragraph position="14"> are more language-internal aspects than those of 'what are described', it is likely that they are more difficult to express in a language universal framework. null Our tentative view of future MT systems, which is based on the transfer approach and will be zevised in a later section, is shown in Fig.2. In this framework, the analysis phase is expected to extract explicitly many more different kinds of information other than deep case relationships. They are the  factors which collectively determine the surface syntactic structures of source sentences. We neither expect, as described above, that such extracted information should be represented in a language universal manner, nor expect that they uniquely determine surface syntactic structures of source sentences. In this sense, they need not be a complete set of factors determining surface structures of source sentences and so the surface structures cannot be replaced by the set of these factors. They merely give us a framework which facilitates the systematic comparison of the two languages. Based on the set of these factors, the transfer phase computes corresponding factors of target sentences including discourse factors, semantic structures, syntactic structures, etc. from which the generation phase will generate target surface syntactic structures. As the extracted factors give the transfer component a constraint set which is to be satisfied if possible, the factors computed in the transfer give a similar set of conditions to be satisfied in the generation phase.</Paragraph>
    <Paragraph position="15"> Though our current view of future MT systems is based on the transfer approach, our objective in this section is not to claim that this approach is superior to the interlingual approach, but only to claim that the word 'understanding texts' in the context of MT is quite vague and, therefore, that we have to examine and define what is really meant by the mythical word 'understanding' before discussing the advantages and disadvantages of the two approaches. In fact, while several large and practical MT systems, including some commercially available, have been developed in Japan based on different approaches such as the 'Pivot approach' \[Muraki 1985\], 'Conceptual Transfer Approach' \[Uchida 1980, 1985\] \[Amano 1985\], 'Integrated Approach' \[Tanaka 1983\] , each of which puts emphasis on different aspects of translation processes, especially on aspects of 'understanding', when one closely examines the internal translation processes and what kinds of information are utilized in these systems, one in fact finds many similarities and fewer differences than one might have expected.</Paragraph>
    <Paragraph position="16"> Before ending this section, we would like to add some comments: First of all, we neither deny the existence of certain levels of understanding which are language universal nor their importance and relevance to translation. On the contrary, we are willing to accept such claims. Our objective is only to claim that such levels of 'understanding results' should be integrated with other aspects of information conveyed by input texts. Second, though it is implicitly assumed by the researchers of the inter-lingual approaches that the transfer approach is incompatible with 'understanding texts', that assUmption, as Fig.2. shows, is simply wrong.</Paragraph>
    <Paragraph position="17"> Translation and Understandinq In order to discuss the problem on a more concrete basis, we will first see how 'understanding of a sentence' has been understood in conventional NLU frameworks.</Paragraph>
    <Paragraph position="18"> Fig. 3. shows a simplified framework of an NLU system. In this framework, 'understanding of a sentence' is regarded as a process of transformation from an input sentence S, a linear sequence of words, into a meaning representation M(S). The M(S), in turn, is used as an input to a certain scheme of 'internal processing' such as a deductive inference, problem solving program, etc., which Js actually implemented as a computer program to carry out a certain specific task. In this framework, the meanings of input sentences are defined Jn terms of the 'internal processing' specific to individual 'understanding' systems, and so the results of Iunderstanding' are represented by symbolic expression~ which can be interpreted by internal programs for specific tasks.</Paragraph>
    <Paragraph position="19">  An ordiz~ary NL front end for a data base system, for example, transforms sentences into expressions of a certain query Ianguage such as SQI,, an artificial language designed Ifor data base accesses. The internal program in this case is tile SQL interpreter which can execute the expressions to retrieve appropriate data. As all extreme example, the STUDENT system \[D.Bobrow \].968\], which solves exercises of arithmetic expressed in English, transforms texts into a simultaneous equation. In this system, the 'meaning' of an input text is an equation.</Paragraph>
    <Paragraph position="20"> Such transformation from an input to the M(S) \]s essentiaIly an information extraction process where only information relevant to specific tasks is extracted; it is not an information preserving process in tbe sense that exact surface sentences usually cannot be re-generated from information extracted. In other words, M(S) used so far represent the 'meanings' of input sentences only from a certain point of view, that is, from the view point of 'internal processing' for a specific task, and therefore, only preserve information relevant to that task. Though other frameworks which have been adopted by NLU rese;~rchers in certain fields such as 'text understandi.ng' seem to have different flavors, the essential framework is almost the same. In these systems, 'understanding texts' i s taken to be a process of relating texts to internal 'knowledge' called 'scripts', 'frames', 'schemas' etc. prepared in the systems beforehand. Knowledge in these systems is claimed to imitate human conceptual memory formed through experiences in the real world and to be general in the sense that it is independent of specific tasks. Such systems, however, also have their own tasks such as 'paraphrasing', 'summary generation' etc. to show their understanding capabilities by external behaviour; these tasks implicitly define the content and descriptive frameworks of their knowledge so that the information to be extracted from texts is restricted. In addition, because the internal forms of knowledge to which input texts are related usually reflect situations (or sequences of events) in the real world, they have nothing to do directly with linguistic texts. That is, 'understand~ng results' in these systems often miss the linguistic aspects of texts.</Paragraph>
    <Paragraph position="21"> In contrast to a restricted approach to meaning extraction, however, the aim of translation Js Ito re-express by using sentences of target \] anguages the information of all aspects contained in sentences of source languages, with as \]east distortion as possible'.</Paragraph>
    <Paragraph position="22"> It is commonly recognized by l~nguists that a\]\] different surface sentences convey different information. If we share th\]s understanding, the M(S) in MT should v\]rtua\]\]y retain informat ion for re-generating exact source sentences. That is, we do not have any 'internal processings' Jn MT by which we can define certain aspects of information conveyed by texts. The M(.~;) of source sentences in MT should preserve information of a\] \] kinds conveyed by source sentences, not only what Js described by the texts but also how it is described, from what view points and by what attitudes. Such considerations have led us to the framework a\] ready shown as Fig. 2. in this framework, we abandon single layers of descriptions for representing 'understanding results', and instead, have several layers of descriptions which collectively determine the surface syntactic structures of the source sentences and which are a\]\] to be utilized durLng the transfer.</Paragraph>
    <Paragraph position="23"> Based on this assumption of the muiti-\]ayered description o\[ source texts, we can thin\]&lt; of certain \].ayers of description which are language universal.</Paragraph>
    <Paragraph position="24"> and which correspond to 'understanding resu\]ts' in conventional NLU systems. We will discuss in the following sections some of the problems in utilizing these extra-linguistic Layers of 'understanding' in translation processes and what roles these layers shou\].d play in the preeess as a whole.</Paragraph>
  </Section>
  <Section position="4" start_page="658" end_page="660" type="metho">
    <SectionTitle>
5 Words and Concepts
</SectionTitle>
    <Paragraph position="0"> We will first examine the basic units from which complex expressions in these language independent layers might be constructed. The researchers advocating rla~ve interIingua\] approaches have Jn mind snch a view as shown Jn Fig. 4. In this view, each word of individual languages denotes a language independent or extra-linguistic concept, though some words are ambiguous and denote several different (mutually distingui shable) concepts. Such concepts denoted by words in individual languages are the basic units of language universal description. In this view, words of individual languages are related to each other through the concepts, and translation of words from one language to another is to be performed straightforwardly through these concepts.</Paragraph>
    <Paragraph position="1"> This view is we\]\].-fitted for the terminological concepts and words in a scientific field.The word  'mass' in physics, for example, denotes a certain concept called 'mass' in English or 'shitsuryou' in Japanese. The concept has its own definition in the theories of physics, which are, of course, language independent. The relationship between words and concepts here is similar to that found in Fig. 3, where the meanings of linguistic expressions (and so those of individual words) are related to symbolic expressions used in 'internal processing'. Theories of physics are here playing the same role as do 'internal processings' in NLU (Fig. 5).</Paragraph>
    <Paragraph position="2">  In ordinary texts, even in abstracts of scientific and technological papers which our MU systems aim to translate, however, we find a large number of ordinary words which lack such formal definitions and for which the above naive view of lexical translation does not work well. The concepts denoted by ordinary words such as 'to introduce', 'to produce', 'advantages', 'fields' etc. do not have formal explicit definitions, even if we accept the existence of such denoted 'concepts'. Especially, as \[Hobbs 1984\] noted, verbs are usually used to describe quite different situations or events in the real world. He gives the following examples of usages of 'to produce' in medical textbooks on hepatitis as follows.</Paragraph>
    <Paragraph position="3"> A disease can produce a condition A virus can produce a disease Something can produce a virus.</Paragraph>
    <Paragraph position="4"> Intesia flora can produce compounds etc.</Paragraph>
    <Paragraph position="5"> Note that, in Japanese, we have a verb 'tsukuridasu' which roughly corresponds to 'to produce' in English, but some of the above usages of 'to produce' would need to be translated into a different Japanese verb, 'hikiokosu'. In order to retain the simplicity of translation through extra-linguistic concepts, we have to prepare at least two different concepts denoted by 'to produce' which are denoted in Japanese by 'tsukuridasu' and 'hikiokosu', respectively. Moreover, because we can easily recognize the differences among situations described by 'to produce' in the above sentences, it is natural to imagine that there may be other languages which require further division of the concepts. The naive scheme in Fig. 4. may result in a proliferation of concepts and cannot explain the correspondence of words in different languages.</Paragraph>
    <Paragraph position="6"> Hobb's answer (and, of many other researchers both in NLU and linguistics) to this question, which is intuitively reasonable, is: 'to produce' in the above examples is not a polysemy, because all of the  above usages share a certain core meaning in common such as 'x causes y to come into existence'. This kind of approach, the lexical decomposition approach, not only can prevent the proliferation of concepts, but it also has another advantage in that it reduces the diversity of surface expressions by representing sentences with different surface verbs such as 'to produce', 'to create', 'to generate' etc. by the same combinations of primitives. Such a reduction is preferable for 'know\]edge' based processing which utilizes extra-linguistic knowledge, i.e. set of rules intrinsic to external worlds, because the processing is concerned with events or situations described by texts but not directly concerned with texts themselves.</Paragraph>
    <Paragraph position="7"> Though such reduction is inevitable for certain kinds of knowledge based processing, we have to notice that the lexical decomposition approach, by itself, does not explain anything about lexical correspondence among different languages. On the contrary, it may increase the difficulties of lexical choice in translation. In order to discriminate 'to assassinate' from 'to kill', 'to murder' etc., though we have a rather direct correspondence between 'to assassinate' in English and 'annsatsusuru' in Japanese, we have to encode many kinds of information other thal\] 'X cause Y to become not to be alive' such as Y's social status, the reason of 'killing' (political or not) and, in general, the speaker's conception of the 'killing' event in question. In other words, the description cannot replace surface lexical :items unless a complete set of (cognitive or other) i_~etors relevant to surface lexical choices are fully specified. The fact that most decompositionists have been only concerned with verbs shows that to specify such a set of primitives for expressing even only the core meanings of nouns is far more difficult.</Paragraph>
    <Paragraph position="8"> (Note that 'field' should be translated into six or more different nouns in Japanese \[Nagao \].986\]) Furthermore, because the factors to be considered relevant, or the features of situations to be described that are considered to be relevant, to surface lexical choices are highly dependent on each lexical item (and so, of course, dependent on each language), we cannot expect to have a complete set of factors which can be applied to choices of every lexical item of every individual language. Trying to get such a language independent set may result in a proliferation of factors instead of the proliferation of extra-linguistic concepts found in the naive scheme.</Paragraph>
    <Paragraph position="9"> Again, note that we do not claim that the aspect of understanding captured by decomposition is irrelevant to translation. Instead, it constitutes one of several indispensable layers of description which facilitate systematic comparison of the two languages. In order to translate 'to assassinate' correctly into Japanese, we have to discriminate the literal meaning and metaphorical meanings of the word (such as 'to hurt someone's honor by a nasty trick or verbal abuse'), because the Japanese verb 'annsatsusuru' may express the latter, the metaphorical meaning. Such discrimination obviously requires understanding of what really happened in the real world, and the understanding at this level (contextual understanding level) should be expressed by a descriptive framework using a certain set conceptual primitives (because understanding results of this level should be represented independently from surface diversified texts). We only claim that the description only expresses certain aspects of 'meanings' of surface words and it cannot replace them. We also claim that any attempts to get a complete, language universal set of primitives for explaining lexical choices in any language will be in vain, and that what we really need at present is much more comprehensive comparative studies on lexical choices between languages in question in order to clarify what kinds of factors are relevant to the selection of appropriate target equivalents for each individual word of the source language.</Paragraph>
  </Section>
  <Section position="5" start_page="660" end_page="666" type="metho">
    <SectionTitle>
6 Implicit Information
</SectionTitle>
    <Paragraph position="0"> The discussion in the last section can be summarized thus; Because a continuously infinite physical/mental world is described by a natural language which has only finite words, words in individual languages are used to describe certain ranges o\[ events/objects. That is, 'meanings' of words a~:t quite vague. This vagueness causes difficulties of lexical choice in translation by the fact that certain families of events/objects which can be described by the same words in one language should be described by several different words in other languages (Fig. 6).</Paragraph>
    <Paragraph position="1">  The same line of discussion can be applied to linguistic expressions in general. That is, the set of (cognitive or other) factors which determine surface expressions changes from one language to another. Or, even if similar factors work in the determination of surface expressions, they may be reflected by using quite different syntactic devices. It often happens that to determine target surface expressions requires a set of factors which are not expressed at all in the source language or which are quite implicit, even if they are expressed.</Paragraph>
    <Paragraph position="2"> On the one hand, to translate Japanese to English, for example, we have to have information about plural-singular and definiteness-indefin it en ess d istinctions of noun phrases which are implicit in Japanese.</Paragraph>
    <Paragraph position="3"> The Japanese sentence 'watashi-ha kino kangofu-ni atta.' \[I\] \[yesterday\] \[nurse\] \[to meet\] \[past\] may correspond to the following four sentences in English, depending on the context.</Paragraph>
    <Paragraph position="4">  explicitly the above sentence lacks information, we can claim that the sentence is just vague as 'meanings' of words are. That is, the sentence can describe a set of situations in the real world which share certain properties in common, but in English, the same set of situations should be expressed differently, depending on properties of situations which are not relevant to the selection of Japanese expressions and which therefore remain implicit in Japanese.</Paragraph>
    <Paragraph position="5"> On the other hand, Japanese is rich Jn honorific expressions and highly dependent on speaker-h~arer's social relationships. Therefore, in the translation from English to Japanese, we have to recover such information which is implicit in English. For example, a simple sentence such as 'I'll come tomorrow' may correspond to Japanese sentences such as 'asu oukagai /tashimasu' \[the hearer is blgher in the social position\] 'asu oukaigai shimasu' \[the hearer is higher in the social position\] \[the speaker is intimate with the hearer\] 'asu ikuyo' \[the speaker is intimate with the hearer\] \[the speaker is male\] 'asu ikuwa' \[the speaker is intimate with the hearer\] \[the speaker is female\] 'asu ik imasu' \[neutral\] English native speakers certainly do not think that the sentence is ambiguous in the above sense. In this case, Japanese requires information about social status of speakers and hearers, which is not so relevant to the selection of English expressions. Speaker's intentions, which recent researches of NLU \[Brady 1983\] \[Appelt 1985\] \[Grosz 1986\], especially in dialogue systems, place a strong emphasis upon, are a typical example of implicit information, and we can easily imagine situations where it also plays an important role in translation, especially in translation of dialogues such as the simultaneous translation of telephone communication. It is, however, not desi'rable for translation systems to translate sentences according to speaker's intention alone. Translating 'It's hot in this room' to 'mado-o akete kudasai' (Please open the window) probably commits too much as a translation system. The system should select natural expressions in target languages as long as they do not distort the 'meanings' of source sentences too much. This implies that 'understanding of sentences' and 'the meanings of sentences' should be distinguished. What is meant by 'understanding of sentences' is, as recent researches in NLU typically show, to understand the situations where certain utterances are given or the situations which texts describe, including such factors as speaker's intentions, speaker-heater's social relationships, definiteness/indefiniteness of referenced objects, etc. Though these factors are relevant to the selection of target expressions, it is doubtful that all such derived information is a part of the description of source sentences which expresses various factors determining the surface  expressions in the source language. Researchers in NLU often confuse understanding results with the description of input sentences.</Paragraph>
    <Paragraph position="6"> As noted before, the researches in NLU so far have revealed that 'understanding sentences' cannot be defined, at least computationally, without considering certain specific internal tasks, and the task of MT, 'to re-express in target languages the information conveyed by sentences of source languages with as least distortion as possible', by itself, does not define anything about what kinds of understanding are required in MT. Because the factors relevant to the determination of surface structures are dependent on each language, the exact requirements on what aspects of the situations described by source texts should be 'understood' cannot be fixed unless the language to which the texts are to he translated is specified.</Paragraph>
    <Paragraph position="7"> English native speakers, for example, can 'understand' null 'I'll come tommorrow' without any attention to the social relationships of the speaker and the hearer. Only when they are asked to translate the sentence into Japanese, must they consciously consider such factors to select the most appropriate Japanese expression. The same line of discussion can be applied to the problem of target word selection. We cannnot enumerate, by monolingual thinking, different 'concepts' denoted by the verb 'to produce'. Only when we are asked to translate sentences containing the verb into another language, can we try to find appropriate target words. During this process, 'understanding of the sentences' and so 'understanding of the situations described by the verb' are promoted in such a direction that we can identify the most appropriate target verbs.</Paragraph>
    <Paragraph position="8"> The above discussion implies that certain 'understanding processes' are target language dependent, and cannot be fully specified in a mono-lingual manner. We have to separate, at least conceptually, bi-lingual processings from mono-lingual processings which extract explicitly a set of factors determining the surface structures of source texts. In the tentative framework in Section 2, the role of the transfer phase was restricted to computing factors for determining target structures from factors extracted from source texts including their surface structures. We assumed there that a set of factors for determining target surface structures could be computed from those extracted during the analysis phase, though the computation itself was dependent on language pairs. The discussion in this section shows that this assumption is not true. The transfer phase should do more than that. The revised framework is shown in Fig. 7. Though we adopt here the conventional division of phases in current transfer based systems, we do not claim that the three phase configuration is the best and that these three phases should be executed in order. Instead, we can think of a system in which the 'understanding' phase extracts not only factors determing surface source texts but also factors for determining target structures. But even so, we claim that the understanding results in such a system have to be specific to language pairs and not language universal. Which configuration is superior to the other, the two phase configuration or the three phase configuration, should be disscused  from engineering points of view such as maintainability of grammars and dictionaries, efficiency of processing, etc. but not from the view point of 'understanding texts'.</Paragraph>
    <Paragraph position="9">  The fact that 'understanding texts' has been understood differently by different researchers in NLU implies that the 'knowledge' to which text contents are to be related is different from one system to another. So far, quite different sorts of information prepared beforehand in systems have been called 'knowledge'. In Section 5, we discussed two different approaches to meanings of words which may lead us to quite different views of what 'knowledge' is: One is to relate meanings of words to extra-linguistic, language independent concepts whose semantics are, in turn, given by certain theories (or formal systems), internal processing for specific tasks such as data base accesses, problem solving, etc. The other is to describe core meanings of words by relating the words to a certain set of primitives. The latter may be augmented by adding further description using cognitive, situational or other features (as noted in Section 6, some of these may be language dependent) in order to specify what families of objects/events the words can describe. The knowledge described by this approach is essentially knowledge about possible usages of words and can be utilized to translate words of certain types or to make general inferences on the situations described. On the other hand, 'knowledge' which is often mentioned in fields such as knowledge engineering, expert systems and so forth refers to knowledge of specific fields, and is more easily expressed in the first approach. These two approaches are quite opposite. While the decomposition approach tries to discover a single description which covers possible usages of a word including its metaphorical usages (the decompositionalists may claim all usages are metaphorical), the extra-linguistic concept approach (the concept approach, in short) tries to enumerate a set of concepts denoted by the word. While the decomposition approach attempts to find internal structures of single words, the concept approach tends to identify even complex expressions such as 'diagrams on the plane of the celestial equator' (note that this expression has a simple translation equivalent in Japanese like 'jizuhyou') as single concepts. AS noted in Section 5, the concept approach, which we there called the 'naive approach', cannot be used to express the whole meaning of texts, but this does not imply that know\].edge expressed by this approach is irrelevant in MT. On the contrary, it often happens that we realize 'lack of knowledge' in systems, when we find re\]stranslations of terminological words or when we find misunderstandings of source texts.</Paragraph>
    <Paragraph position="10"> Because the decompos it ion approach essentially captures possible usages of words, it cannot decide appropriate translations of terminological expressions by itself. This is obvious because even human translators who have enough knowledge of language usages often mistranslate terminological words. The systems or human translators should have knowledge about relationships between words and extra-linguistic concepts in the subject fields. Because such relatiorlships are a kind of conventions specific to each subject field, we simply have to know these conventions. Several current MT systems prepare certain frameworks for treating such conventions of term translations specific to individual subject field~: such as the field code in the MU systems \[Sakamoto \]984\], the micro-glossaries in PAHO's systems \[Vasconcellos 1985\], hierarchical organizations of dictionaries in GETA's systems \[Boitet 1982\], etc. However, though relating terminological expressions (or words) in different languages through extralinguistic, language universal concepts has become a standard way of thinking in the field of terminology and already adopted by several multi-lingual terminology data banks (for example, \[Goetschalckx 1974\]), they do not explicitly introduce the extra-linguistic concepts in their frameworks but instead, relate rather directly the terminological words or expressions of the different languages.</Paragraph>
    <Paragraph position="11"> (Uchida 1985\] claims that we have to introduce extra-linguistic concepts even in MT systems, be- null cause ; (1) futurn MT systems should include not: only knowledge of the correspondence of terminological expressions but also factual knowledge and knowledge about inference rules specific to the fields, etc.</Paragraph>
    <Paragraph position="12"> (2) Such extra-linguistic knowledge is language uni null versal, and, therefore, sbou\]d be managed by different frameworks from genera\], linguistic knowledge which is l~mguage dependent.</Paragraph>
    <Paragraph position="13"> \[Boitet 1984\] shows how factual knowledge in a specific subject field can be utilized to resolve certain syntactic ambiguities such as those of the scope of coordinations, determination of antecedents of relatJ w~ clauses and pronouns, etc. For example, he discusses that determining the correct scopes of the coordinations  (i) dangerous \[cyanide and chlorine\] fumes (2) \[carbon and nitrogen tetraoxyde\] requires fatual knowledge of a specific level such as (3) cyanide fumes are dangerous (4) there is no carbon tetraoxyde in normal  chemistry.</Paragraph>
    <Paragraph position="14"> The sequences of 'cyanide and chlorine fumes' and 'carbon an(\[ nitrogen tetraoxyde' could not be differentiated, if we used only a rough semantic classification of nouns such as being the name of a chemical etc. (These examples, as Boitet notes, cannot be correctly interpreted by a simple method of preference semantics.) The necessity of detailed factual knowledge such as (3) and (4) is obvious, and, because such knowledge in chemistry is language independent, it should be represented in a language universal manner. Extra-linguistic concepts should play more important roles than mere links among the terminological terms of individual languages.</Paragraph>
    <Paragraph position="15"> However, although we completely agree that extra-linguistic knowledge should play more important roles in future high quality translation systems, we have to be very careful ill the introduction of such knowledge into MT systems. First of all, as we have repeatedly claimed, the 'meanings' extracted from sentences that can be related to knowledge of this kind does not at all exhaust the information conveyed by sentences that need to be 'transferred' into target sentences. Moreover, because sentences even in specific subject fields consist of both terminological terms and ordinary words, we cannot expect to express a\]I the results of understanding such sentences at the \]eve\]. of description using only the extra-linguistic concepts. We can only expect to express the understanding results of certain parts of sentences at this level and check whether the understanding results of those parts are compatible with common sense knowledge of the specific field. In ozher words, the processing at this level cannot play the main role ~n translation but can only play some roles to prevent certain kinds of 'misunderstanding'. \[Boitet 1984\] notes this point as 'grafting on expert systems ' .</Paragraph>
    <Paragraph position="16"> In addition to this, the boundary between terminological terms and ordinary words is not so clear. When we restrict terminological terms to names of chemical compounds, of mechanical parts, etc., Ld\]e problem of the boundary might not appear so serious :but such restriction &lt;:'an lead to serious limitation on the availability of knowledge of this kind for forming selectina\] restrictions necessary for the disambiguation of source sentences. If we attemp to extend the range of 'terminological terms', the problem of the boundary between terminological terms and ordinary words arises. For example, \[Hobbs 1984\] points out that, in a textbook on hepatitis, ordinary words such as 'human', 'animal', 'water', 'alcohol' etc. have specialized meanings different from those in general fields;the concept denoted by 'human', in this field, is not a lower concept of the concept denoted by 'animal'. We might then claim that these two terms are terminological terms of the field and that the denoted concepts have certain restricted relationships with the other concepts in the fields. A\] though such seleetional restrictions specialized in certain subject fields might be very useful for resolving syntactic ambiguities of sourse sentences, problems here are how to find such restricted usages of ordinary words that are specific to certain fields, how to clarify the possible relationships anlong 'concepts' in those fields ( to create semantic models of the fields), etc. As the above example shows, even clarifying the hierarchy among concepts, which is one of the prevailing techniques for organizing 'knowledge' in ordinary knowledge representation research, is not so easy when we have to deal with reasonably large subject fields. In order to utilize knowledge of this sort in the dlsambiguation pFocess, we have to encode not only such hierarchical relationships among concepts but also many other kinds of factual knowledge about  those concepts. Before claiming 'such-and-such factual knowledge can resolve certain specific ambiguities of given sentences', we have to develop methodologies by which we can systematically clarify a set of concepts in the given fields and the relationships among those concepts, and can gather factual knowledge relevant to those concepts.</Paragraph>
    <Paragraph position="17"> The above discussion shows that there is not a clear boundary between terminological words and ordinary words; but instead, there is a continuous distribution of words from pure terminological words, such as names of chemical compounds, at the one extreme to pure ordinary words at the other. Though the pure terminological words have their own language universal definitions and can be related directly to extra-linguistic concepts, the ordinary words have only their usages in individual languages and we have to infer the denoted 'concepts' from their usages.</Paragraph>
    <Paragraph position="18"> That is, as noted before, the denoted 'concepts' of ordinary words are language internal and cannot be related directly to extra-linguistic concepts. The -selectional restrictions which ordinary words have, therefore, can only be captured by specifying what events/objects can be described by those words, and that specification might be language dependent.</Paragraph>
    <Paragraph position="19"> Some of the difficulties in MT are caused by the fact that most of the words in certain subject fields, even words which are usually taken as part of the terminology of those fields, are in-between the two extremes, and sentences usually contain words at various positions in the distribution. For example, a sentence such as (5) The mixture gives off dangerous cyanide and chlorine fumes contains two pure terminological words (i.e., cyanide, chlorine), two ordinary words (i.e., 'to give', 'to be dangerous') and two intermediate types of words (i.e. 'fume', 'mixture'). This fact requires us to prepare various sorts of description for the selectional restrictions among words (for the analysis phase) and also for the selection of target equivalent words (for the transfer phase). As selectinal restrictions for disambiguation, we have to have factual knowledge of the fields (for restrictions among terminological words), restrictions specified by using cognitive, situational or other features (for restrictions among ordinary words -deep case frames with semantic restrictions on case fillers, which are specified in the verb dictionary, are one of the typical techniques found in current MT systems) and varied sorts of mixtures of these two extremes. On the other hand, for the selection of appropriate target word selection, we have to have several kinds of 'transfer' mechanism using different sorts of information such as extra-linguistic concepts which link the words of individual languages, distinguishing features for described events/objects, and so on.</Paragraph>
    <Paragraph position="20"> The situation becomes even more complicated due to the fact that a single word has often both specialized usages and general usages, even if we restrict our domain of translation to certain limited areas.</Paragraph>
    <Paragraph position="21"> The frameworks which current MT systems provide, such as semantic features, subject fie\]d codes, micro-glossaries specific to the fields, hierarchically organized dictionaries, etc., cannot  -capture the interwined relationships between ordinary words and terminological words, and between usage s specialized in fields and general usages.</Paragraph>
    <Paragraph position="22"> We have to emphasize that there is no single layer of 'understanding' exclusively relevant to translation; only mutually related layers of understanding ranging from detailed understanding (related to factual knowledge in the field) to the vague and general understanding of situations. All these layers will need to contribute to high quality translation in the future.</Paragraph>
    <Paragraph position="23">  researches different from other frameworks in NLU, and we have stressed that one of the peculiarities of MT as an NLP application is that we cannot readily set up a particular task-oriented level of 'understanding' in MT as we can in other applications. This peculiarity causes some difficult problems not encountered elsewhere, and we wi\] 1 list some of them since their resolutions seem particulary important in future, high quality translation systems.</Paragraph>
    <Paragraph position="24"> \[Problem i\] (Multi-Layer Representation) The process of machine translation can be taken as a sequence of processes of the extraction of vario~is factors which collectively determine the surface syntactic structures of source sentences, the computation of factors which are relevant to target sentence structures, and the realization of those factors as surface structures in the target language. Therefore, we need a certain descriptive framework in which we can express these various sorts of factors and from which we can retrieve such factors. Annotated tree structures such as those used in the MU systems, GETA, METAL etc. are one of such currently available frameworks. Annotated trees as they are, however, have only single structures (trees) of nodes with various sorts of information described in the annotation parts. It is obvious that each different sort of information requires different geometorical structures so that the current annotated trees may not be sufficient for sophisticated processing required in the future MT systems. Though Kay's notation in unification grammar \[Kay 1984\] is obviuosly one of the candidate frameworks, it is appropriate only for describing interpretations which have already determined by the analysis phase. Effective computational frameworks shoud be developed for producing such descriptions from source sentences which might be quite ambiguous. Texhniques for sharing a partial description at a certain level by several different descriptions at different levels and for maintaining the consistency of description when some parts of it are changed should be developed.</Paragraph>
    <Paragraph position="25"> \[Problem 2\] (Integration of Understanding Levels) As discussed in Section 7, we should be able to integrate several different levels of 'understanding' with linguistic levels of description. The descriptive frameworks developed so far have confined themselves to either linguistic levels or to one of the specific understanding levels. Kay's unification grammar, LFG, GPSG etc. are all concerned with the description of linguistic levels. All of them, for example, treat surface words as primitive units. On the other hand, most researches in NLU aim to relate texts to certain extra-linguistic knowledge so that the final understanding results are expressed independently from their linguistic source structures. In order to integrate understanding results with the translation proccess, we need further researches to clarify not only what levels of understanding are really re\].evant to translation but also how we are to coordinate such diversified levels of processing computat iona I ly.</Paragraph>
    <Paragraph position="26"> \[Problem 3\] (Incompleteness of Texts and 'Knowledge'-Robustness of Processing) IIuman translators can translate 'I'll come tomorrow' into Japanese without: any knowledge about the social relationships of the speaker and the hearer. They will translate the sentence based on the default assumption that the relationship is neutral. It usually happens that, even for human translators, certain factors relevant to the detc~rmination of target structures cannot be obtained because of the incompleteness of texts and lack of necessary knowledge. The system should be able to determine the most feasible translations based on the incomplete factors extracted from soui-ce texts. Though establishing sets of factors w~ich collectiw~ly determine the surface structures of the source and target languages may facilitate systematic contrastive studies of the two languages and make present ad-hoc transfer phase cleaner, we have to note that actual systems cannot a\].ways extract such factors from the source texts. Even in future systems, we will have to prepare heuristic guided transfer procedures based on lower level factors, such as syntactic structures, alone. That is, the idea of 'safety nets' is indispensab\] e, however intelligent the future MT system might be. \[Nishida \].982\] disscusses, in their MT system from English to Japanese, some techniques for calculating surface syntactic structures of Japanese which can preserve the discourse factors of English texts, without referring to such factors explicitly. These rules are a kind of heuristic but are not linguistically wellfounded. For this kind of processing, we may have to introduce other kinds of knowledge, for example, the expert knowledge of professional translators \[Tucker 1985 \] .</Paragraph>
    <Paragraph position="27"> \[Problem 41 (Easy Accomodation of Future Development of Theories) As noted in Section 3, we cannot expect to have a complete set of factors which carl uniquely determine the surface syntactic structures of a language. Becauese there is always possibility that future linguistic research will reveal factors which have not yet been noticed, the computational framework should be flexible enough for accomodating these factors. In this sense, to commit strongly to one linguistic theory at present seems dangerous for computational frameworks. Furthermore, though most linguistic theories aim to describe linguistic structures from a mono-lingual point of view, the factors to be extracted from source texts depends on the target language. Some of the factors relevant to translation can only be clarified through bi-\].ingual, contrastive studies of the two languages and by referring to the aspects of 'understanding' which are obviously beyond the scope of current linguistic theories. We \]lave to note that the computational frameworks for machine translation should be flexible enough for treating various sorts of phenomena which current linguistic theories do not cover.</Paragraph>
    <Paragraph position="28"> \[Problem 5\] (Other Factors to be Accomodated Discourse Factors, Cognitive Factors) The computational researches in discourse analysis so far have put emphasis on a certain set of topics, such as resolutions of anaphoric expressions, recovering speakers' intention from utterances, etc. Although these are more or less relevant to high quality translation in the future, we have to attack much wider ranges of prb\]ems concerned with discouse phenomena, that is, what kinds of discourse factors are relevant to the determination of surface sentence styles and in what manner. Though relevant topics have been treated in text linguistics and many useful ideas have been proposed already, many of them seem to be too vague to formalJ, ze computationally. It is time to fin(\] computatJorlal formalization for them and to integrate them with translation processes. MT is one of the most promising application fields where the research results in text linguistics could be utilized.</Paragraph>
    <Paragraph position="29"> \[Ishiwata 1985\] discusses how cognitive features are relevant to translation, especially word translation. By taking the French verb 'tomber' and the Japanese translation equivalents 'taoreru' and 'ochJru' as a typical example, he shows that certain movements or objects which carl be expressed by the verb 'tomber' in French should be described differently by using either 'taoreru' or 'ochiru'. His claim \]s that such selection of target word depends on how the speaker recognize the movements of objects, that is, whether the motion J s rather perpendicular (i.e. the stone fa\] is) or not (i.e. the man fell over). That is, the selection of appropriate Japaneses verbs depend on a certain kind of 'image' \]eve\] understanding of the event whJ ch the French verb describes. Whether such levels of understanding carl be represented in a symbol ic manner, and what kinds of such symbolic cognitive features are necessary, whether there is a set of cognitive features which is effective for any language pair, and so on are, of course, research topics in the distant future. However, we }lave to note that such cognitive levels of features are more useful than extra-linguistic know\].edge in specific subject fields, for the choice of appropriate target equivalents for words with wide usages.</Paragraph>
    <Paragraph position="30"> \[Problem 6\] (Setting Layers of 'Understanding') As discussed in Section 6 and 7, we can distinguish at least the two extreme layers of understanding and knowledge relevant to MT. Whether these two kinds of understanding and knowledge can be represented Jn single frameworks, \]low they should be coordinated with linguistic processing (analysis, transfer, generation) computationa\]\].y, to what extent these kinds of knowledge can really be encoded in systems, etc. have to be clarified. If tho two kinds of knowledge should be represented separately, we have to clarify hew many different layers exists and \]low they should be mutually related.</Paragraph>
    <Paragraph position="31"> We have listed above some of the problems caused by the peculiarity of MT that we cannot determine in advance a certain concrete level of 'understanding'.</Paragraph>
    <Paragraph position="32"> The other peculiarities of MT come from the fact that MT systems have to treat documents of much wider subject fields and of much more varied text types than other applications. Our Mu systems, for example, restrict the document type to abstracts of scientific and technological papers but treat scientific fields in genera\].. The PAHO's systems translate documents in more restricted fields but include very wide ranges of document types, including conference reports, budget proposals, letters etc.</Paragraph>
    <Paragraph position="33">  This fact, in combination with the difficulty of setting the understanding level, causes many practical difficulties.</Paragraph>
    <Paragraph position="34"> \[Problem 7\] (Complexities of Semantic Models) Wider subject fields imply more complexities in semantic models. In data base access, one only has to deal with a simple set of semantic classes such as 'name of companies', 'person's name', 'salary', etc. and their possible semantic relationships. However, as \[Bennet 1985\] notes 'the thought of writing complex models of even one complete technical domain is staggering: one set of manuals we have worked with --- is part of a document collection that is expected to comprise some i00,000 pages. A typical NLP research group would not even be able to read that volume of manual, much less write the necessary semantic models, in any reasonable amount of time', we have to treat much more complex semantic fields in MT. We have to develop methodologies to clarify the structures of such complex semantic models systematically for any given subject field.</Paragraph>
    <Paragraph position="35"> \[Problem 8\] (Instability of Lexical Coding) wider subject fields imply a large amount of vocabulary, and high quality translation requires rich information to be coded for each lexical item. This means that we need many lexicographers for lexical coding, and the problem of consistency arises. High semantic complexities imply that criteria for lexical coding are not so evident. In the MU project, we prepared rather detailed manuals for lexical coding but they are still not sufficient for obtaining good quality codings. The semantic codes, for example, are often dependent on individual lexicographers and such inconsistency caused many troubles in grammar development and also depressing translation errors.</Paragraph>
    <Paragraph position="36"> The problem of instability is not found not only in semantic coding but also in every other description items in the dictinary, when codings are perforlaed by many people. We have to develop not only flexible software tools for facilitating lexical coding and cons is tency cheking \[ Kogure 1984\] \[Boitet 1982 \] but also effective linguistic checking procedures.</Paragraph>
    <Paragraph position="37"> \[Problem 9\] (Weak Semantic Constraints) The lack of concrete internal processing for specific tasks implies that the system cannot reject nonsense interpretations of input sentences. In other applications, certain syntactic interpretations are judged as nonsense when the internal processing cannot give any meaningful semantics to them. Furthermore, as Hobbs noted by the examples of 'to produce', wide subject fields imply that various usages of words which share a core meaning in common will appear in texts. That is, many usages which have metaphorical flavors ('The car drinks gas' is a well-known example given by \[Wilks 1972\]) will commonly appear in texts and make the rejection of syntactic interpretations on semantic grounds harder. In the MU systems, we prepared about 50 semantic categories for nouns, but most of them are not as effective as we had expected for preventing 'nonsense' interpretations, though they are effective for certain kinds of semantic interpretation (for example, for deep case inter\]gretations of prepositional phrases which are not ::strictly governed by their predicates) and target word selection to some extent. AS noted in Section 7, though Wilks' idea of 'preferential semantics' is  one of the possible solutions, we have to coordinate this idea with the other kinds of processing and wi%n preferences of other levels.</Paragraph>
    <Paragraph position="38"> \[Problem I0\] (Maintainability of Systems) In the discussion of \[Problem 3\] , we claimed that the transfer component should be robust and be able to compute the most feasible factors relevant to target structure determination, even if necessary factors cannot be given by the analysis phase. The same line of discussion can be applied to the entire process of MT. The analysis phase, for example, cannot expect that a full set of necessary information for interpretation of input sentences will always be accessible. This implies that, at each phase of translation, a certain number of rules, which are a kind of heuristics and not theoretically well-founded, should be prepared. Furthermore, to deal with wide subject fields implies that we have to treat varied types of linguistic phenomena, which again requires a large number of rules in those systems. Wider fields also increase ambiguities at each level of intepretation.</Paragraph>
    <Paragraph position="39"> A single word may have several different part-of-speech interpretations, to each of which several different syntactic features may be assgined (for example, a verb often have several different surface case patterns). This difficulty can be avoided to some extent in other applications because we can fix certain levels of interpretation in advance (for example, 'ship' may only be used as a noun in a certain data base access system, though it has a verb interpretation). In order to prevent the proliferation of possible syntactic interpretation in MT, we need a certain number of disambiguation rules which are also heuristic based \[Tsujii 1984\]. In short, we have to manage a large number of rules whose mutual relationships are tighter than those found in most other rule based expert systems. We have to develop not only flexible software systems for managing such large rule based systems \[Johnson 1984\] \[Nakamura 1986\] but also methodologies by which we can systematically organize and integrate knowledge of quite different sorts.</Paragraph>
  </Section>
class="xml-element"></Paper>