<?xml version="1.0" standalone="yes"?>
<Paper uid="P91-1017">
  <Title>Two Languages Are More Informative Than One *</Title>
  <Section position="2" start_page="0" end_page="131" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The resolution of lexical ambiguities in non-restricted text is one of the most difficult tasks of natural language processing. A related task in machine translation is target word selection - the task of deciding which target language word is the most appropriate equivalent of a source language word in context. In addition to the alternatives introduced from the different word senses of the source language word, the target language may specify additional alternatives that differ mainly in their usages.</Paragraph>
    <Paragraph position="1"> Traditionally, various linguistic levels were used to deal with this problem: syntactic, semantic and pragmatic. Computationally, the syntactic methods are the easiest, but are of no avail in the frequent situation when the different senses of the word show the same syntactic behavior, having the same part of speech and even the same subcategorization frame. *This research was partially supported by grant number 120-741 of the Israel Council for Research and Development.</Paragraph>
    <Paragraph position="2"> Substantial application of semantic or pragmatic knowledge about the word and its context for broad domains requires compiling huge amounts of knowledge, whose usefulness for practical applications has not yet been proven (Lenat et al., 1990; Nirenburg et al., 1988; Chodorow et al., 1985). Moreover, such methods fail to reflect word usages.</Paragraph>
    <Paragraph position="3"> It has been known for many years that the use of a word in the language provides information about its meaning (Wittgenstein, 1953). Also, statistical approaches, which were popular a few decades ago, have recently reawakened and been found useful for computational linguistics. Consequently, a possible (though partial) alternative to using manually constructed knowledge can be found in the use of statistical data on the occurrence of lexical relations in large corpora. The use of such relations (mainly relations between verbs or nouns and their arguments and modifiers) for various purposes has received growing attention in recent research (Church and Hanks, 1990; Zernik and Jacobs, 1990; Hindle, 1990). More specifically, two recent works have suggested using statistical data on lexical relations for resolving ambiguity cases of PP-attachment (Hindle and Rooth, 1990) and pronoun references (Dagan and Itai, 1990a; Dagan and Itai, 1990b).</Paragraph>
    <Paragraph position="4"> Clearly, statistical methods can also be useful for target word selection. Consider, for example, the Hebrew sentence extracted from the foreign news section of the daily Haaretz, September 1990 (transcribed into Latin letters).</Paragraph>
    <Paragraph position="5">  (1) Nose ze mana' mi-shtei ha-mdinot mi-lahtom 'al hoze shalom.</Paragraph>
    <Paragraph position="6">  This sentence would translate into English as: (2) That issue prevented the two countries from signing a peace treaty.</Paragraph>
    <Paragraph position="7"> The verb 'lahtom' has four word senses: 'sign', 'seal', 'finish' and 'close', whereas the noun 'hoze' means both 'contract' and 'treaty'. Here the difference is not in the meaning, but in the usage.</Paragraph>
    <Paragraph position="8"> One possible solution is to consult a Hebrew corpus tagged with word senses, from which we would probably learn that the sense 'sign' of 'lahtom' appears more frequently with 'hoze' as its object than all the other senses. Thus we should prefer that sense. However, the size of corpora required to identify lexical relations in a broad domain is huge (tens of millions of words) and therefore it is usually not feasible to have such corpora manually tagged with word senses.</Paragraph>
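The sense-tagged alternative described above can be illustrated with a minimal sketch. This is not the paper's implementation: the tagged-relation format, the function name, and the toy counts are all our own inventions for illustration.

```python
from collections import Counter

def count_senses(tagged_relations, verb, obj):
    """Count how often each sense of `verb` takes `obj` as its object.

    `tagged_relations` is an iterable of (verb_sense, object) pairs,
    where a verb sense is written as "lemma/sense" (our own convention).
    """
    return Counter(sense for sense, o in tagged_relations
                   if sense.startswith(verb + "/") and o == obj)

# Invented toy data: five 'sign' occurrences, one 'seal' occurrence.
relations = [("lahtom/sign", "hoze")] * 5 + [("lahtom/seal", "hoze")]
best = count_senses(relations, "lahtom", "hoze").most_common(1)[0][0]
# On this toy data the preferred sense is "lahtom/sign".
```

As the text notes, the obstacle is not the counting itself but obtaining a manually sense-tagged corpus of sufficient size.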
    <Paragraph position="9"> The problem of choosing between 'treaty' and 'contract' cannot be solved using only information on Hebrew, because Hebrew does not distinguish between them.</Paragraph>
    <Paragraph position="10"> The solution suggested in this paper is to identify the lexical relationships in corpora of the target language, instead of the source language. Consulting English corpora of 150 million words yields the following statistics on single word frequencies: 'sign' appeared 28674 times, 'seal' 2771 times, 'finish' 15595 times, 'close' 38291 times, 'treaty' 7331 times and 'contract' 30757 times. Using a naive approach of choosing the most frequent word yields: (3) *That issue prevented the two countries from closing a peace contract.</Paragraph>
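The naive baseline can be sketched as follows, using the single-word frequency counts quoted above; the function name is ours.

```python
# Single-word frequencies quoted in the text (150-million-word corpus).
WORD_FREQ = {
    "sign": 28674, "seal": 2771, "finish": 15595, "close": 38291,
    "treaty": 7331, "contract": 30757,
}

def naive_select(alternatives, freq=WORD_FREQ):
    """Pick the target-language alternative with the highest frequency."""
    return max(alternatives, key=lambda w: freq.get(w, 0))

# The baseline prefers 'close' and 'contract', yielding the wrong
# translation (3) above.
```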
    <Paragraph position="11"> This may be improved upon if we use lexical relations. We consider word combinations and count how often they appeared in the same syntactic relation as in the ambiguous sentence. For the above example, among the successfully parsed sentences of the corpus, the noun compound 'peace treaty' appeared 49 times, whereas the compound 'peace contract' did not appear at all; 'to sign a treaty' appeared 79 times while none of the other three alternatives appeared more than twice. Thus we first prefer 'treaty' to 'contract' because of the noun compound 'peace treaty' and then proceed to prefer 'sign' since it appears most frequently having the object 'treaty' (the order of selection is explained in section 3). Thus in this case our method yielded the correct translation.</Paragraph>
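The relation-based preference for this example can be sketched as follows. The counts reproduce those quoted above; the two-step greedy order (noun compound first, then verb-object) follows the text, while the actual selection criterion is the one detailed in section 3.

```python
# Counts of syntactic relations in the parsed English corpus, as quoted
# in the text; the relation labels and table layout are our own.
REL_FREQ = {
    ("noun-compound", "peace", "treaty"): 49,
    ("noun-compound", "peace", "contract"): 0,
    ("verb-object", "sign", "treaty"): 79,
    ("verb-object", "seal", "treaty"): 2,
    ("verb-object", "finish", "treaty"): 0,
    ("verb-object", "close", "treaty"): 0,
}

# Step 1: prefer the noun via the 'peace <noun>' compound.
noun = max(["treaty", "contract"],
           key=lambda n: REL_FREQ.get(("noun-compound", "peace", n), 0))

# Step 2: prefer the verb taking that noun as its object.
verb = max(["sign", "seal", "finish", "close"],
           key=lambda v: REL_FREQ.get(("verb-object", v, noun), 0))
# noun == "treaty", verb == "sign": the correct translation.
```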
    <Paragraph position="12"> Using this method, we take the point of view that some ambiguity problems are easier to solve at the level of the target language instead of the source language. The source language sentences are considered as a noisy source for target language sentences, and our task is to devise a target language model that prefers the most reasonable translation.</Paragraph>
    <Paragraph position="13"> Machine translation (MT) is thus viewed in part as a recognition problem, and the statistical model we use specifically for target word selection may be compared with other language models in recognition tasks (e.g. Katz (1985) for speech recognition).</Paragraph>
    <Paragraph position="14"> In contrast to this view, previous approaches in MT typically resolved examples like (1) by stating various constraints in terms of the source language (Nirenburg, 1987). As explained before, such constraints cannot be acquired automatically and therefore are usually limited in their coverage.</Paragraph>
    <Paragraph position="15"> The experiment conducted to test the statistical model clearly shows that the statistics on lexical relations are very useful for disambiguation. Most notable is the result for the set of examples for Hebrew to English translation, which was picked randomly from foreign news sections of the Israeli press. For this set, the statistical model was applicable for 70% of the ambiguous words, and its selection was then correct for 92% of the cases.</Paragraph>
    <Paragraph position="16"> These results for target word selection in machine translation suggest using a similar mechanism even if we are interested only in word sense disambiguation within a single language! In order to select the right sense of a word, in a broad coverage application, it is useful to identify lexical relations between word senses. However, within corpora of a single language it is possible to identify automatically only relations at the word level, which are of course not useful for selecting word senses in that language. This is where other languages can supply the solution, exploiting the fact that the mapping between words and word senses varies significantly among different languages. For instance, the English words 'sign' and 'seal' correspond to a very large extent to two distinct senses of the Hebrew word 'lahtom' (from example (1)). These senses should be distinguished by most applications of Hebrew understanding programs. To make this distinction, it is possible to apply the same process that is performed for target word selection, by producing all the English alternatives for the lexical relations involving 'lahtom'. Then the Hebrew sense which corresponds to the most plausible English lexical relations is preferred. This process requires a bilingual lexicon which maps each Hebrew sense separately into its possible translations, similar to a Hebrew-Hebrew-English lexicon (like the Oxford English-English-Hebrew dictionary (Hornby et al., 1980)).</Paragraph>
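The monolingual extension described above can be sketched as follows. The sense identifiers, the lexicon format, and the scoring function are our own illustrative placeholders; only the relation counts for 'sign'/'seal' with 'treaty' reproduce figures quoted earlier in the text.

```python
# Hypothetical sense-separated bilingual lexicon: each Hebrew sense
# (written "lemma/sense-number", our own convention) maps to its
# possible English translations.
SENSE_LEXICON = {
    "lahtom/1": ["sign"],
    "lahtom/2": ["seal"],
}

# English relation counts; the 'sign'/'seal' figures are from the text.
ENG_REL_FREQ = {
    ("verb-object", "sign", "treaty"): 79,
    ("verb-object", "seal", "treaty"): 2,
}

def select_sense(senses, relation, arg,
                 lexicon=SENSE_LEXICON, freq=ENG_REL_FREQ):
    """Prefer the sense whose English lexical relations are most plausible."""
    def score(sense):
        return max(freq.get((relation, w, arg), 0) for w in lexicon[sense])
    return max(senses, key=score)

# For 'lahtom' with object 'treaty', sense 1 ('sign') is preferred.
```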
    <Paragraph position="17"> In some cases, different senses of a Hebrew word map to the same word in English as well. In these cases, the lexical relations of each sense cannot be identified in an English corpus, and a third language is required to distinguish among these senses. As a long term vision, one can imagine a system based on multilingual corpora, which exploits the differences between languages to automatically acquire knowledge about word senses. As explained above, this knowledge would be crucial for lexical disambiguation, and would also help to refine other types of knowledge acquired from large corpora.</Paragraph>
  </Section>
</Paper>