<?xml version="1.0" standalone="yes"?> <Paper uid="P06-2031"> <Title>Sydney, July 2006. (c) 2006 Association for Computational Linguistics Robust Word Sense Translation by EM Learning of Frame Semantics</Title> <Section position="4" start_page="0" end_page="239" type="metho"> <SectionTitle> 2 One Frame Two Languages </SectionTitle> <Paragraph position="0"> The challenge of translation disambiguation is to select the target word cl* with the correct semantic frame f, i.e. the word sense (cl,f), among the multitude of translation candidates Pr(cl|el). We suggest that while a source word in the input sentence might have multiple translation candidates, the correct target word must have the same sense, i.e., belong to the same semantic frame, as the source word (that is,</Paragraph> <Paragraph position="1"> Pr(cl,f|el,f) is high). For example, &quot;burn|Tang (tang)&quot; carries the &quot;cause_harm|damage&quot; sense, whereas &quot;burn|Shao (shao)&quot; carries the &quot;heat|cooking&quot; sense. The source sentence &quot;My hands are burned&quot; has the &quot;cause_harm|damage&quot; sense; therefore the correct translation of &quot;burn&quot; is &quot;Tang (tang)&quot;, not &quot;Shao (shao)&quot;. The frame semantics information of the source word can thus lead to the best translation candidate.</Paragraph> <Paragraph position="2"> Whereas some translation ambiguities are preserved across languages, most are not. In particular, for languages as different as English and Chinese, there is little overlap in how the lexicon is broken down (Ploux and Ji 2003). Some cognitive scientists suggest that a bilingual speaker tends to group concepts in a single semantic map and simply attach different words in English and Chinese to the categories in this map.</Paragraph> <Paragraph position="3"> Based on the above, we propose the one-frame-two-languages idea for constructing a bilingual word sense dictionary from monolingual ontologies.</Paragraph> <Paragraph position="4"> FrameNet (Baker et al. 
1998) is a collection of lexical entries grouped by frame semantics. Each lexical entry represents an individual word sense, and is associated with semantic roles and some annotated sentences. Lexical entries with the same semantic roles are grouped into a &quot;frame&quot; and the semantic roles are called &quot;frame elements&quot;. Each frame in FrameNet is a concept class and a single word sense belongs to only one frame. The Chinese HowNet, by contrast, represents a hierarchical view of lexical semantics in Chinese. HowNet (Dong and Dong 2000) is a Chinese ontology with a graph structure of word senses called &quot;concepts&quot;. Each concept contains seven fields, including lexical entries in Chinese, an English gloss, POS tags for the word in Chinese and English, and a definition of the concept including its category and semantic relations (Dong and Dong, 2000). Whereas HowNet concepts correspond roughly to FrameNet lexical entries, its semantic relations do not correspond directly to FrameNet semantic roles.</Paragraph> <Paragraph position="5"> A bilingual frame, as shown in Figure 1, simulates the semantic system of a bilingual speaker by having lexical items in two languages attached to the frame.</Paragraph> </Section> <Section position="5" start_page="239" end_page="243" type="metho"> <SectionTitle> 3 Automatic Generation of Bilingual </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="239" end_page="243" type="sub_section"> <SectionTitle> Frame Semantics </SectionTitle> <Paragraph position="0"> To choose &quot;burn|Tang (tang)&quot; instead of &quot;burn|Shao (shao)&quot; in the translation of &quot;My hands are burned&quot;, we need to know that &quot;Tang (tang)&quot; belongs to the &quot;cause_harm&quot; frame, but &quot;Shao (shao)&quot; belongs to the &quot;heat&quot; frame. In other words, we need to have a bilingual frame semantics ontology. 
Much like a dictionary, this bilingual ontology forms part of the translation &quot;lexicon&quot;, and can be used either by human translators or automatic translation systems.</Paragraph> <Paragraph position="1"> Such a bilingual frame semantics ontology also provides a simulation of the &quot;concept lexicon&quot; of a bilingual person, as suggested by cognitive scientists.</Paragraph> <Paragraph position="2"> Figure 1 shows an example of a bilingual frame that possibly corresponds to the semantic structure in a bilingual person.</Paragraph> <Paragraph position="3"> Figure 1. An example bilingual frame. We previously proposed using the Chinese HowNet and a bilingual lexicon to map the English FrameNet into a bilingual BiFrameNet (Fung and Chen 2004). We used a combination of frame size thresholding and taxonomy distance to obtain the final alignment between FrameNet frames and HowNet categories, to generate the BiFrameNet.</Paragraph> <Paragraph position="4"> Our previous algorithm had the disadvantage of requiring the ad hoc tuning of thresholds. This results in poor performance on lexical entries from small frames (i.e. frames with very few lexical entries). The tuning process also means that a development set of annotated data is needed. In this paper, we instead propose a fully automatic estimation-maximization (EM) algorithm to generate a similar FrameNet-to-HowNet bilingual ontology, without requiring any annotated data or manual tuning. As such, our method can be applied to ontologies of any structure, and is not restricted to FrameNet or HowNet.</Paragraph> <Paragraph position="5"> Our approach is based on the following assumptions: 1. A source semantic frame is mapped to a target semantic frame if many word senses in the two frames translate to each other; 2. 
A source word sense translates into a target word sense if their parent frames map to each other.</Paragraph> <Paragraph position="6"> The semantic frame in FrameNet is defined as a single frame, whereas in HowNet it is defined as the category. The formulae of our proposed algorithm are listed in Figure 2.</Paragraph> <Paragraph position="7"> Variable definitions:</Paragraph> <Paragraph position="9"> (el, ef): the word sense entry in ef.</Paragraph> <Paragraph position="10"> (All variables are assumed to be independent of each other.) Model parameters: Pr(cl|el): bilingual word pair probability from the dictionary. Pr(cf|ef): Chinese to English frame mapping probability.</Paragraph> <Paragraph position="11"> Pr(cl,cf|el,ef): Chinese to English word sense translation probability.</Paragraph> <Paragraph position="12"> (1) Word senses that belong to mapped frames are translated to each other: In the initialization step of our EM algorithm, all English words in FrameNet are glossed into Chinese using a bilingual lexicon with uniform probabilities Pr(cl|el). Next, we apply the EM algorithm to align FrameNet frames and HowNet categories. By using EM, we iteratively improve the probabilities of frame mapping in Pr(cf|ef) and word sense translations in Pr(cl,cf|el,ef). We first estimate sense translations based on the uniform bilingual dictionary probabilities Pr(cl|el). The frame mappings are then maximized using the estimated sense translations. The a priori lexical probability Pr(cl) is assumed to be one for all Chinese words. Consistent with the expected behavior of EM, we note that the overall likelihood of the model parameters improves until convergence after 11 iterations. We use the alignment output after the convergence step. That is, we obtain all word sense translations and frame mappings from the EM algorithm. The mapping between FrameNet frames and HowNet categories is obviously not one-to-one since the two languages are different. 
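The alternation between sense translation and frame mapping described above can be illustrated with a minimal Python sketch. It is not a reproduction of the formulae in Figure 2: it assumes that a sense translation is scored as the product Pr(cl|el) * Pr(cf|ef), normalized per English word sense, and that Pr(cf|ef) is re-estimated from the summed sense scores. The function names em_align and best_translation, the fixed iteration count, and the toy frames, categories, and dictionary entries are all hypothetical.

```python
from collections import defaultdict


def em_align(en_frames, zh_cats, dictionary, iterations=11):
    """Toy alternation between sense translation and frame mapping (assumed update rules)."""
    # Uniform dictionary probabilities Pr(cl|el), as in the initialization step.
    p_word = {}
    for el, translations in dictionary.items():
        for cl in translations:
            p_word[(cl, el)] = 1.0 / len(translations)
    # Assumption: start from a uniform frame mapping Pr(cf|ef).
    p_frame = {(cf, ef): 1.0 / len(zh_cats) for ef in en_frames for cf in zh_cats}
    p_sense = {}
    for _ in range(iterations):
        # Estimate Pr(cl,cf|el,ef), assumed proportional to Pr(cl|el) * Pr(cf|ef).
        p_sense = {}
        for ef, en_words in en_frames.items():
            for el in en_words:
                scores = {}
                for cf, zh_words in zh_cats.items():
                    for cl in zh_words:
                        w = p_word.get((cl, el), 0.0)
                        if w:
                            scores[(cl, cf)] = w * p_frame[(cf, ef)]
                if not scores:
                    continue  # no dictionary translation falls in any category
                total = sum(scores.values())
                for (cl, cf), s in scores.items():
                    p_sense[(cl, cf, el, ef)] = s / total
        # Re-estimate Pr(cf|ef) from the summed sense translation probabilities.
        counts = defaultdict(float)
        for (cl, cf, el, ef), p in p_sense.items():
            counts[(cf, ef)] += p
        for ef in en_frames:
            total = sum(counts[(cf, ef)] for cf in zh_cats)
            if total == 0.0:
                continue  # keep the previous mapping for frames with no evidence
            for cf in zh_cats:
                p_frame[(cf, ef)] = counts[(cf, ef)] / total
    return p_frame, p_sense


def best_translation(p_sense, el, ef):
    """Return the (cl, cf) pair with the highest Pr(cl,cf|el,ef)."""
    cands = {(cl, cf): p for (cl, cf, e, f), p in p_sense.items() if (e, f) == (el, ef)}
    return max(cands, key=cands.get)


# Hypothetical toy data: two English frames, two Chinese categories.
en_frames = {"cause_harm": ["burn", "hurt"], "heat": ["burn", "boil"]}
zh_cats = {"damage": ["tang", "shang"], "cook": ["shao", "zhu"]}
dictionary = {"burn": ["tang", "shao"], "hurt": ["shang"], "boil": ["zhu"]}

p_frame, p_sense = em_align(en_frames, zh_cats, dictionary)
print(best_translation(p_sense, "burn", "cause_harm"))
print(best_translation(p_sense, "burn", "heat"))
```

On this toy input the ambiguous word "burn" is paired with "tang" under cause_harm and with "shao" under heat, because the unambiguous words in each frame pull the frame mapping toward a single category.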
The initial and final mappings before and after EM iterations are shown in Figures 3a,b and 4a,b. Each point (i,j) in Figures 3a and 3b represents an alignment between FrameNet frame i and HowNet category j. Before EM iterations, each English lexical item is glossed into its (multiple) Chinese translations by a bilingual dictionary. The parent frame of the English lexical item and those of all its Chinese translations are aligned to form an initial mapping. This initial mapping shows that each English FrameNet frame is aligned to an average of 56 Chinese HowNet categories. This mapping is clearly noisy. After EM iterations, each English frame is aligned to 5 Chinese categories on average, and each Chinese category is aligned to 1.58 English frames on average.</Paragraph> <Paragraph position="13"> We also plot the histograms of one-to-X mapping between FrameNet frames and HowNet categories before and after EM iterations in Figure 4. The horizontal axis is the number X in one-to-X mapping between English and Chinese frames. The vertical axis is the occurrence frequency. For example, a point (i,j) indicates that j English frames each map to i Chinese categories. Figure 4 shows that using lexical glossing only, a large number of frames are aligned to over 150 Chinese categories, while only a small number of English frames align to relatively few Chinese categories. After EM iterations, the majority of the English frames align to only a few Chinese categories, significantly improving the frame mapping across the two languages.</Paragraph> <Paragraph position="14"> Figure 4a. Histogram of one-to-X mappings between English frames and Chinese categories. Most English frames align to many Chinese categories before EM learning.</Paragraph> <Paragraph position="15"> Figure 4b. Histogram of one-to-X mappings between English frames and Chinese categories. 
Most English frames only align to a few Chinese categories after EM learning.</Paragraph> <Paragraph position="16"> The above plots demonstrate the difference between FrameNet and HowNet structures. For example, &quot;boy.n&quot; belongs to the &quot;attention_getting&quot; and &quot;people&quot; frames in FrameNet.</Paragraph> <Paragraph position="17"> &quot;boy.n|attention_getting&quot; should translate into &quot;Cha Fang/waiter&quot; in Chinese, whereas &quot;boy.n|people&quot; has the sense of &quot;Nan Hai/male child&quot;. However, in HowNet, both &quot;Cha Fang/waiter&quot; and &quot;Nan Hai/male child&quot; belong to the same category, human|Ren. An example of word sense translation from our algorithm output is shown in Figure 5. The word sense translations of the FrameNet lexical entries represent the simulated semantic world of a bilingual person who uses the same semantic structure but with lexical access in two languages. For example, the frame &quot;cause_harm&quot; now contains the bilingual word sense pair We evaluate the accuracy of word sense translation in our automatically generated bilingual ontology by testing on the most ambiguous lexical entries in FrameNet, i.e. words with the highest number of frames. These words and some of their sense translations are shown in Table 1 below. The word sense translation accuracies of the above words are shown in Table 2. The results are highly positive given that previous work in word translation disambiguation using bootstrapping methods (Li and Li, 2003; Yarowsky 1995) achieved 80-90% accuracy in disambiguating between only two senses per word.</Paragraph> <Paragraph position="19"> The only weakness of our algorithm is its reliance on bilingual dictionaries. 
The sense translations of the words &quot;tie&quot;, &quot;roll&quot;, and &quot;look&quot; are less accurate due to the absence of certain translations in the dictionaries we used.</Paragraph> <Paragraph position="20"> For example, the &quot;bread/food&quot; sense of the word &quot;roll&quot; is not found in the bilingual dictionaries. We compare our results to those of our previous work (Fung and Chen 2004), using the same bilingual lexicon. Table 3 shows that we have improved the accuracy of word sense translation using the current method.</Paragraph> <Paragraph position="21"> We are not able to evaluate our algorithm on the same set of words as in (Li &amp; Li 2003; Yarowsky 1995) since these words do not have entries in FrameNet. We note in particular that whereas the previous algorithm in Fung and Chen (2004) does not</Paragraph> <Paragraph position="22"> perform well on lexical entries from small frames (e.g. on &quot;hold.v&quot; and &quot;issue.v&quot;) due to ad hoc manual thresholding, the current method is fully automatic and therefore more robust. In Fung and Chen (2004), semantic frames are mapped to each other if their lexical entries translate to each other above a certain threshold. If the frames are small and therefore do not contain many lexical entries, then these frames might not be correctly mapped. If the parent concept classes are not correctly mapped, then word sense translation accuracy suffers.</Paragraph> <Paragraph position="23"> The main advantage of our algorithm over our work in 2004 lies in the hill-climbing iterations of the EM algorithm. In the proposed algorithm, all concept classes are mapped with a certain probability, so no mapping is filtered out prematurely. 
As the algorithm iterates, it becomes more probable for the correct bilingual word senses to be translated to each other, and also more probable for the bilingual concept classes to be mapped to each other. After convergence of the algorithm, the output probabilities are at a (possibly local) optimum and the translation results are more accurate.</Paragraph> </Section> </Section> <Section position="6" start_page="243" end_page="244" type="metho"> <SectionTitle> 5 Towards Translation Disambiguation using Frame Semantics </SectionTitle> <Paragraph position="0"> As translation disambiguation forms the core of various machine translation strategies, we are interested in studying whether the generated bilingual frame semantics can complement existing resources, such as bilingual dictionaries, for translation disambiguation.</Paragraph> <Paragraph position="1"> The semantic frame of the predicate verb and the argument structures in a sentence can be identified by the syntactic structure, part-of-speech tags, head word, and other features in the sentence. The predicate verb translation corresponds to the word sense translation we described in the previous sections, Pr(cl,cf|el,ef). We intend to evaluate the effectiveness of bilingual frame semantics mapping in disambiguating between translation candidates. For the evaluation set, we use 202 randomly selected example sentences from FrameNet, which have been annotated with predicate-argument structures. In the first step of the experiment, for each predicate word (el,ef), we find all translation candidates of the predicate word in each sentence, and annotate them with their HowNet categories to form translated word senses (cl,cf) scored by Pr(cl,cf|el,ef). 
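The candidate enumeration just described, followed by selection of the highest-scoring candidate, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the program used in the experiment: the function names, the romanized translations "na" and "juxing", their category labels, and all probability values are hypothetical; only the pairing of (hold, detaining) with a "detain" category follows the text.

```python
def enumerate_candidates(el, dictionary, category_of):
    """List every (cl, cf) word sense reachable from the dictionary translations of el."""
    return [(cl, cf) for cl in dictionary.get(el, []) for cf in category_of.get(cl, [])]


def disambiguate(el, ef, dictionary, category_of, p_sense):
    """Return the candidate (cl, cf) with the highest Pr(cl,cf|el,ef)."""
    candidates = enumerate_candidates(el, dictionary, category_of)
    return max(candidates, key=lambda s: p_sense.get((s[0], s[1], el, ef), 0.0))


# Hypothetical toy resources for the predicate word (hold, detaining).
dictionary = {"hold": ["kouliu", "na", "juxing"]}
category_of = {"kouliu": ["detain"], "na": ["hold_in_hand"], "juxing": ["organize"]}
# Made-up sense translation probabilities standing in for the EM output.
p_sense = {
    ("kouliu", "detain", "hold", "detaining"): 0.80,
    ("na", "hold_in_hand", "hold", "detaining"): 0.15,
    ("juxing", "organize", "hold", "detaining"): 0.05,
}

best = disambiguate("hold", "detaining", dictionary, category_of, p_sense)
print(best)
```

A dictionary-only baseline would rank the three translations equally, since uniform Pr(cl|el) carries no frame information; the frame-conditioned score is what separates them in this sketch.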
For the example sentence in Figure 6, there are altogether 147 word sense translations for (hold,detaining).</Paragraph> <Paragraph position="2"> &quot;Under South African law police could HOLD the man for questioning for up to 48 hours before seeking the permission of magistrates for an extension&quot; (Figure 6). We then find the word sense translation with the highest probability among all HowNet and FrameNet class mappings from our EM algorithm. An example (el,ef) is (hold, detaining) and the cl*=argmax P(cl,cf|el,ef) found by our program is Kou Liu. (cl,cf)* in this case is (Kou Liu, detain|Kou Zhu). Human evaluators then look at the set of {cl*} and mark each cl* as either a true translation or erroneous. The accuracy of word sense translations on this evaluation set of example sentences is 74.9%.</Paragraph> <Paragraph position="3"> In comparison, we also look at Pr(cl|el), i.e. translation based on the bilingual dictionary only. The translation accuracy of using the bilingual dictionary only is a predictably low 15.8%. Our results are the first significant evidence that, in addition to bilingual dictionaries, bilingual frame semantics is a useful resource for the translation disambiguation task.</Paragraph> </Section> <Section position="7" start_page="244" end_page="244" type="metho"> <SectionTitle> 6 Related Work </SectionTitle> <Paragraph position="0"> The most relevant previous works include word sense translation and translation disambiguation (Li &amp; Li 2003; Cao &amp; Li 2002; Koehn and Knight 2000; Kikui 1999; Fung et al., 1999), frame semantic induction (Green et al., 2004; Fung &amp; Chen 2004), and bilingual semantic mapping (Fung &amp; Chen 2004; Huang et al. 2004; Ploux &amp; Ji, 2003; Ngai et al., 2002; Palmer &amp; Wu 1995). Other than the English FrameNet (Baker et al., 1998), we also note the construction of the Spanish FrameNet (Subirats &amp; Petruck, 2003), the Japanese FrameNet (Ikeda 1998), and the German FrameNet (Boas, 2002). 
In terms of learning method, Chen and Palmer (2004) also used EM learning to cluster Chinese verb senses.</Paragraph> <Section position="1" start_page="244" end_page="244" type="sub_section"> <SectionTitle> Word Sense Translation </SectionTitle> <Paragraph position="0"> Previous word sense translation methods are based on using context information to improve translation. These methods look at the context words and discourse surrounding the source word and use techniques ranging from bootstrapping (Li &amp; Li 2003) and EM iterations (Cao and Li, 2002; Koehn and Knight 2000) to the cohesive relation between the source sentence and translation candidates (Fung et al. 1999; Kikui 1999).</Paragraph> <Paragraph position="1"> Our proposed translation disambiguation method compares favorably to (Li &amp; Li 2003) in that we obtain an average of 82% precision on words with multiple senses, whereas they obtained precisions of 80-90% on words with two senses. Our results also compare favorably to (Fung et al. 1999; Kikui 1999), as the precision of our translation disambiguation of predicate verbs in input sentences is about 75% whereas their precisions range from 40% to 80%, albeit on an independent set of words.</Paragraph> <Paragraph position="2"> Automatic Generation of Frame Semantics. Green et al. (2004) induced SemFrame automatically and compared it favorably to the hand-constructed FrameNet (83.2% precision in covering the FrameNet frames). They map WordNet and LDOCE, two semantic resources, to obtain SemFrame. Burchardt et al. (2005) used FrameNet in combination with WordNet to extend coverage.</Paragraph> </Section> <Section position="2" start_page="244" end_page="244" type="sub_section"> <SectionTitle> Bilingual Semantic Mapping </SectionTitle> <Paragraph position="0"> Ploux and Ji (2003) proposed a spatial model for matching semantic values between French and English. 
Palmer and Wu (1995) studied the mapping of change-of-state English verbs to Chinese.</Paragraph> <Paragraph position="1"> Dorr et al. (2002) described a technique for the construction of a Chinese-English verb lexicon based on HowNet and the English LCS Verb Database (LVD). They created links between HowNet concepts and LVD verb classes using both statistics and a manually constructed &quot;seed mapping&quot; of thematic classes between HowNet and LVD. Ngai et al. (2002) induced a bilingual semantic network from WordNet and HowNet.</Paragraph> <Paragraph position="2"> They used lexical neighborhood information in a word-vector based approach to create the alignment between WordNet and HowNet classes without any manual annotation.</Paragraph> </Section> </Section> <Section position="8" start_page="244" end_page="244" type="metho"> <SectionTitle> 7 Conclusion </SectionTitle> <Paragraph position="0"> Based on the one-frame-two-languages idea, which stems from a hypothesis about the mind of a bilingual speaker, we propose automatically generating a bilingual word sense dictionary or ontology. The bilingual ontology is generated by iteratively estimating and maximizing the probability of a word translation given frame mapping, and that of frame mapping given word translations. We have shown that for the 11 most ambiguous words in the English FrameNet, the average word sense translation accuracy is 82%. Applying the bilingual ontology mapping to translation disambiguation of predicate verbs in another evaluation, the accuracy of our method is an encouraging 75%, significantly better than the 15.8% accuracy of using a bilingual dictionary only. Most importantly, we have demonstrated that bilingual frame semantics is potentially useful for cross-lingual retrieval, machine-aided translation, and machine translation.</Paragraph> </Section> </Paper>