<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-2107">
  <Title>Machine Translation by Case Generalization</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Machine Translation by Case
Generalization
</SectionTitle>
    <Paragraph position="0"> A case-base, in contrast to a set of rules, has inherent redundancy, because cases are collected without preselection. In the simplest case, if the sentence "A" has only one translation equivalent "a," then the single case "A" → "a" is enough to translate "A." But if we view the case-base as a collection of sentences, identical sentences rarely seem to occur (see footnote 1). Sentences can, however, be divided into smaller fragments that are meaningful units for translation according to linguistic models, which we call translation patterns.</Paragraph>
    <Paragraph position="1"> These fragments are combined for use in translating sentences. Fragments divided on the basis of translation patterns are more effective than sentences, because smaller fragments are more likely to match than full sentences.</Paragraph>
    <Paragraph position="2"> We generalize such fragments extracted according to each translation pattern, using a thesaurus, by replacing the words that occur in cases by more general concepts in the thesaurus. The words to be replaced are determined by their frequencies in the case-base.</Paragraph>
    <Paragraph position="3"> Frequently occurring fragments should be assigned more weight than less frequently occurring ones. The frequencies of fragments are used to weight generalized cases during generalization.</Paragraph>
    <Paragraph position="4"> Semantic distances are calculated for each translation pattern as the importances of generalized cases. Only categories that are meaningful for the translation pattern are stored as generalized cases, except that the most meaningful category is taken as a default. For example, the Japanese word for "dog" may be generalized into the concept &lt;dog&gt; (see footnote 2) for translation of the Japanese sentence meaning "a dog barks," whereas it may be replaced by the more general concept &lt;animal&gt; for other translation patterns in which the concept &lt;dog&gt; is not meaningful.</Paragraph>
    <Paragraph position="5"> Footnote 1: The case-base should contain natural sentences rather than examples, which are only the smallest fragments effective for translation. We distinguish CBMT from EBMT in accordance with this viewpoint.</Paragraph>
    <Paragraph position="6"> While generalizing cases, we can identify exceptional cases as those which cannot be generalized. Once we identify exceptions, then we can prevent such exceptions from being interpreted generally.</Paragraph>
    <Paragraph position="7"> In this way, cases are generalized according to the translation pattern into generalized cases with concepts as the values of their variables.</Paragraph>
    <Paragraph position="8"> In addition to generalized cases, rules can be formulated according to translation patterns. Generalized cases and manually written rules are treated as the same kind of object in CBMT. It is valuable to have rules available as well as cases, especially when the case-base contains insufficient cases. If rules are not available, there must be sufficient cases from the time the system is first used. Incremental development of any domain is possible only if general rules are available. In accordance with these basic ideas, we propose a method of machine translation in which cases are generalized. In our approach, we define linguistic patterns in translation. According to these patterns, the cases in the case-base are divided into smaller fragments and are generalized. Both rules and generalized cases are used to translate sentences.</Paragraph>
    <Paragraph position="9"> CBMT is divided into two sub-processes: (1) best matching, which searches for the most similar cases in the case-base, and (2) application control, which controls the combination of similar cases for translation. Application control is a general problem in machine translation, whereas best matching is a problem unique to CBMT. If the best-matching process returns certainty factors, the system can be controlled using these factors on the basis of some other model, such as Watanabe's [5].</Paragraph>
    <Paragraph position="10"> In this paper, we concentrate on best matching using a thesaurus.</Paragraph>
    <Paragraph position="11"> Footnote 2: Concepts are enclosed in angle brackets (&lt; and &gt;) in this paper.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 715 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992
3 Generalizing Cases
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Division and Linearization of
Cases
</SectionTitle>
      <Paragraph position="0"> First, we define a translation pattern (TPi) as follows.</Paragraph>
      <Paragraph position="2"> We call the number of variables in Vs the term number (Mi) of TPi.</Paragraph>
      <Paragraph position="3"> Next, we extract translation pattern cases (TPCi) from the case-base by applying the pattern matches described in TPi to all cases in the case-base.</Paragraph>
      <Paragraph position="4"> TPCi = [Ls, Cs, Lt], where Ls is the list of values of lexical variables in the source language (SL), Cs is the list of constraints in SL, and Lt is the list of values of variables in the target language (TL). If some patterns other than those specified in Ps are related in translation, those patterns are described in the constraints (Cs).</Paragraph>
      <Paragraph position="5"> These TPCis are linearized into linearized translation pattern cases (LTPCi).</Paragraph>
      <Paragraph position="6"> LTPCi : Ls → (Cs, Lt). We call the right-hand part of LTPCi the value (V). The examples in Fig. 1 are LTPCis extracted from Japanese-to-English translations of "NOUN ni VERB," where we assume a translation pattern in which an English preposition is determined by a binary relation between a Japanese noun and a Japanese verb. In the following section, we show how to generalize LTPCis into generalized linear translation pattern cases (GLTPCi) by replacing words with more general concepts in the thesaurus, and how to calculate degrees of importance for them.</Paragraph>
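      <Paragraph><![CDATA[As a minimal sketch of the structures above, a TPCi can be held as a triple and linearized into a pair that maps Ls to (Cs, Lt). The names TPC, linearize, and count_ltpcs are illustrative assumptions, as are the sample case "Toukyou"/"Iku" and all frequencies; none of them come from the paper's implementation.

```python
from collections import Counter
from typing import NamedTuple, Tuple


class TPC(NamedTuple):
    """One translation pattern case: [Ls, Cs, Lt]."""
    ls: Tuple[str, ...]  # values of lexical variables in the source language
    cs: Tuple[str, ...]  # constraints in the source language (often empty)
    lt: Tuple[str, ...]  # values of variables in the target language


def linearize(tpc: TPC):
    """Return the linearized form Ls -> (Cs, Lt) as a (key, value) pair."""
    return tpc.ls, (tpc.cs, tpc.lt)


def count_ltpcs(tpcs):
    """Count how often each linearized case occurs in the case-base."""
    return Counter(linearize(t) for t in tpcs)


# Hypothetical cases for the "NOUN ni VERB" pattern.
cases = [
    TPC(("Kayou", "Kimaru"), (), ("on",)),
    TPC(("Kayou", "Kimaru"), (), ("on",)),
    TPC(("Toukyou", "Iku"), (), ("to",)),
]
freq = count_ltpcs(cases)
```

Keeping the value (Cs, Lt) separate from the key Ls makes it straightforward to count how often each value occurs for a given word, which is what the frequency-based weighting in the next section relies on.]]></Paragraph>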
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Case Generalization by Means of
a Thesaurus
3.2.1 Creation of N-Term Partial Thesauri
</SectionTitle>
      <Paragraph position="0"> We create working thesauri, PTHi(j) (1 ≤ j ≤ Mi), one for each term. Each includes every word in the j-th term, and stores pairs of values and their frequencies in each word node.</Paragraph>
      <Paragraph position="1"> Here we define the importances used to weight generalized cases.</Paragraph>
      <Paragraph position="2"> Importance of a Link (IL) The importance of a link (IL) is the probability of occurrence of the cases that occurred in the subtree of PTHi(j). IL is defined as follows: $IL = \frac{S}{C_i}$, where $S$ is the total number of cases in the subtree connected with the link, and $C_i$ is the total number of LTPCis extracted from the case-base according to TPi.</Paragraph>
      <Paragraph position="3"> Importance of a Node (IN) The importance of a node (IN) shows the degree of variance of values in a subtree. IN is defined as follows.</Paragraph>
      <Paragraph position="4"> where Pk is the probability of each value in the subtree (see footnote 3).</Paragraph>
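      <Paragraph><![CDATA[The defining equation for IN does not survive in this copy. The sketch below assumes the form used in memory-based reasoning work such as Stanfill's, namely the square root of the sum of the squared value probabilities Pk. This is an assumption made for illustration, not the paper's confirmed formula; the function name node_importance and the sample counts are also hypothetical.

```python
import math


def node_importance(value_counts):
    """Assumed IN: sqrt(sum of Pk squared) over the values in a subtree.

    value_counts maps each value to the number of cases with that value.
    """
    total = sum(value_counts.values())
    if total == 0:
        return 0.0
    return math.sqrt(sum((count / total) ** 2 for count in value_counts.values()))


# A subtree where every case has the same value is maximally decisive.
uniform = node_importance({"on": 4})
# A subtree split evenly between two values is less decisive.
split = node_importance({"on": 1, "in": 1})
```

Under this assumed form, IN is 1 when all cases in the subtree share one value and falls toward 0 as the values become more evenly spread.]]></Paragraph>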
      <Paragraph position="5"> Importance of a Value (IV) The importance of a value L (IV) in node k is defined as follows. If node k is a word node, then $IV_{kL}$ is the frequency of value L in node k.</Paragraph>
      <Paragraph position="6"> where m is a node linked to node k, and $IV_{mL}$ is the importance of value L in node m. Footnote 3: We adopt the same expression as that used by Stanfill [6] and Sumita [4].</Paragraph>
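      <Paragraph><![CDATA[The equation for nodes other than word nodes does not survive in this copy, but Section 3.2.3 describes it in words: each child's value importances are multiplied by the importance of the connecting link, the products are summed, and the sum is multiplied by the importance of the node. The sketch below follows that description for upward propagation only; it omits subdivision and downward propagation. The Node class, its field names, and all numbers are illustrative assumptions, and IN is supplied by the caller rather than computed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    name: str
    total_cases: int = 0            # S: cases in the subtree under this node
    node_importance: float = 1.0    # IN, supplied by the caller
    frequencies: Dict[str, float] = field(default_factory=dict)  # word nodes only
    children: List["Node"] = field(default_factory=list)


def value_importances(node, case_total):
    """Return {value: IV} for a node, propagating upward from word nodes."""
    if not node.children:
        # Word node: IV is the frequency of the value in the node.
        return dict(node.frequencies)
    summed = {}
    for child in node.children:
        link_importance = child.total_cases / case_total  # IL = S / C_i
        for value, iv in value_importances(child, case_total).items():
            summed[value] = summed.get(value, 0.0) + link_importance * iv
    return {value: node.node_importance * s for value, s in summed.items()}


# Hypothetical subtree with two word nodes and four cases in total.
leaf_a = Node("Kayou", total_cases=2, frequencies={"on": 2})
leaf_b = Node("Getuyou", total_cases=2, frequencies={"on": 1, "in": 1})
parent = Node("<*X*>", total_cases=4, node_importance=0.5, children=[leaf_a, leaf_b])
result = value_importances(parent, case_total=4)
```

With these made-up numbers the value "on" ends up more important than "in" in the parent node, because it is both more frequent and supported by both children.]]></Paragraph>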
      <Paragraph position="7"> Importance of a Generalized Case (IC) The importance of a GLTPCi (IC) is defined as follows.</Paragraph>
      <Paragraph position="9"> where $IV_{jL}$ is the importance of value L, which is the same as the value of the GLTPCi.</Paragraph>
      <Paragraph position="10"> According to the definitions given in the previous section, ILs and INs are first set in all the links and nodes in PTHi(j), and IVs are calculated in the conceptual leaf nodes in PTHi(j).</Paragraph>
      <Paragraph position="11"> If an IV is not the maximum value in a conceptual leaf node, is greater than the predefined threshold value, and has a frequency greater than 2, the node is subdivided into more specific concepts.</Paragraph>
      <Paragraph position="12"> Subdivision occurs because a specific category that does not exist in the thesaurus is effective for a specific translation pattern. Only the difference from the thesaurus is kept, as the translation pattern thesaurus i (TPTHi).</Paragraph>
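      <Paragraph><![CDATA[The subdivision test can be sketched as a simple filter over the values in a conceptual leaf node. The function name values_to_subdivide, the dictionary layout, and the sample numbers are illustrative assumptions; the text does not state the threshold value.

```python
def values_to_subdivide(iv, freq, threshold):
    """Return the values in a conceptual leaf node that trigger subdivision.

    iv        maps each value to its importance (IV) in the node
    freq      maps each value to its frequency in the node
    threshold is the predefined IV threshold (not given in the text)
    """
    if not iv:
        return []
    top = max(iv.values())
    return sorted(
        value
        for value, score in iv.items()
        if score < top and score > threshold and freq.get(value, 0) > 2
    )


# Hypothetical numbers for a node such as <Time>.
iv = {"in": 0.40, "on": 0.25, "at": 0.05}
freq = {"in": 6, "on": 3, "at": 1}
candidates = values_to_subdivide(iv, freq, threshold=0.1)
```

Here "on" is not the top value but is both important and frequent enough, so it would motivate a more specific concept; "at" is filtered out as too rare.]]></Paragraph>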
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2.3 Propagation of Importance of Values
</SectionTitle>
      <Paragraph position="0"> Next, we calculate IV in all nodes other than conceptual leaf nodes by propagating IV. The propagation is done by multiplying the importances of values by the importances of links; the sum of all the propagated values is then multiplied by the importance of the node. Propagation first proceeds upward, starting from the conceptual leaf nodes. During upward propagation, downward propagation is done if a child node is a conceptual node and a propagated value is greater than the maximum importance of values in the child node. Downward propagation prevents overgeneralization. We show examples of the results of importance calculation in Fig. 2 and Fig. 3, for the first and second terms respectively. In Fig. 2, subdivision occurred in the node &lt;Time&gt; and the new node &lt;*X*&gt; was created. A downward propagation occurred in the node &lt;Concrete&gt; in Fig. 2. The word "in" was made more important than the word "to" in the node  Based on the importances calculated by the method described in the previous section, LTPCis are generalized in the j-th term. If the value with the highest IV in the child node is the same as the value with the highest IV in the parent node, then the word in the term is generalized to the concept in the parent node. This process of generalization is repeated until no further generalization is possible, and only the most generalized cases are kept. If identical cases are obtained as a result, only one case is kept.</Paragraph>
      <Paragraph position="1"> We show an example of intra-term generalization of ["Kayou" (Tuesday), "Kimaru" (decide)] → ([], ["on"]). Initially, the first term "Kayou" (Tuesday) is generalized. The value of this case, ([], ["on"]), is the same as the value with the highest IV in the parent node &lt;*X*&gt; (see Fig. 2), so "Kayou" (Tuesday) is replaced by &lt;*X*&gt;. The value ([], ["on"]) is not the value with the highest IV in the parent node of &lt;*X*&gt;, and therefore generalization stops for the first term. Next, the second term "Kimaru" (decide) is generalized. In the parent node &lt;Decision&gt; of "Kimaru" (decide), the value of the case is one of the values with the highest IV. Consequently, parent nodes are checked to determine which value is more important. In the root node, ([], ["on"]) is less important than ([], ["in"]), so no generalization occurs for the second term. Finally, [&lt;*X*&gt;, "Kimaru" (decide)] → ([], ["on"]) is obtained as the result of intra-term generalization.</Paragraph>
      <Paragraph position="2"> The result of intra-term generalization for all the LTPCis in Fig. 1 is shown in Fig. 4.</Paragraph>
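      <Paragraph><![CDATA[The climbing step of intra-term generalization can be sketched as follows. This is a simplified illustration under stated assumptions: values are reduced to the preposition string, a tie for the highest IV stops the climb instead of consulting further parent nodes as the paper does, and the parent links and IV numbers are hypothetical stand-ins for Fig. 2 and Fig. 3, which are not available here. The function name generalize_term is also an assumption.

```python
def generalize_term(word, value, parent_of, iv):
    """Climb from word while value is the single top-IV value of the parent.

    parent_of maps a word or concept to its parent concept.
    iv        maps a concept to {value: IV}.
    """
    current = word
    while current in parent_of:
        parent = parent_of[current]
        scores = iv.get(parent, {})
        if not scores:
            break
        best = max(scores.values())
        winners = [v for v, s in scores.items() if s == best]
        if winners != [value]:
            break  # the value is not the unique top value, so stop here
        current = parent
    return current


# Hypothetical thesaurus fragment and importances.
parent_of = {
    "Kayou": "<*X*>",
    "<*X*>": "<Time>",
    "Kimaru": "<Decision>",
    "<Decision>": "<>",
}
iv = {
    "<*X*>": {"on": 0.3},
    "<Time>": {"in": 0.2, "on": 0.1},
    "<Decision>": {"on": 0.1, "in": 0.1},
}
first = generalize_term("Kayou", "on", parent_of, iv)
second = generalize_term("Kimaru", "on", parent_of, iv)
```

With these made-up numbers the sketch reproduces the outcome of the worked example: the first term rises to <*X*> and stops, and the second term stays at "Kimaru".]]></Paragraph>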
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2.5 Inter-Term Generalization of LTPCi
</SectionTitle>
      <Paragraph position="0"> Next we generalize cases over terms. Inter-term generalization takes ICs into consideration. If Mi = 1, then the result of intra-term generalization with ICs is the generalized linear translation pattern case i (GLTPCi). If Mi &gt; 1, j-th term maximum generalization (1 ≤ j ≤ Mi) is done for each term. In j-th term maximum generalization, terms other than the j-th term are fixed first and the j-th term is generalized as much as possible. Then, the maximum possible generalization is done for the remaining terms in turn.</Paragraph>
      <Paragraph position="1"> If Mi &gt; 1, then Mi × (Mi − 1) GLTPCis are obtained.</Paragraph>
      <Paragraph position="2"> If identical cases are obtained as a result, only one case is kept.</Paragraph>
      <Paragraph position="3"> We show an example of inter-term generalization of [&lt;Direction&gt;, &lt;Abstract&gt;] → ([], ["to"]). Initially, first-term maximum generalization is done. IVs in the node &lt;Abstract&gt; are shown below (see PTHi(2) in Fig. 3).</Paragraph>
      <Paragraph position="5"> IVs in the node &lt;Abstract&gt;, which is the parent node of &lt;Direction&gt;, are shown below (see PTHi(1) in Fig.</Paragraph>
      <Paragraph position="6"> Since ([], ["to"]) doesn't have the highest importance, the case is not generalized any further in the first term. Next, the second term is generalized. The IVs in the node &lt;Direction&gt; are shown below (see Fig. 2).</Paragraph>
      <Paragraph position="7"> ([], ["to"]) : 0.1 IVs in the node &lt;&gt;, which is the parent node of &lt;Abstract&gt;, are shown below (see Fig. 3).</Paragraph>
      <Paragraph position="9"> Since ([], ["to"]) has the highest importance, the second term is generalized into the root node &lt;&gt;, and the generalization stops because there are no more parent nodes.</Paragraph>
      <Paragraph position="10"> Therefore [&lt;Direction&gt;, &lt;&gt;] → ([], ["to"]) 0.108 is the result of first-term maximum generalization of [&lt;Direction&gt;, &lt;Abstract&gt;] → ([], ["to"]). The result of inter-term generalization for all the LTPCis in Fig. 1 is shown in Fig. 5.</Paragraph>
    </Section>
    <Section position="5" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2.6 Addition of Translation Rules
</SectionTitle>
      <Paragraph position="0"> Finally, translation rules (TRis) are added to the set of GLTPCis. TRis are descriptions in which concepts are specified as the values of the variables of Ls of LTPCis.</Paragraph>
      <Paragraph position="1"> If the same case already exists in the set of GLTPCis, then the rule is not added. If only the value of the case is different from TRi, then the case is replaced by TRi. Otherwise, TRi is added with its IC. The ICs for TRis are calculated in the same way as for GLTPCis.</Paragraph>
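      <Paragraph><![CDATA[The three-way rule for adding a TRi can be sketched with a dictionary keyed on Ls. The function name add_rule, the dictionary layout, and the rule entries are illustrative assumptions. The sketch also assumes that each Ls has at most one stored value and that a replaced case takes the rule's IC, neither of which the text states explicitly. The two existing cases and their ICs are the ones quoted in the worked examples.

```python
def add_rule(cases, key, value, ic):
    """Merge one translation rule into the generalized case set.

    cases maps Ls (a tuple of words or concepts) to (value, IC).
    Returns "kept", "replaced", or "added" to report what happened.
    """
    existing = cases.get(key)
    if existing is not None and existing[0] == value:
        return "kept"  # the same case already exists, so nothing is added
    cases[key] = (value, ic)
    return "replaced" if existing is not None else "added"


cases = {
    ("<*X*>", "<>"): ("on", 0.306),
    ("<Direction>", "<>"): ("to", 0.108),
}
# Hypothetical rules, for illustration only.
outcome_same = add_rule(cases, ("<*X*>", "<>"), "on", 0.5)
outcome_diff = add_rule(cases, ("<Direction>", "<>"), "toward", 0.2)
outcome_new = add_rule(cases, ("<Place>", "<>"), "at", 0.1)
```

Treating rules and generalized cases as entries in the same table reflects the statement in Section 2 that both are handled as the same kind of object.]]></Paragraph>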
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Best-Matching Algorithm
</SectionTitle>
    <Paragraph position="0"> The TPis, the set of GLTPCis, the TPTHis, and the thesaurus are used in best matching. The values of the variables in Vs are extracted from the input sentence by applying pattern matching according to the description of TPi. The best-matching process retrieves the most similar case from the set of GLTPCis.</Paragraph>
    <Paragraph position="1"> If Mi = 1, words equivalent to the word that is the value of the variable in Vs are first searched for among the values of the corresponding variable in Ls of the GLTPCis. If none are found, upper concepts retrieved from either TPTHi or the thesaurus are searched in turn. The GLTPCi that is found first is the shortest-distance GLTPCi (SDGLTPCi). If Cs in the GLTPCi is not null, then it is also evaluated to determine whether it is true or false.</Paragraph>
    <Paragraph position="2"> If Mi &gt; 1, the j-th term shortest-distance GLTPCis (SDGLTPCj) of each term are searched for. If Mi = 2, SDGLTPC1 holds the shortest-distance word or concept in the first term, and SDGLTPC2 holds the shortest-distance word or concept in the second term.</Paragraph>
    <Paragraph position="3"> If Mi &gt; 1, (Mi − 1) SDGLTPCjs are obtained for each j-th term. A total of Mi × (Mi − 1) SDGLTPCjs are obtained. The SDGLTPCj with the highest importance is selected as the SDGLTPC.</Paragraph>
    <Paragraph position="4"> We show an example of retrieving the most similar case for "Getuyou (Monday) ni Huru (rain)." Suppose the parent node of "Huru" is &lt;Climate&gt;. First, SDGLTPC1 is searched for in the GLTPCis (see Fig. 5). "Getuyou" does not exist in any first term in the set of GLTPCis. Therefore &lt;*X*&gt;, which is the parent node of "Getuyou," is searched for, and [&lt;*X*&gt;, &lt;&gt;] → ([], ["on"]) 0.306 is found. The second term of this GLTPCi is an upper concept of "Huru," so this is SDGLTPC1. Next, SDGLTPC2 is searched for and is found to be the same as SDGLTPC1. Consequently, the most similar GLTPCi is [&lt;*X*&gt;, &lt;&gt;] → ([], ["on"]) 0.306, and the word "on" is set as the preposition.</Paragraph>
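    <Paragraph><![CDATA[The retrieval walk in this example can be sketched as below. It is a simplified illustration under stated assumptions: constraints (Cs) are ignored, values are reduced to the preposition string, ties at the same distance are resolved by list order, and the parent links for <*X*>, <Time>, and <Climate> are guesses, since Fig. 2, Fig. 3, and Fig. 5 are not available here. The function names are hypothetical. The two cases and their importances are the ones quoted in the text.

```python
def ancestors(word, parent_of):
    """Return the word followed by its upper concepts, nearest first."""
    chain = [word]
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    return chain


def shortest_distance_case(words, j, cases, parent_of):
    """Find the case whose j-th term is nearest to words[j].

    The other terms of the case must equal, or be upper concepts of,
    the corresponding input words.
    """
    chains = [ancestors(w, parent_of) for w in words]
    for concept in chains[j]:
        for terms, value, ic in cases:
            others_ok = all(
                terms[k] in chains[k] for k in range(len(words)) if k != j
            )
            if terms[j] == concept and others_ok:
                return terms, value, ic
    return None


def best_match(words, cases, parent_of):
    """Pick the shortest-distance case with the highest importance."""
    found = [
        shortest_distance_case(words, j, cases, parent_of)
        for j in range(len(words))
    ]
    found = [c for c in found if c is not None]
    return max(found, key=lambda c: c[2]) if found else None


parent_of = {
    "Getuyou": "<*X*>", "<*X*>": "<Time>", "<Time>": "<>",
    "Huru": "<Climate>", "<Climate>": "<>",
}
cases = [
    (("<*X*>", "<>"), "on", 0.306),
    (("<Direction>", "<>"), "to", 0.108),
]
match = best_match(["Getuyou", "Huru"], cases, parent_of)
```

As in the worked example, both the first-term and second-term searches arrive at the case [<*X*>, <>] with importance 0.306, so "on" is chosen.]]></Paragraph>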
  </Section>
</Paper>