File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/92/c92-2115_intro.xml
Size: 4,708 bytes
Last Modified: 2025-10-06 14:05:11
<?xml version="1.0" standalone="yes"?> <Paper uid="C92-2115"> <Title>A Similarity-Driven Transfer System</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The transfer process in machine translation systems is, in general, more complicated than the processes of analysis and generation. One reason for this is that it relies heavily on human heuristic knowledge or the linguistic intuition of the rule writers. Unfortunately, linguistic intuition tends to be unable to control the process properly for a wide variety of inputs, because of the huge amount of data and the huge number of situations that need to be considered. However, rule writers must rely on their linguistic intuition to some extent, because there is no linguistic theory on lexical transfer \[7\]. Another reason \[81113 \] is that the transfer task is inherently a conglomeration of individual lexical rules. Therefore, the transfer process can be said to fall into a class of problems that cannot easily be controlled by the linguistic intuition of rule writers.</Paragraph> <Paragraph position="1"> In accordance with these observations, various attempts have been made to overcome the problems of transfer; they include knowledge-based MT \[12\], bilingual signs \[13\], and Tags for MT \[1\]. One such approach is case-based or example-based MT \[4\] \[9\] \[10\] \[11\]. The essential idea behind all case-based MT (CBMT) methods is that the system chooses the case (or example) most similar to the given input from the case base, and applies the knowledge attached to the chosen case to the input. (Footnote 1: This approach can be regarded as an application of case-based reasoning \[3\] to natural language translation.) Supposing that there is a corpus of parsed translation examples in which corresponding parts are linked to each other, we can regard those parsed translation examples as translation rules. 
A promising approach is therefore to make a transfer process that (1) chooses a set of translation examples, each source part of which is similar to a part of the input, and all source parts of which together cover the whole input, and (2) constructs an output by combining the target parts of the chosen translation examples. However, this does not mean that existing transfer knowledge should be abandoned. Rather, such transfer knowledge should be used as a fail-safe mechanism if there are no appropriate examples. In the similarity-driven transfer system (SimTran) we have developed, both translation examples and existing transfer knowledge are treated uniformly as translation patterns, and are called translation rules.</Paragraph> <Paragraph position="2"> In Figure 1, for example, (a) is the parsed dependency structure of an input Japanese sentence, &quot;kare ga kusuri wo nomu.&quot; Suppose that (b) is selected as the most similar translation rule for the part &quot;kare ga ... nomu&quot; from the translation rule-base, and that (c) is selected as the most similar translation rule for the part &quot;kusuri wo nomu,&quot; even though there are several translation candidates for the Japanese verb &quot;nomu.&quot; This figure illustrates what we would like to do; that is, to construct (d), the translated structure, by combining the target structures of the selected translation rules.</Paragraph> <Paragraph position="3"> To develop this kind of system, we must consider the following issues: (a) a metric for similarity, (b) a mechanism for combining target parts of rules, and (c) correspondence between the source part and the target part of a rule.</Paragraph> <Paragraph position="4"> To handle the last two issues, I developed a model called Rules Combination Transfer (RCT) \[14\].</Paragraph> <Paragraph position="5"> SimTran is RCT coupled with a similarity calculation method. 
In this paper, I will introduce RCT and the similarity calculation method used in SimTran.</Paragraph> <Paragraph position="6"> The next section defines the data structure for graphs, and the format of a translation rule. Section 3 presents a method for calculating the similarity between an input and the source part of a translation rule. Section 4 describes the flow of the transfer process in RCT. Section 5 gives examples of translation using SimTran, and Section 6 discusses related work.</Paragraph> <Paragraph position="7"> Some concluding remarks bring the paper to an end.</Paragraph> <Paragraph position="8"> ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 770 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992</Paragraph> </Section></Paper>