<?xml version="1.0" standalone="yes"?> <Paper uid="C92-2086"> <Title>A METHOD OF TRANSLATING ENGLISH DELEXICAL STRUCTURES INTO JAPANESE</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Common verbs such as &quot;take,&quot; &quot;have,&quot; &quot;make,&quot; and &quot;give&quot; appear frequently in English. These verbs quite often form verb + deverbal noun structures such as &quot;make an address,&quot; &quot;give an answer&quot; and &quot;take an approach.&quot; The verbs in these structures are almost devoid of lexical meaning but bear syntactic information such as tense, number and person, while the deverbal noun carries the lexical meaning. Such verbs are called &quot;delexical verbs&quot; [Collins] or &quot;light verbs&quot; [Live], terms which refer to their lexical emptiness.</Paragraph> <Paragraph position="1"> In this paper, we call such verbs Delexical Verbs (DV) and a &quot;DV + deverbal noun structure&quot; a Delexical Structure (DS), following the example of [Collins]. The frequency of these verbs in actual text can be seen, for example, in the COLLINS COBUILD ENGLISH DICTIONARY, where the paragraph on the entry &quot;take&quot; states: &quot;The most frequent use of take is in expressions where it does not have a very distinct meaning of its own, but where most of the meaning is in the noun that follows it...&quot; We have been developing an English-to-Japanese machine translation system for news broadcasts since 1989 [Aizawa] [Tanaka]. The precise translation of DS's in news texts is of great importance since they are quite frequent there. We counted the number of &quot;take&quot; + &quot;noun&quot; collocations (as verb + object) in 21 months' worth of AP texts using the parser of the machine translation system. &quot;Take&quot; collocated with 2,188 different nouns a total of 20,271 times. 
Of the collocating nouns, 87 of the 119 deverbal nouns listed in [Live] were found, accounting for about 28% (5,726) of all occurrences. This figure strongly supports the statement in the Collins dictionary.</Paragraph> <Paragraph position="2"> Failures in DS translation typically result from producing the primary sense of the DV instead of its delexical sense, which greatly degrades the quality of the translation. For example, &quot;make an address&quot; becomes &quot;enzetsu wo tsukuru,&quot; which means &quot;create an address.&quot; There are two possible ways of translating a DS. The first is the idiosyncratic approach, which lists all the DS's with their Japanese translations in a lexical system. This approach, however, suffers from several shortcomings: (1) The DS's are numerous and hard to list exhaustively: some DS's allow passivization, and some deverbal nouns can be modified by quantifiers, adjectives and so on. This doubles or triples the number of possible DS combinations.</Paragraph> <Paragraph position="3"> (2) This direct method is unable to infer the translation of a DS that is not defined in the lexicon.</Paragraph> <Paragraph position="4"> (3) The use of this approach increases the number of lexical entries, making lexical management difficult.</Paragraph> <Paragraph position="5"> Another approach is to synthesize the translation of a DS from the word sense of each component, using syntactic and semantic rules. The attractive part of this &quot;synthetic approach&quot; is that it does not suffer from the problems mentioned above. The &quot;monosemy approach&quot; proposed in [Ruhl] can be viewed as the extreme manifestation of the synthetic approach. 
A recent lexical framework [Bograev] proposes to generate word senses instead of listing them exhaustively in a lexicon, which is similar to the synthetic approach.</Paragraph> <Paragraph position="6"> However, from a practical viewpoint, not all DS's can be translated by this approach, as the necessary rules have not yet been factored out.</Paragraph> <Paragraph position="7"> We propose a DS translation method based mainly on synthesis, with an idiosyncratic approach employed where synthesis is difficult. To do this, DS's were categorized into three groups, called type-1, type-2, and idiomatic DS. The first two groups are translated by the synthetic method, and the last group is translated by an idiosyncratic approach, which can hopefully be integrated into the former as research reveals the underlying rules. This method should provide clear distinctions between idiomatic and synthesizable DS's through the use of a set of rules, which would facilitate the management of lexical systems.</Paragraph> <!-- Actes de COLING-92, Nantes, 23-28 août 1992 / p. 567 / Proc. of COLING-92, Nantes, Aug. 23-28, 1992 --> <Paragraph position="8"> The translation rules are quite simple for the following reasons: (1) English DS's have Japanese equivalents in many cases, and some parallels can be seen between them.</Paragraph> <Paragraph position="9"> (2) Many Japanese &quot;verbal nouns&quot; take the form of &quot;sahen-meishi,&quot; which become verbs simply by adding &quot;suru&quot; to the end. However, some DS's require translation in a passive sense. The conditions for this were factored out through semantic consideration and integrated into the translation rules.</Paragraph> <Paragraph position="10"> The rules were implemented in the machine translation system, and AP news texts were translated appropriately, demonstrating the feasibility of this method.</Paragraph> </Section> </Paper>