<?xml version="1.0" standalone="yes"?> <Paper uid="E91-1054"> <Title>SUBLANGUAGES IN MACHINE TRANSLATION</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> A SUBLANGUAGE CONCEPT FOR USE IN MT SYSTEMS </SectionTitle> <Paragraph position="0"> To my knowledge, it was Z. Harris who introduced the term 'sublanguage' (cf. Harris 1968, 152) for a portion of natural language differing from other portions of the same language syntactically and/or lexically. Definitions are given by Hirschman/Sager (1982), Quinlan (1989) and Lehrberger (1982).</Paragraph> <Paragraph position="1"> In order to be used in MT, such characterizations have to be formalized in a way adequate to the MT system in question. Such formalizable properties were combined in the definition of Luckhardt (1984) of what sublanguage can mean for MT: Text type represents the syntactic-syntagmatic level of a sublanguage, for which only a rather weak differentiation can be proposed (e.g. running text, word list, nominal structures etc.). Subject field represents the lexical level of a sublanguage, i.e. for every sublanguage a subject field is determined as being characteristic, so that the MT system may choose, on the basis of the sublanguage of a text, those translation equivalents from the lexicon which carry the same subject field code as the translated text.</Paragraph> <Paragraph position="2"> The lack of a commonly accepted subject field classification for MT is a serious problem. Such a classification is tentatively proposed in Luckhardt/Zimmermann 1991.</Paragraph> <Paragraph position="3"> Text function represents the lexical-pragmatic level. The function of a text (or its target group) may determine the choice of TL equivalents and of syntactic structure or style.</Paragraph> <Paragraph position="4"> The inhouse usage criterion covers a number of aspects determined by special requests of the MT user or the firm ordering the translation. 
This is first of all a question of inhouse terminology.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> SUBLANGUAGES FOR MT: MAINTENANCE REQUIREMENTS </SectionTitle> <Paragraph position="0"> A typical maintenance requirement card of the Bundessprachenamt (Federal Translations Agency) contains, among others, the following parts: 1. Designation of equipment: text type 'nominal structure', text function 'title', e.g.: 'Portable gasoline driven pump' 2. Tools, parts, materials: text type 'word list', text function 'accessories', e.g.: - key set, head screw, L-type hex - wrench, adjustable, open end 6&quot; - solvent, type II - screwdriver, flat tip, medium duty - rags, wiping 3. Procedure: text type 'instructions' (imperative style), text function 'maintenance instructions', e.g.: 'Accomplish annually or when directed as a result of operational test. Clean and inspect fuel filter and float valve; - remove pump housing covers, if applicable - observe no smoking regulation - remove choke knob and fuel connection - remove float chamber and gasket - clean all parts in solvent, allow to air dry - inspect filter for clogging, tears, and deterioration' (cf. Wilms 1983) The example indicates how clearly the different sublanguages of this type of document can be differentiated, and it ought to be possible in all MT systems to capture these differences, especially the typical 'imperative style' of the text type 'instructions'. In order to achieve this, it must be possible to weight rules or resulting structures, as in the SUSY system (cf. Thiel 1987). 
This is important because there is no absolute certainty that all predicate structures appear as imperatives in English or as infinitives in German.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> THE USE OF SUBLANGUAGES IN THE STS PROJECT AND SYSTEM </SectionTitle> <Paragraph position="0"> Since 1985 the SUSY system has been used as the core MT system within the computer-aided Saarbrücken Translation System (STS), i.e. in human-aided MT and in machine-aided human translation.</Paragraph> <Paragraph position="1"> Titles of scientific papers from German databases were machine-translated and postedited by humans, and abstracts were translated by translators (in all around 5 million words), with the MT system automatically supplying the correct terminology (from a terminology pool of more than 350,000 German-English entries). In the following, a specific aspect of sublanguage-dependent disambiguation is described.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> SEMANTICS OF PREPOSITIONS IN TITLES </SectionTitle> <Paragraph position="0"> A 'zu'-phrase at the beginning of a title (the top node of the nominal structure) always denotes a TOPIC (1st example), otherwise (3rd example) a purpose. 'Über' at the beginning also denotes a TOPIC. These rules apply only if the PP is not embedded in a predicate structure as in the 2nd example, where it fills the zu-valency of 'verpflichtet'. So, if the parser produces a structure like the following: SUBJECT: none GOAL: rückgewinnen</Paragraph> <Paragraph position="2"/> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> OBJECT: Wärme </SectionTitle> <Paragraph position="0"> there only has to be lexical transfer => oblige SUBJECT: none GOAL: recover 
OBJECT: heat to present a structure to generation that carries enough information to produce the English translation given above ('Obliged to recover heat').</Paragraph> <Paragraph position="1"> Similarly, examples 1 and 3 can be represented by the parser in a way which allows the generation of the correct target language equivalent, e.g.: The surface realization of the semantic roles TOPIC and OBJECT is a task for generation, i.e. transfer can be completely relieved of rules treating such semantic roles (cf. Luckhardt 1987).</Paragraph> </Section> </Paper>