<?xml version="1.0" standalone="yes"?>
<Paper uid="C80-1065">
  <Title>PRESENT AND FUTURE PARADIGMS</Title>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Useful automatized translation must be considered in a problem-solving setting, composed of a linguistic environment and a computer environment. We examine the facets of the problem which we believe to be essential, and try to give some paradigms along each of them. Those facets are the linguistic strategy, the programming tools, the treatment of semantics, the computer environment and the types of implementation.</Paragraph>
    <Paragraph position="1"> Introduction Machine Translation has been a recurring theme ~n applied linguistics and computer science since the early fifties. Having not yet attained the enviable status of a science, it is best considered as an art in the same way as Knuth considers computer programming. Failure to recognize that MT must be treated in a problem-solving setting, that is, as a class of problems to be solved in various environments and according to various quality and cost criteria, has led and still leads to impassionate, antiscientific attitudes, ranging polemically between dreamy optimism and somber pessimism. Using the fairly large body of experience gained since the beginning of MT research, we try in this paper to extract the most essential facets of the problem and to propose some paradigms, alongside each of those facets, for usable computer systems which should appear in the near - or middle - term future.</Paragraph>
    <Paragraph position="2"> As a matter of fact, the phrase 'Machine Translation&amp;quot; is nowadays misleading and inadequate. We shall replace it by the more appropriate term &amp;quot;Automatized Translation&amp;quot; (of natural languages) and abbreviate it to AT.</Paragraph>
    <Paragraph position="3"> Part I tries to outline the problem situations in which AT can be considered. The following parts examine the different facets in  turn. Part II is concerned with the linguistic strategy, Part III with the programming tools, Part IV with semantics, Part V with the computer environment and Part VI with possible types of implementation.</Paragraph>
    <Paragraph position="4"> I - Applicability, quality and cost : a problem situation.</Paragraph>
    <Paragraph position="5"> 1. The past  Automatized translation systems were first envisaged and developed for information gathering purposes.</Paragraph>
    <Paragraph position="6"> The output was used by specialists to scan through a mass of documents, and, as RADC user report shows \[49\], the users were quite satisfied. This is no more the case with the growing need for the diffusion of information. Here, the final output must be a good translation. Second generation systems were designed with this goal in mind, and with the assumption that good enough translations cannot nOW be obtained automatically on a large scale, but for very restricted domains (see METEO). Hence, a realistic strategy is to try to automate as much as possible of the translation proc~s. This is the approach taken by GETA, TAUM, LOGOS, PROVOH and many others. Here, the problem is to answer existing needs by letting man and machine work together.</Paragraph>
    <Paragraph position="7"> Another approach comes from AI and is best exemplified in \[9\]. Here, the goal is more theoretical : how to simulate a human producing competent translations ? We will argue that the methods developed in this framework are not yet candidates for immediate applicability.</Paragraph>
    <Paragraph position="8"> PARADIGM 1 : Future MT systems will be AT (automated) systems rather than completeley automatic systems.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. Applicability
</SectionTitle>
    <Paragraph position="0"> Automated translation is clearly needed by large organizations as the EC or big industries having to translate tens or hundreds of millions of pages per year. Translations are very often urgent, and there is a human impossibility, as translations needs increase much faster than the number of available translators.</Paragraph>
    <Paragraph position="1"> Translators are specialized to certain kinds of texts. In the same way, AT systems, which are costly to develop and maintain, should be tailored to some kind of texts : AT is applicable when there is a constant flow of very homogeneous and repetitive texts, hopefully already in machine-readable form. AT should allow to integrate automatic rough translation and human on- or off- line revision.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3. Quality
</SectionTitle>
    <Paragraph position="0"> This is a crucial point, when it comes to acceptability by revisors and/or end-users.</Paragraph>
    <Paragraph position="1"> The quality of translation is a very subjective notion, relative to the need and knowledge of the reader. Traditional counts of grammatical errors, false senses,nansenses give only indications.</Paragraph>
    <Paragraph position="2"> - 430-We believe that quality should be estimated by the amount of revision work needed, to compare it with human (rough) translation, which is also very often of poor quality. As errors of AT systems are certain to be different from tho~ of humans, revisors must have a certain training before such a comparison can be made.</Paragraph>
    <Paragraph position="3"> Another measure could be to compare final translations, with the same amount of revision. We believe this not to be realistic, as cost must also be taken into account : translators will turn to revision, which is much faster than translation, so that they will gain time even if they work for revision.</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4. Cost
</SectionTitle>
    <Paragraph position="0"> The cost of AT should be divided into the costs of development, maintenance and use. It is of course related to the linguistic and computer environments. First, the graph of language-pairs should be considered, as development costs for an analyzer, say, may be charged to different pairs with the same source, of course if a~alys~ and synthesis are strictly monolingual.</Paragraph>
    <Paragraph position="1"> Easy maintenance calls for sophisticated computer systems having an interactive data-b~e aspect and concise metalanguages with good, incremental compilers.</Paragraph>
    <Paragraph position="2"> Machine time tends to be a minor component of the cost of use. Important savings come from the integration of the human revision in the AT system (see TAUM, LOGOS, GETA), as no further typing is required.</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5. Text typology
</SectionTitle>
    <Paragraph position="0"> AT systems developed for simple texts will certainly be less expensive (and probably better) than those for complex texts. Let use give a tentative hierarchy. The easiest texts are perhaps already preedited abstract~ regularly entered into data bases. Then come abs~y~a0~, which may present surprising difficulties mainly due to the tendency to write everything in one long and badly constructed sentence.</Paragraph>
    <Paragraph position="1"> Technical documentation, maintenance manuals, etc. are more and more written in a systematic way, which permits to tailor an AT system to their form and their content. See however TAUM-AVIATION reports for a sobering view on their apparent facility ! Minutes of meetings and working document~ may be still harder.</Paragraph>
    <Paragraph position="2"> Newspaper articles, even on scientific subject matters, tend to accumulate difficult or incorrect constructions, and also to jump far away from the current subject matter.</Paragraph>
    <Paragraph position="3"> Until AI methods (&amp;quot;third&amp;quot; or &amp;quot;fourth&amp;quot;) generation are practicable with really large data, we don't believe AT systems should even try to handle literary, normative or diplomatic texts. Revision would just be a new translation. PARADIGM 2 : AT systems are now applicable only in restricted environments and must be tailored to particular 'kinds of texts.</Paragraph>
    <Paragraph position="4"> II - Linguistic strategy I. Multilingual or pair-oriented systems ? Almost all AT systems divide the process of translation in three main logical steps, analysis, transfer and synthesis. At one extreme, some systems (like METEO) are strongly oriented towards a particular pair of languages. This means that analysis of the source language is performed with the knowledge of the target language. Target lexical units may appear during analysis, and syntactic or semantic ambiguities in the source are handled contrastively.</Paragraph>
    <Paragraph position="5"> The other extreme is the complete independence of analysis and synthesis. This is the approach taken in multilingually oriented systems (like ARIANE-78 \[7, 36, 50\], SUSY \[51\], SALAT \[18, 20\], TAUM/AVIATION \[33\]). This independence enhances modularity and economically justified, as analysis or synthesis are written once for each language. Analysis usually represents at least 2/3 of the programming effort and computing time.</Paragraph>
    <Paragraph position="6"> PARADIGM 3 : We advocate for multilingually oriented systems, where the basic software itself guarantees independence of analysis and synthesis.</Paragraph>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. What kind of analysis ?
</SectionTitle>
    <Paragraph position="0"> Should the analysis deliver a structural descriptor of the unit of translation, or a representation of its meaning, static or dynamic ? With the first approach, the transfer step includes necessarily a lexical transfer and a structural transfer. With the second one, the result of the analysis is a language-independent representation of the unit of translation (sentence, paragraph(s)). When the lexical units themselves are language-free, as in SAM \[9\], we call it &amp;quot;pure pivot&amp;quot; approach. When only the relations between units are language-free, we call it &amp;quot;hybrid pivot&amp;quot; approach (as in the first CETA \[34, 35\] system). In the first case, there is no transfer, in the second, transfer is purely lexical.</Paragraph>
    <Paragraph position="1"> The pivot approach is theoretically very elegant. However, past experience with it (on a corpus of more than a million words, see Vauquois (1975)) shows that it is quite inadequate in real situations, where, very often, this representation cannot be obtained, or not for all parts of the translation unit. Also, human professional translators seem very often to produce quite acceptable results without actually abstracting the deep meaning and rephrasing it, but rather by using standard syntactic transformations (like active-passive, reordering of nominal groups, passive-impersonal, splitting up sentences, etc.) and ... multiple  choice bilingual dictionaries. If deep comprehension fails, it is hence necessary and possible to fall back on lower levels of information.</Paragraph>
    <Paragraph position="2"> PARADIGM 4 : The result of analysis should be a structural descriptor of the unit of translation, where the lexical units are still source lexical units and where the linguistic information is &amp;quot;multi-level&amp;quot; : logical relations, syntactic functions, ~syntactic classes, semantic features (all universal for large families of languages), and trace information (proper to the source language).</Paragraph>
    <Paragraph position="3"> As we argue in Part IV, we don't think the result of analysis should include a dynamic comprehension of &amp;quot;what is described to happen&amp;quot;, at least in AT systems for the near future. Let us quote Carbonell &amp; al (1978) : &amp;quot;What kind of knowledge is needed for the translation of text? Consider the task of translating the following story about eating in a restaurant...&amp;quot;. Unfortunately, the texts to be translated, as we said in Part I, are not stories, but rather abstracts, manuals, working documents ... of a very different nature.</Paragraph>
  </Section>
  <Section position="8" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3. Strategical aspects
</SectionTitle>
    <Paragraph position="0"> There are some problems the analysis writer can not escape. Should problems such as ambiguities be solved as soon as they appear, or not be solved altogether, or is it better to devise strategies to decide as late as possible, or more complex heuristics ? PARADIGM 5 : AT systems to be developed in the near future should allow complex linguistic heuristics. That is, we feel that preferences computed by the use of weights derived from some frequency counts are not enough, and that linguists should program what they see as being essentially heuristic in the linguistic processes. Hence further requirements on the programming tools, which should at least include such control structures as controlled nondeterminism. null III- Programming tools : algorithmic models and metalanguages I. History The first MT researchers programmed directly in machine language. Until now, SYSTRAN systems are essentially made of programs and tables written in IBM 370 macroassembler. Then came systems based on a simple formal model, like context-free grammars and Q-systems. These systems rely on a general algorithm over which the rules have no control. Systems allowing such controls (PROLOG \[52\], ATEF \[14, 15\], ROBRA \[50\], ATNs \[47\] and derived models like REZO \[32\], PLATO, DEDUKT \[18, 20\]) were created in the seventies.</Paragraph>
    <Paragraph position="1"> Now, the programming languages used to write the linguistic part of AT systems include usual programming languages such as macroassembler, FORTRAN, ALGOL, PL/I, LISP, as well as specialized languages (see above).</Paragraph>
    <Paragraph position="2"> 2. The need for powerful data and control structures In our view, usual programming languages are inadequate as metalanguages to be used for writing the linguistic data and procedures in an</Paragraph>
    <Paragraph position="4"> include built-in complex data-types such as decorated trees and graphs as well as control structures for non-deterministic, parallel and heuristi c programming.</Paragraph>
    <Paragraph position="5"> Note that parallelism may be of two different kinds : processors working independently on independent data structures and processors working on a common data structure (e.g. a normal context-sensitive grammar is not equivalent to the same grammar used in parallel, see S ~ AB, A ~ a/-B, B / b/A-). Many recent specialized programming languages include a form of non-determinism, but very few have parallelism (ROBRA) or control functions for heuristics (PROLOG, ATEF, REZO).</Paragraph>
    <Paragraph position="6"> Of course, these metalanguages should include more classical control structures such as iteration, recursion or selection. Note that dictionaries are simply big &amp;quot;select&amp;quot; constructs,  possibly non-deterministic (one-many).</Paragraph>
    <Paragraph position="7"> 3. Complexity, decidability, adequacy  If one takes all necessary data-types with all possible operators and all control structures, the model obtained is very likely to have the (maximal) computing power of a Turing machine. Hence, no general bound or estimate for the dynamic complexity of programs written in that formalism may be given. On the other hand, as all we want to program in AT systems is certainly sdbrecursive, another approach is to define several subrecursive algorithmic models with associated known (or studyable) complexity classes. This was the original approach at GETA, with the ATEF, ROBRA, TRANSF and SYGMOR algorithmic models, designed to be decidable and of linear complexity.</Paragraph>
    <Paragraph position="8"> As a matter of fact, decidability is a very practical requirement. However, general constraints imposed to guarantee decidability may make certain things unnecessarily clumsy to write. Perhaps a better idea (implemented in ATNs, ATEF and ROBRA) is to build algorithmic models as extensions of decidable models, in such a manner that sources of undecidability are easy to locate, so that particular decidability proofs may be looked for. For example, the fundamental operator of ROBRA is the parallel application of the rules of a transformational grammar to an object tree.</Paragraph>
    <Paragraph position="9">  432-Normal iteration of this operator must terminate, due to some marking mechanism. However, a grammar in &amp;quot;free&amp;quot; iteration mode may never terminate.</Paragraph>
    <Paragraph position="10"> Last, but not least, these algorithmic models must be adequate, in the sense of ease and concision of writing. We sum up with</Paragraph>
    <Paragraph position="12"> ciated with the data types should be adequate, their complexity (time and space) should be reasonably bounded (O(n) to O(n3) ?) and there should be decidable underlying algorithmic models, so that so'urces of undecidability could easily be traced.</Paragraph>
  </Section>
  <Section position="9" start_page="0" end_page="0" type="metho">
    <SectionTitle>
IV - Semantics
</SectionTitle>
    <Paragraph position="0"> i. Two different notions Semantics are understood differently in linguistics, logic and computer science. In the latter, attention is focused on the ways of expressing data and processes. A system is said to be &amp;quot;syntactic&amp;quot; if it operates within the framework of formal language theory, that is by combinatorial processes on classes or &amp;quot;features'~ In a &amp;quot;static&amp;quot; semantic system, there is a fixed model of some universe, possibly represented as a thesaurus, or as a set of formulae in some logic, on which a formal language is interpreted.</Paragraph>
    <Paragraph position="1"> A system incorporates &amp;quot;dynamic semantics&amp;quot;, or &amp;quot;pragmatics&amp;quot;, if the interpretation of the data it processes may alter the model of the universe, or create a parallel model of some  &amp;quot;situation&amp;quot; in this universe.</Paragraph>
    <Paragraph position="2"> 2. Classical approaches Existing AT systems of reasonable size, that is incorporating several thousands of lexical units and quite exhaustive grammars, rely essentially on semantics by features. They may be quite refined and specialized to a domain  (e.g. METEO), and, in that case, this method may give surprisingly good results.</Paragraph>
    <Paragraph position="3"> Although the basic softwares allows to relate lexical units by using (monolingual or bilingual) dictionaries, this possibility is hardly used in the current applications at TAUM see TAUM/AVIATION) or at GETA (see \[50\]). For instance, classical relations such as antonymy, generalization, particularization are not coded in the dictionaries.</Paragraph>
  </Section>
  <Section position="10" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3. AI proposals
</SectionTitle>
    <Paragraph position="0"> AI proposals fall into two classes. The first refers essentially to static semantics, and may be illustrated by Wilks' &amp;quot;preference semantics&amp;quot; \[37-44\] or Simmons &amp;quot;semantic networks&amp;quot; \[30\].</Paragraph>
    <Paragraph position="1"> As applied to AT, these methods have only been incorporated in very small size test programs incorporating at most some hundreds lexical units. However, we feel that their simplicity and relative economy in coding effort make them usable in near-term AT systems, under the essential condition that, as in Wilks' model, it is not necessary to code completely every lexical unit, and that the associated computing effort is controlled by the linguist and undertaken only when necessary, that is when a problem (like ambiguity or anaphoric reference) has not been solved by s~mpler means.</Paragraph>
    <Paragraph position="2"> The second class of AI proposals relates to dynamic semantics and originates in the &amp;quot;frames&amp;quot; proposed by Minsky \[12\], and now proposed by other teams as &amp;quot;scripts&amp;quot;, &amp;quot;plans&amp;quot; or &amp;quot;goals&amp;quot; \[9, 27-29\]. They are certainly very attractive, but have been demonstrated on very particular and very small size situations.</Paragraph>
    <Paragraph position="3"> As we said above, texts to be translated with AT systems are more likely to be technical documents, abstracts, instructions for use, maintenance manuals, etc., than stories about restaurants or earthquakes. Each text doesn't rely on one clear-cut &amp;quot;script&amp;quot;, or &amp;quot;type of situation&amp;quot;, known a priori. Rather, such texts very often don't describe situations (see a computer manual), or, at the other extreme, their content might be understood as ... the description of hundreds of scripts (see aviation maintenance manuals).</Paragraph>
    <Paragraph position="4"> Hence, our objection to the use of such methods is twofold. First, the coding effort, in principle, would be enormous. Charniak's frame for painting \[;2\], although admittedly incomplete, is 19 pages of listing long (in a high-level language !), and we suppose he spent rather a long time on it. Just think of what it would cost to code 5000 basic frames, which we believe would be reasonable for, say, the domain of computers. Second, if the texts describe types of situations, then it is necessary to understand these texts in order to code the necessary scripts, which will be used ... to understand the text again ! This circularity has two consequences.</Paragraph>
    <Paragraph position="5"> First, only very general scripts might be humanly coded by using general previous knowledge about the domain. Second, if we want to use such methods extensively and at levels of detail at which they begin to give better results than simple approaches, then AI researchers should provide methods for the automatic extraction of scripts or frames from large bodies of texts, in an efficient (and perhaps interactive) way. That is, the use of such methods on wide domains and large amounts of texts entails automatic learning. Another problem is to automatically find which script is relevant to the current portion or text.</Paragraph>
  </Section>
class="xml-element"></Paper>