<?xml version="1.0" standalone="yes"?>
<Paper uid="C90-2006">
  <Title>Towards Personal MT: general design, dialogue structure, potential role of speech</Title>
  <Section position="6" start_page="0" end_page="0" type="concl">
    <SectionTitle></SectionTitle>
    <Paragraph position="0"> disambiguate for translating correctly (by the corresponding elided forms: &amp;quot;plant&amp;quot;/&amp;quot;system&amp;quot;), no automatic solution is known. A given occurrence may be an elision or not. If yes, it is even more difficult to look for a candidate to the complete form in a hypertext than in a usual text.</Paragraph>
    <Paragraph position="1"> At level seven, file unit of t~anslation (the content of the shadow field) has been submitted to a first step of automatic analysis, which returns a surface structure showing ambiguities of bracketing (PP attachment, scope of coordination...). The questions to the writer should not be asked in linguistic terms. The idea is to rephrase the input text itself, that is, to present tile alternatives in suggestive ways (on screen, or using speech synthesis - see below).</Paragraph>
    <Paragraph position="2"> Some other ambiguities, for instance on reference (unresolved anaphora) or syntactic functions (&amp;quot;Which firm manages this office ?&amp;quot; --where is the subject ?) might be detected at this stage. They may be left tot the next step to solve (actually, this is a general strategy), or solved interactively at that point. In our view, that would best be done by producing paraphrases \[Zajac 1988\], or by &amp;quot;template resolution&amp;quot; \[16\].</Paragraph>
    <Paragraph position="3"> At level eight, the disambiguated surface structure has been submitted to the deep analysis phase, which returns a multilevel structure (decorated tree encoding several levels of linguistic interpretation, universal as well as language specific). Some ambiguities may appear during this phase, ,and be coded in the structure, such as ambiguities on semantic relations (deep cases), deep actualisation (time, aspect...), discourse type (a French infinitive sentence may be an order or not, for example), or theme/rheme distinction. Template or paraphrase resolution will be used to disambiguate, as no rephrasing of the text can often suffice (e.g. : &amp;quot;the conquest of the Barbarians&amp;quot;).</Paragraph>
    <Paragraph position="4"> A suggestion of \[6\] was to delay all interactions until transfer. The view taken here is rather to solve as soon as possible all the ambiguities which can not be solved automatically later, or only with much difficulty.</Paragraph>
    <Paragraph position="5"> For example, word sense disambiguation takes place quite early in the above scheme, and that may give class disambiguation for free.</Paragraph>
    <Paragraph position="6"> A more flexible scheme would be to ask about word senses early only if each lemma of the considered wordform has more than one acception. If not, the system could wait until after surface analysis, which reduces almost all morphosyntactic ambiguities. A v~mation would be to disambiguate word senses only after surface analysis Ires been done. A prototype should allow experimenting with various strategies.</Paragraph>
    <Paragraph position="7"> III. Place and quality of speech synthesis in Personal MT Speech synthesis has a place not only in the translation of spoken dialogues, but also in the translation of written texts. We actually think its introduction in Personal MT could be very helpful in enhancing ergonomy and allowing for more natural disambiguation strategies.</Paragraph>
    <Paragraph position="8"> 1. Speech synthesis and Personal MT Speech synthesis and MT in general Speech synthesis of translations may be useful for all kinds of MAT. In MT for the watcher, people could access Japanese technic,'d and scientific textual databases, for example, through rough English MT not only over computer networks, as is currently done in Sweden \[10\], but also via the telephone. To produce spoken translations could be even more useful in the case of rapidly changing information (political events, weather bulletins, etc. disseminated to a large public through computer or telephone networks).</Paragraph>
    <Paragraph position="9"> In the case of professional translation (MAT for the revisor or 1or the translator), the main area today is the translation of large technical documents. With the advent of widely available hypermedia techniques, these documents are starting to contain not only text and images, but also sound, used for instance to stress some important w,'maing messages.</Paragraph>
    <Paragraph position="10"> Personal MT could be used for translating technical documents as well as all kinds of written material not relying on creative use of language (i.e. poetry). It could also be used for communication within multilingual teams working together and linked by a network, or by phone. Finally, it could be used for the multilingual dissemination of information created on-line by a monolingual operator (sports events, fairs...) and made accessible in written form (electronic boards, miuitcl) as well as in spoken form (loudspeakers, radio, telephone), whence the need for speech synthesis.</Paragraph>
    <Paragraph position="11"> Hence, spoken output does not imply spoken input, and should be considered for all kinds of machine aided translation. As complete linguistic structures of the translations are created during the MT process, speech synthesis should be of better quality than current text-to-speech techniques can provide. This does not apply to MAT for the translator, however (although the translator, being a specialist, could perhaps be asked to insert marks concerning prosody, rhythm and pauses, analogous with formatting markups).</Paragraph>
    <Paragraph position="12"> Speech synthesis of dialogue utterances Dialogue utterances concern the communication between the system and the user, the translation process (reformulation, clarification), and the translation system (e.g. interrogation or modification of its lexical database).</Paragraph>
    <Paragraph position="13"> In Telephone Interpretation of dialogues, all dialogue utterances must obviously be in spoken form, the written form being made available only if the phone is coupled to a screen. In translation of written material, it could be attractive to incorporate speech synthesis in the dialogue itself, as an enhancement to its visual form, for the same ergonomic reasons as above, and because 4, 33 spoken alternatives might be intrinsically more suggestiw~ than written ones in order to resolve ambiguities -- pauses and melody may help to delimit groups and pinpoint their dependencies, while phrasal stress may give useful indications on the theme/rheme division.</Paragraph>
    <Paragraph position="14"> In the case of non-dialogue-based systems, there are only fixed messages, and on-line speech synthesis is not really necessary, because the acoustic codings can be precomputed. In the case of dialogue-based Machine Translation, however, an important part of the dialogue concerns 'variable elements, such as the translated texts or the dictionaries, where definitions or dismnbiguating questions could be inserted.</Paragraph>
    <Paragraph position="15"> Speech in PMT : synthesis of input texts or reverse translations Speech synthesis of input seems to be required when producing a document in several languages, with some spoken parts. It would be strange that the source language documentation not have the spoken parts, or that the author be forced to read them aloud. In the latter case, a space problem would also arise, because speech synthesis can produce an acoustic coding (later fed to a voice synthesis chip) much more compact than any representation of the acoustic signal itself.</Paragraph>
    <Paragraph position="16"> The concept of reverse translation could be very useful in PMT. The idea is to give to the author, who is presumed not to know the target language(s), some control over the translations. In human translation or interpretation, it often happens that the writer or speaker asks &amp;quot;what has been translated&amp;quot;. By analogy, a PMT system should be able to translate in reverse.</Paragraph>
    <Paragraph position="17"> Technically, it would do so by starting from the deep structure of the target text, and not from the target text itself, in order not to introduce spurious ambiguities (although having both possibilities could possibly help in detecting accidental ambiguities created in the target language).</Paragraph>
    <Paragraph position="18"> Note that speech synthesis of reverse translations might be ergonomically at~active, even if no spoken form is required for the final results (translations or input texts), because screens tend to become cluttered with too much information, and because reading the screen in detail quickly becomes tiring.</Paragraph>
    <Paragraph position="19"> 2. The need for very high quality speech synthesis in DBMT It has been surprisingly difficult for researchers in speech synthesis to argue convincingly about the need for very high quality. Current text to speech systems are quite cheap and seem acceptable to laymen. Of course, it is tiring to listen to them for long periods, but in common applications, such as telephone enquiry, interactions are short, or of fixed nature (time-of-day service), in which case synthesis can proceed from prerecorded fragments.</Paragraph>
    <Paragraph position="20"> DBMT, as envisaged above, seems to offer a context in which very high quality could and should be demanded of speech synthesis.</Paragraph>
    <Paragraph position="21"> Ergonomy First, the writer/speaker would be in frequent interaction with the system, even if each interaction is short. The overall quality of speech synthesis depends on three factors : voice synthesis (production of the signal from the acoustic coding) ; linguistic analysis (word class recognition, decomposition into groups), for correct pronunciation of individual words, or contextual treatment (liaisons in French) ; pragmatic analysis (communicative intent : speech act, theme/rheme division...), for pauses, rhythm and prosody.</Paragraph>
    <Paragraph position="22"> We will consider the first factor to be fixed, and work oil the linguistic and pragmatic ~spects.</Paragraph>
    <Paragraph position="23"> Of course, certain parts of the dialogue could be prerecorded, namely the messages concerning the interaction with the system itself. However, users might rather prefer a uniform quality of speech synthesis. In that case, these messages might be stored in the same acoustic coding format as the texts produced under linguistic control.</Paragraph>
    <Paragraph position="24"> Ambiguity resolution by rephrasing We have seen two main ways of disambiguating structural ambiguities in DBMT, namely rephrasing and paraphrasing. Rephrasing means to present the original text in different ways. Suppose we want to disambiguate the famous sentence &amp;quot;He saw a girl in the park with a telescope&amp;quot; by presenting the alternatives on a screen. We  in the park with a telescope the girl in the park with a telescope the girl in the park with a telescope the girl in the park with a telescope 5- He saw the girl in the park ................ with a te!e_.sc~! e~,..,,.,.</Paragraph>
    <Paragraph position="25"> 5 34 If the disambiguation happens orally, the spoken forms should be presented in tile same register as in the original (here, affirmative), but very clearly distinguished, so that a human could reconstruct the forms above. The availability of complete linguistic structures is necessary, but not sufficient, because understandability is not enough : distinguishability is a new requirement for speech synthesis.</Paragraph>
    <Paragraph position="26"> Other types of linguistic interactions In disambiguation by paraphrasing or template generation (generation of abbreviated paraphrases, as it were), questions should be generated, with their focus clearly indicated by stress arid prosody. For instance : Is the girl or the park with a telescope ? In the same manner, speech quality is very important if word sense disambiguation is clone orally. Since some new words or new ~nses of existing words may be added by the user, the disambiguation processes should apply ~.o their definitions in the same way as they do to the ~exts/utterances to be wanslated.</Paragraph>
    <Paragraph position="27"> All preceding remarks are of course even more valid :in the case of oral input, where speech is tile primary means of interaction, and the quality of the signal is reduced by the transmission channel.</Paragraph>
    <Paragraph position="28"> Conclusion The concept of Persona\] MT crystallizes many ideas from previous systems and research (text-critiquing, interactive MT, dialogue-based MT, Machine Interpretation of spoken dialogues, controlled languages...). However, the perspective of interacting with lhe author, not required to have any knowledge of Ihe target language(s), linguistics, or translation, puts Ihings in an original framework.</Paragraph>
    <Paragraph position="29"> While the development of systems of this nature poses old problems in a new way, and offers interesting new possibilities to the developers, their acceptability and usefulness will perhaps result more from their crgonomy than from their intrinsic linguistic quality, how necessary it may be.</Paragraph>
    <Paragraph position="30"> Promotion of the National Languages is becoming quite important nowadays, but, apart of efforts to teach a few foreign languages, no technical solution has yet been proposed to help people write in their own language and communicate with other people in their own l~guages. Personal MT could be such a solution.</Paragraph>
    <Paragraph position="31"> We strongly hope that many researchers will take interest in this new field of MT.</Paragraph>
  </Section>
class="xml-element"></Paper>