<?xml version="1.0" standalone="yes"?> <Paper uid="C67-1011"> <Title>Production of Text-related Technical Glossaries by Digital Computer (mimeo, undated); La Terminologie, Problèmes de Coopération</Title> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> ASSORTED PHRASING-FRAMES </SectionTitle> <Paragraph position="0"> [Table of phrasing-frames garbled in the source. The one fully legible frame is: HE WENT A TO THE .......... [NOUN]; another frame ends in [ABSTRACT NOUN].] Key: .......... stressed word omitted; () silent beat; A do not translate though stressed. N.B. Other markers, e.g. the marker J, which sets in operation a routine to inter-connect syntactically connected phrasings, will be discussed in a further publication. On receiving the phrasing-frame, the machine questions the operator in order to make him specify further, from his general knowledge of the text and of its subject, what the context of the particular phrasing-frame is. The example given below, in which the correct French translation of an English verb of motion (one of the notoriously difficult English forms to translate into French) is progressively specified, shows how complicated this questioning can be. Not more than three rounds of questioning are allowed, and when the operator has produced his specification, the unique correct translation of the frame is stored in the immediate-access store of the machine (see Appendix B). 
In the example set out below, however, the different French translations of all possible answers obtainable under Round II and Round III of the interaction are set out immediately underneath the English statements which the machine would actually print out on the console, in order to show the underlying reason for the whole enterprise.</Paragraph> <Paragraph position="2"> and immediately, for the text: He flew to the frontier. The machine prints out the translation:</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> IL PRIT L'AVION POUR LA FRONTIÈRE </SectionTitle> <Paragraph position="0"> Detailed examination of this example shows that behind this particular way of making an on-line system interact with an operator there lies a strategy, a hypothesis and a prospect. V. The strategy is at all costs to avoid post-editing, but to allow maximal pre-processing of the input text by the machine interacting with the operator, all the question-and-answer routines being in the operator's native language.</Paragraph> <Paragraph position="1"> The argument against post-editing (as the U.S. Report conclusively shows) is that it is either mechanical, e.g. the resolution of French gender-concord, in which case the machine itself can be programmed to do it; or it is creative and/or intuitive, in which case it cannot be done at all without extensive reference back to the input text (who could interpret &quot;Shakespeare Overspat&quot;, which was the title of a Russian &quot;Pravda&quot; article as translated by the U.S. Air Force computer? The real meaning was &quot;Shakespeare is now a back number&quot;), in which case the post-editor might as well have translated the whole text himself in the first place.</Paragraph> <Paragraph position="2"> To avoid post-editing, however, the output produced by a man-machine reactive M.T. 
program has either got to be a blank space (when the program fails), or a unique translation which is known to be correct. Now uniqueness of output can be brutally produced, as everybody knows, by programming the machine to print out only one of any set of alternatives. Correctness, however, can only be achieved by the target-language translation having been approved beforehand by the operator, from cues which the machine gives him, or which he gives the machine, in his own language, i.e. in the source language. The real use, therefore, of the three-stage question-and-answer routine exemplified above, is that it enables an Englishman with a console but who does</Paragraph> <Paragraph position="3"> not know any French to produce a unique and correct idiomatic French translation of an English text, provided that he is prepared to take the trouble to pre-process the English text so that it is finally restated in a Frenchified sort of way. After this the machine can of course transcribe it into French.</Paragraph> <Paragraph position="4"> In other words, a machine-aided translation program basically consists a) of programming the machine to pick up the ambiguities in the source language which the target-language will not tolerate (not the other way round) and of making the operator produce the additional information which will resolve them.</Paragraph> <Paragraph position="5"> Take, as example, the phrasing /for a standby force/.</Paragraph> <Paragraph position="6"> This looks technical and unambiguous in the English, but comparative examination of bi-lingual text showed that it translated into French (and in the same document) as either i) /d'une force d'urgence/ i.e. /&quot;of an emergency force&quot;/ or ii) /pour une force de réserve/ i.e. /&quot;for a reserve force&quot;/, according to sophisticated considerations of context. 
Therefore, when the operator types the technical term STANDBY FORCE into the machine, in order to fill up the gaps in the phrasing-frame /FOR A .......... [NOUN] [ADJ]/, the machine has got to answer him back:</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> DO YOU MEAN: A) AN EMERGENCY FORCE; B) A RESERVE FORCE </SectionTitle> <Paragraph position="0"> The operator then has to choose, and type back into the machine, the alternative he wants, after which the machine can make the translation.</Paragraph> <Paragraph position="1"> b) Similarly, a way must be found of enabling the machine to pick up, from cues in the source language, the metaphors and idioms which the target-language will not tolerate, and to assist the operator to rephrase the stretch of text concerned in terms which the target-language will tolerate. The difference between idioms and metaphors is that idioms can be mechanically picked up and matched by an idiom dictionary, whereas metaphors can't.</Paragraph> <Paragraph position="2"> c) Similarly again, the machine must be programmed to pick up, from the source language input, the constructions which the target-language will not tolerate, and assist the operator to transform these into constructions which the target-language will tolerate (e.g. to turn English passives into French actives, and the adjectives of English adjective-noun strings into French post-positioned prepositional phrases).</Paragraph> <Paragraph position="3"> Thus the whole translating work, really, is done within the source language. 
Once you can preprocess your English input into a Frenchified shape in the respects a), b), c) above, the machine can transform this Frenchified English, with no trouble at all, into elegant French.</Paragraph> <Paragraph position="4"> The strategic hope, of course, is that by analysing the printouts produced by a large number of sequences of such machine-man interactions, in translating many types of texts, we shall ultimately learn how to make the machine answer, as well as ask, some of the rounds of questions (as is already being done in a whole range of machine &quot;edit&quot; programs), so that the machine shall progressively become able to do more of the Frenchification process for itself; thus finally producing (if the machine ever became able completely to take over) exceedingly slow but reliable machine translation, which could (subsequently again) be speeded up.</Paragraph> <Paragraph position="5"> Before further discussion of the extent to which this strategic hope is a real hope and how much a mere pious aspiration, i.e. the prospect, I will now set out the hypothesis (as opposed to the strategy) of the experiment. VI. The hypothesis which the translation-model gives is the following: Translation consists of the pairing of a phrasing, P1, in Language A, with another phrasing, P2, in Language B, in such a way that P2 forms an analogy with P1, in a sense of &quot;analogy&quot; which can be ostensively defined in terms of the model.</Paragraph> <Paragraph position="6"> Thus translating a phrasing into another language is no different (according to this translation-model) from defining it, producing a parallel-phrasing to it, reiterating or otherwise further specifying it, in the same language. The advantage of the model is that unambiguous criteria of the formation of such a pairing can be given. 
For any response given by the operator to a machine-question will form such a pair: the first member of the pair will be the original phrasing (in English), the second the chosen machine-specification (called by us a template),</Paragraph> <Paragraph position="7"> also in English. Then another pair will be formed whenever the machine translates the operator's final choice of template into French; the first member of the pair, in this case, will be the final template chosen, and the second member will be the translation into French, with the stressed words translated and inserted into their correct places. Then again, an intermediate pair may be formed of which each member is a template; the first member of such a pair will be a more abstract template chosen at the first round of man-machine interaction, while the second member of it will be the more concrete template chosen by the operator at the second round of man-machine interaction; and so on recursively. Any such pairing formed by the translation model, whether between English phrasing and template, or between template and template, or between template and French phrasing, we shall call a semantic square. A philosophic discussion of the notion of semantic square is given in another publication.</Paragraph> <Paragraph position="8"> A semantic square (in terms of this model) consists of the pairing of any two linguistic sequences P1 and P2, P1 and P2 each having the following characteristics:</Paragraph> <Paragraph position="9"> i) each has two stressed segments (which, when P1 is paired to P2, form the points of the square).</Paragraph> <Paragraph position="10"> ii) each has these embedded in some phrasing-frame (which, when P1 is paired to P2, forms the frame 
of the square).</Paragraph> <Paragraph position="11"> iii) each has been selected as synonymous with the other at least once, either by the operator or by the machine.</Paragraph> <Paragraph position="12"> Thus, according to the model, translation consists of sequential semantic-square forming, the sequence of semantic squares thus formed continuing until it is brought to an end by the machine printing out a square which has a target-language phrasing as its second member. To make all this clearer, let us further develop the example of man-machine interaction given above, by assuming that the phrasing to be translated is /HE WENT to the POLICE/. To translate this, the operator types in the corresponding phrasing-frame [frame markers illegible in the source] and chooses, at the first round of questioning, the [template text lost in the source]. The operator then types in the stressed word /POLICE/ (to specify the nature of the enemy), and the machine then forms the final semantic-square: /HE REVEALED-ALL TO THE ENEMY/ /IL TOUT REVELA AUX FLICS/, &quot;FLICS&quot; having been pre-chosen by the operator's choices of template from a bi-lingual tree-dictionary-entry for the English word &quot;police&quot; with nodes as follows: [tree diagram not recoverable from the source; one legible node reads &quot;le commissariat&quot;]. Thus the sequence of semantic squares formed by this operation of 
the model is</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 1: HE WENT TO THE POLICE / HE COMMUNICATED WITH SOME ANIMATE-BEING; 2: HE COMMUNICATED WITH SOME ANIMATE-BEING / HE REVEALED-ALL TO THE ENEMY </SectionTitle> <Paragraph position="0"> </Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 3: HE REVEALED-ALL TO THE ENEMY / IL TOUT REVELA AUX FLICS </SectionTitle> <Paragraph position="0"> This square-sequence, with its AB BC CD overlap of content, I will call the semantic deep-structure of the model's translation-operation, and the tree-structure given above I will call the semantic deep-structure of the dictionary-entry.</Paragraph> <Paragraph position="1"> The totality of semantic deep-structures given by the model is the model's total semantic-field.</Paragraph> <Paragraph position="2"> VII. This, stated in the briefest possible terms, is the hypothesis given by the model. Now as to the prospect of developing this line of research.</Paragraph> <Paragraph position="3"> The first thing to say is that the model makes clear the unsuitability of the ordinary digital computer as compared to a human being for performing translation.</Paragraph> <Paragraph position="4"> For in this translation-model the computer handles each phrasing of the input text as a separate unit, and forces the operator, by successive rounds of questioning, so to specify it that it can be translated unambiguously into French. But the human being, who does not treat each phrasing of a text as a separate unit, but who uses his understanding of the earlier phrasings of a text to guide him in his understanding of the later ones, does not have to ask himself nearly so many questions. A progressive learning-model of translation, then, is what is really required, rather than the present single-phrasing-matching model. 
On the other hand, the complexity which has to be introduced into the model to account for all the differing French translations which have to be made of a single piece of English, according to its context, would have to be introduced into any effective M.T. program, since you cannot retrieve from any computerised data-system any data which you have not first put in. But this second type of complexity can be put into the machine gradually, by feeding in data obtained from examining the inter-lingual correspondences in a large corpus of bi-lingual text.</Paragraph> <Paragraph position="5"> There is, however, another, much deeper obstacle to developing this research, and that is that (as M.T.</Paragraph> <Paragraph position="6"> research-workers have for some time past suspected) bi-lingual dictionaries provide almost no clue to semantic deep-structure.</Paragraph> <Paragraph position="7"> Within the context of the present experiment this became apparent in examining the English word &quot;deliberations&quot;. The examination began with the construction of a dictionary-entry-card of the following form: English: DELIBERATIONS; French: DÉLIBÉRATIONS. This entry being queried (and the maker of it having defended himself by saying that &quot;deliberations&quot; was the only word he knew of in English which could really be translated by the corresponding word in French), it was checked with Vinay's Dictionary, which gave the entry /débats mpl, discussion/. 
However, when an investigation was made of how it was actually translated in the corpus of text, it only occurred once, where it was translated &quot;membres&quot;, as follows: English: The illustrative and comparative materials presented may be helpful to the deliberations of this committee. French: Les données explicatives et comparatives () se révéleront peut-être très utiles pour les membres du comité. Moreover, the translator, in translating it thus, was quite right; not only because &quot;utiles&quot;, in French, likes a concrete complement, but also because this is what the passage means.</Paragraph> <Paragraph position="8"> However, this gives a semantic deep-structure for the bi-lingual dictionary-entry of &quot;deliberations&quot; of the following form:</Paragraph> <Paragraph position="9"> [Tree diagram garbled in the source. Legible labels include: AGENTS (WHO CHOOSE), (ANIMATE BEINGS), ACT, ARTEFACT (OF CHOOSING), &quot;les discussions&quot;, &quot;les membres&quot;, &quot;Deliberations&quot;.] It becomes evident, then, that if we are to make a machine account for the translations which good human translators actually produce, using the kind of model which has been reported in this paper, the problem is that of finding the deep-structures of the dictionary-entries from the data actually given by a bi-lingual corpus; for the construction of the square-forming templates must depend on these, that is, if the template-glossary and the bi-lingual dictionary are to interlock.</Paragraph> <Paragraph position="10"> Present research efforts are therefore being concentrated on the problem of &quot;firming up&quot; the whole notion of semantic dictionary-entry deep-structure.</Paragraph> <Paragraph position="11"> </Paragraph> </Section> <Section position="7" start_page="0" end_page="21214" type="metho"> <SectionTitle> CONCLUSION </SectionTitle> <Paragraph position="0"> In view of the great interest which has already been aroused by this experiment, its small scale and pilot nature 
must be emphasized. (Actual output from a trial run of the program is given in an Appendix.)</Paragraph> <Paragraph position="1"> It has been implemented only on an I.C.T. 1202 computer, with T.R.A.C. facility, to which a single keyboard has been attached, just under the print-out, on which the machine's &quot;replies&quot; to the operator, as well as his &quot;questions&quot;, appear. This machine has only 4K store with no back-up, and 2K of this is occupied by the T.R.A.C. facility; the rest of the store will therefore only hold enough Thesaurus to process an average of 10 &quot;phrasing-frames&quot; at any one time, so the sections of Thesaurus which are needed for any particular test have to be pre-chosen by hand from the larger deck of punched cards of which the Thesaurus, in its machine-readable form, consists. Even these cards, however, are only punched as required; the basic triple dictionary, from which the Thesaurus is being built up, is being stored on ordinary business equipment (Twinlock Handi~e~inder HRA3 handled with a Shunic Signalling System ~ Paper and a SASCO System so as to ensure maximum flexibility and ease of entry-change). Mark II of this program is to be implemented on an I.C.T. 1903 with disc-file and multiple-access T.R.A.C. facility, but this is not expected to be operational till 1968.</Paragraph> <Paragraph position="2"> </Paragraph> </Section> <Section position="8" start_page="21214" end_page="21214" type="metho"> <SectionTitle> -LIMITATIONS ON .CANADIAN *COMMITMENTS. *ANY *NATION </SectionTitle> <Paragraph position="0"> A project supported by the U.S. 
National Science Foundation at the University of Bloomington, Indiana, has just been started, to make a Thesaurus for Information Retrieval in 50 languages.</Paragraph> <Paragraph position="1"> Also a historical Thesaurus of English is being compiled on a long-term basis by Professor Samuel at the University of Glasgow; and another, compiled by John Bromwich, is being put on magnetic tape at the Linguistics Computation Centre, Cambridge University.</Paragraph> <Paragraph position="2"> The properties and structure of thesauruses and/or conceptual dictionaries have never yet, however, been mechanically examined; partly because, until lately, machines with rapid-access-time to sufficiently large memories were not available, and partly because of the overall cost of such a project. Language and Machines, p.114. The Report gives this brilliant technical achievement just 3 sentences on p.114, and appears not to know of the fact that a mechanical justifier using a logic and working up to 95% accuracy is now in use on an actual newspaper (personal communication from Dolby &amp; Resnikoff). Features&quot; in the present Conference.) The phrasing method offers two operational simplifications: i) by mapping the distribution of stresses on to a binary frame; ii) by applying a phonetically-derived feature to Ear, instead of to syllables or phonemes.</Paragraph> </Section> </Paper>