Lexical gaps and idioms in machine translation 
Diana Santos 
IBM-INESC Scientific Group
R. Alves Redol, 9, 1000 Lisboa, Portugal
(internet: dms@inesc.inesc.pt)
Abstract
This paper describes the treatment of lexical gaps,
collocation information and idioms in the English to
Portuguese machine translation system PORTUGA.
The perspective is strictly bilingual, in the sense that all
problems referenced above are considered to belong to the
transfer phase, and not, as in other systems, to analysis or
generation.
The solution presented invokes a parser for the target language
(Portuguese) that analyses the multiword expression selected
as the result of lexical transfer, producing the corresponding
graph structure.
This process seems to bring considerable advantages as far as
readability and ease of bilingual dictionary development are
concerned, and to furnish maximal flexibility together with
minimal storage requirements. Finally, it also provides
complete independence between dictionary and grammar
formalisms.
Organization 
The general architecture of the MT system is first described
very briefly, emphasizing the features relevant to the full
understanding of the problem at hand. Then the problem is
presented, and a literature survey given. The solution put
forward is then described. Finally, we furnish a detailed
example, together with some evaluation results.
The general MT system
The structure of the transfer MT system PORTUGA is illustrated
in Figure 1.
Figure 1. General structure of PORTUGA: A - analysis,
G - generation. Transfer: L - lexical, S - structural,
T - tense, Sty - style.
The main characteristics of this English to Portuguese
translator are:
• The separation between possible translations (which may be
multiple) and the best or chosen translation (decided in the
"style transfer" module).
• Complete independence between English and Portuguese
processing. English analysis is performed by PEG [8].
• The bilingual dictionary is kept to a minimum: only the
selection conditions for lexical transfer and contrastive
knowledge are stored. It should also be mentioned that all
information in this dictionary is associated with the
translations, and not with the English index, as is usual for
lexical transfer in MT.
The reader is referred to [13] or [14] for more details.
The problem 
Vagueness, together with the non-overlapping of semantic
fields across different languages, is widely known to give rise
to lexical gaps and lexical ambiguity.
For this reason, lexical transfer, the process of choosing the
correct equivalent(s) for one lexical entry in another language,
is one of the most difficult problems that MT has to cope with.
This paper focuses on one aspect of lexical transfer, namely
the possibility of specifying complex translations in the target
language. This broad description covers the use of complex
(henceforth multiword) expressions, changes of part-of-speech
required by translation, and collocation restrictions.
The process of actually choosing which entry (or entries)
in the bilingual dictionary is most appropriate, and which
information is taken into account for that process, has been
described elsewhere [13].
Some examples of complex translation (as opposed to simple
translation, in the sense of one-word to one-word translation
with the same part-of-speech, independently of the number of
possible choices available) will illustrate the problem in the
context of English-to-Portuguese translation.
1-to-N words
miss - sentir a falta
miss - deixar escapar
drop - deixar cair
kick - dar um pontapé
tonight - hoje à noite
graduate - tirar o curso
N-to-1 words
have fun - divertir-se
get up early - madrugar
fall in love - apaixonar-se
take advantage - aproveitar
television set - televisor
swimming pool - piscina
N-to-M words
kick the bucket - bater as botas
lose one's temper - perder a paciência
Figure 2. Translation gaps: One-to-many, many-to-one and
many-to-many words translation.
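The three groups in Figure 2 differ only in how many words stand on each side of a translation pair. As a minimal sketch (the function name gap_type and the whitespace-based word count are assumptions made for illustration, not part of PORTUGA), the grouping could be expressed as:

```python
def gap_type(source: str, target: str) -> str:
    """Classify a translation pair by the number of words on each side."""
    n_source = len(source.split())
    n_target = len(target.split())
    if n_source == 1 and n_target == 1:
        return "1-to-1"   # simple translation, no gap
    if n_source == 1:
        return "1-to-N"   # one source word, several target words
    if n_target == 1:
        return "N-to-1"   # several source words, one target word
    return "N-to-M"       # multiword expressions on both sides


pairs = [("miss", "sentir a falta"),
         ("have fun", "divertir-se"),
         ("kick the bucket", "bater as botas")]
labels = [gap_type(src, tgt) for src, tgt in pairs]
```

Counting whitespace-separated tokens treats hyphenated forms such as "divertir-se" as one word, which matches the way the figure groups them.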
Other approaches 
In this section I mention related work and alternative solutions
that have been proposed and which I find representative of the
present-day state of the art. Therefore, primitive approaches
such as, for instance, the treatment of complex expressions as
simple strings will not be surveyed.
Machine translation: 
It is acknowledged by outstanding machine translation
researchers that there are MT problems which are bilingual
in nature. Regarding the problem of lexical transfer,
Tsujii [17] states
"we cannot enumerate, by monolingual thinking, different
concepts denoted by the verb 'produce'. (...) Only when we
are asked to translate sentences into another language, can
we try to find appropriate target language words. (...) The
above discussion implies that certain 'understanding
processes' are target language dependant, and cannot be fully
specified in a monolingual manner."
Specifically, the problem of translating a word by an
expression is one of the reasons presented by Schenk [15] to
use the concept of "Complex Basic Expression" in the
description of one language. (A CBE is a basic expression
from a semantic point of view, i.e., it corresponds to a basic
meaning, and a complex expression from a syntactical
point of view.)
"Expressions that are not idiomatic, but that consist of more
than one word can be handled by means of a complex basic
expression in order to retain the isomorphy."
This approach is related to the theoretical requirement of
the Rosetta MT system to build isomorphic grammars for
the several languages dealt with by the system.
This implies that, in this framework, it is the set of all
languages in presence that defines what is a basic meaning.
In this way, it is possible to dispense altogether with
structural transfer:
"Structural transfer is not necessary, since idioms are based
onto basic meanings."
In general, however, most MT systems do not make the
analysis phase dependent on the target language(s), and
therefore it is usual to see statements like the following [4]:
"the lexical rule must mark words with the corresponding
parts of speech", in the "cases where a source language
entry must be rephrased in the target language as an ad-hoc
combination of words which does not form a lexical
entity".
It is unclear, however, how much complexity of the result
can be handled, or how much syntactical transformation
the new expression can undergo.
Some hybrid approaches can also be found in [9], this time
apparently putting the burden on the generation phase:
"The second substep of German morphological generation
is the application of German word list transformations. (...)
these rules can also be used to handle non-compositional
translations. For example, "for example" can translate
compositionally into für Beispiel, and then a GPHRASE
rule can convert this to zum Beispiel."
However, the most comprehensive way to deal with this
problem, without changing analysis accordingly, is the one
exemplified by Isabelle [7], who describes the development of
"a special language called LEXTRA, which makes it easier
to state the type of tree transformations required by lexical
transfer. (...) LEXTRA takes as data an explicit description
of the admissible tree structures, and guarantees
that any tree it receives or creates is indeed an admissible
tree".
Similar solutions can be found in the Japanese-to-English
system of [10], which handles both lexical gaps and changes
of part of speech:
"One can specify not only the English main verbs but also
arbitrary phrases governed by the verbs as constants",
allowing for variables and complex patterns in the lexical
rules. "one can provide lexical rules directly in GRADE,
and attach them to specific items. (...) One can specify
specific tree transformations in GRADE".
and in the old ITS translation system [12]:
"The contents of these records" (dictionary entries) "include
transfer language statements which performs the necessary
transfers as well as other referential information."
Parsing 
In parsing, idioms have to be considered. A recent paper
on idiom processing [1] lists some of their relevant
properties (my rephrasing):
• usual existence of ambiguity between literal and
idiomatic readings,
• frequent discontinuity of idioms,
• applicability of regular syntactic rules
(like adverb(s) or auxiliary verb(s) insertion),
• applicability of 'transformations' to the idioms proper
(like passivization or relativization).
The difference between a "non-literal" reading and an
"idiomatic" reading of an expression is also pointed out.
Metaphoric readings are proposed to be parsed by the usual
rules. The advantages of submitting idioms to "regular"
syntactic rules and even to 'transformations', whenever
possible, are emphasized.
A more extreme view can be found in Gazdar et al. [3], who
ignore idioms as far as syntax is concerned:
"no additional devices need be added to the syntax in
accounting for the peculiarities of fixed expressions". This is
because idioms can be assigned not only an internal syntactic
structure, but an internal semantic structure as well, since
"all syntactically active idiomatic expressions have a
metaphorical basis".
However, radically different views can be found, for
instance, in [5], where, in the lexicon-grammar approach, the
concern with the representation of compound words
(adverbs, verbs, nouns) makes Gross establish a
classification according to their syntactical shape, covering
several degrees of variation, from completely frozen ("at
night") to having parts completely free ("organize in one's
honor").
This author suggests that finite automata be attached to a
given entry in order to describe the compound variation.
"The variations of form we have enumerated can be partly
handled by attaching a finite automaton to a given entry,
and this automaton will describe the main grammatical
changes allowed."
In between, the need to store several pieces of information
concerning idioms is acknowledged by Stock [16], such as
undergoing passivization, weight in the whole idiom,
remover of the idiom interpretation, semantic value, etc.
This system stores idioms as "further information
concerning words", divided in two cases, "canned phrases" and
"flexible idioms", the latter being stored under the 'thread'
of the idiom.
Based on the claim that "the flexibility of an idiom depends
on how recognizable its metaphorical origin is", one of the
goals is to "integrate idioms in our lexical data merely as
further information concerning words (as in traditional
dictionaries)".
Generation 
Finally, research in natural language generation has also
contributed to clarifying and furnishing solutions to the
problem. Clearly, generation is one of the issues in a machine
translation system. However, work in generation per se usually
presupposes the existence of an unambiguous 'concept'
representation, and so the problems begin with the correct
stating of an idea in one particular language. In this
framework, it is clear that one key concept is that of
"collocations", or how lexical items combine in a particular
language.
In these systems, it is advocated (see for instance [6]) that
in the specialized 'semantic' dictionary "storing the possible
lexicalizations of a 'concept' in a given language (...) the
possibility of combining lexemes in collocations" should
also be stored, specifically in the entries for the bases
(which determine the possible collocates: a collocation is a
base-collocate pair).
A remark of utmost importance can be found in [11], in
the description of the DIOGENES generation system:
"collocational relations are defined on lexical units, not
meaning representations".
Summing up 
The literature survey above supports some of our
assumptions, namely that
• there are problems which are bilingual in nature, and
cannot therefore be properly dealt with in only one
language;
• there is not a clear distinction between what should
be accounted for as an idiom, a metaphorical use of
a word or a collocation. The boundaries between
collocational restrictions, metaphorical readings and
idioms are blurred and may not even be pertinent to
the automatic treatment of language;
• to translate correctly, it is often necessary to use
expressions instead of single words. Those expressions
can moreover give rise to complex structure
changes, possibly discontinuous.
Our approach 
We are interested in solving the problem of translating one
expression into another expression, no matter whether the
need arises because of a lexical gap, a collocation
difference or an idiom not literally translatable.
Therefore, we treat all three problems the same way,
namely, considering them as instances of a contrastive
lexical transfer problem in the scope of machine translation.
We must emphasize that we are only interested in
expression-to-expression translations when the literal ones
are not acceptable. This stems from the fact that there is a
considerable number of fixed expressions which do not
require any special processing, as can be seen in the following
list, with examples taken from several languages:
(E) parents and children
pais e filhos
(E) ladies and gentlemen
senhoras e senhores
(F) monter la moutarde au nez de
subir a mostarda ao nariz de
(F) attendre un enfant
esperar uma criança / um filho
(E) take into account
tomar em conta
(I) prendere il toro per le corna
pegar o touro pelos cornos
(E) in good hands
em boas mãos
Figure 3. Literally translatable idioms: E-English,
F-French, I-Italian.
Our solution 
Given that the target expressions can be arbitrarily
complex, we impose no restrictions whatsoever on their form
or structure, and give unlimited power to the device
intended to cope with them.
On the other hand, it did not appeal to us to have to store,
for each pair of source-target entries, the full structural
transformation implied, as in the most powerful approaches
mentioned above (cf. [10] and [7]). That approach gives
rise to very heavy dictionaries, with a lot of redundancy,
moreover, since similar transformations may be repeated
in many entries. In addition, not only does the dictionary
become very difficult to understand and modify (requiring
someone who knows the "programming" language used),
but it also becomes tightly coupled to the structural
representation and/or the particular linguistic formalism and
options used in the machine translation system, in both
analysis and generation.
We thus chose a different method that
• allows for maximal readability
• is independent of the linguistic (and programming)
decisions of the whole MT system (being only
concerned with lexical transfer)
• provides as much power as any unrestricted (tree or
graph) transformation language
The method proposed consists then of using a string
as the result value in the bilingual dictionary, therefore
keeping it independent of whatever structure it should
be assigned, and invoking a target language parser that
builds the structure required, on the fly.
Another advantage of the process above is that the new
structure is dynamically built only when it is necessary (that
is, when it corresponds to the chosen translation).
Moreover, no separate (and redundant) lexical rules need
be written in the dictionary, as the very same grammar is
used for all multiword target expressions. The grammar
should be a "twin" of that used in the analysis phase, that
is, it should obey the same formalism and linguistic options
in order for them to be compatible.
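The division of labour described in this section can be sketched in a few lines. This is only an illustration under stated assumptions: the names Translation, toy_parse and realize are hypothetical, and toy_parse is a word-tagging stand-in for the Portuguese parser, which in the real system builds a full graph structure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Node = Tuple[str, str]  # (category, word): a flat stand-in for a graph fragment


@dataclass
class Translation:
    word: Optional[str] = None  # simple one-word translation
    evp: Optional[str] = None   # multiword target expression, kept as a plain string


# Hypothetical toy lexicon standing in for the Portuguese grammar.
TOY_LEXICON = {"sentir": "VERB", "ter": "VERB", "a": "DET",
               "falta": "NOUN", "saudades": "NOUN"}


def toy_parse(text: str) -> List[Node]:
    """Stand-in for the target-language parser: tag each word of the string."""
    return [(TOY_LEXICON.get(w, "UNKNOWN"), w) for w in text.split()]


def realize(chosen: Translation,
            parse: Callable[[str], List[Node]] = toy_parse) -> List[Node]:
    """Build target structure for the chosen translation only."""
    if chosen.evp is not None:
        return parse(chosen.evp)          # structure is built on demand
    return [("WORD", chosen.word or "")]  # one-word translations need no parsing


candidates = [Translation(evp="sentir a falta"),
              Translation(evp="ter saudades"),
              Translation(word="perder")]
fragment = realize(candidates[0])         # only this candidate is ever parsed
```

Because the dictionary holds nothing but strings, the same entries would remain usable if the parser, or the structure it produces, were replaced.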
A detailed example 
For the sake of clarity, a full example will be presented,
regarding the word miss, in its meaning of "to feel sorry or
unhappy at the absence or loss of (someone or something)"
(Longman). Figure 4 shows an abridged form of the
entry for miss in the bilingual dictionary. The information
for choosing among the several possible translations was
omitted and will not be discussed here. The examples
presented will in any case be those that correctly trigger the
translation sentir a falta (literally, "feel the lack").
miss(VERB CLITPOSS (EVP sentir a falta))
miss(VERB CLITPOSS (EVP ter saudades))
perder(VERB)
faltar(VERB)
miss(VERB (EVP deixar escapar))
menina(NOUN)
Figure 4. Dictionary entry for "miss": EVP stores
the Portuguese string to be used as translation.
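The notation of Figure 4 is simple enough to be read mechanically. The following sketch assumes a reading of that notation inferred from the figure alone (a key, a part of speech, optional bare markers, and optional parenthesized features); the function parse_entry and the dictionary layout it returns are hypothetical and are not the format used internally by PORTUGA.

```python
import re


def parse_entry(line: str) -> dict:
    """Read one single-line entry such as 'miss(VERB CLITPOSS (EVP sentir a falta))'."""
    match = re.fullmatch(r"\s*([^\s()]+)\((.*)\)\s*", line)
    if match is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    key, body = match.group(1), match.group(2)
    # Parenthesized features: a name followed by its value, e.g. (EVP sentir a falta).
    features = {name: value.strip()
                for name, value in re.findall(r"\((\w+)\s+([^()]*)\)", body)}
    # What remains outside parentheses: part of speech, then bare markers.
    bare = re.sub(r"\([^()]*\)", " ", body).split()
    return {"key": key, "pos": bare[0], "markers": bare[1:], "features": features}


entry = parse_entry("miss(VERB CLITPOSS (EVP sentir a falta))")
simple = parse_entry("menina(NOUN)")
```

An entry with an EVP feature carries its whole multiword translation as one string, so nothing about Portuguese structure has to be encoded in the entry itself.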
The first thing that should be exemplified is that, after the
choice of the multiword translation, the Portuguese grammar
is invoked, building a graph fragment equivalent to the
translation of "feel the lack". This graph fragment is then
conveniently inserted in place of the one for miss.
I miss you.
DECL1  NP1    PRON1* "I"
       VERB1* "miss"
       NP2    PRON2* "you"
       PUNC1  "."
Árvore portuguesa (Portuguese tree)
............................................................
DECL2  NP3    PRON3* "eu"
       VERB2* "sinto"
       NP4    DET1   ADJ1* "a"
              DET2   ADJ2* "tua"
              NOUN1* "falta"
       PUNC2  "."
............................................................
Geração (generation)
----> eu sinto a tua falta .
Figure 5. A simple example.
With this simple example, it can be seen that some
structural manipulation took place (converting the English
direct object pronoun into a Portuguese possessive adjective,
triggered by the CLITPOSS marker in Figure 4), and that
the words taking part in the multiword expression were
conveniently inflected (in this case, only the verb).
More complex processing can clearly take place, as is
exemplified in Figure 6.
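The manipulation associated with the possessive marker can be pictured as follows. This is a deliberately small sketch, not the system's transfer rule: the pronoun table, the flat (category, word) representation and the function insert_possessive are all assumptions, and verb inflection (sentir becoming sinto) is left out.

```python
from typing import List, Tuple

Node = Tuple[str, str]  # (category, word)

# Hypothetical mapping from English object pronouns to Portuguese possessives,
# in the feminine singular form required by the noun "falta".
POSSESSIVE_FEM_SG = {"me": "minha", "you": "tua", "us": "nossa"}


def insert_possessive(fragment: List[Node], obj_pronoun: str) -> List[Node]:
    """Place the possessive for obj_pronoun right before the first noun."""
    possessive = POSSESSIVE_FEM_SG[obj_pronoun]
    result: List[Node] = []
    inserted = False
    for category, word in fragment:
        if category == "NOUN" and not inserted:
            result.append(("ADJ", possessive))
            inserted = True
        result.append((category, word))
    return result


# Fragment as it might come back from parsing the string "sentir a falta".
fragment = [("VERB", "sentir"), ("DET", "a"), ("NOUN", "falta")]
words = [word for _, word in insert_possessive(fragment, "you")]
```

The English object "you" disappears as a separate constituent and resurfaces inside the noun phrase of the parsed expression, which is the change visible between the two trees of Figure 5.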
I 'll always miss people I like .
----> eu sentirei sempre a falta de pessoas de
quem eu gosto .
I miss the man who was here .
----> eu sinto a falta do homem que esteve aqui .
he was missed , but who missed her ?
----> foi sentida a falta dele , mas quem é que
sentiu a falta dela ?
he was the one who most missed his father .
----> ele foi o que sentiu mais a falta do seu
pai .
they were the ones who were least missed .
----> eles foram os de quem se sentiu menos a
falta .
I miss having you in the neighborhood .
----> sinto a falta de te ter na vizinhança .
I forgot missing you .
----> esqueci-me de sentir a tua falta .
I forgot to miss you .
----> esqueci-me de sentir a tua falta .
Figure 6. Several examples of "miss" translated as
"sentir a falta": Complex tenses, passive,
relative clauses, distinction between third
person singular and others, adverb position, etc.
As ter saudades is also a valid translation for miss in the
same context as sentir a falta, this choice belongs to style
transfer. The output of the system in that case follows:
I miss that time .
----> eu tenho saudades daquele tempo .
Figure 7. Another style alternative: "miss" translated
by "ter saudades".
Other problems 
It remains to be shown how the other problems mentioned
above are solved in this framework. We begin with change
of part-of-speech, and continue with the identification of
source language (English) multiword expressions, which then
covers the remaining cases, namely collocations and
equivalence of distinct idioms.
Change of part-of-speech 
The change of part-of-speech should be transparent as far
as the dictionary is concerned, the assignment of the
correct interpretation being performed by the Portuguese parser.
thank(VERB (NPOS AJP) (PREPO OBJECT a)
      (EVP obrigada))
agradecer(VERB (PREPO for por))
agradecimento(NOUN (PREPO for por))
obrigado(NOUN)
Figure 8. Abbreviated entry for the word
"thank": Since the string "obrigada" has
three possible interpretations according to
the Portuguese parser, NPOS stores the
phrase type to select.
Only when there is more than one parse for the target
expression, and the one to choose implies a change of
part-of-speech, does this need to be stored in the bilingual
dictionary, as can be seen in Figure 8 above.
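The role of NPOS can be illustrated with a small selection function. Only the label AJP comes from Figure 8; the other two phrase types below, the dictionary layout of a parse, and the rule of taking the first parse when no NPOS is given are assumptions made for the sake of the sketch.

```python
from typing import Dict, List, Optional


def select_parse(parses: List[Dict[str, str]],
                 npos: Optional[str] = None) -> Dict[str, str]:
    """Pick the parse whose phrase type equals npos, if one is requested."""
    if npos is None:
        return parses[0]  # assumption: without NPOS, the first parse is used
    for parse in parses:
        if parse["type"] == npos:
            return parse
    raise LookupError(f"no parse of type {npos!r}")


# Three hypothetical interpretations of the string "obrigada";
# the types NP and VP are invented for illustration.
parses = [{"type": "NP", "text": "obrigada"},
          {"type": "VP", "text": "obrigada"},
          {"type": "AJP", "text": "obrigada"}]
chosen = select_parse(parses, npos="AJP")
```

The dictionary thus names only the phrase type it wants; the alternative structures themselves come from the parser.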
Note: Mori Rimon pointed out to us that in cases of highly
ambiguous target languages, as is the case of written
Hebrew, the indication of which syntactical alternative to
choose, when different from the source one, could be needed
very frequently, therefore reducing the economy we are
asserting. We can only answer that while for English-to-
Portuguese translation that structural marking is very
rarely used, further testing with different language pairs
must be done in order to confirm or deny the universality of
this method. In particular, languages whose translation would
require extensive part-of-speech change should be
tested.
A very simple example of the case discussed above follows:
thank you.
............................................................
DECL1  VERB1* "thank"
       NP1    PRON1* "you"
       PUNC1  "."
............................................................
Árvore portuguesa (Portuguese tree)
............................................................
DECL2  ADJ1*  "obrigada"
       PP1    PREP1  "a"
              PRON2* "ti"
       PUNC2  "."
............................................................
Geração (generation)
----> obrigada a ti .
Figure 9. Change of part-of-speech.
This example would be improved if the whole phrase
"thank you" were translated by "obrigada", but here we
want to show the simplest case.
It should also be mentioned in passing that whenever there
is a generalized part-of-speech change on syntactical
grounds, it is not done through lexical transfer, but in
structural transfer, as is the case of adjectival English
present participle clauses.
MWE-to-MWE translation
Considering the general problem of identifying source
language multiword expressions, the philosophy we propose
is similar. (We are indebted to Stephen Richardson for this
suggestion.) The implementation is however not yet done,
so what will be described in the rest of this section is only
a proposal.
We consider that source expressions should be identified
as a bilingual requirement too, and therefore this process
should take place (only if needed) during transfer. If the
identification succeeds, the whole phrase would then be
replaced by the corresponding Portuguese translation, be
it a word or a complex expression.
The next examples illustrate what the bilingual dictionary
would look like:
thunder(NOUN (MWE thunder and lightning)
      (EVP relâmpagos e trovões))
Figure 10. Collocation differences: The same device
used for many-to-many translation
can be used when, for instance, the order
must be reversed.
kick(VERB (MWE kick the bucket)
      (EVP bater as botas))
Figure 11. Example of a many-to-many words
translation: The MWE feature corresponds
both to the context required in
order to choose that particular translation,
and to the piece of English to replace.
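Since this part is only a proposal, the following is likewise just a sketch of how the double role of the MWE feature (context to check, and stretch of English to replace) might look. The function apply_mwe is hypothetical; it matches contiguous surface forms only, whereas an actual implementation would presumably work on the analysed structure and cope with inflection and discontinuity.

```python
from typing import List


def apply_mwe(tokens: List[str], mwe: str, evp: str) -> List[str]:
    """Replace the first contiguous occurrence of mwe by the single unit evp."""
    pattern = mwe.lower().split()
    size = len(pattern)
    for start in range(len(tokens) - size + 1):
        window = [token.lower() for token in tokens[start:start + size]]
        if window == pattern:
            # The matched English stretch is replaced as a whole.
            return tokens[:start] + [evp] + tokens[start + size:]
    return list(tokens)  # context not found: this translation does not apply


sentence = "He will kick the bucket".split()
replaced = apply_mwe(sentence, "kick the bucket", "bater as botas")
untouched = apply_mwe("He will kick the ball".split(),
                      "kick the bucket", "bater as botas")
```

The EVP string that replaces the match would then be handed to the Portuguese parser exactly as in the one-to-many case.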
Some numbers
In order to evaluate the interest and need for taking this
problem into account in machine translation, the following
counts were made on an English-to-Portuguese MT dictionary
containing roughly 500 English entries and 2400 Portuguese
translations. Only the case of one English word translated by
several Portuguese words was taken into account.
No. of English entries with EVPs : 80. 
Total number of EVPs : 152. 
No. of verbs translated by EVPs : 60. 
No. of nouns translated by EVPs : 47. 
No. of adjectives translated by EVPs : 14. 
No. of adverbs translated by EVPs : 19. 
No. of entries whose first translation is an EVP : 27. 
No. of nodes whose correct translation is an EVP : 42. 
Figure 12. Some relevant numbers 
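To put the counts of Figure 12 in proportion to the dictionary size given in the text, the ratios can be worked out as below. The variable names are illustrative, and since the totals of 500 and 2400 are stated only as rough figures, the resulting percentages are approximate as well.

```python
# Figures taken from the text and from Figure 12.
english_entries = 500            # approximate
portuguese_translations = 2400   # approximate
entries_with_evp = 80
total_evps = 152
first_translation_is_evp = 27


def percent(part: int, whole: int) -> float:
    """Share of part in whole, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


share_of_entries = percent(entries_with_evp, english_entries)         # 16.0
share_of_translations = percent(total_evps, portuguese_translations)  # 6.3
share_first_choice = percent(first_translation_is_evp, english_entries)  # 5.4
```

Roughly one English entry in six therefore has at least one multiword translation, and about one entry in twenty has a multiword expression as its first translation.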
In order to guarantee the impartiality of the numbers
presented, the criteria for selecting the English entries, and
the actual translations, bore no relationship whatsoever with
the problem mentioned in this paper.
The numbers arrived at, however impressive they may be,
should nevertheless not be confused with percentages of
occurrence in actual text. On the contrary, there is some
relationship between a rarely used word in one language
and a set of words to express it in another language.
However, we still consider that the numbers above
unequivocally demonstrate that this problem cannot be
ignored in any real machine translation system.
As for the actual testing of the proposed method, we ran
the system on two test corpora, the first, regarding the verb
"miss", including several different syntactic environments
(see Figure 6), and the second containing several different
instances of 1-to-N translations:
I stood in the doorway .
----> estive de pé na soleira da porta .
I dropped the camera while packing .
----> deixei cair a máquina fotográfica enquanto
estava a fazer as malas .
I missed the sunset tonight .
----> senti a falta do pôr do sol hoje à noite .
The filmstar kicked her agent .
----> a estrela de cinema deu um pontapé
ao seu agente .
Watch the dog !
----> toma cuidado com o cachorro !
I bicycled and did not do my homework
----> andei de bicicleta e não fiz o meu
trabalho de casa .
A then officer would not borrow a uniform .
----> um oficial do tempo não pediria emprestado
um uniforme .
Did I trouble you when I yellowed your shirt ?
----> causei-te transtorno quando tingi de amarelo
a tua camisa ?
Figure 13. Several examples of 1-to-N translation.
Even though no thorough broad-coverage translation tests
have been performed, we believe these results demonstrate
not only the feasibility but also the flexibility of the method
proposed.
Conclusion 
The approach presented in this paper handles in the same
way the problems of lexical gaps, collocation requirements
in different languages, and non-literal translation of idioms.
Since we consider them a bilingual problem, the transfer phase
was assigned as the proper place for them to be treated.
The method presented has the advantages of minimal storage
requirements and the least computation (only on demand) of
the several structures involved. Also, it makes use of
one single comprehensive parser for the target language,
instead of developing particular solutions to particular
problems.
The way the dictionary was conceived brings with it
considerable readability, making it independent of the linguistic
and programming formalisms used in the other modules
of the translation system. Its format can, moreover, make
it very easy to inherit information from human-readable
bilingual dictionaries. Hand-coding by an expert is not
required.
Acknowledgements 
This paper greatly benefited from the comments of Jan
Engh, Stephen Richardson and Mori Rimon, and from
Paula Newman's critical reading of an earlier version.
I am therefore grateful to them and to all members of the
IBM-INESC Scientific Group for their support and
discussion.

References

[1] Abeillé, Anne and Yves Schabes. 1989 "Parsing Idioms in Lexicalized TAGs", Proceedings of the Fourth European Conference of the European Chapter of the Association for Computational Linguistics, 10-12 April 1989, Manchester, UK.

[2] Beaven, John L. and Pete Whitelock. 1988 "Machine translation using isomorphic UCGs", Proceedings of the 12th International Conference on Computational Linguistics, Budapest, 22-27 August, 1988.

[3] Gazdar, Gerald, Ewan Klein, Geoffrey Pullum and Ivan Sag. 1985 Generalized Phrase Structure Grammar, Basil Blackwell.

[4] Golan, Igal, Shalom Lappin and Mori Rimon. 1988 "An Active Bilingual Lexicon for Machine Translation", Proceedings of the 12th International Conference on Computational Linguistics, Budapest, 22-27 August, 1988.

[5] Gross, Maurice. 1986 "Lexicon-Grammar: The Representation of Compound Words", Proceedings of the 11th International Conference on Computational Linguistics, Bonn 1986, pps 1-6.

[6] Heid, Ulrich and Sybille Raab. 1989 "Collocations in Multilingual Generation", Proceedings of the Fourth European Conference of the European Chapter of the Association for Computational Linguistics, 10-12 April 1989, Manchester, UK.

[7] Isabelle, Pierre. 1984 "Machine Translation at the TAUM Group", Machine Translation Today: The State of the Art, Margaret King, ed., 1987.

[8] Jensen, Karen. 1986 "PEG 1986: A Broad-coverage Computational Syntax of English", IBM Research Report RC draft, Feb 1986, T.J. Watson Research Center, Yorktown Heights, NY 10598.

[9] McCord, Michael C. 1989 "Design of LMT: A Prolog-Based Machine Translation System", Computational Linguistics, Vol. 15, No. 1.

[10] Nagao, Makoto and Jun-ichi Tsujii. 1986 "The transfer phase of the Mu Machine Translation System", in Proceedings of COLING'86, ACL, pps 97-103.

[11] Nirenburg, Sergei and Irene Nirenburg. 1988 "A Framework for Lexical Selection in Natural Language Generation", Proceedings of the 12th International Conference on Computational Linguistics, Budapest, 22-27 August, 1988.

[12] Richardson, Stephen D. 1980 "A High-Level Transfer Language for the BYU-ISI Interactive Translation System", M.A. Thesis, Brigham Young University.

[13] Santos, Diana. 1988 "A fase de transferência de um sistema de tradução automática do inglês para o português", Tese de Mestrado, Instituto Superior Técnico, Universidade Técnica de Lisboa.

[14] Santos, Diana. 1988 "An MT prototype from English to Portuguese", Proceedings of the IBM Conference on Natural Language Processing, October 24-26, 1988, Thornwood, pps 122-133.

[15] Schenk, André. 1986 "Idioms in the Rosetta Machine Translation System", Proceedings of the 11th International Conference on Computational Linguistics, Bonn 1986, pps 319-324.

[16] Stock, Oliviero. 1989 "Parsing with Flexibility, Dynamic Strategies, and Idioms in Mind", Computational Linguistics, Vol. 15, No. 1.

[17] Tsujii, Jun-ichi. 1986 "Future directions of machine translation", Proceedings of the 11th International Conference on Computational Linguistics, Bonn 1986, pps 655-668.
