GENERATION AND TRANSLATION-- 
TOWARDS A FORMALISM-INDEPENDENT CHARACTERISATION 
Henry S. Thompson 
Human Communication Research Centre 
University of Edinburgh 
2 Buccleuch Place 
Edinburgh EH8 9LW 
SCOTLAND 
ABSTRACT 
This paper explores the options 
available in the formal definitions of 
generation and, parasitically, transla- 
tion, with respect to the assumed ne- 
cessity for using a single grammar for 
analysis and synthesis. This leads to 
the consideration of different adequacy
conditions relating the input to the 
generation process and the products of 
analysis of its output.
I. A SCHEMATIC DEFINITION OF 
'GENERATES'
We start from the assumption of a 
constraint-based theory of linguistic 
description, which supports at least 
the notions of derivation and underly- 
ing form, in that the definition of 
grammaticality appeals to a relation 
between surface strings and some 
formal structure. We will attempt to 
remain agnostic about the shape of 
this formal structure, its precise se- 
mantics and the mechanisms by 
which a grammar and lexicon constrain
its nature in any particular
case. In particular, we take no stand 
on whether it is uniform and monolithic,
as in the attribute-value matrices
(hereafter AVMs) of PATR-II or
HPSG, or varied and partitioned, as in 
the trees, AVMs and logical formulae 
of LFG. We will use the phrase products
of analysis to refer to the set of
underlying structures associated by a
grammar and lexicon with a surface
string, viz.
for a grammar $G$ and sentence
$s \in L_G$, we refer to the set of all
products of analysis
$X_s = \{\chi \mid \Delta_G(s, \chi)\}$
where we use $\Delta$ for the 'derives'
relation. 1
We will also use $\chi_s$ to refer to an
arbitrary member of $X_s$.
We will also assume that the for- 
mal structures involved support the 
notions of subsumption and its in- 
verse, extension, as well as unification 
and generalisation. Whether this is 
accomplished via appeal to a lattice, or 
in terms of simulations, will only become
relevant in section IV.
We can now provide schematic def- 
initions of generation and, with a few 
further assumptions, translation. We 
say 
Definition 1.
$\Gamma_G(\chi, s)$ (a structure $\chi$ generates
a string $s$ for grammar $G$)
iff $\exists\, \chi_s$ such that $\gamma(\chi_s, \chi)$ 2
Most work to date on building gen- 
erators from underlying forms (e.g. 
1 In this we follow Wedekind (1988), where we
use $X$/$\chi$ for an arbitrary underlying form, as
he uses $\Phi$/$\phi$ for f-structure and $\Sigma$/$\sigma$ for
s-structure.
2 Again our $\gamma$ is similar to Wedekind (1988)'s
adequacy condition C.
Wedekind 1988, Momma and Dörre
1987, Shieber, van Noord, Pereira and
Moore 1990, Estival 1990, Gardent &
Plainfossé 1990) has taken the adequacy
condition $\gamma$ to be strict isomorphism,
possibly of some formalism-specific
sub-part of the structures $\chi_s$
and $\chi$, e.g. the f-structure part in the
case of Wedekind (1988) and Momma
and Dörre (1987). In the balance of
this paper I want to explore alterna- 
tive adequacy conditions which may 
serve better for certain purposes. 
Although some progress has been 
made towards implementation of gen- 
erators which embody these alterna- 
tives, that is not the focus of this pa- 
per. As far as I know, aside from a 
few parenthetical remarks by various 
authors, only van Noord (1990) ad- 
dresses the issue of alternative ade- 
quacy conditions--I will place his 
suggestion in its relevant context be- 
low. 
II. WEAKER FORMULATIONS 
Work on translation (Sadler and 
Thompson 1991) suggests that a less 
strict definition of $\gamma$ is required.
Consider the following AVM, from 
which features irrelevant to our concerns
have been eliminated:
[ cat  s
  pred like
  subj [ cat  np
         pred Robin ]
  comp [ pred swim ] ]
Figure 1. Exemplary 'underspecified' $\chi$
Under the '$\gamma$ is identity' approach,
this structure will not generate the 
sentence Robin likes to swim, even 
though one might expect it to. For 
although we suppose that somewhere 
in the grammar and lexicon there will 
be a constraint of identity between the 
subject of like and the subject of swim, 
which should be sufficient to as it were 
'fill in' the missing subject, the strict 
isomorphism definition of $\gamma$ will not allow
this.
II.1 Subsumption and extension 
If $\gamma$ were loosened to extension, the
inverse of subsumption, this would
then work straightforwardly (i.e.
$\gamma(\chi_s, \chi)$ iff $\chi_s \sqsupseteq \chi$, that is, $\chi_s$ extends $\chi$,
$\chi$ subsumes $\chi_s$). It is just this sort of
thing which seems to be required for
translation; see for example Sadler
and Thompson (1991) and the discussion
therein of Kaplan et al. (1989),
where $\chi$ for the desired target arises
as a side effect of the analysis of the
source, and $\chi_s$ is additionally constrained
by the target language
grammar. 3
Note that for Wedekind (1988) this
move amounts to removing the coherence
requirement, which prevents the
addition of information during
generation. Not surprisingly,
therefore, implementation of a generator
for $\gamma$ as subsumption is in some
cases straightforward: for the generator
of Momma and Dörre, for example,
it amounts to removing the
constraints they call COHA and
COHB, which are designed to implement
Wedekind's coherence requirement.
van Noord (1990) discusses allow- 
ing a limited form of extension, essen- 
tially to fill in atomic-valued features. 
This avoids a problem with the uncon- 
strained approach, namely that it has 
the potential to overgenerate seriously. 
3 Note that appealing to subsumption assumes
that both the inputs to generation ($\chi$) and the
results of analysis ($\chi_s$) are fully instantiated.
For the above example, for instance, 
the sentence Robin likes to swim on 
Saturdays could also be generated, on 
the assumption that temporal phrases 
are not subcategorised for, as $\chi_s$ in
this case clearly also extends $\chi$.
Rather than van Noord's approach, 
which is still too strong to handle e.g. 
the example in Figure 1 above, some 
requirement of minimality is perhaps 
a better alternative. 
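The contrast between identity and unconstrained extension can be made concrete with a small sketch. It assumes, purely for illustration, that underlying forms are nested Python dicts with atomic string values and no re-entrancy, and that a hand-written table (the hypothetical `analyses`) stands in for the products of analysis of a real grammar; the `adjunct` and `obj` features in it are invented for the example.

```python
def subsumes(general, specific):
    """True if every fact in `general` is also present in `specific`."""
    if isinstance(general, dict):
        return isinstance(specific, dict) and all(
            key in specific and subsumes(value, specific[key])
            for key, value in general.items()
        )
    return general == specific


# The underspecified input of Figure 1.
chi = {
    "cat": "s",
    "pred": "like",
    "subj": {"cat": "np", "pred": "Robin"},
    "comp": {"pred": "swim"},
}

# Hypothetical products of analysis; a real grammar would compute these.
robin = {"cat": "np", "pred": "Robin"}
analyses = {
    "Robin likes to swim": {
        "cat": "s", "pred": "like", "subj": robin,
        "comp": {"pred": "swim", "subj": robin},
    },
    "Robin likes to swim on Saturdays": {
        "cat": "s", "pred": "like", "subj": robin,
        "comp": {"pred": "swim", "subj": robin,
                 "adjunct": {"pred": "on", "obj": {"pred": "Saturday"}}},
    },
    "Robin swims": {"cat": "s", "pred": "swim", "subj": robin},
}

# Adequacy condition as identity: nothing is generated.
identity_hits = [s for s, chi_s in analyses.items() if chi_s == chi]
# Adequacy condition as extension: the intended sentence, plus overgeneration.
extension_hits = [s for s, chi_s in analyses.items() if subsumes(chi, chi_s)]

print(identity_hits)   # []
print(extension_hits)  # ['Robin likes to swim', 'Robin likes to swim on Saturdays']
```

Under these assumptions the identity test rejects every candidate, while the extension test admits both the intended sentence and the over-generated one with the temporal phrase.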
II.2 Minimal extension 
What we want is that not only
should $\chi_s$ extend $\chi$, but it should do so
minimally, that is, there is no other
string whose analysis extends $\chi$ and is
in turn properly extended by $\chi_s$.
Formally, we want $\gamma$ defined as 4
Definition 2.
$\gamma(\chi_s, \chi)$ iff
$\chi_s \sqsupseteq \chi$ and
$\nexists\, s'$ such that $\chi_{s'} \sqsupseteq \chi \wedge \chi_s \sqsupset \chi_{s'}$
This rules out the over-generation 
of Robin likes to swim on Saturdays 
precisely because $\chi_s$ for this properly
extends $\chi_s$ for the correct answer
Robin likes to swim, which in turn extends
the input $\chi$, as given above in
Figure 1. 
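A minimal sketch of Definition 2 follows. As an illustrative assumption, underlying forms are modelled as nested Python dicts without re-entrancy, and the hypothetical table `analyses` plays the role of the grammar's products of analysis; `generate_minimal` is a name introduced here, not one used in the paper.

```python
def subsumes(general, specific):
    """True if every fact in `general` is also present in `specific`."""
    if isinstance(general, dict):
        return isinstance(specific, dict) and all(
            key in specific and subsumes(value, specific[key])
            for key, value in general.items()
        )
    return general == specific


def properly_extends(larger, smaller):
    """`larger` extends `smaller` and the two are not equivalent."""
    return subsumes(smaller, larger) and not subsumes(larger, smaller)


def generate_minimal(chi, analyses):
    """Strings whose analysis extends `chi` minimally (Definition 2)."""
    extending = {s: a for s, a in analyses.items() if subsumes(chi, a)}
    return [
        s for s, a in extending.items()
        # Reject `a` if it properly extends some other analysis that
        # itself already extends the input.
        if not any(properly_extends(a, other) for other in extending.values())
    ]


# Input of Figure 1 and hypothetical products of analysis.
robin = {"cat": "np", "pred": "Robin"}
chi = {"cat": "s", "pred": "like", "subj": robin, "comp": {"pred": "swim"}}
analyses = {
    "Robin likes to swim": {
        "cat": "s", "pred": "like", "subj": robin,
        "comp": {"pred": "swim", "subj": robin},
    },
    "Robin likes to swim on Saturdays": {
        "cat": "s", "pred": "like", "subj": robin,
        "comp": {"pred": "swim", "subj": robin,
                 "adjunct": {"pred": "on", "obj": {"pred": "Saturday"}}},
    },
}

print(generate_minimal(chi, analyses))  # ['Robin likes to swim']
```

The quadratic comparison over all extending candidates is chosen for transparency only; it says nothing about how a practical generator would search.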
4 Hereafter I will use the 'intensional' notation
for extension, subsumption, unification and
generalisation, using square-cornered set
operators, as follows:
$ss \sqsubseteq ls$: $ss$ subsumes $ls$; $ls$ extends $ss$
$ss \sqsubset ls$: $ss$ properly subsumes $ls$; $ls$ properly extends $ss$
$ss_1 \sqcup ss_2 = ls$: $ss_1$ and $ss_2$ unify to $ls$
$ls_1 \sqcap ls_2 = ss$: $ls_1$ and $ls_2$ generalise to $ss$
The intuition appealed to is that of the set
operators applying to sets of facts ($ss$: smaller
set; $ls$: larger set).
II.3 Maximal Overlap 
Unfortunately, the requirement of 
any kind of extension is arguably too 
strong. We can easily imagine situa- 
tions where the input to the generation 
process is over-specific. This might 
arise in generation from content sys- 
tems, and in any case is sure to arise 
in certain approaches to translation 
(see section III below). By way of a 
trivial example, consider the input 
given below in Figure 2. 
[ cat  s
  pred swim
  subj [ cat    np
         gender masc
         pred   Robin ] ]
Figure 2. Exemplary 'overspecified' $\chi$
In the case where nouns in the lexicon
are not marked for gender, as
they might well not be for English,
according to Definition 2 no sentence can
be generated from this input, as $\chi_s$ for
the obvious candidate, namely Robin
swims, will not extend $\chi$ as it would
lack the gender feature. But it seems
unreasonable to rule this out, and indeed
in our approach to translation,
enforcing the extension definition as
above would be more than an inconvenience:
it would make translation
virtually unachievable. What
seems to be required is a notion of
maximal overlap, to go along with
minimal extension, since obviously
the structures in Figures 1 and 2 could
be combined. What we want, then, is
to define $\gamma$ in terms of minimal extensions
to maximal overlaps:
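Definition 3.
$\gamma(\chi_s, \chi)$ iff
$\chi_s$ and $\chi$ are compatible, that is,
$\chi_s \sqcup \chi \neq \bot$, and
they are maximally overlapped,
that is, $\nexists\, s'$ such that $\chi_s \sqcap \chi \sqsubset \chi_{s'} \sqcap \chi$,
and
$\chi_s$ minimally extends its overlap
with $\chi$,
that is,
$\nexists\, s''$ such that $\chi_{s''} \sqcap \chi = \chi_s \sqcap \chi \wedge \chi_s \sqsupset \chi_{s''}$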
Roughly speaking, $\chi_s$ must cover
as much as possible of $\chi$ with as little
left over as possible. Note that we have 
chosen to give priority to maximal 
overlap at the potential expense of 
minimal extension. For example, 
supposing all proper nouns are 
marked in the lexicon for person and 
number, and further that comitative
phrases are not sub-categorised for, 
then given the input 
[ cat  s
  pred swim
  subj [ cat    np
         person 3
         number sg
         pred   Robin ]
  comm [ cat   pp
         pcase comm
         pred  Kim ] ]
Figure 3. Exemplary $\chi$ for overlap/extension conflict
we will prefer Robin swims with Kim, 
with its extensions for the person and 
number features, as opposed to the 
non-extending Robin swims, because 
the latter overlaps less. Note that in 
the case of two alternatives with non- 
compatible overlaps, two alternative 
results are allowed by the above defini- 
tion. 
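The three clauses of Definition 3 can be sketched as follows. The sketch assumes that underlying forms are nested Python dicts without re-entrancy, uses `None` to stand in for $\bot$, and quantifies over a small hypothetical table of analyses instead of a whole language; `gamma3` and the two example tables are names invented for the illustration.

```python
def subsumes(general, specific):
    """True if every fact in `general` is also present in `specific`."""
    if isinstance(general, dict):
        return isinstance(specific, dict) and all(
            key in specific and subsumes(value, specific[key])
            for key, value in general.items()
        )
    return general == specific


def properly_subsumes(a, b):
    return subsumes(a, b) and not subsumes(b, a)


def unify(a, b):
    """Unification; returns None (standing in for bottom) on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, value in b.items():
            if key in out:
                merged = unify(out[key], value)
                if merged is None:
                    return None
                out[key] = merged
            else:
                out[key] = value
        return out
    return a if a == b else None


def generalise(a, b):
    """Generalisation: the facts shared by `a` and `b` (their overlap)."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = {}
        for key in a.keys() & b.keys():
            shared = generalise(a[key], b[key])
            if shared not in (None, {}):
                out[key] = shared
        return out
    return a if a == b else None


def gamma3(chi_s, chi, language):
    """Definition 3: compatible, maximally overlapped, minimally extended."""
    if unify(chi_s, chi) is None:
        return False
    overlap = generalise(chi_s, chi)
    for other in language:
        other_overlap = generalise(other, chi)
        if properly_subsumes(overlap, other_overlap):
            return False  # some other analysis overlaps the input more
        if other_overlap == overlap and properly_subsumes(other, chi_s):
            return False  # same overlap, but with less left over
    return True


robin = {"cat": "np", "person": "3", "number": "sg", "pred": "Robin"}
analyses = {
    "Robin swims": {"cat": "s", "pred": "swim", "subj": robin},
    "Robin swims with Kim": {
        "cat": "s", "pred": "swim", "subj": robin,
        "comm": {"cat": "pp", "pcase": "comm", "pred": "Kim",
                 "person": "3", "number": "sg"},
    },
}

# Input of Figure 3: overlap takes priority over minimal extension.
chi_fig3 = {"cat": "s", "pred": "swim", "subj": robin,
            "comm": {"cat": "pp", "pcase": "comm", "pred": "Kim"}}
# Input of Figure 2: over-specified with a gender feature.
chi_fig2 = {"cat": "s", "pred": "swim",
            "subj": {"cat": "np", "gender": "masc", "pred": "Robin"}}

for chi in (chi_fig3, chi_fig2):
    print([s for s, a in analyses.items() if gamma3(a, chi, analyses.values())])
```

With these toy tables the Figure 3 input selects Robin swims with Kim, and the over-specified Figure 2 input selects Robin swims even though no analysis carries the gender feature.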
Note that this approach is quite 
weak, in that it contains nothing like 
Wedekind's completeness condition:
if the grammar allows it, output may 
be produced which does not overlap 
large portions of the input structure, 
regardless of its status. For example 
structures which may be felt to be un- 
grammatical, as in Figure 4 below, 
may successfully generate surface 
strings on this account, i.e. Hours 
elapsed, despite 'leaving out' as 
'important' a part of the underlying 
form as the direct object.
[ cat  s
  pred elapse
  subj [ cat    np
         person 3
         number pl
         pred   hour ]
  obj  [ cat    np
         person 3
         number sg
         pred   Kim ] ]
Figure 4. Exemplary 'ungrammatical' $\chi$
If it is felt that generating anything 
at all from such an input is inappro- 
priate, then some sort of complete- 
ness-with-respect-to-subcategorised- 
for-functions condition could be added, 
but my feeling is that although this 
might be wanted for grammar debug- 
ging, in principle it is neither neces- 
sary nor appropriate. 
Alternatively one could attempt to
constrain not only the relationship between
$\chi_s$ and $\chi$, but also the nature of
$\chi$ itself. In the example at hand, this
would mean for instance requiring
some form of LFG's coherence restriction
for subcategorisation frames. In
general I think this approach would
be overly restrictive (imposing completeness
in addition would, for example,
rule out the $\chi$ of Figure 1 above as
well), and will not pursue it further
here.
It is interesting to note the consequences
for generation under this
definition of input at the extremes. For
$\chi = \top$ (or any structure with no
grammatical subset), the result will be
the empty string, if the language in- 
cludes that, failing which, interest- 
ingly, it will be the set of minimal sen- 
tences(-types) of the language, e.g. 
probably just intransitive imperative 
and indicative in all tenses for 
English. 
The case of $\chi = \bot$ is trickier. If $\bot$
is defined such that it extends everything,
or alternatively that the generalisation
of anything with $\bot$ is the
thing itself, then 1) $\bot$ is infinite so 2)
no finite structure can satisfy the
maximal overlap requirement; but in
any case $\bot$ fails to satisfy the first
clause of Definition 3, namely that the unification of
$\chi_s$ and $\chi$ must not be $\bot$, since if $\chi$ is $\bot$
then $\chi_s$ and $\chi$ unify to $\bot$ for any $\chi_s$.
Finally note that in cases where 
substantial material has to be sup- 
plied, as it were, by the target gram- 
mar (e.g. if a transitive verb is sup- 
plied but no object), then Definition 3 
would allow arbitrary lexicalisations, 
giving rise to a very large number of 
permissible outputs. If this is felt to be
a problem, then restricting (in the sense
of Shieber (1985)) the subsumption test
in the second half of Definition 3 to ignore
the values of certain features, i.e.
pred, would be straightforward. This
would have the effect of producing a 
single, exemplary lexicalisation for 
each significantly different (i.e. differ- 
ent ignoring differences under pred) 
structure which satisfies the mini- 
maximal requirements. 
II.4 A Problem with the Mini-maxi- 
mal Approach 
One potential problem clearly 
arises with this approach. It stems 
from its dependence on subsumption 
and its friends. Since subsumption, in 
at least some standard formulations 
(e.g. Definite Clause Grammars) fails 
to distinguish between contingently 
and necessarily equivalent sub-struc- 
tures, we will overgenerate in cases 
where this is the only difference be- 
tween two analyses, e.g. for Kim ex- 
pects to go and Kim expects Kim to go 
on a straight-forward account of Equi. 
One can respond to this either by say- 
ing that this is actually correct, that 
Equi is optional anyway (wishful 
thinking, I guess), or by adding side 
conditions to Definition 3 which 
amount to strengthening subsumption 
etc. to differentiate between e.g. the 
two graphs in Figure 5. As I do not at 
the moment see any way of expressing 
these side conditions formally without 
making more assumptions about the 
nature of underlying forms than I 
have so far had to (cf. for example
(Shieber 1986) where subsumption is 
defined in terms of a simulation plus 
an explicit requirement on the preser- 
vation of token identity), I will leave 
this point unresolved.
Figure 5. Two structures not distinguished by subsumption
III. THEORY-BASED TRANSLATION 
As mentioned above, the need to 
consider more carefully the nature of 
the adequacy conditions for the gener- 
ation relation has arisen from devel- 
opments in theory-based translation 
(Kaplan et al. 1989, Sadler and 
Thompson 1991, van Noord 1990). 
Although a range of different approaches
fall under this description,
they all share some amount of gram- 
maticalisation of translation regulari- 
ties. Furthermore, they all appeal to 
some form of reversibility or bi-direc- 
tionality. Figure 6 below provides a 
schematic characterisation of all these 
approaches, where $\Delta$ and $\Gamma$ are as before,
and $T$ is for an optional transfer
component.
$s_{source}$ --$\Delta_{G_{source}}$--> $\chi^{(\prime)}$ --($T_{source/target}$)--> $\chi^{(\prime)}$ --$\Gamma_{G_{target}}$--> $s_{target}$
Figure 6. Schematic characterisation of translation
The important point about these 
approaches is that the output of the 
analysis process is the input to the 
generation process. This is in con- 
trast to previous transfer approaches, 
in which transfer produces some dis- 
tinct new structure for input to gener- 
ation. If a transfer component is in- 
cluded in the approaches I'm con- 
cerned with, as in van Noord (1990), its 
rules function to elaborate the product 
of analysis, not replace it, and they 
could without loss of generality be in- 
corporated into the source and/or tar- 
get grammars. 
Now we can formalise the picture 
in Figure 6 as follows: 
Definition 4.
$\mathrm{TR}_{G_s,G_t}(s, t)$ (a string $s$ translates
to a string $t$ for grammars
$G_s$, $G_t$)
iff $\exists\, \chi_s$ such that $\Delta_{G_s}(s, \chi_s)$ and $\Gamma_{G_t}(\chi_s, t)$
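Definition 4 is a composition of analysis and generation, which a short sketch can make explicit. Several simplifying assumptions are made here: underlying forms are nested Python dicts without re-entrancy, each grammar is replaced by a hypothetical lookup table from strings to lists of analyses, the source analysis is assumed to deliver a structure already stated in target-language terms, and the adequacy condition defaults to plain extension (section II.1) only to keep the example short. The adequacy condition is a parameter, so a stricter or weaker test could be passed in its place.

```python
def subsumes(general, specific):
    """True if every fact in `general` is also present in `specific`."""
    if isinstance(general, dict):
        return isinstance(specific, dict) and all(
            key in specific and subsumes(value, specific[key])
            for key, value in general.items()
        )
    return general == specific


def extension(chi_t, chi):
    """Adequacy condition of section II.1: chi_t extends chi."""
    return subsumes(chi, chi_t)


def generates(chi, target_analyses, gamma=extension):
    """Strings t with some product of analysis adequate for chi."""
    return [
        t for t, products in target_analyses.items()
        if any(gamma(chi_t, chi) for chi_t in products)
    ]


def translates(s, source_analyses, target_analyses, gamma=extension):
    """Definition 4: some product of analysis of s generates t."""
    results = []
    for chi_s in source_analyses.get(s, []):
        for t in generates(chi_s, target_analyses, gamma):
            if t not in results:
                results.append(t)
    return results


# Hypothetical tables standing in for a source and a target grammar.
source_analyses = {
    "Robin swims": [{"cat": "s", "pred": "nager", "subj": {"pred": "Robin"}}],
}
target_analyses = {
    "Robin nage": [{"cat": "s", "pred": "nager", "tense": "pres",
                    "subj": {"cat": "np", "pred": "Robin"}}],
    "Kim nage": [{"cat": "s", "pred": "nager", "tense": "pres",
                  "subj": {"cat": "np", "pred": "Kim"}}],
}

print(translates("Robin swims", source_analyses, target_analyses))
```

The point of the sketch is only that the output of analysis is handed unchanged to generation; everything else about the tables is invented.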
The goal of this enterprise has been
to provide a version of $\gamma$ which makes
this a practical definition of theory-based
translation, and it should be
clear how all the phenomena which
were used in section II to motivate the
Definition 3 version of $\gamma$ are likely to
arise in translation. In particular,
the necessity for allowing the overlap
between $\chi_s$ and $\chi_t$ to be less than total
arises from the obvious asymmetry
which will exist between the syntactic
contents of the two: in whatever form
is appropriate to the grammatical theory
involved, $\chi_s$ will contain a full syntactic
analysis in the source domain,
and possibly only a root S node for the
target, while for $\chi_t$ the situation will be
reversed. The mini-maximal approach
given above covers this case
straightforwardly.
IV. BEYOND SUBSUMPTION 
The use of subsumption as the basis
for my explorations of $\gamma$ has another
problem, in that typically definitions
of subsumption require that the
structures to be compared share a 
common root. For reasons which 
would take too long to set out, this con- 
straint too may prove over-strong in 
certain translation cases. By way of il- 
lustration, consider translating into a 
language in which overt performatives
are required for all grammatical
utterances. We would then find that 
the translation into this language of 
e.g. Robin swims would involve a 
higher predicate, so for various parts 
of the product of analysis, the appro- 
priate relationship would hold not be- 
tween root and root, but between root 
and sub-part. This suggests that a 
weaker relationship, perhaps the existence
of a homomorphism, should replace
subsumption in the definition of
$\gamma$.
V. IMPLEMENTATION 
I have made some progress to- 
wards implementing a generator 
based on Definition 2 of section II. I 
believe it will be possible to provide an 
implementation which is guaranteed
to provide all and only the correct out- 
puts if any exist, but may fail to termi- 
nate if no output is possible. The basic 
idea is to constrain the generator to 
produce results in node-cardinality 
order, that is, smallest first. In fact, 
there is some slop in the most 
straightforward way of implementing 
this, in that it is fairly simple to limit 
the number of nodes allocated, but
more difficult to constrain the number
eventually used. What is guaranteed,
however, is that structures are produced
in an order which respects subsumption,
in that if $\chi_s$ subsumes $\chi_{s'}$,
then it will be generated first. This in 
turn means that one can enforce the 
minimality constraint of Definition 2. 
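The way a size ordering supports the minimality check can be sketched as follows. This is not the generator described above: a finite, hypothetical list of (string, analysis) candidates stands in for the generator's search space, underlying forms are assumed to be nested Python dicts without re-entrancy, and node cardinality is approximated by counting feature-value pairs. Under those assumptions a properly subsuming structure always has fewer nodes, so it is visited first.

```python
def subsumes(general, specific):
    """True if every fact in `general` is also present in `specific`."""
    if isinstance(general, dict):
        return isinstance(specific, dict) and all(
            key in specific and subsumes(value, specific[key])
            for key, value in general.items()
        )
    return general == specific


def node_count(fs):
    """Number of feature-value pairs, counted recursively."""
    if isinstance(fs, dict):
        return sum(1 + node_count(value) for value in fs.values())
    return 0


def minimal_generations(chi, candidates):
    """Visit candidates smallest first, keeping only minimal extensions."""
    accepted = []
    for string, chi_s in sorted(candidates, key=lambda pair: node_count(pair[1])):
        if not subsumes(chi, chi_s):
            continue  # does not extend the input at all
        if any(subsumes(prev, chi_s) and prev != chi_s for _, prev in accepted):
            continue  # properly extends an earlier, smaller result
        accepted.append((string, chi_s))
    return [string for string, _ in accepted]


robin = {"cat": "np", "pred": "Robin"}
chi = {"cat": "s", "pred": "like", "subj": robin, "comp": {"pred": "swim"}}
# Deliberately listed largest first; sorting restores cardinality order.
candidates = [
    ("Robin likes to swim on Saturdays", {
        "cat": "s", "pred": "like", "subj": robin,
        "comp": {"pred": "swim", "subj": robin,
                 "adjunct": {"pred": "on", "obj": {"pred": "Saturday"}}},
    }),
    ("Robin swims", {"cat": "s", "pred": "swim", "subj": robin}),
    ("Robin likes to swim", {
        "cat": "s", "pred": "like", "subj": robin,
        "comp": {"pred": "swim", "subj": robin},
    }),
]

print(minimal_generations(chi, candidates))  # ['Robin likes to swim']
```

Because any non-minimal candidate properly extends some minimal one that has already been accepted, comparing against the accepted results alone is enough in this finite setting.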
The problem arises with certain 
classes of recursive definition, both the 
simple left recursion cases of more 
traditional grammars, and the more 
complex ones of categorial-style ones. 
My best guess for these is to anticipate
that it would be possible to
(semi-)automatically prove that any such
rule produced via recursion a structure
which was 'subsumed' (as per
section IV above) by one with less
recursion. This in turn would mean
that provided some result had been 
found, the recursion could be termi- 
nated, since any further downstream 
result would fail the minimality con- 
straint. If however no result could be 
found, there would be no basis for 
stopping the recursion other than a 
very ad-hoc shaper test (Kuno 1965), 
based on some more or less arbitrary 
(depending on the application) limit on 
the size of the expected output. 
At the moment I have no ideas on 
how to implement a generator which 
respects Definition 3. 
ACKNOWLEDGEMENTS 
The work reported here grew out of 
work carried out while the author was 
a visitor to the Embedded Computation 
and Natural Language Theory and 
Technology groups of the Systems 
Science Laboratory at the Xerox Palo 
Alto Research Center. These groups 
provided both the intellectual and ma- 
terial resources required to support 
that earlier work, for which thanks. 
Many of the ideas presented here were 
first articulated in discussions with Jo 
Calder and Mike Reape, and many of 
the HCRC coffee room regulars also 
contributed patience and suggestions;
my thanks to them all. Thanks
also to Klaus Netter, in particular for 
first calling to my attention the be- 
haviour of Definition 3 with respect to 
the extreme cases. 

REFERENCES 
Estival, D. [1990] Generating French
with a Reversible Unification
Grammar. In Karlgren, H. (ed.) COLING90,
1990, pp106-111.
Gardent, C. and Plainfossé, A. [1990]
Generating from a Deep Structure.
In Karlgren, H. (ed.) COLING90,
1990, pp127-132.
Kaplan, R., Netter, K., Wedekind, J.
and Zaenen, A. [1989] Translation
by structural correspondences. In
Proceedings of the Fourth
Conference of the European
Chapter of the Association for
Computational Linguistics,
University of Manchester Institute
of Science and Technology,
Manchester, UK, 10-12 April, 1989,
pp272-281.
Kuno, S. [1965] The predictive analyzer
and a path elimination technique.
Communications of the
ACM, 8, 687-698.
Momma, S. and Dörre, J. [1987]
Generation from f-Structures. In
Klein, E. and van Benthem, J.
(eds.) Categories, Polymorphism
and Unification, pp148-167.
Edinburgh and Amsterdam:
University of Edinburgh, Centre for
Cognitive Science and Institute for
Language, Logic and Information,
University of Amsterdam.
Sadler, L. and Thompson, H. S. [1991]
Structural Non-Correspondence in
Translation. In Kunze, J. and
Reimann, D. (eds.) Proceedings of
the Fifth European Association for
Computational Linguistics, Berlin,
April, 1991, pp293-298.
Shieber, S. M. [1985] Using restriction
to extend parsing algorithms for
complex-feature-based formalisms.
In Proceedings of the 23rd Annual
Meeting of the Association for
Computational Linguistics, pp145-152.
Shieber, S. M. [1986] An Introduction
to Unification-based Approaches to
Grammar. Chicago, Illinois: The
University of Chicago Press.
Shieber, S. M., van Noord, G., Pereira,
F. C. N. and Moore, R. C. [1990]
Semantic-Head-Driven Generation.
Computational Linguistics, 16, 30-42.
van Noord, G. [1990] Reversible
Unification Based Machine
Translation. In Karlgren, H. (ed.)
COLING90, 1990, pp299-304.
Wedekind, J. [1988] Generation as
structure driven derivation. In
COLING88, Budapest, Hungary,
22-27 August, 1988, pp732-737.
