Language-Specific Mappings from Semantics to Syntax 
Judy Delin 
Department of English Studies 
University of Stirling 
Stirling FK9 4LA U.K. 
j Idl~st ir. ac. uk 
Donia R. Scott 
ITRI 
University of Brighton 
Mithras Annexe 
Lewes Road 
Brighton BN2 4AT, U.K. 
drs2@itri, bton. ac. uk 
Anthony Hartley 
ITRI 
University of Brighton 
Mithras Annexe 
Lewes Road 
Brighton BN2 4AT, U.K. 
afh~it ri. bton. ac. uk 
Abstract 
We. present a study of the mappings 
from semantic content to syntactic ex- 
pression with the aim of isolating the 
precise locus and role of pragmatic infor- 
mation in the generation process, l~om 
a corpus of English, French, and Por- 
tuguese instructions for consumer prod- 
ucts, we demonstrate the range of ex- 
pressions of two semantic relations, GEN- 
EI~ATION and ENABLEMENT (Goldman, 
1970) in each language, and show how 
the available choices are constrained syn- 
tactically, semantically, and pragmati- 
cally. The study reveals how multilin- 
gum NLG can be informed by language- 
specific principles for syntactic choice. 
1 Introduction 
We report here on work which addresses the 
message-to-syntax mapping in the context of au- 
tomatic generation of instructional texts the 
kinds of texts found in the procedural parts of 
manuMs or information leaflets, pharmaceuticM 
products. Instructional texts do not simply con- 
sist of lists of imperatives: instructions may also 
describe, eulogise, inform and explain. Generating 
good-quality draft instructions requires a detailed 
specification of how to map from semantic repre- 
sentations of the task actions onto a wide range of 
linguistic expressions. 
Our corpus is composed of naturally-occurring 
instructions in the three languages of study. Our 
overall approach is to obtain different-language 
drafts that are congruenl with the technical con- 
tent embodied in the task to be performed (and 
with other relevant information about the task). 
A satisfactory level of congruence requires the use 
of syntactic and pragmatic rules appropriate to 
each target language, mat)ping fi'om the seman- 
tics to appropriate expression in a way that is frec 
from influence from any source language 1. We 
1See (Hartley and Paris, 1995) for discussion of the 
begin the generation process with a plan-based 
model of the underlying task} 
In our study, we have looked at two specific 
procedurM relations that can hold between pairs 
of actions in a task, identified by the philoso- 
pher Alvin Goldman as the relations of GENEP~- 
ATION and ENABLEMENT (Goldman, 1970) re- 
lations which have the advantage of being tbr- 
mally specified (see e.g.(Pollack, 1986; Balkanski, 
1993)), and need to be expressed regularly within 
instructional texts. In section 2, we give a brief 
definition of generation and enablement, before 
going on in section 3 to describe how the two re- 
lations are realised in the corpus of Portuguese, 
English, and French instructions. 
2 The Semantic Relations 
Generation and enablement are relations that can 
hold between pairs of states, events, processes, or 
actions. A simple test of generation holding be- 
tween action pairs is whether it can be said that 
by performing one of the actions (a) under appro- 
priate conditions, the other (/9) will automaticMly 
occur (Pollack, 1986). If so, it can be said that c, 
generates/). The two actions must be performed, 
or perceived to be performed, by the same human 
agent, and the two actions must be asymmetric 
(i.e. if a generates fl, then fl cannot generate a). 
Simple examples of generation are ~ follows: 3 
(1) Heat gently to soften the coating. 
(2) Dial the numbers of the Mercury authorisa- 
tion code by pressing the appropriate num- 
bers on the keypad. 
In example 1, the action of heating gently ha~s 
the effect of softening the coating. In example 2, 
advantages of this approach over the inherent limi- 
tations of a translation-based approach to producing 
multilinguM instructions. 
2See (Paris et al., 1995) for a discussion of the mod- 
elling of domain knowledge for instructions in the coil- 
text of a support tool for drafting instructional texts. 
3So far, we have concentrated on examples of these 
semantic relations that do not cross sentence bound- 
aries. 
292 
pressing the correct keypad numbers has the auto- 
marie effect of dialing the numbers of the Mercury 
authorisation code. In each case, by performing 
the ¢t action (or set of actions), the user has au- 
tomatically performed the fl action. Note that 
the two actions can bc presented in either order: 
generatiNG first, or generatED first. 
q'he term cnablemeut is commonly used to re- 
fer to the procedural relation between precondi- 
tions and actions. It obtains between two actions 
where the execution of the first brings about a set 
of conditions that are necessary, but not necessar- 
ily suJJicienl for the subsequent performance of 
the second (Pollack, 1986). This is different from 
the generation ('rose, since enablement requires the 
further intervention of an agent - and it need not 
be the same agent - to bring about the fl even- 
tuality. 
(3) (?lose cover and test a.s recornmended in 'Op- 
eration' section. 
(4) l)br prolonged viewing, ttle slide may be 
pushed downwards and then backwards un- 
til it locks under the ledges at each end of 
the slot. 
Example 3, taken from the instructions for a 
household smoke alarm, shows the enabliNG ac- 
tion appearil, g tirst: closing the cover enables test- 
ing to take place, but does not automatically re- 
sult in a test. Example 4, front the instructions 
for a home photographic slide viewer, presents the 
enablED action ~ prolonged viewing - first, and 
describes to tile user what must be done to facili- 
tate it. 
These two relations have been formalised by 
Pollack (1986) and Balkanski (1993) for the pur- 
poses of plan recognition, and can be represented 
in a plan formalism that is a simple extension 
of STRll)S-styled operators developed by Fikes 
(1971) and expanded in the NOAII system (Sacer- 
doti, 1977). \[Iere, we summarise the two relations 
in the form of the following planning statements: 
,, (~ generates fl iff c~ is the body of a plan e 
whose goal is ft. 
,, oe enables fl if ce is a precondition of a plan e 
and/3 is the goal of plan e, or iffl is the body 
of e and t~ is a preconditkm of/3. 
In order to generate instructions clearly, it must 
be obvious which, if either of the two relations is 
intended at any given point: eonflmion of one with 
the other will lead to inadequte, incomplete, or 
even dangerous execution of the t~k described. 
3 From Semantics to Syntax 
Ilow, then, are generation and enablement re- 
alised in the three languages of study? In what 
follows, we look at the syntactic resources that 
are used in each language to convey the two parts 
of the two relations, and look at tile constraints 
on tile ordering of tile two parts; then, at what 
discourse markers play a role in further ensuring 
the clarity of the relation intended, and finally 
show how different rhetorical interpretations re- 
sult from these choices. Together, these factors 
explain a significant amount of the cross-linguistic 
wtriation that occurs within the instructions sub- 
language, in what follows, however, it is not our 
intention to suggest an ordering tbr the set of de~ 
cisions that need to be made for generation: so 
far, our research suggests a complex interaction of 
factors is involved in choice of expression, and tim 
ther research is required to establish their relative 
priorities in the decision-making process. 
Our corpora for the study consisted of 65 exam- 
pies of generation, and 65 examples of enablement 
for each of the three languages of study. 4 
3.1 Syntacti(. Resources 
The distribution of expressions among the two 
components tED and ING) of the generation re- 
lation for Portuguese is shown in figure 1 '~. 
hdinitive ~ ~ I --27 V 501 -3-I--'9~-~ 47.8 
hnperative | 0 I 0 I 0 \[ 26 I 19 I 45 I 34.6 
p,.<v,~ / ~1 °l 21 31 '21 '~/~.4 Subjunctive/ 4 I i I 5 I °l o \[ o/a.8 
Nominal 1 4 5 0 0 0 3 8 K k_4\[_ '51 ~ (~ )\] :.~ 
Figure l: Expressions of(~eneral, ion: Portuguese 
Three strong patterns emerge in the data. First, 
two syntactic forms, infinitives and imperatives, 
dominate; together they account for over 80% of 
the action expressions in tile data set. Second, tile 
overlap in expressions between ED and ING ele- 
ments is relatively small; it is confined to only two 
of the five types of expressions: infinitives and pas- 
sives. Finally, these data suggest that the order 
of occurrence of the ED and ING components in a 
sentence does not interact with decisions of choice 
of expression: in general, once a syntactic form is 
made available lbr expressing El) or ING compo- 
nents, it can be used irrespective of the order of 
occurrence of that component in the sentence. 
French (see figure 2) shows a strong prefer- 
ence for the use of the two forms of imperative 
(imperative- simple and imperative-infinitive), the 
infinitive anti tile gerundive. Overall, however, 
there is a more even spread between choices than 
4 In order to satisfy ourselves that the linguistic ex-- 
amples in the corpus were indeed representative reali- 
sations of the two semantic relations described, we also 
perfornmd an experiment requiring naive informa.ts 
to identify linguistic cxamples as cases of one or other 
relation. There was a high degree of agreement (m 
what constituted all example of each. 
~This includes only those syntactic categories for 
which we found more titan one example in the d~tta 
set; for this reason the percentages do not total to 
100. 
293 
that appearing in Portuguese. 
Syntax \] GeneratED GeneratiNG % L~ 
t 2nd Total 1st 2nd Total 
hnp've-Infin. I 15 0 15 13 5 18 25.4 
,nfinitive \]i 019.2 Gerundive 0 O 2 25 19.2 
Imp've-Simple O 7 10 5 15 16.9 
Finite 6 9 0 1 1 7.7 
Nominal 0 . 2 3 3 6 6.2 
Figure 2: Expressions of Generation: French 
French, unlike Portuguese and English, has two 
forms of imperative. One is identical in form to 
the infinitive of the verb and is usually associated 
with a generic addressee: a 'public' form of ad- 
dress. The imperative-simple, on the other hand, 
is identical in form to the second-person plural 6 
indicative of the verb and its use is associated 
with identifiable addressees. The fact that this 
form accounts for 40% of imperatives in the cor- 
pus may be seen as evidence for the increasingly 
user-oriented style of instructions for household 
appliances. 
Unlike in Portuguese, ordering does play a role 
in French. Both imperative-infinitive and impera- 
tive simple expressing generatED occur first, while 
a gerundive expressing generatiNG occurs second. 
In addition, Portuguese showed a strong differenti- 
ation between ED-specific and ING-specific forms, 
and therefore little overlap, but in French, overlap 
is much greater: only one form, the gerundive, is 
constrained to one part of the semantic relation 
(generatiNG). 
English appears the most permissive in terms 
of both overlap between ED and ING-bearing ex- 
pressions, and lack of influence of ordering - a 
combination of the characteristics of the other two 
languages. While there is a strong preference for 
infinitive and imperative forms, the influence of 
the part of the semantic relation only extends, 
as in French, to a single form: the appearance 
of the infinitive as an expression of generatED 
rather than generatiNG. The influence of order- 
ing appears to be at the level of weak preferences, 
in line with Portuguese, rather than the stronger 
role seen in French generation. The distribution 
of expressions in English generation is shown in 
figure 3. 
1 Ge.0r~G~.eratING \[lst\[2naITotal t 1st 2ha I Tota~ 1 
1251 7 I 32\] 0 O 0124.4 I \[ 8J 1 I 9 I 10 29 I 39 136.71 
I 41 3\] 7 I 1 0 11 6.11 i11 l I 2 I .3 20 23\]19.0 I 
i 81 4 I 12 I 3 o I 3111.5 I 
Syntax 
Infinitive 
imperative 
Passive 
Nominal 
Other Finite 
Other 
Figure 3: Expressions of Generation: English 
Portuguese uses a very small subset of of the 
available syntactic resources of the language to ex- 
press enablement: only infinitives, imperatives 
(together, over 85% of the data set) and nominals 
6We ignore here the singular, familiar imperative. 
express enablement 7. While there was no ordering 
preference in Portuguese generation, there is an 
ordering constraint on enablement: imperatives 
expressing ED do not appear first. The distribu- 
tion of syntactic forms (figure 4) shows that, while 
there is a high degree of overlap in terms of ex- 
pression of 1NG or ED, a system of preferences 
operates: the infinitive is three times as likely to 
be used for the ED than ING component, while 
the imperative is twice as likely to be used for ING 
than for ED. 
Syntax 
Infinitive 
hnperative 
Nomina! 
EnablED ~ EnabliNG 
- 1st 2n~ Tot-al 1st 2nd Total I \] Infinitive 17 ~5 3 8 
0 29 29 3 19 49 
3 2 5 3 2 
Figure 4: Expressions of Enablement: Portuguese 
French has a relatively broad range of expres- 
sions available for enablement (see figure 5)-much 
wider than Portuguese. As was the case for gen- 
eration, French enablement shows a strong order- 
ing preference: when an imperative is used as en- 
ablED, it must be placed second (if expressing 
generatED, it must be placed first). The gerun- 
dive is strongly marked for generation, and in the 
rare cases it is used in enablement is restricted to a 
single semantic role: expressing enabliNG, rather 
than enablED-the only French expression so re- 
stricted. Euablement is most regularly expressed 
by the imperative-infinitive. 
ySy,"a~ I EnablEO ~ E,,ab.NG / 
Dstl 2ha I Total/ 1st 2nd~t~\] _ \[ /6 23 29/~71 61 3314HI 
\] Imp've-Infin.InfinitiVeGerundive | ~l 1~1 l~l ~l ~l : I 1.~:~l 
Imp're-Simple/ O I 7 / 7/ 10 \] 4 I 14 I 16.'_2 \] 
NoInillal ~_~~ 
Figure 5: Expre~ions of Enablement: French 
In English (figure 6), although the imperative is 
the most popular expression of enablement, (over 
60% of tokens), when it expresses the enablED 
part of the relation, it must appear second: to 
place it first would be misleading, as it would 
imply that this action should be performed first. 
There is also a constraint arising from the part of 
the semantic relation being expressed: infinitives 
do not express the enabliNG action. Infinitives 
are only capable of conveying a goal, and the en- 
abliNG element is not the goal. 
Syntax 
Infinitive 
Imperative 
Pa.ssiw~ 
Nominal 
Other Finite CI. 
Other 
4 0 0 6.1 
|01 tl 11 31 31 61 5.41 191 91 lSl 3 I ol 3lt6.2 I 
/31 31 61 41 31 7110.0 I ±_£L° L ~ °_I__°A___ °~L_°'sS 
Figure 6: Expressions of Enablement: English 
294 
potactic conjunctions. 
Discourse 
~n S~ 
(72.3%) 
Purpose 08.5%) 
\[ ____ 
Syntax 
et (ED) nnperative | ,mperatlv( 
puis (ED) imperative | imperative 
avant de (El)) infinitive ~ imperatiw: 
avast tED) nominal \[ imperative 
apr/~s (\[NG) imperative \[ infinitive 
apr6s (ING) imperative | nominal 
/ pour (ED) inner{lye, i~o~\]qn~er-dtlw~ 
afin de tED) infinitive \[iml)erative 
de fa(;oI~l'\]D~ inlinitive Limperative 
Figure 10: French: Enablement 
The relationship with actual temporal ordering 
of events plays no role in determining ordering 
in the case of avant de and apr~s followed by an 
infinitive: the two possible orderings are eqnally 
likely. In the case of avast and apr& followed by 
a nominal, there is a strong preference for placing 
the prepositional phrase containing the nominal 
first. Clearly, this yields an iconic ordering in the 
cause of apt& and a non-iconic ordering in the case 
of avant. 
Apr~s ddpoussidrage, appliquerdeux couches 
de peinture vinylique. 
After dusting, apply two coats of 
vinyl paint. 
(6) Avant l'emploi, faites tremper le boyau dans 
de l'eau ti6de. 
Before use, soak the tube in warm 
water. 
As in Portuguese, though, both rhetorical rela- 
tions are clearly marked, and there is a similar, 
Mthough less marked, tendency to view the se- 
mantic content of the enablement relation as being 
one of temporal sequence. 
3.2.3 English 
English h~s the greatest tolerance of mmmrked 
discourse relations among the languages studied: 
only 37 of the 130 clauses examined appeared with 
a marker of any kind. The majority of markers 
were instances of by appearing with a nominali- 
sation to convey the generatiNG part of the rela- 
tion, showing a preference for communicating this 
semantic content in terms of the rhetorical rela- 
tion of MF, ANS in English 9 18 of these 19 instances 
of by appeared when the generatED element was 
presented first: by is used to signal the MEANS re- 
lation when conflmion might otherwise result from 
a user attempting to perform the generated action, 
9As stated at the outset, however, we cannot yet 
state the ordering of the relevant semantic, syntactic, 
and rhetoricM decisions. 
presented first, rather than the generating action. 
\[\[ Di;~oT,;g& -- 
\[ ~iilati~ cue 
I \[',trpos~ 
j (50.8%) simply (\[NG) 
generatED 
infinitive 
infinitive 
Syntax 
I so that (H)) \[inite clause 
I for tED) nominal i 
I Means by (\]NG) \]m~r~ff\[{e.-~- (29,2)% 
Itesult if/when (ING) ~e (la.S%) by (IN(:) 
and tED) 
O subject complement 
Condition if (EI)) fiit~te.. 
\[ i~ 2%) _ _ 
Figure It: English: Generation 
generatiNG 
imperative 
imperatiw! 
imperative 
-ho~n~I 
nominal 
imperative 
nominal 
~npc~e--- -- 
The markers simply and just appear only with 
generatiNG imperatives. For appears only with 
NP, and marks only generatED elements. So that, 
which appears rarely, marks only generatE1) elc- 
ments. The less common and, if and when couhl 
appear with either ING or 1'3). 
Even though English does not mark the two 
parts of the generation relation explicitly by 
means of discourse markers, the combination of 
ordering, syntax, and rhetorical relation results in 
all but one c~e in an unambiguous interpretation. 
PUIU'OSF,, is the only relation that is expressible 
in both ED-first and ING-first order: in fact, it is 
only infinitives and for with a nominal that can 
appear either before or after their main chmse.t° 
The range of rhetorical relations available for the 
expression of generation is, however, the greatest 
of the three languages, consisting of a superset of 
the relations adopted in French and Portuguese. 
Enablement in English is expressed most fl'e- 
quently by SEQUENCI.;, which, with appropriate 
temporal markers, can appear in both iconic and 
non-iconic order: the few non-iconic cases (5) are 
n,arked with before, follow ling\] by \[ed\], and fol- 
lowed by. PURPOSE is also a frequent interpreta- 
tion. For enablement, some discourse markers are 
exclusive, and some ambiguous. If appears ex- 
clusively with the E1)-first presentation, and and, 
then, followed by, follow X by, and now only ap-- 
pear with the ING-first ordering, ~ do commas. 
7'0, for and before are ambiguous. Finally, just 
and simply are markers that only appear with the 
ING clement, but there is always another marker 
that appears in the ED element in conjunction 
with them. 
l°'l'hese alternations in ordering are discussed in 
(Vander l,inden, 1.993) in terms of the intention t.o 
convey optionality or oblig~ttoriness of tile action in 
t|te matrix clause. 
295 
As was the case with generation in English, 
there is a high degree of overlap between all other 
expressions of the two parts of the relation. 
3.2 Discourse Markers and Rhetorical 
Relations 
Very strong correlations appear between particu- 
lar choices of semantic relation and syntactic form 
on the one hand, and the appearance of discourse 
markers and/or a strong bias towards a particular 
rhetorical interpretation on the other. Our anal- 
ysis shows that selection of syntactic expression 
and local discourse relation strongly interact, and 
provides a rather clearer picture of the influences 
that bear on the mapping front semantics to syn- 
tax. 
A particularly important element to emerge 
is the language-specific nature of the choice of 
rhetorical relation, a notion which we express for 
the moment in terms of l~ST-style rhetorical rela- 
tions, of. (Mann and Thompson, 1988). The anal- 
ysis represents a careful but intuitive interpreta- 
tion of what rhetorical relation would be retrieved, 
by a native speaker of the language, front the par- 
ticular combination of syntax, discourse marker, 
and content. What triggers these interpretations 
is constrained both by semantic content and by 
the conventions and syntactic resources available 
within the languages of study. 
3.2.1 Portuguese 
Portuguese appears to have obligatory signalling 
of discourse relation by a discourse marker, or 
at the very least by punctuation s . Three dis- 
course relations are available for generation (PUR- 
POSE, CONDITION, RESULT) and two for enable- 
ment (PURPOSE, SEQUENCE). For generation, the 
dominant relation is PURPOSE (80%); for enable- 
ment it is SEQUENCE (72%). The overlap in the 
syntax of generation vs enablement sentences is 
confined to expressions of PURPOSE. Figures 7 and 
8 show the relationships between semantic relation 
and syntax with an overlay of discourse: rhetorical 
relations and discourse markers. 
\[ Discourse Syntax 
generatED generatl NG 
(80.1%) 
Condition 
(9.3%) 
Result 
(1..5%) 
cne 
caso ED 
se El) 
e + aulomat- 
icamente ED 
~-nfinitive 
infinitive 
nominal 
infinitive 
p;msive 
f'u\[ure 
(agent = artifact) 
~imperative,passive, 
infinitive} 
basta + infinitive 
{imperative, infinitive} 
imperative} 
infinitive/imperative 
~npera6C, Z 
Figure 7: Portuguese: Generation 
The figures show a strong and unambiguous re- 
SThis can be compared with the finding presented 
by (Moser and Moore, 1995) in their ACL-95 presen-. 
tation, that discourse cues are significantly more likely 
to occur when the 'contributor' component of the re- 
lation PRECEDES the 'core' in English dialogues. 
lationship between rhetorical relation, discourse 
marker, and syntax. There is a strong tendency 
to present generation in terms of the rhetorical 
relation of I'URPOSE, which is marked almost in- 
variably by para or para que (so that), both of 
which can only take a nominal or infinitival ex- 
pression. SEQUENCE, on the other hand, is sig- 
nalled by temporal connectives such as antes de 
(before), depois de (after) or apes (afte,9, by the 
connective e (and) or implicitly by the use of a 
comma between the elements of a string of imper- 
atives. For Portuguese enablement, two relations 
are preferred: PURPOSE and SEQUENCE, with a 
strong preference for the latter. Again, the mark- 
ers of each rhetorical relation are distinct. 
Discourse Syntax 
relation cue 
\['urpose para (ED)-- 
(18.5%) 
Sequence antes de tED) 
(72.3%) depois de (ING) 
ap6s (ING) o tED) 
-~\]mblED enabliNG 
(n~i, infinitive} \]imperative 
infinitive I ba.sta + infinitive 
L _ 
imperative I infinitive 
imperative I ! ....... inal, infinitive} 
imperative I !mperative 
imperative \[imperative 
Portuguese: Enablement 3.2.2 h.encl~ig.re s: 
In French, discourse markers do not accompany 
all expressions of generation. However, where they 
do occur, the markers unambiguously assign the 
expressions to one or other of the plan elements: 
body (si, en and par) or goal (pour, afin de, and 
de fafon it). While Portuguese generation is over- 
whelmingly expressed through the rhetorical re- 
lation of PURPOSE, in French it is more evenly 
distributed between PURPOSE and MEANS, with 
a small showing for CONDITmN. Although not 
shown in the figure, there is only one case in 
French generation where the choice of ordering 
of the elements and the choice of marker-plus- 
expression are not mutually constraining: this is 
when the preposition pour is followed by an infini- 
tive. In this case, the two orderings of the relation 
(ING first or ED first) are more or less equally 
probable. 
\[7)nrp-po.~ -- \] p .... ED 
(41.5%) pour (ED) 
afin de (ED) 
de faqon ~. (ED) 
Condition si (ING) (0.2%) 
\[ Means ien (ING) 
L('13.t%l_Lpa~ .... 
Syntax ge,,e~atED ~nne~ 
infinitive tmperat!ve I 
nominal imperat!vc I mlperat!vc I 
infinitive imperatlve~ infinitive 
imperative, finite nominM 
Figure 9: l~ench: Generation 
The expression of enablement in French instruc- 
tions (figure 10) is limited to two rhetorical rela- 
tions: SEQUENCE and PURPOSE. Situations where 
the choice of ordering of the elements and the 
choice of marker-plus-expression are not mutu- 
ally constraining are limited to the PURPOSE dis- 
course relation marked by pour and the SEQUENCE 
discourse relation marked otherwise than by hy- 
296 
?elation 
Sequence (68%) 
_ Syntax 
cue enablED enabliNG-- {~nd, t-h-~, 
now (ED)} imperative imperative 
finite finite 
{~, ,} imperative imperative 
{when, after (ING)} imperative nominal 
follow (ING) by (ED) nominal nominal 
followed by (ED) nominal imperative 
before (ED) nominal {imperative, finite} 
finite finite 
(21.4%) for (ED) nominal {finite,imperative} 
for (ED) nominal imperative 
in order to (ED) infinitive finite 
Condition when (ING) 
(7.6%) if(ED) imperative \] finite 
Result once (ING) 
(3.0%) which (ED) 
Figure 12: English: Enablement 
4 Conclusions and Implications 
In this paper, we have gone some way towards 
isolating the specific point in the generation pro- 
cedure at which pragmatic information such as 
rhetorical relation must be brought into play. 
Since our notion of semantic content is based 
on a formal model of the task plan to be con- 
veyed to the instruction user, the significance of 
the approach is clear for developing natural lan- 
guage generation applications within this limited 
domain. In particular, for rhetorical planning of 
the communication of particular content, we can 
use the preferences observed for selection of the 
preferred rhetorical relation in the language in 
question. Second, we can use our knowledge of 
how that relation is constituted and expressed in 
terms of syntax for marking the relation appropri- 
ately. 
The approach also reveals some interesting facts 
about the individual languages. In particular, we 
found different levels of tolerance of resid- 
ual ambiguity: Portuguese has little ambigu- 
ity in the mapping from semantic content to syn- 
tactic realisation (the least ambiguous markers of 
rhetorical relation, fewest available syntactic re- 
alisations, least overlap in the roles of these real- 
isations for conveying one or the other semantic 
relation, most restricted set of favoured rhetori- 
cal relations). English, on the other hand, had 
the opposite characteristics. We also found dif- 
fering preferences for rhetorical relations in 
expressing semantic content: for example, while 
Portuguese expresses generation in over 80% of 
cases with the relation of PURPOSE, French gener- 
ation divides this relation almost equally between 
PURPOSE and MEANS. The English corpus, on 
the other hand, while it has a strong showing for 
PURPOSE (around 50%), reveals a relatively strong 
showing (around 14%) for the relation of RESULT, 
a relation found in only 1.5% of the Portuguese 
relations and not at all in French. 
No natural language has an unambiguous map- 
ping from semantics to surface syntax, which 
makes the information encoded by syntax, both 
semantic and pragmatic, very difficult to con- 
sciously 'unpack' from surface form in the per- 
formance of the translation task. We suggest that 
uncovering the decisions necessary for producing 
pragmatically-appropriate sets of parallel instruc- 
tions is a task best performed as an empirical 
study along the lines suggested here. In this way, 
we can encode language-specific pragmatic princi- 
ples into tools that support the process of multi- 
lingual document production. 
References 
Cecile T. Balkanski. 1993. Actions, Beliefs and 
Intentions in Multi-Action Utterances. Ph.D. 
thesis, Harvard University, May. 
R. E. Fikes and Ntis Nilsson. 1971. STRIPS: 
a new approach to the application of theorem 
proving to problem solving. Artificial Intelli- 
gence, 2:189-208. 
Alvin I. Goldman. 1970. A Theory of Human 
Action. Prentice Hall, Englewood Cliffs, NJ. 
Anthony F. Hartley and C6cile L. Paris. 1995. 
Supporting Multilingual Document Production: 
Machine Translation or Multilingual Genera- 
tion? In working notes of the IJCAI-95 Work- 
shop on Multilingual Text Generation, August 
20-21, Montr6al, Canada. Also available as 
ITRI report ITRI-95-13. 
William C. Mann and Sandra A. Thompson. 
1988. Rhetorical structure theory: Toward a 
functional theory of text organization. Text: 
An Interdisciplinary Journal for the Study of 
Text, 8(2):243-281. 
M. Moser and J. Moore. 1995. Investigating Cue 
Selection and Placement in Tutorial Discourse. 
In Proceedings of the 33rd Annual Meeting of 
the ACL, pages 130-150, Boston, Mass. 
C~cile Paris, Keith Vander Linden, Markus 
Fischer, Anthony Hartley, Lyn Pemberton, 
Richard Power, and Donia Scott. 1995. A sup- 
port tool for writing mnltilingual instructions. 
In Proceedings of the Fourteenth International 
Joint Conference on Artificial Intelligence, Au- 
gust 20-25, Montrdal, Canada, pages 1398-1404. 
Also available as ITRI report ITRI-95-11. 
Martha E. Pollack. 1986. Inferring Domain Plans 
in Question-Answering. SRI Technical Report 
SRIN-403. 
Earl D. Sacerdoti. 1977. A Structure for Plans 
and Behavior. Elsevier, New York. 
Keith Vander Linden. 1993. Speaking of Ac- 
tions: Choosing Rhetorical Status and Gram- 
matical Form in Instructional Text Generation. 
Available as Technical l{eport CU-CS-654-93. 
297 
