Planning texts by constraint satisfaction 
Richard Power 
Information Technology Research Institute 
University of Brighton 
Lewes Road 
Brighton BN2 4GJ, UK
Richard.Power@itri.bton.ac.uk
Abstract 
A method is described by which a rhetorical-structure tree
can be realized by a text structure made up of sections,
paragraphs, sentences, vertical lists, and other textual
patterns, with discourse connectives added (in the correct
positions) to mark rhetorical relations. We show that
text-structuring can be formulated as a Constraint
Satisfaction Problem, so that all solutions respecting
constraints on text-structure formation and structural
compatibility can be efficiently generated. Of the many
solutions generated by this method, some are stylistically
preferable to others; we show how further constraints can be
applied in order to select the best versions. Finally, we
discuss some extensions such as the generation of indented
text structures.
1 Introduction 
Much recent work on text generation (Rosner
and Stede, 1992; Hovy, 1993; Mellish et al., 1998)
has made use of discourse representations based 
on Rhetorical Structure Theory (RST) (Mann and 
Thompson, 1988). Interest has focussed in particular
on the problem of building a rhetorical structure
(RS) which organizes elementary propositions hierarchically
by means of RST relations (Marcu, 1996).
There has been less attention to a second problem
in text planning, that of realizing the RS by a text
structure (TS), in which the material in the RS is
distributed among paragraphs, sentences, vertical lists,
etc., perhaps linked up by discourse connectives such
as 'since' and 'however'. This task, which we will
call text structuring, is typically addressed through
a micro-planning phase that determines the content
of successive sentences. However, documents of
realistic complexity require richer TSs including, for
example, vertical lists, sub-sections, and clauses
separated by semi-colons.
We describe in this paper a text-structuring system
that has been developed within ICONOCLAST¹, a
project which investigates applications of
constraint-based reasoning in Natural Language Generation
¹ ICONOCLAST is supported by the UK Engineering and
Physical Sciences Research Council (EPSRC) Grant L77102.
concession
  approve(fda, elixir-plus)
  cause
    ban(fda, elixir)
    contain(elixir, gestodene)
Figure 1: Rhetorical structure
using as subject-matter the domain of medical
information leaflets. Following Scott and de Souza
(1990), we represent rhetorical structure by graphs
like figure 1, in which non-terminal nodes represent
RST relations, terminal nodes represent propositions,
and linear order is unspecified (for regularity,
the nucleus is arbitrarily presented on the left of the
satellite). One of many possible TSs realizing this
RS is shown in figure 2, an ordered tree in which
nodes are labelled with 'text-categories' (Nunberg,
1990); the terminal nodes hold either discourse
connectives (which owing to their interaction with text
structure have already been selected) or propositions
(to be realized in their turn during tactical
generation). After passing this TS to the tactical
generator, we might obtain the following output²:
The FDA bans Elixir since it contains gestodene;
however, the FDA approves ElixirPlus.
Part of the interest of the problem is that RS and
TS are not always isomorphic; this will be illustrated 
later by an alternative TS realizing figure 1 (figure 
6b). 
Our goal in ICONOCLAST has been to explore the 
huge variety of ways in which an RS can be con- 
veyed, noting stylistic reasons why one version might 
be preferred to another, with the eventual aim of 
providing a system in which the user enjoys fine- 
grained control over style as well as content. These 
requirements cannot be met by a text structurer
² The content of the examples is of course fictional.
sentence
  text-clause
    text-phrase: ban(fda, elixir)
    text-phrase
      text-phrase: "since"
      text-phrase: contain(elixir, gestodene)
  text-clause
    text-phrase: "however"
    text-phrase: approve(fda, elixir-plus)
Figure 2: Text structure
that merely returns one or two satisfactory solutions,
relying perhaps on a library of schemas. We
need a method for enumerating all the candidate
solutions that can be composed from a given set of
text-categories. By a 'candidate' we mean a solution
that correctly realizes the RS without violating
text-structure formation rules; it may nevertheless
be stylistically inept. Having generated a set of
candidate text structures, the ICONOCLAST system
evaluates them through rules that detect stylistic flaws,
and on this basis re-arranges them in an order of
preference. We will discuss stylistic evaluation briefly, but
the focus of the paper is the problem of enumerating
solutions.
2 Formation rules 
A text structure is defined in ICONOCLAST as an ordered
tree in which each node has a text-category
comprising two features named TEXT-LEVEL and
INDENTATION. Values of TEXT-LEVEL are represented
by integers in the range $0..L_{Max}$; these may be
interpreted in various ways, but we will assume here
that $L_{Max} = 4$ and that integers are paired with
descriptive labels as follows:
0 text-phrase
1 text-clause
2 text-sentence
3 paragraph
4 section
The meanings of 'section' and 'paragraph' are the
usual ones, except that section titles are ignored: a
section is simply a sequence of one or more paragraphs.
Following Nunberg (1990), 'text-sentence'
denotes a unit normally punctuated with a capital
letter and a full stop; this is distinguished from the
syntactic concept of 'sentence', which depends on
syntactic formation rules. Thus the following paragraph
consists of three text-sentences which contain,
respectively, one, zero, and two syntactic sentences:
He entered the room. Disaster. The safe was
open and the money had gone.
A text-clause is a unit that would normally be
punctuated with a semicolon; the text-sentence you are
now reading contains two text-clauses, but the second
semicolon does not appear because it has been
'absorbed' into the full stop that marks the whole
text-sentence. Within a text-clause, hierarchy is
determined by syntax rather than text-structure, so all
units within a text-clause are assigned the minimal
TEXT-LEVEL of zero.
The purpose of INDENTATION is to allow indented
text structures like bulleted lists; the feature takes
values in the range 0, 1, 2 ..., where unindented
text has INDENTATION = 0,
  • a list item has INDENTATION = 1
    • a list item within a list item has INDENTATION = 2
and so forth. To simplify the presentation, we will
assume for now that all nodes have INDENTATION = 0,
so that text-categories are distinguished only by
TEXT-LEVEL.
Informally, a text structure is well-formed if it
respects the hierarchy of textual levels, so that sections
are composed of paragraphs, paragraphs of
text-sentences, and so forth. An example of an ill-formed
structure would be one in which a text-sentence
contained a paragraph; such a structure can occur only
when the paragraph is indented, a possibility we
are excluding here. Formally, the text-structure
formation rules are as follows:
1. A text structure is an ordered tree in which
each node i has a TEXT-LEVEL $L_i$ in the range
$0..L_{Max}$.
2. If a node p has a daughter node d, then p must
have a TEXT-LEVEL one rank higher than d, unless
both nodes have the minimal level 0. In
other words, either
(a) $L_p = L_d + 1$, or
(b) $L_p = L_d = 0$
(From this it follows that any nodes that are
sisters must have the same level.)
3. All terminal nodes must have the minimal
TEXT-LEVEL of 0.
In most applications it would also make sense to set
a lower limit on the root node. For instance, we
might apply the constraint $L_{root} \geq 2$ to ensure that
the whole text is at least a text-sentence.
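The three formation rules, together with the optional lower limit on the root, can be checked mechanically. The following Python fragment is only an illustrative sketch: the class `TSNode`, the function `is_well_formed` and the parameter `min_root_level` are names invented here, and INDENTATION is ignored, as in the presentation above.

```python
from dataclasses import dataclass, field

L_MAX = 4  # maximum TEXT-LEVEL assumed in the text (4 = section)


@dataclass
class TSNode:
    """Hypothetical text-structure node: a TEXT-LEVEL plus ordered daughters."""
    level: int
    children: list = field(default_factory=list)


def _check(node):
    """Apply formation rules 1-3 to `node` and everything below it."""
    if not 0 <= node.level <= L_MAX:              # rule 1: level in 0..L_Max
        return False
    if not node.children:                          # rule 3: terminals have level 0
        return node.level == 0
    for child in node.children:                    # rule 2: one rank down, or both 0
        one_rank_down = node.level == child.level + 1
        both_minimal = node.level == 0 and child.level == 0
        if not (one_rank_down or both_minimal):
            return False
        if not _check(child):
            return False
    return True


def is_well_formed(root, min_root_level=0):
    """Check the formation rules, with an optional lower limit on the root."""
    return root.level >= min_root_level and _check(root)


def phrase(*kids):
    """Convenience constructor for a text-phrase (level 0)."""
    return TSNode(0, list(kids))


# The text structure of figure 2: one sentence made of two text-clauses.
FIGURE_2 = TSNode(2, [
    TSNode(1, [phrase(), phrase(phrase(), phrase())]),
    TSNode(1, [phrase(), phrase()]),
])
```

Under these assumptions `is_well_formed(FIGURE_2, min_root_level=2)` holds, whereas a text-sentence that directly contains a paragraph is rejected by rule 2.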
3 Compatibility 
As well as being a well-formed text structure, a
candidate solution must realize a rhetorical structure
'correctly', in a sense that we need to make precise.
Roughly, a correct solution should satisfy three
conditions:
1. The terminal nodes of the TS should express
all the elementary propositions in the RS; they
may also contain discourse connectives expressing
rhetorical relations in the RS, although for
some relations discourse connectives are optional.
2. The TS must respect rules of syntax when it
combines propositions and discourse connectives
within a text-clause; for instance, a conjunction
such as 'but' linking two text-phrases
must be coordinated with the second one.
3. The TS must be structurally compatible with
the RS.
The first two conditions are straightforward, but 
what is meant by 'structural compatibility'? We 
suggest the crucial criterion should be as follows: 
any grouping of the elementary propositions 
in the TS must also occur in the RS. In other 
words, the text-structurer is allowed to eliminate
groupings, but not to add any. More formally: 
• If a node in the TS dominates terminal nodes
expressing a set of elementary propositions,
there must be a corresponding node in the RS
dominating the same set of propositions.
• The converse does not hold: for instance, an RS
of the form $R_1(R_2(p_1, p_2), p_3)$ can be realized
by a paragraph of three sentences, one for each
proposition, even though this TS contains no
node dominating the propositions ($p_1$ and $p_2$)
that are grouped by $R_2$. However, when this
happens, the propositions grouped together in
the RS must remain consecutive in the TS;
solutions in which $p_3$ comes in between $p_1$ and $p_2$
are prohibited.
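The compatibility criterion can be made concrete with a small check. The sketch below is an illustration under simplifying assumptions, not the ICONOCLAST implementation: an RS is written as a nested tuple `(relation, arg, arg, ...)` with proposition names as strings, a TS as nested lists of proposition names (connectives and text-categories are left out), and the function names are invented for the example.

```python
def rs_groups(rs):
    """Proposition sets dominated by each RS node (leaves included)."""
    if isinstance(rs, str):
        return {frozenset([rs])}
    groups = set()
    for arg in rs[1:]:                      # rs[0] is the relation name
        groups |= rs_groups(arg)
    groups.add(frozenset().union(*groups))  # the set dominated by this node
    return groups


def ts_groups(ts):
    """Return (propositions in text order, proposition sets of all TS nodes)."""
    if isinstance(ts, str):
        return [ts], {frozenset([ts])}
    leaves, groups = [], set()
    for child in ts:
        child_leaves, child_groups = ts_groups(child)
        leaves += child_leaves
        groups |= child_groups
    groups.add(frozenset(leaves))
    return leaves, groups


def is_compatible(rs, ts):
    """True if the TS adds no grouping and keeps every RS grouping consecutive."""
    rs_sets = rs_groups(rs)
    leaves, ts_sets = ts_groups(ts)
    all_props = frozenset().union(*rs_sets)
    # Every elementary proposition must be expressed exactly once.
    if len(leaves) != len(all_props) or frozenset(leaves) != all_props:
        return False
    # Groupings may be eliminated but not added.
    if not ts_sets <= rs_sets:
        return False
    # Propositions grouped in the RS must stay consecutive in the text.
    position = {p: i for i, p in enumerate(leaves)}
    for group in rs_sets:
        places = [position[p] for p in group]
        if max(places) - min(places) + 1 != len(places):
            return False
    return True


RS = ("R1", ("R2", "p1", "p2"), "p3")
```

With this encoding the flat realization `["p1", "p2", "p3"]` is accepted, `["p1", "p3", "p2"]` is rejected because the grouping under R2 is interrupted, and `["p1", ["p2", "p3"]]` is rejected because it introduces a grouping absent from the RS.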
4 Generating solutions 
Our procedure for generating candidate solutions is
based on a technique for formulating text structuring
as a constraint satisfaction problem (CSP) (Hentenryck,
1989). In general, a CSP is characterized by
the following elements:
• A set of variables $V_1..V_N$.
• For each variable $V_i$, a finite domain $D_i$ of
possible values.
• A set of constraints on the values of the variables.
(For integer domains these often use
'greater than' and 'less than'; other domains
usually rely on 'equal' or 'unequal'.)
A solution assigns to each variable $V_i$ a value from
its domain $D_i$ while respecting all constraints.
Depending on the constraints, there may be multiple
solutions, or there may be no solution at all.
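To make the definition concrete, here is a deliberately naive generate-and-test enumerator in Python. It is a sketch for illustration only: a real constraint solver would prune domains by propagation rather than test every combination, and the function name `solve` and the toy variables X and Y are assumptions of this example.

```python
from itertools import product


def solve(domains, constraints):
    """Yield every assignment (a dict) that respects all constraints.

    `domains` maps variable names to finite collections of values;
    `constraints` is a list of predicates over a complete assignment.
    """
    names = list(domains)
    for values in product(*(domains[name] for name in names)):
        assignment = dict(zip(names, values))
        if all(constraint(assignment) for constraint in constraints):
            yield assignment


# Toy problem: two integer variables with a 'less than' constraint.
DOMAINS = {"X": [1, 2, 3], "Y": [1, 2, 3]}
SOLUTIONS = list(solve(DOMAINS, [lambda a: a["X"] < a["Y"]]))

# Contradictory constraints leave no solution at all.
NO_SOLUTIONS = list(solve(DOMAINS, [lambda a: a["X"] < a["Y"],
                                    lambda a: a["Y"] < a["X"]]))
```

The first call finds three assignments, the second none, which mirrors the two outcomes mentioned in the definition.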
The difficulty in formulating a configuration task
as a CSP is that we usually do not know in advance
how many variables the solution will contain.
Problems of this kind are sometimes called dynamic
(Dechter and Dechter, 1988), because the set of
relevant variables changes as the search for a solution
progresses. The solution in figure 2, for example,
has nine TS nodes, each bearing a TEXT-LEVEL
variable; different realizations of the same RS might
have more nodes, or fewer. However, we have found
that all candidate solutions can be generated by
assigning four variables (TEXT-LEVEL, INDENTATION,
ORDER and CONNECTIVE) to each node of rhetorical
structure, so obtaining a partial description that
determines a unique TS. Intuitively, the idea is that
this description should specify a subset of the nodes
in the target TS; further nodes are then added, by a
deterministic procedure, in order to satisfy the
formation rules and accommodate any discourse
connectives.
(a)
cause
  ban(fda, elixir)
  contain(elixir, gestodene)
(b)
cause [TEXT-LEVEL = 3]
  ban(fda, elixir) [TEXT-LEVEL = {0..4}, ORDER = {1,2}]
  contain(elixir, gestodene) [TEXT-LEVEL = {0..4}, ORDER = {1,2}]
Figure 3: Adding solution variables
As an introduction to this method, we will begin
by working through a very simple example. Suppose
that our aim is to find all TSs that realize the RS
in figure 3a in a paragraph, without using discourse
connectives or indentation.
Create solution variables
The first step is to add TEXT-LEVEL and ORDER
variables to each RS node. Since ORDER
represents the linear position of a text span in
relation to its sisters, it can be omitted from the
root.
Assign domains
Each variable is assigned a finite domain of
possible values (figure 3b). For TEXT-LEVEL
variables, the domain is $0..L_{Max}$; for ORDER
variables it is $1..N$, where N is the number of
sisters. Since we have decided that the whole text
should be a paragraph, we can fix the TEXT-LEVEL
on the root directly (assigning it the
value 3).
Apply constraints
Constraints over the solution variables are now
applied. Informally, these are as follows: the
root node should have a higher TEXT-LEVEL
than its daughters; sister nodes should have the
same values for TEXT-LEVEL but different values
for ORDER; and since the 'cause' relation is
not marked by a discourse connective, its arguments
(the two propositions) cannot be realized
by text-phrases (the result would be syntactically
ill-formed); in other words, they must
have TEXT-LEVEL ≠ 0. Collectively, these constraints
reduce the TEXT-LEVEL domains for the
terminal nodes to {1,2}.
Enumerate solutions
The solutions can now be enumerated by computing
all combinations of values that respect
the constraints. One example of a solution is
shown in figure 4a.
Compute complete text structures
For each solution, a complete TS can be computed
by adding any nodes that are required by
the text-structure formation rules (figure 4b).
(a)
cause [TEXT-LEVEL = 3]
  NUCLEUS: ban(fda, elixir) [TEXT-LEVEL = 1, ORDER = 2]
  SATELLITE: contain(elixir, gestodene) [TEXT-LEVEL = 1, ORDER = 1]
(b)
paragraph (3)
  sentence (2)
    text-clause (1)
      text-phrase (0): contain(elixir, gestodene)
    text-clause (1)
      text-phrase (0): ban(fda, elixir)
Figure 4: Completing a solution
In this simple case there are just four solutions,
since the TEXT-LEVEL and ORDER variables on the
nucleus both have the domains {1,2}, and any setting
of these variables fixes the corresponding variables
on the satellite. Here are texts that might
result from the four solutions (L and O represent
TEXT-LEVEL and ORDER; N and S represent nucleus
and satellite):
$L_N = 1$, $O_N = 1$, $L_S = 1$, $O_S = 2$
Elixir is banned by the FDA; it contains gestodene.
$L_N = 1$, $O_N = 2$, $L_S = 1$, $O_S = 1$ (figure 4)
Elixir contains gestodene; it is banned by the
FDA.
$L_N = 2$, $O_N = 1$, $L_S = 2$, $O_S = 2$
Elixir is banned by the FDA. It contains gestodene.
$L_N = 2$, $O_N = 2$, $L_S = 2$, $O_S = 1$
Elixir contains gestodene. It is banned by the
FDA.
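The simple example can be replayed in a few lines of Python. This is a brute-force sketch rather than the constraint-propagation machinery used in ICONOCLAST; the variable names `L_N`, `O_N`, `L_S`, `O_S` follow the notation above, while the function name `simple_solutions` is invented for the example.

```python
from itertools import product

LEVELS = range(0, 5)   # TEXT-LEVEL domain 0..L_Max, with L_Max = 4
ORDERS = (1, 2)        # ORDER domain for two sisters
ROOT_LEVEL = 3         # the whole text is fixed as a paragraph


def simple_solutions():
    """Enumerate solutions for the unmarked 'cause' relation of figure 3."""
    found = []
    for l_n, o_n, l_s, o_s in product(LEVELS, ORDERS, LEVELS, ORDERS):
        if not (ROOT_LEVEL > l_n and ROOT_LEVEL > l_s):
            continue                 # root must be above its daughters
        if l_n != l_s:
            continue                 # sisters share the same TEXT-LEVEL
        if o_n == o_s:
            continue                 # sisters have different ORDER values
        if l_n == 0 or l_s == 0:
            continue                 # unmarked relation: no text-phrases
        found.append({"L_N": l_n, "O_N": o_n, "L_S": l_s, "O_S": o_s})
    return found


for solution in simple_solutions():
    print(solution)
```

The loop prints exactly the four assignments listed above, in the same order.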
The method for including discourse connectives
has been described elsewhere (Power et al., 1999).
Briefly, the lexical entry for a discourse connective
must specify its syntactic category (at present we
cover subordinating conjunctions, coordinating
conjunctions and conjunctive adverbs) and whether it is
realized on the nucleus or the satellite. For example,
the relation cause can be marked by the subordinating
conjunction 'since' (realized on the satellite) or
the conjunctive adverb 'consequently' (realized on
the nucleus), among others. The choice of discourse
connective strongly constrains the values of
TEXT-LEVEL and ORDER for the arguments of the
relation. If cause is expressed by 'since', the
arguments may occur in any order, but they must be
text-phrases:
Since Elixir contains gestodene, it is banned
by the FDA.
Elixir is banned by the FDA since it contains
gestodene.
#Elixir is banned by the FDA; since it contains
gestodene.
#Elixir is banned by the FDA. Since it contains
gestodene.
If instead cause is expressed by 'consequently', the
satellite must be placed before the nucleus, and unless
the style is very informal the arguments should
have TEXT-LEVEL values above text-phrase:
Elixir contains gestodene; consequently, it is
banned by the FDA.
Elixir contains gestodene. Consequently, it is
banned by the FDA.
?Elixir contains gestodene, consequently
it is banned by the FDA.
5 Constraints 
We now state the text-structuring constraints precisely,
including the feature CONNECTIVE but still
omitting INDENTATION. Before applying these constraints,
finite domains are assigned to each RS node
i:
TEXT-LEVEL $L_i = \{0..L_{Max}\}$
ORDER $O_i = \{1..N\}$ (for N sisters)
concession [TEXT-LEVEL = 4, CONNECTIVE = {∅, although, however}]
  NUCLEUS: approve(fda, elixir-plus)
    [TEXT-LEVEL = {0..4}, ORDER = {1,2}, CONNECTIVE = ∅]
  SATELLITE: cause
    [TEXT-LEVEL = {0..4}, ORDER = {1,2}, CONNECTIVE = {∅, since, consequently}]
    NUCLEUS: ban(fda, elixir)
      [TEXT-LEVEL = {0..4}, ORDER = {1,2}, CONNECTIVE = ∅]
    SATELLITE: contain(elixir, gestodene)
      [TEXT-LEVEL = {0..4}, ORDER = {1,2}, CONNECTIVE = ∅]
Figure 5: Domain assignments
CONNECTIVE On the node cause (figure 3a), $C_i =
\{\emptyset, since, consequently\}$; on a proposition node,
$C_i = \emptyset$. The value $\emptyset$ represents the option of
using no discourse connective.
As an example, possible domain assignments for figure 1
are shown in figure 5. The constraints are as
follows:
Root Domination 
The TEXT-LEVEL of the root node r must exceed 
that of any daughter d. 
$L_r > L_d$
Parental Domination
The TEXT-LEVEL of a parent node p must be
equal to or greater than the TEXT-LEVEL of any
daughter d.
$L_p \geq L_d$
Sister Equality
If nodes a and b are descended from the same
parent, they must have the same TEXT-LEVEL.
$L_a = L_b$
Sister Order
If nodes a and b are descended from the same
parent, they must have different values of ORDER.
$O_a \neq O_b$
Argument Order
If $C_p$ is a coordinating conjunction or conjunctive
adverb, the argument d (nucleus or satellite)
on which the connective will be realised
(according to its lexical entry) must have $O_d = 2$.
Subordinating Conjunction Level
If $C_p$ is a subordinating conjunction, any daughter
node d (expressing an argument of the relation)
must have $L_d = 0$.
Conjunctive Adverb Level
If $C_p$ is a conjunctive adverb, any daughter node
d (expressing an argument of the relation) must
have $L_d > 0$.
Unmarked Level
If a relation is unmarked ($C_p = \emptyset$) any daughter
node d (expressing an argument of the relation)
must have $L_d > 0$.
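The constraints above can be exercised on the RS of figure 1 with a small recursive enumerator. What follows is a hedged sketch, not the ICONOCLAST code. The lexicon entries for 'since' and 'consequently' follow the description in section 4; the entries for 'although' and 'however' (category and carrying argument) are assumptions made for the example, as are the tuple encoding of the RS and all function names.

```python
from itertools import permutations

# connective -> (syntactic category, argument on which it is realized)
LEXICON = {
    "since": ("subordinating", "satellite"),
    "consequently": ("adverb", "nucleus"),
    "although": ("subordinating", "satellite"),   # assumed entry
    "however": ("adverb", "nucleus"),             # assumed entry
}

# (relation, allowed connectives, nucleus, satellite); None = unmarked.
RS = ("concession", (None, "although", "however"),
      "approve(fda, elixir-plus)",
      ("cause", (None, "since", "consequently"),
       "ban(fda, elixir)",
       "contain(elixir, gestodene)"))


def daughter_options(parent_level, connective, is_root):
    """Yield (level, nucleus order, satellite order) allowed for two daughters."""
    # Root Domination (strictly lower) or Parental Domination (not higher).
    top = parent_level - 1 if is_root else parent_level
    category, carrier = LEXICON.get(connective, (None, None))
    for level in range(0, top + 1):             # Sister Equality: shared level
        if category == "subordinating" and level != 0:
            continue                            # Subordinating Conjunction Level
        if category in ("adverb", None) and level == 0:
            continue                            # Conjunctive Adverb / Unmarked Level
        for o_n, o_s in permutations((1, 2)):   # Sister Order
            if category in ("adverb", "coordinating"):
                carrier_order = o_n if carrier == "nucleus" else o_s
                if carrier_order != 2:
                    continue                    # Argument Order
            yield level, o_n, o_s


def solutions(node, level, order=None, is_root=True):
    """Yield nested dicts assigning L, O and C to every RS node."""
    if isinstance(node, str):
        yield {"prop": node, "L": level, "O": order, "C": None}
        return
    relation, connectives, nucleus, satellite = node
    for c in connectives:
        for d_level, o_n, o_s in daughter_options(level, c, is_root):
            for n_sol in solutions(nucleus, d_level, o_n, False):
                for s_sol in solutions(satellite, d_level, o_s, False):
                    yield {"rel": relation, "L": level, "O": order, "C": c,
                           "nucleus": n_sol, "satellite": s_sol}


ALL_SOLUTIONS = list(solutions(RS, level=4))   # root fixed as a section
```

Each element of `ALL_SOLUTIONS` is one complete assignment of TEXT-LEVEL, ORDER and CONNECTIVE; because the lexicon and the fixed root level are assumptions, the size of the list should not be read as a figure reported by the paper.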
6 Completing the text structure 
(a)
concession [TEXT-LEVEL = 4, CONNECTIVE = however]
  NUCLEUS: approve(fda, elixir-plus)
    [TEXT-LEVEL = 2, ORDER = 2, CONNECTIVE = ∅]
  SATELLITE: cause
    [TEXT-LEVEL = 2, ORDER = 1, CONNECTIVE = consequently]
    NUCLEUS: ban(fda, elixir)
      [TEXT-LEVEL = 2, ORDER = 2, CONNECTIVE = ∅]
    SATELLITE: contain(elixir, gestodene)
      [TEXT-LEVEL = 2, ORDER = 1, CONNECTIVE = ∅]
(b)
section (4)
  paragraph (3)
    text-sentence (2)
      text-clause (1)
        text-phrase (0): contain(elixir, gestodene)
    text-sentence (2)
      text-clause (1)
        text-phrase (0): "consequently"
        text-phrase (0): ban(fda, elixir)
    text-sentence (2)
      text-clause (1)
        text-phrase (0): "however"
        text-phrase (0): approve(fda, elixir-plus)
Figure 6: Completing the TS
The algorithm for completing the TS cannot be
described fully here, but as an example we comment on
how the solution in figure 6a yields the TS in figure
6b.
• If a parent is more than one level above its
daughters ($L_p - L_d > 1$), extra nodes are added
beneath the parent to bridge the gap; hence
the paragraph node in figure 6b.
• If a parent has the same level as its daughters
($L_p = L_d$), the daughters are raised to replace
the parent. Thus in figure 6b, the paragraph
has three sentences, and a rhetorical grouping
has been left unrealized in the TS. Of course the
reader might infer the intended RS from other
evidence (e.g. semantic plausibility).
• If a terminal node i has a level above text-phrase
($L_i > 0$), a chain of nodes is added to bring it
'down to earth' (e.g. the chain below the first
text-sentence in figure 6b).
• Discourse connectives are passed down to the
text-clause in which they should be realized.
This is decided (i) by passing the connective to
the appropriate argument (nucleus or satellite),
according to its lexical entry, and (ii) by thereafter
passing it down to the first constituent if
the argument is complex (Power et al., 1999).
After tactical generation, we might obtain the
following (rather poor) result:
Elixir contains gestodene. Consequently, it is
banned by the FDA. However, the FDA
approves ElixirPlus.
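The first three completion steps (bridging gaps, raising daughters, and adding chains below terminals) can be sketched as follows. This is an illustrative reconstruction, since the full algorithm is not given here: connective placement is omitted entirely, the dictionary encoding of solutions and the function names are assumptions, and treating a level-0 parent like any other equal-level parent is also an assumption.

```python
LABELS = ["text-phrase", "text-clause", "text-sentence", "paragraph", "section"]


def spans(node):
    """Return the list of TS subtrees, all at level node['L'], realizing `node`.

    A TS node is a pair (level, body); body is a proposition string for a
    terminal node, otherwise an ordered list of TS nodes.
    """
    level = node["L"]
    if "prop" in node:
        ts = (0, node["prop"])
        for lvl in range(1, level + 1):      # chain bringing it 'down to earth'
            ts = (lvl, [ts])
        return [ts]
    daughters = sorted((node["nucleus"], node["satellite"]), key=lambda d: d["O"])
    kids = [ts for d in daughters for ts in spans(d)]
    d_level = daughters[0]["L"]
    if d_level == level:                     # same level: daughters are raised
        return kids
    ts = (d_level + 1, kids)
    for lvl in range(d_level + 2, level + 1):   # bridge any remaining gap
        ts = (lvl, [ts])
    return [ts]


def complete(solution):
    """Complete a solution whose root dominates its daughters."""
    (root,) = spans(solution)
    return root


def render(ts, depth=0):
    """Pretty-print a TS as an indented outline."""
    level, body = ts
    head = "  " * depth + f"{LABELS[level]} ({level})"
    if isinstance(body, str):
        return f"{head}: {body}"
    return "\n".join([head] + [render(kid, depth + 1) for kid in body])


def leaf(prop, level, order):
    return {"prop": prop, "L": level, "O": order}


# The solution of figure 6a, without its CONNECTIVE values.
FIG_6A = {"L": 4,
          "nucleus": leaf("approve(fda, elixir-plus)", 2, 2),
          "satellite": {"L": 2, "O": 1,
                        "nucleus": leaf("ban(fda, elixir)", 2, 2),
                        "satellite": leaf("contain(elixir, gestodene)", 2, 1)}}

print(render(complete(FIG_6A)))
```

Applied to figure 6a, the sketch yields a section containing one paragraph of three text-sentences, matching figure 6b apart from the omitted connectives.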
7 Style 
Having designed a procedure that will generate all
text structures meeting minimal standards of
correctness, we need to apply further constraints in
order to eliminate solutions that are stylistically
eccentric or at least ill-suited to the purpose at hand.
In ICONOCLAST, this can be done in two ways:
• If a stylistic defect is regarded as fatal, it is
excluded by a hard constraint on the solution
variables, so that TSs with this defect are never
generated.
• If a stylistic defect is regarded as non-fatal (i.e.
unwelcome but sometimes necessary), it is
penalized, by a soft constraint, during a subsequent
evaluation phase in which the enumerated
solutions are ordered from best to worst.
The user can impose stylistic preferences by switching
hard constraints on/off, and also by weighting
soft constraints (i.e. determining the importance of
non-fatal defects).
We cannot discuss stylistic control in detail here,
but we will give one or two examples for each type
of constraint.
HARD CONSTRAINTS
Multiple text-clauses: To obtain an informal
style without semicolons, sentences containing
more than one text-clause can be avoided by
imposing the constraint $L_i \neq 1$ on all nodes i.
Nucleus-satellite order: For some rhetorical
relations it may be appropriate to fix the linear
order of nucleus and satellite; for instance, the
satellite of a background relation should precede
the nucleus. This can be ensured by a
constraint $O_S = 1$ on the satellite node S.
SOFT CONSTRAINTS
Rhetorical grouping: Failure to express a rhetorical
grouping can be treated as a defect. (This
is one reason why the TS in figure 6b is poor.)
Oversimple paragraph: A paragraph containing
only one text-sentence can be treated as a
defect.
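The interplay of hard and soft constraints can be illustrated on the four solutions of the simple example in section 4, where the root is a paragraph. The sketch below is hypothetical: the function names, the weight value and the way the 'oversimple paragraph' defect is read off the solution variables are assumptions made for the illustration.

```python
# The four candidate solutions of the simple 'cause' example (root = paragraph).
CANDIDATES = [
    {"L_N": 1, "O_N": 1, "L_S": 1, "O_S": 2},
    {"L_N": 1, "O_N": 2, "L_S": 1, "O_S": 1},
    {"L_N": 2, "O_N": 1, "L_S": 2, "O_S": 2},
    {"L_N": 2, "O_N": 2, "L_S": 2, "O_S": 1},
]


def no_semicolons(sol):
    """Hard constraint L_i != 1: no argument is realized as a text-clause."""
    return sol["L_N"] != 1 and sol["L_S"] != 1


def oversimple_paragraph(sol):
    """Soft defect count: arguments below text-sentence level leave the
    paragraph with a single text-sentence (compare figure 4b)."""
    return 1 if sol["L_N"] < 2 else 0


def rank(candidates, hard=(), soft=None):
    """Drop candidates violating a hard constraint, then order the rest
    from best to worst by weighted soft-constraint penalty."""
    soft = soft or {}
    kept = [c for c in candidates if all(ok(c) for ok in hard)]

    def penalty(c):
        return sum(weight * defect(c) for defect, weight in soft.items())

    return sorted(kept, key=penalty)   # stable: ties keep enumeration order


informal = rank(CANDIDATES, hard=[no_semicolons])
ordered = rank(CANDIDATES, soft={oversimple_paragraph: 2.0})
```

Switching the hard constraint on removes the two semicolon versions outright, whereas the weighted soft constraint keeps all four candidates and merely moves the two-sentence versions to the front.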
8 Extensions 
Our method allows an exhaustive enumeration of
solutions, but only within an elementary framework for
representing rhetorical and textual structure. We
hope to gradually extend this framework to cover
many phenomena that are currently excluded:
• Since its input takes the form of a rhetorical
structure tree, the text structurer inherits any
limitations of RST as a description of rhetorical
organization.
• We cover only three types of discourse connective
(subordinating conjunction, coordinating
conjunction, conjunctive adverb).
• At present there is no treatment of titles.
• There is no treatment of relative clauses, which
can be employed for example to realize the
elaboration relation (Scott and de Souza,
1990):
Zovirax, which contains the antiviral
agent aciclovir, is a smooth white cream.
• There is no treatment of propositions that are
expressed parenthetically:
Zovirax, since it is for you only, should
never be given to other patients.
Zovirax should never be given to other
patients (the medicine is for you only).
• We have omitted the colon-expansion pattern
(Nunberg, 1990) and some other features
influencing punctuation (emphasis, quotation
marks, parentheses).
• We have not covered the integration of text with
floating items like diagrams, tables, or boxes.
Two extensions that have already been implemented
are indentation and centering. We have explained
here how indentation is represented in text structure;
the relevant constraints will be described elsewhere.
Centering has been incorporated by assigning
backward and forward centers to all propositions
in a completed TS before generating the wording;
in this way, centering transitions can be evaluated
before tactical generation begins, and TSs yielding
good continuity of reference can be preferred (Kibble
and Power, 1999).
To use our approach in practical applications, one
must address the problem that the number of candidate
solutions increases exponentially with the complexity
of the rhetorical structure (measured, for
example, by the number of elementary propositions).
In informal trials we find that the number of solutions
is roughly $5^{N-1}$ for an input with N propositions;
this means that even for a short passage containing
a dozen propositions, the text planner would
find about 50 million solutions satisfying the hard
constraints. For texts of non-trivial length, there
seems no alternative to sacrificing global optimality
in the interests of efficiency. One option is to use
a statistical optimization method such as a genetic
algorithm (Mellish et al., 1998). In ICONOCLAST we
have preferred a method of partial optimization in
which the text-structuring problem is split into
parts, so that at each stage only a manageable part
of the total solution is constructed. For instance,
when planning a patient information leaflet, the
semantic material could first be distributed among
sections, then perhaps among paragraphs, thus spawning
many small-scale text-structuring problems for
which the search spaces would be measured in hundreds
rather than billions.
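The arithmetic behind these estimates is easy to reproduce. The snippet below simply evaluates the rough $5^{N-1}$ figure quoted above; the function name is invented, and the split of twelve propositions into three sections of four is a hypothetical illustration of partial optimization rather than a case taken from the trials.

```python
def estimated_solutions(n_propositions, base=5):
    """Rough size of the search space for an RS with N propositions."""
    return base ** (n_propositions - 1)


# One monolithic problem with a dozen propositions.
whole_text = estimated_solutions(12)

# Hypothetical split into three sections of four propositions each,
# with each section structured as a separate small problem.
per_section = estimated_solutions(4)
split_total = 3 * per_section

print(whole_text, per_section, split_total)
```

The monolithic problem has 5^11 = 48,828,125 candidate solutions (about 50 million), while each four-proposition section has 125, so the three small problems together involve only a few hundred candidates.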

References 

A. Dechter and R. Dechter. 1988. Belief mainte- 
nance in dynamic constraint networks. In Pro- 
ceedings of NCAI-AAAI. American Association 
for Artificial Intelligence. 

P. Van Hentenryck. 1989. Constraint Satisfaction 
in Logic Programming. MIT Press, Cambridge, 
Mass. 

E. Hovy. 1993. Automatic discourse generation us- 
ing discourse structure relations. Artificial Intel- 
ligence, 63:341-386. 

R. Kibble and R. Power. 1999. Using centering the- 
ory to plan coherent texts. In Proceedings of the 
12th Amsterdam Colloquium. Institute for Logic, 
Language and Computation, University of Ams- 
terdam. 

W. Mann and S. Thompson. 1988. Rhetorical struc- 
ture theory: towards a functional theory of text 
organization. Text, 8(3):243-281. 

D. Marcu. 1996. Building up rhetorical structure 
trees. In Proceedings of AAAI-96. American
Association for Artificial Intelligence.

C. Mellish, A. Knott, J. Oberlander, and 
M. O'Donnell. 1998. Experiments using stochastic
search for text planning. In Proceedings of
IWNLG-98, Niagara-on-the-Lake, Canada.
Association for Computational Linguistics.

G. Nunberg. 1990. The Linguistics of Punctuation. 
CSLI, Stanford, USA. 

R. Power, C. Doran, and D. Scott. 1999. Generat- 
ing embedded discourse markers from rhetorical 
structure. In Proceedings of the European
Workshop on Natural Language Generation, pages 30-38,
Toulouse, France.

D. Rosner and M. Stede. 1992. Customizing RST 
for the automatic production of technical manu- 
als. In R. Dale, C. Mellish, and M. Zock, editors, 
Aspects of Automatic Natural Language Genera- 
tion, Levico, Italy. 

D. Scott and C. de Souza. 1990. Getting the mes- 
sage across in RST-based text generation. In 
R. Dale, C. Mellish, and M. Zock, editors, Current 
Research in Natural Language Generation. Cogni- 
tive Science Series, Academic Press. 
