An integrated framework for text planning and pronominalisation 
R0dger-'Kibble and Richard Power 
ITRI 
University of Brighton 
Brighton BN2 4GJ 
UK. 
Email: {Rodger.Kibble I Richard.Power}@itri.brighton.ac.uk 
Abstract 
This paper describes an implemented system 
which uses centering theory for planning of co- 
herent texts and choice of referring expressions. 
We argue that text and sentence planning need 
to be driven in part by the goal of maintain- 
ing referential continuity and thereby facilitat- 
ing pronoun resolution: obtaining a favourable 
ordering of clauses, and of arguments within 
clauses, is likely to increase opportunities for 
non-ambiguous pronoun use. Centering theory 
provides the basis for such an integrated ap- 
proach. Generating coherent texts according to 
centering theory is treated as a constraint sat- 
isfaction problem. 
1 Introduction 
1.1 Issues in pronoun generation 
The appropriate realisation of anaphoric expres- 
sions is a long-standing problem in NLG re- 
search. However, as McCoy and Strube (1999) 
observe, few researchers have developed sophis- 
ticated algorithms for pronoun generation. A 
typical approach, exemplified by Dale (1993), 
Reiter and Dale (1997) is to pronominalise some 
distinguished referent which was mentioned in 
the previous sentence according to a domain- 
dependent criterion of prominence or salience. 
McCoy and Strube (op cir.) offer a more com- 
plex algorithm based on the notion of "dis- 
course threads", for which they report an ac- 
curacy of 85% when tested against a corpus 
of naturally-occurring texts. Their approach 
makes some fundamental assumptions about 
discourse structure which appear to be beyond 
the capabilities of current text and sentence 
planners and are incompatible with the widely- 
accepted notion of discourse structure as a tree 
with non-crossing branches (e.g., Mann and 
Thompson 1987). 
We argue for an approach which integrates 
the tasks of text planning and choice of referring 
expression on the following grounds: 
o portability: this approach should be com- 
patible with any system that employs hi- 
erarchical text planning and certain basic 
grammatical categories; 
o coherence: we claim that text planning 
needs to be driven in part by the goal of 
maintaining referential continuity: obtain- 
ing a favourable ordering of clauses, and 
of arguments within clauses, is likely to 
increase opportunities for non-ambiguous 
pronoun use. 
The latter claim is not new, but underlies the 
Centering Theory (CT) of Grosz, Joshi and We- 
instein (1995, hereafter "GJW"). 
1.2 Issues in Text Planning 
Text Planning is one of the distinct tasks iden- 
tified in Reiter's "consensus" architecture for 
Natural Language Generation (Reiter 1994, Re- 
iter and Dale 1997): 
Text Planning- deciding the content of a 
message, and organising the component 
propositions into a text tree; 
Sentence Planning - aggregating proposi- 
tions into clausal units and choosing lex- 
ical items corresponding to concepts in the 
knowledge base; 
Linguistic realisation - surface details Such 
as agreement, orthography etc. 
Following Scott and de Souza (1990), we as- 
sume that the component propositions to be re- 
alised in a text are organised in a tree structure 
77 
in which ternfinal nodes are elementary propo- 
sitions and non-terminal nodes represent dis- 
course relations as .detined by e~g:,. Rhetorical 
Structure Theory (RST, Mann and Thompson 
1987). This structure only partially constrains 
the linear order in which the propositions will 
be realised -- in other words, any RST struc- 
ture specifies a range of possible text plans. We 
propose as an additional constraint that the 
generator should seek to maximise continuity 
of reference as determined by centering theory, 
and we argue that- this enables us to select the 
most cohesive variants from a set of text plans. 
The RST tree itself is produced by an interac- 
tive knowledge base editor which allows a user 
to control both semantic content and rhetorical 
structure via a sequence of choices guided by a 
natural  interface. 
2 Reconstructing Centering for NLG 
2.1 Centering in a nutshell 
The main assumptions of Centering theory are: 
1. For each utterance in a discourse there is 
precisely one entity which is the centre of atten- 
tion or center. The center in an utterance Un is 
the most grammatically salient entity realised 
in U~_i which is also realised in Un. This is 
also referred to as the backward-looking center 
or Cb. The notion of "salience" for the purposes 
of centering theory is most commonly defined 
according to a hierarchy of grammatical roles: 
SUBJECT > DIRECT OBJECT > INDIRECT OB- 
JECT ~> OTHERS (see e.g., Brennan et al 1987)• 
For alternative approaches see e.g., (Strube and 
Hahn 1999), (Walker et al 1994). 
2. There is a preference for consecutive ut- 
terances within a discourse segment to keep the 
same entity as the center, and for the center to 
be realised as Subject or preferred center (Cp). 
Kibble (1999) dubbed these principles cohe- 
sion and salience respectively. Pairs of suc- 
cessive utterances (U,~, U~+i} are classified into 
the transition types shown in Fig. 1, in the or- 
.der of preference specified by Grosz et al's "Rule 
,, • 
3. The center is the entity which is most likely 
to be pronominalised: Grosz et al's "Rule 1" 
in its weakest form states that if any entity is 
referred to by a pronoun, the Cb must be. 
CONTINUE." cohesion and salience 
.... both hold; same center (o r 
Cb(Un) undefined), realised as 
Subject in Un+l; 
RETAIN." cohesion only; i.e. center 
remains the same but is not re- 
alised as Subject in Un+l; 
SMOOTH SHIFT." salience only; cen- 
ter of Un+l realised as Subject but 
: , not equal .to,Cb(U~); 
ROUGH SHIFT." neither cohesion nor 
salience holds. 
NO CB: this transition is used by 
some researchers but is not dis- 
cussed by GJW. 
Figure 1: Centering Transitions 
2.2 Pronominalisation 
Text genres are known to vary in the extent 
to which pronouns are used. The CT-based 
framework allows us to experiment with differ- 
ent strategies for choosing when to use a pro- 
noun, including: 
1. Never use anaphoric pronouns -- for in- 
stance, in certain legal documents or tech- 
nical manuals where there must be no pos- 
sibility of ambiguity. 
2. Always pronominalise the Cb. 
3. Use a pronoun for non-Cbs only if the Cb 
is pronominalised. 
4. Pronominalise the Cb only after a Continue 
transition. 
Strategy 3 is favoured by (GJW) who cite psy- 
chological evidence that "the Cb is preferen- 
tially realised by a pronoun in English and by 
equivalent forms (i.e., ,.zero pronouns) in other 
s" (op cit., p. 214). However, in the 
implementation reported in section 3 we opt for 
the more restrictive strategy 4. The generation 
approach outlined below enables us to experi- 
ment with different strategies and compare the 
resulting texts. 
78 
2.3 Centering and discourse structure center after a sequence of CONTINUE. How- 
The canonical formulation of CT is concerned ever, in a sequence CONTINUE-RETAIN-SHIFT 
.' with local• cohesion;--'specifying .aqment-,. t~rarisi~':"::':the:sHng'~:is:p redicted:Am:its:A°cal':c°ntext~ but 
tions between consecutive sentences in a dis- the RETAIN is not; whenever RETAIN is a cheap 
course segment and favouring sequences which 
maintain the same center. Our implementation 
incorporates two extensions which have impli- 
cations for more structured discourse: Strube 
and Hahn's (1999) "cheapness" principle, which 
favours transitions that introduce a new topic 
• in. a salient position, ,and .Cristea's Veins The- 
ory (Cristea et al 1998) which redefines tran- 
sitions in terms of rhetorical hierarchy rather 
than linear sequence of clauses (see section 3.3 
for discussion). 
"Cheapness" is satisfied by a transition pair 
((Un-1, Un), (Un, Un+l)) if the preferred center 
of Un is the Cb of Un+l. For example, this test 
is satisfied by a RETAIN-SHIFT sequence but not 
by CONTINUE-SHIFT, so it is predicted that the 
former pattern will be used to introduce a new 
center. (This claim is consistent with the find- 
ings of Brennan 1998, Brennan et al 1987.) If we 
consider examples la-e below, the sequence c- 
d'-e ~, including a RETAIN-SHIFT sequence, reads 
more fluently than c-d-e even though the latter 
scores better according to the canonical rank- 
ing. 
. a. John has had trouble arranging his va- 
cation. 
b. He cannot find anyone to take over his 
responsibilities. 
c. He called up Mike yesterday to work 
out a plan. CONTINUE 
d. He has been pretty annoyed with Mike 
recently. CONTINUE 
e. Mike cancelled last week's project 
meeting at short notice. 
EXP-SMOOTH SHIFT 
d'. Mike has mmoyed him a lot recently. 
RETAIN 
e I. He cancelled last week's project meet- 
ing at short notice. SMOOTH SHIFT 
The "cheapness" principle illustrates the need 
for global opfimisation. We noted above that 
there is evidence that a RETAIN-SHIFT sequence 
is the preferred way of introducing a new 
transition following CONTINUE, another CON- 
TINUE would be cheap as well. The RETAIN 
is motivated as it enables a "cheap" SMOOTH 
SHIFT, and so we need a way of evaluating the 
whole sequence CONTINUE-RETAIN-SHIFT ver- 
SUS CONTINUE-CONTINUE-SHIFT. 
• : :2~4_~._,:Ceaatering.in :NLG 
CT has developed primarily in the context of 
natural  interpretation, focussing on 
anaphora resolution (see e.g., Brennan et al 
1987). Curiously, NLG researchers have tended 
to overlook GJW's proposal that 
Rule 2 provides a constraint on speak- 
ers, and on natural- gener- 
ation systems ...To empirically test 
the claim made by Rule 2 requires ex- 
amination of differences in inference 
load of alternative multi-utterance se- 
quences that differentially realize the 
same content. 
GJW, p. 215. 
With a few exceptions (e.g., Mittal et al 1998, 
Kibble 1999, Kibble and Power 1999, Cheng 
2000) NLG researchers have interpreted CT as 
a theory of pronominalisation only (e.g., Dale 
1992). In this paper we concentrate on plan- 
ning, aiming to determine whether the prim 
ciples underlying the constraints and rules of 
the theory can be "turned round" and used as 
planning operators for generating coherent text. 
Text planning in conformity with CT will need 
to follow the following set of heuristics: 
1. Plan tile order of clauses so that adjacent 
clauses have at least one referent in corn- 
Ilion. 
2. Cohesion: Prefer orderings which main- 
tain the same Cb in successive clauses. 
,3..- Salience: .Realise as=SubjeCt- of U;~ tile 
most grammatically salient entity in U,~-i 
which is mentioned in Un (the Cb). 
4. Cheapness: Realise as Subject of Un an 
entity which is mentioned in U,~+l (and will 
therefore be Cb of U,,+i). 
79 
Breaking down the problem like this reveals ferent transitions. We assume that certain 
that there are various ways the distinct tasks options for syntactic realisation can be pre- 
...... can. be slotted, into.-an.NLG,system~Cohesion. . ........ ._~dicted.,ma::t~he,~basis~,of:,the~axgu~ment ~ ~str:uc- 
naturally comes under Text Planning: ordering 
a sequence of utterances to maintain the same 
entity as the center, within constraints on order- 
ing determined by discourse relations. However, 
identifying the center depends on grammatical 
salience, which is normally determined by the 
Sentence Planner- for example, choice of active 
or passive voice. Three possibilities are: 
® "Incremental" sentence-by-sentence gener- 
ation, where the syntactic structure of Un 
is determined before the semantic content 
of Un+l is planned. That is, the Text Plan- 
ner would plan the content of Un+l by aim- 
ing to realise a proposition in the knowl- 
edge base which mentions an entity which 
is salient in Un. We axe not aware of any 
system which performs all stages of gener- 
ation in a sentence-by-sentence way, and in 
any case this type of architecture would not 
allow the cheapness principle to be imple- 
mented as it would not support the neces- 
sary forward planning. 
* A pipelined system where the "topic" or 
"theme" of a sentence is designated inde- 
pendently as part of the semantic input, 
and centering rules reflect the information 
structure of a discourse. This approach 
was suggested by Kibble (1999), proposing 
that text and sentence planning should be 
driven by the goal of realising the desig- 
nated topic in positions where it will be 
interpreted as the Cb. However, this is not 
really a solution so much as a refinement of 
the problem, since it simply shifts the prob- 
lem of identifying the topic. Prince (1999) 
notes that definitions of "topic" in the liter- 
ature do not provide objective tests for top- 
ichood and proposes that the topic should 
be identified with the centre of attention 
as defined by CT; however, what would be 
needed here would be a more fimdamental 
definition which would, account for a par- 
ticular entity being chosen to be tile centre 
of attention. 
o The solution we adopt is to treat tile task 
of identifying Cbs as an optimisation prob- 
lem, giving different weightings to t, he dif- 
ture of predicates, which means that cen- 
tering transitions can be calculated as part 
of Text Planning. 
3 Implemented prototype 
concession 
approve(fda, elixir-plus) cause NUCL~ S~LITE 
ban(fda, elixir) contain(elixir, gestodene) 
Figure 2: Rhetorical structure 
The text planner has been developed within 
ICONOCLAST, a project which investigates ap- 
plications of constraint-based reasoning in Nat- 
ural Language Generation using as subject- 
matter the domain of medical information 
leaflets. Following (Scott and de Souza 1990), 
we represent rhetorical structure by graphs like 
figure 2, in which non-terminal nodes represent 
RST relations, terminal nodes represent propo- 
sitions, and linear order is unspecified. The task 
of the text planner is to realize the rhetorical 
structure as a text structure in which propo- 
sitions are ordered, assigned to textual units 
(e.g., sentences, paragraphs, vertical lists), and 
linked where appropriate by discourse connec- 
tives (e.g., 'since', 'however'). 
Even for a simple rhetorical input like figure 2 
many reasonable text structures call be gener- 
ated. Since there are two nucleus-satellite rela- 
tions, tile elementary propositions can be or- 
dered in four ways; several discourse connec- 
tives can be employed to realize each rhetor- 
ical relation (e.g. concession can be realized 
by 'although', 'but' and '.however'); at one ex- 
treme, the text can be spread out over several 
paragraphs, while at the other extreme it can 
be squeezed into a single sentence. With fairly 
restrictive constraint settings, the system gen- 
erates 24 text-structure patterns for figure 2, 
including the following (shown schematically): 
80 
A. Since contain(elixir, gestodene), ban(fda, 3.1 Choosing centers 
elixir). However, approve(fda, elixirplus). Given a text structure, we enumerate all per- 
B. approve(fda, elixirplus), although since 
contain(elixir, gestodene ) , ban ( f da, elixir). 
The final output texts will depend on how the 
propositions are realized syntactically; among 
other things this will depend on centering 
choices within each proposition. 
In outline, the procedure that we propose is 
as follows: ~ . 
. Enumerate all text structures that are ac- 
ceptable realizations of the rhetorical struc- 
ture. 
. 
. 
For each text structure, enumerate all per- 
missible choices for the Cb and Cp of each 
proposition. 
Evaluate the solutions, taking account of 
referential coherence among other consid- 
erations, and choose the best. 
For the example in figure 2, centers can be as- 
signed in four ways for each text-structure pat- 
tern, making a total of 96 solutions. 
As will probably be obvious, such a procedure 
could not be applied for rhetorical structures 
with many propositions. For examples of this 
kind, based on the relations 'cause' and 'con- 
cession' (each of which can be marked by sev- 
eral different connectives), we find that the to- 
tal number of text-structures is approximately 
5 N-~ for N propositions. Hence with N = 4 we 
would expect around 600 text structures; with 
perhaps 5-10 ways of assigning centers to each 
text structure, the total number of solutions 
would approximate to 5000. Global optimiza- 
tion of the solution therefore becomes imprac- 
ticable for texts longer than about five propo- 
sitions; we address this problem by a technique 
of partial optimization in which a macro-planner 
fixes the large-scale structure of the text, thus 
defining a set of micro-planning problems each 
small enough to be tackled by the methods de- 
scribed here. 
Stage 1 of the planning procedure is described 
elsewhere (Power, 2000); we focus here on stages 
2 and 3, in which the text planner enumerates 
the possible assignments of centers and evalu- 
ates which is the best. 
missible centering assignments as foil0ws: " ..... 
1. Determine the predecessor Yn-1 (if any) of 
each proposition Un. 
2. List the potential Cbs and Cps of each 
proposition, henceforth denoted by ECb 
and ECp. 
3. Compute ~li combinations from ECb and 
ECp that respect the fundamental center- 
ing constraint that Cb(Un) should be the 
most salient candidate in Un-1. 
Two criteria for determining the predecessor 
have been implemented; the user can select one 
or other criterion, thus using the NLG system 
to test different approaches. Following a linear 
criterion, the predecessor is simply the propo- 
sition that precedes the current proposition in 
the text, regardless of structural considerations. 
Following a hierarchical criterion, the predeces- 
sor is the most accessible previous proposition, 
in the sense defined by Veins Theory (Cristea et 
al 1998). We will return to this issue later; for 
now we assume the criterion is linear. 
ECb(Un) (potential Cbs of proposition Un) is 
given by the intersection between Cf(U,~) and 
Cf(Un-1) -- i.e., all the referents they have in 
common. The potential Cps are those referents 
in the current proposition that can be realized 
as most salient. Obviously this should depend 
on the linguistic resources available to the gen- 
erator; the system actually uses a simpler rule 
based oil case roles within the proposition. Fig- 
ure 3 shows the potential Cbs and Cps for the 
proposition sequence in solution A. 
Our treatment of salience here simplifies ill 
tWO ways. First, we assume that syntactic real- 
ization serves only to distinguish the Cp from 
all other referents, which are ranked on the 
same level: thus effectively SUBJECT > OTHERS. 
Secondly, we assume that the system already 
.knows, from the event.class,of the proposition, 
which of its case roles can occur in subject po- 
sition: thus for ban, both arguments are poten- 
tim Cps because active and passive realizations 
are both allowed; for contain, only elixir is a 
potential Cp because we disallow 'Gestodene is 
contained by Elixir'. 
81 
U Proposition 
U1 cont ain(elixir, gestodene) 
U2 ban(fda, elixir) 
U3 approve(fda, elixir-plus) 
ECb(U) \[\] 
\[elixir\] 
\[fda\] 
2Cp(U) 
.......... -{elixir\] 
\[fda, elixir\] 
\[fda, elixir-plus\] 
Figure 3: Cbs and Cps for solution A. 
With these simplifications, the enumeration 
of centering assignments is straightforward; in 
the above example, four combinations are pos- 
sible, since there are two choices each for Cp(U2) 
and Cp(U3), none of which leads to any viola- 
tion of the basic centering constraint. This con- 
straint only comes into play if there are several 
choices for Cb(Un), one of which coincides with 
Cp(Un-1). 
3.2 Evaluating solutions 
Various metrics could be used in order to eval- 
uate centering choices. One possibility, for ex- 
ample, would be to associate a cost with each 
transition, so that perhaps 'Continue' (the best 
transition) has zero cost, while 'No Cb' (the 
worst transtion) has the highest cost. However, 
we have preferred the approach mentioned ear- 
lier in which cohesion and salience are evaluated 
separately; this allows us to include the further 
criterion of cheapness. 
Although this paper focusses on centering is- 
sues, it is important to remember that other as- 
pects of text quality are evaluated at the same 
time: the aim is to compute a global measure so 
that disadvantages in one factor can be weighed 
against advantages in another. For instance, 
text pattern B is bound to yield poor continuity 
of reference because it orders the propositions 
so that U1 and U2 have no referents in coin- 
mon. Text pattern A avoids this defect, but 
this does not necessarily mean that A is bet- 
ter than B overall; there may be other reasons, 
unconnected with centering, for preferring B to 
A. 
The system evaluates candidate solutions by 
applying a battery of tests to each.node of the 
text plan. Each test identifies whether the node 
suffers from a particular defect. For instance, 
one stylistic defect (at least for the rhetorical 
relations occurring in figure 2) is that of plac- 
ing nucleus before satellite; in general, the text 
reads better if important material is placed at 
the end. For each type of defect, we specify a 
weight indicating its importance: in evaluating 
continuity of reference, for example, the defect 
'No Cb' might be regarded as more significant 
than other defects. Summing the weighted costs 
for all defects, we obtain a total cost for the so- 
lution; our aim is to find the solution with the 
lowest total cost. 
Regarding centering, the tests currently ap- 
plied are as follows. 
Salience violation 
A proposition Un violates salience if 
Cb(Un) 7 ~ Cp(Un). This defect is assessed 
only on propositions that have a backward- 
looking center. 
Coherence violation 
A proposition Un violates cohesion if 
Cb(Un) 7 ~ Cb(Un-1). Again, this defect is 
not recorded when either Un or Un-1 has 
no Cb. 
Cheapness violation 
A proposition Us violates cheapness if 
Cb(Un) 7 ~ Cp(Un-1). 
No backward-looking center 
This defect is recorded for any proposition 
with no Cb, except the first proposition in 
the sequence (which by definition cannot 
have a Cb). 
Applied to the four solutions to text structure 
A, with all weights equal to 1, these definitions 
yield costs, shown in Figure 4.-According to our 
metric, solutions A1 and A2 should be preferred 
to A3 and A4 because they incur less cost. This 
resutt=cml be assessed, by comparing -the follow- 
ing output texts, in which the generator has fol- 
lowed the policy of pronominalizing the Cb only 
after a 'Continue' transition: 
A1. Since Elixir contains gestodene, the FDA bans 
Elixir. However, it approves ElixirPlus. 
82 
Solution 
A1 
A2 
A3 
A4 
U Cb(U) Cp(U) Defects 
U1 0 .elixir none 
U2 elixir fda salience 
U3 fda fda cohesion 
TOTAL 2 
U1 ~ elixir 
U2 elixir elixir 
Ua fda fda 
TOTAL 
Vl 
U2 
U3 
U1 
U2 
U3 
none 
none 
cohesion, cheapness 
2 
I~ elixir none 
elixir fda salience 
fda elixir-plus salience, cohesion 
TOTAL 3 
elixir none 
elixir elixir none 
fda elixir-plus salience, cohesion, cheapness 
TOTAL 3 
Figure 4: Costs of solutions A1 - A4. 
A2. Since Elixir contains gestodene, it is banned by 
the FDA. However, the FDA approves Elixir- 
Plus. 
A3. Since Elixir contains gestodene, the FDA bans 
Elixir. However, ElixirPlus is approved by the 
FDA. 
A4. Since Elixir contains gestodene, it is banned by 
the FDA. However, ElixirPlus is approved by 
the FDA. 
Of course we are not satisfied that this metric 
is the best; an advantage of the generation ap- 
proach is that different evaluation methods can 
easily be compared. 
3.3 Hierarchical centering 
The linear approach, illustrated above, assigns 
centers on the basis of a proposition sequence, 
flattening the original hierarchy and ignoring 
nucleus-satellite relations. This means, for ex- 
ample, that in a text of two paragraphs, propo- 
sition U2.1 (the first proposition in the second 
paragraph) has to be treated as the successor 
to Ui.N (the final proposition of the first para- 
graph): even if Ui.:\, is relatively insignificant 
(the satellite of a satellite, perhaps). One's in- 
tuition in such cases is that some more signif- 
icant proposition in the first paragraph should 
become the focus against which continuity of 
reference in the second paragraph is judged. 
Veins Theory (Cristea et al 1998) provides a 
possible formalization of this intuition, in which 
some earlier propositions become inaccessible as 
a rhetorical boundary is crossed. The theory 
could be applied to centering in various ways; 
we have implemented perhaps the simplest ap- 
proach, in which centering transitions are as- 
sessed in relation to the nearest accessible prede- 
cessor. In many cases the linear and hierarchical 
definitions give tile same result, but sometimes 
they diverge, as in the following alternative to 
solutions A and B: 
C. ban(fda, elixir) since contain(elixir, 
gestodene). 
However, approve(f tin, elixirplus). 
Following Veins Theory, the predecessor of 
approve(f da, elixirplus) is ban(f da, elixir); its 
linear predecessor contain( elixir, ge.stodene ) 
(an embedded satellite) is inaccessible. This 
makes a considerable difference: under a hier- 
archical approach, fda can be the Cb of the 
83 
final proposition; under a linear approach, this Proceedings of ANLP-NAACL 2000. 
proposition has no Cb. D Cristea, N Ide and L Romary, 1998. Veins 
~ '. Iheory: ~ A :model of,:gtobat: discourse :cohesion 
4 Conclusion 
This paper has highlighted some implications 
of Centering Theory for planning coherent text. 
We show that by making some assumptions 
about which entities are potential Cps, we can 
determine Cbs, Cps, and hence transitions, in 
the text planning stage. This allows the text 
planner to select the proposition sequence that 
yields the best continuity of reference, or to bal- 
ance the goal of referential continuity against 
other factors. For instance, there may be a 
preference for Satellite to follow Nucleus for 
some discourse relations, even if this results in a 
greater number of defects according to centering 
considerations. There are difficulties in evaluat- 
ing algorithms for specific tasks which are em- 
bedded in a generation system, since the quality 
of the output is limited by the functionalities of 
the system as a whole. In particular, the task 
of generating appropriate referring expressions 
cannot be tackled in isolation from other tasks 
which contribute to the coherence of a text. 
The implementation of Centering reported 
here is a special case of text planning by con- 
straint satisfaction, where the user has control 
over the different constraints, and this approach 
means that different strategies for e.g. clause or- 
dering and pronominalisation can easily be com- 
pared by inspecting the resulting texts. The 
evaluation metrics we have presented here are 
provisional and are a matter for further detailed 
research, which our approach to text generation 
will facilitate. 
Acknowledgements 
This work was supported by the UK EPSRC 
under grant references L51126, L77102 (Kibble) 
and M36960 (Power). 

References 
S Brennan 1998. Centering as a Psychological 
Resource for Achieving Joint Reference in Spon- 
taneous Discourse. In Walker, Joshi and Prince 
(eds), Centering Theory in Discourse, Oxford. 
S Brennan. M Walker Friedman and C Pollard 
1987. A Centering Approach to Pronouns. In 
Proe. 25th A CL . 
H Cheng 2000. Experimenting with the Inter- 
action between Aggregation and Text Planning~ 
and coherence. In Proc COLING/ACL'98, pp 
281-285, Montreal. 
R Dale 1992, Generating Referring Expressions, 
MIT Press. 
B Grosz, A Joshi and S Weinstein 1995, Center- 
ing: a framework for modelling the local coher- 
ence of discourse. Computational Linguistics. 
• R Kibble 1999, Cb or not Cb? Centering theory 
applied to NLG. A CL workshop on Discourse 
and Reference Structure. 
R Kibble and R Power,1999. Using centering 
theory to plan coherent texts, Proceedings of 
the 12th Amsterdam Colloquium. 
W Mann and S Thompson 1987, Rhetorical 
Structure Theory: A Theory of Text Organ- 
isation. In L Polanyi (ed.), The Structure of 
Discourse. 
K McCoy and M Strube, 1999. Generating 
Anaphoric Expressions: Pronoun or Definite 
Description? A CL workshop on Discourse and 
Reference Structure. 
V Mittal, J Moore, G Carenini and S Roth 1998, 
Describing Complex Charts in Natural Lan- 
guage: A Caption Generation System. Com- 
putational Linguistics. 
R Power 2000. Planning Texts by Constraint 
Satisfaction, to appear in Proceedings of COL- 
ING 2000. 
E Prince 1999. How not to mark topics: "Top- 
icalization" in English and Yiddish. Ms, Lin- 
guistics Department, University of Pennsylva- 
nia. 
E Reiter 1994. Has a consensus NL generation 
architecture appeared, and is it psycholinguisti- 
cally plausible? In Proc. INLG 7. 
E Reiter and R Dale 1997, Building Applied 
Natural-Language Generation Systems. Jour- 
nal of Natural-Language Engineering 
D Scott and C de Souza, 1990. Getting the 
message across in RST-based text generation. 
In Dale, Mellish and Zock (eds), Current Re- 
search in Natural Language Generation, Aca- 
demic Press. 
M Strube and U Hahn 1999, Functional Center- 
ing - Grounding Referential Coherence in Infor- 
mation Structure. Computational Linguistics. 
