A HYBRID APPROACH TO THE AUTOMATIC PLANNING 
OF TEXTUAL STRUCTURES 
Gang Zhu and Nigel Shadbolt 
A.I. Group, Psychology l)epartment, Nottingham University, Nottingham, U.K. 
1. Introduction 
Over the last decade, the research goals in 
natural language generation have shi fred fi'om the gen- 
eration of isolated sentences to the production of coher- 
ent multi-sentence paragraphs. Two major aspects of 
the generation process have been focused on: deciding 
'what to say' (the strategic level) and deciding 'how to 
say it' (the tactical level). 
In 1985, McKeown designed one of the first 
systems to produce paragraphs using so-called sche- 
mata to describe conventional text structures in terms 
of patterns. Schemata are used to determine the content 
and order of the clauses in paragraphs (McKeown, 
1985). However, these structures have a major limita- 
tion (Moore and Paris, 1988): schemata do not contain 
a description of tim intentional and rhetorical role that 
each part of the paragraph plays with respect to the 
whole paragraph. 
In 1988, How first employed RST (Pdmtorical 
Structure Theory) relations, which state the relation- 
ships between individual elements of a text, to control 
the construction of texts (How, 1988). In developing 
this RST-based method, Ho W has discovered that RST 
relations are a powerful tool for planning pa,'agraphs. 
They support reasoning about the intentions of writers 
and readers in a very natural way. Planning with rhe- 
torical relations affords more flexibility than schcnmta. 
This method of planning paragraphs builds a tree struc- 
ture that represents the internal organisation and rhe- 
torical dependencies between clauses in a text. P, nt 
there is a cost: it is more difficult to assemble an RST 
paragraph tree from a set of independent relations than 
it is to instantiate and traverse a schema (Hovy, 1991 ). 
In 1992, Hovy et. al. described a new text 
planner (Ho Wet. al., 1992) that identifies the distinct 
types of knowledge necessary to generate coherent 
discourse in a text generation system. These knowl- 
edge resources are integrated under a planning process 
that draws from appropriate resources whatever 
knowledge is needed to construct a text. Though Itovy 
et. al. do not claim to have identified all the knowledge 
sources required to produce coherent discourse, their 
planner sets a trend for applying multi-knowledge 
resources for more complete and flexible phmning of 
text. 
So far, planning techniques have developed 
from the direct application of schemata toward the 
wider implementation of multi-knowledge resources 
and diverse l)hmning architectures (McKeown, 1985; 
Paris, 1987; How, 1988; Moore, 1989; McKeown et 
al., 1990; Suthers, 1991; Hovy et. al., 1992). When 
these planning mechanisms are implemented in a 
working system, efficiency is still an important factor 
in developing a workable model. One of the problems 
in generation is that of designing a phmning architec- 
ture that can achieve a good balance between the 
efficiency of the schema-based approach and the lqex- 
ibility o1' the RST-based one. This paper presents such 
a hybrid architecture. 
2. A llybrid Approach 
Both schema-based and RST-based planning 
pa,'adigms have advantages and disadvantages. A 
hybrid of the two approaches that preserves their best 
aspects -- the efficiency of tile schema-based para- 
digm and the flexibility of the RST-based one -- 
would clearly be usefnl. What are the possibilities for 
such a hybrid approach? 
Though the two paradigms seem very differ- 
ent, tile fact is that a close relationship exists between 
them. Schemata are nothing other than stereotypically 
occurring collections of phms, whereas the plans and 
their plan elements are simply the elementary building 
blocks of schemata (Mann, 1987). Schcrnata can be 
viewed as the result of a process where the plans for all 
of the steps in the process have been compiled into a 
single structu re (Moore and Swartout, 1991 ). Schemata 
can be used for plmming relatively invariant aspects of 
text content and structure. RST-based plans can cope 
with less predicable and more volatile. Both planning 
paradigms can be inlplementcd if they are properly 
represenled and manipulated in a hybrid architecture. 
Two features are of importance in this hybrid 
appro'~ch: (t) different planning mechanisms are re- 
quired to deal with different textual phenomena and 
(2) explicit use of multi-knowledge resources indis- 
pensable to these n~echanisms. 
In knowledge resources, there are two types of 
presc,'iptive knowledge: domain-dependent and the 
domain-independent knowledge. Both domain-de- 
334 
20+18+16+13 14-15 17 19 
Nuclear Satellilcs: l{labc.ration 
13. The Five Sttu" stipport package, 
14. costing 35% of Ihe licorice fee 
15. (with a iliiliinltiln of J2500), 
16. also hlcltldeS free, aUtOlllalic tlpgrades 
17. (i.e. Ill;Ijor releases) 
I 8. to the products 
19. thai you have registered, 
20. prel\~rential rales for adding extra copies 
of those or <'lily ol\]ler products and Illel/l- 
Imrships of the new I+l>A User gl'ou I) 
21. I~ olh stt ppott packages include d iscou n ts 
on m~y training or ctmsultancy 
22+ ordered during the support periud. 
Fig. 1: A sample RST "uialysis 
pendent and domain-independent knowledge resources 
will consist of intentional and rhetorical operators 
with associated hierarchical networks. 
In the hybrid planning rnechanisni advocated 
here, top-down hierarchical expansion is used as tile 
basic planning meclmriism. It retrieves as ll/tlch i.t'or- 
mation as possible from relewmt knowledge rcsources. 
In general it does this in a donmin-dcpendent-to- 
domain-independent order. This order rel+lecls the 
idea that an efl'icient planuing mechanism should seek 
to exploit, whenever possible, stereotypical domain- 
dependent knowledge resources. This top-down plan- 
ning mechanism is combined with other heuristic 
meehanistns such as atl atlgmetlted transitiot/network 
traversal, constructive critics and focus modul0s. This 
approach makes use of different knowledge rcsOill+eOs 
and planriing mechanisms and is capable of handling 
a number of different textual pbenomeria. 
3. A Prototype 
A prototype lias been designed to delllOll- 
strate tiffs hybrid approach to die problem of phlnning 
textual structures. We will first describe how the text 
data were collected and analysed. Based on this data, 
we will then discuss the kuowledge resources lhat 
were identified as important as well as how they are 
,'epresented. lqnally we indicnte how the i+lulmitlg 
mechanisnis are iml:,lenu:nted. 
3.1. Test l)ata 
The test data are a selection of l:inglish sah:s 
letters. These letters are rehltively formalised, in that 
some paragral)hs are fixed while others are more 
wtried as to whether they appear mid where they 
appear. The letters were written for a restricted read- 
ership on a specific subject " "i namely, certain comptt+ 
tot" software products. 
The textual analysis has bcen carried out 
according to RST, although several modifications 
have had to be made. An exatnple o1' part of a RST 
analysis is given hi Fig. 1. 
Although RST provides a framework for de- 
scribing rhetorical relations ali\]Ollg parts of a text, it 
lacks ;ill explicit representation of the coummnicative 
intentions undcrlyhlg the generation of cohol+Oilt 
IiltlltisententiaI text. In order to COllstrtlct a hybridised 
processor for various knowledge bases and plannhig 
ilioclialliSlils, we callnot implelnellt RST dh'ectly with 
its rhctoric'al rotations, but llave to develop additional 
hlcntional rol;itions. RST has to be supplenlonted 
with a richer hltoulional context. 
3.2. l(nowledt4e I~,esources 
"Fo plan otlr s;tles letters, we need to develop 
distinctive domain-dcl)endent and domain-independ- 
ent knowledge reseurces and thoh" associated 
proeosshlg nlcchanisms in a plaunhlg system. 
l:{;lcll lesourcc represeuts both domaiu+de- 
pendeltt hlfornr, ttion :.uM donlahl-hldepondout hlfor- 
tllaliOll, tlierarchical l/etworks describe relationsliil+s 
LII/1OII,~ r, the (:OlltOllts of otlr knowledge resources. 
In this section, we present tim main knowl- 
ed~o i'ost)tll'CCS that we have st) far Meniifiod, namely: 
intentional operators, rlietorical operators, and ilet- 
works over theln. 
3.2.1. lntention:d Operators 
h+llClltiOl1~tl ol)(21"~Itt)l+S ~+ll+e Ol'g~Illlscd ;+trOLllld 
tim intentions of the writer, and tlmir decornpositions 
are used to select t'elevant rhetorical operators or 
approl-Jrhite speech acts as defirted by Allen (1987). 
An hlentioruil operator is represented using the for- 
real theory o f t'ational interaction developed by Coben, 
1,cvesque, and Perraut (1985). Each operator has a 
goal, prerequisites, consh'airlts, subgoals, and a type. 
The goal will be brought abotlt by a,i application ot'the 
Ol~JCrtltor. The stlbgo:.llS mtlst be achieved for stibse- 
quent application el'the operator. The prerequisites are 
conditions which must be. salisfied, and coustr;ihlts 
are Colld it ions which Call be ignored if there is rio other 
intention:ll operator which has the desired goal. The 
tylx; in eacli operator is either domain-dependent or 
Z75 
domain-independent. The criteria for the division be- 
tween the domain-dependeut and the domain-inde- 
pendent operators is based on the stereotypic patterns 
of our analysed texts. For example, Fig. 2 represents a 
domain-dependent intentional operator, Pe,'suade. In 
our system, this operator may be instantiated as an 
attempt by an agent X to persuade a client Y to take an 
Action such as buying the agent's Products. This is 
achieved by making the client aware of the products 
and increasing his desire to take the action of buying 
the product. The prerequisites indicate that both the 
agent X and the client Y mutually believe that Informa- 
tion is about Products, the agent believes hfformation, 
and the client does not know it. These prerequisites 
nmst be satisfied within the existing knowledge re- 
sources before the intentional operator can be applied. 
The constraints, in this case that the client Y is not 
competent to fulfil Action, need to be satisfied at this 
stage of processing. When the constraints happened 
not to be satisfied within the existing knowledge 
resources, the constraints are then set as a new subgoal 
for later expansion. 
GOAL: persuade(X, Y, Action) 
PREREQUISITES: 
bmb(X, Y, is(Information, Products)) 
bel(X, hfformation), not(know(Y, hfformation)) 
CONSTRAINTS: not(competent(Y, Action)) 
SUBGOALS: bmb(X, Y, and(bel(Y, l~roducts), 
increase_desire(Y, Action)) 
TYPE: domain-independent 
Fig. 2: An intentional operator: Persuade 
3.2.2. Rhetorical Operators 
Rhetorical operators are associated with in- 
tentional operators. This association reflects the fact 
that there are certain rhetorical means of achieving 
particular intentional goals. P, hetorical operators con- 
sist of seven components: Prerequisites, Constraints, 
Effects, Nuclear, Satellite, Order and Type. As wflh 
our intentional operators the prerequisites must always 
be satisfied. Constraints may be ignored but if they are 
processed they have the same potentkd as constraints 
in intentional operators -- they may become new goals 
for the system. Rhetorical operators as expected to 
have clear effects on intended recipients. Our rhetorical 
operators also possess the important constituents of a 
nuclear and satellite. They concern how the goals 
expressed in the calling intentional operators are to be 
achieved-- the actions to be carried out. There are two 
types of rhetorical operators-- domain-dependent and 
domain-independent: 
Domain-independent rhetorical operators are 
general rhetorical operators applicable across a wide 
range of types of texts. There are about thirty of them 
described to date (Mann and Thompson 1987). Plan- 
ning with these operators affords more flexibility 
than schemata, because individual operators typi- 
cally control less of a paragraph than schemata do; 
Domain-dependent rhetorical operators are 
derived fi'om our RST analysis of our task-oriented 
data. l-laving analysed our sales letters we have 
klentified those rhetorical operators that seem par- 
ticular to such computer product sales texts. Often 
they arc rather schematic in that one can expect 
certain material to be expressed in particnlar ways at 
certain parts in the text. 
3.2.3. Intentional and Rhetorical Networks 
The intentional network is a hierarchical 
structure that embodies a preferred control structure 
for the use o f on r in ten t io nal operators. The intent ion al 
network can be used for giving possible development 
of conmmnicative goal(s) with heuristic ordering for 
an efl'icient schema-based approach. 
The rhetorical network is derived fi'om several 
main sources: the relations defined in RST (Mann and 
Thompson 1989), which were extended in Hovy's 
taxonomization ol' relations (Hovy et. al. 1992), and 
others as determined by our sldes-letter domain. This 
,hetorical network operates together with the other 
knowledge resources, by posting the hierarchical 
patterns of intentional operator(s), selecting relewmt 
speech act(s), or specifying aspects of g,'ammatical 
realisation. 
3.3. l'lanning Mecllanisnls 
A text phnumr, in the form of a heuristic 
planning process adopted from the layered architec- 
ture JAM (Carletta 1992) and a top-down hierarchi- 
cal expansion system based on NOAH (Sacerdoti 
1977), has been ilnl~lcmentcd to phm cohel'ent para- 
graphs which achieve a goal. The goal is configured 
with initial stales designed to affect a reader in a 
specified way. 
During tile main planning process, top-down 
hierarchical phnming takes place. This occurs when 
intentional operators are expanded into a network of 
subgoal(s), or rhetorical operators are expanded into 
a network of aclions. Planning is also involved when 
unsatisfied constraints become new subgoals. There 
may be several alternative expansions to be explored. 
At this point, the organisation of the plan expressed 
by one or more structure trees may have to be criti- 
cised to account for interactions between parts of 
what were previously unanalysed subgoals and ac- 
tions. If there exist a g,'oup of structure trees, these 
trees have to be focused through selective heuristics. 
336 
InlOU{ ~/o.r \[ Intlial.~jc, a,(bmb(agec,t, client, ¢onvincecl(dietd, intonnation(lpa ctelt~ba~)))~ 
\] (k)aI ill all intentional operator 
~ cite r,t, inlllaLgoal(...)) -~ Ti/~nd~tLf~; tmtiue 
lil~n dle e~ LP~\] stla si'le: ~ Constraint a~ a subgc, al y~ Decomposition 
1 F lr l 
-'--"~_~{ cleric opcr to' \[bmb(agent, client, and(bel(Client,information),increase dedre))\] r--~-----7 \[ Preser'tati°nalsequenee I 
- ~j 
bmb(egent client and(exped(a£1ent aelion) e0e,-,t(elie,'lt, achon)))\] ~ ~~._ 
"- .......... ~"--'" ....... ' ...... '-' ~'~ ~/ { 1-21 
,. , ,_ I I / bmb(agent client and(present believe)) .... ~ 3,1-38 Nucleus Nucleus 
1-33 j Nucleus Salellit e 1-25 
Satellite ~~ 
1-33 1-21 
IVD \]\]VA\]\]ON El&43 L Ef...EI..J T FVESEI,KA'IIONf, LSEQUENCE 
Fig. 3: A simplified tol>-Icvcl planning process for |we alternative lextua\[ structures 
These heu,'istics prefer structures with less subgoals 
remaining or lower cost estimates in the knowledge 
hierarchical networks. We call these critic processes 
heuristic ordering mechanisms. 
For example, Fig. 3 shows a simplified top- 
level planning process for two alternative textual struc- 
tures. The initial goal is that the writer or agent wishes 
to convince the client about in formation concerning on 
LPA database products. The two alternative structures 
of Fig. 3 represent two different plans that our system 
can generate so as to achieve the initial goal. The two 
plans vary in terms of whether the text is lengthy and 
persuasive, else short and informative. The persuasive, 
lengthy setting results in an olmrator being selected to 
increase the client's desire to buy tile products. But a 
constraint oftheoriginal persuade operato,'is expanded. 
The operator attempts to increase the client's ability to 
take advantage of his strengthened desire to buy the 
products. This will result in text tlmt attempts to produce 
a means of cnablement to increase the ability to satisfy 
the desire. Motivation and Enablemcnt are used to 
produce a partial textual structure on the left side of 
Fig.3. Otherwise, when an Infornmtive mode and a 
Short time setting are required, the system selects an 
intentional operator with a rhetorical operator to fulfil 
its initial goal as shown on the right side of Fig. 3. This 
is a simplificd presentation of informing material about 
the prodttcts. 
The output for the hybrid phmner is a single 
structure tree, with speech acts associated with each of 
the terminal nodes. The termiual nodes specify propo- 
sitions discharging those speech acts. This inlbr,nation 
is chosen st) that, with minor supplementation, it is 
sufficient to specify sentences to be generated by a 
functional grallllllar (see Fig. 4) 
During the process of developing the hybrid 
phuining protolype, we have found that it possesses the 
following adwlniage. I leuristic strategies can be imple- 
lnlbml(Volitional Result) 
~lntentional 
/Volitional Result operators 
/ 2 Inform(Elaboration) I ! I 
/ / ..--. El Ibo ition 
Rhetorical 3 hfform(l~labc, r;llion) 
opora\[ors "4"" x x \] 
i "~F,\]aboration 
4 5 
:2. \[I IIlI bml)(agcnt,clienl,and(agent is pleased, present(agent,agent is 
plcascd)))l, 
* I,PA is pleased 
3. \[ \[ \[ \[l',In b(agcnl,clicnt,and(anncmlmc(agcrd,wir@.:,w series), 
bcl(clicnt,window_scrics))) I, 
* \[O allllOtlllCe ils lies,,, ~,Villdn,,vs series, 
4. \[l\[\[bmb(agent,client,and(inform ol)jecl(agcnt,lnachin¢ 386 / 
486),hal(client,machine 386 / 486)))1, 
* a rnnge of software tools for 386 :rod 486 machines 
5. 
\[I;inb(agcnl,clierd,ai~d(in/iwrn_attribute(agcnl,windows_3 in cnhancedj'n~le), 
bcl(clicrd,windows 3 in enhanccd_mc, dc)))\]llllllll, 
* running 'Windows 3.0 in Enh'lnced Mode. 
Fig. 4: A smnplc partial OUtlmt tree 
337 
mented within a non-linear hierarchical planning pro- 
cedure and multi-knowledge resources can be em- 
ployed selectively at each level of abstraction. Its top- 
down hierarchical expansion process provides an ef- 
ficient non-linear planning mechanism. Its heuristic 
strategy flexibly chooses not to expend all of the effort 
needed to employ various resources unless it is abso- 
lutely necessary. 
4. Conclusion 
This paper has presented a hybrid approach to 
the planning of textual structm'es. It is based on the 
idea that a variety of explicit knowledge resources and 
planning mechanisms are needed for an efficient but 
flexible text planner. By describing a hybrid planning 
prototype, it identifies various knowledge resources 
required in the domain of business letters. It suggests 
associated planning techniques for manipulating in- 
tentional and rhetorical information. Since tile re- 
search is still in progress, this paper cannot claim to 
have identified all the necessary knowledge resources 
and requisite planning mechanisms. Consequently, 
certain problems, such as how to evaluate wnious 
planning critics in detail, remain unsolved. The next 
stage of the research is to capture richer knowledge in 
the domain and further develop tile critic modules and 
their controlling mechanisms. Nevertheless we feel 
that the system as it stands represents a linguistically 
motivated and coherent computational architecture 
for the generation of text. The generated text is, 
moreover, rhetorically compelling given the intention'fl 
goals of the originator. 
References 
Allen, J. 1987. Natural Language Understanding, The 
Benjamin/Cummings Publishing Company, Inc.. 
Carletta, J. 1992. Risk-taking and Recovery in Task-OH- 
ented Dialogue, PhD thesis, University of Edintmrgh 
Cohen, P. R. and Levesque, H. 1985. Speech "lets and 
nationality, hz Proceedings of tile 23rd Annual Meeting of 
the Association for Computational Linguistics. 
Hovy, E. H. 1988. Planning coherent multisententinl text. 
hi Proceedings of the twenty-sixth Annual Meeting o\['lhe 
Association for Computational I.inguistics, State Univer- 
sity of New York, Buffalo, New York. 
Ho W, E. H. 1990. Unsolved issues in paragraph phuming. 
hi R. Dale, C. Mollish and M. Zock Eds., Current Research 
in Natural Language Generation. Academic Press Lim- 
ited, London. 
Hovy, E.H. 1991. Approach to tile planning of coherent 
text. ht C.L. Paris, W.R. Swartout and W.C. Mann Eds., 
Natural Language Generation in Artificial Intelligence 
and Computational Linguistics. Kluwer Academic Pub- 
lishers, USA. 
Hovy, E., Lavld, J., Maier, E., Mittal, V. and Paris, C. 1992. 
Employing knowledge resources in a new text planner archi- 
tecture. In R. Dale, F,. llovy, D. Rosner and O. Stock Eds., 
Aspects of Autmnated Natural Language Generation. 
Springer-Verlag Berlin Heidelberg. 
Mann, W.C. 1987. Text Generation: the Problem of Text 
Smlcture. Technical Report No. RS-87-181, ugc/in\[brma- 
lion Science Institute, Marina Del Rey, CA. 
Mann, W. C., Matthiessen, C. and Thompson, S. A. 1989. 
P, hetoricaI Structure Theory for text analysis, USC/lnforma- 
lion Sciences Institute, Technical Report ISI/RR-89-242. 
McKeown, K. R. 1985. Text Generation : Using Discourse 
StJwtegies and Focus Constraintx to Generate Natur_al Lzm- 
guage Text. Cambridge University Press, Cambridge, Eng- 
land. 
McKeown, K. R; Elhadad, M.; Fukmnoto, Y.; Lira, J.; 
Immbardi, C.; Robin, J. and Smaadja, F. 1990. Natural 
language generation in COMET. In R. Dale; c. Mellish and 
M. Zock Eds., Current Research in Natural Ixmguage Gen- 
eration. Academic Press Limited, London 
Moore, J. I). 1989. A reactive approach to explanation in 
expert ancl advice giving systems. PhD. dissertation, Univer- 
sity of California, Los Angeles, CA. 
IVloore, J .D. and Paris, C.I_ 1988. Constructing coherent text 
using rhetorical relations. In Proceedings of tile National 
Conference oil Artificial Intelligence, Moston, MA. 
Saeerdoti, E, D. 1977. A Structure Ibr Plans and Behaviours. 
New York: North l lolland, 
Suthers, D. S. 1991. A task-appropriate hybrid architecture 
for explanation. Comput:ttional Intelligence, 7(4). 
338 
