A Computational Theory of Prose Style 
for Natural Language Generation 
David D. McDonald and James D. Pustejovsky
Department of Computer and Information Science
University of Massachusetts at Amherst
1. Abstract 
In this paper we report on initial research we have 
conducted on a computational theory of prose style. Our 
theory speaks to the following major points: 
1. Where in the generation process style is taken into 
account. 
2. How a particular prose style is represented; what 
"stylistic rules" look like; 
3. What modifications to a generation algorithm are
needed; what the decision is that evaluates stylistic
alternatives;
4. What elaborations to the normal description of
surface structure are necessary to make it usable as
a plan for the text and a reference for these
decisions;
5. What kinds of information decisions about style 
have access to. 
Our theory emerged out of design experiments we have 
made over the past year with our natural language 
generation system, the Zetalisp program MUMBLE. In the 
process we have extended MUMBLE by adding a new
process that now mediates between content
planning and linguistic realization. This new process, which
we call "attachment", provides the further significant benefit
that text structure is no longer dictated by the structure of
the message: the sequential order and dominance
relationships of concepts in the message no longer force one
form onto the words and phrases in the text. Instead,
rhetorical and intentional directives can be interpreted
flexibly in the context of the ongoing discourse and stylistic
preferences. The text is built up through composition under
the direction of linguistic organizing principles, rather than
having to follow conceptual principles in lockstep.
We will begin by describing what we mean by prose 
style and then introducing the generation task that led us
to this theory, the reproduction of short encyclopedia 
articles on African tribes. We will then use that task to 
outline the parts of our theory and the operations of the 
attachment process. Finally we will compare our techniques 
to the related work of Davey, McKeown and Derr, and 
Gabriel, and consider some of the possible psycholinguistic
hypotheses that it may lead to. 
2. Prose Style 
Style is an intuitive notion involving the manner in 
which something is said. It has been more often the 
professional domain of literary critics and English teachers 
than linguists, which is entirely reasonable given that it 
involves optional, often conscious decisions and preferences
rather than the unconscious, inviolable rules that linguists 
term Universal Grammar. 
To illustrate what we mean by style, consider the three
paragraphs in Figure 1. As we see it, the first two of 
these have the same style, and the third has a different 
one. 
The Ibibio are a group of six related peoples living
in southeastern Nigeria. They have a population
estimated at 1,500,000, and speak a language in the
Benue-Niger subfamily of the Niger-Congo 
languages. Most Ibibio are subsistence farmers, but 
two subgroups are fishermen. 
The Ashanti are an AKAN-speaking people of 
central Ghana and neighboring regions of Togo and 
Ivory Coast, numbering more than 900,000. They 
subsist primarily by farming cacao, a major cash crop. 
The Ashanti are an African people. They live in 
central Ghana and neighboring regions of Togo and 
Ivory Coast. Their population is more than 900,000.
They speak the language Akan. They subsist
primarily by farming cacao. This is a major cash crop.
Figure 1 Three paragraphs, two styles
The first two of these paragraphs are extracted from 
the Academic American Encyclopedia; they are the lead 
paragraphs from the two articles on those respective tribes. 
The third paragraph was written by taking the same 
information that we have posited underlies the Ashanti 
paragraph and regenerating from it with an impoverished 
set of stylistic rules. 
We began looking at texts like these during the 
summer of 1983, as part of the work on the "Knoesphere 
Project" at Atari Research (Borning et al. [1983]). Our goal
in that project was to develop a representation for the kind 
of information appearing in encyclopedias which would not 
be tied to the way in which it would be presented. The 
same knowledge base objects were to be used whether one 
was recreating an article like the original, or making a
simpler version to give to children, or answering isolated 
questions about the material, or giving an interactive 
multi-media presentation coordinated with maps and icons, 
and so on. 
With the demise of Atari Research, this ambitious goal 
has had to be put on the shelf; we have, however, 
continued to work with the articles on our own. Research 
on these articles led us to begin work on prose style. This
remains an interesting domain in which to explore style 
since we are working with a body of texts whose 
organization is not totally dictated by its internal form. 
These paragraphs are representative of all the African 
tribe articles in the Academic American, which is not 
surprising since all of the articles were written by the same 
person and under tight editorial control. What was most 
striking to us when we first looked at these articles was 
their similarity to each other, both in the information they 
contained and the way they were structured as a text. We
will assume that for such texts, "encyclopedia style" involves
at least the following two generalizations: (1) be consistent
in the information that you provide about each tribe; and
(2) adopt a complex, "information loaded" sentence structure
in your presentation. This sentence structure is typified by
a rich set of syntactic constructions, including the use of 
conjunction reduction, reduced relative clauses, coordination, 
secondary adjunction, and prenominal modification whenever 
possible. 
A contrasting style might be, for example, one that was 
aimed at children; we have rewritten the information on 
the Ashanti tribe as it might look in such a style. We 
have not yet tried implementing this style since it will call
for doing lexicalization under stylistic control, which we 
have not yet designed. 
"The Ashanti are an African people. They live in 
West Africa in a country called Ghana and in parts 
of Togo and the Ivory Coast. There are about 
900,000 people in this tribe, and they speak a
language named AKAN. Most of the Ashanti are 
cacao farmers." 
Figure 2 
The style of the Academic American paragraphs, on the 
other hand, is much tighter, with more compact sentence 
structure, and a more sophisticated choice of phrasing. 
Such differences are the sort of thing that rules of prose
style must capture. 
3. Our Theory of Generation 
Looking at the generation process as a whole, we have 
always presumed that it involved three different stages, with 
our own research concentrating on the last. 
(1) Determining what goals to (attempt to) accomplish with
the utterance. This initiates the other activities and posts a
set of criteria they are to meet, typically information to be 
conveyed (e.g. pointers to frames in the knowledge base) 
and speech acts to be carried out. 
(2) Deciding which specific propositions to express and
which to leave for the audience to infer on their own.
This cannot be separated from working out what rhetorical
constructions to employ in expressing the specified speech
acts; or from selecting the key lexical items for
communicating the propositions. The result of this activity
is a text plan, which has a principally conceptual vocabulary
with rhetorical and lexical annotations. The text plan is
seen by the next stage as an executable "specification" that
is to be incrementally converted into a text. The
specification is given in layers, i.e. not all of the details are
planned at once. Later, once the linguistic context of the
units within the specification has been determined, this
planner will be recursively invoked, unit by unit, until the
planning has been done in enough detail that only linguistic 
problems remain. 
(3) Maintaining a representation of the surface structure
of the utterance, traversing and interpreting this structure
to produce the words of the text and constrain further
decisions. This stage is responsible for the grammaticality of
the text and its fluency as a discourse (e.g. ensuring that
the correct terms are pronominalized, the correct focus
maintained, etc.). The central representation is an explicit
model of the surface structure of the text being produced,
which is used both to determine control flow and to
constrain the activities of the other stages (see discussion
in McDonald [1984]). The surface structure is defined in
terms of a stream of phrasal nodes, constituent positions,
words, and embedded information units (which will
eventually have to be sent back to the planner and then
realized linguistically, extending the surface structure in the
process). The entities in the stream and their relative order
are indelible (i.e. once selected they cannot be changed);
however, more material can be spliced into the stream at
specified points. 
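The indelibility constraint can be pictured with a small sketch. It is an illustration in Python, not MUMBLE's Zetalisp representation: the class name `SurfaceStream` and its methods are hypothetical, the stream is flattened to a list of strings, and any index ahead of the point of speech is treated as a legal splice point, whereas the paper restricts splicing to grammatically defined points.

```python
class SurfaceStream:
    """Hypothetical stream of surface-structure entities.

    Entities already in the stream are never removed or reordered;
    new material may only be spliced in ahead of the point of speech.
    """

    def __init__(self, entities):
        self._entities = list(entities)
        self._point_of_speech = 0  # index of the next entity to be uttered

    def speak_next(self):
        """Utter the next entity and advance the point of speech."""
        entity = self._entities[self._point_of_speech]
        self._point_of_speech += 1
        return entity

    def splice(self, index, entity):
        """Insert new material at index, which must not lie behind the point of speech."""
        if index < self._point_of_speech:
            raise ValueError("cannot splice behind the point of speech")
        if index > len(self._entities):
            raise IndexError("splice index is past the end of the stream")
        self._entities.insert(index, entity)

    def entities(self):
        """Return a read-only snapshot of the stream."""
        return tuple(self._entities)


stream = SurfaceStream(["The Ashanti", "are", "an", "people"])
stream.speak_next()                # utters "The Ashanti"
stream.splice(3, "Akan-speaking")  # allowed: index 3 is ahead of the point of speech
```

Because the class offers no delete or reorder operation, the only way the text can change is by growing at positions that have not yet been spoken.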
3.1 WHERE IS STYLE CONSIDERED? 
According to our theory, prose style is a consequence
of what decisions are made during the transition from the
conceptual representational level to the linguistic level. The
conceptual representation of what is to be said--the text
plan--is modeled as a stream of information units selected 
by the content planning component. The attachment process
takes units from this stream and positions them in the 
surface structure somewhere ahead of the point of speech. 
The prose style one adopts dictates what choice the 
attachment process makes when faced with alternatives in 
where to position a unit: should one extend a sentence with
a nonrestrictive relative clause or start a new one; express
modification with a prenominal adjective or a postnominal
prepositional phrase? The collective pattern of such
decisions is the computational manifestation of one's style.
3.2 EXTENSIONS TO THE SURFACE STRUCTURE 
REPRESENTATION 
The information units from the text plan are positioned
at one or another of the predefined "attachment points" in
the surface structure. These points are defined on 
structural grounds by a grammar, and annotated according 
to the rhetorical uses they can be put to (see later example 
in Figure 8). They define the grammatically legitimate 
ways that the surface structure might be extended: another 
adjective added to a certain noun phrase, a temporal 
adjunct added to a clause, another sentence added to a
paragraph, and so on. 
Which attachment points exist at any moment is a 
function of the surface structure's configuration at that 
moment and where the point of speech is. Since the 
configuration changes as units are added to the surface 
structure or already positioned units are realized, the set of 
available attachment points changes as well. This is
accomplished by including the points in the definitions of
the phrasal elements from which the surface structure is
built. We have since argued that this addition of
attachment point specifications to elementary trees is very
similar to the grammatical formalism used in Tree
Adjoining Grammars [Joshi 1983] and are actively exploring
the relationships between the two theories (cf. McDonald &
Pustejovsky [1985a]).
3.3 A DECISION PROCEDURE 
The job of the attachment process is to decide which 
of the available attachment points it should use in 
positioning a text plan unit in the surface structure. This
decision is a function of three kinds of things: 
1. The different ways that the unit can be realized in 
English, e.g. most adjectives can also be couched as
relative clauses, but not all full clauses can be reduced
to participial adjectives.
2. The characteristics of the available attachment 
points, especially the grammatical constraints that
they would impose on the realization of any unit
using them. The "new sentence" attachment will
require that the unit be expressible as a clause and
rule out one that could only be realized as a noun
phrase; attachment as the head of a noun phrase
would impose just the opposite constraint.
3. What stylistic rules have been defined and the
predicates they apply to determine their 
applicability. 
The algorithm goes as follows. The units in the stream 
from the text plan are considered one at a time in the 
order that they appear. There is no buffering of
unpositioned units and no lookahead down the stream to
look for patterns among the units; any patterns that might
be significant are supposed to already have been seen by
the text planner and indicated by passing down composite
units. 1 Each unit is thus considered on its own, on the
basis of how it can be realized. 
The total set of alternative phrasings for an information
unit is precomputed and stored within the linguistic
component (i.e. the third stage of the process) as a
"realization class". Different choices of syntactic
arrangement, optional arguments, idiomatic wordings, etc.
are anticipated beforehand (by the linguist, not the
program) and grouped together along with characteristics
that describe the uses to which the different choices can be
put: which choice focuses which argument; which one
presumes that the audience will already understand a
certain relationship, which one not. (Realization classes are
discussed at greater length in McDonald & Pustejovsky
[1985b].)
The first step in the attachment algorithm is to
compute all legitimate pairings of attachment points and
choices in the unit's realization class, e.g. a unit might be
attached at a NP premodifier point using its adjective 
realization; or as postmodifier using its participial 
realization; or as the next sentence in the paragraph using 
any of its several realizations as a root clause. This 
particular case is the one in our example in Section 4. 
The characteristics on each of the active attachment 
points will be compared with the characteristics on each of 
the choices in the unit's realization class. Any choice that 
is compatible with a given attachment point is grouped with 
it in a set; if that attachment point is selected, a later 
decision will be made among the choices in that set. 
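The pairing step can be sketched as follows. This is an illustrative Python sketch, not the MUMBLE implementation: the names `Choice`, `AttachmentPoint` and `pair_points_with_choices` are hypothetical, and "compatibility" is reduced here to a single assumed test, namely that the syntactic category a choice produces equals the category the attachment point requires. The real characteristics compared are richer than this.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Choice:
    name: str      # e.g. "adjectival-form"
    category: str  # syntactic category this choice produces (assumed feature)


@dataclass(frozen=True)
class AttachmentPoint:
    name: str      # e.g. "attach-as-adjective"
    requires: str  # category a unit must be realized as to attach here


def pair_points_with_choices(points: List[AttachmentPoint],
                             realization_class: List[Choice]) -> Dict[str, List[Choice]]:
    """Group each active attachment point with the choices compatible with it.

    Points with no compatible choice are dropped from the result.
    """
    pairs: Dict[str, List[Choice]] = {}
    for point in points:
        compatible = [c for c in realization_class if c.category == point.requires]
        if compatible:
            pairs[point.name] = compatible
    return pairs


# A cut-down realization class for a transitive verb (illustrative only).
transitive_verb = [
    Choice("default-active-form", "clause"),
    Choice("passive-form", "clause"),
    Choice("adjectival-form", "AdjP"),
]
active_points = [
    AttachmentPoint("attach-as-adjective", "AdjP"),
    AttachmentPoint("attach-as-new-sentence", "clause"),
    AttachmentPoint("attach-as-head-of-np", "NP"),
]
pairs = pair_points_with_choices(active_points, transitive_verb)
```

In this toy setting the head-of-NP point drops out because no choice yields a noun phrase, while the new-sentence point keeps both clausal choices for the later decision among them.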
Once the attachment point/choice set pairs have been 
computed, the next step is to order them according to 
which is most consistent with the present prose style. This 
is where the stylistic rules are employed. Once the pairs 
are ordered, we select the pair judged to be the best and 
use it. The unit is spliced into the surface structure at the 
selected attachment point, and the choices consistent with 
1 Assuming that the criterial division between conceptual/rhetorical
planning and linguistic realization is that only the linguistic side
understands the grammar, i.e. the opportunities and constraints implicit in
the surface structure at a given moment (we think that both sides
should be designed to appreciate the lexicon), then this restriction
implies that there will be no opportunistic reconfiguring of the text
plan by the linguistic component, no condensing parallel predicates into
conjunctions or grouping of modifiers etc. unless there is a specifically
planned rhetorical motive for doing so dictated by the planner.
that point are set up for later selection (realization of the unit)
once that point is reached by the linguistic component in
its traversal. 
3.4 STYLISTIC RULES 
As we have just said, the computational job of a 
stylistic rule is to identify preferences among attachment
points. 2 This means that the rules themselves can have a
very simple structure. Each rule has the following three
parts:
1. A name. This symbol is for the convenience of
the human designer; it does not take part in the
computation.
2. An ordered list of attachment points.
3. A predicate that can be evaluated in the
environment accessible within the attachment
process. If the predicate is satisfied, the rule is
applicable.
Each stylistic rule states a preference between specific
attachment points, as given by the ordering it defines. To
perform the sorting then, one performs a fairly simple
calculation (n.b. it is simple but lengthy; see footnote).
(1) For each candidate attachment point, collect all of 
the stylistic rules that mention it in their ordered 
lists; discard any rules that do not mention at least 
one of the other candidate points as well. 
(2) Evaluate the applicability predicates of the collected 
rules and discard any that fail. 
(3) Using the rules that remain, sort the list of
candidate attachment points so that its order matches
the partial orders defined by the individual stylistic
rules.
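The three steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: `StylisticRule`, `rank_points` and the environment keys `has_encyclopedia_entry` and `np_too_heavy` are hypothetical names, the predicates are simplified stand-ins, and ties between points that no rule orders are broken by their original candidate order, a policy the paper does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class StylisticRule:
    name: str                        # for the human designer only
    order: Sequence[str]             # attachment points, most preferred first
    applies: Callable[[Dict], bool]  # applicability predicate


def rank_points(candidates: List[str], rules: List[StylisticRule], env: Dict) -> List[str]:
    cand = set(candidates)
    # Step 1: keep only rules that mention at least two of the candidates.
    relevant = [r for r in rules if len(cand.intersection(r.order)) >= 2]
    # Step 2: discard rules whose applicability predicate fails.
    active = [r for r in relevant if r.applies(env)]
    # Step 3: order the candidates consistently with every active rule.
    must_precede = {c: set() for c in candidates}  # points that must come before c
    for rule in active:
        mentioned = [p for p in rule.order if p in cand]
        for i, better in enumerate(mentioned):
            for worse in mentioned[i + 1:]:
                must_precede[worse].add(better)
    ranked: List[str] = []
    remaining = list(candidates)
    while remaining:
        ready = [c for c in remaining if not (must_precede[c] - set(ranked))]
        if not ready:
            raise ValueError("stylistic rules give contradictory preferences")
        ranked.append(ready[0])  # tie-break: original candidate order
        remaining.remove(ready[0])
    return ranked


rules = [
    StylisticRule("PREFER-NOUN-ADJ-COMPOUND-TO-POSTNOM",
                  ["attach-as-adjective", "attach-as-postnominal"],
                  lambda env: env["has_encyclopedia_entry"]),
    StylisticRule("PREFER-ADJECTIVES-TO-NEW-SENTENCE",
                  ["attach-as-adjective", "attach-as-new-sentence"],
                  lambda env: not env["np_too_heavy"]),
]
candidates = ["attach-as-new-sentence", "attach-as-postnominal", "attach-as-adjective"]
ranked = rank_points(candidates, rules,
                     {"has_encyclopedia_entry": True, "np_too_heavy": False})
unranked = rank_points(candidates, rules,
                       {"has_encyclopedia_entry": False, "np_too_heavy": True})
```

The first element of `ranked` is the attachment point that would be used. When no rule applies, as in `unranked`, the candidates simply keep their incoming order.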
We have now looked at our treatment of four of the
five points which we said at the outset of this paper had to
be considered by any theory of prose style. The fifth
point, the kinds of information stylistic rules are allowed to
have access to, requires some background illustration before
it can be addressed; we will take it up at the end of our
4. An Example 
4.1 Underlying representation
At the present time we are representing the
information about a tribe in a frame language known as
ARLO [Haase 1984], which is a CommonLisp
implementation of RLL. We have no stock in this
representation per se, nor, for that matter, in the specific
details of the frames we have built (though we are fairly
pleased with both); our system has worked from other
representations in the past and we expect to work with still
others in the future. Rather, this choice provides us with
an expeditious, non-linguistic source for the articles, which
has the characteristics we expect of modern representations.
Figure 3 shows the toplevel ARLO frame for the Ashanti
and one of its subframes. 
(defunit Ashanti
  (prototype #>african-tribe)
  (encyclopedia-unit? t)
  (location #>Ashanti-region)
  (population #>Ashanti-population)
  (language #>Akan)
  (economic-basis #>Ashanti-economy))
(defunit #>Akan
  (prototype #>language)
  (encyclopedia-unit? t)
  (speakers #>Ashanti))
Figure 3 Ashanti ARLO-unit
Given this representation, it is a straightforward matter 
to define a fixed script that can serve as the message-level
source for the paragraphs. We simply list the slots that 
contain the desired information. 3 
(define-script [script name illegible in source]
  (#>[illegible]
   #>alternative-names
   #>location
   #>population
   #>economic-basis
   [remainder illegible in source]
Figure 4 The Script Structure
2 At present "preference" is defined by sorting candidate
point-choice pairs against the rules and selecting the topmost one; it
is easy to see how less computationally intense schemes could be
worked out. Some stylistic rules should probably be allowed to "veto"
whole classes of attachment points and others able to declare
themselves always the best. Furthermore these rules naturally fall into
groups by specialization and features held in common, suggesting that
the "sort" operation could be sped up by taking advantage of that
structure in the algorithm rather than simply sorting against all of the
stylistic rules uniformly. We have worked out on paper how such
alternatives would go, and expect to implement them later this year.
3 In ARLO slots are first-class objects with a prototype hierarchy
of their own just like the one for units (frames). The list of slots
is effectively a list of access functions whose domain is units (the
tribe being described) and whose range is also units (the slot values).
When this script is instantiated, the generator will receive a list of
3-tuple records: slot, unit, and value.
If any of these slots is empty or "not interesting" for the
tribe, it is simply left out. The interface between planner
and realization can be this simple because the type of text
we are generating is fairly programmatic and predictable.
With a more complicated task comes a more sophisticated
planner. The point here, however, is to examine a simple
planning domain in order to isolate those decisions that are
purely stylistic in nature. 
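Instantiating such a script amounts to walking the slot list and emitting one record per filled slot. The sketch below is illustrative Python rather than ARLO: the dictionary `ASHANTI` is a hypothetical stand-in for the frame, the slot order in `TRIBE_SCRIPT` is an assumption (the script figure is only partly legible), and the (slot, unit, value) record shape follows the footnote on ARLO slots.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for the Ashanti frame; None marks an empty slot.
ASHANTI: Dict[str, Optional[str]] = {
    "prototype": "african-tribe",
    "alternative-names": None,
    "location": "Ashanti-region",
    "population": "Ashanti-population",
    "language": "Akan",
    "economic-basis": "Ashanti-economy",
}

# Assumed slot order for the paragraph script.
TRIBE_SCRIPT: List[str] = [
    "prototype", "alternative-names", "location",
    "population", "language", "economic-basis",
]


def instantiate_script(script: List[str], unit: str,
                       frame: Dict[str, Optional[str]]) -> List[Tuple[str, str, str]]:
    """Return (slot, unit, value) records in script order, skipping empty slots."""
    records = []
    for slot in script:
        value = frame.get(slot)
        if value is not None:  # empty slots are simply left out
            records.append((slot, unit, value))
    return records


records = instantiate_script(TRIBE_SCRIPT, "Ashanti", ASHANTI)
```

The resulting list is the stream of information units that the attachment process then consumes one at a time. A "not interesting" test would need a further predicate, which is not modeled here.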
4.2 Attachment
To illustrate what attachment adds, let us first look at
what the usual alternative procedure, direct translation, 4
would do with the information plan we use for these 
paragraphs. It would realize the items in the script one by 
one, maintaining the given order, and the resulting text 
would look like this (assuming the system had a reasonable 
command of pronominalization): 
The Ashanti are an African people. They live in central 
Ghana and neighboring regions of Togo and Ivory Coast. This 
is in West Africa. Their population is more than 900,000.
They speak the language Akan. They subsist primarily by
farming cacao. This is a major cash crop.
Figure 5 Paragraph II by Direct Replacement
Although true to the information in the script, this
method does not reflect the complex stylistic variations and
enrichments that make up the original paragraph. There 
must be something above the level of a single information 
unit to coordinate the flow of text, while not altering the 
intentions or goals of the planner. With this in mind, we 
have built a stylistic controller which has the following 
properties: 
o It allows information to be "folded in" to already 
planned text. Items in the script do not necessarily 
appear in the same order in the text. 
o The decision about when to fold things in is made 
on the basis of style; i.e. if the style had been
different, the text would have been different as well. 
o The points where new material may be added to 
planned text are defined on structural grounds. 
For example, notice that in Paragraph II from Figure 1
the language-field is realized as a compound adjectival
phrase, modifying the prototype; viz. "Akan-speaking." For
the first article, however, the language-field is realized
differently. The attachment-point that allows this "fold-in"
(i.e. attach-as-adjective) is introduced by the realization
class for the prototype field. The decision to select this
phrase over the sentential form in Figure 5 is made by a
stylistic rule. This rule (cf. Figure 6) states that the
adjectival form is preferred if the language name has its
own encyclopedia entry. 5 We see that this stylistic rule is
not satisfied in Paragraph I, hence another avenue must be
taken (namely, clausal). The other attachment points used 
by the stylistic rules determine whether to use a reduced 
relative clause, a new sentence, or perhaps an ellipsed 
phrase. The stylistic rule allowing this structure is given 
below in Figure 6. 
(define-stylistic-rule PREFER-NOUN-ADJ-COMPOUND-TO-POSTNOM
  on-attachment-points (attach-as-adjective attach-as-phrase)
  applicability-condition (encyclopedia-entry? Noun))
(define-stylistic-rule PREFER-ADJECTIVES-TO-NEW-SENTENCE
  on-attachment-points (attach-as-adjective attach-as-new-sentence)
  applicability-condition
    (if ([illegible] attachment-point 'attach-as-adjective)
        (not (or (will-be-complex-adjective-phrase
                   (available-choices 'attach-as-adjective))
                 (too-heavy-with-adjectives
                   (np-being-attached-to 'attach-as-adjective))))))
Figure 6 Stylistic Rules
Consider now the derivation of the first sentence of
Paragraph II, and how the stylistic rules constrain the
attachment process. The first unit to be planned as surface
structure is the prototype field--the essential attribute of the
object. This introduces, as mentioned above, an attachment
point on the NP node, allowing additional information to
be added to the surface structure. The realization class
associated with the language field for the Ashanti is
transitive-verb, represented in Figure 7 below.
4 "Direct translation" is a term coined by Mann et al. [1982] to
describe the techniques used by most of the generation systems in use
today with working expert systems. It entails taking a complex
structure from the system's knowledge base as the text source (in this
case our list of slots) and building from it a text that matches
it exactly in structure by recursively selecting texts for its source.
5 This rule is particular to the encyclopedia domain, of course,
and makes reference to information specifically germane to
encyclopedias. The rule, however, is to the point, and appears to be
productive; e.g. "wheat farmers", "town dwellers", etc.
(define-realization-class transitive-verb
  :parameters (agent object verb)
  :choices
  (((default-active-form verb agent object)
    clause)                                   ; A speaks B
   ((passive-form verb agent object)
    clause in-focus(object))                  ; B is spoken by A
   ((gerundive-with-subject verb subj obj)
    NP)                                       ; A speaking B
   ((passive-gerundive-with-subject verb subj obj)
    NP in-focus(obj))                         ; B being spoken by A
   ((adjectival-form verb object)
    AdjP expresses-attribute(B))))            ; B-speaking
Figure 7 Realization Class for Transitive Verb
Because of the stylistic rules, the compound-adjectival form
is preferred. The preconditions are satisfied--namely, Akan is
itself an entry in the encyclopedia-- and the attachment is 
made. Figure 8 shows the structure at the point of 
attachment. 
[Tree diagram: S dominates NP ("The Ashanti") and VP; VP dominates V
and NP; "Akan-speaking" is attached as a prenominal modifier of the
head N within that NP.]
Figure 8 Attachment of (language #>Akan)
5. Comparisons with other Research in
Language Generation 
Two earlier projects are quite close to our own though 
for complementary reasons. Derr and McKeown [1984]
produce paragraph length texts by combining individual
information units of comparable complexity to our own,
into a series of compound sentences interspersed with
rhetorical connectives. Their system is an improvement over
that of Davey [1978] (which it otherwise closely resembles)
because of its sensitivity to discourse-level influences such as
focus. 
The standard technique for combining a sequence of 
conceptual units into a text has been "direct replacement" 
(see discussion in Mann et al. [1982]), in which the
sequential organization of the text is identical to that of
the message because the message is used directly as a
template. Our use of attachment dramatically improves on 
this technique by relieving the message planner of any need 
to know how to organize a surface structure, letting it rely 
instead on explicitly stated stylistic criteria operating after 
the planning is completed. 
Derr and McKeown [1984] also improve on direct
replacement's one-proposition-for-one-sentence forced style by
permitting the combination of individual information units
(of comparable complexity to our own) into compound
sentences interspersed with rhetorical connectives. They
were, however, limited to extending sentences only at their
ends, while our attachment process can add units at any
grammatically licit position ahead of the point of speech.
Furthermore they do not yet express combination criteria as 
explicit, separable rules. 
Dick Gabriel's program Yh [1984] produced polished
written texts through the use of critics and repeated editing. 
It maintained a very similar model to our own of how a 
text's structure can be elaborated, and produced texts of 
quite high fluency. We differ from Gabriel in trying to 
achieve fluency in a single online pass in the manner of a 
person talking off the top of his head; this requires us to 
put much more of the responsibility for fluency in the 
pre-linguistic text planner, which is undoubtedly subject to
limitations. 
It is our belief that, for script-like domains, online text 
generation suffices. This method, in fact, provides us with 
an interesting diagnostic to test our theory of style: namely, 
that stylistic rules are meaning-preserving, and do not
change the goals or intentions of the speaker. Stylistic 
rules are to be distinguished from those syntactic rules of 
grammar which affect the semantic interpretation of a 
syntactic expression. A non-restrictive relative, for example, 
is a particular stylistic construction that adds no
meaning-delimiting predication to the denotation of the NP. 
Use of a restrictive relative, on the other hand, is not a 
matter of style, but of interpretation; "the man who owns a 
donkey" is not a stylistic variant of the proposition "The 
man owns a donkey." In other words, the stylistic
component has no reference to intentions, goals, focus, etc. 
These are the concerns of the planner, and are expressed in 
its choices of information units and their description (cf. 
Mann and Moore [1983] for a discussion of similar
concerns). 
6. Status and Future Work: Computational 
Models of Text planning 
At the time this is being written, the core data 
structures and interpreters of the program have been 
implemented and debugged, along with the set of 
attachment-points and stylistic rules, which are necessary to
reproduce the paragraphs. The stylistic planner is
completely integrated with the language generation program
and has produced texts for scene descriptions (McDonald
and Conklin (forthcoming)), narrative summaries (Cook,
Lehnert, McDonald [1984]), and two of the three
paragraphs shown in Figure 1. 
Currently we are shifting domains to generate 
newspaper articles, in the style of the New York Times.
We have only a single style worked out in detail, but we 
would like to handle styles involving alternative lexical 
choices, as well. 
Ultimately what is most exciting to us is the 
opportunity that we now have to use this framework to 
develop precise hypotheses about the nature of the
"planning unit" in human language generation. This has
been an important question in psycholinguistic research as
well (Garrett [1982]). This continues our ongoing line of
research on the psychological consequences of our
computational analysis of generation. The following are a
few of the questions that must be addressed in the research
on planning:
o What is the size of the planning units at various
stages;
o What is the vocabulary that the units are stated in,
e.g. are conceptual and linguistic objects mixed 
together or are there distinct unit-types at different 
levels, with some means of cascading between levels; 
o Should units be modelled as "streams" with 
conceptual components passing in at one end and 
text passing out at the other, or are they "quanta" 
that must be processed in their entirety one after 
the other; and finally 
o Can the components of a planning unit be revised
after they are selected, or may they only be
refined. This appears to relate to similar questions
in psycholinguistic research (see Garrett [1982] for
review). 
7. Acknowledgements 
This research has been supported in part by
contract N0014-85-K-0017 from the Defense Advanced 
Research Projects Agency. We would like to thank Marie 
Vaughan for help in the preparation of this text. 
8. References 
Borning, A., D. Lenat, D. McDonald, C. Taylor, & S.
Weyer (1983) "Knoesphere: Building Expert Systems
with Encyclopedic Knowledge", Proc. IJCAI-83,
pp.167-169.
Cook, M., W. Lehnert, & D. McDonald (1984) "Conveying
Implicit Context in Narrative Summaries", Proc. of
COLING-84, Stanford University, pp.5-7.
Davey (1974) Discourse Production, Ph.D. Dissertation,
Edinburgh University; published in 1979 by Edinburgh
University Press.
Derr, M. & K. McKeown (1984) "Using Focus to Generate
Complex and Simple Sentences", Proceedings of
COLING-84, pp.319-326.
Gabriel, R. (1984) Ph.D. thesis, Computer Science
Department, Stanford University.
Gabriel, R. (forthcoming) "Deliberate Writing" in Bolc (ed.).
Garrett, M. (1982) "Production of Speech: Observations from
Normal and Pathological Language Use", in Pathology
in Cognitive Functions, London, Academic Press.
Haase, K. (1984) "Another Representation Language Offer",
Ph.D. Thesis, MIT.
McDonald, D. (1984) "Description Directed Control: Its
implications for natural language generation",
International Journal of Computers and Mathematics, 9(1)
Spring 1984.
McDonald, D. & E. I. Conklin (in preparation) "At the
Interface of Planning and Realization" in Bolc and
McDonald (eds.) Natural Language Generation Systems,
Springer-Verlag.
McDonald, D. & J. Pustejovsky (1985a) "TAGs as a
Grammatical Formalism for Generation", Proceedings of
the 23rd Annual Meeting of the Association for
Computational Linguistics, University of Chicago.
McDonald, D. & J. Pustejovsky (1985b) "Description-Directed
Natural Language Generation", Proceedings of IJCAI-85,
W. Kaufmann Inc., Los Altos CA.
Mann W., Bates M., Grosz G., McDonald D., McKeown K.,
Swartout W., "Report of the Panel on Text
Generation", Proceedings of the Workshop on Applied
Computational Linguistics in Perspective, American
Journal of Computational Linguistics, 8(2), pgs 62-70.
