Subsequent Reference: Syntactic and Rhetorical Constraints 
David D. McDonald 
MIT Artificial Intelligence Laboratory 
Cambridge, Massachusetts 02139
Abstract 
Once an object is introduced into a discourse, the form of
subsequent references to it is strongly governed by
convention. This paper discusses how those conventions can
be represented for use by a generation facility. A multistage
representation is used, allowing decisions to be made when
and where the information is available. It is suggested that a
specification of the rhetorical structure of the intended message
should be included with the present syntactic one, and the
conventions eventually reformulated in terms of it.
Introduction 
Whenever a speaker wants to refer in text or speech to
some object, action, state, etc., she must find a phrase which will
both provide an adequate description and fit the context.
What governs her choice? One way to find out might be to
look at the selected phrase after the fact and try to develop a
static characterization of the relation between it and its
context. This is what most non-computational linguists do.
However, relations derived from finished texts are at best
incomplete. They will not tell us how the choice was made or
even guarantee that the relation(s) was apparent when the
choice had to be made.
To get a clear picture of what people know about making
references, we have to focus our attention on the process that
they go through. It must involve making decisions on the basis
of some contextual evidence. What is the evidence? How and 
when is it computed? How is it described? Is the decision of 
what phrase to use made all at once or as a gradual 
refinement? How is this process interleaved with the larger 
process of constructing the rest of the utterance? 
This report describes research done at the Artificial
Intelligence Laboratory of the Massachusetts Institute of
Technology. Support for the laboratory's artificial intelligence
research is provided in part by the Advanced Research
Projects Agency of the Department of Defense under Office of
Naval Research contract N00014-75-C-0643.
We can narrow the research problem by distinguishing two
kinds of references: initial and subsequent. This classification 
divides instances of reference by their position in a discourse. 
"Initial" references introduce new entities into the discourse, 
while "subsequent" references are another mention of one 
already introduced. 
An initial reference must be an encompassing enough
description of the new entity that the audience will be able to
recognize it. This requires matching goals with evidence from
a model of what the audience is likely to already know and
how likely they are to understand various choices of wording
(e.g. which of its properties should be emphasized? - why is it
being introduced?). This is not easy. People talking or writing
about unfamiliar things or to unfamiliar audiences are not
particularly good at it.
Subsequent references are another matter. They are very
highly grammaticized. While an initial reference may take
almost any form: noun phrases with unrestricted numbers of
adjectives and qualifying phrases, nominalized clauses, verb
phrases (for actions), etc., subsequent references must use
very specialized forms: personal and reflexive
pronouns; special determiners like "this" or "my"; class nouns
like "thing" or "one"; and so on. Here, grammatical convention
dictates most decisions and leaves only some details to free
choice.
My observations in this paper are based on experiences 
with a program for generating English texts from the 
goal-oriented, internally represented messages of other
programs. My program, and the state of the art in general, can
deal much better with the representation of a grammar than
with the representation of an audience model. Hence the
focus here on subsequent references. 
The next section looks at the course of the whole 
generation process as my program models it, and fits the 
sub-process of finding phrases for references within it. Then
the process of deciding whether or not to use a pronoun will
be examined in some detail. This will lead to the problem of 
accessing audience models and the idea that the relevant
information should be computed outside the linguistic
construction process per se. That idea is expanded to include
"rhetorical structures" like the relation "all of a set" that leads
to phrases like "...a square .... the other square". Finally, a
design for this rhetorical structure is sketched.
Internal representation 
Suppose we had a logically minded program that wanted to 
make the statement:
∀x man(x) → mortal(x)
People who have worked on language generation have almost 
universally factored out all of the program's knowledge of
language into a temporally and computationally distinct
component. Once the rest of the program has compiled a
description of what it wants to say - like the formula above -
it passes it off to its "linguistic generation component" and lets 
it come up with the actual text. 
But before moving on to that component, let us look closer
at this formula. I am presuming that the speaker's primary
(non-linguistic) representation, be it predicate logic, semantic
nets, or whatever, uses a totally unambiguous style of
representation - something equivalent to always referring to an
object, etc. by its unique name. For example, the three "x"'s
in the formula all denote the same object (albeit local). The two
predicates, the quantifier and the implication sign all denote
different objects.
We usually think of objects - noun phrases - as being the
only things that might be referred to more than once, but that
is not the case. Consider the formula mortal(Romeo) ∧
mortal(Juliet). That could be rendered in any of several ways
including: "Romeo is mortal and so is Juliet". Here the second
instance of mortal() was realized by a special, highly restricted
grammatical device - exactly the characteristics of a "subsequent
reference". From the point of view of the language generation
component, the important thing will be the repetition of some
name from the input formula, not, at first glance at least, the
kind of object that name denotes. (The set of descriptive
formulas supplied to the linguistics component is called the
program's "message". Subformulas or terms within a message
are called "elements" or "msg-elmts".)
The internal objects that appear in a speaker's
descriptions will have defining and incidental properties
associated with them which are accessible through their names.
This will include a property (actually a packet of properties
and procedures) which records what the program knows about
realizing the object as an English phrase. I refer to this
property as the object's "entry" - as in an entry in a
translating dictionary. An entry specifies the set of
possible English phrases that could be used for the object, and
includes a set of context sensitive tests that will indicate
which phrase to choose. Breaking down the speaker's "how to
say it" knowledge into such small chunks facilitates the use of
a general recursive process for turning messages into texts by
following the compositional structure of the formula(s) from
top to bottom.
Besides pointing to permanent properties, an object's name
will also be the repository of more or less temporary
annotations. In particular, when the generation component
realizes an instance of an object as a phrase, it can add an
annotation to it marking what kind of phrase was selected,
where in the text this occurred, what the immediately
dominating clause was at the time, and so on. The next time
there is an instance of that same object the annotation can be
found and used to help decide what kind of subsequent
reference should be made.
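
As an informal illustration of the two ideas just described, the following Python sketch pairs an entry (candidate phrases guarded by context-sensitive tests) with a name that accumulates annotations as it is realized. It is only a sketch under stated assumptions: the class names `Entry` and `Name`, the function `realize`, and the context keys `clause` and `quantified` are hypothetical and are not taken from the program described here.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Entry:
    """Hypothetical dictionary entry: ordered (test, phrase) choices."""
    choices: list[tuple[Callable[[dict], bool], str]]

    def choose(self, context: dict) -> str:
        # Return the first phrase whose context-sensitive test succeeds.
        for test, phrase in self.choices:
            if test(context):
                return phrase
        raise LookupError("no phrase fits this context")


@dataclass
class Name:
    """Unique name of an internal object, with its entry and annotations."""
    label: str
    entry: Entry
    annotations: list[dict] = field(default_factory=list)


def realize(name: Name, context: dict) -> str:
    phrase = name.entry.choose(context)
    # Temporary annotation: what was said, and in which clause.
    name.annotations.append({"phrase": phrase, "clause": context["clause"]})
    return phrase


# Assumed entry for an isolated variable (illustrative only).
x = Name("x", Entry([
    (lambda ctx: ctx.get("quantified", False), "any thing"),
    (lambda ctx: True, "that thing"),
]))
first = realize(x, {"clause": "clause1", "quantified": True})
second = realize(x, {"clause": "clause5"})
print(first, "/", second)   # any thing / that thing
```

After the two calls, `x.annotations` holds one record per instance, which is the kind of information a later subsequent-reference decision could consult.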
Before the linguistic processing is begun, is it possible to
examine the input formula and determine what subsequent
references it will educe? The bound variable x appears three
times, once with the quantifier and once with each predicate.
It would be a candidate for some subsequent references if, in
fact, the formula was rendered into English literally:
"For any thing, if that thing is a man, then it is mortal."
But other, more fluent, renderings of that formula will not give
the x's a separate status:
"Being a man implies being mortal"
"All men are mortal"
In short, it is not possible to predict which objects will be
explicitly referred to and which not just on the basis of a
formula in the internal representation language. You would
have to know (1) how the terms that dominate the object in
the formula are going to be rendered; and (2) whether the
object was mentioned earlier in the discourse and how it was
described there. Then you would still have to, in effect,
duplicate the reasoning process that the generation component
would go through itself.
As we will see later, the generation component will often
need "advice" as to whether or not the audience would
understand certain phrasings. The audience model which
makes these decisions will presumably prefer to work from
pre-calculated observations so as to avoid delay. The
implication of the fact that you cannot tell whether there will
be a subsequent reference to a particular object until it
actually happens is that you cannot make special preparations
for it. The audience model, or any other affected part of the
program, will have to be generally prepared for whatever
objects might be asked about. 
The possibility of three different renderings for the same 
formula implies that the formula per se does not contain 
enough specification to pick out just one of them. If you 
consider the three sentences for a moment, you will appreciate 
that what distinguishes them are differences in rhetorical 
emphasis and in how to interpret ∀x. These are things that
Frege deliberately omitted from the predicate calculus. To
direct the generation component so as to arrive at a particular
one of those sentences, more formulas would have to be added
to the message or else found in the larger context (e.g. the
formula might be part of a proof), and the entries for
quantifiers, implication, etc. would have to be augmented to
notice them.
Upgrading the predicate calculus enough to motivate the 
use of fluent English is a fascinating problem, but one which I
will gloss over in this paper. See McDonald [1978a] for more
details. For now, I will assume that the decisions made by the
various entries come out so as to give the literal version of
the formula with the explicit references just so that we can 
use it for an example. 
Syntactic Context 
Below is my program's representation of the situation just 
as it is about to choose a phrase for the third instance of x in 
the formula. The point of showing this constituent structure is 
to demonstrate that while the program has a great deal of data 
to bring to bear on the choice, it also has a great deal of data 
which is utterly irrelevant to it. The packaging of the data - 
the size of the search space - is at least as important as 
having the data available in the first place. 
clause1
  [intro]   pp2
              [prep]  for
              [obj]   np3
                        [det]   any
                        [head]  thing
  [clause]  clause4
              [intro]   "if" clause5
                          [subj]  np6
                                    [det]   that
                                    [head]  thing
                          [pred]  vp7
                                    [vg]        be
                                    [pred-nom]  np8
                                                  [det]   a
                                                  [head]  man
              [clause]  "then" clause9
                          [subj]  x
                          [pred]  mortal()
In the diagram, the names of grammatical categories:
clause, pp, etc., denote the syntactic nodes of an annotated
surface structure. Each node has a set of immediate
constituents, organized by a list of named constituent slots. A
slot can be empty, hold another node, hold a word or idiom, or
hold an element of the input formula which has yet to be
processed, e.g. x, or mortal(). The words at the leaves of the
tree are given in their root form. A morphology subroutine
specializes them for number, tense, etc. when they are spoken
(printed on the console).
The choice of what syntactic categories, descriptive 
features and constituent slots to maintain is tied up with the 
choice of actions associated with them by the linguistics 
component. The \[intro\] constituent, for example, will act to 
ensure that any introductory clause is realized as a participle.
There are many trade-offs involved in the design of this 
grammar, and I will again gloss over them for this paper. 
The choice of referring phrase for a subsequent reference
is determined largely by the syntactic relationship between
the current instance and the previous instance of the same
object. In a static, after the fact analysis, we would determine
this relationship by examining their positions in a tree like the 
one above. This is a simple enough operation for a person 
using her eyes, but it is an awkward mark and sweep style 
search for a computer program. 
My program uses a much more efficient, and I would say
more perspicuous approach based on recording potentially
relevant facts at the time they are first noticed by the
linguistics component. The wording of the heuristics that are
used for the decisions is similar to the wordings used in
static analysis. (They almost have to be, given that that is how
the bulk of linguistic research has been done to date.) But the 
data for the heuristics is acquired in a more natural manner. 
Before discussing the program's actual pronominalization
heuristics, I will first digress to describe the workings of the
generation process which collects (and creates) the data.
The tree above was developed
incrementally. Clause1 is the result of realizing the
conceptually topmost part of the input formula - the
quantification. Its argument - the implication - was then
positioned in the new syntactic structure but not yet realized
itself. This is what the constituent tree looked like at that
point.
clause1
  [intro]   pp2
              [prep]  for
              [obj]   x
  [clause]  man(x) → mortal(x)
All of the generation component's actual knowledge is
spread about many small, local routines: dictionary entries for
the objects that will appear in input formulas; "realization
strategies" - the construction routines that those entries
execute to implement their decisions; or "grammar routines" -
associated with the names of categories or constituents and in 
charge of effecting conventional details not involved in 
conveying meaning. These routines are all activated and
organized by a simple controller. 
The controller works by walking the constituent tree, top 
down through the syntactic nodes and from left to right at 
each level of constituents. The process begins with the top
node of the tree just after it is built by the entry for the
topmost element of the input formula. 
Outline of the Controller 
Examine-node
  (1) call the grammar routine for this category of node
  (2) rebind the node-recursive state variables
  (3) call Examine-constituents
Examine-constituents
  - For each constituent slot of the current node, in order, do:
    (1) call the grammar routine for that slot name
    (2) call Examine-slot-contents
Examine-slot-contents
  - Cases:
    contents = nil
      do nothing
    contents = <word>
      call the morphology subroutine with the word
      print the result
    contents = <node>
      call Examine-node
    contents = <msg-elmt>
      use the dictionary entry for the element to find
      a phrase for the element; replace the element with
      that phrase as the contents of the slot;
      loop through the cases again.
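
The outline above can be sketched in Python roughly as follows. This is a minimal sketch, not the program itself: the `Node` and `MsgElmt` classes, the `entries` table mapping element names to phrase-building functions, and the identity `morphology` placeholder are all assumptions introduced for illustration, and the grammar routines and state variables of steps (1) and (2) are omitted.

```python
from dataclasses import dataclass


@dataclass
class Node:
    category: str
    slots: list   # ordered [slot_name, contents] pairs; contents may be None


@dataclass
class MsgElmt:
    name: str     # unique name of an element of the input formula


def morphology(word):
    # Placeholder: a real routine would specialize number, tense, etc.
    return word


def examine_node(node, entries, out):
    # Grammar routines and state-variable rebinding are omitted here.
    for slot in node.slots:
        examine_slot_contents(slot, entries, out)


def examine_slot_contents(slot, entries, out):
    while True:
        contents = slot[1]
        if contents is None:                   # contents = nil
            return
        if isinstance(contents, str):          # contents = <word>
            out.append(morphology(contents))
            return
        if isinstance(contents, Node):         # contents = <node>
            examine_node(contents, entries, out)
            return
        # contents = <msg-elmt>: replace it by its phrase, then loop again.
        slot[1] = entries[contents.name](contents)


# Assumed entry for an isolated variable: it builds the np "any thing".
entries = {"x": lambda elmt: Node("np", [["det", "any"], ["head", "thing"]])}
pp = Node("pp", [["prep", "for"], ["obj", MsgElmt("x")]])
clause = Node("clause", [["intro", pp], ["clause", None]])

spoken = []
examine_node(clause, entries, spoken)
print(" ".join(spoken))   # for any thing
```

Because a message element is replaced in place before the cases are tried again, the walk never backs up, and words are emitted strictly left to right.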
So, having generated clause2, in effect by starting the
controller on the last case of Examine-slot-contents, the
controller will loop around. The contents will now be clause2;
the third case will be taken and the clause "entered". Its first
constituent contains another node; the controller recursively
re-enters Examine-node and enters the prepositional phrase.
Its first constituent contains the word "for", which is
immediately printed out with no changes from the morphology
subroutine; the second contains the first instance of x which is
processed with the dictionary entry common to "isolated
variables". The noun phrase it constructs replaces the x in the
constituent tree; the controller then loops through the cases
once more, recursively calling Examine-node on NP3. It is now
three invocations deep. The dotted line shows its path.
clause1
  [intro]   pp2
              [prep]  for
              [obj]   np3
                        [det]   any
                        [head]  thing
  [clause]  man(x) → mortal(x)
controller path (dotted in the figure): clause1 ... pp2 ... np3
spoken: "For any thing, "
After processing np3, the controller will leave the np and
the pp, go to the next constituent of clause1, use the dictionary
entry for implications, and so on.
The design of this generation component is oriented
around the decision making process of the dictionary entries
(see [McDonald 1978b] for more discussion). The principal
reason that the process is deterministic and indelible, for
example, is to simplify the conditions that the entries will have
to test for. A more relevant example here is the use of the
controller to "pre-calculate" certain relations about the context
and make them available through the values of recursive state
variables maintained by Examine-node. For example, the
controller keeps pointers to the "current-main-clause",
"current-verb-phrase", etc. It keeps track of whether it is in
a subordinate context, of what the last constituent was, the last
sentence, and so on.
Any of these relations could be calculated independently
by directly examining the form of the constituent tree and the
annotations on its nodes and embedded message elements. But
the point is more than just efficiency. By making certain
relations readily available and not others, one says that just
those relations are the important ones for making linguistic
decisions. A one of a kind operation like subject-verb
agreement will have a special predicate written for it that
"knows" where to find the relevant subject constituent in the
constituent tree. But relations that are often used, particularly
those needed for evaluating pronominalization, are maintained
by the controller, and, as a corollary, are only available in
their pre-computed form when the controller is present at that
point in the tree.
The design of the controller guarantees that the
generation process will have these properties: (1) It is done in
one pass - the controller never backs up. (2) Therefore
decisions, choices of phrasing, must be made correctly the first
time. (3) It is incremental. When the first part of the text is
being printed out, later parts will be in their internal form. (4)
Therefore very specific facts about the linguistic
characteristics of earlier parts of the text are available to
influence the decisions made about the later parts. (5) In
particular, when the time comes to render any particular
message element into English, the entire text up to that point
will have been generated and typed out to the audience. 
Heuristics for deciding to use a pronoun
Virtually any element in a message could be potentially
realized with a pronoun. Accordingly, the heuristics for
judging if a pronoun should be used are abstracted away from
the elements' individual dictionary entries into a common
subroutine. Call it "pronoun?". Pronoun? operates like a
predicate. Either it finds that a pronoun can be used and
returns it, or else it returns nil and the msg-elmt's entry is
consulted to construct a full phrase.
By the time the controller reaches the third instance of x
in the example, it will have already passed through and
processed the earlier two instances. Rather than look back
through the tree to find them, pronoun? will consult a stored
record that describes their situation. Below is a blowup of
part of the controller, showing more of what happens when a
message element is processed.
Examine-slot-contents
  ... earlier cases ...
  contents = <msg-elmt>
    either
      pronoun? (<msg-elmt>)
    or
      use its dictionary entry
    add <msg-elmt> to discourse-list;
    make its discourse record;
    replace <msg-elmt> with phrase;
    loop through cases again
The discourse-list contains the names of all of the internal
elements that have been mentioned so far in the discourse. If
this example message had been the start of the discourse, the
contents of discourse-list would be:
(man(), →, x, ∀)
The need for a subsequent reference is indicated by the name
of the message element already being on that list when the
controller reaches an instance of it in the constituent tree.
After the generated phrase is returned by whatever 
source, the context of the original msg-elmt and facts about
the new phrase are recorded as a special annotation kept with 
the name of the element. This discourse record is a vector of 
just those properties which, from the point of view of later 
routines such as the pronominalization heuristics, are sufficient 
to characterize that instance of the message element in the 
discourse. These are the vectors currently created for the 
first two instances of "x". How the different items are used is 
given later. 
instance3
  msg-index        1
  clause-index     c1
  clause-depth     1
  slot             [obj]
  became           np
  strategies-used  ( quantifier->determiner
                     det<-any head<-thing )
instance5
  msg-index        1
  clause-index     c2
  clause-depth     2
  slot             [subj]
  became           np
  strategies-used  ( det<-that head<-thing )
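
One way to picture the discourse-list and the per-instance record together is the following Python sketch. It is an illustration under assumptions, not the program's own data structure: the names `DiscourseRecord`, `note_mention`, and `records` are hypothetical, and the field values are illustrative ones patterned on the two vectors above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscourseRecord:
    """One instance of a message element, as later heuristics will see it."""
    msg_index: int
    clause_index: str
    clause_depth: int
    slot: str
    became: str
    strategies_used: tuple


discourse_list = []   # names of the elements mentioned so far
records = {}          # name -> its records, oldest first


def note_mention(name, record):
    """Store the record; report whether this was a subsequent reference."""
    subsequent = name in discourse_list
    if not subsequent:
        discourse_list.append(name)
    records.setdefault(name, []).append(record)
    return subsequent


first = note_mention("x", DiscourseRecord(
    1, "c1", 1, "obj", "np",
    ("quantifier->determiner", "det<-any", "head<-thing")))
second = note_mention("x", DiscourseRecord(
    1, "c2", 2, "subj", "np", ("det<-that", "head<-thing")))
last = records["x"][-1]   # what a third instance of x would consult
print(first, second, last.slot)   # False True subj
```

Keeping the record as a flat vector of a few properties means that later routines never need to search the constituent tree for the earlier instance.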
The heuristics governing the use of a pronoun are
evaluated in stages according to how much trouble the
program must go through in order to get the information it
needs.
First come the "quick checks": predicates that can be
evaluated just on the basis of the candidate msg-elmt and the
immediate, controller defined linguistic context. These include:
(a) is the msg-elmt on the discourse-list? (b) is it the token for
"me" or "you"? (c) has it been already marked for (or against)
pronominalization by an earlier grammar routine? (d) is it the
contents of a predicate constituent or a complement
constituent or any other constituent which is never given by a
pronoun?
If any of these checks decide that a pronoun can be used,
a common subroutine will make the actual choice. Otherwise,
the checks either rule out the possibility of a pronoun
altogether or they pass the msg-elmt through for a more
extensive deliberation.
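
The three possible outcomes of the quick checks can be sketched as a small Python function. This is a sketch under assumptions: the text lists the checks but does not say which outcome each one produces or in what order they run, so the mapping below (and the names `Verdict`, `quick_checks`, `marks`, and the slot names in `NEVER_PRONOMINAL`) is hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    USE_PRONOUN = "use a pronoun"
    NO_PRONOUN = "rule out a pronoun"
    DELIBERATE = "pass on for full deliberation"


# Assumed slot names for constituents that are never given by a pronoun.
NEVER_PRONOMINAL = {"pred", "pred-nom", "complement"}


def quick_checks(elmt, slot, discourse_list, marks):
    """marks maps an element to True (marked for) or False (marked against)."""
    if elmt in ("me", "you"):                  # check (b)
        return Verdict.USE_PRONOUN
    if elmt in marks:                          # check (c)
        return Verdict.USE_PRONOUN if marks[elmt] else Verdict.NO_PRONOUN
    if elmt not in discourse_list:             # check (a): not yet mentioned
        return Verdict.NO_PRONOUN
    if slot in NEVER_PRONOMINAL:               # check (d)
        return Verdict.NO_PRONOUN
    return Verdict.DELIBERATE


mentioned = ["x", "man()"]
v_speaker = quick_checks("me", "subj", mentioned, {})
v_new = quick_checks("Juliet", "subj", mentioned, {})
v_pred = quick_checks("man()", "pred", mentioned, {})
v_x = quick_checks("x", "subj", mentioned, {})
```

Only the last call, for an element that has been mentioned and sits in an ordinary slot, goes on to the more expensive deliberation.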
The full-scale deliberation first analyzes the relationship of
this instance of the msg-elmt and the last instance by
comparing the current context, as given by the status
variables in the controller, with the past context, as read off
the msg-elmt's entry in the discourse record. This yields a set
of derived, descriptive features which are the input to the
actual heuristics.
The derived features abstract out details which are
irrelevant to the heuristics. For example, the current set of
heuristics look for the last instance having been either a
proposition, or a "thing" (i.e. by looking at the became item in
its discourse record). Whether a "thing" was actually a noun
phrase, a nominalized clause, or a trace is all the same to the
heuristics. The initial analysis into features makes this test for
was-a-thing vs. was-a-proposition once and for all and makes it
unnecessary for the heuristics that refer to this distinction to
repeatedly include all of the particular cases. For that matter,
it is also unnecessary to rewrite the code for the heuristics
every time there is a new definition for a feature.
Other syntactic features currently computed include
measures of relative position like same-simplex, same-sentence,
or stale, and proceed-and-command, which are computed from
the several position indexes in the record. The record of what
constituent slot the last instance was in, in conjunction with
the clause indexes, is used to check for features such as
whether the last instance was the previous-subject. Also,
parallel positions within conjoined phrases are noted.
Once the list of features is computed, the heuristics are
run. At the moment, they are implemented as simple
conditionals. Here again, there can be an immediate yes or no
decision, or else a yet more involved process is invoked (see
below). The grammar forces an immediate decision when
proceed-and-command applies. Otherwise, a number of
heuristics will immediately cause a pronoun to be used if there
are no "distracting" references to other objects in that vicinity
of the discourse. For example, if the last instance of the
object was itself realized as a pronoun, this will cause an
immediate decision to use one again.
In the case of this example, the third instance of "x" will
be described as:
same-sentence, last-subject, was-a-thing
As there are no other similar references in the vicinity to
distract the audience, the heuristics will immediately decide
that a pronoun should be used. The subroutine for computing
the correct print name for pronouns is then consulted, and the
result, "it" is returned to be inserted in the constituent tree
and "spoken" on the next loop of the controller.
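
As a rough sketch of the two steps just described (deriving features from the last instance's record, then running simple conditionals over them), consider the Python below. It rests on assumptions that should be read as such: the functions `derive_features` and `decide`, the dictionary keys, the use of equal message indexes to stand in for "same sentence", and the single conditional in `decide` are hypothetical simplifications, not the program's actual heuristics.

```python
# Categories that count as a "thing" for the heuristics.
THING_CATEGORIES = {"np", "nominalized-clause", "trace"}


def derive_features(last, current):
    """Compare the last instance's record with the current context."""
    features = set()
    if last["became"] in THING_CATEGORIES:
        features.add("was-a-thing")
    else:
        features.add("was-a-proposition")
    # Assumption: equal message indexes approximate "same sentence".
    if last["msg-index"] == current["msg-index"]:
        features.add("same-sentence")
        if last["clause-index"] == current["clause-index"]:
            features.add("same-simplex")
    if last["slot"] == "subj":
        features.add("last-subject")
    return features


def decide(features, distractors):
    """Return a pronoun's print name, or None to fall back on the entry."""
    if distractors:
        return None   # a more involved comparison would be needed
    if "was-a-thing" in features and {"same-sentence", "last-subject"} <= features:
        return "it"
    return None


last_instance = {"msg-index": 1, "clause-index": "c2",
                 "slot": "subj", "became": "np"}
current_context = {"msg-index": 1, "clause-index": "c3"}

features = derive_features(last_instance, current_context)
pronoun = decide(features, distractors=[])
print(sorted(features), pronoun)
```

Because the conditionals see only feature names, a change in how a feature is computed does not require rewriting them.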
Reasoning about distracting references 
Except when instance and anaphor are in the same simplex
clause, syntactic relations alone are never enough to dictate
whether or not a message element should be pronominalized.
The linguistics component must be able to tell if there are
any other elements with which this one might possibly be
confused. The problem is, of course, that the "confusion" will
be a semantic or pragmatic one, i.e. it will be based on
cognitive facts about the message elements which the
linguistics component, per se, knows nothing about.
Given an oracle to tell it which message elements would
compete with the current one for the interpretation of a pronoun
in that position, the linguistics component can use a simple
procedure to decide whether to go ahead with the pronoun,
namely to run those other elements through the
pronominalization heuristics as well and see which accumulates
the best reasons for being pronominalized.
Consider this example sentence. Imagine that the
linguistics component has reached the point in brackets and
must make the choice whether to say "her" or "Candy's".
"Candy asked Carol to reschedule {her, Candy's} meeting for
earlier in the day"
Whether or not two objects will be ambiguous depends on
what the audience knows. In this case, an audience that knows
who both Candy and Carol are will know that Candy is a
graduate student who might well organize a meeting and that
Carol is a group secretary, someone who would probably make
the arrangements needed for changing a meeting's time. For
such an audience, it would be not at all confusing to say "her
meeting". An audience that didn't know who they were,
however, would at best be confused and would in fact probably
make the wrong choice.
This kind of information is much too specific to imagine
encoding as part of general purpose dictionary entries. But
because of the general unpredictability at the message level of
whether an object will have subsequent references made to it
in the eventual text, the linguistics component will have to
make its query to the main program "oracle" at the very last
minute as part of the pronominalization heuristics.
The oracle will presumably be some kind of audience
model. But for present purposes, we can think of it as a
function that takes the object we are interested in ("Candy")
as its argument and returns a list of those objects that
appeared in this and recent messages which the audience
might confuse with it. So, in this case, if the audience knew
Candy and Carol, then the oracle would return a null list, and
the pronominalization option would go through. If they didn't
know them, then it would return "( Carol )", and a further
round of heuristics would be tried.
To compare the relative "pronominalizability" of several
message elements, Pronoun? runs them separately through the
analysis and evaluation procedure. But instead of acting on
the evaluation directly, it makes a list of the names of the
individual heuristics that each passes and then compares the
two lists. In the current program these would be:
Candy
  same-sentence
  proceed-and-command
Carol
  same-simplex  ; via a trace
  proceed-and-command
  upstairs-subject
  no-intervening-distraction
In this case, the relative number of heuristics alone would
indicate that Carol would make a "better" interpretation for a
pronoun in that position, and that, therefore, the possibility of
using a pronoun for Candy should be rejected. But actually,
the different heuristics are given weightings. Same-simplex,
for example, is much better evidence than same-sentence.
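
The comparison can be illustrated with a short Python sketch. The numeric weights below are invented for illustration; the only constraint taken from the text is that same-simplex outweighs same-sentence. The function names `score` and `pronoun_ok`, and the two stand-in oracles, are likewise hypothetical.

```python
# Hypothetical weights; only "same-simplex > same-sentence" is from the text.
WEIGHTS = {
    "same-simplex": 3,
    "proceed-and-command": 2,
    "same-sentence": 1,
    "upstairs-subject": 1,
    "no-intervening-distraction": 1,
}


def score(passed):
    """Sum the weights of the heuristics an element passed."""
    return sum(WEIGHTS.get(name, 0) for name in passed)


def pronoun_ok(target, passed_by, oracle):
    """Pronominalize target only if it outscores every rival the oracle names."""
    rivals = oracle(target)
    return all(score(passed_by[target]) > score(passed_by[r]) for r in rivals)


passed_by = {
    "Candy": ["same-sentence", "proceed-and-command"],
    "Carol": ["same-simplex", "proceed-and-command",
              "upstairs-subject", "no-intervening-distraction"],
}


def knows_them(obj):
    return []            # audience knows Candy and Carol: no confusion


def strangers(obj):
    return ["Carol"]     # audience might take the pronoun to mean Carol


print(pronoun_ok("Candy", passed_by, knows_them))   # True
print(pronoun_ok("Candy", passed_by, strangers))    # False
```

With an informed audience the oracle names no rivals and "her meeting" goes through; otherwise Carol's heavier evidence blocks the pronoun and "Candy's" must be used.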
Non-pronominal subsequent references 
Every subsequent reference is first checked for the 
possibility of using a pronoun. If this check fails, a summary 
vector of the features analysed and of heuristics passed and
failed is passed along to the message element's dictionary 
entry. Entries may have their own idiosyncratic procedures 
for dealing with these situations, but they may also make use 
of general procedures packaged by the grammar. 
As explained in [McDonald 1978b], the "thinking" part of a
dictionary entry consists of a set of "filters", which, if their
conditions are met, will execute one or more "realization
strategies" which assemble the phrase or modifier that the
filter set decided upon. Because entries are not evaluated
directly but instead are interpreted, it is possible for the
interpreter to dynamically add or subtract filter sets according
to the grammatical (or rhetorical - see below) circumstances. 
One of the more common reasons for rejecting the use of a
pronoun is that it might be misinterpreted as referring to some
other object. The form of subsequent reference eventually
chosen in these cases must distinguish the object from the
one it is potentially ambiguous with, but does not have to
recapitulate any more detail.
In particular, one frequent pattern for an initial reference
is a noun phrase with the name of a class of objects as its
head word, with a series of adjectives, classifiers, or qualifying
phrases surrounding it. There is a simple formula for
constructing a non-pronominal, subsequent reference to follow
this kind of NP, namely to repeat the class name as the head
word and use either "that" or "the" as a determiner.
Part of an element's discourse record is a list of the
realization strategies that were used in the construction of
previous phrases. This is a technique for smoothing over the
irrelevant detail of the actual phrase that was used. As the
realization strategies are referred to by name, can be
annotated with properties describing what they do, and
entered into abstraction hierarchies, routines that have to
think about what other routines have done or might do can do
so at whatever level of generality is appropriate. In
particular, this is a way to describe patterns of noun phrase
construction so that general purpose filter sets can recognize
them.
The initial reference pattern above is recognized by a
filter set that the entry interpreter can add. The filter's
predicate checks for the name of the realization strategy
head<-classname being included as one of the "strategies-used"
of the anaphor. If it is found, this filter set will take
precedence over any others in the entry. The filter set's
action will assemble a new noun phrase with the same class
name as used for initial references (it is recorded with the
entry), and either "the" or "that" as the determiner depending on
a heuristic measure of the distance between this instance and
the last. This is the process operating in a sentence like:
"There is room for a block on a surface iff that surface is a
table or has a clear top."
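The predicate and action of this filter set can be sketched as follows. This is a minimal sketch, not the program's actual code: DiscourseRecord, classname_filter, and the threshold NEAR_DISTANCE are hypothetical names, the distance measure (a difference of positions) is a stand-in for the unspecified heuristic, and the choice of "that" for near references and "the" for distant ones is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical threshold: how far back still counts as "near".
NEAR_DISTANCE = 5


@dataclass
class DiscourseRecord:
    """What has been recorded about earlier references to one object."""
    class_name: str                                   # e.g. "surface"
    strategies_used: List[str] = field(default_factory=list)
    last_position: int = 0                            # position of last reference


def classname_filter(record: DiscourseRecord, position: int) -> Optional[str]:
    """Return a subsequent-reference noun phrase, or None if not applicable."""
    # Predicate: was an earlier reference built with head<-classname?
    if "head<-classname" not in record.strategies_used:
        return None
    # Action: repeat the class name; choose the determiner by distance
    # (assumed: "that" when near, "the" otherwise).
    distance = position - record.last_position
    determiner = "that" if distance <= NEAR_DISTANCE else "the"
    return f"{determiner} {record.class_name}"


surface = DiscourseRecord("surface", ["head<-classname"], last_position=8)
phrase = classname_filter(surface, position=11)
```

With these assumed positions, the close second mention is rendered with "that", as in the example sentence.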
Subsequent references to the same kind of object 
The controller makes only one pass through the constituent
tree, turning internal, message level structures into linguistic
structures as it passes. While the amount of information
available for material behind the controller is limited only by
how much annotation the designer cares to record, material in
front of the controller is only meagerly described. The
(potential) linguistic properties of an object embedded in the
constituent tree in front of the controller can be explored to a
limited extent by "querying" its dictionary entry. However,
this is limited as a practical matter because the intervening
text has not been finished and any filters in that entry which
depended on the discourse context will be undefined.
This means that if you want the realization of two
separated objects to be coordinated, the coordination has to
be planned for well in advance and somehow marked.
Otherwise the first object will be realized freely, since it
would not be able to "see" that there is even a second object
present. The phrases below are examples of where
coordination is required. (The first two are from the
tic-tac-toe talking program of [Davey 1974]. He used special
purpose routines to handcraft the pairs.)
"...my edge and yours..."
"...a corner ...the opposite one..."
"...will enclose X's in square brackets and Y's in angle brackets"
"...a big block and a little one"
In each of these cases, the two objects were both of the
same "sort": edges, corners, brackets, or blocks. By the usual
criteria, this would mean that they share dictionary entries,
and, indeed, the paired phrases have much in common, and
could be seen as only differing in the choice of strategy for
their adjectives and/or determiners. This means that the
coordinating mark must be something other than the "kind-of"
pointer that links objects with their entries. It will also
probably have to be a temporary structure, since "the
opposite corner" is a transient phenomenon, defined only at
particular moments in each game of tic-tac-toe.
The simplest way to mark the pairs is with an additional
formula in the input message, e.g.
(all-of-a-set corner1 corner9)
or (contrast-by-size B6 B3)
When the message is initially processed, formulas like these
are indexed by their arguments so that, e.g., the dictionary
entry for blocks will be able to notice them and choose its
strategies accordingly.
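Indexing the coordination formulas by their arguments might look like the sketch below. It is an illustration under assumptions: formulas are represented as tuples whose first element is the indicator name, index_by_argument is a hypothetical helper, and the object names cornerA and cornerB are placeholders (B6 and B3 are taken from the formula above).

```python
from typing import Dict, List, Tuple

# A formula is (indicator, arg1, arg2, ...); the tuple form is an assumption.
Formula = Tuple[str, ...]


def index_by_argument(formulas: List[Formula]) -> Dict[str, List[Formula]]:
    """Map each argument to the coordination formulas that mention it."""
    index: Dict[str, List[Formula]] = {}
    for formula in formulas:
        for arg in formula[1:]:
            index.setdefault(arg, []).append(formula)
    return index


message_formulas: List[Formula] = [
    ("all-of-a-set", "cornerA", "cornerB"),   # placeholder object names
    ("contrast-by-size", "B6", "B3"),
]
index = index_by_argument(message_formulas)

# When the entry for blocks comes to realize B3, it can look up
# index["B3"] and find the contrast-by-size formula that involves it.
```

The lookup costs nothing at realization time, which matters because the entry cannot otherwise see objects that lie ahead of the controller.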
Indicators like all-of-a-set are a part of the common
grammar, and operate in the same way that the earlier filter
set for subsequent references by classnames does. The
dictionary entry interpreter keeps track of the arguments to
the formula and when the last of them is being processed, it
"interrupts" and preempts the choice of determiner to ensure
that it is "the", indicating that the speaker intends for the
audience to appreciate that there is no other corner (or
whatever) left. (This is a simplification.)
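The bookkeeping for an all-of-a-set indicator can be sketched as a small tracker that forces the determiner only when the last member of the set is reached. This is a simplified sketch: SetTracker, determiner_for, and the default determiner "a" are hypothetical, as are the object names cornerA and cornerB.

```python
from typing import Sequence, Set


class SetTracker:
    """Tracks which arguments of one all-of-a-set formula remain unrealized."""

    def __init__(self, members: Sequence[str]) -> None:
        self.pending: Set[str] = set(members)

    def determiner_for(self, obj: str, default: str = "a") -> str:
        """Return the determiner for obj, preempting with "the" for the last member."""
        if obj not in self.pending:
            return default          # not part of the set, or already realized
        self.pending.remove(obj)
        # No members left after this one: signal that the set is exhausted.
        return "the" if not self.pending else default


tracker = SetTracker(["cornerA", "cornerB"])
first = tracker.determiner_for("cornerA")
second = tracker.determiner_for("cornerB")
```

Under these assumptions the first corner keeps the default determiner and the second receives "the", matching the pattern "a corner ... the opposite one".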
Rhetorical context 
Rhetoric is the art of persuasion [Aristotle]. Stylistic
variations in ordering, word choice, use of function words,
ellipsis, etc. are potentially rhetorical techniques, if the
speaker program (or rather its designer) knows when their use
would have a particular desired effect, i.e. when their use
would make the text more persuasive.
The rhetorical context will typically be just an additional
parameter to be noticed by the entries and grammatical
routines. The dimension that it adds, however, greatly
increases the fluency of the linguistic component's output. The
only problem is that rhetorical phenomena have not been
studied much at all - they have been swept under the rug of
"stylistic variations".
Goals about how to express the message's content can be
specified in the message. They will have their own dictionary
entries and end up determining part of the rhetorical context
that accompanies the syntactic context. (At this writing, the
details of the structure of the rhetorical context are still being
implemented. What follows is a sketch.) Consider:
All of the pronominalization heuristics mentioned earlier
were based on syntactic relations. However, there are other
relations governing the understanding and generation of texts,
which have to do with their "rhetorical" or "discourse"
structure. In particular, each region of text will have a focus -
loosely speaking the object or action that that text is "about"
(see [Sidner 1978] for an elaboration).
Pronominalization of subsequent references to the focused
object is almost always obligatory. (There can be exceptions if
the last several references to the object were pronominalized,
and the intention is to "refresh" the audience's memory.) In the
example with "Candy" and "Carol", if the previous part of the
discourse had been saying things about Candy, then she would
have been established as the focus of that sentence. Then the
presence of a current-focus heuristic in Candy's list of
successful heuristics would have outweighed all of the
syntactically based heuristics in Carol's list and the pronoun
would have been used.
The only question is how to mark and monitor focus or any 
other rhetorical indicator. It is not a natural or even 
consistently definable part of a syntactic constituent structure.
Therefore it will have to be "tacked on" somehow. The
technique I am experimenting with is to implement a focus
"register" which is explicitly set and reset by any dictionary
entries that affect focus. A new message could also affect the
focus register via an explicit directive included with it - say,
when the topic of conversation is being changed. An explicitly
dictated focus would cause the linguistics component to
"transform" the realization of the content parts of the message
to ensure that the new focus is properly marked as such by
the syntactic form of the text.
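A focus register of this kind, and its effect on the pronominalization decision, could be sketched as follows. The sketch rests on assumptions: FocusRegister, set_focus, and should_pronominalize are hypothetical names, the syntactically based heuristics are collapsed into a single boolean vote, and the "refresh" exception mentioned earlier is not modeled.

```python
from typing import Optional


class FocusRegister:
    """Holds the object that the current region of text is "about"."""

    def __init__(self) -> None:
        self.current: Optional[str] = None

    def set_focus(self, obj: str) -> None:
        """Called by a dictionary entry or by an explicit message directive."""
        self.current = obj

    def reset(self) -> None:
        self.current = None


def should_pronominalize(obj: str, focus: FocusRegister, syntactic_vote: bool) -> bool:
    """A current-focus hit outweighs the syntactically based heuristics."""
    if focus.current == obj:
        return True
    return syntactic_vote


register = FocusRegister()
register.set_focus("Candy")
candy_pronoun = should_pronominalize("Candy", register, syntactic_vote=False)
carol_pronoun = should_pronominalize("Carol", register, syntactic_vote=False)
```

Keeping the register outside the constituent structure reflects the point that focus has to be "tacked on" instead of being read off the syntax.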
The rhetorical context could be very domain specific.
Consider the sentence:
"The black queen can now take a pawn."
Notice that it is not necessary to say "a white pawn" because
of the immediate inference that one makes about what pieces
it is legal for a piece of a given color to "take".
Since the criteria for constructing a referring expression
for any chess piece will overlap, they will likely share a
dictionary entry. Thus we have a sort of subsequent
reference phenomenon. The entry for chess pieces will be
looking for the mention of a piece's color earlier in the text. If
it finds one, or rather if it finds one of the complementary
color, and if the situation is right, it can omit any mention of
color from the phrase it has assembled.
How to determine that the situation is "right" is a matter 
for the rhetorical context to specify. The problem is that the
color of a contrasting piece can be omitted only if the choice of
verb or some other device indicates that, in fact, a contrasting
context is present. But there are too many suitable verbs to
imagine listing them in the entry and explicitly looking for
them.
Instead, the rhetorical context will include a list of
"relations" that currently hold. What relations there should be
is a matter of the rhetorical roles that different parts of a
message might play and whether the recognition of these roles
by the audience could be facilitated by a choice of wording
(i.e. it is a matter of research and experiment). For a program
that talked about chess games, one of these relations would
be:
opposing-pieces 
piece1 = xxx 
piece2 = xxx 
relation-name = {attack, defend, pin, ...} 
To decide whether to include the name of a piece's color, the 
entry looks to see if there is an opposing-pieces relation
holding at the moment. If there is, it looks to see if its piece is
part of the relation and whether it is the second of the two to 
be mentioned. If so, it omits the color name. 
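The entry's test against the opposing-pieces relation can be sketched like this. It is only an illustration: the field names follow the relation shown above, but the Python class, the helpers include_color and describe, and the piece identifiers BQ and WP3 are hypothetical, and piece2 is assumed to be the piece mentioned second.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OpposingPieces:
    piece1: str          # assumed: the piece mentioned first
    piece2: str          # assumed: the piece mentioned second
    relation_name: str   # e.g. "attack", "defend", "pin"


def include_color(piece: str, relations: List[OpposingPieces]) -> bool:
    """False when the piece is the second member of a relation that holds."""
    for rel in relations:
        if rel.piece2 == piece:
            return False
    return True


def describe(piece: str, color: str, kind: str, determiner: str,
             relations: List[OpposingPieces]) -> str:
    """Assemble a noun phrase, omitting the color name when it can be inferred."""
    words = [determiner]
    if include_color(piece, relations):
        words.append(color)
    words.append(kind)
    return " ".join(words)


holding = [OpposingPieces("BQ", "WP3", "attack")]
queen = describe("BQ", "black", "queen", "the", holding)
pawn = describe("WP3", "white", "pawn", "a", holding)
```

The entry never inspects the verb; it only consults the relation that was recorded while the message was being compiled.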
The power of this representational technique is that it 
compiles its record of the needed facts at the time when they
are easily determined, i.e. as the message is being compiled, well
before the relation name has been rendered into English and
the simplicity of the relation obscured.
This technique should be applicable to many more 
phenomena than simply subsequent reference. Consider
sentences like these:
"Brian also wants to come to the meeting."
"Mitch has a class then and so does Beth."
"The meeting might run overtime, but I don't expect it."
The underlined words (also, so, and but) are not a part of the
"literal" content of those sentences. They represent rhetorical
relations between parts of the sentence or between the
sentence and earlier parts of the discourse.
If the source messages for those sentences described only
their literal content, it would be impossible to motivate the use
of also, so, or but in those ways, yet they are what give the
sentences their naturalness. But if those rhetorical relations
are included as part of the linguistic context, with their links to
specific phrases and dictionary entries, including these "little"
words becomes simple.
References
Aristotle, Rhetoric and Poetics, translated by Roberts, Modern
Library edition, Random House, New York, 1954.
Davey [1974] The Formalization of Discourse Production, Ph.D.
thesis, Edinburgh University.
McDonald [1978a] "How MUMBLE translated the barber
proof" manuscript being readied for publication, MIT A.I.
Lab, Cambridge, Mass.
McDonald [1978b] "A Simultaneously Procedural and
Declarative Representation and Its Use in Natural
Language Generation" in the proceedings of the 2d Annual
Meeting of the CSCSI/SCEIO, Toronto, Canada, July 19-21,
1978.
Sidner [1978] "The Use of Focus as a Tool for Disambiguation
of Definite Noun Phrases", this volume.
