On the Role of Old Information in Generating Readable Text:
A Psychological and Computational Definition 
Of "Old" and "New" Information in the NOSVO System 
Mark Vincent LaPolla 
Linguistics Dept, 
University of Texas, at Austin 
Austin, Tx 78712 
CS.LAPOLLA@R20.UTEXAS.EDU 
NOSVO is a Natural Language Generation postprocessor which is sensitive to old/new
information contrasts. We believe that generating old information first establishes
cohesion in text, promoting readability. This paper describes the NOSVO system in detail
and the motivations for building it. We also provide a psychological and computational
definition of "old" and "new" information.
There are situations where the speaker is constrained by a grammatical
rule, and there are situations where he chooses according to his meaning ... ;
but there are no situations in the system where "it makes no difference"
which way you go. This is just another way of saying that every contrast a
language permits to survive is relevant, some time or other. (Bolinger
1972:71).
1.0 Introduction 
There are at least two stages of text generation. One is 
generating the content of the text. The other is generating 
the language that represents and communicates the content 
(Thompson 1977). These two stages, though interrelated, 
have their own sets of interesting problems and principles.
The first stage, generating the semantic content of the text, 
involves motivating, planning and creating the conceptual 
and semantic content of a piece of text. Once the semantic 
representation for a text has been constructed the language 
of that text can be generated. The second stage, language 
generation, involves communicating the intent and content 
of the text without confusing or misleading the reader. This
paper will address the second stage only.
It is not enough to merely generate text. It is also
necessary to generate cohesive text. Cohesion alone is not
sufficient, however: a shopping list is cohesive, though not
"flowing" text by any means. A set of sentences that are
propositionally related is cohesive, though not necessarily
beautiful prose.
It is not enough to attend to ellipsis and
pronominalization to generate readable prose. We believe
that there are other factors which must be attended to in
order to generate such prose. The NOSVO system is an
attempt to take into account old/new information contrasts
(Chafe 1974, 1976), which we believe will help natural
language generation systems produce more readable text.
2.0 "Old" and "New"
It is important at this point to clarify the use of the terms
"old information" and "new information". The term "old
information" is a misnomer, though it expresses the
intuition needed for this paper. The term suggests "it is
what the listener is expected to know already" (Haviland
and Clark 1974). The term "new information" is also
incorrect. It suggests that what the speaker has uttered is
completely new to the hearer and is being introduced into
the hearer's consciousness for the very first time. But as
Chafe (1976) points out, a person uttering the sentence "I
saw your father yesterday" is "unlikely to assume that the
addressee had no previous knowledge of his father, even
though by the usual criteria 'your father' would be
considered new information" (Chafe 1976: 30). Chafe
suggests that the terms "already activated" and "newly
activated" would be more appropriate.
Certainly "already activated" and "newly activated"
(information) are by far better than the terms "given" and
"new". Even so, they are still somewhat imprecise.
"Already activated" and "newly activated" imply, as does
Chafe, that the concept that is activated, whether
extralinguistic or linguistic, is directly activated by a
(linguistic or extralinguistic) referent. As Prince (1978)
shows for clefts and as LaPolla (1986) shows for
inversion and prepositional phrase fronting, this need not be
the case. Rather the antecedent "simply has to be
appropriate to the situation, and hence coöperatively
'assumable' as being there" (Prince 1978:819). This is
quite important and expands upon Chafe, since for him the
important thing is that the antecedent must be in the
hearer's consciousness, i.e. in the hearer's focus of
attention, while for Prince and LaPolla it need only be
appropriate to the situation, or in some other way
coöperatively assumable, to be in the hearer's
consciousness.
Hajičová and Vrbová (1981) also take exception to
the terms "given (or old)" or "new" information and
suggest that "contextually bound" and "contextually
non-bound" lexical items would be more appropriate.
"Contextually bound" and "contextually non-bound" are
even more appropriate than "already activated" and "newly
activated" because they seem to also convey situational
appropriateness. However, it seems that Hajičová restricts
her terminology, as well as her theory of discourse (focus)
structure, to linguistic antecedents. That is, her "shared
stock of knowledge" appears to be closer to, if not
completely, linguistic in representation. Therefore, neither
her theory nor her terminology has the power to deal with an
antecedent that is merely inferable or appropriate to a
situation. We will use the familiar terms "new/old
information" but will define them a little more precisely
later in the paper.
3.0 The System
NOSVO (Not Only Subject Verb Object) is a
preprocessor that aids in the generation of English text. It
is not in and of itself a text generator, but does contain a
very simple predicate to English translator (similar in spirit,
though not in complexity, to Simmons 1984). NOSVO is
sensitive to the old/new (i.e. given/new) contrasts in a
discourse (Chafe 1974, 1976; Hajičová and Vrbová 1981)
and the syntactic structures that allow a writer, or speaker,
to begin a sentence with old information. NOSVO
organizes the semantic content of predicates to produce an
old information first ordering. (Appendix A contains a
short example of text with and without the application of
the old information first principle.)
The rest of this section describes NOSVO and its
motivation in detail. (See Appendix B for a high level data
flow diagram of the system.)
3.1 Motivation
One of the goals of communication is to modify and
extend the store of information in memory (Hajičová and
Vrbová 1981), and language facilitates communication. If
the above statement is accepted then one might characterize
a discourse as a session where particular slices of memory
are accessed and changed. Since this accumulated store of
information in memory is presumably very large and has a
complicated structure, it is useful if the communicator can
first identify the locations in memory that are to be
modified or added to before introducing the modifications
or additions (Hajičová and Vrbová 1981; LaPolla 1986).
Beginning a sentence with old information allows a
speaker to direct the hearer's focus of attention to exactly
those elements which he wishes to modify before he utters
the rest of the sentence, presumably new information
(Chafe 1974). If a speaker began a sentence with new
information, the listener would not know where to connect
it in the discourse. The listener would have to wait until the
old information is uttered to locate where the new
information is to be used (LaPolla 1986). The former
process takes less concentration by the listener (Green
1980).
If we model human memory, specifically lexical access,
as a spreading-activation network (Quillian 1962, 1967;
Collins and Loftus 1975) we can describe the effects of
uttering old information before new as priming, for at least
the class of linguistic antecedents. Old information primes
an already active node in memory, raising its level of
activation and making it more accessible than surrounding
nodes. The priming may then spread to related concepts.
This causes only the relevant and related portions of
memory to be available for modification or replacement,
and restricts the amount of memory brought into the
processing of discourse. (Our model can be extended to
cover nonlinguistic and inferable antecedents by using a
hybrid representation (Vilain 1985) [cf. Haviland and
Clark (1974) or Prince (1978) for a definition of inferable
antecedents].) If new information were uttered first then
the level of activation achieved would be equal or close to
that of previously primed concepts and the hearer's
attention would not be properly focused.
We can now define old (linguistic) information as
already activated memory. New (linguistic) information,
then, is either information not in memory or information
that has not yet been activated. Using Quillian's model, all
concepts that have been primed by the discourse, either
directly or by spreading-activation, are old. Everything else
is new.
Structures like inversion and PP-fronting, to name a
few, extend the syntactic and logical possibilities of
information presentation in a language. They allow
adjuncts, e.g. in PP-fronting, or objects, e.g. in passives,
or other arguments, e.g. in inversion, to be presented
before other syntactic constituents.
To complete the above definitions we will adopt and
expand Quillian's definition of "concept". For Quillian,
and Collins and Loftus, a concept corresponds "to
particular senses of words or phrases" (Collins and Loftus
1975: 408). We will expand this to include extralinguistic
phrases or groups of actions and objects. A concept then is
not only senses of words or phrases such as NPs,
"machine", VPs, "to machine", and phrases, "the particular
old car I own" (page 408), but also extralinguistic objects, a
picture of a fire engine, and actions, the picture of a fire
engine racing down the street. This also includes situations
such as eating in a restaurant and paying the check. In
sum, concepts are linguistic words and phrases as well as
more difficult things to pin down like situational scripts.
We have adopted a well proven model of human memory
to describe the effects of linguistic information on memory.
We have used this theory to model discourse processing as
a type of memory processing. We have also advanced the
idea that language in its role as a communications facilitator
allows a speaker to direct which concepts will be primed
and therefore what the hearer will focus his attention on.
We have defined old and new information within this
framework and have briefly described the contribution
inversion, PP-fronting and passivization make to the
presentation of information in language. In the rest of this
paper we will show how we have integrated these ideas
and theories into NOSVO.
3.2 System Overview
NOSVO takes as input a syntactico-semantic predicate
representation of each sentence in the text to be produced
(following Simmons 1984). (See Figure 1 for an example.
The asterisks indicate a backward pointing arc (Simmons
1984).) It determines which of the predicate's constituents
are "old" information. It then updates its lexical memory to
reflect the predicate's influence on the listener's own
memory. The lexicon is a semantic network following
Quillian. Currently it supports the relations ISA, HAS,
HAS-PART, LOC, IS-CALLED, EXAMPLE, SUPER-
and SUBCLASS.
NOSVO assumes that each of the underlying predicates
maps to a sentence. This assumption has allowed us to
focus on the old/new information distinction that is the core
of the NOSVO system without worrying about the
mapping from "deep" semantic structure to surface
structure. In other words, NOSVO is not a robust English
generator and has all of its intelligence devoted to the
manipulation of information. We feel that language
generation is a difficult research issue and is beyond the
scope of this paper and the current system.
(spend tns past ae ((father number sing)
                    agt* (accompany tns pres infl inf)
                    ae (me number sing))
   before* ((be tns past number sing)
            atr (tall) atr (enough)
            agt ((I number sing)
                 agt* ((ride tns pres infl inf)
                       on* ((coaster number sing)
                            atr (big))
                       ae (me number sing))))
   agt ((hour number pl) atr (pleasant) atr (many)))
Figure 1 
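The nested predicate notation of Figure 1 can be read into a tree with a few lines of code. The following is a minimal sketch in Python, not the reader NOSVO itself uses; the function name parse and the list-based tree are assumptions made for illustration, and arc labels such as agt* are kept as plain strings without interpreting them.

```python
from typing import List, Union

# A parsed predicate is either an atom (string) or a list of sub-expressions.
SExpr = Union[str, List["SExpr"]]

def parse(text: str) -> SExpr:
    """Read one parenthesized predicate into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read() -> SExpr:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items: List[SExpr] = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # consume the closing parenthesis
            return items
        return tok

    return read()

# A fragment of the Figure 1 predicate.
tree = parse("(ride tns pres on* ((coaster number sing) atr (big)))")
```

Here tree[0] is the head "ride", and the argument following the arc label on* is itself a nested list whose first element is the constituent headed by "coaster".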
3.2.1 NOSVO
First, NOSVO determines which parts of the input
predicate are "old information". It checks the nodes with
the five (5) highest levels of activation in its knowledge
base (KB), e.g. levels 6-10 where 0 is not activated and 10
is fully activated. If any of the heads of any of the
arguments from the input predicate is among the activated
nodes then that argument is marked as "old". (NB: We do
not address in this paper internal ordering of constituents
other than sentences. We only check the heads of
constituents directly below the sentence level.)
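The marking step just described can be sketched as follows. This is an illustration under stated assumptions, not NOSVO's code: the knowledge base is reduced to a dictionary from concept names to activation levels, arguments are (role, head) pairs, and the names mark_old and OLD_THRESHOLD are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: levels 6-10 count as activated, following the example in the text.
OLD_THRESHOLD = 6

def mark_old(arguments: List[Tuple[str, str]],
             activation: Dict[str, int]) -> List[Tuple[str, str, bool]]:
    """Tag each (role, head) argument as old when its head is activated."""
    return [(role, head, activation.get(head, 0) >= OLD_THRESHOLD)
            for role, head in arguments]

# Toy activation state and sentence-level arguments.
kb = {"coaster": 10, "park": 7, "father": 3}
args = [("agt", "father"), ("on", "coaster"), ("loc", "park")]
marked = mark_old(args, kb)
```

Only the head of each sentence-level constituent is looked up, which mirrors the restriction stated in the note above.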
After NOSVO has marked the old information in the
input predicate, it updates the status of its KB to reflect the
hypothesized change the generated sentence will have on
the listener. To do this, NOSVO first parses each input
predicate into its constituents. For each argument in the
input predicate NOSVO primes, or reprimes, a
corresponding (concept) node in its KBs. (Note: Though
we do not check each part of each constituent when
determining what is old, every part of every constituent
does affect the state of the lexical memory of NOSVO.)
When a node in the knowledge base is primed, it is
tagged with a level of activation. The initially primed
concept is tagged with the highest level of activation, call it
level 10. The initially primed concept is also tagged as the
initially primed node, i.e. the node primed by the discourse
and not by spreading-activation. Activation then spreads
outward, raising the level of activation of surrounding
nodes. As the spreading-activation gets further away from
the initially primed node its effect is reduced proportionally
to the distance traveled (following Collins and Loftus
1975). For example, at the initially primed node the level
of activation is 10, at the next node it is 9, at the next 8 and
so on. NOSVO also tags the surrounding nodes with the
name of the initially primed node. We realize that we do
not know by what exact proportion the activation effect is
diminished. Nor do we know how long activation lasts or
at what rate it deteriorates. These are questions for future
research.
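The priming procedure can be illustrated with a breadth-first traversal. The sketch below assumes a drop of one activation level per arc, as in the 10, 9, 8 example; the text leaves the true rate of decay open. The adjacency-list graph, the prime function and the (level, source) tags are illustrative names rather than NOSVO's own data structures.

```python
from collections import deque
from typing import Dict, List, Tuple

MAX_LEVEL = 10  # activation given to the initially primed node

def prime(graph: Dict[str, List[str]], start: str,
          state: Dict[str, Tuple[int, str]]) -> None:
    """Prime `start` at level 10 and spread outward, one level lower per arc.

    `state` maps each node to (activation level, initially primed node).
    A node keeps its current level when that level is already as high.
    """
    state[start] = (MAX_LEVEL, start)
    queue = deque([(start, MAX_LEVEL)])
    while queue:
        node, level = queue.popleft()
        for neighbour in graph.get(node, []):
            new_level = level - 1
            if new_level <= 0 or state.get(neighbour, (0, ""))[0] >= new_level:
                continue
            state[neighbour] = (new_level, start)
            queue.append((neighbour, new_level))

# Toy network: coaster - ride - park - ticket.
graph = {
    "coaster": ["ride"],
    "ride": ["coaster", "park"],
    "park": ["ride", "ticket"],
    "ticket": ["park"],
}
state: Dict[str, Tuple[int, str]] = {}
prime(graph, "coaster", state)
```

After the call, each reached node carries both its level and the name of the node that was primed directly by the discourse, which corresponds to the tagging described above.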
3.2.2 Generating Sentences
After NOSVO has marked the arguments in the
predicate, the result is passed to a simple English
generator. The role of the generator is complex but the way
it executes its role is simple. The generator looks at the
marked predicate and chooses the correct English syntax to
map the predicate to English. We realize that this task is
very complex and our treatment of it is superficial. We
realize that entire systems have been created to address the
problem, e.g. MUMBLE (McDonald 1980; see also
McDonald, Meteer and Pustejovsky 1987). We also realize
that we do not address pragmatic considerations in the
generation of discourse (Hovy 1987). However, recall that
our goal was not to create a robust language generator. Our
goal was to create a system that could recognize old
information in a phrase being generated. After this is done
it is the responsibility of the rest of the system to act on that
information.
We assumed from the beginning that the underlying
semantic representations NOSVO processes are already
organized into sentences. This was not to aid NOSVO in
its task, though it does by explicitly defining the
relationship between the verb and its arguments, but rather
to aid the simple predicate to English generator.
The English generator coupled with NOSVO takes
NOSVO's output and analyzes it. If a prepositional phrase
(PP) adjunct has been marked as "old" then the generator
fronts it, e.g. "In the park, John kissed Mary". If an object
has been marked as "old" the generator generates a
passive, e.g. "The apples were eaten by Vincent". If a PP
argument or adjunct of an intransitive is marked as "old"
the generator generates an inverted sentence, e.g. "Around
the bend lives John". If no old information is found other
than in the subject then a simple Subject Verb Object
sentence is generated, e.g. "Vinnie loves Mark". If a
predicate contains no old information, either explicitly or
implicitly, it is a non sequitur and should not be generated.
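The selection rules in this paragraph amount to a small decision procedure. The sketch below is one reading of them, not the Prolog grammar itself: the role names, the returned labels and the function choose_syntax are hypothetical, and checking the PP before the object is an assumption, since the paragraph does not say which rule wins when both are old.

```python
from typing import Optional, Set

def choose_syntax(old_roles: Set[str], transitive: bool) -> Optional[str]:
    """Pick a sentence pattern from the roles marked as old.

    old_roles may contain "subject", "object" and "pp".
    Returns None when nothing is old (a non sequitur in the text's terms).
    """
    if not old_roles:
        return None
    if "pp" in old_roles:
        # "In the park, John kissed Mary" vs. "Around the bend lives John"
        return "pp-fronting" if transitive else "inversion"
    if "object" in old_roles and transitive:
        return "passive"  # "The apples were eaten by Vincent"
    return "svo"          # "Vinnie loves Mark"
```

For instance, choose_syntax({"pp"}, transitive=False) selects the inverted pattern, while choose_syntax({"subject"}, transitive=True) falls through to plain Subject Verb Object order.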
At this stage in the generator's development only simple
syntaxes are generated. Extraposition out of clauses is not
addressed. This was not our intent. Also it was not our
intent to argue here that the presentation of old information
first is the sole discourse function of structures like
inversion, PP-fronting and passives but only one of their
discourse functions, perhaps the main one. It was our
intent to build a system that could determine which parts of
a sentence under generation were old information. It was
also our intent to clarify the terms old and new information
and to put their definitions in perspective both linguistically
and psychologically. These issues we have addressed.
4.0 The System In Detail 
In this section, we will present the NOSVO system in 
detail. 
4.1 Detailed Overview
NOSVO's grammar is segmented into two parts: i) plain
vanilla SVO rules, e.g. S -> NP verb PP, "A little angel
stood outside" (Green 1980, page 582), "Uncle Jack lives
around the bend" (LaPolla forthcoming), and ii) so-called
old information first syntax, e.g. S -> PP verb NP,
"Outside stood a little angel" (Green 1980, page 582),
"Around the bend lives Uncle Jack" (LaPolla 1986 and
1988). (Cf. Green 1980 and LaPolla 1986 for details.)
When NOSVO encounters old information in any
constituent in a predicate, except for the logical subject, it
uses the old information first syntax grammar to generate
the sentence. Otherwise it uses the plain vanilla grammar.
NOSVO does not do any extraposition from within
embedded clauses nor does it handle the differences
between internal arguments and adjuncts. It only produces
the three variations on standard, plain vanilla syntax
discussed above. These issues will be addressed in future
versions of NOSVO.
NOSVO has two kinds of knowledge bases from which
to work: linguistic and conceptual. There are two linguistic
knowledge bases: the lexicon and the discourse base. The
lexicon maps into either a domain specific or a non-domain
specific KB. Both the domain specific and non-domain
specific KBs are hierarchical networks which support the
relations ISA, HAS, HAS-PART, LOC, IS-CALLED,
EXAMPLE, SUPER- and SUBCLASS. The domain
specific and non-domain specific knowledge bases are the
Quillian style semantic networks. In NOSVO's current
avatar, there is a one to one mapping of lexical items to
concepts in the knowledge bases. That is, there is no
lexical or conceptual ambiguity. A more robust system
would allow for multiple mappings in both directions
because of the power and depth it gives. A more robust
mapping would produce two problems though. The first
would be the extra computation and heuristics required to
resolve the ambiguities. The second problem would be
determining when to prime a node. It might not be correct
to prime a node just because the lexicon accessed it. One
might have to wait for a completed parse before priming
the concept bases.
The discourse base is a tree. As NOSVO generates text,
it builds the discourse tree connecting old information to
new while retaining the autonomy of each predicate. The
discourse base contains the structure of the discourse and
is a way to record priming. The discourse base maps
directly into the KBs as well. One could assume that the
discourse is just a section of primed memory. However,
it was felt that a more linguistic representation would be
useful in helping to resolve anaphora. The discourse base
was modeled after the discourse mechanism in LaPolla
(1986).
NOSVO's first step in determining whether a predicate
or an argument is "old information" is to check whether or
not it has been introduced into the discourse, that is whether
or not it is definite (Heim 1982, 1983). If a referent has
already been introduced into the discourse then it is
necessarily old information. However, the converse is not
necessarily true. That is, just because a referent has not been
introduced into the discourse does not mean it is 'new
information' (von Stechow 1981). All it means is that the
referent has not been introduced. The information may
have been.
The nonlinguistic KBs contain metalinguistic knowledge
about articles, stories or other appropriate formats and the
expectations speakers have about them, knowledge about
the topic of the text and specific and general knowledge
about the lexicon.
NOSVO tries to establish a link from the sentence under
generation to one of the KBs. If a link can be found from
an argument or an adjunct in the input predicate to the
knowledge base, the link is recorded and the old
information first grammar is used to generate the argument
or adjunct first, if possible. (This is an oversimplification
and will be expanded upon.)
If NOSVO cannot establish a link to the knowledge
base, it searches the meta-knowledge base, i.e. the
knowledge base containing information about author
motivation for writing an article or story, the techniques
authors use to write articles or stories, and information
about articles and stories, their parts and subcategories.
The meta-knowledge base is primarily used to establish
bridges (Clark and Haviland 1974) discourse initially from
the old information in the predicate to an inferable
metalinguistic antecedent. For example, a college professor
may begin a lecture (or course) with the discourse initial
utterance "What we're going to look at today (this term)
is..." but not "*What one of my colleagues said this
morning was..." (Prince 1978, page 889) or "*What I told
my wife this morning was...". (The asterisk means
semantically unacceptable.) The first sentence is allowable
because the context, i.e. the classroom setting, allows a
direct inference to studying (for the term).
The meta-knowledge base has two parts: a taxonomy
similar to the Domain Specific KB and scripts that have
knowledge about objects and the actions that they perform,
e.g. writers write stories, writers set scenes, stories have
scenes.
The algorithm which NOSVO uses to determine which
old information first syntax is appropriate is
straightforward. The complicated part is how NOSVO
decides what is old information. Currently NOSVO
searches through all concepts with activation greater than
5. The value 5 is arbitrary; however, it is still an open
research question when an antecedent can be considered no
longer in the speaker's/listener's common ground (Chafe
1974, 1976), or no longer cooperatively assumable (Prince
1978).
4.2 The Components of the System
This subsection will outline in detail the various
components of NOSVO and their function. This subsection
is organized in parallel with the data flow diagram in
Appendix B, starting with NOSVO's first component
subsystem.
4.2.1 The Predicate Parser 
The Predicate Parser identifies and parses the input
sentential predicate into its component parts. This is the
first stage in identifying old information in a predicate.
4.2.2 Predicate Argument Translator
The Predicate Argument Translator translates the
linguistic representation of the input predicate constituents
into tokens, from the lexicon, which map into the
discourse base and the other KBs. Notice that the
representation of discourse referents and concepts need not
be the same, only that each referent or concept be indexed,
and indexable, by a token. The tokens are only used to
query the knowledge bases. When we speak of finding a
link between the input predicate and the knowledge base,
that link is established through the conceptual translation.
4.2.3 The Discourse Base Searcher
The Discourse Base Searcher searches the discourse to
determine whether any of the arguments in the input
predicate have been previously introduced into the
discourse. If an antecedent(s) is found the link is recorded
and the whole predicate, with highlighted old information,
is sent to the Linguistic Converter and Category Analyzer (LC-CA).
4.2.4 The Domain Specific KB Searcher
If no antecedents are found in the discourse base the
Domain Specific KB Searcher searches the domain specific
KB for a possible link.
First maximally primed nodes are investigated, i.e.,
nodes with priming 10. Then other less primed nodes are
investigated and so on down to priming level 5. Note that the
amount of search necessary increases as the priming
decreases. If a link is found to a node, that node is primed
and the input predicate is sent to the Linguistic Converter
and Category Analyzer with the old information
highlighted. If not, control and the input predicate are
passed to the Non-Domain Specific KB Searcher.
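The level-by-level search can be sketched as a loop from the highest priming level downward. This is an illustration only: the KB is flattened to a dictionary of activation levels, the function name find_link and its parameters are assumptions, and ties within a level are broken alphabetically simply to make the result deterministic.

```python
from typing import Dict, Iterable, Optional

def find_link(tokens: Iterable[str], activation: Dict[str, int],
              highest: int = 10, lowest: int = 5) -> Optional[str]:
    """Return the most strongly primed node that matches one of the tokens.

    Nodes at level 10 are examined first, then 9, and so on down to
    `lowest`; nodes below that level are never considered.
    """
    wanted = set(tokens)
    for level in range(highest, lowest - 1, -1):
        hits = sorted(node for node, value in activation.items()
                      if value == level and node in wanted)
        if hits:
            return hits[0]
    return None  # hand the predicate on to the next searcher

# Toy domain specific KB about ships.
domain_kb = {"ship": 10, "hull": 9, "gun": 6, "water": 3}
link = find_link(["gun", "hull"], domain_kb)
```

With this toy KB the search stops at level 9 and returns "hull" without ever inspecting the weaker match "gun", while a token such as "water" (level 3) yields no link at all.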
4.2.5 The Non-Domain Specific KB Searcher
The Non-Domain Specific KB Searcher searches for an
antecedent in the Non-Domain Specific KB. In NOSVO's
case non-domain specific knowledge is general and
prototypical knowledge. So, for example, if the domain
specific KB is Navy ships, then the non-domain specific
KB might contain information about ships in general,
water, vehicles, transportation, or guns and fighting in
general. The exact same mechanism is used to search the
Non-Domain Specific KB as the Domain Specific KB. If
no antecedent is found in this knowledge base the predicate
is passed to the Bridge Building Inference Engine.
4.2.6 The Bridge Building Inference Engine
If all the other processes have failed to find a link from
predicate to common ground, i.e. the context, both
linguistic and nonlinguistic, of a discourse, NOSVO tries
to build a bridge, an inference, which connects information
in the predicate to a metalinguistically inferable antecedent.
This component of NOSVO is not very robust. NOSVO
will eventually be reimplemented using a Vilain (1985)
type hybrid approach. Then the Bridge Building Inference
Engine will be expanded.
At this point, the careful reader may think that, given the
nature of NOSVO's search mechanisms, it must always
succeed in establishing a connection from input to
knowledge base. That is the case. Indeed it must be the
case. Consider that everything that people say to each other
must in some way link to the common ground in order to
be understood. Otherwise the utterance would be a
non sequitur. Even the self-introductions performed by two
people who do not know each other, and have just met, are
expected and reasonable. The formula has a metalinguistic
antecedent in culture.
The question for NOSVO is not whether an antecedent
exists but rather what it is. If NOSVO cannot find an
antecedent it assumes that one exists and generates a
sentence with plain vanilla SVO structure, leaving it to the
reader to establish the connection. If NOSVO did not
assume an antecedent it would have to discard the sentence
as a non sequitur and potentially confusing and/or
misleading. This is an important issue and will be dealt
with in a later and expanded version of NOSVO.
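Taken together, the searchers of Sections 4.2.3 to 4.2.6 form a fallback chain. The sketch below shows only that control flow, with each searcher reduced to a hypothetical membership test; the names locate_antecedent, lookup_in and the stage labels are illustrative and the stores are toy data.

```python
from typing import Callable, List, Optional, Set, Tuple

Searcher = Callable[[List[str]], Optional[str]]

def locate_antecedent(tokens: List[str],
                      searchers: List[Tuple[str, Searcher]]
                      ) -> Tuple[str, Optional[str]]:
    """Try each searcher in order and report which stage found a link.

    When every stage fails, an antecedent is assumed to exist and
    ("assumed", None) is returned, signalling plain SVO output.
    """
    for name, search in searchers:
        found = search(tokens)
        if found is not None:
            return name, found
    return "assumed", None

def lookup_in(store: Set[str]) -> Searcher:
    """Build a toy searcher that returns the first token present in `store`."""
    return lambda tokens: next((t for t in tokens if t in store), None)

pipeline = [
    ("discourse base", lookup_in({"coaster"})),
    ("domain specific KB", lookup_in({"park"})),
    ("non-domain specific KB", lookup_in({"vehicle"})),
    ("bridge building", lookup_in({"lecture"})),
]
result = locate_antecedent(["park", "vehicle"], pipeline)
```

The earliest stage that succeeds wins, so "park" is reported from the domain specific KB even though "vehicle" would also match later; a predicate with no match anywhere falls through to the assumed-antecedent case described above.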
4.2.7 The LC-CA 
The Linguistic Converter and Category Analyzer
analyzes the old information to determine its
syntactico-semantic category. It checks whether it is a
prepositional phrase, agent, theme or instrument, in that
order. It then decides if the old information is an internal or
external argument or a prepositional phrase adjunct. With
this information it picks the type of grammar that will
place the particular argument or adjunct first and sends the
choice along with the predicate to the English generator.
4.2.8 The English generator 
The English generator is a Prolog grammar segmented
into the various old information first syntaxes, e.g.
prepositional phrase first, object first rules, and a plain
vanilla syntax. At this point all, or most, of the intelligent
work has been done and the generator is nothing more than
a syntactic manipulator under the direction of the Linguistic
Converter and Category Analyzer.
5.0 Future Research and Directions 
Future research and development topics for NOSVO 
include: 
1. determining when information cannot be
assumed to be in the listener's common
ground, i.e., at what level of priming is a
concept not in the listener's common ground?;
2. expanding NOSVO's capability to handle
ellipsis, definiteness, and pronominalization
and investigating how the generation of ellipsis
and definiteness affects the generation of old
information first;
3. extending NOSVO to do more of the
linguistic generation, either from a more
"conceptual" representation or by taking as input
another source language such as another
natural language or a computer program and
generating English from that underlying
representation, i.e. expanding NOSVO's
backend;
4. extending NOSVO's capabilities to handle
the subtle distinction between arguments and
adjuncts;
5. determining how much the nonapplication
or misapplication of the old information first
principle, discussed above, makes a difference
in reading and understanding text;
6. finally, investigating other old information
first syntactic structures and phenomena to
determine how they affect a discourse and how
they might be integrated into NOSVO.
The next generation of NOSVO will be written in CLOS
and Lisp. The application will be "generating descriptions
of Lisp programs". CLOS objects will be used to organize
the knowledge structures and CLOS methods will be used
to do the actual parsing. Eventually NOSVO will be
expanded and refined along the directions stated above.
Acknowledgements 
I would like to thank Kent Bimson, Mirjam Fried,
Randy LaPolla, Marie Meteer and Varda Shaked for all
their help and criticism on this abstract. Any mistakes are
my own.
APPENDIX A 
In this Appendix we have given two examples of text 
that NOSVO can generate. The text was based upon 
naturally occurring text (Lawrence 1985). The old 
information first principle has been applied to the first text. 
It has not been applied to the second text. We believe that 
the second text is stilted, less cohesive and harder to read,
though this has yet to be proven experimentally. We also
believe that the misapplication of the old information first
principle would be worse than its nonapplication. These
are topics left for future research.
TEXT I
Long before I was tall enough to ride on the big coaster 
myself, I spent many pleasant hours persuading my 
reluctant father to accompany me. 
As an aficionado of amusement parks I was overjoyed 
when our whole family finally flew to California to tackle 
Walt Disney's extravaganza. 
More than two decades later, I'm still journeying to
parks. (page 4) 
TEXT II 
I spent many pleasant hours persuading my reluctant 
father to accompany me long before I was tall enough to 
ride on the big coaster myself. I was overjoyed, as an 
aficionado of amusement parks, when our whole family
finally flew to California to tackle Walt Disney's 
extravaganza. 
I'm still journeying to parks more than two decades later. 
APPENDIX B 
[Data flow diagram: Discourse Base Searcher, Domain Specific
Knowledge Base Searcher, Non-Domain Specific Knowledge Base
Searcher and Bridge Building Inference Engine in sequence, each
followed by a "Found?" test; the result passes to the Linguistic
Converter and Category Analyzer and then to the English Generator.]

References 

Bolinger, D. That's that. The Hague: Mouton, 1972. 

Chafe, Wallace. "Language and Consciousness". Language,
1974, 50(1), 111-133.

Chafe, Wallace. "Givenness, Contrastiveness,
Definiteness, Subjects, Topics and Point of View". In
Charles N. Li (ed.), Subject and Topic.
New York: Academic Press, 1976.

Collins, Allan M. and Elizabeth F. Loftus (1975) "A
Spreading-Activation Theory of Semantic Processing".
Psychological Review, 82, 407-428.

Davison, Alice. "Peculiar Passives". Language, March 
1980, 56(1), 42-66. 

Green, Georgia M. "Some Wherefores of English 
Inversions". Language, 1980, 56(3), 582-602. 

Hajičová, Eva and Jarka Vrbová. "On the Salience of the
Elements of the Stock of Shared Knowledge". Folia Linguistica,
1981, 15, 291-303.

Haviland, Susan E. and Herbert Clark. "What's New?
Acquiring New Information as a Process in 
Comprehension". Journal of Verbal Learning and Verbal 
Behavior, 1974, 13, 512-538. 

Heim, Irene R. The Semantics of Definite and Indefinite Noun Phrases, 
Doctoral dissertation, University of 
Massachusetts at Amherst, September 1982. 

Heim, Irene R. "File Change Semantics and the
Familiarity Theory of Definiteness". In Rainer Bäuerle,
Christoph Schwarze and Arnim von Stechow (eds.)
Meaning, Use, and Interpretation of Language. Walter
de Gruyter, Berlin, 1983, 164-189.

Hovy, E. H. Generating Natural Language Under 
Pragmatic Constraints. Unpublished Yale Dissertation, 
YALEU/CSD/RR #521, 1987. 

LaPolla, Mark Vincent. "The Role of Inversion, Clefting 
and PP-Fronting in Relating Discourse Elements: Some 
Implications for Cognitive and Computational Models of 
Natural Language Processing". In Proceedings from the 
XI International Conference on Computational 
Linguistics, 1986, 168-173. 

McDonald, David D. Natural Language Production as a
Process of Decision-making under Constraints,
unpublished Ph.D. Dissertation, MIT, Artificial
Intelligence Laboratory, 1980.

McDonald, David D., Marie W. Meteer (Vaughan) and
James D. Pustejovsky. "Factors Contributing to
Efficiency in Natural Language Generation". In G.
Kempen (ed.) Natural Language Generation: Recent
Advances in Artificial Intelligence, Psychology and Linguistics,
Kluwer Academic Publishers,
Boston/Dordrecht 1987.

Prince, Ellen F. "A Comparison of Wh-Clefts and It-Clefts
in Discourse". Language, 1978, 54(4), 883-905.

Quillian, M. R. (1962) "A Revised Design for an
Understanding Machine", Mechanical Translation, 7, 17-29.

Quillian, M. R. (1967) "The Teachable Language
Comprehender: A simulation program and theory of
language", Communications of the ACM, 12, 459-476.

Simmons, Robert F. Computations from the English.
Prentice-Hall, 1984.

von Stechow, Arnim. "Topic, Focus and Local
Relevance". In W. Klein and W. Levelt (eds.), Crossing the Boundaries in Linguistics, 1981, 95-130.

Thompson, H. "Strategy and tactics: a model for language 
production". In Papers from the 13th Regional Meeting, 
Chicago Linguistics Society, 1977. 

Vilain, M. "The Restricted Language Architecture of a
Hybrid System". In the Proceedings of the Ninth
International Joint Conference on Artificial Intelligence,
L.A., 1985, 547-551.
