American Journal of Computational Linguistics 
Microfiche 82 
A MODEL FOR KNOWLEDGE AND ITS APPLICATION TO 
DISCOURSE ANALYSIS 
BRIAN PHILLIPS 
University of Illinois at Chicago Circle 
60680 
Copyright © 1979
Association for Computational Linguistics 
TABLE OF CONTENTS 
Introduction
A model for knowledge
Nodes
Arcs
Paradigmatic relations
Variety
Instance
Typical
Manifestation
Syntagmatic relations
Discursive relations
The metalingual relation
Status relations
Negation
Inheritance
Episodic and systemic memory
Quantification
Processes in the network
Path-tracing
Pattern-matching
Discourse analysis
The structure of coherent discourse
Anaphora
Spatial, temporal, and causal cohesion
Thematicity
The role of the encyclopedia
Anaphora
Spatial, temporal, and causal cohesion
Thematicity
Implementation
Processes
Paradigmatic path tracing
Causal connectivity condition
Discovering general and specific propositions
Metalingual decomposition
Metalingual abstraction
Inference of omitted discursive relations
The system
Analysis of some stories
Common patterns
A boat capsizing
A fall into water
Embedded themes
INTRODUCTION
An important contribution of natural language processing has been
to direct attention to the structure of language at the discourse
level, which has led to a greater awareness of the role of "meaning" in
language. For "A text is best regarded as a SEMANTIC unit; a unit not of
form but of meaning" (Halliday & Hasan, 1976, p. 2). This being so,
discourse analysis will deepen our understanding of meaning and
vice-versa.
In this paper I present a model of meaning strongly influenced by
Hays (1969a, 1969b, 1970, 1973) and show how it is able to capture the
organization of discourse. In particular I seek to define the
organization of coherent discourse and to show how knowledge is used to
infer a coherent structure when, as usually is the case, the surface
form is elliptic. The hypotheses are used to build an automatic system
to test the coherence of discourse.
A MODEL FOR KNOWLEDGE
The philosophic stance is taken that our knowledge of a concept is
the meaning of that concept: "Someone who knows what 'tiger' means
. . . is required to know that stereotypical tigers are striped"
(Putnam, 1975, p. 249).
Many models of knowledge have been developed for use in
computational environments. Some are for restricted domains (Black, 1968;
Bobrow, 1968; Colby, 1973; Raphael, 1968; Winograd, 1971; etc.). The
present model is more in the tradition of Klein, Oakley, Suurballe, and
Ziesemer (1972), Quillian (1969), Rumelhart, Lindsay, and Norman
(1972), Schank (1975a), Shapiro (1971), Simmons (1970), and Wilks
(1972), where no particular context is prescribed. It will be apparent
that at many points the present model draws upon these earlier systems.
Some of the differences between systems are probably differences in
notation. However, no system is at a stage of constancy or completeness
that makes it worthwhile to devote much effort to establishing the
equivalences. Although it would be possible to present only the parts
that I believe to be novel, giving the whole system in a common
notation will ease the task of the reader.
The model, hereafter called the encyclopedia, endeavors to be
consistent with available psychological and linguistic views of the
structure of language and thought, for any automated language system
must closely imitate the workings of human cognition to be successful
(Collins & Quillian, 1972).
The encyclopedia encodes common knowledge of the world which may
differ from scientifically accurate descriptions. Putnam (1975) calls
such knowledge "stereotypical":
The fact that a feature . . . is included in the stereotype
associated with a word X does not mean that it is an analytic truth
that all Xs have that feature, nor that most Xs have that feature,
nor that all normal Xs have that feature, nor that some Xs have
that feature. . . . Discovering that our stereotype is based on
nonnormal or unrepresentative members of a natural kind is not
discovering a logical contradiction. . . . [but] The fact is that
we could hardly communicate if most of our stereotypes weren't
pretty accurate as far as they go. (pp. 250-251)
The encyclopedia is schematized and implemented as a directed
graph; in current parlance it is a network model. Nodes characterize
concepts and arcs relations between concepts. The most general
statement to make about the model is that relations and concept types are
the necessary system primitives; some concepts may be primitive, but
the model does not depend on the existence of primitive concepts.
Discussion will cover the nodes and relations of the model.
Attention will also be given to network processes.
No psychological validity is claimed for the content of any of the
structures shown; the claim extends only to the relational structure.
Questions of content must be answered empirically.
Nodes 
There are four types of nodes: event, entity, attribute, and
modality. The first three correspond to simple verb, simple noun, and
simple modifier, respectively. The fourth type of node is novel. Its
role in the system will become clear after a description of arcs. For
the meantime it will have to suffice to say that it is used in the
spatio-temporal, causal, belief, and hierarchic organization of
knowledge. Its ancestor in linguistic theory is "modal" in the
modal/proposition dichotomy of Fillmore (1969). Schubert (1976) has
predicate nodes that are similar in motivation, but different in use
from modality nodes.
Nodes of the encyclopedia are not labeled (Collins & Quillian, 
1972). An arc, termed name, points from a node into a dictionary of
print names. For clarity nodes in diagrams will be annotated, but this 
should not be taken as representing the implementation, which is as 
shown in Figure 1. 
[Diagram not reproduced: NAME arcs point from nodes of the ENCYCLOPEDIA
into the DICTIONARY entries "rock", "person", "Peter", and "Aunt Sally".]
Figure 1 Labeling nodes
In all the following figures is an event, entity or attribute
node. Annotations on these nodes are enclosed in 1, <>, and [ ],
respectively. Modality nodes appear as o and are never annotated.
Arcs 
Five types of arcs are used in the network: paradigmatic arcs are
taxonomic, syntagmatic arcs form propositions, discursive arcs link
propositions, the metalingual arc is used to associate a concept with a
story in network form that defines the concept, and status arcs
characterize beliefs and desires.
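The node and arc inventory described so far can be sketched as a small data structure. This is a minimal illustration, not the implementation reported in the paper: the class name `Network`, its methods, and the choice to store arcs as (source, label, target) triples are all assumptions made for the example. Following the text, nodes carry no labels of their own; a print name is recorded separately, standing in for the name arc into the dictionary.

```python
from dataclasses import dataclass, field

# The four node types and five arc classes named in the text.
NODE_TYPES = {"event", "entity", "attribute", "modality"}
ARC_CLASSES = {"paradigmatic", "syntagmatic", "discursive",
               "metalingual", "status"}


@dataclass
class Network:
    """Hypothetical container for an encyclopedia-style directed graph."""
    node_types: dict = field(default_factory=dict)  # node id -> node type
    arcs: list = field(default_factory=list)        # (source, label, target)
    names: dict = field(default_factory=dict)       # node id -> print name

    def add_node(self, node_type, name=None):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        node_id = len(self.node_types)
        self.node_types[node_id] = node_type
        if name is not None:
            # Stands in for the name arc into the dictionary of print names.
            self.names[node_id] = name
        return node_id

    def add_arc(self, source, label, target):
        self.arcs.append((source, label, target))

    def targets(self, source, label):
        """All nodes reached from `source` along arcs carrying `label`."""
        return [t for s, l, t in self.arcs if s == source and l == label]


net = Network()
person = net.add_node("entity", "person")
peter = net.add_node("entity", "Peter")
net.add_arc(person, "IST", peter)  # arc drawn from category to member (assumed direction)
```

Keeping names outside the node record mirrors the statement that nodes are unlabeled, and allows unnamed nodes without any special case.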
Paradigmatic Relations 
Variety. A readily observable aspect of human behavior is the
existence of folk taxonomies. These have been studied in detail by
ethnosemanticists in order to discover their cognitive significance and
structure:
Man is by nature a classifying animal. His continued existence
depends on his ability to recognize similarities and differences
between objects and events in his physical universe and to make
known these similarities and differences linguistically. Indeed,
the very development of the human mind seems to have been closely
related to the perception of discontinuities in nature. In view
of this, the study of folk taxonomic systems, which have received
a great deal of interest in recent years, has a high significance
in interpreting the logical processes going on in our minds, as
well as in understanding the application and utility of the
taxonomic systems themselves. (Raven, Berlin, & Breedlove, 1971,
p. 1210)
For example, mammal, bird, and reptile might be classified as kinds of
vertebrates. In the network, the relation is termed variety
(abbreviated to VAR in the figures); Figure 2 diagrams this knowledge.
[Diagram not reproduced. Legible annotations include (sleepwalk),
(talk in sleep), a CAUSE arc, "hairy William Proxmire", and "William
Proxmire makes foreign policy statements".]
Figure 2 Paradigmatic organization
Varietal nodes are seen as representing concepts at a categorical
level, hence variety is the category-subcategory relation. Berlin,
Breedlove, and Raven (1968) show the existence of covert categories in
folk taxonomies, i.e., nodes having scientific, but not folk, names,
say "vertebrate" in Figure 2. These categories are revealed by memory,
classification, and other experiments. This is counter to the view of
Conklin (1962) for whom concepts must have monolexemic labels. Covert
categories enable Raven et al. (1971) to show a degree of uniformity in
taxonomies: about five hierarchic levels with seldom more than five
hundred items under one node. Berlin et al. (1968) claim that items in
a folk taxonomy form non-intersecting categories, i.e., the structure
is strictly tree-like. This view is not held here, for a typewriter
can be classified both as a machine and as a writing instrument.
Consequently, varietal structures are not restricted to being
tree-like. Loops, however, do not seem possible. Nor is it necessary that
a node have a name.
Instance. Logic since Aristotle has distinguished between
category (or type) and a specific member of a category (token). This
membership relation is termed instance (IST). For example, "William
Proxmire" is an instance of "person", Figure 2. Most instances are not
named, taking their name from their varietal parent, but a major
exception is people, e.g., "Peter", "Aunt Sally", Figure 1.
Any path through the paradigmatic organization of knowledge which
follows only arcs having the same directionality (termed a paradigmatic
path) contains at most one instance arc. Traversing this arc
represents a cognitive transition from thinking about categorical
concepts to thinking about particular concepts, e.g., from thinking
about man to thinking about Abraham Lincoln, or from blueness to the
blue of your car.
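The constraint on paradigmatic paths can be stated as a simple check. The sketch below is illustrative only: the function name `is_paradigmatic_path` is hypothetical, and a path is assumed to be given as the list of arc labels met while following arcs in one direction, using the abbreviations VAR, IST, TYP, and MAN that the paper gives for the four paradigmatic relations.

```python
# Abbreviations of the four paradigmatic relations used in the figures.
PARADIGMATIC = {"VAR", "IST", "TYP", "MAN"}


def is_paradigmatic_path(labels):
    """Check a sequence of arc labels followed in a single direction.

    The path must use only paradigmatic arcs and may contain at most
    one instance (IST) arc.
    """
    labels = list(labels)
    if any(label not in PARADIGMATIC for label in labels):
        return False
    return labels.count("IST") <= 1


categorical_to_particular = is_paradigmatic_path(["VAR", "VAR", "IST", "MAN"])
two_instances = is_paradigmatic_path(["VAR", "IST", "IST"])
```

The single permitted IST arc marks the one point on the path where categorical concepts give way to particular ones.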
Rumelhart et al. (1972) use an ISA relation that covers both
variety and instance, e.g., ISA(Luigi's, tavern) and
ISA(tavern, establishment). The present feeling is that a distinction
does exist; hence the two relations of the encyclopedia.
Typical. A third condition of knowledge needs to be represented.
Concepts have both universal and occasional properties. For example,
"birds eat worms" is an occasional, not universal, fact about birds as
some never do, but even those that do are sometimes found eating fruit,
fish, or even not eating, without the proposition being necessarily
false. However, "birds have wings" is expected to be true at all times
for all birds; it is a pathological situation if a counter-example is
found. To represent the arguments of occasional predications, the
typical (TYP) arc is used. Thus the "bird" in "birds eat worms" is as
in Figure 2. It is also possible to use the typical relation to attach
occasional properties to members of categories, i.e., to instances.
In Figure 2 is shown the representation of "William Proxmire makes
foreign policy statements" where this is a statement of an occasional
habit rather than a record of a specific act. No position is taken on how
noteworthy knowledge is recognized as such in the development of the
encyclopedia.
Manifestation. The final paradigmatic relation is manifestation
(MAN). This corresponds to the phenomenon of object constancy: An
object may undergo change in space and time, but it is still perceived
as the same object. For example, William Proxmire before and after his
hair transplant is still William Proxmire. Also an object may
participate in many different actions but still preserve its identity,
e.g., Albert Einstein playing a violin and Albert Einstein writing on a
blackboard remains Albert Einstein. In the system each different
situation involves a distinct node. To a node defined by an instance
are linked, by manifestation arcs, nodes that correspond to an object
in its different guises. Manifestations of "William Proxmire" are
shown in Figure 2. Manifestations do not usually have names different
from that of their parent instance; a rare exception to this is the
Evening Star and the Morning Star which are both Venus at different
times of the day.
Manifestations of varietal and typical concepts are also possible.
The latter are used for properties that are true of the concept but
only at some point or period of time, for example, "vertebrates are
born", Figure 2. For typical arcs this notion is redundant as typical
embodies spatial and temporal indeterminacy. However, manifestation
does have a use with the typical arc in representing coreference.
Suppose it can happen that a person can trip causing him to be hurt.
The "person" in the encoded event is a typicalised "person", but it
must be the same person that trips that is hurt. Figure 2 shows the
use of manifestation relations to indicate this identity. More will be
said later about the formal representation of the causal relation
indicated in Figure 2. If only typical arcs were used, the
interpretation would be that anyone tripping could cause literally anyone to be
hurt. Multiple manifestations can also be used with variety if
coreference needs to be marked.
[This next paragraph is almost certain not to make sense until the
reader has completed as far as, and including, the section
"Inheritance", and so he may choose to leave it and return later.]
Other systems, Quillian (1969), Rumelhart et al. (1972), and
Schank (1975a), do not use manifestation but capture object constancy
by having one and the same node for a participant in all of its
propositions. This is a viable alternative. Nevertheless, information
on the relative standing of the appearances of the participant has to
be representable. If a single node were used in the encyclopedia, the
differentiation could be made on the modality nodes of the
propositions. This route was not taken as it is more convenient, for example,
to let the nature of the inheritance be determined completely in
paradigmatic organization, rather than in a mixture of paradigmatic and
discursive structures. For even without manifestation, the varietal
structure will require the process of inheritance.
Of the four arcs, variety, typical, and manifestation can be
iterated; instance cannot. Figures 2 and 3 illustrate iterative
arrangements of variety and manifestation. That typical also has this
property is seen from considering that "While dreaming, some people
talk or sleep-walk". None of these propositions are universally true,
but only of arbitrary people. Figure 2 contains this situation. The
above examples present only paradigms of entities, but events and
attributes also exhibit this kind of organization.
If paradigmatic structure is a loopless directed graph then there
will be origin nodes, that is, nodes without entering arcs. Can
anything be said about the number or kinds of concepts associated with
origin nodes? It is speculated that entities can be divided into
domains of being, each of which has its own paradigm. Possible domains
are thing, soul, role, time, etc. Thus to represent Ford as President
of the USA the structure in Figure 3 would be used. Figure 3 also
shows how the totality of John Brown (JB) and his fragments, as in
"John Brown's body lies mouldering in the grave but his soul goes
marching on", can be represented.
Figure 3 Domains of being 
To date scholars have only studied entity paradigms in detail.
Little investigation of attribute or event paradigms has taken place.
It is hard to intuitively discern the hierarchical ordering of these
concepts, i.e., to know which concepts imply others. Red, yellow,
etc., are obviously varieties of color, but does having mass imply
having color?--but many gases and glass have mass but are colorless.
Or does having color imply having mass?--but red light, blue jokes,
etc. Or are they quite independent attributes that happen to have a
large intersection in their domain of applicability? These are all
open questions in the taxonomy of attributes. The event paradigm is
also open to much speculation.
Syntagmatic Relations
Syntagmatic relations connect nodes from different paradigms (with
one exception). Relations of participation, similar to Fillmore's
(1969) case relations, connect entities and events. A relation of
application (APL) links attributes to events or to entities. A
relation of part-whole (P-W) connects a unity to its components. A
syntagmatically related structure is termed a proposition.
Four relations of participation are distinguished: agent (AGT),
instrumental (INS), objective (OBJ), and experiencer (EXP). The role
characterized by each is derived from dichotomies animate/inanimate and
causal/non-causal, as given in Table 1 (Fillmore, 1969).
              Animate        Inanimate
Causal        AGENT          INSTRUMENTAL
Non-causal    EXPERIENCER    OBJECTIVE
Table 1 Relations of Participation
Thus "Angry Bill ferociously hit Fred with the handle of an axe" is
diagrammed as in Figure 4.
Figure 4 Syntagmatic organization
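The two dichotomies of Table 1 amount to a four-way lookup, which can be sketched as follows. The function name `participation_role` and the boolean encoding of the dichotomies are assumptions made for illustration; only the four role abbreviations come from the text.

```python
def participation_role(animate, causal):
    """Return the participation relation selected by the two dichotomies."""
    table = {
        (True, True): "AGT",    # animate, causal: agent
        (False, True): "INS",   # inanimate, causal: instrumental
        (True, False): "EXP",   # animate, non-causal: experiencer
        (False, False): "OBJ",  # inanimate, non-causal: objective
    }
    return table[(bool(animate), bool(causal))]


# In the example sentence, Bill is animate and causal, the handle is
# inanimate and causal.
bill_role = participation_role(animate=True, causal=True)
handle_role = participation_role(animate=False, causal=True)
```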
The set of case relations does not include locative and temporal
relations. Sentence adverbials (Chomsky, 1965) are not part of
syntagmatic structure, but of the contextual structure, which is here
represented on modality nodes. Bound adverbials are part of
syntagmatic structure, e.g., "ferociously" above, and are related to the
event node by a relation of application.
A part-whole relation is used in Figure 4 to show the relation of
"handle" to "axe". This relation differs from other syntagmatic
relations in that it connects nodes of the same type, e.g., two entities.
A case can be made for this relation to be considered a paradigmatic
relation; for the present it has been put in with the syntagmatic
mainly because it is not used by the process of inheritance, of which
more later.
Discursive Relations
Propositions do not occur in isolation. They are tied together in
cognition in a number of ways. The spatial, temporal, and causal
connections are characterized by discursive arcs. Intuitively these
are relations between whole propositions, and this intuition is lost
if, say, a cause relation directly links two event nodes.
Modality nodes are used to represent situations in which the whole
proposition is involved. Though schematically linked to an event node,
conceptually the modality belongs to the whole proposition. Discursive
arcs relate the modalities of propositions. Thus "Mary slapped John
because he chased her" is represented as in Figure 5.
Figure 5 Discursive organization 
The one causal relation, cause, admits of no finer distinction. Others
(Schank, 1975a; Halliday & Hasan, 1975) distinguish three kinds of
causation: reason, result, and purpose. The single cause relation of
the encyclopedia models the first two directly. Purpose (or enabling)
causation is seen as separable into cause together with a desire for
the consequent. For example, a cup may fall causing it to break. The
fall could be accidental or it could be deliberate with the purpose of
breaking the cup. The same causal relation exists between the actions
in both cases, but the analysis of the purposive situation will involve
"desire".
Time arcs do permit subdivision. A proposition may be
simultaneous (SML) with another proposition: "Fred washed the car
while John chased Mary", Figure 5. A sequential ordering of
propositions is also found, characterized by a sequence (SEQ) relation. The
suggestions made here for the organization of space are only a working
set for which little justification can be offered: location (LOC)--a
neutral statement of position, contact--in physical contact, and near,
far, above, below, left, etc., which are self-explanatory.
Figure 5 represents the location of "Fred washed the car" as being
"garage". Since this work was completed Sondheimer (1977) has proposed
an analysis of space and time.
The Metalingual Relation
Speech acts do not make use only of forms having physical
reference, e.g., table, John, blue. A most important aspect of
language behavior is abstraction. Human social, scientific, and
intellectual development is dependent on the ability to create and control
abstract concepts. A quick appraisal of this paragraph reveals many
such concepts: language, behavior, social, etc. A system that
seriously hopes to approach human capabilities must have a
corresponding ability.
One part of modeling abstraction is representation; but what is to
be represented? Abstraction involves knowing a situation in which the
abstract term applies and replacing the situational description by the
abstract term. An example is "tragedy". The scene to which it is
applicable is, say, "Someone does a good act that results in his
death". This definiens is encoded in Figure 6. "Tragedy" names a
single node.
Figure 6 Metalingual organization 
The general propositions of the definiens are conjoined using a
modality node linked to the modalities of the propositions by
part-whole relations. In general there may be any number of levels of
modalities related by part-whole. To complete the association of the
definiendum with its definiens, a metalingual (MTL) arc links the
former to the appropriate modality in the latter, Figure 6. If any
situation matches the definiens, then the abstract term is appropriate.
The process of matching will be discussed later.
The definiendum can also be any concept, the choice is
idiosyncratic; there is no reason why this device cannot be used with
apparently non-abstract concepts, for example, a dog could be "man's best
friend" for some, in contrast with a non-abstract definition of "canine
animal". Non-abstract definitions have the form "genus-specificata".
In the encyclopedia, the representation is made up from a node related
by variety to the genus (animal), to which are attached the properties
in the specificata (canine).
Rumelhart and Ortony (1976) use a relation similar to metalingual,
ISWHEN, but do not show how participants are equated in the definiendum
and definiens, nor the processes that use such definitions.
The metalingual arc is used in another context. Some propositions
contain embedded propositions. For uniformity it is desirable to
restrict participation in propositions to entity nodes. Thus the
matrix proposition has an unnamed participant in objective or
instrumental role and this node is defined by a metalingual arc to the
modality of the contained proposition. For example
(1) Peter believes Fred chased the cat.
is represented as in Figure 7.
Figure 7 Embedding propositions
Status Relations
Knowledge in an encyclopedia is a model of the beliefs of one
person. Nevertheless the knowledge is not all of the same status. In
addition to containing the person's beliefs, it includes representation
of beliefs about his own desires and of his beliefs about the beliefs
and desires of others. His personal beliefs and desires interpret,
control, and direct his personal activities. The knowledge about
others is the basis for interacting and communicating with them. For
example, a conversation with a child about the structure of matter is
quite different from one with a nuclear physicist because of different
conceptions about their levels of knowledge and hence what can be taken
for granted. One has knowledge about individuals, e.g., your brother,
Nelson Rockefeller, etc., and about groups, e.g., politicians, sports
writers, Russians, etc.
A distinction can be made between subconscious and conscious
knowledge. The former is, for example, the knowledge of language
underlying its use or (2).
(2) The Sun circles the Earth.
Conscious knowledge is learnt or communicated knowledge, e.g., what one
has been taught about the solar system. There is no reason for the two
kinds of knowledge to be in accord regarding the same entities. One
has learned, for example, that the Earth circles the Sun.
Subconscious beliefs of self are unmarked in the encyclopedia.
Subconscious beliefs of another are indicated by a believe arc between
a node representing the believer and a modality node covering the
network representation of the content of the beliefs. The subconscious
belief of (2) by "people" is given in Figure 8.
Figure 8 Knowledge status 
Conscious beliefs are represented as propositions embedded within
an event "believe". An example is given in (1) with its representation
in Figure 7.
It is not only propositions that have belief status, but also
simple concepts, e.g., ghosts. To accommodate this information, the
placing of modality nodes is generalized. Previously only propositions
were associated with modalities; now any node can have its own
modality. On this modality information about a concept's existence and
belief status can be represented, as in Figure 8 for "Fred Smith".
It is unlikely that each node or proposition is immediately linked
to its believer. Using part-whole relations and modality nodes,
domains of belief, which may intersect, can be created as in Figure 8
for "Hugo".
Hendrix (1975) partitions semantic networks to delimit domains of
belief; here the same effect is gained through the use of modality
structures.
The desires of people are situations that they would like to
exist. The content of these goals can be represented by a modality
covering (complexes of) propositions or single concepts, e.g., peace.
If the goals are subconscious, a desire relation links the desirer to
the modality. For conscious states, the modality is part of a
metalingually defined objective of an event "desire". In modeling
behavior, these goals provide the situations that other behavioral
actions are intended to contribute towards achieving.
Negation 
Negation is a property that is marked on a modality. The most
common site for negative marking is a propositional modality. Thus
Figure 9 contains the proposition "I do not like tomatoes".
Figure 9 Negation
When some other constituent of a sentence is negated, say using strong 
stress, this is marked on a corresponding modality, so "John did not 
hit Mary" is encoded as in Figure 9. 
It is not anticipated that negation is a common feature in
knowledge, for "A person sometimes learns a negative fact when it
contradicts something that might be inferred by mistake or that is true
for a similar concept. But most negative facts are never learned"
(Collins & Quillian, 1972, p. 319).
Inheritance 
A node will inherit properties from nodes higher in its
paradigmatic path. Quillian (1969) used superset relations for the same
purpose. In Figure 2, B inherits the properties of A, C those of B, D
those of C, and E those of D. Inheritance is transitive, thus E
inherits properties from A, B, C, and D. This permits parsimonious
representation of properties: A property need only appear at the
ancestor of concepts having the property. Inheritance is inhibited
only if the inheritable property is contradicted on a lower node. For
example, although the property "fly" may be associated with "bird", it
is prevented from being inherited by "penguin" by having explicitly
"penguin not fly".
The generality of inheritance depends on the form of
representation of the property at the ancestor node. Properties that are
universally true at all times, e.g., birds have wings, are attached directly
to a varietal node and are obligatorily true of all descendants. If at
any time a bird without wings were reported, it would be cause for
further explanation. Some other properties are always true but only at
intermittent times, e.g., people eat, whose representation involves the
manifestation relation. It is not odd that a person can be seen not
eating, but if you watched long enough, it would be fully expected to
observe this behavior sometime. Finally there are occasional
properties that make use of the typical arc in their representation. These
properties are not universal, being merely noteworthy recollections
about a concept, e.g., The French are rude. It would well be possible
to have a complete history of an example of the concept and not witness
the property without being disturbed by its absence.
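Transitive inheritance with inhibition by a contradicting lower node can be sketched as an upward walk in which the nearest statement of a property wins. This is a minimal illustration under stated assumptions, not the paper's procedure: the function name `inherited_properties`, the dictionary encodings, and the property "has backbone" are hypothetical, and properties are reduced to true/false flags, so the universal, intermittent, and occasional kinds distinguished above are not separated here.

```python
def inherited_properties(node, parents, properties):
    """Collect the properties of `node`, including inherited ones.

    parents:    dict mapping a node to the nodes immediately above it
                on its paradigmatic paths.
    properties: dict mapping a node to {property: True or False};
                False encodes an explicit contradiction such as
                "penguin not fly".
    The walk proceeds level by level, so a value stated on a lower
    node blocks the value stated on any node further up.
    """
    result = {}
    seen = set()
    frontier = [node]
    while frontier:
        next_frontier = []
        for current in frontier:
            if current in seen:
                continue
            seen.add(current)
            for prop, value in properties.get(current, {}).items():
                result.setdefault(prop, value)  # nearer statement wins
            next_frontier.extend(parents.get(current, []))
        frontier = next_frontier
    return result


parents = {"penguin": ["bird"], "bird": ["vertebrate"]}
properties = {
    "vertebrate": {"has backbone": True},
    "bird": {"fly": True, "has wings": True},
    "penguin": {"fly": False},
}
penguin = inherited_properties("penguin", parents, properties)
```

Because a node may have several parents (the typewriter example), two ancestors at the same distance could disagree; this sketch simply keeps whichever is met first, which is an arbitrary choice.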
Episodic and Systemic Memory
Tulving (1972) distinguishes episodic from semantic memory. The
former "receives and stores information about temporally dated episodes
or events, and temporal-spatial relations between events" (p. 385).
The latter is
knowledge a person possesses about words and other verbal symbols,
their meaning and referents, about relations among them, and about
rules, formulas, and algorithms for the manipulation of these
symbols, concepts, and relations. Semantic memory does not
register perceptible properties of inputs, but rather cognitive
referents of input signals. (p. 386)
Abelson (1975) distinguishes episodic from propositional memory, and
Woods (1975) contrasts intensions with extensions along similar lines.
The term I prefer, following Hays (1978), is systemic rather than
semantic, propositional, or intensional.
The localization in space and time of knowledge is represented in
the encyclopedia by spatial and temporal organization of propositions
using the appropriate discursive relations. A proper subpart of
episodic memory is contained in paradigmatic organization.
Manifestations of instances (remember there are also manifestations of varietal
and typical nodes, so it must be thus stated) represent
spatio-temporally localized information about members of categories.
Consequently knowledge represented on manifestations of instances, or their
manifestations, is in episodic memory. This is only part of episodic
memory as categorical knowledge can also be present. For example, in
"Jung changed our view of dreams", the reference is to the categorical
notion of dreams, not to any specific ones. Nor is it sufficient for a
proposition to have a non-categorical participant to be in episodic
memory, for "Prior to the Revolution, Russian peasants were feudal
serfs" contains categorical participants, yet is episodic. The total
extent of episodic memory is ultimately decided through spatial and
temporal relations of discursive organization, not by paradigmatic
structure.
Quantification
Paradigmatic arcs have the capability of capturing the essence of
quantification, including scope. To illustrate the facility, consider
(3) and (4), which are equivalent to the formulae (5) and (6). Figure
10 encodes (3) and (4).
(3) There is a book that is read by every scholar.
(4) Every chorister knows a song.
(5) $\exists x\, \forall y\, [(\mathrm{book}(x) \;\&\; \mathrm{scholar}(y)) \supset \mathrm{read}(y,x)]$
(6) $\forall y\, \exists x\, [(\mathrm{song}(x) \;\&\; \mathrm{chorister}(y)) \supset \mathrm{know}(y,x)]$
Figure 10 Quantification 
If for a given chorister in (4) it is necessary to determine the song
he knows, i.e., to evaluate the Skolem function, the information is
present as a predication of that individual and should be retrieved
using his name, say "George" and "Drink to me only with thine eyes" in
Figure 10.
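As a hedged illustration of the Skolem function mentioned here: replacing the existentially quantified x of formula (6) by a function of y gives the form below. The symbol f is notation introduced for this example only and does not appear in the paper.

```latex
\forall y\, [(\mathrm{song}(f(y)) \;\&\; \mathrm{chorister}(y)) \supset \mathrm{know}(y, f(y))]
```

Evaluating f at a particular chorister, for example f(George), then corresponds to retrieving the song that is recorded as a predication of that individual.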
It is also possible to give distinct representation to
unquantified statements, such as (7), as in Figure 10.
(7) A person likes candy.
Paradigmatic arcs are here achieving representational power
equivalent to the partitioning of networks by Hendrix (1975).
The above is a systemic rendition of "all". The quantification
can also be characterized episodically by every manifestation of a
concept having the property. Interpreting "all" (Woods, 1975) could
call upon either systemic or episodic facts. A question containing a
universal quantifier may be answered by either examining a varietal
node (Are all mail-boxes blue?), or by examining every manifestation
(Do all mail-boxes stand at street corners?).
It should be noted that "all" requires that the predicacron be 
true only at some time, e.g., All people die; it does not require 
continuity in time, e .g., All birds have wings. Thus untiversal quan- 
tification is also true if the predication is found for a manffestation 
of the varietal node, os is found for every instance of the concept. 
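The two routes for answering a universally quantified question can be
sketched as follows. This is only an illustration and not the
representation used in the system: the dictionary layout, the names
SYSTEMIC, MANIFESTATIONS, and all_have, and the sample facts are
assumptions introduced here.

```python
# Hypothetical toy encyclopedia; names and facts are illustrative only.
# Systemic facts are attached to the varietal node itself.
SYSTEMIC = {"mail-box": {"blue"}}

# Episodic facts are attached to individual manifestations of the concept.
MANIFESTATIONS = {
    "mail-box": [
        {"id": "mb1", "props": {"at street corner"}},
        {"id": "mb2", "props": {"at street corner"}},
    ]
}


def all_have(concept, prop):
    """Answer 'Do all <concept> have <prop>?' systemically or episodically."""
    # Systemic route: the property is stated on the varietal node.
    if prop in SYSTEMIC.get(concept, set()):
        return True
    # Episodic route: every known manifestation carries the property.
    cases = MANIFESTATIONS.get(concept, [])
    return bool(cases) and all(prop in m["props"] for m in cases)
```

The sketch treats an absence of manifestations as a negative answer;
whether that is the right default is left open.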
Processes in the Network 
The model for knowledge described above is only part of a system 
to model cognitive behavior. Thought is simulated by processing know- 
ledge. Different aspects of behavior correspond to different proces- 
ses, but with one and the same encyclopedia common to all. A system 
for discourse analysis requires processes that use the encyclopedia to 
find patterns of organization in a discourse. It would be possible to 
describe solely the requirements of discourse analysis, but greater 
overall insight is gained through a preliminary general examination and
classification of cognitive processes. Once this is accomplished,
discourse analysis is seen not to be a unique process but as composed
of more basic general ones. Simulation of many aspects of cognitive
behavior can be performed by complexes of these general processes:
discourse analysis is just one such complex.
Processes can be classified in various ways: functionally, by 
complexity, or by the class of relation involved. 
The function of some processes is external; they deal with input
and output. Some internal processes find relations between new
information and knowledge already in the encyclopedia, others
investigate the validity of new knowledge, etc.
Processes are of two types of complexity, either path-tracing or
pattern-matching. The dichotomy is justified by showing that there are
tasks that can only be done by pattern-matching. This topic is
considered in detail later.
Of the infinite number of possible ordered sets of arcs, only some 
define significant paths in the network. An example of a relevant set 
of arcs is the arcs of a paradigmatic path; this defines possible 
inheritances. Other significant sets are causal chains, which are
represented by a string of cause arcs between modalities. This
suggests that processes that use the same kind of relations or
identical relations are significant.
A functional classification of processes does not give a deeper
understanding of cognitive processes. However, classification by
complexity and by kind of arc is revealing. Path-tracing and
pattern-matching differ in power. For the former, subpaths can be
defined by the kind of arc found in the subpath. Henceforth processes
in the network will be described according as they are path-tracers or
pattern-matchers.
Path-tracing 
Path-tracing processes try to establish paths between nodes along 
arcs of the network. Quillian (1969) established this methodology for 
semantic nets. A particularly common type of path is the paradigmatic 
path. In Figure 3 there is a paradigmatic path between "Ford (as
President)" and "thing", but not between "rock" and "soul". The
definition of a paradigmatic path is valid for entities, events, and 
attributes. 
Any paradigmatic path in the network will conform to the structure
shown in Figure 11, where * indicates any number of occurrences,
including zero, of the marked relation.
Figure 11 Paradigmatic paths 
The structure follows directly from the iterativity of variety,
manifestation, and typical arcs and their possible relative
orientations. Strings of arc labels representing paths through the
tree are obviously regular expressions, i.e., the strings are
sentences of a type 3 language. Paradigmatic path-tracing can thus be
characterized by a finite state automaton (Hopcroft & Ullman, 1969).
Any process that can be characterized by a finite state automaton
is formally termed a path-tracing process in the system.
One such process is testing the applicability of an attribute to
an entity, e.g., whether "fresh fish" or "round smoke" is acceptable
when the relationship is not explicitly in the encyclopedia. Assuming
the named entry points to the encyclopedia are at variety or instance
nodes, an entity E1 (e.g., horse) can inherit properties from an entity
E2 (e.g., animal) if there is a path between E1 and E2 of the form
$(\overline{\mathrm{IST}})\ \overline{\mathrm{VAR}}^{*}$, where
$\overline{X}$ indicates a relation that is the converse of X and
( ) indicate an optional arc. Properties may be attached to E2 either
directly or with typical and/or manifestation arcs, i.e., the path from
E2 to the node E3 in the representation of the property has the form
TYP* MAN*. Thus the path from E1 to E3 has the form
$(\overline{\mathrm{IST}})\ \overline{\mathrm{VAR}}^{*}$ TYP* MAN*.
Analogously, an attribute A1 can apply to an entity if there is
a similar path to an attribute that is encoded as applying to the
entity. Thus if there is a path
(8) <E1> $(\overline{\mathrm{IST}})\ \overline{\mathrm{VAR}}^{*}$ TYP* MAN* GE MAN* TYP* VAR* (IST) [A1]
then A1 can feasibly apply to E1. That is to say, the explicit
encoding of "emotional animal" would make it reasonable to infer "sad
horse". The path (8) is composed of paradigmatic paths linked by a
single application arc. Each segment is a regular expression. As type
3 languages are closed under concatenation (Hopcroft & Ullman, 1969,
theorem 3.8), it follows that (8) is also a regular expression and that
attribute applicability testing is a path-tracing process.
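Because the test is a regular expression over arc labels, it can be
sketched with an ordinary regular-expression matcher. The following is
an illustration under stated assumptions, not the program described
later: the arc list, the label "APP" for the application arc, the "~"
prefix for converse traversal, and the names steps, APPLICABLE, and
path_exists are all invented here, and the pattern is a simplified
analogue of (8).

```python
import re

# Hypothetical arcs (source, label, target). "APP" is an assumed name for
# the application arc; "~X" marks an X arc traversed in the converse direction.
ARCS = [
    ("animal", "VAR", "horse"),       # horse is a variety of animal
    ("animal", "APP", "emotional"),   # "emotional animal" is encoded explicitly
    ("emotional", "VAR", "sad"),      # sad is a variety of emotional
    ("thing", "VAR", "rock"),
]


def steps(node):
    """Yield (label, neighbour) for forward and converse traversals."""
    for src, label, dst in ARCS:
        if src == node:
            yield label, dst
        if dst == node:
            yield "~" + label, src


# Simplified analogue of path (8), over space-terminated arc labels.
APPLICABLE = re.compile(r"(~IST )?(~VAR )*(TYP )*(MAN )*APP (VAR )*(IST )?")


def path_exists(start, goal, pattern, seen=None, labels=""):
    """Depth-first search for a simple path whose label string matches."""
    seen = (seen or set()) | {start}
    if start == goal and pattern.fullmatch(labels):
        return True
    return any(
        path_exists(nxt, goal, pattern, seen, labels + label + " ")
        for label, nxt in steps(start)
        if nxt not in seen
    )
```

With these arcs, path_exists("horse", "sad", APPLICABLE) succeeds by
way of "animal" and "emotional", which mirrors the inference of "sad
horse" from "emotional animal". Enumerating simple paths is adequate
only for a small network; a product of the network with the automaton
would scale better.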
Propositions in a discourse should be consistent with encyclopedic
knowledge. Consistency is established by finding a proposition in the
encyclopedia that is a generalization of the discourse proposition,
e.g., given the discourse proposition
(9) Mary gobbled the caviar.
and finding the generalization
(10) People eat food.
A novel statement, e.g., "Harry munched the spider", which is not
consistent with (10) (assuming "spider" is not a variety of "food"),
would evoke a demand for further explanation, or similar. Consistency
judgment can be formulated as a complex of path-tracing processes. In
the network form of (9), "Mary" is the agent and "caviar" is the
objective of "gobble". Figure 12 encodes (10).
Figure 12 Consistency judgment
The words in the discourse proposition provide entry points into the
network of Figure 12 through the dictionary and converse name
relations. From "gobble", node 1, paths along paradigmatic arcs are
traversed to locate nodes from which "gobble" could inherit properties,
e.g., node 2. Next, from the entries for "Mary" (A) and "caviar" (B),
analogous paths are followed, reaching C and D, respectively (among
other nodes). From C and D arcs corresponding to the participatory
relations of "Mary" and of "caviar" to "gobble", i.e., agent and
objective, respectively, are followed. If all paths intersect at a
single node, e.g., node 2, then the proposition containing the
intersection is the general proposition sought. Each path from an
entry point to an intersection can be characterized by a regular
expression. There are only four case relations, which sets a finite
upper bound to the number of paths to be followed. Hence this process
is also a path-tracing process.
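The intersection search can be sketched in a few lines. This is a
minimal sketch under simplifying assumptions: paradigmatic structure is
reduced to a single upward link per node, only the agent and objective
cases are used, and the names PARENT, GENERAL, ancestors, and
generalization, together with the sample entries, are hypothetical.

```python
# Hypothetical upward paradigmatic links (child -> parent); illustrative only.
PARENT = {
    "Mary": "person", "Harry": "person",
    "caviar": "food", "spider": "arachnid",
    "gobble": "eat", "munch": "eat",
}

# Hypothetical systemic propositions: (event, agent, objective).
GENERAL = [("eat", "person", "food")]


def ancestors(node):
    """Return the node and everything reachable along upward links."""
    found = {node}
    while node in PARENT:
        node = PARENT[node]
        found.add(node)
    return found


def generalization(event, agent, objective):
    """Find a general proposition at which all three upward paths intersect."""
    ev, ag, ob = ancestors(event), ancestors(agent), ancestors(objective)
    for g_event, g_agent, g_obj in GENERAL:
        if g_event in ev and g_agent in ag and g_obj in ob:
            return (g_event, g_agent, g_obj)
    return None
```

Here generalization("gobble", "Mary", "caviar") locates the proposition
corresponding to (10), while the spider example finds no intersection
and would call for further explanation.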
Locating existing knowledge, propositions that are already
explicitly in the encyclopedia, is effectively identical to the
consistency testing process above, but with downward paradigmatic
paths being followed instead of upward ones. Thus given "Oswald
assassinated Kennedy" and the network of Figure 13,
Figure 13 Finding known propositions
paths can be traced from node 1 to node 2, from node A to node B to
node 2, and from node C to node D to node 2. The common intersection
is in the known proposition.
Pattern-Matching
Pattern-matching is used in processes where two configurations of
nodes and arcs must match. One such process is with metalingually
defined terms. If a discourse configuration matches a metalingual
definition, then the part of the discourse so matched may be replaced
by the term. Figure 14 contains representations of (a) "Fred ate some
cake that made him sick" and (b) the definition of "poison": "Someone
ingests something that makes him ill".
Figure 14 Pattern-matching: (a) discourse; (b) definition
If the latter matches the former, then "poison" describes the discourse
situation. Earlier a path-tracing process was used to establish
consistency between a general and a specific proposition. The same
process can be used to pair propositions of the discourse and the
definition. However, there is an aspect of complexes of propositions
that prevents path-tracing from being a complete solution. If the
complex contains coreferential items, as "poison" does, this
coreferentiality must be examined; if it were not, errors could result.
For example, consider a discourse containing "John's eating the worms
made Fred sick". Each proposition matches part of the definition of
"poison", but it should not be taken as an act of poisoning. The
coreferentiality condition prevents a match. As, in general, there can
be any number of coreferential participants in a complex of
propositions, it is not possible to define a regular expression to
characterize the coreferentiality test. This can be shown by
considering a definition of an abstract term that contains n
coreferential concepts. There is in general no bound on n as the
definition can contain any number of propositions. If a complex of
discourse propositions is to match the definition then there must
first be a unique corresponding proposition in the definition for each
discourse proposition. This can be done using the path-tracing process
described above. But over and beyond this, the coreference condition
must be satisfied. For each manifestation of the coreferential concept
in the definition there must be a corresponding manifestation of one
and always the same concept in the discourse. Also the syntagmatic
role of corresponding manifestations in their respective propositions
must be the same. The acceptance condition involves pair-wise
counting. This is equivalent to accepting strings of the form
$a^n b^n$, which are not sentences in a type 3 language (Hopcroft &
Ullman, 1969). This demonstrates that processes that compare complexes
of propositions containing coreferential items are not, in general,
path-tracing processes.
A process characterized by a device more powerful than a finite
automaton is formally designated a pattern-matching process.
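The coreference condition amounts to keeping a table of bindings while
propositions are paired, which is exactly the memory a finite
automaton lacks. A minimal sketch follows, assuming that the discourse
predicates have already been generalized by path-tracing (e.g., "ate"
to "ingest") and ignoring the causal arc between the propositions. The
"?x" variable notation and the names POISON and matches are
assumptions of this illustration.

```python
# Hypothetical definition of "poison"; strings starting with "?" are
# variables standing for coreferential concepts.
POISON = [
    ("ingest", {"agent": "?x", "objective": "?y"}),
    ("ill", {"applicand": "?x"}),
]


def matches(definition, discourse):
    """Pair propositions in order and require consistent variable bindings."""
    if len(definition) != len(discourse):
        return False
    bindings = {}
    for (d_pred, d_roles), (s_pred, s_roles) in zip(definition, discourse):
        # Corresponding propositions must agree in predicate and in roles.
        if d_pred != s_pred or d_roles.keys() != s_roles.keys():
            return False
        for role, var in d_roles.items():
            # setdefault binds a variable on first use; later uses must agree.
            if bindings.setdefault(var, s_roles[role]) != s_roles[role]:
                return False
    return True


fred = [("ingest", {"agent": "Fred", "objective": "cake"}),
        ("ill", {"applicand": "Fred"})]
john = [("ingest", {"agent": "John", "objective": "worms"}),
        ("ill", {"applicand": "Fred"})]
```

The first discourse binds "?x" to Fred in both propositions and so
matches; the second binds "?x" to John and then meets Fred, so the
match is refused even though each proposition fits its counterpart.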
Paraphrasing discourse using metalingually defined terms is
another pattern-matching process. Metalingual definitions can be
recursively embedded. For example, "buy" may be defined in terms of
"give", which in turn may be defined in terms of "have". Recursion is
not a property of regular languages, hence this process is not a
path-tracing process.
Matching discourse configurations against definitions, called
abstraction, is an extension of the process that substantiates
discourse propositions by seeking generalized propositions in the
encyclopedia, discussed earlier. The components of a definition are
generalized propositions and hence the substantiation process will find
them if they correspond to part of the discourse. Schematically, two
discourse propositions DP1 and DP2 may match generalized propositions
GP1 and GP2, and GP3, GP4, and GP5, respectively, as shown in Figure 15.
Figure 15 Abstraction (levels: definition MLD; generalized
propositions; discourse propositions DP1, DP2)
This is the normal output when judging consistency. Propositions of a
definition are under a conjoining modality, to which the metalingual
arc points. If it is found that some of the general propositions are
part of definitions, i.e., GP2 and GP3 in MLD, then these definitions
are examined to see if all the conditions for their use are satisfied,
i.e., coreference and contextual (e.g., cause arcs) conditions. For
example, in "poison", Figure 14, the coreference of the agent of "eat"
and the applicand of "ill". If a definition is satisfied, then the
part of the discourse matching the definiens can be paraphrased.
The definitional nets so far presented are not adequate for
paraphrasing, but must be augmented to include the roles of entities of
the definiens with respect to the definiendum. This is done with
manifestation arcs. A network definition of "buy" (in "A buys thing
from B for money") is given in Figure 16. The verbalization is "A
gives money to B and B gives thing to A".
Figure 16 Role correspondence 
The manifestation arcs indicate the role correspondences between "buy" 
and the defining situation as well as coreferentialities within the 
latter. 
The case correspondences are essential information for the process
of abstraction and for its inverse, decomposition, which produces a
less abstract description from a network containing a term that has a
metalingual definition. For example, given the sentence "John bought a
bicycle from Jane", the definition of Figure 16 enables the paraphrase
"Jane gave a bicycle to John and John gave money to Jane" to be
generated. "Money" was unexpressed in the original, but is present in
the definition, and appears in the paraphrase. The process fills empty
slots by the appropriate concept from the definition, in this case
"money". The agent, experiencer, and objective slots are filled in the
source statement and are transferred to the paraphrase.
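Decomposition with role correspondences can be sketched as a table
lookup. The sketch below is an illustration only: the assignment of
"Jane" to the experiencer case and of "money" to the instrument case,
the dictionary layout, and the names BUY and decompose are assumptions
made here rather than details taken from Figure 16.

```python
# Hypothetical role correspondences for "buy". Each definiens proposition
# maps its own case slots to the slot of "buy" that supplies the filler.
BUY = {
    # Filler taken from the definition when the source leaves a slot empty.
    "defaults": {"instrument": "money"},
    "definiens": [
        ("give", {"agent": "experiencer", "objective": "objective",
                  "experiencer": "agent"}),
        ("give", {"agent": "agent", "objective": "instrument",
                  "experiencer": "experiencer"}),
    ],
}


def decompose(definition, slots):
    """Rewrite a proposition using its definiens, filling empty slots."""
    filled = {**definition["defaults"], **slots}
    return [
        (pred, {role: filled[src] for role, src in roles.items()})
        for pred, roles in definition["definiens"]
    ]
```

Applied to the slots of "John bought a bicycle from Jane" (agent John,
objective bicycle, experiencer Jane), decompose yields the two "give"
propositions of the paraphrase, with "money" supplied by the
definition.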
Another abstract term can point to the same definitional network,
say "sell" in the case of Figure 16. The net then has all the
information for paraphrases between the two abstract terms as well as
for decomposition and abstraction.
There is no productive relationship between the roles of the same
participant at different levels of abstraction. Case relations
represent only the causal/animate perception of participation in an
event. More detailed descriptions of the roles of participants can
only be given in context. For example, "money" is perceived as
instrumental in "buy", but at the next level of decomposition, it is in
an objective role in "give".
The outputs of both abstraction and decomposition are structurally
indistinguishable from any other proposition in the encyclopedia and
therefore can again be subject to either of the processes. As
pattern-matching is a recursive process this ability for output of the
process to be accepted as input is essential.
The distinction between path-tracing and pattern-matching
processes may be psychologically significant. Inhelder and Piaget
(1964) find that prepuberty children cannot use logical equations such
as $\neg(A \land B) \supset \neg A \lor \neg B$. The equations involve
coreference and hence their application requires a pattern-matching
process. It could be speculated that this more powerful process only
appears at maturation.
DISCOURSE ANALYSIS
The Structure of Coherent Discourse 
In this section the hypothesis concerning the kinds of organization
present in coherent discourse is outlined. A fuller description
can be found elsewhere (Phillips, 1975). The role of the encyclopedia
in discourse is then exemplified.
A discourse is judged coherent if its constituent propositions are
connected. Various types of cohesive links are observed in discourse:
anaphoric, spatial, temporal, causal, and thematic. I will formally
describe the structure of a well-formed discourse in terms of these
connectives.
Anaphora 
A discourse has reference to objects. Coherence is given by
repetition of the reference. Two kinds of anaphora can be
distinguished. The first is marked by the presence of a proform (or by
repetition of the form): [It is usual for coherent discourse to
exhibit several kinds of cohesive links. Thus the examples invariably
contain more than the one specifically being illustrated.]
(10) Henry travels too much. He is getting a foreign accent.
Antecedents may be nominal, verbal, or clausal. The second kind of
anaphora has a dependent that is an abstract term for the antecedent,
for example
(11) John put the car into "reverse" instead of "drive".
The mistake cost him $300 to repair.
"Mistake" in (11) is an abstract characterization of the gear selection
expressed in the first sentence. Nagao and Tsujii (1976) address this
issue.
A conventional way to label the recurring characters in discourse
is as "dramatis personae". However, cohesion can result not only from
multiple appearances of people (10), but of any concept, as in (11).
Spatial, Temporal, and Causal Cohesion
Space, time, and cause give coherence to a set of clauses or
sentences:
(12) The King was in the counting house, counting out his money.
The Queen was in the parlor, eating bread and honey.
The actions in (12) are set in different rooms, but of the same
"palace".
(13) After Richard talked to the reporter, he went to lunch.
The temporal sequence of events in (13) is expressed by "after".
(14) John eats garlic. Martha avoids him.
To non-aficionados garlic is known only for its aroma, detection of
which causes evasive action.
Cause, illustrated in (14), is an important discourse connective
(Schank, 1975b). The importance is perhaps ethnocentric; in other
cultures different positions may have to be taken, for example, a
teleological world view (White, 1975).
The causal chain of propositions in discourse is termed its plot
structure.
Thematicity 
Coherent discourse is expected to have a theme, to have a topic.
For example
(15) DF drowned today in MB reservoir after rescuing his son
who had fallen into the water while on a fishing trip.
is a news story from the New York Times with a theme that I will call
"tragedy". In this section I wish to justify the claim that a thematic
structure condition is universal by examining different examples and
analyses of general discourse for evidence.
The notion of theme is much used but not often defined with
clarity. It is variously stated to be "The subject of
discourse ... a topic" (Oxford English Dictionary); "the playwright's
point of view towards his material" (Mabley, 1972, p. 14), etc. In
Abelson (1973) there is a list of themes (admitted to be neither fixed
nor exhaustive): admiration, devotion, appreciation, cooperation,
love, alienation, betrayal, victory, dominance, rebellion, mutual
antagonism, opposition, and conflict. Occasionally one finds overt
comment on the lack of a theme: "The thing that puzzled me most about
The Last Remake of Beau Geste was its lack of a point of view" (Barry
Took, "Cinema" in Punch, December 7, 1977). Equally infrequently one
can find a succinct amplification of the structure of a theme: "On the
other hand, the suspension of disbelief is what thrillers are about"
(Sheridan Morley, "Theatre" in Punch, November 19, 1975).
A theme may be explicitly stated in discourse. In technical
writing it is quite usual to express a complete definition,
definiendum-definiens: Kuhn (1962) defines "paradigm" as an
"achievement" that is "sufficiently unprecedented to attract an
enduring group of adherents away from competing scientific activity
[and] sufficiently open ended to leave all sorts of problems for the
redefined group of practitioners to resolve" (p. 10). Much of the rest
of the book then discusses paradigms as models for scientific
revolutions.
If a discourse has an implicit theme, it has to be inferred by the
reader. An author, therefore, should use themes that are known to the
reader. One possibility is that there is only a finite number of
themes. But lacking evidence for this position, I will hypothesise
that the number of themes may be unlimited in the same way that the
vocabulary of a language is open. A reader may not know a word that is
used by an author; in a similar fashion he may not recognize a theme.
There are studies that indicate the existence of abstract themes
in language. In folk-tales, Propp (1968) analyses a reader's
expectancies about the structure of the tale. Propp starts by
comparing the following events from different tales:
1. A tsar gives an eagle to a hero. The
eagle carries the hero away to another kingdom.
2. A princess gives Ivan a ring. Young men appearing
from out of the ring carry Ivan into another kingdom.
Propp infers that "a tale often attributes identical actions to various
personages. This makes possible the study of the folk tale according
to the functions of the dramatis personae" (p. 20). Folk tales are
analysed in terms of functions. The above examples are described as
containing two functions: "Acquisition of a magical agent" and
"Transference to a designated place". An example of Propp's analysis is
(22) ACTION                          FUNCTION
     A tsar, three daughters.        INITIAL SITUATION
     The daughters go walking,       ABSENTATION
     overstay in the garden.         VIOLATION
     A dragon kidnaps them.          VILLAINY
     A call for aid.                 MEDIATION
     Quest of the three heroes.      CONSENT TO COUNTERACTION, DEPARTURE
     Three battles with the dragon.  STRUGGLE, VICTORY
     Rescue of the maidens.          INITIAL MISFORTUNE LIQUIDATED
     Return.                         RETURN
     Wedding.                        WEDDING
                                     (p. 128)
Functions correspond to metalingually defined concepts of the
encyclopedia. Propp shows that this genre of discourse can be analysed
as an ordered string of abstract concepts.
Linde (1974) finds that there is a prescribed pattern in verbal 
descriptions of apartments. Only two discourse strategies are used by 
her subjects to express the spatial structures, and of these, one is 
considerably more frequent than the other: 
There are at least two logical possibilities for ... [the
overall description of apartment layouts] ... the speaker may
describe a map of the apartment, or he may describe a tour of it.
Examples of each are the following:
I'd say it's laid out in a huge square pattern, broken down
into four units. If you were looking down at this apartment
from a height, it would be like ... like I said before, a
huge square with two lines drawn through the center to make
four smaller squares. Now on the ends ... uh ... in the
two boxes facing out on the street you have the living room
and a bedroom. In between these two boxes you have a
bathroom. Now between the next two boxes, facing the
courtyard you have a small foyer and then two boxes, one of
which is a bedroom and the other of which is a kitchen and a
small foyer a ... a little beyond that.
Well you walk in the door and there's a kitchen and then off
the kitchen is one bedroom. As you go straight in from the
doorway through the kitchen you go into the living room.
And then to the left of the living room are two bedrooms.
The two bedrooms are on the same side of the building and the
living room and the kitchen are on the same side of the
building.
Both of these descriptions are reasonable answers to the question
"would you describe the layout of your apartment?" Our intuition
certainly informs us that both speakers have fulfilled the task
that was proposed them. What our intuitions do not tell us is
that descriptions like [the first] are extremely rare, while
descriptions like [the second] are extremely common. Of 72
apartment descriptions, only 3 are of the form of a map ... while 69
are the form of a tour (pp. 8-9).
The tour may be a composition of separate episodic events of moving
between rooms of the apartment. The plan is more obviously systemic,
involving spatial (left, right, etc.) and componential (part-whole)
organization.
Longacre (1968) notes that in a given language there is a finite 
number of discourse types which can never be mixed or confused. 
Discourse from various Philippine languages suggests four contrasting
discourse prose genres:
Narrative: recounts some sort of story.
Procedural: tells how to do something.
Expository: any sort of explanatory essay.
Hortatory: attempts to influence or to change conduct.
Narrative discourse is composed of the following tagmemes:
APERTURE provides temporal and spatial setting and introduces
some of the principal dramatis personae. CLOSURE gives final
commentary on the main participants, "they lived happily ever
after". Nuclear tagmemes EPISODE, DENOUEMENT, and
ANTI-DENOUEMENT show a great variety of exponence ... typically
any paragraph type may be an exponent - plus embedded
discourse of the PROCEDURAL or EXPOSITORY genre.
A correspondence can be informally recognized between some of
Propp's functions and Longacre's tagmemes. For example, between
"Initial Situation" and "Aperture", and "Reward" and "Closure". For
Propp the peak of the discourse is in the function "Initial Misfortune
Liquidated", and for Longacre it is in the tagmeme "Anti-Denouement".
The idea of a hierarchic organization of tagmemes mentioned by
Longacre, above, is paralleled in Lakoff's (1972) transformational
generative model that uses Propp's set of functions. A phrase
structure component generates a "deep structure". For example, the
tale of (22) may be represented by the tree structure of Figure 17.
Figure 17 Textual deep structure
The conclusion is that there are prescribed patterns in all genres
of discourse; I term these patterns "themes". I do not offer a
complete inventory of themes; their discovery is a matter of empirical
investigation.
Any extended discourse is unlikely to be organized according to a
single theme. I hypothesise that a coherent discourse is characterized
by a single rooted tree of themes, as schematized in Figure 17. All
themes must be proper subthemes of the matrix theme. A text with an
overlapping thematic structure is incoherent:
(23) Eating fish made John sick. He caught measles last May.
shown schematically in Figure 18. 
Figure 18 Incoherent thematic structure
An important point to conclude this section. The inferred
connections may not correspond with those intended by the author. This
is another problem. Here I only address the analysis of a story by a
reader. If he connects it in the manner described above, then it is
coherent for him.
The Role of the Encyclopedia
Not all of discourse structure is overtly stated; discourse is
highly elliptic. In (13) the discourse connective "after" is present
to mark a temporal sequence, but in (14) there is no realization of the
causal relation between the two propositions. Normally one assumes
that a discourse is coherent; hence (12) is most acceptable if the
rooms are taken within the same habitation. Evidently a reader must
infer omitted structure. The inferences are made from his cognitive
store of world knowledge.
There is much discussion at present about inference as part of
understanding. To make inferences is easy; the problem is to make the
right ones. It helps to have a goal. It is suggested that discourse
can be said to have been understood when it has been judged coherent,
as defined above.
In the next sections are presented the role of the encyclopedia in 
determining and representing the dimensions of coherence spelled out 
above. 
Anaphora 
If the dependent is a proform then part of understanding is to
determine the correct antecedent. There are syntactic constraints
(Langacker, 1969) which serve to narrow down choices for antecedents
and to give an order of preference. Winograd (1971) also established
an ordering for the choice of antecedents. Nash-Webber (1976) used
lambda abstraction to establish possible antecedents. The chosen
antecedent, when substituted for the proform, must produce a meaningful
proposition that is coherent in context. A meaningful proposition is
one that has a counterpart in the encyclopedia. Wilks (1975) discusses
a method of finding the most semantically acceptable antecedent. In
encyclopedic terms, the counterpart may be the self-same proposition,
or, more likely, a systemic proposition. The process of finding such a
proposition has been described earlier. If no generalization is found,
the input proposition is not consistent with encyclopedic knowledge.
Abstract terms can be defined by complexes of general
propositions, each having sufficient conceptual content to define
situations in which they apply. For example, a definition of "mistake"
must be such that it applies to part of the first sentence in (11).
The process of abstraction needed here was presented above.
Spatial, Temporal, and Causal Cohesion
To infer omitted spatio-temporal and causal relations, i.e., the
discursive relations of the encyclopedia, it is also necessary to
locate general propositions. Systemic memory, of course, includes
these relations. Schematically, Figure 19, from a discourse
proposition P1 we can locate P2, by the means already described. P2
may have a discursive relation R to another systemic proposition P3.
A proposition P4, a particularized version of P3, and the relation R,
between P2 and P3, can be added to the discourse. Often P4 will be a
proposition already stated in the discourse; then merely the relation
need be inferred to augment the plot structure.
Figure 19 Inference of discursive structure
It may, however, be necessary to infer a chain of propositions to
link the propositions of the original discourse. Intuitively there
must be a limit on the number of propositions that can be inferred in a
sensible path, but at present no insight can be offered.
To exemplify the process in greater detail, let us consider some
of the knowledge that is used in the analysis of (15): "In water and
not able to act causes drowning". In Figure 20 the network form of
this knowledge is presented.
Figure 20 Example of causal inference 
From the discourse propositions "DF in water" and "DF cannot act",
paradigmatic structure enables the systemic propositions A and B to be
found. There is a coreferentiality condition that must be tested in
the manner described earlier. The discourse propositions pass the
test, so the complex represented by the modality C exists in the
discourse. The discursive relation cause can be followed from C to D.
The latter is a plausible inference, and in fact, a specific equivalent
of D is one of the original propositions of the discourse, i.e., "DF
drowns". The concepts of the systemic propositions are linked to the
rest of the encyclopedia by typical arcs. This is so because the
nature of the knowledge is such that it is only something that could
happen in the given circumstances.
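The inference just described can be sketched as a rule applied to the
propositions about a single participant. This is a minimal sketch and
not the network procedure itself: grouping propositions by participant
stands in for the coreferentiality test, and the names RULES and
infer, the tuple layout, and the second test case are assumptions of
the illustration.

```python
# Hypothetical systemic rule after Figure 20: premises that must all hold of
# one participant, the discursive relation, and the consequent predicate.
RULES = [({"in water", "cannot act"}, "cause", "drown")]


def infer(discourse):
    """Infer discursive relations from (participant, predicate) propositions.

    Returns (relation, participant, predicate, already_stated) tuples.
    """
    by_participant = {}
    for who, pred in discourse:
        by_participant.setdefault(who, set()).add(pred)
    inferred = []
    for who, preds in sorted(by_participant.items()):
        for premises, relation, consequent in RULES:
            # Subset test: every premise is predicated of the same participant.
            if premises <= preds:
                stated = (who, consequent) in discourse
                inferred.append((relation, who, consequent, stated))
    return inferred


story = {("DF", "in water"), ("DF", "cannot act"), ("DF", "drown")}
split = {("DF", "in water"), ("son", "cannot act")}
```

For the first set the consequent is already a discourse proposition, so
only the cause relation is added to the plot structure; for the second
the premises concern different participants and nothing is inferred.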
The indications from the testing of Thorndyke (1976) are that
inferences are a psychological reality in understanding natural
language texts.
Thematicity 
In the present system, a thematic concept is defined structurally;
it is anything having a metalingual definition. A theme is therefore a
complex of generalized propositions. The process of detecting the
applicability of abstract terms, and hence of finding themes, is
abstraction, described above. Abstraction is a recursive process and is
thus one way to capture the embedding of themes hypothesised to exist
in discourse.
The paraphrases elicited by Mandler, Johnson, and DeForest (1976)
and Rumelhart (1976) show that subjects do create descriptions of texts
that vary in abstractness in accord with the hierarchy of themes
proposed here.
IMPLEMENTATION 
In this section the implementation of the structures and processes
presented above is described. The original program was written in
SNOBOL for a CDC6400.
In a complete system there should, of course, be a parser. For 
the present this does not exist; the system only embodies the cognitive 
component. This means that the overall organization is not as it would 
be in a complete text analysis system, where interaction between the 
syntactic and semantic components is essential (Woods, 1971; Schank,
1975a; Winograd, 1971; Erdman, 1975). The justification for this
omission is that for the present I am seeking to establish only the
nature of the semantic organization of a coherent discourse. Once this
structure has been identified it will provide the goal for a complete
system.
Input to the system is accordingly in a cognitive form that
retains the logical ellipsis of the surface form.
Most of the processing is performed by "Normalizer", which infers
omitted logical and thematic structure. A judgement of coherence is
then a simple task: if the discourse is not logically connected or
does not have a single theme, then it is incoherent; otherwise the
matrix theme indicates the topic of the discourse.
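The coherence judgement can be sketched in a few lines. This is only an illustration under stated assumptions, not the original SNOBOL program: the function names, the representation of links as pairs of propositions, and the `embedded_in` table of themes are all invented for the example.

```python
def is_connected(propositions, links):
    """True if every proposition can be reached from any other via links."""
    nodes = set(propositions)
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for a, b in links:  # links are treated as undirected for this test
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return seen == nodes


def judge_coherence(propositions, links, embedded_in):
    """Return the matrix theme if the discourse is coherent, else None.

    embedded_in maps each theme found to the theme that contains it
    (None for a theme that is not embedded in any other).
    """
    roots = [t for t, outer in embedded_in.items() if outer is None]
    if is_connected(propositions, links) and len(roots) == 1:
        return roots[0]
    return None


# Hypothetical toy discourse: three causally linked propositions, two themes.
props = ["in water", "cannot act", "drowns"]
links = [("in water", "drowns"), ("cannot act", "drowns")]
themes = {"drowning": None, "tragedy": "drowning"}
topic = judge_coherence(props, links, themes)
```

With these toy values the single unembedded theme is returned as the topic; dropping one of the links leaves a proposition unconnected and the judgement fails.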
Processes 
A component of all processes is a breadth-first path-tracing
routine, called "Ripple". A search path is defined by a sequence of
arc types. A path does not explicitly state whether an arc can be
repeated. The network is assumed to be syntactically well-formed and
this controls repetitions. An arc can be marked as obligatory;
otherwise it is optional. A goal of the search can be defined. This may be
a particular node, or a node marked with a specified "activation tag",
i.e., a node reached by another path, when seeking an intersection.
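A breadth-first tracer of this kind might look as follows. This is a minimal sketch, not a reconstruction of Ripple itself: the dictionary representation of the network, the `(arc_type, obligatory)` encoding of a path, the goal predicate, and the toy network with its arc directions are assumptions made for illustration.

```python
from collections import deque


def ripple(network, start, path, goal=None):
    """Breadth-first trace from `start` along a sequence of arc types.

    network: dict mapping node -> list of (arc_type, target_node)
    path:    list of (arc_type, obligatory) pairs
    goal:    optional predicate on nodes; the search stops at the first hit
    Returns the nodes reached once the whole path has been consumed.
    """
    reached = []
    seen = set()
    # A state is (node, index into path, has the current arc been followed?).
    queue = deque([(start, 0, False)])
    while queue:
        state = queue.popleft()
        if state in seen:
            continue
        seen.add(state)
        node, i, taken = state
        if i == len(path):
            if goal is None:
                if node not in reached:
                    reached.append(node)
            elif goal(node):
                return [node]
            continue
        arc_type, obligatory = path[i]
        # Move on if the arc is optional or has been followed at least once.
        if taken or not obligatory:
            queue.append((node, i + 1, False))
        # Follow the current arc type; repeats are allowed because the
        # network, not the path, is assumed to limit them.
        for label, target in network.get(node, []):
            if label == arc_type:
                queue.append((target, i, True))
    return reached


# Hypothetical toy network; the arc directions are chosen for the example.
NETWORK = {
    "animal": [("VAR", "mammal"), ("TYP", "m0")],
    "mammal": [("VAR", "dog")],
    "dog": [("TYP", "m1")],
    "m0": [("MAN", "breathe")],
    "m1": [("MAN", "bark")],
}
PATH = [("VAR", False), ("TYP", True), ("MAN", True)]
properties = ripple(NETWORK, "animal", PATH)
```

A goal predicate such as `lambda n: n in activated` (with `activated` a set of nodes tagged by an earlier search) would play the role of the activation tag when an intersection is sought.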
Paradigmatic path-tracing
Paradigmatic path-tracing is implemented by Ripple with a path
sequence VAR IST TYP MAN (see Figure 11). A converse paradigmatic path
is MAN TYP IST VAR. The properties associated with a varietal concept
may be found by Ripple with a path TYP MAN starting from the concept.
Causal connectivity condition 
This process uses Ripple with cause as the path definition. It
also has to include P-W and converse P-W to be able to reach from and
to conjuncts.
Discovery of general and specific propositions
All propositions of a discourse must match general propositions in
the encyclopedia. The procedure is to make cyclic calls of Ripple.
The first is from the modality node of the discourse proposition. Each
node reached, other than the modality, initiates another search in the
encyclopedia. For example, given the discourse and encyclopedia of
Figure 12, the process is as follows: from "gobble", node 1, a converse
paradigmatic path plus a typical arc plus manifestation is followed to,
for example, node 2 in the encyclopedia. Rippling from "gobble" in the
discourse gives nodes "Marv" and "caviar". The syntagmatic arcs
traversed are noted. From "Marv", node A, a converse paradigmatic path
plus typical plus manifestation plus converse agent is followed, with a
goal of a node activated from the prior discourse node, i.e., "gobble".
[Not all of these arcs have to be present; they are optional except for
the syntagmatic arc.] Node 2 satisfies this goal. From "caviar", node
B, a converse paradigmatic path plus typical plus manifestation plus
converse objective is followed with a goal of a node activated from the
prior discourse node "gobble". Again node 2 satisfies the goal. Thus
the proposition at 2 is a generalization of the discourse proposition.
The condition on an acceptable generalized match is that it must
contain all the syntagmatic information of the discourse proposition;
the generalization may contain more information but it cannot contain
less. Separate searches are made for syntagmatic structure and for
spatio-temporal information on the modality of a proposition.
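The acceptance condition on a generalized match can be illustrated as a role-by-role test. This is a sketch under assumptions: propositions are modelled as dictionaries from role names to concepts, and the `PARENT` table is a hypothetical stand-in for the paradigmatic structure of Figure 12, which is not reproduced here.

```python
# Hypothetical fragment of paradigmatic structure (specific -> more general).
PARENT = {"Marv": "person", "caviar": "food", "gobble": "eat"}


def is_a(specific, general):
    """True if `general` is reached by climbing from `specific`."""
    while specific is not None:
        if specific == general:
            return True
        specific = PARENT.get(specific)
    return False


def acceptable_generalization(discourse_prop, general_prop):
    """Every role of the discourse proposition must reappear, generalized.

    The general proposition may carry additional roles, never fewer.
    """
    return all(
        role in general_prop and is_a(filler, general_prop[role])
        for role, filler in discourse_prop.items()
    )


discourse = {"event": "gobble", "agent": "Marv", "objective": "caviar"}
richer = {"event": "eat", "agent": "person", "objective": "food",
          "instrument": "mouth"}
poorer = {"event": "eat", "agent": "person"}

richer_ok = acceptable_generalization(discourse, richer)
poorer_ok = acceptable_generalization(discourse, poorer)
```

The richer proposition is accepted although it mentions an instrument the discourse does not; the poorer one is rejected because the objective is missing.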
It is only necessary to change the path description from that used
in Figure 12, from converse paradigmatic path to paradigmatic path, for
the routine to locate more specific propositions.
Metalingual decomposition
The search for general propositions also flags nodes that have
metalingual definitions. New propositions having the structure of the
definiens are made by copying the definiens but with node names drawn
from the proposition that is being paraphrased. These new propositions
are then considered as part of the discourse. To make the copy, the
breadth-first search routine is used to pass through the definiens.
For each node and arc in the definition, an equivalent structure is
created. The end of scanning a proposition of the definiens is marked
by reaching a typical arc, i.e., at the point at which the definition
is linked into paradigmatic organization. If a participant in the
generalized proposition matches a participant in the discourse
proposition then this participant fits into the corresponding role slot
in the definiendum; otherwise the concept in the definiendum is used. For
example, consider "Peter buys a bicycle from Jane" and the definition of
"buy" as in Figure 16. In locating the systemic definiendum, the
correspondences of "Peter" to "A", etc., are also found. When copying
reaches the node that matched "Peter", this name is inserted into the
paraphrase. There is no correspondence for the instrument, so "money"
is then inserted from the definiendum.
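The copying step can be sketched as a substitution over the definiens. The definiens below is a hypothetical stand-in for Figure 16, which is not reproduced here; only the correspondence of "Peter" to "A" and the default instrument "money" come from the text, and the role names and the labels "B" and "C" are assumptions.

```python
# Hypothetical definiens for "buy": the seller gives the object to the
# buyer, and the buyer gives the instrument (money) to the seller.
BUY_DEFINIENS = [
    {"event": "give", "agent": "C", "objective": "B", "recipient": "A"},
    {"event": "give", "agent": "A", "objective": "money", "recipient": "C"},
]


def decompose(definiens, correspondences):
    """Copy each proposition of the definiens, inserting discourse names.

    A filler with a correspondence is replaced by the discourse
    participant; any other filler is copied from the definition as is.
    """
    return [
        {role: correspondences.get(filler, filler)
         for role, filler in proposition.items()}
        for proposition in definiens
    ]


# Correspondences found while locating the systemic definiendum.
found = {"A": "Peter", "B": "bicycle", "C": "Jane"}
paraphrase = decompose(BUY_DEFINIENS, found)
```

The two resulting propositions say that Jane gives the bicycle to Peter and that Peter gives money to Jane, the second keeping "money" from the definition because the discourse supplied no instrument.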
Metalingual abstraction
In searching for general propositions, some may be found that are
components of metalingual definitions. These have modalities that are
pointed to by a part-whole arc.
The process that tests coreference and contextual requirements
uses Ripple to traverse in parallel the candidate discourse
propositions and those of the definiens. Typical arcs in the definiens limit
the search. Each node of the definiens is compared with equivalent
nodes in the discourse propositions at each step. A proposition is
rejected if a node has no equivalent or it does not possess all the
properties of the nodes of the definiens, including arcs to nodes
matched in the previous step, i.e., if nodes X (systemic) and Y
(discourse) were taken as corresponding nodes at one step, then if on
the next step a node of the definition has an arc to X then the
discourse must have the same arc to Y. Only those propositions that
match the definiens will not have been rejected and can be rewritten
using the abstract term. The process can be illustrated using the
definition of "poison" given in Figure 14b, and its application to the
discourse in Figure 14a. The crux of the test is at that node of the
definition having the two manifestation arcs emanating from it. If
the discourse proposition did not have two manifestations it would be
rejected. This is how "John's eating the worms made Fred sick" is
eliminated. Or if it does have two manifestations, they must point to
nodes that were satisfied on the previous step of the comparison. Thus
if one of the manifestations pointed into another proposition the test
would fail.
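The requirement that correspondences fixed at one step constrain the next can be sketched as pattern matching with consistent bindings. This is an illustration only: the triple representation, the arc names, and the pattern standing in for the definition of "poison" are assumptions, since Figure 14 is not reproduced here.

```python
def match(pattern, facts, bindings=None):
    """Match definiens triples against discourse triples.

    pattern: list of (variable, arc, variable) triples from the definition
    facts:   list of (node, arc, node) triples from the discourse
    Returns a consistent variable -> node mapping, or None if rejected.
    """
    bindings = dict(bindings or {})
    if not pattern:
        return bindings
    (subject, arc, obj), rest = pattern[0], pattern[1:]
    for f_subject, f_arc, f_obj in facts:
        if f_arc != arc:
            continue
        trial = dict(bindings)
        # A variable bound at an earlier step must keep the same node.
        if (trial.setdefault(subject, f_subject) == f_subject
                and trial.setdefault(obj, f_obj) == f_obj):
            result = match(rest, facts, trial)
            if result is not None:
                return result
    return None


# Hypothetical pattern: the one who ingests is the one who becomes sick.
POISON = [("E", "agent", "X"), ("S", "experiencer", "X"), ("E", "cause", "S")]

fred_sick = [("eat1", "agent", "John"), ("sick1", "experiencer", "Fred"),
             ("eat1", "cause", "sick1")]
john_sick = [("eat1", "agent", "John"), ("sick1", "experiencer", "John"),
             ("eat1", "cause", "sick1")]

rejected = match(POISON, fred_sick)
accepted = match(POISON, john_sick)
```

The first discourse is rejected because the node bound to `X` by the eating proposition is not the node that becomes sick; the second yields a full set of correspondences.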
Inference of omitted discursive relations
A search along discursive arcs in systemic memory from
counterparts of discourse propositions may lead to a proposition that is
flagged as a generalization of another discourse proposition. If this
is so then the discursive arc may be added between the discourse
propositions. If the proposition reached is not flagged then it and
the discursive arc are copied, and added to the discourse. The copying
routine was given in the discussion of decomposition.
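One step of this inference can be sketched as follows. The sketch simplifies heavily and is not the original routine: propositions are plain strings, only cause arcs are followed, coreference testing is omitted, and the copy of an unflagged proposition is represented by a labelled string rather than by the copying routine.

```python
def infer_omitted_relations(discourse, general_of, systemic_cause):
    """Add cause arcs between discourse propositions, copying where needed.

    discourse:      list of discourse propositions
    general_of:     dict discourse proposition -> its systemic counterpart
    systemic_cause: dict systemic proposition -> propositions it may cause
    Returns (arcs, added), where `added` lists the propositions copied in.
    """
    # Systemic propositions flagged as generalizations of the discourse.
    flagged = {general: d for d, general in general_of.items()}
    arcs, added = [], []
    for proposition in discourse:
        for effect in systemic_cause.get(general_of[proposition], []):
            if effect in flagged:
                # The effect is already in the discourse: just add the arc.
                arcs.append((proposition, "cause", flagged[effect]))
            else:
                # Otherwise copy the proposition and the arc into the discourse.
                copy = f"copy of <{effect}>"
                added.append(copy)
                arcs.append((proposition, "cause", copy))
    return arcs, added


# Hypothetical example data.
discourse = ["Smith falls", "Smith injured"]
general_of = {"Smith falls": "person falls", "Smith injured": "person injured"}
systemic_cause = {"person falls": ["person injured"],
                  "person injured": ["person cannot act"]}
arcs, added = infer_omitted_relations(discourse, general_of, systemic_cause)
```

Here the arc between the two stated propositions is recovered, and "person cannot act", which has no counterpart in the discourse, is copied in as a new inferred proposition.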
The System
The flow chart of the analysis is shown in Figure 21. The
meanings of the annotations are:
OLDINFO has a discourse proposition as its argument. It finds
    systemic equivalents. It calls a routine SPACETIME to
    compare spatio-temporal contexts. SPACETIME is also called
    during the search for general propositions when a non-event
    node is found with an attached modality. If OLDINFO is
    presented with a modality that has only part-whole relations
    to other nodes, it does nothing.
LOGCON has a systemic proposition as its argument. It succeeds if
    it finds a link to a general proposition corresponding to a
    proposition of the discourse (including propositions added by
    inference). It also generates INTERLIST, a list of causal
    inferences from propositions of the discourse.
IKST is a list of nodes found to have metalingual definitions.
CONJLIST is a list of conjoined propositions. When a discourse
    proposition is matched against the encyclopedia, it sees if
    the encyclopedic proposition is a constituent of another
    modality. A CONJUNCTION TEST routine uses CONJLIST to locate
    discourse propositions that can be grouped.
TRANSFORM has two modes. In one it is used to decompose
    propositions that contain a metalingually defined concept. A second
    mode creates causally inferred propositions.
Figure 21 Flow chart of the system
ANALYSIS OF SOME STORIES
I want to show that abstract patterns are quite general, that all
linguistic behavior is based on such patterns. Obviously such a claim
must be substantiated by the discovery of such patterns. A number of
stories of drowning were used to test this hypothesis. The second
claim of proper embedding of themes was also tested by a more complex
drowning story.
In the examples a refined hypothesis of discourse connectedness is
used. One habit in discourse is to set the stage (Propp's "Initial
Situation", Longacre's "Aperture"). In terms of the model this aspect
should be recognizable by the occurrence of space and time relations.
We find "today", "in MB reservoir", "On October 11, 1974", "DF of
Queens", etc. A greater structural complexity of expression is to be
expected elsewhere in the stories (see Longacre's comments on nuclear
tagmemes, above). Longacre (1972) includes in the natural outline of a
discourse, recognition of a peak within the discourse. Various surface
markings for the peak are given: tense change, extra long sentences,
rhetorical underlinings, etc. Taking an ethnocentric view of the world
(cf. White's teleological commune), it is suggested that in the
underlying form, the peak will lie within causally related propositions.
The theme is thus expected to be found within the causal structure and
so I focus on this organization. This would be inappropriate if the
stories were descriptions of the kind elicited by Linde, above.
Common patterns 
Short factual accounts of drownings were elicited from freshmen in
Linguistics and English. The instructions given sought only to define
a topic and an approximate length: "Write a drowning story that, for
example, you might expect to find as a column filler in the New York
Times." A sample is
(Story 1) The body of Horatio Smith was found last night in the
Niagara River. He was drowned when his boat overturned on the
river.
The hypothesis formed is that an acceptable drowning story must 
give the following information: 
(a) Why the victim was in the water. 
(b) Why the victim was not able to save himself.
The rationales for these requisites are:
(c) A person is not usually found in water, and therefore
    some explanation of this location is expected.
(d) By an instinct of self-preservation, one would expect the
    victim to try to extricate himself from his predicament.
    The story should say why he couldn't.
Figure 22 shows the cognitive form of this requirement. 
Figure 22 The drowning theme
The empty modalities indicate that a matching story must have something
that stands in a causal relationship to the other propositions, i.e.,
explains why they happened (what caused them). If not originally
explicit, this information must be recoverable through encyclopedic
knowledge.
One way in which the content of systemic memory may be
substantiated is by examination of negative propositions in the stories;
writers presumably only need to negate normal expectancies, and this is 
what generalizations are. Unfortunately there are few negatives in the 
stories. Stories 31 and 5 indicate the assumption of swimming ability. 
Stories 12, 32, and 42 show that the success of rescue attempts is 
anticipated. 
The stories elicited fell into several categories. The two
analysed here are a boat capsizing and a fall into water. In spite of
surface difference it will be shown that at an abstract level the
stories conform to the thematic pattern. The analysis of one story
from each category is presented.
Stories not analysed include such happenings as a person eating
too much then going for a swim; Jesus freaks trying to walk on water;
and water-skiers having accidents while watching bikini-clad occupants
of passing boats.
A boat capsizing
(Story 1) The body of Horatio Smith was found last night in the
Niagara River. He was drowned when his boat overturned in the
river.
(Story 2) Eggbert Willis, 56, of Bayside, drowned this morning
after the boat he was rowing overturned near Devil's Cove.
(Story 3) The body of John Smith, 58, was discovered today at the
foot of West Ferry Street. He was reported missing four days ago
by his wife after he failed to return from a boating trip. His
boat had capsized. Death was due to drowning.
(Story 31) A small sailboat was afloat on a calm peaceful lake
when suddenly the mast of the vessel struck some cables overhead
and the boat capsized. The two men aboard drowned, one because he
was hit by the boat and rendered unconscious, the other didn't
know how to swim.
Story 1 is analysed. In it, some of the causal and thematic
structure is absent and is reconstructed using the following knowledge:
(i) If a person is in a boat and the boat overturns, this may
    cause him to be injured and to be in the water.
(ii) If injured a person may not be able to "act".
(iii) If a person is in water and cannot "act" then he may drown.
Figure 18 shows the encyclopedic form of this piece of
knowledge.
The nodes appearing in the encyclopedic entries for all of these facts
exemplified in this section are linked to varietal nodes by typical
arcs. This is so because the facts are not obligatory on some concept
but are something that may happen to some examples of this category
some of the time.
In the analysis of the collection of drowning stories, only the
parts of the story that are relevant to the drowning are considered.
For example, in Story 1, only the second sentence is processed; the
first deals with an event after the death and consequently is excluded.
The result of the analysis is shown in Figure 23. The original
propositions of the discourse are:
Boat contains Horatio Smith.
Boat overturns on Niagara River.
Horatio Smith drowns.
The antecedent conditions expressed in a general form in (i) match the
specific situation in Story 1. Thus it is inferred that Horatio Smith
is in the water and that he is injured. From (ii) it follows that he
is not able to act. By being in the water and not able to act, he can
drown, a fact stated explicitly in the story, as shown in Figure 23.
Figure 23 A boat capsizing
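The chain of inferences for Story 1 can be reproduced with a small forward-chaining sketch. It is an illustration under assumptions: facts (i) to (iii) are encoded as sets of strings, the "may" of the original knowledge is ignored, and the theme test is a simplified reading of Figure 22.

```python
# Facts (i)-(iii) as (antecedents, consequents); a simplified encoding.
RULES = [
    ({"in boat", "boat overturns"}, {"injured", "in water"}),  # (i)
    ({"injured"}, {"cannot act"}),                              # (ii)
    ({"in water", "cannot act"}, {"drowns"}),                   # (iii)
]


def forward_chain(facts, rules):
    """Apply rules until nothing new follows; record what caused what."""
    facts = set(facts)
    causes = {}  # consequent -> the antecedents that explain it
    changed = True
    while changed:
        changed = False
        for antecedents, consequents in rules:
            if antecedents <= facts:
                for consequent in consequents:
                    if consequent not in causes:
                        causes[consequent] = frozenset(antecedents)
                        changed = True
                    facts.add(consequent)
    return facts, causes


def fits_drowning_theme(causes):
    """Drowning is caused by being in water and unable to act, and each
    of those two conditions has an explanation of its own."""
    return (causes.get("drowns") == frozenset({"in water", "cannot act"})
            and "in water" in causes
            and "cannot act" in causes)


story_1 = {"in boat", "boat overturns", "drowns"}
facts, causes = forward_chain(story_1, RULES)
theme_fits = fits_drowning_theme(causes)
```

Starting from the three stated propositions, the sketch infers "injured", "in water", and "cannot act", and links the stated drowning to its causes, so the theme test succeeds.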
Further we have explanations for Horatio Smith being in the water: the
boat he was in overturned; and for him not being able to act: he was
injured. The theme fits. We have a connected discourse with
(trivially) a single-rooted theme; it is coherent. The possibility
that a boat overturning only puts a person into the water is added to
the encyclopedia to account for part of Story 31, where it is not an
injury but the inability to swim that prevents one victim from saving
himself. This has consequences for Story 1. These two facts match the
same propositions but are an exclusive conjunction. When a complex is
found that has the same constituents as a conjunction already
constructed, the later episode is stacked and used if the current analysis
path fails. In Story 1 the use of the alternative does not lead to a
connected structure. The subsequent backup then takes the correct
path.
A fall into water
The second category of drowning story requires the addition of the
knowledge that "If a person falls, he may injure himself". Ten stories
in this category are listed below.
(Story 5) Early this morning, James R. Smith, age 7, was found in
a swimming pool near his home. Investigators say the boy stumbled
into the pool in the darkness early this morning whilst looking
for his pet kitten. Unable to swim, the boy drowned.
(Story 7) At the home of Mrs. John Smith on Elmwood Avenue, a
boy, Mark, age 15, drowned in his pool. The boy was with two
other friends. They were performing water stunts when Mark fell
and smashed his head on the bottom of the pool.
(Story 12) Yesterday afternoon, the life of a Buffalo youth was
taken when he slipped on rocks at a local quarry. The failure of
attempted rescues resulted in the drowning of Michael Smith, age
7, of 29 Oak Street, Buffalo.
(Story 19) A 12 year-old boy was found drowned in Ellicott Creek.
Sources say the boy ran away from home and fell accidentally into
the water.
(Story 26) A 10-year-old boy died last night when he fell into a
small pond. His friends say he was chasing his parakeet which had
escaped from its cage, when the incident occurred.
(Story 32) Steve Smith, of Hickstown, drowned today while sailing
on Glasslyke Lake. Mr. Smith, who was knocked overboard when
struck on the head by a seagull, perished before help could reach
him. His son Edgar's attempts to save his life proved futile.
(Story 37) An unidentified man was seen by several persons
falling into the Niagara River at the foot of Ferry Street. He
was later pulled from the water and pronounced dead at the scene.
The cause of death was drowning.
(Story 38) Today, the world's greatest swimmer died. John Whale
was preparing to take a bath when he tripped and fell into the
bath. Cause of death was drowning.
(Story 40) On October 11th, 1974, an unidentified man drowned in
his bathtub at the Hotel Sheraton. The drowning was due to the
fact that he fell into the tub in trying to make himself sober.
(Story 42) An 11 year-old boy drowned today after falling into
the canal where he and his friends were playing. The two other
boys, both eleven, tried to save their companion but were unable
to do so.
(Story 4) A body was found early yesterday at the foot of the
Mango River, near Clubsport. The body is believed to be that of
Jose Gepasto. It seems as if Mr Gepasto's car made a wrong turn
on the highway and plunged into the water.
Story 4 is analysed. Note that it does not explicitly mention the
motion of the person, only that of the car. Understanding the story
requires that it be known that:
(a) If a person is "contained" by something that falls,
    then he also falls.
(b) If a person is "contained" by something that is
    in contact with something (e.g., water), then the
    person is in contact with the something (water) too.
    (But not if the something is a submarine.)
Further it is not given that Jose Gepasto drowns. This can be
inferred, but the inference chain is open ended. The analysis continued
making causal inferences and conjunctive groupings, some of which led
to the discovery of the theme. Only when the system ran out of logical
and conjunctive possibilities did it make the connectedness test.
Figure 24 shows the network developed in the analysis.
Figure 24 A fall into water
Embedded themes 
Story 22, in fact taken from the New York Times, looks like a
drowning story, and as shall be shown, does contain this theme.
However, it contains more. The claim is made that it is a "tragedy".
(Story 22) DF, 43 years old, of Queens, drowned today in MB
reservoir after rescuing his son D, who had fallen into the water
while on a fishing trip at TF, near here, the police said.
This theme is defined in Figure 6 as a situation in which "Someone does
a good act (e.g., rescue) and dies (e.g., drowns)". It will be seen
that the tragedy is a proper subtheme of the drowning theme. Thus,
though the story may be said to have two themes, one is part of the
other, and by our hypothesis, the discourse is still coherent.
At each step the encyclopedic knowledge used in the inference and an
outline of the inferred nodes are indicated. Figure 25 shows an outline
of the evolved structure, in which original discourse propositions and
inferred propositions are marked with different symbols.
Figure 25 Embedded themes
Step 0. Initial state. (Nodes 1, 2, 3, 4).
Step 1. Fall causes injury. (Node 5).
Step 2. Injury causes inability to act. (Node 6).
Step 3. In water and not able to act causes rescue.
        (Node 7 and a link to node 3).
Step 4. To rescue someone who is in the water, get into
        the water. (Nodes 8, 9).
Step 5. Acting causes weariness. (Node 10).
Step 6. Weariness causes inability to act. (Node 11).
Step 7. In water and not able to act causes drowning.
        (Node 12 and a link to node 4).
Note that the antecedent condition in Step 3 is the same as in Step
7. Both resultant situations are possible and are noted. The system
can select either. However, the wrong choice does not lead to a
connected structure and backup to the alternative has to be made.
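The selection and backup between alternatives can be sketched as a depth-first search over choice points. This is an illustration only: the choice points, the proposition strings, and the stand-in connectedness test (an outcome is acceptable if the discourse states it) are assumptions, and the real test operates on the inferred network.

```python
def analyse_with_backup(choice_points, leads_to_connected_structure, chosen=()):
    """Try alternatives in order, backing up when the final test fails."""
    if len(chosen) == len(choice_points):
        return list(chosen) if leads_to_connected_structure(chosen) else None
    for alternative in choice_points[len(chosen)]:
        result = analyse_with_backup(
            choice_points, leads_to_connected_structure, chosen + (alternative,)
        )
        if result is not None:
            return result
    return None  # every alternative at this choice point failed


# Hypothetical rendering of Story 22: the same antecedent ("in water and
# not able to act") admits two outcomes at Step 3 and again at Step 7.
DISCOURSE = {"son falls", "son in water", "father rescues son", "father drowns"}
CHOICE_POINTS = [
    ["son drowns", "father rescues son"],    # Step 3
    ["father is rescued", "father drowns"],  # Step 7
]


def stand_in_test(chosen):
    """Stand-in for the connectedness test: outcomes must be in the story."""
    return all(outcome in DISCOURSE for outcome in chosen)


selected = analyse_with_backup(CHOICE_POINTS, stand_in_test)
```

The first alternative at Step 3 is tried and abandoned before the analysis settles on the rescue followed by the father's drowning.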
After Step 7 the discourse has an inferred causal structure
connecting all the original propositions.
The theme "tragedy" fits; the rescue is a (partial) cause of the
demise. Rescue is a variety of act and good can apply to it, and drown
is a variety of die. The drowning theme is also present.
Although the drowning theme is not defined in terms of the tragedy,
it can be seen that one is properly embedded in the other. The
process that performed the analysis is at present incomplete because
the notion of embedding is not well understood for the highly
structured network. The process used the transitivity of cause and the
conjoining of propositions. Thus the tragedy encompasses propositions
3, 10, 11, and 4 and the drowning 8, 9, 10, 11, and 4. The
transitivity of cause lets the chain 3, 10, 11 be equivalent to the chain
10 and 11.
A postmortem on this example reveals a serious flaw. As can be
seen the rescue is the cause of the father being in the water. The
analysis has failed to distinguish desire for an action, a goal, from
the execution of the action. A more satisfying analysis would include
some of the mechanisms to be found in the robot planner of Furugori
(1975). Step 2 should be seen as setting up the conditions for the son
to drown, which is an event that should be prevented. This provides a
goal for the subsequent activities. One way to prevent someone from
drowning is to save him; this is a subgoal that would directly achieve
the goal. If you want to rescue someone who is in the water, then it
may be necessary to get into the water. With this subgoal included,
the goal can be achieved, and the analysis resumes at Step 5. Figure
26 shows this preferred analysis of this fragment of the story. This
would not change the relative status of the themes.
Figure 26 Improved story analysis
DISCUSSION 
Much of the representative power of the encyclopedia is unused in
the system for discourse analysis and therefore remains to be tested
and evaluated. There is also not at present a parsing system to effect
transduction from surface to encyclopedic form. The methodology is
first to try to establish an adequate conceptual representation which
provides a goal for a complete system. Although one example
demonstrated the embedding of themes, it did not exhibit recursive
abstraction. Further examination of themes in discourse should overcome
this.
There are two aspects of the encyclopedia, paradigmatic and
metalingual organization, that set this model apart from any other
current system. Discussion will be directed to comparative comments on
these aspects.
It is evident that the present system makes much use of
paradigmatic organization. Yet Schank (1975a) seeks to minimize the need for
this kind of knowledge. His conclusion arises from the observation
that people do not make responses based on paradigmatic associations,
but rather on episodic associations. This is not telling evidence
against the existence of such structure; rather it may say something
about the cognitive process of free association. In Schank's system
there is no need for paradigms at the level of conceptual
representation as words are transformed into conceptual primitives by the
conceptual parser. The parser thus contains knowledge that is functionally
equivalent to paradigmatic structure.
The question then to ask is whether having a single level of
representation of concepts, the primitives, is the most beneficial for
conceptual processing. I would claim not, for two reasons.
Firstly there is the presence of thematic structure in discourse.
Metalingual organization enables the content of a text to be
thematically described at many levels of preciseness. It is possible
to be quite superficial, or by decomposition to become more and more
detailed, or vice-versa using abstraction. The depth of analysis can
be determined by the requirements of understanding a given text:
essentially abstracting or decomposing until causal links are
established over the text. It is not apparent that definitions of
themes can be controlled in Conceptual Dependency Theory, in that if
stated in terms of primitives, each abstract term could become
extremely large. In contrast, the encyclopedia can define themes in terms
of lesser themes.
Secondly paradigmatic structure enables comparatively small chunks
of knowledge, say involving a single causal relation, to be retrieved
and pieced together to complete the underlying form of discourse.
Rather than attempting to patch together general knowledge, Schank and
Abelson (1975) have introduced "Scripts", large preformed knowledge
structures whose function is to limit the possible inferences in
understanding. This has a danger of essentially idiomatizing understanding,
with a consequent difficulty in handling deviant situations. And as Wilks
(1976) points out, one problem with Scripts is that they are invoked in
their entirety by word association. Thus it is suggested that, for
example, "I bought some beer from the supermarket, drove home, and
drank it while watching a football game on television" would evoke a
multitude of Scripts by the presence of such words as "supermarket",
"drive", "football", etc. Hence the desired reduction of possible
inferences is not achieved. Paradigmatic organization enables
recognition of higher level structures, including propositions that are part
of metalingually defined concepts. Partially recognized abstract terms
can be used to predict their completion. The encyclopedia thus has
general, productive bottom-up and top-down capabilities.
Even though an abstract definition should be activated by the
appearance of an appropriate word in the text, the structure will not
in general be large, and so not produce an overwhelming number of
extraneous active nodes.
On the other hand it is certainly advantageous to have a multitude
of overt themes in analysing discourse. Searches can be initiated from
these terms as well as from the more specific discourse propositions.
To illustrate this, consider an exhaustive undirected search for a goal
n states distant in a space where each state is linked to m other
states. The number of nodes activated will be of the order m**n. If
the goal is known then a bidirectional search will reduce the exponent
to n/2. But a more significant reduction takes place if there are
stated intermediate subgoals, i.e., themes in the hierarchy: with, say,
g of them at an average separation of n/g, the exponent becomes n/(2g).
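A small calculation makes the size of these reductions concrete. The values of m, n, and g are arbitrary illustrations, and the constant factors (two frontiers per bidirectional search, one search per segment) are assumptions added here; the text itself gives only the exponents.

```python
def undirected(m, n):
    """Order of nodes activated by an exhaustive search n states deep."""
    return m ** n


def bidirectional(m, n):
    """Two frontiers, each expanding to depth n/2."""
    return 2 * m ** (n // 2)


def with_subgoals(m, n, g):
    """g segments of length n/g, each searched bidirectionally."""
    return g * bidirectional(m, n // g)


m, n, g = 3, 12, 3  # illustrative values only
counts = (undirected(m, n), bidirectional(m, n), with_subgoals(m, n, g))
```

For these values the three estimates are 531441, 1458, and 54 activated nodes, which shows how strongly the intermediate themes cut the search.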
ACKNOWLEDGEMENTS
I would like to thank David Hays for his continual support and
guidance throughout this work. Ray Bennett and Randy Walser helped in
trying to make this discourse coherent; the faults that remain are
mine.

REFERENCES 
Abelson, R. P. The structure of belief systems. In R. C. Schank &
K. M. Colby (Eds.), Computer Models of Thought and Language. San
Francisco: W. H. Freeman, 1973.

Abelson, R. P. Concepts for representing mundane reality in plans. In
D. G. Bobrow & A. Collins (Eds.), Representation and Understanding:
Studies in Cognitive Science. New York: Academic Press, 1975.

Berlin, B., Breedlove, D. E., & Raven, P. H. Covert categories in folk
taxonomies. American Anthropologist, 1968, 70, 290-299.

Black, F. A deductive question-answering system. In M. Minsky (Ed.),
Semantic Information Processing. Cambridge: MIT Press, 1968.

Bobrow, D. G. Natural language input for a computer problem-solving
system. In M. Minsky (Ed.), Semantic Information Processing.
Cambridge: MIT Press, 1968.

Chomsky, N. Aspects of the Theory of Syntax. Cambridge: MIT Press,
1965.

Colby, K. M. Simulation of belief systems. In R. C. Schank & K. M.
Colby (Eds.), Computer Models of Thought and Language. San
Francisco: W. H. Freeman, 1973.

Collins, A., & Quillian, M. R. How to make a language user. In E.
Tulving & W. Donaldson (Eds.), Organization of Memory. New York:
Academic Press, 1972.

Conklin, H. C. Lexicographical treatment of folk taxonomies. In F. W.
Householder & S. Saporta (Eds.), Problems in Lexicography
(Publication No. 21). Bloomington: Indiana Research Center in Anthropology,
Folklore, and Linguistics, 1962.

Erdman, L. D. Overview of the Hearsay speech understanding research
(Computer Sciences Review 1974-1975). Pittsburgh: Carnegie-Mellon
University, 1975.

Fillmore, C. J. Toward a modern theory of case. In D. A. Reibel &
S. A. Schane (Eds.), Modern Studies in English: Readings in
Transformational Grammar. Englewood Cliffs, NJ: Prentice-Hall,
1969.

Furugori, T. A memory model and simulation of memory processes for
driving a car (Technical Report No. 77). Buffalo: State University
of New York, Department of Computer Science, 1975.

Halliday, M. A. K., & Hasan, R. Cohesion in English. London:
Longman, 1976.

Hays, D. G. Linguistic foundations for a theory of content analysis.
In G. Gerbner, O. R. Holsti, K. Krippendorff, W. J. Paisley, &
P. J. Stone (Eds.), The Analysis of Communication Content. New
York: Wiley, 1969. (a)

Hays, D. G. Applied computational linguistics. In G. E. Perren &
J. L. Trim (Eds.), Applications of Linguistics. Cambridge:
Cambridge University Press, 1969. (b)

Hays, D. G. Linguistic problems of denotation. In M. Bierwisch &
K. E. Heidolph (Eds.), Progress in Linguistics. The Hague: Mouton,
1970.

Hays, D. G. Types of processes on cognitive networks. Paper presented
at the International Conference on Computational Linguistics, Pisa,
Italy, 1973.

Hays, D. G. Cognitive Structures. Book in preparation, 1978.

Hendrix, G. G. Expanding the utility of semantic networks through
partitioning. Menlo Park, CA: Stanford Research Institute, 1975.

Hopcroft, J. E., & Ullman, J. D. Formal Languages and their Relation
to Automata. Reading, MA: Addison-Wesley, 1969.

Inheldet, Be, & Piaget, J. The Earlv Growth of Logic in the child. 
New York: W. We Norton, 1964. 

Klein, S., Oakley, J. D., Suurballe, D. J., & Ziesemer, R. A.
A program for generating reports on the status and history of
stochastically modifiable semantic models of arbitrary universes.
Statistical Methods in Linguistics, 1972, 8, 64-93.

Kuhn, T. S. The Structure of Scientific Revolutions. Chicago:
University of Chicago Press, 1962.

Lakoff, G. Structural complexity in fairy tales. The Study of Man,
1972, 1, 128-150.

Langacker, R. On pronominalization and the chain of command. In D. A.
Reibel & S. A. Schane (Eds.), Modern Studies in English: Readings in
Transformational Grammar. Englewood Cliffs, NJ: Prentice-Hall,
1969.

Linde, C. The linguistic encoding of spatial information. Unpublished
doctoral dissertation, Columbia University, 1974.

Longacre, R. Discourse, Paragraph, and Sentence Structure in Selected
Philippine Languages. Santa Ana, CA: Summer Institute of
Linguistics, 1968.

Longacre, R. Hierarchy and Universality of Discourse Constituents in
New Guinea Languages. Washington, D.C.: Georgetown University
Press, 1972.

Mabley, E. Dramatic Construction. Philadelphia: Chilton, 1972.

Mandler, J. M., Johnson, N. S., & DeForest, M. A structural analysis
of stories and their recall: From "Once upon a time" to "Happily
ever after" (Technical Report No. 57). La Jolla: Center for Human
Information Processing, University of California at San Diego, 1976.

Nagao, M., & Tsujii, J. Analysis of Japanese sentences. American
Journal of Computational Linguistics, 1976, Microfiche 41.

Nash-Webber, B. L. Semantic interpretation revisited (Technical Report
No. 3335). Cambridge: Bolt, Beranek, & Newman, 1976.

Phillips, B. Topic analysis. Unpublished doctoral dissertation, State
University of New York at Buffalo, 1975.

Propp, V. Morphology of the Folktale. Austin: The University of Texas
Press, 1968.

Putnam, H. The meaning of "meaning". Mind, Language, and Reality:
Philosophical Papers (Vol. 2). Cambridge: Cambridge University
Press, 1975.

Quillian, M. R. The teachable language comprehender. Comm. ACM, 1969,
12, 459-475.

Raphael, B. SIR: Semantic information retrieval. In M. Minsky (Ed.),
Semantic Information Processing. Cambridge: MIT Press, 1968.

Raven, P. H., Berlin, B., & Breedlove, D. E. The origins of taxonomy.
Science, 1971, 174, 1210-1213.

Rumelhart, D. E., Lindsay, P. H., & Norman, D. A. A process model for
long term memory. In E. Tulving & W. Donaldson (Eds.), Organization
of Memory. New York: Academic Press, 1972.

Rumelhart, D. E., & Ortony, A. The representation of knowledge in
memory (Technical Report No. 55). La Jolla: Center for Human
Information Processing, University of California at San Diego, 1976.

Rumelhart, D. E. Understanding and summarizing brief stories
(Technical Report No. 58). La Jolla: Center for Human Information
Processing, University of California at San Diego, 1976.

Schank, R. C. Conceptual Information Processing. Amsterdam:
North-Holland, 1975. (a)

Schank, R. C. The structure of episodes in memory. In D. G. Bobrow &
A. Collins (Eds.), Representation and Understanding: Studies in
Cognitive Science. New York: Academic Press, 1975. (b)

Schank, R. C., & Abelson, R. P. Scripts, plans, and knowledge. Paper
presented at the Fourth International Joint Conference on Artificial
Intelligence, Tbilisi, USSR, 1975.

Schubert, L. K. Extending the expressive power of semantic networks.
Artificial Intelligence, 1976, 7, 163-198.

Shapiro, S. C. The MIND system: A data structure for semantic
information processing (Report R-837-PR). Santa Monica, CA: RAND
Corporation, 1971.

Simmons, R. F. Some semantic structures for representing English
meanings (Technical Report NL-1). Austin: Computer-Aided
Instruction Laboratory, The University of Texas, 1970.

Sondheimer, N. K. Spatial reference and semantic nets. American
Journal of Computational Linguistics, 1977, Microfiche 71.

Thorndyke, P. W. The role of inferences in discourse comprehension.
Journal of Verbal Learning and Verbal Behavior, 1976, 15, 437-446.

Tulving, E. Episodic and semantic memory. In E. Tulving &
W. Donaldson (Eds.), Organization of Memory. New York: Academic
Press, 1972.

White, M. Abstract definition in the cognitive network: The
metaphysical terminology of a contemporary millenarian community.
Unpublished doctoral dissertation, State University of New York at
Buffalo, 1975.

Wilks, Y. Grammar, Meaning, and the Machine Analysis of Language.
London: Routledge, 1972.

Wilks, Y. A preferential, pattern-seeking, semantics for natural
language inference. Artificial Intelligence, 1975, 6, 53-74.

Wilks, Y. Frames, scripts, stories, and fantasies. Paper presented at
the International Conference on Computational Linguistics, Ottawa,
Canada, 1976.

Winograd, T. Procedures as a representation for data in a computer
program for understanding natural language (Report MAC-TR-84).
Cambridge: Artificial Intelligence Laboratory, MIT, 1971.

Woods, W. A. What's in a link: Foundations for semantic networks. In
D. G. Bobrow & A. Collins (Eds.), Representation and Understanding:
Studies in Cognitive Science. New York: Academic Press, 1975.
