KIND TYPES IN KNOWLEDGE REPRESENTATION 
K. Dahlgren 
IBM Los Angeles Scientific Center
11601 Wilshire Blvd.
Los Angeles, California 90025 
J. McDowell 
Department of Linguistics 
University of Southern California 
Los Angeles, California 90089 
Abstract This paper describes Kind Types (KT), a system which uses
commonsense knowledge to reason about natural language text. KT encodes
some of the knowledge underlying natural language understanding,
including category distinctions and descriptions differentiating real-world
objects, states and events. It embeds an ontology reflecting the ordinary
person's top-level cognitive model of real-world distinctions and a database
of prototype descriptions of real-world entities. KT is transportable,
empirically-based and constrained for efficient reasoning in ways similar
to human reasoning processes.
I. The problem A model of the semantic knowledge of concepts
underlying natural language is definitional rather than assertional in that
it contains general descriptions of objects and their relations, as opposed
to facts about specific objects (Levesque 84). Part of competence in
English is the knowledge that an elephant is an animal, and therefore it
moves on its own. Competence also involves knowing particular things
about elephants, such as that they have trunks. This general description
of the elephant concept is part of commonsense knowledge and belief.
We will call this the cognitive model. In order to implement it, a computer
system must represent what speakers of a language believe about
the world and their named concepts, rather than represent the actual
world. A complete computer model of the cognitive model would represent
the commonsense conceptual scheme presupposed by a particular
culture and language, in this case, urban American English.
Knowledge of a natural language implies knowledge of a kind of theory
of the environment used by a culture. In learning a language a child
learns the category cuts recognized in that theory (Berlin 72) (Dougherty
78). Assuming that knowledge of word meaning is not differently represented
than other kinds of knowledge (Tarnawsky 82), KT is designed
to encode the world view embodied in natural language as ordinary
knowledge, while retaining the autonomy of combinatorial semantics and
of syntax. KT does not provide all the meaning, but it yields an interesting
and transportable portion of it.
The work reported here addresses two major problems. First, the 
knowledge associated with a concept does not always give necessary and 
sufficient conditions for deciding whether an object falls under the con- 
cept name. The problem is to find a systematic way of predicting which 
concepts can be reasoned about using first order logic directly and simply 
(as a conjunction of predicates) and which ones require default logic. 
Second, the cognitive model models the actual world, which is open and 
continuous (Hayes 85). The potential concepts and relations between 
them are infinite. Nevertheless, humans manage to reason without cog- 
nitive overload. Can the computer model be systematically constrained 
using predictions paralleling those employed by humans in reasoning 
about the actual world?
Consider the following database of facts (assertions) and axioms (defi- 
nitions). The language used is irrelevant; any system isomorphic to first 
order logic will have the same deficiency. 
1) FACTS
Human(mary).
Teacher(mary).
Studentof(mary,john).
Teacherof(john,mary).
AXIOMS
∀x Teacher(x) → Human(x).
∀x ∃y Teacher(x) → Teacherof(y,x).
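The deficiency can be made concrete with a small sketch. Below is a minimal Python rendering of the database in (1) with naive backward chaining over the single implication axiom (KT itself is written in VM/PROLOG; the function names here are illustrative, not from KT):

```python
# Facts from (1), stored as tuples: (predicate, arg1, arg2...).
FACTS = {
    ("Human", "mary"),
    ("Teacher", "mary"),
    ("Studentof", "mary", "john"),   # john is Mary's student
    ("Teacherof", "john", "mary"),   # mary is John's teacher
}

# Axiom: forall x. Teacher(x) -> Human(x), as a (conclusion, premise) pair.
RULES = [(("Human",), ("Teacher",))]

def holds(pred, *args):
    """True if the atom is a stored fact or follows by one rule step."""
    if (pred, *args) in FACTS:
        return True
    return any(concl == (pred,) and holds(prem[0], *args)
               for concl, prem in RULES)

def has_student(x):
    """Question (2): does x have a student?"""
    return any(f[0] == "Studentof" and f[1] == x for f in FACTS)
```

With this database, the questions in (2) succeed (`holds("Human", "mary")`, `has_student("mary")`), but nothing relates Mary to touchability or self-movement, so queries like those in (3) simply fail.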
A question-answering system with the above database of facts and 
axioms can respond easily to questions such as (2) but would be unable 
to answer (3). 
2) Does Mary have a student?
Who is Mary's student?
Is Mary human?
3) Is Mary touchable?
Can Mary move herself about?
What does Mary do?
Are John and Mary part of an institution?
The questions in (3) can be answered by a system which has a taxonomic
hierarchy with features at the nodes, such as KL-ONE (Brachman and
Schmolze 1985). If Mary is human, Mary is a physical object, which has
the feature "touchable". Similarly, since Mary is an animal, she can
move herself about. KT employs such a taxonomy, and it is called an
ontology to reflect the fact that KT reasons with such information as
though it were true and complete, in contrast to generic information
which is probabilistic. The ontology is unique to KT and is based upon
results in cognitive psychology, linguistics and philosophy.
Another deficiency of the database in (1) is that it knows nothing about
John, Mary and their relationship, even though English speakers share
descriptions of the typical objects in the sets defined by the predicates
Human, Teacher and Student. For example, it would be desirable if the
system could respond as follows: 
4) Is Mary intelligent? --Probably so.
Is Mary articulate? --Probably so.
Does John listen to Mary? --Probably so.
Is Mary educated? --Inherently so.
What does Mary do? --Inherently, teaches.
The questions in (4) reflect the kind of things that average people think
of when confronted with the predicates in (1) (Dahlgren 85). Why not have
the AI system infer similarly? In order for such information to be useful,
the system needs to know that "intelligent" is a probabilistic feature associated
with the predicate Teacher. Therefore, if told
¬Intelligent(mary), it should be able to reason that (5) is consistent, while reasoning
that (6) is definitely inconsistent.
5) ¬Intelligent(mary) ∧ Teacher(mary)
6) (Remainder(X/2) = 0) ∧ Oddnumber(X)
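The contrast between (5) and (6) can be sketched as follows, in Python rather than KT's VM/PROLOG; the table contents and function names are hypothetical illustrations of the distinction between defeasible prototype features and criterial ones:

```python
# Prototype features are probabilistic: denying one is merely atypical.
PROTOTYPE = {"Teacher": {"intelligent", "articulate"}}

def deny_feature(sort, feature):
    """Consistency of asserting sort(x) & ~feature(x)."""
    if feature in PROTOTYPE.get(sort, set()):
        return "consistent (atypical)"    # case (5): defeasible feature
    return "unknown feature"

def parity_claim(x, odd):
    """Consistency of (Remainder(x/2) = 0) with Oddnumber(x) -- case (6).
    Parity is criterial, so the combination can be flatly contradictory."""
    even = x % 2 == 0
    return "inconsistent" if (even and odd) else "consistent"
```

Denying "intelligent" of a teacher leaves the database consistent; asserting oddness of an even number cannot.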
A system needs the capacity to reason with prototype information associated
with concepts. But the vastness of such information is an obstacle
to its use in commonsense reasoning systems. The strategy employed in 
the KT system is to take advantage of the high degree of structure in 
prototype information in order to constrain it. Different types of kinds, 
such as artifacts, natural kinds and persons, are associated with predict- 
ably different types of information, and KT exploits these constraints. 
II. Diversity in the Lexicon The task of representing meaning for sorts
(common nouns) and predicates (verbs and adjectives) themselves has
been impeded by several philosophical problems which are yet to be resolved.
The traditional approach, decomposition into conjunctions of
other predicates, is notoriously defective. There is no principled way to
select or limit the number of other predicates. Suppose the meaning of
apple is represented as (7).
7) Apple ≡ Fruit ∧ (Red ∨ Green) ∧ Round ∧ Size10
Why not add Growsontrees? The proposal to justify the addition of further
predicates by the contrast with the meaning of other words has been
rejected on a number of grounds (Dowty 79).
Predicate meaning representation is difficult because the domain of the
cognitive model is the actual world, which is both open and unknown to
a large extent. Humans can never be totally expert about the actual
world. And, the knowledge of predicates used by speakers of a natural
language varies with expertise, how precise the predicate itself is, and
context. Some psychologists maintain that the inherent openness of
the actual world is dealt with cognitively by making clear (though possibly
inaccurate) category cuts, and then reasoning about categories of objects,
including the unclear cases, using prototypes (Rosch, et al 76)
(Smith and Medin 80). This view implies diversity of representations of
predicate meanings across the lexicon. Some types of predicates will have
criterial features, ODD NUMBER; others, such as names and natural
kinds, LEMON, will not.
Because it represents sort and predicate meaning with prototypes, and
because it uses first order logic, KT differs in theory and results from
systems such as KL-ONE. In KL-ONE, concepts are defined by their
roles (descriptive elements) and their subsuming concepts (those concepts
superordinate to them in the taxonomy). The concept ELEPHANT
is defined by rolesets describing facts such as "has 4 legs", and by its attachment
to MAMMAL. The claim is that all and any instantiation of
the ELEPHANT concept has 4 legs. In contrast, descriptions in KT are
probabilistic. The system accepts elephants with 3 legs, though it knows
that elephants inherently have 4 legs. It accepts eggs which are brown,
even though it knows that eggs are prototypically white. Further, in
KL-ONE, since the descriptions are meant to be defining, non-defining
associated information is not encoded. By contrast, KT encodes a great
deal of information usually associated with a concept, without the implicit
claim that it applies to all instantiations of the concept. ELEPHANT
can have features "forgetful", "lumbering" and so forth,
without claiming that all elephants have those features.
Another implication of the prototype model is that the content of features
is seen as essentially limitless. In contrast, the semantic net model
assumes that there is a manageable set of primitive concepts whose size
is much smaller than that of the English lexicon, and that these are explicitly
connected. In KT, only ontological relationships are stated as rules. The
relationships between specific descriptions can be derived through
problem-solving, but are not encoded. For example, in KL-ONE, the fact
that both clouds and eggs are white is directly stated by a link from both
CLOUD and EGG to WHITE. In KT, that both have a color is stated in
the kind type PHYSICAL OBJECT, but that they both have the same
color is reasoned at run time.
The diversity of information KT accepts is constrained by kind types,
which predict that associated with ELEPHANT are features describing
parts, because ELEPHANT is in the kind type PHYSICAL OBJECT. On
the other hand, ELEPHANT does not have features describing its mode
of construction because it is not in the kind type ARTIFACT. Thus, the
KT system predicts limitless numbers of possible descriptions which are
constrained by types deriving from correlational constraints of the actual
world.
The KT system differs from most other representations of the
commonsense knowledge underlying natural language in taking the content
of descriptions from psycholinguistic studies. Because of its empirical
basis, KT responds to queries in a natural and human-like way.
Though other formalisms could be used to represent empirically-derived
models of human commonsense knowledge, KT lends itself to representing
the diversity of information found in the data because it allows a virtually
unlimited number of features, while organizing them with the kind
types.
III. The Kind Types System KT reads geography text, and shows its
understanding of the text by answering questions. Text understanding
demonstrates the usefulness of the system, but many interesting problems
in that area of research are not addressed by this work. KT is written in
VM/PROLOG. It uses a parser, a first-order logic translator and a
metainterpreter developed by Stabler and Tarnawsky (1985). It employs
a set of databases which represent the commonsense ontology, the generic
features for sorts, type information for the generic features, and
kind types for the ontology. Below is a sample text representative of the
English KT understands.
Sample Text John is a miner who lives in a mountain town. His wife
raises a chicken who lays brown eggs. The company-owned clinic is near 
the mine. The nurse monitors the health of the miners. She approves of 
John's diet. 
III.1 The Ontological Schema To capture ontological constraints, KT
employs a top-level conceptual schema, some of which appears in Figure
1. It is intended to mirror the average English-speaker's beliefs about
what the major category cuts of the environment are, that is, a
commonsense ontology.
Figure 1 The Ontological Schema
ENTITY → (ABSTRACT ∨ REAL) & (INDIVIDUAL ∨ COLLECTIVE)
ABSTRACT → IDEAL ∨ PROPOSITIONAL ∨ QUANTITY ∨ IRREAL
REAL → (PHYSICAL ∨ TEMPORAL ∨ SENTIENT)
    & (NATURAL ∨ SOCIAL)
PHYSICAL → (STATIONARY ∨ NONSTATIONARY)
    & (ANIMATE ∨ INANIMATE)
NONSTATIONARY → SELFMOVING ∨ NONSELFMOVING
COLLECTIVE → MASS ∨ SET ∨ STRUCTURE
STATIONARY → ¬MOVEABLE
TEMPORAL → STATIVE ∨ NONSTATIVE
NONSTATIVE → (GOAL ∨ NONGOAL)
    & (PROCESS ∨ ACTIVITY ∨ MOTION)
PROCESS → POSITIVE ∨ NEGATIVE
ACTIVITY → OCCUPATIONAL ∨ INTERACTIONAL
OCCUPATIONAL → AGRICULTURAL ∨ MININGMANU
    ∨ TRADE ∨ SERVICE ∨ EDUCATION
INTERACTIONAL → POSSESSIVE ∨ ASSISTIVE ∨ CONTACTUAL
    ∨ CONFRONTATIONAL
MOTION → (FAST ∨ SLOW) & (TOWARD ∨ AWAY)
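The schema can be sketched as a lattice of parent links. The Python fragment below assumes a simplified subset of Figure 1; the placement of HUMAN under SELFMOVING and NATURAL is an illustrative assumption, not a claim about KT's actual attachments. A node inherits every superordinate along every parent path:

```python
# Parent links for a fragment of the ontology; a node may have several
# parents, making the structure a lattice rather than a tree.
PARENTS = {
    "ABSTRACT": ["ENTITY"], "REAL": ["ENTITY"],
    "PHYSICAL": ["REAL"], "TEMPORAL": ["REAL"], "SENTIENT": ["REAL"],
    "NATURAL": ["REAL"], "SOCIAL": ["REAL"],
    "STATIONARY": ["PHYSICAL"], "NONSTATIONARY": ["PHYSICAL"],
    "SELFMOVING": ["NONSTATIONARY"],
    # illustrative cross-classified attachment
    "HUMAN": ["SELFMOVING", "NATURAL"],
}

def superordinates(node):
    """All categories a node inherits, following every parent path."""
    out = set()
    for parent in PARENTS.get(node, []):
        out.add(parent)
        out |= superordinates(parent)
    return out
```

Under these assumptions, `superordinates("HUMAN")` collects PHYSICAL, REAL and NATURAL along both paths up to ENTITY.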
The goal is to encode an ontology which is consistent with an empirically
verifiable cognitive model. As much evidence as possible was derived
from psychological research. The schema was developed to handle the
predicates found in 4100 words of geography text drawn from textbooks.
Despite the complexity of constructing a computer model of the
ontology, two commonly-used simplifications, binary trees and planar
branching in trees, were rejected. First, though binary trees have simplifying
mathematical properties, they are not likely to be psychologically
real. People easily think in terms of more than two branches, such as
FISH vs BIRD vs MAMMAL, and so on, off of the VERTEBRATE node.
Secondly, most representations assume that each node has a unique
parent. But cross-classification is needed since commonsense reasoning
uses it. People understand, for example, that entities cross-classify as individuals
or sets and real or abstract. This means that at each node, more
than one plane might be needed for branching. Cross-classification is
handled as in (McCord 85). A type hierarchy is generated which permits
each node to be cross-classified in n ways. In the top level rule of Figure
1 each entity must be classified both ways, as either ABSTRACT or
REAL, and as either INDIVIDUAL or COLLECTIVE. This corresponds
to the claim that cognitively there is essentially a parallel ontological
schema for collectives. For example, people know that herds consist of
animals, so that herds are real and concrete. Thus we have the parallel
ontology fragments in (8).
(8)
INDIVIDUAL                 COLLECTIVE
ENTITY                     ENTITY
ABSTRACT    REAL           ABSTRACT    REAL
            CONCRETE                   CONCRETE
            ANIMAL                     ANIMAL
            COW                        HERD
Inheritance of properties works differently for the collectives than it
does for individuals. Because cow is under ANIMAL, "cow is a kind of
animal" is true. In contrast, herd attaches to ANIMAL, but "a herd is a
kind of animal" is not true. A herd consists of animals. We have found
that though there are gaps among the collectives, a surprising number of
types of entities have collective names in English. For example, propositions
come in collectives (discourse, myth). Another important
cross-classification involves SOCIAL vs NATURAL. Entities (or events)
which come into being (or take place) naturally must be distinguished
from those which arise through some sort of social intervention.
ARTIFACT is one of the SOCIAL nodes. The distinction needs to be
made high up in the ontology because it affects most kind types. For example,
events may either be SOCIAL (party) or NATURAL (earthquake).
(Section IV expands upon the justifications for the ontology.)
The ontology also assumes the possibility of multiple attachments of
instantiations to nodes. Thus the representation is actually a lattice
rather than a tree. For example, an entity, John, is both a HUMAN with
the physical properties of a mammal, and is also a PERSON who thinks.
The latter makes John very similar to other sentients such as institutions
and social roles. Instead of loading all of that complexity into a single
HUMAN node, we make the SENTIENT/NON-SENTIENT distinction
high up in the hierarchy. There is ample philosophical (Strawson 53) and
psychological (Gelman and Spelke 81) support for this decision. Any actual
person is attached to both the HUMAN and PERSON nodes in the
ontology.
III.2 Generic Information In the generic features database, each sort
is represented as a predicate with two arguments. The first is a list of
prototype features and the second is a list of inherent features. A prototype
feature is typically associated with a sort or predicate. Most entities
have more prototypical features than inherent features. From our sample,
a miner is typically "male"; a nurse is typically "female"; a town
typically has "houses", "a square", "a fountain", and so on. Inherent
features are rationally unrevisable properties of a sort or predicate.
Thus, a man is inherently "male", a wife is inherently "married", a house
is inherently "house-sized". From our sample, a miner inherently "works
in a mine", a nurse inherently is "educated", a town inherently contains
"buildings", and so on.
III.3 Feature Types The prototype features are represented by the same
set of predicates used to represent the inherent features, thus achieving
some economy in the rules. Nevertheless, the number of predicates
needed to encode the inherent and prototype features is theoretically
limitless. Fortunately, a small and manageable set of 33 feature types
encodes a great deal of information, although not exhaustively. The features
themselves were chosen empirically to correspond with
psycholinguistic data gathered by Rosch et al (1976), Ashcraft (1976)
and Dahlgren (1985a). When asked to list prototypical features of various
concrete objects, subjects tend to name features which fall into a small
number of types such as SIZE, COLOR, SHAPE, and FUNCTION.
Similarly, a few types of features such as STATUS, SEX, INTERNAL
TRAIT and RELATION are named for social roles.
Notice that a feature type such as SIZE or COLOR may be inherent
for one sort but only prototypical for another. For instance, while blood
inherently has COLOR "red", a brick is only prototypically "red". While
a brick inherently has SHAPE "rectangular parallelepiped", bread is only
prototypically "loaf-shaped". In some cases, a sort has a feature type
both inherently and prototypically. For example, a doctor has the inherent
FUNCTION "treats sick people" and the prototypical FUNCTION
"consoles sick people".
III.4 Kind Types as Metasorts Most knowledge representation systems
permit any combination of the features in descriptions. KT limits these
combinations by taking advantage of several important ontological constraints
affecting the possible real-world objects and therefore possible
combinations of features in commonsense knowledge. Objects fall into
kinds. In particular, natural kinds exist because their members share
some underlying trait, while artifacts and social kinds exist because of
social convention (Schwartz 1979) (Dahlgren 1985b). We call classifications
of kinds KIND TYPES, so that NATURAL KIND constitutes one
kind type, ARTIFACT another, and so on. Kind types constrain the
commonsense knowledge base in several ways. First, each kind type is
understood in terms of certain predictable feature types. NATURAL
KIND is conceived primarily in terms of perceptual features, while
ARTIFACT adds functional features. Second, there is a correlational
structure to the features of real-world objects. Given that an object is a
mammal, certain features will be found (e.g. "fur") and others will be
absent (e.g. "feathers").
Associated with each node in the ontology is kind type information
encoding feature types entities attached at that node may have. Entities
may be described by features falling into some or all of these feature
types, and no others. Inheritance up the tree ensures that any lower node
has all the feature types of higher nodes on any path to ENTITY. For
instance, any node under PHYSICAL may have certain feature types,
and any node under ARTIFACT may have those inherited from PHYSICAL,
as well as further feature types, as below:
PHYSICAL - Shape, Size, Color, Material,
    Texture, Odor, Hasparts, Partof
ARTIFACT - [PHYSICAL], Function, Operation,
    Construction, Owner
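A sketch of this feature-type inheritance, assuming just the PHYSICAL/ARTIFACT chain listed above (a Python stand-in for KT's PROLOG encoding; the function names are illustrative):

```python
# Each kind type licenses a set of feature types.
FEATURE_TYPES = {
    "PHYSICAL": {"shape", "size", "color", "material",
                 "texture", "odor", "hasparts", "partof"},
    "ARTIFACT": {"function", "operation", "construction", "owner"},
}
KIND_PARENT = {"ARTIFACT": "PHYSICAL"}

def licensed_types(kind):
    """All feature types an entity of this kind type may carry,
    including those inherited from kind types above it."""
    types = set(FEATURE_TYPES.get(kind, set()))
    if kind in KIND_PARENT:
        types |= licensed_types(KIND_PARENT[kind])
    return types

def admissible(kind, feature_type):
    """Kind types as metasorts: reject unlicensed feature types."""
    return feature_type in licensed_types(kind)
```

An ARTIFACT description may mention color (inherited from PHYSICAL) or function, but a bare PHYSICAL sort may not carry a function feature.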
At each node, only certain feature types are applicable. Conversely, each
feature known to KT is classified by type as a COLOR, SIZE, FUNCTION,
INTERNAL TRAIT or other. Cohn (1985) describes the economy
of the use of sorts in logic programming. In the KT system, sorts and
predicates appear at the terminal nodes of the ontology. In addition, the
kind types employed by the system represent metasorts, in that they
constrain the possible types of sorts recognized by the system.
III.5 Encoding the Common Sense Knowledge The representations
described above will be illustrated with the sort nurse. Nurse is attached
to the ontology in axiom (9).
9) nurse(X) → role(X).
From this axiom nurse inherits SENTIENT, SOCIAL, PHYSICAL,
REAL, INDIVIDUAL and ENTITY from the ontology. In the generic
database, the axiom (10) lists the prototype and inherent features of
nurse.
10) nurse({caring,female}, {educated,assistant,
help(X,Y) & person(Y) & sick(Y)}).
Notice that the last inherent feature is in the form of a PROLOG clause.
This makes it possible to use the whole complex feature as input to the
English grammar in order to formulate an English response to a question
such as "What does the nurse do?", or "Does the nurse help people?".
The feature typing database classifies the features as follows:
relation(assistant).
internaltrait(caring).
internaltrait(educated).
sex(female).
function(help(*,*)).
The kind types predict that as a ROLE, nurse will have certain types
of features. Inherited from the SENTIENT kind type are feature types
INTERNAL TRAIT ("caring") and GOAL ("tries to help"). Inherited
from the SOCIAL kind type are feature types FUNCTION ("takes care
of patients") and REQUIREMENT ("license"). In addition, RELATION
type features ("assistant") are predicted with a ROLE.
IV. The Inference Mechanism Built into the natural language component
by Stabler and Tarnawsky is a metainterpreter which solves queries
of all axioms active in the system. This permits us to query ontological
and generic information as well as textual information. The translation
of the first sentence of Sample Text is as in (11).
11) miner(john) & town(town220)
The problem solver derives the answers to queries as in (12), matching
logic translations of the queries, which are in the form of Prolog goals, to
the database.
12) Is John a miner? -- Yes
Does John live in a town? -- Yes
In addition, KT is able to make a number of inferences from the text
which are not directly stated there. The inferences are drawn from various
aspects of the common-sense knowledge built into KT.
IV.1 Inheritance Using the ontological database and the same problem
solver, the KT system deduces taxonomically inherited information
about the entities mentioned in the text, as in (13)-(14).
13) What is a miner?
--A miner is a role, sentient, concrete,
social, individual and an entity.
14) What does a miner do?
--A miner digs for minerals.
What is digging?
--a goal-oriented, natural, nonmental,
real, temporal activity
If an entity has dual attachment, for example as a human and as a role,
or as a place and as an institution, then KT explains inheritance relations
along both paths of the ontology. A clinic is both a social place and an
institution, and so when asked (15),
15) What is the clinic?
KT replies both that "A clinic is an institution, sentient, physical, real,
collective, structure." and that "A clinic is a social place, place, inanimate,
physical, stationary, social, real, individual." Direct ontological
questions such as (16) are also answered:
16) Is the clinic a social place? --Yes
Is the clinic collective? --Yes
The inheritance path is followed in answering such questions, so that the
system can answer queries of node attachments not only at the terminal
nodes of the ontology, but at all higher levels.
IV.2 Complete and Incomplete Knowledge In reasoning with this
schema, the system knows which valid inferences it can derive
ontologically, and therefore definitively, and which knowledge is incomplete.
For example, KT knows that it knows the following for certain:
∀x Human(x) → Thinks(x)
∀x Teacher(x) → Human(x)
It also knows that if something is HUMAN, it is not ABSTRACT. When
asked "Is the teacher abstract?" it answers "No". Thus it handles the
exclusivity of sets called for by Hendrix (1979) and Tenenbaum (1985).
On the other hand, it knows which information is incomplete. With generic
descriptions, KT knows that it only knows at the probabilistic level.
If asked, "Is Mary intelligent?" it responds "Probably so." This reflects
the fact that most English speakers share a prototype of teachers as intelligent.
The logic works this way. If a question is ontological, KT gives
definitive (yes/no) answers. If the question is generic, the answer is
qualified as either prototype or inherent. If no answer can be derived to
a non-ontological question, KT responds "I don't know." Thus KT makes
the open world assumption except with regard to ontological classifications.
This ability to reason about incomplete definitions is similar to
Levesque's proposal for incomplete databases (Levesque 84).
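The answer policy just described can be sketched as follows; the miniature databases are invented for illustration (KT derives such answers from its ontology and generic databases rather than lookup tables):

```python
# Ontological classifications are closed-world: definitive yes/no.
ONTOLOGY = {
    ("teacher", "human"): True,
    ("teacher", "abstract"): False,
}
# Generic (prototype) knowledge is only probabilistic.
PROTOTYPE = {("teacher", "intelligent")}

def answer(sort, predicate):
    """Definitive for ontological queries, qualified for generic ones,
    open-world ("I don't know") for everything else."""
    if (sort, predicate) in ONTOLOGY:
        return "Yes" if ONTOLOGY[(sort, predicate)] else "No"
    if (sort, predicate) in PROTOTYPE:
        return "Probably so."
    return "I don't know."
```

The closed-world treatment applies only to the ontology; all other unanswerable questions fall through to "I don't know."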
IV.3 Prototype and Inherent Features KT answers queries concerning
features of the entities in the text, both directly and by types of features.
Direct feature queries are of the form (17). The form of the answer depends
upon whether the feature is prototypical or inherent.
17) Is the miner rugged? --Probably so.
Is the clinic a place? --Inherently so.
Does the chicken lay eggs? --Inherently so.
Are the eggs white? --Probably so.
How is digging done? --Probably with a shovel.
Where is digging done? --Probably in the earth.
IV.4 Overriding Features Generic information is handled differently
from ontological information. First, it is tentatively inferred, and
checked against the current knowledge base of information built up from
reading the text. If anything in the textual database conflicts with a
generic inference, the latter is overridden. KT takes the text as the authority,
and if the text says that an entity has a feature contradicting
those in its commonsense knowledge of the entity, the text's claim comes
first. For example, Sample Text says that the eggs are "brown", which
overrides the prototypical generic feature "white" which is listed for egg,
as in (18).
18) Are the eggs brown? --The text says so.
The cancellation takes place simply by matching to the textual database
first. Similarly, if a text said that an elephant had three legs, the KT system
would reason that it had three legs, and not the inherent four that
elephants have. By overriding inherent features, KT gets around the
cancellation problem which arises when features are viewed as logically
necessary. If "has four legs" is taken to be a logically necessary feature,
any three-legged elephant forces a contradiction, or special processing for
exceptions (Brachman and Schmolze 1985). The KT system accepts both
facts as true, with no contradiction. This particular elephant has three
legs, and elephants inherently have four legs.
In attempting to match to both the textual and generic databases, the
possibility of infinite recursion arises. This is true in principle for the human
reasoner, as well. KT prevents infinite recursion by limiting inferences
to a depth of 5.
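The override-by-match-order strategy can be sketched as below; the two dictionaries are toy stand-ins for KT's textual and generic databases, and the depth parameter marks where KT's depth-5 bound would cut off recursive inference:

```python
TEXT = {("egg", "color"): "brown"}                  # from the Sample Text
GENERIC = {("egg", "color"): ("white", "prototype")}  # commonsense default

def feature_value(sort, ftype, depth=0, limit=5):
    """Match the textual database before the generic one, so the text's
    claim pre-empts the prototype; the bound mirrors KT's depth limit."""
    if depth > limit:
        return None
    if (sort, ftype) in TEXT:                # text is the authority
        return TEXT[(sort, ftype)], "the text says so"
    if (sort, ftype) in GENERIC:             # fall back on the prototype
        return GENERIC[(sort, ftype)]
    return None
```

With the text present, the eggs are "brown" because the text says so; remove the textual fact and the prototypical "white" reappears.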
Because of the feature typing, KT can answer queries as in (19). 
19) What color are the eggs?
What function does the clinic have?
Feature typing classifies "brown" as of type COLOR. When KT looks
first at the translation of the text to see whether it contains an assertion
which states a color for the eggs, it must distinguish the facts in the text
which are relevant to the feature type queried. With respect to Sample
Text, in order for KT to answer "What color are the eggs?", KT must
know that "brown" is a COLOR. Without feature types, KT would not
contrast "white" with "brown".
KT deduces sets of facts, as well as individual facts. When queried for a
type of feature, such as FUNCTION, KT responds with all functions
listed for a sort. For example, clinics prototypically have both outpatient
and emergency functions, and KT lists both when queried for function.
For sorts which are structural, that is, concrete objects and institutions,
KT is able to describe the structure. If asked "What structure does the
clinic have?", KT answers that typically it has a hierarchy of head-
assistant-clientele and has roles of doctor for head, nurse for assistant and
patient for clientele. Similarly, if asked "What structure does the fish
have?", KT answers, inherently it has these parts: fins, 1 tail, 1 head, 2
eyes, scales. When KT lists parts, bare plurals mean an unspecified number
greater than one.
IV.5 Kind Types The kind types are useful in both parsing and inferencing
for text understanding. In the parsing phase, kind types can be used
four ways. First, verb sense ambiguity can be resolved by the kind types
of subject and object head nouns. In a sentence with the verb take,
knowing that the subject is a vehicle forces the choice of one sense of
the verb, and knowing that it is a human forces another. Secondly, KT
can reason the other way around, and use selection restrictions on verbs
to infer the kind types of entities referred to in the sentence. Consider sentence
(20).
20) ABC sued the man.
Using kind types for selection restrictions, KT infers that the entity
named ABC is a SENTIENT. Given the further information in (21), KT
infers that ABC is an INSTITUTION and not a PERSON, because the
verb join requires an INSTITUTION as object.
21) The man had joined ABC illegally. 
Thirdly, certain anaphoric references can be resolved using kind types.
When verb selection restrictions classify the entity referred to by a pronoun
as in a certain kind type, then possible antecedents are correspondingly
constrained. Consider the relationships in (22). It corefers with
milk because, when intransitive, spill requires a LIQUID as subject.
22) The cat drank the milk. It spilled.
Fourthly, attachment ambiguities for prepositional phrases can be resolved
using kind types. Consider sentence (23).
23) John bought the lock in the afternoon.
It is syntactically possible for the prepositional phrase in the afternoon to
modify the lock, the verb phrase or the whole sentence. Since afternoon
is in the kind type TEMPORAL, KT can resolve this syntactic ambiguity,
and attach the prepositional phrase so that it modifies the whole sentence
(Dahlgren and McDowell, 1986).
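The third and fourth uses can be sketched as follows; the kind-type table and the attachment choices are illustrative assumptions, not KT's grammar:

```python
# Illustrative kind-type assignments for words in (22) and (23).
KIND = {"afternoon": "TEMPORAL", "lock": "PHYSICAL",
        "milk": "LIQUID", "cat": "ANIMAL"}

def attach_pp(pp_object):
    """Fourth use: a TEMPORAL object of the preposition attaches the
    prepositional phrase to the whole sentence, not to the noun."""
    return "sentence" if KIND.get(pp_object) == "TEMPORAL" else "noun phrase"

def resolve_pronoun(candidates, required_kind):
    """Third use: keep only antecedents whose kind type satisfies the
    verb's selection restriction (intransitive spill wants a LIQUID)."""
    return [c for c in candidates if KIND.get(c) == required_kind]
```

For (23), "in the afternoon" modifies the sentence; for (22), only milk survives as antecedent of the pronoun subject of spill.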
IV.6 Summary of Inference Mechanism In summary, predications used 
to derive inferences in the text are found in five aspects of common-sense 
knowledge: 
a) Ontological Schema 
b) Verb and adjective selection restrictions 
c) Generic Information 
d) Typing Information 
e) Kind types 
In using KT, queries drive these inferences. After a text such as the 
Sample Text has been read, KT can respond to queries and seem to un- 
derstand the text in a more human-like way using the various aspects of 
knowledge indicated above. Below are listed some queries and responses. 
Q: Who is John?
A: The man who lives in the town.
---Prototypical town has people living in it.
---Prototypical male person is a man (not a boy).
Q: Was the town built?
A: Yes.
---By ontology of artifacts
Q: Who built the town?
A: People.
---By ontology of artifacts
Q: Does John wear pants?
A: Probably so.
---By prototype database.
Q: Does John eat eggs?
A: Yes.
---Because eggs are food.
Q: What does health think?
A: Health doesn't think.
---By kind types.
Q: Does John look like a clinic?
A: No.
---By ontology database.
Q: Does John live in a tent?
A: Probably not.
---By prototype of town.
Q: Does John have a function?
A: Yes.
---By kind types.
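The kind-type answer to "What does health think?" can be sketched as follows; the type assignments and the set of SENTIENT subtypes are illustrative.

```python
# Sketch: denying a predication via kind types. "think" selects a
# SENTIENT subject; "health" is not assigned a SENTIENT kind type,
# so KT answers that health doesn't think.
KIND_TYPE = {"John": "PERSON", "health": "ABSTRACT"}
SENTIENT_TYPES = {"PERSON", "ROLE", "BODY", "INSTITUTION"}

def can_think(noun):
    """True iff the noun's kind type satisfies the subject restriction of 'think'."""
    return KIND_TYPE.get(noun) in SENTIENT_TYPES

print(can_think("health"))  # False: "Health doesn't think."
print(can_think("John"))    # True
```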
V. Basis for the Commonsense Knowledge. Results in linguistic research
underline the importance of category distinctions, such as those between
abstract and concrete objects, and persons as opposed to other objects.
These actively affect sentence interpretation and generation. The
sentence "The rock read the book" must either be interpreted as
anomalous or metaphorical because only persons read. These constraints
provide an empirical basis for the ontology. Cognitive psychological
research provides a further basis for the ontology. Keil's work on
ontological categorization in cognitive development was consulted in
constructing the schema (Keil 79). Gelman and Spelke's results suggested
placing SENTIENT higher in the schema (Gelman and Spelke 81).
Graesser and Clark's studies were the basis of the verb ontology
(Graesser and Clark 85). Psycholinguistic research in the prototype
theory provided descriptions of the actual prototypes shared by English-
speakers for a number of these categories (Rosch et al. 76) (Dahlgren 85).
The ontological schema was developed in two steps. First, the verbs
from the corpus of geography texts were classified according to
selectional restrictions (SRs) on subjects and objects. Second, the
minimal categories needed to accommodate these SRs were arranged in a
hierarchical schema. Certain SRs, such as HUMAN, ANIMATE,
CONCRETE, were expected. Others were surprises. Some verbs required
complements that were marked for PLACE, and others required either
subjects or objects to have certain moveability features. These are
summarized below.
STATIONARY: normally immobile, attached to the earth, moved 
only at great effort. 
SELFMOVING: normally in motion or designed for motion, in some
cases with no apparent initial source.
NONSELFMOVING: normally immobile but can be moved with
slight effort. A source for the motion is expected, usually something
SELFMOVING.
One other interesting result from this stage of the project is that a number
of verbs take either a PROPOSITIONAL or a SENTIENT subject. Both
a book and a person can say something.
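Such a disjunctive restriction can be sketched as a set of admissible subject types; the kind-type assignments here are illustrative.

```python
# Sketch: a disjunctive selection restriction. "say" accepts either a
# PROPOSITIONAL or a SENTIENT subject; assignments are illustrative.
SAY_SUBJECT = {"PROPOSITIONAL", "SENTIENT"}
KIND_TYPE = {"book": "PROPOSITIONAL", "person": "SENTIENT",
             "stone": "PHYSICAL"}

def can_say(noun):
    """True iff the noun's kind type is among the admissible subject types."""
    return KIND_TYPE[noun] in SAY_SUBJECT

print(can_say("book"), can_say("person"), can_say("stone"))  # True True False
```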
Once the set of categories had been established, the next stage was
fitting them into a hierarchy from which inheritance of features could be
computed by KT. There were several constraints guiding this process.
First, we wanted the ontology to be as compact as possible. Second, we
wished to minimize nonexistent leaf nodes. Third, we preferred that the
system infer too little rather than too much. During this process it was also
necessary to decide which of the SRs represented true category cuts in an
ontological schema and which were merely features on individual lexical
items. The guiding principle here was that if the distinction under
examination (e.g. ANIMATE/INANIMATE) pervaded some subtree, then it
was assigned to a branching point. But if some distinction was needed in
isolated parts of the tree, then it was represented as a feature. For
instance, we found that the INDIVIDUAL/COLLECTIVE distinction
pervades the lexicon and must be a primary cut in the ontology. Many
verbs select only INDIVIDUAL or only COLLECTIVE (stampede)
subjects or objects. Properties which were assigned feature
status were items like EDIBLE and SIZE.
The NATURAL/SOCIAL distinction was placed high on the tree because
human intervention pervades the world. All abstract entities are products
of the human mind, but every category of real entities, including events
and states, contains dozens of examples of the products of society. We
therefore reserved the term ARTIFACT for inanimate man-made objects
to distinguish them from natural inanimate objects. The
SENTIENT/PHYSICAL distinction is also fairly high. SENTIENT is
often placed as a subordinate of ANIMATE, but in commonsense
reasoning, the properties of people and things are very different. The
NATURAL/SOCIAL distinction applies to SENTIENT just as it does to
PHYSICAL. A NATURAL, SENTIENT entity is a PERSON, that is a
man or woman, whereas a SOCIAL, SENTIENT entity is a ROLE:
secretary, miner, president. A collection of PERSON is a BODY: crowd,
mob. A collection of ROLE is an INSTITUTION: hospital, school. The
INDIVIDUAL/COLLECTIVE cut had to be made at the level of
ENTITY (the highest level) at the same place as the ABSTRACT/REAL
cut. This was not the only place where multiple distinctions applied (see
Figure I).
Our term COLLECTIVE applies to all collections of entities, classified
into three subgroups. True collectives are sets in which each member of
the set is identical to all the others (herd, mob). Masses are collections
whose members are referred to only in terms of measurable units (sand,
water). Finally, there are structures where the members have specified
relations, such as in institutions (school, company).
It was consideration of both the constraints listed above, and the
assignments of SRs to feature or node status, that led us to abandon both
binary branching and planar trees as useful representational devices.
While it was possible to model some distinctions as binary, others
required more than two branches. For example, ABSTRACT entities
divide into IDEAL, PROPOSITIONAL, QUANTITY, and IRREAL, all
of which have equivalent status as SRs.
Figure III: Terminal Nodes in the Schema
Example    Rule
bush       PLANT <- REAL & INDIVIDUAL & PHYSICAL & NATURAL & ANIMATE & STATIONARY
bear       ANIMAL <- REAL & INDIVIDUAL & PHYSICAL & NATURAL & ANIMATE & NONSTATIONARY & SELFMOVING
mountain   PLACE <- REAL & INDIVIDUAL & PHYSICAL & STATIONARY & INANIMATE
mountain   NATURAL PLACE <- NATURAL & PLACE
village    SOCIAL PLACE <- PLACE & SOCIAL
stone      MINERAL <- REAL & INDIVIDUAL & PHYSICAL & NATURAL & INANIMATE & NONSTATIONARY & NONSELFMOVING
Santos     PERSON <- REAL & INDIVIDUAL & NATURAL & SENTIENT
car        VEHICLE <- ARTIFACT & PHYSICAL & SELFMOVING
radio      ARTIFACT <- ARTIFACT & NONSELFMOVING
secretary  ROLE <- REAL & INDIVIDUAL & SENTIENT & SOCIAL
book       DISCOURSE <- ABSTRACT & COLLECTIVE & PROPOSITIONAL
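The rules of Figure III can be sketched as feature conjunctions; only two of them are reproduced here, and the tiny lexicon is illustrative.

```python
# Sketch of Figure III: a terminal node names the conjunction of
# features inherited down one path of the ontology.
RULES = {
    "ANIMAL": {"REAL", "INDIVIDUAL", "PHYSICAL", "NATURAL",
               "ANIMATE", "NONSTATIONARY", "SELFMOVING"},
    "MINERAL": {"REAL", "INDIVIDUAL", "PHYSICAL", "NATURAL",
                "INANIMATE", "NONSTATIONARY", "NONSELFMOVING"},
}
LEXICON = {"bear": "ANIMAL", "stone": "MINERAL"}

def features(noun):
    """All features a noun carries via its terminal node."""
    return RULES[LEXICON[noun]]

print("SELFMOVING" in features("bear"))   # True
print("SELFMOVING" in features("stone"))  # False
```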
We were still faced with the fact that many entities still seemed to
straddle the hierarchy. Is an individual human a PRIMATE or a
PERSON, or both? Is a hospital an INSTITUTION or a PLACE, or both?
If we were to establish a hierarchy which would reflect these differences,
we would end up with a very large and unwieldy schema with huge gaps.
Therefore, we decided on multiple attachment for those entities which
required it. This decision was justified as well by examination of the
texts, which revealed that a human being was generally dealt with in a
context as either a person or a physiological being, but rarely as both at
the same time. Figure III gives examples of some nouns, their assignment
to categories, and rules by which terminal nodes in the schema are
generated from higher-level nodes. Figure III shows only a few examples
of terminal nodes in the schema. However, every path through the
ontology results in a terminal node which is named and which represents
a unique class defined by inheritance of features up the tree. Terminal
node names distinguish the individuals from the collectives. For
instance, the collective node corresponding to PLANT is FLORA. The
individual node corresponding to DISCOURSE is PROPOSITION.
Similarly, STUFF is the collective of MINERAL, INSTITUTION is the
collective of ROLE, and BODY is the collective of PERSON.
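The individual/collective pairings just listed can be sketched as a simple mapping:

```python
# Sketch: individual terminal nodes paired with their collective
# counterparts, as listed in the text.
COLLECTIVE_OF = {"PLANT": "FLORA", "PROPOSITION": "DISCOURSE",
                 "MINERAL": "STUFF", "ROLE": "INSTITUTION",
                 "PERSON": "BODY"}

print(COLLECTIVE_OF["PERSON"])  # BODY
```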
The types of features which occurred in the data at each node in the
ontology were the basis of the kind types. It is an empirical fact that
feature types are correlated in relation to ontological classifications. At
each node in the ontology is a kind type encoding certain sets of
properties that any entity classified at that node may have. Inheritance up
the tree ensures that any lower node has all the properties of higher nodes
on a single path to ENTITY. For each property at a node, a set of values
applies. While the values for items such as COLOR are fairly obvious,
we have had to construct value ranges elsewhere. For SIZE, we have
started with the set {microscopic, tiny, small, handleable, medium, large,
huge, building-sized, skyscraper-sized, mountainous, region-sized},
which is a reality-oriented scale to be applied loosely. The kind types
were extracted empirically from the generic data after all the features
were typed, by inspection of the types of features associated with sorts
and predicates at each node of the ontology.
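The inheritance of kind-type properties up a path to ENTITY can be sketched as follows. The DIET property and the placement of SIZE and COLOR on PHYSICAL are illustrative assumptions, not taken from the paper; only the SIZE value range is quoted from the text.

```python
# Sketch: a kind type at each node lists property types; value ranges
# such as the SIZE scale attach to a property. Lower nodes inherit all
# property types of higher nodes on their path to ENTITY.
SIZE_VALUES = ["microscopic", "tiny", "small", "handleable", "medium",
               "large", "huge", "building-sized", "skyscraper-sized",
               "mountainous", "region-sized"]

# Hypothetical fragment of the ontology and its kind types.
PARENT = {"ANIMAL": "PHYSICAL", "PHYSICAL": "ENTITY"}
PROPS = {"ENTITY": set(), "PHYSICAL": {"SIZE", "COLOR"}, "ANIMAL": {"DIET"}}

def kind_type(node):
    """Union of property types at this node and all ancestors up to ENTITY."""
    acquired = set()
    while node is not None:
        acquired |= PROPS.get(node, set())
        node = PARENT.get(node)
    return acquired

print(sorted(kind_type("ANIMAL")))  # ['COLOR', 'DIET', 'SIZE']
```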
The texts in the corpus describe lifestyle and industry in various
countries. Generic descriptions of the nouns in the text were drawn from
the psycholinguistic literature, to the extent possible ((Rosch 76);
(Ashcraft 76); (Dahlgren 85)). For ROLE, we used generic descriptions
of social roles collected by Dahlgren and partially published in (Dahlgren
85). For PHYSICAL we used generic descriptions from (Ashcraft 76).
For those nouns where no data existed, generic descriptions were created
conforming to the types of information generated by subjects for similar
nouns. We do not consider this a defect of our system, since we are not
trying to argue for the psychological reality of any particular generic
description, but merely for the efficacy of a reasoning system which uses
them. The decision to place features in the prototype list or the inherent
list for a sort or predicate was decided by two judges. It is a research goal
to verify these judgments experimentally.
Conclusion. In conclusion, KT encodes an ontology which models the
top level of the typical English speaker's cognitive model of the actual
world. It employs several different types of information to reason in
human-like ways about text that it reads. In addition to the ontology, it
uses verb selection restrictions and generic information associated with
concepts. By employing systematic constraints in the form of kind types
associated with nodes in the ontology, KT reasons efficiently. All of the
information KT uses is drawn from empirical studies of human cognitive
psychology, linguistics or the corpus of text which KT reads. Because of
this empirical basis, and the breadth of the ontology, KT is a
transportable system which is potentially useful for understanding any
text of a general, literal nature.
References

Ashcraft, M.H. 1978. Property norms for typical and atypical items from 17
categories. Memory and Cognition 6:227-32.

Berlin, B. 1972. Speculations on the growth of ethnobotanical
nomenclature. Language in Society 1:41-86.

Brachman, R.J. and J.G. Schmolze. 1985. An overview of the KL-ONE
knowledge representation system. Cognitive Science 9:171-210.

Dahlgren, K. 1985a. The structure of social categories.
Cognitive Science 9:379-398.

Dahlgren, K. 1985b. Kind types in lexical representation. To appear.

Dahlgren, K. and J. McDowell. 1986. Using commonsense knowledge
to disambiguate. To appear.

Dougherty, J.W.D. 1978. Salience and relativity in classification.
American Ethnologist 5:66-80.

Dowty, David R. 1979. Word Meaning and Montague Grammar.
Dordrecht, Holland: D. Reidel Publishing Company.

Gelman, R. and E. Spelke. 1981. Thoughts about animate and
inanimate objects. In Social Cognitive Development,
eds. J.H. Flavell and L. Ross, p. 43-81.

Graesser, A. and L. Clark. 1985. Structures and Procedures of
Implicit Knowledge. Norwood, New Jersey: Ablex.

Hayes, P.J. 1985. The second naive physics manifesto.
In Formal Theories of the Commonsense World,
eds. J.R. Hobbs and R.C. Moore. Norwood, N.J.: Ablex.

Hendrix, G.G. 1979. Encoding knowledge in partitioned networks.
In Associative Networks, ed. N.V. Findler, p. 51-92.

Keil, F.C. 1979. Semantic and Conceptual Development.
Harvard U. Press.

Levesque, H. 1984. The logic of incomplete knowledge bases.
In On Conceptual Modelling, eds. M.L. Brodie,
J. Mylopoulos and J.W. Schmidt. New York: Springer-Verlag.

McCord, M. 1985. The lexical base for semantic interpretation in a
prolog parser. Workshop on the Lexicon, Parsing and Semantic
Interpretation. CUNY Graduate Center.

Rosch, E., Mervis, C.B., Gray, W.D., Johnson, D.M. and
Boyes-Braem, P. 1976. Basic objects in natural
categories. Cognitive Psychology 8:382-439.

Smith, Edward E. and Medin, Douglas L. 1985. Categories
and Concepts. Harvard U.

Stabler, E.P., Jr., and G.O. Tarnawsky. 1985. NLProlog:
a prolog-based natural language facility. To appear.

Strawson, P.F. 1959. Individuals. London: Methuen.

Tarnawsky, G.O. 1982. Knowledge Semantics. Unpublished
NYU dissertation.

Tenenbaum, J.D. 1985. Taxonomic reasoning. Proc. IJCAI.
