A TAXONOMY FOR ENGLISH NOUNS AND VERBS 
Robert A. Amsler 
Computer Sciences Department 
University of Texas, Austin, TX 78712
ABSTRACT: The definition texts of a machine-readable 
pocket dictionary were analyzed to determine the 
disambiguated word sense of the kernel terms of each 
word sense being defined. The resultant sets of word 
pairs of defined and defining words were then 
computationally connected into two taxonomic semi-lattices
("tangled hierarchies") representing some
24,000 noun nodes and 11,000 verb nodes. The study of
the nature of the "topmost" nodes in these hierarchies,
and the structure of the trees reveal information about
the nature of the dictionary's organization of the 
language, the concept of semantic primitives and other 
aspects of lexical semantics. The data proves that the 
dictionary offers a fundamentally consistent description 
of word meaning and may provide the basis for future 
research and applications in computational linguistic 
systems. 
1. INTRODUCTION 
In the late 1960's, John Olney et al. at System
Development Corporation produced machine-readable copies
of the Merriam-Webster New Pocket Dictionary and the
Seventh Collegiate Dictionary. These
massive data files have been widely distributed within
the computational linguistic community, yet research
upon the basic structure of the dictionary has been
exceedingly slow and difficult due to the significant
computer resources required to process tens of thousands 
of definitions. 
The dictionary is a fascinating computational resource. 
It contains spelling, pronunciation, hyphenation, 
capitalization, usage notes for semantic domains, 
geographic regions, and propriety; etymological, 
syntactic and semantic information about the most basic 
units of the language. Accompanying definitions are 
example sentences which often use words in prototypical 
contexts. Thus the dictionary should be able to serve 
as a resource for a variety of computational linguistic 
needs. My primary concern within the dictionary has 
been the development of dictionary data for use in 
understanding systems. Thus I am concerned with what 
dictionary definitions tell us about the semantic and 
pragmatic structure of meaning. The hypothesis I am 
proposing is that definitions in the lexicon can be 
studied in the same manner as other large collections of 
objects such as plants, animals, and minerals are 
studied. Thus I am concerned with enumerating the
classificational organization of the lexicon as it has
been implicitly used by the dictionary's lexicographers. 
Each textual definition in the dictionary is 
syntactically a noun or verb phrase with one or more 
kernel terms. If one identifies these kernel terms of 
definitions, and then proceeds to disambiguate them 
relative to the senses offered in the same dictionary 
under their respective definitions, then one can arrive 
at a large collection of pairs of disambiguated words 
which can be assembled into a taxonomic semi-lattice. 
This task has been accomplished for all the definition 
texts of nouns and verbs in a common pocket dictionary.
This paper is an effort to reveal the results of a 
preliminary examination of the structure of these 
databases. 
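The assembly step described above can be illustrated with a minimal sketch. It assumes the disambiguated pairs are already available as (defined sense, kernel sense) tuples; the pairs, sense labels, and function names below are hypothetical illustrations and are not taken from the MPD data or from the software used in this study.

```python
from collections import defaultdict

# Hypothetical (defined sense, kernel sense) pairs. The labels imitate the
# paper's notation but are illustrative only, not MPD data.
PAIRS = [
    ("frond-.0a", "leaf-1.1a"),
    ("leaf-1.1a", "outgrowth-.0a"),
    ("fern-.0a", "plant-1.1a"),
    ("tree-1.1a", "plant-1.1a"),
    ("plant-1.1a", "organism-.0a"),
]

def build_taxonomy(pairs):
    """Return (parents, children) adjacency maps for the kernel relation."""
    parents = defaultdict(set)   # defined sense -> kernel senses above it
    children = defaultdict(set)  # kernel sense  -> senses defined with it
    for defined, kernel in pairs:
        parents[defined].add(kernel)
        children[kernel].add(defined)
    return parents, children

def tops_and_terminals(pairs):
    """Tops have no kernel above them; terminals define nothing else."""
    parents, children = build_taxonomy(pairs)
    nodes = set(parents) | set(children)
    tops = sorted(n for n in nodes if not parents.get(n))
    terminals = sorted(n for n in nodes if not children.get(n))
    return tops, terminals

tops, terminals = tops_and_terminals(PAIRS)
```

Because a sense may have more than one kernel, each node keeps a set of parents, which is what allows the result to be a tangled hierarchy rather than a strict tree.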
The applications of this data are still in the future. 
What might these applications be? 
First, the data should provide information on the
contents of semantic domains. One should be able to 
determine from a lexical taxonomy what domains one 
might be in given one has encountered the word 
"periscope", or "petiole", or "petroleum". 
Second, dictionary data should be of use in resolving 
semantic ambiguity in text. Words in definitions 
appear in the company of their prototypical 
associates. 
Third, dictionary data can provide the basis for 
creating case grammar descriptions of verbs, and noun
argument descriptions of nouns. Semantic templates of 
meaning are far richer when one considers the 
taxonomic inheritance of elements of the lexicon. 
Fourth, the dictionary should offer a classification
which anthropological linguists and psycholinguists 
can use as an objective reference in comparison with 
other cultures or human memory observations. This 
isn't to say that the dictionary's classification is 
the same as the culture's or the human mind's, only 
that it is an objective datum from which comparisons 
can be made. 
Fifth, knowledge of how the dictionary is structured
can be used by lexicographers to build better 
dictionaries. 
And finally, the dictionary, if converted into a
computer tool, can become more readily accessible to
all the disciplines seeking to use the current
paper-based versions. Education, historical
linguistics, sociology, English composition, etc. can
all make steps forward given that they can assume
access to a dictionary is immediately available via 
computer. I do not know what all these applications 
will be and the task at hand is simply an elucidation 
of the dictionary's structure as it currently exists. 
2. "TANGLED" HIERARCHIES OF NOUNS AND VERBS
The grant, MCS77-01315, "Development of a Computational
Methodology for Deriving Natural Language Semantic
Structures via Analysis of Machine-Readable
Dictionaries", created a taxonomy for the nouns and
verbs of the Merriam-Webster Pocket Dictionary (MPD),
based upon the hand-disambiguated kernel words in their
definitions. This taxonomy confirmed the anticipated
structure of the lexicon to be that of a "tangled
hierarchy" [8,9] of unprecedented size (24,000 noun
senses, 11,000 verb senses). This data base is believed
to be the first to be assembled which is representative
of the structure of the entire English lexicon. (A
somewhat similar study of the Italian lexicon has been
done [2,11]). The content categories agree
substantially with the semantic structure of the lexicon
proposed by Nida [15], and the verb taxonomy confirms
the primitives proposed by the San Diego LNR group [16].
This "tangled hierarchy" may be described as a formal 
data structure whose bottom is a set of terminal 
disambiguated words that are not used as kernel defining 
terms; these are the most specific elements in the
structure. The tops of the structure are senses of
words such as "cause", "thing", "class", "being", etc.
These are the most general elements in the tangled 
hierarchy. If all the top terms are considered to be 
members of the metaclass "<word-sense>", the tangled 
forest becomes a tangled tree. 
The terminal nodes of such trees are in general each
connected to the top in a lattice. An individual
lattice can be resolved into a set of "traces", each of
which describes an alternate path from terminal to top.
In a trace, each element implies the terms above it, and
further specifies the sense of the elements below it.
The collection of lattices forms a transitive acyclic
digraph (or perhaps more clearly, a "semi-lattice", that
is, a lattice with a greatest upper bound, <word-sense>,
but no least lower bound). If we specify all the traces
composing such a structure, spanning all paths from top
to bottom, we have topologically specified the
semi-lattice. Thus the list on the left in Figure 1
topologically specifies the tangled hierarchy on its
right.
(a b c e f)              a
(a b c g k)              |
(a b d g k)              b
(a b c g l)             / \
(a b d g l)            c   d
(a b c g m)           / \ / \
(a b d g m)          e   g   i
(a b d i)            |  /|\
                     f k l m
Figure 1. The Trace of a Tangled Hierarchy
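The equivalence between a list of traces and the hierarchy it specifies can be checked mechanically. The following is a minimal sketch, assuming each trace is written as a string of single-letter node names from top to bottom; the function names are hypothetical and chosen only for this illustration.

```python
# The eight traces listed in Figure 1, one letter per node, top first.
TRACES = [
    "abcef", "abcgk", "abdgk", "abcgl",
    "abdgl", "abcgm", "abdgm", "abdi",
]

def hierarchy_from_traces(traces):
    """Build a node -> set-of-children map from top-to-bottom traces."""
    children = {}
    for trace in traces:
        for parent, child in zip(trace, trace[1:]):
            children.setdefault(parent, set()).add(child)
            children.setdefault(child, set())
    return children

def traces_from_hierarchy(children, root):
    """Enumerate every path from root down to a terminal node."""
    if not children[root]:
        return [root]
    return [root + rest
            for child in sorted(children[root])
            for rest in traces_from_hierarchy(children, child)]

hierarchy = hierarchy_from_traces(TRACES)
recovered = traces_from_hierarchy(hierarchy, "a")
```

Rebuilding the hierarchy from the traces and then re-enumerating its paths returns the same eight traces, which is the sense in which the list specifies the structure.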
2.1 TOPMOST SEMANTIC NODES OF THE TANGLED HIERARCHIES 
Turning from the abstract description of the forest of
tangled hierarchies to the actual data, the first
question which was answered was, "What are the largest
tangled hierarchies in the dictionary?". The size of a
tangled hierarchy is based upon two numbers, the maximum
depth below the "root" and the total number of nodes
transitively reachable from the root. Thus the tangled
hierarchy of Figure 1 has a depth of 5 and contains a
total of 11 nodes (including the "root" node, "a").
However, since each non-terminal in the tangled
hierarchy was also enumerated, it is also possible to
describe the "sizes" of the other nodes reachable from
"a". Their numbers of elements and depths are given in
Table 1.
Table 1. Enumeration of Tree Sizes and Depths of
Tangled Hierarchy Nodes of Figure 1

Tree    Maximum    Root
Size    Depth      Node
 11       5         a
 10       4         b
  6       3         c
  6       2         d
  4       1         g
  2       1         e
These examples are being given to demonstrate the
inherent consequences of dealing with tree sizes based
upon these measurements. For example, "g" has the most
single-level descendants, 3, yet it is neither at the
top of the tangled hierarchy, nor does it have the
highest total number of descendants. The root node "a"
is at the top of the hierarchy, yet it only has 1
single-level descendant. For nodes to be considered of
major importance in a tangled hierarchy it is thus
necessary to consider not only their total number of
descendants, but whether these descendants are all
actually immediately under some other node to which this
higher node is attached. As we shall see, the nodes
which have the most single-level descendants are
actually more pivotal concepts in some cases.
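The three measurements contrasted here (total size, maximum depth, and number of single-level descendants) can be sketched as follows for the hierarchy of Figure 1. The counting conventions are assumptions made for this illustration: size includes the node itself, and depth counts the nodes on the longest downward path. These conventions reproduce the values quoted in the text for "a" but are not guaranteed to reproduce every row of Table 1.

```python
# Children of each node in the Figure 1 hierarchy.
CHILDREN = {
    "a": {"b"}, "b": {"c", "d"}, "c": {"e", "g"}, "d": {"g", "i"},
    "e": {"f"}, "g": {"k", "l", "m"},
    "f": set(), "i": set(), "k": set(), "l": set(), "m": set(),
}

def reachable(node):
    """All nodes transitively reachable from node, including node itself."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(CHILDREN[current])
    return seen

def size(node):
    """Assumed convention: the node itself is counted."""
    return len(reachable(node))

def depth(node):
    """Assumed convention: number of nodes on the longest downward path."""
    return 1 + max((depth(child) for child in CHILDREN[node]), default=0)

def direct(node):
    """Number of single-level (immediate) descendants."""
    return len(CHILDREN[node])

most_direct = max(CHILDREN, key=direct)
```

Ranking by `direct` singles out "g", while ranking by `size` singles out "a", which is the distinction the paragraph above draws.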
Turning to the actual forest of tangled hierarchies,
Table 2 gives the frequencies of the size and depth of
the largest noun hierarchies and Table 3 gives the sizes
alone for verb hierarchies (depths were not computed for
these, unfortunately).
Table 2. Frequencies and Maximum Depths of
MPD Tangled Noun Hierarchies

Size  Depth  Node                  Size  Depth  Node
3379   10    ONE-2.1A              1068   13    MEASUREMENT-1.2A
2121   12    BULK-1.1A             1068   **    DIMENSION-.1A
1907   10    PARTS-1.1A/!          1061   **    LENGTH-.1B
1888   10    SECTIONS-.2A/!        1061   **    DISTANCE-1.1A
1887    9    DIVISION-.2A          1061   14    DIMENSIONS-.1A
1832    9    PORTION-1.4A          1060   11    SIZE-1.0A
1832    8    PART-1.1A             1060   13    MEASURE-1.2A
1486   14    SERIES-.0A            1060   10    EXTENT-.1A
1482   18    SUM-1.1A              1060   14    CAPACITY-.2A
1461   **    AMOUNT-2.2A            869    7    HOUSE-1.1A/+
1459    8    ACT-1.1B               836    7    SUBSTANCE-.2B
1414   **    TOTAL-2.0A             836    8    MATTER-1.4A
1408   15    NUMBER-1.1A            741    8    NENS-.2A/+
1379   14    AMOUNT-2.1A            740    6    PIECE-1.2B
1337    8    ONE-2.2A               740    7    ITEM-.2A
1204    5    PERSON-.1A             686    7    ELEMENTS-.1A
1201   14    OPERATIONS-.1A/+       684    6    MATERIAL-2.1A
1190   **    PROCESS-1.4A           647    9    THING-.4A
1190   14    ACTIONS-.2A/+          642    8    ACT-1.1A
1123    6    GROUP-1.0A/!           535    6    THINGS-.5A/!
1101   12    FORM-1.13A             533    6    MEMBER-.2A
1089   12    VARIETY-.4A            503   10    PLANE-4.1A
1083   11    MODE-.1A               495    6    STRUCTURE-.2A
1076   10    STATE-1.1A             494   10    RANK-2.4A
1076    9    CONDITION-1.3A         493    9    STEP-1.3A

** = out of range due to data error
Table 3. Frequencies of Topmost
MPD Tangled Verb Hierarchies

Size  Node               Size  Node
4175  REMAIN-.4A          365  GAIN-2.1A
4175  CONTINUE-.1A        334  DRIVE-1.1A/+
4087  MAINTAIN-.3A        333  PUSH-1.1A
4072  STAND-1.6A          328  PRESS-2.1B
4071  HAVE-1.3A           308  CHANGE-1.1A
4020  BE-.1B              289  MAKE-1.10A
3500  EQUAL-3.0A          282  COME-.1A
3498  BE-.1A              288  CHANGE-1.1A
3476  CAUSE-2.0A          283  EFFECT-2.1A
1316  APPEAR-.3A/C        282  ATTAIN-.2B
1285  EXIST-.1A/C         281  FORCE-2.3A
1280  OCCUR-.2A/C         273  PUT-.1A
1279  MAKE-1.1A           246  IMPRESS-3.2A
 567  GO-1.1B             245  URGE-1.4A
 439  BRING-.2A           244  DRIVE-1.1A
 401  MOVE-1.1A           244  IMPEL-.0A
 366  GET-1.1A            244  THRUST-1.1A
While the verb tangled hierarchy appears to have a
series of nodes above CAUSE-2.0A which have large
numbers of descendants, the actual structure more
closely resembles that of Figure 2.
remain-.4a <--> continue-.1a <-- maintain-.3a
|
stand-1.6a
have-1.3a
^
be-.1b
equal-3.0a
^
be-.1a
cause-2.0a
^ ^
go-1.1a <--> make-1.1a make-1.1a
Figure 2. Relations between Topmost Tangled
Verb Hierarchy Nodes
The list appears in terms of descending frequency. The
topmost nodes don't have many descendants at one level
below, but they each have one BIG descendant, the next
node in the chain. CAUSE-2.0A has approximately 240
direct descendants, and MAKE-1.1A has 480 direct
descendants, making these two the topmost nodes in
terms of number of direct descendants, though they are
ranked 9th and 13th in terms of total descendants (under
words such as REMAIN-.4A, CONTINUE-.1A, etc.). This
points out in practice what the abstract tree of Figure
1 showed as possible in theory, and explains the seeming
contradiction in having a basic verb such as
"CAUSE-2.0A" defined in terms of a lesser verb such as
"REMAIN-.4A".
The difficulty is explainable given two facts. First,
the lexicographers HAD to define CAUSE-2.0A using some
other verb, etc. This is inherent in the lexicon being
used to define itself. Second, once one reaches the top
of a tangled hierarchy one cannot go any higher -- and
consequently forcing further definitions for basic verbs
such as "be" and "cause" invariably leads to using more
specific verbs, rather than more general ones. The
situation is neither erroneous, nor inconsistent in the 
context of a self-defined closed system and will be 
discussed further in the section on noun primitives. 
2.2 NOUN PRIMITIVES 
One phenomenon which was anticipated in computationally
grown trees was the existence of loops. Loops are
caused by having sequences of interrelated definitions
whose kernels form a ring-like array [5,20]. However,
what was not anticipated was how important such clusters
of nodes would be both to the underlying basis for the
taxonomies and as primitives of the language. Such
circularity is sometimes evidence of a truly primitive
concept, such as the set containing the words CLASS,
GROUP, TYPE, KIND, SET, DIVISION, CATEGORY, SPECIES,
INDIVIDUAL, GROUPING, PART and SECTION. To understand
this, consider the subset of interrelated senses these
words share (Figure 3) and then the graphic
representation of these in Figure 4.
GROUP 1.0A - a number of individuals related by a
    common factor (as physical association,
    community of interests, or blood)
CLASS 1.1A - a group of the same general status or
    nature
TYPE 1.4A - a class, kind, or group set apart by
    common characteristics
KIND 1.2A - a group united by common traits or
    interests
KIND 1.2B - CATEGORY
CATEGORY .0A - a division used in classification
CATEGORY .0B - CLASS, GROUP, KIND
DIVISION .2A - one of the parts, sections, or
    groupings into which a whole is divided
*GROUPING <- W7 - a set of objects combined in a
    group
SET 3.5A - a group of persons or things of the same
    kind or having a common characteristic usu.
    classed together
SORT 1.1A - a group of persons or things that have
    similar characteristics
SORT 1.1B - CLASS
SPECIES .1A - SORT, KIND
SPECIES .1B - a taxonomic group comprising closely
    related organisms potentially able to breed with
    one another
Key:
* The definition of an MPD run-on, taken from Webster's
Seventh Collegiate Dictionary to supplement the set.
Figure 3. Noun Primitive Concept Definitions
[Diagram: the senses SET 3.5A, GROUPINGS*, PARTS*,
SECTIONS*, DIVISION .2A, CATEGORY .0A, KIND 1.2B,
SPECIES .1A, INDIVIDUALS, CLASS 1.1A, KIND 1.2A,
CATEGORY .0B, TYPE 1.4A, SORT 1.1B, SORT 1.1A, and
SPECIES .1B, arranged around GROUP 1.0A and joined by
arrows that follow the kernel terms of the definitions
in Figure 3.]
Figure 4. "GROUP" Concept Primitive from 
Dictionary Definitions 
* Note: SECTIONS, PARTS, and GROUPINGS have additional 
connections not shown which lead to a related 
primitive cluster dealing with the PART/WHOLE concept. 
This complex interrelated set of definitions comprises a
primitive concept, essentially equivalent to the notion 
of SET in mathematics. The primitiveness of the set is 
evident when one attempts to define any one of the above 
words without using another of them in that definition. 
This essential property, the inability to write a 
definition explaining a word's meaning without using 
another member of some small set of near-synonymous
words, is the basis for describing such a set as a
PRIMITIVE. It is based upon the notion of definition
given by Wilder [21], which in turn was based upon a
presentation of the ideas of Padoa, a 
turn-of-the-century logician. 
The definitions are given, the disambiguation of their 
kernel's senses leads to a cyclic structure which cannot 
be resolved by attributing erroneous judgements to 
either the lexicographer or the disambiguator; therefore 
the structure is taken as representative of an 
undefinable primitive concept, and the words whose
definitions participate in this complex structure are
found to be undefinable without reference to the other
members of the set of undefined terms. 
The question of what to do with such primitives is not 
really a problem, as Winograd notes [22], once one
realizes that they must exist at some level, just as 
mathematical primitives must exist. In tree 
construction the solution is to form a single node whose 
English surface representation may be selected from any 
of the words in the primitive set. There probably are 
connotative differences between the members of the set,
but the ordinary pocket dictionary does not treat these
in its definitions with any detail. The Merriam-Webster
Collegiate Dictionary does include so-called "synonym
paragraphs" which seem to discuss the connotative 
differences between words sharing a "ring". 
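The tree-construction solution described above, forming a single node for the whole primitive set, might be sketched as follows. The word pairs, the label "GROUP*", and the function name are hypothetical illustrations; "herd" and "genus" are invented entries standing in for ordinary words defined in terms of a ring member, and none of this is taken from the actual MPD data or software.

```python
# Hypothetical (defined word, kernel word) pairs containing a ring of
# near-synonyms plus two invented ordinary entries ("herd", "genus").
PAIRS = [
    ("class", "group"), ("type", "class"), ("category", "class"),
    ("kind", "category"), ("group", "kind"),
    ("herd", "group"), ("genus", "category"),
]

RING = {"class", "group", "type", "category", "kind"}

def collapse(pairs, primitive_set, label):
    """Merge every member of primitive_set into one node named label."""
    merged = set()
    for defined, kernel in pairs:
        d = label if defined in primitive_set else defined
        k = label if kernel in primitive_set else kernel
        if d != k:  # edges inside the ring disappear after merging
            merged.add((d, k))
    return merged

collapsed = collapse(PAIRS, RING, "GROUP*")
```

After the merge, the ring contributes no edges of its own, and every word formerly defined in terms of any ring member now hangs directly under the single primitive node.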
While numerous studies of lexical domains such as the
verbs of motion [1,12,13] and possession [10] have been
carried out by other researchers, it is worth noting
that recourse to using ordinary dictionary definitions
as a source of material has received little attention.
Yet the "primitives" selected by Donald A. Norman,
David E. Rumelhart, and the LNR Research Group for
knowledge representation in their system bear a
remarkable similarity to those verbs used most often as
kernels in The Merriam-Webster Pocket Dictionary and
Donald Sherman has shown (Table 4) these topmost verbs
to be among the most common verbs in the Collegiate
Dictionary as well [19]. The most frequent verbs of the
MPD are, in descending order, MAKE, BE, BECOME, CAUSE,
GIVE, MOVE, TAKE, PUT, FORM, BRING, HAVE, and GO. The
similarity of these verbs to those selected by the LNR
group for their semantic representations, i.e., BECOME,
CAUSE, CHANGE, DO, MOVE, POSS ("have"), TRANSF
("give","take"), etc., [10,14,18] is striking. This
similarity is indicative of an underlying "rightness" of
dictionary definitions and supports the proposition that
the lexical information extractable from study of the
dictionary will prove to be the same knowledge needed
for computational linguistics.
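A ranking of kernel verbs by frequency, of the kind summarized above, can be obtained by a simple tally. The sketch below assumes the kernels are available as (defined verb, kernel verb) pairs; the pairs shown are hypothetical and are not MPD or W7 counts.

```python
from collections import Counter

# Hypothetical (defined verb, kernel verb) pairs; not dictionary data.
VERB_PAIRS = [
    ("build", "make"), ("create", "make"), ("forge", "make"),
    ("compel", "cause"), ("induce", "cause"),
    ("grow", "become"),
]

def kernel_frequencies(pairs):
    """Count how often each verb serves as the kernel of a definition."""
    return Counter(kernel for _, kernel in pairs).most_common()

ranking = kernel_frequencies(VERB_PAIRS)
```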
The enumeration of the primitives for nouns and verbs by 
analysis of the tangled hierarchies of the noun and verb 
forests grown from the MPD definitions is a considerable 
undertaking and one which goes beyond the scope of this 
paper. To see an example of how this technique works in 
practice, consider the discovery of the primitive group 
starting from PLACE-1.3A. 
place-1.3a - a building or locality used for a
special purpose
The kernels of this definition are "building" and
"locality". Looking these up in turn we have:
building-.1a - a usu. roofed and walled structure
(as a house) for permanent use
locality-.0a - a particular spot, situation, or
location
Table 4. 50 Most Frequent Verb Infinitive Forms of
W7 Verb Definitions (from [19]).

Count  Verb         Count  Verb
1878   MAKE          157   FURNISH
 908   CAUSE         154   TURN
 815   BECOME        150   GET
 599   GIVE          150   TREAT
 569   BE            147   SUBJECT
 496   MOVE          141   HOLD
 485   TAKE          137   UNDERGO
 444   PUT           132   CHANGE
 366   BRING         132   USE
 311   HAVE          129   KEEP
 281   FORM          127   ENGAGE
 259   GO            127   PERFORM
 240   SET           118   BREAK
 224   COME          118   REDUCE
 221   REMOVE        112   EXPRESS
 210   ACT           107   ARRANGE
 204   UTTER         107   MARK
 190   PASS          106   SEPARATE
 188   PLACE         105   DRIVE
 178   COVER         104   CARRY
 173   CUT           101   THROW
 169   PROVIDE       100   SERVE
 166   DRAW          100   SPEAK
 163   STRIKE        100   WORK
This gives us four new terms, "structure", "spot",
"situation", and "location". Looking these up we find
the circularity forming the primitive group.
structure-.2a - something built (as a house or a dam)
spot-1.3a - LOCATION, SITE
location-.2a - SITUATION, PLACE
situation-.1a - location, site
And finally, the only new term we encounter is "site"
which yields,
site-.0a - location <~ of a building> <battle ~>
The primitive cluster thus appears as in Figure 5.
place-1.3a     --> building-.1a, locality-.0a
building-.1a   --> structure-.2a
structure-.2a  --> something (built)
locality-.0a   --> spot-1.3a, situation-.1a, location-.2a
spot-1.3a      --> location-.2a, site-.0a
situation-.1a  --> location-.2a, site-.0a
location-.2a   --> situation-.1a, place-1.3a
site-.0a       --> location-.2a
Figure 5. Diagram of Primitive Set Containing PLACE,
LOCALITY, SPOT, SITE, SITUATION, and LOCATION
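The discovery procedure just walked through amounts to finding every sense that can be reached from PLACE-1.3A by following kernels and from which PLACE-1.3A can be reached again. The sketch below encodes the definitions quoted above as a kernel graph. Which sense of each kernel word is intended (for example, that "PLACE" in the definition of location-.2a means place-1.3a) is an assumption made for this illustration, as are the function names.

```python
# Kernel graph for the PLACE example. Sense assignments of the kernel
# words are assumed for illustration, following the senses named in the text.
KERNELS = {
    "place-1.3a": ["building-.1a", "locality-.0a"],
    "building-.1a": ["structure-.2a"],
    "structure-.2a": ["something"],
    "locality-.0a": ["spot-1.3a", "situation-.1a", "location-.2a"],
    "spot-1.3a": ["location-.2a", "site-.0a"],
    "situation-.1a": ["location-.2a", "site-.0a"],
    "location-.2a": ["situation-.1a", "place-1.3a"],
    "site-.0a": ["location-.2a"],
    "something": [],
}

def reachable(graph, start):
    """Every sense reachable from start by following kernels (start included)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen

def primitive_group(graph, start):
    """Senses that reach start and are reached from start (mutual reachability)."""
    return {n for n in reachable(graph, start) if start in reachable(graph, n)}

group = primitive_group(KERNELS, "place-1.3a")
```

Under these assumptions the group contains the six senses named in the caption of Figure 5, while building-.1a and structure-.2a fall outside it because no path leads from them back to place-1.3a.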
2.3 NOUNS TERMINATING IN RELATIONS 
TO OTHER NOUNS OR VERBS
In addition to terminating in "dictionary circles" or 
"loops", nouns also terminate in definitions which are 
actually text descriptions of case arguments of verbs or 
relationships to other nouns. "Vehicle" is a fine 
example of the former, being as it were the canonical 
instrumental case argument of one sense of the verb 
"carry" or "transport". 
vehicle - a means of carrying or transporting 
something 
"Leaf" is an example of the latter, being defined as a
part of a plant, 
leaf - a usu. flat and green outgrowth of a plant 
stem that is a unit of foliage and functions 
esp. in photosynthesis. 
Thus "leaf" isn't a type of anything. Even though under
a strictly genus/differentia interpretation one would
analyze "leaf" as being in an ISA relationship with
"outgrowth", "outgrowth" hasn't a suitable homogeneous
set of members and a better interpretation for modeling
this definition would be to consider the "outgrowth of"
phrase to signify a part/whole relationship between
"leaf" and "plant".
Hence we may consider the dictionary to have at least
two taxonomic relationships (i.e. ISA and ISPART) as
well as additional relations explaining noun terminals
as verb arguments. One can also readily see that there
will be taxonomic interactions among nodes connected
across these relationship "bridges".
While the parts of a plant will include the "leaves",
"stem", "roots", etc., the corresponding parts of any
TYPE of plant may have further specifications added to
their descriptions. Thus "plant" specifies a functional
form which can be further elaborated by descent down its
ISA chain. For example, a "frond" is a type of "leaf",
frond - a usu. large divided leaf (as of a fern)
We knew from "leaf" that it was a normal outgrowth of a
"plant", but now we see that "leaf" can be specialized,
provided we get confirmation from the dictionary that a
"fern" is a "plant". (Such confirmation is only needed
if we grant "leaf" more than one sense meaning, but
words in the Pocket Dictionary do typically average 2-3
sense meanings). The definition of "fern" gives us the
needed linkage, offering,
fern - any of a group of flowerless seedless vascular
green plants
Thus we have a specialized name for the "leaf" appendage
of a "plant" if that plant is a "fern". This can be
represented as in Figure 6.
             ISPART
    leaf  ===========>  plant
     /\                  /\
     ||                  ||
 ISA ||                  || ISA
     ||                  ||
             ISPART
    frond ===========>  fern
Figure 6. LEAF:PLANT::FROND:FERN
This conclusion that there are two major transitive
taxonomies and that they are related is not of course
new. Evens et al. [6,7] have dealt with the PART-OF
relationship as second only to the ISA relationship in
importance, and Fahlman [8,9] has also discussed the
interaction of the PART-OF and ISA hierarchies.
Historically even Raphael [17] used a PART-OF
relationship together with the ISA hierarchy of SIR's
deduction system. What however is new is that I am not
stating "leaf" is a part of a plant because of some need
to use this fact within a particular system's operation,
but "discovering" this in a published reference source
and noting that such information results naturally from
an effort to assemble the complete lexical structure of
the dictionary.
2.4 PARTITIVES AND COLLECTIVES
As mentioned in Section 2.3, the use of "outgrowth" in 
the definition of "leaf" causes problems in the taxonomy 
if we treat "outgrowth" as the true genus term of that 
definition. This word is but one example of a broad
range of noun terminals which may be described as
"partitives". A "partitive" may be defined as a noun
which serves as a general term for a PART of another
large and often very non-homogeneous set of concepts.
Additionally, at the opposite end of the partitive
scale, there is the class of "collectives". Collectives 
are words which serve as a general term for a COLLECTION 
of other concepts. 
The disambiguators often faced decisions as to whether 
some words were indeed the true semantic kernels of 
definitions, and often found additional words in the 
definitions which were more semantically appropriate to 
serve as the kernel -- albeit they did not appear 
syntactically in the correct position. Many of these 
terms were partitives and collectives. Figure 7 shows a 
set of partitives and collectives which were extracted 
and classified by Gretchen Hazard and John White during 
the dictionary project. The terms under "group names", 
"whole units", and "system units" are collectives. 
Those under "individuators", "piece units", "space
shapes", "existential units", "locus units", and "event
units" are partitives. These terms usually appeared in
the syntactic frame "An ___ of" and this
additionally served to indicate their functional role.
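The role of the "An ___ of" frame as a cue can be illustrated with a small pattern matcher. This is a minimal sketch under stated assumptions: the regular expression and function names are hypothetical, the word lists are the terms of Figure 7 grouped as described above, and the second sample definition ("a bit of cloth") is invented for illustration. The disambiguators in the project worked by hand, not with this code.

```python
import re

# Terms from Figure 7, grouped as collectives or partitives per the text.
COLLECTIVES = {
    "pair", "collection", "group", "cluster", "bunch", "band",
    "mass", "stock", "body", "quantity", "wad",
    "system", "course", "chain", "succession", "period",
}
PARTITIVES = {
    "member", "unit", "item", "article", "strand", "branch",
    "sample", "bit", "piece", "tinge", "tint",
    "bed", "layer", "strip", "belt", "crest", "fringe", "knot", "knob", "tuft",
    "version", "form", "sense", "state", "condition",
    "place", "end", "ground", "point",
    "act", "discharge", "instance",
}

# Hypothetical pattern: "a"/"an", optional modifiers, then the noun
# standing immediately before the first "of".
FRAME = re.compile(r"^an?\s+(?:[\w.]+\s+)*?(\w+)\s+of\b")

def frame_noun(definition):
    """Return the noun filling the 'An ___ of' frame, or None."""
    match = FRAME.match(definition.lower())
    return match.group(1) if match else None

def classify(definition):
    """Return (frame noun, 'collective' | 'partitive' | None)."""
    noun = frame_noun(definition)
    if noun in COLLECTIVES:
        return noun, "collective"
    if noun in PARTITIVES:
        return noun, "partitive"
    return noun, None
```

A frame noun that appears in neither list, such as "outgrowth", is returned unclassified, which would flag the definition for the kind of manual judgement described above.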
1 QUANTIFIERS
  1.1 GROUP NAMES
      pair, collection, group, cluster, bunch,
      band (of people)
  1.2 INDIVIDUATORS
      member, unit, item, article, strand,
      branch (of science, etc.)
2 SHAPE UNITS
  2.1 PIECE UNITS
      sample, bit, piece, tinge, tint
  2.2 WHOLE UNITS
      mass, stock, body, quantity, wad
  2.3 SPACE SHAPES
      bed, layer, strip, belt, crest, fringe, knot,
      knob, tuft
3 EXISTENTIAL UNITS
  3.1 VARIANT
      version, form, sense
  3.2 STATE
      state, condition
4 REFERENCE UNITS
  4.1 LOCUS UNITS
      place, end, ground, point
  4.2 PROCESS UNITS
      cause, source, means, way, manner
5 SYSTEM UNITS
      system, course, chain, succession, period
6 EVENT UNITS
      act, discharge, instance
7 EXCEPTIONS
      growth, study
Figure 7. Examples of Partitives and Collectives [3]
ACKNOWLEDGEMENTS 
This research on the machine-readable dictionary could
not have been accomplished without the permission of the
G. & C. Merriam Co., the publishers of the Merriam-Webster
New Pocket Dictionary and the Merriam-Webster
Seventh Collegiate Dictionary as well as the funding
support of the National Science Foundation. Thanks
should also go to Dr. John S. White, currently of
Siemens Corp., Boca Raton, Florida; Gretchen Hazard; and
Drs. Robert F. Simmons and Winfred P. Lehmann of the
University of Texas at Austin.
REFERENCES 
1. Abrahamson, Adele A., "Experimental Analysis of the
Semantics of Movement," in Explorations in
Cognition, Donald A. Norman and David E. Rumelhart,
ed., W. H. Freeman, San Francisco, 1975, pp.
248-276.
2. Alinei, Mario, La struttura del lessico, Il Mulino,
Bologna, 1974.
3. Amsler, Robert A. and John S. White, "Final Report
for NSF Project MCS77-01315, Development of a
Computational Methodology for Deriving Natural
Language Semantic Structures via Analysis of
Machine-Readable Dictionaries," Tech. report,
Linguistics Research Center, University of Texas at
Austin, 1979.
4. Amsler, Robert A., The Structure of the
Merriam-Webster Pocket Dictionary, PhD
dissertation, The University of Texas at Austin,
December 1980.
5. Calzolari, N., "An Empirical Approach to
Circularity in Dictionary Definitions," Cahiers de
Lexicologie, Vol. 31, No. 2, 1977, pp. 118-128.
6. Evens, Martha and Raoul Smith, "A Lexicon for a
Computer Question-Answering System," Tech.
report 77-14, Illinois Inst. of Technology, Dept.
of Computer Science, 1977.
7. Evens, Martha, Bonnie Litowitz, Judith Markowitz,
Raoul Smith and Oswald Werner, Lexical-Semantic
Relations: A Comparative Survey, Linguistic
Research, Carbondale, 1980.
8. Fahlman, Scott E., "Thesis progress report: A
system for representing and using real-world
knowledge," AI-Memo 331, M.I.T. Artificial
Intelligence Lab., 1975.
9. Fahlman, Scott E., A System for Representing and
Using Real-World Knowledge, PhD dissertation,
M.I.T., 1977.
10. Gentner, Dedre, "Evidence for the Psychological
Reality of Semantic Components: The Verbs of
Possession," in Explorations in Cognition, Donald
A. Norman and David E. Rumelhart, ed., W.
H. Freeman, San Francisco, 1975, pp. 211-246.
11. Lee, Charmaine, "Review of La struttura del lessico
by Mario Alinei," Language, Vol. 53, No. 2, 1977,
pp. 474-477.
12. Levelt, W. J. M., R. Schreuder, and E. Hoenkamp,
"Structure and Use of Verbs of Motion," in Recent
Advances in the Psychology of Language, Robin
Campbell and Philip T. Smith, ed., Plenum Press,
New York, 1976, pp. 137-161.
13. Miller, G., "English verbs of motion: A case study
in semantic and lexical memory," in Coding
Processes in Human Memory, A. W. Melton and
E. Martin, ed., Winston, Washington, D.C., 1972.
14. Munro, Allen, "Linguistic Theory and the LNR
Structural Representation," in Explorations in
Cognition, Donald A. Norman and David E. Rumelhart,
ed., W. H. Freeman, San Francisco, 1975, pp.
88-113.
15. Nida, Eugene A., Exploring Semantic Structures,
Wilhelm Fink Verlag, Munich, 1975.
16. Norman, Donald A., and David E. Rumelhart,
Explorations in Cognition, W. H. Freeman, San
Francisco, 1975.
17. Raphael, Bertram, SIR: A Computer Program for
Semantic Information Retrieval, PhD dissertation,
M.I.T., 1968.
18. Rumelhart, David E. and James A. Levin, "A Language
Comprehension System," in Explorations in
Cognition, Donald A. Norman and David E. Rumelhart,
ed., W. H. Freeman, San Francisco, 1975, pp.
179-208.
19. Sherman, Donald, "A Semantic Index to Verb
Definitions in Webster's Seventh New Collegiate
Dictionary," Research Report, Computer Archive of
Language Materials, Linguistics Dept., Stanford
University, 1979.
20. Sparck Jones, Karen, "Dictionary Circles," SDC
document TM-3304, System Development Corp., January
1967.
21. Wilder, Raymond L., Introduction to the Foundations
of Mathematics, John Wiley & Sons, Inc., New York,
1965.
22. Winograd, Terry, "On Primitives, prototypes, and
other semantic anomalies," Proceedings of the
Workshop on Theoretical Issues in Natural Language
Processing, June 10-13, 1975, Cambridge, Mass.,
Schank, Roger C., and B. L. Nash-Webber, ed., Assoc.
for Comp. Ling., Arlington, 1978, pp. 25-32.
