KASSYS: A DEFINITION ACQUISITION SYSTEM IN NATURAL
LANGUAGE
Patrice Hernert
L.I.F.O., R. L. de Vinci, B.P. 6759, 45067 Orléans Cedex 2, France
E-mail: hernert@chambord.univ-orleans.fr
Abstract: This paper is an introduction to KASSYS,
a system that has been designed to extract information
from defining statements in natural language. Only
hyperonymous definitions are dealt with here, for which
systematic processing has been devised and implemented
in the initial version of the system. The paper describes
how KASSYS builds a taxonomic hierarchy by extracting
the hyperonyms from these definitions. It also explains
the way in which the system can answer closed questions
(yes/no), thus enabling the user to check very quickly that
a definition has been assimilated correctly. The underlying
formalism is that of conceptual graphs, with which
the reader is assumed to be familiar.
Keywords: Conceptual graphs, hyperonymous definition,
knowledge acquisition, question (yes/no).
1. INTRODUCTION
The aim of KASSYS, the system described here, is to
acquire lexicographical definitions expressed in French, to
extract from these definitions a carefully chosen conceptual
structure, to save this structure in a file, and then to be
able to use it, where appropriate, for the semantic analysis
of a text or during the search for an answer to a question
put by the user. [1] The formalism which has been adopted
for the representation of the definitions is that of conceptual
graphs, [2] which the reader is assumed to understand.
All the examples will be given in the form of statements
in natural language, and the operations actually performed
by the system on the conceptual structures extracted
from these statements will not be described. This paper is
limited to hyperonymous definitions. It also shows, very
briefly, how KASSYS can answer certain types of question.
[1] For a complete description of the initial version of KASSYS, please
refer to (Hernert 93).
[2] Cf. (Sowa 84).

2. KNOWLEDGE EXTRACTION
2.1 - Some points concerning hyperonymous definitions
Hyperonymy/hyponymy can be defined as follows:
term A is said to be a hyperonym of term B, or alternatively
term B is said to be a hyponym of term A, if the set of
instances of term B is included in the set of instances of
term A. This gives the following corollary: the set of
semantic features that make up the elements of A is included
in the set of semantic features that make up the elements
of B; the elements of B are said to inherit the semantic
features that are common to the elements of A. Here we
have the notion of inheritance of semantic features that is
fundamental to the theory of semantic networks.
Hyperonymy is thus defined by the general relation:
(1) $(A \text{ is a hyperonym of } B) \equiv (\forall x,\ B(x) \supset A(x))$
The equivalent in natural language of (1) is:
(2) (A is a hyperonym of B) $\equiv$ (All B is A)
For example, take the definition of the concept bee:
(3) bee: a social insect which produces wax and
honey.
From this definition it is possible to extract the following
statement, in which insect is the hyperonym of
bee:
(4) All bees are insects.
In formal terms a hyperonymous definition can be written
as in (5), in which the defining statement is split into
two fundamental components: the hyperonym, followed
by a conjunction of semantic features which distinguish
the defined from this hyperonym; this conjunction of
semantic features is called the specific difference.
(5) $\forall x,\ B(x) \supset (A(x) \land P_1(x) \land P_2(x) \land \dots \land P_n(x))$
The implication in (5) expresses the fact that KASSYS
perceives definitions as statements of conditions that are
necessary but not sufficient. It is assumed that concept B, as
defined in (5), may possess semantic features that have not
been specified but the knowledge of which may turn out
to be indispensable if it were necessary to differentiate it
from an individual belonging to a very similar class.
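
As an illustration, pattern (5) can be instantiated with the definition of bee given in (3). The predicate names below are informal labels chosen for this sketch; they are not the system's internal representation.

```latex
\forall x,\ \mathrm{bee}(x) \supset
  \bigl(\mathrm{insect}(x) \land \mathrm{social}(x)
        \land \mathrm{produces\_wax}(x) \land \mathrm{produces\_honey}(x)\bigr)
```

Here insect plays the role of the hyperonym A, and the three remaining conjuncts form the specific difference. The converse implication is not asserted: something that satisfies the right-hand side is not thereby known to be a bee.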
2.2 - How to extract the hyperonym
It is usually fairly easy to extract the hyperonym from
a hyperonymous definition. In nearly all cases the hyperonym
of the defined is the first word of the definition. In
KASSYS the following heuristics have been implemented:
For the definition of a verb, the hyperonym is the first
word of the definition; if this word is not a verb the search
fails. For nouns, start by checking whether or not the
definition begins with a defining prefix, i.e. an expression
such as action of, fact of, etc.; when it does, the definition
may in some cases not be hyperonymous; otherwise, if the first word
of the definition is a noun it is the hyperonym; if the
first N words are adjectives, possibly separated by the
conjunction and or by a comma, and if the (N+1)th word
is a noun, then this noun is the hyperonym. 
These two heuristics are commonly used by systems
to search for hyperonyms in definitions, sometimes with
improvements to take into account the special cases for
which these heuristics are not suitable. [3]
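
The two heuristics can be sketched as follows. This is a minimal illustration and not the KASSYS implementation: the part-of-speech table `POS`, the function name `extract_hyperonym`, the use of English words and the decision to skip leading articles are all assumptions made for the example.

```python
from typing import Optional

# Toy part-of-speech lexicon (hypothetical; a real system would consult a dictionary).
POS = {
    "insect": "noun", "group": "noun", "shelter": "noun", "firearm": "noun",
    "social": "adj", "short": "adj", "portable": "adj",
    "move": "verb", "produce": "verb",
}
DEFINING_PREFIXES = ("action of", "fact of")  # the expressions named in the text
ARTICLES = ("a", "an", "the")                 # skipped here; an assumption of this sketch


def extract_hyperonym(definition: str, defined_pos: str) -> Optional[str]:
    """Return the hyperonym found in `definition`, or None if the search fails."""
    tokens = definition.lower().replace(",", " , ").split()
    words = [w for w in tokens if w not in ARTICLES]
    if not words:
        return None
    if defined_pos == "verb":
        # Verb heuristic: the first word must itself be a verb.
        return words[0] if POS.get(words[0]) == "verb" else None
    # Noun heuristic: a defining prefix is not handled by this sketch.
    if " ".join(words).startswith(DEFINING_PREFIXES):
        return None
    # Skip the leading adjectives, possibly separated by "and" or a comma.
    i = 0
    while i < len(words) and POS.get(words[i]) == "adj":
        i += 1
        if i < len(words) and words[i] in ("and", ","):
            i += 1
    if i < len(words) and POS.get(words[i]) == "noun":
        return words[i]
    return None
```

For instance, `extract_hyperonym("a social insect which produces wax and honey", "noun")` skips the adjective social and returns insect, in agreement with statement (4).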
2.3 - How to build the taxonomic hierarchy
The hyperonym is obviously a fundamental element of
a hyperonymous definition. Taken alone, a concept and
its hyperonym [4] are sufficient to build an elementary
semantic network in which all the nodes are connected by
the same link IS-A. The semantic network is limited to a
simple taxonomic hierarchy which can be built and maintained
far more easily than a complete semantic network.
KASSYS carries out a certain number of checks on the
proposed hyperonym. If, for example, it is too general,
the user is asked to choose another; if it has already been
used as a hyperonym, the system suggests that one
of its hyponyms might be a better candidate. Let us now
look in more detail at what KASSYS does when the user
defines the same concept more than once. Let us take the
following definition patterns:
(6) $\forall x,\ A(x) \supset (B(x) \land C(x))$
(7) $\forall x,\ A(x) \supset (B'(x) \land C'(x))$
In (6) and (7), concept A has been defined by the hyperonyms
B and B' respectively and the specific differences
C and C'. There are four different cases:
1. If B = B' and C = C', the definitions are identical and
the second one is therefore redundant.
2. If B = B' and C ≠ C', the second definition can be
considered as additional information which should
be merged with the definition that has already been
memorised.
3. If B ≠ B' and C = C', the definitions are identical but
for the hyperonym; the system will therefore ask the
user to choose between (6) and (7); note that, if B
has been defined with B' as its hyperonym (respectively
B' with B), the system will suggest keeping (6)
(respectively (7)).
4. If B ≠ B' and C ≠ C', the user will have to choose
one of these two definitions.
Note that, if the second definition (7) does not mention
the hyperonym of A, the system will find this hyperonym
thanks to the first hyperonymous definition that has been
entered, which necessarily contains the structure (6).
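
The four cases can be summarised in a short sketch. It assumes, purely for illustration, that a specific difference is reduced to a set of feature labels; the names `Definition`, `compare_definitions`, `merge` and `suggest`, and the returned strings, are hypothetical and do not come from KASSYS.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Definition:
    hyperonym: str              # B in pattern (6), B' in pattern (7)
    difference: FrozenSet[str]  # C or C', simplified to a set of feature labels


def compare_definitions(old: Definition, new: Definition) -> str:
    """Classify a second definition of the same concept into one of the four cases."""
    same_b = old.hyperonym == new.hyperonym
    same_c = old.difference == new.difference
    if same_b and same_c:
        return "redundant"          # case 1
    if same_b:
        return "merge"              # case 2
    if same_c:
        return "choose hyperonym"   # case 3
    return "choose definition"      # case 4


def merge(old: Definition, new: Definition) -> Definition:
    """Case 2: add the new features to the memorised definition."""
    return Definition(old.hyperonym, old.difference | new.difference)


def suggest(old: Definition, new: Definition,
            parent: Dict[str, str]) -> Optional[Definition]:
    """Case 3 hint: prefer the definition whose hyperonym is the more specific."""
    if parent.get(old.hyperonym) == new.hyperonym:
        return old   # B was defined with B' as its hyperonym: keep (6)
    if parent.get(new.hyperonym) == old.hyperonym:
        return new   # B' was defined with B as its hyperonym: keep (7)
    return None
```

The `parent` argument stands for the taxonomic hierarchy built so far, mapping each concept to its immediate hyperonym.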
[3] For example, (Byrd 87) identifies several hyperonyms in the same
definition, separated by a conjunction; in (Véronis 89) there is a heuristic
which, in certain cases, allows the hyperonym of a noun defined by the
prefix action of to be extracted.
[4] Right from the beginning it has been assumed that no concept
possesses more than one immediate hyperonym; from this point of view
this immediate hyperonym coincides with the genus in the Aristotelian
sense of the term.

2.4 - The circularity of definitions
Whether definitions come from a French dictionary or
have been produced by a user who is not a lexicographer,
they usually have defects that are considered
to make them totally useless. Definitions are too often
found to be repetitive or inconsistent; however, once these
problems have been identified they can almost always be
corrected. This is not true of circular definitions which,
today, are accepted as being inevitable. [5]
As far as KASSYS is concerned, the presence of cycles
in definitions would have the unfortunate result of leading
the program into infinite loops. In order to avoid this,
an algorithm has been implemented which searches each
new hyperonymous definition for words that will lead to a
circular definition. Let us examine the following example:
(8) swarm: group of bees that leaves an overcrowded
hive to settle elsewhere.
(9) bee: social insect of the Hymenoptera group,
called honeyfly, that lives in swarms and produces
wax and honey.
(10) hive: shelter designed for a swarm of bees.
If the definitions are submitted to the system in this
order, the circularity due to the presence of swarm in the
definition of the concept bee is detected as soon as this
definition is entered. The user is therefore asked to modify
at least one of the two definitions (8) and (9). One possible
solution would be to replace (9) by (11):
(11) bee: social insect of the Hymenoptera group,
called honeyfly, that lives in colonies and produces
wax and honey.
Now (8) and (11) are accepted without any difficulty.
But there is still a problem with (10) since the definition of
hive contains the noun swarm. This circularity can be
removed by deleting hive from the definition of swarm or
swarm from the definition of hive. The first solution leads
to seriously truncating the definition of the noun swarm:
(12) swarm: group of bees.
The second leads to a somewhat unnatural definition:
(13) hive: shelter designed for a group of bees.
This example shows that it is sometimes almost inevitable
to have recourse to a circular definition and it is
for this reason that KASSYS can be configured to accept
such definitions. However, the danger is that, when the
knowledge base is consulted, certain algorithms which
are used in this consultation and which, at the present
time, are unable to monitor their own progress, may enter
infinite loops, with a consequent loss of information that
has not previously been memorised.
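
The circularity check can be pictured as a reachability test in the graph that links each defined word to the words occurring in its definition. The sketch below is an illustration under that assumption, not the algorithm actually used in KASSYS; the name `creates_cycle` and the word sets (singular forms, content words only) are chosen for the example.

```python
from typing import Dict, Set


def creates_cycle(uses: Dict[str, Set[str]], defined: str, words: Set[str]) -> bool:
    """Return True if a definition of `defined` that mentions `words` closes a cycle.

    `uses` maps each word already defined to the words of its definition.
    """
    stack, seen = list(words), set()
    while stack:
        word = stack.pop()
        if word == defined:
            return True           # the defined word is reachable from its own definition
        if word in seen:
            continue
        seen.add(word)
        stack.extend(uses.get(word, ()))  # follow the definition of `word`, if any
    return False


# Definitions (8) to (11), submitted in the order used in the text.
uses: Dict[str, Set[str]] = {}
uses["swarm"] = {"group", "bee", "hive"}                                  # (8)
bee_9 = {"insect", "swarm", "wax", "honey"}                               # (9)
bee_11 = {"insect", "colony", "wax", "honey"}                             # (11)
rejected_9 = creates_cycle(uses, "bee", bee_9)       # swarm already mentions bee
rejected_11 = creates_cycle(uses, "bee", bee_11)
uses["bee"] = bee_11
rejected_10 = creates_cycle(uses, "hive", {"shelter", "swarm", "bee"})    # (10)
```

With these inputs, (9) is flagged, (11) is accepted, and (10) is flagged because swarm mentions hive, which matches the discussion above.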
[5] Cf., for example, (Weinreich 70), page 81.

3. QUERYING THE KNOWLEDGE BASE
3.1 - Simple questions
KASSYS is able to answer yes/no type questions, i.e.
it can compute the truth of certain statements. This paper
does not deal with elementary queries of the type Is an A
a B?, which the system handles without any problem.
Let us suppose that the following definitions have been
submitted to the system, which then analyses and memorises
them:
(14) revolver: small-arm with a revolving cylinder
that can contain six cartridges.
(15) pistol: small-arm with a removable cartridge
clip in which the cartridges are loaded.
(16) small-arm: short, portable firearm.
(17) firearm: arm that fires shots through the detonation
of an explosive mixture.
Let us begin with the simplest questions, i.e. those it is
possible to answer by consulting just one definition. For
example:
(18) Does a revolver have a cylinder?
Using definition (14), this question can be answered
in the affirmative. This is exactly what KASSYS does,
by simply projecting the conceptual graph associated with
the question onto the conceptual graph of the defining
statement a revolver is a small-arm with etc. Note that
KASSYS knows that if A is said to be with B, then A has
B.
It may happen that a query is projected onto the body
of a definition but not onto the defining statement that
has been obtained from the defined and the definition. For
example:
(19) Does a cylinder contain cartridges?
The graph of this query is not projected onto that of
a revolver is a small-arm with etc. but onto that of the
definition properly speaking of revolver, which contains
the pattern a cylinder contains cartridges. The system
deduces that there exist cylinders which contain
cartridges and so, in answer to question (19), replies
Sometimes.
3.2 - An algorithm using type expansion 
This section deals with the case of questions that cannot
be answered by consulting just one hyperonymous definition.
It is assumed that these questions contain neither
modal verbs nor negations.
The following algorithm has been implemented so as
to be able to answer these questions: [6]
Search in the assertion to be verified for the concepts
with which a type definition has been associated; for each
concept C that is found:
1. Search for the definition of C.
2. For this definition, perform all possible type expansions
(the strategy that has been implemented is a
breadth-first search); for each definition that is obtained,
try to project the graph of the query onto the
graph of this definition; if a projection succeeds, the
answer is Yes; go to 5; if no projection succeeds,
continue.
3. For each of the hyponyms of C, return to 1; if a
projection succeeds, the answer is Sometimes; go to
5; if no projection succeeds, continue.
4. No projection has succeeded; the system is unable to
answer.
5. If an answer has been found, display it; otherwise
display I don't know; stop.

[6] This algorithm requires an operation which has not been defined:
type expansion; this consists in replacing a given word in the graph of a
statement by its type definition.

Let us take the query:
(20) Does a pistol fire shots?
Starting from the concept pistol, then performing type
expansion on its hyperonym small-arm, followed by a
second type expansion on the concept firearm, KASSYS
builds the graph of the following definition:
(21) A pistol is a short portable arm which fires
shots, etc.
We are back to the case of the previous section,
where just one hyperonymous definition is enough to be
able to answer the question. It is easy to see that the graph
of the query (20) is projected onto that of the definition
(21). This is what KASSYS does, and so it replies in the
affirmative to question (20).
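
The behaviour of the algorithm on definitions (14) to (17) can be imitated with a much simplified model. In the sketch below conceptual graphs are replaced by sets of (relation, object) pairs, projection by a membership test, and the query is reduced to a single concept and a single feature; the names `KB`, `expand`, `hyponyms` and `answer` and the encoding of the definitions are assumptions made for the illustration, not part of KASSYS.

```python
from typing import Dict, List, Optional, Set, Tuple

Feature = Tuple[str, str]  # (relation, object), e.g. ("fire", "shot")

# Hypothetical encoding of definitions (14)-(17): hyperonym plus specific difference.
KB: Dict[str, Tuple[Optional[str], Set[Feature]]] = {
    "revolver": ("small-arm", {("have", "cylinder")}),
    "pistol": ("small-arm", {("have", "cartridge clip")}),
    "small-arm": ("firearm", {("is", "short"), ("is", "portable")}),
    "firearm": ("arm", {("fire", "shot")}),
}


def expand(concept: str) -> Set[Feature]:
    """Simplified type expansion: gather features along the chain of hyperonyms."""
    features: Set[Feature] = set()
    seen: Set[str] = set()
    current: Optional[str] = concept
    while current in KB and current not in seen:   # `seen` guards against cycles
        seen.add(current)
        hyperonym, own = KB[current]
        features |= own
        current = hyperonym
    return features


def hyponyms(concept: str) -> List[str]:
    """Concepts whose immediate hyperonym is `concept`."""
    return [c for c, (h, _) in KB.items() if h == concept]


def answer(concept: str, feature: Feature) -> str:
    """Steps 1-5 for a single concept: Yes, Sometimes or I don't know."""
    if feature in expand(concept):          # steps 1-2: the concept itself
        return "Yes"
    pending, seen = hyponyms(concept), set()
    while pending:                          # step 3: breadth-first over hyponyms
        c = pending.pop(0)
        if c in seen:
            continue
        seen.add(c)
        if feature in expand(c):
            return "Sometimes"
        pending.extend(hyponyms(c))
    return "I don't know"                   # steps 4-5
```

In this model `answer("pistol", ("fire", "shot"))` yields Yes, as for query (20), while `answer("small-arm", ("have", "cylinder"))` yields Sometimes because only the hyponym revolver carries that feature.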
It should be noted that this algorithm can be very time
consuming if the assertion to be verified contains more
than one concept that has been defined in the knowledge
base. One possible solution would be to look for the answer
starting from a priority concept that we shall call the
focus of the query and that is defined as being the concept
to which the questioning applies. It is a somewhat vague
notion and is rather difficult to explain clearly. To begin
with, it was necessary to define a naive focus extraction
heuristic. Although far from perfect, this heuristic is never
dangerous since the previous algorithm guarantees that
all the concepts will be tried. However, where the heuristic
computes a focus leading to a successful conclusion,
the time saved is proportional to the number of concepts
contained in the assertion and on which type expansion
can be performed. In the example of question (20), the
focus determined by the heuristic is pistol, which leads
to a successful conclusion. The amount of time saved
here is nil but would be considerable if the definition of
the concept shot, for example, were to be inserted in the
knowledge base.
3.3 - Queries that contain a negation 
Generally speaking, the handling of negation is a tricky
affair for the essential reason that negation in natural
language must not be confused with logical negation. For
instance, it is easy to find a statement with a truth value
that is identical to that of its negation. [7] However, in a
huge number of elementary cases, especially where the
negation concerns the main verb of a clause, it is reasonable
to accept that the truth value of the clause is the
opposite of that of the assertion which is obtained by
removing the negation from the clause. This is a heuristic
which has proved to be extremely efficient in KASSYS
but which would have to be re-examined if certain simplifying
hypotheses were to be abandoned.

[7] Let us take the example of the statement My dragon likes baklava,
taken from (Hirst 91); this statement is false because we have no dragon;
its negation is equally false, for the same reason.

Let p be a statement containing just one verb, and neg(p)
the negation of this statement, obtained by adding a negation
to the verb contained in p. The answer given for
neg(p) is a function of that which has been found for p: [8]

p        TRUE    FALSE   SOME    UNDEF
neg(p)   FALSE   TRUE    FALSE   UNDEF
Negation in queries

This table must be read from top to bottom only. Remember
that it gives the truth value of the statement neg(p),
which contains one and only one negation, as a function of
that of p, which contains no negation. Note that the aim of
this table is not to define the truth value of neg(neg(p)). In
the case where neg(p) isn't valid, both p = TRUE and
p = SOME are possible, so ignoring this restriction
leads to attributing two truth values to neg(neg(p)).
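
The table amounts to a one-way lookup, which the following sketch makes explicit. The names `NEGATION`, `negate` and `sources` are hypothetical, and the refusal of anything other than a single negation mirrors the restriction stated above rather than any documented behaviour of KASSYS.

```python
from typing import Dict, List

# Truth value of neg(p) as a function of the truth value of p (top-to-bottom reading).
NEGATION: Dict[str, str] = {
    "TRUE": "FALSE",
    "FALSE": "TRUE",
    "SOME": "FALSE",
    "UNDEF": "UNDEF",
}


def negate(value: str, negations: int = 1) -> str:
    """Truth value of a statement with exactly one negation, given that of p."""
    if negations != 1:
        raise ValueError("the table covers exactly one negation")
    return NEGATION[value]


def sources(value: str) -> List[str]:
    """Bottom-to-top reading: every value of p that yields `value` for neg(p)."""
    return [p for p, n in NEGATION.items() if n == value]
```

Reading the table backwards is ambiguous: `sources("FALSE")` returns both TRUE and SOME, which is why the table cannot be applied twice to evaluate neg(neg(p)).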
Let us take as an example the following queried statement,
which is the negative answer to (19):
(22) A cylinder does not contain cartridges.
KASSYS answers that statement (22) isn't valid since
cylinders exist which contain cartridges, as is consistent
with the hypothesis that queried statements are prefixed by
a universal quantifier. Statement (22) is interpreted as Not
all cylinders contain cartridges, and not as There exist
cylinders which do not contain cartridges. In the latter
case it is obviously impossible to answer, for the simple
reason that nowhere is it said that there exist cylinders
other than those which are parts of revolvers. However,
under the closed world hypothesis, [9] objects whose existence
has not been asserted are assumed not to exist in
this world. Given this one proviso, the answer to (22)
is negative, whatever its interpretation. By contrast,
a queried statement such as It is not true that a cylinder
does not contain cartridges, which is the negation of
(22), is rejected by the system for the simple reason that
it contains two negations. The previous heuristic is not
suitable for use on this statement, for which the values
TRUE and SOME are possible.
[8] UNDEF for undefined; SOME for Sometimes: some occurrences
exist for which the queried statement is true.
[9] This hypothesis has not been adopted here.

4. A SESSION WITH KASSYS
The following session, given in French with English glosses
in square brackets, is a faithful illustration of the messages
given by the system.
1. KASSYS: Bonjour, j'attends vos instructions...
   [Hello, I am awaiting your instructions...]
2. USER: Un aéronef est un appareil d'aviation.
   [An aircraft is an aviation device.]
3. K: OK, j'ai assimilé le concept <aéronef>.
   [OK, I have assimilated the concept <aéronef>.]
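4. U: Qu'est-ce qu'un aéronef ?
   [What is an aircraft?]
5. K: Un aéronef est un appareil d'aviation.
   [An aircraft is an aviation device.]
6. U: Une fusée est un appareil qui sert à envoyer une
charge dans l'espace.
   [A rocket is a device used to send a payload into space.]
7. K: Le concept <appareil> a pour hyponyme <aéronef>;
celui-ci est-il un hyperonyme plus proche de <fusée> ?
   [The concept <appareil> has <aéronef> as a hyponym;
   is the latter a closer hyperonym of <fusée>?]
8. U: Oui. [Yes.]
9. K: Je déduis donc qu'une fusée est un aéronef qui sert
à envoyer une charge dans l'espace. Êtes-vous d'accord
avec cette définition ?
   [I therefore deduce that a rocket is an aircraft used to
   send a payload into space. Do you agree with this definition?]
10. U: Oui. [Yes.]
11. K: J'ai mémorisé la définition de <fusée>.
   [I have memorised the definition of <fusée>.]
12. U: Une fusée est propulsée par un moteur à réaction.
   [A rocket is propelled by a jet engine.]
13. K: S'agit-il d'une question ou d'une affirmation ?
   [Is this a question or a statement?]
14. U: C'est une affirmation. [It is a statement.]
15. K: Je sais déjà qu'une fusée est un aéronef qui sert
à envoyer une charge dans l'espace. Dois-je intégrer la
nouvelle donnée à cette définition ?
   [I already know that a rocket is an aircraft used to send
   a payload into space. Should I merge the new information
   into this definition?]
16. U: Oui. [Yes.]
17. K: Bien, j'ai compris votre définition de <fusée>.
   [Good, I have understood your definition of <fusée>.]
18. U: Qu'est-ce qu'une fusée ?
   [What is a rocket?]
19. K: Une fusée est un aéronef propulsé par un moteur à
réaction, qui sert à envoyer une charge dans l'espace.
   [A rocket is an aircraft propelled by a jet engine, used
   to send a payload into space.]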
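20. U: Un aéronef est un appareil qui peut voler.
   [An aircraft is a device that can fly.]
21. K: Ok, j'ai assimilé la définition de <aéronef>.
   [OK, I have assimilated the definition of <aéronef>.]
22. U: Un planeur est un appareil semblable à l'avion mais
démuni de moteur.
   [A glider is a device similar to an aeroplane but without
   an engine.]
23. K: Le concept <appareil> a pour hyponyme <aéronef>;
celui-ci est-il un hyperonyme plus proche de <planeur> ?
   [The concept <appareil> has <aéronef> as a hyponym;
   is the latter a closer hyperonym of <planeur>?]
24. U: Oui. [Yes.]
25. K: Je déduis donc qu'un planeur est un aéronef semblable
à l'avion mais démuni de moteur. Êtes-vous d'accord
avec cette définition ?
   [I therefore deduce that a glider is an aircraft similar to
   an aeroplane but without an engine. Do you agree with
   this definition?]
26. U: Oui. [Yes.]
27. K: J'ai mémorisé la définition de <planeur>.
   [I have memorised the definition of <planeur>.]
28. U: Un aéronef peut envoyer une charge dans l'espace ?
   [Can an aircraft send a payload into space?]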
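29. K: Oui. [Yes.]
30. U: Pourquoi ? [Why?]
31. K: Parce qu'une fusée est un aéronef propulsé par
un moteur à réaction, qui sert à envoyer une charge dans
l'espace.
   [Because a rocket is an aircraft propelled by a jet engine,
   used to send a payload into space.]
32. U: Pourquoi ? [Why?]
33. K: C'est vous qui me l'avez dit.
   [You told me so yourself.]
34. U: Est-ce que les appareils volent ?
   [Do devices fly?]
35. K: Quelquefois. [Sometimes.]
36. U: Un planeur peut voler ?
   [Can a glider fly?]
37. K: Oui. [Yes.]
38. U: Est-ce qu'un planeur possède un moteur ?
   [Does a glider have an engine?]
39. K: Non. [No.]
40. U: On arrête là. [Let us stop here.]
41. K: Ma mémoire n'est pas vide. Dois-je faire une
sauvegarde ?
   [My memory is not empty. Should I save it?]
42. U: Non. [No.]
43. K: Comme vous voudrez. Merci et à bientôt.
   [As you wish. Thank you and see you soon.]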
5. CONCLUSION 
This paper is a brief overview of KASSYS. In particular,
only the handling of hyperonymous definitions
has been discussed. You have seen how the taxonomic
hierarchy and the knowledge base are built. You have
also been told how, broadly speaking, the query/answer
module currently running in the system works.
The interest of this work is to show that conceptual
graph theory offers an elegant framework in which
hyperonymous definitions fit naturally. Careful and judicious
use of this framework and the operations defined within
it (type expansion, projection) enables information search
algorithms to be implemented easily.
6. REFERENCES 
(Byrd 87)
R.J. Byrd, N. Calzolari, M.S. Chodorow, J.L. Klavans,
M.S. Neff, O.A. Rizk, Tools and methods for computational
lexicology, Computational Linguistics, Vol. 13,
No. 3-4, 1987, pp. 219-240.
(Hernert 93)
P. Hernert, Un système d'acquisition de définitions basé
sur le modèle des graphes conceptuels, Thèse de Doctorat,
Université Paris XIII, Villetaneuse, juin 1993.
(Hirst 91)
G. Hirst, Existence assumptions in knowledge representation,
Artificial Intelligence, Vol. 49, No. 1-3, Elsevier,
Amsterdam, 1991, pp. 199-242.
(Sowa 84)
J. Sowa, Conceptual structures - Information processing
in mind and machine, Addison Wesley Publishing Company,
Reading, Mass., 1984.
(Véronis 89)
J. Véronis, N.M. Ide, N. Wurbel, Extraction d'informations
sémantiques dans les dictionnaires courants, Actes
du 7ème Congrès Reconnaissance des Formes et Intelligence
Artificielle, A.F.C.E.T., Paris, décembre 1989, pp.
1381-1395.
(Weinreich 70)
U. Weinreich, La définition lexicographique dans la
sémantique descriptive, Langages, Didier-Larousse, Paris,
1970, pp. 69-86.
