﻿American Journal of Computational Linguistics 
Microfiche 24
THE SQAP DATA BASE 
FOR 
NATURAL LANGUAGE INFORMATION
Jacob Palme 
Research Institute of National Defense 
Operations Research Center
Stockholm 80, Sweden 
Copyright 1975 by the Association for Computational Linguistics
ABSTRACT 
The Swedish Question Answering Project (SQAP) aims at
handling many different kinds of facts, and not only facts in a
small special application area. The SQAP data base consists of
a network of nodes corresponding to objects, properties, and
events in the real world. Deduction can be performed, and
deduction rules can be input in natural language and stored in
the data base.
This report describes the data base, focusing especially on
problems in its design, both problems which have been solved and
problems which are not yet solved.
Especially full treatment is given to the data base
representation of natural language noun phrases, and to the
representation of deduction rules in the data base in the form
of data base "patterns".
SWEDISH ABSTRACT 
SQAP-projektet (Swedish Question Answering Project =
Svenska projektet för frågebesvarande system) syftar till att
kunna hantera många olika slags fakta i datorn, inte bara fakta
inom ett litet speciellt tillämpningsområde. Databasen består
av ett nätverk av noder som svarar mot objekt, egenskaper och
händelser i verkligheten. Slutsatsdragning kan göras, och
slutsatsdragningsregler kan ges i naturligt språk och lagras
i databasen.
Denna rapport beskriver databasen, med speciell tonvikt på
problem vid dess konstruktion, både sådana problem som vi löst
och sådana som vi ännu inte löst.
Speciellt utförligt behandlas representation av
substantivkonstruktioner i databasen, samt hur slutsatsregler
kan representeras som mönster i databasen.
FOA P rapport C 8376-M3(E5), September 1973, revised July 1975.
Contents:
0. Introduction
1. Natural language representation
2. Introduction to our data base
3. Objects, events and predicates
4. Quantifiers on the short relations
5. Deduction in the data base
6. Variables
7. Keys
8. Dummies
9. Questions
10. Example of what our system can do
11. The EQUAL relation
12. Natural language noun phrases
12b. Attributes on noun phrases
13. Composite objects
14. Conjunctions between noun phrases in the general sense
15. Plural nouns
16. Fitting composite objects into the sentence
17. Noun phrases with just a number and nothing more
18. Some examples of translations of sentences with plural nouns
19. Problems with the dual representation of nouns
20. Equality between composite objects
21. Relations between predicates
22. Event nodes
23. Putting restrictions on equality
23b. Quantifiers or event nodes
24. Deduction patterns and natural language if-clauses
25. Questions
26. DUMMIES = temporary variables for data base merging
27. DUMMIES which refer to VARIABLES
28. The problem of dual representation
29. What our system can do and cannot do
30. A short comparison with other systems
31. Acknowledgements
32. Bibliography
33. Index
0. Introduction 
This paper describes the natural language data base structure 
used in the SQAP system (Swedish Question Answering system).
Much of that system is already working, but the paper does 
not only describe the solutions to solved problems. Difficulties 
and unsolved problems are also presented, since I feel this 
is important to further progress. 
One of the goals of the SQAP project was to create a question- 
answering system capable of handling facts of many different
kinds. The system should thus not be restricted to a small 
special application area. 
1. Natural language representation
There is an obvious need for computers with a capability to
converse in natural human languages. Natural languages are
more general-purpose than most artificial languages, which
means that you can talk about a wider subject area if you use
natural languages. Natural languages can be used by everyone
without special training, so computers talking natural language
can make more people able to use more different computer
facilities. Finally, a rising part of computer usage in the
future will be unintelligent processing of natural language
texts, and such systems can be improved if the processing is
not wholly unintelligent.
There are also well-known difficulties with natural languages
for computers. Natural language is closely connected to human
knowledge. Therefore, natural language sentences can only be
understood by a man or a computer with factual knowledge about
the subject matter and with the ability to reason with those
facts. To disambiguate such well-known examples as
"The pig was in the pen" (Bar-Hillel 1964) or "He went to the
park with the girl" (Schank 1969) the computer must have an
underlying knowledge about various kinds of "pens", about where
"the girl" was previously and so on.
Also, the same thing can be said in many different ways, and
a computer with natural language capabilities must be able to
understand this, so that for example it can see the similarity
between "Find the mean income of unmarried women with at least
two children." and "Search through the personnel file. For
each individual who is a woman, who is not married, and who
has a number of children greater than or equal to two,
accumulate income to calculate the mean."
Therefore, a computer understanding natural language must
have a data base with basic factual knowledge about the world
in general or about the subject matter which the computer is
to be used for.
This data base is needed to understand ambiguous sentences, 
but also to interpret the sentences into executable data 
processing commands. 
The requirements on such a data base are:
- You should be able to store a wide variety of different
kinds of facts. Natural languages are very general-purpose,
so the data base should also be general-purpose.
- You should be able to use this data base to make deductions.
The capability to do simple and natural deductions fast is
more important than the capability to make very advanced and
long-range deductions. Since the data base will be large, an
important part of deduction will be the selection of the
relevant facts and rules out of the large mass of facts not
needed for one special deduction.
The data base can be more or less close to natural language.
A data base close to natural language makes input translation
easier, and also the loss of nuances during the input translation
will be smaller. But the data base must on the other hand have
a logical structure which is suitable for deduction and fact
searching.
One model of natural language knowledge is the following: The
knowledge consists of "concepts" and of rules relating these
concepts to each other. A typical concept might be "John",
"All young men", "The event when John meets Mary in the park"
or "The month of July, 1973". The concepts are related by
rules, which can be very simple relations (like the relation
between "All young men" and the property "young") or complex
patterns of concepts (like the rule "If Mary is weak and
tired, and she meets a strong brutal man, then she will be
frightened."). These rules form a network linking all concepts
together.
This model of natural language is close to that often used by 
psychologists in trying to explain the working of the intelligence 
in the human mind. 
The SQAP system uses a data base of that kind. The model may at
first seem simple and straightforward. When you try to produce
a working question-answering system, you will however find that
there are many difficulties and complications with such a data
base. This report presents the most important of the problems
we have met, and in some cases also our solutions. I believe
that other producers of natural language systems will sooner
or later encounter the same problems, and they may then benefit
from our experience as presented in this paper.
2. Introduction to our data base
During the 1960s, several researchers independently and
simultaneously came up with the same basic idea of organizing
such a data base (Sandewall 1965, Simmons 1971, Shapiro 1971).
Some of them were influenced by the case grammar of Fillmore 1968.
The idea is that the data base is organized into nodes, each node
representing a concept. In natural language, the prepositions
are used to represent short, simple and direct relations between
concepts: "John is in the bed" (preposition "in"), "The fire was
lit by Mary" (preposition "by").
In the data base, the idea of prepositions is extended so
that all simple and direct relations between concepts are
represented by implicit prepositions. (Just as you could say
that there is an implicit preposition "by" in the phrase "Mary
lit the fire".)
More complex rules or relations between concepts are represented
by extra concepts. Thus there is a concept for the event "Mary
lit the fire" and this concept is related to "Mary", "the fire"
and "act of lighting" in a structure like that in figure 1.
Figure 1. "Mary lit the fire": the event node is linked by CASE
to "Acts of lighting", by BY to "Mary" and by OBJ to "The fire".
This structure has four concepts linked together by three
"prepositional" relations: CASE, BY and OBJ. From now on, I
will in this paper call such relations "short relations".
The data base is organized so that the deduction rules can
follow the short relations in both directions, that is go
from "Mary" to "Mary lit the fire" or from "Mary lit the fire"
to "Mary".
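The structure of figure 1, and the requirement that short relations can be followed in both directions, can be illustrated with a minimal Python sketch. It is not the SQAP implementation; the class `Network` and its methods `relate`, `forward` and `backward` are hypothetical names chosen for this illustration, and nodes are represented simply by their English descriptions.

```python
from collections import defaultdict


class Network:
    """Toy node network; each short relation is indexed in both directions."""

    def __init__(self):
        self.outgoing = defaultdict(list)  # node -> [(relation, target)]
        self.incoming = defaultdict(list)  # node -> [(relation, source)]

    def relate(self, source, relation, target):
        """Store one short relation so that it can be followed either way."""
        self.outgoing[source].append((relation, target))
        self.incoming[target].append((relation, source))

    def forward(self, node, relation):
        """Follow the relation from its source end."""
        return [t for r, t in self.outgoing[node] if r == relation]

    def backward(self, node, relation):
        """Follow the relation from its target end."""
        return [s for r, s in self.incoming[node] if r == relation]


net = Network()
event = "Mary lit the fire"
net.relate(event, "CASE", "Acts of lighting")
net.relate(event, "BY", "Mary")
net.relate(event, "OBJ", "The fire")

by_mary = net.backward("Mary", "BY")  # from "Mary" to the event
agents = net.forward(event, "BY")     # from the event to "Mary"
```

Keeping two indexes doubles the storage per relation, but it makes both directions equally cheap to follow, which is what the deduction rules are said to need.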
3. Objects, events and predicates
Noun phrases in natural language usually refer to one or a
set of objects in the real world, like "Stockholm" or "Every
house in Sweden" or "The nice man with a bicycle". In our
system each such concept is represented by a node in the data
base, which could be called an object node.
Each object node is associated with one or more predicate
nodes expressing properties of that object. In our data base,
we mark predicates with the postfix "*P". Thus, the phrase
"An always happy girl" would in our data base be represented
like in figure 2:
Figure 2. "A happy girl": an object node with an ATTR relation
to HAPPY*P and a PRED relation to GIRL*P.
A statement like "There is an always happy girl" or "One girl
is always happy" would be represented in the same way, with an
object node and two short relations on it to the two predicates,
ATTR to the adjectival predicate, PRED to the nominal predicate.
If we meet the natural language phrase "One girl is happy
today", then we cannot represent it as simply. We have to
affix a time to the relation between the girl and HAPPY*P.
One way to do this would be to always have the ability to add
extra short relations to existing short relations. This would
require that short relations are represented in the data base
in a form where there is a place to add a list of extra short
relations, and this would triple the size of the data base.
Instead we have an expanded form of those short relations to
which we want to add other relations. This expanded form is
only used when it is needed; the non-expanded form is used
when there are no added short relations. The expanded form
for the PRED short relation is a node of the type "event".
"One girl is happy today" will thus be represented like in
figure 3.
Figure 3. "One girl is happy today", represented with an event
node.
The advantage of having such an extra concept in the data
base is that we can easily add more short relations to the
event node "One girl is happy today", for example to represent
"in the school" or "because of the weather" or "according to
what Tom said".
Since we want to deal with true statements, hypothetical
statements and statements belonging to some person's belief
structure, we always add a relation to an event node
indicating which belief structure it belongs to. True events
belong to the set "TRUE*S" of all true statements. Since the
relation "PART TRUE*S" is so common, we represent it in
pictures with the earth sign of electric charts.
The statement "John believes that Mary loves him" would thus
be represented like in figure 4:
Figure 4. "John believes that Mary loves him": the event "John
believes that ..." (BELIEVE*P) has an OBJ relation to the event
"... Mary loves him" (LOVE*P).
Note that there is an earth sign on the true event, but no
earth sign on the event belonging to John's belief structure.
Note that predicates and relations in natural language are
often not represented directly as short relations in our
data base. "John is the father of Angelica" is thus not
represented by a short relation "FATHER" from "John" to
"Angelica" but rather with an event node like in figure 5:
Figure 5. "John is the father of Angelica" (an event node with
FATHER*P).
4. Quantifiers on the short relations
Every node in our data base can stand for a set of objects
instead of for just a single object. Thus we can represent
"All nice girls" with a node representing the set of all nice
girls.
This means that we need quantifiers on the short relations,
to be able to express relationships between sets.
If there is a short relation R between two sets A and B,
then the relation R might not hold between every member of
A and every member of B. We have several cases:
These and other cases are represented in our data base
with three quantifiers ALL, SOME and ITS. The difference
between SOME and ITS is shown by the difference between the
second and the third example. One quantifier is placed on
each end of the relation. The four examples above will thus
in our data base look like this:
i) (ALL A, R, ALL B)
iv) (SOME A, R, ITS B)
The difference between ITS and SOME can be understood if you look
at the statement "Every man is in a car". This can mean that
"Every man is inside one single car" or it can mean "For every
man there is one car in which he is". The first phrase might
in our data base be represented as
"Every man" ALL IN SOME "a car"
while the second might be represented as
"Every man" ALL IN ITS "a car"
There are simple rules to manipulate the quantifiers when the
deduction rules chain from node to node in the data base.
This is described in Sandewall 1969.
In the following, if no quantifier is marked on a short relation
in a figure, then ALL is implicit.
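The two readings of "Every man is in a car" can be made concrete by evaluating them over small finite sets. The following Python sketch is only an illustration of the intended meaning; the function names and the sample data are assumptions, and it covers just the quantifier pairs discussed in the text.

```python
def holds_all_all(A, B, R):
    """(ALL A, R, ALL B): R holds between every member of A and every member of B."""
    return all(R(a, b) for a in A for b in B)


def holds_all_some(A, B, R):
    """(ALL A, R, SOME B): one single member of B is related to every member of A."""
    return any(all(R(a, b) for a in A) for b in B)


def holds_all_its(A, B, R):
    """(ALL A, R, ITS B): every member of A has its own member of B."""
    return all(any(R(a, b) for b in B) for a in A)


men = {"Adam", "Bert"}
cars = {"car1", "car2"}
location = {("Adam", "car1"), ("Bert", "car2")}  # who sits in which car


def is_in(man, car):
    return (man, car) in location


same_car = holds_all_some(men, cars, is_in)  # all men inside one single car?
own_car = holds_all_its(men, cars, is_in)    # a car for every man?
```

With the sample data each man sits in a different car, so the SOME reading fails while the ITS reading holds. The only difference between the two functions is the nesting order of `any` and `all`.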
5. Deduction in the data base
The data base does not contain all true statements explicitly;
some of them have to be deduced when needed. Basically all
deduction rules can be seen as pattern matching. You have a
pattern saying for example that "If something hot is near
something inflammable then the inflammable will catch fire".
Then we have some actual situation, explicit or deduced, e.g.
"The burning cigarette is thrown in the petrol tank". In our
data base, this looks as in figure 6.
Figure 6. The pattern "If something hot is near something
inflammable, then the inflammable will catch fire" (NEAR*P) and
the actual situation "The burning cigarette is thrown into the
petrol tank" (THROW*P, with INTO and OBJ relations).
Before using the deduction rule, we must match the pattern to
the actual situation. The pattern can contain many
interconnected nodes, and the reality may not at first resemble
the pattern directly; deduction may be necessary to see the
resemblance.
The simplest deduction rule possible is just a pattern of
two short relations from which a third can be deduced: "If
A R1 B and B R2 C then A R3 C". A simple example: "If A is
subset of B, and B is subset of C, then A is subset of C".
Since such rules link together nodes through a chain of short
relations, they are called chaining rules. Some chaining rules
require side relations on the intermediate node to be
fulfilled, for example "A BY B, and A CASE C implies B PRED C",
but only if A is a true event.
This is described in Sandewall 1969 and in Makila 1972. 
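A chaining rule of the form "If A R1 B and B R2 C then A R3 C" can be sketched as a small table-driven closure over relation triples. This Python sketch is an illustration under stated assumptions: the table `CHAINING_RULES`, the function `chain` and the example sets are hypothetical, and side conditions such as "A is a true event" are not modelled.

```python
# Hypothetical rule table: (R1, R2) -> R3, read as
# "if A R1 B and B R2 C then A R3 C".
CHAINING_RULES = {("SUBSET", "SUBSET"): "SUBSET"}


def chain(facts, rules=CHAINING_RULES):
    """Apply the chaining rules to (A, R, B) triples until nothing new appears."""
    known = set(facts)
    while True:
        new = {
            (a, rules[(r1, r2)], c)
            for (a, r1, b) in known
            for (b2, r2, c) in known
            if b == b2 and (r1, r2) in rules
        }
        if new <= known:
            return known
        known |= new


facts = {
    ("boys", "SUBSET", "men"),
    ("men", "SUBSET", "humans"),
    ("humans", "SUBSET", "animals"),
}
closure = chain(facts)
```

Computing the full closure is the opposite of what a large data base would want, since the text stresses selecting only the relevant facts; the sketch only shows what a single chaining step derives.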
6. Variables 
The simplest kind of deduction pattern involves just one node.
Such a node is called a variable. For example, VARIABLES are
used in the translation of "Every intelligent man is a bad
soldier" which is represented like in figure 7:
Figure 7. "Every intelligent man is a bad soldier": the variable
"Every intelligent man" is defined by DEF relations to
INTELLIGENT*P and MAN*P and is linked by ALL EQUAL ITS to
"Every bad soldier".
Variables have a new quantifier on them, DEF. This indicates
that this short relation is part of the definition of that
variable. The variable "Every intelligent man" above corresponds
to the set of all objects which satisfy the definition. This
means that as soon as we find an object in the data base for
which we know or can deduce that it satisfies the definition,
then we know that it belongs to the VARIABLE above, and we
can thus deduce that it is a bad soldier.
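The use of a VARIABLE as a one-node search pattern can be sketched as follows. This is a simplified Python illustration, not the SQAP matching procedure: the distinction between ATTR and PRED is collapsed into plain sets of predicate names, and `members`, `SOLDIER*P` and the sample objects are assumptions made for the example.

```python
def members(variable_def, objects):
    """Objects whose known predicates cover every DEF predicate of the variable."""
    return {name for name, preds in objects.items() if variable_def <= preds}


# DEF relations of the VARIABLE "Every intelligent man".
every_intelligent_man = {"INTELLIGENT*P", "MAN*P"}
# What the rule lets us conclude about each member (illustrative names).
conclusion = {"BAD*P", "SOLDIER*P"}

objects = {
    "John": {"MAN*P", "INTELLIGENT*P", "YOUNG*P"},
    "Mary": {"WOMAN*P", "INTELLIGENT*P"},
}

# Every object that satisfies the definition inherits the conclusion.
for name in members(every_intelligent_man, objects):
    objects[name] |= conclusion
```

Only "John" satisfies both DEF relations, so only he is concluded to be a bad soldier; "Mary" matches one of the two predicates and is left unchanged.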
7. Keys 
Sometimes the deduction requires a pattern of more than one
node. Such patterns are called keys. The sentence "If a
motorboat meets a sailingboat, then the motorboat must steer
away from the sailingboat" will in our data base be represented
like in figure 8.
Figure 8. "If a motorboat meets a sailingboat, then the motorboat
must steer away from the sailingboat": a key of three nodes (with
PRED relations to MOTORBOAT*P and SAILINGBOAT*P) connected by DEF
relations, and an IFTHEN relation to the deduced "steer away"
statement, whose relations carry the THAT quantifier.
To the left is the pattern of three nodes connected by short
relations with DEF on both ends of the relations. This shows
that they are part of a key. To the right is the deduced
statement, connected to the center of the key with an IFTHEN
short relation. Note that there is a new quantifier THAT above.
This quantifier means that we should single out just that
actual object which was matched to that part of the key. We
do not want to say that "Every motorboat meeting a sailingboat
must steer away from every sailingboat being met by a
motorboat", and therefore we must single out the matched object
only.
8. Dummies
Natural language sentences often refer to previously mentioned
entities with constructs like "he" or "the old man". Our
system will first translate a sentence into an independent
data base fragment. An assimilation program will then merge
this fragment with the data base. Other systems often make
this assimilation during the input translation. They would
then avoid some, but not all, of our problems, but would
get some other problems instead.
For this merging we create temporary variables and keys
during input translation. The sentence "The man always with the
gun is in the forest" is thus translated into figure 9:
Figure 9. "The man with the gun is in the forest": the nodes
"the gun", "the man" and "the forest" have PRED relations to
GUN*P, MAN*P and FOREST*P, and are connected by WITH and IN
relations with DEF quantifiers.
The merging program will use special deduction rules for these
temporary variables, which we call DUMMIES. After merging, the
DUMMIES usually merge with some previous node in the data base,
and they thus become CONSTANTS or VARIABLES depending on the
type of that previous node.
9. Questions 
Questions to a computer can require short or long answers.
There are for example yes-no questions like "Is a man with a
balloon coming?" which in our data base will be represented
like in figure 10.
Figure 10. "Is a man with a balloon coming?": nodes with PRED
relations to BALLOON*P and MAN*P, a WITH relation between them,
and a questioned relation to COME*P.
We put a question mark on one relation to show that
the program shall try to deduce that relation from the previous
knowledge.
Other questions require as answer a list of objects, for
example "Which of you can drive a car?", or a description of
a deduction chain ("why" questions) or a description of an
algorithm ("how" questions). We have not yet tried to represent
such questions in our data base structure.
10. Example of what our system can do
This example shows a set of facts and questions, such that
our system can answer the questions based on the
facts.
The input language to our system is not full natural English;
the language is slightly simplified. The sentences in the
example are written in this simplified English.
A girl is a young woman. A boy is a young man. A woman is a
human female. A man is a human male. Every young man is a boy.
Every sports car is fast and expensive. Every fast car is
dangerous. Anybody - with an expensive car is rich. Every
rich woman is frightened by every poor man.
If a woman is meeting a man and she is frightened by him but
she is loved by him, then she will be despising him. If a man
is loving a woman and she is despising him, then he is depressed.
If a man is depressed and he is driving a car, then he is
senseless and irrational. If a senseless man is driving a
dangerous car, then he is dangerous, and all traffic is in
deadly peril.
Everyone - on a public street is traffic.
Mary is a mature girl, - with a sports car. Eliza is a pretty
girl - with long hair. The mature girl is ugly.
Is the ugly girl, rich?
John is a man. He is young and poor. He is loving every fast
car and every girl - with a fast car.
Is the pretty girl loved by John? Is the rich and ugly girl
loved by John?
If Eliza had been a girl - with a fast car, then would she
be loved by John?
Mary is meeting John and he is driving her car. Is the poor
boy, dangerous?
11. The EQUAL relation
The EQUAL relation between two singular elements means that they
are identical. However, since we can put quantifiers on the
EQUAL relation, we can also use it for many set relationships.
Some examples:
ALL A EQUAL ALL B means that the sets A and B are equal and
contain not more than one element each.
ITS A EQUAL ALL B means that A is a subset of B.
ITS A EQUAL ITS B means that A and B overlap.
ALL A NOT EQUAL ALL B means that A and B are disjoint.
StNE A EQUAL ALL A means that A is not empty. 
ALL A NOT EQUAL ALL A means that A is empty. 
SOME A EQUAL ALL A means that A is singular, that is contains 
exactly one member. 
Natural language noun phrases are translated into nodes marked
as singular in the data base, if:
a) The noun phrase is not plural.
b) The noun phrase is not translated into a VARIABLE in the
data base.
c) The noun phrase is interpreted in the special sense (like
"A man is walking on the street") and not in the general
sense (like "Every man is a male human").
Data base nodes are marked as non-empty if they are of the
type predicates (like "the act of lighting" which we call LIGHT*P).
12. Natural language noun phrases
There are many data base constructs which correspond to natural
language noun phrases. Noun phrases create many problems with
their attributes, with composite objects which have several
parts and so on. A number of chapters will discuss problems with
representing that kind of facts. The necessary data base
concepts are discussed, rather than the translation problems.
Singular noun phrases without conjunctions are usually translated
into one object-type node. An exception to this is some simple
sentences in which noun phrases are translated to predicate-type
nodes, see chapter 21.
This object-type node can be a CONSTANT, a DUMMY or a VARIABLE.
CONSTANTS are created for simple positive sentences like "A
man is walking on a street." Note however, that in an
if-statement or a question, the noun phrases instead must be
interpreted as VARIABLES, e.g. "If a man is walking on a street,
then ..." or "Is a man walking on a street?". In these cases,
"a man" does not introduce a new constant, but represents a
simple search pattern to be used in deduction, and VARIABLES
are used for such search patterns in our data base system.
For general-sense statements, the nouns are usually translated
to VARIABLES. Example: "Every good girl will kiss every brave
soldier." These VARIABLES can be used in later deduction, to
find out what happens if a good girl meets a brave soldier.
Noun phrases beginning with "the" or "this" or "that" or some
similar determiner are usually translated to DUMMIES. Here a
search must be made in the data base for some previously known
node to merge the DUMMY with.
Pronouns like "he" or "it" or "her" are also translated into
DUMMIES, for the same reason.
The noun word itself indicates a property of that noun (e.g.
"man" indicates the sex and species of "a man"). A predicate
MAN*P is therefore created, and "a man" gets a relation PRED
to MAN*P.
Adjectives do not always indicate properties which are generally
true for the noun phrase. They can mean many things. Examples:
The good teacher (the teacher which is good as a teacher),
The big ant (the ant which is big for an ant),
The red house (the house which is red).
Therefore, a weaker relation ATTR is used from a noun to its
adjectives.
Names are a very special kind of predicates, and therefore
a special relation NAME goes from a noun node to its name.
When a name such as "John" or "Cambridge" is used, we want
to identify this with some previously known "John" or "Cambridge"
in the data base. But we cannot give the node itself the name
"John" or "Cambridge", since there may be more than one "John"
and "Cambridge" in the data base. Therefore, the last-mentioned
node which fits the description is found, just as for other
DUMMIES. "The always Old John" will therefore for example in our
data base be translated into the DUMMY in figure 11.
Figure 11. "The always Old John" (a DUMMY with a NAME relation
to JOHN*P).
We have a special rule for VARIABLES with only one PRED as a
definition, where this PRED goes to a predicate whose name
comes from the input sentence. This variable gets the same
name, but with "*S" in the end. Thus, "Every man" is translated
to MAN*S, which has a PRED relation to MAN*P.
This special rule is not really necessary, but has two advantages:
a) The data base becomes more readable.
b) The data base routines will immediately see that all MAN*S
nodes created by several different sentences can be merged
into one, without having to do any deduction.
You could say that MAN*P is the property of being a man,
while MAN*S is the set of all men.
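The naming rule and its second advantage can be sketched in a few lines of Python. This is an assumed illustration only: the functions `set_node_name` and `variable_for` and the dictionary `nodes` are hypothetical, and a node is reduced to a small dictionary.

```python
def set_node_name(predicate_name):
    """Name of the VARIABLE defined only by a PRED relation to the predicate."""
    if not predicate_name.endswith("*P"):
        raise ValueError("expected a predicate name ending in *P")
    return predicate_name[:-2] + "*S"


nodes = {}  # node name -> node data, shared by all input sentences


def variable_for(predicate_name):
    """Return the one shared set node, creating it on first use."""
    name = set_node_name(predicate_name)
    return nodes.setdefault(name, {"name": name, "DEF PRED": predicate_name})


first = variable_for("MAN*P")   # from one sentence mentioning "Every man"
second = variable_for("MAN*P")  # from a later sentence
```

Because the node is found by its name alone, the two sentences end up sharing one node without any deduction being run.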
12b. Attributes on noun phrases 
This section describes things which are not yet implemented
in the SQAP program when this is written (May 1974).
For several reasons, attributes on a noun phrase cannot always 
be represented by a direct relation from the noun phrase to 
the attribute. 
Sometimes two or more attributes on a noun phrase are related.
If you say "A friend of Nixon" then this person is not always
a friend (he may not be a friend of McGovern) and he is not
always "of Nixon" (he may not be "a son of Nixon" although he
is surely a son). If we represented the two attributes that
he is a friend and that he is "of Nixon" as two separate
independent relations on his object node, then the data base
deduction rules would not properly understand sentences like
"A friend of Nixon is an enemy of McGovern". The deduction
rules would wrongly deduce that since this object independently
has the attribute of being a friend and the attribute of being
"of McGovern", the object is a friend of McGovern.
To avoid this erroneous conclusion we must have only one
single outgoing relation from the object to the composite
property of being "a friend of Nixon". The statement "A friend
of Nixon is an enemy of McGovern" might thus be represented
like in figure 11b, where two new "event" nodes are introduced
for being a friend of Nixon and being an enemy of McGovern.
Figure 11b. "A friend of Nixon" SUBSET "An enemy of McGovern"
(with FRIEND*P and ENEMY*P).
It seems as if only the preposition "of" and no other
preposition in English creates this problem.
The same kind of representation can be used to correctly
represent a statement like "A big ant is a small animal",
see figure 11c.
Figure 11c. "A big ant" SUBSET "A small animal" (with BIG*P,
ANT*P, SMALL*P and ANIMAL*P).
Another reason why attributes cannot always be represented
as direct relations on the noun is that the attribute may be
restricted in time or space or in some other way. If we input
the noun phrase "A hungry girl" then the computer creates an
object for this girl. But we may thereafter learn that the
girl eats and is not hungry any more. Thus the same object,
at a later time, does not any more have the attribute of being
hungry. Here again, we must introduce an EVENT node for the
fact that the girl is hungry as shown in figure 11d.
Figure 11d. An event node for being hungry (HUNGRY*P) with an
AT-TIME relation, attached to the girl (GIRL*P).
In most cases, the same time-restriction applies to attributes
as to the main verb in the sentence. If we say "A hungry girl
ate a cold buffet in a sundrenched meadow on a warm summer
day" then the time and space restrictions are valid for all
the attributes on the various nouns in the sentence. In such
cases, the data base could be simplified if we introduced a
special "situation" node to represent the time and space,
and then used a new short relation SIT from the various event
nodes to this situation. This would be even more useful if a
series of sentences all apply to the same situation.
Prepositional attributes may also be situation restricted as
for the sentence "An angry man with a gun is coming at ten
o'clock", where the man at another time may not be "with a
gun". This might be represented as shown in figure 11e.
Figure 11e. "Being an angry man" (ANGRY*P, MAN*P) WITH "a gun",
as a situation-restricted event.
If the computer is told that "Every English spinster who
comes into the church is awed" then the computer can deduce
that Eliza is awed if it knows that Eliza is English, is a
spinster, and is coming into the church, all at the same time.
But if Eliza was an English spinster five years ago, and
comes into the church today, then we cannot make this deduction.
This can be solved in two ways. Either the data base
representation of "Every English spinster who comes into the
church is awed" is changed into "If at a certain time, an
English spinster comes into the church, then she is awed" or
else the deduction rules are changed so that the
time-limitations are implicitly carried along and combined
during deduction.
13. Composite objects 
There is a need to describe the fact that objects can be parts
of other objects. We therefore introduce the node type
composite object. A composite object consists of a known or
unknown number of elements, which may or may not be similar.
If we know which the elements of a composite object are, then
we use the ELEMENT short relation from the composite to one
or more of its parts.
Example: "John and Mary are married." would be translated into
figure 12.
Figure 12. "John and Mary are married": a composite node with a
PRED relation to MARRIED*P, whose elements "John" and "Mary"
have NAME relations to JOHN*P and MARY*P, with DEF quantifiers.
One might suggest that conjunctions between noun phrases are
translated as two separate independent object nodes. "John
and Mary are married" would thus be translated in the same
way as "John is married and Mary is married". However, this
is obviously not the same thing. Sometimes the difference is
perhaps not there, for example when we say "John and Mary are
human". But the safest way is to create a composite object.
This of course requires deduction rules to decide when a
property on a composite can be transferred to its elementary
parts. This can almost always be done, but not in some cases,
for example if we say "John and Mary together are heavier than
Peter." But in such cases there is usually some indication in
natural language, like the word "together" indicating that
the property of the composite cannot be transferred to its
elements.
Look again at the picture above showing the translation of
"John and Mary are married."
All the nodes with DEF on them above are DUMMIES. This means
that we first search for a previously mentioned node with a
"NAME JOHN*P" relation on it. If one is found, "John" will
merge with it, otherwise "John" will become a new constant
and the DEF is changed to ALL. The same thing is done for "Mary".
Thereafter, when "John" and "Mary" have been found in the data
base, we try to identify "John and Mary", that is to find in
the data base a node whose two elementary parts are just
"John" and "Mary". If such a node is found, "John and Mary"
will merge with it, otherwise "John and Mary" becomes a new
constant and the DEF quantifiers are changed to ALL.
This process ensures that if we first say "John and Mary are
married." and then say "John and Mary are going to separate."
then the two statements will refer to the same data base node
"John and Mary" for both sentences.
There is a risk, however, if we say "John and Mary and their
son are a ..." and then say "John and Mary are going to
London." Then the data base might wrongly identify "John and
Mary" in the second sentence with the composite "John and
Mary and their son" in the first sentence, and thus wrongly
conclude that the son is coming along to London. To stop this,
we might require that if there are ELEMENT relations on a
node, these must point out all the elements, and not some of
them. Thus the data base would know that "John" and "Mary"
are the only elements of the composite "John and Mary", and
therefore cannot identify this with "John and Mary and their
son".
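The merging of a composite DUMMY can be sketched in Python. This is only a minimal illustration under stated assumptions, not the SQAP implementation: the class DataBase, its methods and the integer node ids are hypothetical, and a composite is identified by its complete set of ELEMENT targets, as proposed above.

```python
from typing import Dict, FrozenSet


class DataBase:
    """Toy data base: named constants plus composites keyed by their full element set."""

    def __init__(self) -> None:
        self.named: Dict[str, int] = {}                  # NAME predicate -> node id
        self.composites: Dict[FrozenSet[int], int] = {}  # complete element set -> node id
        self._next_id = 0

    def _new_node(self) -> int:
        self._next_id += 1
        return self._next_id

    def merge_named(self, name: str) -> int:
        # Merge with a previously mentioned node, otherwise create a new constant.
        if name not in self.named:
            self.named[name] = self._new_node()
        return self.named[name]

    def merge_composite(self, *names: str) -> int:
        # First identify the elements, then look for a composite with exactly these elements.
        elements = frozenset(self.merge_named(n) for n in names)
        if elements not in self.composites:
            self.composites[elements] = self._new_node()
        return self.composites[elements]
```

Because the key is the complete element set, "John and Mary" never merges with "John and Mary and their son" in this sketch.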
The data base also ought to have a deduction rule which
can automatically conclude that there is a PART relation between
two composites, if all elements of the first composite are
also elements of the second.
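Such a rule amounts to a subset test on element sets. The following Python sketch assumes that each composite is stored with its complete set of elements; the function name deduce_part and the dictionary layout are illustrative only.

```python
from typing import Dict, FrozenSet, Set, Tuple


def deduce_part(composites: Dict[str, FrozenSet[str]]) -> Set[Tuple[str, str]]:
    """Return (a, b) pairs meaning "a PART b": every element of a is also an element of b."""
    parts: Set[Tuple[str, str]] = set()
    for a, elems_a in composites.items():
        for b, elems_b in composites.items():
            if a != b and elems_a <= elems_b:
                parts.add((a, b))
    return parts
```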
14. Conjunctions between noun phrases in the general sense
In the previous chapter I pointed out the ambiguity between
sentences like "John and Mary are married" and "John and Mary
are human" where the first sentence says that the composite
is married, while the second says that the elements individually
are human. I also said that such sentences could
always be translated to composites, since properties of
composites can in general be transferred by deduction to the
elementary parts.
This is not so easy in the general sense, see the following
examples:
"All men and women are getting married."
"All men and women are happy."
"Every man and woman standing together are a married couple."
"All men and women are young people."
Noun phrases in the general sense are translated into VARIABLES
in the data base, and these VARIABLES are later used during
deduction. In most of the sentences above, the best translation
is to create an individual variable for each element, but no
variable for any composite. If we say "All men and women are
happy" what we mean is "Create a VARIABLE containing all men,
and another VARIABLE containing all women, and put the property
of being happy on all members of both variables." We
certainly do not mean "Create a VARIABLE of man-woman couples,
and put the property of being happy on all such couples."
In the general sense, conjoined nouns are therefore not
combined into composite objects. An exception is when there
is some special indication that such a combination is wanted,
like the word "together" in the third example sentence above.
15. Plural nouns 
Plural nouns do not simply indicate a set of singular objects.
There are also properties which belong to the composite of
all the elements together. One of these is the property of
being plural, that is of having more than one element. If we
say "Two horses are running", then each horse is not plural,
neither is each horse two; it is the composite which has
these two properties.
Plural nouns must therefore often be translated into composite
objects in the data base, since relations on sets in our data
base always refer to the individual members of the set, not
to the set as a whole.
An exception from this rule is phrases in the general sense
like "All men are male humans". Here the plurality is of
little importance, and no composite object is created at input
translation.
Therefore, when a plural refers to all objects with a certain 
property, then a variable is created, but when the plural noun 
phrase refers to some special collection of objects, then a 
composite object is created. 
We introduce the new relation NUM which goes from a composite
object to the numeral of it. We also introduce the relation
COMPLEX which goes from a composite object to a predicate
which applies not necessarily to the composite, but which
applies to all its elements. Examples in figure 13.
[Figure 13: "Some horses". Labels: HORSE*P, COMPLEX.]
This means that the data base must be able to make deductions
on numbers, e.g. to deduce that if a composite has the relation
NUM 2, then the relation NOT NUM 1 can be deduced. This is
necessary e.g. to merge these sentences into the data base
in a correct way:
"Two horses are coming. One of the horses is sick."
To identify "the horses" in the second sentence with "two
horses" in the first sentence, deduction must infer NOT NUM 1
from NUM 2.
There may also be a need to have in the data base management
a routine for counting the number of elementary parts of a
composite, so that the NUM numeral can be deduced if all parts
are known.
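Both kinds of number handling can be sketched in Python. The function names and the representation of ELEMENT relations as (composite, part) pairs are assumptions made for this illustration only.

```python
from typing import Iterable, List, Set, Tuple


def not_num(num: int, candidates: Iterable[int]) -> Set[int]:
    """From NUM num, deduce NOT NUM m for every other candidate numeral m."""
    return {m for m in candidates if m != num}


def plural_dummy_can_merge(num: int) -> bool:
    """A plural DUMMY such as "the horses" needs NOT NUM 1 on the composite."""
    return 1 in not_num(num, (1, num))


def count_num(element_relations: List[Tuple[str, str]], composite: str) -> int:
    """Deduce the NUM numeral by counting ELEMENT relations (assumes all parts are listed)."""
    return len({part for comp, part in element_relations if comp == composite})
```

With NUM 2 on "two horses", plural_dummy_can_merge(2) holds, so "the horses" in the second sentence may merge with that composite.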
16. Fitting composite objects into the sentence 
The general rule is that when two conjoined nouns have been
translated into a composite object, then it is this composite
object and not its parts which is fitted into the sentence
framework.
This is obviously the correct translation e.g. when you say
"The road between Stockholm and Gothenburg" where "between"
refers to the composite, but not to the elementary parts
singularly. ("The road between Stockholm" is not right.)
However, for a phrase like "Every man and woman in the city"
we do not want to find only couples of men and women, so
general sense noun phrases are not translated into any composites
at all. The parts are fitted separately into the
sentence framework instead.
If we say "The father and the mother of Mary is coming", then
obviously it is the composite which is "coming", but it is
not so obvious that "of" refers to the composite. Suppose that
we previously in the data base have got a father of Mary and
a mother of Mary, but no composite of these two. "The father"
and "The mother" are translated into two DUMMIES, and we first
search to identify these in the data base, before trying to
identify the composite. But when we try to identify "The father"
we do not want to find the closest previously mentioned father,
we want to find the closest previously mentioned father of Mary.
Therefore, the rule for prepositions is that to the left, they
refer to the elementary parts but to the right they refer to
the composite.
See example in figure 14.
Example: "The road and railway between Stockholm and Gothenburg
is blocked."
[Figure 14: "The road and railway between Stockholm and Gothenburg is blocked". Nodes: "the road", "the railway", "the road and railway", "Stockholm", "Stockholm and Gothenburg"; predicates: ROAD*P, RAILWAY*P, STOCKHOLM*P, GOTHENBURG*P, BLOCKED*P; relation labels: PRED, NAME, ELEMENT, BETWEEN, DEF.]
As seen from the picture, between goes from the elementary
parts "the road" and "the railway" to the composite object
"Stockholm and Gothenburg".
17. Noun phrases with just a number and nothing more
Some natural languages contain constructs where a noun phrase
consists of only a number, usually followed by a preposition.
Example: "One of the horses" or "Two of the horses".
Here, as usual, "one" creates a singular set, while any number
except "one" creates a composite object. The relation "of" is
in this case translated as follows:
                               Noun phrase after the "of"
Noun phrase before the "of"    A composite object    No composite object
A composite object             PART                  ELEMENT ITS
No composite object            REV ELEMENT           EQUAL ITS
Examples:
Two of the horses: PART
One of the horses: REV ELEMENT
Two of all horses: ELEMENT ITS
One of all horses: EQUAL ITS
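The four cases can be written as a small lookup table. This Python sketch only restates the examples above; the function name of_relation and the boolean encoding of "is a composite object" are assumptions for illustration.

```python
from typing import Dict, Tuple

# Key: (noun phrase before "of" is composite, noun phrase after "of" is composite)
OF_RELATION: Dict[Tuple[bool, bool], str] = {
    (True, True): "PART",           # Two of the horses
    (False, True): "REV ELEMENT",   # One of the horses
    (True, False): "ELEMENT ITS",   # Two of all horses
    (False, False): "EQUAL ITS",    # One of all horses
}


def of_relation(before_is_composite: bool, after_is_composite: bool) -> str:
    """Select the data base relation used to translate "of"."""
    return OF_RELATION[(before_is_composite, after_is_composite)]
```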
18. Some examples of translations of sentences with plural nouns
[Figure 15: "One of the horses is ill". Nodes: "the horses" (DUMMY); labels: ILL*P, HORSE*P, DEF, ELEMENT.]
[Figure 16: "Two of the horses are ill". Nodes: "two" (DUMMY).]
[Figure 17: "The girls in Sweden are beautiful". Nodes: "The girls", "Sweden"; labels: DEF, INSIDE, EQUAL.]
[Figure 18: "The lines are making a pattern". Nodes: "the lines", "are ..." (DUMMY); labels: LINE*P, PATTERN*P, TRUE*S, COMPLEX, PART, CASE, BY, EQUAL, DEF.]
[Figure 19: "The humidity on rainy days in the tropics is high". Nodes: "The humidity" (DUMMY), "the tropics"; labels: HIGH*P, HUMIDITY*P, RAINY*P, DAY*P, TROPIC*P, INSIDE, PRED, DEF.]
[Figure 20: "All parents of each of the three students are coming". Nodes: "parents of" (VARIABLE), "the three students" (DUMMY), "all parents" (VARIABLE); labels: COME*P, PARENT*P, STUDENT*P, PRED, ELEMENT, DEF.]
[Figures 21 and 22: "John is eating three eggs, and one of them is rotten" and "John is eating three eggs, and two of them are rotten". Nodes: "John is eating...", "three eggs", "them" (DUMMY); labels: EAT*P, EGG*P, JOHN*P, ROTTEN*P, NOT NUM, EQUAL.]
19. Problems with the dual representation of nouns
As has been explained above, nouns must sometimes be translated
to singular sets (for special sense singular nouns), to defined
sets (for general sense nouns) or to composite objects (for
most plural nouns). This duality is necessary, but it also
makes deduction more difficult, since the deduction rules
must be able to make inferences from the composite to its
parts. The deduction rules must also sometimes be able to
create auxiliary composites or auxiliary non-composites.
Example I: "One man and three women are coming. How many men
and how many women are coming?" Here, deduction will probably
have to create a help-composite for the single man, since only
composites have numerals on them, and the question asks for
this numeral.
Example II: "One or more men is coming." The natural translation
of this is into a composite object with NUM to "One or
more". But if we later learn, or can deduce from the data
base, that it is only a single man, then a singular non-composite
node probably has to be created.
Example III: "Soldiers are cruel. This is because they are scared."
In the first sentence, "soldiers" is used in the general sense
and thus a defined set is created. But in the second sentence,
the translation will first translate "they" into a DUMMY looking
for a composite object. When the routine for merging the
second sentence into the data base finds the defined set for
"soldiers", it must recognize that a DUMMY looking for a composite
object can merge with a non-composite defined set in
the data base.
An even more complex problem for the deduction routines will
occur if we say "Two of the horses in the stable are sick. Is
any horse in the stable sick?" which will be translated like
in figure 23.
[Figure 23: "Two of the horses in the stable are sick" and "Is any horse in the stable sick?". Nodes: "two of...", "the horses", "any horse", "the stable"; labels: SICK*P, SICK*S, HORSE*P, PART.]
The question-answering routine will be asked to answer the
question "ITS EQUAL" (that is: SUBSET) and to do this it must
in some way recognize that a member of the defined set SICK*S
is an element of the composite "two of", and thus is sick.
20. Equality between composite objects
The natural language phrase "The father and the mother is
John and Mary" cannot be translated with an EQUAL relation
between the two composites for "the father and the mother"
and "John and Mary". EQUAL says that all members of two sets
are the same, so such a translation would say that there is
a composite object which has four elementary parts: "the
father", "the mother", "John" and "Mary".
Therefore a new relation SAME is introduced into the data
base. SAME goes between two composite objects, or between a
composite object and a non-composite object. SAME says that
both object nodes represent the same reality, but viewed
from different viewpoints, described by a different set of
descriptions.
"John and Mary are a married couple" will also be translated
with a SAME relation between the two nodes. "John and Mary"
is a composite object, and "a married couple" is a non-composite,
and EQUAL would therefore mean that a node can be both composite
and non-composite at the same time. To avoid this confusion,
the SAME relation is used.
SAME is thus used to indicate a relation between two different
descriptions of the same reality. But SAME cannot be used
between a composite object and a defined set containing its
elementary parts. ELEMENT might be used here, but ELEMENT refers
to one of the parts, not to all the parts. Therefore a
new relation OBJCOMPLEX is used. OBJCOMPLEX refers from a
composite object to a set of all its parts.
Example: "Two girls are citizens of Norway" would be translated
like in figure 24.
[Figure 24: "Two girls are citizens of Norway". Nodes: "The two girls" (a composite DUMMY), "Citizens of Norway" (a defined set VARIABLE); relation: ALL OBJCOMPLEX ITS.]
Second example: "All people in the room are the two girls"
is translated into figure 25:
[Figure 25: "All people in the room are the two girls". Nodes: "All people in the room" (a non-composite VARIABLE), "The two girls" (a composite DUMMY); relation: OBJCOMPLEX.]
                    Predicate complement noun
Subject noun        composite          non-composite
composite           SAME               OBJCOMPLEX
non-composite       REV OBJCOMPLEX     EQUAL
21. Relations between predicates
Predicates form a hierarchical structure, e.g. VERTEBRATE*P
is a special case of ANIMAL*P, HUMAN*P is a special case of
VERTEBRATE*P, KING*P is a special case of HUMAN*P and so on.
To indicate this we use the relation SUBPRED, as in figure 26.
[Figure 26: SUBPRED chain KING*P -> HUMAN*P -> VERTEBRATE*P -> ANIMAL*P.]
This means that some simple sentences can be translated as
relations between predicates. For example, the sentence "Every
man is a human" can be translated like in figure 27.
[Figure 27: "Every man is human". MAN*P SUBPRED HUMAN*P.]
which is much simpler than the other translation, in figure 28. 
[Figure 28: "Every man is human". Predicates: MAN*P, HUMAN*P; sets: MAN*S, HUMAN*S; relation labels: PRED, DEF, EQUAL ITS.]
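Following SUBPRED relations upwards is a simple graph traversal. The Python sketch below uses the hierarchy named in the text; the dictionary layout and the function name superpredicates are assumptions made for illustration.

```python
from typing import Dict, Set

# SUBPRED relations: predicate -> the predicates it is a special case of
SUBPRED: Dict[str, Set[str]] = {
    "KING*P": {"HUMAN*P"},
    "MAN*P": {"HUMAN*P"},
    "HUMAN*P": {"VERTEBRATE*P"},
    "VERTEBRATE*P": {"ANIMAL*P"},
}


def superpredicates(pred: str, subpred: Dict[str, Set[str]] = SUBPRED) -> Set[str]:
    """All predicates reachable from pred by chaining SUBPRED relations."""
    found: Set[str] = set()
    stack = [pred]
    while stack:
        for parent in subpred.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found
```

An object with PRED to KING*P can in this way also be given PRED to HUMAN*P, VERTEBRATE*P and ANIMAL*P.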
To be able to give this simple translation to adjectival
predicates, we also have the short relation SUBATTR, so that
"Every man is a male human" can be translated to figure 29.
The difference between SUBATTR and SUBPRED is the same as between
ATTR and PRED in figure 11. In the above case, there is no semantic
difference.
22. Event nodes
Many natural language phrases combine several nodes (objects,
defined sets, ...) into a statement which can have
limitations in space, in time, in its truth value, and which
can have a cause, a result etc.
The central node in the translation of such phrases is the
event node. Event nodes are used in our system not only
for typical events like "John went to the cinema with Mary"
but also for more sustained "events" like "John is the fiancé
of Mary" or even "John is the father of Mary".
The most important relations on an event node are BY to the
subject, CASE to the predicate, OBJ to the object, and PART
to the validity environment. Example: "John is riding the bike" is
translated as in figure 30.
[Figure 30: "John is riding the bike". Labels: RIDE*P, JOHN*S, BIKE*S, TRUE*S, DEF.]
From a valid event node (that is at the time and place of
the event etc.) the deduction procedures can deduce e.g. a
PRED relation from "John" to RIDE*P, and these deduced relations
are very useful in later deduction. There is also a
symmetric relation OBJPRED from the object to the predicate.
If we can deduce that some object has OBJPRED to a predicate
like RIDE*P, then we can deduce that that object is being
ridden, that is that the predicate RIDED*P (the passive of
RIDE*P) is applicable to the object.
We can therefore draw the following figure 31 of relations: 
[Figure 31: "John is riding the bike" and "The bike is ridden by John", including implicit short relations. Nodes: "John", "is riding", "the bike", RIDE*P, RIDED*P; relation labels: PASS, CASE, BY, OBJ.]
Not all of these relations have to be produced for every
sentence, since some of them can be deduced from some others
by chaining rules like:
X BY Y & Y CASE Z implies X PRED Z
X PASS Y & Z PASSCASE P implies Y CASE X
Several more triangles in the figure form such chaining rules,
although not all of them. (Even if John is riding and the
bike is ridden, we cannot therefore conclude that John is
riding just that bike.) All the chaining rules involving the
event node are true only when that event node is true, or
valid when the event node is valid.
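The chaining rules around an event node can be sketched in Python as follows. This is an illustration under assumptions: relations are stored as (event, relation, target) triples, the set valid_events stands in for the validity test on the event node, and the function name is hypothetical.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def chain_event_rules(triples: Iterable[Triple], valid_events: Set[str]) -> Set[Triple]:
    """Deduce PRED and OBJPRED relations from valid event nodes.

    event BY subject & event CASE predicate  implies  subject PRED predicate
    event OBJ object & event CASE predicate  implies  object OBJPRED predicate
    """
    triples = set(triples)
    deduced: Set[Triple] = set()
    for event, rel, pred in triples:
        if rel != "CASE" or event not in valid_events:
            continue  # the rules hold only while the event node is valid
        for other_event, other_rel, node in triples:
            if other_event != event:
                continue
            if other_rel == "BY":
                deduced.add((node, "PRED", pred))
            elif other_rel == "OBJ":
                deduced.add((node, "OBJPRED", pred))
    return deduced
```

Note that the sketch deduces nothing that links the subject directly to the object, in line with the remark above that not every triangle forms a chaining rule.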
If the data base contains a verb both in active and passive
form, then there must be a relation PASS between them to permit
deduction. Since passive forms are less common than active,
this PASS relation is generated whenever a passive verb appears.
[Figure 32: "The always depressed man" (DUMMY). Labels: MAN*P, DEPRESSED*P, DEPRESS*P, PASS.]
The direct relation OBJPRED from "the always depressed man" to
DEPRESS*P is thus not created by input translation, but it
can of course easily be deduced.
In the same way, "The bike is ridden by John" is translated
like in figure 33.
[Figure 33: "The bike is ridden by John". Labels: RIDE*P, JOHN*P, BIKE*P, PASS, PASSCASE, DEF.]
The CASE relation from the event to RIDE*P is not output
explicitly, but can of course easily be deduced.
One could argue that we could avoid passive verbs altogether
in our data base by always using the CASE and OBJPRED relations.
There are two arguments against this:
a) It is valuable always to have the relation PRED from a
noun to all properties on that noun. It is not systematic
to need the relation OBJPRED to some properties.
b) Our representation makes it very easy to represent statements
like "Someone who is killed, is dead" simply by KILLED*P
SUBATTR DEAD*P which otherwise would have to be represented
by a VARIABLE in the way in figure 34.
[Figure 34: "Someone who is killed, is dead". Node: "Someone who is killed"; labels: KILL*P, DEAD*P, PRED.]
23. Putting restrictions on equality 
One can see event nodes as a way of adding restrictions in 
time and space etc on PRED relations. The event nodes are 
necessary because our data base does not permit us to put 
short relations on short relations. 
Sometimes there is a need to extend the SUBSET relation in
the same way. PRED and SUBSET are very similar relations,
although PRED goes to a predicate, SUBSET to an object set.
Since SUBSET is a special case of the EQUAL relation, it is
really the EQUAL relation which we want to extend into an
event. We therefore introduce a new relation OBJCASE so that
Y BY X & Y OBJCASE Z implies X EQUAL Z whenever the event Y
is true or valid. The relations thus form the triangle in
figure 35.
[Figure 35: triangle formed by the relations BY, OBJCASE and EQUAL.]
The OBJCASE relation is used when natural language equality
has to be translated into relations between object nodes.
An example is given in figure 36.
[Figure 36: "Every evening, John is a singer in the club". Nodes: "John", "...is...", "singer in the club", "Every evening"; labels: JOHN*P, SINGER*P, NAME, AT-TIME, DEF.]
From this network, we can deduce that if the event is valid,
e.g. in the evening, then "John" is a SUBSET of "singer in
the club".
Since all relations expand into a chaining rule with EQUAL:
"X R Y & Y EQUAL Z implies X R Z" and since EQUAL can be
expanded into an event node using BY and OBJCASE, this can be
used to expand any short relation into an event node. For
example, "After 1972, Britain is a part of EEC" requires us
to expand the PART relation between Britain and EEC into an
event, to be able to add a time limitation to that PART
relation. This can easily be done by expanding EQUAL into
BY x OBJCASE in the way in figure 37.
[Figure 37: "After 1972, Britain is a part of EEC". Node: "the set of all parts of EEC"; labels: PART, DEF.]
23b. Quantifiers on event nodes 
Consider the sentence "A girl is giving every man a flower".
This sentence could be interpreted in the following way:
"There is a set of events, one for each man. One and the
same girl is giving a different flower in each such event".
In our data base, this is represented like this:
[Figure 37a: "A girl is giving every man a flower". Nodes: MAN*S (Variable), "a girl", "a girl is giving ..." (Variable), FLOWER*S (Variable); labels: GIVE*P, TRUE*S, BY, OBJ, DEF, ITS.]
In the translation above, the two noun phrases "a girl" and
"a flower" are interpreted in different ways. "A girl" is
interpreted as one single girl, while "a flower" is interpreted
as a set of different flowers, one for each man.
These two interpretations of "a" are called respectively the
singular sense and the distributed sense. Other determiners
than "a" have the same ambiguity, for example "some".
"A car" in the sentence "Every man is in a car" can be
interpreted in the singular sense (one single car) or in
the distributed sense (one car for each man).
In the singular sense the interpretation will be:
[Figure 37b: "Every man is in a car" (Constant), singular sense. Labels: MAN*S, CAR*P, TRUE*S; node: "a car" (Constant).]
And in the distributed sense the interpretation will be: 
[Figure 37c: "Every man is in a car" (Variable), distributed sense. Labels: MAN*S, CAR*P, TRUE*S, ITS, DEF; node: "a car" (Variable).]
One can note that we can later refer back to the car only
with the singular sense interpretation. Example: "Every man
is in a car. The car drove away". This also corresponds to
the interpretations in the figures above, where only the
singular sense provides a node to refer back to.
Such a back-referencing could thus be used to disambiguate
this kind of ambiguous sentence.
If you compare the two figures above, an important difference
is that the event node is a constant in the singular sense,
a variable in the distributed sense.
The translation rule is that if all the noun phrases marked
with "a" or "some" are to be interpreted in the singular sense,
then the event can become a constant. If, however, one of
the noun phrases marked with "a" or "some" is to be interpreted
in the distributed sense, then we must have one copy
of the event node for each copy of the distributed noun, so
the event must become a variable.
If the event is translated as a variable, then the quantifiers
on the relations between the event and the noun phrases should be:
DEF-ALL to singular sense and constant nouns,
DEF-ITS to distributed sense nouns,
ITS-ALL to nouns marked with a general quantifier like "every"
or "all" or "each".
Examples:
"A man and a woman are everyone in the house"
[Figure 37d: "A man and a woman are everyone in the house". Nodes: "a man and a woman", "a man", "a woman", "everyone in the house" (Variable), "the house" (Dummy); labels: MAN*P, WOMAN*P, HOUSE*P, SAME ITS, DEF.]
"In every city, some woman is in a hospital"
[Figure 37e: "In every city, some woman is in a hospital". Nodes: CITY*S (Variable), WOMAN*S (Variable), HOSPITAL*S (Variable), "is ..."; labels: CITY*P, WOMAN*P, HOSPITAL*P, ITS, BY.]
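The translation rule for event quantifiers can be summarized in a short Python sketch. The sense labels "singular", "constant", "distributed" and "general" and the function name translate_event are assumptions introduced for this illustration; how the sense of each noun phrase is decided is outside the sketch.

```python
from typing import Dict, Tuple

# Quantifier on the relation from a variable event node to each kind of noun phrase
QUANTIFIER: Dict[str, str] = {
    "singular": "DEF-ALL",
    "constant": "DEF-ALL",
    "distributed": "DEF-ITS",
    "general": "ITS-ALL",
}


def translate_event(noun_senses: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Decide whether the event is a constant or a variable, and pick the quantifiers."""
    if "distributed" not in noun_senses.values():
        return "constant", {}
    return "variable", {np: QUANTIFIER[sense] for np, sense in noun_senses.items()}
```

For "A girl is giving every man a flower", with "a flower" in the distributed sense, the event becomes a variable with DEF-ALL to "a girl", ITS-ALL to "every man" and DEF-ITS to "a flower".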
24. Deduction patterns and natural language if-clauses 
A natural language statement like "If the weather is rainy
and a person is outdoors and the person is not wearing any
raincoat, then the person will become wet." introduces deduction
rules into the data base. These rules are only valid
if a pattern of facts in the data base can fit into the
pattern created by the deduction rule.
Such deduction rule patterns are called keys in our system.
After merging into the data base, the statement above may
look like in figure 38.
[Figure 38: "If the weather is rainy and a person is outdoors and the person is not wearing any raincoat, then the person will become wet". Node: "any raincoat"; labels: WEATHER*P, RAINY*P, OUTDOOR*P, WEAR*P, RAINCOAT*P, WET*P, COND, CASE, DEF, THAT.]
A new quantifier "THAT" is introduced above. The reason for
this is that if there are two different persons, one who is
outdoors, and another who is not wearing a raincoat, then we
do not want to conclude that any of them necessarily will
become wet. We therefore have to single out in the data base
one person and two events in which this person is the subject.
One of the events should say that he is outdoors, the other
that he is not wearing a raincoat. We therefore have a key
of one person and two events, which have to be fitted with
facts in the data base when the deduction rule is used.
The quantifier "THAT" refers from a conclusion to the deduction
pattern key. It means to single out that member of the referred
set to which the whole pattern has been matched.
In the figure above, "the weather is rainy" is not part of
the pattern. But in reality, there is in the natural language
text an implicit time and place indication: "If, at a certain
place, at a certain time..." and this place and time will fit
the weather into the pattern key.
Our program does not yet handle such implicit time and place 
indications. 
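The role of the key and of the THAT quantifier can be illustrated with a much simplified Python sketch. It assumes that facts are flattened to (subject, property) pairs instead of event nodes, and that the key is a list of properties which must all hold for one and the same subject; the names apply_key and the property strings are hypothetical.

```python
from typing import Iterable, Set, Tuple

Fact = Tuple[str, str]


def apply_key(facts: Set[Fact], key: Iterable[str], conclusion: str) -> Set[Fact]:
    """Put the conclusion only on THAT subject to which the whole key has been matched."""
    key = list(key)
    subjects = {subject for subject, _ in facts}
    return {
        (subject, conclusion)
        for subject in subjects
        if all((subject, prop) in facts for prop in key)
    }
```

With one person who is outdoors and another who is not wearing a raincoat, the key matches neither of them, so no one is concluded to become wet.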
There are two short relations for "imply" in our data base:
COND and IFTHEN. COND refers to necessary conditions, IFTHEN
to sufficient conditions. To handle cause and effect patterns,
and the resulting structure of situations depending on each 
other, probably more such relations are necessary, but we 
have not introduced them yet. 
A somewhat simpler notation is available for some simple cases.
This is the COP short relation, which refers to a hypothetical
copy. Thus, "If the bridge is low and weak, then it will break"
can be translated like in figure 39.
[Figure 39: "If the bridge is low and weak, then it will break". Nodes: "the bridge", "...is...", "then..."; labels: LOW*P, WEAK*P, BREAK*P, COP, PRED, DEF.]
Thus, if-statements in natural language introduce a pattern 
key of variables, connected with DEF-DEF relations. And the 
conclusion refers to this pattern with relations with the 
quantifier THAT on the pattern end. 
A natural language if-statement in a question is translated
in a quite different way. The statement "If the weather is
rainy and John is outdoors, will he then be wet?" is translated
like this: "Add the temporary facts that the weather is rainy
and that John is outdoors into the data base. Thereafter try
to deduce if he will be wet. When the question has been answered,
then remove the temporary facts from the data base again."
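This add, deduce and remove cycle can be sketched in Python with a context manager. The sketch is an assumption-laden illustration: the data base is a plain set of (subject, property) facts, and toy_deduce is a hypothetical stand-in for the real deduction procedures.

```python
from contextlib import contextmanager
from typing import Callable, Iterable, Iterator, Set, Tuple

Fact = Tuple[str, str]


@contextmanager
def temporary_facts(db: Set[Fact], facts: Iterable[Fact]) -> Iterator[Set[Fact]]:
    """Add facts for the duration of a question, then remove exactly those that were added."""
    added = [f for f in facts if f not in db]
    db.update(added)
    try:
        yield db
    finally:
        db.difference_update(added)


def ask_if(db: Set[Fact], assumptions: Iterable[Fact], goal: Fact,
           deduce: Callable[[Set[Fact], Fact], bool]) -> bool:
    with temporary_facts(db, assumptions):
        return deduce(db, goal)


def toy_deduce(db: Set[Fact], goal: Fact) -> bool:
    # Hypothetical rule: someone is wet if the weather is rainy and he is outdoors.
    subject, pred = goal
    if goal in db:
        return True
    return pred == "WET*P" and ("weather", "RAINY*P") in db and (subject, "OUTDOOR*P") in db
```

Facts that were already in the data base before the question are left untouched when the temporary facts are removed.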
Compared to other natural language systems, one characteristic
of our system is the representation of deduction rules as variable
patterns in the data base. Other often used representations
are
a) Predicate calculus clauses.
b) Executable programs in some special programming language.
The advantage with our system is that the representation of
deduction rules is so closely integrated with the representation
of facts. The simplest deduction rules, the chaining
rules, simply are rules for traversing the data base graph
from node to node. The more complex deduction rules are
patterns very similar to the data base facts which these
patterns are to match during deduction.
If a predicate calculus representation is used, then
efficient deduction requires some algorithm for selecting
those clauses which might match the clauses in the deduction
tree. Thus, the pattern matching problem is not avoided,
and an efficient deduction algorithm probably will have to
have an underlying network pattern similar to ours, although
not so visible.
The advantage with predicate calculus representation is
however that the theory of decidability is much fuller
developed for that representation than for ours.
Executable programs in some special programming language
is potentially a more powerful representation than ours.
Heuristic rules guiding the order of the deduction search
are easier to include into such a deduction rule. However,
the power in an actual system is of course limited to the
set of programs which the input translator can generate.
Many of the programs will probably in reality not contain
anything else than our chaining rules, variables and patterns,
and such systems will also require some more or less hidden
underlying network to select rules and facts of interest
during a certain deduction process.
On the outermost surface level, we have until now only implemented
yes-no questions in our system. Other kinds of
questions can however appear as sub-questions during the
deduction process. A question is in many ways similar to
a natural language if-statement. In both cases, a pattern
of variables is created, and we want to identify this
pattern with the data base.
A typical question like "Is John father of a blond girl?" will
thus be translated like in figure 40.
[Figure 40: "Is John father of a blond girl?". Nodes: "the father", "all blond girls"; labels: JOHN*P, NAME, DEF, BY?, OF, ITS.]
In this simple case, there was no need to introduce pattern
keys of more than one variable, so the translation was very
simple. One central relation, in this case the BY relation,
is marked with a question-mark, which means that it is this
relation which deduction should try to prove.
The processing of a question therefore usually begins with
the introduction of temporary data (in this case the VARIABLE
for "all blond girls" and the VARIABLE for "all fathers of
blond girls") and then a single question relation to prove.
This is what our system is capable of today. However, some
complex questions will create patterns where part of the
pattern refers to other patterns, just as for if-statements
in the previous section of this paper.
Look for example at the question "Is the father of all the
children of any of John's daughters married to that daughter?"
In the translation of this question there will be a VARIABLE
for "the father" and a VARIABLE for "that daughter". And these
two VARIABLES must pairwise match. It is not enough to find
that the father is married, not even enough to find that he
is married to one of John's daughters. He must be married
to just that daughter whose children are all also his children.
The translation will therefore have to be something like in 
figure 41. 
[Figure 41: "Is the father of all the children of any of John's daughters married to that daughter?". Nodes: "John", "that daughter", "...is...", "the father", "all the children"; labels: JOHN*P, DAUGHTER*P, MARRIED*P, FATHER*P, CHILD*P, NAME, PRED, OF, TO, BY, DEF.]
Look at the OF relation from "the father" to "all the children".
This OF relation should single out just the children of his
wife, not the children of all her sisters. We have not yet
found out how to do this. We hope that this complex kind of
question will not be common.
26. DUMMIES = temporary variables for data base merging
A short presentation of the concept of a DUMMY was made in
section 8. DUMMIES and problems with them will be more fully
treated here.
When natural language uses constructs like "He" or "the man"
or "this object in the sky" then this usually refers to something
which the receiver is supposed to know already. Often, the
thing referred to has been mentioned a short time ago in the
previous natural language input.
We therefore introduce a special kind of VARIABLE. This is
called a DUMMY. An ordinary VARIABLE is kept in the data
base to be used at some later time for deduction. A DUMMY
causes an immediate search in the data base for a matching
previously known object.
The order of this search is important. If there are several
previous objects matching the descriptions, the last-mentioned
one shall usually be found. However, the subject usually goes
before other noun phrases. If we say "If a card is below
another card, then it cannot be seen." then "it" refers to
the subject "a card", not to the prepositional "another card",
even though this was mentioned later.
Our program will therefore make a list, the so-called CURRENT
list, of previously mentioned objects. This is searched
backwards.
We have at present two search routines for DUMMY matching,
the "the" routine and the "this" routine. One difference
between them is that if no matching node is found, the "this"
routine will ask the user to rephrase his statement. The "the"
routine will in that case accept that this is something which
the user knows, but not the computer. It will therefore enter
a new node if no previously mentioned object is found.
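
The backwards search and the two routines can be sketched as
follows. This is only an illustration in Python, not the SQAP
implementation: the names Node, Current, resolve_the,
resolve_this and RephraseNeeded are assumptions, matching is
reduced to a subset test on predicate names, and the preference
for the subject over later noun phrases is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A data base node; `preds` holds predicate names such as "CARD*P"."""
    label: str
    preds: frozenset


class RephraseNeeded(Exception):
    """Raised by the "this" routine when nothing on the list matches."""


@dataclass
class Current:
    """Hypothetical CURRENT list: mentioned objects, oldest first."""
    nodes: list = field(default_factory=list)

    def mention(self, node):
        self.nodes.append(node)
        return node

    def search(self, preds):
        # Search backwards so the last-mentioned match is found first.
        for node in reversed(self.nodes):
            if preds <= node.preds:
                return node
        return None

    def resolve_the(self, label, preds):
        # "the" routine: if nothing matches, assume the user knows
        # the object and enter a new node for it.
        found = self.search(preds)
        if found is None:
            found = self.mention(Node(label, preds))
        return found

    def resolve_this(self, preds):
        # "this" routine: if nothing matches, ask for a rephrasing.
        found = self.search(preds)
        if found is None:
            raise RephraseNeeded("please rephrase the statement")
        return found


cur = Current()
first = cur.mention(Node("card 1", frozenset({"CARD*P"})))
second = cur.mention(Node("card 2", frozenset({"CARD*P"})))
the_card = cur.resolve_the("the card", frozenset({"CARD*P"}))
the_table = cur.resolve_the("the table", frozenset({"TABLE*P"}))
```

Here the_card is the second, last-mentioned card, while the_table
is a newly entered node, since no table had been mentioned.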
One problem which we so far have not completely solved is how
to deal with patterns of DUMMIES. If we say "the man
behind John" then there are two natural ways to translate this
into our data base:
a) Two DUMMIES, one independent (for "John") and another
dependent (for "the man") as in figure 42.
[Figure 42: network diagram, layout not recoverable. Nodes:
"The man" (PRED MAN-P, DEF) and "John" (NAME JOHN-P, DEF),
joined by the relation DEF BEHIND ALL.]
Figure 42
"The man behind John"
b) A pattern key of two mutually dependent DUMMIES, where
"DEF BEHIND ALL" in figure 42 is changed to "DEF BEHIND DEF".
The first translation is necessary in those cases where only
one of the DUMMIES has a match in the data base, e.g. for a
statement like "If a man is late, then the man behind him is
even later." Here, there is no previously known man, and the
second translation with the pattern key would not match "a man"
in the if-statement at all.
However, if solution a) is adopted, this text will not be
treated correctly: "A man with a dog is coming. Another dog is
barking. The man with the dog is frightened." Solution a)
will first find the other dog, and then create a new man
who is with that other dog, and let that other man be
frightened. A more complex algorithm may be necessary to
solve this problem.
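
The failure of solution a) on this text, and the behaviour of a
jointly matched pattern key, can be shown with a small Python
sketch. The object names, the WITH relation and both functions
are assumptions made for the illustration; they are not taken
from the SQAP program.

```python
from itertools import product

# Mentioned objects in order of mention: (name, predicate).
objects = [("man1", "MAN"), ("dog1", "DOG"), ("dog2", "DOG")]
# Known relations between objects: (subject, relation, object).
relations = {("man1", "WITH", "dog1")}


def last_mentioned(pred, objs):
    """Backwards search for the most recently mentioned `pred`."""
    for name, p in reversed(objs):
        if p == pred:
            return name
    return None


def resolve_independent_first(objs, rels):
    """Solution a): resolve "the dog" alone, then "the man"."""
    dog = last_mentioned("DOG", objs)
    for name, p in reversed(objs):
        if p == "MAN" and (name, "WITH", dog) in rels:
            return name, dog
    # No man is with that dog, so a new node is entered.
    return "new-man", dog


def resolve_pattern_key(objs, rels):
    """Solution b): match both DUMMIES together."""
    men = [n for n, p in reversed(objs) if p == "MAN"]
    dogs = [n for n, p in reversed(objs) if p == "DOG"]
    for man, dog in product(men, dogs):
        if (man, "WITH", dog) in rels:
            return man, dog
    return None


wrong = resolve_independent_first(objects, relations)
right = resolve_pattern_key(objects, relations)
```

With these data, wrong is ("new-man", "dog2") and right is
("man1", "dog1"). The joint match would still fail on the
"If a man is late" example, where only one DUMMY has a match.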
Another example which will cause difficulty is
"John and his brother are in the wood. His brother is leaving."
If no DUMMY pattern key is created, then "his" in the second
sentence will identify with "brother" in the previous sentence.
"His brother" in the second sentence will then identify with
"his brother's brother" in the first sentence, which is not
correct.
27. DUMMIES which refer to VARIABLES 
Look at the natural language sentence "If a lion meets an 
elephant, then the elephant will run to the forest." 
There are two DUMMIES in the main clause, "the elephant" and
"the forest". "The elephant" will match the VARIABLE created
by "an elephant" in the if-clause. "The forest" will match a
previously known, probably CONSTANT forest.
In general, only after doing the refer-back search in the 
data base will we know whether a DUMMY will match a VARIABLE 
or a CONSTANT.
If a DUMMY matches a VARIABLE, then that DUMMY may be adding
definitions to that VARIABLE. Look for example at the
sentence "If a lion meets an elephant, and if the lion sees the
elephant, then...". Here, the DUMMIES in the second phrase
will add to the pattern key being built up, and thus add to
the definitions of the VARIABLES "a lion" and "an elephant".
This means that there are two kinds of DEF-marked relations
on DUMMIES. The first of them are those which are to be used
during the refer-back search. And the second are those which
are to be added to the VARIABLE, if the DUMMY matched a
VARIABLE. In our system, we intend to distinguish between
these by first giving the relations which are to be used in
the refer-back search. Then the refer-back search is done,
and thereafter the relations are given which add DEF-s to
the definition of the matched VARIABLE.
Another interesting case is where there are two DUMMIES, one
dependent on the other, and one of them matches a VARIABLE.
Look for example at the sentence "If a girl is in trouble,
then her mother will be angry." Here, "her" becomes an
independent DUMMY, while "her mother" becomes a dependent DUMMY.
The "her" DUMMY will match the VARIABLE "a girl" in the
if-clause. The DUMMY "her mother" will not find any match at
all. And the interesting thing is that because the independent
DUMMY matched a VARIABLE, the dependent DUMMY "her mother"
which matches nothing should in this case not create a new
CONSTANT but a new VARIABLE. For every different girl, there
is a different mother who will be angry, so a CONSTANT will
not do.
This means that the "the" DUMMY algorithm must be able to
decide whether a CONSTANT or a VARIABLE is to be created when
a DUMMY finds no explicit match.
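
One possible form of this decision is sketched below in Python.
The rule shown (create a VARIABLE only when the DUMMY depends on
a DUMMY that matched a VARIABLE, otherwise a CONSTANT) is a
reading of the "her mother" example, and the names Node and
kind_for_unmatched are assumptions, not part of the SQAP program.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str
    kind: str  # "CONSTANT" or "VARIABLE"


def kind_for_unmatched(depends_on: Optional[Node]) -> str:
    """Kind of node to create for a DUMMY that found no match.

    `depends_on` is the node matched by the DUMMY that this one
    depends on, or None for an independent DUMMY.
    """
    if depends_on is not None and depends_on.kind == "VARIABLE":
        return "VARIABLE"
    return "CONSTANT"


# "If a girl is in trouble, then her mother will be angry."
girl = Node("a girl", "VARIABLE")  # created by the if-clause
mother = Node("her mother", kind_for_unmatched(girl))
```

Here mother.kind becomes "VARIABLE", so that each girl matched
by the rule gets her own mother.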
28. The problem of dual representation
We have of course during the writing of the SQAP system
encountered many problems. For some of them we have found
solutions, for some not. Many of the problems have already
been presented in this paper, and those problems which belong
more to input translation or to deduction than to data base
structure do not fit into the subject of this paper.
Looking at the problems we have met, there seems to be one
problem which recurs several times. This is the fact that
the same natural language construct can be represented in
several ways in our data base. We have found that this is
unavoidable, since one representation is necessary in some
cases and another in other cases. But on the other hand, this
difference in representation will make the deduction difficult,
including the deduction during the merging of new text into
a previous data base.
One solution to this problem is that when there are two different
representations, then for sentences giving one of them, both
of them are created, including the relationship between them.
This solution is used for the duality of the representation
of nouns. The noun "book" corresponds in our data base both
to the predicate BOOK*P (= the property of being a book) and
to the defined set BOOK*S (= the set of all books). But
whenever BOOK*S is created in input translation, BOOK*P is also
created and the relation BOOK*S DEF PRED BOOK*P is created.
(If this already exists in the data base, then of course the
same thing is not put there twice.)
This means that whenever both BOOK*P and BOOK*S occur in our
data base, the relation between them also exists.
Another example where the same solution is used is active
and passive verbs. Whenever a passive predicate, e.g. KILLED*P,
is put into the data base, we also put in the active form
KILL*P and the relation between them: KILL*P PASS KILLED*P.
In this way we ensure that if both KILL*P and KILLED*P are
in our data base, then the relation PASS between them is
also there.
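
The rule that both representations and the link between them
are always entered together can be sketched as follows. The
class DataBase and its method names are assumptions made for
this Python illustration. Relations are kept in a set, so that
entering the same relation again stores nothing twice.

```python
class DataBase:
    """A toy store of nodes and (source, relation, target) triples."""

    def __init__(self):
        self.nodes = set()
        self.relations = set()

    def add_relation(self, source, relation, target):
        # Sets make insertion idempotent: nothing is stored twice.
        self.nodes.update((source, target))
        self.relations.add((source, relation, target))

    def add_defined_set(self, noun):
        """Entering NOUN*S also enters NOUN*P and the link."""
        self.add_relation(f"{noun}*S", "DEF PRED", f"{noun}*P")

    def add_passive(self, active, passive):
        """Entering a passive predicate also enters the active form."""
        self.add_relation(f"{active}*P", "PASS", f"{passive}*P")


db = DataBase()
db.add_defined_set("BOOK")
db.add_defined_set("BOOK")  # second call changes nothing
db.add_passive("KILL", "KILLED")
```

After these calls the data base holds exactly two relations,
BOOK*S DEF PRED BOOK*P and KILL*P PASS KILLED*P.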
The same solution could be used, but would be cumbersome and
memory-consuming in other cases. For example, a number of
objects can be regarded both as a composite object and as a
set, for which we have two different representations. There
is a short relation, OBJCOMPLEX, in our system, from a composite
object to a set of all its parts. But this relation cannot
solve the whole problem, and it would also be very cumbersome
always to have to put out both representations for certain
phrases. This is discussed further in section 19 of this
paper.
Another problem of this kind is that our system is very much
based on the idea that simple facts should be stored in a
simple way and more complex facts in a more complex way. "A
man is a male" is therefore in our data base stored as in
figure 43.
MAN-P --SUBPRED--> MALE-P
"A man is a male"
Figure 43
In this case, a relation between the predicates was enough.
But for the slightly more complex statement "Every human male
is a man", a defined set is necessary as in figure 44.
[Figure 44: network diagram, layout not recoverable. A defined
set "Every human male" (HUMAN-P, MALE-P) is related to MAN-P.]
"Every human male is a man"
Figure 44
If there is some limitation in truthfulness or validity, e.g.
a time-limit, then the PRED must be expanded to REV BY times
CASE, e.g. for the phrase "Every human male was that year a
soldier", in figure 45.
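[Figure 45: network diagram, layout not recoverable. The set
"Every human male" (HUMAN-P, MALE-P) is joined through an event
"..was.." by the relations BY and CASE to SOLDIER-S.]
"Every human male was that year a soldier"
Figure 45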
The difficulty with this is that when a new fact is going to
be added to old facts, then the expanded version may be necessary.
Also, a question may be asking for the expanded version, and
the deduction routines may then have to do the expanding
during deduction, which is surely possible, but difficult to
manage in an efficient way.
Example:
"Every male is an animal. If he is human, then he is
also a man."
Here, "he" in the second sentence creates an object. The data
base merging routine will find it difficult to understand
that this refers to the "male" in the previous sentence, since
this "male" was translated as a predicate, not as an object.
29. What our system can do and cannot do 
Our system can at least partly manage the following natural
language constructs: Nouns, articles, quantifiers, adjectives,
numerals, most pronouns, the conjunction "and", passive and
active verbs, objects, predicate complements, genitive,
prepositional attributes and adverbials, if-clauses, yes-no
questions.
Some of the things we are not ready with yet are other
conjunctions than "and", relative pronouns, interrogative
pronouns, negation, auxiliary verbs other than "be",
comparative adjectives.
We do not yet try to resolve ambiguity by reference to the 
data base. 
The kind of facts which our system can handle are basically
a passive description of a true set of facts about the world.
We can thus not yet handle properly a description of a sequence
of events changing the world step by step. Neither can we
handle properly facts which are part of someone's belief
structure. Statements about statements cannot be handled (e.g.
"This is a difficult problem" or "This should not be construed
to mean that...").
30. A short comparison with other systems 
Shapiro 1971, Simmons 1971 and others have presented systems
very similar to ours. Most other systems do not have quantifiers
on the short relations as we have, and we feel that this
is an addition which adds to the power of the representation.
Special in our system may also be that one short relation
can be extended when necessary into an event. This saves
much memory compared to representations where the fullest
form is always used, even though it is in most cases not
needed. It is for example true that for a statement like
that in figure 3, there may be doubt about only the BY
relation, or only the AT-TIME relation, or only the CASE
relation. (We may be sure that the girl is happy, but not
so sure about the day, or we may be sure that there is
happiness today, but not sure where.) A full representation
would therefore require a place to insert doubt on any
short relation, whether there is doubt or not, and this
would double the data base size.
In our system, the deduction rule can for any node in the
data base find all outgoing and incoming short relations
directly, and follow them. In spite of this, we can store
a whole short relation in just 64 bits (two 24-bit addresses
plus 16 additional bits). This compact representation
increases the efficiency of systems storing the data
base in virtual memories.
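
The arithmetic behind the 64-bit figure (24 + 24 + 16 = 64) can
be made concrete with a small packing sketch in Python. The
order of the fields within the word, the meaning of the 16
additional bits and the function names pack and unpack are
assumptions; the text only states the field widths.

```python
ADDR_BITS = 24   # width of each node address
EXTRA_BITS = 16  # width of the additional field
ADDR_MASK = (1 << ADDR_BITS) - 1
EXTRA_MASK = (1 << EXTRA_BITS) - 1


def pack(source: int, target: int, extra: int) -> int:
    """Pack two 24-bit addresses and 16 extra bits into one word."""
    if not (0 <= source <= ADDR_MASK and 0 <= target <= ADDR_MASK):
        raise ValueError("address does not fit in 24 bits")
    if not 0 <= extra <= EXTRA_MASK:
        raise ValueError("extra field does not fit in 16 bits")
    high = source << (ADDR_BITS + EXTRA_BITS)
    return high | (target << EXTRA_BITS) | extra


def unpack(word: int):
    """Recover (source, target, extra) from a packed word."""
    extra = word & EXTRA_MASK
    target = (word >> EXTRA_BITS) & ADDR_MASK
    source = (word >> (ADDR_BITS + EXTRA_BITS)) & ADDR_MASK
    return source, target, extra


word = pack(0xABCDEF, 0x123456, 0xBEEF)
fields = unpack(word)
```

With 24-bit addresses, such a layout can refer to at most
2**24 (about 16.7 million) nodes.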
The basic ideas for our system were initially conceived by
Erik Sandewall and were presented in his papers in the
bibliography.
Our system was developed as teamwork between me, Erik
Sandewall and Kalle Mäkilä. I have been working with input
translation, Kalle Mäkilä with data base management and
deduction, and Erik Sandewall has guided us in our work.
It is difficult to pinpoint who solved each of our problems,
since they were solved through discussions from which a
solution sooner or later emerged.
Siv Sjögren has been working with the problem of adapting
our system to the Swedish and Esperanto languages.

Bibliography 

Bar-Hillel, Yehoshua 1964: Language and Information,
Addison-Wesley, Reading, 1964.

Fillmore, Ch. J.: The Case for Case. In Universals
in Linguistic Theory, ed. Bach, E.
et al., Holt, Rinehart and Winston
Inc., 1968.

* Mäkilä, Kalle 1972: Deduction procedures in a question
answering system, FOA P rapport
C 8310-M3(E5), January 1972.

Mäkilä, Kalle 1973: Experience of assimilation and
deduction in a semantic net (to
be published).

* Palme, Jacob 1970A: Making Computers Understand Natural
Language, FOA P rapport C 8257-11(64),
July 1970, also in Artificial
Intelligence and Heuristic Programming
(ed. Findler, Meltzer), Edinburgh
University Press 1971.

* Palme, Jacob 1970B: A simplified English for Question
Answering, FOA P rapport C 8256-11(64),
December 1970.

* Palme, Jacob 1971A: A Natural Language Parsing Program for
Question Answering, FOA P rapport
C 8268-11(64), February 1971.

Palme, Jacob 1971B: Internal Structure of the SQAP Natural
Language Parser, FOA P rapport
C 8286-11(64), April 1971.

* Palme, Jacob 1972A: Syntax and dictionary for a computer
English. FOA P rapport C 8312-M3(E5),
February 1972.

Palme, Jacob 1972B: From parsing tree to predicate calculus,
a preliminary survey. FOA P rapport
C 8313-M3(E5), February 1972.

* Palme, Jacob 1973: The SQAP data base for natural language
information, FOA P rapport, September 1973.

Sandewall, Erik 1965: Representation of facts in a
computer question answering
system. Uppsala University,
Computer Science dept., 1965.

* Sandewall, Erik 1969: A set-oriented property structure
representation for binary relations, SPB. In
Machine Intelligence 5, Edinburgh University
Press, 1970, also as Uppsala University
Computer Sciences department Report nr 24.

* Sandewall, Erik and Mäkilä, Kalle 1970: A Data Base
Structure for a Question-Answering System. FOA P
rapport C 8265-11(64), November 1970.

* Sandewall, Erik 1971: Formal Methods in the Design of
Question-Answering Systems.
Artificial Intelligence vol. 2
(1971) pp 129-145.

Schank, R.C. and Tesler, L.G. 1969: A Conceptual
Dependency Parser for Natural Language, International
Conference on Computational
Linguistics, 1969.

Shapiro, Stuart Charles 1971: The MIND system: A Data
Structure for Semantic Information Processing.
Rand, Santa Monica, Ca. 90406 USA,
report R-837-PR.

Simmons, R.F. 1971: Natural Language for Instructional
Communication. In Artificial Intelligence
and Heuristic Programming, ed. Findler,
N.V. et al., Edinburgh University Press.

Sjögren, Siv 1970: En syntax för datamaskinell analys
av esperanto. FOA P rapport C 8264-11(64),
October 1970.

Sjögren, Siv 1971: Utkast till en syntax för
datamaskinell analys av svenska, FOA P
rapport C 8314-M3(E5), February 1971.
