1. INTRODUCTION.
In this paper I describe a system for the on-line
semantic analysis of texts of up to paragraph length. It
was programmed and applied in Q32 LISP 1.5 to material of
two sorts: newspaper editorials and passages of classical
philosophical argument. The immediate purpose of the
analysis was to resolve the word-sense ambiguity of the
texts: to tag each word of the texts to one and only one
of its possible senses or meanings, and to do so in such
a way that anyone could judge the output's success or
failure without knowing the coding system. The system
tackles texts of up to paragraph length because I take it
as a working hypothesis that many word-sense ambiguities
cannot be resolved within the bounds of the conventional
text sentence: there simply isn't enough context available.
The system attempts to detect semantic forms (which I
call templates) directly in coded text, and not by means
of a conventional syntax analysis. This restriction
sets the present approach apart from the better-known ones.
However, an approach like the present one still has to show
how to obtain the information contained in a conventional
syntax analysis, and I shall do that below. For each
paragraph of text examined the system derives a nested
structure of the semantic templates, which can be thought
of as its semantic representation. As I shall show, it may
be necessary for the system to enlarge its own dictionary in
an on-line mode in order to obtain such a representation.
From a representation, a word-sense resolution of the text
is read off and printed out, since the representation contains
one and only one sense representation for each constituent
word of the text.
The basic item, the template, is intended to express, in
coded form, the message content of an elementary clause or
sentence. Thus, if we had to analyse the sentence "The old
postman is angry", I would expect to match with it a template
that could be interpreted as "A certain kind of man is in
a certain state". Similarly, if analysing the clause "The
wicked wizard", I would expect to match with it a template
that could be interpreted "a man is of a certain kind". The
main hypothesis of the system of sense analysis is that one
can build up a 'proper semantic sequence' of such templates
as a representation of "semantically compatible" fragments
of text. At the end of the paper I shall discuss the
possibility of explicating the difficult notion of "meaningful
language". But at the beginning I am assuming that, if a
text is meaningful then its parts must cohere together in
some structured way, and that "semantic compatibility" might
express that way. This working hypothesis will also mean
that the word-senses that can participate in such a proper
sequence will be the appropriate ones. By "appropriate
senses" I mean simply the dictionary word-senses that a
translator of the text would wish to distinguish from the
inappropriate ones.
By way of example, I shall consider the semantic
compatibilities of the fragments of a paragraph to be
found in a Times editorial in December 1966. As given
below it has been fragmented by functions whose operations
I shall describe later, but I shall assume that it is
comprehensible as a sequence of twelve items:
 / 1   ((BRITAINS TRANSPORT SYSTEMS ARE CHANGING)
 \ 2    (AND WITH IT THE TRAVELLING PUBLICS HABITS)
   3    (IT IS THE OLD PERMANENT WAY)
 / 4    (WHICH ONCE MORE IS EMERGING)
 \ 5    (AS THE PACEMAKER)
   6    (AIRLINES LATELY HAVE BEEN LOSING TRAFFIC)
 / 7    (TO MODERNIZED RAILWAYS)
 \ 8    (RAILWAYS AT LAST ARE BEGINNING)
 / 9    (TO TAKE SOME [illegible])
 \ 10   (OFF THE CONGESTED SYSTEMS TO [illegible] THE WEIGHT)
 / 11   (IF THE NEW IDEAS ARE FORWARD PRESSED)
 \ 12   ([illegible] THE OF COMMUTER MOVEMENT AND DORMITORY
         AREA CONGESTION FLOW PATTERN COULD BE CHANGED))
Fig. 1. A paragraph in fragment form and its semantic
compatibilities.
Let's now look at possible semantic compatibilities
between fragments of the paragraph (marked with braces in
the left hand margin of the figure above).
Fragments 1 & 2 are semantically compatible (both
essentially assert that a structure is of a certain sort:
(1) that a system is changing, (2) that a structure is the
public's). This requires that one takes "to be of a certain
sort" in its usual wide logical sense to cover such notions
as change and movement.
4 & 5 are semantically compatible (both essentially assert
that something is moving in some way).
7 & 8 are semantically compatible (both essentially assert
that the railways are near to us in time in some way).
9 & 10 are semantically compatible (both essentially assert
that something is taking or removing something).
11 & 12 are semantically compatible (both essentially assert
that some structure is changing or about to change).
Notice that semantic parallelisms of this sort between
fragments are sufficient to resolve at least one ambiguity
in each of the pairs of fragments: for example the correct
sense of "habits" for fragment 2 is "structure of behaviour",
rather than the less-common "articles of dress". Thus
pointing out this parallelism is also selecting the appropriate
sense of "habits".
2. THE TEXTS AND SEMANTIC DICTIONARY
Ten paragraph length texts were chosen for analysis: five 
from randomly chosen Times editorials (data texts); and five
from the works of philosophers, Descartes, Leibniz, Spinoza,
Hume and Wittgenstein. The reason for the choice of this
type of material will emerge in the discussion. Each paragraph 
was stored as a list of sentences on a LISP file, and an 
alphabetical concordance for the texts was obtained with the 
aid of standard routines. From this the semantic dictionary 
was written. 
The information stored for each dictionary entry word is
a list of pairs, each of which consists of a left-hand
member which is a semantic formula such as ((((THIS POINT) TO)
SIGN) THING), and a right-hand member which is a sense
description of the meaning of the corresponding formula, such
as (COMPASS AS INSTRUMENT POINTING NORTH). Each such pair
(called a sense-pair) corresponds to one sense of the dictionary
entry word. The sense description (right-hand member of pair)
serves only to explain to the operator, in ordinary language
print-out, which particular sense of the word is being
operated on at any given stage of the procedure. The sense
descriptions are not used as data for computation, except for
looking at their first item to get the name of the word in
question.
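The shape of a dictionary entry can be sketched as follows. This is only an illustration in Python, not the Q32 LISP 1.5 of the system itself; the names COMPASS_ENTRY, entry_word and describe are hypothetical, and nested tuples are assumed as a stand-in for LISP lists.

```python
# Hypothetical layout for one dictionary entry: a list of sense-pairs.
# Each sense-pair is (semantic formula, sense description).
COMPASS_ENTRY = [
    (
        (((("THIS", "POINT"), "TO"), "SIGN"), "THING"),         # left-hand member
        ("COMPASS", "AS", "INSTRUMENT", "POINTING", "NORTH"),   # right-hand member
    ),
]


def entry_word(sense_pair):
    """Return the entry word, i.e. the first item of the sense description."""
    _formula, description = sense_pair
    return description[0]


def describe(sense_pair):
    """Render the sense description as the operator would see it printed."""
    _formula, description = sense_pair
    return " ".join(description)
```

Here the description is consulted only for its first item, which mirrors the remark that sense descriptions are otherwise not used as data for computation.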
The purpose of the formulae is to encode, and so distinguish,
the different senses of natural language words: one would expect
to assign a different formula to each major sense of a word
that a good dictionary distinguishes. Formulae consist of
left and right parentheses and elements, where an element is
one of the following 53 primitive semantic classifiers or
markers:
BE BEAST CAN CAUSE CHANGE COUNT DO DONE FEEL FOLK
FOR FORCE FROM GRAIN HAVE HOW IN KIND LET LIFE LIKE
LINE MAN MAY MORE MUCH MOST ONE PAIR PART PLANT
PLEASE POINT SAME SELF SENSE SIGN SPREAD STUFF THING
THINK THIS TO TRUE UP USE WANT WHEN WHERE WHOLE
WILL WORLD WRAP.
These elements constitute the major categories of the
classification of word-senses. The whole class of elements
is not chosen at random; though as with any system of
semantic markers it is difficult to justify its membership
in detail on theoretical grounds (though see 4). I shall
assume here only that one has to choose some set of markers
to work with, and anyone's set of markers is always open to
detailed objection. The markers are the basic elements in
terms of which all the others in this system (templates,
formulae etc.) are defined. So they cannot themselves be
further defined, except by means of a table of 'scope notes'
which gives the dictionary maker some indication of the
marker elements. The table contains entries like:
GRAIN: (II, IV, VI) any kind of structure or pattern.
       (III) structural or pattern-like.
The Roman numerals refer to the six types of bracket 
groups used by the dictionary maker in constructing formulae. 
They are, in order, Adverbial Group, Adverbial Clause, 
Adjunctive Group, Nominal Group, Operative Group, Operative 
Clause. The first two, for example, can be illustrated as 
follows: 
I. Adverbial Group. 
((TRUE MUCH) HOW)--equivalent for "enough" used as
an adverb; same function as "rather nicely" in 
English; can end with element HOW. 
II. Adverbial Clause 
(FLA FROM)--same function as "out of sight" in 
English; cannot end with any of the elements of 
D4 below, and hence a II type cannot be a well- 
formed formula (see below) by itself. 
All these six types of sub-parts of formulae can 
themselves be interpreted (as can the formulae) so that each 
left-part is dependent on the corresponding right-part. This 
is a non-intuitive order in LISP but is an aid to reading the 
formulae for English speakers. This is best explained by means 
of an example. Thus, to take a sense-pair at random, say
KIND)(COLOURLESS AS NOT HAVING THE PROPERTY OF COLOUR)))).
An explanation would be: "Colourless" is a sort; a sort
indicating that something does not possess some property;
the property is an abstract sensuous property of a certain
sort; that certain sort has to do with spatial
It is not difficult to see that that is what (in right-
left order) the formula conveys.
Formulae are defined recursively as follows:
D.1. A formula is a binarily bracketed string of formulae
and atoms.
D.2. An atom is an element, or an element immediately
preceded by "NOT".
It follows from this that an element is not a formula.
Not all formulae can be assigned to sense-pairs, but only
well-formed formulae:
D.3. The head of a formula is its last atom (and so is the
opposite of the usual notion of 'head' in LISP 1.5).
D.4. A well-formed formula (wff) is (a) a formula, and (b)
such that its head is one of the following elements:
HOW KIND FOLK GRAIN MAN PART SIGN STUFF THING WHOLE
WORLD BE CAUSE CHANGE DO FEEL HAVE PLEASE PAIR SENSE
WANT USE THIS.
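Definitions D.1 to D.4 can be checked mechanically. The following is a minimal sketch in Python, assuming formulae are held as nested two-member tuples and atoms as strings; the function names are hypothetical, and the set of permitted heads is the list of D.4 as read from the text.

```python
# Heads permitted for a well-formed formula, as listed under D.4.
WFF_HEADS = {
    "HOW", "KIND", "FOLK", "GRAIN", "MAN", "PART", "SIGN", "STUFF", "THING",
    "WHOLE", "WORLD", "BE", "CAUSE", "CHANGE", "DO", "FEEL", "HAVE", "PLEASE",
    "PAIR", "SENSE", "WANT", "USE", "THIS",
}


def is_formula(x):
    """D.1: a binary bracketing whose two members are formulae or atoms."""
    if not isinstance(x, tuple) or len(x) != 2:
        return False  # a bare atom (string) is not a formula
    return all(isinstance(part, str) or is_formula(part) for part in x)


def head(formula):
    """D.3: the head is the last (rightmost) atom of the formula."""
    last = formula[-1]
    return last if isinstance(last, str) else head(last)


def is_wff(x):
    """D.4: a formula whose head is one of the permitted elements."""
    return is_formula(x) and head(x) in WFF_HEADS
```

For instance, ((((THIS POINT) TO) SIGN) THING) has head THING and so counts as well-formed under this sketch, whereas a group ending in FROM does not.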
3. INITIAL FRAGMENTATION OF THE TEXTS.
An initial set of functions breaks each sentence of a
paragraph up into strings of words, and, in certain
circumstances, reforms discontinuous sub-strings into whole strings.
The output from this process is a sentence in the form of a
list of "sentence fragments", each of which (if it is not a
single word) is either an elementary sentence, a complex
noun phrase, or a clause introduced by a marker (such as a
preposition).* So for example, the first paragraph of text
is returned as in Fig. 1 above by a function which applies the
segmentation to each of the sentences of a paragraph in
turn, and returns the paragraph as a single list of such
sub-strings, thus obliterating the original sentence boundaries.
* These markers are largely derived from Earl (3).
It can be seen from the example paragraph above that the
functions described do not simply segment sentences in a
linear manner. They also 'take out' certain kinds of clause
from within a sentence and append them as separate sub-strings.
An example of this 'taking out' and reforming can be seen in
the example paragraph reproduced above. The first two fragments
read ((BRITAINS TRANSPORT SYSTEMS ARE CHANGING)(AND WITH IT THE
TRAVELLING PUBLICS HABITS)).
These are produced from a sentence that originally read "Britains
transport system and with it the travelling publics habits are
changing". This sort of break-up leads to an apparent
grammatical 'howler', namely a singular subject for a plural verb.
But for the purposes of semantic analysis by the present system
that is not a disadvantage: it is more than outweighed by
having the text cut into semantically acceptable units (see
Halliday (4)) for the attachment of templates to them.
The fragmented paragraphs are not passed directly to the
template-matching procedure, but are first processed by a set
of re-ordering functions. These inspect the fragmented output
for a paragraph and seek qualifying phrases beginning
with marker words like 'of' and 'for'. These are delimited
at their other end by the character 'fo', and are placed as
a whole before the word they qualify, as are adjectives before the
preceding noun and so on. Only after this rearrangement are
the fragments passed on to the matching functions. The reason
for the re-ordering is that when a template has been matched
with a fragment, the subsequent routines seek for the qualifiers
of a noun or verb only to the left of it. Thus a phrase "a
book of rules" goes to the matching routines as "a of rules
fo book".
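The re-ordering of the "book of rules" example can be sketched in a few lines. This is an illustrative Python sketch only: the function name reorder is hypothetical, and it assumes (the text does not say) that the qualifying phrase runs to the end of the fragment and qualifies the single word immediately before the marker.

```python
MARKERS = {"of", "for"}  # the two marker words named in the text


def reorder(words):
    """Move the first marker phrase in front of the word it qualifies.

    Assumption: the phrase extends from the marker to the end of the
    fragment and qualifies the word immediately preceding the marker.
    """
    for i, word in enumerate(words):
        if word in MARKERS and i > 0:
            qualified = words[i - 1]
            phrase = words[i:]
            # The delimiter 'fo' closes the phrase, then the qualified word follows.
            return words[:i - 1] + phrase + ["fo", qualified]
    return list(words)
```

With this arrangement a later routine that looks only leftwards from "book" finds its qualifier "of rules" bracketed between the marker and 'fo'.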
The purpose of the fragment unit is to define a unit of
context between the word and the sentence, as usually
understood. I shall call "internal" those semantic routines which
operate wholly within fragments, and "external" those which
scan text outside a particular fragment in order to resolve its
word-senses.
4. THE SYSTEM OF SEMANTIC ANALYSIS.
Production of single bare templates 
The present system replaces each fragment of text by a 
number of strings of formulae (frames) constructed from the
formulae for the words of the fragment. It then searches each
frame and replaces it by a number of matching templates, or
meaning structures. One can display these procedures schemat- 
ically as follows: 
TEXT --> Fragments (of text) --> Frames (of formulae)
     --> Templates (structured selections from the form-strings)
Fig. 2. Attachment of text to templates.
In the course of these procedures, therefore, each fragment of
text is tagged to a number of templates, and so each such
template is tagged to some particular selection of the
word-senses for the words of a fragment. The purpose of the
subsequent procedures is to reduce this "fragment ambiguity"
by specifying a set of strings of templates, one template
corresponding to each text fragment, and so specifying a
particular set of word-senses for the words of the whole text.
The intuitive goal is that there should be just one string 
of templates in the set, and hence a unique ambiguity res- 
olution of the text. However, the possibility of a number 
of independent resolutions cannot be excluded a priori. 
Thus the outcome of applying these procedures to a 
text is either nothing, or a string of sense-explanations 
for the words of the text. In the case where the outcome 
is nothing, further procedures are defined whereby the system 
returns, as it were, to the beginning, adjusts one or more 
dictionary entries in a determinate way and then tries again 
to resolve the text. Thus the positive outcome described 
may be achieved after any one of a finite number of tries. 
As will be seen, there is a limit to the number of possible 
tries; and after it has been exhausted, the system has to 
conclude that the text cannot be resolved by this particular 
method. 
The procedures of resolution can be put in the form of 
a set of phrase-structure rules which produce a nesting of 
frames of formulae from an initial paragraph symbol P. The 
rules are given in their generative rather than their analytic 
form, but I give the "lowest-level" rules first, because they 
are the ones applied at the first stage of analysis. The
presentation will thus end up, rather than start, with highest
level rules P -> ..., where P is a "paragraph symbol" analogous
to the sentence marker, S, in conventional grammar.
Following what has been said above: 
D.5. A frame for a fragment is a string of formulae such that 
each word of the fragment that has a (non-null) dictionary 
entry is represented by one and only one formula, and that
formula has the same linear order in the frame as the
corresponding word in the fragment. Thus the set of all
frames consistent with this definition (and with the dictionary 
entries for the words of some fragment) constitutes an initial 
representation of a fragment in the system. 
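Under D.5 the set of frames for a fragment is the Cartesian product of the sense choices for its words. The sketch below illustrates this in Python; the toy dictionary, its placeholder formulae and descriptions, and the function names are all assumptions made for the example, chosen only so that the heads agree with those discussed in the text for "the old transport system".

```python
from itertools import product

# Toy dictionary: word -> list of (formula, description) sense-pairs.
# The formulae are placeholders; only their heads matter here.
TOY_DICTIONARY = {
    "THE": [],  # null entry: contributes nothing to a frame
    "OLD": [(("THIS", "KIND"), ("OLD", "AS", "AGED")),
            (("THIS", "FOLK"), ("OLD", "AS", "OLD", "PEOPLE"))],
    "TRANSPORT": [(("THIS", "KIND"), ("TRANSPORT", "AS", "OF", "CONVEYANCE")),
                  (("THIS", "DO"), ("TRANSPORT", "AS", "TO", "CONVEY"))],
    "SYSTEM": [(("THIS", "GRAIN"), ("SYSTEM", "AS", "STRUCTURE"))],
}


def frames(fragment, dictionary):
    """Return every frame: one formula per word with a non-null entry, in text order."""
    choices = [
        [formula for formula, _description in dictionary[word]]
        for word in fragment
        if dictionary.get(word)  # skip words whose entry is missing or empty
    ]
    return [tuple(frame) for frame in product(*choices)]


def heads(frame):
    """Heads of the formulae in a frame (valid for these flat toy formulae)."""
    return tuple(formula[-1] for formula in frame)
```

Because product varies the last word fastest, the first frame returned is the one built from the first-listed sense of every word.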
We can now define the fundamental notion of template. 
D.6. A bare template is any concatenated triple of elements 
that can be produced by Rules 1-6 below. (The rules R6 are
only a sample.)
R1. T -> N1 + V + N2
R2. V -> BE
R3. N2 -> KIND, THIS, GRAIN, THING, SIGN.
R4. N1 -> GRAIN, THIS, THING, PART, SIGN, MAN, FOLK, STUFF,
    WHOLE, WORLD.
R5 i.    (N1 -> THIS)  +...+ N2 -> PART, MAN, FOLK, STUFF, WHOLE, WORLD
   ii.   (N1 -> THING) +...+ N2 -> PART, STUFF, WHOLE, WORLD
   iii.  (N1 -> PART)  +...+ N2 -> PART, STUFF, WHOLE, WORLD
   iv.   (N1 -> SIGN)  +...+ N2 -> PART, STUFF
   v.    (N1 -> MAN)   +...+ N2 -> PART, FOLK, STUFF, MAN
   vi.   (N1 -> FOLK)  +...+ N2 -> PART, MAN, FOLK, STUFF
   vii.  (N1 -> STUFF) +...+ N2 -> PART, STUFF, WHOLE, WORLD
   viii. (N1 -> WHOLE) +...+ N2 -> PART, STUFF, WHOLE, WORLD
   ix.   (N1 -> WORLD) +...+ N2 -> PART, STUFF, WHOLE, WORLD
   x.    (N1 -> GRAIN) +...+ N2 -> PART
R6 i.    (N1 -> GRAIN) +...+ V -> PAIR, DO, CAUSE, CHANGE, HAVE
   ii.   (N1 -> THIS)  +...+ V -> PAIR, DO, CAUSE, CHANGE, HAVE
The form of rules 5 and 6 is simply a convenient abbreviation
of a more conventional form. For example,
R5 iv. (N1 -> SIGN) +...+ N2 -> PART, STUFF
is simply an abbreviated expression of the two
context-dependent phrase-structure rules:
SIGN +...+ N2 -> SIGN +...+ PART, and
SIGN +...+ N2 -> SIGN +...+ STUFF.
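The expansion of the abbreviated rules can be illustrated as follows. This Python sketch is not part of the system; the table R5, the list R3_N2 and the two function names are hypothetical, and the rule contents are those of R3 and R5 as read from the text.

```python
# Context-free choices for N2 (R3) and context-dependent ones (R5 i-x).
R3_N2 = ["KIND", "THIS", "GRAIN", "THING", "SIGN"]
R5 = {
    "THIS":  ["PART", "MAN", "FOLK", "STUFF", "WHOLE", "WORLD"],
    "THING": ["PART", "STUFF", "WHOLE", "WORLD"],
    "PART":  ["PART", "STUFF", "WHOLE", "WORLD"],
    "SIGN":  ["PART", "STUFF"],
    "MAN":   ["PART", "FOLK", "STUFF", "MAN"],
    "FOLK":  ["PART", "MAN", "FOLK", "STUFF"],
    "STUFF": ["PART", "STUFF", "WHOLE", "WORLD"],
    "WHOLE": ["PART", "STUFF", "WHOLE", "WORLD"],
    "WORLD": ["PART", "STUFF", "WHOLE", "WORLD"],
    "GRAIN": ["PART"],
}


def expand_r5(n1):
    """Spell out one abbreviated R5 rule as explicit context-dependent rules."""
    return [f"{n1} +...+ N2 -> {n1} +...+ {n2}" for n2 in R5[n1]]


def n2_allowed(n1, n2):
    """True if N2 may be rewritten as n2 when N1 has been rewritten as n1."""
    return n2 in R3_N2 or n2 in R5.get(n1, [])
```

On this reading WORLD is not an available N2 after MAN, while PART is, which is the kind of restriction the abbreviated rules are meant to express.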
These rules produce bare templates in the form: 
Substantive (or noun) type element + 
Active (or verb) type element + 
Substantive (or noun) type element. 
Thus MAN+HAVE+PART can be produced in this way, but
MAN+BE+WORLD cannot. This order we call the standard order,
and templates are always considered and compared in this
order even if located in fragments in other (nonstandard)
orders, or in "debilitated forms."
D.7 & 8. If N1+V+N2 represents the standard order, then
V+N1+N2 and N1+N2+V are nonstandard orders, and
N1+N2, N1+V, N1 and V are debilitated forms.
D.9. A fragment matches with templates if a frame for it
contains concatenations of heads (in left-right order)
corresponding to any template produced by Rules 1-11,
where (* indicates a blank item):
R7.  THIS -> *
R8.  B2 -> *
R9.  KIND -> *
R10. V -> *
R11 i.   N1 +...+ (KIND -> *) -> KIND +...+ N1
    ii.  (V -> *) +...+ KIND -> KIND +...+ V
    iii. N1 +...+ (V -> *) -> V +...+ N1
    iv.  (V -> *) +...+ N2 -> N2 +...+ V
Rules 1-6 produce standard forms of bare template, and Rules
7-11 produce (by means of deletions and reordering) the
permitted debilitated and nonstandard forms. The latter
rules produce actual text-items, in the sense of heads (of
formulae) to be located in the frames that represent fragments
of text directly.
In order to produce templates that can plausibly be 
interpreted as meaning structures for fragments - in that 
they correspond to the heads and frames for the correct
word-senses of the fragments - it is necessary that classes of
templates be produced in a given order. There are four such 
ranks of classes, as shown by the following table: 
RANK          TEXT-ITEM     STANDARD FORM
[I to IV;     N1+V          N1+V+THIS
boundaries    V+N1          THIS+V+N1
illegible]    N1+V+N2       N1+V+N2
              V+N1+N2       N1+V+N2
              N1+N2+V       N1+V+N2
              KIND+N1       N1+BE+KIND
              N1+V+KIND     N1+V+KIND
              N1+KIND+V     N1+V+KIND
              V+N1+KIND     N1+V+KIND
              N1+KIND       N1+BE+KIND
              N1+N2         N1+BE+KIND
              V+KIND        THIS+V+KIND
              V             THIS+V+THIS
              N1            THIS+BE+N1
              KIND          THIS+V+KIND
Fig. 3. Preference table for bare templates.
Since Rules 1-11 are nonrecursive, there is no problem
about ordering the productions in this way. Apart from the
forms given in the table, there are only vacuous cases such
as *+*+*.
The above table is intended to make clear the relation 
between the various standard forms (in the rightmost column) 
and the corresponding "items in frames" produced or recognized 
(middle column). Thus in the generative mode, text items are 
produced from the standard forms by transposition and deletion. 
In the analytic mode the text-items are recognized in the rank
order shown, and then transposed and augmented with dummy BE
and THIS elements so as to be in standard form for further
computation.
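The analytic step of transposing a recognized text-item and padding it with dummy BE and THIS elements amounts to a table look-up. The following Python sketch shows the idea for a few rows of the preference table only; the names STANDARD_FORM and standardise are hypothetical, and the rank ordering of the search is deliberately left out.

```python
# A sample of rows from the preference table: text-item pattern -> standard form.
STANDARD_FORM = {
    ("N1", "V"):        ("N1", "V", "THIS"),
    ("V", "N1"):        ("THIS", "V", "N1"),
    ("N1", "V", "N2"):  ("N1", "V", "N2"),
    ("V", "N1", "N2"):  ("N1", "V", "N2"),
    ("N1", "N2", "V"):  ("N1", "V", "N2"),
    ("KIND", "N1"):     ("N1", "BE", "KIND"),
    ("N1", "KIND"):     ("N1", "BE", "KIND"),
    ("V",):             ("THIS", "V", "THIS"),
    ("N1",):            ("THIS", "BE", "N1"),
}


def standardise(pattern, found_heads):
    """Rewrite the heads found in a frame into standard N1+V+N2 order.

    pattern names the slots in text order; found_heads gives the actual
    heads in the same order. Slots absent from the text are filled by the
    dummy elements BE and THIS written in the table.
    """
    binding = dict(zip(pattern, found_heads))
    return tuple(binding.get(slot, slot) for slot in STANDARD_FORM[pattern])
```

A noun phrase whose heads are KIND followed by MAN is thus turned into MAN+BE+KIND, and a lone verb head DO into THIS+DO+THIS.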
The actual function of the rank choice is best explained 
by example, particularly as regards the composition of Rank I, 
since the ranks lower than I clearly consist of "debilitated 
forms" and it is intuitively plausible to produce fuller forms 
first. This ordering is one example of the general rule which 
enables template matching to do (at least) the work of a 
conventional grammar; namely, pack the frame as 
tightly as possible, or, in other words, produce the 
fullest possible template. 
The presence in Rank I of the debilitated form KIND+NI 
can be understood by considering, for example, the fragment: 
(THE OLD TRANSPORT SYSTEM).
To simplify matters I shall consider only (i) the frame
consisting of representations of the appropriate senses of the
words in that fragment, and (ii) the frame identical with the
first except that it contains representations of OLD as
substantive (noun = "the old people") and the active (verb)
form of TRANSPORT. Thus, by the semantic coding system
described above, those two frames will contain the
following heads, and in the order shown:
i.  ...KIND) ...KIND) ...GRAIN), and
ii. ...FOLK) ...DO) ...GRAIN).
Now the above rules generate both
(FOLK+DO+GRAIN) and (KIND+GRAIN)
as strings of text-items; the latter by deletion from
(N1+BE+KIND) and (KIND+N1). It is clear that if the form
KIND+N1 were not in Rank I with forms like (N1+V+N2) which
yield (FOLK+DO+GRAIN), then a substantive phrase like this
one would never receive a proper interpretation, since Rank I
(without the form (KIND+N1)) would always look for an active
(verb) sense for "transport" and having found one, would be
satisfied.
As I have described the process so far, both bare template
forms (FOLK+DO+GRAIN) and (GRAIN+BE+KIND) would be produced.
I shall show in the next section the additional procedures
which produce the second of these in preference to the first.
Production of single full templates. 
Further production rules limit the templates actually 
produced, and these require the notion of full template, 
defined as follows: 
D.10. A full template is two triples of formulae such that
the heads of the first triple constitute a bare template, and
the second triple can be produced from the first by the rules
12-16.
D.11. The six formulae constituting a full template are
called text-values.
The six formulae so defined give content to the
corresponding bare template (expressed by the heads of
three of the formulae). The rules 12-16 specify the other
three formulae in such a way that each of them can be the
qualifier of one of the formulae with a head defining part
of the bare template. The rules 12-16 (not given here for
reasons of space) are, in effect, rules producing an ordered
pair of formulae such that the first is an appropriate
qualifier for the second. Thus rule 13i produces an adjective
type of formula (one ending in KIND) before a noun-type of
formula, and so on.
The full templates are the items with which the system 
really operates. They can be illustrated by contrast with 
bare templates by considering fragment 3 of the paragraph 
examined earlier. That fragment was "It is the old permanent 
way". Among the bare templates produced for it by the 
system are the following two: 
((IT IS THE OLD PERMANENT WAY)
 ((THING BE SIGN)
  (((THIS THING) (IT AS INANIMATE PRONOUN))
   ((BE BE) (IS AS HAS THE PROPERTY))
   ((((WHERE FOR) ((WHERE POINT) FROM)) (LINE SIGN))
    (WAY AS PATH OR ROUTE))))
 ((THING BE SIGN)
  (((THIS THING) (IT AS INANIMATE PRONOUN))
   ((BE BE) (IS AS HAS THE PROPERTY))
   ((((THIS THING) (TRUE USE)) SIGN) (WAY AS MEANS)))))
The fragment here is tied to two items, each of which
is a bare template triple followed by the three formulae in
the sense frame which locate it (their last elements are the
same as those of the template triple). A point of
interpretation should be added here for speakers of American
English: all speakers of British English interpret "way" in
this fragment as having its "path or route" sense in this
context.
The two bare templates are now expanded to full templates 
as follows: 
((IT IS THE OLD PERMANENT WAY)
 ((THING BE SIGN)
  (((ONE THING) (IT AS INANIMATE PRONOUN))
   ((BE BE) (IS AS HAS THE PROPERTY))
   ((((THIS THING) (TRUE USE)) SIGN) (WAY AS MEANS))
   NIL NIL ((NOTCHANGE KIND) (PERMANENT AS UNCHANGING))))
 ((THING BE SIGN)
  (((ONE THING) (IT AS INANIMATE PRONOUN))
   ((BE BE) (IS AS HAS THE PROPERTY))
   ((((WHERE IN) ((WHERE POINT) FROM)) (LINE SIGN))
    (WAY AS PATH OR ROUTE))
   NIL NIL ((NOTCHANGE KIND) (PERMANENT AS UNCHANGING)))))
These two items are the expansions (in frames of sense
pairs) of the two bare templates. They consist of the same
items as the bare template plus three formulae which are the
qualifiers of the first three (the fourth of the six is the
qualifier of the first of the six and so on). In this the
'it' and 'is' have no qualifiers, hence the LISP 'NIL's in
those positions. Bare templates other than these two were
matched onto the fragment, but only these two could be
expanded in this way. Hence these two were the 'survivors'
and the others were rejected from further consideration.
When expanding in this way to produce full templates
from bare ones the following meta-rule (R15) is applied:
"Produce preferentially those full templates in which as
many elements as possible are developed by the rules
R12-R14." This means producing if possible those full templates
in which each element of the bare template has a formally
appropriate predecessor. By means of a further rule (R16)
an attempt is made to produce not only full templates with
formally appropriate internal relations, but also ones with
semantically close internal relations as well. That is to
say, full templates such that the triple of qualifying
formulae are semantically close to the formulae they
respectively precede. Where,
D.12. Two formulae are said to be semantically close if:
i)   they share a common pair of elements; or
ii)  they have one or more of the following elements in
     common: ONE, COUNT, WORLD, WHOLE, LIFE, LINE, MUST,
     SELF, SPREAD, TRUE, WRAP, WHEN, WHERE, THINK; or
iii) their cores are such that they are identical, or
     either is a member of the other in the sense of a
     list-member, or the left or right hand member of
     either core is a member of the other.
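Clauses (i) and (ii) of D.12 can be sketched as a predicate over nested formulae. The Python below is illustrative only and rests on two assumptions that the text does not settle: "a common pair of elements" is read as a shared bracketed pair of two atoms, and clause (iii) is omitted because the notion of a core is not defined in this section. The function names are hypothetical, and the element list is that of clause (ii) as read from the text.

```python
# Elements whose shared presence counts as closeness under clause (ii).
CLOSE_ELEMENTS = {
    "ONE", "COUNT", "WORLD", "WHOLE", "LIFE", "LINE", "MUST",
    "SELF", "SPREAD", "TRUE", "WRAP", "WHEN", "WHERE", "THINK",
}


def elements(formula):
    """Set of all atoms occurring anywhere in a nested-tuple formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(elements(part) for part in formula))


def atom_pairs(formula):
    """Set of all bracketed pairs whose two members are both atoms."""
    if isinstance(formula, str):
        return set()
    found = set()
    if all(isinstance(part, str) for part in formula):
        found.add(formula)
    for part in formula:
        found |= atom_pairs(part)
    return found


def semantically_close(f, g):
    """Clauses (i) and (ii) of D.12 only; clause (iii) is not modelled."""
    if atom_pairs(f) & atom_pairs(g):  # clause (i), under the stated reading
        return True
    return bool(elements(f) & elements(g) & CLOSE_ELEMENTS)  # clause (ii)
```

Sets make both tests a matter of intersection, so the predicate is symmetric in its two arguments.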
Rules producing more than one template 
I can now consider the production of concatenations of 
the full templates described so far. 
D.13. A paragraph string is any string of templates produced
by the rules 17 & 18 from the paragraph symbol P.
R17. P -> T_r + T_s
R18. If T_r is a full template written as a string of
six formulae thus,
    (F^1_r1 + F_r1 + F^1_r2 + F_r2 + F^1_r3 + F_r3)
where F_r1 is a noun type, F^1_r1 its qualifier (adjective type);
F_r2 is a verb type, F^1_r2 its qualifier (adverb type) and so
on, then
    T_s -> (F^1_s1 + F_s1 + F^1_s2 + F_s2 + F^1_s3 + F_s3)
        -> (F^1_t1 + F_t1 + F^1_t2 + F_t2 + F^1_t3 + F_t3) +...+
           (F^1_u1 + F_u1 + F^1_u2 + F_u2 + F^1_u3 + F_u3),
where the values of the two template forms produced are
semantically close.
D.14. Two full templates T_r, T_s are semantically close if
(with the above notation for full templates) at least two of
the following pairs of formulae are (i) such that the head
of the second is identical with, or in the negation class
of, the first:
(F_r1 F_s1), (F_r1 F_s3), (F_r2 F_s2), (F_r3 F_s1), (F_r3 F_s3); and
(ii) either they, or their qualifier formulae, are semantically
close. These ten possible directions of connection between
two full templates can be shown schematically as follows:
 F^1_r1  +  F_r1   +  F^1_r2  +  F_r2   +  F^1_r3  +  F_r3
qualifier   N TYPE   qualifier   V TYPE   qualifier   N TYPE

 F^1_s1  +  F_s1   +  F^1_s2  +  F_s2   +  F^1_s3  +  F_s3

Fig. 4. Connecting pattern between full templates.
See note on page 31.
Rule 18 does not, as might appear at first sight, involve
self-contradiction. The shorthand form of rule writing is
now being extended to mean that when T_s has been rewritten
as F^1_s1 + ... + F_s3, then the latter may be rewritten as the
right-hand side of the second arrow.
This "expansion-concatenation" rule can be recursively
applied to the initial productions from P. Thus at any stage
in the process a paragraph string of full templates is
produced. At any point the string can be considered terminal
and with the aid of the dictionary of words and sense-pairs,
the paragraph string of templates can be converted to a string
of frames and so to a text of words. This is analogous to
the introduction of the lexicon in any standard
phrase-structure grammar. The dictionary entries themselves can
be put in phrase structure form. For example, if a word W_n
has two sense pairs S1 and S2 in its dictionary entry, then
the sense-pairs themselves can be put in the form S1 -> W_n
and S2 -> W_n respectively. This form of the dictionary entries
is useful in representing the self-modification of the system
described below.
5. APPLICATION OF THE SYSTEM TO TEXTS.
Matching bare templates onto fragments.
Rules 1-6 above define the matching of bare templates
onto a fragmented text, one bare template onto each text
fragment. TEMPO is the main (top level) function that does
this: it examines in turn all the frames of sense pairs for
a fragment, and so on for all the fragments of a paragraph.
It takes as its argument a frame of sense-pairs, one for
each word of a given fragment. TEMPO scans each such
combination in turn, starting with the frame containing all
the main senses of the words in the fragment (the first ones
in the dictionary entry for each word). TEMPO searches for
triplets of heads in the order of preference given in Fig. 3
above. For example, if it finds type I templates it doesn't
look for any of types II-IV and so on. Each type of template
is collected on a list which is the value of a different
free LISP variable. If TEMPO finds nothing till it reaches
the debilitated N+N form, it replaces the N+N by N+BE+N (BE
being the "dummy verb"). Similarly V+N and N+V are replaced
by THIS+V+N and N+V+THIS respectively (THIS being the "dummy
substantive"). The function of these dummy features is to
supply a general form of template for subsequent processing,
even when it is not wholly present in the text. Suppose, for
example, a fragment consisted not of an assertion form, but
of a noun phrase like "the black wizard", where the heads
of the appropriate codings for "black" and "wizard" would be
KIND and MAN respectively. As there is no verb, a debilitated
template of the N+N form would match onto these two heads,
and that would then be converted into MAN+BE+KIND, which
is the intuitively correct interpretation (WIZARD is BLACK).
The dummy verb is added in the way described; and in cases
like this, where the first head is the predicate KIND, the
order of the two heads is reversed, so as to give the
MAN+BE+KIND form. This transposition is defined by R11i.
The internal rejection functions (matching full templates)
Earlier I distinguished between internal and external
procedures. Internal rejections are those procedures which
cast out matching templates by means of the
expansion from bare to full templates. The main function
which does this is PICKUP. It takes a fragment name as
argument and constructs the TEMPO value for it. PICKUP
makes a decision in the case of each template whether or not
to reject it from further consideration. Those that survive
are then considered further by the external rejection
procedures. The survivors from PICKUP represent a stage of
ambiguity resolution beyond that given by TEMPO. If, for
example, PICKUP examines a template that has been matched
onto a fragment containing the words "round box", where a
template head had been attached to a formula for "box", then,
hopefully, PICKUP keeps at least the template in which "round"
is coded by its "spatial property" sense and "box" is coded by
its "container" sense.
Inside PICKUP the function REFINE returns as its value
a list of five sub-lists of full templates: its first sub-list
contains those that are form-close internally in four ways (as
defined by rules 12-15), down to the last sub-list containing
those with no such closeness. PICKUP takes the first non-empty
sub-list of REFINE and, of that, returns as its value the full
templates that are semantically close as well (if any).
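The selection made by PICKUP over the REFINE value can be sketched as follows. This is a Python illustration under stated assumptions, not the original LISP: the templates are stand-in strings, the predicate semantically_close is supplied by the caller, and the sense labels in the sample data are hypothetical.

```python
def pickup(refine_sublists, semantically_close):
    """Select survivors from the value of REFINE.

    refine_sublists    -- sub-lists of full templates, ordered from the
                          most form-close down to those with no closeness
    semantically_close -- predicate on a full template (assumed given)
    """
    for sublist in refine_sublists:
        if sublist:
            # Only the first non-empty sub-list is consulted; of it, the
            # semantically close templates (if any) are returned.
            return [t for t in sublist if semantically_close(t)]
    return []

# Hypothetical candidates for a fragment containing "round box".
sublists = [
    [],
    ["round=spatial-property / box=container",
     "round=spatial-property / box=blow"],
    [],
    [],
    ["round=bout / box=blow"],
]

def close(template):
    # Stand-in for the semantic-closeness test.
    return template.endswith("box=container")

print(pickup(sublists, close))  # ['round=spatial-property / box=container']
```

Note that, on this reading of the text, later sub-lists are never consulted even when the first non-empty one yields no semantically close template.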
The 'semantic parser': resolving a paragraph.
The top-level function PARSPARA takes as its argument
a list of fragments, produces the PICKUP value for each (in
the full template form given on p.14) and then parses these
full templates using rules 17 & 18. A nesting of templates
that satisfies these rules is an interpretation for the
paragraph, and its word-sense content is read off and printed
out (since a nesting of full templates is simply a selection
of the possible word-sense assignments for a text). Full
templates which cannot be parsed with those for other
fragments are simply rejected. This is the external rejection
procedure referred to earlier.
Functions called FIT and JAM express rule 18: they
test for semantic closeness between two full templates and,
if such closeness is found, the two full templates are
replaced by a single item with the form of a full template.
Or - to put it in terms of the two function names - if the
full templates FIT, they are then JAMmed. If the three
main formulae in a full template are related to the three
main formulae of another template by any three of the
connectivities expressed in fig.4, above, then the two
templates FIT (are semantically close). The function JAM
builds up a representation of the two templates based on
their connectivities. FIT and JAM work with message-pairs,
which are to a fragment what a sense pair is to a word.
D.15. A message-pair is a two-item list: one item is a
list of the first three sense-pairs of some full template,
the other item is a list containing the name of some fragment
with which the full template matches.
PARSPARA constructs the PICKUP value (full templates)
for its list of fragments, and then builds up all possible
frames of message-pairs for the paragraph. Each frame of
message-pairs is now a possible meaning representation for
the whole paragraph. PARSPARA then scans each frame in
turn to see if it can find a right-left contiguous pair of
message-pairs satisfying FIT. If it can, it deletes the
first message-pair and replaces the second by a message-pair
consisting of (1) the JAM value of the two 'parsed' full
templates, and (2) a list of the names of the fitting fragments.
So if we have a paragraph frame containing the two message-pairs
(((WHOLE GRAIN) (SYSTEM AS AN ORGANIZATION))
((CHANGE KIND) (CHANGING AS ALTERING))
((([formula illegible]))
(TRANSPORT AS PERTAINING TO MOVING THINGS ABOUT)) NIL
[illegible]))
and
((WITH IT THE TRAVELLING PUBLICS HABITS)
(([formula illegible])
(HABITS AS REPEATED ACTIVITIES)
((BE BE) (DUMMY))
(((WHOLE FOLK) KIND)
(PUBLICS AS CONNECTED WITH THE WHOLE PEOPLE))
NIL (((WHERE CHANGE) HOW)
(TRAVELLING AS MOVING FROM PLACE TO PLACE))))),
then the two full templates in those message-pairs are a
fitting pair, and we shall expect them to be replaced in the
string by the form:
(WITH IT THE TRAVELLING PUBLICS HABITS))
((((WHOLE GRAIN) (SYSTEM AS AN ORGANIZATION))
((BE BE) (ARE AS HAVE THE PROPERTY))
((CHANGE KIND) (CHANGING AS ALTERING)))
(TRANSPORT AS PERTAINING TO MOVING THINGS ABOUT))
NIL (((WHERE CHANGE) HOW)
TRAVELLING AS MOVING FROM PLACE TO PLACE)))))
This fitting together, or parsing, of message-pairs
expresses the semantic compatibility between the corresponding
fragments discussed earlier. PARSPARA rewrites such strings
of message-pairs recursively, trying to reach a two item
list which (by rule 17) is P, the paragraph symbol. If this
point is reached the corresponding sense-resolution is 
read off and printed out for the paragraph in the following 
form: each fragment is given with the list of sense 
expressions for all the words in it which are resolved (or 
which had only a single sense entry initially, and so are 
trivially resolved); a list is also given of words not
resolved (if any). 
(((BRITAINS TRANSPORT SYSTEM ARE CHANGING)
((WORDS RESOLVED IN FRAGMENT)
((TRANSPORT AS PERTAINING TO MOVING THINGS ABOUT)
(BRITAINS AS HAVING THE CHARACTERISTIC OF A
PARTICULAR PART OF THE WORLD)
(SYSTEM AS AN ORGANIZATION)
(ARE AS HAVE THE PROPERTY) (CHANGING AS ALTERING)))
((WORDS NOT RESOLVED IN FRAGMENT) NIL))
((WITH IT THE TRAVELLING PUBLICS HABITS)
((WORDS RESOLVED IN FRAGMENT)
((TRAVELLING AS MOVING FROM PLACE TO PLACE)
(IT AS INANIMATE PRONOUN)
(HABITS AS REPEATED ACTIVITIES)))
((WORDS NOT RESOLVED IN FRAGMENT) NIL)
fig. 5. First two fragments of the resolved output for a
text paragraph.
The original English for the first two fragments of that
paragraph was "Britain's transport system and with it the
travelling public's habits are changing".
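The frame-building and rewriting procedure described in this section can be sketched as below. This is a Python illustration, not the original LISP. The names all_frames, reduce_frame and resolve are assumptions, fit and jam are passed in as parameters because fig. 4 and rule 18 are not reproduced here, and the toy versions of them at the end are stand-ins only. The sketch also assumes that success means a frame reduced to a single message-pair.

```python
from itertools import product

def all_frames(candidates_per_fragment):
    """Every way of choosing one message-pair per fragment, in text order."""
    return [list(choice) for choice in product(*candidates_per_fragment)]

def reduce_frame(frame, fit, jam):
    """Repeatedly replace a contiguous fitting pair by its JAM value.

    A message-pair is modelled as (template, list_of_fragment_names).
    """
    frame = list(frame)
    i = 0
    while i < len(frame) - 1:
        (left, left_names), (right, right_names) = frame[i], frame[i + 1]
        if fit(left, right):
            # Delete the first message-pair and replace the second by
            # (JAM value, names of the fitting fragments).
            frame[i:i + 2] = [(jam(left, right), left_names + right_names)]
            i = 0  # rescan from the start after every rewrite
        else:
            i += 1
    return frame

def resolve(candidates_per_fragment, fit, jam):
    """Return the first frame that reduces to one message-pair, else None."""
    for frame in all_frames(candidates_per_fragment):
        reduced = reduce_frame(frame, fit, jam)
        if len(reduced) == 1:
            return reduced[0]
    return None

# Toy stand-ins: a "template" is a set of element names.
def toy_fit(a, b):
    return bool(a & b)   # the templates share an element

def toy_jam(a, b):
    return a | b         # pool the elements of both templates

candidates = [
    [(frozenset({"GRAIN", "CHANGE"}), ["F1"])],
    [(frozenset({"FOLK"}), ["F2"]), (frozenset({"FOLK", "CHANGE"}), ["F2"])],
]
result = resolve(candidates, toy_fit, toy_jam)
print(result[1])  # ['F1', 'F2']
```

In the toy run the first frame is blocked, because its two templates share nothing, and the second frame parses. A template that cannot be parsed with its neighbours is thereby rejected, which is the external rejection procedure in miniature.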
The "sense constructor" procedure.
A procedure was built in to the system to deal with the
cases where the system returned (NO RESOLUTION ALL PATHS
BLOCKED) at the teletype. This situation could arise for a
number of reasons: the text fragments did not cohere together
sufficiently; a vital word sense had been left out of the 
dictionary; or a word in the text was being used in a new 
and original sense. An obvious suggestion for tackling 
this is to allow the word dictionary to enlarge itself: 
to supply an additional sense entry for the word that is 
holding the procedure up, if it can be found. Such a construction
could be thought of as adding a new rule F -> a, where
F is a formula and a is a word name, and so expanding to a new
rule system as the system adjusts to the particular text.
In practice PARSPARA examined the value of a free 
variable BESTPARS each time it failed to parse a frame 
completely. It stored as the value of BESTPARS the parsing
tree containing the template that had been rewritten least.
It seemed a good first guess at the recalcitrant word that
it was in the template that 'cohered' least with its neighbours.
If all the frames blocked, PARSPARA would print (CONSTRUCTER MODE)
and evaluate a function of no variables called CONSTRUCTER.
This function controls all subsequent operations via the
READ and PRINT functions at the teletype. CONSTRUCTER looks
at the value of the recalcitrant template in BESTPARS and
suggests that a word in the corresponding fragment have its
dictionary of sense pairs enlarged by identifying the recalcitrant
word with the most 'semantically close' word in the
paragraph. If the operator accepts the system's suggestion 
at the teletype, the system is rerun with the enlarged 
dictionary to try and get a resolution. In such a case (or 
if none of the system's suggestions are acceptable to the 
operator) the system returns to the normal operating mode. 
This procedure was not called upon for the newspaper 
paragraphs, but it produced some interesting suggestions in 
the case of two of the philosophical paragraphs. 
In CONSTRUCTER MODE dialogues like the following are possible:
(CONSTRUCTER MODE)
((NO RESOLUTION ALL PATHS BLOCKED)
(BEST PARSING CONTAINS)
(((((KIND SIGN) (ATTRIBUTE AS A PARTICULAR KIND OF
PROPERTY))
((BE BE) (DUMMY))
((SAME KIND) (SAME AS IDENTICAL))
((WHOLE (MUST (KIND SIGN)))
(NATURE AS ESSENCE OR ESSENTIAL PROPERTIES))
NIL NIL)
(RECALCITRANT TEMPLATE IS FOR)
(THE SAME NATURE OR ATTRIBUTE)
(CONTINUE YES OR NO)
YES
(SUGGEST ATTRIBUTE AS NATURE (SHALL I TRY IT YES OR NO))
YES
(((IF THERE WERE TWO OR MORE DISTINCT SUBSTANCES)
((WORDS RESOLVED IN FRAGMENT)
((THERE AS AT A POINT)
(OR AS DISJUNCTION)
(MORE AS IN AN INCREASED MANNER)
(DISTINCT AS DIFFERENT) (SUBSTANCES AS SORTS OF
THING)))
((WORDS NOT RESOLVED IN FRAGMENT)
(TWO
(((COUNT SIGN) (TWO AS A NUMBER))
((COUNT KIND) (TWO AS HAVING THE PROPERTY OF TWOITY)))))
fig. 6. Dialogue in CONSTRUCTER MODE together with first part
of subsequent resolution.
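The control flow of the sense constructor can be sketched as a loop that retries the resolution after each accepted enlargement of the dictionary. This Python sketch is an illustration under assumptions, not the original CONSTRUCTER: the callables resolve, suggest and operator_accepts, the bound max_rounds, and the sense labels in the sample dictionary are all hypothetical. The count of accepted enlargements is returned alongside the resolution.

```python
def resolve_with_constructor(resolve, dictionary, suggest,
                             operator_accepts, max_rounds=5):
    """Return (resolution, number of accepted dictionary enlargements).

    resolve(dictionary)          -> a resolution, or None if all paths block
    suggest(dictionary)          -> (word, donor) or None; proposes giving
                                    `word` the senses of `donor`
    operator_accepts(word, donor) -> True or False (the teletype reply)
    """
    dictionary = {w: list(s) for w, s in dictionary.items()}  # work on a copy
    applications = 0
    while True:
        resolution = resolve(dictionary)
        if resolution is not None or applications >= max_rounds:
            return resolution, applications
        suggestion = suggest(dictionary)
        if suggestion is None:
            return None, applications
        word, donor = suggestion
        if not operator_accepts(word, donor):
            return None, applications
        # Enlarge the entry for `word` with the senses of the closest word.
        for sense in dictionary[donor]:
            if sense not in dictionary[word]:
                dictionary[word].append(sense)
        applications += 1

# Toy run modelled loosely on fig. 6: ATTRIBUTE is identified with NATURE.
senses = {"ATTRIBUTE": ["PROPERTY"], "NATURE": ["ESSENCE"]}

def toy_resolve(d):
    return "RESOLVED" if "ESSENCE" in d["ATTRIBUTE"] else None

def toy_suggest(d):
    return ("ATTRIBUTE", "NATURE")

def always_yes(word, donor):
    return True

print(resolve_with_constructor(toy_resolve, senses, toy_suggest, always_yes))
# ('RESOLVED', 1)
```

The caller's dictionary is left unchanged, so a rejected or unsuccessful run does not alter the normal operating mode.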
6. DISCUSSION.
One of the main difficulties in coding for, and 
evaluating, a system like this one is the necessary vagueness 
of some of the sense-entries (especially evident in words 
like 'it' and 'is'). Nonetheless I claim that the present
system could constitute a tentative criterion for meaningfulness:
a text is meaningful if and only if a system like the
present one can resolve it. It is easy enough to get a
necessary criterion on the ground that one needs to be able
to tell in what senses the words of a text are being used in
order to call it meaningful. I have argued at length elsewhere
that it is possible also to justify the corresponding
sufficient one (8). The establishment of such a criterion
would be of some interest in the cases of the five philosophical
paragraphs, since it was texts like these that Carnap (2) and
the 'Logical Syntax' school generally, said could be shown
to be meaningless on the basis of a system of analytic rules,
though they never in fact constructed such a system. The
criterion suggested here would only be one of degree (in terms 
of the number of applications of the sense-constructer 
procedure a text required for resolution). That is perhaps 
the only acceptable form that a criterion of meaningfulness 
could take, as there seems something absurd about an attempt 
to set an absolute bound to the meaningful. 
Another speculative interest of the present system might 
be its application to the speech patterns of schizophrenics.
Schizophrenic discourse seems (6) to be meaningful within
the boundaries of units of the same order of length as the
clause or phrase. The trouble is that these units don't
seem to fit together in a coherent way in the schizophrenic's
speech pattern; a system of the present sort, which tries
to make such items cohere, might conceivably provide a
measure of "semantic disorder" in such cases.
A number of connexions can be made also between the
semantic structure assigned to a text by the present system
and that assigned by formal logic. These connexions have
been investigated in the cases of the five philosophical
paragraphs, which have a form sufficiently like the one required
by formal logic. These connexions are of some interest in
view of the almost total neglect of the sense-ambiguity of
natural language words by formal logic.
One can, for example, interpret the present system so 
as to create a notion of "valid and useful" argument. It
has long been recognised that an argument can be formally
valid (and even have true premisses) and yet be completely
useless. This is usually due to a genuine ambiguity in the
argument. For example, the following is perfectly valid:
"All kings wear crowns, all crowns are coins, therefore all
kings wear coins". And, within the context of each premiss,
each premiss is true. (In the "numismatic world of discourse",
for example, the second is true).
An argument could be deemed "valid and useful" if it is
formally valid and if the present system assigns to it a
consistent and complete interpretation. I am using the terms
'consistent' and 'complete' in a way similar to Bobrow's (1)
use of them: an interpretation is complete if the system
assigns an interpretation to each key term in the argument,
and 'consistent' if it assigns the same interpretation (word-sense)
to every occurrence of a term. Thus the argument above
would not pass the 'usefulness' criterion, since a proper
ambiguity-resolver would assign different interpretations
to the two occurrences of the key term 'crown'.
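The 'complete' and 'consistent' tests can be made concrete with a small Python sketch. It assumes that a resolution is available as a list of (term, sense) pairs, one per occurrence of a key term, with None marking an unresolved occurrence. The function names and the sense labels are hypothetical and are not output of the system.

```python
def is_complete(occurrences, key_terms):
    """Every key term occurs and every one of its occurrences has a sense."""
    seen = set()
    for term, sense in occurrences:
        if term in key_terms:
            if sense is None:
                return False
            seen.add(term)
    return seen == set(key_terms)

def is_consistent(occurrences):
    """Each term receives the same sense at every resolved occurrence."""
    first_sense = {}
    for term, sense in occurrences:
        if sense is None:
            continue
        if first_sense.setdefault(term, sense) != sense:
            return False
    return True

def is_valid_and_useful(formally_valid, occurrences, key_terms):
    return (formally_valid
            and is_complete(occurrences, key_terms)
            and is_consistent(occurrences))

# "All kings wear crowns, all crowns are coins, therefore all kings wear coins"
kings_argument = [
    ("king", "monarch"), ("crown", "headgear"),   # first premiss
    ("crown", "coin-type"), ("coin", "money"),    # second premiss
    ("king", "monarch"), ("coin", "money"),       # conclusion
]
key_terms = {"king", "crown", "coin"}

print(is_complete(kings_argument, key_terms))                 # True
print(is_consistent(kings_argument))                          # False
print(is_valid_and_useful(True, kings_argument, key_terms))   # False
```

The argument is formally valid and completely interpreted, but it fails the consistency test on 'crown', and so fails the 'usefulness' criterion.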
Note to page 19: The negation class of elements for each element is derived
inductively by a separate procedure. The notion involved is like that
of logical contrary: an element and any member of its negation
class are partly synonymous and partly exclusive. For example,
an entity can be basically a STUFF or basically a THING; it
cannot be both, so each of these elements is in the negation
class of the other.

References

1. Bobrow, D.G.
Natural Language Input for a Computer
Problem-Solving System. Ph.D. Thesis,
M.I.T. (1965)

2. Carnap, R.
The Logical Syntax of Language.
Routledge, London (1937)

3. Earl, L.
An algorithm for Automatic Clause
delimitation in English sentences.
Lockheed Missiles and Space Co., Tech.
Rept. 5.13.64. 5. (March, 1964)

4. Greenberg, J.H.
(ed)
Universals of Language. M.I.T. Press,
Cambridge, Mass. (1963)

5. Halliday, M.
Some aspects of the thematic organization
of the English Clause. RAND Memorandum
5224 (January, 1967)

6. Laing, R.D.
The Divided Self. Tavistock Publications,
London (1960)

7. Katz, J., and
Postal, P.
An integrated theory of Linguistic
Descriptions. M.I.T. Press, Cambridge,
Mass. (1964)

8. Wilks, Y.
Argument and Proof in Metaphysics, from
an Empirical Point of View. Ph.D. Thesis,
Cambridge. (1968)
