1965 International Conference on Computational Linguistics 
A SYSTEM FOR TRANSFORMATIONAL ANALYSIS
Susumu Kuno 
The Computation Laboratory 
Harvard University 
Cambridge, Massachusetts 02138
ABSTRACT 
A system is proposed here for assigning a derived P-marker to a 
given transformed sentence and obtaining the corresponding base P-marker 
at the same time. Rules of analytical phrase-structure grammar for such a 
system have associated with them information pertaining to the
transformational histories of their own derivation. When a phrase-structure
analysis of the sentence is obtained, the set of grammar rules used for 
the analysis contains all the information necessary for the direct mapping 
of the derived P-marker into the corresponding P-marker. The system can 
also be used for decomposing a given complex sentence into "kernel" 
sentences for the purpose of structure matching between a query sentence 
and stored document sentences in information retrieval. An experimental 
program for the proposed system has been written and is currently being tested
with a small sample grammar. A study is underway to determine whether there is any
mechanical procedure for obtaining an analytical phrase-structure grammar
of the proposed type for a given transformational grammar.
1. Introduction
Numerous systems for the automatic recognition procedures of 
context-free languages have been proposed: 1 among them, two systems are 
in operation with comparatively large English grammars. One is 
J. Robinson's English parser 2 based on J. Cocke's algorithm, 3 and the 
other is the Kuno-Oettinger predictive analyzer of English. 4,5,6
The proponents of neither of the two systems have been satisfied 
with simply assigning phrase-structure descriptions to each given sentence. 
A paraphrasing routine has been added to Robinson's English parser 7 so that
a set of kernel sentences can be obtained in addition to the phrase- 
structure description of the sentence. For example, the analysis outputs
of "X commands the third fleet." "The third fleet is commanded by X."
and "X is commander of the third fleet." would all contain the information
that the kernel is "S -- X, V -- commands, O -- third fleet". In connection
with the Kuno-Oettinger predictive analyzer, three kernelizing routines 
have been proposed by J. Olney, 8 B. Carmody and P. Jones, 9 and D. Foster, 10
which accept as input the output of the predictive analyzer and produce 
either kernel sentences or pairs of words which are in certain defined 
syntactic relationships. The SMART information retrieval system, 11,12,13,14,15
Salton's Magic Automatic Retriever of Texts, has a routine which compares
the structure diagram (part of the analysis output of the predictive 
This work has been supported in part by the National Science Foundation
under Grant GN-329.
analyzer) of a request sentence with the structure diagrams of sentences to 
be retrieved, so that paraphrases of the same kernel sentence can be 
identified. 
The aim of the present paper is to investigate the role of the 
predictive analyzer in a transformational grammar recognition system, and 
to propose a system for analysis of a language of a given transformational 
grammar. Before going into details of the proposed system, it is worthwhile 
to discuss briefly two other systems so far proposed as transformational 
grammar recognizers. 
2. General Solution to Recognition Problems of Transformational Languages 
(i) Analysis by Synthesis 
D. E. Walker and J. M. Bartlett 16 have proposed a system which 
parses the language of a given transformational grammar. Their system is 
essentially based on Matthews' proposal 17 for analysis by synthesis.
Analysis of a sentence is performed by generation of all possible strings 
from the initial symbol "Sentence" by means of a phrase-structure component, 
a transformational component, and a phonological component. Each of the 
terminal strings thus generated is matched against the input sentence. 
When a match is found, the path which has led to the matched terminal 
string represents an analysis of the input sentence. Certain heuristics 
are used to distinguish transformations which could have been applied to 
generate the sentence under analysis from those which could not have. For 
example, if a sentence ends in a question mark, then it is certain that at 
some point the question transformation was used. 
The Walker-Bartlett system, although drastically improved in
efficiency compared to the prototype proposed by Matthews, still seems to be
far from practicable because of the astronomical number of
sentences that will have to be generated before a match is found.
(ii) From Derived P-markers to Base P-markers 
Two similar parsing methods have been independently proposed by 
S. Petrick 18 and the MITRE Language Processing Techniques Subdepartment 
(Zwick, A. M., Hall, B. C., Fraser, J. B., Geis, M. L., Isard, S., 
Mintz, J., and Peters, P. S.) directed by Walker 19 as a general solution 
for the recognition problem of the language generated by a given transfor- 
mational grammar. A transformational generative grammar G T has three 
components: the phrase-structure component, the transformational component,
and the phonological component (see Diagram 1). The outputs of the
phrase-structure component are generalized P-markers which have grammatical and
lexical forms emanating from the lowest nodes in the trees. The function
of the transformational rules is to map generalized P-markers into derived
P-markers. If the transformational rules map the generalized P-marker M G
into the final derived P-marker M D of the sentence X, then M G is the deep
structure (base P-marker) of X and M D is its surface structure. The M D is
then transferred to the phonological component, whose output is the plain 
terminal string X. 20 
A slightly outdated model of a transformational grammar is presented here 
for the purpose of avoiding delicate arguments not directly connected with 
the aim of the present paper. 
[Diagram 1: Transformational Language Recognizer (1). Generation phase of
transformational grammar G T: phrase-structure component, generalized
P-marker, transformational component, derived P-marker, phonological
component, sentence in L(G T). Preparation phase of the L(G T) recognizer:
form a context-free grammar G S, write inverse transformations (inverse
transformational component), form a dictionary.]
Consider the (probably infinite) set of derived P-markers obtainable 
from a given transformational grammar GT. Each P-marker has at the bottom 
a string of symbols from which no branch emanates. Regard the set of all 
such strings corresponding to all derived P-markers as constituting language 
L D. It has been shown by Hall* that, given the original transformational
grammar GT, one can automatically construct a context-free grammar G S 
which accepts all the strings in L D and assigns the corresponding derived
P-markers to them. It is generally the case, however, that G S accepts
nonsentences in L D as well as sentences in L D, and also assigns some
incorrect P-markers, as well as the correct one(s), to sentences in L D.**
The analysis procedure works as follows (see Diagram 2).*** Given
a sentence in L(G T), the dictionary lookup program, which essentially plays
the role of the inverse of a phonological component, converts the sentence
into a string in L D. A context-free analyzer with grammar G S assigns one
(or more if the string is ambiguous in G S) derived P-marker(s) to the
string. Then, each such P-marker is transferred to the inverse
transformational component of G T. A test is made to see which of the transformational
rules could have been applied to map some previous P-marker into the current 
* Private communication. The author is greatly indebted to Barbara C. Hall,
who read a preliminary draft of this paper and gave him numerous valuable 
suggestions. 
** Actually, the context-free grammars for derived P-markers in both 
Petrick's and the MITRE group's systems have been manually compiled. 
Hall's automatic procedure does not guarantee an optimal context-free 
grammar for derived P-markers of a given transformational grammar. 
***The analysis procedure described here is that of the MITRE group, with 
some simplifications for the sake of clarity of explanation. Petrick's 
procedure is conceptually similar to, but actually deviates significantly 
from, the model described here. 
[Diagram 2: Transformational Language Recognizer (2). Analysis phase of the
L(G T) recognizer: input sentence; dictionary lookup (inverse of the
phonological component); string in L D; context-free analysis with G S;
derived P-marker; inverse transformational component; final P-marker; test
whether it is derivable from the phrase-structure component of G T. If yes,
it is a base P-marker; if no, the derived P-marker was produced by G S but
not by G T.]
P-marker in the course of generation of the given sentence. If a rule is 
found whose derived constituent structure index* matches the P-marker, the
inverse of the structural change specified by the rule is applied to the
P-marker, and a new P-marker is obtained which matches the original
structural index* of the rule. If no more transformational rules can be
applied inversely to the current P-marker, either the P-marker is a base
P-marker, or the P-marker assigned by G S was not a final derived P-marker
assigned to any sentence by G T. The latter case is due to the condition
that G S accepts nonsentences as well as sentences in L D and can give
incorrect P-markers to sentences that are in L D. In order to identify
whether the P-marker under consideration is a real base P-marker or not, a 
test has to be made to see if the P-marker is obtainable by the phrase- 
structure component of G T. If not, the original derived P-marker, which 
initiated the inverse transformational analysis path, is abandoned. If it 
is obtainable, the forward application of the transformational rules which 
were inversely applied confirms that it is in fact the base P-marker of 
the sentence under analysis. The base P-marker, the set of inversely 
applied transformational rules, and phonological rules contained in the 
dictionary entries constitute the analysis of the input sentence. 
* Each transformational rule contains a structural index and a derived
constituent structure index. The former specifies the condition that a
P-marker has to fulfill in order for the rule to be applied to it. The
latter specifies the structure of the P-marker into which the original
P-marker is to be mapped by the transformation.
3. A Predictive Analyzer and Transformational Analysis 
The system of transformational analysis which is proposed below
aims at obtaining a set of base P-markers at almost the same time as a set
of surface P-markers is obtained. Rules of the analytical context-free
grammar for the system have associated with them information pertaining to 
the transformational histories of their own derivation. For example, 
assume that the base P-marker of "I met a young prince" in a given 
transformational grammar is the one shown in Fig. 1, and that the
transformational component of the grammar maps this base P-marker into the derived
P-marker by a sequence of four transformations:
Base P-marker:          # I met a # the prince was young # prince #
Intermediate P-marker:  # I met a prince # the prince was young # #
Intermediate P-marker:  # I met a prince who was young #
Intermediate P-marker:  # I met a prince young #
Derived P-marker:       # I met a young prince #
Then, the analytical context-free grammar for derived P-markers will have 
a rule which identifies a noun phrase consisting of an article (art), an 
adjective (adj), and a noun. To this rule, we can assign the information
that the base P-marker image of this noun phrase is the subtree corresponding
to "art # the noun be adj # noun" of Fig. 1. We can say that each such rule
in the analytical context-free grammar draws a subtree of some base P-marker.
When a derived P-marker of a sentence is obtained, the set of phrase- 
structure rules used for the analysis draws a set of subtrees which, when 
combined together, constitute the base P-marker corresponding to the derived 
P-marker. 
[Figure 1: Base P-marker for "I met a young prince." (tree diagram; legible
node labels include NP, VP, VT, DET, N, prn (I), (met), noun (prince), the,
be, Adj (young)).]
The system is designed with the predictive analyzer 4,5 as its core.
The predictive analyzer uses a predictive grammar G' whose rules (called
"predictive rules") are of the following form:
    <Z, c> | Y1 ... Ym,   m >= 1
    <Z, c> | Λ
where Z, Yi are intermediate symbols (i.e., syntactic structures, also
called predictions), c is a terminal symbol (i.e., syntactic word class),
and Λ denotes the absence of any symbol. <Z, c> is called an argument pair.
<SE, prn> | VP PD, for example, indicates that a sentence (SE) can be
initiated by a prn (personal pronoun in the nominative case) if the prn is
followed by a predicate (VP) and a period (PD). A fragment of our current
English grammar is shown in Kuno and Oettinger. 4,5 It is proved by Greibach
that G' is an exact inverse of a standard-form grammar G whose rules are of
the form:
    Z --> c Y1 ... Ym   where <Z, c> | Y1 ... Ym is a rule in G', or
    Z --> c             where <Z, c> | Λ is a rule in G'.
Since Greibach has proved that every context-free language can be generated
by a standard-form grammar, the predictive analyzer could accept any
context-free language given a suitable predictive grammar.
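The correspondence between predictive rules and standard-form rules can be illustrated with a minimal Python sketch. The tuple representation of a rule and the names `to_standard_form` and `show` are assumptions made for illustration only; they are not part of the predictive analyzer. The second example rule, with no new predictions, is the period rule that appears later in Fig. 2.

```python
# A predictive rule <Z, c> | Y1 ... Ym is represented here as ((Z, c), [Y1, ..., Ym]).
# An empty list stands for the absence of any new prediction.

def to_standard_form(predictive_rules):
    """Return the standard-form rules Z --> c Y1 ... Ym as (Z, [c, Y1, ..., Ym])."""
    return [(z, [c] + list(ys)) for (z, c), ys in predictive_rules]


def show(rule):
    """Render one standard-form rule as text."""
    left, right = rule
    return f"{left} --> {' '.join(right)}"


example = [
    (("SE", "prn"), ["VP", "PD"]),   # <SE, prn> | VP PD
    (("PD", "prd"), []),             # a rule with no new prediction
]
standard = to_standard_form(example)
for rule in standard:
    print(show(rule))
```

The inversion is purely mechanical: the word class of the argument pair becomes the first symbol on the right-hand side of the standard-form rule.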
Given a context-free grammar G", we can automatically construct a 
standard-form grammar G which generates the same language as G" does. 
However, it is to be noted that the structural descriptions assigned to a 
given sentence by G are not the same as those assigned to the same 
sentence by G". In such a case, we say that G and G" are weakly 
equivalent with respect to the structural description. 
Consider a predictive grammar which does not contain more than one 
rule with the same argument pair, and an input string of words each of 
which is associated with a unique terminal symbol. The analysis of the 
sequence of terminal symbols c1 ... cn is initiated with a pushdown store
(PDS) containing some designated initial symbol ("SE" in the case of a
natural language. See Fig. 2 for an example). At word k in the course of
the analysis of the string, an argument pair <Zk, ck> is formed from the
intermediate symbol Zk topmost in the PDS and the current terminal symbol
ck. If a rule with this argument pair is not found in the grammar, the
input string is ill-formed (ungrammatical). If it is found, we say that
the prediction Zk is fulfilled by the rule <Zk, ck> | Y1 ... Ym
(or <Zk, ck> | Λ), or simply that Zk is fulfilled by ck. A sequence of
new intermediate symbols Y1 ... Ym (or Λ) then replaces the topmost
intermediate symbol Zk of the PDS and the analysis moves to word k+1. The
input string is well-formed if the last terminal symbol cn is processed
yielding an empty PDS. A set of standard-form rules corresponding to the 
predictive rules used for the analysis of the string gives the derivational 
history of the string in the original standard-form grammar. 
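The simple pushdown store machine described above can be sketched in a few lines of Python. This is only an illustration under the stated restrictions (one rule per argument pair, one word class per word); the dictionary layout and the function name `analyze` are assumptions, and the rule set is the one read from the sample analysis of Fig. 2.

```python
# Predictive rules keyed by argument pair (prediction, word class).
# The value is the list of new predictions; [] stands for no new prediction.
RULES = {
    ("SE", "art"): ["NP'", "VP", "PD"],   # Rule 1
    ("NP'", "adj"): ["N"],                # Rule 2
    ("N", "noun"): [],                    # Rule 3
    ("VP", "vt1"): ["NP"],                # Rule 4
    ("NP", "art"): ["NP'"],               # Rule 5
    ("PD", "prd"): [],                    # Rule 6
}


def analyze(word_classes, rules=RULES, initial="SE"):
    """Return the argument pairs used, or None if the string is ill-formed."""
    pds = [initial]                 # the top of the store is the end of the list
    used = []
    for c in word_classes:
        if not pds:
            return None             # input continues but no prediction is left
        z = pds.pop()               # topmost prediction
        if (z, c) not in rules:
            return None             # no rule with this argument pair
        used.append((z, c))
        pds.extend(reversed(rules[(z, c)]))   # Y1 becomes the new topmost symbol
    return used if not pds else None          # well-formed only with an empty PDS


# Word classes of "A young prince met a beautiful girl."
sample = ["art", "adj", "noun", "vt1", "art", "adj", "noun", "prd"]
print(analyze(sample))
```

The list of argument pairs returned identifies the predictive rules used, from which the corresponding standard-form rules, and hence the derivational history, can be read off.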
Actually, a grammar may have more than one rule with the same argu- 
ment pair. Also, a word in an input string may be associated with more than 
one terminal symbol. Therefore, a mechanism for cycling through all 
possible combinations of these rules and terminal symbols must be superimposed 
on the simple pushdown store machine described in the previous paragraph.
We are not concerned here, however, about how such a mechanism is designed
in the current predictive analyzer (see Sec. 1 of Kuno 6 for the analysis
algorithm). In the following discussions, only those analysis paths which
lead to the end of the sentence are considered, and all abortive paths will 
be ignored in order to avoid unnecessary complications of the important 
question under discussion. 
Assume that the input sentence "A young prince met a beautiful
girl." is to be analyzed. Also assume that Rules 1-6 (see Fig. 2) have
been used for the predictive analysis of the sentence. The configuration 
of the PDS prior to and immediately after the application of the rule at a 
given word position is shown in the preceding and succeeding lines of the 
column "PDS Configuration" of Fig. 2. The structural description (P-marker) 
assigned to this sentence by the set of standard-form rules corresponding
to the utilized predictive rules is shown in Fig. 3. 
Let us assume that the base P-marker that we want to have assigned 
to this sentence is not the one shown in Fig. 3, but the one in Fig. 4. 
Since a mapping of one P-marker into another P-marker involves shifting, 
removing, and adding of nodes in P-markers, it is important to have a device 
available to refer to any position in a P-marker. Names of branches in a 
P-marker are defined in the following way. If there are m branches emanating 
from a given node in a P-marker, the leftmost branch is named 1, the second
leftmost branch 2, and so forth. The rightmost branch is named m (see 
Fig. 4). Given a node y in a P-marker, the branch number of y is obtained 
by the concatenation to the right of each successive number assigned to 
each successive branch which leads from the topmost node to node y. For 
example, the branch number of adj for "young" in Fig. 4 is 1211, the branch 
number of noun for "girl" is 22221, and so on. Similarly, if we are given 
English    Rule     Argument       New            PDS Configuration
Word       Used     Pair           Predictions    (top ... bottom)
                                                  SE
A          Rule 1   <SE, art>      NP' VP PD      NP' VP PD
young      Rule 2   <NP', adj>     N              N VP PD
prince     Rule 3   <N, noun>      Λ              VP PD
met        Rule 4   <VP, vt1>      NP             NP PD
a          Rule 5   <NP, art>      NP'            NP' PD
beautiful  Rule 2   <NP', adj>     N              N PD
girl       Rule 3   <N, noun>      Λ              PD
.          Rule 6   <PD, prd>      Λ              (empty)
Predictive Analysis of a Sample Sentence
Figure 2
SE
  art (a)
  NP'
    adj (young)
    N
      noun (prince)
  VP
    vt1 (met)
    NP
      art (a)
      NP'
        adj (beautiful)
        N
          noun (girl)
  PD
    prd (.)
Structural Description Assigned by
the Predictive Analyzer
Figure 3
S
  1 NP
    1 T
      1 art (a)
    2 NP'
      1 A
        1 adj (young)
      2 N
        1 noun (prince)
  2 VP
    1 VT
      1 vt1 (met)
    2 NP
      1 T
        1 art (a)
      2 NP'
        1 A
          1 adj (beautiful)
        2 N
          1 noun (girl)
  3 PD
    1 prd (.)
Desired Base P-marker
Figure 4
S
  1 A
    1 D
    2 E
  2 B
  3 C
    1 F
    2 G
    3 H
Ordered Pairs and a P-marker
Figure 5
a set of ordered pairs of (branch number, node) such as (1, A), (2, B), 
(3, C), (11, D), (12, E), (31, F), (32, G), (33, H), the P-marker shown in
Fig. 5 can be automatically constructed given the initial symbol S. 
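The construction of a P-marker from a set of ordered pairs can be sketched as follows. This is a minimal illustration, assuming single-digit branch names (at most nine branches per node) and branch numbers held as digit strings; the nested-dictionary tree and the names `build_tree` and `bracket` are hypothetical choices, not the paper's data structures.

```python
def build_tree(pairs, root="S"):
    """Build a P-marker from (branch number, node) pairs under the initial symbol."""
    tree = {"label": root, "children": {}}
    # Shorter branch numbers first, so every parent exists before its children.
    for branch, label in sorted(pairs, key=lambda p: (len(p[0]), p[0])):
        node = tree
        for digit in branch[:-1]:          # walk down from the topmost node
            node = node["children"][digit]
        node["children"][branch[-1]] = {"label": label, "children": {}}
    return tree


def bracket(node):
    """Render a P-marker in labelled-bracket form, branches in left-to-right order."""
    kids = [bracket(node["children"][k]) for k in sorted(node["children"])]
    return node["label"] if not kids else f"{node['label']}({' '.join(kids)})"


# The ordered pairs given in the text for Fig. 5.
FIG5_PAIRS = [("1", "A"), ("2", "B"), ("3", "C"), ("11", "D"),
              ("12", "E"), ("31", "F"), ("32", "G"), ("33", "H")]
print(bracket(build_tree(FIG5_PAIRS)))
```

Because a branch number spells out the path from the topmost node, the order in which the pairs are supplied does not matter.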
To each prediction in each rule of the predictive grammar is assigned 
a set of ordered pairs (x, y) where y indicates the name of a node and x 
the branch number of y in a P-marker. For example, Rule 1 will have the 
following sets of ordered pairs assigned to its predictions:
Rule 1:  <SE, art> |  NP'         VP         PD
         (1, NP)      (12, NP')   (2, VP)    (3, PD)
         (11, T)
         (111, art)
The set of ordered pairs assigned to the prediction of the argument pair 
in Rule 1 represents the names of nodes and branch numbers leading from 
the prediction of the argument pair to the final node "art". The set of 
ordered pairs associated with each new prediction shows the relationship 
of new predictions with the word class "art" of the argument pair (see 
Fig. 6). If in an ordered pair (x, y) associated with a prediction in a 
rule, y is not equal to the prediction itself (or to the word class of the 
argument pair in case the prediction is also in the argument pair), then 
the ordered pair plays the role of adding a new node y in a P-marker. 
In the course of predictive analysis of a sentence, the set of 
ordered pairs associated with the argument pair's prediction is stored in
In Rule 1, each of the new predictions NP', VP, and PD has a one-member
set of ordered pairs. Examples of sets of more than one ordered pair
will follow (e.g., Rule 3a).
[Figure 6: Partial P-marker (tree diagram showing the nodes introduced by
Rule 1 and its new predictions).]
the output work area. The set of ordered pairs associated with each new 
prediction is stored in the PDS together with the prediction. 
The branch number of an ordered pair in a rule does not have to be 
a constant as is the case with all the ordered pairs of Rule 1. For 
example, see Rule 2. 
The expression "argument pair's prediction" is used as distinct from the
expression "fulfilled prediction". The former is prediction Z of <Z, c>,
while the latter refers to the prediction which is topmost in the PDS and
fulfilled by the rule <Z, c> | Y1 ... Ym (or <Z, c> | Λ). The fulfilled prediction
was a new prediction of a rule which was used at some preceding word 
position, and has associated with it in the PDS a set of ordered pairs. 
Although the fulfilled prediction itself at a given word position is always 
the same as the argument pair's prediction of the rule used at the same 
word position, it is convenient to distinguish the two for our subsequent 
discussions because the set of ordered pairs associated with the fulfilled 
prediction in the PDS is different from the set of ordered pairs associated 
with the argument pair's prediction in the rule (see explanation of Rule 2). 
Rule 2:  <NP', adj> |  N
         {(x, y)}       (x2, N)
         (x1, A)
         (x11, adj)
This rule is used for the processing of "young" and "beautiful" of the 
example "A young prince met a beautiful girl." (see Fig. 2). The branch 
number that the node NP' which dominates "young" is to receive is different 
from the branch number that the node NP' which dominates "beautiful" is to
receive in the base P-marker. Since NP' can be a recursive symbol, there
is no way of assigning all the possible branch numbers that NP' can be
associated with in any finite number of rules. Instead, we use a variable
x whose value is determined by the branch number of the immediately
dominating node in a P-marker. The notation {(x, y)} is used to indicate
that the prediction appearing above the notation is to be assigned the same
set of ordered pairs as the fulfilled prediction used to have in the PDS.
In our example, the first NP' ("young") has {(12, NP')} due to Rule 1 when
it becomes topmost in the PDS. In the case of the second NP' ("beautiful"),
it will be shown later that it has {(222, NP')}.
Similarly, the branch number that the node adj for "young" is to
receive in a base P-marker is different from the branch number that the node
adj for "beautiful" is to receive. In fact, each of the two branch numbers
depends upon the branch number which its respective immediately dominating 
node NP' is associated with (see Fig. 4). Yet, if NP' is to be regarded as 
the initial node, the branch numbers to be associated with A and adj for 
"young" and N and noun for "prince" are exactly the same as those to be 
associated with A and adj for "beautiful" and N and noun for "girl",
respectively. Therefore, in Rule 2, the branch numbers dominated by NP'
are given as constants, and branch numbers emanating from the initial
symbol and leading to NP' are given as variables. The notation (x1, A),
for example, indicates that whatever the branch number from the initial
node to NP' might be, A is to receive 1 as the rightmost digit for the
entire branch number from the initial node to A. It is to be noted that
ordered pairs with variables appear only in rules in the grammar whose
argument pairs do not contain the initial prediction SE. Once a rule is
used for the analysis of a sentence, all the variables for branch numbers
in the set of ordered pairs associated with this rule will be changed into
some numerical branch numbers.
In general, (x~m, Z) (m >= 1) not in a pair of braces indicates
the following:
    Take the maximum value* of branch numbers (max x) in {(x, y)}
    of the fulfilled prediction. (Remember that the branch
    numbers of ordered pairs associated with the fulfilled
    prediction are all numerical, and do not contain any
    variables. Regard numeric branch numbers as integers to
    obtain the "maximum value".)
    Concatenate m to the right of max x.
    Form an ordered pair with Z.
The concatenation mark is suppressed where no confusion can result. When
(x~m, Z) appears in a pair of braces, all the values of x in {(x, y)} of
the fulfilled prediction, not the maximum value, are used to form a set of
new ordered pairs with m concatenated to the right of each value of x (see
Rule 5a for example).
* Why max x is used among values of x in {(x, y)} will be explained in Sec. 5.
Ordered pairs with variables (x, y), (x~m, y) can be regarded as a
notation for some function whose value depends upon the previously obtained 
value of the same function. It is this recursive nature of ordered pairs 
in the grammar that allows the proposed system to work for an infinite 
number of sentences in the language. 
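The two readings of the concatenation notation, outside and inside braces, can be made concrete with a short Python sketch. The function name `concat` and the `braced` flag are assumptions introduced for illustration; branch numbers are held as digit strings and compared as integers, as the text prescribes. The sample values are those used in the paper's own discussion of Rule 2 and Rule 5a.

```python
def concat(m, node, fulfilled_pairs, braced=False):
    """Instantiate (x~m, node) against the ordered pairs of the fulfilled prediction.

    Outside braces only max x is used; inside braces every value of x is used.
    """
    xs = [x for x, _ in fulfilled_pairs]
    if not braced:
        xs = [max(xs, key=int)]        # regard branch numbers as integers
    return [(x + m, node) for x in xs]


# Fulfilled prediction NP' with {(12, NP')}: (x1, A) becomes (121, A).
print(concat("1", "A", [("12", "NP'")]))

# Fulfilled prediction N with (222, N) and (221312, N).
n_pairs = [("222", "N"), ("221312", "N")]
print(concat("1", "noun", n_pairs))               # max x only
print(concat("1", "noun", n_pairs, braced=True))  # every value of x
```

Since each new branch number is computed from the branch numbers already attached to the fulfilled prediction, a finite set of rules can label nodes at any depth.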
In the case under discussion, the fulfilled prediction NP'
corresponding to "young" has {(12, NP')} associated with it in the PDS.
Therefore, max x = 12. So, (x1, A) and (x11, adj) are changed into (121, A)
and (1211, adj), respectively, and the latter two are stored in the output
work area. As explained in the previous paragraph, {(x, y)} associated with
the argument pair's prediction is replaced by {(12, NP')}, which also is
stored in the output work area. The new prediction N of Rule 2 is assigned
the ordered pair (12~2, N), i.e., (122, N). N with (122, N) replaces the
fulfilled prediction NP' and its ordered pair (12, NP') in the PDS. Now the
output work area
contains (1, NP), (11, T), (111, art) due to Rule 1 and (12, NP'), (121, A),
(1211, adj) due to Rule 2. This set of ordered pairs corresponds to a
partial P-marker shown in Fig. 7.
Rule 3 is shown below with ordered pairs:
Rule 3:  <N, noun> |  Λ
         {(x, y)}
         (x1, noun)
When Rule 3 is used for the processing of the third word "prince" of the
example, the fulfilled prediction has associated with it the ordered pair
(122, N). Therefore, (122, N) and (1221, noun) are stored in the output
S
  1 NP
    1 T
      1 art
    2 NP'
      1 A
        1 adj
Partial P-marker Constructed
Figure 7
work area. Rules 4-6 are shown below in the new form; Fig. 8 shows the
analysis of the same sentence using the new rules.
Rule 4:  <VP, vt1> |  NP
         {(x, y)}     (x2, NP)
         (x1, VT)
         (x11, vt1)
Rule 5:  <NP, art> |  NP'
         {(x, y)}     (x2, NP')
         (x1, T)
         (x11, art)
Rule 6:  <PD, prd> |  Λ
         {(x, y)}
         (x1, prd)
It is to be noted that the set of ordered pairs in the output work area in
Fig. 8 is isomorphic to the P-marker shown in Fig. 4.
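The whole mechanism, predictive analysis together with the bookkeeping of ordered pairs, can be sketched in Python as follows. This is a hedged illustration and not the experimental program mentioned in the abstract: the rule table is transcribed from Rules 1-6 and the entries of Fig. 8, while the names `INHERIT`, `instantiate`, and `analyze`, and the representation of branch numbers as digit strings, are assumptions. Only the unbraced concatenation (max x) is implemented, which is all that Rules 1-6 require.

```python
INHERIT = "{(x, y)}"   # marker: copy the ordered pairs of the fulfilled prediction


def instantiate(templates, fulfilled):
    """Turn rule templates into numeric ordered pairs, using max x for the variable x."""
    max_x = max((x for x, _ in fulfilled), key=int, default="")
    pairs = []
    for t in templates:
        if t == INHERIT:
            pairs.extend(fulfilled)
        else:
            spec, node = t
            branch = max_x + spec[1:] if spec.startswith("x") else spec
            pairs.append((branch, node))
    return pairs


# (prediction, word class) -> (pairs for the argument pair's prediction,
#                              [(new prediction, its pairs), ...])
RULES = {
    ("SE", "art"): ([("1", "NP"), ("11", "T"), ("111", "art")],
                    [("NP'", [("12", "NP'")]), ("VP", [("2", "VP")]), ("PD", [("3", "PD")])]),
    ("NP'", "adj"): ([INHERIT, ("x1", "A"), ("x11", "adj")], [("N", [("x2", "N")])]),
    ("N", "noun"): ([INHERIT, ("x1", "noun")], []),
    ("VP", "vt1"): ([INHERIT, ("x1", "VT"), ("x11", "vt1")], [("NP", [("x2", "NP")])]),
    ("NP", "art"): ([INHERIT, ("x1", "T"), ("x11", "art")], [("NP'", [("x2", "NP'")])]),
    ("PD", "prd"): ([INHERIT, ("x1", "prd")], []),
}


def analyze(word_classes):
    """Return the contents of the output work area for a well-formed input."""
    pds = [("SE", [])]               # each PDS entry: (prediction, its ordered pairs)
    work_area = []
    for c in word_classes:
        if not pds:
            raise ValueError("input continues after the last prediction")
        prediction, fulfilled = pds.pop()
        if (prediction, c) not in RULES:
            raise ValueError(f"no rule for <{prediction}, {c}>")
        arg_templates, new_predictions = RULES[(prediction, c)]
        work_area.extend(instantiate(arg_templates, fulfilled))
        for new_pred, templates in reversed(new_predictions):
            pds.append((new_pred, instantiate(templates, fulfilled)))
    if pds:
        raise ValueError("unfulfilled predictions remain")
    return work_area


# Word classes of "A young prince met a beautiful girl."
output = analyze(["art", "adj", "noun", "vt1", "art", "adj", "noun", "prd"])
for pair in output:
    print(pair)
```

Each branch number in the output work area is the path from the topmost node to the named node, so the list can be turned directly into the P-marker of Fig. 4.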
Let us go back to the transformational grammar previously mentioned
which assigns the base P-marker of Fig. 1 to "I met a young prince."
English    Argument      Contribution to                     PDS Configuration
Word       Pair          Output Work Area                    (top ... bottom)
                                                             SE
A          <SE, art>     (1,NP), (11,T), (111,art)           NP' (12,NP')   VP (2,VP)   PD (3,PD)
young      <NP', adj>    (12,NP'), (121,A), (1211,adj)       N (122,N)   VP (2,VP)   PD (3,PD)
prince     <N, noun>     (122,N), (1221,noun)                VP (2,VP)   PD (3,PD)
met        <VP, vt1>     (2,VP), (21,VT), (211,vt1)          NP (22,NP)   PD (3,PD)
a          <NP, art>     (22,NP), (221,T), (2211,art)        NP' (222,NP')   PD (3,PD)
beautiful  <NP', adj>    (222,NP'), (2221,A), (22211,adj)    N (2222,N)   PD (3,PD)
girl       <N, noun>     (2222,N), (22221,noun)              PD (3,PD)
.          <PD, prd>     (3,PD), (31,prd)                    (empty)
Analysis of the Sample Sentence
Figure 8
The following set of rules, in the framework of the same mechanism
as ~as introduced above, can give the desired base P-marker. 
Rule 1a:  <SE, prn> |  VP PD
          (11, prn)
Rule 2a:  <VP, vt1> |  NP
          {(x, y)}     (x2, NP)
          (x1, VT)
Rule 3a:  <NP, art> |  A             N
          {(x, y)}     (x13221, A)   (x2, N)
          (x11, art)                 (x1312, N)
          (x12, #)
          (x13, S)
          (x14, #)
          (x13111, the)
          (x1321, COP)
          (x13211, be)
          (x1322, PRD)
Rule 4a:  <A, adj> |  Λ
          {(x, y)}
          (x1, adj)
/< associated with a prediction in a rule performs the function of 
eliminating the node for the prediction from a P-marker. 
Rule 5a: <N, noun>
.~x, y)j 
7 A 
/< i 
The argument pair's prediction N has associated
with it a set of ordered pairs {(x1, noun)}. [...] When Rule 5a is used
to process "prince," the fulfilled
prediction N has associated with it ordered pairs (222, N) and
(221312, N). Therefore, {(x1, noun)} is changed into (2221, noun)
and (2213121, noun).
In comparing Rule 3a, for example, with Fig. 1, one may wonder
why (x2, N) and (x1312, N) are associated with the new prediction N,
and not with the argument pair's prediction NP. If the latter
alternative were chosen, N would have no ordered pairs in Rule 3a.
Then, when Rule 5a is used for the processing of "prince," there would
be no way of obtaining desired branch numbers for the noun in {(x1, noun)}.
The concatenation operation x~m introduced in the previous 
paragraphs is not enough to deal with coordinate structures. Assume 
that the base P-marker of Fig. 9 is to be assigned to "She is young and
beautiful."
Rule 6a: <PD, prd> | Λ
[Figure 9: Base P-marker for "She is young and beautiful." (tree diagram;
legible node labels include #S#, prn (she), COP, be (is), PRED, (young),
(and), (beautiful)).]
Rule 7:  <PRED, adj> |  AND          A
         (x1, A)        (x2, AND)    (x3, A)
         (x11, adj)
Rule 8:  <AND, and> |  Λ
         {(x, y)}
Rule 9:  <A, adj> |  Λ
         (x1, adj)
Rule 7 is capable of assigning numbers 1, 2, and 3 to the three branches
emanating from PRED and leading to A, AND, and A, respectively.
However, if the predicate has three adjectives, "young and beautiful
and intelligent," the inadequacy of a context-free grammar manifests
itself. The P-marker that we want to obtain is not that of Fig. 10(a),
but of Fig. 10(b). Yet, we cannot include in the predictive grammar a
rule such as
    <PRED, adj> |  AND A AND A
    (x11, adj)
because we will face the same problem for coordinate predicates with
more than three coordinated members, and because we cannot have an
infinite number of rules pertaining to i-member coordinate structures
where i = 2, 3, ..., ∞.
In order to obtain P-markers of the type shown in Fig. 10(b)
with a finite set of rules, a new operation "+" is introduced. If a
prediction in a rule has (x+m, u), max x is chosen among the values of
x of {(x, y)} associated with the fulfilled prediction: m is numerically
Actu~lly, the difficulty under discussion is not only of a context-free 
analyzer, but also of the phrase-structure component of a transforma- 
tio~=aL grammar. A base P-marker of the type shown in Fig. 10(b) c~ot 
be obtained by any phrase-structure grammar if an infinite number of 
coordinated members is to be accounted for. One solution for a 
transformational ~enerative grammar is to have in its phrase-structure 
component a re~vriting schema such as PRED-->A (AND A)*, where 
(AND A)* can be repeated any number of times (including zero). T~his 
is done in the :J~ITP~ procedure in both the generative phrase structure 
component of GT and the context-free analysis component G S. In the 
generative component, the only starred rule is S'----~S (AND S)*; in 
the recognition component, all compoundable intermediate symbols have 
rules of this type. 
Ku.uo= 26 
PRED ~ PROD 
1 ,."" 2 .... -.3 
AND A AND A • " AND ~ A 
\, 
3\,, 
(a) (b) 
Base P-markers for Coordinate Structures 
Figure i0 
added to the rightmost position of max x. (If more than nine 
constituents are to be accepted in a construct, it is necessary to 
use more than one digit for the name of each branch, but this does not 
cause any additional complexities.) For example, when the second 
adjective "beautiful" of the example "She is young and beautiful and 
intelligent." fulfills the prediction A, Rule i0 is usedt 
Rule i0- <A~ ad~ I AND A i m , , 
(x÷l, AND) (~2, A)" 
{(x, y)} for the fulfilled prediction is (223, A); therefore, (x+l~ AND) 
and (x+2, A) are changed into (224, AND) and (225, A), respectively 
(224 = 223 + i, 225 = 223 + 2), and are stored in the PDS with the 
corresponding predictions AND and A. If the predicate has four 
adjectives as in "young and beautiful and intelligent and bright," 
Rule i0 will be used again for the processing of "intelligent." This 
Kuno-27 
time, max x = 225. Therefore, new predictions AND and A will be stored 
in the PDS with the new ordered pairs (226, AND) and (227, A), 
respectively. 
It should now be noted that the concatenation operation x~m 
plays the role of generating a subtree whose initial node has the branch 
number max x, while x+ m plays the role of adding a branch to the right 
of a branch whose branch number is x, and whose immediately dominating 
node also dominates the added branch. 
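The contrast between the two operations can be made concrete with a short Python sketch. It is an illustration only, not the experimental program: the function names max_x, concat, and add are hypothetical, and branch numbers are assumed to be plain integers with one digit per branch name.

```python
def max_x(pairs):
    """Return the largest branch number among the ordered pairs (x, y)."""
    return max(x for x, _ in pairs)


def concat(x, m):
    """x~m: write m to the right of x, opening a subtree below branch x."""
    return int(f"{x}{m}")


def add(x, m):
    """x+m: numeric addition, naming a sister branch m places to the right of x."""
    return x + m


# "beautiful" fulfills the prediction A, whose ordered pairs are {(223, A)}.
fulfilled = [(223, "A")]
x = max_x(fulfilled)

# Rule 10 attaches (x+1, AND) and (x+2, A) to its two new predictions.
new_predictions = [(add(x, 1), "AND"), (add(x, 2), "A")]
print(new_predictions)   # [(224, 'AND'), (225, 'A')]

# Concatenation, by contrast, descends one level: (222, N) with x1 gives 2221.
print(concat(222, 1))    # 2221
```

Addition keeps the new branch under the same dominating node, while concatenation lengthens the branch number and therefore moves one level down the tree.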
4. Salient Features of the Proposed System for Transformational Analysis 
What are the salient differences between the transformational 
analysis system (see Sec. 2(ii) of this paper) proposed by the MITRE 
group and Petrick (to be referred to as M-P system) and the one proposed 
in the present paper (to be referred to as K-system)? The M-P system 
is based on the condition that a transformational grammar is given. A 
context-free analysis component is automatically constructed on the 
basis of the transformational grammar; the context-free analysis 
component assigns one or more derived P-markers to a sentence to be 
analyzed; transformational rules are applied inversely to each P-marker 
step by step until the base P-markers of the sentence are obtained. 
For example, after a derived P-marker is assigned to "He met a beautiful 
girl.", the M-P system will compare the P-marker with the derived 
See the second footnote on page 4.
constituent structure indices of transformational rules, and find that 
this derived P-marker is the result of the transformational rule which 
places an adjective in front of a noun. Therefore, by applying this 
rule inversely, an intermediate P-marker corresponding to "#He met a 
girl beautiful#" is obtained. Next, this new P-marker is compared with 
derived constituent structure of transformational rules, and it is 
found that this is the result of the transformational rule which deletes 
a relative pronoun and a copula. Therefore, by applying this rule 
inversely, an intermediate P-marker corresponding to "#He met a girl 
who was beautiful#" is obtained. Next, this intermediate P-marker is 
compared with the derived constituent structure indices of transforma- 
tional rules again and is identified as being the result of a 
relativization rule. Therefore, the rule is applied inversely, and a 
new P-marker corresponding to "#He met a girl # the girl was beautiful#" is obtained, 
which in turn is identified as originating from a rule which places an 
embedded #S# dominated by DET after the noun. A new P-marker corre- 
sponding to "#He met a # the girl was beautiful #girl#" is thus 
obtained. After comparing this P-marker again with rules in the
transformational component, it is found that there is no rule whose 
derived constituent structure index matches the P-marker. It is also 
found that the P-marker is derivable from the phrase-structure 
component of the transformational grammar. Thus, the P-marker is 
identified as being a base P-marker, and forward application of the 
transformations which were inversely applied confirms that it is in 
fact the base P-marker of the sentence under analysis. 
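The control flow of the step-by-step inverse mapping just described can be sketched in a few lines of Python. This is a deliberately crude toy under loud assumptions: the M-P system matches derived constituent structure indices against P-markers, whereas the sketch below works on flat strings, and the four regular-expression "inverse rules" (their names, patterns, and the function analyze) are hypothetical and hard-wired to the single example sentence.

```python
import re

# Hypothetical string-level stand-ins for inversely applied transformations.
INVERSE_RULES = [
    ("adjective preposing", r"\ba (beautiful) (girl)\b", r"a \2 \1"),
    ("relative pronoun and copula deletion", r"\b(girl) (beautiful)\b", r"\1 who was \2"),
    ("relativization", r"\b(girl) who (was beautiful)\b", r"\1 # the \1 \2 #"),
    ("embedded-S placement", r"\ba (girl) (# [^#]* #)", r"a \2 \1"),
]


def analyze(sentence):
    """Apply the first matching inverse rule repeatedly until none matches."""
    steps = [sentence]
    while True:
        for _name, pattern, replacement in INVERSE_RULES:
            if re.search(pattern, steps[-1]):
                steps.append(re.sub(pattern, replacement, steps[-1], count=1))
                break
        else:
            # No rule matched: the last string plays the role of the base form.
            return steps


for step in analyze("He met a beautiful girl"):
    print(step)
```

Each pass corresponds to one intermediate P-marker in the description above; the loop stops when no rule matches, which mirrors the test that identifies a base P-marker.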
With regard to the K system, on the other hand, a predictive
grammar which accepts all the sentences of a given transformational
grammar GT (and probably nonsentences in addition) is manually
compiled. A derived P-marker assigned to a given sentence by the
predictive grammar is usually not equal to the derived P-marker which
is assigned to the same sentence by GT. The mapping of such a
distorted P-marker into the base P-marker is not performed step by
step through intermediate P-markers as is the case with the M-P
system. Instead, it is performed in one step by means of ordered
pairs. For example, the fact that the predictive rule
<NP, art> | A N
has been used for assigning a distorted P-marker to the sentence
"He met a beautiful girl." indicates immediately that an embedded
sentence which constitutes a relative clause is involved here, that
the subject of the embedded sentence is the same as a noun ("girl"
in our example) which fulfills N of the predictive rule, and that
the adjective ("beautiful") which fulfills A is the predicate
adjective of the embedded sentence. The predictive rule has
associated with it a set of ordered pairs which draws a subtree
of the base P-marker image of this NP. The summation of such
subtrees drawn by all the rules used for obtaining the distorted
P-marker yields the base P-marker of the sentence.
The K system does not achieve this one-step mapping without
cost. The price is paid in the simplicity of the context-free
analysis component. For example, in order to obtain desired base
P-markers for
(i) Look at the girl who is dancing the mazurka.
(ii) This is the girl whom everyone likes.
(iii) This is the girl by whom he was ruined.
the predictive grammar must have three different rules pertaining
to a noun phrase initiated by the definite article "the." Each
rule specifies a different position, in the embedded sentence, of
the predicted N (see circled N's in Fig. 11).
Rule (i): <NP, the> | N RELsbj
    (x11, the) (x12, #) (x13, S) (x14, #) (x131, NP) (x13111, the)
    N: (x2, N) (x1312, N)
    RELsbj: (x13, R)
Rule (i-a): <RELsbj, who> | VP
    Λ
    VP: (x2, VP)
Rule (ii): <NP, the> | N RELobj
    (x1,
    (x11, the) (x12, #) (x13, S) (x14, #) (x1322, NP) (x13221, DET) (x132211, the)
    N: (x2, N) (x13222, N)
    RELobj: (x13, R)
[Figure 11: Position of Predicted N in Self-embedded Sentence, with one tree each for sentences (i), (ii), and (iii)]
Rule (ii-a): <RELobj, whom> | NP VP'
    Λ
    NP: (x1, NP)
    VP': (x2, VP)
Rule (ii-b): <VP', vt1>
    {(x, y)} (x1, VT) (x11, vt1)
Rule (iii): <NP, the> | N RELpass
    {(x, y)}
    (x1,
    (x11, the) (x12, #) (x13, S) (x14, #)
    (x222, NP) (x2221, DET) (x22211, the)
    N: (x2, N) (x2222, N)
    RELpass: (x13, R)
Rule (iii-a): <RELpass, by> | WHOM NP VP
    (x22, AGNT) (x221, BY) (x2211, by)
    NP: (x1, NP)
    VP: (x2, VP) (x21, V)
Moreover, in order to deal with sentences such as 
(iv) Look at the girl dancing the mazurka.
(v) Look at the dancing girl.
(vi) This is the girl liked by everyone.
additional rules have to be recognized which have the same argument
pair <NP, the> but which have different sequences of new predictions
and different sets of ordered pairs from those in Rules (i),
(ii) and (iii). Depending upon the nature of the original trans- 
formational grammar GT, the number of such rules with the same 
argument pair can become very large. However, when a given sentence 
with a noun phrase is analyzed, only one of these rules will lead to 
the end of the sentence (unless the sentence is ambiguous with 
respect to the noun phrase), and all the other rules of <NP, the> 
will come to an impasse before the end of the noun phrase is reached. 
Moreover, once an analysis of the sentence is obtained, the derived 
P-marker can be unambiguously mapped into the corresponding base P-marker. 
5. Practical Applications
The mechanism introduced in Sec. 3 for transformational analysis 
is quite effective for obtaining pairs (or triples, etc.) of words which 
are in certain syntactic relationships in a sentence. Assume that "The 
young prince made the beautiful girl his wife." is to be analyzed and 
that we are interested in obtaining word-triples "prince - made - girl," 
"prince - (be) - young," "girl - (be) - wife," and "girl - (be) - beautiful." 
We can achieve this aim by the following set of rules: 
Rule 1': <SE, the> | NP' VP PD
    NP': (1, z)
    VP: (2, z)
Rule 2': <NP', adj> | N
    (x2, be) (x3, z)
    N: {(x, y)} (x1, z)
Rule 3': <N, noun>
Rule 4': <VP, vt3> | NP NP
    {(x, y)} ((x+1)2, be)
    NP: (x+1, z) ((x+1)1, z)
    NP: ((x+1)3, z)
Rule 5': <NP, the> | NP'
    Λ
    NP': {(x, y)}
Rule 5'a: <NP, the> | N
    Λ
    N: {(x, y)}
Rule 6': <PD, prd>
"z" as the second coordinate of an ordered pair means that when 
the ordered pair is stored in the work area (not in the PDS), z should 
be changed into whatever word form has fulfilled the prediction. For 
example, the second word "young" of the sentence is processed with 
Rule 2', which has two ordered pairs (x2, be) and (x3, z) associated 
with the argument pair's prediction NP'. NP' in the PDS has (1, z) due
to Rule 1'. Therefore, max x = 1, and z = young. So, (12, be) and
(13, young) are stored in the output work area.
When the fourth word "made" is processed with Rule 4', the
fulfilled prediction VP has (2, z) associated with it in the PDS.
Therefore, max x = 2. Ordered pair ((x+1)2, be) indicates that 1 is to
be numerically added to max x, and 2 is to be concatenated to the right
of the sum. Therefore, ((x+1)~2, be) = ((2+1)~2, be) = (32, be) is
obtained, which is stored in the output work area as well as (2, made)
obtained from (2, z). In the same way, the two sets of ordered pairs
for the two new predictions of Rule 4' will be changed into:
NP: (3, z) (31, z)
NP: (33, z)
When Rule 5' is used for the processing of the fifth word
"the," the fulfilled prediction NP has associated with it two ordered
pairs (3, z) and (31, z). The argument pair's prediction has no
ordered pairs; the new prediction NP' is assigned the same set of
ordered pairs as was assigned to the fulfilled prediction NP. Therefore,
when Rule 2' is used for the processing of the sixth word
"beautiful," the fulfilled prediction has ordered pairs (3, z) and
(31, z). Max x is equal to 31. Therefore, (x2, be) and (x3, z) are
changed to (312, be) and (313, beautiful), respectively, which are then
stored in the output work area. The new prediction N is assigned
(3, z), (31, z), and (311, z) due to the set of ordered pairs {(x, y)}
and (x1, z) of the prediction. The reason that max x is to be used
among all the values of x in {(x, y)} is that, whatever the branch
number of the noun ("girl") which fulfills N may be, we want to have
the word triple corresponding to N ("girl") - be - adj ("beautiful")
emanate as the lowest-order subtree dependent upon the lowest-order
occurrence of N ("girl"). Otherwise, the branch numbers of N
("girl"), be, adj ("beautiful") would be confused with branch numbers
of N ("girl"), be, N ("wife") (see Fig. 12).
When the analysis of the sentence is obtained, the ordered 
pairs (with no variable component in the branch number) in the output 
work area are sorted with the right-adjusted branch numbers as the 
sorting key. The result of the sorting is: 
(  1, prince)
(  2, made)
(  3, girl)
( 11, prince)
( 12, be)
( 13, young)
( 31, girl)
( 32, be)
( 33, wife)
(311, girl)
(312, be)
(313, beautiful)
Each set of ordered pairs whose branch numbers differ from each other 
only at the rightmost position forms a word pair (or triple, etc.). 
The set of all the ordered pairs can also be regarded as constituting 
a tree of the structured information shown in Fig. 12. 
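The grouping step can be sketched as follows. This is a minimal illustration, not the author's program: the function name word_groups is hypothetical, and branch numbers are assumed to be integers with single-digit branch names, so that dropping the rightmost digit gives the node from which a word's branch emanates.

```python
from collections import defaultdict

# Sorted contents of the output work area for the sample sentence.
pairs = [
    (1, "prince"), (2, "made"), (3, "girl"),
    (11, "prince"), (12, "be"), (13, "young"),
    (31, "girl"), (32, "be"), (33, "wife"),
    (311, "girl"), (312, "be"), (313, "beautiful"),
]


def word_groups(pairs):
    """Group words whose branch numbers differ only at the rightmost position."""
    groups = defaultdict(list)
    for branch, word in sorted(pairs):       # numeric sort = right-adjusted sort
        groups[branch // 10].append(word)    # drop the rightmost digit
    return [tuple(words) for _, words in sorted(groups.items())]


for group in word_groups(pairs):
    print(" - ".join(group))
# prince - made - girl
# prince - be - young
# girl - be - wife
# girl - be - beautiful
```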
[Figure 12: Kernel Sentences for the Sample Sentence]
Observe that the addition operation "x+m," which was introduced
originally to deal with coordinated structures (see Sec. 3), has
been used for a different purpose in Rule 4'. The first of the two new
NP predictions in Rule 4' has associated with it the ordered pair
(x+1, z). This places the NP (which is eventually fulfilled by "girl")
on the same level in a tree as the prediction VP which has been fulfilled
by "made."
When P-markers of the type shown in Fig. 12 are desired, neither 
the addition operation nor the concatenation operation is satisfactory 
in dealing with sentences with coordinate structures, for which a new 
device has to be introduced. Assume that the sentence to be analyzed is
"He met Mary and Jane and Karen.", and that three word-triples
he - met - Mary
he - met - Jane
he - met - Karen
are to be identified in the sentence. In order to accomplish this
object, the notion of a decimal point is used. The notation x.m in an
ordered pair indicates that m should be concatenated to the right of x
as the rightmost fraction digit. For example, if x = 32.3 and m = 1,
x.m = 32.3~1 = 32.31. If x = 3 and m = 1, x.m = 3.1. The concatenation
and addition operations described in Sec. 3 are performed on the units
digit of a given branch number. For example, if x = 32.3 and m = 1,
x~m = (32~1).3 = 321.3; and x+m = (32+1).3 = 33.3. As is the case
with x~m and x+m, x usually indicates the maximum value of x in the
set of ordered pairs of the fulfilled prediction. However, {(x.m, y)}
indicates that all the ordered pairs associated with the fulfilled 
prediction should be assigned to the corresponding prediction with a 
fraction digit m concatenated to the right of each branch number (see 
Rule 13 for an example). 
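The three operations on branch numbers with a decimal point can be checked with a small Python sketch. It assumes, purely for illustration, that a branch number is held as a string such as "32.3"; the helper names split, join, concat, add, and frac are hypothetical.

```python
def split(branch):
    """Split '32.3' into its integral part '32' and fraction part '3'."""
    integer, _, fraction = branch.partition(".")
    return integer, fraction


def join(integer, fraction):
    """Reassemble a branch number, omitting the point when there is no fraction."""
    return f"{integer}.{fraction}" if fraction else integer


def concat(x, m):
    """x~m: concatenate m to the right of the units digit."""
    integer, fraction = split(x)
    return join(integer + str(m), fraction)


def add(x, m):
    """x+m: add m numerically to the integral part."""
    integer, fraction = split(x)
    return join(str(int(integer) + m), fraction)


def frac(x, m):
    """x.m: concatenate m as the rightmost fraction digit."""
    integer, fraction = split(x)
    return join(integer, fraction + str(m))


print(frac("32.3", 1))    # 32.31
print(frac("3", 1))       # 3.1
print(concat("32.3", 1))  # 321.3
print(add("32.3", 1))     # 33.3
```

Strings are used rather than floating-point numbers so that the fraction digits are kept exactly as a sequence of digits.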
Rule 11: <SE, prn> | VP PD
    (1, z)
    VP: (2, z)
Rule 12: <VP, vt1> | NP
    NP: (x+1, z)
Rule 13: <NP, noun> | AND NP
    NP: {(x.k, y)}
Rule 14: <NP, noun>
    {(x, y)}
The fraction digit to be concatenated can be a variable itself. 
The variable "k" in (x.k, y) stands for the units digit of max x. For 
example,
if x = 13, then k = 3 and x.k = 13.3
if x = 13.21, then k = 3 and x.k = 13.213
Similarly, {(x.k, y)} in Rule 13 indicates that the same operation should be
performed for each x of the set of ordered pairs {(x, y)}. Whenever the
fraction variable k appears in a rule utilized at a given word position, 
the following modification of the contents of the output work area and 
the PDS is performed: for each (x.k, y), look for ordered pairs (in 
the output work area or PDS) whose branch number is different from 
(x, y) only with regard to the units digit. For each such pair in the 
output work area or PDS, form a new ordered pair by concatenating the 
value of k as a fraction digit to the right of its branch number. Store 
the new ordered pair in the work area or PDS, respectively. 
For example, when "Mary" of "He met Mary and Jane and Karen." 
is processed with Rule 13, the fulfilled prediction NP has associated 
with it the ordered pair (3, z). Therefore, k is set to 3, and {(x.k, y)}
for the new prediction NP is changed to (3.3, z). At this point, the
search is made in the output work area and the PDS (see Fig. 13) for
ordered pairs whose branch number is different from "3" only with regard
to the units digit. Ordered pairs (1, he) and (2, met) in the output
work area satisfy the stated condition. Therefore, new ordered pairs
(1.k, he) = (1.3, he) and (2.k, met) = (2.3, met) are formed, and are
stored in the output work area. 
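The modification triggered by the fraction variable k can be sketched as below. The sketch rests on stated assumptions: branch numbers are strings, only the output work area is modelled (the PDS would be treated the same way), a pair with exactly the same branch number as x is not copied, and the names split, join, and propagate_k are hypothetical.

```python
def split(branch):
    integer, _, fraction = branch.partition(".")
    return integer, fraction


def join(integer, fraction):
    return f"{integer}.{fraction}" if fraction else integer


def propagate_k(x, stored):
    """Form the new pairs required when a rule containing (x.k, y) is used.

    k is the units digit of x. Every stored pair whose branch number differs
    from x only in the units digit is copied with k appended as a fraction digit.
    """
    x_int, x_frac = split(x)
    k = x_int[-1]
    new_pairs = []
    for branch, word in stored:
        b_int, b_frac = split(branch)
        same_level = (b_frac == x_frac and len(b_int) == len(x_int)
                      and b_int[:-1] == x_int[:-1])
        if same_level and b_int[-1] != x_int[-1]:
            new_pairs.append((join(b_int, b_frac + k), word))
    return new_pairs


work_area = [("1", "he"), ("2", "met")]

# "Mary" fulfills NP with (3, z): k = 3.
at_mary = propagate_k("3", work_area)
work_area += at_mary + [("3", "Mary")]
print(at_mary)   # [('1.3', 'he'), ('2.3', 'met')]

# "Jane" fulfills NP with (3.3, z): k = 3 again.
at_jane = propagate_k("3.3", work_area)
print(at_jane)   # [('1.33', 'he'), ('2.33', 'met')]
```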
When the second noun "Jane" is processed, again with Rule 13,
the fulfilled prediction NP has associated with it the ordered pair
(3.3, z). Therefore, k is set to 3, and {(x.k, y)} for the new prediction
NP is changed to (3.3k, z) = (3.33, z), which is stored in the PDS
with NP. The search is made for ordered pairs whose branch number is
different from 3.3 only with regard to the units digit. This time, the
[Figure 13: Contents of Output Work Area and the PDS at "Mary". Output work area: (1, he), (2, met).]
output work area and the PDS contain the ordered pairs shown in 
Fig. 14. Ordered pairs (1.3, he) and (2.3, met) satisfy the
stated condition; therefore, new ordered pairs (1.33, he) and 
(2.33, met) are formed and stored in the output work area. The 
third noun "Karen" is processed with Rule 14. Since Rule 14 does 
not contain any ordered pairs whose branch number is of the form x.k, 
[Figure 14: Contents of Output Work Area and the PDS at "Jane". Output work area: (1, he), (2, met), (3, Mary), (1.3, he), (2.3, met).]
no modification of the contents of the output work area or PDS is 
performed. After the processing of the period, the output work area 
contains the following set of ordered pairs: 
(1, he)     (2, met)     (3, Mary)
(1.3, he)   (2.3, met)   (3.3, Jane)
(1.33, he)  (2.33, met)  (3.33, Karen)
The ordered pairs are sorted first on left-adjusted decimal part, 
and then on right-adjusted integral part of the branch numbers. A 
set of ordered pairs whose branch numbers are different among them- 
selves only with regard to the units digits forms a word-pair (or 
triple, etc.). Two or more word-pairs (or word-triples, etc.) whose 
branch numbers are different from each other only with regard to 
fraction digits are in the relationship of coordination. In the 
example above, "he - met - Mary," "he - met - Jane," and 
"he - met - Karen" satisfy the latter condition. Therefore, these 
word-triples are in coordination. The set of ordered pairs shown 
above can be represented in a tree diagram of Fig. 15. It should be 
noted that tree diagrams of this form are isomorphic to sets of
ordered pairs in the following way. The number for a single-line 
branch should be interpreted in the same way as before (see Fig. 12, 
for example). The number for a double-line branch is a fraction 
digit. In a path leading from the starting point (a circle in Fig. 15) 
to a given node in the tree, the number for a double-line branch is 
concatenated to the right of fraction digits, while the number for a 
single branch is concatenated to the right of nonfraction digits. 
Therefore, "he" of "he met Jane" in Fig. 15 has the branch number 
1.3, "Jane" 3.3, and "met" of "he met Karen" 2.33, and so on. 
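The sorting and grouping of pairs with fraction digits can be sketched in Python as follows. The representation of branch numbers as strings and the function name coordinated_groups are assumptions made for the illustration; single-digit branch names are assumed as before.

```python
from collections import defaultdict

pairs = [
    ("1", "he"), ("2", "met"), ("3", "Mary"),
    ("1.3", "he"), ("2.3", "met"), ("3.3", "Jane"),
    ("1.33", "he"), ("2.33", "met"), ("3.33", "Karen"),
]


def coordinated_groups(pairs):
    """Collect word groups and order them by their fraction digits.

    Pairs that differ only in the units digit form one group; groups that
    differ only in their fraction digits are in coordination.
    """
    groups = defaultdict(list)
    for branch, word in pairs:
        integer, _, fraction = branch.partition(".")
        groups[(integer[:-1], fraction)].append((int(integer), word))
    # Sort first on the fraction part (as a left-adjusted string),
    # then on the integral part within each group.
    ordered = sorted(groups.items(), key=lambda item: (item[0][1], item[0][0]))
    return [tuple(word for _, word in sorted(members)) for _, members in ordered]


for group in coordinated_groups(pairs):
    print(" - ".join(group))
# he - met - Mary
# he - met - Jane
# he - met - Karen
```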
[Figure 15: Tree Representation of Coordinated Word Triples]
Figure 16 shows the word-triples identified in the sentence 
"Tom and Jim and Bill met Mary and Jane and Karen and liked Mary and 
Karen and disliked Jane.". Two new rules are needed for the processing 
of the sentence. 
Rule 15: <SE, noun> | AND NP VP PD
    (1, z)   (1.1, z)   (2, z)   (2.1, z)
Rule 16: <VP, vt1> | NP AND VP
    {(x, y)}   {(x.k, y)}
Figure 17 shows the word-triples identified in the sentence "A 
young and handsome prince met a beautiful and attractive girl and made 
the girl his wife." Three new rules are needed for the processing of 
this sentence.
(1, Tom)        (2, met)          (3, Mary)
(1.1123, Bill)  (2.1123, liked)   (3.1123, Karen)
(1.123, Jim)    (2.123, liked)    (3.123, Karen)
(1.23, Tom)     (2.23, liked)     (3.23, Karen)
(1.1, Jim)      (2.1, met)        (3.1, Mary)
(1.113, Bill)   (2.113, met)      (3.113, Jane)
(1.13, Jim)     (2.13, met)       (3.13, Jane)
(1.3, Tom)      (2.3, met)        (3.3, Jane)
(1.11, Bill)    (2.11, met)       (3.11, Mary)
(1.1133, Bill)  (2.1133, met)     (3.1133, Karen)
(1.133, Jim)    (2.133, met)      (3.133, Karen)
(1.33, Tom)     (2.33, met)       (3.33, Karen)
(1.112, Bill)   (2.112, liked)    (3.112, Mary)
(1.12, Jim)     (2.12, liked)     (3.12, Mary)
(1.2, Tom)      (2.2, liked)      (3.2, Mary)
(1.1122, Bill)  (2.1122, disliked)  (3.1122, Jane)
(1.122, Jim)    (2.122, disliked)   (3.122, Jane)
(1.22, Tom)     (2.22, disliked)    (3.22, Jane)
Identified Word-triples (1)
Figure 16

(1, prince)     (2, met)     (3, girl)
(11, prince)    (12, be)     (13, young)
(31, girl)      (32, be)     (33, beautiful)
(111, prince)   (112, be)    (113, handsome)
(311, girl)     (312, be)    (313, attractive)
(1.2, prince)   (2.2, made)  (3.2, girl)
(31.2, girl)    (32.2, be)   (33.2, wife)
Identified Word-triples (2)
Figure 17
Rule 17: <NP', adj> | AND NP'
    (x2, be) (x3, z)
    (x1,
Rule 18: <NP', adj> | N
    (x3, z) (x1, z)
Rule 19: <NP, art> | N
6. Conclusion 
An experimental program has been written in SNOBOL II121 for
the system of transformational analysis described above. It is still
to be seen whether the proposed system can be used for an arbitrary
transformational grammar. A study is now being made to see if, given a
transformational grammar, there is any mechanical procedure for obtaining a
predictive grammar with associated ordered pairs which will assign the
same base P-markers to a given sentence as would the original
transformational grammar.
For the purpose of structure matching in information retrieval
systems and of a crude semantic compatibility test between subject and 
complement, subject and verb, etc., the type of output described in 
Sec. 5 seems to be most practically manageable. Applications of the 
proposed system in these two fields are now being studied. 
*The author is greatly indebted to Karen Brassil who has programmed 
for the proposed system and also compiled a small sample grammar 
of English for testing the system. 

REFERENCES 

1. Bobrow, D. G., "Syntactic Analysis of English by Computer - A
Survey," AFIPS Conference Proceedings, Vol. 24, Spartan,
Baltimore (1963).

2. Robinson, J., Preliminary Codes and Rules for the Automatic
Parsing of English, Memo RM-3339-PR, The RAND Corporation, Santa
Monica, California (December 1962). 

3. Described in Hays, D., "Automatic Language-Data Processing," in 
Borko, H. (ed.), Computer Applications in the Behavioral Sciences, 
Prentice-Hall, Englewood Cliffs, N. J. (1962). 

4. Kuno, S. and Oettinger, A. G., "Multiple-path Syntactic Analyzer,"
Information Processing-62, North-Holland, Amsterdam (1963). 

5. Kuno, S. and Oettinger, A. G., "Syntactic Structure and Ambiguity
of English," AFIPS Conference Proceedings, Vol. 24, Spartan, 
Baltimore (1963). 

6. Kuno, S., "The Predictive Analyzer and a Path Elimination Technique," 
to appear in The Communication of the ACM. 

7. Robinson, J., Automatic Parsing and Fact Retrieval: A Comment on 
Grammar, Paraphrase, and Meaning, Memo RM-4005-PR, The RAND
Corporation, Santa Monica, California (February 1964). 

8. Olney, J., "An Experiment in the Use of Discourse Analysis 
Procedures for Reducing Syntactic and Semantic Ambiguity,"
reported at the 1964 Annual Meeting of the Association for Machine 
Translation and Computational Linguistics, Indiana University, 
Bloomington (July 29-30, 1964), paper in preparation. 

9. Carmody, B. T. and Jones, P. E., Jr., "Automatic Derivation of 
Constituent Sentences," ibid. 
