METHODS FOR OBTAINING CORRESPONDING PHRASE 
STRUCTURE AND DEPENDENCY GRAMMARS 
Jane J. Robinson 
International Business Machines Corporation 
Thomas J. Watson Research Center 
Yorktown Heights, New York 
ABSTRACT. Two methods are given for converting grammars belonging to different systems. One converts a simple (context-free) phrase structure grammar (SPG) into a corresponding dependency grammar (DG); the other converts a DG into a corresponding SPG. The structures assigned to a string by a source grammar will correspond systematically, though asymmetrically, to those assigned by the target grammar resulting from its conversion. Since both systems are weakly equivalent, generating exactly the CF languages, the methods facilitate experimentation with either notation in devising rules for any CF language or any CF set of strings designed to undergo subsequent transformation.
A source SPG is assumed to be of finite degree with ordered rules in which only the initial symbol is recursive. Unless the source grammars obey additional constraints, the target grammars may exhibit a peculiar property, defined as "structure sensitivity". The linguistic implications of the property are discussed, and the linguistic motivation for imposing the constraints necessary to avoid its appearance is suggested.
The author owes an especial debt to Jesse Wright of the Automata Theory and Computability Group in the Mathematical Science Department of IBM Research for many helpful discussions of theoretical problems arising in the course of this investigation.
In an article on dependency theory1 appearing in 1964 [5], Hays remarked, "Casual examination suggests there would be little difference between transformation of dependency trees and transformation of IC [immediate constituent] structures, but no definite investigation has been undertaken." Since then, a dependency grammar with transformational rules has been designed for a subset of English sentences, and preliminary results indicate that Hays' observation is correct [9]. In either case, transformations are specified in terms of labeled trees, and the number of branches and the denotations of the labels do not affect the essential operation. Since the base grammar is generally limited to the generation of context-free pre-terminal strings of "deep structure", and since context-free languages are characterizable in either dependency or phrase structure (i.e., immediate constituent) systems, neither system is clearly preferable as a base.
A linguist may find that the notation afforded by one or the other is simpler for characterizing some language, or for defining structures to be transformed, or for adapting a grammar to computer applications. A linguist may also wish to experiment with grammars of both types, or redesign transformations defined on the structures of one base grammar in order to incorporate them into a transformational grammar using a different base.
Such considerations as these motivate the present treatment of the problem of obtaining paired grammars by converting a grammar of one kind into a systematically corresponding grammar of the other which generates the same sentences and assigns comparable structures. In addition, conversion draws attention to some linguistically significant relationships that may exist unnoticed among the categories and rules of the source grammar and which may induce in the derived grammar a peculiar property of structure sensitivity, roughly analogous to context sensitivity. This property will be exhibited and discussed in the course of illustrating the method.
We begin concretely by inspecting (Fig. 1) a pair of grammars: SPG1, a simple or context-free phrase structure grammar of the kind formalized by Chomsky [2], Bar-Hillel [1], and others, and DG1, a dependency grammar of the kind formalized by Gaifman [4] and Hays [5]. Two structural diagrams, a P-tree and a D-tree drawn beneath the grammars, illustrate the structure each assigns.
The rewriting rules of SPG1 are of two kinds, those in which only categories appear, and those in which a category is rewritten as a terminal (lexical formative or word). The latter may be separated from the former and made into assignment rules,

1. For additional material on dependency theory, see Ref. 6.
SPG1 (1a)

Axiom: # S #
Rewriting Rules:
1. S → NP VP
2. VP → V NP
3. NP → D N
4. D → the
5. D → some
6. N → boys
7. N → girls
8. V → like
9. V → admire

DG1 (1b)

Axiom: * (V)
Dependency Rules:
1. V (N * N)
2. N (D *)
3. D (*)
Assignment Rules:
1. D: the, some
2. N: boys, girls
3. V: like, admire

(1c) P-tree assigned by SPG1 to "the boys like the girls", shown here in bracket form: [S [NP [D the] [N boys]] [VP [V like] [NP [D the] [N girls]]]]

(1d) D-tree assigned by DG1 to the same sentence: "like" governs "boys" and "girls"; "boys" and "girls" each govern "the".

Figure 1
thereby increasing the resemblance of SPG1 to DG1 in an obvious way. This is possible because SPG1 does not contain any mixed rewriting rules in which both categories and lexical formatives appear on the right. It can be shown that any SPG which has mixed rules, in this sense, can be converted into one which does not merely by introducing new categories, without affecting generative power. Thus there is no reason for not separating the two types of rules, and Chomsky devotes a good part of Aspects of the Theory of Syntax [3] to giving linguistic reasons for just such a separation.

Hereafter, "rewriting rules" will refer only to those like rules 1-3 in SPG1, and those like rules 4-9 will be called "assignment rules". Categories which do not appear on the left of any rewriting rules are terminal categories.
With each category of SPG1, we associate a number called its degree. (To say that a category is of degree n means that n is the fixed upper limit to the number of nodes of the shortest path leading from it to a terminal category in any structure derived from it by successive rule applications. A category may be of infinite degree. For example, if X → X X, then X is of infinite degree and so is the grammar in which it occurs, even though another rule rewrites X as a string containing only terminal categories.)1 In SPG1 the terminal categories D, N, V are of zero degree, since they are not rewritten; the categories NP and VP are of degree 1, since at least one category in their replacement is of zero degree; and the degree of S is 2, since the least degree assigned to any category on the right of the rule for rewriting S is 1. The maximum number assigned to any category in a grammar is also the degree of the grammar; therefore, the degree of SPG1 is 2.
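The degree computation lends itself to a fixed-point iteration. The Python sketch below is illustrative and not part of the paper; the representation of a grammar as a dict from each rewritten category to its list of right-hand sides is my own assumption:

```python
def degrees(rules):
    """Compute the degree of every category by fixed-point iteration.

    rules: dict mapping each rewritten (non-terminal) category to a
    list of right-hand sides, each a list of category names.
    Categories never rewritten are terminal, of degree zero.
    """
    cats = set(rules)
    for rhss in rules.values():
        for rhs in rhss:
            cats.update(rhs)
    INF = float("inf")
    # terminal categories start (and stay) at 0; the rest start at infinity
    deg = {c: (INF if c in rules else 0) for c in cats}
    changed = True
    while changed:
        changed = False
        for lhs, rhss in rules.items():
            # shortest path through each rule, worst case over the rules
            new = max(1 + min(deg[c] for c in rhs) for rhs in rhss)
            if new != deg[lhs]:
                deg[lhs] = new
                changed = True
    return deg
```

Run on SPG1's rewriting rules it returns degree 2 for S, degree 1 for NP and VP, and 0 for the terminal categories, matching the text; a grammar containing only X → X X beside X → a leaves X at infinite degree.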
An essential difference between SPG1 and DG1 now emerges more clearly. DG1 uses only terminal categories, while SPG1 uses categories of higher degree. The effects of the differences are reflected in the structure of the P-tree and D-tree. The latter has fewer branches, and this will always be the case for structures assigned to the same string by grammars belonging to the two different systems.
Even so, there is a systematic correspondence between the two trees and their labels of a kind defined by Hays [5]. Every complete subtree of the D-tree, that is, every node taken together with all of its descendants, covers a substring of the sentence that is covered by a complete subtree of the P-tree. The converse does not hold, but every complete subtree of the P-tree covers a substring that is covered by a connected subtree of the D-tree. In the example, both complete subtrees dominated by N in the D-tree correspond to the two complete subtrees of the P-tree dominated by NP, and the complete (sub)tree dominated by V corresponds to the complete (sub)tree dominated by S. However, the complete subtree dominated by VP in the P-tree corresponds to an incomplete subtree of the D-tree, which is dominated by V but includes only V and the branches to its right; so that the relationship of correspondence is asymmetrical.

1. For a more precise characterization of "degree", see Gaifman [4].
While such systematic correspondences exist between the structures assigned to all strings generated by SPG1 and DG1, this is not the general case. In general:1
1. Any context-free language can be generated by grammars of either simple phrase structure or dependency systems.
2. For any given SPG, there exists one or more DG over the same language all of whose structures correspond systematically to structures assigned to the same strings by the SPG if and only if the SPG is of finite degree.
3. For any given DG there exists one or more corresponding SPG.
4. For any given DG there exists a unique SPG of degree 1 that is strongly equivalent to it.
Gaifman [4] gives a very general method for constructing corresponding DG from any SPG of finite degree, and also a method for constructing a unique SPG of degree 1 from any DG. Here we give two methods for constructing corresponding SPG and DG that differ from Gaifman's. The first applies only to a more restricted set of SPG and leads to reduced DG, whereas Gaifman's method tends to produce DG with overlapping categories and superfluous rules that may never be used to generate any string. The second allows construction of SPG of degree greater than 1 from certain DG.2

A simplified sketch of the two methods follows. Each rewriting rule is "augmented" by starring a category on the right. Each dependency rule is augmented by assigning a numerical coefficient to each dependent. Figure 2 shows a possible augmented

1. Proofs are given by Gaifman [4]. It is assumed that the SPG is non-erasing and reduced; that is, no category is rewritten as null, and there are no superfluous categories or rules.

2. The second method was suggested by Kay's procedure for constructing P-trees from D-trees [7]. More precisely, an SPG of degree n > 1 may be derived if, in the DG, some categories govern two or more dependents, and the left- or rightmost dependent itself governs dependents.
form of SPG1 on the left and a possible augmented form of DG1 on the right. (Detailed consideration of the problem of augmentation will follow the sketch of the operations.) These augmentations were deliberately chosen so that conversion of either grammar uniquely produces the other. Different augmentations produce different results, although structures of the original grammar and of the grammar derived from it still correspond.
SPG1               TS       IRL                DG1

S  → NP VP*        S:V²     V² → N¹ V¹*        V (2N * 1N)
VP → V* NP         VP:V¹    V¹ → V⁰* N¹        N (1D *)
NP → D N*          NP:N¹    N¹ → D⁰ N⁰*        D (*)

Figure 2
The columns TS and IRL in Figure 2 represent a Table of Substitutes and an Intermediate Rule List. In the conversion of SPG1, the TS is constructed, equating each category on the left of a rule with a superscripted terminal category. The numerical superscript (hereafter called the exponent) equals the number of rules traced through when tracing by starred categories before a starred terminal category is reached, and expresses the distance between the terminal category and the category for which it is a substitute. In the IRL, the categories occurring in each rule of the SPG are replaced by their substitutes from the TS. Taking the first IRL rule, construction of a dependency rule is begun by replacing the arrow with parentheses enclosing the substitute categories on the right. Thus from the first IRL rule in Figure 2 we obtain V²(N¹ V¹*). The construction of a dependency rule is continued so long as the exponent of the starred category Xˢ* is greater than zero. The next step is to search the IRL for a rule with Xˢ on the left. The categories on the right of the new rule are inserted in the parentheses of the D rule under construction, in the position of Xˢ*, but no new parentheses are added. When the starred occurrence is X⁰*, it is replaced by *, all exponents are erased, and the dependency rule is complete. The process is repeated for new dependency rules until all IRL rules are exhausted.
We must add rules to the constructed DG for any unstarred terminal category of the IRL. The added rules assign no dependents and are of the form D(*). The assignment rules are simply transferred, and since V is the substitute for the axiom category S, it is taken as the axiom for the DG.
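For the simple case sketched here (each rule stars exactly one category, and no category is starred in rules rewriting different categories), the TS, IRL, and expansion steps can be written out as code. The following Python is my reconstruction, not the paper's; the tuple-based representation of augmented rules is an assumption:

```python
def spg_to_dg(rules):
    """Sketch of the SPG-to-DG conversion for the simple case.

    rules: list of (lhs, rhs) pairs; rhs is a list of (category,
    starred) pairs, with exactly one starred category per rule.
    Returns (axiom, dependency_rules); each dependency rule is
    (governor, items), where "*" marks the governor's position.
    """
    # Table of Substitutes: trace by starred categories to a terminal;
    # the exponent counts the rules traced through.
    star_of = {lhs: next(c for c, s in rhs if s) for lhs, rhs in rules}
    ts = {}
    for lhs in star_of:
        cat, exp = lhs, 0
        while cat in star_of:
            cat, exp = star_of[cat], exp + 1
        ts[lhs] = (cat, exp)

    def sub(c):                       # substitute, or the category itself
        return ts.get(c, (c, 0))      # with exponent 0 if terminal

    # Intermediate Rule List: every category replaced by its substitute.
    irl = [(sub(lhs), [sub(c) + (s,) for c, s in rhs]) for lhs, rhs in rules]

    # Expand each unmarked IRL rule: replace the starred substitute by
    # the right side of the rule rewriting it, until its exponent is 0.
    used, dg = set(), []
    for i, ((gov, _), rhs) in enumerate(irl):
        if i in used:
            continue
        used.add(i)
        body = list(rhs)
        while True:
            j = next(k for k, (_, _, s) in enumerate(body) if s)
            name, exp, _ = body[j]
            if exp == 0:
                body[j] = ("*", 0, False)   # erase the category, keep the *
                break
            k = next(m for m, r in enumerate(irl) if r[0] == (name, exp))
            used.add(k)
            body[j:j + 1] = irl[k][1]
        dg.append((gov, [item[0] for item in body]))

    # Unstarred terminal categories get rules with no dependents.
    governors = {g for g, _ in dg}
    for _, rhs in irl:
        for name, exp, starred in rhs:
            if exp == 0 and not starred and name not in governors:
                dg.append((name, ["*"]))
                governors.add(name)

    axiom = ts[rules[0][0]][0]        # substitute for the SPG axiom
    return axiom, dg
```

Applied to the augmented SPG1 of Figure 2, the sketch yields the axiom V and the rules V(N * N), N(D *), and D(*) of DG1.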
Going in the other direction from an augmented DG1 to SPG1, we first assign an exponent s to each category, where s equals the largest coefficient of any dependent of that category. If a category is not assigned dependents, its exponent is zero. The first rule of augmented DG1 now appears as V²(2N¹ * 1N¹). From this rule two rules are constructed for the IRL; that is, the number of IRL rules constructed from each rule of DG1 will be equal to the exponent of the governor. The first rule so obtained writes the governor with its exponent on the left of the arrow. All of its dependents whose coefficients equal that of the exponent are written in order on the right, and the governing category with its exponent decreased by 1 is written with the *. Thus we obtain V² → N¹ V¹*. The second rule is obtained in a similar fashion, with the exponent of the governing category diminished by 1, yielding V¹ → V⁰* N¹. The process is repeated until an IRL rule rewriting some category with exponent equal to 1 is used, after which the next DG rule is processed, and so on until all DG rules are exhausted.

At this point, the *'s may be erased and, except for category labels, the IRL is exactly equivalent to the rewriting rules of the original unaugmented SPG. The only function the TS serves is to reassign labels. Assignment rules are transferred and the substitute for the axiom of DG1 is added.1
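The reverse construction is simpler to mechanize, since each augmented DG rule expands independently. Again, the Python sketch below and its data representation are mine, not the paper's:

```python
def dg_to_spg(dg_rules):
    """Sketch of the DG-to-SPG conversion.

    dg_rules: list of (governor, items), where an item is "*" (the
    governor's position) or a (dependent, coefficient) pair.
    Returns IRL-style rewriting rules over (category, exponent)
    pairs; relabelling them (e.g. (V, 2) -> S) recovers the SPG.
    """
    # A governor's exponent is the largest coefficient of its dependents.
    exp_of = {gov: max((it[1] for it in items if it != "*"), default=0)
              for gov, items in dg_rules}
    spg = []
    for gov, items in dg_rules:
        # One IRL rule per exponent value, from the largest down to 1.
        for e in range(exp_of[gov], 0, -1):
            rhs = []
            for it in items:
                if it == "*":
                    rhs.append((gov, e - 1))       # reduced governor (starred)
                elif it[1] == e:                   # dependents at this level
                    rhs.append((it[0], exp_of.get(it[0], 0)))
            spg.append(((gov, e), rhs))
    return spg
```

On the augmented DG1 it reproduces the IRL of Figure 2 (V² → N¹ V¹, V¹ → V⁰ N¹, N¹ → D⁰ N⁰, with stars dropped). On the faulty augmentation C(1A 2B * 1D) discussed below it yields C² → B C¹ and C¹ → A C⁰ D, which generate B A C D rather than A B C D.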
SPG1 and DG1 are very simple, with no embeddings and no optional rules. More complicated grammars give rise to problems of augmentation, especially for SPG. Even SPG1 poses a problem. Assume it had been augmented by starring NP in rule 1 and in rule 2. In that case, the same substitute, N², is assigned to both S and VP, and the procedure produces a DG that is not even weakly equivalent to the original. Clearly some constraints must be imposed on augmentation and provision made for grammars in which it may not be possible to avoid starring a category more than once.
Similarly, assume that some DG has a rule of the form C (A B * D), and that this is augmented as C (1A 2B * 1D). The IRL rules are:

C² → B C¹*
C¹ → A C⁰* D.

But now the rules generate the sequence B A C D, which was not generated by the original rule, and do not generate A B C D, which was. This is remedied by requiring that if a coefficient n has been assigned to a dependent, no higher number is assigned to any dependent which intervenes between it and the *. We will also require that at least one dependent be preceded by 1, and if any dependent is preceded by n > 1, there must be at least one dependent preceded by n - 1. This is not crucial, but it avoids setting up unnecessary, single-branching categories in the derived SPG. Note that if all dependents in any rule are preceded by 1, which is the same as not augmenting the DG at all, the resulting SPG will be of degree 1; that is, each rule of the grammar will contain at least one terminal category on the right. This is equivalent to Gaifman's procedure [4].

1. Although grammars with only one axiom are illustrated, grammars with more than one axiom can obviously be handled as well.
Augmentation of SPG is the more difficult case. Primarily it is the problem of constructing the TS in a finite number of steps. For example, if for S → NP VP S the S is starred on the right, an infinite loop is created immediately. This is easy to avoid when considering a single rule or a small set of rules, but we do not know in general whether some series of rule augmentations may not lead to the same situation. Gaifman's solution is to require that the marked category be of lesser degree than the category on the left, but this not only leads to a proliferation of categories in the derived DG whenever the SPG has more than one rewriting rule for any of its categories, it prevents us from deriving the simplest corresponding DG in some cases.1 On the other hand, the method employed here will not work unless some restrictions are imposed on augmentation which also imply restrictions on the form of the SPG in addition to the requirement of finite degree. It is not clear what restrictions are minimally necessary, but it is sufficient to require, in addition to finiteness of degree, that the rules of the SPG be ordered, so that in a developing derivation, if rule n has been applied, no rule m, m < n, need be applied thereafter. This is too restrictive, and does not allow for full recursion. We may, however, allow a dummy symbol, S', to appear in any rule rewriting some X as a string containing some Y, S' ≠ Y ≠ X, where S' is replaced after one application of the set of rules by # S #, the axiom, so that the rules then reapply in linear order.
Any SPG of finite degree with ordered rules providing for recursive application in the manner specified can be converted to a corresponding DG by the method given here. The base component proposed by Chomsky2 [3] for transformational grammar is an SPG of this form.

1. "Simplest" with respect to process, number of categories, and freedom from the property of structure sensitivity. See p. 14 ff.

2. Chomsky does not explicitly require that if S' appears in rewriting X, some Y (S' ≠ Y ≠ X) appear in the rule also, but implicitly observes the restriction. A new grammar for English by Rosenbaum [10] contains a rule NP → NP S' which does not observe it. This is the only case known to me of an actual grammar for some "real" language that violates this restriction, and Rosenbaum does not claim any strong theoretical motivation for the rule in question. Cf. also Lees [8], and the MITRE grammar [11].
We turn now to more detailed consideration of augmenting and converting SPG subject to the above constraints. While the constraints ensure that the SPG can be workably converted, some linguistically significant problems arise if we consider how to derive a simple DG, and it will be shown that under some conditions the derived DG will exhibit a feature not heretofore considered in the literature, a feature of "structure sensitivity".
Method 1. Conversion of SPG to DG
Step 1. Augmentation
All rules of the SPG which rewrite a given category are conflated and written as one schematized rule, with square brackets enclosing optionally omitted categories and braces enclosing lists of categories from which a single choice is made.1 Thus Z → [W] {X, Y} conflates four rules, and will, in any given application, rewrite Z as either WX or WY or X or Y.
Beginning with the first rule rewriting some X and proceeding in rule order, star occurrences of categories, excluding the dummy symbol S', on the right in such a way that one and only one starred category Y*, where Y ≠ X, will occur in any application of the rule.2 It follows that no bracketed (omissible) occurrence may be starred. If more than one category is starred, the schematized rule must be separated into as many rules as there are starred categories, with one starred category appearing in each. For example, X → {A* C, D* F} must be separated into X → A* C and X → D* F. This is a partial undoing of the conflation, but for linguistic reasons all ways of rewriting a category will usually have some element in common and some conflation will usually be preserved.
1. Square brackets are used here rather than the customary parentheses to avoid confusion with the parentheses used in the derived DG. More exactly, [ ] and { } enclose lists of strings of categories. In the case of [ ] the list may consist of one member, and in both cases the strings may consist of one item.

2. Since the SPG must be of finite degree, no rule will rewrite X as a string of X's in any application, and some other category of degree less than X will be available for starring in all cases.
If the simplest DG corresponding to the SPG is desired, then it is preferable to augment the SPG in such a way that
a) no category occurs both starred and non-starred on the right, and
b) no category is starred on the right of two or more rules which rewrite different categories.
But it is not always possible to observe these policies, and in that case, additional augmentation is necessary in order to distinguish among occurrences of X as the marked constituent in the rule rewriting Y, X as the marked constituent in the rule rewriting Z, Z ≠ Y, and X as an unmarked constituent of any category.
Assume some category X is starred in two or more rules which rewrite different categories,1 say Y and Z, and also occurs unstarred on the right in some rules. The two X*'s are assigned different subscripts to avoid assigning the same substitutes for Y and Z with the consequent loss of essential information. We now have three varieties of X, namely: X, X1, and X2, where X is the unstarred variety, X1 is the marked constituent of Y, and X2 of Z. (If X does not occur unstarred, only X and X1 are needed.) If X is not a terminal category, there is a rule rewriting X. Then we must add, beneath that rule, rules for rewriting X1 and X2. If X → U* W, we add X1 → U1* W and X2 → U2* W. Now, if U is not a terminal category, we add rules for rewriting U1 and U2 beneath the rule rewriting U. Thus the process of adding rules is iterative, but it will eventually end when, in some lower rule, a terminal category is starred. Note that in some cases sub-subscripts are needed. For example:

1. Z → ... P1* ...
2. X → ... R1* ...
3. Y → ... R2* ...
4. R → ... P2* ...
5. P → ... A* ...
1. X* may occur more than once in rules rewriting the same category, Y. E.g., in Y → {X Y, Y X, Z}, the only possible starring is Y → {X* Y, Y X*, Z*}, which produces three rules for Y, in two of which X* occurs.
Proceeding down the rules, we see first that P1* occurs and, scanning down the left, that P is not a terminal category and that there is no rule rewriting P1. Therefore,

beneath  P → ... A* ...
we add   P1 → ... A1* ....

In the second rule, R1* occurs, is non-terminal, and requires a rule, so

beneath  R → ... P2* ...
we add   R1 → ... P21* ....

R2* occurs in the third rule, under the same conditions, and

beneath  R → ... P2* ...
we add   R2 → ... P22* ....

P2* occurs in the fourth rule, under the same conditions, and

beneath  P → ... A* ...
we add   P2 → ... A2* ....

Our rules are now:

1. Z → ... P1* ...
2. X → ... R1* ...
3. Y → ... R2* ...
4. R → ... P2* ...
5. R1 → ... P21* ...
6. R2 → ... P22* ...
7. P → ... A* ...
8. P1 → ... A1* ...
9. P2 → ... A2* ...

and we are looking at rule 5, where P21* requires us to add

10. P21 → ... A21* ....

Rule 6 requires us to add

11. P22 → ... A22* ....

But in rule 7, A* requires no additions, because A is not subscripted, and the subscripted A's in rules 8-11 require no additions because A's are terminal categories.
As a result, no Xi* occurs on the right of two or more 
rules which rewrite different categories. 
This process and the remaining steps will be illustrated using SPG2. In order to show the effects of different choices in augmentation, SPG2 will be augmented in a way that deliberately violates the policies advocated for starring (but not the restrictions).
SPG2
Axiom: S
Rewriting Rules:
1. S → NP* VP
2. VP → V NP* [NP]
3. NP → D N*

We will assume assignments are the same as for SPG1 with the additional assignments of "send" and "give" to V, and "books" and "flowers" to N.1
Note that, in this augmentation, NP is starred twice and also occurs non-starred on the right. Subscripting and duplication are required, resulting in

1. S → NP1* VP
2. VP → V NP2* [NP]
3. NP → D N*
4. NP1 → D N1*
5. NP2 → D N2*
Step 2. Establish a table of substitutes (TS) of 2 columns and n rows, where n equals the number of rules in the SPG (after step 1). List categories on the left in column 1, and starred categories on the right in column 2, in order. Eliminate any duplicate rows. (Cf. p. 9, footnote 1.)

At the end of step 2, applied to SPG2, the result is

TS
S    NP1
VP   NP2
NP   N
NP1  N1
NP2  N2

1. In a transformational grammar, lexical assignment rules of more complex form than that given here would presumably block the generation of un-English sentences.
Step 3. Starting with k = 1, try to match the category in row k, column 2 (k, 2) with a category occurring in column 1 of a lower row. If a match is found on row m, check to see if a match also occurs on m + 1. (This will be the case if the SPG contains more than one rewriting rule for the category in (k, 2).) If it does, mark m as a branching point, insert a duplicate of row k beneath row k (the duplicate will be row k + 1) and follow separate branches for substitutes for (k, 1) and (k + 1, 1). Replace the category in (k, 2) with the category in (m, 2) and repeat the search on the remaining lower rows. When the search is exhausted, assign an exponent s to the last category obtained (the final substitute) in column 2, where s equals the number of matches plus 1. Increment k and repeat until every category in column 1 has a substitute that does not appear in column 1. At the end of step 3, we obtain a unique substitute for every rewritten category of SPG2.1

TS
S    N1²
VP   N2²
NP   N¹
NP1  N1¹
NP2  N2¹
Step 4. Construct an Intermediate Rule List (IRL) by replacing each category in the augmented SPG rules with its substitutes from the TS. If a category has no substitute, superscript it with a zero. If Xi has more than one substitute, include them in braces wherever Xi appears on the right. If a category X occurring on the left has more than one substitute, provide a separate rewriting rule Yˢ → ... Yˢ⁻¹* ... for each substitute Yˢ.2 If a non-starred category on the right has more than one substitute, include all substitutes as braced options. At the end of step 4, we obtain

1. N1² → N1¹* N2²
2. N2² → V⁰ N2¹* [N¹]
3. N¹ → D⁰ N⁰*
4. N1¹ → D⁰ N1⁰*
5. N2¹ → D⁰ N2⁰*

1. In cases, not illustrated here, where several substitutes are found because several rules rewrite some Xi, each substitute will be uniquely assigned to Xi.

2. E.g., assume X → ... A* ..., X → ... Y* ..., Y → ... B* ..., and Y → ... C* ..., so that the substitutes for X are A¹, B², and C², and the substitutes for Y are B¹ and C¹. Then the IRL is:
A¹ → ... A⁰* ...
B² → ... B¹* ...
C² → ... C¹* ...
B¹ → ... B⁰* ...
C¹ → ... C⁰* ...
Step 5. Take the first unmarked IRL rule, which rewrites some Xₙˢ. Set s as a counter, and write Xₙˢ followed by a pair of parentheses enclosing the string of categories on the right of the IRL rule. Note that the string will contain an Xₙˢ⁻¹*.

Step 6. Mark the previously processed IRL rule, decrease s by 1, and test for s = 0. If so, go to step 7. Otherwise, find the rule which rewrites the new Xₙˢ. Replace the starred Xₙˢ* in the current D rule with the categories on the right of the IRL rule. Repeat step 6.
Step 7. Test to see if any unmarked rules remain in the IRL. If they do, return to step 5. If not,
a. Erase starred categories, leaving only the star.
b. Add rules of the form X(*) for any non-starred terminal category of the IRL.
c. Add as axiom(s) of the DG the substitute(s) of the axiom(s) of the SPG.
d. Add the assignment rules of the SPG. If a category is subscripted in the DG, the assignments are duplicated for each subscripted variant.
e. Erase exponents.
At the end of step 7, we obtain the derived DG2.

Axiom: * (N1)
Dependency Rules:
1. N1 (D * N2)
2. N2 (V D * [N])
3. N (D *)
4. V (*)
5. D (*)
If we interpret each distinct Xn as a separate category, then the same list of words is assigned to the three categories, N, N1, and N2, by the assignment rules. If we interpret them as the same category, then the subscripts distinguish the different substructures of a sentence in which the N's occur. Each N governs a D directly on the left, but if it is the axiom it is required to govern another N on the right also. If N is not the axiom but is governed directly by the axiom, it may govern another N on the right, and is required to govern a V on the left. If it is neither the axiom nor a direct dependent of the axiom, it governs nothing but the D. We may not erase the subscripts and write a single conflated rule N ([V] D * [N]), for then strings not generated by SPG2 would be generated. (For example, the rule would generate an infinite set of strings, (D N)ⁿ.) In such circumstances we say that the DG is structure sensitive.
Definition 1. A DG is structure sensitive if
a. the set of terminals assigned to one category is identical to the set assigned to any other category, and/or
b. any rule restricts the choice of dependents a category may govern to a subset of the ordered dependents it is permitted to govern in some other rule.

Note that a DG containing the rules A (*) and A (B * C) is structure sensitive by this definition. Here, too, conflation is impossible, since A ([B] * [C]) allows A to govern B without governing C.
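Definition 1 can be checked mechanically. The sketch below is mine, not the paper's; it simplifies clause (b) by ignoring the governor's position among the dependents and testing only whether, for the same governing category, one rule's dependents form an ordered subsequence of another's:

```python
def structure_sensitive(dep_rules, assignments):
    """Check Definition 1 on a DG.

    dep_rules: list of (governor, dependents), dependents being the
    ordered list of dependent categories with the * omitted.
    assignments: dict mapping each category to its set of terminals.
    """
    cats = list(assignments)
    for i, a in enumerate(cats):            # clause (a): identical word sets
        for b in cats[i + 1:]:
            if assignments[a] == assignments[b]:
                return True

    def subseq(xs, ys):                     # xs an ordered subsequence of ys
        it = iter(ys)
        return all(x in it for x in xs)

    for g1, d1 in dep_rules:                # clause (b): one rule's dependents
        for g2, d2 in dep_rules:            # a proper subset of another's
            if g1 == g2 and d1 != d2 and subseq(d1, d2):
                return True
    return False
```

It flags the pair A(*), A(B * C) under clause (b), and the derived DG2 under clause (a), since N, N1, and N2 receive identical word lists; DG1 with its three distinct word lists passes as structure free.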
This structure-sensitive feature of some DG apparently serves functions like those served by the context-sensitive feature of context-sensitive phrase structure grammars in placing restrictions on the strings generated, but there seems to be no mention of it in the literature. Its character may be masked by the freedom to set up categories and assign the same terminals to them.1 For example, one may substitute simple symbols X and Y for the complex symbols N1 and N2, and obtain the rules

X (D * Y)
Y (V D * [N])
N (D *)

1. Gaifman's method [4] for converting SPG to DG makes great use of this freedom.
In this case, only the assignment rules, assigning exactly the same set of terminals to X, Y, and N, explicitly show the structure sensitivity, although the fact that N governs a subset of the dependents of X and Y is significant.

Such arbitrariness in assigning symbols raises the significant linguistic problem of criteria for establishing categories, which is too large an issue to be discussed here. The problem is no less relevant to SPG.
Definition 2. An SPG is structure sensitive if
a. the set of terminals assigned to one terminal category is identical to the set assigned to any other, and/or
b. any rewriting rule does not contain, on the right, a unique category (other than the axiom) which occurs only once in that rule and appears on the right in no other rule.
The linguistic implications of the property may be clarified by considering two sets of rules, one for the artificial language aⁿbaⁿ and one for a fragment of English.

The language aⁿbaⁿ is generated by the rules S → A S A, S → B, A → a, and B → b. The first rule "does not contain a unique category (other than the axiom) which occurs only once", since A occurs twice. One of the A's must be starred in converting to a DG, and the DG will distinguish two categories of a, a left A and a right A. Note that the same language is generated if the first rule is S → A S and a transformational rule X B ⇒ X B X is added. In this case, each rewriting rule contains a unique category other than the axiom, and a structure-free set of dependency rules is obtainable from them.
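The claim about aⁿbaⁿ is easy to confirm by brute force. The enumerator below is a sketch of mine, not part of the paper; it bounds the derivation depth so that the enumeration terminates:

```python
import itertools

def generate(rules, symbol, depth):
    """Enumerate the terminal strings derivable from `symbol`,
    cutting every branch of the derivation off after `depth` steps."""
    if symbol not in rules:          # terminal symbol
        yield symbol
        return
    if depth == 0:
        return
    for rhs in rules[symbol]:
        expansions = [list(generate(rules, s, depth - 1)) for s in rhs]
        for combo in itertools.product(*expansions):
            yield "".join(combo)

# S -> A S A | B,  A -> a,  B -> b
rules = {"S": [["A", "S", "A"], ["B"]], "A": [["a"]], "B": [["b"]]}
strings = sorted(set(generate(rules, "S", 6)), key=len)
```

Every string produced up to the depth bound has the form aⁿbaⁿ: b, aba, aabaa, and so on.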
A less artificial but still extreme case of structure sensitivity is that in which two or more rules are rewritten in exactly the same way. Assume that an SPG for English has the following rewriting rules:

S → X VP
VP → V Y
X → D N
Y → D N
Here the distinction drawn between a sequence D N that is a(n) X (i.e., derives from X) and a D N that is a Y reflects the functional notions of subject and object, but obscures the categorial notion is a noun phrase. Chomsky [3, pp. 68-72] argues that it is confusing and redundant to assign categorial status to both notions, since the purely relational character of the functional notion is implicit in the rewriting rules S → NP VP and VP → V [NP]; that is, the notion "subject of a sentence" refers to the NP under the immediate domination of S, and "object of a verb" refers to the NP under the immediate domination of VP. Chomsky also shows that
for sentences like "John was persuaded by Bill to leave," where "John" is simultaneously object-of "persuade" and subject-of a transformed embedded sentence "John leave," it is impossible to represent such functional notions by categorial assignments, and adds that "Examples of this sort, of course, provide the primary motivation and empirical justification for the theory of transformational grammar." [p. 70]
Whether it is possible or desirable to require that SPG components of transformational grammars for natural languages be structure free is an open question. Presumably, it is desirable if it is possible, since the least powerful, most restricted grammar -- the tightest fit -- is to be preferred. Moreover, inspection of proposed grammars, for English at least, indicates that most of their rules do contain unique categories on the right.
Returning now to SPG2, we see that it is structure free, since every elementary rewriting rule for every category X_i contains a single occurrence of at least one category Y_i on the right which does not appear elsewhere on the right and is not the axiom. Under these conditions we will say that Y_i is a head for category X_i and call all such Y's head categories. Examination shows that the structure-sensitive property of DG2 arose from the choices made in augmenting SPG2. If only head categories are marked, a structure-free DG similar to DG1 results, in which all signs of augmentation can be erased without altering its generative capacity.
Intuitively, it seems reasonable to regard heads as sources of a "governor" in any string derivable from the category in whose rewriting rules they appear. This does not mean that the string is to be considered endocentric in the strong sense of requiring that the governor be substitutable for the entire string without loss of grammaticality; the objection sometimes raised, that dependency theory forces a purely endocentric analysis of a language, is based on a failure to distinguish between "head of" and "substitutable for". It appears truer to say that dependency analysis assumes that one phrase type is distinguished from another primarily by the singular presence of some category in it rather than by the co-occurrence and order of categories in it.
Incidentally, aside from the problem of obtaining a struc- 
ture-free DG from an SPG, avoidance of structure sensitivity may 
be a criterion for assigning government when one is analyzing a 
language in terms of dependency theory. In English, for example, 
the choice has generally wavered between noun and verb as candi- 
dates for sentence government. Since every elementary sentence 
contains one verb but may contain several nouns, choosing the noun 
forces a structure-sensitive DG. 
In converting a DG to an SPG, no requirement of intrinsic ordering needs to be imposed on the dependency rules, as it was on the rewriting rules of SPG. Dependency rules may always be partially ordered by starting with the rule (or rules) for the axiom(s). Call these "level zero rules". Then level 1 rules are those which assign dependents to the axiom, level 2 those which assign dependents to categories which make their first appearance in level 1 rules, and so on. To insure eventual termination, however, it is required that if X occurs anywhere in a level n rule, and is a dependent in a level m rule, m > n, then its choice as dependent in the level m rule is optional or else the governor in the level m rule is optionally chosen at some point. Otherwise no constraints on recursion are necessary, and any category may be reintroduced at a lower level.
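The level ordering just described is easy to compute mechanically. In the sketch below (my own formulation, not the paper's), a level is attached to each category rather than to each rule; a rule's level is then the level of its governor:

```python
# Partially order the categories of a DG by level: axioms are level 0,
# and a category is level n+1 if it makes its first appearance as a
# dependent in a rule whose governor is level n.

def levels(dependents, axioms):
    level = {a: 0 for a in axioms}
    frontier, n = list(axioms), 0
    while frontier:
        new = []
        for gov in frontier:
            for dep in dependents.get(gov, []):
                if dep not in level:       # first appearance
                    level[dep] = n + 1
                    new.append(dep)
        frontier, n = new, n + 1
    return level

# Toy DG: A is the axiom; D reintroduces A at a lower level, which is
# permitted so long as the recursion is optional at some point.
dg = {"A": ["B", "C"], "B": [], "C": ["D"], "D": ["A"]}
print(levels(dg, ["A"]))   # {'A': 0, 'B': 1, 'C': 1, 'D': 2}
```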
By contrast, the problem of conflation arises. It has been shown that rules like A (*) and A (B * C), occurring in a structure-sensitive DG, cannot be conflated. Conflations of A (B *) with A (C * D) and of A (* B) with A (B *) are also impossible, although structure sensitivity as defined above is not involved. If the rules are not conflatable because the DG is structure sensitive, then the SPG may also be structure sensitive. If they are not conflatable solely because of disparate number or position (left or right) with respect to the governor, the SPG will be structure free.
The process will be illustrated first by applying it to a DG 
whose rules cannot be fully conflated both because some are struc- 
ture sensitive and because some assign disparate numbers or po- 
sitions to dependents of a category. 
Method 2. Conversion of DG to SPG

Step 1. Augmentation.

Definition: A dependent element in a dependency rule is any braced or bracketed string not included in larger braces or brackets, and any single unbraced, unbracketed occurrence other than * occurring within parentheses. E.g., in A ({B D [C]} * E [F]) there are three dependent elements: {B D [C]}, E, and [F].

a. Precede each dependent element with a coefficient n ≥ 1, so that for any n > 1, at least one element is preceded by n - 1, and for any m > n, m does not intervene between n and *. Thus:

A (2B * 1C)
A (2B 1C *)
but not A (2B * 2C)
and not A (1B 2C *).

b. Assign an exponent s to the governor in each rule, where s equals the largest coefficient n of any dependent. (For rules of the form X (*), s is zero.)

c. If a category X is assigned different exponents for different rules in which it occurs as governor, associate a distinct subscript with each distinct exponent.

d. Replace every occurrence of X as dependent by X^s. If X has been subscripted, include each variant in braces (e.g., {X_1^s' X_2^s''}).
Step 2. Test for unmarked (unprocessed) rules. If none remain, go to step 4. Otherwise take the first unmarked rule, in which some X_i^(s_i) is governor, and set a counter S = s_i.

Step 3. If S = 0, mark the rule and repeat step 2. If S ≠ 0, construct a rule of an Intermediate Rule List (IRL) of the form X_i^S → .... On the right of the arrow duplicate all dependent elements whose coefficients n = S. Include the *, and precede it with X_i^(S-1). Decrease S by 1, and repeat step 3.

Step 4. Establish a Table of Substitutes (TS), assigning a unique symbol to every distinct X_n^S for which S ≠ 0.

Step 5. Assign the substitute or substitutes for the axioms of the DG as axioms of the SPG.

Step 6. Rewrite the IRL, using substitutes from the TS, deleting the exponents and subscripts from any X_n^0, and omitting the *.

Step 7. Transfer the assignment rules.
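For dependency rules without braces or brackets, the bookkeeping of steps 1-6 can be sketched mechanically. The sketch below is my own: the encoding of each rule as an ordered list of (coefficient, category, side) triples is an assumption, and it handles only one rule per category, so the subscripting of step 1c is omitted. It illustrates the method rather than implementing it fully:

```python
# Sketch of Method 2 for a DG in which each category has exactly one
# rule and no braced or bracketed dependent elements.

def convert(dg, axioms):
    # Step 1b: exponent s = largest coefficient (zero for X (*)).
    exp = {g: max((n for n, _, _ in deps), default=0)
           for g, deps in dg.items()}

    # Steps 2-3: build the Intermediate Rule List (IRL).  The governor,
    # with exponent S-1, stands at the position of the *.
    irl = []
    for g, deps in dg.items():
        for s in range(exp[g], 0, -1):
            left = [(c, exp[c]) for n, c, side in deps
                    if n == s and side == "L"]
            right = [(c, exp[c]) for n, c, side in deps
                     if n == s and side == "R"]
            irl.append(((g, s), left + [(g, s - 1)] + right))

    # Step 4: a Table of Substitutes for every X^S with S != 0.
    names = iter("SZYXWVUT")
    ts = {(g, s): next(names) for g in dg for s in range(exp[g], 0, -1)}

    # Step 6: rewrite the IRL; any X^0 loses its exponent, becoming X.
    sub = lambda cat_s: ts[cat_s] if cat_s[1] else cat_s[0]
    spg = [(sub(lhs), [sub(p) for p in rhs]) for lhs, rhs in irl]

    # Step 5: the substitutes for the DG axioms are the SPG axioms.
    return [ts[(a, exp[a])] for a in axioms], spg

# Toy DG: axiom A;  A (2B * 1C);  B (*);  C (* 1D);  D (*)
dg = {"A": [(2, "B", "L"), (1, "C", "R")],
      "B": [], "C": [(1, "D", "R")], "D": []}
axioms, spg = convert(dg, ["A"])
print(axioms)   # ['S']
print(spg)      # [('S', ['B', 'Z']), ('Z', ['A', 'Y']), ('Y', ['C', 'D'])]
```

Carrying out the same bookkeeping by hand on DG3 reproduces the IRL and Table of Substitutes shown in the illustration that follows.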
This method will be illustrated using an abstract DG3, without assignment rules. DG3 is structure sensitive, since there are rules which restrict the choices of dependents of categories A and C to subsets of the ordered categories they are permitted to govern in other rules.
DG3

Axiom: * (A)

Dependency Rules:¹

1. A (1B *)
2. A (2{B D, C} 1B * 3E)
3. B (1E *)
4. B (* 1F)
5. C (1G * 1F)
6. C (*)
7. D (*)
8. E (* 1G)
9. F (1A *)
10. G (*)

The final augmented form after step 1 is:

Axiom: * ({A_1^1 A_2^3})

Dependency Rules:

1. A_1^1 (1B^1 *)
2. A_2^3 (2{B^1 D^0, C_1^1, C_2^0} 1B^1 * 3E^1)
3. B^1 (1E^1 *)
4. B^1 (* 1F^1)
5. C_1^1 (1G^0 * 1F^1)
6. C_2^0 (*)
7. D^0 (*)
8. E^1 (* 1G^0)
9. F^1 (1{A_1^1 A_2^3} *)
10. G^0 (*)

¹Note the impossibility of further conflation, and that every category occurring as dependent in a rule of level m and also occurring in some other rule of level n < m is always optionally chosen as a dependent. That is, there is the possibility of avoiding its reintroduction. Thus A, the axiom, is reintroduced in rule 9, but its governor, F, is optionally chosen as a dependent. Otherwise no derivation could terminate.
At the end of the iterations on steps 2 and 3, the IRL is:

1. A_1^1 → B^1 A_1^0
2. A_2^3 → A_2^2 E^1
3. A_2^2 → {B^1 D^0, C_1^1, C_2^0} A_2^1
4. A_2^1 → B^1 A_2^0
5. B^1 → E^1 B^0
6. B^1 → B^0 F^1
7. C_1^1 → G^0 C_1^0 F^1
8. E^1 → E^0 G^0
A possible TS is:

S : A_1^1
Z : A_2^3
Y : A_2^2
X : A_2^1
W : B^1
V : C_1^1
U : E^1
T : F^1
SPG3 is:

Axiom(s): {S Z}

Rewriting Rules:

1. S → W A
2. Z → Y U
3. Y → {W D, V, C} X
4. X → W A
5. W → U B
6. W → B T
7. V → G C T
8. U → E G
SPG3 is structure sensitive, since two categories, S and X, are rewritten the same way. This results from the fact that, in DG3, A has two sets of dependents and one set is included in the other. The structure-sensitive rules for C in DG3 produced no additional structure-sensitive rules in SPG3, however, since one of them, C (*), was not processed in step 3 and did not form part of the IRL.
SPG3 may be rewritten as a structure-free grammar in a purely ad hoc way by eliminating the rewriting rule for X and substituting S for the occurrence of X on the right. More generally, it is reasonable to require of the original DG that its rules be designed so that it is possible to write a single rule (schema) assigning dependents to any given category. This requirement is reasonable if the DG is a base component of a transformational grammar whose transformations take care of the eventual order of elements in a sentence. The primary function of the DG's dependency rules is, in this case, only that of listing co-occurring categories in the dependency relations in some canonical order. In DG3, A always occurs with B as dependent. Whenever E is a dependent of A, then either a second B or a C is also a dependent, and if a second B is a dependent of A, D is also. This set of conditions is summed up by

A (B * [{B D, C} E]).

Similarly, B always occurs with E or F as dependents, i.e.,

B (* {E F}),

and C occurs with no dependents or else with both G and F, i.e.,

C (* [G F]).
These rules express co-occurrence relationships more directly than the original rules do. Let us assume this constraint and redesign DG3 as DG4. Let the augmented DG4 be:

Axiom: * (A^2)

Dependency Rules:

1. A^2 (2B^1 * 1[{B^1 D^0, C^1} E^1])
2. B^1 (* 1{E^1 F^1})
3. C^1 (* 1[G^0 F^1])
4. D^0 (*)
5. E^1 (* 1G^0)
6. F^1 (1A^2 *)
7. G^0 (*)
The IRL is:

A^2 → B^1 A^1
A^1 → A^0 [{B^1 D^0, C^1} E^1]
B^1 → B^0 {E^1 F^1}
C^1 → C^0 [G^0 F^1]
E^1 → E^0 G^0
F^1 → A^2 F^0
With appropriate substitutions (Z : A^2, Y : A^1, X : B^1, W : C^1, V : E^1, U : F^1), this becomes SPG4:

Axiom: Z

Rewriting Rules:

1. Z → X Y
2. Y → A [{X D, W} V]
3. X → B {V U}
4. W → C [G U]
5. V → E G
6. U → Z F
SPG4 is structure free. Furthermore, it has only one axiom, 
whereas SPG3 had two even though derived from a DG which had 
only one. 
The additional restrictions proposed for DG and SPG in 
discussions of the results of conversion by these methods are not 
crucial as far as obtaining systematically corresponding gram- 
mars is concerned. Without them, every complete subtree of a 
D-tree will correspond to a complete subtree of a P-tree over the 
same string, and every complete subtree of the P-tree will cor- 
respond to a connected subtree of a D-tree over the same string. 
(No formal proof of this was given, but the methods of constructing 
an IRL make it moderately apparent.) Hays \[5\] suggests the term 
relational correspondence for this state of affairs. Also there is a 
systematic relationship between the categories of the two grammars, which the TS makes explicit.
Sometimes that relationship is simple, as in the case of 
DG4 and SPG4. Given any category of DG4, there is exactly one 
category of SPG4 from which the same set of strings is derivable. 
The relationship also holds between the categories of DG1 and SPG1, and between those of DG2 and SPG2. Under these conditions, Hays [5] calls the categories "substantively equivalent".
The relationship between the categories of DG3 and SPG3 is less 
simple. There the set of strings derivable from A of the DG is 
the union of the set of strings derivable from S and Z of the SPG, 
and the set derivable from B is the union of the set derivable from
W and V. 
Hays says that a D-tree and a P-tree correspond if they 
correspond relationally and if the category at the origin of every 
complete subtree of the D-tree is substantively equivalent to the 
category labeling the complete subtree of the P-tree related to it. 
If a DG and an SPG "have the same terminal alphabet, and for 
every string over that alphabet, every structure attributed by 
either corresponds to a structure attributed by the other", he calls
the two grammars "strongly equivalent" [5, p. 521]. We prefer
to say that they correspond substantively, since relational corre- 
spondence is asymmetric and there are always "left-over" SPG 
23 
categories to which no DG category is substantively equivalent. 
The weaker relationship exhibited by SPG3 and DG3, where some 
DG categories are not substantively equivalent to any single SPG 
category, we have been calling systematic correspondence. 
Conversion by the methods given here results in systematic correspondence. If the suggested constraints, which appear to be linguistically well-motivated for base components of transformational grammars, are imposed on the form of the source grammar, the target grammar corresponds substantively as well as systematically to the source grammar, and both are structure free.

References

1. Bar-Hillel, Y., Perles, M., and Shamir, E. "On formal properties of phrase structure grammars," in R. D. Luce, R. Bush, and E. Galanter (eds.), Readings in Mathematical Psychology, Vol. II, pp. 75-104. New York: Wiley, 1965.

2. Chomsky, Noam. "On certain formal properties of grammars," in R. D. Luce, R. Bush, and E. Galanter (eds.), Handbook of Mathematical Psychology, Vol. II, pp. 323-418. New York: Wiley, 1963.

3. Chomsky, Noam. Aspects of the Theory of Syntax. The M.I.T. Press, Cambridge, Mass., 1965.

4. Gaifman, Haim. Dependency Systems and Phrase Structure Systems. P-2315, The RAND Corporation, Santa Monica, California, May 1961.

5. Hays, David G. "Dependency theory: a formalism and some observations," Language, Vol. 40 (Oct.-Dec. 1964), pp. 511-525.

6. Hays, David G. An Annotated Bibliography of Publications on Dependency Theory. RM-4479-PR, The RAND Corporation, Santa Monica, California, March 1965.

7. Kay, Martin. The Tabular Parser: A Parsing Program for Phrase Structure and Dependency. RM-4933-PR, The RAND Corporation, Santa Monica, California, July 1966.

8. Lees, R. B. The Grammar of English Nominalizations. Supplement to International Journal of American Linguistics, 26, No. 3, Part II, 1960.

9. Robinson, Jane. "A dependency-based transformational grammar." IBM Corporation, Yorktown Heights, New York. (Forthcoming, 1967)

10. Rosenbaum, Peter S. "English Grammar II." IBM Corporation, Yorktown Heights, New York. (Forthcoming, 1967)

11. Zwicky, A. M., Hall, B. C., Fraser, J. B., Geis, M. L., Mintz, J. W., Isard, S., and Peters, P. S. "English preprocessor manual." Information System Language Studies Number Seven, SR-132, The MITRE Corporation, December 1964.
