TREE GRAMMARS (= Δ-GRAMMARS)
- Gladky A. V. (Novosibirsk), Mel'čuk I. A. (Moscow) -
1. This paper suggests a new kind of formal grammar (hereafter called
Δ-grammars*) which in some respects is closely related to Chomsky's
grammars but differs from these in that it is meant to process trees (in
the sense of graph theory) and not strings, as Chomsky's grammars
do. More precisely, we aim at a type of grammar with rewriting rules of
the form "X ⇒ Y", where X and Y are trees (N.B.: with no linear order imposed
on their nodes**).
Linguistically, the trees under consideration are dependency (not
phrase structure) trees representing natural sentences at different levels
of "depth": roughly speaking, "surface" syntax, "deep" syntax, semantics.
Δ-grammars are designed to be used not for generating sentences
but rather for transforming given trees into other trees; this covers transitions
from one abstract representation of a natural sentence to another
(deeper or more superficial) representation of the same sentence, as well as
transitions from an abstract representation of one sentence to a representation,
on the same level, of another sentence synonymous to the given one.
The conversion of a "ready" surface tree into an actual sentence - a conversion
consisting of a) inflexion and b) determination of word order - must be
carried out by some autonomous device not included in the conception of
Δ-grammar.
*) From the Greek δένδρον (tree).
**) Limitations of space and time prevent us from comparing tree grammars
with those of Chomsky, as well as from referring to other works dealing with
more or less analogous matters, such as studies by M. Arapow and V. Borschtschow;
G. Veillon, J. Veyrunes and B. Vauquois; Ch. Hockett; and others.
The authors are glad to acknowledge here the friendly help and useful
suggestions of O. S. Kulagina and A. Y. Dikovsky.
All shortcomings in the paper are, of course, ours.
The Δ-grammar embodies an attempt to formalize the linguistic "Meaning
⇔ Text Model" described, e.g., in [1]. In this model, the starting point for
producing a sentence is a detailed semantic description of its meaning, conceived
as a rather involved graph (not merely a tree) consisting of "semantic atoms"
and "semantic links" connecting them. The semantic description is generated
outside of the linguistic model and constitutes the input of that model; it is
then "lingualized" (anglicized, russianized, etc.) by means of
formally specified transformations: 1) extracting from the given semantic
description (of a family of synonymous sentences conveying the meaning
represented by that description) the deepest admissible tree-like structures;
2) proceeding in a multi-step fashion from the deeper trees to the more
superficial ones; 3) linearizing the most superficial syntactic trees (with
simultaneous inflexion where needed) to produce actual sentences. The Δ-grammars
deal with the second phase of this process only.
2. We shall consider trees with labelled branches; nodes are not labelled.
The labels can be interpreted as names of the types of syntactic link at the
corresponding level. For brevity's sake such trees will here be referred to
just as "trees".
A tree is called minimal if all its nodes, except the root, are terminal
(i.e., with no branches growing out of them). A tree with but one node is called
an empty tree and is denoted as ¢. The composition of trees is defined as
follows: let $t_0, t_1, t_2, \ldots, t_n$ be trees, and let some nodes
$\alpha_1, \alpha_2, \ldots, \alpha_n$ of $t_0$ (not necessarily pairwise different) be marked.
Then the result of the composition of the tree $t_0$ with the trees
$t_1, t_2, \ldots, t_n$ is any tree isomorphic to the tree which can be obtained
from $t_0$ by identifying the roots of the trees $t_1, t_2, \ldots, t_n$ with the
nodes $\alpha_1, \alpha_2, \ldots, \alpha_n$ of $t_0$, respectively.
The composition of $t_0$, in which the nodes $\alpha_1, \alpha_2, \ldots, \alpha_n$ are marked, with
$t_1, t_2, \ldots, t_n$ is denoted
$T = C(t_0;\ \alpha_1, \alpha_2, \ldots, \alpha_n \mid t_1, t_2, \ldots, t_n)$   (1)
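The composition (1) can be illustrated by a minimal Python sketch. The `Tree` class, the `compose` function and the `#i` renaming of glued-in nodes are assumptions made here for illustration; they are not part of the paper. Node names serve only as identifiers, since nodes carry no labels.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    """A rooted tree with labelled branches and unlabelled nodes."""
    root: str
    edges: list = field(default_factory=list)  # (parent, label, child) triples

    def nodes(self):
        # In a tree every non-root node is the child of exactly one branch.
        return [self.root] + [child for _, _, child in self.edges]


def compose(t0, marked, parts):
    """C(t0; marked | parts): identify the root of parts[i] with marked[i]."""
    edges = list(t0.edges)
    for i, (alpha, t) in enumerate(zip(marked, parts), start=1):
        # Non-root nodes of the i-th tree get a fresh name, so that copies
        # glued to different (or to the same) marked nodes never collide.
        def rename(x, alpha=alpha, t=t, i=i):
            return alpha if x == t.root else f"{x}#{i}"
        edges += [(rename(p), lab, rename(c)) for p, lab, c in t.edges]
    return Tree(t0.root, edges)


t0 = Tree("A", [("A", "a", "B")])
t1 = Tree("X", [("X", "b", "Y")])
empty = Tree("Z")  # a one-node tree: composing with it changes nothing

# The node B is marked twice, which the definition explicitly allows.
T = compose(t0, ["B", "B", "A"], [t1, t1, empty])
print(T.edges)
```

Here the marked node B receives two copies of `t1`, while gluing the one-node tree onto A adds no branches.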
A tree $t$ is a subtree of $T$ if $T$ can be represented as:
$T = C(T_0;\ \alpha_0 \mid C(t;\ \alpha_1, \alpha_2, \ldots, \alpha_n \mid T_1, T_2, \ldots, T_n))$   (2)
where $\alpha_0$ is a terminal node of $T_0$, and $\alpha_1, \alpha_2, \ldots, \alpha_n$ is a repetitionless
enumeration of all nodes of $t$.
Now, an elementary transformation (ET) of trees is an ordered triple
$\langle t_1, t_2, f \rangle$, where $t_1$ and $t_2$ are trees and $f$ is a mapping of the set of all nodes
of $t_1$ into the set of all nodes of $t_2$. Instead of $\langle t_1, t_2, f \rangle$ we shall write
$t_1 \Rightarrow t_2 \mid f$. The tree $T'$ is said to be the result of the application of the ET
$t_1 \Rightarrow t_2 \mid f$ to the tree $T$ if $T$ and $T'$ can be represented in the form:
$T = C(T_0;\ \alpha_0 \mid C(t_1;\ \alpha_1, \alpha_2, \ldots, \alpha_n \mid T_1, T_2, \ldots, T_n))$   (3)
and
$T' = C(T_0;\ \alpha_0 \mid C(t_2;\ f(\alpha_1), f(\alpha_2), \ldots, f(\alpha_n) \mid T_1, T_2, \ldots, T_n))$   (4)
where $\alpha_0$ is a terminal node of $T_0$, and $\alpha_1, \alpha_2, \ldots, \alpha_n$ is a repetitionless
enumeration of all nodes of $t_1$. Informally, an application of an ET to a tree $T$
consists in substituting $t_2$ for an occurrence of $t_1$ in $T$; if $\alpha$ (a node of $t_1$)
is mapped on $\beta$ (a node of $t_2$), i.e., $\beta = f(\alpha)$, then all "untouched" nodes of $T$
"pending" from $\alpha$ are transferred to $\beta$ with the same labels on the corresponding
branches.
Example: Let $t_1$ and $t_2$ be the trees shown in the figure, and let $f$ be specified
as follows: f(A) = E, f(B) = H, f(D) = F. Then, applying the ET $t_1 \Rightarrow t_2 \mid f$ to the
tree $T$, we can obtain the tree $T'$.
[Figure: the trees $t_1$, $t_2$, $T$ and $T'$; the drawings are not recoverable.]
$T$ contains three occurrences of $t_1$; the replaced one is the subtree of $T$ with
the nodes M, N, O, Q.
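The application of an ET according to (3) - (4) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the names `Tree`, `occurrences` and `apply_et` are hypothetical, the branches of `t1` are assumed to be listed top-down, and the node names of `t2` are assumed not to clash with the names that remain in `T`. The small trees used below are invented for the purpose and do not reproduce the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    root: str
    edges: list = field(default_factory=list)  # (parent, label, child), top-down

    def nodes(self):
        return [self.root] + [child for _, _, child in self.edges]


def occurrences(T, t1):
    """Yield every embedding of t1 into T as a dict {node of t1: node of T}."""
    def extend(m, todo):
        if not todo:
            yield dict(m)
            return
        (p, lab, c), rest = todo[0], todo[1:]
        for P, L, C in T.edges:
            if P == m[p] and L == lab and C not in m.values():
                m[c] = C
                yield from extend(m, rest)
                del m[c]
    for r in T.nodes():
        yield from extend({t1.root: r}, list(t1.edges))


def apply_et(T, t1, t2, f, m):
    """Substitute t2 for the occurrence m of t1 in T, moving pending subtrees along f."""
    inverse = {v: k for k, v in m.items()}
    replaced = {(m[p], lab, m[c]) for p, lab, c in t1.edges}
    old_root = m[t1.root]
    edges = []
    for p, lab, c in T.edges:
        if (p, lab, c) in replaced:
            continue                   # a branch of the occurrence itself
        if p in inverse:
            p = f[inverse[p]]          # pending subtree: re-hung on f(alpha)
        if c == old_root:
            c = t2.root                # the branch entering the occurrence
        edges.append((p, lab, c))
    root = t2.root if T.root == old_root else T.root
    return Tree(root, edges + list(t2.edges))


T = Tree("R", [("R", "a", "M"), ("M", "a", "N"), ("M", "b", "O"), ("N", "c", "S")])
t1 = Tree("A", [("A", "a", "B"), ("A", "b", "C")])
t2 = Tree("D", [("D", "a", "E"), ("E", "b", "F")])
f = {"A": "D", "B": "E", "C": "F"}

found = list(occurrences(T, t1))
T2 = apply_et(T, t1, t2, f, found[0])
print(found, T2.edges)
```

In this run the node S, which was pending from N (the image of B), ends up hanging from E = f(B) on a branch with the same label c.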
3. A syntactic Δ-grammar is an ordered pair $\Gamma = \langle V, \Pi \rangle$, where $V$ is a
finite set of symbols (branch labels) and $\Pi$ is a finite set of ET's, called the rules
of the grammar $\Gamma$. A derivation in a syntactic Δ-grammar is a finite sequence
of trees where each subsequent tree is obtainable from the preceding one by
application of an ET of $\Pi$. A tree $T'$ is derivable from $T$ in $\Gamma$ if there exists a
derivation in $\Gamma$ beginning with $T$ and ending with $T'$.
For linguistic applications, it may prove to be of interest to define
some specific types of syntactic Δ-grammars.
A syntactic Δ-grammar will be called expanding if each rule it contains
has no more nodes in its left side than in its right side.
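The expanding condition amounts to a node count on both sides of every rule. A minimal sketch, assuming (for illustration only) that each side of a rule is given as a list of (parent, label, child) branches and that the mapping f is left out because it does not affect the count:

```python
def node_count(edges):
    """Number of nodes of a tree given by its branches."""
    return len(edges) + 1  # a tree with k branches has k + 1 nodes


def is_expanding(rules):
    """True if no rule has more nodes on its left side than on its right."""
    return all(node_count(left) <= node_count(right) for left, right in rules)


# Hypothetical rules: one node grows into two, and two nodes shrink into one.
grow = ([], [("B", "a", "C")])
shrink = ([("B", "a", "C")], [])

print(is_expanding([grow]), is_expanding([grow, shrink]))
```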
An expanding syntactic Δ-grammar will be called minimal if in each
of its rules of the form "$t_1 \Rightarrow t_2 \mid f$" the trees $t_1$ and $t_2$ can be represented
in the form
$t_1 = C(\tau_0;\ \alpha_0 \mid C(\mu;\ \alpha_1, \alpha_2, \ldots, \alpha_n \mid \tau_1, \tau_2, \ldots, \tau_n))$   (5)
and
$t_2 = C(\tau_0;\ \alpha_0 \mid C(\nu;\ f(\alpha_1), f(\alpha_2), \ldots, f(\alpha_n) \mid \tau_1, \tau_2, \ldots, \tau_n))$   (6)
where 1) $\mu$ is a minimal tree, 2) $\alpha_1, \alpha_2, \ldots, \alpha_n$ is a repetitionless
enumeration of all nodes of $\mu$, 3) $\alpha_1$ is the root of $\mu$, 4) $f(\alpha_1), f(\alpha_2), \ldots, f(\alpha_n)$
are pairwise different, 5) $f(\alpha_2), f(\alpha_3), \ldots, f(\alpha_n)$ are terminal nodes of
$\nu$, 6) for every $i = 2, 3, \ldots, n$ the label on the branch of $\mu$ ending in $\alpha_i$
coincides with the label on the branch of $\nu$ ending in $f(\alpha_i)$, 7) for all nodes
of $t_1$ differing from $\alpha_1, \alpha_2, \ldots, \alpha_n$, the mapping $f$ is the identity.
A minimal expanding syntactic Δ-grammar will be called context-free
if in the expressions (5) and (6) the trees $\tau_0, \tau_1, \tau_2, \ldots, \tau_n$ are unity trees.
4. Linguistic considerations dealt with in the "Meaning ⇔ Text Model" (see,
e.g., [1]) imply the introduction of a subset of ET's: special elementary
transformations (SET's). A SET is an ET of one of the following three types:
1) Splitting of one node - a transformation of the form $A \Rightarrow B \xrightarrow{a} C$,
where either f(A) = B or f(A) = C.
Notation: $A \Rightarrow a(B, C) \mid f(A) = B$ and $A \Rightarrow a(B, C) \mid f(A) = C$.
2) Transfer of one node - a transformation of one of two forms:
[Figure: the chain A -a-> B -b-> C is replaced by the fan D -a-> E, D -b-> F;
the fan A -a-> B, A -b-> C is replaced by the chain D -a-> E -b-> F.]
In both cases f(A) = D, f(B) = E, f(C) = F.
Notation: $a(A, B) \cdot b(B, C) \Rightarrow a(D, E) \cdot b(D, F)$ and
$a(A, B) \cdot b(A, C) \Rightarrow a(D, E) \cdot b(E, F)$.
3) Lumping two nodes into one - a transformation of the form
$B \xrightarrow{a} C \Rightarrow A$, where f(B) = f(C) = A.
Notation: $a(B, C) \Rightarrow A$.
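The three types of SET can be written down as plain data. In the sketch below a tree is assumed to be a pair (root, branches) and a SET a triple (t1, t2, f); the function names, in particular `transfer_chain_to_fan` and `transfer_fan_to_chain`, are hypothetical labels for the two transfer forms and do not come from the paper. The helper `is_et` only checks that f maps every node of t1 to some node of t2, as the definition of an ET requires.

```python
def nodes(tree):
    root, edges = tree
    return {root} | {child for _, _, child in edges}


def splitting(a, image):
    """A => a(B, C) | f(A) = image, where image is "B" or "C"."""
    return ("A", []), ("B", [("B", a, "C")]), {"A": image}


def transfer_chain_to_fan(a, b):
    """a(A, B) . b(B, C) => a(D, E) . b(D, F)."""
    t1 = ("A", [("A", a, "B"), ("B", b, "C")])
    t2 = ("D", [("D", a, "E"), ("D", b, "F")])
    return t1, t2, {"A": "D", "B": "E", "C": "F"}


def transfer_fan_to_chain(a, b):
    """a(A, B) . b(A, C) => a(D, E) . b(E, F)."""
    t1 = ("A", [("A", a, "B"), ("A", b, "C")])
    t2 = ("D", [("D", a, "E"), ("E", b, "F")])
    return t1, t2, {"A": "D", "B": "E", "C": "F"}


def lumping(a):
    """a(B, C) => A, with f(B) = f(C) = A."""
    return ("B", [("B", a, "C")]), ("A", []), {"B": "A", "C": "A"}


def is_et(t1, t2, f):
    """f must be defined on all nodes of t1 and take values among nodes of t2."""
    return set(f) == nodes(t1) and set(f.values()) <= nodes(t2)


sets = [splitting("a", "B"), splitting("a", "C"),
        transfer_chain_to_fan("a", "b"), transfer_fan_to_chain("a", "b"),
        lumping("a")]
print([is_et(*s) for s in sets])
```

Splitting adds a node, transfer keeps the number of nodes, and lumping removes one, so of the three types only lumping has more nodes on the left side than on the right.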
Let $t_1 \Rightarrow t_2 \mid f$ be an ET and let $M$ be a set of ET's. Then the statement
"The ET $t_1 \Rightarrow t_2 \mid f$ can be simulated by ET's of $M$" means that there
exists some finite sequence $m_1, m_2, \ldots, m_n$ of ET's in $M$ such that for
any trees $T$ and $T'$, where $T'$ can be obtained from $T$ by application of the
ET $t_1 \Rightarrow t_2 \mid f$, the tree $T'$ can be obtained from $T$ by applying
$m_1, m_2, \ldots, m_n$ in succession.
Theorem 1. Any elementary transformation can be simulated by
special elementary transformations.
5. For the representation of natural sentences it is reasonable to use
not arbitrary syntactic trees but rather a subset of those, namely, those with
limited branching. The precise meaning of limited branching is as follows:
for each branch label $a_i$ there is fixed an integer $n_i$ such that each node
can be a starting point for at most $n_i$ branches labelled $a_i$. The trees
meeting this restriction are called $(n_1, n_2, \ldots, n_k)$-regular ($k$ being
the number of different branch labels); for brevity we shall call these trees
simply regular trees.
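Checking regularity is a matter of counting, for every node, the outgoing branches of each label. A minimal sketch, assuming that a tree is given as a list of (parent, label, child) branches and that the bounds $n_i$ are passed as a dictionary; the function name and the label strings are illustrative only:

```python
from collections import Counter


def is_regular(edges, limits):
    """True if no node starts more than limits[label] branches of any label."""
    per_node = Counter((parent, label) for parent, label, _ in edges)
    return all(
        label in limits and count <= limits[label]
        for (_, label), count in per_node.items()
    )


# Bounds (1, 1, 1, 1, 10, 1) for six branch labels.
limits = {"a1": 1, "a2": 1, "a3": 1, "a4": 1, "a5": 10, "a6": 1}
ok = [("P", "a1", "X"), ("P", "a5", "Y"), ("P", "a5", "Z")]
bad = [("P", "a1", "X"), ("P", "a1", "Y")]  # two a1-branches from one node

print(is_regular(ok, limits), is_regular(bad, limits))
```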
Now, a slight modification of the notion of the application of an ET
suggests itself: if we suppose that the trees $T$ and $T'$ in (3) - (4) are regular,
we need consider only ET's with regular left and right sides; such ET's will
also be called regular.
A regular syntactic Δ-grammar is an ordered triple $\langle V, \varphi, \Pi \rangle$, where
$V = \{a_1, a_2, \ldots, a_k\}$ is a finite set of symbols (branch labels), $\varphi$ is a
mapping of $V$ into the set of positive integers (for every $a \in V$ the integer
$\varphi(a)$ being the maximum number of branches labelled $a$ which can grow out
of any single node) and $\Pi$ is a finite set of $(\varphi(a_1), \varphi(a_2), \ldots, \varphi(a_k))$-regular
ET's.
The set of all regular syntactic Δ-grammars may be divided into
hierarchical subsets which are fully analogous to the corresponding subsets
of the syntactic Δ-grammars as defined above. Special elementary
transformations (SET's) can be defined here too.
Theorem 1'. Any $(n_1, n_2, \ldots, n_k)$-regular elementary transformation
can be simulated by $(n_1, n_2, \ldots, n_k, 1)$-regular SET's.
Theorem 2. a) If $n_1 + \ldots + n_k \geq 3$ or if $n_1 + \ldots + n_k = 1$, then any
$(n_1, n_2, \ldots, n_k)$-regular ET can be simulated by $(n_1, n_2, \ldots, n_k)$-regular
SET's.
b) There exist (1, 1)-regular and (2)-regular ET's which cannot
be simulated by (1, 1)-regular and (2)-regular SET's respectively.
6. If a regularity characteristic $(n_1, n_2, \ldots, n_k)$ is fixed on the basis
of some empirical (linguistic) evidence, then a "universal syntax" can be
constructed as an abstract calculus of all possible syntactic structures and
all possible transformations of these. Choosing (1, 1, 1, 1, 10, 1)-regularity*
as a first approximation to the deep syntactic description of natural
languages, we obtain a universal (1, 1, 1, 1, 10, 1)-regular Δ-grammar
$\langle V_u, \varphi_u, \Pi_u \rangle$, where $V_u = \{a_1, a_2, \ldots, a_6\}$ is the set of types of deep
syntactic connections and where
$\varphi_u(a_1) = \varphi_u(a_2) = \varphi_u(a_3) = \varphi_u(a_4) = \varphi_u(a_6) = 1$; $\varphi_u(a_5) = 10$.
$\Pi_u$ consists of the following 80 rules:
1) 12 "splitting" rules of the form $A \Rightarrow a_i(B, C) \mid f(A) = B$ and
$A \Rightarrow a_i(B, C) \mid f(A) = C$ ($i = 1, \ldots, 6$);
2) 62 "transfer" rules of the form $a_i(A, B) \cdot a_j(B, C) \Rightarrow a_i(D, E) \cdot a_j(D, F)$
and $a_i(A, B) \cdot a_j(A, C) \Rightarrow a_i(D, E) \cdot a_j(E, F)$;
here $i, j = 1, \ldots, 6$ and either $i \neq j$ or $i = j = 5$;
3) 6 "lumping" rules of the form $a_i(A, B) \Rightarrow C$ ($i = 1, \ldots, 6$).
*) The description of deep syntax suggested in [1] is meant here. 6 types of
syntactic connections are differentiated and interpreted as follows: connections 1
through 4 link a predicate with its arguments (only predicates with no more than
4 places are considered), connection 5 formalizes the general attributive
relation, and connection 6 expresses coordination; a node can be a starting point
for only one branch of each of types 1, 2, 3, 4, 6 and for several branches of
type 5 (we have set the number of the latter at 10 as a sufficient upper limit).
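The figure of 80 rules can be checked by enumerating the index combinations. The sketch below only counts rule schemata; the tuple encodings and the names `chain_to_fan` and `fan_to_chain` are illustrative assumptions. One plausible reading of the restriction "either i ≠ j or i = j = 5", offered here as an interpretation rather than as a statement of the paper, is that the fan side of a transfer rule carries two branches from one node, which is regular for equal labels only when the label is $a_5$.

```python
labels = range(1, 7)  # indices of the six branch labels a_1 .. a_6

# Splitting: one schema per label and per choice of f(A) in {B, C}.
splitting = [(i, image) for i in labels for image in ("B", "C")]

# Transfer: ordered pairs (i, j) with i != j, plus the single pair (5, 5),
# each in two forms (chain replaced by fan, fan replaced by chain).
pairs = [(i, j) for i in labels for j in labels if i != j or i == j == 5]
transfer = [(i, j, shape) for (i, j) in pairs
            for shape in ("chain_to_fan", "fan_to_chain")]

# Lumping: one schema per label.
lumping = list(labels)

total = len(splitting) + len(transfer) + len(lumping)
print(len(splitting), len(transfer), len(lumping), total)  # 12 62 6 80
```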
7. It may be useful, in view of possible linguistic applications, to consider
also such regular trees where the branches as well as the nodes are labelled:
filled regular trees. The node labels may be interpreted as characterized lexemes,
i.e., symbols denoting words, idioms and so-called lexical functions with
morphological subscripts attached to them ([1], p. 186). The notion of regular
ET and that of regular syntactic Δ-grammar can in an obvious manner
be modified accordingly. As a result, we obtain regular lexico-syntactic
Δ-grammars. For these grammars (see p. 6-7) we can define SET's of the
types "splitting", "transfer" and "lumping" in a manner analogous to the one
above; in addition another type of SET must be introduced:
4) "renaming" of a node - a transformation of the form $\bullet_{w_i} \Rightarrow \bullet_{v_j}$,
where $w_i$ and $v_j$ are node labels.
If SET's are understood as transformations of the types 1-4, Theorems 1'
and 2 will hold also for this case.
[1] Zholkovsky A. K., Mel'čuk I. A. O semanticheskom sinteze [On semantic
synthesis]. - Problemy kibernetiki, vyp. 19, 1967, 177-238.
