COLING 82, J. Horecký (ed.)
North-Holland Publishing Company
© Academia, 1982
TREE DIRECTED GRAMMARS 
Werner Dilger 
Universität Kaiserslautern
Fachbereich Informatik 
D-6750 Kaiserslautern 
FR Germany 
Tree directed grammars as a special kind of translation 
grammars are defined. It is shown that a loop-free tree 
directed grammar can be transformed into an equivalent 
top-down tree transducer, and from this fact it follows 
that given an arbitrary context-free language as input, 
a tree directed grammar produces an output language 
which is at most context-sensitive. 
INTRODUCTION 
Within the natural language information system PLIDIS \[6\], a semantic
processor was implemented for the translation of syntactically
analyzed sentences into expressions of a predicate calculus-oriented
internal representation language. This semantic processor was
designed according to a translation grammar defined by Wulz \[8\], which
is similar to the transformation grammar introduced by Chomsky \[3\].
The operations on trees which are defined in the transformation
grammar, i.e. deletion, insertion, and transposition of subtrees,
are also available in the Wulz grammar. Therefore it can be assumed
that it is equivalent to the transformation grammar with regard to
the input/output relation.
But when the Wulz grammar was realized within PLIDIS for a section
of German, only a few of its possibilities were used. No real
transformation was prescribed by the PLIDIS translation rules; they
only checked the parse tree and produced an output separate from
this tree. Thus, what was realized in the PLIDIS translation rules
can be better described by another kind of translation grammar,
namely the tree directed grammar (TDG). When we investigate the TDGs
and their relation to tree transducers, it turns out that they are
less powerful than transformation grammars.
TREE DIRECTED GRAMMARS
We define trees in the manner of \[2\] and \[7\] as mappings from tree
domains (special subsets of $\mathbb{N}^*$, where $\mathbb{N}$ is the set of natural
numbers) into an alphabet $\Sigma$, and therefore call them trees "over" $\Sigma$. We
assume for the rest of the paper that $\Sigma$ is ranked. Because trees are
finite mappings, it is convenient to identify a tree with its graph. For example, the set
$$\{\langle(),a\rangle, \langle(0),b\rangle, \langle(1),d\rangle, \langle(2),a\rangle, \langle(0,0),e\rangle, \langle(0,1),c\rangle, \langle(2,0),d\rangle, \langle(2,1),b\rangle, \langle(2,2),e\rangle, \langle(0,1,0),e\rangle, \langle(2,1,0),c\rangle, \langle(2,1,1),d\rangle, \langle(2,1,0,0),d\rangle\}$$
represents the tree of fig. 1.
[Figure 1: the tree represented by the set above.]
If $u$ is an element of a tree domain, $a \in \Sigma$, and $t(u) = a$, then the
pair $\langle u,a\rangle$ is called a node of $t$.
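The identification of a tree with its graph can be made concrete in a few lines. The following is a minimal Python sketch, not part of the original formalism: the names `TREE`, `is_tree_domain`, `nodes`, and `to_term` are illustrative. The tree-domain check assumes the usual definition (closed under prefixes and under left siblings), which the text only cites from \[2\] and \[7\].

```python
# The example tree as a mapping from addresses (tuples over N) to labels.
TREE = {
    (): "a",
    (0,): "b", (1,): "d", (2,): "a",
    (0, 0): "e", (0, 1): "c",
    (2, 0): "d", (2, 1): "b", (2, 2): "e",
    (0, 1, 0): "e", (2, 1, 0): "c", (2, 1, 1): "d",
    (2, 1, 0, 0): "d",
}


def is_tree_domain(domain):
    """Assumed definition: contains the root, every parent, and every left sibling."""
    if () not in domain:
        return False
    for u in domain:
        if u == ():
            continue
        if u[:-1] not in domain:
            return False  # parent address missing
        if u[-1] > 0 and u[:-1] + (u[-1] - 1,) not in domain:
            return False  # left sibling missing
    return True


def nodes(tree):
    """The nodes of a tree: all pairs <u, a> with tree(u) = a."""
    return set(tree.items())


def to_term(tree, u=()):
    """Render the subtree at address u in bracketed notation."""
    kids = []
    i = 0
    while u + (i,) in tree:
        kids.append(to_term(tree, u + (i,)))
        i += 1
    return tree[u] + ("(" + "".join(kids) + ")" if kids else "")


print(to_term(TREE))
```

Under these assumptions `to_term(TREE)` yields `a(b(ec(e))da(db(c(d)d)e))`, which is the bracketed form in which the same tree appears in the transducer example later in the paper.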
Let $T$ be any set of trees over $\Sigma$. A TDG $G_T$ for $T$ is a quadruple
$$G_T = (\Sigma, \Delta, \alpha, \Pi)$$
where $\Delta$ is the alphabet of terminals of $G_T$, $\Pi$ is the set of
productions of $G_T$, and $\alpha \in \Sigma \cup \Delta$. It follows from this definition that the
elements of $\Sigma$ play the role of nonterminals in $G_T$. When they are
used for this purpose in the productions, they are enclosed in
brackets, so we get from $\Sigma$ the set
$$[\Sigma] = \{[\alpha] \mid \alpha \in \Sigma\}$$
The elements of $\Sigma$ are further used in the structural condition parts
of the productions. There we should be able to distinguish between
different occurrences of the same symbol in a tree. In order to
represent such distinctions, the symbols are provided with indices, so
we get from $\Sigma$ the set
$$\Sigma_{IND} = \bigcup_{i \in IND} \{\alpha_i \mid \alpha \in \Sigma\}$$
for some index set $IND$ (in general a subset of $\mathbb{N}$).
Now a production $p \in \Pi$ is a triple
$$([\alpha_1], sc, \beta)$$
with $\alpha \in \Sigma$, $\beta \in (\Delta \cup [\Sigma_{IND}])^*$, and $sc$ is a structural condition
which contains the symbol $\alpha_1$.
In order to explain the application of a production we have to
define the structural conditions. Assume $x \notin \Sigma$ and $X = \{x_1, x_2, \ldots\}$.
Then the set of structural individuals is
$$SI = \Sigma_{IND} \cup X$$
There are four two-place predicates defined on $SI$, namely DOM
("dominates immediately"), DOM* ("dominates"), LFT ("is immediately
left from"), and LFT* ("is left from"). Atomic structural conditions
are TRUE, FALSE, and $P(\xi, \eta)$,
where $P$ is one of the four predicates above and $\xi, \eta \in SI$.
A structural condition is then an atomic structural condition or a
Boolean expression over the set of atomic structural conditions.
For example, if $\Sigma = \{a,b,c,d,e\}$, $IND = \{1,2\}$, then the following
expressions are structural conditions:
1. $\mathrm{DOM}(a_1,b_1)$
2. $\mathrm{DOM}(b_1,x_1) \wedge \mathrm{LFT}(e_1,x_1)$
3. $\mathrm{DOM}(x_1,c_1) \wedge \mathrm{LFT}^*(x_1,x_2) \wedge \mathrm{DOM}^*(x_2,e_1)$
4. $\mathrm{DOM}(a_1,b_1) \wedge (\neg\mathrm{DOM}^*(b_1,e_1) \vee \mathrm{LFT}(b_1,d_1))$
The semantics of a structural condition is defined in the usual way
by an interpreting function from the condition into a semantic domain.
Here, the trees of $T$ are semantic domains. The four predicates DOM,
..., LFT* are always interpreted in the same way, and this interpretation
should be obvious. The main part of the interpretation is the
assignment of the structural individuals to the nodes of a tree,
which is called the node assignment. A mapping of the individuals of
a structural condition into the set of nodes of a tree is a node
assignment if it obeys the following restrictions: If $\alpha \in \Sigma$, then
an individual $\alpha_i$ ($i \in IND$) should be assigned to a node with label $\alpha$,
whereas the individuals $\alpha_i$ and $\alpha_j$ ($i \neq j$) should be assigned to
different nodes with the same label $\alpha$. An individual $x_i \in X$ can
be assigned to an arbitrary node. A tree $t$ satisfies a structural
condition $sc$ if there exists a node assignment such that $sc$ holds
for the assigned nodes of $t$ under the assumed interpretation of the
four predicates and the usual interpretation of the Boolean operators.
The reader is invited to check how the tree of the example above
satisfies the structural conditions 1 to 4.
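The check left to the reader can also be done by brute force. The sketch below is one possible reading, not the paper's definition: it assumes DOM* is irreflexive (proper ancestor), that LFT relates a node to its directly following sibling, and that LFT* relates a node to any later sibling. The helper names (`candidates`, `assignments`, `satisfies`, `CONDITIONS`) are hypothetical, and individuals are written as strings such as `"a1"` or `"x2"`.

```python
from itertools import product

# The example tree: addresses (tuples) mapped to labels.
TREE = {
    (): "a", (0,): "b", (1,): "d", (2,): "a",
    (0, 0): "e", (0, 1): "c", (2, 0): "d", (2, 1): "b", (2, 2): "e",
    (0, 1, 0): "e", (2, 1, 0): "c", (2, 1, 1): "d", (2, 1, 0, 0): "d",
}


def DOM(u, v):       # u dominates v immediately: v is a child of u
    return len(v) == len(u) + 1 and v[:-1] == u

def DOM_STAR(u, v):  # assumption: proper (irreflexive) dominance
    return len(v) > len(u) and v[:len(u)] == u

def LFT(u, v):       # assumption: u is the sibling directly left of v
    return len(u) == len(v) > 0 and u[:-1] == v[:-1] and u[-1] + 1 == v[-1]

def LFT_STAR(u, v):  # assumption: u is some sibling to the left of v
    return len(u) == len(v) > 0 and u[:-1] == v[:-1] and u[-1] < v[-1]


def candidates(ind, tree):
    """x_i may go to any node; a_i only to nodes labelled a."""
    if ind[0] == "x":
        return list(tree)
    return [u for u, label in tree.items() if label == ind[0]]


def assignments(individuals, cond, tree):
    """Yield every node assignment under which cond holds in tree."""
    for chosen in product(*(candidates(i, tree) for i in individuals)):
        env = dict(zip(individuals, chosen))
        # a_i and a_j with i != j must be assigned to different nodes
        clash = any(p != q and p[0] == q[0] != "x" and env[p] == env[q]
                    for p in individuals for q in individuals)
        if not clash and cond(env):
            yield env


CONDITIONS = {
    1: (["a1", "b1"],
        lambda e: DOM(e["a1"], e["b1"])),
    2: (["b1", "x1", "e1"],
        lambda e: DOM(e["b1"], e["x1"]) and LFT(e["e1"], e["x1"])),
    3: (["x1", "c1", "x2", "e1"],
        lambda e: DOM(e["x1"], e["c1"]) and LFT_STAR(e["x1"], e["x2"])
        and DOM_STAR(e["x2"], e["e1"])),
    4: (["a1", "b1", "e1", "d1"],
        lambda e: DOM(e["a1"], e["b1"])
        and (not DOM_STAR(e["b1"], e["e1"]) or LFT(e["b1"], e["d1"]))),
}


def satisfies(tree, number):
    individuals, cond = CONDITIONS[number]
    return next(assignments(individuals, cond, tree), None) is not None


for n in CONDITIONS:
    print(n, satisfies(TREE, n))
```

Under these readings all four conditions are satisfied by the example tree. Conditions 2 and 3 each have exactly one satisfying assignment, for instance $b_1 \mapsto (0)$, $x_1 \mapsto (0,1)$, $e_1 \mapsto (0,0)$ for condition 2.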
The structural conditions are similar to the local constraints of 
Joshi and Levy \[5\], and it can be shown that both are equivalent with 
regard to their ability to describe relations on the set of nodes of 
a tree. 
Assume $p = ([\alpha_1], sc, \beta)$ is a production of $G_T$. Then the structural
individual $\alpha_1$ must occur in $sc$. Assume further that
$$\gamma = \gamma_1[\alpha_i]\gamma_2$$
where $\gamma_1, \gamma_2 \in (\Delta \cup [\Sigma_{IND}])^*$ and $i \in IND$. If there is a node assignment
which maps $\alpha_i$ onto a node $\langle u,\alpha\rangle$ in tree $t$, and $t$ satisfies $sc$ in such a
way that $\alpha_1$ is mapped onto $\langle u,\alpha\rangle$ as well, then $p$ can be applied to $\gamma$:
$$\gamma_1[\alpha_i]\gamma_2 \vdash_{G_T,t} \gamma_1\beta\gamma_2$$
Some of the individuals of $X$ occurring in $\beta$ may be replaced by the
node assignment for $sc$ by individuals of $[\Sigma_{IND}]$. In this way
derivations in $G_T$ with regard to a tree $t$ are defined. If a derivation
stops with a word $\gamma \in \Delta^*$, $\gamma$ can be regarded as a translation of $t$.
Assume e.g. we are given the following four productions:
$([a_1],\ \mathrm{DOM}(a_1,b_1),\ [b_1][b_1])$
$([b_1],\ \mathrm{DOM}(b_1,x_1) \wedge \mathrm{LFT}(e_1,x_1),\ H[x_1])$
$([c_1],\ \mathrm{DOM}(x_1,c_1) \wedge \mathrm{LFT}(x_1,x_2) \wedge \mathrm{DOM}^*(x_2,e_1),\ [e_1]E)$
$([e_1],\ \mathrm{TRUE},\ AR)$
By means of these productions we can perform the derivation
$$[a_1] \vdash [b_1][b_1] \vdash H[c_1][b_1] \vdash H[e_1]E[b_1] \vdash HARE[b_1] \vdash^* HAREHARE$$
with regard to the tree of the example above.
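The derivation can be replayed mechanically. The following Python sketch is only one way to read the definition, under several assumptions that are not in the text: each bracketed nonterminal is bound to a concrete node, the start symbol is bound to the root, the leftmost nonterminal is always rewritten, and the first applicable production is used. The predicates are read as sibling relations with irreflexive dominance. The third production is taken with LFT* in place of LFT, as in structural condition 3, because under the sibling reading assumed here the immediate LFT has no satisfying assignment in this tree. All function names are hypothetical.

```python
from itertools import product

TREE = {
    (): "a", (0,): "b", (1,): "d", (2,): "a",
    (0, 0): "e", (0, 1): "c", (2, 0): "d", (2, 1): "b", (2, 2): "e",
    (0, 1, 0): "e", (2, 1, 0): "c", (2, 1, 1): "d", (2, 1, 0, 0): "d",
}

def DOM(u, v):
    return len(v) == len(u) + 1 and v[:-1] == u

def DOM_STAR(u, v):  # assumption: proper dominance
    return len(v) > len(u) and v[:len(u)] == u

def LFT(u, v):       # assumption: directly preceding sibling
    return len(u) == len(v) > 0 and u[:-1] == v[:-1] and u[-1] + 1 == v[-1]

def LFT_STAR(u, v):  # assumption: any preceding sibling
    return len(u) == len(v) > 0 and u[:-1] == v[:-1] and u[-1] < v[-1]


def assignments(individuals, cond, tree, fixed):
    """Yield node assignments satisfying cond; `fixed` pins individuals to nodes."""
    def candidates(ind):
        if ind in fixed:
            return [fixed[ind]]
        if ind[0] == "x":
            return list(tree)
        return [u for u, label in tree.items() if label == ind[0]]
    for chosen in product(*(candidates(i) for i in individuals)):
        env = dict(zip(individuals, chosen))
        clash = any(p != q and p[0] == q[0] != "x" and env[p] == env[q]
                    for p in individuals for q in individuals)
        if not clash and cond(env):
            yield env


# (left-hand individual, individuals of sc, sc, right-hand side).
# Right-hand side items: a terminal string, or a 1-tuple naming an individual.
PRODUCTIONS = [
    ("a1", ["a1", "b1"],
     lambda e: DOM(e["a1"], e["b1"]), [("b1",), ("b1",)]),
    ("b1", ["b1", "x1", "e1"],
     lambda e: DOM(e["b1"], e["x1"]) and LFT(e["e1"], e["x1"]), ["H", ("x1",)]),
    ("c1", ["c1", "x1", "x2", "e1"],  # LFT* assumed here, see lead-in
     lambda e: DOM(e["x1"], e["c1"]) and LFT_STAR(e["x1"], e["x2"])
     and DOM_STAR(e["x2"], e["e1"]), [("e1",), "E"]),
    ("e1", ["e1"], lambda e: True, ["AR"]),
]


def derive(tree, start=()):
    """Leftmost derivation; a nonterminal is a (label, address) pair."""
    form = [(tree[start], start)]
    steps = [form]
    while any(isinstance(item, tuple) for item in form):
        k = next(i for i, item in enumerate(form) if isinstance(item, tuple))
        label, node = form[k]
        for lhs, inds, cond, rhs in PRODUCTIONS:
            if lhs[0] != label:
                continue
            env = next(assignments(inds, cond, tree, {lhs: node}), None)
            if env is not None:
                new = [item if isinstance(item, str)
                       else (tree[env[item[0]]], env[item[0]]) for item in rhs]
                form = form[:k] + new + form[k + 1:]
                steps.append(form)
                break
        else:
            raise ValueError("no production applicable")
    return steps


def show(form):
    return "".join(i if isinstance(i, str) else "[" + i[0] + "]" for i in form)


print(" |- ".join(show(s) for s in derive(TREE)))
```

With these choices the sketch passes through `[a]`, `[b][b]`, `H[c][b]`, `H[e]E[b]`, `HARE[b]` and ends in `HAREHARE`. The indices are dropped in the printed forms because the node binding takes over their role.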
TOP-DOWN TREE TRANSDUCERS
A top-down tree transducer (TDTT) (cf. \[4\]) is a transducing
automaton which proceeds top-down from the root to the leaves in a tree
and in each step yields an output. It is defined as a quintuple
$$M = (Q, \Sigma, \Delta, q_0, R)$$
where $\Sigma$ and $\Delta$ are defined as before, $Q$ is a finite set of states,
$q_0 \in Q$ is the initial state, and $R$ is a finite set of rules of the form
$$q(\alpha(\tau_1 \ldots \tau_k)) \to \gamma_1 q_1(\tau_{i_1}) \gamma_2 q_2(\tau_{i_2}) \ldots \gamma_n q_n(\tau_{i_n}) \gamma_{n+1}$$
with $n, k \geq 0$; $1 \leq i_j \leq k$ for $1 \leq j \leq n$; $q, q_1, \ldots, q_n \in Q$; $\alpha \in \Sigma$;
$\gamma_1, \ldots, \gamma_{n+1} \in \Delta^*$. Here $k$ is the rank of $\alpha$ and the $\tau_i$ are variables
over $T$. When such a rule is applied to a tree $t$ at a node with
label $\alpha$, the variables $\tau_i$ are replaced by those subtrees of $t$ whose
roots are immediately dominated by the node with label $\alpha$.
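Assume e.g. we are given the TDTT
$$M = (\{q_0, q_1\}, \{a,b,c,d,e\}, \{A,E,H,R\}, q_0, R)$$
with
$$\begin{aligned}
R = \{\ & q_0(a(\tau_1\tau_2\tau_3)) \to H\,q_1(\tau_3)\,q_0(\tau_1)\,H\,q_1(\tau_3)\,q_0(\tau_1), \\
& q_0(b(\tau_1\tau_2)) \to q_0(\tau_2), \\
& q_0(c(\tau_1)) \to E, \\
& q_1(a(\tau_1\tau_2\tau_3)) \to q_1(\tau_3), \\
& q_1(e) \to AR\ \}
\end{aligned}$$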
$M$ performs on the tree of the example above the derivation
$$\begin{aligned}
& q_0(a(b(ec(e))da(db(c(d)d)e))) \\
\vdash\ & H\,q_1(a(db(c(d)d)e))\,q_0(b(ec(e)))\,H\,q_1(a(db(c(d)d)e))\,q_0(b(ec(e))) \\
\vdash\ & H\,q_1(e)\,q_0(b(ec(e)))\,H\,q_1(a(db(c(d)d)e))\,q_0(b(ec(e))) \\
\vdash\ & HAR\,q_0(b(ec(e)))\,H\,q_1(a(db(c(d)d)e))\,q_0(b(ec(e))) \\
\vdash\ & HAR\,q_0(c(e))\,H\,q_1(a(db(c(d)d)e))\,q_0(b(ec(e))) \\
\vdash\ & HARE\,H\,q_1(a(db(c(d)d)e))\,q_0(b(ec(e))) \vdash^* HAREHARE
\end{aligned}$$
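The example transducer is small enough to run directly. The sketch below is a minimal Python rendering under stated assumptions: trees are nested `(label, children)` tuples, the rule table is keyed by state, label, and rank, and there is at most one rule per key. That last assumption holds for this example, whereas TDTTs in general may be nondeterministic. The names `node`, `RULES`, and `transduce` are illustrative.

```python
def node(label, *children):
    """Build a tree as a (label, children) pair."""
    return (label, children)


# The example tree a(b(e c(e)) d a(d b(c(d) d) e)).
T = node("a",
         node("b", node("e"), node("c", node("e"))),
         node("d"),
         node("a",
              node("d"),
              node("b", node("c", node("d")), node("d")),
              node("e")))

# (state, label, rank) -> right-hand side.
# A string is output; a pair (q, i) means "process the i-th subtree in state q".
RULES = {
    ("q0", "a", 3): ["H", ("q1", 3), ("q0", 1), "H", ("q1", 3), ("q0", 1)],
    ("q0", "b", 2): [("q0", 2)],
    ("q0", "c", 1): ["E"],
    ("q1", "a", 3): [("q1", 3)],
    ("q1", "e", 0): ["AR"],
}


def transduce(state, tree):
    """Apply the unique rule for (state, label, rank) and recurse top-down."""
    label, children = tree
    rhs = RULES.get((state, label, len(children)))
    if rhs is None:
        raise ValueError(f"no rule for state {state} at label {label}")
    out = []
    for item in rhs:
        if isinstance(item, str):
            out.append(item)
        else:
            q, i = item
            out.append(transduce(q, children[i - 1]))  # subtrees are 1-indexed
    return "".join(out)


print(transduce("q0", T))
```

Starting in `q0` at the root, the sketch produces `HAREHARE`, the same word as the derivation above. Note that the first rule copies its output by visiting the third and first subtrees twice each.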
TDGs AND TDTTs 
There are some obvious similarities between TDGs and TDTTs. It is
easy to see that not every TDTT can be transformed into an equivalent
TDG, because the TDTTs have the states as an additional means
to direct derivations. In some cases the derivation can be directed
by appropriate structural conditions in the same way as it is done
by states, but it is easy to construct examples where this is impossible.
On the other hand, each TDG can be transformed into an equivalent
TDTT, provided it is loop-free in the sense explained below. The main
step of this transformation is to put together some of the productions
so that the resulting productions
satisfy the condition that all symbols of the structural condition
part except $\alpha_1$ are situated below the symbol $\alpha_1$ in each tree, where
$\alpha_1$ corresponds to the first component of the production.
Take e.g. the productions
$([a_1],\ \mathrm{DOM}(a_1,b_1),\ [b_1][b_1])$
$([b_1],\ \mathrm{DOM}(b_1,x_1) \wedge \mathrm{LFT}(e_1,x_1),\ H[x_1])$
$([c_1],\ \mathrm{DOM}(x_1,c_1) \wedge \mathrm{LFT}(x_1,x_2) \wedge \mathrm{DOM}^*(x_2,e_1),\ [e_1]E)$
The first and the second production satisfy the condition; the third
one does not, because the nodes assigned to $x_1$ and $x_2$ are above the
one assigned to $c_1$ in each tree which satisfies the structural
condition. But we can put together the second and the third production and get a new one:
$([b_1],\ \mathrm{DOM}(b_1,c_1) \wedge \mathrm{LFT}(e_1,c_1) \wedge \mathrm{LFT}(b_1,x_2) \wedge \mathrm{DOM}^*(x_2,e_1),\ H[e_1]E)$
Now this production is "better" than the third one above, but it does
not yet satisfy our condition. Therefore we put it together with the first one and get
$([a_1],\ \mathrm{DOM}(a_1,b_1) \wedge \mathrm{DOM}(b_1,c_1) \wedge \mathrm{LFT}(e_1,c_1) \wedge \mathrm{LFT}(b_1,x_2) \wedge \mathrm{DOM}^*(x_2,e_1),\ H[e_1]EH[e_1]E)$
This production is acceptable, and together with the production
$([e_1],\ \mathrm{TRUE},\ AR)$
it performs the same derivation as the four productions above. The
productions resulting from this transformation process all proceed
downward in a tree. Each of them can be transformed into a
TDTT of its own, and finally these single TDTTs are composed into one
TDTT which is equivalent to the TDG.
The transformation process sketched above can be carried out only if the
TDG is loop-free. That means that each node of a tree is passed
at most once during a derivation in the TDG.
Now we can adopt the result of Baker \[1\] about top-down tree
transductions. It states that the family of the images of recognizable
sets of trees (e.g. the set of derivation trees of a context-free
grammar) under a top-down transduction is properly contained in the
family of deterministic context-sensitive languages. In other words,
the result of the translation of the set of derivation trees of a
context-free grammar by a TDG is at most a deterministic context-sensitive language.

REFERENCES 

\[1\] Baker, B.S., Generalized Syntax Directed Translation, Tree
Transducers, and Linear Space, SIAM J. Comput. 7 (1978)
376 - 391

\[2\] Brainerd, W.S., Tree Generating Regular Systems, Inf. and
Control 14 (1969) 217 - 231

\[3\] Chomsky, N., Aspects of the Theory of Syntax, MIT Cambridge,
Mass., 1965

\[4\] Engelfriet, J., Rozenberg, G., and Slutzki, G., Tree Transducers,
L Systems, and Two-Way Machines, JCSS 20 (1980) 150 - 202

\[5\] Joshi, A.K., and Levy, L.S., Constraints on structural
descriptions: local transformations, SIAM J. Comput. 6 (1977)
272 - 284

\[6\] Kolvenbach, M., Lötscher, A., and Lutz, H.-D., (eds.),
Künstliche Intelligenz und natürliche Sprache, Forschungsberichte
des Instituts für deutsche Sprache 42, Narr-Verlag, Tübingen,
1979

\[7\] Rosen, B.K., Tree-Manipulating Systems and Church-Rosser
Theorems, JACM 20 (1973) 160 - 187

\[8\] Wulz, H., Formalismen einer Übersetzungsgrammatik,
Forschungsberichte des Instituts für deutsche Sprache 46, Narr-Verlag, Tübingen, 1979
