USES OF C-GRAPHS IN A PROTOTYPE FOR AUTOMATIC TRANSLATION
Marco A. CLEMENTE-SALAZAR 
Centro de Graduados e Investigación,
Instituto Tecnológico de Chihuahua,
Av. Tecnológico No. 2909,
31310 Chihuahua, Chih., MEXICO. 
ABSTRACT 
This paper presents a prototype, not yet completely
operational, that is intended to use c-graphs in the
translation of assemblers. First, the formalization of
the structure and its principal notions (substructures,
classes of substructures, order, etc.) is presented.
The next section describes the prototype, which is
based on a Transformational System whose nodes are
rewriting systems of c-graphs. The following part
discusses a set of operations on the structure.
Finally, the implementation in its present state is
shown.
1. INTRODUCTION. 
In the past [10,11], several kinds of representation
have been used (strings, labelled trees, trees with
"decorations", graphs of strings and (semantic)
networks). C-graphs originated as an alternative for
the representation and treatment of ambiguities in
Automatic Translation. In earlier papers [4,5] this
structure is named E-graph, but c-graph is better
suited since it is a generalized "grafo de cadenas"
(graph of strings).
This structure combines some advantages of the
Q-systems [7] and of the trees of ARIANE-78 [1,2,11],
in particular, the use of only one structure for the
whole translation process (as in the former) and
foreseeable decidability and parallelism (as in the
latter). This paper presents a prototype, not yet
completely operational, that uses c-graphs and is
intended to translate assemblers in order to refine
the adequacy of this kind of structure for the
translation of natural languages.
2. DEFINITIONS 
C-graph. A c-graph G is a cycle-free, labelled graph
[1,9] without isolated nodes and with exactly one entry
node and one exit node. It is completely determined by
a 7-tuple: G=(A,S,p,I,O,E,ε), where A is a set of arcs,
S a set of nodes, p a mapping of A into SxS, I the
input node, O the output node, E a set of labels
(c-trees, c-graphs) and ε a mapping of A into E. For
the sake of simplicity, arcs and labels will be merged
in the representation of G (cf. Fig.1). Interesting
c-graphs are sequential c-graphs (cf. Fig.2a) and
bundles (cf. Fig.2b).
G = [drawing not recoverable]
A={1,...,12} ; S={1,...,7} ; I={1} ; O={7}
p={(1,1,2), (2,2,4), (3,4,5), (4,5,7), (5,5,6),
(6,6,7), (7,6,7), (8,2,3), (9,3,4), (10,3,5),
(11,1,2), (12,1,2)}
E={a,b,c,d,e,f,g,h,i,j,k}
ε={(1,a), (2,b), (3,f), (4,g), (5,i), (6,j),
(7,k), (8,c), (9,d), (10,e), (11,b), (12,h)}
Fig.1. A c-graph.
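The listing of Fig.1 can be transcribed directly into a small data structure. The following is a minimal sketch in Python, not the representation used in the prototype: the names ARCS, NODES and is_c_graph are hypothetical, and we assume that the entry node is the only node without entering arcs and the exit node the only node without leaving arcs.

```python
from collections import defaultdict

# Arcs of G from Fig.1: arc id -> (source node, target node, label).
ARCS = {
    1: (1, 2, "a"), 2: (2, 4, "b"), 3: (4, 5, "f"), 4: (5, 7, "g"),
    5: (5, 6, "i"), 6: (6, 7, "j"), 7: (6, 7, "k"), 8: (2, 3, "c"),
    9: (3, 4, "d"), 10: (3, 5, "e"), 11: (1, 2, "b"), 12: (1, 2, "h"),
}
NODES = set(range(1, 8))


def is_c_graph(nodes, arcs, entry, exit_):
    """Check the conditions of the definition on a candidate c-graph."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    succ = defaultdict(list)
    for src, dst, _label in arcs.values():
        out_deg[src] += 1
        in_deg[dst] += 1
        succ[src].append(dst)
    # No isolated nodes: every node touches at least one arc.
    if any(in_deg[n] + out_deg[n] == 0 for n in nodes):
        return False
    # Assumption: entry = only node with no entering arc,
    # exit = only node with no leaving arc.
    if [n for n in nodes if in_deg[n] == 0] != [entry]:
        return False
    if [n for n in nodes if out_deg[n] == 0] != [exit_]:
        return False
    # Cycle-free: a topological traversal must consume every node.
    remaining = dict(in_deg)
    ready, seen = [entry], 0
    while ready:
        n = ready.pop()
        seen += 1
        for m in succ[n]:
            remaining[m] -= 1
            if remaining[m] == 0:
                ready.append(m)
    return seen == len(nodes)
```

With these definitions, is_c_graph(NODES, ARCS, 1, 7) holds for G, while adding an arc from node 5 back to node 2 makes the check fail.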
[Figure: (a) the sequential c-graph G1; (b) a bundle.]
Fig.2. A sequential c-graph (a) and a bundle (b).
C-trees. A c-tree or a tree with decorations 
is an ordered tree, with nodes labelled by a label 
and a decoration that is itself a decorated tree, 
possibly empty. 
Classes of c-graphs. There are three major 
classes: (1) recursive c-graphs (cf. Fig.3a) where 
each arc is labelled by a c-graph; (2) simple 
c-graphs (cf. Fig.1) where each arc is labelled by
a c-tree and (3) regular c-graphs, a proper sub- 
class of the second that is obtained by concatena- 
tion and alternation of simple arcs (cf. Fig.3b). 
By denoting concatenation by "." and alternation 
by "+", we have an evident linear representation. 
For example, G4=g+i.(j+k). Note that not every
c-graph may be obtained by these operations, e.g. G.
Substructures. For the sake of homogeneity, the only
substructures allowed are those that are themselves
c-graphs. They will be called sub-c-graphs or seg's.
For example, G1 and G2 are seg's of G.
a) A recursive c-graph.
b) A regular c-graph, G4.
Fig.3. Two classes of c-graphs. 
Isolatability. This feature determines, for each
c-graph G, several classes of seg's. Intuitively, an
isolated seg G' is a seg that has no arcs that "enter"
or "leave" G'. Depending on the relation that each
isolated seg keeps with the rest of the c-graph,
several classes of isolatability can be defined. In
what follows, I' and O' denote the input and output
nodes of G'.
a) Weak isolatability. A seg G' of G is weakly
isolatable (segif) if and only if for every
node x of G' (except I' and O'), all of the
arcs that leave or enter x are in G'. E.g.:
G5=i is a segif of G.
b) Normal isolatability. A seg G' of G is normally
isolatable (segmi) if and only if it is a
segif and there is a path, not in G', that
leaves I' and enters O'. Example: G6=k
is a segmi of G.
c) Strong isolatability. A seg G' of G is
strongly isolatable (segfi) if and only if the
only node that has entering arcs not in G' is
I' and the only node that has leaving arcs not
in G' is O'. When G' is not an arc and there
is no segfi strictly contained in G', then G'
is an "elementary segfi"; if G contains no
segfi, then G is elementary. E.g. G4 is a
segfi of G.
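The three definitions can be checked mechanically on the c-graph of Fig.1. The sketch below is only an illustration under stated assumptions: a seg is given as a set of arc numbers together with its input and output nodes, "a path not in G'" is read as a path that uses no arc of G', and the function names are hypothetical. The code does not verify that the given arcs form a c-graph.

```python
# Arcs of G from Fig.1: arc id -> (source node, target node).
ARCS = {1: (1, 2), 2: (2, 4), 3: (4, 5), 4: (5, 7), 5: (5, 6), 6: (6, 7),
        7: (6, 7), 8: (2, 3), 9: (3, 4), 10: (3, 5), 11: (1, 2), 12: (1, 2)}


def nodes_of(sub):
    """Nodes touched by the arcs of the sub-c-graph."""
    return {n for a in sub for n in ARCS[a]}


def is_weak(sub, i_sub, o_sub):
    """Every arc touching an inner node of the sub-c-graph belongs to it."""
    inner = nodes_of(sub) - {i_sub, o_sub}
    return all(a in sub for a, (s, d) in ARCS.items()
               if s in inner or d in inner)


def has_outside_path(sub, i_sub, o_sub):
    """Is there a path from i_sub to o_sub using only arcs outside sub?"""
    stack, seen = [i_sub], set()
    while stack:
        n = stack.pop()
        if n == o_sub:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(d for a, (s, d) in ARCS.items()
                         if s == n and a not in sub)
    return False


def is_normal(sub, i_sub, o_sub):
    return is_weak(sub, i_sub, o_sub) and has_outside_path(sub, i_sub, o_sub)


def is_strong(sub, i_sub, o_sub):
    """Outside arcs may enter only at i_sub and leave only from o_sub."""
    inside = nodes_of(sub)
    for a, (s, d) in ARCS.items():
        if a in sub:
            continue
        if d in inside and d != i_sub:
            return False
        if s in inside and s != o_sub:
            return False
    return True
```

Under this reading, G5=i (arc 5) is weakly but not normally isolatable, G6=k (arc 7) is normally but not strongly isolatable, and G4 (arcs 4 to 7) is strongly isolatable, in agreement with the examples above.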
Order and roads. Two order relations are considered:
(1) a "vertical" order, or linear order of the arcs
having the same initial node, and (2) a "horizontal"
order, or partial order between two arcs on the same
path. A road is a path from I to O. Vertical order
induces a linear order on roads.
3. DEFINITION OF THE PROTOTYPE. 
The prototype consists of a model and a data
structure. The model is essentially a generalization
of a Transformational System (TS) analogous to ROBRA
[2], whose grammars are rewriting systems of c-graphs
(RSC) [4,5,6]. The data structure is the c-graph.
3.1 A Transformational System.
This TS is a c-graph-to-c-graph transducer. It is a
"control" graph whose nodes are RSC and whose arcs are
labelled by conditions.
A TS is a cycle-free oriented graph with only one
input and such that:
(1) Each node is labelled with a RSC or &nul.
(2) &nul has no successor.
(3) Each grammar of the RSC has a transition
scheme S or ε (empty scheme).
(4) Arcs of the same initial node are ordered.
TS works heuristically. Given a c-graph g0 as an
input, it searches for the first path ending in &nul.
Reaching &nul implies that all of the transition
schemes on the path were satisfied. Any scheme that is
not satisfied provokes a search for a new path. For
example, if S1 is satisfied, TS produces G1(g0)=g1 and
proceeds to calculate G2(G1(g0))=g2. If S4 is
satisfied, the system stops and produces g2.
Otherwise, it backtracks to G1 and tests S2. If it is
satisfied, g1 is produced. Otherwise, it tests S3,
etc.
[Figure: control graph with grammars G1 and G2,
transition schemes S1 to S4 and &nul.]
Fig.4. A Transformational System.
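The search for the first path ending in &nul can be sketched as a depth-first traversal with backtracking. This is a toy illustration, not the PASCAL implementation: c-graphs are replaced by strings, grammars and transition schemes by Python functions, and the layout of the control graph (S1 from G1 to G2, S2 and S3 from G1 to &nul, S4 from G2 to &nul) is an assumption inferred from the example in the text. We also assume that a scheme is tested on the result of the grammar of the node it leaves.

```python
NUL = "&nul"


def run_ts(ts, node, g):
    """Return the result on the first path ending in &nul, or None."""
    grammar, arcs = ts[node]
    g = grammar(g)                      # apply the RSC labelling this node
    for scheme, target in arcs:         # arcs are tried in their given order
        if not scheme(g):
            continue                    # scheme not satisfied: next arc
        if target == NUL:
            return g
        result = run_ts(ts, target, g)
        if result is not None:
            return result               # else backtrack and try the next arc
    return None


def make_ts(s4_holds):
    """Hypothetical control graph; s4_holds fixes the outcome of S4."""
    g1 = lambda g: g + ">G1"
    g2 = lambda g: g + ">G2"
    yes = lambda g: True
    return {
        "G1": (g1, [(yes, "G2"), (yes, NUL), (yes, NUL)]),  # S1, S2, S3
        "G2": (g2, [(lambda g: s4_holds, NUL)]),            # S4
    }
```

When S4 holds, run_ts(make_ts(True), "G1", "g0") yields the string standing for g2; when it fails, the search backtracks to G1, S2 is satisfied, and the string standing for g1 is returned. Termination follows from the TS being cycle-free.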
3.2 A Rewriting System.
Let us consider a simple example: let GR be the
following grammar for syntactic analysis (it is not
intended as an example of linguistic value).
R1: (g1+α1+g2)(g3+α2+g4)* / α1=GN, α2=GV / ==
    (g1+g2)(g3+α2+g4)+β1 / β1:=PHRA(α1,α2) /.
R2: (g1+α1+g2)(g3+α2+g4) / α1=VB, α2=GN / ==
    (g1+g2)(g3+α2+g4)+β1 / β1:=PRED(α1,α2) /.
R3: α1(g1+α2+g2) / α1=NP, α2=AD / ==
    α1(g1+g2)+β1 / β1:=GN(α1,α2) /.
R4: α1(g1+α2+g2) / α1=NP, α2=PRED / ==
    g1+g2+β1 / β1:=PHRA(α1,α2) /.
R5: (g1+α1+g2)(g3+α2+g4) / α1=PRON, α2=VB / ==
    (g1+g2)(g3+α2+g4)+β1 / β1:=GV(α1,α2) /.
R6: (g1+α1+g2)(g3+α2+g4) / α1=ART, α2=NM / ==
    (g1+g2)(g3+α2+g4)+β1 / β1:=GN(α1,α2) /.
As we can see, each rule has a name (R1, R2, ...), a
left side and a right side.
The left side defines the geometrical form and the
condition that an actual seg must meet in order to be
transformed. It is a c-graph scheme composed of two
parts: the structural descriptor, which defines the
geometrical form, and the condition (between slashes),
which tests label information. In the first rule, the
structural descriptor uses "*" as an "element of
structural description". It denotes the fact that no
seg must be right-concatenated to g3+α2+g4.
The right side defines the transformation to be done.
It consists of a structural descriptor, similar to the
one on the left side, and a list of label assignments
(also between slashes) where, for each new label, we
specify the values it takes and, for each old one, its
possible modifications. A point ends the rule. Note the
properties of an empty g: if g' is any c-graph, then
g.g'=g and g+g'=g'.
Let us analyze the phrase: "Ana lista la tira". The
representation in our formalism is G7. Morphological
analysis produces G8. Note that all ambiguities are
kept in the same structure in the form of parallel
arcs. The application of GR to G8 results in G9, where
each arc is labelled with a c-tree giving a possible
interpretation of G8 in grammar GR. The sequence of
applications is R3, R6, R5, R1, R2, R4. The system
stops when no more rules are applicable.
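[Figure: drawings of G7, G8 and G9 not recoverable. G7
is the sequence Ana, lista, la, tira; the surviving arc
labels of G8 are Ana (np), listo (ad), el, tirar and lo
(pron). In the figure, A1 and A2 stand for:]
A1=PHRA(GN(NP(Ana), AD(listo)), GV(PRON(lo),
VB(tirar)))
A2=PHRA(NP(Ana), PRED(VB(listar), GN(ART(el),
NM(tira))))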
Operations are divided in two classes: (1) those
where the structure is taken as a whole (global) and
(2) those that transform substructures (local).
1. Global Operations.
Concatenation and alternation have been defined
above. These operations produce sequential c-graphs
and bundles respectively, as well as the polynomial
writing of regular c-graphs.
Expansion. This operation produces a bundle exp(G)
from all the roads of a c-graph G. For example,
expansion of G10 produces exp(G10)=(b.f)+(c.d.f)+
(c.e).
[Figure: G10 and exp(G10).]
Fig.6. Expansion of a c-graph.
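Expansion amounts to enumerating the roads of the c-graph in their vertical order. The sketch below is a hedged illustration: we assume that G10 is the seg of G made of arcs 2, 3, 8, 9 and 10 of Fig.1 (which is consistent with the value of exp(G10) given above) and that the arc numbers of Fig.1 give the vertical order. The names roads and expand are hypothetical.

```python
# Assumed content of G10, read off the listing of Fig.1:
# arc id -> (source node, target node, label).
G10 = {2: (2, 4, "b"), 3: (4, 5, "f"), 8: (2, 3, "c"),
       9: (3, 4, "d"), 10: (3, 5, "e")}


def roads(arcs, node, exit_):
    """Yield every road from node to exit_ as a tuple of labels."""
    if node == exit_:
        yield ()
        return
    for arc_id in sorted(arcs):         # assumed vertical order: arc number
        src, dst, label = arcs[arc_id]
        if src == node:
            for rest in roads(arcs, dst, exit_):
                yield (label,) + rest


def expand(arcs, entry, exit_):
    """Linear writing of the bundle made of all roads."""
    return "+".join("(" + ".".join(r) + ")"
                    for r in roads(arcs, entry, exit_))
```

Calling expand(G10, 2, 5) gives (b.f)+(c.d.f)+(c.e), the bundle stated in the text.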
Factorization. There are two kinds and their results
may differ. Consider G11=a.b+a.c+d.e+d.f+g.f+h.e. Left
factorization produces G12=a.(b+c)+d.(e+f)+g.f+h.e,
and right factorization G13=a.b+a.c+(d+h).e+(d+g).f.
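The difference between the two factorizations can be reproduced on the linear writing of a bundle. The following sketch is deliberately restricted: it assumes a flat bundle (no parentheses) whose roads all have at least two arcs, it factors one level only, and it groups roads in order of first appearance. It is not the algorithm of [4]; the function names are hypothetical.

```python
def parse(bundle):
    """'a.b+a.c' -> [['a', 'b'], ['a', 'c']] (one list of labels per road)."""
    return [road.split(".") for road in bundle.split("+")]


def group(road_list, key_index):
    """Group roads by first (0) or last (-1) label, in first-seen order."""
    groups = {}
    for road in road_list:
        rest = road[1:] if key_index == 0 else road[:-1]
        groups.setdefault(road[key_index], []).append(".".join(rest))
    return groups


def alt(parts):
    """Write an alternation, with parentheses only when needed."""
    return parts[0] if len(parts) == 1 else "(" + "+".join(parts) + ")"


def left_factor(bundle):
    return "+".join(head + "." + alt(tails)
                    for head, tails in group(parse(bundle), 0).items())


def right_factor(bundle):
    return "+".join(alt(heads) + "." + tail
                    for tail, heads in group(parse(bundle), -1).items())
```

Applied to G11, left_factor returns the writing of G12 and right_factor that of G13.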
Arborization. This operation constructs a c-tree from
a c-graph. Several kinds of c-trees can be constructed,
but we search for a tree that keeps vertical and
horizontal orders, i.e. one that codes the structure of
the c-graph. An "and-or" (y-o) tree is well suited for
this purpose. The result of the operation will be a
c-graph with one and only one arc, labelled by the
and-or tree. For example, arb(G)=G14 (cf. Fig.7). Note
that the non-regular seg has α as a root. Regular seg's
have o.
G14 = [drawing not recoverable], where
A=y(o(y(a),y(b),y(h)),α(y(b,f),y(c,d,f),y(c,e)),
o(g,y(i,o(j,k))))
Fig.7. Arborization of G.
Fig.5. Example of sentence analysis. 
3.3 Operations. 
2. Local Operations. 
Replacement. Given two c-graphs G and G", this
operation substitutes G" for a seg G' in G, e.g. if
G=G4, G"=m+n and G'=i, then the result will be
G15=g+(m+n).(j+k).
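For regular c-graphs, replacement can be illustrated on the polynomial writing. The sketch below assumes a nested-tuple encoding that is not part of the paper: ("+", ...) stands for an alternation, (".", ...) for a concatenation and a plain string for a single arc. It replaces every occurrence of the given sub-expression, which coincides with the replacement of one seg only when that seg occurs once, as in the example.

```python
# G4 = g+i.(j+k) in the assumed nested-tuple encoding.
G4 = ("+", "g", (".", "i", ("+", "j", "k")))


def replace(g, old, new):
    """Return g with every occurrence of sub-expression old replaced by new."""
    if g == old:
        return new
    if isinstance(g, str):
        return g
    return (g[0],) + tuple(replace(part, old, new) for part in g[1:])


def show(g, parent=None):
    """Linear writing, with '.' binding tighter than '+'."""
    if isinstance(g, str):
        return g
    text = g[0].join(show(part, g[0]) for part in g[1:])
    return "(" + text + ")" if g[0] == "+" and parent == "." else text


G15 = replace(G4, "i", ("+", "m", "n"))
```

Here show(G4) gives g+i.(j+k) and show(G15) gives g+(m+n).(j+k), as in the example.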
Addition. This operation inserts a c-graph G' into
another, G, by merging two distinct nodes (x,y) of G
with the input and output of G'. Addition requires
only that the insertion does not produce cycles. Note
that if (I,O) is taken as the couple of nodes, we have
alternation. For example, let (2,3) be a couple of
nodes of G16 and take G'=G17=s+u. The resulting c-graph
is G18.
[Figure: G16 and G18.]
Fig.8. Addition of a c-graph.
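The cycle condition of addition can be sketched as a reachability test: inserting G' between x and y is safe only if x cannot be reached from y. The example below is hypothetical throughout, since the drawing of G16 is not available: the chain used as host c-graph is invented, and only the case where G' is a bundle such as s+u is handled (a general G' would also require fresh names for its inner nodes).

```python
def reachable(arcs, start, goal):
    """Is there a path from start to goal? Arcs are (source, target, label)."""
    stack, seen = [start], set()
    while stack:
        n = stack.pop()
        if n == goal:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(d for s, d, _label in arcs if s == n)
    return False


def add_bundle(arcs, x, y, labels):
    """Insert a bundle (one arc per label) between distinct nodes x and y."""
    if x == y or reachable(arcs, y, x):
        raise ValueError("addition would produce a cycle")
    return arcs + [(x, y, label) for label in labels]


# Hypothetical host c-graph: a chain over nodes 1, 2, 3, 5.
chain = [(1, 2, "p"), (2, 3, "q"), (3, 5, "r")]
result = add_bundle(chain, 2, 3, ["s", "u"])
```

In this sketch, result contains two new parallel arcs s and u from node 2 to node 3, whereas add_bundle(chain, 3, 2, ["s"]) is rejected because node 3 is reachable from node 2.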
Erasing. This eliminates a substructure G' of a
c-graph G. Erasing may destroy the structure even if
we work with isolated seg's. Consequently, it is only
defined on particular classes of seg's, namely segfi's
and segmi's. For any other substructure, we eliminate
the smallest segmi that contains it. A special case is
that of a segfi G' such that I and O do not belong to
G'. Eliminating G' in such a case produces two
non-connected nodes in the c-graph, which we have
chosen to merge to preserve homogeneity. Example: let
us take G and G'=G10; then the result of erasing G10
from G is G19=G2.G4.
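The special case of a segfi that touches neither I nor O can be sketched as follows: the arcs of G' are removed and its output node is merged into its input node. As before, we assume that G10 consists of arcs 2, 3, 8, 9 and 10 of Fig.1, running from node 2 to node 5; the function does not itself check that the seg is strongly isolatable, and the names are hypothetical.

```python
# G of Fig.1: arc id -> (source node, target node, label).
G = {1: (1, 2, "a"), 2: (2, 4, "b"), 3: (4, 5, "f"), 4: (5, 7, "g"),
     5: (5, 6, "i"), 6: (6, 7, "j"), 7: (6, 7, "k"), 8: (2, 3, "c"),
     9: (3, 4, "d"), 10: (3, 5, "e"), 11: (1, 2, "b"), 12: (1, 2, "h")}


def erase(arcs, sub, i_sub, o_sub):
    """Remove the arcs in sub, then merge node o_sub into node i_sub."""
    def merged(node):
        return i_sub if node == o_sub else node
    return {a: (merged(s), merged(d), label)
            for a, (s, d, label) in arcs.items() if a not in sub}


def roads(arcs, node, exit_):
    """Roads from node to exit_, each written as a string of labels."""
    if node == exit_:
        return [""]
    found = []
    for a in sorted(arcs):
        s, d, label = arcs[a]
        if s == node:
            found += [label + rest for rest in roads(arcs, d, exit_)]
    return found


G19 = erase(G, {2, 3, 8, 9, 10}, 2, 5)
```

The roads of G19 are the nine combinations of a, b or h followed by g, i.j or i.k. This agrees with G19=G2.G4 if G2 is the bundle of arcs between nodes 1 and 2, which is an assumption here since the drawing of G2 is lost.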
4. IMPLEMENTATION. 
A small system has been programmed in PROLOG [4]
(mainly operations) and in PASCAL (TS and RSC). As a
first approach, we chose to work with regular
c-graphs, since there is always a string that
represents a c-graph of this class.
In its present state, the system has two 
parts: (1) the Transformational System including 
the rewriting system and (2) the set of local and 
global operations. 
The TS is interactive. It consists of an ana- 
lyzer that verifies the structure of the TS given 
as a console input and of the TS proper. As data 
we have the console input and a segment composed of 
transition schemes. There are no finer controls for 
different modes of grammar execution. 
Regarding operations, and from a methodological point
of view, algorithms for c-graph treatment can be
divided in two classes: (1) the one where we search
for substructures and (2) the one where this search is
not needed. Obviously, local operations belong to the
first class, but among global operations, only
concatenation, alternation and expansion belong to the
second one. A detailed description of the algorithms of
this part of the system can be found in [4].
5. CONCLUSION. 
Once we have an operational version of the prototype,
we intend, as a first approach, to proceed to the
translation of assemblers of the microprocessors
available in our laboratory, such as INTEL's 8085 or
8080 and MOTOROLA's 6800.
6. REFERENCES. 
[1] Boitet, Ch. UN ESSAI DE REPONSE A QUELQUES
QUESTIONS THEORIQUES ET PRATIQUES LIEES A LA
TRADUCTION AUTOMATIQUE. DEFINITION D'UN SYSTEME
PROTOTYPE. Thèse d'Etat. Grenoble. Avril 1976.
[2] Boitet, Ch. AUTOMATIC PRODUCTION OF CF AND CS
ANALYSERS USING A GENERAL TREE TRANSDUCER. Rapport
de recherche de l'Institut de Mathématiques
Appliquées N°218. Grenoble. Novembre 1979.
[4] Clemente-Salazar, M. ETUDES ET ALGORITHMES
LIES A UNE NOUVELLE STRUCTURE DE DONNEES EN T.A.:
LES E-GRAPHES. Thèse Dr-Ing. Grenoble. Mai 1982.
[5] Clemente-Salazar, M. E-GRAPHS: AN INTERESTING
DATA STRUCTURE FOR M.T. Paper presented at
COLING-82. Prague. July 1982.
[6] Clemente-Salazar, M. C-GRAPHS: A DATA
STRUCTURE FOR AUTOMATED TRANSLATION. Paper presented
at the 26th International Midwest Symposium on
Circuits and Systems. Puebla. Mexico. August 1983.
[7] Colmerauer, A. LES SYSTEMES-Q. Université de
Montréal. Publication Interne N°43. Septembre 1970.
[9] Kuntzmann, J. THEORIE DES RESEAUX (GRAPHES).
Dunod. Paris. 1972.
[10] Vauquois, B. LA TRADUCTION AUTOMATIQUE A
GRENOBLE. Document de Linguistique Quantitative
N°24. Dunod. Paris. 1975.
[11] Vauquois, B. ASPECTS OF MECHANICAL
TRANSLATION IN 1979. Conference for Japan IBM
Scientific Program. Document du Groupe d'Etudes pour
la Traduction Automatique. Grenoble. July 1979.
