Saussurian analogy: a theoretical account and its application 
Yves Lepage & Ando Shin-ichi 
ATt{ Interpreting T('.lecommu~tications l{esearch Labs, 
llikaridai 2-2, Seika.-(:ho, Soraku-gull, Kyoto 619-02, .\]apan 
{ lepage, ando }~±tl. atr. co. jp 
Abstract 
in tim (7ours de linguislique g&~,&'alc, 
S;d, llSSllr() l-fl(;Ilti()llS kl, p\[tellO\]ll(:\[IOll OF 
tremendous itnl)ortance in l+mguage, 
analogy. For example, giwm the series 
walk, walked and look, how can we (:oil, 
t,tl( + fourth term, looked? We give ;~ t)ossi -- 
I)le a,(:count of this phenomenotl in t, et'tns 
of edition distances, thus paving the way 
to comptlt+d;ionM applic++tk)ns. This ex- 
l)lanal;ion a ccoutJts for prefixing, suffix.+ 
ing and infixing. We show how it is 
possible to perform the a.nMogical anal 
ysis and generation of sentences, using 
tt tree-bank ;m(l ~q)l)roxitnaLl;e l);tt, tertF 
ln&t(;hitlg. As a, (;OllSe(ltlellce ~ our \[)ro- 
posal linds its t)lnee in I;he (~xa.tul)le-l)as(.'d 
approach to naturM language processing. 
1 Introduction 
In the Cours de linguistique gdn&ul, which dates 
back t;o 1916, Saussm:e lnetltions a phenoruenon 
of tremendous iml)ortance in language, analogy: 
given some series of three words, human beings 
arc able to (;oii| a fourth one. One can see a 
reaetualisation of this principle in the ex~unl)te- 
based nppro;+ch to machine l,r~mslation. AnMogy 
sec~\[l(|s to h+Lve tiever b('+en Lheorise(\] ill a, ltiono- 
lingual frmnework, maldng its hilinguM al>l)li('~v- 
tion questionable. 'l'he purpose of this article is 
to propose a possible, mathemati('Mly sound cx- 
I)l~mat\[on, and to show the p;~t,h to comtmtational 
applications. 
2 Saussurian analogy 
In Chapter \[our, t)~rt 1\[1 o\[" the Cou'rs de linguis- 
lique .qdn&'alc t, Saussur(; points out wha.t he calls 
analogy: given two forms of ~ given word, and 
ouly one form of a second word, it is possil>h'. I.o 
tall uxa.ml>les in this six:tlott +u'e front the (\]ours. 
coin the missing form ~. 
Latin: oratorem : oralor= honorem : x 
X ~: h, ollor 
In this particular case, Saussure was interested in 
explaining the competition of honor with the older 
fornl honos, honor is not a phonetic transforma- 
tion of hon.os by rhotacism, but simply the result 
of a.n alogy. 
Analogy is very general, and its ('.\[\[~('ts are 
seen in a number of other places. Ill may ex 
plain M1 flexional paradigms, from conjugation to 
declension :~. 
(~ern\]an: screen : .sclzle := lachen : x 
x : lachte 
Analogy a.lso explains what is called the produc 
l;ivity of bmguage, i.e., the fact that underst~md 
a.ble words cml be coined, which are not regis- 
tered in dictiona.ries, nnd may have never been 
uttered before by the speaker nor heard before by 
the list;crier 4. 
I,'rench: rdaction : r'6actionnaUv = r@rcssion : x 
x = r@rcssionnairc 
Fina.lly, analogy Mso cxphdns incorrect \[brms or 
barbarisms, ex;mq)le~s of which ,~re flmquent in 
child langua.g(; ~. 
French: dleindrai : /leindre = viendrai : x 
x = viendre 
Our goal is to give one possible account of this 
phenomenon in compul,ationM t, erms, ;rod to show 
that, given n tree brink, ~ possible ;~pl)lica.t;ion may 
t)e the mmtysis or genera.lion of sentences. 
o,..to,, (o,.~to,:, ~pe~,,kor) +,,d ho.o; (t,oHo,.') ~.,mi- 
\[,;~tive singular, oralorem and honorcm a.ccu\]sMivc sin- 
gula.r. 
:~taHu:r+ (to la,,gh) ~u,d .~,'~tz,~r, (to put), mcht< +~:t- 
ztc past \[OrltlS. 
4rdaction (reaction) ~uM ,'@ression (repression) 
nouns, rdactionnaire (reactionary) adje(:tivc; rdprcs° 
sionnaire souuds perh~(:tly I!'rench, but will not be 
tound in +~ diction~u:y. 
~'dtei,,hv (to extinguish; to turn off) infinitive, 
('teindrai ",rod vicndrai future tense; viendre is +L b~u'- 
ba,rism in pbtce of venir (to come) (compare, in l';u- 
gllsh, qocd for wc'nl). 
717 
3 A possible account 
3.1 Notations 
Let 12 be a non-empty finite set, called the vocab- 
ulary. (12", .) is the monoid over 12 where . denotes 
concatenation. 1)* is also the infinite union of all 
12n tbr n 6 IN. By convention, 120 = {el with g 
being the empty string. 
Using these notations, analogies are equations 
with one unknown on 12': u:v = w:x. Tobe 
able to solve analogies, it is necessary to give a 
meaning to such a notation. 
3.2 A geometrical view 
In our attempt to discover a mathematical expla- 
nation of analogy, we were long hindered by the 
notation itself. Of course, the idea behind it is 
that analogy could be considered a similar psy- 
chological process as the one intervening in pro- 
portions: 
mathematics:mathematical = physics : x 
2:4 = 3:x 
But ~, the set of rationals, is mathematically 
well equipped. Addition defines a commutative 
group, and multiplication makes it a field. Pro- 
portions in iI~ are thus well understood, and safely 
solved. What is true for II~ is not for l;*. The basic 
operation, concatenation, is not comnmtative and 
does not define a group, but a relaxed structure, 
that of a monoid. And no one knows what the 
meaning of u : v could possibly be. 
In fact, looking at analogy fi'om the previous 
point of view is misleading because, intentionally 
or not, we think of nmnbers, which enforces too 
many constraints. A better, more relaxed view 
of the problem is that era rectangle. In a rect- 
angle, opposite sides and diagonals are equal (see 
Figure 1). 
rnathemO, tics physics 
mathematical ~x 
Figure 1: Analogy seen as a rectangle 
3.3 Formal|sat|on 
'Fhis view makes explicit that analogy sets a rela- 
tion between an unknown on on(" hand, and three 
terms on the oth& hand. Now, carrying on with 
the geometrical paralM, analogy may be inter- 
preted in terms of distances as follows : the dis 
tanee of any term to the unknown is the same 
as the distance between the two remaining terms. 
We thus posit the following equivalence. 
Definition 1 (Analogy) 
A f dist(u,v) = dist(w,x) 
u: v -- w: x ~=> dist(u,w) = dist(v,x) 
dist(v,w) = dist(u,x) 
The rectangle view does not forbid commutativ- 
ity for dist, a notable difference with division on 
numbers, where 2/4 is not the same as 4/2. 
3.3.1 Linguistic interpretation 
Let us linguistically interpret the previous sys- 
tem of equations. Supl)ose we get the following 
analogy to solve: mathematics: mathematical = 
physics : x. Of course, x = physical. 
The first two equations show that the terms 
on the diagonal may be exchanged. A linguis- 
tic interpretation is thai; analogy involves two 
orthogonal dimensions reflecting the duality of 
the lexeme/morpheme (or root/affix, or mean- 
ing/limction, etc.) separation. 
(list(mathematics, mathematical) = 
dist(physics, physical) 
dist( mathematics, physics) = 
dist(mathematical, physical) 
On each side of the equal sign something is con- 
served (one dimension), and something changes 
(second dimension). 
* In the example, the first equation stands for 
a conservation in meaning ("mathematics" 
as opposed to "physics") and a change in 
categories, 
. whereas the second equation stands for a con- 
servation of grammatical categories (N as op- 
posed to A), but a change in meanings. 
The third equation means that, somehow, anal- 
ogy neutralises changes performed at the same 
time along the two previous dimensions. 
dist( mathematics, physical) = 
dis.t(physics, mathematical) 
On each side of the equality sign, both changes in 
meanings and categories, performed at the same 
time, leave the proportion unchanged. 
3.3.2 Conlplete tbrmalisation 
In order to complete the \['ormMisation, dist re-- 
mains to be defined, l,;dition distances which have 
been proposed in many works (Levenshtein 65), 
(Wagner & Fischer 74), (Selkow 77), etc., are a 
good (;atl(lidate. They are mathematically sound 
as well as fat,it|rely relevant: they re\[tect a sensi- 
ble notion, that of keystrokes, an(t turn out to be 
metrics under some hypotheses. They answer the 
correction problem: what is the minimal number 
of cdit operations needed to lransJbrm one word 
into anolhcr one?. In our example, how mmty 
characters need to be (:hanged to transR)rm malh- 
emalical into physics'? Edit operations are inser- 
tion (for instance, e -~ p), (leletion (like l-+ ¢) 
718 
and replacement (like a ---+ s). A <listance can be 
defined by assigning weights to these three ()per: 
ations, 1 for each of them, for simplification. The 
edit <tistance is then a simple extension fi'om edit 
operations to strings. 
Definition 2 (Edition distanee) Let V be a 
vocab'ulary, dist is defined on V* as a commutative 
operation, in the following way: 
v(a, c v v(,,,,O (v*) 
dist(e,e) = dist(a,a)=0 
dist(e,a) = dist(a,b) = 1 ff a :/b 
dist(a.u,c) = dist(a,e) + dist(u,e) 
dist(a, c) + dist(u, b.v), 
dist(a.u,b.'v) = mi~4 dist(a,b) + dist(u,v), ) 
dist(< b) + dist(a.'u, v) 
With this delinition and a weight of 1 for each 
of the three edit operations, tile distance between 
mathematical and physics becomes 9. 
m a t h c m a I i c a / 
yszcs 
= 9 
As a mathematical result, with more general 
weighl;s, it can be proved that, if the edit Ol)er- 
ations define a metric on P U {c}, then the ('.(lit 
distance on V* is also a metric. We recall tile 
tbrmal definition of a metric. 
Definition 3 (Metric) Let S bc a set, dist a 
function from ,_q x S to IR + , the ,so/of non-negative 
real n'umbcrs, dist is a metric on S if and only if 
® (cqualily) 
V(a,b)<S 2 , dist(a,b)=0C>a=b 
• (commulalivity) 
v((,, s (list(< b) = di t(< a) 
• (h'iangle inequality) V(a, b, c) C S a, 
dist(a, c) _< dist(a, b) + dist(b, c) 
3.4 Coverage 
tlaving defined what we un<lerstand by analogy 
in a formal way, we inspect, some o\[' its proper- 
ties. We first; make a very strong but necessary 
assumption about the nature of the solution of an 
analogy. Following the linguistic feeling, we im- 
pose that tile solution of an analogy be built only 
with the elements of the vocabulary present in the 
three given terms. In other words, no material 
from outside should be used. 
This constraint does not prevent analogies from 
having multiple solutions. It suffices that the dis- 
tances become too large relative to the lengths of 
the words, a:thc = of:x is such a case. The 
constraint eliminates, for instance, all words of 
the form txy, with x and y two letters outside of 
the set {a, e, f, h, o,t}, but does not bar Ill, hhh, 
eee, which are solutions of this analogy. But, as a 
matter of fact, this kind of example does not make 
much linguistic sense. 
3.4.1 Equality 
A degenerated c~se of analogy is when two of 
tile three terms are equal. The only possihle solu- 
tion is then the third term. IlL other words, noth- 
ing new <:an really be said. This meets common 
sellse. 
v) c (v'f, = = 
This property is always true. It is proved thanks 
to the equality property of a metric: u : "u = v : x 
dist(u,,4=O=dist(v,x) ~ x=,:. 
Some. imt>ortmtt linguistic phenomena are cov- 
ered hy onr proposal for linguistic examples. But 
the corresponding mathematical properties ap- 
pear not to hold ill the general case. In fact, study- 
ing the necessary and suificient conditions under 
which they are true remains an open problem. It 
seems that, in all cases, it; has to do with some 
"weakest links" along the pair of strings consid- 
ered (minimisation of a sum of distances). 
3.4.2 Transitivity 
An important property which works in many 
cases, and at least on linguistic examples, but may 
not; be true in the general case 6, is transitivity: 
?t l V -~ lff : V I A ~ttl l ?)l ~- lJO : X zT~ I/\[V :'W :X 
This accounts for the fact; that any representative 
ill a group of conjugation/declension/die, may he 
chosen as the model. In Ancient Greek, AoTo< is 
always taken as a model for the declension of tile 
1st group of masculine nouns, although any other 
word from the same group would have been as 
good. 
3.4.3 Prefixes and suffixes 
Our definition of analogy fortunately captures 
linguistic cases where prefixes (or suffixes) are in- 
volved. 
tt.t :'~t.~ ~--- w.t :X =2? X ~ W.~ 
This is not true in tile general case. At least;, 
x = w.v ahvays verilles tile first two distance equa- 
tions: 
{ dist('u.t,u.v) = dist(l,v) 
= dist(w.t, w.v) 
= dist(l,v) 
dist(u.t,w.t) = (list(u, w) 
= dist(u.v, w.v) 
= (list(u, w) 
thanks to a property of edit distances, which we 
give here without a proof: V (u,v,w) E (V*) a, 
6Counter-example: the:ttt = a:of A a :of = 
th.c : hhh. ~ the : tit = the : hhh because 
dist(the, thc) = 0 ¢ dist(ttt, hhh) = 3 
719 
dist(u.v, u.w) = dist(v, w). But the third equation 
may not always be verified. A suIficient condition 
for it to hold is that the joints between prefixes 
and suffixes minimise some sums of distances: 
dist(u.v,w.t) = dist(u,w) + dist(t,v) 
= dist(u.t, w.v) 
This is the case when prefixes and sulfixes are dis- 
similar enough, as in our example with mathema# 
i-cs and phys-i-cal, but in the general case, only 
dist(u.v, w.t) _< dist(u, w) + dist(v, t) holds. 
3.4.4 Infixes and umlauts 
Similarly to prefixing and suffixing, our for- 
malisation accounts for linguistic examples of in- 
fixing, a phenomenon well illustrated by semitic 
languages 7 (here, the replacement of an a by an i). 
Arabic: arsala : mursilun = aslama : x 
x = muslimun 
It also accounts for some (not all) examples of 
sound changes, like umlaut in German s . 
German: Balg : B~lge = Ilals : x 
x = Hiilse 
These linguistic cases work partly thanks to the 
previous property of distances with prefixes. 
3.4.5 R.eduplieatlon 
Unfortunately, our proposal does not render an 
account of reduplication. This would be necessary 
if we wanted to describe, for example, the for- 
mation of plurals in Malay/Indonesian: orang ---+ 
orang-orang 9. Here, a speculative remark would 
link the power of analogy with some class of lan- 
guages; our proposal seems not to go beyond reg- 
ular languages. 
4 Application 
In the sequel, we will apply the principle of anal- 
ogy not on words anymore, but on sentences. In 
the same way ~ words are strings of characters, 
sentences are strings of words. So, the shift from 
words to sentences is just of matter of reformula- 
tion. 
Wc also recall that edit distances and edit op- 
erations are not contined to strings; they extend 
in a natural way to forests, and hence to trees. 
In fact, it is possible to give a definition of an 
edit distance on forests which generalises the def- 
initions on strings (Wagner & Fischer 74) and on 
7arsala (he sent) and aslama (he became con- 
verted) verbs 3rd person singular past; mursilun (a 
sender) and muslirnun (a convert) agent nouns. 
8Balg (pelt, skin) and Iials (neck) singular; Biilge 
and Hdlse plural. 
9 orang (human being) singular, orang-orang plural. 
trees (Selkow 77). Hence the possibility of apply- 
ing analogy to trees. 
The example-based approach in machine trans- 
lation, inaugurated by (Nagao 84) and illustrated 
by (Sadler and Vendelmans 90) or (Sato 90), for 
instance, relies on the assumption that, if two sen- 
tences are "ek)se", then, their analyses should be 
"close" too. By consequence, if the analysis of a 
first sentence is known, the analysis of the second 
one could be obtained by performing slight "mod- 
ifications" (>it it. A problem arises: where are the 
slight "modifications" to be performed, and what; 
are they? In that matter, edit distances could help 
a lot: "close" means at a distance not too large, 
and ':modifications" are edit operations. 
4.1 Analysis by analogy 
4.1.1 Principle 
Suppose we have a collection of sentences (a 
data-base) already analysed (in fact, a tree-bank). 
For a new sentence, called the prototype, our goal 
is to build its analysis, i.e., a corresponding tree. 
Of course, the ideal case is when the prototype 
is already present in the tree-bank, which means 
that the analysis is tbnnd there too. 
In general, the prototype will not be found in 
the tree-bank. The search may thus be relaxed to 
similar sentences. Now, if at least two different 
sentences are retrieved by approximate matching, 
a fourth one can be built by analogy. Figure 2 
illustrates this: the prototype is in the upper left 
corner; the two sentences on its right and under 
it have been obtained by approximate matching. 
Knowing the respective distances between these 
three sentences (on the arrows), sentence x can be 
computed by analogy. 
it' by chance sentence x belongs to the tree-bank, 
its analysis is also in the tree-bank. Now, a reverse 
process on trees delivers an analysis for the proto- 
type, as illustrated in Figure 3. The three trees in 
the right and bottom corners are the correspond- 
ing analyses of the sentences of Figure 2. They 
were taken from the tree-bank. The distances are 
given on the arrows. Tree y is the solution of the 
analogy, and we claim that it is the analysis of the 
prototype sentence. 
4.1.2 Implementation 
Approximate matching is retrieval of all sen- 
tences at a distance less than a threshold Kom a 
given prototype. Efficient algorithms, using dy- 
namic programming, have been proposed to per- 
tbrm approximate matching (Ukkonen 83) and 
(Landau & Vishkin 88). Our method is some- 
what different. We do as if we wanted to gener- 
ate the entire set of sentences at a distance less 
than or equal to the threshold. In doing that, 
we introduce a don't care symbol representing any 
possible word. Pattern-matching with don't care 
symbols has already been studied (Pinter 85). Of 
720 
the green lamp turns oil ~-- 3 - * 
1 / 
2 3 
the lamp turns on 
the green signal is on 
x = the signal is off 
Figure. 2: l)rototype (upper left corner), sentences obtained by approximate matching and x, sentence 
oi)tained by analogy, and retrieved from the tree-bank. I)istanees are given in words. 
S 
NP VP 
I I 
adj adv 
/ 
S 
NP VP 
be hP 
I I 
adj adv 
! 
2 
,1 
3 / 
S S 
NP VP NP VP 
erb~.. A +-- 1 ----~ ~ ~ det N v P det N be AP 
l l 
adv adv 
l,'igure 3: Analyses from the tree-bank and y, analysis of the prototyt)e sentence obtained by analogy. 
Distau('es are given in nodes. 
course, this naive solution implies an exponential 
explosion, but, fortunately, it is not ne(:essary to 
consider the entire set of sentences, neith('.r to gen- 
erate them. Only sentences which a.re substrings 
of other strings may be coded. This allows us to 
use a siml)le non-determinist|(- version of the Aho 
& (;orasick algorithm (Aho &. (k)rasick 75), which 
only cheeks the possible presenc.e of patterns on an 
array of integer triples. This algorithm C()tl.tl)CteS 
well with one of the n,ost elIichmt algorithm agrcp 
(Wu ~ Manl)er 92), as it is faster in average. 
4.1.3 Us(; anti us(;fulness 
In a first implementation, rather than really 
computing solutions of analogies on trees, we re- 
trieve them from the tree-bank using approximate 
matching. I,;xeeution dines are l)eh)w one second 
for the analysis of short chunks of text (about 5 
words). This technique helps a lot in the (-onstruc- 
lion of tree-banks. Firstly, building new linguis- 
tic structures for new sentences is delinitely made 
faster. Secondly, this technique enforces consis-- 
tency, a sensible issue in tree-bank construction, 
especially if tree-banks are to be used to train 
probabilistic models. 
4.1.4 Exi)erhnents and measures 
'lb have a more precise idea about the power 
of the method, we carried out some experiments 
on an excerpt of the tree.-bank of the University 
of Pennsylwmin (787 sentences with their corre 
sponding analyse's). For all possible zl-tuples of 
Sellt,(\]iilees which verify the analogy definition, wc 
eomlmted the analysis of the first sentence by 
analogy. We. recall that there may be no solution, 
one solution, or several solutions..As a restrict, ion 
in this experiment, we (lid not consider distances 
between objects over half of the lengths of the ob- 
jects. 
Ret:all In document retrieval, recall is delined 
as the ratio of the number of relevant documents 
retrieved over the total mmlber of relewmt docu-- 
inents ill the data base. Ih're, we (lefine the recall 
as the number of times the exact structure was 
computed by analogy, divided by the number of 
sentence t)airs having the same structure in the 
tree-bank. In one experiment, the recall is 0.69, a 
quite good figure, which shows thai; the technique 
is promising. 
Precision Again, in document retriewfl, preci- 
sion is defined as the ratio of the nmnt)er of rele- 
721 
rant documents retrieved over the total number of 
documents retrieved. Here, we define the precisio~ 
as the number of times when the exact structure 
was computed by analogy divided by the number 
of solutions delivered. 
In the experiment, the precision is 0.43, which 
means thai in almost half of the cases, one of the 
structures delivered is the right one. Now, in aver- 
age, the structures delivered are far from the exact 
structure by 1.61 node, with a standard deviation 
of 1.86. This means that in average less that two 
nodes have to be edited in order to get the exact 
structm:e, the size of a structure in the tree-bank 
being 9.8 :t: 5.4 nodes. 
4.2 Generation by analogy 
Generation may be performed in the same way as 
analysis, the difference being that the prototype 
is a tree and pattern-matching is performed on 
trees. The overall process is similar to the one for 
analysis, but in the opposite direction. The tool 
we have built for the edition of text with trees, 
allows approximate matching on trees, and gener- 
ation is performed using the same functions as for 
analysis. 
5 Conclusion 
We have proposed a possible theoretical explana- 
tion of analogy in terms of edit distances. As 
expected, this proposal renders an account of 
some important linguistic phenomena, in partic- 
ular, prefixing, suffixing and infixing. Also, tran- 
sitivity is verified by linguistic examples. Never- 
theless, the exact mathematical properties, and 
especially, the necessary and sufficient conditions 
on strings under which the above mentioned prop- 
erties hold remain for the large part to be studied. 
A possible application is anMysis and genera- 
tion by analogy. The proposed technique falls un- 
der the example-based approaches to natural lan- 
guage processing, hut we think it may be safer 
than previous methods, because it relies on more 
information, and linguistically founded informa- 
tion. We have built a first implementation, which 
shows to be of great utility in accelerating the con- 
struction of tree-banks and improving their con- 
sistency. 
References 
Alfred V. Aho and Margaret J. Corasick 
Efficient String Matching: An Aid to Biblio- 
graphic Search 
Communications of the ACM, Vol. 18, No. 6, 
June 1975, pp. 333-340. 
(gad M. Landau and Uzi Vishkin 
Fast String Matching with k Differences 
Journal of Computer and System Sciences, 
Vol. 37, 1988, pp. 63-78. 
V.\[. l,evenshtein 
Binary codes capable of correcting deletions, in- 
sertions and reversals 
Dokl. Akad. Nauk SSSR, vol. 163, No. 4, Au- 
gust 1965, pp. 845-848. 
English translation in Soviet Physics-doklady 
vol. 10, No. 8, February 1966, pp. 707-710. 
Nagao Makoto 
A l"ramework of a Mechanical Translation be- 
tween Japanese and English by Analogy Prin- 
ciple 
in Artificial g4 Human Intelligence, 
Alick Elithorn and Ranan Banerji eds., Elsevier 
Science Publishers, NATO 1984. 
Ron Y. Pinter 
Etficient string matching with don't care pat- 
terns 
in A. Apostolico and Z. Galil (eds) NATO Se- 
ries, vol. F I~2, Combinatorial Algorithms on 
Words, Springer Verlag, Berlin Heidelberg, 
1985, pp. 11-29. 
Victor Sadler and Ronald Vendelmans 
Pilot implementation of a bilingual knowledge 
bank 
Proceedings of COLING-90, Helsinki, 1990, vol 
3, pp. 449-451. 
Sato Satoshi and Nagao Makoto 
Example-Based Translation of Technical Terms 
Proceedings of the Fifth International Confer- 
ence on Theoretical and Methodological Issues 
in Machine Translation TMI-93, pp 58-68, Ky- 
oto, 1993. 
Ferdinand de Saussure 
Gouts de lin.quislique gdndrale 
publi5 par Charles Bally et Albert Sechehaye, 
Payot, Lausanne et Paris, 1916. 
Stanley M. Selkow 
The Tree-to-Tree Editing Problem 
Informalion Processing Letters, Vol. 6, No. 6, 
December 1977, pp. 184-186. 
Esko Ukkonen 
On approximate string matching 
in Proc. lnt. Conf. Found. Comp. Theor., Lec- 
ture Notes in Computer Science 158, Springer 
Verlag, Berlin/New York, 1983, pp 487-495. 
Robert A. Wagner and Michael J. Fischer 
The String-to-String Correction Problem 
Journal for the Association of Computing Ma- 
chinery, Vol. 21, No. l, January 1974, pp. 168- 
173. 
Sun Wu & Udi Manber 
Fast Text Searching Allowing Errors 
Communications of the ACM, Vol. ;35, No. 10, 
October 1992, pp. 83-91. 
722 
