Unification and Transduction in Computational Phonology
Julie CARSON
Universität Bielefeld
Fakultät für Linguistik und Literaturwissenschaft
4800 Bielefeld
In this paper unification and transduction
mechanisms are applied in a new approach to
phonological parsing. It is shown that unification in
the sense of Kay as used in unification grammars, and
transduction, a process deriving from automata theory,
are both valuable tools for use in computational
phonology. By way of illustration, a brief outline of
the allophonic parser described by Church is given.
Then a linear unification parser for English syllables
is introduced. This parser takes phonetic input in
the form of feature bundles and uses phonological
rules represented by networks of transduction
relations together with unification, and an iterative
finite-state process to produce phonemic output with
marked syllable boundaries. A fundamental
distinction is made between two domains: the
representations at the phonetic and phonological
levels, and the processing of these representations.
On this basis, a distinction is made between networks
of transduction relations (e.g. between allophones and
phonemes), and a set of possible processors (i.e.
parsers and transducers) for the interpretation of
such networks.
1. Transduction and Unification in Phonology
The proposal to use finite-state transducers in
morphology and phonology has been advocated in recent
years by Kaplan and Kay /1981/, Koskenniemi /1983/ and
others. It has been suggested /Gibbon 1987/ that
finite-state transducers are the most appropriate
devices for use in other areas of computational
phonology. In Koskenniemi's system, single
finite-state transducers act as parallel filters in the
analysis of Finnish morphology. However, in his
morphophonological analysis Koskenniemi has been
criticised for using monadic segments rather than the
feature bundles which play such an important role in
phonology /Gazdar 1985:601/. In the proposal
presented below, segments regarded as feature bundles
are essential components in the model. The question
as to whether it is better to represent the
phonological rules as a cascade of transducers or to
incorporate them into a single transducer will not be
considered here. Kaplan and Kay /1981/ have already
put forward a method of compiling the series of
transducers into a single transducer (described by Kay
/1982/). Below, for discussion purposes, a single
transducer is assumed.
Furthermore I would like to stress that on the
phonological level I will discuss network
representations of phonotactic and allophonic
constraints. The transitions in these networks
consist of transduction relations. In the process
domain a finite-state transducer will be used to
interpret the networks. This is a distinction which
is not always made but is beneficial for abstracting
the attributes of the model from the processing of the
model. Below more emphasis will be placed on the
representation domain as it is this which is most
interesting for discussion purposes. The actual
implementation of the processing domain as a program
is regarded, theoretically, as a secondary but by no
means a minor issue.
Unification is a concept which has become common 
in linguistics in recent years due to the important 
role it plays in current syntactic theories such as 
FUG, LFG and GPSG. However, it has not as yet played 
an explicit part in phonological analysis. Below I 
propose that, by employing elementary unification 
mechanisms, assimilation and dissimilation can be
dealt with in a most satisfactory way. The
unification used in this connection is based on the
functional description unification described by Kay
/1984/.
Here I will give an informal definition of 
unification based on contradiction and set union and 
in terms of feature bundles, since this is the 
representation which will be used below. Two feature
bundles composed of attribute-value pairs may be said
to unify if for each attribute in their union there
does not exist an attribute of the same name with a
contradicting value. Where a variable, say X, is found
in place of a value in one feature bundle, this
variable will be assigned permanently the value from
the corresponding attribute-value pair in the other
bundle if this exists. This definition of unification,
and its implementation, differs from Prolog term
unification.
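The informal definition above can be illustrated with a minimal Python sketch. It is not the C-Prolog implementation of the model; the dictionary representation of feature bundles and the names `Var` and `unify` are assumptions made purely for illustration.

```python
class Var:
    """A variable (such as X or alpha) standing in place of a feature value."""
    def __init__(self, name):
        self.name = name


def unify(bundle_a, bundle_b, bindings=None):
    """Unify two feature bundles given as dicts from attribute to value.

    Returns (unified_bundle, bindings), or None if some attribute carries
    contradicting values in the two bundles.
    """
    bindings = dict(bindings or {})

    def value_of(value):
        # A variable that has been assigned stands for its value from then on.
        while isinstance(value, Var) and value.name in bindings:
            value = bindings[value.name]
        return value

    unified = {}
    for attr in set(bundle_a) | set(bundle_b):
        if attr in bundle_a and attr in bundle_b:
            a, b = value_of(bundle_a[attr]), value_of(bundle_b[attr])
            if isinstance(a, Var) and isinstance(b, Var):
                if a.name != b.name:
                    bindings[a.name] = b      # two unassigned variables now co-vary
            elif isinstance(a, Var):
                bindings[a.name] = b          # assign the value from the other bundle
            elif isinstance(b, Var):
                bindings[b.name] = a
            elif a != b:
                return None                   # contradicting values: no unification
            unified[attr] = b if isinstance(a, Var) else a
        else:
            # Attribute present in only one bundle: set union, no conflict possible.
            unified[attr] = bundle_a[attr] if attr in bundle_a else bundle_b[attr]
    # Replace any variable that was assigned later in the loop by its value.
    return {attr: value_of(v) for attr, v in unified.items()}, bindings


# Illustrative bundles: an underspecified class and two fully specified sounds.
VOWELS = {"voc": "+", "cons": "-"}
SOME_VOWEL = {"voc": "+", "cons": "-", "high": "-", "back": "+"}
SOME_CONSONANT = {"voc": "-", "cons": "+"}
```

Under these assumptions `unify(SOME_VOWEL, VOWELS)` succeeds and `unify(SOME_CONSONANT, VOWELS)` returns `None`. The returned bindings can be handed to a later call, which is one way of modelling the permanent assignment of a variable.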
2. Allophone-Phoneme Transduction
In the proposal presented here, segments regarded
as feature bundles are essential components. The
feature bundles used in this model are sets of
attribute-value pairs in line with traditional
distinctive feature terminology. The features are not
complex and are generally based on those of Chomsky
and Halle /1968/. A fully specified feature bundle
contains all the features, together with their values,
needed to describe one particular sound. Where a
phonetic symbol occurs in the text below this is
merely an abbreviation convention for a fully
specified feature bundle. Rather than being fully
specified, a feature bundle may be underspecified.
That is to say, only those features appear in the
feature bundle which are necessary to describe a class
of sounds which participate in a particular phonetic
process. For example, the underspecified feature
bundle {[+ voc], [- cons]} describes all vowels. The
feature bundles are generalisations for sets of input
symbols, and resemble the classification in terms of N
and V features found in syntax which allows
generalisation over categories. They are thus termed
C-features (for Category-features).
In Church /1983/ the claim is made that allophonic
cues can be extremely useful in phonological parsing.
Selkirk /1982/ also maintains that investigation of
allophonic variation may be advantageous for syllable
analysis since the realisation of particular allophones
of a language is strongly dependent on their position
within the syllable. Thus in order to take advantage
of allophonic cues a distinction must be made between
variant and invariant features. Variant features, such
as [± aspiration], occur when discussing allophones of
/p/ for example. Thus underspecified feature bundles
also contain variant features in order for us to
incorporate allophonic information into our
classification.
Using variant and invariant features, following
Church /1983/, the aim is, given phonetic input in the
form of fully specified feature bundles, to discard
allophonic information (variant features) and produce
phonemic output also in feature bundle form with
syllable boundaries marked. Church's /1983/ system
has a number of stages from phonetic input to the
point where phonemic output is matched with a syllable
dictionary. A phonetic feature lattice incorporating
generalisations about allophones is input to a
bottom-up chart parser. This chart parser, which works on a
similar basis to the CYK algorithm, provides the
phonetic input with a syllable structure. A
canonicaliser then discards the allophonic information
and outputs a phonemic feature lattice preserving the
syllable structure. It is this structure which then
comprises the input to the lexical matcher.
Taking a closer look at the canonicaliser the first
thing which springs to mind is a simple transduction
process, that is to say, a translation from phones to
phonemes. The chart parser has the task of providing
syllable structure using phonotactic and allophonic
constraints. However, the question here is, are two
separate procedures, namely parsing and
canonicalisation, really necessary or can they be
incorporated into a single process? Below I will
sketch a proposal which, with the help of a finite
state transducer, does just this.
3. Phonotactic Nets
Let us first consider the representation level.
Following the on-line design specification recogniser
for English syllables presented in Gibbon /1985/, a
syllable template was constructed as a discrimination
network on the basis of phonotactic rules, thus
working on the principle of "allowable" combinations
of phonemes rather than limiting acceptable strings to
those clusters which actually occur. Syllables are not
discussed explicitly in terms of onset, peak and coda
in this model. Rather these sub-structures and the
phonotactic and allophonic rules which depend on them
are implicit in the network. The structures, however,
can be derived immediately from the topology of the
network as represented in a transition diagram. This
network is referred to as a phonotactic net.
Allophonic constraints were then introduced as part
of the input specifications.
Each transition in the phonotactic net models a
phonemic segment. The advantage of the feature bundle
representation is that segments can be viewed in terms
of natural classes, which simplifies the network
considerably. The transition labels of the network
consist of a pair of feature bundles each containing
C-features. One of these bundles represents input
specifications and the other output specifications;
both are in general underspecified. For example, the
bundle of C-features which describes the voiceless
plosive consonants is {[- cont], [- voice], [- son],
[- strid]}. However, where we need to deal with the
aspirated allophones of the voiceless plosives the
variant feature [+ asp] must be added: {[- cont],
[- voice], [- son], [- strid], [+ asp]}. Therefore when
a particular transition in the network is responsible
for removing this allophonic information the input
transition specification is {[- cont], [- voice],
[- son], [- strid], [+ asp]}, and the output transition
specification is {[- cont], [- voice], [- son],
[- strid]} (see Fig.1). When this phonotactic net is
interpreted by a particular parser the phonetic input
is generally a string of fully specified feature
bundles and in order to use the output for recognition
purposes the phonemic output will also be fully
specified. It is here that unification plays an
important role.
Fig. 1
Transition accepting voiceless aspirated plosives
(ITS features: voice, son, strid, asp; OTS features: voice, son, strid)
When attempting to traverse the network the fully
specified input feature bundle must unify with the
input transition specification (in terms of C-features)
of the current transition. If unification succeeds, the
fully specified output bundle must contain the output
transition specifications together with all those
features from the fully specified input bundle not
contained in the input transition specification. In
set theoretic terms, let us call the fully specified
input feature bundle InFB, the input and output
transition specifications ITS and OTS respectively; if
unification of InFB with ITS succeeds, the fully
specified output bundle OutFB is OTS ∪ (InFB \ ITS).
The phonetic input feature bundles may also be
underspecified, however. This allows for circumstances
where the values of some features may not be known or
indeed the features themselves may not be
recognisable. This facility is advantageous for
working with feature detectors at the front end as it
is still possible to analyse what is known. This, of
course, leads to underspecified output which may be
used in connection with a lexicon for recognition
hypothesising. In such cases the underspecified
output, although representing classes of phonemes in
the various positions, will only allow those
combinations of such classes which actually exist,
thus limiting possibilities available for hypothesis.
Thus it is not necessary to check the lexicon for
forms which according to the rules of the language
cannot exist.
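The traversal of a single transition, with OutFB computed as OTS ∪ (InFB \ ITS), may be sketched as follows. This is only an illustration under simplifying assumptions: feature bundles are Python dictionaries, the transition specifications contain no variables, and the function name `traverse` as well as the place features `ant` and `cor` in the example bundle are not taken from the model but added for the example.

```python
def traverse(in_fb, its, ots):
    """Return OutFB for one transition, or None if InFB does not unify with ITS."""
    for attr, value in its.items():
        # A feature that is missing from the input is unknown, not contradictory.
        if attr in in_fb and in_fb[attr] != value:
            return None
    # InFB \ ITS: everything in the input that the transition does not mention.
    out_fb = {attr: value for attr, value in in_fb.items() if attr not in its}
    # ... united with the output transition specification.
    out_fb.update(ots)
    return out_fb


# The transition of Fig. 1: accept an aspirated voiceless plosive, drop [+ asp].
ITS = {"cont": "-", "voice": "-", "son": "-", "strid": "-", "asp": "+"}
OTS = {"cont": "-", "voice": "-", "son": "-", "strid": "-"}

# A fully specified input bundle (place features are illustrative assumptions).
ASPIRATED_P = {"cont": "-", "voice": "-", "son": "-", "strid": "-",
               "asp": "+", "ant": "+", "cor": "-"}
```

With these definitions `traverse(ASPIRATED_P, ITS, OTS)` yields the input bundle without the variant feature `asp`, a voiced input is rejected with `None`, and an underspecified input such as `{"cont": "-", "voice": "-"}` yields an output that is itself underspecified.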
4. Constraining Principles
Church discusses a number of factors, most of
which date back to work by Morris Halle and are
discussed by Chomsky and Halle /1968/, which must be
taken into consideration when designing the model
/1983:128/: length, idiosyncratic systematic gaps,
voicing assimilation, place assimilation and
dissimilation, sonority. These can all be incorporated
very easily into the network. The fact that languages
restrict sound combinations (idiosyncratic gaps) and
the length of initial/final consonant clusters is in
any case the basis on which this network is
constructed. Decreasing sonority from the nucleus of
the syllable towards the margins would seem to be a
matter of having [son] as a C-feature and adjusting
the value at the appropriate transition.
With regard to phonotactic constraints, the
C-features on the transition labels may have variable
values. In other words we may cater for the fact that
an initial /s/ in English may not be followed by
voiced plosives by having as input specifications for
one of its following transitions the C-features
{[- voc], [α cont], [α voice], [α son], [- strid]} (see
Fig.2). α here must have the same value in the three
cases, this value being assigned during unification.
Unification would fail in this case for voiced plosives
as they would be specified for the features {[- voc],
[- cont], [+ voice], [- son], [- strid]}. A further
convention is introduced, namely that once a feature
has been specified on a particular transition it
remains until it is explicitly altered on a subsequent
transition. In this way vowel harmony may be
incorporated into such a network whereby the vowel
specifications would remain for subsequent transitions
since they would not be relevant for intervening
consonants.
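The effect of a shared variable in a transition specification can be sketched in a few lines of Python. The sketch assumes that feature bundles are dictionaries, that any value other than "+" or "-" in a specification is the name of a variable, and that the input segment itself contains no variables; the function name `match` is hypothetical.

```python
def match(segment, spec, bindings=None):
    """Unify an input segment with a specification that may contain variables.

    Returns the variable bindings on success and None on contradiction.
    """
    bindings = dict(bindings or {})
    for attr, value in spec.items():
        if attr not in segment:
            continue                      # unknown in the input: no contradiction
        if value not in ("+", "-"):
            # A variable: assign it on first use, afterwards use its value.
            value = bindings.setdefault(value, segment[attr])
        if segment[attr] != value:
            return None
    return bindings


# Input specification of one transition following initial /s/ (cf. Fig. 2).
AFTER_S = {"voc": "-", "cont": "alpha", "voice": "alpha",
           "son": "alpha", "strid": "-"}

VOICELESS_PLOSIVE = {"voc": "-", "cont": "-", "voice": "-", "son": "-", "strid": "-"}
VOICED_PLOSIVE = {"voc": "-", "cont": "-", "voice": "+", "son": "-", "strid": "-"}
```

Here `match(VOICELESS_PLOSIVE, AFTER_S)` assigns "-" to the variable, whereas `match(VOICED_PLOSIVE, AFTER_S)` fails because [- cont] and [+ voice] cannot share one value.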
Fig. 2
Initial /s/ may not be followed by voiced plosives in English.
(ITS and OTS features: voc, strid, α cont, α voice, α son)
([s] and /s/ are abbreviations for fully specified feature
bundles)
It should be clear also that feature bundle
representation together with unification is an elegant
way of dealing with assimilation, dissimilation and
neutralisation. Assimilation and dissimilation are
dealt with by Chomsky and Halle /1968/ in terms of
variables as feature coefficients and it is this
method which has been incorporated into the network
here. So for example, in cases of voice assimilation,
the feature [voice] may be checked using a variable,
say [α voice]. Therefore, where the particular input
segment has the feature [+ voice], unification assigns
the value + to the undefined variable α permanently,
and similarly in the case of a negative value. This
newly found value together with the attribute will
then be a C-feature in the input specification for the
following transition unless explicitly changed on that
transition. This is a type of feature-passing
technique similar to that employed in unification-based
syntactic theories, but essentially simpler,
since it is non-recursive.
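The feature-passing convention may be pictured with the following Python sketch, in which a value found for [α voice] on one transition is carried into the input specification of the next. The two-transition path, the obstruent cluster used as an example and the names `match` and `traverse_path` are assumptions chosen for illustration; they are not part of the network described here.

```python
def match(segment, spec, bindings):
    """Unify a segment with a specification; non-'+'/'-' values are variables."""
    bindings = dict(bindings)
    for attr, value in spec.items():
        if attr not in segment:
            continue
        if value not in ("+", "-"):
            value = bindings.setdefault(value, segment[attr])
        if segment[attr] != value:
            return None
    return bindings


def traverse_path(segments, specs):
    """Check a sequence of segments against a sequence of input specifications."""
    bindings, passed = {}, {}
    for segment, spec in zip(segments, specs):
        # Passed features hold unless the transition states them explicitly.
        effective = {**passed, **spec}
        bindings = match(segment, effective, bindings)
        if bindings is None:
            return False
        for attr, value in spec.items():
            if value in bindings:          # a variable that now has a value
                passed[attr] = bindings[value]
            elif attr in passed:           # explicitly altered on this transition
                passed[attr] = value
    return True


# Hypothetical path: an obstruent checked with [alpha voice], followed by a
# strident obstruent whose specification does not mention [voice] at all.
CLUSTER = [{"son": "-", "voice": "alpha"}, {"son": "-", "strid": "+"}]

T = {"son": "-", "voice": "-", "strid": "-"}
D = {"son": "-", "voice": "+", "strid": "-"}
S = {"son": "-", "voice": "-", "strid": "+"}
Z = {"son": "-", "voice": "+", "strid": "+"}
```

In this sketch the sequences T S and D Z are accepted, while T Z is rejected because the value of [voice] passed on from the first transition contradicts the second segment. No recursion over nested structures is required, since all values are atomic.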
Transition weighting is also very important in this
model. Selkirk /1980/ emphasises that it is all very
well to cater for collocational restrictions but other
constraining principles such as maximising onsets
should also be incorporated into a syllable parser.
Thus transitions are weighted in such a way that the
most preferred path out of the network is sought.
'Early closure' /Kimball 1973/ for example, which seeks
the shortest path out of the network, is equivalent to
the maximal onset principle. Stress resyllabification
is similarly dealt with using weighting. Thus, such
constraints are incorporated into the network in a
simple and principled fashion.
5. Syllable Parsing
Up to now we have been discussing the
representation level, namely the phonotactic net
envisaged as a syllable template. The phonotactic net
in this case was for English but it should be clear
that this representation may be used for other
languages, dialects or codes. Since the phonotactic
net is a network of transduction relations between
allophone and phoneme it should be a useful tool for
both speech analysis and synthesis. It is important
to note at this stage however, that on the processing
level we are not restricted to what parsing algorithm
we employ. The phonotactic net may be interpreted by
any one of a number of parsing procedures. The
strategy employed (i.e. depth-first, breadth-first,
best-first, lookahead etc.) is also totally independent
of the representation.
In the model described here the aim was to use the
simplest formalism possible. Thus the parsing and
translation processes are undertaken by a depth-first
nondeterministic finite state transducer. That is to
say, the phonotactic nets of transduction relations are
interpreted by a finite-state machine. Given the
phonetic input in the form of feature bundles, the
transducer moves from state to state in line with the
unification procedure described in section 3 above.
Every time the transducer reaches its final state a
"possible" syllable has been found. Therefore, in order
to find more than one syllable the transducer iterates
so that phonological units and syllable boundaries are
output until the input string is empty. Thus we have
a single iterative finite state process. The parsing
and canonicalisation processes referred to in section
2 above are incorporated into a single procedure.
What is interesting to note in this connection is that
since the parsing procedure is nondeterministic in
fact all "possible" syllables from the beginning of the
input are checked internally (i.e. in the intermediate
stages before producing output). Thus the notion of a
"possible" syllable of English is catered for.
From a psychological viewpoint it is an interesting
fact that only the "possible" syllables are considered.
This would also be the case in human processing of
neologisms whereby no attempt would be made to form a
syllable with an impossible initial/final consonant
cluster combination: humans can accept words which
conform to the rules of their language even if the
words do not actually exist. Thus, with this model we
can distinguish between "possible" and "actual" words.
If we tested Carroll's Jabberwocky using this model
we would get a correct syllable structure. As already
noted, the lexicon filters out actual words.
6. Conclusion 
The implementation of this model does not claim to
be a speech recognition system as it stands but is
rather an attempt to deal with a small component of
such in a new, elegant and theoretically satisfying
way. Unification and transduction can be seen to be
useful mechanisms in syllable parsing. Unification
provides underspecification-manipulation and
feature-passing facilities and transduction provides a
translation facility between allophones and phonemes.
Transduction relations interpreted by a finite-state
transducer have the further advantage of
bidirectionality. That is to say, one can translate
from allophones to phonemes or vice versa (perhaps
with some ambiguity in the phoneme-allophone
direction). This system, however, should be a useful
tool in both speech synthesis and speech analysis.
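Bidirectionality can be illustrated by reading the same transduction relations in either direction. The following Python sketch assumes two relations for the voiceless plosives (one aspirated, one unaspirated), dictionary feature bundles and an illustrative place feature `ant`; the names `transduce` and `apply_all` are hypothetical.

```python
def transduce(fb, source, target):
    """Unify fb with the source side, then replace that side by the target side."""
    if any(a in fb and fb[a] != v for a, v in source.items()):
        return None
    out = {a: v for a, v in fb.items() if a not in source}
    out.update(target)
    return out


PLOSIVE = {"cont": "-", "voice": "-", "son": "-", "strid": "-"}

# Each relation pairs an allophonic side (ITS) with a phonemic side (OTS).
RELATIONS = [
    ({**PLOSIVE, "asp": "+"}, PLOSIVE),   # aspirated allophone
    ({**PLOSIVE, "asp": "-"}, PLOSIVE),   # unaspirated allophone
]


def apply_all(fb, reverse=False):
    """Analysis reads ITS -> OTS; synthesis (reverse=True) reads OTS -> ITS."""
    results = []
    for its, ots in RELATIONS:
        source, target = (ots, its) if reverse else (its, ots)
        out = transduce(fb, source, target)
        if out is not None:
            results.append(out)
    return results


PHONE = {**PLOSIVE, "asp": "+", "ant": "+"}
PHONEME = {**PLOSIVE, "ant": "+"}
```

In the analysis direction `apply_all(PHONE)` gives exactly one phoneme. In the synthesis direction `apply_all(PHONEME, reverse=True)` gives two candidate allophones, which is the ambiguity mentioned above; in a full network the position in the syllable would restrict the choice.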
An extension of this notion of a syllable parser is
to talk in terms of phonological words, whereby at the
representation level the network would consist of two
sub-nets catering for reduced and unreduced syllables
respectively. A further extension is to use a
tree-structured lexicon in a similar way
to that proposed by Kay /1982/ to distinguish actual
words from possible words. Representing the lexicon
as a discrimination net and in terms of distinctive
feature bundles makes it possible to deal with various
parts of a recognition system in a uniform way. The
movement of the transducer may then be directed by
using the tree-lexicon in parallel (see Fig.3). In
cases where the input segment is underspecified
hypotheses could be made immediately as to the values
of particular features thus excluding paths which will
eventually lead to impossible sequences hence
increasing the efficiency of the parser.
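How a tree-structured lexicon might narrow down hypotheses for underspecified input can be sketched as follows. The sketch covers only the lexicon side and does not couple it to the phonotactic net; the three-word lexicon, the use of letters as phoneme symbols, the reduced feature set and all function names are assumptions made for illustration.

```python
END = "#"   # marks the end of an actual word in the tree

# Reduced, illustrative feature bundles for a handful of phonemes.
PHONEMES = {
    "p": {"voc": "-", "voice": "-", "cor": "-"},
    "b": {"voc": "-", "voice": "+", "cor": "-"},
    "t": {"voc": "-", "voice": "-", "cor": "+"},
    "i": {"voc": "+", "voice": "+", "cor": "-"},
}


def build_tree(words):
    """Build the lexicon as a discrimination net (nested dicts) over phonemes."""
    root = {}
    for word in words:
        node = root
        for symbol in word:
            node = node.setdefault(symbol, {})
        node[END] = True
    return root


def compatible(bundle, phoneme_fb):
    """True if the (possibly underspecified) bundle does not contradict the phoneme."""
    return all(a not in phoneme_fb or phoneme_fb[a] == v for a, v in bundle.items())


def hypotheses(bundles, node, prefix=""):
    """Return the actual words whose phonemes are compatible with the input bundles."""
    if not bundles:
        return [prefix] if END in node else []
    found = []
    for symbol, child in node.items():
        if symbol != END and compatible(bundles[0], PHONEMES[symbol]):
            found += hypotheses(bundles[1:], child, prefix + symbol)
    return found


LEXICON = build_tree(["pit", "bit", "tip"])

# First segment known only to be a voiceless consonant, second only a vowel.
UNDERSPECIFIED = [{"voc": "-", "voice": "-"}, {"voc": "+"}, PHONEMES["t"]]
```

Under these assumptions `hypotheses(UNDERSPECIFIED, LEXICON)` returns `['pit']`: the branch for "bit" is abandoned at the first segment and the branch for "tip" at the last, so paths leading to non-words are excluded as early as the available features allow.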
Fig. 3
The LEXICON used in parallel with the PHONOTACTIC NET
The model has been implemented in C-Prolog on a 
Hewlett Packard 9000. 

References 

Chomsky, N., M. Halle 1968. The Sound Pattern of
English. Harper and Row, New York

Church, K. 1983. Phrase Structure Parsing. A method
for taking advantage of allophonic
constraints. Ph.D. Thesis, MIT. Published
Indiana University Linguistics Club

Gazdar, G. 1985. "Review article: Finite state
morphology". In: Linguistics 23:597-607

Gibbon, D. 1985. "Prosodic Parsing in English and
Akan". Paper held at 21st International
Conference on Contrastive Linguistics,
Błażejewko

Gibbon, D. 1987. "Finite State Processing of Tone
Systems". Paper held at ACL European
Chapter Meeting, Copenhagen

Kaplan, R. M., M. Kay 1981. "Phonological rules
and finite-state transducers". Paper held
at the Annual Meeting of L.S.A. in N.Y.C.

Kay, M. 1982. "When meta-rules are not
meta-rules". In: Sparck Jones & Wilks (eds)
Automatic Natural Language Parsing.
CSCM-10, University of Essex

Kay, M. 1984. "Functional Unification Grammar: A
formalism for machine translation".
Proceedings of 10th International Conference
on Computational Linguistics:75-78

Kimball, J. 1973. "Seven Principles of Surface
Structure Parsing in Natural Language".
In: Cognition 2/1:15-47

Koskenniemi, K. 1983. Two-level Morphology: A
general computational model for word-form
recognition and production. University of
Helsinki, Department of General Linguistics.
Publications No.11

Selkirk, E. O. 1980. On Prosodic Structure and its
Relation to Syntactic Structure. Indiana
University Linguistics Club

Selkirk, E. O. 1982. "The Syllable".
In: Van der Hulst & Smith (eds) The
structure of phonological representations
(Part II). Foris Publications, Dordrecht
