Phonological Derivation in Optimality Theory* 
T. Mark Ellison 
Centre for Cognitive Science, University of Edinburgh 
2 Buccleuch Place, Edinburgh EH8 9LW, Scotland 
Summary: Optimality Theory is a constraint-based theory of phonology
which allows constraints to conflict and to be violated. Consequently,
implementing the theory presents problems for declarative
constraint-based processing frameworks. On the basis of two regularity
assumptions, that candidate sets are regular and that constraints can be
modelled by transducers, this paper presents and proves correct
algorithms for computing the action of constraints, and hence deriving
surface forms.
INTRODUCTION 
Recent years have seen two major trends in phonology: theories have
become more oriented around constraints than transformations, while
implementations have come to rely increasingly on finite-state automata
and transducers. This paper seeks to build a bridge between these
trends, showing how one constraint-based theory of phonology, namely
Optimality Theory, might be implemented using finite-state methods.
The paper falls into three main sections. The first describes
Optimality Theory and its restriction to constraints which can only
make binary distinctions in harmony. The second part covers the
formalisation of the evaluation of harmony, including the simplifying
assumptions that the set of candidate forms must initially be regular,
and that the action of each constraint in assigning harmony also be
regular. The third section presents algorithms for (i) defining the
product of automata modelling constraints, (ii) finding the optimal
level of harmony of a set of candidates and (iii) culling suboptimal
candidates. The last two algorithms are proved correct, and some
worst-case complexity results are given. The paper concludes with a
discussion of the work.
OPTIMALITY THEORY 
Optimality Theory (OT) is a constraint-based theory of phonology,
developed by Prince and Smolensky (1993) (hereafter referred to as
P&S), and is now being used by a growing number of phonologists (Itô
and Mester 1993, McCarthy and Prince 1993, McCarthy 1993). It differs
from declarative phonology (Bird 1994, Scobbie 1991, Bird and Ellison
1994) in that its constraints are violable and can conflict, with the
conflicts resolved by an ordered system of defaults¹. Declarative
phonology evaluates candidate forms² on a binary scale: whether they
are accepted by a constraint system or not. In contrast, OT assigns a
ranking to all of the candidate realisations of a word, calling the
scale a measure of harmony. All of the candidates which show the
maximal amount of harmony are accepted by the constraint system, and
others are rejected. A derivation in OT consists of an original
candidate set produced by a function called GEN, and the subsequent
application of constraints to reduce the candidate set, eliminating
non-optimal candidates and preserving those with the greatest harmony.
At no stage can a constraint eliminate all candidates.
* This research was funded by the U.K. Science and Engineering Research
Council, under grant GR/G-22084 Computational Phonology: A
Constraint-Based Approach. I am grateful to Steven Bird, Ewan Klein and
Jim Scobbie for their comments on an earlier version of this paper.
Each constraint assigns to each candidate a list of marks. These marks
may, for instance, tag segments as regular or exceptional. The marks
are values on the harmony scale, and are totally ordered: for any two
distinct marks a and b, either a is more harmonic than b (symbolically,
a ≻ b) or the reverse. In the list assigned to a candidate, however,
the same mark may occur many times. To compare the harmony of two
candidates with regard to a given constraint, their respective lists of
marks are sorted into increasing order of harmony³. The lists are then
compared first-to-last componentwise. The more harmonic candidate has
the more harmonic value at the first point where the lists differ. The
empty list always has the same harmony as the most harmonic mark on the
harmony scale, common to all constraints, which we will call the zero
mark, and write as ∅⁴. Constraints which only use two different marks
are called binary constraints. For binary constraints, the evaluation
of harmony is a simple affair. The candidate with the fewest non-zero
marks is preferred.
Consider, for example, the binary constraint ONS (P&S:25). This
constraint discourages nuclei without onsets when selecting between
different syllabifications. Two syllabifications of the Arabic
segmental sequence alqalamu are shown in (1), with syllables demarcated
by parentheses. The nuclei are always the vowels. The disharmonic mark
L indicates an onsetless nucleus; the harmonic (zero) mark ∅ is used
for other segments.
(1) syllabification     marks              sorted
    (al)(qa)(la)(mu)    L ∅ ∅ ∅ ∅ ∅ ∅ ∅    L ∅ ∅ ∅ ∅ ∅ ∅ ∅
    (alq)(al)(am)(u)    L ∅ ∅ L ∅ L ∅ L    L L L L ∅ ∅ ∅ ∅
In this example, the sorted lists of marks differ in the second
position, with the first candidate, (al)(qa)(la)(mu), being the more
harmonic of the two.
¹ Ellison (1994) offers a formal analysis of the use of defaults in
Optimality Theory.
² In constraint-based theories, constraints impose limits on possible
realisations of objects, such as words or sentences. A candidate is a
tentative realisation which is yet to be tested against the
constraints.
³ Early in their technical report, P&S introduce one constraint, HNUC,
which requires sorting into the reverse order. Later in the same work
they replace this constraint with a number of binary constraints with
the usual ordering.
⁴ The zero mark is not part of P&S's account. They are not explicit
about comparison of lists of marks of unequal length except in the
binary case. In that case, their definitions have the same consequences
as those described here.
When there is more than one constraint, we must consider not only the
ordering of marks assigned by one constraint, but the ordering of marks
from different constraints. In OT, constraints are placed in a total
order (C1 ≫ C2), and all non-zero marks of higher-ranked constraints
(C1) are less harmonic than all non-zero marks of lower-ranked
constraints (C2). In effect, this means that higher-ranked constraints
have priority in eliminating candidates. For all constraints, however,
the zero mark has the same, maximally harmonic, value.
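The comparison of mark lists under a single constraint can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of P&S's account: marks are encoded as non-negative integers, with 0 standing for the zero mark and larger integers for less harmonic marks, and the function names are hypothetical.

```python
from itertools import zip_longest

ZERO = 0  # assumed encoding: 0 is the zero mark, larger integers are less harmonic


def sort_marks(marks):
    """Sort a list of marks into increasing order of harmony (worst mark first)."""
    return sorted(marks, reverse=True)


def more_harmonic(a, b):
    """Return True if mark list a is strictly more harmonic than mark list b."""
    # A shorter list is padded with zero marks, since the empty list has the
    # same harmony as the zero mark.
    for x, y in zip_longest(sort_marks(a), sort_marks(b), fillvalue=ZERO):
        if x != y:
            return x < y  # the more harmonic mark wins at the first difference
    return False


# The two syllabifications in (1), with the disharmonic mark L encoded as 1.
first = [1, 0, 0, 0, 0, 0, 0, 0]   # (al)(qa)(la)(mu)
second = [1, 0, 0, 1, 0, 1, 0, 1]  # (alq)(al)(am)(u)
print(more_harmonic(first, second))  # True
```

Padding with the zero mark is what makes lists of unequal length comparable, in line with the treatment of the empty list above.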
binarity 
So far we have considered a general class of constraints including
non-binary constraints. As it happens, non-binary constraints can often
be replaced by binary constraints. Binary constraints are those which
only assign two marks: the zero mark, and one other.
In the simplest case, restating a constraint in a logically equivalent
form can transform a non-binary constraint into a binary constraint.
The constraint family EDGEMOST is defined by P&S (p35) as (2).
(2) EDGEMOST(φ; E; D).
    The item φ is situated at the edge E of domain D.
This definition covers a family of constraints depending on the
instantiations of the arguments: E is either left (L) or right (R), the
domain D may be syllable, foot or word, and φ can be any phonological
object, such as stress or an affix.
According to P&S, constraints of this form are non-binary, returning as
their marks the distance of their objects from the designated edge of
the domain. The greater the distance, the less harmonic the mark.
Constraints of this kind can, however, be replaced by logically
equivalent binary constraints (3).
(3) NOINTERVENING(φ; E; D).
    There is no material intervening between φ and edge E of domain D.
This form of constraint assigns a disharmony mark to each item
intervening between φ and edge E. The more material lying between φ and
E, the greater the number of marks and so the lower the harmony value.
Other types of non-binary constraints can be converted into hierarchies
(ordered sequences) of binary constraints. Suppose a constraint C
produces N different kinds of marks. Applied to a candidate form c,
this constraint produces a list C(c) of marks. Now define a function f
which takes a list of marks, l, and a mark type m, and replaces all
marks in l which are different from m by the zero mark ∅, and then
re-sorts the list. So with the marks 2 ≺ 1 ≺ ∅, f(221∅, 2) is 22∅∅ and
f(221∅, 1) is 1∅∅∅. If the marks generated by C are
∅ = m_0 ≻ m_1 ≻ ... ≻ m_{N-1}, then C can be replaced by constraints
C_i, i = 1, ..., N-1, such that C_i(c) = f(C(c), m_i), subject to the
ordering C_i ≫ C_j if i > j.
To see the equivalence of the single non-binary constraint with the
family of binary constraints, let us look at the comparison of some
candidate forms. Using the three-valued constraint of the earlier
example, suppose candidates M, N and P are assigned mark lists 1∅2,
21∅12 and ∅122 respectively. Sorted, these lists become 21∅, 2211∅ and
221∅. Comparing these lists, we arrive at the harmony ordering
M ≻ P ≻ N.
Now, let us apply the corresponding binary constraints. The first and
dominant constraint preserves only 2s in the mark list, the second
preserves only the mark 1. The two lists of marks for M, N and P are
2∅∅ and 1∅∅, 22∅∅∅ and 11∅∅∅, and 22∅∅ and 1∅∅∅, respectively. By the
ordering of the constraints, we know that 2 ≺ 1 still, and so merging
the two lists of marks for each candidate gives 21∅∅∅∅, 2211∅∅∅∅∅∅ and
221∅∅∅∅∅. Apart from the trailing ∅s, these are identical to the marks
assigned by the single constraint, and so lead to the same ordering:
M ≻ P ≻ N.
So all constraints which use a finite alphabet of marks, and some which
do not, such as EDGEMOST constraints, can be translated into binary
constraints or a finite sequence of binary constraints. Consequently,
formalising binary constraints and their interaction will be enough to
capture the bulk of constraints in OT.
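The equivalence between a multi-valued constraint and its hierarchy of binary constraints can be checked mechanically on the candidates M, N and P. The sketch below assumes the marks 2, 1 and the zero mark are encoded as the integers 2, 1 and 0; the helper names single_key and hierarchy_key are hypothetical and stand for sort keys in which a smaller key means a more harmonic candidate.

```python
def f(marks, m):
    """Keep occurrences of mark m, replace all other marks by 0, and re-sort."""
    return sorted((x if x == m else 0 for x in marks), reverse=True)


def single_key(marks):
    """Sort key under the original multi-valued constraint."""
    # Worst marks first; zero marks are dropped because trailing zeros never
    # affect the comparison.
    return [x for x in sorted(marks, reverse=True) if x != 0]


def hierarchy_key(marks, kinds):
    """Sort key under the binary constraints, dominant (worst mark) first."""
    return [f(marks, m).count(m) for m in sorted(kinds, reverse=True)]


candidates = {"M": [1, 0, 2], "N": [2, 1, 0, 1, 2], "P": [0, 1, 2, 2]}
by_single = sorted(candidates, key=lambda c: single_key(candidates[c]))
by_hierarchy = sorted(candidates, key=lambda c: hierarchy_key(candidates[c], {1, 2}))
print(f([2, 2, 1, 0], 2), f([2, 2, 1, 0], 1))  # [2, 2, 0, 0] [1, 0, 0, 0]
print(by_single, by_hierarchy)  # ['M', 'P', 'N'] ['M', 'P', 'N']
```

Both keys order the candidates from most to least harmonic, so the two printed lists agreeing is the equivalence claimed in the text for this example.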
FORMALISATION 
The formalisation of OT developed here makes use of three idealising
assumptions (4).
(4) 1. All constraints are binary.
    2. The output of GEN is a regular set.
    3. All constraints are regular.
We have already seen that most non-binary constraints can be recast as
binary constraints or families of binary constraints. Unfortunately,
P&S are not explicit about whether there are other unbounded non-binary
constraints (like EDGEMOST); there may be some which cannot be recast
as binary constraints. Assumption 1 is, therefore, an idealisation
imposing a slight limitation on the theory.
regular gen 
The second assumption requires that the output of GEN be regular.
Recall that GEN is the function which produces the initial set of
candidate forms which is reduced by the constraints. In other words,
the set of candidates must be initialised to a set which can be defined
by a regular expression, or, equivalently, by a finite-state automaton
(FSA).
As an example, (5) shows a regular expression giving a subset of the
candidate syllabifications of alqalamu according to the syllabification
rules of P&S (p25). The set does not include all candidates; for
clarity I have omitted partial syllabifications in which segments have
not been assigned a syllabic role, and completely empty syllables. The
set does include syllabic slots which do not correspond to segments. In
such slot-segment pairs, the empty segment is written as 0.
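(5) [Regular expression defining the candidate syllabifications of
    alqalamu as a sequence of slot-segment pairs; the expression is not
    recoverable from the source text.]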
The brackets cover disjunctions of terms separated by the vertical bar
|, while concatenation is expressed by juxtaposition. The vertical
pairs of symbols are the complex labels used on arcs in the
corresponding automaton. The three syllabic slot types are onset (O),
nucleus (N) and coda (C). As a regular expression, (5) captures 64
different possible syllabifications of the sequence alqalamu. For
example, the syllabification (al)(qal)(am)(u) is accepted by (5), while
(alq)(al)(am)(u) is not.
regular constraints 
The third assumption imposes regularity on constraints. A constraint is
regular if there is a finite-state transducer⁵ (FST) which assigns the
same list of marks to a candidate form that the constraint does. Since
we are only dealing with binary constraints, the transducer will
associate with each component of the candidate one of the two harmonic
values (ε or ∅). Such transducers can be expressed as regular
expressions over pairs of phonological material and marks.
P&S (p25) use two constraints, FILL (6) and ONS (7), to account for the
limits on epenthesis in Arabic. Epenthetic material arises when
syllabic slots which are not occupied by segments are realised. In the
expressions below, the marks are given on the right-hand side of the
colon in each pair; ε is the disharmonic mark, and ∅ the more harmonic
zero mark.
(6) FILL. Syllable positions are filled with segmental material.
(7) ONS. Every syllable has an onset.
These two constraints can be readily translated into regular
expressions, using abbreviatory notations for onset or coda, for
segmental material, and for anything. The transducers for FILL and ONS
are defined by the regular expressions in (8) and (9) respectively. The
FILL transducer (8) marks with ε every syllabic slot associated with an
empty (0) segment.
(9) [Regular expression for ONS; not recoverable from the source text.]
The ONS transducer (9) is non-deterministic, producing more than one
sequence of marks for a given input. All nuclei preceded by an onset
are marked with ∅ and also with ε. All other segments are marked as ∅.
The multiple evaluations of candidates are not a problem: candidates
will survive so long as their best evaluation is as good as the best of
any other candidate.
⁵ A finite-state transducer is an FSA which is labelled with pairs of
values. In this case, the pairs will combine phonological information
with constraint marks.
linearity 
The reader may be concerned that the regularity constraint imposes
undue restrictions of linearity on the candidate forms, and, in doing
so, vitiates the phonological advantages of non-linear representations.
This is not the case. Bird and Ellison (1992, 1994) have shown that it
is possible to capture the semantics of autosegmental rules and
representations using FSAs. The output of GEN, therefore, may
correspond to a set of partially specified autosegmental
representations, and still be interpreted as a regular set.
candidate comparison 
For single binary constraints, the harmony of candidates is compared as
sorted lists over the alphabet containing ε and ∅, where ∅ has the
higher harmony, the same, in fact, as the empty list. Consequently, the
result of comparing lists of these marks is identical with comparing
the product #ε·h(ε), where #ε is the number of times ε occurs in the
list, and h(ε) is the constant quantity of harmony assigned to ε. As ∅
has the same harmony as the empty list, h(∅) must be zero. As ε ≺ ∅,
comparison is preserved if h(ε) < 0, so we set h(ε) = -1. If the arcs
in the transducer are labelled with -1 and 0 instead of ε and ∅, then
the harmony of a candidate can be evaluated by just adding the numbers
along the corresponding path in the constraint transducer. The greater
the (always non-positive) result, the more harmonic the candidate.
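As an illustration of this integer encoding, the following sketch evaluates FILL and ONS directly on a candidate given as a list of (slot, segment) pairs. It is written under assumptions that are not in the original: the constraints are plain Python functions rather than transducers, ons_marks computes only the best evaluation of the non-deterministic ONS transducer, and all function names are hypothetical.

```python
EMPTY = "0"  # the empty segment, as written in the text


def fill_marks(candidate):
    """FILL: -1 for every syllabic slot paired with an empty segment, else 0."""
    return [-1 if segment == EMPTY else 0 for _slot, segment in candidate]


def ons_marks(candidate):
    """ONS: -1 for every nucleus not immediately preceded by an onset, else 0."""
    marks, previous = [], None
    for slot, _segment in candidate:
        marks.append(-1 if slot == "N" and previous != "O" else 0)
        previous = slot
    return marks


def harmony(marks):
    """Harmony relative to one binary constraint: the sum of the 0/-1 marks."""
    return sum(marks)


# (al)(qa)(la)(mu), and the same syllabification with an epenthetic onset.
plain = list(zip("NCONONON", "alqalamu"))
epenthetic = [("O", EMPTY)] + plain
print(harmony(ons_marks(plain)), harmony(fill_marks(plain)))            # -1 0
print(harmony(ons_marks(epenthetic)), harmony(fill_marks(epenthetic)))  # 0 -1
```

The two candidates trade one violation of ONS for one violation of FILL, which is why the ranking of the two constraints decides between them.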
Just as we can measure harmony relative to a single constraint with a
single integer, we can measure the harmony relative to an ordered
hierarchy of constraints with an ordered list of integers. The list of
integers corresponds one-to-one to the constraints in decreasing order
of dominance. Each integer maintains information about the number of ε
values of the corresponding constraint in the evaluation of the
candidate. A candidate with the list (-2, -1) violates the first
constraint twice and the second once: the corresponding sorted list of
harmony marks is 221.
Lists of this form can be compared just like lists of harmony marks.
The first integer is the most significant and the last the least. The
greater of two lists is the one with the higher value at the most
significant point of difference. For example, (-10, -31, -50) is more
harmonic than (>) (-10, -34, -12). Lists of integers can be accumulated
like single integers using componentwise addition.
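Both operations on integer lists are short enough to state as code. In this sketch the lists are represented as Python tuples, which is an assumption of the illustration; the function names are hypothetical.

```python
def accumulate(*lists):
    """Componentwise sum of integer lists of equal length."""
    return tuple(sum(column) for column in zip(*lists))


def more_harmonic(a, b):
    """True if harmony list a is strictly more harmonic than harmony list b."""
    # Python compares tuples lexicographically, first component most
    # significant, which is the ordering described in the text.
    return tuple(a) > tuple(b)


print(more_harmonic((-10, -31, -50), (-10, -34, -12)))  # True
print(accumulate((0, -1), (-1, 0), (-1, 0)))             # (-2, -1)
```

The second line accumulates three arcs of a degree 2 transducer into the list (-2, -1) used as an example above.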
We can generalise transducers from denoting single constraints to
denoting hierarchies of constraints: from transducers translating
candidates into sequences of {∅, ε} or {0, -1} marks to transducers
from candidates to sequences of lists of integers, each integer drawn
from {0, -1}. Summing the lists along a path gives a harmonic
evaluation of the corresponding candidate.
Let us call the length of the integer list the degree of the
transducer. The output of GEN is an automaton (a transducer without
marks) and so corresponds to a transducer of degree 0. The transducer
for a single constraint needs only a single binary distinction for its
marks, so a degree 1 transducer suffices. In general, the number of
binary constraints that a transducer encodes will equal its degree. The
next section looks at how transducers of single constraints or small
hierarchies can be combined into single transducers for larger
hierarchies.
ALGORITHMS 
product 
We have seen how a single constraint can be regarded as a transducer
from candidate segments into a singleton list of integers, and further
that multiple constraints can be evaluated using longer lists of
integers. Combining these two notions into an extended version of the
automaton product operation allows us to build up transducers capturing
a hierarchy of constraints from single constraint transducers.
The product operation is easier to describe when transducers are
thought of in terms of automata rather than regular expressions. For
brevity, then, the algorithms will be phrased in terms of the states
and arcs of an automaton, while, for clarity, regular expressions will
be used to present the inputs and outputs of examples.
The pseudocode for the standard automaton product operation appears in
(10). As the initial states of any automaton can be identified with
each other without affecting the language recognised, and similarly the
final states, we will assume that there is only a single initial state
(I) and final state (F) in each automaton. In this pseudocode,
semicolons are followed by comments.
(10) Product(A, B):
       make (I_A, I_B) initial in A×B
       make (F_A, F_B) final in A×B
       for each arc from x to y in A labelled M
         for each arc from z to t in B labelled N
           if M ∩ N ≠ ∅
             then add arc from (x,z) to (y,t) to A×B
                  labelled M ∩ N
The pseudocode in (10) applies to two automata A and B, over the same
alphabet, and constructs their product A×B, an automaton which accepts
only those strings accepted by both A and B. Each combination of arcs,
one from A and one from B, which could be traversed while reading the
same input, that is, an input in the intersection M ∩ N of the labels
of the two arcs, defines an arc in the product automaton.
To make the product mimic the combination of constraints in OT, we need
to introduce an asymmetric operation on the lists of marks:
concatenation. Each arc in each automaton passed to this product
operation is labelled not only with a set of possible phonological
segments, but also with a list of harmony marks. When two arcs are
combined, these lists are concatenated. The pseudocode for this
augmented product operation appears in (11).
(11) AugmentedProduct(A, B):
       make (I_A, I_B) initial in A×B
       make (F_A, F_B) final in A×B
       for each arc from x to y in A labelled M:μ
         for each arc from z to t in B labelled N:ν
           if M ∩ N ≠ ∅
             then add arc from (x,z) to (y,t) to A×B
                  labelled M ∩ N : μ.ν
     μ.ν is the concatenation of μ and ν.
Because concatenation is not a symmetric operation, the augmented
product does not commute: A×B does not assign the same marks to
candidate forms as B×A. The difference in interpretation is that A×B
regards all constraints in A as higher priority than all constraints in
B, whereas B×A instantiates the reverse ordering.
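The augmented product (11) can be sketched in Python as follows. The representation is an assumption of this illustration: a transducer is a named tuple with one initial state, one final state and a list of arcs (source, labels, marks, target), where labels is a frozenset of symbols and marks a tuple of integers. The two toy transducers are simplified stand-ins for ONS and FILL over only two symbols, not the expressions (8) and (9), and the ONS stand-in is deterministic.

```python
from collections import namedtuple

Transducer = namedtuple("Transducer", "initial final arcs")


def augmented_product(a, b):
    """Combine a and b, ranking the constraints of a above those of b."""
    arcs = []
    for x, labels_a, marks_a, y in a.arcs:
        for z, labels_b, marks_b, t in b.arcs:
            common = labels_a & labels_b
            if common:  # both arcs can read the same input symbol
                arcs.append(((x, z), common, marks_a + marks_b, (y, t)))
    return Transducer((a.initial, b.initial), (a.final, b.final), arcs)


def marks_on(transducer, symbol):
    """Mark lists of all arcs whose label set contains symbol."""
    return [marks for _, labels, marks, _ in transducer.arcs if symbol in labels]


# Toy alphabet: "O:0" is an onset slot with an empty segment, "N:a" a nucleus.
ONSET, NUCLEUS = "O:0", "N:a"
ons = Transducer("s", "s", [
    ("s", frozenset({ONSET}), (0,), "o"),     # an onset opens a syllable
    ("o", frozenset({NUCLEUS}), (0,), "s"),   # nucleus preceded by an onset
    ("s", frozenset({NUCLEUS}), (-1,), "s"),  # onsetless nucleus is penalised
])
fill = Transducer("f", "f", [
    ("f", frozenset({ONSET}), (-1,), "f"),    # slot with an empty segment
    ("f", frozenset({NUCLEUS}), (0,), "f"),
])

ons_fill = augmented_product(ons, fill)
fill_ons = augmented_product(fill, ons)
print(marks_on(ons_fill, ONSET))  # [(0, -1)]
print(marks_on(fill_ons, ONSET))  # [(-1, 0)]
```

The two printed mark lists differ only in the order of their components, which is the non-commutativity noted above: the first component always belongs to the higher-ranked constraint.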
The augmented product operation provides a way of combining two
constraints into a single transducer. As an example, (12) is the
product of the transducers corresponding to the constraints ONS (9) and
FILL (8) in that order.
(12) [Regular expression for the product of ONS and FILL; not
     recoverable from the source text.]
The product is the crucial operation for implementing OT. The product
of the regular expression or automaton produced by GEN with all of the
constraints in order produces a transducer encoding the harmony
evaluations of all candidates. Let us call it the surface transducer.
To evaluate the harmony of any fully specified candidate, we need only
follow the corresponding paths in the surface transducer, accumulating
the integer lists associated with each arc. The path with the greatest
total harmony is the crucial one for deciding whether the candidate is
optimal or not.
The surface transducer which is the product of the candidate
syllabifications of alqalamu with the constraints ONS and FILL, in that
order, is shown in (13).
(13) [Regular expression for the surface transducer of alqalamu; not
     recoverable from the source text.]
harmony of substrings
In OT, only the candidates with maximal harmony survive to the surface;
non-optimal candidates are eliminated. To implement this part of the
derivation, we need to remove all paths from the surface transducer
which do not accumulate optimal values of harmony. The algorithms in
this section and the next are designed to achieve this task, and will
be proven to do so.
The first algorithm (14) assigns to every state N the harmony value of
the optimal path to it from the initial state, storing this value in
the field harmony(N). Since there is only a single final state, F,
harmony(F) will contain the harmony evaluation of all optimal
candidates.
(14) LabelNodes(transducer):
      1  for each state n in transducer
      2    harmony(n) ← undefined
      3  harmony(I) ← 00...0            ; I is the initial state
      4  list ← {I}
      5  while list is not empty        ; expansion of m begins
      6    m ← most harmonic state in list
      7    delete m from list
      8    for each arc a: m→n from m
      9      if harmony(n) < harmony(m) + harmony(a)
     10        delete n from list
     11        harmony(n) ← harmony(m) + harmony(a)
     12        insert n in list
     13      else if harmony(n) undefined
     14        harmony(n) ← harmony(m) + harmony(a)
     15        insert n in list
The algorithm sets the harmony of the initial state to zero, and places
the initial state in an otherwise empty list. The most optimal member
in the list is expanded (lines 6-15) and removed from the list. When a
state is expanded, all of the arcs from it are examined in turn. If any
of them point to states with undefined harmony values, the harmony of
the state being expanded, and of the arc, are used to calculate the
harmony value of the other state and it is added to the list. If the
arc points to a state with a defined harmony value, the harmony value
of the better path is retained by that state, and its position in the
sorted list adjusted appropriately.
If the list is kept sorted, inserting each new state in order of the
value of its harmony field, then, in the worst case, O(log |states|)
comparisons of harmony values will need to be done for an insertion
into the list, where states is the set of states in the transducer and
arcs the set of arcs. As each state is expanded only once, each arc is
examined only once. So |arcs| forms an upper bound on the number of
insertions that need to be done. The single comparison on line 9 is
insignificant in relation to the comparisons used in insertion. So an
upper bound on the order of the worst-case execution of this algorithm
is O(|arcs| log |states|) comparisons.
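A Python sketch of the labelling pass (14) is given below. It departs from the pseudocode in two ways that should be read as assumptions of the illustration: the sorted list is replaced by a binary heap from the standard heapq module, and a state whose harmony improves is pushed again rather than deleted and re-inserted, with stale heap entries skipped when popped. The arcs are assumed to be given as a dictionary from each state to its outgoing (target, marks) pairs.

```python
import heapq
from itertools import count


def label_nodes(arcs, initial, degree):
    """Return a dict mapping each reachable state to its optimal harmony.

    arcs maps a state to a list of (target, marks) pairs, where marks is a
    tuple of `degree` non-positive integers.
    """
    harmony = {initial: (0,) * degree}
    tie = count()  # tie-breaker so that states themselves are never compared
    # heapq is a min-heap, so harmonies are negated to pop the best state first.
    heap = [(tuple(-h for h in harmony[initial]), next(tie), initial)]
    expanded = set()
    while heap:
        _, _, m = heapq.heappop(heap)
        if m in expanded:
            continue  # stale entry left behind by a later improvement
        expanded.add(m)
        for n, marks in arcs.get(m, []):
            candidate = tuple(h + k for h, k in zip(harmony[m], marks))
            if n not in harmony or harmony[n] < candidate:
                harmony[n] = candidate
                heapq.heappush(heap, (tuple(-h for h in candidate), next(tie), n))
    return harmony


# A toy transducer of degree 2 with two routes from I to F.
toy = {
    "I": [("A", (0, -1)), ("F", (-1, 0))],
    "A": [("F", (0, 0))],
}
labels = label_nodes(toy, "I", 2)
print(labels["F"])  # (0, -1)
```

With lazy deletion the heap may hold up to one entry per arc, so each heap operation costs O(log |arcs|) comparisons, which is within a constant factor of O(log |states|).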
It is not obvious that this algorithm will, in fact, label each state
with the harmony of the optimal path to it, so a proof follows.
(15) Lemma. When state M is being expanded (lines 6-15), the true
     harmony value of the optimal path to M, namely h(M), and the
     computed value, harmony(M), are equal, if the same is true for all
     previously expanded states.
Proof. Case ≤. Suppose the lemma is false, and that h(M) > harmony(M).
Then there is an optimal path p.a.q to M, where p is a (possibly null)
path, a is an arc from an already expanded state R to an unexpanded
state S and q is another (possibly null) path. There will always be
such a path as M is reachable from the initial state, and the initial
state is the first one expanded. This path is optimal, so
h(M) = h(R) + h(a) + h(q), which in turn is less than or equal to
h(R) + h(a) as h is always non-positive. Putting this inequality
together with the supposition of the lemma that harmony and h match for
all expanded nodes gives the following inequality:
    harmony(S) ≥ harmony(R) + harmony(a)
               = h(R) + h(a)
               ≥ h(M)
               > harmony(M)
A lower bound for harmony(S) was set when R was expanded. As R is
already expanded, h(R) = harmony(R), and consequently
harmony(S) > harmony(M), which contradicts the optimality of the choice
of M (line 6 of algorithm (14)). Thus h(M) ≤ harmony(M).
Case ≥. If M is in list, then harmony(M) must be defined and, being the
harmony of some path to M, set at a value ≤ h(M).
Thus the equality of harmony(M) and h(M), and the lemma. When M is the
initial state I, the result follows immediately from line 3, which sets
harmony(I) to zero. □
(16) Theorem. After the application of LABELNODES, for all states N on
     which harmony(N) is defined, harmony(N) is the harmony of the
     optimal path to N.
Proof. By the lemma and induction on the sequence of expansion of
states. □
We can mimic the labelling of nodes in the transducer with harmonic
evaluations by labelling disjuncts in regular expressions with harmonic
values. The value of a whole disjunction is the most harmonic value
amongst the disjuncts. As before, the harmonic evaluations are added
during concatenation. The evaluations for the surface transducer (13)
of alqalamu are shown in (17).
(17) [Surface transducer of alqalamu annotated with harmonic
     evaluations; not recoverable from the source text.]
The evaluation of the optimal path in the transducer is (0, -1).
pruning 
Having determined the harmony value of the optimal path to the final
state, and others, it only remains to remove suboptimal paths. As it
happens, this can be easily done by removing all arcs which cannot
occur in an optimal path (18).
(18) Prune(transducer):
       for each arc a: m→n of transducer
         if harmony(m) + harmony(a) < harmony(n)
           then delete a
If the sum of the harmony of an arc, and the harmony of the optimal
path to the state it comes from, is less than the harmony of the state
the arc goes to, then that means that there is a more optimal path to
the second state which will always be preferred. Consequently this arc
can never be part of an optimal path. It is, therefore, safe and
appropriate to delete it.
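The pruning step (18) amounts to a single pass over the arcs. The sketch below assumes that arcs are triples (source, marks, target) and that every state mentioned by an arc already carries a harmony label; the labels in the example are supplied by hand rather than computed.

```python
def prune(arcs, harmony):
    """Keep only arcs (m, marks, n) that lie on an optimal path to n.

    harmony maps each state to the harmony of the optimal path to it, as
    computed by a labelling pass such as (14).
    """
    kept = []
    for m, marks, n in arcs:
        through_arc = tuple(h + k for h, k in zip(harmony[m], marks))
        if through_arc < harmony[n]:
            continue  # a strictly better path to n exists, so delete the arc
        kept.append((m, marks, n))
    return kept


# Labels for a toy transducer of degree 2 (supplied by hand for this sketch).
harmony = {"I": (0, 0), "A": (0, -1), "F": (0, -1)}
arcs = [("I", (0, -1), "A"), ("A", (0, 0), "F"), ("I", (-1, 0), "F")]
print(prune(arcs, harmony))  # [('I', (0, -1), 'A'), ('A', (0, 0), 'F')]
```

The direct arc from I to F is deleted because it reaches F with harmony (-1, 0), which is worse than the (0, -1) available through A.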
The complexity, in the number of comparisons performed, of this
algorithm is identical to the number of arcs in the transducer. This is
of lower order than the worst-case complexity for LABELNODES, so the
complexity of the combined algorithm is still O(|arcs| log |states|)
comparisons.
It is not immediately obvious that the only paths which can be formed
by the remaining arcs are optimal. This is, however, the case.
(19) Theorem. After the application of LABELNODES and PRUNE there are
     no non-optimal paths from the start state to any state.
Proof. By induction on the length of the path.
P(n) = After the application of LABELNODES and PRUNE there is no
       non-optimal path of length n from the initial state to any other
       state.
Base case. P(0) is trivially true, as there is only a unique path of
length zero.
Step. Assume P(k) is true. Suppose we have a non-optimal path of length
k + 1. By the assumption, this must consist of an optimal path of
length k followed by a non-optimal arc a from M to N. The arc a would
have been deleted unless harmony(M) + harmony(a) ≥ harmony(N). But, by
theorem (16), harmony(N) is the harmony of the optimal path to N, so
harmony(M) + harmony(a) ≤ harmony(N). The two inequalities give
equality, and the path must be optimal. This contradicts our
supposition, and so P(k + 1) is true.
The theorem follows by induction. □
Consequently, the only paths from the initial state to the final state
will be optimal and define optimal candidates. The regular expression
corresponding to the culled automaton describing the syllabifications
of alqalamu appears in (20). It includes only a single candidate
syllabification of the sequence.
(20) marks:     0,-1  0,0  0,0  0,0  0,0  0,0  0,0  0,0  0,0
     slots:     O     N    C    O    N    O    N    O    N
     segments:  0     a    l    q    a    l    a    m    u
DISCUSSION
The work described in this paper was based on the Optimality Theory of
Prince and Smolensky (1993), making three additional assumptions:
1. All constraints are binary, or can be recast as binary constraints.
   This seems to be true of all constraints used by P&S.
2. The initial set of candidates, the output of GEN, is a regular set
   which can be specified by a finite-state automaton.
3. Each constraint can be implemented as a regular transducer which
   determines the list of marks for each candidate.
On the basis of those assumptions, the following developments were
made:
• Transducers were defined which computed not just a single
  constraint, but an ordered hierarchy of constraints.
• An algorithm for a product operation on these transducers was given.
  With this operation transducers representing constraints could be
  applied to sets of candidates, and also be combined into transducers
  representing collections of constraints.
• Algorithms were presented for
  - finding the harmony of the optimal candidate in a transducer, and
  - culling all non-optimal paths from a transducer.
• These algorithms were proved to fulfil their goals.
• The worst-case complexity of the combined algorithm in terms of
  harmony comparisons was found to be O(|arcs| log |states|), for a
  given transducer.
Using the assumptions and algorithms given here, there are three stages
to computing a derivation in OT:
1. Specify the regular class of candidates as an automaton.
2. Build up the product of this automaton with the transducers of each
   constraint in decreasing order of priority.
3. Cull suboptimal paths.
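The three stages can be put together in one small end-to-end sketch. Everything concrete here is an assumption made for illustration: a transducer is a triple (initial, final, arcs) with arcs of the form (source, labels, marks, target), GEN offers just two candidates for a single vowel (with or without an epenthetic onset), and the ONS and FILL transducers are simplified deterministic stand-ins over a two-symbol alphabet.

```python
import heapq
from itertools import count


def product(a, b):
    """Augmented product of transducers (initial, final, arcs); a outranks b."""
    arcs = [((x, z), la & lb, ma + mb, (y, t))
            for x, la, ma, y in a[2] for z, lb, mb, t in b[2] if la & lb]
    return (a[0], b[0]), (a[1], b[1]), arcs


def add(h, k):
    return tuple(i + j for i, j in zip(h, k))


def label_nodes(transducer, degree):
    """Harmony of the optimal path from the initial state to each state."""
    initial, _, arcs = transducer
    harmony, tie, done = {initial: (0,) * degree}, count(), set()
    heap = [((0,) * degree, next(tie), initial)]
    while heap:
        _, _, m = heapq.heappop(heap)
        if m in done:
            continue  # stale heap entry
        done.add(m)
        for x, _, marks, n in arcs:
            if x == m:
                new = add(harmony[m], marks)
                if n not in harmony or harmony[n] < new:
                    harmony[n] = new
                    heapq.heappush(heap, (tuple(-v for v in new), next(tie), n))
    return harmony


def prune(transducer, harmony):
    """Delete unreachable arcs and arcs that cannot lie on an optimal path."""
    initial, final, arcs = transducer
    kept = [arc for arc in arcs
            if arc[0] in harmony and add(harmony[arc[0]], arc[2]) >= harmony[arc[3]]]
    return initial, final, kept


def paths(transducer):
    """Enumerate the strings accepted by an acyclic transducer."""
    initial, final, arcs = transducer

    def walk(state):
        if state == final:
            yield []
        for x, labels, _, y in arcs:
            if x == state:
                for symbol in sorted(labels):
                    for rest in walk(y):
                        yield [symbol] + rest
    return list(walk(initial))


def derive(gen, constraints):
    """Stages 2 and 3: product in decreasing order of priority, then cull."""
    surface = gen
    for constraint in constraints:
        surface = product(surface, constraint)
    return paths(prune(surface, label_nodes(surface, len(constraints))))


O, N = "O:0", "N:a"  # empty onset slot, nucleus filled by a
gen = (0, 2, [(0, frozenset({O}), (), 1), (1, frozenset({N}), (), 2),
              (0, frozenset({N}), (), 2)])
ons = ("s", "s", [("s", frozenset({O}), (0,), "o"), ("o", frozenset({N}), (0,), "s"),
                  ("s", frozenset({N}), (-1,), "s")])
fill = ("f", "f", [("f", frozenset({O}), (-1,), "f"), ("f", frozenset({N}), (0,), "f")])

print(derive(gen, [ons, fill]))  # [['O:0', 'N:a']]
print(derive(gen, [fill, ons]))  # [['N:a']]
```

Ranking ONS above FILL selects the candidate with the epenthetic onset, while the reverse ranking selects the onsetless candidate, so the same machinery yields different surface forms under different constraint orderings.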
There are three more points worth noting. Firstly, the constraints in a
hierarchy can be precompiled into a single transducer. Each application
to a set of candidates then only requires a single product operation
followed by a cull.
Secondly, casting the output of GEN and all constraints as regular
means that, at all stages in a derivation, the set of candidates is
regular. This is because the outputs of the product and culling
operations are regular: both return automata.
Finally, this specification of OT in terms of regular sets and
finite-state automata opens the way for more rigorous exploration of
the differences between OT and declarative phonological theories, such
as One-Level Phonology (Bird and Ellison 1994), which is a
constraint-based phonology that defines inviolable constraints with
automata.
References 
Bird, S. (1994). Computational Phonology: A Constraint-Based Approach.
    Studies in Natural Language Processing. Cambridge University Press.
Bird, S. & Ellison, T. M. (1994). One level phonology: autosegmental
    representations and rules as finite automata. Computational
    Linguistics, 20, 55-90.
Ellison, T. M. (1994). Constraints, Exceptions and Representations. In
    Proceedings of the First Meeting of the ACL Special Interest Group
    in Computational Phonology, (pp. 25-32). ACL.
Itô, J. & Mester, R. A. (1993). Licensed segments and safe paths.
    Canadian Journal of Linguistics, 38, 197-213.
McCarthy, J. (1993). A case of surface constraint violation. Canadian
    Journal of Linguistics, 38, 169-195.
McCarthy, J. & Prince, A. (1993). Prosodic Morphology I: Constraint
    Interaction and Satisfaction. Unpublished Report.
Prince, A. S. & Smolensky, P. (1993). Optimality Theory: Constraint
    Interaction in Generative Grammar. Technical Report 2, Center for
    Cognitive Science, Rutgers University.
Scobbie, J. M. (1991). Attribute-Value Phonology. PhD thesis,
    University of Edinburgh.
Scobbie, J. M. (1993). Constraint violation and conflict from the
    perspective of declarative phonology. Canadian Journal of
    Linguistics, 38, 155-167.
