On Reversing the Generation Process in Optimality Theory 
J. Eric Fosler 
U.C. Berkeley and International Computer Science Institute 
1947 Center Street, Suite 600, Berkeley, CA 94704 
fosler~icsi.berkeley.edu 
Abstract 
Optimality Theory, a constraint-based phonol- 
ogy and morphology paradigm, has allowed 
linguists to make elegant analyses of many 
phenomena, including infixation and redupli- 
cation. In this work-in-progress, we build on 
the work of Ellison (1994) to investigate the 
possibility of using OT as a parsing tool that 
derives underlying forms from surface forms. 
a. Derivational Phonology b. Optimality Theory 
Figure I: Search Spaces Within Different Paradigms 
1 Introduction 
Optimality Theory (Prince and Smolensky, 1993) is a 
constraint-based phonological and morphological system 
that allows violable constraints in deriving output sur- 
face forms from underlying forms. In OT a system of 
constraints selects an "optimal" surface output from a 
set of candidates. The methodology allows succinct anal- 
yses of phenomena such as infixation and reduplication 
that were difficult to describe under sets of transforma- 
tional rules. 
Several computational methods for OT have been pro- 
duced within the short amount of time since Prince and 
Smolensky's paper (Ellison, 1994; Tesar, 1995; Ham- 
mond, 1995). These systems were designed as genera- 
tion systems, deriving surface forms from an underlying 
lexicon. There have, however, been no computational 
models of OT parsers that derive underlying forms from 
the surface form. z In this work, we lay the theoretical 
groundwork for using OT as a parsing tool. 
2 Comparing Derivational Methods 
to Optimality Theory 
In traditional computational phonology/morphology 
systems such as two-level phonology (Koskenniemi, 
1983), grammars that generate surface forms axe invert- 
ible, allowing parsing back into underlying forms. In a 
derivational framework, the grammar converts underly- 
ing forms to surface outputs via transformations; the in- 
put and output share the same space (Figure la). In the 
one-level version of OT that most computational meth- 
ods use, the space is populated with candidate outputs 
I Some of the computational work in OT confusingly 
uses the term "parsing" to refer to generation. 
created by a generator function GZN operating on in- 
put strings. The search narrows in on an optimal out- 
put (Figure lb) using evaluation constraints in a process 
called EVAL; successively smaller boundaries are cut out 
by the constraints until only one candidate remains. It 
is easy to see why the derivational method can be run 
backward: it just retraces derivational links in the graph. 
It is not obvious, though, how the input can be found 
from the search space in OT. 
3 Tagalog Infixation 
Infixation has traditionally been a difficult problem 
for computational models that use two-level phonology 
(Sproat, 1992). Infixation in Tagalog, however, has been 
modeled using OT (McCarthy and Prince, 1995). In 
Tagalog, the urn affix can appear as a prefix, or "move" 
slightly into the word to which it is attaching (French, 
1988). 
Root with um Gloss 
alis um-alis "leave" 
sulat s-um-ulat "write" 
gradwet gr-um-adwet "graduate" 
McCarthy and Prince analyze um as a prefix, which 
moves into a word to reduce the number of coda con- 
sonants. They postulate two competing constraints, 
ALIGN-PREFIX and the higher-ranked NOCODA. ALIGN- 
PREFIX states that the prefix should remain as close to 
the front of the word as possible. NOCODA penalizes 
syllables with coda consonants. 
In the OT derivation of grumsdwet from um+gradwet 
(Figure 2), the winning candidate violates NOCODA 
twice, while the first two candidates violate it three 
times. The final candidate is pruned since it violates 
the ALIGN constraint more times than the winner. 
354 
Candidates NoCoda 
um.grad.wet • * .! 
gum.rad.wet • • .! 
V / gru.mad.wet ** 
gra.dum.wet ** 
Align 
,,III~ 
,,,,,L;;;;;;; 
i,\[:ll 
* * *!* 
Figure 2: OT Evaluation for Tagalog Infixation 
(Morphenm Structure) ( PPWWWWWWW~ I / WPPWWWW~ { WWPP~ 
(Syllable Structure) { NC00NCONC/\[ | 0NCONCONC~ ~ 00NONCONC} 
(Phoneme Strtlcturc) x umgradwet I \[ ~ gumradwetl • grumadwet / 
Candl Cand2 Cand3 
Figure 3: Candidate outputs for um+gradwet in an 
FST 
4 Ellison's Conversion Method 
Ellison (1994) provides a paradigm for converting Opti- 
mality Theory constraints into Finite State Transducers. 
He requires that EVAL constraints output binary marks 
when ranking candidates and be describable as a regular 
language; the output of GEN must also be describable 
by a regular language. As Ellison points out, most con- 
straints can be reformulated to be binary. He is able 
to build FST representations for the constraints that he 
considers, showing them to be regular. 
For the Tagalog example, GEN will output the regu- 
lar language shown in Figure 3 for the first three candi- 
dates (umgradwet, gumradwet, and grumadwet). 2 Each 
candidate consists of segments associated with a syllable 
structure position and a morpheme structure marker. 3 
We now consider the ALIGN-PREFIX constraint, re- 
stricting the prefix to occur as early in the word as pos- 
sible. This is encoded as an FST that writes marks on 
an output "Harmony Marks" tape. A 'T' is written 
for any word (W) morphological material that precedes 
prefix (P) material, and a "0" is written for any other 
segment. 
(Molpheme St~ctm'e) 0 (Syllable 
Structure) • 
(Phoneme Structure) • 
Figure 4: ALIGN-PREFIX FST Regular Language 
The regular language generated by this FST (Figure 4) 
has a very simple structure. Any Ws before Ps on the 
Morpheme Structure tape get a harmony violation mark. 
Taking the product of this language with the optimal 
candidate scores the candidate (Figure 5). The harmony 
marks include two non-harmonicmarks (i.e. "l"s); inthe 
OT tableau in Figure 2, we see that ALIGN also gives two 
marks to the optimal candidate. 
We can encode a similar FST for NOCODA. This FST 
examines the syllable structure tape to give harmony 
marks (Figure 6)-- codas (Cs) get a harmony violation 
mark, onsets (O) and nuclei (N) are unmarked. As in the 
OT tableau, the winning candidate (Figure 7) violates 
NOCODA twice. 
2For brevity, we are not considering other candidates. 
aWe have extended Ellison's work by adding a third 
tape that marks segments as belonging to the prefix or 
to the word. 
(Harm°nyMarks) (1 1 0 0 0 0 0 0i)w (Morpheme Structure) W P P W W W W 
(Syllable S~cture) 0 0 N 0 N C 0 N 
(Phoneme Structure) g r u m a d w e 
Figure 5: Scoring of gruma~lwet by ALIGN-PREFIX -ony-,(ll °1o ) 
(Morpheme Structure) • • 
(Syllable S~ucmm) o N C 
(Phoneme Structure) • • • 
Figure 6: Regular Language generated by NOCODA 
Once the OT constraints are represented as FSTs, 
combining all of the EVAL constraints into one trans- 
ducer is a straightforward product. Ellison augments 
the product procedure so that harmony marks are con- 
catenated by the resulting transducer. 
We have used two different types of harmony marks 
in the ALIGN-PREFIX and NOCODA FSTs, representing 
the ranking of the two rules as suggested by McCarthy 
and Prince. The higher-ranked NOCODA constraint out- 
puts "2" marks while ALIGN-PREFIX outputs 'T' marks. 4 
Harmonic comparisons between the candidates will con- 
sider the candidates with the smallest number of "2" 
marks first, followed by the smallest number of "1" 
marks. Marks are not added together, rather, the count 
of each type of mark is the deciding factor in evaluation, s 
The output of GEN and the constraints of EVAL are 
combined into a single transducer by taking the product 
of all of the FSTs. For the Tagalog example, the output 
rankings for the candidates are shown in Figure 8. Using 
the harmonic marks to prune the resulting transducer 
reveals the optimal candidate (Figure 9). 
5 Extensions to Parsing 
Ellison's approach gives us an elegant method of per- 
forming OT generation using finite state automata. Nev- 
ertheless, the system cannot parse the output string 
back into underlying surface forms. In a derivational 
paradigm (Figure la) , the input and output forms are 
enclosed in the same space. The derivational grammar 
is a transform that one can invert using FSTs, searching 
for the input using the output. 
Ellison's FSTs transform output candidates to har- 
mony marks; even so, the inversion of these FSTs are 
useless. The crucial point is that GEN hides the surface- 
form-to-candidate mapping; in Ellison's system the EVAL 
portion of the system only combines with the output of 
GEN, so the mapping is lost. For invertability it is crit- 
ical that the FST have access to both input and output 
forms. 
In the version of 0T (one-level OT) Ellison incorpo- 
rated into his system, outputs of GEN are constrained 
4Ellison uses only one type of mark and determines 
rank ordering from the relative positions of marks for 
each output segment. These two methods are equivalent. 
S0ne "2" is worse than two 'q"s. 
- .-iooooo oo ) (Morpheme Sa'uc~r¢) W W P P W W W W 
(Syllable StmcnLm) 0 0 N 0 N C 0 N 
(Phoneme Slructul~) g r u m a d w • t 
Figure 7: Scoring of grumadwet by NOCODA 
355 
~s~) \[ P pwwwwwww III wPewwwwww |j wwPPwwwww (Syllab~ Stnmt~e) INCOONCONC II ~ONCONCONC ~l OONONCONC 
~S~gctw~)~umgradwetlJ%gumradwetll grumadwet 
Figure 8: Output of OT-FST System 
a~ooy~k,) \[ zozoooooooo2ooooo2\ (Moqmen~S~ucut~) I ww P Pwwwww l 
(Syllable Stmctu~) %00NONCONC/ (Phonen= Struclm~) \ g r u m a d w e c / 
Figure 9: Pruned Output of OT-FST System 
to be similar to the input. McCarthy & Prince (1994) 
abandon this constraint principle, and use faithfulness 
constraints in EVAL to achieve the same effect within 
"modern" two-level OT. This will be a critical move for 
the OT-FST paradigm. 
In two-level OT, G~.N generates all strings; faith- 
fulness constraints in EVAL minimize the inserted and 
deleted material between underlying and candidate sur- 
face forms. By specifically modeling the faithfulness con- 
straints, we now allow the FST to have access to the 
input-output correspondences crucial for searching for 
underlying forms. The remaining question, however, is 
whether faithfulness constraints can be modeled by reg- 
ular grammars. Several formulations of two-level OT 
faithfulness constraints are discussed by McCarthy and 
Prince (1994) and Orgun (1994). To illustrate the fla- 
vor of these constraints and how they might be regular- 
izable, we consider two constraints, CORR and MATCH 
(named for their similarity to Orgun's constraints). For 
our Tagalog example, we add two tapes for the under- 
lying word and prefix forms (Figure 10). The CORR 
constraint requires that for every element in the surface 
phoneme string there is a segment in the underlying word 
or prefix, and vice versa. MATCH constrains the surface 
string phoneme to match 6 those in the word and prefix, 
and vice versa (Figure 11). Using these constraints, the 
OT-FSTs should be able to generate and parse in the 
Tagalog example. 
(Morpheme Structure)(WWP I, WWWWtc ~ (Syllable S~rucRn~) 0 0 N 0 N C 0 N 
(Surface Phoneme SU'nctu~) g r u m a d w e 
(Word Phoneme@ (gradwe t) (Pzef=Phoneme,) (u m) 
Figure 10: Adding Word and Prefix Tapes 
The additional computational complexity for imple- 
menting this type of system may be quite large; the 
search space for determining unknown strings at parse 
time will make for a slow implementation unless suit- 
able heuristics are found for searching over each type of 
string. Systems of this type are likely to become even 
more complex as more information such as moraic struc- 
ture is added. We envision that these heuristics will be 
based on the harmony mark scoring of the FST, but the 
exact nature of this is left to future work. 
6 Conclusions & Future Work 
Current Computational Optimality Theory systems pro- 
vide solutions for OT generation, but deriving underly- 
ing forms from surface forms is not possible within these 
6Here we mean be identical to; this definition can be 
extended with features and underspecified elements. 
( 1o111111 ) Smear) • • (1%¢~ Pmmm~) • • 
Co~ Co-~a~iat '°l ll!b!l I:ll) 
(Moq~.~ Structure) \[ w (Fno.~.z Sm,cah-~) \[ a 
(W=dmmmm) ~ a ...... • 
Mazh Constmim 
Figure lh Faithfulness Constraints 
systems. In order to extend any generation system to an 
OT parsing system, two-level Optimality Theory should 
be a critical component, since it moves the hidden rela- 
tionship between input and output out of GEN and into 
EVAL. With two-level OT, the mapping from input to 
output can be directly operated upon by computational 
theories. 
We have proposed using two-level OT to extend E1- 
lison's technique for representing constraints as finite 
state transducers. By explicitly representing the input- 
to-output mapping using two-level OT, we have laid the 
theoretical groundwork for recovering underlying forms 
from surface forms. 
In future work, we will implement the extensions to 
Ellison's algorithm allowing us to morphologically ana- 
lyze cases like the Tagalog example. Search complexity 
will, however, be an issue in the implementation of the 
system; after an initial brute-force implementation, work 
must be focused on determining how the harmony marks 
can be used to heuristically guide the parser search. 
Acknowledgments 
We would like to thank Dan Jurafsky, Orhan Orgun, 
Sharon Inkelas, Nelson Morgan, Su-Lin Wu, and three 
anonymous ACL reviewers for comments, suggestions, 
and support. 

References 
T.M. Ellison. Phonological derivation in optimality theory. In 
COLING-9$, 1994. 
K. French. Insights into Tagalog: Reduplication, lnftxation, and 
Stress from Nonlinear Phonology. M.A. Thesis, Summer In- 
stitute of Linguistics and University of Texas, Arlington, 1988. 
M. Hammond. Syllable parsing in English and French. Rutgers 
Optimality Archive, 1995. 
L. Karttunen. Kimmo: A general morphological processor. In 
Texas Linguistics Forum P$, 1983. 
K. Koskenniemi. Two-Level Morphology: A General Compu- 
tational Model \]or Word.Forra Recognition and Production. 
Ph.D. thesis, University of Helsinki, 1983. 
J. McCarthy and A. Prince. Prosodic morphology, parts 1 and 2. 
Prosodic Morphology Workshop, OTS, Utrecht, 1994. 
J. McCarthy and A. Prince. Prosodic morphology. In J. Gold- 
smith, editor, Handbook of Phonological Theory, pages 318- 
366. Basil Blackwell Ltd., 1995. 
O. Orgun, Containment: Why and why not. Unpublished ms., U. 
of California-Berkeley, Department of Linguistics, July 1994. 
A. Prince and P. Smolensky. Optimality theory. Unpublished 
ms., Rutgers University, 1993. 
R. Sproat. Morphology and Computation. MIT Press, Cam- 
bridge, MA, 1992. 
B. Tesar. Computational Optimality Theory. Ph.D. Thesis, U. 
of Colorado-Boulder, Department of Computer Science, 1995. 
