Optimality Theory: Universal Grammar, Learning and 
Parsing Algorithms, and Connectionist Foundations (Abstract) 
Paul Smolensky and Bruce Tesar 
Department of Computer Science and Institute of Cognitive Science 
University of Colorado, Boulder USA 
We present a recently proposed theory of grammar, 
Optimality Theory (OT; Prince & Smolensky 1991, 
1993). The principles of OT derive in large part from 
the high-level principles governing computation in con- 
nectionist networks. The talk proceeds as follows: (1) 
we summarize OT and its applications to UG. The we 
present (2) learning and (3) parsing algorithms for OT. 
Finally, (4) we show how crucial elements of OT emerge 
from connectionism, and discuss the one central feature 
of OT which so far eludes connectionist explanation. 
(1) In OT, UG provides a set of highly general univer- 
sal constraints which apply in parallel to assess the well- 
formedness of possible structural descriptions of linguis- 
tic inputs. The constraints may conflict, and for most 
inputs no structural description meets them all. The 
grammatical structure is the one that optimally meets 
the conflicting constraint sets. Optimality is defined on 
a language-particular basis: each language's grammar 
ranks the universal constraints in a dominance hierar- 
chy such that each constraint has absolute priority over 
all lower-ranked constraints. Given knowledge of UG, 
the job of the learner is to determine the constraint 
ranking which is particular to his or her language. \[The 
explanatory power of OT as a theory of UG has now 
been attested for phonology in over two dozen papers 
and books (e.g., McCarthy ~: Prince 1993; Rutgers Op- 
timality Workshop, 1993); applications of OT to syntax 
are now being explored (e.g. Legendre, Raymond, 
$molensky 1993; Grimshaw 1993).\] 
(2) Learnability ofOT (Tesar ~ Smolensky, 1993). The- 
ories of UG can be used to address questions of learn- 
ability via the formal universal principles they provide, 
or via their substantive universals. We will show that 
OT endows UG with sufficiently tight formal struc- 
ture to yield a number of strong learnability results at 
the formal level. We will present a family of closely 
related algorithms for learning, from positive exam- 
ples only, language-particular grammars on the basis 
of prior knowledge of the universal principles. We will 
sketch our proof of the correctness of these algorithms 
and demonstrate their low computational complexity. 
(More precisely, the learning time in the worst case, 
measured in terms of 'informative examples', grows only 
as n 2, where n is the number of constraints in UG, even 
though the number of possible grammars grows as n!, 
i.e., faster than exponentially.) Because these results 
depend only on the formal universals of OT, and not on 
the content of the universal constraints which provide 
the substantive universals of the theory, the conclusion 
that OT grammars are highly learnable applies equally 
to OT grammars in phonology, syntax, or any other 
grammar component. 
(3) Parsing in OT is assumed by many to be problem- 
atic. For OT is often described as follows: take an 
input form, generate all possible parses of it (generally, 
infinite in number), evaluate all the constraints against 
all the parses, filter the parses by descending the con- 
straints in the dominance hierarchy. While this cor- 
rectly characterizes the input/output function which is 
an OT grammar, it hardly provides an efficient pars- 
ing procedure. We will show, however, that efficient, 
provably correct parsing by dynamic programming is 
possible, at least when the set of candidate parses is 
sufficiently simple (Tesar, 1994). 
(4) OT is built from a set of principles, most of which 
derive from high-level principles of connectionist com- 
putation. The most central of these assert that, given 
an input representation, connectionist networks tend to 
compute an output representation which best satisfies 
a set of conflicting soft constraints, with constraint con- 
flicts handled via a notion of differential strength. For- 
malized through Harmony Theory (Smolensky, 1986) 
and Harmonic Grammar (Legendre, Miyata, & Smolen- 
sky 1990), this conception of computation yields a the- 
ory of grammar based on optimization. Optimality 
Theory introduces to a non-numerical form of optimiza- 
tion, made possible by a property as yet unexplained 
from the connectionist perspective: in grammars, con- 
straints fall into strict domination hierarchies. 
271 
