OT Syntax: Decidability of Generation-based Optimization*
Jonas Kuhn
Department of Linguistics
Stanford University
jonask@stanford.edu
Abstract
In Optimality-Theoretic Syntax, optimiza-
tion with unrestricted expressive power on
the side of the OT constraints is unde-
cidable. This paper provides a proof for
the decidability of optimization based on
constraints expressed with reference to lo-
cal subtrees (which is in the spirit of OT
theory). The proof builds on Kaplan and
Wedekind’s (2000) construction showing
that LFG generation produces context-
free languages.
1 Introduction
Optimality-Theoretic (OT) grammar systems are an
interesting alternative to classical formal grammars,
as they construe the task of learning from data in
a meaning-based way: a form is defined as gram-
matical if it is optimal (most harmonic) within a set
of generation alternatives for an underlying logical
form. The harmony of a candidate analysis depends on a language-specific ranking ($\gg$) of violable constraints, thus the learning task amounts to adjusting the ranking over a given set of constraints.
(1) Candidate $A_1$ is more harmonic than $A_2$ iff it incurs fewer violations of the highest-ranking constraint $C_i \in \mathcal{C}$ in which $A_1$ and $A_2$ differ.
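The comparison in (1) can be illustrated with a minimal Python sketch. It assumes that a candidate is represented only by its violation profile (a dictionary from constraint names to violation counts) and that the ranking is a list with the highest-ranked constraint first; the names `more_harmonic`, `C1`, `C2`, and `C3` are hypothetical and chosen for illustration.

```python
from typing import Dict, List


def more_harmonic(a: Dict[str, int], b: Dict[str, int], ranking: List[str]) -> bool:
    """Return True iff profile `a` is strictly more harmonic than profile `b`.

    `ranking` lists constraint names from highest- to lowest-ranked.
    """
    for constraint in ranking:
        va, vb = a.get(constraint, 0), b.get(constraint, 0)
        if va != vb:
            # The highest-ranking constraint on which the candidates differ decides.
            return va < vb
    return False  # identical profiles: neither candidate is more harmonic


ranking = ["C1", "C2", "C3"]               # C1 >> C2 >> C3 (hypothetical)
cand_a = {"C1": 0, "C2": 1, "C3": 5}
cand_b = {"C1": 0, "C2": 2, "C3": 0}
print(more_harmonic(cand_a, cand_b, ranking))
```

Note that `cand_a` wins despite its five violations of the lowest-ranked constraint: the decision is taken at `C2` and lower-ranked constraints are never consulted.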
The comparison-based setup of OT learning is closely related to discriminative learning approaches in probabilistic parsing (Johnson et al., 1999; Riezler et al., 2000; Riezler et al., 2002).1 However, the comparison of generation alternatives, rather than parsing alternatives, adds the possibility of systematically learning the basic language-specific grammatical principles (which in probabilistic parsing are typically fixed a priori, using either a treebank-derived or a manually written grammar for the given language). The “base grammar” assumed as given can be highly unrestricted in the OT setup. Using a linguistically motivated set of constraints, learning proceeds with a bias for unmarked linguistic structures (cf. e.g., (Bresnan et al., 2001)).
* This work was supported by a postdoctoral fellowship of the German Academic Exchange Service (DAAD).
1 This is for instance pointed out by (Johnson, 1998).
For computational OT syntax, an interleaving of
candidate generation and constraint checking has
been proposed (Kuhn, 2000). But the decidability
of the optimization task in OT syntax, i.e., the iden-
tification of the optimal candidate(s) in a potentially
infinite candidate set, has not been proven yet.2
2 Undecidability for unrestricted OT
Assume that the candidate set is characterized by a context-free grammar (cfg) $G_1$, plus one additional candidate ‘yes’. There are two constraints ($C_1 \gg C_2$): $C_1$ is violated if the candidate is neither ‘yes’ nor a structure generated by a cfg $G_2$; $C_2$ is violated only by ‘yes’. Now, ‘yes’ is in the language defined by this system iff there are no structures in $G_1$ that are also in $G_2$. But the emptiness problem for the intersection of two context-free languages is known to be undecidable, so the optimization task for unrestricted OT is undecidable too.3
However, it is not in the spirit of OT to have
extremely powerful individual constraints; the ex-
planatory power should rather arise from interaction
of simple constraints.
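The reduction in Section 2 can be made concrete with a small Python sketch. It is an illustration under a strong simplifying assumption: the two context-free languages are replaced by finite sets of strings, so that the intersection test is trivially computable (for genuine context-free grammars it is not, which is the point of the argument). The helper names `optimal` and `yes_wins` are hypothetical.

```python
def optimal(candidates, constraints):
    """Return the candidates whose violation profile is minimal.

    `constraints` is ordered from highest- to lowest-ranked; each constraint is
    a function from a candidate to its number of violations.
    """
    def profile(candidate):
        return tuple(constraint(candidate) for constraint in constraints)

    best = min(profile(c) for c in candidates)
    return {c for c in candidates if profile(c) == best}


def yes_wins(lang1, lang2):
    """Check whether 'yes' is optimal in the system built from lang1 and lang2.

    lang1 plays the role of the candidate grammar, lang2 the role of the
    grammar referred to by the higher-ranked constraint.
    Assumption: neither finite set contains the string 'yes' itself.
    """
    candidates = set(lang1) | {"yes"}

    def c1(c):  # violated if c is neither 'yes' nor in lang2
        return 0 if (c == "yes" or c in lang2) else 1

    def c2(c):  # violated only by 'yes'
        return 1 if c == "yes" else 0

    return "yes" in optimal(candidates, [c1, c2])


print(yes_wins({"ab", "aabb"}, {"ba"}))         # disjoint languages
print(yes_wins({"ab", "aabb"}, {"aabb", "c"}))  # overlapping languages
```

With disjoint sets every regular candidate violates the higher-ranked constraint, so ‘yes’ wins; as soon as a shared string exists, it has an empty violation profile and beats ‘yes’.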
3 OT-LFG
Following (Bresnan, 2000; Kuhn, 2000; Kuhn, 2001), we define a restricted OT system based on Lexical-Functional Grammar (LFG) representations: c(ategory) structure/f(unctional) structure pairs $\langle T, \Phi \rangle$ like $\langle$(4),(5)$\rangle$. Each c-structure tree node is mapped to a node in the f-structure graph by the function $\phi$. The mapping is specified by f-annotations in the grammar rules (below category symbols, cf. (2)) and lexicon entries (3).4
2 Most computational OT work so far focuses on candidates and constraints expressible as regular languages/rational relations, based on (Frank and Satta, 1998) (e.g., (Eisner, 1997; Karttunen, 1998; Gerdemann and van Noord, 2000)).
3 Cf. also (Johnson, 1998) for the sketch of an undecidability argument and (Kuhn, 2001, 4.2, 6.3) for further constructions.
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 48-55.
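(2)
$$
\begin{array}{lcl}
\mathrm{ROOT} & \to & \{\ \underset{\uparrow=\downarrow}{\mathrm{FP}} \ \mid\ \underset{\uparrow=\downarrow}{\mathrm{VP}}\ \} \\[2ex]
\mathrm{FP} & \to & \{\ \underset{\substack{(\uparrow\,\mathrm{TOPIC})=\downarrow \\ (\uparrow\,\mathrm{COMP}^{*}\ \mathrm{OBJ})=\downarrow}}{\mathrm{NP}} \ \ \underset{\uparrow=\downarrow}{\mathrm{FP}} \ \mid\ \underset{(\uparrow\,\mathrm{SUBJ})=\downarrow}{(\mathrm{NP})} \ \ \underset{\uparrow=\downarrow}{\mathrm{F'}}\ \} \\[2ex]
\mathrm{F'} & \to & \underset{\uparrow=\downarrow}{\mathrm{F}} \ \ \{\ \underset{\uparrow=\downarrow}{\mathrm{FP}} \ \mid\ \underset{\uparrow=\downarrow}{\mathrm{VP}}\ \} \\[2ex]
\mathrm{VP} & \to & \underset{(\uparrow\,\mathrm{SUBJ})=\downarrow}{(\mathrm{NP})} \ \ \underset{\uparrow=\downarrow}{\mathrm{V'}} \\[2ex]
\mathrm{V'} & \to & \underset{\uparrow=\downarrow}{\mathrm{V}} \ \ \{\ \underset{(\uparrow\,\mathrm{OBJ})=\downarrow}{\mathrm{NP}} \ \mid\ \underset{(\uparrow\,\mathrm{COMP})=\downarrow}{\mathrm{FP}}\ \}
\end{array}
$$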
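(3)
$$
\begin{array}{lll}
\text{Mary} & \mathrm{NP} & (\uparrow\,\mathrm{PRED})=\text{`Mary'} \\
 & & (\uparrow\,\mathrm{NUM})=\mathrm{SG} \\
\text{that} & \mathrm{F} & \\
\text{had} & \mathrm{F} & (\uparrow\,\mathrm{TNS})=\mathrm{PAST} \\
\text{seen} & \mathrm{V} & (\uparrow\,\mathrm{PRED})=\text{`see}\langle(\uparrow\,\mathrm{SUBJ})\ (\uparrow\,\mathrm{OBJ})\rangle\text{'} \\
 & & (\uparrow\,\mathrm{ASP})=\mathrm{PERF} \\
\text{thought} & \mathrm{V} & (\uparrow\,\mathrm{PRED})=\text{`think}\langle(\uparrow\,\mathrm{SUBJ})\ (\uparrow\,\mathrm{COMP})\rangle\text{'} \\
 & & (\uparrow\,\mathrm{TNS})=\mathrm{PAST} \\
\text{laughed} & \mathrm{V} & (\uparrow\,\mathrm{PRED})=\text{`laugh}\langle(\uparrow\,\mathrm{SUBJ})\rangle\text{'} \\
 & & (\uparrow\,\mathrm{TNS})=\mathrm{PAST}
\end{array}
$$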
(4) c-structure
[ROOT
  [VP
    [NP John]
    [V'
      [V thought]
      [FP
        [F'
          [F that]
          [FP
            [NP Mary]
            [F'
              [F had]
              [VP
                [V'
                  [V seen]
                  [NP Titanic]]]]]]]]]]
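(5) f-structure
$$
\left[\begin{array}{ll}
\mathrm{PRED} & \text{`think}\langle(\uparrow\,\mathrm{SUBJ})\ (\uparrow\,\mathrm{COMP})\rangle\text{'} \\
\mathrm{TNS} & \mathrm{PAST} \\
\mathrm{SUBJ} & \left[\begin{array}{ll} \mathrm{PRED} & \text{`John'} \\ \mathrm{NUM} & \mathrm{SG} \end{array}\right] \\
\mathrm{COMP} & \left[\begin{array}{ll}
  \mathrm{PRED} & \text{`see}\langle(\uparrow\,\mathrm{SUBJ})\ (\uparrow\,\mathrm{OBJ})\rangle\text{'} \\
  \mathrm{TNS} & \mathrm{PAST} \\
  \mathrm{ASP} & \mathrm{PERF} \\
  \mathrm{SUBJ} & \left[\begin{array}{ll} \mathrm{PRED} & \text{`Mary'} \\ \mathrm{NUM} & \mathrm{SG} \end{array}\right] \\
  \mathrm{OBJ} & \left[\begin{array}{ll} \mathrm{PRED} & \text{`Titanic'} \\ \mathrm{NUM} & \mathrm{SG} \end{array}\right]
\end{array}\right]
\end{array}\right]
$$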
4 $\downarrow$ abbreviates $\phi(*)$, i.e., the present category’s $\phi$ image; $\uparrow$ abbreviates $\phi(\mathcal{M}*)$, i.e., the f-structure corresponding to the present node’s mother category.
The correct f-structure for a sentence is the min-
imal model satisfying all properly instantiated f-
annotations.
In OT-LFG, the universe of possible candidates is defined by an LFG $G_{inviol}$ (encoding inviolable principles, like an X-bar scheme). A particular candidate set is the set $Gen_{G_{inviol}}(\Phi_{in})$, i.e., the c-/f-structure pairs in $G_{inviol}$ which have the input $\Phi_{in}$ as their f-structure. Constraints are expressed as local configurations in the c-/f-structure pairs. They have one of the following implicational forms:5
(6) $\underset{\lambda}{\mathcal{N}} \;\Rightarrow\; \underset{\lambda'}{\mathcal{N}'}$
where $\mathcal{N}, \mathcal{N}'$ are descriptions of nonterminals of $G_{inviol}$; $\lambda, \lambda'$ are standard LFG f-annotations of constraining equations with $\uparrow$ as the only f-structure metavariable.
(7) $\begin{array}{c}\mathcal{N}\\ \alpha\ \ \underset{\lambda}{\mathcal{D}}\ \ \beta\end{array} \;\Rightarrow\; \begin{array}{c}\mathcal{N}'\\ \alpha'\ \ \underset{\lambda'}{\mathcal{D}'}\ \ \beta'\end{array}$
where $\mathcal{N}, \mathcal{N}', \mathcal{D}, \mathcal{D}'$ are descriptions of nonterminals of $G_{inviol}$; $\mathcal{N}, \mathcal{N}'$ refer to the mother in a local subtree configuration, $\mathcal{D}, \mathcal{D}'$ refer to the same daughter category; $\alpha, \alpha', \beta, \beta'$ are regular expressions over nonterminals; $\lambda, \lambda'$ are standard f-annotations as in (6).
Any of the descriptions can be maximally unspecific; (6) can for example be instantiated by the OPSPEC constraint $(\uparrow\,\mathrm{OP})=+ \;\Rightarrow\; (\mathrm{DF}\ \uparrow)$ (an operator must be the value of a discourse function, (Bresnan, 2000)) with the category information unspecified.
An OT-LFG system $\mathcal{O}$ is thus characterized by a base grammar and a set of constraints, with a language-specific ranking relation $\gg_{\mathcal{L}}$:
$$\mathcal{O} = \langle G_{inviol}, \langle \mathcal{C}, \gg_{\mathcal{L}} \rangle \rangle .$$
The evaluation function $Eval_{\langle \mathcal{C}, \gg_{\mathcal{L}} \rangle}$ picks the most harmonic from a set of candidates, based on the constraints and ranking. The language (set of analyses)6 generated by an OT system is defined as
$$L(\mathcal{O}) = \{\, \langle T_j, \Phi_j \rangle \in G_{inviol} \mid \exists \Phi_{in}: \langle T_j, \Phi_j \rangle \in Eval_{\langle \mathcal{C}, \gg_{\mathcal{L}} \rangle}(Gen_{G_{inviol}}(\Phi_{in})) \,\}$$
4 LFG generation
Our decidability proof for generation-based op-
timization builds on the result of (Kaplan and
Wedekind, 2000) (K&W00) that LFG generation
produces context-free languages.
5Note that with GPSG-style category-level feature percola-
tion it is possible to refer to (finitely many) nonlocal configura-
tions at the local tree level.
6The string language is obtained by taking the terminal
string of the c-structure part of the analyses.
(8) Given an arbitrary LFG grammar $G$ and a cycle-free f-structure $\Phi$, a cfg $G'$ can be constructed that generates exactly the strings to which $G$ assigns the f-structure $\Phi$.
I will refer to the resulting cfg $G'$ as $KW(G, \Phi)$.
K&W00 present a constructive proof, folding all f-
structural contributions of lexical entries and LFG
rules into the c-structural rewrite rules (which is
possible since we know in advance the range of f-
structural objects that can instantiate the f-structure
meta-variables in the rules). I illustrate the special-
ization steps with grammar (2) and lexicon (3) and
for generation from f-structure (5).
Initially, the generalized format of right-hand sides in LFG rules is converted to the standard context-free notation (resolving regular expressions by explicit disjunction or recursive rules). F-structure (5) contains five substructures: the root f-structure, plus the embedded f-structures under the paths SUBJ, COMP, COMP SUBJ, and COMP OBJ. Any relevant metavariable ($\uparrow$, $\downarrow$) in the grammar must end up instantiated to one of these. So for each path from the root f-structure, a distinct variable is introduced: $f$, subscripted with the (abbreviated and possibly empty) feature path: $f, f_s, f_c, f_{cs}, f_{co}$.
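The enumeration of substructures can be sketched in Python. The sketch assumes that a cycle-free f-structure without reentrancies is encoded as a nested dictionary, with atomic values as strings; the encoding, the function names, and the spelling of the variable names (`f`, `f_s`, ...) are illustrative assumptions, not part of the K&W00 construction itself.

```python
def substructure_paths(fs, prefix=()):
    """List the feature paths leading to every substructure of `fs`.

    The empty path () stands for the root f-structure itself.
    """
    paths = [prefix]
    for attr, value in fs.items():
        if isinstance(value, dict):  # embedded f-structure, not an atomic value
            paths.extend(substructure_paths(value, prefix + (attr,)))
    return paths


def variable_name(path):
    """Abbreviate a feature path by the lower-cased initials of its attributes."""
    if not path:
        return "f"
    return "f_" + "".join(attr[0].lower() for attr in path)


# F-structure (5) as a nested dictionary (PRED values simplified to strings).
F5 = {
    "PRED": "think<SUBJ,COMP>",
    "TNS": "PAST",
    "SUBJ": {"PRED": "John", "NUM": "SG"},
    "COMP": {
        "PRED": "see<SUBJ,OBJ>",
        "TNS": "PAST",
        "ASP": "PERF",
        "SUBJ": {"PRED": "Mary", "NUM": "SG"},
        "OBJ": {"PRED": "Titanic", "NUM": "SG"},
    },
}

names = [variable_name(p) for p in substructure_paths(F5)]
print(names)
```

Because the f-structure is finite and cycle-free, the list of paths is finite, which is what makes it possible to multiply out the rules over all instantiations in the following steps.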
Rule augmentation step 1 adds to each category name a concrete f-structure to which the category corresponds. So for FP, we get FP:$f$, FP:$f_s$, FP:$f_c$, FP:$f_{cs}$, and FP:$f_{co}$. The rules are multiplied out to cover all combinations of augmented categories obeying the original f-annotations.7 Step 2 adds a set of instantiated f-annotation schemes to each symbol, based on the instantiation of metavariables from step 1. One instance of the lexicon entry Mary looks as follows:
(9) $\mathrm{NP}{:}f_{cs}{:}\{\,(f_{cs}\ \mathrm{PRED})=\text{`Mary'},\ (f_{cs}\ \mathrm{NUM})=\mathrm{SG}\,\} \;\to\; \text{Mary}$
The rules are again multiplied out to cover all combinations for which the set of f-constraints on the mother is the union of all daughters’ f-constraints, plus the appropriately instantiated rule-specific annotations. So, for the VP rule based on the categories $\mathrm{NP}{:}f_{cs}{:}\{(f_{cs}\ \mathrm{PRED})=\text{`Mary'},\ (f_{cs}\ \mathrm{NUM})=\mathrm{SG}\}$ and $\mathrm{V'}{:}f_{c}{:}\{(f_{c}\ \mathrm{PRED})=\text{`laugh'},\ (f_{c}\ \mathrm{TNS})=\mathrm{PAST},\ f_c = f_c\}$, we get the rule
$$
\mathrm{VP}{:}f_c{:}\left\{\begin{array}{l}
(f_c\ \mathrm{SUBJ}) = f_{cs} \\
(f_{cs}\ \mathrm{PRED})=\text{`Mary'} \\
(f_{cs}\ \mathrm{NUM})=\mathrm{SG} \\
(f_c\ \mathrm{PRED})=\text{`laugh'} \\
(f_c\ \mathrm{TNS})=\mathrm{PAST} \\
f_c = f_c
\end{array}\right\}
\;\to\;
\mathrm{NP}{:}f_{cs}{:}\left\{\begin{array}{l}
(f_{cs}\ \mathrm{PRED})=\text{`Mary'} \\
(f_{cs}\ \mathrm{NUM})=\mathrm{SG}
\end{array}\right\}
\ \
\mathrm{V'}{:}f_c{:}\left\{\begin{array}{l}
(f_c\ \mathrm{PRED})=\text{`laugh'} \\
(f_c\ \mathrm{TNS})=\mathrm{PAST} \\
f_c = f_c
\end{array}\right\}
$$
7 VP:$f_c$ $\to$ NP:$f_{cs}$ V':$f_c$ is allowed, while VP:$f_c$ $\to$ NP:$f_{cs}$ V':$f_{co}$ is excluded, since the $\uparrow=\downarrow$ annotation of V' in the VP rule (2) enforces that $\phi(\mathrm{VP}) = \phi(\mathrm{V'})$.
With this bottom-up construction it is ensured that each new category ROOT:$f$:$\{\ldots\}$ (corresponding to the original root symbol) contains a complete possible collection of instantiated f-constraints. To exclude analyses whose f-structure is not $\Phi$ (for which we are generating strings), a new start symbol is introduced “above” the original root symbol. Only for the sets of f-constraints that have $\Phi$ as their minimal model, rules of the form ROOT' $\to$ ROOT:$f$:$\{\ldots\}$ are introduced (this also excludes inconsistent f-constraint sets).
With the cfg $KW(G, \Phi)$, standard techniques for cfg’s can be applied, e.g., if there are infinitely many possible analyses for a given f-structure, the smallest one(s) can be produced, based on the pumping lemma for context-free languages. Grammar (2) does indeed produce infinitely many analyses for the input f-structure (5). It overgenerates in several respects: The functional projection FP can be stacked due to recursions like the following (with the augmented FP reoccurring in the F' rules):
$$\mathrm{FP}{:}f_c{:}\Gamma \;\to\; \mathrm{F'}{:}f_c{:}\Gamma$$
$$\mathrm{F'}{:}f_c{:}\Gamma \;\to\; \mathrm{F}{:}f_c{:}\emptyset \ \ \mathrm{FP}{:}f_c{:}\Gamma$$
where all four occurrences of $\Gamma$ stand for the same set of instantiated f-constraints:
$$
\Gamma = \left\{\begin{array}{l}
(f_c\ \mathrm{PRED})=\text{`see}\langle\ldots\rangle\text{'} \\
(f_c\ \mathrm{TNS})=\mathrm{PAST} \\
(f_c\ \mathrm{SUBJ}) = f_{cs} \\
(f_{cs}\ \mathrm{PRED})=\text{`Mary'} \\
(f_c\ \mathrm{OBJ}) = f_{co} \\
(f_{co}\ \mathrm{PRED})=\text{`Titanic'} \\
f_c = f_c
\end{array}\right\}
$$
F:$f_c$:$\emptyset$ is one of the augmented categories we get for that in (3), so $KW$((2),(5)) generates an arbitrary number of thats on top of any FP. A similar repetition effect will arise for the auxiliary had.8 Other choices in generation arise from the freedom of generating the subject in the specifier of VP or FP and from the possibility of (unbounded) topicalization of the object (the first disjunction of the FP rule in (2) contains a functional-uncertainty equation):
(10) a. John thought that Titanic, Mary had seen.
b. Titanic, John thought that Mary had seen.
8 The F$^0$ entries do not contribute any PRED value, which would exclude doubling due to the instantiated symbol character of PRED values (cf. K&W00, fn. 2).
5 LFG generation in OT-LFG
While grammar (2) would be considered defective as a classical LFG grammar, it constitutes a reasonable example of a candidate generation grammar ($G_{inviol}$) in OT. Here, it is the OT constraints that enforce language-specific restrictions, so $G_{inviol}$ has to ensure that all candidates are generated in the first place. For instance, expletive elements such as do in Who do you know will arise by passing a recursion in the cfg constructed during generation. A candidate containing such a vacuous cycle can still become the winner of the OT competition if the Faithfulness constraint punishing expletives is outranked by some constraint favoring an aspect of the recursive structure. So the harmony is increased by going through the recursion a certain number of times. It is for this very reason that Who do you know is predicted to be grammatical in English.
So, in OT-LFG it is not sufficient to apply just the $KW$ construction; I use an additional step: prior to application of $KW$, the LFG grammar $G_{inviol}$ is converted to a different form $\mathcal{F}_{\mathcal{C}}(G_{inviol})$ (depending on the constraint set $\mathcal{C}$), which is still an LFG grammar but has category symbols which reflect local constraint violations. When the $KW$ construction is applied to $\mathcal{F}_{\mathcal{C}}(G_{inviol})$, all “pumping” structures generated by the cfg $KW(\mathcal{F}_{\mathcal{C}}(G_{inviol}), \Phi_{in})$ can indeed be ignored since all OT-relevant candidates are already contained in the finite set of non-recursive structures. So, finally the ranking of the constraints is taken into consideration in order to determine the harmony of the candidates in this finite subset.
6 The conversion $\mathcal{F}_{\mathcal{C}}(G_{inviol})$
Preprocessing Like K&W00, I assume an initial conversion of the c-structure part of rules into standard context-free form, i.e., the right-hand side is a category string rather than a regular expression. This ensures that for a given local subtree, each constraint (of form (6) or (7)) can be applied only a finite number of times: if $m$ is the arity of the longest right-hand side of a rule, the maximal number of local violations is $m$ (since some constraints of type (7) can be instantiated to all daughters).
Grammar conversion With the number of local violations bounded, we can encode all candidate distinctions with respect to constraint violations at the local-subtree level with finite means: The set of categories in the newly constructed LFG grammar $\mathcal{F}_{\mathcal{C}}(G_{inviol})$ is the finite set
(11) the set of categories in $\mathcal{F}_{\mathcal{C}}(G_{inviol})$:
$\{\, N{:}\langle n^1, n^2, n^3 \ldots n^k \rangle \mid N$ a nonterminal symbol of $G_{inviol}$, $k$ the size of the constraint set $\mathcal{C}$, $0 \le n^i \le m$, $m$ the arity of the longest rhs in rules of $G_{inviol} \,\}$
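The finiteness of the category set (11) can be checked with a short Python sketch. It enumerates every pairing of a nonterminal with a vector of local violation counts, where each count lies between 0 and the arity bound. The nonterminal list is read off grammar (2); the number of constraints (`k=2`) is a hypothetical value chosen for illustration, and the function name is an assumption.

```python
from itertools import product


def augmented_categories(nonterminals, k, m):
    """Pair each nonterminal with every vector of k counts in the range 0..m."""
    return [
        (nt, counts)
        for nt in nonterminals
        for counts in product(range(m + 1), repeat=k)
    ]


# Nonterminals of grammar (2); its longest right-hand side has two daughters.
nonterminals = ["ROOT", "FP", "F'", "VP", "V'", "NP", "F", "V"]
cats = augmented_categories(nonterminals, k=2, m=2)
print(len(cats))  # |N| * (m + 1) ** k
```

The size grows exponentially in the number of constraints, but it stays finite for any fixed grammar and constraint set, which is all the decidability argument requires.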
The rules in $\mathcal{F}_{\mathcal{C}}(G_{inviol})$ are constructed in such a way that for each rule
$$X_0 \;\to\; \underset{\Lambda_1}{X_1} \ \ldots\ \underset{\Lambda_m}{X_m}$$
in $G_{inviol}$ and each sequence $\langle n^1_0, n^2_0 \ldots n^k_0 \rangle$, $0 \le n^i_0 \le m$, all rules of the form
$$X_0{:}\langle n^1_0, n^2_0 \ldots n^k_0 \rangle \;\to\; \underset{\Lambda'_1}{X_1{:}\langle n^1_1 \ldots n^k_1 \rangle} \ \ldots\ \underset{\Lambda'_m}{X_m{:}\langle n^1_m \ldots n^k_m \rangle}, \qquad 0 \le n^i_j \le m$$
are included such that $n^i_0$ (the number of violations of constraint $C_i$ incurred local to the rule) and the f-annotations $\Lambda'_1 \ldots \Lambda'_m$ are specified as follows:
(12) for $C_i$ of form (6), $\underset{\lambda}{\mathcal{N}} \Rightarrow \underset{\lambda'}{\mathcal{N}'}$:
a. $n^i_0 = 0$; $\Lambda'_j = \Lambda_j$ ($1 \le j \le m$)
if $X_0$ does not match the condition $\mathcal{N}$;
b. $n^i_0 = 0$; $\Lambda'_1 = \Lambda_1 \wedge \neg\lambda$; $\Lambda'_j = \Lambda_j$ ($2 \le j \le m$)
if $X_0$ matches $\mathcal{N}$;
c. $n^i_0 = 0$; $\Lambda'_1 = \Lambda_1 \wedge \lambda \wedge \lambda'$; $\Lambda'_j = \Lambda_j$ ($2 \le j \le m$)
if $X_0$ matches both $\mathcal{N}$ and $\mathcal{N}'$;
d. $n^i_0 = 1$; $\Lambda'_1 = \Lambda_1 \wedge \lambda$; $\Lambda'_j = \Lambda_j$ ($2 \le j \le m$)
if $X_0$ matches $\mathcal{N}$ but not $\mathcal{N}'$;
e. $n^i_0 = 1$; $\Lambda'_1 = \Lambda_1 \wedge \lambda \wedge \neg\lambda'$; $\Lambda'_j = \Lambda_j$ ($2 \le j \le m$)
if $X_0$ matches both $\mathcal{N}$ and $\mathcal{N}'$;
(13) for $C_i$ of form (7), $\begin{array}{c}\mathcal{N}\\ \alpha\ \ \underset{\lambda}{\mathcal{D}}\ \ \beta\end{array} \Rightarrow \begin{array}{c}\mathcal{N}'\\ \alpha'\ \ \underset{\lambda'}{\mathcal{D}'}\ \ \beta'\end{array}$:
a. $n^i_0 = 0$; $\Lambda'_j = \Lambda_j$ ($1 \le j \le m$)
if $X_0$ does not match the condition $\mathcal{N}$;
b. $n^i_0 = \sum_{j=1}^{m} d_j$; $\Lambda'_j = g(\Lambda_j, \lambda, \lambda')$ ($1 \le j \le m$),
where
i. $d_j = 0$; $g(\Lambda_j, \lambda, \lambda') = \Lambda_j$
if $X_j$ does not match $\mathcal{D}$, or $X_1 \ldots X_{j-1}$ do not match $\alpha$, or $X_{j+1} \ldots X_m$ do not match $\beta$;
ii. $d_j = 0$; $g(\Lambda_j, \lambda, \lambda') = \Lambda_j \wedge \lambda \wedge \lambda'$
if $X_0$ matches both $\mathcal{N}$ and $\mathcal{N}'$; $X_j$ matches both $\mathcal{D}$ and $\mathcal{D}'$; $X_1 \ldots X_{j-1}$ match $\alpha$ and $\alpha'$; $X_{j+1} \ldots X_m$ match $\beta$ and $\beta'$;
iii. $d_j = 0$; $g(\Lambda_j, \lambda, \lambda') = \Lambda_j \wedge \neg\lambda$
if $X_0$ matches $\mathcal{N}$, $X_j$ matches $\mathcal{D}$, $X_1 \ldots X_{j-1}$ match $\alpha$, and $X_{j+1} \ldots X_m$ match $\beta$ (in parallel to (12b), only the antecedent descriptions need to match, since the negated annotation $\neg\lambda$ already rules out a violation);
iv. $d_j = 1$; $g(\Lambda_j, \lambda, \lambda') = \Lambda_j \wedge \lambda$
if $X_0$ matches $\mathcal{N}$, $X_j$ matches $\mathcal{D}$, $X_1 \ldots X_{j-1}$ match $\alpha$, $X_{j+1} \ldots X_m$ match $\beta$, but (at least) one of them does not match the respective description in the consequent ($\mathcal{N}', \mathcal{D}', \alpha', \beta'$);
v. $d_j = 1$; $g(\Lambda_j, \lambda, \lambda') = \Lambda_j \wedge \lambda \wedge \neg\lambda'$
if $X_0$ matches both $\mathcal{N}$ and $\mathcal{N}'$; $X_j$ matches both $\mathcal{D}$ and $\mathcal{D}'$; $X_1 \ldots X_{j-1}$ match $\alpha$ and $\alpha'$; $X_{j+1} \ldots X_m$ match $\beta$ and $\beta'$.
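The case distinction in (12) can be sketched as a small Python function. Given whether the mother category matches the antecedent and the consequent category description of a form-(6) constraint, it returns the applicable clauses together with the local violation count and the annotations added to the first daughter. The string labels for the annotations and the function name are illustrative assumptions; the sketch covers only form (6), not the per-daughter bookkeeping of (13).

```python
def clauses_form6(matches_antecedent: bool, matches_consequent: bool):
    """Return (clause, local_count, added_annotations) for each new rule.

    `added_annotations` are conjoined with the f-annotation of the first
    daughter; all other daughter annotations stay unchanged.
    """
    if not matches_antecedent:
        return [("a", 0, [])]                       # constraint not applicable
    rules = [("b", 0, ["not lambda"])]              # antecedent annotation fails
    if matches_consequent:
        rules.append(("c", 0, ["lambda", "lambda'"]))      # constraint satisfied
        rules.append(("e", 1, ["lambda", "not lambda'"]))  # violated via f-annotation
    else:
        rules.append(("d", 1, ["lambda"]))          # violated via category mismatch
    return rules


for clause, count, added in clauses_form6(True, True):
    print(clause, count, added)
```

The clauses partition the space of possibilities: in every situation the added annotations of the returned rules are mutually exclusive and jointly exhaustive, so exactly one of the new rules applies to any given analysis of the original rule.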
Note that the constraint profile of the daughter categories does not play any role in the determination of constraint violations local to the subtree under consideration (only the sequences $n^i_0$ are restricted by the conditions (12) and (13)). So for each new rule type, all combinations of constraint profiles on the daughters are constructed (creating a large but finite number of rules).9 This ensures that no sentence that can be parsed (or generated) by $G_{inviol}$ is excluded from $\mathcal{F}_{\mathcal{C}}(G_{inviol})$ (as stated by fact (14)):10
(14) Coverage preservation
All strings generated by an LFG grammar $G$ are also generated by $\mathcal{F}_{\mathcal{C}}(G)$.
The original $G$ analysis can be recovered from an $\mathcal{F}_{\mathcal{C}}(G)$ analysis by applying a projection function Cat to all c-structure categories:
Cat$(N{:}\langle n^1, n^2 \ldots n^k \rangle) = N$
for every category in the set (11)
9 For one rule/constraint combination several new rules can result; e.g., if the left-hand side of a rule ($X_0$) matches both the antecedent ($\mathcal{N}$) and the consequent ($\mathcal{N}'$) category description of a constraint of form (6), three clauses apply: (12b), (12c), and (12e). So, we get two new rules with the count of 0 local violations of the constraint and one rule with count 1, with a difference in the f-annotations.
10 Providing all possible combinations of augmented category symbols on the right-hand rule sides in $\mathcal{F}_{\mathcal{C}}(G)$ ensures that the newly constructed rules can be reached from the root symbol in a derivation. It is also guaranteed that whenever a rule $r$ in $G$ contributes to an analysis, at least one of the rules constructed from $r$ will contribute to the corresponding analysis in $\mathcal{F}_{\mathcal{C}}(G)$. This is ensured since the subclauses in (12) and (13) cover the full space of logical possibilities.
We can overload the function name Cat with a function applying to the set of analyses produced by an LFG grammar $G$ by defining
Cat$(G) = \{\, \langle T, \Phi \rangle \mid \langle T', \Phi \rangle \in G$, $T$ is derived from $T'$ by applying Cat to all category symbols $\}$.
Coverage preservation of the $\mathcal{F}_{\mathcal{C}}$ construction holds also for the projected c-category skeleton (cf. the argumentation in fn. 10):
(15) C-structure level coverage preservation
For an LFG grammar $G$: Cat$(\mathcal{F}_{\mathcal{C}}(G)) = G$
Each category in $\mathcal{F}_{\mathcal{C}}(G)$ encodes the number of local violations for all constraints. Since all constraints are locally evaluable by assumption, every constraint violation incurred by a candidate analysis has to be incurred local to some subtree. Hence the total number of constraint violations incurred by a candidate can be computed by simply summing over all category-encoded local violation profiles:
(16) Total number of constraint violations
Let Nodes$(T)$ be the multiset of categories occurring in the c-structure tree $T$, then the total number of violations of constraint $C_i$ incurred by an analysis $\langle T, \Phi \rangle \in \mathcal{F}_{\mathcal{C}}(G_{inviol})$ is
$$\#_{C_i}(T) = \sum_{X{:}\langle n^1 \ldots n^i \ldots n^k \rangle \,\in\, \mathrm{Nodes}(T)} n^i$$
Define
$$Total_{\mathcal{C}}(T) = \langle \#_{C_1}(T), \#_{C_2}(T), \ldots, \#_{C_k}(T) \rangle$$
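The summation in (16) can be sketched in Python as a recursive walk over a tree whose nodes carry their local violation profile. The `Node` class and the toy tree are hypothetical; terminals are omitted since they carry no violation counts.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    category: str
    counts: Tuple[int, ...]                 # local violations, one per constraint
    children: List["Node"] = field(default_factory=list)


def total(tree: Node) -> Tuple[int, ...]:
    """Sum the local violation profiles of all nodes in the tree, per constraint."""
    sums = list(tree.counts)
    for child in tree.children:
        for i, n in enumerate(total(child)):
            sums[i] += n
    return tuple(sums)


toy = Node("ROOT", (0, 0), [
    Node("FP", (1, 0), [
        Node("NP", (0, 0)),
        Node("F'", (0, 2)),
    ]),
])
print(total(toy))
```

Since the profile of a tree depends only on the multiset of its node categories, two trees that use the same categories the same number of times are necessarily equally harmonic.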
7 Applying $KW$ on $\mathcal{F}_{\mathcal{C}}(G_{inviol})$
Since $\mathcal{F}_{\mathcal{C}}(G_{inviol})$ is a standard LFG grammar, we can apply the $KW$ construction to it to get a cfg for a given f-structure $\Phi_{in}$. The category symbols then have the form X:$\langle n^1, \ldots, n^k \rangle$:$f$:$\Gamma$, with $f$ and $\Gamma$ arising from the $KW$ construction. We can overload the projection function Cat again such that Cat(X:$\vec{n}$:$f$:$\Gamma$) $=$ X for all augmented category symbols of the new format; likewise Cat$(G)$ for $G$ a cfg.
Since the $\mathcal{F}_{\mathcal{C}}$ construction (strongly) preserves the language generated, coverage preservation holds also after the application of $KW$ to $\mathcal{F}_{\mathcal{C}}(G_{inviol})$ and $G_{inviol}$, respectively:
(17) Cat$(KW(\mathcal{F}_{\mathcal{C}}(G_{inviol}), \Phi_{in}))$ $=$ Cat$(KW(G_{inviol}, \Phi_{in}))$
But since the symbols in $\mathcal{F}_{\mathcal{C}}(G_{inviol})$ reflect local constraint violations, Cat$(KW(\mathcal{F}_{\mathcal{C}}(G_{inviol}), \Phi_{in}))$ has the property that all instances of recursion in the resulting cfg create candidates that are at most as harmonic as their non-recursive counterparts. Assuming a projection function CatCount(X:$\vec{n}$:$f$:$\Gamma$) $=$ X:$\vec{n}$, we can state more formally:
(18) If $T_1$ and $T_2$ are CatCount projections of trees produced by the cfg $KW(\mathcal{F}_{\mathcal{C}}(G_{inviol}), \Phi_{in})$, using exactly the same rules, and $T_2$ contains a superset of the nodes that $T_1$ contains, then $n^i_1 \le n^i_2$, for all $n^i_1, n^i_2$ ($i = 1 .. k$) from $\langle n^1_1 \ldots n^i_1 \ldots n^k_1 \rangle = Total_{\mathcal{C}}(T_1)$, and $\langle n^1_2 \ldots n^i_2 \ldots n^k_2 \rangle = Total_{\mathcal{C}}(T_2)$.
This fact follows from the definition of Total (16): the violation counts in the additional nodes in $T_2$ will add to the total of constraint violations (and if none of the additional nodes contains any local constraint violation at all, the total will be the same as in $T_1$). Intuitively, the effect of the augmentation of the category format is that certain recursions in the pure $KW$ construction (which one may think of as a loop) are unfolded, leading to a longer loop. The new loop is sufficiently large to make all relevant distinctions.
This result can be directly exploited in processing:
if all non-recursive analyses are generated (of which
there are only finitely many) it is guaranteed that a
subset of the optimal candidates is among them. If
the grammar does not contain any violation-free re-
cursion, we even know that we have generated all
optimal candidates.
(19) A recursion with the derivation path $A \Rightarrow \ldots \Rightarrow A$ is called violation-free iff all categories dominated by the upper occurrence of $A$, but not dominated by the lower occurrence of $A$, have the form $N{:}\langle n^1, n^2 \ldots n^k \rangle$ with $n^i = 0$, $i = 1 .. k$
Note that if there is an applicable violation-free recursion, the set of optimal candidates is infinite; so if the constraint set is set up properly in a linguistic analysis, one would assume that violation-free recursion should not arise. (Kuhn, 2000) excludes the application of such recursions by a similar condition as offline parsability (which excludes vacuous recursions over a string in parsing), but with the $KW$ construction, this condition is not necessary for decidability of the generation-based optimization task. The cfg produced by $KW$ can be transformed further to only generate the optimal candidates according to the constraint ranking $\gg_{\mathcal{L}}$ of the OT system $\mathcal{O} = \langle G_{inviol}, \langle \mathcal{C}, \gg_{\mathcal{L}} \rangle \rangle$, eliminating all but the violation-free recursions in the grammar:
(20) Creating a cfg that produces all optimal candidates
a. Define $\mathcal{T}^{norec}_{in} = \{\, T \in KW(\mathcal{F}_{\mathcal{C}}(G_{inviol}), \Phi_{in}) \mid T$ contains no recursion $\}$.
$\mathcal{T}^{norec}_{in}$ is finite and can be easily computed, by keeping track of the rules already used in an analysis.
b. Redefine $Eval_{\langle \mathcal{C}, \gg_{\mathcal{L}} \rangle}$ to apply on a set of context-free analyses with augmented category symbols with counts of local constraint violations:
$Eval_{\langle \mathcal{C}, \gg_{\mathcal{L}} \rangle}(\mathcal{T}) = \{\, T \in \mathcal{T} \mid T$ is maximally harmonic in $\mathcal{T}$, under ranking $\gg_{\mathcal{L}} \}$
Using the function Total defined in (16), this function is straightforward to compute for finite sets, i.e., in particular $Eval_{\langle \mathcal{C}, \gg_{\mathcal{L}} \rangle}(\mathcal{T}^{norec}_{in})$.
c. Augment the category format further by one index component.11 Introduce index $h = 0$ for all categories in $KW(\mathcal{F}_{\mathcal{C}}(G_{inviol}), \Phi_{in})$ of the form X:$\langle n^1, \ldots n^k \rangle$:$f$:$\Gamma$, where $n^i = 0$ for $i = 1 .. k$. Introduce a new unique index $h \ge 1$ for each node of the form X:$\langle n^1, \ldots n^k \rangle$:$f$:$\Gamma$, where $n^i \ne 0$ for some $n^i$ ($1 \le i \le k$), occurring in the analyses $Eval_{\langle \mathcal{C}, \gg_{\mathcal{L}} \rangle}(\mathcal{T}^{norec}_{in})$ (i.e., different occurrences of the same category are distinguished).
d. Construct the cfg
$G^{opt}_{in} = \langle N^{opt}_{in}, T^{opt}_{in}, S^{opt}_{in}, R^{opt}_{in} \rangle$,
where $N^{opt}_{in}, T^{opt}_{in}$ are the indexed symbols of step c.; $S^{opt}_{in}$ is a new start symbol; the rules $R^{opt}_{in}$ are (i) those rules from $KW(\mathcal{F}_{\mathcal{C}}(G_{inviol}), \Phi_{in})$ which were used in the analyses in $Eval_{\langle \mathcal{C}, \gg_{\mathcal{L}} \rangle}(\mathcal{T}^{norec}_{in})$ (with the original symbols replaced by the indexed symbols), (ii) the rules in $KW(\mathcal{F}_{\mathcal{C}}(G_{inviol}), \Phi_{in})$ in which the mother category and all daughter categories are of the form X:$\langle n^1, \ldots n^k \rangle$:$f$:$\Gamma$, $n^i = 0$ for $i = 1 .. k$ (with the new index 0 added), and (iii) one rule $S^{opt}_{in} \to S_j{:}h$ for each of the indexed versions $S_j{:}h$ of the start symbols of $KW(\mathcal{F}_{\mathcal{C}}(G_{inviol}), \Phi_{in})$.
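Steps (20a) and (20b) can be sketched in Python on a toy grammar. The sketch assumes that the cfg is given as a dictionary from nonterminals to lists of right-hand sides, that a nonterminal is a pair of a category name and a tuple of local violation counts, and that the counts are ordered by constraint rank, so that tuple comparison coincides with the OT comparison. Recursion is pruned by tracking the categories on the current path rather than the rules used, which is a simplification of the bookkeeping mentioned in (20a). The grammar itself is hypothetical.

```python
from itertools import product


def derivations(symbol, rules, path=()):
    """Yield all derivation trees in which no nonterminal dominates itself."""
    if symbol not in rules:            # terminal symbol
        yield symbol
        return
    if symbol in path:                 # would re-enter a recursion: prune
        return
    for rhs in rules[symbol]:
        options = [list(derivations(s, rules, path + (symbol,))) for s in rhs]
        for children in product(*options):
            yield (symbol, list(children))


def total(tree, k):
    """Sum the local violation counts over all nonterminal nodes of a tree."""
    if isinstance(tree, str):
        return (0,) * k
    sums = list(tree[0][1])
    for child in tree[1]:
        for i, n in enumerate(total(child, k)):
            sums[i] += n
    return tuple(sums)


def eval_optimal(trees, k):
    """Keep the trees with the lexicographically smallest total profile."""
    trees = list(trees)
    best = min(total(t, k) for t in trees)
    return [t for t in trees if total(t, k) == best]


def leaves(tree):
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1] for w in leaves(child)]


S, FP, VP1, VP0 = ("S", (0, 0)), ("FP", (0, 1)), ("VP", (1, 0)), ("VP", (0, 0))
rules = {
    S: [[FP], [VP1]],
    FP: [["that", FP], ["that", VP0]],  # first option is a recursion with a violation
    VP1: [["seen"]],
    VP0: [["seen"]],
}
non_recursive = list(derivations(S, rules))
winners = eval_optimal(non_recursive, k=2)
print([leaves(t) for t in winners])
```

In this toy grammar the FP recursion adds one violation of the lower-ranked constraint on every pass, so ignoring it loses no optimal candidate, in line with (18).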
With the index introduced in step (20c), the origi-
nal recursion in the cfg is eliminated in all but the
violation-free cases. The grammar Cata69 a15a20a19a22a21a23
a62a64
a71 pro-
duces (the c-structure of) the set of optimal candi-
dates for the input a5 a50a43a52 :12
(21) Cata24 a81a14a13a16a15a4 a129a130 a25 a13
a107
a109
a115
a31a110a109a63a78a56a113
a62a64
a32 a10 Evala120a122a121a55a123a124a19a125a103a126
a24 Gen
a128 a129a130a68a131a58a129a132a58a133
a24
a113
a62a64
a25a80a25a88a134 ,
i.e., the set of c-structures for the optimal candidates for
input f-structure a113
a62a64 according to the OT system
a94 a13
a31a82a81
a62a64a70a65a82a62a83a66a68a67
a78 a31a12a95a78a119a96 a97 a32a80a32 .
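The evaluation step in (21) relies on the harmony comparison defined in (1). The following is a minimal sketch of that comparison over a finite candidate set, not the construction itself. The names `Profile`, `more_harmonic`, and `eval_optimal`, and the toy candidates `T1` to `T4`, are hypothetical. Each candidate is assumed to come with a tuple of violation counts, ordered from the highest-ranked to the lowest-ranked constraint.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: violation counts per constraint,
# highest-ranked constraint first.
Profile = Tuple[int, ...]


def more_harmonic(p: Profile, q: Profile) -> bool:
    """True iff p incurs fewer violations than q of the highest-ranked
    constraint on which the two profiles differ (definition (1))."""
    for a, b in zip(p, q):
        if a != b:
            return a < b
    return False  # identical profiles: neither is more harmonic


def eval_optimal(candidates: Dict[str, Profile]) -> List[str]:
    """Return the names of all candidates that no other candidate beats."""
    return sorted(
        name
        for name, profile in candidates.items()
        if not any(more_harmonic(other, profile) for other in candidates.values())
    )


# Toy candidate set (assumed data, for illustration only).
candidates = {
    "T1": (0, 2, 1),
    "T2": (0, 1, 3),
    "T3": (1, 0, 0),
    "T4": (0, 1, 3),
}
optimal = eval_optimal(candidates)
```

Here `T2` and `T4` tie as optimal: `T3` loses on the highest-ranked constraint, and `T1` loses on the second one. The sketch only applies once the candidate set is finite, which is what the index construction in (20c) and (20d) is designed to achieve.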
11The projection function Cat is again overloaded to also
remove the index on the categories.
12Like K&W00, I make the assumption that the input
f-structure in generation is fully specified (i.e., all the candidates
have the form $\langle T, \Phi_{in} \rangle$), but the result can be extended to allow
for the addition of a finite amount of f-structure information in
generation. In that case, the specified routine is computed separately
for each possible f-structural extension and the results are
compared in the end.
8 Proof
To prove fact (21) we will show that the c-structure
of an arbitrary candidate analysis generated from
a5
a50a53a52 with a15 a50a53a52a55a54a56a50a58a57a60a59 is contained in Cata69 a15 a19a22a21
a23
a62a64
a71 iff all
other candidates are equally or less harmonic.
Take an arbitrary candidate c-structure a1 gen-
erated from a5 a50a43a52 with a15 a50a53a52a55a54a56a50a58a57a60a59 such that a1 a0
Cata69 a15 a19a22a21a23
a62a64
a71 . We have to show that all other candi-
dates a1 a0 generated from a5 a50a53a52 are equally or less har-
monic than a1 . Assume there were a a1 a0 that is more
harmonic than a1 . Then there must be some con-
straint a18 a51 a0 a1 , such that a1 a0 violates a18 a51 fewer times
than a1 does, and a18 a51 is ranked higher than any other
constraint in which a1 and a1 a0 differ. Constraints have
to be incurred within some local subtree; so a1 must
contain a local violation configuration that a1 a0 does
not contain, and by the construction (12)/(13) the
a0
a101 -augmented analysis of
a1 - call it a0
a101 a69
a1
a71 - must
make use of some violation-marked rule not used in
a0
a101 a69
a1 a0
a71 . Now there are three possibilities:
(i) Both a0 a101 a69 a1 a71 and a0 a101 a69 a1 a0 a71 are free of recursion.
Then the fact that a0 a101 a69 a1 a0 a71 avoids the highest-ranking
constraint violation excludes a1 from Cata69 a15a20a19a22a21
a23
a62a64
a71 (by
construction step (20b)). This gives us a contradic-
tion with our assumption.
(ii) a0 a101 a69 a1 a71 contains a recursion and a0 a101 a69 a1 a0 a71 is free
of recursion. If the recursion in a0 a101 a69 a1 a71 is violation-
free, then there is an equally harmonic recursion-
free candidate a1 a0 a0 . But this a1 a0 a0 is also less har-
monic than a0 a101 a69 a1 a0 a71 , such that it would have been ex-
cluded from Cata69 a15 a19a22a21a23
a62a64
a71 too. This again means that
a0
a101 a69
a1
a71 would also be excluded (for lack of the rel-
evant rules in the non-recursive part). On the other
hand, if it were the recursion in a0 a101 a69 a1 a71 that incurred
the additional violation (as compared to a0 a101 a69
a1 a0
a71 ),
then there would be a more harmonic recursion-free
candidate a1 a0 a0 a0 . However, this a1 a0 a0 a0 would exclude
the presence of a0 a101 a69 a1 a71 in a15 a19a22a21a23
a62a64
by construction step
(20c,d) (only violation-free recursion is possible).
So we get another contradiction to the assumption
that a1 a0 Cata69 a15 a19a22a21a23
a62a64
a71 .
(iii) a0 a101 a69
a1 a0
a71 contains a recursion. If this recursion
is violation-free, we can pick the equally harmonic
candidate avoiding the recursion to be our a0 a101 a69 a1 a0 a71 ,
and we are back to case (i) and (ii). Likewise, if the
recursion in a0 a101 a69 a1 a0 a71 does incur some violation, not
using the recursion leads to an even more harmonic
candidate, for which again cases (i) and (ii) will ap-
ply.
All possible cases lead to a contradiction with the
assumptions, so no candidate is more harmonic than
our a1 a0 Cata69 a15 a19a22a21a23
a62a64
a71 .
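The role of recursion in cases (ii) and (iii) can be spelled out in a short calculation. The notation is introduced here for illustration only and is not part of the construction: $v(T)$ is the vector of violation counts of an analysis $T$, ordered from the highest-ranked to the lowest-ranked of the $k$ constraints; $T^{-}$ is $T$ with one recursion excised; and $r$ is the vector of violations incurred inside the excised segment. The sketch assumes that violations are counted per local subtree, so that the counts of the excised segment and of the remaining tree add up.

```latex
\begin{align*}
v(T) &= v(T^{-}) + r, \qquad r \in \mathbb{N}^{k} \\
r = 0 &\;\Longrightarrow\; v(T) = v(T^{-})
  \quad \text{($T$ and $T^{-}$ are equally harmonic)} \\
r \neq 0 &\;\Longrightarrow\; \text{for } j = \min\{\, i \mid r_i > 0 \,\}:\quad
  v(T^{-})_i = v(T)_i \ (i < j), \qquad v(T^{-})_j < v(T)_j \\
&\;\Longrightarrow\; T^{-} \text{ is more harmonic than } T \text{ by (1)}
\end{align*}
```

Under this assumption an optimal candidate can contain only violation-free recursion, which is the property that the index of step (20c) exploits.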
We still have to prove that if the c-structure a1 of a
candidate analysis generated from a5 a50a43a52 with a15a51a50a43a52a55a54a70a50a53a57a60a59
is equally or more harmonic than all other candi-
dates, then it is contained in Cata69 a15 a19a22a21a23
a62a64
a71 . We can
construct an augmented version a1 a0 of a1 , such that
Cata69
a1 a0
a71 a37
a1 and then show that there is a homo-
morphism mapping a1 a0 to some analysis a1 a0 a0 a0 a15 a19a22a21a23
a62a64
with Cata69 a1 a0 a0 a71 a37 a1 .
We can use the constraint marking construction a0 a101
and the a1 a2 construction to construct the tree a1 a0
with augmented category symbols of the analysis
a1 . The result of K&W00 plus (17) guarantee that
Cata69 a1 a0 a71 a37 a1 . Now, there has to be a homo-
morphism from the categories in a1 a0 to the cate-
gories of some analysis in a15 a19a22a21a23
a62a64
. a15 a19a22a21a23
a62a64
is also based
on a1a3a2 a69 a15 a50a53a52a55a54a56a50a58a57a60a59 a3a6a5 a50a43a52 a71 (with an additional index a1
on each category and some categories and rules of
a1a3a2
a69 a15 a50a43a52a55a54a70a50a53a57a60a59
a3a6a5
a50a53a52 a71 having no counterpart in a15 a19a22a21
a23
a62a64
).
Since we know that a1 is equally or more harmonic
than any other candidate generated from a5 a50a43a52 , we
know that the augmented tree a1 a0 either contains no
recursion or only violation-free recursion. If it does
contain such violation-free recursions we map all
categories a2 on the recursion paths to the indexed
form a2 :a48 , and furthermore consider the variant of
a1 a0 avoiding the recursion(s). For our (non-recursive)
tree, there is guaranteed to be a counterpart in the
finite set of non-recursive trees in a15a20a19a22a21a23
a62a64
with all cat-
egories pairwise identical apart from the index a1 in
a15 a19a22a21
a23
a62a64
. We pick this tree and map each of the cate-
gories in a1 a0 to the a1 -indexed counterpart. The exis-
tence of this homomorphism guarantees that an anal-
ysis a1 a0 a0 a0 a15 a19a22a21a23
a62a64
exists with Cata69 a1 a0 a0 a71 a37 Cata69 a1 a0 a71 a37
a1 . QED
9 Conclusion
We showed that for OT-LFG systems in which all
constraints can be expressed relative to a local
subtree in c-structure, the generation task from
(non-cyclic13) f-structures is solvable. The infinity
of the conceptually underlying candidate set does
not preclude a computational approach. The
construction proposed here is meant to bring out
the principled computability rather than to suggest
a particular algorithm for implementation; on this
basis, however, an implementation can readily be
devised.
The locality condition on constraint checking
seems unproblematic for linguistically relevant
constraints, since a GPSG-style slash mechanism
permits reference to (finitely many) nonlocal
configurations from any given category (cf. fn. 5).14
Decidability of generation-based optimization
(from a given input f-structure) alone does not
imply that the recognition and parsing tasks for an
OT grammar system defined as in sec. 3 are
decidable: for these tasks, a string is given and it
has to be shown that the string is optimal for some
underlying input f-structure (cf. Johnson, 1998).
However, a construction similar to the one
presented here can be devised for parsing-based
optimization (even for an LFG-style grammar that
does not obey the offline parsability condition). So,
if the language generated by an OT system is
defined based on (strong) bidirectional optimality
(Kuhn, 2001, ch. 5), decidability of both the general
parsing and generation problems follows.15 For the
unidirectionally defined OT language (as in sec. 3),
decidability of parsing can be guaranteed under the
assumption of a contextual recoverability condition
in parsing (Kuhn, in preparation).
13The non-cyclicity condition is inherited from K&W00; in
linguistically motivated applications of the LFG formalism,
crucial use of cyclicity in underlying semantic feature graphs has
never been made.
14A hypothetical constraint that is excluded would be a
parallelism constraint comparing two subtree structures of arbitrary
depth. Such a constraint seems unnatural in a model of
grammaticality. Parallelism of conjuncts does play a role in models
of human parsing preferences; however, here it seems
reasonable to assume an upper bound on the depth of parallel
structures to be compared (due to memory restrictions).
15Parsing: for a given string, parsing-based optimization
is used to determine the optimal underlying f-structure; then
generation-based optimization is used to check whether the
original string comes out optimal in this direction too.
Generation is symmetrical, starting with an f-structure.
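The two-step check described in footnote 15 can be sketched over a finite table of candidates. This is an illustration under strong assumptions, not the construction referred to in the text: the candidate sets are replaced by a small hypothetical table mapping (string, f-structure) pairs to violation profiles, the same profiles are used in both directions, and all names (`Candidates`, `parse_opt`, `gen_opt`, `bidirectionally_optimal`) are invented for this example.

```python
from typing import Dict, List, Set, Tuple

Profile = Tuple[int, ...]        # violation counts, highest-ranked first
Pair = Tuple[str, str]           # (string, f-structure label)
Candidates = Dict[Pair, Profile]


def optimal(pairs: Candidates) -> Set[Pair]:
    """Most harmonic pairs; tuple comparison in Python is lexicographic,
    which matches the ranked comparison of definition (1)."""
    if not pairs:
        return set()
    best = min(pairs.values())
    return {pair for pair, profile in pairs.items() if profile == best}


def parse_opt(cands: Candidates, string: str) -> Set[Pair]:
    """Parsing-based optimization: compete over analyses of one string."""
    return optimal({k: p for k, p in cands.items() if k[0] == string})


def gen_opt(cands: Candidates, fstr: str) -> Set[Pair]:
    """Generation-based optimization: compete over realizations of one f-structure."""
    return optimal({k: p for k, p in cands.items() if k[1] == fstr})


def bidirectionally_optimal(cands: Candidates, string: str) -> List[str]:
    """F-structures for which the string is optimal in both directions."""
    return sorted(
        fstr
        for (_, fstr) in parse_opt(cands, string)
        if (string, fstr) in gen_opt(cands, fstr)
    )


# Toy table (assumed data, for illustration only).
cands: Candidates = {
    ("a", "F1"): (0, 1), ("b", "F1"): (0, 0),
    ("a", "F2"): (0, 0), ("b", "F2"): (1, 0),
    ("c", "F2"): (0, 1),
}
result_a = bidirectionally_optimal(cands, "a")
result_c = bidirectionally_optimal(cands, "c")
```

In this toy table the string `a` is accepted with f-structure `F2`, while `c` is rejected: its only analysis has `F2` as f-structure, but generation from `F2` prefers `a`.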

References

Joan Bresnan, Shipra Dingare, and Christopher Manning.
2001. Soft constraints mirror hard constraints: Voice
and person in English and Lummi. In Proceedings of
the LFG 2001 Conference. CSLI Publications.

Joan Bresnan. 2000. Optimal syntax. In Joost Dekkers,
Frank van der Leeuw, and Jeroen van de Weijer,
editors, Optimality Theory: Phonology, Syntax, and
Acquisition. Oxford University Press.

Jason Eisner. 1997. Efficient generation in primitive
optimality theory. In Proceedings of the ACL 1997,
Madrid.

Robert Frank and Giorgio Satta. 1998. Optimality theory
and the generative complexity of constraint violation.
Computational Linguistics, 24(2):307-316.

Dale Gerdemann and Gertjan van Noord. 2000. Approximation and exactness in finite state Optimality Theory. In SIGPHON 2000, Finite State Phonology. 5th
Workshop of the ACL Special Interest Group in Comp.
Phonology, Luxembourg.

Mark Johnson, Stuart Geman, Stephen Canon, Zhiyi Chi,
and Stefan Riezler. 1999. Estimators for stochastic
unification-based grammars. In Proceedings of the
37th Annual Meeting of the Association for Computational Linguistics (ACL '99), College Park, MD, pages 535-541.

Mark Johnson. 1998. Optimality-theoretic Lexical Functional Grammar. In Proceedings of the 11th Annual
CUNY Conference on Human Sentence Processing,
Rutgers University.

Ronald M. Kaplan and Jürgen Wedekind. 2000.
LFG generation produces context-free languages.
In Proceedings of COLING-2000, pages 297-302,
Saarbrücken.

Lauri Karttunen. 1998. The proper treatment of optimality in computational phonology. In Proceedings of the
Internat. Workshop on Finite-State Methods in Natural
Language Processing, FSMNLP '98, pages 1-12.

Jonas Kuhn. 2000. Processing Optimality-theoretic syntax by interleaved chart parsing and generation. In
Proceedings of ACL 2000, pages 360-367, Hong Kong.

Jonas Kuhn. 2001. Formal and Computational
Aspects of Optimality-theoretic Syntax. Ph.D. thesis,
Institut für maschinelle Sprachverarbeitung, Universität
Stuttgart.

Jonas Kuhn. in preparation. Decidability of generation
and parsing for OT syntax. Ms., Stanford University.

Stefan Riezler, Detlef Prescher, Jonas Kuhn, and Mark
Johnson. 2000. Lexicalized stochastic modeling of
constraint-based grammars using log-linear measures
and EM training. In Proceedings of the 38th Annual
Meeting of the Association for Computational Linguistics (ACL '00), Hong Kong, pages 480-487.

Stefan Riezler, Dick Crouch, Ron Kaplan, Tracy King,
John Maxwell, and Mark Johnson. 2002. Parsing the
Wall Street Journal using a Lexical-Functional Grammar and discriminative estimation techniques. This
conference.
