<?xml version="1.0" standalone="yes"?> <Paper uid="C00-1038"> <Title>Directional Constraint Evaluation in Optimality Theory*</Title> <Section position="5" start_page="259" end_page="262" type="metho"> <SectionTitle> 3 Formal Results </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="259" end_page="259" type="sub_section"> <SectionTitle> 3.1 Definition of OT </SectionTitle> <Paragraph position="0"> An OT grammar is a pair (Gen, C⃗) where * the candidate generator Gen is a relation that maps each input to a nonempty set of candidate outputs; * the hierarchy C⃗ = (C_1, C_2, ...) is a finite tuple of constraint functions that evaluate outputs.</Paragraph> <Paragraph position="1"> We write C⃗(δ) for the tuple (C_1(δ), C_2(δ), ...). Given a UR, σ, as input, the grammar admits as its SRs all the outputs δ such that C⃗(δ) is lexicographically minimal in {C⃗(δ') : δ' ∈ Gen(σ)}. The values taken by C_i are called its violation levels. Conventionally these are natural numbers, but any ordered set will do.</Paragraph> <Paragraph position="2"> Our directional constraints require the following innovations. Each input σ is a string as usual, but the outputs are not strings. Rather, each candidate δ ∈ Gen(σ) is a tuple of |σ| + 1 strings. We write δ̂ for the concatenation of these strings (the &quot;real&quot; SR). So δ specifies an alignment of δ̂ with σ. The directional constraint C_i maps the tuple δ to a tuple of natural numbers (&quot;offense levels&quot;) also of length |σ| + 1. Its violation levels {C_i(δ) : δ ∈ Gen(σ)} are compared lexicographically.</Paragraph> </Section> <Section position="2" start_page="259" end_page="260" type="sub_section"> <SectionTitle> 3.2 Finite-state assumptions </SectionTitle> <Paragraph position="0"> We now confine our attention to finite-state OT grammars, following (Ellison, 1994; Tesar, 1995; Eisner, 1997a; Frank and Satta, 1998; Karttunen, 1998). 
Gen ⊆ Σ* × Δ* is a regular relation,⁶ and may be implemented as an unweighted FST. Each constraint is implemented⁷ as a possibly nondeterministic, weighted finite-state automaton (WFSA) that accepts Δ* and whose arcs are weighted with natural numbers.</Paragraph> <Paragraph position="1"> An FST, T, is a finite-state automaton in which each arc is labeled with a string pair α : γ. Without loss of generality, we require |α| ≤ 1.</Paragraph> <Paragraph position="2"> This lets us define an aligned transduction that maps strings to tuples: If σ = a_1···a_n, we define T(σ) as the set of (n + 1)-tuples δ = (δ_0, δ_1, ..., δ_n) such that T has a path transducing σ : δ̂ along which δ_0···δ_{i−1} is the complete output before a_i is read from the input.</Paragraph> <Paragraph position="3"> We now describe how to evaluate C(δ) where C is a WFSA. Consider the path in C that accepts δ̂.⁸ In (un)bounded evaluation, C(δ) is the total weight of this path. In left-to-right evaluation, C(δ) is the (n + 1)-tuple giving the respective total weights of the subpaths that consume δ_0, ..., δ_n. In right-to-left evaluation, C(δ) is the reverse of the previous tuple.⁹ ⁶ Ellison required only that Gen(σ) be regular (∀σ). ⁷ Space prevents giving the equivalent characterization as a locally weighted language (Walther, 1999). ⁸ If there are multiple accepting paths (nondeterminism), take the one that gives the least value of C(δ).</Paragraph> </Section> <Section position="3" start_page="260" end_page="260" type="sub_section"> <SectionTitle> 3.3 Expressive power </SectionTitle> <Paragraph position="0"> Thanks to Gen, finite-state OT can trivially implement any regular input-output relation with no constraints at all! 
And §3.4 below shows that whether we allow directional or bounded constraints does not affect this generative power.</Paragraph> <Paragraph position="1"> But in another sense, directional constraints are strictly more expressive than bounded ones.</Paragraph> <Paragraph position="2"> If Gen is fixed, then any hierarchy of bounded constraints can be simulated by some hierarchy of directional constraints,¹⁰ but not vice versa.</Paragraph> <Paragraph position="3"> Indeed, we show even more strongly that directional constraints cannot always be simulated even by unbounded constraints.¹¹ Define *b as in §2.5. This ranks the set (a|b)^n in lexicographic order, so it makes 2^n distinctions. Let Gen be the regular relation (a:a | b:b)* (c:a (a:a | b:b)* | c:b (a:a | a:b | b:a | b:b)*). We claim that the grammar (Gen, *b) is not equivalent to (Gen, C_1, ..., C_s) for any bounded or unbounded constraints C_1, ..., C_s. There is some k such that for all δ̂ ∈ Δ^n, each C_i(δ) &lt; kn.¹² So candidates δ of length n have at most (kn)^s different violation profiles C⃗(δ). Choose n such that 2^n > (kn)^s. Then the set of 2^n strings (a|b)^n must contain two distinct strings, δ̂ = x_1···x_n and δ̂' = y_1···y_n, with C⃗(δ) = C⃗(δ'). Let i be minimal such that x_i ≠ y_i, and without loss of generality assume x_i = a, y_i = b. 
Put σ = x_1···x_{i−1} c x_{i+1}···x_n. Now δ, δ' ∈ Gen(σ) and δ is lexicographically minimal in Gen(σ).</Paragraph> <Paragraph position="4"> So the grammar (Gen, *b) maps σ to δ only, whereas (Gen, C⃗) cannot distinguish between δ and δ', so it maps σ to neither or both.</Paragraph> </Section> <Section position="4" start_page="260" end_page="262" type="sub_section"> <SectionTitle> 3.4 Grammar compilation: OT = FST </SectionTitle> <Paragraph position="0"> It is trivial to translate an arbitrary FST grammar into OT: let Gen be the FST, and C⃗ = ().</Paragraph> <Paragraph position="1"> The rest of this section shows, conversely, how to compile a finite-state OT grammar (Gen, C⃗) into an FST, provided that the grammar uses only bounded and/or directional constraints.</Paragraph> <Paragraph position="2"> ¹⁰ How? By using states to count, a bounded constraint's WFSA can be transformed so that all the weight of each path falls on its final arc. This defines the same ... Let T_0 = Gen. For i > 0, we will construct an FST T_i that implements the partial grammar (Gen, C_1, C_2, ..., C_i). We construct T_i from T_{i−1} and C_i only: T_i(x) contains the forms y ∈ T_{i−1}(x) for which C_i(y) is minimal.</Paragraph> <Paragraph position="3"> If C_i is k-bounded, we use the construction of (Frank and Satta, 1998; Karttunen, 1998).</Paragraph> <Paragraph position="4"> If C_i is a left-to-right constraint, we compose T_{i−1} with the WFSA that represents C_i, obtaining a weighted finite-state transducer (WFST), T̂_i. This transducer may be regarded as assigning a C_i-violation level (an (|σ| + 1)-tuple) to each σ : δ it accepts. 
We must now prune away the suboptimal candidates: using the DBP algorithm below, we construct a new unweighted FST T_i that transduces σ : δ iff the weighted T̂_i can transduce σ : δ as cheaply as any σ : δ'.</Paragraph> <Paragraph position="5"> If C_i is right-to-left, we do just the same, except DBP is used to construct T_i^R from T̂_i^R. All that remains is to give the construction of T_i from T̂_i, which we call Directional Best Paths (DBP). Recall standard best-paths or shortest-paths algorithms that pare a WFSA down to its paths of minimum total weight (Dijkstra, 1959; Ellison, 1994). Our greedier version does not sum along paths but always immediately takes the lightest &quot;available&quot; arc. Crucially, available arcs are defined relative to the input string, because we must retain one or more optimal output candidates for each input.</Paragraph> <Paragraph position="6"> So availability requires &quot;lookahead&quot;: we must take a heavier arc (b:z below) just when the rest of the input (e.g., abd) cannot otherwise be accepted on any path. [Figure: example weighted transducer illustrating lookahead; the diagram did not survive extraction.]</Paragraph> <Paragraph position="8"> On this example, DBP would simply make state 6 non-final (forcing abc to take the light arc unavailable to abd), but often it must add states! This relativization is what lets us compile a hierarchy of directional constraints, once and for all, into a single FST that can find the optimal output for any of the infinitely many possible inputs. We saw in §2.4 why this is so desirable. By contrast, Ellison's (1994) best-paths construction for unbounded constraints, and previously proposed constructions for directional-style constraints (see §2.5) only find the optimal output for a single input, or at best a finite lexicon.</Paragraph> <Paragraph position="9"> 3.4.3 Dir. Best Paths: A special case §3.2 restricted our FSTs such that for every arc label α : γ, |α| ≤ 1. 
In this section we construct T_i from T̂_i under the stronger assumption that |α| = 1, i.e., T̂_i is ε-free on the input side.</Paragraph> <Paragraph position="10"> If Q is the stateset of T̂_i, then let the stateset of T_i be {[q; R; S] : R ⊆ S ⊆ Q, q ∈ S − R}. This has size |Q|·3^(|Q|−1). However, most of these states are typically unreachable from the start state. Lazy &quot;on-the-fly&quot; construction techniques (Mohri, 1997) can be used to avoid allocating states or arcs until they are discovered during exploration from the start state.</Paragraph> <Paragraph position="11"> For σ ∈ Σ*, q ∈ Q, define V(σ, q) as the minimum cost (a |σ|-tuple of weights) of any σ-reading path from T̂_i's start state q_0 to q.</Paragraph> <Paragraph position="12"> The start state of T_i is [q_0; ∅; {q_0}]. The intent is that T_i have a path from its start state to [q; R; S] that transduces σ : δ¹³ iff * T̂_i has a q_0 to q, σ : δ path of cost V(σ, q);</Paragraph> <Paragraph position="14"> So as T_i reads σ, it &quot;follows&quot; T̂_i's cheapest σ-reading paths to q, while calculating R, the states to which yet cheaper (but perhaps dead-end) paths exist.</Paragraph> <Paragraph position="15"> Let [q; R; S] be a final state (in T_i) iff q is final and no q' ∈ R is final (in T̂_i). So an accepting path in T̂_i survives into T_i iff there is no lower-cost accepting path in T̂_i for the same input.</Paragraph> <Paragraph position="16"> The arcs from [q; R; S] correspond to arcs from q. For each arc from q to q' labeled a : γ and with weight W, add an unweighted a : γ arc from [q; R; S] to [q'; R'; S'], provided that the latter state exists (i.e., unless q' ∈ R', indicating that there is a cheaper path to q'). Here R' is the set of states that are either reachable from R by a (single) a-reading arc, or reachable from S by an a-reading arc of weight &lt; W. S' is the union of R' and all states reachable from S by an a-reading arc of weight W.</Paragraph> <Paragraph position="17"> 3.4.4 Dir. 
Best Paths: The general case To apply the above construction, we must first transform T̂_i so it is ε-free on the input side. Of course input ε's are crucial if Gen is to be allowed to insert unbounded amounts of surface material (to be pruned back by the constraints).¹⁴ To eliminate ε's while preserving these semantics, we are forced to introduce FST arc labels of the form a : Γ where Γ is actually a regular set of strings, represented as an FSA or regular expression. Following ε-elimination, we can apply the construction of §3.4.3 to get T_i, and finally convert T_i back to a normal transducer by expanding each a : Γ into a subgraph. ¹³ δ is a tuple of |σ| + 1 strings, but δ_0 = ε by ε-freeness.</Paragraph> <Paragraph position="18"> When we eliminate an arc labeled ε : γ, we must push γ and the arc's weight back onto a previous non-ε arc (but no further; contrast (Mohri, 1997)). The resulting machine will implement the same aligned transduction as T̂_i but more transparently: in the notation of §3.2, the arc reading a_i will transduce it directly to δ_i.¹⁵ Concretely, suppose T̂_i can get from state q to q'' via a path of total weight W that begins with a : γ_1 on its first arc followed by ε : γ_2, ε : γ_3, ... on its remaining arcs. We would like to substitute an arc from q to q'' with label a : γ_1γ_2γ_3··· and weight W. But there may be infinitely many such q-q'' paths, of varying weight, so we actually write a : Γ, where Γ describes just those q-q'' paths with minimum W.</Paragraph> <Paragraph position="19"> The exact procedure is as follows. Let G be the possibly disconnected subgraph of T̂_i formed by ε-reading arcs. Run an all-pairs shortest-paths algorithm¹⁶ on G. This finds, for each state pair (q', q'') connected by an ε-reading path, the subgraph G_{q',q''} of G formed by the minimum-weight ε-reading paths from q' to q'', as well as the common weight W_{q',q''} of these paths. 
So for each arc in T̂_i from q to q', with weight W and label a : γ, we now add an arc to T̂_i from q to q'' with weight W + W_{q',q''} and label a : γG_{q',q''}(ε). (G(ε) denotes the regular language to which G transduces ε.) Having done this, we can delete all ε-reading arcs. The modified ε-free T̂_i is equivalent to the original except for eliminating some of the suboptimal subpaths.</Paragraph> <Paragraph position="20"> ¹⁴ As is conventional. Besides epenthetic material, Gen often introduces copious prosodic structure.</Paragraph> <Paragraph position="21"> ¹⁵ That arc is labeled a_i : Γ where δ_i ∈ Γ. But what is a_0? A special symbol E ∉ Σ that we introduce so that δ_0 can be pushed back onto it: Before ε-elimination, we modify T̂_i by giving it a new start state, connected to the old start state with an arc E : ε. After ε-elimination, we apply DBP and replace E with ε in the result T_i.</Paragraph> <Paragraph position="22"> ¹⁶ (Cormen et al., 1990) cites several, including fast algorithms for when edge weights are small integers.</Paragraph> <Paragraph position="23"> Here is a graph fragment before and after ε-elimination:</Paragraph> </Section> </Section> </Paper>