<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1001">
  <Title>Disambiguation of Finite-State Transducers</Title>
  <Section position="5" start_page="0" end_page="8" type="metho">
    <SectionTitle>
2 Q K.
</SectionTitle>
    <Paragraph position="0"> Transitions are of the form $t = (p(t), i(t), o(t), n(t), w(t))$,</Paragraph>
    <Paragraph position="2"> where $p(t)$ denotes the transition's origin state, $i(t)$ its input label, $o(t)$ its output label, $n(t)$ its destination state, and $w(t) \in K$ its weight. The tropical semiring, defined as $(\mathbb{R}_+ \cup \{\infty\}, \min, +, \infty, 0)$, is commonly used in speech recognition, but our results apply to general semirings as well. A path $\pi = t_1 \cdots t_n$ of $T$ is an ele-</Paragraph>
    <Paragraph position="4"> We can easily extend the functions p and n to those paths:</Paragraph>
    <Paragraph position="6"> We denote by $P(r, s)$ the set of paths whose origin is state $r$ and whose destination is state $s$. We can also extend the function $P$ to the sets $R \subseteq Q$ and</Paragraph>
    <Paragraph position="8"> We can extend the functions i and o to the paths by taking the concatenations of the input and output symbols:</Paragraph>
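    <Paragraph><![CDATA[The notation above can be made concrete with a minimal Python sketch. It assumes the tropical semiring, so the weight of a path is the sum of its transition weights (the standard convention, not restated in this section). The class name Transition, the helper names, the EPS encoding of epsilon and the example path are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

EPS = ""  # hypothetical encoding of the epsilon (null) label


@dataclass(frozen=True)
class Transition:
    origin: int    # p(t)
    inp: str       # i(t)
    out: str       # o(t)
    dest: int      # n(t)
    weight: float  # w(t), taken here in the tropical semiring


def is_path(path: Sequence[Transition]) -> bool:
    """A path is non-empty and each transition starts where the previous one ends."""
    return len(path) > 0 and all(a.dest == b.origin for a, b in zip(path, path[1:]))


def path_origin(path: Sequence[Transition]) -> int:
    """p extended to paths: origin state of the first transition."""
    return path[0].origin


def path_destination(path: Sequence[Transition]) -> int:
    """n extended to paths: destination state of the last transition."""
    return path[-1].dest


def path_input(path: Sequence[Transition]) -> Tuple[str, ...]:
    """i extended to paths: concatenation of the input labels, epsilon dropped."""
    return tuple(t.inp for t in path if t.inp != EPS)


def path_output(path: Sequence[Transition]) -> Tuple[str, ...]:
    """o extended to paths: concatenation of the output labels, epsilon dropped."""
    return tuple(t.out for t in path if t.out != EPS)


def path_weight(path: Sequence[Transition]) -> float:
    """Tropical path weight: the semiring product is ordinary addition."""
    return sum(t.weight for t in path)


# Illustrative path reading "s e z" and writing "c e s" (weights are made up).
example_path = [
    Transition(0, "s", "c", 1, 0.5),
    Transition(1, "e", "e", 2, 0.25),
    Transition(2, "z", "s", 3, 0.0),
]
```

Representing label sequences as tuples rather than joined strings keeps multi-character labels distinct from one another.]]></Paragraph>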
    <Paragraph position="10"> ducer, (Berstel, 1979)) A transducer T is said to be unambiguous if for each w 2 1, there exists at most one path in T such</Paragraph>
    <Paragraph position="12"> Remark 1: To remove the ambiguity between two paths $\pi_1$ and $\pi_2$, it suffices to modify $i(\pi_1)$ by changing the first input label of the path $\pi_1$. This is done by introducing an auxiliary symbol such that $i(\pi_1) \neq i(\pi_2)$.</Paragraph>
    <Paragraph position="13"> Figure 1a shows an ambiguous transducer. It is ambiguous because, for the input string "s e [z]", there are two paths, producing the output strings {ces, ses}. In this figure, "eps" stands for epsilon, the null symbol. To disambiguate a transducer, we first group the ambiguous paths; we then remove the ambiguity in each group by adding auxiliary labels, as shown in Figure 1b. Unfortunately, it is infeasible to enumerate all the paths in a cyclic transducer. However, (Smaili, 2001) shows that cyclic transducers of the type studied in this work can be disambiguated by transforming them into a corresponding acyclic subtransducer $T'$ such that $T' \subset T$. This fundamental property is described in detail in Section 2.1. Accordingly, we apply the appropriate transformation to the input transducer.</Paragraph>
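    <Paragraph><![CDATA[The grouping step and Remark 1 can be illustrated with a small Python sketch. It is not the authors' implementation: paths are reduced to hypothetical (input labels, output labels) pairs, every path is assumed to have at least one input label, and the auxiliary symbol is encoded by appending a marker such as "#1" to the first input label, on the assumption that "#" does not occur in the original alphabet.

```python
from collections import defaultdict


def disambiguate_paths(paths, marker="#"):
    """Group paths by input sequence, then make the inputs in each group distinct.

    paths: iterable of (input_labels, output_labels) pairs.
    Returns (new_paths, auxiliary_symbols).
    """
    normalized = [(tuple(inp), tuple(out)) for inp, out in paths]

    # Step 1: group the indices of paths that read the same input sequence.
    groups = defaultdict(list)
    for index, (inp, _out) in enumerate(normalized):
        groups[inp].append(index)

    # Step 2: the first path of each group keeps its input; every other path
    # gets a distinct auxiliary marker attached to its first input label.
    result = list(normalized)
    auxiliary = set()
    for inp, members in groups.items():
        for k, index in enumerate(members[1:], start=1):
            aux = f"{marker}{k}"
            auxiliary.add(aux)
            result[index] = ((inp[0] + aux,) + inp[1:], normalized[index][1])
    return result, auxiliary


# The two ambiguous paths of the "s e [z]" example, plus an unambiguous one.
paths = [
    (("s", "e", "z"), ("c", "e", "s")),
    (("s", "e", "z"), ("s", "e", "s")),
    (("a",), ("b",)),
]
new_paths, aux_symbols = disambiguate_paths(paths)
```

After the call, the second path reads ("s#1", "e", "z") while the other two are unchanged, so no two paths share an input sequence.]]></Paragraph>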
    <Section position="1" start_page="0" end_page="8" type="sub_section">
      <SectionTitle>
2.1 Fundamental Property
</SectionTitle>
      <Paragraph position="0"> We are interested in the transducer</Paragraph>
      <Paragraph position="2"> transition $t$ such that $i(t) \in \Sigma_1$. We denote by $E_0$ and $E_1$ the following sets: $E_0 = \{t \in E : i(t) \in \Sigma_0\}$ and $E_1 = \{t \in E : i(t) \in \Sigma_1\}$. Notice that $E = E_0 \uplus E_1$. We can give a characterization of the ambiguous paths verifying the fundamental property. First, we make the following remark:</Paragraph>
      <Paragraph position="4"> $f_i$ and $g_i$ are ambiguous ($0 \le i \le n$). We will assume that the first transition's path belongs to $E_0$, i.e. f0 = : Recall that if we want to avoid cycles, we just have to remove from $T$ all transitions $t \in E_1$. According to Proposition 1, ambiguity needs to be removed only in paths that use transitions $t \in E_0$, namely the path $\pi_i$ that performs the decomposition given in Remark 2. Disambiguation consists only of introducing auxiliary labels in the ambiguous paths. We denote by $A_{src}$ the set of origin states of transitions belonging to $E_1$ and by $A_{dst}$ the set of destination states of transitions belonging to $E_1$.</Paragraph>
      <Paragraph position="6"> According to Proposition 1 and what precedes, it would be equivalent and simpler to disambiguate an acyclic transducer obtained from T in which we have removed all E1 transitions.</Paragraph>
      <Paragraph position="7"> Therefore, we introduce the operator $\Phi : \{T_{in}\} \to \{T_{out}\}$, which accomplishes this construction.</Paragraph>
      <Paragraph position="9"> The third condition ensures the connectivity of $\Phi(T)$ if $T$ is itself connected.</Paragraph>
      <Paragraph position="10"> It suffices to disambiguate the acyclic transducer $\Phi(T)$, then reinsert the transitions of $E_1$ in $\Phi(T)$. The set of paths in $\Phi(T)$ is then $P(I_1, F_1)$.</Paragraph>
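      <Paragraph><![CDATA[A rough Python sketch of the construction described in this section follows: split E into E0 and E1, drop E1 to break the cycles, and later reinsert it. Transitions are hypothetical 5-tuples (origin, input, output, destination, weight). The exact conditions defining the operator are not recoverable from the text above, so the choice made here, adding the states of A_dst to the initial states and the states of A_src to the final states to keep the result connected, is an assumption, as are all function names.

```python
def split_transitions(E, sigma1):
    """Partition E into E0 (input label outside sigma1) and E1 (input label in sigma1)."""
    E0 = [t for t in E if t[1] not in sigma1]
    E1 = [t for t in E if t[1] in sigma1]
    return E0, E1


def remove_e1(I, F, E, sigma1):
    """Drop the E1 transitions; returns (I1, F1, E0, E1)."""
    E0, E1 = split_transitions(E, sigma1)
    a_src = {t[0] for t in E1}  # origin states of E1 transitions
    a_dst = {t[3] for t in E1}  # destination states of E1 transitions
    I1 = set(I) | a_dst         # assumption: paths may start where an E1 transition ended
    F1 = set(F) | a_src         # assumption: paths may end where an E1 transition started
    return I1, F1, E0, E1


def reinsert_e1(E_acyclic, E1):
    """Put the removed E1 transitions back."""
    return list(E_acyclic) + list(E1)


def is_acyclic(E):
    """Depth-first search with three colours; a grey successor means a cycle."""
    succ = {}
    for origin, _inp, _out, dest, _w in E:
        succ.setdefault(origin, []).append(dest)
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def visit(q):
        color[q] = GREY
        for r in succ.get(q, []):
            c = color.get(r, WHITE)
            if c == GREY:
                return False
            if c == WHITE and not visit(r):
                return False
        color[q] = BLACK
        return True

    return all(color.get(q, WHITE) != WHITE or visit(q) for q in list(succ))


# Illustrative cyclic transducer: the only cycle goes through the "#" transition.
E = [(0, "a", "x", 1, 1.0), (1, "b", "y", 2, 1.0), (2, "#", "", 1, 0.0)]
I1, F1, E0, E1 = remove_e1({0}, {2}, E, {"#"})
```

In this example the original transition set is cyclic, E0 alone is acyclic, and reinserting E1 restores the original set of transitions.]]></Paragraph>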
    </Section>
    <Section position="2" start_page="8" end_page="8" type="sub_section">
      <SectionTitle>
2.2 Algorithm
</SectionTitle>
      <Paragraph position="0"> Input:</Paragraph>
      <Paragraph position="2"> ambiguous transducer verifying the fundamental property.</Paragraph>
      <Paragraph position="3"> Output:</Paragraph>
      <Paragraph position="5"> unambiguous transducer, X1 is the set of auxiliary symbols.</Paragraph>
      <Paragraph position="6">  1. $T_{acyclic} \leftarrow \Phi(T)$.</Paragraph>
      <Paragraph position="7"> 2. $Path \leftarrow$ set of paths of $T_{acyclic}$. 3. Disambiguate the set $Path$ (creating the set $X_1$).</Paragraph>
      <Paragraph position="8"> 4. $T' \leftarrow$ build the unambiguous transducer, which has unambiguous paths.</Paragraph>
      <Paragraph position="9"> 5. $T_1 \leftarrow \Phi^{-1}(T')$ (reinsert in $T'$ the transitions of $T$ which were removed).</Paragraph>
      <Paragraph position="10"> 6. return $T_1$. We now study an important class of transducers verifying the fundamental property. This class is obtained by composing a transducer $D$ verifying the fundamental property with a transducer $R$. Composition of two transducers is an efficient algebraic operation for building more complex transducers. We give a brief definition of composition and the fundamental theorem that ensures the invariance of the fundamental property under composition.</Paragraph>
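      <Paragraph><![CDATA[Step 2 of the algorithm, collecting the set of paths of the acyclic transducer, might be sketched in Python as a depth-first enumeration. This is only an illustration under stated assumptions: transitions are hypothetical 5-tuples (origin, input, output, destination, weight), the function name is invented here, and the input must be acyclic, otherwise the recursion would not terminate.

```python
def enumerate_paths(I, F, E):
    """Return every non-empty path from a state of I to a state of F.

    Only valid for an acyclic transition set E.
    """
    out_edges = {}
    for t in E:
        out_edges.setdefault(t[0], []).append(t)

    paths = []

    def extend(state, prefix):
        # Record the current prefix whenever it ends in a final state.
        if state in F and prefix:
            paths.append(tuple(prefix))
        for t in out_edges.get(state, []):
            prefix.append(t)
            extend(t[3], prefix)
            prefix.pop()

    for q in sorted(I):
        extend(q, [])
    return paths


# Illustrative acyclic transducer with two paths reading "s e z".
E = [
    (0, "s", "c", 1, 0.0),
    (0, "s", "s", 2, 0.0),
    (1, "e", "e", 3, 0.0),
    (2, "e", "e", 3, 0.0),
    (3, "z", "s", 4, 0.0),
]
all_paths = enumerate_paths({0}, {4}, E)
```

The number of paths of an acyclic transducer can grow exponentially with its size, so an explicit enumeration of this kind is practical only for small inputs.]]></Paragraph>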
    </Section>
  </Section>
  <Section position="6" start_page="8" end_page="8" type="metho">
    <SectionTitle>
3 Composition
</SectionTitle>
    <Paragraph position="0"> The transducer $T$ created by the composition of two transducers $R$ and $D$, denoted $T = R \circ D$, maps word $x$ to word $z$ if and only if $R$ maps $x$ to $y$ and $D$ maps $y$ to $z$.</Paragraph>
    <Paragraph position="1"> The weight of the resulting word is the $\otimes$-product of the weights of $y$ and $z$ (Pereira and Riley, 1997).</Paragraph>
    <Paragraph position="2"> Definition 3 (Transitions) Let $t = (q, a, b, q_1, w_1)$ and $e = (r, b, c, r_1, w_2)$ be two transitions. We define the composition of $t$ with $e$ by: $t \circ e = ((q, r), a, c, (q_1, r_1), w_1 \otimes w_2)$. Note that, for the composition to be possible, we must have $o(t) = i(e)$.</Paragraph>
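    <Paragraph><![CDATA[Definition 3 translates almost directly into code. The Python sketch below assumes the tropical semiring, so the semiring product of two weights is their sum, and it ignores epsilon labels, which a full composition algorithm would have to treat specially. The function names are hypothetical.

```python
def compose_transitions(t, e):
    """Compose t = (q, a, b, q1, w1) with e = (r, b, c, r1, w2).

    Returns None when o(t) differs from i(e), i.e. when composition is undefined.
    """
    q, a, b, q1, w1 = t
    r, b2, c, r1, w2 = e
    if b != b2:
        return None
    # Tropical semiring: the product of the two weights is w1 + w2.
    return ((q, r), a, c, (q1, r1), w1 + w2)


def compose_transition_sets(E_R, E_S):
    """All defined pairwise compositions of transitions of E_R with transitions of E_S."""
    composed = []
    for t in E_R:
        for e in E_S:
            te = compose_transitions(t, e)
            if te is not None:
                composed.append(te)
    return composed
```

For instance, composing (0, "a", "b", 1, 1.5) with (5, "b", "c", 6, 2.0) gives ((0, 5), "a", "c", (1, 6), 3.5), while a pair whose middle labels differ yields no transition.]]></Paragraph>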
    <Paragraph position="4"> two transducers. The composition of $R$ with $S$ is a transducer $R \circ S = (Q, i, X, Z, E, F)$ defined by: 1. $i = (i_R, i_S)$, 2. $Q = Q_R \times Q_S$, 3. $F = F_R \times F_S$, 4. $E = \{e_R \circ e_S : e_R \in E_R,\ e_S \in E_S\}$. Let $D = (Q_D, I_D, Y, Z, E_D, F_D)$ be a transducer verifying the fundamental property. We can write $Y = Y_0 \uplus Y_1$</Paragraph>
    <Paragraph position="6"> the following condition: (C) $\forall t \in E_R,\ o(t) \in Y_1 \Rightarrow i(t) \in Y_1$. Then the transducer $T = R \circ D$ verifies the fundamental property.</Paragraph>
    <Paragraph position="7"> Proof: Let $X_1 = \{i(t) : t \in E_R \text{ and } o(t) \in Y_1\} \subseteq Y_1$ and $X_0 = X \setminus X_1$. We will prove that any cycle in $T$ contains at least one transition $t$ such that $i(t) \in X_1$. Let $\pi$ be a cycle in $T$. Then there exist two cycles $\pi_R$ and $\pi_D$, in $R$ and $D$ respectively, such that $\pi = \pi_R \circ \pi_D$. The paths $\pi_R$ and $\pi_D$ have the following form:</Paragraph>
    <Paragraph position="9"> There is an index $k$ such that $i(g_k) \in Y_1$, since $D$ verifies the fundamental property. We also necessarily have</Paragraph>
    <Paragraph position="11"> tion (C) of Theorem 1, we deduce that $i(f_k) \in Y_1$. Knowing that $f_k \in E_R$, we deduce that $i(f_k) \in X_1$, which implies</Paragraph>
    <Paragraph position="13"/>
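    <Paragraph><![CDATA[Condition (C) and the set X1 used in the proof are easy to check mechanically. The following Python sketch assumes transitions are hypothetical 5-tuples (origin, input, output, destination, weight); the function names and the example data are illustrative only.

```python
def satisfies_condition_c(E_R, Y1):
    """Condition (C): every transition of R whose output is in Y1 also has its input in Y1."""
    return all(t[1] in Y1 for t in E_R if t[2] in Y1)


def x1_alphabet(E_R, Y1):
    """X1 = set of input labels of the transitions of R whose output label is in Y1."""
    return {t[1] for t in E_R if t[2] in Y1}


# Illustrative data: "#" is the only symbol of Y1.
Y1 = {"#"}
E_good = [(0, "a", "b", 0, 0.0), (0, "#", "#", 0, 0.0)]
E_bad = [(0, "a", "#", 0, 0.0)]
```

Here E_good satisfies (C) and yields X1 = {"#"}, which is contained in Y1 as the proof states, whereas E_bad violates (C) because it rewrites "a", a symbol outside Y1, into "#".]]></Paragraph>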
    <Section position="1" start_page="8" end_page="8" type="sub_section">
      <SectionTitle>
3.1 Consequence
</SectionTitle>
      <Paragraph position="0"> The restriction to the case $X = Y$ allows us to build a large class of transducers verifying the fundamental property. In fact, if two transducers $R = (Q_R, I_R, Y, Y, E_R, F_R)$ and $S = (Q_S, I_S, Y, Y, E_S, F_S)$ verify condition (C) of Theorem 1, then $S \circ R$ verifies condition (C). The associativity of $\circ$ implies $S \circ (R \circ D) = (S \circ R) \circ D$.</Paragraph>
      <Paragraph position="1"> Suppose that we have $m$ transducers $R_i$ ($1 \le i \le m$) verifying condition (C) of Theorem 1 and that we want to reduce the size of the transducer:</Paragraph>
      <Paragraph position="3"> To this end, we proceed as follows: we add the auxiliary symbols to disambiguate the transducer, then we apply determinization, and finally we remove the auxiliary labels. These three operations are denoted by .</Paragraph>
      <Paragraph position="5"> The size of transducer Tm can also be reduced by computing:</Paragraph>
      <Paragraph position="7"> has several disadvantages. The size of $R'_i$ for $1 \le i \le m$ increases considerably, since the auxiliary labels introduced in each transducer have to be taken into account in all the others. This limits the number of transducers that can be composed with $D$.</Paragraph>
    </Section>
  </Section>
</Paper>