<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1011"> <Title>Kullback-Leibler Distance between Probabilistic Context-Free Grammars and Probabilistic Finite Automata</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Preliminaries </SectionTitle> <Paragraph position="0"> Throughout the paper we use mostly standard formal language notation, as for instance in (Hopcroft and Ullman, 1979; Booth and Thompson, 1973), which we summarize below.</Paragraph> <Paragraph position="1"> A context-free grammar (CFG) is a 4-tuple $G = (\Sigma, N, S, R)$, where $\Sigma$ and $N$ are two finite</Paragraph> <Paragraph position="3"> disjoint sets of terminals and nonterminals, respectively, $S \in N$ is the start symbol and $R$ is a finite set of rules. Each rule has the form $A \to \alpha$, where $A \in N$ and $\alpha \in (\Sigma \cup N)^*$.</Paragraph> <Paragraph position="4"> The 'derives' relation $\Rightarrow$ associated with $G$ is defined on triples consisting of two strings $\alpha, \beta \in (\Sigma \cup N)^*$ and a rule $\pi \in R$. We write $\alpha \stackrel{\pi}{\Rightarrow} \beta$ if and only if $\alpha$ is of the form $uA\delta$ and $\beta$ is of the form $u\gamma\delta$, for some $u \in \Sigma^*$, $\delta \in (\Sigma \cup N)^*$, and $\pi = (A \to \gamma)$. A left-most derivation (for $G$) is a string $d = \pi_1 \cdots \pi_m$, $m \geq 0$, such that $\alpha_0 \stackrel{\pi_1}{\Rightarrow} \alpha_1 \stackrel{\pi_2}{\Rightarrow} \cdots \stackrel{\pi_m}{\Rightarrow} \alpha_m$, for some $\alpha_0, \ldots, \alpha_m \in (\Sigma \cup N)^*$; $d = \varepsilon$ (where $\varepsilon$ denotes the empty string) is also a left-most derivation. In the remainder of this paper, we will let the term 'derivation' refer to 'left-most derivation', unless specified otherwise. If</Paragraph> <Paragraph position="6"> $\alpha_0 \stackrel{\pi_1}{\Rightarrow} \alpha_1 \stackrel{\pi_2}{\Rightarrow} \cdots \stackrel{\pi_m}{\Rightarrow} \alpha_m$, then we say that $d = \pi_1 \cdots \pi_m$ derives $\alpha_m$ from $\alpha_0$ and we write $\alpha_0 \stackrel{d}{\Rightarrow} \alpha_m$; $d = \varepsilon$ derives any $\alpha_0 \in (\Sigma \cup N)^*$ from itself.</Paragraph> <Paragraph position="7"> A (left-most) derivation $d$ such that $S \stackrel{d}{\Rightarrow} w$, $w \in \Sigma^*$, is called a complete derivation. 
If $d$ is a complete derivation, we write $y(d)$ to denote the (unique) string $w \in \Sigma^*$ such that $S \stackrel{d}{\Rightarrow} w$.</Paragraph> <Paragraph position="8"> The language generated by $G$ is the set of all strings $y(d)$ derived by complete derivations, i.e., $L(G) = \{ w \mid S \stackrel{d}{\Rightarrow} w,\ d \in R^*,\ w \in \Sigma^* \}$.</Paragraph> <Paragraph position="9"> It is well known that there is a one-to-one correspondence between complete derivations and parse trees for strings in $L(G)$.</Paragraph> <Paragraph position="10"> A probabilistic CFG (PCFG) is a pair $G_p = (G, p_G)$, where $G$ is a CFG and $p_G$ is a function from $R$ to real numbers in the interval $[0, 1]$.</Paragraph> <Paragraph position="11"> A PCFG is proper if $\sum_{\pi = (A \to \alpha)} p_G(\pi) = 1$ for all $A \in N$. Function $p_G$ can be used to associate probabilities to derivations of the underlying CFG $G$, in the following way. For $d = \pi_1 \cdots \pi_m \in R^*$, $m \geq 0$, we define $p_G(d) = \prod_{i=1}^{m} p_G(\pi_i)$ if</Paragraph> <Paragraph position="13"> $S \stackrel{d}{\Rightarrow} w$ for some $w \in \Sigma^*$, and $p_G(d) = 0$ otherwise. The probability of a string $w \in \Sigma^*$ is defined as $p_G(w) = \sum_{d : y(d) = w} p_G(d)$.</Paragraph> <Paragraph position="14"> A PCFG is consistent if $\sum_{w} p_G(w) = 1$. Consistency implies that the PCFG defines a probability distribution on the set of terminal strings as well as on the set of grammar derivations. If a PCFG is proper, then consistency means that no probability mass is lost in 'infinite' derivations. A finite automaton (FA) is a 5-tuple $M = (\Sigma, Q, q_0, Q_f, T)$, where $\Sigma$ and $Q$ are two finite sets of terminals and states, respectively, $q_0$ is the initial state, $Q_f \subseteq Q$ is the set of final states, and $T$ is a finite set of transitions, each of the form $s \stackrel{a}{\mapsto} t$, where $s, t \in Q$ and $a \in \Sigma$. A probabilistic finite automaton (PFA) is a pair $M_p = (M, p_M)$, where $M$ is an FA and $p_M$ is a function from $T$ to real numbers in the interval $[0, 1]$. For a fixed (P)FA $M$, we define a configuration to be an element of $Q \times \Sigma^*$, and we define the relation $\vdash$ on triples consisting of two configurations and a transition $\tau \in T$ by: $(s, w) \stackrel{\tau}{\vdash} (t, w')$ if and only if $w$ is of the form $aw'$, for some $a \in \Sigma$, and $\tau = (s \stackrel{a}{\mapsto} t)$. 
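A complete computation is a string $c = \tau_1 \cdots \tau_m$, $m \geq 0$, such that $(s_0, w_0) \stackrel{\tau_1}{\vdash} (s_1, w_1) \stackrel{\tau_2}{\vdash} \cdots \stackrel{\tau_m}{\vdash} (s_m, w_m)$, for some configurations $(s_0, w_0), \ldots, (s_m, w_m)$ with $s_0 = q_0$, $s_m \in Q_f$ and $w_m = \varepsilon$; in that case we write $(s_0, w_0) \stackrel{c}{\vdash} (s_m, w_m)$. For $c = \tau_1 \cdots \tau_m \in T^*$, we define $p_M(c) = \prod_{i=1}^{m} p_M(\tau_i)$ if</Paragraph> <Paragraph position="16"> $c$ is a complete computation, and $p_M(c) = 0$ otherwise. A PFA is consistent if $\sum_{c} p_M(c) = 1$.</Paragraph> <Paragraph position="17"> We say $M$ is unambiguous if for each $w \in \Sigma^*$, $\exists s \in Q_f\, [(q_0, w) \stackrel{c}{\vdash} (s, \varepsilon)]$ for at most one $c \in T^*$.</Paragraph> <Paragraph position="18"> We say $M$ is deterministic if for each $s$ and $a$, there is at most one transition $s \stackrel{a}{\mapsto} t$. Determinism implies unambiguity. It can be more readily checked whether an FA is deterministic than whether it is unambiguous. Furthermore, any FA can be effectively turned into a deterministic FA accepting the same language.</Paragraph> <Paragraph position="19"> Therefore, this paper will assume that FAs are deterministic, although technically, unambiguity is sufficient for our constructions to apply.</Paragraph> </Section> </Paper>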