<?xml version="1.0" standalone="yes"?>
<Paper uid="N04-1014"> <Title>Training Tree Transducers</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Trees </SectionTitle>
<Paragraph position="0"> $T_\Sigma$ is the set of (rooted, ordered, labeled, finite) trees over alphabet $\Sigma$. An alphabet is just a finite set.</Paragraph>
<Paragraph position="1"> $T_\Sigma(X)$ is the set of trees over alphabet $\Sigma$ indexed by $X$: the subset of $T_{\Sigma \cup X}$ in which only leaves may be labeled by elements of $X$. (Thus $T_\Sigma(\emptyset) = T_\Sigma$.) Leaves are nodes with no children.</Paragraph>
<Paragraph position="2"> The nodes of a tree $t$ are identified one-to-one with its paths: $\mathit{paths}_t \subset \mathit{paths} \equiv \mathbb{N}^* \equiv \bigcup_{i=0}^{\infty} \mathbb{N}^i$ (where $A^0 \equiv \{()\}$). The path to the root is the empty sequence $()$, and $p_1$ extended by $p_2$ is $p_1 \cdot p_2$, where $\cdot$ is concatenation.</Paragraph>
<Paragraph position="3"> For $p \in \mathit{paths}_t$, $\mathit{rank}_t(p)$ is the number of children, or rank, of the node at $p$ in $t$, and $\mathit{label}_t(p) \in \Sigma \cup X$ is its label. The ranked label of a node is the pair $\mathit{labelandrank}_t(p) \equiv (\mathit{label}_t(p), \mathit{rank}_t(p))$. For $1 \le i \le \mathit{rank}_t(p)$, the $i$th child of the node at $p$ is located at path $p \cdot (i)$. The subtree at path $p$ of $t$ is $t|_p$, defined by $\mathit{paths}_{t|_p} \equiv \{q \mid p \cdot q \in \mathit{paths}_t\}$ and $\mathit{labelandrank}_{t|_p}(q) \equiv \mathit{labelandrank}_t(p \cdot q)$.</Paragraph>
<Paragraph position="4"> The paths to $X$ in $t$ are $\mathit{paths}_t(X) \equiv \{p \in \mathit{paths}_t \mid \mathit{label}_t(p) \in X\}$. A frontier is a set of paths $f$ that are pairwise prefix-independent: $\forall p_1, p_2 \in f,\ p \in \mathit{paths} : p_1 = p_2 \cdot p \implies p_1 = p_2$. A frontier of $t$ is a frontier $f \subseteq \mathit{paths}_t$.</Paragraph>
<Paragraph position="5"> For $t, s \in T_\Sigma(X)$ and $p \in \mathit{paths}_t$, $t[p \leftarrow s]$ is the substitution of $s$ for $p$ in $t$, where the subtree at path $p$ is replaced by $s$. For a frontier $f$ of $t$, the mass substitution of $X$ for the frontier $f$ in $t$ is written $t[p \leftarrow X(p), \forall p \in f]$ and is equivalent to substituting the $X(p)$ for the $p$ serially in any order.</Paragraph>
<Paragraph position="6"> Trees may be written as strings over $\Sigma \cup \{(,)\}$ in the usual way.
For example, the tree $t = S(NP, VP(V, NP))$ has $\mathit{labelandrank}_t((2)) = (VP, 2)$ and $\mathit{labelandrank}_t((2,1)) = (V, 0)$. For $t \in T_\Sigma$ and $\sigma \in \Sigma$, $\sigma(t)$ is the tree whose root has label $\sigma$ and whose single child is $t$.</Paragraph>
<Paragraph position="7"> The yield of $X$ in $t$ is $\mathit{yield}_t(X)$, the string formed by reading out the leaves labeled with $X$ in left-to-right order. The usual case (the yield of $t$) is $\mathit{yield}_t \equiv \mathit{yield}_t(\Sigma)$.</Paragraph>
<Paragraph position="9"/> </Section> </Paper>
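<!--
The definitions in this section (paths, ranked labels, subtrees, frontiers,
substitution, and yield) can be sketched in Python. This is only an
illustrative sketch: the nested-tuple representation (label, children) and the
helper names node, paths, subtree, label_and_rank, substitute, is_frontier,
and tree_yield are assumptions made here and do not come from the paper.
Paths are tuples of 1-based child indices, with () denoting the root.

```python
from typing import Iterator, Tuple

# Hypothetical representation (not from the paper): a tree is a pair
# (label, children), where children is a tuple of trees.
Tree = Tuple[str, tuple]
Path = Tuple[int, ...]  # 1-based child indices; () is the root


def node(label: str, *children: Tree) -> Tree:
    """Build a tree whose root has the given label and children."""
    return (label, tuple(children))


def paths(t: Tree, prefix: Path = ()) -> Iterator[Path]:
    """Enumerate the paths of t in left-to-right preorder."""
    yield prefix
    for i, child in enumerate(t[1], start=1):
        yield from paths(child, prefix + (i,))


def subtree(t: Tree, p: Path) -> Tree:
    """Return the subtree of t at path p (p is assumed to be a path of t)."""
    for i in p:
        t = t[1][i - 1]
    return t


def label_and_rank(t: Tree, p: Path) -> Tuple[str, int]:
    """Return the ranked label (label, number of children) of the node at p."""
    label, children = subtree(t, p)
    return (label, len(children))


def substitute(t: Tree, p: Path, s: Tree) -> Tree:
    """Return t[p <- s]: a copy of t with the subtree at path p replaced by s."""
    if not p:
        return s
    label, children = t
    i = p[0] - 1
    new_child = substitute(children[i], p[1:], s)
    return (label, children[:i] + (new_child,) + children[i + 1:])


def is_frontier(f) -> bool:
    """True if no path in f is a proper prefix of another path in f."""
    return not any(p != q and p[:len(q)] == q for p in f for q in f)


def tree_yield(t: Tree) -> list:
    """Labels of the leaves of t, read in left-to-right order."""
    return [label_and_rank(t, p)[0] for p in paths(t)
            if label_and_rank(t, p)[1] == 0]


# The example tree from the text: S(NP, VP(V, NP)).
t = node("S", node("NP"), node("VP", node("V"), node("NP")))
```

With this sketch, label_and_rank(t, (2,)) gives ("VP", 2) and
label_and_rank(t, (2, 1)) gives ("V", 0), matching the worked example in the
text. Immutable tuples are used so that substitute returns a new tree and
leaves its argument unchanged.
-->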