<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-2134">
  <Title>Lexicalized Tree Automata-based Grammars for Translating Conversational Texts</Title>
  <Section position="3" start_page="926" end_page="927" type="metho">
    <SectionTitle>
2 Lexicalized Tree Automata-based
Grammars
</SectionTitle>
    <Paragraph position="0"> In this section we introduce Lexicalized Tree Automata-based Grammar (LTA-based Grammar) and present its parsing algorithm.</Paragraph>
    <Paragraph position="1"> First, we define some basic terminology. A grammar is strongly lexicalized if it consists of 1) a finite set of structures, each associated with a lexical item; each lexical item will be called the anchor of the corresponding structure, and 2) an operation or operations for composing the structures (Schabes, Abeillé and Joshi, 1988).</Paragraph>
    <Paragraph position="2"> In the following, the word "tree automaton" (TA) will be used as a generic term for an automaton that accepts trees as input. It can be a finite tree automaton, a pushdown tree automaton, or any tree-accepting automaton having a state set, state transitions, initial and final states, and optional memories associated with states. Although our argument below does not necessarily require an understanding of these general TAs, definitions and properties of finite and pushdown TAs can be found in Gécseg and Steinby (1997), for example.</Paragraph>
    <Section position="1" start_page="926" end_page="927" type="sub_section">
      <SectionTitle>
2.1 Definition of LTA-based Grammars
</SectionTitle>
      <Paragraph position="0"> The basic idea of an LTA-based grammar is to associate with each word a tree automaton that defines the set of local trees anchored to the word, instead of associating the trees themselves. The lexicalized tree automaton (LTA) provides a finite representation of a possibly non-finite set of local trees. This differs from other lexicalized grammars such as LTAG, where non-finiteness of local trees is introduced through a global tree operation such as adjunction of auxiliary trees.</Paragraph>
      <Paragraph position="1"> We define a lexicalized tree automata-based grammar as follows. Let Σ be a set of terminal symbols (words), and NT be the set of nonterminal symbols disjoint from Σ. Let Tw be a set of trees (elementary trees) associated with a word w in Σ. A tree in Tw has nodes either from Σ or from NT, and its root and one of its leaves are marked by a distinguished symbol self in NT. Let Aw be the tree automaton lexicalized to the word w, which accepts a subset of trees obtained by repeatedly joining two trees in Tw at the special nodes labelled self: one at the root of a tree and the other at a foot of another tree. From this definition, Aw can be identified with a string automaton; its alphabet is the set of trees in Tw, and a string of elementary trees is identified with the tree obtained by joining the elementary trees in a bottom-up manner. Sw is a set of nonterminal symbols associated with the word w. They are assigned to the root of a tree when the tree is accepted by Aw. For each word w, the set Λw = {Tw, Aw, Sw} is the set of local trees associated with w. The structure is described by Aw and Tw, the symbol at the root node is from Sw, and self at the foot is identified with w. We denote the family of Λw as Λ = {Λw} for w in Σ.</Paragraph>
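To make the string-automaton picture concrete, here is a minimal Python sketch of an LTA: a finite automaton whose alphabet is a set of elementary-tree names, and which returns the root symbols Sw on acceptance. The tree names, states and transitions are illustrative assumptions, not taken from the paper's figures.

```python
class LTA:
    """Finite string automaton whose alphabet is a set of elementary trees.

    An accepted string of elementary-tree names stands for the local tree
    obtained by joining those trees bottom-up at their `self` nodes.
    """

    def __init__(self, initial, finals, transitions, root_symbols):
        self.initial = initial
        self.finals = finals              # accepting states
        self.transitions = transitions    # (state, tree_name) -> state
        self.root_symbols = root_symbols  # Sw: symbols given to the root

    def accepts(self, tree_sequence):
        state = self.initial
        for tree in tree_sequence:
            if (state, tree) not in self.transitions:
                return None               # no transition: not a local tree
            state = self.transitions[(state, tree)]
        return self.root_symbols if state in self.finals else None


# A hypothetical noun-like word: one obligatory tree T1, then T2 repeatedly.
a_w = LTA(initial="q0", finals={"q1"},
          transitions={("q0", "T1"): "q1", ("q1", "T2"): "q1"},
          root_symbols={"NP"})
```

Accepting `["T1", "T2", "T2"]` here stands for the local tree built by joining one T2 on top of another T2 on top of T1, whose root then receives a symbol from Sw.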
      <Paragraph position="2"> A lexicalized tree automata-based grammar G is defined to be the tree algebra whose trees are the set union of Aw for all w in Σ, and whose basic tree operation is substitution, that is, joining two trees at the root node and a foot node when they have the same nonterminal in NT other than self.</Paragraph>
    </Section>
    <Section position="2" start_page="927" end_page="927" type="sub_section">
      <SectionTitle>
2.2 Some Remarks
</SectionTitle>
      <Paragraph position="0"> 1 Strictly speaking, the definition above does not satisfy the requirement on strongly lexicalized grammars that the structures associated with a word must form a finite set, since the tree set accepted by the automaton may be infinite. However, since a finite device, namely an automaton, describes this possibly infinite set, we will classify the proposed formalism as a strongly lexicalized grammar.</Paragraph>
      <Paragraph position="1"> 2 We defined the lexicalized tree automata using string automata whose alphabets are trees. The latter is obtained by linearizing the constituent trees along the spine of the tree. Because the LTA can be any tree automaton as long as it accepts all and only the (possibly infinite) tree set headed by a word, LTAs are essentially tree automata. These two equivalent pictures (the tree automata picture of a tree grammar and the string automata picture employed in the definition) will be used interchangeably in this paper.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="927" end_page="928" type="metho">
    <SectionTitle>
3 The grammar G can also be defined by a tree
</SectionTitle>
    <Paragraph position="0"> automaton T that accepts all and only the trees of the grammar as follows: First we regard NT as the set of states of T. Its initial states are Σ, and the final states are also from NT. Sw is regarded as the set of final states of Aw. The set of initial states of Aw is the set of nonterminal symbols that appear in Tw</Paragraph>
    <Paragraph position="1"> and w. The LTAs are combined into T through the common state set NT. The recognition of a tree t proceeds in a bottom-up manner, beginning at the leaf nodes that are initial states for G and for some Aw. When a subtree of t has been recognized by an LTA Aw, its root node is in a state s from Sw. If s is</Paragraph>
    <Paragraph position="3"> an initial state of another LTA Aw', the recognition can proceed. The tree t has been successfully recognized if the recognition step arrives at the root node.</Paragraph>
    <Section position="1" start_page="927" end_page="927" type="sub_section">
      <SectionTitle>
2.3 Examples
</SectionTitle>
      <Paragraph position="0"> Adjunction in the X-bar Theory We demonstrate how the proposed formalism handles the simplest case of an infinite set of local trees. The example is repeated adjunction at bar level 1 of the X-bar theory. Figure 1 shows a general scheme of the X-bar theory. X' at bar level 1 can be paired with some adjunct an arbitrary number of times before it grows to the phrase level, XP.</Paragraph>
      <Paragraph position="1"> Figure 2 shows how this scheme is realized in the LTA-based grammar formalism. Figure 2 (a) shows the tree set associated with the word. It consists of three trees, corresponding to the bar levels: T1 is for the complement, T2 for adjunction, and T3 for the specifier. (b) shows the tree automaton associated with this word in the (tree-alphabet) string automaton representation. It first accepts T1, then T2 an arbitrary number of times, and finally T3 to arrive at the final state. This sequence is identified with the trees in (d), obtained by concatenating T1 through T3 in a bottom-up manner. When the LTA arrives at the final state, the root node is given a nonterminal symbol from the set in (c), which is XP.</Paragraph>
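The X-bar scheme just described can be sketched as a three-state automaton accepting T1 (T2)* T3. The state names are illustrative assumptions; the tree names and the XP label follow the description of Figure 2 above.

```python
def xbar_lta(trees):
    """Accepts T1 (T2)* T3 and returns the root label XP, else None."""
    state = "X0"                  # just the head X
    for t in trees:
        if state == "X0" and t == "T1":
            state = "X1"          # complement attached: bar level 1
        elif state == "X1" and t == "T2":
            state = "X1"          # repeated adjunction stays at bar level 1
        elif state == "X1" and t == "T3":
            state = "XP"          # specifier closes the phrase
        else:
            return None
    return "XP" if state == "XP" else None
```

Any number of T2 adjunctions is accepted by the self-loop on the bar-level-1 state, which is exactly how the automaton finitely represents the infinite local-tree set.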
    </Section>
    <Section position="2" start_page="927" end_page="928" type="sub_section">
      <SectionTitle>
Tree Adjoining Language
</SectionTitle>
      <Paragraph position="0"> Figure 3 shows an LTAG that generates a strictly context-sensitive language aⁿbⁿecⁿdⁿ. The unique initial tree T in (a) is adjoined repeatedly by the unique auxiliary tree A in (b) at the root node labeled S. The root and foot of A are labeled S, but adjunction to them is inhibited by the index NA. (c) shows a tree obtained by adjoining A once to T.</Paragraph>
      <Paragraph position="1"> Generally, a string aⁿbⁿecⁿdⁿ is obtained as the yield of a tree produced by adjoining A n times to T.</Paragraph>
      <Paragraph position="2"> The same language can be expressed by an LTA-based grammar shown in Figure 4. The word is e (a). The tree set associated with e consists of two trees T1 and T2, as shown in (b). The local automaton is a pushdown automaton that accepts the tree sequence (T2)ⁿ(T1)ⁿ, and accepted trees are given the nonterminal symbol S as in (d). (e) shows a tree with n = 2. From this setting, it is apparent that this LTA-based grammar generates the same language as the TAL in Figure 3.</Paragraph>
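Assuming the two trees are named T1 and T2 as above, the pushdown LTA accepting (T2)ⁿ(T1)ⁿ can be sketched as follows: it pushes one stack symbol per T2 and pops one per T1, so the two counts are forced to balance, which is what yields aⁿbⁿecⁿdⁿ.

```python
def pushdown_lta(trees):
    """Accepts the sequence (T2)^n (T1)^n and returns root label S, else None."""
    stack = []
    popping = False
    for t in trees:
        if t == "T2" and not popping:
            stack.append("*")     # one push per adjoined T2
        elif t == "T1" and stack:
            popping = True        # once a T1 is seen, only T1 may follow
            stack.pop()           # one pop per T1: counts must balance
        else:
            return None
    return "S" if not stack else None
```

No finite-state automaton can perform this counting, which is why the grammar class grows from context-free to tree-adjoining exactly when pushdown LTAs are admitted.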
      <Paragraph position="3"> By extending this construction, it will be obvious that for any LTAG, an equivalent LTA-based grammar can be constructed within the class of pushdown LTAs.</Paragraph>
      <Paragraph position="4"> 2.4 Parsing LTA-based Grammars The parsing algorithm for the LTA-based grammar is a straightforward extension of the CFG case. In the CFG case, an active edge is represented by a rewriting rule with a dot in the right-hand side. The dot shows that the terminals and non-terminals up to that location have already been found, and the rule application succeeds when the dot reaches the end of the rule. If we regard the right-hand side of a rule as an automaton accepting a sequence of terminals and non-terminals, with the dot representing the current state, this picture can easily be generalized to the LTA-based grammar. Consider the sentence "He eats dinner". Figure 5 shows the dictionary content of the verb "eats", which is basically the same as Figure 2. Figure 6 shows the dictionaries of "he" and "dinner". We suppose here that these words have no associated trees, for simplicity. The basic strategy is left-to-right bottom-up chart parsing.</Paragraph>
      <Paragraph position="5"> First, edges e1, e2 and e3 are loaded into the chart and set to the initial state. They correspond to "he", "eats" and "dinner" respectively. The parsing proceeds from left to right, and the parser triggers the LTA of e1 first. Since its only possible transition is a null transition, it arrives at the final state immediately and creates an edge e4 labelled subj.</Paragraph>
      <Paragraph position="6"> Then the focus moves one step to the right on the chart and the LTA of e2 is activated. It tries to find the tree Ta1, and finds that an edge labelled dobj is necessary to its right. Since there is no such edge, the LTA creates an active edge with a hole dobj, as in the case of CFG, and the LTA goes into a pause, waiting for the hole to be filled.</Paragraph>
      <Paragraph position="7"> Creation of e5 from e3 is similar to the creation of e4 from e1. Then e5 starts the completion step as in the CFG case. At this step, the active edge created above is found, and e5 is found to match the hole.</Paragraph>
      <Paragraph position="8"> Then the LTA of e2 is reactivated, arrives at the state s1, and then creates an edge e6.</Paragraph>
      <Paragraph position="9"> Next the LTA of e6 is activated. It tries to find Ta2 or Ta3. In searching for Ta3, an active edge with a hole postmod is created. While searching for Ta2, the LTA finds that an edge labelled subj to the left</Paragraph>
      <Paragraph position="11"> is what is necessary, and finds that e4 satisfies this condition. By accepting Ta2, the LTA creates the edge e7, labels it as sentence and advances to the final state. There is no more possible action on the chart, and the parsing is completed successfully.</Paragraph>
      <Paragraph position="12"> Please note that the algorithm exemplified above does not depend on the concrete form of the LTA. The same algorithm can be applied to pushdown automata and other classes of automata having internal memories.</Paragraph>
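The walk-through above can be compressed into a toy sketch. The lexicon format (each word's LTA flattened into a sequence of required neighbours) and the label names are simplifying assumptions; a real implementation keeps paused active edges on the chart rather than scanning in one pass.

```python
# Lexicon: each word maps to (required neighbours in LTA order, result label).
# "he" and "dinner" need nothing (null transitions); "eats" first needs a dobj
# edge to its right (Ta1), then a subj edge to its left (Ta2).
LEXICON = {
    "he":     ([], "subj"),
    "dinner": ([], "dobj"),
    "eats":   ([("right", "dobj"), ("left", "subj")], "sentence"),
}


def parse(words):
    # Null transitions first: words needing nothing become complete edges
    # at once (e4 and e5 in the text).
    edges = {i: LEXICON[w][1] for i, w in enumerate(words) if not LEXICON[w][0]}
    for i, word in enumerate(words):
        steps, result = LEXICON[word]
        lo = hi = i
        for direction, label in steps:   # advance the LTA one state per step
            j = hi + 1 if direction == "right" else lo - 1
            if edges.get(j) != label:
                return None              # hole never filled: edge stays active
            lo, hi = min(lo, j), max(hi, j)
        if steps:
            return result                # e7: the sentence-spanning edge
    return None
```

Replacing the flat step list with a pushdown automaton would change nothing in the control loop, which is the point made in the last paragraph above.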
    </Section>
  </Section>
  <Section position="5" start_page="928" end_page="930" type="metho">
    <SectionTitle>
3 Translation Module
</SectionTitle>
    <Paragraph position="0"> We built a bi-directional translation system between the Japanese and English languages using the proposed method. It translates conversational texts as would appear in a dialogue between two people, to help them communicate in a foreign travel situation.</Paragraph>
    <Paragraph position="1"> Figure 8 shows an overview of the system.</Paragraph>
    <Section position="1" start_page="928" end_page="928" type="sub_section">
      <SectionTitle>
3.1 Translation Engine
</SectionTitle>
      <Paragraph position="0"> Since each word in the dictionary has its own tree set and tree automaton, a simple implementation will lead to inefficiency. To cope with this problem, we provided two mechanisms to share the LTA. A "rule template" mechanism is provided to share the triplet {Tw, Aw, Sw} in the definition of LTA, while a "shared tree" mechanism is provided to share the elementary trees among different Aw.</Paragraph>
      <Paragraph position="1"> The rule template is applied just after dictionary loading, and assigns an LTA to a word that matches the condition in the template. It is mainly used for words such as common nouns. A shared tree is represented by a pointer to an elementary tree in the pool, and is loaded into the system when it is used for the first time.</Paragraph>
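A possible rendering of the rule-template mechanism: at dictionary-loading time, an entry without an individual LTA is matched against template conditions (here an assumed part-of-speech field) and receives a pointer to one shared automaton instead of a private copy. The entry format and template condition are illustrative assumptions, not the system's actual dictionary format.

```python
SHARED_LTAS = {"common_noun_lta": object()}  # one shared automaton per template

TEMPLATES = [
    # (condition on a dictionary entry, name of the shared LTA to assign)
    (lambda entry: entry.get("pos") == "noun" and not entry.get("proper"),
     "common_noun_lta"),
]


def load_entry(entry):
    """Applied just after dictionary loading, as described in the text."""
    if "lta" not in entry:               # words with individual LTAs keep them
        for condition, lta_name in TEMPLATES:
            if condition(entry):
                entry["lta"] = SHARED_LTAS[lta_name]  # a pointer, not a copy
                break
    return entry
```

All common nouns then reference the same automaton object, which is what keeps the per-word storage small even with a seventy-thousand-word dictionary.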
      <Paragraph position="2"> The language conversion method is based on synchronous derivation of analysis and generation trees, basically the same as the syntax-directed translation of Aho and Ullman (1969) and the synchronous LTAG of Shieber and Schabes (1990).</Paragraph>
      <Paragraph position="3"> In this method, the elementary analysis tree of each word is paired with another tree (an elementary generation tree). Starting from the root, at each node of the analysis tree, the direct descendant nodes are reordered in the generation tree, according to the correspondence of elementary analysis and generation trees. This translation mechanism is basically independent of how the analysis tree is constructed, and hence of the grammar formalism. In our implementation, the generation tree is a call graph of target-language generating functions, which enables detailed procedural control over the syntactic generation process.</Paragraph>
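The node-reordering transfer step might be sketched as follows. The representation (each node carries its target-side child order, with "H" marking the head position) and the Japanese glosses are illustrative assumptions, not the system's actual generation-function call graph.

```python
def generate(node):
    """node = (target_word, children, order); "H" in order marks the head."""
    head, children, order = node
    out = []
    for slot in order:
        if slot == "H":
            out.append(head)               # emit the head word itself
        else:
            out.extend(generate(children[slot]))  # recurse on a reordered child
    return out


# "He eats dinner": the generation tree paired with the verb orders subject
# and object before the head (Japanese SOV), reversing English head-medial
# order, and substitutes target words.
tree = ("taberu",
        [("kare-ga", [], ["H"]),      # subject
         ("yuushoku-o", [], ["H"])],  # direct object
        [0, 1, "H"])
```

Because the reordering is attached to each analysis/generation tree pair, the same traversal works regardless of how the analysis tree was parsed, matching the formalism-independence claimed above.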
    </Section>
    <Section position="2" start_page="928" end_page="930" type="sub_section">
      <SectionTitle>
3.2 Grammars and Dictionaries for English to Japanese Translation
</SectionTitle>
      <Paragraph position="0"> The English to Japanese translation grammar and dictionary have been developed. In order to achieve wide coverage for general input and high quality for the target domain, we developed general grammar rules and domain-specific rules simultaneously. General rules are based on a standard X-bar theory. Nodes of a tree are associated with attribute-value structures in a standard way. As nonterminals, we employed a grammatical function-centered approach: a phrase-level node is assigned attribute-values that express its syntactic function such as subject, direct object, etc., instead of a single part-of-speech symbol such as NP. This approach is suitable for capturing the idiosyncratic behaviour of words.</Paragraph>
      <Paragraph position="1"> Domain-specific rules are mostly pattern-like rules with special attention to aspects that are important for carrying conversations, such as modality and the degree of politeness. The English to Japanese translation dictionary contains about seventy thousand words. The number of words that required individual LTAs was a few thousand at the time of this report.</Paragraph>
    </Section>
    <Section position="3" start_page="930" end_page="930" type="sub_section">
      <SectionTitle>
3.3 Current Status of Implementation
</SectionTitle>
      <Paragraph position="0"> The system has been implemented using C++, and runs on Windows 98 and NT. The requirement is a Pentium II 400 MHz or above for the CPU, about 60 MB of memory, and 200 MB of disk space.</Paragraph>
      <Paragraph position="1"> Most of the disk space is used for statistical data for disambiguation.</Paragraph>
      <Paragraph position="2"> We performed a preliminary evaluation of the translation quality of English to Japanese translation. A widely used commercial system was chosen as a reference system, whose dictionaries were expanded for the target domain. Five hundred sentences were randomly chosen from a large (about 40K) pool of conversational texts of the target domain. Then the outputs of our system and the reference system were mixed, and presented to a single evaluator in a random order.</Paragraph>
      <Paragraph position="3"> The evaluator classified them into four levels (natural, good, understandable and bad). The result showed that the number of sentences classified as "natural" increased by about 45% compared to that of the reference system, i.e. the ratio of the numbers of sentences was around 1.45. The number of sentences classified as "bad" decreased by about 40% in the same measure.</Paragraph>
      <Paragraph position="4"> We applied this module to an experimental speech translation system (Watanabe et al., 2000).</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="930" end_page="931" type="metho">
    <SectionTitle>
4 Discussions
</SectionTitle>
    <Paragraph position="0"> The proposed grammar formalism is a kind of lexicalized grammar formalism and shares its advantages. The largest difference from other strongly lexicalized grammar formalisms is that it employs lexicalized tree automata (LTA) to describe the tree set associated with a word, which allows a finite description of a non-finite set of local trees. These automata's role is equivalent to that of additional tree operations in other formalisms. In addition, an LTA provides an extended domain of locality (EDOL) of the word.</Paragraph>
    <Paragraph position="1"> If all the LTAs are finite automata in the string automaton representation, then the tree language recognized by this grammar is regular and its yield is a context-free language. The grammar can accept general Tree Adjoining Languages (TALs) if the LTAs belong to the class of pushdown automata in the string automaton representation. This is a reflection of the fact that pushdown tree automata can accept the indexed languages (Gécseg and Steinby, 1997), of which the TALs are a subclass.</Paragraph>
    <Paragraph position="2"> As shown in Section 2.4, the control strategy of bottom-up chart parsing does not rely on the concrete content of the LTA, which is an advantage of the proposed formalism. This implies that we can alter even the grammar class without affecting the parsing. Suppose the current LTAs are finite automata, and hence the yield language is context-free.</Paragraph>
    <Paragraph position="3"> If we want to introduce a word e that induces non-context-freeness, such as e in aⁿbⁿecⁿdⁿ, then what we have to do is to write a pushdown automaton as in Figure 4 for the word e. We change neither the grammar formalism nor the parsing algorithm, and the change is localized to the LTA of e.</Paragraph>
    <Paragraph position="4"> Writing automata by hand may seem much more complex than writing trees, but our experience shows that it is not much different from conventional grammar development. As long as appropriate notations are used, writing automata for a word amounts to determining the possible forms of trees headed by that word, a task always required in grammar development. In fact, there is less work, since the grammar writer does not need to pay attention to assigning proper nonterminals and/or proper attributes to internal nodes of trees in order to control their growth.</Paragraph>
    <Paragraph position="5"> It is another advantage of the proposed formalism that it can utilize various automata operations, such as composition and intersection.</Paragraph>
    <Paragraph position="6"> For example, a word can append an automaton to that of the headword when it becomes a child, which makes it possible to specify a constraint from a lower-positioned word to a higher-positioned word in the tree. Another example is coordination. Two edges are conjoined when the unapplied parts of their LTAs have a nonempty intersection as automata, and the conjoined edge is given this intersection as its LTA. Verb phrase conjunction such as "John eats cookies and drinks beer" is handled in this manner, by conjoining "eats cookies" and "drinks beer". The intersected automaton will accept the subject tree and other sentence-level trees.</Paragraph>
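The coordination mechanism relies on the standard product construction for automaton intersection, which can be sketched as follows. The DFA encoding and the tree name Ts (the subject tree still required by both conjuncts) are illustrative assumptions.

```python
def intersect(dfa1, dfa2):
    """Product construction; a DFA is (initial, finals, {(state, sym): state})."""
    (i1, f1, t1), (i2, f2, t2) = dfa1, dfa2
    trans, finals = {}, set()
    todo, seen = [(i1, i2)], {(i1, i2)}
    while todo:
        s1, s2 = todo.pop()
        if s1 in f1 and s2 in f2:          # final iff both components are final
            finals.add((s1, s2))
        for (a, sym), b in t1.items():
            if a == s1 and (s2, sym) in t2:  # both DFAs must allow the symbol
                nxt = (b, t2[(s2, sym)])
                trans[((s1, s2), sym)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    return (i1, i2), finals, trans


def accepts(dfa, seq):
    initial, finals, trans = dfa
    state = initial
    for sym in seq:
        if (state, sym) not in trans:
            return False
        state = trans[(state, sym)]
    return state in finals


# Unapplied parts of the LTAs of "eats cookies" and "drinks beer":
# both still have to accept the subject tree Ts before reaching a final state.
eats_rest = (0, {1}, {(0, "Ts"): 1})
drinks_rest = (0, {1}, {(0, "Ts"): 1})
conjoined = intersect(eats_rest, drinks_rest)
```

The conjoined edge carries the product automaton, so any tree it subsequently accepts (here the subject tree) is guaranteed to be acceptable to both conjuncts.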
    <Paragraph position="7">  In the proposed method, elementary trees are always anchored by the syntactic headword. For example, a verb in a relative clause is in the EDOL of the antecedent. Then, if the embedded verb puts a constraint on the antecedent, that constraint is not expressed in a straightforward manner, which may seem a weakness of the method. We just point out that this type of problem occurs when the syntactic head and the semantic head are different, and is common to lexicalized grammars as long as a tree is anchored to one word, because constraints are often reciprocal. In our current implementation, the constraint written in the verb's dictionary is found and checked by the relative-clause-tree-accepting automaton of the antecedent noun.</Paragraph>
    <Paragraph position="8"> There has been much work on syntactic analysis based on automata attached to the headword. Evans and Weir (1998) used finite state automata as representations of trees that can be merged and minimized to improve parsing efficiency. In their method, the grammar is fixed to be LTAG or some lexicalized grammar, and the automata are obtained by automatic conversion from the trees. Our method differs from theirs in that ours employs trees as the basic objects of automata, which makes it possible to handle general recursive adjunction in LTAG, while their automata work on the nonterminal and terminal symbols. At the center of our method is the notion of the local grammar of a word. The whole grammar is divided into the global part and the set of local grammars specific to the words, which are represented by the LTAs.</Paragraph>
    <Paragraph position="9"> Alshawi (1996) introduced Head Automata, weighted finite machines that accept a pair of sequences of relation symbols. The difference is similar to the above. Since the tree automata in our method are used to define the set of local trees, their role is equivalent to building the head automata themselves, not to combining trees that are already built, as Head Automata do.</Paragraph>
  </Section>
</Paper>