<?xml version="1.0" standalone="yes"?> <Paper uid="W01-1403"> <Title>Inducing Lexico-Structural Transfer Rules from Parsed Bi-texts</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Overall Approach </SectionTitle> <Paragraph position="0"> In its most general form, our approach to transfer rule induction includes three processes: data preparation, transfer rule induction, and evaluation. An overview of each process is provided below; further details are provided in subsequent sections.</Paragraph> <Paragraph position="1"> The data preparation process creates the following resources from the bi-texts: * A training set and a test set of source and target parses for the bi-texts, post-processed into a syntactic dependency representation.</Paragraph> <Paragraph position="2"> * A baseline transfer dictionary, which may include (depending upon availability) lexical transfer rules extracted from the bi-texts using statistical methods, lexical transfer rules from existing bilingual dictionaries, and/or handcrafted lexico-structural transfer rules.</Paragraph> <Paragraph position="3"> The transfer induction process induces lexico-structural transfer rules from the training set of corresponding source and target parses that, when added to the baseline transfer dictionary, produce transferred parses that are closer to the corresponding target parses. 
The transfer induction process has the following steps: * Nodes of the corresponding source and target parses are aligned using the baseline transfer dictionary and some heuristics based on the similarity of part-of-speech and syntactic context.</Paragraph> <Paragraph position="4"> * Transfer rule candidates are generated based on the sub-patterns that contain the corresponding aligned nodes in the source and target parses.</Paragraph> <Paragraph position="5"> * The transfer rule candidates are ordered based on their likelihood ratios.</Paragraph> <Paragraph position="6"> * The transfer rule candidates are filtered, one at a time, in the order of the likelihood ratios, by removing those rule candidates that do not produce an overall improvement in the accuracy of the transferred parses.</Paragraph> <Paragraph position="7"> The evaluation process has the following steps: * Both the baseline transfer dictionary and the induced transfer dictionary (i.e., the baseline transfer dictionary augmented with the induced transfer rules) are applied to the test set in order to produce two sets of transferred parses, the baseline set and the (hopefully) improved induced set. 
For each set, the differences between the transferred parses and target parses are measured, and the improvement in tree accuracy is calculated.</Paragraph> <Paragraph position="8"> * After performing syntactic realization on the baseline set and the induced set of transferred parses, the differences between the resulting translated strings and the target strings are measured, and the improvement in string accuracy is calculated.</Paragraph> <Paragraph position="9"> * For a subset of the translated strings, human judgments of accuracy and grammaticality are gathered, and the correlations between the manual and automatic scores are calculated, in order to assess the meaningfulness of the automatic measures.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Data Preparation </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 Parsing the Bi-texts </SectionTitle> <Paragraph position="0"> In our experiments to date, we have used a corpus consisting of a Korean dialog of 4183 sentences and their English human translations. We ran off-the-shelf parsers on each half of the corpus, namely the Korean parser developed by Yoon et al. (1997) and the English parser developed by Collins (1997). Neither parser was trained on our corpus.</Paragraph> <Paragraph position="1"> We automatically converted the phrase structure output of the Collins parser into the syntactic dependency representation used by our syntactic realizer, RealPro (Lavoie and Rambow, 1997). This representation is based on the deep-syntactic structures (DSyntS) of Meaning-Text Theory (Mel'Vcuk, 1988). 
The important features of a DSyntS are as follows: * a DSyntS is an unordered tree with labeled nodes and labeled arcs; * a DSyntS is lexicalized, meaning that the nodes are labeled with lexemes (uninflected words) from the target language; * a DSyntS is a dependency structure and not a phrase-structure structure: there are no non-terminal nodes, and all nodes are labeled with lexemes; * a DSyntS is a syntactic representation, meaning that the arcs of the tree are labeled with syntactic relations such as SUBJECT (represented in DSyntSs as I), rather than conceptual or semantic relations such as AGENT; * a DSyntS is a deep syntactic representation, meaning that only meaning-bearing lexemes are represented, and not function words.</Paragraph> <Paragraph position="2"> Since the output of the Yoon parser is quite similar, with the exception of its treatment of syntactic relations, we have used its output as is. The DSyntS representations for two corresponding Korean and English sentences are illustrated in Figure 1.</Paragraph> <Paragraph position="3"> In examining the outputs of the two parsers on our corpus, we found that about half of the parse pairs contained incorrect dependency assignments, incomplete lemmatization or incomplete parses. 
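As a concrete illustration of the DSyntS properties listed above (unordered, lexicalized, arcs labeled with syntactic relations, function words omitted), the representation can be sketched as a small data structure. This is an illustrative sketch only: `DSyntNode` and its fields are hypothetical names, not RealPro's actual representation.

```python
from dataclasses import dataclass, field

@dataclass
class DSyntNode:
    lexeme: str                # uninflected word; every node is lexicalized
    attributes: dict = field(default_factory=dict)   # e.g. {"mood": "imp"}
    children: list = field(default_factory=list)     # (relation, DSyntNode) pairs

    def add_child(self, relation, node):
        # arcs carry syntactic relations (I, II, ATTR, ...), not semantic
        # roles such as AGENT; child order is irrelevant (unordered tree)
        self.children.append((relation, node))

# "Look at the map": only meaning-bearing lexemes are represented,
# so the function word "the" does not appear as a node.
look = DSyntNode("look", {"mood": "imp"})
look.add_child("II", DSyntNode("map"))
```

The imperative is encoded as an attribute on the verb node rather than as a separate word, mirroring the deep-syntactic treatment described above.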
To reduce the impact of such parsing errors in our initial experiments, we have primarily focused on a higher quality subset of 1763 sentence pairs that were selected according to the following criteria: * Parse pairs where the source or target parse contained more than 10 nodes were rejected, since these usually contained more parse errors than smaller parses.</Paragraph> <Paragraph position="5"> [Figure 1: DSyntS representations for corresponding Korean and English sentences]</Paragraph> <Paragraph position="6"> * Parse pairs where the source or target parse contained non-final punctuation were rejected; this criterion was based on our observation that in most such cases, the source or target parses contained only a fragment of the original sentence content (i.e., one or both parsers only parsed what was on one side of an intra-sentential punctuation mark).</Paragraph> <Paragraph position="7"> We divided this higher quality subset into training and test sets by randomly choosing 50% of the 1763 parse pairs for inclusion in the training set, reserving the remaining 50% for the test set. The average numbers of parse nodes in the training set and test set were respectively 6.91 and 6.11 nodes.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 Creating the Baseline Transfer Dictionary </SectionTitle> <Paragraph position="0"> In the general case, any available bilingual dictionaries can be combined to create the baseline transfer dictionary. These dictionaries may include lexical transfer dictionaries extracted from the bi-texts using statistical methods, existing bilingual dictionaries, or handcrafted lexico-structural transfer dictionaries. 
If probabilistic information is not already associated with the lexical entries, log likelihood ratios can be computed and added to these entries based on the occurrences of these lexical items in the parse pairs.</Paragraph> <Paragraph position="1"> In our initial experiments, we decided to focus on the scenario where the baseline transfer dictionary contains only lexical transfer rules extracted from the bi-texts using statistical methods.</Paragraph> <Paragraph position="2"> To simulate this scenario, we created our baseline transfer dictionary by taking the lexico-syntactic transfer dictionary developed by Han et al. (2000) for this corpus and removing the (more general) rules that were not fully lexicalized. Starting with this purely lexical baseline transfer dictionary enabled us to examine whether these more general rules could be discovered through induction.</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Transfer Rule Induction </SectionTitle> <Paragraph position="0"> The induced lexico-structural transfer rules are represented in a formalism similar to the one described in Nasr et al. (1997), and extended to also include log likelihood ratios. Figures 2 and 3 illustrate two entry samples that can be used to transfer a Korean syntactic representation for cito-reul po-ra to an English syntactic representation for look at the map. The first rule lexicalizes the English predicate and inserts the corresponding preposition while the second rule inserts the English imperative attribute. 
This formalism uses notation similar to the syntactic dependency notation shown in Figure 1, augmented with variable arguments prefixed with $ characters.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Aligning the Parse Nodes </SectionTitle> <Paragraph position="0"> To align the nodes in the source and target parse trees, we devised a new dynamic programming alignment algorithm that performs a top-down, bidirectional beam search for the least cost mapping between these nodes. The algorithm is parameterized by the costs of (1) aligning two nodes whose lexemes are not found in the baseline transfer dictionary; (2) aligning two nodes with differing parts of speech; (3) deleting or inserting a node in the source or target tree; and (4) aligning two nodes whose relative locations differ.</Paragraph> <Paragraph position="1"> To determine an appropriate part of speech cost measure, we first extracted a small set of parse pairs that could be reliably aligned using lexical matching alone, and then based the cost measure on the co-occurrence counts of the observed parts of speech pairings. The remaining costs were set by hand.</Paragraph> <Paragraph position="2"> As a result of the alignment process, alignment id attributes (aid) are added to the nodes of the parse pairs. 
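The four cost components that parameterize the alignment algorithm can be sketched as a per-node cost function. Everything here is an assumption for illustration: the weights, node encoding, and function name are hypothetical, and the actual top-down bidirectional beam search over whole trees is not reproduced.

```python
# Hypothetical cost weights; in the approach described above only the
# part-of-speech cost is derived from data (co-occurrence counts over
# reliably aligned pairs), and the remaining costs are set by hand.
COSTS = {
    "no_dict_entry": 2.0,  # (1) lexeme pair absent from baseline dictionary
    "pos_mismatch": 1.0,   # (2) differing parts of speech
    "indel": 3.0,          # (3) deleting or inserting an unmatched node
    "position": 0.5,       # (4) per unit of relative-location difference
}

def align_cost(src, tgt, transfer_dict, costs=COSTS):
    """Cost of aligning one source node with one target node; a node
    paired with None counts as an insertion or deletion."""
    if src is None or tgt is None:
        return costs["indel"]
    cost = 0.0
    if tgt["lexeme"] not in transfer_dict.get(src["lexeme"], ()):
        cost += costs["no_dict_entry"]
    if src["pos"] != tgt["pos"]:
        cost += costs["pos_mismatch"]
    cost += costs["position"] * abs(src["index"] - tgt["index"])
    return cost

baseline = {"cito": {"map"}}  # toy dictionary: Korean 'cito' ~ English 'map'
c = align_cost({"lexeme": "cito", "pos": "N", "index": 1},
               {"lexeme": "map", "pos": "N", "index": 1}, baseline)
```

The beam search would minimize the sum of such costs over a candidate node mapping, leaving dictionary-unsupported nodes (such as inserted English prepositions) unaligned when that is cheaper.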
Some nodes may be in alignment with no other node, such as English prepositions not found in the Korean DSyntS.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Generating Rule Candidates </SectionTitle> <Paragraph position="0"> Candidate transfer rules are generated using three data sources: * the training set of aligned source and target parses resulting from the alignment process; * a set of alignment constraints which identify the subtrees of interest in the aligned source and target parses (Section 4.2.1); * a set of attribute constraints which determine what parts of the aligned subtrees to include in the transfer rule candidates' source and target patterns (Section 4.2.2).</Paragraph> <Paragraph position="1"> The alignment and attribute constraints are necessary to keep the set of candidate transfer rules manageable in size.</Paragraph> <Paragraph position="2"> Figure 4 shows an example alignment constraint. This constraint, which matches the structural patterns of the transfer rule illustrated in Figure 2, uses the aid alignment attribute to indicate that in a Korean and English parse pair, any source and target sub-trees matching this alignment constraint (where $X1 and $Y1 are aligned or have the same attribute aid values and where $X2 and $Y3 are aligned) can be used as a point of departure for generating transfer rule candidates. We suggest that alignment constraints such as this one can be used to define most of the possible syntactic divergences between languages (Dorr, 1994), and that only a handful of them are necessary for two given languages (we have identified 11 general alignment constraints necessary for Korean to English transfer so far).</Paragraph> <Paragraph position="3"> Attribute constraints are used to limit the space of possible transfer rule candidates that can be generated from the sub-trees satisfying the alignment constraints. 
Candidate transfer rules must satisfy all of the attribute constraints. Attribute constraints can be divided into two types: * independent attribute constraints, whose scope covers only one part of a candidate transfer rule and which are the same for the source and target parts; * concurrent attribute constraints, whose scope extends to both the source and target parts of a candidate transfer rule.</Paragraph> <Paragraph position="4"> Examples of an independent attribute constraint and of a concurrent attribute constraint are given in Figure 5 and Figure 6 respectively. As with the alignment constraints, we suggest that a relatively small number of attribute constraints is necessary to generate most of the desired rules for a given language pair.</Paragraph> <Paragraph position="5"> [Figure 5: an independent attribute constraint: each node of a candidate transfer rule must have its relation attribute (relationship with its governor) specified if it is an internal node; otherwise this relation must not be specified. Figure 6: a concurrent attribute constraint: in a candidate transfer rule, inclusion of the lexemes of two aligned nodes must be done concurrently.]</Paragraph> <Paragraph position="7"/>
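This two-level ordering can be sketched as follows, using the standard binomial log likelihood ratio from Manning and Schutze (1999: 172-175). The formula is the textbook formulation, assumed here rather than taken from the paper, and the count and rule encodings are illustrative.

```python
from math import log

def log_likelihood_ratio(c1, c2, c12, n):
    """-2 log lambda for the association between a rule's source pattern
    (found in c1 of n parse pairs), its target pattern (c2) and their
    co-occurrence (c12), per Manning and Schutze (1999: 172-175)."""
    def ll(k, m, x):
        # binomial log likelihood; the guard covers the degenerate
        # boundary counts (k == 0 with x == 0, or k == m with x == 1)
        return k * log(x) + (m - k) * log(1 - x) if 0 < x < 1 else 0.0
    p = c2 / n
    p1 = c12 / c1
    p2 = (c2 - c12) / (n - c1)
    return 2 * (ll(c12, c1, p1) + ll(c2 - c12, n - c1, p2)
                - ll(c12, c1, p) - ll(c2 - c12, n - c1, p))

def ordering_key(rule):
    # descending log likelihood ratio; ties broken by ascending
    # specificity, so more general rules are ordered first
    return (-rule["llr"], rule["specificity"])

# a pattern pair that nearly always co-occurs is more strongly
# associated than a weakly co-occurring one
strong = log_likelihood_ratio(10, 10, 9, 1000)
weak = log_likelihood_ratio(10, 10, 1, 1000)
```

Sorting the candidate list with `ordering_key` yields the order in which candidates are later considered by the filtering step.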
4.3.1 Rule ordering by log likelihood ratio We calculate the log likelihood ratio, log l, applied to a transfer rule candidate from the counts C1 and C2 of parse pairs containing the rule's source and target patterns respectively, and the count C12 of parse pairs containing both. Note that in the definitions of C1, C2, and C12 we are currently only considering one occurrence or co-occurrence of the source and/or target patterns per parse pair, while in general there could be more than one; in our initial experiments these definitions have sufficed.</Paragraph> <Paragraph position="1"> If two or more candidate transfer rules have the same log likelihood ratio, ties are broken by a specificity heuristic, with the result that more general rules are ordered ahead of more specific ones. The specificity of a rule is defined to be the following sum: the number of attributes found in the source and target patterns (not counting the aid attributes), plus 1 for each lexeme attribute and for each dependency relationship.</Paragraph> <Paragraph position="3"> In our initial experiments, this simple heuristic has been satisfactory.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.4 Filtering Rule Candidates </SectionTitle> <Paragraph position="0"> Once the candidate transfer rules have been ordered, error-driven filtering is used to select those that yield improvements over the baseline transfer dictionary. The algorithm works as follows.</Paragraph> <Paragraph position="1"> First, in the initialization step, the set of accepted transfer rules is set to just those appearing in the baseline transfer dictionary, and the current error rate is established by applying these transfer rules to all the source structures and calculating the overall difference between the resulting transferred structures and the target parses. Then, in a single pass through the ordered list of candidates, each transfer rule candidate is tested to see if it reduces the error rate. 
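The initialization step and single-pass loop can be sketched as below. `error_fn`, which stands in for applying a rule set to all source structures and measuring the overall difference from the target parses, is an assumed interface, not the paper's implementation.

```python
def filter_candidates(baseline_rules, candidates, sources, targets, error_fn):
    """Error-driven filtering: accept a candidate only if adding it to
    the current rule set lowers the overall error on the training set."""
    accepted = list(baseline_rules)       # initialize from the baseline dictionary
    current_error = error_fn(accepted, sources, targets)
    for rule in candidates:               # candidates already ordered by LLR
        trial = accepted + [rule]
        trial_error = error_fn(trial, sources, targets)
        if trial_error < current_error:   # strict improvement required
            accepted, current_error = trial, trial_error
    return accepted

# toy usage: the "error" is simply how many needed rules are still missing
needed = {"r1", "r2"}
error_fn = lambda rules, s, t: len(needed - set(rules))
kept = filter_candidates([], ["r1", "r3", "r2"], None, None, error_fn)
```

Because each candidate is evaluated against the rules accepted so far, a redundant candidate (like the combined rule of Figure 8 discussed below in Section 4.5) yields no improvement and is dropped.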
During each iteration, the candidate transfer rule is provisionally added to the current set of accepted rules and the updated set is applied to all the source structures. If the overall difference between the transferred structures and the target parses is lower than the current error rate, then the candidate is accepted and the lower value becomes the current error rate; otherwise, the candidate is removed from the set.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.5 Discussion of Induced Rules </SectionTitle> <Paragraph position="0"> Experimentation with the training set of 882 parse pairs described in Section 3.1 produced 12467 source and target sub-tree pairs using the alignment constraints, from which 20569 transfer rule candidates were generated and 7565 were accepted after filtering. We expect that the number of accepted rules per parse pair will decrease with larger training sets, though this remains to be verified. The rule illustrated in Figure 3 was accepted as the 65th best transfer rule with a log likelihood ratio of 33.37, and the rule illustrated in Figure 2 was accepted as the 189th best transfer rule candidate with a log likelihood ratio of 12.77. An example of a candidate transfer rule that was not accepted is the one that combines the features of the two rules mentioned above, illustrated in Figure 8.</Paragraph> <Paragraph position="1"> This transfer rule candidate had a lower log likelihood ratio of 11.40; consequently, it is only considered after the two rules mentioned above, and since it provides no further improvement upon these two rules, it is filtered out.</Paragraph> <Paragraph position="2"> In an informal inspection of the top 100 accepted transfer rules, we found that most of them appear to be fairly general rules that would normally be found in a general syntactic-based transfer dictionary. 
In looking at the remaining rules, we found that they tended to become increasingly corpus-specific.</Paragraph> </Section> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Initial Evaluation 5.1 Results </SectionTitle> <Paragraph position="0"> In an initial evaluation of our approach, we applied both the baseline transfer dictionary and the induced transfer dictionary (i.e., the baseline transfer dictionary augmented with the transfer rules induced from the training set) to the test half of the 1763 higher quality parse pairs described in Section 3.1, in order to produce two sets of transferred parses, the baseline set and the induced set. For each set, we then calculated tree accuracy recall and precision measures as follows: Tree accuracy recall The tree accuracy recall for a transferred parse and a corresponding target parse is determined by C/Rq, where C is the total number of features (attributes, lexemes and dependency relationships) that are found in both the nodes of the transferred parse and in the corresponding nodes in the target parse, and Rq is the total number of features found in the nodes of the target parse. The correspondence between the nodes of the transferred parse and the nodes of the target parse is determined with alignment information obtained using the technique described in Section 4.1.</Paragraph> <Paragraph position="1"> Tree accuracy precision The tree accuracy precision for a transferred parse and a corresponding target parse is determined by C/Rt, where C is the total number of features (attributes, lexemes and dependency relationships) that are found in both the nodes of the transferred parse and in the corresponding nodes in the target parse, and Rt is the total number of features found in the nodes of the transferred parse.</Paragraph> <Paragraph position="2"> Table 1 shows the tree accuracy results, where the f-score is equally weighted between recall and precision. 
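Treating each parse as a set of node features, the measures defined above can be computed as follows. This is a sketch: the feature encoding is illustrative, and node correspondence is assumed to have been established already by the Section 4.1 alignment.

```python
def tree_accuracy(transferred, target):
    """Recall C/Rq, precision C/Rt and equally weighted f-score, where
    C counts features shared between corresponding nodes, Rq the target
    parse's features and Rt the transferred parse's features."""
    c = len(transferred & target)
    recall = c / len(target)
    precision = c / len(transferred)
    f = (2 * recall * precision / (recall + precision)) if c else 0.0
    return recall, precision, f

# illustrative feature sets mixing lexemes, relations and attributes
r, p, f = tree_accuracy({"look", "II:map", "mood=imp"},
                        {"look", "II:map", "mood=imp", "at"})
# the transferred parse misses the "at" feature: recall 0.75, precision 1.0
```

Averaging these per-parse scores over the test set gives the figures reported in Table 1.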
The results illustrated in Table 1 indicate that the transferred parses obtained using induction were moderately more similar to the target parses than the transferred parses obtained using the baseline transfer, with about 15 percent improvement in the f-score.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 5.2 Discussion </SectionTitle> <Paragraph position="0"> At the time of writing, the improvements in tree accuracy do not yet appear to yield appreciable improvements in realization results. While our syntactic realizer, RealPro, does produce reasonable surface strings from the target dependency trees, despite occasional errors in parsing the target strings and converting the phrase structure trees to dependency trees, it appears that the tree accuracy levels for the transferred parses will need to be higher on average before the improvements in tree accuracy become consistently visible in the realization results. At present, the following three problems represent the most important obstacles we have identified to achieving better end-to-end results: * Since many of the test sentences require transfer rules for which there are no similar cases in the set of training sentences, it appears that the relatively small size of our corpus is a significant barrier to better results.</Paragraph> <Paragraph position="1"> * Some performance problems with the current implementation have forced us to make use of a perhaps overly strict set of alignment and attribute constraints. With an improved implementation, it may be possible to find more valuable rules from the same training data.</Paragraph> <Paragraph position="2"> * A more refined treatment of rule conflicts is needed in order to allow multiple rules to access overlapping contexts, while avoiding the introduction of multiple translations of the same content in certain cases.</Paragraph> </Section> </Section> </Paper>