<?xml version="1.0" standalone="yes"?>
<Paper uid="W01-1404">
  <Title>Approximating Context-Free by Rational Transduction for Example-Based MT</Title>
  <Section position="7" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
6 Experiments
</SectionTitle>
    <Paragraph position="0"> We have investigated a corpus of English/Japanese sentence pairs, related by hierarchical alignment (see also (Bangalore and Riccardi, 2001)). We have taken the first 500, 1000, 1500, . . . aligned sentence pairs from this corpus to act as training corpora of varying sizes; we have taken 300 other sentence pairs to act as test corpus.</Paragraph>
    <Paragraph position="1"> We have constructed a bilexical transduction grammar from each training corpus, in the form of a context-free grammar, and this grammar was approximated by a finite automaton. The input sentences from the test corpus were then processed by context-free and finite-state machinery (in the sequel referred to by cfg and fa, respectively). We have also carried out experiments with robust finite-state processing, as discussed in Section 5, which is referred to by robust fa. If we append 2 after a tag, this mean that a133a68a134a14a136a6a137 a2 a42 a7 a42a98a97 a20 a23 a106 a48a3a9a5a8a74 a42 a7a19a3a11a10a14a74 a42a98a97a144a49 , otherwise  The reorder operators from the resulting output strings were applied in a robust way as explained in Section 5. The output strings were then compared to the reference output from the corpus, resulting in Figure 1. Our metric is word accuracy, which is based on edit distance. For a pair of strings, the edit distance is defined as the minimum number of substitutions, insertions and deletions needed to turn one string into the other.</Paragraph>
    <Paragraph position="2"> The word accuracy of a string a3 with regard to a string a173 is defined to be a30a13a83a175a174 a72 , where a125 is the edit distance between a3 and a173 and a22 is the length of a173 .</Paragraph>
    <Paragraph position="3"> To allow a comparison with more established techniques (see e.g. (Bangalore and Riccardi, 2001)), we also take into consideration a simple bigram model, trained on the strings comprising both source and target sentences and reorder operators, as explained in Section 4. For the purposes of predicting output symbols, a series of consecutive target symbols and reorder operators following a source symbol in the training sentences are treated as a single symbol by the bigram model, and only those may be output after that source symbol. Since our construction is such that target symbols always follow source symbols they are a translation of (according to the automatically obtained hierarchical alignment), this modification to the bigram model prevents output of totally unrelated target symbols that could otherwise result from a standard bigram model. It also ensures that a bounded number of output symbols per input symbol are produced.</Paragraph>
    <Paragraph position="4"> The fraction of sentences that were transduced (i.e. that were accepted by the grammar or the automaton), is indicated in Figure 2. Since robust fa(2) and bigram are able to transduce all input, they are not represented here. Note that the average word accuracy is computed only with respect to the sentences that could be transduced, which explains the high accuracy for small training corpora in the cases of cfg(2) and fa(2), where the few sentences that can be transduced are mostly short and simple.</Paragraph>
    <Paragraph position="5"> Figure 3 presents the time consumption of transduction for the entire test corpus. These data support our concerns about the high costs of context-free processing, even though our parser relies heavily on lexicalization.4 Figure 4 shows the sizes of the automata after determinization and minimization. Determinization for the largest automata indicated in the Figure took more than 24 hours for both fa(2) and robust fa(2) , which suggests these methods become unrealistic for training corpus sizes considerably larger than 10,000 bitexts.</Paragraph>
  </Section>
class="xml-element"></Paper>