<?xml version="1.0" standalone="yes"?>
<Paper uid="W01-1408">
<Title>An Efficient A* Search Algorithm for Statistical Machine Translation</Title>
<Section position="8" start_page="0" end_page="0" type="evalu">
<SectionTitle> 6 Results </SectionTitle>
<Paragraph position="0"> We present results on the HANSARDS task, which consists of proceedings of the Canadian parliament kept in both French and English. Table 3 shows the details of our training corpus.</Paragraph>
<Paragraph position="1"> We used different test corpora with sentences of length 6-14 words (Table 4).</Paragraph>
<Paragraph position="2"> In all experiments, we use the following two error criteria: WER (word error rate): The WER is computed as the minimum number of substitution, insertion and deletion operations that have to be performed to convert the generated string into the target string.</Paragraph>
<Paragraph position="3"> PER (position-independent word error rate): The word order of a French/English sentence pair can be quite different. As a result, the word order of the automatically generated target sentence can differ from that of the given target sentence and nevertheless be acceptable, so that the WER measure alone could be misleading. To overcome this problem, we introduce the position-independent word error rate (PER) as an additional measure. This measure compares the words in the two sentences without taking the word order into account. (A short illustrative sketch of both criteria is given below, after the discussion of pruning.)</Paragraph>
<Paragraph position="4"> In the following experiments we restricted the maximum number of active search hypotheses in A* search to 1 million. Every hypothesis has an effective memory requirement of about 100 bytes. Therefore, we obtain a dynamic memory requirement of about 100 MB.</Paragraph>
<Paragraph position="5"> In order to speed up the search, we restricted the reordering of words to IBM-style reordering constraints (Berger et al., 1996; Tillmann, 2001). Under this restriction, up to 3 source sentence positions may be skipped and translated later, i.e. during the search process there may be up to 3 uncovered positions to the left of the rightmost covered position in the source sentence. The word error rate does not increase compared to unrestricted reordering, but the search becomes much more efficient.</Paragraph>
<Paragraph position="6"> Table 5 shows how many sentences of different lengths can be translated using beam search and A* with various heuristic functions. The BS approach is able to translate sentences of every length, so its search success rate is 100%. Without any heuristic function, A* is only able to translate all 8-word sentences (under the restriction of at most 1 million hypotheses). Using more sophisticated heuristic functions, we are also able to translate all 10-word sentences with A*.</Paragraph>
<Paragraph position="7"> Table 6 compares the search errors of A* and BS. During the BS search, translation pruning is carried out: the hypotheses are distinguished according to the set of covered positions of the source sentence, and for every such set the best score over all hypotheses is computed. Only those hypotheses are kept whose score is greater than this best score multiplied by a threshold. We chose the threshold to be 2.5, 5.0, 7.5 and 10.0 (see Table 6).</Paragraph>
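The per-coverage-set pruning just described can be pictured with a small Python sketch. This is illustrative only, not the implementation used in the paper: the hypothesis representation (a coverage set plus a log-probability score) and the concrete reading of the threshold test (keep a hypothesis if its score is greater than the best score of its coverage set multiplied by the threshold) are assumptions made for the example.

from collections import defaultdict

def prune_hypotheses(hypotheses, threshold):
    # Group hypotheses by the set of covered source positions; within each
    # group, keep only those whose score is greater than the best score
    # multiplied by the threshold (scores taken to be log-probabilities,
    # so larger, i.e. less negative, is better).
    by_coverage = defaultdict(list)
    for coverage, score in hypotheses:
        by_coverage[coverage].append((coverage, score))

    kept = []
    for group in by_coverage.values():
        best = max(score for _, score in group)
        kept.extend(h for h in group if h[1] > best * threshold)
    return kept

# Toy usage: two hypotheses share a coverage set; the much worse one is pruned.
hyps = [(frozenset({0, 1}), -4.0),
        (frozenset({0, 1}), -30.0),
        (frozenset({0}), -2.0)]
print(prune_hypotheses(hyps, threshold=5.0))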
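For concreteness, here is a minimal sketch of the two error criteria defined above. It is not the evaluation code behind the reported numbers; the function names are ours, WER is computed as a plain Levenshtein word edit distance, and the PER variant shown is one common bag-of-words formulation, since the exact normalisation is not spelled out here.

from collections import Counter

def word_error_count(hyp, ref):
    # Minimum number of substitutions, insertions and deletions needed to
    # turn the generated word sequence into the reference (Levenshtein).
    d = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        prev, d[0] = d[0], i
        for j, r in enumerate(ref, start=1):
            cur = min(d[j] + 1,          # drop the hypothesis word
                      d[j - 1] + 1,      # insert the reference word
                      prev + (h != r))   # substitute (or match for free)
            prev, d[j] = d[j], cur
    return d[len(ref)]

def position_independent_error_count(hyp, ref):
    # Compare the two sentences as bags of words, ignoring word order:
    # every word that cannot be matched counts as one error.
    matches = sum((Counter(hyp) & Counter(ref)).values())
    return max(len(hyp), len(ref)) - matches

# A reordered but otherwise correct output: WER penalises it, PER does not.
hyp = "the cat sat on mat the".split()
ref = "the cat sat on the mat".split()
print(word_error_count(hyp, ref))                  # 2
print(position_independent_error_count(hyp, ref))  # 0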
<Paragraph position="8"> For A* we never observe any search errors. In the case of the admissible heuristic functions, this is guaranteed by the approach. As can be seen from Table 6, the BS algorithm with a large beam rarely produces search errors.</Paragraph>
<Paragraph position="9"> Table 7 compares the translation efficiency of the various search algorithms. We see that beam search, even with a very large beam that produces only very few search errors, is much more efficient than the A* search algorithm used here.</Paragraph>
<Paragraph position="10"> Table 8 contains a comparison of the translation quality of A* and BS on the T6, T8, T10 and T12 test corpora. For A*, we use the E+ rest cost estimation, as this gives optimal results. Of the 200 sentences in these test corpora, we can translate 192 sentences under the 1 million hypotheses constraint. For the remaining sentences we performed a search with 4 million hypotheses (cf. below), which led to success for all the 12-word sentences.</Paragraph>
<Paragraph position="11"> The number of hypotheses in A* search: We restricted the maximum number of hypotheses to 1 million. This was sufficient for translating 10-word sentences, as the search success rates in Table 5 show. For longer sentences it is necessary to allow a larger number of hypotheses. For the sentences of lengths 12 and 14, we performed an A* search (E+) with 2, 4 and 8 million possible hypotheses. The search success rates for those searches are given in Table 9. We see a significant effect on the number of successful searches.</Paragraph>
</Section>
</Paper>