<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3119">
  <Title>Syntax Augmented Machine Translation via Chart Parsing</Title>
  <Section position="6" start_page="139" end_page="140" type="evalu">
    <SectionTitle>
5 Results
</SectionTitle>
    <Paragraph position="0"> We present results that compare our system against the baseline Pharaoh implementation (Koehn et al., 2003a) and MER training scripts provided for this workshop. Our results represent work done both before and after the submission due date, with the following [...] bank parse categories as nonterminals; rules containing up to 4 nonterminal abstraction sites.</Paragraph>
    <Paragraph position="1"> * SynExt - Syntactic extraction using the extended-category scheme, but with rules only containing up to 2 nonterminal abstraction sites.</Paragraph>
    <Paragraph position="2"> We also explored the impact of longer initial phrases by training another phrase table with phrases up to length 12. Our results are presented in Table 1. While our submission-time system (Syn using LM for rescoring only) shows no improvement over the baseline, we clearly see the impact of integrating the language model into the K-Best list extraction process. Our final system shows a statistically significant improvement over the baseline (0.78 BLEU points is the 95% confidence level). We also see a trend towards improving translation quality as we employ richer extraction techniques.

System                            Dev: w/o LM  Dev: LM-rescoring  Test: LM-r.  Dev: integrated LM  Test: int. LM
Baseline - max. phrase length 7   -            -                  -            31.11               30.61
Lex - max. phrase length 7        27.94        29.39              29.95        28.96               29.12
XCat - max. phrase length 7       27.56        30.27              29.81        30.89               31.01
Syn - max. phrase length 7        29.20        30.95              30.58        31.52               31.31
SynExt - max. phrase length 7     -            -                  -            31.73               31.41
Baseline - max. phrase length 12  -            -                  -            31.16               30.90
Lex - max. phrase length 12       -            -                  -            29.30               29.51
XCat - max. phrase length 12      -            -                  -            30.79               30.59
SynExt - max. phrase length 12    -            -                  -            31.07               31.76

Table 1: [...] parameter tuning) and '06 'Development Test Set' (identical to last year's Shared Task's test set). The system submitted for evaluation is highlighted in bold.</Paragraph>
    <Paragraph position="3"> The relatively poor performance of Lex with LM in K-Best compared to the baseline shows that we are still making search errors during parsing despite tighter integration of the language model.</Paragraph>
    <Paragraph position="4"> We also ran an experiment with CMU's phrase-based decoder (Vogel et al., 2003) using the length-7 phrase table. While its development-set score was only 31.01, the decoder achieved 31.42 on the test set, placing it at the same level as our extended-category system for that phrase table.</Paragraph>
  </Section>
</Paper>