<?xml version="1.0" standalone="yes"?> <Paper uid="W95-0115"> <Title>Automatic Evaluation and Uniform Filter Cascades for Inducing N-Best Translation Lexicons</Title> <Section position="10" start_page="196" end_page="196" type="concl"> <SectionTitle> 6 CONCLUSIONS </SectionTitle> <Paragraph position="0"> The research presented here makes several contributions to research in machine translation and related fields:
* a uniform framework for combining various data filters with statistical methods for inducing N-best translation lexicons,
* an automatic evaluation method for translation lexicons which obviates the need for labor-intensive subjective evaluation by human judges,
* four different ways to improve statistical translation models,
* a demonstration of how tiny training corpora can be enhanced with non-statistical knowledge sources to induce better lexicons than unenhanced training corpora many times the size.
The effectiveness of different data filters for inducing translation lexicons crucially depends on the particular pair of languages under consideration. Cognates are more common, and therefore more useful, in languages that are more closely related. For example, one would expect to find more cognates between Russian and Ukrainian than between French and English. The implementation of a part-of-speech filter for a given pair of languages depends on the availability of part-of-speech taggers for both languages, where the two taggers share a small common tag set. The effectiveness of oracle filters based on MRBDs will depend on the extent to which the vocabulary of the MRBD intersects with the vocabulary of the training text. This, in turn, depends partly on the size of the MRBD. Filters based on word alignment patterns will only be as good as the model of typical word alignments between the pair of languages in question. For languages with very similar syntax, a linear model will suffice.
Higher-order models will be required for a pair of languages like English and Japanese.</Paragraph> <Paragraph position="1"> For the case of French and English, each of the presented filters makes a significant improvement over the baseline model. Taken together, the filters produce models which approach human performance. These conclusions could not have been drawn without a uniform framework for filter comparison or without a technique for automatic evaluation. An automatic evaluation technique such as BiBLE should be used to gauge the effectiveness of any MT system which has a lexical transfer component. BiBLE's objective criterion is quite simple, with the drawback that it gives no indication of what kinds of errors exist in the lexicon being evaluated. Even so, given a test corpus of reasonable size, it can detect very small differences in quality between two N-best translation lexicons. For example, BiBLE evaluations were used to find the precise optimum value for the LCSR cut-off in the Cognate Filter. BiBLE also helped to select the optimum tag set for the POS Filter. This kind of automatic quality control is indispensable for an engineering approach to better machine translation.</Paragraph> </Section> </Paper>