<?xml version="1.0" standalone="yes"?> <Paper uid="P01-1027"> <Title>Refined Lexicon Models for Statistical Machine Translation using a Maximum Entropy Approach</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Typically, the lexicon models used in statistical machine translation systems are only single-word based, that is, one word in the source language corresponds to only one word in the target language.</Paragraph> <Paragraph position="1"> Those lexicon models lack context information that can be extracted from the same parallel corpus. This additional information could be: • Simple context information: information about the words surrounding the word pair; • Syntactic information: part-of-speech information, syntactic constituent, sentence mood; • Semantic information: disambiguation information (e.g. from WordNet), current/previous speech or dialog act.</Paragraph> <Paragraph position="2"> To include this additional information within the statistical framework, we use the maximum entropy approach. This approach has been applied in natural language processing to a variety of tasks. Berger et al. (1996) apply this approach to the so-called IBM Candide system to build context-dependent models, compute automatic sentence splitting, and improve word reordering in translation. Similar techniques are used in (Papineni et al., 1996; Papineni et al., 1998) for so-called direct translation models instead of those proposed in (Brown et al., 1993). Foster (2000) describes two methods for incorporating information about the relative position of bilingual word pairs into a maximum entropy translation model.</Paragraph> <Paragraph position="3"> Other authors have applied this approach to language modeling (Rosenfeld, 1996; Martin et al., 1999; Peters and Klakow, 1999). 
A short review of the maximum entropy approach is given in Section 3.</Paragraph> </Section> </Paper>