<?xml version="1.0" standalone="yes"?>
<Paper uid="P02-1038">
<Title>Discriminative Training and Maximum Entropy Models for Statistical Machine Translation</Title>
<Section position="7" start_page="0" end_page="0" type="relat">
<SectionTitle>6 Related Work</SectionTitle>
<Paragraph position="0">The use of direct maximum entropy translation models for statistical machine translation has been suggested by (Papineni et al., 1997; Papineni et al., 1998). [Figure caption: ... maximum entropy training for alignment templates; λ1: trigram language model; λ2: alignment template model; λ3: lexicon model; λ4: alignment model (normalized such that Σ_{m=1}^{4} λm = 4).]</Paragraph>
<Paragraph position="1">They train models for natural language understanding rather than for natural language translation. In contrast to their approach, we include a dependence on the hidden variable of the translation model in the direct translation model. We are therefore able to use statistical alignment models, which have been shown to be a very powerful component of statistical machine translation systems.</Paragraph>
<Paragraph position="2">In speech recognition, training the parameters of the acoustic model by optimizing the (average) mutual information and conditional entropy, as defined in information theory, is a standard approach (Bahl et al., 1986; Ney, 1995). Combining various probabilistic models for speech and language modeling has been suggested in (Beyerlein, 1997; Peters and Klakow, 1999).</Paragraph>
</Section>
</Paper>