<?xml version="1.0" standalone="yes"?>
<Paper uid="N04-1023">
  <Title>Discriminative Reranking for Machine Translation</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Ranking and Reranking
2.1 Reranking for NLP tasks
</SectionTitle>
    <Paragraph position="0"> Like machine translation, parsing is another field of natural language processing in which generative models have been widely used. In recent years, reranking techniques, especially discriminative reranking, have resulted in significant improvements in parsing. Various machine learning algorithms have been employed in parse reranking, such as Boosting (Collins, 2000), Perceptron (Collins and Duffy, 2002) and Support Vector Machines (Shen and Joshi, 2003). The reranking techniques have resulted in a 13.5% error reduction in labeled recall/precision over the previous best generative parsing models. Discriminative reranking methods for parsing typically use the notion of a margin as the distance between the best candidate parse and the rest of the parses. The reranking problem is reduced to a classification problem by using pairwise samples. null In (Shen and Joshi, 2004), we have introduced a new perceptron-like ordinal regression algorithm for parse reranking. In that algorithm, pairwise samples are used for training and margins are defined as the distance between parses of different ranks. In addition, the uneven margin technique has been used for the purpose of adapting ordinal regression to reranking tasks. In this paper, we apply this algorithm to MT reranking, and we also introduce a new perceptron-like reranking algorithm for MT.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Ranking and Ordinal Regression
</SectionTitle>
      <Paragraph position="0"> In the field of machine learning, a class of tasks (called ranking or ordinal regression) are similar to the reranking tasks in NLP. One of the motivations of this paper is to apply ranking or ordinal regression algorithms to MT reranking. In the previous works on ranking or ordinal regression, the margin is defined as the distance between two consecutive ranks. Two large margin approaches have been used. One is the PRank algorithm, a variant of the perceptron algorithm, that uses multiple biases to represent the boundaries between every two consecutive ranks (Crammer and Singer, 2001; Harrington, 2003). However, as we will show in section 3.7, the PRank algorithm does not work on the reranking tasks due to the introduction of global ranks. The other approach is to reduce the ranking problem to a classification problem by using the method of pairwise samples (Herbrich et al., 2000). The underlying assumption is that the samples of consecutive ranks are separable. This may become a problem in the case that ranks are unreliable when ranking does not strongly distinguish between candidates. This is just what happens in reranking for machine translation.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>