<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-0107">
  <Title>Latent Features in Automatic Tense Translation between Chinese and English Yang Ye, Victoria Li Fossum</Title>
  <Section position="3" start_page="0" end_page="48" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Language speakers make two types of distinctions about temporal relations. The first type is based on precedence between events and can be expanded into a finer-grained taxonomy, as proposed by Allen (1981). The second type is based on the relative positioning of three time parameters proposed by Reichenbach (1947): speech time (S), event time (E) and reference time (R). Over the past couple of decades, the NLP community has shown growing interest in the first type of temporal relation. In the cross-lingual context, the first type of relation can be projected easily across a language pair, whereas the second type is often hard to project. Despite this challenge, cross-lingual temporal reference distinction has been poorly explored.</Paragraph>
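    As a rough illustration of the second type of relation, the following Python sketch names a temporal configuration from the relative order of E, R and S. The function names, the label vocabulary ("anterior", "simple", "posterior") and the use of integers as time points are assumptions made for this example; they are not taken from the paper's own tense taxonomy or feature set.

```python
# Illustrative sketch only: names, labels and numeric time points are assumptions.

def order(a, b):
    """Return -1 if a precedes b, 0 if they coincide, 1 if a follows b."""
    return (a > b) - (b > a)

# Position of reference time R relative to speech time S.
R_VS_S = {-1: "past", 0: "present", 1: "future"}
# Position of event time E relative to reference time R.
E_VS_R = {-1: "anterior", 0: "simple", 1: "posterior"}

def reichenbach_label(e, r, s):
    """Name the configuration of event (e), reference (r) and speech (s) times."""
    return E_VS_R[order(e, r)] + " " + R_VS_S[order(r, s)]

# "She had left": E before R, R before S.
print(reichenbach_label(1, 2, 3))  # anterior past
# "She left": E and R coincide, both before S.
print(reichenbach_label(2, 2, 3))  # simple past
# "She has left": E before R, R coincides with S.
print(reichenbach_label(1, 3, 3))  # anterior present
```

    The label depends on where R sits, which is why a shift in reference time changes the tense even when E and S stay fixed.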
    <Paragraph position="1"> Languages vary in the granularity of their tense and aspect representations; some have finer-grained tenses or aspects than others. Tense generation and tense understanding in natural language texts are highly dynamic, context-dependent processes, since any previously established time point or interval, whether explicitly mentioned in the context or not, could serve as the reference time for the event in question. Bruce (1972) captures this property of temporal reference organization in discourse through a multiple temporal reference model. He defines a set (S</Paragraph>
    <Paragraph position="3"> ..., n-1) stand for a sequence of time references from which the reference time of a particular event could come. Because reference time shifts are elusive, it is extremely hard to model the reference time point directly in temporal information processing. These considerations motivate classifying temporal reference distinctions automatically with machine learning algorithms such as Conditional Random Fields (CRFs).</Paragraph>
    <Paragraph position="4"> Many researchers in Natural Language Processing seem to believe that an automatic system does not have to follow the mechanisms of the human brain in order to optimize its performance; for example, the feature space of an automatic classification system does not have to replicate the knowledge sources that human beings use. Very little research has tried to test this belief. The current work attempts to identify which features are most important for tense generation in the Chinese-to-English translation scenario, which can point to directions for future research on automatic tense translation between Chinese and English.</Paragraph>
    <Paragraph position="5"> The remainder of the paper is organized as follows. Section 2 summarizes significant related work on temporal information annotation and points out how this study relates to and differs from it. Section 3 formally defines the problem and the tense taxonomy and introduces the data.</Paragraph>
    <Paragraph position="6"> Section 4 discusses the feature space and proposes the latent features for the tense classification task. Section 5 presents the classification experiments with Conditional Random Fields and classification trees and reports the evaluation results. Section 6 concludes the paper, and Section 7 points out directions for future research.</Paragraph>
  </Section>
</Paper>