<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-1024">
  <Title>Syntax-based Alignment of Multiple Translations: Extracting Paraphrases and Generating New Sentences</Title>
  <Section position="6" start_page="12" end_page="12" type="concl">
    <SectionTitle>
5 Conclusion &amp; Future Work
</SectionTitle>
    <Paragraph position="0"> In this paper, we presented a new syntax-based algorithm that learns paraphrases from a newly available dataset.</Paragraph>
    <Paragraph position="1"> The multiple translation corpus that we use in this paper is the first in a series of similar corpora that are built and made publicly available by LDC in the context of a series of DARPA-sponsored MT evaluations. The algorithm we proposed constructs finite-state representations of paraphrases that are useful in many contexts: to induce large lists of lexical and structural paraphrases; to generate semantically equivalent renderings of a given meaning; and to estimate the quality of machine translation systems. More experiments are needed to assess extrinsically whether the FSAs we produce can be used to yield higher agreement scores between human and automatic assessments of translation quality.</Paragraph>
    <Paragraph position="2"> In future work, we wish to experiment with more flexible merging algorithms and to better integrate the top-down and bottom-up processes that are used to induce FSAs. We also wish to extract more abstract paraphrase patterns from the current representation. Such patterns are more likely to be reused, which would help us obtain reliable statistics for them in the extraction phase, and they also have a better chance of being applicable to unseen data.</Paragraph>
  </Section>
</Paper>