File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/98/p98-1042_concl.xml

Size: 1,550 bytes

Last Modified: 2025-10-06 13:58:04

<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1042">
  <Title>An Experiment in Hybrid Dictionary and Statistical Sentence Alignment</Title>
  <Section position="8" start_page="272" end_page="272" type="concl">
    <SectionTitle>
6 Conclusion
</SectionTitle>
    <Paragraph position="0"> The assumption that a partial word-level alignment derived from lexical correspondences can clearly indicate the full sentence alignment is flawed when the texts contain many sentences with similar vocabulary. This is the case with the news stories used in our experiments: even technical vocabulary and proper nouns do not adequately discriminate between alternative alignment choices, because the vocabulary range within a news article is not large. Moreover, the basic assumption of the lexical approach, that the coverage of the bilingual dictionary is adequate, cannot be relied on if we require robustness. These limitations show the need for a hybrid model.</Paragraph>
    <Paragraph position="1"> For our corpus of newspaper articles, the hybrid model has been shown to clearly improve sentence alignment results compared with the pure models used separately. In the future we would like to extend the lexical model by incorporating term weighting methods from information retrieval, such as inverse document frequency, which may help to identify the more important terms for matching. To test the generalisability of our method, we also want to extend our investigation to parallel corpora in other domains.</Paragraph>
  </Section>
</Paper>