<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0808">
  <Title>A hybrid approach to align sentences and words in English-Hindi parallel corpora</Title>
  <Section position="7" start_page="62" end_page="62" type="concl">
    <SectionTitle>
5 Future works
</SectionTitle>
    <Paragraph position="0"> It would be useful to evaluate each stage of the word alignment algorithm (i.e. DL, TS, EEW and the Nearest Aligned Neighbours approach) separately. We aim to do this as part of a failure analysis of the algorithm in the future.</Paragraph>
    <Paragraph position="1"> We also aim to improve our alignment results by using Part-of-Speech information for the English texts. We aim to implement or adopt local word grouping rules for the English texts and to improve our existing word grouping rules for the Hindi texts. The Nearest Aligned Neighbours approach currently only suggests possible alignments; we are trying to integrate statistical ranking algorithms so that it suggests more reliable alignment pairs.</Paragraph>
    <Paragraph position="2"> Yarowsky et al. (2001) introduced a method for developing a Part-of-Speech tagger by projecting tags across aligned corpora. They used the projected tags as training data for a supervised learning technique to acquire a French Part-of-Speech tagger. We aim to use our English-Hindi word alignment results in the same way to bootstrap a Part-of-Speech tagger for the Hindi language.</Paragraph>
  </Section>
</Paper>