File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/06/w06-1204_abstr.xml
Size: 1,103 bytes
Last Modified: 2025-10-06 13:45:22
<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1204"> <Title>Using Information about Multi-word Expressions for the Word-Alignment Task</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> It is well known that multi-word expressions are problematic in natural language processing. Previous literature has suggested that information about their degree of compositionality can help in various applications, but this has not been demonstrated empirically. In this paper, we propose a framework in which information about multi-word expressions can be used in the word-alignment task. We show that even simple features such as point-wise mutual information are useful for the word-alignment task on English-Hindi parallel corpora. The alignment error rate we achieve (AER = 0.5040) is significantly better (about a 10% decrease in AER) than that of the state-of-the-art models (Och and Ney, 2003; best AER = 0.5518) on the English-Hindi dataset.</Paragraph> </Section> </Paper>