<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1156"> <Title>Knowledge Intensive Word Alignment with KNOWA</Title> <Section position="7" start_page="4" end_page="4" type="concl"> <SectionTitle> 6 Conclusion </SectionTitle> <Paragraph position="0"> In this paper we compared the performance of two word aligners, one based exclusively on statistical principles and the other relying heavily on linguistic resources. Statistics-based algorithms are appealing because they are language independent and need only a parallel corpus of reasonable size for training. We have shown, however, that in practice the lack of parallel corpora with the necessary characteristics can hamper the performance of statistical algorithms. In such cases, an algorithm based on linguistic resources, where these are available, can outperform a statistics-based algorithm.</Paragraph> <Paragraph position="1"> Also, knowledge-intensive word aligners may be more effective when word alignment is needed for special purposes such as annotation transfer from one language to another. This is the case, for instance, in the MultiSemCor project, where, in addition to better precision and recall, a dictionary-based word aligner such as KNOWA has the advantage that it will fail to align words that are not synonyms. The alignment of non-synonymous translation equivalents, which are rarely found in bilingual dictionaries, is usually a strength of corpus-based word aligners, but it turns out to be a disadvantage in the MultiSemCor case, where aligning non-synonymous words causes wrong word sense annotations to be transferred from one language to the other.</Paragraph> </Section> </Paper>