<?xml version="1.0" standalone="yes"?> <Paper uid="P02-1052"> <Title>Using Similarity Scoring To Improve the Bilingual Dictionary for Word Alignment</Title> <Section position="7" start_page="0" end_page="0" type="concl"> <SectionTitle> 7 Conclusions and Future Work </SectionTitle> <Paragraph position="0"> We have demonstrated how rebuilding a dictionary can improve the performance (both precision and recall) of a word alignment algorithm. The algorithm proved robust across baseline dictionaries and a range of parameter settings. Although a small test set was used, the improvements are statistically significant for various parameter settings. We have also shown that similarity scores computed for pairs of words can be used to cluster morphological variants of words in an inflected language such as French.</Paragraph> <Paragraph position="1"> It will be interesting to see how the similarity and clustering method works in conjunction with other word alignment algorithms, as the dictionary rebuilding algorithm is independent of the actual word alignment method used.</Paragraph> <Paragraph position="2"> Furthermore, we plan to explore ways to improve the similarity scoring algorithm. For instance, we can assign a lower match score when two characters are not identical but belong to the same equivalence class. The equivalence classes will depend on the target language at hand. In German, for example, a and ä will be assigned to the same equivalence class, because some inflections cause a to become ä. An improved similarity scoring algorithm may in turn result in improved word alignments.</Paragraph> <Paragraph position="3"> In general, we hope to move automated dictionary extraction away from pure surface form statistics and toward dictionaries that are more linguisti-</Paragraph> </Section> </Paper>