<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-3007">
  <Title>Exploiting Aggregate Properties of Bilingual Dictionaries For Distinguishing Senses of English Words and Inducing English Sense Clusters</Title>
  <Section position="8" start_page="0" end_page="0" type="concl">
    <SectionTitle>
7 Analysis and Conclusions
</SectionTitle>
    <Paragraph position="0"> This is the first presentation of a novel method for the induction of word sense inventories, which makes use of aggregate information from a large collection of bilingual dictionaries.</Paragraph>
    <Paragraph position="1"> One possible application of the induced sense inventories presented here is as an aid to the manual construction of monolingual dictionaries or thesauri, motivated by translation distinctions across numerous world languages. While the desired granularity of sense distinction will vary with taste and with the requirements of different applications, our output, treated as a proposal to be assessed and manually modified, would be a valuable labor-saving tool for lexicographers.</Paragraph>
    <Paragraph position="2"> Another application of this work is as a supplemental resource for statistical machine translation (SMT). It is possible, as shown graphically in Figure 4, to recover the foreign words associated with a cluster (not just with a single word). Given that the clusters provide more complete coverage of English word types for a given sense than the English side of a particular bilingual dictionary, clusters could be used to unify bitext co-occurrence counts of foreign words with English senses in a way that typical bilingual dictionaries cannot. Unifying counts in this way would help reduce data sparsity in SMT training.</Paragraph>
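    <Paragraph> The count unification described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure used in this work: the function name unify_counts, the cluster identifiers, and the toy counts are all hypothetical, and the rule that a word belonging to several clusters splits its count evenly among them is an assumption made purely to keep the example simple.

```python
from collections import Counter, defaultdict


def unify_counts(cooc, clusters):
    """Pool (foreign, english) co-occurrence counts into (foreign, cluster) counts."""
    # Invert the hypothetical cluster table: English word -> list of cluster ids.
    word_to_clusters = defaultdict(list)
    for cluster_id, words in clusters.items():
        for word in words:
            word_to_clusters[word].append(cluster_id)

    pooled = Counter()
    for (foreign, english), count in cooc.items():
        ids = word_to_clusters.get(english, [])
        # Assumption: a word found in several clusters splits its count evenly.
        # Words found in no cluster are skipped.
        for cluster_id in ids:
            pooled[(foreign, cluster_id)] += count / len(ids)
    return pooled


# Toy data, invented for illustration only.
clusters = {
    "sense_A": {"bank", "shore"},
    "sense_B": {"bench", "seat"},
}
cooc = {
    ("rive", "bank"): 3,
    ("rive", "shore"): 2,
    ("banc", "bench"): 4,
    ("banc", "seat"): 1,
}

pooled = unify_counts(cooc, clusters)
print(dict(pooled))
```

In this toy setting, counts that were spread over several English words are collected under a single cluster, so each foreign word is paired with one larger count per sense rather than several small counts per word.</Paragraph>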
    <Paragraph position="3"> Finally, evaluation of induced sense taxonomies is always problematic. First of all, there is no agreed &quot;correct&quot; way to classify the possible senses of a particular word. To some degree this is because human experts disagree on particular classification judgments, though a larger issue, as pointed out in Resnik and Yarowsky (1997), is that what constitutes an appropriate set of sense distinctions for a word is, emphatically, a function of the task at hand. The sense-distinction requirements of English-to-French machine translation differ from those of English-to-Arabic machine translation (due to differing degrees of parallel polysemy across the language pairs), and both differ from those of English dictionary construction.</Paragraph>
    <Paragraph position="4"> We believe that the translingually motivated word-sense taxonomies developed here will prove useful for a variety of tasks, including those mentioned above. The fact that they are derived from a novel resource, neither constructed explicitly by humans nor derived in fully unsupervised fashion from text corpora, makes them worthy of study and incorporation in future lexicographic, machine translation, and word sense disambiguation efforts.</Paragraph>
  </Section>
</Paper>