<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1610">
  <Title>Optimizing Synonym Extraction Using Monolingual and Bilingual Resources</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> This paper addresses the problem of extracting synonymous English words (synonyms) from multiple resources: a monolingual dictionary, a parallel bilingual corpus, and a monolingual corpus. The extracted synonyms can be used in a number of NLP applications. In information retrieval and question answering, synonymous words are employed to bridge the expression gap between the query space and the document space (Mandala et al., 1999; Radev et al., 2001; Kiyota et al., 2002). In automatic text summarization, synonymous words are employed to identify repeated information in order to avoid redundant content in a summary (Barzilay and Elhadad, 1997). In language generation, synonyms are employed to create more varied texts (Langkilde and Knight, 1998).</Paragraph>
    <Paragraph position="1"> To our knowledge, few studies have investigated the combination of different resources for synonym extraction. However, many studies investigate synonym extraction from a single resource. The most frequently used resource for synonym extraction is large monolingual corpora (Hindle, 1990; Crouch and Yang, 1992; Grefenstette, 1994; Park and Choi, 1997; Lin, 1998; Gasperin et al., 2001). These methods use the contexts around the investigated words to discover synonyms. Their problem is that the precision of the extracted synonymous words is low, because they extract many word pairs such as &quot;cat&quot; and &quot;dog&quot;, which are similar but not synonymous. Other resources are also used for synonym extraction. Barzilay and McKeown (2001) and Shimohata and Sumita (2002) used bilingual corpora to extract synonyms. However, these methods can only extract synonyms that occur in the bilingual corpus. Thus, the extracted synonyms are limited.</Paragraph>
    <Paragraph position="2"> In addition, Blondel and Senellart (2002) used monolingual dictionaries to extract synonyms. Although the precision of this method is high, the coverage is low because the result of this method depends heavily on the definitions of words.</Paragraph>
    <Paragraph position="3"> In order to improve the performance of synonym extraction, Curran (2002) used an ensemble method to combine the results of different methods using a monolingual corpus. Although Curran (2002) showed that the ensemble extractors outperformed the individual extractors, the ensemble still cannot overcome the deficiency of methods that rely on a monolingual corpus alone.</Paragraph>
    <Paragraph position="4"> To overcome the deficiencies of methods that use only one resource, our approach combines both monolingual and bilingual resources to automatically extract synonymous words. By merging the synonyms extracted by the individual extractors from the three resources, our approach draws on the merits of each extractor to improve the performance of synonym extraction.</Paragraph>
    <Paragraph position="5"> In fact, our approach can be considered an ensemble of different resources for synonym extraction. Experimental results show that the three resources are complementary to each other for synonym extraction, and that the ensemble method we used is effective in improving both the precision and the recall of the extracted synonyms.</Paragraph>
    <Paragraph position="6"> The remainder of this paper is organized as follows. The next section presents our approach for synonym extraction. Section 3 describes an implementation of the three individual extractors.</Paragraph>
    <Paragraph position="7"> Section 4 presents the evaluation results. Section 5 discusses our method. In the last section, we draw conclusions from this work.</Paragraph>
  </Section>
</Paper>