File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/04/w04-1113_abstr.xml

Size: 1,477 bytes

Last Modified: 2025-10-06 13:43:48

<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1113">
  <Title>Using Synonym Relations In Chinese Collocation Extraction</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> A challenging task in Chinese collocation extraction is to improve both precision and recall. Most lexical statistical methods, including Xtract, cannot extract collocations whose frequencies fall below a given threshold. This paper presents a method in which HowNet is used to find synonyms by means of a similarity function. Based on this synonym information, we have successfully extracted synonymous collocations that normally cannot be extracted using the lexical statistical approach. We applied synonym mapping to each headword to extract more synonymous word bi-grams. Our evaluation over a 60MB tagged corpus shows that we can extract synonymous collocations that occur with very low frequency, sometimes even collocations that occur only once in the training set.</Paragraph>
    <Paragraph position="1"> Compared with a collocation extraction system based on Xtract, our method reaches a precision rate of 43% on word bi-grams for a set of 9 headwords, an almost 50% improvement over the precision rate of 30% in the Xtract system.</Paragraph>
    <Paragraph position="2"> Furthermore, the method improves the recall rate of word bi-gram collocation extraction by 30%.</Paragraph>
  </Section>
</Paper>