<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1111">
  <Title>Towards Terascale Knowledge Acquisition</Title>
  <Section position="3" start_page="0" end_page="771" type="intro">
    <SectionTitle>
2 Relevant Work
</SectionTitle>
    <Paragraph position="0"> Previous approaches to extracting is-a relations fall under two categories: pattern-based and co-occurrence-based approaches.</Paragraph>
    <Section position="1" start_page="0" end_page="771" type="sub_section">
      <SectionTitle>
2.1 Pattern-based approaches
</SectionTitle>
      <Paragraph position="0"> Marti Hearst (1992) was the first to use a pattern-based approach to extract hyponym relations from a raw corpus. She used an iterative process to semi-automatically learn patterns. However, a 20 MB corpus yielded only 400 examples.</Paragraph>
      <Paragraph position="1"> Our pattern-based algorithm is very similar to Hearst's. However, she used seed examples to manually discover her patterns, whereas we use a minimal edit distance algorithm to discover the patterns automatically.</Paragraph>
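A minimal sketch (not the authors' implementation) of how edit-distance alignment can generalize two seed sentences into a shared surface pattern: tokens that align are kept, and mismatched tokens are abstracted into wildcard slots. The wildcard symbol and example sentences are illustrative assumptions.

```python
def edit_align(a, b):
    """Standard dynamic-programming edit distance with a backtrace.

    Returns a list of aligned token pairs; a deletion or insertion
    pairs a token with None.
    """
    m, n = len(a), len(b)
    cost = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        cost[i][0] = i
    for j in range(n + 1):
        cost[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = cost[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Backtrace to recover the alignment.
    i, j, pairs = m, n, []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            pairs.append((a[i - 1], b[j - 1])); i -= 1; j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((a[i - 1], None)); i -= 1
        else:
            pairs.append((None, b[j - 1])); j -= 1
    return pairs[::-1]

def generalize(s1, s2, wildcard="*"):
    """Keep matching tokens; collapse everything else into a wildcard."""
    pattern = []
    for x, y in edit_align(s1.split(), s2.split()):
        if x == y:
            pattern.append(x)
        elif not pattern or pattern[-1] != wildcard:
            pattern.append(wildcard)
    return " ".join(pattern)

print(generalize("cities such as Rome", "animals such as dogs"))
# prints "* such as *"
```

From two hyponymy seed sentences, the alignment keeps the shared "such as" anchor and abstracts the varying noun positions, yielding a reusable extraction pattern.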
      <Paragraph position="2">  Riloff and Shepherd (1997) used a semi-automatic, pattern-based method with human supervision to discover similar words from a few seed examples. Berland and Charniak (1999) applied similar pattern-based techniques and other heuristics to extract meronymy (part-whole) relations, reporting about 55% precision on a corpus of 100,000 words. Girju et al. (2003) improved on Berland and Charniak's work using a machine learning filter. Mann (2002) and Fleischman et al. (2003) used part-of-speech patterns to extract a subset of hyponym relations involving proper nouns.</Paragraph>
      <Paragraph position="3"> Our pattern-based algorithm differs from these approaches in two ways: we learn lexico-POS patterns automatically, and the patterns are learned with the specific goal of scaling to the terascale (see Table 2).</Paragraph>
    </Section>
    <Section position="2" start_page="771" end_page="771" type="sub_section">
      <SectionTitle>
2.2 Co-occurrence-based approaches
</SectionTitle>
      <Paragraph position="0"> The second class of algorithms uses co-occurrence statistics (Hindle 1990, Lin 1998). These systems mostly employ clustering algorithms to group words according to their meanings in text. Under the distributional hypothesis (Harris 1985), words that occur in similar grammatical contexts are assumed to be similar in meaning. Curran and Moens (2002) experimented with corpus size and complexity of proximity features in building automatic thesauri. CBC (Clustering by Committee), proposed by Pantel and Lin (2002), achieves high recall and precision in generating similarity lists of words discriminated by their meaning and senses. However, such clustering algorithms fail to name their classes.</Paragraph>
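A minimal sketch of the co-occurrence idea behind these systems, with hypothetical data (the context features and counts are invented, not from the paper): each word is represented by counts of the grammatical contexts it appears in, and two words are compared by the cosine of their context vectors.

```python
import math
from collections import Counter

# Hypothetical word-to-context count vectors; feature names such as
# "obj_of:eat" stand in for grammatical relations found by a parser.
contexts = {
    "apple":  Counter({"obj_of:eat": 12, "mod:red": 5,  "obj_of:buy": 3}),
    "orange": Counter({"obj_of:eat": 9,  "mod:ripe": 4, "obj_of:buy": 2}),
    "car":    Counter({"obj_of:drive": 15, "mod:fast": 6, "obj_of:buy": 4}),
}

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm = lambda w: math.sqrt(sum(c * c for c in w.values()))
    return dot / (norm(u) * norm(v))

print(cosine(contexts["apple"], contexts["orange"]))  # high: shared contexts
print(cosine(contexts["apple"], contexts["car"]))     # low: little overlap
```

Clustering words by such similarities groups "apple" with "orange" rather than "car", which is the basis of the similarity lists these systems produce; as noted above, the clusters themselves carry no name.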
      <Paragraph position="1"> Caraballo (1999) was the first to use clustering for labeling is-a relations using conjunction and apposition features to build noun clusters. Recently, Pantel and Ravichandran (2004) extended this approach by making use of all syntactic dependency features for each noun.</Paragraph>
    </Section>
  </Section>
</Paper>