File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/03/n03-1036_abstr.xml

Size: 1,164 bytes

Last Modified: 2025-10-06 13:42:48

<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-1036">
  <Title>Unsupervised methods for developing taxonomies by combining syntactic and statistical information</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> This paper describes an unsupervised algorithm for placing unknown words into a taxonomy and evaluates its accuracy on a large and varied sample of words. The algorithm works by first using a large corpus to find semantic neighbors of the unknown word, which we accomplish by combining latent semantic analysis with part-of-speech information. We then place the unknown word in the part of the taxonomy where these neighbors are most concentrated, using a class-labelling algorithm developed especially for this task. This method is used to reconstruct parts of the existing WordNet database, obtaining results for common nouns, proper nouns and verbs. We evaluate the contribution made by part-of-speech tagging and show that automatic filtering using the class-labelling algorithm gives a fourfold improvement in accuracy.</Paragraph>
  </Section>
</Paper>