<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1144">
  <Title>Concept Discovery from Text</Title>
  <Section position="9" start_page="13403" end_page="13403" type="concl">
    <SectionTitle>
7 Conclusion
</SectionTitle>
    <Paragraph position="0"> We presented a clustering algorithm, CBC, for automatically discovering concepts from text. It can handle a large number of elements, a large number of output clusters, and a large sparse feature space. It discovers clusters using well-scattered tight clusters called committees. In our experiments, we showed that CBC outperforms several well-known hierarchical, partitional, and hybrid clustering algorithms in cluster quality.</Paragraph>
    <Paragraph position="1"> For example, in one experiment, CBC outperforms K-means by 4.25%.</Paragraph>
    <Paragraph position="2"> By comparing the CBC clusters with WordNet classes, we find not only errors in CBC but also oversights in WordNet.</Paragraph>
    <Paragraph position="3"> Evaluating cluster quality has always been a difficult task. We presented a new evaluation methodology that is based on the editing distance between output clusters and classes extracted from WordNet (the answer key).</Paragraph>
  </Section>
</Paper>