File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/06/w06-3803_abstr.xml

Size: 1,068 bytes

Last Modified: 2025-10-06 13:45:47

<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3803">
  <Title>Graph-Based Text Representation for Novelty Detection</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> We discuss several feature sets for novelty detection at the sentence level, using the data and procedure established in task 2 of the TREC 2004 novelty track.</Paragraph>
    <Paragraph position="1"> In particular, we investigate feature sets derived from graph representations of sentences and sets of sentences. We show that a highly connected graph, built from sentence-level term distances and pointwise mutual information, can serve as a source of features for novelty detection. We compare several feature sets based on such a graph representation. These feature sets allow us to increase the accuracy of an initial novelty classifier based on a bag-of-words representation and KL divergence. The final result ties with the best system at TREC 2004.</Paragraph>
  </Section>
</Paper>