XML Viewer - p06-1076

File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/06/p06-1076_abstr.xml

Size: 1,401 bytes

Last Modified: 2025-10-06 13:44:59

<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1076">
  <Title>Sydney, July 2006. © 2006 Association for Computational Linguistics. A Comparison of Document, Sentence, and Term Event Spaces</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> The trend in information retrieval systems is from document to sub-document retrieval, such as sentences in a summarization system and words or phrases in a question-answering system. Despite this trend, systems continue to model language at a document level using the inverse document frequency (IDF). In this paper, we compare and contrast IDF with inverse sentence frequency (ISF) and inverse term frequency (ITF). A direct comparison reveals that all language models are highly correlated; however, the average ISF and ITF values are 5.5 and 10.4 higher than IDF, respectively. All language models appeared to follow a power law distribution with a slope coefficient of 1.6 for documents and 1.7 for sentences and terms. We conclude with an analysis of IDF stability with respect to random, journal, and section partitions of the 100,830 full-text scientific articles in our experimental corpus.</Paragraph>
  </Section>
</Paper>
