<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1127">
  <Title>Novel Association Measures Using Web Search with Double Checking</Title>
  <Section position="4" start_page="1009" end_page="1009" type="intro">
    <SectionTitle>
2 A Web Search with Double Checking Model
</SectionTitle>
    <Paragraph position="0"> Instead of relying on simple web page counts or on complex web page collection, we propose a novel model, Web Search with Double Checking (WSDC), to analyze snippets. In the WSDC model, two objects X and Y are postulated to have an association if we can find Y from X (a forward process) and find X from Y (a backward process) by web search. The forward process counts the total occurrences of Y in the top N snippets of query X, denoted as f(Y@X). Similarly, the backward process counts the total occurrences of X in the top N snippets of query Y, denoted as f(X@Y). The forward and the backward processes form a double check operation. Under the WSDC model, the association scores between X and Y are defined by various formulas as follows.</Paragraph>
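As an illustrative sketch of the double check operation described above, the following Python fragment computes f(Y@X) and f(X@Y) from two snippet lists that have already been retrieved. The function names, the case-insensitive substring matching, the default value of N, and the sample snippets are assumptions made for illustration only; they are not taken from the original system, and no actual web search is performed.

```python
from typing import Sequence, Tuple


def count_occurrences(term: str, snippets: Sequence[str], top_n: int) -> int:
    """Total occurrences of `term` in the first `top_n` snippets.

    Assumption: matching is case-insensitive substring matching, so a
    term may also match inside longer words.
    """
    needle = term.lower()
    return sum(snippet.lower().count(needle) for snippet in snippets[:top_n])


def double_check(
    x: str,
    y: str,
    snippets_of_x: Sequence[str],
    snippets_of_y: Sequence[str],
    top_n: int = 100,  # hypothetical default for N
) -> Tuple[int, int]:
    """Return (f(Y@X), f(X@Y)) for the two snippet lists."""
    forward = count_occurrences(y, snippets_of_x, top_n)   # Y found from X
    backward = count_occurrences(x, snippets_of_y, top_n)  # X found from Y
    return forward, backward


def is_associated(forward: int, backward: int) -> bool:
    """X and Y pass the double check only if both directions succeed."""
    return forward > 0 and backward > 0


# Hypothetical snippets standing in for real search results.
snippets_google = [
    "Yahoo and Google are search engines",
    "Google acquired YouTube",
    "Google Maps",
]
snippets_yahoo = [
    "Yahoo Mail",
    "Yahoo versus Google: Google wins",
    "Yahoo Finance",
]

forward, backward = double_check(
    "Google", "Yahoo", snippets_google, snippets_yahoo, top_n=3
)
print(forward, backward, is_associated(forward, backward))
```

Passing the snippets in as plain lists keeps the counting step separate from whichever search interface supplies them, so the same functions can be reused with any value of N.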
    <Paragraph position="1"> Here, f(X) is the total number of occurrences of X in the top N snippets of query X and, similarly, f(Y) is the total number of occurrences of Y in the top N snippets of query Y. Formulas (1)-(4) are variants of the Dice, Cosine, Jaccard, and Overlap Ratio association measures. Formula (5) is a function</Paragraph>
  </Section>
</Paper>