<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1127"> <Title>Novel Association Measures Using Web Search with Double Checking</Title> <Section position="4" start_page="1009" end_page="1009" type="intro"> <SectionTitle> 2 A Web Search with Double Checking Model </SectionTitle> <Paragraph position="0"> Instead of relying on simple web page counts or complex web page collection, we propose a novel model, Web Search with Double Checking (WSDC), to analyze snippets. In the WSDC model, two objects X and Y are postulated to have an association if we can find Y from X (a forward process) and find X from Y (a backward process) by web search. The forward process counts the total number of occurrences of Y in the top N snippets of query X, denoted as f(Y@X). Similarly, the backward process counts the total number of occurrences of X in the top N snippets of query Y, denoted as f(X@Y). Together, the forward and backward processes form a double check operation. Under the WSDC model, the association score between X and Y is defined by the following formulas.</Paragraph> <Paragraph position="1"> Here, f(X) is the total number of occurrences of X in the top N snippets of query X and, similarly, f(Y) is the total number of occurrences of Y in the top N snippets of query Y. Formulas (1)-(4) are variants of the Dice, Cosine, Jaccard, and Overlap Ratio association measures. Formula (5) is a function</Paragraph> </Section> </Paper>
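<!--
The forward and backward counting steps described in Section 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the snippets are assumed to be already retrieved and passed in as plain lists of strings, matching is assumed to be case-insensitive and whole-word, and the names count_occurrences, double_check_counts, and passes_double_check are hypothetical. The association formulas (1)-(5) are not reproduced here because they are absent from this extract.

```python
import re
from typing import Dict, List


def count_occurrences(term: str, snippets: List[str], top_n: int):
    """Count whole-word, case-insensitive occurrences of term in the first top_n snippets.

    Assumption: the source does not specify the matching rule; whole-word
    matching is chosen here only for illustration.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return sum(len(pattern.findall(snippet)) for snippet in snippets[:top_n])


def double_check_counts(
    x: str,
    y: str,
    snippets_x: List[str],
    snippets_y: List[str],
    top_n: int = 100,
):
    """Return the four counts used by the WSDC model as a dictionary.

    snippets_x / snippets_y are the snippets returned for query X / query Y
    (hypothetical inputs; no search API is called here).
    """
    return {
        "f(Y@X)": count_occurrences(y, snippets_x, top_n),  # forward process
        "f(X@Y)": count_occurrences(x, snippets_y, top_n),  # backward process
        "f(X)": count_occurrences(x, snippets_x, top_n),
        "f(Y)": count_occurrences(y, snippets_y, top_n),
    }


def passes_double_check(counts: Dict[str, int]):
    """True when Y is found from X and X is found from Y."""
    return counts["f(Y@X)"] > 0 and counts["f(X@Y)"] > 0


# Made-up snippets, for illustration only.
SNIPPETS_X = [
    "Jaguar is a big cat. The jaguar hunts at night.",
    "A cat such as the jaguar lives in forests.",
]
SNIPPETS_Y = [
    "The cat family includes the lion and the jaguar.",
    "My cat sleeps all day.",
]

if __name__ == "__main__":
    counts = double_check_counts("jaguar", "cat", SNIPPETS_X, SNIPPETS_Y, top_n=2)
    print(counts, passes_double_check(counts))
```

Keeping snippet retrieval outside these functions means the counting logic can be checked without depending on any particular search engine.
-->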