<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-0803">
  <Title>Extracting Key Phrases to Disambiguate Personal Name Queries in Web Search</Title>
  <Section position="8" start_page="20" end_page="23" type="evalu">
    <SectionTitle>
6 Experiments and Results
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="20" end_page="20" type="sub_section">
      <SectionTitle>
6.1 Evaluating Contextual Similarity
</SectionTitle>
      <Paragraph position="0"> In section 5.4, we defined the similarity between documents (i.e., term-entity models created from the documents) using a web snippets based contextual similarity (Formula 1). However, how well such a metric represents the similarity between documents, remains unknown. Therefore, to evaluate the contextual similarity among documents, we group the documents in &amp;quot;person-X&amp;quot; dataset into four classes (each class representing a different person) and use Formula 1 to compute within-class and cross-class similarity histograms, as illustrated in Figure 3.</Paragraph>
      <Paragraph position="1"> Ideally, within-class similarity distribution should have a peak around 1 and cross-class similarity distribution around 0, whereas both histograms in Figure 3(a) and 3(b) have their peaks around 0.2. However, within-class similarity distribution is heavily biased toward to the right of this peak and cross-class similarity distribution to the left. Moreover, there are no document pairs with more than 0.5 cross-class similarity. The experimental results guarantees the validity of the contextual similarity metric.</Paragraph>
    </Section>
    <Section position="2" start_page="20" end_page="20" type="sub_section">
      <SectionTitle>
6.2 Evaluation Metric
</SectionTitle>
      <Paragraph position="0"> We evaluate experimental results based on the confusion matrix, where A[i.j] represents the number of documents of &amp;quot;person i&amp;quot; predicted as &amp;quot;person j&amp;quot; in matrix A. A[i,i] represents the number of correctly predicted documents for &amp;quot;person i&amp;quot;. We define the disambiguation accuracy as the sum of diagonal elements divided by the sum of all elements in the matrix.</Paragraph>
    </Section>
    <Section position="3" start_page="20" end_page="22" type="sub_section">
      <SectionTitle>
6.3 Cluster Quality
</SectionTitle>
      <Paragraph position="0"> Each cluster formed by the GAAC process is supposed to be representing a different namesake.</Paragraph>
      <Paragraph position="1"> Ideally, the number of clusters formed should be equal to the number of different namesakes for  axis represents the similarity value. Y axis represents the number of document pairs from the same class (within-class) or from different classes (cross-class) that have the corresponding similarity value. the ambiguous name. However, in reality it is impossible to exactly know the number of namesakes that appear on the Web for a particular name.</Paragraph>
      <Paragraph position="2"> Moreover, the distribution of pages among namesakes is not even. For example, in the &amp;quot;Jim Clark&amp;quot; dataset 78% of documents belong to the two famous namesakes (CEO Nestscape and Formula one world champion). The rest of the documents are distributed among the other six namesakes. If these outliers get attached to the otherwise pure clusters, both disambiguation accuracy and key phrase selection deteriorate. Therefore, we monitor the quality of clustering and terminate further agglomeration when the cluster quality drops below a pre-set threshold. Numerous metrics have been proposed for evaluating quality of clustering (Kannan et al., 2000). We use normalized cuts (Shi and Malik, 2000) as a measure of clusterquality. null Let, V denote the set of documents for a name.</Paragraph>
      <Paragraph position="3"> Consider, A [?] V to be a cluster of documents taken from V . For two documents x,y in V , sim(x,y) represents the contextual similarity between the documents (Formula 1). Then, the normalized cut Ncut(A) of cluster A is defined as,</Paragraph>
      <Paragraph position="5"> For a set, {A1,...,An} of non-overlapping n clusters Ai, we define the quality of clustering,</Paragraph>
      <Paragraph position="7"> To explore the faithfulness of cluster quality in approximating accuracy, we compare accuracy (calculated using human-annotated data) and cluster quality (automatically calculated using Formula 4) for person-X data set. Figure 4 shows cluster quality in x-axis and accuracy in y-axis.</Paragraph>
      <Paragraph position="8"> We observe a high correlation (Pearson coefficient of 0.865) between these two measures, which enables us to guide the clustering process through cluster quality.</Paragraph>
      <Paragraph position="9"> When cluster quality drops below a pre-defined  person-X data set.</Paragraph>
      <Paragraph position="10"> threshold, we terminate further clustering. We assign the remaining documents to the already formed clusters based on the correlation (Formula 2) between the document and the cluster. To determine the threshold of cluster quality, we use person-X collection as training data. Figure 5 illustrates the variation of accuracy with threshold. We select threshold at 0.935 where accuracy maximizes in Figure 5. Threshold was fixed at 0.935 for the rest of the experiments.</Paragraph>
    </Section>
    <Section position="4" start_page="22" end_page="22" type="sub_section">
      <SectionTitle>
6.4 Disambiguation Accuracy
</SectionTitle>
      <Paragraph position="0"> Table 2 summarizes the experimental results. The baseline, majority sense , assigns all the documents in a collection to the person that have most documents in the collection. Proposed method outperforms the baseline in all data sets.</Paragraph>
      <Paragraph position="1"> Moreover, the accuracy values for the proposed method in Table 2 are statistically significant (ttest: P(T[?]t)=0.0087, a = 0.05) compared to the baseline. To identify each cluster with a namesake, we chose the person that has most number of documents in the cluster. &amp;quot;Found&amp;quot; column shows the number of correctly identified namesakes as a fraction of total namesakes. Although the proposed method correctly identifies the popular namesakes, it fails to identify the namesakes who have just one or two documents in the collection. null</Paragraph>
    </Section>
    <Section position="5" start_page="22" end_page="23" type="sub_section">
      <SectionTitle>
6.5 Web Search Task
</SectionTitle>
      <Paragraph position="0"> Key phrases extracted by the proposed method are listed in Figure 6 (Due to space limitations, we show only the top ranking key phrases for two collections). To evaluate key phrases in disambiguat- null Michael Jackson and Jim Clark datasets.</Paragraph>
      <Paragraph position="1"> ing namesakes, we set up a web search experiment as follows. We search for the ambiguous name and the key phrase (for example, &amp;quot;Jim Clark&amp;quot; AND &amp;quot;racing driver&amp;quot;) and classify the top 100 results according to their relevance to each namesake. Results of our experiment on Jim Clark dataset for the top ranking key phrases are shown in Table 3.</Paragraph>
      <Paragraph position="2"> In Table 3 we classified Google search results into three categories. &amp;quot;person-1&amp;quot; is the formula one racing world champion, &amp;quot;person -2&amp;quot; is the founder of Netscape and &amp;quot;other&amp;quot; category contains rest of the pages that we could not classify to previous two groups 5. We first searched Google without adding any key phrases to the name. Including terms racing diver, rally and scotsman,  which were the top ranking terms for Jim Clark the formula one champion, yields no results for the other popular namesake. Likewise, the key words entrepreneur and silicon valley yield results fort he founder of Netscape. However, the key word story appears for both namesakes. A close investigation revealed that, the keyword story is extracted from the title of the book &amp;quot;The New New Thing: A Silicon Valley Story&amp;quot;, a book on the founder of Netscape.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>