<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1012">
  <Title>Entity-Based Cross-Document Coreferencing Using the Vector Space Model</Title>
  <Section position="9" start_page="83" end_page="84" type="evalu">
    <SectionTitle>8 Results</SectionTitle>
    <Paragraph position="0">Figure 11 shows the precision, recall, and F-Measure (with equal weights for both precision and recall) using the B-CUBED scoring algorithm. The Vector Space Model in this case constructed the space of terms only from the summaries extracted by SentenceExtractor. In comparison, Figure 12 shows the results (using the B-CUBED scoring algorithm) when the Vector Space Model constructed the space of terms from the articles input to the system (it still used the summaries when computing the similarity). The importance of using CAMP to extract summaries is verified by comparing the highest F-Measures achieved by the system for the two cases.</Paragraph>
    <Paragraph position="1">The highest F-Measure for the former case is 84.6%, while the highest F-Measure for the latter case is 78.0%. In comparison, for this task, named-entity tools like NetOwl and Textract would mark all the John Smiths the same. Their performance using our scoring algorithm is 23% precision and 100% recall.</Paragraph>
    <Paragraph position="2">Figures 13 and 14 show the precision, recall, and F-Measure calculated using the MUC scoring algorithm. Also, the baseline case, in which all the John Smiths are considered to be the same person, achieves 83% precision and 100% recall. The high initial precision is mainly due to the fact that the MUC algorithm assumes that all errors are equal.</Paragraph>
    <Paragraph position="3">We have also tested our system on other classes of cross-document coreference, such as names of companies and events. Details about these experiments can be found in (Bagga 98b).</Paragraph>
  </Section>
</Paper>
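
To make the B-CUBED numbers above concrete, here is a minimal Python sketch of a B-CUBED scorer following the per-mention definition used in the paper: each mention gets a precision and recall based on the overlap between the system chain and the truth chain containing it, and the final scores are the equally weighted averages over all mentions (the equal-weight setting behind the reported F-Measures). The function name and the set-of-sets chain representation are illustrative assumptions, not part of the original system.

```python
def b_cubed(system_chains, truth_chains):
    """B-CUBED precision, recall, and F over coreference chains.

    system_chains, truth_chains: lists of sets of mention ids covering
    the same mentions. Each mention is weighted equally (1/N), matching
    the equal-weight F-Measure used in the results above.
    """
    # Map every mention to the chain that contains it.
    sys_of = {m: chain for chain in system_chains for m in chain}
    tru_of = {m: chain for chain in truth_chains for m in chain}
    mentions = list(tru_of)
    n = len(mentions)
    precision = recall = 0.0
    for m in mentions:
        overlap = len(sys_of[m] & tru_of[m])
        precision += overlap / len(sys_of[m])  # correct share of the system chain
        recall += overlap / len(tru_of[m])     # correct share of the truth chain
    precision /= n
    recall /= n
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f
```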
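The low-precision, perfect-recall behavior of the "mark all the John Smiths the same" baseline falls out directly from this definition. A hypothetical toy run, with made-up mention ids rather than the paper's actual data:

```python
# Three true entities, but the system lumps every mention into one chain.
truth = [{1, 2}, {3}, {4, 5, 6}]
system = [{1, 2, 3, 4, 5, 6}]

p, r, f = b_cubed(system, truth)
# r == 1.0: every truth chain sits entirely inside the single system chain.
# p is low (about 0.39 here): each mention's system chain is mostly wrong mentions.
print(f"precision={p:.2f} recall={r:.2f} f={f:.2f}")
```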
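The MUC scorer behaves differently because it counts links rather than mentions, which is why the same all-in-one-chain baseline reaches 83% precision in the paper's data: merging entities costs only a few spurious links, and every missing or spurious link counts the same ("all errors are equal"). Below is a sketch of link-based MUC recall in the style of Vilain et al. (1995), with precision obtained by swapping the two partitions; again, names and chain representation are assumptions for illustration.

```python
def muc_recall(truth_chains, system_chains):
    """Link-based MUC recall (Vilain et al., 1995 style).

    For each truth chain S, count the links recovered: |S| minus the
    number of system partitions that S is split across, out of the
    |S| - 1 links needed to connect S. Every missing link costs the
    same, which is the 'all errors are equal' assumption noted above.
    """
    sys_of = {m: i for i, chain in enumerate(system_chains) for m in chain}
    found = needed = 0
    for chain in truth_chains:
        # Mentions absent from the system output each form their own partition.
        parts = {sys_of.get(m, ("missing", m)) for m in chain}
        found += len(chain) - len(parts)
        needed += len(chain) - 1
    return found / needed if needed else 0.0


def muc_scores(system_chains, truth_chains):
    """MUC precision, recall, and F: precision swaps the two partitions."""
    r = muc_recall(truth_chains, system_chains)
    p = muc_recall(system_chains, truth_chains)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

On the toy example above, muc_scores(system, truth) yields precision 0.60 against B-CUBED's 0.39 with identical chains, illustrating why the lump-everything baseline looks much stronger under MUC scoring.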