<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1202"> <Title>Using Argumentation to Retrieve Articles with Similar Citations from MEDLINE</Title> <Section position="7" start_page="10" end_page="12" type="evalu"> <SectionTitle> 4 Results </SectionTitle> <Paragraph position="0"> In this section, we describe the generation of the baseline measure and the effects of different conditions on this baseline.</Paragraph> <Section position="1" start_page="11" end_page="11" type="sub_section"> <SectionTitle> 4.1 Comparison of text index parameters </SectionTitle> <Paragraph position="0"> Using a domain-specific thesaurus tends to improve the MAP measured against the citation benchmark: 0.1528 with the thesaurus vs. 0.1517 without for ltc.atn, and 0.1452 vs. 0.1433 for atc.atn (Table 3). The ltc.atn weighting schema in combination with the thesaurus produced the best results; because these parameters were the most likely to retrieve abstracts found in the citation index, they were used for all subsequent experiments.</Paragraph> <Paragraph position="1"> Table 3. Mean average precision (MAP) for each query set (1, 2, 3, and 4) with different term weighting schemas. The last column gives the average MAP. T represents the thesaurus.</Paragraph> </Section> <Section position="2" start_page="11" end_page="11" type="sub_section"> <SectionTitle> 4.2 Argumentation-based retrieval </SectionTitle> <Paragraph position="0"> To demonstrate that argumentative features can improve document retrieval, we first determined which argumentative class was the most content-bearing. Subsequently, we combined the four argumentative classes to further improve document retrieval.</Paragraph> <Paragraph position="1"> Table 4.
MAP results from querying the collection using only one argumentative move at a time.</Paragraph> <Paragraph position="2"> To determine the value of each argumentative move in retrieval, the argumentative categorizer first parsed each query abstract, generating four groups, each representing a unique argumentative class. The document collection was then queried separately with each group. Table 4 gives the MAP measures for each type of argumentation and shows that the sentences classified as PURPOSE provide the most useful content for retrieving similar documents: 62.5% of the baseline precision is achieved when using only this section of the abstract. The CONCLUSION move is the second most valuable, at 56% of the baseline. The METHODS and RESULTS sections appear less content-bearing for retrieving similar documents, at 16.4% and 17.6% of the baseline, respectively.</Paragraph> <Paragraph position="3"> Each argumentative set represents roughly a quarter of the textual content of the original abstract. Querying with the PURPOSE section (25% of the available textual material) realizes almost 2/3 of the baseline average precision, and querying with the CONCLUSION section realizes more than 50% of it. In information retrieval, queries and documents are often seen as symmetrical elements. This symmetry suggests that argumentative moves could be used to reduce the size of the indexed document collection or to help index pruning in large repositories (Carmel et al., 2001).</Paragraph> </Section> <Section position="3" start_page="11" end_page="12" type="sub_section"> <SectionTitle> 4.3 Argumentative overweighting </SectionTitle> <Paragraph position="0"> As Table 4 suggests, Table 5 confirms that overweighting the features of PURPOSE and CONCLUSION sentences results in a gain in average precision (+3.39% for CONCLUSION and +3.98% for PURPOSE) as measured by citation similarity.
More specifically, Table 5 demonstrates the use of PURPOSE and CONCLUSION as follows: * PURPOSE applies a boosting coefficient to features classified as PURPOSE by the argumentative classifier; * CONCLUSION applies a boosting coefficient to features classified as CONCLUSION by the argumentative classifier; * COMBINATION applies two different boosting coefficients to features classified as CONCLUSION and PURPOSE by the argumentative classifier.</Paragraph> <Paragraph position="1"> The results of boosting PURPOSE and CONCLUSION features are given in Table 5 alongside the MAP and show an improvement in precision at the 5- and 10-document levels. At the 5-document level the advantage is with the PURPOSE features, but at the 10-document level boosting the CONCLUSION features is more effective. While the improvement brought by boosting PURPOSE or CONCLUSION features alone is modest when measured by MAP (3-4%), their optimal combination yields a significant improvement of +5.48%. The various combinations of RESULTS and METHODS sections did not lead to any improvement.</Paragraph> <Paragraph position="2"> Argumentation has typically been studied in relation to summarization (Teufel and Moens, 2002). Its impact on information retrieval is more difficult to establish, although recent experiments (Ruch et al., 2003) tend to confirm that argumentation is useful for information extraction, as demonstrated by the extraction of gene functions for LocusLink curation. Similarly, using the argumentative structure of scientific articles has been proposed to reduce noise (Camon et al., 2004) in the assignment of Gene Ontology codes, as investigated in the BioCreative challenge. In particular, it was seen that the use of 'Material and Methods' sentences should be avoided, a finding that our results with the METHODS argumentative move confirm.</Paragraph> </Section> </Section> </Paper>