<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1202"> <Title>Using Argumentation to Retrieve Articles with Similar Citations from MEDLINE</Title>
<Section position="4" start_page="8" end_page="10" type="metho"> <SectionTitle> 3 Methods </SectionTitle>
<Paragraph position="0"> We established a benchmark based on citation analysis to evaluate the impact of using argumentation to find related articles. In information retrieval, benchmarks are developed from three resources: a document collection, a query collection and a set of relevance rankings that relate each query to a set of documents.</Paragraph>
<Paragraph position="1"> Existing information retrieval collections normally contain user queries composed of only a few words. Such short queries are not suitable for evaluating a system tailored to retrieve articles with similar citations. We therefore created the collection and tuned the system to accept long queries such as abstracts (Figure 2).</Paragraph>
<Paragraph position="2"> Figure 2: Flowchart of the chain of experimental procedures. The benchmark was assembled from citations shared between documents and compared to the document similarity ranking of EasyIR.</Paragraph>
<Section position="1" start_page="9" end_page="9" type="sub_section"> <SectionTitle> 3.1 Data acquisition and citation indexing </SectionTitle>
<Paragraph position="0"> All the data used in these experiments were acquired from MEDLINE using the PubMed interface.</Paragraph>
<Paragraph position="1"> Document collection. The document set was obtained from PubMed by executing a set of Boolean queries to recover articles related to small active peptides from many animal species, excluding humans. These peptides hold the promise of becoming novel therapeutics. The set consisted of 12,500 documents comprising abstract, title and MeSH terms. For 3,200 of these documents we were able to recover the full text, including the references, for citation extraction and analysis.</Paragraph>
<Paragraph position="2"> Queries. In line with the statistical analysis of Buckley and Voorhees (2000), four sets of 25 articles were selected from the 3,200 full-text articles. The title, abstract and MeSH term fields were used to construct the queries. To test the influence of the argumentative moves, the corresponding sentences were extracted and tested either alone or in combination with the queries that contained the title, abstract and MeSH terms.</Paragraph>
<Paragraph position="3"> Citation analysis. Citation lists were automatically extracted from the 3,200 full-text articles represented in the document set. This automatic parsing of citations was manually validated, and each citation was represented by a unique ID for comparison purposes. Citation analysis of the entire collection showed that the full-text articles had a mean citation count of 28.30 ± 24.15 (mean ± S.D.), with a 95% CI of 27.47-29.13. Within these records, the mean co-citation count was 7.79 ± 6.99 (mean ± S.D.), with a 95% CI of 7.55-8.03. As would be expected in a document set that contains a variety of document types (reviews, journal articles, editorials), the standard deviations of these values are quite large.</Paragraph>
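<Paragraph> As a purely illustrative sketch (not part of the original paper), the co-citation counting that underlies the benchmark described below can be pictured as follows; the extracted citation lists are assumed to be available as sets of the unique reference IDs, and all names are hypothetical:

def shared_citation_ranking(query_refs, collection_refs, top_n=10):
    """Rank candidate articles by the number of references they share
    with the query article and keep the top_n as the benchmark.

    query_refs: set of reference IDs cited by the query article.
    collection_refs: dict mapping an article ID to its set of reference IDs.
    """
    scores = {}
    for art_id, refs in collection_refs.items():
        shared = len(query_refs.intersection(refs))
        if shared:  # keep only articles sharing at least one reference
            scores[art_id] = shared
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
</Paragraph>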
<Paragraph position="4"> Citation benchmark. For each set of queries, a benchmark was generated from the 10 cited articles that shared the greatest number of co-citations with the query. For this benchmark, the average number of cited articles with more than 9 co-citations was 15.70 ± 6.58 (mean ± S.D.). The query sets were checked to confirm that at least one sentence in each abstract was classified for each argumentative class.</Paragraph> </Section>
<Section position="2" start_page="9" end_page="9" type="sub_section"> <SectionTitle> 3.2 Metrics </SectionTitle>
<Paragraph position="0"> The main measure for assessing information retrieval engines is mean average precision (MAP). MAP is the standard metric, although it may tend to hide minor differences in ranking (Mittendorf and Schauble, 1996).</Paragraph> </Section>
<Section position="3" start_page="9" end_page="10" type="sub_section"> <SectionTitle> 3.3 Text indexing </SectionTitle>
<Paragraph position="0"> The term weighting schema, composed of combinations of term frequency, inverse document frequency and length normalization, was varied to determine the most relevant output ranking. Table 1 gives the most common term weighting factors (atc.atn, ltc.atn); the first letter triplet applies to the document, the second to the query (Ruch, 2002).</Paragraph> </Section>
<Section position="4" start_page="10" end_page="10" type="sub_section"> <SectionTitle> 3.4 Argumentative classification </SectionTitle>
<Paragraph position="0"> The classifier segmented the abstracts into four argumentative moves: PURPOSE, METHODS, RESULTS and CONCLUSION. The classification unit is the sentence, which means that abstracts are preprocessed using an ad hoc sentence splitter.</Paragraph>
<Paragraph position="1"> The confusion matrix for the four argumentative moves generated by the classifier is given in Table 2. This evaluation used explicitly structured abstracts; the argumentative markers were therefore removed prior to the evaluation.</Paragraph>
<Paragraph position="2"> Figure 3 shows the output of the classifier when applied to the abstract shown in Figure 1. In each box, the attributed class comes first, followed by the score for the class and the extracted text segment. In this example, one of the RESULTS sentences is misclassified as METHODS:
CONCLUSION |00160116 |The highly favorable pathologic stage (RI-RII, 58%) and the fact that the majority of patients were alive and disease-free suggested a more favorable prognosis for this type of renal cell carcinoma.
METHODS |00160119 |Tumors were classified according to well-established histologic criteria to determine stage of disease; the system proposed by Robson was used.
METHODS |00162303 |Of 250 renal cell carcinomas analyzed, 36 were classified as chromophobe renal cell carcinoma, representing 14% of the group studied.
PURPOSE |00156456 |In this study, we analyzed 250 renal cell carcinomas to a) determine frequency of CCRC at our Hospital and b) analyze clinical and pathologic features of CCRCs.
PURPOSE |00167817 |Chromophobe renal cell carcinoma (CCRC) comprises 5% of neoplasms of renal tubular epithelium. CCRC may have a slightly better prognosis than clear cell carcinoma, but outcome data are limited.
RESULTS |00155338 |Robson staging was possible in all cases: 10 patients were stage I, 11 stage II, 10 stage III, and five stage IV.
After extraction, each of the four types of argumentative moves was used for the indexing, retrieval and comparison tasks.</Paragraph> </Section> </Section>
<Section position="5" start_page="10" end_page="10" type="metho"> <SectionTitle> 3.5 Argumentative combination </SectionTitle>
<Paragraph position="0"> We adjusted the weight of the four argumentative moves based on their location and then combined them to improve retrieval effectiveness. The query weights were recomputed as indicated in equation 1:
W_new = W_old * (k_c * S_c) (1)
where:
W_old: the feature weight as given by the query weighting (ltc);
S_c: the normalized score attributed by the argumentative classifier to each sentence in the abstract; this score is attributed to each feature appearing in the considered segment;
k_c: a constant for each argumentative class c, set empirically using the tuning set (TS); the initial value of k_c for each category is given by the distribution observed in Table 4 (i.e., 0.625, 0.164, 0.176 and 0.560 for PURPOSE, METHODS, RESULTS and CONCLUSION, respectively), and an increment step (positive and negative) is then varied to find the most effective combination.</Paragraph>
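<Paragraph> As an illustration only (not part of the original paper), the reweighting of equation 1 could be sketched as follows; the function and variable names are hypothetical, and each query term is assumed to arrive with its ltc weight together with the class and normalized score of the sentence it was extracted from:

# Initial k_c values per class, taken from the distribution in Table 4;
# they are then tuned on the tuning set by varying an increment step.
K = {"PURPOSE": 0.625, "METHODS": 0.164, "RESULTS": 0.176, "CONCLUSION": 0.560}

def reweight_query(features, k=K):
    """features: iterable of (term, w_old, move, score) tuples, where w_old
    is the ltc query weight, move is the argumentative class of the source
    sentence, and score is the normalized classifier score S_c."""
    return {term: w_old * k[move] * score
            for term, w_old, move, score in features}

# Example: a PURPOSE term is damped far less than a METHODS term.
new_weights = reweight_query([("carcinoma", 0.42, "PURPOSE", 0.95),
                              ("histologic", 0.37, "METHODS", 0.88)])
</Paragraph>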
<Paragraph position="1"> This equation combines the weight (W_old) attributed by the original weighting (ltc) to each feature found in the query with a boosting factor (k_c) and the score (S_c) provided by the argumentative classifier for each classified sentence. For these experiments, the parameters were determined with a tuning set (TS), one of the four query sets, and the final evaluation was done using the remaining three sets, the validation sets (VS). The document feature factor (atn) remained unchanged.</Paragraph> </Section> </Paper>