<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2204">
<Title>Transductive Pattern Learning for Information Extraction</Title>
<Section position="5" start_page="28" end_page="28" type="evalu">
<SectionTitle> 4 Experiments </SectionTitle>
<Paragraph position="0"> To evaluate the performance of our algorithm, we conducted experiments with four widely used corpora: the CMU seminar announcements (SA) set, the Austin jobs set, the Reuters acquisitions set [www.isi.edu/info-agents/RISE], and the MUC-7 named entity corpus [www.ldc.upenn.edu].</Paragraph>
<Paragraph position="1"> We randomly partitioned each dataset into two evenly sized subsets, which we then used as the labelled and unlabelled sets. In each experiment the algorithm was presented with a document set comprising the test set and a randomly selected percentage of the documents from the training set. For example, if an experiment involved a 5% seed set, then 5% of the documents in the training set (2.5% of the documents in the entire dataset) were selected at random and used in conjunction with the documents in the test set.</Paragraph>
<Paragraph position="2"> For each training-set size, we ran five trials, each with a randomly selected subset of the documents used for training. Since we are mainly motivated by scenarios with very little training data, we varied the size of the training set from 1-10% (1-16% for MUC-7 NE) of the available documents. Precision, recall and F1 were calculated using the BWI scorer (Freitag and Kushmerick, 2000). We used "all occurrences" mode, which records a match only when all of the valid fragments in a given document are extracted, and gives no credit for partially correct extractions (see the sketch at the end of this section).</Paragraph>
<Paragraph position="3"> We compared TPLEX to BWI (Freitag and Kushmerick, 2000), LP2 (Ciravegna, 2001), ELIE (Finn and Kushmerick, 2004), and an approach based on conditional random fields (Lafferty et al., 2001). The results for BWI were obtained using the TIES implementation [tcc.itc.it/research/textec/toolsresources/ties.html]. The data for the LP2 learning curve were obtained from (Ciravegna, 2003). The results for ELIE were generated by its current implementation [http://smi.ucd.ie/aidan/Software.html]. For the CRF results, we used MALLET's SimpleTagger (McCallum, 2002), encoding each token as a set of binary features (one for each observed literal, as well as the eight token generalizations).</Paragraph>
<Paragraph position="4"> Our results in Fig. 2 indicate that on the Acquisitions dataset, our algorithm substantially outperforms the competitors at all points on the learning curve. For the other datasets, the results are mixed. On SA and Jobs, TPLEX is the second-best algorithm at the low end of the learning curve, and steadily loses ground as more labelled data becomes available. TPLEX is the least accurate algorithm on the MUC data. In Sec. 5, we discuss a variety of modifications to the TPLEX algorithm that we anticipate may improve its performance.</Paragraph>
<Paragraph position="5"> Finally, the graph in Fig. 3 compares TPLEX on the SA dataset in two configurations: with a combination of labelled and unlabelled documents as usual, and with only labelled documents. In both configurations the algorithm was given the same seed and testing documents.</Paragraph>
<Paragraph position="6"> In the first configuration the algorithm learned patterns using both the labelled and unlabelled documents; in the second, only the labelled documents were used to generate the patterns.</Paragraph>
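The following is a minimal Python sketch of the sampling protocol and "all occurrences" scoring described above. It is an illustrative reconstruction under stated assumptions, not the actual experimental code: the function names are hypothetical, and the BWI scorer's exact matching rules (here taken as an exact match of the fragment set per document) may differ in detail.

```python
import random

def make_experiment_split(docs, seed_pct, rng):
    """Split the corpus into two even halves (training pool and test set),
    then draw seed_pct% of the training pool as the labelled seed set.
    The learner is given the seed set (labelled) together with the
    test set (unlabelled)."""
    docs = list(docs)
    rng.shuffle(docs)
    half = len(docs) // 2
    train_pool, test_set = docs[:half], docs[half:]
    n_seed = max(1, round(len(train_pool) * seed_pct / 100))
    seed_set = rng.sample(train_pool, n_seed)
    return seed_set, test_set

def all_occurrences_prf(gold, predicted):
    """'All occurrences' scoring: a document counts as a match only if
    every valid gold fragment in it was extracted; partially correct
    extractions earn no credit.  Both arguments map a document id to the
    set of fragments extracted for one field."""
    tp = sum(1 for d in gold if gold[d] and predicted.get(d) == gold[d])
    n_pred = sum(1 for d in predicted if predicted[d])
    n_gold = sum(1 for d in gold if gold[d])
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: a 5% seed set, i.e. 2.5% of the whole corpus is labelled.
# seed, test = make_experiment_split(corpus, seed_pct=5, rng=random.Random(0))
```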
<Paragraph position="7"> These data confirm that TPLEX is indeed able to improve performance from unlabelled data.</Paragraph>
</Section>
</Paper>