<?xml version="1.0" standalone="yes"?> <Paper uid="N06-2007"> <Title>Semi-supervised Relation Extraction with Label Propagation</Title> <Section position="4" start_page="25" end_page="27" type="evalu"> <SectionTitle> 3 Experiments and Results </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="25" end_page="25" type="sub_section"> <SectionTitle> 3.1 Data </SectionTitle> <Paragraph position="0"> Our proposed graph-based method is evaluated on the ACE corpus 1, which contains 519 files from sources including broadcast, newswire, and newspaper. A breakdown of the tagged data by relation subtype is given in Table 1.</Paragraph> </Section> <Section position="2" start_page="25" end_page="26" type="sub_section"> <SectionTitle> 3.2 Features </SectionTitle> <Paragraph position="0"> We extract the following lexical and syntactic features from the two entity mentions and from the contexts before, between, and after the entity pair. Specifically, we define the mid-context window as everything between the two entities, and the pre- and post-contexts as up to two words before and after the corresponding entity. Most of these features are computed from parse trees produced by the Charniak parser (Charniak, 1999) and the Chunklink script 2 written by Sabine Buchholz from Tilburg University.</Paragraph> <Paragraph position="1"> Words: Surface tokens of the two entities and the three context windows.</Paragraph> <Paragraph position="2"> Entity Type: The entity types of both entity mentions, which can be PERSON, ORGANIZATION, FACILITY, LOCATION, or GPE.</Paragraph> <Paragraph position="3"> POS: Part-of-speech tags of all tokens in the two entities and the three context windows. Chunking features: Chunk tags and grammatical functions of the two entities and the three context windows. IOB-chains of the heads of the two entities are also considered. 
An IOB-chain records the syntactic categories of all the constituents on the path from the root node to the leaf node of the parse tree. We combine the above features with their position information in the context to form the context vector. Before doing so, we filter out low-frequency features that appear only once in the entire set.</Paragraph> </Section> <Section position="3" start_page="26" end_page="27" type="sub_section"> <SectionTitle> 3.3 Experimental Evaluation </SectionTitle> <Paragraph position="0"> We collect all entity mention pairs that co-occur in the same sentence from the training and devtest corpora into two sets, C1 and C2, respectively. The set C1 consists of annotated training data AC1 and unrelated data UC1. We randomly sample l examples from AC1 as labeled data and add a &quot;NONE&quot; class to the labeled data for the case where the two entity mentions are not related. The data for the &quot;NONE&quot; class is obtained by sampling l examples from UC1.</Paragraph> <Paragraph position="1"> Moreover, we combine the remaining examples of C1 and the whole set C2 as unlabeled data.</Paragraph> <Paragraph position="2"> Given the labeled and unlabeled data, we run the LP algorithm to detect possible relations, i.e., entity pairs that are classified not into the &quot;NONE&quot; class but into one of the other 24 subtype classes. In addition, we conduct experiments with different labeled-set sizes l: 1% x Ntrain, 10% x Ntrain, 25% x Ntrain, 50% x Ntrain, 75% x Ntrain, and 100% x Ntrain (Ntrain = |AC1|). If any major subtype is absent from the sampled labeled set, we redo the sampling. For each size, we perform 20 random trials and report the average over them.</Paragraph> <Paragraph position="3"> Table 2 reports the performance of relation detection using SVM and LP with different sizes of labeled data. For SVM, we use the LIBSVM tool with a linear kernel function 3. The same sampled labeled data used for LP is used to train the SVM models. 
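To make the detection step concrete, here is a minimal sketch of graph-based label propagation with cosine-similarity edge weights, in the spirit of the LPCosine variant: all examples become graph nodes, labeled nodes are clamped to their classes, and label mass is propagated over the similarity graph until convergence. The function name and toy data below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def label_propagation(X, y, n_classes, tol=1e-6, max_iter=1000):
    """Sketch of label propagation over a cosine-similarity graph.

    X : (n, d) non-negative context feature vectors; the first
        len(y) rows are the labeled examples (hypothetical layout).
    y : class indices of the labeled rows (e.g., relation subtypes
        plus a NONE class).
    Returns predicted class indices for the unlabeled rows.
    """
    n, l = X.shape[0], len(y)
    # Edge weights: cosine similarity between context vectors.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    W = Xn @ Xn.T
    np.fill_diagonal(W, 0.0)
    # Row-normalize W into a propagation (transition) matrix T.
    T = W / W.sum(axis=1, keepdims=True)
    # Label distributions: labeled rows start one-hot, rest zero.
    Y = np.zeros((n, n_classes))
    Y[np.arange(l), y] = 1.0
    for _ in range(max_iter):
        Y_next = T @ Y                     # propagate label mass
        Y_next[:l] = 0.0
        Y_next[np.arange(l), y] = 1.0      # clamp the labeled data
        if np.abs(Y_next - Y).max() < tol:
            Y = Y_next
            break
        Y = Y_next
    return Y[l:].argmax(axis=1)
```

Swapping the cosine weights for a similarity derived from Jensen-Shannon divergence would give the LPJS-style variant; any unlabeled pair whose predicted class is not &quot;NONE&quot; counts as a detected relation.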
From Table 2, we see that both LPCosine and LPJS achieve higher Recall than SVM. This advantage is especially pronounced with a small labeled dataset (percentage of labeled data ≤ 25%). When the percentage of labeled data increases from 50% to 100%, LPCosine remains comparable to SVM in F-measure, while LPJS achieves a better F-measure than SVM. On the other hand, LPJS consistently outperforms LPCosine.</Paragraph> <Paragraph position="4"> Table 3 reports the performance of relation classification, averaged over the major relation subtypes. From Table 3, we see that LPCosine and LPJS outperform SVM in F-measure in almost all settings of labeled data, which is due to the increase in Recall. The smaller the labeled dataset, the larger the gap between LP and SVM. On the other hand, LPJS consistently outperforms LPCosine.</Paragraph> <Paragraph position="5"> 3.3.3 LP vs. Bootstrapping Zhang (2004) performs relation classification on the ACE corpus with bootstrapping on top of SVM. To compare with this bootstrapped SVM algorithm, we use the same feature stream setting and randomly select 100 instances from the training data as the initial labeled data. Table 4 lists the performance on individual relation types. The LP algorithm achieves a 6.8% performance improvement over Zhang (2004)'s bootstrapped SVM algorithm, averaged over all five relation types. Notice that the performance reported on the relation type &quot;NEAR&quot; is low, because it occurs rarely in both the training and test data.</Paragraph> </Section> </Section> </Paper>