<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1634">
  <Title>Automatic Construction of Predicate-argument Structure Patterns for Biomedical Information Extraction</Title>
  <Section position="8" start_page="288" end_page="291" type="evalu">
    <SectionTitle>
5 Results and Discussion
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="288" end_page="289" type="sub_section">
      <SectionTitle>
5.1 Experimental Results on the AImed Corpus
</SectionTitle>
      <Paragraph position="0"> To evaluate extraction patterns automatically constructed with our method, we used the AImed corpus, which consists of 225 MEDLINE (U.S. National Library of Medicine, 2006) abstracts (1969 sentences) annotated with protein names and protein-protein interactions, for the training/test corpora. We used the given tags for the protein names.</Paragraph>
      <Paragraph position="1"> We measured the accuracy of the IE task using the same criterion as Bunescu and Mooney (2006), who used an SVM to construct extraction patterns on word/POS/type sequences from the AImed corpus. That is, an interaction extracted from an abstract is correct if the proteins are tagged as interacting with each other somewhere in that abstract (document-level measure).</Paragraph>
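      <Paragraph> The document-level criterion can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation script used in the experiments: the function name, the tuple layout and the toy data are hypothetical, and each protein pair is assumed to count once per abstract regardless of argument order.

```python
def document_level_scores(extracted, gold):
    """Score extracted interactions at the document (abstract) level.

    Both arguments are iterables of (abstract_id, protein_a, protein_b).
    A pair counts once per abstract, and argument order is ignored.
    """
    def normalize(items):
        return {(doc, frozenset((a, b))) for doc, a, b in items}

    extracted_set = normalize(extracted)
    gold_set = normalize(gold)
    # An extracted pair is correct if the same pair is tagged as
    # interacting anywhere in the same abstract.
    correct = extracted_set.intersection(gold_set)

    precision = len(correct) / len(extracted_set) if extracted_set else 0.0
    recall = len(correct) / len(gold_set) if gold_set else 0.0
    return precision, recall


# Toy data, invented for illustration (not taken from AImed).
gold = [("abs1", "FGF-2", "KGFR"), ("abs2", "P1", "P2")]
extracted = [
    ("abs1", "KGFR", "FGF-2"),  # correct: same pair, order swapped
    ("abs1", "FGF-2", "KGFR"),  # repeated mention in the same abstract
    ("abs2", "P1", "P3"),       # not tagged as interacting in abs2
]
precision, recall = document_level_scores(extracted, gold)
print(precision, recall)  # 0.5 0.5
```

Collapsing repeated mentions into one entry per abstract is what distinguishes this measure from the per-occurrence check used in the error analysis.</Paragraph>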
      <Paragraph position="2"> Figure 4 plots the results of our 10-fold cross validation together with those of Bunescu and Mooney (2006). The line ALL represents results when we used all features for SVM learning. The line SCORE represents results when we extracted pairs whose combination matching scores were higher than various threshold values. The line ERK represents the results of Bunescu and Mooney (2006).</Paragraph>
      <Paragraph position="3"> The line ALL obtained our best overall F-measure of 57.3%, with 71.8% precision and 48.4% recall. Our method was significantly better than that of Bunescu and Mooney (2006) for precision between 50% and 80%. Note also that SCORE, which did not use SVM learning and used only the combination patterns, achieved performance comparable to that of Bunescu and Mooney (2006) in the same precision range, and that introducing the fragmental patterns with SVM learning raised the recall in this range. This range of precision is practical for the IE task for two reasons: precision is more important than recall for significant interactions, which tend to be described in many abstracts (as shown by the next experiment), and the very low recall that accompanies very high precision requires an excessively large source text.</Paragraph>
      <Paragraph position="4"> Figure 5 shows the advantage of introducing full parsing. &quot;FGF-2&quot; and &quot;KGFR&quot; are an interacting protein pair. The pattern &quot;ENTITY1 interact with ENTITY2&quot; based on PASs successfully extracts this pair. However, it is difficult to extract this pair with patterns based on surface words, because there are 5 words between &quot;FGF-2&quot; and &quot;interact&quot;.</Paragraph>
    </Section>
    <Section position="2" start_page="289" end_page="289" type="sub_section">
      <SectionTitle>
5.2 Experimental Results on Abstracts of MEDLINE
</SectionTitle>
      <Paragraph position="0"> We also conducted an experiment to extract interacting protein pairs from a large amount of biomedical text, i.e. about 14 million titles and 8 million abstracts in MEDLINE. We constructed combination patterns from all 225 abstracts of the AImed corpus, and calculated a threshold value of combination scores that produced about 70% precision and 30% recall on the training corpus.</Paragraph>
      <Paragraph position="1"> We extracted protein pairs with higher combination scores than the threshold value. We excluded single-protein interactions to reduce processing time, and we used a protein name recognizer in this experiment.2</Paragraph>
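      <Paragraph> One way to choose such a threshold on the training corpus is sketched below. This is an illustration under assumptions, not the procedure actually used: the function name, the input layout and the toy scores are hypothetical. It returns the lowest cutoff whose precision reaches the target, which maximizes recall among the qualifying cutoffs, and pairs are extracted when their score is strictly higher than the cutoff.

```python
def pick_threshold(scored_pairs, target_precision=0.70):
    """Return (threshold, precision, recall) for the lowest cutoff whose
    precision on the training data reaches target_precision, or None.

    scored_pairs: list of (combination_score, is_true_interaction).
    """
    total_true = sum(1 for _, is_true in scored_pairs if is_true)
    best = None
    # Visit candidate cutoffs from high to low, so that the last
    # qualifying cutoff kept in `best` is the lowest one.
    for threshold in sorted({score for score, _ in scored_pairs}, reverse=True):
        kept = [is_true for score, is_true in scored_pairs if score > threshold]
        if not kept:
            continue
        precision = sum(kept) / len(kept)
        recall = sum(kept) / total_true if total_true else 0.0
        if precision >= target_precision:
            best = (threshold, precision, recall)
    return best


# Toy training scores, invented for illustration.
toy = [(0.9, True), (0.8, True), (0.7, False), (0.6, True),
       (0.5, False), (0.4, False), (0.3, True)]
print(pick_threshold(toy))  # (0.5, 0.75, 0.75)
```

Precision is not monotonic in the cutoff, which is why every candidate is checked rather than stopping at the first one that falls below the target.</Paragraph>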
      <Paragraph position="2"> We compared the extracted pairs with a manually curated database, Reactome (Joshi-Tope et al., 2005), which published 16,564 human protein interaction pairs as pairs of Entrez Gene IDs (U.S. National Library of Medicine, 2006).</Paragraph>
      <Paragraph position="3"> We converted our extracted protein pairs into pairs of Entrez Gene IDs by the protein name recognizer.3 Because there may be pairs missed by Reactome or pairs that our processed text did not include, we excluded extracted pairs of IDs that are not included in Reactome and excluded Reactome pairs of IDs that do not co-occur in the sentences of our processed text.</Paragraph>
      <Paragraph position="4"> 3 [...] and other species, these are considered to be human proteins without checking the context. This is a fair assumption because Reactome itself infers human interaction events from experiments on model organisms such as mice.</Paragraph>
      <Paragraph position="5"> After this postprocessing, we found that we had extracted 7775 human protein pairs. Of these, 155 pairs were also included in Reactome ([a] pseudo TPs) and 7620 pairs were not ([b] pseudo FPs). A further 947 Reactome pairs were not extracted by our system ([c] pseudo False Negatives (FNs)). However, these results included pairs that Reactome missed and pairs that only co-occurred in the text without interacting. There may also have been errors in ID assignment. To identify such cases, a biologist investigated 100 pairs randomly selected from the pseudo TPs, FPs and FNs while retaining the ratio of their numbers. She also checked the correctness of the assigned IDs. Two pairs were selected from the pseudo TPs, 88 from the pseudo FPs and 10 from the pseudo FNs. The biologist found that 57 pairs were actual TPs (2 of the pseudo TPs and 55 of the pseudo FPs) and that 32 of the pseudo FPs were actual FPs. Thus, the precision was 64.0% in this sample set. Furthermore, even if we assume that all pseudo FNs are actual FNs, the recall can be estimated as actual TPs / (actual TPs + pseudo FNs) × 100 = 83.8%.</Paragraph>
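      <Paragraph> The sample-based estimates can be checked with a few lines of arithmetic. The helper names below are hypothetical, and the counts are those given in the text. The precision follows directly from them. The recall formula evaluated with 10 pseudo FNs gives about 85.1%, whereas the reported 83.8% corresponds to 57 / (57 + 11), so the exact counts behind the reported recall may differ slightly from those listed.

```python
def precision_pct(true_pos, false_pos):
    """Precision as a percentage."""
    return 100.0 * true_pos / (true_pos + false_pos)


def recall_pct(true_pos, false_neg):
    """Recall as a percentage."""
    return 100.0 * true_pos / (true_pos + false_neg)


actual_tp = 2 + 55  # pseudo TPs and pseudo FPs judged to be correct
actual_fp = 32      # pseudo FPs judged to be incorrect

print(round(precision_pct(actual_tp, actual_fp), 1))  # 64.0
print(round(recall_pct(actual_tp, 10), 1))            # 85.1
print(round(recall_pct(actual_tp, 11), 1))            # 83.8
```
</Paragraph>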
      <Paragraph position="6"> These results mean that the recall of an IE system for interacting proteins improves on a large amount of text even if it is low on a small corpus. This supports our assertion that a high degree of precision in the low-recall range is important.</Paragraph>
    </Section>
    <Section position="3" start_page="289" end_page="290" type="sub_section">
      <SectionTitle>
5.3 Error Analysis
</SectionTitle>
      <Paragraph position="0"> Tables 3 and 4 list causes of error for FNs/FPs on a test set of the AImed corpus, using the prediction model that gave the best F-measure with all the features. Unlike in Subsection 5.1, we individually checked each occurring pair of interacting proteins. The biggest problems were parsing error/failure, lack of necessary patterns and learning of inappropriate patterns.</Paragraph>
      <Paragraph position="1"> As listed in Table 3, 14 (40%) of the 35 parsing errors/failures were related to coordinations. Many of these were caused by differences in characteristics between the PTB/GTB, the training corpora for Enju, and the AImed corpus. For example, Enju failed to obtain the correct structure for &quot;the ENTITY1 / ENTITY1 complex&quot; because words in the PTB/GTB are not segmented with &quot;/&quot; and Enju could not be trained on such a case. One method to solve this problem is to avoid segmenting words with &quot;/&quot; and to introduce extraction patterns based on surface characters, such as &quot;ENTITY1/ENTITY2 complex&quot;.</Paragraph>
      <Paragraph position="2"> Parsing errors are intrinsic problems for IE methods that use parsing. However, from Table 3, we can conclude that the key to gaining better accuracy is refining the method with which the PAS patterns are constructed (there were 46 related FNs) rather than improving parsing (there were 35 FNs).</Paragraph>
      <Paragraph position="3"> 5.3.2 Lack of Necessary Patterns and Learning of Inappropriate Patterns. There are two reasons for the lack of necessary patterns and the learning of inappropriate patterns: (1) the training corpus was not sufficiently large to saturate IE accuracy and (2) our method of pattern construction was too limited.</Paragraph>
      <Paragraph position="4"> Effect of Training Corpus Size. To investigate whether the training corpus was large enough to maximize IE accuracy, we conducted experiments on training corpora of various sizes. Figure 6 plots the F-measures by SCORE and Figure 7 plots the number of combination patterns on training corpora of various sizes. Figures 6 and 7 indicate that the training corpus (207 abstracts at most) is not large enough. Thus, increasing the corpus size is expected to further improve IE accuracy.</Paragraph>
      <Paragraph position="5"> Limitation of the Present Pattern Construction. The limitations of our pattern construction method are revealed by the fact that we could not achieve precision as high as that of Bunescu and Mooney (2006) within the high-recall range.</Paragraph>
      <Paragraph position="6"> Compared to theirs, one of our problems is that our method could not handle attributives. One example is &quot;binding property of ENTITY1 to ENTITY2&quot;. We could not obtain &quot;binding&quot; because the smallest set of PASs connecting &quot;ENTITY1&quot; and &quot;ENTITY2&quot; includes only the PASs of &quot;property&quot;, &quot;of&quot; and &quot;to&quot;. To handle these attributives, we need to distinguish necessary attributives from those that are general4 by semantic analysis or bootstrapping.</Paragraph>
      <Paragraph position="7"> Another approach to improving our method is to include local information in sentences, such as surface words between protein names. Zhao and Grishman (2005) reported that adding local information to deep syntactic information improved IE results. This approach is also applicable to IE in other domains where related entities occur within a short distance of each other, as in the work of Zhou et al. (2005).</Paragraph>
    </Section>
    <Section position="4" start_page="290" end_page="291" type="sub_section">
      <SectionTitle>
4 Consider the case where a source sentence for a pattern is
</SectionTitle>
      <Paragraph position="0"> &quot;ENTITY1 is an important homodimeric protein.&quot; (&quot;homodimeric&quot; represents that two molecules of &quot;ENTITY1&quot; interact with each other.)</Paragraph>
    </Section>
  </Section>
</Paper>