<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-1026">
  <Title>Extracting Molecular Binding Relationships from Biomedical Text</Title>
  <Section position="4" start_page="192" end_page="193" type="evalu">
    <SectionTitle>
2 Evaluation
</SectionTitle>
    <Paragraph position="0"> To determine ARBITER's effectiveness, the program was formally evaluated against a gold standard of MEDLINE citations in which the binding predications asserted were marked by hand. A search of MEDLINE limited to one month (June 1997) and based on the text words ((bind, binds, binding, or bound) and (protein or proteins)) retrieved 116 citations with 1,141 sentences; of these, 346 contained some form of the verb bind. A total of 260 binding predications were identified in these binding sentences. (The binding sentences also contained 2,025 simple noun phrases, 1,179 of which were marked as being binding terms.) In processing this test collection, ARBITER extracted 181 binding predications, 132 of which were correct. Since ARBITER missed 128 marked binding predications (false negatives) and incorrectly identified 49 such relationships (false positives), recall and precision as measures of effectiveness are 51% and 73%, respectively.</Paragraph>
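    The recall and precision figures above follow directly from the reported counts. The short Python sketch below reproduces that arithmetic; the function name `recall_precision` is an illustrative assumption and is not part of ARBITER.

```python
def recall_precision(true_positives, false_negatives, false_positives):
    """Return (recall, precision) as fractions between 0 and 1."""
    # Recall: share of gold-standard predications that were found.
    recall = true_positives / (true_positives + false_negatives)
    # Precision: share of extracted predications that were correct.
    precision = true_positives / (true_positives + false_positives)
    return recall, precision


# Counts reported in the evaluation: 132 correct extractions,
# 128 missed predications, 49 incorrect extractions.
recall, precision = recall_precision(132, 128, 49)
print(f"recall = {recall:.0%}, precision = {precision:.0%}")
# prints "recall = 51%, precision = 73%"
```

    The denominators are the 260 predications marked in the gold standard (132 + 128) and the 181 predications extracted (132 + 49).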
    <Paragraph position="1"> In comparing ARBITER's output against that marked in the gold standard, fairly stringent matching criteria were used. A binding predication extracted from a particular sentence by ARBITER had to appear in that same sentence in the gold standard (not just the same citation) in order to be counted as correct. Further, in the gold standard, only the most specific component of a macro-noun phrase was marked as being the correct argument for a particular binding predication. If ARBITER retrieved any other part of a macro-noun phrase in identifying the arguments of that predication, it was assessed as an error.</Paragraph>
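    The sentence-level matching criterion described above can be sketched as a set comparison in which the sentence identifier is part of each predication's identity. This is a minimal illustration only: the tuple layout (sentence id, first argument, second argument), the function name `score_extractions`, and the toy data are assumptions made for the example, not ARBITER's actual data structures or test data.

```python
def score_extractions(gold, extracted):
    """Count true positives, false negatives, and false positives.

    Each predication is a (sentence_id, arg1, arg2) tuple, so an extraction
    only matches a gold entry from the same sentence with the same arguments.
    Argument order is treated as significant here, which is an assumption.
    """
    gold_set = set(gold)
    extracted_set = set(extracted)
    true_positives = len(gold_set.intersection(extracted_set))
    false_negatives = len(gold_set - extracted_set)
    false_positives = len(extracted_set - gold_set)
    return true_positives, false_negatives, false_positives


# Toy data (hypothetical): the second extraction names the right arguments
# but in a different sentence, so it does not count as correct.
gold = [("s1", "protein A", "protein B"), ("s2", "protein C", "protein D")]
extracted = [("s1", "protein A", "protein B"), ("s3", "protein C", "protein D")]
print(score_extractions(gold, extracted))  # prints (1, 1, 1)
```

    Under this scheme a predication found in the right citation but the wrong sentence counts both as a false positive and as a false negative, which is one reason the criteria are described as stringent.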
    <Paragraph position="2"> A large number of ARBITER errors are due to two phenomena: difficulties in correctly identifying binding terms during the first phase of processing, and syntactic complexity confounding argument identification during the second phase. An example of the first error type is seen in (12), where the failure to identify ran as a binding term caused ARBITER to miss the correct binding predication asserted in this sentence. This error then led to the false positive error (&quot;-FP-&gt;&quot;) when the program wrongly interpreted the next noun phrase in the sentence (signal-mediated nuclear protein export) as the second argument in this predication.</Paragraph>
    <Paragraph position="3"> The interaction of coordination and negation in (13) caused ARBITER to partially misinterpret the binding predications in this sentence. (13) The nonvisual arrestins, beta-arrestin and arrestin3, but not visual arrestin, bind specifically to a glutathione S-transferase-clathrin terminal domain fusion protein.</Paragraph>
    <Paragraph position="4"> Although some of the coordination in (13) was processed properly, resulting in the relationships listed, the negated coordination associated with the noun phrase visual arrestin was not interpreted correctly, and thus ARBITER failed to identify the predication marked as a false negative.</Paragraph>
  </Section>
</Paper>