<?xml version="1.0" standalone="yes"?>
<Paper uid="M95-1020">
  <Title>STERLING SOFTWARE: AN NLTOOLSET-BASED SYSTEM FOR MUC-6</Title>
  <Section position="5" start_page="259" end_page="260" type="concl">
    <SectionTitle>
RESULTS AND CONCLUSIONS
</SectionTitle>
    <Paragraph position="0"> The overall results (see Table 2) were obtained in 4 person-weeks of effort, lifting some pattern and code ideas from the ATS, which worked on a very different set of message types, and wasting a few days on the ST task and on filling in date templates. These results show that our semantic-pattern-based approach to entity detection and templating is a very good one, and one which can be brought to bear on a new application quickly.</Paragraph>
    <Paragraph position="1"> As we have noted, dramatic improvements in the worst numbers (timex in NE, org locale and country in TE) would have been obtained with very minor changes in the patterns -- literally, a couple of hours' worth of work. The org locale fix would actually have given us the highest f-measure on that category: 61.3. Despite that &quot;couple hours&quot; estimate, we would have to say that our greatest limiting factor was time -- time to test more thoroughly and isolate the causes of the biggest problems. Slowness of the system was a problem but not a major one, as it took only a minute or two per article.</Paragraph>
    <Paragraph position="2"> After those two improvements, we turn to the problem of org descriptors -- although we had the highest f-measure, it was only 43.6, which shows that there is still room for improvement. Here, the solutions are less obvious. One step to take is to add to the patterns to allow modifier phrases after the head noun in a descriptor noun phrase, such as &quot;the agency with billings of $400 million&quot;. More exploration is needed on this, especially in light of the fact that both the recall and precision rates were low.</Paragraph>
    <Paragraph position="3"> Another area where we would like to make changes is in the order of reduction stages. For example, the system currently does all person reductions after organization reductions. This meant we had to prevent the secondary organization reduction from matching what are clearly person names (e.g., primary &quot;Schecter Group&quot; -/-&gt; secondary &quot;Mr. Schecter&quot;). The solution, clearly, is to apply some of the person patterns before the organization patterns.</Paragraph>
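    <Paragraph> The stage ordering discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the NLToolset implementation: the regular expressions, the function name reduce_entities, and the sample sentence are all hypothetical stand-ins for the real semantic patterns. The person stage runs first and claims its text spans, so the later secondary organization stage can skip any mention that a person pattern has already matched.

```python
import re

# Hypothetical toy patterns for illustration only; the system described in
# the paper uses NLToolset semantic patterns, not these regular expressions.
PERSON = re.compile(r"\b(?:Mr|Ms|Mrs|Dr)\. ([A-Z][a-z]+)")
ORG = re.compile(r"\b([A-Z][a-z]+) (?:Group|Corp|Inc)\b")


def reduce_entities(text):
    """Run a person stage first, then primary and secondary org stages."""
    # Stage 1: person reduction. Matched spans are claimed up front.
    person_matches = list(PERSON.finditer(text))
    claimed = [range(*m.span()) for m in person_matches]

    # Stage 2: primary organization reduction (full names such as "X Group").
    org_matches = list(ORG.finditer(text))
    claimed += [range(*m.span()) for m in org_matches]
    heads = {m.group(1) for m in org_matches}

    # Stage 3: secondary organization reduction. A bare head word refers
    # back to a primary organization unless an earlier stage claimed it.
    secondary = []
    for head in sorted(heads):
        for m in re.finditer(r"\b" + re.escape(head) + r"\b", text):
            if not any(m.start() in span for span in claimed):
                secondary.append(m.group(0))

    return {
        "persons": [m.group(0) for m in person_matches],
        "orgs": [m.group(0) for m in org_matches],
        "secondary_orgs": secondary,
    }


if __name__ == "__main__":
    sample = "Schecter Group hired Mr. Schecter. Schecter said profits rose."
    print(reduce_entities(sample))
```

    In this toy setup, the &quot;Schecter&quot; inside &quot;Mr. Schecter&quot; is never considered as a secondary organization mention, because the person stage has already claimed it; no separate blocking rule is needed in the organization stage.</Paragraph>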
    <Paragraph position="4"> Since all the processing occurs without any regard to the types of events discussed in the articles, the system we have developed here is easily portable across domains. If a domain required a different set of template slots than used for MUC-6, the patterns would be unchanged but the reduction code that fills the slots, and the postprocessing code that reports them, would have to be modified slightly.</Paragraph>
    <Paragraph position="5"> We have demonstrated, on MUC-6 and on CDIS, that we have an excellent approach to both entity and event extraction on a range of document types. We hope to have the opportunity to continue this work, as funding permits.</Paragraph>
  </Section>
</Paper>