<?xml version="1.0" standalone="yes"?>
<Paper uid="M91-1033">
  <Title>UNIVERSITY OF MASSACHUSETTS: DESCRIPTION OF THE CIRCUS SYSTEM AS USED FOR MUC-3</Title>
  <Section position="10" start_page="231" end_page="232" type="concl">
    <SectionTitle>
CONCLUSIONS
</SectionTitle>
    <Paragraph position="0"> As we explained at the beginning of this paper, CIRCUS was originally designed to investigate the integration of connectionist and symbolic techniques for natural language processing. The original connectionist mechanisms in CIRCUS operated to manage bottom-up slot insertion for information found in unexpected (i.e., unpredicted) prepositional phrases. Yet when our task orientation is selective concept extraction, the information we are trying to isolate is strongly predicted, and therefore unlikely to surface in a bottom-up fashion. For MUC-3, we discovered that bottom-up slot insertion was needed primarily to handle dates and locations: virtually all other relevant information was managed in a predictive fashion. Because dates and locations are relatively easy to recognize, any number of techniques could be successfully employed to handle bottom-up slot insertion for MUC-3. Although we used the numeric relaxation technique described in [1] to handle dates and locations, we consider this mechanism excessively powerful for the task at hand, and it could readily be eliminated for efficiency reasons in a practical implementation.</Paragraph>
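As an illustration of the "any number of techniques" that could replace numeric relaxation here, a minimal sketch of pattern-based bottom-up slot insertion for dates and locations might look as follows. All names (the regex, the gazetteer, the function) are hypothetical, not CIRCUS internals:

```python
import re

# Illustrative date pattern and location gazetteer; a real system would
# use broader patterns and a full place-name list.
DATE_PAT = re.compile(
    r"\b\d{1,2} (JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)\w* \d{2,4}\b",
    re.IGNORECASE,
)
LOCATIONS = {"LIMA", "BOGOTA", "EL SALVADOR"}

def insert_bottom_up_slots(sentence, template):
    """Fill DATE/LOCATION slots from surface mentions the predictive
    (top-down) analysis did not already account for."""
    match = DATE_PAT.search(sentence)
    if match and "DATE" not in template:
        template["DATE"] = match.group(0)
    for loc in LOCATIONS:
        if loc in sentence.upper() and "LOCATION" not in template:
            template["LOCATION"] = loc
    return template
```

Because the slots are filled only when the surface form is unambiguous, a lightweight recognizer like this suffices, which is the efficiency point the paragraph makes.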
    <Paragraph position="1"> Although our score reports for TST2 indicate that our system is operating at the leading edge of overall performance among MUC-3 systems, we nevertheless acknowledge that there are difficulties with our approach in terms of system development. It would take us a great deal of hard work (again) to scale up to this same level of performance in a completely new domain. New and inexperienced technical personnel would probably require about 6 months of training before they would be prepared to attempt a technology transfer to a new domain. At that point we estimate that another 1.5 person-years of effort would be needed to duplicate our current levels of performance in a new domain. Although these investments are not prohibitive, we believe there is room for improvement in the ways that we are engineering our dictionary entries and rule-based consolidation components. We need to investigate strategies for deducing linguistic regularities from texts and explore available resources that might leverage our syntactic analysis. Similar steps should be taken with respect to semantic analysis, although we are much more skeptical about the prospects for sharable resources in this problem area.</Paragraph>
    <Paragraph position="2"> Although we have had very little time to experiment with the CBR consolidation component, the CBR approach is very exciting in terms of system development possibilities. While the rule-based consolidation component had to be crafted and adjusted by hand, the case base for the CBR component was generated automatically and required virtually no knowledge of the domain or of CIRCUS per se. In fact, our CBR module can be transported with minor modification to any other MUC-3 system that generates case frame meaning representations for sentences. As a discourse analysis component, this module is truly generic and could be moved into a new domain with simple adjustments. The labor needed to make the CBR component operational is the labor needed to create a development corpus of texts with associated target template encodings (assuming a working sentence analyzer is already in place). It is much easier to train people to generate target templates for texts than it is to train computer programmers in the foundations of artificial intelligence and the design of large rule bases.</Paragraph>
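The retrieval step of a CBR consolidation module of this kind can be sketched in a few lines: given a case frame for a new sentence, find the stored case whose slot/filler pairs overlap it most and reuse that case's template decision. The case base is the automatically generated pairing of case frames with target-template actions described above; all function and field names here are hypothetical:

```python
def slot_overlap(frame_a, frame_b):
    """Count slot/filler pairs shared by two case frames
    (frames are plain dicts mapping slot names to fillers)."""
    return len(set(frame_a.items()) & set(frame_b.items()))

def retrieve_case(new_frame, case_base):
    """Return the (stored_frame, action) pair from the case base whose
    frame is most similar to new_frame under slot overlap."""
    return max(case_base, key=lambda case: slot_overlap(new_frame, case[0]))
```

Nothing in this retrieval loop is domain-specific, which is why the text can claim the module transports to any system that emits case frame meaning representations.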
    <Paragraph position="3"> And the amount of time needed to generate a corpus from scratch is only a fraction of the time needed to scale up a complicated rule base. So the advantages of CBR components for discourse analysis are enticing, to say the least. But much work needs to be done before we can determine the functionality of this technology as a strategy for natural language processing.</Paragraph>
    <Paragraph position="4"> Having survived the MUC-3 experience, we can say that we have learned a great deal about CIRCUS, the complexity of discourse analysis, and the viability of selective concept extraction as a technique for sophisticated text analysis. We are encouraged by our success, and we are now optimally positioned to explore exciting new research areas. Although our participation in MUC-3 has been a thoroughly positive experience, we recognize the need to balance intensive development efforts of this type against the somewhat riskier explorations of basic research. We would not expect to benefit so dramatically from another intensive performance evaluation if we could not take some time first to digest the lessons we have learned from MUC-3. Performance evaluations can operate as an effective stimulus for research, but only if they are allowed to accompany rather than dominate our principal research activities.</Paragraph>
  </Section>
</Paper>