<?xml version="1.0" standalone="yes"?>
<Paper uid="M93-1021">
  <Title>UNISYS: Description of the CBAS System Used for MUC-5</Title>
  <Section position="8" start_page="260" end_page="260" type="concl">
    <SectionTitle>
CONCLUSIONS
</SectionTitle>
    <Paragraph position="0"> A motivating factor in the design of CBAS has been a desire to exploit simple data extraction methods to the fullest extent possible. We do not believe that we have fully exploited the capabilities of non-linguistic data extraction methods and intend to continue exploring such techniques, especially special-purpose parsers. However, at the same time, we do believe that linguistic analysis techniques will ultimately be essential in data extraction applications, and our research group is actively engaged in the development of new linguistically-based methodologies which meet the portability, reliability, accuracy, and speed requirements of large-scale systems.</Paragraph>
    <Paragraph position="1"> Another motivating factor has been the desire to build a relatively inexpensive system which individuals with no training whatsoever in linguistics could develop and maintain. The current implementation of CBAS certainly demonstrates that we have been successful in meeting this goal: the primary implementation media, Perl and CLIPS, are available at little or no cost; and we have made successful use of rule developers with little or no experience in linguistic analysis.</Paragraph>
    <Paragraph position="2"> Finally, the most significant factor in the design of CBAS has been a desire to exploit multiple preprocessors in the same way that multiple sensors are exploited in multisensor data fusion engines. The basic idea behind this design concept is simple: by having many different processors contributing information, the failure of any one processor will not result in a lot of information being lost. Thus, instead of having a single NLP parser from which all information regarding constituent structure is derived, multiple specialized parsers are implemented: parsers for recognizing company names, dates, names of individuals, place names, and so forth. In this type of situation, different parsers may contribute &quot;competing information&quot;. For example, a company name parser may determine that a given substring denotes the name of a company whereas a place name parser may determine that the same substring denotes the name of a city.</Paragraph>
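    The multiple-parser idea described above can be illustrated with a minimal Python sketch. It is not the CBAS implementation (which is built on Perl and CLIPS): the lexicons, the function names make_lexicon_parser, run_parsers and competing_spans, and the deliberately failing parser are all hypothetical, introduced only for illustration. The sketch runs each specialized parser independently, keeps whatever the working parsers return when one fails, and reports spans that received more than one label. It only collects competing labels and makes no attempt to resolve them.

```python
import re
from collections import defaultdict

# Toy lexicons: illustrative assumptions, not CBAS data.
COMPANY_NAMES = ["Hartford", "Unisys"]
PLACE_NAMES = ["Hartford", "Tokyo"]


def make_lexicon_parser(label, names):
    """Build a hypothetical specialized parser that tags known names."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")

    def parse(text):
        # Each hit is (start offset, end offset, label).
        return [(m.start(), m.end(), label) for m in pattern.finditer(text)]

    return parse


def broken_parser(text):
    """Stand-in for a preprocessor that fails on its input."""
    raise RuntimeError("parser failed")


def run_parsers(text, parsers):
    """Run every parser independently and group labels by text span."""
    labels_by_span = defaultdict(set)
    for parser in parsers:
        try:
            hits = parser(text)
        except Exception:
            continue  # one failed parser does not discard the others' output
        for start, end, label in hits:
            labels_by_span[(start, end)].add(label)
    return dict(labels_by_span)


def competing_spans(labels_by_span):
    """Return only the spans that received more than one label."""
    return {
        span: labels
        for span, labels in labels_by_span.items()
        if len(labels) > 1
    }


parsers = [
    make_lexicon_parser("company", COMPANY_NAMES),
    make_lexicon_parser("place", PLACE_NAMES),
    broken_parser,
]
text = "Unisys opened an office near Hartford"
result = run_parsers(text, parsers)
conflicts = competing_spans(result)
```

    In this toy input, the substring Hartford is tagged by both the company parser and the place parser, so it appears in conflicts, while the failing third parser contributes nothing and does not interrupt the other two.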
    <Paragraph position="3"> We have not yet actually proven the merit of the &quot;multisensor&quot; approach: there is no &quot;sensor management&quot; capability in existing CBAS implementations to compensate for preprocessor failure, nor is there any methodology in place for managing competing processor output. We of course intend to pursue the goal of proving the utility of this approach in future evaluation efforts with more sophisticated implementations of the CBAS architecture. In future implementations we are particularly interested in the possibility that a multisensor approach will provide a natural framework for the development of interactive data extraction systems in which the multiple preprocessors extract &quot;basic&quot; objects and relations (i.e., an ontology) from which composite structures are derived in response to user extraction queries (which are constrained by the ontology and a set of composition rules defined over it).</Paragraph>
  </Section>
</Paper>