<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-0809">
  <Title>The Use of Lexical Semantics in Information Extraction *</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Customizing information extraction systems across different domains has become an important issue in Natural Language Processing. Many research groups are making progress toward efficient customization, including BBN (Weischedel, 1995), NYU (Grishman, 1995), SRI (Appelt et al., 1995), SRA (Krupka, 1995), MITRE (Aberdeen et al., 1995), and UMass (Fisher et al., 1995). SRI developed a specification language called FASTSPEC that automatically translates regular productions written by the developer into finite state machines (Appelt et al., 1995). FASTSPEC makes customization easier by sparing the developer the effort of enumerating all the possible ways of expressing the target information. * This work has been supported by a Fellowship from IBM Corporation.</Paragraph>
    <Paragraph position="1"> The HASTEN system developed at SRA (Krupka, 1995) employs a graphical user interface that allows the user to create patterns by identifying the important concepts in the text, as well as the relationships between the concepts. The concepts are then manually generalized to word classes before the patterns are applied to other texts from the domain.</Paragraph>
    <Paragraph position="2"> We have built a trainable information extraction system that enables any user to adapt the system to different applications. The trainability of the system gives users the ability to identify the patterns for the information of interest. The training process is similar to that of the HASTEN system. However, instead of relying on manual generalization as in HASTEN, our system automatically generalizes patterns using WordNet hierarchies. Automatic generalization of rules makes the customization process easier.</Paragraph>
    <Paragraph position="3"> This paper describes the automated rule generalization method and the use of WordNet (Miller, 1990) in our system. First, it introduces the idea of generalization; then it describes our Generalization Tree (GT) model based on WordNet and illustrates how GT controls the degree of generalization according to the user's needs. Finally, it presents some preliminary results from an experiment applying GT in our trainable information extraction system.</Paragraph>
  </Section>
</Paper>