<?xml version="1.0" standalone="yes"?>
<Paper uid="H94-1031">
  <Title>ISSUES AND METHODOLOGY FOR TEMPLATE DESIGN FOR INFORMATION EXTRACTION</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> The goal of Information Extraction tasks is to identify, categorize, classify, relate, and normalize specific information of interest found in free text, and to make that information available to a back-end data base, data fusion, or other application. A data structure referred to as a template is typically used for capturing such information, particularly in cases where the amount and complexity of information is substantial. The design of the template for such an application (or exercise) thus defines the task itself and therefore crucially affects the success of the Information Extraction attempt.</Paragraph>
    <Paragraph position="1"> This paper discusses template structure and the methodological issues that arise in the template design process, within the context of a discussion of the design process itself; it is based on the template design process for TIPSTER/MUC5 and certain subsequent Information Extraction exercises. The first section of this paper addresses the selection of the appropriate data representation (text annotation vs. flat template representation vs. object-oriented template).</Paragraph>
    <Paragraph position="2"> The second section outlines a set of high-level design considerations (desiderata) that have emerged; these desiderata feed into the discussion of design elements and a procedural review of the design process (design iterations, use of linguistic analysis tools, etc.).</Paragraph>
  </Section>
</Paper>