File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/00/a00-2030_intro.xml
Size: 1,861 bytes
Last Modified: 2025-10-06 14:00:43
<?xml version="1.0" standalone="yes"?> <Paper uid="A00-2030"> <Title>A Novel Use of Statistical Parsing to Extract Information from Text</Title> <Section position="3" start_page="0" end_page="226" type="intro"> <SectionTitle> 2 Information Extraction Tasks </SectionTitle> <Paragraph position="0"> We evaluated the new approach to information extraction on two of the tasks of the Seventh Message Understanding Conference (MUC-7), which are reported in (Marsh, 1998). The Template Element (TE) task identifies organizations, persons, locations, and some artifacts (rocket and airplane-related artifacts). For each organization in an article, one must identify all of its names as used in the article, its type (corporation, government, or other), and any significant description of it. For each person, one must find all of the person's names within the document, his/her type (civilian or military), and any significant descriptions (e.g., titles). For each location, one must also give its type (city, province, county, body of water, etc.). For the following example, the template element in Figure 1 was to be generated: &quot;...according to the report by Edwin Dorn, under secretary of defense for personnel and readiness .... Dorn's conclusion extracted for TE.</Paragraph> <Paragraph position="1"> The Template Relations (TR) task involves identifying instances of three relations in the text: * the products made by each company, * the employees of each organization, * the (headquarters) location of each organization.</Paragraph> <Paragraph position="2"> TR builds on TE in that TR reports binary relations between elements of TE. For the following example, the template relation in Figure 2 was to be generated: &quot;Donald M. Goldstein, a historian at the University of Pittsburgh who helped write...&quot;</Paragraph> </Section> </Paper>