<?xml version="1.0" standalone="yes"?>
<Paper uid="H91-1059">
<Title>THIRD MESSAGE UNDERSTANDING EVALUATION AND CONFERENCE (MUC-3): PHASE 1 STATUS REPORT</Title>
<Section position="4" start_page="0" end_page="301" type="metho">
<SectionTitle> SCOPE </SectionTitle>
<Paragraph position="0"> The third evaluation began in October 1990. A dry-run phase was completed in February 1991, and the official testing will be carried out in May 1991, concluding with the Third Message Understanding Conference (MUC-3). This evaluation is significantly broader in scope than previous ones in most respects, including text characteristics, task specifications, performance measures, and the range of text understanding and information extraction techniques. The corpus and task are sufficiently challenging that they are likely to be used again (with a new test set) in a future evaluation of the same and/or similar systems.</Paragraph>
<Paragraph position="1"> The corpus was formed via a keyword query to an electronic database containing articles in message format from open sources worldwide, compiled, translated (if necessary), edited, and disseminated by the Foreign Broadcast Information Service of the U.S. Government. A training set of 1,300 texts was identified, and additional texts were set aside for use as test data. The corpus presents realistic challenges in terms of its overall size (over 2.5 MB), the length of the individual articles (approximately half a page each on average), the variety of text types (newspaper articles, summary reports, speech and interview transcripts, rebel communiques, etc.), the range of linguistic phenomena represented (both well-formed and ill-formed), and the open-ended nature of the vocabulary (especially with respect to proper nouns).</Paragraph>
</Section>
<Section position="5" start_page="301" end_page="302" type="metho">
<SectionTitle>
[Figure 1 (excerpt): closing portion of the sample message text, displaced into this section title during extraction]
... THE METROPOLITAN POLICE CAI [IMMEDIATE ATTENTION CENTER]. THE ANTIOQUIA DEPARTMENT LIBERAL PARTY LEADER HAD LEFT HIS HOUSE WITHOUT ANY BODYGUARDS ONLY MINUTES EARLIER. AS HE WAITED FOR THE TRAFFIC LIGHT TO CHANGE, THREE HEAVILY ARMED MEN FORCED HIM TO GET OUT OF HIS CAR AND GET INTO A BLUE RENAULT. HOURS LATER, THROUGH ANONYMOUS TELEPHONE CALLS TO THE METROPOLITAN POLICE AND TO THE MEDIA, THE EXTRADITABLES CLAIMED RESPONSIBILITY FOR THE KIDNAPPING. IN THE CALLS, THEY ANNOUNCED THAT THEY WILL RELEASE THE SENATOR WITH A NEW MESSAGE FOR THE NATIONAL GOVERNMENT. LAST WEEK, FEDERICO ESTRADA VELEZ HAD REJECTED TALKS BETWEEN THE GOVERNMENT AND THE DRUG TRAFFICKERS.
</SectionTitle>
<Paragraph position="0"> The task is to extract information on terrorist incidents (incident type, date, location, perpetrator, target, instrument, outcome, etc.) from the relevant messages in a blind test on 100 previously unseen texts in the test set. Approximately half of the messages will be irrelevant to the task as it has been defined. The extracted information is to be represented in the template in one of several ways, according to the information requirements of each slot. Some fills are required to be categories from a predefined set of possibilities (e.g., the various types of terrorist incidents, such as BOMBING, ATTEMPTED BOMBING, and BOMB THREAT); others are required to be canonicalized forms (e.g., for dates) or numbers; still others are to be in the form of strings (e.g., for person names). The participants collectively created a set of training templates, with each site manually filling in templates for 100 messages.</Paragraph>
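<Paragraph> The slot-fill distinctions described above can be made concrete with a small sketch: "set fills" drawn from a closed list of categories, canonicalized dates and numbers, and free-text string fills. The sketch below is illustrative only and is not the official MUC-3 template definition; the slot names, the abbreviated category list, and the assumed date format ("DD MON YY", as in the Figure 2 key below) are assumptions based solely on the examples mentioned in this report.

# Illustrative sketch (Python) of a MUC-3-style template with typed slot fills.
# NOT the official template definition; slot names and categories are assumed
# from the examples given in the text (BOMBING, ATTEMPTED BOMBING, BOMB THREAT,
# KIDNAPPING) and from the Figure 2 answer key.
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

# A "set fill" must come from a predefined list of categories (abbreviated here).
INCIDENT_TYPES = {"BOMBING", "ATTEMPTED BOMBING", "BOMB THREAT", "KIDNAPPING"}

@dataclass
class Template:
    msg_id: str                          # e.g., "TST1-MUC3-0080"
    incident_type: str                   # set fill: must be in INCIDENT_TYPES
    incident_date: Optional[str] = None  # canonicalized form, e.g., "03 APR 90"
    indiv_perpetrators: list[str] = field(default_factory=list)  # string fills
    org_perpetrators: list[str] = field(default_factory=list)    # string fills

    def __post_init__(self):
        if self.incident_type not in INCIDENT_TYPES:
            raise ValueError(f"unknown incident type: {self.incident_type}")
        if self.incident_date is not None:
            # Assumed canonical date format: "DD MON YY" (e.g., "03 APR 90").
            datetime.strptime(self.incident_date, "%d %b %y")

# Example instance corresponding to the answer key shown in Figure 2 below.
key_template = Template(
    msg_id="TST1-MUC3-0080",
    incident_type="KIDNAPPING",
    incident_date="03 APR 90",
    indiv_perpetrators=['"THREE HEAVILY ARMED MEN"'],
    org_perpetrators=['"THE EXTRADITABLES"'],
)
print(key_template)
</Paragraph>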
<Paragraph position="1"> A simple text and corresponding answer-key template are shown in Figures 1 and 2. Note that the text in Figure 1 is all upper case, that the dateline includes the source of the article ("Inravision Television Cadena 1"), and that the article is a news report by Jorge Alonso Sierra Valencia.</Paragraph>
<Paragraph position="2">
0. MSG ID               TST1-MUC3-0080
1. TEMPLATE ID          1
2. INCIDENT DATE        03 APR 90
3. INCIDENT TYPE        KIDNAPPING
4. INCIDENT CATEGORY    TERRORIST ACT
5. INDIV PERPETRATORS   "THREE HEAVILY ARMED MEN"
6. ORG PERPETRATORS     "THE EXTRADITABLES" / "EXTRADITABLES"
7. PERP CONFIDENCE      REPORTED AS FACT:
In Figure 2, the slot labels have been abbreviated to save space. The right-hand column contains the "correct answers" as defined by NOSC. Slashes mark alternative correct responses (systems are to generate just one of the possibilities), an asterisk marks slots that are inapplicable to the incident type being reported, and a hyphen marks a slot for which the text provides no fill (see the sketch at the end of this section).</Paragraph>
<Paragraph position="3"> A call for participation was sent to organizations in the U.S. that were known to be engaged in system design or development in the area of text analysis or information retrieval. Twelve of the sites that responded participated in the dry run and reported results at a meeting held in February 1991. These sites are [list of participating sites missing from the extracted text] in association with the University of Southwestern Louisiana (Lafayette, LA). The meeting also served as a forum for resolving issues that affect the test design, scoring, etc., for the official test in May.</Paragraph>
<Paragraph position="4"> A wide range of text interpretation techniques (e.g., statistical, keyword, template-driven, pattern-matching, and natural language processing) was represented in this phase of the evaluation. One of the participating sites, TRW, offered a preliminary baseline performance measure for a pattern-matching approach to information extraction that it has already put into successful operational use as an interactive system, applied to texts of a somewhat more homogeneous and straightforward nature than those found in the MUC-3 corpus. All sites reporting in February are likely to continue development in Phase 2 and undergo official testing in May. In addition, three sites that did not report results for the dry run expect to report results on the official run.</Paragraph>
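<Paragraph> The slash, asterisk, and hyphen conventions in the answer key amount to a small comparison rule for judging a single slot response. The sketch below shows one way such a key entry might be interpreted; it is a hypothetical illustration (the helper name fill_matches_key is ours), not the NOSC scoring program, and its treatment of "*" and "-" as requiring an empty response is an assumption.

# Illustrative sketch (Python): interpreting one answer-key entry.
# Hypothetical; not the official MUC-3/NOSC scoring procedure.
from typing import Optional

def fill_matches_key(response: Optional[str], key: str) -> bool:
    """Check one slot response against an answer-key entry.

    Conventions taken from the text:
      - "A / B" : either alternative is a correct response
      - "*"     : the slot is inapplicable to the incident type
      - "-"     : the text provides no fill for the slot
    Treating "*" and "-" as requiring an empty response is an assumption
    made only for this illustration.
    """
    key = key.strip()
    if key in {"*", "-"}:
        return response is None or response.strip() == ""
    alternatives = {alt.strip() for alt in key.split("/")}
    return response is not None and response.strip() in alternatives

# Slot 6 of the Figure 2 key allows either form of the organization name.
print(fill_matches_key('"EXTRADITABLES"', '"THE EXTRADITABLES" / "EXTRADITABLES"'))  # True
print(fill_matches_key(None, "-"))                                                   # True
</Paragraph>
</Section>
</Paper>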