<?xml version="1.0" standalone="yes"?>
<Paper uid="X96-1057">
<Title>NTT DATA: DESCRIPTION OF THE ERIE SYSTEM USED FOR MUC-6</Title>
<Section position="4" start_page="469" end_page="469" type="metho">
<SectionTitle> 4 PATTERNS DEFINED IN ERIE </SectionTitle>
<Paragraph position="0"> There are 54 dictionary patterns, 86 segmentation patterns, and 162 name recognition patterns defined in Erie. The pattern set was developed using a hundred newspaper articles that were annotated and provided to the MET participants by DARPA.</Paragraph>
<Paragraph position="1"> During its official run on a Sun SparcStation 10, Erie processed each article in an average of 1.5 seconds. This is several times faster than Textract.</Paragraph>
<Paragraph position="2"> Time and numeric expressions were identified with high recall and precision, but entity names, especially person names, were not identified well. This was probably because the patterns for entity names were not defined well enough. Since names can be expressed in many ways, the hundred newspaper articles used for pattern development were insufficient.</Paragraph>
</Section>
<Section position="5" start_page="469" end_page="469" type="metho">
<SectionTitle> 5 OBSERVATIONS </SectionTitle>
<Paragraph position="0"> Erie achieved high accuracy in the Japanese MET task. In the course of this project, most of our time was spent on the development of the engine generator. Considering that the pattern development was done in only two weeks, our scores are quite satisfactory. This was achieved by separating the patterns from the pattern matching engine, which made pattern development faster and easier. The pattern definition in Erie was powerful enough to identify the names and expressions required in the MET task.</Paragraph>
<Paragraph position="1"> Pattern development was done mainly by hand, which is very time-consuming.
To develop systems more rapidly, tools are needed that help pattern developers find and define patterns and then check the results. We will continue to work towards this goal, and we plan to improve our pattern matching engine so that it can deal with more complicated patterns that Erie cannot currently handle.</Paragraph>
</Section>
</Paper>