File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/91/m91-1035_concl.xml
Size: 1,851 bytes
Last Modified: 2025-10-06 13:56:40
<?xml version="1.0" standalone="yes"?> <Paper uid="M91-1035"> <Title>PERQUIN THREATING TROOP VELIZ</Title> <Section position="8" start_page="252" end_page="252" type="concl"> <SectionTitle> CONCLUSIONS </SectionTitle> <Paragraph position="0"> The above results provide a measure of the extent to which the MUC-3 task can be viewed as one of sorting documents into categories based on statistical word associations. Only a subset of slots, the set fill slots, can easily be treated in this way. Of those slots, high performance was reached only on the two most associated with overall document content. However, a nontrivial level of performance was achieved on all the set fill slots, which provides an interesting point of comparison with knowledge-based techniques. An information-theoretic method of choosing words to predict slot/filler pairs was shown to achieve reasonable results, and studying the output of this method raised some interesting questions about the composition of the MUC-3 corpus.</Paragraph> <Paragraph position="1"> The Maxcat categorization software used was the result of about 6 months of programming effort. However, applying Maxcat to the MUC-3 task required only about a week's effort on the author's part, plus an additional two weeks' effort by a staff programmer to write the software for mapping MUC-3 texts and templates into and out of the format used by Maxcat. This suggests that categorization techniques may be of practical use in situations where an extraction system for set fill slots must be brought into operation in a short period of time. In addition, the relatively high performance on those slots related to overall document content suggests that categorization may have an important role to play with respect to similar slots in any extraction system.</Paragraph> </Section> </Paper>