File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/06/w06-2207_concl.xml
Size: 1,632 bytes
Last Modified: 2025-10-06 13:55:42
<?xml version="1.0" standalone="yes"?> <Paper uid="W06-2207"> <Title>A Hybrid Approach for the Acquisition of Information Extraction Patterns</Title> <Section position="7" start_page="54" end_page="54" type="concl"> <SectionTitle> 6 Conclusions </SectionTitle> <Paragraph position="0"> This paper introduces a hybrid, lightly supervised method for the acquisition of syntactico-semantic patterns for Information Extraction. Our approach co-trains a decision list learner, whose feature space covers the set of all syntactico-semantic patterns, with an Expectation Maximization clustering algorithm that uses the text words as attributes. Furthermore, we customize the decision list learner with up to four criteria for pattern selection, the most important component of the acquisition algorithm.</Paragraph> <Paragraph position="1"> To evaluate the proposed approach, we used both an indirect evaluation based on Text Categorization and a direct evaluation in which human experts judged the quality of the generated patterns. Our results indicate that co-training the Expectation Maximization algorithm with the decision list learner tailored to acquire only high-precision patterns is by far the best solution. At the same recall point, the proposed method increases the precision of the generated models by up to 35% over the previous state of the art. Furthermore, the combination of the two feature spaces (words and patterns) also increases the coverage of the acquired patterns. The direct evaluation of the acquired patterns by the human experts validates these results.</Paragraph> </Section> </Paper>