<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0637">
  <Title>Applying spelling error correction techniques for improving semantic role labelling</Title>
  <Section position="5" start_page="230" end_page="231" type="metho">
    <SectionTitle>
4 Results
</SectionTitle>
    <Paragraph position="0"> In order to optimise the semantic role labelling process in a reasonable amount of time, we have divided it into four separate tasks: pruning the data for individual words, pruning the data for phrases, and labelling each of these two data sets. Pruning amounts to deciding which instances correspond to verb-argument pairs and which do not. This resulted in a considerable reduction of the two data sets: 47% for the phrase data and 80% for the word data. The remaining instances are assumed to define verb-argument pairs and the labelling tasks assign labels to them. We have performed a separate feature selection process in combination with the memory-based learner for each of the four tasks.</Paragraph>
    <Paragraph position="1"> First we selected the best feature set based on task accuracy. As soon as a working module for each of the tasks was available, we performed an extra feature selection process for each of the modules, optimising overall system Fβ=1 while keeping the other three modules fixed.</Paragraph>
    <Paragraph position="2"> [...] sets when memory-based learning is applied to the development set (overall Fβ=1). The process consisted of four tasks: pruning data sets for individual words and phrases, and labelling these two data sets.</Paragraph>
    <Paragraph position="3"> Selected features are shown in bold. Unfortunately, we have not been able to use all promising features.</Paragraph>
    <Paragraph position="4"> The effect of the features on the overall performance can be found in Table 1. One feature (syntactic path) was selected in all four tasks, but in general different features were required for optimal performance in the four tasks. Changing the feature set had the largest effect when labelling the phrase data.</Paragraph>
    <Paragraph position="5"> We have applied the two other learners, Maximum Entropy Models and Support Vector Machines, to the two labelling tasks, using the same features as the memory-based learner. The performance of the three systems on the development data can be found in Table 3. Since the systems performed differently, we have also evaluated a combined system which chose the majority class assigned to an instance, or the class of the strongest system (SVM) in case of a three-way tie. The combined system performed slightly better than the best</Paragraph>
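    <Paragraph position="6"> The voting rule described above can be sketched as follows. This is a minimal illustration and not the code used in the experiments: the function name combine_votes, the system names and the example role labels are assumptions made for the sketch. A label is returned when more than half of the systems agree on it; otherwise the prediction of the designated strongest system is used.

```python
from collections import Counter


def combine_votes(predictions, strongest="svm"):
    """Combine one instance's labels from several systems by majority vote.

    predictions: dict mapping a system name to the label it predicted.
    strongest:   hypothetical name of the system trusted when no label
                 reaches a majority (the SVM in the setting described).
    """
    counts = Counter(predictions.values())
    label, votes = counts.most_common(1)[0]
    # A strict majority: with three systems this means two or three votes.
    if votes * 2 > len(predictions):
        return label
    # No majority (a three-way tie with three systems): trust one system.
    return predictions[strongest]


# Two systems agree, so their label wins.
agreed = combine_votes({"mbl": "A0", "maxent": "A0", "svm": "A1"})
# All three disagree, so the SVM label is used.
tied = combine_votes({"mbl": "A0", "maxent": "A1", "svm": "A2"})
```

With three systems the only case without a majority is a three-way tie, so the fallback fires exactly in the situation the text describes.</Paragraph>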
  </Section>
  <Section position="6" start_page="231" end_page="231" type="metho">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> We have presented a machine learning approach to semantic role labelling based on full parses. We have split the process into four separate tasks: pruning the databases of word-based and phrase-based examples down to only the positive verb-argument cases, and labelling the two positively classified data sets. A novel automatic post-processing procedure based on spelling correction, which compares predicted role assignments with a trusted lexicon of verb-argument patterns from the training material, increased performance by correcting unlikely role assignments.</Paragraph>
    <Paragraph position="1"> [...] algorithm, the application of Levenshtein-distance-based post-processing and the use of system combination on the performance obtained for the development data set.</Paragraph>
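    <Paragraph position="2"> The idea behind the Levenshtein-distance-based post-processing can be illustrated with a short sketch. It is an assumption-laden illustration, not the procedure used in the experiments: the function names, the representation of a verb-argument pattern as a tuple of role labels, the example lexicon and the max_distance threshold are all hypothetical. A predicted pattern that does not occur in the lexicon is replaced by the nearest known pattern, provided that pattern is close enough.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute = 1)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def correct_pattern(predicted, lexicon, max_distance=1):
    """Replace an unseen role pattern by its nearest lexicon entry.

    predicted:    sequence of role labels predicted for one verb.
    lexicon:      list of label tuples observed in training (hypothetical).
    max_distance: assumed threshold; patterns further away stay unchanged.
    """
    predicted = tuple(predicted)
    if predicted in lexicon or not lexicon:
        return predicted
    best = min(lexicon, key=lambda known: levenshtein(predicted, known))
    if levenshtein(predicted, best) > max_distance:
        return predicted
    return best


# Hypothetical lexicon of patterns seen with one verb in the training data.
lexicon = [("A0", "V", "A1"), ("A0", "V", "A1", "A2")]
corrected = correct_pattern(["A0", "V", "A0"], lexicon)
```

In this example the unlikely pattern with two A0 arguments is one substitution away from a known pattern and is corrected to it, while a pattern far from every lexicon entry is left as predicted.</Paragraph>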
  </Section>
</Paper>