<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0819">
  <Title>Semantic Parsing Based on FrameNet</Title>
  <Section position="10" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
5 Experimental Results
</SectionTitle>
    <Paragraph position="0"> In the Senseval-3 task for Automatic Labeling of Semantic Roles 24,558 sentences from FrameNet were assigned for training while 8,002 for testing.</Paragraph>
    <Paragraph position="1"> We used 30% of the training set (7367 sentences) as a validation-set for selecting SVM parameters that optimize accuracy. The number of FEs for which labels had to be assigned were: 51,010 for the training set; 15,924 for the validation set and 16,279 for the test set. We used an additional set of 66,687 sentences (hereafter extended data) as extended data produced when using the examples associated with any other frame from FrameNet that had at least one FE shared with any of the 40 frames evaluated in Senseval-3. These sentences were parsed with the Collins' parser (Collins, 1997).</Paragraph>
    <Paragraph position="2"> The classifier experiments were carried out using the SVM-light software (Joachims, 1999) available at http://svmlight.joachims.org/with a polynomial kernel2 (degree=3).</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 Unrestricted Task Experiments
</SectionTitle>
      <Paragraph position="0"> For this task we devised four different experiments that used four different combination of features: (1) FS1 indicates using only Feature Set 1; (2) +H indicates that we added the heuristics; (3) +FS2+FS3 indicates that we add the feature Set 2 and 3; and (4) +E indicates that the extended data has also been used. For each of the four experiments we trained 40 multi-class classifiers, (one for each frame) for a total of 385 binary role classifiers. The following Table illustrates the overall performance over the validationset. To evaluate the results we measure the F1-score by combining the precision P with the recall R in the</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 Restricted Task Experiments
</SectionTitle>
      <Paragraph position="0"> In order to find the best feature combination for this task we carried out some preliminary experiments over five frames. In Table 1, the row labeled B lists the F1-score of boundary detection over 4 different feature sets: FS1, +H, +FS4 and +E, the extended data. The row labeled R lists the same results for the  Table 1 illustrates the overall performance (boundary detection and role classification) of automatic semantic role labeling. The results listed in Tables 1 and 2 were obtained by comparing the FE boundaries identified by our parser with those annotated in FrameNet. We believe that these results are more 2In all experiments and for any classifier, we used the default SVM-light regularization parameter (e.g., C = 1 for normalized kernels) and a cost-factor j = 100 to adjust the rate between Precision and Recall.</Paragraph>
      <Paragraph position="1"> indicative of the performance of our systems than those obtained when using the scorer provided by Senseval-3. When using this scorer, our results have a precision of 89.9%, recall of 77.2% and an F1-score of 83.07% for the Restricted Case.</Paragraph>
      <Paragraph position="2">  To generate the final Senseval-3 submissions we selected the most accurate models (for unrestricted and restricted tasks) of the validation experiments.</Paragraph>
      <Paragraph position="3"> Then we re-trained such models with all training data (i.e. our training plus validation data) and the setting (parameters, heuristics and extended data) derived over the validation-set. Finally, we run all classifiers on the test-set of the task. Table 2 illustrates the final results for both sub-tasks.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>
Download Original XML