<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0630">
  <Title>Hierarchical Semantic Role Labeling</Title>
  <Section position="4" start_page="201" end_page="201" type="metho">
    <SectionTitle>
2 The Basic Semantic Role Labeler
</SectionTitle>
    <Paragraph position="0"> In the last years, several machine learning approaches have been developed for automatic role labeling, e.g. (Gildea and Jurasfky, 2002; Pradhan et al., 2005). Their common characteristic is the adoption of flat feature representations for predicate-argument structures. Our basic system is similar to the one proposed in (Pradhan et al., 2005) and it is described hereafter.</Paragraph>
    <Paragraph position="1"> We divided the predicate argument labeling in two subtasks: (a) the detection of the arguments related to a target, i.e. all the compounding words of such argument, and (b) the classification of the argument type, e.g. A0 or AM. To learn both tasks we used the following algorithm:  1. Given a sentence from the training-set, generate a full syntactic parse-tree; 2. Let P and A be respectively the set of predicates and the set of parse-tree nodes (i.e. the potential arguments); null 3. For each pair &lt;p,a&gt;[?]P xA: - extract the feature representation set, Fp,a; - if the subtree rooted in a covers exactly the  words of one argument of p, put Fp,a in T+ (positive examples), otherwise put it in T[?] (negative examples).</Paragraph>
    <Paragraph position="2"> We trained the SVM boundary classifier on T+ and T[?] sets and the role labeler i on the T+i , i.e. its positive examples and T[?]i , i.e. its negative examples, where T+ = T+i [?]T[?]i , according to the ONE-vs.-ALL scheme. To implement the multi-class classifiers we select the argument associated with the maximum among the SVM scores.</Paragraph>
    <Paragraph position="3"> To represent the Fp,a pairs we used the following features:  - the Phrase Type, Predicate Word, Head Word, Governing Category, Position and Voice defined in (Gildea and Jurasfky, 2002); - the Partial Path, Compressed Path, No Direction Path, Constituent Tree Distance, Head Word POS, First and Last Word/POS in Constituent, SubCategorization and Head Word of Prepositional Phrases proposed in (Pradhan et al., 2005); and - the Syntactic Frame designed in (Xue and Palmer,</Paragraph>
  </Section>
  <Section position="5" start_page="201" end_page="202" type="metho">
    <SectionTitle>
3 Hierarchical Semantic Role Labeler
</SectionTitle>
    <Paragraph position="0"> Having two phases for argument labeling provides two main advantages: (1) the efficiency is increased as the negative boundary examples, which are almost all parse-tree nodes, are used with one classifier only (i.e. the boundary classifier), and (2) as arguments share common features that do not occur in the non-arguments, a preliminary classification between arguments and non-arguments advantages the boundary detection of roles with fewer training examples (e.g. A4). Moreover, it may be simpler to classify the type of roles when the not-argument nodes are absent.</Paragraph>
    <Paragraph position="1"> Following this idea, we generalized the above two level strategy to a four-step role labeling by grouping together the arguments sharing similar properties. Figure 1, shows the hierarchy employed for argument classification: During the first phase, we select the parse tree nodes which are likely predicate arguments. An SVM with moderately high recall is applied for such purpose.</Paragraph>
    <Paragraph position="2"> In the second phase, a simple heuristic which selects non-overlapping nodes from those derived in the previous step is applied. Two nodes n1 and n2 do not overlap if n1 is not ancestor of n2 and viceversa. Our heuristic simply eliminates the nodes that cause the highest number of overlaps. We have also studied how to train an overlap resolver by means of tree kernels; the promising approach and results can be found in (Moschitti et al., 2005).</Paragraph>
    <Paragraph position="3"> In the third phase, we classify the detected arguments in the following four classes: AX, i.e. Core  Arguments, AM, i.e. Adjuncts, CX, i.e. Continuation Arguments and RX, i.e. the Co-referring Arguments. The above classification relies on linguistic reasons. For example Core arguments class contains the arguments specific to the verb frames while Adjunct Arguments class contains arguments that are shared across all verb frames.</Paragraph>
    <Paragraph position="4"> In the fourth phase, we classify the members within the classes of the previous level, e.g. A0 vs. A1, ..., A5.</Paragraph>
  </Section>
  <Section position="6" start_page="202" end_page="203" type="metho">
    <SectionTitle>
4 The Experiments
</SectionTitle>
    <Paragraph position="0"> We experimented our approach with the CoNLL 2005 Shared Task standard dataset, i.e. the PennTree Bank, where sections from 02 to 21 are used as training set, Section 24 as development set (Dev) and Section 23 as the test set (WSJ). Additionally, the Brown corpus' sentences were also used as the test set (Brown). As input for our feature extractor we used only the Charniak's parses with their POSs.</Paragraph>
    <Paragraph position="1"> The evaluations were carried out with the SVM-</Paragraph>
    <Paragraph position="3"> which encodes the tree kernels in the SVM-light software (Joachims, 1999). We used the default polynomial kernel (degree=3) for the linear feature representations and the tree kernels for the structural feature processing.</Paragraph>
    <Paragraph position="4"> As our feature extraction module was designed to work on the PropBank project annotation format (i.e. the prop.txt index file), we needed to generate it from the CoNLL data. Each PropBank annotation refers to a parse tree node which exactly covers the target argument but when using automatic parses such node may not exist. For example, on the CoNLL Charniak's parses, (sections 02-21 and 24), we discovered that this problem affects 10,293 out of the 241,121 arguments (4.3%) and 9,741 sentences out of 87,257 (11.5%). We have found out that most of the errors are due to wrong parsing attachments. This observation suggests that the capability of discriminating between correct and incorrect parse trees is a key issue in the boundary detection phase and it must be properly taken into account. null</Paragraph>
    <Section position="1" start_page="202" end_page="202" type="sub_section">
      <SectionTitle>
4.1 Basic System Evaluation
</SectionTitle>
      <Paragraph position="0"> For the boundary classifier we used a SVM with the polynomial kernel of degree 3. We set the regularization parameter, c, to 1 and the cost factor, j to 7 (to have a slightly higher recall). To reduce the learning time, we applied a simple heuristic which removes the nodes covering the target predicate node. From the initial 4,683,777 nodes (of sections 02-21), the heuristic removed 1,503,100 nodes with a loss of 2.6% of the total arguments. However, as we started the experiments in late, we used only the 992,819 nodes from the sections 02-08. The classifier took about two days and half to converge on a 64 bits machine (2.4 GHz and 4Gb Ram).</Paragraph>
      <Paragraph position="1"> The multiclassifier was built with 52 binary argument classifiers. Their training on all arguments from sec 02-21, (i.e. 242,957), required about a half day on a machine with 8 processors (32 bits, 1.7 GHz and overll 4Gb Ram).</Paragraph>
      <Paragraph position="2"> We run the role multiclassifier on the output of the boundary classifier. The results on the Dev, WSJ and Brown test data are shown in Table 1. Note that, the overlapping nodes cause the generation of overlapping constituents in the sentence annotation. This prevents us to use the CoNLL evaluator. Thus, we used the overlap resolution algorithm also for the basic system.</Paragraph>
    </Section>
    <Section position="2" start_page="202" end_page="203" type="sub_section">
      <SectionTitle>
4.2 Hierarchical Role Labeling Evaluation
</SectionTitle>
      <Paragraph position="0"> As the first two phases of the hierarchical labeler are identical to the basic system, we focused on the last two phases. We carried out our studies over the Gold Standard boundaries in the presence of arguments that do not have a perfect-covering node in the Charniak trees.</Paragraph>
      <Paragraph position="1"> To accomplish the third phase, we re-organized the flat arguments into the AX, AM, CX and RX classes and we built a single multi-classifier. For the fourth phase, we built a multi-classifier for each of the above classes: only the examples related to the target class were used, e.g. the AX mutliclassifier was designed with the A0,..,A5 ONE-vs-ALL binary classifiers.</Paragraph>
      <Paragraph position="2"> In rows 2 and 3, Table 2 shows the numbers of training and development set instances. Row 4 contains the F1 of the binary classifiers of the third phase whereas Row 5 reports the F1 of the resulting multi-classifier. Row 6 presents the F1s of the multi-classifiers of the fourth phase.</Paragraph>
      <Paragraph position="3"> Row 7 illustrates the F1 measure of the fourth phase classifier applied to the third phase output. Fi- null the WSJ test (bottom).</Paragraph>
      <Paragraph position="4"> nally, in Row 8, we report the F1 of the basic system on the gold boundary nodes. We note that the basic system shows a slightly higher F1 but is less computational efficient than the hierarchical approach.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="203" end_page="203" type="metho">
    <SectionTitle>
5 Final Remarks
</SectionTitle>
    <Paragraph position="0"> In this paper we analyzed the impact of a hierarchical categorization on the semantic role labeling task.</Paragraph>
    <Paragraph position="1"> The results show that such approach produces an accuracy similar to the flat systems with a higher efficiency. Moreover, some preliminary experiments show that each node of the hierarchy requires different features to optimize the associated multiclassifier. For example, we found that the SCF tree kernel (Moschitti, 2004) improves the AX multiclassifier</Paragraph>
  </Section>
  <Section position="8" start_page="203" end_page="203" type="metho">
    <SectionTitle>
AX AM CX RX
</SectionTitle>
    <Paragraph position="0"> whereas the PAF tree kernel seems more suited for the classification within the other classes, e.g. AM.</Paragraph>
    <Paragraph position="1"> Future work on the optimization of each phase is needed to study the potential accuracy limits of the proposed hierarchical approach.</Paragraph>
  </Section>
class="xml-element"></Paper>