<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0632">
  <Title>Maximum Entropy based Semantic Role Labeling</Title>
  <Section position="3" start_page="0" end_page="210" type="metho">
    <SectionTitle>
2 System Description
</SectionTitle>
    <Paragraph position="0"> In this section, we describe our system that identifies and classifies semantic arguments. First, we explain the pre-processing used in the identification phase.</Paragraph>
    <Paragraph position="1"> Next, we describe features employed. Finally, we explain classifiers used in each phase.</Paragraph>
    <Section position="1" start_page="0" end_page="209" type="sub_section">
      <SectionTitle>
2.1 Pre-processing
</SectionTitle>
      <Paragraph position="0"> We assumed that the occurrence of most semantic arguments is sensitive to the boundary of the immediate clause or the upper clauses of a predicate.</Paragraph>
      <Paragraph position="1"> We also assumed that they lie at a uniform distance on the parse tree from the predicate's parent node (called Pp) to the parse constituent's parent node (called Pc). Therefore, for identifying semantic arguments, we do not need to examine all parse constituents in a parse tree. In this study, we use the clause boundary restriction and the tree distance restriction, which provide useful information for narrowing the search space that is likely to include semantic arguments.</Paragraph>
      <Paragraph position="2"> In Figure 1 and Table 1, we show an example of applying the tree distance restriction. Figure 1 shows the distance between Pp=VP and the nonterminals of a parse tree. For example, NP2:d=3 means three downward steps through the parse tree from Pp=VP to Pc=NP2. NP4 has no distance from Pp because we allow movement only upward or only downward through the tree from Pp to Pc. Table 1 lists all 14 argument candidates that satisfy the tree distance restriction (d ≤ 3). Only 2 of the 14 argument candidates actually serve as semantic arguments (NP4, PP).</Paragraph>
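      <Paragraph> The tree distance restriction can be sketched as follows. This is an illustrative reimplementation, not the authors' code; the Node class and the sign convention (positive for downward steps, negative for upward steps) are our own assumptions.

```python
# Minimal sketch of the tree distance restriction: a constituent's parent
# Pc receives a distance from the predicate's parent Pp only if it is
# reachable by moving strictly upward or strictly downward in the tree.

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for c in self.children:
            c.parent = self

def distance(pp, pc):
    """Signed distance: +k for k downward steps, -k for k upward steps,
    None if Pc is neither an ancestor nor a descendant of Pp."""
    d, node = 0, pp                     # upward search: Pc as an ancestor
    while node is not None:
        if node is pc:
            return -d
        node, d = node.parent, d + 1
    frontier, d = [pp], 0               # downward search: Pc as a descendant
    while frontier:
        if any(n is pc for n in frontier):
            return d
        frontier = [c for n in frontier for c in n.children]
        d += 1
    return None

# Toy tree: (S (NP1 ...) (VP (V ...) (NP2 ...)))
np2 = Node("NP2")
np1 = Node("NP1")
vp = Node("VP", [Node("V"), np2])
s = Node("S", [np1, vp])
```

In this toy tree, a candidate such as NP1 gets no distance from Pp=VP, since reaching it would require moving first upward and then downward.</Paragraph>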
    </Section>
    <Section position="2" start_page="209" end_page="209" type="sub_section">
      <SectionTitle>
2.2 Features
</SectionTitle>
      <Paragraph position="0"> The following features describe properties of the verb predicate. These features are shared by all the parse constituents in the tree.</Paragraph>
      <Paragraph position="1"> pred lex: this is the predicate itself.</Paragraph>
      <Paragraph position="2"> pred POS: this is POS of the predicate.</Paragraph>
      <Paragraph position="3"> pred phr: this is the syntactic category of Pp.
pred type: this represents the predicate usage, such as to-infinitive form, the verb predicate of a main clause, or otherwise.</Paragraph>
      <Paragraph position="4"> voice: this is a binary feature identifying whether the predicate is active or passive.</Paragraph>
      <Paragraph position="5"> sub cat: this is the phrase structure rule expanding the predicate's parent node in the tree.
pt+pl: this is a conjoined feature of pred type and pred lex. Because the maximum entropy model assumes the independence of features, we need to conjoin such correlated features.</Paragraph>
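      <Paragraph> The feature conjunction can be illustrated as below. This is our own sketch with hypothetical values, not the authors' code: in a log-linear model every feature string receives its own weight, so an interaction between two atomic features must be expressed as a separate conjoined feature string.

```python
# Illustrative sketch of conjoining pred_type and pred_lex into the single
# pt+pl feature, alongside the atomic predicate features.

def predicate_features(pred_lex, pred_type, voice):
    return {
        "pred_lex=" + pred_lex,
        "pred_type=" + pred_type,
        "voice=" + voice,
        "pt+pl=" + pred_type + "|" + pred_lex,  # conjoined feature
    }

feats = predicate_features("ask", "to-inf", "active")
```
</Paragraph>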
      <Paragraph position="6"> The following features characterize the internal structure of an argument candidate. These features change with the constituent under consideration.
head lex: this is the headword of the argument candidate. We extract the headword by using Collins's headword rules.</Paragraph>
      <Paragraph position="7"> head POS: this is POS of the headword.</Paragraph>
      <Paragraph position="8"> head phr: this is the syntactic category of Pc.</Paragraph>
      <Paragraph position="9"> cont lex: this is the content word of the argument candidate. We extract the content word by using the head table of the chunklink.pl script.</Paragraph>
      <Paragraph position="10"> cont POS: this is POS of the content word.</Paragraph>
      <Paragraph position="11"> gov: this is the governing category introduced by Gildea and Jurafsky (2002).</Paragraph>
      <Paragraph position="12"> The following features capture the relations between the verb predicate and the constituent.</Paragraph>
      <Paragraph position="13"> path: this is the syntactic path through the parse tree from the parse constituent to the predicate.
pos: this is a binary feature identifying whether the constituent is before or after the predicate.
pos+clau: this conjoins pos with a feature indicating whether the constituent is located in the immediate clause, in the first upper clause, in the second upper clause, or in the third upper clause.
pos+VP, pos+NP, pos+SBAR: these are numeric features representing the number of the specific chunk types between the constituent and the predicate.</Paragraph>
      <Paragraph position="14"> pos+CC, pos+comma, pos+colon, pos+quote: these are numeric features representing the number of the specific POS types between the constituent and the predicate.</Paragraph>
      <Paragraph position="15"> The following conjoined features are also used: pl+hl (pred lex + head lex), pl+cl (pred lex + cont lex), and v+gov (voice + gov).</Paragraph>
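      <Paragraph> The path feature can be sketched as follows, in the style of Gildea and Jurafsky (2002): the chain of category labels from the constituent up to the lowest common ancestor and down to the predicate. This is our own illustrative reimplementation; the direction markers ('^' for upward, 'v' for downward) are our own choice.

```python
# Illustrative sketch of the 'path' feature between a constituent and the
# predicate: labels upward to the lowest common ancestor, then downward.

class N:
    def __init__(self, label, children=()):
        self.label, self.children, self.parent = label, list(children), None
        for c in self.children:
            c.parent = self

def ancestors(n):
    out = []
    while n is not None:
        out.append(n)
        n = n.parent
    return out

def path(const, pred):
    up = ancestors(const)
    down = ancestors(pred)
    common = next(a for a in up if a in down)      # lowest common ancestor
    up_part = up[: up.index(common) + 1]           # const .. LCA (upward)
    down_part = down[: down.index(common)][::-1]   # LCA+1 .. pred (downward)
    return ("^".join(n.label for n in up_part)
            + "".join("v" + n.label for n in down_part))

# Toy: (S (NP the-boy) (VP (VBD hit) (NP the-ball)))
vbd = N("VBD"); np_obj = N("NP"); np_subj = N("NP")
vp = N("VP", [vbd, np_obj]); s = N("S", [np_subj, vp])
```
</Paragraph>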
    </Section>
    <Section position="3" start_page="209" end_page="210" type="sub_section">
      <SectionTitle>
2.3 Classifier
</SectionTitle>
      <Paragraph position="0"> The ME classifier for the identification phase classifies each parse constituent into one of two classes: ARG or NON-ARG. The ME classifier for the classification phase classifies each identified argument into one of the pre-defined semantic roles (e.g. A0, A1, AM-ADV, AM-CAU, etc.).</Paragraph>
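      <Paragraph> The two-phase control flow can be sketched as follows. The model functions below are hypothetical stand-ins for the trained ME classifiers, and the toy candidates and roles are illustrative only.

```python
# Sketch of the two-phase pipeline: identify arguments, then classify them.

def identify(constituents, id_model):
    """Phase 1: keep the constituents the binary model labels ARG."""
    return [c for c in constituents if id_model(c) == "ARG"]

def classify(arguments, role_model):
    """Phase 2: assign each identified argument a semantic role."""
    return [(c, role_model(c)) for c in arguments]

# Hypothetical toy models, for illustration only.
id_model = lambda c: "ARG" if c in ("NP4", "PP") else "NON-ARG"
role_model = lambda c: {"NP4": "A1", "PP": "AM-LOC"}[c]

cands = ["NP1", "NP2", "NP4", "PP", "ADVP"]
labeled = classify(identify(cands, id_model), role_model)
```
</Paragraph>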
      <Paragraph position="1"> [Table 2: argument candidate statistics (#can., %can., #arg., %arg., Fb=1)]</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="210" end_page="211" type="metho">
    <SectionTitle>
3 Experiments
</SectionTitle>
    <Paragraph position="0"> To test the proposed method, we experimented with the CoNLL-2005 datasets (Wall Street Journal sections 02-21 as the training set, with Charniak's trees). The results were evaluated by using the srl-eval.pl script provided by the shared task organizers. For building classifiers, we utilized Zhang Le's MaxEnt toolkit, with the L-BFGS parameter estimation algorithm and Gaussian prior smoothing.</Paragraph>
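    <Paragraph> A Gaussian prior on a maxent model corresponds to an L2 penalty on the log-likelihood. The toy below is our own sketch, trained by plain gradient ascent rather than the L-BFGS used in the paper, and is meant only to show the penalized gradient; the data and hyperparameters are made up.

```python
# Binary maxent sketch: maximize log-likelihood minus a Gaussian prior
# penalty sum(w_i^2) / (2 * sigma^2), via gradient ascent.

import math

def train(data, n_feats, sigma2=1.0, lr=0.5, steps=500):
    w = [0.0] * n_feats
    for _ in range(steps):
        grad = [-wi / sigma2 for wi in w]      # Gaussian prior (L2) term
        for x, y in data:                      # x: active feature ids, y: 0/1
            p = 1.0 / (1.0 + math.exp(-sum(w[i] for i in x)))
            for i in x:
                grad[i] += y - p               # log-likelihood term
        w = [wi + lr * g for wi, g in zip(w, grad)]
    return w

# Tiny synthetic data: feature 0 indicates the positive class.
data = [([0], 1), ([0], 1), ([1], 0), ([1], 0)]
w = train(data, 2)
```

The prior term shrinks each weight toward zero, which is the smoothing effect referred to above.</Paragraph>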
    <Paragraph position="1"> Table 2 shows the different ways of reducing the number of argument candidates. The 2nd and 3rd columns (#can., %can.) indicate the number of argument candidates and the percentage of argument candidates that satisfy each restriction on the training set. The 4th and 5th columns (#arg., %arg.) indicate the number of correct arguments and the percentage of correct arguments that satisfy each restriction on the training set. The last column (Fb=1) indicates the performance of the identification task on the development set by applying each restriction.</Paragraph>
    <Paragraph position="2"> Under no restriction, All1 extracts candidates from all the child nodes of the nonterminals of a tree. All2 keeps only the nonterminals which include at least one nonterminal child. All3 keeps only the nonterminals which include at least one nonterminal child and have a distance from Pp. We use All3 as the baseline.</Paragraph>
    <Paragraph position="3"> Under the clause boundary restriction, for example, 2/0 means that the left search boundary for identifying arguments is set to the left boundary of the second upper clause, and the right search boundary is set to the right boundary of the immediate clause.</Paragraph>
    <Paragraph position="4"> Under the tree distance restriction, for example, 7/1 means that it is possible to move up to 7 steps upward (d ≤ 7) through the parse tree from Pp to Pc, and up to 1 step downward (d ≤ 1).</Paragraph>
    <Paragraph position="5"> When the clause boundary and tree distance restrictions are combined, for example, 3/1,7/1 denotes using both the clause boundary restriction (3/1) and the tree distance restriction (7/1).</Paragraph>
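    <Paragraph> The u/d check for a single candidate can be sketched as follows. The function and its signed-distance argument (negative for upward steps, None when no purely upward or downward path exists) are our own hypothetical framing of the notation above.

```python
# Sketch of the tree distance filter in u/d notation (default 7/1).

def within_tree_distance(dist, max_up=7, max_down=1):
    """True iff the candidate satisfies the u/d restriction."""
    if dist is None:                 # not on a purely upward/downward path
        return False
    if dist < 0:                     # upward steps from Pp to Pc
        return -dist <= max_up
    return dist <= max_down          # downward steps from Pp to Pc
```
</Paragraph>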
    <Paragraph position="6"> According to the experimental results, we use the 7/1 tree distance restriction for all following experiments. By applying this restriction, we can remove about 47.3% (%can.=52.70%) of the total argument candidates compared with All3. The 93.90% (%arg.) corresponds to the upper bound on recall.</Paragraph>
    <Paragraph position="7"> In order to estimate the relative contribution of each feature, we measure the performance of each phase on the development set by leaving out one feature at a time, as shown in the top of Table 3. Precision, Recall, and Fb=1 represent the performance of the identification task, and Accuracy represents the performance of the classification task given 100% correct argument identification. All represents the performance when all 26 features introduced in Section 2.2 are used.</Paragraph>
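    <Paragraph> The leave-one-out procedure can be sketched as follows. Here `evaluate` is a hypothetical stand-in for retraining on a feature subset and scoring on the development set; the per-template gains are made-up numbers for illustration only.

```python
# Sketch of leave-one-out feature ablation: the contribution of a template
# is the score drop when it is removed from the full feature set.

def ablation(templates, evaluate):
    baseline = evaluate(templates)
    return {t: baseline - evaluate([u for u in templates if u != t])
            for t in templates}

# Toy evaluate: each template contributes a fixed (made-up) gain, so a
# template with negative gain (like 'gov' below) hurts performance.
gains = {"path": 3.0, "head_lex": 2.0, "gov": -0.1}
evaluate = lambda ts: 75.0 + sum(gains[t] for t in ts)
contrib = ablation(list(gains), evaluate)
```

A negative contribution flags a feature worth dropping, which mirrors the decision below to exclude gov and pos+clau from identification.</Paragraph>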
    <Paragraph position="8"> Finally, for identification, we use the 24 features excluding gov and pos+clau, and obtain an Fb=1 of 80.59%, as shown in the bottom of Table 3. For classification, we use the 23 features excluding pred type, cont POS, and pos+clau, and obtain an Accuracy of 87.16%.</Paragraph>
    <Paragraph position="9"> Table 4 presents our best system's performance on the development set, and the performance of the same system on the test set. Table 5 shows the performance on the development set using the one-phase method and the two-phase method respectively. The one-phase method is implemented by incorporating identification into classification; its row shows the performance when the 25 features (all except pos+clau) are used. The experimental results show that the two-phase method outperforms the one-phase method in our evaluation.</Paragraph>
  </Section>
</Paper>