<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2419">
  <Title>Semantic Role Labeling using Maximum Entropy Model</Title>
  <Section position="3" start_page="0" end_page="2" type="metho">
    <SectionTitle>
2 Semantic Role Labeling using ME
</SectionTitle>
    <Paragraph position="0"> In the maximum entropy framework (Berger, 1996), the conditional probability of predicting an outcome y given a history x is defined as follows: $$p(y|x) = \frac{1}{Z(x)} \exp\left( \sum_{i=1}^{k} \lambda_i f_i(x,y) \right)$$</Paragraph>
    <Paragraph position="2"> where $\lambda_i$ is the weight of the feature function $f_i(x,y)$, k is the number of features, and Z(x) is the normalization factor that ensures $\sum_{y} p(y|x) = 1$.</Paragraph>
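    <Paragraph> The conditional probability above can be illustrated with a minimal sketch, assuming the standard log-linear form of the maximum entropy model with binary features. The names me_probabilities, toy_features, and toy_weights, as well as the weights themselves, are hypothetical and serve only as an illustration.

```python
import math


def me_probabilities(weights, feature_fn, x, outcomes):
    """Return p(y|x) for every outcome y under a log-linear model."""
    # Unnormalized score exp(sum_i lambda_i * f_i(x, y)) for each outcome.
    # Binary features: an active feature contributes its weight once.
    scores = {
        y: math.exp(sum(weights.get(f, 0.0) for f in feature_fn(x, y)))
        for y in outcomes
    }
    z = sum(scores.values())  # normalization factor Z(x)
    return {y: s / z for y, s in scores.items()}


def toy_features(x, y):
    """Hypothetical features: each context predicate paired with the outcome."""
    return [(name, y) for name in x]


# Hypothetical weights; a trained model would estimate these from data.
toy_weights = {
    ("ctag=NP", "B-A0"): 1.2,
    ("pos=before", "B-A0"): 0.8,
    ("ctag=NP", "O"): 0.3,
}

probs = me_probabilities(
    toy_weights, toy_features, ["ctag=NP", "pos=before"], ["B-A0", "B-A1", "O"]
)
```

Dividing by Z(x) is what makes the scores sum to one over the outcomes, so the result can be read as a distribution over role labels for a single constituent.</Paragraph>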
    <Paragraph position="3"> Given a predicate and its partial parse tree, represented by constituents such as chunks and clauses, the probabilistic model for semantic role labeling assigns semantic role labels to the constituents as described in equation (1).</Paragraph>
    <Paragraph position="5"> the i-th semantic role, n is the number of constituents.</Paragraph>
    <Paragraph position="6"> To apply equation (1) incrementally, we classify clauses into the immediate clause and the upper clauses. The immediate clause is the clause that contains the target predicate, and an upper clause is a clause that includes the immediate clause. Most of the arguments of the predicate are located in the immediate clause, while some are located in the upper clauses, especially the first or second upper clause. Since it is much easier and more reliable to identify the arguments in the immediate clause, the proposed method first assigns semantic role labels to the constituents in the immediate clause. It then assigns semantic role labels to the constituents in the upper clauses by using the previously assigned labels. This incremental approach is described in equation (2), which is derived from equation (1).</Paragraph>
    <Paragraph position="8"> Here, we regard a chunk or a clause as a constituent.</Paragraph>
    <Paragraph position="9"> where m is the number of constituents covered by the immediate clause, [...] is a feature set for upper clauses.</Paragraph>
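    <Paragraph> The two-phase order of the incremental approach can be sketched as follows. This is only an illustration under stated assumptions: classify_im and classify_up are hypothetical stand-ins for the classifiers of the two phases, the toy rules are invented for the example, and treating the immediate-clause constituents independently of one another is an assumption rather than something the text specifies.

```python
def label_incrementally(immediate, upper, classify_im, classify_up):
    """Label immediate-clause constituents first, then upper-clause ones."""
    # Phase 1: constituents covered by the immediate clause.
    im_roles = [classify_im(c) for c in immediate]
    # Phase 2: upper-clause constituents, using the labels from phase 1.
    up_roles = [classify_up(c, tuple(im_roles)) for c in upper]
    return im_roles + up_roles


def toy_im(constituent):
    """Hypothetical phase-1 rule standing in for a trained model."""
    return "B-A1" if constituent == "NP-after" else "O"


def toy_up(constituent, im_roles):
    """Hypothetical phase-2 rule that looks at the phase-1 labels."""
    # Propose A0 in an upper clause only if the immediate clause has none.
    if constituent == "NP-upper" and "B-A0" not in im_roles:
        return "B-A0"
    return "O"


roles = label_incrementally(["NP-after", "PP"], ["NP-upper"], toy_im, toy_up)
```

Passing the phase-1 labels into the second classifier is what lets the upper-clause decision depend on which roles have already been found.</Paragraph>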
    <Paragraph position="10"> A semantic role label ($r_i$) is represented in BIO notation, such as B-A*, I-A*, etc. However, because O occurs far more frequently than the other semantic role labels, it tends to receive a higher probability than they do. To reduce its probability, we divide the single O label into O-, O+, and O0 according to the position of the constituent relative to the predicate. Thus B-A*, I-A*, O-, O+, and O0 are used as semantic role labels, as shown in Figure 1.</Paragraph>
    <Paragraph position="13"> After applying equation (2), we use heuristics in a post-processing step to attach certain semantic roles and to adjust the boundaries of semantic arguments. More specifically, we use rules to attach the V, AM-MOD, and AM-NEG labels, and we extend the boundary of core roles to include the to-infinitive of a VP chunk, as in &quot;expect/B-VP (A1 to/I-VP take/I-VP dive/B-NP)&quot;.</Paragraph>
  </Section>
  <Section position="4" start_page="2" end_page="2" type="metho">
    <SectionTitle>
3 Feature Sets for Semantic Role Labeling
</SectionTitle>
    <Paragraph position="0"> For accurate semantic role labeling, we consider the following information important: the contextual information of the constituent, the syntactic information of the predicate, and the relation between the constituent and the predicate. Therefore, we use the features presented in Table 1 for semantic role labeling. For example, Figure 2 shows how the features in Table 1 are used to label the semantic roles of the proposition in Figure 1.</Paragraph>
    <Paragraph position="2"> [Table 2: conjoined feature sets, with one column for the feature set for the immediate clause and one for the feature set Φ for the upper clause. Entries in extraction order:] pl, ctag, ctag+v+p, ctag+v+p+pl pl, ctag, ctag+v+p ptag+ctag, ctag+ntag ptag+ctag, ctag+ntag hp+p, hp+p+ntag hp+p, hl+ctag, predtype+ctag predtype+ctag predlex+hl, predlex+ctag+v+p, predlex+ctag+pl predlex+hl, predlex+ctag+v+p, predlex+ctag+pl predpos+p, predpos+hp+pl, predpos+ctag path, path+hp+v, path+nhl, path+predlex path-im-cl, path-im-cl+ctag+v, path-beg-end hl+p, hl+ctag, hl+ctag+predlex ctag+l-cl, ptag+ctag+l-cl, ctag+ntag+l-cl chl+pl, chl+pl+predlex ctag+cl-bn, ptag+ctag+cl-bn, ctag+ntag+cl-bn chl+phl, chl+phl+predlex im-cl-roles</Paragraph>
    <Paragraph position="3"> Because the maximum entropy model assumes that features are independent, related features should be conjoined. As presented in Table 2, we use the conjoined feature sets to assign semantic roles to the constituents of the immediate clause and the upper clauses.</Paragraph>
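    <Paragraph> Conjoining features can be illustrated with a small sketch that turns a template such as ctag+v+p into a single feature string. The helper conjoin, the output format, and the atomic values are hypothetical; only the template names are taken from the conjoined feature sets.

```python
def conjoin(atomic, template):
    """Build one conjoined feature string from a '+'-separated template."""
    names = template.split("+")
    # Look up each atomic feature and join the values into a single token,
    # so the combination is seen by the model as one feature.
    values = [str(atomic[name]) for name in names]
    return template + "=" + "|".join(values)


# Hypothetical atomic feature values for one constituent.
atomic = {"ctag": "NP", "v": "active", "p": "before"}

features = [conjoin(atomic, t) for t in ["ctag", "ctag+v+p"]]
```

Because the joined string is a distinct feature, the model can learn a separate weight for, say, an NP before an active predicate rather than only one weight per atomic value.</Paragraph>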
    <Paragraph position="4"> The predicate-type feature represents the predicate usage, such as the to-infinitive form (TO), the beginning of the immediate clause (BEG), and otherwise (SEN). The tag feature represents the tag of the current constituent. If the constituent is a clause, the tag is subdivided into a relative pronoun, an infinitival relative clause, etc., according to the form in which it appears.</Paragraph>
    <Paragraph position="5"> The path feature indicates the sequence of constituent tags between the current constituent and the predicate. The voice feature indicates whether the predicate is in the active or passive voice, and the position feature indicates the position of the constituent with respect to the predicate. These features implicitly represent predicate-argument relations such as predicate-subject or predicate-object. For the headword feature, we use Collins' head-word rules, and as a complement to the headword feature, a content word feature is used to represent the content of a PP, VP, or CONJP chunk.</Paragraph>
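    <Paragraph> As a rough sketch of the path feature, the tags between the current constituent and the predicate can be joined into one string. Including both endpoints, ordering the tags from left to right, and using a hyphen as separator are all assumptions made for the example, as are the function name and the flat list of chunk tags.

```python
def path_feature(tags, cur, pred):
    """Join the constituent tags lying between two positions, inclusive."""
    # Order the two indices so the path always reads left to right.
    first, last = sorted((cur, pred))
    return "-".join(tags[first:last + 1])


# Hypothetical chunk tags; index 1 holds the predicate's VP chunk.
chunk_tags = ["NP", "VP", "NP", "PP"]
path = path_feature(chunk_tags, cur=3, pred=1)
```

A direction-sensitive variant would need the position feature as well, since this sketch yields the same string whichever side of the predicate the constituent is on.</Paragraph>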
    <Paragraph position="6"> The path-immediate-clause feature is the sequence of constituent tags between the current constituent and the immediate clause, and the path-begin-end feature is the sequence between the current constituent and the beginning or end of the clause. The level-of-clause feature indicates whether the current constituent is located in the first upper clause or in the second upper clause, and the is-clause-boundary feature is a binary value that indicates the existence of the starting clause. The immediate-clause-roles features are binary indicators of whether the core arguments exist in the immediate clause.</Paragraph>
    <Paragraph position="7"> The path-immediate-clause, path-begin-end, level-of-clause, is-clause-boundary, and immediate-clause-roles features are used only in the second phase; the remaining features, except the path feature and the content-head feature, are used in both phases.</Paragraph>
    <Paragraph position="8"> For example, if the PP chunk is "because of", the headword feature is "of" and the content word feature is "because".</Paragraph>
  </Section>
</Paper>