<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1079">
  <Title>Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution</Title>
  <Section position="4" start_page="625" end_page="626" type="metho">
    <SectionTitle>
2 Zero-anaphora resolution
</SectionTitle>
    <Paragraph position="0"> In this paper, we consider only zero-pronouns that function as an obligatory argument of a predicate for two reasons: * Providing a clear definition of zero-pronouns appearing in adjunctive argument positions involves awkward problems, which we believe should be postponed until obligatory zero-anaphora is well studied.</Paragraph>
    <Paragraph position="1"> * Resolving obligatory zero-anaphora tends to be more important than adjunctive zero-pronouns in actual applications.</Paragraph>
    <Paragraph position="2"> A zero-pronoun may have its antecedent in the discourse; in this case, we say the zero-pronoun is anaphoric. On the other hand, a zero-pronoun whose referent does not explicitly appear in the discourse is called a non-anaphoric zero-pronoun. A zero-pronoun may be non-anaphoric typically when it refers to an extralinguistic entity (e.g. the first or second person) or its referent is unspecified in the context.</Paragraph>
    <Paragraph position="3"> The following are Japanese examples. In sentence (1), zero-pronoun phi is anaphoric as its antecedent, 'shusho (prime minister)', appears in the same sentence. In sentence (2), on the other hand, phj is considered non-anaphoric if its referent (i.e. the first person) does not appear in the discourse. (1) shushoi-wa houbeisi-te , prime ministeri-TOP visit-U.S.-CONJ PUNC</Paragraph>
    <Paragraph position="5"> The prime minister visited the united states and unveiled the plan to push diplomacy between the two countries.</Paragraph>
    <Paragraph position="6"> (2) (phj-ga) ie-ni kaeri-tai .</Paragraph>
    <Paragraph position="7"> (phj-NOM) home-DAT want to go back PUNC (I) want to go home.</Paragraph>
    <Paragraph position="8"> Given this distinction, we consider the task of zero-anaphora resolution as the combination of two sub-problems, antecedent identification and anaphoricity determination, which is analogous to NP-anaphora resolution: For each zero-pronoun in a given discourse, find its antecedent if it is anaphoric; otherwise, conclude it to be non-anaphoric.</Paragraph>
  </Section>
  <Section position="5" start_page="626" end_page="626" type="metho">
    <SectionTitle>
3 Previous work
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="626" end_page="626" type="sub_section">
      <SectionTitle>
3.1 Antecedent identification
</SectionTitle>
      <Paragraph position="0"> Previous machine learning-based approaches to antecedent identification can be classified as either the candidate-wise classification approach or the preference-based approach. In the former approach (Soon et al., 2001; Ng and Cardie, 2002a, etc.), given a target anaphor, TA, the model estimates the absolute likelihood of each of the candidate antecedents (i.e. the NPs preceding TA), and selects the best-scored candidate. If all the candidates are classified negative, TA is judged nonanaphoric. null In contrast, the preference-based approach (Yang et al., 2003; Iida et al., 2003) decomposes the task into comparisons of the preference between candidates and selects the most preferred one as the antecedent. For example, Iida et al. (2003) proposes a method called the tournament model. This model conducts a tournament consisting of a series of matches in which candidate antecedents compete with each other for a given anaphor.</Paragraph>
      <Paragraph position="1"> While the candidate-wise classification model computes the score of each single candidate independently of others, the tournament model learns the relative preference between candidates, which is empirically proved to be a significant advantage over candidate-wise classification (Iida et al., 2003).</Paragraph>
    </Section>
    <Section position="2" start_page="626" end_page="626" type="sub_section">
      <SectionTitle>
3.2 Anaphoricity determination
</SectionTitle>
      <Paragraph position="0"> There are two alternative ways for anaphoricity determination: the single-step model and the two-step model. The single-step model (Soon et al., 2001; Ng and Cardie, 2002a) determines the anaphoricity of a given anaphor indirectly as a by-product of the search for its antecedent. If an appropriate candidate antecedent is found, the anaphor is classified as anaphoric; otherwise, it is classified as non-anaphoric. One disadvantage of this model is that it cannot employ the preference-based model because the preference-based model is not capable of identifying non-anaphoric cases.</Paragraph>
      <Paragraph position="1"> The two-step model (Ng, 2004; Poesio et al., 2004; Iida et al., 2005), on the other hand, carries out anaphoricity determination in a separate step from antecedent identification. Poesio et al. (2004) and Iida et al. (2005) claim that the latter subtask should be done before the former. For example, given a target anaphor (TA), Iida et al.'s  selection-then-classification model: 1. selects the most likely candidate antecedent (CA) of TA using the tournament model, 2. classifies TA paired with CA as either anaphoric or non-anaphoric using an anaphoricity determination model. If the  CA-TA pair is classified as anaphoric, CA is identified as the antecedent of TA; otherwise, TA is conclude to be non-anaphoric.</Paragraph>
      <Paragraph position="2"> The anaphoricity determination model learns the non-anaphoric class directly from non-anaphoric training instances whereas the single-step model cannot not use non-anaphoric cases in training.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="626" end_page="628" type="metho">
    <SectionTitle>
4 Proposal
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="626" end_page="627" type="sub_section">
      <SectionTitle>
4.1 Task decomposition
</SectionTitle>
      <Paragraph position="0"> We approach the zero-anaphora resolution problem by decomposing it into two subtasks: intra-sentential and inter-sentential zero-anaphora resolution. For the former problem, syntactic patterns in which zero-pronouns and their antecedents appear may well be useful clues, which, however, does not apply to the latter problem. We therefore build a separate component for each subtask, adopting Iida et al. (2005)'s selection-thenclassification model for each component: 1. Intra-sentential antecedent identification: For a given zero-pronoun ZP in a given sentence S, select the most-likely candidate antecedent C[?]1 from the candidates appearing in S by the intra-sentential tournament model  2. Intra-sentential anaphoricity determination: Estimate plausibility p1 that C[?]1 is the true antecedent, and return C[?]1 if p1 [?] thintra (thintra is a preselected threshold) or go to 3 otherwise null 3. Inter-sentential antecedent identification: Select the most-likely candidate antecedent C[?]2 from the candidates appearing outside of S by the inter-sentential tournament model.</Paragraph>
      <Paragraph position="1"> 4. Inter-sentential anaphoricity determination: Estimate plausibility p2 that C[?]2 is the true antecedent, and return C[?]2 if p2 [?] thinter (thinter is a preselected threshold) or return non-anaphoric otherwise.</Paragraph>
    </Section>
    <Section position="2" start_page="627" end_page="627" type="sub_section">
      <SectionTitle>
4.2 Representation of syntactic patterns
</SectionTitle>
      <Paragraph position="0"> In the first two of the above four steps, we use syntactic pattern features. Analogously to SRL, we extract the parse path between a zero-pronoun to its antecedent to capture the syntactic pattern of their occurrence. Among many alternative ways of representing a path, in the experiments reported in the next section, we adopted a method as we describe below, leaving the exploration of other alternatives as future work.</Paragraph>
      <Paragraph position="1"> Given a sentence, we first use a standard dependency parser to obtain the dependency parse tree, in which words are structured according to the dependency relation between them. Figure 1(a), for example, shows the dependency tree of sentence (1) given in Section 2. We then extract the path between a zero-pronoun and its antecedent as in Figure 1(b). Finally, to encode the order of siblings and reduce data sparseness, we further transform the extracted path as in Figure 1(c): * A path is represented by a subtree consisting of backbone nodes: ph (zero-pronoun), Ant (antecedent), Node (the lowest common ancestor), LeftNode (left-branch node) and RightNode.</Paragraph>
      <Paragraph position="2"> * Each backbone node has daughter nodes, each corresponding to a function word associated with it.</Paragraph>
      <Paragraph position="3"> * Content words are deleted.</Paragraph>
      <Paragraph position="4"> This way of encoding syntactic patterns is used in intra-sentential anaphoricity determination. In antecedent identification, on the other hand, the tournament model allows us to incorporate three paths, a path for each pair of a zero-pronoun and left and right candidate antecedents, as shown in</Paragraph>
      <Paragraph position="6"/>
      <Paragraph position="8"/>
    </Section>
    <Section position="3" start_page="627" end_page="628" type="sub_section">
      <SectionTitle>
4.3 Learning algorithm
</SectionTitle>
      <Paragraph position="0"> As noted in Section 1, the use of zero-pronouns in Japanese is relatively less constrained by syntax compared, for example, with English. This forces the above way of encoding path information to produce an explosive number of different paths, which inevitably leads to serious data sparseness.</Paragraph>
      <Paragraph position="1"> This issue can be addressed in several ways.</Paragraph>
      <Paragraph position="2"> The SRL community has devised a range of variants of the standard path representation to reduce the complexity (Carreras and Marquez, 2005). Applying Kernel methods such as Tree  kernels (Collins and Duffy, 2001) and Hierarchical DAG kernels (Suzuki et al., 2003) is another strong option. The Boosting-based algorithm pro5To indicate which node belongs to which subtree, the label of each node is prefixed either with L, R or I.</Paragraph>
      <Paragraph position="4"> posed by Kudo and Matsumoto (2004) is designed to learn subtrees useful for classification.</Paragraph>
      <Paragraph position="5"> Leaving the question of selecting learning algorithms open, in our experiments, we have so far examined Kudo and Matsumoto (2004)'s algorithm, which is implemented as the BACT system6. Given a set of training instances, each of which is represented as a tree labeled either positive or negative, the BACT system learns a list of weighted decision stumps with a Boosting algorithm. Each decision stump is associated with tuple &lt;t,l,w&gt; , where t is a subtree appearing in the training set, l a label, and w a weight, indicating that if a given input includes t, it gives w votes to l. The strength of this algorithm is that it deals with structured feature and allows us to analyze the utility of features.</Paragraph>
      <Paragraph position="6"> In antecedent identification, we train the tournament model by providing a set of labeled trees as a training set, where a label is either left or right. Each labeled tree has (i) path trees TL, TR and TI (as given in Figure 2) and (ii) a set nodes corresponding to the binary features summarized in Table 3, each of which is linked to the root node as illustrated in Figure 4. This way of organizing a labeled tree allows the model to learn, for example, the combination of a subtree of TL and some of the binary features. Analogously, for anaphoricity determination, we use trees (TC,f1,...,fn), where TC denotes a path subtree as in Figure 1(c).</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>