<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1079">
  <Title>Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution</Title>
  <Section position="3" start_page="0" end_page="625" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Zero-anaphora is a gap in a sentence that has an anaphoric function similar to a pro-form (e.g. pronoun) and is often described as &amp;quot;referring back&amp;quot; to an expression that supplies the information necessary for interpreting the sentence. For example, in the sentence &amp;quot;There are two roads to eternity, a straight and narrow, and a broad and crooked,&amp;quot; the gaps in &amp;quot;a straight and narrow (gap)&amp;quot; and &amp;quot;a broad and crooked (gap)&amp;quot; have a zero-anaphoric relationship to &amp;quot;two roads to eternity.&amp;quot; The task of identifying zero-anaphoric relations in a given discourse, zero-anaphora resolution, is essential in a wide range of NLP applications.</Paragraph>
    <Paragraph position="1"> This is the case particularly in such a language as Japanese, where even obligatory arguments of a predicate are often omitted when they are inferable from the context. In fact, in our Japanese newspaper corpus, for example, 45.5% of the nominative arguments of verbs are omitted. Since such gaps can not be interpreted only by shallow syntactic parsing, a model specialized for zero-anaphora resolution needs to be devised on the top of shallow syntactic and semantic processing.</Paragraph>
    <Paragraph position="2"> Recent work on zero-anaphora resolution can be located in two different research contexts. First, zero-anaphora resolution is studied in the context of anaphora resolution (AR), in which zero-anaphora is regarded as a subclass of anaphora. In AR, the research trend has been shifting from rule-based approaches (Baldwin, 1995; Lappin and Leass, 1994; Mitkov, 1997, etc.) to empirical, or corpus-based, approaches (McCarthy and Lehnert, 1995; Ng and Cardie, 2002a; Soon et al., 2001; Strube and M&amp;quot;uller, 2003; Yang et al., 2003) because the latter are shown to be a cost-efficient solution achieving a performance that is comparable to best performing rule-based systems (see the Coreference task in MUC1 and the Entity Detection and Tracking task in the ACE program2).</Paragraph>
    <Paragraph position="3"> The same trend is observed also in Japanese zero-anaphora resolution, where the findings made in rule-based or theory-oriented work (Kameyama, 1986; Nakaiwa and Shirai, 1996; Okumura and Tamura, 1996, etc.) have been successfully incorporated in machine learning-based frameworks (Seki et al., 2002; Iida et al., 2003).</Paragraph>
    <Paragraph position="4"> Second, the task of zero-anaphora resolution has some overlap with Propbank3-style semantic role labeling (SRL), which has been intensively studied, for example, in the context of the CoNLL SRL task4. In this task, given a sentence &amp;quot;To attract younger listeners, Radio Free Europe intersperses the latest in Western rock groups&amp;quot;, an SRL  model is asked to identify the NP Radio Free Europe as the A0 (Agent) argument of the verb attract. This can be seen as the task of finding the zero-anaphoric relationship between a nominal gap (the A0 argument of attract) and its antecedent (Radio Free Europe) under the condition that the gap and its antecedent appear in the same sentence.</Paragraph>
    <Paragraph position="5"> In spite of this overlap between AR and SRL, there are some important findings that are yet to be exchanged between them, partly because the two fields have been evolving somewhat independently. The AR community has recently made two important findings: * A model that identifies the antecedent of an anaphor by a series of comparisons between candidate antecedents has a remarkable advantage over a model that estimates the absolute likelihood of each candidate independently of other candidates (Iida et al., 2003; Yang et al., 2003).</Paragraph>
    <Paragraph position="6"> * An AR model that carries out antecedent identification before anaphoricity determination, the decision whether a given NP is anaphoric or not (i.e. discourse-new), significantly outperforms a model that executes those subtasks in the reverse order or simultaneously (Poesio et al., 2004; Iida et al., 2005). To our best knowledge, however, existing SRL models do not exploit these advantages. In SRL, on the other hand, it is common to use syntactic features derived from the parse tree of a given input sentence for argument identification. A typical syntactic feature is the path on a parse tree from a target predicate to a noun phrase in question (Gildea and Jurafsky, 2002; Carreras and Marquez, 2005). However, existing AR models deal with intra- and inter-sentential anaphoric relations in a uniform manner; that is, they do not use as rich syntactic features as state-of-the-art SRL models do, even in finding intra-sentential anaphoric relations. We believe that the AR and SRL communities can learn more from each other.</Paragraph>
    <Paragraph position="7"> Given this background, in this paper, we show that combining the aforementioned techniques derived from each research trend makes significant impact on zero-anaphora resolution, taking Japanese as a target language. More specifically, we demonstrate the following: * Incorporating rich syntactic features in a state-of-the-art AR model dramatically improves the accuracy of intra-sentential zero-anaphora resolution, which consequently improves the overall performance of zero-anaphora resolution. This is to be considered as a contribution to AR research.</Paragraph>
    <Paragraph position="8"> * Analogously to inter-sentential anaphora, decomposing the antecedent identification task into a series of comparisons between candidate antecedents works remarkably well also in intra-sentential zero-anaphora resolution.</Paragraph>
    <Paragraph position="9"> We hope this finding to be adopted in SRL.</Paragraph>
    <Paragraph position="10"> The rest of the paper is organized as follows.</Paragraph>
    <Paragraph position="11"> Section 2 describes the task definition of zero-anaphora resolution in Japanese. In Section 3, we review previous approaches to AR. Section 4 described how the proposed model incorporates effectively syntactic features into the machine learning-based approach. We then report the results of our experiments on Japanese zero-anaphora resolution in Section 5 and conclude in Section 6.</Paragraph>
  </Section>
class="xml-element"></Paper>