<?xml version="1.0" standalone="yes"?> <Paper uid="C02-1078"> <Title>A Probabilistic Method for Analyzing Japanese Anaphora Integrating Zero Pronoun Detection and Resolution</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Anaphora resolution is crucial in natural language processing (NLP), particularly in discourse analysis. For English, a number of coreference resolution methods have been proposed, partially motivated by the Message Understanding Conferences (MUCs) (Grishman and Sundheim, 1996).</Paragraph> <Paragraph position="1"> In other languages such as Japanese and Spanish, anaphoric expressions are often omitted. Ellipses related to obligatory cases are usually termed zero pronouns. Since zero pronouns are not expressed in discourse, they have to be detected prior to identifying their antecedents. Thus, although English pleonastic pronouns must also be checked for whether they are anaphoric before resolution, the process of analyzing Japanese zero pronouns differs from general coreference resolution in English.</Paragraph> <Paragraph position="2"> Existing methods for identifying anaphoric relations fall into two fundamental approaches: rule-based and statistical. In rule-based approaches (Grosz et al., 1995; Hobbs, 1978; Mitkov et al., 1998; Nakaiwa and Shirai, 1996; Okumura and Tamura, 1996; Palomar et al., 2001; Walker et al., 1994), anaphoric relations between anaphors and their antecedents are identified by way of hand-crafted rules, which typically rely on syntactic structures, gender/number agreement, and selectional restrictions. However, it is difficult to produce rules exhaustively, and rules that are developed for a specific language are not necessarily effective for other languages.
For example, gender/number agreement in English cannot be applied to Japanese.</Paragraph> <Paragraph position="3"> Statistical approaches (Aone and Bennett, 1995; Ge et al., 1998; Kim and Ehara, 1995; Soon et al., 2001) use statistical models built from corpora annotated with anaphoric relations. However, only a few attempts have been made at corpus-based anaphora resolution for Japanese zero pronouns. One reason is that it is costly to produce a sufficient volume of training corpora annotated with anaphoric relations.</Paragraph> <Paragraph position="4"> In addition, the above methods focused mainly on identifying antecedents, and few attempts have been made to detect zero pronouns. Motivated by the above background, we propose a probabilistic model for analyzing Japanese zero pronouns combined with a detection method. In brief, our model consists of two parameters associated with zero pronoun detection and antecedent identification. We focus on zero pronouns whose antecedents appear in the sentences preceding them, because they are major referential expressions in Japanese.</Paragraph> <Paragraph position="5"> Section 2 explains our proposed method (system) for analyzing Japanese zero pronouns.</Paragraph> <Paragraph position="6"> Section 3 evaluates our method by way of experiments using newspaper articles. Section 4 discusses related research literature.</Paragraph> </Section> </Paper>