<?xml version="1.0" standalone="yes"?>
<Paper uid="J94-2006">
  <Title>Educational Testing Service</Title>
  <Section position="1" start_page="0" end_page="303" type="abstr">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> Several researchers have noted the local coherence exhibited by discourse (Sidner 1979; Grosz, Joshi, and Weinstein 1983; Carter 1987; etc.). A primary component of this local coherence is the way the local focus of the discourse shifts from one sentence to the next and the way this shifting is marked by linguistic choices made by the writer/speaker.</Paragraph>
    <Paragraph position="1"> By local focus, we refer to the concept that a sentence is most centrally about within the discourse context in which it occurs. This is sometimes called the topic or center.</Paragraph>
    <Paragraph position="2"> A local focusing framework typically consists of focus-tracking algorithms and algorithms for suggesting referents for pronouns. Such a framework can be used in conjunction with an inferencing mechanism to resolve pronouns (and other anaphora) in a Natural Language Understanding system. The focusing framework suggests a referent for a pronoun, and the inferencing mechanism then confirms or rejects the suggested referent on the basis of semantic factors (e.g., semantics and world knowledge).</Paragraph>
    <Paragraph position="3"> The focusing framework is useful because it only requires the inferencing mechanism to confirm a co-specification, rather than requiring the inferencing mechanism to find the referent independently.</Paragraph>
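The division of labor described above can be sketched as a propose-and-confirm loop. This is a minimal Python illustration under stated assumptions, not the implementation of any framework discussed here: the function name resolve_pronoun, the is_plausible callback standing in for the inferencing mechanism, and the toy candidates are all hypothetical.

```python
from typing import Callable, Iterable, Optional


def resolve_pronoun(
    pronoun: str,
    candidates: Iterable[str],
    is_plausible: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the first suggested referent that the inferencing check accepts.

    `candidates` is assumed to be already ordered by the focusing framework;
    `is_plausible` is a hypothetical stand-in for the inferencing mechanism.
    """
    for candidate in candidates:
        # The inferencer only confirms or rejects a suggestion; it never
        # searches for a referent on its own.
        if is_plausible(pronoun, candidate):
            return candidate
    return None  # every suggestion was rejected


# Toy usage: the stand-in inferencer accepts only animate referents for "she".
animate = {"Mary", "the doctor"}
choice = resolve_pronoun(
    "she",
    ["the report", "Mary", "the doctor"],
    lambda pronoun, candidate: candidate in animate,
)
print(choice)
```

In this toy run the first suggestion is rejected and the second is confirmed, so the inferencer is consulted twice rather than being asked to search all entities in the discourse.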
    <Paragraph position="4"> To date, there have been two major frameworks for tracking the local focus from one sentence to the next and for using focus during pronoun resolution. The first framework, Focusing, was introduced by Sidner (1979). In this squib, we will describe our framework, Revised Algorithms for Focus Tracking and Revised Algorithms for Pronoun Resolution (RAFT/RAPR), which is based on Sidner's work. RAFT/RAPR can be characterized as maintaining two foci for a sentence: the subject focus and the current focus, which very often have distinct contents. RAFT/RAPR maintains a set of data structures, and uses rules (which take grammatical roles into account) for pronoun resolution and computing the foci. Taken together, these rules describe how focus can (and is most likely to) shift from one sentence to the next. Note that focus tracking and pronoun resolution are mutually dependent processes: focus tracking is necessary for pronoun resolution, and pronoun resolution, in turn, affects focus tracking.</Paragraph>
    <Paragraph position="5"> * Educational Testing Service, Mail Stop 10-R, Rosedale Road, Princeton, NJ 08541. E-mail: lsuri@rosedale.org.
† Dept. of Computer and Information Sciences, 103 Smith Hall, University of Delaware, Newark, DE 19716. E-mail: mccoy@udel.edu.
© 1994 Association for Computational Linguistics
Computational Linguistics Volume 20, Number 2
Subsequent to Sidner's work, Grosz, Joshi, and Weinstein (1983) introduced centering to account for the same phenomena addressed by Sidner's algorithm. 1 Centering attempted to simplify processing by keeping fewer data structures than Sidner's framework did. In particular, the centering literature claims that, rather than two foci, only one focus is needed, termed the backward-looking center (Cb). Pronoun resolution within the centering framework is largely based on an ordering of preferred focus (centering) moves.</Paragraph>
    <Paragraph position="6"> Other research on discourse (e.g., Grosz 1981; Grosz and Sidner 1986; Reichman 1978) has studied another phenomenon, the global focus of discourse. The term global focus generally refers to the entity or set of entities that are relevant to or salient in the overall discourse; the identification of global focus typically interacts with the identification of discourse segments. Global focus and discourse segmentation are distinct from the phenomenon of local focusing that is addressed in this paper. However, we should point out that the centering literature has noted that centering &quot;... is intended to operate within a [discourse] segment&quot; (Walker 1989, p. 253). In our work on RAFT/RAPR we do not restrict the domain of the algorithms to within a discourse segment.</Paragraph>
    <Paragraph position="7"> Given that multiple frameworks for focus tracking and pronoun resolution have emerged, we would like to compare them to see where they agree and where they differ. Previous assessments and comparisons of local focusing frameworks have relied on comparing how frameworks process a small number of constructed discourses, but this kind of comparison is inadequate. Instead, the question that must be answered is which framework performs best on naturally occurring text.</Paragraph>
    <Paragraph position="8"> However, such a comparison is not possible at this point because no framework has fully specified how to handle complex sentences (see Suri [1993] for the details of this argument).</Paragraph>
    <Paragraph position="9"> In light of this, we propose a comparison of RAFT/RAPR and centering along two lines. First, it is instructive to take a careful look at how the frameworks handle certain kinds of constructed discourses involving simple sentences. This comparison proves useful for understanding why the frameworks suggest the referents that they do. It is interesting to note that, while the methodologies used in RAFT/RAPR and centering are quite different from one another, the frameworks very often have the same preferences for pronoun resolution for text that is not discourse-initial (nor discourse-segment-initial) and that involves only simple sentences. Despite this similarity, we point out places where the two frameworks differ. A major difference between centering and RAFT/RAPR is that while RAFT/RAPR stacks old focus information, centering keeps information about the previous sentence only. We show why this is problematic for centering. We point out other differences that arise because centering keeps one focus and does not take the grammatical roles of pronouns and potential antecedents into account during pronoun resolution. This difference is evident in the examples discussed in this paper involving discourse-initial text, and even in an example discussed in this paper that (we believe) does not involve a discourse segment boundary. Note that because the centering literature claims that centering should operate only within a discourse segment, and because this claim is used to explain some otherwise problematic cases of pronoun use, not being able to adequately handle discourse-segment-initial text is much more of a problem for the centering frameworks than may at first be apparent.
1 Notice that we use the term focusing to cover all local focusing frameworks: Sidner's focusing framework (Sidner 1979), Carter's extensions to Sidner's framework (Carter 1987), the centering framework (Grosz, Joshi, and Weinstein 1983 and others), our framework (RAFT/RAPR), PUNDIT (Dahl [1986] and others), etc. We use uppercase (&quot;Focusing,&quot; or &quot;Local Focusing&quot;), or &quot;Sidner's Focusing Algorithm/Framework&quot; to refer to Sidner's work. We use RAFT/RAPR to refer to our work.
Linda Z. Suri and Kathleen F. McCoy RAFT/RAPR and Centering</Paragraph>
    <Paragraph position="10"> While the observations we make in this first comparison are intriguing, it would be inappropriate to assess and compare the frameworks only on the basis of a handful of constructed texts. However, in order to do a corpus analysis to compare focusing frameworks, one must be able to handle many kinds of complex sentences. Thus, we developed a methodology for determining how people process a particular kind of complex sentence. 2 Our second line of comparison of frameworks involves studying how well each framework can be extended to account for such findings. Suri (1993) presented preliminary results for processing sentences of the form &quot;SX because SY,&quot; where SX and SY each consist of a single clause. In this squib, we report some of those findings, and discuss extending RAFT/RAPR and centering in light of these findings.</Paragraph>
    <Paragraph position="11"> In closing, we summarize the abstract similarities and differences between centering and RAFT/RAPR.</Paragraph>
    <Paragraph position="12"> 2. Our Focusing and Pronoun Resolution Algorithms
Below, we discuss the behavior of our algorithms for simple (i.e., single-clause) sentences. (See Suri [1993] for a fuller discussion of our focusing and pronoun resolution algorithms, and a discussion of how our algorithms differ from Sidner's.)</Paragraph>
    <Section position="1" start_page="302" end_page="303" type="sub_section">
      <SectionTitle>
2.1 Data Structures
</SectionTitle>
      <Paragraph position="0"> Our algorithms maintain more focusing data structures than centering does, but each data structure is motivated by discourse processing needs. Below are the data structures that RAFT/RAPR uses:
* Current Focus (CF): the item computed to be the local focus of the sentence.</Paragraph>
      <Paragraph position="1"> * Potential Focus List (PFL): all NPs other than the CF and SF, ordered according to the following: direct object, indirect object, all other NPs in surface order within the clause.</Paragraph>
      <Paragraph position="2"> * Subject Focus (SF): the surface subject of the clause, except in certain cases as mentioned later. (The need for an SF as well as a local focus is discussed in Section 4.3.)
* The Potential Subject Focus List (PSFL): all NPs other than the SF and CF, ordered as follows: direct object, indirect object, all other NPs in surface order within the clause. 3
* CF stack, SF stack, PFL stack, PSFL stack. We stack the foci and foci lists after each sentence. (See Section 4.1.)
2.2 Resolving Pronouns (in Simple Sentences)
RAFT/RAPR resolves pronouns based on the grammatical role of the pronoun and the focusing data structures from the previous sentence. For a nonsubject third person singular pronoun, our algorithm first proposes the CF (of the last sentence) as the co-specifier, then the SF, then members of the PFL, and then, under preferences yet to be determined, the members of the CF stack, SF stack, PFL stack, and PSFL stack. We thus prefer the pronoun to co-refer with the last focus, then the subject focus of the previous sentence, then some other NP introduced in the previous sentence, and then elements that have been in focus or mentioned in previous sentences. Each attempted co-specification may be rejected by a separate inferencing component on the basis of semantic factors (semantics, world knowledge, etc.) or based on syntactic constraints.
2 Suri (1993) discusses the issues that one faces in trying to determine how to process complex sentences, and argues why this methodology is better than alternative methodologies.
3 Determining whether we truly need both a PFL and PSFL would require a corpus analysis, which is beyond the capabilities of the current technology, as discussed in Suri (1993). In fact, the motivation for maintaining both a PFL and a PSFL is based on processing complex sentences (see Suri 1993).</Paragraph>
      <Paragraph position="3"> For a subject third person singular pronoun, we first try the SF, then the CF, then members of the PSFL, then the stacked elements.</Paragraph>
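The data structures of Section 2.1 and the proposal order of Section 2.2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the names SentenceFoci and candidate_order are hypothetical, the four stacks are modeled as a single list of per-sentence records, and the order in which stacked elements are tried (most recent sentence first) is an assumption, since the text leaves those preferences undetermined.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SentenceFoci:
    """Focusing data recorded for one sentence (field names are illustrative)."""
    cf: Optional[str] = None                       # Current Focus
    sf: Optional[str] = None                       # Subject Focus
    pfl: List[str] = field(default_factory=list)   # Potential Focus List
    psfl: List[str] = field(default_factory=list)  # Potential Subject Focus List


def candidate_order(prev: SentenceFoci, stacked: List[SentenceFoci],
                    pronoun_is_subject: bool) -> List[str]:
    """List co-specifier candidates for a third person singular pronoun."""
    if pronoun_is_subject:
        ordered = [prev.sf, prev.cf, *prev.psfl]
    else:
        ordered = [prev.cf, prev.sf, *prev.pfl]
    # Assumption: stacked records are tried most-recent-first, in the order
    # CF, SF, PFL, PSFL. The source does not fix these preferences.
    for old in reversed(stacked):
        ordered += [old.cf, old.sf, *old.pfl, *old.psfl]
    # Drop empty slots and repeats while keeping first-seen order.
    seen, result = set(), []
    for item in ordered:
        if item is not None and item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Toy usage with made-up discourse entities.
prev = SentenceFoci(cf="the book", sf="John", pfl=["Mary"], psfl=["Mary"])
older = SentenceFoci(cf="the library", sf="Sue")
nonsubject_order = candidate_order(prev, [older], pronoun_is_subject=False)
subject_order = candidate_order(prev, [older], pronoun_is_subject=True)
print(nonsubject_order)
print(subject_order)
```

Each candidate in the returned list would still be passed to the separate inferencing component, which may reject it on semantic or syntactic grounds.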
    </Section>
    <Section position="2" start_page="303" end_page="303" type="sub_section">
      <SectionTitle>
2.3 SF and CF Computation (in Simple Sentences)
</SectionTitle>
      <Paragraph position="0"> For a there-insertion sentence, the SF is the deep subject of the sentence, but for most simple sentence types the SF is the surface subject of the simple sentence. 4 Our algorithms compute the CF of the current sentence based on the following interacting criteria:</Paragraph>
      <Paragraph position="2"> Co-specification: Prefer elements that co-specify an element in a focusing data structure over elements just introduced. If an element being talked about now has been talked about before, it is more likely to be the topic (and thus to continue to be talked about) than something that has just been introduced.</Paragraph>
      <Paragraph position="3"> The type of realization of each element: Prefer NPs realized as pronouns over those realized with full NPs. A pronoun is more likely to be talked about in subsequent text than a full NP (Brown 1983).</Paragraph>
      <Paragraph position="4"> To appreciate this preference, consider the following. A pronoun carries less semantic information than a full NP. If a writer chose to communicate an element using a pronoun, he or she must have believed that the element was highly focused enough that the reader would not have difficulty interpreting the pronoun without the extra semantic information that would be communicated with a full NP.</Paragraph>
      <Paragraph position="5"> Which focusing data structure, if any, is co-specified by each NP.</Paragraph>
      <Paragraph position="6"> In general, we believe a writer/speaker is more likely to keep discussing the focus than to move the local focus to some other discourse entity.</Paragraph>
      <Paragraph position="7"> Thus, we prefer the CF to remain constant from one sentence to the next.</Paragraph>
      <Paragraph position="8"> We refer to this preference as the focus retention preference. We also believe a writer is more likely to switch focus to an element that was just mentioned in the previous sentence than to one that was discussed earlier. Thus, in sum, we prefer for the CF to co-specify the last CF rather than something on the last PFL or the last SF, and we prefer for the CF to co-specify something on the last PFL or the last SF rather than a stacked element.</Paragraph>
      <Paragraph position="9"> Syntax: We prefer for the CF to be a nonsubject rather than a subject, although the CF can be the subject.</Paragraph>
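One way to picture how the criteria above might combine is a lexicographic sort key. This Python sketch is only an illustration under stated assumptions: the text describes the criteria as interacting and does not say how they are weighted, so the priority order used here (co-specified structure first, then realization, then syntax) is an assumption, and the names NP, cf_preference_key, and choose_cf are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Rank of the structure an NP co-specifies; lower is preferred.
# The last PFL and last SF share a rank because the text does not order them.
SOURCE_RANK = {"CF": 0, "PFL": 1, "SF": 1, "STACK": 2}
NEW_ELEMENT_RANK = 3  # assumed rank for an NP that co-specifies nothing


@dataclass
class NP:
    text: str
    is_pronoun: bool
    is_subject: bool
    cospecifies: Optional[str] = None  # "CF", "PFL", "SF", "STACK", or None


def cf_preference_key(np: NP) -> Tuple[int, int, int]:
    """Sort key: smaller tuples are preferred as the new CF."""
    if np.cospecifies is None:
        source = NEW_ELEMENT_RANK
    else:
        source = SOURCE_RANK[np.cospecifies]
    return (
        source,                     # co-specification and which structure
        0 if np.is_pronoun else 1,  # pronoun preferred over full NP
        1 if np.is_subject else 0,  # nonsubject preferred over subject
    )


def choose_cf(nps: List[NP]) -> Optional[NP]:
    """Pick the most preferred CF candidate, or None if there are no NPs."""
    return min(nps, key=cf_preference_key, default=None)


# Toy usage: "it" co-specifies the last CF, so focus retention wins.
sentence_nps = [
    NP("John", is_pronoun=False, is_subject=True, cospecifies="SF"),
    NP("it", is_pronoun=True, is_subject=False, cospecifies="CF"),
    NP("a pen", is_pronoun=False, is_subject=False),
]
chosen = choose_cf(sentence_nps)
print(chosen.text)
```

A strict lexicographic order is the simplest choice for a sketch; a weighted combination of the same preferences would be an equally plausible reading of the text.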
    </Section>
  </Section>
</Paper>