<?xml version="1.0" standalone="yes"?>
<Paper uid="J81-2001">
  <Title>Discourse-Oriented Anaphora Resolution in Natural Language Understanding: A Review</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. Concept activatedness
</SectionTitle>
    <Paragraph position="0"> Robert Kantor (1977) has investigated the problem of why some pronouns in discourse are more comprehensible than others, even when there is no ambiguity or anomaly. In Kantor's terms, a hard-to-understand pronoun is an example of inconsiderate discourse, and speakers (or, more usually, writers) who produce such pronouns lack secondary [linguistic] competence. In our terms, an inconsiderate pronoun is one that is not properly in focus.</Paragraph>
    <Paragraph position="1"> I will first summarize Kantor's work, and then discuss what we can learn about focus from it.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Kantor's thesis
</SectionTitle>
      <Paragraph position="0"> Kantor's main exhibit is the following text: (2-1) A good share of the amazing revival of commerce must be credited to the ease and security of communications within the empire. The Imperial fleet kept the Mediterranean Sea cleared of pirates. In each province, the Roman emperor repaired or constructed a number of skillfully designed roads. They were built for the army but served the merchant class as well. Over them, messengers of the Imperial service, equipped with relays of horses, could average fifty miles a day.</Paragraph>
      <Paragraph position="1"> He claims that the they in the penultimate sentence is hard to comprehend, and that most informants need to reread the previous text to find its referent. Yet the sentence is neither semantically anomalous nor ambiguous -- the roads is the only plural NP available as a referent, and it occurs immediately before the pronoun with only a full-stop intervening. To explain this paradox is the task Kantor set himself.</Paragraph>
      <Paragraph position="2"> Kantor's explanation is based on discourse topic and the listener's expectations. In (2-1), the discourse topic of the first three sentences is ease and security of communication in the Roman empire. In the fourth sentence, there is an improper shift to the roads as the topic: improper, because it is unexpected, and there is no discourse cue to signal it. Had the demonstrative these roads been used, the shift would have been okay. (Note that a definite NP such as the roads is not enough.) Alternatively, the writer could have clarified the text by combining the last three sentences with semicolons, indicating that the last two main clauses were to be construed as relating only to the preceding one rather than to the discourse as a whole.
3 Underlining is used in this and subsequent examples to indicate the anaphor(s) of interest. It does not indicate stress.</Paragraph>
      <Paragraph position="3"> Kantor identifies a continuum of factors affecting the comprehension of pronouns. At one end is unrestricted expectation and at the other negative expectation. What this says in effect is that a pronoun is easy to understand if its referent is expected, and difficult if it is unexpected. This is not as vacuous as it at first sounds; Kantor provides an analysis of some subtle factors which affect expectation.</Paragraph>
      <Paragraph position="4"> The most expected pronominalizations are those whose referent is the discourse topic, or something associated with it (though note the qualifications to this below). Consider: (2-2) The final years of Henry's reign, as recorded by the admiring Hall, were given over to sport and gaiety, though there was little of the licentiousness that characterized the French court. The athletic contests were serious but very popular.</Paragraph>
      <Paragraph position="5"> Masques, jousts and spectacles followed one another in endless pageantry. He brought to Greenwich a tremendously vital court life, a central importance in the country's affairs, and above all, a great naval connection. 4 In the last sentence, he is quite comprehensible, despite the distance back to its referent, because the discourse topic in all the sentences is Henry's reign. An example of the converse -- an unexpected pronoun which is difficult despite recency -- can be seen in (2-1) above. Between these two extremes are other cases involving references to aspects of the local topic, changes in topic, syntactic parallelism, and, in topicless instances, recency (though the effect of recency decays very fast). I will not describe these here; the interested reader is referred to Section 2.6.5 of Kantor's dissertation (1977).</Paragraph>
      <Paragraph position="6"> Kantor then defines the notion of the activatedness of a concept. This provides a continuum of concept givenness, which contrasts with the simple binary given-new distinction usually accepted in linguistics (for example, Chafe 1970). Kantor also distinguishes activatedness from the similar "communicative dynamism" of the Prague school (Firbas 1964). Activatedness is defined in terms of the comprehensibility phenomena described above: the more activated a concept is, the easier it is to understand an anaphoric reference to it. Thus activatedness depends upon discourse topic, context, and so forth.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 From: Hamilton, Olive and Hamilton, Nigel. Royal Greenwich.
</SectionTitle>
    <Paragraph position="0"> Greenwich: The Greenwich Bookshop, 1969. Quoted by Halliday and Hasan (1976:14), quoted by Kantor (1977).</Paragraph>
    <Paragraph position="1"> 86 American Journal of Computational Linguistics, Volume 7, Number 2, April-June 1981 Graeme Hirst Discourse-Oriented Anaphora Resolution</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 The implications of Kantor's work
</SectionTitle>
      <Paragraph position="0"> What are the ramifications of Kantor's thesis for focus? Clearly, the notions of activatedness and focus are very similar, though the latter has not generally been thought of as a continuum. It follows that the factors Kantor finds relevant for activatedness and comprehensibility of pronouns are also important for those of us who would maintain focus in computer-based natural language understanding (NLU) systems; we will have to discover discourse topic and topic shifts, generate pronominalization expectations, and so forth.</Paragraph>
      <Paragraph position="1"> In other words, if we could dynamically compute (and maintain) the activatedness of each concept floating around, we would have a measure for the ordering of the focus set by preferability as referent; the referent for any given anaphor would be the most highly activated element which passes basic tests for number, gender and semantic reasonableness. And to find the activatedness of the concepts, we follow Kantor's pointers (which he himself concedes are very tenuous and difficult) to extract and identify the relevant factors from the text.</Paragraph>
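The selection rule in the preceding paragraph can be sketched in a few lines. This is only an illustration under stated assumptions: the Concept and Anaphor records, the numeric activatedness scores, and the is_reasonable hook are all hypothetical, and nothing here says how activatedness would actually be computed from a text, which is the hard part.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Concept:
    """A candidate referent; every field is a hypothetical illustration."""
    name: str
    number: str           # "sg" or "pl"
    gender: str           # "m", "f" or "n"
    activatedness: float  # assumed score: higher means easier to refer to


@dataclass(frozen=True)
class Anaphor:
    text: str
    number: str
    gender: str


def resolve(anaphor: Anaphor,
            focus_set: list[Concept],
            is_reasonable: Callable[[Anaphor, Concept], bool] = lambda a, c: True,
            ) -> Optional[Concept]:
    """Pick the most activated concept that passes the basic tests."""
    candidates = [
        c for c in focus_set
        if c.number == anaphor.number
        and c.gender == anaphor.gender
        and is_reasonable(anaphor, c)   # stand-in for semantic reasonableness
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.activatedness)


# Invented scores loosely modelled on example (2-2).
focus = [
    Concept("Hall", "sg", "m", 0.3),
    Concept("Henry", "sg", "m", 0.9),
    Concept("the French court", "sg", "n", 0.2),
]
referent = resolve(Anaphor("he", "sg", "m"), focus)
```

With these made-up scores, the distant but highly activated Henry is preferred over the more recent Hall, which is the behaviour the text attributes to example (2-2).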
      <Paragraph position="2"> It may be objected that by applying Kantor's insights all we have done is produce a mere notational variant of our original problem. This is partly true.</Paragraph>
      <Paragraph position="3"> One should not gainsay the power of a good notation, however, and what we can buy here even with mere notational variance is the power of Kantor's investigations. And there is more. Previously, it has been suggested that items either are in focus or they aren't, and that at each separate anaphor we need to compute a preference ranking of the focus elements for that anaphor. What Kantor tells us is that such a ranking exists independently of the actual use of anaphors in the text, and that we can find the ranking by looking at things like discourse topic.</Paragraph>
      <Paragraph position="4"> Some miscellaneous comments on Kantor's work:
1. It can be seen as a generalization, albeit a weakening, of Grosz's (1977a, 1977b, 1978) findings on focus in task-oriented dialogues (where each sub-task becomes the new discourse topic, opening up a new set of possible referents), which are discussed below in Section 3. (Kantor and Grosz were apparently unaware of each other's work; neither cites the other.)
2. It provides an explanation for focus problems that have previously baffled us. For example, in Hirst (1977a) I contemplated the problem of the ill-formedness of this text: (2-3) *John left the window and drank the wine on the table. It was brown and round.</Paragraph>
      <Paragraph position="5"> I had previously thought this to be due to a syntactic factor -- that cross-sentence pronominal reference to an NP in a relative clause or adjectival phrase qualifying an NP was not possible. However, it can also be explained as a grossly inconsiderate pronoun which does not refer to the topic properly -- the table occurs only as a descriptor for the wine, and not as a concept "in its own right". This would be a major restriction on possible reference to sub-aspects of topics.</Paragraph>
      <Paragraph position="6"> 3. Like too many other researchers, Kantor makes many claims about comprehensibility and the degree of well-formedness of sentences which others (as he concedes) may not agree with. He uses only himself (and his friends, sometimes) as an informant, and then only at an intuitive level. 5 Claims as strong and subtle as Kantor's cry out for empirical testing. 6
3. Focus of attention in task-oriented dialogues
Barbara Grosz (1977a, 1977b, 1978) studied the maintenance of the focus of attention in task-oriented dialogues and its effect on the resolution of definite reference, as part of SRI's speech understanding system project (Walker 1978). By a task-oriented dialogue is meant one which has some single major well-defined task as its goal. For example, Grosz collected and studied dialogues in which an expert guides an apprentice in the assembly of an air compressor. She found that the structure of such dialogues parallels the structure of the task. That is, just as the major task is divided into several well-defined sub-tasks, and these perhaps into sub-sub-tasks and so on, the dialogue is likewise divided into sub-dialogues, sub-sub-dialogues, etc, 7 each corresponding to a task component, much as a well-structured Algol program is composed of blocks within blocks within blocks.
5 For a discussion of the problem of idiosyncratic well-formedness judgments, and a suggested solution, see Sections 4.2 and 7.3 of Hirst (1981).</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
6 Kantor tells me that he hopes to test some of his assertions
</SectionTitle>
    <Paragraph position="0"> by observing the eye movements of readers of considerate and inconsiderate texts, to find out if inconsiderate texts actually make readers physically search back for a referent.</Paragraph>
    <Paragraph position="1"> As the dialogue progresses, each sub-dialogue in turn is performed in a strict depth-first order corresponding to the order of sub-task performance in the task goal (though note that some sub-tasks may not be ordered with respect to others). As we will see, this dialogue structure can be exploited in reference resolution.</Paragraph>
    <Paragraph position="2"> Grosz's aim was to find ways of determining and representing the focus of attention of a discourse -- that is, roughly speaking, its global theme and the things associated therewith -- as a means for constraining the knowledge an NLU system needs to bring to bear in understanding discourse. In other words, the focus of attention is that knowledge which is relevant at a given point in a text for comprehension of the text. 8 Grosz claims that antecedents for definite reference can be found in the focus of attention. That is, the focus of attention is a superset of focus in our sense, the set of referable concepts (in this case definite reference, not just anaphoric reference). Moreover, no element in the focus of attention is excluded from being a candidate antecedent for a definite NP.</Paragraph>
    <Paragraph position="3"> Grosz thereby implies that all items in the focus of attention can be referred to, and that hence the two senses of the word focus are actually identical.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Representing and searching focus
</SectionTitle>
      <Paragraph position="0"> In Grosz's representation, which uses a partitioned semantic net formalism (Hendrix 1975, 1978), an explicit focus corresponds to a sub-dialogue, and includes, for each concept in it, type information about that concept and any situation in which that concept participates. For each item in the explicit focus, there is an associated implicit focus, which includes subparts of objects in explicit focus, subevents of events in explicit focus, and participants in those subevents.</Paragraph>
      <Paragraph position="1"> The implicit focus attempts to account for reference to items that have a close semantic distance to items in focus, or which have a close enough relationship to items in focus to be able to be referred to. The implicit focus is also used in detecting focus shifts (discussed below).</Paragraph>
      <Paragraph position="2"> Then, at any given point in a text, antecedents of definite non-pronominal NPs can be found by searching through the explicit and implicit focus for a match for the reference. After checking the other non-pronominal NPs in the same sentence to see if the reference is intrasentential, the currently active explicit focus (the focus corresponding to the present subdialogue) is searched, and then if that search is not successful, the other currently open focus spaces (that is, those corresponding to sub-dialogues that the present sub-dialogue is contained in) are searched in order, back up to the top of the tree. As part of the search the implicit focus associated with each explicit focus is checked, as are subset relations, so that if a novel, say, is in focus, it could be referred to as the book.
8 In her later work (Grosz 1978), Grosz emphasizes focusing as an active process carried out by dialogue participants.</Paragraph>
      <Paragraph position="3"> If there is still no success after this, one then checks whether the NP refers to a single unique concept (such as the sun), contains new information (such as the red coat, when a coat is in focus, but not yet known to be red), or refers to an item in implicit focus.</Paragraph>
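The search order just described (active focus space first, then each enclosing open space up to the root, consulting implicit focus and subset relations along the way) can be sketched as follows. This is a toy illustration and not Grosz's partitioned semantic net: FocusSpace, SUPERTYPE, is_a and find_antecedent are hypothetical names, checking implicit focus immediately after explicit focus within each space is an assumption, and the intrasentential check and the final unique-concept and new-information checks are omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FocusSpace:
    """One focus space per sub-dialogue (all names here are illustrative)."""
    name: str
    explicit: dict = field(default_factory=dict)  # concept id -> type
    implicit: dict = field(default_factory=dict)  # e.g. subparts of explicit items
    parent: Optional["FocusSpace"] = None         # enclosing open focus space


# Toy subset hierarchy: each type maps to its immediate supertype.
SUPERTYPE = {"novel": "book", "book": "object", "wrench": "tool", "tool": "object"}


def is_a(concept_type, wanted):
    """True if concept_type equals wanted or is a subtype of it."""
    while concept_type is not None:
        if concept_type == wanted:
            return True
        concept_type = SUPERTYPE.get(concept_type)
    return False


def find_antecedent(head_noun, active):
    """Search the active space, then each enclosing open space up to the root."""
    space = active
    while space is not None:
        for store in (space.explicit, space.implicit):
            for concept, concept_type in store.items():
                if is_a(concept_type, head_noun):
                    return concept, space.name
        space = space.parent
    return None


top = FocusSpace("top", explicit={"n1": "novel"})
sub = FocusSpace("sub", explicit={"w1": "wrench"}, implicit={"h1": "handle"}, parent=top)
match = find_antecedent("book", sub)  # the novel in the enclosing space
```

Because the loop stops at the first match, this corresponds to the search for definite non-pronominal NPs; the exhaustive comparison that pronouns would need is not modelled.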
      <Paragraph position="4"> A similar search method could be used for pronouns. However, since pronouns carry much less information than other definite NPs, more inference is required by the reference matching process to disambiguate many syntactically ambiguous pronouns, and it would be necessary to search focus exhaustively, comparing the reasonableness of candidate referents, rather than stopping at the first plausible one. In addition, other constraints on pronoun reference, such as local (rather than global) theme, and default referent, would also need to be taken into account; Grosz's mechanisms do not do this. However, Grosz does show how a partitioned network structure can be used to resolve certain types of ellipsis by means of syntactic and semantic pattern matching against the immediately preceding utterance, which may itself have been expanded from an elliptical expression. She leaves open for future research most of the problems in relating pronouns to focus.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Maintaining focus
</SectionTitle>
      <Paragraph position="0"> Given this approach, one is then faced with the problem of deciding what the focus is at a given point in the discourse. For highly constrained task-oriented dialogues such as those Grosz considered, the question of an initial focus does not arise; it is, by definition, the overall task in question. The other component of the problem, handling changes and shifts in the focus, is attacked by Grosz in a top-down manner using the task structure as a guide.</Paragraph>
      <Paragraph position="1"> A shift in focus can be indicated explicitly by an utterance, such as: (3-1) Well, the reciprocating afterburner nozzle speed control is assembled. Next, it must be fitted above the preburner swivel hose cover guard cooling fin mounting rack.</Paragraph>
      <Paragraph position="2"> In this case, the reciprocating afterburner nozzle speed control assembly sub-task and its corresponding sub-dialogue and focus are closed, and new ones are opened for the reciprocating afterburner nozzle speed control fitting, dominated by the same open sub-tasks/sub-dialogues/focuses in their respective trees that dominated the old ones. If however the new sub-task were a sub-task of the old one, then the old one would not be closed, but the new one added to the hierarchy below it as the new active focus space. The newly created focus space initially contains only those items referred to in the utterance, and those objects associated with the current sub-task. (Being able to bring in the associated objects at this time is, of course, the crucial point on which the whole system relies.) As subsequent non-shift-causing utterances come in, their new information is added to the active focus space.</Paragraph>
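The two kinds of shift described above differ only in whether the old focus space is closed first. A minimal sketch, assuming that the open focus spaces can be kept as a simple stack and that a focus space is just a task name plus a set of items (FocusStack and its method names are hypothetical, and supplying the associated objects is left to the caller):

```python
class FocusStack:
    """Open focus spaces, innermost last (an illustrative structure only)."""

    def __init__(self, top_task):
        self.open_spaces = [(top_task, set())]

    @property
    def active(self):
        return self.open_spaces[-1]

    def add(self, item):
        """Record new information from a non-shift-causing utterance."""
        self.active[1].add(item)

    def shift_to_subtask(self, task, mentioned, associated):
        """Keep the active space open and open a new one beneath it."""
        self.open_spaces.append((task, set(mentioned) | set(associated)))

    def shift_to_sibling(self, task, mentioned, associated):
        """Close the active space and open a new one under the same parents."""
        if len(self.open_spaces) == 1:
            raise ValueError("the top-level focus space cannot be closed")
        self.open_spaces.pop()
        self.open_spaces.append((task, set(mentioned) | set(associated)))


stack = FocusStack("install control")
stack.shift_to_subtask("assemble control", ["control"], ["screws"])
stack.shift_to_sibling("fit control", ["control", "mounting rack"], ["bolts"])
open_names = [name for name, _ in stack.open_spaces]
```

After the sibling shift in the usage lines, the assembly space is gone and the fitting space sits directly under the top-level task, mirroring example (3-1).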
      <Paragraph position="3"> Usually, of course, speakers are not as helpful as in (3-1), and it is necessary to look for various clues to shifts in focus. For Grosz, the clues are definite NPs. If a definite NP from an utterance cannot be matched in focus, then this is a clue that the focus has shifted, and it is necessary to search for the new focus. If the antecedent of a definite NP is in the current implicit focus, this is a clue that a sub-task associated with this item is being opened. If the task structure is being followed, then the new focus will reflect the opening or closing of a sub-task.</Paragraph>
      <Paragraph position="4"> Shifting cannot be done until a whole utterance is considered, because clues may conflict, or the meaning of the utterance may contraindicate the posited shift. In particular, recall that the task structure is only a guide, and does not define the dialogue structure absolutely. For example, the focus may shift to a problem associated with the current sub-task with a question like this: (3-2) Should I use the box-end ratchet wrench to do that? This does not imply a shift to the next sub-task requiring a box-end ratchet wrench (assuming that the current task doesn't require one) (cf Grosz 1977b:105). We can see here that the problem of the circularity of language comprehension looms dangerously: to determine the focus one must resolve the references, and to resolve the references, one must know the focus. In Grosz's work, the strong constraints of the structure of task-oriented dialogues provide a toehold. Whether generalization to the case of discourse with other structures, or with no particular structure, is possible is unclear, as it may not be possible to determine so nicely what the knowledge associated with any new focus is. (See however my remarks in Section 2.2 above on the relationship between Grosz's work and that of Kantor, and Section 6 on approaches which attempt to exploit local discourse structure.) In addition, Grosz's mechanisms are limited in their ability to resolve anaphora that require inference or are intersentential (or both). The assumption that global focus of attention equals all and only possible referents (except where the focus shifts), while perhaps not unreasonable in task-oriented domains, is probably untrue in general. For example, it is unclear that such mechanisms could handle the effects of local as opposed to global theme that exclude the table from the focus for almost all speakers in (2-3). 
Similarly, could the level of world knowledge and inference required to resolve the different referents of she in (3-3) and (3-4) be integrated into the partitioned semantic net formalism? (3-3) When Nadia visited Sue for dinner, she ate sukiyaki au gratin.</Paragraph>
      <Paragraph position="5"> (3-4) When Nadia visited Sue for dinner, she served sukiyaki au gratin.</Paragraph>
      <Paragraph position="6"> Could entities evoked by, but not explicit in, a text of only moderate structure be identified and instantiated in focus? Grosz did not address these issues (nor did she need to for her immediate goals), but they would need to be resolved in any attempt to generalize her approach. (Some other related problems, including those of focus shifting, are discussed in Grosz 1978.) Grosz's contribution was to demonstrate the role of discourse structure in the identification of theme, relevant world knowledge and the resolution of reference. We now turn to another system which aspires to similar goals, but in a more general context.</Paragraph>
      <Paragraph position="7"> 4. Focus in the PAL system
The PAL personal assistant program (Bullwinkle 1977a) is a system designed to accept natural language requests for scheduling activities. A typical request (from Bullwinkle 1977b:44) is: (4-1) I want to schedule a meeting with Ira. It should be at 3 pm tomorrow. We can meet in Bruce's office.</Paragraph>
      <Paragraph position="8"> The section of PAL that deals with discourse pragmatics and reference was developed by Candace Sidner [Bullwinkle] (Bullwinkle 1977b; Sidner 1978a). Like Grosz's system, PAL attempts to find a focus of attention in its knowledge structures to use as a focus for reference resolution. Sidner sees the focus as equivalent to the discourse topic; in fact in Bullwinkle (1977b) the word topic is used instead of focus.</Paragraph>
      <Paragraph position="9"> There are three major differences from Grosz's system:
1. PAL does not rely heavily on discourse structures.</Paragraph>
      <Paragraph position="10"> 2. Knowledge is represented in frames.</Paragraph>
      <Paragraph position="11"> 3. Focus selection and shifting are handled at a more superficial level.</Paragraph>
      <Paragraph position="12"> I will discuss each difference in turn.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 PAL's approach to discourse
</SectionTitle>
      <Paragraph position="0"> Because a request to PAL need not have the rigid structure of one of Grosz's task-oriented dialogues, PAL does not use discourse structure to the same extent, instead relying on more general local cues. However, as we shall see below, in focus selection and</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 The frame as focus
</SectionTitle>
      <Paragraph position="0"> The representation of knowledge in PAL is based on frames, and its implementation uses the FRL frame representation language (actually a dialect of LISP) developed by Roberts and Goldstein (1977a, 1977b).</Paragraph>
      <Paragraph position="1"> In PAL, the frame corresponds to Grosz's focus space. Following Rosenberg's (1976, 1977) work on discourse structure and frames, the antecedent for a definite NP is first assumed to be either the frame itself, or one of its slots. So, for example, in (4-2): (4-2) I want to have a meeting with Ross (1). It should be at three pm. The location will be the department lounge. Please tell Ross (2).</Paragraph>
      <Paragraph position="2"> it refers to the MEETING frame (not to the text a meeting) which provides the context for the whole discourse; the location refers to the LOCATION slot that the MEETING frame presumably has (thus the CLOSELY ASSOCIATED WITH relation (Hirst 1981) is handled), and Ross (2) to the contents 9 of the CO-MEETER slot, previously given as Ross.</Paragraph>
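The frame-first lookup for example (4-2) can be illustrated with a plain dictionary standing in for the frame. This is a sketch under assumptions, not FRL and not PAL's code: the TIME slot, the dictionary layout and the function name resolve_definite are hypothetical, and pronouns such as it would need separate handling.

```python
# Hypothetical MEETING frame for example (4-2); only the frame name and the
# LOCATION and CO-MEETER slots come from the text, the rest is assumed.
meeting = {
    "frame": "MEETING",
    "slots": {"CO-MEETER": "Ross", "TIME": "three pm", "LOCATION": None},
}


def resolve_definite(head, frame):
    """Return what a definite NP head denotes within the focus frame, or None."""
    key = head.upper()
    if key == frame["frame"]:
        return ("frame", frame["frame"])   # e.g. "the meeting"
    if key in frame["slots"]:
        return ("slot", key)               # e.g. "the location"
    for slot, filler in frame["slots"].items():
        if filler is not None and filler.upper() == key:
            return ("filler", slot)        # e.g. the second mention of Ross
    return None  # caller falls back to database search or inference


in_frame = resolve_definite("location", meeting)
outside = resolve_definite("department lounge", meeting)
```

A None result corresponds to the case where the antecedent is taken to be outside the discourse or inferred, so that a database search is needed.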
      <Paragraph position="3"> If the antecedent cannot be found in the frame, it is assumed to be either outside the discourse or inferred. In (4-2), PAL would search its database to find referents for Ross (1) and the department lounge. Personal names are resolved with a special module that knows about the semantics of names (Bullwinkle 1977b:48). PAL carries out database searches for references like the department lounge apparently by searching a hierarchy of frames, looking at the frames in the slots of the current focus, and then in the slots of these frames, and so on (Sidner 1978a:211), though it is not apparent why this should usefully constrain the search in the above example. 10
9 Sidner only speaks of reference to slots (1978a:211), without saying whether she means the slot itself or its contents; it seems reasonable to assume, as I have done here, that she actually means both.</Paragraph>
      <Paragraph position="4"> 10 In fact there is no need in this particular example for a referent at all. The personal assistant need only treat the department lounge as a piece of text, presumably meaningful to both the speaker and Ross, denoting the meeting location. A human might do this when passing on a message he or she didn't understand: (i) Ross asked me to tell you to meet him in the arboretum, whatever the heck that is.</Paragraph>
      <Paragraph position="5"> On the other hand, an explicit antecedent would be needed if PAL had been asked, say, to deliver coffee to the meeting in the department lounge. Knowing when to be satisfied with ignorance is a difficult problem which Sidner does not consider, preferring the safe course of always requiring an antecedent.</Paragraph>
    </Section>
    <Section position="5" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 Focus selection
</SectionTitle>
      <Paragraph position="0"> In PAL, the initial focus is the first NP following the main verb of the first sentence of the discourse -- usually, the object of the sentence -- or, if there is no such NP, then the subject of that sentence. This is a short-cut method, which seems to be sufficient for requests to PAL, but which Sidner readily admits is inadequate for the general case (Sidner 1978a:209). I will briefly review some of the problems.</Paragraph>
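The short-cut rule above is simple enough to state as code. The sketch below assumes the first sentence has already been chunked and role-tagged by some earlier stage; the tag names and the function initial_focus are hypothetical.

```python
def initial_focus(tagged):
    """Pick the initial focus from the first sentence of a discourse.

    tagged: (phrase, role) pairs in sentence order, with role one of
    "subj", "verb", "np" or "other" (an assumed pre-analysis).
    """
    verb_seen = False
    subject = None
    for phrase, role in tagged:
        if role == "verb":
            verb_seen = True
        elif role == "subj" and subject is None:
            subject = phrase
        elif role == "np" and verb_seen:
            return phrase      # first NP following the main verb
    return subject             # no such NP: fall back to the subject


# First sentence of example (4-1), hand-tagged for illustration.
request = [("I", "subj"), ("want", "verb"), ("to schedule", "other"),
           ("a meeting", "np"), ("Ira", "np")]
focus = initial_focus(request)
```

The problems reviewed next, such as a focus that is not an NP at all, are exactly the cases this rule cannot express.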
      <Paragraph position="1"> Charniak (1978) has shown that the frame-selection problem (which is here identical to the initial focus selection problem, since the focus is just the frame representing the theme of the discourse) is in fact extremely difficult, and is not in the most general case amenable to solution by either strictly top-down or bottom-up methods. Sidner's assumption that the relevant frame is given by an explicitly mentioned NP is also a source of trouble, even in the examples she quotes, such as these two (Sidner 1978b:92): (4-3) I was driving along the freeway the other day. Suddenly the engine began to make a funny noise.</Paragraph>
      <Paragraph position="2"> (4-4) I went to a new restaurant with Sam. The waitress was nasty. The food was great.</Paragraph>
      <Paragraph position="3"> (Underlining indicates what Sidner claims is the focus.) In (4-3), Sidner posits a chain of inferences to get from the engine to the focus, the FREEWAY frame.</Paragraph>
      <Paragraph position="4"> This is more complex than is necessary; if the frame/focus were DRIVING (with its LOCATION slot containing the FREEWAY frame), then the path from the frame to the engine is shorter and the whole arrangement seems more natural. Thus we see that focus need not be based on an NP at all.</Paragraph>
      <Paragraph position="5"> In (4-4), our problem is what to do with Sam, who could be referenced in a subsequent sentence. It is necessary to integrate Sam into the RESTAURANT frame/focus, since clearly he should not be considered external to the discourse and sought in the database.</Paragraph>
      <Paragraph position="6"> While the RESTAURANT frame may indeed contain a COMPANION slot for Sam to sit in, it is clear that the first sentence could have been I went &lt;anywhere at all&gt; with Sam, requiring that any frame referring to something occupying a location must have a COMPANION slot. This is clearly undesirable. But the RESTAURANT frame is involved in (4-4); otherwise the waitress and the food would be external to the discourse. A natural solution is that the frame/focus of (4-4) is actually the GOING-SOMEWHERE frame  (with Sam in its COMPANION slot), containing the RESTAURANT frame in its PLACE slot, with both frames together taken as the focus. Sidner does not consider mechanisms for a multi-frame focus.</Paragraph>
      <Paragraph position="7"> It is, of course, not always true that the frame/focus is explicit. Charniak (1978) points out that (4-5) is somehow sufficient to invoke the MAGICIAN frame: (4-5) The woman waved as the man on stage sawed her in half.</Paragraph>
      <Paragraph position="8"> (See also Charniak (1981) for more on frame invocation problems.) Focus shifting in PAL is restricted: the only shifts permitted are to and from sub-aspects of the present focus (Sidner 1978a:209). Old topics are stacked for possible later return. This is very similar to Grosz's open-focus hierarchy. It is unclear whether there is a predictive aspect to PAL's focus-shift mechanism, 11 but the basic idea seems to be that any new phrase in a sentence is picked as a potential new focus. If in a subsequent sentence an anaphoric reference is a semantically acceptable coreferent for that potential focus, then a shift to that focus is ipso facto indicated (Sidner 1978a:209). Presumably this check is done after a check of focus has failed, but before any data-base search. A potential focus has a limited life span, and is dropped if not shifted to by the end of the second sentence following the one in which it occurred. An example (Sidner 1978a:209): (4-6) I want to schedule a meeting with George, Jim, Steve and Mike. We can meet in my office. It's kind of small, but the meeting won't last long anyway.</Paragraph>
      <Paragraph position="9"> (4-7) I want to schedule a meeting with George, Jim, Steve and Mike. We can meet in my office. It won't take more than 20 minutes. In the second sentence my office is identified as a potential focus, and it, in the first reading of the third sentence, as an acceptable coreferent to my office confirms the shift. In the second reading, it couldn't be my office, so no shift occurs. The acceptability decision is based on selectional and case-like restrictions. While perhaps adequate for PAL, this mechanism is, of course, not sufficient for the general case, where a true shift, as opposed to an expansion upon a previously mentioned point, may occur. This is exemplified by many of the shifts in Grosz's task-oriented dialogues.
11 On page 209 of Sidner (1978a) we are told: "Focus shifts cannot be predicted; they are detectable only after they occur". Yet on the following page, Sidner says: "Sentences appearing in mid-discourse are assumed to be about the focus until the coreference module predicts a focus shift .... Once an implicit focus relation is established, the module can go onto [sic] predictions of focus shift". My interpretation of these remarks is that one cannot be certain that the next sentence will shift focus, but one can note when a shift might happen, requiring later checking to confirm or disconfirm the shift.</Paragraph>
      <Paragraph position="10"> Another problem arising from this shift mechanism is that two different focus shifts may be indicated at the same time, but the mechanism has no way to choose between them. For example: (4-8) Schedule a meeting of the Experimental Theology Research Group, and tell Ross Andrews about it too. I'd like him to hear about the deocommunication work that they're doing.</Paragraph>
      <Paragraph position="11"> Each of the two underlined NPs in the first sentence (the Experimental Theology Research Group and Ross Andrews) would be picked as a potential focus. Since each is pronominally referenced in the second sentence, the mechanism would be confused as to where to shift the focus. (Presumably Ross Andrews would be the correct choice here.)</Paragraph>
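The potential-focus mechanism described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Sidner's implementation: the class FocusTracker, the function toy_acceptable and the table TOY_FACTS are hypothetical names, and the toy table merely stands in for PAL's selectional and case-like checks. The sketch assumes that the current focus is tested first, that a potential focus survives for the two sentences following its own, and that no shift is made when more than one potential focus is confirmed at once.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

LIFE_SPAN = 2  # a potential focus survives the two sentences after its own


@dataclass
class FocusTracker:
    # acceptable(phrase, use) is a stand-in for selectional and case-like checks.
    acceptable: Callable[[str, str], bool]
    focus: Optional[str] = None
    stack: List[str] = field(default_factory=list)  # old foci, kept for later return
    potentials: List[Tuple[str, int]] = field(default_factory=list)
    sentence_no: int = 0

    def process(self, new_phrases: List[str], pronoun_uses: List[str]) -> List[str]:
        """Process one sentence and return the potential foci it confirms."""
        self.sentence_no += 1
        # Drop potential foci whose life span has run out.
        self.potentials = [
            (phrase, born) for phrase, born in self.potentials
            if LIFE_SPAN >= self.sentence_no - born
        ]
        candidates: List[str] = []
        for use in pronoun_uses:
            # The current focus is checked first; potentials only on failure.
            if self.focus is not None and self.acceptable(self.focus, use):
                continue
            for phrase, _ in self.potentials:
                if self.acceptable(phrase, use) and phrase not in candidates:
                    candidates.append(phrase)
        if len(candidates) == 1:
            # Exactly one confirmed potential focus: stack the old focus and shift.
            if self.focus is not None:
                self.stack.append(self.focus)
            self.focus = candidates[0]
            self.potentials = [p for p in self.potentials if p[0] != self.focus]
        # With two or more candidates no shift is made: the mechanism cannot choose.
        # New phrases become potential foci for the following sentences.
        self.potentials.extend((phrase, self.sentence_no) for phrase in new_phrases)
        return candidates


# Toy acceptability table (an assumption made purely for this illustration).
TOY_FACTS = {
    ("my office", "is-small"),
    ("the meeting", "takes-time"),
    ("Ross Andrews", "him"),
    ("the Experimental Theology Research Group", "they"),
}


def toy_acceptable(phrase: str, use: str) -> bool:
    return (phrase, use) in TOY_FACTS


def run_office_text(third_sentence_use: str) -> FocusTracker:
    """Replay (4-6) or (4-7), depending on how the third sentence uses 'it'."""
    tracker = FocusTracker(acceptable=toy_acceptable, focus="the meeting")
    tracker.process([], [])                    # sentence 1 sets up the meeting
    tracker.process(["my office"], [])         # sentence 2 offers a potential focus
    tracker.process([], [third_sentence_use])  # sentence 3 contains "it"
    return tracker
```

Calling run_office_text("is-small") replays (4-6) and ends with my office as focus and the meeting stacked, whereas run_office_text("takes-time") replays (4-7) and leaves the focus on the meeting. Feeding the tracker both pronouns of (4-8) returns two candidates and no shift, which is exactly the indeterminacy noted above.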
    </Section>
    <Section position="6" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.4 Conclusions
</SectionTitle>
      <Paragraph position="0"> The shortcomings of Sidner's work are mainly attributable to two causes: her avoidance of relying on the highly constrained discourse structures that Grosz used, and the limited connectivity of frame systems, compared to Grosz's semantic nets. 12 With respect to the former point, perhaps Sidner's main contribution has been to show the difficulties and pitfalls that lie in wait for anyone attempting to generalize Grosz's work, even to the extent that PAL does.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5. Webber's formalism
</SectionTitle>
    <Paragraph position="0"> In the preceding sections of this paper, we saw approaches to anaphor resolution that were mainly top-down in that they relied on a notion of theme and/or focus of attention to guide the selection of focus (although theme determination may have been bottom-up). An alternative approach has been suggested by Bonnie [Nash-]Webber (Nash-Webber and Reiter 1977; Webber 1978a, 1978b), wherein a set of rules is applied to a logical-form representation of the text to derive the set of entities that that text makes available for subsequent reference. Webber's formalism attacks some problems caused by quantification that have not otherwise been considered by workers in NLU [...] here, and I shall have to assume some familiarity with logical forms. Readers who want more details should see her thesis (1978a); readers who find my exposition mystifying should not worry unduly -- the fault is probably mine -- but should turn to the thesis for illumination. 12 In her thesis (1979) [which was not available to me when this paper was first written], Sidner subsequently proposed the use of an association network instead of frames, and presented more sophisticated focus selection and shifting algorithms. I have emphasized her earlier work here, as it has received much wider circulation.</Paragraph>
    <Paragraph position="1"> In Webber's formalism, it is assumed that an input sentence is first converted to a parse tree, and then, by some semantic interpretation process, to an extended restricted-quantification predicate calculus representation. It is during this second conversion that anaphor resolution takes place. When the final representation, which we shall simply call a logical form, is complete, certain rules are applied to it to generate the set of referable entities and descriptions that the sentence evokes. Webber considers three types of antecedents: those for definite pronouns, those for one-anaphora, 13 and those for verb phrase ellipsis.</Paragraph>
    <Paragraph position="2"> Each type has its own set of rules; we will briefly look at the first. (The others are discussed in Sections 5.4.2 and 5.4.3 of Hirst 1981.)</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 Definite pronouns
</SectionTitle>
      <Paragraph position="0"> The antecedents for definite pronouns are invoking descriptions (IDs); these are in effect focus elements that are explicit in the text. IDs are derived from the logical form representation of a sentence by a set of rules that attempt to take into account factors, such as NP definiteness or references to sets, that affect what antecedents are evoked by a text. There are six of these ID-rules; 14 which one applies depends on the structural description of the logical form.</Paragraph>
      <Paragraph position="1"> Here is one of Webber's examples (1978a:64): (5-1) Wendy bought a crayon.</Paragraph>
      <Paragraph position="2"> This has this representation: (5-2) (∃x:Crayon) . Bought Wendy,x Now, one of the ID-rules says that any sentence S whose representation is of this form: (5-3) (∃x:C) . Fx where C is an arbitrary predicate on individuals and Fx an arbitrary open sentence in which x is free, evokes an entity whose representation is of this form: (5-4) e_j ιx: Cx &amp; Fx &amp; evoke S,x where e_j is an arbitrary label assigned to the entity and ι is the definite operator. Hence, starting at the left of (5-2), we obtain this representation for the crayon of (5-1): (5-5) e_1 ιx: Crayon x &amp; Bought Wendy,x &amp; evoke (5-1),x which may be interpreted as "e_1 is the crayon mentioned in sentence (5-1) that Wendy bought". Similarly we will obtain a representation of e_2, Wendy, which is then substituted for Wendy in (5-5) after some matching process has determined the identity of the two.</Paragraph>
      <Paragraph position="3"> 13 One-anaphors are those such as those, one, and some uses of it that refer to a description rather than a specific entity. An example: (i) Wendy didn't give either boy a green tie-dyed T-shirt, but she gave Sue a red one. 14 Webber regards her rules only as a preliminary step towards a complete set that considers all relevant factors. She discusses some of the remaining problems, such as negation, in Webber (1978a:81-88).</Paragraph>
      <Paragraph position="4"> In this next, more complex example (Webber 1978a:73), we see how quantification is handled: (5-6) Each boy gave each girl a peach.</Paragraph>
      <Paragraph position="6"> The representation of (5-6) matches the following structural description (where Q_j stands for the quantifier (∀x_j ∈ e_j), where e_j is an earlier evoked discourse entity, and ! is the left boundary of a clause): (5-7) !Q_1 ... Q_n (∃y:C) . Fy and hence evokes an ID of this form: (5-8) e_i ιy: maxset(λ(u:C)[(∃x_1 ∈ e_1) ... (∃x_n ∈ e_n) . Fu &amp; evoke S,u]) y (For any one-place predicate P, maxset(P)y is true if and only if y is the set of all items u such that Pu holds.) Another rule has already given us: (5-9) e_1 ιx: maxset(Boy) x "the set of all boys"; e_2 ιx: maxset(Girl) x "the set of all girls" and so (5-8) is instantiated as: (5-10) e_3 ιz: maxset(λ(u:Peach)[(∃x ∈ e_1)(∃y ∈ e_2) . Gave x,y,u &amp; evoke (5-6),u]) z "the set of peaches, each one of which is linked to (5-6) by virtue of some member of e_1 giving it to some member of e_2" Although such rules could (in principle) be used to generate all IDs (explicit focus elements) that a sentence evokes, Webber does not commit herself to such an approach, instead allowing for the possibility of generating IDs only when they are needed, depending on subsequent information such as speaker's perspective. She also suggests the possibility of "vague, temporary" IDs for interim use (1978a:67). There is a problem here with intrasentential anaphora, since it is assumed that a sentence's anaphors are resolved before ID rules are applied to find what may be the antecedents necessary for that resolution. Webber proposes that known syntactic and selectional</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
92 American Journal of Computational Linguistics, Volume 7, Number 2, April-June 1981
Graeme Hirst Discourse-Oriented Anaphora Resolution
</SectionTitle>
      <Paragraph position="0"> constraints may help in this conflict, but this is not always sufficient. For example: (5-11) Mary bought each girl a cotton T-shirt, but none of them were the style de rigueur in high schools.</Paragraph>
      <Paragraph position="1"> The IDs for both the set of girls and the set of T-shirts are needed to resolve them, but them needs to be resolved before the IDs are generated. In this particular example, the clear solution is to work a clause at a time rather than at a sentence level. However, this is not always an adequate solution, as (5-12) shows: (5-12) The rebel students annoyed the teachers greatly, and by the end of the week none of the faculty were willing to go to their classes.</Paragraph>
      <Paragraph position="2"> In this ambiguous sentence, one possible antecedent for their, the faculty, occurs in the same clause as the anaphor. Thus neither strictly intraclausal nor strictly interclausal methods are appropriate. Webber is aware of this problem (1978a:48), and believes that it suffices that such information as is available be used to rule out impossible choices; the use of vague temporary IDs then allows the anaphor to be resolved.</Paragraph>
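To make the simplest ID-rule of this section concrete, the step from a logical form of shape (5-3) to an invoking description of shape (5-4) can be sketched as below. This is only an illustrative sketch, not Webber's implementation: the names ExistsForm, InvokingDescription and apply_existential_id_rule are hypothetical, logical forms are reduced to plain strings, the check that the variable is free in the body is a crude substring test, and the conjunction is printed as "∧" where the text above uses an ampersand.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ExistsForm:
    """A logical form of shape (5-3): an existential with restriction C and body Fx."""
    var: str          # the bound variable, e.g. "x"
    restriction: str  # the predicate C, e.g. "Crayon"
    body: str         # the open sentence Fx, written with var free


@dataclass(frozen=True)
class InvokingDescription:
    """An invoking description of shape (5-4)."""
    label: str
    var: str
    conjuncts: Tuple[str, ...]

    def render(self) -> str:
        return f"{self.label} ι{self.var}: " + " ∧ ".join(self.conjuncts)


def apply_existential_id_rule(
    form: ExistsForm, sentence_id: str, label: str
) -> InvokingDescription:
    """Build the description evoked by a sentence whose logical form is `form`."""
    if form.var not in form.body:
        # Crude check that the variable occurs in the body (substring test only).
        raise ValueError("the bound variable must occur in the body")
    return InvokingDescription(
        label=label,
        var=form.var,
        conjuncts=(
            f"{form.restriction} {form.var}",    # Cx
            form.body,                           # Fx
            f"evoke {sentence_id},{form.var}",   # evoke S,x
        ),
    )


crayon = ExistsForm(var="x", restriction="Crayon", body="Bought Wendy,x")
description = apply_existential_id_rule(crayon, sentence_id="(5-1)", label="e1")
print(description.render())
```

Because the rule is an ordinary function of the logical form, it could equally be called only when a later pronoun needs an antecedent, in the spirit of Webber's suggestion that IDs be generated on demand.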
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 Conclusions
</SectionTitle>
      <Paragraph position="0"> It remains to discuss the strengths and weaknesses of Webber's approach, and she herself (in contradistinction to some other workers) is as quick to point out the latter as the former. The reader is therefore referred to her thesis (1978a) for this. However, I will make some global comments on the important aspects relevant here.</Paragraph>
      <Paragraph position="1"> Webber's main contributions, as I see them, are as follows: 1. The anaphor resolution problem is approached from the point of view of determining what an adequate representation would be, rather than trying to fit (to straitjacket?) a resolution mechanism into  some pre-existing and perhaps arbitrarily chosen representation; and the criteria of adequacy for the representation are rigorously enumerated.</Paragraph>
      <Paragraph position="2"> 2. A formalism in which it is possible to  compute focus elements as they are needed, rather than having them sitting round in advance (as in Grosz's system), perhaps never to be used, is provided (but compare my further remarks below).</Paragraph>
      <Paragraph position="3">  3. Webber brings to NLU anaphora research the formality and rigor of logic, something that has been previously almost unseen.</Paragraph>
      <Paragraph position="4"> 4. Previously ignored problems of quantification are dealt with.</Paragraph>
      <Paragraph position="5"> 5. The formalism itself is an important contribution.</Paragraph>
      <Paragraph position="6"> The shortcomings, as I see them, are as follows: 1. The formalism relies very much on antecedents being in the text. Entities evoked by, but not explicit in, the text cannot in general be adequately handled (in contrast to Grosz's system).</Paragraph>
      <Paragraph position="7"> 2. The formalism is not related to discourse structure. So, for example, it contains nothing to discourage the use of the table as the antecedent in (2-3). It remains to be seen if discourse pragmatics can be adequately integrated with the formalism or otherwise accounted for in a system using the formalism.</Paragraph>
      <Paragraph position="8">  3. Intrasentential and intraclausal anaphora are not adequately dealt with.</Paragraph>
      <Paragraph position="9"> 4. Webber does not relate her discussions of representational adequacy to currently popular knowledge representations. If  frames, for example, are truly inadequate we would like to have some watertight proof of this before abandoning current NLU projects attempting to use frames.</Paragraph>
      <Paragraph position="10"> It will be noticed that contribution 2 and shortcoming 1 are actually two sides of the same coin -- it is static pre-available knowledge that allows non-textual entities to be easily found -- and clearly a synthesis will be necessary here.</Paragraph>
      <Paragraph position="11"> 6. Discourse-cohesion approaches to anaphora resolution
Another approach to coreference resolution attempts to exploit local discourse cohesion, building a representation of the discourse with which references can be resolved. This approach has been taken by (inter alia) Klappholz and Lockman (1977; Lockman 1978). By using only cues to the discourse structure at the sentence level or lower, one avoids the need to search for referents in pre-determined dialogue models such as those of Grosz's task-oriented dialogues, or rigidly predefined knowledge structures such as scripts (Schank and Abelson 1977) and frames (Minsky 1975), which Klappholz and Lockman, for example, call overweight structures that inflexibly dominate processing of text. Klappholz and Lockman emphasize that the structure through which reference is resolved must be dynamically built up as the text is processed; frames or scripts could assist in this building, but cannot, however, be reliably used for reference resolution, because deviations by the text from the pre-defined structure will cause errors. The basis of this approach is that there is a strong interrelationship between coreference and the cohesive ties in a discourse that make it coherent. By determining what the cohesive ties in a discourse are, one can put each new sentence or clause, as it comes in, into the appropriate place in a growing structure that represents the discourse. This structure can then be used as a focus to search for coreference antecedents, since not only do coherently connected sentences tend to refer to the same things, but knowledge of the cohesion relation can provide additional reference resolution restraints. 
Hobbs (1979) in particular sees the problem of coreference resolution as being automatically solved in the process of discovering the coherence relations in a text. (An example of this will be given in Section 6.2.) Conversely, it is frequently helpful or necessary to resolve coreference relations in order to discover the coherence relations. This is not a vicious circle, claims Hobbs, but a spiral staircase. In our discussion below, we will cover four issues:  1. deciding on a set of possible coherence relations; 2. detecting them when they occur in a text; 3. using the coherence relations to build a focus structure; and 4. searching for referents in the structure.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
6.1 Coherence relations
</SectionTitle>
      <Paragraph position="0"> The first thing required by this approach is a complete and computable set of the coherence relations that may obtain between sentences and/or clauses.</Paragraph>
      <Paragraph position="1"> Various sets have been suggested by many people, including Eisenstadt (1976), Phillips (1977), Pitkin (1977a, 1977b), Hirst (1977b, 1978), Lockman (1978), Hobbs (1978, 1979) and Reichman (1978). 15 None of these sets fulfill all desiderata; and while Halliday and Hasan (1976) provide an extensive analysis of cohesion, it does not fit within our computational framework of coherence relations, and those, such as Hobbs, Lockman, Eisenstadt and Hirst, who emphasize computability, provide sets insufficient, I believe, to capture all the semantic subtleties of discourse cohesion. Nevertheless, the works cited above undoubtedly serve as a useful starting point for development of this area.</Paragraph>
      <Paragraph position="2"> To illustrate what a very preliminary set of cohesion relations could look like, I will briefly present a set abstracted from the various sets of Eisenstadt, Hirst, Hobbs, Lockman and Phillips (but not faithful to any one of these).</Paragraph>
      <Paragraph position="3"> The set contains two basic classes of coherence relations: expansion or elaboration on an entity, concept or event in the discourse, and temporal continuation or time flow. Expansion includes relations like EFFECT, CAUSE, SYLLOGISM, ELABORATION, CONTRAST, PARALLEL and EXEMPLIFICATION. In the following examples, "*" is used to indicate the point where the cohesive tie illustrated is acting: (6-1) [ELABORATION] To gain access to the latch-housing, remove the control panel cover. * Undo both screws and rock it gently until it snaps out from the mounting bracket.</Paragraph>
      <Paragraph position="4"> (6-2) [CONTRAST] The hoary marmot likes to be scratched behind the ears by its mate, * while in the lesser dormouse, nuzzling is the primary behavior promoting pairbonding. (6-3) [EFFECT] Ross pulled out the bottom module. * The entire structure collapsed.</Paragraph>
      <Paragraph position="5"> (6-4) [CAUSE] Ross scratched his head furiously. * The new Hoary Marmot™ shampoo that he used had made it itch unbearably.</Paragraph>
      <Paragraph position="6"> (6-5) [SYLLOGISM] Nadia goes to the movies with Ross on Fridays. Today's Friday, * so I guess she'll be going to the movies.</Paragraph>
      <Paragraph position="7"> (6-6) [PARALLEL] Nearly all our best men are dead! Carlyle, Tennyson, Browning, George Eliot? -- * I'm not feeling very well myself! 16 (6-7) [EXEMPLIFICATION] Many of our staff are keen amateur ornithologists. * Nadia has written a book on the Canadian triller, and Daryel once missed a board meeting because he was high up a tree near Gundaroo, watching the hatching of some rare red-crested snipes.</Paragraph>
      <Paragraph position="8"> (One may disagree with my classification of some of the relations above; the boundaries between categories are yet ill-defined, and it is to be expected that some people's intuitions will differ from mine.) 15 Reichman's coherence relations operate at paragraph level rather than sentence or clause level.</Paragraph>
      <Paragraph position="9">  Temporal flow relations involve some continuation forwards or backwards over time: (6-8) VICTORIA -- A suntanned Prince Charles arrived here Sunday afternoon, * and was greeted with a big kiss by a pretty English au pair girl. 17 (6-9) SAN JUAN, Puerto Rico -- Travel officials tackled a major job here Sunday to find new accommodations for 650 passengers from the burned Italian cruise liner Angelina Lauro.</Paragraph>
      <Paragraph position="10"> * The vessel caught fire Friday while docked at Charlotte Amalie in the Virgin Islands, but most passengers were ashore at the time. 18 Temporal flow may be treated as a single relation, as Phillips, for example, does, or it may be subdivided, as by Eisenstadt and Hirst, into categories like TIME STEP, FLASHBACK, FLASHFORWARD, TIME EDIT, and so on. Certainly, time flow in a text may be quite contorted, as in (6-10) (from Hirst 1978); "*" indicates a point where the direction of the time flow changes: (6-10) Slowly, hesitantly, Ross approached Nadia. * He had waited for this moment for many days. * Now he was going to say the words * which he had agonized over * and in the very room * he had often dreamed about. * He gazed lovingly at her soft green eyes.</Paragraph>
      <Paragraph position="11"> It is not clear, however, to what extent an analysis of time flow is necessary for anaphor resolution. I suspect that relatively little is necessary -- less than is required for other aspects of discourse understanding. I see relations like those exemplified above as primitives from which more complex relations could be built. For example, the relation between the two sentences of (6-3) above clearly involves FORWARD TIME STEP as well as EFFECT. I have hypothesized elsewhere (Hirst 1978) the possibility of constructing a small set of discourse relations (with cardinality about twenty or less) from which more complex relations may be built up by simple combination, and, one hopes, in such a way that the effects of relation R_1+R_2 would be the sum of the individual effects of relations R_1 and R_2. Rules for permitted combinations would be needed; for example, FORWARD TIME STEP could combine with EFFECT, but not with BACKWARD TIME STEP.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> 17 From: The Vancouver Express, 2 April 1979, page A1.</Paragraph>
    <Paragraph position="1"> 18 From: The Vancouver Express, 2 April 1979, page A5.</Paragraph>
    <Paragraph position="2"> What would the formal definition of a coherence relation be like? Here is Hobbs's (1979:73) definition of ELABORATION: Sentence S_1 is an ELABORATION of sentence S_0 if some proposition P follows from the assertions of both S_0 and S_1, but S_1 contains a property of one of the elements of P that is not in S_0. The example in the next section will clarify this.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
6.2 An example of anaphor resolution using a coherence relation
</SectionTitle>
      <Paragraph position="0"> It is appropriate at this stage to give an example of the use of coherence relations in the resolution of anaphors. I will present an outline of one of Hobbs's; for the fine details I have omitted, see Hobbs (1979:78-80). The text is this: (6-11) John can open Bill's safe. He knows the combination.</Paragraph>
      <Paragraph position="1"> We want an NLU system to recognize the cohesion relation operating here, namely ELABORATION, and identify he as John and the combination as that of Bill's safe. We assume that in the world knowledge that the system has are various axioms and rules of inference dealing with such matters as what combinations of safes are and knowledge about doing things. Then, from the first sentence of (6-11), which we represent as (6-12): (6-12) can (John, open (Bill's-safe)) (we omit the details of the representation of Bill's safe), we can infer: (6-13) know (John, cause (do (John, a), open (Bill's-safe))) "John knows that he can perform an action a that will cause Bill's-safe to be open" From the second sentence of (6-11), namely: (6-14) know (he, combination (comb, y)) "someone, he, knows the combination comb to something, y" we can infer, using knowledge about combinations: (6-15) know (he, cause (dial (comb,y), open (y))) "he knows that by causing the dialing of comb on y, the state in which y is open will be brought about" Recognizing that (6-13) and (6-15) are nearly identical, and assuming that some coherence relation does hold, we can identify he with John, y with Bill's-safe, and the definition of the ELABORATION relation is satisfied. In the process, the required referents were found.</Paragraph>
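The matching of (6-13) against (6-15) can be illustrated with ordinary first-order unification. The sketch below is an assumption-laden simplification rather than Hobbs's procedure: terms are nested tuples, names beginning with "?" are variables, the unspecified action do(John, a) of (6-13) is abstracted to a single variable ?action so that the two forms unify exactly instead of being "nearly identical", and there is no occurs check. The names unify, resolve, FIRST and SECOND are hypothetical.

```python
def is_var(term):
    """Variables are strings that start with '?'."""
    return isinstance(term, str) and term.startswith("?")


def walk(term, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(a, b, subst=None):
    """Return a substitution that makes a and b equal, or None if none exists."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


def resolve(term, subst):
    """Apply a substitution all the way through a term."""
    term = walk(term, subst)
    if isinstance(term, tuple):
        return tuple(resolve(part, subst) for part in term)
    return term


# (6-13), with the unspecified action abstracted to the variable ?action.
FIRST = ("know", "John", ("cause", "?action", ("open", "Bills-safe")))
# (6-15), with the pronoun and the unknown safe as variables.
SECOND = ("know", "?he", ("cause", ("dial", "?comb", "?y"), ("open", "?y")))

bindings = unify(FIRST, SECOND)
print(resolve("?he", bindings))      # the referent found for "he"
print(resolve("?y", bindings))       # the object the combination belongs to
print(resolve("?action", bindings))  # the action identified with dialing
```

Under these assumptions the substitution binds ?he to John and ?y to Bills-safe, so the referents fall out of the match itself, which is the point of the example above; the inference steps that produce (6-13) and (6-15) in the first place are not modelled.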
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
6.3 Lockman's contextual reference resolution algorithm
</SectionTitle>
      <Paragraph position="0"> Given a set of discourse cohesion relations, how may their use in a text be computationally recognized and employed to build a structure that represents the discourse and can be used as a focus for reference resolution? Only Hobbs (1978, 1979) and Lockman (1978; Klappholz and Lockman 1977) seem to have considered these aspects of the problem, though Eisenstadt (1976) discusses some of the requirements in world knowledge and inference that would be required. In this section we look at Lockman's work. Lockman does not separate the three processes of recognizing cohesion, resolving references and building the representation of the discourse. Rather, as befits such interrelated processes, all three are carried out at the same time. His contextual reference resolution algorithm (CRRA) works as follows: The structure to be built is a tree, initially null, of which each node is a sentence and each edge a coherence relation. As each new sentence comes in, the CRRA tries to find the right node of the tree to attach it to, starting at the leaf that is the previous sentence and working back up the tree in a specified search order (discussed below) until a connection is indicated. Lockman assumes the existence of a judgment mechanism that generates and tests hypotheses as to how the new sentence may be feasibly connected to the node being tested. The first hypothesis whose likelihood exceeds a certain threshold is chosen.</Paragraph>
      <Paragraph position="1"> The hypotheses consider both the coherence and the coreference relations that may obtain. Each member of the set of coherence relations is hypothesized, and for each one, all possible coreference relations between the conceptual tokens of the new sentence and tokens in the node under consideration (or nearby it in the tree) are posited. (The search for tokens goes back as far as necessary in the tree until suitable tokens are found for all unfulfilled definite noun phrases.) The hypotheses are considered in parallel; if none are judged sufficiently likely, the next node or set of nodes will be considered for feasible connection to the current sentence.</Paragraph>
      <Paragraph position="2"> The search order is as follows: First the immediate context, the previous sentence, is tried. If no feasible connection is found, then the immediate ancestor of this node, and all its other descendants, are tried in parallel. If the algorithm is still unsuccessful, the immediate ancestor of the immediate ancestor, and the descendants thereof, are tried, and so on up the tree.</Paragraph>
      <Paragraph position="3"> If a test of several nodes in parallel yields more than one acceptable node, the one nearest the immediate context is chosen.</Paragraph>
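The search order just described can be sketched as follows. This is a hedged illustration, not Lockman's code: the names Node, search_batches and find_attachment are hypothetical, the judgment mechanism is reduced to a caller-supplied function judge(node, sentence) returning a likelihood, "nearest" is taken to mean smallest tree distance from the immediate context, and the "parallel" tests of a batch are simply run one after another.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional


@dataclass(eq=False)  # identity-based equality, so nodes can be stored in a set
class Node:
    sentence: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def attach(self, sentence: str) -> "Node":
        child = Node(sentence, parent=self)
        self.children.append(child)
        return child


def subtree(node: Node) -> Iterator[Node]:
    """Yield a node followed by all of its descendants."""
    yield node
    for child in node.children:
        yield from subtree(child)


def depth(node: Node) -> int:
    steps = 0
    while node.parent is not None:
        node, steps = node.parent, steps + 1
    return steps


def distance(a: Node, b: Node) -> int:
    """Number of edges on the tree path between a and b (same tree assumed)."""
    da, db, steps = depth(a), depth(b), 0
    while da > db:
        a, da, steps = a.parent, da - 1, steps + 1
    while db > da:
        b, db, steps = b.parent, db - 1, steps + 1
    while a is not b:
        a, b, steps = a.parent, b.parent, steps + 2
    return steps


def search_batches(context: Node) -> Iterator[List[Node]]:
    """Yield the sets of nodes to test, in the order described in the text."""
    tried = {context}
    yield [context]  # the immediate context is tried first, on its own
    ancestor = context.parent
    while ancestor is not None:
        # The next ancestor together with its descendants not tried so far.
        batch = [n for n in subtree(ancestor) if n not in tried]
        tried.update(batch)
        yield batch
        ancestor = ancestor.parent


def find_attachment(
    context: Node,
    sentence: str,
    judge: Callable[[Node, str], float],
    threshold: float,
) -> Optional[Node]:
    """Return the node the new sentence should be attached to, if any."""
    for batch in search_batches(context):
        acceptable = [n for n in batch if judge(n, sentence) > threshold]
        if acceptable:
            # Several acceptable nodes: the one nearest the immediate context wins.
            return min(acceptable, key=lambda n: distance(n, context))
    return None
```

Note that the search stops at the first batch containing an acceptable node, so a more likely attachment point higher in the tree is never examined, and that with no acceptable node anywhere the loop visits the entire tree before giving up.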
      <Paragraph position="4"> If the current sentence is not a simple sentence, it is not broken into clauses dealt with individually, but rather converted to a small sub-tree, reflecting the semantic relationship between the clauses. The conversion is based simply upon a table look-up indexed on the structure of the parse tree of the sentence.</Paragraph>
      <Paragraph position="5"> One of the nodes is designated by the table look-up as the head node, and the sub-tree is attached to the pre-existing context tree, using the procedure described above, with the connection occurring at this node. Similarly one (or more) of the nodes is designated as the immediate context, the starting point for the next search. (The search will be conducted in parallel if there is more than one immediate context node.) There are some possible problems with Lockman's approach. The first lies in the fact that the structure built grows without limit, and therefore a search in it could, in theory, run right through an enormous tree.</Paragraph>
      <Paragraph position="6"> Normally, of course, a feasible connection or desired referent will be found fairly quickly, close to the immediate context. However, should the judgment mechanism fail to spot the correct one, the algorithm may run a little wild, searching large areas of the structure needlessly and expensively, possibly lighting on a wrong referent or wrong node for attachment, with no indication that an error has occurred. In other words, Lockman's CRRA places much greater trust in the judgment mechanism than a system like Grosz's that constrains the referent search area -- more trust than perhaps should be put in what will necessarily be the most tentative and unreliable part of the system.</Paragraph>
      <Paragraph position="7"> Secondly, I am worried about the syntax-based table look-up for sub-trees for complex sentences. On the one hand, it would be nice if it were correct, simplifying processing. On the other hand, I cannot but feel that it is an over-simplification, and that effects of discourse theme cannot reliably be handled in this way. However, I have no counterexamples to give, and suggest that this question needs more investigation. The third possible problem, and perhaps the most serious, concerns the order in which the search for a feasible connection takes place. Because the first hypothesis whose likelihood exceeds the threshold is selected, it is possible to miss an even better hypothesis further up the tree. In theory, this could be avoided by doing all tests in parallel, the winning hypothesis being judged on both likelihood and closeness to the immediate context. In practice, given the ever-growing context tree as discussed above, this would not be feasible, and some way to limit the search area would be needed.</Paragraph>
      <Paragraph position="8"> The fourth problem lies in the judgment mechanism itself. Lockman frankly admits that the mechanism, incorporated as a black box in his algorithm, must have abilities far beyond those of present state-of-the-art inference and judgment systems. The problem is that it is unwise to predicate too much on the nature of this unbuilt black box, as we do not know yet if its input-output behavior could be as Lockman posits. It may well be that to perform as required, the mechanism will need access to information such as the sentence following the current one (in effect, the ability to delay a decision), or more information about the previous context than the CRRA retains or ever determines; in fact, it may need an entirely different discourse structure representation from the tree being built. In other words, while it is fine in theory to design a reference resolver around a black box, in practice it may be computationally more economical to design the reference resolver around a knowledge of how the black box actually works, exploiting that mechanism, rather than straitjacketing the judgment module into its pre-defined cabinet; thus Lockman's work may be premature.</Paragraph>
      <Paragraph position="9"> None of these problems are insurmountable. However, it is perhaps a little unfortunate that Lockman's work offers little of immediate use for NLU systems of the present day.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
6.4 Conclusions
</SectionTitle>
      <Paragraph position="0"> Clearly, much work remains to be done if the coherence/cohesion paradigm of NLU is to be viable.</Paragraph>
      <Paragraph position="1"> Almost all aspects need refinement. However, it is an intuitively appealing paradigm, and it will be interesting to see if it can be developed into functioning NLU systems.</Paragraph>
    </Section>
  </Section>
</Paper>