<?xml version="1.0" standalone="yes"?>
<Paper uid="E99-1006">
  <Title>Resolving Discourse Deictic Anaphora in Dialogues</Title>
  <Section position="3" start_page="37" end_page="38" type="metho">
    <SectionTitle>
2 Anaphor Types in Dialogues
</SectionTitle>
    <Paragraph position="0"> In the dialogues examined, only 45.1% of the anaphors are individual anaphors, i.e., anaphors with NP-antecedents (IPro, IDem), e.g., (3) Boeing ought to hire him_i and give him_i a junkyard_j .... and see if he_i could build a Seven Forty-Seven out of it_j. (sw2102) 22.6% of the anaphors are discourse deictic, i.e., they co-specify with non-NP constituents such as VPs, sentences, or strings of sentences (DDPro, DDDem; cf. Webber (1991)). The phenomenon of discourse deictic anaphora in written texts has been shown to be strongly dependent on discourse structure. As can also be seen in the examples below, anaphoric reference is restricted to elements adjacent to the utterance containing the anaphor, i.e., those on the right frontier of the discourse structure tree (Webber, 1991; Asher, 1993). The existence of abstract object anaphora shows that aside from individual entities, the discourse model may also contain complex, higher-order entities. One of the differences between individual and discourse deictic anaphora is that whereas a concrete NP antecedent usually only refers to the individual it describes, a sentence may simultaneously denote an eventuality, a concept, a proposition and a fact.</Paragraph>
    <Paragraph position="1"> Instead of assuming that all levels of abstract objects are introduced to the discourse model by the sentence that makes them available, it has been suggested that anaphoric discourse deictic reference involves referent coercion (Webber, 1991; Asher, 1993; Dahl and Hellman, 1995). This assumption is further justified by the fact that discourse deictic reference, as opposed to individual anaphoric reference, is often established by demonstratives rather than pronouns. In theories relating cognitive status and choice of NP-form (cf. Gundel et al. (1993)), pronouns are only available for the most salient entities, whereas demonstratives can be used to shift the focus of attention to a different entity.</Paragraph>
    <Paragraph position="2"> A further 19.1% of anaphors are Inferrable-Evoked Pronouns (IEPro), a particular type of plural pronoun which indirectly co-specifies with a singular antecedent. This group includes existential, generic and corporate 3rd person plural pronouns (Jaeggli, 1986; Belletti and Rizzi, 1988).</Paragraph>
    <Paragraph position="3"> (6) I think the Soviet Union knows what we have and knows that we're pretty serious and if they ever tried to do anything, we would, we would be on the offensive. (sw3241) In (6), the NP Soviet Union can be associated with inferrables such as the population or the government.</Paragraph>
    <Paragraph position="4"> These can subsequently be referred to by pronouns without having been explicitly mentioned themselves.</Paragraph>
    <Paragraph position="5"> In some cases of IEPro's there is no associated NP, as in the following example, where the speaker is referring to the organisers of the Switchboard calls: (7) this is the first call I've done [...] and, I didn't realize that they ha-, were going to reach out to people from [...] all over the country. (sw2041)</Paragraph>
    <Paragraph position="6"> 13.2% of the anaphors are vague (VagPro, VagDem), in the sense that they refer to the general topic of conversation and, as opposed to discourse deictic anaphors, do not have a specific clause as an antecedent, e.g., (8) B.29: I mean, the baby is like seventeen months and she just screams.</Paragraph>
    <Paragraph position="7"> A.30: Uh-huh.</Paragraph>
    <Paragraph position="8"> B.31: Well even if she knows that they're fixing to get ready to go over there.</Paragraph>
    <Paragraph position="9"> They're not even there yet - A.32: Uh-huh.</Paragraph>
    <Paragraph position="10"> B.33: -you know.</Paragraph>
    <Paragraph position="11"> A.34: Yeah. It's hard.</Paragraph>
    <Paragraph position="12"> Non-referring pronouns, or expletives, were not marked. These include subjects of weather verbs, those in raising verb constructions or those occurring in sentences with extraposed sentential subjects or objects, e.g., (9) It's hard to realize, that there are places that are just so, uh, bare on the shelves as there. (sw2403)</Paragraph>
    <Paragraph position="13"> This group also contains the various subcategorised expletives (Postal and Pullum, 1988), defined as being non-referring pronouns in argument positions, e.g., [Proceedings of EACL '99] (10) Uh, they don't need somebody else coming in and saying, you know, okay we're going to be with them and we're going to zap it to you. (sw2403)</Paragraph>
    <Paragraph position="14"> (11) When it comes to trucks, though, I would probably think to go American. (sw2326) They differ from referring anaphors in that they cannot be questioned (e.g., *When what comes to trucks?).</Paragraph>
  </Section>
  <Section position="4" start_page="38" end_page="41" type="metho">
    <SectionTitle>
3 Synchronising Units
</SectionTitle>
    <Paragraph position="0"> The domain which contains potential antecedents is not given in syntactic terms in spoken dialogue. Hence we define this domain in pragmatic terms. We assume that discourse entities enter the joint discourse model and are available for subsequent reference when common ground between the discourse participants is established. Our model builds on the observation that certain dialogue acts - in particular acknowledgments - signal that common ground is achieved. Our assumptions are based on Clark's (1989) theory of contributions (cf. also Traum (1994)).</Paragraph>
    <Paragraph position="1"> Each dialogue is divided into short, clearly defined dialogue acts - Initiations I and Acknowledgments A - based on the top of the hierarchy given in Carletta et al. (1997). Each sentence and each conjoined clause counts as a separate I, even if they are part of the same turn. A's do not convey semantic content but have a pragmatic function (e.g., backchannel). In addition there are utterances which function as an A but also have semantic content - these are labelled as A/I.</Paragraph>
    <Paragraph position="2"> A single I is paired with an A and they jointly form a Synchronising Unit (SU). In longer turns, each main clause functions as a separate unit along with its subordinate clauses. Single I's constitute SU's by themselves and do not require explicit acknowledgment.</Paragraph>
    <Paragraph position="3"> The assumption is that by letting the speaker continue, the hearer implicitly acknowledges the utterance. It is only in the context of turn-taking that I's and A's are paired up.</Paragraph>
    <Paragraph position="4"> Our model is based on the observation that common ground has an influence on attentional state. We assume that only entities in a complete SU are entered into the common ground and remain in the S-List for the duration of a further SU. If one speaker's I is not acknowledged by the other participant, it cannot be included in an SU. In this case the discourse entities mentioned in the unacknowledged I are added to the S-List but are immediately deleted again when the subsequent I clearly shows that they are not part of the common ground.</Paragraph>
    <Paragraph position="5"> Figure 1 below, taken from the Trains corpus (speakers s and u), illustrates that a missing acknowledgment prevents the discourse model from containing discourse entities from the unacknowledged turn.
      SUi  I    s: so there- the five boxcars of oranges &lt;sil&gt; + that are at- +
                S-List: [5 boxcars of oranges]
      SUj  A/I  u: +at &lt;sil&gt; +at Corning
                S-List: [5 boxcars of oranges, Corning]
           A    s: um
      -    I    u: okay the orange warehouse &lt;sil&gt; um I + have to +
                S-List: [Corning, orange warehouse]
      SUk  I    s: you need + you need to get five &lt;sil&gt;
      Speaker u's second turn is an I which is not followed by an A. This means that the entity referred to in that utterance (orange warehouse) is immediately removed from the joint discourse model. Thus "there" in the final two turns co-specifies with Corning and not with the most recent entity, orange warehouse.</Paragraph>
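    <Paragraph> The bookkeeping described in this section can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name SList, its method names, and the simplified entity labels are assumptions introduced here, and deciding when an utterance counts as an acknowledgment is left to the caller.

```python
class SList:
    """Toy S-List that only keeps entities licensed by Synchronising Units."""

    def __init__(self):
        self.grounded = []  # entities of the most recent complete SU
        self.pending = []   # entities of an I that is not (yet) acknowledged

    def initiate(self, entities):
        """A new I: its entities become visible, but are not yet grounded."""
        self.pending = list(entities)

    def acknowledge(self):
        """The pending I was acknowledged, so the SU is complete.

        Entities of the older SU have now lasted for one further SU and are
        dropped; the newly completed SU takes their place.
        """
        self.grounded = self.pending
        self.pending = []

    def drop_unacknowledged(self):
        """The next I shows that the pending I never became common ground."""
        self.pending = []

    def entities(self):
        return self.grounded + self.pending


# Replaying the turns of Figure 1 (entity names are simplified labels).
s_list = SList()
s_list.initiate(["5 boxcars of oranges"])  # SUi: I by s
s_list.acknowledge()                       # u's A/I acknowledges it ...
s_list.initiate(["location"])              # ... and initiates in the same turn
after_suj = s_list.entities()
s_list.acknowledge()                       # A by s completes SUj
s_list.initiate(["orange warehouse"])      # I by u, never acknowledged
before_drop = s_list.entities()
s_list.drop_unacknowledged()               # s starts a new I instead
print(s_list.entities())                   # prints ['location']
```

    In this sketch an A/I utterance is modelled as an acknowledge() call followed by an initiate() call, which mirrors its double function in the text.</Paragraph>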
    <Paragraph position="6"> 4 How to Resolve Discourse Deictic Anaphora
    We now turn to our method of anaphora resolution, which extends the algorithm presented in Strube (1998) to account for discourse deictic anaphora as well as individual anaphora.</Paragraph>
    <Section position="1" start_page="39" end_page="39" type="sub_section">
      <SectionTitle>
4.1 Anaphor-antecedent Compatibility
</SectionTitle>
      <Paragraph position="0"> As indicated in Section 2, information provided by the subcategorisation frame of the anaphor's predicate can be used to determine the type of the referent. In the algorithm, we make use of the notion of anaphor-antecedent compatibility to distinguish between discourse deictic and individual reference. Certain predicates (notably verbs of propositional attitude) require one of their arguments to have a referent whose meaning is correlated with sentences, e.g., is true, assume (referred to as SC-bias verbs in Garnsey et al. (1997) and elsewhere). Pronouns in these positions rarely have concrete individual NP-antecedents and are generally only compatible with discourse deictic referents. Other argument positions are preferentially associated with concrete individuals (e.g., objects of eat, smell) (DO-bias verbs). A summary of these predicate types is provided in Figure 2, where I-incompatible means preferentially associated with abstract objects and A-incompatible means preferentially associated with individual objects (footnote 1). Anaphors which are in argument positions of the first type are classified as discourse deictic (DDPro; DDDem), those in argument positions of the second type are classified as individual anaphora (IPro; IDem).</Paragraph>
      <Paragraph position="2"> I-Incompatible (*I)
        * Equating constructions where a pronominal referent is equated with an abstract object, e.g., x is making it easy, x is a suggestion.</Paragraph>
      <Paragraph position="3"> * Copula constructions whose adjectives can only be applied to abstract entities, e.g., x is true, x is false, x is correct, x is right, x isn't right.</Paragraph>
      <Paragraph position="4"> * Arguments of verbs describing propositional attitude which only take S'-complements, e.g., assume.
        * Object of do.</Paragraph>
      <Paragraph position="5"> * Predicate or anaphoric referent is a "reason", e.g., x is because I like her, x is why he's late.</Paragraph>
      <Paragraph position="6"> A-Incompatible (*A)
        * Equating constructions where a pronominal referent is equated with a concrete individual referent, e.g., x is a car.</Paragraph>
      <Paragraph position="7"> * Copula constructions whose adjectives can only be applied to concrete entities, e.g., x is expensive, x is tasty, x is loud.</Paragraph>
      <Paragraph position="8"> * Arguments of verbs describing physical contact/stimulation, which cannot be used metaphorically, e.g., break x, smash x, eat x, drink x, smell x but NOT *see x.</Paragraph>
      <Paragraph position="9"> It is clear that predicate information alone is not sufficient for this purpose as there is a large group of verbs which allow both individual and discourse deictic referents (e.g., objects of see, know) (EQ-bias verbs). In these cases the preference is determined by NP-form of the anaphor (pronoun vs. demonstrative).</Paragraph>
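      <Paragraph> The compatibility check can be pictured as a lookup over predicate and argument slot. The sketch below is only an illustration: the lexicon entries come from the examples in Figure 2, but the (lemma, slot) encoding, the function names, and the returned labels are assumptions made here. For ambiguous predicates it applies the NP-form preference described in Section 4.3 (pronouns prefer individual antecedents, demonstratives abstract ones).

```python
# Hypothetical lexicon keyed by (predicate lemma, argument slot); the entries
# are taken from the examples in Figure 2, the encoding is an assumption.
I_INCOMPATIBLE = {
    ("be true", "subject"), ("be false", "subject"), ("be correct", "subject"),
    ("be right", "subject"), ("assume", "object"), ("do", "object"),
}
A_INCOMPATIBLE = {
    ("be expensive", "subject"), ("be tasty", "subject"), ("be loud", "subject"),
    ("break", "object"), ("smash", "object"), ("eat", "object"),
    ("drink", "object"), ("smell", "object"),
}


def compatibility(predicate, slot):
    """Return '*I', '*A' or 'ambiguous' for an anaphor in this argument slot."""
    key = (predicate, slot)
    if key in I_INCOMPATIBLE:
        return "*I"
    if key in A_INCOMPATIBLE:
        return "*A"
    return "ambiguous"


def preferred_reading(predicate, slot, np_form):
    """Pick the reading tried first: 'discourse-deictic' or 'individual'."""
    compat = compatibility(predicate, slot)
    if compat == "*I":
        return "discourse-deictic"
    if compat == "*A":
        return "individual"
    # Ambiguous predicates (e.g. see, know): fall back on the NP form.
    return "individual" if np_form == "pronoun" else "discourse-deictic"


print(preferred_reading("do", "object", "pronoun"))
print(preferred_reading("see", "object", "demonstrative"))
```

      A string lookup of this kind is deliberately crude; as footnote 1 notes, compatibility is ultimately a semantic notion, so a realistic lexicon would need more than surface lemmas.</Paragraph>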
    </Section>
    <Section position="2" start_page="39" end_page="40" type="sub_section">
      <SectionTitle>
4.2 Types of Abstract Antecedents
</SectionTitle>
      <Paragraph position="0"> We follow Asher (1993) in assuming that the predicate of a discourse deictic anaphor determines the type of abstract object. An anaphor in the object position of the verb do, for example, can only have a VP (event-concept) antecedent (e.g., John [sang]. Bill did that too.), whereas an anaphor in the subject position of the predicate is true requires a full S (proposition) (e.g., [John sang]. That's true.). This verbal subcategorisation information is used to determine which part of the preceding I is required to form the correct referent.</Paragraph>
      <Paragraph position="1"> Following Webber and others, we assume that an abstract object is only introduced to the discourse model by the anaphor itself. In addition to the S-List (Strube, 1998), which contains the referents of NPs available for anaphoric reference, our model includes an A-List for abstract objects. This is only filled if discourse deictic pronouns or demonstratives occur, and its contents remain only for one I, which is necessary for multiple discourse deictic reference to the same entity. (Footnote 1: These are preferences and not strict rules because some I-incompatible contexts are compatible with NPs denoting abstract objects, e.g., The story/It is true, and with NPs which are used to stand elliptically for an event or state, e.g., His car/It is the reason why he's late. This shows that predicate compatibility must ultimately be defined in semantic terms and not just rely on syntactic strings (NP vs. S).)</Paragraph>
      <Paragraph position="2"> The following context ranking describes the order in which the parts of the linguistic context are accessed:
        1. A-List (containing abstract objects previously referred to anaphorically).</Paragraph>
      <Paragraph position="3"> 2. Within same I: Clause to the left of the clause containing the anaphor.</Paragraph>
      <Paragraph position="4"> 3. Within previous I: Rightmost main clause (and subordinated clauses to its right).</Paragraph>
      <Paragraph position="5"> 4. Within previous I's: Rightmost complete sentence (if previous I is incomplete sentence).</Paragraph>
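      <Paragraph> The four-step context ranking above can be written as a small function that returns candidates in access order. This is a sketch under stated assumptions: the Initiation record, its fields, and the reading of step 2 as "the nearest clause to the left" are introduced here for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Initiation:
    """One I: its main clauses from left to right (hypothetical structure)."""
    clauses: list
    complete: bool = True  # False if the I is an incomplete sentence


def context_ranking(a_list, left_clauses, previous):
    """Return candidate antecedents in the order in which they are accessed."""
    ranked = list(a_list)                    # 1. A-List
    if left_clauses:
        ranked.append(left_clauses[-1])      # 2. clause to the left, same I
    if previous:
        last = previous[-1]
        if last.clauses:
            ranked.append(last.clauses[-1])  # 3. rightmost main clause, previous I
        if not last.complete:
            # 4. rightmost complete sentence among the earlier I's
            for earlier in reversed(previous[:-1]):
                if earlier.complete and earlier.clauses:
                    ranked.append(earlier.clauses[-1])
                    break
    return ranked


history = [Initiation(["John sang"]), Initiation([], complete=False)]
print(context_ranking([], [], history))  # prints ['John sang']
```

      Returning a plain list keeps the ranking separate from the type check, so the resolution step can simply walk the list until a candidate fits.</Paragraph>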
    </Section>
    <Section position="3" start_page="40" end_page="41" type="sub_section">
      <SectionTitle>
4.3 The Algorithm
</SectionTitle>
      <Paragraph position="0"> The algorithm consists of two branches, one for the resolution of pronouns, the other for the resolution of demonstratives. Both of them call the functions resolveDD and resolveInd, which resolve discourse deictic anaphora and individual anaphora, respectively.</Paragraph>
      <Paragraph position="1"> If a pronoun is encountered (Figure 4, below), the functions resolveDD or resolveInd (described below) are evaluated, depending on whether the pronoun is I-incompatible (1) or A-incompatible (2). In the case of success the pronouns are classified as DDPro or IPro, respectively. In the case of failure, the pronouns are classified as VagPro. If the pronoun is neither I- nor A-incompatible (i.e., the pronoun is ambiguous in this respect), the classification depends only on the success of the resolution. The function resolveInd is evaluated first (3) because of the observed preference for individual antecedents for pronouns. If successful, the pronoun is classified as IPro; if unsuccessful, the function resolveDD attempts to resolve the pronoun (4). If this, in turn, is successful, the pronoun is classified as DDPro; if it is unsuccessful, it is classified as VagPro, indicating that the pronoun cannot be resolved using the linguistic context.</Paragraph>
      <Paragraph position="2"> The procedure is similar in the case of demonstratives (Figure 5, below). The only difference is that the antecedent of a demonstrative is preferentially an abstract object. The order of (3) and (4) is therefore reversed.</Paragraph>
      <Paragraph position="3"> We now turn to the function resolveDD (Figure 6, below) (assuming that resolveInd resolves individual anaphora and returns true or false depending on its success). In step (1) the function resolveDD examines all elements of the context ranking (Figure 3) until the function co-index, which evaluates whether the element is of the right type, succeeds. Then the function resolveDD returns true. If the pronoun is an argument of "do", the function co-index is tried on the VP of the current element of the context ranking (2). If successful, the VP-referent is added to the A-List and the function returns true. In (3), co-index evaluates whether the pronoun and the current element of the context ranking are compatible. In the case of a positive result, the element is added to the A-List and true is returned. If all elements of the context ranking are checked without success, resolveDD returns false (4).</Paragraph>
      <Paragraph position="5">
        1. foreach element of context ranking do
        2.   if (PRO is argument of do) then
               if (co-index PRO with VP of element) then add VP to A-List; return true
        3.   else if (co-index PRO with element) then add element to A-List; return true
        4. return false.</Paragraph>
      <Paragraph position="6"> Example 12 illustrates the algorithm: (12) B.8: I mean, if went and policed, just like you say, every country when they had squabbles, A.9: Well, but we've done it before, B.10: Oh, I know we have.</Paragraph>
      <Paragraph position="7"> A.11: and it has not been successful. (sw2403) When the pronoun "it" in A.9 is encountered, the algorithm determines the pronoun to be I-incompatible (Step 1 in Figure 4), as it is the object argument of the verb do. The function resolveDD is evaluated. The A-List is empty, so the highest ranked element in the context ranking is the last complete sentence in B.8. The pronoun is an argument of "do", and therefore gets co-indexed with the VP-referent of the sentence in B.8. The VP is added to the A-List, the function returns true and the pronoun is classified as DDPro by the algorithm.</Paragraph>
      <Paragraph position="8"> When the next pronoun is encountered, the A-List is empty again because of the intervening sentence (I) in B.10. The pronoun is neither I- nor A-incompatible, therefore the algorithm evaluates resolveInd (step 3). This fails, since there are no individual antecedents available in B.10, and the algorithm evaluates resolveDD in step (4). The first element in the context ranking is the main clause in A.11 which is co-indexed with the pronoun. The clause-referent is added to the A-List, the function returns true and the algorithm classifies the pronoun as DDPro. In this case, the classification is correct but not the resolution, since the pronoun should co-specify with the pronoun in A.9.</Paragraph>
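      <Paragraph> The two branches and the function resolveDD described in this section can be sketched in a few lines. This is a hedged illustration rather than the authors' code: the dictionary keys (form, compat, arg_of_do, clause, vp), the function names resolve_dd and classify, and the caller-supplied co_index and resolve_ind callables are all assumptions introduced here.

```python
def resolve_dd(anaphor, ranking, a_list, co_index):
    """Walk the context ranking; on success store the referent in the A-List."""
    for element in ranking:                        # step 1
        if anaphor["arg_of_do"]:                   # step 2: object of "do"
            if co_index(anaphor, element["vp"]):
                a_list.append(element["vp"])
                return True
        elif co_index(anaphor, element["clause"]):  # step 3
            a_list.append(element["clause"])
            return True
    return False                                   # step 4


def classify(anaphor, resolve_ind, resolve_dd_fn):
    """Return IPro/DDPro/VagPro, or IDem/DDDem/VagDem for demonstratives."""
    suffix = "Pro" if anaphor["form"] == "pronoun" else "Dem"
    if anaphor["compat"] == "*I":
        order = [("DD", resolve_dd_fn)]
    elif anaphor["compat"] == "*A":
        order = [("I", resolve_ind)]
    elif anaphor["form"] == "pronoun":
        order = [("I", resolve_ind), ("DD", resolve_dd_fn)]
    else:  # ambiguous demonstrative: abstract antecedents are tried first
        order = [("DD", resolve_dd_fn), ("I", resolve_ind)]
    for label, resolver in order:
        if resolver(anaphor):
            return label + suffix
    return "Vag" + suffix


# The pronoun "it" in A.9 of Example 12, with a permissive co_index stub.
a_list = []
ranking = [{"clause": "we went and policed every country",
            "vp": "went and policed every country"}]
it_a9 = {"form": "pronoun", "compat": "*I", "arg_of_do": True}
label = classify(
    it_a9,
    resolve_ind=lambda anaphor: False,
    resolve_dd_fn=lambda anaphor: resolve_dd(
        anaphor, ranking, a_list, lambda a, candidate: True),
)
print(label, a_list)
```

      Passing the resolvers in as callables keeps the classification order (individual first for pronouns, abstract first for demonstratives) visible in one place and makes the stubbed type check easy to replace.</Paragraph>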
    </Section>
  </Section>
  <Section position="5" start_page="41" end_page="4784" type="metho">
    <SectionTitle>
5 Empirical Evaluation
</SectionTitle>
    <Paragraph position="0"> In order to test the hypotheses made in the previous sections we performed an empirical evaluation on naturally occurring dialogues. First, the corpus was annotated for all relevant features, i.e., division of turns into dialogue act units, classification of dialogue acts (I, A), marking of noun phrases, classification of the various types of anaphors introduced in Section 2, and annotating coreference between anaphors and individual/abstract discourse entities. The last step provided the key for the test of the algorithm described in Section 4.3.</Paragraph>
    <Section position="1" start_page="41" end_page="4784" type="sub_section">
      <SectionTitle>
5.1 Annotation
</SectionTitle>
      <Paragraph position="0"> Our data consisted of five randomly selected dialogues from the Switchboard corpus of spoken telephone conversations (LDC, 1993). Two dialogues (SW2041, SW4877) were used to train the two annotators (the authors), and three further dialogues for testing (SW2403, SW3117, SW3241). The training dialogues were used for improving the annotation manual and for clarifying the annotation in borderline cases.</Paragraph>
      <Paragraph position="1"> After each step the annotations were compared using the κ statistic as reliability measure for all classification tasks (Carletta, 1996). A κ of 0.68 &lt; κ &lt; 0.80 allows tentative conclusions, while κ &gt; 0.80 indicates reliability between the annotators. In the following tables, the rows above the horizontal line show how often a particular class was actually marked as such by both annotators. In the rows below the line, N shows the total number of markables, while Z gives the number of agreements between the annotations. PA is percent agreement between the annotators, PE expected agreement by chance. Finally, κ is computed by the formula κ = (PA - PE)/(1 - PE).</Paragraph>
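      <Paragraph> As a worked illustration of the formula, the sketch below computes PA, PE and κ for two annotators. Estimating PE from the pooled label proportions of both annotators is one common choice and an assumption here; the text does not spell out how PE was obtained. The label sequences are invented for the example.

```python
from collections import Counter


def kappa(labels_a, labels_b):
    """Kappa = (PA - PE) / (1 - PE) for two annotators over the same items."""
    n = len(labels_a)
    agreements = sum(a == b for a, b in zip(labels_a, labels_b))  # Z
    pa = agreements / n                                           # PA
    # PE: chance agreement from the pooled distribution of both annotators.
    pooled = Counter(labels_a) + Counter(labels_b)
    pe = sum((count / (2 * n)) ** 2 for count in pooled.values())
    return (pa - pe) / (1 - pe)


annotator_a = ["I", "I", "A", "A"]
annotator_b = ["I", "I", "A", "I"]
print(round(kappa(annotator_a, annotator_b), 4))  # prints 0.4667
```

      Here PA is 3/4 and PE is 34/64, so κ = (0.75 - 0.53125)/(1 - 0.53125) = 7/15. The function is undefined when PE equals 1, i.e., when both annotators use a single label throughout.</Paragraph>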
      <Paragraph position="2"> Dialogue Acts. First, turns were segmented into dialogue act units. We turned the segmentation task into a classification task by using boundaries between dialogue acts as one class and non-boundaries as the other (see Passonneau and Litman (1997) for a similar practice). In Table 1, Non-Bound. and Bound. give the number of non-boundaries and boundaries actually marked by the annotators, N is the total number of possible boundary sites, while Z gives the number of agreements between the annotations.</Paragraph>
      <Paragraph position="3"> Table 2 shows the results of the comparison between the annotations with respect to the classification of the dialogue act units into Initiations (I), Acknowledgements (A), Acknowledgement/Initiations (A/I), and no dialogue act (No). For this test we used only those dialogue act units which the annotators agreed about. PA was 92.6% and κ = 0.87, again indicating that it is possible to annotate these classes reliably.</Paragraph>
    </Section>
    <Section position="2" start_page="4784" end_page="4784" type="sub_section">
      <SectionTitle>
Individual and Abstract Object Anaphora.
</SectionTitle>
      <Paragraph position="0"> Table 3 (footnote 2) shows the reliability scores for the classification of pronouns in the classes IPro, DDPro, VagPro, and IEPro, and for the classification of demonstratives in the classes IDem, DDDem, and VagDem. The κ-values are around .8, indicating that annotators were able to classify the pronouns reliably.</Paragraph>
      <Paragraph position="1"> Co-Indexation of Abstract Object Anaphora. The abstract object anaphora were manually co-indexed with their antecedents. For this task we cannot provide reliability scores using κ because it is not a classification task. It is much more difficult than the previous ones, as the problem consists of identifying the correct beginning and end of the string which co-specifies with the anaphor. We used only the abstract anaphors whose classification both annotators agreed upon. The annotators then marked the antecedents and co-indexed them with the anaphors. The results were compared and the annotators agreed upon a reconciled version of the data. Annotator accuracy was then measured against the reconciled version. Accuracy ranged from 85.7% (Annotator A) to 94.3% (Annotator B).</Paragraph>
      <Paragraph position="2"> Footnote 2: No. for each class is the actual number marked by both annotators. N is the total number of markables, Z is the total number of agreements between annotators, PE is the expected agreement by chance.</Paragraph>
    </Section>
    <Section position="3" start_page="4784" end_page="4784" type="sub_section">
      <SectionTitle>
Deictic Anaphora against Key
5.2 Evaluation of the Algorithm
</SectionTitle>
      <Paragraph position="0"> We used the reconciled version of the annotation as the key for the abstract anaphora resolution algorithm. Table 6 shows the results of the evaluation. Precision is 63.6% and Recall 70%.</Paragraph>
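      <Paragraph> For readers who want to relate the two figures, the sketch below uses the usual definitions: precision as correctly resolved anaphors over all anaphors the algorithm resolved, and recall as correctly resolved anaphors over all anaphors in the key. The counts in the example are hypothetical; they were chosen only because they reproduce the reported percentages and are not taken from Table 6, whose cells are not preserved in this text.

```python
def precision_recall(correct, resolved, in_key):
    """Return (precision, recall) as fractions.

    correct  -- anaphors resolved to the antecedent given in the key
    resolved -- all anaphors for which the algorithm proposed an antecedent
    in_key   -- all anaphors with an antecedent in the key
    """
    return correct / resolved, correct / in_key


# Hypothetical counts, picked to match the reported 63.6% and 70%.
precision, recall = precision_recall(correct=49, resolved=77, in_key=70)
print(f"precision {precision:.1%}, recall {recall:.1%}")
```

      With these illustrative counts, 77 - 49 = 28 anaphors would be resolved incorrectly.</Paragraph>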
      <Paragraph position="1"> Res. Corr.</Paragraph>
    </Section>
    <Section position="4" start_page="4784" end_page="4784" type="sub_section">
      <SectionTitle>
Algorithm
</SectionTitle>
      <Paragraph position="0"> The low value for precision indicates that the classification did not perform very well. Of the 28 anaphors resolved incorrectly, only 11 were classified correctly.</Paragraph>
      <Paragraph position="1"> One of the most common errors in classification was that an anaphor annotated as vague (VagPro, VagDem) was classified by the algorithm as discourse deictic (DDPro, DDDem). Classification is dependent on resolution, so since the context almost always provides an antecedent for a discourse deictic anaphor, it is possible to classify and resolve a vague anaphor incorrectly, as in Example 13: (13) A: [I don't know]_i, I think it_i really depends a lot on the child.</Paragraph>
      <Paragraph position="2"> (sw3117)</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="4784" end_page="4784" type="metho">
    <SectionTitle>
6 Comparison to Related Work
</SectionTitle>
    <Paragraph position="0"> Both Webber (1991) and Asher (1993) describe the phenomenon of abstract object anaphora and present restrictions on the set of potential antecedents. They do not, however, concern themselves with the problem of how to classify a certain pronoun or demonstrative as individual or abstract. Also, as they do not give preferences on the set of potential candidates, their approaches are not intended as attempts to resolve abstract object anaphora.</Paragraph>
    <Paragraph position="1"> To our knowledge, little research has been carried out on anaphora resolution in dialogues. LuperFoy (1992) does not present a corpus study, meaning that statistics about the distribution of individual and abstract object anaphora or about the success rate of her approach are not available.</Paragraph>
    <Paragraph position="2"> Byron and Stent (1998) present extensions of the centering model (Grosz et al., 1995) for spoken dialogue and identify several problems with the model. We have chosen Strube's (1998) model for the resolution of individual anaphora as the basis because it avoids the problems encountered by Byron &amp; Stent, who also do not present data on the resolution of pronouns in dialogues and do not mention abstract object anaphora.</Paragraph>
    <Paragraph position="3"> Dagan and Itai (1991) describe a corpus-based approach to the resolution of pronouns, which is evaluated for the neuter pronoun "it". Again, abstract object anaphora are not mentioned.</Paragraph>
  </Section>
</Paper>