<?xml version="1.0" standalone="yes"?> <Paper uid="W99-0208"> <Title>Coreference resolution in dialogues in English and Portuguese</Title> <Section position="2" start_page="53" end_page="55" type="metho"> <SectionTitle> 2 The annotation of coreference cases </SectionTitle> <Paragraph position="0"> The operational routine of data collection was simply to search manually for tokens which coreferred in dialogue samples. Samples consisted of full dialogues for Portuguese but not for English, as a result of the sampling technique used in the English corpus (the London-Lund). Whenever a case of coreference was found, it was classified according to four attributes, namely: type of anaphor; type of antecedent; topical role of the antecedent; and processing strategy.</Paragraph> <Paragraph position="1"> The first attribute refers to the word or phrase which triggers the anaphoric link, that is, the visible item that requires the retrieval of another element in the text for semantic interpretation. Concepts such as zero pronouns or empty categories are not used in the classification.</Paragraph> <Paragraph position="2"> Thus, the anaphor is invariably a phonetically realised item, and a verb without a phonetically realised subject is classified as an anaphoric verb. Although such verbal forms are rare in English, they are fairly common in spoken Portuguese. The same approach is used for transitive verbs without a phonetically realised object, which are also frequent.</Paragraph> <Paragraph position="3"> The type of antecedent concerns primarily the implicit/explicit dichotomy. Typically anaphoric words, such as it and that, may occur in nonreferential uses - for instance, the 'prop' it (Quirk et al. 1985). Thus, a third category, nonreferential, was used to classify these cases. Although these are not cases of coreference stricto sensu, it was thought important to include them, so that they could be identified when it came to implementation. 
Some tokens of pronouns with a vague antecedent identifiable by means of inference based on discourse information were classified as discourse implicit antecedents.</Paragraph> <Paragraph position="4"> The attribute named topical role of the antecedent classifies the antecedent of a given coreference case according to categories which assign a saliency status to discourse entities (typically noun phrases) in a dialogue. These categories include a discourse topic for the dialogue; a segment topic for every stretch of dialogue in which the topic is considered to be the same, according to specific procedures; a subsegment topic, if further division within a segment is needed for the appropriate modelling of topicality; and both global and local thematic elements, which are salient discourse entities related to the topics mentioned above. As antecedents may also be discourse chunks of varying length, these same categories were used to classify such antecedents as predicates of a given topical role thought to be the dominant entity within the discourse chunk.</Paragraph> <Paragraph position="5"> The aim of this attribute is to use the often-mentioned relationship between topicality and coreference (see Grosz and Sidner 1986) for operational purposes. This classification does not claim to be the actual key for the modelling of topicality in dialogues from a psycholinguistic point of view. It does claim, however, to be a useful tool for the resolution of particularly hard cases of coreference, in which the antecedent is not the nearest syntactically appropriate candidate, as will be shown in section 3. The topical roles are assigned on the basis of frequency, distribution and order of appearance. This information is used in conjunction with an adaptation for dialogues of Hoey's method (Hoey 1991) to establish patterns of lexis. 
Procedures were thus defined for the assignment of the topical roles mentioned above to the various discourse entities in a dialogue.</Paragraph> <Paragraph position="6"> The fourth attribute is the processing strategy, which is an attempt to classify the resolution path according to informational demands seen as the most essential for the processing at hand.</Paragraph> <Paragraph position="7"> The processing strategy was included in the annotation scheme as a way of enriching the classification model, uncovering distinctions which might remain unnoticed if only the type of anaphor were to be specified. The plain assignment of a type of anaphor based on word classes would ignore distinctions in the processing required for the resolution of anaphors of the same type. On the other hand, subsuming processing information in the classification used for the type of anaphor would disrupt the intended link of the latter to phonetically realised forms in a strict way.</Paragraph> <Paragraph position="8"> The annotation is entered between brackets in the order previously presented, beginning with the type of anaphor and ending with the processing strategy. The code for each one of the properties is delimited by semicolons. An example of annotated text is shown below.</Paragraph> <Paragraph position="9"> (1) B: well I think probably what Captain Kay (FNP; ex_222; dthel; LR;) must have said was a will is legal if it's (SP; ex_224; dthel; FtC;) witnessed on the back of an envelope The first token of coreference is the anaphoric nonpronominal noun phrase Captain Kay, which has been previously introduced in the dialogue. The type of anaphor is classified as FNP, for full noun phrase; the next slot defines the type of antecedent as explicit (ex_) and assigns a number for the referent according to order of appearance in the dialogue (222). The topical role of the antecedent is considered to be that of a discourse thematic element. 
This means that Captain Kay is a fairly frequent discourse entity not only in a specific stretch of discourse, but throughout the dialogue, and is therefore closely associated with the discourse topic. As the reference to Captain Kay is identified by means of verbatim repetition of the noun form under which it appeared for the first time in the dialogue, the processing strategy is defined as lexical repetition (LR).</Paragraph> <Paragraph position="10"> The subsequent anaphoric it refers to the first syntactically appropriate candidate looking backwards. Having Hobbs' (1986) naive algorithm as a reference, a primary first-candidate processing strategy was established under the code FtC. An extension of this primary strategy is the first-candidate chain (FtCCh), for cases in which Hobbs' naive algorithm finds another anaphor as antecedent. This sort of chain is crucially important in dialogues, as demonstrated by Biber (1992). An example is given below 1.</Paragraph> <Paragraph position="11"> (2) B: and I went down this morning to talk to the American Embassy on the off chance that the State Department might be you know able to finance a bit of travelling in the States and they can't they've (SP; ex_13; st; FtCCh;) got priority on vice-chancellors and uh English schoolteachers The second token of they refers to the first one, which ultimately links both anaphors to the referent State Department. The two first-candidate processing strategies, together with resolutions relying on syntactic parallelism, were grouped under the umbrella category named syntactic processes.</Paragraph> <Paragraph position="12"> As the analysis of anaphora cases found in the corpus proceeded, a number of other categories for the classification of the processing strategy came up. 
These included, for instance, collocational knowledge (CK), for cases in which the basic information required for processing was thought to derive from the use of anaphors within crystallised phrases, such as that is to say. Example (3) is one of those cases.</Paragraph> <Paragraph position="13"> (3) B: the bibliography has gone about as far as I can take it on my own that (De; ex_10; p_st; CK;) is to say er in order to complete it I will have to visit the major resources in the United States and uh several in Europe 1 Annotation for other cases of coreference is omitted.</Paragraph> <Paragraph position="14"> By collecting these phrases in association with each type of anaphor, a collocation list of anaphoric terms was built for each one of the types, with a resolution procedure attached, which was designed on the basis of corpus data observation. This list was subsequently used as an ancillary routine in the antecedent-likelihood (AL) theory, as will be shown later.</Paragraph> <Paragraph position="15"> Several forms of lexical knowledge, assigned to cases in which the antecedents were identified chiefly by means of semantic information contained in the anaphor, were also identified, such as part-whole relationships. In example (4), monies refers to finances by means of information conveyed by the lexical semantics contained in the lexical item itself, but not by means of plain repetition. Thus, the classification used is lexical signalling (LS), one of the categories within the umbrella category lexical knowledge, along with lexical repetition.</Paragraph> <Paragraph position="16"> (4) and uh - you know my own personal finances are well sure it's just out but you have applied er for monies (FNP; im_12; st; LS;) I keep hearing wherever I go Finally, a category named discourse knowledge was used to classify cases in which the resolution required full processing of combined bits of discourse information. 
These four broader categories, including the essentially syntactic information required for the first-candidate strategies, grouped more fine-grained subclassifications in all cases, except for collocational knowledge. Thus, the umbrella categories were used to perform a statistical analysis using the data collected by means of manual annotation.</Paragraph> <Paragraph position="17"> However, the more detailed classification was retained in the actual annotation of the sample. The same approach was used in the other attributes.</Paragraph> <Paragraph position="18"> Frequencies for each category were then used in three different statistical procedures: a chi-square test; a measure of association; and the model-building variety of loglinear analysis. Chi-square tests with the attributes considered two by two showed statistical significance in all measurements (p &lt; 0.00005) in both languages. The Goodman and Kruskal tau was used to measure association between attributes two by two. Association was shown to be high (over 0.30) between the processing strategy and the other three attributes, but relatively low (under 0.30) between these three attributes measured two by two. The loglinear analysis revealed that interactions considering three of the attributes were significant whenever the processing strategy was one of the three. The opposite was true when it was not.</Paragraph> <Paragraph position="19"> These results were true for both languages with minor variations.</Paragraph> <Paragraph position="20"> The statistical analysis thus showed that the classification model was adequate to represent the anaphora world. Moreover, it became clear that the attribute named processing strategy yielded the highest information gain, acting as a link between the type of anaphor and the other two attributes which classify the antecedent. 
Therefore, the type of anaphor in itself, which could be mapped from POS tags or, in some cases, skeleton parsing (see Mitkov 1997), only became truly useful information for the resolution of the anaphoric reference when associated with the definition of a processing strategy. This, of course, made psycholinguistic sense, as it is not difficult to infer from corpus data that the same anaphor (such as it or that) may appear in contexts that lead to distinct processing demands for its resolution.</Paragraph> </Section> <Section position="3" start_page="55" end_page="58" type="metho"> <SectionTitle> 3 The antecedent-likelihood theory </SectionTitle> <Paragraph position="0"> The AL theory is made up of a series of entries for each type of anaphor. Entries contain instructions organised in an algorithm-like form to check the applicability of all possible processing strategies, relying on information taken from the training set. The initial information considered is the probability of occurrence for each processing strategy and the two other attributes. As a result, some categories included in the general classification model are never checked because there are no tokens in the training set associating them with the type of anaphor in question. The subject pronoun entry is shown below.</Paragraph> <Paragraph position="1"> Subject pronoun: global probability = 0.247. Category probabilities (columns): process. strat.; type antec.; topical role</Paragraph> <Paragraph position="3"> The table with the category probabilities 2 defines the likelihood of categories in the three other variables being assigned to tokens of the anaphor type described in the entry, having the total number of tokens for the type of anaphor - not the full sample - as a reference. 
The first column specifies the probabilities for the categories which define the processing strategy, while the second column shows the figures for the type of antecedent, and the third column lists the topical roles of antecedents with the respective numbers. In order to make the table visually compact, most of the categories are listed using the code specified for the annotation of the sample.</Paragraph> <Paragraph position="4"> 2 Categories cannot be fully described in this paper for reasons of space. The essential features have been presented, though.</Paragraph> <Paragraph position="5"> Some processing information can be directly derived from the table of category probabilities. Categories which are not listed in the columns of the variables they belong to were not used to classify any tokens of the anaphor type, and thus can be left out of the processing. This may mean, for instance, that the processing need not be concerned with implicit antecedents for a given type of anaphor, because there are no tokens classified as such. Another possibility is that no tokens have been classified as being processed on the basis of collocational knowledge, and thus there is no point in checking the collocation list in search of matches.</Paragraph> <Paragraph position="6"> The header in AL theory entries is followed by a set of instructions organised in algorithm-like form. These instructions rely on the taxonomy employed to analyse processing strategies. The choice of this variable as the organising principle is based both on the results of the loglinear analysis and on the nature of the variable, which is in fact a description of the way a given anaphor token is resolved. The typical instruction appears as check ps, ps being any category included in the list of possible classifications of processing strategy for the type of anaphor. 
This means that the processing towards resolution of an anaphor of the type described in the entry should check, at this point, whether the processing strategy specified is a possible way to identify the correct antecedent. The typical check ps instruction is usually followed by a set of attached probabilities specific to the processing strategy being checked. These probabilities concern categories in the remaining two variables. Other information, such as the probability of predicate topical roles, may be added whenever this is felt to be useful. The subsequent items in a typical check ps instruction are recognition and resolution path. The first item contains information about features of the token itself and the immediate context in which it occurs, based on the observation of corpus data. The purpose is to guide the processing in the attempt to recognise the need for a certain type of processing strategy in order to resolve the anaphoric reference. The second item contains information related to the actual identification of the correct antecedent.</Paragraph> <Paragraph position="7"> The amount and complexity of information included in each one of the items varies with the type of anaphor and the processing strategy. In some cases, the recognition requires careful analysis, involving a number of details and checks. In other cases, recognising that a certain processing strategy is the appropriate one is not as difficult as identifying the antecedent, as in some cases of discourse-implicit antecedents. The AL theory is built so as to permit the expansion or reduction of guidelines included as instructions or items within instructions.</Paragraph> <Paragraph position="8"> If a given processing strategy presents sufficient diversity of recognition and/or resolution patterns, the instructions may be divided into subtypes of recognition and resolution. 
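The layout of an entry described so far can be pictured as a small data structure. The following Python sketch is an illustration only and is not part of the AL theory as formulated: the class and field names are hypothetical, and the category probabilities are placeholder values, since the original table is not reproduced here; only the global probability of 0.247 and the strategy codes FtC, FtCCh, CK and LS come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CheckInstruction:
    """One 'check ps' instruction; all field names are hypothetical."""
    strategy: str  # processing-strategy code, e.g. "FtC"
    attached_probs: Dict[str, float] = field(default_factory=dict)
    recognition: List[str] = field(default_factory=list)      # cues for recognising the strategy
    resolution_path: List[str] = field(default_factory=list)  # steps for finding the antecedent


@dataclass
class Entry:
    """Entry for one type of anaphor."""
    anaphor_type: str
    global_probability: float
    strategy_probs: Dict[str, float]  # only strategies attested in the training set
    instructions: List[CheckInstruction] = field(default_factory=list)

    def needs_check(self, strategy: str) -> bool:
        # Categories absent from the table are left out of the processing.
        return strategy in self.strategy_probs


# Placeholder probabilities, NOT the figures of the original table.
subject_pronoun = Entry(
    anaphor_type="subject pronoun",
    global_probability=0.247,
    strategy_probs={"FtC": 0.5, "FtCCh": 0.3, "CK": 0.2},
    instructions=[
        CheckInstruction(
            strategy="FtC",
            recognition=["first candidate is syntactically appropriate"],
        )
    ],
)

print(subject_pronoun.needs_check("CK"))  # True
print(subject_pronoun.needs_check("LS"))  # False: no tokens in this toy table
```
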
This approach to the form of entries applies generally but not always, that is, there may be check ps instructions which do not include one or more of the items described above. There may also be instructions which specify actions of a unique nature for the type of anaphor or processing strategy under scrutiny. The extract of the subject pronoun entry shown below illustrates this flexibility. The header shown above is followed by two instructions which break with the general check ps norm, only to return to it in the third instruction, as shown below.</Paragraph> <Paragraph position="9"> 1. check if POS tag is Q-tag item: if not, go to instruction 2; if yes, go to tag-question entry in collocation list and follow resolution path in entry. 2. identify pronoun: if pronoun is he, she or they, go to instruction 5; if pronoun is it, go to instruction 4; if pronoun is first or second person, go to instruction 3. The AL theory was manually tested on a previously analysed dialogue used as a test bed. There were 804 cases of anaphora in the testing set for English. The AL theory predicted the correct antecedent in 98.4% of the cases, which is evidently a satisfactory result. Results were also satisfactory, although not quite as good (93.5%), for Portuguese. However, these scores were obtained only on the assumption that the dialogue had been POS-tagged, parsed and segmented according to topicality, using the procedures defined for each category in the attribute named topical role of the antecedent. These are not minor assumptions, particularly if it is taken into account that, in real-life processing situations, these tasks would have to be carried out during an ongoing conversation.</Paragraph> <Paragraph position="10"> Nevertheless, the approach seems worth pursuing as a promising way to solve a difficult problem in the actual implementation of dialogue interfaces and in NLP in general. 
Thus, the attempt to transform the AL theory into an automatic procedure may be a useful way forward.</Paragraph> </Section> <Section position="4" start_page="58" end_page="59" type="metho"> <SectionTitle> 4 The decision trees for coreference resolution </SectionTitle> <Paragraph position="0"> The general procedure for the resolution of any anaphora case is then to check the processing strategy with the highest probability first. If anaphors classified as determinative possessives in the English sample are taken as an example, this strategy would be the one named first-candidate chain, in which the first appropriate candidate - in syntactic terms - searching backwards is selected, although it is itself an anaphor. It may be safely assumed that this anaphor has already been dealt with, as it precedes the one being resolved.</Paragraph> <Paragraph position="1"> Checking a processing strategy for adequacy involves a recognition procedure specified in the entry, which, in the example considered above, would be to check the appropriateness of the first candidate. However, the probabilities indicate that there were cases in the training set in which this type of anaphor was resolved by means of discourse knowledge. This means that there were tokens in which the use of syntactic information only - as in Hobbs' &quot;naive&quot; algorithm - would lead to the identification of an incorrect antecedent.</Paragraph> <Paragraph position="2"> Therefore, ways of checking whether the first appropriate candidate is actually the correct antecedent had to be devised. Two basic routines were used: selectional restrictions and association history. As formalised in Katz and Fodor (1963), selectional restrictions are semantic constraints which the sense of a given word imposes on those syntactically related to it. 
Thus, whenever an anaphor is linked to a verb as a complement, it is useful to check if a candidate antecedent is a good fit by using selectional restrictions.</Paragraph> <Paragraph position="3"> There were cases in the training set, however, in which selectional restrictions would not detect the incorrectness of a syntactically appropriate candidate. A second kind of lexical clue was then included as a checking routine: the association history. It is unusual - although not impossible, of course - that pronoun reference is used in the first instance of an association between a verb and a referent. This is even less likely in situations in which there is an established competitor with a record of tokens repeatedly associated with the verb in question.</Paragraph> <Paragraph position="4"> These checking routines may signal that it is advisable to consider bypassing the first candidate on the basis of discourse information. Checking the possibility of a resolution by means of discourse knowledge usually involves a recognition procedure, which relies on topicality information. If the alternative candidate selected is one of the highly salient discourse entities, the chances that the speaker felt the listener would successfully process the reference are much higher, making the bypass of the first candidate far more likely.</Paragraph> <Paragraph position="5"> The entry for determinative possessives is a relatively simple one, however, if compared to those for subject pronouns or anaphoric demonstratives in English or anaphoric verbs in Portuguese. Moreover, entries for other types of anaphor may require various forms of checking routines, which are specific to the type of anaphor in question. In spite of this highly complex and broad set of required information, it seems possible to organise it into decision trees for operational use. 
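Before any decision-tree learner is chosen, the general procedure of this section can be pictured as a loop that tries processing strategies in descending order of probability. The Python sketch below is a simplified illustration under stated assumptions and not the implementation being tested: the boolean fields of Candidate merely stand in for the selectional-restriction, association-history and topicality checks, the code DK for discourse knowledge is hypothetical, and the probabilities and candidates are invented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Candidate:
    form: str
    fits_verb: bool    # stand-in for the selectional-restriction check
    has_history: bool  # stand-in for the association-history check
    salient: bool      # stand-in for a highly salient topical role


Strategy = Callable[[List[Candidate]], Optional[Candidate]]


def first_candidate(candidates: List[Candidate]) -> Optional[Candidate]:
    # Accept the nearest candidate only if both checking routines pass.
    if candidates and candidates[0].fits_verb and candidates[0].has_history:
        return candidates[0]
    return None


def discourse_knowledge(candidates: List[Candidate]) -> Optional[Candidate]:
    # Bypass the first candidate in favour of a highly salient entity.
    for cand in candidates[1:]:
        if cand.salient and cand.fits_verb:
            return cand
    return None


def resolve(
    candidates: List[Candidate],
    probs: Dict[str, float],
    strategies: Dict[str, Strategy],
) -> Optional[Tuple[str, Candidate]]:
    # Try the strategies in descending order of training-set probability.
    for code in sorted(probs, key=probs.get, reverse=True):
        antecedent = strategies[code](candidates)
        if antecedent is not None:
            return code, antecedent
    return None


# Candidates listed nearest first; all values are invented for illustration.
candidates = [
    Candidate("nearest NP", fits_verb=False, has_history=False, salient=False),
    Candidate("segment topic NP", fits_verb=True, has_history=True, salient=True),
]
probs = {"FtC": 0.7, "DK": 0.3}
strategies = {"FtC": first_candidate, "DK": discourse_knowledge}

code, antecedent = resolve(candidates, probs, strategies)
print(code, antecedent.form)  # DK segment topic NP
```

In this toy run the first-candidate strategy is tried first because it is the most probable, fails the stand-in selectional check, and the resolution falls through to the salient alternative.
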
The notion of decision tree (as in Quinlan 1993) may have to be somewhat expanded in order to accommodate the various bits of specific information related to each type of anaphor.</Paragraph> <Paragraph position="6"> At present, several different algorithms and adaptations of these algorithms are being tested in order to establish their adequacy to the task, including the well-known C4.5. A hybrid approach, in which an example-based alternative process would choose the most closely related case in the training set and use it to resolve a new case of anaphora, is also being considered, having the TiMBL package (Tilbury 1999) as a primary reference. It is expected that initial tests will be run soon, yielding results which will then be used to gradually improve the approach and its implementation. The GATE structure (Cunningham et al. 1995) is likely to be used as a way to organise the various required elements of linguistic information as an integrated system.</Paragraph> <Paragraph position="7"> At the present stage, however, the software packages mentioned are cited as references rather than firm choices.</Paragraph> </Section> </Paper>