<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2172">
  <Title>Reference Resolution beyond Coreference: a Conceptual Frame and its Application Andrei POPESCU-BELIS, Isabelle ROBBA and Gérard SABAH Language and Cognition Group, LIMSI-CNRS</Title>
  <Section position="2" start_page="0" end_page="1048" type="metho">
    <SectionTitle>
1 A general framework for reference use and resolution
</SectionTitle>
    <Section position="1" start_page="0" end_page="1046" type="sub_section">
      <SectionTitle>
1.1 Overview of the model
</SectionTitle>
      <Paragraph position="0"> The communication situation is deliberately conceived here from a representationist point of view: the speaker (s) and the hearer (h) share the same world (W), considered as a set of objects with various characteristics or properties (Figure 1). Objects can be material or conceptual, or even belong to fictitious constructions. Each individual's perception of the world is different: ph(W) ≠ ps(W). Perception (p), together with inferences (i) drawn from perceptions using previous knowledge and beliefs, provides each individual with a representation of the world, that is, RWs and RWh, where RWx = ix(px(W)) = ipx(W). For computational reasons, it is useful to consider that only part of the world W plays a role in the communication act; this is called the topic T, and its representations are RTh and RTs.</Paragraph>
      <Paragraph position="1"> The speaker produces a discourse message (DM) and a gesture message (GM). Both DM and GM contain referring expressions (RE), that is, chunks of discourse or gestures which are mapped to particular objects of RW. RWh and RWs each include a list of represented objects with their properties, called mental representations (MR).</Paragraph>
      <Paragraph position="3"> Understanding a message cannot be defined solely with respect to W, as there is no direct access to it. Instead, each individual builds a representation of the others' RW, using its own perceptions and inferences (ip). The speaker has his own RWs and also</Paragraph>
      <Paragraph position="5"> specularity, is potentially infinite, as one may conceive RWh(s(h)), RWh(s(h(s))), etc. (it could be tentatively asserted that when all the RW of all individuals become identical for a given assertion, the assertion becomes &quot;common knowledge&quot;).</Paragraph>
      <Paragraph position="6"> A message has been understood if, for the current topic, RTh(s) = RTs, i.e., if the hearer's representation of the speaker's view of the world is accurate. This definition of course simplifies reality to make it fit into a computational model. For instance, from a rhetorical point of view, a communication succeeds if RTh changes according to the sender's will.</Paragraph>
      <Paragraph position="7"> Evolution in time isn't represented yet, so we do not index the various representations along the time axis.</Paragraph>
      <Paragraph position="8"> In order to understand a message, the hearer has to find out which objects the referring expressions refer to - REs from the discourse, as well as deictic (pointing) ones. The hearer is able to use his own perception of W, namely RWh, and his knowledge, to build mental representations of objects from the referring expressions.</Paragraph>
    </Section>
    <Section position="2" start_page="1046" end_page="1047" type="sub_section">
      <SectionTitle>
1.2 Human-computer dialog vs. story understanding by a computer
</SectionTitle>
      <Paragraph position="0"> We focus here on the problem of reference understanding by a computer program (c). Such a program has to build and manage, in theory, a RWc and a RWc(s), using information about the world, the message itself, and possibly a deictic set.</Paragraph>
      <Paragraph position="1"> For a window manager application accepting natural language commands, the displayed graphic objects constitute the topic (T), i.e., the part of the world more specifically dealt with. The program's perception of T is totally accurate (pc(T)= T); pc(T) is the most important and reliable source of information.</Paragraph>
      <Paragraph position="2"> Mouse pointing also provides direct deictic information. The difference between RWc and RWc(s) may account for the difference between the complete description of the displayed objects and their visible features.</Paragraph>
      <Paragraph position="3"> For a story understanding program, the direct perception of the shared world W is strongly reduced, especially for fiction stories. Human readers in this case derive their knowledge only from the processed text. But knowledge about basic properties of W and about language conventions still has to be shared, otherwise no communication would be possible. For story processing, both pc(W) and the gesture message are extremely limited, so the program has to rely only on discourse information, thus building first RWc(s) and only afterwards RWc, using supplementary knowledge about W. The gap between RWc(s) and RWc is due to the speaker's misuse of referring expressions, or to internal contradictions of the story. The system described below follows this second approach.</Paragraph>
      <Paragraph position="4"> 1.3 Data structures and operations
For minimal reference resolution, a program has to select the referring expressions (RE) of the received message and use them to build a list of mental representations of objects (MR). Each MR is a data structure having several attributes, depending on the program's capacities. Here is a basic set:
* MR.identificator: a number;
* MR.list-of-REs: the REs referring to the object;
* MR.semantic-information.text: a conceptual structure gathering the properties of the object, from the REs and from the sentences in which they appear;
* MR.semantic-information.dictionary: a conceptual structure gathering the properties of the object from the conceptual dictionary (concept lattice) of the system.</Paragraph>
      <Paragraph position="5"> These properties reflect a priori knowledge about the conceptual categories the MR belongs to;
* MR.relations: the relationship of the MR to other MRs, for instance part-of or composed-of (these allow processing of plural MRs);
* MR.computer-object: a pointer to the object in case it belongs to a computer application (e.g., a window in a command dialog);
* MR.perceptual-information: an equivalent of the previous attribute, in case the program handles perceptual representations of objects.</Paragraph>
      <Paragraph position="6"> In turn, the computational representation of a referring expression (RE) should have at least the following attributes:
* RE.identificator: a number;
* RE.position: uniquely identifies the RE's position in the text: number, paragraph, sentence, beginning and ending words;
* RE.syntactic-information: a parse tree of the RE, the RE's function, or, if available, a parse tree of the whole sentence where the</Paragraph>
      <Paragraph position="8"> structure for the RE, or, if available, for the whole sentence.</Paragraph>
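      <Paragraph> The two attribute lists above can be read as record types. The following is only a minimal sketch, not the system's actual data model: the Python field names transliterate the attribute names, the tuple layout of the position is a guess, and the fields text and conceptual_structure are names assumed here for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class RE:
    """A referring expression with the attributes listed in the text."""
    identificator: int
    # Position in the text; this tuple layout is an assumption:
    # (paragraph, sentence, first word, last word).
    position: tuple
    text: str = ""                      # surface string, added for illustration
    syntactic_information: Any = None   # parse tree or function of the RE
    conceptual_structure: Any = None    # assumed name for the semantic attribute


@dataclass
class MR:
    """A mental representation of one object."""
    identificator: int
    list_of_res: list = field(default_factory=list)
    semantic_information_text: dict = field(default_factory=dict)
    semantic_information_dictionary: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)   # e.g. ("part-of", other id)
    computer_object: Optional[Any] = None
    perceptual_information: Optional[Any] = None


re1 = RE(identificator=1, position=(1, 1, 1, 2), text="Miss X")
mr1 = MR(identificator=1, list_of_res=[re1])
```

Mutable attributes use default_factory so that each MR gets its own list of REs rather than sharing one list across instances.</Paragraph>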
      <Paragraph position="9"> Finally, there are operations on the MR set:
* creation: REi → MRnew: a new MR is created when an object is first referred to;
* attachment: REi + MRa → MRa: when a RE refers to an already represented object, the RE is attached to the MR and the MR's structure is updated;
* fusion: MRa + MRb → MRnew: at a given point, it may appear that two MRs were built for the same object, so they have to be merged. The symmetrical operation, i.e., splitting an MR which confusingly represents two objects, is far more difficult to do, as it has to reverse a lot of decisions;</Paragraph>
      <Paragraph position="11"> The last two operations (partition/grouping) are symmetrical, and prove necessary in order to deal with collections of objects (plurals). For instance, from a collective RE such as &quot;the team&quot; (and its MR) the program has to use built-in knowledge to create several MRs corresponding to the players, and correctly resolve the new RE &quot;the first player&quot;. Conversely, after construction of two MRs for &quot;Miss X&quot; and &quot;Mrs. Y&quot;, an RE such as &quot;the two women&quot; has to be attached to the MR which was built by grouping the previous MRs. In both cases, the MR.relations attribute has to be correctly filled in with the type of relation between MRs.</Paragraph>
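      <Paragraph> As an illustration of these operations, here is a minimal sketch that replays the "Miss X" / "Mrs. Y" example. It makes several assumptions: REs are reduced to plain strings, the identifier counter and all function names are hypothetical, and only grouping is shown (partition would be the reverse operation).

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # hypothetical identifier generator


@dataclass
class MR:
    identificator: int
    list_of_res: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (relation type, MR id)


def create(re_text):
    """creation: REi -> MRnew."""
    return MR(next(_ids), [re_text])


def attach(re_text, mr):
    """attachment: REi + MRa -> MRa (the MR is updated in place)."""
    mr.list_of_res.append(re_text)
    return mr


def fuse(mr_a, mr_b):
    """fusion: MRa + MRb -> MRnew, keeping the REs and relations of both."""
    return MR(next(_ids),
              mr_a.list_of_res + mr_b.list_of_res,
              mr_a.relations + mr_b.relations)


def group(members, re_text):
    """grouping: build a collective MR and type the links in both directions."""
    collective = create(re_text)
    for member in members:
        collective.relations.append(("composed-of", member.identificator))
        member.relations.append(("part-of", collective.identificator))
    return collective


miss_x = create("Miss X")
mrs_y = create("Mrs. Y")
women = group([miss_x, mrs_y], "the two women")
attach("they", women)
```

After these calls, the collective MR carries two composed-of links and each woman's MR carries one part-of link, which is the typed information that the relations attribute is meant to hold.</Paragraph>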
      <Paragraph position="12"> If enough data is available, the system should build a conceptual structure for the MR (e.g., conceptual graphs), which should incrementally gather information from all referring expressions attached to the same MR. A lower-knowledge technique is to record for each MR a list of &quot;characteristic REs&quot; without any conceptual structures, and apply selectional constraints on it.</Paragraph>
    </Section>
    <Section position="3" start_page="1047" end_page="1048" type="sub_section">
      <SectionTitle>
1.4 Selection heuristics
</SectionTitle>
      <Paragraph position="0"> During the resolution process, each RE either triggers the creation of a new MR or is attached to an existing MR. The purpose of the selection heuristics is to decide whether the RE may be associated with a given MR, after examining the compatibility between the RE and the other REs in the MR.list-of-REs. One of the simplest heuristics is:
* (H1) [MRa can be the referent of REi] iff [RE1 being the first element of MRa.list-of-REs, REi and RE1 can be coreferent]
This presupposes that the first RE referring to an object is typical, which isn't always true.</Paragraph>
      <Paragraph position="1"> To take advantage of the MR paradigm, it may seem wiser to compare the current RE to all the REs in the MR.list-of-REs. This list also includes pronominal REs, which are actually meaningless for the compatibility test. Despite Ariel's (1990) claim that there is no clear-cut referential difference between pronouns and nominals, we will exclude pronouns in the implementation of our model. So, a second heuristic is:
* (H2) [MRa can be the referent of REi] iff [for all (non-pronominal) REj in MRa.list-of-REs, REi and REj can be coreferent]
This heuristic is in fact not very effective: first, it allows for little variation in the naming of a referent. Second, it neglects an important distinction in RE use, between identification and information (as described, for instance, by Appelt and Kronfeld (1987)). The sender may use a particular RE not only to identify the MR, but also to bring supplementary knowledge about it; thus, two REs conveying different pieces of knowledge may well be incompatible in the system's view. A more tolerant heuristic is thus:
* (H3) [MRa can be the referent of REi] iff [there exists a (non-pronominal) REj in MRa.list-of-REs so that REi and REj can be coreferent]
A more general heuristic subsumes both H2 ('all') and H3 ('one'):
* (H4) [MRa can be the referent of REi] iff [REi and REj can be coreferent for more than X% of the REj in MRa.list-of-REs]
When X varies from 0 to 100, this selection heuristic varies from H3 to H2, providing intermediate heuristics that can be tested (Section 4). H3 seems in fact close to the coreference paradigm, as it privileges links between individual REs, from which the MRs could even be built a posteriori, using the coreference chains. But here MRs are also characterized by an intrinsic activation factor, evolving along the text, which cannot be managed in the coreference paradigm.</Paragraph>
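      <Paragraph> The heuristics above can be sketched as a single thresholded test. This is a hedged illustration rather than the implemented system: the dictionary layout of an RE, the toy same_head test and the function names are assumptions, pronouns are filtered out as in H2 and H3, and X = 100 is treated as "all", since "more than 100%" could never hold.

```python
def can_be_referent(re_i, mr_res, can_corefer, x_percent):
    """H4: REi is compatible with more than X% of the MR's non-pronominal REs."""
    candidates = [re_j for re_j in mr_res if not re_j["pronoun"]]
    if not candidates:
        return False  # assumption: an MR holding only pronouns is never selected
    hits = sum(1 for re_j in candidates if can_corefer(re_i, re_j))
    if x_percent >= 100:
        return hits == len(candidates)               # H2: all REs must match
    return hits * 100 > x_percent * len(candidates)  # X = 0 gives H3: one match


def same_head(re_a, re_b):
    """Toy compatibility test (hypothetical): identical head nouns."""
    return re_a["head"] == re_b["head"]


mr = [
    {"head": "duke", "pronoun": False},
    {"head": "he", "pronoun": True},
    {"head": "prince", "pronoun": False},
]
new_re = {"head": "duke", "pronoun": False}

h3 = can_be_referent(new_re, mr, same_head, 0)     # one match suffices
h2 = can_be_referent(new_re, mr, same_head, 100)   # every RE must match
```

With one matching RE out of two non-pronominal ones, the MR is accepted under H3 and rejected under H2; any X below 50 accepts it as well.</Paragraph>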
    </Section>
    <Section position="4" start_page="1048" end_page="1048" type="sub_section">
      <SectionTitle>
1.5 Activation
</SectionTitle>
      <Paragraph position="0"> The activation of an MR is computed according to salience factors (this technique is described for instance by Lappin and Leass (1994)). Our salience factors are: de-activation in time, re-activation by various types of RE, re-activation according to the function of the RE. Among the MRs which pass the selection, activation is used to decide whether the current RE is added to an MR (the most active) or if a new MR is created. Activation is thus a dynamic factor, which changes for each MR according to the position in the text and the previous reference resolution decisions.</Paragraph>
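      <Paragraph> A minimal sketch of such an activation scheme follows. The three salience factors come from the text, but every numeric weight, the multiplicative decay and all names are assumptions made for illustration only.

```python
DECAY = 0.5   # hypothetical de-activation factor applied per sentence
BOOST = {"proper_name": 3.0, "noun_phrase": 2.0, "pronoun": 1.0}  # by RE type
FUNCTION_BONUS = {"subject": 1.0, "object": 0.5}                  # by RE function


def decay(activations):
    """De-activation in time: applied once per sentence boundary."""
    return {mr: value * DECAY for mr, value in activations.items()}


def reactivate(activations, mr, re_type, function):
    """Re-activation by the type of the RE and by its syntactic function."""
    gain = BOOST[re_type] + FUNCTION_BONUS.get(function, 0.0)
    activations[mr] = activations.get(mr, 0.0) + gain


def resolve(activations, compatible_mrs):
    """Return the most active compatible MR, or None to signal a new MR."""
    if not compatible_mrs:
        return None
    return max(compatible_mrs, key=lambda mr: activations.get(mr, 0.0))


act = {}
reactivate(act, "MR1", "proper_name", "subject")   # MR1: 4.0
reactivate(act, "MR2", "noun_phrase", "object")    # MR2: 2.5
act = decay(act)                                   # MR1: 2.0, MR2: 1.25
reactivate(act, "MR2", "noun_phrase", "subject")   # MR2: 1.25 + 3.0 = 4.25
chosen = resolve(act, ["MR1", "MR2"])
```

Here the more recently mentioned MR2 overtakes MR1, so the next compatible RE would be attached to MR2; an empty candidate list yields None, which stands for the creation of a new MR.</Paragraph>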
    </Section>
  </Section>
  <Section position="3" start_page="1048" end_page="1050" type="metho">
    <SectionTitle>
2 Comparison with other works
</SectionTitle>
    <Paragraph position="0"> Theoretical studies of discourse processing have long advocated the use of various representations for discourse referents. Implementations of running systems, however, have focused rather on anaphora or coreference.</Paragraph>
    <Paragraph position="1"> Our purpose here is to show how a simplified computational model of discourse reference can be implemented and give significant results for reference resolution; we showed previously (Popescu-Belis and Robba 1997) that it was also relevant for pronoun resolution.</Paragraph>
    <Section position="1" start_page="1048" end_page="1048" type="sub_section">
      <SectionTitle>
2.1 High-level knowledge models
</SectionTitle>
      <Paragraph position="0"> The idea of tracking discourse referents using &quot;files&quot; for each of them has already been proposed by Karttunen (1976). Evans (1985) and Recanati (1993) are both close to our proposals; however, neither gives a computational implementation or an evaluation on real texts. Sidner's work (1979) on focus led to salience factors and activations, but proved too demanding for an unrestricted use.</Paragraph>
      <Paragraph position="1"> A more operational system using semantic representation of referents is for instance LaSIE (Gaizauskas et al. 1995), presented at MUC-6, which however relies heavily on task-dependent knowledge. The system doesn't seem to use activation cues. Another system (Luperfoy 1992) uses &quot;discourse pegs&quot; to model referents and was applied successfully to a man-machine dialogue task.</Paragraph>
      <Paragraph position="2"> From a theoretical point of view, the background of the model presented by Appelt and Kronfeld (1987) is close to ours. Being further developed according to speech act theory, it relies however on models of the intentions and beliefs of communicating agents, which seem difficult to implement for discourse understanding.</Paragraph>
    </Section>
    <Section position="2" start_page="1048" end_page="1048" type="sub_section">
      <SectionTitle>
2.2 Robust, lower-level systems
</SectionTitle>
      <Paragraph position="0"> Some of the robust approaches derive from anaphora resolution (e.g., Boguraev and Kennedy (1996)) because the antecedent / anaphoric links are a particular sort of coreference links, which disambiguate pronouns. Most of these systems however remain within the co-reference paradigm, as defined by the MUC-6 coreference task. Numerous low-level techniques have been developed, using generally pattern-matching between potentially coreferent strings (e.g., McCarthy and Lehnert 1995). An interesting solution has been proposed by Lin (1995) using constraint solving to group REs into MRs. While this idea fits the MR paradigm, it doesn't work well incrementally, which makes use of activation impossible.</Paragraph>
    </Section>
    <Section position="3" start_page="1048" end_page="1049" type="sub_section">
      <SectionTitle>
2.3 Advantages of the MR paradigm
</SectionTitle>
      <Paragraph position="0"> Grouping REs into MRs brings a decisive advantage even without conceptual knowledge. First, it suppresses an artificial ambiguity of coreference resolution: if RE1 and RE2 are already known as coreferent, coref(RE1, RE2), there is no conceptual difference between coref(RE3, RE1) and coref(RE3, RE2), so these two possibilities shouldn't be examined separately. Moreover, the system of coreference links makes it very time-consuming to find out whether REi and REj are coreferent, whereas MRs provide reusable storage of all the already acquired information.</Paragraph>
      <Paragraph position="1"> Second, coreference links cannot represent multiple dependencies as needed by some objects which are collections of other objects. Coreference links simply mark identity of the referent for two REs: collections require typed links (part-of /composed-of) between several objects, as shown previously.</Paragraph>
    </Section>
    <Section position="4" start_page="1049" end_page="1049" type="sub_section">
      <SectionTitle>
3 Application of the model
3.1 Reference resolution mechanism
</SectionTitle>
      <Paragraph position="0"> We have particularized and implemented the theoretical model using algorithms in the style of Lappin and Leass (1994). We don't wish to overload this paper with technical details. The REs are resolved one by one, either by attachment to an existing MR, or by creation of a new MR.</Paragraph>
      <Paragraph position="1"> Selection rules are applied to the existing MRs to find out whether the current RE may or may not refer to the object represented by the MR. As our implementation deals with unrestricted texts, only very basic selection rules are used; there are two agreement rules (for gender and number) and a semantic rule (synonyms and hyperonyms are compatible).</Paragraph>
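      <Paragraph> The three selection rules can be sketched as a pairwise compatibility test. This is an assumption-laden illustration: the tiny English lexicon stands in for the few synonyms and hyperonyms actually available, the feature names are hypothetical, and treating an unknown gender or number as non-blocking is a choice made here, not one stated in the text.

```python
# Tiny hand-written lexicon standing in for a semantic network (hypothetical).
SYNONYMS = {frozenset(("lodger", "boarder"))}
HYPERNYMS = {"duke": "nobleman", "nobleman": "man"}   # word: its direct hypernym


def is_hypernym(general, specific):
    """True if `general` is reached by climbing the hierarchy from `specific`."""
    seen = set()
    while specific in HYPERNYMS and specific not in seen:
        seen.add(specific)
        specific = HYPERNYMS[specific]
        if specific == general:
            return True
    return False


def semantically_compatible(head_a, head_b):
    """Semantic rule: identical heads, synonyms and hypernyms are compatible."""
    return (head_a == head_b
            or frozenset((head_a, head_b)) in SYNONYMS
            or is_hypernym(head_a, head_b)
            or is_hypernym(head_b, head_a))


def agree(value_a, value_b):
    """Agreement rule; None means the feature is unknown and never blocks."""
    return value_a is None or value_b is None or value_a == value_b


def can_corefer(re_a, re_b):
    return (agree(re_a["gender"], re_b["gender"])
            and agree(re_a["number"], re_b["number"])
            and semantically_compatible(re_a["head"], re_b["head"]))


duke = {"head": "duke", "gender": "m", "number": "sg"}
man = {"head": "man", "gender": "m", "number": "sg"}
lodgers = {"head": "lodger", "gender": None, "number": "pl"}
```

For example, can_corefer(duke, man) holds because "man" is reached from "duke" through the hypernym chain, while can_corefer(duke, lodgers) fails on number agreement.</Paragraph>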
      <Paragraph position="2"> As no semantic network comparable to WordNet is available for French, only very few synonyms are taken into account. Conceptual graphs are not used either, as our conceptual analyzer isn't robust enough for unrestricted noun phrases.</Paragraph>
      <Paragraph position="3"> The working memory stores a fixed quota of the most active MRs, the others being archived and inaccessible for further resolution. From a cognitive point of view, this memory mimics the human incapacity to track too many story characters. Computationally, it reduces ambiguity for the attachment of REs, and increases the system's speed.</Paragraph>
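      <Paragraph> A fixed-quota working memory of this kind can be sketched in a few lines. The function name, the quota value and the activation figures below are hypothetical; only the principle (keep the most active MRs, archive the rest) comes from the text.

```python
import heapq


def update_working_memory(activations, quota):
    """Split MRs into the `quota` most active ones and the archived rest."""
    active = heapq.nlargest(quota, activations, key=activations.get)
    archived = [mr for mr in activations if mr not in active]
    return active, archived


activations = {"MR1": 4.0, "MR2": 0.5, "MR3": 2.5, "MR4": 1.0}
active, archived = update_working_memory(activations, quota=2)
```

With a quota of two, MR1 and MR3 stay accessible for attachment, while MR2 and MR4 are archived and no longer considered as candidates.</Paragraph>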
    </Section>
    <Section position="5" start_page="1049" end_page="1049" type="sub_section">
      <SectionTitle>
3.2 The texts
</SectionTitle>
      <Paragraph position="0"> Two narrative texts have been chosen to test our system: a short story by Stendhal, Vittoria Accoramboni (VA), and the first chapter of a novel by Balzac, Le Père Goriot (LPG) (Table 1). VA, available as plain text, underwent manual tagging of paragraphs, sentences and boundaries of all REs, then conversion to 'objects' of our programming environment (Smalltalk). Using the LFG parser of Vapillon et al. (1997), an f-structure (parse tree) was added to each RE. Then the correct MRs were created using our user-friendly interface.</Paragraph>
      <Paragraph position="1"> LPG was already SGML-encoded with the REs and MRs, using Bruneseaux and Romary's (1997) mark-up conventions. Only REs referring to the main characters of the first chapter were encoded: humans, places and objects. As a result, the ratio RE / MR is much greater than for VA. The text was converted to Smalltalk objects, f-structures were added to the REs, and MRs were automatically generated from the SGML tags. To make comparison with VA easier, a fragment of the LPG text was isolated (LPG.eq); it contains the same number of REs as VA.</Paragraph>
      <Paragraph position="2"> It should be noted that in both cases the LFG parser isn't robust enough to deliver proper f-structures for all noun phrases. The parser's total silence is ca. 4% and its ambiguity ca. 2.7 FS per RE. Despite such drawbacks (unreliable parser, lack of semantics), we kept working on complex narrative texts in order to study in depth the effects of elementary rules and parameters in situations where the coreference rate is high. Reference resolution is probably easier on technical documentation or articles, as referents receive more constant names.</Paragraph>
    </Section>
    <Section position="6" start_page="1049" end_page="1050" type="sub_section">
      <SectionTitle>
3.3 Evaluation methods
</SectionTitle>
      <Paragraph position="0"> The MRs produced by the reference resolution module (response) are compared to the correct solution (key) using an implementation of the algorithm described by Vilain et al. (1995), also used in the MUC evaluations. Although this algorithm was designed for coreference evaluation, it in fact builds each coreference chain and compares the key and the response partitions of the RE set into MR subsets; it thus follows the MR paradigm. The algorithm computes a recall error (number of coreference links missing in the response vs. the key) and a precision error (number of wrong coreference links, i.e. present in the response but absent from the key).</Paragraph>
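      <Paragraph> The link-based scoring can be sketched as follows. This is a compact reading of the model-theoretic scheme of Vilain et al. (1995), not the scorer used in the experiments: both partitions are assumed to be lists of disjoint sets of RE identifiers, an RE missing from the other partition counts as a singleton, and the score for a partition without any link is set to 1.0 by convention here.

```python
def link_ratio(gold, other):
    """Share of the links of `gold` that survive in the partition `other`."""
    found = total = 0
    for chain in gold:
        covered = set()
        pieces = 0
        for part in other:
            common = chain.intersection(part)
            if common:
                pieces += 1
                covered.update(common)
        pieces += len(chain) - len(covered)   # unmatched REs count as singletons
        found += len(chain) - pieces          # links kept: size minus pieces
        total += len(chain) - 1               # links needed to build the chain
    return found / total if total else 1.0    # convention assumed here


def muc_score(key, response):
    """Return (recall, precision) for two partitions of the RE set."""
    return link_ratio(key, response), link_ratio(response, key)


key = [{"A", "B", "C", "D"}]
response = [{"A", "B"}, {"C", "D"}]
recall, precision = muc_score(key, response)
```

In this example the response cuts the single key chain into two pieces, so one of the three key links is missing (recall 2/3), while every link in the response is correct (precision 1.0).</Paragraph>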
      <Paragraph position="1"> The MUC scoring method isn't always meaningful. We have shown elsewhere (Popescu-Belis and Robba 1998) that it is too indulgent, and have proposed new algorithms which seem to us more relevant, named here 'core-MR' and 'exclusive-core-MR'.</Paragraph>
    </Section>
  </Section>
</Paper>