<?xml version="1.0" standalone="yes"?> <Paper uid="W05-1606"> <Title>Generating Referential Descriptions Under Conditions of Uncertainty</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Motivation </SectionTitle> <Paragraph position="0"> In the scope of this paper, we adopt the terminology originally formulated in [Dale 1988] and later used by several others. A referential description [Donnellan 1966] serves the purpose of letting the hearer or reader identify a particular object or set of objects in a given situation. The referring expression to be generated is required to be a distinguishing description, that is, a description that applies to the entities being referred to, but not to any other object in the context set. A context set is defined as the set of entities the addressee is currently assumed to be attending to - this is similar to the set of entities in the focus spaces of the discourse focus stack in Grosz' and Sidner's [1986] theory of discourse structure.</Paragraph> <Paragraph position="1"> Moreover, the contrast set (or the set of potential distractors [McDonald 1981]) is defined to contain all elements of the context set except the intended referents.</Paragraph> <Paragraph position="2"> Generating referring expressions has been pursued since the eighties [Appelt 1985, Kronfeld 1986, Appelt and Kronfeld 1987]. Subsequent years were characterized by a debate about computational efficiency versus minimality of the elements appearing in the resulting referring expression [Dale 1988, Reiter 1990, Reiter and Dale 1992]. In the mid-nineties, this debate seemed to be settled in favor of the incremental approach [Dale and Reiter 1995] - motivated by results of psychological experiments [Levelt 1989, Pechmann 1989], certain non-minimal expressions are tolerated in favor of adopting the fast strategy of incrementally selecting ambiguity-reducing attributes from a domain-dependent preference list. 
Recently, algorithms have been applied to the identification of sets of objects rather than individuals [Bateman 1999, Stone 2000, Krahmer, van Erk, and Verleg 2001], and the repertoire of descriptions has been extended to boolean combinations of attributes, including negations [van Deemter 2002]. To avoid the generation of redundant descriptions that is typical of incremental approaches, Gardent [2002] and Horacek [2003] proposed exhaustive and best-first searches, respectively.</Paragraph> <Paragraph position="3"> All these procedures more or less share the design of the underlying knowledge base. Objects are conceived in terms of sets of attributes, each with an atomic value as its filler.</Paragraph> <Paragraph position="4"> Some models distinguish specializations of these values according to a taxonomic hierarchy, so that the most accurate value can be replaced by one of its generalizations if there are reasons to assume this alternative is preferable, either due to insufficient knowledge attributed to the audience or to prevent unintended implications. A few approaches also deal with relations to other objects, whose representation differs from that of attributes only by the reference to the related object. Typically, a user model is assumed to guide the choice among available descriptors; the user model expresses taxonomic knowledge attributed to the user, indicating for each descriptor whether it is known to the user or not. While a knowledge base developed and interpreted in this manner is adequate for generating referring expressions in most application-relevant settings, there may be circumstances in which uncertainties are prominent, so that the simple boolean attribution of properties to objects becomes problematic and may prove insufficient. Uncertainties may manifest themselves in at least the following three factors: * Uncertainty about knowledge: There may not be sufficient evidence to assume that the user is or is not acquainted with a specific term. 
In fact, most of today's user model components assign some probability to statements about a user's knowledge or capabilities, for example on the basis of inferences obtained through a belief network [Pearl 1988].</Paragraph> <Paragraph position="5"> * Uncertainty about perception capabilities: There is an increasing number of applications with natural language interaction where the objects of the discourse do not appear on the computer screen (e.g., ubiquitous tools guiding a user through environments such as airports and tourist attractions [Wahlster 2004]). In such situations, perception and recognition of object properties are much harder to assess; for example, the visibility of some object or of one of its parts may not be derivable with complete certainty.</Paragraph> <Paragraph position="6"> * Uncertainty about conceptual agreement: While ascribing a value to an attribute is straightforward for certain categories of attributes, problems may occur, e.g., in connection with vagueness. Vagueness may be relevant for a number of commonly used properties such as size and shape; even with colors, transitions between adjacent color tones may not be firmly categorized as one of the two candidates.</Paragraph> <Paragraph position="7"> To illustrate these manifestations of uncertainty, let us consider a scenario with three similar dogs, one of which is a basset, which is also the intended referent. In addition, the basset is brownish and has a long tail. The other two dogs have shorter tails, and their coats are also brown, but with some white and some black portions, respectively. Furthermore, we assume that the audience has little knowledge about dog specifics, that is, it is not very likely that they will recognize the intended referent as a basset. 
We also assume that the tails of the dogs cannot be observed easily by the audience under the given local circumstances.</Paragraph> <Paragraph position="8"> Hence, the three attributes &quot;category&quot;, &quot;color&quot;, and &quot;tail length&quot; each fall into one of the categories of uncertainty introduced above: the categorization of the intended referent as a basset is associated with uncertainty about knowledge; the limited visibility, which may not enable the spectators to see the tails of the dogs at every moment, constitutes an uncertainty about perception capabilities; and the similarity of the dogs' colors may yield uncertainty about conceptual agreement, that is, it is doubtful whether the descriptor &quot;brownish&quot; applies only to the intended referent or also to some of the other dogs in the given situation.</Paragraph> <Paragraph position="9"> Apparently, these uncertainties have consequences for building human-adequate referring expressions, especially in contexts where most of the available descriptors are associated with some kind of uncertainty. Intuitively, we would expect people to produce referring expressions with several of these descriptors - redundant in case they are all recognized, but with the hope that identification will still succeed if the audience can identify only some part of the overall description in the given situation. Moreover, we would expect people to use only descriptors that have some reasonable chance of being understood.</Paragraph> <Paragraph position="10"> Unfortunately, traditional generation algorithms do not enable us to model such behavior, since none of the available options does justice to the uncertainty involved. If a descriptor is modeled as applying to all entities (e.g., for &quot;brownish&quot;), it will never be chosen, since it yields no discrimination. 
A similar consequence is obtained when the capabilities of the audience are interpreted pessimistically.</Paragraph> <Paragraph position="11"> Finally, if a descriptor is assumed to be understood, it might be chosen without considering any of the other candidates associated with uncertainty. Thus, modeling in the existing algorithms forces us to make crisp decisions, with strong impacts on the result of the algorithm. Redundant expressions motivated by uncertainty about recognition cannot be generated under any modeling alternative.</Paragraph> <Paragraph position="12"> There are only a few computational approaches that address the problem of uncertainty about the recognition of referring expressions. For example, [Edmonds 1994] and [Heeman and Hirst 1995] both describe plan-based methods, in which a vague and partial description is produced initially and then narrowed and ultimately confirmed in the subsequent discourse. However, the documented examples only address incomplete, never incorrect, interpretations.</Paragraph> <Paragraph position="13"> An approach that fits our intentions better is the work by Goodman [1987], which emphasizes reference identification and associated failures in task-oriented dialogs [Goodman 1986]. This case study demonstrates various impacts of limitations and discrepancies of expertise on referential identification: subjects exhibit uncertainty in identification, which manifests itself in tentative actions and changes of mind; they misinterpret descriptions (e.g., 'outlet' interpreted as 'hole'); and they may find no appropriate referent at all. In the latter case, subjects even attempt to repair an otherwise uninterpretable description by relaxing descriptors. In the following, we interpret some of these findings for our model of uncertainty, including a model of a repair mechanism.</Paragraph> </Section></Paper>