<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1411">
  <Title>Referring to Displays in Multimodal Interfaces</Title>
  <Section position="9" start_page="0" end_page="0" type="concl">
    <SectionTitle>
6 Preliminary Proposals
</SectionTitle>
    <Paragraph position="0"> To facilitate inter-module communication, and to allow for possible symbolic reasoning, a high-level symbolic representation is needed (see figure 2). Pointing facilities are included in the system and combine with the graphic display and the NL interpreter to form a multimodal system. Pointing should assist in resolving ambiguity (of the described referent, not the intended referent), but the main component for dealing with these ambiguities is the reference model.</Paragraph>
    <Paragraph position="1"> It is clear that resolving references in such a system could, in general, depend upon a wide variety of sources of knowledge, including the following. Local Semantic Properties: The noun phrase itself will supply the most immediate constraints on the choice of (described) referent, in terms of the head noun and its modifiers such as adjectives and prepositional phrases.</Paragraph>
    <Paragraph position="2"> Semantic Relations: Processing of referents must also take account of relations which are not expressed in the noun phrase but which involve the referent(s) and other display/world objects.</Paragraph>
    <Paragraph position="3"> Mutual Beliefs: The user and the system should both know the referent object and its described features, and each should acknowledge that the other knows the object and its features as well (Clark and Marshall, 1981, p. 57). In a multimodal environment, there are various ways for an object to be acknowledged by both dialogue participants: it may be displayed on the screen, mentioned in previous dialogue, or part of common-sense knowledge for both speaker and listener. A variety of pragmatic inferences might be possible. For example, in the query &quot;What colour is this ↗?&quot; (see footnote 2), either model may be in question, but it is unlikely that the display property is intended because it is already clearly visible. Moreover, as we suggested, the display property may represent some other property in the world model, so that if the user says &quot;What is the price band of this ↗?&quot;, it may be inferrable both that the user must mean the car (rather than the icon), and also that it would be appropriate to give a reminder (e.g. &quot;Colour represents price band&quot;) about the depictive mapping, since the user is querying a directly depicted world property.</Paragraph>
    <Paragraph position="4"> Footnote 2: The ↗ means a pointing act happens here. (Running page header: 82, D. He, G. Ritchie and J. Lee.)</Paragraph>
    <Paragraph position="5"> Coherence: The coherence of the proceeding dialogue should not be damaged by an object becoming the referent of the expression (Grosz and Sidner, 1986).</Paragraph>
    <Paragraph position="6"> It follows that the disambiguation process should be based on the following information sources: the world model and the display model for the sources of candidates and the examination of various restrictions, the dialogue model for providing coherence information about the dialogue, and the user model for the modelling of mutual beliefs. In practice, our project is too limited to explore all of these issues, and we intend to leave aside issues of mutual belief (that is, our &quot;user model&quot; will be degenerately simple). It seems plausible that the consideration of described referents could be restricted, in this more limited project, to the use of &quot;Local Semantic Properties&quot; (in the above list). As argued in another context (Ritchie, 1976; Ritchie, 1983), broader semantic constraints (such as relations to other objects or even existence in the current situation) are largely concerned with the eventual referent, rather than superficial aspects of how it happens to be described. Even the question of whether a phrase is a semantically compatible subject or object of a particular verb is a constraint on the referent, not the symbolic expression describing it. In the revised three-level arrangement suggested earlier, such constraints would be on the intended referent rather than the described referent. That is, in a sentence like &quot;What kind of fuel-injection system does the blue one have?&quot; the constraint that the referent must be a type of object which can have a fuel-injection system is to be imposed upon the intended referent.</Paragraph>
    <Paragraph position="7"> This suggests, at least superficially, that the described referent might be calculated relatively simply using just the properties of the noun phrase, without much inference. The more difficult question of determining the intended referent would then involve potentially complicated inference about domain objects, etc. This would allow a two-stage approach to referent determination: first find the described referent, then compute the possible intended referents.</Paragraph>
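The two-stage idea can be illustrated with a minimal Python sketch. It is not the system described in this paper: the class Obj, the toy objects, the DEPICTS mapping and both function names are hypothetical, chosen only to show a first stage that uses nothing but the noun phrase and a second stage that applies a broader constraint to the intended referent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    """A candidate object in either the display model or the world model."""
    name: str
    model: str          # "display" or "world"
    kind: str           # head-noun category, e.g. "icon" or "car"
    props: tuple = ()   # simple (attribute, value) pairs


# Hypothetical toy data: each icon depicts one car.
ICON1 = Obj("icon1", "display", "icon", (("colour", "blue"),))
ICON2 = Obj("icon2", "display", "icon", (("colour", "red"),))
CAR1 = Obj("car1", "world", "car", (("colour", "green"),))
CAR2 = Obj("car2", "world", "car", (("colour", "blue"),))
OBJECTS = [ICON1, ICON2, CAR1, CAR2]
DEPICTS = {"icon1": CAR1, "icon2": CAR2}


def described_referents(head, modifiers, objects):
    """Stage 1: filter using only the noun phrase (head noun and modifiers)."""
    return [
        o for o in objects
        if head in ("one", o.kind) and all(m in o.props for m in modifiers)
    ]


def intended_referents(described, depicts, constraint):
    """Stage 2: follow the depiction mapping, then apply a broader constraint."""
    result = []
    for o in described:
        for cand in (o, depicts.get(o.name)):
            if cand is not None and constraint(cand) and cand not in result:
                result.append(cand)
    return result


# "What kind of fuel-injection system does the blue one have?"
described = described_referents("one", [("colour", "blue")], OBJECTS)
intended = intended_referents(described, DEPICTS, lambda o: o.kind == "car")
```

In this toy setting the described referents are the blue icon and the blue car, and the intended referents are the car depicted by the blue icon and the blue car itself, so the query remains ambiguous until some further source of information is consulted.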
    <Paragraph position="8"> As a benefit of this approach, a pointing action, which can be seen as a short way to indicate a described referent, could be included in a modular fashion.</Paragraph>
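One way to picture this modularity is to treat each modality as an independent source of described-referent candidates and to intersect whatever sources are present. The sketch below is only an assumption about how such a combination might look: the function names, the rectangular screen regions and the toy object table are all hypothetical.

```python
def from_noun_phrase(head, kinds):
    """Candidates suggested by the head noun alone ("this" matches anything)."""
    return {name for name, kind in kinds.items() if head in ("this", kind)}


def from_pointing(point, regions):
    """Candidates whose (hypothetical) screen rectangle contains the point."""
    x, y = point
    return {
        name for name, (x0, y0, x1, y1) in regions.items()
        if x1 >= x >= x0 and y1 >= y >= y0
    }


def described(sources):
    """Intersect the sources that are present; an absent modality is None."""
    sets = [s for s in sources if s is not None]
    return set.intersection(*sets) if sets else set()


# Toy data: two icons on screen, one car in the world model.
KINDS = {"icon1": "icon", "icon2": "icon", "car1": "car"}
REGIONS = {"icon1": (0, 0, 10, 10), "icon2": (20, 0, 30, 10)}

with_pointing = described(
    [from_noun_phrase("this", KINDS), from_pointing((25, 5), REGIONS)]
)
without_pointing = described([from_noun_phrase("this", KINDS), None])
```

Because the pointing source has the same output type as the noun-phrase source, it can be added or removed without changing the rest of the computation, which is the sense of modularity intended here.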
    <Paragraph position="9"> During these inferences (particularly the search for intended referents), a variety of sources of information may affect the result. It is therefore necessary to have some mechanisms which allow the interaction of these disparate sources. It is possible that some of the constraint-satisfaction suggestions of (Mellish, 1985) might be useful.</Paragraph>
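As a rough illustration of how disparate sources might interact, the following Python sketch filters candidate sets with unary and binary constraints until nothing more can be removed. It is a generic candidate-filtering loop written for this example, not the method of (Mellish, 1985); the function propagate, the constraint predicates and the toy objects (including the garages) are hypothetical.

```python
def propagate(candidates, unary, binary):
    """Shrink candidate sets until no constraint removes anything more.

    candidates: dict mapping each referring expression to a set of object names
    unary:      list of (expression, predicate(obj))
    binary:     list of (expr_a, expr_b, predicate(obj_a, obj_b))
    """
    domains = {v: set(c) for v, c in candidates.items()}
    for var, ok in unary:
        domains[var] = {o for o in domains[var] if ok(o)}
    changed = True
    while changed:
        changed = False
        for a, b, ok in binary:
            keep_a = {x for x in domains[a] if any(ok(x, y) for y in domains[b])}
            keep_b = {y for y in domains[b] if any(ok(x, y) for x in keep_a)}
            if keep_a != domains[a] or keep_b != domains[b]:
                domains[a], domains[b] = keep_a, keep_b
                changed = True
    return domains


# Toy knowledge, standing in for the world model.
CARS = {"car1", "car2"}
PARKED_IN = {("car2", "garage1")}

result = propagate(
    candidates={
        "the blue one": {"icon1", "car1", "car2"},
        "the garage": {"garage1", "garage2"},
    },
    unary=[("the blue one", lambda o: o in CARS)],
    binary=[("the blue one", "the garage", lambda a, b: (a, b) in PARKED_IN)],
)
```

Each model (world, display, dialogue) could contribute its own predicates to the two constraint lists, so the loop itself does not need to know which source a constraint came from.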
    <Paragraph position="10"> If none of the available sources of information resolve the ambiguities, then the query as a whole is ambiguous, but it seems unlikely that this would happen in practice. The challenge is that the processing method should be equally effective at making use of these sources of disambiguation. Our objective is to allow for as much flexibility as we can in the referential phenomena, but we acknowledge inevitable limitations.</Paragraph>
    <Paragraph position="11"> Acknowledgements: The first author (Daqing He) is supported by a Colin &amp; Ethel Gordon Scholarship from the University of Edinburgh.</Paragraph>
  </Section>
</Paper>