<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1322">
  <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics. A computational model of multi-modal grounding for human robot interaction</Title>
  <Section position="3" start_page="0" end_page="153" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Natural language is the most intuitive way for human beings to communicate (Allen et al., 2001). It is therefore very important to provide dialog capabilities for personal service robots that are meant to help people in their everyday lives. However, the interaction with a robot, as a mobile, autonomous device, differs from that with many other computer-controlled devices, and this difference affects dialog modeling. Here we first clarify the most essential requirements for dialog management systems for human-robot interaction (HRI) and then outline state-of-the-art dialog modeling approaches to position our work.</Paragraph>
    <Paragraph position="1"> The first requirement results from the situatedness (Brooks, 1986) of HRI. A mobile robot is situated &amp;quot;here and now&amp;quot; and inhabits the same physical world as the user. Environmental changes can have a massive influence on task execution.</Paragraph>
    <Paragraph position="2"> For example, suppose a robot is told to fetch a cup from the kitchen but finds the door locked. Under such circumstances the dialog system must support a mixed-initiative dialog style: it must accept user commands on the one hand and report perceived environmental changes on the other. Otherwise the robot would have to abort the task execution, and the user would have no way to find out why.</Paragraph>
    <Paragraph position="3"> The second challenge for HRI dialog management is the embodiment of a robot, which changes the way of interaction. Empirical studies show that visual access to the interlocutor's body affects the conversation in that non-verbal behaviors are used as communicative signals (Nakano et al., 2003). For example, to refer to a cup that is visible to both dialog partners, the speaker tends to say &amp;quot;this cup&amp;quot; while pointing to it. The same strategy is largely ineffective during a phone call. As this example shows, an HRI dialog system must account for multi-modal communication.</Paragraph>
    <Paragraph position="4"> The third, and probably unique, challenge for HRI dialog management is the implication of the learning ability of such a robot. Since a personal service robot is intended to help humans in their individual households, it is impossible to hard-code into the system all the knowledge it will need, e.g., where the cup is and what should be served for lunch. Thus, it is essential for such a robot to be able to learn new knowledge and tasks. For the dialog system, however, this ability implies that it cannot rely on comprehensive, hard-coded knowledge for dialog planning. Instead, it must be designed to be only loosely coupled with the domain knowledge.</Paragraph>
    <Paragraph position="5"> Many dialog modeling approaches already exist. McTear (2002) classified them into three main types: finite state-based, frame-based, and agent-based. In the first two approaches the dialog structure is closely coupled with pre-defined task steps and can therefore only handle well-structured tasks, for which dialog styles led by one side are sufficient. In the agent-based approach, the communication is viewed as a collaboration between two intelligent agents. Different approaches inspired by psychology and linguistics are in use within this category. For example, within the TRAINS/TRIPS project several complex dialog systems for collaborative problem solving have been developed (Allen et al., 2001). Here the dialog system is viewed as a conversational agent that performs communicative acts. During a conversation, the dialog system selects its communicative goal based on its current beliefs about the domain and on general conversational obligations. Such systems make use of communication and domain models to enable a mixed-initiative dialog style and to handle more complex tasks. In the HRI field, due to the complexity of the overall systems, usually the finite-state-based strategy is employed (Matsui et al., 1999; Bischoff and Graefe, 2002; Aoyama and Shimomura, 2005). As to the issue of multi-modality, one strand of research concerns the fusion and representation of multi-modal information, e.g., (Pfleger et al., 2003), while the other strand focuses on the generalisation of human-like conversational behaviors for virtual agents. In this strand, Cassell (2000) proposes a general architecture for multi-modal conversation, and Traum (2002) extends his information-state based dialog model by adding more conversational layers to account for multi-modality.</Paragraph>
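The finite-state-based strategy described above can be illustrated with a minimal sketch (all states, prompts, and transitions here are hypothetical, not taken from any of the cited systems): the dialog structure is a fixed graph of states, so only well-structured, system-led tasks can be handled, and there is no room for mixed initiative.

```python
class FiniteStateDialog:
    """A toy finite-state dialog manager: states and transitions are
    hard-coded, so the system alone drives the conversation."""

    # state -> {recognized user input: next state}
    TRANSITIONS = {
        "idle":    {"fetch": "confirm"},
        "confirm": {"yes": "execute", "no": "idle"},
        "execute": {"done": "idle"},
    }

    # what the system says on entering each state
    PROMPTS = {
        "idle":    "What should I do?",
        "confirm": "Should I fetch the cup?",
        "execute": "Fetching the cup...",
    }

    def __init__(self):
        self.state = "idle"

    def step(self, user_input):
        """Advance to the next state if the input matches a transition;
        otherwise stay in the current state and repeat its prompt."""
        self.state = self.TRANSITIONS[self.state].get(user_input, self.state)
        return self.PROMPTS[self.state]


dm = FiniteStateDialog()
print(dm.step("fetch"))  # confirmation prompt
print(dm.step("yes"))    # execution prompt
```

Note how an unforeseen environmental change (a locked door) has no place in this graph: the manager can only follow its pre-defined task steps, which is exactly the limitation that motivates the agent-based approach.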
    <Paragraph position="6"> In this paper we present an agent-based dialog model for HRI. As described in section 2, the two main contributions of this model are a new modeling approach for Clark's grounding mechanism and its extension to handle multi-modal grounding. In section 3 we outline the capabilities of the implemented system and present some quantitative evaluation results.</Paragraph>
  </Section>
</Paper>