<?xml version="1.0" standalone="yes"?> <Paper uid="P06-2085"> <Title>Using Machine Learning to Explore Human Multimodal Clarification Strategies</Title> <Section position="3" start_page="0" end_page="659" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Good clarification strategies in dialogue systems help to ensure and maintain mutual understanding and thus play a crucial role in robust conversational interaction. In dialogue application domains with high interpretation uncertainty, caused for example by acoustic uncertainties from a speech recogniser, multimodal generation and input lead to more robust interaction (Oviatt, 2002) and reduced cognitive load (Oviatt et al., 2004). In this paper we investigate the use of machine learning (ML) to explore human multimodal clarification strategies, and the use of those strategies to decide, based on the current dialogue context, when a dialogue system's clarification request (CR) should be generated in a multimodal manner.</Paragraph> <Paragraph position="1"> In previous work (Rieser and Moore, 2005) we showed that, for spoken CRs in human-human communication, people follow a context-dependent clarification strategy which varies systematically across domains (and even across Germanic languages). In this paper we investigate whether there exists a context-dependent &quot;intuitive&quot; human strategy for multimodal CRs as well. To test this hypothesis we gathered data in a Wizard-of-Oz (WOZ) study, where different wizards could decide when to show a screen output.</Paragraph> <Paragraph position="2"> From this data we build prediction models, using supervised learning techniques together with feature engineering methods, that may explain the underlying process which generated the data. 
If we can build a model which predicts the data quite reliably, we can show that there is a uniform strategy that the majority of our wizards followed in certain contexts. The overall method and the corresponding structure of the paper are shown in figure 1. We proceed as follows. In section 2 we present the WOZ corpus, from which we extract a potential context using &quot;Information State Update&quot; (ISU)-based features (Lemon et al., 2005), listed in section 3. We also address the question of how to define a suitable &quot;local&quot; context for the wizard actions. We apply the feature engineering methods described in section 4 to address the questions of unique thresholds and feature subsets across wizards. These techniques also help to reduce the context representation and thus the feature space used for learning. In section 5 we test different classifiers on this reduced context and separate out the independent contributions of learning algorithms and feature engineering techniques. In section 6 we discuss and interpret the learnt strategy. Finally, we argue for the use of reinforcement learning to optimise the multimodal clarification strategy.</Paragraph> </Section> </Paper>