<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-2044">
  <Title>Evolving optimal inspectable strategies for spoken dialogue systems</Title>
  <Section position="2" start_page="0" end_page="173" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Developing a dialogue management strategy for a spoken dialogue system is often a complex and time-consuming task. This is because the number of unique conversations that can occur between a user and the system is almost unlimited. Consequently, a system developer may spend a lot of time anticipating how potential users might interact with the system before deciding on the appropriate system response. null Recent research has focused on generating dialogue strategies automatically. This work is based on modelling dialogue as a markov decision process, formalised by a finite state space S, a finite action set A, a set of transition probabilities T and a reward function R. Using this model an optimal dialogue strategy pi[?] is represented by a mapping between the state space and the action set. That is, for each state s [?] S this mapping defines its optimal action a[?]s. How is this mapping constructed? Previous approaches have employed reinforcement learning (RL) algorithms to estimate an optimal value function Q[?] (Levin et al., 2000; Frampton and Lemon, 2005). For each state this function predicts the future reward associated with each action available in that state. This function makes it easy to extract the optimal strategy (policy in the RL literature).</Paragraph>
    <Paragraph position="1"> Progress has been made with this approach but some important challenges remain. For instance, very little success has been achieved with the large state spaces that are typical of real-life systems.</Paragraph>
    <Paragraph position="2"> Similarly, work on summarising learned strategies for interpretation by human developers has so far only been applied to tasks where each state-action pair is explicitly represented (Lecoeuche, 2001).</Paragraph>
    <Paragraph position="3"> This tabular representation severely limits the size of the state space.</Paragraph>
    <Paragraph position="4"> We propose an alternative approach to finding optimal dialogue policies. We make use of XCS, an evolutionary reinforcement learning algorithm that seeks to represent a policy as a compact set of state-action rules (Wilson, 1995). We suggest that this algorithm could overcome both the challenge of large state spaces and the desire for strategy inspectability. In this paper, we focus on the issue of inspectability. We present a series of experiments that illustrate how XCS can be used to evolve dialogue strategies that are both optimal and easily inspectable.</Paragraph>
  </Section>
class="xml-element"></Paper>