<?xml version="1.0" standalone="yes"?>
<Paper uid="P00-1013">
<Title>Spoken Dialogue Management Using Probabilistic Reasoning</Title>
<Section position="6" start_page="0" end_page="0" type="concl">
<SectionTitle> 5 Conclusion </SectionTitle>
<Paragraph position="0"> This paper discusses a novel way to view the dialogue management problem. The domain is represented as the partially observable state of the user, where the observations are speech utterances from the user. The POMDP representation inverts the traditional notion of state in dialogue management, treating the state as unknown but inferable from the sequence of observations from the user. Our approach allows us to model observations from the user probabilistically, and in particular we can compensate appropriately for more or less reliable observations from the speech recognition system. In the limit of perfect recognition, we achieve the same performance as a conventional MDP dialogue policy. However, as recognition degrades, we can model the effects of actively gathering information from the user to offset the loss of information in the utterance stream.</Paragraph>
<Paragraph position="1"> In the past, POMDPs have not been used for dialogue management because of the computational complexity involved in solving anything but trivial problems. We avoid this problem by using an augmented MDP state representation for approximating the optimal policy, which allows us to find a solution that quantitatively outperforms the conventional MDP while dramatically reducing the time to solution compared to an exact POMDP algorithm (linear rather than exponential in the number of states).</Paragraph>
<Paragraph position="2"> We have shown experimentally, both in simulation and in preliminary user testing, that the POMDP solution consistently outperforms the conventional MDP dialogue manager, as measured by the number of erroneous actions taken during the dialogue. We are also able to show with actual users that as speech recognition performance varies, the dialogue manager compensates appropriately.</Paragraph>
<Paragraph position="3"> While the results of the POMDP approach to the dialogue system are promising, a number of improvements are needed. The POMDP is overly cautious, refusing to commit to a particular course of action until it is completely certain that the action is appropriate. This is reflected in its liberal use of verification questions. This behavior could be avoided with a non-static reward structure, in which information gathering becomes increasingly costly as the dialogue progresses.</Paragraph>
<Paragraph position="4"> The policy is also extremely sensitive to the parameters of the model, which are currently set by hand. While learning the parameters from scratch for a full POMDP is probably unnecessary, automatic tuning of the model parameters would add considerably to the utility of the model. For example, the optimality of a policy depends strongly on the design of the reward structure. It follows that incorporating a learning component that adapts the reward structure to reflect actual user satisfaction would likely improve performance.</Paragraph>
</Section>
</Paper>
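As a concrete illustration of the two ideas summarized above, the following is a minimal Python sketch: a Bayes-filter belief update over hidden user states with a noisy speech-recognition observation model, followed by compression of the belief into the (most likely state, belief entropy) pair that serves as the augmented MDP state. The three-state domain and the transition and confusion matrices are invented for illustration only; the paper's actual models are richer and action-dependent.

import numpy as np

# Hypothetical user goals; stand-ins for the paper's actual domain.
STATES = ["want_weather", "want_time", "want_help"]

# P(s' | s): a single, action-independent transition matrix for brevity.
T = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

# P(o | s): rows are true states, columns are recognized keywords.
# Off-diagonal mass models speech recognition errors; increasing it
# corresponds to a less reliable recognizer.
O = np.array([[0.70, 0.20, 0.10],
              [0.20, 0.70, 0.10],
              [0.15, 0.15, 0.70]])

def belief_update(b, obs_idx):
    """Bayes filter: predict through T, then weight each state by the
    likelihood of the recognized observation. Cost is polynomial in the
    number of states, unlike exact POMDP policy computation."""
    predicted = T.T @ b                    # prediction step
    posterior = O[:, obs_idx] * predicted  # correction step
    return posterior / posterior.sum()

def augmented_state(b):
    """Compress the full belief to (most likely state, belief entropy),
    the augmented-MDP state over which the policy is defined."""
    entropy = -np.sum(b * np.log(b + 1e-12))
    return int(np.argmax(b)), entropy

b = np.full(3, 1.0 / 3.0)          # start maximally uncertain
b = belief_update(b, obs_idx=0)    # recognizer reports "weather"
most_likely, entropy = augmented_state(b)
print(STATES[most_likely], round(entropy, 3))

In this sketch, degrading the confusion matrix raises the posterior entropy, and a policy defined over the augmented state can respond to that rise by asking a verification question rather than committing to an action; with a near-identity confusion matrix, the entropy term stays low and the behavior reduces to that of a conventional MDP policy.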