<?xml version="1.0" standalone="yes"?> <Paper uid="E91-1015"> <Title>A TASK INDEPENDENT ORAL DIALOGUE MODEL</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> A TASK INDEPENDENT ORAL DIALOGUE MODEL Eric Bilange CAP GEMINI INNOVATION </SectionTitle> <Paragraph position="0"> 118, rue de Tocqueville, 75017 Paris, France and IRISA Lannion e-mail: bilanp~C/rp.capsogeti.fr</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> ABSTRACT </SectionTitle> <Paragraph position="0"> This paper presents a human-machine dialogue model in the field of task-oriented dialogues. The originality of this model resides in the clear separation of dialogue knowledge from task knowledge in order to facilitate the modeling of dialogue strategies and the maintenance of dialogue coherence. These two aspects are crucial in the field of oral dialogues with a machine, considering the current state of the art in speech recognition and understanding techniques. One important theoretical innovation is that our dialogue model is based on a recent linguistic theory of dialogue modeling. The dialogue model considers real-life situations, as our work was based on a real man-machine corpus of dialogues.</Paragraph> <Paragraph position="1"> In this paper we describe the model and the formalisms designed for the implementation of a dialogue manager module inside an oral dialogue system. An important outcome and proof of our model is that it is able to conduct dialogues in three different applications.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The work presented here is a dialogue model for oral task-oriented dialogues. 
This model is used and under development in the SUNDIAL ESPRIT project [1], whose aim is to develop an oral cooperative dialogue system.</Paragraph> <Paragraph position="1"> Many researchers have observed that oral dialogue is not merely organized as a cascade of adjacency pairs as Schegloff and Sacks (1973) suggested. Task-oriented dialogues have been analyzed from different points of view: discourse seg-</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> REY University </SectionTitle> <Paragraph position="0"> ented dialogues planning techniques have received a fair amount of attention (Allen et al, 1982; Litman &amp; Allen, 1984).</Paragraph> <Paragraph position="1"> In the latter approach there is no means to describe and deal with pure discursive phenomena (meta-communication) such as oral misunderstanding, initiative keeping, initiative giving, etc., whilst in the first approaches there is no attempt to develop a full dialogue system, except in Grosz and Sidner's (1986) model, which unfortunately does not cover all oral dialogue phenomena (Bilange et al, 1990b).</Paragraph> <Paragraph position="2"> In oral conversation, meta-communication represents a large proportion of all possible phenomena and is not simple to deal with, especially if we strive to obtain natural dialogues. Therefore, we developed a computational model able to have a clear view of what happens at the task level and at the level of the communication itself. This model is not based on pure intuition but has been validated in a semi-automatic human-machine dialogue simulation (Ponamalé et al, 1990). The aim is to obtain a dialogue manager capable of natural behaviour during a conversation, allowing the user to express himself without being forced to follow the system's behaviour. 
Thus we endow the system with the capabilities of a fully interactive dialogue.</Paragraph> <Paragraph position="3"> Moreover, as a strategic choice, we decided to have a predictive system, as prediction has been shown to be crucial for oral dialogue systems (Guyomard et al, 1990; Young, 1989), in order to guide the speech understanding mechanisms whenever possible. These predictions result from an analysis of our corpus and are generalized by endowing the system with the capacity to judge the degree of dialogue openness. As a result, predicting the user's possible interventions does not mean that the system will predict all possibilities, only the relevant ones. This presupposes cooperative users.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Overview of the Dialogue Manager </SectionTitle> <Paragraph position="0"> The architecture of the SUNDIAL Dialogue Manager is presented in Fig. 1.</Paragraph> <Paragraph position="1"> It is a kind of distributed architecture where sub-modules are independent agents.</Paragraph> <Paragraph position="3"> Let us briefly present how the dialogue manager works as a whole. At each turn in the dialogue, the dialogue module constructs dialogue allowances on the basis of the current dialogue structure. Depending on whose turn it is to speak, these dialogue allowances provide either dialogic descriptions of the next possible system utterance or dialogic predictions about the next possible user utterance(s). When it is the system's turn, messages from the task module, such as requests for missing task parameters, messages from the linguistic interface module, such as requests for the repetition of missing words, and messages from the belief module arising, for example, from referential failure, are ordered and merged with the dialogue allowances by the dialogue module to produce the next relevant system dialogue act(s) [2]. 
The resulting acts are then sent to the message generator.</Paragraph> <Paragraph position="4"> When it is the user's turn to talk, task and belief goals are ordered and merged with the dialogue allowances to form predictions. They are sent, via the linguistic interface module, to the linguistic processor. When the user speaks, a representation of the user's utterance is passed from the linguistic processor to the linguistic interface module and then on to the belief module. The belief module assigns it a context-dependent referential interpretation suitable for the task module to make a task interpretation and for the dialogue module to make a dialogic interpretation (e.g. assign the correct dialogue act(s) and propagate the effects on the dialogue history). This results in the construction of new dialogue allowances. The cycle is then repeated to generate the next system turn.</Paragraph> <Paragraph position="5"> This is necessarily a simplified overview of the processing which takes place inside the Dialogue Manager. A detailed description of the dialogue manager can be found in (Bilange et al, 1990a).</Paragraph> <Paragraph position="6"> The purpose of this paper is to describe some fundamental aspects of the dialogue module. It is however important to state that the task module should use planning techniques similar to Litman's (1984).
[2] This terminology is defined later.</Paragraph> <Paragraph position="7">
3 Basis of the dialogue model
Task-oriented dialogues mainly consist of negotiations. These negotiations are organized in two possible patterns:
1. Negotiation opening + Reaction
2. Negotiation opening + Reaction + Evaluation
Moreover, negotiations may be detailed, which gives rise to sub-negotiations. Also, in a full dialogue, conversational exchanges occur for clarifying communication problems, and for opening and closing the dialogue. 
This description is then recursive with different possible dialogic functions.</Paragraph> <Paragraph position="8"> A dialogue model should take these phenomena into account, keeping in mind the task that must be achieved. An oral dialogue system should also take into consideration acoustic problems due to the limitations of speech understanding techniques (software as well as hardware); e.g. repair techniques should be provided to avoid misleading situations caused by misunderstandings. Finally, as a cooperative principle, the model must be habitable and thus not rigid, so that the two locutors can take the initiative whenever they want or need to.</Paragraph> <Paragraph position="9"> These bases lead us to define a model which consists of four decision layers:</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 Rules of conversation </SectionTitle> <Paragraph position="0"> The structural description of a dialogue consists of four levels similar to the linguistic model of Roulet and Moeschler (1989). Specific functional aspects are assigned to each level:
* Transaction level : informative dialogues are a collection of transactions. In the domain of travel planning, transactions could be : book a one-way, a return, etc. The transaction level is thus tied to the plan/sub-plan paradigm. A transaction can be viewed as a discourse segment (Grosz &amp; Sidner, 1986).</Paragraph> <Paragraph position="1"> * Exchange level : transactions are achieved through exchanges, which may be considered as negotiations. Exchanges may be embedded (sub-exchanges). During an exchange, negotiations occur concerning task objects or the dialogue itself (meta-communication).</Paragraph> <Paragraph position="2"> * Intervention level : An exchange is made up of interventions. Three possible illocutionary functions are attached to interventions: initiative, reaction, and evaluation.</Paragraph> <Paragraph position="3"> * Dialogue acts : A dialogue act could be defined as a speech act (Searle, 1975) augmented with structural effects on the dialogue (thus on the dialogue history) (Bunt, 1989). There are one or more main dialogue acts in an intervention. Possible secondary dialogue acts denote the argumentation attached to the main ones.</Paragraph> <Paragraph position="4"> Dialogue acts represent the minimal entities of the conversation.</Paragraph> <Paragraph position="5"> The rules of conversation use this dialogue decomposition and are organised as a dialogue grammar. Dialogue is then represented in a tree structure to reflect the hierarchical aspect of dialogue, augmented with dialogic functions. An example is given in Fig. 2.
[Fig. 2] Dialogue excerpt of example in section 4:
  S2: when would you like to leave ?
  U2: next thursday
  S3: next tuesday the 30th of November ; and at what time ?
  U3: no, thursday december the 2nd towards the end of the afternoon
  S4: ok december the 2nd around six ...
Corresponding dialogue structure:
  T1
    E1
      initiative(system, [open_request, get_parameter(dep_date)])
      reaction(user, [answer, [dep_date : #1]])
      evaluation : E2
        initiative(system, [echo, #1])
        reaction(user, [correct, [#1, #2]])
        evaluation(system, [echo, #2])
    E3
      initiative(system, [open_request, get_parameter(dep_time)])
      reaction(user, [answer, [dep_time : #3]])
      evaluation(system, [echo, #3])
  E1 = exchange(Owner: system, Intention: get(dep_date), Attention: {departure, date})
  E2 = exchange(Owner: system, Intention: clarify(value(dep_date)), Attention: {departure, date})
  E3 = exchange(Owner: system, Intention: get(dep_time), Attention: {departure, time})
  T1 = transaction(Intention: problem_description, Attention: {departure, arrival, city, date, time, flight})
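The tree of Fig. 2 can be written down as a small data structure. The following is only an illustrative sketch in Python, not the formalism used in the SUNDIAL dialogue manager: the class names, field names and the depth helper are assumptions introduced here, and the nesting of E2 as the evaluation part of E1 is one reading of the figure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Intervention:
    function: str      # "initiative", "reaction" or "evaluation"
    speaker: str       # "system" or "user"
    acts: List[str]    # dialogue act labels carried by the intervention


@dataclass
class Exchange:
    owner: str
    intention: str
    attention: Tuple[str, ...]
    # Interventions and embedded sub-exchanges, in utterance order.
    children: List[Union[Intervention, "Exchange"]] = field(default_factory=list)

    def depth(self) -> int:
        """1 for a flat exchange, plus 1 per level of embedding."""
        subs = [c.depth() for c in self.children if isinstance(c, Exchange)]
        return 1 + max(subs, default=0)


@dataclass
class Transaction:
    intention: str
    attention: Tuple[str, ...]
    exchanges: List[Exchange] = field(default_factory=list)


# E2: the clarifying sub-exchange that plays the role of E1's evaluation.
e2 = Exchange("system", "clarify(value(dep_date))", ("departure", "date"), [
    Intervention("initiative", "system", ["echo"]),
    Intervention("reaction", "user", ["correct"]),
    Intervention("evaluation", "system", ["echo"]),
])
e1 = Exchange("system", "get(dep_date)", ("departure", "date"), [
    Intervention("initiative", "system", ["open_request"]),
    Intervention("reaction", "user", ["answer"]),
    e2,
])
e3 = Exchange("system", "get(dep_time)", ("departure", "time"), [
    Intervention("initiative", "system", ["open_request"]),
    Intervention("reaction", "user", ["answer"]),
    Intervention("evaluation", "system", ["echo"]),
])
t1 = Transaction("problem_description",
                 ("departure", "arrival", "city", "date", "time", "flight"),
                 [e1, e3])
```

Keeping sub-exchanges in the same child list as interventions preserves utterance order, which is what makes the recursive description of section 3 directly visible in the structure.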
Now let us describe conversational rules through a detailed description of the functional aspects of the intervention level.</Paragraph> <Paragraph position="6"> * Initiatives are often tied to task information requests in task-oriented dialogues. Initiatives are the first intervention of an exchange but may be used to reintroduce a topic during an exchange. Intentional and attentional information is attached to initiatives and exchanges as in (Grosz &amp; Sidner, 1986). When a locutor performs an initiative the exchange is attributed to him, and he retains the initiative, since there is no need for discourse clarification, for the duration of the exchange. This is important because, according to the analysis of our corpus, the owner of an exchange is responsible for properly closing it and he has many possibilities to either keep the initiative or give it back.</Paragraph> <Paragraph position="7"> The simplest initiative allowance rule, initiative_taking, presented in Fig 3, means that the speaker X who has just evaluated the exchange Sub-exchange is allowed to open a new exchange such that it is a new sub-exchange of the exchange Exchange ({_} means any well-formed sequence according to the dialogue grammar). Moreover, the new exchange can be used to enter a new transaction. In this case the newly created exchange will not be linked as a sub-exchange (see section 3.2 below).</Paragraph> <Paragraph position="8"> * Reactions obey the adjacency pair theory.</Paragraph> <Paragraph position="9"> Reactions always give information relevant to the initiative answered.</Paragraph> <Paragraph position="10"> * Evaluations, both by the machine and the human, are crucial. To evaluate an exchange means evaluating whether or not the underlying intention is reached. In task-oriented dialogues evaluations may serve task evaluations or comprehension evaluations in cases of speech degradation. An example of an evaluation dialogue rule is given in Fig 3. 
The rule evaluation permits X to evaluate an exchange once X has initiated it and Y has reacted. The evaluation cannot be made as long as no reaction has taken place.</Paragraph> <Paragraph position="11"> This rule (as any other) is bidirectional : if X is instantiated by &quot;user&quot; then the generated dialogue 'allowance' is a prediction of what the user can utter. On the other hand, if X is instantiated by &quot;system&quot; then the rule is one of &quot;strategic generation&quot;. Evaluations are very important in oral conversation and, coupled with the principle of bidirectional rules, they make it possible to foresee possible user contentions and to handle them directly as clarifying sub-exchanges. The dialogue flavour is that the system implicitly offers the initiative to the user if necessary, keeping a cooperative attitude, and thus avoids systematic confirmations, which can be annoying (see example in section 4).</Paragraph> <Paragraph position="12"> The structural effects of evaluations are not necessarily evident. When an evaluation is acknowledged (with cue expressions like &quot;yes&quot; or &quot;ok&quot;, or by echoing what has been said), the exchange is explicitly closed. The acknowledgement may not have a concrete realization, in which case the exchange is implicitly closed. In the latter case, closings are effective when the next initiative is accepted by the addressee. It is unlikely, according to our corpus of dialogues, that one speaker will contest an evaluation later in the dialogue. In the example in section 4, the initiative in S2 is accepted because U2 answers the question. The effect is then: U's reaction implicitly accepts the initiative, which in turn implicitly accepts S's evaluation. Therefore, the exchange concerning the destination and arrival cities can be closed. 
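The bidirectional reading of the evaluation rule described above can be sketched in a few lines of Python. This is a hedged illustration only: the function name, the (function, speaker) pair encoding and the returned dictionary are assumptions made here and do not reproduce the rule formalism of Fig 3.

```python
def evaluation_allowance(interventions, owner):
    """Return an evaluation allowance for `owner`, or None if not allowed.

    `interventions` lists the (function, speaker) pairs uttered so far in
    the exchange, in order. The rule fires only when `owner` opened the
    exchange and the other locutor has reacted.
    """
    if len(interventions) != 2:
        return None
    (first_fn, first_spk), (second_fn, second_spk) = interventions
    if first_fn != "initiative" or first_spk != owner:
        return None
    if second_fn != "reaction" or second_spk == owner:
        return None
    # Same rule, two uses: predict the user or generate for the system.
    kind = "prediction" if owner == "user" else "generation"
    return {"function": "evaluation", "speaker": owner, "kind": kind}


system_case = evaluation_allowance(
    [("initiative", "system"), ("reaction", "user")], "system")
user_case = evaluation_allowance(
    [("initiative", "user"), ("reaction", "system")], "user")
no_reaction = evaluation_allowance([("initiative", "system")], "system")
```

The same function yields a generation allowance when X is the system and a prediction when X is the user, which is the sense in which one rule serves both directions.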
We will describe later how such effects are modelled.</Paragraph> <Paragraph position="13"> During one cycle, every possible dialogue allowance is generated even if some are conflicting.</Paragraph> <Paragraph position="14"> Conflicts are solved in the next two layers of the model.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 Dialogue acts computation </SectionTitle> <Paragraph position="0"> Once the general perspective of the dialogue continuation has been hypothesised, dialogue acts are instantiated according to task and communication management needs. A dialogue act definition is described in Fig 4.
[Fig. 4] Dialogue act definition:
  Dialogue act label ==> message_1, ..., message_n
    ==> Description of the dialogue act
        Effects of the dialogue act
    &lt;- preconditions and/or actions</Paragraph> <Paragraph position="1"> The premises state the list of messages the dialogue act copes with [3]. The conclusions are twofold: there is a description of the dialogic effect of the act and of its mental effect on the two
[3] We recall that these messages are received by the dialogue module internally (see section 3) or externally (see section 2).</Paragraph> <Paragraph position="3"> our model does no more than what exists in Allen et al's work (1982). Lastly, the preconditions are a list of tests concerning the current intentional and attentional states in order to respect the dialogue coherence, and/or actions used for example to signal explicit topic shifts. Signaling a shift means introducing features so that, when the act is generated, some rhetorical cues are included: &quot;Now let's talk about the return, when do you want to come back?&quot;, or simply: &quot;and at what time?&quot; when the discursive context states that the system has the initiative.</Paragraph> <Paragraph position="4"> At this level all possible dialogue acts according to the dialogue allowances issued by the previous level are hypothesised. 
Discursive and metadiscursive acts are planned and the next layer will select the relevant acts according to the dialogue strategy.</Paragraph> <Paragraph position="5"> In the next paragraphs, we describe the most important dialogue acts the system knows and classify them according to the function they achieve. Combining task messages and dialogue allowances : The dialogue model considers the task as an independent agent in a system. The task module sends relevant requests whenever it needs information, or information whenever asked by the dialogue module.</Paragraph> <Paragraph position="6"> * Initiatives and parameter requests : an initiative can be used to ask for one task parameter. The intention of the newly created exchange is then tagged as &quot;get_parameter&quot; whereas the attention is the requested object [4]. The act is presented in Fig. 5.</Paragraph> <Paragraph position="7"> [4] This is a very simplified description. One can refer to (Sadek, 1990) for a more precise view of what could be done.</Paragraph> <Paragraph position="8"> * The other identified possibilities are: initiative and non topical information; initiative and task solution(s); transaction opening, initiative, and task plan opening; reaction and parameter value; transaction closing, evaluation and task plan closing, in which case the act may not have a surface realization, since exchanges in the transaction may have been evaluated, which implicitly allows the transaction closing.</Paragraph> <Paragraph position="9"> Dialogue progression control : * Confirmation handling: Representations coming from the speech understanding module contain recognition scores [5]. According to the score rate, confirmations are generated with different intensity. 
The rules are :
* Low score : realize only the evaluation goal, entering a clarifying exchange.</Paragraph> <Paragraph position="10"> * Average score : a combination of evaluation and initiative is allowed, splitting them into two sentences as in &quot;Paris Brest ; when would you like to leave ?&quot;
* High score : in that case, the evaluation can be merged with the next initiative as in &quot;when would you like to leave for Bonn?&quot;.</Paragraph> <Paragraph position="11"> * Contradiction handling. When the addressee contradicts an evaluation, any initiative already uttered by the system is marked as &quot;postponed&quot;. The exchange in which the contest occurs is then reentered and the evaluation part becomes a sub-exchange.</Paragraph> <Paragraph position="12"> * Communication management. Requests for pauses or for repetition postpone every kind of dialogue goal. The adopted strategy is to achieve the phatic management and then reintroduce the goals in the next system utterance.</Paragraph> <Paragraph position="13"> * Reintroducing old goals. As long as the current transaction is not closed, the system tries to realize postponed goals if a dialogue opportunity (e.g. a dialogue allowance) arrives. When realizing the opportunity, a marker is used to reintroduce the communicative goal if it has been postponed for a long time (&quot;long time&quot; refers to the distance in the discourse structure between the postponement and the point where the goal is reintroduced). This requires the tactical generator to use a special rhetorical formulation.</Paragraph> <Paragraph position="14"> * Abandoning previous goals. The concrete realization of dropping an exchange occurs when goals have been postponed and the transaction to which they belong is closed. The justification is simple : a transaction closing is submitted to the addressee for evaluation. 
If he does not contest this closing then this implicitly allows the drop.</Paragraph> <Paragraph position="15"> Only non crucial exchanges are dropped. If they were crucial to the transaction then they would not have been dropped.</Paragraph> <Paragraph position="16"> [5] Scores may be fuzzy. They only represent the confusion rate which occurs during the lexicalization of the acoustic signal.</Paragraph> <Paragraph position="17"> These communication management acts illustrate the interest of our dialogue model and offer new means to cope with dialogue failure compared with recent techniques (Jullien &amp; Marty, 1989).</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.3 Dialogue strategy modeling </SectionTitle> <Paragraph position="0"> In one running cycle, more than one dialogue act can be a candidate; this is due to the nondeterministic nature of the dialogue, which is preserved until this step. For example, it is possible that the dialogue rules allow the system to take an initiative, evaluate an exchange, or react. Consequently a third layer of rules has been designed in order to select the best candidate according to a general dialogue strategy. As our system is dedicated to oral dialogues, the strategy is firstly oriented toward a systematic confirmation of the system's understandings and secondly, as a general strategy, we decided to avoid too many embedded sub-exchanges.</Paragraph> <Paragraph position="1"> This avoids numerous topic shifts, especially implicit ones. The concrete realization of the latter is done by forcing the user to give explicit answers to problematic goals with utterances like &quot;please answer yes or no&quot;.</Paragraph> </Section> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 A dialogue example </SectionTitle> <Paragraph position="0"> We present here a dialogue example obtained with our system in the field of flight reservations [6]. 
At present, there is no speech recognition system; the user's utterances are entered manually in a predefined format, including hypothesised acoustic scores and deliberate misrecognitions.</Paragraph> <Paragraph position="1"> S1: flight booking service, how can I help you?
[6] ... as we are not able, at present, to deal with them. Question marks mean that intonation rises and commas denote pauses.</Paragraph> <Paragraph position="2"> Normally, the dialogue continues with the acquisition of the passenger name and address, but at present this is not included in the task management.</Paragraph> </Section></Paper>