<?xml version="1.0" standalone="yes"?> <Paper uid="W94-0324"> <Title>Generating Cooperative System Responses in Information Retrieval Dialogues</Title> <Section position="3" start_page="0" end_page="211" type="intro"> <SectionTitle> 2 Theoretical Framework 2.1 Related Work </SectionTitle> <Paragraph position="0"> Research activities in several areas, such as NLG, dialogue modeling, information retrieval and multimedia interfaces, played an important role in motivating our work. Two streams of research were particularly interesting in our context: on the one hand the incorporation of dialogue models in natural language systems, on the other hand the extension of RST and its application to (not exclusively natural-language-based) dialogue systems. The following gives some examples of work done in these two areas.</Paragraph> <Paragraph position="1"> As part of the Communal project (cf. Fawcett et al., 1988), which includes generation as well as understanding of natural language, a dialogue model called SFM (Systemic Flowchart Model) was developed. It uses a discrimination network to describe situations and actions that can occur in a dialogue. Because many different speech acts (based on Searle, 1969) and speech act sequences were to be considered, the network is quite complex. Attempts were also made to integrate this model with RST (cf. Fawcett and Davies (1992) and section 2.1.2).</Paragraph> <Paragraph position="2"> A system capable of conducting spoken dialogues with a user was proposed by Smith, Hipp and Biermann (1992). Its domain is the maintenance of electrical appliances, and the emphasis in this approach lies on (nested) communicative goals, and concepts such as intentional, attentional and linguistic structures (Grosz and Sidner, 1986).</Paragraph> <Paragraph position="3"> Another system for the treatment of spoken dialogues is reported in Bilange (1991). 
The approach, which has been developed in the framework of the SUNDIAL project,</Paragraph> <Section position="1" start_page="207" end_page="209" type="sub_section"> <SectionTitle> 7th International Generation Workshop * Kennebunkport, Maine * June 21-24, 1994 </SectionTitle> <Paragraph position="0"> is based on the assumption that dialogues can best be described by means of a multi-level approach. The author distinguishes four levels: a domain-specific transaction level, an exchange level, an intervention level modeling initiative, reaction and evaluation, and finally a level consisting of dialogue acts. The system has been developed for the domain of flight reservations.</Paragraph> <Paragraph position="1"> Plan-oriented approaches for dialogue modeling are described in Litman and Allen (1987) and Lambert and Carberry (1992). Both approaches distinguish domain or task, problem-solving and discourse levels. Knowledge from the various levels is employed to solve the task of plan recognition in dialogues. Connections exist not only between the various levels but also between elements within one level. These links are modeled either as discourse plans which follow the course of the interaction (e.g., CONTINUATION, CLARIFICATION and TOPIC-SHIFT, see Litman and Allen (1987)) or as discourse actions that link an utterance with the context. A similar distinction of various levels of representation is made in O'Donnell (1990), except that the links between various elements of his discourse or exchange model are not made explicit.</Paragraph> <Paragraph position="2"> The EES project (Explainable Expert Systems) is the basis for the theoretical and practical work of Moore and Paris (1989), Moore and Swartout (1990), and Carenini and Moore (1993). 
A central goal in EES was the creation of a flexible explanation module for expert systems communicating in natural language, allowing the user to ask questions about explanations given by the system and generating appropriate system responses. Strategies incorporating parameters such as context and focus were used to disambiguate the user's utterances. To address the intentions behind utterances and communicative goals, concepts of RST (cf. Mann and Thompson, 1987) and Speech Act Theory were exploited. The focus, however, is on flexible explanation dialogues and, unlike COR-RST, not on modeling information-seeking dialogues as complex &quot;negotiations&quot; with flexible ways to withdraw and reject dialogue contributions.</Paragraph> <Paragraph position="3"> The Intelligent Documentation Advisory System (IDAS), developed by Reiter, Mellish and Levine (1992), represents an attempt to use dynamically generated natural language in the framework of an information retrieval system using hypertext techniques. To obtain information, the user clicks on the object under consideration and then chooses one out of a list of request options displayed by the system. However, there is no dialogue model at all; the system only allows simple query - answer cycles.</Paragraph> <Paragraph position="4"> The concepts of speech acts in combination with RST are used in several multimedia presentation systems. Among the first systems following this approach were the WIP system developed at DFKI (see André and Rist (1993)) and the system developed by Maybury (1991). However, neither the WIP project nor Maybury's system uses high-level dialogue structures. As pointed out in Arens et al. (1993), global structures are necessary for establishing overall coherence in the context of multimedia interfaces. COR-RST takes this into account. 
It focuses on Natural Language Generation, yet allows the extension to multimodal dialogue acts.</Paragraph> <Paragraph position="5"> In the following two sections the two theoretical approaches (COR and RST) that were most influential for our work will be presented. The COR model in particular will be described in detail, because it is essential for understanding our approach.</Paragraph> <Paragraph position="6"> In the field of information retrieval (IR) the interactive and communicative aspects of IR have only recently been emphasized (cf. for example, Belkin and Vickery, 1985; Belkin et al., 1993). There exist approaches to distinguish various types of information retrieval strategies and tactics (cf. Bates, 1979), task hierarchies and global phases of the interaction. However, no elaborate interaction models are provided in this field (except simplistic iterative question-answer models). In the area of conversational analysis and discourse theory, on the other hand, we find various discourse and dialogue models which address local dialogue structures (e.g., Fawcett et al., 1988; Grosz and Sidner, 1986, 1990; Reichman, 1985).</Paragraph> <Paragraph position="7"> To be able to design a flexible dialogue system which can engage in cooperative information-seeking dialogues we use the &quot;Conversational Roles&quot; model (COR) developed by Sitter and Stein (1992). It has been used to design the interface of a multimedia information system, called MERIT (cf. Stein et al., 1992). 
The COR model was originally influenced by the &quot;Conversation for Action&quot; (CfA) model (Winograd and Flores, 1986) which was applied to design computer-aided human-human interactions.</Paragraph> <Paragraph position="8"> By adopting basic concepts of speech act theory and existing discourse models, and extending the CfA model for the situation of information-seeking human-computer interactions, the COR model shows the following features: * it depicts the interaction as a cooperative two-party &quot;negotiation&quot; where commitments (to supply information or meta-information) can be made, retracted or rejected; * it permits mixed-initiative dialogues and is flexible enough to describe all possible - even extremely complex - interaction patterns (this includes the temporary role changes of information seeker/information provider, which frequently occur in highly vague task settings such as information-seeking); * it provides the means for an explicit representation of the dialogue history in an abstract form, i.e., disregarding the interaction mode (graphical, linguistic, mixed).</Paragraph> <Paragraph position="9"> According to COR the two participants (A and B) have dialogue goals and pursue specific conversational tactics to achieve these goals. The speaker's and addressee's mutual expectations about possible responses and about the subsequent course of the dialogue are essential. The model can be represented as a recursive state-transition network (see figure 1). The network defines the full potential of all possible interactions where successfully completed dialogue contributions/acts end in specialized states (circles). Transitions (arcs) represent the various types of dialogue acts: e.g., REQUEST, OFFER, and INFORM can exactly be mapped onto Searle's basic &quot;illocutionary types&quot;: directives, commissives, and assertives (cf. 
Searle, 1979); the other generic acts belong to the same categories, but they are less significant because they are merely responsive, e.g., PROMISE is a commissive act, but it adopts the conditions of action expressed by the preceding REQUEST (cf. Sitter and Stein, 1992). The traversal of the graph stops in states which are marked by squares. State < 5 >, for instance, is reached when the information seeker (A) has expressed contentment with the given information and quits the dialogue. States < 6 > to < 11 > are also terminal states, but here the information need could not be satisfied. Note that a dialogue which ends in one of these states can be well-formed, cooperative and complete (e.g., B rejects a request of A, because the requested information could not be retrieved).</Paragraph> <Paragraph position="10"> The bold arcs leading from state < 1 > to state < 5 > denote two &quot;idealized&quot; courses of the interaction which follow the basic role-expectations or role assignments. The two initiative acts (REQUEST, OFFER) typically establish new conditions of action, whereas the subsequent acts are reactive and do not introduce new conditions (PROMISE, ACCEPT, INFORM, and BE-CONTENTED are all &quot;expected&quot; in that they are positive responses to the preceding acts).</Paragraph> <Paragraph position="11"> Information-seeking dialogues encountered in practice, however, often do not follow such a simple, linear conversational development. Directive acts (REQUEST, ACCEPT) can be rejected by the addressee, commissives (OFFER, PROMISE) are often withdrawn. Both rejections and withdrawals (called &quot;alternative&quot; acts or responses) can either lead back to state < 1 >, where the dialogue is entered again (beginning of a new dialogue cycle), or to a terminal state (definite REJECT, WITHDRAW).</Paragraph> <Paragraph position="12"> Another way of departing from the linear course of interaction is even more important. 
So far we have only considered &quot;atomic&quot; dialogue acts which are not further decomposed. But consider the following situations which often occur in information-seeking interactions: if the meaning of an utterance (atomic act) has not been understood by the addressee, or if she needs additional information to be able to proceed, an embedded clarification dialogue might be necessary. In order to resolve this problem, the transitions in figure 1 must not be interpreted as atomic acts but as structured dialogue contributions.</Paragraph> <Paragraph position="13"> The extended COR model, therefore, defines basically two types of subnetworks: figure 2 displays the net of an INFORM contribution; figure 3 shows a representative net for all other types of contributions (here: REQUEST as an example). Thus, recursion is taken into account.</Paragraph> <Paragraph position="14"> In figures 2 and 3 &quot;A: request&quot; and &quot;A: inform&quot; denote atomic acts, whereas a suffix notation indicates structured contributions, for example: ASSERT(A,B) or DIALOGUE(B,A, solicit context information). The traversal of the INFORM net is quite simple: A's INFORM act can be followed by a subdialogue initiated by B (e.g., by a REQUEST such as a clarifying question), or, if B does not need additional information, state < c > is reached immediately (jump). Thus, A's (&quot;nuclear&quot;) INFORM act might be sufficient, whereas the (&quot;satellitic&quot;) subdialogue is optional and depends on B's decision.</Paragraph> <Paragraph position="15"> The transition net given in figure 3 is more complex but follows the same principles. 
The subdialogues are also optional; A has two possibilities to start in state < a >: * A may start with a nuclear act (here, a REQUEST for information, such as a query to the database) and then has the opportunity to supply additional context information (e.g., an assertion to explain, specify or illustrate the request). If A does not give this context information voluntarily, B may enter a subdialogue to solicit the required context information. Thus, the ASSERT transition and the DIALOGUE leading from state < b > to < c > have a similar function, both being optional, i.e., satellitic.</Paragraph> <Paragraph position="16"> * A may start with an ASSERT (entering the subnet type displayed in figure 2) to give context information concerning her request; she may add the explicit REQUEST immediately afterwards, or may skip the explicit utterance (jump), in case she believes that B is able to infer the intended request from the context. If B is not able to identify the request, B has the option to initiate a clarification DIALOGUE.</Paragraph> <Paragraph position="17"> The COR model focuses on the illocutionary aspects of the conversation and abstracts away from the specific propositional content of dialogue contributions. However, it has been recognized that COR in its first version was only a partial model which had to be further enhanced by addressing rhetorical and semantic aspects (cf. Maier and Sitter, 1992; Stein and Maier, 1993). This was recently verified by Fischer (1993) who used the COR model to analyze a corpus of real dialogues between humans (information seekers communicating with information brokers to prepare a database search).</Paragraph> <Paragraph position="18"> Among the theories for modeling discourse, Rhetorical Structure Theory (RST; cf. 
Mann and Thompson, 1987) is the theory most exploited for natural language processing, particularly for natural language generation.</Paragraph> <Paragraph position="19"> RST is a theory which describes the structure of written monologues. One of the most basic assumptions of RST is that coherence can be modeled by means of named relations which hold between adjacent text units. Such relations can be used to structure texts by applying them iteratively, thereby composing complex text units out of smaller ones.</Paragraph> <Paragraph position="20"> Another assumption of RST is that, in general, a relation imposes an asymmetric structure on the connected text units. For a given pair of related text units, the so-called nucleus corresponds to the unit which contains highly relevant information, while the satellite carries less significant information; the satellite can be either substituted or left out without significantly changing the overall meaning of the discourse.</Paragraph> <Paragraph position="21"> Since RST relations have been specified in a semi-formal way, this theory was a good candidate for a computational specification of coherence and later for an implementation of text planning and text generation systems.</Paragraph> <Paragraph position="22"> Recently, attempts have been made to use this theory for modeling dialogues as well, in particular for modeling the connections both within and between dialogue contributions, in contrast to approaches that use RST only to model links within a dialogue contribution (e.g., Moore and Paris (1993)). Nearly all approaches (see, for example, Fawcett and Davies, 1992, and Daradoumis, 1993) are in the area of human-computer interaction, where a generation component is responsible for the automatic production of system utterances. Another approach, which was developed for the domain of information-seeking dialogues, is reported in Maier and Sitter (1992). 
The authors showed that in such dialogues a specific subset of relations, the so-called interpersonal relations (see Maier and Hovy (1991)), are used. This classification of relations is based on three types of meaning as distinguished in Halliday (1985): ideational, interpersonal and textual meaning. Ideational meaning is the representation of experience of the world. Interpersonal meaning refers to what the speaker or writer does in order to address the goals of the recipient. Textual meaning, finally, relates pieces of discourse to the context and indicates how the discourse structure has to be interpreted. Interpersonal relations therefore have in common that they mainly address features of the discourse participants. Among these relations we find, for instance, JUSTIFICATION, where the satellite provides reasons why the speaker or the listener should carry out actions specified in the nucleus, or EVALUATION, where the satellite presents a subjective account of the information given in the nucleus.</Paragraph> </Section> <Section position="2" start_page="209" end_page="210" type="sub_section"> <SectionTitle> 2.2 Integration of COR and RST </SectionTitle> <Paragraph position="0"> To find out how COR and RST can be integrated for modeling information-seeking dialogues, a corpus of dialogue transcripts - obtained from Prof. Saracevic, Rutgers University, New Jersey - was carefully analyzed. The transcribed dialogues were conversations between a person seeking information and an information broker specialized in database search.</Paragraph> <Paragraph position="1"> The transcripts contained oral communication with frequent syntactical mistakes, incomplete and halfway reformulated sentences. Our approach was not to try to model these attributes of dialogues. 
Instead, the utterances were adapted to match written, error-free text.</Paragraph> <Paragraph position="2"> First, a COR analysis was carried out, resulting in a segmentation of the transcribed dialogues into acts, contributions and whole dialogue cycles. Then these dialogue elements were assigned an illocutionary point and a nucleus or satellite status.</Paragraph> <Paragraph position="3"> Taking these results, an RST analysis was performed.</Paragraph> <Paragraph position="4"> The existing segmentation into acts, contributions and dialogue cycles was used to create the text spans that make up the constituents - nuclei and satellites - of the RST analysis. In our analyses no major problems were encountered by applying RST to dialogues, even though RST was developed for monologues only.</Paragraph> <Paragraph position="5"> There is a relatively small number of typical RST relations that connect pairs of dialogue acts of the basic COR model which does not incorporate the recursive structure for subdialogues. The most important ones are SOLUTIONHOOD, EVALUATION, EVALUATION* and BACKGROUND. Note that EVALUATION* is a newly defined relation that inherits aspects of EVALUATION; it will be described later in this section.</Paragraph> </Section> <Section position="3" start_page="210" end_page="211" type="sub_section"> <SectionTitle> 7th International Generation Workshop * Kennebunkport, Maine * June 21-24, 1994 </SectionTitle> <Paragraph position="0"> How these relations apply to illocutionary acts as defined in the COR model is shown in a sample dialogue and its RST analysis (figures 4 and 5). As can be seen there, in the first phase of the dialogue an act of type REQUEST (for information) gets acknowledged positively by a PROMISE (to search for information and to present what has been found). 
The nucleus of the EVALUATION relation holding between the two acts is the REQUEST.</Paragraph> <Paragraph position="1"> In the second stage, the requested information is given. REQUEST and PROMISE become the satellite of a new relation which has an INFORM act as its nucleus. The suitable relation here is SOLUTIONHOOD, since the INFORM carries the answer to the REQUEST. Finally, the appropriateness of the provided data is confirmed by an act of type BE-CONTENTED. The recursively determined nucleus of the whole dialogue turn is the INFORM act. This is in line with what can be expected for information retrieval dialogues: the presentation of information to be looked for is the most central part of the whole dialogue. Initiative acts (REQUEST, OFFER) are also important, but they merely open the structure span which is completed when the INFORM act is given, or when it is REJECTed or WITHDRAWn.</Paragraph> <Paragraph position="2"> Several dialogue cycles which appear within one level of dialogue are usually connected by the BACKGROUND relation (not shown here, see Fischer (1993) for examples). In the COR model, roles - or rather expectations of specific role behavior - are essential. Some acts are expected, while others are not; the latter ones are called alternative. The following describes how the expectation of certain dialogue acts influences their status as a constituent within an RST relation, i.e., whether they are considered the nucleus or the satellite of the relation.</Paragraph> <Paragraph position="3"> On the dialogue level (see figure 1), the acts REQUEST, OFFER, and INFORM are most important. They actually contribute to the progression of the dialogue insofar as they actively model the negotiation of information. Other acts are merely evaluations of these three acts; they can be either positive or negative. The positive ones include ACCEPT, PROMISE, and BE-CONTENTED. 
The negative ones are WITHDRAW, REJECT, and BE-DISCONTENTED.</Paragraph> <Paragraph position="4"> The acts ACCEPT, PROMISE, and BE-CONTENTED are all expected ones. In our corpus, between any of these acts and their respective preceding acts the (positive) EVALUATION relation holds. Since the EVALUATION relation defines the constituent that contains the evaluating expression as the satellite of the relation and the evaluated expression as the nucleus, this behavior matches features of the COR model: The evaluated dialogue acts are REQUEST, OFFER, or INFORM, which make significant propositional contributions to the dialogue. For example, in the REQUEST-PROMISE pair of dialogue acts, REQUEST is the nucleus and PROMISE the satellite - the latter one being the expected positive acknowledgement.</Paragraph> <Paragraph position="5"> Evaluations that are not expected are WITHDRAW, REJECT, and BE-DISCONTENTED. They give the dialogue an alternative turn. This means that the evaluation is of high relevance and overrides the importance of previous acts. In order to model this fact by means of RST, an alternative evaluation relation has to be introduced, which defines the evaluating expression to be the nucleus of the relation, in contrast to its definition as given above. We called this relation EVALUATION* because it resembles EVALUATION, except that it swaps the roles of the involved constituents.</Paragraph> <Paragraph position="6"> The extended COR model contains complex dialogue contributions with a recursive structure (see figures 2 and 3). These complex acts have two constituents. One of these constituents expresses the illocutionary point of the whole contribution. Therefore, it takes on the role of the nucleus. Examples are: A: INFORM, A: REQUEST, A: OFFER. 
The other constituent is either an ASSERT contribution (atomic or complex) or a DIALOGUE to negotiate the contextual information (see also section 2.1.1).</Paragraph> <Paragraph position="7"> The fact that ASSERT and DIALOGUE serve the same purpose in COR - namely to provide context information - also had to be modeled adequately in terms of RST.</Paragraph> <Paragraph position="8"> This was achieved by simply assigning the INFORM act to be the nucleus of the whole DIALOGUE. Both ASSERT and INFORM may contain the same proposition (contextual information). The only difference is the way to get to this state: either A gives the information voluntarily (ASSERT) or B initiates a sub-dialogue to ask for the information (DIALOGUE).</Paragraph> <Paragraph position="9"> Concerning the question about the types of relations typically holding between the nuclear act - carrying the illocutionary point of the compound contribution (REQUEST, for example) - and the accompanying satellitic (assertive) act, our analyses resulted in finding two distinct types: (1) The first type is for additional information that is needed in order to answer a question or to understand a certain statement. In our genre of dialogues, this information is obtained by applying Information Retrieval Tactics. (2) The second type is for supplementary information that explains the underlying reasoning behind an act made by a dialogue participant. We call this information type meta-information. The two relation types outlined above are now described in more detail.</Paragraph> <Paragraph position="10"> Context relation type 1: Information Retrieval Tactics In information-seeking dialogues, it is very uncommon that questions can be answered immediately. In most cases additional information is necessary. 
One way</Paragraph> </Section> <Section position="4" start_page="211" end_page="211" type="sub_section"> <SectionTitle> 7th International Generation Workshop * Kennebunkport, Maine * June 21-24, 1994 </SectionTitle> <Paragraph position="0"> to do this is to use Information Retrieval Tactics. Their purpose is to procure the data needed to successfully answer a request. In the transcribed dialogues, the tactics were employed by the information broker. An intelligent information retrieval system should also be able to handle at least some of these tactics. Tactics include replacement or addition of search terms, pursuing or neglecting search paths; for a detailed collection see Bates (1979).</Paragraph> <Paragraph position="1"> The short dialogue shown in figure 6 exemplifies the use of the tactic SUPERTERM, which is used to replace a specific term by a more general one. The system cannot give an appropriate answer to the original request for all known EC-funded projects involving &quot;NL Generation&quot; because that search item is not contained in the database. In order to get some relevant entities, it tries to extend the search by replacing the original search term by a superterm. Therefore, it initiates a subdialogue asking the user to provide another term, and the user chooses &quot;Natural Information retrieval tactics can be integrated into RST without significant problems. They are subsumed by existing RST relations, such as GENERAL-SPECIFIC and ABSTRACT-INSTANCE. We have observed that the most common tactics used in Information Retrieval are in fact forms of (the very broad) ELABORATION relation. 
A more exhaustive analysis regarding the mapping of tactics to RST relations is required for a complete picture.</Paragraph> <Paragraph position="2"> Context relation type 2: Meta-Information Apart from the contextual information concerned with the contents of the database, there is a second type of context which deals with the underlying reasoning behind utterances. Stating this context explicitly makes the underlying reasoning transparent to the dialogue partner.</Paragraph> <Paragraph position="3"> An information retrieval system can enhance the confidence of the information seeker by giving this meta-information. There are several RST relations available for connecting assertive acts with meta-information to the nuclear act.</Paragraph> <Paragraph position="4"> Examples are CAUSE, PURPOSE and INSTRUMENT.</Paragraph> <Paragraph position="5"> The RST analysis of the sample dialogue given above is shown in figure 7. It shows examples for tactics (SUPERTERM) and meta-information (CAUSE). Note that the context relations connect dialogue acts (REQUESTs in this example) with complete (sub-) dialogues, to be precise: with the respective INFORM act within these subdialogues. Consequently, at some stage of the dialogue, no real relation holds between two adjacent dialogue acts, e.g., two consecutive REQUESTs.</Paragraph> <Paragraph position="6"> For the text generation processes, however, this is not a drawback. The COR model itself transposes the RST structure given for the whole sub-dialogues to single dialogue acts.</Paragraph> </Section> </Section> </Paper>