<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1013">
  <Title>Document Transformations and Information States</Title>
  <Section position="4" start_page="0" end_page="112" type="metho">
    <SectionTitle>
2 Degrees of Interactivity and the Difference between Monologue and Dialogue
</SectionTitle>
    <Paragraph position="0"> difference between monologue and dialogue We take here the position that the main difference between dialogue and monologue is that the former implies interactivity. With interactivity we mean here that the participants can influence each other's moves. With respect to the area that interests us here, giving instructions to repair devices, a traditional written manual influences the user but not vice versa (except through notes to the author).</Paragraph>
    <Paragraph position="1"> The user can, however, influence the order in which she accesses the material: it is easy to stop, to go back or to consult an other section (traditional printed material might be argued to be better in that respect than presentation on a screen, we ignore that difference here).</Paragraph>
    <Paragraph position="2"> We can consider this as a limit case of interactivity. null  Note that interactivity does not necessarily imply shared initiative. The literature makes a distinction between task and dialogue initiative (e.g. (Chu-Carroll and Brown, 1998)) but one can have dialogue with both types of initiative staying with one side. In the cases we discuss below the task initiative stays completely with the manual and the dialogue initiative only switches to the instructee in the case where she can indicate that information about some subprocedures can be skipped.</Paragraph>
    <Paragraph position="3"> There is another dimension that often intervenes in discussions about the difference between dialogue and written discourse: the former is spoken, the latter is written. Given the way things are in a natural setting, the written medium tends not to allow interactivity, whereas the spoken medium is used mainly in interactive settings. Technical changes, however, allow us to separate the written/spoken opposition from that between interactive and non, or minimally, interactive discourse. Instructional material can be presented in the aural mode without becoming more interactive e.g. when a recording is played. This can be considered as a plus for instructional material because it allows the instructee to use her hands and eyes for the task itself but it is not an unqualified advantage given that reading gives much more flexibility than listening to a tape. To cash in on the advantages of the aural presentation, we need to recapture the flexibility of access that the written medium allows.</Paragraph>
  </Section>
  <Section position="5" start_page="112" end_page="113" type="metho">
    <SectionTitle>
3 Instructions and Interactivity
</SectionTitle>
    <Paragraph position="0"> It is obvious that instructional situations profit from an interactive setting. Instructional situations are typically situations in which some participants (the instructors) know a lot that the other participants (the instructees) need to know to achieve the common goals. In these kinds of situations it is important that all the required and, preferably only the required, knowledge gets transferred at the moment the instructees need it.</Paragraph>
    <Paragraph position="1"> To achieve this, it is not enough that the instructor have all the necessary knowledge, she needs also to know which state the instructee is in and how that state changes to adapt the transfer of knowledge, hence the instructee needs to be able to inform the instructor about his state and influence in this way the course of the interaction.</Paragraph>
    <Paragraph position="2"> Currently we have manuals, whose content can be presented aurally or in a written form but where both the content and the presentation are uniquely determined a priori (modulo, the speed and order of reading mentioned above). Or we have interactions that can be at a distance but where a human instructor needs to be available at the time of the action. Making humans with the required competence available is expensive and one would want to achieve some interactivity without this. But computers tend to be frustrating participants in interactive settings when one compares them to human beings and the study of dialogue concentrates mainly on making them as human as possible.</Paragraph>
    <Paragraph position="3"> When one considers the possibility of transferring the interactivity from humans to machines, there are, however, many intermediate possibilities between no interactivity and full blown interactivity in free-wheeling dialogue where the participants can ask each other questions about anything and nothing (for a more thorough discussion about dialogues between humans and computers see (Clark, 1999)). In this paper we consider how minimal interactions can be modeled on the basis of information which is available in traditional instructional manuals.</Paragraph>
    <Paragraph position="4"> In looking at the problem this way one has to keep in mind that instructional manuals, although not interactive, are cooperative constructs: they assume that they participate with the user in a rational cooperative task and they are built on an implicit reader model, specifically they make assumptions about what the user knows and what she doesn't know and the granularity of the task descriptions that they have to provide.</Paragraph>
    <Paragraph position="5"> They obey in their own way Grice's Maxim of Quantity but they need to leave open a range of possibilities so they need to provide more detail than is necessary in all circumstances. In what follows we can only consider  cases of over-informedness as the information needed to remedy under-informedness is not available.</Paragraph>
  </Section>
  <Section position="6" start_page="113" end_page="118" type="metho">
    <SectionTitle>
4 The TRINDI model
</SectionTitle>
    <Paragraph position="0"> The TRINDI project has developed both a framework and a toolkit to model various types of interactions in terms of information state updates. The framework, whose main ingredients are information states, dialogue moves and updates, is described in (Traum et al., 1999). We use the term information state to mean, roughly, the information stored internally by an agent, in this case a dialogue system. A dialogue move engine updates the information state on the basis of observed dialogue moves and selects appropriate moves to be performed. In:formation state updates are formalised as in~brmation state update rules. The importance of the framework is that new interactive :hypotheses can be modeled with minor extensions. The information state approach is implemented in the TRINDIKIT (Larsson et al., 2000); (Larsson and Traum, To appear), a toolkit for experimenting with the implementation of information states and dialogue move engines and for building dialogue systems. It is used in the experimental implementation described here.</Paragraph>
    <Paragraph position="1"> Various instantiations of the framework articulate further what information states, moves, and update rules contain. In this paper we use one formal representation of information states that has been developed in the TRINDI, SDS 2 and INDI 3 projects, and implemented in the GoDiS dialogue system (Bohlin et al., 1999). The central parts of the information state in GoDiS are dialogue plans and Questions Under Discussion (QUD), a notion borrowed from Ginzburg (Ginzburg,  - yes/no - done/don't understand - how? * 3. User can indicate whether she already knows certain (sub)procedures</Paragraph>
    <Section position="1" start_page="113" end_page="114" type="sub_section">
      <SectionTitle>
5.1 GoDiS/IMDiS information states
</SectionTitle>
      <Paragraph position="0"> To model the types of interactions above, we started from the GoDiS system which is designed to deal with information-seeking dialogue. The IMDiS information state type is shown in Figure 1.</Paragraph>
      <Paragraph position="2"> The main division in the information state is between information which is private to the agent and that which is shared between the dialogue participants. The private part of the information state contains a PLAN field holding a dialogue plan, i.e. is a list of dialogue actions that the agent wishes to carry out.</Paragraph>
      <Paragraph position="3"> The plan can be changed during the course of the conversation. The AGENDA field, on the other hand, contains the short term goals or obligations that the agent has, i.e. what the agent is going to do next. We have included a field TMP that mirrors the shared fields. This field keeps track of shared information that has not yet been grounded, i.e.</Paragraph>
      <Paragraph position="4"> confirmed as having been understood by the</Paragraph>
      <Paragraph position="6"> other dialogue participant. The SHARED field is divided into four subfields. One subfield is a set of propositions which the agent assumes for the sake of the conversation. The second subfield is for a stack of questions under discussion (QUD). These are questions that have been raised and are currently under discussion in the dialogue. The ACTIONS field is a stack of (domain) actions which the user has been instructed to perform but has not yet performed.The LU field contains information about the latest utterance.</Paragraph>
      <Paragraph position="7"> To adapt GoDiS to instructional dialogue, we added a subfield of SHARED.ACTIONS to (the shared part of) the information state.</Paragraph>
      <Paragraph position="8"> The value of this field is a stack of actions which the system has instructed the user to perform, but whose performance has not yet been confirmed by the user.</Paragraph>
      <Paragraph position="9"> In building the experimental IMDiS, we have made several simplifications. We have ignored all the natural language generation problems and all the problems related to making text or dialogue natural, e.g. problems related to the use of pronouns and other referential expressions. To handle these we would not only have to discuss basic interactivity but also the medium in which the interaction takes place: speech or written text.</Paragraph>
      <Paragraph position="10"> The monologue mode (case 1) uses only 2 moves (Instruct, and Inform). Since there is no user to confirm that actions have been performed, all actions are automatically confirmed using the update rule autoConfirm.</Paragraph>
      <Paragraph position="11">  (Confirm). Confirmations are integrated by assuming that the current topmost action in SHARED.ACTIONS has been performed, as seen in the update rule below.</Paragraph>
      <Paragraph position="12">  This rule says that if the user performed a Confirm move, which has not yet been integrated, and A is the &amp;quot;most salient&amp;quot; action, then integrate the move by putting the proposition done (A) in the shared beliefs, and taking A off the action stack.</Paragraph>
      <Paragraph position="13"> Elliptical &amp;quot;how&amp;quot;-questions from the user are interpreted as applying to the currently topmost action in the SHARED.ACTIONS stack.</Paragraph>
    </Section>
    <Section position="2" start_page="114" end_page="116" type="sub_section">
      <SectionTitle>
5.2 Domain task, manuals and
dialogues
</SectionTitle>
      <Paragraph position="0"> Let's now see how a monologue and a dialogue version of the same task are related. Below we have an example from the user manual for the HomeCentre, a Xerox MFD.</Paragraph>
      <Paragraph position="1">  * Reinstalling the print head * Caution: Make sure that the green carriage lock lever is STILL moved all the way forward before you reinstall the print head.</Paragraph>
      <Paragraph position="2"> * 1. Line up the hole in the print head with the green post on the printer carriage.</Paragraph>
      <Paragraph position="3"> * Lower the print head down gently into position. * 2. Gently push the green cartridge lock lever up until it snaps into place.</Paragraph>
      <Paragraph position="4">  ter position after you press the cartridge change button, remove and reinstall the print head. From this text, one can (re)construct a task plan for reinstalling the print head. Such a plan may be represented as in figure 2. Note  that this is a conditional plan, i.e. it contains branching conditions.</Paragraph>
      <Paragraph position="5"> From this task plan, IMDiS generates two plans: a monologue plan and a dialogue plan. This is done using the &amp;quot;translation schema&amp;quot; in Figure 3.</Paragraph>
      <Paragraph position="6"> The difference between the text plan and the dialogue plan is in the way that conditionals in the task plan are interpreted. In the monologue plan, they correspond to simply informing the user of the conditional. In dialogue mode, however, the system raises the question whether the condition holds. When the system finds out if the condition holds, it will instruct the user to execute the appropriate guarded action.</Paragraph>
      <Paragraph position="7"> Here we can clearly see how dialogue differs from monologue as viewed by Carlson or Van Kuppevelt ((Carlson, 1983), (~an Kuppevelt, 1995)). Under these views the writer anticipates the questions the user might have asked but given the user is not present the writer has to make up for the lack of interactivity. The questions that can be reconstructed (or accommodated) are different in that case. For instance in the example given here, the question could something like &amp;quot;What should the user/I make sure of?&amp;quot;. These questions are valuable to help figure out the discourse structure of a monologue. They can also be valuable tools to illustrate the differences between dialogue and monologue but they do not give much insight in the effects of various degrees of interactivity.</Paragraph>
      <Paragraph position="8"> Conditionals are treated as follows by the system in dialogue mode: When the system has found out what the user's task is, it will load the appropriate dialogue plan into the PRIVATE.PLAN field of the information state.</Paragraph>
      <Paragraph position="9"> It will then execute the actions in the appropriate order by moving them to the agenda and generating appropriate utterances. When a conditional statement is topmost on the plan, IMDiS will check whether it has been established that the condition holds (by checking the SHARED.BEL field). Since the system has previously asked the user and the user has answered, either the condition or its negation will be in the set of established propositions. If the condition or its negation holds, the conditional will be popped off the plan and replaced by the first or second guarded action (respectively).</Paragraph>
    </Section>
    <Section position="3" start_page="116" end_page="116" type="sub_section">
      <SectionTitle>
5.3 Monologue and Dialogue
Behaviour
</SectionTitle>
      <Paragraph position="0"> In the monologue mode in IMDiS, the control module does not call the input and interpretation modules. The text is output &amp;quot;move by move&amp;quot; as a sequence of utterances from the system.</Paragraph>
      <Paragraph position="1"> S: Reinstalling the print head.</Paragraph>
      <Paragraph position="2"> S: Make sure that the green carriage lock lever is STILL moved all the way forward before you install the print head.</Paragraph>
      <Paragraph position="3"> S: Line up the hole in the print head with the green post on the printer carriage Compared to the monologue mode, even a very restricted dialogue mode offers several advantages: User attention and control The user can direct her attention to the machine and does not have to look at the manual. As we noted in when one goes from written to aural presentation, one gains the advantage that the user has free hands and eyes but if nothing more is done this advantage has to be weighted against the disadvantage that the user looses all control over the order and the speed with which the information is presented. We can avoid these drawbacks by allowing some limited grounding behaviour. Very simple interactions like 'done' (Confirm) or 'don't understand' (RequestRepeat) give back to the user a limited control over the speed and the order of the presentation (at least up to allowing repetition): the user decides when to move on to the next action, by confirming that the previous action is done, and by 'don't understand' she can indicate that she would want a repetition of what was said immediately before. Here we see how to take advantage of the advantages of a different mode of presentation (written versus aural) we also have to change the type of interactivity.</Paragraph>
      <Paragraph position="4"> S: Has the carriage moved from the center position? U: I didn't understand S: Has the carriage moved from the center position? Avoid irrelevant information When the action to be taken depends on a condition, the system does not give irrelevant information. null S: Has the carriage moved from the center position? U: yes S: The print head is now installed Because there is no feedback from the user, a manual has always to give all the possibilities regardless of which one actually pertains. The possibility to ask yes/no questions allows us to do away with this redundancy.</Paragraph>
    </Section>
    <Section position="4" start_page="116" end_page="118" type="sub_section">
      <SectionTitle>
5.4 More complex task plans
</SectionTitle>
      <Paragraph position="0"> In the example above we illustrated how a simple task plan can give rise to a dialogue and a monologue rendering. We can get some added flexibility by giving more structure to the task plan. For instance in the example above, one can argue that the reiustallation proper of the print head is described in point 1 to 2 and that 3 and 4 describe termination  conditions. To reflect this we can revise the task plan as follows: With this structure the user can control the level of detail of the instructions given. If the user does not know how to perform a substep, she can ask the system for more detailed instructions.</Paragraph>
      <Paragraph position="1"> U: done, aud now? S: Close the top cover If the user manages to complete the whole action sequence without instructions, she can tell the system this and the system proceeds to the next relevant action.</Paragraph>
      <Paragraph position="2"> S: put the print head in place U: how? S: Line up the hole in the print head with the green post on the printer carriage U: right, ok S: Lower the print head down gently into position U: ok S: Gently push the green cartridge lock lever up until it snaps into place U: ok S: The print head is now securely in place On the other hand, if the user already knows how to perform a substep, the system moves on to the next step.</Paragraph>
      <Paragraph position="3"> S: put the print head in place S: put the print head in place U: how? S: Line up the hole in the print head with the green post on the printer carriage U: right, done S: Lower the print head down gently into position U: done, I remember now (pause) the print head is put in place S: Ok. Close the top cover Here, however, we see the importance of the task structure. It is only if we have information that gives the structure of the task with subtasks that we can model this. Very often instructional manuals will give this substructure, e.g. in the form of subdivisions of instructions, but they tend not to be corn- null pletely consistent in this. It is only when this information is given in a consistent way that we can exploit it in a transformation from a written manual presentation to a more interactive presentation.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>
Download Original XML