<?xml version="1.0" standalone="yes"?>
<Paper uid="H01-1004">
  <Title>Amount of Information Presented in a Complex List: Effects on User Performance</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. METHODS AND PROCEDURES
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Subjects
</SectionTitle>
      <Paragraph position="0"> Sixty-four subjects were run at a local shopping mall over a five-day period. Subjects were recruited from the shoppers frequenting the mall.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Wizard of Oz
</SectionTitle>
      <Paragraph position="0"> A Wizard of Oz (WOZ) experiment was run to determine the optimal way for the end-user to select a desired itinerary in the Communicator project.</Paragraph>
      <Paragraph position="1"> A Wizard of Oz experiment is one in which no real automatic speech recognition (ASR) or natural language understanding (NLU) is used. Instead, the user interface is prototyped and a 'wizard,' or experimenter, acts in place of the ASR and NLU. Consequently, subjects believe that ASR/NLU is being used. The WOZ methodology allows competing user interface strategies to be prototyped and tested with end users in a shorter period of time than would be required to implement multiple fully-functioning systems with competing user interfaces.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 Apparatus &amp; Materials
</SectionTitle>
      <Paragraph position="0"> Relevant aspects of the AT&amp;T Communicator user interface were prototyped using the Unisys Natural Language Speech Assistant (NLSA) software. NLSA runs on a PC using the Windows NT operating system. Subjects called into the Communicator prototype using an analog telephone and interacted with the system by voice. The wizard categorized the subject's speech using the NLSA Wizard graphical user interface (GUI). Each subject completed five pen-and-paper surveys. During the course of the experiment, subjects also had access to a pad of paper.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.4 Experimental Design
</SectionTitle>
      <Paragraph position="0"> All itineraries presented to the subjects were round-trip.</Paragraph>
      <Paragraph position="1"> This was a factorial experiment with two factors, one between subjects and the other within subjects (see Table 1). Selection Itinerary Content. There were two levels of this between-subjects factor: --Terse. The presented itineraries included airline, number of stops, and departure time. To get additional information, the user could ask the system questions (e.g. &quot;When does that flight arrive?&quot;).</Paragraph>
      <Paragraph position="2"> --Verbose. The presented itineraries included airline, flight number, number of stops, departure time, and arrival time. All the information relevant to the tasks specified in the experiment was presented about each flight; the user did not need to ask questions to get additional information.</Paragraph>
      <Paragraph position="3"> Number of Flights Before Question. Each level of this within-subjects factor is actually a combination of two separate, but related, factors.</Paragraph>
      <Paragraph position="4"> --Combined vs. Separate. Whether outbound and return flights are presented separately or in combination.</Paragraph>
      <Paragraph position="5"> --Number of flights. The number of flights that are presented before asking the subject to make a decision.</Paragraph>
      <Paragraph position="6"> Four levels of this factor were chosen. In all cases (1) the total number of flights 'found' was 5, and (2) the question was, &quot;Would you like to hold [that flight/any of those flights]?&quot; --Separate 1. The outbound and return flights of the trip are presented separately and after each flight the subject is asked the question.</Paragraph>
      <Paragraph position="7"> --Separate 3. The outbound and return flights of the trip are presented separately and after the third flight the subject is asked the question.</Paragraph>
      <Paragraph position="8"> --Separate 5. The outbound and return flights of the trip are presented separately and after the last flight the subject is asked the question.</Paragraph>
      <Paragraph position="9"> --Combined. The outbound and return flights of the trip are presented at the same time and after each set of two flights the subject is asked the question.</Paragraph>
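The two factors described above can be laid out as a small sketch. This is only an illustration of the design as described in the text; the names CONTENT_FIELDS, PRESENTATION_LEVELS, design_cells, and extra_fields_in_verbose are hypothetical and do not come from the study materials.

```python
from itertools import product

# Fields read out per flight in each content condition (taken from the text).
CONTENT_FIELDS = {
    "Terse": ["airline", "number of stops", "departure time"],
    "Verbose": ["airline", "flight number", "number of stops",
                "departure time", "arrival time"],
}

# Levels of the Number of Flights Before Question factor (taken from the text).
PRESENTATION_LEVELS = ["Separate 1", "Separate 3", "Separate 5", "Combined"]


def design_cells():
    """Return every (content, presentation) cell of the 2 x 4 design."""
    return list(product(CONTENT_FIELDS, PRESENTATION_LEVELS))


def extra_fields_in_verbose():
    """Fields a subject in the Terse condition would have to ask about."""
    terse = set(CONTENT_FIELDS["Terse"])
    return [f for f in CONTENT_FIELDS["Verbose"] if f not in terse]


if __name__ == "__main__":
    for content, presentation in design_cells():
        print(content, "/", presentation)
    print(extra_fields_in_verbose())
```

Crossing the two content levels with the four presentation levels gives eight cells; each subject sees one content level and all four presentation levels.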
      <Paragraph position="10">  Example. The following example could have been used in the Separate 3 condition. Text that is unformatted is common to both the terse and verbose conditions. Text in italics is found only in the verbose condition.</Paragraph>
      <Paragraph position="11"> &quot;I found 5 outbound Delta flights. Option 1 is flight number 323. It's a non-stop leaving at 9:10 and arriving at 2:01. Option 2 is flight number 798. It has one stop; it departs at 11:13 and arrives at 5:07. Option 3 is flight number 295. It has two stops; it departs at 1:52 and arrives at 6:57. Would you like to hold any of those flights?&quot; [Table 1. Experimental design: Separate 1, Separate 3, Separate 5, Combined 2.] The dialog strategy was mixed initiative. The first prompt was open-ended, e.g. &quot;How may I help you with your travel plans?&quot; All subsequent prompts requested specific information from the user (e.g. &quot;What date did you want to depart?&quot;). The prototypes were built to allow the user to provide multiple informational elements (e.g. departure city and departure date) to either open-ended or specific requests. Subsequent steps in the flow of control could be skipped if multiple pieces of information were presented at a single dialog point.</Paragraph>
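The step-skipping behavior of the mixed-initiative dialog can be sketched as a simple slot-filling loop. This is a minimal sketch under stated assumptions, not the NLSA prototype: the slot names, the slot order, and the two city prompts are hypothetical, and only the open-ended prompt and the departure-date prompt are quoted from the text.

```python
# Hypothetical slot order and prompts. Only OPEN_PROMPT and the
# departure-date prompt are quoted from the text; the rest are assumptions.
OPEN_PROMPT = "How may I help you with your travel plans?"
SLOT_PROMPTS = [
    ("departure_city", "What city did you want to depart from?"),
    ("arrival_city", "What city did you want to fly to?"),
    ("departure_date", "What date did you want to depart?"),
]


def next_prompt(filled, first_turn=False):
    """Return the next prompt, skipping slots the user already supplied."""
    if first_turn:
        return OPEN_PROMPT
    for slot, prompt in SLOT_PROMPTS:
        if slot not in filled:
            return prompt
    return None  # every slot is filled; flight presentation can begin


def update(filled, categorized):
    """Merge the slots categorized from one utterance into the dialog state."""
    merged = dict(filled)
    merged.update(categorized)
    return merged


if __name__ == "__main__":
    state = {}
    print(next_prompt(state, first_turn=True))
    # The user answers the open prompt with two pieces of information at once.
    state = update(state, {"departure_city": "Boston",
                           "departure_date": "November 10th"})
    print(next_prompt(state))  # only the arrival city is still missing
```

Because the next prompt is chosen from the slots still missing, a user who supplies departure city and date together is never asked for either again.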
      <Paragraph position="12">  Each subject was asked to complete four tasks in the course of this experiment. In each task the subject was given a set of criteria that the subject had to meet in selecting both an outbound and a return flight. The tasks used in this experiment exercise selection criteria that are representative of selection criteria typically used by individuals actually purchasing airline tickets. The four tasks given to subjects follow: Departure Only. The task criteria for both the outbound and return flights require the subject to choose flights based on departure time only.</Paragraph>
      <Paragraph position="13"> Arrival Only. The task criteria for both the outbound and return flights require the subject to choose flights based on arrival time only.</Paragraph>
      <Paragraph position="14"> Departure &amp; Arrival. The task criteria require the subject to choose the outbound flight based on departure time and the return flight based on arrival time.</Paragraph>
      <Paragraph position="15"> Specific Flight. The task requires the subject to book a particular flight for both the outbound and return flights.</Paragraph>
      <Paragraph position="16"> Example. The following example was used for the Departure &amp; Arrival task (it has been edited for presentation here). You want a round trip ticket from Boston to Charleston. You want to leave Boston about 5 in the evening of Friday November 10th. You want to arrive in Boston no later than 8 PM on Tuesday November 14th.</Paragraph>
      <Paragraph position="17"> An important selection criterion for many purchasers of airline tickets is price. The price of the ticket was not a selection criterion used in this experiment because it would introduce possible confounds: many users are willing to trade off other important selection criteria, e.g. arrival time and departure time, in order to minimize price. Therefore, it was decided, a priori, to postpone the use of price as a selection criterion to a later experiment.</Paragraph>
      <Paragraph position="18">  A Balanced Greco-Latin Square was used to counterbalance the orders of the conditions and tasks.</Paragraph>
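One way to build a 4 x 4 Greco-Latin square for counterbalancing is sketched below. It is an assumption-laden illustration: the text does not give the square actually used, and this construction (two orthogonal Latin squares over the four-element field) does not by itself guarantee the additional balance properties that "Balanced" may refer to. The names GF4_TIMES_ALPHA and greco_latin_square are hypothetical.

```python
# Multiplication by a fixed non-trivial element of the four-element field,
# with field elements encoded as 0..3 (addition in that field is bitwise XOR).
GF4_TIMES_ALPHA = [0, 2, 3, 1]

# Level and task names follow the text, with "and" spelled out in one name.
CONDITIONS = ["Separate 1", "Separate 3", "Separate 5", "Combined"]
TASKS = ["Departure Only", "Arrival Only",
         "Departure and Arrival", "Specific Flight"]


def greco_latin_square():
    """Return a 4 x 4 grid of (condition, task) pairs.

    Each row is one presentation order; each column is a serial position.
    """
    square = []
    for i in range(4):
        row = []
        for j in range(4):
            condition = CONDITIONS[i ^ j]
            task = TASKS[GF4_TIMES_ALPHA[i] ^ j]
            row.append((condition, task))
        square.append(row)
    return square


if __name__ == "__main__":
    for order in greco_latin_square():
        print(order)
```

In the resulting grid every condition and every task appears once per row and once per column, and each condition-task pairing occurs exactly once overall.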
      <Paragraph position="19"> A rich set of dependent measures was gathered in this experiment: -- After each system prompt was played, NLSA recorded what subjects said.</Paragraph>
      <Paragraph position="20"> -- At the end of each task, the wizard determined whether that task was successfully completed.</Paragraph>
      <Paragraph position="21"> -- At the end of each task, subjects completed paper and pen surveys rating the overall dialog for that task.</Paragraph>
      <Paragraph position="22"> -- After experiencing all four tasks, subjects told the experimenter which of the flight selection criteria were important to them.</Paragraph>
      <Paragraph position="23"> Objective measure. Successful task completion was the one objective measure used in determining the optimal method for presenting complex lists in an audio-only domain. For each task the subject was given a set of required criteria for selecting both an outbound and a return flight. Task completion was binary, successful or unsuccessful, and was determined by the experimenter (wizard) at the time the subject completed each task. To complete a task successfully, the subject had to select both the outbound and the return flight that best fit the criteria given in the task description. Subjective measures. Other data gathered in this experiment included a number of subjective measures. After each task, subjects were asked: Overall, how satisfied were you with AT&amp;T Communicator while booking this flight? [1] Much Too Fast [2] A Little Too Fast [3] Just the Right Speed [4] A Little Too Slow [5] Much Too Slow After you told Communicator the date and time to book your flight, Communicator responded with possible flights to choose from. For EACH of the possible flights, did Communicator present the right amount of information? [1] Too Much Information about Each Flight [2] Just the Right Amount of Information About Each Flight [3] Too Little Information about Each Flight After completing all four tasks, subjects were asked to (1) rank-order the criteria they personally use when selecting between multiple itineraries, and (2) specify the information that Communicator should present about every flight for selection purposes in the future.</Paragraph>
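The binary completion rule described above can be written down as a short sketch. In the experiment the wizard made this judgment by hand; the function names, the dictionary layout, and the option numbers in the usage lines are hypothetical and are not data from the study.

```python
def task_completed(selected, best_fit):
    """Binary success: both legs must match the best-fitting flights."""
    return (selected["outbound"] == best_fit["outbound"]
            and selected["return"] == best_fit["return"])


def completion_rate(outcomes):
    """Proportion of successful tasks in a list of booleans."""
    if not outcomes:
        raise ValueError("no task outcomes to summarize")
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    # Made-up option numbers, for illustration only.
    best = {"outbound": 2, "return": 4}
    print(task_completed({"outbound": 2, "return": 4}, best))
    print(task_completed({"outbound": 2, "return": 1}, best))
```

Selecting the right outbound flight but the wrong return flight counts as unsuccessful, which is what makes the measure strictly binary.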
    </Section>
  </Section>
</Paper>