<?xml version="1.0" standalone="yes"?>
<Paper uid="H01-1004">
  <Title>Amount of Information Presented in a Complex List: Effects on User Performance</Title>
  <Section position="5" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
3. RESULTS AND CONCLUSIONS
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Terse or Verbose?
</SectionTitle>
      <Paragraph position="0"> A two-way, 2x4, Analysis of Variance (ANOVA) was run for each of 5 dependent measures: successful task completion, amount of information presented about each flight, satisfaction, ease of use, and speed of interaction. For each dependent measure, no significant interactions were found (see footnote 2). A significant main effect for Terse/Verbose was found for the subjective measure of the amount of information presented about each flight (p=.001; see Fig. 1).</Paragraph>
      <Paragraph position="1">  Information question (2=Just the Right Amount of Information about each flight).</Paragraph>
      <Paragraph position="2"> No other significant main effects were found for any of the dependent measures. The optimum value for the dependent measure amount of information is '2' (Just the right amount of information about each flight). The average value for the Verbose condition (across the 4 levels of # of Flights) was 2.06, while the equivalent average for the Terse condition was 2.24. The Verbose average was therefore closer to the optimum value.</Paragraph>
      <Paragraph position="3">  ordering of their personal selection criteria.</Paragraph>
      <Paragraph position="4"> Related to these results is a question that was asked of all subjects at the end of the experiment. Figure 2 shows the weighted scores based on the rank ordering of the selection criteria subjects personally use when selecting among multiple flights. A rank order of 1 was given a score of 7 points, a rank order of 7 was given a score of 1 point, etc. The Weighted Score for each selection criterion shown in Figure 2 is the sum of the Weighted Scores for all subjects.</Paragraph>
      <Paragraph position="5"> 2 Throughout the experiment, the alpha level used to determine significance of an effect was p&lt;.05.</Paragraph>
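      <Paragraph><![CDATA[ The rank-to-score weighting described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the names rank_to_score and weighted_scores are hypothetical, the mapping score = 8 - rank is inferred from the two endpoints given in the text (rank 1 scores 7 points, rank 7 scores 1 point), and the example rankings are invented for illustration and are not data from the study.

```python
from collections import Counter

NUM_CRITERIA = 7  # the text describes ranks 1 through 7


def rank_to_score(rank, num_criteria=NUM_CRITERIA):
    """Map a rank (1 = most important) to points: rank 1 -> 7, rank 7 -> 1."""
    if rank not in range(1, num_criteria + 1):
        raise ValueError(f"rank must be between 1 and {num_criteria}")
    return num_criteria + 1 - rank


def weighted_scores(rankings):
    """Sum the per-subject scores for each criterion across all subjects.

    `rankings` is a list with one dict per subject, mapping a criterion
    name to the rank that subject assigned to it.
    """
    totals = Counter()
    for subject_ranking in rankings:
        for criterion, rank in subject_ranking.items():
            totals[criterion] += rank_to_score(rank)
    return dict(totals)


# Hypothetical rankings from two subjects (NOT data from the study).
example = [
    {"price": 1, "arrival time": 2, "flight number": 7},
    {"price": 2, "arrival time": 1, "flight number": 7},
]
print(weighted_scores(example))
```

With these two invented rankings, price and arrival time each total 13 points and flight number totals 2, which shows how a criterion that is consistently ranked last accumulates a low Weighted Score.]]></Paragraph>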
      <Paragraph position="6"> Similarly, a second question was asked of all subjects at the end of the experiment: &quot;In the future, what information should AT&amp;T Communicator present about each flight when you are choosing between multiple flights?&quot; Figure 3 shows the compiled responses to this question.</Paragraph>
      <Paragraph position="7">  criterion should, by default, be presented by AT&amp;T Communicator.</Paragraph>
      <Paragraph position="8"> Information that should definitely be presented to subjects when selecting between multiple flights includes: price, arrival time, departure time, number of stops and airline. The value to users of the length of stops is ambiguous. It probably should not be presented by default, although it might be useful to present the length of stops if they will be inordinately long, e.g. greater than 2 hours, or inordinately short, e.g. less than 45 minutes. Flight number was judged to be least valuable and should not be presented.</Paragraph>
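      <Paragraph><![CDATA[ The suggestion above about length of stops amounts to a simple presentation rule, which might be sketched as follows. This is an assumption-laden illustration rather than anything implemented in AT&T Communicator: the function name should_present_stop_length is hypothetical, and the two thresholds are only the "e.g." values mentioned in the text, not validated cut-offs.

```python
# Illustrative thresholds taken from the "e.g." values in the text;
# they are examples, not validated cut-offs.
LONG_STOP_MINUTES = 120  # "greater than 2 hours"
SHORT_STOP_MINUTES = 45  # "less than 45 minutes"


def should_present_stop_length(stop_minutes):
    """Return True when a stop is inordinately long or inordinately short."""
    return stop_minutes > LONG_STOP_MINUTES or stop_minutes < SHORT_STOP_MINUTES


# A 60-minute stop falls inside the ordinary range, so it is not mentioned;
# a 30-minute or 150-minute stop would be.
print(should_present_stop_length(60))
print(should_present_stop_length(30))
print(should_present_stop_length(150))
```

Under this rule the length of stops stays out of the default presentation and is mentioned only in the unusual cases the text singles out.]]></Paragraph>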
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Number of Flights?
</SectionTitle>
      <Paragraph position="0"> The above analyses indicate that the amount of information presented in the Verbose condition better met the expectations of subjects. The next question, then, was which level of the number of flights before the question factor showed the best performance within the Verbose condition. A one-way, 1x4, ANOVA was run for the Verbose condition for each of five dependent measures: successful task completion, amount of information about each flight, satisfaction, ease of use, and speed of interaction (see footnote 3). A significant main effect was found for successful task completion (p=.005). Figure 4 shows the percentage of successful task completions in the Verbose condition only. No significant effects were found for the other four dependent measures.</Paragraph>
      <Paragraph position="1">  four levels of the # of Flights Before Question condition (Verbose only).</Paragraph>
      <Paragraph position="2"> The significant main effect was probed using the Tukey test (see footnote 4). Separate 5 was the condition with the highest successful task completion rate. Only one pairwise comparison was significant (p&lt;.05). Tasks attempted in the Separate 5 condition were significantly more likely to be completed successfully than tasks attempted in the Separate 3 condition.</Paragraph>
      <Paragraph position="3">  across the four levels of # of Flights Before Question condition (Verbose only).</Paragraph>
      <Paragraph position="4"> Among the three Separate conditions (Separate 1, Separate 3, and Separate 5), subjects were much more likely to successfully complete a task in Separate 5, that is, when all the flights for a given flight (outbound or return) were presented at once, without any intervening questions. Also, based on subject comments, it appeared that at least some subjects in the Separate 3 condition were confused about the number of flights they had available to select between. These subjects didn't realize that there were more flights available after the system presented them with the first three in a total set of five flights. This is in spite of the fact that in all tasks, including the Separate 3 condition, the subjects heard a sentence like &quot;I found five outbound Northwest Airlines flights,&quot; before the options were presented for selection.</Paragraph>
      <Paragraph position="5"> 3 As noted at the beginning of the Results section, subject responses to the satisfaction, ease of use, and speed of the interaction questions may be attributable to the subject's reactions to the novel user-system interaction style, rather than to the experimentally varied presentation of the flight selection criteria.</Paragraph>
      <Paragraph position="6"> 4 The Tukey test is a test of significance for pairwise comparisons of treatment means that controls for familywise error.</Paragraph>
      <Paragraph position="7"> It is not possible, on the basis of the experimental data gathered in this study, to unambiguously choose one of the # of Flights Before Question conditions over the others. It may be that a more difficult set of tasks would elicit stronger differences in both the objective and subjective measures for the levels of this factor. However, in absolute terms, the task completion rates with Separate 5 and Combined were both high (90% and 83%, respectively), relative to the Separate 1 and Separate 3 conditions (60% and 57%, respectively).</Paragraph>
      <Paragraph position="8"> Anecdotal evidence sheds some additional light on the issue of which condition (Separate 5 or Combined) is preferred by subjects. In the Verbose condition, the last 17 subjects run in the experiment were asked a few questions that provide evidence concerning their subjective impressions of the four levels of the number of flights before question factor. The first question was &quot;Did you notice any difference between the different versions of the system?&quot; Twelve of seventeen subjects stated that they had noticed a difference between the four versions. Those 12 subjects were then asked to choose the version they liked the best, and then the version they considered to be the worst.</Paragraph>
      <Paragraph position="9"> In response to the question of which version of the system was best, the subjects stated no consistent preference for any of the versions of the system. On the other hand, the responses to the question concerning which version of the system was 'worst' were more consistent; the Combined version was selected by 5 of the 12 subjects as the version they considered to be the 'worst.' From subject comments, it appeared that subjects didn't like it when they heard one flight that matched their constraints (e.g. outbound), while the other flight did not match their constraints (e.g. return). Some subjects found this to be frustrating, confusing, and/or tedious.</Paragraph>
    </Section>
  </Section>
</Paper>