<?xml version="1.0" standalone="yes"?>
<Paper uid="H94-1041">
  <Title>PREDICTING AND MANAGING SPOKEN DISFLUENCIES DURING HUMAN-COMPUTER INTERACTION*</Title>
  <Section position="4" start_page="0" end_page="222" type="metho">
    <SectionTitle>
2. SIMULATION EXPERIMENTS ON
HUMAN-COMPUTER
INTERACTION
</SectionTitle>
    <Paragraph position="0"> This section outlines three experiments on human spoken and handwritten input to a simulated system, with spoken disfluencies constituting the primary analytical focus.</Paragraph>
    <Section position="1" start_page="222" end_page="222" type="sub_section">
      <SectionTitle>
2.1. Method
</SectionTitle>
      <Paragraph position="0"> Subjects, Tasks, and Procedure- Forty-four subjects participated in this research as paid volunteers. A &quot;Service Transaction System&quot; was simulated that could assist users with tasks that were either (1) verbal-temporal (e.g., conference registration or car rental exchanges, in which proper names and scheduling information predominated), or (2) computational-numeric (e.g., personal banking or scientific calculations, in which digits and symbol/sign information predominated). During the study, subjects first received a general orientation to the Service Transaction System, and then were given practice using it to complete tasks. They received instructions on how to enter information on the LCD tablet when writing, when speaking, and when free to use both modalities. When speaking, subjects held a stylus on the tablet as they spoke.</Paragraph>
      <Paragraph position="1"> People also were instructed on completing tasks in two different presentation formats. In an unconstrained format, they expressed information in an open workspace, with no specific system prompts used to direct their speech or writing. People simply continued providing information while the system responded interactively with confirmations. For example, in this format they spoke digits, computational signs, and requested totals while holding their stylus on an open &quot;scratch pad&quot; area of their LCD screen. During other interactions, the presentation format was explicitly structured, with linguistic and graphical cues used to structure the content and order of people's input as they worked.</Paragraph>
      <Paragraph position="2"> For example, in the verbal-temporal simulations, form-based prompts were used to elicit input (e.g., Car pickup location: ____), and in the computational-numeric simulation, patterned graphical layouts were used to elicit specific digits and symbols/signs.</Paragraph>
      <Paragraph position="3"> Other than specifying the input modality and format, an effort was made not to influence the manner in which people expressed themselves. People's input was received by an informed assistant, who performed the role of interpreting and responding as a fully functional system would. Essentially, the assistant tracked the subject's written or spoken input, and clicked on predefined fields at a Sun SPARCstation to send confirmations back to the subject.</Paragraph>
      <Paragraph position="4"> Semi-Automatic Simulation Technique- In developing this simulation, an emphasis was placed on providing automated support for streamlining the simulation to the extent needed to create facile, subject-paced interactions with clear feedback, and to have comparable specifications for the different input modalities. In the present simulation environment, response delays averaged 0.4 second, with less than a 1-second delay in all conditions. In addition, the simulation was organized to transmit analogues of human backchannel and propositional confirmations, with propositional-level combinations embedded in a compact transaction receipt. The simulation also was designed to be sufficiently automated so that the assistant could concentrate attention on monitoring the accuracy of incoming information, and on maintaining sufficient vigilance to ensure prompt responding. This semi-automation contributed to the fast pace of the simulation, and to a low rate of technical errors. The simulation technique and its capabilities have been described elsewhere [8].</Paragraph>
      <Paragraph position="5"> Research Design and Data Capture- Three studies were completed in which the research design was a completely crossed factorial with repeated measures. In all studies, the main factors of interest included: (1) communication modality - speech-only, pen-only, combined pen/voice, and (2) presentation format - form-based, unconstrained. The first two studies examined disfluencies during communication of verbal-temporal content. To test the generality of certain findings, a third study was conducted that compared disfluencies in computational-numeric content.</Paragraph>
      <Paragraph position="6"> In total, data were available from 528 tasks for analysis of spoken and written disfluencies. All human-computer interactions were videotaped. Hardcopy transcripts also were created, with the subject's handwritten input captured automatically, and spoken input transcribed onto the printouts.</Paragraph>
      <Paragraph position="7"> Transcript Coding- To summarize briefly, spontaneously occurring disfluencies and self-corrections were totaled for each subject and condition. The total number of disfluencies per condition then was converted to a rate per 100 words, and average disfluency rates were summarized as a function of condition and utterance length. Disfluencies were classified into the following types: (1) content self-corrections - task-content errors that were spontaneously corrected as the subject spoke or wrote, (2) false starts - alterations to the grammatical structure of an utterance that occurred spontaneously as the subject spoke or wrote, (3) verbatim repetitions - retracings or repetitions of a letter, phoneme, syllable, word, or phrase that occurred spontaneously as the subject spoke or wrote, (4) filled pauses - spontaneous nonlexical sounds that fill pauses in running speech, which have no analogue in writing, (5) self-corrected spellings and abbreviations - spontaneously corrected misspelled words or further specification of abbreviations, which occur in writing but have no analogue in speech.</Paragraph>
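      <Paragraph position="8"> The conversion of disfluency totals to a rate per 100 words, applied across the crossed modality and format conditions, can be illustrated with a minimal sketch. This Python is not the authors' tooling: the function names (rate_per_100_words, rates_by_condition) and the example counts are hypothetical, and only the factor levels and the per-100-words normalization are taken from the text above.

```python
from itertools import product

# Factor levels as named in the Method section.
MODALITIES = ("speech-only", "pen-only", "combined pen/voice")
FORMATS = ("form-based", "unconstrained")

# Completely crossed design: every modality paired with every format.
CONDITIONS = list(product(MODALITIES, FORMATS))


def rate_per_100_words(disfluencies, words):
    """Return disfluencies per 100 words; 0.0 when no words were produced."""
    if words == 0:
        return 0.0
    return 100.0 * disfluencies / words


def rates_by_condition(counts):
    """Map each condition that has data to its disfluency rate.

    counts: dict from (modality, format) to (disfluencies, words).
    Conditions with no data are skipped rather than guessed.
    """
    return {
        cond: rate_per_100_words(*counts[cond])
        for cond in CONDITIONS
        if cond in counts
    }


# Hypothetical counts for illustration only; not data from the study.
example_counts = {
    ("speech-only", "form-based"): (2, 400),
    ("speech-only", "unconstrained"): (6, 300),
}
example_rates = rates_by_condition(example_counts)
```

Normalizing by word count lets conditions that elicit different amounts of input be compared on one scale. Returning 0.0 for an empty transcript is an assumption of this sketch, not a rule stated in the study.</Paragraph>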
    </Section>
  </Section>
</Paper>