<?xml version="1.0" standalone="yes"?> <Paper uid="H94-1041"> <Title>PREDICTING AND MANAGING SPOKEN DISFLUENCIES DURING HUMAN-COMPUTER INTERACTION*</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1. INTRODUCTION </SectionTitle> <Paragraph position="0"> Recently, researchers interested in spoken language processing have begun searching for reliable methods to detect and correct disfluent input automatically during interactions with spoken language systems [2, 4, 9]. In general, this research has focused on identifying acoustic-prosodic cues for detecting self-repairs, either alone or in combination with syntactic, semantic, and pattern-matching information. To date, however, possible avenues for simply reducing or eliminating disfluencies through manipulation of basic interface features have not been explored.</Paragraph> <Paragraph position="1"> Another underdeveloped but central theme in disfluency research is the relation between spoken disfluencies and planning demands. Although it is frequently claimed that disfluencies rise with increased planning demands of different kinds [3], the nature of this relation remains poorly understood. The major factors contributing to planning have yet to be identified and defined in any comprehensive manner, or linked to disfluencies and self-repairs. *This research was supported by Grant No. IRI-9213472 from the National Science Foundation, contracts from USWest, AT&amp;T/NCR, and ATR International to SRI International, and equipment donations from Apple Computer, Sun Microsystems, and Wacom Inc.</Paragraph> <Paragraph position="2"> 
From the viewpoint of designing systems, information on the dynamics of what produces disfluencies, and how to structure interfaces to minimize them, could improve the robust performance of spoken language systems.</Paragraph> <Paragraph position="3"> A related research issue is the extent to which qualitatively different types of speech may differ in their disfluency rates. That is, does the rate of spoken disfluencies tend to be stable, or variable? If variable, do disfluency rates differ systematically between human-human and human-computer speech? And are disfluency rates sufficiently variable that techniques for designing spoken language interfaces might exert much leverage in reducing them? To yield meaningful comparisons of disfluency rates across different types of human-human and human-computer interactions, research needs to be based on comparable rate-per-word measures, the same definition of disfluencies and self-repairs, and so forth.</Paragraph> <Paragraph position="4"> For the purpose of the present research, past studies by the author and colleagues [1, 6, 7] were reanalyzed: (1) to yield data on the rate of disfluencies for four different types of human-human speech, and (2) to conduct comparative analyses of whether human-human disfluencies differ from human-computer ones. In addition, three simulation studies of human-computer interaction were conducted, which generated data on spoken and handwritten disfluencies. Apart from comparing disfluencies in different communication modalities, two separate factors associated with planning demands were examined. First, presentation format was manipulated to investigate whether degree of structure might be associated with disfluencies. It was predicted that a relatively unconstrained format, which requires the speaker to self-structure and plan to a greater degree, would lead to a higher rate of speech disfluencies. 
Second, the rate of disfluencies was examined in sentences of varying length. Spoken utterances graduated in length were compared to determine whether longer sentences have an elevated rate of disfluencies per word, since they theoretically require more planning. Finally, implications are outlined for designing future interfaces capable of substantially reducing disfluent input.</Paragraph> </Section> </Paper>