<?xml version="1.0" standalone="yes"?> <Paper uid="N04-4019"> <Title>Speech Graffiti vs. Natural Language: Assessing the User Experience</Title> <Section position="4" start_page="0" end_page="1" type="metho"> <SectionTitle> 2 Method </SectionTitle> <Paragraph position="0"> We conducted a within-subjects user study in which participants attempted a series of queries to a movie information database with either a Speech Graffiti interface (SG-ML) or a natural language interface (NL-ML).</Paragraph> <Paragraph position="1"> Participants repeated the process with the other system after completing their initial tasks and an evaluation questionnaire. System presentation order was balanced.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Participants </SectionTitle> <Paragraph position="0"> Twenty-three users (12 female, 11 male) accessed the systems via telephone in our lab. Most were undergraduate students from Carnegie Mellon University, resulting in a limited range of ages represented. None had any prior experience with either of the two movie systems or interfaces, and all users were native speakers of American English.
About half the users had computer science and/or engineering (CSE) backgrounds, and similarly about half reported that they did computer programming &quot;fairly often&quot; or &quot;very frequently.&quot;</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Training </SectionTitle> <Paragraph position="0"> Users learned Speech Graffiti concepts prior to use during a brief, self-paced, web-based tutorial session.</Paragraph> <Paragraph position="1"> Speech Graffiti training sessions were balanced between tutorials using examples from the MovieLine and tutorials using examples from a database that provided simulated flight arrival, departure, and gate information.</Paragraph> <Paragraph position="2"> Regardless of training domain, most users spent ten to fifteen minutes on the Speech Graffiti tutorial.</Paragraph> <Paragraph position="3"> A side effect of the Speech Graffiti-specific training is that in addition to teaching users the concepts of the language, it also familiarizes users with the more general task of speaking to a computer over the phone. To balance this effect for users of the natural language system, which is otherwise intended to be a walk-up-and-use interface, participants engaged in a brief NL &quot;familiarization session&quot; in which they were simply instructed to call the system and try it out. 
To match the in-domain/out-of-domain variable used in the SG tutorials, half of the NL familiarization sessions used the NL MovieLine and half used MIT's Jupiter natural language weather information system (Zue et al., 2000).</Paragraph> <Paragraph position="4"> Users typically spent about five minutes exploring the NL systems during the familiarization session.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 Tasks </SectionTitle> <Paragraph position="0"> After having completed the training session for a specific system, each user was asked to call that system and attempt a set of tasks (e.g. &quot;list what's playing at the Squirrel Hill Theater,&quot; &quot;find out &amp; write down what the ratings are for the movies showing at the Oaks Theater&quot;). Participant compensation included task completion bonuses to encourage users to attempt each task in earnest. Regardless of which system they were working with, all users were given the same eight tasks for their first interactions and a different set of eight tasks for their second system interactions.</Paragraph> </Section> <Section position="4" start_page="0" end_page="1" type="sub_section"> <SectionTitle> 2.4 Evaluation </SectionTitle> <Paragraph position="0"> After interacting with a system, each participant was asked to complete a user satisfaction questionnaire scoring 34 subjective-response items on a 7-point Likert scale. This questionnaire was based on the Subjective Assessment of Speech System Interfaces (SASSI) project (Hone & Graham, 2001), which sorts a number of subjective user satisfaction statements (such as &quot;I always knew what to say to the system&quot; and &quot;the system makes few errors&quot;) into six relevant factors: system response accuracy, habitability, cognitive demand, annoyance, likeability and speed.
User satisfaction scores were calculated for each factor and overall by averaging the responses to the appropriate component statements.</Paragraph> <Paragraph position="1"> In addition to the Likert scale items, users were also asked a few comparison questions, such as &quot;which of the two systems did you prefer?&quot; For objective comparison of the two interfaces, we measured overall task completion, time- and turns-to-completion, and word- and understanding-error rates.</Paragraph> </Section> </Section> </Paper>