<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-0612">
  <Title>A Robust Dialogue System with Spontaneous Speech Understanding and Cooperative Response</Title>
  <Section position="4" start_page="57" end_page="59" type="evalu">
    <SectionTitle>
5 Evaluation Experiment
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="57" end_page="59" type="sub_section">
      <SectionTitle>
5.1 Overview
</SectionTitle>
      <Paragraph position="0"> To evaluate our dialogue system with its multi-modal interfaces, we conducted evaluation experiments focusing on the &quot;usefulness of our system&quot;.</Paragraph>
      <Paragraph position="1"> We gave 10 users [A ... J] the task of making plans for Mt. Fuji sightseeing (6 of these users took part in the evaluation of the language processing part). All of them were novices who did not know about this system in advance. Each user had to fill in eight items using our system in this experiment: &quot;Where to go&quot; and &quot;What to do&quot; for the first day and the second day, and &quot;Where to stay&quot;, &quot;Kind of accommodation&quot;, &quot;Accommodation name&quot;, and &quot;Accommodation fee&quot; for the first night. We explained the dialogue system to the users and asked them to speak to it freely and spontaneously.</Paragraph>
      <Paragraph position="2"> We gave every subject three dialogue modes, as shown below. mode-A: using only speech input and output (our conventional system); mode-B: using speech input and multi-modal output (graphical output on the display and speech output); mode-C: using multi-modal input and output (input: speech and touch screen; output: speech and graphics on the display). The users used the three systems on-line in the computer room.</Paragraph>
      <Paragraph position="3"> In this experiment, no explicit differences were observed among the three systems in performance (recognition / comprehension rate, dialogue time, number of utterances), because the system is still imperfect.</Paragraph>
    </Section>
    <Section position="2" start_page="59" end_page="59" type="sub_section">
      <SectionTitle>
5.2 Evaluation of the language processing part through the experimental result
</SectionTitle>
      <Paragraph position="0"> Table 1 shows the performance of our system in the experiments using the mode-A system, which were used to investigate the performance of the language processing part.</Paragraph>
      <Paragraph position="1"> The &quot;Speech input&quot; column gives the results of the experiments as actually carried out. The &quot;Text input&quot; column gives the performance of our system when it was given transcriptions of the users' utterances as input, that is, when the recognition rate of the speech recognizer was assumed to be 100%. &quot;Semicorrect Recog&quot; is the recognition rate when some recognition errors on particles are permitted. &quot;Data presentation&quot; is the rate at which the system offered valuable information to the user. &quot;System query&quot; is the rate at which the system queried the user to obtain necessary conditions and to select information. &quot;Alternative plan&quot; is the rate at which the system proposed an alternative plan. &quot;Correct response&quot; is the sum of &quot;Data presentation&quot;, &quot;System query&quot;, &quot;Alternative plan&quot;, and the rate at which the interpreter was unsuccessful in generating a semantic network. &quot;Retrieval failure&quot; is the rate at which the system could not offer valuable information to the user although the interpreter had succeeded in generating a semantic network.</Paragraph>
      <Paragraph position="2"> The total number of utterances was 101, of which 81 were acceptable to the grammar of the recognizer. Of the 20 unacceptable utterances, 12 were caused by unknown words, so we consider it very important to solve the unknown word problem; the remaining 8 were not acceptable to the grammar. The recognition rate of the speech recognizer on spontaneous speech was 20.8%. With speech input, the system understood about 55% of all utterances and offered available information to the user for about 55% of them (42.6%+9.0%+3.0%). With text input, these rates were 90% and 80%, respectively. These rates show that the language processing part worked well.</Paragraph>
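      <Paragraph position="3"> The counts and rates quoted above can be cross-checked with a short sketch. It is only an illustration of the arithmetic: the variable names and the helper function percent are hypothetical and are not part of the described system, and the three partial rates are taken as quoted in the text, without assigning them to particular columns of Table 1.

```python
# Counts reported in Section 5.2 (variable names are illustrative assumptions).
total_utterances = 101
accepted_by_grammar = 81
unknown_word_failures = 12
grammar_failures = 8

# Utterances rejected by the recognizer grammar: 101 - 81.
rejected = total_utterances - accepted_by_grammar

# The two failure causes should account for every rejected utterance.
assert rejected == unknown_word_failures + grammar_failures


def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Share of utterances accepted by the recognizer grammar.
acceptance_rate = percent(accepted_by_grammar, total_utterances)

# Partial rates quoted in the text for speech input, summed as in the paper.
quoted_rates = [42.6, 9.0, 3.0]
information_offered = round(sum(quoted_rates), 1)

print(rejected, acceptance_rate, information_offered)
```

The sum of the three quoted rates is 54.6%, which is consistent with the figure of about 55% given in the text.</Paragraph>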
    </Section>
  </Section>
</Paper>