<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1158">
  <Title>Efficient Confirmation Strategy for Large-scale Text Retrieval Systems with Spoken Dialogue Interface</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Information retrieval systems with spoken language interfaces have been studied (Harabagiu et al., 2002; Hori et al., 2003). They require both automatic speech recognition (ASR) and information retrieval (IR) technologies. A straightforward way to build such a system is to use ASR results as the input to an IR system that searches a text knowledge base (KB). However, two problems arise from the characteristics of speech:</Paragraph>
    <Paragraph position="1"> 1. Speech recognition errors. 2. Redundancy in spoken language expressions. The first problem is ASR errors, which are essentially inevitable in speech communication. Adequate confirmation is therefore indispensable in spoken dialogue systems to eliminate the misunderstandings caused by ASR errors.</Paragraph>
    <Paragraph position="2"> If the keywords to be confirmed are defined in advance, the system can confirm them using confidence measures (Komatani and Kawahara, 2000; Hazen et al., 2000) to manage the errors. In conventional spoken dialogue tasks where the retrieval target is well defined, such as a relational database, the keywords that matter for achieving the task correspond to items in the database. Most spoken dialogue systems developed so far, such as airline information systems (Levin et al., 2000; Potamianos et al., 2000; San-Segundo et al., 2000) and train information systems (Allen et al., 1996; Sturm et al., 1999; Lamel et al., 1999), fall into this category. However, it is not feasible to define such keywords for retrieval from operation manuals (Komatani et al., 2002) or WWW pages, where the retrieval target is not organized and is written as natural language text.</Paragraph>
    <Paragraph position="3"> The other problem is that a user's utterances may include redundant expressions or out-of-domain phrases. A speech interface is said to have the advantage of ease of input, but this also means that redundant expressions, such as disfluencies and irrelevant phrases, are easily input. These do not contribute directly to task achievement and may even be harmful. ASR results that may include such redundant portions are therefore not adequate as input to IR systems.</Paragraph>
    <Paragraph position="4"> This paper describes a novel method that automatically detects the portions of the ASR results of a user's utterances that are necessary for task achievement; that is, the system determines whether each part of the ASR results is necessary for the retrieval. We introduce two measures for each portion of the results. One is a relevance score (RS) with the target document set.</Paragraph>
    <Paragraph position="5"> [Figure text: sample target document] Summary: This article describes how to use speech recognition in Windows XP. If you installed speech recognition with Microsoft Office XP, or if you purchased a new computer that has Office XP installed, you can use speech recognition in all Office programs as well as other programs for which it is enabled. Detail information: Speech recognition enables the operating system to convert spoken words to written text. An internal driver, called a speech recognition engine, recognizes words and converts them to text. The speech recognition engine ...</Paragraph>
    <Paragraph position="6"> The RS is computed with a document language model and is used for making confirmations prior to the retrieval. The other is a significance score (SS) in the document matching. It is computed after the retrieval using N-best results and is used for prompting the user for post-selection if necessary. The information needed to define these two measures, such as a document language model and retrieval results for the N-best candidates of the ASR, can be derived automatically from the target knowledge base. Therefore, the system can detect the portions necessary for the retrieval and make confirmations efficiently without keywords being defined manually.</Paragraph>
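    <Paragraph position="7"> The two measures can be illustrated with a minimal Python sketch. This section does not give the actual definitions, so everything below is an assumption made for illustration: the add-one-smoothed unigram model standing in for the document language model, the toy word-overlap function retrieve, and the use of Jaccard overlap across N-best candidates as a significance score. All function names are hypothetical.

```python
import math
from collections import Counter


def train_unigram(documents):
    """Assumed stand-in for a document language model: add-one-smoothed unigrams."""
    counts = Counter(word for doc in documents for word in doc.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words

    def logprob(word):
        return math.log((counts[word.lower()] + 1) / (total + vocab))

    return logprob


def relevance_score(portion, logprob):
    """Average per-word log-probability of an ASR portion under the KB model."""
    words = portion.split()
    if not words:
        return float("-inf")
    return sum(logprob(w) for w in words) / len(words)


def retrieve(query_words, documents):
    """Toy retrieval (assumption): ids of documents sharing any query word."""
    query = {w.lower() for w in query_words}
    return {i for i, doc in enumerate(documents)
            if query.intersection(doc.lower().split())}


def significance_score(nbest, documents):
    """1 minus the Jaccard overlap of the retrieval results of the N-best candidates."""
    results = [retrieve(candidate.split(), documents) for candidate in nbest]
    union = set().union(*results)
    if not union:
        return 0.0
    common = set.intersection(*results)
    return 1.0 - len(common) / len(union)


kb = [
    "speech recognition converts spoken words to text",
    "install the printer driver",
    "use speech recognition in office programs",
]
model = train_unigram(kb)
in_domain = relevance_score("speech recognition", model)
filler = relevance_score("um well anyway", model)
```

In this sketch a portion with a low relevance score would be a candidate for confirmation before retrieval, and a high significance score would indicate that the N-best candidates lead to different retrieval results, so asking the user to select afterwards may be worthwhile. Any thresholds would have to be chosen empirically.</Paragraph>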
  </Section>
</Paper>