<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2314">
<Title>Bootstrapping Spoken Dialog Systems with Data Reuse</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle>1 Introduction</SectionTitle>
<Paragraph position="0">Spoken dialog systems aim to identify the intents of humans, expressed in natural language, and to take actions accordingly in order to satisfy their requests (Gorin et al., 2002). In a natural spoken dialog system, the speaker's utterance is typically first recognized using an automatic speech recognizer (ASR). Then, the intent of the speaker is identified from the recognized sequence using a spoken language understanding (SLU) component. This step can be framed as a classification problem for goal-oriented call routing systems (Gorin et al., 2002; Natarajan et al., 2002, among others). The user is then engaged in a dialog via clarification or confirmation prompts if necessary. The role of the dialog manager (DM) is to interact in a natural way and help the user achieve the task that the system is designed to support.</Paragraph>
<Paragraph position="1">In our case, we consider only automated call routing systems, where the task is to reach the right route in a large call center, which could be either a live operator or an automated system. An example dialog from a telephone-based customer care application is given below:
- System: How may I help you?
- User: Hello?
- Call-type: Hello
- System: Hello, how may I help you?
- User: I have a question.
- Call-type: Ask(Info)
- System: OK, what is your question?
- User: I would like to know my account balance.
- Call-type: Request(Account Balance)
- System: I can help you with that. What is your account number?
- User: ...
Based on heterogeneous application requirements, the User Experience (UE) expert analyzes and defines by hand the system's core functionalities: the semantic scope of the system (its call-types) and the dialog manager strategy that will drive the human-machine interaction. Once the UE expert has designed the system, large amounts of transcribed and labeled speech utterances are needed to build the ASR and SLU models.</Paragraph>
<Paragraph position="2">In our previous work, we presented active and unsupervised (or semi-supervised) learning algorithms that reduce the amount of labeling effort needed while building ASR and SLU systems (Tur et al., 2003; Tur and Hakkani-Tür, 2003; Riccardi and Hakkani-Tür, 2003). That work focused on a single application and only on the ASR and SLU components. In this study, we aim to exploit the labeled and transcribed data, together with common reusable dialog templates and patterns obtained from similar previous applications, to bootstrap the whole spoken dialog system, with its ASR, SLU, and DM components.</Paragraph>
<Paragraph position="3">The organization of this paper is as follows. Section 2 briefly describes the AT&T Spoken Dialog System, which we use in this study, and its main components: ASR, SLU, and DM. In Section 3 we present our method for bootstrapping the ASR, SLU, and DM for a new application. Section 4 presents our experiments using real data from a customer care application.</Paragraph>
</Section>
</Paper>
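The introduction frames call routing as an utterance classification problem: the SLU component maps the recognized text to one of the call-types defined by the UE expert (e.g., Hello, Ask(Info), Request(Account Balance) in the example dialog). The sketch below only illustrates that framing; it is not the system described in the paper, and the choice of scikit-learn, the toy training utterances, and the exact label strings are assumptions made for the illustration.

```python
# Minimal sketch of call routing as utterance classification
# (illustrative only; not the paper's SLU component).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled data: (recognized utterance, call-type), mimicking the labels
# shown in the example dialog. Real systems need thousands of such pairs.
train = [
    ("hello", "Hello"),
    ("hi there", "Hello"),
    ("i have a question", "Ask(Info)"),
    ("i need some information about my bill", "Ask(Info)"),
    ("i would like to know my account balance", "Request(Account_Balance)"),
    ("what is my current balance", "Request(Account_Balance)"),
]
texts, labels = zip(*train)

# Word n-gram features plus a linear classifier stand in for the SLU model
# that maps ASR output to an intent (call-type).
classifier = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
classifier.fit(texts, labels)

# Route a new recognized utterance to its most likely call-type.
print(classifier.predict(["could you tell me my balance please"])[0])
```

Collecting and labeling enough utterances like those in `train` is exactly the expensive step the paper targets, which is what motivates reusing transcribed data and dialog templates from earlier, similar applications.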