<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1624"> <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics. A Weakly Supervised Learning Approach for Spoken Language Understanding</Title> <Section position="3" start_page="0" end_page="199" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Spoken Language Understanding (SLU) is one of the key components of a spoken dialogue system.</Paragraph> <Paragraph position="1"> Its task is to identify the user's goal and to extract from the input utterance the information needed to complete the query. Traditionally, SLU research has followed two main lines: knowledge-based approaches, which rely on robust parsing or template matching techniques (Seneff, 1992; Dowding et al., 1993; Ward and Issar, 1994); and data-driven approaches, which are generally based on stochastic models (Pieraccini and Levin, 1993; Miller et al., 1995).</Paragraph> <Paragraph position="2"> Both approaches have their drawbacks, however.</Paragraph> <Paragraph position="3"> The former is costly to develop, since grammar development is time-consuming, laborious, and requires linguistic skills. It is also strictly domain-dependent and hence difficult to adapt to new domains. The latter addresses these drawbacks of knowledge-based approaches, but it often suffers from data sparseness and hence needs a fully annotated corpus in order to estimate an accurate model reliably. More recently, several variants have been proposed that trade off between the two, such as the semi-automatic grammar learning approach (Wang and Acero, 2001) and the Hidden Vector State (HVS) model (He and Young, 2005). These two methods require only minimally annotated data (only the semantic frames are annotated).</Paragraph> <Paragraph position="4"> This paper proposes a novel weakly supervised spoken language understanding approach. 
Our SLU framework consists mainly of two successive classifiers: a topic classifier and a semantic classifier. The main advantage of the proposed approach is that it is largely data-driven and requires only a minimally annotated corpus for training, while retaining robustness and depth of understanding for spoken language. In particular, the two classifiers are trained using weakly supervised strategies: the former is trained through a combination of active learning and self-training (Tur et al., 2005), and the latter is trained using a practical bootstrapping technique.</Paragraph> </Section> </Paper>