File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/97/w97-0303_intro.xml
Size: 3,852 bytes
Last Modified: 2025-10-06 14:06:21
<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0303"> <Title>An Efficient Distribution of Labor in a Two Stage Robust Interpretation Process</Title> <Section position="2" start_page="0" end_page="26" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The correct interpretation of spontaneous spoken language poses challenges that continue to fall outside the reach of state-of-the-art technology. The first essential task of a natural language interface is to map the user's utterance onto some meaning representation which can then be used for further processing. The three biggest challenges that continue to stand in the way of accomplishing even this most basic task are extragrammaticality, ambiguity, and speech recognition errors. In this paper we address the issue of how to handle the problem of extragrammaticality efficiently, where extragrammaticality is defined as any deviation of an input string from the coverage of a given system's parsing grammar.</Paragraph> <Paragraph position="1"> We demonstrate the superiority of our approach by comparing its performance with that of a set of alternative approaches, in terms of parse time and parse quality, over the same previously unseen test corpus.</Paragraph> <Paragraph position="2"> The approach presented in this paper is the completely automatic portion of the ROSE approach.</Paragraph> <Paragraph position="3"> ROSE, RObustness with Structural Evolution, repairs extragrammatical input in two phases. The first phase, Repair Hypothesis Formation, is responsible for assembling a set of hypotheses about the meaning of the ungrammatical utterance. This phase is itself divided into two stages, Partial Parsing and Combination. In the Partial Parsing stage, a restricted version of Lavie's GLR* parser (Lavie, 1995; Lavie and Tomita, 1993) is used to obtain an analysis of islands of the speaker's sentence in cases where it is not possible to obtain an analysis for the entire sentence. 
In the Combination stage, the fragments from the partial parse are assembled into a set of alternative meaning representation hypotheses. A genetic programming approach is used to search for different ways to combine the fragments in order to avoid requiring any hand-crafted repair rules. In ROSE's second phase, Interaction with the User, the system generates a set of queries, negotiating with the speaker in order to narrow the set down to a single best meaning representation hypothesis. In this paper, only the Hypothesis Formation phase is described and evaluated. Since repairs beyond those made possible by the partial parser are performed during the Combination stage, we refer to the implementation of the Combination stage as the repair module. Though a set of hypotheses is produced during the Combination stage, in the evaluation presented in this paper only the repair hypothesis scored by the repair module as best is returned.</Paragraph> <Paragraph position="4"> The ROSE approach was developed in the context of the JANUS large-scale multi-lingual machine translation system (Lavie et al., 1996; Woszczyna et al., 1993; Woszczyna et al., 1994). Currently, the JANUS system deals with the scheduling domain, in which two speakers attempt to schedule a meeting together over the phone. The system is composed of four language-independent and domain-independent modules: speech recognition, parsing, discourse processing, and generation. The repair module described in this paper is similarly language-independent and domain-independent, requiring no hand-coded knowledge dedicated to repair. The evaluations described in this paper were conducted using a grammar with approximately 1000 rules and a lexicon with approximately 1000 lexical items.</Paragraph> <Paragraph position="5"> Note: ROSE is pronounced rosé, like the wine.</Paragraph> </Section> </Paper>