<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1021"> <Title>Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability</Title> <Section position="6" start_page="0" end_page="0" type="relat"> <SectionTitle> 5 Related Work </SectionTitle> <Paragraph position="0"> We discuss related work in three categories: database-independent NLIs, ATIS-specific NLIs, and sublanguages.</Paragraph> <Paragraph position="1"> Database-independent NLIs There has been extensive previous work on NLIs (Androutsopoulos et al., 1995), but three key elements distinguish PRECISE. First, we introduce a model of ST questions and show that it produces provably correct interpretations of questions (subject to the assumptions of the model). We measure the prevalence of ST questions to demonstrate the practical import of our model. Second, we are the first to use a statistical parser as a &quot;plug in&quot;, to experimentally measure its efficacy, and to analyze the attendant challenges. Finally, we show how to leverage our semantic model to correct parser errors in difficult syntactic cases (e.g., prepositional attachment). A more detailed comparison of PRECISE with a wide range of NLI systems appears in (Popescu et al., 2003). The advances of this paper over our previous one include the reformulation of the ST theory, the parser retraining, the semantic overrides, and the experiments testing PRECISE on the ATIS data.</Paragraph> <Paragraph position="2"> ATIS NLIs Typical ATIS NLIs used either domain-specific semantic grammars (Seneff, 1992; Ward and Issar, 1996) or stochastic models that required fully annotated domain-specific corpora for reliable parameter estimation (Levin and Pieraccini, 1995). In contrast, because it relies on its model of semantically tractable questions, PRECISE does not require heavy manual processing and needs only a small number of annotated questions. 
In addition, PRECISE leverages existing domain-independent parsing technology and offers theoretical guarantees absent from other work. Improved versions of ATIS systems such as Gemini (Moore et al., 1995) increased their coverage by allowing an approximate question interpretation to be computed from the meanings of some question fragments. Since PRECISE focuses on high precision rather than recall, we instead analyze every word in the question and interpret the question as a whole. Most recently, (He and Young, 2003) introduced the HEY system, which learns a semantic parser without requiring fully annotated corpora. HEY uses a hierarchical semantic parser that is trained on a set of questions paired with their corresponding SQL queries, and in this respect it is similar to (Tang and Mooney, 2001). Both learning systems require a large set of questions labeled by their SQL queries--an expensive input that PRECISE does not require--and, unlike PRECISE, neither system can leverage continuing improvements to statistical parsers.</Paragraph> <Paragraph position="3"> Sublanguages The early work most similar to PRECISE was done in the field of sublanguages. Traditional sublanguage work (Kittredge, 1982) looked at defining sublanguages for various domains, while more recent work (Grishman, 2001; Sekine, 1994) suggests using AI techniques to learn aspects of sublanguages automatically. Our work can be viewed as a generalization of traditional sublanguage research: we restrict ourselves to the semantically tractable subset of English rather than to a particular knowledge domain. Finally, in addition to offering formal guarantees, we assess the prevalence of our &quot;sublanguage&quot; in the ATIS data.</Paragraph> </Section> </Paper>