<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1021">
  <Title>Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 The PRECISE System Overview
</SectionTitle>
    <Paragraph position="0"> Our recent paper (Popescu et al., 2003) introduced the PRECISE architecture and its core algorithm for  ciple -- typically, a learning algorithm is effective when its test examples are drawn from roughly the same distribution as its training examples.</Paragraph>
    <Paragraph position="1"> reducing semantic interpretation to a graph matching problem that is solved by MaxFlow. In this section we provide a brief overview of PRECISE, focusing on the components necessary to understanding its performance on the ATIS data set in Section 4.</Paragraph>
    <Paragraph position="2"> To discuss PRECISE further, we must first introduce some terminology. We say that a database is made up of three types of elements: relations, attributes and values. Each element is unique: an attribute element is a particular column in a particular relation and each value element is the value of a particular attribute. A value is compatible with its attribute and also with the relation containing this attribute. An attribute is compatible with its relation. Each attribute in the database has associated with it a special value, which we call a wh-value, that corresponds to a wh-word (what, where, etc.).</Paragraph>
    <Paragraph position="3"> We define a lexicon as a tuple (T; E; M), where T is a set of strings, called tokens (intuitively, tokens are strings of one or more words, like &amp;quot;New York&amp;quot;); E is a set of database elements, wh-values, and join paths; 2 and M is a subset of T E -- a binary relation between tokens and database elements.</Paragraph>
    <Paragraph position="4"> PRECISE takes as input a lexicon and a parser.</Paragraph>
    <Paragraph position="5"> Then, given an English question, PRECISE maps it to one (or more) corresponding SQL queries. We concisely review how PRECISE works through a simple example. Consider the following question q: &amp;quot;What are the flights from Boston to Chicago?&amp;quot; First, the parser plug-in automatically derives a dependency analysis for q from q's parse tree, represented by the following compact syntactic logical form: LF(q) = what(0);is(0;1);flight(1), from(1;2);boston(2);to(1;3);chicago(3).</Paragraph>
    <Paragraph position="6"> LF(q) contains a predicate for each question word.</Paragraph>
    <Paragraph position="7"> Head nouns correspond to unary predicates whose arguments are constant identifiers.</Paragraph>
    <Paragraph position="8"> Dependencies are encoded by equality constraints between arguments to different predicates. The first type of dependency is represented by noun and adjective pre-modifiers corresponding to unary predicates whose arguments are the identifiers for the respective modified head nouns. A second type of dependency is represented by noun postmodifiers and mediated by prepositions (in the above example, &amp;quot;from&amp;quot; and &amp;quot;to&amp;quot;). The prepositions correspond to binary predicates whose arguments specify the attached noun phrases. For instance, &amp;quot;from&amp;quot; attaches &amp;quot;flight&amp;quot; to &amp;quot;boston&amp;quot;. Finally, subject/predicate, predicate/direct object and predicate/indirect object dependency information is computed for the various 2A join path is a set of equality constraints between the attributes of two or more tables. See Section 3 for more details and a formal definition.</Paragraph>
    <Paragraph position="9"> verbs present in the question. Verbs correspond to binary or tertiary predicates whose arguments indicate what noun phrases play the subject and object roles. In our example, the verb &amp;quot;is&amp;quot; mediates the dependency between &amp;quot;what&amp;quot; and &amp;quot;flight&amp;quot;. 3 PRECISE's lexicon is generated by automatically extracting value, attribute, and relation names from the database. We manually augmented the lexicon with relevant synonyms, prepositions, etc..</Paragraph>
    <Paragraph position="10"> The tokenizer produces a single complete tokenization of this question and lemmatizes the tokens: (what, is, flight, from, boston, to, chicago). By looking up the tokens in the lexicon, PRECISE efficiently retrieves the set of potentially matching database elements for every token. In this case, what, boston and chicago are value tokens, to and from are attribute tokens and flight is a relation token.</Paragraph>
    <Paragraph position="11"> In addition to this information, the lexicon also contains a set of restrictions for tokens that are prepositions or verbs. The restrictions specify the database elements that are allowed to match to the arguments of the respective preposition or verb. For example, from can take as arguments a flight and a city. The restrictions also specify the join paths connecting these relations/attributes. The syntactic logical form is used to retrieve the relevant set of restrictions for a given question.</Paragraph>
    <Paragraph position="12"> The matcher takes as input the information described above and reduces the problem of satisfying the semantic constraints imposed by the definition of a valid interpretation to a graph matching problem (Popescu et al., 2003). In order for each attribute token to match a value token, Boston and Chicago map to the respective values of the database attribute city.cityName, from maps to flight.fromAirport or fare.fromAirport and to maps to flight.toAirport or fare.toAirport. The restrictions validate the output of the matcher and are then used in combination with the syntactic information to narrow down even further the possible interpretations for each token by enforcing local dependencies. For example, the syntactic information tells us that &amp;quot;from&amp;quot; refers to &amp;quot;flight&amp;quot; and since &amp;quot;flight&amp;quot; uniquely maps to flight, this means that from will map to flight.fromAirport rather than fare.fromAirport (similarly, to maps to flight.toAirport and whatmaps to flight.flightId).</Paragraph>
    <Paragraph position="13"> Finally, the matcher compiles a list of all relations satisfying all the clauses in the syntactic logical form using each constant and narrows down the set 3PRECISE uses a larger set of constraints on dependency relations, but for brevity, we focus on those relevant to our examples. null of possible interpretations for each token accordingly. Each set of (constant, corresponding database element) pairs represents a semantic logical form.</Paragraph>
    <Paragraph position="14"> The query generator takes each semantic logical form and uses the join path information available in the restrictions to form the final SQL queries corresponding to each semantic interpretation.</Paragraph>
    <Paragraph position="15"> pronoun verb noun prep noun prep noun prep noun  by PRECISE's semantic over-rides. PRECISE detects that the parser attached the PP &amp;quot;on Monday&amp;quot; to &amp;quot;Chicago&amp;quot; in error. PRECISE attempts to re-attach &amp;quot;on Monday&amp;quot; first to the PP &amp;quot;to Chicago&amp;quot;, and then to the NP &amp;quot;flights from Boston to Chicago&amp;quot;, where it belongs.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Parser Enhancements
</SectionTitle>
      <Paragraph position="0"> We used the Charniak parser (Charniak, 2000) for the experiments reported in this paper. We found that the Charniak parser, which was trained on the WSJ corpus, yielded numerous syntactic errors.</Paragraph>
      <Paragraph position="1"> Our first step was to hand tag a set of 150 questions with Part Of Speech (POS) tags, and re-train the parser's POS tagger. As a result, the probabilities associated with certain tags changed dramatically.</Paragraph>
      <Paragraph position="2"> For example, initially, 'list' was consistently tagged as a noun, but after re-training it was consistently labeled as a verb. This change occurs because, in the ATIS domain, 'list' typically occurs in imperative sentences, such as &amp;quot;List all flights.&amp;quot; Focusing exclusively on the tagger drastically reduced the amount of data necessary for re-training.</Paragraph>
      <Paragraph position="3"> Whereas the Charniak parser was originally trained on close to 40,000 sentences, we only required 150 sentences for re-training. Unfortunately, the re-trained parser still made errors when solving difficult syntactic problems, most notably preposition attachment and preposition ellipsis. PRECISE corrects both types of errors using semantic information. null We refer to PRECISE's use of semantic information to correct parser errors as semantic over-rides. Specifically, PRECISE detects that an attachment decision made by the parser is inconsistent with the semantic information in its lexicon.4 When this occurs, PRECISE attempts to repair the parse tree as follows. Given a noun phrase or a prepositional phrase whose corresponding node n in the parse tree has the wrong parent p, PRECISE traverses the path in the parse tree from p to the root node, searching for a suitable node to attach n to. PRECISE chooses the first ancestor of p such that when n is attached to the new node, the modified parse tree agrees with PRECISE's semantic model. Thus, the semantic over-ride procedure is a generate-and-test search where potential solutions are generated in the order of ancestors of node n in the parse tree. The procedure's running time is linear in the depth of the parse tree.</Paragraph>
      <Paragraph position="4"> Consider, for example, the question &amp;quot;What are flights from Boston to Chicago on Monday?&amp;quot; The parser attaches the prepositional phrase &amp;quot;on Monday&amp;quot; to 'Chicago' whereas it should be attached to 'flights' (see Figure 1). The parser merely knows that 'flights', 'Boston', and 'Chicago' are nouns. It then uses statistics to decide that &amp;quot;on Monday&amp;quot; is most likely to attach to 'Chicago'. However, this syntactic decision is inconsistent with the semantic information in PRECISE's lexicon -- the preposition 'on' does not take a city and a day as arguments, rather it takes a flight and a day.</Paragraph>
      <Paragraph position="5"> Thus, PRECISE decides to over-ride the parser and attach 'on' elsewhere. As shown in Figure 1, PRECISE detects that the parser attached the PP &amp;quot;on Monday&amp;quot; to &amp;quot;Chicago&amp;quot; in error. PRECISE attempts to re-attach &amp;quot;on Monday&amp;quot; first to the PP &amp;quot;to Chicago&amp;quot;, and then to the NP &amp;quot;flights from Boston to Chicago&amp;quot;, where it belongs. While in our example the parser violated a constraint in PRECISE's lexicon, the violation of any semantic constraint will trigger the over-ride procedure.</Paragraph>
      <Paragraph position="6"> In the above example, we saw how semantic over-rides help PRECISE fix prepositional attachment errors; they also enable it to correct parser errors in topicalized questions (e.g., &amp;quot;What are Boston to Chicago flights?&amp;quot;) and in preposition ellipsis (e.g., when 'on' is omitted in the question &amp;quot;What are flights from Boston to Chicago Monday?&amp;quot;).</Paragraph>
      <Paragraph position="7"> Unfortunately, semantic over-rides do not correct all of the parser's errors. Most of the remaining parser errors fall into the following categories: relative clause attachment, verb attachment, numeric 4We say that node n is attached to node p if p is the parent of n in the parse tree.</Paragraph>
      <Paragraph position="8"> noun phrases, and topicalized prepositional phrases.</Paragraph>
      <Paragraph position="9"> In general, semantic over-rides can correct local attachment errors, but cannot over-come more global problems in the parse tree. Thus, PRECISE can be forced to give up and ask the user to paraphrase her question.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 PRECISE Theory
</SectionTitle>
    <Paragraph position="0"> The aim of this section is to explain the theoretical under-pinnings of PRECISE's semantic model. We show that PRECISE always answers questions from the class of Semantically Tractable (ST) questions correctly, given correct lexical and syntactic information.5 null We begin by introducing some terminology that builds on the definitions given Section 2.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Definitions
</SectionTitle>
      <Paragraph position="0"> A join path is a set of equality constraints between a sequence of database relations. More formally, a join path for relations R1;::: ;Rn is a set of constraints C fRi:a = Ri+1:bj1 i n 1g. Here the notation Ri:a refers to the value of attribute a in relation Ri.</Paragraph>
      <Paragraph position="1"> We say a relation between token set T and a set of database elements and join paths E respects a lexicon L if it is a subset of M.</Paragraph>
      <Paragraph position="2"> A question is simply a string of characters. A tokenization of a question (with respect to a lexicon) is an ordered set of strings such that each element of the tokenization is an element of the lexicon's token set, and the concatenation of the elements of the tokenization, in order, is equal to the original question. For a given lexicon and question, there may be zero, one, or several tokenizations. Any question that has at least one tokenization is tokenizable.</Paragraph>
      <Paragraph position="3"> An attachment function is a function FL;q : T ! T, where L is the lexicon, q is a question, and T is the set of tokens in the lexicon. The attachment function is meant to represent dependency information available to PRECISE through a parser. For example, if a question includes the phrase &amp;quot;restaurants in Seattle&amp;quot;, the attachment function would attach &amp;quot;Seattle&amp;quot; to &amp;quot;restaurants&amp;quot; for this question. Not all tokens are attached to something in every question, so the attachment function is not a total function. We say that a relation R between tokens in a question q respects the attachment function if</Paragraph>
      <Paragraph position="5"> not take on a value for t1).</Paragraph>
      <Paragraph position="6"> 5We do not claim that NLI users will restrict their questions to the ST subset of English in practice, but rather that identifying classes of questions as semantically tractable (or not), and experimentally measuring the prevalence of such questions, is a worthwhile avenue for NLI research.</Paragraph>
      <Paragraph position="7"> In an NLI, interpretations of a question are SQL statements. We define a valid interpretation of a question as being an SQL statement that satisfies a number of conditions connecting it to the tokens in the question. Because of space constraints, we provide only one such constraint as an example: There exists a tokenization t of the question and a set of database elements E such that there is a one-to-one map from t to E respecting the lexicon, and for each value element v 2 E, there is exactly one equality constraint in the SQL clause that uses v.</Paragraph>
      <Paragraph position="8"> For a complete definition of a valid interpretation, see (Popescu et al., 2003).</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Semantic Tractability Model
</SectionTitle>
      <Paragraph position="0"> In this section we formally define the class of ST questions, and show that PRECISE can provably map such questions to the corresponding SQL queries. Intuitively, ST questions are &amp;quot;easy to understand&amp;quot; questions where the words or phrases correspond to database elements or constraints on join paths. Examining multiple questions sets and databases, we have found that nouns, adjectives, and adverbs in &amp;quot;easy&amp;quot; questions refer to database relations, attributes, or values.</Paragraph>
      <Paragraph position="1"> Moreover, the attributes and values in a question &amp;quot;pair up&amp;quot; naturally to indicate equality constraints in SQL. However, values may be paired with implicit attributes that do not appear in the question (e.g., the attribute 'cuisine' in &amp;quot;What are the Chinese restaurants in Seattle?&amp;quot; is implicit). Interestingly, there is no notion of &amp;quot;implicit value&amp;quot; -- the question &amp;quot;What are restaurants with cuisine in Seattle?&amp;quot; does not make sense.</Paragraph>
      <Paragraph position="2"> A preposition indicates a join between the relations corresponding to the arguments of the preposition. For example, consider the preposition 'from' in the question &amp;quot;what airlines fly from Boston to Chicago?&amp;quot; 'from' connects the value 'Boston' (in the relation 'cities') to the relation 'airlines'. Thus, we know that the corresponding SQL query will join 'airlines' and 'cities'.</Paragraph>
      <Paragraph position="3"> We formalize these observations about questions below. We say that a question q is semantically tractable using lexicon L and attachment function  FL;q if: 1. It is possible to split q up into words and phrases found in L. (More formally, q is tokenizable according to L.) 2. While words may have multiple meanings in the lexicon, it must be possible to find a one null to-one correspondence between tokens in the question and some set of database elements.</Paragraph>
      <Paragraph position="4"> (More formally, there exists a tokenization t and a set of database elements and join paths Et such that there is a bijective function f from  t to Et that respects L.) 3. There is at least one such set Et that has exactly one wh-value.</Paragraph>
      <Paragraph position="5"> 4. It is possible to add 'implicit' attributes to Et to get a set E0t with exactly one compatible attribute for every value. (More formally, for some Et with a wh-value there exist attributes a1;::: ;an such that E0t = Et [fa1;:::;ang and there is a bijective function g from the set of value elements (including wh-values) V to the set of attribute elements A in E0t.) 5. At least one such E0t obeys the syntactic restrictions of FL;q. (More formally, let</Paragraph>
      <Paragraph position="7"/>
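The sketch below gives a toy check of conditions 1-3 over a miniature lexicon (conditions 4 and 5 are omitted); all entries and names are illustrative assumptions:

from itertools import product

# Toy lexicon M: each token maps to its candidate database elements (kind, name).
MATCHES = {
    "what":    [("wh-value", "flight.flightId")],
    "flights": [("relation", "flight")],
    "from":    [("attribute", "flight.fromAirport"), ("attribute", "fare.fromAirport")],
    "boston":  [("value", "city.cityName = 'Boston'")],
    "to":      [("attribute", "flight.toAirport"), ("attribute", "fare.toAirport")],
    "chicago": [("value", "city.cityName = 'Chicago'")],
}

def tokenizable(words):
    """Condition 1 (word-level approximation): every word is a lexicon token."""
    return all(w in MATCHES for w in words)

def correspondences(words):
    """Conditions 2 and 3: one-to-one token-to-element maps with exactly one wh-value."""
    for combo in product(*(MATCHES[w] for w in words)):
        distinct = len(set(combo)) == len(combo)
        one_wh = sum(kind == "wh-value" for kind, _ in combo) == 1
        if distinct and one_wh:
            yield dict(zip(words, combo))

q = ["what", "flights", "from", "boston", "to", "chicago"]
print(tokenizable(q))                      # True: condition 1 holds
print(len(list(correspondences(q))))       # 4 candidate one-to-one correspondences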
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Results and Discussion
</SectionTitle>
      <Paragraph position="0"> We say that an NLI is sound for a class of questions Q using lexicon L and attachment function FL if for every input q 2 Q, every output of the NLI is a valid interpretation. We say the NLI is complete if it returns all valid interpretations. Our main result is the following: Theorem 1 Given a lexicon L and attachment function FL, PRECISE is sound and complete for the class of semantically tractable questions.</Paragraph>
      <Paragraph position="1"> In practical terms, the theorem states that given correct and complete syntactic and lexical information, PRECISE will return exactly the set of valid interpretations of a question. If PRECISE is missing syntactic or semantic constraints, it can generate extraneous interpretations that it &amp;quot;believes&amp;quot; are valid. Also, if a person uses a term in a manner inconsistent with PRECISE's lexicon, then PRECISE will interpret her question incorrectly. Finally, PRECISE will not answer a question that contains words absent from its lexicon.</Paragraph>
      <Paragraph position="2"> The theorem is clearly an idealization, but the experiments reported in Section 4 provide evidence that it is a useful idealization. PRECISE, which embodies the model of semantic tractability, achieves very high accuracy because in practice it either has correct and complete lexical and syntactic information or it has enough semantic information to compensate for its imperfect inputs. In fact, as we explained in Section 2.1, PRECISE's semantic model enables it to correct parser errors in some cases.</Paragraph>
      <Paragraph position="3"> Finding all the valid interpretations for a question is computationally expensive in the worst case (even just tokenizing a question is NP-complete (Popescu et al., 2003)). Moreover, if the various syntactic and semantic constraints are fed to a standard constraint solver, then the problem of finding even a single valid interpretation is exponential in the worst case. However, we have been able to formulate PRECISE's constraint satisfaction problem as a graph matching problem that is solved in polynomial time by the MaxFlow algorithm: Theorem 2 For lexicon L, PRECISE finds one valid interpretation for a tokenization T of a semantically tractable question in time O(Mn2), where n is the number of tokens in T and M is the maximum number of interpretations that a token can have in L.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Experimental Evaluation
</SectionTitle>
    <Paragraph position="0"> Semantic Tractability (ST) theory and PRECISE's architecture raise a four empirical questions that we now address via experiments on the ATIS data set (Price, 1990): how prevalent are ST questions? How effective is PRECISE in mapping ATIS questions to SQL queries? What is the impact of semantic over-rides? What is the impact of parser retraining? Our experiments utilized the 448 context-independent questions in the ATIS &amp;quot;Scoring Set A&amp;quot;. We chose the ATIS data set because it is a standard benchmark (see Table 2) where independently generated questions are available to test the efficacy of an NLI.</Paragraph>
    <Paragraph position="1"> We found that 95.8% of the ATIS questions were ST questions. We classified each question as ST (or not) by running PRECISE on the question and  column records the percentage of questions where the small set of SQL queries returned by PRECISE contains the correct query; PRECISE-1 refers to the questions correctly interpreted if PRECISE is forced to return exactly one SQL query. ParserORIG is the original version of the parser, ParserTRAINED is the version re-trained for the ATIS domain, and ParserCORRECT is the version whose output is corrected manually. System configurations marked by + indicate the automatic use of semantic over-rides to correct parser errors.</Paragraph>
    <Paragraph position="2">  are database independent. All results are for performance on the context-independent questions in ATIS. recording its response. Intractable questions were due to PRECISE's incomplete semantic information. Consider, for example, the ATIS request &amp;quot;List flights from Oakland to Salt Lake City leaving after midnight Thursday.&amp;quot; PRECISE fails to answer this question because it lacks a model of time, and so cannot infer that &amp;quot;after midnight Thursday&amp;quot; means &amp;quot;early Friday morning.&amp;quot; In addition, we found that the prevalence of ST questions in the ATIS data is consistent with our earlier results on the set of 1,800 natural language questions compiled by Ray Mooney in his experiments in three domains (Tang and Mooney, 2001). As reported in (Popescu et al., 2003), we found that approximately 80% of Mooney's questions were ST.</Paragraph>
    <Paragraph position="3"> PRECISE performance on the ATIS data was also comparable to its performance on the Mooney data sets.</Paragraph>
    <Paragraph position="4"> Table 1 quantifies the impact of the parser enhancements discussed in Section 2.1. Since PRECISE can return multiple distinct SQL queries when it judges a question to be ambiguous, we report its results in two columns. The left column (PRECISE) records the percentage of questions where the set of returned SQL queries contains the correct query.</Paragraph>
    <Paragraph position="5"> The right column (PRECISE-1) records the percentage of questions where PRECISE is correct if it is forced to return exactly one query per question. In our experiments, PRECISE returned a single query 92.4% of the time, and returned two queries the rest of the time. Thus, the difference between the two columns is not great.</Paragraph>
    <Paragraph position="6"> Initially, plugging the Charniak parser into PRECISE yielded only 61.9% accuracy. Introducing semantic over-rides to correct prepositional attachment and preposition ellipsis errors increased PRECISE's accuracy to 89.7% -- the parser's erroneous POS tags still led PRECISE astray in some cases.</Paragraph>
    <Paragraph position="7"> After re-training the parser on 150 POS-tagged ATIS questions, but without utilizing semantic overrides, PRECISE achieved 92.4% accuracy. Combining both re-training and semantic over-rides, PRECISE achieved 94.0% accuracy. This accuracy is close to the maximum that PRECISE can achieve, given its incomplete semantic information-- we found that, when all parsing errors are corrected by hand, PRECISE's accuracy is 95.8%.</Paragraph>
    <Paragraph position="8"> To assess PRECISE's performance, we compared it with previous work. Table 2 shows PRECISE's accuracy compared with the most successful ATIS NLIs (Minker, 1998). We also include, for comparison, the more recent database-independent HEY system (He and Young, 2003). All systems were compared on the ATIS scoring set 'A', but we did &amp;quot;clean&amp;quot; the questions by introducing sentence breaks, removing verbal errors, etc.. Since we could add modules to PRECISE to automatically handle these various cases, we don't view this as significant. null Given the database-specific nature of most previous ATIS systems, it is remarkable that PRECISE is able to achieve comparable accuracy. PRECISE does return two interpretations a small percentage of the time. However, even when restricted to returning a single interpretation, PRECISE-1 still achieved an impressive 89.1% accuracy (Table 1).</Paragraph>
  </Section>
class="xml-element"></Paper>