<?xml version="1.0" standalone="yes"?>
<Paper uid="H91-1022">
  <Title>THE USE OF A COMMERCIAL NATURAL LANGUAGE INTERFACE IN THE ATIS TASK</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
THE USE OF A COMMERCIAL NATURAL LANGUAGE
INTERFACE IN THE ATIS TASK
Evelyne Tzoukermann
AT&amp;T Bell Laboratories
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> A natural language interface for relational databases has been utilized at AT&amp;T Bell Laboratories as the natural language component of the DARPA ATIS common task. This is a part of a larger project that consists of incorporating a natural language component into the Bell Laboratories Speech Recognizer.</Paragraph>
    <Paragraph position="1"> The commercially available system used in this project was developed by Natural Language Incorporated (NLI), in particular by J. Ginsparg \[Ginsparg, 1976\]. We relate our experience in adapting the NLI interface to handle domain-dependent ATIS queries. This experience allowed us to explore several important issues in speech and natural language: 1. the feasibility of using an off-the-shelf commercial product as a language understanding front end to a speech recognizer, 2. the constraints of using a general-purpose product for a specific task.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The ATIS common task was designed by DARPA and the members of the DARPA community to build and evaluate a system capable of handling continuous and spontaneous speech recognition as well as natural language understanding. Although the evaluation task is still not fully defined, the ATIS common task presents the opportunity to develop reliable and measurable criteria. The present paper focuses on the natural language component only, the integration with speech being reported in other papers \[Pieraccini and Levin, 1991\]. The domain of the task is the Air Travel Information Service (ATIS). The project touches on a wide range of issues in both natural language and speech recognition, including incorporation of an NL interface in speech understanding, flexibility in the type of input language (i.e. spoken or written), relational databases, evaluation of system performance, possible limitations, and others.</Paragraph>
    <Paragraph position="1"> The NLI system is driven by a syntactic parser designed to handle English queries that are characteristic of the written language. In contrast, ATIS syntax is characteristic of spoken and spontaneous language. Therefore, one of the primary questions in using the NLI system has been how to overcome problems related to the discrepancy between written and spoken language input. Issues related to the ATIS domain and queries on the one hand, and to the construction of the NLI interface on the other hand, are addressed. The task of the experiment is then described along with the results.</Paragraph>
    <Paragraph position="2"> 2 Why use a commercial product? Using a commercial product is attractive for a number of reasons: within Bell Laboratories, there has been no effort so far to develop a natural language interface (although this may change). Therefore, using a publicly available system represents a significant savings of time and effort toward the larger task, that is, the integration of speech and natural language.</Paragraph>
    <Paragraph position="3"> Within the task of language understanding, the use of a natural language interface designed to understand written language input exposes issues specific to the incorporation of speech.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="134" type="metho">
    <SectionTitle>
3 NLI system description
</SectionTitle>
    <Paragraph position="0"> The NLI system is composed of a series of modules, including a spelling corrector, a parser, a semantic interface consulting a knowledge representation base, a conversation monitor, a deductive system, a database inference system as well as a database manager, and an English language generator \[NLI Development Manual, 1990\]. The components are: * a spelling corrector which analyzes morphological forms of the words and creates a word lattice.</Paragraph>
    <Paragraph position="1"> * a parser which converts the query represented by a word lattice into a grammatical structure resembling a sentence diagram.</Paragraph>
    <Paragraph position="2"> * a semantic interface which translates the parser output into the representation language, a hybrid of a semantic network and first-order predicate logic.</Paragraph>
    <Paragraph position="3"> This language permits representation of time dependencies, quantified statements, tense information, and general sets. The representation produced by this system component is concept-based rather than word-based.</Paragraph>
    <Paragraph position="4"> * a language generator which transforms the representation language statements into an English sentence. * an interpreter which reasons about the representation language statements and makes decisions. It is also called by the semantic interface to help resolve reference, score ambiguous utterances, and perform certain noun and verb transformations.</Paragraph>
    <Paragraph position="5"> * a database interface which translates the representation language statements into SQL database statements.</Paragraph>
    <Paragraph position="6"> * a dictionary which contains over 9,000 English words and their parts of speech.</Paragraph>
    <Paragraph position="7"> * a set of concepts which consists of internal notions of predicates, named objects, and statements.</Paragraph>
    <Paragraph position="8"> * a set of rules which consists of &amp;quot;facts&amp;quot; that make up the rule base of the system. Rules are statements which are believed to be true. The rule interface can handle quantification, sets, and general logic constructs.</Paragraph>
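The module chain described above can be sketched in a few lines of Python. All function names and the toy SQL mapping are our own illustration; NLI's actual implementation is not public, and the real components (word lattices, the hybrid representation language) are far richer than these stand-ins.

```python
# Illustrative sketch of an NLI-style pipeline: spelling corrector ->
# parser -> semantic interface -> database interface. All names are
# hypothetical; only the module ordering follows the description above.

def spelling_correct(query: str) -> list:
    """Toy 'spelling corrector': returns a word lattice (here, a token list)."""
    return query.lower().replace("?", "").split()

def parse(lattice: list) -> dict:
    """Toy 'parser': builds a flat grammatical structure from the lattice."""
    return {"verb": lattice[0], "args": lattice[1:]}

def to_representation(tree: dict) -> dict:
    """Toy 'semantic interface': maps the parse to a concept-based form."""
    return {"predicate": tree["verb"], "concepts": tree["args"]}

def to_sql(rep: dict) -> str:
    """Toy 'database interface': emits an SQL statement for the representation."""
    where = " AND ".join(f"col{i} = '{c}'" for i, c in enumerate(rep["concepts"]))
    return f"SELECT * FROM flight WHERE {where};"

def answer(query: str) -> str:
    """Run the full chain on one English query."""
    return to_sql(to_representation(parse(spelling_correct(query))))

print(answer("show flights"))
```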
  </Section>
  <Section position="5" start_page="134" end_page="135" type="metho">
    <SectionTitle>
4 Making the connections
</SectionTitle>
    <Paragraph position="0"> The first steps in using NLI consisted of creating connections within the concept tables of the database and in reformatting the ATIS database into the NLI formalism. This required several types of operations; one operation consisted of taking a relation, naming what it represents, and connecting it with its properties. For example, the relation &amp;quot;aircraft&amp;quot; represents a plane, and the attribute &amp;quot;weight&amp;quot; in &amp;quot;aircraft&amp;quot; represents the weight of the plane, i.e. how heavy or light it is, in units of pounds.</Paragraph>
    <Paragraph position="1"> Another was to instantiate verb templates. For example, verbs such as &amp;quot;travel&amp;quot;, &amp;quot;fly&amp;quot;, &amp;quot;go&amp;quot;, etc. must be linked to a filler such as &amp;quot;from_airport&amp;quot; through the preposition &amp;quot;from&amp;quot;. The relation &amp;quot;flight&amp;quot; contains information about which airlines (relation &amp;quot;airline&amp;quot; via &amp;quot;airline_code&amp;quot;) fly flights (relation &amp;quot;flight&amp;quot;) from airports (&amp;quot;from_airport&amp;quot;) to airports (&amp;quot;to_airport&amp;quot;) on airplanes (relation &amp;quot;aircraft&amp;quot; via &amp;quot;aircraft_code&amp;quot;) at times (&amp;quot;departure_time&amp;quot;, &amp;quot;arrival_time&amp;quot;, &amp;quot;flight_day&amp;quot;). A third type of expansion involves the synonymy between two items; for example, the system must be informed that the string &amp;quot;transportation&amp;quot; should be understood as &amp;quot;ground transportation&amp;quot;. Connections were added incrementally in order to expand the coverage.</Paragraph>
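The three kinds of connections described above can be pictured as simple lookup tables. The table layout below is our own illustration, not NLI's connection-file format; only the relation names, fillers, and the synonym example come from the text.

```python
# Hypothetical data-driven encoding of the three connection types
# described above: relation/attribute naming, verb templates, and
# synonyms. The schema names (from_airport, aircraft, ...) follow the
# ATIS relations mentioned in the text; the layout is illustrative.

RELATION_CONCEPTS = {
    # relation -> (concept it represents, attribute annotations)
    "aircraft": ("plane", {"weight": "weight of the plane, in pounds"}),
}

VERB_TEMPLATES = {
    # verb -> preposition -> database filler
    "fly":    {"from": "from_airport", "to": "to_airport"},
    "travel": {"from": "from_airport", "to": "to_airport"},
    "go":     {"from": "from_airport", "to": "to_airport"},
}

SYNONYMS = {
    # surface string -> string the system should actually understand
    "transportation": "ground transportation",
}

def expand_query(tokens):
    """Apply the synonym connections to a tokenized query."""
    return [SYNONYMS.get(t, t) for t in tokens]

print(expand_query(["show", "transportation", "options"]))
```

In this style, widening coverage means adding rows to the tables rather than changing code, which matches the incremental expansion the text describes.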
    <Paragraph position="2"> 5 System performance and analysis of results The system has been trained on the set of training sentences (only the non-ambiguous sentences called &amp;quot;class A&amp;quot;, i.e. about 550 sentences) recorded at Texas Instruments and the set of test sentences (i.e. about 100 sentences distributed by NIST) that were used for the June 1990 DARPA task. The last test made on the training sentences gave over 61% successfully answered queries which conformed to the Common Answer Specification (CAS) required by the NIST comparator program. It must be pointed out that the translation of NLI output answers into the CAS format was not a straightforward process. For example, when the system could not answer a query successfully, it output various expressions such as: Sorry, I didn't understand that. Please check your spelling or phrasing, or The database contains no information about how expensive airports are, or I could not find a meaning for the noun &amp;quot;five&amp;quot;, etc., so finding the correct CAS format became guesswork. For this purpose, a program was written by Mark Beutnagel at Bell Laboratories to handle the general cases (transformation of the output tables into the CAS tables) but also a number of idiosyncratic ones.</Paragraph>
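The kind of post-processing just described can be sketched as a normalizer that maps NLI's free-text failure messages to a no-answer marker. The message patterns are taken from the examples in the text; the function itself is our reconstruction, not the actual Bell Laboratories program.

```python
# Sketch of CAS post-processing: NLI failure messages become a
# "no answer" marker, while answer tables pass through. The patterns
# come from the examples quoted in the text; everything else is
# illustrative, not the real translation program.

NO_ANSWER_PATTERNS = (
    "Sorry, I didn't understand",
    "The database contains no information",
    "I could not find a meaning",
)

def to_cas(nli_output):
    """Return a CAS-style answer: failures become 'NO_ANSWER'."""
    if isinstance(nli_output, str) and any(
        nli_output.startswith(p) for p in NO_ANSWER_PATTERNS
    ):
        return "NO_ANSWER"
    return nli_output  # assume a well-formed answer table

print(to_cas("Sorry, I didn't understand that."))
```

The guesswork the text mentions lives in the pattern list: any failure phrasing not anticipated there would slip through as if it were an answer, which is why idiosyncratic cases needed special handling.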
    <Paragraph position="3"> The February '91 test was designed to handle different types of queries, unlike the June '90 test that had only class A sentences. The queries were divided into four categories: class A, or non-ambiguous; class A0, non-ambiguous but containing so-called verbal deletions (these in fact exhibit all sorts of spoken-language peculiarities, such as What flights list the flights from Pittsburgh to San Francisco?); class D, for dialog sentences where queries are presented in pairs (one member of the pair indicating the context of the sentence, the other member being the query itself); and class D0, for dialog sentences with verbal deletions. (We thank Mark Beutnagel for his many hours of useful help.)</Paragraph>
    <Paragraph position="4"> At the time of the experiment, although NLI could handle queries with anaphoric pronouns across sentences, such as in the pair Show the flights from Atlanta to Baltimore. When do they leave?, the connection file had not been shaped in that direction. The system was trained only to handle the class A queries. Answers to the four categories were run and sent, but only the class A results are of real interest and relevance. The following table shows the results for the four sets of sentences. The queries were evaluated in three categories: &amp;quot;T&amp;quot; for &amp;quot;True&amp;quot;, &amp;quot;F&amp;quot; for &amp;quot;False&amp;quot;, and &amp;quot;NA&amp;quot; for &amp;quot;No_Answer&amp;quot;:</Paragraph>
  </Section>
  <Section position="6" start_page="135" end_page="135" type="metho">
    <SectionTitle>
[Results table: T/F/NA counts for classes A, A0, D, D0; numeric values not recoverable from the source]
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="7" start_page="135" end_page="135" type="metho">
    <SectionTitle>
6 Error analysis
</SectionTitle>
    <Paragraph position="0"> The first obstacle encountered in utilizing NLI was the nature of the input queries. The ATIS task is meant to understand spoken and spontaneous language, whereas NLI is built to understand written language.</Paragraph>
    <Paragraph position="1"> There are a number of discrepancies between spoken and written language that involve a different parsing strategy; spontaneous speech contains various kinds of: * repetitions such as through through in the sentence Please show me all the flights from DFW to Baltimore that go through through Atlanta; * restarts as shown in the query What flights list the flights from Pittsburgh to San Francisco; * deletions such as the missing word Francisco in Display ground transportation options from Oakland to San downtown San Francisco; * interjections with the use of the word Okay in Okay I'd like to see a listing of all flights available...; * ellipsis such as in the third phrase I'm sorry cancel that flight. The passenger wants to fly on Delta.</Paragraph>
    <Paragraph position="2"> How about Delta 870 on the 12th?; Note that in this format, the punctuation marks which might give the system information do not occur.</Paragraph>
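Some of the shallower phenomena in the list above lend themselves to simple preprocessing. The sketch below handles only immediate word repetitions and leading interjections; it is our own illustration (restarts, deletions, and ellipsis require real parsing and are not addressed).

```python
# Minimal preprocessing sketch for two spoken-language phenomena listed
# above: adjacent word repetitions ("through through") and leading
# interjections ("Okay ..."). Illustrative only; not part of NLI.

INTERJECTIONS = {"okay", "well", "um", "uh"}

def strip_disfluencies(tokens):
    """Drop a leading interjection and collapse adjacent duplicate words."""
    if tokens and tokens[0].lower() in INTERJECTIONS:
        tokens = tokens[1:]
    cleaned = []
    for t in tokens:
        if not cleaned or cleaned[-1].lower() != t.lower():
            cleaned.append(t)
    return cleaned

print(strip_disfluencies(
    ["Okay", "show", "flights", "through", "through", "Atlanta"]
))
```

Note that collapsing duplicates blindly would also destroy legitimate repetitions (e.g. a city named twice for two different roles), which is one reason a written-language parser cannot be patched this way alone.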
    <Paragraph position="3"> There are a number of explanations for the unanswered sentences: * Lexical gaps: if a lexical item is not in the lexicon, no analysis is given. The problem of lexical gaps is partly due to the domain-specific vocabulary of the ATIS task. In the following example, I need flight schedule information from Denver to Philadelphia, the system does not have the word schedule in the lexicon; therefore the sentence is rejected.</Paragraph>
    <Paragraph position="4"> * Informal addition of information: this is more common in spoken than in written language. For example, in the following sentence, the speaker adds information in what is almost telegraphic speech: Cost of a first class ticket Dallas to San Francisco departing August the 6th.</Paragraph>
    <Paragraph position="5"> * Absence of additional connections: in sentences like the following, the system cannot answer because arrival times are related to the flight relation and not to the fare relations in the relational database: On fare code 7100325 list your arrival times from Dallas to San Francisco on August 14th.</Paragraph>
    <Paragraph position="6"> * System incompleteness at the time of the test: in the sentence Is there a flight from Denver through Dallas Fort Worth to Philadelphia? the connection was established to handle a from-to relation, but not a through relation.</Paragraph>
    <Paragraph position="7"> * Length of the sentences, as in Display lowest price fare available from Dallas to Oakland or Dallas to San Francisco and include the flight numbers on which these options are available.</Paragraph>
    <Paragraph position="8"> This is a common problem in many NL systems. Other sentences remain unanswered due either to some contradictory meanings in the lexical items of the queries or to the design of the database.</Paragraph>
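The lexical-gap behavior described above, where a single out-of-lexicon word causes the whole query to be rejected, can be shown with a toy out-of-vocabulary check. The miniature lexicon below is illustrative; NLI's actual dictionary held over 9,000 words.

```python
# Sketch of the lexical-gap rejection described above: any word outside
# the lexicon blocks analysis of the entire query. The tiny lexicon is
# illustrative only; NLI's dictionary contained over 9,000 words.

LEXICON = {"i", "need", "flight", "information", "from", "denver", "to",
           "philadelphia"}

def check_lexical_gaps(query):
    """Return the unknown words; an empty list means the query can be analyzed."""
    return [w for w in query.lower().split() if w not in LEXICON]

print(check_lexical_gaps(
    "I need flight schedule information from Denver to Philadelphia"
))
```

Here the domain word schedule is the single gap that, per the text's example, causes the otherwise well-formed sentence to be rejected outright.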
  </Section>
</Paper>