<?xml version="1.0" standalone="yes"?>
<Paper uid="H91-1021">
  <Title>Augmented Role Filling Capabilities for Semantic Interpretation of Spoken Language</Title>
  <Section position="3" start_page="0" end_page="126" type="metho">
    <SectionTitle>
SEMANTICS
</SectionTitle>
    <Paragraph position="0"> When evaluating our system after the Hidden Valley workshop, we observed two phenomena about PUNDIT (the Unlsys natttral language understanding system \[3\]) that warranted improvement. The first was that PUNDIT's semantic interpreter was sometimes fA;llng to correctly recognize predicate  argument relationships for syntactic constituents that were not immediately associated with their intended head. The second was that PUNDIT was producing different representations for queries with different syntactic/lexical content but identical (or nearly identical) semantic content. We see both of these shortcomings as due to what we will term &amp;quot;non-transparent argument structure&amp;quot;: syntactic representations in which syntactic constituents are not associated with their intended head, or semantic representations in which predicate-argument relationships are underspecified. Our approach to dealing with these shortcomings has been to maintain a rule-governed approach to role-filling despite non-transparent syntactic and semantic structures. We believe that the extensions we are about to describe are especially relevant to Spoken Language Understanding, because non-transparent argument structure appears to be particularly characteristic of spontaneous spoken utterances, for reasons we will sketch below.</Paragraph>
    <Paragraph position="1"> The semantic interpreter and non-transparent parses Semantic interpretation in PUNDIT is the process of instantiating the arguments of case frame structures called decompositions, which are associated with verbs and selected nouns and adjectives (\[7\]). The arguments of decompositions are assigned thematic role labels such as agent, patient, source, and so forth. Semantic interpretation is a search for suitable grammatical role/thematic role correspondences, using syntax-semantics mapping rules, which specify what syntactic constituents may fill a particular role; and semantic class constraints, which specify the semantic properties required of potential fillers. The syntactic constraints on potential role fillers are of two types: CATEGORIAL constraints, which require that the potential filler be of a certain grammatical type such as subject, object, or prepositional phrase; and ATTACHMENT constraints, which require that the potential filler occur within the phrase headed by the predicate of which it is to be an argument. The categorial constraints are stated explicitly in the syntax-semantics mapping rules; the latter are implicit in the functioning of the semantic interpreter. For example, the source role of flight_C, the domain model predicate associated with the noun &amp;quot;flight&amp;quot;, can, in accordance with the syntax-semantics mapping rules, be filled by the entity associated with the object of a &amp;quot;from'-pp occurring within the same noun phrase as &amp;quot;flight&amp;quot; (The flight from Boston takes three hours). Unfortunately, the parse does not always express the argument structure of the sentence as transparently as it does in this example; constituents that should provide role fillers for a predicate are not always syntactically associated with the predicate. There are several causes for such a mismatch between the parse and the intended interpretation. They include (l) a variety of syntactic deformations which we will refer to as extraposition ( What flights do you have to Boston, where the &amp;quot;to'-pp belongs in the subject np; I need ticket information from Boston to Dallas, where the pp's modify the prenominal noun &amp;quot;ticket&amp;quot;, not the head noun &amp;quot;information&amp;quot;; or I toant a cheaper flight than Delta 66, where the &amp;quot;than'-pp modifies &amp;quot;cheaper&amp;quot;, not &amp;quot;flight&amp;quot;), (2) metonymy (I toant the $50.00 flight, where the speaker means that s/he wants the flight whose FARE is $50.00), and (3) suboptimal parses (e.g., parses with incorrect pp-attachment).</Paragraph>
    <Paragraph position="2"> Our changes to the semantic interpreter allow it to fill roles correctly in cases such as the above, utilising its existing knowledge of syntax-semantics correspondences, but relaxing certain expectations about the syntactic attachment of role-filling constituents. Thus the CATEGORIAL constraints remain in force, but the ATTACHMENT constraints have been loosened somewhat. The system now identifies prepositional phrases and adverbs which have not frilled a role in the predicate with which they are syntactically associated, and offers them as role fillers to fillers of this predicate. This strategy applies recursively to fillers of fillers of roles; for example, in What types of ground transportation services are available from the airport in Atlanta to downtown A tlanta f , the two final pp's ultimately fill roles in the decomposition associated with &amp;quot;ground transportation&amp;quot; since neither &amp;quot;types&amp;quot; nor &amp;quot;services&amp;quot; has mapping rules to consume them. The same mechanism already in place for role-filling is employed in these cases, the only difference being that unused syntactic constituents are propagated downward. Note that we continue to take syntax into account; we do not wish to ignore the syntax of leftover constituents and fill roles indiscriminately on the basis of semantic properties alone.</Paragraph>
    <Paragraph position="3"> We conducted an experiment to assess the effects of these changes upon the system's performance, using a set of 138 queries (both class A and non-class A) on which the system was previously trained. The measure of performance used was the standard ATIS metric of the number of correct answers minus the number incorrect. Disabling the semantic changes described above lowered the system's score from 82 to 63, a decrease of 23~.</Paragraph>
    <Paragraph position="4"> The application module and non-transparent semantic representations Our second improvement was directed at cases where PUNDIT's semantic interpreter may have correctly represented the meaning of a sentence but in an irregular way. For exampie, the instantiated decomposition produced for &amp;quot;flights from Boston&amp;quot; is: fiight_C/ (~iightl, source (boston) .... ) while &amp;quot;flights leaving Boston&amp;quot; resulted in: ~light C (flight 1, source (_), ...) loavoP (lsavel, flight (flight 1), source (boston), ...) Clearly it would be preferable for the flight_C decomposition to be the same in both cases, but in the second case the source role of the decomposition associated with flightl was unfilled, although it could be inferred from the leaveP decomposition that the flight's source was Boston. In other words, PUNDIT had not captured a basic synonymy relation between these np's.</Paragraph>
    <Paragraph position="5"> Our response to this was to augment the semantic interpreter with a routine which can perform inferences involving more than one decomposition. The actual inferences are expressed in the form of rules which are domain-dependent; the inference-performlng mechanism is domain-independent. For the above example, we have written a rule which, paraphrased  in English, says that if a verb is one of a class of motion verbs used to express flying (e.g., &amp;quot;leave&amp;quot;), and if the source role of this verb is filled, propagate that filler to the source role of the flight involved. Thus the flight_C decomposition becomes the same for both inputs. Thirty-four such rules have been written for the ATIS domain, and we estimate that they are applicable to 10% to 15% of the training data.</Paragraph>
    <Paragraph position="6"> The payoff from this extension comes in the use of PUNDIT's output by application modules. For the ATIS domain, the application module is the program that takes PUNDIT's output and uses it to formulate a query to the ATIS DB. It is obviously advantageous for the creation and maintenance of an application module that its input be regularized to the greatest extent possible, thus making such a module simpler, and avoiding duplication of code to compensate for non-regularized input in different application modules.</Paragraph>
    <Paragraph position="7"> When we ran the same set of 138 queries used in the experiment described in the previous subsection without the rules just discussed (but with the semantics improvements of the previous subsection), the system's score dropped from 82 to 62, or 24%. There appears to be little interaction between the semantics improvements and the rules of this subsection-they apply to different phenomena in input data.</Paragraph>
  </Section>
  <Section position="4" start_page="126" end_page="127" type="metho">
    <SectionTitle>
PRAGMATICS
</SectionTitle>
    <Paragraph position="0"> In our June 1990 workshop paper (\[6\]), we described a feature of our system which we included to handle correctly a particular kind of discourse phenomenon. In particular, in the ATIS domain there are frequent references to flights by flight nnmher (e.g., &amp;quot;Delta flight 123&amp;quot;) which the user means to be unambiguous, but which in general have to be disambiguated in context. The reason is that the user learned about &amp;quot;Delta 123&amp;quot; from some previous answer, where it was returned as one of the flights between two cities City1 and City2. The problem is that &amp;quot;Delta 123&amp;quot; may have additional legs; for instance it may go on from City2 to City3. The user, when asking for the fares for &amp;quot;Delta 123&amp;quot;, is presumably interested only in the City1 to City2 fare, not the City2 to City3 one and not the City1 to City3 one. So our system looked back at previous answers to find a mention of &amp;quot;Delta 123&amp;quot;, thereby determining the flight leg of interest.</Paragraph>
    <Paragraph position="1"> This kind of disamhiguation can take other forms, and we have added some of them to our system since June. One of these capabilities is illustrated by the two queries What does LH meanf and What does EQP meanf Without context, the first of these cannot be correctly answered, because &amp;quot;LH&amp;quot; is a code for both an airline and a fare class. The second of these queries would yield a table with two rows, one row for each table for which &amp;quot;EQP&amp;quot; is one of the table's column headings. In both of these queries, however, the user is asking for clarification of something which has been presented as part of a previous answer display. So what our system needs to do, and does do, is refer back to previous answers much in the spirit of the &amp;quot;Delta 123&amp;quot; example above. For the first query, we will find the most recent answer which has &amp;quot;LH&amp;quot; as a column entry in some row: for the second we will find the most recent answer which has &amp;quot;EQP&amp;quot; as a column heading.</Paragraph>
    <Paragraph position="2"> Our system can then make the proper disambiguation and present the user with an appropriate cooperative response to the follow-up query. There were only a handful of follow-up queries of this form in the training data, hut the extension to handle them was easy to add given the code in place to handle the &amp;quot;Delta 123&amp;quot; example.</Paragraph>
    <Paragraph position="3"> Similarly, the training data contained numerous instances of queries such as What are the claues. ~ In the absence of context, the best answer to this seems to be a llst of the more than 100 different fare classes. However, queries such as these invariably follow the display of some fare classes in either flight tables or fare tables. The cooperative response, then, is to display a table of fare classes whose rows have been limited to those classes previously mentioned in the most recent flight or fare table. Our system also uses a generalization of this algorithm to filter requests for other kinds of codes, such as restrictions, ground transportation codes, aircraft codes, and meal codes. In all, from the TI training data (\[2\]) we have noticed 19 follow-up queries (out of 912) which now get the correct answer in context because of this extension to our system; there may be more queries which requite this extension that we have not yet processed correctly for other reasons.</Paragraph>
    <Paragraph position="4"> We make it possible to refer to previous answer tables in our system by means of the following mechanism. Whenever an answer table is returned, a discourse entity representing it is added to the discourse context, and a semantics for this entity is provided. Roughly speaking, if the query leading to the answer table is a request for X, the semantics can be thought of as being &amp;quot;the table of X&amp;quot; (\[6\]). For example, if the query was a request for flights from City1 to City2, the semantics assigned to the discourse entity representing the answer is &amp;quot;the table of flights from City1 to City2&amp;quot;. Note that we do NOT create discourse entities for each row (particular flights from City1 to City2 in the example) or for each column entry in a row (e.g., the departure time of a particular flight from Cityl to City2). Doing so would make the discourse context munanageably large. But the table (complete with column headings) is available and accessible to our system, and can be searched for particular values when it is desirable to do so, as in the capabilities being described in this section.</Paragraph>
    <Paragraph position="5"> The techniques just described depend on the availability of previous ANSWERS. Some of the follow-up queries which they enable to be answered correctly could perhaps be handled by reference to previous QUERIES only, particularly in the special case where there is known to be only one previous query. We believe that our techniques are superior for at least two reasons. First, in the presence of more than one previous query, the answers to those previous queries are for our system a more compact and modular representation of the content of those queries than the discourse entities created while analysing the queries themselves; in short, it is simply easier to find what we want in the answers rather than in our representations of the queries. Second, there are follow-up queries which cannot he answered unless reference is made to previous answers, so such techniques are necessary in a complete system.</Paragraph>
    <Paragraph position="6"> Therefore, why not use them whenever they can be used, even when alternative techniques might be available? The February 1991 D1 pairs test, which limited context dependency to dependency which could be resolved by examination of a single previous query (and not its answer), provides  additional data on the applicability of these methods. In particular, 27 of the 38 pairs involved the disambiguation of a flight number to the flight leg of interest. It appears that four additional queries can be successfully answered by the technique we discussed above for handling the query What are the classes? The remaining 7 queries appear to be such that reference to previous answers is not helpful.</Paragraph>
  </Section>
  <Section position="5" start_page="127" end_page="127" type="metho">
    <SectionTitle>
SPOKEN LANGUAGE SYSTEMS
</SectionTitle>
    <Paragraph position="0"> We describe here the five spoken language tests in which we participated. Our methodology in these tests has been to couple the speech recognition output from different recognizers to the same natural language processing system. Because the natural language component and the application module are held constant in these systems, this methodology provides us with a means of comparing the performance of speech recognlzers in a spoken language context.</Paragraph>
    <Paragraph position="1"> Class .A: UnSays PUNDIT system coupled with MIT Summit The spoken language system used in this test consists of the Unisys PUNDIT natural language system coupled via an N-best interface to the MIT SUMMIT speech recognition system. We will refer to this system as Unisys-MIT. These results were run with N=16, except for 4 utterances which could not be run at N=16 because of insufficient memory in the speech recognition system. N=I was used for these utterances. SUMMIT produced the N-best and PUNDIT selected from the N-best the most acoustically probable utterance which also passed PUNDIT's syntactic, semantic, and pragmatic constraints. PUNDIT then processed the selected candidate to produce the spoken language system output. The value of N of 16 was selected on the basis of experiments reported in \[1\], which demonstrated that using larger N's than 10-15 leads to a situation where the chance of getting an F begins to outweigh the possible benefit of additional T's.</Paragraph>
    <Paragraph position="2"> The SUMMIT system is a speaker-independent continuous speech recognition system developed at the MIT Laboratory of Computer Science. It is described in \[10\].</Paragraph>
    <Paragraph position="3"> Unlsys PUNDIT coupled with Lincoln Labs Speech Recognizer null The spoken language system used in this test consists of the Unisys PUNDIT natural language system loosely coupled to the MIT Lincoln Labs speech recognition system. The Lincoln Labs system selected the top-1 output, which PUNDIT then processed to produce the spoken language system output.</Paragraph>
    <Paragraph position="4"> The LincoLn Labs system is a speaker independent continuous speech recognition system which was trained on a corpus of 5020 training sentences from 37 speakers. It used a bigram backoff language model of perplexity 17.8. The system is described in more detail in \[8\].</Paragraph>
    <Paragraph position="5">  BBN was the output from BYBLOS rescored using cross-word models and a 4-gram model and then reordered before input to the natural language system.</Paragraph>
  </Section>
  <Section position="6" start_page="127" end_page="127" type="metho">
    <SectionTitle>
SPEECH RECOGNITION TESTS
</SectionTitle>
    <Paragraph position="0"> The speecli recognition tests were done using the natural language constraints provided by the Unisys PUNDIT natural language system to select one candidate from the N-best output of the MIT Laboratory of Computer Science SUMMIT speech recognition system. Using an N of 16, PUNDIT selected the first candidate of the N-best which passed its natural language constraints based on syntactic, semantic and pragmatic knowledge. If all candidates were rejected by the natural language system, the first candidate in the N-best was considered to be the recognized string.</Paragraph>
  </Section>
class="xml-element"></Paper>