File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/86/j86-1002_metho.xml

Size: 42,718 bytes

Last Modified: 2025-10-06 14:11:53

<?xml version="1.0" standalone="yes"?>
<Paper uid="J86-1002">
  <Title>THE CORRECTION OF ILL-FORMED INPUT USING HISTORY-BASED EXPECTATION WITH APPLICATIONS TO SPEECH UNDERSTANDING</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 AN OVERVIEW OF THE HISTORY-BASED
EXPECTATION SYSTEM
</SectionTitle>
    <Paragraph position="0"> The general goal of the history-based expectation system is to merge a series of dialogues, each of which consists of a sequence of sentences, into a more general dialogue that reflects the patterns that exist between and within the separate dialogues. Thus, the expectation system must: - save incoming dialogues, - find patterns between and within these dialogues so that they can be merged into a more general dialogue, which becomes a formula for a more general situation, and - use this information to help predict what will be said by a user in a given situation.</Paragraph>
    <Paragraph position="1"> This ability to predict what might be said by a user can help error correct what is input to the natural language system through errorful means, such as a voice recognizer. We will call this ability expectation. Figure 1 shows an overview of the structure of the history-based expectation system. Expectation is acquired at two levels, the sentence level and the dialogue level. A special parser, called the expectation parser, is used to analyze at the sentence level. The expected dialogue is a data structure used to store the history-based expectation that is acquired using an expectation acquisition algorithm. This constitutes the dialogue level.</Paragraph>
    <Paragraph position="2"> As each sentence is entered into the system, such as through a speech recognition device, it is parsed and a meaning representation is produced and saved by an expectation acquisition algorithm in the expectation module (see 1 in Figure 1). The parse is also output for use in the next step in the system's processing of the sentence. This process builds a sequence of sentence meanings, which are then incorporated into an expected dialogue (see 2 in Figure 1). After an expected dialogue is partially or completely built, the expectation module attempts to determine where the user is in a given dialogue using information from the expected dialogue and the current parsed sentence (see 1 and 3 in Figure 1). If it succeeds, it creates and transmits (see 4 in Figure 1) an expected sentence set to the expectation parser. The expectation parser will then use this information to improve its ability to recognize the next incoming sentence.</Paragraph>
    <Paragraph position="4"/>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 A REPRESENTATION FOR USER BEHAVIORS
</SectionTitle>
    <Paragraph position="0"> Suppose a user inputs the following sequence:</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Sentence Label
</SectionTitle>
      <Paragraph position="0"> Display my mail summary for today. (S1)
Show me this letter. (with touch input) (S2)
(the letter appears on the screen)
Remove this letter. (S3)
Display the letter from JA. (S4)
(letter appears on the screen)
Delete it. (S5)
Log off. (S6)
We denote the meaning of each sentence Si with the notation M(Si). The exact form of M(Si) need not be discussed at this point; it could be a conceptual dependence graph (Schank and Abelson 1977), a deep parse of Si, or some other representation. A user behavior is represented by a network, or directed graph, of such meanings. At the beginning of a task, the state of the interaction is represented by the start state of the graph. The immediate successors of this state are the typical opening meaning structures for this user, and succeeding states represent, historically, paths that have been followed by this user.</Paragraph>
      <Paragraph position="1"> It is important that if two sentences, Si and Sj, have approximately the same meaning this should be clear in the representations M(Si) and M(Sj). Our algorithm, described below, merges two meanings M(Si) and M(Sj) into a single node in the behavior representation if they - are sufficiently similar, and - appear in similar contexts.</Paragraph>
      <Paragraph position="2"> Thus, in the above example it would appear that M(S3) and M(S5) play similar roles and could be represented by one structure: after a letter is read, one might expect to see it deleted.</Paragraph>
      <Paragraph position="3"> Often, two commands will be similar except for the instantiation of certain constituents. This is the case in sentences S2 and S4, which request the display of, respectively, the message indicated by a touch and the letter from JA. Again, it is desired to represent such similar meanings in a behavior graph with a single node if they appear in similar environments. Thus, a routine will be needed to find a generalization of two such sentences that can represent their common meaning. In the example, the generalization of S2 and S4 might be &quot;display (LETTER)&quot; where &quot;(LETTER)&quot; is a noun group referring to a letter.</Paragraph>
      <Paragraph position="4"> In tracking a dialogue, we may arrive at a node in the behavior graph with meaning M1. This means a command is expected with meaning M2 that is either identical to, or a special case of, M1. If such an M2 is input at this time, we will say that M1 predicts M2 and define the predicate: Predicts(M1, M2) = true if and only if meaning M1 is identical or similar to M2.</Paragraph>
      <Paragraph position="5"> It is quite possible, as with M(S2) and M(S4) above, that a common generalization can be found for two sentences that appear in similar contexts. Then one will be able to merge them into a single node in the behavior graph. Thus, it is necessary to have a predicate to check whether these conditions hold and a function to find the desired generalization. The following two routines do this: Mergeable(M1, M2) = true if and only if an M can be found such that Predicts(M, M1) and Predicts(M, M2).</Paragraph>
      <Paragraph position="6"> Merge(M1, M2) yields a meaning M that is identical to, or a generalization of, M1 and M2.</Paragraph>
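      <Paragraph> As a minimal sketch of how these three routines relate, the following Python fragment assumes a toy meaning representation that is not the one used in the paper: a dictionary with the slots verb, noun, and ref, where a wildcard value stands in for a generalized constituent such as (LETTER). The function names mirror Predicts, Mergeable, and Merge, but the slot names and the wildcard are illustrative assumptions, and the requirement that the two meanings appear in similar contexts is not modeled here.

```python
# Toy meaning representation (an assumption for illustration only):
# {"verb": ..., "noun": ..., "ref": ...}; "*" marks a generalized constituent.
WILDCARD = "*"
SLOTS = ("verb", "noun", "ref")
FIXED = ("verb", "noun")  # slots that must agree before two meanings can merge


def predicts(m1, m2):
    """True if m1 is identical to m2 or is a generalization of m2."""
    return all(m1[slot] in (WILDCARD, m2[slot]) for slot in SLOTS)


def merge(m1, m2):
    """Return a meaning identical to, or a generalization of, m1 and m2 (None if impossible)."""
    if any(m1[slot] != m2[slot] for slot in FIXED):
        return None
    ref = m1["ref"] if m1["ref"] == m2["ref"] else WILDCARD
    return {"verb": m1["verb"], "noun": m1["noun"], "ref": ref}


def mergeable(m1, m2):
    """True if some meaning M exists with Predicts(M, m1) and Predicts(M, m2)."""
    m = merge(m1, m2)
    return m is not None and predicts(m, m1) and predicts(m, m2)


# S2 and S4 from the sample dialogue, hand-encoded in the toy representation.
m_s2 = {"verb": "display", "noun": "LETTER", "ref": "touched"}
m_s4 = {"verb": "display", "noun": "LETTER", "ref": "from JA"}
generalized = merge(m_s2, m_s4)
```

Under these assumptions, generalized plays the role of &quot;display (LETTER)&quot;: it predicts both M(S2) and M(S4), while neither of those predicts the other.</Paragraph>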
      <Paragraph position="7"> A user behavior is represented as a network of sentence meanings with transitions from one meaning to another that indicate traversals observed in actual dialogues and their frequencies. For example, the above six-sentence sequence could be represented as shown in Figure 2. Each node i carries an integer count Ci, which gives the number of times in observed dialogues this node has been visited. The integer on each transition gives the number of times it has been traversed in observed dialogues.</Paragraph>
      <Paragraph position="8">  More formally, a behavior graph B will consist of a set of nodes named 0, 1, 2, 3, ..., bsize-1. Each node i will have its associated Mi and Ci, and the first node will have a special meaning M0 = 'start'. The transitions will be represented as triples (i, j, k) where the traversal is from node i to node k and has been observed j times. The example six-command sequence would be represented by the nodes 0 through 4 with Mi's and Ci's as shown and with the triples {(0,1,1) (1,1,2) (2,2,3) (3,1,2) (3,1,4)}. (Computational Linguistics, Volume 12, Number 1, January-March 1986, page 15; Pamela K. Fink and Alan W. Biermann, The Correction of Ill-Formed Input.)</Paragraph>
      <Paragraph position="9"> Notice that the observed probability of crossing transition (i, j, k) is j/Ci, a fact that is used by the expectation parser.</Paragraph>
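      <Paragraph> The observed probability j/Ci can be illustrated with a short Python sketch. The triples are those given in the text; the node counts Ci and the meaning labels are assumptions made for this example, since the figure that shows them is not reproduced here.

```python
# Node meanings M_i (paraphrased labels, for readability only).
meanings = ["start", "display mail summary", "display (LETTER)",
            "remove (LETTER)", "log off"]

# Visit counts C_i. These values are assumed for illustration.
counts = [1, 1, 2, 2, 1]

# Transitions as triples (i, j, k): from node i to node k, observed j times.
triples = {(0, 1, 1), (1, 1, 2), (2, 2, 3), (3, 1, 2), (3, 1, 4)}


def transition_probability(i, k):
    """Observed probability j / C_i of crossing the transition from node i to node k."""
    for (src, j, dst) in triples:
        if src == i and dst == k:
            return j / counts[i]
    return 0.0  # the transition has never been observed
```

With these assumed counts, the two transitions leaving node 3 each receive probability 0.5, and the transition from node 2 to node 3 receives probability 1.0.</Paragraph>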
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 THE EXPECTATION MODEL BUILDING AND
TRACKING ALGORITHM
</SectionTitle>
    <Paragraph position="0"> It is desired to have an algorithm to monitor the discourse, collect the history of inputs, and invoke expectation when any kind of repetition occurs. Such an algorithm is described below. To do so, however, some additional notation is needed: current = an integer giving the state number in B corresponding to the most recently recognized sentence.</Paragraph>
    <Paragraph position="1"> bsize = the total number of states in B.</Paragraph>
    <Paragraph position="3"> E(i) = the set of immediate successor states to state i, also called the expected sentence set of i.</Paragraph>
    <Paragraph position="4"> P(S, E(current)) = The result of the expectation parser with input S and E(current), where S is the current input sentence which may have errors, and E(current) is a set of expected meanings in B, the successors of node current. The result or output of the parse of sentence S is its meaning M(S).</Paragraph>
    <Paragraph position="5"> The behavior graph B begins with one state numbered &quot;0&quot; and with M0 = start, C(0) = 0. Thus, the size of the graph is bsize = 1 and the most recently recognized sentence is assumed to be this start state, current = 0. Suppose that the first sentence in the above sample dialogue is read: S1 = &quot;Display my mail summary for today.&quot; Then the processor will begin with no expectation since E(0) is currently the empty set, and find M(S1) = P(S1, E(0)).</Paragraph>
    <Paragraph position="7"> This will result in the creation of a second state in B with the following statements: Create a NEW NODE: Put(current, 1, bsize) into B; (a transition to the new state is created)</Paragraph>
    <Paragraph position="9"> Thus, the first two states shown in Figure 2 will exist with the single transition (0, 1, 1). Sentences S2 and S3 result in similar processing, the addition of states 2 and 3, and the creation of transitions (1, 1, 2) and (2, 1, 3) as shown in Figure 3.</Paragraph>
    <Paragraph position="10">  The input sentence will yield a different action, however, if its meaning M(S) is determined to be mergeable with the meaning of an existing node Mk on the graph. While the details of mergeability have not yet been discussed, let us assume for the current example that M(S4) is mergeable with M(S2). Then a new meaning will appear in the graph that is a generalization of these two, Merge(M(S2), M(S4)), and a graph transition will be built to this new meaning. Transfer to the existing meaning Mk would proceed as follows:</Paragraph>
    <Paragraph position="12"> Figure 4 shows the updated graph. At this point, current = 2, and the expectation set, E(2), is non-empty for the first time. So, now we compute P(S5, {M3}), meaning that S5 is read with the expectation that its meaning will be &quot;remove this one&quot;. Given this expectation, the parser will prefer any transitions down paths that lead to some paraphrase of this sentence and, unless the system clearly recognizes that something else has been said, a sentence meaning &quot;remove this one&quot; should be recognized. If it is, then current will be advanced to this expected node. In general, there may be several expected sentence meanings, and the processor will select the one most similar to the incoming utterance unless that sentence is clearly not any member of the expected set.</Paragraph>
    <Paragraph position="13">  Thus, if a successor k to the current state predicts the incoming sentence, we track that successor. Tracking the expected meaning Mk would proceed as follows:</Paragraph>
    <Paragraph position="15"> The final sentence S6 in the dialogue will cause the creation of a termination state and complete the graph of Figure 2. The algorithm is thus the collection of the above code segments: if no behavior graph B exists then</Paragraph>
    <Paragraph position="17"> until M(S) is a dialogue termination.</Paragraph>
    <Paragraph position="18"> This code creates a finite state model of the dialogue based on equivalence or similarity classes defined by the functions Predicts, Mergeable, and Merge. As will be discussed in the next section, similarity classes are based not only on the similarity of the sentences themselves, but also on the environment in which they occur. Thus, there is only one state for each such similarity class in the finite state model created.</Paragraph>
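    <Paragraph> The three actions described in this section (tracking an expected successor, merging into an existing node, and creating a new node) can be put together in a short Python sketch. This is a simplified reading under stated assumptions, not the authors' code: the expectation parser P is replaced by hand-encoded meanings, the toy meaning format (slots verb, noun, ref, with a wildcard) and the class and method names are hypothetical, and the context test that the paper applies before merging is omitted.

```python
WILDCARD = "*"
SLOTS = ("verb", "noun", "ref")


def predicts(m1, m2):
    """True if m1 is identical to m2 or generalizes it (toy definition)."""
    return all(m1[s] in (WILDCARD, m2[s]) for s in SLOTS)


def merge(m1, m2):
    """Generalize over the 'ref' slot; None if verb or noun disagree."""
    if m1["verb"] != m2["verb"] or m1["noun"] != m2["noun"]:
        return None
    ref = m1["ref"] if m1["ref"] == m2["ref"] else WILDCARD
    return {"verb": m1["verb"], "noun": m1["noun"], "ref": ref}


class BehaviorGraph:
    def __init__(self):
        self.meanings = ["start"]   # M_i, with M_0 = start
        self.counts = [0]           # C_i, with C(0) = 0
        self.transitions = {}       # (i, k) -> j, the times i -> k was traversed
        self.current = 0

    def expected(self, i):
        """E(i): the successors of node i."""
        return [k for (src, k) in self.transitions if src == i]

    def _cross(self, k):
        key = (self.current, k)
        self.transitions[key] = self.transitions.get(key, 0) + 1
        self.counts[k] += 1
        self.current = k

    def observe(self, meaning):
        # 1. Track: a successor of the current node predicts the input.
        for k in self.expected(self.current):
            if predicts(self.meanings[k], meaning):
                self._cross(k)
                return
        # 2. Merge: an existing node (other than start) can be generalized.
        for k in range(1, len(self.meanings)):
            general = merge(self.meanings[k], meaning)
            if general is not None:
                self.meanings[k] = general
                self._cross(k)
                return
        # 3. Otherwise create a new node and a transition to it.
        self.meanings.append(meaning)
        self.counts.append(0)
        self._cross(len(self.meanings) - 1)

    def triples(self):
        return sorted((i, j, k) for (i, k), j in self.transitions.items())


# The six-sentence sample dialogue, hand-encoded (an assumption).
dialogue = [
    {"verb": "display", "noun": "SUMMARY", "ref": "today"},    # S1
    {"verb": "display", "noun": "LETTER", "ref": "touched"},   # S2
    {"verb": "delete", "noun": "LETTER", "ref": "current"},    # S3
    {"verb": "display", "noun": "LETTER", "ref": "from JA"},   # S4
    {"verb": "delete", "noun": "LETTER", "ref": "current"},    # S5
    {"verb": "logoff", "noun": "SESSION", "ref": "none"},      # S6
]
graph = BehaviorGraph()
for meaning in dialogue:
    graph.observe(meaning)
```

Under these assumptions, graph.triples() yields (0,1,1), (1,1,2), (2,2,3), (3,1,2), and (3,1,4), the same set of triples given for the example in Section 3.</Paragraph>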
    <Paragraph position="19"> When the user enters the system again, this algorithm can be reinvoked using the existing B graph. If the next dialogue is very similar to a previous one, then the expectation dialogue will powerfully support error correction. If the next dialogue has little resemblance to previous ones, then no expectation will be available, and the user will be dependent on basic processor recognition capabilities. This section has given an overview of the approach to history-based expectation processing. The details of the method are dependent on how the functions P, Predicts, Mergeable, and Merge are implemented. The following sections describe our implementation, which was used to investigate the viability of this approach and the performance it can achieve.</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="4257" type="metho">
    <SectionTitle>
5 AN IMPLEMENTATION
5.1 THE EXPECTATION PARSER
</SectionTitle>
    <Paragraph position="0"> The usefulness of the methodology described above was tested in the implementation of a connected speech understanding system. An off-the-shelf speech recognition device, a Nippon Electric Corporation DP-200, was added to an existing natural language processing system, the Natural Language Computer (NLC) (Ballard 1979, Biermann and Ballard 1980). The expectation system provided the intermediate processing between the errorful output of the speech recognizer and the deep semantics of NLC. The resulting speech understanding system is called the Voice Natural Language Computer with Expectation (VNLCE, Fink 1983). [The current system should be distinguished from an earlier voice system (VNLC, Biermann et al. 1985), which had no expectation and which handled discrete speech where a 300 millisecond pause must follow each word.] It should be emphasized, of course, that the central issue here is the study of expectation mechanisms and the details of the design decisions could have been made in rather different ways. Thus one could have implemented expectation error correction with a typed input system or with a speech input system that integrates voice signal processing with higher level functions in a way not possible with a commercial recognizer. This implementation shows only one way in which the functions P, Predicts, Mergeable, and Merge can be constructed to achieve expectation capabilities. The conclusion of this paper is that in this particular situation substantial error correction is achieved, and thus one may suspect that similar results can be achieved in other applications.</Paragraph>
    <Paragraph position="1"> The implementation, as in the overview of the general system presented in section 2, consists of two major parts, an expectation parser and an expectation module, and their respective data structures. The expectation parser embodies the function P, while the major functions of the expectation module are Predicts, Mergeable, and Merge. An expected sentence set, E(current), along with the most recent input sentence S, are inputs to the expectation parser P. The expectation parser P uses these two inputs to determine the meaning M(S) of the input sentence S. Thus, M(S) is a deep parse of S. The function Predicts determines if one of the sentences in E(current) predicts M(S). If so, then M(S) is merged with this sentence meaning and dialogue tracking is begun from that point. Otherwise the function Mergeable determines how &quot;similar&quot; M(S) is to any other sentences in the expected dialogue. In this implementation, the function Mergeable is actually much more cautious about determining whether or not a set of sentences should be merged. For the implementation, if Mergeable determines that certain nodes in the expected dialogue are mergeable with M(S), then it adds the successors of these nodes to E, creating an expanded expected sentence set. Then, if the next sentence input is predicted by one or more of these sentences, they are merged through the action of Predicts and Merge.</Paragraph>
    <Paragraph position="2"> The purpose of the expectation parser in this implementation of a speech understanding system is to take input from the scanner and the expectation module, and use this information to determine what was said by the user.</Paragraph>
    <Paragraph position="3"> Thus, during the parsing process, the expectation parser must reconcile the sequence of words input from the scanner with the expected sentence set from the expectation module, or determine that the scanner input is not like anything that was expected and, thus, ignore expectation. In this way, the expectation parser parses from two inputs. It is constantly trying to maintain an equilibrium between the input from the scanner and the input from the expectation module. This balancing is kept in line by a set of rating factors that are used during the parsing procedure to help guide the search for a reasonable sentence structure. These rating factors, at times, will be referred to as probabilities in the following discussion. However, in reality, the ratings are one thousand times the values of the logarithms of numbers between 0 and 1. Thus, the ratings span the values -999 to 0, where 0 is equivalent to a probability of one. These ratings are computed this way because they remain integral and still fairly accurately represent the correct values. Also, they can simply be added and subtracted rather than multiplied and divided in the hundreds of calculations required for a single sentence parse.</Paragraph>
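    <Paragraph> The integer rating scale can be sketched as follows. The text does not state the base of the logarithm; base 10 is assumed here, and the clamp at -999 and the function name are likewise assumptions made for this illustration.

```python
import math

FLOOR = -999  # lowest rating on the scale described in the text


def rating(probability):
    """Integer rating: 1000 * log10(probability), clamped to the range -999..0."""
    return max(FLOOR, round(1000 * math.log10(probability)))


# Multiplying probabilities corresponds to adding ratings, so a parse can
# accumulate evidence with integer additions instead of floating point products.
half = rating(0.5)
quarter = rating(0.25)
```

Under the base-10 assumption, rating(1.0) is 0, rating(0.5) is -301, and rating(0.5) + rating(0.5) equals rating(0.25), which is the additive property the paragraph above relies on.</Paragraph>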
    <Paragraph position="4"> The expectation parser uses an ATN-like representation for its grammar (Woods 1970). Its strategy is top-down. The types of sentences accepted are essentially those accepted by the original NLC grammar: imperative sentences with nested noun groups and conjunctions (Ballard 1979). An attempt has been made to build as deep a parse as possible so that sentences with the same meaning result in identical parses. Sentences have the same &quot;meaning&quot; if they result in identical tasks being performed. The various sentence structures that have the same meaning we call paraphrases.</Paragraph>
    <Paragraph position="5"> We have studied the following types of paraphrasing:  'double row two and zero matrix one.' &lt;=&gt; 'double row two. zero matrix one.' It is obvious from this list that there are varying levels of paraphrasing. Some arise at the vocabulary level (number 1), some at the syntactic level (numbers 2, 3, 4, 5, 6, and 7), some at the semantic level (numbers 8, 9, and 10), some at the current world level (numbers 11 and 12), and some at a combination of levels (numbers 13 and 14). Some are domain dependent, especially at the vocabulary level such as entry &lt;=&gt; number. Others are not, such as ADJ NOUN &lt;=&gt; NOUN QUALIFIER.</Paragraph>
    <Paragraph position="6"> Those that only require knowledge of the vocabulary or of the grammar are implemented in the current history-based expectation system. This means that paraphrases one through seven are handled currently as part of the parsing process itself. The last seven may be dealt with at some future date. However, they are somewhat more complicated because they require temporal-type knowledge such as the current referent of a pronoun or the current size of a matrix. The lexical and grammatical paraphrases, on the other hand, will always have the same meaning, regardless of the current state of the world. By handling the seven lexical and syntactic paraphrases, a stored parse can aid in recognizing many sentences with the same &quot;meaning&quot; but different surface structures.</Paragraph>
    <Paragraph position="7"> To simplify representation of the parser output we have developed a special notation to indicate the deep parse of a sentence. For example, the parse of the sentences: Double the positive row 1 entries.</Paragraph>
    <Paragraph position="8"> Double the positive entries in row 1.</Paragraph>
    <Paragraph position="9"> Double the row 1 entries which are positive.</Paragraph>
    <Paragraph position="10"> Double the entries in row 1 which are positive.</Paragraph>
    <Paragraph position="11"> is notated as: Double (entries (positive) (r1)). The mechanism for using the expectation information during parsing is built into the ATN-like network. The parser receives from the scanner a sequence of word slots. These word slots are defined by the speech recognition system based on the sequence of words it recognized.</Paragraph>
    <Paragraph position="12"> Thus, there could be missing or extra word slots due to errors made during speech recognition. To each word slot the scanner adds other possible words based on what words the system tends to confuse. The scanner also rates the possibilities for each word slot by the same scale discussed previously. During parsing, the parser creates a template that represents the parse of the sentence input. This template contains slots that represent the parts of a sentence such as verb, adjective, and headnoun. At each point in the parse of a sentence, when the expectation parser is trying to determine what the role of the current word slot is in the sentence, five different attempts are made to use the current word slot as needed to fill the template slot at the current point in the grammar network. These are: * ADV (advance): Find a word in the current word slot from the scanner output that will fit the needs at this node in the grammar. If such a word cannot be found, try choice 2.</Paragraph>
    <Paragraph position="13"> * EXPADV (expectation advance): Look at the parse of the current expected sentence to see if the template slot that the parser is currently trying to fill is filled in the expected sentence. If so, copy the value in the template slot from the expected sentence to the current parse, ignoring the word slot from the scanner. Otherwise, try choice 3.</Paragraph>
    <Paragraph position="14"> * SKIPWORD: Skip the current word slot from the scanner output, filling the corresponding parser template slot, when appropriate, with a NIL value to indicate that a word has been skipped and that it was assumed to have the function associated with the template slot.</Paragraph>
    <Paragraph position="15"> If the parse fails later on, and the parser backs up to this point, try choice 4.</Paragraph>
    <Paragraph position="16"> * EXTRAWS (extra word slot): Assume that the word slot from the scanner is an extra one due to an error in recognition. Skip this word slot and again try choice 1.</Paragraph>
    <Paragraph position="17"> If failure occurs, try choice 2. Finally, if failure again occurs, try choice 5.</Paragraph>
    <Paragraph position="18"> * LOSTWS (lost word slot): Assume that the needed word slot from the scanner is lost due to an error in recognition. Without advancing to the next scanner word slot, try choice 2 again. If this fails, then fill the parser template slot, when appropriate, with a NIL value to indicate that a word has been lost and that it was assumed to have the function associated with that template slot. Remain at the current scanner word slot so that it can again be evaluated for a different function. An example piece of the parser network, the formatted routine FILLADJ, is shown in Figure 6. The five kinds of error correction were hand coded into each network so that the special characteristics of each grammatical structure could be accounted for individually. Thus in some cases, certain error correction alternatives were checked immediately while in others it was wiser to determine whether normal processing would fail at deeper levels before attempting those same corrections. The network represents a tree structure which is searched by the expectation parser.</Paragraph>
    <Paragraph position="19"> Succession in the network is represented by the parent-child relationship, which is indicated in Figure 6 by indentation. Thus, the node containing the command ADV is the parent of the node containing the command CHEK PART ADJ, and so is succeeded by it. Should a command fail, the parser backs up to the parent node of the node that has just failed. Thus, if a check for an adjective in CHEK PART ADJ fails, control will back up to the node containing ADV. Choice is represented by the sibling relationship which is indicated in Figure 6 by the vertical lines connecting nodes. Thus, ADV, EXPADV, SKIPWORD, EXTRAWS, and LOSTWS are all siblings in the tree network and are choices that the parser can make when parsing a sentence. Note that, in this case, these five choices represent the five possible attempts that are made in trying to parse a word slot that were discussed above. A choice is made by picking the siblings in the order in which they appear in the network. Thus, when the CHEK PART ADJ fails and control backs up to ADV, the expectation parser will back up to the START node and then take the second choice, EXPADV, and attempt to proceed down that chain of commands.</Paragraph>
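    <Paragraph> The search order described above (succession from parent to child, choice among siblings in listed order, and backing up to the parent on failure) can be sketched as a depth-first search over a command tree. This is an illustration only: the Node class, the boolean command stubs, and the tiny two-branch network are assumptions, and a real parser would also restore the parse template and ratings when it backs up.

```python
class Node:
    """One command in the tree network; children are the choices that follow it."""

    def __init__(self, name, command=lambda: True, children=()):
        self.name = name
        self.command = command          # returns True on success, False on failure
        self.children = list(children)  # siblings, tried in the order listed


def search(node, path=()):
    """Return the first successful chain of commands from node to a leaf, or None."""
    if not node.command():
        return None                     # command failed: back up to the parent
    path = path + (node.name,)
    if not node.children:
        return path                     # end of a chain of commands
    for child in node.children:         # take the siblings in network order
        found = search(child, path)
        if found is not None:
            return found
    return None                         # every choice failed: back up further


# A cut-down network in the spirit of the adjective example: the adjective
# check under ADV fails, so control returns to START and EXPADV is tried next.
network = Node("START", children=[
    Node("ADV", children=[Node("CHEK PART ADJ", command=lambda: False)]),
    Node("EXPADV"),
    Node("SKIPWORD"),
])
result = search(network)
```

Here result is the chain START, EXPADV, which mirrors the behavior described for a failed CHEK PART ADJ.</Paragraph>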
    <Paragraph position="20"> The scoring mechanism within the parser serves to aid in the evaluation of the alternative paths during the parse process and the pruning of improbable choices. A typical spoken input to the system is &quot;add row one to row two&quot; and the speech recognition machine will often return such errorful output as &quot;and row * to row&quot;.</Paragraph>
    <Paragraph position="21"> The asterisk indicates that the device guesses the existence of a word but has failed to identify it.</Paragraph>
    <Paragraph position="22"> The parser must be able to extract the user's original intent and its operation is guided by rating factors which evaluate the quality of the path through the parser, the word selection, the level of agreement with expectation, and the self consistency (or compatibility) of the sentence. These individual ratings work as follows:</Paragraph>
  </Section>
  <Section position="7" start_page="4257" end_page="4257" type="metho">
    <SectionTitle>
1) The Transition Value
</SectionTitle>
    <Paragraph position="0"> Every time the parser moves over a SKIPWORD, EXTRAWS, or LOSTWS command a charge is made to the value of the transition. Normally, a transition does not cost anything, but each SKIPWORD, EXTRAWS, and LOSTWS executed results in a lowering of the transition's value. This charge is made for the rest of the parse unless the SKIPWORD, EXTRAWS, or LOSTWS is backed over. This charge can be seen in the sample grammar net appearing in Figure 6 after the words SKIPWORD, EXTRAWS, and LOSTWS. The charge in this example for each of the three commands is 1000*log[0.7] = -35.</Paragraph>
  </Section>
  <Section position="8" start_page="4257" end_page="4257" type="metho">
    <SectionTitle>
2) The Word Value
</SectionTitle>
    <Paragraph position="0"> We define the synophones of a given vocabulary word to be the words a user might speak that could possibly be recognized as that word. Because of the nature of the dynamic programming algorithm in the NEC machine, it yields only one guess at each word slot. So it is necessary for our software to provide the set of synophones for each guessed word. This, in effect, simulates the situation where the speech recognition device provides a larger number of possible matches. Thus, in the case of the above recognizer output, the following synophones would be produced to represent the sequence of possible words spoken (tabulated by word slot, word, and rating):</Paragraph>
    <Paragraph position="2"> Each alternative word is given a rating. The words selected by the recognizer are given maximum ratings and alternatives are given lower values. If two words have the same pronunciation, as with &quot;to&quot; and &quot;two&quot;, they are given the same values.</Paragraph>
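    <Paragraph> A minimal sketch of synophone expansion follows. The confusion table, the rating values, and the function name are hypothetical, chosen only to show the shape of the data; the actual synophone sets and ratings used by the system are not recoverable from the text.

```python
TOP = 0  # rating given to the word the recognizer actually selected

# Hypothetical confusion table: recognized word -> {alternative word: rating}.
SYNOPHONES = {
    "and": {"add": -301},
    "to": {"two": TOP},  # identical pronunciation, so identical rating
}


def expand(recognized_words):
    """Turn the recognizer's single guess per word slot into rated candidate lists."""
    slots = []
    for word in recognized_words:
        candidates = [(word, TOP)]
        candidates.extend(SYNOPHONES.get(word, {}).items())
        slots.append(candidates)
    return slots


slots = expand("and row * to row".split())
```

With this table, the first word slot offers both and and add, so a parser expecting a verb can still recover the spoken command.</Paragraph>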
  </Section>
  <Section position="9" start_page="4257" end_page="4257" type="metho">
    <SectionTitle>
3) The Expectation Value
</SectionTitle>
    <Paragraph position="0"> This value is based on whether or not there is an expected sentence, how well the current parse is matching the current expected sentence from the expected sentence set, and how much the current parse is using this expected sentence. Whenever a slot is filled by the parser, it is compared with the corresponding slot in the expected sentence. If they do not match, the expectation value decreases, otherwise the expectation value remains the same.</Paragraph>
  </Section>
  <Section position="10" start_page="4257" end_page="4257" type="metho">
    <SectionTitle>
4) The Compatibility Value
</SectionTitle>
    <Paragraph position="0"> This value differs from the other three in that it is simply true or false. Verb-operand, noungroup-noungroup, and expectation compatibility are checked during the parse. If compatibility fails, then the expectation parser backs up, otherwise it continues forward.</Paragraph>
    <Paragraph position="1"> Each of these components has a value assessed at each word slot in the incoming sentence as well as one for the entire sentence. The word slot values are assumed to have a top rating until the parser reaches that word slot.</Paragraph>
    <Paragraph position="2"> Thus, the parser is always examining a best case situation based on what it has already done. For example, all word slot transition values are assumed, initially, to have the value 1000*log[1] = 0. The transition value at a word slot is only lowered if it is necessary for the parser to execute a SKIPWORD, EXTRAWS, or LOSTWS command in parsing that word slot. The charge made is according to the value indicated at the particular command in the grammar network. The average of the current values of all word slot transition values creates the sentence transition rating for the parse so far. The word slot and sentence values for the expectation and word values are computed similarly. The compatibility value differs, however, since it does not have degrees of ratings but rather indicates acceptability or lack thereof. Thus, it is not included in the formula for determining a rating for the parse. Rather, if it fails, then parsing automatically backs up. If it succeeds, then parsing continues forward.</Paragraph>
    <Paragraph position="3"> The values of the transition, word, and expectation components are used to determine two sentence parse ratings. At each word slot, the values of the three factors are averaged together to produce a general word slot parse rating. Also, the sentence values for the three components are averaged together to obtain a general sentence parse rating. Thus, we have the following equations that define the various rating values, where n is  The transition confidence, word confidence, and expectation confidence provide an average overall value for the ws transition, ws word, and ws expectation ratings, respectively. These average values provide a best-case rating at any point during the parse because they assume perfect ratings for all word slots not yet parsed. The overall parse values, wordslot factor and sentence factor, are calculated simply from the average of the other three rating values. This is done so that each factor has equivalent power in controlling the parse. If it is desirable to allow one factor to have more control over the parse than the other two, then this can be accomplished by manipulating the particular minimum rating values discussed below. In order to control the expectation parsing, search is cut off if rating values fall below certain levels. Currently, these levels are:  If any one of the rating factors drops below its corresponding minimum value, the current search path is cut off and a different route through the grammar nets is attempted. In this way, the extent of the search can be controlled. By setting all the minimum ratings to -999, for example, all possibilities in the grammar are checked.
On the other hand, setting all the minimum ratings to 0 results in the expectation parser behaving like a normal parser, since this essentially turns off the use of the SKIPWORD, EXTRAWS, and LOSTWS commands, the use of synophones, and expectation. Computational Linguistics, Volume 12, Number 1, January-March 1986 21 Pamela K. Fink and Alan W. Biermann The Correction of Ill-Formed Input</Paragraph>
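    <Paragraph> The averaging of the three components and the cut-off test can be sketched as follows. This is an illustration only: the function names and the sample ratings are hypothetical, and the two sets of minimum values shown are the extreme settings of 0 and -999 mentioned above, not the levels actually used by the system.</Paragraph>
    <Paragraph><![CDATA[
```python
def combined_factor(transition, word, expectation):
    """Average the three component ratings so each has equal weight."""
    return (transition + word + expectation) / 3.0


def ratings_below_minimum(ratings, minimums):
    """Return the names of all ratings that fall below their minimum."""
    return [name for name, value in ratings.items()
            if value < minimums[name]]


# Hypothetical component ratings at some point in a parse.
ratings = {"transition": -300.0, "word": 0.0, "expectation": -150.0}
ratings["factor"] = combined_factor(
    ratings["transition"], ratings["word"], ratings["expectation"])

strict = {name: 0.0 for name in ratings}      # behaves like a normal parser
lenient = {name: -999.0 for name in ratings}  # explores far more of the grammar

# A non-empty list means the current search path would be cut off.
print(ratings_below_minimum(ratings, strict))
print(ratings_below_minimum(ratings, lenient))
```
]]></Paragraph>
    <Paragraph> Because every component is averaged with equal weight, any bias toward one component has to come from the minimum values themselves, as noted above.</Paragraph>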
    <Paragraph position="4"> In theory, the parsing algorithm is admissible. That is, it is capable of finding the best possible parse. The various rating factors can initially be set high and gradually lowered until a parse is found. This parse would have the highest rating possible. However, this is impractical because of the time required to repeatedly search a growing space. Thus, minimum rating values are set and the search is conducted once. In this way, the first parse found is the &quot;best&quot; parse only in the sense that it is the first one found whose rating is higher than the pre-set minimum value.</Paragraph>
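    <Paragraph> The difference between the admissible strategy and the single-pass strategy can be illustrated with a toy example. In the sketch below the parser is replaced by a hypothetical stand-in that returns the first candidate, in a fixed search order, whose rating is not below the minimum; the candidate names and ratings are invented for the illustration.</Paragraph>
    <Paragraph><![CDATA[
```python
# Candidate parses in the order a search would reach them (hypothetical).
CANDIDATES = [("parse A", -400), ("parse B", -100)]


def toy_parser(minimum):
    """Return the first candidate whose rating is not below the minimum."""
    for name, rating in CANDIDATES:
        if rating >= minimum:
            return name
    return None


def lower_until_found(parse_with_minimum, minimums):
    """Admissible strategy: try minimum ratings from highest to lowest."""
    for minimum in sorted(minimums, reverse=True):
        parse = parse_with_minimum(minimum)
        if parse is not None:
            return minimum, parse
    return None


# Repeated search finds the highest-rated parse, at the cost of six passes
# in the worst case.
print(lower_until_found(toy_parser, [0, -100, -200, -300, -400, -500]))
# A single pass with a low minimum returns whichever parse is reached first.
print(toy_parser(-500))
```
]]></Paragraph>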
    <Section position="1" start_page="4257" end_page="4257" type="sub_section">
      <SectionTitle>
5.2 ROUTINES OF THE EXPECTATION MODULE
</SectionTitle>
      <Paragraph position="0"> The task of the expectation module is to acquire a general dialogue from a series of dialogues spoken by a user.</Paragraph>
      <Paragraph position="1"> The dialogues essentially contain examples of how to go about solving a particular kind of problem. In acquiring these dialogues and merging them into one generalized dialogue, the expectation system learns how to solve this particular kind of problem through examples. In a sense, by building this generalized dialogue the expectation system is creating a procedure that can solve a particular subset of problems. This is a future goal of the project.</Paragraph>
      <Paragraph position="2"> However, the current application is for the generalized dialogue to be used as an aid in the voice recognition process by offering predictions about what might be said next.</Paragraph>
      <Paragraph position="3"> The types of problems that can be learned by the existing history-based expectation system include linear algebra applications such as matrix multiplication, simultaneous linear equations, and Gaussian elimination.</Paragraph>
      <Paragraph position="4"> Problems outside linear algebra that require matrix-type representations can also be learned, such as gradebook maintenance and invoice manipulation. Though the implemented system is limited to matrix-oriented problems, the theoretical system is capable of learning a wide range of problem types. The only requirement on the problem or situation is that it can be entered into the expectation system in the form of examples. Thus, for example, it can acquire a &quot;script&quot; such as the one for going to a restaurant as defined in Schank and Abelson (1977).</Paragraph>
      <Paragraph position="5"> The expectation module takes two inputs and produces two outputs. The inputs are (1) the user behavior graph discussed earlier, called the expected dialogue D, and (2) the meaning of the most recently input sentence, M(S). Its outputs are a new expected dialogue D, modified according to the latest input sentence M(S), and an expected sentence set E. These outputs are produced based upon the inputs and the functions Predicts, Mergeable, and Merge.</Paragraph>
      <Paragraph position="6"> The role of the predicate Predicts can be best understood by recalling the function of the parser P. P uses the set of expected sentences E(current) to try to error correct the incoming sentence S. P may do this by discovering that some Mk in E(current) is quite similar to M(S). If P does select such an Mk and uses it to help parse S, then Predicts (Mk, M(S)) is true. Otherwise, Predicts (Mk, M(S)) is false. Thus the function of Predicts is to select the Mk that the parser used in parsing S. If the parser did not use expectation, then Predicts is always false.</Paragraph>
      <Paragraph position="7"> If the incoming sentence was not predicted by existing transitions in D, perhaps it can be found to be similar to some node Mk in D and a new transition could be added to that node. The routine Mergeable has the job of finding one or more such Mk's into which the current sentence meaning M(S) can be merged. The question of similarity of two sentences is determined by the meanings of the sentences themselves and the &quot;environment&quot; in which they occur in the dialogue. Sentence &quot;meanings&quot; are based on the sentence deep parses produced by the expectation parser, while a sentence &quot;environment&quot; is based on the meanings of the sentences preceding and following it in the expected dialogue.</Paragraph>
      <Paragraph position="8"> Similarity is based on the notion of &quot;distance&quot;. Currently two sentences are considered similar in meaning if their parses differ in only one slot in the noun group template. That is, their noun group distance cannot be greater than one. For example, the following two sentences are similar: M(&quot;double the first row&quot;) = double (r1) and M(&quot;double row 2&quot;) = double (r2). The environment of one sentence matches that of another if the sentence meanings preceding the two sentences being compared are identical and/or the sentence meanings following them are identical. Clearly, these definitions are quite arbitrary and many other strategies could be tried. However, for the purposes of this study, they were quite satisfactory.</Paragraph>
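      <Paragraph> The two tests just defined can be sketched as follows. The representation of a meaning as a verb paired with a tuple of noun group slots is an assumption made for the illustration, as are the function names; the actual noun group template is not reproduced here. The sketch also assumes that a missing neighbouring sentence (None) never counts as a match.</Paragraph>
      <Paragraph><![CDATA[
```python
def noun_group_distance(slots1, slots2):
    """Count the noun group slots whose values differ."""
    return sum(1 for a, b in zip(slots1, slots2) if a != b)


def similar_meaning(m1, m2):
    """True if two meanings differ in at most one noun group slot."""
    (verb1, slots1), (verb2, slots2) = m1, m2
    if verb1 != verb2 or len(slots1) != len(slots2):
        return False
    return noun_group_distance(slots1, slots2) <= 1


def environments_match(prev1, next1, prev2, next2):
    """True if the preceding and/or following meanings are identical."""
    same_prev = prev1 is not None and prev1 == prev2
    same_next = next1 is not None and next1 == next2
    return same_prev or same_next


# M("double the first row") and M("double row 2") under the assumed encoding.
first = ("double", ("row", "1"))
second = ("double", ("row", "2"))
print(similar_meaning(first, second))
```
]]></Paragraph>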
      <Paragraph position="9"> Depending on how well the sentence itself and its environment match previously seen sentences and environments, five different matches are possible between the current incoming sentence and the elements of the expected dialogue:  1) The sentence matches a sentence meaning in the expected dialogue exactly, but there is no match of their environments.</Paragraph>
      <Paragraph position="10"> 2) The sentence matches a sentence meaning in the expected dialogue similarly, but there is no match of their environments.</Paragraph>
      <Paragraph position="11"> 3) The sentence matches a sentence meaning in the expected sentence set exactly, which implies that their environments also match.</Paragraph>
      <Paragraph position="12"> 4) The sentence matches a sentence meaning in the expected sentence set similarly, which implies that their environments also match.</Paragraph>
      <Paragraph position="13"> 5) There is no match between the sentence and any  sentence meaning in the expected dialogue.</Paragraph>
      <Paragraph position="14"> In cases 1, 2, and 5, the sentence is determined to be new and unique to the expected dialogue. Therefore, Mk and M(S) are not mergeable. In such cases, M(S) is added as a new entry in the expected dialogue D. In the other two cases, numbers 3 and 4, the incoming sentence is determined to be the same as or similar to one seen previously in an exact or similar situation. Thus, Mk is mergeable with M(S). In case 3 the sentence is automatically merged with the one that it matches exactly in the expected sentence set. In case 4, the sentence is merged with the one that it matches similarly in the expected sentence set only after it has passed an argument creation algorithm test to be discussed below. Otherwise it is also considered new and unique and added to the expected dialogue as in cases 1, 2, and 5. The actual argument creation occurs in the function Merge.</Paragraph>
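      <Paragraph> The five cases and their consequences can be summarized in a short sketch. The encoding of meanings, the function names, and the order in which the cases are tested are assumptions made for the illustration; the argument creation test itself is represented only by a Boolean flag.</Paragraph>
      <Paragraph><![CDATA[
```python
def differs_in_one_slot(m1, m2):
    """True if two meanings share a verb and differ in exactly one slot."""
    (verb1, slots1), (verb2, slots2) = m1, m2
    if verb1 != verb2 or len(slots1) != len(slots2):
        return False
    return sum(a != b for a, b in zip(slots1, slots2)) == 1


def match_case(meaning, expected_set, dialogue):
    """Return the number (1 to 5) of the match case for a new meaning."""
    if meaning in expected_set:
        return 3  # exact match, environments also match
    if any(differs_in_one_slot(meaning, m) for m in expected_set):
        return 4  # similar match, environments also match
    if meaning in dialogue:
        return 1  # exact match, no environment match
    if any(differs_in_one_slot(meaning, m) for m in dialogue):
        return 2  # similar match, no environment match
    return 5      # no match at all


def is_new_entry(case, passes_argument_test=False):
    """Cases 1, 2, 5 (and a failed case 4) add M(S) as a new entry."""
    return case in (1, 2, 5) or (case == 4 and not passes_argument_test)


# Hypothetical expected dialogue and current expected sentence set.
dialogue = [("enter", ("m", "1")), ("double", ("r", "1")), ("add", ("r", "1"))]
expected_set = [("double", ("r", "1"))]
print(match_case(("double", ("r", "2")), expected_set, dialogue))
```
]]></Paragraph>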
      <Paragraph position="15"> The notion of creating an argument is associated with the problem of when to merge a set of similar sentences in an expected dialogue into one sentence with a special flag in the slot where the sentences differ. This is determined by the function Mergeable. As an example, at a certain point in a dialogue, one may have an expected sentence set E(i) such as the following: double (r1) .33; double (r2) .33; double (r3) .33. The numbers indicate the probability levels, derived from j/Ci, as discussed at the end of section 3.</Paragraph>
      <Paragraph position="16"> In such a situation, the user's intentions may be reflected more correctly by the following expected sentence set: double (rARG) 1.0 which signifies that any row may be referred to. However, though this simplified expected sentence set may be a good generalization of the pattern observed, it has ramifications for error correction. Specifically, it will be unable to fill in a row number should that value be missing in the incoming sentence. The first option also has its drawbacks. In this case, should the row number be missing in the sentence, the expectation parser will error correct the sentence to the most probable value, or the first one in the set if the probabilities are equal, here the value one for row 1. Thus, both options are imperfect in terms of the error correction capabilities that they can provide. The comparison that must be made to determine which option is better in a given situation is how often the first will error correct incorrectly as opposed to how much error correcting power we will lose by using the second. How it is done is beyond the scope of this paper but is explained in detail in Fink (1983).</Paragraph>
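      <Paragraph> The trade-off between the two expected sentence sets can be made concrete with a small sketch of how a missing row number might be filled in. The triple encoding of an expected sentence and the function name are hypothetical; the rule applied is the one stated above, namely the most probable value, or the first one in the set when probabilities are equal.</Paragraph>
      <Paragraph><![CDATA[
```python
def correct_missing_row(verb, expected_set):
    """Pick a row for a sentence whose row number was lost, or None."""
    best = None
    for expected_verb, row, probability in expected_set:
        if expected_verb != verb or row == "ARG":
            continue  # an argument slot carries no specific value to fill in
        # Strict comparison keeps the first entry when probabilities tie.
        if best is None or probability > best[1]:
            best = (row, probability)
    return None if best is None else best[0]


specific = [("double", "r1", 0.33), ("double", "r2", 0.33), ("double", "r3", 0.33)]
generalized = [("double", "ARG", 1.0)]

print(correct_missing_row("double", specific))     # falls back to row 1
print(correct_missing_row("double", generalized))  # nothing to fill in
```
]]></Paragraph>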
      <Paragraph position="17"> The Merge function takes two inputs, M1 and M2, which have been determined by the Mergeable function to be similar in some way by considering their respective environments and meanings. Based upon how similar the two meanings are, Merge creates a meaning M that is a generalization of M1 and M2, sometimes employing an argument. Thus, there are only two possible kinds of matches at this point between an input sentence and a member of the expected sentence set, an exact match or a similar match. In the case of an exact match M = M1 = M2 and M replaces Mi in the expected dialogue. In the case of a similar match, the meanings only differ by one slot in the noun group of their deep parse representation, so a generalization of that slot to &quot;ARG&quot; is made, meaning an argument is created. The function appears as follows:</Paragraph>
      <Paragraph position="19"> Thus, if the sentences &quot;Double (r1)&quot; and &quot;Double (r2)&quot; are inputs to Merge, the output would be &quot;Double (rARG)&quot;.</Paragraph>
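      <Paragraph> A minimal sketch of the behavior described for Merge is given below. It is not the function as originally published; the encoding of a meaning as a verb with a tuple of noun group slots, and the choice to raise an error for inputs that Mergeable would have rejected, are assumptions made for the illustration.</Paragraph>
      <Paragraph><![CDATA[
```python
ARG = "ARG"  # marker for a slot generalized into an argument


def merge(m1, m2):
    """Generalize two meanings that are identical or differ in one slot."""
    (verb1, slots1), (verb2, slots2) = m1, m2
    if verb1 != verb2 or len(slots1) != len(slots2):
        raise ValueError("meanings are not mergeable")
    differing = [i for i, (a, b) in enumerate(zip(slots1, slots2)) if a != b]
    if not differing:
        return m1  # exact match: M = M1 = M2
    if len(differing) == 1:
        i = differing[0]
        # Similar match: replace the single differing slot by an argument.
        return (verb1, slots1[:i] + (ARG,) + slots1[i + 1:])
    raise ValueError("meanings differ in more than one slot")


# "Double (r1)" and "Double (r2)" under the assumed encoding.
print(merge(("double", ("r", "1")), ("double", ("r", "2"))))
```
]]></Paragraph>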
    </Section>
  </Section>
</Paper>