<?xml version="1.0" standalone="yes"?>
<Paper uid="C80-1072">
  <Title>THE IMPATIENT TUTOR: AN INTEGRATED LANGUAGE UNDERSTANDING SYSTEM</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
THE IMPATIENT TUTOR:
AN INTEGRATED LANGUAGE UNDERSTANDING SYSTEM
</SectionTitle>
    <Paragraph position="0"> We describe a language understanding system that uses the technique of segmenting the computation into autonomous modules that communicate by message passing. The goal is to integrate semantic and syntactic processing to achieve greater flexibility and robustness in the design of language understanding systems.</Paragraph>
    <Paragraph position="1"> Introduction This paper addresses the control problem in language understanding systems. Many formalisms have evolved for representing the syntactic, pragmatic, and semantic data of language, but the ability to access them in a flexible and efficient manner has not proceeded apace. This delay is understandable: one needs to know what to control before one can control it. Although the isolation of the subproblems is a valid methodology, there comes a time when a deeper understanding of the language system requires that the data and control aspects of the problem be considered together.</Paragraph>
    <Paragraph position="2"> Linguistic theory has not offered much insight into the control of linguistic processes; Chomsky (1965) finessed the problem by creating &quot;competence&quot; as the proper view for theoretical linguistics, rather than the study of &quot;performance&quot;. In fact, it is this study of process that is one of the contributions of computational linguistics to the study of language (Hays, 1971).</Paragraph>
    <Paragraph position="3"> An overview of control strategies Within automated language understanding systems we find a variety of strategies: Linear control.</Paragraph>
    <Paragraph position="4"> A logical approach is to adopt a linear control strategy in which syntactic analysis is followed by semantic interpretation (~s, 1971).</Paragraph>
    <Paragraph position="5"> Unfortunately, this places an overwhelming burden on semantic processing, which has to interpret each complete parse even when the ambiguity lies in only one part. Further, there are cases where syntactic relations cannot be determined by syntactic analysis alone, for example, the role of &quot;tree&quot; in (1).</Paragraph>
    <Paragraph position="6"> John was hit by the tree. (1) Semantic grammars.</Paragraph>
    <Paragraph position="7"> Faced with a need to access semantic information during syntactic analysis, one suggestion is to construct a &quot;semantic grammar&quot; (Hendrix, 1977) in which some categories in the syntactic rules are replaced by semantically based categories of the domain, e.g., verbs may be subclassified as verbs of movement, containment, excitement, etc. (Sager, 1975). The disadvantage of this approach is that the domain becomes an integral part of the grammar, with the result that either the number of syntactic rules is considerably enlarged, or the rule set has to be rewritten to move to another topic area. Semantic parsing.</Paragraph>
    <Paragraph position="8"> Other approaches have managed to achieve success by avoiding the problem of integration completely: the systems have essentially one component. Schank (1975) has systems based on the hypothesis that language understanding is driven from the semantics with minimal use of any syntactic analysis. But such systems can go astray because of their high semantic expectation. For example, the word &quot;escape&quot; carries with it the prediction that it is an action of terrorists (Schank, Lebowitz, &amp; Birnbaum, 1978); this causes an erroneous analysis of a sentence such as &quot;The policeman escaped assassination...&quot; Others have proposed procedural systems built around semantic knowledge (Rieger &amp; Small, 1980). In the Rieger and Small system the knowledge is on the word level. The main drawback of such systems is an inability to change domains easily.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Design Features
</SectionTitle>
      <Paragraph position="0"> The power of syntax diminishes as more complex constituents are encountered.</Paragraph>
      <Paragraph position="1"> Syntax can give good descriptions for the structure of phrases, becomes less detailed when describing the role of phrases within clauses, has relatively little to say about the clause structure of sentences, and even less about sentences in discourse. As syntactic forces diminish, semantic relations describe the structure -- discourse cohesion is semantic (Halliday &amp; Hasan, 1976). Consequently we believe that a language understanding system should have the ability to bring syntactic and semantic knowledge to bear on the analysis at many points in the computation in order to prevent the flow of extraneous analyses to later steps in the analysis.</Paragraph>
      <Paragraph position="2"> We agree with Schank (1975) that the goal of analysis is not to produce a parse tree. It should not even be a subgoal, as is the case in systems that first produce a parse tree then perform semantic interpretation. The parse tree should be considered as a data structure that should either be constructed incidentally to the analysis, or be capable of being constructed should it be needed. But syntax cannot be ignored. Often it may not appear to be contributing much, but it is clear that syntactic structure is of use in determining antecedents of proforms, for example.</Paragraph>
      <Paragraph position="3"> Schank's (1975) hypothesis of semantic prediction appears to us to be a good approach. The goal is certainly to build a meaning representation of the linguistic act and top-down analysis can lead to greater efficiency. Top-down systems tend to leave open the question of what to do when there is no prior knowledge to guide the analysis. We envisage a system that can flow into a predictive mode when the situation is appropriate, but otherwise has a default control structure of syntax-then-semantics. In short, we want a data-driven control structure.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Message passing
</SectionTitle>
      <Paragraph position="0"> To achieve the design goals mentioned above, we are segmenting the problem into autonomous processes that communicate by passing messages to each other. This is Hewitt's (1976) view of computation as a society of cooperating experts.</Paragraph>
      <Paragraph position="1"> We have experts that know about the organizing principles of syntax and of semantics. The experts are then interpretive, which gives flexibility in changing to another language, or to a new domain. We have experts for case-frames, scripts, clauses, subjects, and the like.</Paragraph>
      <Paragraph position="2"> The experts will at points in time become associated with domain knowledge, i.e., the grammar of a language, or world knowledge for a problem area.</Paragraph>
      <Paragraph position="3"> The job of an expert can be to instantiate a model that it has been given (top-down analysis), or if it was not given a model, then to find a model (bottom-up analysis). The process of instantiation is performed by eliciting information from other experts who can use their expertise on the problem; they of course may have to consult further experts. Some experts are not instantiators, rather they are processes that are common to several other experts; for parsimonious representation we give them expert status.</Paragraph>
      <Paragraph position="4"> The output of the system is a semantic description of the input as instantiated case-frames. The novelty of the situation is captured by the way in which the case-frames are linked and by their spatio-temporal settings. The semantic description augments the encyclopedia and is thus available as pragmatic knowledge in the continuing analysis of the input.</Paragraph>
      <Paragraph position="5"> The impatient tutor.</Paragraph>
      <Paragraph position="6"> This initial project is a study of message flow in the system. As each word of the input is processed we are trying to disseminate its effect throughout the system. In particular we wish to have the analysis rapidly reaching the overall semantic description of the task so that it can be checked against the prescribed actions and any divergence noted. If a deviation is apparent, the system will interrupt the student. We are not proposing the system as a serious tutor; its shortcomings are quite apparent: if a student intended to say &quot;I will get the hammer before I get the wrench ...&quot; the impatience of the system would cause an interjection after &quot;hammer&quot; because of an expectancy of a wrench.</Paragraph>
      <Paragraph position="7"> The advantages of message passing Efficiency.</Paragraph>
      <Paragraph position="8"> Without prediction, linguistic analysis can only be a unidirectional search of the problem space, which is exponential in complexity. If a goal is known or predicted, then bidirectional searching, from input and goal, reduces the complexity. Yet greater efficiency can be achieved if the prediction can be directly associated with the input.</Paragraph>
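      <Paragraph> As a rough illustration of this efficiency argument, assume a uniform branching factor $b$ and a goal lying $d$ steps from the input, with the same branching in both directions. None of these quantities is given in the text; they are assumptions made only for the arithmetic.

```latex
N_{\text{uni}} = O\!\left(b^{d}\right), \qquad
N_{\text{bi}} = O\!\left(2\,b^{d/2}\right) = O\!\left(b^{d/2}\right)
```

With $b = 10$ and $d = 6$, for instance, this is on the order of $10^{6}$ states for the unidirectional search against about $2 \times 10^{3}$ for the bidirectional one.</Paragraph>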
      <Paragraph position="9"> In other schemes for processing language, the flow of control is constrained to follow the organization of the data.</Paragraph>
      <Paragraph position="10"> The ability of any expert to communicate with any other expert is how we achieve the greater efficiency. If an expert is instantiating a case-frame, for example, it can be in direct communication with a phrase expert that is trying to instantiate some syntactic rule. The findings of the phrase expert are transmitted directly to the case-frame expert, which may check the suggestion by calling upon the taxonomic expert.</Paragraph>
      <Paragraph position="11"> As each message carries with it a return address, it can be returned directly to the originator of the query without being chained through any intermediate experts.</Paragraph>
      <Paragraph position="12"> We are using the addresses of messages to achieve our desired perspective on syntax. Although the information necessary to build a parse tree is in messages, the information can be returned directly to the expert that initiated the query, bypassing other experts who were intermediaries in the answering process. The omitted experts may include those that build syntactic structure. However, a message also has a trace of its route and, should the need arise, the longer path can be followed to build structure.</Paragraph>
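      <Paragraph> The return-address and route-trace mechanism can be sketched as follows. This is a minimal illustration and not the system's actual code: the class names, the synchronous call style, and the shared registry are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """A query that remembers who asked and which experts relayed it."""
    content: str
    return_address: str                          # originator of the query
    trace: list = field(default_factory=list)    # experts visited so far


class Expert:
    """Hypothetical expert: either relays a query or answers it."""

    def __init__(self, name, registry, delegate=None, answer=None):
        self.name = name
        self.registry = registry      # shared table: name -> Expert
        self.delegate = delegate      # expert to consult next, if any
        self.answer = answer          # canned finding for a leaf expert
        self.inbox = []               # replies delivered to this expert
        registry[name] = self

    def ask(self, content):
        """Start a query; this expert becomes the return address."""
        self.registry[self.delegate].handle(Message(content, self.name))

    def handle(self, msg):
        msg.trace.append(self.name)
        if self.delegate is not None:
            # Pass the query on; the return address is left unchanged.
            self.registry[self.delegate].handle(msg)
        else:
            # Reply straight to the originator, skipping intermediaries.
            origin = self.registry[msg.return_address]
            origin.inbox.append((self.answer, list(msg.trace)))


registry = {}
case_frame = Expert("case-frame", registry, delegate="clause")
clause = Expert("clause", registry, delegate="phrase")
phrase = Expert("phrase", registry, answer="the left front tire")

case_frame.ask("next constituent?")
finding, route = case_frame.inbox[0]
print(finding, route)
```

The clause expert relays the query but never sees the reply; the recorded route remains available should syntactic structure have to be rebuilt along the longer path.</Paragraph>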
      <Paragraph position="13"> Robustness.</Paragraph>
      <Paragraph position="14"> It is apparent that there is a certain amount of redundancy in language. This is probably why apparently inadequate systems have been able to process well-formed discourse. But real people do not speak with perfection.</Paragraph>
      <Paragraph position="15"> Eventually natural language systems will have to be able to process the normal language of people. A user will not be enamored of a system that demands more care and attention be given to the language of his interaction than is usual for his other conversational activities.</Paragraph>
      <Paragraph position="16"> To progress to a systematic study of robustness we need to examine schemes by which all of linguistic knowledge may be flexibly invoked; thus we believe that systems that contain less than this knowledge will not be a suitable vehicle. Linear control structures are equally not the answer. If the erroneous item is encountered first, there is no way of using later components. The flexibility of the message passing scheme will allow other knowledge to be accessed.</Paragraph>
      <Paragraph position="17"> Organization of the data The data of our system is divided into three parts: the syntactic rules, the semantic knowledge, and the definitions of words. The syntactic rules are contained in the &quot;grammar&quot;, the semantic rules in the &quot;encyclopedia&quot; and the word definitions in the &quot;dictionary.&quot; Grammar The grammar consists of a set of rules of the form shown in Figure 1.</Paragraph>
      <Paragraph position="19"> The rules are written to allow the presence of a &quot;subject&quot; expert between the &quot;clause&quot; expert and the &quot;NP&quot; expert as it is the subject expert that knows about subject-verb agreement. Agreement rules (not shown) are written in terms of syntactic features such as &quot;number&quot;. The experts for syntax use these rules to determine what parts of speech to expect next. The rules are language specific and are therefore not encoded into the syntactic experts. Only the universal categories have corresponding experts.</Paragraph>
      <Paragraph position="20"> Dictionary.</Paragraph>
      <Paragraph position="21"> The dictionary consists of word definitions that include the syntactic properties of the word. Thus the word &quot;left&quot; would have information that it could be an adjective (as in &quot;left foot&quot;), a verb (&quot;left home&quot;) and a noun (&quot;the new left&quot;). The description of the sense of each word is reached by a pointer from the dictionary into the encyclopedia. For example, the encyclopedia records that as a noun &quot;left&quot; refers to a group of people, as an adjective it refers to a positional referent, and as a verb it can build the case frame associated with leaving.</Paragraph>
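      <Paragraph> One way to picture the dictionary-to-encyclopedia pointers is sketched below. The node identifiers, the slot names, and the roles listed for the verb sense are invented for illustration and do not come from the system.

```python
# Hypothetical encyclopedia nodes; identifiers and slots are invented.
encyclopedia = {
    "political-group": {"kind": "group of people"},
    "left-position": {"kind": "positional referent"},
    "leave": {"kind": "case frame", "roles": ["agent", "source"]},
}

# Each dictionary sense pairs a part of speech with a pointer (here a key)
# into the encyclopedia rather than storing the meaning itself.
dictionary = {
    "left": [
        {"pos": "adjective", "sense": "left-position"},
        {"pos": "verb", "sense": "leave"},
        {"pos": "noun", "sense": "political-group"},
    ],
}


def senses(word, pos=None):
    """Return encyclopedia entries for a word, optionally for one part of speech."""
    entries = dictionary.get(word, [])
    return [encyclopedia[e["sense"]] for e in entries
            if pos is None or e["pos"] == pos]


print(senses("left", "verb"))
```

Keeping only pointers in the dictionary means that a change of domain touches the encyclopedia, while the syntactic properties of the word stay where they are.</Paragraph>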
      <Paragraph position="22"> Encyclopedia.</Paragraph>
      <Paragraph position="23"> The encyclopedia consists of a network of case frames linked by  network with information about changing a tire.</Paragraph>
      <Paragraph position="24"> In Figure 2 we see knowledge about changing a tire. The CONTingency links represent causal dependencies. The META links show the equivalence of concepts, one concept having an equivalent description by a set of concepts. For example, &quot;replace&quot; represents &quot;removing an old object and putting on a new one&quot;. If concepts in the resulting description also have meta-links, the decomposition can be continued. Schank's (1979) MOP's are similar to our meta-organization.</Paragraph>
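      <Paragraph> The continued decomposition through meta-links amounts to a recursive expansion. The sketch below assumes a small invented table of meta-links; the concept names are hypothetical and are not taken from Figure 2.

```python
# Hypothetical META links: a concept maps to the set of concepts that
# jointly give an equivalent description of it.
meta = {
    "change-tire": ["remove-old-tire", "put-on-new-tire"],
    "remove-old-tire": ["jack-up-car", "unbolt-wheel"],
}


def decompose(concept):
    """Expand a concept until no part has a further meta-link."""
    parts = meta.get(concept)
    if parts is None:
        return [concept]          # primitive: nothing more to expand
    flat = []
    for part in parts:
        flat.extend(decompose(part))
    return flat


print(decompose("change-tire"))
```

The sketch assumes the meta-links contain no cycles; a real network would need a guard against revisiting a concept.</Paragraph>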
      <Paragraph position="25"> The VARiety link is used to show taxonomic classification. Thus &quot;change-tire&quot; is a kind of &quot;replace&quot;. Common knowledge need only be represented once; it is inherited by concepts lower in the taxonomy than the point of representation. The INSTance relation captures the episodic nature of memory by storing specific instances as instantiations of intensional descriptions: &quot;That time I changed my tire in front of Mom's house.&quot; is one instantiation of the general changing a tire event.</Paragraph>
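      <Paragraph> Inheritance along VARiety links can be sketched as a climb up the taxonomy until some ancestor supplies the requested property. The node table, the property names, and the placement of &quot;replace&quot; under &quot;do:job&quot; are assumptions made for this example.

```python
# Hypothetical fragment of the encyclopedia: each node lists its own
# properties and, through a VAR link, the concept it is a variety of.
nodes = {
    "do:job": {"var": None, "props": {"needs": "tools"}},
    "replace": {"var": "do:job",
                "props": {"steps": ["remove old", "put on new"]}},
    "change-tire": {"var": "replace", "props": {"object": "tire"}},
}


def lookup(concept, prop):
    """Climb VAR links until some ancestor supplies the property."""
    while concept is not None:
        node = nodes[concept]
        if prop in node["props"]:
            return node["props"][prop]
        concept = node["var"]
    return None                   # no ancestor knows this property


print(lookup("change-tire", "steps"))
```

Here the steps of replacing are stored once, on &quot;replace&quot;, and are found for &quot;change-tire&quot; without being repeated there.</Paragraph>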
      <Paragraph position="26"> Anatomy of an expert Each expert in the system knows how to use specific types of links and to perform operations using local data. An expert also keeps track of its message activity. As an example, take the &quot;Chronology&quot; expert, Figure 3.  There are two parts to each expert: the static part, which is not changed during processing, and the dynamic part, which is. The dynamic component contains a memory, which keeps track of all processing done by this expert so far. This is primarily included for efficiency, since it saves the expert from having to repeat computations.</Paragraph>
      <Paragraph position="27"> It also contains a &quot;Message Center&quot;, which tells whether it is waiting for an answer from another expert (is a Client to another expert) or has other experts waiting for replies (has Customers). It also has default Customers to whom messages should be sent even if they have not been requested.</Paragraph>
      <Paragraph position="28"> The static component has a name, a list of the link types which the expert knows about, and a set of process rules. These rules are the heart of the experts, since they contain information on what processes to call to get information and what other experts to call. In the case of the Chronology expert shown in Figure 3 it uses the process &quot;trace&quot; to follow links, and can call the taxonomy expert to get superior nodes. In the case of the syntactic experts these process rules include information about using the syntactic grammar rules to find the next expert to call.</Paragraph>
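      <Paragraph> The two-part anatomy can be sketched as a pair of records. Figure 3 is not reproduced here, so the field names, the LEADTO link type listed for Chronology, and the form of the process rules are assumptions rather than the system's actual layout.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StaticPart:
    """Fixed during processing."""
    name: str
    link_types: tuple        # link types this expert knows how to use
    process_rules: tuple     # (process, expert to call) pairs


@dataclass
class DynamicPart:
    """Changes as messages are handled."""
    memory: dict = field(default_factory=dict)            # work done so far
    client_of: set = field(default_factory=set)           # awaiting answers from
    customers: set = field(default_factory=set)           # awaiting our reply
    default_customers: set = field(default_factory=set)   # always notified


@dataclass
class Expert:
    static: StaticPart
    dynamic: DynamicPart = field(default_factory=DynamicPart)

    def solve(self, query, compute):
        """Answer from memory when possible, otherwise compute and remember."""
        if query not in self.dynamic.memory:
            self.dynamic.memory[query] = compute(query)
        return self.dynamic.memory[query]


chronology = Expert(StaticPart(
    name="Chronology",
    link_types=("LEADTO",),
    process_rules=(("trace", "Taxonomy"),),
))
```

Freezing the static record mirrors the statement that it is not changed during processing, while the memory in the dynamic record is what saves an expert from repeating a computation.</Paragraph>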
      <Paragraph position="29"> Translation As experts have vocabularies that are peculiar to their domains, messages -- in particular from semantic to syntactic experts -- may require translation from the terminology of the sender to that of the receiver.</Paragraph>
      <Paragraph position="30"> For example, consider messages between clause experts (CLE) and case-frame experts (CFE). The former use the concepts of subject, object, verb, etc., whereas the latter have events, states, agents, instruments, etc. Let us consider a scenario in which a CLE has analyzed a &quot;subject&quot; and wants to convey this information to a CFE. It could send the role-labelled concept to the CFE.</Paragraph>
      <Paragraph position="31"> However, to attribute a CF role to the concept, the CFE needs to know the mood of the sentence. This it can only determine by sending messages back to the CLE. The overall effect would be to transfer information available to the CLE to the CFE. It is obviously more efficient to have the translation process as part of the resources available to the CLE and to have it send off a possible &quot;agent&quot;, say, to the CFE. The CFE can verify or reject the hypothesis using the semantic resources available to it.</Paragraph>
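      <Paragraph> A minimal sketch of translation on the clause side follows. The role table and the reading of &quot;mood&quot; as an active/passive distinction are simplifying assumptions made here, and the function names are hypothetical.

```python
# Assumed mapping from (syntactic role, voice) to a candidate case-frame role.
# Treating the "mood" of the sentence as active versus passive is this
# sketch's simplification, not a claim about the system.
ROLE_TABLE = {
    ("subject", "active"): "agent",
    ("subject", "passive"): "object",
    ("object", "active"): "object",
}


def translate(role, voice, concept):
    """Clause side: turn a syntactic finding into a case-role hypothesis."""
    candidate = ROLE_TABLE.get((role, voice))
    if candidate is None:
        return None               # no hypothesis can be offered yet
    return {"role": candidate, "concept": concept}


def verify(hypothesis, acceptable):
    """Case-frame side: accept only what the semantics allow for that role."""
    allowed = acceptable.get(hypothesis["role"], set())
    return hypothesis["concept"] in allowed


hypothesis = translate("subject", "active", "I")
print(hypothesis, verify(hypothesis, {"agent": {"I", "John"}}))
```

The case-frame side never handles the terms subject or voice; it only receives a role hypothesis that it can verify or reject.</Paragraph>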
      <Paragraph position="32"> If the CFE is predicting a certain &quot;instrument&quot;, say, it could have available to it information on the realizations of instruments and remit to the CLE the prediction. Again this is putting knowledge of syntax and of forms into the CFE; it seems better to have the CFE send &quot;instrument&quot; and the word concept to the CLE which decides upon likely realizations.</Paragraph>
      <Paragraph position="33"> All in all the translation process resides more naturally with the CLE.</Paragraph>
      <Paragraph position="34"> In general, it is taken that the translation resides in the experts on the syntactic side of the system.</Paragraph>
      <Paragraph position="35"> Other semantic phenomena that can have correlates in syntax are contingency, sequence, and decomposition. For example, chronological ordering may be realized by &quot;then&quot;. In general there are many possible realizations; they can be single words or even clauses. A little-understood &quot;connective&quot; expert has the job of watching for the syntactic clues.</Paragraph>
      <Paragraph position="36"> An Example of Experts in Action In this section we will outline how the system uses the knowledge in Figure 2 to process input about changing a tire, for example, (4) and (5).</Paragraph>
      <Paragraph position="37"> The left front tire is flat. (4) I will change it. (5) The goal of the system is to create a meaning representation by instantiating a CF. Through meta-links, a CF can be equivalent to a complex of CF's; thus the top-level instantiation may be achieved by instantiating the lower rank CF's.</Paragraph>
      <Paragraph position="38"> A CFE normally has a model of a CF that it is trying to instantiate. Initially this cannot be the case and the system has to revert to a bottom-up approach. The CFE sends a message to the CLE requesting that it be sent a translation of a syntactic analysis of a clause.</Paragraph>
      <Paragraph position="39"> The CLE has to find a clause using the rules of the grammar in Figure 1. The clause rules show that a &quot;subject&quot; expert has to be invoked. In turn it sends a request to a &quot;NP&quot; expert. The NP expert finds the rules that describe its constituent structure. Given the many rules that could be used, it would be inefficient to examine them all, so input is used to guide its choice. The expert gets the word by asking an &quot;input&quot; expert, which prompts the user. The NP expert selects those rules that can be part of a model consistent with the input. The syntactic instantiation is similar to a chart parse (Kaplan, 1973) showing the hierarchical arrangement of constituents. At this point, the CLE has not recognized any of the entry points to the translator and so cannot yet respond to the CFE. The next input word is taken by the CLE. The input will instantiate some of the analysis paths and possibly eliminate some. And so on until a constituent that can fulfill the subject expert's request is recognized. Omitting a number of steps, the response is &quot;the left front tire&quot;. The subject expert cannot truthfully forward this phrase as it cannot be certain that it is a subject until the mood of the clause is known. We are still considering what to do in this situation. We could wait or could send the concept off without annotation to see if the CFE can make any use of it.</Paragraph>
      <Paragraph position="40"> (The latter would be profitable if there are only a limited number of semantic possibilities in the context.) Let us assume that we wait. The subject expert interrogates the CLE for information on its mood, which requires that the clause expert continue the analysis. Once the verb expert has functioned, the information is available and so the the stative verb. The grammar then predicts that a &quot;state&quot; will follow. This is confirmed by the word &quot;flat&quot;. After receiving the response from the CLE, the CFE has the following instantiation:  This episode becomes part of the encyclopedia.</Paragraph>
      <Paragraph position="41"> A CFE contains the knowledge that when a state is found, a request should be passed to Chronology asking for the NEXT-EVENT. Chronology traces the LEADTO link from CF1 and predicts that Change:tire will be the next act. It passes this information back to the CFE. The CFE now has the prediction that the  will be found. TOOL6 is a token representing a group consisting of a jack and a wrench. For the sake of brevity in this example this information is made explicit; in the actual program it can be determined by tracing other links. The CFE has now processed the first case frame to the best of its abilities and sets out to instantiate the prediction. As the CFE has CF2 as its model, it can work in a top-down manner. When the prediction is passed to the CLE and translated, &quot;tire&quot; will be available as a match for the pronoun &quot;it&quot;.</Paragraph>
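      <Paragraph> The prediction step can be sketched as a lookup along a LEADTO link followed by reuse of the predicted frame when a pronoun arrives. The frame contents and slot names below are invented stand-ins, since the instantiations shown in the original figures are not reproduced here.

```python
# Hypothetical case frames and LEADTO links, loosely modelled on the
# tire example; the identifiers and slot names are invented.
case_frames = {
    "CF1": {"state": "flat", "object": "tire"},
    "CF2": {"act": "change-tire", "object": "tire", "instrument": "TOOL6"},
}
leadto = {"CF1": "CF2"}


def next_event(frame_id):
    """Chronology: follow the LEADTO link, if any, to the predicted frame."""
    target = leadto.get(frame_id)
    if target is None:
        return None
    return target, case_frames[target]


def resolve_pronoun(prediction, slot="object"):
    """Offer the predicted slot filler as an antecedent for a pronoun."""
    frame = prediction[1]
    return frame.get(slot)


prediction = next_event("CF1")
print(prediction[0], resolve_pronoun(prediction))
```

Because the predicted frame already names its object, the filler is on hand as a candidate antecedent as soon as the pronoun is read.</Paragraph>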
      <Paragraph position="42"> The instantiation of the model  The CFE seeks to set up more predictions for the dialogue. It looks to see if this action is contingent on any others. To do this it calls up chronology and requests the LAST-EVENT for CF3.</Paragraph>
      <Paragraph position="43"> Chronology calls upon taxonomy which ascends variety links to the &quot;perform&quot; act in &quot;do:job&quot;. The taxonomy expert also checks to see if the meta-node has any contingencies, but in this case it doesn't. If it did, that would also be returned to chronology. It finds CF4:  This is then passed back to the CFE to serve as a prediction for the next input. And so the cycle of prediction and instantiation continues.</Paragraph>
    </Section>
  </Section>
</Paper>