<?xml version="1.0" standalone="yes"?>
<Paper uid="P85-1031">
  <Title>AN ECLECTIC APPROACH TO BUILDING NATURAL LANGUAGE INTERFACES</Title>
  <Section position="4" start_page="0" end_page="254" type="metho">
    <SectionTitle>
LANGUAGE AS A KNOWLEDGE ENGINEERING TOOL
</SectionTitle>
    <Paragraph position="0"> The major bottlenecks in building knowledge based systems have proven to be related to the definition and acquisition of knowledge to be processed.</Paragraph>
    <Paragraph position="1"> The first bottleneck occurs in the knowledge definition phase of system development, where symbolic structures are defined that represent the knowledge necessary to accomplish a particular task. A bottleneck arises because of the shortage of knowledge engineers, who are skilled in defining these structures and using them to express relevant knowledge.</Paragraph>
    <Paragraph position="2">  The second bottleneck occurs in the knowledge acquisition phase, which involves the codification of the knowledge necessary for a system to function correctly. A bottleneck arises here because in current practice, the presence of the knowledge engineer is required throughout this time-consuming process.</Paragraph>
    <Paragraph position="3"> In the course of defining a viable methodology for the construction of expert systems (Freiling &amp; Alexander 1984; Alexander et al. 1985), we have identified certain classes of problems where the task of defining the knowledge structures and the task of actually building them can be effectively separated, with only the former being performed by a trained knowledge engineer. The problem of building a large collection of knowledge-based troubleshooters for electronic instruments is an example. In order to support the construction of a large class of such systems, it makes sense to perform the knowledge definition step for the overall domain initially, and to build domain-specific development tools, which include problem-oriented subsets of English and special purpose graphical displays, that can be reused in the development of each individual knowledge-based system.</Paragraph>
    <Paragraph position="4"> Even in the context of such an approach, we have found that there is usually a shortage of capable knowledge engineers to carry out the knowledge definition phase, and that a well-defined methodology can be of great value here in aiding nonlinguistically oriented computer scientists to carry out this verbal elicitation task. The major issue is how to get started defining the forms into which knowledge is to be cast.</Paragraph>
    <Paragraph position="5"> We have found it an effective technique to begin this process by recording statements made by task experts on tape, and transcribing these to fairly natural English. When enough recording has been done, the statements begin to take on recognizable patterns. It is then possible to build a formal grammar for many of the relevant utterances, using linguistic engineering techniques such as semantic grammars. The symbols of this grammar and the task-specific vocabulary provide convenient points for defining formal sub-structures, which are pieced together to define a complete symbolic representation.</Paragraph>
    <Paragraph position="6"> Once the grammar is reasonably well-defined, the mapping to symbolic representation can be carried out with mapping techniques such as the f-structure constraints of lexical-functional grammar.</Paragraph>
    <Paragraph position="7"> Up to this point, we can imagine that the entire task has been carried out on paper, or some machine-readable equivalent. Even in such a rudimentary form, the exercise is useful, because it provides a conveniently formal documentation for the knowledge representation decisions that have been made. However, it is also the case that these formal definitions, if appropriately constructed, provide all that is necessary to construct a problem-specific interface for acquiring utterances expressed in this sublanguage. In fact, the idea of using this technique to build acquisition interfaces, using INGLISH, actually occurred as a result of wondering what to do with a grammar we had constructed simply in order to document our representation structures (Freiling et al. 1984).</Paragraph>
    <Paragraph position="8"> We do not intend to imply that it is possible in complex knowledge-based system applications to simply build a grammar</Paragraph>
    <Paragraph position="9"> and immediately begin acquiring knowledge. Often the process leading to construction of the grammar can be quite complex. In our case, it even involved building a simple prototype troubleshooting system before we had gained sufficient confidence in our representation structures to attempt a knowledge acquisition interface.</Paragraph>
    <Paragraph position="10"> Nor do we intend to claim that all the knowledge necessary to build a complete expert system need be computed in this fashion. Systems such as INKA can be justified on an economic basis if they make possible only the transfer of a significant fraction of the relevant knowledge.</Paragraph>
  </Section>
  <Section position="5" start_page="254" end_page="256" type="metho">
    <SectionTitle>
GLIB - A PROBLEM SPECIFIC SUBLANGUAGE
</SectionTitle>
    <Paragraph position="0"> The knowledge acquisition language developed for electronic device troubleshooting is called GLIB (General Language for Instrument Behavior), and is aimed primarily at describing observations of the static and dynamic behavior of electrical signals as measured with oscilloscopes, voltmeters, and other standard electronic test instruments (Freiling et al. 1984).</Paragraph>
    <Paragraph position="1"> The grammatical structure of GLIB is that of a semantic grammar, where non-terminal symbols represent units of interest to the problem domain rather than recognizable linguistic categories.</Paragraph>
    <Paragraph position="2"> This semantic grammar formalism is an important part of the DETEKTR methodology because the construction of semantic grammars is a technique that is easily learned by the apprentice knowledge engineer. It also makes possible the establishment of very strong constraints on the formal language developed by this process. Two of the design constraints we find it advisable to impose are that the language be unambiguous (in the formal sense of a unique derivation for each legal sentence) and that it be context-free. These constraints, as will be seen, make possible features of the interface which cannot normally be delivered in other contexts, such as menus from which to select all legal next terminal tokens. While increasing complexity of the acquisition sublanguage may make these goals unfeasible past a certain point, in simple systems they are features to be cherished.</Paragraph>
    <Paragraph position="3"> Figure 1 shows a fragment of the GLIB grammar. In the DETEKTR version of INKA, sentences in this language are accepted, and mapped into Prolog terms for processing by a Prolog-based diagnostic inference engine. At present, the elicitation is unguided: responsibility resides with the user to ensure that all relevant statements are generated. We are still studying the issues involved in determining completeness of a knowledge base and assimilating new knowledge. One outcome of these studies should be means of guiding the user to areas of the knowledge base that are incomplete and warrant further elaboration. Future enhancements to the system will include explanation and modification facilities, so that knowledge may be added or changed after testing the inference engine.</Paragraph>
    <Paragraph position="4"> THE NATURAL LANGUAGE INTERFACE DESIGN INGLISH - INterface enGLISH (Phillips &amp; Nicholl, 1984) - allows a user to create sentences either by menu selection, by typing, or by a mixture of the two. This allows a self-paced transition from a menu-driven to a typed mode of interaction. In-line help is available. To assist the typist, automatic spelling correction, word completion, and automatic phrase completion are provided. INGLISH constrains users to create statements within a subset of English, here GLIB.</Paragraph>
    <Paragraph position="5"> A statement can be entered as a sequence of menu selections using only the mouse. A mouse-click brings up a menu of words and phrases that are valid extensions of the current sentence fragment. Once a selection is made from the menu using the mouse, the fragment is extended. This sequence can be repeated until the sentence is completed.</Paragraph>
    <Paragraph position="6"> Creating a sentence in this manner compares with the NLMENU system (Tennant et al., 1983). Unlike NLMENU, INGLISH also permits keyboard entry. Gilfoil (1982) found that users prefer a command form of entry to menu-driven dialogue as their experience increases. When typing, a user who is unsure of the coverage can invoke a menu, either by a mouse-click or by typing a second space character, to find out what INGLISH expects next without aborting the current statement. Similarly, any unacceptable word causes the menu to appear, giving immediate feedback of a deviation and suggestions for correct continuation. A choice from the menu can be typed or selected using the mouse. INGLISH in fact allows all actions to be performed from the keyboard or with the mouse and for them to be freely intermingled. As only valid words are accepted, all completed sentences are well-formed and can be translated into the internal representation. Figure 5, in the &quot;INGLISH&quot; window, shows a complete sentence and its translation, and a partial sentence with a menu of continuations. The numbers associated with each menu item provide a shorthand for entry, i.e., &quot;~12&quot; can be typed instead of &quot;RESISTANCE&quot;. As menu entries can be phrases, this can save significant typing effort.</Paragraph>
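The menu of valid extensions can be illustrated with a minimal Python sketch. It assumes a toy, non-left-recursive grammar (the symbol names and rules are hypothetical, not the GLIB grammar of Figure 1) and uses plain top-down expansion, not the INGLISH parser, to compute the words that may legally follow a sentence fragment.

```python
# Hypothetical toy grammar: non-terminal -> list of alternative right-hand sides.
# Any symbol that is not a key of GRAMMAR is treated as a terminal word.
GRAMMAR = {
    "rule": [["IF", "condition", "THEN", "conclusion"]],
    "condition": [["indicator", "IS", "state"]],
    "conclusion": [["device", "HAS", "FAILED"]],
    "indicator": [["LED-1"], ["LED-2"]],
    "state": [["ON"], ["OFF"]],
    "device": [["TRANSISTOR-17"], ["RESISTOR-2"]],
}


def next_words(symbols, words):
    """Return the set of terminals that may follow `words` when deriving `symbols`.

    Top-down expansion; it terminates only because the toy grammar has no
    left recursion. An empty result means the fragment is either a complete
    sentence or cannot be extended at all.
    """
    if not symbols:
        return set()
    head, rest = symbols[0], symbols[1:]
    if head not in GRAMMAR:  # terminal symbol
        if not words:
            return {head}  # fragment exhausted: this word is a legal extension
        if words[0] == head:
            return next_words(rest, words[1:])
        return set()  # fragment disagrees with this derivation
    menu = set()
    for rhs in GRAMMAR[head]:
        menu |= next_words(rhs + rest, words)
    return menu


def menu_for(fragment):
    """Sorted menu of legal next words for a space-separated fragment."""
    return sorted(next_words(["rule"], fragment.split()))


print(menu_for(""))             # words that can start a sentence
print(menu_for("IF LED-2 IS"))  # states that can follow the fragment
```

Because the toy language is unambiguous and context-free, every entry in such a menu is guaranteed to lead to at least one complete sentence, which is the property the text relies on.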
    <Paragraph position="7"> Input is processed on a word-by-word basis. Single spaces and punctuation characters serve as word terminators. Words are echoed as typed and overwritten in uppercase when accepted. Thus, if lowercase is used for typing, the progress of the sentence is easily followed. An invalid entry remains visible along with the menu of acceptable continuations, and is then replaced when a selection is made.</Paragraph>
    <Paragraph position="8"> The spelling corrector (a Smalltalk system routine is used) only corrects to words that would be acceptable in the current syntactic/semantic context. As Carbonell and Hayes (1983) point out, this is more efficient and accurate than attempting to correct against the whole application dictionary.</Paragraph>
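Context-restricted correction can be sketched in a few lines of Python. The paper uses a Smalltalk system routine; the sketch below substitutes the standard-library difflib matcher as a stand-in, and the list of expected words is a hypothetical example.

```python
import difflib


def correct(word, expected_words, cutoff=0.6):
    """Return the expected word closest to `word`, or None if none is close.

    Only words that are legal in the current context are candidates, so the
    search never considers the rest of the application dictionary.
    """
    candidates = [w.upper() for w in expected_words]
    matches = difflib.get_close_matches(word.upper(), candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical set of words expected at the current point in a sentence.
expected = ["RESISTANCE", "VOLTAGE", "CURRENT"]

print(correct("resistence", expected))
print(correct("xyzzy", expected))
```

Keeping the candidate list small is what makes the correction both cheaper and less likely to pick a word that is valid elsewhere in the language but not here.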
    <Paragraph position="9"> Word completion is provided with the &quot;escape&quot; character (cf. DEC, 1971). When this is used, INGLISH attempts to complete the word on the basis of the characters so far typed.</Paragraph>
    <Paragraph position="10"> If there are several possibilities, they are displayed in a menu. Automatic phrase completion occurs whenever the context permits no choice. The completion will extend as far as possible. In an extreme case a single word could yield a whole sentence! The system will &quot;soak-up&quot; any words in the completion that have also been typed.</Paragraph>
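Word completion and automatic phrase completion can be sketched as follows. This is an illustration only: the sublanguage is represented by a hypothetical finite list of legal sentences instead of a grammar, and the function names are assumptions.

```python
# Hypothetical stand-in for the sublanguage: a finite list of legal sentences.
SENTENCES = [
    "IF LED-2 IS ON THEN TRANSISTOR-17 HAS FAILED".split(),
    "IF LED-2 IS OFF THEN RESISTOR-2 HAS FAILED".split(),
    "THE CIRCUIT CONTAINS OSCILLATOR-1 AND POWERSUPPLY-1".split(),
]


def continuations(fragment):
    """Sorted words that can legally follow the list of words `fragment`."""
    fragment = list(fragment)
    n = len(fragment)
    return sorted({w for s in SENTENCES if s[:n] == fragment for w in s[n:n + 1]})


def complete_word(typed, fragment):
    """Expected words that start with the characters typed so far."""
    return [w for w in continuations(fragment) if w.startswith(typed.upper())]


def complete_phrase(fragment):
    """Extend `fragment` for as long as exactly one continuation exists."""
    fragment = list(fragment)
    while True:
        options = continuations(fragment)
        if len(options) != 1:
            return fragment
        fragment.append(options[0])


print(complete_word("o", ["IF", "LED-2", "IS"]))  # several candidates: show a menu
print(" ".join(complete_phrase(["THE"])))         # one word yields a whole sentence
```

In this toy language the single word THE completes to an entire sentence, while IF completes only up to the point where the user must choose between ON and OFF.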
    <Paragraph position="11"> The spelling corrector and automatic phrase completion can interact in a disturbing manner. Any word that is outside the coverage will be treated as an error and an attempt will be made to correct it. If there is a viable correction, it will be made. Should phrase completion then be possible, a portion of a sentence could be constructed that is quite different from the one intended by the user. Such behavior will probably be less evident in large grammars. Nevertheless, it may be necessary to have &quot;cautious&quot; and &quot;trusting&quot; modes, as in Interlisp's DWIM (Xerox, 1983), for users who resent the precocious impatience of the interface.</Paragraph>
    <Paragraph position="12"> The system does not support anaphora, and ellipsis is offered indirectly. The interface has two modes: &quot;ENTRY&quot; and &quot;EDIT&quot; (Figure 5). These are selected by clicking the mouse while in the pane at the top right of the interface window. Rules are normally entered in the Entry mode. When in Edit mode, the window gives access to the Smalltalk editor.</Paragraph>
    <Paragraph position="13"> This allows any text in the window to be modified to create a new statement. After editing, a menu command is used to pass the sentence to the parser as if it were being typed. Any error in the constructed sentence causes a remedial menu to be displayed and the tail of the edited sentence to be thrown away.</Paragraph>
    <Paragraph position="14"> The INGLISH interface alleviates the problem of linguistic coverage for designers and users of natural language interfaces. A natural language interface user composes his entries bearing in mind a model of the interface's capabilities. If his model is not accurate, his interactions will be error-prone. He may exceed the coverage of the system and have his entry rejected. If this happens frequently, use of the interface may be abandoned in frustration. On the other hand he may form an overly conservative model of the system and fail to utilize the full capabilities of the interface (Tennant, 1980). An interface designer is confronted by many linguistic phenomena, e.g., noun groups, relative clauses, ambiguity, reference, ellipsis, anaphora, and paraphrases. On account of performance requirements or a lack of theoretical understanding, many of these constructions will not be in the interface. INGLISH allows designers to rest more comfortably with the compromises they have made, knowing that users can systematically discover the coverage of the interface.</Paragraph>
  </Section>
  <Section position="6" start_page="256" end_page="256" type="metho">
    <SectionTitle>
THE IMPLEMENTATION OF INGLISH
</SectionTitle>
    <Paragraph position="0"> INGLISH parses incrementally from left to right and performs all checking on each word as it is entered. The parser follows the Left-Corner Algorithm (Griffiths &amp; Petrick, 1965), modified to a pseudo-parallel format so that it can follow all parses simultaneously (Phillips, 1984). This algorithm builds phrases bottom-up from the left-corner, i.e., rules are selected by the first symbol of their right-hand sides. For example, given a phrase-initial category c, a rule of the form X → c ... will be chosen. The remaining rule segments of the right-hand side are predictions about the structure of the remainder of the phrase and are processed left-to-right. Subsequent inputs will directly match successive rule segments if the latter are terminal symbols of the grammar. When a non-terminal symbol is encountered, a subparse is initiated. The subparse is also constructed bottom-up from the left-corner, following the rule selection process just described. When an embedded rule is completed, the phrase formed may have the structure of the non-terminal category that originated the subparse and so complete the subparse. If there is no match, it will become the left-corner of a phrase that will eventually match the originating category.</Paragraph>
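The rule-selection process just described can be sketched as a small left-corner recognizer in Python. This is a simplified illustration: it explores alternatives by backtracking with generators, not in the pseudo-parallel, word-by-word fashion of INGLISH, it omits top-down filtering, and the toy rules are hypothetical (not the GLIB grammar of Figure 1).

```python
# Hypothetical toy rules (lhs, rhs); not the GLIB grammar of Figure 1.
RULES = [
    ("rule", ["IF", "condition", "THEN", "conclusion"]),
    ("condition", ["indicator", "IS", "state"]),
    ("conclusion", ["device", "HAS", "FAILED"]),
    ("indicator", ["LED-2"]),
    ("state", ["ON"]),
    ("state", ["OFF"]),
    ("device", ["TRANSISTOR-17"]),
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def parse(goal, words, i):
    """Yield every position j such that `goal` spans words[i:j]."""
    if goal not in NONTERMINALS:  # terminal: must match the next word directly
        if words[i:i + 1] == [goal]:
            yield i + 1
        return
    for corner in words[i:i + 1]:  # the next word is the left corner
        yield from grow(corner, goal, words, i + 1)


def grow(corner, goal, words, i):
    """`corner` has been recognized up to position i; grow it towards `goal`."""
    if corner == goal:
        yield i
    for lhs, rhs in RULES:
        if rhs[0] == corner:  # rule selected by the first symbol of its right-hand side
            for j in match(rhs[1:], words, i):
                yield from grow(lhs, goal, words, j)


def match(symbols, words, i):
    """Match the predicted remainder of a rule, left to right."""
    if not symbols:
        yield i
        return
    for j in parse(symbols[0], words, i):  # a non-terminal starts a subparse
        yield from match(symbols[1:], words, j)


def recognize(sentence):
    """True if the whole sentence is derivable from the start category."""
    words = sentence.split()
    return any(j == len(words) for j in parse("rule", words, 0))


print(recognize("IF LED-2 IS ON THEN TRANSISTOR-17 HAS FAILED"))
print(recognize("LED-2 IS ON THEN TRANSISTOR-17 HAS FAILED"))
```

In the second call the recognizer builds a complete condition phrase before discovering that it cannot be the left corner of a rule, which is the kind of wasted work that top-down filtering is meant to avoid.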
    <Paragraph position="1"> The parser includes a Reachability Matrix (Griffiths &amp; Petrick, 1965) to provide top-down filtering of rule selection.</Paragraph>
    <Paragraph position="2"> The matrix indicates when a category A can have a category B as a left-most descendant in a parse tree. The matrix is static and can be derived from the grammar in advance of any parsing.</Paragraph>
    <Paragraph position="3"> It is computable as the transitive closure under multiplication of the boolean matrix of left daughters of non-terminal categories in the grammar. It is used as a further constraint on rule selection. For example, when the goal is to construct a sentence and the category of the first word of input is c, then rule selection, giving X → c ..., will also be constrained to have the property S ⇒* X ... . The filtering is applicable whenever a rule is selected: during subparses the constraint is to reach the category originating the subparse.</Paragraph>
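A reachability matrix of this kind can be computed with Warshall's transitive-closure algorithm, as in the Python sketch below. The toy rules are hypothetical, and making every category reach itself (reflexivity) is an added convenience assumption so that a rule whose left-hand side is already the goal passes the filter.

```python
# Hypothetical toy rules (lhs, rhs); not the GLIB grammar of Figure 1.
RULES = [
    ("rule", ["IF", "condition", "THEN", "conclusion"]),
    ("condition", ["indicator", "IS", "state"]),
    ("conclusion", ["device", "HAS", "FAILED"]),
    ("indicator", ["LED-2"]),
    ("state", ["ON"]),
    ("state", ["OFF"]),
    ("device", ["TRANSISTOR-17"]),
]


def reachability(rules):
    """Boolean left-corner reachability: reach[a][b] is True when category b
    can be the left-most descendant of category a (or when a == b)."""
    symbols = sorted({s for lhs, rhs in rules for s in (lhs, rhs[0])})
    reach = {a: {b: a == b for b in symbols} for a in symbols}
    for lhs, rhs in rules:
        reach[lhs][rhs[0]] = True  # direct left daughter
    for k in symbols:  # Warshall's transitive closure
        for a in symbols:
            if reach[a][k]:
                for b in symbols:
                    if reach[k][b]:
                        reach[a][b] = True
    return reach


REACH = reachability(RULES)


def selectable(lhs, goal):
    """Top-down filter: a rule for `lhs` is worth selecting only if `goal`
    can have `lhs` as a left-most descendant."""
    return REACH[goal][lhs]


print(selectable("indicator", "condition"))  # indicator can start a condition
print(selectable("indicator", "rule"))       # but it cannot start a rule
```

Since the matrix depends only on the grammar, it can be built once during pre-processing and consulted in constant time whenever a rule is about to be selected.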
    <Paragraph position="4"> A semantic grammar formalism is used in INGLISH, which makes the grammar application dependent. As was mentioned earlier, this format was independently chosen as part of the knowledge engineering methodology for describing the application domain. The rationale for the choice for INGLISH was that the simultaneous syntactic and semantic checking assists in achieving real-time processing. A fragment of the grammar is shown in Figure 1.</Paragraph>
    <Paragraph position="5"> Pre-processing on the grammar constructs the terminal and non-terminal vocabularies of the grammar, the reachability matrix, and an inverse dictionary. The set of all possible initial words and phrases for sentences can also be precomputed.</Paragraph>
    <Paragraph position="6"> The Smalltalk system contains controllers that manage activity on a variety of input devices, and from these a controller was readily constructed* to coordinate mouse and keyboard activity in INGLISH. Either form of entry increments an intermediate buffer which is inspected by the parser. When a complete word is found in the buffer it is parsed.</Paragraph>
    <Paragraph position="7"> * Smalltalk is an object-oriented language. Instead of creating a procedure that controls system operation, the user creates an object (usually a data structure), and a set of methods (operations that transform, and communicate with, the object). Smalltalk programs create objects or send messages to other objects. Once received, messages result in the execution of a method. Programmers do not create each object and its methods individually. Instead, classes of objects are defined. A class definition describes an object and the methods that it understands. Classes are structured hierarchically, and any class automatically inherits methods from its superclass. As a result of this hierarchy and code inheritance, applications may be written by adapting previously constructed code to the task at hand. Much of the application code can be inherited from previously defined Smalltalk code. The programmer need only redefine differences by overriding the inappropriate code with customized code (Alexander &amp; Freiling, 1985).</Paragraph>
    <Paragraph position="8"> Every phrase in an on-going analysis is contained in a Smalltalk object. The final parse is a tree of objects. The intermediate state of a parse is represented by a set of objects containing partially instantiated phrases. After the first word has established an initial set of phrase objects, they are polled by the parser for their next segments. From these and the reverse dictionary, a &quot;lookahead dictionary&quot; is established that associates expected words with the phrasal objects that would accept them. Using this dictionary an incoming word will only be sent to those objects that will accept it. If the word is not in the set of expected words, the dictionary keys are used to attempt spelling correction and, if correction fails, to make the menu to be displayed. If the dictionary contains only a single word, this indicates that automatic phrase completion should take place. A new lookahead dictionary is then formed from the updated phrase objects, and so on.</Paragraph>
  </Section>
  <Section position="7" start_page="256" end_page="257" type="metho">
    <SectionTitle>
KNOWLEDGE TRANSLATION
</SectionTitle>
    <Paragraph position="0"> The internal form of a diagnostic rule is a clause in Prolog. Sentences are translated using functional schemata, as in lexical-functional grammar. The functional schemata are attached to the phrase structure rules of GLIB (Figure 2).</Paragraph>
    <Paragraph position="2"> Unlike lexical-functional grammar, the schemata do not set up constraint equations as the interface and the semantic grammar ensure the well-formedness and unambiguity of the sentence.</Paragraph>
    <Paragraph position="3"> As a result, propagation of functional structure is handled very quickly in a post-processing step since the applicable grammatical rules have already been chosen by the parsing process.</Paragraph>
    <Paragraph position="4"> Further, by restricting the input to a strictly prescribed sublanguage, GLIB, not English in general, the translation process is simplified.</Paragraph>
    <Paragraph position="7"> The parser constructs a parse tree with attached schemata, referred to as a constituent-structure, or c-structure. Translation proceeds by instantiating the meta-variables of the schemata of the c-structure created by INGLISH to form functional equations which are solved to produce a functional structure (f-structure). The final rule form is obtained from the f-structure of the sentence when its sub-structures are recursively transformed according to the contents of each f-structure.</Paragraph>
    <Paragraph position="8"> As an example, given the lexical-functional form of the semantic grammar in Figure 2 and the following sentence: IF LED-2 IS ON THEN TRANSISTOR-17 HAS FAILED, the c-structure in Figure 3 would be produced. This shows that a rule has a condition part, COND, and a conclusion part, CNCL, that should become a clausal-form rule(COND, CNCL). The meta-symbol ↑ refers to the parent node and ↓ to the node to which the schema is attached.</Paragraph>
    <Paragraph position="9"> The final phase of INKA interprets the f-structures to produce Prolog clauses. All of the information required to produce the clauses is contained in the FORM property in this example. The FORM property is printed, with all variables instantiated, to produce the final rule in the form of a Prolog clause. The f-structure of Figure 4 produces the Prolog clause rule(state(led-2, on), status(transistor-17, failed)).</Paragraph>
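The last step, printing an instantiated FORM property as a clause, can be sketched in Python. The nested-dictionary layout of the f-structure and the attribute names IND, STATE, DEV and STATUS are assumptions made for illustration; only COND, CNCL and FORM are named in the text, and Figure 4 may organize the structure differently.

```python
def render(term):
    """Render a nested tuple (functor, arg, ...) as Prolog-style term text."""
    if isinstance(term, tuple):
        functor, *args = term
        return functor + "(" + ", ".join(render(a) for a in args) + ")"
    return str(term)


# Hypothetical f-structure for "IF LED-2 IS ON THEN TRANSISTOR-17 HAS FAILED".
# The attribute names other than COND, CNCL and FORM are assumptions.
f_structure = {
    "COND": {"IND": "led-2", "STATE": "on"},
    "CNCL": {"DEV": "transistor-17", "STATUS": "failed"},
}


def instantiate_forms(f):
    """Fill in FORM properties bottom-up, then return the FORM of the rule."""
    cond, cncl = f["COND"], f["CNCL"]
    cond["FORM"] = ("state", cond["IND"], cond["STATE"])
    cncl["FORM"] = ("status", cncl["DEV"], cncl["STATUS"])
    f["FORM"] = ("rule", cond["FORM"], cncl["FORM"])
    return f["FORM"]


clause = render(instantiate_forms(f_structure)) + "."
print(clause)
```

The sketch reproduces the clause text quoted in the paper; it makes no attempt to model how the functional equations themselves are solved.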
  </Section>
  <Section position="8" start_page="257" end_page="259" type="metho">
    <SectionTitle>
KNOWLEDGE USE
</SectionTitle>
    <Paragraph position="0"> Translated rules are sent to a diagnostic engine that has been implemented in Prolog. The diagnostic engine uses GLIB statements about the hierarchical structure of the device to build a strategy for successive localization of failures. Starting at the highest level (&quot;the circuit&quot; in GLIB terminology), named sub-circuits are examined in turn, and diagnostic rules retrieved to determine correctness or failure of the sub-circuit in question. If no specific determination can be made, the sub-circuit is assumed to be functioning properly.</Paragraph>
    <Paragraph position="1"> The functional specifications of the example may be solved by instantiating the meta-symbols with actual nodes and assigning properties and values to the nodes according to the specifications. In the example given, most specifications are of the form &quot;(↑ property)=value&quot; where &quot;value&quot; is most often ↓. This form indicates that the node graphically indicated by ↓ in the c-structure is the specified property of the parent node (pointed to by ↑). Specifications are left-associative and have a functional semantic interpretation. A specification of (↑ COND FORM) refers to the FORM property of the parent node's COND property. The f-structure for the example is given in Figure 4.</Paragraph>
    <Paragraph position="2"> A sample session including acquisition of a rule and running of a test diagnosis is shown in Figure 5. The circuit used in this example consists of an oscillator which drives a light emitting diode (LED-2 in the schematic) and a power supply (LED-1 indicates when the power supply is on). The schematic diagram of the circuit is in the upper pane of the &quot;Instrument Data&quot; window; the circuit board layout is in the lower pane. Rules for diagnosing problems in the circuit (&quot;troubleshooting&quot; rules) are added to the system in the window labeled &quot;INGLISH.&quot; The interface to the diagnostic engine is in the &quot;Prolog&quot; window. The &quot;INGLISH&quot; window shows a recently added rule, with its Prolog translation immediately below it. It also shows a partially completed rule along with a menu of acceptable sentence continuations. The user may select one of the menu items (either a word or phrase) to be appended to the current sentence. The &quot;Prolog&quot; window displays the results of a recent test diagnosis. This test was run after the first rule in the &quot;INGLISH&quot; window was added, but before the addition of the second rule was begun.</Paragraph>
    <Paragraph position="3"> The last question asked during the diagnosis corresponds to the first rule. Resistor 2, in both the schematic and board diagrams of the &quot;Instrument Data&quot; window, is highlighted as a result of running the diagnosis: whenever the diagnostic engine selects a specific component for consideration that component is highlighted on the display. Some 20 statements and rules have been collected for diagnosing the circuit; Figure 6 lists a portion of them with their Prolog translation.</Paragraph>
  </Section>
  <Section position="9" start_page="259" end_page="259" type="metho">
    <SectionTitle>
THE CIRCUIT CONTAINS OSCILLATOR-1 AND POWERSUPPLY-1.
</SectionTitle>
    <Paragraph position="0"> has_component(block(circuit), block(oscillator(1))).</Paragraph>
    <Paragraph position="1"> has_component(block(circuit), block(powersupply(1))).</Paragraph>
  </Section>
</Paper>