<?xml version="1.0" standalone="yes"?>
<Paper uid="H86-1001">
  <Title>Research and Development in Natural Language Processing at BBN Laboratories in the Strategic Computing Program</Title>
  <Section position="4" start_page="1" end_page="3" type="metho">
    <SectionTitle>
BBN Laboratories Incorporated
</SectionTitle>
    <Paragraph position="0"> o IRUS-86 contains a new module which exploits this NIKL domain model to simplify MRL expressions; this makes it possible to translate complex MRL expressions into ERL constants, thus allowing for significant divergences between the input English and the structure of the underlying data base [Stallard 86].</Paragraph>
    <Paragraph position="1"> In addition to accessing the NIKL domain model, the parser, semantic interpreter and MRL-to-ERL translator access other knowledge sources which contain domain-dependent information:
      o the lexicon,
      o the semantic interpretation rules for individual concepts,
      o the MRL-to-ERL mapping rules for individual MRL constants, which introduce the details of underlying system structure, such as file and field names.
      To port IRUS to the Navy domain, the relevant domain-dependent data had to be supplied to the system. This task is being accomplished by personnel at the Naval Ocean Systems Center (NOSC). In August 1985, BBN provided NOSC with an initial prototype system containing small example sets of lexical entries, semantic interpretation rules, and MRL-to-ERL rules; using acquisition tools provided by BBN, NOSC personnel have been entering the rest of the data.</Paragraph>
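    <Paragraph> As a purely illustrative sketch of what a table of MRL-to-ERL mapping rules might look like, the following Python fragment maps MRL constants to file and field names. The constant names, file names, field names and the function translate_constant are all hypothetical; they are not taken from IRUS-86, whose rules are written in its own formalism.
<![CDATA[
```python
# Hypothetical illustration only: these constants, files and fields are
# invented for the example and do not come from IRUS-86.
MRL_TO_ERL_RULES = {
    # MRL constant -> (file name, field name) in the underlying data base
    "SHIP-SPEED": ("SHIPS", "MAXSPD"),
    "SHIP-HOME-PORT": ("SHIPS", "HOMEPORT"),
}


def translate_constant(mrl_constant):
    """Return a 'FILE.FIELD' reference for one MRL constant.

    Raises KeyError when no mapping rule has been entered yet, which is
    the situation a knowledge acquisition tool would need to repair.
    """
    if mrl_constant not in MRL_TO_ERL_RULES:
        raise KeyError("no MRL-to-ERL rule for " + mrl_constant)
    file_name, field_name = MRL_TO_ERL_RULES[mrl_constant]
    return file_name + "." + field_name
```
]]>
      Under these invented rules, translate_constant("SHIP-SPEED") returns "SHIPS.MAXSPD". Keeping file and field names in the rule table, rather than in the lexicon, is what lets the rest of the system stay independent of the data base layout.</Paragraph>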
    <Paragraph position="2"> IRUS-86 was delivered to the FRESH developers at Texas Instruments in January 1986, was installed in a test bed area of the Pacific Fleet Command Center in April 1986, and will be demonstrated in June 1986. Currently, the lexicon and the domain-dependent rules of the system only cover a relatively small part of the OSGP capabilities and the files and attributes of the Integrated Data Base. Once enough data have been entered so that the system covers a sufficiently large part of the data base, it will be tried out in actual use by Navy personnel. This will enable us to gather data about the way the system performs in a real environment, and to fine-tune the system in various respects. For instance, IRUS-86 makes use of shallow heuristic methods to address some aspects of natural language understanding such as anaphora and ellipsis for which general solutions are still research issues. The FCCBMP application provides a test bed in which such heuristic methods can be evaluated, and enhancements to them developed and tested, as part of the evolutionary technological growth intended to continue throughout the Natural Language Technology effort of the Strategic Computing Program.</Paragraph>
  </Section>
  <Section position="5" start_page="3" end_page="4" type="metho">
    <SectionTitle>
3 Functional Goals for JANUS
</SectionTitle>
    <Paragraph position="0"> The IRUS-86 system is distinguished by its clean, modular structure, its broad syntactic/semantic coverage, its sophisticated domain model, and its systematic treatment of discrepancies between the English lexicon and the data base structure.</Paragraph>
    <Paragraph position="1"> We thus expect that it will demonstrate considerable utility as an interface component in the FCCBMP application. Nevertheless, IRUS-86 shares with other current systems several limitations which should be overcome if natural language interfaces are to become truly &quot;natural&quot;. In developing JANUS, the successor of IRUS-86, we shall attempt to overcome some of those limitations. The areas of increased functionality we are considering are: semantics and knowledge representation, ill-formedness, discourse, cooperativeness, multiple underlying systems, and knowledge acquisition.</Paragraph>
    <Section position="1" start_page="3" end_page="4" type="sub_section">
      <SectionTitle>
3.1 Semantics and Knowledge Representation
</SectionTitle>
      <Paragraph position="0"> IRUS-86, like most other current systems, represents sentence meanings as formulas of a logical language which is a slight extension of first-order logic. As a consequence, many important phenomena in English have no equivalent in the meaning representation language, and cannot be dealt with correctly, e.g., modalities, propositional attitudes, generics, collective quantification, and context-dependence.</Paragraph>
      <Paragraph position="1"> Thus, one forgoes one of the most important potential assets of a natural language interface: the capacity to express complex semantic structures in a succinct and comfortable way.</Paragraph>
      <Paragraph position="2"> In JANUS, we will therefore adopt a new meaning representation language which combines features from PHLIQA1's enriched lambda-calculus [Scha 76] with ideas underlying Montague's Intensional Logic [Montague 70], and possibly a distributed quote-operator [Haas 86]. It will have sufficient expressive power to incorporate a version of Carlson's treatment of generics [Carlson 79], a version of Scha's treatment of quantification [Scha 81], Montague's treatment of modality, and various possible approaches to propositional attitudes and context-dependence.</Paragraph>
      <Paragraph position="3"> In adopting a higher order logic as proposed, one confronts problems of formula simplification and the need to apply meaning postulates to reduce the semantic representation of an input sentence to an expression appropriate to the underlying system, e.g., a relational algebra expression in the case that the underlying system is a data base. To do this, we will investigate the limited inference mechanisms of KL-TWO [Moser 83, Vilain 85], following up on our previous work [Stallard 86]. The advantage of these inference mechanisms is their tractability; discovering their power and limitations in this complex problem domain should be an interesting result.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="4" end_page="5" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="4" end_page="4" type="sub_section">
      <SectionTitle>
3.2 Discourse
</SectionTitle>
      <Paragraph position="0"> The meaning of a sentence depends in many ways on the context which has been set up by the preceding discourse. IRUS and other systems, however, currently ignore most of these dependencies, and employ a rather shallow model of discourse structure.</Paragraph>
      <Paragraph position="1"> To allow the user to exploit the full expressive potential of a natural language interaction, the system must track topics, reference times, possible antecedents for anaphora, etc.; it must be able to recognize the constituent units of a discourse and the subordination or coordination relations obtaining between them. A substantial amount of work has been done already on several of these issues, much of it by BBN researchers [Sidner 85, Hinrichs 81, Polanyi 84, Grosz 86]. Research in this area continues under a separate DARPA-funded contract. We expect to be able to integrate some of the results of that research in the JANUS system.</Paragraph>
    </Section>
    <Section position="2" start_page="4" end_page="4" type="sub_section">
      <SectionTitle>
3.3 Ill-formedness
</SectionTitle>
      <Paragraph position="0"> A natural interface system should be forgiving of a user's deviations from its expectations, be they misspellings, typographical errors, unknown words, poor syntax, incorrect presuppositions, fragmentary forms, or violated selection restrictions.</Paragraph>
      <Paragraph position="1"> Empirical studies show that as much as 25% of the input to data base query systems is ill-formed.</Paragraph>
      <Paragraph position="2"> IRUS currently handles some classes of ill-formedness by using a combination of shallow heuristics and user interaction. It can correct for typographical misspellings, for omitted determiners or prepositions, and for some ungrammaticalities, like determiner-noun and subject-verb disagreement. The JANUS system will employ a more general approach to ill-formedness that will handle a larger class of ungrammatical constructions and a larger class of word selection problems, and that will also explore correcting several types of semantic ill-formedness.</Paragraph>
      <Paragraph position="3"> These capabilities have major implications for the control of the understanding process, since considering such possibilities can exponentially expand the search space. Maintaining control will require care in integrating the ill-formedness capability into the rest of the system, and also making maximal use of the guidance that can be derived from a model of the discourse and the user's goals to constrain the search.</Paragraph>
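      <Paragraph> To make the simplest of these cases concrete, the correction of typographical misspellings against a lexicon can be sketched as follows. This is only an assumed illustration using the similarity matching in Python's standard difflib module; the lexicon entries, the cutoff value and the function correct_word are invented, and the heuristics actually used in IRUS may differ.
<![CDATA[
```python
import difflib

# Invented toy lexicon; a real lexicon would come from the domain data.
LEXICON = ["carrier", "cruiser", "destroyer", "display", "speed"]


def correct_word(word, lexicon=LEXICON, cutoff=0.8):
    """Return the word if known, else its closest lexicon entry, else None.

    A high cutoff keeps the number of candidate corrections small, which
    matters because every extra candidate enlarges the search space.
    """
    lowered = word.lower()
    if lowered in lexicon:
        return lowered
    matches = difflib.get_close_matches(lowered, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```
]]>
      With this toy lexicon, correct_word("destroyr") returns "destroyer", while a string with no close match returns None, at which point a system could fall back on user interaction.</Paragraph>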
    </Section>
    <Section position="3" start_page="4" end_page="5" type="sub_section">
      <SectionTitle>
3.4 Cooperativeness
</SectionTitle>
      <Paragraph position="0"> A truly helpful system should not react to the literal meaning of a sentence, but to its perceived intent. If in the context of a given application it is possible to characterize the goals that a user may be expected to be pursuing through his interaction with the system, the system should try to infer from the user input what the underlying goal could be. A system can do this by accessing a goal-subgoal</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="5" end_page="5" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> hierarchy which links the speech acts expressed by individual utterances to the global goals that the user may have. This strategy has been applied successfully to rather small domains [Allen 83, Sidner 85]. We wish to investigate whether it carries over to the FCCBMP applications.</Paragraph>
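    <Paragraph> A goal-subgoal hierarchy of this kind can be pictured, in a much reduced form, as a table of parent links that is followed upward from the speech act of an utterance. The sketch below is an assumption made for illustration: the goal names and the function goal_chain are invented, and real plan recognition of the sort cited above involves considerably more than a table lookup.
<![CDATA[
```python
# Invented goal names; each entry links a speech act or goal to the
# more global goal it may serve.
PARENT_GOAL = {
    "ask-ship-location": "assess-ship-readiness",
    "ask-ship-fuel": "assess-ship-readiness",
    "assess-ship-readiness": "plan-deployment",
}


def goal_chain(speech_act):
    """Return the goals above a speech act, most specific first."""
    chain = []
    goal = PARENT_GOAL.get(speech_act)
    while goal is not None:
        chain.append(goal)
        goal = PARENT_GOAL.get(goal)
    return chain
```
]]>
      Here goal_chain("ask-ship-fuel") returns ["assess-ship-readiness", "plan-deployment"], so a question about fuel could be answered with the wider readiness goal in mind.</Paragraph>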
    <Section position="1" start_page="5" end_page="5" type="sub_section">
      <SectionTitle>
3.5 Modelling the Capabilities of Multiple Systems
</SectionTitle>
      <Paragraph position="0"> The way in which IRUS-86 decides whether an input sentence translates into an IDB query or an OSGP command may be refined. There is a need for work on what kind of knowledge would be necessary to interface smoothly and intelligently to multiple underlying systems. A reasoning component is needed that can determine which underlying system or systems can best fulfill a user's request. Such a reasoning component would have to combine a model of the capabilities of the underlying systems with a model of the user goals and current intentions in the discourse context in order to choose the correct system(s). Such a model would also be useful for providing supporting information to the user.</Paragraph>
    </Section>
    <Section position="2" start_page="5" end_page="5" type="sub_section">
      <SectionTitle>
3.6 Knowledge Acquisition
</SectionTitle>
      <Paragraph position="0"> Further research is also called for to expand the power of the knowledge acquisition tools that are used in adding to the lexicon, the set of case frame rules, the model of domain predicates, and the set of transformation rules between the Meaning Representation Language and the languages of the underlying systems. The acquisition tools available in IRUS, unlike those in some other systems, are not tied to the specific fields and relations in the underlying database. The acquisition tools should work on the higher level of the domain model, since that provides a more general and transportable result. The knowledge acquisition facilities for JANUS will also need to be redesigned to support and to make maximal use of the power of the new meaning representation language based on intensional logic.</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="5" end_page="6" type="metho">
    <SectionTitle>
4 New Underlying Technologies
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="5" end_page="6" type="sub_section">
      <SectionTitle>
4.1 Coping with Ambiguity
</SectionTitle>
      <Paragraph position="0"> The new functionalities we described in the previous section, and the techniques we intend to use to achieve them, raise an issue which has important consequences for the design of JANUS: we will be faced with an explosion in the number of interpretations that the system will have to process; every sentence will be manifold ambiguous. One source of this phenomenon is the improvement of the semantic coverage and the broadening of the discourse context. Distinctions and ambiguities which so far were ignored will be dealt with: for instance, different interpretation and</Paragraph>
    </Section>
  </Section>
  <Section position="9" start_page="6" end_page="6" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> scopes of quantifiers will be considered, and different antecedents for pronouns. Even more serious is the processing of ill-formed sentences, which may require trying out all partial interpretations to see which one can be extended to a complete interpretation after relaxing one or more constraints.</Paragraph>
    <Paragraph position="1"> To cut down on the processing of spurious interpretations, it is very important that interpretations of sentences and their constituents be tested for plausibility at an early stage. Several techniques will probably have to be used in conjunction:
      o Simplification transformations may show that an interpretation is absurd, by reducing it to TRUE or FALSE or the empty set.</Paragraph>
    <Paragraph position="2"> o The discourse context and the model of the user's goals impose constraints on expected sentences.</Paragraph>
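    <Paragraph> The first of these techniques can be sketched for a small propositional fragment. In the following illustration, which is an assumption and not the simplifier of IRUS-86 or JANUS, formulas are nested tuples such as ("and", p, q), atoms are strings, and an interpretation is discarded when it simplifies to the constant True or False. The representation and the function names simplify, is_trivial and plausible are invented for the example.
<![CDATA[
```python
def simplify(formula):
    """Simplify a formula built from 'and', 'or', 'not', atoms and booleans."""
    if not isinstance(formula, tuple):
        return formula  # an atom (string) or a boolean constant
    op = formula[0]
    args = [simplify(arg) for arg in formula[1:]]
    if op == "not":
        (arg,) = args
        return (not arg) if isinstance(arg, bool) else ("not", arg)
    if op == "and":
        absorbing, neutral = False, True
    elif op == "or":
        absorbing, neutral = True, False
    else:
        raise ValueError("unknown operator: " + str(op))
    if any(arg is absorbing for arg in args):
        return absorbing  # e.g. a conjunction containing False
    rest = [arg for arg in args if arg is not neutral]
    if not rest:
        return neutral
    if len(rest) == 1:
        return rest[0]
    return (op,) + tuple(rest)


def is_trivial(formula):
    """True when the formula reduces to a constant and so carries no query."""
    return isinstance(simplify(formula), bool)


def plausible(interpretations):
    """Keep only the interpretations that do not reduce to TRUE or FALSE."""
    return [f for f in interpretations if not is_trivial(f)]
```
]]>
      For instance, ("and", "ship(x)", ("not", True)) simplifies to False and would be pruned, whereas ("and", "ship(x)", "fast(x)") is kept. Running such a test on constituents as soon as they are built is what allows spurious readings to be dropped early.</Paragraph>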
    <Section position="1" start_page="6" end_page="6" type="sub_section">
      <SectionTitle>
4.2 Parallel Parsing
</SectionTitle>
      <Paragraph position="0"> Since some of the techniques that we intend to use to fight the ambiguity explosion are themselves rather computation-intensive, it is clearly unavoidable that the improved system functionality that we aim for will lead to a considerable increase in the amount of processing required. To avoid a serious decrease of the new system's response times, we will therefore move it to a suitable parallel machine such as BBN's Butterfly or Monarch, running a parallel Common Lisp. This in itself has rather serious consequences for the software design. It means that from the outset we will keep parallelizability of the software in mind.</Paragraph>
      <Paragraph position="1"> We have begun to address this issue in the area of syntax. A new declarative grammar is being written, which will ultimately have a coverage of English larger than the current RUS grammar; the grammar is written in a side-effect-free formalism (a context-free grammar with variables) so that different parsing algorithms may be explored which are easily parallelizable. The first such algorithm was implemented in May 1986 on BBN's Butterfly.</Paragraph>
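      <Paragraph> To indicate why a side-effect-free formalism eases parallelization, the sketch below gives a CKY-style recognizer for a toy context-free grammar with a single agreement variable (number). It is an assumed illustration, not the algorithm implemented on the Butterfly: the grammar, the lexicon and the function names are invented. Each chart cell is computed by a pure function of shorter spans, so all cells of the same length are independent of one another and could be computed concurrently.
<![CDATA[
```python
from itertools import product

# Invented toy lexicon: word -> set of (category, number) pairs.
LEXICON = {
    "the": {("Det", "sg"), ("Det", "pl")},
    "ship": {("N", "sg")},
    "ships": {("N", "pl")},
    "sails": {("VP", "sg")},
    "sail": {("VP", "pl")},
}
# Binary rules (left, right) -> parent. The number variable must have the
# same value on both children and is passed up to the parent.
RULES = {("Det", "N"): "NP", ("NP", "VP"): "S"}


def combine(left_cell, right_cell):
    """Pure function: parents derivable from two adjacent cells."""
    found = set()
    for (lcat, lnum), (rcat, rnum) in product(left_cell, right_cell):
        parent = RULES.get((lcat, rcat))
        if parent is not None and lnum == rnum:
            found.add((parent, lnum))
    return frozenset(found)


def fill_cell(chart, start, end):
    """Pure function: the cell for words[start:end] from shorter spans."""
    found = set()
    for mid in range(start + 1, end):
        found |= combine(chart[(start, mid)], chart[(mid, end)])
    return frozenset(found)


def recognize(words):
    """Return True when the words form a sentence of the toy grammar."""
    n = len(words)
    chart = {(i, i + 1): frozenset(LEXICON.get(w, ()))
             for i, w in enumerate(words)}
    for length in range(2, n + 1):
        spans = [(s, s + length) for s in range(n - length + 1)]
        # Cells of equal length do not depend on each other, so this
        # comprehension could be replaced by a parallel map.
        new_cells = {span: fill_cell(chart, *span) for span in spans}
        chart = {**chart, **new_cells}
    return any(cat == "S" for cat, _ in chart.get((0, n), frozenset()))
```
]]>
      With this grammar, recognize("the ship sails".split()) succeeds, while "the ships sails" is rejected because the number variable cannot be bound consistently.</Paragraph>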
    </Section>
  </Section>
  <Section position="10" start_page="6" end_page="7" type="metho">
    <SectionTitle>
5 Contributions from Other Sites
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="6" end_page="7" type="sub_section">
      <SectionTitle>
5.1 ISI/UMass: Generation
</SectionTitle>
      <Paragraph position="0"> We should not expect that JANUS will always be able to assess correctly which interpretation of a sentence is the intended one. In light of such situations, it is very important that the system can give a paraphrase of the input to the user, which shows the system's interpretation. This may be done either explicitly or as part of the answer. To be able to develop such capabilities, work on Natural Language Generation is needed. At USC/ISI a project directed by William Mann and Norman</Paragraph>
    </Section>
  </Section>
  <Section position="11" start_page="7" end_page="7" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> Sondheimer is underway to develop the generation system PENMAN, using the NIGEL systemic grammar. PENMAN will be integrated to become the generation component of JANUS. PENMAN itself consists of several subcomponents. Some of these, specifically the &quot;text planning&quot; component, will be developed through joint work between USC/ISI and David McDonald at the University of Massachusetts, based on the latter's experience with the MUMBLE system.</Paragraph>
    <Section position="1" start_page="7" end_page="7" type="sub_section">
      <SectionTitle>
5.2 UPenn: Cooperation and Clarification
</SectionTitle>
      <Paragraph position="0"> Under the direction of Aravind Joshi and Bonnie Webber at the University of Pennsylvania, several focussed studies have been carried out to investigate various aspects of cooperative system behaviour and clarification interactions. (For more detail, see their paper in this issue.) As part of the Strategic Computing Natural Language effort, UPenn will eventually develop this into a module which can be integrated into JANUS to further enhance its capabilities.</Paragraph>
    </Section>
  </Section>
</Paper>