<?xml version="1.0" standalone="yes"?>
<Paper uid="H91-1081">
  <Title>SPOKEN LANGUAGE SYSTEMS</Title>
  <Section position="3" start_page="0" end_page="407" type="intro">
    <SectionTitle>
ACCOMPLISHMENTS
</SectionTitle>
    <Paragraph position="0"> Work in this area draws on three technologies: large-vocabulary continuous speech recognition, natural language understanding, and system integration. In our work at BBN, we have integrated our BYBLOS continuous speech recognition technology with a new natural language understanding component, DELPHI, resulting in a complete spoken language system called HARC (Hear And Respond to Continuous speech). A major accomplishment of this project has been the development of a real-time version of HARC, implemented entirely on commercially available hardware. An N-best version of BYBLOS (see below) running in real time has been implemented on a Sun 4 with a Sky Challenger signal processing board. The DELPHI natural language component also runs on a Sun 4. The complete system has recently been interfaced to a DARPA-sponsored military logistical planning system called DART (Dynamic Analysis Replanning Tool).</Paragraph>
    <Paragraph position="1"> The DELPHI natural language component uses a unification formalism for describing the syntax and semantics of English and for enforcing syntactic and semantic constraints. It uses a higher-order intensional logic for representing the meaning of a sentence. The system applies syntax and semantics incrementally; the advantages of this approach are that unproductive search paths are cut off more quickly, and that any improvements in unification parsing apply automatically to semantics as well as syntax. We have implemented unification semantics for our grammar rules in four task domains: battle management, personnel information retrieval, airline travel information retrieval, and military logistical planning. We have extended the JANUS discourse module, developed under an earlier DARPA effort, and interfaced it to the HARC system. We also developed a method for rapid porting of the natural language component to new task domains using the Parlance Learner™ knowledge acquisition tool. Recent accomplishments in DELPHI include parsing speedups, streamlining of the unification grammar, and the introduction of mapping units into the semantic processing. Syntactic and semantic parsing of a sentence now takes less than one second on average on a Sun 4.</Paragraph>
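The way unification cuts off unproductive search paths can be illustrated with a minimal sketch. It is not DELPHI's formalism: it assumes feature structures are plain nested dictionaries, omits variable sharing (reentrancy), and uses the hypothetical names unify, noun, and verb_requirement. A clash in either the syntactic or the semantic features makes unification fail, so the parser can abandon that path at once.

```python
from typing import Any, Dict, Optional

# Assumption: a feature structure is a nested dict of atomic values.
FeatureStructure = Dict[str, Any]


def unify(a: FeatureStructure, b: FeatureStructure) -> Optional[FeatureStructure]:
    """Unify two feature structures; return None if any feature clashes."""
    result = dict(a)  # shallow copy so the inputs are not modified
    for key, b_val in b.items():
        if key not in result:
            # Feature only present in b: carry it over unchanged.
            result[key] = b_val
            continue
        a_val = result[key]
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            # Both sides are complex: unify them recursively.
            merged = unify(a_val, b_val)
            if merged is None:
                return None
            result[key] = merged
        elif a_val != b_val:
            # Atomic values disagree: the constraint fails.
            return None
    return result


# Hypothetical lexical entry and rule constraint.
noun = {"syn": {"num": "sg"}, "sem": {"type": "flight"}}
verb_requirement = {"syn": {"num": "sg"}, "sem": {"type": "flight", "role": "theme"}}

print(unify(noun, verb_requirement))            # syntax and semantics both agree
print(unify(noun, {"syn": {"num": "pl"}}))      # syntactic clash gives None
print(unify(noun, {"sem": {"type": "person"}})) # semantic clash gives None
```

Because syntactic and semantic features live in the same structure, a single unification step checks both kinds of constraint, which is the sense in which the two are applied incrementally here.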
    <Paragraph position="2"> One important contribution has been the development of the N-best search strategy for integrating speech and natural language components. This method produces the N highest-scoring sentences that match an input utterance, aided by a statistical language model. The natural language component then searches these N sentences for the highest-scoring sentence for which the system can produce a semantic interpretation. The N-best paradigm, by providing a clean and simple interface between speech and natural language, has found immediate acceptance as the method of choice in spoken language integration.</Paragraph>
    <Paragraph position="3"> An efficient two-pass (forward-backward) algorithm for obtaining the N-best sentences has made it possible to implement the search in real time on a Sun 4.</Paragraph>
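The interface between the recognizer and the natural language component in the N-best paradigm can be sketched as follows. This is an illustrative sketch only, not the HARC implementation: the names select_interpretable and toy_interpret, the score values, and the toy acceptance rule are all assumptions. The recognizer is assumed to supply a list of (sentence, score) pairs with higher scores being better.

```python
from typing import Callable, List, Optional, Tuple

# Assumption: a hypothesis is (sentence, recognizer score), higher is better.
Hypothesis = Tuple[str, float]


def select_interpretable(
    n_best: List[Hypothesis],
    interpret: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, float, str]]:
    """Return the highest-scoring hypothesis that has a semantic interpretation."""
    # Visit hypotheses best first, so the first success is the answer.
    for sentence, score in sorted(n_best, key=lambda h: h[1], reverse=True):
        meaning = interpret(sentence)
        if meaning is not None:
            return sentence, score, meaning
    return None  # no hypothesis could be interpreted


def toy_interpret(sentence: str) -> Optional[str]:
    """Stand-in for a natural language component: accepts only 'show ...' requests."""
    words = sentence.split()
    if len(words) >= 2 and words[0] == "show":
        return "DISPLAY(" + " ".join(words[1:]) + ")"
    return None


# Hypothetical recognizer output; the top-scoring string is a misrecognition.
n_best = [
    ("shoe flights to boston", -10.0),
    ("show flights to boston", -11.5),
    ("show lights to boston", -12.0),
]
print(select_interpretable(n_best, toy_interpret))
```

The only thing passed between the two components is a ranked list of word strings, which is why the interface stays simple: either side can be replaced without changing the other.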
    <Paragraph position="4"> In this project, we have been instrumental in the design of methodologies for the collection of spoken language data and the objective evaluation of spoken language systems. We previously helped specify the DARPA Resource Management Corpus that is now in common use for speech recognition evaluation. More recently, our proposals for the evaluation of spoken language systems have been adopted by the DARPA community.</Paragraph>
    <Paragraph position="5"> We have developed what we believe to be the first successful method for the automatic detection of out-of-vocabulary words. This is an important problem for any realistic system with a large vocabulary. Initial results show a 70% detection rate with a false-alarm rate of only 1%. We have also recently developed a capability for adding new words to a speech recognition system.</Paragraph>
  </Section>
</Paper>