File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/89/h89-2059_abstr.xml
Size: 4,246 bytes
Last Modified: 2025-10-06 13:46:46
<?xml version="1.0" standalone="yes"?>
<Paper uid="H89-2059">
<Title>SPOKEN LANGUAGE SYSTEMS</Title>
<Section position="1" start_page="0" end_page="443" type="abstr">
<SectionTitle> SPOKEN LANGUAGE SYSTEMS PI: John Makhoul </SectionTitle>
<Paragraph position="0"> BBN STC, 10 Moulton St., Cambridge, MA 02138 makhoul@bbn.com The objective of this project is to develop a spoken language system capable of understanding and responding to spoken English commands and queries for interactive human-machine applications, such as battle management, command and control, and training of personnel on complex tasks. The system will also include a capability to adapt to new speakers and a capability to detect when a user says a new word and to let the user add that word to the system.</Paragraph>
<Paragraph position="1"> Work in this area requires the integration of three technologies: large-vocabulary continuous speech recognition, natural language understanding, and system integration. In our work at BBN, we have integrated our BYBLOS continuous speech recognition technology with a new unification-based natural language understanding component, resulting in an initial complete spoken language system, called HARC (Hear And Respond to Continuous speech).</Paragraph>
<Paragraph position="2"> Our most recent contribution is the development of a new strategy for integrating the speech and natural language components, called &quot;N-best&quot;. This method takes a spoken utterance and produces the N highest-scoring sentences that match the input utterance within some threshold, based on a statistical language model. The natural language component then searches these N sentences for the highest-scoring sentence for which the system can produce a semantic interpretation. The meaning representations are passed to a discourse component that resolves reference ambiguities and chooses the best meaning. Finally, the chosen meaning representation is passed to a response component which carries out the user's request.</Paragraph>
<Paragraph position="3"> Initial experiments have shown that for applications of interest, the correct sentence is usually one of the top five and almost always within the top twenty (i.e., N=20). One important feature of this N-best integration strategy is that it provides a very clean interface between speech and natural language and, therefore, allows for greater sharing of resources among researchers in spoken language systems.</Paragraph>
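The N-best control flow can be made concrete with a short sketch. The Python fragment below is illustrative only and is not taken from HARC or BYBLOS; the Hypothesis class, the interpret callback, the scores, and the toy sentences are hypothetical stand-ins for the recognizer output and the unification-based natural language component.

```python
# Minimal sketch (not BBN's implementation) of the N-best integration
# strategy: the recognizer proposes the N highest-scoring word strings,
# and the natural language component accepts the best one it can interpret.

from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Hypothesis:
    words: str      # candidate sentence from the recognizer
    score: float    # combined acoustic and statistical language-model score

def nbest_understand(
    hypotheses: List[Hypothesis],
    interpret: Callable[[str], Optional[dict]],
    n: int = 20,
) -> Optional[dict]:
    """Return the semantic interpretation of the highest-scoring
    hypothesis that the natural language component can interpret."""
    ranked = sorted(hypotheses, key=lambda h: h.score, reverse=True)[:n]
    for hyp in ranked:
        meaning = interpret(hyp.words)   # None if no semantic interpretation
        if meaning is not None:
            return meaning               # handed on to discourse / response
    return None                          # fall back (e.g., ask the user to rephrase)

# Toy usage: the second-best hypothesis is the first one that parses.
if __name__ == "__main__":
    nbest = [
        Hypothesis("show me the ships in port eight", -120.5),
        Hypothesis("show me the ships in port A", -121.0),
    ]
    toy_interpret = lambda s: {"request": "list_ships", "port": "A"} if s.endswith("port A") else None
    print(nbest_understand(nbest, toy_interpret))
```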
<Paragraph position="4"> The natural language knowledge sources in HARC use a unification formalism for describing the syntax and semantics of English and a higher-order intensional logic for representing the meaning of an utterance. The system uses unification to enforce syntactic as well as semantic constraints, and provides for the incremental application of syntax and semantics. Advantages of this approach are that unproductive search paths are cut off more quickly, and any improvements in unification parsing (through better algorithms, special hardware, etc.) apply automatically to semantics as well as syntax.</Paragraph>
<Paragraph position="5"> In this project, we have been instrumental in the design and collection of spoken language data for the purpose of objective system evaluation. We previously helped specify the DARPA Resource Management Corpus that is now in common use for speech recognition evaluation, and we provided a Word-Pair Grammar to be used with the corpus. We have recently developed and made available to the DARPA community, with full documentation, a personnel database for use in spoken language evaluations, and a relational database language (ERL) in Common LISP to interface to the database. We have also provided software to aid in the collection of an appropriate corpus by Texas Instruments.</Paragraph>
<Paragraph position="6"> Most recently, we have started work on the automatic detection of out-of-vocabulary words. This is an important problem for any realistic system with a large vocabulary, since the user is unlikely to be able to remember which words are in the vocabulary. Initial results for the detection of open-class words have been very encouraging.</Paragraph>
</Section>
</Paper>
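To make the unification mechanism described above concrete, here is a generic, textbook-style sketch of feature-structure unification. It is not HARC's formalism (which pairs unification with a higher-order intensional logic); the categories, features, and values are hypothetical, and the point is only how a clash between syntactic or semantic constraints prunes an unproductive search path early.

```python
# Generic sketch of feature-structure unification over nested dicts;
# illustrative only, not BBN's parser or grammar formalism.

def unify(a, b):
    """Unify two feature structures (nested dicts / atomic values).
    Returns the merged structure, or None if the constraints clash."""
    if a is None or b is None:
        return None
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_val in b.items():
            if key in result:
                merged = unify(result[key], b_val)
                if merged is None:
                    return None          # conflicting constraints: prune this path
                result[key] = merged
            else:
                result[key] = b_val
        return result
    return a if a == b else None         # atomic values must match exactly

# Toy usage: a verb's agreement constraint unifies with a singular subject
# but fails against a plural one, cutting off that search path immediately.
verb = {"cat": "V", "agr": {"num": "sg", "per": 3}}
subj_sg = {"agr": {"num": "sg"}}
subj_pl = {"agr": {"num": "pl"}}
print(unify(verb, subj_sg))   # merged feature structure
print(unify(verb, subj_pl))   # None -> constraint violation
```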