File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/89/h89-1007_abstr.xml
Size: 3,197 bytes
Last Modified: 2025-10-06 13:46:41
<?xml version="1.0" standalone="yes"?> <Paper uid="H89-1007"> <Title>Integrating Speech and Natural Language</Title> <Section position="1" start_page="0" end_page="82" type="abstr"> <SectionTitle> BBN Systems and Technologies Corporation 10 Moulton Street, Cambridge, MA 02238 </SectionTitle> <Paragraph position="0"> The overall goal of this project is to integrate speech and natural language knowledge sources to build a speech understanding system for human-machine communication using spoken English.</Paragraph> <Paragraph position="1"> The speech knowledge sources use acoustic models based on hidden Markov modeling techniques.</Paragraph> <Paragraph position="2"> The natural language knowledge sources use a Unification grammar formalism for describing the syntax of English, a higher-order intensional logic language for representing the meaning of an utterance, and a 'Montague Grammar style' framework for interfacing syntax and semantics.</Paragraph> <Paragraph position="3"> The objective of an integrated search strategy is to find the globally optimal (by acoustic likelihood) interpretation of the input given the constraints that are imposed by the syntactic and semantic components. Our approach in the BBN Spoken Language System (SLS) uses hidden Markov word models to determine a 'dense' word lattice (to minimize errors due to early decisions) of possible words present in the input speech. Then, a lattice parser is used to find all parses for all the syntactically possible word sequences present in the lattice (ordered by acoustic likelihood).</Paragraph> <Paragraph position="4"> Finally, a semantic interpreter determines from the ordered list of possible word sequences the highest scoring meaningful word sequence. 
That word sequence is the recognized sentence, and its meaning is the interpretation of the input speech.</Paragraph> <Paragraph position="5"> The lattice parser extends our bottom-up, word-synchronous parser for the Unification grammar so that it accepts a word lattice as input. The algorithm determines all possible grammatical word sequences and ranks them by acoustic likelihood score. That list of grammatical sentences is then processed to determine the highest scoring word sequence that is also meaningful according to the application domain semantics.</Paragraph> <Paragraph position="6"> The SLS was evaluated using the DARPA Resource Management database. For the language modeling components (syntax and semantics), we used a training corpus of 791 sentences, which was used for system development, and a test corpus of 200 sentences, which was unseen by the system developers. The grammar coverage was 92% for the training corpus and 81% for the test corpus. The coverage of the semantic interpreter was 75% and 52% for the two corpora, respectively.</Paragraph> <Paragraph position="7"> We have also measured the speech understanding performance on three speakers from the 1000-word Resource Management database. The word accuracy on the 1000-word task improves from 71% when no language model is used to 87% when syntax alone is used and to 92% when both syntax and semantics are used.</Paragraph> </Section></Paper>