Robust Continuous Speech Recognition Technology 
Program Summary* 
Clifford J. Weinstein and Douglas B. Paul, Principal Investigators 
Lincoln Laboratory, M.I.T. 
Lexington, MA 02173-9108 
Program Goals 
The major objective of this program is to develop and demonstrate
robust, high-performance continuous speech recognition (CSR)
techniques and systems focused on applications in spoken language
systems (SLS). A key supporting objective is to develop techniques
for integration of CSR and natural language processing (NLP) systems
in SLS applications. The CSR techniques are based on a
continuous-observation hidden Markov model (HMM) approach, using
tied Gaussian mixtures to model the speech parameters. A
stack-decoder control structure is being developed and utilized,
both for efficient large-vocabulary recognition and to facilitate
integration of CSR and NLP systems.
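As a rough illustration of what "tied Gaussian mixtures" means, the sketch below computes the observation log-likelihood of one HMM state when every state draws on a single shared pool of Gaussians and differs only in its mixture weights. It is a minimal sketch under stated assumptions, not the Lincoln implementation: the diagonal covariances, the function names, and the parameter names are all illustrative choices that do not come from the summary.

```python
import math


def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x (assumed form)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def tied_mixture_loglik(x, shared_means, shared_vars, state_weights):
    """Log of sum_k w[k] * N(x; mean[k], var[k]) for one state.

    shared_means / shared_vars are the Gaussian pool common to all states;
    state_weights is the only state-specific parameter set.
    """
    comp = [diag_gaussian_logpdf(x, m, v)
            for m, v in zip(shared_means, shared_vars)]
    terms = [math.log(w) + lp
             for w, lp in zip(state_weights, comp) if w > 0.0]
    if not terms:
        return float("-inf")
    # Log-sum-exp keeps the sum stable when densities are very small.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))
```

Because the Gaussian pool is shared, the component densities for a frame need to be evaluated only once and can then be reused by every state, which is the usual motivation for tying.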
Background 
The Lincoln program began with a focus on improving speaker stress
robustness for the fighter aircraft environment. A robust HMM
isolated-word recognition (IWR) system was developed with very high
performance under stress conditions. The robust HMM system has since
been adapted and extended to large-vocabulary CSR. This effort has
included the development of a number of new modeling and recognition
techniques, and the resulting tied-mixture HMM CSR system achieved
state-of-the-art performance for both speaker-dependent (SD) and
speaker-independent (SI) recognition on the DARPA Resource
Management (RM) database. Current work focuses on extension to the
new Wall Street Journal (WSJ) CSR corpus, with vocabularies of 5,000
to 20,000 words.
*This work was sponsored by the Defense Advanced Research Projects
Agency. The views expressed are those of the authors and do not
reflect the official policy or position of the U.S. Government.
Recent Accomplishments 
Recent accomplishments include: (1) major contributions to the
design of the WSJ corpus, and development and implementation of the
text preprocessing system needed to make text and language models
available to recording sites and testing sites; (2) development of
an efficient stack decoder algorithm for large-vocabulary CSR; (3)
development of fast match techniques to expedite large-vocabulary
recognition; and (4) application of the tied-mixture,
stack-decoder-based HMM CSR system, with fast match, to obtain a
first set of results on the new WSJ corpus.
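The stack decoder mentioned above can be pictured, in heavily simplified form, as a best-first search over partial word sequences held in a priority queue. The sketch below is only a toy under stated assumptions: `stack_decode`, `extensions`, `is_complete`, and the small probability table are all hypothetical, and the acoustic scoring, the fast match, and the normalization a real decoder needs to compare hypotheses that cover different amounts of speech are omitted.

```python
import heapq
import math


def stack_decode(extensions, is_complete, max_pops=10000):
    """Best-first search: always extend the cheapest partial hypothesis.

    extensions(words) yields (next_word, log_prob) pairs (hypothetical API).
    Costs are negative log-probabilities, so they never decrease along a
    path and the first complete hypothesis popped is the best-scoring one.
    """
    heap = [(0.0, ())]  # entries are (cost, tuple of words so far)
    pops = 0
    while heap and pops < max_pops:
        cost, words = heapq.heappop(heap)
        pops += 1
        if is_complete(words):
            return list(words), -cost
        for word, logp in extensions(words):
            heapq.heappush(heap, (cost - logp, words + (word,)))
    return None, float("-inf")


# Toy word-sequence model, invented purely for illustration.
TABLE = {
    (): [("show", math.log(0.6)), ("list", math.log(0.4))],
    ("show",): [("ships", math.log(0.5)), ("ports", math.log(0.5))],
    ("list",): [("ships", math.log(0.9)), ("ports", math.log(0.1))],
}

best_words, best_logp = stack_decode(
    lambda words: TABLE.get(words, []),
    lambda words: len(words) == 2,
)
```

In this toy table the locally best first word ("show") does not start the best complete sequence, so the example shows why the queue keeps competing partial hypotheses alive rather than committing greedily.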
Plans 
Plans for the current program include: (1) extensive development
and testing of large-vocabulary CSR techniques on the new CSR
corpus; (2) continued development of the tied-mixture HMM CSR
system, including adaptive training and recognition techniques,
mixture weight smoothing, and improved speaker-independent
techniques; (3) further development of the stack-decoder-based HMM,
for integration with the CSR/NLP interface system and with NLP
systems developed at other sites; and (4) exploration of advanced
acoustic modeling techniques for improved recognition.
