EVALUATION AND ANALYSIS OF AUDITORY MODEL
FRONT ENDS
FOR ROBUST SPEECH RECOGNITION
PROGRAM SUMMARY*
Richard P. Lippmann, Principal Investigator 
Lincoln Laboratory, M.I.T. 
Lexington, MA 02173-9108
PROGRAM GOALS 
The purpose of this work is to integrate a number of audi- 
tory model front ends into a high-performance HMM recog- 
nizer, to test and evaluate these front ends on noisy speech, 
and to analyze the results in order to develop a more robust 
front end which may combine features of a number of the 
current auditory model-based systems. 
BACKGROUND 
This project was motivated by the need for improved 
speech recognition in noise, and by the expectation that au-
ditory model front ends could make recognition more ro-
bust to noise, microphone variation, and speaking style.
The project has focussed on implementing, evaluating, and 
comparing three promising auditory front ends: (1) the 
mean-rate and synchrony outputs of S. Seneff's auditory
model; (2) the ensemble interval histogram (EIH) model
developed by O. Ghitza; and (3) the IMELDA model due 
to M. Hunt. Additional comparisons have been carried out 
between baseline systems using mel-cepstra derived from 
filterbank and LPC analysis. 
RECENT ACCOMPLISHMENTS 
The three auditory models (Seneff, EIH and IMELDA) have 
been compared extensively among themselves and with a
mel-cepstrum front end for HMM isolated-word recogni- 
tion on the TI-105 isolated word corpus. Conditions tested 
have included additive white noise, additive speech babble 
noise, and spectral variability due to microphone placement, 
channel, and acoustic recording environment. The best re- 
sults from the auditory models were shown to provide small 
but consistent improvement over mel-cepstrum under con- 
ditions of high noise and spectral variability. These small 
improvements may not warrant the added complexity of the 
auditory models. 
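As an illustration of the additive white noise condition described above, the following minimal sketch mixes white Gaussian noise into a signal at a chosen signal-to-noise ratio. The function names, the 10 dB SNR, and the 1 kHz test tone sampled at 16 kHz are assumptions made for this example; the summary does not state the SNR levels or signal parameters actually used. A babble-noise condition could presumably be sketched the same way by substituting recorded multi-talker speech for the Gaussian samples.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Return signal plus white Gaussian noise scaled to the requested SNR in dB."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    # Target noise power follows from SNR(dB) = 10 * log10(P_signal / P_noise).
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Rescale the drawn samples so their measured power matches the target.
    drawn_power = sum(n * n for n in noise) / len(noise)
    scale = math.sqrt(noise_power / drawn_power)
    return [x + scale * n for x, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB of a noisy signal relative to its clean version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical test signal: one second of a 1 kHz sine sampled at 16 kHz.
clean = [math.sin(2.0 * math.pi * 1000.0 * t / 16000.0) for t in range(16000)]
noisy = add_white_noise(clean, snr_db=10.0)
```

Calling measured_snr_db(clean, noisy) on the result is a simple way to check that the mixture was produced at the intended level before it is passed to a front end.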
Additional comparisons between mel-filterbank (MFB) and 
LPC-based cepstrum front ends were conducted, showing 
significant advantages for MFB in noise; the gain in moving 
from LPC to MFB was greater than the gain in moving
from MFB to any of the auditory models. Most recently,
selected CSR experiments have been performed on the Resource
Management task, comparing auditory models to MFB. These
results were confirmed at other sites.
As yet, no improvements have been achieved with the au-
ditory models.
*THIS WORK WAS SPONSORED BY THE DEFENSE ADVANCED
RESEARCH PROJECTS AGENCY. THE VIEWS EXPRESSED ARE
THOSE OF THE AUTHOR AND DO NOT REFLECT THE
OFFICIAL POLICY OR POSITION OF THE U.S. GOVERNMENT.
PLANS 
Plans include: (1) further investigation of dimensionality re-
duction using principal components and linear discriminant
analysis, and (2) completion of the CSR resource manage- 
ment tests on the auditory models. 
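The principal components step named in the plans can be sketched as follows. This is only an illustrative outline using NumPy, not the procedure used in the project: the function name pca_reduce, the 40-channel frame size, and the choice of 12 retained components are all assumptions, and the random data stands in for real front-end output. Linear discriminant analysis would additionally require class labels for each frame and is not shown.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project feature vectors (one per row) onto their top principal components."""
    centered = features - features.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    basis = eigenvectors[:, order]
    return centered @ basis, basis


# Hypothetical data: 200 frames of 40-channel front-end output.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 40))
reduced, basis = pca_reduce(frames, n_components=12)
```

The reduced vectors would then replace the full front-end output as the observation vectors of the recognizer, with the retained dimension chosen empirically.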
