A DESCRIPTION OF THE VESPRA SPEECH PROCESSING SYSTEM 
Rolf Haberbeck 
FU Berlin 
FB Germanistik 
D-1000 Berlin 33 
TU Berlin 
FB Informatik 
D-1000 Berlin 10 
ABSTRACT 
The VESPRA system is designed to process 
chains of wordforms that are uttered in 
isolation. These chains of wordforms 
correspond to sentences except that 
they are not realised in connected 
speech. VESPRA stands for 
Verarbeitung und Erkennung gesprochener 
Sprache (processing and recognition of 
spoken language). VESPRA will be used 
to control different types of machines 
by voice input, for instance 
non-critical control functions in cars 
and trucks, voice boxes in digital 
telephone systems, text processing 
systems and different types of office 
workstations. 
1. 
The VESPRA system consists of five 
components: 
1) the noise reduction unit; 
2) the phonetic feature extraction and 
pattern recognition unit; 
3) an ATN grammar, a dialog model and 
a model of the controlled machine; 
4) a machine control and dialog 
generation unit; 
5) a user-friendly software development 
environment. 
2. 
Unlike common speech processing 
systems, VESPRA has an integrated noise 
reduction unit. This unit is context 
sensitive: depending on the type of 
noise, different filters reduce the 
noise according to the actual situation 
in which the system is used. Both analog 
and digital filtering methods will be 
used. Until now, noise has been a major 
problem that has prevented the 
widespread use of speech processing 
systems. The noise reduction is 
triggered by the actual state of the 
machine and the general acoustical 
environment. VESPRA will be able to 
recognize 500 wordforms in 
speaker-dependent mode and 100 
wordforms in speaker-independent mode. 
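The context-sensitive selection of 
filters described above can be 
illustrated with a minimal sketch. It 
is not the VESPRA implementation: the 
state names, the environment names, the 
table FILTER_TABLE and the two crude 
digital filters are all hypothetical 
and only show how the machine state and 
the acoustical environment could pick a 
filter. 

```python
from typing import Callable, Dict, List, Tuple

Signal = List[float]
Filter = Callable[[Signal], Signal]


def identity(x: Signal) -> Signal:
    """No filtering: return a copy of the input samples."""
    return list(x)


def moving_average(x: Signal, width: int = 3) -> Signal:
    """Crude low-pass filter: mean over a sliding window (shorter at the start)."""
    out = []
    for i in range(len(x)):
        window = x[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def first_difference(x: Signal) -> Signal:
    """Crude high-pass filter: suppresses slowly varying components."""
    if not x:
        return []
    return [x[0]] + [x[i] - x[i - 1] for i in range(1, len(x))]


# Hypothetical table: (machine state, acoustical environment) -> filter.
FILTER_TABLE: Dict[Tuple[str, str], Filter] = {
    ("engine_running", "cabin"): first_difference,
    ("engine_off", "cabin"): identity,
    ("idle", "office"): moving_average,
}


def select_filter(machine_state: str, environment: str) -> Filter:
    """Pick a filter for the current context; fall back to no filtering."""
    return FILTER_TABLE.get((machine_state, environment), identity)


def reduce_noise(signal: Signal, machine_state: str, environment: str) -> Signal:
    """Apply the filter selected for the current context to the signal."""
    return select_filter(machine_state, environment)(signal)
```

A table lookup keeps the choice of 
filter separate from the filters 
themselves, so new contexts can be 
added without changing the filtering 
code. 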
An ATN grammar processes all meaningful 
sentences that can be built from these 
wordforms (including reduced forms of 
sentences). The result of this lexical, 
syntactical, semantical and pragmatical 
processing is stored in the dialog 
memory or compared with the content of 
the dialog memory. The interpreted 
command input is then processed by the 
model of the actual state of the 
controlled machine. If a user command 
conflicts with the general state of the 
controlled machine, VESPRA informs the 
user by voice output or by visual 
output. The voice output is realised 
with LPC-coded speech and is included 
in the VESPRA system. The visual output 
depends on the possibilities offered by 
the controlled machine. If a user 
command does not conflict with the 
general state of the controlled 
machine, the VESPRA system gives an 
instruction to the controlled machine. 
The interface between VESPRA and the 
controlled machine is designed so that 
various types of sensors and actuators 
can be connected to VESPRA. 
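The check of an interpreted command 
against the model of the controlled 
machine can be sketched as follows. 
This is only an illustration under 
stated assumptions: the class 
MachineModel, the function 
handle_command, the device names and 
the conflict rule (unknown device, or 
device already in the requested state) 
are hypothetical and do not come from 
the VESPRA design. 

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class MachineModel:
    """Hypothetical model of the actual state of the controlled machine."""
    state: Dict[str, str] = field(default_factory=dict)

    def conflicts(self, device: str, target: str) -> bool:
        # Assumed rule: a command conflicts if the device is unknown
        # or is already in the requested state.
        return device not in self.state or self.state[device] == target

    def apply(self, device: str, target: str) -> None:
        # Record the new state after the instruction has been issued.
        self.state[device] = target


def handle_command(model: MachineModel, device: str, target: str) -> Tuple[str, str]:
    """Return ("inform_user", message) on conflict,
    otherwise ("instruct_machine", instruction)."""
    if model.conflicts(device, target):
        return ("inform_user", f"cannot set {device} to {target}")
    model.apply(device, target)
    return ("instruct_machine", f"{device}={target}")
```

For example, with 
MachineModel({"wipers": "off"}) the 
command (wipers, on) yields an 
instruction to the machine, while 
repeating the same command yields a 
message for the user. 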
There is a feedback loop between the 
lexical, syntactical, semantical and 
pragmatical processing unit and the 
phonetic feature extraction and pattern 
recognition unit in order to optimize 
both the phonetic processing and the 
processing of the chains of wordforms. 
The dialog model and the model of the 
controlled machine control the noise 
reduction unit. A chain of wordforms 
may consist of at most ten wordforms. 
After the user has finished the command 
input, the VESPRA system or the 
controlled machine reacts within 
0.3 seconds. 
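One way such a feedback between grammar 
and pattern recognition could work is 
that the grammar predicts which 
wordforms may come next and the 
recognizer only chooses among those. 
The sketch below assumes this reading. 
It uses a plain transition network 
rather than a full ATN (no registers or 
actions), and the network NETWORK, the 
wordforms and the score format are 
hypothetical. Only the limit of ten 
wordforms is taken from the text. 

```python
from typing import Dict, List, Optional, Set

# Hypothetical transition network: state -> {wordform: next state}.
NETWORK: Dict[str, Dict[str, str]] = {
    "S": {"switch": "V"},
    "V": {"wipers": "N", "radio": "N"},
    "N": {"on": "END", "off": "END"},
}

MAX_WORDFORMS = 10  # limit stated in the text


def allowed_wordforms(state: str) -> Set[str]:
    """Wordforms the grammar accepts in the given state."""
    return set(NETWORK.get(state, {}))


def recognize_chain(score_frames: List[Dict[str, float]]) -> Optional[List[str]]:
    """Choose one wordform per isolated utterance.

    score_frames holds, for each utterance, an acoustic score per
    candidate wordform (higher is better). Candidates the grammar
    does not allow in the current state are discarded before the
    best-scoring one is chosen. Returns None if the chain is too
    long, cannot be continued, or is incomplete.
    """
    if len(score_frames) > MAX_WORDFORMS:
        return None
    state = "S"
    chain: List[str] = []
    for scores in score_frames:
        allowed = allowed_wordforms(state)
        candidates = {w: s for w, s in scores.items() if w in allowed}
        if not candidates:
            return None
        best = max(candidates, key=candidates.get)
        chain.append(best)
        state = NETWORK[state][best]
    return chain if state == "END" else None
```

In this sketch an acoustically 
better-scoring candidate outside the 
vocabulary expected by the grammar is 
rejected in favour of the best 
candidate that the grammar allows. 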
A user-friendly software development 
system that runs on a mainframe or a 
workstation gives an untrained user 
(engineer) the possibility to modify 
certain units of the VESPRA system 
within a certain limit of complexity. 
This development system may modify 
the parameters of the following units: 
-lexical, syntactical, semantical and 
pragmatical processing; 
-dialog model and dialog memory; 
-model of the actual state of the 
machine; 
-machine control and dialog generation. 
No special knowledge of linguistics or 
information science is required to use 
this development system. 
3. 
The VESPRA system will not only be 
realised as a software simulation on a 
mainframe computer. The main goal is to 
build a hardware module which can be 
used for several purposes. This system 
will be developed in cooperation with 
several research institutions and major 
industrial companies. The project is 
financed by industry and the 
federal research and technology 
department (BMFT: Bundesminister für 
Forschung und Technologie). 

[Figure: THE VESPRA SYSTEM. Block 
diagram; blocks listed from top to 
bottom: SPEECH, NOISE; NOISE REDUCTION; 
PHONETIC FEATURE EXTRACTION AND 
SEGMENTATION; PATTERN RECOGNITION AND 
CLASSIFICATION; LEXICAL, SYNTACTICAL, 
SEMANTICAL AND PRAGMATICAL PROCESSING; 
DIALOG MODEL AND DIALOG MEMORY; MODEL 
OF THE ACTUAL STATE OF THE CONTROLLED 
MACHINE; MACHINE CONTROL AND DIALOG 
GENERATION; CONTROLLED MACHINE. A 
further block is labelled MENU-GUIDED 
MODIFICATION OF THE PARAMETERS OF THE 
SYSTEM COMPONENTS.] 

