Book Reviews Understanding Spoken Language 
Understanding Spoken Language 
Donald E. Walker, Editor 
Elsevier North-Holland, New York, 1978, 
420 pp., Paperback, $9.95, ISBN 0-444-00287-1. 
In 1970 the Advanced Research Projects Agency 
decided to fund six research projects aimed at devel- 
oping systems that were capable of understanding 
connected speech. From 1971 to 1975, this research 
was carried out. This book is a collection of articles 
(most of which have been published separately) which 
grew out of the final report of the speech understand- 
ing group at SRI International. Despite its title, its 
stated purpose is to describe SRI's speech understand- 
ing system rather than speech understanding in gener- 
al, and it contains much material pertinent to under- 
standing written language as well as speech. 
Although the introductory and concluding material 
attempts to unify the book, it remains a collection of 
very separate articles rather than a unified whole. As 
a consequence it suffers from the common problems of 
books of this type: inadequate cross referencing, poor 
transitions between chapters, and no index. The ab- 
sence of an index is a serious problem that is com- 
pounded by the fact that the table of contents contains 
only three levels of structure; more detailed outlines of 
the contents are found at the beginning of each chap- 
ter. Fortunately the references have been merged into 
a single list. The reference list is very good, in part 
because it is not too long to scan easily. 
The signal processing part of speech understanding 
is given the barest mention because SRI did not do 
work in that area; the book concentrates on the higher 
level aspects of the understanding process. 
The first chapter is a nicely written introduction 
and overview by Donald Walker. It describes the 
organization of the ARPA speech understanding effort 
and outlines the SRI system. The second chapter, by 
William Paxton, quickly plunges the reader into a rath- 
er detailed description of the language definition sys- 
tem which was used to define the language that the 
system would understand. These definitions were then 
compiled into a form that the executive system, which 
controlled the other components of the system, would 
understand. The language definition consisted of a 
lexicon (words and "multiwords" with grammatical 
categories, grammatical features, and associated se- 
mantic information) and composition rules (phrase 
structure rules augmented by procedures to be execut- 
ed whenever the rule constructs a phrase). The proce- 
dures gave values to attributes of the phrase as a func- 
tion of the attributes of its constituents and judged the 
acceptability of the phrase on a number of grounds 
such as acoustic properties, syntactic properties (such 
as mood and number), semantic properties (using the 
semantic network representation discussed further on), 
and discourse information to handle anaphora and 
ellipsis. Much of the complexity of the language defi- 
nition derives from the fact that it must screen out bad 
input rather than just recognize good input as many 
grammars do. 
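As an illustration only, the idea of a composition rule augmented by a procedure can be sketched in Python as below. This is not SRI's language definition notation: the names `Phrase`, `LEXICON`, `np_procedure`, and `compose`, and the toy vocabulary, are all assumptions invented for the sketch, and only a syntactic check (number agreement) is shown where the system described also applied acoustic, semantic, and discourse checks.

```python
from dataclasses import dataclass, field


@dataclass
class Phrase:
    category: str                      # grammatical category, e.g. "DET", "N", "NP"
    words: tuple                       # the words covered by this phrase
    attrs: dict = field(default_factory=dict)


# Toy lexicon (hypothetical entries): word -> (category, features).
LEXICON = {
    "this": ("DET", {"number": "sg"}),
    "these": ("DET", {"number": "pl"}),
    "ship": ("N", {"number": "sg"}),
    "ships": ("N", {"number": "pl"}),
}


def lookup(word):
    """Build a one-word phrase from its lexicon entry."""
    category, features = LEXICON[word]
    return Phrase(category, (word,), dict(features))


def np_procedure(det, noun):
    """Procedure attached to the rule NP -> DET N.

    Computes the attributes of the new phrase from its constituents,
    or returns None to reject the phrase as unacceptable.
    """
    if det.attrs["number"] != noun.attrs["number"]:
        return None                    # number disagreement: screen out
    return {"number": noun.attrs["number"]}


# Composition rules: (result category, constituent categories, procedure).
RULES = [("NP", ("DET", "N"), np_procedure)]


def compose(constituents):
    """Apply the first rule whose pattern matches and whose procedure accepts."""
    categories = tuple(p.category for p in constituents)
    for result, pattern, procedure in RULES:
        if categories == pattern:
            attrs = procedure(*constituents)
            if attrs is not None:
                words = tuple(w for p in constituents for w in p.words)
                return Phrase(result, words, attrs)
    return None
```

Here `compose([lookup("these"), lookup("ships")])` builds a plural noun phrase, while `compose([lookup("this"), lookup("ships")])` returns `None`, which is the sense in which such a definition screens out bad input rather than merely recognizing good input.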
In the discussion of the executive system itself, much
space is devoted to historical background, comparisons
with other speech understanding systems, and the
experimentation (using analysis of variance) that was
186 American Journal of Computational Linguistics, Volume 6, Number 3-4, July-December 1980 
done to determine the best control strategy to use. 
The control issues that were tested were left to right 
processing through the sentence versus an "island 
driven" strategy, examining all the words that might 
be present in a given location at once versus taking 
them one at a time, doing time-consuming but accurate 
context checking as soon as possible versus delaying 
such checking, and focusing the processing on a single 
hypothesis versus skipping around to whatever hypothesis
seemed best at the time. The average reader, however, will
probably be more interested in the results of the experiments
than in their details.
The third chapter explains the semantic component 
of the system, which was represented by the parti- 
tioned network scheme of Gary Hendrix. With many 
detailed examples, Hendrix shows how the partitioning 
scheme was used to encode quantifiers and logical 
connectives (conjunction, disjunction, negation, impli- 
cation), to form associations between semantic objects 
and the syntactic units of the input, to distinguish 
between new and old information, to encode multiple 
hypotheses, to allow sharing of representations among 
competing hypotheses, and to define hierarchies for 
discourse analysis. The semantic component of the 
system used this formalism to filter out combinations 
of words that were acoustically and syntactically
acceptable but semantically unacceptable; it also
constructed a representation of the meaning of good
interpretations for other components to use, and could
make predictions of words or structures that were 
likely to occur in other parts of the utterance. 
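A rough sketch may help convey what partitioning buys when competing hypotheses share structure. The Python below is a deliberately simplified assumption, not Hendrix's formalism: the class `PartitionedNet`, its methods, and the node and space names are invented for illustration, and quantifiers, connectives, and hierarchies are not modeled.

```python
from collections import defaultdict


class PartitionedNet:
    """A toy semantic network whose arcs are grouped into named spaces."""

    def __init__(self):
        # space name -> set of (source node, arc label, target node)
        self.arcs = defaultdict(set)

    def add(self, space, source, label, target):
        """Record an arc as belonging to the given space."""
        self.arcs[space].add((source, label, target))

    def view(self, *spaces):
        """Return the arcs visible when only the given spaces are considered."""
        visible = set()
        for space in spaces:
            visible |= self.arcs[space]
        return visible


net = PartitionedNet()
# Structure common to both readings of an utterance lives in a shared space.
net.add("shared", "ship1", "element-of", "SHIPS")
# Each competing hypothesis adds its own arcs in its own space.
net.add("hyp_a", "own1", "object", "ship1")
net.add("hyp_b", "build1", "object", "ship1")

reading_a = net.view("shared", "hyp_a")
reading_b = net.view("shared", "hyp_b")
```

Both readings see the shared arc for `ship1`, but neither sees the arcs that belong only to the other hypothesis, so the common representation is stored once.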
The next chapter is devoted to Barbara Grosz's 
work on discourse knowledge. After showing exam- 
ples of how the focus of a dialogue affects the identifi- 
cation of definite noun phrases, word sense interpreta- 
tion, pronominal reference, and ellipsis, she discusses 
(again, in more detail than some readers would wish) 
the analysis of actual problem solving and question 
answering dialogs which were examined to provide the 
basis for a representation of focus that would enable 
the SRI system to use focus in its semantic interpreta- 
tions. The notion of focus spaces which was derived 
from these experiments was represented in Hendrix's 
partitioned network formalism and used for resolution 
of noun phrases, inferencing, reference resolution, and 
other high level aspects of sentential processing. The 
problems of shifting focus, reinvoking an old focus of 
attention, and dealing with ellipsis are also covered in 
detail. 
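The use of focus to resolve a definite noun phrase can be suggested by a small sketch. It is an assumption made for illustration rather than a description of the SRI mechanism: focus is reduced here to a list of spaces searched from most recent to oldest, and the function `resolve` and the entities in `focus_stack` are invented.

```python
def resolve(description, focus_stack):
    """Return the first entity of the described kind, searching the
    most recently opened focus space first; None if nothing matches."""
    for space in reversed(focus_stack):
        for entity, kind in space.items():
            if kind == description:
                return entity
    return None


# Hypothetical focus spaces, oldest first: entity name -> kind of object.
focus_stack = [
    {"pump1": "pump", "wrench1": "wrench"},   # an earlier step of the task
    {"bolt1": "bolt", "wrench2": "wrench"},   # the step currently in focus
]

current_wrench = resolve("wrench", focus_stack)   # found in the current space
earlier_pump = resolve("pump", focus_stack)       # found only in the older space
```

In this sketch "the wrench" picks out the wrench in the current focus space even though an earlier one exists, while "the pump" falls back to the older space.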
The fifth chapter comprises three sections relating 
to the problem of responding to an utterance once it 
has been understood. This chapter is unfortunately limited in
scope, since the major emphasis of the project was on 
understanding rather than responding to spoken lan- 
guage. Gary Hendrix writes on the problem of inter- 
acting with the deduction component and an English 
generator to formulate a reply. Richard Fikes and 
Gary Hendrix detail the deduction component, and 
Jonathan Slocum's section deals with generating an 
English description of a semantic structure. The
conclusion, written by Ann Robinson, summarizes the
work and points out issues relating to other areas of 
research. 
One of the chief features of the book is the large 
number of illustrations and detailed examples, includ- 
ing as an appendix a short but well chosen example of 
the entire processing of a single utterance. 
Someone already familiar with the ARPA speech 
project will gain little from this book, and someone 
interested in a general overview of the speech under- 
standing problem and the ARPA project's results 
would do better to look elsewhere [1, 2]. What this
book has to offer is something rare but not unimpor- 
tant in the literature: a detailed description of a single 
large and complex system. One can get from it not 
only an understanding of how that system worked but 
also an excellent understanding of the important pieces 
of that work which have had a continuing influence in 
the field of computational linguistics since the termina- 
tion of this particular project. 
Madeleine Bates, Bolt Beranek and Newman 

References 
[1] Klatt, Dennis. Review of the ARPA Speech Understanding
Project. J. Acoust. Soc. Am. 62, 1977.
[2] Lea, Wayne A. (Ed.). Trends in Speech Recognition.
Prentice-Hall, Englewood Cliffs, N.J., 1980.
