NATURAL LANGUAGE INTERACTION WITH MACHINES:
A PASSING FAD?
OR
THE WAY OF THE FUTURE?
A. Michael Noll 
American Telephone and Telegraph Company 
Basking Ridge, New Jersey 07920 
People communicate primarily by two media: acoustic
-- the spoken word; and visual -- the written word.
It is therefore natural that people would expect
their communications with machines to likewise use
these two modes.
To a considerable extent, speech is probably the most 
natural of the natural-language modes. Hence, a
fascination exists with machines that respond to
spoken commands with synthetic speech responses to 
create a natural-language interactive discourse. 
However, although vast amounts of research and 
development effort have been expended in the search 
for systems that understand human speech and respond 
with synthetic speech, the goal of the perfect system 
remains as elusive as ever. Systems for producing
natural-sounding speech for large vocabularies with
unrestricted grammatical structures and for
recognizing spoken speech for large vocabularies with
unlimited grammatical structures and any number of
talkers are still beyond the state of linguistics and
computer science and technology.
Given the problems in the speech domain, it is not
surprising that most interactions between people and
machines are in the visual mode, frequently using
alphanumeric keyboards as input and textual display 
as output. Such visual terminals are already in 
fairly widespread use in industry and are used for a 
variety of applications including computer 
programming, text editing, and data-base access.
The telephone allows speech telecommunications over
distance between people. Future visual terminals for
the home and businesses will allow textual
telecommunications between people. These visual
terminals could also be used to telecommunicate with
machines in a way that is presently difficult using
the telephone and speech.
Viewdata, or videotex, systems are promised soon for
the home and will allow data-base access and
transactions with machines and textual messages
between people. Some viewdata systems use elaborate
tree searches to reach the desired frame of
information. Some people believe that tree searches
will be "unnatural" for many users and some other
more-natural language will be needed to search and
access these data-base systems.
One conclusion is that the future will see more
choices in mode for telecommunications between people
and with machines. The choice of which alternate
mode will probably be dependent upon the specific
application. For example, textual messages might be 
both easier to enter by keyboard and to read on a CRT
screen than speaking to a recording machine and
listening to a recorded message. Social chatting,
however, might be best over the telephone, whereas
arranging a date with a stranger might be less
revealing if done in the textual mode. Considerable
opportunities exist for basic research to explore the 
suitability of these alternate modes for different 
communications applications.
The fascination of technologists with speech-synthesis
chips is about to result in a variety of stand-alone
appliances that speak. Ovens that state when the
roast is done, washing machines that call for the
addition of fabric softeners, automobiles that inform
the driver that the door is open, and many other
applications will soon abound in the marketplace. In
most of these applications, synthetic speech will
substitute for a lamp or other form of visual
display. The environment will be polluted with the
noise of buzzy synthetic speech. Many of these
applications will undoubtedly be little more than
passing fads.
But in some circumstances synthetic speech will
become the way of the future. One example would be
synthetic-speech announcements of floors in an
elevator, thereby eliminating crooked necks.
Most of the preceding examples are very restricted in 
terms of the language used for the interaction with 
machines. The problem with unrestricted natural
language for communication with machines is that no
automatic way has yet been discovered to extract
meaning in either the speech or textual mode. The
textual mode does eliminate the need for acoustic
analysis and hence has been more extensively used in
most systems for restricted, specialized
applications. However, even if either mode were equally
near perfect, questions would still arise about user 
preference for one mode over the other. 
Thus, in the end the future will be decided by the 
votes of consumers in the marketplace as they choose 
from the many options presented by technology. The 
shrewd entrepreneur will use consumer preference and
needs to help illuminate in advance the desires and
needs of the marketplace. Basic research in
linguistics, human behaviour, natural language, and
other ancillary fields will have an important role in
developing solutions and in understanding people's 
needs and behaviour. 

