Communication in large distributed AI Systems for Natural 
Language Processing 
Jan W. Amtrup* 
University o:f \[Im:nburg, Com p. Sci. l)ept. 
Vogt-KSlln-Str. 30 
11)-22527 I \[amburg 
amtrul)~.in for:matilc u hi- hanlburg.de 
JSrg Benra* 
DFKI GmbH 
\]~rwm-Schrodinger-St;r. (Bau 57) 
\])-67663 Kaiserslautcrn 
t)enrag~d fki.uni-kl.de 
Abstract 
We. are going to describe the design 
and implementatior, of a connnuniea- 
lion system l.or large AI projects, ca- 
pable of supporting various software 
components in a heterogeneous hard- 
ware and programming-language envi- 
ronment. The system is based on a rood- 
ification of the channel approach intro- 
duced by Hoare (1978). It is a three- 
layered approach with a de facto stan- 
dard network layer (PVM), core rou- 
tines, and interfaces to live different pro- 
gramming languages together with SUl)- 
port l.or the transparent exchange of 
complex data types. A special compo- 
nent takes over: name service functiorrs. 
It also records the actual configuration 
of the modules present in the application 
and the created channels. 
We describe the integration of this com- 
munication facility in two versions of 
a speech-to-speech translation system, 
which ditfer with regard to quality and 
quantity of data. distributed within tire 
applications and with regard to the de- 
gree of interactivity involved in process- 
ing. 
1 Introduction 
(hu:rently, there is a trend of lmilding large AI- 
systems in a distrilulted, agent-oriented manner. 
'l'he complex tasks performed e.g. by systems with 
nmltimodal user interfaces or by systems tackling 
the processing of spontaneous speech often require 
more than one computer in order to run accept- 
ably last. If pure speed is not the primary moti- 
vation, the incorporation of several modules, each 
*'l'his rescm'ch was funded by the Federal Min- 
istry of l~;dncat;ion, Science., ll.esem'ch and Technol- 
ogy (IIMBF) in the framework of the VI,;HIIMOBIL 
Project raider Granl, s 01 IV 10l A/O and 01 IV 101 G. 
of them possibly being realized in a different pro- 
gramming language or even a different program- 
ruing lmradignl, demands complex interfaces be- 
tween these modules, l,'urthermore, only modular- 
ization makes it possible to develop applications 
in a truly distributed inanner without the need to 
eol)y and install versions repeatedly over. 
~l'he actual realization of tire interfaces should 
ground on a sound theoretical framework, and it; 
shoukl be as independent as possible from the 
module implementations. TypicMly, when an in- 
terface between two subcomponents of a system 
is needed, at \[irst very simple means e.g. file 
interfaces or simple pipes are considered. This 
does uot only lead to a variety of different proto- 
cols between components which is natural to 
a certain degree, due to the different tasks per- 
formed by the components and the purpose of the 
internee data but; also to a number of ditf~rent 
implementation strategies :\['or interfaces. 
In this pal)er , we present ICE, the Intarc 
Communication Environment (Amtrup, 1995), 
an implementation of a channel-oriented, multi- 
architecture, multi-language communication mod- 
ule for large, A l-systems, which is particularly use- 
ful for systems integrating speech and language 
processing. 
A channel-oriented model \['or interaction re- 
lations between software modules seemed to be 
the most suitable system for our needs. We 
adopted the CSP-approach (Horn:e, 11978) and 
its actual realization in the transputer hardware 
((~rahatn and King, 1990). This core flmctional 
model was slightly modified to satisfy the needs 
emerging from ext)eriences with actual systems. 
We decided not to implement all communica- 
tion flmctions from scratch, \[)tit instead we use 
PVM, the Parallel Virtual Machine (Geist et al., 
1994), a widespread process-comnmnication soft- 
ware, which turned out to be extremely reliable. 
We will desribe how the communication sys- 
tem has been integrated within Verbmobil, a large 
research project tbr automatic speech-to-speech 
translation (Wahlster, 1993). ICE is used for the 
w~rious I)rototypes of the interpretation system. 
35 
We describe experiences and results of the work 
on the first demonstrator. Furthermore, we show 
that ICE is flexible enough to be used in archi- 
tectural experiments and we are going to report 
some of the experiences made with them. 
2 Application acrchitecture 
Verbmobil, the primary application for which ICE 
was built, aims at developing an automatic inter- 
preting device tbr a special type of negotiation be- 
tween business people. The dialogue situation is 
as follows: Two business persons, speaking difl'er- 
eat languages, are involved in a face-to-face dia- 
logue trying to schedule an appointment. 'l.'hey 
both have at least; some knowledge of English and 
use English as a common hmguage. In case one 
of the dialogue partners runs into problems, he or 
she activates the interpretation system by pressing 
a button and switches back to his or her mother- 
tongue. The system interprets the respective ut- 
terances into English. Therefore, it interprets the 
dialogue on demand in certain situations. 
The Verbmobil system consists of a large num- 
ber of components, each of them designed to cope 
with specific aspects of the interpretation pro- 
cess. Among them are a recorder for speech sig- 
nMs, a HMM-based word recognizer, modules for 
prosodic, syntactic and semantic analysis, dia- 
logue processing, semantic evaluation as well as 
components for both german and english synthe- 
sis. There are several interfaces between the in- 
dividual parts of the application which are used 
to forward results or to realize question-answering 
behavior. 
The interchanged data between components (a 
component normMly corresponds to a unique soft- 
ware module) is very heterogeneous with regard 
to both type and quantity: Speech information as 
it is sent from the recorder to the speech recog- 
nizer consists of a stream of short integer values 
which may amount to several megabytes. The ob- 
jects exchanged between semantics construction 
and transfer are relatively small, but highly struc- 
tured: Semantic representations with several em- 
bedded layers. 
3 ICE: Design and structrue 
As briefly noted above, we are using a chan- 
nel abstraction to model communication between 
components. The model is largely oriented at 
the approach of CSP (Communicating Sequential 
Processes, iIoare (1978)), mainly for two reasons: 
• We decided to use a message-passing ap- 
proach to communication. The two other 
kinds of process communication largely avail- 
able, namely shared memory and remote 
procedure calls are disadvantegous for our 
purposes: The employment of shared mem- 
ory may lead to memory or bus contention 
when several processors are sinmltaneously 
attached to the same physical memory seg- 
ment. l?urthermore, multiple concurrent 
write attempts have to be synchronized. Re- 
mote procedure cMls did not seem to be the 
right choice either since their use implies 
a rendez-vous-synchronization which slows 
down a system due to network latencies 1. 
• Making the objects involved in communica- 
tion explicit, offers several ways to manipu- 
late them. Without too much effort, we were 
able to introduce split channels in order to in- 
corporate visualization tools or introdnce dif- 
ferent modes of communication depending on 
the type of data to be exchanged. 
The low level basis of ICE is realized by PVM 
(Geist et al., 1994), which is a message passing 
system for multiple hardware architectures. It 
has been developed and extended for almost seven 
years now and is very reliable. It allows a net of 
Unix workstations to behave like a single large 
parallel computer. PVM supplies each message 
with a tag which simplified the introduction of 
channels to a large extent (roughly, a message is 
tagged uniquely to identify the channel it is sent 
on. This enables a receiving component to select 
messages Oll individual channels). 
3.1 System structure 
The architecture of a system using ICE as commu- 
nication framework is depicted in Pig. 1. Before 
describing in detail the structure of a component, 
we will point out the overall layout of an applica- 
tion. 
We assume that an application consists of a 
number of components. We could have adopted 
the notion of agents cooperating to a certain de- 
gree while carrying out a certain task coopera- 
tively, but this would have meant to mix up dif- 
ferent conceptual levels of a system: The com- 
munication facilities we are describing here estab- 
lish the means by which pieces of sottware may 
communicate with each other. They do not pre- 
scribe the engineering approaches used to imple- 
ment the individual software components them- 
selves. We do not state that agent architectures 
1 rph e channels of CSP and Occam both use rendez- 
vous-synchronization. In this respect we deviated 
fi'om the original model. 
36 
Figure \]: Principle component layout 
(e.g. Col-,en et al. (:t9!)4)) can not be realized with 
our :mechanis:nl 2, but the range of cases where ICI,', 
can 1)e applied is broader than this. 
All communication is clone by the trieatls of 
channels, as set out above. We (listinguish two 
types of channels: 
• Base channels ~re the primary \['acilities of 
communication. They are configured in a 
way guaranteeing that each comlxment is 
able to interm:t with each other component 
it wishes to, regardless of programming hm- 
guages, hardware architectures, or system 
software being used. This is achiewxl 1)y 
using the standard communication mode of 
PVM, which supports XI)I{ a. Message pass- 
ing is done asynchronously. 
• Additional channels were added in order to 
satis\[~y some needs that frequently arise dur- 
ing the design ~md implementation of b~rge 
A l-systems with heavy use of communication. 
They can 1)e used to separate data st,'cants 
Dora control messages or may 1)c configured 
it, various ways, e.g. by switching off {;he X1)t{ 
encoding to speed up message passing. 
3.2 Split ('hannels 
Both types of channels can be configured in an ~d- 
ditional way. Beyond being bidirectionM commu- 
nication devices between two components, other 
2In(Iced, distributed 1)lackboards as used in 
(Johen et al. (\]994) can easily be modelled using a 
(:hanncl-bascd al)proach. 
aeXternal 
1)ata Representation, see Corbin (1990), an encoding 
schema for data objects independent of the current 
programming environment. 
modules can be attached to listen to data trans- 
ported on a channel or to inject messages. These 
split channels are achieved by dividing ~L channel 
into two endpoints, one at each side of the chan- 
nel. 
Iloth ends are described using a conlignr~tion 
lile that is read by the ILS (see below) upon 
startup. In l;his file, \[br each endpoint a list of 
real chaimels is defined, e~mh of which points to a 
compolmnt and is equipped with a name, (:onfigu- 
ration flags and its purpose (whieh can be sending 
or receiving). Any uumber of' real channels may 
be marked sendiug or receiving. The behavior of 
l;he components allotted by split chammls does not 
have to be changed, since splitting occurs trans- 
I)arently for them. 
Consider Fig. 2 as an exa.mi)\[e for what purpose 
split channels were used. 
Compo l~-~ Compo 
A - B 
\[u, j 
Figure 2: Split channel contiguration 
Two components, A and B, are connected us- 
ing a channel which is depicted by a dashed line. 
The channel endpoints are split up to allow visual- 
ization of message data sent by either component. 
The visualization is performed by two additional 
components la.belled UI_A and UI_B. lqn:ther- 
more, the data sent by component A must un- 
dergo some moditication while being transported 
to cOlnl)onellt l~. Thus, another component C is 
contigurcd capable of transforming the data. It 
is spliced into the (h~ta path between A and B. 
Note that data sent by component B arrives at A 
unaffeeted from modification by component C. 
a.a ILS: Ilfforlnation Service 
Channels can be established by any component. 
There is no need for synchronization between 
conlponents during the configuration of the con> 
munication system. To support this schema, a 
dedicated component named ILS (Intarc License 
,%rver) was introduced, lilt stores information 
about the actuM structure of the applieation sys- 
tem. This information includes names and loca- 
tions of all components participating in the sys- 
37 
tern as well as an overview about all channels cur- 
rently established between components. The ac- 
tions performed by the ILS include: 
• Attachment and Detachment of components. 
A component desiring to take part in the 
communication activities of the application 
has to identify and register itself at the ILS. 
This is done by sending a message containing 
the name of the component to the ILS. Analo- 
gously, a component should detach itself from 
the ILS by sending an appropriate message 
before leaving the application. In case of a 
program failure resulting in the inability of 
a component to detach the 1LS is capable of 
handling the detachment autonomously. 
• Configuration of channels. Each creation and 
destruction of a channel is done by interact- 
ing with the ILS in order to notify the ILS 
of the request and to get back information 
about the necessary data structures. The 
creation of a channel is done in two phases: 
First, any of the endpoint components sends a 
channel creation request to the ILS. The ILS 
updates its internal configuration map taking 
care that split channel definitions are taken 
into account; it then answers to the request- 
ing component the individual tag used for 
this channel and the process identity of the 
target component 4. If the target component 
has not, yet registered within the application, 
this fact is acknowledged to the source com- 
ponent. The only point at which this matters 
is the time of the first message sending at- 
tempt which will be blocked until the target 
component registers at the ILS. In that case, 
the ILS notifies the source component of the 
event and communication c~n take place nor- 
really. 
The second phase handles the notification of 
the target component. As just described, this 
component need not be present by the time 
of the channel creation request. In this case 
the notification is simply delayed. The no- 
tification consists of the necessary data to 
create the intended channel within the com- 
ponent. The implementor need not track 
those configuration messages, the communi- 
cation layer handles this transparently. Fur- 
4pVM addresses components which are identi- 
cal to processes for it - by a task id that is assigned 
by the pwn daemon. The ILS maintains a mapping 
fl'om compolmnt names to those task ids. This map- 
ping need not be bijective, since we allow multiple 
componen6s within one process (see below). 
thermore, concurring channel requests do not 
inteffer. 
3.4 Component structure 
The interior structure of a component (see Fig. 1) 
is layered as far as the communication parts of the 
software are concerned. The low level communi- 
cation routines are provided by PVM (see above). 
Next, a software layer defines the functions of ICE. 
This is comprised of the basic functionality of ICE 
itself and a set of interface functions for different 
programming languages. We currently support C, 
C++, Lisp (Allegro Common Lisp, Lucid Com- 
mon Lisp and CMSP), Prolog (Quintus Prolog 
and Sicstus Prolog) and Tcl/Tk. 
These software layers suffice to communicate 
basic data types like nmnbers and strings. Addi- 
tionally, a separate layer (IDL) is present to allow 
the exchange of more compex data types. One 
may specify routines to encode and decode user- 
defined data types which can then be transmitted 
just as the predefined scalar types. At the lno- 
ment, this schema is used for a few dedicated data 
structures, e.g. for speech data or arbitrary prolog 
terms, which may be even cyclic. 
4 Experiences with the application 
Verbmobil is built up by two sorts of components. 
The "(:ore" components are used to transform the 
input data into the output data (e.g. recording, 
speech recognizer etc.). These Nl,P-<'omponents 
are embedded in the so called "testbed" that 
serves as an application Damework. The testbed 
is designed as an experimental enviromnent that 
provides all the features required to test the core 
components and to study the operation of the 
whole application. '\]?he testbed consists mainly 
of the following parts: 
• The graphical user interface (GUI) provides a 
comlbrtable Dontcnd to the application. Us- 
ing the GUI the user can watch the operation 
of the whole system, control the behavior of 
the components and monitor the datafiow be- 
tween the components. 
• The testbed manager (TBM) is used to start 
up the whole application and to distribute the 
processes of the application to the hosts of 
the network. Further, the testbed manager 
collects data about the operation of the com- 
ponents and visnalizes this intbrmation using 
the GUI. 
® The visualisation manager (VIM) collects all 
the data transferred between any of the com- 
ponents using IC'E channels. 
33 
If one wants to study only some parts of the 
system, it is t)ossib\]e to start the al)l)li(;ation con- 
taining only a subset of the existing components 
(e.g. only the speech recording module aim some 
speech recognizers). The testbed provides the fa.- 
cility to choose in an oflline process the compo- 
nents that are desired to I)e executed. This config- 
uration is done by simply editing a coMiguration 
file and selecting the keywords "yes" or "no" for 
each cornl)onent. All the comf)onents not selected 
are automatically replaced by "stut)-modules," so 
there is no need to change source code and re- 
compile the components, ew:n if data is sent 1,o a 
nou-existent component. On the other side it; is 
possible to configure the usage of alternative com- 
ponents (e.g. two gerlnan speech recognizers). In 
this case l)oth eoml)onents are started and we are 
at)It to select fl'om the GUI which of both (:onq)o- 
Ilents we a(:tllally wallt to \]lSe. 
(htrrent\]y there are 32 existing eon~l)otmnts that 
contribute to roughly 650 Mill disk space (the ex- 
ecutables, libraries and data liles required at rlln- 
time use up 380 MB). Some of the components ~u:e 
structured using sul)eomi)onents that are iml)le 
mented in different programnfing languages and 
are executed in own l)rocesses. The ',{2 main con> 
ponents are implemented using the following l)ro 
tramming languages: C (10 components), l,isl) 
(r (:o.u)ouents), l'r(,log (S ,'onU,onents), (:++ 
,:on u)onents), t,'ortra,, ¢:o, ,pouents), 'r,:F'rk (J 
(:o111 l)Onellt). 
Starting a heavy weight system containing all 
the currently existing eoml)onents, we get al)out 
95 UNIX l)roeesses requiring 520 M\]l memory. In 
this configuration we are using 52 I)ase channels 
and 24 additional channels (76 ICI'; channels in 
total). Six of these 24 additional (:hannels are con- 
figured not to use the XI)R coding, 1)eeause they 
are used to transfer high volume data (e.g. audk) 
data). 
Because the communication is built u 1) I)y 
strictly using the featm:es of ICE and the under- 
h~ying PVM, the apl)licatiott cnn run on ~ single 
host ;~s well as distributed to the hosts of a.a local 
area network. The decision which cOlnl)onent will 
rtm on which host of the network is conligurable. 
Each coml)onent can I)e assigned to a sl)ecilie host, 
or wc can leave the assignment of an adequate host 
to PVM. 
5 Experiences with an 
architectural experiment 
la addition to the employment wil;hin the Verl)mo- 
I)il l)rototype, we used l(Jl,', as con,,,mm(:ation &'- 
vice ('or some eXl)erjt~mnts i l, the \['ra.ntewor\]¢ o1" the 
I ~emantlc Evaluation) 
¢ 
/ =o,.: t 
Beam Decoder ~ 
l,'igttr('. 3: 'l'he experimental system architectm:e. 
archit¢'ctm'M branch of the project. The apl)roac\]l 
here is to develop a. speech translation system 
obeying design principles that have their orighl in 
the goal of constructing a system retlecting some 
of the assumed properties of human speech pro- 
eessing, namely working incrementally fi'om left 
to right and exploring the ell'ects of il~teraetion 
between dilDrent levels of speech recognition and 
understanding. These two principles have serious 
implications for the design of individual tempo 
uents and the (:on-,ph;te system. To give a con- 
crete exmnple, consider the interface between a 
speech recognizer and a synt~mtic parser. The re<:- 
ognizer In'educes a eonllected graph where each 
edge denotes a word hypothesis. Due to the in- 
ability to remove paths in adwmee that can not be 
pursued fln:ther at a late\]: stage of operation, the 
input to the syntactic parser grows enormously. 
We noticed that wordgri~phs produced inerenmn 
tMly laity \])e tell tiIlles larger than conw'.ntionally 
constructed gr~q)hs (resulting in over 2000 word 
hypol, heses for an utterance of 4.7 seconds). 
The exlmrinmntM system architecture is shown 
in Fig. 3. It, consists off several modules inter- 
connected by ,t lnain da.tlt path that delivers re- 
suits according to the "standard" linguistic hie> 
archy, viz. from word recognition to syntax, se- 
uumtics and fitmtly transfer !;. Besides t, his inain 
stream data path we set t\] l) several interaction 
facilities that ~u'e used to propagate int'ornmtiou 
r>J'ltc (;r;ttlsf(w (:ompontm{, i,';t not, shown in \]"it,+ 3. 
39 
backwards, which may consist of binary judge- 
meats about the applicability of a hypothesis, a 
ranking among different possible analyses or even 
predictions about what might be expected in the 
future. 
These methods were for example examined at 
the crucial interface between a HMM-based speech 
recognition device and a syntactic parser (ttauen- 
stein and Weber, 1994). A tight interaction be- 
tween these two components was created which 
was used to model a synchronization point at ev- 
ery frame in the speech input (i.e. every 10 ms). 
At each of these points a set of word hypotheses 
is sent to the parser. The parser then tries to in- 
tegrate the new hypotheses into existing partial 
analyses constructed so far. The feedback loop 
to the speech recognizer consists of information 
about the syntactic ranking of the parse each word 
is integrated into. If a word can not be used in 
any way, it is simply rejected. In the case of in- 
tegration of a word into a parse a ranking is pro- 
duced which incorporates values from a statistical 
n-gram language model and a stochastic unifica- 
|;ion grammar which models the probability of a 
syntactic derivation. 
To realize a prediction mode in this interaction, 
a different schema was used: At each frame the 
parser computes a set of possible continuations 
for each word, i.e. it restricts the language model 
to pairs of words (in case of a bigram model) 
which are syntacticallly plausible and could be in- 
tegrated into a currently existing syntactic deriva- 
tion. By doing so, the search space of the speech 
recognizer is restricted. 
6 Conclusion 
We have presented the concepts and implementa- 
tion of a communication system designed for use 
in large AI systems which nowadays are typically 
built to operate in a distributed manner within lo- 
cal networks of workstations. We argued that the 
adaptation of sound theoretical concepts which 
for example can be found in Hoare (1978) lead 
to solutions that have considerably more power 
that ad-hoc communication devices implemented 
as the need to communicate arises. 'rim channel 
model was slightly modified and realized on top of 
l?VM, a de facto standard for communication in 
distributed systems. The system structure reflects 
a set of components that communicate bilaterally 
without the involvement of a central mechanism 
or data structure that participates in every com- 
rnunication event. Instead, once the identity of 
the communication partners is established, corn- 
munication between them is strictly local. 
We introduced a central name server in order to 
store the components acting in an application and 
to be able to service requests for the creation of 
channels and such. Channels come in two flavors 
what on the one hand guarantees succesful com- 
rnunication between any two partners and on the 
other hand leaves room for tailoring properties of 
message channels to certain preferences. Further- 
more, split channels allow for the easy configura- 
tion of a system with respect to interchangeable 
parts of the system and attached visualization. 
We showed that the communication system 
realized using this methods is advantegeous in 
several situations and system contexts, ranging 
fi'om strictly sequential systems over intermediary 
forms to highly interactive systems. 
References 
Jan W. Amtrup. 1995. 1CI~-Intarc Communica- 
tion Environment: User's Guide and Reference 
Manual. Version 1.4. Verbmobil Technical l)oc- 
ument 14, Univ. of Hamburg, December. 
P.R. Cohen, A. Cheyer, M. Wang, and S.C. Baeg. 
1994. An open agent architecture. In Proc. of 
AAA1-g4, pages 1 8, Stanford, CA. 
John tl.. Corbin. 1990. The Art of Distributed 
Applications. Sun Technical Reference l,ibrary. 
Springer-Verlag, New York. 
AI Geist, Adarn Beguelin, Jack Dongarra, We- 
icheng Jiang, Robert Manchek, and Vaidy Sun- 
detain. 1994. PVM3 User's Guide and Ref 
erence Manual. Technical Report ORNL/TM- 
12187, Oak Ridge National Laboratory, Oak 
Ridge, Te., May. 
lan Graham and Tim King. 1990. The Transputer 
Handbook. Prentice tlall, New York, London et 
al. 
Andreas Hauenstein and Ilans Weber. 1994. An 
Investigation of Tightly Coupled Speech Lan- 
guage Interfaces Using an Unification Gram- 
mar. 111 Proceedinqs of the Workshop on lu~ 
tegration of Natural Language and Speech Pro- 
ccssing at AAA I '94, pages 42- 50, Seattle, WA. 
Charles A. Richard Hoar< 1978. Communicating 
Sequential Processes. Communications of the 
ACM, 21(8):666-677, August. 
Wolfgang Wahlster. 1993. Translation of face-to- 
face-diMogs. In Proc. MT Summit IV, pages 
127--135, Kobe, Japan. 
40 
