PANEL
Parallel Processing in Computational Linguistics
Helmut Schnelle 
Ruhr-Universität Bochum
Sprachwissenschaftliches Institut 
Postfach 102148, D-4630 Bochum 1 
Panelists: 
Garry COTTRELL 
University of California, Dept. of 
Computer Science, San Diego, 
Mail Code C-O14, La Jolla, CA 92093 
U.S.A. 
Paradip DEY 
The University of Alabama at Birmingham 
Dept. of Computer & Information Science 
Birmingham, AL 35294 
U.S.A. 
Peter A. REICH 
Dept. of Linguistics, University of 
Toronto, Toronto, Ontario M5S 1A1
CANADA 
Lokendra SHASTRI 
University of Pennsylvania 
School of Engineering and Applied 
Science, 200 South 33rd Street 
Philadelphia, PA 19104-6389 
U.S.A. 
Joachim DIEDERICH 
International Computer Science Institute 
1947 Center Street, Berkeley, CA 94704 
U.S.A. 
Akinori YONEZAWA 
Tokyo Institute of Technology 
Dept. of Information Science 
Ookayama, Meguro-ku, Tokyo 152
JAPAN 
Introduction 
The topic to be discussed by the panel
is new and at present very much under debate.
Parallelism is developed in a large variety
of approaches. The panel will make an attempt
to clarify the underlying concepts, the
differences of approach, the perspectives and
general tendencies, and the difficulties to
be expected. Some differences of approach
will be illustrated with examples from the
work of the panelists.
The common context of our approaches is
the following: Standard computational
linguistics tries to solve its problems by
programming a von Neumann computer. The
execution of the programs is inherently
sequential. This is implied by the fact that
there is only one central processing unit
(CPU) executing the program. In contrast to
this, parallel processing defines the
solution of problems in terms of sets of
computational units which operate
concurrently and interactively, unless
sequentialized for simulation purposes.
Various approaches to parallelism differ 
in the computational power they assume for 
the concurrently active units. The differen- 
ces may be outlined as follows: 
Massively parallel systems are usually
systems whose units are, intuitively
speaking, purely reactive units, i.e.
mathematically defined by a specific function
relating the state and output of a unit to its
inputs. They could also be called
connectionist systems in the wide sense;
connectionist systems in the narrow sense are
those whose functions are based on weighted
sums of input activities.
In contrast to these systems, the units may 
be themselves complicated systems which com- 
pute their states and outputs depending on 
the messages and control signals which they 
receive. The units cooperate in solving the 
problem. In typical cases, each unit may be a 
central processor unit or even a complete 
computer. Systems with cooperative processors 
(computing "agents") are usually considered 
to be non-massively parallel. 
These distinctions suggest different
metaphors used in informal talk about the
systems: the neural net metaphor on the one
hand and the society of minds (demons)
metaphor on the other.
Given this context the panelists have 
answered the following questions: 
I. How is the dynamics of your system
defined?
- I.(A) 1. What is the computational power
of a single unit in your approach?
- I.(A) 2. How is the interaction or the
interdependency between concurrent
units defined?
- I.(B) How do you implement your system?
- I.(C) Which methods are used for
programming?
II. What is the representational status of
your system?
- II.(A) Which parts of grammar or dictionary
do you model with your system?
- II.(B) Which parts of grammar or dictionary
do you model by a concurrent unit of
your system?
- II.(C) Is there a general method such that
a grammar determines uniquely a
parallel implementation or is this
implementation an art?
The answers given seem to be particularly
appropriate as an introduction to the topic
and will thus be presented in the subsequent
passages. (The answers of the different
panelists and the organizer are prefixed by
their initials.)
I. How is the dynamics of your system
defined?
I.(A) 1. What is the computational power
of a single unit in your approach (a Boolean
function, a specific numerical function, a
mapping of vectors, a mapping of strings or
files, a mapping of trees or configurations
of other types, or the power of a CPU or a
complete computer)?
G.C.: There are no formal limitations on the
power of a unit in my system; the power is a
matter of taste, and is expected to be
restricted to simple functions, for example a
numerical approximation to Boolean functions
of the inputs, where the inputs are further
broken down into functions of input sites. My
implemented system has several hundred units.
P.D.: Each unit has the power of a
VAX-11/750. The units share their memories.
I'm thus currently working in a shared memory
multiprocessing environment. Specifically, my
algorithms run on a 30 processor (= unit)
Sequent Balance 21000 machine. This is large
grain parallelism. I prefer the environment
of large grained shared memory
multiprocessors, because they are the most
popular general purpose parallel computers
available today. Earlier, I developed some
algorithms for a medium grained tree machine,
namely the DADO parallel machine.
J.D.: Each unit implements a simple numerical 
function, sometimes a simple combination of 
several functions computing input from seve- 
ral sites of incoming activation. 
P.A.R.: Each unit has the power of a finite
state device (under 32 different states).
There are 16 different types of units
which differ in their finite state
definition. They implement (sometimes only
slightly) different logical functions over
their input activations. The more important
ones are: concatenation (logical followed
by), conjunction (logical and), disjunction
(exclusive or), precedence disjunction (if
both possibilities are realizable, one takes
precedence over the other), random
disjunction (pick a choice at random),
interjunction (inclusive or), intercatenation
(inclusive or; if both: concatenation), zero
(network dead ends producing nothing), bottom
edge (network outputs something), top edge,
feedback barrier. Units operate independently
of one another and asynchronously.
H.S.: There are two descriptive levels: large
grained and small grained. On the former each
unit is a (special purpose) Turing machine
(not a universal one). On the small grained
level, each unit implements either a simple
Boolean function or a simple numerical
(additive, fixed length) function. The large
grained net is a partitioning of the small
grained net; its Turing machines are similar
to von Neumann's growing cellular automata.
L.S.: Each unit implements a numerical
function; the most complicated ones have the
form: if input a > 0, then take the product
of inputs b_1, b_2, ..., b_n, else take the
product of inputs c_1, c_2, ..., c_m.
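The conditional product rule in this answer can be sketched in a few lines. This is only an illustration of the stated if/then/else form, not L.S.'s implementation; the function name `gated_product_unit`, its signature, and the example values are assumptions made for the sketch.

```python
from math import prod


def gated_product_unit(a, b_inputs, c_inputs):
    """Return the product of b_inputs if a > 0, else the product of c_inputs.

    Hypothetical helper: only the if/then/else rule comes from the text;
    the name and the signature are assumptions.
    """
    return prod(b_inputs) if a > 0 else prod(c_inputs)


# Gating input positive: the b-inputs are multiplied.
open_gate = gated_product_unit(1.0, [0.5, 0.5], [0.9, 0.1])
# Gating input not positive: the c-inputs are multiplied.
closed_gate = gated_product_unit(0.0, [0.5, 0.5], [2.0, 3.0])
```

The gating input selects which group of inputs contributes, so a single unit of this kind behaves like a small numerical switch.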
A.Y.: Each unit is a single CPU with memory. 
My approach involves thousands of units. 
I.(A) 2. How is the interaction or the
interdependency between concurrent units
defined? Is it strictly connectionist and thus
also defined by a function? Or is it
cooperative and thus defined by the messages
sent, encoded and decoded by the units? Is
there a distinction between data messages and
control-signal messages or is it a data-flow
system?
G.C.: Units pass values in a strictly connec- 
tionist way. 
P.D.: The system is a shared memory system.
All units have in principle access to the
same information. Actual interaction is
defined by shared variables. That is,
processes communicate with each other through
shared variables.
J.D.: The system is strictly connectionist, 
i.e. there are no symbolic messages. Each 
unit computes the weighted sum of inputs. 
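A unit of the strictly connectionist kind described in this answer can be sketched as below. The function name `weighted_sum_unit` and the example weights are assumptions for illustration; the text only states that each unit computes a weighted sum of its inputs.

```python
def weighted_sum_unit(inputs, weights):
    """Activation of a unit as the weighted sum of its input activities."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


# Three incoming links: one excitatory, one inactive, one inhibitory.
activation = weighted_sum_unit([1.0, 0.0, 1.0], [0.5, 2.0, -0.25])
```

No symbolic message is exchanged here: the only thing a unit receives from its neighbours is a number per link.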
P.A.R.: Each unit is connected to at most
three other units in the network. The
connections are active or not. According to
their function, three different signals may be
distinguished: production signal and positive
feedback, negative feedback, and
anticipatory.
H.S.: On the small grained level the system
is connectionist, but not strictly, since not
only weighted sums of inputs are allowed but
also other simple functions.
L.S.: The system is strictly connectionist. 
A.Y.: The interaction between units involves 
message passing. Messages carry either con- 
trol information or data or both. 
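The message-passing style of interaction described by A.Y. can be illustrated with a small sketch using threads and queues from the Python standard library. It is not ABCL code and makes no claim about A.Y.'s system; the `Message` class, the "stop" control value, and the `doubler` unit are all hypothetical names chosen for the example.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    control: Optional[str] = None  # hypothetical control vocabulary, e.g. "stop"
    data: Optional[int] = None     # payload; a message may carry both fields


def doubler(inbox, outbox):
    """A unit that doubles each data item and forwards it until told to stop."""
    while True:
        msg = inbox.get()
        if msg.data is not None:
            outbox.put(Message(data=msg.data * 2))
        if msg.control == "stop":
            outbox.put(Message(control="stop"))
            break


def run_demo():
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=doubler, args=(inbox, outbox))
    worker.start()
    inbox.put(Message(data=1))                  # data only
    inbox.put(Message(data=2))                  # data only
    inbox.put(Message(control="stop", data=3))  # control and data together
    worker.join()
    results = []
    while True:
        msg = outbox.get()
        if msg.data is not None:
            results.append(msg.data)
        if msg.control == "stop":
            break
    return results
```

Calling `run_demo()` collects the doubled values in the order in which the unit produced them. In contrast to a shared-variable design, the two sides never touch common state; everything they know about each other arrives in a message.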
I.(B) How do you implement your system?
By simulation on a von Neumann computer or by
programming on a universal parallel machine
(like the Connection Machine) or by designing
hardware (e.g. a special-purpose information
processing network)? If the first, do you
plan to implement it eventually by a parallel
system?
G.C.: The system is simulated on a VAX. 
P.D.: The system is being implemented on a
30-processor Sequent Balance 21000 machine. It
is currently being implemented in parallel-C
running under Unix. When a parallel LISP
becomes available, it will be implemented in
parallel LISP.
J.D.: We use the Rochester Connectionist
Simulator on a SUN-3 with Graphics Interface.
Implementations on a Sequent (Parallel Unix
Machine) are planned.
P.A.R.: Simulation on a personal computer
using a standard programming language.
H.S.: The connectionist net is defined on a
spread-sheet such as LOTUS 1-2-3. Some cells
of the spread-sheet are identified with the
units of the net to be programmed. In each of
these cells a formula for a function is
entered; it determines the reactivity of this
cell to the states of those neighbouring
cells whose addresses are arguments of the
function. Thus, the addresses in the formulas
on the spread-sheet implement the
connectivities between the formulas. We run
the spread-sheet in the computation mode
"iterative, columnwise", which defines the
sequential simulation. By definition the
different cells of the spread-sheet could
operate concurrently in each iterative step;
their operation is sequentialized (and thus
adapted to the simulation on a PC) only
through columnwise computation.
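The spread-sheet idea in H.S.'s answer can be mimicked in a few lines: cells hold states, some cells hold a formula whose arguments are the addresses of other cells, and one iterative step recomputes the formula cells column by column. The sketch below is an assumption-laden illustration, not the LOTUS 1-2-3 setup itself; the function name `recalc_columnwise`, the dictionary layout, and the toy net are invented for the example.

```python
def recalc_columnwise(values, formulas, n_rows, n_cols):
    """One iterative step: recompute formula cells column by column, in place.

    values:   dict mapping (row, col) -> current state of the cell
    formulas: dict mapping (row, col) -> (function, [argument addresses])
    """
    for col in range(n_cols):
        for row in range(n_rows):
            if (row, col) in formulas:
                func, args = formulas[(row, col)]
                values[(row, col)] = func(*(values[a] for a in args))
    return values


# A toy 2x3 net: column 0 holds inputs, column 1 an AND and an OR unit,
# column 2 a unit that is active if exactly one of them is active.
values = {(r, c): 0 for r in range(2) for c in range(3)}
values[(0, 0)], values[(1, 0)] = 1, 0
formulas = {
    (0, 1): (lambda x, y: int(x and y), [(0, 0), (1, 0)]),
    (1, 1): (lambda x, y: int(x or y), [(0, 0), (1, 0)]),
    (0, 2): (lambda p, q: int(p != q), [(0, 1), (1, 1)]),
}
recalc_columnwise(values, formulas, n_rows=2, n_cols=3)
```

Because the update is done in place and column by column, a cell in a later column already sees the new states of earlier columns within the same step. A truly concurrent step would instead read only the states of the previous iteration; that difference is exactly the sequentialization the answer mentions.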
L.S.: By simulation on a von Neumann computer.
A.Y.: By simulation on a von Neumann computer,
and also on parallel computers.
I.(C) Which methods are used for
programming? Parallelizing of existing
non-parallel programs or independent
programming? Methods of hardware design?
G.C.: A network is constructed from a
high-level specification such as a grammar.
This is given to a network construction
routine that specifies the model based on the
grammar.
P.D.: The computational model is MIMD
(multiple instruction multiple data stream).
Parallel programs are developed primarily by
data partitioning, although function
partitioning is also used.
J.D.: Independent programming. Networks are
constructed by writing a C program and using
library functions of the simulator.
P.A.R.: The system is programmed by
constructing the grammar in network form.
There is an algorithm for representing the
network in terms of algebraic formulas. Nodes
are defined by a series of state transition
rules. The grammar is tested by inserting
initial input signals and running the
simulation.
H.S.: There is a compiler which automatically
produces, for any given CF-grammar, a
corresponding network. The processes on the
network correspond to the processes defined by
an Earley chart parser but, in contrast to
the latter, all processes are executed
concurrently whenever this is possible. In
particular, all parsing paths are followed up
in parallel. Hardware design of networks is
planned.
L.S.: A "compiler" is provided that
translates a high level specification of a
conceptual structure (semantic network) into a
connectionist network. It is proved that the
network generated by the compiler solves an
interesting class of inheritance and
recognition problems extremely fast - in time
proportional to the depth of the conceptual
hierarchy.
A.Y.: We designed an object-oriented
concurrent language called ABCL/1 and program
parsers in this language.
I.(D) Is your system fixed or does it
learn? If the latter, which learning
functions or learning algorithms are used?
J.D.: Learning is the most important topic.
Natural language descriptions of structured
objects are learned. These objects are also
present in a restricted visual environment.
The interaction between language and vision
in learning is investigated. Various forms of
weight changes are used: Hebbian learning
with slow weight change, fast weight change
for temporary binding, modified Hebbian
learning with restriction on the increase of
weights.
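The kinds of weight change listed in this answer can be illustrated with a generic Hebbian rule. The sketch is not J.D.'s learning rule: the function name `hebbian_update`, the learning rates, and in particular the hard upper bound `w_max` (used here as one possible reading of "restriction on the increase of weights") are assumptions.

```python
def hebbian_update(weight, pre, post, rate=0.1, w_max=1.0):
    """Increase a weight in proportion to co-activation, capped at w_max.

    The cap w_max is an assumed form of the restriction on weight increase.
    """
    return min(w_max, weight + rate * pre * post)


# Slow weight change: a small rate moves the weight only a little per step.
slow = hebbian_update(0.0, 1.0, 1.0, rate=0.25)

# Fast weight change: a large rate saturates at the cap after a few steps.
w = 0.0
for _ in range(3):
    w = hebbian_update(w, 1.0, 1.0, rate=0.5)
```

With both units fully active, the slow variant needs many co-occurrences to build a strong link, while the fast variant reaches the cap almost immediately, which is what makes it usable for temporary bindings.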
P.A.R.: A substantial number of learning
rules have been developed but not yet
implemented on computer. Learning involves
"ingestion" and "digestion". Ingestion
consists of co-occurrence rules. If two
signals previously unconnected co-occur, they
are connected together. Digestion makes use of
equivalence relationships to simplify the
network. Equivalence relationships include:
associativity, commutativity, distributivity,
and a number of other relationships which
have no name in standard algebra. Ingestion
and digestion operate more or less
alternately. First a piece of new information
is connected to the network, then equivalence
relations are tried in a search for
simplification.
L.S.: Structure is fixed but weights on links 
can be learned using a Hebbian weight change 
rule. 
G.C., P.D., H.S., A.Y.: Our systems do not
learn.
II. What is the representational status of
your system?
II.(A) Which parts of grammar or
dictionary do you model with your system?
G.C.: I have separate systems designed to work
together to handle lexical access,
case-grammar semantics, and fixed-length
context-free grammar.
P.D.: Lexicon, grammar and semantics. The
lexicon has words with their categories,
subcategories, and lexical meaning.
J.D.: Fixed-length context-free grammar.
P.A.R.: In theory, the entire system, from a
representation of general cognitive
information through language-specific "deep"
or "functional" structure, through a
syntax-morphology structure, and then through
a phonological structure. In actuality, the
syntax-morphology and phonology sections have
been worked out in greatest detail, and the
functional structure in bits and pieces.
H.S.: Syntax and phonology as a part of a
lexical access system.
L.S.: Domain knowledge in terms of a
hierarchy of concepts/frames - where each
concept is a collection of attribute-value (or
slot/filler) pairs. Such information
structures are variously referred to as
frame-based languages, semantic networks,
inheritance hierarchies, etc.
A.Y.: Syntax and some semantics.
II.(B) Which parts of grammar or dic- 
tionary do you model by a concurrent unit of 
your system? 
G.C.: I use a localist approach: one unit
stands for a word, a meaning, a syntactic
class, or a binding between meanings and
roles, syntactic and semantic.
P.D.: Parts of syntax; lexical search is also 
parallel 
J.D.: Localist representation, i.e. one syn- 
tactic category - one unit 
P.A.R.: Each category (such as noun phrase) is
distributively represented by many units.
H.S.: (Localist; on small grained level:) Each
occurrence of a category-in-rule-context (a
dotted rule in Earley's parser definition) is
represented by a unit. (On the large grain
level:) The set of possible small grain units
of each category corresponds to a Turing
machine, such that one of its units represents
the current state of the "head" of the TM and
the others its "tape".
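To make the small grained reading concrete, the sketch below enumerates one unit per dotted rule of a toy context-free grammar. It is an illustration only: the grammar, the function names `dotted_rule_units` and `show`, and the choice to create a unit for every dot position (including the position after the last symbol) are assumptions, not a description of H.S.'s compiler.

```python
def dotted_rule_units(grammar):
    """List one unit per dotted rule, as (lhs, rhs, dot position).

    grammar: dict mapping a nonterminal to a list of right-hand sides,
    each right-hand side being a tuple of symbols.
    """
    units = []
    for lhs, alternatives in grammar.items():
        for rhs in alternatives:
            for dot in range(len(rhs) + 1):
                units.append((lhs, rhs, dot))
    return units


def show(unit):
    """Render a unit in the usual dotted-rule notation."""
    lhs, rhs, dot = unit
    symbols = list(rhs[:dot]) + ["."] + list(rhs[dot:])
    return f"{lhs} -> {' '.join(symbols)}"


# Hypothetical toy grammar, used only to count the resulting units.
toy_grammar = {
    "S": [("NP", "VP")],
    "NP": [("det", "n")],
    "VP": [("v", "NP"), ("v",)],
}
units = dotted_rule_units(toy_grammar)
```

The number of units grows with the total length of the rules rather than with the number of categories, which is why the same category (NP here) is represented by several units, one per rule context.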
L.S.: (Localist:) A unit may "represent" a
concept, an attribute, a value, a binder
between (concept, attribute, value) triples,
or control nodes that mediate and control the
spreading of activation among these units.
A.Y.: (Localist:) Each grammatical category is
represented as a unit; actually each
occurrence of each category in a grammar
description is a unit.
II.(C) Is there a general method such
that a grammar determines uniquely a parallel
implementation or is this implementation an
art?
G.C.: Given a grammar, I have an algorithm to
generate the network for that grammar.
P.D.: Parsing algorithms are developed for
Tree Adjoining Grammars.
J.D.: Implementation is still an art.
P.A.R.: To a certain extent it is an art, at
this point, but the comprehension-acquisition
rules, if successfully implemented, should
provide the general method.
H.S.: Writing grammars as high-level
specifications is an art. From there on there
is a general method (same answer as L.S.).
L.S.: The networks are constructed from a
high-level specification of the conceptual
knowledge to be encoded. The mapping between
the knowledge level and the network level is
precisely specified. This mapping is
performed automatically by a network compiler.
A.Y.: Given a grammar, we have an algorithm
to make a network of units.
III. A short list of papers related to
your research?
G.C.: - Cottrell, G., Small, S.: Viewing
  Parsing as a Word Sense Discrimination:
  A Connectionist Approach. In B. Bara,
  G. Guida (eds.), Computational Models of
  Natural Language Processing, Amsterdam:
  North Holland 1984.
- Cottrell, G.: A Connectionist Approach
  to Word Sense Disambiguation. (Techn.
  Rep. 154) Rochester: The University of
  Rochester, Dept. of Computer Science.
  Revised version to be published by
  Pitman in the Research Notes in
  Artificial Intelligence Series.
P.D.: - Dey, P., Iyengar, S.S., Byoun, J.S.:
  Parallel processing of Tree Adjoining
  Grammars. Dept. of Computer Science,
  University of Alabama at Birmingham,
  Report 1987.
- Joshi, A.K., Levy, L.S., Takahashi, M.:
  Tree Adjoining Grammars. Journal of the
  Computer and System Sciences, Vol. 10,
  pp. 136-163, March 1975.
- Vijay-Shankar, K., Joshi, A.K.: Some
  Computational Properties of Tree
  Adjoining Grammars. Proc. 23rd Ann.
  Meeting Ass. Comp. Ling., pp. 82-93, 1985.
J.D.: - Cottrell, G.W. Parallelism in
  Inheritance Hierarchies with Exceptions.
  IJCAI-85, 194-202, Los Angeles, 1985.
- Fanty, M. Context-Free Parsing in
  Connectionist Networks. TR 174,
  University of Rochester, Department of
  Computer Science, November 1985.
- Fanty, M. Learning in Structured
  Connectionist Networks. Ph.D. Thesis,
  CS Department, Univ. of Rochester, 1988.
- Feldman, J.A., Fanty, M.A., & Goddard,
  N. Computing with Structured Neural
  Networks. IEEE Computer 1988; in press.
- Shastri, L. & Feldman, J.A. Semantic
  Networks and Neural Nets. TR 131,
  University of Rochester, Department of
  Computer Science, June 1984.
- Shastri, L. Evidential reasoning in
  semantic networks: a formal theory and
  its parallel implementation. Ph.D.
  Thesis and TR 166, Comp. Sci. Dept.,
  Univ. of Rochester, September 1985.
P.A.R.: Literature from systemic linguistics
  and parallel distributed processing.
H.S.: - Schnelle, H.: Elements of theoretical
  net-linguistics, Part 1: Syntactical
  and morphological nets - Neuro-
  linguistic interpretations. Theoretical
  Linguistics. Berlin: Walter de Gruyter
  & Co., 8, 1981, pp. 67-100.
- Schnelle, H., Job, D.M.: Elements of
  theoretical net-linguistics, Part 2:
  Phonological nets. Theoretical
  Linguistics 10, 1983, pp. 179-203.
- Schnelle, H.: Array logic for syntactic
  production processors. In Mey, J. (ed.),
  Language and Discourse: Test
  and Protest (Sgall-Festschrift).
  Amsterdam: John Benjamins B.V.,
  198?, pp. 477-511.
- McClelland, J.L., Elman, J.L.:
  Interactive processes in speech
  perception: The TRACE model, pp. 58 ff.
  in: McClelland, J.L., Rumelhart, D.E.,
  and the PDP-Group, Parallel Distributed
  Processing - Explorations in the
  Microstructure of Cognition, Vol. 2, 1986.
- Aho, A.V., Ullman, J.D.: Principles of
  Compiler Design (section on the Parsing
  Method of Earley). Reading, Mass.:
  Addison-Wesley, 1979.
L.S.: - Fahlman, S.E. NETL: A System for
  Representing and Using Real-World
  Knowledge, The MIT Press, Cambridge,
  MA, 1979.
- Hinton, G.E. Implementing Semantic
  Networks in Parallel Hardware. In
  Parallel Models of Associative Memory,
  pp. 161-187 in: G.E. Hinton and J.A.
  Anderson (Eds.). Lawrence Erlbaum
  Associates, Hillsdale, N.J., 198?.
- Derthick, M. A Connectionist
  Architecture for Representing and
  Reasoning about Structured Knowledge.
  Proceedings of the ninth annual
  conference of the Cognitive Science
  Society. Seattle, July, 1987. Lawrence
  Erlbaum Associates, Hillsdale, N.J.
- Shastri, L.: A Connectionist Approach to
  Knowledge Representation and limited
  inference. To appear in Cognitive
  Science: 12,3 (1988).
- Shastri, L.: Semantic Nets: An
  Evidential Formalization and its
  Connectionist Realization. Los Altos:
  Morgan Kaufmann, London: Pitman Publ.
  Comp.
A.Y.: - Kaplan R.: A Multi-Processor Approach
  to Natural Language, Proc. National
  Computer Conference, 1973, pp. 435-440.
- Small S., Rieger C.: Parsing and
  Comprehending with Word Experts, in
  Strategies for Natural Language
  Processing (eds. M.D. Ringle and W.
  Lehnert), Lawrence Erlbaum Associates,
  1988.
- Matsumoto Y.: A Parallel Parsing System
  for Natural Language, Springer Lecture
  Notes in Computer Science, No. 225,
  1986, pp. 396-409.
- Yonezawa A., Ohsawa I.: A New Approach
  to Parallel Parsing for Context-Free
  Grammars, Research Report on
  Information Sciences C-87, Dept. of Inf.
  Sci., Tokyo Institute of Technology, 1987.
