Memory-Efficient and Thread-Safe Quasi-Destructive Graph Unification
Marcel P. van Lohuizen
Department of Information Technology and Systems
Delft University of Technology
mpvl@acm.org
Abstract
In terms of both speed and memory consumption, graph unification remains the most expensive component of unification-based grammar parsing. We present a technique to reduce the memory usage of unification algorithms considerably, without increasing execution times. Also, the proposed algorithm is thread-safe, providing an efficient algorithm for parallel processing as well.
1 Introduction
Both in terms of speed and memory consumption, graph unification remains the most expensive component in unification-based grammar parsing. Unification is a well-known algorithm. Prolog, for example, makes extensive use of term unification. Graph unification is slightly different. Two different graph notations and an example unification are shown in Figures 1 and 2, respectively.
[Figure 1: Two ways to represent an identical graph: as a directed graph with arcs A, C, D, and F, and as the attribute-value matrix [A = b, C = <1> [D = e], F = <1>].]

In typical unification-based grammar parsers, roughly 90% of the unifications fail. Any processing to create, or copy, the result graph before the point of failure is
redundant. As copying is the most expensive part of unification, a great deal of research has gone into eliminating superfluous copying. Examples of these approaches are given in (Tomabechi, 1991) and (Wroblewski, 1987). In order to avoid superfluous copying, these algorithms incorporate control data in the graphs. This has several drawbacks, as we will discuss next.
Memory Consumption To achieve the goal of eliminating superfluous copying, the aforementioned algorithms include administrative fields (which we will call scratch fields) in the node structure. These fields do not contribute to the definition of the graph, but are used to efficiently guide the unification and copying process. Before a graph is used in unification, or after a result graph has been copied, these fields just take up space. This is undesirable, because memory usage is of great concern in many unification-based grammar parsers. This problem is especially of concern in Tomabechi's algorithm, as it increases the node size by at least 60% for typical implementations.
In the ideal case, scratch fields would be stored in a separate buffer, allowing them to be reused for each unification. The size of such a buffer would be proportional to the maximum number of nodes that are involved in a single unification. Although this technique reduces memory usage considerably, it does not reduce the amount of data involved in a single unification. Nevertheless, storing and loading nodes without scratch fields will be faster, because they are smaller. Because scratch fields are reused, there is a high probability that they will remain in cache. As the difference in speed between processor and memory continues to grow, caching is an important consideration (Ghosh et al., 1997).1

[Figure 2: An example unification in attribute-value matrix notation: [A = [B = c], D = [E = f]] unified with [A = <1> [B = c], D = <1>, G = [H = j]] yields [A = <1> [B = c, E = f], D = <1>, G = [H = j]].]
A straightforward approach to separating the scratch fields from the nodes would be to use a hash table to associate scratch structures with the addresses of nodes. The overhead of a hash table, however, may be significant. In general, any binding mechanism is bound to require some extra work. Nevertheless, considering the difference in speed between processors and memory, reducing the memory footprint may compensate for the loss of performance to some extent.
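This straightforward alternative can be sketched as follows (a minimal sketch in Python with names of our own choosing, not the paper's Objective-C code): every access to a scratch structure pays a hash lookup keyed on the node's identity.

```python
# Sketch of the straightforward (but slower) alternative: bind a
# scratch record to each node through a hash table keyed on the
# node's identity. Every access pays the hashing overhead.

class Node:
    def __init__(self, type_, arcs=()):
        self.type = type_
        self.arcs = list(arcs)

scratch = {}  # id(node) -> scratch record

def scratch_for(node):
    # One dictionary probe per access; an array indexed directly by
    # node number would avoid this per-access cost.
    return scratch.setdefault(id(node), {"forward": None, "copy": None})

n = Node("bottom")
scratch_for(n)["forward"] = ("other-node", 7)
assert scratch_for(n)["forward"] == ("other-node", 7)
```

The dictionary keeps the nodes themselves free of scratch fields, but the per-access overhead is exactly what the indexing scheme of Section 2 is designed to avoid.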
Symmetric Multi-Processing Small-scale desktop multiprocessor systems (e.g. dual or even quad Pentium machines) are becoming more commonplace and affordable. If we focus on graph unification, there are two ways to exploit their capabilities. First, it is possible to parallelize a single graph unification, as proposed by e.g. (Tomabechi, 1991). Suppose we are unifying graph a with graph b; then we could allow multiple processors to work on the unification of a and b simultaneously. We will call this parallel unification. Another approach is to allow multiple graph unifications to run concurrently. Suppose we are unifying graphs a and b in addition to unifying graphs a and c. By assigning a different processor to each operation we obtain what we will call concurrent unification. Parallel unification exploits parallelism inherent in graph unification itself, whereas concurrent unification exploits parallelism at the context-free grammar backbone. As long as the number of unification operations in one parse is large, we believe it is preferable to choose concurrent unification. Especially when a large number of unifications terminates quickly (e.g. due to failure), the overhead of more finely grained parallelism can be considerable.

1 Most of today's computers load and store data in large chunks (called cache lines), causing even uninitialized fields to be transported.
In the example of concurrent unification, graph a was used in both unifications. This suggests that in order for concurrent unification to work, the input graphs need to be read-only. With destructive unification algorithms this does not pose a problem, as the source graphs are copied before unification. However, including scratch fields in the node structure (as Tomabechi's and Wroblewski's algorithms do) thwarts the implementation of concurrent unification, as different processors will need to write different values in these fields. One way to solve this problem is to disallow a single graph from being used in multiple unification operations simultaneously. In (van Lohuizen, 2000) it is shown, however, that this will greatly impair the ability to achieve speedup. Another solution is to duplicate the scratch fields in the nodes for each processor. This, however, will enlarge the node size even further. In other words, Tomabechi's and Wroblewski's algorithms are not suited for concurrent unification.
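The requirement can be sketched as follows (our own illustration, not the paper's implementation): input graphs stay read-only, and each worker thread owns a private set of scratch tables, so two unifications involving the same graph cannot interfere.

```python
# Sketch: concurrent unification keeps input graphs read-only and
# gives each worker thread its own scratch tables, so two threads
# unifying against the same graph never write to shared state.

import threading

_tls = threading.local()

def my_tables(size=1024):
    # Lazily create one forward table and one copy table per thread.
    if not hasattr(_tls, "fwtab"):
        _tls.fwtab = [None] * size
        _tls.cptab = [None] * size
    return _tls.fwtab, _tls.cptab

def worker(results, idx):
    fwtab, _ = my_tables()
    fwtab[0] = idx            # write is private to this thread
    results[idx] = fwtab[0]

results = {}
threads = [threading.Thread(target=worker, args=(results, i))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert results == {0: 0, 1: 1}  # no cross-thread interference
```

With scratch fields embedded in the nodes, the two workers above would race on the same memory; with per-thread tables the graphs themselves are never written.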
2 Algorithm
The key to the solution of all of the above-mentioned issues is to separate the scratch fields from the fields that actually make up the definition of the graph. The resulting data structures are shown in Figure 3. We have taken Tomabechi's quasi-destructive graph unification algorithm as the starting point (Tomabechi, 1995), because it is often considered to be the fastest unification algorithm for unification-based grammar parsing (see e.g. (op den Akker et al., 1995)). We have separated the scratch fields needed for unification from the scratch fields needed for copying.2

[Figure 3: Node and Arc structures (permanent, read-only: a node's type and arc list; an arc's label, value, and offset) and the reusable scratch structures (unification data: forward and comp-arc list; copy data: copy). In the permanent structures we use offsets. Scratch structures use index values (including arcs recorded in comp-arc list). Our implementation derives offsets from index values stored in nodes.]
We propose the following technique to associate scratch structures with nodes. We take an array of scratch structures. In addition, for each graph we assign each node a unique index number that corresponds to an element in the array. Different graphs typically share the same indexes. Since unification involves two graphs, we need to ensure that two nodes will not be assigned the same scratch structure. We solve this by interleaving the index positions of the two graphs. This mapping is shown in Figure 4. Obviously, the minimum number of elements in the table is two times the number of nodes of the largest graph. To reduce the table size, we allow certain nodes to be deprived of scratch structures. (For example, we do not forward atoms.) We denote this with a valuation function v, which returns 1 if the node is assigned an index and 0 otherwise.
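The interleaving scheme can be sketched as follows (a minimal sketch with our own naming, not the paper's Objective-C implementation): nodes of the left input graph occupy the even table slots and nodes of the right graph the odd slots, so the two graphs can never collide on a scratch entry.

```python
# Sketch of the interleaved scratch table: node index i of the left
# graph maps to slot 2*i, node index i of the right graph to 2*i + 1.

class ScratchTable:
    def __init__(self, max_nodes):
        # Two interleaved slots per node index.
        self.entries = [None] * (2 * max_nodes)

    def slot(self, node_index, side):
        # side = 0 for the left input graph, 1 for the right one.
        return 2 * node_index + side

    def get(self, node_index, side):
        return self.entries[self.slot(node_index, side)]

    def set(self, node_index, side, value):
        self.entries[self.slot(node_index, side)] = value

tab = ScratchTable(max_nodes=4)
tab.set(1, 0, "fwd-left")    # node 1 of the left graph  -> slot 2
tab.set(1, 1, "fwd-right")   # node 1 of the right graph -> slot 3
assert tab.get(1, 0) == "fwd-left"
assert tab.get(1, 1) == "fwd-right"
```

Because the table is reused across unifications, its size is bounded by the largest graph rather than by the whole chart, which is what keeps the scheme memory-efficient.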
We can associate the index with a node by including it in the node structure. For structure sharing, however, we have to use offsets between nodes (see Figure 4), because otherwise different nodes in a graph may end up having the same index (see Section 3). Offsets can be easily derived from index values in nodes. As storing offsets in arcs consumes more memory than storing indexes in nodes (more arcs may point to the same node), we store index values and use them to compute the offsets. For ease of reading, we present our algorithm as if the offsets were stored instead of computed. Note that the small index values consume much less space than the scratch fields they replace.

[Figure 4: The mechanism to associate index numbers with nodes. The numbers in the nodes represent index numbers; the left graph has offset 0 and the right graph offset 1, so a node with index i maps to table entry 2i + 0 or 2i + 1, respectively. Arcs are associated with offsets; negative offsets indicate a reentrancy.]

2 The arc-list field could be used for permanent forward links, if required.
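The derivation of offsets from stored indexes can be sketched as follows (our own illustration; function names are hypothetical, cf. AbsArc in Figure 5): an arc's offset is the difference between child and parent index, and a child's table slot follows from its parent's slot plus twice the offset, which preserves the left/right parity of the interleaving.

```python
# Sketch: offsets are not stored in arcs but derived from the index
# values stored in the nodes themselves.

def arc_offset(parent_index, child_index):
    # Negative offsets indicate a reentrancy (the child was assigned
    # an index before the parent was reached).
    return child_index - parent_index

def abs_index(parent_table_index, offset):
    # Table slot of a child, given the parent's table slot. The
    # factor 2 accounts for the interleaving of the two graphs, so
    # the parity (left/right origin) of the parent is preserved.
    return parent_table_index + 2 * offset

assert arc_offset(0, 3) == 3           # ordinary forward arc
assert arc_offset(4, 2) == -2          # reentrancy
assert abs_index(2 * 1 + 0, 3) == 8    # even slot: stays a left node
assert abs_index(2 * 1 + 1, 3) == 9    # odd slot: stays a right node
```

Storing one small index per node instead of one offset per arc is the cheaper choice precisely because several arcs may point to the same node.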
The resulting algorithm is shown in Figure 5. It is very similar to the algorithm in (Tomabechi, 1991), but incorporates our indexing technique. Each reference to a node now consists not only of the address of the node structure, but also of its index in the table. This is required because we cannot derive its table index from its node structure alone. The second argument of Copy indicates the next free index number. Copy returns references with an offset, allowing them to be directly stored in arcs. These offsets will be negative when Copy exits at line 2.2, resembling a reentrancy. Note that only AbsArc explicitly defines operations on offsets. AbsArc computes a node's index using its parent node's index and an offset.
Unify(dg1, dg2)
1. try Unify1((dg1, 0), (dg2, 1))  (a)
   1.1. (copy, n) ← Copy((dg1, 0), 0)
   1.2. Clear the fwtab and cptab tables.  (b)
   1.3. return copy
2. catch
   2.1. Clear the fwtab table.  (b)
   2.2. return nil

Unify1(ref-in1, ref-in2)
1. ref1 = (dg1, idx1) ← Dereference(ref-in1)
2. ref2 = (dg2, idx2) ← Dereference(ref-in2)
3. if dg1 =addr dg2 and idx1 = idx2  (c) then
   3.1. return
4. if dg1.type = bottom then
   4.1. Forward(ref1, ref2)
5. elseif dg2.type = bottom then
   5.1. Forward(ref2, ref1)
6. elseif both dg1 and dg2 are atomic then
   6.1. if dg1.arcs ≠ dg2.arcs then
        throw UnificationFailedException
   6.2. Forward(ref2, ref1)
7. elseif either dg1 or dg2 is atomic then
   7.1. throw UnificationFailedException
8. else
   8.1. Forward(ref2, ref1)
   8.2. shared ← IntersectArcs(ref1, ref2)
   8.3. for each ((l, r1), (l, r2)) in shared do
        Unify1(r1, r2)
   8.4. new ← ComplementArcs(ref1, ref2)
   8.5. for each arc in new do
        Push arc to fwtab[idx1].comp-arcs

Forward((dg1, idx1), (dg2, idx2))
1. if v(dg1) = 1 then
   fwtab[idx1].forward ← (dg2, idx2)

AbsArc((label, (dg, off)), current-idx)
return (label, (dg, current-idx + 2 · off))  (d)

Dereference((dg, idx))
1. if v(dg) = 1 then
   1.1. (fwd-dg, fwd-idx) ← fwtab[idx].forward
   1.2. if fwd-dg ≠ nil then
        return Dereference((fwd-dg, fwd-idx))
   1.3. else
        return (dg, idx)

IntersectArcs(ref1, ref2)
Returns pairs of arcs with index values for each pair of arcs in ref1 resp. ref2 that have the same label. To obtain index values, arcs from arc-list must be converted with AbsArc.

ComplementArcs(ref1, ref2)
Returns node references for all arcs with labels that exist in ref2, but not in ref1. The references are computed as with IntersectArcs.

Copy(ref-in, new-idx)
1. (dg, idx) ← Dereference(ref-in)
2. if v(dg) = 1 and cptab[idx].copy ≠ nil then
   2.1. (dg1, idx1) ← cptab[idx].copy
   2.2. return (dg1, idx1 − new-idx + 1)
3. newcopy ← new Node
4. newcopy.type ← dg.type
5. if v(dg) = 1 then
   cptab[idx].copy ← (newcopy, new-idx)
6. count ← v(newcopy)  (e)
7. if dg.type = atomic then
   7.1. newcopy.arcs ← dg.arcs
8. elseif dg.type = complex then
   8.1. arcs ← {AbsArc(a, idx) | a ∈ dg.arcs} ∪ fwtab[idx].comp-arcs
   8.2. for each (label, ref) in arcs do
        ref1 ← Copy(ref, count + new-idx)  (f)
        Push (label, ref1) into newcopy.arcs
        if ref1.offset > 0  (g) then
           count ← count + ref1.offset
9. return (newcopy, count)

(a) We assign even and odd indexes to the nodes of dg1 and dg2, respectively.
(b) Tables only need to be cleared up to the point where unification failed.
(c) Compare indexes to allow more powerful structure sharing. Note that indexes uniquely identify a node in the case that for all nodes n it holds that v(n) = 1.
(d) Note that we multiply the offset by 2 to account for the interleaved indexes of the left and right graphs.
(e) We assume it is known at this point whether the new node requires an index number.
(f) Note that ref contains an index, whereas ref1 contains an offset.
(g) If the node was already copied (in which case the offset is < 0), we need not reserve indexes.

Figure 5: The memory-efficient and thread-safe unification algorithm. Note that the arrays fwtab and cptab (which represent the forward table and copy table, respectively) are defined as global variables. In order to be thread-safe, each thread needs to have its own copy of these tables.
Contrary to Tomabechi's implementation, we invalidate scratch fields by simply resetting them after a unification completes. This simplifies the algorithm. We only reset the table up to the highest index in use. As table entries are roughly filled in increasing order, there is little overhead for clearing unused elements.
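The reset strategy can be sketched with a high-water mark (our own construction, not the paper's code): only the prefix of the table that the last unification actually touched is cleared.

```python
# Sketch: reset scratch entries only up to the highest slot used,
# instead of invalidating entries via generation counters.

class ScratchTable:
    def __init__(self, size):
        self.entries = [None] * size
        self.high = 0  # one past the highest slot written so far

    def set(self, slot, value):
        self.entries[slot] = value
        if slot >= self.high:
            self.high = slot + 1

    def clear(self):
        # Entries are filled in roughly increasing order, so this
        # touches little more than what the last unification used.
        for i in range(self.high):
            self.entries[i] = None
        self.high = 0

tab = ScratchTable(1024)
tab.set(5, "forwarded")
tab.clear()                # only slots 0..5 are reset, not all 1024
assert tab.entries[5] is None and tab.high == 0
```

A clear that is proportional to the work just done keeps the reused table cheap even when it is dimensioned for the largest graph.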
A nice property of the algorithm is that indexes identify from which input graph a node originates (even = left, odd = right). This information can be used, for example, to selectively share nodes in a structure-sharing scheme. We can also specify additional scratch fields or additional arrays at hardly any cost. Some of these abilities will be used in the enhancements of the algorithm we will discuss next.
3 Enhancements
Structure Sharing Structure sharing is an important technique to reduce memory usage. We will adopt the same terminology as Tomabechi in (Tomabechi, 1992). That is, we will use the term feature-structure sharing when two arcs in one graph converge to the same node in that graph (also referred to as reentrancy), and data-structure sharing when arcs from two different graphs converge to the same node.
The conditions for sharing mentioned in (Tomabechi, 1992) are: (1) bottom and atomic nodes can be shared; (2) complex nodes can be shared unless they are modified. We need to add the following condition: (3) all arcs in the shared subgraph must have the same offsets as the subgraph that would have resulted from copying. A possible violation of this constraint is shown in Figure 6. As long as arcs are processed in increasing order of index number,3 this condition can only be violated in case of reentrancy. Basically, the condition can be violated when a reentrancy points past a node that is bound to a larger subgraph.

3 This can easily be accomplished by fixing the order in which arcs are stored in memory. This is a good idea anyway, as it can speed up the ComplementArcs and IntersectArcs operations.
[Figure 6: The sharing mechanism, showing the result of a unification both without and with sharing, and a specialized sharing arc. Node f cannot be shared, as this would cause the arc labeled F to derive an index colliding with node q.]
Contrary to many other structure-sharing schemes (like (Malouf et al., 2000)), our algorithm allows sharing of nodes that are part of the grammar. As nodes from the different input graphs are never assigned the same table entry, they are always bound independently of each other. (See the footnote for line 3 of Unify1.)
The sharing version of Copy is similar to the variant in (Tomabechi, 1992). The extra check can be implemented straightforwardly by comparing the old offset with the offset for the new nodes. Because we derive the offsets from index values associated with nodes, we need to compensate for a difference between the index of the shared node and the index it should have in the new graph. We store this information in a specialized share arc. We need to adjust Unify1 to handle share arcs accordingly.
Deferred Copying Just as we use a table for unification and copying, we also use a table for subsumption checking. Tomabechi's algorithm requires that the graph resulting from unification be copied before it can be used for further processing. This can result in superfluous copying when the graph is subsumed by an existing graph. Our technique allows subsumption to use the bindings generated by Unify1 in addition to its own table. This allows us to defer copying until we have completed subsumption checking.

[Figure 7: Execution time (seconds) against sentence length (number of words) for the configurations "basic", "tomabechi", "packed", "pack+deferred_copy", "pack+share", and "packed_on_dual_proc".]
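The control flow of deferred copying can be sketched as follows (a simplified sketch of our own; the callbacks are hypothetical stand-ins for the paper's operations): because the bindings live in external tables, the subsumption check can run on them directly, and the copy happens only for results that survive it.

```python
# Sketch: defer copying until after the subsumption check, which
# reuses the scratch bindings produced by unification.

def parse_step(unify, subsumed, copy, clear_tables, a, b):
    if not unify(a, b):         # fills the scratch tables
        clear_tables()
        return None             # failure: nothing was ever copied
    if subsumed(a, b):          # reads the same scratch bindings
        clear_tables()
        return None             # subsumed: the copy is skipped
    result = copy(a)            # copy only the surviving result
    clear_tables()
    return result

copies = []
result = parse_step(
    unify=lambda a, b: True,
    subsumed=lambda a, b: True,           # already in the chart
    copy=lambda a: copies.append(a) or a,
    clear_tables=lambda: None,
    a="g1", b="g2")
assert result is None and copies == []    # the copy never happened
```

In Tomabechi's original scheme the copy would already have been made before the subsumption check, which is exactly the superfluous work this ordering avoids.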
Packed Nodes With a straightforward implementation of our algorithm, we obtain a node size of 8 bytes.4 By dropping the concept of a fixed node size, we can reduce the size of atom and bottom nodes to 4 bytes. Type information can be stored in two bits. We use the two least significant bits of pointers (which otherwise are 0) to store this type information. Instead of using a pointer for the value field, we store nodes in place. Only for reentrancies do we still need pointers. Complex nodes require 8 bytes, as they include a pointer to the first node past their children (necessary for unification). This scheme requires some extra logic to decode nodes, but significantly reduces memory consumption.

4 We do not have a type hierarchy.
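The tagging idea can be illustrated as follows (a sketch of our own: the paper stores the two-bit tag in the otherwise-zero low bits of real aligned pointers, whereas here we simulate a tagged machine word with integers).

```python
# Sketch: pack a 2-bit type tag into the low bits of a word. With
# 4-byte-aligned pointers the two least significant bits are always
# 0, so they are free to carry the node type.

BOTTOM, ATOMIC, COMPLEX, REENTRANT = 0, 1, 2, 3   # 2-bit type tags

def pack(payload, tag):
    # Simulated: shift the payload up to free the two tag bits.
    return (payload << 2) | tag

def unpack(word):
    return word >> 2, word & 3

word = pack(0x1234, ATOMIC)
assert unpack(word) == (0x1234, ATOMIC)
```

Decoding adds a mask and a shift per access, which matches the paper's observation that packing trades a little extra logic for a substantially smaller heap.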
[Figure 8: Memory used by the graph heap (MB) against sentence length (number of words) for the configurations "basic", "tomabechi", "packed", and "pack+share".]
4 Experiments
We have tested our algorithm with a medium-sized grammar for Dutch. The system was implemented in Objective-C using a fixed-arity graph representation. We used a test set of 22 sentences of varying length. Usually, approximately 90% of the unifications fail. On average, graphs consist of 60 nodes. The experiments were run on a Pentium III 600EB (256 KB L2 cache) box with 128 MB memory, running Linux.
We tested both memory usage and execution time for various configurations. The results are shown in Figures 7 and 8. They include a version of Tomabechi's algorithm. The node size for this implementation is 20 bytes. For the proposed algorithm we have included several versions: a basic implementation, a packed version, a version with deferred copying, and a version with structure sharing. The basic implementation has a node size of 8 bytes; the others have a variable node size. Whenever applicable, we applied the same optimizations to all algorithms. We also tested the speedup on a dual Pentium II 266 MHz.5 Each processor was assigned its own scratch tables. Apart from that, no changes to the algorithm were required. For more details on the multi-processor implementation, see (van Lohuizen, 1999).

5 These results are scaled to reflect the speedup relative to the tests run on the other machine.
The memory utilization results show significant improvements for our approach.6 Packing decreased memory utilization by almost 40%. Structure sharing roughly halved this once more.7 The third condition prohibited sharing in less than 2% of the cases where it would be possible in Tomabechi's approach.
Figure 7 shows that our algorithm does not increase execution times. Our algorithm even shaves off roughly 7% of the total parsing time. This speedup can be attributed to improved cache utilization. We verified this by running the same tests with the cache disabled. This made our algorithm actually run slower than Tomabechi's algorithm. Deferred copying did not improve performance: the additional overhead of dereferencing during subsumption was not compensated by the savings on copying. Structure sharing did not significantly alter the performance either. Although this version uses less memory, it has to perform additional work.
Running the same tests on machines with less memory showed a clear performance advantage for the algorithms using less memory, because paging could be avoided.
5 Related Work
We reduce the memory consumption of graph unification as presented in (Tomabechi, 1991) (or (Wroblewski, 1987)) by separating scratch fields from node structures. Pereira's (Pereira, 1985) algorithm also stores changes to nodes separately from the graph. However, Pereira's mechanism incurs a log(n) overhead for accessing the changes (where n is the number of nodes in a graph), resulting in an O(n log n) time algorithm. Our algorithm runs in O(n) time.

6 The results do not include the space consumed by the scratch tables. However, these tables do not consume more than 10 KB in total, and hence have no significant impact on the results.

7 Because the packed version has a variable node size, structure sharing yielded smaller relative improvements than when applied to the basic version. In terms of number of nodes, though, the two results were identical.
With respect to over copying and early copying (as defined in (Tomabechi, 1991)), our algorithm has the same characteristics as Tomabechi's algorithm. In addition, our algorithm allows the copying of graphs to be postponed until after subsumption checks complete. This would require additional fields in the node structure for Tomabechi's algorithm.
Our algorithm allows sharing of grammar nodes, which is usually impossible in other implementations (Malouf et al., 2000). A weak point of our structure-sharing scheme is its extra condition. However, our experiments showed that this condition has only a minor impact on the amount of sharing.
We showed that compressing node structures allowed us to reduce memory consumption by another 40% without sacrificing performance. Applying the same technique to Tomabechi's algorithm would yield smaller relative improvements (max. 20%), because the scratch fields cannot be compressed to the same extent.
One of the design goals of Tomabechi's algorithm was to come to an efficient implementation of parallel unification (Tomabechi, 1991). Although theoretically parallel unification is hard (Vitter and Simons, 1986), Tomabechi's algorithm provides an elegant solution to achieve limited-scale parallelism (Fujioka et al., 1990). Since our algorithm is based on the same principles, it allows parallel unification as well. Tomabechi's algorithm, however, is not thread-safe, and hence cannot be used for concurrent unification.
6 Conclusions
We have presented a technique to reduce memory usage by separating scratch fields from nodes. We showed that compressing node structures can further reduce the memory footprint. Although these techniques require extra computation, the algorithms still run faster. The main reason for this is the difference between cache and memory speed. As current developments indicate that this difference will only get larger, this effect is not just an artifact of current architectures.
We showed how to incorporate data-structure sharing. For our grammar, the additional constraint for sharing did not pose a problem. If it does pose a problem, there are several techniques to mitigate its effect. For example, one could reserve additional indexes at critical positions in a subgraph (e.g. based on type information). These can then be assigned to nodes in later unifications without introducing conflicts elsewhere. Another technique is to include a tiny table with repair information in each share arc to allow a small number of conflicts to be resolved.
For certain grammars, data-structure sharing can also significantly reduce execution times, because the equality check (see line 3 of Unify1) can intercept shared nodes with the same address more frequently. We did not exploit this benefit, but rather included an offset check to allow grammar nodes to be shared as well. One could still choose, however, not to share grammar nodes.
Finally, we introduced deferred copying. Although this technique did not improve performance, we suspect that it might be beneficial for systems that use more expensive memory allocation and deallocation models (like garbage collection).
Since memory consumption is a major concern with many of the current unification-based grammar parsers, our approach provides a fast and memory-efficient alternative to Tomabechi's algorithm. In addition, we showed that our algorithm is well suited for concurrent unification, allowing execution times to be reduced as well.
References
[Fujioka et al.1990] T. Fujioka, H. Tomabechi, O. Furuse, and H. Iida. 1990. Parallelization technique for quasi-destructive graph unification algorithm. In Information Processing Society of Japan SIG Notes 90-NL-80.
[Ghosh et al.1997] S. Ghosh, M. Martonosi, and S. Malik. 1997. Cache miss equations: An analytical representation of cache misses. In Proceedings of the 11th International Conference on Supercomputing (ICS-97), pages 317-324, New York, July 7-11. ACM Press.
[Malouf et al.2000] Robert Malouf, John Carroll, and Ann Copestake. 2000. Efficient feature structure operations without compilation. Natural Language Engineering, 1(1):1-18.
[op den Akker et al.1995] R. op den Akker, H. ter Doest, M. Moll, and A. Nijholt. 1995. Parsing in dialogue systems using typed feature structures. Technical Report 95-25, Dept. of Computer Science, University of Twente, Enschede, The Netherlands, September. Extended version of an article published in E...
[Pereira1985] Fernando C. N. Pereira. 1985. A structure-sharing representation for unification-based grammar formalisms. In Proc. of the 23rd Annual Meeting of the Association for Computational Linguistics, pages 137-144, Chicago, IL, 8-12 July.
[Tomabechi1991] H. Tomabechi. 1991. Quasi-destructive graph unifications. In Proceedings of the 29th Annual Meeting of the ACL, Berkeley, CA.
[Tomabechi1992] Hideto Tomabechi. 1992. Quasi-destructive graph unifications with structure-sharing. In Proceedings of the 15th International Conference on Computational Linguistics (COLING-92), Nantes, France.
[Tomabechi1995] Hideto Tomabechi. 1995. Design of efficient unification for natural language. Journal of Natural Language Processing, 2(2):23-58.
[van Lohuizen1999] Marcel van Lohuizen. 1999. Parallel processing of natural language parsers. In PARCO '99. Paper accepted (8 pages), to appear.
[van Lohuizen2000] Marcel P. van Lohuizen. 2000. Exploiting parallelism in unification-based parsing. In Proc. of the Sixth International Workshop on Parsing Technologies (IWPT 2000), Trento, Italy.
[Vitter and Simons1986] Jeffrey Scott Vitter and Roger A. Simons. 1986. New classes for parallel complexity: A study of unification and other complete problems for P. IEEE Transactions on Computers, C-35(5):403-418, May.
[Wroblewski1987] David A. Wroblewski. 1987. Nondestructive graph unification. In Kenneth Forbus and Howard Shrobe, editors, Proceedings of the 6th National Conference on Artificial Intelligence (AAAI-87), pages 582-589, Seattle, WA, July. Morgan Kaufmann.
