An Earley-style Predictive Chart Parsing 
Method for Lambek Grammars 
Mark Hepple 
Department of Computer Science, University of Sheffield, Regent Court, 
211 Portobello Street, Sheffield S1 4DP, UK \[heppleOdcs .shef .ac.uk\] 
Abstract 
We present a new chart parsing method for 
Lambek grammars, inspired by a method for D- 
Tree grammar parsing. The formulae of a Lam- 
bek sequent are firstly converted into rules of 
an indexed grammar formalism, which are used 
in an Earley-style predictive chart algorithm. 
The method is non-polynomial, but performs 
well for practical purposes -- much better than 
previous chart methods for Lambek grammars. 
1 Introduction 
We present a new chart parsing method for 
Lambek grammars. The starting point for 
this work is the observation, in (Hepple, 1998), 
of certain similarities between categorial gram- 
mars and the D-Tree grammar (DTG) formal- 
ism of Rambow et al. (1995a). On this basis, 
we have explored adapting the DTG parsing ap- 
proach of Rambow et al. (1995b) for use with 
the Lambek calculus. The resulting method is 
one in which the formulae of a Lambek sequent 
that is to be proven are first converted to pro- 
duce rules of a formalism which combines ideas 
from the multiset-valued linear indexed gram- 
mar formalism of Rainbow (1994), with the 
Lambek calculus span labelling scheme of Mor- 
rill (1995), and with the first-order compilation 
method for categorial parsing of Hepple (1996). 
The resulting 'grammar' is then parsed using an 
Earley-style predictive chart algorithm which is 
adapted from Rambow et al. (1995b). 
2 The Lambek Calculus 
We are concerned with the implicational (or 
'product-free') fragment of the associative Lam- 
bek calculus (Lambek, 1958). A natural deduc- 
tion formulation is provided by the following 
rules of elimination and introduction, which cor- 
respond to steps of functional application and 
abstraction, respectively (as the term labelling 
reveals). The rules are sensitive to the order of 
assumptions. In the \[/I\] (resp. \[\I\]) rule, \[B\] in- 
dicates a discharged or withdrawn assumption, 
which is required to be the rightmost (resp. left- 
most) of the proof. 
A/B :a B :b /E B:b B\A a 
A: (ab) A: (ab) 
• ....\[B: v\] \[B: v\].:. 
A:a A:a /I \I 
A/B : Av.a B\A : Av.a 
\E 
(which) (mary) (ate) 
rel/(s/np) np (np\s)/np \[np\] /E 
np\s \E 
S 
rel 
The above proof illustrates 'hypothetical 
reasoning', i.e. the presence of additional as- 
sumptions ('hypotheticals') in proofs that are 
subsequently discharged. It is because of this 
phenomenon that standard chart methods are 
inadequate for the Lambek calculus -- hypo- 
theticals don't belong at any position on the 
single ordering over lexical categories by which 
standard charts are organised. 1 The previ- 
ous chart methods for the Lambek calculus 
deal with this problem in different ways. The 
method of K6nig (1990, 1994) places hypothet- 
icals on separate 'minicharts' which can attach 
into other (mini)charts where combinations are 
1In effect, hypotheticals belong on additional subor- 
derings, which can connect into the main ordering of 
the chart at various positions, generating a branching, 
multi-dimensional ordering scheme. 
465 
possible. The method requires rather com- 
plicated book-keeping. The method of Hepple 
(1992) avoids this complicated book-keeping, 
and also rules out some useless subderivations 
allowed by Khnig's method, but does so at 
the cost of computing a representation of all 
the possible category sequences that might be 
tested in an exhaustive sequent proof search. 
Neither of these methods exhibits performance 
that would be satisfactory for practical use. 2 
3 Some Preliminaries 
3.1 First-order Compilation for 
Categorial Parsing 
Hepple (1996) introduces a method of first- 
order compilation for implicational linear logic, 
to provide a basis for efficient theorem proving 
of various categorial formalisms. Implicational 
linear logic is similar to the Lambek calculus, 
except having only a single non-directional im- 
plication --o. The idea of first-order compil- 
ation is to eliminate the need for hypothetical 
reasoning by simplifying higher-order formulae 
(whose presence requires hypothetical reason- 
ing) to first-order formulae. This involves ex- 
cising the subformulae that correspond to hy- 
potheticals, leaving a first-order residue. The 
excised subformulae are added as additional as- 
sumptions. For example, a higher-order formula 
(Z -o Y) --o X simplifies to Z+ (Y -o X), allow- 
ing proof (a) to be replaced by (b): 
(a) \[Z\] Z-oW W-oY (Z-oy)-oX 
W 
Y 
Z--oY 
X 
Y--oX (b) Z Z--o W W--o Y 
W 
Y 
X 
The method faces two key problems: avoiding 
invalid deduction and getting an appropriate se- 
2Morrill (1996) provides a somewhat different tabular 
method for Lambek parsing within the proof net deduc- 
tion framework, in an approach where proof net check- 
ing is made by unifying labels marked on literals. The 
approach tabulates MGU's for the labels of contiguous 
subsegments of a proof net. 
mantics for the combination. To avoid invalid 
deduction, an indexing scheme is used to en- 
sure that a hypothetical must be used to de- 
rive the argument of the residue functor from 
which was excised (e.g. Z must be used to 
derive the argument Y of Y--o X, a condition 
satisfied in proof (b). To get the same se- 
mantics with compilation as without, the se- 
mantic effects of the introduction rule are com- 
piled into the terms of the formulae produced, 
e.g. (Z -o Y) --o X : w gives Z : z plus Y --o X : 
Au.w(Az.u). Terms are combined, not using 
standard application/fl-reduction, but rather 
an operation Ax.g + h =~ g\[h//x\] where a 
variant of substitution is used that allows 'ac- 
cidental' variable capture. Thus when Y--o X 
combines with its argument, whose derivation 
includes Z, the latter's variable becomes bound, 
e.g. lu.w(lz.u) + x(yz) =~ w(Iz.x(yz)) 
3.2 Multiset-valued Linear Indexed 
Grammar 
Rambow (1994) introduces the multiset-valued 
linear indexed grammar formalism ({}-LIG). In- 
dices are stored in an unordered multiset rep- 
resentation (c.f. the stack of conventional lin- 
ear indexed grammar). The contents of the 
multiset at any mother node in a tree is dis- 
tributed amongst its daughter nodes in a lin- 
ear fashion, i.e each index is passed to pre- 
cisely one daughter. Rules take the form 
A0\[m0\]-+ Al\[ml\]...An\[m,~\]. The multiset of 
indices m0 are required to be present in, and 
are removed from, the multiset context of the 
mother node in a tree. For each daughter Ai, 
the indices mi are added into whatever other 
indices are inherited to that daughter. Thus, 
a rule A\[\] --+ B\[1\] C\[\] (where \[\] indicates an 
empty multiset) can license the use of a rule 
DIll ~ a within the derivation of its daugh- 
ter BIll, and so the indexing system allows the 
encoding of dominance relations. 
4 A New Chart Parsing Method for 
Lambek Grammars 
4.1 Lambek to SLMG Conversion 
The first task of the parsing approach is to con- 
vert the antecedent formulae of the sequent to 
be proved into a collection of rules of a form- 
alism I call Span Labelled Multiset Grammar 
(SLMG). For digestibility, I will present the con- 
version process in three stages. (I will assume 
466 
Method: 
(A:(i-j)) p = A:(i-j) where A atomic 
(A/B:(h-i))P = (A:(h-j))P / (B:(i-j)) ~ 
(B\A:(h-i)) p = (B:(j-h)) ~ \ (A:(j-i)) p 
where j is a new 
variable/constant 
aspis +/- 
Example: 
(X/(Y/Z):(O-1)) + = X:(O-h)/(Y:(1-k)/Z:(h-k)) 
(w:(1-2))+ = w:(1-2) 
((W\Y)/Z:(2-3)) + = (W:(i-2)\Y:(i-j))/Z:(3-j) 
Figure 1: Phase 1 of conversion (span labelling) 
that in any sequent F ~ A to be proved, the 
succedent A is atomic. Any sequent not in this 
form is easily converted to one, of equivalent 
theoremhood, which is.) 
Firstly, directional types are labelled with 
span information using the labelling scheme 
of Morrill (1995) (which is justified in rela- 
tion to relational algebraic models for the Lam- 
bek calculus (van Benthem, 1991)). An ante- 
cedent Xi in X1...Xn =~ X0 has basic span 
(h-i) where h -- (i - 1). The labelled for- 
mula is computed from (Xi:(h-i)) + using the 
polar translation functions shown in Figure 1 
(where /~ denotes the complementary polarity 
to p).3 As an example, Figure 1 also shows 
the results of converting the antededents of 
X/(Y/Z), W, (W\Y)/Z =~ X (where k is a con- 
stant and i,j variables). 4 
The second stage of the conversion is adap- 
ted from the first-order compilation method of 
Hepple (1996), discussed earlier, modified to 
handle directional formulae and using a mod- 
ified indexation scheme to record dependencies 
3The constants produced in the translation corres- 
pond to 'new' string positions, which make up the addi- 
tional suborderings on which hypotheticals are located. 
The variables produced in the translation become instan- 
tiated to some string constant during an analysis, fixing 
the position at which an additional subordering becomes 
'attached to' another (sub)ordering. 
4The idea of implementing categorial grammar as a 
non-directional logic, but associating atomic types with 
string position pairs (i.e. spans) to handle word order, 
is used in Pareschi (1988), although in that approach all 
string positions instantiate to values on a single ordering 
(i.e. integers 0 - n for a string of length n), which is not 
sufficient for Lambek calculus deductions. 
between residue formulae and excised hypothet- 
icals (one where both the residue and hypothet- 
ical record the dependency). For this proced- 
ure, the 'atomic type plus span label' units that 
result from the previous stage are treated as 
atomic units. The procedure T is defined by the 
cases shown in Figure 2 (although the method is 
perhaps best understood from the example also 
shown there). Its input is a pair (T, t), T a span 
labelled formula, t its associated term. 5 
This procedure simplifies higher-order formu- 
lae to first-order ones in the manner already dis- 
cussed, and records dependencies between hy- 
pothetical and residue formulae using the in- 
dexing scheme. Assuming the antecedents of 
our example X/(Y/Z),W, (W\Y)/Z ~ X, to 
have terms 81,82,83 respectively, compilation 
yields results as in the example in Figure 2. The 
higher-order X/(Y/Z) yields two output formu- 
lae: the main residue X/Y and the hypothetical 
Z, with the dependency between the two indic- 
ated by the common index 1 in the argument 
index set of the former and the principal index 
set of the latter. The empty sets elsewhere in- 
dicate the absence of such dependencies. 
The final stage of the conversion process 
converts the results of the second phrase into 
SLMG productions. The method will be ex- 
plained by example. For a functor such 
as B\(((A\X)/D)/C), we can easily pro- 
ject the sequence of arguments it requires: 
5Note that the "+" of (A + F) in (TO) simply pairs 
together the single compiled formula A with the set F of 
compiled formulae, where A is the main residue of the 
input formula and F its derived hypotheticals. 
467 
Method: 
(Tla) 
Q-lb) 
(~-2a) 
(v2b) 
(v3a) 
T((T,t))=AUF where T((O,T,t))=A+F 
T((m,X/Y,t)) = T((m,X/(Y:O),t)) where Y has no index set 
as for (Tla) modulo directionality of connective 
T((m, Xa/(Y:ml), t)) = (m, X2/(Y:ml), Av.s) + F 
where Y atomic, T((m, X1, (tv))) = (re, X2, s) + F, v a fresh variable 
as for (T2a) modulo directionality of connective 
v((m,X/((Y/Z):rni),t)) = A + (B U F U A) 
where w, v fresh variables, i a fresh multiset index, m2 = i U rnl 
v((m, X/(Y:m2), Aw.t(Av.w))) = A + F, T((i, Z, v)) = B + A 
(~'3b)-(T3d) as for (T3a) modulo directionality of,connectives 
Example: 
T((X:(O-h)/(Y:(1-k)/Z:(h-k)), si)) = 
T((W:(1--2),s2)) = 
~(((W:(i-2)\Y:(i-j))/Z:(3-j), s3)) = 
(0, X:(O,h)/(Y:(1-k):{1}), Au.sl(Az.u)) 
({1},Z:(h-k)),z) } 
(q}, W:(1-2), s2) 
(~, ( (W:( i-2):O) \ Y:( i-j) ) / ( Z:( 3-j):O), AvAw.( sa v w) ) 
Figure 2: Phase 2 of conversion (first-order compilation) 
A,B,B\(((A\X)/D)/C),C,D =~ X. If the 
functor was the lexical category of a word w, it 
might be viewed as fulfilling a role akin to a PS 
rule such as X --+ A B w C D. For the present 
approach, with explicit span labelling, there is 
no need to include a rhs element to mark the 
position of the functor (or word) itself, so the 
corresponding production would be more akin 
to X -+ A B C D. For an atomic formula, the 
corresponding production will have an empty 
rhs, e.g. A --4 0 .6 
The left and right hand side units of SLMG 
productions all take the form Aim\] (i-j), where 
A is an atomic type, m is a set of indices (if 
m is empty, the unit may be written A\[\](i-j)), 
6Note that 0 is used rather than e to avoid the sug- 
gestion of the empty string, which it is not -- matters to 
do with the 'string' are handled solely within the span 
labelling. This point is reinforced by observing that the 
'string language' generated by a collection SLMG pro- 
ductions will consist only of (nonempty) sequences of 
0's. The real import of a SLMG derivation is not its ter- 
minal Yield, but rather the instantiation of span labels 
that it induces (for string matters), and its structure (for 
semantic matters). 
and (i-j) a span label. For a formula (m, T, t) 
resulting after first-order compilation, the rhs 
elements of the corresponding production cor- 
respond to the arguments (if any) of T, whereas 
its lhs combines the result type (plus span) of 
T with the multiset m. For our running ex- 
ample X/(Y/Z), W, (W\Y)/Z =~ X, the formu- 
lae resulting from the second phase (by first- 
order compilation) give rise to productions as 
shown in Figure 3. The associated semantic 
term for each rule is intended to be applied to 
the semantics if its daughters in their left-to- 
right order (which may require some reordering 
of the outermost lambdas c.f. the terms of the 
first-order formulae, e.g. as for the last rule). 
A sequent X1...Xn =~ Xo is proven if we 
can build a SLMG tree with root X0\[\](0-n) in 
which the SLMG rules derived from the ante- 
cedents are each used precisely once, and which 
induces a consistent binding over span variables. 
For our running example, the required deriva- 
tion, shown below, yields the correct interpret- 
ation Sl(AZ.S3 z s2). Note that 'linear resource 
use', i.e. that each rule must be used precisely 
468 
Example: 
(0, X:(O-h)/(Y:(1-k):{1}), Au.sl(Az.u)) 
({1}, Z:(h-k)), z) 
(O, W:(1-2), s2) 
X\[\](0-h) --+ Y\[1\](1-k) 
Z\[1\](h-k) 0 : z 
W\[\](1-2) 0 : s2 
(0, ( (W:(i-2):O)\Y:(i-j) )/( Z:(3-j):O), AvAw.(s3 v w)) 
Y\[\](i-j) --+ W\[\](i-2) Z\[\](3-j) : 
:  u.sl( z.u) 
 w v.(s3 v 
Figure 3: Phase 3 of conversion (converting to SLMG productions) 
once, is enforced by the span labelling scheme 
and does not need to be separately stipulated. 
Thus, the span (0-n) is marked on the root of 
the derivation. To bridge this span, the main 
residues of the antecedent formulae must all 
participate (since each 'consumes' a basic sub- 
span of the main span) and they in turn require 
participation of their hypotheticals via the in- 
dexing scheme. 
x\[\](o-3) 
I 
Y\[ll(1-k) 
w\[\](1-2) Z\[ll(3-k) 
I I 
0 0 
4.2 The Earley-style Parsing Method 
The chart parsing method to be presented 
is derived from the Earley-style DTG pars- 
ing method of Rambow et al. (1995), and 
in some sense both simplifies and complicates 
their method. In effect, we abstract from their 
method a simpler one for Eaxley-style parsing of 
{}-LIG (which is a simpler formalism than the 
Linear Prioritized Multiset Grammar (LPMG) 
into which they compile DTG), and then ex- 
tend this method to handle the span labelling 
of SLMG. A key differences of the new approach 
as compared to standard chart methods is that 
the usual external notion of span is dispensed 
with, and the combination of edges is instead re- 
girnented in terms of the explicit span labelling 
of categories in rules. The unification of span 
labels requires edges to carry explicit binding 
information for span variables. We use R to de- 
note the set of rules derived from the sequent, 
and E the set of edges in the chart. The general 
form of edges is: ((ml, m2), 9, r, (A ~ F * A)) 
where (~4 ~ F,A) E R, 0 is a substitution 
over span variables, r is a restrictor set identi- 
fying span variables whose values are required 
non-locally (explained below), and ml, m2 are 
multisets. In a {}-LIG or SLMG tree, there is 
no restriction on how the multiset indices associ- 
ated with any non-terminal node can be distrib- 
uted amongst its daughters. Rather than cash- 
ing out the possible distributions as alternative 
edges in the predictor step, we can instead, in 
effect, 'thread' the multiset through the daugh- 
ters, i.e. passing the entire multiset down to 
the first daughter, and passing any that are not 
used there on to the next daughter, and so on. 
For an edge ((ml, m2), 19, r, (A ~ F * A)), ml 
corresponds to the multiset context at the time 
the ancestor edge with dotted rule (,4 -+ .FA) 
was introduced, and m2 is the current multiset 
for passing onto the daughters in A. We call ml 
the initial multiset and m2 the current multiset. 
The chart method employs the rules shown in 
Figure 4. We shall consider each in turn. 
Initialisation: 
The rule recorded on the edge in this chart rule 
is not a real one (i.e. ~ R), but serves to drive 
the parsing process via the prediction of edges 
for rules that can derive X0\[\](1-n). A success- 
ful proof of the sequent is shown if the com- 
pleted chart contains an inactive edge for the 
special goal category, i.e. there is some edge 
((0,0),0,0, (GOAL\[\](,-.) --+ h.)) E E 
Prediction: 
The current multiset of the predicting edge is 
passed onto the new edge as its initial multiset. 
The latter's current multiset (m6) may differ 
from its initial one due either to the removal of 
an index to license the new rule's use (i.e. if 
469 
Initialisation: 
if the initial sequent is X 1 ... X n :=~ Z 0 
then ((O,O),$,O,(GOAL\[\](*-*) --4 .Xo\[\](1-n))) • E 
Prediction: 
ff ((ml,m2),Ol,rl,(A\[m3\](e-f) -+ r. B\[m4\](g-h), A)) • E 
and (B\[rnh\](i-j) --+ A) • R 
then ((m2, m6),O2,r2, (B\[m5\](g-(hO)) -~ .(A0))) • E 
where O=81+MGU((g-h),(i-j)) ; m5 Cm2Um4 ; m6 = (m2t2m4)-m5 
r2 = nlv(m2 \[_J m4) ; 82 = 0/(r2 U dauglnlv(A)) 
Completer: 
if ((ml,rr~2),Ol,rl,(A\[m3\](f-g) --+ F . B\[m4\](i-h),A)) E E 
and ((m2, ms), 02, r2, (B\[m6\](i-j) -4 A*)) E E 
then ((ml, ms), 03, rl, (A\[m3\](f -(gO)) -~ F, B\[m4\](i-j) * (A0))) E E 
where O=01+02+MGU(h,j) ; mhCrn2 ; m6C_m2Um4 ; 
03 = O/(rl U dauglnlv(A)) 
Figure 4: Chart rules 
m5 is non-empty), or to the addition of indices 
from the predicting edge's next rhs unit (i.e. if 
ma is non-empty). (Note the 'sloppy' use of set, 
rather than explicitly multiset, notation. The 
present approach is such that the same index 
should never appear in both of two unioned sets, 
so there is in practice little difference.) 
The line 0 = 01 + MGU((g-h), (i-j)) checks 
that the corresponding span labels unify, and 
that the resulting MGU can consistently aug- 
ment the binding context of the predicting edge. 
This augmented binding is used to instantiate 
span variables in the new edge where possible. 
It is a characteristic of this parsing method, 
with top-down left-to-right traversal and associ- 
ated propagation of span information, that the 
left span index of the next daughter sought by 
any active edge is guarenteed to be instantiated, 
i.e. g above is a constant. 
Commonly the variables appearing in SLMG 
rules have only local significance and so their 
substitutions do not need to be carried around 
with edges. For example, an active edge might 
require two daughters B\[\](g-h) C\[\](h-i). A 
substitution for h that comes from combin- 
ing with an inactive edge for B\[\](g-h) can 
be immediately applied to the next daughter 
C\[\](h-i), and so does not need to be carried 
explicitly in the binding of the resulting edge. 
However, a situation where two occurrences of 
a variable appear in different rules may arise 
as a result of first-order compilation, which will 
sometimes (but not always) separate a variable 
occurrence in the hypothetical from another in 
the residue. For the rule set of our running ex- 
ample, we find an occurrence of h in both the 
first and second rule (corresponding to the main 
residue and hypothetical of the initial higher- 
order functor). The link between the two rules is 
also indicated by the indexing system. It turns 
out that for each index there is at most one vari- 
able that may appear in the two rules linked 
by the index. The identity of the 'non-local 
variables' that associate with each index can 
be straightforwardly computed off the SLMG 
grammar (or during the conversion process). 
The function nfvreturns the set of non-local 
variables that associate with a multiset of in- 
dices. The line r2 = nlv(m2 12 m4) computes 
the set of variables whose values may need to 
470 
be passed non-locally, i.e. from the predicting 
edge down to the predicted edge, or from an 
inactive edge that results from combination of 
this predicted edge up to the active edge that 
consumes it. This 'restrictor set' is used in redu- 
cing the substitution 8 to cover only those vari- 
ables whose values need to be stored with the 
edge. The only case where a substitution needs 
to be retained for variable that is not in the re- 
strictor set arises regarding the next daughter 
it seeks. For example, an active edge might 
require two daughters B\[\](g-h) C\[1\](k-i), 
where the second's index links it to a hypo- 
thetical with span (k-h). Here, a substitution 
for h from a combination for the first daughter 
cannot be immediately applied and so should 
be retained until a combination is made for the 
second daughter. The function call dauglnlv(A) 
returns the set of non-local variables associated 
with the multiset indices of the next daugh- 
ter in A (or the empty set if A is empty). 
There may be at most one variable in this set 
that appears in the substitution 8. The line 
82 = 8/(r2 U dauglnlv(A)) reduces the substi- 
tution to cover only the variables whose values 
need to be stored. Failing to restrict the substi- 
tution in this way undermines the compaction 
of derivations by the chart, i.e. so that we find 
edges in the chart corresponding to the same 
subderivation, but which are not recognised as 
such during parsing due to them recording in- 
compatible substitutions. 
Completer: 
Recall from the prediction step that the pre- 
dicted edge's current multiset may differ from 
its initial multiset due to the addition of indices 
from the predicting edge's next rhs unit (i.e. m4 
in the prediction rule). Any such added indices 
must~be 'used up' within the subderivation of 
that rhs element which is realised by the com- 
binations of the predicted edge. This require- 
ment is checked by the condition m5 C_ m2. 
The treatment of substitutions here is very 
much as for the prediction rule, except that both 
input edges contribute their own substitution. 
Note that for the inactive edge (as for all inact- 
ive edges), both components of the span (i-j) 
will be instantiated, so we need only unify the 
right index of the two spans -- the left indices 
can simply be checked for atomic identity. This 
observation is important to efficient implement- 
ation of the algorithm, for which most effort is in 
practice expended on the completer step. Act- 
ive edges should be indexed (i.e. hashed) with 
respect to the (atomic) type and left span index 
of the next rhs element sought. For inactive 
edges, the type and left span index of the lhs 
element should be used. For the completer step 
when an active edge is added, we need only ac- 
cess inactive edges that are hashed on the same 
type/left span index to consider for combina- 
tion, all others can be ignored, and vice versa 
for the addition of an inactive edge. 
It is notable that the algorithm has no scan- 
ning rule, which is due to the fact that the po- 
sitions of 'lexical items' or antecedent categor- 
ies are encoded in the span labels of rules, and 
need no further attention. In the (Rambow et 
hi., 1995) algorithm, the scanning component 
also deals with epsilon productions. Here, rules 
with an empty rhs are dealt with by prediction, 
by allowing an edge added for a rule with an 
empty rhs to be treated as an inactive edge (i.e. 
we equate "() -" and ". ()"). 
If the completed chart indicates a successful 
analysis, it is straightforward to compute the 
proof terms of the corresponding natural deduc- 
tion proofs, given a record of which edges were 
produced by combination of which other edges, 
or by prediction from which rule. Thus, the 
term for a predicted edge is simply that of the 
rule in R, whereas a term for an edge produced 
by a completer step is arrived at by combining a 
term of the active edge with one for the inactive 
edge (using the special substitution operation 
that allows 'accidental binding' of variables, as 
discussed earlier). Of course, a single edge may 
compact multiple alternative subproofs, and so 
return multiple terms. Note that the approach 
has no problem in handling multiple lexical as- 
signments, they simply result in multiple rules 
generated off the same basic span of the chart. 
5 Efficiency and Complexity 
The method is shown to be non-polynomial by 
considering a simple class of examples of the 
form X1,...Xa-I,a =~ a, where each Xi is 
a/(a/(a\a)). Each such Xi gives a hypothetical 
whose dependency is encoded by a multiset in- 
dex. Examination of the chart reveals spans for 
which there are multiple edges, differing in their 
'initial' multiset (and other ways), there being 
471 
xolal(xll(ala)),xll(x21(ala)),x21(ala),ala, ala, ala, ala, ala, a xo 
Figure 5: Example for comparison of methods 
one for edge for each subset of the indices deriv- 
ing from the antecedents XI,... Xn-2, i.e. giv- 
ing 2 ('~-2) distinct edges. This non-polynomial 
number of edge results in non-polynomial time 
for the completer step, and in turn for the al- 
gorithm as a whole. Hence, this approach does 
not resolve the open question of the polynomial 
time parsability of the Lambek calculus. In- 
formally, however, these observations are sug- 
gestive of a possible locus of difficulty in achiev- 
ing such a result. Thus, the hope for polyno- 
mial time parsability of the Lambek calculus 
comes from it being an ordered 'list-like' sys- 
tem, rather than an unordered 'bag-like' sys- 
tem, but in the example just discussed, we ob- 
serve 'bag-like' behaviour in a compact encoding 
(the multiset) of the dependencies of hypothet- 
ical reasoning. 
We should note that the DTG parsing 
method of (Rambow et al., 1995), from which 
the current approach is derived, is polynomial 
time. This follows from the fact that their com- 
pilation applies to a preset DTG, giving rise to 
a fixed maximal set of distinct indices in the 
LPMG that the compilation generates. This 
fixed set of indices gives rise to a very large, 
but polynomial, worst-case upper limit on the 
number of edges in a chart, which in turn yields 
a polynomial time result. A key difference for 
the present approach is that our task is to parse 
arbitrary initial sequents, and hence we do not 
have the fixed initial grammar that is the basis 
of the Rambow et al. complexity result. 
For practical comparison to the previous 
Lambek chart methods, consider the highly am- 
biguous artificial example shown in Figure 5, 
(which has six readings). KSnig (1994) reports 
that a Prolog implementation of her method, 
running on a major workstation produces 300 
edges in 50 seconds. A Prolog implementation 
of the current method, on a current major work 
station, produces 75 edges in less than a tenth 
of a second. Of course, the increase in comput- 
ing power over the years makes the times not 
strictly comparable, but still a substantial speed 
up is indicated. The difference in the number 
of edges suggests that the KSnig method is sub- 
optimal in its compaction of alternative deriva- 
tions. 

References 
van Benthem, J. 1991. Language in Ac- 
tion: Categories, Lamdas and Dynamic Lo- 
gic. Studies in Logic and the Foundations of 
Mathematics, vol 130, North-Holland, Ams- 
terdam. 
Hepple, M. 1992. ' Chart Parsing Lambek 
Grammars: Modal Extensions and Incre- 
mentality', Proc. of COLING-92. 
Mark Hepple. 1996. 'A Compilation-Chart 
Method for Linear Categorial Deduction.' 
Proc. COLING-96, Copenhagen. 
Hepple, M. 1998. 'On Some Similarities 
Between D-Tree Grammars and Type-Logical 
Grammars.' Proc. Fourth Workshop on Tree- 
Adjoining Grammars and Related Frame- 
works. 
KSnig, E. 1990, 'The complexity of parsing 
with extended categorial grammars', Proc. o\] 
COLING-90. 
Esther K5nig. 1994. 'A Hypothetical Reas- 
oning Algorithm for Linguistic Analysis.' 
Journal of Logic and Computation, Vol. 4, 
No 1. 
Lambek, J. 1958. 'The mathematics of sentence 
structure.' American Mathematical Monthly 
65. 154-170. 
Morrill, G. 1995. 'Higher-order Linear Logic 
Programming of Categorial Dedution', Proc. 
o/EA CL-7, Dublin. 
Morrill, G. 1996. 'Memoisation for Categorial 
Proof Nets: Parallelism in Categorial Pro- 
cessing.' Research Report LSI-96-24-R, Uni- 
versitat Polit~cnica de Catalunya. 
Pareschi, R. 1988. 'A Definite Clause Version 
of Categorial Grammar.' Proc. 26th A CL. 
Rambow, O. 1994. 'Multiset-valued linear index 
grammars.' Proc. A CL '94. 
Rambow, O., Vijay-Shanker, K. & Weir, D. 
1995a. 'D-Tree Grammars.' Proc. ACL-95. 
Rambow, O., Vijay-Shanker, K. & Weir, D. 
1995b. 'Parsing D-Tree Grammars.' Proc. 
Int. Workshop on Parsing Technologies. 
