Completeness Conditions for Mixed 
Strategy Bidirectional Parsing 
Graeme Ritchie* 
University of Edinburgh 
It has been suggested that, in certain circumstances, it might be useful for a grammar writer 
to annotate which rules are to be used bottom-up and which are to be used top-down within a 
parser, using a bidirectional variant of the active chart parsing technique. The formal properties 
of such systems have not been fully explored. One limitation of this mixed strategy technique is 
that certain annotations of rules can lead to incompleteness; that is, there may be valid analyses of 
the input string that cannot be found by the parser. We formalize a fairly natural notion of mixed 
strategy bidirectional parsing for context-free grammars, in which one or more symbols within a 
rule may be annotated as "triggers," so that the rule is either top-down (triggered from its left- 
hand side), or bottom-up (triggered from element(s) of its right-hand side). We define a decidable 
property of annotated grammars, such that any grammar with this property is provably complete. 
There are, however, some complete annotations of grammars that fall outside this decidable class. 
We show that membership of this wider class is undecidable. These results suggest that the mixed 
strategy approach is of rather limited usefulness, regardless of whether it is empirically efficient 
or not. 
1. Overview 
Many methods have been explored for parsing context-free grammars; some of these 
methods are loosely categorized as "top-down" (e.g., recursive descent), some as 
"bottom-up" (e.g., shift-reduce), and some could be seen as a mixture of these two 
varieties (e.g., left-corner). All of the well-explored methods assume that the rules in 
the grammar are handled in a fairly uniform way. In particular, it is not usual for the 
rules to be separated into two classes--those to be used bottom-up and those to be 
used top-down. Steel and de Roeck (1987) argue (giving credit to Henry Thompson 
for some of the ideas) that the performance of a parser could be improved by allowing 
the grammar writer to do exactly this. The motivation comes from linguistic phenom- 
ena where it is intuitively clear that one symbol (linguistic category) in the rule is 
noticeably more distinctive than others, so that a parser should not waste time trying 
to match the rule unless that distinctive element is there. For example, a rule such as 
$NP \to NP\ CONJ\ NP$ (where CONJ indicates a conjunction, such as and) should not be
invoked simply because a noun phrase (NP), or the start of a noun phrase, has been 
found. The proposal is that if the linguist is allowed to mark the CONJ element as a 
"trigger," and the parser introduces the rule, bottom-up, only if the trigger has been 
matched, then parsing would proceed more efficiently. 
Steel and de Roeck describe semi-formally a system they have implemented, which 
they claim benefits from this labeling of rules. The current paper does not take a 
position on the wisdom or effectiveness of such labeling. Instead, we explore the 
* Division of Informatics, 80 South Bridge, Edinburgh EH1 1HN, Scotland. 
© 1999 Association for Computational Linguistics
Computational Linguistics Volume 25, Number 4 
formal consequences of this proposal. We show that, although the idea may seem 
superficially plausible, it still has certain formal limitations in the area of completeness 
and decidability. The proofs may be of some theoretical interest from a formal language 
viewpoint. 
The central ideas are as follows: A conventional context-free grammar is "anno- 
tated" by marking at least one symbol in each rule as a trigger. Marking the left-hand- 
side (LHS) symbol as a trigger indicates that the rule can be used top-down; marking 
a right-hand-side (RHS) symbol as a trigger means that the rule can be used bottom-up
whenever a constituent labeled with that symbol is found by the parser.¹ Using a
method of parsing known as active chart parsing, it is straightforward to give a precise 
meaning to this labeling of rules, since a chart parser can operate either bottom-up or 
top-down. The scheme examined here is similar to, but different in important ways 
from, head-driven parsing (see Section 7.2). 
It is simple to construct an annotated grammar in which there are some analyses 
that are valid according to the original (unannotated) grammar but that would not 
be parsed by a chart parser following the annotations. This establishes that not all 
annotated grammars allow complete parsing. 
The main substance of this paper is as follows: A property of annotated grammars 
(direct analyzability) is defined, which is decidable, and it is proven that any annotated 
grammar with this property will also allow the parser to produce all the valid analyses 
licensed by the original grammar. However, some annotated grammars are not directly 
analyzable, but nevertheless lead to complete parsing. A characteristic of (a subset of) 
this wider class of annotated grammars (indirect analyzability) is defined, and it is 
proven that any annotated grammar with this property will allow complete parsing. 
However, indirect analyzability can be shown to be undecidable. 
2. The Problems
2.1 Losing Completeness 
Before presenting a formal definition of the mechanisms, and proceeding to prove their 
various properties, it is useful to consider informally a very simple example that shows 
how this approach can lead to loss of analyses by the parser. As outlined above, the 
central idea is to allow different rules to be marked as either top-down (LHS trigger 
symbol) or bottom-up (RHS trigger symbol(s)), or both. Top-down means that the rule 
can be invoked only if some other rule has established a need for its LHS symbol (or 
if the LHS symbol is the initial symbol of the grammar). Bottom-up means that the 
rule can be invoked only if one of the symbols marked as triggers on its RHS has 
been completely parsed. We shall assume that rules of the form $A \to w$ where $w$ is a
terminal symbol are never annotated, and can be used whenever needed in the parser 
(all this is made precise in our formalization in Section 3.3 below). 
For this informal presentation, and occasionally elsewhere, we shall mark a trigger 
symbol $A$ by overlining it, thus: $\overline{A}$. In the illustrative examples, the distinguished
(initial) symbol of the grammar will always be S and terminal symbols will be in 
lower case. 
1 The term "bottom-up" is adopted here for compatibility with some other literature on chart parsing, and
for lack of a better simple phrase. In fact, there are various possible parsing regimes that are in some 
sense "bottom-up," and it is arguable that some are "more bottom-up" than those outlined here. Where 
right-hand-side triggers are restricted to the leftmost symbol (as in Section 5 below), parsing is more like "left-corner" parsing, but this would be a misleading term when triggers are allowed elsewhere. 
Consider the annotated grammar (see Section 3 for a definition of grammar): 
$S \to \overline{NP}\ VP$
$\overline{NP} \to Art\ N$
$VP \to runs$
$Art \to the$
$N \to dog$
It should be intuitively clear that although this grammar generates exactly one sentence,
that string cannot be parsed by a parser that follows the annotations as described.
The rule $S \to NP\ VP$ cannot be used until an initial NP is recognized, and the
rule that might do that, $NP \to Art\ N$, cannot be used until an initial need for an NP
is established (which could happen only using $S \to NP\ VP$). There is a form of deadlock,
resulting in incompleteness. It should also be intuitively clear that the presence
or absence of such combinations of annotations may not be as obvious as it is here. 
In a grammar with hundreds of rules, the presence of a combination that blocks an 
otherwise valid analysis could take some detailed checking. This is a serious flaw, as 
the annotation method was supposed to alter the efficiency of the parser, but not to 
eliminate strings from its language. 
It would be very easy to ensure that annotation does not lose analyses, by stipu- 
lating that all rules are marked as top-down, or that all rules are marked as bottom-up 
with the leftmost symbol as a trigger. The parser would then behave as a conventional 
top-down (or bottom-up) chart parser, which is known to be complete. However, since 
the aim is to allow the grammar writer to make a nontrivial annotation of the grammar 
(in an attempt to allow linguistic knowledge to influence the efficiency of the parsing), 
we need to be able to check the completeness of arbitrarily annotated grammars. In 
Section 4 below, we define formally a nontrivial characteristic of annotated grammars 
that guarantees that they do not lose analyses in this way, and show that this property 
of grammars is decidable. 
2.2 Completeness through Interactions 
The situation is even more complicated than Section 2.1 above indicates. One of the 
crucial aspects of chart parsing (which is central to its simplicity and its efficiency) 
is that any entry in a chart can be used to combine with any other compatible entry, 
regardless of whether there is a single coherent tree that will result from it. In particular, 
an entry that has been inserted in the chart as the result of some rule interaction that 
does not itself produce a complete sentential tree (i.e., a partial fragment of an analysis) 
can contribute to some other analysis that happens to require it. 
This is best demonstrated by a simple artificial example. Consider the strategy- 
marked grammar, notation as before: 
$\overline{S} \to E\ H$
$H \to \overline{B}\ F$
$\overline{B} \to P\ Q$
$E \to j$
$P \to l$
$Q \to m$
$F \to k$
The un-strategy-marked version of this grammar would generate the string jlmk, with 
a derivation as follows (see Section 3 for a definition of the relation "$\Rightarrow$"):
$S \Rightarrow EH \Rightarrow jH \Rightarrow jBF \Rightarrow jPQF \Rightarrow jlQF \Rightarrow jlmF \Rightarrow jlmk$
The tree described by this derivation cannot be found by a parser following the 
strategy-marked grammar, for reasons similar to those outlined in Section 2.1 above. 
Suppose we now add the following rules to the grammar: 
$S \to C\ D$
$D \to \overline{E}\ A$
$\overline{A} \to B\ C$
$C \to x$
This larger grammar will also generate the string xjlmx, but this is not relevant to the 
argument. What is more interesting is that the extended grammar does now allow the 
parsing of jlmk, with an associated syntax tree that corresponds to the derivation given 
above (i.e., a tree that makes no direct use of the rules that have been added to the 
grammar). The way in which the added rules act as a "catalyst" to allow the hitherto 
blocked analysis is an example of a general phenomenon. Informally, what happens is 
the following: (A chart parser is assumed here; formal details are given in Section 3.3 
below.) With just the smaller grammar, the nonterminal H cannot be expanded as 
required, since it is on the LHS of a bottom-up rule, and its first symbol B cannot be 
recognized because it requires a top-down rule. In both the original grammar and the 
larger grammar, H is introduced only by the rule $S \to E\ H$, i.e., with E on its immediate
left. So the only strings where H can participate in an analysis are those where E occurs
at the start. Consider the parsing, with the larger grammar, of the string jlmk (which
does indeed start with an E). As E is preterminal, it can be recognized directly (with no
effect from annotations). In the larger grammar, the bottom-up rule $D \to E\ A$ is then
introduced to the parsing, which creates a predictive entry in the parser's structures
seeking an A, after the recognized E. The top-down rule $A \to B\ C$ is then introduced,
which leads to an entry, at that same point, seeking a B. This causes the top-down
rule $B \to P\ Q$ (from the original grammar) to be introduced; this is a crucial step. This
allows the sequence lm to be parsed as a B, thereby causing the introduction of the
bottom-up rule $H \to B\ F$, and the subsequent success of the parse.
2.3 What Are the Problems? 
The grammars discussed above (Sections 2.1 and 2.2) are examples of various aspects 
of the problem. We shall show that there is a simple, decidable property of annotated 
grammars that guarantees completeness, and that could be used to detect the sim- 
ple blocking illustrated in Section 2.1. However, this property is merely a sufficient 
condition for completeness, as the larger grammar of Section 2.2 above does not pos- 
sess it, despite being complete. We shall show that the larger grammar of Section 2.2 
has a more general property, which also guarantees completeness. However, the more 
general property of annotated grammars is undecidable. 
First, we have to set up the basic formal mechanisms for our definitions. 
3. Trees, Grammars, and Charts 
3.1 Basic Concepts and Terms 
We adopt the standard concepts for syntax trees (see Aho and Ullman [1972, Sect. 0.5]
or Partee, ter Meulen, and Wall [1990, Chap. 16] for possible approaches to formalization).
A syntax tree is a rooted, ordered, labeled tree. Each node apart from the root
has exactly one mother node, and each nonterminal node has one or more daughter 
nodes. A tree is said to span the sequence of labels associated with the sequence of 
its terminal nodes (in left-to-right order), and we shall also say that the root node of 
a (sub)tree spans its sequence of terminal nodes. 
Definition 1 
The height of a node in a tree is defined as follows. A terminal node has height 0; a 
nonterminal node has height = (1 + maximum height of its daughter nodes). 
Definition 2 
The depth of a node in a tree is defined as follows. A root node has depth 0; a nonroot 
node has depth = (1 + depth of its parent node). 
Following the usual conventions (e.g., Aho and Ullman 1972), we will take a
context-free grammar (CFG) to be a quadruple $(V_N, V_T, P, S)$, consisting of a set $V_N$
of nonterminal symbols, a set $V_T$ of terminal symbols, a set $P$ of rules (productions),
and a single distinguished symbol $S \in V_N$. Set theoretically, rules can be regarded as
being ordered pairs where the first element is a nonterminal symbol and the second
is a tuple of symbols, i.e., of the form $(A_0, (A_1, \ldots, A_k))$ where $k > 0$, but for ease of
exposition they will be written as
$A_0 \to A_1 \ldots A_k$
We will make the following simplifying assumptions (which do not lose generality):
1. Each rule in $P$ is either of the form $A_0 \to A_1 A_2 \ldots A_k$ with $k > 0$, where
all the symbols $A_i \in V_N$, or of the form $A \to w$ where $w \in V_T$.
2. The grammar has no redundant symbols, in the sense that no symbols
are "useless" or "inaccessible" as defined by Aho and Ullman (1972,
Sect. 2.4.2).
Rules of the form $A \to w$ where $w \in V_T$ will be referred to as lexical rules, and other
rules as nonlexical. A nonterminal A that appears in a lexical rule will be called 
preterminal, or lexical. 
Given a CFG G, a syntax tree based on G is a rooted, ordered tree whose nonterminal
nodes are labeled with elements of $V_N$ and whose terminal nodes are labeled
with elements of $V_T$. Those nodes immediately dominating terminal nodes will be
referred to as preterminal; other nonterminal nodes will be referred to as nonlexical.
Where a tree T spans a terminal string $a_1 \ldots a_n$, and M is a node within T that spans
$a_i \ldots a_k$, the start of M is the index $i - 1$, and the end of M is the index $k$.
A syntax tree based on $(V_N, V_T, P, S)$ is said to be well-formed with respect to
$(V_N, V_T, P, S)$ if for every nonterminal node with label $A_0$ and daughter nodes labeled
$A_1, \ldots, A_k$, there is a rule in $P$ of the form $A_0 \to A_1 \ldots A_k$; this rule is said to license
the node labeled $A_0$. For convenience, we shall distinguish between a tree that is
compatible with the rules of the grammar, and a tree that also spans a sentence. A
syntax tree is said to be generated by a grammar G iff:
1. The root node is labeled with S (the distinguished symbol).
2. The tree is well-formed w.r.t. G.
We will write trees(G) for the set of all trees generated by G. 
The conventional "rewrite" interpretation of CFGs will also be used in some situations
(Section 5 below). Given two strings $w_1, w_2$ from $(V_N \cup V_T)^*$, $w_1$ directly
derives $w_2$, written "$w_1 \Rightarrow w_2$," if $w_1 = \delta A \gamma$, $w_2 = \delta \alpha \gamma$ and $A \to \alpha$ is a rule in G.
Similarly, the relation derives, written "$w_1 \Rightarrow^* w_2$," is the reflexive transitive closure of directly
derives. A derivation is a sequence of symbol strings $w_1, \ldots, w_n$ such that $w_i \Rightarrow w_{i+1}$
for all $1 \le i < n$. A rightmost derivation is one in which each step from $w_i$ to $w_{i+1}$ is
made by replacing the furthest right nonterminal symbol in $w_i$ using some rule (i.e.,
$\gamma$ in the above definition of directly derives is entirely made up of terminal symbols)
(cf. Aho and Ullman 1972).
3.2 Annotated Grammars 
Since we are allowing trigger elements of a rule to occur anywhere on the RHS of a rule, 
it is necessary to allow the parser to explore outwards in either direction (leftwards 
or rightwards) from a constituent that has been parsed. Hence the parsing schemes 
defined below are referred to as bidirectional, to reflect this fact. This does not allude 
to the two "directions" of top-down or bottom-up. 
Definition 3 
Let G be a context-free grammar $(V_N, V_T, P, S)$. A bidirectional strategy marking of
G is a (total) function $tr$ from the nonlexical rules in $P$ to $\mathcal{P}(\mathbb{N})$ (the set of sets of
nonnegative integers) such that for any rule $r$ of the form $A_0 \to A_1 \ldots A_k$:
1. $tr(r) \neq \emptyset$
2. $0 \le i \le k$ for every $i \in tr(r)$
Informally, $tr$ indicates which element(s) of the rule can trigger it. If $0 \in tr(r)$, the
LHS of the rule is a trigger; that is, it can be used top-down. If $j \in tr(r)$, where $j > 0$,
then element j of the RHS can act as a trigger, bottom-up. The value of tr(r) is a set of 
integers in order to allow a rule to have more than one possible trigger; in particular, 
it is allowable for a rule to be used either top-down or bottom-up. 
Definition 4 
A bidirectionally strategy-marked context-free grammar (BSCFG) is a pair (G, tr)
where G is a CFG and tr is a bidirectional strategy marking of G. 
Definition 5 
Let $((V_N, V_T, P, S), tr)$ be a bidirectionally strategy-marked context-free grammar. Then
a rule $r \in P$ is said to be:
1. top-down, if $0 \in tr(r)$.
2. bottom-up, if there is an $i > 0$ such that $i \in tr(r)$.
3. purely bottom-up, if $0 \notin tr(r)$.
4. purely top-down, if $tr(r) = \{0\}$.
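A minimal sketch of Definitions 3 and 5 follows. It assumes, purely for illustration, that a rule is a pair `(lhs, rhs_tuple)`, that terminal symbols are lowercase strings, and that a marking is a dictionary from nonlexical rules to sets of integers; the function names are hypothetical.

```python
def is_lexical(rule):
    """Assumed convention: a lexical rule has a single lowercase (terminal) RHS symbol."""
    lhs, rhs = rule
    return len(rhs) == 1 and rhs[0].islower()


def is_strategy_marking(rules, tr):
    """Definition 3: tr is total on nonlexical rules, nonempty, and within 0..k."""
    for rule in rules:
        if is_lexical(rule):
            continue
        triggers = tr.get(rule)
        if not triggers:
            return False
        if any(not 0 <= i <= len(rule[1]) for i in triggers):
            return False
    return True


def classify(rule, tr):
    """Definition 5: the set of descriptions that apply to a nonlexical rule."""
    triggers = tr[rule]
    labels = set()
    if 0 in triggers:
        labels.add("top-down")
    if any(i > 0 for i in triggers):
        labels.add("bottom-up")
    if 0 not in triggers:
        labels.add("purely bottom-up")
    if triggers == {0}:
        labels.add("purely top-down")
    return labels


# The annotated grammar of Section 2.1: NP triggers the S rule, and the
# NP rule is triggered from its left-hand side.
S_RULE = ("S", ("NP", "VP"))
NP_RULE = ("NP", ("Art", "N"))
RULES = [S_RULE, NP_RULE, ("VP", ("runs",)), ("Art", ("the",)), ("N", ("dog",))]
TR = {S_RULE: {1}, NP_RULE: {0}}

print(is_strategy_marking(RULES, TR))  # True
print(sorted(classify(S_RULE, TR)))    # ['bottom-up', 'purely bottom-up']
print(sorted(classify(NP_RULE, TR)))   # ['purely top-down', 'top-down']
```

A rule whose trigger set contains both 0 and some positive index is both top-down and bottom-up, but neither of the two "purely" classes applies to it.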
3.3 Active Charts 
The techniques and structures known as active charts have been in use for parsing 
(at least in the area of natural language processing) since the early 1970s. The method 
is a generalization of Earley's algorithm (Earley 1970), and tutorial expositions of the 
ideas can be found in Thompson and Ritchie (1984) or Winograd (1983). In keeping 
with more recent presentations (e.g., Shieber, Schabes, and Pereira 1995; Sikkel and op 
den Akker 1996) we define the parsing principles as well-formedness conditions on 
complete charts, abstracting away from the sequence of steps used to build them. 
Definition 6 
Given a CFG G of the form $(V_N, V_T, P, S)$ a double-dotted rule based on G is a triple
$(p, l, r)$ where $p$ is a rule in $P$ of the form $A_0 \to A_1 \ldots A_k$ and $l, r$ are integers such that
$0 \le l \le r \le k$.
Such a rule will be written as:
$A_0 \to A_1 \ldots A_l \bullet A_{l+1} \ldots A_r \bullet A_{r+1} \ldots A_k$
for ease of exposition and similarity to previous literature. Where either $l = 0$ or $r = k$,
the empty portions will be omitted from the expression.
Definition 7 
Given a CFG $G = (V_N, V_T, P, S)$, an edge based on G is a triple $(i, j, d)$ where $i$ and $j$
are nonnegative integers with $i \le j$, and $d$ is a double-dotted rule based on G.
An edge is said to be lexical or nonlexical according to whether or not the rule is
lexical. An edge of the form $(i, j, A_0 \to A_1 \ldots A_{q-1} \bullet A_q \ldots A_p \bullet A_{p+1} \ldots A_k)$ where either
$q > 1$ or $p < k$ (i.e., with a nonempty component at either end) is referred to as
an active edge, and an edge of the form $(i, j, A_0 \to \bullet A_1 \ldots A_k \bullet)$ is an inactive edge.
An active edge $(i, i, A_0 \to \bullet \bullet A_1 \ldots A_k)$ or $(i, i, A_0 \to A_1 \ldots A_k \bullet \bullet)$ is referred to as an
empty active edge. (Sometimes it will be referred to as "an empty active edge for
$A_0 \to A_1 \ldots A_k$.")
Definition 8 
Given a CFG $G = (V_N, V_T, P, S)$ and a string $a_1, \ldots, a_n$ from $V_T^*$, a chart based on
$a_1, \ldots, a_n$ and using G is a set $C$ of edges based on G that meets the following conditions:
1. for every $(i, j, r) \in C$, $i \in \{0, \ldots, n\}$ and $j \in \{0, \ldots, n\}$
2. for $a_i \in V_T$, $(i - 1, i, L \to \bullet a_i \bullet) \in C$ iff $a_i \in \{a_1, a_2, \ldots, a_n\}$ and $L \to a_i \in P$.
The terminology of the last three definitions will also be used for a BSCFG (G, tr). 
Definition 9 
Let G be a CFG, and let C be a chart based on a string $\sigma$ and using G. C is said to be
bidirectionally resolved iff both the following conditions hold:
1. Left Extension: For every pair of edges:
$(i, j, A_0 \to \bullet A_1 \ldots A_m \bullet)$
$(j, k, B_0 \to B_1 \ldots B_q \bullet B_{q+1} \ldots B_p \bullet B_{p+1} \ldots B_v)$
where $p \le v$, $q > 0$ and $A_0 = B_q$, there is also an edge:
$(i, k, B_0 \to B_1 \ldots B_{q-1} \bullet B_q \ldots B_p \bullet B_{p+1} \ldots B_v)$
2. Right Extension: For every pair of edges:
$(i, j, B_0 \to B_1 \ldots B_q \bullet B_{q+1} \ldots B_p \bullet B_{p+1} \ldots B_v)$
$(j, k, A_0 \to \bullet A_1 \ldots A_m \bullet)$
where $p < v$, $q \ge 0$ and $A_0 = B_{p+1}$, there is also an edge:
$(i, k, B_0 \to B_1 \ldots B_q \bullet B_{q+1} \ldots B_{p+1} \bullet B_{p+2} \ldots B_v)$
Lemma 1 
Let C be a bidirectionally resolved chart based on a string $\sigma$ and using a CFG G, and
suppose that C contains an edge of the form:
$(i, j, B_0 \to B_1 \ldots B_q \bullet B_{q+1} \ldots B_p \bullet B_{p+1} \ldots B_v)$
(i) (Rightwards) If C contains edges of the form:
$(i_{p+1}, j_{p+1}, B_{p+1} \to \bullet \omega_{p+1} \bullet)$
$\vdots$
$(i_{p+t}, j_{p+t}, B_{p+t} \to \bullet \omega_{p+t} \bullet)$
where $(p + t) \le v$, $i_{k+1} = j_k$ where $(p + 1) \le k < (p + t)$ and $i_{p+1} = j$, then C
also contains an edge of the form:
$(i, j_{p+t}, B_0 \to B_1 \ldots B_q \bullet B_{q+1} \ldots B_{p+t} \bullet B_{p+t+1} \ldots B_v)$
(ii) (Leftwards) If C contains edges of the form:
$(i_{q-t}, j_{q-t}, B_{q-t} \to \bullet \omega_{q-t} \bullet)$
$\vdots$
$(i_q, j_q, B_q \to \bullet \omega_q \bullet)$
where $0 \le t < q$, $i_{k+1} = j_k$ where $(q - t) \le k < q$ and $j_q = i$, then C also
contains an edge of the form:
$(i_{q-t}, j, B_0 \to B_1 \ldots B_{q-t-1} \bullet B_{q-t} \ldots B_p \bullet B_{p+1} \ldots B_v)$
Proof 
Both the cases (i) and (ii) proceed by induction on the number of inactive edges. 
Corollary 
If C is as described, and it contains a full set of edges as given at both sides (i.e., 
t = (q - 1), so that there are q inactive edges to the left, and (p + t) = v so that there 
are (v - p) inactive edges to the right, all with labels matching the rule), then there is 
a complete (inactive) edge of the form $(i_1, j_v, B_0 \to \bullet B_1 \ldots B_v \bullet)$.
A chart parser is driven by two principles: one is that of edge combination, as given 
in the above definition of bidirectionally resolved, and the other is the introduction of 
rules into the chart. For a strategy-marked grammar, the rule-introduction principle is 
sensitive to the annotation of the rules. 
Definition 10 
Let (G, tr) be a BSCFG, and let C be a chart based on a string $\sigma$ and using G. C is said
to be bidirectionally mixed strategy explored iff all the following conditions hold:
1. (Bottom-up activation) For every edge:
$(i, j, A_0 \to \bullet A_1 \ldots A_m \bullet)$
there is an edge in C:
$(i, j, B_0 \to B_1 \ldots B_{q-1} \bullet B_q \bullet B_{q+1} \ldots B_v)$
for every rule $r$ in G of the form $B_0 \to B_1 \ldots B_v$ such that $q \in tr(r)$ and
$B_q = A_0$.
2. (Top-down initialization) For every rule $r$ in G of the form $S \to B_1 \ldots B_k$,
where S is the distinguished symbol of G and $0 \in tr(r)$, there is an edge
in C of the form:
$(0, 0, S \to \bullet \bullet B_1 \ldots B_k)$
3. (Top-down activation, right) For every edge:
$(i, j, B_0 \to B_1 \ldots B_q \bullet B_{q+1} \ldots B_p \bullet B_{p+1} \ldots B_v)$
where $0 \le p < v$, and every rule $r$ in G of the form $A_0 \to A_1 \ldots A_k$ for
which $B_{p+1} = A_0$ and $0 \in tr(r)$, there is also an edge in C of the form:
$(j, j, A_0 \to \bullet \bullet A_1 \ldots A_k)$
4. (Top-down activation, left) For every edge:
$(i, j, B_0 \to B_1 \ldots B_q \bullet B_{q+1} \ldots B_p \bullet B_{p+1} \ldots B_v)$
where $0 < q \le v$, and every rule $r$ in G of the form $A_0 \to A_1 \ldots A_k$ for
which $B_q = A_0$ and $0 \in tr(r)$, there is also an edge in C of the form:
$(i, i, A_0 \to A_1 \ldots A_k \bullet \bullet)$
For brevity, the term fully bidirectional will be used for a chart that is both 
bidirectionally resolved and bidirectionally mixed strategy explored. 
To explore the issue of completeness (i.e., whether a parsing mechanism finds all 
valid analyses) we need to define how the edges in a chart correspond to those in a 
syntax tree. 
Definition 11 
A chart C based on $a_1 a_2 \ldots a_n$ is said to contain a representation of a syntax tree T, iff:
1. T spans a substring (not necessarily proper) of $a_1 a_2 \ldots a_n$; and
2. for every nonterminal node N in T, spanning $a_{i+1} \ldots a_j$, labeled $A_0$, with
$k$ daughters labeled $A_1, \ldots, A_k$ in order, C contains an edge
$(i, j, A_0 \to \bullet A_1 \ldots A_k \bullet)$
(This includes the case where $k = 1$ and $A_1 \in V_T$.)
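Definition 11 amounts to a subset test: collect one inactive edge per nonterminal node of the tree and check that the chart holds all of them. A small Python sketch, assuming trees are nested tuples (label followed by daughters, terminals as bare strings) and that an inactive edge is written `(i, j, lhs, rhs_tuple)`; both encodings and the function names are hypothetical.

```python
def required_edges(tree, start=0):
    """Return (end, edges): the inactive edges Definition 11 demands for
    `tree` when the tree's span begins at chart position `start`."""
    label, kids = tree[0], tree[1:]
    pos, edges = start, set()
    for kid in kids:
        if isinstance(kid, str):   # terminal daughter covers one word
            pos += 1
        else:
            pos, sub = required_edges(kid, pos)
            edges |= sub
    rhs = tuple(kid if isinstance(kid, str) else kid[0] for kid in kids)
    edges.add((start, pos, label, rhs))
    return pos, edges


def contains_representation(inactive_edges, tree, start=0):
    """True if every edge required for `tree` is among `inactive_edges`."""
    return required_edges(tree, start)[1] <= set(inactive_edges)


TREE = ("S", ("NP", ("Art", "the"), ("N", "dog")), ("VP", "runs"))
end, edges = required_edges(TREE)
print(end)            # 3
print(sorted(edges))  # five edges, one per nonterminal node
```

The `start` argument covers the case where the tree spans a proper substring of the input, as clause 1 of the definition allows.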
Notice that for any chart C based on a string $\sigma$, C will contain an edge for each
preterminal node of any tree that spans $\sigma$, by virtue of the definition of a chart being
"based on" a string. Hence later discussions of parsing and completeness can assume 
the presence of these edges in the relevant charts, with only the presence of edges for 
other nonterminal nodes being subject to verification. 
Definition 12 
Given a CFG G, a bidirectional strategy-marking tr of G is said to be complete iff for 
every tree $T \in trees(G)$ that spans a string $\sigma$, any fully bidirectional chart C based on
$\sigma$ and using (G, tr) contains a representation of T.
4. A Decidable Class of Complete Annotations 
In this section we define a decidable property of annotated grammars that guar- 
antees that a parser following the annotations will not miss analyses in the man- 
ner outlined in Section 2.1. There is also a weaker (more general) sufficient con- 
dition for completeness, which is defined in Section 5 below, but which is unde- 
cidable. The fact that the stronger condition is decidable makes it worth defining, 
and some of the proofs in Section 5 make use of some concepts from the current 
section. 
4.1 Reachability 
Another notion that has to be formalized is the way in which a syntax tree can be 
parsed from a string of terminal symbols in a purely bottom-up manner. 
Definition 13 
Let (G, tr) be a BSCFG. In a syntax tree T generated by G, a nonlexical node $M_0$,
with daughters $M_1, \ldots, M_k$, is said to be reachable from below iff $M_0$ is licensed by a
bottom-up rule $r$ and there is a $j$, $1 \le j \le k$, such that $j \in tr(r)$ and one of the following
is true:
1. $M_j$ is a preterminal node of T;
2. $M_j$ is reachable from below.
Definition 14 
Let (G, tr) be a BSCFG. A syntax tree T generated by G is said to be fully reachable
iff every nonlexical node M in T licensed by a purely bottom-up rule is reachable from 
below. 
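Definitions 13 and 14 translate directly into a recursive check over a tree. The sketch below assumes the same illustrative encodings as before (nested-tuple trees, rules as `(lhs, rhs_tuple)` pairs, markings as dictionaries); the function names are hypothetical.

```python
def label(node):
    return node if isinstance(node, str) else node[0]


def is_preterminal(node):
    return not isinstance(node, str) and all(isinstance(d, str) for d in node[1:])


def licensing_rule(node):
    return (node[0], tuple(label(d) for d in node[1:]))


def reachable_from_below(node, tr):
    """Definition 13, for a nonlexical node of a generated tree."""
    for j in tr.get(licensing_rule(node), ()):
        if j > 0:
            trigger = node[j]  # node[0] is the label, so daughter j is node[j]
            if is_preterminal(trigger) or reachable_from_below(trigger, tr):
                return True
    return False


def fully_reachable(tree, tr):
    """Definition 14: every node licensed by a purely bottom-up rule is reachable."""
    if isinstance(tree, str) or is_preterminal(tree):
        return True
    marks = tr[licensing_rule(tree)]
    if 0 not in marks and not reachable_from_below(tree, tr):
        return False
    return all(fully_reachable(d, tr) for d in tree[1:])


TREE = ("S", ("NP", ("Art", "the"), ("N", "dog")), ("VP", "runs"))
S_RULE, NP_RULE = ("S", ("NP", "VP")), ("NP", ("Art", "N"))

DEADLOCKED = {S_RULE: {1}, NP_RULE: {0}}  # the marking of Section 2.1
REPAIRED = {S_RULE: {1}, NP_RULE: {1}}    # hypothetical alternative: Art triggers NP

print(fully_reachable(TREE, DEADLOCKED))  # False
print(fully_reachable(TREE, REPAIRED))    # True
```

With the Section 2.1 marking, the S node is licensed by a purely bottom-up rule whose only trigger daughter, NP, is neither preterminal nor reachable from below, so the tree fails the test.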
4.2 Direct Analyzability 
Now we need to define a property of grammars that will guarantee that generated 
trees are fully reachable in the above sense. This can be done in three stages: first, 
define a property of nonterminal symbols; then, use that to define a property of gram- 
mars; lastly, prove that any grammar with this property generates only fully reachable 
trees. 
A first approximation to the definition for the property of nonterminals would be 
the following: 
Draft Definition: 
Given a BSCFG (G, tr), a nonterminal symbol $A_0$ is directly analyzable iff every rule
$r$ of the form $A_0 \to \ldots$ is either lexical, or of the form $A_0 \to A_1 \ldots A_k$ with at least one
$i \in tr(r)$, $i > 0$, for which $A_i$ is directly analyzable.
The subsequent definition for grammars is then: 
Definition 15 
A BSCFG (G, tr) is directly analyzable iff it meets the following condition: for every
purely bottom-up rule $r$ of the form $A_0 \to A_1 \ldots A_k$, there is at least one $i \in tr(r)$, $i > 0$,
for which $A_i$ is directly analyzable.
The draft definition captures the essential idea in a fairly natural and clear way, 
but it has a slight technical problem. Consider the toy grammar given below: 
$S \to \overline{A}\ B$
$A \to C\ \overline{A}$
$C \to x$
$B \to y$
$A \to z$
In this grammar, the nonterminal A is not classed as directly analyzable. This is because
there is a cycle from A to itself via trigger symbols in bottom-up rules.² It would
be equally consistent with the draft definition to state that A is directly analyzable, or
to stipulate that it is not directly analyzable. There is a sense in which the draft definition
is underspecified, and gives only partial coverage of the items being classified
(nonterminals). To extend this definition to total coverage, a more elaborate construction
is needed (borrowed from theoretical computer science; cf. Stoy [1981, Chap. 6]).³
First, for any BSCFG (G, tr) we define an analyzability predicate as any function $g$
from nonterminal symbols to the set {true, false} that assigns true to a category $A_0$
iff every rule $r$ of the form $A_0 \to \ldots$ is either lexical, or of the form $A_0 \to A_1 \ldots A_k$,
with at least one $i \in tr(r)$, $i > 0$, for which $g(A_i) = true$. Call the set of all such functions
$\mathcal{AP}(G, tr)$. For any two $g, h \in \mathcal{AP}(G, tr)$, define the relation "$\sqsubseteq$" by $h \sqsubseteq g$ iff
$h(A) = true \supset g(A) = true$. This relation is easily shown to be reflexive, transitive, and
antisymmetric, and hence $(\mathcal{AP}(G, tr), \sqsubseteq)$ forms a partially ordered set (Maclane and
Birkhoff 1967, 59; Stoy 1981, 82). Then for any set $g_1, \ldots, g_n$ of elements of $\mathcal{AP}(G, tr)$,
the function $g'$ (in $\mathcal{AP}(G, tr)$) given by
$g'(A) = true$ iff either $g_1(A) = true$ or ... $g_n(A) = true$
is a least upper bound (Maclane and Birkhoff 1967; Stoy 1981) for $g_1, \ldots, g_n$ with
respect to $\sqsubseteq$. Since $\mathcal{AP}(G, tr)$ is finite, the presence of a l.u.b. for any subset means it has
a maximum element, which we will call $APMAX_{(G,tr)}$. This predicate $APMAX_{(G,tr)}$ will
assign true to a symbol A if there is some analyzability predicate (for (G, tr)) that makes
this assignment.⁴ Then define a nonterminal A (from G) to be directly analyzable iff
$APMAX_{(G,tr)}(A) = true$. Intuitively, any nonterminal that the draft definition might
2 Thanks to Alistair Willis for pointing out this problem.
3 Thanks to Suresh Manandhar for suggesting this approach.
4 Stoy (1981, 79-80) illustrates the use of a minimum element from an ordered set of possible functions, but here we have chosen to use the maximum.
leave as undefined with respect to being directly analyzable is classed by this new 
definition as being directly analyzable. Instead of the draft definition, we can now 
have the following complete definition: 
Definition 16 
Given a BSCFG (G, tr), a nonterminal symbol $A_0$ is directly analyzable iff
$APMAX_{(G,tr)}(A_0) = true$, where $APMAX_{(G,tr)}$ is as constructed above.
Notice that it follows from the construction of $APMAX_{(G,tr)}$ that $A_0$ is directly analyzable
iff every rule $r$ of the form $A_0 \to \ldots$ is either lexical, or of the form $A_0 \to A_1 \ldots A_k$
with at least one $i \in tr(r)$, $i > 0$, for which $A_i$ is directly analyzable. That is, the draft
definition, which was not sufficiently self-contained to be a definition, is now deriv- 
able as a theorem from the more rigorous definition. This means that we can use the 
logical equivalence stated in the draft definition in subsequent proofs. 
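One way to see why direct analyzability is decidable is to compute the maximum analyzability predicate by iteration: start from the assignment that maps every nonterminal to true and repeatedly withdraw any symbol that violates the draft condition. Because the condition is monotone and the symbol set is finite, this stops at the largest assignment satisfying the "iff". The Python sketch below takes this route; the iteration is a standard greatest-fixed-point computation offered as an illustration rather than a procedure given in the text, and the data encodings and function names are assumptions of the sketch.

```python
def is_lexical(rule):
    return len(rule[1]) == 1 and rule[1][0].islower()


def directly_analyzable_symbols(rules, tr):
    """The maximum analyzability predicate, as the set of symbols mapped to true."""
    good = {lhs for lhs, _ in rules}  # start from the all-true assignment
    changed = True
    while changed:
        changed = False
        for symbol in sorted(good):
            ok = all(
                is_lexical(rule)
                or any(i > 0 and rule[1][i - 1] in good for i in tr[rule])
                for rule in rules if rule[0] == symbol
            )
            if not ok:
                good.discard(symbol)
                changed = True
    return good


def is_directly_analyzable(rules, tr):
    """Definition 15, applied to the whole annotated grammar."""
    good = directly_analyzable_symbols(rules, tr)
    return all(
        any(i > 0 and rule[1][i - 1] in good for i in tr[rule])
        for rule in rules
        if not is_lexical(rule) and 0 not in tr[rule]
    )


def R(lhs, *rhs):
    return (lhs, rhs)


# The toy grammar with a trigger cycle through A.
TOY = [R("S", "A", "B"), R("A", "C", "A"), R("C", "x"), R("B", "y"), R("A", "z")]
TR_TOY = {TOY[0]: {1}, TOY[1]: {2}}

# The deadlocked grammar of Section 2.1.
G21 = [R("S", "NP", "VP"), R("NP", "Art", "N"),
       R("VP", "runs"), R("Art", "the"), R("N", "dog")]
TR_G21 = {G21[0]: {1}, G21[1]: {0}}

print(sorted(directly_analyzable_symbols(TOY, TR_TOY)))  # ['A', 'B', 'C', 'S']
print(is_directly_analyzable(TOY, TR_TOY))               # True
print(sorted(directly_analyzable_symbols(G21, TR_G21)))  # ['Art', 'N', 'VP']
print(is_directly_analyzable(G21, TR_G21))               # False
```

Starting from all-true is what resolves the cycle in the toy grammar in favor of A being directly analyzable; starting from all-false would instead compute the minimum predicate, which is the choice the text sets aside.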
Lemma 2 
Let (G, tr) be a BSCFG that is directly analyzable. Let T be a tree in trees(G). Then T 
is fully reachable. 
Proof 
It is straightforward to prove the following preliminary result, using induction on the 
height of nodes and the logical equivalence stated in the draft definition above: 
Any node in T that has a directly analyzable label is reachable from below. 
It is then easy to show that any node in T that is licensed by a purely bottom-up rule
is reachable from below. □
4.3 Parsing 
Lemma 3 
Let (G, tr) be a BSCFG. Let T be a fully reachable tree in trees(G), spanning the string
σ. Let C be a fully bidirectional chart based on σ and using (G, tr). Then for any node
M in T, labeled A:
1. If M is licensed by a purely bottom-up rule, then C contains a
representation of the subtree rooted at M.
2. If M is licensed by a top-down rule, and there is in C an active edge
(t, g, B0 → B1 ... B_{p-1} • Bp ... B_{q-1} • Bq ... Bv)
where either B_{p-1} = A and t is the end of M, or Bq = A and g is the start
of M, then C contains a representation of the subtree rooted at M.
Proof 
By induction on the height of nodes. 
Inductive Hypothesis: For any 0 < d' < d, if node M in T is of height d', the
conditions listed in the lemma hold.
Base Case: Suppose M is of height 1 (i.e., preterminal). Then C contains a
representation of the subtree rooted at M, regardless of the antecedent conditions.
Inductive Step: Suppose M is of height d, where d > 1, and is labeled A. 
(a) Suppose M, with daughter nodes M1, ..., Mk, is licensed by a purely
bottom-up rule A → A1 ... Ak. Then, since T is fully reachable, there is a
j, 1 ≤ j ≤ k, such that Mj is reachable from below; that is, Mj is either
lexical or licensed by a bottom-up rule. Therefore, by the Inductive
Hypothesis, C contains a representation of the subtree rooted at Mj. Since
C is fully bidirectional, it contains an edge spanning Mj of the form:
(l, h, A → A1 ... A_{j-1} • Aj • A_{j+1} ... Ak)
Now consider the nodes Mi, for j + 1 ≤ i ≤ k. The Inductive Hypothesis
applies to each of these nodes. Hence it can be proved by induction on i
that there is a representation in C of the subtree rooted at Mi for all
j + 1 ≤ i ≤ k (cf. Lemma 1). Similarly, it can be proved that there are
representations in C for the subtrees rooted at M1, ..., M_{j-1}. By the
corollary to Lemma 1, there is a representation for the tree rooted at M
in C.
(b) Suppose M is licensed by a top-down rule. Suppose there is an edge
(t, g, B0 → B1 ... B_{p-1} • Bp ... B_{q-1} • Bq ... Bv)
where Bq = A and g is the start of M (a similar argument holds in the
case where B_{p-1} = A and t is the end of M). Since the chart is fully
bidirectional, there must also be an empty active edge
(g, g, A → • • A1 ... Ak)
By a similar argument to that in case (a) above, it follows that there are
representations in C for all the nodes M1, ..., Mk and thence for M.
This establishes the main induction. □
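The double-dotted edges used in this proof can be pictured as a small data structure. The sketch below is illustrative only: the class name `Edge`, its field names, and the two extension methods are assumptions, and the extension steps follow the standard fundamental rule of chart parsing rather than reproducing the extension principles of Definition 9. The recognized portion of the RHS lies between the two dots, so an edge may seek a symbol to its left, to its right, or both.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Edge:
    start: int             # t: left end of the recognized span
    end: int               # g: right end of the recognized span
    lhs: str
    rhs: Tuple[str, ...]
    left: int              # number of RHS symbols before the first dot
    right: int             # number of RHS symbols before the second dot

    def seeks_left(self) -> Optional[str]:
        """Symbol immediately before the first dot (B_{p-1}), if any."""
        return self.rhs[self.left - 1] if self.left > 0 else None

    def seeks_right(self) -> Optional[str]:
        """Symbol immediately after the second dot (Bq), if any."""
        return self.rhs[self.right] if self.right < len(self.rhs) else None

    def is_inactive(self) -> bool:
        return self.left == 0 and self.right == len(self.rhs)

    def extend_right(self, label: str, new_end: int) -> "Edge":
        if self.seeks_right() != label:
            raise ValueError("edge is not seeking this symbol to its right")
        return replace(self, end=new_end, right=self.right + 1)

    def extend_left(self, label: str, new_start: int) -> "Edge":
        if self.seeks_left() != label:
            raise ValueError("edge is not seeking this symbol to its left")
        return replace(self, start=new_start, left=self.left - 1)


# An empty active edge (g, g, A -> . . A1 A2 A3), as introduced for a top-down rule.
empty = Edge(4, 4, "A", ("A1", "A2", "A3"), 0, 0)

# An edge triggered bottom-up by A2 spanning positions 3..5: A -> A1 . A2 . A3
triggered = Edge(3, 5, "A", ("A1", "A2", "A3"), 1, 2)
complete = triggered.extend_left("A1", 1).extend_right("A3", 7)
print(triggered.seeks_left(), triggered.seeks_right(), complete.is_inactive())
```

Representing the dots as two indices keeps the two conditions of Lemma 3 symmetrical: `seeks_left()` corresponds to the case B_{p-1} = A, and `seeks_right()` to the case Bq = A.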
Lemma 4 
Let (G, tr) be a BSCFG that is directly analyzable. Let T be a tree in trees(G), spanning
the string σ. Let C be a fully bidirectional chart based on σ and using (G, tr). Then for
any node M in T, labeled A:
1. If M is licensed by a purely bottom-up rule, then C contains a
representation of the subtree rooted at M.
2. If M is licensed by a top-down rule, and there is in C an active edge
(t, g, B0 → B1 ... B_{p-1} • Bp ... B_{q-1} • Bq ... Bv)
where either B_{p-1} = A and t is the end of M, or Bq = A and g is the start
of M, then C contains a representation of the subtree rooted at M.
Proof
Follows from Lemmas 2 and 3. □
Being reachable from below can be seen as a condition on nodes that can be built 
bottom-up. Surprisingly, we do not need a corresponding condition for nodes that 
are built top-down. It is possible to formulate the appropriate condition, but it turns 
out that any tree that meets the condition of being fully reachable will also meet the 
appropriate condition for top-down nodes. It is hard to give an informal, intuitive 
explanation for this, but roughly speaking, the reason is as follows. For a top-down 
rule to be invoked, it must be used in a position at which some prediction of its LHS 
symbol A will be introduced (by some other rule). This can happen either as a cascade 
of predictions from above, using a sequence of top-down rules, or because a rule has 
been introduced and has caused a sequence of predictions to be made, either left- 
to-right or right-to-left, as its RHS symbols are parsed. For either of these to happen, 
either there must be a clear path of daughter categories from some other prediction, or 
A must be on the RHS of a rule that is somehow introduced. The daughter condition 
of "reachable from below" simultaneously imposes these conditions on the top-down 
rules. 
4.4 Completeness 
The final step in proving completeness is now simple. 
Theorem 1 
If a BSCFG (G, tr) is directly analyzable, then tr is complete. 
Proof 
Let T be a tree in trees(G), spanning the string σ, with root node M0 labeled S (the
distinguished symbol of G). Let C be a fully bidirectional chart based on σ and using
(G, tr).
(a) If M0 is licensed by a bottom-up rule, then by Lemma 4, C contains a
representation of the tree rooted at M0.
(b) If M0 is licensed by a top-down rule S → A1 ... Ak, then C must contain
an empty active edge of the form:
(0, 0, S → • • A1 ... Ak)
It follows from repeated applications of Lemma 4 and Lemma 1 (similar
to part (b) of the Inductive Step of Lemma 3) that C contains a
representation of the subtrees rooted at the daughters of M0, and thence
of the subtree rooted at M0. □
Thus we have proved that all BSCFGs that meet the condition of being directly 
analyzable can be bidirectionally parsed without any valid trees being omitted. 
Theorem 2 
It is decidable whether a given BSCFG is directly analyzable. 
Proof 
This is straightforward to verify from the definition of directly analyzable (see
Appendix A for an algorithm).
5. An Undecidable Class of Complete Annotations 
5.1 Informal Outline 
Before proceeding to formalize the mechanisms underlying the problem presented in 
Section 2.2, it is useful to set out informally the relevant factors in that example. A 
strategy-marked grammar (G, tr) causes problems only if there is some purely bottom-up
rule of the form A0 → A1 ... Ak such that every trigger symbol Ai requires a purely
top-down rule somewhere in its expansion (see Section 4.2 above). Such a rule leads to
the possibility of there being a tree T ∈ trees(G) that contains a nonterminal node that
can only be built by a bottom-up rule, and whose trigger daughter can only be built
using a top-down rule. This would give rise to a tree that was not fully reachable. In
the example in Section 2.2 above, the rules
H → B F
B → P Q
create this situation. The "upper" symbol H cannot be parsed because the "lower" 
symbol B cannot be parsed. What salvages this difficulty is the fact that the "upper"
nonterminal (H in this example) always occurs in a left context (i.e., a string of
symbols to its left) with the following property: every possible terminal expansion of the
left context contains a substring that will, via bottom-up rules, introduce rules that are
bound to result in the introduction of an active edge that starts at the point where
the "upper" symbol (H) is needed and that is seeking the "lower" symbol (B).
The illustrative grammars in Sections 2.1 and 2.2 are of a particular subclass of 
grammars--those where tr(r) = {0} or tr(r) = {1} for any (nonlexical) rule r. This is 
equivalent to partitioning the rules into two subgroups--top-down and bottom-up-- 
where the bottom-up rules are always triggered in a left-corner manner, much as in 
conventional "bottom-up" chart parsers (such as those in Thompson and Ritchie [1984]
or Winograd [1983]). That is, there is a natural subclass of annotated grammars that
do not rely on the bidirectional exploration of the chart, but allow this limited form 
of mixed strategy left-to-right exploration. 
The definitions and proofs of the earlier sections apply to this subclass. It is also 
clear, from Section 2.2, that the issue of "completeness by interaction" can be illustrated 
within this limited subclass. In the remainder of Section 5 below, it is proved that 
detecting the possibility of such rule interactions is undecidable even for this limited 
subclass of grammars. It follows that it must be undecidable for the more general 
class, where any annotation is permitted. The advantages of focusing on this more 
limited subclass are twofold: it shows that restricting the annotations in this way 
would not ease the undecidability problem, and it simplifies the proofs (which are 
already tediously complex). 
Definition 17 
A left-corner strategy-marked context-free grammar (LCSCFG) is a BSCFG (G, tr) such
that tr(r) ⊆ {0, 1} for every rule r in G.
This definition allows a rule to be both bottom-up and top-down marked, rather 
than enforcing a strict partitioning. In the following proofs, we will define constructs 
for BSCFGs where possible, simply for generality, but where it matters we shall confine 
attention to LCSCFGs, thereby narrowing the range of contexts relevant to parsing a 
particular symbol. 
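The rule classes used in this section can be read directly off the trigger sets. The following sketch shows one way to do so; the function names are hypothetical, and the reading of "purely bottom-up" as "has an RHS trigger and no LHS trigger" is an assumption based on how the term is used in the surrounding text.

```python
from typing import FrozenSet, Iterable


def is_top_down(triggers: FrozenSet[int]) -> bool:
    """Triggered from the left-hand side (position 0)."""
    return 0 in triggers


def is_bottom_up(triggers: FrozenSet[int]) -> bool:
    """Triggered from at least one right-hand side position."""
    return any(i > 0 for i in triggers)


def is_purely_bottom_up(triggers: FrozenSet[int]) -> bool:
    return is_bottom_up(triggers) and not is_top_down(triggers)


def is_left_corner_annotation(all_triggers: Iterable[FrozenSet[int]]) -> bool:
    """Definition 17: every trigger set is a subset of {0, 1}."""
    return all(t <= {0, 1} for t in all_triggers)


both = frozenset({0, 1})          # marked both top-down and bottom-up
print(is_top_down(both), is_bottom_up(both), is_purely_bottom_up(both))
print(is_left_corner_annotation([frozenset({0}), frozenset({1}), both]))
print(is_left_corner_annotation([frozenset({2})]))
```

The trigger set {0, 1} passes the LCSCFG test while being neither purely top-down nor purely bottom-up, which is the case that a strict partitioning of the rules would exclude.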
5.2 Left Contexts 
Following from the informal discussion in Section 5.1 above, we need to define more 
precisely the notion of a left context of a symbol. What we want is a way of charac- 
terizing, for a given nonterminal A, exactly those strings of symbols that must appear 
immediately to the left of A in any valid derivation in which A appears. These need 
not be all that is to the left of A in a derivation, but it must be the case that A cannot 
appear without having one of these left context strings immediately adjacent to it. 
In the following definitions, the derivation relationship "⇒" is the conventional
one, and is independent of any strategy marking; the relationship "⇒_R" indicates a
rightmost derivation (see Section 3.1 earlier).
Definition 18 
Suppose we have a context-free grammar (VN, VT, P, S), and a sequence of symbols
B1, ..., Bt in VN, where there are rules Bi → ρ_{i-1} B_{i-1} β_{i-1} for 2 ≤ i ≤ t (ρi, βi ∈ VN*).
Suppose we have a rightmost derivation of the form:
Bt ⇒_R* ρ_{t-1} B_{t-1} ω_{t-1}
   ⇒_R* ρ_{t-1} ρ_{t-2} B_{t-2} ω_{t-2}
   ...
   ⇒_R* ρ_{t-1} ρ_{t-2} ... B2 ω2
   ⇒_R* ρ_{t-1} ρ_{t-2} ... ρ1 B1 ω1
(all the ωi ∈ VT*). This derivation is said to be:
1. nonrepeating if Bi ≠ Bj whenever i ≠ j.
2. rooted if Bt = S.
3. localized if there is a longer sequence of nonterminal symbols B1, ..., Bm
and a rooted rightmost derivation Bm ⇒_R* ρ_{m-1} ... ρ1 B1 ω1 such that Bt = Bk
for some t < k < m.
4. essential if it is nonrepeating and either rooted or localized.
Also, the derivation is said to be for B1 from Bt, and the string ρ_{t-1} ... ρ1 is said to be
the left context sequence of this derivation.
Definition 19 
For any nonterminal A, the set of essential left contexts of A is
{σ ∈ VT* | there is an essential rightmost derivation D for A, φ is the left
context sequence of D, and φ ⇒* σ}
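For a small grammar, the left context sequences of rooted, nonrepeating derivations can be enumerated by walking down from the distinguished symbol and collecting the symbols to the left of each chosen daughter. The sketch below covers only that rooted case: localized derivations are omitted, so it yields a subset of the sequences underlying the definition. It also assumes that every nonterminal derives some terminal string. The function name and the toy grammar are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Grammar = Dict[str, List[Tuple[str, ...]]]  # nonterminal -> list of right-hand sides


def rooted_left_context_sequences(
    grammar: Grammar, start: str, target: str
) -> Set[Tuple[str, ...]]:
    """Left context sequences of rooted, nonrepeating derivations for `target`."""
    results: Set[Tuple[str, ...]] = set()

    def descend(symbol: str, used: FrozenSet[str], context: Tuple[str, ...]) -> None:
        if symbol == target:
            results.add(context)  # B1 reached; it may not recur on a nonrepeating chain
            return
        for rhs in grammar.get(symbol, []):
            for pos, child in enumerate(rhs):
                # Follow nonterminal daughters only, and never revisit a symbol.
                if child in grammar and child not in used:
                    descend(child, used | {child}, context + rhs[:pos])

    descend(start, frozenset({start}), ())
    return results


# Hypothetical grammar: S -> C K, K -> D A, plus lexical rules.
toy_grammar: Grammar = {
    "S": [("C", "K")],
    "K": [("D", "A")],
    "A": [("a",)],
    "C": [("c",)],
    "D": [("d",)],
}

print(rooted_left_context_sequences(toy_grammar, "S", "A"))  # {('C', 'D')}
```

Here the only chain from S to A passes through K, picking up C to the left of K and then D to the left of A; the terminal expansions of the sequence C D are then the corresponding left contexts of A.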
The following lemma proves that essential left contexts have just the required 
property. 
Lemma 5 
Let G be a CFG. Let T be a tree in trees(G). Let M be a nonterminal node in T. Let σ
be the terminal string spanned by T, and δ the portion of σ spanned by M. Then σ is
of the form σ1γδσ2 for some γ in the essential left contexts of the label of M.
Figure 1
Situation described in Lemma 5.
Proof 
(See Figure 1 for an intuitive picture.) Since T is a tree, there is a path of nonterminal
nodes (N1, ..., Nt) where Ni is labeled Bi, for 1 ≤ i ≤ t, N1 = M, Nt is the root of
T, and Ni is the mother of N_{i-1} for 2 ≤ i ≤ t. Since T ∈ trees(G), there must be a
sequence of rules Bi → ρ_{i-1} B_{i-1} β_{i-1} (ρi, βi ∈ VN*) such that the node Ni is licensed by
Bi → ρ_{i-1} B_{i-1} β_{i-1} for 2 ≤ i ≤ t. Hence there is a rooted rightmost derivation for B1
from Bt. From this it is trivial to form an essential rightmost derivation for B1 from
some symbol Bk (where k ≤ t):
Bk ⇒_R* ρ_{k-1} ... ρ1 B1 ω1
where ω1 ∈ VT* and
ρ_{k-1} ... ρ1 B1 ω1 ⇒* θ
where θ is the substring of σ spanned by Nk.
This means that θ is of the form γδω1 where ρ_{k-1} ... ρ1 ⇒* γ and B1 ⇒* δ (since B1 is
the label of M, and δ is the terminal string spanned by M). Then γ is an essential left
context of B1, by virtue of the way ρ_{k-1} ... ρ1 was constructed. Since θ is a substring
of σ, this establishes the result. □
5.3 Bottom-up Derivations 
Definition 20 
In a BSCFG (G, tr), suppose A is a nonterminal symbol, and σ is a string of terminal
symbols. Then σ can be coherently derived from A (with tree T) iff T is a syntax tree
generated by G such that:
1. T spans σ
2. the root of T is labeled A
3. T is fully reachable.
The next definition requires that the derivation can occur without need for top- 
down initiation. 
Definition 21 
Let (G, tr) be a BSCFG. Suppose A is a nonterminal symbol, and σ is a string of terminal
symbols, from G. Then σ can be up-derived from A (written "A ⇑* σ") using (G, tr)
iff:
1. σ can be coherently derived from A with tree T;
2. the root of T is reachable from below.
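The two conditions above can be checked mechanically on an annotated tree. The sketch below paraphrases the earlier notions as they are used in the proof of Lemma 3: a node is taken to be reachable from below if it is lexical or licensed by a bottom-up rule, and a tree is taken to be fully reachable if every node licensed by a purely bottom-up rule has a trigger daughter that is reachable from below. That paraphrase is an assumption and should be checked against the original definitions; the `Node` class and the example annotations are hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class Node:
    label: str
    triggers: FrozenSet[int] = frozenset()  # tr(r) of the licensing rule
    children: List["Node"] = field(default_factory=list)  # empty for lexical nodes


def reachable_from_below(node: Node) -> bool:
    # Assumed reading: lexical, or licensed by a rule with an RHS trigger.
    return not node.children or any(i > 0 for i in node.triggers)


def fully_reachable(node: Node) -> bool:
    purely_bottom_up = bool(node.children) and 0 not in node.triggers
    if purely_bottom_up:
        # Some trigger daughter must itself be reachable from below.
        if not any(
            i in node.triggers and reachable_from_below(child)
            for i, child in enumerate(node.children, start=1)
        ):
            return False
    return all(fully_reachable(child) for child in node.children)


def up_derives(root: Node) -> bool:
    """Coherently derived (fully reachable tree) with a root reachable from below."""
    return fully_reachable(root) and reachable_from_below(root)


def tree(b_triggers: FrozenSet[int]) -> Node:
    # H -> B F (bottom-up, trigger 1); B -> P Q with the given annotation.
    b = Node("B", b_triggers, [Node("P"), Node("Q")])
    return Node("H", frozenset({1}), [b, Node("F")])


print(up_derives(tree(frozenset({0}))))     # B is top-down only: False
print(up_derives(tree(frozenset({0, 1}))))  # B is also bottom-up: True
```

Conditions 1 and 2 of Definition 20 (the span and the root label) hold trivially for any tree passed in, so only the reachability conditions are tested.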
It is clear that all nodes of such trees will appear in a chart, as formalized in 
Lemma 6. 
Lemma 6 
Let (G, tr) be a BSCFG. If A ⇑* σ (that is, σ can be up-derived from A) using (G, tr),
and C is a fully bidirectional chart based on a string γ1σγ2 and using (G, tr), then C
contains a representation of a tree T such that T spans σ and the root of T is labeled A.
Proof
Follows from Lemma 3. □
5.4 Left-Introducible Rules 
In characterizing formally the situation outlined informally in Section 5.1 above, the 
following definition allows a more succinct statement. 
Definition 22 
Let A, B be two nonterminal symbols from a LCSCFG. A introduces B from above
(written "A ↝ B") if either A = B, or there is a sequence of top-down rules
A → A0 ...
A0 → A1 ...
...
At → B ...
Figure 2
Left-introducibility.
Lemma 7 
Let A, B be two nonterminal symbols from a LCSCFG (G, tr). If A ↝ B (that is, A
introduces B from above), then in any bidirectionally mixed strategy explored chart C
using (G, tr) that contains an active edge (i, j, A' → • α1 • A β1), there will also be an
active edge in C of the form (l, j, A'' → • α2 • B β2), where either l = i or l = j.
Proof
Straightforward. The case l = i allows for A = B (and A' = A''), and l = j is the more
general case where there is a sequence of top-down-invoked active edges linking A
to B. □
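The relation "introduces from above" is the reflexive-transitive closure of the step from the LHS of a top-down rule to its leftmost RHS symbol, so it can be computed by a simple graph search. The sketch below assumes rules are given as (LHS, RHS, trigger set) triples and that lexical rules carry an empty trigger set; the function name and the example annotations are hypothetical.

```python
from collections import deque
from typing import FrozenSet, List, Set, Tuple

AnnotatedRule = Tuple[str, Tuple[str, ...], FrozenSet[int]]


def introduced_from_above(rules: List[AnnotatedRule], a: str) -> Set[str]:
    """All symbols B such that `a` introduces B from above."""
    reached = {a}                      # reflexive case: A = B
    queue = deque([a])
    while queue:
        current = queue.popleft()
        for lhs, rhs, triggers in rules:
            # Follow only top-down rules, and only to the leftmost daughter.
            if lhs == current and 0 in triggers and rhs:
                first = rhs[0]
                if first not in reached:
                    reached.add(first)
                    queue.append(first)
    return reached


# Hypothetical annotations, for illustration only.
example_rules: List[AnnotatedRule] = [
    ("S", ("Z", "B"), frozenset({0})),   # top-down
    ("Z", ("H", "Q"), frozenset({0})),   # top-down
    ("H", ("E", "F"), frozenset({0})),   # top-down
    ("K", ("Q", "D"), frozenset({1})),   # bottom-up: contributes no step
]

print(sorted(introduced_from_above(example_rules, "S")))  # ['E', 'H', 'S', 'Z']
print(sorted(introduced_from_above(example_rules, "K")))  # ['K']
```

Because only the leftmost daughter is followed, the sketch is specific to left-corner annotations; it does not attempt to model predictions that arise in the fully bidirectional case.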
Next we have a definition of the condition on rules that allows them to enter into 
the parsing process despite the difficulties outlined in Section 2.1 above. 
Definition 23 
In a LCSCFG (G, tr), a rule B0 → B1 α is said to be left-introducible iff for every γ that
is an essential left context of B0, there is a bottom-up rule A0 → A1 ... Ak such that
1. γ = χρ1 ... ρi for some i < k
2. A1 ⇑* ρ1 (that is, ρ1 can be up-derived from A1)
3. ρj can be coherently derived from Aj, for all 1 < j ≤ i
4. A_{i+1} ↝ B0 (that is, A_{i+1} introduces B0 from above)
Scrutiny of this definition should reveal its relationship to the informal outlines in
Sections 2.2 and 5.1 earlier (see also Figure 2). Notice that for any nonterminal A, if
S ⇒* A ... then the empty string is an essential left context of A and hence any rule of
the form A → ... cannot be left-introducible.
Lemma 8 
Let (G, tr) be a LCSCFG. Let T be an annotated tree generated by G, and M0 a nonlexical
node in T whose leftmost daughter is M1, with M0 labeled B0, M1 labeled B1, and where
the start of both M0 and M1 is m. Suppose that the rule B0 → B1 ... licensing M0 in T
is left-introducible. Then in any fully bidirectional chart based on the terminal string
spanned by T and using (G, tr), there is an active edge of the form
(l, m, A → • α • B0 β)
(i.e., an edge at the start of M0, M1, seeking B0).
Proof 
Let the string spanned by T be σ, with σ = σ1δσ2, where δ is spanned by M0. Let C be
a fully bidirectional chart based on σ and using (G, tr). By Lemma 5, σ1 ends in a string
γ that is in the essential left contexts of B0. Since B0 → B1 ... is left-introducible,
every such γ has the property that there is a bottom-up rule A0 → A1 ... Ak such that
• γ = χρ1 ... ρi
• A1 ⇑* ρ1 (ρ1 can be up-derived from A1)
• ρj can be coherently derived from Aj, 1 < j ≤ i
• A_{i+1} ↝ B0 (A_{i+1} introduces B0 from above)
It follows from Lemma 6 that, since A1 ⇑* ρ1, there are inactive edges in C for all
nodes of a tree with root label A1 spanning ρ1. Since A0 → A1 ... Ak is bottom-up, this
means there is an active edge in C
(j, j', A0 → • A1 • A2 ... Ak)
where j is the start of the inactive edge for the root of this tree (i.e., the node labeled
A1), and j' is its end. By Lemma 1 and Lemma 4, there are inactive edges in C labeled
A2, ..., Ai, corresponding to nodes spanning ρ2 ... ρi. By Lemma 1, there is an edge
(j, m, A0 → • A1 ... Ai • A_{i+1} ... Ak)
where m is the start of δ. Since A_{i+1} ↝ B0, by Lemma 7 there is an active edge
(l, m, A → • α • B0 β) □
5.5 Indirect Analyzability 
In Section 4 we defined direct analyzability as a condition on grammars that would 
lead to complete parsing. Now we establish a more general property that also leads 
to completeness. 
Definition 24 
Let (G, tr) be a LCSCFG. A nonterminal symbol A0 in G is said to be indirectly
analyzable iff every rule A0 → ω is either lexical, or top-down and left-introducible, or
bottom-up of the form A0 → A1 α where A1 is indirectly analyzable. 5
5 Like the definition of directly analyzable in Section 4, this strictly needs a more detailed definition to 
allow for cycles. This is straightforward to provide, in exactly the manner used in that earlier section, 
and then the "definition" given here becomes a theorem about indirect analyzability. 
Definition 25 
A LCSCFG (G, tr) is indirectly analyzable iff for every purely bottom-up rule A0 → A1 α,
the nonterminal symbol A1 is indirectly analyzable.
The next two lemmas ensure that a grammar with the property of indirect ana- 
lyzability leads to complete parses. The first is just a generalization of Lemma 4. 
Lemma 9 
Let (G, tr) be a BSCFG that is indirectly analyzable. Let T be a tree in trees(G), spanning
the string σ. Let C be a fully bidirectional chart based on σ and using (G, tr). Then for
any node M labeled A in T:
1. If M is licensed by a purely bottom-up rule, then C contains a
representation of the subtree rooted at M.
2. If M is licensed by a top-down rule, and there is in C an active edge
(t, g, B0 → B1 ... B_{p-1} • Bp ... B_{q-1} • Bq ... Bv)
where either B_{p-1} = A and t is the end of M, or Bq = A and g is the start
of M, then C contains a representation of the subtree rooted at M.
Proof 
By induction on the height of nodes, in a manner very similar to Lemma 3, except 
that part (a) of the Inductive Step is as follows: 
Inductive Step (a): Suppose M0, with daughter nodes M1, ..., Mk, is
licensed by a purely bottom-up rule A0 → A1 ... Ak. Then, since (G, tr) is
indirectly analyzable, this means that A1 is indirectly analyzable. Hence
whatever rule licenses M1, it must be either lexical, or top-down and
left-introducible, or bottom-up with an indirectly analyzable symbol at
the start of its RHS. In the lexical and bottom-up cases, the Base Case and
Inductive Hypothesis establish that C contains a representation of the
tree rooted at M1. If the rule is top-down and left-introducible, there is
(by Lemma 8) an edge seeking its LHS symbol at the start of M1, and so, by
the Inductive Hypothesis, there is a representation of the tree rooted at
M1 in C. Since C is fully bidirectional, it contains an edge spanning M1
of the form:
(l, h, A0 → • A1 • A2 ... Ak)
Repeated applications of the Inductive Hypothesis and Lemma 1
(Corollary) establish that there is a representation of the tree rooted at
M0 in C (i.e., the Inductive Step). □
Lemma 10 
Suppose (G, tr) is an indirectly analyzable LCSCFG. Suppose T ∈ trees(G), and C is
a fully bidirectional chart based on the string spanned by T and using (G, tr). Then
for any nonroot node M in T, if M is licensed by a top-down rule A → ω, then C
contains an active edge at the start of M of the form (t1, t2, B0 → α • β • A ...) (i.e.,
seeking A).
Proof 
By induction on the depth of nodes. 
Inductive Hypothesis: Assume that for any node M of depth d', where 0 ≤ d' < d,
the lemma holds.
Base Case: Suppose M is of depth 1 (i.e., a daughter of the root node). Suppose
the root is licensed by a rule S → A1 ... Ak, where Mi is the ith daughter of the root
(1 ≤ i ≤ k) and M = Mj.
(a) Assume this rule is purely bottom-up. Since (G, tr) is indirectly
analyzable, A1 is indirectly analyzable. Consider the rule that licenses
M1. It cannot be left-introducible, as S ⇒* A1 ... (see earlier remark about
empty essential left contexts); hence it must be either lexical or
bottom-up. By Lemma 9, C contains a representation of the subtree
rooted at M1. Since the rule S → A1 ... Ak is bottom-up, C contains an
active edge of the form (0, 0, S → • A1 • A2 ... Ak).
(b) Assume this rule is top-down. Then there must be an empty active edge
(0, 0, S → • • A1 ... Ak).
By repeated applications of Lemma 9 and Lemma 1, there are edges of the form
(0, ti, S → • A1 ... Ai • A_{i+1} ... Ak)
for 1 ≤ i ≤ (j - 1). The last of these fulfils the condition.
Inductive Step: Let M be a node labeled A of depth d > 1, licensed by a top-down
rule r. Let its mother node be N, of depth (d - 1), and the leftmost daughter of N be
M1. Consider the rule r' (of the form A0 → A1 ... Ak), which licenses N.
(a) Suppose r' is purely bottom-up. Then M1 is labeled with an indirectly
analyzable symbol, A1. Hence for the rule licensing M1, three cases must
be considered:
1. It is lexical. In this case, C contains a representation for M1.
2. It is top-down and left-introducible. By Lemma 8, there is an
edge at the start of M1 seeking its label. If M = M1, this
establishes the Inductive Step in this situation. Otherwise, by
Lemma 9, there is a representation in C for the subtree rooted at
M1.
3. It is bottom-up of the form A1 → B1 ... where B1 is indirectly
analyzable. By Lemma 9, there is a representation in C for the
subtree rooted at M1.
Since there is a representation in C for the subtree rooted at M1, there is
an (inactive) edge in C of the form (i, j, A1 → • ... •), where i is the start
of M1 (and hence of N). Since r' is bottom-up and A1 is the leftmost
(trigger) symbol of its RHS, this leads to an active edge of the form
(i, i, A0 → • • A1 ... Ak) for r' at the start of M1 and N. By repeated
applications of Lemma 1 and Lemma 9, there is an active edge seeking A
at the start of M.
(b) Suppose r' is top-down. By the Inductive Hypothesis, there is an edge at
the start of N seeking the label of N. Since r' is top-down, there is also
an empty active edge for r' at that point. If M = M1, this establishes the
Inductive Step in this situation. Otherwise, by repeated applications of
Lemma 1 and Lemma 9, there is an active edge seeking A at the start
of M. □
Theorem 3 
If a LCSCFG (G, tr) is indirectly analyzable, then tr is complete. 
Proof 
Follows from Lemma 9 and Lemma 10. □
5.6 Undecidability 
We have established that the condition of indirect analyzability suffices to ensure com- 
pleteness. Unfortunately, indirect analyzability is not a decidable property of annotated 
grammars, as we now show. 
Theorem 4 
It is undecidable whether an arbitrary LCSCFG is indirectly analyzable. 
Proof 
Suppose that there were a decision procedure for indirect analyzability. This could 
then be used to construct a decision procedure that determines for any two CFGs G1, 
G2 whether every member of L(G1) ends in a substring that is a member of L(G2). 
This is an undecidable problem (see Appendix B); hence, the indirect analyzability 
question is also undecidable. The construction proceeds as outlined below. 
Suppose we have the two arbitrary CFGs G1 and G2 over the same alphabet VT, and
assume that their nonterminal alphabets V_N^1, V_N^2 do not intersect. Construct a LCSCFG
as follows. The distinguished symbol S' is distinct from all symbols in V_N^1 ∪ V_N^2. Use
symbols B1, B2, ..., B6 also not in V_N^1 ∪ V_N^2. The purely bottom-up rules are all the rules
of G2, together with
B1 → B2 B3
B6 → S2 B2
where each Si is the distinguished symbol of Gi. The purely top-down rules are all the
rules of G1 together with
S' → S1 B1
B2 → B4 B5
Also we include lexical rules:
B3 → a
B4 → b
B5 → c
for some terminal symbols a, b, c.
This LCSCFG is indirectly analyzable iff (by definition) every purely bottom-up
rule has an indirectly analyzable symbol at the start of its RHS (the trigger position).
All the rules taken directly from G2 meet this condition, since all are bottom-up. So,
therefore, does the rule B6 → S2 B2. All the rules taken directly from G1, and the rule
S' → S1 B1, do not affect the condition, since all are top-down. Hence the grammar
is indirectly analyzable iff in the rule B1 → B2 B3, the trigger symbol B2 is indirectly
analyzable. This depends on whether the only rule expanding B2, B2 → B4 B5, is
left-introducible. The set of essential left contexts of B2 is {γ ∈ VT* | S1 ⇒* γ}. The only
symbol X that introduces B2 from above (X ↝ B2) is B2 itself. Hence the only rule that
meets the schema for left-introducibility is B6 → S2 B2. So B2 → B4 B5 is left-introducible
iff every γ such that S1 ⇒* γ is of the form χρ with ρ coherently derived from S2. Since
all G2 rules are bottom-up, ρ is coherently derived from S2 iff S2 ⇒* ρ. Hence the
left-introducibility of the rule in question is logically equivalent to
L(G1) ⊆ VT* + L(G2) (where + indicates concatenation). □
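The grammar built in this proof can be written out mechanically from G1 and G2. The following sketch assembles the annotated rule list under a few simplifying assumptions: rules are (LHS, RHS) pairs, the annotation is a trigger set with {0} for purely top-down and {1} for purely bottom-up, lexical rules of G1 and G2 are not distinguished from their other rules, and the function name `build_reduction` is hypothetical. It only constructs the grammar; it does not, and could not, decide indirect analyzability.

```python
from typing import FrozenSet, List, Tuple

CFRule = Tuple[str, Tuple[str, ...]]
AnnotatedRule = Tuple[str, Tuple[str, ...], FrozenSet[int]]

TOP_DOWN: FrozenSet[int] = frozenset({0})
BOTTOM_UP: FrozenSet[int] = frozenset({1})
LEXICAL: FrozenSet[int] = frozenset()

FRESH = {"S'", "B1", "B2", "B3", "B4", "B5", "B6"}


def build_reduction(
    g1: List[CFRule], s1: str, g2: List[CFRule], s2: str,
    a: str = "a", b: str = "b", c: str = "c",
) -> List[AnnotatedRule]:
    """Assemble the LCSCFG used in the proof from two CFGs G1 and G2."""
    n1 = {lhs for lhs, _ in g1}
    n2 = {lhs for lhs, _ in g2}
    if n1 & n2 or (n1 | n2) & FRESH:
        raise ValueError("nonterminal alphabets must be disjoint from each other and from the fresh symbols")
    rules: List[AnnotatedRule] = []
    # Purely bottom-up: all rules of G2, plus B1 -> B2 B3 and B6 -> S2 B2.
    rules += [(lhs, rhs, BOTTOM_UP) for lhs, rhs in g2]
    rules += [("B1", ("B2", "B3"), BOTTOM_UP), ("B6", (s2, "B2"), BOTTOM_UP)]
    # Purely top-down: all rules of G1, plus S' -> S1 B1 and B2 -> B4 B5.
    rules += [(lhs, rhs, TOP_DOWN) for lhs, rhs in g1]
    rules += [("S'", (s1, "B1"), TOP_DOWN), ("B2", ("B4", "B5"), TOP_DOWN)]
    # Lexical rules for the fresh preterminals.
    rules += [("B3", (a,), LEXICAL), ("B4", (b,), LEXICAL), ("B5", (c,), LEXICAL)]
    return rules


# Hypothetical one-rule grammars, for illustration only.
reduction = build_reduction([("S1", ("x",))], "S1", [("S2", ("y",))], "S2")
for rule in reduction:
    print(rule)
```

The disjointness check mirrors the requirement that the fresh symbols and the two nonterminal alphabets do not overlap; a caller with clashing names would need to rename symbols first.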
6. Some Further Complications 
So far, the proofs have shown that direct analyzability is a sufficient condition for com- 
pleteness, and that indirect analyzability (a more general condition) is also sufficient. 
The question might be posed--is indirect analyzability necessary for completeness? In 
fact, it is not, as there is at least one other sufficient condition for completeness, not 
covered by indirect analyzability. 
It is not worthwhile formalizing and analyzing these possibilities in detail, but a 
brief informal outline of one such condition may be helpful. This occurs where a set of 
rules that is not directly analyzable, and might seem to cause "blocking" as discussed 
in Section 2.1 earlier, is redeemed by interaction with other rules in the grammar. This 
is similar to the phenomenon analyzed in Section 5 above, but whereas the analysis 
above dealt with a configuration of rules that can be parsed bottom-up to the left of 
the problematic rule, there is an analogous condition on subtrees to the left that can 
be parsed top-down. 
The following grammar illustrates this phenomenon. 
S → H K
S → Z B
H → E F
E → P R
Q → T V
Z → H Q
K → Q D
P → p
R → r
F → f
D → d
T → t
V → v
Here, the grammar is not indirectly analyzable, as the purely bottom-up rule K → Q D
has a trigger category Q that is not indirectly analyzable. (The rule S → H K is also
problematic.) However, the only situation in which K → Q D would be needed would
be to parse a string prftvd. Since S ↝ H (S introduces H from above), there will be an
empty active edge introduced for H → E F at the start of the string. This will parse prf
(top-down) as an H, and this will combine with the active edge already introduced for
Z → H Q, leading to the introduction of an empty active edge for Q → T V at the start of
the correct substring, tvd.
Intuitively, this is similar to the phenomenon defined earlier as left-introducible, 
but with the catalytic sequence of rules being triggered top-down from the
distinguished symbol of the grammar. It is likely that some generalization could be made
to cover this pattern of rules and those described in Section 5, but the undecidability 
result in Theorem 4 suggests that this would not improve matters--the more general 
property would also be undecidable. 
7. Discussion 
7.1 Other Bidirectional Schemes 
As mentioned in Section 1 above, the ideas here were developed from a semi-formal 
proposal by Steel and de Roeck (1987). The formalization given here is a slight gen- 
eralization, as it allows multiple possible triggers on the RHS of a rule, which Steel 
and de Roeck did not consider. Steel and de Roeck did not formalize their proposal 
in detail, and did not show how to check if such annotations could lead to the parser 
missing possible analyses (i.e., becoming incomplete), although they concede that this 
is an important issue. 
Satta and Stock (1989, 1991, 1994) have developed various detailed and rigorous 
systems of chart-based parsing, including one (Satta and Stock 1989) that allows a 
form of purely bottom-up bidirectional parsing, but they do not explore the question of 
mixed strategy invocation of rules. Most of the mechanisms in their bottom-up method 
are aimed at avoiding redundant edges in the chart, a problem that has been ignored 
here by working at a more abstract, set-theoretic level. Satta and Stock provide a more 
algorithmic approach in which such issues are of concern. A practical implementation 
of the definitions given above might have to consider whether their system could be 
adopted to achieve greater efficiency. However, Willis (1996) points out that in some 
situations the scheme given in Satta and Stock (1989) can be less efficient (in terms of 
edges introduced to the chart) than a fairly naive implementation of a mixed strategy 
chart parser whose grammar is annotated to run bottom-up (essential for comparison 
with the Satta and Stock algorithm). This seems to be because the Satta and Stock 
method involves the introduction, when a constituent is found, of an edge for every 
rule with that type of constituent on its RHS. 
7.2 Head Parsing 
There has been a growth in interest over the past decade or so in head-driven parsing 
(e.g., Kay 1989). In these approaches, the parsing is guided by the fact that exactly 
one item on the right-hand side of a grammar rule is the head of the construction, in 
the sense that it is a linguistically important part of the rule. Some of these proposals 
have been formalized using chart parsing, and their properties explored. 
Although some of the head-driven strategies are said to act "top-down," this refers 
to the parser exploring from a prediction of a specific nonterminal symbol in some
region of the input, but not to rules being introduced because the grammar writer
has indicated that they are to be introduced top-down in the sense used here. The head
markings are always on the right-hand side of the rule, never the left-hand side (since 
that would not make sense for a linguistic head). Hence, head-driven parsing is, in 
terms of the approach defined here, a form of bottom-up parsing, and the issues 
of incompleteness resulting from a mixed strategy algorithm do not arise. The mixed 
strategy approach here (which was developed independently of the head-driven work) 
could be seen as a possible generalization of a very simple head-driven parser. 
There are similarities between the bidirectional scheme here and the head-corner
parser of Sikkel and op den Akker (1996), in which top-down predictions can arise
either from the distinguished symbol (predicted to span the whole input) or by
working outwards from the specified head constituent (as in the left and right extension
principles of Definition 9 in Section 3.3). They define a transitive reflexive relationship
">~", which roughly means that A>~B if there is a chain of rules from A to B such 
that the left-hand side of each one is the head of the previous rule in the sequence. 
Sikkel and op den Akker's chart handling principles all have the precondition that the 
introduction of the new edge can happen only if the region of the input in question 
is spanned by a predictive edge seeking a symbol A such that A>~B, where B is the 
label of the constituent or prediction being introduced. This is, as they make clear, 
comparable to using a left-corner oracle to avoid unnecessary edges in a more tradi- 
tional parser. A similar optimization might be possible for a mixed strategy parser of 
the sort discussed here, by using the triggers in bottom-up rules in the same way that 
heads are used by Sikkel and op den Akker. 
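As an illustration of the relation described above, the following Python sketch computes, for a toy grammar, the set of symbols reachable from each symbol through chains of heads (a reflexive transitive closure). The rule encoding `(lhs, rhs, head_index)`, the function name `head_corner_closure`, and the toy grammar with its choice of heads are assumptions made for this example; they are not taken from Sikkel and op den Akker.

```python
from collections import defaultdict

def head_corner_closure(rules):
    """Map each symbol A to the set of symbols B reachable from A by
    repeatedly stepping from the LHS of a rule to that rule's head.
    `rules` is an iterable of (lhs, rhs_tuple, head_index) triples."""
    direct = defaultdict(set)   # A -> heads of rules whose LHS is A
    symbols = set()
    for lhs, rhs, head in rules:
        direct[lhs].add(rhs[head])
        symbols.add(lhs)
        symbols.update(rhs)
    closure = {}
    for start in symbols:
        seen = {start}          # reflexive: every symbol reaches itself
        agenda = [start]
        while agenda:           # depth-first search gives transitivity
            current = agenda.pop()
            for nxt in direct.get(current, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    agenda.append(nxt)
        closure[start] = seen
    return closure

# Hypothetical toy grammar; head positions are 0-based indices into the RHS.
toy_rules = [
    ("S", ("NP", "VP"), 1),
    ("VP", ("V", "NP"), 0),
    ("NP", ("Det", "N"), 1),
]
closure = head_corner_closure(toy_rules)
```

Under this encoding, a predictive edge seeking S would be related to VP and V, but not to NP or Det, so an oracle of this kind would block edges for the latter two in a region covered only by that prediction.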
7.3 Extended Generalized Left-Corner Parsing 
Stabler (1994) outlines a very general approach to top-down and bottom-up parsing 
of context-free grammars, in a somewhat different formal framework. Although his 
theoretical mechanisms are in some ways a generalization of the left-corner strategy- 
marked grammars discussed in Section 5 above, there is one respect in which they are 
slightly less general, and which places the chart-based proofs given above outside the 
scope of his results. Stabler defines a class of extended generalized left-corner (XGLC) 
parsers, by attaching an (extended) trigger function to a CFG. This function maps each 
pair consisting of a stack configuration and a rule to a prefix of the RHS of that rule. 
Intuitively, the prefix indicates how much of the RHS of the rule has to be recognized 
before that rule is to be introduced into the parsing process; making this dependent on 
the parser's stack (which can hold both recognized symbols and predictions of symbols 
needed) allows some sensitivity to the parsing context. Stabler cites a proof that all such 
parsers are complete with respect to the original CFG. This may seem to conflict with 
the proofs offered above, but it is crucial that Stabler's trigger functions are defined to 
be total functions--for any stack configuration, there must be some prefix of the rule's 
RHS. To faithfully reproduce the notion of a top-down rule used in the mixed strategy 
chart system, the trigger function would have to be partial, indicating no prefix at 
all in those cases where the stack did not have the right prediction. It is reasonable 
to assume that Stabler's completeness proofs rely on the total nature of the trigger 
function, and thus do not cover the notion of mixed strategy parsing defined here. 
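The contrast between total and partial trigger functions can be sketched as follows. This is a deliberately simplified illustration, not Stabler's formal definition: the stack is modeled as a list of strings, predictions are written with a hypothetical `?` prefix, and both function names are invented for the example.

```python
from typing import Optional, Sequence, Tuple

Rule = Tuple[str, Tuple[str, ...]]  # (LHS, RHS); a hypothetical encoding

def left_corner_trigger(stack: Sequence[str], rule: Rule) -> Tuple[str, ...]:
    """A total trigger function: every stack yields some prefix of the RHS."""
    _lhs, rhs = rule
    return rhs[:1]  # the left corner, whatever the stack holds

def top_down_trigger(stack: Sequence[str], rule: Rule) -> Optional[Tuple[str, ...]]:
    """A partial trigger function: defined only when the top of the stack
    is a prediction of the rule's LHS (predictions are written '?X' here)."""
    lhs, _rhs = rule
    if stack and stack[-1] == "?" + lhs:
        return ()    # empty prefix: introduce the rule before recognizing any RHS symbol
    return None      # undefined: the rule may not be introduced in this context

vp_rule: Rule = ("VP", ("V", "NP"))
```

Here `left_corner_trigger` returns a prefix for any stack, whereas `top_down_trigger` returns `None` for a stack such as `["NP"]` that carries no prediction of VP. It is this undefined case that a total function cannot express.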
7.4 Possible Uses 
As mentioned in Section 1 above, the original Steel and de Roeck proposal was put 
forward as a way of improving the efficiency of parsers for natural languages, such 
as English. Although they did not have any real statistical evidence that this guidance 
leads to more efficient parsing, they claimed that it did appear to help, judging by 
the performance of the parser they had implemented for use in an English-language 
query interface. That approach is dependent on the grammar writer having some 
linguistic intuitions about which constituents are best parsed bottom-up and which 
are best parsed top-down. Alternatively, the rule annotations could be developed from 
statistics about rule usage in parsing suitably large corpora. 
Some preliminary results (Willis 1996) suggest that on small grammars, gains of up 
to 35% can be made in efficiency (measured in terms of chart entries) by using certain 
combinations of the mechanisms formalized here. These gains are not great, and it is 
unclear whether similar improvements could be achieved in realistically large natural 
language grammars. The formal results in Sections 4 and 5 above suggest that it may 
not be worthwhile carrying out such experiments, unless grammars are restricted to 
those that are directly analyzable. 
Context-free grammar has been used as the basis here to simplify the formalization, 
to achieve some degree of generality, and to relate the work to 
existing formal language theory. Steel and de Roeck also use a CFG base as an ex- 
pository device for their ideas. However, it is extremely rare within computational 
linguistics for a pure CFG to be used in actual systems that parse natural language. 
Usually, some much more complex grammatical formalism is used, such as unification 
grammar (Shieber 1986). Many of the methods for parsing unification grammars are 
closely based on traditional CFG parsing techniques, with enhancements. This means 
that an obvious extension of the theoretical definitions and results in this paper would 
be the application of mixed strategy bidirectional parsing to unification grammars. 
Most of the framework could be retained, since the main difference between a simple 
unification grammar formalism and CFG is in the way that nonterminal symbols are 
compared or combined with each other. It is highly improbable that the undecidability 
result would be overturned, and it is even conceivable that the appropriate counterpart 
of direct analyzability might turn out to be less tractable. 
8. Conclusions 
Although the idea of allowing the grammar writer to specify the strategy to be used 
for each rule in a grammar may seem superficially appealing, the formal results 
presented here indicate that the approach is severely limited. In general, grammar annotation may lead 
to incompleteness. Although there is a decidable property--direct analyzability--that 
guarantees completeness, it is overrestrictive, in the sense that there are complete 
annotations that are not directly analyzable. There is also a wider class of complete 
annotations--indirectly analyzable--that cannot be decidably detected. 
There is also some question over the practical effectiveness of the mixed strategy 
technique, although that issue has not been explored here. 
Appendix A: Computing Direct Analyzability 
The algorithm is a simple variant of the use of an AND-OR graph in problem solving, 
as in Nilsson (1971). The graph will contain a node for each nonterminal symbol A in 
the grammar, and an OR node for each bottom-up rule. Each node has a label, which 
is either a nonterminal symbol or OR, and may, optionally, have a marking, which is 
either SOLVED or FAILED. 
1. For each symbol $A \in V_N$, create a node $N_A$, and insert arcs and markings as follows: 
   if there is a purely TD rule of the form $A \to \alpha$ 
   then mark $N_A$ as FAILED 
   else if all rules of the form $A \to \alpha$ are lexical 
   then mark $N_A$ as SOLVED 
   else 
     for each bottom-up rule of the form $A \to A_1 \ldots A_k$: 
     - create a node N labeled OR; 
     - create an arc from $N_A$ to N; 
     - create an arc from N to $N_{A_i}$ for every $i \in tr(A \to A_1 \ldots A_k)$ 
       such that $i > 0$. 
At this point, each non-terminally labeled node has outgoing arcs for every bottom-up rule 
that might expand it, and each of these arcs connects to an OR node, which in turn connects to 
every possible trigger category for that rule. Nodes marked FAILED correspond to categories that 
are not directly analyzable; nodes marked SOLVED correspond to those that are directly analyzable. 
Initially, any node marked SOLVED or FAILED has no outgoing arcs. 
2. Repeat until no changes occur in the graph: 
   for each node N in the graph: 
     if N is marked FAILED 
     then delete any arc into N from a node M; 
          if N is labeled OR, or there are no other outgoing 
          arcs from M 
          then - mark M as FAILED; 
               - remove any outgoing arcs from M; 
     if N is marked SOLVED 
     then if there is an arc into N from an OR-node M 
          then - mark M as SOLVED; 
               - remove any outgoing arcs from M. 
          else if there is an arc into N from a node $M_A$ 
          then delete this arc from $M_A$ to N; 
               if this leaves no outgoing arcs from $M_A$ 
               then mark $M_A$ as SOLVED. 
     if N is an OR node with no incoming arcs 
     then delete N and all its outgoing arcs. 
The properties remarked above remain invariant during this iteration. The iteration termi- 
nates as the graph is finite. On termination, the only arcs left must be in cycles. The categories 
associated with any nodes in cycles should be taken as directly analyzable. 
3. For every node $N_A$ that has an arc (incoming or outgoing) attached to it, mark $N_A$ 
as SOLVED. 
4. If for every purely bottom-up rule $A \to A_1 \ldots A_k$, there is an $i \in tr(A \to A_1 \ldots A_k)$ 
such that $N_{A_i}$ is marked SOLVED, then the grammar is directly analyzable. 
The above statement is not intended to be maximally efficient. No formal proof 
of its correctness is given here, but there is a fairly straightforward relationship to the 
property of direct analyzability, which is stated as the draft definition in Section 4.2. 
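The marking procedure above can be sketched in Python. This is a minimal illustration rather than a literal transcription: instead of deleting arcs, it recomputes the markings by fixpoint iteration, treating each category node as an AND over its OR nodes and each OR node as an OR over its trigger categories. The `Rule` encoding, the hand-set `lexical` flag, the function names, and the assumption that every trigger symbol of a non-lexical rule is a nonterminal are all choices made for this example.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional, Set, Tuple

SOLVED, FAILED = "SOLVED", "FAILED"

class Rule(NamedTuple):
    lhs: str
    rhs: Tuple[str, ...]
    triggers: FrozenSet[int]  # 0 = LHS (top-down); i >= 1 = i-th RHS symbol
    lexical: bool = False     # assumption: lexical rules are flagged by hand

def mark_categories(nonterminals: Set[str], rules: List[Rule]) -> Dict[str, str]:
    status: Dict[str, str] = {}
    or_nodes: Dict[str, List[Set[str]]] = {}
    # Step 1: initial markings, plus one OR node (a set of trigger
    # categories) per bottom-up rule of each unmarked category.
    for a in nonterminals:
        mine = [r for r in rules if r.lhs == a]
        if any(r.triggers == frozenset({0}) for r in mine):
            status[a] = FAILED
        elif all(r.lexical for r in mine):
            status[a] = SOLVED
        else:
            or_nodes[a] = [
                {r.rhs[i - 1] for i in r.triggers if i > 0}
                for r in mine
                if not r.lexical and any(i > 0 for i in r.triggers)
            ]
    # Step 2: propagate markings until nothing changes.
    changed = True
    while changed:
        changed = False
        for a, alternatives in or_nodes.items():
            if a in status:
                continue
            marks: List[Optional[str]] = []
            for kids in alternatives:
                if any(status.get(k) == SOLVED for k in kids):
                    marks.append(SOLVED)      # one solved trigger suffices
                elif all(status.get(k) == FAILED for k in kids):
                    marks.append(FAILED)      # every trigger has failed
                else:
                    marks.append(None)
            if FAILED in marks:               # one failed OR node fails the category
                status[a] = FAILED
                changed = True
            elif all(m == SOLVED for m in marks):
                status[a] = SOLVED
                changed = True
    # Step 3: categories still unmarked lie on cycles and count as SOLVED.
    for a in nonterminals:
        status.setdefault(a, SOLVED)
    return status

def directly_analyzable(nonterminals: Set[str], rules: List[Rule]) -> bool:
    # Step 4: every purely bottom-up rule needs at least one SOLVED trigger.
    status = mark_categories(nonterminals, rules)
    return all(
        any(status[r.rhs[i - 1]] == SOLVED for i in r.triggers)
        for r in rules
        if not r.lexical and 0 not in r.triggers
    )

# Hypothetical toy grammars: identical except for the annotation of the VP rule.
NT = {"S", "NP", "VP", "V", "Det", "N"}
LEX = [Rule("V", ("saw",), frozenset(), True),
       Rule("Det", ("the",), frozenset(), True),
       Rule("N", ("dog",), frozenset(), True)]
bottom_up = LEX + [Rule("S", ("NP", "VP"), frozenset({2})),
                   Rule("VP", ("V", "NP"), frozenset({1})),
                   Rule("NP", ("Det", "N"), frozenset({1}))]
mixed = LEX + [Rule("S", ("NP", "VP"), frozenset({2})),
               Rule("VP", ("V", "NP"), frozenset({0})),
               Rule("NP", ("Det", "N"), frozenset({1}))]
```

With these toy annotations, `directly_analyzable(NT, bottom_up)` holds, while `directly_analyzable(NT, mixed)` does not: the S rule is triggered by VP, but VP is built only by a purely top-down rule.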
Appendix B: Undecidability Proof 
Lemma 
For any two context-free grammars G1, G2, it is undecidable whether every member 
of L(G1) ends in a substring that is a member of L(G2). 
Proof 
Let $G_1$ and $G_2$ be two CFGs over the same alphabet $V$, with languages $L(G_1)$ and $L(G_2)$ 
respectively. Let # be a symbol that is not a member of $V$. Consider the language $L'_1$ 
given by: 
$\{\#x \mid x \in L(G_1)\}$ 
and $L'_2$ given by: 
$\{\#y \mid y \in L(G_2)\}$ 
These are both context-free languages; assume that grammars $G'_1$, $G'_2$ generate them. 
Suppose we have a procedure that would decide, for any two context-free grammars, 
whether every member of the language of one ends in a substring that is a member 
of the language of the other. Consider the question whether every member of $L(G'_2)$ 
(i.e., $L'_2$) ends in a substring that is a member of $L(G'_1)$ (i.e., $L'_1$). This is true iff every 
string of the form $\#y$ in $L'_2$ has a final substring that is in $\{\#x \mid x \in L(G_1)\}$. Since # is 
not in $V$, the only final substring of $\#y$ that begins with # is $\#y$ itself, so this can be 
true iff $y \in L(G_1)$. This will be true for every such string in $L'_2$ iff 
$y \in L(G_1)$ for every $y \in L(G_2)$; i.e., $L(G_2) \subseteq L(G_1)$. 
That is, a decision procedure for the final substring question would allow the 
construction of a decision procedure for the subset question for the languages gener- 
ated by two arbitrary context-free grammars, which in turn would provide a decision 
procedure for the equivalence of the languages, and that is known to be undecidable 
(Aho and Ullman 1972, Sect. 2.6.3). $\Box$ 
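The reduction used in the proof can be illustrated on small finite languages. The Python sketch below is only an illustration: finite sets of strings stand in for the context-free languages, and the function names are invented for the example. It shows that prefixing every string with a fresh marker turns the final-substring test into a subset test, which is the step the proof depends on. It says nothing about how such a test could be carried out on grammars.

```python
def every_member_ends_in(lang_a, lang_b):
    """True iff every string in lang_a has a final substring belonging to lang_b."""
    return all(any(s.endswith(t) for t in lang_b) for s in lang_a)

def subset_via_suffix_test(l1, l2, marker="#"):
    """Answer 'is l2 a subset of l1?' using only the final-substring test,
    after prefixing every string with a marker that occurs in neither language."""
    assert all(marker not in w for w in l1 | l2)
    l1_marked = {marker + x for x in l1}
    l2_marked = {marker + y for y in l2}
    return every_member_ends_in(l2_marked, l1_marked)

# Hypothetical finite languages for illustration.
L1 = {"ab", "b", "abc"}
L2_inside = {"ab", "b"}      # a subset of L1
L2_outside = {"ab", "cb"}    # "cb" is not in L1, although it ends in "b"
```

Without the marker, `every_member_ends_in(L2_outside, L1)` succeeds because "cb" ends in "b"; with the marker, `subset_via_suffix_test(L1, L2_outside)` fails, matching the fact that `L2_outside` is not a subset of `L1`.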
Acknowledgments 
I would like to thank Anne de Roeck, 
Alistair Willis, and Suresh Manandhar for 
useful discussions, and Nicolas Nicolov for 
comments on an earlier draft. The incisive 
and thorough comments of various 
anonymous reviewers have greatly 
improved this paper. 
References 
Aho, Alfred V. and Jeffrey D. Ullman. 1972. 
The Theory of Parsing, Translation, and 
Compiling. Volume 1: Parsing. Prentice-Hall, 
Englewood Cliffs, NJ. 
Earley, Jay. 1970. An efficient context-free 
parsing algorithm. Communications of the 
ACM, 13(2):94-102. 
Kay, Martin. 1989. Head-driven parsing. In 
Proceedings of the International Workshop on 
Parsing Technologies, pages 52-62, Carnegie 
Mellon University, Pittsburgh, PA, 
August. 
Maclane, Saunders and Garrett Birkhoff. 
1967. Algebra. Macmillan, London. 
Nilsson, Nils J. 1971. Problem-solving methods 
in artificial intelligence. McGraw-Hill, New 
York. 
Partee, Barbara H., Alice ter Meulen, and 
Robert E. Wall. 1990. Mathematical Methods 
in Linguistics. Kluwer Academic, 
Dordrecht. 
Satta, Giorgio and Oliviero Stock. 1989. 
Formal properties and implementation of 
bidirectional charts. In Proceedings of the 
Eleventh International Joint Conference on 
Artificial Intelligence (IJCAI-89), 
pages 1480-1485. 
Satta, Giorgio and Oliviero Stock. 1991. A 
tabular method for island-driven 
context-free grammar parsing. In 
Proceedings of the Eighth National Conference 
on Artificial Intelligence (AAAI-91), 
pages 143-148. 
Satta, Giorgio and Oliviero Stock. 1994. 
Bidirectional context-free grammar 
parsing for natural language processing. 
Artificial Intelligence, 69:123-164. 
Shieber, Stuart. 1986. An Introduction to 
Unification Approaches to Grammar. CSLI 
Lecture Notes Number 4. Center for the 
Study of Language and Information. 
Shieber, Stuart M., Yves Schabes, and 
Fernando C. N. Pereira. 1995. Principles 
and Implementation of Deductive 
Parsing. Journal of Logic Programming, 
24(1 & 2):3-36. 
Sikkel, Klaas and Rieks op den Akker. 1996. 
Predictive head-corner chart parsing. In 
Harry Bunt and Masaru Tomita, editors, 
Recent Advances in Parsing Technology. 
Kluwer Academic, Netherlands, 
chapter 9, pages 169-182. 
Stabler, Edward P. 1994. Parsing for 
incremental interpretation. Unpublished 
paper, UCLA, Los Angeles, CA. 
Steel, Sam and Anne de Roeck. 1987. 
Bidirectional chart parsing. In J. Hallam 
and C. Mellish, editors, Advances in 
Artificial Intelligence. John Wiley, 
pages 223-235. 
Stoy, Joseph E. 1981. Denotational Semantics: 
the Scott-Strachey approach to programming 
language theory. MIT Press, Cambridge, 
MA. 
Thompson, Henry and Graeme Ritchie. 
1984. Implementing natural language 
parsers. In T. O'Shea and M. Eisenstadt, 
editors, Artificial Intelligence: Tools, 
Techniques and Applications. Harper and 
Row, New York, Chapter 9, pages 
245-300. 
Willis, Alistair. 1996. Exploring Chart Parsing 
Mechanisms. Master's thesis, Department 
of Artificial Intelligence, University of 
Edinburgh, Edinburgh, Scotland. 
Winograd, Terry. 1983. Language as a 
Cognitive Process. Volume I: Syntax. 
Addison-Wesley, Reading, MA. 
