LHIP: Extended DCGs for Configurable Robust Parsing*
Afzal Ballim  Graham Russell
ISSCO, University of Geneva, 54 Route des Acacias, Geneva, CH-1227 Switzerland
email: afzal@divsun.unige.ch, russell@divsun.unige.ch
Abstract 
We present LHIP, a system for incremental gram- 
mar development using an extended DCG for- 
malism. The system uses a robust island-based
parsing method controlled by user-defined perfor- 
mance thresholds. 
Keywords: DCG, head, island parsing, robust
parsing, Prolog 
1 LHIP Overview 
This paper describes LHIP (Left-Head corner
Island Parser), a parser designed for broad-
coverage handling of unrestricted text. The sys-
tem interprets an extended DCG formalism to 
produce a robust analyser that finds parses of 
the input made from 'islands' of terminals (cor- 
responding to terminals consumed by success- 
ful grammar rules). It is currently in use for 
processing dialogue transcripts from the HCRC
Map Task Corpus (Anderson et al., 1991), al-
though we expect its eventual applications to be
much wider.1 Transcribed natural speech con-
tains a number of frequent characteristic 'un-
grammatical' phenomena: filled pauses, repeti-
tions, restarts, etc. (as in e.g. Right I'll have
...you know, like I'll have to ...so I'm going
between the picket fence and the mill, right.).2
While a full analysis of a conversation might well 
take these into account, for many purposes they 
represent a significant obstacle to analysis. LHIP
provides a processing method which allows se-
lected portions of the input to be ignored or han- 
dled differently. 
The chief modifications to the standard Prolog
'grammar rule' format are of two types: one or
more right-hand side (RHS) items may be marked
as 'heads', and one or more RHS items may be
marked as 'ignorable'. We expand on these points
and introduce other differences below.
*This work was carried out under grants nos. 20-
33903.92 and 12-36505.92 from the Swiss National
Fund.
1Note that the input consists of written texts
within the Map Task Corpus; LHIP is not intended
for use in speech processing.
2This example is taken from the Map Task Corpus.
The behaviour of LHIP can best be understood
in terms of the notions of island, span, cover
and threshold:
Island: Within an input string consisting of the
terminals (t1, t2, ..., tn), an island is a sub-
sequence (ti, ti+1, ..., ti+m), whose length is
m + 1.
Span: The span of a grammar rule R is the
length of the longest island (ti, ..., tj) such
that terminals ti and tj are both consumed
(directly or indirectly) by R.
Cover: A rule R is said to cover m items if m
terminals are consumed within the island de-
scribed by R. The coverage of R is then m.
Threshold: The threshold of a rule is the mini-
mum value for the ratio of its coverage c to
its span s which must hold in order for the
rule to succeed. Note that c <= s, and that
if c = s the rule has completely covered the
span, consuming all terminals.
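The threshold test above amounts to a simple ratio check. As an informal illustration (a Python sketch of the definitions only; LHIP itself is implemented in Prolog):

```python
def rule_succeeds(coverage: int, span: int, threshold: float) -> bool:
    """Threshold test from the definitions above: a rule succeeds
    when the ratio of its coverage c to its span s reaches the
    threshold."""
    assert coverage <= span  # c <= s always holds
    return coverage / span >= threshold

# An island of span 6 in which 4 terminals are consumed:
print(rule_succeeds(4, 6, 0.5))  # True: 4/6 meets a 0.5 threshold
print(rule_succeeds(4, 6, 1.0))  # False: some input is uncovered
```

When c = s the ratio is 1, so a fully covered island passes any threshold.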
As implied here, rules need not cover all of the 
input in order to succeed. More specifically, the 
constraints applied in creating islands are such 
that islands do not have to be adjacent, but may
be separated by non-covered input. Moreover, 
an island may itself contain input which is unac- 
counted for by the grammar. Islands do not over- 
lap, although when multiple analyses exist they
will in general involve different segmentations of 
the input into islands. 
There are two notions of non-coverage of the
input: sanctioned and unsanctioned non-
coverage. The latter case arises when the gram-
mar simply does not account for some terminal.
Sanctioned non-coverage means that some num-
ber of special 'ignore' rules have been applied
which simulate coverage of input material lying
between the islands, thus in effect making the is-
lands contiguous. Those parts of the input that
have been 'ignored' are considered to have been
consumed. These ignore rules can be invoked in-
dividually or as a class. It is this latter capabil-
ity which distinguishes ignore rules from regular
rules, as they are functionally equivalent other-
wise, mainly serving as a notational aid for the
grammar writer.
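The distinction can be pictured as follows (an illustrative Python sketch, not part of the system; the positions and sets are invented for the example):

```python
def classify_non_coverage(n_terminals, covered, ignored):
    """Partition terminal positions that no grammar rule consumed
    into sanctioned (consumed by 'ignore' rules) and unsanctioned
    (simply unaccounted for by the grammar)."""
    positions = set(range(n_terminals))
    sanctioned = set(ignored) - set(covered)
    unsanctioned = positions - set(covered) - set(ignored)
    return sanctioned, unsanctioned

# 8 terminals: grammar rules consume positions 0-4, an ignore rule
# consumes 5-6, and position 7 is not accounted for at all.
s, u = classify_non_coverage(8, {0, 1, 2, 3, 4}, {5, 6})
print(sorted(s), sorted(u))  # [5, 6] [7]
```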
Strict adjacency between RHS clauses can be 
specified in the grammar. It is possible to define 
global and local thresholds for the proportion of 
the spanned input that must be covered by rules; 
in this way, the user of an LHIP grammar can 
exercise quite fine control over the required accu- 
racy and completeness of the analysis. 
A chart is kept of successes and failures of rules, 
both to improve efficiency and to provide a means 
of identifying unattached constituents. In addi- 
tion, feedback is given to the grammar writer on 
the degree to which the grammar is able to cope 
with the given input; in a context of grammar de- 
velopment, this may serve as notification of areas 
to which the coverage of the grammar might next 
be extended. 
The notion of 'head' employed here is con-
nected more closely with processing control than 
linguistics. In particular, nothing requires that a 
head of a rule should share any information with 
the LHS item, although in practice it often will.
Heads serve as anchor-points in the input string 
around which islands may be formed, and are 
accordingly treated before non-head items (RHS 
items are re-ordered during compilation; see be-
low). In the central role of heads, LHIP resem-
bles parsers devised by Kay (1989) and van Noord 
(1991); in other respects, including the use which 
is made of heads, the approaches are rather dif- 
ferent, however. 
2 The LHIP System 
In this section we describe the LHIP system. 
First, we define what constitutes an acceptable 
LHIP grammar, second, we describe the process 
of converting such a grammar into Prolog code, 
and third, we describe the analysis of input with 
such a grammar. 
LHIP grammars are an extended form of Pro-
log DCG grammars. The extensions can be sum-
marized as follows:3
1. one or more RHS clauses may be nominated
as heads;
2. one or more RHS clauses may be marked as
optional;
3. 'ignore' rules may be invoked;
4. adjacency constraints may be imposed be-
tween RHS clauses;
5. a global threshold level may be set to deter-
mine the minimum fraction of spanned input
that must be covered in a parse, and
6. a local threshold level may be set in a rule
to override the global threshold within that
rule.
3A version of LHIP exists which permits a form
of negation on RHS clauses. That version is not de-
scribed here.
We provide a syntactic definition (below) of a
LHIP grammar rule, using a notation with syn-
tactic rules of the form C => F1 | F2 | ... | Fn
which indicates that the category C may take any
of the forms F1 to Fn. An optional item in a form
is denoted by surrounding it with square brackets
'[...]'. Syntactic categories are italicised, while
terminals are underlined: '...'.
A LHIP grammar rule has the form:
lhiprule => [ - ] term [ # T ] ~~> lhipbody
where T is a value between zero and one. If 
present, this value defines the local threshold 
fraction for that rule. This local threshold value 
overrules the global threshold. The symbol '-'
before the name of a rule marks it as being an
'ignore' rule. Only a rule defined this way can be
invoked as an ignore rule in an RHS clause.
lhipbody => lhipclause
          | lhipclause , lhipbody
          | lhipclause : lhipbody
          | lhipclause ; lhipbody
          | lhipclause ~ lhipbody
          | (? lhipbody ?)
The connectives ',' and ';' have the same prece-
dence as in Prolog, while '~' has the same prece-
dence as ','. Parentheses may be used to resolve
ambiguities. The connective '~' is used to indi-
cate that strings subsumed by two RHS clauses
are ordered but not necessarily adjacent in the
input. Thus 'A ~ B' indicates that A precedes
B in the input, perhaps with some intervening
material. The stronger constraint of immediate
precedence is marked by ':'; 'A : B' indicates that
the span of A precedes that of B, and that there
is no uncovered input between the two. Disjunc-
tion is expressed by ';', and optional RHS clauses
are surrounded by '(? ... ?)'.
lhipclause => term
            | * term
            | @ string
            | * @ string
            | - term
            | []
            | { prologcode }
The symbol '*' is used to indicate a head 
clause. A rule name is a Prolog term, and only 
rules and terminal items may act as heads within 
a rule body. The symbol '@' introduces a ter- 
minal string. As previously said, the purpose
of ignore rules is simply to consume input ter-
minals, and their intended use is in facilitat-
ing repairs in analysing input that contains the
false starts, restarts, filled pauses, etc. mentioned
above. These rules are referred to individually by
preceding their name by the '-' symbol. They
can also be referred to as a class in a rule body by
the special RHS clause '[]'. If used in a rule body,
they indicate that input is potentially ignored;
the problems that ignore rules are intended to re-
pair will not always occur, in which case the rules
succeed without consuming any input. There is a
semantic restriction on the body of a rule which 
is that it must contain at least one clause which 
necessarily covers input (optional clauses and ig- 
nore rules do not necessarily cover input). 
The following is an example of a LHIP rule.
Here, the sub-rule 'conjunction(Conj)' is marked
as a head and is therefore evaluated before either
of 's(Sl)' or 's(Sr)':
s(conjunct(Conj, Sl, Sr)) ~~>
    s(Sl),
    * conjunction(Conj),
    s(Sr).
How is such a rule converted into Prolog code
by the LHIP system? First, the rule is read 
and the RHS clauses are partitioned into those 
marked as heads, and those not. A record is 
kept of their original ordering, and this record 
allows each clause to be constrained with respect
to the clause that precedes it, as well as with re- 
spect to the next head clause which follows it.
Additional code is added to maintain a chart of 
known successes and failures of each rule. Each 
rule name is turned into the name of a Prolog 
clause, and additional arguments are added to it.
These arguments are used for the input, the start 
and end points of the area of the input in which
the rule may succeed, the start and end points
of the actual part of the input over which it in 
fact succeeds, the number of terminal items cov- 
ered within that island, a reference to the point 
in the chart where the result is stored, and a list 
of pointers to sub-results. The converted form of 
the above rule is given below (minus the code for
chart maintenance): 
s(conjunct(H,I,J), A, B, C, D, E, F,
        [L|K]-K, G) :-
    lhip_threshold_value(M),
    conjunction(H, A, B, C, O, P, Q,
        R-S, _),
    s(I, A, B, O, D, _, T, C-R, _),
    s(J, A, P, C, _, E, U, S-[L], _),
    F is U+Q+T,
    F/(E-D) >= M.
The important points to note about this con-
verted form are the following:
1. the conjunction clause is searched for be-
fore either of the two s clauses;
2. the region of the input to be searched for the
conjunction clause is the same as that for
the rule's LHS (B-C): its island extends from
O to P and covers Q items;
3. the search region for the first s clause is B-O
(i.e. from the start of the LHS search region
to the start of the conjunction island), its
island starts at D and covers T items;
4. the search region for the second s clause is
P-C (i.e. from the end of the conjunction
island to the end of the LHS search region),
its island ends at E and covers U items;
5. the island associated with the rule as a whole
extends from D to E and covers F items,
where F is U + Q + T;
6. lhip_threshold_value/1 unifies its argu-
ment M with the current global threshold
value.
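The arithmetic in points 2-6 can be traced with concrete numbers (a sketch only: the values below are invented, but the variable names mirror those of the compiled clause):

```python
# Hypothetical coverage counts for the three sub-islands:
Q = 1  # terminals covered by the conjunction island (from O to P)
T = 3  # terminals covered by the first s island (starting at D)
U = 3  # terminals covered by the second s island (ending at E)

D, E = 0, 8  # extent of the island for the rule as a whole
M = 0.5      # current global threshold

F = U + Q + T               # total coverage, as in 'F is U+Q+T'
print(F, F / (E - D) >= M)  # 7 True: 7/8 exceeds the threshold
```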
In the current implementation of LHIP, compiled
rules are interpreted depth-first and left-to-right
by the standard Prolog theorem-prover, giving an
analyser that proceeds in a top-down, 'left-head-
corner' fashion. Because of the reordering car-
ried out during compilation, the situation regard-
ing left-recursion is slightly more subtle than in
a conventional DCG. The 's(conjunct(...))' rule
shown above is a case in point. While at first
sight it appears left-recursive, inspection of its
converted form shows its true leftmost subrule
to be 'conjunction'. Naturally, compilation may
induce left-recursion as well as eliminating it, in
which case LHIP will suffer from the same ter-
mination problems as an ordinary DCG formal-
ism interpreted in this way. And as with an or-
dinary DCG formalism, it is possible to apply 
different parsing methods to LHIP in order to 
circumvent these problems (see e.g. Pereira and 
Shieber, 1987). A related issue concerns the in- 
terpretation of embedded Prolog code. Reorder- 
ing of RHS clauses will result in code which pre-
cedes a head within a LHIP rule being evaluated
after it; judicious freezing of goals and avoidance 
of unsafe cuts are therefore required. 
LHIP provides a number of ways of applying 
a grammar to input. The simplest allows one to 
enumerate the possible analyses of the input with 
the grammar. The order in which the results are 
produced will reflect the lexical ordering of the
rules as they are converted by LHIP. With the 
threshold level set to 0, all analyses possible with 
the grammar by deletion of input terminals can 
be generated. Thus, supposing a suitable gram- 
mar, for the sentence John saw Mary and Mark 
saw them there would be analyses corresponding 
to the sentence itself, as well as John saw Mary, 
John saw Mark, John saw them, Mary saw them, 
Mary and Mark saw them, etc. 
By setting the threshold to 1, only those par- 
tial analyses that have no unaccounted for ter- 
minals within their spans can succeed. Hence, 
Mark saw them would receive a valid analysis, as 
would Mary and Mark saw them, provided that 
the grammar contains a rule for conjoined NPs; 
John saw them, on the other hand, would not. As 
this example illustrates, a partial analysis of this 
kind may not in fact correspond to a true sub- 
parse of the input (since Mary and Mark was not 
a conjoined subject in the original). Some care 
must therefore be taken in interpreting results. 
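The contrast between threshold 0 and threshold 1 can be mimicked with a toy enumerator (a Python sketch: it treats an 'analysis' as any index subsequence of a fixed length, a simplification of what a real grammar licenses):

```python
from itertools import combinations

def toy_analyses(tokens, length, threshold):
    """Keep index subsequences whose coverage/span ratio (number of
    indices over the extent they stretch across) meets the threshold.
    Threshold 0 admits any subsequence; threshold 1 admits only
    contiguous runs."""
    out = []
    for idxs in combinations(range(len(tokens)), length):
        span = idxs[-1] - idxs[0] + 1
        if length / span >= threshold:
            out.append(tuple(tokens[i] for i in idxs))
    return out

toks = "John saw Mary and Mark saw them".split()
print(("John", "saw", "them") in toy_analyses(toks, 3, 0))  # True
print(("Mark", "saw", "them") in toy_analyses(toks, 3, 1))  # True
print(("John", "saw", "them") in toy_analyses(toks, 3, 1))  # False
```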
A number of built-in predicates are provided 
which allow the user to constrain the behaviour of 
the parser in various ways, based on the notions 
of coverage, span and threshold: 
lhip_phrase(+C, +S)
Succeeds if the input S can be parsed as an
instance of category C.
lhip_cv_phrase(+C, +S)
As for lhip_phrase/2, except that all of the
input must be covered.
lhip_phrase(+C, +S, -B, -E, -Cov)
As for lhip_phrase/2, except that B binds to
the beginning of the island described by this
application of C, E binds to the position imme-
diately following the end, and Cov binds to the
number of terminals covered.
lhip_mc_phrases(+C, +S, -Cov, -Ps)
The maximal coverage of S by C is Cov. Ps is
the set of parses of S by C with coverage Cov.
lhip_minmax_phrases(+C, +S, -Cov, -Ps)
As for lhip_mc_phrases/4, except that Ps is
additionally the set of parses with the least
span.
lhip_seq_phrase(+C, +S, -Seq)
Succeeds if Seq is a sequence of one or more
parses of S by C such that they are non-
overlapping and each consumes input that pre-
cedes that consumed by the next.
lhip_maxT_phrases(+C, +S, -MaxT)
MaxT is the set of parses of S by C that have
the highest threshold value. On backtracking it
returns the set with the next highest threshold
value.
In addition, other predicates can be used to 
search the chart for constituents that have been 
identified but have not been attached to the parse 
tree. These include: 
lhip_success 
Lists successful rules, indicating island position 
and coverage. 
lhip_ms_success
As for lhip_success, but lists only the most
specific successful rules (i.e. those which have
themselves succeeded but whose results have
not been used elsewhere).
lhip_ms_success(N)
As for lhip_ms_success, but lists only suc-
cessful instances of rule N.
Even if a sentence receives no complete analysis, 
it is likely to contain some parsable substrings; re-
sults from these are recorded together with their 
position within the input. By using these predi- 
cates, partial but possibly useful information can
be extracted from a sentence despite a global fail- 
ure to parse it (see section 4). 
The conversion of the grammar into Prolog 
code means that the user of the system can eas- 
ily develop analysis tools that apply different
constraints, using the tools provided as building 
blocks. 
3 Using LHIP 
As previously mentioned, LHIP facilitates a cyc- 
lic approach to grammar development. Suppose 
one is writing an English grammar for the Map 
Task Corpus, and that the following is the first 
attempt at a rule for noun phrases (with appro- 
priate rules for determiners and nouns): 
np(N, D, A) # 0.5 ~~>
    determiner(D),
    * noun(N).
While this rule will adequately analyse simple
NPs such as your map, or a missionary camp, on
an NP such as the bottom right-hand corner it will
give analyses for the bottom, the right-hand and
the corner. Worse still, in a long sentence it will
join determiners from the start of the sentence
to nouns that occur in the latter half of the sen-
tence. The number of superfluous analyses can
be reduced by imposing a local threshold level,
of say 0.5. By looking at the various analyses of
sentences in the corpus, one can see that this rule
gives the skeleton for noun phrases, but from the
fraction of coverage of these parses one can also
see that it leaves out an important feature, adjec-
tives, which are optionally found in noun phrases.
np(N, D, A) # 0.5 ~~>
determiner(D), 
(? adjectives(A) ?), 
* noun(N). 
With this rule, one can now handle such
phrases as the left-hand bottom corner, and a ba- 
nana tree. Suppose further that this rule is now 
applied to tile corpus, and then the rule is ap- 
plied again but with a local threshold level of 1. 
By looking at items parsed in the first case but 
not in the second, one can identify features of 
noun phrases found in the corpus that are not
covered by the current rules. This might include,
for instance, phrases of the form a slightly dip- 
ping line. One can then go back to the grammar
and see that the noun phrase rule needs to be
changed to account for certain types of modifier 
including adjectives and adverbial modifiers. 
It is also possible to set local thresholds dy-
namically, by making use of the '{ prolog code }' 
facility: 
np(N, D, A) # T ~~>
    determiner(D),
    (? adjectives(A) ?),
    * noun(N),
    { set_dynamic_threshold(A,T) }.
In this way, the strictness of a rule may be var- 
ied according to information originating either 
within the particular run-time invocation of the 
rule, or elsewhere in the current parse. For exam- 
ple, it would be possible, by providing a suitable 
definition for set_dynamic_threshold/2, to set T to 
0.5 when more than one optional adjective has
been found, and 0.9 otherwise. 
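The text leaves set_dynamic_threshold/2 undefined; one hypothetical policy matching that description, sketched in Python rather than Prolog, would be:

```python
def set_dynamic_threshold(adjectives):
    """Hypothetical policy: relax the rule's threshold to 0.5 once
    more than one optional adjective has been found, and demand
    0.9 coverage otherwise."""
    return 0.5 if len(adjectives) > 1 else 0.9

print(set_dynamic_threshold(["left-hand", "bottom"]))  # 0.5
print(set_dynamic_threshold(["bottom"]))               # 0.9
```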
Once a given rule or set of rules is stable, and
the writer is satisfied with the performance of
that part of the grammar, a local threshold value
of 1 may be assigned so that superfluous parses
will not interfere with work elsewhere. 
The use of the chart to store known results 
and failures allows the user to develop hybrid 
parsing techniques, rather than relying on the 
default depth-first top-down strategy given by 
analysing with respect to the top-most category. 
For instance, it is possible to analyse the input
in 'layers' of linguistic categories, perhaps start- 
ing by analysing noun-phrases, then prepositions, 
verbs, relative clauses, clauses, conjuncts, and fi- 
nally complete sentences. Such a strategy allows 
the user to perform processing of results between 
these layers, which can be useful in trying to find
the 'best' analyses first. 
4 Partial results 
The discussion of built-in predicates mentioned
facilities for recovering partial parses. Here we
illustrate this process, and indicate what further
use might be made of the information thus ob-
tained.
In the following example, the chart is inspected
to reveal what constituents have been built dur-
ing a failed parse of the truncated sentence Have
you the tree by the brook that... : 
> lhip_phrase(s(S),
    [have,you,the,tree,by,the,brook,that]).
no
> lhip_success.
(-1) [7--8) /1 --> @brook
(-1) [5--6) /1 --> @by
(-1) [1--2) /1 --> @have
(-1) [8--9) /1 --> @that
(-1) [3--4) /1 --> @the
(-1) [6--7) /1 --> @the
(-1) [4--5) /1 --> @tree
(-1) [2--3) /1 --> @you
(4) [2--8) /4 -->
    np(nppp(you,
        pp(by,np(the,brook,B))))
(4) [3--8) /5 -->
    np(nppp(np(the,tree,C),
        pp(by,np(the,brook,D))))
(5) [3--8) /2 --> np(np(the,brook,A))
(5) [6--8) /2 --> np(np(the,brook,G))
(5) [3--5) /2 --> np(np(the,tree,E))
(7) [4--5) /1 --> noun(tree)
(8) [7--8) /1 --> noun(brook)
(9) [2--3) /1 --> np(you)
(10) [5--8) /3 -->
    pp(pp(by,np(the,brook,F)))
(11) [3--4) /1 --> det(the)
(11) [6--7) /1 --> det(the)
yes
Each rule is listed with its identifier ('-1' for lex-
ical rules), the island which it has analysed with
beginning and ending positions, its coverage, and
the representation that was constructed for it.
From this output it can be seen that the gram-
mar manages reasonably well with noun phrases,
but is unable to deal with questions (the initial
auxiliary have remains unattached).
Users will often be more interested in the 
successful application of rules which represent 
maximal constituents. These are displayed by 
lhip_ms_success:
> lhip_ms_success.
(-1) [1--2) /1 --> @have
(-1) [8--9) /1 --> @that
(4) [2--8) /4 -->
    np(nppp(you,
        pp(by,np(the,brook,J))))
(4) [3--8) /5 -->
    np(nppp(np(the,tree,H),
        pp(by,np(the,brook,I))))
(5) [3--8) /2 --> np(np(the,brook,K))
yes
Here, two unattached lexical items have been 
identified, together with two instances of rule 4, 
which combines an NP with a postmodifying PP.
The first of these has analysed the island you the 
tree by the brook, ignoring the tree, while the sec- 
ond has analysed the tree by the brook, consum- 
ing all terminals. There is a second analysis for 
the tree by the brook, due to rule 5, which has
been obtained by ignoring the sequence tree by 
the. From this information, a user might wish to
rank the three results according to their respec- 
tive span:coverage ratios, probably preferring the 
second. 
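Such a ranking is straightforward to compute from the chart listing. A sketch using the three maximal results above, represented as (rule, begin, end, coverage) tuples:

```python
# The three constituent analyses reported by lhip_ms_success:
results = [
    (4, 2, 8, 4),  # you the tree by the brook, ignoring 'the tree'
    (4, 3, 8, 5),  # the tree by the brook, all terminals consumed
    (5, 3, 8, 2),  # the brook, ignoring 'tree by the'
]

def ratio(entry):
    rule, begin, end, coverage = entry
    return coverage / (end - begin)  # coverage over span

best = max(results, key=ratio)
print(best)  # the fully covered parse of 'the tree by the brook'
```

The second result scores 5/5 = 1.0 against 4/6 and 2/5 for the others, so it is ranked first.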
5 Discussion 
The ability to deal with large amounts of possi-
bly ill-formed text is one of the principal objec- 
tives of current NLP research. Recent proposals 
include the use of probabilistic methods (see e.g. 
Briscoe and Carroll, 1993) and large robust deter-
ministic systems like Hindle's Fidditch (Hindle,
1989).4 Experience so far suggests that systems
like LHIP may in the right circumstances provide
an alternative to these approaches. It combines 
the advantages of Prolog-interpreted DCGs (ease 
of modification, parser output suitable for direct 
use by other programs, etc.) with the ability to 
relax the adjacency constraints of that formalism
in a flexible and dynamic manner. 
LHIP is based on the assumption that partial
results can be useful (often much more useful 
than no result at all), and that an approxima- 
tion to complete coverage is more useful when it 
comes with indications of how approximate it is. 
This latter point is especially important in cases 
where a grammar must be usable to some degree 
at a relatively early stage in its development, as 
is, for example, the case with the development of 
a grammar for the Map Task Corpus. In the near 
future, we expect to apply LHIP to a different 
problem, that of defining a restricted language 
for specialized parsing. 
The rationale for the distinction between sanc- 
tioned and unsanctioned non-coverage of input is 
twofold. First, the 'ignore' facility permits dif-
ferent categories of unidentified input to be dis- 
tinguished. For example, it may be interesting 
to separate material which occurs at the start 
of the input from that appearing elsewhere. Ig- 
nore rules have a similar functionality to that of
normal rules. In particular, they can have ar- 
guments, and may therefore be used to assign 
a structure to unidentified input so that it may 
be flagged as such within an overall parse. Sec- 
ondly, by setting a threshold value of 1, LHIP can
be made to perform like a standardly interpreted
Prolog DCG, though somewhat more efficiently
due to the use of the chart.5
4Indeed, the ability of Fidditch to return a se-
quence of parsed but unattached phrases when a
global analysis fails has clearly influenced the design
of LHIP.
A number of possible extensions to the sys- 
tem can be envisaged. Whereas at present each 
rule is compiled individually, it would be prefer- 
able to enhance preprocessing in order to com- 
pute certain kinds of global information from the 
grammar. One improvement would be to deter- 
mine possible linking of 'root-to-head' sequences 
of rules, and index these to terminal items for use 
as an oracle during analysis. A second would be
to identify those items whose early analysis would 
most strongly reduce the search space for sub- 
sequent processing and scan the input to begin
parsing at those points rather than proceeding
strictly from left to right. This further suggests
the possibility of a parallel approach to parsing. 
We expect that these measures would increase
the efficiency of LHIP. 
Currently, also, results are returned in an order 
determined by the order of rules in the grammar. 
It would be preferable to arrange matters in a 
more cooperative fashion so that the best (those 
with the highest coverage to span ratio) are dis- 
played first. Support for bidirectional parsing 
(see Satta and Stock, to appear) is another candi- 
date for inclusion in a later version. These appear 
to be longer-term research goals, however.6
Acknowledgments: The authors would like to 
thank Louis des Tombe and Dominique Estival 
for comments on earlier versions of this paper. 
References 
Anderson, A.H., M. Bader, E.G. Bard, E. Boyle,
G. Doherty, S. Garrod, S. Isard, J. Kowtko, J.
McAllister, J. Miller, C. Sotillo, H. Thompson
and R. Weinert (1991) "The HCRC Map Task
Corpus", Language and Speech 34(4), 351-366.
Briscoe, T. and J. Carroll (1993) "Generalized
Probabilistic LR Parsing of Natural Language
(Corpora) with Unification-Based Grammars",
Computational Linguistics 19(1), 25-59.
Hindle, D. (1989) "Acquiring Disambiguation 
Rules from Text". Proceedings of the 27th An- 
nual Meeting of the Association for Computa- 
tional Linguistics, 118-125. 
5In large grammars there is a significant time gain.
The chart's main advantage, however, is in identify- 
ing unattached constituents and allowing a 'layered' 
approach to analysis of input. 
6Source code for the LHIP system has been made 
publicly available. For information, contact the 
authors. 
Kay, M. (1989) "Head-Driven Parsing", Proceed- 
ings of the Workshop on Parsing Technologies, 
52-62. 
Pereira, F.C.N. and S.M. Shieber (1987) Prolog 
and Natural Language Analysis, CSLI Lecture 
Notes No. 10, Stanford University. 
Satta, G. and O. Stock (to appear) "Bidirec- 
tional Context-Free Grammar Parsing for Nat- 
ural Language Processing", Artificial Intelli- 
gence. 
van Noord, G. (1991) "Head Corner Parsing for 
Discontinuous Constituency", Proceedings of 
the 29th Annual Meeting of the Association for 
Computational Linguistics, 114-121. 
