Know When to Hold 'Em: Shuffling Deterministically in a Parser 
for Nonconcatenative Grammars* 
Robert T. Kasper, Mike Calcagno, and Paul C. Davis 
Department of Linguistics, Ohio State University 
222 Oxley Hall 
1712 Neil Avenue 
Columbus, OH 43210 U.S.A. 
Email: {kasper,calcagno,pcdavis) @ling.ohio-state.edu 
Abstract 
Nonconcatenative constraints, such as the shuffle re- 
lation, are frequently employed in grammatical anal- 
yses of languages that have more flexible ordering of 
constituents than English. We show how it is pos- 
sible to avoid searching the large space of permuta- 
tions that results from a nondeterministic applica- 
tion of shuffle constraints. The results of our imple- 
mentation demonstrate that deterministic applica- 
tion of shuffle constraints yields a dramatic improve- 
ment in the overall performance of a head-corner 
parser for German using an HPSG-style grammar. 
1 Introduction 
Although there has been a considerable amount of 
research on parsing for constraint-based grammars 
in the HPSG (Head-driven Phrase Structure Gram- 
mar) framework, most computational implementa- 
tions embody the limiting assumption that the con- 
stituents of phrases are combined only by concate- 
nation. The few parsing algorithms that have been 
proposed to handle more flexible linearization con- 
straints have not yet been applied to nontrivial 
grammars using nonconcatenative constraints. For 
example, van Noord (1991; 1994) suggests that the 
head-corner parsing strategy should be particularly 
well-suited for parsing with grammars that admit 
discontinuous constituency, illustrated with what he 
calls a "tiny" fragment of Dutch, but his more re- 
cent development of the head-corner parser (van No- 
ord, 1997) only documents its use with purely con- 
catenative grammars. The conventional wisdom has 
been that the large search space resulting from the 
use of such constraints (e.g., the shuffle relation) 
makes parsing too inefficient for most practical ap- 
plications. On the other hand, grammatical anal- 
yses of languages that have more flexible ordering 
of constituents than English make frequent use of 
constraints of this type. For example, in recent 
work by Dowty (1996), Reape (1996), and Kathol 
" This research was sponsored in part by National Science 
Foundation grant SBR-9410532, and in part by a seed grant 
from the Ohio State University Office of Research; the opin- 
ions expressed here are solely those of the authors. 
(1995), in which linear order constraints are taken 
to apply to domains distinct from the local trees 
formed by syntactic combination, the nonconcate- 
native shuffle relation is the basic operation by 
which these word order domains are formed. Reape 
and Kathol apply this approach to various flexible 
word-order constructions in German. 
A small sampling of other nonconcatenative op- 
erations that have often been employed in linguistic 
descriptions includes Bach's (1979) wrapping oper- 
ations, Pollard's (1984) head-wrapping operations, 
and Moortgat's (1996) extraction and infixation op- 
erations in (categorial) type-logical grammar. 
What is common to the proposals of Dowty, 
Reape, and Kathol, and to the particular analysis 
implemented here, is the characterization of nat- 
ural language syntax in terms of two interrelated 
but in principle distinct sets of constraints: (a) con- 
straints on an unordered hierarchical structure, pro- 
jected from (grammatical-relational or semantic) va- 
lence properties of lexical items; and (b) constraints 
on the linear order in which elements appear. In 
this type of framework, constraints on linear order 
may place conditions on the the relative order of 
constituents that are not siblings in the hierarchical 
structure. To this end, we follow Reape and Kathol 
and utilize order domains, which are associated with 
each node of the hierarchical structure, and serve 
as the domain of application for linearization con- 
straints. 
In this paper, we show how it is possible to avoid 
searching the large space of permutations that re- 
sults from a nondeterministic application of shuffle 
constraints. By delaying the application of shuffle 
constraints until the linear position of each element 
is known, and by using an efficient encoding of the 
portions of the input covered by each element of an 
order domain, shuffle constraints can be applied de- 
terministically. The results of our implementation 
demonstrate that this optimization of shuffle con- 
straints yields a dramatic improvement in the overall 
performance of a head-corner parser for German. 
The remainder of the paper is organized as fol- 
lows: §2 introduces the nonconcatenative fragment 
663 
(1) Seiner Freundin liess er ihn helfen 
his(DAT) friend(FEM) allows he(NOM) him(ACC) help 
'He allows him to help his friend.' 
(2) Hilft sie ihr schnell 
help she(NOM) her(DAT) quickly 
'Does she help her quickly?' 
(3) Der Vater denkt dass sie ihr seinen Sohn helfen liess 
The(NOM) father thinks that she(NOM) her(DAW) his(ACe) son help allows 
'The father thinks that she allows his son to help her.' 
(4) 
r_decl dorr~_obj 
PHON seiner Freundin 
SYNSEM NP 
TOPO vf 
r dom_obj \] I zio.I 
' |SWSEM V|' 
LTOPO cf J 
dom_obj \] 
PHON er | 
SYNSEM NP| ' 
TOPO m/ J 
" dom_obj \] 
PHON ihn I 
SYNSEM NP I ' 
TOPO mf 1 
(5) . \[,o o4\] . . \[ o.ow\] . \[,o.o.4 
Figure 1: Linear order of German clauses. 
dora_obj \] 
PHON herren I 
SYNSEM W / 
TOPO vc .I 
S \[DOM(\[seiner Freundin\],\[liess\],\[er\],\[ihn\],\[helfen\])\] 
VP \[DOM(\[seiner Freundin\],\[liess\],\[ihn\],\[hel\]en\])\] NP 
I 
VP \[DOM(\[seiner Freundin\],\[liess\],\[helfen\])\] NP er 
V \[DOM(\[liess\],\[helfen\])\] NP \[DOM(\[seiner\],\[lareundin\])\] ihn 
V V N Det 
I I I I liess helfen Freundin seiner 
Figure 2: Hierarchical structure of sentence (1). 
of German which forms the basis of our study; §3 
describes the head-corner parsing algorithm that we 
use in our implementation; §4 discusses details of the 
implementation, and the optimization of the shuffle 
constraint is explained in §5; §6 compares the perfor- 
mance of the optimized and non-optimized parsers. 
2 A German Grammar Fragment 
The fragment is based on the analysis of German 
in Kathol's (1995) dissertation. Kathol's approach 
is a variant of HPSG, which merges insights from 
both Reape's work and from descriptive accounts of 
German syntax using topological fields (linear posi- 
tion classes). The fragment covers (1) root declara- 
tive (verb-second) sentences, (2) polar interrogative 
(verb-first) clauses and (3) embedded subordinate 
(verb-final) clauses, as exemplified in Figure 1. 
The linear order of constituents in a clause is rep- 
resented by an order domain (DOM), which is a list 
of domain objects, whose relative order must satisfy 
a set of linear precedence (LP) constraints. The or- 
der domain for example (1) is shown in (4). Notice 
that each domain object contains a TOPO attribute, 
whose value specifies a topological field that par- 
tially determines the object's linear position in the 
list. Kathol defines five topological fields for German 
clauses: Vorfeld (v\]), Comp/Left Sentence Bracket 
(c\]), Mittelfeld (m\]), Verb Cluster/Right Sentence 
Bracket (vc), and Nachfeld (nO). These fields are or- 
dered according to the LP constraints shown in (5). 
The hierarchical structure of a sentence, on the 
other hand, is constrained by a set of immediate 
dominance (ID) schemata, three of which are in- 
cluded in our fragment: Head-Argument (where "Ar- 
gument" subsumes complements, subjects, and spec- 
ifiers), Adjunct-Head, and Marker-Head. The Head- 
664 
Argument schema is shown below, along with the 
constraints on the order domain of the mother con- 
stituent. In all three schemata, the domain of a non- 
head daughter is compacted into a single domain ob- 
ject, which is shuffled together with the domain of 
the head daughter to form the domain of the mother. 
(6) Head-Argument Schema (simplified) 
" r MEAD \[-?\] \] 
sv sE  Ls,.,Bo  T 171J 
DOM \[\] 
L s,.,Bc,,,T (DID J 
L \[\] 
A shuffle(~, compaction(~), V~) 
A order_constraints (V~) 
The hierarchical structure of (1) is shown by the 
unordered tree of Figure 2, where head daughters 
appear on the left at each branch. Focusing on 
the NP seiner Freundin in the tree, it is compacted 
into a single domain object, and must remain so, 
but its position is not fixed relative to the other 
arguments of liess (which include the raised argu- 
ments of helfen). The shuffle constraint allows this 
single, compacted domain object to be realized in 
various permutations with respect to the other ar- 
guments, subject to the LP constraints, which are 
implemented by the order_constraints predicate 
in (6). Each NP argument may be assigned either 
vfor mfas its TOPO value, subject to the constraint 
that root declarative clauses must contain exactly 
one element in the vf field. In this case, seiner Fre- 
undin is assigned vf, while the other NP arguments 
of liess are in m~ However, the following permuta- 
tions of (1) are also grammatical, in which er and 
ihn are assigned to the vf field instead: 
(7) a. Er liess ihn seiner Freundin helfen. 
b. Ihn liess er seiner Freundin helfen. 
Comparing the hierarchical structure in Figure 2 
with the linear order domain in (4), we see that some 
daughters in the hierarchical structure are realized 
discontinuously in the order domain for the clause 
(e.g., the verbal complex liess helfen). In such cases, 
nonconcatenative constraints, such as shuffle, can 
provide a more succinct analysis than concatenative 
rules. This situation is quite common in languages 
like German and Japanese, where word order is not 
totally fixed by grammatical relations. 
3 Head-Corner Parsing 
The grammar described above has a number of 
properties relevant to the choice of a parsing strat- 
egy. First, as in HPSG and other constraint-based 
grammars, the lexicon is information-rich, and the 
combinatory or phrase structure rules are highly 
schematic. We would thus expect a purely top- 
down algorithm to be inefficient for a grammar of 
this type, and it may even fail to terminate, for the 
simple reason that the search space would not be 
adequately constrained by the highly general combi- 
natory rules. 
Second, the grammar is essentially nonconcatena- 
tive, i.e., constituents of the grammar may appear 
discontinuously in the string. This suggests that a 
strict left-to-right or right-to-left approach may be 
less efficient than a bidirectional or non-directional 
approach. 
Lastly, the grammar is head-driven, and we would 
thus expect the most appropriate parsing algorithm 
to take advantage of the information that a semantic 
head provides. For example, a head usually provides 
information about the remaining daughters that the 
parser must find, and (since the head daughter in a 
construction is in many ways similar to its mother 
category) effective top-down identification of candi- 
date heads should be possible. 
One type of parser that we believe to be partic- 
ularly well-suited to this type of grammar is the 
head-corner parser, introduced by van Noord (1991; 
1994) based on one of the parsing strategies ex- 
plored by Kay (1989). The head-corner parser can 
be thought of as a generalization of a left-corner 
parser (Rosenkrantz and Lewis-II, 1970; Matsumoto 
et al., 1983; Pereira and Shieber, 1987). 1 
The outstanding features of parsers of this type 
are that they are head-driven, of course, and that 
they process the string bidirectionally, starting from 
a lexical head and working outward. The key ingre- 
dients of the parsing algorithm are as follows: 
• Each grammar rule contains a distinguished 
daughter which is identified as the head of the 
rule. 2 
• The relation head-corner is defined as the reflexive 
and transitive closure of the head relation. 
• In order to prove that an input string can be 
parsed as some (potentially complex) goal cat- 
egory, the parser nondeterministically selects a 
potential head of the string and proves that this 
head is the head-corner of the goal. 
• Parsing proceeds from the head, with a rule being 
chosen whose head daughter can be instantiated 
by the selected head word. The other daughters 
of the rule are parsed recursively in a bidirec- 
tional fashion, with the result being a slightly 
larger head-corner. 
lln fact, a head-corner parser for a grammar in which the 
head daughter in each rule is the leftmost daughter will func- 
tion as a left-corner parser. 
2Note that the fragment of the previous section has this 
property. 
• 665 
• The process succeeds when a head-corner is 
constructed which dominates the entire input 
string. 
4 Implementation 
We have implemented the German grammar and 
head-corner parsing algorithm described in §2 and 
§3 using the ConTroll formalism (GStz and Meurers, 
1997). ConTroll is a constraint logic programming 
system for typed feature structures, which supports 
a direct implementation of HPSG. Several properties 
of the formalism are crucial for the approach to lin- 
earization that we are investigating: it does not re- 
quire the grammar to have a context-free backbone; 
it includes definite relations, enabling the definition 
of nonconcatenative constraints, such as shuffle; 
and it supports delayed evaluation of constraints. 
The ability to control when relational contraints are 
evaluated is especially important in the optimiza- 
tion of shuffle to be discussed next (§5). ConTroll 
also allows a parsing strategy to be specified within 
the same formalism as the grammar. 3 Our imple- 
mentation of the head-corner parser adapts van No- 
ord's (1997) parser to the ConTroll environment. 
5 Shuffling Deterministically 
A standard definition of the shuffle relation is given 
below as a Prolog predicate. 
shuffle (unoptimized version) 
shuffle(IS, \[\] , \[\]). 
shuffle(\[XISi\], $2, \[XIS3\]) :- 
shuffle(SI,S2,S3). 
shuffle(S1, \[XIS2S, \[XIS3\]) :- 
shuffle(S1,S2,S3). 
The use of a shuffle constraint reflects the fact 
that several permutations of constituents may be 
grammatical. If we parse in a bottom-up fashion, 
and the order domains of two daughter constituents 
are combined as the first two arguments of shuffle, 
multiple solutions will be possible for the mother 
domain (the third argument of shuffle). For ex- 
ample, in the structure shown earlier in Figure 2, 
when the domain (\[liess\],\[helfen\]) is combined with 
the compacted domain element (\[seiner Freundin\]), 
shuffle will produce three solutions: 
(8) a. (\[liess\],\[helfen\],\[seiner Freundin\] ) b. (\[liess\],\[seiner Freundin\],\[helfen\] ) 
c. (\[seiner Freundin\],\[liess\],\[helfen\] ) 
This set of possible solutions is further constrained 
in two ways: it must be consistent with the linear 
3An interface from ConqYoll to the underlying Prolog en- 
vironment was also developed to support some optimizations 
of the parser, such as memoization and the operations over 
bitstrings described in §5. 
precedence constraints defined by the grammar, and 
it must yield a sequence of words that is identical 
to the input sequence that was given to the parser. 
However, as it stands, the correspondence with the 
input sequence is only checked after an order do- 
main is proposed for the entire sentence. The or- 
der domains of intermediate phrases in the hierar- 
chical structure are not directly constrained by the 
grammar, since they may involve discontinuous sub- 
sequences of the input sentence. The shuffle con- 
straint is acting as a generator of possible order do- 
mains, which are then filtered first by LP constraints 
and ultimately by the order of the words in the in- 
put sentence. Although each possible order domain 
that satisfies the LP constraints is a grammatical se- 
quence, it is useless, in the context of parsing, to con- 
sider those permutations whose order diverges from 
that of the input sentence. In order to avoid this 
very inefficient generate-and-test behavior, we need 
to provide a way for the input positions covered by 
each proposed constituent to be considered sooner, 
so that the only solutions produced by the shuffle 
constraint will be those that correspond to the or- 
der of words in the actual input sequence. 
Since the portion of the input string covered by 
an order domain may be discontinuous, we cannot 
just use a pair of endpoints for each constituent as 
in chart parsers or DCGs. Instead, we adapt a tech- 
nique described by Reape (1991), and use bitstring 
codes to represent the portions of the input covered 
by each element in an order domain. If the input 
string contains n words, the code value for each con- 
stituent will be a bitstring of length n. If element 
i of the bitstring is 1, the constituent contains the 
ith word of the sentence, and if element i of the 
bitstring is 0, the constituent does not contain the 
ith word. Reape uses bitstring codes for a tabular 
parsing algorithm, different from the head-corner al- 
gorithm used here, and attributes the original idea 
to Johnson (1985). 
The optimized version of the shuffle relation is de- 
fined below, using a notation in which the arguments 
are descriptions of typed feature structures. The ac- 
tual implementation of relations in the ConTroll for- 
malism uses a slightly different notation, but we use 
a more familiar Prolog-style notation here. 4 
4Symbols beginning with an upper-case letter are vari- 
ables, while lower-case symbols are either attribute labels 
(when followed by ':') or the types of values (e.g., he_list). 
666 
~, shuffle (optimized version) 
shuffle(\[\], \[\], \[\]). 
shuffle((Sl&ne_list), \[\], Sl). 
shuffle(\[\], (S2&ne_list), $2). 
shuffle(Sl, $2, S3) :- 
Sl=\[(code:Cl) l_\], S2=\[(code:C2) l_\], 
code_prec (Cl, C2, Bool), 
shuf f le_d (Bool, Sl, $2, S3). 
Y, shuffle_d(Bool, \[HI\[T1\], \[H2JT2\], List). 
7, Bool=true: HI precedes H2 
Y, Bool=false: H1 does not precede H2 
shuffle_d(true, \[HI{S1\], S2, \[H1\]S3\]) :- 
may_precede_all (H1, S2), 
shuffle (Sl, S2, S3). 
shuffle_d(false, Sl, \[H2{S2\], \[H21S3\]) :- 
may_pre cede_all (H2, S i), 
shuffle (Sl, S2, S3). 
This revision of the shuffle relation uses two 
auxiliary relations, code_prec and shuffle_d. 
code_prec compares two bitstrings, and yields a 
boolean value indicating whether the first string pre- 
cedes the second (the details of the implementation 
are suppressed). The result of a comparison be- 
tween the codes of the first element of each domain is 
used to determine which element must appear first 
in the resulting domain. This is implemented by 
using the boolean result of the code comparison to 
select a unique disjunct of the shuffle_d relation. 
The shuffle_d relation also incorporates an opti- 
mization in the checking of LP constraints. As each 
element is shuffled into the result, it only needs to be 
checked for LP acceptability with the elements of the 
other argument list, because the LP constraints have 
already been satisfied on each of the argument do- 
mains. Therefore, LP acceptability no longer needs 
to be checked for the entire order domain of each 
phrase, and the call to order_constraints can be 
eliminated from each of the phrasal schemata. 
In order to achieve the desired effect of making 
shuffle constraints deterministic, we must delay their 
evaluation until the code attributes of the first ele- 
ment of each argument domain have been instanti- 
ated to a specific string. Using the analogy of a card 
game, we must hold the cards (delay shuffling) until 
we know what their values are (the codes must be 
instantiated). The delayed evaluation is enforced by 
the following declarations in the ConTroll system, 
where argn:©type specifies that evaluation should 
be delayed until the value of the nth argument of 
the relation has a value more specific than type: 
delay (code_prec, 
(argl : @string & arg2 : @string) ). 
delay (shuffle_d, argl : ©bool). 
With the addition of CODE values to each domain 
element, the input to the shuffle constraint in our 
previous example is shown below, and the unique 
solution for MDom is the one corresponding to (8c). 
(9) shu~e((\[ PHON liess \] \[PHON hel/en 1 
LCODE 001000 ' LCODE 000001 )' 
( \[CODE 110000 J )' MDom) 
6 Performance Comparison 
In order to evaluate the reduction in the search space 
that is achieved by shuffling deterministically, the 
parser with the optimized shuffle constraints and 
the parser with the nonoptimized constraints were 
each tested with the same grammar of German on 
a set of 30 sentences of varying length, complexity 
and clause types. Apart from the redefinition of the 
shuffle relation, discussed in the previous section, 
the only differences between the grammars used for 
the optimized and unoptimized tests are the addi- 
tion of CODE values for each domain element in the 
optimized version and the constraints necessary to 
propagate these code values through the intermedi- 
ate structures used by the parser. 
A representative sample of the tested sentences 
is given in Table 2 (because of space limitations, 
English glosses are not given, but the words have 
all been glossed in §2), and the performance results 
for these 12 sentences are listed in Table 1. For 
each version of the parser, time, choice points, and 
calls are reported, as follows: The time measurement 
(Time) 5 is the amount of CPU seconds (on a Sun 
SPARCstation 5) required to search for all possible 
parses, choice points (ChoicePts) records the num- 
ber of instances where more than one disjunct may 
apply at the time when a constraint is resolved, and 
calls (Calls) lists the number of times a constraint 
is unfolded. The number of calls listed includes all 
constraints evaluated by the parser, not only shuffle 
constraints. Given the nature of the ConTroll imple- 
mentation, the number of calls represents the most 
basic number of steps performed by the parser at a 
logical level. Therefore, the most revealing compar- 
ison with regard to performance improvement be- 
tween the optimized and nonoptimized versions is 
the call factor, given in the last column of Table 1. 
The call factor for each sentence is the number of 
nonoptimized calls divided by the number of opti- 
mized calls. For example, in T1, Er hilfl ihr, the 
version using the nonoptimized shuffle was required 
to make 4.1 times as many calls as the version em- 
ploying the optimized shuffle. 
The deterministic shuffle had its most dramatic 
impact on longer sentences and on sentences con- 
5The absolute time values are not very significant, be- 
cause the ConTroll system is currently implemented as an 
interpreter running in Prolog. However, the relative time dif- 
ferences between sentences confirm that the number of calls 
roughly reflects the total work required by the parser. 
667 
Nonoptimized 
Time(sec) ChoicePts 
T1 1 5.6 61 
T2 1 I0.0 80 
T3 1 24.3 199 
T4 1 25.0 199 
T5 1 51.4 299 
T6 2 463.5 2308 
T7 2 465.1 2308 
T8 1 305.7 1301 
T9 1 270.5 1187 
T10 1 2063.4 6916 
Tll 1 3368.9 8833 
T12 1 8355.0 19235 
Optimized 
Calls Time(sec) ChoicePts Calls 
359 1.8 20 88 
480 3.6 29 131 
1362 4.9 44 200 
1377 5.2 45 211 
2757 6.2 49 241 
22972 32.4 209 974 
23080 26.6 172 815 
9622 52.1 228 942 
7201 48.0 214 1024 
44602 253.8 859 4176 
74703 176.5 536 2565 
129513 528.1 1182 4937 
Table 1: Comparison of Results for Selected Sentences 
4.1 
3.7 
6.8 
6.5 
11.4 
23.6 
28.3 
10.2 
7.0 
10.7 
29.1 
26.2 I 
Table 
T1. Er hilft ihr. 
T2. Hilft er seiner Freundin? 
T3. Er hilft ihr schnell. 
T4. Hilft er ihr schnell? 
T5. Liess er ihr ihn helfen? 
T6. Er liess ihn ihr schnell helfen. 
T7. Liess er ihn ihr schnell helfen? 
TS. Der Vater liess seiner Freundin seinen 
Sohn helfen. 
T9. Sie denkt dass er ihr hilft. 
T10. Sie denkt dass er ihr schnell hilft. 
Tll. Sie denkt dass er ihr ihn helfen liess. 
T12. Sie denkt dass er seiner Freundin 
seinen Sohn helfen liess. 
2: Selected Sentences 
taining adjuncts. For instance, in T7, a verb-first 
sentence containing the adjunct schnell, the opti- 
mized version outperformed the nonoptimized by a 
call factor of 28.3. From these results, the utility 
of a deterministic shuffle constraint is clear. In par- 
ticular, it should be noted that avoiding useless re- 
sults for shuffle constraints prunes away many large 
branches from the overall search space of the parser, 
because shuffle constraints are imposed on each node 
of the hierarchical structure. Since we use a largely 
bottom-up strategy, this means that if there are n 
solutions to a shuffle constraint on some daughter 
node, then all of the constraints on its mother node 
have to be solved n times. If we avoid producing 
n - 1 useless solutions to shuffle, then we also avoid 
n - 1 attempts to construct all of the ancestors to 
this node in the hierarchical structure. 
7 Conclusion 
We have shown that eliminating the nondetermin- 
ism of shuffle constraints overcomes one of the pri- 
mary inefficiencies of parsing for grammars that use 
discontinuous order domains. Although bitstring 
codes have been used before in parsers for discon- 
tinuous constituents, we are not aware of any prior 
research that has demonstrated the use of this tech- 
nique to eliminate the nondeterminism of relational 
constraints on word order. Additionally, we expect 
that the applicability of bitstring codes is not limited 
to shuffle contraints, and that the technique could 
be straightforwardly generalized for other noncon- 
catenative constraints. In fact, some way of record- 
ing the input positions associated with each con- 
stituent is necessary to eliminate spurious ambigui- 
ties that arise when the input sentence contains more 
than one occurrence of the same word (cf. van No- 
ord's (1994) discussion of nonminimality). For con- 
catenative grammars, each position can be repre- 
sented by a simple remainder of the input list, but 
a more general encoding, such as the bitstrings used 
here, is needed for grammars using nonconcatenative 
constraints. 

References 
Emmon Bach. 1979. Control in montague grammar. 
Linguistic Inquiry, 10:515-553. 
David R. Dowty. 1996. Toward a minimalist the- 
ory of syntactic structure. In Arthur Horck and 
Wietske Sijtsma, editors, Discontinuous Con- 
stituency, Berlin. Mouton de Gruyter. 
Thilo GStz and Walt Detmar Meurers. 1997. 
The ConTroll system as large grammar develop- 
ment platform. In Proceedings of the Workshop 
on Computational Environments for Grammar 
Development and Linguistic Engineering (EN- 
VGRAM) held at ACL-97, Madrid, Spain. 
Mark Johnson. 1985. Parsing with discontinuous 
constituents. In Proceedings of the 23 ra Annual 
Meeting of the Association for Computational 
Linguistics, pages 127-132, Chicago, IL, July. 
Andreas Kathol. 1995. Linearization-based German 
Syntax. Ph.D. thesis, The Ohio State University. 
Martin Kay. 1989. Head-driven parsing. In Proceed- 
ings of the First International Workshop on Pars- 
ing Technologies. Carnegie Mellon University. 
Y. Matsumoto, H. Tanaka, H. Hirakawa, H. Miyoshi, 
and H. Yasukawa. 1983. BUP: a bottom up parser 
embedded in prolog. New Generation Computing, 
1(2). 
Michael Moortgat. 1996. Generalized quantifiers 
and discontinuous type constructors. In Arthur 
Horck and Wietske Sijtsma, editors, Discontinu- 
ous Constituency, Berlin. Mouton de Gruyter. 
Fernando C.N. Pereira and Stuart M. Shieber. 1987. 
Prolog and Natural Language Analysis. CSLI Lec- 
ture Notes Number 10, Stanford, CA. 
Carl Pollard. 1984. Generalized Phrase Structure 
Grammars, Head Grammars and Natural Lan- 
guage. Ph.D. thesis, Stanford University. 
Michael Reape. 1991. Parsing bounded discontin- 
uous constituents: Generalizations of some com- 
mon algorithms. In Proceedings of the First Com- 
putational Linguistics in the Netherlands Day, 
OTK, University of Utrecht. 
Mike Reape. 1996. Getting things in order. In 
Arthur Horck and Wietske Sijtsma, editors, Dis- 
continuous Constituents. Mouton de Gruyter, 
Berlin. 
D.J. Rosenkrantz and P.M. Lewis-II. 1970. Deter- 
ministic left corner parsing. In IEEE Conference 
of the 11th Annual Symposium on Switching and 
Automata Theory, pages 139-152. 
Gertjan van Noord. 1991. Head corner parsing for 
discontinuous constituency. In Proceedings of the 
29 th Annual Meeting of the Association for Com- 
putational Linguistics, pages 114-121, Berkeley, 
CA, June. 
Gertjan van Noord. 1994. Head corner parsing. 
In C.J. Rupp, M.A. Rosner, and R.L. Johnson, 
editors, Constraints, Language and Computation, 
pages 315-338. Academic Press. 
Gertjan van Noord. 1997. An efficient implemen- 
tation of the head-corner parser. Computational 
Linguistics, 23(3):425-456. 
