A grammar formalism and parser for linearization-based HPSG
Michael W. Daniels and W. Detmar Meurers
Department of Linguistics
The Ohio State University
222 Oxley Hall
1712 Neil Avenue
Columbus, OH 43210
daniels|dm@ling.osu.edu
Abstract
Linearization-based HPSG theories are widely
used for analyzing languages with relatively
free constituent order. This paper introduces
the Generalized ID/LP (GIDLP) grammar for-
mat, which supports a direct encoding of such
theories, and discusses key aspects of a parser
that makes use of the dominance, precedence,
and linearization domain information explicitly
encoded in this grammar format. We show
that GIDLP grammars avoid the explosion in
the number of rules required under a traditional
phrase structure analysis of free constituent or-
der. As a result, GIDLP grammars support more
modular and compact grammar encodings and
require fewer edges in parsing.
1 Introduction
Within the framework of Head-Driven Phrase Struc-
ture Grammar (HPSG), the so-called linearization-
based approaches have argued that constraints on
word order are best captured within domains that
extend beyond the local tree. A range of analyses
for languages with relatively free constituent order
have been developed on this basis (see, for example,
Reape, 1993; Kathol, 1995; Müller, 1999; Donohue
and Sag, 1999; Bonami et al., 1999) so that it is at-
tractive to exploit these approaches for processing
languages with relatively free constituent order.
This paper introduces a grammar format that sup-
ports a direct encoding of linearization-based HPSG
theories. The Generalized ID/LP (GIDLP) format
explicitly encodes the dominance, precedence, and
linearization domain information and thereby supports
the development of efficient parsing algorithms
making use of this information. We make this
concrete by discussing key aspects of a parser for
GIDLP grammars that integrates the word order do-
mains and constraints into the parsing process.
2 Linearization-based HPSG
The idea of discontinuous constituency was first in-
troduced into HPSG in a series of papers by Mike
Reape (see Reape, 1993 and references therein).¹
The core idea is that word order is determined not
at the level of the local tree, but at the newly intro-
duced level of an order domain, which can include
elements from several local trees. We interpret this
in the following way: Each terminal has a cor-
responding order domain, and just as constituents
combine to form larger constituents, so do their or-
der domains combine to form larger order domains.
Following Reape, a daughter’s order domain
enters its mother’s order domain in one of two
ways. The first possibility, domain union, forms
the mother’s order domain by shuﬄing together its
daughters’ domains. The second option, domain
compaction, inserts a daughter’s order domain into
its mother’s. Compaction has two eﬀects:
Contiguity: The terminal yield of a compacted
category contains all and only the terminal yield of
the nodes it dominates; there are no holes or addi-
tional strings.
LP Locality: Precedence statements only con-
strain the order among elements within the same
compacted domain. In other words, precedence
constraints cannot look into a compacted domain.
Note that these are two distinct functions of do-
main compaction: defining a domain as covering a
contiguous stretch of terminals is in principle in-
dependent of defining a domain of elements for
LP constraints to apply to. In linearization-based
HPSG, domain compaction encodes both aspects.
Later work (Kathol and Pollard, 1995; Kathol,
1995; Yatabe, 1996) introduced the notion of partial
compaction, in which only a portion of the daugh-
ter’s order domain is compacted; the remaining ele-
ments are domain unioned.
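The contrast between the two operations can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: order domains are modelled as plain lists, and the names `domain_union` and `compact` are hypothetical helpers introduced here, not part of Reape's formalism.

```python
from typing import Iterator, List, Sequence


def domain_union(left: Sequence, right: Sequence) -> Iterator[List]:
    """Yield every interleaving (shuffle) of two order domains.

    The relative order inside each input domain is preserved.
    """
    if not left:
        yield list(right)
        return
    if not right:
        yield list(left)
        return
    for rest in domain_union(left[1:], right):
        yield [left[0]] + rest
    for rest in domain_union(left, right[1:]):
        yield [right[0]] + rest


def compact(domain: Sequence) -> List:
    """Wrap a daughter's domain into one element of the mother's domain."""
    return [tuple(domain)]


np_domain = ["das", "Buch"]
verb_domain = ["gab"]

# Domain union: the verb may end up anywhere relative to the NP's words.
unioned = list(domain_union(np_domain, verb_domain))

# Compaction: the NP enters as a single, contiguous element.
compacted = list(domain_union(compact(np_domain), verb_domain))
```

Under these assumptions, union yields three possible orders (including the discontinuous das gab Buch), whereas compaction yields only two, since the NP can no longer be split.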
¹ Apart from Reape’s approach, there have been proposals
for a more complete separation of word order and syntactic
structure in HPSG (see, for example, Richter and Sailer, 2001
and Penn, 1999). In this paper, we focus on the majority of
linearization-based HPSG approaches, which follow Reape.
3 Processing linearization-based HPSG
Formally, a theory in the HPSG architecture con-
sists of a set of constraints on the data structures
introduced in the signature; thus, word order do-
mains and the constraints thereon can be straight-
forwardly expressed. On the computational side,
however, most systems employ parsers to eﬃciently
process HPSG-based grammars organized around a
phrase structure backbone. Phrase structure rules
encode immediate dominance (ID) and linear precedence
(LP) information in local trees, so they cannot
directly encode linearization-based HPSG, which
posits word order domains that can extend beyond the
local trees.
The ID/LP grammar format (Gazdar et al., 1985)
was introduced to separate immediate dominance
from linear precedence, and several proposals have
been made for direct parsing of ID/LP grammars
(see, for example, Shieber, 1984). However, the
domain in which word order is determined still is
the local tree licensed by an ID rule, which is insuf-
ficient for a direct encoding of linearization-based
HPSG.
The LSL grammar format as defined by Suhre
(1999) (based on Götz and Penn, 1997) allows elements
to be ordered in domains that are larger
than a local tree; as a result, categories are not re-
quired to cover contiguous strings. Linear prece-
dence constraints, however, remain restricted to lo-
cal trees: elements that are linearized in a word
order domain larger than their local tree cannot
be constrained. The approach thus provides valu-
able worst-case complexity results, but it is inade-
quate for encoding linearization-based HPSG theo-
ries, which crucially rely on the possibility to ex-
press linear precedence constraints on the elements
within a word order domain.
In sum, no grammar format is currently available
that adequately supports the encoding of a process-
ing backbone for linearization-based HPSG gram-
mars. As a result, implementations of linearization-
based HPSG grammars have taken one of two op-
tions. Some simply do not use a parser, such as the
work based on ConTroll (Götz and Meurers, 1997);
as a consequence, the eﬃciency and termination
properties of parsers cannot be taken for granted in
such approaches.
The other approaches use a minimal parser that
can only take advantage of a small subset of the req-
uisite constraints. Such parsers are typically limited
to the general concept of resource sensitivity – ev-
ery element in the input needs to be found exactly
once – and the ability to require certain categories
to dominate a contiguous segment of the input.
Some of these approaches (Johnson, 1985; Reape,
1991) lack word order constraints altogether. Others
(van Noord, 1991; Ramsay, 1999) have the gram-
mar writer provide a combinatory predicate (such
as concatenate, shuﬄe, or head-wrap) for each rule
specifying how the string coverage of the mother is
determined from the string coverages of the daughters.
In either case, the task of constructing a word
order domain and enforcing word order constraints
in that domain is left out of the parsing algorithm;
as a result, constraints on word order domains either
cannot be stated or are tested in a separate clean-up
phase.
4 Defining GIDLP Grammars²
To develop a grammar format for linearization-
based HPSG, we take the syntax of ID/LP rules
and augment it with a means for specifying which
daughters form compacted domains. A Generalized
ID/LP (GIDLP) grammar consists of four parts: a
root declaration, a set of lexical entries, a set of
grammar rules, and a set of global order constraints.
We begin by describing the first three parts, which
are reminiscent of context-free grammars (CFGs),
and then address order constraints in section 4.1.
The root declaration has the form root(S,L) and
states the start symbol S of the grammar and any
linear precedence constraints L constraining the root
domain.
Lexical entries have the form A → t and link the
pre-terminal A to the terminal t, just as in CFGs.
Grammar rules have the form A → α; C. They
specify that a non-terminal A immediately domi-
nates a list of non-terminals α in a domain where
a set of order constraints C holds.
Note that in contrast to CFG rules, the order of
the elements in α does not encode immediate precedence
or otherwise contribute to the denotational
meaning of the rule. Instead, the order can be used
to generalize the head marking used in grammars
for head-driven parsing (Kay, 1990; van Noord,
1991) by additionally ordering the non-head daughters.³
² Due to space limitations, we focus here on introducing the
syntax of the grammar formalism and giving an example. We
will also base the discussion on simple term categories; nothing
hinges on this, and when using the formalism to encode
linearization-based HPSG grammars, one will naturally use the
feature descriptions known from HPSG as categories.
³ By ordering the right-hand side of a rule so that those categories
come first that most restrict the search space, it becomes
possible to define a parsing algorithm that makes use of this
information. For an example of a construction where ordering
the non-head daughters is useful, consider sentences with
AcI verbs like I see him laugh. Under the typical HPSG analysis
(Pollard and Sag, 1994), see combines in a ternary structure
with him and laugh. Note that the constituent that is appropriate
in the place occupied by him here can only be determined once
one has looked at the other complement, laugh, from which it
is raised.
If the set of order constraints is empty, we obtain
the simplest type of rule, exemplified in (1).
(1) S → NP, VP
This rule says that an S may immediately dominate
an NP and a VP, with no constraints on the relative
ordering of NP and VP. One may precede the other,
the strings they cover may be interleaved, and material
dominated by a node dominating S can equally
be interleaved.
4.1 Order Constraints
GIDLP grammars include two types of order constraints:
linear precedence constraints and compaction
statements.
4.1.1 Linear Precedence Constraints
Linear precedence constraints can be expressed in
two contexts: on individual rules (as rule-level constraints)
and in compaction statements (as domain-level
constraints). Domain-level constraints can
also be specified as global order constraints, which
has the effect that they are specified for each single
domain.
All precedence constraints enforce the following
property: given any appropriate pair of elements in
the same domain, one must completely precede the
other for the resulting parse to be valid. Precedence
constraints may optionally require that there be no
intervening material between the two elements: this
is referred to as immediate precedence. Precedence
constraints are notated as follows:
• Weak precedence: A < B.
• Immediate precedence: A ≪ B.
A pair of elements is considered appropriate
when one element in a domain matches the symbol
A, another matches B, and neither element dominates
the other (it would otherwise be impossible to
express an order constraint on a recursive rule).
The symbols A and B may be descriptions or tokens.
A category in a domain matches a description
if it is subsumed by it; a token refers to a specific
category in a rule, as discussed below. A constraint
involving descriptions applies to any pair of
elements in any domain in which the described categories
occur; it thus can also apply more than once
within a given rule or domain. Tokens, on the other
hand, can only occur in rule-level constraints and
refer to particular RHS members of a rule. In this
paper, tokens are represented by numbers referring
to the subscripted indices on the RHS categories.
In (2) we see an example of a rule-level linear
precedence constraint.
(2) A → NP₁, V₂, NP₃; 3 < V
This constraint specifies that the token 3 in the rule’s
RHS (the second NP) must precede any constituents
described as V occurring in the same domain (this
includes, but is not limited to, the V introduced by
the rule).
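The two kinds of precedence can be made concrete with a small sketch. It assumes that each element of a domain is represented by the set of 1-based string positions it covers; the function names are hypothetical and only serve as illustration.

```python
from typing import Set


def weakly_precedes(a: Set[int], b: Set[int]) -> bool:
    """A < B: every position of a lies before every position of b."""
    return max(a) < min(b)


def immediately_precedes(a: Set[int], b: Set[int]) -> bool:
    """Immediate precedence: a precedes b and nothing intervenes."""
    return max(a) + 1 == min(b)


# Rule (2) with the constraint 3 < V: token 3 (the second NP) covers
# words 1-2, the V covers word 3, and the first NP covers words 4-5.
np3, v, np1 = {1, 2}, {3}, {4, 5}
constraint_holds = weakly_precedes(np3, v)

# Interleaved elements satisfy neither ordering, so an LP constraint
# between them can never hold.
interleaved_a, interleaved_b = {1, 3}, {2}
```

Because one element must completely precede the other, a pair whose coverage is interleaved fails the check in both directions.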
4.1.2 Compaction Statements
As with LP constraints, compaction statements ex-
ist as rule-level and as global order constraints; they
cannot, however, occur within other compaction
statements. A rule-level compaction statement has
the form 〈α, A, L〉, where α is a list of tokens,
A is the category representing the compacted do-
main, and L is a list of domain-level precedence
constraints. Such a statement specifies that the
constituents referenced in α form a compacted domain
with category A, inside of which the order con-
straints in L hold. As specified in section 2, a com-
pacted domain must be contiguous (contain all and
only the terminal yield of the elements in that do-
main), and it constitutes a local domain for LP state-
ments.
It is because of partial compaction that the second
component A in a compaction statement is needed.
If only one constituent is compacted, the resulting
domain will be of the same category; but when mul-
tiple categories are fused in partial compaction, the
category of the resulting domain needs to be deter-
mined so that LP constraints can refer to it.
The rule in (3) illustrates compaction: each of the
S categories forms its own domain. In (4) partial
compaction is illustrated: the V and the first NP
form a domain named VP to the exclusion of the
second NP.
(3) S → S₁, Conj₂, S₃; 1 ≪ 2, 2 ≪ 3, 〈[1], S, 〈[]〉〉, 〈[3], S, 〈[]〉〉
(4) VP → V₁, NP₂, NP₃; 〈[1, 2], VP, 〈[]〉〉
One will often compact only a single category
without adding domain-specific LP constraints, so
we introduce the abbreviatory notation of writing
such a compacted category in square brackets. In
this way (3) can be written as (5).
(5) S → [S₁], Conj₂, [S₃]; 1 ≪ 2, 2 ≪ 3
A final abbreviatory device is useful when the en-
tire RHS of a rule forms a single domain, which
Suhre (1999) refers to as “left isolation”. This is de-
noted by using the token 0 in the compaction state-
ment if linear precedence constraints are attached,
or by enclosing the LHS category in square brack-
ets, otherwise. (See rules (13d) and (13j) in sec-
tion 6 for an example of this notation.)
The formalism also supports global compaction
statements. A global compaction statement has the
form 〈A,L〉, where A is a description specifying a
category that always forms a compacted domain,
and L is a list of domain-level precedence con-
straints applying to the compacted domain.
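One way to hold rules, LP constraints, and compaction statements in memory is sketched below. The class and field names are assumptions made for this illustration and do not describe the data structures of the parser discussed in this paper.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Token = int          # index of a particular RHS member of a rule
Description = str    # matches any category it subsumes
Term = Union[Token, Description]


@dataclass(frozen=True)
class LP:
    left: Term
    right: Term
    immediate: bool = False   # True for immediate, False for weak precedence


@dataclass(frozen=True)
class Compaction:
    tokens: Tuple[Token, ...]        # RHS members forming the domain
    category: str                    # category of the compacted domain
    constraints: Tuple[LP, ...] = () # domain-level LP constraints


@dataclass(frozen=True)
class Rule:
    lhs: str
    rhs: Tuple[str, ...]
    lp: Tuple[LP, ...] = ()
    compactions: Tuple[Compaction, ...] = ()


# Rule (3): both S conjuncts are compacted; Conj sits directly between them.
coordination = Rule(
    lhs="S",
    rhs=("S", "Conj", "S"),
    lp=(LP(1, 2, immediate=True), LP(2, 3, immediate=True)),
    compactions=(Compaction((1,), "S"), Compaction((3,), "S")),
)

# Rule (4): partial compaction of V and the first NP into a VP domain.
ditransitive = Rule(
    lhs="VP",
    rhs=("V", "NP", "NP"),
    compactions=(Compaction((1, 2), "VP"),),
)
```

Keeping the category as a separate field of `Compaction` mirrors the point made above: with partial compaction, the category of the resulting domain cannot be read off a single daughter.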
4.2 Examples
We start with an example illustrating how a CFG
rule is encoded in GIDLP format. A CFG rule en-
codes the fact that each element of the RHS imme-
diately precedes the next, and that the mother cat-
egory dominates a contiguous string. The context-
free rule in (6) is therefore equivalent to the GIDLP
rule shown in (7).
(6) S → Nom V Acc
(7) [S] → V₁, Nom₂, Acc₃; 2 ≪ 1, 1 ≪ 3
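The encoding of a CFG rule can be mechanized along the following lines. This is a minimal sketch with a hypothetical function name and return format; unlike (7), it keeps the daughters in their CFG order, which is harmless because the order of the RHS does not affect the denotation of a GIDLP rule.

```python
from typing import List, Tuple


def cfg_to_gidlp(lhs: str, rhs: List[str]) -> Tuple[str, List[str], List[Tuple[int, int]]]:
    """Encode a CFG rule as a GIDLP rule.

    Returns the compacted LHS, the indexed RHS, and a chain of
    immediate-precedence pairs over the RHS tokens.
    """
    tokens = list(range(1, len(rhs) + 1))
    immediate = list(zip(tokens, tokens[1:]))
    indexed = [f"{cat}_{i}" for cat, i in zip(rhs, tokens)]
    return f"[{lhs}]", indexed, immediate


head, daughters, constraints = cfg_to_gidlp("S", ["Nom", "V", "Acc"])
```

The square brackets around the LHS express contiguity of the mother, and each pair (i, j) stands for an immediate precedence constraint between tokens i and j.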
In (8) we see a more interesting example of a
GIDLP grammar.
(8) a) root(A, [])
b) A → B₁, C₂, [D₃]; 2 < 3
c) B → F₁, G₂, E₃
d) C → E₁, D₂, I₃; 〈[1,2], H, 〈[]〉〉
e) D → J₁, K₂
f) Lexical entries: E → e, ...
g) E < F
(8a) is the root declaration, stating that an input
string must parse as an A; the empty list shows that
no LP constraints are specifically declared for this
domain. (8b) is a grammar rule stating that an A
may immediately dominate a B, a C, and a D; it further
states that the second constituent must precede
the third and that the third is a compacted domain.
(8c) gives a rule for B: it dominates an F, a G, and an
E, in no particular order. (8d) is the rule for C, illus-
trating partial compaction: its first two constituents
jointly form a compacted domain, which is given
the name H. (8e) gives the rule for D and (8f) spec-
ifies the lexical entries (here, the preterminals just
rewrite to the respective lowercase terminal). Fi-
nally, (8g) introduces a global LP constraint requir-
ing an E to precede an F whenever both elements
occur in the same domain.
Now consider licensing the string efjekgikj with
the above grammar. The parse tree, recording which
rules are applied, is shown in (9). Given that the
domains in which word order is determined can be
larger than the local trees, we see crossing branches
where discontinuous constituents are licensed.
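(9) [Tree diagram with crossing branches. A dominates B, C, and the compacted [D]; B dominates E, F, and G; C dominates the partially compacted domain [D E]H and I; the D inside H dominates J and K; the compacted [D] dominates K and J. Terminal yield: e f j e k g i k j.]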
To obtain a representation in which the order do-
mains are represented as local trees again, we can
draw a tree with the compacted domains forming
the nodes, as shown in (10).
(10) [Domain tree. A dominates e, f, H, g, i, and D; H covers j e k, and D covers k j.]
There are three non-lexical compacted domains
in the tree in (9): the root A, the compacted D, and
the partial compaction of D and E forming the do-
main H within C. In each domain, the global LP
constraint E < F must be obeyed. Note that the
string is licensed by this grammar even though the
second occurrence of E does not precede the F. This
E is inside a compacted domain and therefore is not
in the same domain as the F, so that the LP con-
straint does not apply to those two elements. This
illustrates the property of LP locality: domain com-
paction acts as a ‘barrier’ to LP application.
The second aspect of domain compaction, con-
tiguity, is also illustrated by the example, in con-
nection with the diﬀerence between total and partial
compaction. The compaction of D specified in (8b)
requires that the material it dominates be a contigu-
ous segment of the input. In contrast, the partial
compaction of the first two RHS categories in rule
(8d) requires that the material dominated by D and
E, taken together, be a contiguous segment. This
allows the second e to occur between the two cate-
gories dominated by D.
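Both properties can be checked mechanically for the example string. The position sets below are our reading of the analysis in (9), with positions counted from 1; the dictionary keys and helper names are assumptions made for this sketch.

```python
text = "efjekgikj"

# 1-based positions covered by selected nodes of the tree in (9).
coverage = {
    "B": {1, 2, 6},       # e f ... g
    "C": {3, 4, 5, 7},    # j e k ... i
    "H": {3, 4, 5},       # partial compaction of D and E inside C
    "D_in_H": {3, 5},     # j ... k, with the second e in between
    "D": {8, 9},          # the compacted D introduced by rule (8b)
}


def is_contiguous(positions):
    """True if the positions form one gap-free stretch of the input."""
    return max(positions) - min(positions) + 1 == len(positions)


# Elements of the two relevant order domains and the positions they cover.
root_domain = {"E": {1}, "F": {2}, "H": {3, 4, 5}, "G": {6}, "I": {7}, "D": {8, 9}}
h_domain = {"J": {3}, "E": {4}, "K": {5}}


def obeys_e_before_f(domain):
    """Global constraint E < F, checked only inside a single domain."""
    if "E" in domain and "F" in domain:
        return max(domain["E"]) < min(domain["F"])
    return True
```

The second e (position 4) follows the f (position 2), but since it is an element of the domain H rather than of the root domain, the constraint is never tested against that pair.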
Finally, the two tree representations above illus-
trate the separation of the combinatorial potential
of rules (9) from the flatter word order domains
(10) that the GIDLP format achieves. It would, of
course, be possible to write phrase structure rules
that license the word order domain tree in (10) di-
rectly, but this would amount to replacing a set of
general rules with a much greater number of flatter
rules corresponding to the set of all possible ways
in which the original rules could be combined without
introducing domain compaction. Müller (2004)
discusses the combinatorial explosion of rules that
results for an analysis of German if one wants to
flatten the trees in this way. If recursive rules such as
adjunction are included – which is necessary since
adjuncts and complements can be freely intermixed
in the German Mittelfeld – such flattening will not
even lead to a finite number of rules. We will return
to this issue in section 6.
5 A Parsing Algorithm for GIDLP
We have developed a GIDLP parser based on Ear-
ley’s algorithm for context-free parsing (Earley,
1970). In Earley’s original algorithm, each edge
encodes the interval of the input string it covers.
With discontinuous constituents, however, that is no
longer an option. In the spirit of Johnson (1985)
and Reape (1991), and following Ramsay (1999),
we represent edge coverage with bitvectors, stored
as integers. For instance, 00101 represents an edge
covering words one and three of a five-word sentence.⁴
Our parsing algorithm begins by seeding the chart
with passive edges corresponding to each word in
the input and then predicting a compacted instance
of the start symbol covering the entire input; each
final completion of this edge will correspond to a
successful parse.
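A minimal sketch of the coverage encoding and of chart seeding follows. It assumes plain Python integers as bitvectors and represents a passive lexical edge as a simple (word, coverage) pair; both choices, like the helper names, are illustrative only.

```python
def coverage(positions):
    """Bitvector with bit i-1 set for each 1-based word position i."""
    vector = 0
    for position in positions:
        vector |= 1 << (position - 1)
    return vector


def show(vector, length):
    """Render a bitvector as a string; the first word is the rightmost bit."""
    return format(vector, f"0{length}b")


words = "dass das Buch der Mann".split()

# One passive edge per input word seeds the chart.
seed_edges = [(word, coverage([i])) for i, word in enumerate(words, start=1)]

# Coverage the compacted start symbol has to reach: every word of the input.
full_input = (1 << len(words)) - 1

# Two edges can only be combined if they do not overlap.
disjoint = coverage([1, 3]) & coverage([2]) == 0
```

With this representation, combining two non-overlapping edges is a bitwise OR, and the overlap test is a bitwise AND.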
As with Earley’s algorithm, the bulk of the work
performed by the algorithm is borne by two steps,
prediction and completion. Unlike the context-free
case, however, it is not possible to anchor these
steps to string positions, proceeding from left to
right. The strategy for prediction used by Suhre
(1999) for his LSL parser is to predict every rule
at every position. While this strategy ensures that
no possibility is overlooked, it fails to integrate and
use the information provided by the word order con-
straints attached to the rules – in other words, the
parser receives no top-down guidance. Some of the
edges generated by prediction therefore fall prey to
the word order constraints later, in a generate-and-
test fashion. This need not be the case. Once one
daughter of an active edge has been found, the other
daughters should only be predicted to occur in string
positions that are compatible with the word order
constraints of the active edge. For example, con-
sider the edge in (11).
(11) A → B₁ • C₂; 1 < 2
⁴ Note that the first word is the rightmost bit.
This notation represents the point in the parse dur-
ing which the application of this rule has been pre-
dicted, and a B has already been located. Assuming
that B has been found to cover the third position of a
five-word string, two facts are known. From the LP
constraint, C cannot precede B, and from the gen-
eral principle that the RHS of a rule forms a parti-
tion of its LHS, C cannot overlap B. Thus C cannot
cover positions one, two, or three.
5.1 Compiling LP Constraints into Bitmasks
We can now discuss the integration of GIDLP word
order constraints into the parsing process. A central
insight of our algorithm is that the same data struc-
ture used to describe the coverage of an edge can
also encode restrictions on the parser’s search space.
This is done by adding two bitvectors to each edge,
in addition to the coverage vector: a negative mask
(n-mask) and a positive mask (p-mask). Eﬃcient
bitvector operations can then be used to compute,
manipulate, and test the encoded constraints.
Negative Masks The n-mask constrains the set of
possible coverage vectors that could complete the
edge. The 1-positions in a masking vector represent
the positions that are masked out: the positions that
cannot be filled when completing this edge. The 0-
positions in the negative mask represent positions
that may potentially be part of the edge’s cover-
age. For the example above, the coverage vector for
the edge is 00100 since only the third word B has
been found so far. Assuming no restrictions from a
higher rule in the same domain, the n-mask for C is
00111, encoding the fact that the coverage vector
found for C must be either 01000, 10000,
or 11000 (that is, C must occupy position four,
position five, or both of these positions). The negative
mask in essence encodes information on where the
active category cannot be found.
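The n-mask for the edge in (11) can be reproduced with a few bit operations. This sketch assumes integer bitvectors with the first word as the lowest bit; `n_mask_after` and `candidates` are hypothetical helpers, and the brute-force enumeration is for illustration only.

```python
def n_mask_after(found: int) -> int:
    """Mask out every position up to and including the last word of `found`.

    This is what a weak precedence constraint 'found < next' requires.
    """
    return (1 << found.bit_length()) - 1


def candidates(n_mask: int, length: int):
    """All non-empty coverage vectors of `length` bits that avoid the mask."""
    return [v for v in range(1, 1 << length) if v & n_mask == 0]


b_coverage = 0b00100                  # B found at the third word
c_n_mask = n_mask_after(b_coverage)   # positions one to three are excluded
allowed = candidates(c_n_mask, 5)
```

Testing a candidate against the mask is a single AND, so an incompatible edge can be rejected before it is ever added to the chart.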
Positive Masks The p-mask encodes information
about the positions the active category must occupy.
This knowledge arises from immediate precedence
constraints. For example, consider the edge in (12).
(12) D → E₁ • F₂; 1 ≪ 2
If E occupies position one, then F must at least oc-
cupy position two; the second position in the posi-
tive mask would therefore be occupied.
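For the edge in (12), the p-mask can be derived in the same style. Again this is a sketch under the assumption of integer bitvectors with the first word as the lowest bit; the helper names are hypothetical.

```python
def p_mask_after(found: int) -> int:
    """The position directly following the last word of `found`."""
    return 1 << found.bit_length()


def compatible(candidate: int, n_mask: int, p_mask: int) -> bool:
    """A candidate must avoid all n-mask bits and fill all p-mask bits."""
    return candidate & n_mask == 0 and candidate & p_mask == p_mask


e_coverage = 0b00001                               # E found at the first word
f_p_mask = p_mask_after(e_coverage)                # F must include word two
f_n_mask = (1 << e_coverage.bit_length()) - 1      # F may not precede or overlap E

ok_single = compatible(0b00010, f_n_mask, f_p_mask)
ok_longer = compatible(0b00110, f_n_mask, f_p_mask)
missing_required = compatible(0b00100, f_n_mask, f_p_mask)
overlapping = compatible(0b00011, f_n_mask, f_p_mask)
```

An F covering only word three is rejected because it leaves a gap after E, and an F that includes word one is rejected because it overlaps E.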
Thus in the prediction step, the parser considers
each rule in the grammar that provides the symbol
being predicted, and for each rule, it generates bit-
masks for the new edge, taking both rule-level and
domain-level order constraints into account. The
resulting masks are checked to ensure that there is
enough space in the resulting mask for the minimum
number of categories required by the rule.⁵
Then, as part of each completion step, the parser
must update the LP constraints of the active edge
with the new information provided by the passive
edge. As edges are initially constructed from gram-
mar rules, all order constraints are initially ex-
pressed in terms of either descriptions or tokens. As
the parse proceeds, these constraints are updated in
terms of the actual locations where matching con-
stituents have been found. For example, a constraint
like 1 < 2 (where 1 and 2 are tokens) can be up-
dated with the information that the constituent cor-
responding to token 1 has been found as the first
word, i.e. as position 00001.
In summary, compiling LP constraints into bit-
masks in this way allows the LP constraints to be
integrated directly into the parser at a fundamental
level. Instead of weeding out inappropriate parses
in a cleanup phase, LP constraints in this parser can
immediately block an edge from being added to the
chart.
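The space check performed during prediction can be sketched as a population count over the positions left open by the n-mask. The function names are hypothetical, and the sketch assumes an epsilon-free grammar, so that every RHS category needs at least one word.

```python
def free_positions(n_mask: int, length: int) -> int:
    """Number of input positions not excluded by the negative mask."""
    all_positions = (1 << length) - 1
    return bin(all_positions & ~n_mask).count("1")


def worth_predicting(n_mask: int, length: int, rhs_size: int) -> bool:
    """Predict a rule only if each RHS category can still get a word."""
    return free_positions(n_mask, length) >= rhs_size


# With positions one to three masked out of a five-word input,
# two positions remain available.
room_for_two = worth_predicting(0b00111, 5, 2)
room_for_three = worth_predicting(0b00111, 5, 3)
```

A rule with three daughters is thus never predicted into a region where only two words remain available.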
6 Evaluation
As discussed at the end of section 4.2, it is possible
to take a GIDLP grammar and write out the discon-
tinuity. All non-domain introducing rules must be
folded into the domain-introducing rules, and then
each permitted permutation of a RHS must become
a context-free rule on its own – generally, at the cost
of a factorial increase in the number of rules.
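The source of the factorial growth can be seen by enumerating the orderings of a single RHS. The sketch below is a simplification: it treats each daughter as an atomic, contiguous unit, uses 0-based indices for the daughters, and only supports weak precedence pairs; `flatten` is a hypothetical helper.

```python
from itertools import permutations


def flatten(lhs, rhs, lp_pairs=()):
    """One CFG rule per ordering of `rhs` that respects the weak LP pairs.

    `lp_pairs` holds (a, b) index pairs meaning daughter a precedes daughter b.
    """
    rules = []
    for order in permutations(range(len(rhs))):
        position = {token: index for index, token in enumerate(order)}
        if all(position[a] < position[b] for a, b in lp_pairs):
            rules.append((lhs, tuple(rhs[token] for token in order)))
    return rules


# An unconstrained RHS with three daughters yields 3! = 6 CFG rules.
unconstrained = flatten("vp", ["v(ditr)", "np(a)", "np(d)"])

# A single LP constraint (daughter 0 before daughter 1) halves that number.
constrained = flatten("vp", ["v(ditr)", "np(a)", "np(d)"], [(0, 1)])
```

Folding further rules into the RHS lengthens it, and the number of orderings grows with the factorial of its length.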
This construction indicates the basis for a prelim-
inary assessment of the GIDLP formalism and its
parser. The grammar in (13) recognizes a very small
fragment of German, focusing on the free word or-
der of arguments and adjuncts in the so-called Mit-
telfeld that occurs to the right of either the finite verb
in yes-no questions or the complementizer in complementized
sentences.⁶
(13) a) root(s, [])
b) s → s(cmp)₁
c) s → s(que)₁
d) s(cmp) → cmp₁, clause₂; 〈[0], s(cmp), 〈cmp< , <v( )〉〉
e) s(que) → clause₁; 〈[0], s(que), 〈v( )<〉〉
f) clause → np(n)₁, vp₂
g) vp → v(ditr)₁, np(a)₂, np(d)₃
h) vp → adv₁, vp₂
i) vp → v(cmp)₁, s(cmp)₂
j) [np(Case)] → det(Case)₁, n(Case)₂; 1 ≪ 2
k) v(ditr) → gab
l) comp → dass
m) det(dat) → der
n) n(nom) → Mann
o) n(acc) → Buch
p) adv → dort
q) v(cmp) → denkt
r) det(nom) → der
s) det(acc) → das
t) n(dat) → Frau
u) adv → gestern
⁵ This optimization only applies to epsilon-free grammars.
Further work in this regard can involve determining the minimum
and maximum yields of each category; some optimizations
involving this information can be found in (Haji-Abdolhosseini
and Penn, 2003).
⁶ The symbol is used to denote the set of all categories.
The basic idea of this grammar is that domain com-
paction only occurs at the top of the head path, af-
ter all complements and adjuncts have been found.
When the grammar is converted into a CFG, the
eﬀect of the larger domain can only be mimicked
by eliminating the clause and vp constituents alto-
gether.
As a result, while this GIDLP grammar has 10
syntactic rules, the corresponding flattened CFG (al-
lowing for a maximum of two adverbs) has 201
rules. In an experiment, the four sample sentences
in (14)⁷ were parsed with both our prototype GIDLP
parser (using the GIDLP grammar) as well as a
vanilla Earley CFG parser (using the CFG); the re-
sults are shown in (15).
(14) a) Gab der Mann der Frau das Buch?
b) dass das Buch der Mann der Frau gab.
c) dass das Buch gestern der Mann dort der
Frau gab.
d) Denkt der Mann dass das Buch gestern
der Mann dort der Frau gab?
(15)
            Active Edges     Passive Edges
Sentence    GIDLP    CFG     GIDLP    CFG
a)             18    327        16     15
b)             27    338        18     16
c)             46    345        27     27
d)             75    456        36     24
Averaging over the four sentences, the GIDLP
grammar requires 89% fewer active edges. It also
generates additional passive edges corresponding to
the extra non-terminals vp and clause. It is impor-
tant to keep in mind that the GIDLP grammar is
more general than the CFG: in order to obtain a finite
number of CFG rules, we had to limit the number
of adverbs.
⁷ The grammar and example sentences are intended as a formal
illustration, not a linguistic theory; because of this and
space limitations, we have not provided glosses.
When using a grammar capable of
handling longer sentences with more adverbs, the
number of CFG rules (and active edges, as a conse-
quence) increases factorially.
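The reduction in active edges reported for (15) can be recomputed from the table. The snippet below simply averages the per-sentence reductions; the variable names are ours.

```python
# Active edge counts from (15), sentences a) to d).
gidlp_active = [18, 27, 46, 75]
cfg_active = [327, 338, 345, 456]

# Relative reduction per sentence, then the mean over the four sentences.
reductions = [1 - g / c for g, c in zip(gidlp_active, cfg_active)]
mean_reduction = sum(reductions) / len(reductions)

# Passive edge counts from (15), for comparison.
gidlp_passive = [16, 18, 27, 36]
cfg_passive = [15, 16, 27, 24]
extra_passive = [g - c for g, c in zip(gidlp_passive, cfg_passive)]
```

The mean reduction rounds to 89%, while the GIDLP parser never builds fewer passive edges than the CFG parser on these sentences.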
Timings have not been included in (15); it is gen-
erally the case that the GIDLP parser/grammar com-
bination was slower than the CFG/Earley parser.
This is an artifact of the use of atomic categories,
however. For the large feature structures used as
categories in HPSG, we expect the larger numbers
of edges encountered while parsing with the CFG to
have a greater impact on parsing time, to the point
where the GIDLP grammar/parser is faster.
7 Summary
In this abstract, we have introduced a grammar format
that can be used as a processing backbone for
linearization-based HPSG grammars and that supports
the specification of discontinuous constituents and
word order constraints on domains that extend be-
yond the local tree. We have presented a prototype
parser for this format illustrating the use of order
constraint compilation techniques to improve eﬃ-
ciency. Future work will concentrate on additional
techniques for optimized parsing as well as the ap-
plication of the parser to feature-based grammars.
We hope that the GIDLP grammar format will en-
courage research on such optimizations in general,
in support of eﬃcient processing of relatively free
constituent order phenomena using linearization-
based HPSG.

References

Olivier Bonami, Danièle Godard, and Jean-Marie
Marandin. 1999. Constituency and word order
in French subject inversion. In Gosse Bouma et
al., editor, Constraints and Resources in Natural
Language Syntax and Semantics. CSLI.

Cathryn Donohue and Ivan A. Sag. 1999. Domains
in Warlpiri. In Abstracts of the Sixth Int. Confer-
ence on HPSG, pages 101–106, Edinburgh.

Jay Earley. 1970. An efficient context-free parsing
algorithm. Communications of the ACM, 13(2).

Gerald Gazdar, Ewan Klein, Geoffrey K. Pullum,
and Ivan A. Sag. 1985. Generalized Phrase
Structure Grammar. Harvard University Press.

Thilo Götz and W. Detmar Meurers. 1997. The
ConTroll system as large grammar development
platform. In Proceedings of the EACL Workshop
“Computational Environments for Grammar De-
velopment and Linguistic Engineering”, Madrid.

Thilo Götz and Gerald Penn. 1997. A proposed
linear specification language. Volume 134 in Arbeitspapiere
des SFB 340, Tübingen.

Mohammad Haji-Abdolhosseini and Gerald Penn.
2003. ALE reference manual. Univ. Toronto.

Mark Johnson. 1985. Parsing with discontinuous
constituents. In Proceedings of ACL, Chicago.

Andreas Kathol and Carl Pollard. 1995. Extraposi-
tion via complex domain formation. In Proceed-
ings of ACL, pages 174–180, Boston.

Andreas Kathol. 1995. Linearization-Based Ger-
man Syntax. Ph.D. thesis, Ohio State University.

Martin Kay. 1990. Head-driven parsing. In Masaru
Tomita, editor, Current Issues in Parsing Tech-
nology. Kluwer, Dordrecht.

Stefan Müller. 1999. Deutsche Syntax deklarativ.
Niemeyer, Tübingen.

Stefan Müller. 2004. Continuous or discontinuous
constituents? A comparison between syntactic
analyses for constituent order and their process-
ing systems. Research on Language and Compu-
tation, 2(2):209–257.

Gerald Penn. 1999. Linearization and WH-
extraction in HPSG: Evidence from Serbo-
Croatian. In Robert D. Borsley and Adam
Przepiórkowski, editors, Slavic in HPSG. CSLI.

Carl Pollard and Ivan A. Sag. 1994. Head-
Driven Phrase Structure Grammar. University of
Chicago Press, Chicago.

Allan M. Ramsay. 1999. Direct parsing with dis-
continuous phrases. Natural Language Engi-
neering, 5(3):271–300.

Mike Reape. 1991. Parsing bounded discontinuous
constituents: Generalisations of some common
algorithms. In Mike Reape, editor, Word Order
in Germanic and Parsing. DYANA R1.1.C.

Mike Reape. 1993. A Formal Theory of Word Or-
der: A Case Study in West Germanic. Ph.D. the-
sis, University of Edinburgh.

Frank Richter and Manfred Sailer. 2001. On the left
periphery of German finite sentences. In W. Det-
mar Meurers and Tibor Kiss, editors, Constraint-
Based Approaches to Germanic Syntax. CSLI.

Stuart M. Shieber. 1984. Direct parsing of ID/LP
grammars. Linguistics & Philosophy, 7:135–154.

Oliver Suhre. 1999. Computational aspects of
a grammar formalism for languages with freer
word order. Diplomarbeit. (=Volume 154 in Ar-
beitspapiere des SFB 340, 2000).

Gertjan van Noord. 1991. Head corner parsing for
discontinuous constituency. In ACL Proceedings.

Shuichi Yatabe. 1996. Long-distance scrambling
via partial compaction. In Masatoshi Koizumi,
Masayuki Oishi, and Uli Sauerland, editors,
Formal Approaches to Japanese Linguistics 2.
MITWPL.
