Lexicalising Word Order Constraints for
Implemented Linearisation Grammar
Yo Sato
Department of Computer Science
King’s College London
yo.sato@kcl.ac.uk
Abstract
This paper presents a way in which a lex-
icalised HPSG grammar can handle word
order constraints in a computational pars-
ing system, without invoking an additional
layer of representation for word order,
such as Reape’s Word Order Domain. The
key proposal is to incorporate into lexi-
cal heads the WOC (Word Order Con-
straints) feature, which is used to constrain
the word order of its projection. We also
overview our parsing algorithm.
1 Introduction
It is a while since the linearisation technique was
introduced into HPSG by Reape (1993; 1994) as
a way to overcome the inadequacy of the con-
ventional phrase structure rule based grammars in
handling ‘freer’ word order of languages such as
German and Japanese. In parallel in computa-
tional linguistics, it has long been proposed that
more flexible parsing techniques may be required
to adequately handle such languages, but hitherto
a practical system using linearisation has eluded
large-scale implementation. There are at least two
obstacles: its higher computational cost accom-
panied with non-CFG algorithms it requires, and
the difficulty to state word order information suc-
cinctly in a grammar that works well with a non-
CFG parsing engine.
In a recent development, the ‘cost’ issue has
been tackled by Daniels and Meurers (2004), who
propose to narrow down on search space while us-
ing a non-CFG algorithm. The underlying princi-
ple is to give priority to the full generative capac-
ity, let the parser overgenerate at default but re-
strict generation for efficiency thereafter. While
sharing this principle, I will attempt to further
streamline the computation of linearisation, focus-
ing mainly on the issue of grammar formalism.
Specifically, I would like to show that the lex-
icalisation of word order constraints is possible
with some conservative modifications to the stan-
dard HPSG (Pollard and Sag, 1987; Pollard and
Sag, 1994). This will have the benefit of making
the representation of linearisation grammar sim-
pler and more parsing friendly than Reape’s influ-
ential Word Order Domain theory.
In what follows, after justifying the need for
non-CFG parsing and reviewing Reape’s theory, I
will propose to introduce into HPSG the Word Or-
der Constraint (WOC) feature for lexical heads. I
will then describe the parsing algorithm that refers
tothis featureto constrainthe searchfor efficiency.
1.1 Limitation of CFG Parsing
One of the main obstacles for CFG parsing is
the discontinuity in natural languages caused by
‘interleaving’ of elements from different phrases
(Shieber, 1985). Although there are well-known
syntactic techniques to enhance CFG as in GPSG
(Gazdar et al., 1985), there remain constructions
that show ‘genuine’ discontinuity of the kind that
cannot be properly dealt with by CFG.
Such ‘difficult’ discontinuity typically occurs
when it is combined with scrambling – another
symptomatic phenomenon of free word order lan-
guages – of a verb’s complements. The follow-
ing is an example from German, where scrambling
and discontinuity co-occur in what is called ‘inco-
herent’ object control verb construction.
(1) Ich glaube, dass der Fritz dem Frank
I believe Comp Fritz(Nom) Frank(Dat)
das Buch zu lesen erlaubt.
the book(Acc) to read allow
‘I think that Fritz allows Frank to read the book’
23
(1’) Ich glaube, dass der Fritz [das Buch] dem Frank
[zu lesen] erlaubt
Ich glaube, dass dem Frank [das Buch] der Fritz
[zu lesen] erlaubt
Ich glaube, dass [das Buch] dem Frank der Fritz
[zu lesen] erlaubt
...
Here (1) is in the ‘canonical’ word order while the
examples in (1’) are its scrambled variants. In
the traditional ‘bi-clausal’ analysis according to
which the object control verb subcategorises for
a zu-infinitival VP complement as well as nomi-
nal complements, this embedded VP, das Buch zu
lesen, becomes discontinuous in the latter exam-
ples (in square brackets).
One CFG response is to use ‘mono-clausal’
analysis or argument composition(Hinrichs and
Nakazawa, 1990), according to which the higher
verb and lower verb (in the above example er-
lauben and zu lesen) are combined to form a sin-
gle verbal complex, which in turn subcategorises
for nominal complements (das Buch, der Fritz and
dem Frank). Under this treatment both the ver-
bal complex and the sequence of complements are
rendered continuous, rendering all the above ex-
amples CFG-parseable.
However, this does not quite save the CFG
parseability, in the face of the fact that you could
extrapose the lower V + NP, as in the following.
(2) Ich glaube, dass der Fritz dem Frank [erlaubt], das
Buch [zu lesen].
Now we have a discontinuity of ‘verbal complex’
instead of complements (the now discontinuous
verbal complex is marked with square brackets).
Thus either way, some discontinuity is inevitable.
Such discontinuity is by no means a marginal
phenomenon limited to German. Parallel phenom-
ena are observed in the object control verbs in
Korean and Japanese ((Sato, 2004) for examples).
These languages also show a variety of ‘genuine’
discontinuity of other sorts, which do not lend
itself to a straightforward CFG parsing (Yatabe,
1996). TheCFG-recalcitrant constructions exist in
abundance, pointing to an acute need for non-CFG
parsing.
1.2 Reape’s Word Order Domain
The most influential proposal to accommodate
such discontinuity/scrambling in HPSG is Reape’s
Word Order Domain, or DOM, a feature that con-
stitutes an additional layer separate from the dom-
inance structure of phrases (Reape, 1993; Reape,
1994). DOM encodes the phonologically realised
(‘linearised’) list of signs: the daughter signs of a







phrase
DOM
angbracketleftbig
1 © 2 © 3 ©...© n
angbracketrightbig
HD-DTR
angbracketleftBiggbracketleftBiggphrase
DOM 1
UNIONED +
bracketrightBiggangbracketrightBigg
NHD-DTRs
angbracketleftBiggbracketleftBiggphrase
DOM 2
UNIONED +
bracketrightBigg
,
bracketleftBigg
phrase
DOM 3
UNIONED +
bracketrightBigg
...
bracketleftBigg
phrase
DOM n
UNIONED +
bracketrightBiggangbracketrightBigg







Figure 1: Word Order Domain
phrase in the HD-DTR and NHD-DTRS features
are linearly ordered as in Figure 1.
The feature UNIONED in the daughters indi-
cates whether discontinuity amongst their con-
stituents is allowed. Computationally, the positive
(‘+’) value of the feature dictates (the DOMs of)
the daughters to be sequence unioned (represented
by the operator ©) into the mother DOM: details
apart, this operation essentially merges two lists in
a way that allows interleaving of their elements.
In Reape’s theory, LP constraints come from
an entirely different source. There is nothing as
yet that blocks, for instance, the ungrammatical
zu lesen das Buch VP sequence. The relevant
constraint, i.e. COMPS≺ZU-INF-V in German, is
stated in the LP component of the theory. Thus
with the interaction of the UNIONED feature and
LP statements, the grammar rules out the unac-
ceptable sequences while endorsing grammatical
ones such as the examples in (1’).
One important aspect of Reape’s theory is that
DOM is a list of whole signs rather than of any
part of them such as PHON. This is necessi-
tated by the fact that in order to determine how
DOM should be constructed, the daughters’ inter-
nal structure need to be referred to, above all, the
UNIONED feature. In other words, the internal
features of the daughters must be accessible.
While this is a powerful system that overcomes
the inadequacies of phrase-structure rules, some
may feel this is a rather heavy-handed way to
solve the problems. Above all, much information
is repeated, as all the signs are effectively stated
twice, once in the phrase structure and again in
DOM. Also, the fact that discontinuity and lin-
ear precedence are handled by two distinct mecha-
nisms seems somewhat questionable, as these two
factors are computationally closely related. These
properties are not entirely attractive features for a
computational grammar.
24
2 Lexicalising Word Order Constraints
2.1 Overview
Our theoretical goal is, in a nutshell, to achieve
what Reape does, namely handling discontinuity
and linear precedence, in a simpler, more lexical-
ist manner. My central proposal consists in incor-
porating the Word Order Constraint (WOC) fea-
ture into the lexical heads, rather than positing an
additional tier for linearisation. Some new sub-
features will also be introduced.
The value of the WOC feature is a set of word-
order related constraints. It may contain any re-
lational constraint the grammar writer may want
with the proviso of its formalisability, but for the
current proposal, I include two subfeatures ADJ
(adjacency) and LP, both of which, being binary
relations, are represented as a set of ordered pairs,
the members of which must either be the head it-
self or its sisters. Figure 2 illustrates what such
feature structure looks like with an English verb
provide, as in provide him with a book.
We will discuss the new PHON subfeatures in
the next section – for now it would suffice to con-
sider them to constitute the standard PHON list –
so let us focus on WOC here. The WOC feature of
this verb says, for its projection (VP), three con-
straints have to be observed. Firstly, the ADJ sub-
feature says that the indirect object NP has to be
in the adjacent position to the verb (‘provide yes-
terday him with a book’ is not allowed). Secondly,
the first two elements of the LP value encode a
head-initial constraint for English VPs, namely
that a head verb has to be preceded by its com-
plements. Lastly, the last pair in the same set says
the indirect object must precede the with-PP (‘pro-
vide with a book him’ is not allowed). Notice that
this specification leaves room for some disconti-
nuity, as there is no ADJ requirement between the
indirect NP and with-PP. Hence, provide him yes-
terday with a book is allowed.
The key idea here is that since the complements
of a lexical head are available in its COMPS fea-
ture, it should be possible to state the relative lin-
ear order which holds between the head and a
complement, as well as between complements, in-
side the feature structure of the head.
Admittedly word order would naturally be con-
sidered to reside in a phrase, string of words.
It might be argued, on the ground that a head’s
COMPS feature simply consists of the categories
it selects for in exclusion of the PHON feature,
that with this architecture one would inevitably
encounter the ‘accessibility’ problem discussed in
v









verb
PHON


phon-wd
CONSTITUENTS
braceleftbig
provide
bracerightbig
CONSTRAINTS{}


COMPS
angbracketleftbigg
np
bracketleftBignp
case Acc
bracketrightBig
, pp
bracketleftBigpp
pform with
bracketrightBigangbracketrightbigg
WOC



woc
ADJ
braceleftBigangbracketleftbig
v , np
angbracketrightbigbracerightBig
LP
braceleftBigangbracketleftbig
v , np
angbracketrightbig
,
angbracketleftbig
v , pp
angbracketrightbig
,
angbracketleftbig
np , pp
angbracketrightbigbracerightBig












Figure 2: Example of lexical head with WOC fea-
ture
Section 1.2: in order to ensure the enforceability
of word order constraints, an access must be se-
cured to the values of the internal features includ-
ing the PHON values. However, this problem can
be overcome, as we will see, if due arrangements
are in place.
The main benefit of this mechanism is that it
paves way to an entirely lexicon-based rule spec-
ification, so that, on one hand, duplication of in-
formationbetween lexicalspecification and phrase
structure rules can be reduced and on the other, a
wide variety of lexical properties can be flexibly
handled. If the word order constraints, which have
been regarded as the bastion of rule-based gram-
mars, is shown to be lexically handled, it is one
significant step further to a fully lexicalist gram-
mar.
2.2 New Head-Argument Schema
What is crucial for this WOC-incorporated gram-
mar is how the required word order constraints
stated in WOC are passed on and enforced in its
projection. I attempt to formalise this in the form
of Head-Argument Schema, by modifying Head-
Complement Schema of Pollard and Sag (1994).
There are two key revisions: an enriched PHON
feature that contains word order constraints and
percolation of these constraints emanating from
the WOC feature in the head.
The revised Schema is shown in Figure 3. For
simplicity only the LP subfeature is dealt with,
since the ADJ subfeature would work exactly the
same way. The set notations attached underneath
states the restriction on the value of WOC, namely
that all the signs that appear in the constraint
pairs must be ‘relevant’, i.e. must also appear as
daughters (included in ‘DtrSet’, the set of the head
daughter and non-head daughters). Naturally, they
also cannot be the same signs (xnegationslash=y).
Let me discuss some auxiliary modifications
25


















head-arg-phrase
PHON



phon
CONSTITS
uniontextbraceleftBigbraceleftbig
ph
bracerightbig
, pa1 ,..., pai ,..., paj ,... pan
bracerightBig
CONSTRTS|LP
uniontextbraceleftbiggbraceleftBig
...,
angbracketleftbig
pai , paj
angbracketrightbig
,...
bracerightBig
, ca1 ,..., cai ,... caj ,..., can
bracerightbigg



ARGS〈〉
HD-DTR hd










word
PHN
bracketleftbigg
CONSTITS
braceleftbig
ph
bracerightbig
CONSTRS{}
bracketrightbigg
ARGS args
angbracketleftBigga1


sign
PHN
bracketleftbigg
CONSTITS pa1
CONSTRS ca1
bracketrightbigg

,..., ai


sign
PHN
bracketleftbigg
CONSTITS pai
CONSTRS cai
bracketrightbigg

,
..., aj


sign
PHN
bracketleftbigg
CONSTITS paj
CONSTRS caj
bracketrightbigg

,..., an
bracketleftBiggsign
PHN
bracketleftBigCONSTITS pan
CONSTRS can
bracketrightBig
bracketrightBigg
angbracketrightBigg
WOC|LP wocs
braceleftBig
...,
angbracketleftbig
ai , aj
angbracketrightbig
,...
bracerightBig










NHD-DTRs args


















where wocs ⊆ {〈x,y〉|xnegationslash=y, x,y∈DtrSet}
DtrSet = {hd}∪ args
Figure 3: Head-Argument Schema with WOC feature
first. Firstly, we change the feature name from
COMPS to ARGS because we assume a non-
configurational flat structure, as is commonly the
case with linearisation grammar. Another change
I propose is to make ARGS a list of underspeci-
fied signs instead of SYNSEMs as standardly as-
sumed (Pollard and Sag, 1994). In fact, this is a
position taken in an older version of HPSG (Pol-
lard and Sag, 1987) but rejected on the ground of
the locality of subcategorisation. The main reason
for this reversal is to facilitate the ‘accessibility’
we discussed earlier. As unification and percola-
tion of the PHON information is involved in the
Schema, it is much more straightforward to for-
mulate with signs. Though the change may not
be quite defensible solely on this ground,1 there is
reason to leave the locality principle as an option
for languages of which it holds rather than hard-
wire it into the Schema, since some authors raise
doubt as for the universal applicability of the lo-
cality principle e.g. (Meurers, 1999).
Turning to a more substantial modification, our
new PHON feature consists of two subfeatures,
CONSTITUENTS (or CONSTITS) and CON-
STRAINTS (or CONSTRS). The former encodes
the set that comprises the phonology of words of
which the string consists. Put simply, it is the un-
1Another potential problem is cyclicity, since the sign-
valued ARGS feature contains the WOC feature, which could
contain the head itself. This has to be fixed for the systems
that do not allow cyclicity.
ordered version of the standard PHON list. The
CONSTRAINTS feature represents the concata-
native constraints applicable to the string. Thus,
the PHON feature overall represents the legitimate
word order patterns in an underspecified way, i.e.
any of the possible string combinations that obey
the constraints. Let me illustrate with a VP ex-
ample, say, consisting of meet, often and Tom, for
which we assume that the following word order
patterns are acceptable,
〈meet, Tom, often〉, 〈often, meet, Tom〉
but not the followings:
〈meet, often, Tom〉, 〈Tom, often, meet〉,
〈Tom, meet, often〉, 〈often, Tom, meet〉.
This situation can be captured by the following
feature specification for PHON, which encodes
any of the acceptable strings above in an under-
specified way.




PHON





CONSTITS
braceleftbig
often, Tom, meet
bracerightbig
CONSTRS



ADJ
braceleftbiggangbracketleftBigbraceleftbig
meet
bracerightbig
,
braceleftbig
Tom
bracerightbigangbracketrightBigbracerightbigg
LP
braceleftbiggangbracketleftBigbraceleftbig
meet
bracerightbig
,
braceleftbig
Tom
bracerightbigangbracketrightBigbracerightbigg













The key point is that now the computation of
word order can be done based on the information
inside the PHON feature, though indeed the CON-
STR values have to come from outside – the word
order crucially depends on SYNSEM-related val-
ues of the daughter signs.
26
Let us now go back to the Schema in Figure 3
and see how to determine the CONSTR values to
enter the PHON feature. This is achieved by look-
ing up the WOC constraints in the head (let’s call
this Step 1) and pushing the relevant constraints
into the PHON feature of its mother, according to
the type of constraints (Step 2).
For readability Figure 3 only states explicitly
a special case – where one LP constraint holds
of two of the arguments – but the reader is
asked to interpret ai and aj in the head daughter’s
WOC|LP to represent any two signs chosen from
the ‘DTRS’ list (including the head, hd). 2 The
structure sharing of ai and aj between WOC|LP
and ARGS indicates that the LP constraint applies
to these two arguments in this order, i.e. ai≺aj.
Thus through unification, it is determined which
constraints apply to which pairs of daughter signs
inside the head. This corresponds to Step 1.
Now, only for these WOC-applicable daughter
signs, the PHON|CONSTIITS valuesare pairedup
for each constraint (in this case 〈pai, paj〉) and
pushed into the mother’s PHON|CONSTRS fea-
ture. This corresponds to Step 2.
Notice alsothatthe CONSTRAINTSsubfeature
is cumulatively inherited. All the non-head daugh-
ters’ CONSTR values (ca1,...,can) – the word or-
der constraints applicable to each of these daugh-
ters – are also passed up, collecting effectively
all the CONSTR values of its daughters and de-
scendants. This means the information concern-
ing word order, as tied to particular string pairs, is
never lost and passed up all the way through. Thus
the WOC constraints can be enforced at any point
where both members of the string pair in question
are instantiated.
2.3 A Worked Example
Let us now go through an example of applying
the Schema, again with the German subordinate
clause, das Buch der Fritz dem Frank zu lesen er-
laubt (and other acceptable variants). Our goal is
to enforce the ADJ and LP constraints in a flexible
enough way, allowing the acceptable sequences
such as those we saw in Section 1.2.1. while
blocking the constraint-violating instances.
The instantiated Schema is shown in Figure 4.
Let us start with a rather deeply embedded level,
the embedded verb zu-lesen, marked v2, found in-
side vp (the last and largest NHD-DTR) as its HD-
2For the generality of the number of ARGS elements,
which should be taken to be any number including zero, the
recursive definition as detailed in (Richter and Sailer, 1995)
can be adopted.
DTR, which I suppose to be one lexical item for
simplicity. This is one of the lexical heads from
which the WOC constraints emanate. Find, in
this item’s WOC, a general LP constraint for zu-
Infinitiv VPs, COMPS≺V, namely np3≺v2. Then
the PHON|CONSTITS values of these signs are
searched for and found in the daughters, namely
pnp3 and pv2. These values are paired up and
passed into the CONSTRS|LP value of its mother
VP. Notice also that into this value the NHD-
DTRs’ CONSTR|LP values, in this case only
lpnp3 ({das}≺{Buch}), are also unioned, consti-
tuting lpvp: we are here witnessing the cumula-
tive inheritance of constraints explained earlier.
Turn attention now to the percolation of ADJ sub-
feature: no ADJ requirement is found between
das Buch and zu-lesen (v2’s WOC|ADJ is empty),
though ADJ is required one node below, between
das and Buch (np3’s PHN|CONSTR|ADJ). Thus
no new ADJ pair is added to the mother VP’s
PHON|CONSTR feature.
Exactly the same process is repeated for the
projection of erlauben (v1), where its WOC
again contains only LP requirements. With the
PHON|CONSTITS values of the relevant signs
found and paired up ({Fritz,der}≺{erlaubt} and
{Frank,dem}≺{erlaubt}), they are pushed into its
mother’s PHON|CONSTRS|LP value, which is
also unioned with the PHON|CONSTRS values of
the NHD-DTRS. Notice this time that there is no
LP requirement between the zu-Infinitiv VP, das
Buch zu-lesen, and the higher verb, erlaubt. This
is intended to allow for extraposition.3
The eventual effect of the cumulative constraint
inheritance can be more clearly seen in the sub-
AVM underneath, which shows the PHON part of
the whole feature structure with its values instan-
tiated. After a succession of applications of the
Head-Argument Schema, we now have a pool of
WOCs sufficient to block unwanted word order
patterns while endorsing legitimate ones. The rep-
resentation of the PHON feature being underspec-
ified, it corresponds to any of the appropriately
constrained order patterns. der Fritz dem Frank
zu lesen das Buch erlaubt would be ruled out by
the violation of the last LP constraint, der Fritz er-
laubt dem Frank das Buch zu lesen by the second,
and so on.
The reader might be led to think, because of
3The lack of this LP requirement also entails some
marginally acceptable instances, such as der Fritz dem Frank
das Buch erlaubt zu lesen, considered ungrammatical by
many. These instances can be blocked, however, by intro-
ducing more complex WOCs. See Sato (forthcoming a).
27






































subordinate-clause
PHON


CONSTITS pv1 ∪ pnp1 ∪ pnp2 ∪ pvp
CONSTRS
bracketleftBigg
ADJ adnp1 ∪ adnp2 ∪ adnp3
LP
braceleftBigangbracketleftbig
pnp1 , pv1
angbracketrightbig
,
angbracketleftbig
pnp2 , pv1
angbracketrightbigbracerightBig
∪ lpnp1 ∪ lpnp2 ∪ lpvp
bracketrightBigg


ARGS〈〉
HD-DTR v1




verb
PHON|CONSTITS pv1
braceleftbig
erlaubt
bracerightbig
ARGS
angbracketleftbig
np1 , np2 , vp
angbracketrightbig
WOC
bracketleftBigg
ADJ{}
LP
braceleftBigangbracketleftbig
np1 , v1
angbracketrightbig
,
angbracketleftbig
np2 , v1
angbracketrightbigbracerightBig
bracketrightBigg




NHD-DTRs
angbracketleftBigg
np1






np
PHON




CONSTITS pnp1
braceleftbig
Fritz, der
bracerightbig
CONSTRS



ADJ adnp1
braceleftbiggangbracketleftBigbraceleftbig
Fritz
bracerightbig
,
braceleftbig
der
bracerightbigangbracketrightBigbracerightbigg
LP lpnp1
braceleftbiggangbracketleftBigbraceleftbig
der
bracerightbig
,
braceleftbig
Fritz
bracerightbigangbracketrightBigbracerightbigg







SYNSEM|... | CASE Nom






, np2






np
PHON




CONSTITS pnp1
braceleftbig
Frank, dem
bracerightbig
CONSTRS



ADJ adnp2
braceleftbiggangbracketleftBigbraceleftbig
Frank
bracerightbig
,
braceleftbig
der
bracerightbigangbracketrightBigbracerightbigg
LP lpnp2
braceleftbiggangbracketleftBigbraceleftbig
der
bracerightbig
,
braceleftbig
Frank
bracerightbigangbracketrightBigbracerightbigg







SYNSEM| ... |CASE Dat






,
vp


















vp
PHON


CONSTITS pvp : pv2 ∪ pnp3
CONSTRS
bracketleftBigg
ADJ adnp3
LP lpvp
braceleftBigangbracketleftbig
pnp3 , pv2
angbracketrightbigbracerightBig
∪ lpnp3
bracketrightBigg


ARGS〈〉
HD-DTR v2




v
PHON| CONSTITS pv2
braceleftbig
zu-lesen
bracerightbig
ARGS
angbracketleftbig
np3
angbracketrightbig
WOC
bracketleftBigg
ADJ{}
LP
braceleftBigangbracketleftbig
np3 , v2
angbracketrightbigbracerightBig
bracketrightBigg




NHD-DTRS
angbracketleftBigg
np3






np
PHON




CONSTITS pnp3
braceleftbig
Buch,das
bracerightbig
CONSTRS



ADJ adnp3
braceleftbiggangbracketleftBigbraceleftbig
Buch
bracerightbig
,
braceleftbig
das
bracerightbigangbracketrightBigbracerightbigg
LP lpnp3
braceleftbiggangbracketleftBigbraceleftbig
das
bracerightbig
,
braceleftbig
Buch
bracerightbigangbracketrightBigbracerightbigg







SYNSEM|... | CASE Acc






angbracketrightBigg


















angbracketrightBigg






































Instantiated PHON part of the above:
PHON






CONSTITS
braceleftbig
erlaubt, Fritz, der, Frank, dem, zu-lesen, Buch, das
bracerightbig
CONSTRS




ADJ
braceleftbiggangbracketleftBigbraceleftbig
Fritz
bracerightbig
,
braceleftbig
der
bracerightbigangbracketrightBig
,
angbracketleftBigbraceleftbig
Frank
bracerightbig
,
braceleftbig
dem
bracerightbigangbracketrightBig
,
angbracketleftBigbraceleftbig
Buch
bracerightbig
,
braceleftbig
das
bracerightbigangbracketrightBigbracerightbigg
LP



angbracketleftBigbraceleftbig
Fritz,der
bracerightbig
,
braceleftbig
erlaubt
bracerightbigangbracketrightBig
,
angbracketleftBigbraceleftbig
Frank,dem
bracerightbig
,
braceleftbig
erlaubt
bracerightbigangbracketrightBig
,
angbracketleftBigbraceleftbig
der
bracerightbig
,
braceleftbig
Fritz
bracerightbigangbracketrightBig
,
angbracketleftBigbraceleftbig
dem
bracerightbig
,
braceleftbig
Frank
bracerightbigangbracketrightBig
,
angbracketleftBigbraceleftbig
das
bracerightbig
,
braceleftbig
Buch
bracerightbigangbracketrightBig
,
angbracketleftBigbraceleftbig
Buch,das
bracerightbig
,
braceleftbig
zu-lesen
bracerightbigangbracketrightBig













Figure 4: An application of Head-Argument Schema
28
the monotonic inheritance of constraints, that the
WOC compliance cannot be checked until the
stage of final projection. While this is generally
true for freer word order languages considering
various scenarios such as bottom-up generation,
one can conduct the WOCcheck immediatelyafter
the instantiation of relevant categories in parsing,
the fact we can exploit in our implementation, as
we will now see.
3 Constrained Free Word Order Parsing
3.1 Algorithm
In this section our parsing algorithm that works
with the lexicalised linearisation grammar out-
lined above is briefly overviewed.4 It expands on
two existing ideas: bitmasks for non-CFG parsing
and dynamic constraint application.
Bitmasks are used to indicate the positions of
a parsed words, wherever they have been found.
Reape (1991) presents a non-CFG tabular parsing
algorithm using them, for ‘permutation complete’
language, which accepts all the permutations and
discontinuous realisations of words. To take for
an example a simple English NP that comprises
the, thick and book, this parser accepts not only
their 3! permutations but discontinuous realisa-
tions thereof in a longer string, such as [book, -,
the, -, thick] (‘-’ indicates the positions of con-
stituents from other phrases).
Clearly, the problem here is overgeneration and
(in)efficiency. In the current form the worst-
case complexity will be exponential (O(n!·2n), n =
length of string). In response, Daniels and Meur-
ers (2004) propose to restrict search space dur-
ing the parse with two additional bitmasks, pos-
itive and negative masks, which encode the bits
that must be and must not be occupied, respec-
tively, based on what has been found thus far and
the relevant word order constraints. For example,
given the constraints that Det precedes Nom and
Det must be adjacent to Nom and supposing the
parser has found Det in the third position of a five
word string like above, the negative mask [ x, x,
the, -, -] is created, where x indicates the position
that cannot be occupied by Nom, as well as the
positive mask [ * , das, *, -], where * indicates the
positions that must be occupied by Nom. Thus,
you can stop the parser from searching the posi-
tions the categories yet to be found cannot occupy,
or force it to search only the positions they have to
occupy.
4For full details see Sato (forthcoming b).
A remaining important job is to how to state the
constraints themselves in a grammar that works
with this architecture, and Daniels and Meurers’
answer is a rather traditional one: stating them in
phrase structure rules as LP attachments. They
modify HPSG rather extensively in a way simi-
lar to GPSG, in what they call ‘Generalised ID/LP
Grammar’. However, as we have been arguing,
this is not an inevitable move. It is possible to keep
the general contour of the standard HPSG largely
intact.
The way our parser interacts with the grammar
is fundamentally different. We take full advan-
tage of the information that now resides in lexi-
cal heads. Firstly, rules are dynamically generated
from the subcategorisation information (ARGS
feature) in the head. Secondly, the constraints
are picked up from the WOC feature when lexical
heads are encountered and carried in edges, elimi-
nating the need for positive/negative masks. When
an active edge is about to embrace the next cate-
gory, these constraints are checked and enforced,
limiting the search space thereby.
After the lexicon lookup, the parser generates
rules from the found lexical head and forms lexi-
cal edges. It is also at this stage that the WOC is
picked up and pushed into the edge, along with the
rule generated:
〈Mum→ Hd-Dtr • Nhd1 Nhd2...Nhdn; WOCs〉
where WOCs is the set of ADJ and LP constraints
picked up, if any. This edge now tries to find the
rest – non-head daughters. The following is the
representation of an edge when the parsing pro-
ceeds to the stage where some non-head daughter,
in this representation Dtri, has been parsed, and
Dtrj is to be searched for.
〈Mum→ Dtr1 Dtr2...Dtri• Dtrj...Dtrn; WOCs〉
When Dtrj is found, the parser does not immedi-
ately move the dot. At this point the WOC com-
pliance check with the relevant WOC constraint –
the one(s) involving Dtri and Dtrj – is conducted
on these two daughters. The compliance check is
a simple list operation. It picks the bitmasks of
the two daughters in question and checks whether
the occupied positions of one daughter precede/are
adjacent to those of the other.
The failure of this check would prevent the dot
move from taking place. Thus, edges that violate
the word order constraints would not be created,
thereby preventing wasteful search. This is the
same feature as Daniels and Meurers’, and there-
fore the efficiency in terms of the number of edges
is identical. The main difference is that we use
29
the information inside the feature structure with-
out having media like positive/negative masks.
3.2 Implementation
I have implemented the algorithm in Prolog and
coded the HPSG feature structure in the way de-
scribed using ProFIT (Erbach, 1995). It is a head-
corner, bottom-up chart parser, roughly based on
Gazdar and Mellish (1989). The main modifi-
cation consists of introducing bitmasks and the
word order checking procedure described above.
I created small grammars for Japanese and Ger-
man and put them to the parser, to confirm that
linearisation-heavy constructions such as object
control construction can be successfully parsed,
with the WOC constraints enforced.
4 Future Tasks
What we have seen is an outline of my initial pro-
posal and there are numerous tasks yet to be tack-
led. First of all, now that the constraints are writ-
ten in individual lexical items, we are in need of
appropriate typing in terms of word order con-
straints, in order to be able to state succinctly gen-
eral constraints such as the head-final/initial con-
straint. In other words, it is crucial to devise an
appropriate type hierarchy.
Another potential problem concerns the gen-
erality of our theoretical framework. I have fo-
cused on the Head-Argument structure in this pa-
per, but if the present theory were to be of gen-
eral use, non-argument constructions, such as the
Head-Modifier structure, must be accounted for.
Also, the cases where the head of a phrase is itself
a phrase may pose a challenge, if such a phrasal
head were to determine the word order of its pro-
jection. Since it is desirable for computational
transparencynot touse emergentconstraints,Iwill
attempt to get all the word order constraints ul-
timately propagated and monotonically inherited
from the lexical level. Though some word order
constraints may turn out to have to be written into
the phrasal head directly, I am confident that the
majority, if not all, of the constraints can be stated
in the lexicon. These issues are tackled in a sepa-
rate paper (Sato, forthcoming a).
In terms of efficiency, more study has to be re-
quiredto identify the exactcomplexity of myalgo-
rithm. Also, with a view to using it for a practical
system, an evaluation of the efficiency on the ac-
tual machine will be crucial.
References
M. Daniels and D. Meurers. 2004. GIDLP: A gram-
mar format for linearization-based HPSG. In Pro-
ceedings of the HPSG04 Conference.
G. Erbach. 1995. ProFIT: Prolog with features, in-
heritance and templates. Proceedings of the Seventh
Conference of the European Associationfor Compu-
tational Linguistics.
G. Gazdar and C. Mellish. 1989. Natural Language
Processing in Prolog. Addison Wesley.
G.Gazdar,E.Klein,G. Pullum,andI.Sag. 1985. Gen-
eralized Phrase Structure Grammar. Harvard UP.
E.Hinrichs andT.Nakazawa. 1990. Subcategorization
and VP structure in German. In S. Hughes et al.,
editor, Proceedings of the Third Symposium on Ger-
manic Linguistics.
D. Meurers. 1999. Raising Spirits (and assigning them
case). GroningerArbeitenzur GermanistischenLin-
guistik, Groningen Univ.
C. Pollard and I. Sag. 1987. Information-Based Syntax
and Semantics. CSLI.
C. Pollard and I. Sag. 1994. Head-Driven Phrase
Structure Grammar. CSLI.
M. Reape. 1991. Parsing bounded discontinuous
constituents: Generalisation of some common algo-
rithms. DIANA Report, Edinburgh Univ.
M. Reape. 1993. A Formal Theory of Word Order.
Ph.D. thesis, Edinburgh University.
M. Reape. 1994. Domain union and word order vari-
ation in German. In J. Nerbonne et al., editor, Ger-
man in Head-Driven Phrase Structure Grammar.
F. Richter and M. Sailer. 1995. Remarks on lineariza-
tion. Magisterarbeit, Tübingen Univ.
Y. Sato. 2004. Discontinuous constituency and non-
CFG parsing. http://www.dcs.kcl.ac.uk/pg/satoyo.
Y. Sato. forthcoming a. Two alternatives for lexicalist
linearisation grammar: Locality Principle revisited.
Y. Sato. forthcoming b. Constrained free word order
parsing for lexicalist grammar.
S. Shieber. 1985. Evidence against the context free-
ness of natural languages. Linguistics and Philoso-
phy, 8(3):333–43.
S. Yatabe. 1996. Long-distance scrambling via partial
compaction. In M. Koizumi et al., editor, Formal
Approaches to Japanese Linguistics 2. MIT Press,
Cambridge, Mass.
30
