THE PROPER TREATMENT OF WORD ORDER IN HPSG* 
KAREL OLIVA
Department of Computational Linguistics 
University of Saarland 
Im Stadtwald 
W-6600 Saarbrücken 11
Federal Republic of Germany
e-mail: karel@coli.uni-sb.de 
Abstract: 
This paper describes a possi- 
bility of expressing ordering con- 
straints among non-sister constituents 
in binary branching syntactic 
structures on a local basis, supported 
by viewing the binary branching 
structure as a list (rather than a 
tree) of constituents within HPSG- 
style grammars. The core idea of such 
a description of ordering is consti- 
tuted by creating a type lattice for 
lists. The possibilities of expressing 
different approaches to word order in 
the framework are briefly discussed, 
exemplified and compared to other 
methods. 
In the standard immediate-con- 
stituent based approaches, the "free" 
word order 1 is described either directly
in the phrase-structure (PS)
rules, which thus express simultane- 
ously both dominance (mother/daughter) 
relations and precedence (ordering) 
relations between syntactic cate- 
gories, or, in more recent formalisms 
such as GPSG or HPSG, by the linear 
precedence (LP) rules creating a sepa- 
rate component of the grammar, whose 
other component is the set of immedi- 
ate-dominance (ID) rules. In both the- 
se cases, the ordering constraints are 
limited to sister constituents, i.e. 
they are strictly local. 
One of the problems of both 
these variations of the standard PS- 
approach is the description of ad- 
juncts (free modifiers). On the usual 
assumption that their number per clau- 
se is in principle not limited (though 
finite for a particular clause), an 
approach to their ordering presuppos- 
ing them to be all sister constituents 
must necessarily presuppose also an
(at least potentially) infinite set of 
generative rules (e.g., a set induced 
by a Kleene star used in a "basic" 
variant of one of the rules). 
In languages whose grammar al- 
lows for more word order freedom than 
English 2, it is often the case that 
adjuncts and complements of a head 
(typically, but not solely, of a fi- 
nite verb) can be freely intermixed, 
which makes the approach where the lo- 
cality of LP constraints forces the 
head as well as its modifiers (both 
complements and adjuncts) be expanded 
as sisters still less attractive. 
Another possibility of de- 
scription of word order is the "topo- 
logical" approach used predominantly 
in more traditionally oriented German 
linguistics. Applied to German, this 
approach divides a clause into several 
word order "fields" ("Vorfeld", 
"Mittelfeld", "Nachfeld", "linke/rech- 
te Satzklammer") whose mutual position 
is fixed, and studies mainly the word 
order regularities within these 
"fields". Though a lot of work has 
been done and many valuable insights 
presented within this paradigm, seen 
from the viewpoint of computational 
linguistics this approach has the 
fatal disadvantage that it is ex- 
tremely difficult to formalize within 
the standard frameworks (e.g., none of 
the "fields" with the possible ex- 
ception of "Vorfeld" creates a con- 
stituent in any usual sense etc.). 
As an alternative to the two basic
approaches mentioned above (as a
modification of the first one, in 
fact), the description based on binary 
right-branching structures has been 
proposed independently in several 
works concerned with languages exhibiting
ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992   184   PROC. OF COLING-92, NANTES, AUG. 23-28, 1992
a considerable share of free
word order: in (Uszkoreit, 1986) for
the description of verb-final clauses
in German, in (Gunji, 1987) for
Japanese and in (Avgustinova and Oliva,
1990) for (mainly) Slavic languages.
However, the price paid for
the removal of some problems, mainly, 
the free intermixing of heads, comple- 
ments and adjuncts, of the above- 
mentioned more standard descriptions 
is rather high - at least two problem- 
atic points arise due to the strict 
binarity of the structure. The first 
of them is the fact that in binary 
structures no LP-rules relying on the 
relation "being sister constituent" 
can be used for ordering heads, complements
and adjuncts in cases where this is
required, since these are not sisters
any more. The second problematic point
is best seen in the variant of
the formalism given in (Avgustinova
and Oliva, 1990): the occurrence of
the phonologically empty rightmost
element of the branching 3 (cf. the
structure (1) for the string "John
kissed Mary yesterday").
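(1) [Figure: binary right-branching structure for "John kissed Mary yesterday"; the diagram is not recoverable from the source]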
The former problem concerning word or- 
der is in the majority of the binary- 
branching approaches (as far as they 
are at all concerned with it) solved 
by introducing word order mechanisms 
which are either of non-local nature 
or which burden the syntactic cate- 
gories (understood as feature bundles) 
with otherwise unmotivated features 
used solely for the purpose of im- 
posing ordering constraints (and most 
often, with a combination of the two). 
Neither approach is more fortunate
than the other - non-locality is surely
an undesirable phenomenon in the description,
and the presence of special
ordering features in the categories is 
hardly better, i.a. also because order 
is a property of the syntactic struc- 
ture (made of categories) rather than 
of the categories themselves 4 . 
This paper will try to show that 
in spite of the abovementioned reser- 
vations the "binary branching" can be 
a correct and fruitful approach to 
syntactic description if seen from a 
slightly different viewpoint. In order 
to get the proper perspective, let us 
observe the "binary branching" 
structure for the example sentence 
"The small boy ate an apple" shown in 
(2) . 
[Figure (2): binary right-branching structure for "The small boy ate an apple", with nodes marked NP and VP and leaves including N(boy), V(ate), Art(an) and N(apple); the diagram is not recoverable from the source]
There are several things to be
taken into consideration here.
The most obvious among them is
the division of the structure into
"levels" - contiguous sequences of nodes
with identical marking. Thus one
"VP-level" and two "NP-levels" are to
be clearly seen, each having a distinguished
element at its end (the phonologically
empty element).
Further, it can be observed that 
each of the "levels" has one (and only 
one) other distinguished element some- 
where in a non-final position - it is 
the V element for the "VP-level" and 
the N elements for the "NP-levels", in 
other words, each level has a head.
It is also worth remarking that 
the levels of the binary branching 
have a direct relation to more usual 
approaches. Thus, the standard immediate-constituent
tree (3) for the sentence
tence from (2) can be obtained by fac- 
torizing (collapsing) the NP nodes 
from the respective NP-levels into a 
single one, and by factorizing all the 
VP-nodes except for the uppermost 
"sentential" one. 
(3)              S (= VP)
               /          \
            NP              VP
          /  |  \          /  \
       Art  Adj   N       V    NP
        |    |    |       |   /  \
       the small boy     ate Art   N
                              |    |
                              an  apple
The dependency tree (4) of the same 
sentence can be then obtained by col- 
lapsing all nodes of a level plus its 
head into a single node. 
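(4) [Figure: dependency tree of the sentence from (2), rooted in "ate"; only the labels "ate", "the" and "small" survive in the source]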
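The collapsing step that yields the dependency tree can be illustrated by a small Python sketch. The encoding is an assumption made only for this illustration and is not part of the formalism: a level is a Python list of (role, item) pairs, the role is either "head" or "mod", an item is a word or an embedded level, and the phonologically empty final element is left implicit as the end of the Python list. The function names are hypothetical.

```python
# A level is a list of (role, item) pairs; an item is a word (str) or a nested level.
# The empty final element of each level is implicit (the end of the Python list).

def head_word(level):
    """Return the word of the single head element of a level."""
    heads = [item for role, item in level if role == "head"]
    assert len(heads) == 1, "each level has exactly one head"
    return heads[0]

def to_dependency(level):
    """Collapse a level plus its head into one node: (head_word, [dependents])."""
    dependents = []
    for role, item in level:
        if role == "head":
            continue
        if isinstance(item, str):
            dependents.append((item, []))            # a plain word has no dependents
        else:
            dependents.append(to_dependency(item))   # an embedded level collapses recursively
    return (head_word(level), dependents)

subject = [("mod", "the"), ("mod", "small"), ("head", "boy")]
obj = [("mod", "an"), ("head", "apple")]
clause = [("mod", subject), ("head", "ate"), ("mod", obj)]

tree = to_dependency(clause)
print(tree)
```

In this sketch every level contributes exactly one node, labelled by its head, which is the sense in which the levels of the binary branching correspond to the nodes of a dependency tree.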
The most striking observation 
concerning the syntactic structures of 
the kind exemplified in (2), however, 
is the nature of their data type: 
showing a strict binary right-directed 
branching and having a distinguished 
(by its phonological emptiness) node 
as their final element, they are in 
fact nothing but lists 5 . 
Adopting this view brings along 
several advantages: 
- first, the syntactic structure 
is strictly uniform - and also simpler 
than the general tree structure, with 
all (mainly practical) consequences 
following from this 
- second, the overall usage of
lists (whose members may be lists 
again) brings back the notion of lo- 
cality of syntactic description - each 
list used in the structure (i.e., each 
"level" of the structure as discussed 
above) constitutes a local domain, 
creating thus also a natural area of 
application of local constraints (such 
as subcategorization, linear prece- 
dence etc.) 
- third, such an approach allows 
for merging both the syntactic and the 
topological approaches in a single 
formal description, keeping, however, 
the two components clearly separated - 
the categorial information being ex- 
pressed by means of (syntactic and 
other) features and their bundles 
(attribute-value matrices), the topo- 
logical information being expressed by 
means of refinement of kinds of lists 
and their elements and sublists. 
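The view of a binary right-branching structure as a list can be made concrete with a minimal Python sketch. The class and function names (Nil, Cons, from_items, local_domain) are assumptions introduced here for illustration; Nil stands for the phonologically empty final element, and a word is simply represented as a string.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Nil:
    """Phonologically empty final element of a level."""

@dataclass(frozen=True)
class Cons:
    first: object                 # a word or an embedded syntactic list
    rest: Union["Cons", Nil]      # the remainder of the same level

def from_items(items):
    """Build the right-branching structure for one level."""
    node = Nil()
    for item in reversed(items):
        node = Cons(item, node)
    return node

def local_domain(node):
    """Return the members of one level only; embedded lists stay unopened."""
    members = []
    while isinstance(node, Cons):
        members.append(node.first)
        node = node.rest
    return members

np_subj = from_items(["the", "small", "boy"])
np_obj = from_items(["an", "apple"])
clause = from_items([np_subj, "ate", np_obj])

print(len(local_domain(clause)))   # members of the VP-level only
print(local_domain(np_obj))
```

The function local_domain never descends into an embedded list, which mirrors the claim that each level constitutes its own local domain for constraints such as subcategorization or linear precedence.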
Thus, given fairly usual as- 
sumptions about the nature and func- 
tion of constituents in a phrase, the 
general type 6 for nonempty lists from 
(5) is to be split into subtypes shown 
in (6) (where minor covers constituents
made of complementizers,
particles etc.).
(5)      list
        /    \
     top      list

(6)      phrase                 phrase
        /      \               /      \
     head     phrase     complement  phrase

         phrase                 phrase
        /      \               /      \
    adjunct   phrase        minor    phrase
In practice, even more delicate divi- 
sion is needed according to kinds of 
phrases used and according to the na- 
ture of modifiers these phrases allow. 
Introduction of more fine-grained subtypes
may be needed also for the final
element of a list (usually nil); the
respective subtypes should mirror the
kinds of phrases used as functions of
the "levels" of the syntactic structure,
thus giving rise, e.g., to types
end_of_np, end_of_vp etc.
Using a different form of structural
representation also enforces using
a different form (but not different
background intuitions) of rules and
principles of the grammar, all of them
corresponding to the types of lists as
introduced in the immediately preceding
text.
Thus, the Head Feature Principle 
(HFP) is to be expressed as a conjunc- 
tion of two implications 7 (rather than 
a single implication, combining con- 
junctively with other principles of 
the grammar), one describing the case 
where the first element of the respec- 
tive nonempty 8 list is the head of the 
respective phrase ("level"), the other 
one describing the rest of the cases. 
(7) Head Feature Principle
[first: [head]]
=> [synsem:cat:head: [1]
    first:synsem:cat:head: [1]
    rest:synsem:cat:head: [1]]
&
[first: [not(head)]]
=> [synsem:cat:head: [1]
    rest:synsem:cat:head: [1]]
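The two implications of the HFP can be read as a check over a syntactic list, sketched below in Python. The representation is an assumption made for this illustration: the class Node, the atomic string standing in for the value of synsem:cat:head, and the helper level are all hypothetical and not part of the grammar formalism.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    sort: str                        # e.g. "phrase", "nil", "head", "complement", "adjunct"
    head: str                        # atomic stand-in for the value of synsem:cat:head
    first: Optional["Node"] = None   # set only on nonempty list nodes
    rest: Optional["Node"] = None

def satisfies_hfp(node):
    """Check both implications of the HFP on every nonempty list from `node` on."""
    if node.sort == "nil" or node.first is None:
        return True                  # no form of the HFP applies to the empty list
    if node.rest is None or node.head != node.rest.head:
        return False                 # both implications share head with the rest
    if node.first.sort == "head" and node.head != node.first.head:
        return False                 # first implication: the head daughter shares too
    return satisfies_hfp(node.rest)

def level(elements, head_value):
    """Build one level whose list nodes all carry `head_value`."""
    node = Node("nil", head_value)
    for element in reversed(elements):
        node = Node("phrase", head_value, first=element, rest=node)
    return node

vp = level([Node("head", "verb"), Node("complement", "noun"), Node("adjunct", "adv")], "verb")
bad = level([Node("head", "noun"), Node("adjunct", "adv")], "verb")

print(satisfies_hfp(vp), satisfies_hfp(bad))
```

The second structure fails because its head daughter carries a head value different from that of the list it heads, which is exactly the configuration the first implication excludes.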
Assuming the version of HPSG using 
sets (rather than lists) as values of 
the feature subcat, the Subcategori- 
zation Principle has to consist of 
four implications, each for a parti- 
cular configuration in the syntactic 
list 9 . The first part describes the 
impact of an expansion of a comple- 
ment, the second the impact of an ex- 
pansion of a head (consisting just in 
copying the subcategorization of the 
head into a special head feature 
head_subcat, with the aim of inheriting
the information about the
subcategorization of the head consti- 
tuent into the final element of the 
respective list via the HFP), the 
third the impact of an expansion of an 
adjunct or of a minor category, and 
the last one expresses the requirement 
that the subcategorization of the final
element of the phrase, covering no
syntactic material, be equal to the 
subcategorization of the head of the 
phrase. (The effect of the second and 
the fourth implication taken together 
is worth comparing with the above- 
mentioned condition from works by 
Uszkoreit and Gunji, namely that the 
verb - the source of the subcategorization -
stand at the end of the
clause.)
(8) Subcategorization Principle
[first: [complement]]
=> [synsem:subcat: [1]
    first:synsem: [2]
    rest:synsem:subcat: [1] ∪ {[2]}]
&
[first: [head]]
=> [synsem:cat:head:head_subcat: [1]
    first:synsem:subcat: [1]]
&
[first: [not(head) & not(complement)]]
=> [synsem:subcat: [1]
    rest:synsem:subcat: [1]]
&
[nil]
=> [synsem: [subcat: [1]
             cat:head:head_subcat: [1]]]
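The four implications can be sketched as a checker over a fully specified syntactic list. This is a minimal illustration under stated assumptions: synsem values are replaced by atomic strings, the first implication is read as requiring that the subcat of the rest equal the subcat of the whole list plus the synsem of the complement, and the sharing of head_subcat along the list (the work of the HFP) is taken as given in the data rather than checked. The names Item, ListNode and check are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

@dataclass
class Item:
    sort: str                             # "head", "complement", "adjunct" or "minor"
    synsem: str                           # atomic stand-in for the synsem value
    subcat: FrozenSet[str] = frozenset()  # relevant for heads only

@dataclass
class ListNode:
    subcat: FrozenSet[str]
    head_subcat: FrozenSet[str]           # stand-in for synsem:cat:head:head_subcat
    first: Optional[Item] = None          # None marks the nil element
    rest: Optional["ListNode"] = None

def check(node):
    """Check the four implications on `node` and on everything to its right."""
    if node.first is None:                # fourth implication: nil
        return node.subcat == node.head_subcat
    item = node.first
    if item.sort == "complement":         # first implication
        ok = node.rest.subcat == node.subcat | {item.synsem}
    elif item.sort == "head":             # second implication
        ok = node.head_subcat == item.subcat
    else:                                 # third implication: adjunct or minor
        ok = node.subcat == node.rest.subcat
    return ok and check(node.rest)

frame = frozenset({"np_subj", "np_obj"})  # what the verb subcategorizes for
end = ListNode(frame, frame)
n3 = ListNode(frozenset({"np_subj"}), frame, Item("complement", "np_obj"), end)
n2 = ListNode(frozenset({"np_subj"}), frame, Item("head", "v", frame), n3)
n1 = ListNode(frozenset(), frame, Item("complement", "np_subj"), n2)

print(check(n1))
```

In the example the head stands between its two complements, yet the full subcategorization frame is available at the final element and is discharged complement by complement from right to left, so the topmost list node ends up saturated.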
Assuming further a phonological prin- 
ciple stating that the phonology of 
constituents of the type nil (and of 
all its subtypes) is empty while 
phonology of all other constituents is 
the combination 10 of phonologies of
their first and rest subconstituents, 
this approach allows for reduction of 
the number of grammar schemata
("rules") describing the categorial
structure to one (similarly as in 
Gunji, 1987) having the gross shape 
shown in (9). 
(9) [first: [ ]
     rest:  [ ]]
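The phonological principle assumed above can be sketched as a short recursive function. The encoding is an assumption made for illustration: a nonempty list is a (first, rest) pair, a word is a string, None stands for any constituent of sort nil, and concatenation is chosen as the combining operation although the principle does not require it.

```python
NIL = None   # stand-in for any constituent of sort nil (or one of its subsorts)

def phonology(constituent):
    """Return the phonology of a constituent as a list of word forms."""
    if constituent is NIL:
        return []                        # the phonology of nil is empty
    if isinstance(constituent, str):
        return [constituent]             # a word contributes its own form
    first, rest = constituent
    # combination of the phonologies of first and rest; here: concatenation
    return phonology(first) + phonology(rest)

np_subj = ("the", ("small", ("boy", NIL)))
np_obj = ("an", ("apple", NIL))
clause = (np_subj, ("ate", (np_obj, NIL)))

print(" ".join(phonology(clause)))
```

Since every nonempty constituent has the same first/rest shape, this single recursive case suffices, which parallels the reduction of the categorial schemata to the one shown in (9).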
The word order constraints, on the 
other hand, can be expressed within 
the hierarchy of sorts of lists used 
in the system, by means of which the 
ordering information is not only kept 
separated from the categorial one, but
is also formulated in local domains
(each constituted by a list) only.
The practical usage of the idea
of using the sort hierarchy of lists
for the purpose of expressing word
order constraints will now be illustrated
on an example. In this example,
the symbol "==" will be used for
defining the type hierarchy. The type
standing on the left-hand side of the
"==" will be a supertype of the type
standing on its right-hand side. As
the example proper, let us take a
slightly simplified system of German
word order as used in the "field"-
based approach, and let us assume for
the moment that the sorts finite_verb,
nonfinite_verb, complement and adjunct
are primitives (though, obviously, in
reality they are not). The description
of the word order of the clause then
may look as shown in (10).

(10) [clause] == [verb_first_clause]
                 v [verb_second_clause]
                 v [verb_last_clause]

     [verb_first_clause] == [first: [finite_verb]
                             rest:  [middle_field_and_rest_fields]]

     [verb_second_clause] == [first: [forefield]
                              rest:  [verb_first_clause]]

     [verb_last_clause] == [first: [verbal_modifier]
                            rest:  [verb_last_clause]]
                           v
                           [first: [finite_verb]
                            rest:  [nil]]

     [forefield] == [verbal_modifier]

     [middle_field_and_rest_fields] ==
                           [first: [verbal_modifier]
                            rest:  [middle_field_and_rest_fields]]
                           v
                           [nil]
                           v
                           [first: [nonfinite_verb]
                            rest:  [after_field]]

     [after_field] == [nil]
                      v
                      [first: [verbal_modifier]
                       rest:  [after_field]]

     [verbal_modifier] == [complement]
                          v [adjunct]
The first definition in (10) describes
the fact that a clause is either a 
verb-first clause, a verb-second clau- 
se or a verb-last clause. The next 
three definitions describe the word 
order within these kinds of clauses.
The definitions of verb-first and 
verb-second clauses are quite simple, 
specifying only the types of the first 
and rest features of the respective 
syntactic lists. The definition of 
verb-last clauses expresses the fact 
that they can consist either of a ver- 
bal modifier followed by (the rest of 
the) verb-last clause, or of a finite 
verb, which cannot be followed by any 
syntactic material 11 . The last four 
definitions express actually the 
"field" approach to the German sent- 
ence. The first of them states that 
the forefield consists of a verbal 
modifier, which, in turn, is defined 
as being either a complement or an ad- 
junct (in the last definition). The 
specification of middle field (and
possible following parts of the
clause) says that it can contain
first of all any verbal modifier
followed by (the rest of the) middle
field or that it can be empty or
that it can contain a nonfinite verb 
followed by an afterfield. Finally, 
the afterfield is defined either as 
empty or as containing a verbal modifier
followed by (the rest of the) afterfield.
Some clarification of the
general idea should be brought in by
the structure (12) for the sentence
(11). Here, as well as in the structures
that follow, only the most specific
deducible sorts are given for a
type (e.g., with the constituent Hans
the sort complement is used rather
than the sort forefield, because
complement is more specific than forefield).
(11) Hans hat gestern Maria ein Buch gegeben.
(12) [verb_second_clause]
      |- [complement]
      `- [verb_first_clause]
          |- [finite_verb]
          `- [middle_field_and_rest_fields]
              |- [adjunct]
              `- [middle_field_and_rest_fields]
                  |- [complement]
                  `- [middle_field_and_rest_fields]
                      |- [complement]
                      `- [middle_field_and_rest_fields]
                          |- [nonfinite_verb]
                          `- [nil]
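The definitions in (10) can be read almost directly as a recognizer over the sequence of most specific sorts of the members of a clause. The following Python sketch does this under the same simplification as the example itself, i.e. with finite_verb, nonfinite_verb, complement and adjunct treated as primitives; the function names and the encoding of a clause as a Python list of sort names are assumptions made only for this illustration.

```python
VERBAL_MODIFIER = {"complement", "adjunct"}

def after_field(sorts):
    # nil, or a verbal modifier followed by (the rest of the) afterfield
    return all(s in VERBAL_MODIFIER for s in sorts)

def middle_field_and_rest_fields(sorts):
    if not sorts:
        return True                                   # nil
    if sorts[0] in VERBAL_MODIFIER:
        return middle_field_and_rest_fields(sorts[1:])
    if sorts[0] == "nonfinite_verb":
        return after_field(sorts[1:])
    return False

def verb_first_clause(sorts):
    return bool(sorts) and sorts[0] == "finite_verb" \
        and middle_field_and_rest_fields(sorts[1:])

def verb_second_clause(sorts):
    # the forefield is a single verbal modifier
    return bool(sorts) and sorts[0] in VERBAL_MODIFIER \
        and verb_first_clause(sorts[1:])

def verb_last_clause(sorts):
    if not sorts:
        return False
    if sorts[0] in VERBAL_MODIFIER:
        return verb_last_clause(sorts[1:])
    return len(sorts) == 1 and sorts[0] == "finite_verb"

def clause_sorts(sorts):
    """Return the names of all clause subsorts the sequence belongs to."""
    tests = [("verb_first_clause", verb_first_clause),
             ("verb_second_clause", verb_second_clause),
             ("verb_last_clause", verb_last_clause)]
    return [name for name, test in tests if test(sorts)]

# Hans hat gestern Maria ein Buch gegeben.
sentence_11 = ["complement", "finite_verb", "adjunct",
               "complement", "complement", "nonfinite_verb"]
print(clause_sorts(sentence_11))
```

Each function inspects only the first member of the list it is given and hands the rest on, so every ordering statement remains local to one list, as in the structure (12).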
The previous example showed a 
relatively simple case where the num- 
ber of elements ("fields" of the 
clause) to be ordered was low and more 
or less given in advance, and their 
ordering absolute (e.g., forefield 
first, finite verb second, middlefield 
third etc.). However, the descriptive 
power of the approach is not limited 
to this: cases where the number of 
elements is not given beforehand and 
their ordering is not absolute can be 
coped with, too, as well as cases of 
word order combining the two kinds of 
requirements. For more details see 
(Oliva, 1992). 
The principal achievement of the 
approach presented is the (re)intro- 
duction of locality into binary 
branching structures allowing for re- 
placing the more standard but in this 
case unsuitable concept of ordering 
constraints holding for sister 
constituents by a very similar concept 
of ordering constraints holding for a 
list of constituents. However, the introduction
of lists also makes it easy
to express various other
techniques of describing word order
and its variations, such as the "topological"
approach discussed above 12 or
the "systemic ordering" as worked out
by Prague linguists (e.g., Sgall et
al., 1986) etc. Worth consideration is
also the relation of this approach 
based on typing of lists of surface 
constituents (expressing thus i.a. 
also their obliqueness hierarchy) to 
the "<<"-type of LP-rules of the 
standard HPSG, which, unlike the 
approach discussed, force obliqueness 
of complements to be expressed re- 
peatedly within the subcategorization 
list of each head. 
The applicability of the method
of description of word order
discussed in this paper has been demonstrated
by using it successfully in an experimental
grammar of German developed in
the STUF '91 formalism within the LILOG
project.
Acknowledgement 
I am thankful to Tania Avgustinova,
Judith Engelkamp, Gregor Erbach, Eva
Hajičová and Petr Sgall for most valuable
comments on the first draft of
this paper.
Footnotes: 
* Needless to say, hopefully, that this 
title should not be taken all that 
seriously ... 
1 In fact, free constituent order - but I 
shall stick to the traditional terminology 
in this paper. 
2 Though even English word order is
not as rigid as is often assumed -
especially as to the position of different 
adjuncts. 
3 This is not to say that this problem is 
not latent also in the approaches pre- 
sented by Uszkoreit and Gunji - they just 
use the clause-final position of the verb 
in German and Japanese (i.e., a phenomenon 
from the fixed-word-order sphere) which 
helps to cover it. 
4 Though in HPSG the formal difference 
between the two is removed due to the 
existence of the feature d trs 
(daughters), the intuitive difference of 
course remains. 
5 For this reason, the term syntactic list
will be used later in this paper as an
equivalent of the term binary branching
syntactic structure.
6 In the present paper, the term type is 
used for what is usually called also a 
feature structure, an attribute-value 
matrix etc. The term sort will be used for 
kinds of feature structures, i.e. for what 
is sometimes called the type (of a
feature structure). This convention will
be used consistently and should thus not
cause misunderstanding. Both types and
sorts create their respective lattices, in 
principle independent from each other. 
This allows for using operators &, v and 
not for creating unification, disjunction 
and complement, respectively, of types and 
sorts. In the following examples, types 
will be given as attribute-value matrices 
enclosed in square brackets, and sorts as 
subscripts in italics of these brackets. 
7 Remember that a principle in the form
of an implication applies only if the
left-hand side of the implication unifies
with the structure the principle should
apply to; note also that in the particular
case formulated here, the HFP could have
been simplified into a conjunction of a
non-implicational part and an implicational part.
8 No form of the HFP applies to the empty list -
no inheritance of features between
mother and head daughter can occur there, 
obviously. 
9 Notwithstanding the particularization 
(four implications instead of one in 
standard HPSG), this still should be 
treated as a principle - it is a genera- 
lization holding across kinds of phrases 
(NP's, VP's etc.).
10 Typically, but not necessarily, concatenation.
11 Two short remarks seem to be needed at
this point. First, the fact that in
German a clause cannot be constituted by
a finite verb alone is to be coped with in
other parts of the grammar (e.g., by the
subcategorization of the verb). Second,
the fact that no afterfield is allowed in 
verb-last clauses in this example is an 
arbitrary decision, having little to do 
with the general expressive power of the 
presented approach, and even less with 
grammar of "real" German. 
12 Of particular importance at this point
seem to be the facts that on the approach
sketched it is naturally possible to speak
about, e.g., the middlefield, giving
this term also a clear-cut formalized
treatment but without forcing it to occur
as a true constituent in the description,
as well as the possibility to specify the
position of the verb in verb-first and
verb-second German clauses without resorting
to any kind of "movement"
mechanism (e.g., to SLASH).

Bibliography

Avgustinova T. and K. Oliva: Syntactic 
Description of Free Word Order Lan- 
guages, in: Proceedings of Coling '90, 
Helsinki 1990 

Gazdar G., E. Klein, G. Pullum and I. Sag:
Generalized Phrase Structure Grammar, 
Basil Blackwell, Oxford 1985 

Gunji T.: Japanese Phrase Structure 
Grammar, Reidel, Dordrecht 1987 

Oliva K.: Word Order Constraints in 
Binary Branching Syntactic Structures, 
to appear in the report series of 
CLAUS (Computational Linguistics at 
University Saarbrücken) 1992

Pollard C. and I. Sag: Information-
Based Syntax and Semantics vol. 1:
Fundamentals, CSLI Lecture Notes No. 
13, CSLI, Stanford, California 1987 

Sgall P., E. Hajičová and J. Panevová:
The Meaning of the Sentence in Its Se- 
mantic and Pragmatic Aspects, 
Academia, Prague 1986 

Uszkoreit H.: Linear Precedence in
Discontinuous Constituents: Complex
Fronting in German, CSLI Report No. 47,
CSLI, Stanford, California 1986 
