SYNTACTIC DESCRIPTION OF FREE WORD ORDER LANGUAGES 
Tanle Avguutlnova & l%mrel Olive* 
Lingustic Modelling Laboratory, 
Coordination Centre 
for Computer Science and Computer Technology, 
Bulgarian Academy of Sciences, 
seed. G. Bencher st. bl. 25A, 
BG - 1113 Sofia, 
Bulgaria 
Abstrae%: 
A framework for the description of syntactic 
structures of free word order languages is proposed, 
based on combination of intuitions underlying imme- 
diate constituent description, dependency description 
and communicative dynamism. The combined approach is 
compared to its sources and shown superior in descrip- 
tive power, esp. in the area of free intermixing of 
(any number of) adjuncts with complements and in coor- 
dination. Close resemblance to two other recent ap- 
proaches is pointed out. 
I. Syz~tactic Structures for Free Word Order 
The absolute majority of current linguistic 
frameworks characterize syntactic structures of natu- 
ral languages in predominantly static terms, paying 
only minimal or no attel~tion to the communicative 
function ef language and its reflection in the process 
of utterlng/understandlng (generation/parsing) senten- 
ces. 
In the case of generative frameworks based on 
i~nedlate constituent approaches to .language de- 
scr~ptlon, this can be ascribed (at least partly) to 
the fact that many of them (GB, LFG, GPSG, TAGS, to 
mention the most widespread ones) were created prima- 
rlly for the sake of description of English, a highly 
configuratlonal language in which the impact of 
communicative functions on syntactic structure is 
quite limited (at least in comparison with the so-ca!- 
led free-word-order languages - henceforth FWOLs). 
The frameworks based on the dependency syntax 
(e.g., the "Meaning-Text" model of Mel'chuk and Apre- 
syan, the "Functional Generative Description" of Sgall 
et al., the "Word Grammar" of Hudson), on the other 
hand, by the very principle separate linguistic 
structures from the process of generation/parsing so 
sharply that even if any reflection of the communica- 
tive process is present in the generation/parsing pro- 
cedures, it gets lost in the resulting structures and 
has to be added there (if needed) more or \]ess artifi- 
cially, e.g., in the form of different indices (cf. 
the structures in Sgall et el, 1986). 
This lack of reflection of communicative aspects 
of language in the (syntactic) structures, together 
with still other features of the abovementioned frame- 
works (such as, for the immediate constituent based 
approaches, the incapability of the standard "S --> NP 
VP" approach to describe, e.g., the +'Object-Subject- 
Verb" constltuent order, or this obstacle overcome in 
some way, the problems connected with free intermixing 
of any number of free adjuncts among the complements, 
and, for the dependency based approaches, the problems 
involved in capturing even quite simple instances of 
coordination), make relatively profound adjustments in 
the existing frameworks or development of a new one a 
necessary and highly important task if a language in- 
volving broad impact of the communicative aspects on 
its syntactic structures (such as the FWOLs) has to be 
described formally in such a way that the description 
can be directly implemented on a computer and function 
as a generator or a parser. 
The easiest way how to overcome the difficultles 
connected with the current frameworks and to achieve 
the abovementioned goal of creating a framework suit- 
able for a reasonable description of FWOLs as well as 
for an easy and efficient computer implementation 
seemed to be to augment the immediate eonstltuent ba- 
sed nontransformatlonal approaches (which are easier 
to implement due to the clearcut correspondence be- 
tween the rules of the grammar and the structures they 
generate) with tile intultions contained in more tradi- 
tional descriptions of the FWOLs as well as in the de- 
scriptions of functional sentence perspective and com~ 
municatlve dynamism (Firhas,1971,1975; Sga!l et 
ai,1973). 
In the unmarked case, the scale of co~nunicative 
dynamism allows for splitting the sentence or arty of 
its parts on the level of the "main" constituel\]ts 
(such as Subject, Object, different verbal Adjuncts 
etc.) at any moment into two parts, the first con+ 
sisting of the constituent being processes (uttered, 
expanded) at the very moment, i+e. the currently least 
dynamic constituent, and the second one consisting of 
the "rest" of the sentence, i.e. of all the consti~ 
tuents more dynamic than the currently processed one. 
This results in a non-transformational accounc of syn- 
tactic structures, in the form of binary right-branch- 
ing trees (if the division sketched above is broadened 
to all constituent types used in the description). Art 
example of the structure for the notorious sentence 
"John loves Mary" is given in (I). (Mind the rightmost 
"Rest S" nonterminal dominating an empty string: 
"nothing more is to" be uttered" in the sentence, 
"nothing is more dynamic" than "Mary".) 
(I) 
John 
0 ~ ~.~ t S 
loves 
/~ Rest S 
Mary 
On such an approach, both the generation of 
all possible constituent orders and free intermixing 
of any number of adjuncts between any two complements 
is guaranteed for FWOLs, and this without ~slng the 
Kleene star in the rules, metaru\]es generating an in- 
finite number of rules or any other way o~ using 
(explicitly or implicitly) an infinite rule set. Just 
on the contrary, the approach results in a drastic 
simplification of the number and shape of rules 
needed: one gross rule scheme (2) is sufficient for 
the whole grammar: 
(2) ~~at is to be 1 
\[be expandedJ lexpanded NO~A 
In this scheme, the second constituent on the right 
hand side is always the phrasal head; accozding to the 
nature of the left daughter, the rule set can be furt- 
her factorized into the following subsets reflecting 
the classical linguistic wisdom: ru\]es expanding the 
lexical head, rules expanding a complement, rules ex- 
panding a free adjunct, rules expandi~g an extraposed 
constituent, rules expanding a member of a coordinated 
structure, rules expanding minor categories 
(conjunctions, particles etc.}. Such a division is im- 
1 311 
portan5 not only because it brings along some more pu- 
rity and perspicuity, but also because it allows for a 
straightforward implementation of different feature 
inheritance principles of the framework (such as the 
Head Feature Principle, Subcategorization Principle 
etc.) in the computer variant of the grammar; on a 
reasonable formal notation of the grammar rules al- 
lowing marking off the type of the rule as the pro- 
perty of the rule itself, it is possible to bound the 
application of the principles to the whole rule types 
rather than to each rule separately, as the case often 
is in many current parsers (e.g., for a head daughter 
in a rule, it is not necessary to stipulate explicitly 
the sharing of its head features with the mother, 
since this is provided for by listing the rule in the 
class of headexpanding rules). 
2. Relation to other Syntactic Frameworks 
The proposed structures might seem rather 
unconventional at first glance; however, their re- 
lation to structures used in more usual syntactic 
frameworks can be shown to be quite straightforward in 
simple cases. All what is needed to obtain dependency 
trees is to factorize the set of nodes of the describ- 
ed structures by all bar projections of a single ter- 
minal node. An X-structure can be obtained by facto- 
rizing the set of nodes by projections of the same 
bar~level of a single terminal node. 
In more complicated cases, however, the 
factorizations sketched above cannot be performed. 
Exactly in these cases, the structures proposed rank 
better in describing at least the following phenomena 
of FWOLs: 
in relation to dependency syntax (tradi- 
tionally used for description of FWOLs), first of all 
in describing coordination, but also the so-called 
non-proJectlve constructions (e.g., unbounded depen- 
dencies) as well as cases where contact position of 
certain words or constituents is required or positions 
are to be strictly fixed even in FWOLs (e.g., the Wac- 
kernaGel's position of clitlcs), which both is diffi- 
cult to achieve in dependency descriptions if non-pro- 
jective constructions are allowed to occur since these 
~nterfere with the "basic" projective ordering gene- 
rated 
- in relation to standard variants of X-syn- 
tax, the approach adopted solves the problems with the 
position of subject, with free intermixing of comple- 
ments and adjuncts and, in addition, it is able to 
cope with certain cases of "heavy" coordination (see 
below) on a context-free basis. 
3. suboategorization and Coomdinatlon 
Generally speaking, the intuitions (as opposed to 
the formalism) standing behind the framework are very 
close to (if not the same as) those supporting depen- 
dency approaches (certainly more so than to the 
intultions of the majority of current immediate con- 
stituent approaches, cf., e.g., the nonexistence of 
the "NP/VP" division of a sentence), but the structu- 
res developed for the formal incarnation of these in- 
tuitions have by far more descriptive power than the 
standard formalizations proposed for the dependency 
approaches. This extra power (even in comparison with 
the standard X-approaches) stems mainly from the in- 
creased number of nonterminal symbols: the greater 
number of nonterminals allows for a more subtle struc- 
tullng of the terminal string. 
The crucial point of this refinement of structu- 
ral information is the one concerning sub- 
categorization of phrases. In accordance with the 
treatment of subcategorization in HPSG and other fra- 
meworksr subcategorization can be informally viewed as 
the number and shape of constituents to be added to a 
particular phrase for it to become a saturated projec- 
tion of its lexical head (e.g., for a VP, this subca- 
tegorizatlon is the number and shape of constituents 
to be added for the VP to become a sentence; thus, a 
sentence is Just an alias for a VP with empty subcate- 
gorization). In the example (3), it is important to 
notice the "sharing" of the subcategorizatlon re- 
quirements (depicted schematically as sets of subcate- 
gorlzed-for elements associated with the nonterminal 
nodes of the structure) between the lexlcal head of 
the sentence (the verb) and its rlghtmost phrasal 
projection, as well as the stepwlse right-to-left re- 
duction of the subcategorization requirements of the 
VP's, and also the fact that the expansion of a lexi- 
col head or a free adjunct does not affect the subca- 
tegorization. 
(3) //~( } 
o" /b~kVP { SUBJ } 
John/ ~k 
~" ~hkVP ( SUBJ } 
kissed / 
o- /~\vP {su~J, OBJI Mary/ ~. 
O" b VP(SUBJ, OBJ} 
yesterday 
As mentioned above, this "extra descriptive po- 
wer" can be made use of for descriptlon of (among 
other) certain "heavy" coordinations. The instances we 
have in mind are "Right Node Raising" and "Across the 
Board" coordinations exemplified in (4) and (5), 
respectively. 
(4) Mary baked and John ate an apple pie. 
(5) the pie Mary baked and John ate 
Before presenting the treatment proper, two matters 
have to be pointed out: 
- first, in FWOLs "Right Node Raising" and 
"Across the Board" are exactly the same cases of coor- 
dinative constructions (due to the free-word-order, 
the position of the "extracted" constituent plays no 
syntactic role) 
- second, the grammaticality of other cases of 
coordination can be order dependent, even in FWOLs: 
typical case is "Gapping" (of. the contrast shown for 
English in (6)a,b but holding also in (at least) Bul- 
garian, Czech, Polish, Russian and Slovak), somewhat 
unclear is the situation with "Non-Constituent Coordi- 
nation", where speakers of the sbovementioned !an- 
guages seem to have different opinions about the gram- 
maticality of the respective counterparts of (7)b. 
(6) a. John loves Mary and Jim Sue. 
b. * Jim Sue and John loves Mary. 
(7) a. John gave a book to Mary 
and a bunch of flowers to Sue. 
b. ?? A book to Mary and 
a bunch of flowers to Sue John gave. 
This corroborates the view that (6)a,(7)a are in- 
stances of some extragrammatlcal communicative pro- 
cesses (i.e. processes not reflected in the grammar of 
the language - such as the tendency to avoid uttering 
identical parts of coordinated structures etc.) rather 
than true cases of "coordinated predication" which 
seems to be the case with "Right Node Raising" and 
"Across the Board". 
The treatment of "Right Node Raising" and 
"Across the Board" relies fully on the refinement of 
subcategorization into the increased number of nodes 
of the structure, but on the other hand it does not 
require any augmentation of the coordination mecha- 
nisms of the framework, the only coordination rule 
being the "coordination of likes". The approach even 
allows for description of constructions where "Right 
Node Raising" and "Across the Board" cooccur. The 
structure assigned to such cases is given in (8) (the 
terminal string of which is quite probably no good 
English, but translations into the FWOLs tested are 
considered fully acceptable). 
312 2 
(8) .~ VP } 
0 "-'~ ~P ( SUBJ } 
John ~~ 
/~ IP { SU\[~J, (}BJ} Z )~VP { SUBJ, OBd } "~'~'~---~ /}/~3 VP {SUBJ} 
i ought and later {lave / ~ VP 
yesterday to Sue ~ ~ { SUBJ, OBJ} 
some app\] es 
Note that the term "\]exica\] head" should be taken with 
a grain of salt for the VP in (8) {as well as for all 
{)(her coordinative constructions) - this is, hog, ever, 
a purely termirlologlca\] matter which can be coped with 
easily in a ful\]f\]edged exposition of the theory and 
i~as no bearing on its validity. Similarly, for coordi- 
iIatJons eons\[stieg of more than two members, the e×em- 
plifJed construction would not conforw to the scheme 
Item (2); this Js again due. to simplifications adopt()(\] 
let the p~rpose of the e~rrent preseniatlon, and \]n a 
i~ore detai led uxposJ tion coordinative constructions 
would be al~.o expanded in th~. "one-member-¢~t-a-time" 
\[\[l,lnne r. 
it talght be also interesting to observe that 
"Gapping" and "Non-Constituent Coordination" cannot be 
treated in the framework, unless it is augmented with 
:ome "deletiun" processes operating on \[:he structures 
<)enerated by the context-free base. 
4. Conclusion~ 
The framewo£k presented in U\]is paper was created 
J n the course of preparatory work for an i;:;plementa- 
t\]on of a pa~se~ fo~ Bulgarian, a free-word-order fan -÷ 
qlage from the Slavonic g~oup. "he !,sin \](iea standing 
b~!hJnd the structures as presented was me.'glng the in- 
.<;\[ghts concernJrlg cummunlcative dynamism co,rained in 
the works of linguists o£ the Prague School with the 
Jrl'<uit',ons uP.deriyieg the dependency descrip~icP.s of 
\]~mguage, and imp!ement\[rEi the whole \[n a:~ i:maediate 
eenst 1 tuent based formalism. The result mig~:t seem 
rather unorthodox in msny respects, bl\]t the eevlat~.ons 
contained ca\[\] be sanctioned by at least two re.T, arkabie 
advantages ef th.e framework proposed over the more 
s\[ andard approaches: 
-- fir:;t, on the theoretical side, the \[;se of 
(ine~eaaed n~moer of} ncntermlna! symbols makes the 
framework superlo: in descriptive {and, let us hope, 
a\]so expl an\[,d o~ y} adequacy concer~irlg suc~ \[ £.eT;o~;lena 
as coo(d\] nat fen and the so-called "eon-p~ oject ire" 
constructions in FWOLs, while sim~ItaP.eously keeping 
the generative power on nhe context-free level \[for 
space \]imltations, no examples of the non~-projeetive 
coestructlorls were glven, but due to the presence of 
nonterminals in the structures, they can be treated in 
the way broadly used \]n other context-free based \]l~f.e- 
dlate constituent approaches, e.g., by the "SLASH" me- 
ci~anisrli of GPSG or }{PSG) 
second, from a more practical viewpoint, while 
the overall approach a\]\].ows for keeping vlrtually all 
linguistic intuitions contained in the dependency ap- 
pl:oaches traditionally used for the description o~ 
FWOLs, the formalization adopted allows for using a 
generative grammar (i.e. a set of declar\]tlvely stated 
rewziting rules) for the description of the language, 
which in turn guarantees a clear correspondence be- 
tween the structures of the language and the mechanism 
that generates them. This correspondence is never so 
slraightfozward in "pure" dependency approaches which, 
a!: a rule, use some exclusively procedural laachinery 
r;)ther remote to the structures that are to be 9e- 
n{.rated o~ parsed by it (of. the framework described 
in Apresyan et ai,1989 for implementation of the 
M{aning-Text model or the "t£ansducing ~lutomata" of 
FtnlcEJonal Generat\]ve Desoripticn described in Sgal! 
et a\],\] 969) . Needless to add, the possibi\] \]ty of 
k(eping the linguistic information in a separated de 
c\]acat\[ve forlaat (such as a set of rewriting ~(\]r!:) 
makes the job !>\[ crt!atlng ~JIld.. u~.,i~ly, (iebug<in{~ th{ 
generator/pa~ser eas\[er (though, of c.')urs(>, not at al 
easy) . 
Partly also as an indireet support2 for th(~ forma- 
lism presented, it is further necessary to say that 
structures in many respects similar were proposed 
independently in (Uszkorelt,\]986} for the descrlptlen 
of "complex fronting" in verb-final C<.~inan c!atJsc~s and 
in (GunJi,i987) in the framework (if JPSC (which, how-- 
ever, advocates the existence of the "NP/VP" d~vls\[c: 
of the sentence and also, slmilarly to Uszke~e.it, hea- 
vily relies on the faet than Japanese is a verb-final 
language) . What Js Inttenesting and, hopefully, also 
important to observe is the fact that this i~apper, ed in 
spite of the fact that both Uszkoreit and Gunj\] start- 
ed flora intuitions different f~orn the ones !ace(Do-- 
rated in the prese,~ted framework (and also different 
from each other} and aimed at deserlption o! phe_somena 
often marginal, remote Or even n(\]nexisteI;t in the \[an- 
gtlages cn the base of whic'!\] the currently preae~teu 
framework was developed and for the descr\]pt\]en cf 
which it is intended to serve. 
~-sTi~g{st Apri~ 1990: 
Lehrstuhl ~r Computerl~ng~:istlk 
Unlversit~t des Saarlandes 
Im Stadtwald 
D-6600 Saarbrhcken 
(West) Germany 
3 31.3 

Bibliography

Apre~yan i.D. , I.M. Boguslav\[klj, L.L. Xemdin, ~.V. 
LazurukiJ, N.V. Perhsov, V.Z. Sannikov ~nd L.L. ~'sin- 
man: The Linguistic Support for the System ETAP-2 (in 
RussJ an) , Nauka, Moskva 1989 

Fi.rbas J.: on the Concept et ComJrlunlcative Dyn,unism in 
the Theory of Functional Sentence Pc(spent i ve, in!: 
Brno Studies \[n Englisi~ \], Oniver!~ity of Jan Eva~tgc- 
lists Purkyne (formez\]y an(1 liow again University o£ 
Tomes C, ar~.igue Masaryk), Brl;o 197\], pp.12-47 

Firba~ J. : Or. the Thereat ic arld the Non-Thematic Sec- 
tion of the SenteP.ce, in: II.Ringbom (ed.) : Style a~!a 
Text - Studies Presented to N.E. Enkvlst, c~ ~,. ~ . 
1975 

GunJl T. : Japanese Phrase S~ rue(urn Cralr.',~ar, 8ei(\]el, 
Dord~echt \] 987 

Pollard C. and X,Sag: Information Based Syi~ta× and 
Semantics. vo\].!: Fundamentals, CSLi Lecture Notes No. 
\]3, CSLi,Stanford, Californ\]a 1987 

S~g X.,G.Gazdar, T.Wasow and S.Welsl~r : Coo(dinar ion 
and How lc Distinguish. Categories, CSL! Report No.3, 
CSL\[, Stanford, Callfo~n\[a 1984 ; x 

Sgall P . , L. N~be~ky, A. Go(el \[£kova and E . lie ji~ova~ : A 
F~m,':t ions \] Approach to Syntax in CA, sere( i ve De- 
scrip(lea of Language, Illsevler, New York 1969 

Sgall P.,E.llajiVov£ and E.Beno~ova~: Topic, Focus and 
Generative Semantics, Kronberg, 'lab<nun 1973 

Sgal\] P. , E. Haji~ov~ and J.Pan~vov~: T::e Mean\] ng of the 
Sentence in its Semantic and Pragmatic Aspects, Acade- 
mia, Prague 1986 

Shieber S., S.Stucky, II.Uszkoreit and J.Robinson: For- 
mal Constiraints on Metarules, in: Proceedings o£ the 
2lst Annua \]. Meeting of the ACL, Cambridge, Maa- 
sach~isett s 1983 

U~zkoreit If. : Linear P~ eceden0e In Discontinuous 
Constituents: Complex Fronting in German, CSLI Report 
No.47, CSL!, Stanford, Cal!fornia !986 
