Literal Movement Grammars 
Annius V. Groenink* 
CWI 
Kruislaan 413 
1098 SJ Amsterdam 
The Netherlands 
avg@cwi,nl 
Abstract 
Literal movement grammars (LMGs) pro- 
vide a general account of extraposition phe- 
nomena through an attribute mechanism al- 
lowing top-down displacement of syntacti- 
cal information. LMGs provide a simple 
and efficient treatment of complex linguistic 
phenomena such as cross-serial dependen- 
cies in German and Dutch--separating the 
treatment of natural language into a parsing 
phase closely resembling traditional context- 
free treatment, and a disambiguation phase 
which can be carried out using matching, as 
opposed to full unification employed in most 
current grammar formalisms of linguistical 
relevance. 
1 Introduction 
The motivation for the introduction of the literal move- 
ment grammars presented in this paper is twofold. The 
first motivation is to examine whether, and in which 
ways, the use of unification is essential to automated 
treatment of natural language. Unification is an ex- 
pensive operation, and pinpointing its precise role in 
NLP may give access to more efficient treatment of 
language than in most (Prolog-based) scientific appli- 
cations known today. The second motivation is the 
desire to apply popular computer-science paradigms, 
such as the theory of attribute grammars and modu- 
lar equational specification, to problems in linguistics. 
These formal specification techniques, far exceeding 
the popular Prolog in declarativity, may give new in- 
sight into the formal properties of natural language, 
and facilitate prototyping for large language applica- 
tions in the same way as they are currently being used to 
facilitate prototyping of programming language tools. 
For an extensive illustration of how formal specifi- 
cation techniques can be made useful in the treatment 
of natural language, see (Newton, 1993) which de- 
scribes the abstract specification of several accounts of 
phrase structure, features, movement, modularity and 
*This work is supported by SION grant 612-317-420 
of the Netherlands Organization for Scientific Research ~wo). 
90 
parametrization so as to abstract away from the exact 
language being modelled. The specification language 
(ASL) used by Newton is a very powerful formalism. 
The class of specification formalisms we have in mind 
includes less complex, equational techniques such as 
ASF+SDF (Bergstra et al., 1989) (van Deursen, 1992) 
which can be applied in practice by very efficient exe- 
cution as a term rewriting system. 
Literal movement grammars are a straightforward 
extension of context-free grammars. The derivation 
trees of an LMG analysis can be easily transformed 
into trees belonging to a context-free backbone which 
gives way to treatment by formal specification systems. 
In order to obtain an efficient implementation, some 
restrictions on the general form of the formalism are 
necessary. 
1.1 Structural Context Sensitivity in Natural 
Language 
Equational specification systems such as the 
ASF+SDF system operate through sets of equations 
over signatures that correspond to arbitrary forms 
of context-free grammar. An attempt at an equa- 
tional specification of a grammar based on context- 
free phrase structure rules augmented with feature con- 
straints may be to use the context-free backbone as a 
signature, and then implement further analysis through 
equations over this signature. This seems entirely ana- 
loguous to the static semantics of a programming lan- 
guage: the language itself is context-free, and the static 
semantics are defined in terms of functions over the 
constructs of the language. 
In computer-science applications it is irrelevant 
whether the evaluation of these functions is carried 
out during the parsing phase (I-pass treatment), or 
afterwards (2-pass treatment). This is not a trivial 
property of computer languages: a computer language 
with static semantics restrictions is a context-sensitive 
sublanguage of a context-free language that is either 
unambiguous or has the finite ambiguity property: for 
any input sentence, there is only a finite number of 
possible context-free analyses. 
In section 1.3 we will show that due to phenom- 
ena of extraposition or discontinuous constituency ex- 
hibited by natural languages, a context-free backbone 
for a sufficiently rich fragment of natural language no 
longer has the property of finite ambiguity. Hence an 
initial stage of sentence processing cannot be based on 
a purely context-free analysis. 
The LMG formalism presented in this paper at- 
tempts to eliminate infinite ambiguity by providing 
an elementary, but adequate treatment of movement. 
Experience in practice suggests that after relocating 
displaced constituents, a further analysis based on fea- 
ture unification no longer exploits unbounded struc- 
tural embedding. Therefore it seems that after LMG- 
analysis, there is no need for unification, and further 
analysis can be carried out through functional match- 
ing techniques. 
1.2 Aims 
We aim to present a grammar formalism that 
t~ is sufficiently powerful to model relevant frag- 
ments of natural language, at least large enough 
for simple applications such as an interface to a 
database system over a limited domain. 
t, is sufficiently elementary to act as a front-end to 
computer-scientific tools that operate on context- 
free languages. 
t~ has a (sufficiently large) subclass that allows ef- 
ficient implementation through standard (Earley- 
based) left-to-right parsing techniques. 
1.3 Requirements 
Three forms of movement in Dutch will be a leading 
thread throughout this paper. We will measure the 
adequacy of a grammar formalism in terms of its ability 
to give a unified account of these three phenomena. 
Topicalization The (leftward) movement of the ob- 
jects of the verb phrase, as in 
(1) \[Which book\]/ did John forget to 
return el to the library? 
Dutch sentence structure The surface order of 
sentences in Dutch takes three different forms: the 
finite verb appears inside the verb phrase in relative 
clauses; before the verb phrase in declarative clauses, 
and before the subject in questions: 
(2) ... dat Jan \[vP Marie kuste \] 
(3) Jan kustei \[vP Marie el \] 
(4) kustei Jan \[,ca Marie ei \] ? 
We think of these three (surface) forms a s merely being 
different representations of the same (deep) structure, 
and will take this deep structure to be the form (2) that 
does not show movement. 
Cross-serial dependencies In Dutch and German, 
it is possible to construct sentences containing arbitrary 
numbers of crossed dependencies, such as in 
... dat Marie Jani Fredj Annek 
that 
(5) hoordei helpenj overtuigen k 
heard help convince 
(that Mary heard John help Fred convince Anne). Here 
the i, j, k denote which noun is the first object of which 
verb. The analysis we have in mind for this is 
dat Marie Jani Fredj Annek 
\[ve hoorde el helpen ej overtuigen e~.\] 
Note that this analysis (after relocation of the extra- 
posed objects) is structurally equal to the correspond- 
ing English VP. The accounts of Dutch in this paper 
will consistently assign "deep structures" to sentences 
of Dutch which correspond to the underlying structure 
as it appears in English. Similar accounts can be given 
for other languages--so as to get a uniform treatment 
of a group of similar (European) languages such as 
German, French and Italian. 
If we combine the above three analyses, the final anal- 
ysis of (3) will become 
Jan kustei Mariej \[w el ej \] 
Although this may look like an overcomplication, this 
abundant use of movement is essential in any uniform 
treatment of Dutch verb constructions. Hence it turns 
out to occur in practice that a verb phrase has no lexical 
expansion at all, when a sentence shows both object 
and verb extraposition. Therefore, as conjectured in 
the introduction, a 2-pass treatment of natural language 
based on a context-free backbone will in general fail-- 
as there are infinitely many ways of building an empty 
verb phrase from a number of empty constituents. 
2 Definition and Examples 
There is evidence that suggests that the typical human 
processing of movement is to first locate displaced in- 
formation (the filler), and then find the logical location 
(the trace), to substitute that information. It also seems 
that by and large, displaced information appears earlier 
than (or left of) its logical position, as in all examples 
given in the previous section. The typical unification- 
based approach to such movement is to structurally 
analyse the displaced constituent, and use this anal- 
ysed information in the treatment of the rest of the 
sentence. This method is called gap-threading; see 
(Alshawi, 1992). 
If we bear in mind that a filler is usually found to 
the left of the corresponding trace, it is worth taking 
into consideration to develop a way of deferring treat- 
ment of syntactical data. E.g. for example sentence 1 
this means that upon finding the displaced constituent 
which book, we will not evaluate that constituent, but 
rather remember during the treatment of the remaining 
part of the sentence, that this data is still to be fitted 
into a logical place. 
This is not a new idea. A number of non- 
concatenative grammar formalisms has been put for- 
ward, such as head-wrapping grammars (HG) (Pol- 
lard, 1984), extraposition grammars (XG) (Pereira, 
1981). and tree adjoining grammars (TAG) (Kroch 
and Joshi, 1986). A discussion of these formalisms 
as alternatives to the LMG formalism is given in sec- 
tion 4. 
91 
Lessons in parsing by hand in high school (e.g. in 
English or Latin classes) informally illustrate the pur- 
pose of literal movement grammars: as opposed to the 
traditional linguistic point of view that there is only 
one head which dominates a phrase, constituents of a 
sentence have several key components. A verb phrase 
for example not only has its finite verb, but also one or 
more objects. It is precisely these key components that 
can be subject to movement. Now when such a key 
component is found outside the consitituent it belongs 
to, the LMG formalism implements a simple mecha- 
nism to pass the component down the derivation tree, 
where it is picked up by the constituent that contains 
its trace. 
It is best to think of LMGs versus context-free 
grammars as a predicate version of the (propositional) 
paradigm of context-free grammars, in that nonter- 
minals can have arguments. If we call the general 
class of such grammars predicate grammars, the dis- 
tinguishing feature of LMG with respect to other pred- 
icate grammar formalisms such as indexed grammars I 
(Weir, 1988) (Aho, 1968) is the ability of binding or 
quantification in the right hand side of a phrase struc- 
ture rule. 
1 2.1 Definition We fix disjoint sets N, T, V of non- 
terminal symbols, terminal symbols and variables. We 
will write A, B, C... to denote nonterminal symbols, 
a, b, c... to denote terminal symbols, and x, y, z for 
variables. A sequence ala2. • • a,~ or a E T* is called 
a (terminal) word or string. We will use the symbols 
a, b, e for terminal words. (Note the use of bold face 
for sequences.) 
1 2.2 Definition (term) A sequence tlt2...t~ or 
t E (V U T)* is called a term. If a term consists of 
variables only, we call it a vector and usually write x. 
1 2.3 Definition (similarity type) A (partial) func- 
tion # mapping N to the natural numbers is called a 
similarity type. 
1 2.4 Definition (predicate) Let # be a similarity 
type, A E N and n = /~(A), and for 1 <_ i <_ n, 
let ti be a term. Then a predicate qa of type # is 
a terminal a (a terminal predicate) or a syntactical 
unit of the form A ( t l , t 2, • •., t,~ ), called a nonterminal 
predicate. If all t~ = xl are vectors, we say that 
= A(a~l, ~e2,... , a~n) is apattern. 
Informally, we think of the arguments of a nonterminal 
as terminal words. A predicate A(x) then stands for 
a constituent A where certain information with termi- 
nal yield x has been extraposed (i.e. found outside 
the constituent), and must hence be left out of the A 
constituent itself. 
1 2.5 Definition (item) Let/z be a similarity type, 
~p a predicate of type #, and t a term. Then an item 
of type # is a syntactical unit of one of the following 
forms: 
1 Indexed grammars are a weak form of monadic predicate 
grammar, as a nonterminal can have at most one argument. 
1. qo (a nonterminal or terminal predicate) 
2. x:~ (a quantifier item) 
3. ~/t (a slash item) 
We will use ¢, qJ to denote items, and a,/3, 3' to denote 
sequences of items. 
1 2.6 Definition Let /z be a similarity type. A 
rewrite rule R of type/2 is a syntactical unit qo ---, 
qbl (I)2 • ' • qb,~ where qo is a pattern of type #, and for 
I < i < n, ~i is an item of type #. 
A literal movement grammar is a triple (#, S, P) 
where # is a similarity type, S E N, #(S) = 0 and P 
is a set of rewrite rules of type #. 
Items on the right hand side of a rule can either refer 
to variables, as in the following rule: 
A(x, yz) -~ BO/x a/y C(z) 
or bind new variables, as the first two items in 
A 0 ---, x:B 0 y:C(x) D(y). 
A slash item such as B()/x means that x should be 
used instead of the actual "input" to recognize the non- 
terminal predicate B(). I.e. the terminal word x should 
be recognized as B0, and the item BO/x itself will 
recognize the empty string. A quantifier item x:B() 
means that a constituent B() is recognized from the 
input, and the variable x, when used elsewhere in the 
rule, will stand for the part of the input recognized. 
1 2.7 Definition (rewrite semantics) Let R = 
A(Xh..., x,~) ~ ~1(I)2 ,.. ~rn be a rewrite rule, then 
an instantiation of R is the syntactical entity obtained 
by substituting for each i and for each variable x E xl 
a terminal word a~. 
A grammar derives the string a iff S 0 =~ a where 
G ===~ is a relation between predicates and sequences of 
items defined inductively by the following axioms and 
inference rules: 2 
G a~a 
G qo ==* a when qo --* a is an instantiation 
of a rule in G 
qo ~ /3 A(tl,...,t,~) 7 A(tl,...,t,~) ~ a MP 
G ,-t3 a 7 
 -Lc,/3 ¢/a 7 /E 
:E 
(/3 a 3')\[a/x\] 
2Note that \[a/x\] in the :E rule is not an item, but stands 
for the substitution of a for z. 
92 
B(aa) ~ a/a b B(a) c a~ a 
B(aa) ~ b B(a) c 
/E 
B(a) =~a a/a b B(e) c a~a 
B(a)~a b B(¢) c 
/E 
B(¢) =~ 
B(a) =~G bc 
MP 
MP 
sO ~ x: A 0 B(x) A 0 ==~ aa 
S 0 =~ aa B(aa) 
B(aa) ~ bbcc 
:E 
B(aa) =~G bbcc 
MP 
S 0 ~ aabbcc 
Figure 1. Derivation of aabbcc. 
1 2.8 Example (a'~b'~c '*) The following, very ele- 
mentary LMG recognizes the trans-context free lan- 
guage anbnc n : 
s0 ~ ~:AO B(~) 
A 0 ---* a A 0 
A 0 --~ 
B(xy) ~ a/x b B(y) c 
8(6) ~ 
Figure 1 shows how aabbcc is derived according to 
the grammar. The informal tree analysis in figure 2 
s0 
A0 
a A 0 
6 
B(y) = B(aa) 
a/a b B(a) c 
a/a b B(e) c 
E 6 
Figure 2. Informal tree analysis. 
illustrates more intuitively how displaced information 
(the two a symbols in this case) is 'moved back down' 
into the tree, until it gets 'consumed' by a slash item. 
It also shows how we can extract a context-free 'deep 
structure' for further analysis by, for example, formal 
specification tools: if we transform the tree, as shown 
in figure 3, by removing quantified (extraposed) data, 
and abstracting away from the parameters, we see that 
the grammar, in a sense, works by transforming the lan- 
guage anbnc n to the context-free language (ab)ncn. 
Figure 4 shows how we can derive a context free 'back- 
bone grammar' from the original grammar. 
12.9 Example (cross-serial dependencies in 
Dutch) The following LMG captures precisely the 
three basic types of extraposition defined in section 
1.3: the three Dutch verb orders, topicalization and 
cross-serial verb-object dependencies. 
s ~ s'(~) 
S'(e) -~ dat NP VP(e,e) 
S'(e) * n:NP S'(n) 
S'(n) -~ v:V NP VP(v,n) 
S'(e) ~ NP v:V VP(v,e) 
re(v, n) -~ m:NP W(v,,~m) 
VP(v,,~) --, V'(v, n) 
V(c, ~) --, v7 
V(v, ~) --, Vt/v 
V(¢,n) -, VT NP/n 
V'(v, n) ~ VT/v NP/n 
12(¢,nm) ---, VR NP/n 12(e,m) 
V(v, nm) ---* VR/v gP/n V(e,m) 
V ~ VI 
V --+ VT 
V ~ VR 
A sentence S' has one argument which is used, if 
nonempty, to fill a noun phrase trace. A VP has two 
S 
XP B 
a b B c 
a b B c I 
Figure 3. Context free backbone. 
93 
S --, XPB 
B -~ abBc 
B -~ e 
XP -* e 
Figure 4. Backbone grammar. 
arguments: the first is used to fill verb traces, the sec- 
ond is treated as a list of noun phrases to which more 
noun phrases can be appended. A V' is similar to a VP 
except that it uses the list of noun phrases in its second 
argument to fill noun phrase traces rather than adding 
to it. 
Figure 5 shows how this grammar accepts the sen- 
tence 
Marie zag Fred Anne kussen. 
We see that it is analyzed as 
Marie zag i Fredj Annek 
IV' ei ej \[V, kussen e~ \]\] 
which as anticipated in section 1.3 has precisely the 
basic, context-free underlying structure of the corre- 
sponding English sentence Mary saw Fred kiss Anne 
indicated in figure 5 by terminal words in bold face. 
Note that arbitrary verbs are recognized by a quanti- 
s() 
s'(e) 
! \[ VP(v ~- zag, e') 
Marie z ag ~n~ ~....~ n 
NP , = Fred) 
n = Fred Anne) Fred p I 
Anne Vt(zag, Fred Anne) 
VR/~E, Anne) 
VT NP/Anne 
I : 
kussen e 
Figure 5. Derivation of a Dutch sentence 
fier item v:V, and only when, further down the tree, a 
trace is filled with such a verb in items such as VR/v, 
its subcategorization types VI, VT and VR start playing 
a role. 
3 Formal Properties 
The LMG formalism in its unrestricted form is shown 
to be Turing complete in (Groenink, 1995a). But the 
grammars presented in this paper satisfy a number of 
vital properties that allow for efficient parsing tech- 
niques. 
Before building up material for a complexity result, 
notice the following proposition, which shows, using 
only part of the strength of the formalism, that the 
literal movement grammars are closed under intersec- 
tion. 
1 3.1 Proposition (intersection) Given two lit- 
eral movement grammars G1 --- (#1,$1, P1) and 
Gz = (tzz, $2, Pz) such that dom(#l) n dom(#2) = O, 
we can construct the grammar GI = (#1 U #z U 
{(S, 0)}, S, P1 U P2 U {R}) where we add the rule 
R: 
so -~ =S,O Sz()/x 
Clearly, GI recognizes precisely those sentences which 
are recognized by both G1 and Gz. 
We can use this knowledge in example 2.9 to restrict 
movement of verbs to verbs of finite morphology, by 
adding a nonterminal VFIN, replacing the quantifier 
items v:V that locate verb fillers with v:VFIN, where 
VFIN generates all finite verbs. Any extraposed verb 
will then be required to be in the intersection of VFIN 
and one of the verb types VI, VT or VR, reducing 
possible ambiguity and improving the efficiency of 
left-to-right recognition. 
The following properties allow us to define restrictions 
of the LMG formalism whose recognition problem has 
a polynomial time complexity. 
1 3.2 Definition (non-combinatorial) An LMG is 
non-combinatorial if every argument of a nonterminal 
on the RHS of a rule is a single variable (i.e. we do 
not allow composite terms within predicates). If G 
is a non-combinatorial LMG, then any terminal string 
occurring (either as a sequence of items or inside a 
predicate) in a full G-derivation is a substring of the 
derived string. The grammar of example 2.8 is non- 
combinatorial; the grammar of example 2.9 is not (the 
offending rule is the first VP production). 
1 3.3 Definition (left-binding) An LMG G is left- 
binding when 
1. W.r.t. argument positions, an item in the RHS of 
a rule only depends on variables bound in items 
to its left. 
2. For any vector x ~ • • • x,~ of n > 1 variables on the 
LHS, each of xl upto xn-~ occurs in exactly one 
item, which is of the form qo/xl. Furthermore, 
for each 1 < I < k < n the item referring to xz 
appears left of any item referring to x~. 
For example, the following rule is left binding: 
A(xyz, v) ~ u:B(v) C(v)/x DO/y E(u,z) 
but these ones are not: 
(a) g(y) ---* C(x) x:D(y) 
(b) A(xy) ---* A(x) B(y) 
(c) A(xyz)~ A(z) BO/x CO/y 
94 
because in (a), x is bound right of its use; in (b), 
the item A(x) is not of the form qo/x and in (e), the 
variables in the vector zyz occur in the wrong order 
(zzy). 
Ifa grammar satisfies condition 1, then for any deriv- 
able string, there is a derivation such that the modus 
ponens and elimination rules are always applied to the 
leftmost item that is not a terminal. Furthermore, the 
:E rule can be simplified to 
:E G 
The proof tree in example 2.8 (figure 1) is an example 
of such a derivation. 
Condition 2 eliminates the nondeterminism in find- 
ing the right instantiation for rules with multiple vari- 
able patterns in their LHS. 
Both grammars from section 2 are left-binding. 
1 3.4 Definition (left-recursive) An LMG G is 
left-recursive if there exists an instantiated nonterminal 
G predicate qa such that there is a derivation of ~o ~ ~pc~ 
for any sequence of items c~. 
The following two rules show that left-recursion in 
LMG is not always immediately apparent: 
A(y) ~ BO/Y A(e) 
B 0 ~ 
for we have 
A(¢) ~ B()/¢ a(e) B 0 ~ 
A(~) =:~ A(e) 
/E 
We now show that the recognition problem for an arbi- 
trary left-binding, non-combinatorial LMG has a poly- 
nomial worst-case time complexity. 
1 3.5 Theorem (polynomial complexity) Let 
G be a LMG of similarity type # that is non- 
combinatorial, left binding and not left-recursive. Let 
m be the maximum number of items on the right hand 
side of rules in G, and let p be the greatest arity of 
predicates occurring in G. Then the worst case time 
complexity of the recognition problem for G does not 
exceed O(IGIm(1 + p)nl+'~+2P), where n is the size 
of the input string ala2" • .a,~. 
Proof (sketch) We adopt the memoizing recursive de- 
scent algorithm presented in (Leermakers, 1993). As 
G is not left-binding, the terminal words associated 
with variables occurring in the grammar rules can be 
fully determined while proceeding through sentence 
and rules from left to right. Because the grammar is 
non-combinatorial, the terminal words substituted in 
the argument positions of a nonterminal are always 
substrings of the input sentence, and can hence be rep- 
resented as a pair of integers. 
The recursive descent algorithm recursively com- 
putes set-valued recognition functions of the form: 
\[~o\](i) = {jl~o ~ ai+l"" .aj} 
where instead of a nonterminal as in the context- 
free case, qo is any instantiated nonterminal predicate 
A(bl,..., b,~). As bl,...,b,~ are continuous sub- 
strings of the input sentence ala2 • • • an, we can re- 
formulate this as 
\[A\](i, (tl, r,),..., r,,)) 
= {jlA(ah+ 1...a,,,,...,at.+l...ar~) 
ai+ 1 • •. aj } 
Where # = #(A) < p. The arguments i, ll,...,l~, 
and rl,. •., r t, are integer numbers ranging from 0 to 
n - 1 and 1 to n respectively. Once a result of such 
a recognition function has been computed, it is stored 
in a place where it can be retrieved in one atomic 
operation. The number of such results to be stored is 
O(n) for each possible nonterminal and each possible 
combination of, at most 1 + 2p, arguments; so the total 
space complexity is O(IGIn2+2p). 
Much of the extra complication w.r.t, the context- 
free case is coped with at compile time; for example, 
if there is one rule for nonterminal A: 
A(x,,x2) ~ x3:Ba(xj) B2() B3(x3)/x2 
then the code for \[g\](i, (ll, r,), (12, r2)) will be 
result := empty 
for kl E \[B1\](i, (/1, rl)) 
do 13 := i 
r 3 := k 1 
rot e 
for k 3 e \[B3\](/2, (/3, r3)) 
if (k 3 =: r2) 
add k2 to result 
return result 
The extra effort remaining at parse time is in copy- 
ing arguments and an occasional extra comparison 
(the if statement in the example), taking rn(1 + p) 
steps everytime the innermost for statement is reached, 
and the fact that not O(n), but O(n l+2p) argument- 
value pairs need to be memoized. Merging the re- 
sults in a RHS sequence of rn items can be done in 
O(m(1 + p)n ~-1) time. The result is a set of O(n) 
size. As there are at most O(IGln 1+2p) results to be 
computed, the overall time complexity of the algorithm 
is O(IGIm(1 + p)nl+m+2P). \[\] 
| 3.6 Remark If all nonterminals in the grammar are 
nullary (p = 0), then the complexity result coincides 
with the values found for the context-free recursive 
descent algorithm (Leermakers, 1993). Nullary LMG 
includes the context-free case, but still allows move- 
ment local to a rule; the closure result 3.1 still holds for 
this class of grammars. As all we can do with binding 
and slashing local to a rule is intersection, the nullary 
LMGs must be precisely the closure of the context-free 
grammars under finite intersection. 
These results can be extended to more efficient al- 
gorithms which can cope with left-recursive gram- 
mars such as memoizing recursive ascent (Leermak- 
ers, 1993). A very simple improvement is obtained 
by bilinearizing the grammar (which is possible if it 
95 
is left binding), giving a worst case complexity of 
o(Ic\[(1 + p)n3+2,). 
4 Other Approaches to Separation of 
Movement 
A natural question to ask is whether the LMG for- 
malism (for the purpose of embedding in equational 
specification systems, or eliminating unification as a 
stage of sentence processing) really has an advantage 
over existing mildly context-sensitive approaches to 
movement. Other non-concatenative formalisms are 
head-wrapping grammars (HG) (Pollard, 1984), extra- 
position grammars (XG) (Pereira, 1981) and various 
exotic forms of tree adjoining grammar (Kroch and 
Joshi, 1986). For overviews see (Weir, 1988), (Vijay- 
Shanker et al., 1986) and (van Noord, 1993). The most 
applicable of these formalisms for our purposes seem 
to be HG and XG, as both of these show good re- 
sults in modeling movement phenomena, and both are 
similar in appearance to context-free grammars; as in 
LMG, a context-free grammar has literally the same 
representation when expressed in HG or XG. Hence it 
is to be expected that incorporating these approaches 
into a system based on a context-free front-end will not 
require a radical change of perspective. 
4.1 Head Grammars 
A notion that plays an important role in various forms 
of Linguistic theory is that of a head. Although there 
is a great variation in the form and function of heads 
in different theories, in general we might say that the 
head of a constituent is the key component of that con- 
stituent. The head grammar formalism, introduced by 
Pollard in (Pollard, 1984) divides a constituent into 
three components: a left context, a terminal head and 
a right context. In a HG rewrite rule these parts of a 
constituent can be addressed separately when building 
a constituent from a number of subconstituents. 
An accurate and elegant account of Dutch cross- 
serial dependencies using HG is sketched in (Pollard, 
1984). However, we have not been able to construct 
head grammars that are able to model verb move- 
ment, cross-serial dependencies and topicalization at 
the same time. For every type of constituent, there 
is only one head, and hence only one element of the 
constituent that can be the subject to movement. 3 
4.2 Extraposition Grammars 
Whereas head grammars provide for an account of 
verb fronting and cross-serial dependencies, Pereira, 
3However, a straightforward extension of head grammars 
defined in (Groenink, 1995a) which makes use of arbitrary tu- 
pies, rather than dividing constituents into three components, 
is (1) capable of representing the three target phenomena of 
Dutch all at once and (2) weakly equivalent to a (strongly 
limiting) restriction of literal movement grammars. Head 
grammars and their generalizations, being linear context- 
free rewriting systems (Weir, 1988), have been shown to 
have polynomial complexity. 
introducing extraposition grammars in (Pereira, 1981), 
is focused on displacement of noun phrases in English. 
Extraposition grammars are in appearance very similar 
to context-free grammars, but allow for larger patterns 
on the left hand side of PS rules. This makes it possible 
to allow a topicalized NP only if somewhere to its right 
there is an unfilled trace: 
S --~ Topic S 
Topic . . . XP --* NP 
While XG allows for elegant accounts of cross-serial 
dependencies and topicalization, it seems again hard 
to simultaneously account for verb and noun move- 
ment, especially if the bracketing constraint introduced 
in (Pereira, 1981), which requires that XG derivation 
graphs have a planar representation, is not relaxed. 4 
Furthermore, the practical application of XG seems 
to be a problem. First, it is not obvious how we should 
interpret XG derivation graphs for further analysis. 
Second, as Pereira points out, it is nontrivial to make 
the connection between the XG formalism and stan- 
dard (e.g. Earley-based) parsing strategies so as to 
obtain truly efficient implementations. 
5 Conclusions 
We have presented the LMG formalism, examples 
of its application, and a complexity result for a con- 
strained subclass of the formalism. Example 2.9 shows 
that an LMG can give an elegant account of movement 
phenomena. The complexity result 3.5 is primarily in- 
tended to give an indication of how the recognition 
problem for LMG relates to that for arbitrary context 
free grammars. It should be noted that the result in 
this paper only applies to non-combinatorial LMGs, 
excluding for instance the grammar of example 2.9 as 
presented here. 
There are other formalisms (HG and XG) which 
provide sensible accounts of the three movement phe- 
nomena sketched in section 1.3, but altogether do not 
seem to be able to model all phenomena at once. In 
(Groenink, 1995b) we give a more detailed analysis of 
what is and is not possible in these formalisms. 
Future Work 
1. The present proof of polynomial complexity does 
not cover a very large class of literal movement gram- 
mars. It is to be expected that larger, Turing complete, 
classes will be formally intractable but behave reason- 
ably in practice. It is worthwile to look at possible prac- 
tical implementations for larger classes of LMGs, and 
investigate the (theoretical and practical) performance 
of these systems on various representative grammars. 
2. Efficient treatment of LMG strongly depends 
on the left-binding property of the grammars, which 
4Theoretically simultaneous treatment of the three move- 
ment phenomena is not impossible in XG (a technique similar 
topit-stopping in GB allows one to wrap extrapositions over 
natural bracketing islands), but grammars and derivations 
become very hard to understand. 
96 
seems to restrict grammars to treatment of leftward 
extraposition. In reality, a smaller class of rightward 
movement phenomena will also need to be treated. It 
is shown in (Groenink, 1995b) that these can easily 
be circumvented in left-binding LMG, by introducing 
artificial, "parasitic" extraposition. 
Acknowledgements 
I would like to thank Jasper Kamperman, Ren6 Leer- 
makers, Jan van Eijck and Eelco Visser for their en- 
thousiasm, for carefully reading this paper, and for 
many general and technical comments that have con- 
tributed a great deal to its consistency and readability. 
David J. Weir. 1988. Characterizing Mildly Context- 
Sensitive Grammar Formalisms. Ph.D. thesis, Uni- 
versity of Pennsylvania. 
References 
A.V. Aho. 1968. Indexed Grammars -an Extension 
to Context-free grammars. JACM, 15:647-671. 
Hiyan Alshawi, editor. 1992. The Core Language 
Engine. MIT Press. 
J.A. Bergstra, J. Heering, and P. Klint, editors. 1989. 
Algebraic Specification. ACM Press Frontier Se- 
ries. The ACM Press in co-operation with Addison- 
Wesley. 
Annius V. Groenink. 1995a. Accounts of 
Movement--a Formal Comparison. Unpublished 
manuscript. 
Annius V. Groenink. 1995b. Mechanisms for Move- 
ment. Paper presented at the 5th CLIN (Compu- 
tational Linguistics In the Netherlands) meeting, 
November 1994. 
A.S. Kroch and A.K. Joshi. 1986. Analyzing Extra- 
position in a TAG. In Ojeda Huck, editor, Syntax 
and Semantics: Discontinuous Constituents. Acad. 
Press, New York. 
Ren6 Leermakers. 1993. The Functional Treatment of 
Parsing. Kluwer, The Netherlands. 
Michael Newton. 1993. Formal Specification of 
Grammar. Ph.D. thesis, University of Edinburgh. 
Fernando Pereira. 1981. Extraposition Grammars. 
Computational Linguistics, 7(4):243-256. 
Carl J. Pollard. 1984. Generalized Phrase Struc- 
ture Grammars, Head Grammars, and Natural Lan- 
guage. Ph.D. thesis, Standford University. 
Arie van Deursen. 1992. Specification and Genera- 
tion of a A-calculus environment. Technical report, 
CWI, Amsterdam. Published in revised form in Van 
Deursen, Executable Language Definitions--Case 
Studies and Origin Tracking, PhD Thesis, Univer- 
sity of Amsterdam, 1994. 
Gertjan van Noord. 1993. Reversibility in Natural 
Language. Ph.D. thesis, Rijksuniversiteit Gronin- 
gen. 
K. Vijay-Shanker, David J. Weir, and A.K. Joshi. 1986. 
Tree Adjoining and Head Wrapping. In 11th int. 
conference on Computational Linguistics. 
97 
