Cross-Serial Dependencies Are Not Hard to Process 
Carl Vogel
Institute for Computational Linguistics
University of Stuttgart
Azenbergstr. 12
D-70174 Stuttgart
Germany

Ulrike Hahn
Department of Experimental Psychology
University of Oxford
South Parks Road
Oxford OX1 3UD
England

Holly Branigan
Centre for Cognitive Science
University of Edinburgh
2 Buccleuch Place
Edinburgh EH8 9LW
Scotland

{vogel,holly,ueh}@cogsci.ed.ac.uk
Abstract 
Cross-serial dependencies in Dutch and Swiss-German are the only known extra-context-free natural language syntactic phenomena. Psycholinguistic evidence suggests cross-serial orderings tend to be easier to process than nested constructions. We argue that the expressivity requirements of the corresponding formal languages do not actually entail that processing reduplication languages requires the worst-case time complexity for languages of the same expressive class. We distinguish between context-free representability and context-free processing. We show that for any language with up to context-free expressive power, processing cross-serial dependencies can be accommodated without affecting parsing complexity. This is related to other work on reduplication phenomena in formal models of computation.
1 Introduction 
The cross-serial dependencies in Dutch and Swiss-German are the only known constituent-level syntactic phenomena which make natural languages not representable in context-free languages (Gazdar, 1985; Gazdar and Pullum, 1985). Psycholinguistic study of the cross-serial dependencies reveals that the cross-serial orderings tend to be preferred over nested constructions (Bach et al., 1986).[1] Bach et al. argue from this that the pushdown stack cannot be the universal basis of the human parsing mechanism (since the pushdown automaton is essentially a context-free recognition device which cannot represent cross-serial dependencies). Stabler (1994), on the other hand, considers the findings of Bach et al. (1986) as evidence for finite human sentence processing capacity. In this paper, we distinguish between context-free representability and context-free processing. We show that for any language with up to context-free expressive power, processing cross-serial dependencies can be accommodated without affecting parsing complexity. While this does essentially infuse the language with indexed expressivity, it does so while allowing us to retain context-free (or even regular) parsing complexity. Essentially, it is possible to carve out a cross-section of the expressivity hierarchy with the desired processing complexity. The result is based on the simple observation that the cross-serial dependencies are idealized by the string duplication language (whereas the nested dependencies are idealized by the palindrome language), and that it is trivial to provide a context-free (or regular) language parse for half of the string, followed by a test of equality for the remaining half of the string. This is consistent with findings that cross-serial dependencies are not hard to process, but qualifies the interpretation that Bach et al. give to their results and the implications for the human parsing mechanism. In particular, this suggests that, with an additional operation, the pushdown stack can be adequate for processing human languages. It also suggests an explanation for the finding that Dutch cross-serial dependencies are easier to process than German nested dependencies. We outline further consequences of our proposal in terms of patterns of disfluencies that are likely to occur in languages that admit cross-serial dependencies and propose a strategy for empirical investigation.

[1] Nested constructions are a quintessentially context-free phenomenon.
2 Preliminaries 
To calibrate our discussion, we quickly review the salient terminology from formal language theory and the current understanding of the import for natural languages.
2.1 Terminology 
Let $\mathcal{L}_i$ denote the hierarchy of languages generated by the corresponding hierarchy of grammars (according to the usual hierarchy (Hopcroft and Ullman, 1979)). Thus, $\mathcal{L}_0$ denotes the class of languages generated by type 0 grammars. They are characterized by unrestricted grammar production rules. $\mathcal{L}_1$ is the class of languages generated by context-sensitive grammars; the sole restriction on production rules in this type of grammar is that the right hand side (RHS) of each rule is at least as long as the left hand side (LHS). $\mathcal{L}_{1.5}$ denotes the class of languages generated by indexed grammars. Gazdar (1985) provides the most perspicuous notation for the restricted forms that production rules may take in such grammars:[2]

1. $A[\ldots] \rightarrow W[\ldots]$
2. $A[\ldots] \rightarrow B[i, \ldots]$
3. $A[i, \ldots] \rightarrow W[\ldots]$
Indexed grammars incorporate a notion of stacking; rules of the form in (2) describe push operations, and those of the form in (3) involve pops. Rules of the form (1) are copy operations. The ellipses indicate that the remainder of the stack is passed on from the LHS to each nonterminal (and only the nonterminals) on the RHS. $\mathcal{L}_2$ is the class of context-free languages, generated by grammars whose productions are restricted such that the LHS of each is a single nonterminal symbol, and each RHS is a sequence of terminals and nonterminals. Finally, the regular languages, $\mathcal{L}_3$, are those produced by regular grammars, characterized by rules that have a single nonterminal symbol on the LHS and, on the RHS, either a terminal symbol or a terminal and a single nonterminal.

These classes of languages can be arranged into a hierarchy based on proper containment relations among them: $\mathcal{L}_3 \subset \mathcal{L}_2 \subset \mathcal{L}_{1.5} \subset \mathcal{L}_1 \subset \mathcal{L}_0$ ($\mathcal{L}_0$ is the least restrictive, the most expressive). Aho (1968) shows the existence of a class of languages that is a proper subset of the indexed languages and a proper superset of the context-free languages. Joshi et al. (1989) conjecture that there is actually a convergence in expressive power among the 'mildly context sensitive' (MCS) languages, but other work points out exceptions (Savitch, 1989; Vogel and Erjavec, 1994). Since the reduplication languages (Savitch, 1989) are central to the point of this paper, we define them: they are the languages homomorphic to the set of strings $\{ww \mid w \in \{a,b\}^*\}$. The string duplication languages are not context free, although they are closely related to the string reversal languages ($\{ww^R \mid w \in \{a,b\}^*\}$, where the $R$ indicates the reversal operator), which are context free. The two languages induce different dependency relationships, which are best described as nesting in the context-free case and cross-serial in the indexed case:
    abba (nested): position 1 pairs with position 4, and 2 with 3
    abab (cross-serial): position 1 pairs with position 3, and 2 with 4
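The two dependency patterns can be made concrete with a short sketch. It is only an illustration under our own conventions: positions are 0-based, and the helper names `nested_pairs`, `crossed_pairs` and `crosses` are hypothetical rather than taken from the literature.

```python
def nested_pairs(n):
    """Dependencies in w w^R with |w| = n: symbol i pairs with its mirror image."""
    return [(i, 2 * n - 1 - i) for i in range(n)]


def crossed_pairs(n):
    """Dependencies in w w with |w| = n: symbol i pairs with symbol n + i."""
    return [(i, n + i) for i in range(n)]


def crosses(p, q):
    """True if the two dependency arcs interleave (cross) rather than nest."""
    (a, b), (c, d) = sorted([p, q])
    return a < c < b < d


if __name__ == "__main__":
    print(nested_pairs(2))   # arcs for abba
    print(crossed_pairs(2))  # arcs for abab
```

For `n = 2` the first function gives the arcs of abba and the second those of abab; no two nested arcs interleave, whereas every pair of cross-serial arcs does.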
[2] The bracketed material indicates a stack of indices; W denotes a sequence of terminals and nonterminals; A, B denote nonterminals.
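As an illustration of how the three rule forms interact, the following sketch traces one derivation of a string in $ww$. The particular grammar is our own assumption, not one given by Gazdar (1985): copy and push rules rewrite S so that each symbol of w is emitted and also pushed as an index, a copy rule hands the stack to T, and pop rules of T emit the stored indices to the right. The bottom-of-stack marker `$` and the function name `derive_ww` are likewise hypothetical.

```python
def derive_ww(w):
    """Return the sentential forms of one derivation of w + w.

    Assumed rules (x ranges over a and b, P_x is an auxiliary nonterminal):
        S[...]    -> x P_x[...]      copy rule, form (1)
        P_x[...]  -> S[x, ...]       push rule, form (2)
        S[...]    -> T[...]          copy rule, form (1)
        T[x, ...] -> T[...] x        pop rule,  form (3)
        T[$, ...] -> (empty string)  pop rule,  form (3)
    """
    def show(prefix, nonterminal, stack, suffix):
        # The stack is printed top-first, matching the A[i, ...] notation.
        return f"{prefix}{nonterminal}[{','.join(reversed(stack))}]{suffix}"

    if any(ch not in "ab" for ch in w):
        raise ValueError("w must be a string over {a, b}")

    stack = ["$"]                      # the last list element is the top of the stack
    prefix, suffix = "", ""
    forms = [show(prefix, "S", stack, suffix)]
    for ch in w:
        prefix += ch                   # copy rule emits the terminal ...
        stack.append(ch)               # ... and the push rule records it as an index
        forms.append(show(prefix, "S", stack, suffix))
    forms.append(show(prefix, "T", stack, suffix))
    while stack[-1] != "$":
        suffix = stack.pop() + suffix  # pop rule emits the index to the right of T
        forms.append(show(prefix, "T", stack, suffix))
    forms.append(prefix + suffix)      # T[$] rewrites to the empty string
    return forms


if __name__ == "__main__":
    for form in derive_ww("ab"):
        print(form)
```

Each copy-and-push pair is shown as a single step. Because the most recently pushed index is popped first and placed rightmost, the second half comes out in the same order as the first, which is what makes the dependencies cross-serial rather than nested.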
An important property of each of the language classes is that it is closed under both intersection with regular languages (e.g., the intersection of a context-free language and a regular language is no more expressive than a context-free language) and homomorphism (e.g., an order-preserving map of each symbol in a language to a single element (possibly a string) of a context-free language implies that the first language is also context free). It is convenient to refer to languages with homomorphisms to $\{ww^R \mid w \in \{a,b\}^*\}$ and $\{ww \mid w \in \{a,b\}^*\}$ as $ww^R$ and $ww$, respectively.

Corresponding to each expressivity class and its associated model of computation is the complexity of recognition for that class. Table 1 gives an informal ranking of the language classes with their corresponding worst-case recognition complexity on the standard model of computation. Thus, given a context-free grammar for $ww^R$ and a string of length $n$, in the worst case it will take an amount of time proportional to the cube of the length of the string to determine whether the string is in $ww^R$ (and identify its structure).

While the expressivity hierarchy is useful for differentiating classes of languages in precise terms like worst-case recognition complexity, it is easy to use the hierarchy incorrectly. For instance, it is not valid to conclude that because a language is in a particular language class all subsets of that language are also included in that language class (e.g., $ww^R$ is a proper subset of the set of all strings $w \in \{a,b\}^*$, yet that set is in $\mathcal{L}_3$ while $ww^R$ is in $\mathcal{L}_2$ and not in $\mathcal{L}_3$). Also, in most cases the structural descriptions that underlie strings of a language are of more interest than the string sets themselves. For this reason it is useful to distinguish weak and strong containment of a grammar in a language class: e.g., a grammar is weakly context free if its stringset is context free; a grammar is strongly context free if its treeset is also context free.
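The cubic bound mentioned above is the one obtained by standard tabular recognition. A minimal sketch follows, assuming a Chomsky normal form grammar that we wrote ourselves for the nonempty strings of $ww^R$ (S -> A X | B Y | A A | B B, X -> S A, Y -> S B, A -> a, B -> b) and using the CYK algorithm, which the paper does not itself name; the identifiers `BINARY`, `LEXICAL` and `cyk` are hypothetical.

```python
from itertools import product

# Assumed CNF grammar for { w w^R : w in {a,b}+ }.
BINARY = {
    ("A", "X"): {"S"}, ("B", "Y"): {"S"},
    ("A", "A"): {"S"}, ("B", "B"): {"S"},
    ("S", "A"): {"X"}, ("S", "B"): {"Y"},
}
LEXICAL = {"a": {"A"}, "b": {"B"}}


def cyk(s):
    """CYK recognition: three nested loops over span length, start and split point."""
    n = len(s)
    if n == 0:
        return False  # a CNF grammar of this form does not derive the empty string
    # table[i][length] holds the nonterminals that derive s[i:i + length]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(s):
        table[i][1] = set(LEXICAL.get(ch, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                lefts = table[i][split]
                rights = table[i + split][length - split]
                for left, right in product(lefts, rights):
                    table[i][length] |= BINARY.get((left, right), set())
    return "S" in table[0][n]


if __name__ == "__main__":
    for s in ["abba", "baab", "abab"]:
        print(s, cyk(s))
```

The three nested loops over span length, start position and split point are the source of the cubic term; the grammar contributes only a constant factor.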
2.2 Applicability to Natural Language 
Pullum and Gazdar (1982) survey the arguments, up to the time they wrote, for the non-context-freeness of natural language. The most interesting were those that considered idealizations of linguistic phenomena in terms of the string duplicating language, $ww$. In each case they found the argument flawed: the phenomena in question did not yield languages whose stringsets were homomorphic to the duplication language. Bresnan et al. (1982) argue that Dutch is not strongly context free. Shieber (1985) provides a stringset argument about a dialect of Swiss-German, which has a class of verb phrases with cross-serial dependencies (through case marking) between NPs and their Vs; this establishes even the weak non-context-freeness of natural language because of homomorphism to $ww$. Manaster-Ramer (1987) re-analyzes an argument considered by Pullum and Gazdar (1982) about Dutch and produces a corrected stringset argument that Dutch licenses $a^n b^n c^n$ constructions, which are MCS. No known syntactic phenomenon requires greater than indexed language expressivity.

Level   Language Type                                    Model of Computation                Complexity
-----   ----------------------------------------------   ---------------------------------   -----------
0       unrestricted phrase structure grammar (= r.e.)   Turing Machine (TM)                 undecidable
1       context sensitive (⊂ recursive)                  Linear Bounded Automata (LBA)       PSPACE
1.5     indexed                                          Nested Stack Automata (NSA)         NP-Complete
1.75    mildly context sensitive                         Embedded Pushdown Automata (EPDA)   n^7
2       context free                                     Pushdown Automata (PDA)             n^3
3       regular                                          Finite State Machines (FSM)         linear

Table 1: Models of Grammar and Computation
The point of this paper is to emphasize that although a particular Swiss-German dialect renders natural language syntax non-context-free, it does not follow that natural languages, including the ones that license cross-serial dependencies, incur the worst-case recognition complexity costs for indexed languages. In fact, we argue in the next section that $ww$ is fairly straightforward to process. Essentially, we consider languages $xx$ homomorphic to $ww$, where $x$ can be in either $\mathcal{L}_3$ or $\mathcal{L}_2$, and argue that the recognition for $xx$ is no worse than worst-case recognition for $\mathcal{L}_3$ if $x \in \mathcal{L}_3$ and no worse than the worst case for $\mathcal{L}_2$ if $x \in \mathcal{L}_2$, even though $xx$ is itself indexed.
3 Cross-Serial Dependencies Are Not Hard to Process
It is always possible to compile less restrictive grammar formalisms into more restrictive covering formalisms, allowing different constituent analyses and potential stringset overgeneration. Metagrammatical techniques give an alternative that preserves coverage, but uses special-purpose processing. We suggest a parsing method for languages that rely on $ww$ which does not cost a greater complexity fee than the worst case for parsing context-free grammars. The method is metagrammatical and therefore akin to proposals put forward previously for handling coordination (Dahl and McCord, 1983) with logic grammars and TAGs (Shieber, 1995) or for extraposition (Milward, 1994). The method is constrained enough not to augment overall processing complexity, implying that $ww$ does not require the worst-case recognition complexity for its characteristic class, the MCS languages.
3.1 Why not? 
Trivially, the string duplication languages can be recognized with time complexity proportional to the length of the string: if the string is of even length, and its first half is identical to the second half, then this can be established in just linear time. Though trivial in the sense of being about mere recognition, this is nonetheless interesting. In particular, under the reasonable hypothesis that humans are not in general reverse-wired,[3] it is easier to process serial orders than their reverse. In this trivial recognition model we could take the serial ordering as primitive, but to use the same model as a recognizer for the context-free string reversal languages would require an additional step of reversing the second half of the string before checking equivalence, which means the recognition complexity is $n \log n$. Thus, for trivial recognition the string duplication languages are easier to process than the string reversal languages. This is a concrete illustration that not every language costs the worst-case recognition complexity for its expressivity class.
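The trivial recognition model can be sketched in a few lines. The function names `is_copy` and `is_reversal` are our own; the sketch is only meant to show that the reversal recognizer is the copy recognizer plus one extra step, and it makes no claim about how that step should be costed.

```python
def is_copy(s):
    """Recognize ww: split at the midpoint and compare the halves in serial order."""
    half, odd = divmod(len(s), 2)
    return not odd and s[:half] == s[half:]


def is_reversal(s):
    """Recognize ww^R with the same model: reverse the second half, then compare."""
    half, odd = divmod(len(s), 2)
    if odd:
        return False
    second_reversed = s[half:][::-1]  # the additional step that ww does not need
    return s[:half] == second_reversed


if __name__ == "__main__":
    for s in ["abab", "abba", "aba"]:
        print(s, is_copy(s), is_reversal(s))
```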
However, in the case of natural languages, parsing is of greater interest than mere recognition. A generalization of the recognizer method can be used inside a parsing approach as well. Suppose some $i$ such that $i \geq 2$; suppose we want a recognizer for $\{ww \mid w \in \{a,b\}^*\}$ where $w \in \mathcal{L}_i$; then we can use a parser that is no worse than cubic (if $i = 2$) and which can be linear (if $i = 3$) to determine if $w \in \mathcal{L}_i$. Thus, if we parse exactly half of the string using a processor designed for languages in $\mathcal{L}_i$, and then ascertain whether the remaining half is identical, then we remain in the same processing complexity class, since the identity check occurs after the parse and only requires linear time, but we also have structural information about the sentence as a whole. We know the structure of the first half of the string, and the content of the second half of the string but not its structure (the grammar for $w$ could be ambiguous), although we can assume that the second $w$ was licensed by exactly the same tree structure as the first. This method also preserves a relative difference between parsing $ww$ and $ww^R$, at least for $\mathcal{L}_3$. Since $ww^R$ can be represented directly within $\mathcal{L}_2$, it can be argued that we should not be required to use the metagrammatical method of parsing it just to keep symmetry with the duplication languages. Interestingly, if $w$ is in $\mathcal{L}_2$ and we use the metagrammatical parsing method, then $ww^R$ also requires more processing time than $ww$ for the same reason as in the trivial case. Suppose instead that we allow $ww^R$ to be parsed without using the metagrammatical method. In that case $ww$ is relatively even easier to process, since it costs $|w|^3$ to parse with the metagrammatical approach but $ww^R$ will cost $(2|w|)^3$ in the direct approach. It might be claimed that just as we argue $ww$ not to require the worst-case complexity for its language class ($\mathcal{L}_{1.5}$), neither need $ww^R$ for $\mathcal{L}_2$; but the reversal language is a canonical example of a language that makes maximal use of the stack in the PDA. In any case, the metagrammatical method for parsing $ww$ costs no more than just parsing strings in the characteristic language class of $w$.

[3] While there actually is structural reverse wiring, psychological effects, like child learning of the distinction between left and right hands on themselves and on a person facing them, suggest that there is a difference in processing time required between recognizing a copy and an inverse copy. Another example comes from the recognition of rotated objects. There is a robust effect in which, given a reference object and a rotated object-in-question, it takes time linear in the amount of rotation to recognize the objects as copies. Mirror-image objects are isomorphic, yet it takes strictly more time to recognize reflected copies than to recognize nonreflected copies (Cooper, 1975).
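The half-parse-then-compare idea can be sketched as follows. This is a recognizer only, under assumptions of our own: `recognize_xx` takes any recognizer for the base language as a parameter, and the two base languages used here, $(ab)^+$ for $\mathcal{L}_3$ and $a^n b^n$ for $\mathcal{L}_2$, are illustrative choices that do not come from the paper.

```python
import re


def recognize_xx(s, recognize_x):
    """Process the first half with a recognizer for x, then check the second half for identity."""
    half, odd = divmod(len(s), 2)
    if odd:
        return False
    first, second = s[:half], s[half:]
    # The identity check is a single linear pass made after the first half is processed.
    return recognize_x(first) and first == second


def in_regular_x(x):
    """Illustrative base language in L3: (ab)+."""
    return re.fullmatch(r"(ab)+", x) is not None


def in_context_free_x(x):
    """Illustrative base language in L2: a^n b^n with n >= 1."""
    n, odd = divmod(len(x), 2)
    return n > 0 and not odd and x == "a" * n + "b" * n


if __name__ == "__main__":
    print(recognize_xx("abab", in_regular_x))
    print(recognize_xx("aabbaabb", in_context_free_x))
    print(recognize_xx("aabbabab", in_context_free_x))
```

The base-language processor only ever sees $|w|$ symbols, which is where the comparison of $|w|^3$ against $(2|w|)^3 = 8|w|^3$ in the text comes from when the base language is context free.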
If this were the complete story then we could only recognize languages homomorphic to the duplication languages. Clearly even the Zürich dialect of Swiss-German allows other constructions, all of which we can assume are context free (Pullum and Gazdar, 1982). Essentially we want to be able to write arbitrary $\mathcal{L}_3$ or $\mathcal{L}_2$ grammars and also be able to parse the string duplication language for whichever $\mathcal{L}_i$ we choose. The language defined by such a union is no longer in $\mathcal{L}_i$, but will not contain arbitrary $\mathcal{L}_{1.5}$ strings, and if $i = 3$ then the union will not even contain arbitrary context-free strings. However, the situation is more involved than the basic approach since there needs to be a way to indicate where the metagrammatical approach is to be invoked. We add a single feature to the grammar, interpreted by the processor as 'expect a copy'.[4]
1. $A \rightarrow W\, B^{c}\, Y$

We allow context-free productions of the form shown in (1), where A and B are nonterminals and W, Y are (possibly empty) sequences of terminals and nonterminals, B possibly occurring among the nonterminals of Y. For an ambiguous CFG, there is no guarantee that multiple instances of a nonterminal will rewrite through the same sequence of productions to yield the same string.

[4] Once we admit 'interpretability by the processor' we in principle have TM power. However, we make quite restricted use of such interpretation. The rule format makes clear that it is less expressive than indexed grammars when interpreted directly.
There are any number of ways that this basic notation can be used in a metagrammatical approach. In the first instance, we take $c$ to be a signal to the processor to generate an expectation for a duplicate of the terminal sequence that the nonterminal it is attached to gets rewritten to, and we require that this expectation be satisfied by the next nonterminal of the same name in the same local domain.[5] This approach will require that the sequence of terminals rewritten from the first B in (1) be duplicated by the terminal sequence rewritten from the first instance of B (if any) that occurs in Y. The restriction will not hold of subsequent instances of the nonterminal marked for copying in the same local domain, nor at different levels in the analysis. A stronger interpretation could require an expectation for the same constituent analysis of the nonterminal as well. Since we do not allow the feature to stack, the string-based method does not yield the full expressive power of indexed languages. The point is just that it is possible to keep a CF (or regular) grammar, and supplement the processor with a string-duplication operator which can be invoked at the subsentence level. This is sufficient to yield languages that more closely resemble the Zürich dialect in having other constructions besides the duplication construction, yet remaining efficiently processable.[6]
We have implemented the interpreter in a chart parser that can be used in either top-down or bottom-up fashion. Edges in the chart are marked with a category (some nonterminal or preterminal symbol from the grammar), constituents, substring span and expectations (along with a unique identifier for each edge). This is modified to include a list of constraints, which for the present purposes is presumed to be just duplication checks. An edge with no expectations is inactive (saturated) and one with expectations is active. In the completer step, when active edges combine with adjacent inactive edges whose category satisfies the current expectation of the active edge, the usual process of creating a new edge with one less expectation is augmented with another: if the current expectation has an associated copy feature, then the new edge is marked with a constraint interpreted by the parser as indicated above. That is, the nonterminal symbol and the string spanned by the inactive edge are noted so that the next inactive edge of the same category (if one is expected) will have to span an identical string. Constraints of this form are not passed on after being satisfied once, and are not passed out of the local domain. Within the same set of restrictions the implemented constraint could have been 'expect a reversed copy'. This would require computing the string's reverse before annotating the constraint list.

[5] We take a local domain, in tree terms, as a node and the set of nodes that it immediately dominates.

[6] To get closer still to the Zürich dialect, we require that the duplication operator be applied at the level of preterminals, with complementation, to get the pairings of case-marked NPs and Vs.
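To make the interpretation of the copy feature concrete, here is a much simplified sketch. It is not the chart parser described above: it is a memoized top-down recognizer of our own, it assumes a grammar with no left recursion and no empty productions, and the rule encoding (pairs of symbol and copy flag), the toy grammar and the names `make_recognizer` and `GRAMMAR` are all hypothetical. It does follow the stated policy that a copy constraint is recorded when a marked nonterminal is completed, is discharged by the next nonterminal of the same name in the same rule, and is never passed out of that rule.

```python
from functools import lru_cache

# Each right-hand side is a tuple of (symbol, copy_flag) pairs.
# Symbols that are not keys of the grammar are treated as terminals.
GRAMMAR = {
    "S": [
        (("B", True), ("B", False)),                 # S -> B^c B  (duplication)
        (("a", False), ("S", False), ("b", False)),  # S -> a S b  (ordinary CF rule)
        (("a", False), ("b", False)),                # S -> a b
    ],
    "B": [
        (("a", False), ("B", False)),
        (("b", False), ("B", False)),
        (("a", False),),
        (("b", False),),
    ],
}


def make_recognizer(grammar, start):
    def recognize(text):
        tokens = tuple(text)
        n = len(tokens)

        @lru_cache(maxsize=None)
        def ends(symbol, i):
            """All positions j such that symbol derives tokens[i:j]."""
            if symbol not in grammar:  # terminal symbol
                matches = i < n and tokens[i] == symbol
                return frozenset({i + 1}) if matches else frozenset()
            found = set()
            for rhs in grammar[symbol]:
                found |= expand(rhs, 0, i, ())
            return frozenset(found)

        def expand(rhs, k, i, pending):
            """Match rhs[k:] from position i; pending holds (symbol, string) copy constraints."""
            if k == len(rhs):
                return {i}
            symbol, copy = rhs[k]
            required = dict(pending).get(symbol)
            found = set()
            for j in ends(symbol, i):
                span = tokens[i:j]
                if required is not None and span != required:
                    continue  # this constituent had to be a copy and is not
                # A constraint is dropped once the next same-named symbol has been checked.
                rest = tuple(c for c in pending if c[0] != symbol)
                if copy:
                    rest += ((symbol, span),)  # expect a copy of this span later in the rule
                found |= expand(rhs, k + 1, j, rest)
            return found

        return n in ends(start, 0)

    return recognize


recognize = make_recognizer(GRAMMAR, "S")

if __name__ == "__main__":
    for s in ["abab", "abba", "aabb", "aababb"]:
        print(s, recognize(s))
```

The string aababb is accepted through the ordinary rule S -> a S b with the embedded S analysed by the duplication rule, which illustrates the operator being invoked at the subsentence level.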
4 Discussion 
The context-free languages have already been studied from the perspective of minimal addition to incorporate copy languages. Savitch (1989) does exactly that by presenting the model of computation required for the class of languages defined by augmenting the CFLs with reduplication: a Reduplication PDA (RPDA). An RPDA is just a PDA which has a special type of symbol that can be put onto the stack to make the machine treat the part of the stack above it as if it were a queue. Essentially, this obtains the reversal behavior needed of a stack to process copy languages as well as reversals. Multiple instances of the special symbol can be placed on the stack. Savitch presents a characterization of the languages in terms of stringsets and the requisite computational structures. The family that we characterized above in terms of grammars is properly a subset of the languages recognized by RPDA, a restriction of RPDA languages which Savitch (1989) terms simple RPDA languages. The model of computation here is an RPDA in which only one special symbol is allowed on the stack at any one time. We have not proven the equivalence we conjecture between our metagrammatical method and the reduplication context-free grammars (RCFGs) that Savitch introduces as generative of simple RPDA languages. Savitch's (1989) grammars are stated in terms of rule schemata (a finite set) that generate potentially infinite sets of rewrite rules. This is the tradeoff between doing things metagrammatically and directly.
Joshi and Rambow (Joshi, 1990; Rambow and Joshi, 1994) have also considered the performance data associated with processing crossed vs. nested dependencies and present an alternative computation model, the bottom-up embedded PDA (BEPDA), designed for a variant of tree-adjoining grammar (it uses a stack of stacks and a more complex operation for emptying the stack). Rambow and Joshi (1994) use the processing model to demonstrate that it can account for the difference between crossed and nested dependencies in terms of the amount of time associated objects spend in the pushdown store of the BEPDA, using a mildly context-sensitive language model that captures dependencies directly, rather than metagrammatically.[7] Essentially, their analysis concludes the same: when judging string isomorphisms, it is easier to make the judgment for identically ordered pairs than it is for reversely ordered pairs. Thus, the cross-serial dependencies needn't cost the worst-case complexity for parsing indexed or mildly context-sensitive languages. Parsing $ww$ languages requires, at worst, the worst-case complexity of parsing $w$ in whichever language class $w$ is restricted to. Shieber (1985) pointed out without proof that the non-CF data associated with the Zürich dialect is linearly parsable; our task has been to clarify how this follows from the language theory.

[7] Joshi (1990) gives a similar analysis for EPDAs.
4.1 A Caveat 
Whether efficient processing of $ww$ entails corresponding complexity for natural languages that license cross-serial dependencies hinges crucially on there being efficiently computable homomorphisms between the natural language and the string duplication languages. This is an open question. However, given the empirical work that compares processing of crossed and nested dependencies and concludes that the cross-serial dependencies are preferred to nested ones (Bach et al., 1986), and given our argument that cross-serial dependencies are in theory easier to process, we feel it reasonable to entertain the assumption that something of the sort exists. This does not require us to assume that people actually use context-free grammars and compute homomorphisms in order to understand natural languages, just that the computational model should be at least approximately as efficient as people.
4.2 Implications
Our metagrammatical approach to dealing with cross-serial dependencies involves the assumption of an operation for testing string duplication. We hinted earlier that we feel there to be sufficient reason to believe that copy-checking is a basic cognitive function, and although we don't suppose that people have built-in production systems and processors isomorphic to our chart parser and base language, we do think that this copy-checking is invoked in the processing of crossed dependencies. Our approach to accounting for the processing complexity that the string duplication languages should take does make empirical predictions and these can be tested. For instance, if it is the case that such a mechanism exists, then patterns of string-copy disfluency should occur with different frequency in languages that license cross-serial dependencies than in those that do not. A string-copy disfluency is just one that involves a repeat of part of the sentence uttered so far:
1. We went to the to the store to buy some flour.
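A corpus study of this kind needs some way of locating string-copy disfluencies in transcripts. The sketch below shows one simple possibility; it is our own assumption about how such a search might be set up, not a procedure from the paper. It scans a whitespace-tokenized utterance for a word sequence that is immediately repeated, preferring the longest repeat at each position. The name `find_string_copies` and the `max_words` cap are hypothetical.

```python
def find_string_copies(utterance, max_words=5):
    """Return (word index, repeated phrase) for each immediately repeated word sequence."""
    words = utterance.lower().split()
    repeats = []
    i = 0
    while i < len(words):
        longest = min(max_words, (len(words) - i) // 2)
        for k in range(longest, 0, -1):
            if words[i:i + k] == words[i + k:i + 2 * k]:
                repeats.append((i, " ".join(words[i:i + k])))
                i += 2 * k  # continue after the repeated material
                break
        else:
            i += 1  # no repeat starts at this word
    return repeats


if __name__ == "__main__":
    print(find_string_copies("We went to the to the store to buy some flour"))
```

Note that the repeated string found in the example, "to the", is not a constituent, which is the kind of observation the proposed investigation would need to collect systematically.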
The idea is that speakers of languages with $ww$ homomorphisms have a different pattern of invoking copy-checking than those who speak languages that do not admit cross-serial dependencies. These differences should be manifest in speech corpora like those that are currently being accumulated (Anderson et al., 1992; Miller, 1995), but which need augmentation by a corpus derived from copy-language dialects. Verifying this would, for example, establish whether the copied strings need to be constituents, and this has a bearing on whether processing models designed for incremental interpretation (Milward, 1992) are the best descriptors of human performance.[8] We do not offer arguments that our metagrammatical approach is the best description of human processing of cross-serial dependencies, just that it is another theoretical justification for the difference between processing nested dependencies and the efficient processing of crossed dependencies.
Acknowledgements 
Vogel is grateful to the SFB 340 for funding his stay in Stuttgart; Hahn acknowledges the support of ESRC grant No. R004293341442; Branigan, EPSRC research studentship No. 92315069. All would like to thank Catherine Collin, Tomaž Erjavec, Tsutomu Fujinami, Merce Prat, Fred Popowich, Mark Steedman, and the anonymous reviewers.
References 
Alfred V. Aho. 1968. Indexed grammars--an extension
of context-free grammars. Journal of the Association
for Computing Machinery, 15(4):647-671.
Anne H. Anderson, Miles Bader, Ellen Gurman Bard, 
Elizabeth H. Boyle, Gwyneth M. Doherty, Simon C. 
Garrod, Stephen D. Isard, Jacqueline C. Kowtko, 
Jan M. McAllister, Jim Miller, Catherine F. Sotillo, 
Henry S. Thompson, and Regina Weinert. 1992. 
The HCRC Map Task corpus. Language and
Speech, 34(4):351-366.
Emmon Bach, Colin Brown, and William Marslen- 
Wilson. 1986. Cross and nested dependencies in
German and Dutch: A psycholinguistic study.
Language and Cognitive Processes, 1(4):249-262.
Joan Bresnan, Ron Kaplan, Stanley Peters, and Annie 
Zaenen. 1982. Cross-serial dependencies in Dutch.
Linguistic Inquiry, 13(4):613-35. 
Lynn Cooper. 1975. Mental rotation of random two- 
dimensional shapes. Cognitive Psychology, 7:23-43. 
Veronica Dahl and Michael McCord. 1983. Treating
coordination in logic grammars. American Journal
of Computational Linguistics, 9(2):69-91.
[8] Note that the English 'respectively' constructions require a special intonational behavior in the sing-song litany-voice that is required for a speaker to make an extended 'respectively' construction interpretable; thus arguments for specifically metagrammatical treatment do exist (where intonational facts are considered evidence for a signal to the processor to do something unusual).
Gerald Gazdar and Geoffrey Pullum. 1985. Compu- 
tationally relevant properties of natural languages 
and their grammars. Technical Report CSLI-85-24, 
Stanford: Center for the Study of Language and 
Information. 
Gerald Gazdar. 1985. Applicability of indexed gram- 
mars to natural language. Technical Report CSLI- 
85-34, Stanford: Center for the Study of Language 
and Information. 
John E. Hopcroft and Jeffrey D. Ullman. 1979. Intro- 
duction to Automata Theory, Languages, and Com- 
putation. Addison-Wesley Publishing Co., Reading 
MA. 
Aravind K. Joshi, K. Vijay-Shanker, and David Weir. 
1989. The convergence of mildly context-sensitive 
grammar formalisms. Technical Report MS-CIS- 
89-14; LINC LAB 144, Department of Computer 
and Information Science University of Pennsylva- 
nia, Philadelphia, PA. 
Aravind Joshi. 1990. Processing crossed and nested 
dependencies: An automaton perspective on the 
psycholinguistic results. Language and Cognitive
Processes, 5(1):1-27. 
Alexis Manaster-Ramer. 1987. Dutch as a formal 
language. Linguistics and Philosophy, 10(2):221- 
46. 
Jim Miller. 1995. Focus in the languages of Europe.
To appear in G. Bernini (ed.) Pragmatic organiza- 
tion in the languages (Volume I. of Typology of the 
languages of Europe). Mouton-de Gruyter. 
David Milward. 1992. Dynamics, dependency gram- 
mar and incremental interpretation. In COL- 
ING92, pages 1095-9. 
David Milward. 1994. Dynamic dependency gram- 
mar. Linguistics and Philosophy, 17:561-605. 
Geoffrey Pullum and Gerald Gazdar. 1982. Natural
languages and context-free languages. Linguistics 
and Philosophy, 4:471-504. 
Owen Rambow and Aravind Joshi. 1994. A process- 
ing model for free word order languages. In L. Fra- 
zier C. Clifton, Jr. and K. Rayner, editors, Perspec- 
tives on Sentence Processing. Lawrence Erlbaum. 
Walter Savitch. 1989. A formal model for context-free 
languages augmented with reduplication. Compu- 
tational Linguistics, 15(4):250-61. 
Stuart Shieber. 1985. Evidence against the context- 
freeness of natural language. Linguistics and Phi- 
losophy, 8(3):333-43. 
Stuart Shieber. 1995. What is wrong with tags. In- 
vited talk at the Seventh Conference of the Euro- 
pean Chapter of the Association for Computational 
Linguistics. Belfield, Dublin, Ireland. 
Edward P. Stabler. 1994. The finite connectivity of 
linguistic structure. In Lyn Frazier Charles Clifton, 
Jr. and Keith Rayner, editors, Perspectives on
Sentence Processing. Hillsdale, NJ: Lawrence Erlbaum.
Carl Vogel and Tomaž Erjavec. 1994. Restricted
discontinuous phrase structure grammar and its
ramifications. In Carlos Martin-Vide, editor, Current
Issues in Mathematical Linguistics. The Netherlands:
Elsevier Science Publishers.
