On the Structural Complexity of Natural Language Sentences 
Dekang Lin* 
Artificial I ntclligcn cc I,a, bovatory 
Ma.ss~cclmsetts Insdtttt(; of :l~cttnology 
l{m 7(57, 545 Technology Square 
(~atnbridgc, Massac.husetts, USA, 02\] 239 
E-maih lindck~*ai.mit.edu 
Abstract 
The objective of this pal)er is to \[brmal- 
ize the intuition al)out l,he comph;xity 
of syntactic structures. We propose a 
definition of structm:al COml)h'xity such 
that sentences ranked by our definition 
as more COml)h;x are gen(;rally more diI'- 
ficult lbr humans to process. We justify 
the definition by showing how it is ahle to 
account for several seemingly unrelated 
phenomena in natural languages. 
1 Introduction 
Intuitive\]y, certain syntactic structures arc uLore 
difficult for htnnans to process thau others. For 
example, compare the following to sentences: 
(1) a. 'Fhe cat that the dog that the man 
bought chased died. 
b. The man bought the dog that chased 
the cat that died. 
It is ohvious that sentence (la.) is much mor(' dif- 
ficult to understarld than (1 b). Since the two sen- 
tences are of the same length an(l involve the same 
set of semantic relationships, the ditliculty in rm- 
derstan(ling (1 a) can only be attributed to its syn- 
tactic structure. 
'\['he objecl;ive of this pal)er is to fortnalize the 
intuition a.bout the complexity of syutactic stru(> 
tures. We propose a detinition of structural 
colnI)h~xil;y (SC) such thai; sentences ranked by 
our definition as more complex are generally more 
difficult for humans to process than otherwise sim- 
ilar sentences, hi other words, suppose a pair of 
sentences A arid B consist of the same set of words 
and have essentially the same meaning, then sen- 
tence A is more difficult to process than sentence 
1~ if SC(A)>SC(B). For example, the proposed 
detinition of structural complexity correctly pre- 
*On lea,re Dora the University of Manitoba., Win- 
nipeg, M~mitoba, (\]~tnmla. This rt,.se,Lrch has 1)een sup- 
ported by NSli',II.(\] ltcsearch (',rant OG1)121338. The 
author is very gr~teful to the reviewers who pointed 
oat several mistakes in the draft. 
dicts that (la) ix much more difficult to process 
than (lb). 
'I'll(: notiou of structural complexity proposed 
in this l)apc'r oilers explanations \['or a set of seem- 
iugly unrelated phenomena: 
• We will show dlat the definition of structural 
comph:xity explains why a I)utch sentence in- 
volving cross-serial dependencies is sliglrdy 
easier to underst~md than a corresponding 
cenl, er-embedded German sentence. 
• We will also show that extrapositions, such as 
heavy-NP shift and PP extractions are moti- 
vated by reducing syntactic complexity. The 
extraposition of an element is only warranted 
when the. structural COml)lexity of the sen- 
I.en(:e is reduced as a result. 
• NP ntodifiers of a head tend to be closer to 
the head than its PP modifiers, which in turn 
tend to be closer than its CP (clausal) modi- 
tiers. In Generalized Phrase Strcuture Gram- 
mar ((~VS(~) (Gazd~u" ctal., 1985), these lin- 
ear order constraints are stated explicitly in 
the gralIllnar. The notion of structured com- 
plexity provides an explanatory account. 
'l'here are several reasons why the notion of 
strHCtltl:a\] COml)lexity ix tlseful. Firstly, in nat- 
ural language generation, a generator should get> 
era.re the simphest sentence that conveys the in- 
tended meanings. Structural complexity can be 
used to choose l;he syntactic strnctures with l;he 
lowest structural complexity so that the resulting 
sentence is easier to understand than other alter- 
natives. 
Secondly, structural complexity is also needed 
in assessing the readability of dommtents. \[t is 
well known that the length of a sentence is not, a 
relit~ble indicator of its readability. Yet, the read- 
ability of texts has up to now heen measured by 
tlJe lengths of sentences and familiarities of th(: 
words in the documents. Using structural com- 
plexity instead of sentence length allows the read- 
~fl)ility of documents to be measured tnorc accu- 
rately. 
Finally, we propose, in Section 4, that extrapo- 
sitions ~re rnotiw~ted by reduction of structural 
729 
complexity. In other words, extrapositions are 
only allowed if the structural complexity of tile 
sentence is reduced as a result. This constraint 
is nsefnl both in parsing sentences with extrapo- 
sitions and in deciding where to use extraposition 
during generation. 
The notion of structural complexity is defined in 
Section 2. We then justify the definition of struc- 
tural complexity by demonstrating in Sections 3, 
4, and 5 that sentences with lower structural com- 
plexity are easier to understand than otherwise 
similar sentences with higher structural complex- 
ity. 
2 Structural Complexity 
The definition of structural complexity presumes 
the notion of dependency relationships between 
words in a sentence. In dependency grammars 
(Hudson, 1984; Mel'Suk, 1987), a dependency re- 
lationship is a primitive relationship between two 
words, called the head and the modifier. In 
constituency grammars that contain the X-bar 
theory as a component, dependency relationships 
between words are implicitly specified in X-bar 
structures. The modifiers of a word w are the head 
words of the specifier, complements, and adjuncts 
of w. For example, Figure 1 is the X-bar struc- 
ture of (2). The word "will" has two modifiers: 
the head word ()fits NP specifier ("Kinf') and the 
head word of its VP complement ("bring"). The 
dependency relationships in tim X-bar structure 
in l!'igure 1 are shown in Figure 2. Each directed 
link in Fignre 2 represents a dependency relation- 
ship with the direction going from the head to the 
modifier. 
(2) Kim will bring the wine in the evening. 
/% 
" /% 
Kim will 
v 
bring 
DET l' 
the N 
wine 
PP 
in P //~ 
DET I' 
the N 
Figure 1: X-bar structures of (2) 
evening 
3 
1 1 2 
Kim will bring the wine in the evening 
Figure 2: t)ependency structure of (2) 
In order to recognize the structure of a sentence, 
a parser must establish the dependency links be- 
tween the words in the sentence. Structural com- 
plexity measures how easy or di\[\[icnlt it is to es- 
tablish these dependency links. The definition of 
structural complexity is based on the assumption 
that the shorter dependency links are easier to es-- 
tablish than longer ones, where the length of a 
dependency link is one plns the nmnber of words 
between the head and the moditier. I:or e.xample, 
the lengths of tile links in Figure 2 are shown by 
the numbers attached to the dependency links. 
Definition 2.1 (Struetural Complexity) 
The slructural complexity of a dependency struc- 
lure is the total length of the dependency links in 
the structure. 
For example the structural complexity of the de- 
pendency structure in Figure 2 is 11. 
\[n the next three sections, we will show that 
the definition of structural comph'.xity does i,- 
deed retlect the difficulty in processing a sentence. 
We will present examples in which sentences with 
lower structural complexities are easier to process 
than similar sentences with higher structural com- 
ph;xities. 
3 Center embedding 
The difficulty in processing center embedding sen- 
ten('es: such as (13), hgs been explained by its 
requirement on the size of tile stack in a parsb, r. 
This explanation presumes that the human parser 
uses a push-down stack to store the partially built 
constituents. 'l'he notion of structural complex- 
ity provides an explanation of the difficulty of 
processing center embedding that makes much 
weaker commitment to the parsing model. Fig- 
ure 3 shows the lengths of the dependency links 
in a center-embedding sentence (la) and a non- 
center-embedding sentence (lb) with similar se- 
mantics. The structural complexity of the center- 
embedding sentence is 30, which is much higher 
than the structurM complexity (=112) of the non- 
center-embedding sentence. 
The presumption that human sentence pro- 
cessor uses a push-down sl;ack is challenged by 
the contrast between cross-serial dependencies in 
Dutch (e.g., Figure 4a) and center-embedding sen- 
tences in German (e.g., Figm:e 4b.) 
Since the cross serial dependencies are much 
more ditficnlt to handle with push-down stacks 
730 
n; 
I" F7 Lr ill 
1~1 r-?l?-I u\] 
The cat that the dog that the man bought chased died 
The man bought the dog that chased the cat that died 
Figure 3: (\]ent,er-l,\]lnlmdding vs. Not-(,elt.e.l- 
(hire th;m their English couvd,erpm'l;s" (p. 249). 
'I'his is also consistent with the sl.ru(;1;ural (:()lll- 
Iflexity account, since the structur3l comtJexity of 
I:igurc 4c is 9, whic.h is signili('nntly lower t:hmi its 
Dutch a, ml (;c;rman connterl)iU'l,s (l"igure 43 an(l 
4l))i 
4 Extrapositions 
l,\]xl,ra, l)osil:ion I'e\[~l'S l,O i,he lllOVetllell\[, Of ~Ill ele 
inenl; \['roill il,s ,ov'nml i)ositio\]l t,o a I)osii,iotl 3t, or 
lle3r Iihe end of l,tle senl,ellC(', l'\]xa\[I\]l)l(ts o\[ extr~t. 
1)osition in I",nglish in,:;lu(h< 
l!hnlw.(hling Sentences ....... Ihmvy-NP shift 
De manne hebben Hans de paarden leren voeren 
The man have Hans the horses teach feed 
a. 1)utch: cross serial dependency, shuctuml complexity= 13 
Die Maenner haben Hans die Pferde fuettern gelehrt 
The men have Hans the horses feed teach 
b. Gennan: center embedding, structural complcxily=14 
The men taught Hans to feed the horses 
c. English: right branching, slmclural complexity=9 
I"igure 4: cross s(;ri31 dependency vs. ceni.er 
(3) a..Ioc sent \]12(; Izook he found in I)a.ris /o 
his pal 
b. Joe sent :o his pal 
I, he book lie found in I)3ris 
F, xtral)oS(~.d relative 
(4) a. A m3n I;h31; no one knew slood 'up 
h. A nr, m slood "up tihal\] I10 Olle klleW 
I~P-(,,xtral)osition 
(5) a. I Feaid a. dcscril)tion 
of llockncy's lal,csi, l>ic(.ur(~ !lcslcrday 
b. 1 read a descril)t,ion ycslerday 
of IIo(k I(y s hi test F, icl,ure 
lgxl;ra('l;ion from AP 
(6) a.. \[low cerl,ain that the, Me~s will wi, arc 
:q o 'u '? 
h. I Iow certa iu a'rc '!lO~t 
tlmt the Mets will wi.? 
Me('hanislll for constraining exliraposiiiiOll iS Ill'- 
gently ,ee(h'xl i, both parsing 3nd g~encr3l.ioll. '1'0 
et),lm(l(ling ........... vs. right.-i~ra.nchillg \[ihc I)CStl of the ;ulthor's I~nowhe(lge, noue of (.Ira 
than nested dependeucies, the hypothesis th3t hu- 
m3n parsex uses a, push-down sta,'k would pre- 
dict th3t the I)utch sentences sttch as Figure 43 
shotlld be much lilt)re ditli(:ult (IC) underst3nd than 
the correslmnding (.lerllt3II setlLeil('es with IleSl;ec\[ 
(lepende~ncies (Figure 4b). Itowever, da.ta from 
psycho-linguistic experinlenl;s suggest (;h3t the, y 
a,re in fact slightly easier to proce.ss than the cor- 
responding (;erm3n senl;ent-cs with nested de, pen- 
dencics (B3ch et 31., 1986). This obserw~,tion can 
be 3('counted \[or l)y structur31 complexity, since 
the sl;ru('tur31 comt)lcxity of the I)ut,ch sentence 
(Figure 4&) is 13, which is slightly lower tha,n Lhc 
structur31 complexity (=14) of the correspollding 
(~ernl3II senl.(:tlce I:'igurc 411. It was 31so el)served 
in (Bach et el., 1!)86) that "For someone with (weu 
3 limited competence in English 3nd either of the 
other langu3ges, the p3tterns in l)utch and Ger- 
m3n seem to be more difficult to process 3nd pro- 
I)road coverage parse.rs or ~ener3l,ors h3\[IdIes ex- 
~P \[~ \[~Pt) ()$1 ~ I ()\] ~ 1 ~ \] ~I I~rincil4e.d fashioh. The reason 
('OF this iS that exlii'a,i)osil;iOllS 3Ill)ear tO be depen- 
dent upon ccrt3in ast)ects of (:ontcxts thi{t 3re not 
cN)l;ured by usual synt~wtic fe3t;ures. For exam- 
t>le, compare the following 1)3Jr of sentences 
(7) a, I (,alked wi(;h a, m3n yesh:rda,!l 
with a must3che 
b.*l l,alked with a, ma, n one year and fo'~ur 
'monihs ago wiiJi a tnjhsbw.hq 
The syuta, ctic struct,,,'es of (7a)3,,,I (7B)are th,, 
s3me, which is shown in I:igure 5, except Iiha,\[i t)he 
3dwa:bial phrase Advl ) is "yesterday" in (7a) a, ml 
"one year a,.d four months 3go" in (71)). Although 
the two adw;rt)i31 phra,ses m'c two different stri.gs, 
ILihey a, re identical in their syntactic (~/.lltll'{~s I Yet., 
extr31)osition is good in (73) but b3d in (7b). 
We propose~ th3t the lmrposc of extr31)osition is 
to m3ke 3 sentence easier to mlderst3nd. There- 
|'ore, ext, r3posil&m is only allowed when the struc- 
tural comph~xity of l;he S(Hll;ellCe is reduced 3s a re- 
still;. Note 1,}131, reduction of structuraJ c()mtJexity 
is not l, he only const, r3int on cxtr3position. '\['here 
731 
jlp~ 
NP ~~__ 
l PP 
V PP with mustache 
talked 
to a man 
Figure 5: Parse tree of (7a) and (7b) 
are also syntactic constrains such as Right Roof 
Condition (Ross, 1967) or Complement Principle 
(Rochemont and Culicover, 1990). 
When a phrase is extraposed, the set of depen- 
dency relationships remains the same. However, 
the lengths of some of the dependency links will 
change. The structural complexity of the sentence 
may change as a result. Figure 6 illustrates how 
extrapositions atlhct the lengths of dependency 
links is, (3), (4), (5), and (6). Only the depen- 
dency links whose lengths are changed are shown 
there. In all cases, structural complexity is re- 
duced by the extraposition. 
Consider the difference between (7a) and (7b). 
In (7a), the extraposition of \[pp with a mustache\] 
increases the length of the dependency link be- 
tween "man" and "with" by 1, but reduces the 
length of the dependency between "talked" and 
"yesterday" by 3. Therefore, the structural com- 
plexity is reduced by 2 as a result of the extrapo- 
sit,on. In contrast, in (Tb), the extraposition of 
\[pp with a mustache\] increases tile length of the 
dependency link between "man" and "with" by 6 
and reduces the length of the dependency link be- 
tween "talk" and "ago" by 3. Thus the structural 
complexity is increased when \[Pe with a mustache\] 
is extraposed. 
The hypothesis that extraposition must reduce 
the structural complexity also explains why in 
heavy-NP shift, the extraposed NP must be heavy, 
i.e., consisting of many words. When the comple- 
ment Nil ) of a verb is 'shifted' to the right across an 
adjunct modifier of the verb, the length of the de- 
pendency link from the verb to tile head of the NP 
is increased by length the adjunct modifier. On 
the other hand, the length of the dependency link 
fi'om the verb to the adjunct modifier is reduced 
by the length of the NP. Theretbre, the structural 
complexity of the sentence can only be reduced as 
a result of the extraposition when the NP is longer 
than the adjunct modifier, 
Joe sent the book he found in Paris to his pal 
Joe sent to his pal the book he found in Paris 
(a) Heavy-NP shift, SC reduction = (7+2)-(5+1) = 3 
A man that no one knew stood up 
A man stood up that no one knew 
(b) Exuaposcd relative clause, SC reduction=(5+l)-(3+l)=2 
I 7 ~ } 
I read a description of Hockney's latest picture yesterday 
I read a description yesterday of Hockney's latest picture 
(c) PP-extraposition, SC reduction=(7+l)-(4+2)=2 
How certain that the Mats will win are you 
How certain are you that the Mats will win 
(d) Extraction fl'om AP, SC reduction=(6+ 1 )-(3+ 1)=3 
Figure 6: Extraposition must reduce structural 
complexity 
5 Linear Precedence 
In most languages, the NP modifiers of a word 
tend to be <;loser to the word than it, s PP rood,- 
tiers, which, in turn, tend to be closer to the word 
than its CP (clansal) modifiers. In GPSG (Gaz- 
dar et al., 1985), these lineal: order constraints are 
stated explicitly as the linear precedence rules. In 
this section, we show thai; the linear precedence 
rules in GPSG can be derived fl'om the assump- 
tion that the linear order among different types 
of modifying phrases, such as NP, PP, and CP, 
should minimize the structural complexity so that 
the sentence is as easy to process as possible. 
Suppose a word w has n modifiers XP:I, XP~, 
..., XP, ; the number of words in XPi is li; mM the 
head word of XPi is wi, which is the pi'th word 
in XPi. Without loss of generality, let us assume 
that w precedes its modifiers. \[f the order of the 
modifiers is XP1, XPu, ..., XP,;, then the length 
of the dependency link between w and the head 
of XPi is (Pi + 2j-:tl lj) and the total length of 
dependency links within tile maximal projection 
732 
of w is: 
-,~ \-,i- t lj \] )~,~i=l (Pi -1- L,j=I 
/ 
=: (. :-l)lt + (,,.- U)l  + ...+l .... , + 
Among all \])ert\]lltt;al;iofls Of XP I , Nit2, ..., Xl;' .... 
t;hc al>ove sum is tit<'+ minitnal when 11 <_ 12 <_ ... < 
In. In ol, her words, the total h',ngt;h of dCl)cnden<:y 
links is minilnal when tit<'+ modifiers with I'ewer 
words are <:loser to the h(;ad. (Icnerally spealdng, 
PPs contain more wor<ls than NPs and Cl)s con: 
rain more words than l.)Ps. Therefor('., Nt ) mod- 
i\[iers shouhl b<; closer 1,o t, he \]mad word t, han 1)1 > 
moditiers and t)l ) modifiers shoul<l be closer Lo 
t;h(" head word t.han CI ) mo(li\[i(ws if l;hc sl;ru<:l, ural 
comph'.xity of tim ma.xinml pl:ojec@tu of the. \]mad 
word w is to be minimized. 
6 Discussion 
We used the total length of the dci>endency links 
in the definition of structural complexil,y. The ex- 
amples ltresented in the previous sect, ions are also 
consistent w:ith a definition l;hat uses the inaxi- 
mum length of structural links. The reasotL we 
choose to use the stnrt is that the definition natu- 
rally incorporate the length into consi<lcration. 
'I'hc arguments presented in previous s(;<:l;ions 
arc preliminary. Our riga,re work inchldc backing 
up th('~ hypothesis with (;mph'ical cvidc.n<:e and in- 
vest;igate the application of structtn'al complcxil,y 
in handling extraposition in parsing and genera- 
lion. 
7 Conclusion 
We have proposed a notion of stru<'tural cotuph',x- 
ity+ A senten<'e with higher st, ructural cOral>It+x- 
ity is more dif\[icult I,o process than a similar set> 
tences with lower structural complexity. Struc- 
tural coruph'+xity is needed in both l>arsing a.nd 
general;ion. \[t can also be used to a.sscss tl,e read- 
ability of <locument;s. W<: supl>ort l, hc d('\[init;ion of 
stru<:t, ural ('.omph~xity wil.h a sel, of se<>mingly un- 
relate<| phenomena: the contrast hcl, wcen center-- 
embedding and right-branching sentences, ('.xt;ra- 
i>ositions, and the linear order among modifying 
l>hrases. \[in all of these cases, sentcnc(;s with lower 
structural complexity arc easier to mtdcrstand. 
l{i<:lmr<l lludson. 1984. Word Crammar. Basil 
l~lackw(;ll l)ublishers \[fruit;cal., Oxfor(1, Eng- 
land. 
Igor A. Mcl'Suk. 1987. l)cpc.ndency .synlax: the- 
ory and practice. Sl;al;c Univ<;rsity o\[' New York 
Press, Albany. 
Michael S. ILochcmont and P('~t(',r W. (\]ulicovcr. 
1990. l','nglish I"ocu,~ Constructions a'n.d the 
Theory of Grammar. Camblfidge Studies in Lin- 
guistics. Cambridge Univcrsit, y Press. 
J. I(,()ss. 1967. Constraints on. variables in synla,. 
I'h.I). thesis, M.\[.'I'., Canal>ridge, MA. 
References 
E. Bach, C. l{rown, an<\[ W. Marslen-Wilson. 
1986. (5 ossed and nested d(;pemlencies in (,e~- 
ma.n and I)utch: A psycholinguistic, stmdy. Lan- 
guage and Cognilive l>rocesscs, 1(4):249 262. 
" ( ( + Gerald Gtzlat, Ewan Klein, G',oflcly l'ulhm:h 
and Ivan Sag. 1985. (~cneralizcd Ph.r'aisc ,%'lruc- 
ture Grammar. Basil Blackwell Publisher Lt,(\[, 
Oxibrd, UK. 
733 
