Isolating Cross-linguistic Parsing Complexity with a 
Principles-and-Parameters Parser: 
A Case Study of Japanese and English * 
Sandiway Fong & Robert C. Berwick 
NEC Research Institute, 4 Independence Way, Princeton, NJ, 08540 USA, sandiway@research.nec.com
Rm 838, MIT AI Laboratory, 545 Technology Sq., Cambridge, MA 02139, berwick@ai.mit.edu
1 Introduction 
As parsing models and linguistic theories have broadened to
encompass a wider range of non-English languages, a particularly
useful "stress test" is to build a single theory/parser pair that
can work for multiple languages, in the best case with minor
variation, perhaps restricted to the lexicon. This paper reports on
the results of just such a test applied to a fully operational
(Prolog) implementation of a so-called principles-and-parameters
model of syntax, for the case of Japanese and English. This paper
has two basic aims: (1) to show how an implemented model for an
entire principles-and-parameters model (essentially all of the
linguistic theory in Lasnik & Uriagereka (1988); see figure 2 for a
computer snapshot) leads directly to both a parser for multiple
languages and a useful "computational linguistics workbench" in
which one can easily experiment with alternative linguistic
theoretical formulations of grammatical principles as well as
alternative computational strategies; (2) to use this system to
uncover sources of parsing complexity in Japanese as opposed to
English. In particular, we examine the "null hypothesis" that a
single parsing design suffices for efficient processing of both
Head-first and Head-final languages, in contrast to approaches that
posit, e.g., a right-corner or other mirror-image strategy for
parsing Japanese as compared to English (e.g., BUP; Mazuka (1990)).
In this case we can confirm computationally and precisely, in
accordance with much current psycholinguistic work (Frazier and
Rayner (1988); Inoue and J.D. Fodor (1991); Nagai (1991)), that it
is not the Head-final character of Japanese that results in
processing difficulty so much as the possibility of scrambling and
free deletion of NPs (so-called "super Pro Drop"). We do this by
empirically investigating the effects of 3 possible "optimizations"
of the parsing system for Japanese: (1) the use of right-context
information via automatic source transformations, using a
programming language compiler technique to introduce dummy
nonterminals and corresponding semantic actions; (2) modification
of the Japanese grammar to put the specifier of CP (= S-bar) on the
right and so eliminate unnecessary center-embedding; and (3)
elimination of scrambling and NP drop to isolate the separate
effects of Head-final (e.g., Verb-final) phrase structure in
Japanese.

* This research has been supported by NSF Grant DCR85552543 under a
Presidential Young Investigator Award to Professor Robert C.
Berwick, and a grant from the Kapor Family Foundation. We would
like to thank Howard Lasnik, Alec Marantz, Shigeru Miyagawa, David
Pesetsky, and Mamoru Saito for valuable discussions and valiant
attempts to tell us about Japanese.
By explicit construction, the implementation demonstrates that it
is possible to build an efficient principle-and-parameters parser
for multiple languages, using 25 principles that are expressed in a
language quite close in form to that of the original linguistic
theory. The English-Japanese differences handled include the basic
Subject-Object-Verb (SOV) order of Japanese; free "scrambling" of
Japanese noun phrases; topic-comment structure; nonappearance of
noun phrases that are discourse recoverable; and lack of wh-word
movement in Japanese questions. No rule reprogramming is required
to accommodate these differences, only changes to 4 binary switches
and a minimally distinct lexicon with different thematic grids in
some cases. The parser couples several already-known parsing design
strategies to obtain efficient parsing times, e.g., type-checking;
multiple-entry canonical LR(1) parsing; and automatic
(source-to-source) grammar transformations. 1
2 Principle-based parsing
In a principle-based parser, construction- and language-specific
rules are replaced with broader principles that remain invariant
aside from parametric variation (see below). The parser works by a
(partially interleaved) generate-and-test technique that uses a
canonical LR(1) covering grammar (derived from X-bar theory plus the
theory of movement) to first build an initial set of tree
structures; these structures are then run through a series of
predicates whose conjunction defines the remainder of the
constraints the (sentence, phrase structure, LF) triple must
satisfy. This is done using familiar machinery from Prolog to
output LFs that satisfy all the declarative constraints of the
linguistic theory. In practice, a straightforward generate-and-test
mechanism is grossly inefficient, since the principles that apply
at the level of surface structure (S-structure) are but a fraction
of those that apply in the overall system. The usual problems of
lexical and structural ambiguity and the underconstrained nature of
the initial X-bar system mean that the number of possible
S-structures to hypothesize may be huge. To obtain an efficient
parser we use a full multiple-entry table with backtracking (as in
Tomita, 1986), extending it to a canonical LR(1) parser. The LR
machine uses an automatically-built S-structure grammar that folds
together enough of the constraints from other principles,
parameters, and lexical subcategory information offline to produce
a 25-fold improvement over the online phrase structure recovery
procedure originally proposed by Fong and Berwick (1989).
Optimizations include extra conditions in action classes to permit
interleaving of other principles (like movement) with
structure-building (the 'interleaving' noted by principles marked
'I' in the snapshot in figure 2 below); control structure
flexibility in principle ordering; precomputation of the LR
transition function; elimination of infinite recursion of empty
elements by an additional stack mechanism; and so forth. We exploit
the explicit modularity of the principle-based system in a way that
is impossible in an ordinary rule-based system: we can build a
grammar for phrase structure that is small enough to make full,
canonical LR(1) parsing usable, unlike large CFGs. The earlier
error detection of full LR(1) parsing over LALR methods means that
we fail as early as possible, to avoid expensive tree constructions
that can never participate in final solutions. 2

1 To the best of our knowledge, this system is the first and
broadest-coverage of its type to be able to parse Japanese and
English by setting just a few parameter switches. Dorr (1987),
under the supervision of the second author, developed a
conceptually similar scheme to handle English, Spanish, and German.
However, Dorr's system did not have the same broad coverage of
English; did not handle Japanese; used hand rather than automatic
compiling; and was approximately 15 times slower. Gunji's (1987)
Japanese unification grammar comes closest to the principle-based
model, but requires hand-modification from a set of core principles
and does not really accommodate the important Japanese phenomenon
of scrambling; see below. Other such systems work only on much
smaller parts of English, e.g., Sharp (1985); Wehrli (1987);
Crocker (1989); Cortes (1988); Johnson (1989); or are not in fact
parsers, but proof-checkers, e.g., Stabler (1991, forthcoming).

Actes de COLING-92, Nantes, 23-28 août 1992   631   Proc. of COLING-92, Nantes, Aug. 23-28, 1992
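
The generate-and-test scheme described in this section can be
illustrated with a minimal sketch. It is written in Python rather
than the Prolog of the actual system, and everything in it is a
hypothetical stand-in: the candidate generator simply tags each word
as head ("H") or non-head ("N"), and the two toy "principles"
(exactly_one_head, head_is_last) are invented for illustration. No
LR machine and no interleaving are modelled; the point is only that
a parse is a candidate that survives the conjunction of all filters.

```python
from itertools import product
from typing import Callable, Iterator, Sequence, Tuple

# A candidate "structure": each word paired with a tag (toy stand-in for a tree).
Candidate = Tuple[Tuple[str, str], ...]
Principle = Callable[[Candidate], bool]


def generate(words: Sequence[str]) -> Iterator[Candidate]:
    """Overgenerate: every way of tagging each word as head 'H' or non-head 'N'."""
    for tags in product("HN", repeat=len(words)):
        yield tuple(zip(words, tags))


def exactly_one_head(candidate: Candidate) -> bool:
    """Hypothetical principle: a phrase has exactly one head."""
    return sum(1 for _, tag in candidate if tag == "H") == 1


def head_is_last(candidate: Candidate) -> bool:
    """Hypothetical principle for a Head-final setting: the last word is the head."""
    return candidate[-1][1] == "H"


def parse(words: Sequence[str], principles: Sequence[Principle]) -> Iterator[Candidate]:
    """Yield the candidates that satisfy the conjunction of all principles."""
    for candidate in generate(words):
        # all() stops at the first failing principle, so cheap filters placed
        # early prune candidates before later ones are ever applied.
        if all(principle(candidate) for principle in principles):
            yield candidate


analyses = list(parse(["Taro-ga", "hon-o", "katta"], [exactly_one_head, head_is_last]))
print(analyses)
```

Of the eight candidates generated for the three-word input, only the
one that tags katta as the sole head survives both filters.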
3 The Japanese parser 
We begin with a very simple parameterization of Japanese that will
nonetheless be able to cover all the Lasnik and Saito wh-questions,
scrambling, and so forth; see the table that follows the example
sentences. The important point is that very little additional must
be said in order to parse a wide variety of distinctive Japanese
sentences; the principles as shown on the righthand side of the
computer snapshot do not change. 3

Consider first the example wh-movement sentences found in the
linguistics paper On the Nature of Proper Government by Lasnik &
Saito (1984). 4 These sentences (listed below) display many familiar
typological Japanese-English differences, and cover a rather
sophisticated set of differences between English and Japanese: for
instance, why (6) is fine in Japanese but not in English; free
omission of NPs; "scrambling" of subjects and objects; Verb-final
(more generally, Head-final) constituent structure; and no overt
movement of wh-phrases. We also consider a different set of Japanese
sentences (also listed below) designed to illustrate a range of the
same phenomena, taken from Hosokawa (1990). We stress that these
sentences are designed to illustrate a range of sentence
distinctions in Japanese, as well as our investigative method,
rather than serve as any complete list of syntactic differences
between the two languages (since they are obviously not). 5

2 To provide a rough measure of the machine size for the phrase
structure grammar of S-structure for both English and Japanese, the
augmented CFG consists of about 74 productions derived from a
schema of 30-34 rules. The resulting characteristic finite state
automaton (CFSM) consists of 123 states with 550 transitions between
the various states. The action table consists of a total of 984
individual (nonerror) entries.

3 We will scramble only from direct object positions here, even
though it is straightforward to scramble from indirect object
positions. Informally, we have noted that scrambling from IO
greatly increases computation time. A tighter set of constraints on
scrambling seems called for.

4 Average best parsing time for the Japanese sentences shown is
0.37 secs/word on a Symbolics 3650 (64K LIPS) (σ = 1.52 sec,
n = 100). Parsing time on a Sun Sparcstation 2 is approximately an
order of magnitude faster.

5 E.g., the double-o constraint; case-overwriting, passive and
causative constructions, etc. all remain to be fully implemented.

[Lasnik & Saito (1984)]
(2) Watashi-wa Taro-ga nani-o katta ka shitte iru
    'I know what John bought'
(6) Kimi-wa dare-ni Taro-ga naze kubi-ni natta tte itta no
    'To whom did you say that John was fired why'
(32) *Meari-wa Taro-ga nani-o katta ka do ka shiranai
    'Mary does not know whether or not John bought what'
(37a) Taro-wa naze kubi-ni natta no
    'Why was John fired'
(37b) Biru-wa Taro-ga naze kubi-ni natta tte itta no
    'Why did Bill say that John was fired'
(39a) Taro-ga nani-o te-ni ireta koto-o sonnani okotteru no
    'What are you so angry about the fact that Taro obtained'
(39b) *Taro-ga naze sore-o te-ni ireta koto-o sonnani okotteru no
    'Why are you so angry about the fact that Taro obtained it'
(41a) Hanoko-ga Taro-ga nani-o te-ni ireta tte itta koto-o sonnani
    okotteru no
    'What are you so angry about the fact that Hanoko said that
    Taro obtained'
(41b) *Hanoko-ga Taro-ga naze sore-o te-ni ireta tte itta koto-o
    sonnani okotteru no
    'Why are you so angry about the fact that Hanoko said that
    Taro obtained it'
(60) Kimi-wa nani-o doko-de katta no
    'Where did you buy what'
(63) Kimi-wa nani-o sagashiteru no
    'Why are you looking for what'

Complement/noncomplement asymmetry, scrambling and unexpected parses

To see how the parser handles one Japanese example (see the actual
computer output in figure 1 or figure 2), consider (39a) (and the
corresponding illicit (39b)), where a complement wh but not a
noncomplement wh can be extracted from a complex NP: (a) Taro-ga
nani-o te-ni ireta koto-o sonnani okotteru no; (b) *Taro-ga naze
sore-o te-ni ireta koto-o 'What/*Why are you so angry about the
fact that Taro obtained'

This example illustrates several Japanese typological differences
with English. The subject of the matrix clause (= you) has been
omitted. Nani ('what') and te ('hand') have been scrambled, with
the direct object (marked -o) now appearing in front of the
indirect object te. Phrase structure is Head final. Our relaxation
of the Case Adjacency parameter and the rule that allows adjunction
of NP to VP, plus transmission of Case to the scrambled NP, will
let this analysis through. The LF for this sentence should be
something along the lines of: for what x, pro is so angry about
[the fact that Taro obtained x]

In this example pro denotes the understood subject of okotteru ("be
angry"). The LFs actually returned by the parser are shown in the
snapshot in figure 1. 6
[Hosokawa (1990)]
(1b) Gengogaku-no gakusei-ga tiizu-o tabeta
    linguistics-gen student-nom cheese-acc eat-past
    'A student of linguistics ate cheese'
(2b) Nagai kami-no gakusei-ga tiizu-o tabeta
    long hair-gen student-nom cheese-acc eat-past
    'A long haired student ate cheese'
(3b) Taro-ga hon-o katta
    John-nom book-acc buy-past
    'John bought a book'
(4b) Taro-ga Hanoko-ni hon-o ageta
    John-nom Mary-dat book-acc give-past
    'John gave Mary a book'
(5b) Taro-ga hon-o table-no ue-ni oita
    John book-acc table-gen top-dat (top of table) put-past
    'John put the book on the table'
(6b) Taro-wa gakkoo-ni itta
    John-top school-dat go-past
    'John went to school'
(15b) Watashi-wa Taro-ga nani-o katta ka shiranai
    I-top John-nom what-acc bought Q know-not
    'I don't know what John bought'
(17b) Taro-wa Chomsky-no Barriers-o yomimashita ka
    John-top Chomsky-gen Barriers-acc read-past Q
    'Did John read Chomsky's Barriers'
(18b) Hanoko-wa Biru-ga Chomsky-no Barriers-o yonda ka do ka
    shiranai
    Mary-top Bill-nom Chomsky-gen Barriers-acc read Q know-not
    'Mary does not know whether or not Bill read Chomsky's ...'
The parametric differences that we need to accommodate
all these differences between English and Japanese are
quite few:
6 We will not have room to describe in detail the derivation of
these LFs. But it should be noted that the derivation sequence is
quite complex. Note, for example, that nani ('what') undergoes
movement at two levels of phrase structure in order to get to the
specifier position of the matrix Complementizer:
[CP nani [IP Taro [NP [CP pro [VP t' [VP t ireta]]] koto] ...]]
Furthermore, the LF trace t' violates the so-called empty category
principle unless it is deleted (as indicated by [] in the
snapshot), under the present theory. The lack of wh-movement at
S-structure in Japanese, and its presence in English, interacts
with these constraints to bar examples like (6) in English; see
Lasnik & Saito.
English and Japanese parameter settings

Parameter          English                        Japanese
Spec, Head order   specFinal :- \+ specInitial.   specFinal :- \+ specInitial.
                   headInitial.                   headFinal.
                   headFinal :- \+ headInitial.   headInitial :- \+ headFinal.
Agreement          agr(weak).                     agr(weak).
Bounding           boundingNode(i2).              boundingNode(i2).
                   boundingNode(np).              boundingNode(np).
Case Adjacency     caseAdjacency.                 :- no caseAdjacency.
Wh in Syntax       whInSyntax.                    :- no whInSyntax.
Pro-Drop           :- no proDrop.                 proDrop.
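
To make the role of the switches concrete, here is a minimal sketch
of the settings as plain data. It is Python, not the Prolog of the
actual system, and the class and field names (Parameters,
head_initial, and so on) are hypothetical. As a simplification, head
order is collapsed into a single boolean whose negation gives the
opposite order, mirroring the negation-as-failure clause
"headFinal :- \+ headInitial."; agreement and bounding nodes are
omitted because the table lists the same values for both languages.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class Parameters:
    """Hypothetical container for the binary switches that differ in the table."""
    head_initial: bool
    case_adjacency: bool
    wh_in_syntax: bool
    pro_drop: bool

    @property
    def head_final(self) -> bool:
        # Derived, not stored: plays the role of "headFinal :- \+ headInitial."
        return not self.head_initial


ENGLISH = Parameters(head_initial=True, case_adjacency=True,
                     wh_in_syntax=True, pro_drop=False)
JAPANESE = Parameters(head_initial=False, case_adjacency=False,
                      wh_in_syntax=False, pro_drop=True)


def differing_switches(a: Parameters, b: Parameters) -> List[str]:
    """Names of the switches on which two languages disagree, in field order."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


print(differing_switches(ENGLISH, JAPANESE))
print(JAPANESE.head_final)
```

Under these assumptions the two languages differ on exactly four
switches, which is one way to read the "4 binary switches" mentioned
in the introduction.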
As one can see from the figure, the system does correctly recover
the right LF, as the last one in the snapshot. However, it also
(surprisingly) discovers three additional LFs, illustrating the
power of the system to uncover alternative interpretations that a
proper theory of context would have the job of ruling out. Ignoring
indices, they all have the same form: for what x, Taro is so angry
about [the fact that pro obtained x]

Here the embedded subject Taro has been interchanged with the
matrix subject pro. It turns out that the sentence happens to be
ambiguous with respect to the two basic interpretations. 7 For
completeness, here are the three variants that correspond to the
first three LFs reported by the parser. S. Miyagawa (p.c.) informs
us that the last two, given proper context, are in fact possible.
These include: (1) pro is coreferent with koto ("fact"), 8 i.e.,
for what x, Taro is so angry about [the fact that the fact obtained
x]; (2) pro is coreferent with Taro: for what x, Taro is so angry
about [the fact that Taro obtained x]; and (3) pro is free in the
sentence: for what x, Taro is so angry about [the fact that
(someone else) obtained x]. 9
4 Parsing Japanese: the computational
effects of scrambling, pro-drop, and
phrase structure
Next we turn to the investigation of the computational differences
between the two languages that we have explored, and show how to
use the system in an exploratory mode to discover complexity
differences between English and Japanese. In the discussion that
follows, we shall need to draw on comparisons between the
complexity of different parses. While this is a delicate matter,
there are two obvious metrics to use in comparing this parser's
complexity. The first is the total number of principle operations
used to analyze a sentence - the number of S-structures, chain
formations, indexings, the case filter and other constraint
applications, etc. We can treat these individually and as a whole
to give an account of the entire "search space" the parser moves
through to discover analyses. However, this is
7 This was pointed out by D. Pesetsky, and confirmed by M. Saito.
However, presumably the use of wa rather than ga and intonational
pauses could be exploited as a surface cue to rule out more
generally ambiguity in this example and others like it. See Fong
and Berwick (1989) for a discussion of how to integrate surface
cues into the principle-based system.
8 This interpretation can be eliminated by imposing selectional
restrictions on the possible "agents" of okotteru (let us say that
they must be animate).
9 Having a parsing system that can recover all such linguistic
alternatives is of interest in its own right, both to verify and
correct the linguistic theory, as well as ensure that no
possibilities are overlooked by human interpreters.
Figure 1: Computer snapshot from Lasnik & Saito.
[Snapshot of the Principle-and-Parameters Parser run on example
(39a), Taro-ga nani-o te-ni ireta koto-o sonnani okotteru no; it
lists four LFs, followed by "No (more) parses".]
often not a good measure of the total time spent in analysis. The
second measure we use is more particular and precisely tailored to
the specific backtracking-LR design we have built to recover
structural descriptions: we can count the total number of LR
finite-state control steps taken in recovering the S-structure(s)
for a given sentence; indeed, this accounts for the bulk of parsing
time for those cases, as in Japanese and many English sentences,
where multiple quasi-S-structures are returned. Taken together,
these two measures provide both a coarse and a more fine-grained
way of seeing what is hard or easy to compute. 10
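
The two metrics just described amount to two platform-independent
counters. A minimal sketch of such bookkeeping follows; it is
Python, and the class and method names (Metrics, principle, lr_step)
are hypothetical, not taken from the actual system.

```python
from collections import Counter


class Metrics:
    """Hypothetical tally of the two complexity measures described in the text."""

    def __init__(self) -> None:
        self.principle_ops: Counter = Counter()  # coarse measure, per principle
        self.lr_steps: int = 0                   # fine-grained measure

    def principle(self, name: str) -> None:
        """Record one application of the named principle (e.g. the case filter)."""
        self.principle_ops[name] += 1

    def lr_step(self) -> None:
        """Record one LR finite-state control step."""
        self.lr_steps += 1

    def total_principle_ops(self) -> int:
        """Total principle operations, summed over all principles."""
        return sum(self.principle_ops.values())


m = Metrics()
for _ in range(3):
    m.lr_step()
m.principle("case filter")
m.principle("case filter")
m.principle("chain formation")
print(m.lr_steps, m.total_principle_ops(), dict(m.principle_ops))
```

Because both tallies count abstract steps rather than seconds, they
do not change with the machine or compiler used.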
5 Complexity of Japanese parsing 
Given this initial set of analyses, let us now examine 
the complexity of Japanese sentence processing as com- 
pared to English. To do this, we initially examined 
sentences that we thought would highlight the ease of 
Japanese relative to English, namely, the "classic" En- 
glish center-embedded vs. Japanese left-branching con- 
structs from Kuno (1973), e.g., The cheese the rat the 
cat John keeps killed, Taro-ga katte-iru neko-ga korosita
nezumi-ga
On the conventional Chomsky-Miller account, the 
English construction is very difficult to parse, while the 
left-branching Japanese form is completely understandable.
Interestingly, as shown in figure 2, the number of
operations required to complete this parse correctly is 
enormous, as one can see from the righthand column 
numbers that show the structures that are passed into 
and out of each principle module. 
It at first appears that left-branching structures are 
definitely not simpler than the corresponding center- 
embedded examples. Why should this be? On a mod- 
ern analysis such as the one adopted here, recall that 
restrictive relative clauses, e.g. the rat the cat killed, 
are open sentences, and so contain an operator-variable 
structure coindexed with the rat, roughly: 
(1) [NP [NP the rat]_i [CP Op_i the cat killed t_i]]
10 Note that these two are metrics that are stable across
compile-cycles and different platforms. This would not be
true, of course, for simple parse times -- the obvious
alternative.
where the empty operator (Op) is base-generated 
in an A-position and subsequently fronted by Move-α
(Chomsky, 1986:86). 
Thus, the Japanese structures are center-embedded after all--the
parser places a potentially arbitrary string of empty Operators at
the front of the sentence. Perhaps, then, the formal accounts of
why this sentence should be easy are incorrect; it is formally
difficult but easy on other grounds. Of course, alternatively, the
theory or parsing model could be incorrect, or perhaps it is
scrambling, or pro-drop, or the Head-final character of the
language that makes such sentences difficult. In the rest of this
paper we focus on 3 attempts to discover the source of the
complexity.
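
The pile-up of empty operators at the left edge can be seen in a
small sketch that builds the bracketing for stacked Japanese
relative clauses. It is Python and deliberately simplified: the
function names are hypothetical, traces and indices are omitted, and
each relative clause is rendered as [NP [CP Op [IP ...]] N], with
Spec of CP on the left.

```python
from typing import List, Tuple


def embed(subject: str, layers: List[Tuple[str, str]]) -> str:
    """Wrap `subject` in one relative clause per (verb, head noun) layer.

    Layers are given from the innermost clause outward.
    """
    np = subject
    for verb, noun in layers:
        # Head-final: the relative clause precedes the noun it modifies,
        # and its empty operator sits at the clause's left edge.
        np = f"[NP [CP Op [IP {np} {verb}]] {noun}]"
    return np


def operators_before(structure: str, word: str) -> int:
    """Count the empty operators that precede the first overt word."""
    return structure[: structure.index(word)].count("Op")


tree = embed("Taro-ga", [("katte-iru", "neko-ga"), ("korosita", "nezumi-ga")])
print(tree)
print(operators_before(tree, "Taro-ga"))
```

Each added layer contributes one more operator ahead of the first
overt word, so the number of operators the parser must posit before
seeing any lexical material grows with the depth of embedding.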
To investigate these questions, we embarked on a series of
optimization efforts that focused on the Spec positions of CP and
the Head-final character of the language, with the goal of making
the Japanese as easy as, or easier than, the corresponding English
sentences or determining why we could not make it easier. In all,
we conducted three empirical tests: (1) using dummy nonterminals to
"lift" information from the verb to the VP node, to test the
Head-first/final hypothesis; (2) placing Spec of CP on the right
rather than the left, to test the center-embedding hypothesis; and
(3) building a "restricted" pseudo-Japanese that eliminated
scrambling and free pro-drop, while not lifting the information up
and to the left, leaving the Head-final character intact. We will
next cover each computer experiment in turn. Figure 3 gives a
bar-graph summary of the three experimental results in the form of
times improvement (reduction) in LR state creation.
Optimization 1: Head-final information 
Our first optimization centers on the Head-final phrase structure
of Japanese. With Heads at the end, valuable information
(subcategorization, etc.) may be unavailable at the time the parser
is to make a particular decision. However, for our LR machine,
there is a well-known programming language optimization: introduce
dummy nonterminals on the left of a real nonterminal, e.g.,
VP -> X V NP, which, when reduced, call semantic action routines
that can check the input stream for a particular property (say, the
presence of a noun arbitrarily far to the right). Specifically, if
verb
[Snapshot of the Principle-and-Parameters Parser: the parse tree
and LF for the Japanese counterpart of sentence ce4, with the
number of structures passed into and out of each principle module
in the right-hand column.]
Figure 2: The parse of the Japanese counterpart of the English
center-embedded question. Tracing out the left-hand fringe of the
tree, note the string of empty operators, as well as, on the
right-hand column, the large number of parser operations required
to build this single correct LF as compared to English (in the
text). Still, a single parse is correctly returned.
information occurs on the right we can offline "lift" that
information up to the VP node, where it can then influence the LR
state transitions that are made when examining material to the left
of the head. For example, for each V subcategory, the LR machine
will contain in effect a new LR state; the system will add a
command to look as far into the input as needed to determine
whether to branch to this new state or another V subcategory state.
This is precisely the mechanism we used to determine whether to
insert an empty category or not in a Head-first language. For
instance, in Japanese relative clauses this is of importance
because the parser may get valuable information from the verb to
determine whether a preceding NP belongs to that relative clause or
not. The action and transition tables of the resulting Japanese
machine, which we will call "optimized," will be far larger than
its base case counterpart (more precisely: the action table is 3
times larger, or about 380K to 980K, while the transition table is
about twice as large, 72K to 142K).
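
The dummy-nonterminal transformation described in this subsection
can be sketched in a few lines. This is a Python illustration, not
the system's own source-to-source transformer: the rule
representation, the function names, and the toy subcategorization
lexicon (SUBCAT) are all hypothetical. The idea shown is only that
an empty marker placed at the left edge of a rule gives the parser a
reduction point at which an action can scan rightwards in the input.

```python
from typing import Dict, List, Optional, Tuple

Rule = Tuple[str, List[str]]  # (left-hand side, right-hand side symbols)


def add_marker(rule: Rule, marker: str) -> List[Rule]:
    """Rewrite A -> rhs as A -> marker rhs, plus the empty rule marker -> (nothing)."""
    lhs, rhs = rule
    return [(lhs, [marker] + rhs), (marker, [])]


# Hypothetical toy lexicon: verb form -> subcategorization frame.
SUBCAT: Dict[str, str] = {"katta": "transitive", "itta": "intransitive"}


def peek_verb_subcat(tokens: List[str], position: int) -> Optional[str]:
    """Action run when the empty marker is reduced.

    Scans from `position` to the right, as far as needed, and returns the
    frame of the first verb found (None if there is no verb ahead).
    """
    for token in tokens[position:]:
        if token in SUBCAT:
            return SUBCAT[token]
    return None


print(add_marker(("VP", ["NP", "V"]), "X"))
print(peek_verb_subcat(["Taro-ga", "hon-o", "katta"], 1))
```

In the sketch, the Head-final rule VP -> NP V becomes VP -> X NP V
with X empty; reducing X before the object NP is read lets the
action report that a transitive verb lies ahead, which is the kind
of right-context information that the text describes lifting to the
VP node.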
The advantages accrued by this optimization are substantial, 2-10
times better; see the table below. (This also holds across other
sentences; see the bar graph summary at the end of the paper.) The
unoptimized number of LR state transitions grows astonishingly
rapidly. For example, the number of transitions needed to parse ce4
is exactly as shown: over 20 million of them, compared to 1 million
for the optimized version. 11
Sentences:
ce1. The cheese was rotten.
ce2. The cheese the rat ate was rotten.
ce3. The cheese the rat the cat killed ate was rotten.
ce4. The cheese the rat the cat John keeps killed ate was
rotten. (the sentence shown in the snapshot)
(See figure 2 for computer output of the corresponding
Japanese sentence.)
Total number of LR state transitions

Sentence   JP, unopt.    JP, opt.   Times better (E)   Eng.
ce1        232           122        1.9 (6.1)          745
ce2        7122          1518       4.7 (1.6)          2431
ce3        257,042       25,246     10.18 (.19)        4979
ce4        20,360,664    966,114    21.1 (.03)         32101
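
The "times better" figures can be rechecked by simple division. The
sketch below is Python and uses only the ce2-ce4 counts reported in
the table; the interpretation of the parenthesized figure as English
transitions divided by optimized-Japanese transitions is our reading
of the numbers, not something the table states.

```python
from typing import List, Tuple

# (sentence, Japanese unoptimized, Japanese optimized, English) LR transitions.
ROWS: List[Tuple[str, int, int, int]] = [
    ("ce2", 7_122, 1_518, 2_431),
    ("ce3", 257_042, 25_246, 4_979),
    ("ce4", 20_360_664, 966_114, 32_101),
]


def ratios(unopt: int, opt: int, eng: int) -> Tuple[float, float]:
    """Return (unoptimized / optimized, English / optimized)."""
    return unopt / opt, eng / opt


for name, unopt, opt, eng in ROWS:
    speedup, vs_english = ratios(unopt, opt, eng)
    print(f"{name}: {speedup:.2f} times fewer transitions when optimized; "
          f"English/optimized = {vs_english:.2f}")
```

The improvement factor grows with embedding depth, while the ratio
to English shrinks: even the optimized Japanese parse of ce4 takes
roughly thirty times as many transitions as the English one.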
The same basic trend also holds, though not as 
strongly, when we look at these and other sentences 
in terms of the total number of principle operations re- 
quired; while we do not have space to review all of these 
here, as an example, sentence (15b) takes 4126 opera- 
tions in the base case, and 455 when optimized in this 
fashion; while ce3 takes 1280 operations and 667 when 
optimized, respectively. 
11 We should point out that in all cases, about two-thirds
of these transitions occur before the LR machine reaches a
point in the search space where the solutions are "clustered"
enough that the remaining solutions do not take so much
effort.
Optimization 2: Spec of CP on the right 
A second obvious strategy is to remove the center-embedding itself.
Here there is a grammatical move we can make. Evidently, in
Japanese the only elements that appear in Spec of CP are put there
by LF movement. Thus, these elements can never be visible in this
position on the surface. If this is so, then there is really
nothing to prevent us from placing just the Spec of CP on the
right, rather than the left. This is an example of the "testbed"
property of the system; this change takes two lines of code. Given
this change, the resulting structures will have their Operators on
the right, rather than the left, and will not be center-embedded.
In addition, in this test the parser will not take advantage of
right-hand information, thus eliminating this as a possible source
of speedup.

Parsing complexity is reduced by this move, by a factor of just
about one-half, if one considers either LR state transitions or
principle operations; not as good as the first optimization; see
below for some representative results. Also, with the most deeply
center-embedded sentence the total number of principle operations
actually is worse than in the base case. Evidently we have not
located the source of the parser's problems in center-embedding
alone.
Complexity for Spec on the right

Sentence   LR trans.     Total ops
ce1        122           32
ce2        4930          97
ce3        209,980       721
ce4        16,290,667    12605
Optimization 3: Factoring out the effects of
scrambling and pro-drop
While it appears that Head-final information helps the most, we
must also remember that part of the complexity of Japanese is the
result of free scrambling and pro-drop. To factor apart these
effects, we ran a series of computer experiments on a
quasi-Japanese grammar, J*, that was just like Japanese except
scrambling and pro-drop were barred. The changes were again simple
to make: one change was automatic, just turning off a parameter
value, while the second involved 3 lines of hand-coding in the
X-bar schemas to force the system to look for a lexical NP in DO
(and IO) positions. Further, we did not optimize for right-hand
information (so that the Head-final character was left intact). Of
course, we now can no longer parse sentences with scrambled direct
objects.
The table below shows the results. This was the best optimization
of all. Without scrambling, and hence no movement at all compared
to English, the Head-final quasi-Japanese was for the most part
parsed 5-10 times more efficiently than English, and at worst (for
the triply-embedded sentence) with three times fewer LR transitions
and only about 30% more principle operations than English. Thus,
this was even more efficient than the righthand information
optimized Japanese parser. (The first column gives the number of LR
transitions and the second gives the total number of principle
operations for this "no scramble/drop" version, while the last two
columns give the same information for English.)
No scrambling/drop vs. English

Sentence   LR trans.     No. ops   Eng. LR   Eng. ops
ce1        103           32        745       109
ce2        274           88        2431      168
ce3        1241          445       4979      558
ce4        [illegible]   3719      21,074    2874
As before, with a short sentence, there is little difference
between optimization methods, but over a range of sentences and
with longer sentences, the no-scramble or pro-drop optimization
works better than any other. Evidently, given the framework of
assumptions we have made, the Head-final character of Japanese does
not hurt the most; rather, it is scrambling and pro-drop that do,
since if we remove these latter two effects we get the biggest
improvement in parsing efficiency. We can confirm this by looking
at the LR transitions for the other sentences (1b)-(18b) across
methods, summarizing our tests. We can summarize the three
experiments across sentences in figure 3.
Summary of complexity across tests

Sentence   No opt.       Opt.     Spec-Final   No Scramble/drop
1b         [illegible]   730      602          216
2b         [illegible]   790      957          298
3b         [illegible]   289      185          103
4b         [illegible]   422      307*         149
5b         [illegible]   1051     878*         370
6b         [illegible]   377      267          138
15b        [...]955      19,998   11,205       1681
17b        [illegible]   1789     685          272
18b        [...],036     84,727   43,745       5306
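
The claim that removing scrambling and pro-drop gives the largest gain can be checked row by row. The Python sketch below is illustrative only: the counts and row labels are as read from the summary table, the unoptimized column is omitted because it is not legible, and the names `LR_COUNTS`, `METHODS`, `best_method`, and `gain_over_opt` are ours.

```python
# LR transitions per sentence, as read from the summary table:
# (right-hand optimization, Spec-final, no scrambling/pro-drop).
# The starred Spec-final entries (4b, 5b) are entered without their asterisks.
LR_COUNTS = {
    "1b": (730, 602, 216),
    "2b": (790, 957, 298),
    "3b": (289, 185, 103),
    "4b": (422, 307, 149),
    "5b": (1051, 878, 370),
    "6b": (377, 267, 138),
    "15b": (19998, 11205, 1681),
    "17b": (1789, 685, 272),
    "18b": (84727, 43745, 5306),
}
METHODS = ("Opt.", "Spec-Final", "No scramble/drop")


def best_method(counts):
    """Return the name of the method with the fewest LR transitions."""
    return METHODS[counts.index(min(counts))]


def gain_over_opt(counts):
    """Ratio of right-hand-optimized transitions to no-scramble/drop ones."""
    opt, _spec_final, no_scramble = counts
    return opt / no_scramble


for sentence, counts in LR_COUNTS.items():
    print(f"{sentence}: best = {best_method(counts)}, "
          f"gain over Opt. = {gain_over_opt(counts):.1f}x")
```

On the figures as read, the no-scramble/drop variant has the fewest LR transitions for every sentence, and its advantage over the right-hand optimization is largest for the longest sentences.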
6 Conclusions 
Given our limited set of test sentences, our results must 
be correspondingly tentative. Nonetheless, we can draw 
several initial conclusions: 
* One can parse Japanese by parametrically varying a
grammar, much as expected. The limits of the method
are theory-bound: we can accommodate just as much
as we understand about Japanese syntax.
* Attempting to parse more than one language with
the same grammar and parser can quickly reveal what
is wrong with one's theory for either language. In our
case, we discovered omissions in the implementation
relating to Case transmission, the Wh-Comp Requirement,
and trace deletion, among other items.
* A single parser suffices for very distinct languages.
The grammar is parameterized, but not the parser,
confirming much other recent research in Japanese sentence
processing cited in the introduction. Japanese sentences at
first appear much more complex to parse than the corresponding
English sentences. We suggest, tentatively, that
complexity is introduced by scrambling and omission of
NPs, rather than Head-final properties. Unoptimized,
the system is too slow. Some efficiency is obtained if one
can "lift" information from the right for use in parsing
with an LR machine. From a heuristic standpoint, this
suggests that strategies limiting what may appear in a
scrambled position or be dropped in a certain context will
aid such an LR-based device more than switching to a
parser presumably geared for a different branching
direction.
* The principle-based system affords a new and generally
straightforward way to explore different grammatical
theories, structural assumptions, and parsing methods
and their computational consequences in a precise way,
without extensive hand coding. All of the experiments
we tried took no more than a few lines of modification.
Of course, the difficult part is to come up with a
universal set of principles in the first place, so that
in fact, English looks just about like Japanese, and
vice-versa.

[Bar graph. Horizontal axis: sentences 1b-18b and ce1-ce4. Legend: Right-hand information; Spec CP on right; No scrambling or pro-drop. * = complete parse not obtained.]
Figure 3: A bar graph showing the improvement in total LR transitions when parsing Japanese examples 1b-18b
and ce1-ce4, compared against the original unoptimized base-case parser, across the 3 experiments described here.
The horizontal line drawn at 1.0 marks the base case; bars above it indicate improvement.

Actes de COLING-92, Nantes, 23-28 août 1992    636    Proc. of COLING-92, Nantes, Aug. 23-28, 1992

References 

Baltin, M.R., and A.S. Kroch (eds.), 1989. Alternative
Conceptions of Phrase Structure, The University of
Chicago Press.

Chomsky, N.A., 1986. Knowledge of Language: Its Nature,
Origin, and Use. Praeger.

Correa, N., 1988. Syntactic Analysis of English with respect
to Government-binding Grammar, Ph.D. dissertation,
Syracuse University.

Crocker, M.W., 1989. A Principle-Based System for
Syntactic Analysis, (m.s.).

Dorr, B.J., 1987. UNITRAN: A Principle-Based Approach
to Machine Translation. S.M. thesis. MIT Department
of Electrical Engineering and Computer Science.

Fong, S. & R.C. Berwick, 1989. The computational
implementation of principle-based parsers, International
Workshop on Parsing Technologies, Carnegie Mellon
University, in M. Tomita (ed.) Current Issues in
Parsing Technologies, Kluwer.

Fong, S., 1991. Computational Properties of
Principle-based Grammatical Theories. Ph.D. dissertation, MIT
Department of Computer Science and Electrical
Engineering.

Frazier, L., and K. Rayner, 1988. Parameterizing the
language processing system: Left- vs. right-branching
within and across languages. In J. Hawkins (ed.)
Explaining Language Universals, Basil Blackwell, Oxford,
pp. 247-279.

Hosokawa, H., 1991. Syntactic differences between English
and Japanese. Georgetown Journal of Languages and
Linguistics, 1:4, 401-414.

Inoue, N. and Fodor, J.D., 1991. Information-paced
processing of Japanese sentences. Paper presented at the
International Workshop on Japanese Syntactic
Processing, Duke University.

Johnson, M., 1989. Use of the Knowledge of Language,
Journal of Psycholinguistic Research, 18(1).

Kuno, S., 1973. The Structure of the Japanese Language,
Cambridge, MA: MIT Press.

Lasnik, H. & M. Saito, 1984. On the nature of proper
government. Linguistic Inquiry, 15:2.

Lasnik, H. & J. Uriagereka, 1988. A Course in GB
Syntax: Lectures on Binding and Empty Categories,
Cambridge, MA: MIT Press.

Mazuka, R., 1990. Processing of empty categories in
Japanese. Manuscript, Duke University.

Nagai, N., 1991. Paper presented at the International
Conference on Japanese Syntactic Processing, Duke
University.

Sharp, R.M., 1985. A Model of Grammar Based on
Principles of Government and Binding Theory. M.S. thesis.
Department of Computer Science. University of British
Columbia.

Stabler, E.P., Jr., 1991 forthcoming. The Logical Approach
to Syntax: Foundations, Specifications and
Implementations of Theories of Government and Binding,
Cambridge, MA: MIT Press.

Tomita, M., 1986. Efficient Parsing for Natural Language:
A Fast Algorithm for Practical Systems. Kluwer.

Wehrli, E., 1986. A Government-Binding Parser for French.
Working Paper No. 48. University of Geneva.
