A NEW DESIGN OF PROLOG-BASED 
BOTTOM-UP PARSING SYSTEM 
WITH GOVERNMENT-BINDING THEORY
Hsin-Hsi Chen*,**, I-Peng Lin* and Chien-Ping Wu** 
* Department of Computer Science and Information Engineering 
National Taiwan University, Taipei, Taiwan, R.O.C. 
** Graduate Institute of Electrical Engineering 
National Taiwan University, Taipei, Taiwan, R.O.C. 
Abstract
This paper addresses the problems of movement
transformation in a Prolog-based bottom-up parsing system.
Three principles of Government-Binding theory are employed
to deal with these problems: the Empty Category Principle,
the C-command Principle, and the Subjacency Principle. A
formalism based upon them is proposed. Translation
algorithms are given that add these linguistic principles to the
general grammar rules, the leftward movement grammar rules,
and the rightward movement grammar rules, respectively. The
approach has three specific features: uniform treatment of
leftward and rightward movement, an arbitrary number of
movement non-terminals in a rule body, and automatic
detection of grammar errors before parsing. An example in
Chinese demonstrates all the concepts.
1. Introduction 
Movement transformation is one of the major problems
encountered in natural language processing. It is related to
the empty constituents (traces) that exist at various levels of
representation in natural language sentences. Consider the
following example in Chinese:
[Chinese text illegible in source] (That book, I read.)
The verb (read) is transitive and should take a direct object.
However, the object (that book) is topicalized to the first
position of the sentence. To treat this phenomenon, we
cannot just write down the rules:
sentence --> noun-phrase,verb-phrase. 
verb-phrase --> transitive-verb,noun-phrase. 
verb-phrase --> transitive-verb. 
verb-phrase --> intransitive-verb. 
This is because many ungrammatical sentences would then be
accepted. Thus, we must provide some mechanism in the
grammar to capture movement. This is still hard to do.
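The overgeneration can be seen in a minimal sketch of the four rules as an ordinary DCG. The lexicon below consists of hypothetical English glosses chosen for illustration, and the helper accepted/1 is likewise an assumption, not part of the paper.

```prolog
% The four naive rules, written as a plain DCG.
sentence    --> noun_phrase, verb_phrase.
verb_phrase --> transitive_verb, noun_phrase.
verb_phrase --> transitive_verb.
verb_phrase --> intransitive_verb.

% Hypothetical lexicon (English glosses, for illustration only).
noun_phrase       --> [i].
noun_phrase       --> [that, book].
transitive_verb   --> [read].
intransitive_verb --> [came].

% accepted(+Words): Words is recognised as a sentence.
accepted(Words) :- phrase(sentence, Words).
```

Under these assumptions, accepted([i, read]) succeeds although no moved constituent licenses the missing object, while the topicalized accepted([that, book, i, read]) is rejected.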
Several difficulties are listed as follows:
(1) The determination of movement is difficult. That is,
an element may be in a topicalization position without having
been moved from some other place in the sentence. For
example,
(Fruit, I like.)
[Chinese text illegible in source]
(As for fruit, I like banana.)
the first can be considered a movement phenomenon, but
the second cannot.
(2) The empty constituent may exist at many possible
positions. For example, given an n-word sentence such as
e(1) w(1) e(2) w(2) e(3) ... e(n-1) w(n-1) e(n) w(n) e(n+1)
where w(i) is the i-th word and e(i) is an empty
constituent, there are (n+1) possible positions from which the
moved constituent may originate. That is, for a moved
constituent (if there is any), there are many possible empty
constituents to co-index with.
(3) Since the gap between the moved constituent and
its corresponding trace is arbitrary, it is implausible to list all
the possible movements exhaustively and to specify each
movement constraint explicitly in the grammar.
The Government-Binding (GB) theory [1] provides
universal principles to explain movement. Some of them
are as follows:
(1) Empty Category Principle [7] -
A trace must be properly governed.
(2) C-command Principle [7] -
α c-commands β iff every branching node
dominating α dominates β.
(3) Subjacency Principle [7] -
Any application of move-α may not cross more
than one bounding node.
Summing up the above principles, we have to find a moved
constituent that c-commands a trace. The constituent can
neither relate to a trace outside its c-command domain, nor
match a trace when more than one bounding node is crossed.
Such principles narrow down the search space to some extent.
For example,
(E1) [Chinese text illegible in source]
(The student that the man saw t_i came.)
(E2) * [Chinese text illegible in source]
(* The man that the student who t_m saw t_n came.)
There is a trace t_i in example (E1). Two NPs, i.e.
(the student) and (the man), may co-index with it,
but only the former is acceptable. The reason is shown
below (Chinese words are given by their English glosses):
(E1') [n'' [s (the man) (saw) t_i ] (de) (the student)_i ] (came) (asp)
(E1'') (the man)_i [n'' [s (saw) t_i ] (de) (the student) ] (came) (asp)
The (E1'') interpretation violates the subjacency principle
(assuming that s and n'' are two bounding nodes). Two traces
exist in example (E2). The traces t_m and t_n may co-index
with the two NPs (the student) and (the man), or vice versa.
However, both are wrong because of the subjacency
principle:
(E2') [s [n'' [s t_m (saw) t_n ] (de) (the student)_n ] (came) (asp) ] (de) (the man)_m
(E2'') [s [n'' [s t_m (saw) t_n ] (de) (the student)_m ] (came) (asp) ] (de) (the man)_n
2. A Government-Binding based Logic Grammar
Formalism
2.1 The specifications of grammar formalism
The Government-Binding based Logic Grammar
(GBLG) formalism is specified informally as follows:
(1) the general grammar rules -
(a) c(Arg) --> c1(Arg1),c2(Arg2),...,cn(Argn).
where c(Arg) is a phrasal non-terminal, and may
also be a bounding non-terminal,
cj(Argj) (1 <= j <= n) is a lexical terminal or
a phrasal non-terminal.
(b) c(Arg) --> c1(Arg1),c2(Arg2),...,ci(Argi),
trace(TraceArg),c(i+1)(Arg(i+1)),...,cn(Argn).
where the definitions of c(Arg) and
cj(Argj) (1 <= j <= n) are the same as above,
trace(TraceArg) is a virtual non-terminal.
The special case i=0 is common. For example, a
noun phrase is topicalized from a subject position. It is
represented as s --> trace,np.
(2) the leftward movement grammar rules -
c(Arg) --> c1(Arg1),c2(Arg2),...,ci(Argi),
m(Argm)<<<trace(TraceArg),
c(i+1)(Arg(i+1)),...,cn(Argn).
where the definitions of c(Arg) and
cj(Argj) (1 <= j <= n) are the same as in 1(a),
m(Argm)<<<trace(TraceArg) is a movement
non-terminal.
When i=0, the movement non-terminal is the first
element in the rule body.
(3) the rightward movement grammar rules -
c(Arg) --> c1(Arg1),c2(Arg2),...,ci(Argi),
trace(TraceArg)>>>m(Argm),
c(i+1)(Arg(i+1)),...,cn(Argn).
Except that the operator '>>>' is used, the other
definitions are the same as those in the leftward movement
rules. This follows from the uniform treatment of the
leftward and the rightward movements.
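As a rough illustration of how the two movement operators could be read by a Prolog system, the sketch below declares them as infix operators and stores two movement rules as plain facts. The operator priorities, the stand-in arrow --->, and the predicates gblg/2, movement/4 and rule_movement/4 are assumptions made for this example; the paper does not give its reader or translator.

```prolog
% Hypothetical operator declarations. The arrow ---> is used instead of
% --> so that the rules are stored as data and not DCG-translated.
:- op(1200, xfx, (--->)).
:- op(700,  xfx, (<<<)).      % Moved <<< Trace  (leftward movement)
:- op(700,  xfx, (>>>)).      % Trace >>> Moved  (rightward movement)

% One leftward and one rightward movement rule, stored as facts.
gblg(r1, (s1bar(s1bar(Topic,S)) ---> topic(Topic) <<< traceT(Topic), s(S))).
gblg(r9, (n1bar(n1bar(Rel,N2bar)) ---> rel(Rel), traceR(N2bar) >>> n2bar(N2bar))).

% movement(+Body, -Moved, -Trace, -Direction):
% enumerate the movement non-terminals occurring in a rule body.
movement((A, B), M, T, D) :- !,
    (   movement(A, M, T, D)
    ;   movement(B, M, T, D)
    ).
movement(M <<< T, M, T, left).
movement(T >>> M, M, T, right).

% rule_movement(?Id, -Moved, -Trace, -Direction)
rule_movement(Id, Moved, Trace, Dir) :-
    gblg(Id, (_ ---> Body)),
    movement(Body, Moved, Trace, Dir).
```

With these declarations, rule_movement(r1, M, T, D) yields the moved constituent topic(_), the trace traceT(_) and the direction left, and the rightward rule is handled by the same code path.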
2.2 A sample grammar 
A sample grammar GBLG1 for Chinese, shown below,
introduces the uses of the formalism:
(r1) s1bar(s1bar(Topic,S)) -->
topic(Topic) <<< traceT(Topic),s(S).
(r2) s1bar(s1bar(S)) --> s(S).
(r3) s(s(N2bar,V2bar,Part)) -->
n2bar(N2bar),v2bar(V2bar),* part(Part).
(r4) s(s(N2bar,V2bar)) --> n2bar(N2bar),v2bar(V2bar).
(r5) s(s(traceR(Trace),V2bar)) -->
traceR(Trace),v2bar(V2bar).
(r6) topic(topic(N2bar)) --> n2bar(N2bar).
(r7) n2bar(n2bar(Det,CL,N1bar)) -->
* det(Det),* cl(CL),n1bar(N1bar).
(r8) n2bar(n2bar(N1bar)) --> n1bar(N1bar).
(r9) n1bar(n1bar(Rel,N2bar)) -->
rel(Rel),traceR(N2bar) >>> n2bar(N2bar).
(r10) n1bar(n1bar(N)) --> * n(N).
(r11) rel(rel(S,De)) --> s(S),* de(De).
(r12) v2bar(v2bar(Adv,V1bar)) --> * adv(Adv),v1bar(V1bar).
(r13) v2bar(v2bar(V1bar)) --> v1bar(V1bar).
(r14) v1bar(v1bar(TV,N2bar)) --> * tv(TV),n2bar(N2bar).
(r15) v1bar(v1bar(TV,traceT(Trace))) -->
* tv(TV),traceT(Trace).
(r16) v1bar(v1bar(TV,traceR(Trace))) -->
* tv(TV),traceR(Trace).
(r17) v1bar(v1bar(IV)) --> * iv(IV).
Among these grammar rules, (r1) deals with the leftward
movement (topicalization), (r9) treats the rightward movement
(relativization), and the others are normal grammar rules. The
heads of the grammar rules (r3), (r4), (r5), (r7), and (r8) are
bounding nodes. The virtual non-terminals traceT(Trace) and
traceR(Trace) appear in the rules (r5), (r15), and (r16).
2.3 Transitive relation of c-command theory
For a phrasal non-terminal X, a virtual non-terminal Y
and a transitive relation TR, X TR Y if
(1) X is the rule head of a grammar rule, and Y is an
element in its rule body, or
(2) X is the rule head of a grammar rule, I is a phrasal
non-terminal in its rule body, and I TR Y, or
(3) there exists a sequence of phrasal non-terminals I1,
I2, ..., In, such that X TR I1 TR I2 TR ... TR In TR Y.
The transitive relation TR is also a dominance relation.
The c-command theory is embedded implicitly in the
GBLGs if every grammar rule satisfies the following property:
for a rule X0 --> X1,X2,...,Xm where Xi is a terminal
or a non-terminal, 1 <= i <= m, if Xi=(A<<<B) then there must
exist some Xj (i < j <= m) such that Xj dominates the virtual
non-terminal B in another grammar rule. That is, Xj TR B. The
phrasal non-terminal X0 is the first branching node that
dominates A and Xj, and thus also dominates B. Therefore, A
c-commands B. Xi=(B>>>A) behaves similarly, with Xj to the
left of Xi. Rules (r1) and (r9) in grammar GBLG1 show these
<<< and >>> relations respectively.
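The relation TR and the rule property can be checked mechanically. The following is a minimal sketch under stated assumptions: part of GBLG1 is re-encoded as rule(Head, Body) facts with arguments dropped, left(Moved,Trace) stands for Moved <<< Trace, right(Trace,Moved) for Trace >>> Moved, and virtual(T) marks a virtual non-terminal. The predicates tr/2 and violation/2 are illustrative names, not the paper's translator.

```prolog
:- use_module(library(lists)).
:- dynamic rule/2.

% Hypothetical category-level encoding of part of GBLG1.
rule(s1bar, [left(topic, traceT), s]).      % (r1)
rule(s1bar, [s]).                           % (r2)
rule(s,     [n2bar, v2bar]).                % (r4)
rule(s,     [virtual(traceR), v2bar]).      % (r5)
rule(topic, [n2bar]).                       % (r6)
rule(n2bar, [n1bar]).                       % (r8)
rule(n1bar, [rel, right(traceR, n2bar)]).   % (r9)
rule(n1bar, [n]).                           % (r10)
rule(rel,   [s, de]).                       % (r11)
rule(v2bar, [v1bar]).                       % (r13)
rule(v1bar, [tv, n2bar]).                   % (r14)
rule(v1bar, [tv, virtual(traceT)]).         % (r15)
rule(v1bar, [tv, virtual(traceR)]).         % (r16)

% elem_cat(+Element, -Category): category contributed by a body element.
elem_cat(left(M, _), M)  :- !.
elem_cat(right(_, M), M) :- !.
elem_cat(virtual(_), _)  :- !, fail.
elem_cat(C, C).

child(X, C) :- rule(X, Body), member(E, Body), elem_cat(E, C).

% tr(+X, ?Y): X dominates the virtual non-terminal Y.
% The third argument holds the categories on the current path, so
% recursive grammars do not cause looping.
tr(X, Y) :- tr(X, Y, [X]).
tr(X, Y, _) :- rule(X, Body), member(virtual(Y), Body).
tr(X, Y, Seen) :-
    child(X, I), \+ memberchk(I, Seen),
    tr(I, Y, [I|Seen]).

% violation(-Head, -Element): a movement non-terminal whose trace is not
% dominated by any sibling on the required side.
violation(X0, left(A, B)) :-
    rule(X0, Body),
    append(_, [left(A, B)|After], Body),
    \+ ( member(E, After), elem_cat(E, Xj), tr(Xj, B) ).
violation(X0, right(B, A)) :-
    rule(X0, Body),
    append(Before, [right(B, A)|_], Body),
    \+ ( member(E, Before), elem_cat(E, Xj), tr(Xj, B) ).
```

For the encoded rules, violation/2 has no solution. Adding a rule such as rule(bad, [left(topic, traceT), n]) would be reported, because the only sibling to the right of the movement non-terminal cannot dominate traceT.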
2.4 Comparison with other logic programming
approaches
Compared with other logic programming approaches,
especially the RLGs [8,9], the GBLGs have the following
features:
(1) the uniform treatment of leftward movement and
rightward movement -
The direction of movement is expressed by the
movement operator <<< or >>>. The interpretation of the
movement non-terminals A <<< B and B >>> A is:
if A is a left-moved constituent (or a right-moved
constituent), then the corresponding trace denoted by B should
be found after (or before) A <<< B (or B >>> A). This is
illustrated in Fig. 1. The two trees are symmetric and the
corresponding rules are similar. In RLGs, however, the rules
are not similar; that is, the two types of movement are not
treated in the same way. For rightward movement, the
concept of an adjunct node is introduced: the right-moved
constituent is found if the rule hung on the adjunct node
is satisfied. Operational semantics is thereby imposed on the
writing of the logic grammar, which destroys its declarative
semantics to some extent.
(2) the arbitrary number of movement non-terminals in
the rule body -
In our logic grammars, the number of movement
non-terminals in a rule is not restricted, provided the rule
satisfies the property specified in the last section. RLGs
allow at most one movement non-terminal per rule, and its
position is declared in the rule head. It is difficult for a
translator to determine the position if different types
of elements are interleaved in the rule body. Thus, our
formalism is clearer and more flexible than that of RLGs.
Fig. 1 Symmetric trees for leftward and rightward movement
(each tree is rooted at X0 and shows the moved constituent
and the trace, i.e. the empty constituent)
(3) automatic detection of grammar errors before parsing -
For a grammar rule to be meaningful, the transitive
relation TR must be satisfied. A violation of the transitive
relation can be found beforehand, during rule translation.
Thus, this feature helps grammar writers detect grammar
errors before parsing.
3. A Bottom-up Parser in Prolog 
3.1 Problem specifications 
The Bottom-Up Parsing system (BUP) [2,3,4] uses the
left-corner bottom-up algorithm to implement Definite Clause
Grammars (DCGs) [5]. It overcomes the problems of
top-down parsing, e.g. left-recursive invocation, and
provides a method as efficient as Earley's and Pratt's
algorithms [3]. However, it does not deal with an important
syntactic problem: movement transformation. Extraposition
Grammars (XGs) [6] propose extraposition lists (x-lists) to
attack the movement problem, but when to extract traces from
x-lists becomes a new obstacle [8,9]. Restricted Logic
Grammars (RLGs) [8,9], based upon GB, try to tackle the
unrestricted extraction from the x-list. They emphasize the
importance of the c-command and the subjacency principles
during parsing: the extraction must obey these two
principles. The parsing strategies of XGs and RLGs are both
depth-first and left-to-right, so they have the same drawbacks
as DCGs [4]. If the parsing strategy is left-corner bottom-up,
the following issues have to be considered in the translation
of GBLGs:
(1) the empty constituent problem -
The first element in the rule body, which acts as the
left-corner, should not be empty in the left-corner bottom-up
algorithm. However, rules of type 1(b) are common.
(2) the transfer of trace information -
From Fig. 1, we know that the positions of empty
constituents are usually lower than those of moved
constituents. Because the parsing style is bottom-up, the trace
information must be transferred up from the bottom. The
conventional difference list cannot be applied here. Fig. 2 and
Fig. 3 illustrate the differences in data flow between top-down
parsing and bottom-up parsing.
Fig. 2 The data flow in top-down parsing (the x-list is
threaded from H0 through H1, ..., H(n-1) to H across c1, ..., cn)
Fig. 3 The data flow in bottom-up parsing (the x-lists
H1, ..., Hn of c1, ..., cn are combined by merge([H1,H2,...,Hn]))
3.2 Data structure 
The trace information is transferred through a list called the
extraposition list (x-list), denoted by the symbol H. The
transfer of x-lists is bottom-up; Fig. 3 sketches the
concept. A special data structure, shown below, is proposed to
carry the information:
[[a sequence of trace information|X],X]
Note that a tail variable X is introduced. In this notation,
an empty list is represented as [Z,Z]. An algorithm that merges
an arbitrary number of lists in time linear in the number of
lists is designed:
merge(X,Y) :- merge(X,Y,[Z,Z]).
merge([],L,L) :- !.
merge([[B,X]|T],Y,[A,B]) :- merge(T,Y,[A,X]).
In the conventional list structure such as
[a sequence of trace information]
even if the difference list concept is adopted, the
computation time is still in proportion to m1+m2+...+mn,
where mi (1 <= i <= n) is the number of elements in the i-th list.
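A small worked example may help to see why one unification per list suffices. The sketch below repeats the merge algorithm under the hypothetical name xl_merge, chosen here only to avoid clashing with library predicates named merge in some Prolog systems; the trace entries np1 and np2 and the predicate demo/1 are made up for illustration.

```prolog
% xl_merge(+XLists, -XList): concatenate x-lists of the form [Front,Tail],
% where Front is an open list ending in the variable Tail.
xl_merge(Ls, L) :- xl_merge(Ls, L, [Z,Z]).

% The accumulator [A,B] holds the merged list so far; each step unifies
% the front of the next x-list with the accumulator's open tail B.
xl_merge([], L, L) :- !.
xl_merge([[B,X]|T], Y, [A,B]) :- xl_merge(T, Y, [A,X]).

% demo(-Front): merge a one-element list, an empty list and another
% one-element list, then close the tail to inspect the result.
demo(Front) :-
    H1 = [[x(traceT(np1),_,left)|T1], T1],
    H2 = [E, E],
    H3 = [[x(traceR(np2),_,right)|T3], T3],
    xl_merge([H1,H2,H3], [Front,Tail]),
    Tail = [].
```

Calling demo(F) binds F to a two-element list holding the traceT entry followed by the traceR entry. Merging only empty x-lists gives back an empty x-list whose two components are the same variable.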
Although our merge algorithm is linear in the number of
lists, it is still a burden on the parsing. In most cases, the
predicate merges empty lists, which is wasted work. To enhance
the parsing speed, the merge predicate is added only where it is
needed. Observing the merge operation, we can find that it is
needed only when the number of lists to be merged is greater
than one. The following method decreases the number of
x-lists during rule translation, and thus deletes most of the
unnecessary merges:
Partition the basic elements in the logic grammars into
two mutually exclusive sets: the carry-H set and the
non-carry-H set. The elements in the carry-H set may
contribute trace information during parsing, and those in the
non-carry-H set never introduce trace information. The
transitive relation TR defined in Section 2.3 tells us which
phrasal non-terminals constitute the carry-H set.
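One way to read this partition is sketched below: a category carries H if it dominates some virtual non-terminal. The encoding is an assumption made for the example: GBLG1 is reduced to category names, movement operators are dropped, v(T) marks a virtual non-terminal, and carries/2, carry_h_set/1 and non_carry_h_set/1 are hypothetical names.

```prolog
:- use_module(library(lists)).

% GBLG1 reduced to categories; v(T) marks a virtual non-terminal.
rule(s1bar, [topic, s]).            % (r1)
rule(s1bar, [s]).                   % (r2)
rule(s,     [n2bar, v2bar, part]).  % (r3)
rule(s,     [n2bar, v2bar]).        % (r4)
rule(s,     [v(traceR), v2bar]).    % (r5)
rule(topic, [n2bar]).               % (r6)
rule(n2bar, [det, cl, n1bar]).      % (r7)
rule(n2bar, [n1bar]).               % (r8)
rule(n1bar, [rel, n2bar]).          % (r9)
rule(n1bar, [n]).                   % (r10)
rule(rel,   [s, de]).               % (r11)
rule(v2bar, [adv, v1bar]).          % (r12)
rule(v2bar, [v1bar]).               % (r13)
rule(v1bar, [tv, n2bar]).           % (r14)
rule(v1bar, [tv, v(traceT)]).       % (r15)
rule(v1bar, [tv, v(traceR)]).       % (r16)
rule(v1bar, [iv]).                  % (r17)

% carries(+Cat, +Path): Cat dominates some virtual non-terminal.
% Path records the categories already on the current path.
carries(C, _) :-
    rule(C, Body), member(v(_), Body), !.
carries(C, Path) :-
    rule(C, Body), member(I, Body), atom(I),
    \+ memberchk(I, Path),
    carries(I, [I|Path]), !.

carry_h_set(Set) :-
    setof(C, B^(rule(C, B), carries(C, [C])), Set).

non_carry_h_set(Set) :-
    carry_h_set(Carry),
    setof(C, B^H^( rule(H, B), member(C, B), atom(C),
                   \+ memberchk(C, Carry) ), Set).
```

Under this reading every phrasal non-terminal of GBLG1 falls into the carry-H set and every lexical category into the non-carry-H set, which agrees with the translated clauses in the Appendix, where only the clauses headed by lexical categories lack the H argument.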
3.3 The translation of grammar rules 
The translation of the basic elements in GBLGs is
similar to that in BUP. The only difference is that an extra
argument carrying trace information is added to a phrasal
non-terminal if it belongs to the carry-H set. The Appendix
lists the translated results of the grammar GBLG1.
3.3.1 The general grammar rules
The general grammar rules are divided into two types
according to whether a virtual non-terminal is absent from or
present in the rule body:
(a) c(Arg) --> c1(Arg1),c2(Arg2),...,cn(Argn).
When c is not a bounding node, e.g. rule (r2), the
translation is the same as that in BUP [2,3,4], except that an
extra argument H (if necessary) for the x-list and a built-in
predicate merge are added in the new translation algorithm.
This predicate is used to merge all the x-lists on the same level.
The transfer of x-lists is bottom-up (only one direction),
as shown in Fig. 3. Thus, the rule (a) is translated into
c1(G,Arg1,H1,X1,X) :-
goal(c2,Arg2,H2,X1,X2),
...
goal(cn,Argn,Hn,X(n-1),Xn),
merge([H1,H2,...,Hn],H),
c(G,Arg,H,Xn,X).
When c is a bounding node, e.g. rule (r4), this
information is used to check the x-list transferred up. Thus, an
extra predicate bound is attached to this type of rule:
c1(G,Arg1,H1,X1,X) :-
goal(c2,Arg2,H2,X1,X2),
...
goal(cn,Argn,Hn,X(n-1),Xn),
merge([H1,H2,...,Hn],H),
bound(c,H),
c(G,Arg,H,Xn,X).
The predicate bound implements the subjacency principle. Its
definition is:
bound(C,[X,Y]) :- (var(X),!;boundaux(C,X)).
boundaux(C,X) :- var(X),!.
boundaux(C,[x(Trace,Bound,Direction)|Xs]) :-
(var(Bound),!,Bound=C,boundaux(C,Xs);
Bound=s,C=s,!,boundaux(C,Xs);
fail).
A variable Bound, which records the crossing information, is
reserved for each element in the x-list. When a bounding node
is crossed, this variable is checked to avoid an illegal
operation.
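The effect of bound on a single pending trace can be traced with a short sketch. The definition is repeated with anonymous variables, and the helper crossing/1, which lets one trace cross a sequence of bounding nodes from the bottom up, is an assumption added for illustration.

```prolog
% bound(+Node, +XList): record or check the bounding node crossed by
% every trace still waiting in the x-list.
bound(C, [X,_]) :- ( var(X), ! ; boundaux(C, X) ).

boundaux(_, X) :- var(X), !.                 % open tail: end of x-list
boundaux(C, [x(_Trace,Bound,_Dir)|Xs]) :-
    (   var(Bound), !, Bound = C,            % first bounding node crossed
        boundaux(C, Xs)
    ;   Bound = s, C = s, !,                 % s crossed again after s
        boundaux(C, Xs)
    ;   fail                                 % any other second crossing
    ).

% crossing(+Nodes): hypothetical driver. A single rightward trace
% crosses the bounding nodes in Nodes, innermost first.
crossing(Nodes) :-
    XList = [[x(traceR(_),_,right)|Z], Z],
    cross(Nodes, XList).

cross([], _).
cross([N|Ns], XL) :- bound(N, XL), cross(Ns, XL).
```

Read this way, the definition as printed accepts crossing([n2bar]) and crossing([s,s]) but rejects crossing([s,n2bar]) and crossing([n2bar,s]). An empty x-list passes any bounding node.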
(b) c(Arg) --> c1(Arg1),c2(Arg2),...,ci(Argi),
trace(TraceArg),c(i+1)(Arg(i+1)),...,cn(Argn).
where i >= 0. The rules (r5) and (r15) are two
examples.
If the left-corner bottom-up parsing algorithm is used,
the grammar rules should be free of empty constituents. When
i=0, the grammar rule takes a trace (an empty constituent)
as the first element in the rule body. This violates the
principle of the algorithm, but we can always select the first
element c(i+1) that satisfies one of the following criteria:
(1) a lexical terminal, or
(2) a phrasal non-terminal, or
(3) a phrasal non-terminal in a movement non-terminal,
to be the left-corner and put the trace information into an x-list
before this non-terminal. Thus, the translation is generalized as
follows (assume that c1 is a left-corner).
c1(G,Arg1,H1,X1,X) :-
goal(c2,Arg2,H2,X1,X2),
...
goal(ci,Argi,Hi,X(i-1),Xi),
goal(c(i+1),Arg(i+1),H(i+1),Xi,X(i+1)),
...
goal(cn,Argn,Hn,X(n-1),Xn),
merge([H1,H2,...,Hi,
[[x(trace(TraceArg),Bound,D)|Z],Z],
H(i+1),...,Hn],H),
c(G,Arg,H,Xn,X).
Here, the trace information is placed between Hi and H(i+1).
Summing up, the virtual non-terminal is represented in a fixed
format:
x(trace(TraceArg),Bound,Direction)
and placed into the x-list via the merge operation. Its position
in the x-list reflects its position in the original rule.
3.3.2 The leftward movement grammar rules
The leftward movement grammar rules can be
generalized as below:
c(Arg) --> c1(Arg1),c2(Arg2),...,
ci(Argi)<<<trace(TraceArg),c(i+1)(Arg(i+1)),...,
cn(Argn).
The rule (r1) is an example. Its translation is shown as
follows:
c1(G,Arg1,H1,X1,X) :-
goal(c2,Arg2,H2,X1,X2),
...
goal(ci,Argi,Hi,X(i-1),Xi),
goal(c(i+1),Arg(i+1),H(i+1),Xi,X(i+1)),
...
goal(cn,Argn,Hn,X(n-1),Xn),
merge([H(i+1),...,Hn],T1),
cut_trace(x(trace(TraceArg),Bound,left),
T1,T2),
merge([H1,H2,...,Hi,T2],H),
c(G,Arg,H,Xn,X).
Comparing this translation with that of the general grammar
rules, we find that a new predicate cut_trace is added. The
predicate cut_trace implements the c-command principle, and
its definition is:
cut_trace(Trace,[Y,X],[Y1,X]) :-
(var(Y),!,
(Trace=x(TraceInfo,Bound,left),!;fail);
cut_traceaux(Trace,Y,Y1)).
cut_traceaux(Trace,[Trace|Xs],Xs) :- !.
cut_traceaux(Trace,[H|X],[H|Y]) :-
(var(X),!,
(Trace=x(TraceInfo,Bound,left),!;fail);
cut_traceaux(Trace,X,Y)).
The predicate cut_trace tries to retract a trace from the x-list
if a movement exists. Mandarin Chinese has many specific
features that other languages do not have. For example, the
topic-comment structure does not always involve a movement
transformation. The first cut_traceaux clause matches the trace
information against the x-list transferred from the bottom on
its right part. The second cut_traceaux clause tells us that if
the expected leftward trace cannot match any of the elements in
the x-list, then the expectation is dropped: the x-list is not
changed and is transferred up. The concept is demonstrated in
Fig. 4. It also explains why we can detect grammar errors
before parsing. In summary, each movement non-terminal is
decomposed into a phrasal non-terminal and a virtual
non-terminal. The phrasal non-terminal is translated the
same as before. The virtual non-terminal is represented as
x(trace(TraceArg),Bound,left)
in this case; however, cut_trace is involved instead of merge.
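The behaviour of cut_trace on small x-lists can be sketched as follows. Two points are assumptions of this sketch and are not stated in the text: when no trace is retracted, the output tail is explicitly bound to the input tail (Y1 = Y and Y = X below) so that the unchanged x-list remains a well-formed [Front,Tail] pair, and the demo predicates and the constants np, np2 and a are made up for illustration.

```prolog
% cut_trace(+Expected, +XListIn, -XListOut)
cut_trace(Trace, [Y,X], [Y1,X]) :-
    (   var(Y), !,                 % empty x-list: nothing to retract
        Trace = x(_,_,left),       % tolerated only for leftward movement
        Y1 = Y                     % assumption: pass the x-list on unchanged
    ;   cut_traceaux(Trace, Y, Y1)
    ).

cut_traceaux(Trace, [Trace|Xs], Xs) :- !.   % matching trace retracted
cut_traceaux(Trace, [H|X], [H|Y]) :-
    (   var(X), !,                 % end of x-list reached without a match
        Trace = x(_,_,left),
        Y = X                      % assumption: keep the open tail shared
    ;   cut_traceaux(Trace, X, Y)
    ).

% Leftward expectation met: the trace is removed and co-indexed.
demo_left_match(Moved, Out) :-
    cut_trace(x(traceT(Moved),_,left),
              [[x(traceT(np),_,left)|Z], Z], Out).

% Leftward expectation not met: the x-list is passed on unchanged.
demo_left_nomatch(In, Out) :-
    In = [[x(traceR(a),_,right)|Z], Z],
    cut_trace(x(traceT(_),_,left), In, Out).

% Rightward expectation met and not met.
demo_right_match(Moved) :-
    cut_trace(x(traceR(np2),_,right),
              [[x(traceR(Moved),_,right)|Z], Z], _).
demo_right_nomatch :-
    cut_trace(x(traceR(_),_,right), [Z,Z], _).
```

In this sketch demo_left_match/2 binds Moved to np and returns an empty x-list, demo_left_nomatch/2 returns its input unchanged, and demo_right_nomatch/0 fails, since only an expectation with direction left may be dropped.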
3.3.3 The rightward movement grammar rules
Because we treat the leftward and the rightward
movement grammar rules in a uniform way, their translation
algorithms are similar. The rightward movement
grammar rules have the following format:
c(Arg) --> c1(Arg1),c2(Arg2),...,
trace(TraceArg)>>>ci(Argi),
c(i+1)(Arg(i+1)),...,cn(Argn).
The rule (r9) is an example. The corresponding translated
result is:
c1(G,[Arg1],H1,X1,X) :-
goal(c2,[Arg2],H2,X1,X2),
...
goal(c(i-1),[Arg(i-1)],H(i-1),X(i-2),X(i-1)),
merge([H1,H2,...,H(i-1)],T1),
cut_trace(x(trace(TraceArg),Bound,right),
T1,T2),
goal(ci,[Argi],Hi,X(i-1),Xi),
...
goal(cn,[Argn],Hn,X(n-1),Xn),
merge([T2,Hi,...,Hn],H),
c(G,[Arg],H,Xn,X).
Fig. 4 The sketch of the translation of the leftward
production rules (a trace should be found in the range to the
right of the expected left-moved constituent if the expectation
succeeds; if there is a trace in the range to its left, the
corresponding moved element is on the upper level)
Fig. 5 The sketch of the translation of the rightward
production rules (a trace should be found in the range to the
left of the right-moved constituent; if there is a trace in the
range to its right, the corresponding moved element is on the
upper level)
The translation follows directly from the symmetry of
the leftward and the rightward grammar rules illustrated in
Fig. 4 and Fig. 5. A slight difference appears in the definition
of the predicate cut_trace. It reflects an important linguistic
phenomenon in Mandarin Chinese: 'Relativization is always a
movement transformation.' Thus, if we expect a trace and
cannot find a corresponding one, failure is issued. The
direction information in x(trace(TraceArg),Bound,right), i.e.
right, distinguishes the leftward from the rightward
movements. In general, we allow both leftward movement and
rightward movement to appear in the same rule. A new
predicate intersection is introduced to couple these two
translations.
3.4 Invocation of the parsing system 
The parsing system is triggered in the following way:
goal(a start non-terminal,[a sequence of arguments],
an empty x-list,[a sequence of input string],[]).
In GBLG1, the invocation is shown as follows:
goal(s1bar,[ParseTree],[Z,Z],[input sentence],[]), var(Z).
Because an empty x-list is represented as [Z,Z] (Z: a variable)
in our special data structure shown in Section 3.2, var(Z)
verifies that no trace is left unresolved. For example, to parse
the Chinese sentence [Chinese text illegible in source] (the
student that that man saw came), we trigger the parser by
calling the query below, given here with English glosses in
place of the Chinese words:
?- goal(s1bar,[S1bar],[Z,Z],['that','man','saw',
'de','student','came','aspect'],[]),var(Z).
4. Conclusion 
This paper addresses the problems of movement
transformation in a Prolog-based bottom-up parser. Three
principles of Government-Binding theory are considered to deal
with these problems: the Empty Category Principle, the
C-command Principle and the Subjacency Principle. A sequence
of translation rules is given to add these linguistic principles to
the general grammar rules, the leftward movement grammar
rules, and the rightward movement grammar rules
respectively. The empty constituent problem is solved in this
paper, so that a trace may be the first element in a grammar
rule body. A special data structure for the extraposition list is
proposed to transfer the movement information from the bottom
to the top. Based upon this structure, a merge algorithm that
is linear in the number of lists is designed. Unnecessary
merge predicates can be eliminated with the help of the
transitive relation. Thus, the new design not only extends the
original bottom-up parsing system with the movement facility,
but also preserves the parsing efficiency.
Appendix 
Based upon the translation algorithms specified in
Section 3, the logic grammar GBLG1 is translated as below.
The clause (ti) is the translated result of the grammar
rule (ri). Note that the code has been optimized: unnecessary
merge operations are deleted from the translated results.
(t1) topic(G,[Topic],H1,X1,X) :-
goal(s,[S],H2,X1,X2),
cut_trace(x(traceT(Topic),Bound,left),H2,T1),
merge([H1,T1],H),
s1bar(G,[s1bar(Topic,S)],H,X2,X).
(t2) s(G,[S],H,X1,X) :- s1bar(G,[s1bar(S)],H,X1,X).
(t3) n2bar(G,[N2bar],H1,X1,X) :-
goal(v2bar,[V2bar],H2,X1,X2),
lookup(part,[Part],X2,X3),
merge([H1,H2],H),
bound(s,H),
s(G,[s(N2bar,V2bar,Part)],H,X3,X).
(t4) n2bar(G,[N2bar],H1,X1,X) :-
goal(v2bar,[V2bar],H2,X1,X2),
merge([H1,H2],H),
bound(s,H),
s(G,[s(N2bar,V2bar)],H,X2,X).
(t5) v2bar(G,[V2bar],H1,X1,X) :-
merge([[[x(traceR(Trace),Bound,right)|Z],Z],
H1],H),
bound(s,H),
s(G,[s(traceR(Trace),V2bar)],H,X1,X).
(t6) n2bar(G,[N2bar],H,X1,X) :-
topic(G,[topic(N2bar)],H,X1,X).
(t7) det(G,[Det],X1,X) :-
lookup(cl,[CL],X1,X2),
goal(n1bar,[N1bar],H,X2,X3),
bound(n2bar,H),
n2bar(G,[n2bar(Det,CL,N1bar)],H,X3,X).
(t8) n1bar(G,[N1bar],H,X1,X) :-
bound(n2bar,H),
n2bar(G,[n2bar(N1bar)],H,X1,X).
(t9) rel(G,[Rel],H1,X1,X) :-
cut_trace(x(traceR(N2bar),Bound,right),H1,T1),
goal(n2bar,[N2bar],H2,X1,X2),
merge([T1,H2],H),
n1bar(G,[n1bar(Rel,N2bar)],H,X2,X).
(t10) n(G,[N],X1,X) :- n1bar(G,[n1bar(N)],[Z,Z],X1,X).
(t11) s(G,[S],H,X1,X) :-
lookup(de,[De],X1,X2),
rel(G,[rel(S,De)],H,X2,X).
(t12) adv(G,[Adv],X1,X) :-
goal(v1bar,[V1bar],H,X1,X2),
v2bar(G,[v2bar(Adv,V1bar)],H,X2,X).
(t13) v1bar(G,[V1bar],H,X1,X) :-
v2bar(G,[v2bar(V1bar)],H,X1,X).
(t14) tv(G,[TV],X1,X) :-
goal(n2bar,[N2bar],H,X1,X2),
v1bar(G,[v1bar(TV,N2bar)],H,X2,X).
(t15) tv(G,[TV],X1,X) :-
v1bar(G,[v1bar(TV,traceT(Trace))],
[[x(traceT(Trace),Bound,left)|Z],Z],X1,X).
(t16) tv(G,[TV],X1,X) :-
v1bar(G,[v1bar(TV,traceR(Trace))],
[[x(traceR(Trace),Bound,right)|Z],Z],X1,X).
(t17) iv(G,[IV],X1,X) :- v1bar(G,[v1bar(IV)],[Z,Z],X1,X).

References 

[1] N. Chomsky, Lectures on Government and Binding,
Foris Publications, Dordrecht, Holland, 1981.

[2] Yuji Matsumoto, Hozumi Tanaka, et al., "BUP: A
Bottom-Up Parser Embedded in Prolog," New
Generation Computing, Vol. 1, No. 2, 1983, pp.
145-158.

[3] Yuji Matsumoto, Masaki Kiyono, and Hozumi Tanaka,
"Facilities of the BUP Parsing System," in Dahl, V. and
P. Saint-Dizier (eds.), Natural Language Understanding
and Logic Programming, 1985, pp. 97-106.

[4] Yuji Matsumoto, Hozumi Tanaka, and Masaki Kiyono,
"BUP: A Bottom-Up Parsing System for Natural
Languages," in Warren, D.H.D. and M. van Caneghem
(eds.), Logic Programming and Its Applications, 1986,
pp. 262-275.

[5] F. Pereira and D.H.D. Warren, "Definite Clause
Grammars for Language Analysis - A Survey of the
Formalism and a Comparison with Augmented
Transition Networks," Artificial Intelligence, Vol. 13,
1980, pp. 231-278.

[6] F. Pereira, "Extraposition Grammars," American Journal
of Computational Linguistics, Vol. 7, No. 4, 1981, pp.
243-256.

[7] P. Sells, Lectures on Contemporary Syntactic Theories,
Center for the Study of Language and Information,
1985.

[8] E.P. Stabler, Jr., "Restricting Logic Grammars," Proc.
of the AAAI Conference, 1986, pp. 1048-1052.

[9] E.P. Stabler, Jr., "Restricting Logic Grammars with
Government-Binding Theory," Computational
Linguistics, Vol. 13, No. 1-2, January-June, 1987, pp.
1-10.
