A Logic-Based Government-Binding Parser for Mandarin Chinese 
Hsin-Hsi CHEN
Department of Computer Science and Information Engineering 
National Taiwan University 
Taipei, Taiwan 10764, R.O.C. 
NTUT046@TWNMOE10.BITNET
Abstract 
Mandarin Chinese is a highly flexible and context-sensitive 
language. It is difficult to do the case marking and index 
assignment during the parsing of Chinese sentences. This 
paper proposes a logic-based Government-Binding approach to 
treat this problem. The grammar formalism is specified in a 
formal way. Uniform treatments of movements, arbitrary 
number of movement non-terminals, automatic detection of 
grammar errors beforehand, and clear declarative semantics are 
its specific features. Many common linguistic phenomena of
Chinese sentences are represented with this formalism, for
example, topic-comment structures, the ba-constructions, the
bei-constructions, relative clause constructions, appositive
clause constructions, and serial verb constructions. A simple
pronoun resolution is touched upon. The expressive
capabilities and the design methodologies show this mechanism 
is also suitable for other flexible and context-sensitive 
languages. 
1. Introduction 
Chinese is a highly flexible language. The same meaning
may be expressed in many different Chinese patterns. In
other words, Chinese provides many ways for native
speakers to express their feelings. For example, the English sentence
"I have told Mr. Lee that they want these books" corresponds to
several different patterns in Chinese, glossed here in English:
(a) I have told [np Mr. Lee] [s that they want these books].
(b) [np Mr. Lee]_i, I have told t_i [s that they want these books].
(c) I have told [np Mr. Lee] [s that [np these books]_j they want
t_j].
(d) [np Mr. Lee]_i, I have told t_i [s that [np these books]_j they
want t_j].
(e) [np Mr. Lee]_i, [np these books]_j, I have told t_i [s that they
want t_j].
These examples show a pattern specific to Mandarin Chinese: the
topic-comment structure. Topicalization may be deemed one of the
movement transformations. Examples (b) and (c) show an
object moved to a topic position. Examples (d) and (e) are
sentences with multiple topics. The more
predicates a sentence includes, the more topic positions it has,
and thus the more complicated the patterns that may be generated. This
is convenient for language users, but it makes this type of
language difficult to process by computer.
Chinese is also a highly context-sensitive language. Many
phenomena, e.g. index assignment and case marking,
depend on context information even within a single
Chinese sentence. The index assignments in the topic-comment
patterns shown above illustrate this point. Examples (d) and (e)
are both legal interpretations, but their bindings are different.
The former is a serial binding, and the latter is a crossed
binding. Serial binding is not always correct. For example, the
index assignment cannot be
* [np Mr. Lee]_i, [np these books]_j, I have told t_j [s that
they want t_i].
This is because the person who is told something must be
animate. Therefore index assignment, which is a
necessary step toward correct interpretation of natural language
sentences, is difficult by computer.
This paper proposes a Government-Binding approach to
deal with highly flexible and context-sensitive languages
such as Mandarin Chinese. It is organized as follows. Section
2 introduces the concepts of Government-Binding Theory.
Section 3 gives a formal definition of Government-Binding
based logic grammars. Section 4 demonstrates a Chinese
parser through several context-sensitive constructions, and
touches on simple pronoun resolution within a Chinese
sentence. Section 5 gives concluding remarks.
2. Government-Binding Theory 
Government-Binding (GB) Theory /Chomsky 1981, Sells
1985/ is the descendant of Transformational Grammar /Radford
1981/. Its simplified organization is shown in Figure 1. Move-α,
which is a general operation, moves anything anywhere
between d-structure and s-structure, and between s-structure
and logical form. GB Theory includes a series of modules that
contain constraints and principles which govern the movement
transformation.
The Projection Principle preserves the syntactic information
and the semantic information at each level (d-structure,
s-structure, and logical form) during the movement
transformation. Trace Theory postulates that there exist various
empty categories at various levels of mental representation.
Figure 1. Government-Binding Theory (levels: d-structure, s-structure,
logical form; modules: 1. Projection Principle, 2. Empty Category
Principle, 3. Binding Theory)
Thus, we must be able to verify the relationship
between the moved constituent and the empty constituent. GB
Theory provides several mechanisms for this verification. The
Empty Category Principle (ECP) says "A trace must be
properly governed." That is, we must find some α that
c-commands the trace β. Moreover, α binds β iff (a) α
c-commands β, and (b) α and β are co-indexed. These
definitions are based on the C-Command Condition, which
states the following:
    α c-commands β if and only if the first branching node
    dominating α also dominates β, and α does not itself
    dominate β.
It states a co-reference relation between a moved element and its
trace. The Subjacency Condition is given in the following:
    Any application of Move-α may not cross more than one
    bounding node.
It specifies island constraints on the moved constituents.
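The two conditions above can be checked mechanically on a parse tree. The following is a minimal Python sketch, not the paper's implementation: the `Node` class and the helper names `dominates`, `c_commands`, and `bounding_nodes_crossed` are illustrative assumptions, and the tree is a hand-built fragment for "the man who he met" with bounding nodes assumed to be s and np.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(eq=False)  # identity-based equality: two nodes may share a label
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def __post_init__(self) -> None:
        # Link each child back to this node so we can walk upwards.
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a is a proper ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """The first branching node dominating a also dominates b,
    and a does not itself dominate b (a == b is excluded by assumption)."""
    if a is b or dominates(a, b):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and dominates(node, b)


def bounding_nodes_crossed(antecedent: Node, trace: Node, bounding: Set[str]) -> int:
    """Count bounding nodes that dominate the trace but not the antecedent."""
    count = 0
    node = trace.parent
    while node is not None and not dominates(node, antecedent):
        if node.label in bounding:
            count += 1
        node = node.parent
    return count


# Hand-built tree for "the man who he met" (hypothetical example structure).
trace = Node("trace")
rel_pronoun = Node("rel_pronoun")
s = Node("s", [Node("np", [Node("pronoun")]), Node("vp", [Node("tv"), trace])])
rel = Node("rel", [rel_pronoun, s])
np = Node("np", [Node("det"), Node("noun"), rel])

print(c_commands(rel_pronoun, trace))                            # True
print(c_commands(trace, rel_pronoun))                            # False
print(bounding_nodes_crossed(rel_pronoun, trace, {"s", "np"}))   # 1
```

In this fragment the relative pronoun c-commands the trace and only one bounding node (s) lies between them, so the movement respects Subjacency.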
The Binding Theory /Sells 1985/ shown below is used for
simple pronoun resolution:
(Principle A) An anaphor is bound in its Governing Category.
(Principle B) A pronominal is free in its Governing Category.
(Principle C) An R-expression is free.
Anaphors include reflexives and reciprocals, pronominals
include pronouns, and R-expressions include all other noun
phrases.
3. A Government-Binding Based Logic Grammar 
Formalism 
The formal definition of Government-Binding based Logic
Grammars (GBLGs) is specified incrementally as
follows.
Definition 1. A Government-Binding based Logic Grammar
is a 6-tuple GBLG = (T, Σ, B, S, C, R) where:
(1) T is the set of lexical terminals. Each lexical terminal
is denoted by an atomic formula with a lexical category as its
predicate symbol.
(2) Σ is the set of non-terminals. Σ = Σ_P ∪ Σ_V ∪ Σ_M
∪ Σ_G where:
(a) Σ_P is the set of phrasal non-terminals. Each phrasal
non-terminal is represented by an atomic formula with a phrasal
category as its predicate symbol.
(b) Σ_V is the set of virtual non-terminals. Each virtual
non-terminal is specified by an atomic formula.
(c) Σ_M is the set of movement non-terminals. A
movement non-terminal takes one of the following two forms:
A <<< B or B >>> A, where A ∈ T ∪ Σ_P ∪ Σ_V and
B ∈ Σ_V. Σ_LM and Σ_RM denote the set of non-terminals A
<<< B and the set of non-terminals B >>> A, respectively.
(d) Σ_G is the set of goals. Each goal is denoted by a
literal.
(3) B ⊂ Σ_P is the set of bounding non-terminals. A
bounding non-terminal is a phrasal non-terminal with a bounding
node as its predicate symbol.
(4) S ∈ Σ_P is the start non-terminal.
(5) C is the set of logic connectives 'and' and 'or' that are
denoted by ',' and ';' respectively. A grammar element is
defined recursively in terms of logic connectives as follows:
(a) A lexical terminal L ∈ T is a grammar element.
(b) A phrasal non-terminal P ∈ Σ_P is a grammar element.
(c) A virtual non-terminal V ∈ Σ_V is a grammar element.
(d) A movement non-terminal M ∈ Σ_M is a grammar
element.
(e) A goal G ∈ Σ_G is a grammar element.
(f) If A and B are grammar elements, then (A,B) and
(A;B) are grammar elements.
The first five types are called basic grammar elements, and the
last one is a compound grammar element. Let G_I and G_E be the
set of basic grammar elements and the set of compound
grammar elements, respectively.
(6) R is the set of production rules. A production rule is
of the following form:
    X_0 --> X_1 C_1 X_2 C_2 ... C_(m-1) X_m
where X_0 ∈ Σ_P,
X_i ∈ G_E for 1 ≤ i ≤ m, and
C_i ∈ C for 1 ≤ i ≤ (m-1).
It is obvious that each production rule can be translated into a
sequence of production rules with the logical operator 'and'
only.
An example written with this formalism is shown as
follows. It captures relative clauses in English such as "The
man who he met is a teacher."
(r1) s --> np, vp.
(r2) np --> pronoun.
(r3) np --> det, noun.
(r4) np --> det, noun, rel.
(r5) vp --> tv, np.
(r6) vp --> tv, trace.
(r7) vp --> iv.
(r8) rel --> rel_pronoun <<< trace, s.
where T = {pronoun, det, noun, tv, iv, rel_pronoun},
Σ_P = {s, np, vp, rel},
Σ_V = {trace},
Σ_M = {rel_pronoun <<< trace}, and
B = {s, np}.
The rule (r8) states that a constituent in phrase structure s is
extraposed to the rel_pronoun position. Which constituent may
be moved from which position is specified by rule (r6).
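As an illustration of Definition 1, the sample grammar can be written down as plain data and checked for well-formedness. This is a minimal Python sketch under stated assumptions: the `Move` class, the constant names, and `check_grammar` are hypothetical, goals (Σ_G) and the connective ';' are left out, and every rule body is taken to be a conjunction.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Move:
    """Movement non-terminal: A <<< B ('left') or B >>> A ('right')."""
    landing: str    # A: an element of T, Sigma_P, or Sigma_V
    virtual: str    # B: must be a virtual non-terminal
    direction: str  # 'left' for <<<, 'right' for >>>


Element = Union[str, Move]
Rule = Tuple[str, List[Element]]

T = {"pronoun", "det", "noun", "tv", "iv", "rel_pronoun"}
SIGMA_P = {"s", "np", "vp", "rel"}
SIGMA_V = {"trace"}
BOUNDING = {"s", "np"}
START = "s"

RULES: List[Rule] = [
    ("s", ["np", "vp"]),                                    # r1
    ("np", ["pronoun"]),                                    # r2
    ("np", ["det", "noun"]),                                # r3
    ("np", ["det", "noun", "rel"]),                         # r4
    ("vp", ["tv", "np"]),                                   # r5
    ("vp", ["tv", "trace"]),                                # r6
    ("vp", ["iv"]),                                         # r7
    ("rel", [Move("rel_pronoun", "trace", "left"), "s"]),   # r8
]


def check_grammar(rules: List[Rule]) -> List[str]:
    """Return a list of violations of the membership conditions in Definition 1."""
    errors = []
    plain = T | SIGMA_P | SIGMA_V
    if START not in SIGMA_P:
        errors.append("start symbol is not phrasal")
    if not BOUNDING <= SIGMA_P:
        errors.append("bounding set is not a subset of the phrasal non-terminals")
    for head, body in rules:
        if head not in SIGMA_P:
            errors.append(f"rule head {head!r} is not phrasal")
        for x in body:
            if isinstance(x, Move):
                if x.landing not in plain or x.virtual not in SIGMA_V:
                    errors.append(f"ill-formed movement non-terminal in rule for {head!r}")
            elif x not in plain:
                errors.append(f"unknown grammar element {x!r} in rule for {head!r}")
    return errors


print(check_grammar(RULES))  # [] for the sample grammar
```

Keeping the movement non-terminal as its own record makes the landing site A and the virtual non-terminal B explicit, which is what the later definitions operate on.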
Definition 2. For X ∈ Σ_P, Y ∈ Σ_V, and a transitive
relation TR, X TR Y if
(1) X is the rule head of a production rule, and Y is a
grammar element in its rule body, or
(2) X is the rule head of a production rule, I ∈ Σ_P is a
grammar element in its rule body, and I TR Y, or
(3) there exist I_1, I_2, ..., and I_n ∈ Σ_P, such that X TR I_1
TR I_2 TR ... TR I_n TR Y.
The transitive relation TR is also a domination relation, because
it relates a phrasal non-terminal to a virtual non-terminal that
it dominates.
Definition 3. A production rule X_0 --> X_1, X_2, ..., X_m
(where X_i ∈ G_I for 1 ≤ i ≤ m) is significant if it satisfies the
extra restrictions:
(1) for any grammar element X_i = (A <<< B) ∈ Σ_LM,
there must exist some X_j, i < j ≤ m, such that (X_j, B) ∈ TR.
(2) for any grammar element X_i = (B >>> A) ∈ Σ_RM,
there must exist some X_j, 1 ≤ j < i, such that (X_j, B) ∈ TR.
A logic grammar GBLG is significant if each production
rule in R is significant. The above sample grammar is
significant for the following reasons:
(1) The rules (r1) - (r7) are significant trivially.
(2) The rule
    rel --> rel_pronoun <<< trace, s
is significant because there exists a transitive relation TR_1 such
that s TR_1 vp TR_1 trace.
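Definitions 2 and 3 suggest a simple static check that can run before parsing. The sketch below is an assumption-laden Python illustration rather than the paper's Prolog system: movement non-terminals are encoded as hypothetical tuples `(operator, A, B)`, TR is computed as a fixed point over the rules, and `is_significant` tests the two restrictions of Definition 3.

```python
from typing import List, Set, Tuple, Union

Move = Tuple[str, str, str]            # (operator, landing site A, virtual B)
Element = Union[str, Move]
Rule = Tuple[str, List[Element]]

PHRASAL = {"s", "np", "vp", "rel"}
VIRTUAL = {"trace"}

RULES: List[Rule] = [
    ("s", ["np", "vp"]),
    ("np", ["pronoun"]),
    ("np", ["det", "noun"]),
    ("np", ["det", "noun", "rel"]),
    ("vp", ["tv", "np"]),
    ("vp", ["tv", "trace"]),
    ("vp", ["iv"]),
    ("rel", [("<<<", "rel_pronoun", "trace"), "s"]),
]


def transitive_relation(rules: List[Rule]) -> Set[Tuple[str, str]]:
    """Pairs (X, Y): phrasal X dominates virtual Y (Definition 2), as a fixed point."""
    tr: Set[Tuple[str, str]] = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for x in body:
                if isinstance(x, tuple):      # movement elements add no domination
                    continue
                if x in VIRTUAL:              # clause (1)
                    new = {(head, x)}
                elif x in PHRASAL:            # clauses (2) and (3)
                    new = {(head, y) for (i, y) in tr if i == x}
                else:
                    new = set()
                if not new <= tr:
                    tr |= new
                    changed = True
    return tr


def is_significant(rule: Rule, tr: Set[Tuple[str, str]]) -> bool:
    """Definition 3: A <<< B needs a later X_j with X_j TR B; B >>> A an earlier one."""
    _, body = rule
    for i, x in enumerate(body):
        if not isinstance(x, tuple):
            continue
        op, _landing, virtual = x
        side = body[i + 1:] if op == "<<<" else body[:i]
        if not any(isinstance(y, str) and (y, virtual) in tr for y in side):
            return False
    return True


TR = transitive_relation(RULES)
print(sorted(TR))
print(all(is_significant(r, TR) for r in RULES))  # True
```

A rule such as `rel --> s, rel_pronoun <<< trace` would be rejected by this check, since no element after the movement non-terminal dominates a trace.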
Proposition 1. The c-command condition is embedded
implicitly in GBLGs if these grammars are significant.
Proof. For a significant production rule
    X_0 --> X_1, X_2, ..., X_m
if X_i = (A <<< B) ∈ Σ_LM then there must exist some X_j (i < j
≤ m) such that X_j dominates the virtual non-terminal B in some
other production rule. The phrasal non-terminal X_0 is the first
branching node that dominates A and X_j, and thus also
dominates B. Therefore, A c-commands B. The case X_i = (B >>> A) ∈
Σ_RM is similar.
This property can be used to check the correctness of
grammars automatically before parsing.
Definition 4. The transitive relation TR_subjacency is a subset
of TR and satisfies the restriction: for X ∈ Σ_P, Y ∈ Σ_V, X
TR_subjacency Y if X TR I_1 TR I_2 TR ... TR I_n TR Y, and there
does not exist more than one I_j such that I_j ∈ B.
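One way to read Definition 4 operationally is as a bounded search over rule bodies. The Python sketch below is an interpretation, not the paper's algorithm: it assumes each step of the chain is a direct "appears in the rule body" link, counts only the intermediate non-terminals I_j (not X itself), and drops the movement marker from rule (r8) because it does not affect domination. The function name `tr_subjacency` and its `limit` parameter are hypothetical.

```python
from typing import Dict, List, Set, Tuple

PHRASAL = {"s", "np", "vp", "rel"}
VIRTUAL = {"trace"}
BOUNDING = {"s", "np"}

# Sample grammar with rule bodies flattened to plain symbols.
RULES: List[Tuple[str, List[str]]] = [
    ("s", ["np", "vp"]),
    ("np", ["pronoun"]),
    ("np", ["det", "noun"]),
    ("np", ["det", "noun", "rel"]),
    ("vp", ["tv", "np"]),
    ("vp", ["tv", "trace"]),
    ("vp", ["iv"]),
    ("rel", ["rel_pronoun", "s"]),
]


def tr_subjacency(rules, phrasal, virtual, bounding, limit=1) -> Set[Tuple[str, str]]:
    """Pairs (X, Y) linked by a domination chain with at most `limit`
    bounding non-terminals among the intermediate nodes."""
    children: Dict[str, Set[str]] = {}
    for head, body in rules:
        children.setdefault(head, set()).update(body)

    result: Set[Tuple[str, str]] = set()
    for x in phrasal:
        stack = [(x, 0)]          # (current node, bounding nodes crossed so far)
        seen = set()
        while stack:
            node, crossed = stack.pop()
            for child in children.get(node, ()):
                if child in virtual:
                    result.add((x, child))
                elif child in phrasal:
                    c = crossed + (1 if child in bounding else 0)
                    if c <= limit and (child, c) not in seen:
                        seen.add((child, c))
                        stack.append((child, c))
    return result


print(sorted(tr_subjacency(RULES, PHRASAL, VIRTUAL, BOUNDING)))
print(sorted(tr_subjacency(RULES, PHRASAL, VIRTUAL, BOUNDING, limit=0)))
```

With the default limit of one bounding node, rel still reaches trace through s, which is what rule (r8) needs; with `limit=0` only vp and s remain related to trace.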
Proposition 2. A significant logic grammar is a restrictive
context-sensitive grammar. This is because the truth value of a
movement non-terminal depends on the appearance of a virtual
non-terminal preceding or following it.
/Chen 1988/ proposes a bottom-up parsing system for
GBLGs. Figure 2 shows the execution of our sample grammar
for the sentence "The man who he met is a teacher". The label
on each arc indicates the step number during parsing. The empty
constituent trace is generated in phrase vp, then passed to
phrase s, and finally cut in phrase rel. Compared with other
logic programming approaches /Matsumoto 1983, McCord
1987, Pereira 1981, Stabler 1987/, especially RLGs /Stabler
1987/, GBLGs have the following features:
(1) the uniform treatment of leftward movement and
rightward movement,
(2) an arbitrary number of movement non-terminals in
the rule body,
(3) automatic detection of grammar errors before parsing.
The former two features are useful for expressing highly
flexible languages like Chinese.
Figure 2. Sample Parsing (bottom-up parse of "The man who he met
is a teacher"; the trace appears in vp(trace), is passed to
s(trace), and is cut in rel)
4. A Chinese Parser 
4.1 Topic-comment Structures 
Topic-comment structure is one of the distinctive features of
Mandarin Chinese. There are several interesting linguistic
phenomena concerning these structures:
(1) The topic may be moved from an argument position in
the comment - subject, direct object, or indirect object.
(2) Many categories may appear in the topic position, e.g.
n", s', v", or p".
(3) There may be multiple topics in a sentence.
(4) The comment may not contain a constituent which is
anaphorically related to the element in the topic.
Under the above observations, a topic may be represented as:
    topic(topic(N2bar),n2bar,Semantic,Index,Case) -->
        n2bar(N2bar,Semantic,Index,Case,Classifier).
The second argument of predicate topic specifies the phrasal
category of the topic, i.e., n2bar in this example. It is
important for the parser in deciding whether the constituent may
co-index with a trace.
Next, the production rules for generating sentences are
shown as follows:
    s1bar(s1bar(Topic1,Topic2,S)) -->
        topic(Topic1,Cat1,S1,I1,Case1)
            <<< trace(topic,info(Cat1,S1,I1,Case1)),
        topic(Topic2,Cat2,S2,I2,Case2)
            <<< trace(topic,info(Cat2,S2,I2,Case2)), s(S).
    s1bar(s1bar(Topic,S)) -->
        topic(Topic,Cat,S1,I1,Case1)
            <<< trace(topic,info(Cat,S1,I1,Case1)), s(S).
    s1bar(s1bar(S)) --> s(S).
Of these three production rules, the first two define the
"topic-comment" pattern, and the last one is a rule without a
topic.
Finally, the phrasal non-terminal s is introduced.
    s(s(N2bar,V2bar)) -->
        n2bar(N2bar,Semantic,Index,Case,Classifier),
        v2bar(V2bar,Semantic,Index,Case,subj,nonbei).
    s(s(t(Case,Index),V2bar)) -->
        trace(X,info(n2bar,Semantic,Index,Case)),
        v2bar(V2bar,Semantic,Index,Case,subj,nonbei).
    s(s(N2bar,V2bar)) -->
        n2bar(N2bar,S,I,C,Classifier)
            <<< trace(bei,info(n2bar,S,I,C)),
        v2bar(V2bar,S1,I1,C1,subj,bei).
    s(s(t(C,I),V2bar)) -->
        trace(relative,info(n2bar,S,I,C))
            <<< trace(bei,info(n2bar,S,I,C)),
        v2bar(V2bar,S1,I1,C1,subj,bei).
    s(s(V2bar)) --> v2bar(V2bar,_,_,_,nosubj,nonbei).
The first s rule is the normal case, i.e., no movement. Semantic
denotes the semantic feature of the head noun. It must be
unifiable with the semantic feature provided by the matrix verb
under type tree matching /McCord 1987/. The same logical
variable Case appears in the phrasal non-terminals n2bar and
v2bar. It means that the case of the subject is assigned by the matrix
verb externally according to θ-theory. The second s rule
captures any one of the movement transformations: relativization,
topicalization, ba-transformation, or bei-transformation. An
overt noun phrase is moved by such an operation, thus a
virtual non-terminal trace(X,info(n2bar,Semantic,Index,Case))
is left at the empty site. It specifies that only an n2bar can appear here,
and the kind of movement is not of concern. The semantic
feature and case are constrained by the matrix verb. The third s
rule deals with bei-transformation. For example,
    (The thief_i is arrested t_i by the police.)
The thief is not a logical subject of v2bar. The
real subject is the object of bei ('被'), i.e., the police. Thus, a
different group <S,I,C> of variables is used. The n2bar acts as
the object of v2bar or the subject of the embedded sentence.
The fourth s rule captures double movements for an n2bar. For
example,
    (The thief_i arrested t_i by the police escaped again.)
A left-moved constituent (the thief) is moved
rightward furthermore. In this rule, two virtual non-terminals
appear on both sides of the movement operator '<<<'. The fifth s
rule describes sentences without a subject. The atom nosubj
instead of subj specifies such a situation.
4.2 Noun Phrase
A noun phrase can be a pronoun, a simple noun, or a noun
plus other elements that act as pre-modifiers of that noun.
Those elements are (1) classifier phrases, (2) associative
phrases, and (3) modifying phrases. Only the associative phrase,
relative clause, and appositive clause are covered in the following.
An associative phrase denotes two noun phrases linked by the
special Chinese word de ('的'). For example,
    (the population of China).
The rule
    n2bar(n2bar(A,N2bar),Semantic,Index,Case,Classifier) -->
        asc(A),
        n2bar(N2bar,Semantic,Index,Case,Classifier)
represents this construction. The definition of the associative
phrase is:
    asc(asc(N2bar,De)) -->
        n2bar(N2bar,Semantic,Index,Case,Classifier),
        * de(De).
Both relative clause and appositive clause are nominalizations in
the form: nominalization + head noun, and are defined as
follows:
    rel(rel(S,De)) --> s(S), * de(De).
    app(app(S,De)) --> s(S), * de(De).
However, they differ in how they restrict the reference of
the head noun. The head noun that a relative clause modifies
refers to some unspecified participant in the nominalization
part. For example,
    (the farmer_i who t_i grows fruits), and
    (the fruits_i that they grow t_i).
The head noun (e.g., the fruits) refers to an empty constituent
(either subject or object) in the relative clause. This type of
construction can be considered a rightward movement. For
an appositive clause and head noun pair, the head noun does not
refer to any entity in the modifying clause, i.e., the appositive
clause. For example,
    (the matter concerning our renting a house).
The nominalization (our renting a house) serves
as a complement to the head noun (the matter). This type
of construction cannot be regarded as a movement
transformation. Two rules are specified for them:
    n2bar(n2bar(Rel,N2bar),S,I,C,Classifier) -->
        rel(Rel),
        trace(relative,info(n2bar,S,I,C1))
            >>> n2bar(N2bar,S,I,C,Classifier).
    n2bar(n2bar(App,N2bar),S,I,C,Classifier) -->
        app(App),
        n2bar(N2bar,S,I,C,Classifier).
The only difference between these two rules is that a trace has to be
found in the relative clause. Note that the cases of the empty
constituent and the overt constituent may be different in the relative
clause + head noun construction. For the sake of space, the
n1bar is omitted in this paper.
4.3 Verb Phrase 
Different from a noun phrase, a verb phrase may have
pre-modifiers and post-modifiers. The preverbal specifiers are
ba-phrases, bei-phrases, adverbial phrases, degree phrases,
preposition phrases, quantifier phrases, aspect, and modal.
The postverbal modifiers are sentential constructions, adverbial
phrases, quantifier phrases, classifier phrases, prepositional
phrases, and aspect. Only Serial Verb Constructions (SVCs)
are discussed in detail. The rule
    v2bar(v2bar(Va1bar,Vb1bar),S,I,[C1,C2],subj) -->
        v1bar(Va1bar,S,I,C1,subj),
        v1bar(Vb1bar,S,I,C2,subj)
means two separate events juxtaposed together, e.g.
    (I [v' bought a ticket] and [v' went in]).
It is one of the SVCs. The two events have the identical subject,
but cases may be different. The other groups of SVCs are:
(1) One verb phrase or clause serving as the direct object
of another verb, e.g.
    (I want to go to school.)
    (I want him to go to school.)
(2) Pivotal constructions, e.g.
    (I entrust him to take care of an affair.)
(3) Descriptive clauses, e.g.
    (She cooked a dish that I very much enjoyed eating.)
Only the former two are considered. The verbs with the first use
are classified into t2 and t3, and the verbs with the second use,
i.e., pivotal construction, are classified into t8. It is not easy to
define descriptive clauses with a rule or a new category, e.g.
POSSESSIVE /Yang 1987/. This is because the descriptive
clause is optional. Without this clause, the original sentence is
acceptable too. Furthermore, many verbs may be used with
descriptive clauses.
The lowest level v1bar (v') touches on the uses of the
subcategorization frames of the specified verb. According to
the frames and the ECP, a virtual non-terminal trace is placed
wherever it is needed. For example,
    v1bar(v1bar(T1,N2bar),Semantic,Index,Case,HasSubj) -->
        * t1(T1,HasSubj:Semantic:Case,Semantic1:Case1),
        n2bar(N2bar,Semantic1,Index1,Case1,Classifier).
    v1bar(v1bar(T1,t(Case1,Index1)),Semantic,Index,Case,
            HasSubj) -->
        * t1(T1,HasSubj:Semantic:Case,Semantic1:Case1),
        trace(X,info(n2bar,Semantic1,Index1,Case1)).
    v1bar(v1bar(T2,pseudoS(e(Case1,Index),V2bar)),
            Semantic,Index,Case,subj) -->
        * t2(T2,subj:Semantic:Case),
        v2bar(V2bar,Semantic,Index,Case1,subj).
The lexical category t1 denotes a transitive verb. Here, the trace
may be generated by any movement transformation. The third
rule is for SVCs. Note that v2bar should have a subject and share it
(Index) with the matrix verb. Thus, the semantic features of
the two are the same. However, cases may be different. That
is, one is assigned by the matrix verb, and the other one by the
embedded verb. The rules for other lexical categories are
omitted in this paper. The details can be found in /Lin 1989/.
4.4 Ba-construction 
Ba-construction is usually generated by ba-transformation,
which is one of the movement transformations. The direct
object is placed immediately after '把' (ba) and before the verb:
    subject '把' (ba) direct-object verb.
For example,
    (I sold all three books.)
However, there is another pattern for ba-construction:
    subject '把' (ba) object 1 verb object 2.
It is not constructed by movement transformation because some
noun phrase, i.e., object 2, appears after the verb. For example,
    (I ate three of the apples.)
It shows a part-whole relation between object 1 and object 2.
A well-performing parsing system must treat both patterns.
It is also easy to represent this construction
with our formalism.
4.5 Bei-construction 
Bei-construction is a familiar Chinese pattern of the following form:
    noun phrase 1 '被' (bei) noun phrase 2 verb.
For example,
    (The bird was let go (by me).)
Like the ba-construction, the bei-construction has a disposal
form, as shown below:
    (That door was kicked (by me) and a hole was left.)
The rules in Section 4.1 (topic-comment structure) capture the
above phenomena.
4.6 Pronoun Resolution 
Binding Theory can be rephrased as the following
procedures. Assume β is an anaphor, a pronominal, or an
R-expression depending on which principle is used. Each
element β may have two sets: a set of possible pairs and a set of
impossible pairs. These two sets are denoted by possible-pair
and impossible-pair respectively, and are defined as
follows:
    possible-pair(β) = { α | α can co-index with β },
    impossible-pair(β) = { α | α cannot co-index with β }.
(Principle A) For an acceptable sentence, try to find some α
such that α is in β's Governing Category and c-commands
β. Each α that is outside of this range should not have a
co-index relationship with β. This principle defines both sets
for β. For example,
    (* Mr. Lee_i said [s that you saw yourself_i].)
    possible-pair(self) = { you }, and
    impossible-pair(self) = { Mr. Lee }.
Both 'you' and 'Mr. Lee' c-command the reflexive 'self'.
The former is in the governing category of the reflexive,
but the latter is outside. So the index assignment
is not acceptable.
(Principle B) Those α's that are in the range of the Governing
Category and c-command β should not co-index with β.
This principle just says which α's cannot be in the candidate
set. However, we cannot determine whether those α's that are
in its range and do not c-command β co-index with β or not.
If such an α co-indexes with β, it must satisfy other criteria,
e.g. other binding principles, the same semantic feature, and so
on. Thus, this principle defines only the impossible-pair. For
example,
    (* [s Mr. Lee_i saw him_i].)
    impossible-pair(him) = { Mr. Lee }.
The phrase 'Mr. Lee' c-commands 'him', thus
they cannot be co-indexed based on Principle B. Consider
another example:
    (* [s He_i saw Mr. Lee_i].)
The R-expression does not c-command the pronominal.
According to Principle B, we have no way to determine their
binding relationship. But if Principle C is applied, it tells us
that the index assignment is wrong.
(Principle C) For any α where α c-commands β, α ought
not to have a co-index relationship with β. This principle says
nothing for those α's that do not c-command β. A set
impossible-pair is defined from this principle. For example,
    (* He_i said [s that you saw Mr. Lee_i].)
    impossible-pair(Mr. Lee) = { he, you }.
The pronominal 'he' c-commands 'Mr. Lee', so
they should have different indices.
Based on these three principles, a post-processing routine
embedded in the parser is used to determine the co-index
relationship between constituents from the parse tree. The
algorithm is simple: Traverse the parse tree and generate the
relations possible-pair and impossible-pair. If a relation is unknown
so far, a relation unknown is given temporarily. When a new
relation possible-pair or impossible-pair is obtained, use it to check
all the unknown relations, and retract the unknowns accordingly.
Finally, assign the anaphors and pronominals suitable indices
based on the relations possible-pair and impossible-pair.
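The three principles, as rephrased above, can be sketched as a small function that builds possible-pair and impossible-pair for one element β. This Python sketch is illustrative only: the function name `binding_pairs`, the string labels for the three kinds of noun phrase, and the hand-supplied `c_commands` and `in_gc` relations (which a real parser would read off the parse tree) are all assumptions.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]  # (alpha, beta)


def binding_pairs(beta: str, kind: str, candidates: Iterable[str],
                  c_commands: Set[Pair], in_gc: Set[Pair]):
    """Return (possible_pair, impossible_pair) for beta.

    kind       -- 'anaphor', 'pronominal', or 'r_expression'
    c_commands -- pairs (alpha, beta) such that alpha c-commands beta
    in_gc      -- pairs (alpha, beta) such that alpha is in beta's Governing Category
    """
    possible, impossible = set(), set()
    for alpha in candidates:
        cc = (alpha, beta) in c_commands
        inside = (alpha, beta) in in_gc
        if kind == "anaphor":            # Principle A: defines both sets
            if cc and inside:
                possible.add(alpha)
            else:
                impossible.add(alpha)
        elif kind == "pronominal":       # Principle B: only the impossible set
            if cc and inside:
                impossible.add(alpha)
        elif kind == "r_expression":     # Principle C: only the impossible set
            if cc:
                impossible.add(alpha)
    return possible, impossible


# "Mr. Lee said that you saw yourself."
pa = binding_pairs("self", "anaphor", ["Mr. Lee", "you"],
                   c_commands={("Mr. Lee", "self"), ("you", "self")},
                   in_gc={("you", "self")})

# "Mr. Lee saw him."
pb = binding_pairs("him", "pronominal", ["Mr. Lee"],
                   c_commands={("Mr. Lee", "him")},
                   in_gc={("Mr. Lee", "him")})

# "He said that you saw Mr. Lee."
pc = binding_pairs("Mr. Lee", "r_expression", ["he", "you"],
                   c_commands={("he", "Mr. Lee"), ("you", "Mr. Lee")},
                   in_gc={("you", "Mr. Lee")})

print(pa)
print(pb)
print(pc)
```

The three calls reproduce the sets given in the examples; candidates that fall into neither set correspond to the temporary relation unknown in the algorithm described above.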
5. Conclusion and Remarks 
Many natural languages are flexible and context-sensitive.
Mandarin Chinese is a well-known example. It is difficult to
capture the linguistic phenomena of these languages by
computer. This paper adopts GB Theory to deal with this
problem. According to GB Theory, the rule 'Move-α'
moves anything anywhere, and the universal principles operate
interactively to rule out the illegal movements. Thus, the only
things that should be declared in the grammars are:
(1) which phrases are the possible empty constituents,
(2) which positions are their possible empty sites,
(3) which positions are their possible landing sites,
(4) which phrasal categories are bounding nodes.
In such cases, a robust parser for natural languages can be
designed. As an example, we represent many context-sensitive
constructions in Mandarin Chinese, and do case marking and
index assignment for Chinese sentences. An experimental
Chinese parser is running under the following environment: (1)
VAX-11/785, (2) Quintus Prolog, (3) a lexicon with about 200
words (about 33K bytes), and (4) about 150 production rules
(about 112K bytes). Besides movement transformation,
pronoun resolution is another kind of index assignment. For good
treatment of pronoun resolution, syntactic knowledge is not
enough. This is because the Binding Theory tells us much about the
impossible pairs, but little about the possible pairs. Much more
semantic information should be included.
Moreover, our GB approach is also useful when we would
like to compose logical formulae from their syntactic
counterparts. The idea is that the mapping between d-structure
and s-structure, as well as that between s-structure and logical
form, are treated in a similar way. The movement transformation
between d-structure and s-structure tells us the relationship
among a verb and its accompanying arguments. The skeleton of
the given verb is defined in the lexicon, and base-generated in
the d-structure. For example,
    buy(Subject,Object).
The index assignment relates 'book' to the verb 'buy'
in the following sentence:
    (There is one book_i that every student bought t_i.)
Because the variable of the type 'book' and the second
argument of the template buy(Subject,Object)
should be the same in the logical form, the index (a
unique integer) can be changed into a variable, say X. That is,
they share the same variable as shown below:
    exist(X,book(X),forall(Y,student(Y),buy(Y,X))).
The formula tells us the SVO-SOV inversion in the logical
form. This phenomenon can be added into our parser easily
with our formalism. For details concerning the logical
interpretation of Chinese sentences, refer to /Chen 1989/.
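The step from indices to shared variables can be illustrated with a few lines of Python. This is only a string-building sketch under stated assumptions: the helper names `fill_template` and `quantify`, the index numbers, and the choice of variable names are hypothetical, and the scope ordering is simply fixed by hand to match the example formula.

```python
from typing import Dict, List


def fill_template(predicate: str, arg_indices: List[int],
                  index_to_var: Dict[int, str]) -> str:
    """Instantiate a verb template, replacing each index by its logical variable."""
    return f"{predicate}({','.join(index_to_var[i] for i in arg_indices)})"


def quantify(quantifier: str, var: str, restrictor: str, body: str) -> str:
    """Wrap body in a restricted quantifier: quantifier(Var,restrictor(Var),Body)."""
    return f"{quantifier}({var},{restrictor}({var}),{body})"


# Assumed indices: 1 = "every student" (subject), 2 = "one book".
# The trace in object position carries index 2, so 'book' and the
# second argument of buy(Subject,Object) end up sharing variable X.
index_to_var = {1: "Y", 2: "X"}

core = fill_template("buy", [1, 2], index_to_var)
formula = quantify("exist", "X", "book",
                   quantify("forall", "Y", "student", core))
print(formula)  # exist(X,book(X),forall(Y,student(Y),buy(Y,X)))
```

Because the trace and its antecedent carry the same index, mapping indices to variables is enough to make the moved noun phrase and the verb's argument position refer to the same variable.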

References 

Chen, H.H., I.P. Lin and C.P. Wu (1988) 'A New Design of
Prolog-based Bottom-up Parsing System with
Government-Binding Theory.' Proceedings of the 12th
International Conference on Computational Linguistics, pp.
112-116.

Chen, H.H. (1989) 'The Logical Interpretation of Chinese
Sentences.' Computer Processing of Chinese and Oriental
Languages 4(2,3), pp. 171-184.

Chomsky, A.N. (1981) Lectures on Government-Binding.
Foris Publication, Dordrecht, Holland.

Lin, I.P., S.F. Huang, H.H. Chen and K.W. Chui (1989)
The Study of the Knowledge Base in Mandarin Syntax (II).
Project Report, Department of Computer Science and
Information Engineering, National Taiwan University,
Taipei, Taiwan, R.O.C.

Matsumoto, Y., H. Tanaka, et al. (1983) 'BUP: A Bottom-up
Parser Embedded in Prolog.' New Generation Computing
1(2), pp. 145-158.

McCord, M.C. (1987) 'Natural Language Processing in
Prolog.' In: Walker, A. (Editor) A Logical Approach to
Expert Systems and Natural Language Processing.
Addison-Wesley Publishing Company, Inc., pp. 291-402.

Pereira, F. (1981) 'Extraposition Grammars.' American
Journal of Computational Linguistics 7(4), pp. 243-256.

Radford, A. (1981) Transformational Syntax. The Cambridge
University Press.

Sells, P. (1985) Lectures on Contemporary Syntactic
Theories. Stanford, Center for the Study of Language and
Information.

Stabler, E.P., Jr. (1987) 'Restricting Logic Grammars with
Government-Binding Theory.' Computational Linguistics
13(1-2), pp. 1-10.

Yang, Y. (1987) 'Combining Prediction, Syntactic Analysis
and Semantic Analysis in Chinese Sentence Analysis.'
Proceedings of the 10th International Joint Conference on
Artificial Intelligence, pp. 679-681.
