PRINCIPLE-BASED PARSING WITHOUT OVERGENERATION 1 
Dekang Lin 
Department of Computing Science, University of Manitoba 
Winnipeg, Manitoba, Canada, R3T 2N2
E-mail: lindek@cs.umanitoba.ca 
Abstract 
Overgeneration is the main source of computational 
complexity in previous principle-based parsers. This 
paper presents a message passing algorithm for 
principle-based parsing that avoids the overgenera- 
tion problem. This algorithm has been implemented 
in C++ and successfully tested with example sen- 
tences from (van Riemsdijk and Williams, 1986). 
1. Introduction 
Unlike rule-based grammars that use a large num- 
ber of rules to describe patterns in a language, 
Government-Binding (GB) Theory (Chomsky, 1981; 
Haegeman, 1991; van Riemsdijk and Williams, 
1986) explains these patterns in terms of more
fundamental and universal principles.
A key issue in building a principle-based parser is 
how to procedurally interpret the principles. Since 
GB principles are constraints over syntactic struc- 
tures, one way to implement the principles is to 
1. generate candidate structures of the sentence 
that satisfy X-bar theory and subcategoriza- 
tion frames of the words in the sentence. 
2. filter out structures that violate any one of
the principles.
3. accept the remaining structures as parse
trees of the sentence.
This implementation of GB theory is very ineffi- 
cient, since there are a large number of structures 
being generated and then filtered out. The prob- 
lem of producing too many illicit structures is called 
overgeneration and has been recognized as the culprit
of computational difficulties in principle-based
parsing (Berwick, 1991). Many methods have been 
proposed to alleviate the overgeneration problem 
by detecting illicit structures as early as possible, 
such as optimal ordering of principles (Fong, 1991)
and coroutining (Dorr, 1991; Johnson, 1991).
1 The author wishes to thank the anonymous referees for
their helpful comments and suggestions. This research was 
supported by Natural Sciences and Engineering Research 
Council of Canada grant OGP121338. 
This paper presents a principle-based parser that 
avoids the overgeneration problem by applying prin- 
ciples to descriptions of the structures, instead of 
the structures themselves. A structure for the input 
sentence is only constructed after its description has 
been found to satisfy all the principles. The struc- 
ture can then be retrieved in time linear to its size 
and is guaranteed to be consistent with the princi- 
ples. 
Since the descriptions of structures are constant-sized
attribute vectors, checking whether a structural
description satisfies a principle takes a constant
amount of time. This compares favorably to approaches
where constraint satisfaction involves tree
traversal.
The next section presents a general framework 
for parsing by message passing. Section 3 shows how 
linguistic notions, such as dominance and govern- 
ment, can be translated into relationships between 
descriptions of structures. Section 4 describes in- 
terpretation of GB principles. Familiarity with GB 
theory is assumed in the presentation. Section 5 
sketches an object-oriented implementation of the 
parser. Section 6 discusses complexity issues and 
related work. 
2. Parsing by Message Passing 
The message passing algorithm presented here is 
an extension to a message passing algorithm for 
context-free grammars (Lin and Goebel, 1993). 
We encode the grammar, as well as the parser, 
in a network (Figure 1). The nodes in the network
represent syntactic categories. The links in
the network represent dominance and subsumption 
relationships between the categories: 
• There is a dominance link from node A to B 
if B can be immediately dominated by A. The 
dominance links can be further classified ac- 
cording to the type of dominance relationship. 
• There is a specialization link from A to B if A 
subsumes B. 
The network is also a parser. The nodes in the 
network are computing agents. They communicate 
with each other by passing messages in the reverse 
direction of the links in the network. 
[Figure 1 graphic not reproduced. Legend: barrier; head dominance; specifier-dominance; complement-dominance; adjunct-dominance; specialization link.]
Figure 1: A Network Representation of Grammar 
The messages contain items. An item is a
triplet that describes a structure:
<surface-string, attribute-values, sources>, 
where 
surface-string is an integer interval [i, j] denoting
the i'th to j'th word in the input sentence.
attribute-values specify syntactic features, such as 
cat, plu, case, of the root node of the struc- 
ture described by the item. 
sources component is the set of items that describe 
the immediate sub-structures. Therefore, by 
tracing the sources of an item, a complete 
structure can be retrieved. 
The location of the item in the network deter- 
mines the syntactic category of the structure. 
For example, [NP the ice-cream] in the sentence
"the ice-cream was eaten" is represented by an item
i4 at NP node (see Figure 2):
<[0,1], ((cat n) -plu (nform norm)
-cm +theta), {i1, i3}>
An item represents the root node of a structure 
and contains enough information such that the in- 
ternal nodes of the structure are irrelevant. 
The message passing process is initiated by send- 
ing initial items externally to lexical nodes (e.g., N, 
P, ...). The initial items represent the words in the 
sentence. The attribute values of these items are 
obtained from the lexicon. 
In case of lexical ambiguity, each possibility is 
represented by an item. For example, suppose the 
input sentence is "I saw a man," then the word 
"saw" is represented by the following two items sent 
to nodes N and V:NP 2 respectively:
<[1,1], ((cat n) -plu (nform norm)), {}>
<[1,1], ((cat v) (cform fin) -pas
(tense past)), {}>
When a node receives an item, it attempts to 
combine the item with items from other nodes to 
form new items. Two items
<[i1,j1], A1, S1> and <[i2,j2], A2, S2>
can be combined if
1. their surface strings are adjacent to each
other: i2 = j1+1.
2. their attribute values A1 and A2 are unifiable.
3. their sources are disjoint: S1 ∩ S2 = ∅.
The result of the combination is a new item:
<[i1,j2], unify(A1, A2), S1 ∪ S2>.
The new items represent larger parse trees resulting
from combining smaller ones. They are then prop- 
agated further to other nodes. 
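The three combination conditions can be illustrated with a small Python sketch. It is not the C++ implementation described in Section 5: the names Item, unify and combine are hypothetical, attribute vectors are modeled as flat dictionaries of atomic values, and the Det item is assumed to arrive at NP with its cat attribute already blocked by the link it crossed.

```python
from typing import NamedTuple, Optional


class Item(NamedTuple):
    span: tuple          # (i, j): the item covers words i..j of the sentence
    attrs: dict          # attribute name -> atomic value, e.g. {"cat": "n"}
    sources: frozenset   # labels of the items this one was built from


def unify(a: dict, b: dict) -> Optional[dict]:
    """Merge two attribute vectors; return None if any value conflicts."""
    merged = dict(a)
    for name, value in b.items():
        if name in merged and merged[name] != value:
            return None
        merged[name] = value
    return merged


def combine(left: Item, right: Item) -> Optional[Item]:
    """Combine two items if the three conditions in the text hold."""
    if right.span[0] != left.span[1] + 1:       # 1. adjacent surface strings
        return None
    attrs = unify(left.attrs, right.attrs)      # 2. unifiable attribute values
    if attrs is None:
        return None
    if left.sources & right.sources:            # 3. disjoint sources
        return None
    return Item((left.span[0], right.span[1]), attrs,
                left.sources | right.sources)


# "the" and "ice-cream" as they might arrive at the NP node (assumed values).
det_at_np = Item((0, 0), {}, frozenset({"i1"}))
nbar_at_np = Item((1, 1),
                  {"cat": "n", "plu": "-", "nform": "norm", "theta": "+"},
                  frozenset({"i3"}))
np_item = combine(det_at_np, nbar_at_np)
```

Here np_item covers the interval (0, 1) and has sources i1 and i3. Node-specific constraints, such as the one that adds -cm at NP, are outside the scope of this sketch.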
The principles in GB theory are implemented 
as a set of constraints that must be satisfied dur- 
ing the propagation and combination of items. The 
constraints are attached to nodes and links in the 
network. Different nodes and links may have differ- 
ent constraints. The items received or created by a 
node must satisfy the constraints at the node. 
The constraints attached to the links serve as 
filters. A link only allows items that satisfy its con- 
straints to pass through. For example, the link from 
V:NP to NP in Figure 1 has a constraint that any 
item passing through it must be unifiable with (case 
acc). Thus items representing NPs with nominative 
case, such as "he", will not be able to pass through 
the link. 
By default, the attributes of an item percolate 
with the item as it is sent across a link. However, 
the links in the network may block the percolation 
of certain attributes. 
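A link that filters items and blocks the percolation of some attributes might be sketched as follows. The class Link, its transmit method and the choice of blocked attribute are illustrative assumptions, with attribute vectors again modeled as flat Python dictionaries.

```python
from typing import Optional


def unify(a: dict, b: dict) -> Optional[dict]:
    """Merge two attribute vectors; return None if any value conflicts."""
    merged = dict(a)
    for name, value in b.items():
        if name in merged and merged[name] != value:
            return None
        merged[name] = value
    return merged


class Link:
    """A network link with a filter constraint and a set of blocked attributes."""

    def __init__(self, constraint=None, blocked=()):
        self.constraint = constraint or {}
        self.blocked = set(blocked)

    def transmit(self, attrs: dict) -> Optional[dict]:
        """Return the attributes that percolate across the link, or None
        if the item is not unifiable with the link's constraint."""
        if unify(attrs, self.constraint) is None:
            return None
        return {k: v for k, v in attrs.items() if k not in self.blocked}


# The link from V:NP to NP only lets through items unifiable with (case acc).
acc_link = Link(constraint={"case": "acc"})
he = {"cat": "n", "case": "nom"}
him = {"cat": "n", "case": "acc"}
blocked_he = acc_link.transmit(he)      # filtered out
passed_him = acc_link.transmit(him)     # passes unchanged

# A hypothetical link that blocks percolation of the cat attribute.
spec_link = Link(blocked={"cat"})
det_attrs = spec_link.transmit({"cat": "d"})
```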
The sentence is successfully parsed if an item is 
found at IP or CP node whose surface string is the 
input sentence. A parse tree of the sentence can be 
retrieved by tracing the sources of the item. 
An example 
The message passing process for analyzing the sen- 
tence 
2 V:NP denotes verbs taking an NP complement. Similarly,
V:IP denotes verbs taking an IP complement, and N:CP
represents nouns taking a CP complement.
[Figure 2 graphic not reproduced. Panels: a. The message passing process; b. The parse tree retrieved.]
i1 = <[0,0], ((cat d)), {}>
i2 = <[1,1], ((cat n) -plu (nform norm) +theta), {}>
i3 = <[1,1], ((cat n) -plu (nform norm) +theta), {i2}>
i4 = <[0,1], ((cat n) -plu (nform norm) -cm +theta), {i1, i3}>
i5 = <[2,2], ((cat i) -plu (per 1 3) (cform fin) +be +ca +govern (tense past)), {}>
i6 = <[2,2], ((cat i) -plu (per 1 3) (cform fin) +be +ca +govern (tense past)), {i5}>
i7 = <[3,3], ((cat v) +pas), {}>
i8 = <[3,3], ((cat v) +pas +nppg -npbarrier (np-atts NNORM)), {i7}>
i9 = <[3,3], ((cat v) +pas +nppg -npbarrier (np-atts NNORM)), {i8}>
i10 = <[3,3], ((cat v) +pas +nppg -npbarrier (np-atts NNORM)), {i9}>
i11 = <[2,3], ((cat i) +pas +nppg -npbarrier (np-atts NNORM) (per 1 3) (cform fin)
+ca +govern (tense past)), {i6, i10}>
i12 = <[0,3], ((cat i) +pas (per 1 3) (cform fin) +ca +govern (tense past)), {i4, i11}>
Figure 2: Parsing the sentence "The ice-cream was eaten" 
(1) The ice-cream was eaten 
is illustrated in Figure 2.a. To avoid cluttering
the figure, we show only the items that are
involved in the parse tree of the sentence and their
propagation paths.
The parsing process is described as follows: 
1. The item i1 is created by looking up the lexicon
for the word "the" and is sent to the node
Det, which sends a copy of i1 to NP.
2. The item i2 is sent to N, which propagates it to
Nbar. The attribute values of i2 are percolated
to i3. The source component of i3 is {i2}. Item
i3 is then sent to NP node.
3. When NP receives i3 from Nbar, i3 is combined
with i1 from Det to form a new item i4.
One of the constraints at NP node is:
if (nform norm) then -cm,
which means that normal NPs need to be case-marked.
Therefore, i4 acquires -cm. Item i4 is
then sent to nodes that have links to NP.
4. The word "was" is represented by item i5, 
which is sent to Ibar via I. 
5. The word "eaten" can be either the past par- 
ticiple or the passive voice of "eat". The sec- 
ond possibility is represented by the item i7. 
The word belongs to the subcategory V:NP 
which takes an NP as the complement. There- 
fore, the item i7 is sent to node V:NP. 
6. Since i7 has the attribute +pas (passive voice), 
an np-movement is generated at V:NP. The 
movement is represented by the attributes 
nppg, npbarrier, and np-atts. The first two 
attributes are used to make sure that the 
movement is consistent with GB principles. 
The value of np-atts is an attribute vector,
which must be unifiable with the antecedent
of this np-movement. NNORM is a shorthand for
(cat n) (nform norm).
7. When Ibar receives i10, which is propagated
to VP from V:NP, the item is combined with
i6 from I to form i11.
8. When IP receives i11, it is combined with i4
from NP to form i12. Since i11 contains an np-movement
whose np-atts attribute is unifiable
with i4, i4 is identified as the antecedent of the
np-movement. The np-movement attributes in i12
are cleared.
The sources of i12 are i4 from NP and i11 from
Ibar. Therefore, the top level of the parse tree consists
of an NP node and an Ibar node dominated by the IP node. The
complete parse tree (Figure 2.b) is obtained by recursively
tracing the origins of i4 and i11 from NP
and Ibar respectively. The trace after "eaten" is indicated
by the np-movement attributes of i7, even
though the tree does not include a node representing 
the trace. 
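Retrieving a parse tree by tracing sources can be sketched in Python as below. The table ITEMS records, for each item of Figure 2, the node where it resides and its sources; the node labels are our reading of Figure 2, and the function names retrieve and bracket are hypothetical. Each item is visited once, so the retrieval is linear in the size of the tree.

```python
# label -> (node at which the item resides, labels of its sources)
ITEMS = {
    "i1": ("Det", []),
    "i2": ("N", []),
    "i3": ("Nbar", ["i2"]),
    "i4": ("NP", ["i1", "i3"]),
    "i5": ("Be", []),
    "i6": ("I", ["i5"]),
    "i7": ("V:NP", []),
    "i8": ("V", ["i7"]),
    "i9": ("Vbar", ["i8"]),
    "i10": ("VP", ["i9"]),
    "i11": ("Ibar", ["i6", "i10"]),
    "i12": ("IP", ["i4", "i11"]),
}


def retrieve(label, items=ITEMS):
    """Build a (node, children) tree by recursively tracing sources."""
    node, sources = items[label]
    return (node, [retrieve(source, items) for source in sources])


def bracket(tree):
    """Render a tree in labelled-bracket notation."""
    node, children = tree
    if not children:
        return node
    return "[" + node + " " + " ".join(bracket(c) for c in children) + "]"


parse = bracket(retrieve("i12"))
# parse == "[IP [NP Det [Nbar N]] [Ibar [I Be] [VP [Vbar [V V:NP]]]]]"
```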
3. Modeling Linguistic Devices
GB principles are stated in terms of linguistic con- 
cepts such as barrier, government and movement, 
which are relationships between nodes in syntactic 
structures. Since we interpret the principles with 
descriptions of the structures, instead of the struc- 
tures themselves, we must be able to model these 
notions with the descriptions. 
Dominance and m-command: 
Dominance and m-command are relationships between
nodes in syntactic structures. Since an item
represents a node in a syntactic structure, relationships
between the nodes can be represented by
relationships between items:
dominance: An item dominates its direct and indirect
sources. For example, in Figure 2, i4
dominates i1, i2, and i3.
m-command: The head daughter of an item repre- 
senting a maximal category m-commands non- 
head daughters of the item and their sources. 
Barrier 
Chomsky (1986) proposed the notion of barrier to 
unify the treatment of government and subjacency. 
In Chomsky's proposal, barrierhood is a property 
of maximal nodes (nodes representing maximal cat- 
egories). However, not every maximal node is a bar- 
rier. The barrierhood of a node also depends on its 
context, in terms of L-marking and inheritance. 
Instead of making barrierhood a property of the 
nodes in syntactic structures, we define it to be a 
property of links in the grammar network. That 
is, certain links in the grammar network are clas- 
sified as barriers. In Figure 1, barrier links have a 
black ink-spot on them. Barrierhood is a property 
of these links, independent of the context. This def- 
inition of barrier is simpler than Chomsky's since 
it is context-free. In our experiments so far, this 
simpler definition has been found to be adequate. 
Government 
Once the notion of barrier has been defined, the gov- 
ernment relationship between two nodes in a struc- 
ture can be defined as follows: 
government: A governs B if A is the minimal gov- 
ernor that m-commands B via a sequence of 
non-barrier links, where governors are N, V, 
P, A, and tensed I. 
Items representing governors are assigned 
+govern attribute. This attribute percolates across 
head dominance links. If an item has +govern at- 
tribute, then non-head sources of the item and their 
sources are governed by the head of the item if there 
are paths between them and the item satisfying the 
conditions: 
1. there is no barrier on the path. 
2. there is no other item with +govern attribute 
on the path (minimality condition (Chomsky, 
1986, p.10)). 
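The two path conditions can be checked as in the following Python sketch. The representation of a path as a list of hops, the names Hop and is_governed, and the placement of barriers in the example paths are illustrative assumptions rather than part of the implementation.

```python
from typing import NamedTuple, Optional


class Hop(NamedTuple):
    barrier: bool             # is the link crossed on this hop a barrier link?
    reached: Optional[dict]   # attributes of the intermediate item reached, or
                              # None when the hop ends at the +govern item itself


def is_governed(path) -> bool:
    """Check the two conditions on the path from a source item up to
    the item that carries +govern."""
    for hop in path:
        if hop.barrier:                                   # condition 1
            return False
        if hop.reached is not None and hop.reached.get("govern") == "+":
            return False                                  # condition 2 (minimality)
    return True


# From a subject NP up through a non-finite IP to the item of the governing verb.
ecm_path = [Hop(False, {"cat": "i"}), Hop(False, None)]
# The same, but with a barrier link on the way (e.g. through a CP).
cp_path = [Hop(False, {"cat": "i"}), Hop(True, {"cat": "c"}), Hop(False, None)]
# A closer governor (tensed I) intervenes.
tensed_path = [Hop(False, {"cat": "i", "govern": "+"}), Hop(False, None)]
```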
Movement 3
Movement is a major source of complexity in
principle-based parsing. Directly modeling Move-α
would obviously generate a large number of invalid 
movements. Fortunately, movements must also sat- 
isfy: 
c-command condition: A moved element must c-command
its trace (Radford, 1988, p.564),
where A c-commands B if A does not dominate
B but the parent of A dominates B.
The c-command condition implies that a movement 
consists of a sequence of moves in the reverse direc- 
tion of dominance links, except the last one. There- 
fore, we can model a movement with a set of at- 
tribute values. If an item contains these attribute 
values, it means that there is a movement out of the 
structure represented by the item. For example, in 
Figure 2.b, item i10 contains movement attributes:
nppg, npbarrier and np-atts. This indicates that
there is an np-movement out of the VP whose root
node is i10.
3 We limit the discussion to np-movements and wh-movements
whose initial traces are in argument positions.
The movement attributes are generated at the 
parent node of the initial trace. For example, V:NP 
is a node representing normal transitive verbs which 
take an NP as complement. When V:NP receives 
an item representing the passive sense of the word 
eaten, V:NP creates another item
<[i,i], ((cat v) -npbarrier +nppg
(np-atts (cat n))), {}>
This item will not be combined with any item from 
NP node because the NP complement is assumed 
to be an np-trace. The item is then sent to nodes 
dominating V:NP. As the item propagates further, 
the attributes are carried with it, simulating the effect
of movement. The np-movement lands at IP node
when the IP node combines an item from subject 
NP and another item from Ibar with np-movement 
attributes. A precondition on the landing is that 
the attributes of the former can be unified with the 
value of np-atts of the latter. Wh-movements are 
dealt with by attributes whpg, whbarrier, wh-atts. 
This treatment of movement requires that the 
parent node of an initial trace be able to determine
the type of movement. When a movement is gener- 
ated, the type of the movement depends on the ca 
(case assigner) attribute of the item: 
ca   movement   examples
+    wh         active V, P, finite IP
-    np         A, passive V, non-finite IP
For example, when IP node receives an item from 
Ibar, IP attempts to combine it with another item 
from subject NP. If the subject is not found, then 
the IP node generates a movement. If the item
represents a finite clause, then it has attributes +ca
(cform fin) and the movement is of type wh. Otherwise,
the movement is of type np.
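The choice of movement type from the ca attribute amounts to a one-line test, sketched here in Python. The function names and the dictionary encoding of attribute vectors are assumptions made for the illustration.

```python
def movement_type(attrs: dict) -> str:
    """Type of the movement generated at the parent of an initial trace:
    wh if the item is a case assigner (+ca), np otherwise."""
    return "wh" if attrs.get("ca") == "+" else "np"


def movement_attribute_names(kind: str) -> tuple:
    """Names of the three attributes that model a movement of this type."""
    return (kind + "pg", kind + "barrier", kind + "-atts")


passive_verb = {"cat": "v", "pas": "+", "ca": "-"}
finite_clause = {"cat": "i", "cform": "fin", "ca": "+"}

np_kind = movement_type(passive_verb)     # "np"
wh_kind = movement_type(finite_clause)    # "wh"
```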
4. Interpretation of Principles 
We now describe how the principles of GB theory 
are implemented.
X-bar Theory:
• Every syntactic category is a projection of a
lexical head.
• There are two levels of projection of lexical
heads. Only the bar-2 projections can be
complements and adjuncts.
The first condition requires that every non-lexical 
category have a head. This is guaranteed by a con- 
straint in item combination: one of the sources of 
the two items being combined must be from the 
head daughter. 
The second condition is implemented by the 
structure of the grammar network. The combinations
of items represent constructions of larger parse
trees from smaller ones. Since the structure of the 
grammar network satisfies the constraint, the parse 
trees constructed by item combination also satisfy 
the X-bar theory. 
Case Filter: Every lexical NP must be case-marked,
where A case-marks B iff A is a case assigner
and A governs B (Haegeman, 1991, p.156).
The case filter is implemented as follows: 
1. Case assigners (P, active V, tensed I) have +ca
attribute. Governors that are not case assigners
(N, A, passive V) have -ca attribute.
2. Every item at NP node is assigned an attribute
value -cm, which means that the item
needs to be case-marked. The -cm attribute
then propagates with the item. This item is
said to be the origin of the -cm attribute.
3. Barrier links do not allow any item with -cm
to pass through, because, once the item goes
beyond the barrier, the origin of -cm will not
be governed, let alone case-marked.
4. Since each node has at most one governor, if
the governor is not a case assigner, the node
will not be case-marked. Therefore, a case-filter
violation is detected if +govern -cm -ca
co-occur in an item.
5. If +govern +ca -cm co-occur in an item, then
the head daughter of the item governs and
case-marks the origin of -cm. The case-filter
condition on the origin of -cm is met. The -cm
attribute is cleared.
For example, consider the following sentences: 
(2) a. I believe John to have left. 
b. *It was believed John to have left. 
c. I would hope for John to leave.
d. *I would hope John to leave. 
The word "believe" belongs to a subcategory of verb 
(V:IP) that takes an IP as the complement. Since 
there is no barrier between V:IP and the subject 
of IP, words like "believe" can govern into the IP 
complement and case-mark its subject (known as 
exceptional case-marking in the literature). In (2a), the
-cm attribute assigned to the item representing [NP
John] percolates to V:IP node without being blocked
by any barrier. Since +govern +ca -cm co-occur in
the item at V:IP node, the case-filter is satisfied
(Figure 3.a). On the other hand, in (2b) the passive
"believed" is not a case-assigner. The case-filter
violation is detected at V:IP node (Figure 3.b).
[Figure 3 graphic not reproduced. Panels: a. Case-filter satisfied at V:IP; b. Case-filter violation at V:IP; c. Case-filter satisfied at Cbar, -cm cleared; d. The attribute -cm is blocked by a barrier.]
Figure 3: Case Filter Examples
The word "hope" takes a CP complement. It 
does not govern the subject of CP because there is 
a barrier between them. The subject of an infinitive
CP can only be governed by the complementizer "for"
(Figure 3.c and 3.d).
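The case-filter checks on a single item and at a barrier link can be sketched in Python as follows. The function names and the dictionary encoding of attribute vectors are assumptions for the illustration; the attribute values correspond to the items at V:IP in (2a) and (2b).

```python
from typing import Optional


def cross_barrier(attrs: dict) -> Optional[dict]:
    """Barrier links do not let items carrying -cm pass through."""
    return None if attrs.get("cm") == "-" else attrs


def case_filter(attrs: dict) -> Optional[dict]:
    """Apply the case-filter constraints to an item.

    Returns None on a violation (+govern -cm -ca); clears -cm when the
    governor is a case assigner (+govern +ca -cm); otherwise returns the
    attributes unchanged.
    """
    if attrs.get("govern") == "+" and attrs.get("cm") == "-":
        if attrs.get("ca") == "-":
            return None
        if attrs.get("ca") == "+":
            return {k: v for k, v in attrs.items() if k != "cm"}
    return attrs


# (2a): active "believe" governs and case-marks the subject of its IP.
believe = case_filter({"cat": "v", "govern": "+", "ca": "+", "cm": "-"})
# (2b): passive "believed" governs the subject but is not a case assigner.
believed = case_filter({"cat": "v", "govern": "+", "ca": "-", "cm": "-"})
# (2d): the -cm item cannot cross the barrier between "hope" and the subject.
blocked = cross_barrier({"cat": "i", "cm": "-"})
```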
θ-criterion: Every chain must receive one and
only one θ-role, where a chain consists of an NP
and the traces (if any) coindexed with it (van
Riemsdijk and Williams, 1986, p.245).
We first consider chains consisting of one element. 
The θ-criterion is implemented as the following constraints:
1. An item at NP node is assigned +theta if its 
nform attribute is norm. Otherwise, if the value 
of nform is there or it, then the item is as- 
signed -theta. 
2. Lexical nodes assign +theta or -theta to items
depending on whether they are θ-assigners (V,
A, P) or not (N, C).
3. Verbs and adjectives also have a subj-theta
attribute.
value          θ-role*   examples
+subj-theta    yes       "take", "sleep"
-subj-theta    no        "seem", passive verbs
*assigning θ-role to subject
This attribute percolates with the item from
V to IP. The IP node then checks the value of
theta and subj-theta to make sure that the
verb assigns a θ-role to the subject if it requires
one, and vice versa.
Figure 4 shows an example of the θ-criterion in action
when parsing:
(3) *It loves Mary 
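[Figure 4 graphic not reproduced: attribute propagation for "It loves Mary". The subject NP "It" carries -theta and -cm, the verb "love" carries +theta and +govern, the NP "Mary" carries +theta, and +subj-theta +govern +ca reach IP from Ibar.]
Figure 4: θ-criterion in action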
The subject NP, "it", has attribute -theta, which 
is percolated to the IP node. The verb "love" has 
attributes +theta +subj-theta. The NP, "Mary", 
has attribute +theta. When the items representing
"love" and "Mary" are combined, their theta attributes
are unifiable, thus satisfying the θ-criterion.
The +subj-theta attribute of "love" percolates with
the item representing "love Mary", which is propagated
to IP node. When the items from NP and
Ibar are combined at IP node, the new item has
both -theta and +subj-theta attributes, resulting in
a θ-criterion violation.
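The check performed at the IP node can be sketched in Python as follows. The function name theta_ok_at_ip and the dictionary encoding are hypothetical; the sketch assumes the check succeeds when either attribute is absent.

```python
def theta_ok_at_ip(subject_attrs: dict, ibar_attrs: dict) -> bool:
    """The subject needs a theta-role (+theta) exactly when the verb
    assigns one to its subject (+subj-theta)."""
    theta = subject_attrs.get("theta")
    subj_theta = ibar_attrs.get("subj-theta")
    if theta is None or subj_theta is None:
        return True           # nothing to compare (assumption of this sketch)
    return theta == subj_theta


it = {"cat": "n", "nform": "it", "theta": "-"}
john = {"cat": "n", "nform": "norm", "theta": "+"}
loves_mary = {"cat": "i", "subj-theta": "+"}
seems = {"cat": "i", "subj-theta": "-"}

it_loves_mary = theta_ok_at_ip(it, loves_mary)      # violation, as in (3)
john_loves_mary = theta_ok_at_ip(john, loves_mary)  # satisfied
```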
The above constraints guarantee that chains
with only one element satisfy the θ-criterion. We now
consider chains with more than one element. The
base position of a wh-movement is case-marked and
assigned a θ-role. The base position of an np-movement
is assigned a θ-role, but not case-marked.
To ensure that the movement chains satisfy the θ-criterion
we need only to make sure that the items
representing the parents of intermediate traces and
landing sites of the movements satisfy these conditions:
• None of +ca, +theta and +subj-theta is
present in the items representing the parents
of intermediate traces of (wh- and np-) movements
as well as the landing sites of wh-movements,
thus these positions are neither case-marked
nor assigned a θ-role.
• Both +ca and +subj-theta are present in the
items representing parents of landing sites of
np-movements.
Subjacency: Movement cannot cross more than
one barrier (Haegeman, 1991, p.494).
A wh-movement carries a whbarrier attribute. The 
value -whbarrier means that the movement has not 
crossed any barrier and +whbarrier means that the 
movement has already crossed one barrier. Barrier 
links allow items with -whbarrier to pass through, 
but change the value to +whbarrier. Items with 
+whbarrier are blocked by barrier links. When a 
wh-movement leaves an intermediate trace at a po- 
sition, the corresponding whbarrier becomes -. 
The subjacency of np-movements is similarly
handled with an npbarrier attribute.
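The behaviour of a barrier link with respect to subjacency can be sketched in Python as follows. The function names and the kind parameter are assumptions of the sketch; attribute vectors are modeled as dictionaries.

```python
from typing import Optional


def cross_barrier(attrs: dict, kind: str = "wh") -> Optional[dict]:
    """Send an item across a barrier link.

    An item whose movement has not crossed a barrier yet passes with the
    barrier attribute flipped to +; an item that already crossed one is
    blocked (None). Items without a movement of this kind are unaffected.
    """
    key = kind + "barrier"
    if key not in attrs:
        return attrs
    if attrs[key] == "+":
        return None
    return {**attrs, key: "+"}


def leave_intermediate_trace(attrs: dict, kind: str = "wh") -> dict:
    """Leaving an intermediate trace resets the barrier attribute to -."""
    return {**attrs, kind + "barrier": "-"}


moving = {"cat": "i", "whbarrier": "-"}
once = cross_barrier(moving)              # whbarrier becomes +
twice = cross_barrier(once)               # blocked: second barrier
reset = leave_intermediate_trace(once)    # whbarrier back to -
```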
Empty Category Principle (ECP): A trace or
its parent must be properly governed.
In the literature, proper government is not, as the term
suggests, subsumed by government. For example,
in
(4) Who do you think [CP e' [IP e came]]
the tensed I in [IP e came] governs but does not
properly govern the trace e. On the other hand, e'
properly governs but does not govern e (Haegeman,
1991, p.4 6).
Here, we define proper government to be a sub- 
class of government: 
Proper government: A properly governs B iff A
governs B and A is a θ-role assigner (A does not
have to assign a θ-role to B).
Therefore, if an item has both +govern and one of
+theta or +subj-theta, then the head of the item
properly governs the non-head source items and
their sources that are reachable via a sequence of
non-barrier links. This definition unifies the notions
of government and proper government. In (4), e is
properly governed by tensed I, and e' is properly governed
by "think".
This definition cannot account for the
difference between (4) and (5) (That-Trace Effect,
(Haegeman, 1991, p.456)):
(5) *Who do you think [CP e' that [IP e came]]
However, the That-Trace Effect can be explained by a
separate principle.
The proper government of wh-traces is handled
by an attribute whpg (np-movements are similarly
dealt with by an nppg attribute):
Value   Meaning
-whpg   the most recent trace has yet to be properly governed.
+whpg   the most recent trace has already been properly governed.
1. If an item has the attributes -whpg, -theta,
+govern, then the item is an ECP violation,
because the governor of the trace is not a θ-role
assigner. If an item has attributes -whpg,
+theta, +govern, then the trace is properly
governed. The value of whpg is changed to +.
2. Whenever a wh-movement leaves an intermediate
trace, whpg becomes -.
3. Barrier links block items with -whpg.
[Figure 5 graphic not reproduced: the noun "claim" (N:CP, -ca) taking the CP complement "that Reagan met e".]
Figure 5: An example of ECP violation 
For example, the word claim takes a CP com- 
plement. In the sentence: 
(6) *Who_i did you make the claim e'_i that
Reagan met e_i
there is a wh-movement out of the complement CP 
of claim. When the movement left an intermedi- 
ate trace at CSpec, the value of whpg became -. 
When the item with -whpg is combined with the item 
representing claim, their unification has attributes 
(+govern -theta -whpg), which is an ECP violation. 
The item is recognized as invalid and discarded. 
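The three whpg rules can be sketched in Python as follows. The function names are hypothetical and attribute vectors are modeled as dictionaries; the last lines replay the step at which the item for "claim" in (6) is rejected.

```python
from typing import Optional


def check_ecp(attrs: dict) -> Optional[dict]:
    """Rule 1: a pending trace (-whpg) that meets a governor is either
    properly governed (+theta) or an ECP violation (-theta)."""
    if attrs.get("whpg") == "-" and attrs.get("govern") == "+":
        if attrs.get("theta") == "-":
            return None                      # governor is not a theta-role assigner
        if attrs.get("theta") == "+":
            return {**attrs, "whpg": "+"}    # trace is now properly governed
    return attrs


def leave_intermediate_trace(attrs: dict) -> dict:
    """Rule 2: a new intermediate trace has yet to be properly governed."""
    return {**attrs, "whpg": "-"}


def cross_barrier(attrs: dict) -> Optional[dict]:
    """Rule 3: barrier links block items with -whpg."""
    return None if attrs.get("whpg") == "-" else attrs


# The complement CP of "claim" after the movement left a trace at CSpec.
cp_item = leave_intermediate_trace({"cat": "c", "whpg": "+"})
# Unifying with the item for the noun "claim": a governor, not a theta assigner.
with_claim = check_ecp({**cp_item, "govern": "+", "theta": "-"})
# The same configuration under a theta-assigning governor such as "think".
with_think = check_ecp({**cp_item, "govern": "+", "theta": "+"})
```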
PRO Theorem: PRO must be ungoverned
(Haegeman, 1991, p.263).
When the IP node receives an item from Ibar with
cform not being fin, the node makes a copy of the
item, assigns +pro and -ppro to the copy, and
then sends it further without combining it with any
item from (subject) NP node. The attribute +pro
represents the hypothesis that the subject of the 
clause is PRO. The meaning of -ppro is that the 
subject PRO has not yet been protected (from being 
governed). 
When an item containing -ppro passes through a 
barrier link, -ppro becomes +ppro which means that 
the PRO subject has now been protected. A PRO- 
theorem violation is detected if +govern and -ppro 
co-occur in an item. 
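The PRO hypothesis and its protection by a barrier can be sketched in Python as follows. The function names and the cform value "inf" are assumptions of the sketch; the text only states that cform is not fin.

```python
from typing import Optional


def hypothesize_pro(ibar_attrs: dict) -> Optional[dict]:
    """At IP: for a non-finite Ibar item, return a copy that assumes a
    PRO subject which is not yet protected from government."""
    if ibar_attrs.get("cform") == "fin":
        return None
    return {**ibar_attrs, "pro": "+", "ppro": "-"}


def cross_barrier(attrs: dict) -> dict:
    """Passing a barrier link protects the PRO subject."""
    if attrs.get("ppro") == "-":
        return {**attrs, "ppro": "+"}
    return attrs


def violates_pro_theorem(attrs: dict) -> bool:
    """A violation is detected if +govern and -ppro co-occur."""
    return attrs.get("govern") == "+" and attrs.get("ppro") == "-"


clause = hypothesize_pro({"cat": "i", "cform": "inf"})
# Governed before crossing any barrier: PRO would be governed.
exposed = {**clause, "govern": "+"}
# Governed only after crossing a barrier: PRO is protected.
protected = {**cross_barrier(clause), "govern": "+"}
```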
5. Object-oriented Implementation
The parser has been implemented in C++, an 
object-oriented extension of C. The object-oriented 
paradigm makes the relationships between nodes 
and links in the grammar network and their soft- 
ware counterparts explicit and direct. Communica- 
tion via message passing is reflected in the message 
passing metaphor used in object-oriented languages. 
[Figure 6 graphic not reproduced. Legend: instance of; subclass of; instance; class.]
Figure 6: The class hierarchy for nodes 
Nodes and links are implemented as objects. 
Figure 6 shows the class hierarchy for nodes. The 
constraints that implement the principles are dis- 
tributed over the nodes and links in the network. 
The implementation of the constraints is modular 
because they are defined in class definitions and all 
the instances of the class and its subclasses inherit 
these constraints. The object-oriented paradigm al- 
lows the subclasses to modify the constraints. 
The implementation of the parser has been 
tested with example sentences from Chapters 4- 
10, 15-18 of (van Riemsdijk and Williams, 1986). 
The chapters left out are mostly about logical form 
and Binding Theory, which have not yet been im- 
plemented in the parser. The average parsing time 
for sentences with 5 to 20 words is below half of a 
second on a SPARCstation ELC. 
6. Discussion and Related Work 
Complexity of unification 
The attribute vectors used here are similar to those 
in unification based grammars/parsers. An impor- 
tant difference, however, is that the attribute vectors
used here satisfy the unit closure condition
(Barton, Jr. et al., 1987, p.257). That is, non- 
atomic attribute values are vectors that consist only 
of atomic attribute values. For example: 
(7) a. ((cat v) +pas +whpg (wh-atts (cat p)))
b. * ((cat v) +pas +whpg (wh-atts (cat v)
(np-atts (cat n))))
(7a) satisfies the unit closure condition, whereas 
(7b) does not, because wh-atts in (7b) contains a 
non-atomic attribute np-atts. (Barton, Jr. et al., 
1987) argued that the unification of recursive at- 
tribute structures is a major source of computa- 
tional complexity. On the other hand, let a be the 
number of atomic attributes, n be the number of 
non-atomic attributes. The time it takes to unify 
two attribute vectors is a + na if they satisfy the 
unit closure condition. Since both n and a can
be regarded as constants, the unification takes only
a constant amount of time. In our current implementation,
n = 2, a = 59.
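Unification under the unit closure condition can be sketched in Python as follows. Attribute vectors are modeled as dictionaries in which a non-atomic value is itself a dictionary of atomic values; the function names are hypothetical. Because nesting is at most one level deep, unify compares at most a atomic values at the top level and a more for each of the n non-atomic attributes, in line with the a + na bound.

```python
from typing import Optional


def is_unit_closed(vector: dict) -> bool:
    """True if every non-atomic value contains only atomic values."""
    for value in vector.values():
        if isinstance(value, dict) and any(
                isinstance(inner, dict) for inner in value.values()):
            return False
    return True


def unify_atomic(a: dict, b: dict) -> Optional[dict]:
    """Unify two vectors of atomic values; None on conflict."""
    merged = dict(a)
    for name, value in b.items():
        if name in merged and merged[name] != value:
            return None
        merged[name] = value
    return merged


def unify(a: dict, b: dict) -> Optional[dict]:
    """Unify two unit-closed attribute vectors (one level of nesting)."""
    merged = dict(a)
    for name, value in b.items():
        if name not in merged:
            merged[name] = value
        elif isinstance(value, dict) and isinstance(merged[name], dict):
            inner = unify_atomic(merged[name], value)
            if inner is None:
                return None
            merged[name] = inner
        elif merged[name] != value:
            return None
    return merged


ex7a = {"cat": "v", "pas": "+", "whpg": "+", "wh-atts": {"cat": "p"}}
ex7b = {"cat": "v", "pas": "+", "whpg": "+",
        "wh-atts": {"cat": "v", "np-atts": {"cat": "n"}}}
```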
Attribute grammar interpretation 
Correa (1991) proposed an interpretation of GB 
principles based on attribute grammars. An at- 
tribute grammar consists of a phrase structure 
grammar and a set of attribution rules to compute 
the attribute values of the non-terminal symbols. 
The attributes are evaluated after a parse tree has 
been constructed by the phrase structure grammar. 
The original objective of attribute grammar is to 
derive the semantics of programs from parse trees. 
Since programming languages are designed to be un- 
ambiguous, the attribution rules need to be eval- 
uated on only one parse tree. In attribute gram- 
mar interpretation of GB theory, the principles are 
encoded in the attribution rules, and the phrase 
structure grammar is replaced by X-bar theory and 
Move-α. Therefore, a large number of structures
will be constructed and evaluated by the attribution 
rules, thus leading to a serious overgeneration prob- 
lem. For this reason, Correa pointed out that the 
attribute grammar interpretation should be used as 
a specification of an implementation, rather than an 
implementation itself. 
Actor-based GB parsing 
Abney and Cole (1986) presented a GB parser that 
uses actors (Agha, 1986). Actors are similar to ob- 
jects in having internal states and responding to 
messages. In our model, each syntactic category 
is represented by an object. In (Abney and Cole, 
1986), each instance of a category is represented 
by an actor. The actors build structures by creat- 
ing other actors and their relationships according to 
θ-assignment, predication, and functional-selection.
Other principles are then used to filter out illicit 
structures, such as subjacency and case-filter. This 
generate-and-test nature of the algorithm makes it
susceptible to the overgeneration problem.
7. Conclusion 
We have presented an efficient message passing al- 
gorithm for principle-based parsing, where 
• overgeneration is avoided by interpreting principles
in terms of descriptions of structures;
• constraint checking involves only a constant-sized
attribute vector;
• principles are checked in different orders at different
places so that stricter principles are applied
earlier.
We have also proposed simplifications of GB the- 
ory with regard to barrier and proper government,
which have been found to be adequate in our exper- 
iments so far. 

References 
Abney, S. and Cole, J. (1986). A government- 
binding parser. In Proceedings of NELS. 
Agha, G. A. (1986). Actors: a model of concurrent 
computation in distributed systems. MIT Press,
Cambridge, MA. 
Barton, Jr., G. E., Berwick, R. C., and Ristad, E. S. 
(1987). Computational Complexity and Natural 
Language. The MIT Press, Cambridge, Mas- 
sachusetts. 
Berwick, R. C. (1991). Principles of principle-based 
parsing. In Berwick, R. C., Abney, S. P., and
Tenny, C., editors, Principle-Based Parsing: 
Computation and Psycholinguistics, pages 1- 
38. Kluwer Academic Publishers. 
Chomsky, N. (1981). Lectures on Government 
and Binding. Foris Publications, Cinnaminson, 
USA. 
Chomsky, N. (1986). Barriers. Linguistic Inquiry 
Monographs. The MIT Press, Cambridge, MA. 
Correa, N. (1991). Empty categories, chains, and 
parsing. In Berwick, R. C., Abney, S. P., and
Tenny, C., editors, Principle-Based Parsing:
Computation and Psycholinguistics, pages 83-
121. Kluwer Academic Publishers. 
Dorr, B. J. (1991). Principle-based parsing for ma- 
chine translation. In Berwick, R. C., Abney,
S. P., and Tenny, C., editors, Principle-Based 
Parsing: Computation and Psycholinguistics, 
pages 153-184. Kluwer Academic Publishers. 
Fong, S. (1991). The computational implementation 
of principle-based parsers. In Berwick, R. C.,
Abney, S. P., and Tenny, C., editors, Principle- 
Based Parsing: Computation and Psycholin- 
guistics, pages 65-82. Kluwer Academic Pub- 
lishers. 
Haegeman, L. (1991). Introduction to Government 
and Binding Theory. Basil Blackwell Ltd. 
Johnson, M. (1991). Deductive parsing: The use 
of knowledge of language. In Berwick, R. C.,
Abney, S. P., and Tenny, C., editors, Principle- 
Based Parsing: Computation and Psycholin- 
guistics, pages 39-64. Kluwer Academic Pub- 
lishers. 
Lin, D. and Goebel, R. (1993). Context-free grammar
parsing by message passing. In Proceedings
of PACLING-93, Vancouver, BC. 
Radford, A. (1988). Transformational Grammar. 
Cambridge Textbooks in Linguistics. Cam- 
bridge University Press, Cambridge, England. 
van Riemsdijk, H. and Williams, E. (1986). Intro- 
duction to the Theory of Grammar. Current 
Studies in Linguistics. The MIT Press, Cam- 
bridge, Massachusetts. 
