Comparing two trainable grammatical relations finders 
Alexander Yeh 
Mitre Corp. 
202 Burlington Rd. 
Bedford, MA 01730 
USA 
asy~.mitre.org 
Abstract 
Grammatical relationships (Glls) form an im- 
portant level of natural  processing, 
but different sets of ORs are useflfl for different 
purposes. Theretbre, one may often only have 
time to obtain a small training corpus with the 
desired GI1. annotations. On su& a small train- 
ing corpus, we compare two systems. They use 
difl'erent learning tedmiques, but we find that 
this difference by itself only has a minor effect. 
A larger factor is that iLL English, a different GI/. 
length measure appears better suited for finding 
simple m:gument GI{s than ~br finding modifier 
GRs. We also find that partitioning the data 
ma W help memory-based learning. 
1 Introduction 
Grmnnmtical relationships (GRs), whidl in- 
clude arguments (e.g., subject and object) and 
modifiers, form an important level of natural 
 processing. Glls in the sentence 
Yesterday, my cat ate th, e food in the bowl. 
include ate having tile subject my cat, the ob- 
ject the food and the time modifier Ycstcr'- 
day, ~md t, hc .food having the location modifier 
in (the bowl). 
However, different sets of GRs are useful for 
dii%rent purposes. For exmnple, Ferro et al. 
(1999) is interested in semantic interpretation, 
and needs to differentiate between time, lo- 
cation and other modifiers. The SPARKLE 
project (Carroll et al., 1997), on the other lmnd, 
* This paper reports on work performed at, the MITRE 
Corporation under the support of the MITRE Sponsored 
Research Program. Marc Vilain, Lynette Hirsehman 
and Warren Greiff have helped make this work happen. 
Christine l)oran and John Henderson provided helpflfl 
editing. Copyright @2000 The MITRE Corporation. All 
rights reserved. 
does not differentiate between these types of 
modifiers. As has been mentioned by John Car- 
roll (personal communication), this is fine for 
infbrmation retrieval. Also, having less differ- 
entiation of tile modifiers can make it, easier to 
find them (Ferro et al., 1999). 
Unless the desired set of GRs matches the set 
already annotated in some large training col 
pus (e.g., the Buchholz el; al. (1999) GR finder 
used the GRs annotated in the Penn 3~'eelmnk 
(Marcus el; al., 1993)), one will have to either 
manually write rules to find tile GI{s or mmo- 
tate a training corpus tbr the desired set. Man- 
ually writing rules is expensive, as is annotating 
a large corpus. We have performed experiments 
on learning to find ORs with just a small an- 
notated training set. Our starting point is the 
work described in l?erro et al. (1999), which 
used a faMy smM1 training set. 
This paper reports on a comparison between 
the transforination-based error-driven learner 
described in Ferro et al. (1999) and the 
lnemory-based learner for GRs described in 
Buchholz et M. (1.999) on finding GIls to verbs 1 
by retraining the memory-based learner with 
tile data used in Ferro et al. (1999). We find 
that the transformation versus memory-based 
difference only seems to cause a small differ- 
ence in the results. Most of the result differ- 
ences seem to instead be caused by differences 
in tile representations and information used by 
tile learners. An example is that different GR 
length measures are used. In English, one mea- 
sure seems better fbr recovering simple argu- 
ment ORs, while another measure seems better 
ibr modifier GIl.s. We also find that partitioning 
the data sometimes helps melnory-based learn- 
1That is, ORs that have a verb as the relation target. 
For example, in Cats eat., there is a "subject" relation 
that has cat as the target and Cats as the source. 
1146 
ing. 
2 Differences Between the Two 
Systems 
Forro otal. (\].()00) alld Buchholz et al. (1999) 
both describe learning systems to find GRs. 
rl'he former (TI{) uses transformation-based 
error-driven learning (Brill and Resnik, 1994) 
aim the latter (MB) uses lnemory-bascd learn- 
ing (l)aelemans et al., 1999). 
In addition, there are other difl'erences. The 
TR system includes several types of inibrma- 
tion not used in the MB system (some because 
memory-based systems have a harder time han- 
dling set-wdued attributes): possible syntactic 
(Comlex) and semantic (Wordnet) classes of a 
c\]11111k headword, 1;11(', stem(s) and named-entity 
category (e.g., person, h)cation), if any, of a 
c\]mnk h eadword, lcxemes in a clmnk besides the 
headword, pp-attachment estimate and cerl;ain 
verb chunk properties (e.g., passive, infinitive). 
Some lexemes (e.g., coordinating COlljllllC- 
tions an(1 lmnctuation) are usually outside of 
any clmnk. The T12, system will store these in 
an attribute of the nearest chunk to the left; and 
to the right of such a \]eXellle. r.l'lle MB sys- 
tem represents such lexemes as if the, y arc Olle 
word chunks. Tim MB system cmmot use 1;11(; 
TI{ syste, m method of storage, l)ecaus('~ melnory- 
based systelns have difficulties with set-v~ducd 
al,l, ribtttes (value is 0 or \]now~ lexemes). 
'\['lie N,iB systellt (all(l llOt the '.\['1{ syStelll) 
a\]so exalnines the areal)or of commas an(t verb 
(:hunks crossed by a potential G12.. 
The si)acc of l)ossible GlTls searched 1)y the 
two systems is slightly different. The TI{, system 
searches fbr Gl~s of length three clmnks or less. 
The MB system set~r(-hes for GRs which cross 
at lllOSt either zero (target to the source's left) 
or one (to the right) verb (:lnulks. 
Also, slightly different are the chunks ex- 
mnined relative to a potential GR. Both sys- 
tems will examine the target and source chunks, 
plus the source's immediate neighboring chunks. 
The MB systeln also examines the source's se('~ 
end neighl)or to the left;. The Tll, system instead 
also exmnines the target's immediate height)ors 
and all the clmnks between the source and tar- 
get. 
The T12, system has more data partitioning 
than the MB system. With the TI{ syst:em, 
possible Gl{s that have a diit'erent source chunk 
tyl)e (e.g., noun versus verl)), o1" a different re- 
lationship type (c o. subiect versus ol)iect ) or \ ",% ", , 
direction or length (in chunks) are alwws con- 
sidered separately and will be afl'ected by differ- 
eat rules. The MB system will note such differ- 
ences, but lil W decide to ignore some or all of 
them. 
3 Comparing the Two Systems 
3.1 Experiment Set-Up 
One cannot directly coral)are the two systems 
from the descriptions given in Ferro et al. 
(1999) and Buchholz et al. (1999), as the re- 
suits in the descril)tions we, re based oll different 
(tatt~ sets and on different assumptions of what 
is known and what nee, ds to be fbund. 
Itere we test how well the systems 1)erform 
using the same snm\]l annotated training set, 
the a2.~).0 words of elementary school reading 
comprehension test bodies used in t, brro et al. 
(1999). 2 We are mainly interested in compar- 
ing the parts of the system that takes in syn- 
tax (noun, verb, etc.) chunks (also known 
as groups) and tinds the G12.s between those 
chunks. So for the exl)eriment , we used the gen- 
eral 'I'iMBL sysi;eln (l)aelemans et al., 1999) to 
just reconst;ruct the part of the MB systcan that 
takes in (:hmlks an(t finds G12s. Th(', input to 
1)ot\]1 this reconstructed part and the T\]-{ systonl 
is data that has been manually alHlotate(t for 
syntax chunks and ORs, ahmg with automatic 
lexeme and sentence segmentation altd t)art-of - 
sl)eech tagging. In addition, the '\].'12. system 
has nlmltlal nallie(t-e\]ltity allllOt3.tioll~ all(1 alltO- 
matic estimations for verb properties and inel)o- 
sition and sul)ordilmte conjmlction attachments 
(l~k',rro el; al., 1999). Because the MB system 
was originally desigued to handle Gll.s attached 
to verbs (and not noun to 1101111 O12S, etc.), We 
17311 the reconstructed part to only find Glis to 
w;rbs, and ignored other types of GRs when 
eomt)aring the reconstructed part with the T12. 
system. The test set is the 1151 word test set 
used in Ferro et al. (t999). Only G12s to verbs 
were examined, so the elt'eetive training set GR 
count fell ti'om 1963 to 1298 and test set C12. 
')Note that if wc had been trying to compare the two 
systems on a large mmotated training set, the, M\] ~ system 
would do better by default just lmcause the TR system 
wotlld take too long to l)roecss a large trailling set. 
1147 
(:ount from 748 to 500. 
3.2 Initial Results 
In looking at the test set results, it is useful to 
divide up the Gils into the following sub-tyl)es: 
1. Simple m'guments: subject, object, indirect 
object, copula subject and object, expletive 
subject (e.g., "It" in "It mined today. "). 
2. Modifiers: time, location and other modi- 
fiers. 
3. Not so simple arguments: arguments that 
syntactically resemble modifiers. These are 
location objects, and also sut)jects, objects 
and indirect objects that are attached via 
a preposition. 
Neither system produces a spurious response 
for tyl)e 3 Gils, but neither sysl;em recalls many 
of the test keys either. The reconstructed MB 
system recalls 6 of the 27 test key instances 
(22%), the TR system recalls 7 (26%). A pos- 
sible ext)lanation tbr these low performances is 
the lack of training data. Only 58 (3%) of the 
training data GR instances are of this type. 
The type 2 GRs are another story. There are 
103 instances of such Glls in the tess set key. 
Tile results are 
Type 2 C4II.s 
System Recall Precision F-s(:ore 
MB 47 (46%) 4!)% 47% 
TR 25 (24%) 64% 35% 
Recall is the number (and percentage) of the 
keys that m'e recalled. Precision is the number 
of correctly recalled keys divided by the munber 
of ORs the system claims to exist. F-score is the 
harmonic mean of recall (r) and precision (1)) 
percentages. It equals 2pr/(p+r). Here, the dif- 
ferences in r, p and F-score are all statistically 
significant, a The MB system performs better as 
measured by tile F-score. But a trade-off is in- 
volved. The MB system has both a higher recall 
and a lower precision. 
Tile t)ulk (370 or 74%) of tile 500 Gil, key 
instmmes in tile test set are of type 1 and most 
3When comparing differences in this paper, the sta- 
tistical signiticance of the higher score I)eing better than 
the lower score is tested with a one-sided test. Differ- 
cnccs deemed statistically significant m'e significant at 
the 5% level, l)ifferences deemed non-statistically signif 
leant are not significant at the 10% level. 
of these are either subjects or objects. Witll 
type J GRs, the results are 
Type 1 GRs 
System Recall Precision l?-score 
MB 23I. (62%) 66% 64% 
Til. 284 (77%) 82% 79% 
With these GRs, the TR system I)erforms con- 
siderably better both in terms of recall and pre- 
cision. The ditferences in all three scores are 
statistically significant. 
Because 74% of the GI-/. test key instances are 
of tyt)e 1, where the TR system performs better, 
this system peribrlns better when looldng at the 
results for all the test Gl{s coml)ined. Again, 
all three score diffferenees are statistically sig- 
nificant: 
Combined Results 
System Recall l?recision F-score 
MB 284 (57%) 63% 60% 
Til, 31(  (G3%) so% 
Later, we tried some extensions of the re- 
constructed MB systeln to try t;o lint)rove its 
overall result. We eould improve the overall re- 
sult by a combination of using the I\]71 search 
algorithm (instead of IG27~EE) in TiMBL, re- 
stricting the t)otential Gils to those that crossed 
no verb chunks, adding estimates on prepo,si- 
tion and complement attachments (as was done 
in TR) and adding infbrnlat, ion on verb chunks 
about 1)eing passive., an infinitive or an uncon- 
jugated present 1)articit)le. The overall F-score 
rose to 65% (63% recall, 67% precision). This is 
an improvement, but the Til. system is still bet- 
ter. The differences between these scores and 
the other MB and Til, confl)ined scores are sta- 
tistically significant. 
3.3 Exploring the Result Differences 
3.3.1 Type 2 GRs: modifiers 
The reconstructed MB system performs better 
at type 2 Gil,s. How can we account tbr this 
result difl'erence? 
Letting the TR system find longer GRs (be- 
yond 3 chunks in length) does not hell) nmch. 
It only finds one more type 2 Oil, in the test set 
(adds 1% to recall and 1% or less to precision). 
Rerumfing the TR system rule learning with 
an information organization closer to the MB 
system produces the stone 47% F-score as the 
1148 
M\]} sysl;(;nl (rc.ca.ll is \]<)w(~r, Iml; 1)rc(:isi(n, is 
high(;r). S1)c<:ifi(:a.lly, we ~,;()I: 1;his rc.sull; when 
I;\]IC ~I'\]{, sys{;CIII WaS l'Crllll wil;h 1lo informal;ion 
<)n l)p--al;l;a.(:hnl('nl;s, v(;rl) chunl( 1)r<)l>(,rl;i(;,~ (e.g.; 
l)aS,~dv('., ilflinil;ivc), nam(;d-cm;il;y lal)cls <)r hc.a.d- 
r 1 "1 wor(t sLems. Also, l;\]m .I l~. ,%ml;cm How (;xmn- 
in(;,q (;h<; <:hmtks cxamin(;(l 1)y l;h<' <)rJj,,ina\] M\]I 
s2/sl;('an: garg('.|;, '~ourcc and ,~our(:(;'.~ n('Jj,ihl)<)rs. 
Ill a(l<lil;ion, insl;c, ad of 6 al)s<>lul;e h;ngl,h ca.l;('- 
gories (l;arg;cL i,~; 3 <:\]ranks 1;() l;hc lcf|;, 2 (:\[mnks, 
\] clmnk, and similarly for I;11('. righl;), 1;he (\]l{.s 
c, m,~i<l('xc, l now jusl; fall Jill;t);/.11<t a.re \])m'i;it;i(m<'xt 
inL<) 3 re\]at;iv(; cat;(;gorics: i;argcl; is l;h(! tirsl; verb 
chunk 1;o Lhc lolL, ,~dmila.rly /;<) Lh<'. righl; and l;a.r- 
gel; is i;\]m :-~c.(:(m<\[ verb (:hun\]~ 1;<) l:\]~c rig\]~I;. ~l'h('. 
MI\]{ SNS(i('.III (;;~11 (\[isgin?;ui~',h l)('kwe('~n (;\]w+;e sam(; 
r(',\]al;iv(; <:a,1;('gori(',~. 
I h'.<loing l;hi,s 'l.'l{. sy,%<;~ n r<;i'm~ 'wil./m'l+,l, <:hi u ~l~ 
h('.a.(lwor<l ~;ynLa(;l;i<: (>r ,~('.mmg;\](: (:las~;(',s \])r<>- 
<lu<:cs z~ d(i(/> l:-s(:or<'~. If in a<hlii;i(/n, (,h(,: l)l)- 
al;La.(:linmnl;, v(,.rl) <:\]ulnk 1)rol)('.rl;y, nam<;<l--cHl;il;y 
tal)el and \]madword ,ql;<'.n~ intbrnml;i<)n are ad<l(:<l 
I)a(;l( in, i;t~('~ i:-s(:or(', aclamlly (h'ol),~ 1;()/J3<~). '.\[.'\]1(; 
<lifl'(;r<',nc<'.,", t)(;1;w<:cn l;hc,~(; d'\[(/), d(i(/> mM d3</) rc.- 
r~m ,~c<)r(;,~ :u:('~ H()I; ,%a,i;isl;i(:alls~ :;ig~fiti(',anl;. 
H<) wit;h I;yt)<; 2 (~1C4, MII ~y~;l;('.m',~; l)c(;lcr \])('.r- 
\['()rman(:(~ >c('.n~s 1;<) 1)(; l~tainly (lu(: il~'; a l)ilil;y \[;() 
~litl'er(,.nl;ia.I;e lh<; l)<>(;(,nl;ial CI:I,~ I)3~ 1,11('. \['ca.l;ur<~ <)t: 
l;\]t(; ~mml)<;r <>f v<!rt) (:\]mld;,~; <:r(>s>(;<l 1)3: a C,I{. In 
l)a.rl;:i<:uhu', makin~,~ l;hi,~ :/.l~<l a few <>l;hcr <:hm~,;cs 
I;() I;Im 'l'l{ ,%',~l;<;nl in(:r(,.m:;cs il;:; i:.,~;<:()r<~ l<) \[;11(; 
MB sysl;(;m's \]i'-~4<x>re, a\]l<t 1;\]1('. <)Lh<u' <:}m,ug<;s (r('- 
moviug (:cri;ain iniorm al;i<)n) <h)(>~ n()l lmv(' a, ,,~i,"-~ 
nifica,ni; ('Ke.cl;. So u~dng l;h(; rip,hi: lb.a,l;~u('+~ can 
(ll~\[cr(m(:('. make a, la,rgo '~' 
For I;hes<~ I;yl)e 2 (~'\]L~ (luoditicr~,) in En- 
glish, il, <lots s('~(un Llm|; l;hc. nmnb(u" ()\[ verb 
cJmnks (:ro,~,s(;<t is a, l)('l;l;(;r was~ i;o {~rou 1) l)OS -- 
si\])\]c, m()<lifi(;r,q Lhan I;11<: al)solul;(~ clmn\]( l<'~ngl;h. 
An cXaml)\](: is comt>aring \] ./ly o'a ~l~ac.sday. mM 
l ./1!/k,o'm,c .fl'o'm, k,(",'c o':~, \[l~u,('.,sd(~?/. in 1)oi;h s(',n- 
l;(;ll(:(',s, on T.,<'sday is a, l;imc m(>(liticr <)f Jly and 
on (:ro,~;s('~s no vca'l),~; 1:<) reach .fly (on al;l;ach('~s 
l;<) l;hc tirsl; verb 1;() ii;s left;). \]htl; in l;h(; fir,~;I; 
s(;nt(;ncc o'~, is ncxl; to Jly, while in l;h(; s(~(:<)n<l 
SCllt;(~llC(~ 1;lmr(~ arc l;\]ll"(x ~, chllil\]r~,q ,~c\]);l,r;i.{,ill~ o71~ 
aud .fly. 
a.a.~ Type 1 (-11.{.s: simple arguments 
\]~(/r Ly\])e \] (l\]{s, l;lm T\]{, sy,%cm t)(;rforms 1)('%l;(u'. 
How can w<' a,(:(:<)unl; for t;his'? 
Much <If 1;h(; cxt, ra inf()rnml;ion 1;11(; '.\['\] {. ,~;ysl;('.m 
cxamiHc,~ (comt)ar(;d to 1;he MB syslx;m) do(;s 
noI; seem 1;o have much of an ctk, cI;. When th(- 
'.1.'\]-{, sysl;Clll Wa,q rerun with no in\['orma.l;ion on 
\]ma(lwol'd synl;acLic or semanLic classes, nmnt'~<l- 
cnl;ity labels or h(;adword sLems, 1;h(', \].:-scc>r('~ in-- 
creased t}om 79% l;o 80%. Anol;lmr r(:;rllll t;lli~|; in 
addiLion had no information (m t)\])-~d,l;a(:hm(ufl; 
c, sl;ima,l;<;.s or any of 1;he non-hca(lwor<t,s in tim 
(:lmnks also ha<l an F-score of 80%. A 1;hir<l 
lCrllll t,hal; fllrl;h(;rlllor(; had no il~tin:nmLioll Oll 
verb ('.hunk 1)ro\])crl;ics (c..g., lmssivc., infinil;ivc) 
had an F-score of 78%. In Lifts set of F-,~t:ore.s, 
only l;he diffc.rences bc.Lwcen the 8()~) scor('.s and 
I;hc 78(X) s(:orc arc sl;at;isti(:ally signiiicaJfl;. 
Some M\]/ sysl;cni rcrun,q sliowcd t)~('l;ors thai; 
S(;CIlICd l;o ltta, l;l;(;r 11101"(;. hi {;\]lC \[\]1"s|; 17(~1'Ull~ w(; 
l)arl;il;ion(;(l t;h(; (la.La \])y t)ot;(;n/;ial (11{ ,~om'(:<~ 
<:hlud~ i;yt)c (c..g, lloun versus vcrl)) and ra.ll a 
SCl)aral;('. m<mlory-l)a.s(~(l l;railfiUt~ and l;('.~;i; \[or 
ca(:h t)arl;it;ion. '.\['hc. Colnl)ill(~d \]:-s(:<)rc in<:rcav;('.d 
fl'om 64% t;o (i9(fi~. At'Lcrwmds, wc made a r('- 
>m l;haL res<;nfl)l(,d I;h<; TI{. sysl;cm run wil;h l:hc, 
78(/0 \]'Ls(:t>rc, (ex<:('=\])l; l;\]mi; m(unory-l)asc(\] l<'m'n- 
ing was use<l): only C,I./s <)f lcngt~h 3 <:}mnk,~; <n" 
less were (:<)nsid(;rc<l, l;h<; data was l>arl;il;i<m<;<l 
(in :~<\](lil;ion I;<> s()urcc, <:\]mnk I;y\])c) l)y (111 lengl;\]l 
au<\[ (lir('.<:Li<>n ((, ,,' I;arg(;l; is l;w<> ('.\]mnks I;() l;h<~ 
l(;fL) an(l also t)y r<'\]al;i(>n l;y l)(; (:~c 1)~,ral;c rm Is ti)l 
<;a(:h 1,yl)('.), l;hc c<)111111~/. ~/.\]l(| V(}l'}) <:hunk (:ro,q:;- 
ing (:()unt;s wcr(; n<)l: (:on~d<h'.rc<t, aal(t l,h(; <:lmuk> 
normally cxa.min(~<l 1)y l;h(; '11{ ~ysl;<'m were cx-. 
mni,l(,.(t. 'l'\]fis fm'Lll('~r iJmrcmce<t Lh(; \]:- >;(:<>r('. lx) 
75%. In l;his s<;l; of F-scor('~s, all l;h('~ ditl'ercncc> 
are sLa.l;isl;i(:ally ,~ignilica.nl; and al\] I;\]m l -->;(:orc.s 
in Lifts seA; arc. slal;isl;ica.lly signifi(:a.nt;ly diffcrcnl; 
J'l'Olll (;lit', 'i'll. SySi;Cll\[ l'llllS wil;h l;h<" 78</) and 7!)</> 
\])'-s(:orcs. 
From l;he sLal;ist;ica.lly ,~;ignitit:ant; scow. <liif<'a- 
(;n(x's, iL s(;<:ms l;\]lal; t)arl;il;ionin/ (\[al;a 1)y l)()- 
l;ent;ia\] CI/. s(>m'(:e <:hunk l;yp<', hell)s (in(:rc.a.q<'~ 
from (id:(/> 1;o (i9%), as does Lh<'. resL <>f l;hc. 
imrl;ii;ioning l)(:rf<>rm(;d mM rim.king some slighl; 
clmngcs in what is (',xamincd (increase 1;o 75%), 
using |;rmlsformaLion-lmscd learning insl;ca(t of 
mcln<)ry-bas(',t learning (increase to 78(/~) an(l 
using v<',rl) chunk l)rot)(;rty informat, ion (in('r<',ase 
|:o 80%). 
In the. original MB sysl,cm run, Lhc somce 
clmnk Lyl)('= mid l;h('= 1)oi;('ag;ial (:,11 \]CltgLh 
an(t (lir('=(:l;Jon wcxe a\]rca(ly deLcrminc(l l)y I;hc 
m(',m(/ry-bas('d \]carn(;r l;o l)e l;h(', mosL iml)orl;mfl; 
1149 
attributes exanlined. So why would partition- 
ing the data and runs by the values of these 
attributes be of extra help? A possible answer 
is that for different values, the relative order 
of importance of the other attributes (as deter- 
nfined by the menlory-based learner) changes. 
For example, when the som'ce chunk type is a 
noun, the second most inlportant attribute is 
the source dlunk's headword when the target 
is one to the right, but is the source chunk's 
right neighbor's headword when tile target is 
one to the left;. Partitioning the data and runs 
lets these different relative orders be used. Hav- 
ing one combined data set and rUlL inealLS that 
only one relative order is used. Note that while 
this partitioning may 11ot; be the standard way 
of using memory-based learning, it; is consistent 
with the central idea in memory-based learning 
of storing all the training instances and trying 
to find tile "lmarest" training instance to a test 
case. 
Another question is why using 
transtbrlnatiol>based (rule) learning seems 
to be slightly better than nmmory-1)ased 
learning for these type 1 GFls. Memory-based 
learning keeps all of the training instances and 
does not try to find generalizations such as 
rules (Daelenlans el; al., 1999, Ch. 4). However, 
with type 1 Gl~s, a few simt)le generalizations 
can account for many of the, instances. In the 
nlanner of Stevenson (1998), we wrote a set 
of six simple rules that when run on the test 
set type 1 ORs produces an F-score of 77%. 
This is better than what our reconstructed MB 
system originally achieved and is (:lose to the 
TII. system's original results (close enough not 
to be statistically significantly different). An 
example of these six rules: IF (1) the center 
drank is a verb chunk and (2) is not considered 
as possibly passive and (3) its headword is not 
some fbrm of to be and (4) the right neighbor 
is a noun or ve, rb chunk, THEN consider that 
chunk to the right as 1)eing an object of the 
center ehuuk. 
4 Discussion 
ORs are important, but different sets of GR,s 
are useflfl for different imrposes. We have 
been looking at ways of ilni)roving automatic 
Oil, finders when one has only a small amount 
of data with tile desired Oil, mmotations. In 
this paper, we compared a transformation rule- 
based systeln with a menlory-based system oi1 
a small training corpus. We found that oll GIls 
that point to verbs, most of the result ditfer- 
ences can be accounted fbr by ditferences in the 
representations and information used. The type 
of GR determines which information is more im- 
portant. The rule versus memorpbased differ- 
ence itself only seeins to produce a small result 
difference. We also find that partitioning the 
data mw hell) melnory-based learning. 

References 

E. Brill and P. Resnik. 1994. A rule-based 
approach to prepositional phrase attachlnent 
disambiguation. In lath, lVn, te:r'national Cot@ 
on Computational Linguistics (COLING). 

S. Buchholz, J. Veenstra, and W. Daelelnans. 
1999. Cascaded grmmnatical relation assign- 
ment. In Joi'n,t SIGDAT' Cor@renee on Em- 
pir'ical Methods in NLP and Very Larqe Col 
pora (EMNLP/VLC'99). cs.CL/9906004. 

J. Carroll, T. Briscoe, N. Calzolari, S. Federiei, 
S. Montemagni, V. Pirrelli, G. Grefenstette, 
A. Sanfilippo, G. Carroll, and M. F\[ooth. 
19!)7. Sparkle work package 1, specification 
of phrasal imi'sing, final report. Available 
at; http://www, ilc.pi, cnr.it/sparkle/- 
sparkle, htm, Now;tuber. 

\¥. Daelemans, J. Zavrel, K. van der Sloot, and 
A. van den Bosch. 1999. Tilnbl: Tilburg 
memory based learner, version 2.0, refer- 
ence guide. ILK Technical lleport ILK 99-01. 
Available fl'om http ://ilk. kub. nl/~ilk/- 
papers/ilk9901, ps. gz. 

L. Ferro, M. Vilain, and A. Yeh. 1999. Lem'n- 
ing transtbrmation rules to find gralnmatieat 
relations. In Computa, tional natural la'n, guage 
lear'nin9 (CONL£-99), pages 43 52. EACL'99 
workshop, cs. CL/9906015. 

M. Marcus, B. Santorini, and M. Mareinkiewiez. 
1993. Building a large mmotated corpus of 
english: tile penn treebank. ComI)utational 
Linguistics, 19(2). 

M. Stevenson. 1998. Extracting syntactic 
relations using heuristics. In I. Kruijff 
KorbayovS, editor, Prec. of the 3'rd ESSLLI 
Student Scssion. Chapter 19. 
