Representing Information Need with Semantic 
Relations* 
Anil S. (',ha.lcr~w~rthy 
M IT Media, I,a,l,oratx)ry 
Abstract 
hll'oriiia,\];icm retriewd syM,eilis cai/1)e ilia, iI~, ili~)i'e ell',,(> 
l,ivc~ hy i)roviding lllOl'(! expressiw~ qliery l;-tligil:.l.g(,s \]'()1' 
ii.,.-;(!r,<-; 1.0 Slmcil'y their il/\['(3r1\[la,LiOll ile(!(i. This \[l;il)l!l' 
;uWms thai, ibis can he achhwed i llrough IJw use ~fl' 
sclHa, nl,i( rchd;ions as query t)rimitivc,s, and dcscrihcs 
a new techniqtie \['or exl,racLing SOlila.ii\],ic rel;d,i()ns \['roiii 
aJl i'niliue did, iOli;u'y, Ill c()lil, i'asl, 1;o cxisiing research, 
this LeC\[iliiqtic involw~s l, hc cvmpvs{livTz o\[ hnsic sc 
Ilia, Ill,i(: rela, l;ions, a, I)I'OCOSS akin I,o consl, rained spr<ul- 
ilig a, cl;iwi, t;ion in selna, ni,ic nei, w(ir\](s. 'l'hc pr, q)(ised 
l,echnique is ewthial, ed iu Lhc ccml;exl, of ~xtra,cl;ing sc 
iiianl;ic re\]a.l, ions IJla,i; are relewmt \[}~l' rel, riowl\] froiii a 
corpus o\[' picLures. 
1 Introduction 
Ill l'OCOl\[l; years> 1,\[liWO \]l~lS; I',etq/ COllsidera, hie ii/l;el'eS\], 
in alq)lying l,cchni(luc's \['l'C)lll COlnl)ut;al;ioual liuguisl;ic,q 
l;o hiiprove vaJ'iOll,'-J aSl)ecl,s o1' inl'orlnid,ion i'ctricv;d 
(ILL) \[1\]. This p~q)cr d0scrihcs ucw I.cclmiques lbr ~x- 
t:i';t(:l.illg SC, ilialil;iC I(nowlcdge fl'Olll online IiuguisLic i'c 
,qOlll'(:(~s in order Lo l)rovidc b~fl,ter iilo/:\[l()ds ~\[' express 
hig III, qliel'ies. 
\]lelkili and (h'o\['|, \[2\] ilso~(\[ I, hc t(!rlll i.dDr.l(tli<,~l ~c<d 
\];o charaxfl;erize Lhe IlSel'>S i liol,iwd;ion lT~i' using ;i, ll IlL 
sy,<-;t(~iii, hi clirroiil; IlL ,qysl,t!lllS~ ilsi!rs tirsi, l,r;llisiate 
t;heir il/\['Ol'llla, l;iOll lleed int,o queries. The Ill, sysl,011i 
pr(wesscs l:hese queries ~li(i IllatCllO,q 1,\[1(~111 against ,'qlll'- 
rogatl!s represenl,ing\] the to×i, (or lne<lia) colll~cl, i(ms l,i) 
rcl,rieve eleluenl,s of Lhe collccti{.i l,ha, l, a,i'e i)ossibly 
l;}ie lies\], ina,Lches \];o l:iio ilser's queric,.,. 
In lliOSl; inl})rmaLi{m rei;ricwd ,<-;y,~\],elil.q hi wid<~ il.q(!, 
l, he qllrrogaJ,l~s For the c, ollection are wr)rcls or word 
(:(JllocaLi(ms i,\[i;i,\], ;tl'(~ ciLht,r cxl,r~cci,ecl aul,oiiial,ic;dly ~)l' 
provided I)y a l/tllli;I, li ilidexei'. This ('Oli,%i'ahl~ IJSCI'N I,() 
express \],heir queries as Ctillii/illat, ioii,q ()\[' words ()i' '~w)l'd 
co\]local,ions, resulLillg ill ~tll ilHt, CClll';t,L/~ o1' \[liaCi(~(lil~-tl,(~ 
des(:rilH;i,::,n o\[' 1;h(;ir il/l'orlli;itioll ime(I, ,qlllOal;ol/ \[\[\] 
characl,eriz0,q l;he ideal as a c(nicel)i;uai in\['orina, tion 
*This I'(\]St2DA'C}! 'lV~l.S sul)porl,ed in pari \]>3' ;t. felh,wship \[runi Lhe 
\]nLcrva.\] \]{cse;i,l'ch (J~l'pOl.&Lion. I iLlil gr;lte\[ul t<> \]'~l'fl.li \[$Og, lll';I.(?V, 
Ken l{aasc, I,ouis Weitzman and Ih<: M;Lchilm {hl,lerstandlug 
group ;ll. the Media \[,ab tLr their fccdl);~ck .n this w.rk. Au 
thor's l,;-Ul~dh ;ulil(ll)mcdimnlll..cdu 
l'cl;l'ievaJ sysLcln, whcreill Ilscrs expross Lhcir IlCCd ~t,~ 
SOlllO colll\[)ill~l~iOll or COl/C(~,l)l;s ~LIIil th(~ sysI;(~l\[l lll;tl, ch(?8 
lhese l,o concept~ represent iug I, he underlying t;exI, or 
media colhml:ioll. ALtempLs have beell made Lo buihl 
sm'h sysLelus for spcci\[ic dotHains, Iml, how to aut,,> 
maJ, ically cxLracl; ~11(1 rcpreseJH; concepl,s in general is 
still Far From cle~w. 
Many researchers in comlmt~dJonal linp;uistic,~ haVo 
roc.gnized theft elect, tonic dictiom~ries could bc used 
to ~Mdl'ess t;his collceptuM inlSrmalAon botlJcncck 
(e.g., \[3, 4\]) and a, lot of work has I)een devoted 
hztoly 1,o (tXtl'&Cl,\[ll~ S(!/iI3Jlt,iC I'e\[aJ;iolls \])(fl;wcell words 
(e.g., \[5\]). This resca, rch allows queries to he cx 
pressed as cnmbinaticms not, ()lily of wor(ls but, Msc, 
o\[' scm;mtic rchd;ions. '1'() l,;d¢c a simple exanlple, hfl 
\[is }LHSIIIIIC I,\[lilJ, 1,\]1(~ sysl,t!ltl has ~L(:cess 1;o ;-L database 
oF ificture,~ of ;mim:ds with th,~ names of :miiHals as 
l\]le mlrl'ogaJ;l~s. If a user is inl,crestc(1 iu picturc,~ of 
(I.gs, the query \[X A-KINI)-()F (log\] is (:OlL%r/l(:t(!(\]. 
'l'}l(' sysl,Cl|l \])roc(,sst~s LILt" (lll(!ry (,o rel, ri(w(~ pictur(~s 
w\[ios(" siil'l'og&l,es al'(! WOl'(\]s like "hassct hotlll(J/' 
"i)(2;tK\[I!," }llld SO 011 SiliCe Lilt! qllcry IIUd;(:\]tes Lhe 
scumnl, i(: rchlLions, \[Imss~t hound A-KIN D() F dog\], 
\[beagle A-KINI) OF dog\], etc. Thus, scNi~ultic rcla 
l ious cHahlc the user I.o (:Oll:-Jl;rll0L queries Lhzrl. corre 
:.pond \]:(i enLire cla~ses of wor(l-hased queries. 
The niahi purl'Ji/se of this l)ad)cr i~ t,- &'scribe" 
iiew teclmiques for extracl, iug ~c\[ii;tlil,ic relaOions l,ll;d: 
wcr<" insl,ircd hy i;he work of Quilli:m \[(J\]. Q,lil- 
linn denionsl, raLe(I Lila, l: by organizing individuM s< 
ln~mt.ic relations inLo SCllt~tlll.ic net;works, one c~mld 
oI'd;~tin c(n,q>osii;ions o\[" exisLing SClUallLiC i'(!l~l, lii()llH 
I).y a process o\]' sl)rendiug ncl;iwtl, ion. l<'or (!x~/JllpI0, 
l,\]ie I,wo rl?\]aJ;iOllS, \[basset hl')liUd A-KIND ()1" dog;\] 
mid \[<.log IIAS-PA lit tail\]> c~m be coliil)OSe(l l;() yield 
\[hassei; hound IIAS-I'A 1{:1' tail\]. 
\]11 this palml' , we dcscrilm a I)rCtgl'~/.lll in which imli 
vidlml ,<~nliianl,io rcla, l,ion,s exl,ra, cl;ed l'roni a dictionary 
}II'(~ (:OIIlt)O>q(!(\] /,0 yield m~w SOIIlall{,i(: rel:d,i(ins Ion" l'e 
t, ricv;d over a dal,abase o\[' l)icl;urcs. We zulclress l,hree 
issues l,tiat, arise in this COlii/ecl, io\[l: 
* (Joul;ro\[ ()\[' sproa(liilg ax't, ivation: UnbOUll(ie(\] 
sprea.(lilig ax'tiwttiou o\['l;01l resultis ill gOl'lileCtillg 
word,'-; \]Jir(lugh rela, Liclns Llia, L do nol, rail ini, o a, liy 
desired type. To be ti,sc:\[u\[, ii, is liecessa, ry Lo coil 
ti'oi spreadir~g a,ctiwltion so l;hat only relai, ion,q oF 
737 
the desired types are /bnnd. 
* l,',quivalence of alternative compositions: De- 
pending on (;lie conligural, ion of the semantic net- 
work, there might be several acceptable alterna- 
tives that yield the same new conlposed relation. 
* Word-sense ambiguity: Since individual semantic 
relations are extracted from the dictionary text, it 
is necessary to constrahl i, he spreading activation 
to tile "correct" senses of the target word. 
Section 2 describes the test database that we have 
heen using and semantic relations that are useful for 
retrieval over this database. In Section 3, we de- 
scrihe a pattern-based approach that we haw~ em- 
t)loyed to control spreading activation and recognize 
alternative compositions. In Section 4, we present the 
results and analysis of a series of tests that we con- 
ducted to test the accuracy of" the progr~m~ and its 
coverage as coral)areal to a hand-coast, rutted systeul, 
WordNet \[7\]. This section also describes and ewthl- 
ales a new word-sense disambiguation technique that 
is based on knowledge of the semantic relations involv- 
ing tile ambiguous word. FinMly, Section 5 includes a 
brief sulnmary of the work and discusses issues that 
riced to be addressed in future work. 
2 A Database of Pictures 
The primary motivation for our work was to provide 
retriewd based on semantic relations for a corpus of 
pictures collected from the American lleritage Dic- 
tionary. The corpus contains 1359 pictm'es, each of 
which is annotated with a single word or word colic)- 
cation from tile dictionary. Clearly, there are a great 
many semantic relations that coil\](\[ he useful for re- 
trieval fl'om such a database. To narrow down the set 
of interesting sen,antic relations, we used the fact that 
the annotations are single words or word collocations. 
As in memory experiments in cognitive psychology, 
we used tile annotations as cues for flee reeMl by as- 
sociation. We then analyzed the resnlts to locate se- 
manticrelations that occurred most often. Based on 
l, his analysis, we picked the sewm relations shown in 
'Ddfie I (which we will henceforth call modes to dis- 
tinguish them from individual senlantic relations). 
The OCC, UI{S-WITII mode refers to typical phys- 
ical collocation of objects. It. is useful for making 
"intelligent" guesses about what else might be in the 
picture besides the objects explicitly annotated. As 
the example in q'ahle 1 shows, tiffs is not Mways 
symmetric. It can he argued that tile presence of 
an ax in a picture much more often indicates the 
presence of wood than the other way around. The 
P LAYS- RO bl'~O I '~ lnod e differs fl'ol n the F, XA M P L E- 
OF mode in having a connotation of typical use. The 
(,'HA RA( ITE1USTI(~-ACTIVITY mode is used to re- 
late both objects and agents to typicM activities they 
Mode Examt)le 
OCCURS-WITII (ax, wood) 
EXAMPLE-OI" (basset hound, (log) 
PLAYS-I~,OIA%OF (cat, pet) 
CHA I{,ACTF~IUSTI(L (ax, chol/ping) 
ACTIVV1W 
ItAS-PUIU'OSF, (aqueduct, water) 
CONSTI'PUI~N'I~-()F (balance beam, 
gymnastics) 
IIAS-C()NSTITUI~NT (<log, leg) 
Table 1: Modes used for picture database rctrievM 
are involved in. The IIAS-PUI{POS1/mode is used to 
relate an object to a word denoting its lmrpose. As 
ill the Table 1 example, that word could either denol, e 
an activity or another object where there ix a typicM 
activity involving both objects. (X)NSTITUEN'P-()I" 
and IIAS CONSTITUI'~NT are similar to the widely- 
used PAI{;I'-OI" and IIAS-I'ART primitives except 
that metaphorical inclusion is valid its well. The 
next section describes our scheme for extracting these 
modal relations front the dictionary t . 
3 Extracting Modal Relations 
from Dictionary Definitions 
\]~xtracting modal relations Deal dictionary detlnitions 
involves three components: a l)reprocessor that tags 
the dethfition with p~u't of-speech informatiolh a rood 
ule that pulls out triples (basic semantic relations of 
the (brm \[wordl I,INK-TYPI,; word2\[) from the pre- 
l/rocessed definition, and a pattern hlterl/reter that 
checks tile list of triples for modal relations using sets 
of patterns. We will now describe each of these in 
tllrll. 
For 1)reprocessing the dictionary definitions, we 
have experimented with two ditDrent Caggers: the Xe- 
rox PAR(J part-of-speech tagger \[8\], and the Chop- 
per \[9\], an optimizing finit, e state luachine-hased tag- 
her built at the MIT Media l,a}~ by Ken llaase. 13(fore 
tagging the delhfition, we apply a few simple lilters to 
remove botanical names, usage guidelines, etc. The 
perfornmnce of both taggers was satisfactorily high. 
The example below shows the output of t.he Xerox 
tagger on n slm'lple definition2: 
aqueduct: a conduit for water 
:AT :NN :IN :NN 
l All dmse expmqments have been run oil a Websters ,reline 
dictionary. The progranl is written in l.ucld (~ommon l,isp and 
rams on a 1)lZ, Cstatkm. 
2'\['he tags used are from the Brown corpus, e.g., :A'\[' = ar- 
ticle; :NN = singular noun; :IN = prepositiom 
738 
I. Use lil)rary ,.fl' A KINI)-OI*' ,ur I,;NTAII,S extrac 
ti<m p;-tl;{,Cl'IIS to \[O(:a,t(' the g(~\[lllS \[;0rill. I';x(.racl. 
triples from modilhq's o\[" the gcuus t.erHL 
2. ltera, I;e over the ,.lill'c:renl, ia.u construcl, iHg triph~s 
using eacl'~ of l, hem unl;il eiihm' the eml ~>1' the 
detinit, i(m or till no l,rilflC can he c()tislrucl;ed from 
the dill(~renl:ia lblmTl, 
:';. Apply l*OSl,-I)ro<'essittg t~tet, h<)ds t.(~ c(>77strt~ct 
<d,her tril)les usiug Itial,ching t'uh:s. 
I"igttre l: I'roc<x\[m'e for cxt,ract,hTg l, riph~s I'\['()H7 a pre 
I)rocess('d de\[itfitiou 
3.:1 Extract ing ~lS"iph~s 
The Mg;oril;hm fl)r (:xtra.cl.ing I, riptcs is buill, ()ll tim as 
sUnll)tiolt thai; <lict, icmary ,.hqiHiti<)n,~ :~ l,yl)h:ally c~msisl, 
(>1' a, genus 1,erlH (i(h,ntilyh~,e; its kiml) f.lh.w.d hy dit: 
f<'eutia.c (h(,w it is dilfercnt l'r(,m t,h., ,v;~,m~s) \[10\]. lu 
the aqueduct dclinil,ioH a.hov<~, the genus ter~H is "c(>h 
duit" (\[mlUeduct A-I~:INI):()I,' comhtil;\]) and I;he (rely 
dilfer<,m:ia is given hy lhe t'P "ft,' wat,er". Itl the ca:( 
of verb dcllnitions, the gemls tm'm is related hy the 
link "I",N'I'AII,S". The three stages of I;h{~ algorithm 
are i)resent;ed iH Figure 1+ 
We use ~'\[ wn'iel,y of l>al,l,erHs d<~scrihed in the liter: 
at, ure to extract the initiM gcmts term(s) c()rre(-tly \[hi 
(e.g., l);tLl,erns like % N I', .... <ql, hcr ()1" two i>hu'al:N I'," 
"o11<~ ()l+a family uf I)hu'al-.N P"). 'l'h(' l)att, lq'ns coHd)iuc 
both sylll.act, ic {711'.1 string eletHents, which makes th(qn 
more t',+)werl'tfl l, hatt purely st, ring-I>a,-,ed patJ,crns \[1 I I. 
,qin<:e it is very iml)(:,rt;mt; l,o \[iml I,h~. t~f?lTLit-; t<!rlll c(~r 
rcct.ly, a "lasl:-ditch" extracl.<)r is invoked if mine <)l' 
t, he stamlard l:~+dJ;crus work. 'l'hi,~ hu-;t dii,ch exl, ra/.:l,(ir 
~lS.qHIll(~b I.h+li. l.h(! tagger 7tlttSl; }lave 7iia(le a 7uisl.al,:e 
alTd tl'ies l,o c,')IU+l'+elTsal,c \['()I' cotlllTlOlt tagger mistake,; 
(c.g., i, aggh,g a,n ing l'orJn as v(>rl)insl,,.md <~f;ulj(~ctiv(>). 
()IIC(! {,}7(~ gOTlllS \[,O1'ii78 it;IV(' \[)een \['07lt7(1> w(, atmlyze 
l he m()rphological form ol +lhe modiliers for triples. 
For inst,~mce, siTlce "violin" i~ delhmd as % bowed itl 
strument.," the i;riple \[violin ()lUl';(/l' ()F bowinp;\] is 
r(~c<~7'd<.d 't , 
111 Step 2, each ~>1' th(" <lilli~ret~tia,(~ is as,'-;tuncd I<) he 
either a relal, ive chutsc or a prcpositiomd phras._,. As i7~ 
,qtel> I, lmml nouus (+t: verhs ;~r<~ l<)c~m'd for each (>f 1;17(' 
ditl'er<~utiae and result itt triples hciug; \['ortned widt the 
word(s) beiltg tnodilied, Whl,.r(~ I,)mre is atl,a.c\]Htteut, 
aml>iguit,y (as wit, h i)r<,l,.,sitional l,\[trases \[13\]), triples 
arc li)rmed for all l~ossihle al,t;~tchm<qfl,,'-;. 
,gl.e 1) :/ is ;~ I>OSt-t>ro<'e.,,sing st(q) which t'e.'mlls 
iH n('w t,riples being \['<)rlued and sont(! I,riplcs Irt)lli 
:+The lJrog:r&llt only lu,.ndh~s noun and vm-I) definit, i\[,ns. 
41(1 ;d\[, t.\]mt'e au'e a.I)<,u( 15 link types it~ tril)h:s, namely~ 
A KINI)-()I,', I,;NTAII,S, l'Algl' ()1,', \[IAS. I'AI.tI', A(;I;NT-()I+, 
()IgJ F,C'I'..OI,', WIT|I, I,'o1/, A~, ()1,', and s~w~.,al ~l/d, iM i)r,q.:) 
silions like IN aml ON \[1'2\], 
Step 2 being (4iminal,ed. For insta, itce, c(mshlcr 
the li41<,wing <hqinil, i<m of "acropolis": %cr()l). 
lis: the upl>er, forl,ilied part of a (h'eek cry (as 
Athens)," In Stup 3, t,wo tril>lcs i)rt.\]u.::(~d in SI;ep 2, 
\[acropolis A KINI)-()I" part\] and \[part ()1" tit,y\] at{, 
merged into \[;u:ropulis PAIU'=()I" city\]. Sinfihuqy. 
1,here arc rules I~u' cr<~d, iHg links of other types. ()ther 
p()sI, processing rules deal with elimiuatiug r<4"eretL<:('s 
t.<) A I(INI)-()F gcnu.~ I.erms in triples hy replacing 
them with l:hi~ (lellnien<hHH. 
3.2 Extracting Modal Relations from 
Triples 
For each nlodal relat, i<)tt iu 'l'ah\[e 1, there i, a set ()1' 
l),~d,t,ern,q thai: Sl>(~ci\[ies how l, hc modal relation can 
hc detect, cd from the triples of' one or ulor(, (letiui 
I,ions. Ea, cl', l>alJ,(~r~. (~,ucodes a, h<',urisl, ic rule t, hai, is 
based (>Tt the l, ril>h~s exlrp.cl,ed f'r(itn Llte dicl:iouary 
</(4inil, i(ms. l,'or ex~qHi)\] % one patl,er77 for exl;racl 
ing ()(',(',IJRS.WITII rchU,ions (me-des l,\[m hcurislJc 
l,hat, t,yl+ically if ()h.iect:l aud ()h.i~'ct~ are i\]p,'.lv(~<l iu 
I;hc saHm acti,m, \[()l>j<~ctl ()(;(~UI{S-WI'I'II ()h.ic(.t~\] 
\~rh(m this i)atterll is applied I,o I,h{~ (h4itiiti(m~ 
"ax: a <:utth~g t.<7<>1 tha, t c<)nsi~l,s ()1' a heavy 
edged head lixed to a handh' with the edge liar 
aiM to l;he IlalMh, aml tll;tl, is used c;-;I}c<:i;flly 
I'or felling l,l'e(~s ;tnd chol>ping and sF+litt, i~g w<,.(l', 
the two Hlodal rclati(,ns \[a×()(:(:UI{.S WITII trccj 
aml \[ax()(',(;IJII,S+WITII wood\] arc found. Pal. 
l,crn,', c+ul apply 1.o nmltil)le dc:llnitions as w~ql. 
Using the salHc heuristic a,s al)()ve, wc have de 
Ihlcd a pntl;ern tha, l. <!Xl, l';tct.s L\]I(! modal rehd.iou 
\[atomizer ()(J(IUI{S-WI'I'II spray\] from th~ two dG 
init.i(ms hclow: 
a|;Ollliz(w: ;711 iltsl, rllttl('~71t, for al;olltizhlg 71~4tl 
ally a perl'utue, (lisiufecl,~ml, or nmdicament 
at,,(lllliZ(~: to I'0(hlc(~ to lllill/ll;C particles or to 
a 1171e spray 
4 Performan(:e Evaluation 
'1'o ;ma\]yze the perf'ormance of the progra, Hi, we picked 
the Ih'st 300 annouLtions fi'onl Ge corpus of ph%ur(,s 
dcscrihed iu Secl, i<m 2. The Iirst part of this scc 
ti(m l)resents the results of applying lJm modal reta 
I:ion Iml:terns th;~t work on (me definido.. The sec.ml 
Imrt discu,'+ses diflicult:ies i.w4wM in the (w;dmU:i~,u ()1' 
II\]O(la,\[ l'('f{~(,i(~ll im, H;erns that, wc)t'k OII IIIOI'R til;qH (HR 
de/iuition. 
In order t,:> obtaiH a,:n imhT.:mdenl~ csl, iiHa.le {4" 
I,\[l(! pl!l'l'ur77t;t77cc: ()\]+ l,hc ext;r;-I.(:t:iOl7 pl'OL~l';tlll s W(! 
COllH>arc'(\] /,he oUt, l'.ut (1(" tim pr,.)gra~m where i)<;~.- 
sible t,o the <)Utl)Ul, ()1" WordNet, \[7\], which is a 
large, manually<:odc.d semantic :netwc)rk, W()rd 
Net, d{}es not haw~ liuks corresponding t,,~ \['(mr 
"i.,:., hMelmn(lent, of t, iu dicthmary we a.rc c~sin,v.; 
739 
-87--o 74~ I s:).2W .5~;(7-W:~Sl 2,~.<) --4- 
21~'2 - 129 ' 60.8z\] - -- -(i 
- 14-~6-- 114- 78.(~8 
1 ,~d- 7~5- 6,5.7~---- 9~ ~):~ 
Table 2: Performa.nce cff the modal relation exl;raction 
t) rograIH 
of l.he modes listed in Table 1: ()(XI:UI{.S-. 
WITH, I'I,AYS-R()LI';-OI", IIAS-PlURPOSI';, and 
CItA I{A(\]Tle, IUSTIC-ACTIVITY. For l;he el.her three 
Inodes~ We fotllld eorresl)Olldenc(?s I)y ass/tIilillg thai; 
all hypernyn,s are valid exan\]plcs of I,\]XAMPI,\[~ 
OF, all merollyll|s of HAS-CONSTITUI~,NT, and all 
holonyms of (X)NST1TUI",NT-OI r. The performance 
results and the comparison with WordNet are l)re - 
sen~,ed in Table 2. 
The sevell rows of 'l'al)le 2 correspoud t,o th(' sevell 
modal relations of Table 1. The lirst colmnn shows 
the total mtmher of modal relations ext;racted for a 
mode while I, he second c(/hunn gives the iuunbc:r of 
modal relations judged to be correct (by the ~mthor) 
with the t)ercotll;age figure showll ill I.he third (:ohltnll. 
The fourth column giw~s the number of such relations 
fouud ill WordNet, while the lifth gives the number of 
those relations that were also folmd by the extraction 
program (with the cohtmt\] after that; l)roviding the 
percenl;age llgure). The last cohlnm shows the uum- 
bet of modal extractiou patterns imt/leirlented f(ir the 
lilo(le. 
We will now briefly discuss t;he perl'()rlllallce Of {.lie 
program. A detMled analysis is presented it\] \[15\]. The 
precision of tim extraction is over 60% in ~11 cases; 
there are three main reasous for the precision not be- 
ing higher: 
o Many of the patterns (e.g., for O(XI~III{.S-WI'I'II 
or (:HAI{.A(JTI~\[USTI(;-A(\]TIVI'I'Y) implicitly 
assume that verbs deuote activity. This is not, 
l;rue of many verbs like "suggest", "repres(.IH:", 
llreSeHlble~ ~ ol;e. 
® Many patterns hinge ou the l)reseuce el'particular 
links (like "WITII" and "IN"), and preeision is 
dragged down by I;heir aml)iguil;y. 
* The tagger makes mistakes during l.he prepro- 
cessing resulting it\] h\]correcl; matches for l;he pat- 
terl\]s. 
The number of matches with WordNet was get\]e> 
ally low because WordNet uses word collocations as 
link destinations to construct more detailed hierar- 
chies. So, for instance, while WordNet has the link 
\[ae(-ordion \[\]YI'I!;RNYM free-reed instrument\], ttur 
program generates \[accordion I,\]XAMI'LE-()I ~ instru- 
lllell(;\]. 
We h;we not similarly analyzed f;he performance of 
patl, erns that operate ow~r two definitions. The main 
reason is t, hz~t to get an accurate estimate of the preei-. 
sion (as in the second cohunn of Table 2), we have to 
combine the dictionary definition of the t:esl; word with 
every other (lelh\]i(,ion in the (lic(,ionary. This work is 
in progress, llowever, it. is clear thai. word-seltse ~unbi 
gully can lead to i)oor perl'ornlance by running m(~(lal 
exla'action patterns over unintended ,senses of a word. 
For inst~mee, if we ret\]lrll l,o {,\]\](? "a\[,()l\]\]iT, er" exalnple 
at the end of the previous section, we tlnd that, wc are 
"sl)reading aetNa\]ion" thr(mgh the verl) "atomize." 
There is another sense of "atomize", viz., %o sul)jeet 
to atom 1)oml)ing," which is not of interest here mid 
should he ignored. We will now brielly describe a new 
wol,dosense disamlfiguation technique that is appliea- 
ble in this contexL A detailed discussion can be found 
hi \[1,5\], 
Krovei,z au(I (h'of/; \]l/l\] characterized the process of 
word-sense disamhiguation as bringing to bear sev 
eral kinds of evidence del)ending on the context of 
occurrence of l;he word, namely, l)arl;-of-speech, roof 
phology, subeategorizal;iou, sen,antic restrictions and 
suhjecl, elassitieations. Continuing in the ssune flame 
work, we deei(h~d to use the smnanl;ie relations in 
volving an antl)iguous word as another source of ev- 
idence. I,(% the anll)iguous word be denoted hy 
W,,,o,, and \[H/,,,,. I{ELATION W,,~I,\] I)e a triple in I,he 
Ih'st delhi\]lion of I.he modal relation 1)attern. '\]'hell, 
each deflnitiou of W,m/, which iuchMes the triple 
\[W, mt, IU~I,ATION-INVI';I{.St r, H4,.,~\] e;m be eonsid- 
(w(.(\[ a.s ;t correct sense (for spreading a(q;iwltion), 
where RI'3LA'I'I()N-INVIdlISIB is the inverse link type 
of IU';I,A'I'I()N (;. For Llle same 300 words as ill 'l'al)le 2, 
we tested l,his hyl)othesis on three kimls of links: 
1. A-KINI)-OF: The inverse of A KINI)4)I,' is AS. 
The definitions of "l)uildiug" and "sl;ructure" 
given below illusl;rate the inverse relationship. 
Imilding: a usually roofed add walled 
sl;rtle\[.ure built for l)er\[Hanellt USe (as 
fbr a dwelling) 
st, rue\]re'o: something (as a Imilding) 
that is constructed 
2. I'A|{:I'-()|", whose inverse is IIAS-I>AIIT. 
3. I1AS-PAI{:I', whose inverse is PAl{;l'-Ol". 
The results were very disappointing, with less (;t/;tl\] 
5% of the words tested being successfully disam- 
bigual;(~d by (his technique. Often I,he problem seemed 
that tile inverse link was l)resenl;, but; using a synonym 
or a hyponynl. 'l% test this, we conducl;ed ~t\] experi 
nlent on ItAS.-I~AI(I ' where all we required l;o judge a 
s ( :learly, this technique only applies to dictionaries and other 
tl!xL sollr(:es whi(:h are (lethfitiona\] ill Illtttlr(:. 
740 
d0tinit,ion as correci; was the preset~ce of SO{he PAI{:I' 
()1" triplo, llO lIl.q.{I;or wha, i, il, was l>.rl, ~1\[ +, This ill;R\](' 
l;ho technique Loo general and inotl'ectiw~. Oil1; of the 
103 fl AS- PA 1{51' relations tested, t;hore were 14 col'r~ct. 
(lisa, lnl)igua,i;ion cases ~-uid :\]5 i llcorrecl; ('ases. 
A llioro ell'ecl;ive 1,edil\[ique seeiiis 1,o be I,. tls<~ {;he 
ino(\[illers ('Jr t, he ;.i, il\]l)igtiotls word. Wo cou(hlct, ed a, ll 
('~Xl)Cl;ili;iolll~ Oll A--KINI)-()I" links in which, \[br ;'til 
a, liCi\])igUOllS delh\]ienduln, we acccl)l;ed ;is t, hc (:orl'oXfl; 
sciises l, hos0 (le\[iilil;iolls w|i()se geilllS I;(~rllhq ha, d SOlli(~ 
tnodiliers in 0oiii\]iloii wiLh i;he (tetiui0nduin in iLs orig- 
inal conl,exL Oul; of tho 860 eiiLrios t, osl,ed, l, hc:rc we!re 
8/I (:orreci; (\]isa, l\[ll)igua, I;ions atl(I :{9 itl('orr(~cl, OltCS. 
5 Conclusions 
This paF.er is hascd on Llic argunioill, iJla, scuiatlLic 
relal;ions Ca, l{ provid<~ a heiJ;er w~cy 17~l' tlSCl'S 1,o e×pr0ss 
t;heir in\['orlii;_ul, ion need Lt+~ all 111, sysl;elri, Ig>r such all 
hii;erf'ace to bc widdy al>l~licah\]<~, l,h<~ IlL sy;+;LCtll should 
he cat)able of auix)liiai;ically acquirillg SOill;l, lll, i(: rcla 
l;ions frolll available Iinguist, ic ros()Ul'CCs l'~-d,hcl' (,hfl, li 
l;iirotlgh halld (',oding. l{ecctil; work iu scintqit, ic k+nowl- 
edge cxl;racl;ion Fl'Olll ()ill{lie dicl, i.uarics SCOlllS Ill{,, ~,, 
l>rotnisiilg lnc, IJiod o\[' atll,onial, ic +U:Clltisil, ion. This i);c 
por dcscrihos a, now l, cchni(tu~ \['or exl;ra, cl;iou \]'l'Oili dic 
I;iuna+rics l, hal, was iii@h'ed hy t,he wdl-kliowu niech- 
ailisn:i oI + spreading act,ivaLiuli hi scnianl, ic, liCt,w{)rks. 
The l;echnique i'dies o/1 sol, s ()\[' \])al, l,ei'llS /;(i (:()lll\[)(Js(! 
haslc semantic relat.iol> f',',,m one or ,n<:,re dici:i,:mary 
delhiiL;ions int;o modal ?'cl(siolzs bl~l;wl'eu wc)t'cts. Wc 
cw+luat;c i;his f+ecliniquc in i~he (:olil;l!Kl, o1' sev(~ll llJOdC:s 
t, ha, i, ~-Irl'O ItSefll\[ for rcf0ricval \['r()lll a dal:al)as~ (>f pic 
l, llrOs, l<'imdly, wc I>rOl)OSe mid owdual,e a, uew I,ech.. 
i+.iqiie for word-sense disa, lnhigua, l;ion wit, hin dicl,iolmry 
dclinil;iol\]s t, ha, {Hakes use ol'scnianl, ic I'elal,i~ms iilwdv- 
big the anibiguous word, 
We pl:+ui (,o e×l;encl Lhis w<>rk, it{ illally dh'ecLions. \,"qc 
would like L;o devdol> a sl;alidai'd gl'a, liilflaJ" for defh~ing 
inodal relaLion palii, erns> alid all inl,erprcl;cr for t;his 
st, aiidard ~raliilliaJ'. ()liCe t, hc~ gl'aJlillia, i > is awcihd)lc, 
we i\]lLeiid 1,o devdop all hit,or\[ace for ~<G)ecil)ilig llow 
na<)da,1 r<~lai;ions and I)aJ,I,erils \[+or (llq,cct;ing t,\[IOIil CiV('~t' 
nnill;il)10, delhliL;ions. We also, plan l,c) ii\](:{/rt>or~fl:c lilt 
pletn@t, aJ;ions of exisi, in~ word +-;011s(~ disa, inl)igual;ion 
aJgoril, hins hll,o 1;ho i:iioda,\] paA;l,crn inl, t~rl)rcl, el'+ Fl'Olii 
an 11{ i)crsl+ecl&g wc pl;cn t,o inwlsl, igai,e l, he s ynergis- 
i, ic ilSe o\[' modal rela,ions wii, h ol,il<~r (lilcry priiiiil, ives 
like kcywords for rela'iowfl froin a, ii c×l,cn(led dai;ahas~, 
of annoi;a,l,ed t)ici;urcs. One d~sirahle resull; of l:his 
work for Flil;Itro resea, rch would lie Lhc ost,~d)lisluiieill, 
or st, al~_dard benchniarks and l, csl, suil,os for r<l, ricwd 
~roiii picl, ure da,;dms~s, 
References 
\[1\] SmcaLon, A. F. 'Trugress in tJlc Apl>licat, i<m d" 
Natural I,anguage Processing to InformaL{oil IL~ 
l, ricval Tasks," in Comp. Jxd., 35(3), 1!19'2. 
\[2\] Belkin: N..I., and Cro\['t., W. IL '%forma, ion Fil- 
>rin~ and hd'orma.ion ILet, rieval: Two Si&'s -I" 
t, he Same (Join?," in CA(/M: 35(12), 1992, 
\[3\] Fox, I",. A., Nui.ter, ,I. T., Ahlswe(l% 'l'., I",v<:m;, 
M., aid Markowitz, .I. "lTuilding a I,arge The: 
smlrus for lnf<~rmathrm II,el.rieval," i,l /'roe, of lh~ 
,%could (,'o~U~:rc'+~ce ou Apflicd NI;P> 1!188. 
\[4\] WilL<s, Y., t<'ass, 11., (;uo> (;., M<:l)onahl, .I., 
Pla.c.A., aid Slal.or, II. "Providing Machil~,' 
'l'ra, cl, al+~ l)icl.ionary '\['o<)Is," ivl ,%'emca.lic.s a,d 
Ihc l,c:l:Tc(n+, ,\]. l'ust;@~vsky, edil;or, l(luw<' Aca- 
(hmiic l'uhlishers, 1!)!/3. 
\[r)\] (lhoclorow, M. S., Byrd, IL .1., ml<l Ih~idorn, G.I';. 
"\]i',xi;racl,ing Scmmni.ic Ilierm'chies I'rom a bu't,;c 
()n--LiIle I)icl.ionary," in /%<, of A(/L-??3, 198rl. 
\[6\] Quill{all, M .1¢. <'Semaltic Memory," in ,<¢c'ma~z//c 
I',form<tliou \])roccssi'+*y h M. I,. Minsky, <~d., M IT 
I:'r~,ss, 1%8. 
\[7\] Miller, (;. A., and lg~llbamu, ('.. liS0111alll;ic N(~I.- 
works of I,;nglish," iTl (/oy1~6i>% 41, 1.9!11. 
\[8\] Cm, lkl~, 11., l(upiec, .I., I)ecUers<m, ,I. told Sihml, 
P. "A I>racl;ical l':-u'L-of-Sl>¢~cch '\]'a,e;gcr," in l'roc. 
of Ihc ,Trd (7o~lj; o'¢~ Applied NLP, 1992. 
\[9\] Ilaase, 1(. II. "Multi-Scah~ Parsin~ usillg ()l>l,iNdz - 
ivlg l,'Mte-St.a.e Machiv~es," Inl,ernal M(m>, MIT 
Me(li,~c l,ahoratory, 1!193. 
IlOl Amslcr, IL "'l'h< ~ ~l,rucl, ure of t, he Mel'l+i0Jll Wch- 
st,0r l>.ckel, l)i(:l, iumcn'y," Phi) l)is,'-;~%aii<m, Ihli 
w~rsit.y of Texas, Aust, ii h 1!)80. 
\[11\] Monl,clttagifi, S. ;+rod Vat\]derw<mde, I,. "Stoic: 
(m'al I>al;l;erns vs. String l~atLerns for Igxilracl.-- 
ing S~m+mtic Itfforma, ion fl'om l)ic/,ionarios," in 
Pro(:. of C()LINC-.O?3, 1.().()2, 
\[12\] Miller, G. A., ?rod Johnso>\[,aird, 1'. N. I/agluac 
a~d I'crcc.ptio~+, Ilarw~r<l I1. l'r(>s, 1976. 
\[13\] .h, Nse,, K., aud It{not., .I L. "l)isa,lhigmctii~g 
l'rq~ositi(mal l'hras<~ Al, l, aclluncnl, s by llsilig ()n 
l,itw l)M;i()Hary l)e/itHt, iotls," in Co~@+d(tliolzal 
\[,i;zg~tislics, 13, 3-4, pl > 251 2($(), 1!)87, 
\[14\] Krovel;z, I{,., anti (h'ol'l,, W, l'l, +¢Wol'd Sense l)is 
~uid)iguai, i<m I, lsin,e; M a,<:h illo-l¢.{wJal)le i)icl:i+mar 
i%" in t",'oc. (t/,<;l(;lt~-(V9, 198!). 
\[15\] (:hakra,vtcrt.hy, A. S. "l{,et>resenl, ing hllbritlation 
Need with Sc:tnai+(;i(: I{,elations," \]nl;erna\[ Memo, 
MIT Media I,al)oralory, 1993. 
741 
