<?xml version="1.0" standalone="yes"?> <Paper uid="C94-2118"> <Title>Representing Information Need with Semantic Relations*</Title> <Section position="3" start_page="737" end_page="737" type="metho"> <SectionTitle> 2 A Database of Pictures </SectionTitle> <Paragraph position="0"> The primary motivation for our work was to provide retrieval based on semantic relations for a corpus of pictures collected from the American Heritage Dictionary. The corpus contains 1359 pictures, each of which is annotated with a single word or word collocation from the dictionary. Clearly, there are a great many semantic relations that could be useful for retrieval from such a database. To narrow down the set of interesting semantic relations, we used the fact that the annotations are single words or word collocations.</Paragraph> <Paragraph position="1"> As in memory experiments in cognitive psychology, we used the annotations as cues for free recall by association. We then analyzed the results to locate the semantic relations that occurred most often. Based on this analysis, we picked the seven relations shown in Table 1 (which we will henceforth call modes to distinguish them from individual semantic relations).</Paragraph> <Paragraph position="2"> The OCCURS-WITH mode refers to typical physical collocation of objects. It is useful for making &quot;intelligent&quot; guesses about what else might be in the picture besides the objects explicitly annotated. As the example in Table 1 shows, this is not always symmetric. It can be argued that the presence of an ax in a picture much more often indicates the presence of wood than the other way around. The PLAYS-ROLE-OF mode differs from the EXAMPLE-OF mode in having a connotation of typical use. The CHARACTERISTIC-ACTIVITY mode is used to relate both objects and agents to typical activities they are involved in. The HAS-PURPOSE mode is used to relate an object to a word denoting its purpose. 
As in the Table 1 example, that word could either denote an activity or another object where there is a typical activity involving both objects. CONSTITUENT-OF and HAS-CONSTITUENT are similar to the widely used PART-OF and HAS-PART primitives except that metaphorical inclusion is valid as well. The next section describes our scheme for extracting these modal relations from the dictionary 1.</Paragraph> </Section> <Section position="4" start_page="737" end_page="737" type="metho"> <SectionTitle> 3 Extracting Modal Relations from Dictionary Definitions </SectionTitle> <Paragraph position="0"> Extracting modal relations from dictionary definitions involves three components: a preprocessor that tags the definition with part-of-speech information, a module that pulls out triples (basic semantic relations of the form [word1 LINK-TYPE word2]) from the preprocessed definition, and a pattern interpreter that checks the list of triples for modal relations using sets of patterns. We will now describe each of these in turn.</Paragraph> <Paragraph position="1"> For preprocessing the dictionary definitions, we have experimented with two different taggers: the Xerox PARC part-of-speech tagger [8], and the Chopper [9], an optimizing finite-state-machine-based tagger built at the MIT Media Lab by Ken Haase. Before tagging the definition, we apply a few simple filters to remove botanical names, usage guidelines, etc. The performance of both taggers was satisfactorily high.</Paragraph> <Paragraph position="2"> The example below shows the output of the Xerox tagger on a sample definition 2: aqueduct: a conduit for water</Paragraph> </Section> <Section position="5" start_page="737" end_page="738" type="metho"> <SectionTitle> :AT :NN :IN :NN </SectionTitle> <Paragraph position="0"> 1 All these experiments have been run on a Webster's online dictionary. 
The program is written in Lucid Common Lisp and runs on a DECstation.</Paragraph> <Paragraph position="1"> 2 The tags used are from the Brown corpus, e.g., :AT = article; :NN = singular noun; :IN = preposition. 1. Use library of A-KIND-OF or ENTAILS extraction patterns to locate the genus term. Extract triples from modifiers of the genus term. 2. Iterate over the differentiae, constructing triples using each of them, until either the end of the definition or till no triple can be constructed from the differentia found. 3. Apply post-processing methods to construct [...]. [...] assumption that dictionary definitions typically consist of a genus term (identifying its kind) followed by differentiae (how it is different from the genus) [10]. In the aqueduct definition above, the genus term is &quot;conduit&quot; ([aqueduct A-KIND-OF conduit]) and the only differentia is given by the PP &quot;for water&quot;. In the case of verb definitions, the genus term is related by the link &quot;ENTAILS&quot;. The three stages of the algorithm are presented in Figure 1. We use a variety of patterns described in the literature to extract the initial genus term(s) correctly [...] (e.g., patterns like &quot;a NP,&quot; &quot;either of two plural-NP,&quot; &quot;one of a family of plural-NP&quot;). The patterns combine both syntactic and string elements, which makes them more powerful than purely string-based patterns [11]. Since it is very important to find the genus term correctly, a &quot;last-ditch&quot; extractor is invoked if none of the standard patterns work. This last-ditch extractor assumes that the 
tagger must have made a mistake and tries to compensate for common tagger mistakes (e.g., tagging an -ing form as verb instead of adjective). Once the genus terms have been found, we analyze the morphological form of the modifiers for triples.</Paragraph> <Paragraph position="2"> For instance, since &quot;violin&quot; is defined as &quot;a bowed instrument,&quot; the triple [violin OBJECT-OF bowing] is recorded. In Step 2, each of the differentiae is assumed to be either a relative clause or a prepositional phrase. As in Step 1, head nouns or verbs are located for each of the differentiae and result in triples being formed with the word(s) being modified. Where there is attachment ambiguity (as with prepositional phrases [13]), triples are formed for all possible attachments.</Paragraph> <Paragraph position="3"> Step 3 is a post-processing step which results in new triples being formed and some triples being [...] merged into [acropolis PART-OF city].</Paragraph> <Paragraph position="4"> Similarly, there are rules for creating links of other types. Other post-processing rules deal with eliminating references to A-KIND-OF genus terms in triples by replacing them with the definiendum.</Paragraph> <Section position="1" start_page="738" end_page="738" type="sub_section"> <SectionTitle> 3.2 Extracting Modal Relations from Triples </SectionTitle> <Paragraph position="0"> For each modal relation in Table 1, there is a set of patterns that specifies how the modal relation can be detected from the triples of one or more definitions. Each pattern encodes a heuristic rule that is based on the triples extracted from the dictionary definitions. 
For example, one pattern for extracting OCCURS-WITH relations encodes the heuristic that, typically, if Object1 and Object2 are involved in the same action, then [Object1 OCCURS-WITH Object2]. When this pattern is applied to the definition &quot;ax: a cutting tool that consists of a heavy edged head fixed to a handle with the edge parallel to the handle and that is used especially for felling trees and chopping and splitting wood&quot;, the two modal relations [ax OCCURS-WITH tree] and [ax OCCURS-WITH wood] are found.</Paragraph> <Paragraph position="1"> Patterns can apply to multiple definitions as well.</Paragraph> <Paragraph position="2"> Using the same heuristic as above, we have defined a pattern that extracts the modal relation [atomizer OCCURS-WITH spray] from the two definitions below: atomizer: an instrument for atomizing usually a perfume, disinfectant, or medicament atomize: to reduce to minute particles or to a fine spray</Paragraph> </Section> </Section> <Section position="6" start_page="738" end_page="740" type="metho"> <SectionTitle> 4 Performance Evaluation </SectionTitle> <Paragraph position="0"> To analyze the performance of the program, we picked the first 300 annotations from the corpus of pictures described in Section 2. The first part of this section presents the results of applying the modal relation patterns that work on one definition. The second part discusses difficulties involved in the evaluation of modal relation patterns that work on more than one definition.</Paragraph> <Paragraph position="1"> In order to obtain an independent 5 estimate of the performance of the extraction program, we 
compared the output of the program where possible to the output of WordNet [7], which is a large, manually coded semantic network. WordNet does not have links corresponding to four of the modes listed in Table 1: OCCURS-WITH, PLAYS-ROLE-OF, HAS-PURPOSE, and CHARACTERISTIC-ACTIVITY.</Paragraph> <Paragraph position="3"> 5 i.e., independent of the dictionary we are using.</Paragraph> <Paragraph position="4"> For the other three modes, we found correspondences by assuming that all hypernyms are valid examples of EXAMPLE-OF, all meronyms of HAS-CONSTITUENT, and all holonyms of CONSTITUENT-OF. The performance results and the comparison with WordNet are presented in Table 2.</Paragraph> <Paragraph position="5"> The seven rows of Table 2 correspond to the seven modal relations of Table 1. The first column shows the total number of modal relations extracted for a mode, while the second column gives the number of modal relations judged to be correct (by the author), with the percentage figure shown in the third column. The fourth column gives the number of such relations found in WordNet, while the fifth gives the number of those relations that were also found by the extraction program (with the column after that providing the percentage figure). The last column shows the number of modal extraction patterns implemented for the mode.</Paragraph> <Paragraph position="6"> We will now briefly discuss the performance of the program. A detailed analysis is presented in [15]. The precision of the extraction is over 60% in all cases; there are three main reasons for the precision not being higher: * Many of the patterns (e.g., for OCCURS-WITH or CHARACTERISTIC-ACTIVITY) implicitly assume that verbs denote activity. 
This is not true of many verbs like &quot;suggest&quot;, &quot;represent&quot;, &quot;resemble&quot;, etc.</Paragraph> <Paragraph position="7"> * Many patterns hinge on the presence of particular links (like &quot;WITH&quot; and &quot;IN&quot;), and precision is dragged down by their ambiguity.</Paragraph> <Paragraph position="8"> * The tagger makes mistakes during the preprocessing, resulting in incorrect matches for the patterns. The number of matches with WordNet was generally low because WordNet uses word collocations as link destinations to construct more detailed hierarchies. So, for instance, while WordNet has the link [accordion HYPERNYM free-reed instrument], our program generates [accordion EXAMPLE-OF instrument]. We have not similarly analyzed the performance of patterns that operate over two definitions. The main reason is that to get an accurate estimate of the precision (as in the second column of Table 2), we have to combine the dictionary definition of the test word with every other definition in the dictionary. This work is in progress. However, it is clear that word-sense ambiguity can lead to poor performance by running modal extraction patterns over unintended senses of a word. For instance, if we return to the &quot;atomizer&quot; example at the end of the previous section, we find that we are &quot;spreading activation&quot; through the verb &quot;atomize.&quot; There is another sense of &quot;atomize&quot;, viz., &quot;to subject to atom bombing,&quot; which is not of interest here and should be ignored. 
We will now briefly describe a new word-sense disambiguation technique that is applicable in this context. A detailed discussion can be found in [15]. Krovetz and Croft [14] characterized the process of word-sense disambiguation as bringing to bear several kinds of evidence depending on the context of occurrence of the word, namely, part-of-speech, morphology, subcategorization, semantic restrictions and subject classifications. Continuing in the same framework, we decided to use the semantic relations involving an ambiguous word as another source of evidence. Let the ambiguous word be denoted by W_amb and let [W_def RELATION W_amb] be a triple in the first definition of the modal relation pattern. Then, each definition of W_amb which includes the triple [W_amb RELATION-INVERSE W_def] can be considered as a correct sense (for spreading activation), where RELATION-INVERSE is the inverse link type of RELATION 6. For the same 300 words as in Table 2, we tested this hypothesis on three kinds of links: 1. A-KIND-OF: The inverse of A-KIND-OF is AS.</Paragraph> <Paragraph position="9"> The definitions of &quot;building&quot; and &quot;structure&quot; given below illustrate the inverse relationship.</Paragraph> <Paragraph position="10"> building: a usually roofed and walled structure built for permanent use (as for a dwelling) structure: something (as a building) that is constructed 2. PART-OF, whose inverse is HAS-PART.</Paragraph> <Paragraph position="11"> 3. HAS-PART, whose inverse is PART-OF.</Paragraph> <Paragraph position="12"> The results were very disappointing, with less than 5% of the words tested being successfully disambiguated by this technique. Often the problem seemed to be that the inverse link was present, but using a synonym or a hyponym. 
To test this, we conducted an experiment on HAS-PART where all we required to judge a definition as correct was the presence of some PART-OF triple, no matter what it was part of. This made the technique too general and ineffective. Out of the 103 HAS-PART relations tested, there were 14 correct disambiguation cases and 35 incorrect cases. A more effective technique seems to be to use the modifiers of the ambiguous word. We conducted an experiment on A-KIND-OF links in which, for an ambiguous definiendum, we accepted as the correct senses those definitions whose genus terms had some modifiers in common with the definiendum in its original context. Out of the 860 entries tested, there were 84 correct disambiguations and 39 incorrect ones. 6 Clearly, this technique only applies to dictionaries and other text sources which are definitional in nature.</Paragraph> </Section> </Paper>
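The inverse-link test described in Section 4 can be sketched in a few lines. This is only an illustrative sketch in Python, not the original Lucid Common Lisp program: the Triple type, the INVERSE table, the function name, the sense identifiers, and the second sense of "structure" are all assumptions introduced for the example. Only the building/structure definitions and the three link pairs come from the text.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """A basic semantic relation of the form [head LINK-TYPE tail]."""
    head: str
    link: str
    tail: str


# The three link pairs tested in Section 4. Storing them in a lookup
# table is an assumption of this sketch.
INVERSE: Dict[str, str] = {
    "A-KIND-OF": "AS",
    "PART-OF": "HAS-PART",
    "HAS-PART": "PART-OF",
}


def senses_by_inverse_link(
    source: Triple, senses: Dict[str, List[Triple]]
) -> List[str]:
    """Return the senses of the ambiguous word (source.tail) whose
    triples contain the inverse of the source triple."""
    inverse = INVERSE.get(source.link)
    if inverse is None:
        return []  # no inverse link type known for this relation
    wanted = Triple(source.tail, inverse, source.head)
    return [sense for sense, triples in senses.items() if wanted in triples]


# Triple taken from the definition of "building" in the text.
building_triple = Triple("building", "A-KIND-OF", "structure")

# Hypothetical sense inventory for the ambiguous word "structure".
structure_senses = {
    # "something (as a building) that is constructed"
    "structure/1": [Triple("structure", "AS", "building")],
    # invented second sense, present only to show a rejected candidate
    "structure/2": [Triple("structure", "A-KIND-OF", "arrangement")],
}

print(senses_by_inverse_link(building_triple, structure_senses))
```

Because the match is on the exact inverse triple, a definition that expresses the inverse link through a synonym or hyponym is rejected, which is the failure mode reported above.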