<?xml version="1.0" standalone="yes"?> <Paper uid="C00-1019"> <Title>Figure 2: Sample Clusters (French-English pairs, e.g. PRISONNIERS/PRISONERS, CONVENU/AGREED, OCCIDENTAL/WESTERN, CHAUSSURES/SHOES, VETEMENTS/CLOTHING, POISSON/FISH)</Title> <Section position="9" start_page="129" end_page="130" type="concl"> <SectionTitle> 7 Conclusion and Future Work </SectionTitle> <Paragraph position="0"> Using word clustering to automatically generalize the example corpus of an EBMT system can provide the majority of the improvement which can be achieved using both a manually-generated set of equivalence classes and a production rule grammar. The use of a set of small initial equivalence classes produces a substantial further reduction in training text at a very low cost (a few hours) in labor.</Paragraph> <Paragraph position="1"> An obvious extension to using seed clusters is to use the result of a clustering run as the initial seed for a second iteration of clustering, since the additional generalization of local contexts enabled by the larger seed clusters will permit additional expansion of the clusters. For such iterative clustering, all but the last round should presumably use stricter thresholds, to avoid adding too many irrelevant members to the clusters. 
Preliminary experiments have been inconclusive: although the result of a second iteration contains more terms in the clusters, EBMT performance does not seem to improve.</Paragraph> <Paragraph position="2"> More sophisticated clustering algorithms such as k-means and deterministic annealing may provide better-quality clusters for better performance, at the expense of increased processing time.</Paragraph> <Paragraph position="3"> This approach to generating equivalence classes should work just as well for phrases as for single words, simply by modifying the conversion step to create context vectors for phrases. This enhancement would eliminate the current limitation that translation pairs to be clustered must be single words in both languages. Work on this modification is currently under way. An interesting future experiment would be forgoing grammar rules based on standard grammatical features such as part of speech, and instead creating a grammar guided by the clusters found fully automatically (without seeding) from the example text. The recent work by McTait and Trujillo (1999) on extracting translation patterns would appear to be a perfect complement, as they are in effect finding context strings with open slots, while the work described here finds the fillers for those slots. Given the ability to learn such a
grammar without manual intervention, it would become</Paragraph> <Paragraph position="4"> possible to create an EBMT system using generalized examples from nothing more than parallel text, which for many language pairs could also be acquired almost fully automatically by crawling the World Wide Web (Resnik, 1998).</Paragraph> </Section> </Paper>