<?xml version="1.0" standalone="yes"?> <Paper uid="C96-2211"> <Title>Pattern-Based Machine Translation</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> It would be difficult for anyone to dispute the idea that the World-Wide Web (WWW) has been the most phenomenal invention of the last decade in the computing environment. It has suddenly opened up a window to vast amounts of data on the Internet. Unfortunately for those who are not native English speakers, textual data are more often than not written in a foreign language.</Paragraph> <Paragraph position="1"> A dozen or so machine translation (MT) tools have recently been put on the market to make such textual data more accessible, but novice PC users will be simply amazed at the meagerness of their reward for the effort of building a so-called &quot;user dictionary.&quot; The main reasons for this problem are: 1. Most MT systems do not employ a powerful &quot;lexicalist&quot; formalism.</Paragraph> <Paragraph position="2"> 2. Most MT systems can be customized only by adding a user dictionary.</Paragraph> <Paragraph position="3"> Therefore, users can neither give preferences on individual prepositional-phrase attachments (e.g., to obtain information from a server) nor define translations of specific verb-object pairs (e.g., to take advantage of something). Powerful grammar formalisms and lexical-semantics formalisms have been known for years (see LFG (Kaplan and Bresnan, 1982), HPSG (Pollard and Sag, 1987), and Generative Lexicon (Pustejovsky, 1991), for example), but practical implementation of an MT system has yet to tackle the computational complexity of parsing algorithms for these formalisms and the workload of building a
large-scale lexicon.</Paragraph> <Paragraph position="4"> Example-based MT (Sato and Nagao, 1990; Sumita</Paragraph> <Paragraph position="5"> and Iida, 1991) and statistical MT (Brown et al., 1993) are both promising approaches that generally demonstrate incremental improvement in translation accuracy as the quality of examples or training data grows. It is, however, an open question whether these approaches alone can be used to create a full-fledged MT system; that is, it is uncertain whether such a system can be used for various domains without showing severe degradation in translation accuracy, or if it has to be fed by a reasonably large set of examples or training data for each new domain.</Paragraph> <Paragraph position="6"> TAG-based MT (Abeillé et al., 1990; see also LTAG (Lexicalized TAG), Schabes et al., 1988) and pattern-based translation (Maruyama, 1993) share many important properties for successful implementation in practical MT systems, namely:</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> structures </SectionTitle> <Paragraph position="0"> In this paper, we show that there exists an attractive way of crossing these approaches, which we call pattern-based MT. In the following two sections, we introduce a class of translation &quot;patterns&quot; based on Context-Free Grammar (CFG), and a parsing algorithm with O(|G|^2 n^3) worst-case time complexity. Furthermore, we show that our framework can be extended to incorporate example-based MT and a powerful learning mechanism.</Paragraph> </Section> </Section> </Paper>