<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-2211">
  <Title>Pattern-Based Machine Translation</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> It would be difficult for anyone to dispute the idea that the World-Wide Web (WWW) has been the most phenomenal invention of the last decade in the computing environment. It has suddenly opened up a window to vast amounts of data on the Internet. Unfortunately for those who are not native English speakers, textual data are more often than not written in a foreign language.</Paragraph>
    <Paragraph position="1"> A dozen or so machine translation (MT) tools have recently been put on the market to make such textual data more accessible, but novice PC users will be simply amazed at the meagerness of their reward for the effort of building a so-called &quot;user dictionary.&quot; The main reasons for this problem are: 1. Most MT systems do not employ a powerful &quot;lexicalist&quot; formalism.</Paragraph>
    <Paragraph position="2"> 2. Most MT systems can be customized only by adding a user dictionary.</Paragraph>
    <Paragraph position="3"> Therefore, users can neither give preferences on individual prepositional-phrase attachments (e.g., to obtain information from a server) nor define translations of specific verb-object pairs (e.g., to take advantage of something). Powerful grammar formalisms and lexical-semantics formalisms have been known for years (see LFG (Kaplan and Bresnan, 1982), HPSG (Pollard and Sag, 1987), and Generative Lexicon (Pustejovsky, 1991), for example), but practical implementation of an MT system has yet to tackle the computational complexity of parsing algorithms for these formalisms and the workload of building a large-scale lexicon.</Paragraph>
    <Paragraph position="4"> Example-based MT (Sato and Nagao, 1990; Sumita</Paragraph>
    <Paragraph position="5"> and Iida, 1991) and statistical MT (Brown et al., 1993) are both promising approaches that generally demonstrate incremental improvement in translation accuracy as the quality of examples or training data grows. It is, however, an open question whether these approaches alone can be used to create a full-fledged MT system; that is, it is uncertain whether such a system can be used for various domains without showing severe degradation in translation accuracy, or if it has to be fed by a reasonably large set of examples or training data for each new domain.</Paragraph>
    <Paragraph position="6"> TAG-based MT (Abeillé et al., 1990) [footnote 1: See LTAG (Schabes et al., 1988) (Lexicalized TAG) and ...] and pattern-based translation (Maruyama, 1993) share many important properties for successful implementation in practical MT systems, namely:</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
structures
</SectionTitle>
      <Paragraph position="0"> In this paper, we show that there exists an attractive way of crossing these approaches, which we call pattern-based MT. In the following two sections, we introduce a class of translation &quot;patterns&quot; based on Context-Free Grammar (CFG), and a parsing algorithm with O(|G|^2 n^3) worst-case time complexity. Furthermore, we show that our framework can be extended to incorporate example-based MT and a powerful learning mechanism.</Paragraph>
    </Section>
  </Section>
</Paper>