<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-1405">
  <Title>Improving a general-purpose Statistical Translation Engine by Terminological lexicons</Title>
  <Section position="6" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
5 Results
</SectionTitle>
    <Paragraph position="0"> We considered three terminological lexicons whose characteristics are summarized in Table 3; they essentially differ in number of entries and therefore in coverage of the text to translate.</Paragraph>
    <Paragraph position="1"> [Table 3; columns: lexicon, nb, coverage, SER, WER] [...]ent terminological lexicons. nb is the number of entries in the lexicon and coverage reports the number of different source entries from the lexicon belonging to the text to translate and the total number of their occurrences.</Paragraph>
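    <Paragraph> As an illustration of the nb and coverage figures described above, the following Python sketch counts the entries of a lexicon, the number of different source entries that occur in a text, and the total number of their occurrences. The function name lexicon_coverage, the whitespace tokenization, the exact-match criterion and the toy data are assumptions made for this example; they are not taken from the system or corpus described in the paper.

```python
from collections import Counter


def lexicon_coverage(source_terms, text_tokens):
    """Return (nb, distinct_found, total_occurrences) for a lexicon.

    source_terms: iterable of source-language terms, each a tuple of tokens.
    text_tokens: the tokenized text to translate, as a list of tokens.
    Matching is exact and token-based (an assumption of this sketch).
    """
    terms = set(source_terms)
    counts = Counter()
    for term in terms:
        n = len(term)
        # Slide a window of the term's length over the text.
        for i in range(len(text_tokens) - n + 1):
            if tuple(text_tokens[i:i + n]) == term:
                counts[term] += 1
    # Only terms seen at least once have a key in counts.
    return len(terms), len(counts), sum(counts.values())


# Toy data (hypothetical, not the sniper corpus or the sniper lexicons).
lexicon = [("tireur", "d'", "élite"), ("détente",), ("lunette",)]
text = "le tireur d' élite voit la détente . contrôle de la détente .".split()
nb, distinct, total = lexicon_coverage(lexicon, text)
print(nb, distinct, total)
```
    </Paragraph>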
    <Paragraph position="2"> The first lexicon (namely sniper-1) contains the 33 entries used in the study of terminological consistency checking described in (Macklovitch, 1995). The second and third lexicons (namely sniper-2 and sniper-3) contain those entries plus others added manually after an incremental inspection of the sniper corpus.</Paragraph>
    <Paragraph position="3"> As can be observed from Table 3, introducing terminological lexicons into the translation engine does improve performance, measured in terms of WER, even with lexicons that cover only a small portion of the text to translate. [Footnote 5: Our trigram model has been trained to provide parameters such as p(UNK | a b).]</Paragraph>
    <Paragraph position="4"> [Table 4, first example] Source: le tireur d' élite voit simultanément les fils croisés et l' image ( l' objectif ) . Target: the sniper sees the crosshairs and the image - target - at the same time . without: the gunman being same son sit and picture of the hon. members : agreed . with: the sniper simultaneously see the crosshairs and the image (objective . )</Paragraph>
    <Paragraph position="5"> [Table 4, second example] Source: contrôle de la détente . Target: exercising trigger control .</Paragraph>
    <Paragraph position="6"> without: the control of détente .</Paragraph>
    <Paragraph position="7"> with: control of the trigger .</Paragraph>
    <Paragraph position="8"> [Table 4 caption] [...] bold.</Paragraph>
    <Paragraph position="9"> With the narrow-coverage lexicon, we observe an absolute reduction of 7%, and a reduction of 10% with the broader lexicon sniper-3. This suggests that adding more entries to the lexicon is likely to decrease WER. In another study (Carl and Langlais, 2002), we investigated whether an automatic procedure designed to detect term variants could improve these results further.</Paragraph>
    <Paragraph position="10"> Table 4 provides two examples of translation outputs, with and without the help of terminological units (TU). The first one clearly shows that even a few TU (two in this case) may substantially improve the quality of the translation output (the translation produced without the lexicon was particularly poor in this case). Even though terminological lexicons do improve the overall WER figure, a systematic inspection of the outputs produced with TU reveals that the translations are still less faithful to the source text than the translations produced for the Hansard text. OOV words remain a serious problem.</Paragraph>
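    <Paragraph> For readers who want to reproduce figures of this kind, the following Python sketch shows one common way to compute WER and SER. It assumes the usual definitions: WER is the word-level edit distance between each output and a single reference translation, summed over the test set and divided by the total number of reference words; SER is the proportion of sentences whose output is not identical to the reference. The definitions actually used in the paper are given outside this section and may differ. The function names and the toy sentences are hypothetical.

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        cur = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            cur.append(min(prev[j] + 1,          # drop hypothesis word h
                           cur[j - 1] + 1,       # add reference word r
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1]


def wer_and_ser(hypotheses, references):
    """Return (WER, SER) as percentages over parallel lists of sentences."""
    errors = words = wrong = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        errors += edit_distance(h, r)
        words += len(r)
        wrong += int(h != r)
    return 100.0 * errors / words, 100.0 * wrong / len(references)


# Toy data (hypothetical): one perfect sentence, one with a substitution.
outputs = ["a b c", "x y"]
targets = ["a b c", "x z"]
wer, ser = wer_and_ser(outputs, targets)
print(wer, ser)
```

 An absolute reduction, as reported above, is then the plain difference between two such percentages rather than their ratio.</Paragraph>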
  </Section>
</Paper>