<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1404">
  <Title>Language Resources for the Semantic Web - perspectives for Machine Translation</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> Machine Translation (MT) was ranked first among the 10 emerging technologies that will change the world (Technical Review 2004). With the increased number of official languages in Europe and the continuous growth of non-English Internet resources, machine translation systems are expected to become an indispensable tool in everyday work.</Paragraph>
    <Paragraph position="1"> At present, high-quality MT systems are on the one hand expensive and on the other hand domain-oriented. The existing on-line tools produce poor-quality translations and very often give a false image of the capabilities of current translation engines.</Paragraph>
    <Paragraph position="2"> The main reason why on-line machine translation tools offer such poor results is that they either rely on corpus-based methods trained on a limited number of examples or infer rules from a limited linguistic knowledge base (Gaspari 2002).</Paragraph>
    <Paragraph position="3"> According to the statistics published in (McLaughlin and Schwall 1998), already in 1998 there were at least 25 countries with more than 500 000 Internet users, and in at least half of these countries English is neither the first nor the second spoken language. These statistics suggest that access to on-line information can be guaranteed only through high-quality on-line machine translation tools. However, an on-line translation system has a number of specific requirements (i.e. different from the &quot;traditional&quot; ones): - It has to be fast but not always perfect.</Paragraph>
    <Paragraph position="4"> The translation of web documents is more a kind of &quot;translation for assimilation&quot; in Carbonell's classification (Carbonell 1994). However, it has to go beyond the word-for-word quality offered by the current on-line systems. - A large number of languages / pairs of languages have to be covered. - The system has to be a &quot;fully integrated black box&quot;. Most users do not have the expertise to tune different parameters.</Paragraph>
    <Paragraph position="5"> There are different approaches to automatic translation; however, not all of them are suited to on-line translation.</Paragraph>
    <Paragraph position="6"> 1. Rule-based MT systems are based on complex linguistic modules in both the analysis and the generation phase (morphology, syntax, semantics, pragmatics). Such modules have been developed for only a few languages, and they are not available free of charge. The implementation of such modules requires deep linguistic knowledge of both languages (especially for the transfer rules).</Paragraph>
  </Section>
</Paper>