<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1002">
  <Title>A cheap and fast way to build useful translation lexicons</Title>
  <Section position="5" start_page="2" end_page="2" type="concl">
    <SectionTitle>
6 Applications and further work
</SectionTitle>
    <Paragraph position="0"> We used the multilingual lexicon mentioned before in a sense discrimination exercise described in Erjavec et al. (2001), where the criterion for sense clustering was the way the different occurrences of an English word in the &quot;1984&quot; parallel corpus were translated into the other 6 languages. The experiment involved 91 highly ambiguous English nouns and was extremely encouraging; new results are described in Erjavec et al. (2002).</Paragraph>
    <Paragraph position="1"> Another application of the translation lexicons was in the BALKANET project, aimed at developing wordnets for Balkan languages, Romanian included. The translation lexicons were used both for building the synsets for the base concepts from scratch, but in a harmonized way, and for cross-lingual validation, on running text (again the &quot;1984&quot; novel), of the interlingual index (ILI) mapping of these base concepts. Since 4 of the BALKANET languages are represented in the &quot;1984&quot; parallel corpus, we plan to take advantage of the ILI mapping to further refine the word-sense discrimination method mentioned above and to add cluster labeling. The obvious language-independent labeling is based on ILI record numbers.</Paragraph>
    <Paragraph position="2"> The experiments reported here were evaluated on European languages. A new experiment has been preliminarily evaluated on an extract of 500 Chinese-English sentences from a parallel corpus of juridical texts. The experiment focused on extracting noun translations, used an LL-score threshold of 9, and applied no conflict resolution method for competing translations. We obtained two result sets. RS1 contains only translations that have no competitors (that is, whenever there were competing translations for the same word, none of them was selected). RS2 differs from RS1 by including all the competing translations in the output lexicon.</Paragraph>
    <Paragraph position="3"> If the 1:1 mapping hypothesis holds, then among any set of competing translations included in RS2 only one is correct and all the others are errors. Therefore the precision for RS2 is much lower than for RS1.</Paragraph>
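    <Paragraph> The construction of the two result sets can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation used in the experiment: the function names, the input format (triples of source word, target word, LL score), the inclusive comparison with the threshold, and the reading of "competing" as "sharing the same source word" are all assumptions. The second function computes the upper bound on precision that follows from the 1:1 mapping hypothesis, where at most one pair per source word can be correct.

```python
from collections import defaultdict

LL_THRESHOLD = 9.0  # threshold value reported in the text


def build_result_sets(candidates, threshold=LL_THRESHOLD):
    """Split scored translation candidates into RS1 and RS2.

    candidates: iterable of (source_word, target_word, ll_score) triples
    (hypothetical input format). Candidates at or above the threshold are
    kept; treating the threshold as inclusive is an assumption.
    """
    by_source = defaultdict(set)
    for source, target, score in candidates:
        if score >= threshold:
            by_source[source].add(target)

    rs1, rs2 = set(), set()
    for source, targets in by_source.items():
        pairs = {(source, target) for target in targets}
        rs2 |= pairs              # RS2 keeps every competing translation
        if len(pairs) == 1:
            rs1 |= pairs          # RS1 keeps only uncontested translations
    return rs1, rs2


def max_precision_under_one_to_one(pairs):
    """Upper bound on precision if each source word has one correct target."""
    if not pairs:
        return 0.0
    sources = {source for source, _ in pairs}
    return len(sources) / len(pairs)


# Made-up candidates for illustration only (not data from the experiment).
example = [
    ("w1", "court", 15.2),
    ("w2", "law", 12.0),
    ("w2", "statute", 10.5),   # competes with ("w2", "law")
    ("w3", "judge", 4.0),      # below the threshold, discarded
]
rs1, rs2 = build_result_sets(example)
```

With these made-up candidates, RS1 drops both translations of "w2" while RS2 keeps both, so RS2 can reach a precision of at most 2/3 under the 1:1 hypothesis.</Paragraph>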
    <Paragraph position="4"> The results of this experiment are shown in Table 5. They show that, without making a decision on the competing translations, we either lose many good translations (RS1) or include a lot of noise (RS2).</Paragraph>
    <Paragraph position="5"> [Table 5. Columns: result set, number of extracted pairs, precision, recall.] Further work will address the issue of defining adequate heuristics for filtering out competing candidates.</Paragraph>
  </Section>
</Paper>