<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1004">
  <Title>A Simple Hybrid Aligner for Generating Lexical Correspondences in Parallel Texts</Title>
  <Section position="6" start_page="33" end_page="34" type="evalu">
    <SectionTitle>
5. Evaluation
</SectionTitle>
    <Paragraph position="0"> The algorithm was tested on two different texts: a novel (66,693 source words) and a computer program manual (169,779 source words), both translated from English into Swedish. The tests were run on a Sun UltraSparc1 workstation with 320 MB RAM and took 55 minutes for the novel and four and a half hours for the program manual.</Paragraph>
    <Paragraph position="1"> The tests were run with three different configurations on each text: (i) the baseline (B) configuration, which uses the t-score measure alone; (ii) all modules except the weights module (AM-W), but with a link-distance constraint set to 10; and (iii) all modules (AM), including morphology, weights and phrases. The t-score threshold was 1.65 for B and AM-W and 2.7 for AM, and the minimum frequency of a source expression was set to 3. Closed-class expressions were linked in all configurations. In the baseline configuration no distinction was made between closed-class and open-class expressions. In the AM-W and AM tests the closed-class expressions were divided into different subcategories, and the linking direction was reversed at the end of each of the six iterations, which improves the chances of linking low-frequency source expressions. The characteristics of the source texts used are shown in Table 3.</Paragraph>
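    <Paragraph><![CDATA[ As an illustration of how the t-score threshold and the minimum source frequency interact, the following is a minimal sketch and not the aligner's actual code. It assumes one common formulation of the t-score for co-occurrence, t = (p(s,t) - p(s)p(t)) / sqrt(p(s,t)/N), where N is the number of aligned segment pairs; this section does not spell out the exact formula. The function names, parameter names and the "greater than or equal" comparison are hypothetical.

```python
import math


def t_score(joint_count, source_count, target_count, n_pairs):
    """t-score of a candidate source/target pair (assumed formulation).

    joint_count:  aligned segment pairs containing both expressions
    source_count: aligned segment pairs containing the source expression
    target_count: aligned segment pairs containing the target expression
    n_pairs:      total number of aligned segment pairs
    """
    if joint_count == 0:
        return 0.0
    p_joint = joint_count / n_pairs
    p_source = source_count / n_pairs
    p_target = target_count / n_pairs
    # Observed minus expected co-occurrence, scaled by an estimate
    # of the standard error.
    return (p_joint - p_source * p_target) / math.sqrt(p_joint / n_pairs)


def accept_link(joint_count, source_count, target_count, n_pairs,
                threshold=1.65, min_source_freq=3):
    """Apply the frequency cut-off first, then the t-score threshold.

    The defaults mirror the B and AM-W settings reported above;
    pass threshold=2.7 for the AM setting. Using >= rather than >
    is an assumption.
    """
    if source_count < min_source_freq:
        return False
    score = t_score(joint_count, source_count, target_count, n_pairs)
    return score >= threshold
```

Under this formulation, a pair that co-occurs in exactly three of 1,000 segment pairs (made-up counts) passes the 1.65 threshold but not the 2.7 threshold, which shows why the stricter AM setting filters out more low-frequency candidates on the t-score alone. ]]></Paragraph>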
    <Paragraph position="2"> The novel contains a high number of low-frequency words, whereas the program manual contains a higher proportion of words that the algorithm actually tested, as the frequency threshold was set to 3.</Paragraph>
    <Paragraph position="3"> The results from the tests are shown in Table 4. The evaluation was done on an extract from the automatically produced dictionary. All expressions starting with the letters N, O and P were evaluated for all three configurations of each text.</Paragraph>
    <Paragraph position="4"> The results from the novel show that recall is roughly tripled in the sample, from 234 linked source expressions in the B configuration to 709 with the AM configuration. Precision values for the novel lie in the range from 90.13 to 92.50 per cent when partial links are judged as errors, and are slightly higher when they are not. The use of weights seems to make precision somewhat lower for the novel, which could perhaps be explained by the fact that the novel is a much more varied text type. For the program manual the recall results are as good as for the novel (three times as many linked source types for the AM configuration compared to the baseline). Precision is increased, but perhaps not to the level we anticipated at first. Multi-word expressions are linked with a relatively high recall (above 70%), but the precision of these links is not as high as for single words. Our evaluations of the links show that one major problem lies in the quality of the multi-word expressions that are fed into the alignment program. As the program works iteratively and, in the current version, starts with the multi-word expressions, any errors at this stage will have consequences in later iterations.</Paragraph>
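    <Paragraph><![CDATA[ The two ways of scoring partial links, and the restriction of the evaluation to entries beginning with N, O and P, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the case-insensitive matching of initial letters is an assumption, and the counts in the usage note are made up and are not the paper's figures.

```python
def in_sample(source_expression, letters=("n", "o", "p")):
    """Return True if a dictionary entry belongs to the evaluated extract.

    Matching on the first character, ignoring case, is an assumption.
    """
    return source_expression[:1].lower() in letters


def precision(correct, partial, incorrect, partial_counts_as_error=True):
    """Precision in per cent over all evaluated links.

    With partial_counts_as_error=True, partial links are scored as
    errors; otherwise they are scored as correct.
    """
    total = correct + partial + incorrect
    if total == 0:
        raise ValueError("no links to evaluate")
    good = correct if partial_counts_as_error else correct + partial
    return 100.0 * good / total
```

For example, with 90 correct, 4 partial and 6 incorrect links (hypothetical counts), precision is 90.0 per cent when partial links count as errors and 94.0 per cent when they do not. ]]></Paragraph>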
    <Paragraph position="5"> We have run each module separately and observed that each module, added on its own, improves on the baseline configuration. Comparing our results with those of other approaches is difficult. Not only are we dealing with different language pairs but also with different texts and text types. There is also the issue of different evaluation criteria. A pure word-to-word alignment cannot be compared to an approach where lexical units (both single-word expressions and multi-word expressions) are linked. Neither can the combined approach be compared to a pure phrase alignment program, because the aims of the alignment are different.</Paragraph>
    <Paragraph position="6"> However, as far as we can judge given these difficulties, the results presented in this paper are on par with previous work for precision and possibly an improvement on recall, because of how we handle low-frequency variants in the morphology module and because of the single-wordline strategy. The handling of closed-class expressions has also been improved by dividing these expressions into subcategories, which limits the search space considerably.</Paragraph>
  </Section>
  <Section position="7" start_page="34" end_page="34" type="evalu">
    <SectionTitle>
Acknowledgements
</SectionTitle>
    <Paragraph position="0"> This work is part of the project &quot;Parallell corpora in Linköping, Uppsala and Göteborg&quot; (PLUG), jointly funded by Nutek and HSFR under the Swedish National research programme in</Paragraph>
  </Section>
</Paper>