<?xml version="1.0" standalone="yes"?>
<Paper uid="P96-1021">
  <Title>A Polynomial-Time Algorithm for Statistical Machine Translation</Title>
  <Section position="7" start_page="155" end_page="156" type="evalu">
    <SectionTitle>
5 Results
</SectionTitle>
    <Paragraph position="0"> The algorithm above was tested in the SILC translation system. The translation lexicon was largely constructed by training on the HKUST English-Chinese Parallel Bilingual Corpus, which consists of governmental transcripts. The corpus was sentence-aligned statistically (Wu, 1994); Chinese words and collocations were extracted (Fung and Wu, 1994; Wu and Fung, 1994); then translation pairs were learned via an EM procedure (Wu and Xia, 1995). The resulting English vocabulary is approximately 6,500 words and the Chinese vocabulary is approximately 5,500 words, with a many-to-many translation mapping averaging 2.25 Chinese translations per English word. Because the training is unsupervised, the translation lexicon contains noise and reaches only about 86% weighted precision.</Paragraph>
    <Paragraph position="1"> With regard to accuracy, we merely wish to demonstrate that for statistical MT, accuracy is not significantly compromised by substituting our efficient optimization algorithm. It is not our purpose here to argue that accuracy can be increased with our model. No morphological processing has been used to correct the output, and until now we have only been testing with a bigram model trained on extremely limited samples.</Paragraph>
    <Paragraph position="2"> [Figure 4: examples of the output] (Xiāng gǎng de ān dìng fán róng shì wǒ mén shēng huó fāng shì de zhī zhù.) Hong Kong's stabilize boom is us life styles's pillar. Our prosperity and stability underpin our way of life.</Paragraph>
    <Paragraph position="3"> (Běn gǎng de jīng jì qián jǐng yǔ zhōng guó, tè bié shì guǎng dōng shěng de jīng jì qián jǐng xī xī xiāng guān.) Hong Kong's economic foreground with China, particular Guangdong province's economic foreground vitally interrelated.</Paragraph>
    <Paragraph position="4"> Our economic future is inextricably bound up with China, and with Guangdong Province in particular.</Paragraph>
    <Paragraph position="5"> (Wǒ wán quán zhī chí tā de yì jiàn.) I absolutely uphold his views.</Paragraph>
    <Paragraph position="6"> I fully support his views.</Paragraph>
    <Paragraph position="7"> (Zhè xiē ān pái kě jiā qiáng wǒ mén rì hòu wéi chí jīn róng wěn dìng de néng lì.) These arrangements can enforce us future kept financial stabilization's competency. These arrangements will enhance our ability to maintain monetary stability in the years to come.</Paragraph>
    <Paragraph position="8"> (Bù guò, wǒ xiàn zài kě yǐ kěn dìng de shuō, wǒ mén jiāng huì tí gōng wèi dá dào gè xiàng zhǔ yào mù biāo suǒ xū de jīng fèi.) However, I now can certainty's say, will provide for us attain various dominant goal necessary's current expenditure.</Paragraph>
    <Paragraph position="9"> The consultation process is continuing but I can confirm now that the necessary funds will be made available to meet the key targets.</Paragraph>
    <Paragraph position="10"> A coarse evaluation of translation accuracy was performed on a random sample drawn from Chinese sentences of fewer than 20 words from the parallel corpus, the results of which are shown in Figure 3. We have judged only whether the correct meaning (as determined by the corresponding English sentence in the parallel corpus) is conveyed by the translation, paying particular attention to word order, but otherwise ignoring morphological and function word choices. For comparison, the accuracies from the A*-based systems are also shown. There is no significant difference in the accuracy. Some examples of the output are shown in Figure 4.</Paragraph>
    <Paragraph position="11"> On the other hand, the new algorithm has indeed proven to be much faster. At present we are unable to use direct measurement to compare the speed of the systems meaningfully, because of vast implementation differences between the systems. However, the order-of-magnitude improvements are immediately apparent. In the earlier system, translation of a single sentence took on the order of hours on Sun Sparc 10 workstations. In contrast, the new algorithm generally takes less than one minute, and usually substantially less, with no special optimization of the code.</Paragraph>
  </Section>
</Paper>