<?xml version="1.0" standalone="yes"?>
<Paper uid="P95-1032">
  <Title>A Pattern Matching Method for Finding Noun and Proper Noun Translations from Noisy Parallel Corpora</Title>
  <Section position="6" start_page="241" end_page="241" type="concl">
    <SectionTitle>
7 Conclusion
</SectionTitle>
    <Paragraph position="0"> Our algorithm bypasses the sentence alignment step to find a bilingual lexicon of nouns and proper nouns.</Paragraph>
    <Paragraph position="1"> Its output shows promise for compiling domain-specific, technical, and regional compound terms. It has proved effective in computing such a lexicon from noisy texts with no sentence boundary information; fine-grained sentence alignment is not necessary for lexicon compilation as long as highly reliable anchor points are available. Compared to other word alignment algorithms, it needs no a priori information. Since EM-based word alignment algorithms with random initialization can fall into local maxima, our output can also be used to provide a better initial basis for EM methods. It has also shown promise for finding noun phrases in English and Chinese, as well as for finding new Chinese words that were not tokenized by a Chinese word tokenizer. We are currently working on identifying full noun phrases and compound words from noisy parallel corpora using statistical and linguistic information.</Paragraph>
  </Section>
</Paper>