<?xml version="1.0" standalone="yes"?>
<Paper uid="H05-1061">
  <Title>Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 483-490, Vancouver, October 2005. ©2005 Association for Computational Linguistics. Mining Key Phrase Translations from Web Corpora</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0">Key phrases are usually among the most information-bearing linguistic structures.</Paragraph>
    <Paragraph position="1">Translating them correctly will improve many natural language processing applications. We propose a new framework to mine key phrase translations from web corpora. We submit a source phrase to a search engine as a query, then expand the query by adding the translations of topic-relevant hint words from the returned snippets. We retrieve mixed-language web pages based on the expanded queries. Finally, we extract the key phrase translation from the second-round returned web page snippets using phonetic, semantic, and frequency-distance features. We achieve 46% phrase translation accuracy when using the top 10 returned snippets, and 80% accuracy with 165 snippets. Both results are significantly better than those of several existing methods.</Paragraph>
  </Section>
</Paper>