<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1809">
  <Title>Term Extraction from Korean Corpora via Japanese</Title>
  <Section position="5" start_page="2" end_page="2" type="relat">
    <SectionTitle>
4 Related Work
</SectionTitle>
    <Paragraph position="0"> A number of corpus-based methods for extracting bilingual lexicons have been proposed (Smadja et al., 1996). In general, these methods use statistics obtained from a parallel or comparable bilingual corpus to extract word or phrase pairs that are strongly associated with each other. In contrast, our method uses a monolingual Korean corpus and a Japanese lexicon that is independent of the corpus, resources that are easier to obtain than parallel or comparable bilingual corpora.</Paragraph>
    <Paragraph position="1"> Jeong et al. (1999) and Oh and Choi (2001) independently explored statistical approaches to detecting foreign words in Korean text. Although their detection accuracy is reasonably high, these methods require a training corpus in which conventional and foreign words are annotated. Our approach does not require annotated corpora, but its detection accuracy is not yet high enough, as shown in Section 3.1. Combining the two approaches is expected to compensate for the drawbacks of each.</Paragraph>
  </Section>
</Paper>