<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1809"> <Title>Term Extraction from Korean Corpora via Japanese</Title> <Section position="5" start_page="2" end_page="2" type="relat"> <SectionTitle> 4 Related Work </SectionTitle> <Paragraph position="0"> A number of corpus-based methods for extracting bilingual lexicons have been proposed (Smadja et al., 1996). In general, these methods use statistics obtained from a parallel or comparable bilingual corpus to extract word or phrase pairs that are strongly associated with each other. In contrast, our method uses a monolingual Korean corpus and a Japanese lexicon that is independent of the corpus, resources that are easier to obtain than parallel or comparable bilingual corpora.</Paragraph> <Paragraph position="1"> Jeong et al. (1999) and Oh and Choi (2001) independently explored statistical approaches to detecting foreign words in Korean text. Although their detection accuracy is reasonably high, these methods require a training corpus in which conventional and foreign words are annotated. Our approach does not require annotated corpora, but its detection accuracy is not yet high enough, as shown in Section 3.1. Combining the two approaches is expected to compensate for the drawbacks of each.</Paragraph> </Section> </Paper>