<?xml version="1.0" standalone="yes"?> <Paper uid="C94-2139"> <Title>Analysis of Japanese Compound Nouns using Collocational Information</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Analyzing compound nouns is one of the crucial issues for natural language processing systems, in particular for those systems that aim at a wide coverage of domains. Registering all compound nouns in a dictionary is an impractical approach, since we can create a new compound noun by combining nouns. Therefore, a mechanism to analyze the structure of a compound noun from the individual nouns is necessary.</Paragraph> <Paragraph position="1"> In order to identify the structure of a compound noun, we must first find the set of words that compose it. This task is trivial for languages such as English, where words are separated by spaces. The situation is worse, however, in Japanese, where no spaces are placed between words. The process of identifying word boundaries is usually called segmentation. In processing languages such as Japanese, ambiguities in segmentation should be resolved at the same time as analyzing structure. For instance, the Japanese compound noun &quot;新型間接税&quot; (new indirect tax) has 16 (= 2^4) possible segmentations, since each of the four boundaries between its five characters may or may not be a word boundary. By consulting a Japanese dictionary, we can filter out some of them. In this case, two possibilities remain: &quot;新 (new)/型 (type)/間接 (indirect)/税 (tax)&quot; and &quot;新型 (new)/間接 (indirect)/税 (tax)&quot;. We must choose the correct segmentation, &quot;新型 (new)/間接 (indirect)/税 (tax)&quot;, and analyze its structure.</Paragraph> </Section> </Paper>