<?xml version="1.0" standalone="yes"?> <Paper uid="C02-1143"> <Title>Simple Features for Chinese Word Sense Disambiguation</Title> <Section position="5" start_page="0" end_page="0" type="concl"> <SectionTitle> 4 Conclusion </SectionTitle> <Paragraph position="0"> We have demonstrated the high performance of maximum entropy models for word sense disambiguation in English, and have applied the same approach successfully to Chinese. Although SENSEVAL-2 showed that methods that work for English also tend to work for other languages, our experiments revealed striking differences in the types of features that matter for English and Chinese WSD. Parse information seemed crucial for English WSD but played only a minor role in Chinese; in fact, the improvement in Chinese performance contributed by manual parse information in the CTB disappeared altogether when the PDN was parsed automatically. The fact that bracketing was more important for English than for Chinese WSD suggests that predicate-argument information and selectional restrictions may play a more important role in distinguishing English verb senses than Chinese senses. Alternatively, Chinese verbs may tend to be adjacent to their arguments, so that collocational information suffices to capture what would require parsing in English. This is a question for further study.</Paragraph> <Paragraph position="1"> The simpler level of linguistic processing required to achieve relatively high sense-tagging accuracy in Chinese highlights an important difference between the two languages. Chinese differs from English in that much of its linguistic ambiguity occurs at the basic level of word segmentation. Chinese word segmentation is a major task in itself, and it seems that, once this is accomplished, little more needs to be done for sense disambiguation. 
Our experience in English has shown that the ability to identify multi-word constructions significantly improves sense-tagging performance.</Paragraph> <Paragraph position="2"> Multi-character Chinese words, which are identified by word segmentation, may be the analogue of English multi-word constructions.</Paragraph> </Section> </Paper>