<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0321"> <Title>Word Sense Disambiguation Based on Structured Semantic Space*</Title> <Section position="8" start_page="194" end_page="194" type="concl"> <SectionTitle> 6. Conclusions and Future work </SectionTitle> <Paragraph position="0"> In this paper, we propose a formal language resource, structured semantic space, as a foundation for word sense disambiguation (WSD) tasks. For a word in a given context, the context activates some sense clusters in the semantic space because of its similarity to the contexts of the senses in those clusters; the correct sense of the word is then determined by comparing its definitions with those of the words in the clusters.</Paragraph> <Paragraph position="1"> Structured semantic space can be seen as a general model for WSD problems, because it does not rely on any language-specific knowledge. For a given language, we can first use its mono-sense words to outline its semantic space and produce a dendrogram according to their similarity; word sense disambiguation can then be carried out based on the dendrogram and the definitions of the words given in a dictionary.</Paragraph> <Paragraph position="2"> An ideal structured semantic space should be homogeneous, i.e., its clusters should be well distributed, neither too dense nor too sparse. If the space is too dense, too many clusters may be activated by a context. Conversely, if it is too sparse, no clusters may be activated by a context; and even if some are, the senses in those clusters may not be similar to the correct sense of the target word. 
Future work therefore includes evaluating the homogeneity of the semantic space, locating its non-homogeneous areas, and making them homogeneous.</Paragraph> <Paragraph position="3"> The disambiguation accuracy will be reduced if a cluster contains fewer words, because with fewer words its definition vectors become less reliable at revealing the similar words included in the definitions. However, it seems impossible to ensure that every cluster contains enough words when only mono-sense words are taken into consideration in building the semantic space.</Paragraph> <Paragraph position="4"> To make the clusters contain more words, we must make use of ambiguous words. Future work therefore includes adding ambiguous words to clusters based on their contexts.</Paragraph> <Paragraph position="5"> Another problem concerns the length of the contexts to be considered. With longer contexts, too many clusters may be activated; with shorter contexts, information meaningful for word sense disambiguation may be lost. Future work therefore also includes deciding on an appropriate context length while still capturing the meaningful information carried by words outside the considered contexts.</Paragraph> </Section> </Paper>