<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0834"> <Title>Supervised Word Sense Disambiguation with Support Vector Machines and Multiple Knowledge Sources</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> This paper describes the approach adopted by our systems that participated in the English lexical sample task and the multilingual lexical sample task of SENSEVAL-3. The goal of the English lexical sample task is to predict the correct sense of an ambiguous English word w, while that of the multilingual lexical sample task is to predict the correct Hindi (target language) translation of an ambiguous English (source language) word w.</Paragraph> <Paragraph position="1"> The multilingual lexical sample task is further subdivided into two subtasks: the translation subtask and the translation and sense subtask.</Paragraph> <Paragraph position="2"> The distinction is that for the translation and sense subtask, the English sense of the target ambiguous word w is also provided (for both training and test data).</Paragraph> <Paragraph position="3"> In all, we submitted three systems: system nusels for the English lexical sample task, system nusmlst for the translation subtask, and system nusmlsts for the translation and sense subtask.</Paragraph> <Paragraph position="4"> All systems were based on the supervised word sense disambiguation (WSD) system of Lee and Ng (2002), and used Support Vector Machines (SVM) learning. Only the training examples provided in the official training corpus were used to train the systems, and no other external resources were used. 
In particular, we did not use any external dictionary or the sample sentences in the provided dictionary.</Paragraph> <Paragraph position="5"> The knowledge sources used included the part-of-speech (POS) of neighboring words, single words in the surrounding context, local collocations, and syntactic relations, as described in Lee and Ng (2002). For the translation and sense subtask of the multilingual lexical sample task, the English sense given for the target word was also used as an additional knowledge source. All features encoding these knowledge sources were used, without any feature selection.</Paragraph> <Paragraph position="6"> We next describe SVM learning and the combined knowledge sources adopted. Much of the description follows that of Lee and Ng (2002).</Paragraph> </Section> </Paper>