File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/04/w04-0812_intro.xml
Size: 1,249 bytes
Last Modified: 2025-10-06 14:02:35
<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0812"> <Title>Senseval-3: The Italian All-words Task</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> This paper describes the Italian all-words sense disambiguation task for Senseval-3: about 5000 words were manually disambiguated according to the ItalWordNet (IWN) word senses. The first section briefly describes the corpus and the lexical reference resource. The second section presents some general criteria adopted for annotating the corpus, illustrated by a series of examples. Issues connected to the treatment of phenomena typically found in corpora, e.g.</Paragraph> <Paragraph position="1"> abbreviations, foreign words, jargon, and locutions, are discussed. Furthermore, the encoding of compounds, metaphorical usages, and multiword units is described. Problems connected with i) the high granularity of sense distinctions in the lexical resource and ii) unresolvable ambiguities in the contexts are also addressed. Finally, it is shown how the annotation exercise can help in updating or tuning IWN by adding missing senses and/or entries.</Paragraph> </Section> </Paper>