<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1670"> <Title>Broad-Coverage Sense Disambiguation and Information Extraction with a Supersense Sequence Tagger[?]</Title> <Section position="11" start_page="600" end_page="601" type="concl"> <SectionTitle> 6 Conclusions </SectionTitle> <Paragraph position="0"> In this paper we presented a novel approach to broad-coverage word sense disambiguation and information extraction. We defined a tagset based on Wordnet supersenses, a much simpler and more general semantic model than Wordnet which, however, preserves significant polysemy information and includes standard named entity recognition categories. We showed that in this framework it is possible to perform accurate broad-coverage tagging with state-of-the-art sequence learning methods. The tagger considerably outperformed the most competitive baseline on both Semcor and Senseval data. To the best of our knowledge, the results on Senseval data provide the first convincing evidence of the possibility of improving by considerable amounts over the first sense baseline.</Paragraph> <Paragraph position="1"> We believe both the tagset and the structured learning approach contribute to these results. The simplified representation obviously helps by reducing the number of possible senses for each word (cf. Table 3). Interestingly, the relative improvement in performance is not as large as the relative reduction in polysemy. This indicates that sense granularity is only one of the problems in WSD. More needs to be understood concerning the sources of information, and the processes, that affect word sense selection in context. As far as the tagger is concerned, we applied the simplest feature representation; more sophisticated features, e.g., kernel-based ones, could be used and might contribute significantly by allowing complex feature combinations. These results also suggest new directions of research within this model. 
In particular, the labels occurring in each sequence tend to coincide with predicates (verbs) and arguments (nouns and named entities). A sequential dependency model might not be the most accurate at capturing the grammatical dependencies between these elements. Other conditional models, e.g., designed around head-to-head, or similar, dependencies could prove more appropriate.</Paragraph> <Paragraph position="2"> Another interesting issue is the granularity of the tagset. Supersenses seem more practical than synsets for investigating the impact of broad-coverage semantic tagging, but they define a very simplistic ontological model. A natural evolution of this kind of approach might be one which starts by defining a semantic model at an intermediate level of abstraction (cf. (Ciaramita et al., 2005)).</Paragraph> </Section> </Paper>