<?xml version="1.0" standalone="yes"?> <Paper uid="P04-1039"> <Title>Relieving The Data Acquisition Bottleneck In Word Sense Disambiguation</Title> <Section position="9" start_page="0" end_page="0" type="concl"> <SectionTitle> 6 Conclusion and Future Directions </SectionTitle> <Paragraph position="0"> In this paper, we applied an unsupervised approach within a learning framework, SALAAM, for the sense annotation of large amounts of data. The ultimate goal of SALAAM is to alleviate the data labelling bottleneck by means of a trade-off between the quality and the quantity of the training data.</Paragraph> <Paragraph position="1"> SALAAM is competitive with state-of-the-art unsupervised systems evaluated on the same test set from SENSEVAL2. Moreover, it yields superior results to those obtained by the only comparable bootstrapping approach when tested on the same data set. In addition, we explored in depth the different factors that directly and indirectly affect the performance of SALAAM, quantified as a performance ratio, PR. Sense Distribution Correlation (SDC) and Sense Context Confusability (SCC) have the highest direct impact on PR. However, the evidence suggests that a confluence of all the factors probably leads to the best prediction of an acceptable PR value. Future work will investigate the feasibility of combining these factors with the attributes of SALAAM's experimental conditions to automatically predict when the noisy training data can reliably replace manually annotated data.</Paragraph> </Section> </Paper>