<?xml version="1.0" standalone="yes"?> <Paper uid="W04-2313"> <Title>Towards Automatic Identification of Discourse Markers in Dialogs: The Case of Like</Title> <Section position="11" start_page="0" end_page="0" type="concl"> <SectionTitle> 9 Conclusion </SectionTitle> <Paragraph position="0"> This paper has presented several computational approaches to the disambiguation of discourse markers (DMs), with a focus on the highly ambiguous word like. Experiments on how reliably humans can annotate the discourse marker like show that relatively untrained annotators reach a kappa agreement of about 0.74, producing reliable, though not perfect, annotations, provided they have access to the sound files. Automatic identification using a set of collocation filters can help annotators by discarding some of the non-pragmatic occurrences. However, POS taggers seem unable to disambiguate the occurrences of like in speech transcripts. Finally, decision trees trained on about 2,100 occurrences of like confirm that collocation filters are the most relevant features, followed by time-based features; the trees correctly classify more than 80% of the occurrences of like and more than 90% of those of well.</Paragraph> <Paragraph position="1"> Future work should explore the relevance of other potential features. However, given the strong pragmatic function of DMs, it is unlikely that low-level features combined with machine learning will entirely solve the problem. As we have seen, POS tagging is quite unreliable on DMs, but POS tags from the surrounding words could serve as features for statistical training. More data and more reliable annotations will also help. Another promising approach is the generalization of classification features across several DMs, which would allow the detection of an entire class of discourse markers.</Paragraph> </Section> </Paper>