<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1318">
  <Title>Measuring annotator agreement in a complex hierarchical dialogue act annotation scheme</Title>
  <Section position="8" start_page="558" end_page="558" type="concl">
    <SectionTitle>
6 Conclusion and future work
</SectionTitle>
    <Paragraph position="0"> In this paper we have presented agreement scores for Cohen's unweighted κ and claimed that, for annotation schemes with hierarchically related tags, a weighted κ gives a better indication of (dis)agreement than unweighted κ. The κ scores for some dimensions do not seem particularly spectacular, but they become more interesting when semantic-pragmatic differences between dialogue acts or CFs are considered. Even though weighting has somewhat arbitrary aspects, a weighted metric with carefully chosen parameters gives a better representation of inter-annotator agreement. More generally, we propose that semantic-pragmatic relatedness between taxonomic concepts should be taken into account when calculating inter-annotator (dis)agreement. While we used DIT++ as the tagset, the weighting function we proposed can be employed in any taxonomy containing hierarchically related concepts, since we only used structural properties of the taxonomy.</Paragraph>
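    The contrast between unweighted and weighted agreement can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy taxonomy, the tag names, the two annotation sequences, and the weighting itself (normalised tree distance), which is a simple stand-in and not the weighting function proposed in the paper. Kappa is computed in its disagreement form, one minus the ratio of observed to chance-expected weighted disagreement.

```python
from collections import Counter
from itertools import product

# Hypothetical toy taxonomy (child: parent); NOT the DIT++ hierarchy.
PARENT = {
    "info-seeking": "root",
    "info-providing": "root",
    "yn-question": "info-seeking",
    "wh-question": "info-seeking",
    "check": "yn-question",
    "inform": "info-providing",
}


def path_to_root(tag):
    """Return the nodes from tag up to the root, inclusive."""
    path = [tag]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def tree_distance(a, b):
    """Number of edges between two tags via their lowest common ancestor."""
    up_a, up_b = path_to_root(a), path_to_root(b)
    common = set(up_a).intersection(up_b)
    steps_a = next(i for i, node in enumerate(up_a) if node in common)
    steps_b = next(i for i, node in enumerate(up_b) if node in common)
    return steps_a + steps_b


ALL_TAGS = sorted(set(PARENT) | set(PARENT.values()))
MAX_DIST = max(tree_distance(x, y) for x, y in product(ALL_TAGS, ALL_TAGS))


def unweighted(x, y):
    """Every mismatch counts as full disagreement."""
    return 0.0 if x == y else 1.0


def weighted(x, y):
    """Assumed weighting: tree distance scaled to the range 0..1."""
    return tree_distance(x, y) / MAX_DIST


def kappa(labels_a, labels_b, disagreement):
    """Cohen's kappa as 1 - observed / expected weighted disagreement."""
    n = len(labels_a)
    tags = sorted(set(labels_a) | set(labels_b))
    pairs = Counter(zip(labels_a, labels_b))
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    d_obs = sum(disagreement(x, y) * c for (x, y), c in pairs.items()) / n
    d_exp = sum(
        disagreement(x, y) * freq_a[x] * freq_b[y]
        for x, y in product(tags, tags)
    ) / (n * n)
    return 1.0 - d_obs / d_exp


# Made-up annotations of six utterances by two annotators.
ann_a = ["check", "yn-question", "inform", "wh-question", "inform", "check"]
ann_b = ["yn-question", "yn-question", "inform", "wh-question", "check", "check"]

print("unweighted kappa:", round(kappa(ann_a, ann_b, unweighted), 3))
print("weighted kappa:  ", round(kappa(ann_a, ann_b, weighted), 3))
```

    With these toy labels the weighted score comes out higher than the unweighted one, because one of the two disagreements is between a tag and its direct parent and is therefore penalised only lightly.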
    <Paragraph position="1"> We have also quantitatively (see footnote 8) evaluated the DIT++ tagset per dimension, and obtained an indication of its usability. We focussed on agreement per dimension, but a global indication of the difference in semantic-pragmatic interpretation of a complete utterance requires us to consider other aspects. A truly multidimensional study of inter-annotator agreement should not only take intra-dimensional aspects into account but also relate the dimensions to each other.</Paragraph>
    <Paragraph position="2"> In (Bunt and Girard, 2005; Bunt, 2006) it is argued that dimensions should be orthogonal, meaning that an utterance can have a function in one dimension independent of functions in other dimensions.</Paragraph>
    <Paragraph position="3"> This is a somewhat utopian condition, since some functions show correlations and dependencies across dimensions. For this reason it makes sense to try to express the effect of strong correlations, dependencies and possible entailments in a multidimensional notion of (dis)agreement. Additionally, it may be desirable to take into account the importance that a CF can have. It is widely acknowledged that utterances are often multifunctional, but it could be argued that in many cases an utterance has a primary function and secondary functions; for instance, if an utterance has both a task-related function and one or more other functions, the task-related function is typically felt to be more important than the other functions, and disagreement about the task-related function is therefore felt to be more serious than disagreement about one of the other functions. This might be taken into account by adding a weighting function when combining agreement measures over multiple dimensions.</Paragraph>
    <Paragraph position="4"> Other future work we plan is more methodological in nature: quantifying the relative effect of the factors that may have influenced the scores we have found. This would give more insight into what exactly is being evaluated.</Paragraph>
    <Paragraph position="5"> As for evaluating the tagset, we plan, for instance, to further analyze co-occurrence matrices to identify frequent misannotations, and to have annotators think aloud while performing the annotation task.</Paragraph>
    <Paragraph position="6"> Footnote 8: Kappa statistics are only indicative. A full understanding of what the figures represent requires qualitative analysis, e.g. using co-occurrence matrices, which is beyond the scope of this paper.</Paragraph>
  </Section>
</Paper>