File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/04/n04-4025_intro.xml
Size: 2,344 bytes
Last Modified: 2025-10-06 14:02:18
<?xml version="1.0" standalone="yes"?>
<Paper uid="N04-4025">
<Title>Automated Team Discourse Annotation and Performance Prediction Using LSA</Title>
<Section position="3" start_page="0" end_page="0" type="intro">
<SectionTitle>
2 Data
</SectionTitle>
<Paragraph position="0"> Our corpus (UAV-Corpus) consists of 67 transcripts collected from 11 teams, each of which completed 7 missions simulating flight of an Uninhabited Air Vehicle (UAV) in the CERTT (Cognitive Engineering Research on Team Tasks) Lab's synthetic team task environment (CERTT UAV-STE). The CERTT UAV-STE is a three-team-member task in which each team member receives distinct, though overlapping, training; has a unique, yet interdependent, role; and is presented with different and overlapping information during the mission. The overall goal is to fly the UAV to designated target areas and to take acceptable photos at those areas.</Paragraph>
<Paragraph position="1"> The 67 team-at-mission transcripts in the UAV-Corpus contain approximately 2,700 minutes of spoken dialogue in 20,545 separate utterances or turns, amounting to approximately 232,000 words or 660 KB of text. All communication was manually transcribed.</Paragraph>
<Paragraph position="2"> We were provided with the results of manual annotation of the corpus by three annotators using the Bowers Tag Set (Bowers et al., 1998), which includes tags for acknowledgement, action, factual, planning, response, uncertainty, and non-task-related utterances.</Paragraph>
<Paragraph position="3"> The three annotators had each tagged 26 or 27 team-at-missions, so that 12 team-at-missions were tagged by two annotators. Inter-coder reliability had been computed using the C-value measure (Schvaneveldt, 1990).</Paragraph>
<Paragraph position="4"> The overall C-value for transcripts with two taggers was 0.70. We computed Cohen's Kappa to be 0.62 (see Section 4 and Table 1).</Paragraph>
<Paragraph position="5"> Beyond the moderate level of inter-coder agreement, two further limitations remain: tagging was done at the turn level, where a turn could range from a single word to several utterances by a single speaker, and the number of tags assigned to a given turn could differ between taggers. We hope to address these limitations in the data set with a more thorough annotation study in the near future.</Paragraph>
</Section>
</Paper>
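The Cohen's Kappa figure above is computed in the standard way from the doubly-annotated turns: observed agreement corrected for the agreement expected by chance given each annotator's tag frequencies. A minimal sketch of that computation follows; it is not the authors' code, and the tag sequences shown are hypothetical toy labels (drawn from the Bowers Tag Set named above), not corpus data.

# Minimal sketch (assumption: one Bowers tag per turn per annotator).
from collections import Counter

def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    assert len(tags_a) == len(tags_b)
    n = len(tags_a)
    # Observed agreement: fraction of turns on which both annotators agree.
    p_o = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    # Chance agreement: expected overlap given each annotator's tag frequencies.
    freq_a, freq_b = Counter(tags_a), Counter(tags_b)
    p_e = sum(freq_a[t] * freq_b[t] for t in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical doubly-annotated turns, for illustration only:
a = ["acknowledgement", "action", "factual", "planning", "action"]
b = ["acknowledgement", "action", "factual", "response", "action"]
print(round(cohen_kappa(a, b), 2))  # 0.74 on this toy sample

On the real data, this would be run over the 12 team-at-missions tagged by two annotators, which under this formulation yields the reported kappa of 0.62.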