<?xml version="1.0" standalone="yes"?>
<Paper uid="P05-2019">
  <Title>A corpus-based approach to topic in Danish dialog</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle> Abstract </SectionTitle>
    <Paragraph position="0"> We report on an investigation of the pragmatic category of topic in Danish dialog and its correlation to surface features of NPs. Using a corpus of 444 utterances, we trained a decision tree system on 16 features. The system achieved near-human performance with success rates of 84-89% and F1-scores of 0.63-0.72 in 10-fold cross validation tests (human performance: 89% and 0.78). The most important features turned out to be preverbal position, definiteness, pronominalisation, and non-subordination. We discovered that NPs in epistemic matrix clauses (e.g. &quot;I think ...&quot;) were seldom topics and we suspect that this holds for other inter-personal matrix clauses as well.</Paragraph>
  </Section>
</Paper>