<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0214">
  <Title>Discourse Annotation in the Monroe Corpus</Title>
  <Section position="6" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
4 Results
</SectionTitle>
    <Paragraph position="0"> Spoken dialogue is a very difficult domain to work with because utterances are often incomplete, ungrammatical, or marred by disfluencies and speech repairs, and speakers interrupt each other. As a result, many empirical methods that work well in formal, structured domains such as newspaper text or manuals tend to suffer. For example, many leading pronoun resolution methods achieve around 80% accuracy over a corpus of syntactically parsed Wall Street Journal articles (e.g., Tetreault, 2001; Ge et al., 1998), but in spoken dialogue the performance of these algorithms drops significantly (Byron, 2002).</Paragraph>
    <Paragraph position="1"> However, including semantic and discourse information can improve performance.</Paragraph>
    <Paragraph position="2"> Our preliminary results show that using the semantic feature lists associated with each entity as a filter for reference increases performance from 44% to 59%. Adding discourse segmentation boosts that figure to 66% over some parts of the corpus.</Paragraph>
  </Section>
</Paper>