<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0702">
  <Title>Topic Detection Based on Dialogue History</Title>
  <Section position="7" start_page="22" end_page="22" type="evalu">
    <SectionTitle>
5 Results
</SectionTitle>
    <Paragraph position="0"> We performed the detection test described in Section 4.3 on 13 types of topic combinations, using both typical dialogue data and real situation dialogue data.</Paragraph>
    <Section position="1" start_page="22" end_page="22" type="sub_section">
      <SectionTitle>
5.1 Test results on typical dialogue data
</SectionTitle>
      <Paragraph position="0"> Figure 1 shows the results of topic detection on typical dialogue data for a varying number of clusters. The figure shows that the accuracy is highest when one sentence is set as one cluster (one sentence per cluster) in each topic, and lowest when one whole topic is set as one cluster.</Paragraph>
    </Section>
    <Section position="2" start_page="22" end_page="22" type="sub_section">
      <SectionTitle>
5.2 Test results on real situation dialogue data
</SectionTitle>
      <Paragraph position="0"> Figure 2 shows the results of topic detection on real situation dialogue data for a varying number of clusters. The figure shows that the accuracy for the medium cluster is slightly better than that for one sentence per cluster. This indicates that grouping similar sentences together improves the accuracy of the similarity calculation between input sentences and the training data.</Paragraph>
    </Section>
    <Section position="3" start_page="22" end_page="22" type="sub_section">
      <SectionTitle>
5.3 Results of dialogue history application
</SectionTitle>
      <Paragraph position="0"> We evaluated the effect of the dialogue history on typical dialogue test data, comparing the case of one sentence per cluster with the case of the medium cluster. Using only the input sentence, the topic detection accuracy was 59.2% for the former and 56.0% for the latter. Using three sentences from the dialogue history, the respective figures were 72.0% and 70.0% with equal weights, and 76.7% and 77.0% with time series weights.</Paragraph>
    </Section>
  </Section>
</Paper>