<?xml version="1.0" standalone="yes"?> <Paper uid="N06-2006"> <Title>Class Model Adaptation for Speech Summarisation</Title> <Section position="6" start_page="22" end_page="23" type="evalu"> <SectionTitle> 5 Results </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="22" end_page="22" type="sub_section"> <SectionTitle> 5.1 TRS Results </SectionTitle> <Paragraph position="0"> Initial experiments were conducted on the human transcriptions (TRS), and results are given in Table 1.</Paragraph> <Paragraph position="1"> Experiments on word models (Word) show relative improvements in terms of SumACCY of 7.5% and 2.1% for the 10% and 30% summarisation ratios, respectively. ROUGE metrics, however, do not show any significant improvement.</Paragraph> <Paragraph position="2"> Using class models (Class and Mixed), for all ROUGE metrics, relative improvements range from 3.5% to 13.4% for the 10% summarisation ratio, and from 8.6% to 16.5% for the 30% summarisation ratio. For SumACCY, relative improvements between 11.5% and 12.9% are observed.</Paragraph> </Section> <Section position="2" start_page="22" end_page="23" type="sub_section"> <SectionTitle> 5.2 ASR Results </SectionTitle> <Paragraph position="0"> ASR results for each experiment are given in Table 2 for the corresponding summarisation ratios. As with the TRS, LiM adaptation shows improvements in terms of SumACCY, but ROUGE metrics do not corroborate those results for the 10% summarisation ratio. Using class models, for all ROUGE metrics, relative improvements range from 6.0% to 22.2% and from 7.4% to 20.0% for the 10% and 30% summarisation ratios, respectively. 
SumACCY relative improvements range from 7.6% to 15.9%.</Paragraph> </Section> </Section> <Section position="7" start_page="23" end_page="23" type="evalu"> <SectionTitle> 6 Discussion </SectionTitle> <Paragraph position="0"> Compared to previous experiments using only word models, improvements obtained using class models are larger and more significant for both ROUGE and SumACCY metrics. This can be explained by the sparseness of the adaptation data, and by the fact that the nine talks used in these experiments are quite different from each other, especially since the speakers also vary in style. Class models are more robust to this variability in spontaneous speech than word models, since they generalise better to unseen word sequences.</Paragraph> <Paragraph position="1"> There is little difference between the Class and Mixed results, since the development phase assigned most of the weight to the class model component in the Mixed experiment, making its results quite similar to those of the Class experiment.</Paragraph> </Section> </Paper>
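The Discussion's point that class models generalise better to unseen word sequences can be illustrated with a minimal sketch. This is not the authors' implementation; the toy corpus, the `word2class` mapping, and all function names below are invented for illustration. In a class bigram model, P(w_i | w_{i-1}) is factored as P(c_i | c_{i-1}) * P(w_i | c_i), so a word pair never seen in training still receives probability mass as long as its class pair was seen.

```python
from collections import defaultdict

def train_class_bigram(corpus, word2class):
    """Estimate a class-based bigram model from a list of sentences
    (each a list of words), given a word-to-class mapping.
    Returns prob(w_prev, w) = P(class(w) | class(w_prev)) * P(w | class(w))."""
    class_bigrams = defaultdict(lambda: defaultdict(int))  # counts of c_prev -> c
    class_counts = defaultdict(int)                        # counts of each class
    emissions = defaultdict(lambda: defaultdict(int))      # counts of c -> w
    for sent in corpus:
        classes = [word2class[w] for w in sent]
        for w, c in zip(sent, classes):
            emissions[c][w] += 1
            class_counts[c] += 1
        for c_prev, c in zip(classes, classes[1:]):
            class_bigrams[c_prev][c] += 1

    def prob(w_prev, w):
        c_prev, c = word2class[w_prev], word2class[w]
        trans = class_bigrams[c_prev][c] / max(1, sum(class_bigrams[c_prev].values()))
        emit = emissions[c][w] / max(1, class_counts[c])
        return trans * emit

    return prob
```

For example, with a training corpus containing "he gave lectures" and "she presented talks", the word bigram "he presented" never occurs, yet the model assigns it non-zero probability because the class bigram PRON to VERB was observed; a pure word bigram model would assign it zero without smoothing. This sparsity advantage is what the Discussion appeals to for the varied, sparse adaptation data.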