<?xml version="1.0" standalone="yes"?>
<Paper uid="P05-2021">
  <Title>Speech Recognition of Czech - Inclusion of Rare Words Helps</Title>
  <Section position="5" start_page="124" end_page="125" type="concl">
    <SectionTitle>
4 Conclusion
</SectionTitle>
    <Paragraph position="0"> In this paper, we have proposed injecting a loop of supplementary words into the back-off state of a first-pass language model. As it turned out, adding rare or morphology-generated words to a language model can considerably decrease both the recognition word error rate (WER) and the oracle WER in a single recognition pass. In the recognition of Czech Broadcast News, we achieved a 13.6% relative improvement in word error rate. In terms of oracle error rate, we observed a relative improvement of more than 30%. On the MALACH data, we attained only a marginal word error rate reduction: since the text corpora already covered the transcribed speech relatively well, a smaller OOV reduction translated into a smaller word error rate reduction. In the near future, we would like to test our approach on agglutinative languages, where the problems with high OOV rates are even more challenging. We would also like to experiment with more complex language models.</Paragraph>
  </Section>
</Paper>