<?xml version="1.0" standalone="yes"?>
<Paper uid="P02-1023">
  <Title>Improving Language Model Size Reduction using Better Pruning Criteria</Title>
  <Section position="10" start_page="111" end_page="111" type="concl">
    <SectionTitle>6 Conclusion</SectionTitle>
    <Paragraph position="0">Research on backoff n-gram pruning has focused on the development of the pruning criterion, which is used to estimate the performance loss of the pruned model.</Paragraph>
    <Paragraph position="1">This paper explores several pruning criteria for backoff n-gram model size reduction. In addition to the widely used probability criterion, two new pruning criteria based on rank and entropy have been developed. We have performed an empirical comparison of these pruning criteria. We have also presented a thresholding algorithm for model pruning, in which two pruning criteria can be used simultaneously. Finally, we have described our techniques for finding the optimal setting of the threshold pair given a specific model size.</Paragraph>
    <Paragraph position="2">We have shown several interesting results. They confirm our expectation that measures which are better correlated with CER for LM evaluation lead to better pruning criteria. Our experiments show that rank, which has the best correlation with CER, achieves the best performance when only one criterion is used in bigram model pruning. We then show empirically that the overlap of the bigrams pruned by different criteria is relatively low. This indicates that we might obtain improvements by combining two criteria for bigram pruning, since the information provided by these criteria is complementary. This hypothesis is confirmed by our experiments: using two pruning criteria simultaneously achieves better bigram models than using either criterion separately. In particular, the combination of rank and entropy achieves the smallest bigram models at the same CER.</Paragraph>
    <Paragraph position="3">In future work, more experiments will be performed on other language models, such as word-based bigram and trigram models for Chinese and English. More pruning criteria and their combinations will be investigated as well.</Paragraph>
  </Section>
</Paper>
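The conclusion summarizes two ideas: a thresholding scheme in which two pruning criteria are applied to each bigram simultaneously, and a procedure for choosing the threshold pair that reaches a target model size. The Python sketch below is a minimal, hypothetical illustration of those ideas, not the authors' implementation; the scoring functions, threshold names, the overlap measure, and the naive grid search over threshold pairs are assumptions introduced here for clarity.

    from typing import Callable, Dict, Optional, Sequence, Set, Tuple

    Bigram = Tuple[str, str]


    def prune_with_two_criteria(
        bigrams: Dict[Bigram, float],
        score_a: Callable[[Bigram], float],   # e.g. a rank-based loss estimate (assumed)
        score_b: Callable[[Bigram], float],   # e.g. an entropy-based loss estimate (assumed)
        threshold_a: float,
        threshold_b: float,
    ) -> Set[Bigram]:
        # A bigram is dropped only when BOTH estimated losses fall below their
        # thresholds, i.e. the two criteria are applied simultaneously.
        return {
            bg for bg in bigrams
            if score_a(bg) < threshold_a and score_b(bg) < threshold_b
        }


    def overlap_ratio(pruned_a: Set[Bigram], pruned_b: Set[Bigram]) -> float:
        # Fraction of bigrams pruned by both criteria, relative to the smaller
        # pruned set; a low value suggests the criteria carry complementary
        # information (this particular normalization is an assumption).
        if not pruned_a or not pruned_b:
            return 0.0
        return len(pruned_a & pruned_b) / min(len(pruned_a), len(pruned_b))


    def search_threshold_pair(
        bigrams: Dict[Bigram, float],
        score_a: Callable[[Bigram], float],
        score_b: Callable[[Bigram], float],
        candidates_a: Sequence[float],
        candidates_b: Sequence[float],
        target_size: int,
        eval_cer: Callable[[Set[Bigram]], float],  # held-out CER of the pruned model (assumed)
    ) -> Optional[Tuple[float, float, float]]:
        # Naive grid search over candidate threshold pairs (an assumption, not the
        # paper's technique): among pairs whose pruned model meets the target
        # size, keep the pair with the lowest character error rate.
        best = None  # (cer, threshold_a, threshold_b)
        for ta in candidates_a:
            for tb in candidates_b:
                pruned = prune_with_two_criteria(bigrams, score_a, score_b, ta, tb)
                if len(bigrams) - len(pruned) <= target_size:
                    cer = eval_cer(pruned)
                    if best is None or cer < best[0]:
                        best = (cer, ta, tb)
        return best

The grid search is deliberately simple; its only purpose is to make concrete what "finding the optimal setting of the threshold pair given a specific model size" involves, namely trading off model size against held-out CER over pairs of thresholds rather than a single one.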