File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/92/h92-1016_evalu.xml
Size: 3,666 bytes
Last Modified: 2025-10-06 14:00:08
<?xml version="1.0" standalone="yes"?> <Paper uid="H92-1016"> <Title>The MIT ATIS System: February 1992 Progress Report</Title> <Section position="6" start_page="87" end_page="87" type="evalu"> <SectionTitle> SUMMARY AND FUTURE WORK </SectionTitle>
<Paragraph position="0"> This paper describes the improvements that we have made to the recognition component of our ATIS system.</Paragraph>
<Paragraph position="1"> By incorporating more language constraints (using a bigram and a probabilistic LR parser) and performing context-dependent phonetic modelling, we have realized a significant reduction in recognition error rates. This has led to a corresponding decrease in the weighted error of the overall spoken language system. Much of the phonetic recognition component of our system has been ported to a set of off-the-shelf DSP boards. The complete system, using an IBM RS6000 for lexical access and a Sun SPARCstation-II for the rest of the processing, now runs in 2-3 times real time.</Paragraph>
<Paragraph position="2"> In the coming months, we plan to conduct research in several directions that we hope will lead to further improvement in system performance. These areas include the introduction of gender-specific acoustic models, modelling out-of-vocabulary words, modelling spontaneous speech effects such as pauses, increasing the size of the lexicon and of the training set, and better language models.</Paragraph>
<Paragraph position="3"> Our results show that better language modelling is crucial to improved performance. Our future research in this area falls into several categories. In addition to developing a bigram grammar, we have begun to explore the use of class bigrams as well as more general N-grams.</Paragraph>
<Paragraph position="4"> The class bigrams we examined grouped similar words together in order to reduce the number of unseen word pairs. We investigated both grouping the conditioning context into classes, p(w_b | C(w_a)), and grouping the word itself:</Paragraph>
<Paragraph position="5"> p(w_b | w_a) ≈ p(C(w_b) | w_a) p(w_b | C(w_b)),</Paragraph>
<Paragraph position="6"> where C(w_b) is the general class of words to which w_b belongs. We explored a number of different classes and found that we could reduce the development set perplexity by a small amount, to 19.5.</Paragraph>
<Paragraph position="7"> We have also begun to explore the use of more general N-grams and class N-grams. The N-gram language model stores all word sequences observed in the training data. In order to represent these grammars efficiently, we store them in the form of a hierarchical tree, where each node deeper in the tree represents one word farther back in the past. Smoothing becomes extremely important for the N-gram. Thus far we have used a generalization of the bigram interpolation procedure, so that N-gram smoothing is done recursively:</Paragraph>
<Paragraph position="8"> p̂_n(w_i | w_{i-n+1} ... w_{i-1}) = λ_n p_n(w_i | w_{i-n+1} ... w_{i-1}) + (1 - λ_n) p̂_{n-1}(w_i | w_{i-n+2} ... w_{i-1}),</Paragraph>
<Paragraph position="9"> so that p̂_1 = p_1. Our initial experiments suggest that, by incorporating a class 4-gram directly into the N-best search, we can reduce our sentence error rate from 62.5% to 56.3% (for N = 1) on the development set, although there is a corresponding increase in the amount of search required. We have also found that by simply adding the class 4-gram scores into our N-best resorting algorithm we can reduce our sentence error rate from 59.6% to 56.0% on the February '92 test set.</Paragraph>
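To make the tree-structured N-gram and its recursive interpolation concrete, the following is a minimal sketch under our own assumptions; it is not the system's implementation. The class names, the fixed interpolation weight lam, and the toy training sentences are illustrative only (in practice the interpolation weights would be estimated, e.g. by deleted interpolation on held-out data).

```python
from collections import defaultdict


class NGramNode:
    """One node of the tree; each level deeper indexes one word farther back in the past."""
    def __init__(self):
        self.counts = defaultdict(int)   # next-word counts given this history
        self.total = 0                   # total count at this node
        self.children = {}               # history word -> deeper NGramNode


class NGramModel:
    def __init__(self, order, lam=0.6):
        self.order = order               # e.g., 4 for a class 4-gram
        self.lam = lam                   # assumed fixed interpolation weight
        self.root = NGramNode()          # root holds unigram counts

    def train(self, sentences):
        for words in sentences:
            for i, w in enumerate(words):
                node = self.root
                node.counts[w] += 1
                node.total += 1
                # Walk back through the history, one word deeper per tree level.
                for back in range(1, self.order):
                    if i - back < 0:
                        break
                    node = node.children.setdefault(words[i - back], NGramNode())
                    node.counts[w] += 1
                    node.total += 1

    def prob(self, w, history):
        """Recursively interpolated p̂(w | history); history is given oldest-first."""
        return self._interp(w, list(history)[-(self.order - 1):])

    def _interp(self, w, history):
        # Base case of the recursion: p̂_1 = p_1, the raw unigram estimate.
        if not history:
            return self.root.counts[w] / self.root.total if self.root.total else 0.0
        # Higher-order term p_n(w | history), read off the tree if the history was seen.
        node = self.root
        for h in reversed(history):      # most recent word = one level deeper
            node = node.children.get(h)
            if node is None:
                p_n = 0.0
                break
        else:
            p_n = node.counts[w] / node.total if node.total else 0.0
        # Interpolate with the next-shorter history (drop the oldest word).
        return self.lam * p_n + (1.0 - self.lam) * self._interp(w, history[1:])


# Toy usage: a 3-gram model over two made-up ATIS-like sentences.
model = NGramModel(order=3)
model.train([["show", "me", "flights", "to", "boston"],
             ["show", "me", "fares", "to", "denver"]])
print(model.prob("flights", ["show", "me"]))
```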
<Paragraph position="10"> Future work in language modelling will focus on the application of this general model. In particular, conditioning the probabilities on the entire parse stack, rather than on the current state (essentially the top of the stack), should further reduce perplexity and bring long-distance constraints to bear.</Paragraph> </Section> </Paper>