<?xml version="1.0" standalone="yes"?>
<Paper uid="P97-1056">
<Title>Memory-Based Learning: Using Similarity for Smoothing</Title>
<Section position="8" start_page="440" end_page="441" type="concl">
<SectionTitle>6 Conclusion</SectionTitle>
<Paragraph position="0"> We have analysed the relationship between Back-off smoothing and Memory-Based Learning and established a close correspondence between these two frameworks, which were hitherto mostly seen as unrelated. An exception is the use of similarity for alleviating the sparse data problem in language modeling (Essen & Steinbiss, 1992; Brown et al., 1992; Dagan et al., 1994). However, these works differ in focus from our analysis: their emphasis is on similarity between values of a single feature (e.g. words), rather than on similarity between patterns that are a (possibly complex) combination of many features.</Paragraph>
<Paragraph position="1"> The comparison of MBL and Back-off shows that the two approaches perform smoothing in a very similar way, i.e. by using estimates from more general patterns when specific patterns are absent from the training data. The analysis shows that MBL and Back-off use exactly the same type of data and counts, which implies that MBL can safely be incorporated into a system that is explicitly probabilistic. Since the underlying k-NN classifier does not require any of the common independence or distribution assumptions, this promises to be a fruitful approach.</Paragraph>
<Paragraph position="2"> A major advantage of the described approach is that in MBL the back-off sequence is specified by the similarity metric itself, without manual intervention or the estimation of smoothing parameters on held-out data, and it requires only one parameter per feature instead of an exponential number of parameters. With a feature-weighting metric such as Information Gain, MBL is particularly advantageous for NLP tasks where the conditioning events are complex, where they consist of a fusion of different information sources, or where the data is noisy. This was illustrated by the experiments on the PP-attachment and POS-tagging data-sets.</Paragraph>
</Section>
</Paper>
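To make the back-off-by-similarity idea concrete, the following is a minimal illustrative sketch, not the paper's implementation and not a reproduction of its experiments: a k-NN classifier with an Information-Gain-weighted overlap metric. The feature weights and the toy PP-attachment memory are purely hypothetical. The point is that the weighted metric alone fixes which more general patterns an unseen pattern backs off to, using one weight per feature rather than an exponential number of smoothing parameters.

# Minimal sketch of memory-based classification with an IG-weighted overlap
# metric (hypothetical weights and data; not the authors' system).
from collections import Counter

def ig_weighted_distance(x, y, weights):
    """Weighted overlap distance: sum of the weights of the features on which x and y disagree."""
    return sum(w for xi, yi, w in zip(x, y, weights) if xi != yi)

def mbl_classify(test_pattern, memory, weights, k=1):
    """Classify by majority vote over the k stored patterns nearest to test_pattern."""
    ranked = sorted(memory, key=lambda item: ig_weighted_distance(test_pattern, item[0], weights))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy PP-attachment-style memory: (verb, noun1, prep, noun2) -> attachment site.
memory = [
    (("eat", "pizza", "with", "fork"), "V"),
    (("eat", "pizza", "with", "anchovies"), "N"),
    (("see", "man", "with", "telescope"), "V"),
]
# Hypothetical Information Gain weights, one per feature.
weights = [0.4, 0.2, 0.1, 0.3]

# An unseen pattern is resolved by the most similar (more general) stored patterns:
# here the mismatches on noun1 and noun2 are outweighed by the matching verb and
# preposition, so the pattern backs off to the "fork"-like exemplar and gets "V".
print(mbl_classify(("eat", "pasta", "with", "spoon"), memory, weights))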