<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-0701">
<Title>Learning in Natural Language: Theory and Algorithmic Approaches*</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle>1 Introduction</SectionTitle>
<Paragraph position="0">Many important natural language inferences can be viewed as problems of resolving phonetic, syntactic, semantic, or pragmatic ambiguities based on properties of the surrounding context.</Paragraph>
<Paragraph position="1">It is generally accepted that a learning component must have a central role in resolving these context-sensitive ambiguities, and a significant amount of work has been devoted in the last few years to developing learning methods for these tasks, with considerable success. Yet our understanding of when and why learning works in this domain, and how it can be used to support increasingly higher-level tasks, is still lacking.</Paragraph>
<Paragraph position="2">This article summarizes work on developing a learning-theory account of the major learning approaches used in NL.</Paragraph>
<Paragraph position="3">* This research is supported by NSF grants IIS-9801638, SBR-9873450 and IIS-9984168.</Paragraph>
<Paragraph position="4">While the major statistics-based methods used in NLP are typically developed with a Bayesian view in mind, the Bayesian principle cannot directly explain the success and robustness of these methods, since their probabilistic assumptions typically do not hold in the data.</Paragraph>
<Paragraph position="5">Instead, we provide this explanation using a single, distribution-free inductive principle related to the PAC model of learning. We describe the unified learning framework and show that, in addition to explaining the success and robustness of the statistics-based methods, it also applies to other machine learning methods, such as rule-based and memory-based methods.</Paragraph>
<Paragraph position="6">An important component of the view developed here is the observation that most methods use the same simple knowledge representation: a linear representation over a new feature space, obtained by transforming the original instance space into a higher dimensional and more expressive space. Methods vary mostly algorithmically, in the ways they derive weights for the features in this space (see the sketches following this section). This is significant both for explaining the generalization properties of these methods and for developing an understanding of how and when these methods can be extended to learn from more structured, knowledge-intensive examples, perhaps hierarchically. These issues are discussed briefly, and we emphasize the importance of studying knowledge representation and inference in developing a learning-centered approach to NL inferences.</Paragraph>
</Section>
</Paper>
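The claim that the common statistics-based methods fit this linear representation can be illustrated with a standard derivation; the following sketch is added here for exposition and is not text from the paper. It shows that the naive Bayes predictor over binary features is a linear function of those features.

For a binary class $c \in \{0,1\}$ and binary features $x_1,\dots,x_n$, write $p_i^c = P(x_i = 1 \mid c)$. Naive Bayes predicts $c = 1$ exactly when

\[
P(c{=}1)\prod_{i=1}^{n} P(x_i \mid c{=}1) \;>\; P(c{=}0)\prod_{i=1}^{n} P(x_i \mid c{=}0).
\]

Using $P(x_i \mid c) = (p_i^c)^{x_i}(1-p_i^c)^{1-x_i}$ and taking logarithms, this is equivalent to

\[
\sum_{i=1}^{n} w_i x_i + \theta > 0,
\qquad
w_i = \log\frac{p_i^1\,(1-p_i^0)}{p_i^0\,(1-p_i^1)},
\qquad
\theta = \log\frac{P(c{=}1)}{P(c{=}0)} + \sum_{i=1}^{n}\log\frac{1-p_i^1}{1-p_i^0},
\]

that is, a linear separator over the feature space; what changes across methods is how the weights $w_i$ are estimated.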
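Along the same lines, the following minimal Python sketch illustrates the kind of feature-space transformation described above: a word-in-context instance is mapped to a sparse, higher-dimensional binary feature vector, and a decision is made with a linear function over that space. The feature names, window size, and weights are assumptions made for illustration only, not material from the paper.

# Minimal illustrative sketch: a context-sensitive disambiguation instance is
# mapped into a large, expressive binary feature space, and classified with a
# linear function over that space.  Weights below are made up for illustration.
from typing import Dict, List


def extract_features(tokens: List[str], target: int, window: int = 3) -> Dict[str, int]:
    """Map a target position in a sentence to sparse binary features:
    nearby words and conjunctions (adjacent-word pairs) within the window."""
    feats: Dict[str, int] = {}
    lo, hi = max(0, target - window), min(len(tokens), target + window + 1)
    for i in range(lo, hi):
        if i == target:
            continue
        feats[f"word@{i - target:+d}={tokens[i]}"] = 1  # single-word context feature
        if i + 1 < hi and i + 1 != target:
            feats[f"pair@{i - target:+d}={tokens[i]}_{tokens[i + 1]}"] = 1  # conjunctive feature
    return feats


def linear_score(feats: Dict[str, int], weights: Dict[str, float], bias: float = 0.0) -> float:
    """Linear decision function over the transformed feature space.  Different
    learning methods (naive Bayes, Winnow, rule-based, memory-based, ...) differ
    mainly in how these weights are derived."""
    return bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())


if __name__ == "__main__":
    # Toy disambiguation instance around the word at position 3 ("to").
    tokens = "I would like to go home".split()
    feats = extract_features(tokens, target=3)
    weights = {"word@-1=like": 1.2, "word@+1=go": 0.8, "pair@+1=go_home": 0.5}
    print(sorted(feats))
    print("score:", linear_score(feats, weights))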