File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/96/j96-1002_abstr.xml

Size: 4,074 bytes

Last Modified: 2025-10-06 13:48:40

<?xml version="1.0" standalone="yes"?>
<Paper uid="J96-1002">
  <Title>A Maximum Entropy Approach to Natural Language Processing</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> Statistical modeling addresses the problem of constructing a stochastic model to predict the behavior of a random process. In constructing this model, we typically have at our disposal a sample of output from the process. Given this sample, which represents an incomplete state of knowledge about the process, the modeling problem is to parlay this knowledge into a representation of the process. We can then use this representation to make predictions about the future behavior about the process.</Paragraph>
    <Paragraph position="1"> Baseball managers (who rank among the better paid statistical modelers) employ batting averages, compiled from a history of at-bats, to gauge the likelihood that a player will succeed in his next appearance at the plate. Thus informed, they manipulate their lineups accordingly. Wall Street speculators (who rank among the best paid statistical modelers) build models based on past stock price movements to predict tomorrow's fluctuations and alter their portfolios to capitalize on the predicted future.</Paragraph>
    <Paragraph position="2"> At the other end of the pay scale reside natural language researchers, who design language and acoustic models for use in speech recognition systems and related applications. null The past few decades have witnessed significant progress toward increasing the predictive capacity of statistical models of natural language. In language modeling, for instance, Bahl et al. (1989) have used decision tree models and Della Pietra et al. (1994) have used automatically inferred link grammars to model long range correlations in language. In parsing, Black et al. (1992) have described how to extract grammatical  * This research, supported in part by ARPA under grant ONR N00014-91-C-0135, was conducted while the authors were at the IBM T. J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598 t Now at Computer Science Department, Columbia University.</Paragraph>
    <Paragraph position="3"> :~ Now at Renaissance Technologies, Stony Brook, NY.</Paragraph>
    <Paragraph position="4"> (c) 1996 Association for Computational Linguistics  Computational Linguistics Volume 22, Number 1 rules from annotated text automatically and incorporate these rules into statistical models of grammar. In speech recognition, Lucassen and Mercer (1984) have introduced a technique for automatically discovering relevant features for the translation of word spelling to word pronunciation.</Paragraph>
    <Paragraph position="5"> These efforts, while varied in specifics, all confront two essential tasks of statistical modeling. The first task is to determine a set of statistics that captures the behavior of a random proceSs. Given a set of statistics, the second task is to corral these facts into an accurate model of the process--a model capable of predicting the future output of the process. The first task is one of feature selection; the second is one of model selection. In the following pages we present a unified approach to these two tasks based on the maximum entropy philosophy.</Paragraph>
    <Paragraph position="6"> In Section 2 we give an overview of the maximum entropy philosophy and work through a motivating example. In Section 3 we describe the mathematical structure of maximum entropy models and give an efficient algorithm for estimating the parameters of such models. In Section 4 we discuss feature selection, and present an automatic method for discovering facts about a process from a sample of output from the process. We then present a series of refinements to the method to make it practical to implement. Finally, in Section 5 we describe the application of maximum entropy ideas to several tasks in stochastic language processing: bilingual sense disambiguation, word reordering, and sentence segmentation.</Paragraph>
  </Section>
class="xml-element"></Paper>