<?xml version="1.0" standalone="yes"?>
<Paper uid="J92-4001">
  <Title>The Acquisition and Use of Context-Dependent Grammars for English</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> An enduring goal for natural language processing (NLP) researchers has been to construct computer programs that can read narrative, descriptive texts such as newspaper stories and translate them into knowledge structures that can answer questions, classify the content, and provide summaries or other useful abstractions of the text. An essential aspect of any such NLP system is parsing--to translate the indefinitely long, recursively embedded strings of words into definite ordered structures of constituent elements. Despite decades of research, parsing remains a difficult computation that often results in incomplete, ambiguous structures; and computational grammars for natural languages remain notably incomplete. In this paper we suggest that a solution to these problems may be found in the use of context-sensitive rules applied by a deterministic shift/reduce parser.</Paragraph>
    <Paragraph position="1"> A system is described for rapid acquisition of a context-sensitive grammar based on ordinary news text. The resulting grammar is accessed by deterministic, bottom-up parsers to compute phrase structure or case analyses of texts that the grammars coven The acquisition system allows a linguist to teach a CDG grammar by showing examples of parsing successive constituents of sentences. At this writing, 16,275 example constituents have been shown to the system and used to parse 345 sentences ranging from 10 to 60 words in length achieving 99% accuracy. These examples compress to a grammar of 3,843 rules that are equally effective in parsing. Extrapolation from our data suggests that acquiring an almost complete phrase structure grammar for AP Wire text will require about 25,000 example rules. The procedure is further demonstrated to apply directly to computing superficial case analyses from English sentences.</Paragraph>
    <Paragraph position="2">  * Department of Computer Sciences, AI Lab, University of Texas, Austin TX 78712. E-mail @cs.texas.edu t Boeing Helicopter Computer Svces, Philadelphia, PA (~) 1992 Association for Computational Linguistics  Computational Linguistics Volume 18, Number 4 One of the first lessons in natural or formal language analysis is the Chomsky (1957) hierarchy of formal grammars, which classifies grammar forms from unrestricted rewrite rules, through context-sensitive, context-free, and the most restricted, regular grammars. It is usually conceded that pure, context-free grammars are not powerful enough to account for the syntactic analysis of natural languages (NL) such as English, Japanese, or Dutch, and most NL research in computational linguistics has used either augmented context-flee or ad hoc grammars. The conventional wisdom is that context-sensitive grammars probably would be too large and conceptually and computationally untractable. There is also an unspoken supposition that the use of a context-sensitive grammar implies using the kind of complex parser required for parsing a fully context~sensitive language.</Paragraph>
    <Paragraph position="3"> However, NL research based on simulated neural networks took a context-based approach. One of the first hints came from the striking finding from Sejnowski and Rosenberg's NETtalk (1988), that seven-character contexts were largely sufficient to map each character of a printed word into its corresponding phoneme---where each character actually maps in various contexts into several different phonemes. For accomplishing linguistic case analyses McClelland and Kawamoto (1986) and Miikkulainen and Dyer (1989) used the entire context of phrases and sentences to map string contexts into case structures. Robert Allen (1987) mapped nine-word sentences of English into Spanish translations, and Yu and Simmons (1990) accomplished comparable context-sensitive translations between English and German simple sentences. It was apparent that the contexts in which a word occurred provided information to a neural network that was sufficient to select correct word sense and syntactic structure for otherwise ambiguous usages of language.</Paragraph>
    <Paragraph position="4"> In order to solve a problem of accepting indefinitely long, complex sentences in a fixed-size neural network, Simmons and Yu (1990) showed a method for training a network to act as a context-sensitive grammar. A sequential program accessed that grammar with a deterministic, single-path parser and accurately parsed descriptive texts. Continuing that research, 2,000 rules were accumulated and a network was trained using a back-propagation method. The training of this network required ten days of continuous computation on a Symbolics Lisp Machine. We observed that the training cost increased by more than the square of the number of training examples and calculated that 10,000-20,000 rules might well tax a supercomputer. So we decided that storing the grammar in a hash table would form a far less expensive option, provided we could define a selection algorithm comparable to that provided by the trained neural network.</Paragraph>
    <Paragraph position="5"> In this paper we describe such a selection formula to select rules for context-sensitive parsing, a system for acquiring context-sensitive rules, and experiments in analysis and application of the grammar to ordinary newspaper text. We show that the application of context-sensitive rules by a deterministic shift/reduce parser is a conceptually and computationally tractable approach to NLP that may allow us to accumulate practical grammars for large subsets of English texts.</Paragraph>
  </Section>
class="xml-element"></Paper>