<?xml version="1.0" standalone="yes"?>
<Paper uid="P96-1002">
  <Title>A Model-Theoretic Framework for Theories of Syntax</Title>
  <Section position="3" start_page="0" end_page="11" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Generative grammar and formal language theory share a common origin in a procedural notion of grammars: the grammar formalism provides a general mechanism for recognizing or generating languages while the grammar itself specializes that mechanism for a specific language. At least initially there was hope that this relationship would be informative for linguistics, that by characterizing the natural languages in terms of language-theoretic complexity one would gain insight into the structural regularities of those languages. Moreover, the fact that language-theoretic complexity classes have dual automata-theoretic characterizations offered the prospect that such results might provide abstract models of the human language faculty, thereby not just identifying these regularities, but actually accounting for them.</Paragraph>
    <Paragraph position="1"> Over time, the two disciplines have gradually become estranged, principally due to a realization that the structural properties of languages that characterize natural languages may well not be those that can be distinguished by existing language-theoretic complexity classes. Thus the insights offered by formal language theory might actually be misleading in guiding theories of syntax. As a result, the emphasis in generative grammar has turned from formalisms with restricted generative capacity to those that support more natural expression of the observed regularities of languages. While a variety of distinct approaches have developed, most of them can be characterized as constrain~ based--the formalism (or formal framework) provides a class of structures and a means of precisely stating constraints on their form, the linguistic theory is then expressed as a system of constraints (or principles) that characterize the class of well-formed analyses of the strings in the language. 1 As the study of the formal properties of classes of structures defined in such a way falls within domain of Model Theory, it's not surprising that treatments of the meaning of these systems of constraints are typically couched in terms of formal logic (Kasper and Rounds, 1986; Moshier and Rounds, 1987; Kasper and Rounds, 1990; Gazdar et al., 1988; Johnson, 1988; Smolka, 1989; Dawar and Vijay-Shanker, 1990; Carpenter, 1992; Keller, 1993; Rogers and Vijay-Shanker, 1994).</Paragraph>
    <Paragraph position="2"> While this provides a model-theoretic interpretation of the systems of constraints produced by these formalisms, those systems are typically built by derivational processes that employ extra-logical mechanisms to combine constraints.</Paragraph>
    <Paragraph position="3"> More recently, it has become clear that in many cases these mechanisms can be replaced with ordinary logical operations. (See, for instance:  Johnson (1989), Stabler, Jr. (1992), Cornell (1992), Blackburn, Gardent, and Meyer-Viol (1993), Blackburn and Meyer-Viol (1994), Keller (1993), Rogers (1994), Kracht (1995), and, anticipating all of these, Johnson and Postal (1980).) This approach abandons the notions of grammar mechanism and derivation in favor of defining languages as classes of more or less ordinary mathematical structures axiomatized by sets of more or less ordinary logical formulae. A grammatical theory expressed within such a framework is just the set of logical consequences of those axioms. This step completes the detachment of generative grammar from its procedural roots. Grammars, in this approach, are purely declarative definitions of a class of structures, completely independent of mechanisms to generate or check them. While it is unlikely that every theory of syntax with an explicit derivational component can be captured in this way, ~ for those that can the logical re-interpretation frequently offers a simplified statement of the theory and clarifies its consequences. null But the accompanying loss of language-theoretic complexity results is unfortunate. While such results may not be useful in guiding syntactic theory, they are not irrelevant. The nature of language-theoretic complexity hierarchies is to classify languages on the basis of their structural properties. The languages in a class, for instance, will typically exhibit certain closure properties (e.g., pumping lemmas) and the classes themselves admit normal forms (e.g., representation theorems). 
While the linguistic significance of individual results of this sort is open to debate, they at least loosely parallel typical linguistic concerns: closure properties state regularities that are exhibited by the languages in a class, normal forms express generalizations about their structure.</Paragraph>
    <Paragraph position="4"> So while these may not be the right results, they are not entirely the wrong kind of results. Moreover, since these classifications are based on structural properties and the structural properties of natural language can be studied more or less directly, there is a reasonable expectation of finding empirical evidence falsifying a hypothesis about language-theoretic complexity of natural languages if such evidence exists.</Paragraph>
    <Paragraph position="5"> Finally, the fact that these complexity classes have automata-theoretic characterizations means that results concerning the complexity of natural languages will have implications for the nature of the human language faculty. These automata-theoretic characterizations determine, along one axis, the types of resources required to generate or recognize the lan2Whether there are theories that cannot be captured, at least without explicitly encoding the derivations, is an open question of considerable theoretical interest, as is the question of what empirical consequences such an essential dynamic character might have.</Paragraph>
    <Paragraph position="6">  guages in a class. The regular languages, for instance, can be characterized by finite-state (string) automata--these languages can be processed using a fixed amount of memory. The context-sensitive languages, on the other had, can be characterized by linear-bounded automata--they can be processed using an amount of memory proportional to the length of the input. The context-free languages are probably best characterized by finite-state tree automata--these correspond to recognition by a collection of processes, each with a fixed amount of memory, where the number of processes is linear in the length of the input and all communication between processes is completed at the time they are spawned. As a result, while these results do not necessarily offer abstract models of the human language faculty (since the complexity results do not claim to characterize the human languages, just to classify them), they do offer lower bounds on certain abstract properties of that faculty. In this way, generative grammar in concert with formal language theory offers insight into a deep aspect of human cognition--syntactic processing--on the basis of observable behavior--the structural properties of human languages.</Paragraph>
    <Paragraph position="7"> In this paper we discuss an approach to defining theories of syntax based on L 2 (Rogers, 1994), a K,P monadic second-order language that has well-defined generative capacity: sets of finite trees are definable within L 2 iff they are strongly context-free K,P in a particular sense. While originally introduced as a means of establishing language-theoretic complexity results for constraint-based theories, this language has much to recommend it as a general framework for theories of syntax in its own right. Being a monadic second-order language it can capture the (pure) modal languages of much of the existing model-theoretic syntax literature directly; having a signature based on the traditional linguistic relations of domination, immediate domination, linear precedence, etc. it can express most linguistic principles transparently; and having a clear characterization in terms of generative capacity, it serves to re-establish the close connection between generative grammar and formal language theory that was lost in the move away from phrase-structure grammars. Thus, with this framework we get both the advantages of the model-theoretic approach with respect to naturalness and clarity in expressing linguistic principles and the advantages of the grammar-based approach with respect to language-theoretic complexity results.</Paragraph>
    <Paragraph position="8"> We look, in particular, at the definitions of a single aspect of each of GPSG and GB. The first of these, Feature Specification Defaults in GPSG, are widely assumed to have an inherently dynamic character.</Paragraph>
    <Paragraph position="9"> In addition to being purely declarative, our reformalization is considerably simplified wrt the definition in Gasdar et al. (1985), 3 and does not share its misleading dynamic flavor. 4 We offer this as an example of how re-interpretations of this sort can inform the original theory. In the second example we sketch a definition of chains in GB. This, again, captures a presumably dynamic aspect of the original theory in a static way. Here, though, the main significance of the definition is that it forms a component of a full-scale treatment of a GB theory of English S- and D-Structure within L 2 This full definition estab- K,P&amp;quot; lishes that the theory we capture licenses a strongly context-free language. More importantly, by examining the limitations of this definition of chains, and in particular the way it fails for examples of non-context-free constructions, we develop a characterization of the context-free languages that is quite natural in the realm of GB. This suggests that the apparent mismatch between formal language theory and natural languages may well have more to do with the unnaturalness of the traditional diagnostics than a lack of relevance of the underlying structural properties. null Finally, while GB and GPSG are fundamentally distinct, even antagonistic, approaches to syntax, their translation into the model-theoretic terms of L 2 allows us to explore the similarities between K,P the theories they express as well as to delineate actual distinctions between them. We look briefly at two of these issues.</Paragraph>
    <Paragraph position="10"> Together these examples are chosen to illustrate the main strengths of the model-theoretic approach, at least as embodied in L2K,p, as a framework for studying theories of syntax: a focus on structural properties themselves, rather than on mechanisms for specifying them or for generating or checking structures that exhibit them, and a language that is expressive enough to state most linguistically significant properties in a natural way, but which is restricted enough to have well-defined strong generative capacity.</Paragraph>
  </Section>
class="xml-element"></Paper>