<?xml version="1.0" standalone="yes"?>
<Paper uid="P96-1002">
  <Title>A Model-Theoretic Framework for Theories of Syntax</Title>
  <Section position="4" start_page="11" end_page="11" type="metho">
    <SectionTitle>
2 $L^2_{K,P}$: The Monadic Second-Order
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="11" end_page="11" type="sub_section">
      <SectionTitle>
Language of Trees
</SectionTitle>
      <Paragraph position="0"> $L^2_{K,P}$ is the monadic second-order language over the signature including a set of individual constants ($K$), a set of monadic predicates ($P$), and binary predicates for immediate domination ($\triangleleft$), domination ($\triangleleft^*$), linear precedence ($\prec$) and equality ($\approx$). The predicates in $P$ can be understood both as picking out particular subsets of the tree and as (non-exclusive) labels or features decorating the tree. Models for the language are labeled tree domains (Gorn, 1967) with the natural interpretation of the binary predicates.</Paragraph>
      <Paragraph position="1"> In Rogers (1994) we have shown that this language is equivalent in descriptive power to SωS, the monadic second-order theory of the complete infinitely branching tree, in the sense that sets of trees are definable in SωS iff they are definable in $L^2_{K,P}$. This places it within a hierarchy of results relating language-theoretic complexity classes to the descriptive complexity of their models: the sets of strings definable in S1S are exactly the regular sets (Büchi, 1960), the sets of finite trees definable in S$n$S, for finite $n$, are the recognizable sets (roughly the sets of derivation trees of CFGs) (Doner, 1970), and, it can be shown, the sets of finite trees definable in SωS are those generated by generalized CFGs in which regular expressions may occur on the rhs of rewrite rules (Rogers, 1996b).⁵ Consequently, languages are definable in $L^2_{K,P}$ iff they are strongly context-free in the mildly generalized sense of GPSG grammars.</Paragraph>
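As an illustration of the models just described, the following is a minimal Python sketch of a labeled tree domain in the Gorn style, with the binary predicates given their natural interpretation. The address encoding (tuples of zero-based child indices), the function names, the example labels, and the choice to treat domination as reflexive are all assumptions made for this sketch; none of them is taken from the paper.

```python
from typing import Dict, Set, Tuple

Node = Tuple[int, ...]  # Gorn-style address: () is the root, (1, 0) is child 0 of child 1


def is_tree_domain(nodes: Set[Node]) -> bool:
    """Check that the address set is closed under parents and left siblings."""
    for n in nodes:
        if n == ():
            continue
        parent, last = n[:-1], n[-1]
        if parent not in nodes:
            return False
        if last > 0 and parent + (last - 1,) not in nodes:
            return False
    return () in nodes


def imm_dominates(x: Node, y: Node) -> bool:
    """Immediate domination: y is a child of x."""
    return len(y) == len(x) + 1 and y[:len(x)] == x


def dominates(x: Node, y: Node) -> bool:
    """Domination, taken here to be reflexive (an assumption of this sketch)."""
    return y[:len(x)] == x


def precedes(x: Node, y: Node) -> bool:
    """Linear precedence: x and y diverge, and x branches further to the left."""
    for a, b in zip(x, y):
        if a != b:
            return b > a
    return False  # one node dominates the other, so neither precedes


# Hypothetical example: an S node over an NP and a VP. The monadic predicates
# in P are stored as the sets of nodes they pick out.
DOMAIN: Set[Node] = {(), (0,), (1,)}
LABELS: Dict[str, Set[Node]] = {"S": {()}, "NP": {(0,)}, "VP": {(1,)}}
```

Storing each predicate as a set of nodes mirrors the reading of the predicates in $P$ as subsets of the tree, and allows a node to carry several labels at once.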
      <Paragraph position="2"> In restricting ourselves to the language of $L^2_{K,P}$ we are restricting ourselves to reasoning in terms of just the predicates of its signature. We can expand this by defining new predicates, even higher-order predicates that express, for instance, properties of or relations between sets, and in doing so we can use monadic predicates and individual constants freely since we can interpret these as existentially bound variables. But the fundamental restriction of $L^2_{K,P}$ is that all predicates other than monadic first-order predicates must be explicitly defined, that is, their definitions must resolve, via syntactic substitution, into formulae involving only the signature of $L^2_{K,P}$.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="11" end_page="13" type="metho">
    <SectionTitle>
3 Feature Specification Defaults in
GPSG
</SectionTitle>
    <Paragraph position="0"> We now turn to our first application, the definition of Feature Specification Defaults (FSDs) in GPSG.⁶ Since GPSG is presumed to license (roughly) context-free languages, we are not concerned here with establishing language-theoretic complexity but rather with clarifying the linguistic theory expressed by GPSG. FSDs specify conditions on feature values that must hold at a node in a licensed tree unless they are overridden by some other component of the grammar; in particular, unless they are incompatible with either a feature specified by the ID rule licensing the node (inherited features) or a feature required by one of the agreement principles. It is the fact that a default is overridden just in case it is incompatible with these other components that gives FSDs their dynamic flavor. Note, though, that in contrast to typical applications of default logics, a GPSG grammar is not an evolving theory.</Paragraph>
    <Paragraph position="1"> The exceptions to the defaults are fully determined when the grammar is written. If we ignore for the moment the effect of the agreement principles, the defaults are roughly the converse of the ID rules: a non-default feature occurs iff it is licensed by an ID rule.</Paragraph>
    <Paragraph position="2"> It is easy to capture ID rules in $L^2_{K,P}$. For instance, the rule:</Paragraph>
    <Paragraph position="4"> where Children($x, y_1, y_2, y_3$) holds iff the set of nodes that are children of $x$ is just the $y_i$, and VP, ⟨SUBCAT, 5⟩, etc. are all members of $P$.⁷ A sequence of nodes will satisfy ID$_5$ iff they form a local tree that, in the terminology of GKP&amp;S, is induced by the corresponding ID rule. Using such encodings we can define a predicate Free$_f(x)$ which is true at a node $x$ iff the feature $f$ is compatible with the inherited features of $x$.</Paragraph>
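The idea of encoding an ID rule as a predicate over a parent and its children can be sketched in Python as follows. This is only an illustration under stated assumptions: the displayed rule did not survive in this text, so the rule checked here (a VP parent over a SUBCAT-5 head and two NPs), the label names, and the dictionary representation of trees are hypothetical stand-ins and should not be read as the paper's own definition.

```python
from typing import Dict, Hashable, List, Sequence, Set

Tree = Dict[Hashable, List[Hashable]]   # node -> list of its children
Labels = Dict[str, Set[Hashable]]       # predicate name -> nodes it holds of


def children(tree: Tree, x: Hashable, ys: Sequence[Hashable]) -> bool:
    """Children(x, y1, ..., yn): the children of x are exactly the distinct y_i."""
    return len(set(ys)) == len(ys) and set(tree.get(x, [])) == set(ys)


def holds(labels: Labels, name: str, node: Hashable) -> bool:
    """True iff the monadic predicate called `name` holds at `node`."""
    return node in labels.get(name, set())


def id_rule_example(tree: Tree, labels: Labels,
                    x: Hashable, y1: Hashable, y2: Hashable, y3: Hashable) -> bool:
    """Hypothetical ID rule: a VP parent over a SUBCAT-5 head and two NPs."""
    return (children(tree, x, (y1, y2, y3))
            and holds(labels, "VP", x)
            and holds(labels, "H", y1)
            and holds(labels, "SUBCAT=5", y1)
            and holds(labels, "NP", y2)
            and holds(labels, "NP", y3))
```

Because the check compares the children as a set, it ignores their left-to-right order, which matches the role of ID rules as statements about immediate dominance only.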
    <Paragraph position="5"> The agreement principles require pairs of nodes occurring in certain configurations in local trees to agree on certain classes of features. Thus these principles do not introduce features into the trees, but rather propagate features from one node to another, possibly in many steps. Consequently, these principles cannot override FSDs by themselves; rather every violation of a default must be licensed by an inherited feature somewhere in the tree. In order to account for this propagation of features, the definition of FSDs in GKP&amp;S is based on identifying pairs of nodes that co-vary wrt the relevant features in all possible extensions of the given tree. As a result, although the treatment in GKP&amp;S is actually declarative, this fact is far from obvious.</Paragraph>
    <Paragraph position="6"> Again, it is not difficult to define the configurations of local trees in which nodes are required to agree by FFP, CAP, or HFC in $L^2_{K,P}$. Let the predicate Propagate$_f(x, y)$ hold for a pair of nodes $x$ and $y$ iff they are required to agree on $f$ by one of these principles (and are, thus, in the same local tree).</Paragraph>
    <Paragraph position="7"> Note that Propagate is symmetric. Following the terminology of GKP&amp;S, we can identify the set of nodes that are prohibited from taking feature f by the combination of the ID rules, FFP, CAP, and HFC as the set of nodes that are privileged wrt f.</Paragraph>
    <Paragraph position="8"> This includes all nodes that are not Free for $f$ as well as any node connected to such a node by a sequence of Propagate$_f$ links. We, in essence, define this inductively. $P'_f(X)$ is true of a set iff it includes all nodes not Free for $f$ and is closed wrt Propagate$_f$.</Paragraph>
    <Paragraph position="9"> ⁷ We will not elaborate here on the encoding of categories in $L^2_{K,P}$, nor on non-finite ID schema like the iterating co-ordination schema. These present no significant problems.</Paragraph>
    <Paragraph position="11"> There are two things to note about this definition.</Paragraph>
    <Paragraph position="12"> First, in any tree there is a unique set satisfying PrivSet$_f(X)$ and this contains exactly those nodes not Free for $f$ or connected to such a node by Propagate$_f$. Second, while this is a first-order inductive property, the definition is a second-order explicit definition. In fact, the second-order quantification of $L^2_{K,P}$ allows us to capture any monadic first-order inductively or implicitly definable property explicitly.</Paragraph>
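The contrast between the inductive reading and the explicit second-order reading can be made concrete with a small Python sketch. It assumes finite trees whose nodes are arbitrary hashable values, a set `free` of nodes that are Free for $f$, and a set `propagate` of node pairs standing in for Propagate$_f$; these names and representations are illustrative assumptions. The first function builds the set step by step, and the second takes the intersection of every set satisfying $P'_f$, a brute-force stand-in for quantifying over all sets.

```python
from itertools import combinations
from typing import Hashable, Set, Tuple

Pair = Tuple[Hashable, Hashable]


def satisfies_p_prime(candidate: Set[Hashable], nodes: Set[Hashable],
                      free: Set[Hashable], propagate: Set[Pair]) -> bool:
    """P'_f(X): X contains every node not Free for f and is closed wrt Propagate_f."""
    if any(n not in free and n not in candidate for n in nodes):
        return False
    # Propagate_f is symmetric, so closure means both ends are in or both are out.
    return all((x in candidate) == (y in candidate) for x, y in propagate)


def priv_set_inductive(nodes: Set[Hashable], free: Set[Hashable],
                       propagate: Set[Pair]) -> Set[Hashable]:
    """Start from the nodes not Free for f and follow Propagate_f links."""
    links = {}
    for x, y in propagate:
        links.setdefault(x, set()).add(y)
        links.setdefault(y, set()).add(x)
    result = {n for n in nodes if n not in free}
    frontier = list(result)
    while frontier:
        x = frontier.pop()
        for y in links.get(x, ()):
            if y not in result:
                result.add(y)
                frontier.append(y)
    return result


def priv_set_explicit(nodes: Set[Hashable], free: Set[Hashable],
                      propagate: Set[Pair]) -> Set[Hashable]:
    """Intersect all X with P'_f(X); exponential, for tiny examples only."""
    node_list = list(nodes)
    result = set(node_list)  # the full node set always satisfies P'_f
    for size in range(len(node_list) + 1):
        for subset in combinations(node_list, size):
            if satisfies_p_prime(set(subset), nodes, free, propagate):
                result = result.intersection(subset)
    return result
```

On small inputs the two functions return the same set, which is the sense in which the inductive property has an explicit definition; the brute-force version is meant only to show the logical shape of that definition, not to be a practical procedure.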
    <Paragraph position="13"> Armed with this definition, we can identify individuals that are privileged wrt $f$ simply as the members of the unique set satisfying PrivSet$_f$.</Paragraph>
    <Paragraph position="15"> One can define Privileged$_{\neg f}(x)$, which holds whenever $x$ is required to take the feature $f$, along similar lines.</Paragraph>
    <Paragraph position="16"> These, then, let us capture FSDs. For the default [-INV], for instance, we get: $(\forall x)[\neg\mathrm{Privileged}_{[-\mathrm{INV}]}(x) \rightarrow [-\mathrm{INV}](x)]$. For [BAR 0] $\supset \neg$[PAS] (which says that [BAR 0] nodes are, by default, not marked passive), we get:</Paragraph>
    <Paragraph position="18"> The key thing to note about this treatment of FSDs is its simplicity relative to the treatment of GKP&amp;S. The second-order quantification allows us to reason directly in terms of the sequence of nodes extending from the privileged node to the local tree that actually licenses the privilege. The immediate benefit is that satisfying a set of FSDs is clearly a static property of labeled trees and does not depend on the particular strategy employed in checking the tree for compliance.</Paragraph>
    <Paragraph position="19"> ⁸ We could, of course, skip the definition of PrivSet$_f$ and define Privileged$_f(x)$ as $(\forall X)[P'_f(X) \rightarrow X(x)]$, but we prefer to emphasize the inductive nature of the definition.</Paragraph>
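Once the privileged nodes are known, checking a default such as [-INV] reduces to a single universally quantified condition over the nodes of the tree. The Python sketch below illustrates this under assumptions: nodes are hashable values, `carries_default` is the set of nodes bearing the default value, and `privileged` is the set of nodes privileged wrt that value, computed by whatever means. The optional `applies_to` argument is a guess at how a conditional default might be restricted to some nodes and is not taken from the paper.

```python
from typing import Hashable, Optional, Set


def default_violations(nodes: Set[Hashable],
                       carries_default: Set[Hashable],
                       privileged: Set[Hashable],
                       applies_to: Optional[Set[Hashable]] = None) -> Set[Hashable]:
    """Nodes subject to the default that are not privileged yet lack its value."""
    scope = nodes if applies_to is None else nodes.intersection(applies_to)
    return {n for n in scope
            if n not in privileged and n not in carries_default}


def satisfies_default(nodes: Set[Hashable],
                      carries_default: Set[Hashable],
                      privileged: Set[Hashable],
                      applies_to: Optional[Set[Hashable]] = None) -> bool:
    """True iff every non-privileged node in scope carries the default value."""
    return not default_violations(nodes, carries_default, privileged, applies_to)
```

The result depends only on the three sets and not on the order in which nodes are visited, which is one way to see that satisfying a default is a static property of the labeled tree.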
  </Section>
  <Section position="6" start_page="13" end_page="14" type="metho">
    <SectionTitle>
4 Chains in GB
</SectionTitle>
    <Paragraph position="0"> The key issue in capturing GB theories within $L^2_{K,P}$ is the fact that the mechanism of free-indexation is provably non-definable. Thus definitions of principles that necessarily employ free-indexation have no direct interpretation in $L^2_{K,P}$ (hardly surprising, as we expect GB to be capable of expressing non-context-free languages). In many cases, though, references to indices can be eliminated in favor of the underlying structural relationships they express.⁹ The most prominent example is the definition of the chains formed by move-α. The fundamental problem here is identifying each trace with its antecedent without referencing their index. Accounts of the licensing of traces that, in many cases of movement, replace co-indexation with government relations have been offered by both Rizzi (1990) and Manzini (1992). The key element of these accounts, from our point of view, is that the antecedent of a trace must be the closest antecedent-governor of the appropriate type. These relationships are easy to capture in $L^2_{K,P}$. For A-movement, for instance, we have: [...] where F.Eq($x, y$) is a conjunction of biconditionals that assures that $x$ and $y$ agree on the appropriate features and the other predicates are standard GB notions that are definable in $L^2_{K,P}$. Antecedent-government, in Rizzi's and Manzini's accounts, is the key relationship between adjacent members of chains which are identified by non-referential indices, but plays no role in the definition of chains which are assigned a referential index.¹⁰ Manzini argues, however, that referential chains cannot overlap, and thus we will never need to distinguish multiple referential chains in any single context. Since we can interpret any bounded number of indices simply as distinct labels, there is no difficulty in identifying the members of referential chains in $L^2_{K,P}$. On these and similar grounds we can extend these accounts to identify adjacent members of referential chains, and, at least in the case of English, of chains of head movement and of rightward movement. This gives us five mutually exclusive relations which we can combine into a single link relation that must hold between every trace and its antecedent:</Paragraph>
    <Paragraph position="2"> Link($x, y$) $\equiv$ [...] $\lor$ Right-Link($x, y$).</Paragraph>
    <Paragraph position="3"> The idea now is to define chains as sequences of nodes that are linearly ordered by Link, but before we can do this there is still one issue to resolve. While minimality ensures that every trace must have a unique antecedent, we may yet admit a single antecedent that licenses multiple traces. To rule out this possibility, we require chains to be closed wrt the link relation, i.e., every chain must include every node that is related by Link to any node already in the chain. Our definition, then, is in essence the definition, in GB terms, of a discrete linear order with endpoints, augmented with this closure property.</Paragraph>
    <Paragraph position="4"> [...] ($X$ is closed wrt the Link relation). Note that every node will be a member of exactly one (possibly trivial) chain.</Paragraph>
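The shape of the chain definition (a linear order with endpoints that is also closed wrt Link) can be illustrated with a short Python sketch. The formal definition is not recoverable from this text, so what follows is an assumption-laden reading of the prose: Link is given as a finite set of (trace, antecedent) pairs, nodes are hashable values, and all function names are hypothetical.

```python
from typing import FrozenSet, Hashable, List, Set, Tuple

Link = Tuple[Hashable, Hashable]  # (trace, antecedent)


def link_closure(start: Hashable, links: Set[Link]) -> Set[Hashable]:
    """Smallest set containing `start` that is closed wrt Link in both directions."""
    closed = {start}
    frontier = [start]
    while frontier:
        n = frontier.pop()
        for trace, antecedent in links:
            if n in (trace, antecedent):
                for other in (trace, antecedent):
                    if other not in closed:
                        closed.add(other)
                        frontier.append(other)
    return closed


def is_chain(members: Set[Hashable], links: Set[Link]) -> bool:
    """Closed wrt Link and linearly ordered by it, with a first and a last member."""
    if not members:
        return False
    touching = [(t, a) for (t, a) in set(links) if t in members or a in members]
    if any(t not in members or a not in members for t, a in touching):
        return False  # not closed: some link leaves the set
    traces = [t for t, _ in touching]
    antecedents = [a for _, a in touching]
    if len(set(traces)) != len(traces):
        return False  # some trace has two antecedents
    if len(set(antecedents)) != len(antecedents):
        return False  # some antecedent licenses two traces
    connected = link_closure(next(iter(members)), links) == set(members)
    return connected and len(touching) == len(members) - 1


def chains(nodes: Set[Hashable], links: Set[Link]) -> List[FrozenSet[Hashable]]:
    """Partition the nodes into Link-closed sets, one candidate chain per node."""
    seen: Set[Hashable] = set()
    result = []
    for n in nodes:
        if n not in seen:
            component = link_closure(n, links)
            seen |= component
            result.append(frozenset(component))
    return result
```

A node with no links forms a trivial chain on its own, and a set in which one antecedent licenses two traces is rejected because the closure requirement forces both traces into the set, where they cannot be linearly ordered.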
    <Paragraph position="5"> The requirement that chains be closed wrt Link means that chains cannot overlap unless they are of distinct types. This definition works for English because it is possible, in English, to resolve chains into boundedly many types in such a way that no two chains of the same type ever overlap. In fact, it fails only in cases, like head-raising in Dutch, where there are potentially unboundedly many chains that may overlap a single point in the tree. Thus, this gives us a property separating GB theories of movement that license strongly context-free languages from those that potentially don't: if we can establish a fixed bound on the number of chains that can overlap, then the definition we sketch here will suffice to capture the theory in $L^2_{K,P}$ and, consequently, the theory licenses only strongly context-free languages. This is a reasonably natural diagnostic for context-freeness in GB and is close to common intuitions of what is difficult about head-raising constructions; it gives those intuitions theoretical substance and provides a reasonably clear strategy for establishing context-freeness.</Paragraph>
  </Section>
  <Section position="7" start_page="14" end_page="14" type="metho">
    <SectionTitle>
5 A Comparison and a Contrast
</SectionTitle>
    <Paragraph position="0"> Having interpretations both of GPSG and of a GB account of English in $L^2_{K,P}$ provides a certain amount of insight into the distinctions between these approaches. For example, while the explanations of filler-gap relationships in GB and GPSG are quite dramatically dissimilar, when one focuses on the structures these accounts license one finds some surprising parallels. In the light of our interpretation of antecedent-government, one can understand the role of minimality in Rizzi's and Manzini's accounts as eliminating ambiguity from the sequence of relations connecting the gap with its filler. In GPSG this connection is made by the sequence of agreement relationships dictated by the Foot Feature Principle. So while both theories accomplish agreement between filler and gap through marking a sequence of elements falling between them, the GB account marks as few as possible while the GPSG account marks every node of the spine of the tree spanning them.</Paragraph>
    <Paragraph position="1"> In both cases, the complexity of the set of licensed structures can be limited to be strongly context-free iff the number of relationships that must be distinguished in a given context can be bounded.</Paragraph>
    <Paragraph position="2"> One finds a strong contrast, on the other hand, in the way in which GB and GPSG encode language universals. In GB it is presumed that all principles are universal with the theory being specialized to specific languages by a small set of finitely varying parameters. These principles are simply properties of trees. In terms of models, one can understand GB to define a universal language: the set of all analyses that can occur in human languages. The principles then distinguish particular sub-languages, the head-final or the pro-drop languages, for instance. Each realized human language is just the intersection of the languages selected by the settings of its parameters. In GPSG, in contrast, many universals are, in essence, closure properties that must be exhibited by human languages: if the language includes trees in which a particular configuration occurs then it includes variants of those trees in which certain related configurations occur. Both the ECPO principle and the metarules can be understood in this way. Thus while universals in GB are properties of trees, in GPSG they tend to be properties of sets of trees. This makes a significant difference in capturing these theories model-theoretically; in the GB case one is defining sets of models, in the GPSG case one is defining sets of sets of models. It is not at all clear what the linguistic significance of this distinction is; one particularly interesting question is whether it has empirical consequences. It is only from the model-theoretic perspective that the question even arises.</Paragraph>
  </Section>
  <Section position="8" start_page="14" end_page="14" type="metho">
    <SectionTitle>
6 Conclusion
</SectionTitle>
    <Paragraph position="0"> We have illustrated a general formal framework for expressing theories of syntax based on axiomatizing classes of models in $L^2_{K,P}$. This approach has a number of strengths. First, as should be clear from our brief explorations of aspects of GPSG and GB, re-formalizations of existing theories within $L^2_{K,P}$ can offer a clarifying perspective on those theories, and, in particular, on the consequences of individual components of those theories. Secondly, the framework is purely declarative and focuses on those aspects of language that are more or less directly observable: their structural properties. It allows us to reason about the consequences of a theory without hypothesizing a specific mechanism implementing it. The abstract properties of the mechanisms that might implement those theories, however, are not beyond our reach. The key virtue of descriptive complexity results like the characterizations of language-theoretic complexity classes discussed here and the more typical characterizations of computational complexity classes (Gurevich, 1988; Immerman, 1989) is that they allow us to determine the complexity of checking properties independently of how that checking is implemented. Thus we can use such descriptive complexity results to draw conclusions about those abstract properties of such mechanisms that are actually inferable from their observable behavior. Finally, by providing a uniform representation for a variety of linguistic theories, it offers a framework for comparing their consequences. Ultimately it has the potential to reduce distinctions between the mechanisms underlying those theories to distinctions between the properties of the sets of structures they license. In this way one might hope to illuminate the empirical consequences of these distinctions, should any, in fact, exist.</Paragraph>
  </Section>
</Paper>