CATEGORY STRUCTURES 
Gerald Gazdar 
Cognitive Studies Programme, University of Sussex, Brighton BN1 9QN, U.K. 
Geoffrey K. Pullum 
Cowell College, University of California, Santa Cruz, Santa Cruz, California 95064, USA 
Robert Carpenter, Ewan Klein 
Centre for Cognitive Science, University of Edinburgh, Edinburgh EH8 9LW, U.K. 
Thomas E. Hukari 
Department of Linguistics, University of Victoria, Victoria, B.C., Canada V8W 2Y2 
Robert D. Levine 
Department of Linguistics, University of British Columbia, Vancouver, B.C., Canada V6T lW5 
This paper outlines a simple and general notion of syntactic category on a metatheoretical level, 
independent of the notations and substantive claims of any particular grammatical framework. We 
define a class of formal objects called "category structures" where each such object provides a 
constructive definition for a space of syntactic categories. A unification operation and subsumption and 
identity relations are defined for arbitrary syntactic categories. In addition, a formal language for the 
statement of constraints on categories is provided. By combining a category structure with a set of 
constraints, we show that one can define the category systems of several well-known grammatical 
frameworks: phrase structure grammar, tagmemics, augmented phrase structure grammar, relational 
grammar, transformational grammar, generalized phrase structure grammar, systemic grammar, 
categorial grammar, and indexed grammar. The problem of checking a category for conformity to 
constraints is shown to be solvable in linear time. This work provides in effect a unitary class of data 
structures for the representation of syntactic categories in a range of diverse grammatical frameworks. 
Using such data structures should make it possible for various pseudo-issues in natural language 
processing research to be avoided. We conclude by examining the questions posed by set-valued features 
and sharing of values between distinct feature specifications, both of which fall outside the scope of the 
formal system developed in this paper. 
The notion syntactic category is a central one in most 
grammatical frameworks. As Karttunen and Zwicky 
(1985) observe, traditional "parsing" as taught for lan- 
guages like Latin involved little more than supplying a 
detailed description of the grammatical category of each 
word in the sentence to be parsed. Phrase structure 
grammars are entirely concerned with assigning termi- 
nal strings to categories and determining dominance and 
precedence between constituents on the basis of their 
categories. In a classical transformational grammar 
(TG), the objects transformations manipulate are pri- 
marily strings of syntactic categories (and, to a lesser 
extent, of terminal symbols). This is just as true of 
recent TG work. 
Although the use of syntactic categories is not a 
logical prerequisite of generative grammar (see Levy 
and Joshi (1978)), no linguistic approach known to us 
dispenses with them altogether. In view of this, it is 
perhaps surprising that linguists have not attempted to 
explicate the concept "syntactic category" in any gen- 
Copyright 1988 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided 
that the copies are not made for direct commercial advantage and the CL reference and this copyright notice are included on the first page. To 
copy otherwise, or to republish, requires a fee and/or specific permission. 
0362-613X/88/010001-19503.00 
Computational Linguistics, Volume 14, Number I, Winter 1988 1 
Gerald Gazdar et al. Category Structures 
eral way, i.e., independently of particular systems of 
notation and the associated substantive assumptions 
about grammar. 
In this paper we offer an explicit metatheoretical 
framework in which a notion of "syntactic category" 
receives a precise definition. The framework is intended 
to facilitate analysis and comparison of the underlying 
concepts of different theories, freed from the notational 
and sociological baggage that sometimes encumbers the 
original presentations in the literature. Viewed from the 
standpoint of implementation, it can be regarded as 
providing a unitary data structure for categories that can 
be used in the implementation of a number of superfi- 
cially different grammatical frameworks. 
We begin by defining in section 1 a space of catego- 
ries broad enough to encompass the objects employed 
as syntactic categories in a range of diverse types of 
generative grammar. Then, in section 2, we present the 
syntax and semantics for L c, a formal language for 
defining constraints on categories. In the succeeding 
section we provide illustrative definitions of the gram- 
matical categories used in a number of frameworks. We 
cover simple phrase structure grammar in section 3.1; 
tagmemics in section 3.2; Harman's (1963) augmented 
phrase structure grammar in section 3.3; relational 
grammar and arc pair grammar in section 3.4; X syntax, 
TG, and the government-binding (GB) framework in 
section 3.5; generalized phrase structure grammar 
(GPSG) in section 3.6; systemic grammar in section 3.7; 
categorial grammar in section 3.8; and Aho's (1968) 
indexed grammar in section 3.9. We then go on to 
consider some relevant computational complexity mat- 
ters (section 4). Finally, we discuss two issues that do 
not arise in any of these approaches, and which fall 
outside the scope of the simple theory that we present, 
namely the use of sets as values of features (section 5) 
and values shared between distinct feature specifica- 
tions (section 6). These issues are important in the 
context of the category systems employed in functional 
unification grammar (FUG), lexical functional grammar 
(LFG), and the PATR II grammar formalism. 
Our goal in this paper is not an empirical one, but 
rather one which is analogous to that of Montague's 
"Universal Grammar" (1970) (see Halvorsen and La- 
dusaw (1977) for a useful introduction) which attempts 
to give a general definition of the notion "possible 
language" in terms applicable to, but not limited to, the 
study of human languages. We have the much more 
modest goal of characterizing one rather simple and 
general notion of "possible syntactic category", and of 
exploring the range of linguistic approaches that it will 
generalize to, its formal properties, and its limitations. 
As will become evident below, our exercise is comple- 
mentary in certain respects to that of Pereira and 
Shieber (1984) and Shieber (1987) and to recent work of 
Rounds and his associates on the development of a logic 
for the description of the notions of syntactic category 
that are embodied in functional unification grammar and 
PATR II (see Kasper and Rounds (1986), Moshier and 
Rounds (1987), Rounds and Kasper (1986)). 
We do not concern ourselves with the appearance or 
representational details of a given theory of categories 
(or any of the other aspects of the linguistic framework 
in question, e.g., its rule system), but only with its 
underlying semantics--the issue of what set-theoretic 
(or other nonlinguistic) objects provide categories with 
their interpretation. We are content with being able to 
exhibit an isomorphism between one of the theories of 
categories permitted by our framework and the concrete 
example we are considering; we need not demonstrate 
identity. Hence we have deliberately refrained from 
specifying a formal language for representing categories 
and features. To the extent that we need to produce 
exemplificatory features or categories for inspection, 
we may use the conventional notation of the approach 
in question, or the ordinary notations of set theory, or 
an informal labeled graph notation introduced below, 
but we do not offer a representational formalism for 
categories that has a significance of its own. 
In the framework we provide, it is possible to define 
the category systems of a wide variety of apparently 
very different approaches to natural language syntax 
simply by defining two primitive typing functions, and 
by varying the constraints stated on the categories that 
they induce. The exercise of expressing the content of 
various specific linguistic approaches in such terms 
immediately calls attention to certain interesting formal 
issues. For example, we reconstruct below the notion of 
a list-valued (or stack-valued) feature in terms of cate- 
gory-valued features, which automatically allows oper- 
ations defined on categories such as unification to apply 
to lists without special redefinition. 
An interesting fact that emerges from the view taken 
here is that on the matter of syntactic categories, there 
is somewhat more commonality among the diverse 
approaches currently being pursued than there appears 
to be when those approaches are viewed in the formal- 
isms used by their practitioners. The various syntactic 
frameworks that we examine below can be seen to share 
a great deal of their underlying substantive claims about 
the information content of the category label of a 
constituent. Our explication of these underlying com- 
monalities may make somewhat easier the task of the 
computational linguist attempting to implement a sys- 
tem on the basis of some grammatical framework, or 
attempting to decide which approach to implement in 
the first place. 
In order to prepare for some of the definitions that 
follow, we will briefly and informally sketch some of 
our assumptions about features and categories and the 
terminology we shall use for talking about them. A 
category is a set of feature specifieations meeting certain 
conditions to be defined below. A feature specification 
is an attribute-value pair (f, v) where the attributef(the 
feature) is atomic (i.e., given by some finite list, and 
regarded as unanalyzable) and the value v is either 
2 Computational Linguistics, Volume 14, Number 1, Winter 1988 
Gerald Gazdar et al. Category Structures 
atomic or complex. Here we shall assume just one type 
of complex value, namely a category (but see below in 
section 5). 
An example of an atom-valued feature specification 
would be (SINGULAR,+) (which many grammarians 
would write as \[+SINGULAR\]); intuitively, it might mark 
singular number (though, of course, the interpretation it 
actually has depends on the role it plays in the gram- 
mar). An example of a complex feature specification, 
with a category as the value, would be (AGREEMENT, 
{(SINGULAR, +), (GENDER, FEM), (PERSON, 3)}); intuitively, it 
might be used to convey that the value of the AGREEMENT 
feature is a category representing the combination of 
singular number, feminine gender, and third person. In 
the following sections, we will always use SMALL CAPI- 
TALS for feature names, and we will generally replace 
"-" and "+", which are standard usage in the linguis- 
tic literature for the atomic values of a binary feature, 
by 0 and 1 respectively. 
As we have said, a category is a set of feature 
specifications meeting certain conditions. We will now 
specify these. We do not require that every feature 
name be represented in each category, but we do 
require that each occurrence of a feature be paired with 
exactly one value in any set of specifications; thus 
{(SINGULAR,+>, (SINGULAR,-->} could not be a category. 
Hence a category can be modeled as a partial function 
C:F--> V, where F is a set of features and V is the set of 
values. An equivalent alternative would be to treat 
categories as total functions into a range that includes 
an element ± that can stand as the value where the 
corresponding partial functions would fail to assign a 
value. Note that we use the term 'range' here, and 
subsequently, to refer to a set that includes all the 
values that a partial function or family of partial func- 
tions might take given appropriate domain elements, 
rather than just the set of values that it does take when 
we fix a particular domain for a function. 
It may be helpful to think of a category as having the 
structure of an unordered tree, and we will introduce a 
type of diagram below which exhibits this structure 
overtly. Often, however, the idea of categories as 
partial functions will be crucial, so it should be kept in 
mind throughout. 
Since the set V of values may include categories, the 
definition of the entire set of categories has to be given 
recursively. Moreover, it has to allow for the possibility 
that not all values are compatible with all features. 
Thus, for example, in a given feature system, (GENDER, 
0) and (PERSON, plural) might be coherent objects but 
mnemonically perverse, whereas in another feature 
system, they might simply be ill-formed. We shall show 
how these issues can be resolved in the coming sec- 
tions. We will not, however, give a constructive defini- 
tion of the set of categories for each grammatical 
framework we consider. Instead, given our comparative 
and metatheoretical goals, it turns out to be more 
convenient to define a category system as a pair (~, C> 
where ~ is a category structure, which defines a set of 
potential categories (see section 8), and C is a set of 
constraints expressed in L c, a language for which the 
category structure defines the models (see section 9). 
The actual categories in the system are then to be 
construed as that subset of the potential categories 
defined in ~, each member of which satisfies every 
constraint listed in C. 
1 DEFINING CATEGORY STRUCTURES 
In this section we define the notion of a category 
structure, which is basically a choice of primitives: a list 
of features, and a range of possible values for each. 
Here and throughout the paper we will frequently use 
"2" to denote the set {0, l} (the context will make it 
clear when "2" represents an integer and when it 
represents a set). We will write A B for the set of total 
functions from B into A,A (m for the set of partial 
functions from B into A, @(A) for the power set of A,IAI 
for the cardinality of A, and Aft) for the domain of a 
(partial) function f (iff is a partial function than A(f) is 
the set of items to which f assigns a value). 
A category structure E is a quadruple (F, A, r, p) 
where F is a finite set of features, A is a finite set of 
atoms, r is a function in 2 F, and p is a function from 
{flW.D = 0} into ~(A). The function r partitions F into 
two sets: the set of type 0 features F ° = {fir(f) = 0}, and 
the set of type 1 features F l = {fir(f) = 1}. We will write 
r as r 0 when F = F °. Type 0 features take atomic values 
and type 1 features take categories as values. The 
function p assigns a range of atomic values to each 
feature of type 0. 
The set of categories K is recursively defined in 
terms of (F, A, r, p), in a way very similar to that used 
in Pollard (1984, p. 299ff), though Pollard's assumptions 
differ on some important details. A relatively informal 
presentation will suffice here. We will refer to the set of 
partial functions from F ° into A that are consistent with 
p as the type 0 categories. We first define the set of pure 
type 0 categories of ~ as those containing only type 0 
feature specifications. Then we build up K via a series 
of approximations we will refer to as levels, finally 
taking the infinite union of all the levels to obtain K 
itself: 
(1) a. O is a category at level 0 
b. If a is a type 0 category and fl is a category 
containing only type 1 features whose values are 
categories at level n, then a U fl is a category at 
level n + 1. 
c. K is the set of all categories at all levels n -> 0. 
Given the way K is built up, the induction step in (Ib) 
being restricted to union of finite partial functions, it 
should be clear that K is a recursive set. 
We can define certain relations and operations on the 
space K of possible categories. Thus, we can give a 
Computational Linguistics, Volume 14, Number I, Winter 1988 
Gerald Gazdar et aL Category Structures 
constructive definition for unification (symbolized U) as 
a binary operation on categories. 
(2) Definition: unification 
(i) if (f, v) E a but/300 is undefined, then (f, v) 
E aU/3; 
(ii) if (f, v) E/3 but a(f) is undefined, then (f, v) 
E ~U/3; 
(iii) if (f, v;) e a and (f, b) e /3 and ~-(\]) = 1, then 
if vi U b is undefined, a U/3 is also undefined, 
else (f, v i U vj) @ a U/3; 
(iv) if (f, vi) E a and (f, 5) E /3 and T(j') = 0, then 
if vi = vj, (f, vi) E /3 U/3, else a U/3 is 
undefined. 
(v) nothing else is in a U/3. 
We can then use unification to define the subsumes 
relation between categories (where 'subsumes' means 
'is more general/underspecified than', or 'is extended 
by'). We symbolize 'subsumes' with 'E_', and define it as 
follows. 
(3) Definition: subsumption 
o~ subsumes/3 (a E_/3) if and only if/3 = a U/3. 
Thus a subsumes/3 if and only if/3 is the unification 
of a and/3. When a subsumes/3 then we may refer to/3 
as an extension of a. If a U/3 is undefined, then/3 = a 
U/3 fails, and a does not subsume/3. From this it follows 
that, if a and/3 are categories, then a =/3 if and only if 
a E_/3 and/3 E_ a. The following theorem is provable by 
induction on category levels. 
(4) Theorem: 
a subsumes/3 if and only if 
(i) Vf E (A(a) fq F °) \[a(f) = /3(f)\] and 
(ii) Vf E (A(a) tq F 1) \[a(f) F"/3(f)\]. 
2 THE CONSTRAINT LANGUAGE L c 
We now provide an interpreted formal language, L C, for 
expressing specific constraints on categories. Con- 
straints are statements that can be true or false of a 
category. By requiring satisfaction of the constraint, a 
constraint can be used to delimit a subspace within the 
set K induced by a given category structure E, to serve 
as the grammatical categories for a particular type of 
grammar. 
It should be noted that our goals in formulating L c 
are slightly different from those of Rounds and his 
associates: Lc is a language for formulating constraints 
on well-formed categories, not a language whose ex- 
pressions are intended for use in place of categories. To 
put it rather crudely, our language is for category 
definition whereas Rounds' is (in part) for category 
manipulation. However, the languages look rather sim- 
ilar syntactically, and where they overlap, the seman- 
tics is essentially the same. 
We define two types of constraint: basic and complex. 
If f is an element of F, and a is an element of A, then 
there are just two distinct types of well-formed basic 
constraint: 
(5) a. f 
b. 32a (where ~-(f) = 0) 
Informally, (5a) constrains a category to contain some 
specification for the feature f; thus, the constraint 
"BAR" says that every syntactic category satisfying it 
has as one of its elements a pair (BAR, n). This does not 
entail that every value of every category-valued feature 
contained in the category must contain BAR; a basic 
constraint applies to the "top level" of the tree-like 
structure of a category. Likewise, (5b) says of a cate- 
gory satisfying it that it has as one of its elements the 
pair (f, a). Note that the only thing a basic constraint 
can require of a type 0 feature beyond saying that it 
must be present (defined) is that it have a particular 
atomic value, and that a basic constraint cannot require 
anything of a type 1 feature at all beyond demanding its 
presence. 
Turning to complex constraints, we now continue the 
list (5), giving the syntax for each type of complex 
constraint together with an informal indication of its 
semantics. Assume thatfis an element of F 1, and 05 and 
are themselves well-formed basic or complex con- 
straints, and that we are considering the interpretation 
of the constraints with respect to some fixed category 
structure Y and some category a. 
(5) c. 32 05 'f is defined in a and its value satisfies 05' 
d. -7 05 'a does not satisfy 05' 
e. 05 V ~O 'a satisfies either 05 or ~O' 
f. 05 A @ 'a satisfies both 05 and ~0' 
g. 05 --> ~0 'either a does not satisfy 05 or a does 
satisfy ~0' 
h. 05 ~ ~0 'a satisfies either both or neither of 05 and 
i. D05 'a satisfies 05, and all values of type 1 
features in a satisfy 1~05' 
j. © 05 'either a satisfies 05 or some value of a type 
1 feature in a satisfies O 05' 
Constraints of the forms (5a) through (5h) are fairly 
straightforward, but constraints like those shown in (5i) 
and (5j) need a little more discussion. They introduce 
modality into our language. Their purpose is to allow for 
recursive constraints to be imposed on successively 
embedded layers of category values. As indicated, a 
category a satisfies r-\]05 provided that, firstly, a satisfies 
05 and secondly, whenever a assigns a category/3 to a 
type 1 feature f, /3 satisfies D05. This may appear to 
introduce a circularity, but it does not: categories are 
finite, and within any category there will be a level so 
deeply embedded in the tree structure that there are no 
more category values within it; at that point \[~05 is true 
if 05 is, thus ending the recursion. 
Our choice of notation in (5i) is quite deliberate: in 
effect, constraints of the form (50 express universal 
quantification over embedded accessible' categories in 
4 Computational Linguistics, Volume 14, Number 1, Winter 1988 
Gerald Gazdar et al. Category Structures 
the way that the familiar necessity operator \[\] of modal 
logic enforces universal quantification over accessible 
worlds in the standard semantics. The possibility oper- 
ator in (5j) is, as usual, the dual of the necessity 
operator: O 4, says of a category a satisfying it that 
either a satisfies tO, or there exists a category-value/3 
assigned to a type 1 featurefby a such that/3 satisfies 
<>tO. 
As a simple example of the sort of work a complex 
constraint in Lc might do in a grammatical theory, 
consider the constraint that is known as the "Case 
filter" in recent TG (see Chomsky 1980, p. 25). Stated 
informally as "*N, where N has no Case", the con- 
straint appears to require every occurrence of the 
feature complex characterizing the category N, i.e., 
every occurrence of \[+N, -V\], to co-occur with a 
feature called "Case". The constraint can be stated in 
Lc as (6). 
(6) \[\](((N: I) A (v: 0)) ~ CASE) 
Here and from now on, we use parentheses in the 
obvious way wherever it is necessary prevent ambiguity 
in the statement of constraints. 
The account of L c given thus far will suffice for a 
reading of this paper, but those readers who would like 
to see the semantics given more formally may turn to 
the appendix. 
To recapitulate, a theory of categories ® in our sense 
is a pair (E, C), where I£ is a category structure and C is 
a set of sentences of Lo The set of categories deter- 
mined by ® is the maximal subset Kc of K determined 
by E such that each member of K c satisfies every 
member of C. 
3 ILLUSTRATIVE APPLICATIONS 
We will now illustrate the application of the apparatus 
developed thus far by reconstructing the category sys- 
tems used in a number of well-known grammatical 
frameworks that linguists have developed, most of them 
frameworks that have been used in natural language 
processing systems at one time or another. 
3.1 SIMPLE PHRASE STRUCTURE GRAMMAR 
The case of simple phrase structure grammar is trivial, 
but will serve as an introduction to the form of later 
sections, and as a straightforward example of the use of 
a type 0 feature. 
The set of categories used in a simple phrase struc- 
ture grammar is just some finite set of atomic categories 
{al ..... a,}, for example, {S, NP, VP, Det, N, V}. So 
we fix values for F, A, z, and p as in (7): 
(7) a. F = {LABEL} 
b. A = {a 1 ..... a,} 
c. ~'o 
d. p = {<LABEL, A)} 
Thus, for example, we might have A = {S, NP, VP, Det, 
N, V}, and thus have /~LABEL) as the same set. In 
addition, we need the following constraint, to make sure 
that every category does indeed have a specification for 
the solitary type 0 feature LABEL, i.e., to exclude the 
empty set from counting as a category: 
(8) LABEL 
Obviously, we can now show that the category inven- 
tory for any simple phrase structure grammar is repre- 
sentable. We let 0 be the bijection defined by 0(a/) = 
{(LABEL, ai)}, and the result is immediate. Thus there is a 
bijection from the set of simple phrase structure gram- 
mar categories to the categories admitted by the cate- 
gory structure (7) under the constraint (8). As is evident, 
the set of categories induced is finite, and of cardinality 
n = IAI. 
3.2 TAGMEMICS 
It may be that there are more published syntactic 
analyses of languages in the framework of tagmemics 
than in any other theoretical framework ever devel- 
oped. Since the early 1960s, those who have followed 
the work of Kenneth Pike, including a very large 
number of field linguists working for the Summer Insti- 
tute of Linguistics, have produced analyses of hundreds 
of languages, mostly non-Indo-European. Moreover, 
Postal (1964, p. 33) remarks that "these languages are, 
for the most part, exotic enough so that the tagmemic 
descriptions of them may very well be the only ones 
done." 
Tagmemics describes syntactic structure in terms of 
TAGMEMES, which are notated in the form A:b, where A is 
said to represent a SLOT and b a FILLER. For example, 
Elson and Pickett (1962) represent (part of) the structure 
of English prepositional phrases and intransitive clauses 
with tagmemic formulm (i.e., rules) similar to the fol- 
lowing (we simplify very slightly): 
(9) a. LraPhr-- +R:prep +A:mNc 
b. mNc -- +Lim:ar --M:aj +H:nc 
c. iCl = +S:mNc +iP:v 3 
The informal explication of these is: (9a) one type of 
location relater-axis phrase consists of an obligatory 
relater slot filled by a preposition followed by an 
obligatory axis slot filled by a modified count noun 
phrase; (9b) one type of modified count noun phrase 
consists of an obligatory limiter slot filled by an article 
followed by an optional modifier slot filled by an 
adjective followed by an obligatory head slot filled by a 
count noun; (9c) one type of intransitive clause consists 
of an obligatory subject slot filled by a modified com- 
mon noun phrase followed by an obligatory intransitive 
predicate slot filled by a verb of class 3. Thus the left 
hand side of a formula (before the equality sign) consists 
of an atomic label, and the right hand side is a string of 
tagmemes, which are ordered triples (a, b, c) where a is 
an indication of optional (-+) or obligatory (+) status, b 
is a slot or function name, and c is a filler or category 
label. 
Computational Linguistics, Volume 14, Number 1, Winter 1988 5 
Gerald Gazdar et al. Category Structures 
One way of representing tagmemes in our terms is to 
employ a type 0 feature bearing the slot name, taking as 
value an atomic label identifying the filler. Thus we set 
up correspondences like the following: 
(10) a. R:prep {(R, prep)} 
b. A:mNc {(A, mNc)} 
c. Lim:ar {(him, ar)} 
d. M:aj {(M, aj)} 
e. H:nc {(H, nc)} 
f. S:mNc {(s, mNc)} 
g. P:v 3 {(P, v3)} 
Left hand sides of formulae can be seen as implicit 
schematizations over slot names. For example, (9b) 
says that for any slot name o-, a constituent labelled {(o-, 
mNc)} may have the immediate constituent analysis 
seen on the right hand side of the equation. 
A category structure representing a set of categories 
including all those seen in the above illustrative exam- 
ples is given in (11). 
(II) a. F = {R,A,LIM,M,H,S,P} 
b. A = {LraPhr,prep,mNc,ar,aj,nc,vl,v2,v3} 
c . "r 0 
d. p = {(R,{prep}),(A,{mNc}),(LIM,{ar}),(m,{aj}),(H, 
{nc}),(s,{mNc}),(p,{Vl,V2,V3})} 
This artificially tiny fragment does not show much of 
the structure that would be revealed in a larger frag- 
ment, with more word classes and phrases types, but it 
will suffice to show how we could set up a category 
structure that provided isomorphic correspondents to 
the categories employed in a tagmemic description. 
Moreover, there is an unclarity about whether there is 
more to a tagmemic formula than has been illustrated 
here; as discussed by Postal (1964), there are some 
remarks about the treatment of agreement in Elson and 
Pickett (1962) that imply either finite schematization or 
additional representational devices of an unclarified 
sort. We will not explore this topic here. 
Postal (1964) is probably right in saying that tagme- 
mics appears to be only notationally distinct from 
context-free phrase structure grammar. Longacre (1965) 
claims that "\[b\]y bringing together function and set in 
the tagmeme" tagmemics ensures that "function is at 
once kept in focus and made amenable to formal anal- 
ysis." Under our reconstruction, "functions" like 
"subject" or"modifier" are "made amenable to formal 
analysis" simply by incorporating them into the feature 
structure of categories, making it clear that little was at 
stake in the debate between Postal and Longacre over 
the content of tagmemics. It is clear that the number of 
categories defined by a category structure for tagme- 
mics will be bounded from above by IFI • IAI, and thus 
finite. The question of whether tagmemics reduces to 
context-free grammar therefore turns on whether tag- 
memic formulae can in all cases be reduced to context- 
free rules. This seems likely, but such issues are not the 
focus of our attention in this paper. 
3.3 HARMAN'S AUGMENTED PHRASE STRUCTURE 
GRAMMAR 
Harman (1963) presents a proposal that involves aug- 
menting the ordinary category inventory (S, NP, VP, 
etc.) of simple phrase structure grammar by attaching 
"an unordered sequence of zero or more (up to N for 
some finite N) subscripts" to a category. Abbreviatory 
conventions are then used to manage large sets of rules 
over the resultant vocabulary. Note that the indices 
stand for the members of a set rather than a sequence, 
and that there is only a finite number of them. 
To formalize Harman's proposal in the present 
framework, we again use LABEL as the feature that 
identifies major syntactic categories in the traditional 
sense, and we set up a finite number of type 0 features 
~'1 ..... ~', to correspond to the presence (value 1) or 
absence (value 0) of each of the n different subscripts. 
The set of feature specifications for these features 
reconstructs the characteristic function of the set of 
indices. The category structure is as follows: 
(12) a. F = {LABEL, F 1 ..... Fn} 
b. A = {al ..... a,,} tA 2 
c. "r 0 
d. p = {(LABEL, {a 1 ..... am}), (FI, 2) ..... (Fn, 2)} 
We now have to guarantee that every category has a 
value for LABEL and a value for each  ,. in F. We 
therefore impose the following constraint: 
(13) LABEL /~ F I /~ . . . /~ F n 
The resultant specification induces a finite set of cate- 
gories, of cardinality m • 2 n. 
Harman's system is more than just a historical curi- 
osity. More recent works are found that use almost 
exactly the same sort of syntactic categories. For ex- 
ample, the use made of syntactic features in one influ- 
ential variety of augmented phrase structure grammar, 
the Prolog-based definite clause grammar (DCG) for- 
malism of Pereira and Warren (1980) closely resembles 
that of Harman. However, it is clear that the full power 
of the DCG formalism can, in principle, be used to 
exploit features with structured values and value- 
sharing (see section 6 on the latter). 
3.4 RELATIONAL AND ARC PAIR GRAMMAR 
Relational grammar (RG) Perlmutter and Postal (1977) 
and arc pair grammar (APG) Johnson and Postal (1980), 
(henceforth J&P) appear to make relatively little use of 
grammatical category information, expressing most 
grammatical rules as conditions on arcs representing 
grammatical relations between nodes (in RG) or as 
conditions on relations between such arcs (in APG) 
rather than on the labeling of nodes. Nonetheless, J&P 
make clear that nodes are assigned grammatical cate- 
gory labels in APG, and since APG is essentially a 
formalized elaboration of RG ideas, we will assume that 
much the same is true in RG, though the RG literature 
so far has not made such aspects of the approach 
6 Computational Linguistics, Volume 14, Number 1, Winter 1988 
Gerald Gazdar et ai. Category Structures 
explicit. Syntactic category labels are not entirely with- 
out utility in RG and APG, since, for example, agree- 
ment rules crucially make reference to categorial prop- 
erties like number, gender, and person, and the proper 
formulation of agreement rules has been a topic of some 
interest in RG and APG research. 
As defined in J&P, an arc is an ordered pair (R((a, 
b)), c I ... CA) where R((a, b)) indicates that b (the 
second or head node) bears the grammatical relation 
named by the "relational sign" R to a (the first or tail 
node), and cl through ck are the representational strata 
Ladusaw (1985) at which this holds. In APG, categories 
are assigned to nodes by means of arcs in which the 
relational sign is L; such arcs are referred to as L arcs. 
The head of an L arc is simply an atomic label from a set 
of "grammatical category nodes" (called GNo by J&P) 
that is given by listing. 
Two types of grammatical category are recognized in 
APG: Major categories such as CI (clause), Nom (nom- 
inal), and V (verb), and minor categories such as 
Feminine, Singular, Third-Person, etc. A general con- 
straint (Pair Network Law 31, the Major Category 
Exclusiveness Law) prevents a node from being the tail 
of two distinct arcs with heads in the set Major (J&P, 
202), i.e., the set of grammatical category nodes that 
represent major categories. We can obtain the effect of 
this law simply by assuming a type 0 feature LABEL 
which takes values in the set of Major categories. 
In the case of minor categories, APG permits multi- 
ple atomic elements from GNo to be attached by L arcs 
to a single tail node (J&P). Thus a node might be the tail 
of L arcs whose head nodes are the atoms Nom, 
Feminine, Singular, and Third-Person, representing a 
third person singular feminine noun or noun phrase. It is 
easy to represent such sets of labels attached to a single 
node using type 0 features. We can represent the set of 
elements of GNo assigned to a given tail node by 
including a category corresponding to the characteristic 
function of that set, as with the indices in Harman's 
system. So we fix values for F, A, T, and p as shown in 
(14): 
(14) a. F = {LABEL, F I ..... Fn} 
b. A = {al ..... am} U 2 
c. ~'o 
d. p = {(LABEL, {a I ..... am}), (F1, 2) ..... (Fn, 2)} 
Here Major = {?l ..... ?n}, and GNo = {? 1 ..... ~,} U 
{a I ..... am}. The constraint needed is the following: 
(15) ~'IA...AF. 
This has the effect of requiring every category to 
include the characteristic function of a set (of minor 
categories, in the APG sense). However, we do not 
need to guarantee that every category has a specifica- 
tion for LABEL, as J&P specifically leaves it open 
whether there are nonterminal nodes with no associated 
grammatical categories; the absence of any grammatical 
category node will be reconstructed in our terms as that 
function ~" that is undefined for LABEL and which assigns 
0 to each ~'i E F. 
It can be shown that the category system just defined 
adequately represents category labelling in APG, in the 
sense that there exists a bijection 0 between (a) nonter- 
minal nodes together with their grammatical category L 
arcs in an admissible APG syntactic representation and 
(b) admissible categories induced by the category struc- 
ture in (14) and the constraint in (15). 
From an arbitrary well-formed APG pair network we 
can extract the set X of arcs it contains (J&P), and the 
set N of nodes associated with X. Since we are not 
concerned with coordinates, we can discard the coordi- 
nate sequences and consider just the incomplete arcs to 
which the arcs in X correspond. By Theorem I (J&P), 
all and only the terminal nodes in N are heads of L arcs. 
Extracting just the arcs with terminal nodes as heads 
gives us the set of L arcs from X; and discarding those 
with heads not in GNo gives us just the L arcs with 
grammatical category labels as their heads. The mem- 
bers of this set can be partitioned into equivalence 
classes having the same tail node (since by definition no 
arc has more than one tail). For convenience of refer- 
ence we can call these equivalence classes category- 
labelled nodes. 
Theorem. There is a bijection from APG category- 
labelled nodes to categories admitted by (14) and (15). 
Proof. Consider an arbitrary category-labelled node K 
with tail n. By PN Law 31, the Major Category Exclu- 
siveness Law, exactly one arc in K has a head which is 
in Major. Let 01 be the bijection established by 01(L(n, 
a)) = t~, and let 02 be the bijection established by 0(a) = 
(LABEL, O~) iff a E Major and (a, 1) otherwise. The 
category corresponding to K will be the smallest set that 
contains 0102(A) for all arcs A in K and contains (Fi, 0) for 
all vi in F that are not in the range of 01. Since 01 and 02 
are bijections, their product 0102 is a bijection. The 
correspondence in the opposite direction is obvious. A 
node that is the tail of no L arcs will be mapped by 0102 
to ~, and other nodes will be mapped onto categories in 
which the values of the features record the details of the 
category-labelling L arcs in r together with (redun- 
dantly) information about which one is the major cate- 
gory, the mapping yielding a unique result in each 
case.ll 
The set of APG (and, we assume, RG) categories 
induced is finite, and ceteris paribus is of cardinality 
m • 2"; it will be much smaller once further conditions 
on coocurrence of minor categories are imposed (Mas- 
culine and Feminine presumably cannot both be 
mapped to 1 in a category, for example). It is of interest 
that despite the utterly different grammatical formalism 
and theoretical background associated with it, the APG 
notion of syntactic category can be seen to be almost 
identical to that of Harman's augmented phrase struc- 
Computational Linguistics, Volume 14, Number I, Winter 1988 7 
Gerald Gazdar et al. Category Structures 
ture grammar, nodes without LABEL values contributing 
the only relevant difference. 
3.5 X SYNTAX, TRANSFORMATIONAL GRAMMAR, 
GOVERNMENT-BINDING 
In the great majority of contemporary works in trans- 
formational grammar (TG), including those representing 
what is known as "government-binding" (GB) 
Chomsky (1981), the conception of grammatical catego- 
ries follows what is called "the X-bar convention" 
Jackendoff (1974) Hornstein (1977) or "X-bar syntax". 
"X-bar" is often notated X or X', or as X 1, X 2, etc., the 
superscript numeral denoting the number of bars or bar 
level.) The central idea of X-bar syntax is that phrasal 
categories are "projected" from lexical categories. 
Given a lexical category X, the related phrasal nodes 
are assumed to be X(= X' = X1), X(= X" = X2), and so 
on. 
Representing phrasal categories as founded on lexi- 
cal categories in this way amounts to treating categories 
as non-atomic, the distinction between lexical catego- 
ries and the various levels of phrasal category being 
tantamount to a feature specification distinction. Bar 
level is not treated in terms of features in most works 
using X-bar notation, probably because of the tradition 
in TG (and related work in segmental phonology) re- 
stricting features to the values {-, +}. Thus Bresnan 
(1975) treats categories as ordered pairs (i, M) where i is 
a natural number representing the bar level and M is a 
matrix of feature specifications, and the same formal- 
ization is used by Lasnik and Kupin (1977). Here we 
simply integrate bar level information with the rest of 
the feature system. 
Although the origins of the X-bar proposal (Harris 
1951) do not take such a feature analysis of categories 
any further, but treat lexical categories as atomic, it is 
always assumed in current instantiations of X-bar syn- 
tax that lexical categories themselves have a feature 
analysis. In much TG, it is presupposed that the lexical 
categories N, A, V, and P are to be analyzed in terms of 
two binary features N and v. 1 Lasnik and Kupin (1977) is 
a fairly explicit formulation of this type of category 
system. They assume a maximum bar level of three. To 
characterize their system of categories, we fix our 
values for F, A, ~-, and p as in (16), and impose the 
constraint in (17). 
(16) a. F = {N, V, BAR} 
b. a = {0, 1,2,3} 
C. ~o 
d. p = {iN, 2), (V, 2>, (BAR, a)} 
(17) N/% v/% BAR 
This yields a system of 16 categories, four at each bar 
level. 
Jackendoff (1977) proposes a version of X-bar syntax 
in which lexical categories are distinguished from one 
another by means of the features \[-----SUBJ\], \[--0BJ\], 
\[-----C0MP\], and \[--+DET\] rather than by I-----N\] and \[-+v\]. He 
does not provide an explicit definition of his full set of 
categories, but he gives enough detail for it to be 
deducible. To define Jackendoff's system of categories, 
we fix our values for F, A, ~', and p in the manner shown 
below: 
(18) a. F = {SUBJ, C0MP, DET, 0BJ, BAR} 
b. a = {0, 1,2,3} 
C. 7 0 
d. p = {(SUBJ, 2), (C0MP, 2), (DET, 2), (0BJ, 2), 
<BAR, A)} 
To get the exact set of permissible categories, we need 
to make sure that SUBJ, 0BJ, COMP, and BAR are defined in 
all categories, and that DET is only specified in \[-C0MP\], 
\[-0BJ\] categories. The following set of L c constraints 
will achieve this. 
(19) a. SUBJ A OSJ A COMP /%, BAR 
b. DET --) ((COMP:0) /~ (0BJ:0)) 
We can now obtain a bijection between Jackendoff's 
X-bar categories and the admissible categories induced 
by F, A, and the constraints listed in (19). We define a 
mapping 0 between the Jackendoff's own category 
abbreviations and the admissible categories with re- 
spect to (19a) and (19b), as follows (we schematize by 
writing X with n bars as X n, 0 <-- n <- 3): 
(20) a. 0(V") = {(SUBJ, 1), (0BJ, 1>, (C0MP, 1), (BAR, n)} 
b. 0(M") = (SUBJ, 1), (0BO, I), (C0MP, 0>, (bar, n)} 
c. 0(P") = {(SUBJ, 0), (0BJ, I), (C0MP, 1), (BAR, n)} 
d. 0(Prt n) = {(SUBJ, 0), (OBJ, 1), <C0MP, 0), (BAR, n)} 
e. 0(N n) = {(SUBJ, 1), (0BJ, 0>, (C0MP, 1), (BAR, n)} 
f. 0(Art") = {(SUBJ, 1), (0BJ, 0>, <C0MP, 0), (DET, 1), 
(BAR, n)} 
g. 0(Q") = {(SUBJ, 1), (0BJ, 0), (COMe, 0), (DET, 0>, 
(BAR, n)} 
h. 0(A n) = {(SUBJ, 0), (OBJ, 0), (C0MP, 1), (BAR, n)} 
i. 0(Deg") = {(SUBJ, 0), (0BJ, 0), (COMP, 0), (DET, 1), 
(BAR, n>} 
j. O(Adv") = {(SUBJ, 0), <OBJ, 0), (COMP, 0), (DET, 
0), (BAR, ~} 
An example of a category admitted in Jackendoff's 
system would be {(BAR, 3), (SUBJ, 1), (0BJ, 0), (C0MP, 1)}, 
which can be more perspicuously presented in the 
graphic form given in (21). 
BAR 3 
SUBJ 1 
OBJ 0 
COMP 1 
As is evident, the set of categories induced by Jacken- 
doffs system has a cardinality of 40, ten at each bar 
level. 
8 Computational Linguistics, Volume 14, Number 1, Winter 1988 
Gerald Gazdar et al. Category Structures 
Sets of categories as small as this are clearly insuffi- 
cient for the description of natural languages. All trans- 
formational grammarians seem to agree that references 
to distinctions of tense, mood, voice, person, number, 
gender, case, pronominality, definiteness, wh-ness, and 
many other morphological and syntactic distinctions are 
in fact needed in a grammar. As pointed out by PuUum 
(1985), some statements in the TG literature suggest that 
further features are provided for the expression of such 
distinctions but are restricted to lexical (<BAR, 0)) cate- 
gories. However, it is easy to find examples in the 
literature of additional features like definiteness, case, 
wh-ness, and many others, being assigned to phrasal 
nodes as well. In marked contrast to a work such as 
Stockwell, Schachter and Partee (1973), recent TG has 
not been explicit about such matters. Allowing for 
twenty binary morphosyntactic features (a modest esti- 
mate if any serious effort at coverage is to be made) and 
allowing them only on lexical categories would increase 
the cardinality of the set of categories to about 4 • 106 in 
the case of Lasnik and Kupin's system and to over 107 
in the case of Jackendoff's. 
In one respect, what we have said so far may not 
adequately capture the conception of categories found 
in recent TG and GB works. These works generally 
make considerable and crucial use of co-indexing of 
nodes, using indices taken from an infinite set such as 
the integers. If the index on a node is taken to be part of 
the structure of the category labelling that node 
Chomsky (1970), which is not the only view one could 
take, then the number of distinct categories becomes 
infinite. This does not mean it becomes difficult to 
represent. Indexing of this sort can be represented 
directly in the present framework without adding an 
infinite set of additional atoms such as the natural 
numbers. We add a type 0 feature 0e (with p(0e) = {0}) 
and a type 1 feature SUCCESSOR to the feature system and 
use this to build the set of indices. Thus the index "3" 
would be represented as shown in (22), where category- 
valued feature specifications are shown with pointers to 
categories in their value positions. 
In some recent TG, more than one indexing system is 
employed. Thus Rouveret and Vergnaud (1980, p. 160) 
"postulate that each verbal complex in a structure is 
identified by some integer p and each \[-N\] element in the 
verbal complex p bears the superscript p." This super- 
scripting system is distinct from the subscripting system 
maintained to indicate anaphoric linkage or binding, and 
neither places an upper bound on the number of indices. 
Hence it would not be sufficient to have a single type 1 
feature. Two further type 1 features SUBSCRIPT and 
SUPERSCRIPT could be used, each taking category values 
representing indices with SUCCESSOR and OF. 
It may seem implausible to suppose that anyone 
would choose in practice to handle indexing via a 
feature system such as that just suggested• Nonetheless, 
it would clearly be possible, which shows that one can 
incorporate integer indices into the structure of catego- 
ries in terms of a finite number of features and a finite 
number of atoms, which might not initially have been 
evident• 
3.6 GENERALIZED PHRASE STRUCTURE GRAMMAR 
The generalized phrase structure grammar framework 
(GPSG), as set out in Gazdar, Klein, Pullum, and Sag 
(1985), (henceforth GKPS), differs from the examples 
considered so far in that it makes extensive use of 
features that are permitted to have categories as their 
values. 2 
For concreteness, we suggest how the set of catego- 
ries for the GKPS version of GPSG would be recon- 
structed in the framework presented here (see GKPS 
pp. 245-6, for the complete lists where we abbreviate 
with "..."). 
(24) a. F = {SUBJ, N, C0MP, BAR ..... AGR, SLASH} 
b. A = {0, 1, 2, .... for, that .... } 
C. ~" = {(SUBJ, 0), <N, 0), <V, 0), <COMP, 0), (BAR, 0), 
... <AGR, 1), <SLASH, 1)} 
d. p = {(SUBJ, 2), <N, 2), <V, 2), <C0MP, {for, that, 
• . .}),..., <BAR, {0, 1, 2})} 
We add to this, for each feature f E F ~, the following 
(22) 
Constraints are necessary to ensure that the value of 
SUCCESSOR does not contain anything but SUCCESS0ROr 0e 
specifications. To this end, we constrain each feature f 
E F ° (except 0e) as shown in (23a), and in addition we 
impose (23b) and (23c): 
(23) a. \[\] --1 (SUCCESSOR: 39 
b. \[\] --1 (SUCCESSOR A OF) 
C. \[\] --1 (SUCCESSOR: --1 OF --I SUCCESSOR) 
constraint: 
(25) \[\] -~ (f: of) 
This prevents a category-valued feature f from being 
specified anywhere within the value of an occurrence of 
f. An example of a moderately complex category with 
more than one category-valued feature that nonetheless 
obeys (25) is shown in (26). 
Computational Linguistics, Volume 14, Number 1, Winter 1988 9 
Gerald Gazdar et al. Category Structures 
(,,t~ 
The constraint (25) restricts us to exactly the set of legal 
GKPS categories. 3 The total GKPS category set is 
finite, but naturally, it is extremely large (Ristad (1986) 
calculates that it is in excess of 10774). It is clear that the 
set of GKPS categories is vastly too large to be precom- 
piled and stored-and indeed, no implementation that we 
know of has attempted this. 
3.7 SYSTEMIC GRAMMAR 
Systemic grammar, originally known as "scale and 
category" grammar, has its origins in the work of 
Halliday (1961) and is widely known among computa- 
tional linguists through Winograd (1972) and other 
works, and it has recently received rigorous formaliza- 
tion in the hands of Patten and Ritchie (1987). Tree 
structures in systemic grammar tend to be fiat, more 
structural information being expressed through catego- 
ries than in most other approaches Hudson (1971). 
Categories in systemic grammar are simply bundles of 
feature specifications: there is "nothing in systemic 
theory corresponding to the distinction between "fea- 
tures"--such as \[+past\] --and "categories"--such as 
NP and S---in TG theory" Hudson (1971, p. 48). A set 
of well-formed categories in a systemic grammar is 
defined by a system network, which "is in effect a body 
of rules, in symbolic form, which specify precisely how 
features can combine with each other: in other words, 
which features can appear together in the paradigmatic 
description of a single item, and which cannot" Hudson 
(1971). 
We will not discuss rules for forming systemic net- 
works (and hence categories) here, but will instead refer 
the reader to the presentation in Winograd (1983), 
where a system network expressing category informa- 
tion for the English pronominal form is provided as an 
example of the notational techniques used in systemic 
grammar for specifying a set of categories. We repro- 
duce this in Figure 1. 
The content of Figure 1 can be reconstructed 
straightforwardly as a category structure subject to a set 
of L c constraints (for a closely related analysis of this 
10 
{~ Animate 
Question __ -- Subjective 
Case Objective Reflex 
ve Possessive 
Possessive-Determ ner 
_ I First 
Personal ~_.P__~ Second _ _ I Feminine 
Ingular-- J Neuter 
f \[ | Plural 
Demonstrative --l~ Near 
/ Far 
Figure 1: Systemic Network for English Pronouns 
example, developed independently, see Mellish (1986). 
The following is the category structure that we need: 
(27) a. F = {PRONOUN, CASE, PERSON, GENDER, NUMBER, 
ANIMACY, PROXIMITY} 
b. A = {question, personal, demonstrative, subjec- 
tive, objective, reflexive, possessive, posses- 
sive-determiner, first, second, third, feminine 
masculine, neuter, singular, plural} 
C. T O 
d. p = {<PRONOUN, {question, personal, 
demonstrative}), 
<CASE, {subjective, objective, reflexive, posses- 
sive, possessive-determiner}), 
<PERSON, {first, second third}), 
<GENDER, {feminine, masculine, neuter}), 
<NUMBER, {singular, plural}), 
<ANIMACY, 2>, 
<PROXIMITY, 2>} 
The constraints that must be imposed are the following: 
(28) a. PRONOUN 
b. (PRONOUN:question) ~ (CASE /~ -'-I PERSON /'k ---I 
NUMBER /~ ANIMACY /~ 7 PROXIMITY) 
C. (PRONOUN:personaD <--> (CASE /~ PERSON /~ NUMBER 
/k -q ANIMACY /~ "7 PROXIMITY) 
d. (PRONOUN:demonstrative) <--) (7 CASE /~ 7 PERSON 
/~ NUMBER /~ -3 ANIMACY /~ PROXIMITY) 
e. GENDER ~ (PRONOUN A (PERSON:thirD A (NUMBER: 
singular)) 
Note that this description of the pronominal system of 
English is artificially complicated by its isolation from 
the rest of the grammar. If it were embedded in the 
context of a definition of a wider class of categories (for 
example, the English noun class network given by 
Winograd (1983), it would be modified by the elimina- 
tion of (28a) and the relaxation of (28b-d) to simple 
conditionals. 
Computational Linguistics, Volume 14, Number 1, Winter 1988 
Gerald Gazdar et al. Category Structures 
The structure seen in this example employs only type 
0 features. For example, the category it defines for a 
pronoun like herself would be (29). 
(29) PRONOUN 
CASE 
PERSON 
NUMBER 
GENDER 
personal 
reflexive 
third 
singular 
feminine 
Interestingly, however, systemic grammar as formal- 
ized by Hudson (1971), at least is not limited to type 0 
features. Hudson explicitly permits recursive growth of 
feature structures in order to count constituents (see pp. 
60-62). This could be reconstructed here by using a type 
1 feature in roughly the manner we employed SUCCES- 
SOR, above. Such a use of type 1 features immediately 
makes the size of the category set infinite. 
3.8 CATEGORIAL GRAMMAR 
Categorial grammar originates with work by Lesniewski 
and Adjukiewicz in the 1940s (see van Benthem, Busz- 
kowski and Marciszewski (1986), Haddock, Klein and 
Morrill (1987) and Oehrle, Bach and Wheeler (1987) for 
recent work and references to the earlier literature). The 
set of categories used is infinite. It is often defined as 
the smallest set containing some set of basic categories 
{al ..... a,}, and closed under the operation of forming 
from two categories a and/3 a new category al/3. 
To reconstruct the category system for categorial 
grammar, we define E as shown in (30). 
(30) a. F = {LABEL, DOMAIN, RANGE} 
b. A = {a, ..... a.} 
C. "/" = {<LABEL, 0), <DOMAIN, I), <RANGE, 1>} 
d. p = {<LABEL, A>} 
We then add the following: 
(31) a. \[-\](DOMAIN <--> --l LABEL) 
b. \[-\](DOMAIN <--> RANGE) 
We can now represent any category allowed in the 
simple form of categorial grammar considered so far. 
For example, the category (StNP)I(SINP) can be repre- 
sented as shown graphically in (32). 
(32) 
DOMAIN 
DOMAIN 
RANGE 
DOMAIN 
RANGE 
Computational Linguistics, Volume 14, Number 1, Winter 1988 
To show formally that we have captured the content 
of the category system of categorial grammar, we can 
exhibit a bijection between the categorial grammar 
categories and the admissible categories induced by F, 
A, and the constraints defined above. We define a 
mapping 0 between the categorial grammar categories 
and the admissible categories with respect to (31a) and 
(31b), as follows: 
(33) a. O(a i) = <LABEL, ai) where a i e A 
b. 0(al/3) = {<DOMAIN, 0(/3)), <RANGE, 0(o0> } where a 
and/3 are categories. 
A simple structural induction argument suffices to show 
that 0 is indeed bijective. The smallest category will be 
of the type ai, and corresponds to {<LABEL, ai) }. The next 
step up yields a category of the form ailaj, which 
corresponds to: 
(34) {<DOMAIN, {<LABEL, aj)}), <RANGE, {<LABEL, ai)}>}. 
Each further step replaces a i or aj by a non-basic 
category and will clearly yield a unique result. It can be 
seen immediately that the mapping 0 has an inverse. 
The categories defined thus far are non-directional, 
in the sense that a complex category can combine with 
an argument either to its left or its right. However, most 
definitions assume directional categories Bach (1984). 
This further specification can be easily incorporated by 
introducing a new feature name DIRECTION which takes 
values in 2. We then add a constraint that categories 
taking values for DOMAIN also take a value for DIRECTION, 
thus determining the directionality of the category. 
(35) \[-\](DOMAIN ~ DIRECTION) 
The translation function is then: 
(36) a. O(a i) = {<LABEL, ai)} 
b. 0(og/3) = {<DOMAIN, 0(/3)>, <RANGE, 0(a)>, 
<DIRECTION, I)} 
C. 0(a//3) = {<DOMAIN, 0(/3)>, <RANGE, 0(iX)), 
<DIRECTION, 0)} 
This translation function is again a bijection, for the 
same reasons as before. Clearly we could employ an 
analogous move to subsume the od/3 vs. od//3 category 
distinction employed in Montague (1973). 
In some recent work on categorial grammar, it makes 
sense to think of expressions being assigned to infinite 
sets of categories rather than to a single category, but 
we will not pursue the implications of such a move here 
(see van Benthem (1986c) for relevant discussion). 
3.9 INDEXED GRAMMAR 
Indexed grammars are a generalization of phrase struc- 
ture grammars due originally to Aho (1968). Like cate- 
gorial grammar and some of the other frameworks 
previously mentioned, it uses an infinite category set. In 
the formulation presented in Gazdar (1985), an indexed 
grammar category consists of an atomic label and a 
11 
Gerald Gazdar et al. Category Structures 
possibly empty list (or stack) of atomic indices drawn 
from a finite set. 
There is a familiar technique for encoding lists or 
stacks in a notation which relies on the fact that lists can 
be decomposed into an initial element and the residual 
list (see, for example, Shieber (1984)). Thus, we add 
new elements INDEX and LIST to the set F: 
(37) a. F = {LABEL, INDEX, LIST} 
b. A = {a 1 ..... am} tO {0, i, ..... i,} 
C. 7" = {(LABEL, 0>, (INDEX, 0>, (LIST, I>} 
d. p -- {(LABEL, {a I ..... am}), (INDEX, {0, i I ..... 
i,}>} 
A list of indices of the form (38a) is represented as (38b). 
(38) a. \[J0, Jl ..... JJ 
b. {(LIST, {<INDEX,jo> , (LIST, {<INDEX,jl) , (LIST .... 
{(INDEX, Jk), (LIST, {(INDEX, 0>})} . . .>} 
In addition, we need the following constraints: 
(39) a. LABEL /% LIST 
b. \[\] --1 (LABEL /~ INDEX) 
C. \[\] --I (LIST: --I INDEX) 
d. \[\] --I (LIST /~ INDEX:0) 
The first requires that at the top level, an indexed 
category has a label and a list of indices. The second 
disallows INDEX from co-occurring with LABEL, enforcing 
the constraint recursively downward. The third requires 
that if LIST is defined anywhere, then INDEX is defined in 
its value. And the last, also enforced recursively down- 
ward, requires that if INDEX has the value 0, LIST is not 
defined (so the end of the list of indices is unambig- 
uously flagged by INDEX having the value 0). A category 
bearing an "empty" list of indices is thus one whose 
value for LIST is {(INDEX, 0)}. An example of a category 
allowed by these constraints is shown in (40). 
defined, since the distinction between atomic indices 
and indices taken from a finite set of categories has no 
language-theoretic implications. 
Given the representability of list-valued features as 
category-valued features in the present framework, the 
definitions of subsumption and unification automati- 
cally apply to lists without the need for any redefinition. 
If the empty category is used as the end marker for lists 
then two lists of different lengths will unify if one is a 
prefix of the other. Depending upon the linguistic inter- 
pretation of lists, this may or may not be what one 
wants. In our illustration, we use an atomic end marker 
that will block prefix unification. 
4 COMPUTATIONAL COMPLEXITY OF CATEGORY 
CHECKING 4 
The checking problem for categories is the problem of 
determining whether a category is legal given a fixed set 
of constraints, or more precisely, of determining for an 
arbitrary category oz and a fixed formula 4' of L c 
whether o~ satisfies 4'. It is a special case of the problem 
of determining whether some arbitrary model satisfies 
some fixed formula of a logic. 
Theorem. The checking problem for categories is solv- 
able in linear time. 
Proof. Assuming a category structure E = <F, A, ~-, p), 
we represent a category in K as a partially labelled, 
unordered tree with all nodes except the root labelled 
from F tO A, all nodes labelled from A being terminals. 
The category   corresponds to a single unlabelled node; 
</, a) for f E F ° and a E A corresponds to a node 
labelledfwith daughter labelled a; and (\[, {~q ..... o-k}> 
for f E F ~ and n - 0 corresponds to a node labelled f 
with the first elements of oq through o- k as its daugh- 
(40) I LAB £ I D I 
I LIS  I __li  xlal LIST ~" I  li rs:Xl:! 
LIST l 0 I 
Indexed grammar as originally formalized by Aho 
uses lists of atomic indices as part of the composition of 
categories. It is also possible in the framework we have 
defined to allow features to have lists of categories as 
their values. This is in fact proposed in the literature by 
Shieber (1984) and Pollard (1985). To extend an indexed 
grammar to permit GKPS-style categories in place of 
atomic indices, one can simply make INDEX a type 1 
feature, add the GKPS category structure and con- 
straints to the indexed grammar category structure and 
constraints, and then exempt LIST (but, crucially, not 
INDEX) from being subject to the constraint schema in 
(25). The resultant type of grammar, assuming that the 
limitations on rules in indexed grammars are main- 
tained, is equivalent to indexed grammar as originally 
LeT(s). Let T be such a tree, and let 4' be a fixed formula 
of L c. We check T for satisfaction of 4' by annotating 
each node of T with the complete list of all subexpres- 
sions of 4', and working from the frontier to the root 
recording at each node which subexpressions are satis- 
fied by the subtree rooted there. At each point the 
checking is local: only the current node and its daugh- 
ters (if any) need be examined. Even for a subformula 
like \[-q¢, all that must be verified at a node q as we work 
up the tree is that q, is satisfied at q and 7q¢ is recorded 
as satisfied at each daughter node. The conclusion of 
the procedure will be to determine whether or not 4' 
itself is true at the root of T, and thus whether T is 
well-formed. If 4' has s subformul~e and T has n nodes, 
the time taken is bounded by sn (the number of steps 
12 Computational Linguistics, Volume 14, Number 1, Winter 1988 
Gerald Gazdar et al. Category Structures 
required if every subformula is evaluated at every 
node), and thus linear in n, the size of the input. • 
Of somewhat less interest than the checking problem 
is the universal checking problem, that of determining 
for an arbitrary input pair (tk, a), ~b a formula and a a 
category, whether a satisfies 4). The difference is that 
here ~b is not held constant; the task is analogous not to 
checking the legality of a category within a selected 
grammatical framework, but rather to a kind of frame- 
work-design oversight role, switching frameworks with 
every input and evaluating the given category relative to 
the proffered constraint. We note, however, that the 
universal checking problem only calls for, at worst, 
quadratic time. To see this, simply note that we can use 
the algorithm sketched above, and take account of s as 
well as n as part of the size of the input. The worst case 
is where s and n contribute about equally to the size of 
the product sn, i.e., where s -~ n. Then sn -~ ((s + n)/2) 2 
= (s + n)2/4, which varies with the square of the input 
size s + n. 
For some special cases, both the checking problem 
and the universal checking problem are of course much 
easier. For example, if only type 0 features are permit- 
ted, checking is decidable in real time by a simple 
inspection of the finite number of (f, a) pairs, regardless 
of whether ~b is part of the input or not. 
Note that the much harder satisfiability problem, that 
of determining for an arbitrary formula ~b whether there 
exists a category a that satisfies it, is of even less 
interest in the present context. When a grammatical 
framework intended for practical use is devised, the 
constraints on its category system are formulated to 
delimit a particular set of categories already well under- 
stood and exemplified. There is no practical interest in 
questions about arbitrary formulae of L c for which no 
one has ever considered what a satisfying category 
would be like. 
We would expect the satisfiability problem for Lc to 
be PSPACE-complete, like the satisfiability problem for 
most modal logics. Ristad (1986, p. 33-4) proves a 
PSPACE-hardness result for what he calls "GPSG 
Category-Membership", specifically with respect to the 
GKPS framework, and this can immediately be seen to 
be extendable to the satisfiability result for L c (as 
mentioned in footnote 3, L c is in effect a language for 
the statement of feature cooccurrence restrictions, and 
can be used in the same way that Ristad uses the GKPS 
FCR formalism). The problem he considers, despite the 
misleading name he gives it, is the analog of satisfiabil- 
ity, not of checking; it asks whether there exists an 
extension of a given category that satisfies a given set of 
FCRs, and since the given category might be O, this is 
equivalent to satisfiability. Satisfiability is NP-complete 
even for simple propositional logic, so as soon as it is 
appreciated that a language for stating constraints on 
categories is in effect a logic with categories as its 
models, the complexity of satisfiability for category 
constraints comes as no surprise. Checking of GKPS 
categories, on the other hand, which Ristad does not 
consider, can be done very fast, as a corollary of the 
theorem above. 
5 SETS AS VALUES 
All the syntactic approaches that we have considered so 
far distinguish syntactic categories from structural de- 
scription of expressions in a fairly transparent fashion. 
In FUG Kay (1979, 1985), LFG Kaplan and Bresnan 
(1982), and work by Shieber and others on PATR II 
Shieber (1984), this traditional distinction disappears 
almost entirely. Thus, in LFG, syntactic categories and 
the structural descriptions known as f-structures are 
exactly the same kind of object. In FUG, not only is 
there no formal distinction between categories and 
structural descriptions, but even the distinction be- 
tween structural descriptions and grammars disappears. 
At first sight, LFG f-structures seem likely to be the 
trivial case of a set of categories observing no con- 
straints on admissibility at all. We simply take F to be 
the LFG set of f-structure attribute names, and A to be 
the LFG set of atomic f-structure values (the "simple 
symbols" and "semantic forms"). So, following this 
reasoning, the set of LFG f-structures would be just K, 
modulo the appropriate typing. However, this is not the 
case, for reasons that will emerge below. 
The first problem we consider is that at least two of 
the frameworks just mentioned permit sets as feature 
values. In one sense we already permit sets as values 
since type 1 features have categories as their values, 
and categories are sets. Categories are a rather special 
kind of set, however, namely partial function from 
features to values. Suppose we merely wanted to have 
a model for a set of atoms. Then, as we saw in our 
discussion of APG, we can model such a set by con- 
structing the set's characteristic function. But modelling 
a set that way, whilst perfectly adequate for APG 
categories, has a consequence that may not always be 
acceptable: two sets on the same domain will unify just 
in case they are exactly the same set. Given certain 
quite natural interpretations of a feature system making 
use of sets, this may not be what we want. 
An alternative strategy then, and one which is also 
consistent with our framework, is to model sets as 
partial functions into a single value range (as opposed to 
total functions into a two value range). For example, the 
subset of the authors of this paper with British ad- 
dresses could be represented as a partial function on the 
domain {Gazdar, Pullum, Carpenter, Klein, Hukari, 
Levine}, namely the function {(Carpenter, 1), (Gazdar, 
1), (Klein, 1),} instead of the following total (character- 
istic) function on the same domain: {(Carpenter, 1), 
(Gazdar, 1), (Hukari, 0), (Klein, 1), (Levine, 0), 
(Pullum, 0)}. Then unification of the partial functions 
amounts to union of the corresponding sets. 
This is fine if our intended interpretation of the set is 
Computational Linguistics, Volume 14, Number 1, Winter 1988 13 
f 
Gerald Gazdar et al. Category Structures 
conjunctive, i.e., if {a, b, c} means that a holds and b 
holds and c holds (Carpenter has a British address and 
Klein has a British address and Gazdar has a British 
address). But if our intended interpretation is disjunc- 
tive, then we want the unification operation to give us 
intersection, not union. FUG actually uses set-valued 
attributes with a disjunctive interpretation Kay (1979). 
And, in a discussion of possible enhancements to the 
PATR II formalism, Karttunen (1984) provides a num- 
ber of very relevant examples that illustrate the issues 
that arise when a unification-based formalism is aug- 
mented in order to encompass disjunction. 
As Chris Barker has pointed out to us, a perverse 
variant of the approach to conjunctively interpreted sets 
outlined above serves to handle the disjunctive inter- 
pretation of sets of atoms. We map the set {Accusative, 
Dative} into the partial function {(NOMINATIVE, 0>, 
(ABLATIVE, 0}, (GENITIVE, 0)} on the domain {ACCUSATIVE, 
DATIVE, NOMINATIVE, ABLATIVE, GENITIVE}. NOW unifica- 
tion (and hence union) of such complement-specifying 
partial functions gives us an operation equivalent to 
intersection applied to the original sets. Thus the unifi- 
cation of {(NOMINATIVE, 0), (ABLATIVE, 0), (GENITIVE, 0>} 
(standing for {Accusative, Dative}) with {(NOMINATIVE, 
0), (ACCUSATIVE, 0), (GENITIVE, 0)} (standing for 
{Ablative, Dative}) gives us {(NOMINATIVE, 0>, (ABLATIVE, 
0), (GENITIVE, 0), (ACCUSATIVE, 0>} which stands for 
{Dative}. 
Clearly, the present approach could be generalized to 
directly allow a type of feature that would take sets of 
atoms as values. The price to be paid for this, in a 
metatheoretical exercise such as the one we are engaged 
in, would be that the definition of unification becomes 
dependent on the intended interpretation of such fea- 
tures: the relevant clause needs to use union if the 
interpretation is conjunction, and intersection if the 
interpretation is disjunction. 
An altogether more serious issue arises when we 
consider the possibility of attributes taking sets of 
categories as values. We could represent such sets in a 
manner analogous to the treatment of lists, but with a 
special marking (given in terms of special attribute- 
value pairs) indicating that the list representation in 
question is to be interpreted as a set. The trouble with 
this is that the identity conditions for the resulting 
objects are no longer transparent. Two structurally 
distinct lists may or may not count as identical, depend- 
ing on whether or not they are both representing sets, 
and that in turn will depend on whether particular 
attributes appear in certain relevant structural posi- 
tions. Likewise, our existing definitions of unification 
and subsumption would simply fail to provide one with 
intuitively reasonable results, and its seems unlikely 
that they could be made to do so without further formal 
contortions. This whole strategy seems contrived and 
inelegant. 
The alternative is, again, to introduce a new type of 
feature, one taking sets of categories as its values, and 
some recent works have done just this. Sabimana (1986) 
proposes a feature ARG which takes a set of categories as 
its value. The feature appears on elements that corre- 
spond semantically to predicates, and its value is the set 
containing the categories that correspond semantically 
to the arguments of that predicate. The Japanese Phrase 
Structure Grammar (JPSG) of Gunji (in press) goes 
further in that it restricts itself entirely to such features 
(together with atom-valued features, of course) and 
does not employ simple category-valued features at all. 
Both FUG and LFG also permit category-set values, 
in effect, though the interpretation they assign to the 
resulting objects is, once again, different. FUG's inter- 
pretation is, as with atom sets, disjunctive. On this 
interpretation, unification of two sets of categories can 
be defined as the set of categories each of whose 
members is the unification of a pair in their Cartesian 
product (again, see Karttunen (1984) for relevant dis- 
cussion of this kind of approach). In LFG, sets of 
categories acting as values for single attributes are used 
in the analysis of adjuncts (and possibly coordination) 
and the interpretation is intendedly conjunctive Kaplan 
and Bresnan (1982). Under this interpretation, there is, 
in general, no unique unification to be had, although one 
can define an operation to provide one with a set of 
possible unifications. In Gunji (in press), where a con- 
junctive interpretation is assigned to category-set val- 
ues, the non-uniqueness problem is sidestepped by 
defining unify as a predicate of category pairs, rather 
than as an operation. 
In view of all these considerations, we have opted for 
simplicity over generality and simply excluded set val- 
ued features from our purview. 
6 SHARED VALUES 
One property that FUG and PATR II have in common, 
which sets them apart from the simpler grammar type 
discussed earlier in this paper, is the option of letting 
two or more distinct features share the same value. 
Thus, FUG functional descriptions allow one instance 
of a value to simultaneously be the value of more than 
one (instance of an) attribute. Consequently, the im- 
plicit hierarchy, represented graphically, does not re- 
spect the single-mother requirement that is built deep 
into our definitions. Of course, two category-valued 
features within a category may contingently have iden- 
tical values, but this is not the same as sharing the same 
value (except in common parlance, perhaps). Kasper 
and Rounds (1986) refer to the distinction as one of type 
identity versus token identity. If we take a category, 
containing two contingently identical category-values, 
and unify it with a second category, then the contingent 
identity may not be preserved in the result. Consider, 
for example, the result of unifying these two categories: 
14 Computational Linguistics, Volume 14, Number 1, Winter 1988 
Gerald Gazdar et al. Category Structures 
where the values of F and H are identical in the first but 
not in the second. The result is: 
d e 
and here the values of ~" and H are no longer identical. If 
the original common value had been genuinely shared, 
then no unification would have been possible (see also 
Shieber (1985) where the term "reentrancy" is used in 
this connection). 
There is an alternative way of thinking about the 
problem of shared values, and that is to reconstruct it in 
terms of indexing: every value carries an index, and two 
structurally identical values are the very same thing if 
and only if they bear the same index. An integer 
indexing of this sort can be represented in the present 
framework as we have already see in section 4.5 above. 
However, a coindexing reconstruction would not be a 
sensible way of thinking about shared values in the 
present context since such a use of indices makes 
nonsense of structurally defined unification, subsump- 
tion, and so on. For two intuitively identical structures 
to unify, it would not be sufficient for them to exhibit 
the same internal patterns of coindexed values. Rather, 
they would need in addition to manifest the very same 
choice of indices. Clearly, this is not what one wants, as 
choice of index is completely arbitrary, and structures 
differing only in identity of the integers selected as 
indices should be regarded as equivalent. 
To achieve a semantics for shared-value category 
formalisms, it is necessary to move beyond the partial 
function-based category structures that provide the 
basis for our semantics, and thus depart from the 
particular category constraint logic that it induces. Like 
set values, shared values are simply beyond the scope 
of the rather parsimonious theory of categories devel- 
oped here. 5 The reader interested in pursuing richer 
approaches should consult Pereira and Shieber (1984) 
for a domain-theoretic account of the semantics of 
categories in LFG, PATR II, and GPSG; Ait-Kaci and 
Nasr (1986), who capture shared values with a corefer- 
ence relation on the nodes of the tree; Kasper and 
Rounds (1986), Moshier and Rounds (1987), and Rounds 
and Kasper (1986) for a finite state automaton-based 
logic and semantics for categories in FUG and PATR II; 
and van Benthem (1986a, b) for an interesting founda- 
tional discussion and application of such an automaton- 
based semantics. 
7 CONCLUSION 
We have developed and applied a general framework for 
defining syntactic categories, including categories in 
which features can have categories as their value, which 
latter possibility turns out to subsume the possibility of 
a feature taking as its value a list of indices or catego- 
ries, drawn from either a finite or an infinite set. The 
unitary way in which we have characterized these 
diverse systems is intended to assist in the exploration 
and comparison of grammatical formalisms. Questions 
concerning whether particular rule types and operations 
on categories that are familiar from one approach to 
grammar can be carried over unproblematically to an- 
other approach, and questions concerning the imple- 
mentation difficulties that arise when a given formalism 
is adopted, can in many cases be settled in a straight- 
forward and familiar way, namely by reducing them to 
previously encountered types of question. 
The grammatical frameworks we have considered as 
examples fall into a five-class typology which we can 
now explicate. The first class contains the frameworks 
that use only atom-valued features (simple phrase struc- 
ture grammar, Harman's augmented phrase structure 
grammar; RG and APG); the second contains the spe- 
cial case of GKPS, which uses category-valued features 
but imposes a constraint which prevents them from 
having effects on expressive power that could not 
ultimately by simulated by atom-valued features; the 
third contains the frameworks that use just a single 
category-valued feature (our key example being indexed 
grammar); the fourth contains frameworks making use 
of more than one category-valued feature (an example 
being categorial grammar); and the fifth includes those 
frameworks that fall outside the scheme we have devel- 
oped in that their categories are not representable as 
finite partial functions constrained by statements in L c 
(LFG, FUG, PATR II, etc.). 
It is not at all clear which of these five classes of 
approaches will prove the most suitable for implement- 
ing natural language processing systems in the long 
term. In this paper, we hope to have made somewhat 
clearer the nature of the issues at stake. We hope also to 
have done something more: for the first four classes, we 
have provided what is in effect a unitary type of data 
structure for the representation of their syntactic cate- 
gories. Thinking in terms of such data structures should 
make it possible for pseudo-issues in natural language 
processing research to be avoided in a large class of 
circumstances, to the point that even a decision in 
mid-project to change the grammatical framework from 
one linguistic approach to another need not entail any 
fundamental redesign of what are in most frameworks 
the basic objects of syntactic representation. 
APPENDIX 
In this appendix we restate the semantic rules for L c 
more precisely. All well-formed expressions of L c have 
Computational Linguistics, Volume 14, Number 1, Winter 1988 15 
Gerald Gazdar et aL Category Structures 
the same kind of denotation~they denote truth values 
(i.e., members of 2) relative to the category structure 
and a category a determined by E. If 05 is a well-formed 
expression of L o then we use f105flx,~ to stand for the 
denotation of 05 with respect to the category structure 
and category a. If 005(\]z~,, ~ = 1 then we shall say that t~ 
SATISFIES 05. The formal statement of our semantic rules 
is the following, where a, f, 05, and q~ are as above. 
(AI) a. 0fl~.~ = 1 iff s0') is defined. 
b. Of:aD~,,~ = 1 iff a(J) = a. 
c. Uf:05fl~.~ = 1 iff 0050~.,~0~ = I. 
d. D~050~.~ = 1 iff 0050~.~ = 0. 
e. 005 V q~.~ = 1 iff D05D~,~ = 1 or 0q~:,,~ = 1. 
f. 005 A ~,,~ = 1 iff 005~,,~ = 1 and 0q~,,~ = 1. 
g. 005 ---* q~,,~ = 1 iff 005~,,~ = 0 or 0q~x,,~ = 1. 
h. 105 <-> ~1:~,~ = 1 iff 1050:~,~ = lqA~,~. 
i. 1\[\]051~,~ = 1 iff 1051:~,~ = 1 and for all fin F ~ n 
~(a),lD051:,,,~00 = 1. 
j. I 0 051:, ~ = 1 iff 1051:, ~ = 1 or for some fin F ~ n 
~(a),l ~ 4>0:~,=00 = 1. 
Note that if a ~_ fl and a satisfies 05, it does ~0T follow 
in L c that 3 satisfies 05 (compare Rounds and Kasper 
(1986), Theorem 6). For example, we have ~ ~_ {(F, a)} 
and ~ satisfies --1 F, but {(F, a)} does not. Likewise, the 
fact that both a and/3 satisfy some constraint 05 does not 
entail that a U/3 will satisfy 05, even if a IA/3 is defined. 
The desire to incorporate negation whilst maintaining an 
upward closure property lead Moshier and Rounds 
(1987) to set aside a classical semantics for their feature 
description language and postulate an intuitionistic se 
mantics that, in effect, quantifies over possible exten- 
sions. 
We will write ~ 05 to mean that for every category 
structure \]i and category a in 11, a satisfies 05. Given 
this, we can list some valid formula: and valid formula 
schemata of the logic of category constraints. 
(A2) a. ~a) --> f (for all a E p(\]), f E F °) 
This simply says that if a feature has an atomic value, 
then it has a value. We also have all the valid formula: 
of the standard propositional calculus, which we will 
not list here. Furthermore, we have the following famil- 
iar valid modal formula:. 
(A2) b. ~1-\]05 ~--10 7 05 
c. ~(05---, 05) 
d. ~05 ~ 05 
e. ~05~ <)05 
f. PD(05 A q,) ~ ( 05 A \[\]q,) 
g. t=~(05V ~b),~-~(O05V O~b) 
h. ~D05 ~ •DO5 
Here, (A2h) shows us that our logic at least contains $4 
(we follow the nomenclature of Hughes and Cresswell 
(1968) throughout). But we do not have ~ <> 05--~ \[\] 0 05, 
and so our logic does not contain $5. To see this, 
consider the following category, assuming F is a cate- 
gory-valued feature: {(F, O)}. This category satisfies 0 F 
but not \[\] O F. 
The category {(F, {(G, a)}), (H, {(G, b)})} (graphically 
represented in (50), below) provides us with an analo- 
gous falsifying instance for ~ O 1-105 ~ \[\] O 05 when we 
set 05 = (~:a). 
This shows that our logic does not contain $4.2. Inter- 
estingly, the converse of this constraint zs valid, hence: 
(A2) i. ~F-1005---~ OD05 
This is easy to demonstrate: if o~ satisfies \[\] O 05 then 0 05 
must hold in all the categories that terminate a, and if 
O 05 holds in those categories, then 4, and I-\]05 hold in 
them as well. So r-\]05 holds in at least one category in o~, 
and thus a must satisfy O D05. This shows that our logic 
at least contains K1 and, as a consequence, is not 
contained by SS. 
However, our logic cannot contain K2, since the 
latter contains S4.2. Nor does it contain K1.2 since the 
latter's characteristic axiom, namely ~ 05 ~ 1-1( O 05 ~ 05) 
is shown to be invalid by the category {(G, a), (F, {(G, b), 
(F, {(G, a)})})} (shown in (51), below) when set set 05 = (c: 
a). 
I 
In fact, our logic does not merely contain K1, it also 
contains KI.1, whose characteristic axiom is: 
(A2) j. ~Fq(D(05 --> D05) ---> 05) -o 05) 
Hughes and Cresswell note that KI.1 'is characterized 
by the class of all finite partial orderings, i.e., finite 
frames in which R \[the accessibility relation\] is reflex- 
ive, transitive, and antisymmetrical' Hughes and Cress- 
well ((1984), p. 162). So it should be no surprise, given 
the basis for our semantics, that our logic turns out to 
include KI.1. This logic, also known as S4Grz (after 
Grzegorczyk (1967)), 'is decidable, for every nontheo- 
rem of S4Grz is invalid in some finite weak partial 
ordering' (Boolos (1979, p. 167). 
Two further valid formula schemata of Lc have some 
interest, before we conclude the list of valid formula: in 
(A2): 
(A2) k. ~0 -Tf (for allfE F I) 
1. ~(f.'05)--> 005 (forallfEF 1) 
The first of these follows from the fact that categories 
are finite in size and thus ultimately grounded in cate- 
gories that contain no category-valued features: f must 
be false of these terminating embedded categories, and 
hence O --1 f must be true of the category as a whole. 
16 Computational Linguistics, Volume 14, Number 1, Winter 1988 
Gerald Gazdar et al. Category Structures 
The second states that if a category is defined for a 
category-valued feature whose value satisfies 4,, then 
the category as a whole satisfies O 4'. 
(A2) m. ~(f:th) ---~f (for allfE F I) 
n. ~((f:4,)A(f:~))~tf.'4,A0) (forallfEF ~) 
o. P((i2~b)V~q0)~--~(f:thVq0 (forallfEF l) 
It is worth considering the valid formulae one would 
get in certain restricted classes of category structures. 
Suppose we consider category structures which contain 
only atom-valued features (i.e., F = F°). In this case, as 
one would expect, the modal logic collapses into the 
propositional calculus and the relevant notion of valid- 
ity (call it Po) gives us the following: 
(A5) moth ,o ruth 
The converse case, where we only permit category- 
valued features (i.e. F = F1), is uninteresting, since it is 
not distinct from the general case. We can always 
encode atom-valued features as (sets of) category- 
valued features and subject the latter to appropriate 
constraints, as follows. For every feature specification 
(f, a) such thatfE F ° and a E p(f), we introduce a new 
type 1 feature fa and use the presence of 0Ca, 0) to 
encode the presence of (f, a) and likewise absence to 
encode absence. Then, for each pair of atoms a and b in 
p(f), we require the new features to satisfy \[\] -7 (fa A 
fb). And to constrain each new feature fa to have the 
empty set as its value, we stipulate \[\] -7 (fa:g) for every 
feature g. 
However, consider validity in category structures 
containing at most one category-valued feature (call this 
kind of validity ~ 1)- With this restriction, the $4.2 axiom 
considered earlier becomes valid: 
(A6) ~10\[N~b--~ \[\]O4, 
In addition, we get (A7). 
(A7) ~ ~\[\](\[\]t h ~ \[-\]~) V \[-l(f--\]q~--~ \[~th) 
This means that this restricted logic at least contains 
K3, but it cannot contain K4, since ~1~)~ (0\[~(~ "--> 
D~b) is falsified by the category {(G, a), (~" {(G, b), (~', {(G, 
a)})})} when we set ~b = (G: a). 
In fact it must also contain K3.1, in view of the validity 
of (A2j) above, and this logic, also known as S4.3Grz, is 
characterized by finite linear orderings Hughes and 
Cresswell (1984). This is the characterization we would 
expect given the character of the ~1 restriction on the 
form of permissible categories, since with only one 
category-valued feature, there is at most one path 
through the structure of a category and so the partial 
order becomes a linear order. These observations con- 
cerning the logic induced by category structures where 
IFll = 1 are of some potential relevance to the study of 
indexed grammars whose categories can be- construed 
as being restricted in just this way (see section 4.9, 
above). 
ACKNOWLEDGMENTS 
Chris Barker has contributed substantively to the re- 
search reported here, and we offer him our gratitude. 
We are also grateful to Edward Briscoe, Jeremy Carroll, 
Roger Evans, Joseph Halpern, David J. Israel, Ronald 
M. Kaplan, William Keller, James Kilbury, William A. 
Ladusaw, Christopher Mellish, Richard E. Otte, Fer- 
nando Pereira, P. Stanley Peters, Carl J. Pollard, Ste- 
phen Pulman, William Rounds, Stuart M. Shieber, 
Henry Thompson and Manfred Warmuth for their gen- 
erous assistance during the research reported in this 
paper. Though in some respects they have contributed 
substantially, they should not be associated with any 
errors that the paper may contain. In addition, we thank 
Calvin J. Pullum, who is responsible for the diagrams, 
and we acknowledge partial research support from the 
following sources: the UCSC Syntax Research Center 
(Gazdar, Hukari, Levine, Pullum); grants from the 
(U.K.) SERC and ESRC (Gazdar); NSF Graduate Fel- 
lowship RCD-8651747 (Carpenter); NSF grants BNS-85 
11687 and BNS-85 19708 (Pullum). 
NOTES 
1. Bresnan (1975) correctly attributes the \[-+N, -+V\] feature system to 
lectures delivered by Chomsky at the 1974 Linguistic Institute in 
Amherst, Massachusetts. In some works, e.g., Jackendoff (1977) 
and Gazdar, Klein, Pullum, and Sag (1985), Chomsky (1970) is 
wrongly given as the source. The latter work does, however, 
contain the following relevant comment: "we might just as well 
eliminate the distinction of feature and category, and regard all 
symbols of the grammar as sets of features" (p. 208). 
2. As Hendriks (1986) has noted, the definition of categories given in 
GKPS "is a bit of a mess from a formal point of view" (1986, p. 
19). Definition 1 reads as follows: ,,po is a function from F to 
POW(A) such that for allf~ (F-Atom), p°00 = {{}}" (GKPS, p. 
36). But {{}} is not in the power set of A; "POW(A)" should be 
replaced by "POW(A) O {{{}}}". Parts of the text and examples 
following Definition l assume correctly that it ends ,,pO(f) = {{}},,, 
but other parts assume incorrectly that it ends ,,po(f) = {},,. If the 
latter version were adopted, Definition 4 would fail to add 
category-valued feature specifications in the desired way (since 
the condition "3C' E ff'-~(t)\[C' C_ C\]" would never be satisfied 
wheren = 1.) 
3 The "feature cooccurrence restrictions" (FCRs) of GKPS form 
part of the definition of admissible tree rather than being part of the 
definition of categories. However, every GKPS FCR can be 
expressed in L c, and the translation is trivial. 
4. We are indebted to Joseph Haipern for his help with the material 
in this section. 
5. One of our referees has suggested that our semantics can be made 
to handle sharing by introducing an equality predicate into L c, 
marking shared value situations with special nonce features, and 
then using conditional constraints triggered by these features to 
impose identical values on the relevant features. But we have 
been unable to get any scheme of this kind to work in the general 
case. There appears to be no upper bound to the number of nonce 
features that may be required, and moreover, unification ceases 
to behave in an intuitively reasonable manner. 
Computational Linguistics, Volume 14, Number 1, Winter 1988 19 

REFERENCES 
Aho, Alfred V. 1968 Indexed Grammars. Journal of the Association 
for Computing Machinery 15: 647-671. 
Ait-Kaci, Hassan; and Nasr, Roger. 1986 Proceedings of the 13th 
Annual ACM Conference on Principles of Programming Lan- 
guages: 219-228. Association for Computing Machinery. 
Bach, Emmon. 1984 Some Generalizations of Categoilal Grammar. In 
Landman, Fred; and Veltman, Frank, Eds., Varieties of Formal 
Semantics: Proceedings of the 4th Amsterdam Colloquium, Sep- 
tember 1982, Foils, Dordrecht, Holland: 1-23. 
van Benthem, Johan. 1986a Semantic Automata. In Groenendijk, 
Joroen; de Jongh, Dick; and Stokhof, Martin, Eds., Information, 
Interpretation and Inference. Foils, Dordrecht, Holland. Re- 
printed in van Benthem, Johan. 1986 Essays in Logical Semantics. 
D. Reidel, Dordrecht, Holland: 151-176. \[Also published as CSLI 
Report 85-27, Center for the Study of Language and Information, 
Stanford, 1985\] 
van Benthem, Johan. 1986b Towards a Computational Semantics: In 
Cooper, Robin; Engdahl, Elisabet; and Gardenfors, P., Eds., 
Proceedings of a Workshop on Generalized Quantifiers, Lund 
1985. D. Reidel, Dordrecht, Holland. 
van Benthem, Johan. 1986c Categoilal Grammar. In Johan van 
Benthem. 1986 Essays in Logical Semantics. D. Reidel, Dor- 
drecht, Holland: 123-150. 
van Benthem, Johan; Buszkowski, W.; and Marciszewski, W., Eds., 
Categorial Grammar. John Benjamin, Amsterdam, Holland. 
Boolos, George. 1979 The Unprovability of Consistency. Cambridge 
University Press, Cambridge, England. 
Bresnan, Joan W. 1975 Transformations and Categories in Syntax. In 
Butts, Ronald; and Hintikka, Jaakko, Eds., Basic Problems in 
Methodology and Linguistics. D. Reidel, Dordrecht, Holland: 
283-304. 
Chomsky, Noam. 1970 Remarks on Nominalization. In Jacobs, R.; 
and Rosenbaum, P., Eds., Readings in English Transformational 
Grammar. Ginn, Waltham, Massachusetts: 11-61. 
Computational Linguistics, Volume 14, Number I, Winter 1988 17 
Gerald Gazdar et al. Category Structures 
Chomsky, Noam. 1980 On Binding. Linguistic Inquiry 11: 1--46. 
Chomsky, Noam. 1981 Lectures on Government and Binding. Dor- 
drecht: Foris. 
Elson, Benjamin; and Pickett, Velma. 1962 An Introduction to Mor- 
phology and Syntax. Summer Institute of Linguistics, Santa Ana, 
California. 
Gazdar, Gerald. 1985 Applicability of Indexed Grammars to Natural 
Languages. Center for the Study of Language and Information, 
Stanford, California: Report No. CSLI-85-34. 
Gazdar, Gerald; Klein, Ewan; Pullum, Geoffrey K.; and Sag, Ivan A. 
1985 Generalized Phrase Structure Grammar. Harvard University 
Press, Cambridge, Massachusetts. 
Grzegorczyk, Andrzej. 1967 Some Relational Systems and the Asso- 
ciated Topological Spaces. Fundamentae Mathematicae 60: 223- 
231. 
Haddock, Nicholas; Klein, Ewan; and Morrill, Glyn, Eds., 1987 
Categorial Grammar, Unification Grammar and Parsing. Edin- 
burgh Working Papers in Cognitive Science 1, Edinburgh, Scot- 
land. 
Halliday, Michael A. K. 1961 Categories of the Theory of Grammar. 
Word 17:241-292. 
Halvorsen, Per-Kristian; and Ladusaw, William A. 1979 Montague's 
'Universal Grammar': an Introduction for the Linguist. Linguis- 
tics and Philosophy 3: 185-223. 
Harman, Gilbert H. 1963 Generative Grammars without Transforma- 
tion Rules: a Defense of Phrase Structure. Language 39: 597-616. 
Harris, Zellig S. 1951 Methods in Structural Linguistics. University of 
Chicago Press, Chicago, Illinois. 
Hendriks, Herman. 1986 Foundations of GPSG Syntax. Doctoraal- 
scriptie Wijsbegeerte, University of Amsterdam, Amsterdam, 
Holland. 
Hornstein, Norbert. 1977 S and X' Convention. Linguistic Analysis 3: 
137-176. 
Hudson, Richard A. 1971 English Complex Sentences. North Hol- 
land, Amsterdam, Holland. 
Hughes, G. E.; and Cresswell, Max J. 1968 An Introduction to Modal 
Logic. Methuen, London, England. 
Hughes, G. E.; and Cresswell, Max J. 1984 A Companion to Modal 
Logic. Methuen, London, England. 
Jackendoff, Ray. 1974 Introduction to the X Convention. Indiana 
University Linguistics Club, Bloomington, Indiana. 
Jackendoff, Ray. 1977 X Syntax: A Study of Phrase Structure. MIT 
Press, Cambridge, Massachusetts. 
Johnson, David E.; and Postal, Paul M. 1980 Arc Pair Grammar. 
Princeton University Press, Princeton, New Jersey. 
Kaplan, Ronald; and Bresnan, Joan. 1982 Lexical-Functional Gram- 
mar: a Formal System for Grammatical Representation. In J. W. 
Bresnan, Ed., The Mental Representation of Grammatical Rela- 
tions. MIT Press, Cambridge, Massachusetts: 173-281. 
Karttunen, Lauri. 1984 Features and Values. Proceedings of the lOth 
International Conference on Computational Linguistics and the 
22nd Annual Meeting of the Association for Computational Lin- 
guistics. Stanford University, California: 28-33. 
Karttunen, Lauri; and Zwicky, Arnold M. 1985 Introduction to 
Dowty, D.R.; Karttunen, L.; and Zwicky, A.M., Eds., Natural 
Language Parsing. Cambridge University Press, Cambridge, En- 
gland: 1-25. 
Kasper, Robert T.; and Rounds, William C. 1986 A Logical Semantics 
for Feature Structures. In Proceedings of the 24th Annual Meeting 
of the Association for Computational Linguistics: 257-266. 
Kay, Martin. 1979 Functional Grammar. In Chiarrello, Christine et 
al., Eds., Proceedings of the 5th Annual Meeting of the Berkeley 
Linguistics Society: 142-158. 
Kay, Martin. 1985 Parsing in Functional Unification Grammar. In 
Dowty, D.R.; Karttunen, L.; and Zwicky, A.M., Eds., Natural 
Language Parsing. Cambridge University Press, Cambridge, En- 
gland: 251-278. 
Ladusaw, William A. 1985 A Proposed Distinction Between Levels 
and Strata. Presented to the Annual Meeting of the Linguistic 
Society of America, Seattle, Washington. Memo no. SRC-85-04, 
Syntax Research Center, University of California, Santa Cruz, 
California. 
Lasnik, Howard; and Kupin, Joseph J. 1977 A Restrictive Theory of 
Transformational Grammar. Theoretical Linguistics 4: 173-196. 
Levy, Leon S.; and Joshi, Aravind, K. 1978 Skeletal Structural 
Descriptions. Information and Control 39:192-211. 
Longacre, Robert E. 1965 Some Fundamental Insights of Tagmemics. 
Language 41:65-76. 
Mellish, Christopher. 1986 Implementing Systemic Classification by 
Unification. Manuscript, University of Sussex. 
Montague, Richard. 1970 Universal Grammar. In Thomason, Rich- 
mond H., Ed., Formal Philosophy. Yale University Press, New 
Haven, Connecticut: 222-246. 
Montague, Richard. 1973 The Proper Treatment of Quantification in 
Ordinary English. In Thomason, Richmond H., Ed., Formal 
Philosophy. Yale University Press, New Haven, Connecticut: 
247-270. 
Moshier, M. D., and Rounds, William C. 1987 A Logic for Partially 
Specified Data Structures. Proceedings of the ACM Conference 
on Principles of Programming Languages, Munich. 
Oehrle, Richard T.; Bach, Emmon; and Wheeler, Deirdre W., Eds., 
1987 Categorial Grammars and Natural Language Structures, D. 
Reidel, Dordrecht, Holland. 
Patten, Terry; and Ritchie, Graeme. 1987 A Formal Model of Sys- 
temic Grammar. In Kempen, Gerard, Ed., Natural Language 
Generation: Recent Advances in AI, Psychology and Linguistics. 
Kluwer, Amsterdam, Holland. 
Pereira, Fernando C. N.; and Shieber, Stuart M. 1984 The Semantics 
of Grammar Formalisms Seen as Computer Languages. In Pro- 
ceedings of the lOth International Conference on Computational 
Linguistics and the 22nd Annual Meeting of the Association for 
Computational Linguistics: 123-129. 
Pereira, Fernando C. N.; and Warren, David H. D. 1980 Definite 
Clause Grammars for Language Analysis--a Survey of the For- 
malism and a Comparison with Augmented Transition Networks. 
Artificial Intelligence 13: 231-278. 
Perlmutter, David M.; and Postal, Paul M. 1977 Toward a Universal 
Characterization of Passivization. In Whistler, Kenneth et al., 
Eds., Proceedings of the 3rd Annual Meeting of the Berkeley 
Linguistics Society 394--417. Reprinted in: Perlmutter, David M., 
Ed., Studies in Relational Grammar 1. University of Chicago 
Press, Chicago, Illinois. 
Pollard, Carl J. 1984 Generalized Phrase Structure Grammars, Head 
Grammars, and Natural Languages. Ph.D. dissertation, Stanford 
University. 
Pollard, Carl. 1985 Phrase Structure Grammar Without Metarules. 
Goldberg, Jeffrey; MacKaye, Susannah; and Wescoat, Michael, 
Eds., Proceedings of the West Coast Conference on Formal 
Linguistics, Volume Four. Stanford Linguistics Association, Stan- 
ford, California: 246-261. 
Postal, Paul M. 1964 Constituent Structure: A Study of Contemporary 
Models of Syntactic Description. Publication 30 of the Indiana 
University Research Center in Anthropology, Folklore, and Lin- 
guistics, Bloomington, Indiana. 
Pullum, Geoffrey K. 1985 Assuming Some Version of X-Bar Theory. 
In Eilfort, William D.; Kroeber, Paul D.; Peterson, Karen L., 
Eds., CLS 21, Part 1: Papers from the General Session at the 
Twenty-First Regional Meeting. Chicago Linguistic Society, Chi- 
cago, Illinois: 323-353. 
Ristad, Eric Sven. 1986 Computational Complexity of Current GPSG 
Theory. Proceedings of the 24th Annual Meeting of the Associa- 
tion for Computational Linguistics: 30-39. 
Rounds, William C.; and Kasper, Robert T. 1986 A Complete Logical 
Calculus for Record Structures Representing Linguistic Informa- 
18 Computational Linguistics, Volume 14, Number I, Winter 1988 
Gerald Gazdar et al. Category Structures 
tion. Proceedings of the 15th Annual Symposium on Logic in 
Computer Science. Cambridge, Massachusetts. 
Rouveret, Alain; and Vergnaud, Jean-Roger. 1980 Specifying Refer- 
ence to the Subject: French Causatives and Conditions on Repre- 
sentations. Linguistic Inquiry 11: 97-202. 
Sabimana, Firmard. 1986 The Relational Structure of the Kirundi 
Verb. D.Phil. dissertation, Indiana University, Bloomington, In- 
diana. 
Shieber, Stuart. 1984 The Design of a Computer Language for 
Linguistic Information. In Proceedings of the lOth International 
Conference on Computational Linguistics and the 22nd Annual 
Meeting of the Association for Computational Linguistics: 362- 
366. 
Shieber, Stuart. 1985 Criteria for Designing Computer Facilities for 
Linguistic Analysis. Linguistics 23:189-211. 
Shieber, Stuart. 1987 Separating Linguistic Analyses from Linguistic 
Theories. In Whitelock, Peter J. et ai., Eds., Linguistic Theory and 
Computer Applications. Academic Press, London. 
Stockwell, Robert P.; Schacter, Paul; Partee, Barbara H. 1973 The 
Major Syntactic Structures of English. Holt, Rinehart and Win- 
ston, New York, New York. 
Winograd, Terry. 1972 Understanding Natural Language. Academic 
Press, New York, New York. 
Winograd, Terry. 1983 Language as a Cognitive Process: Volume 1 
Syntax. Addison-Wesley, Reading, Massachusetts. 
