The Semantics of Grammar Formalisms 
Seen as Computer Languages 
Fernando C. N. Pereira and Stuart M. Shieber 
Artificial Intelligence Center 
SRI International 
and 
Center for the Study of Language and Information 
Stanford University 
Abstract 
The design, implementation, and use of grammar formalisms
for natural language have constituted a major
branch of computational linguistics throughout its development.
By viewing grammar formalisms as just a special
case of computer languages, we can take advantage of
the machinery of denotational semantics to provide a precise
specification of their meaning. Using Dana Scott's domain
theory, we elucidate the nature of the feature systems
used in augmented phrase-structure grammar formalisms,
in particular those of recent versions of generalized phrase
structure grammar, lexical functional grammar and PATR-II,
and provide a denotational semantics for a simple grammar
formalism. We find that the mathematical structures
developed for this purpose contain an operation of feature 
generalization, not available in those grammar formalisms, 
that can be used to give a partial account of the effect of 
coordination on syntactic features. 
1. Introduction¹
The design, implementation, and use of grammar formalisms
for natural language have constituted a major
branch of computational linguistics throughout its development.
However, notwithstanding the obvious superficial
similarity between designing a grammar formalism and
designing a programming language, the design techniques
used for grammar formalisms have almost always fallen
short with respect to those now available for programming
language design.
Formal and computational linguists most often explain 
the effect of a grammar formalism construct either by ex- 
ample or through its actual operation in a particular im- 
plementation. Such practices are frowned upon by most 
programming-language designers; they become even more 
dubious if one considers that most grammar formalisms 
in use are based either on a context-free skeleton with 
augmentations or on some closely related device (such as 
ATNs), consequently making them obvious candidates for
a declarative semantics² extended in the natural way from
the declarative semantics of context-free grammars.
¹The research reported in this paper has been made possible by a gift
from the System Development Foundation.
The last point deserves amplification. Context-free 
grammars possess an obvious declarative semantics in 
which nonterminals represent sets of strings and rules rep- 
resent n-ary relations over strings. This is brought out by 
the reinterpretation familiar from formal language theory 
of context-free grammars as polynomials over concatena- 
tion and set union. The grammar formalisms developed 
from the definite-clause subset of first order logic are the 
only others used in natural-language analysis that have 
been accorded a rigorous declarative semantics, in this
case derived from the declarative semantics of logic programs
[3,12,11].
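The reading of a context-free grammar as a system of polynomial equations over concatenation and set union can be made concrete with a small sketch. The grammar, the function names and the length bound `max_len` below are illustrative assumptions, not part of the original account; the bound only serves to make the iteration terminate on a finite approximation of the language.

```python
from itertools import product

def step(grammar, env, max_len):
    """Apply every rule once: union over alternatives of concatenations."""
    new_env = {}
    for nonterminal, alternatives in grammar.items():
        strings = set()
        for rhs in alternatives:
            # A terminal denotes a singleton set; a nonterminal denotes
            # the set of strings currently assigned to it.
            factors = [env[sym] if sym in grammar else {sym} for sym in rhs]
            for combo in product(*factors):
                s = "".join(combo)
                if len(s) <= max_len:  # assumption: truncate to keep sets finite
                    strings.add(s)
        new_env[nonterminal] = strings
    return new_env

def least_fixed_point(grammar, max_len):
    """Iterate from the empty assignment until nothing changes."""
    env = {nonterminal: set() for nonterminal in grammar}
    while True:
        nxt = step(grammar, env, max_len)
        if nxt == env:
            return env
        env = nxt

# Hypothetical grammar: S -> a S b | a b
GRAMMAR = {"S": [["a", "S", "b"], ["a", "b"]]}
LANGUAGE = least_fixed_point(GRAMMAR, max_len=6)
```

Each nonterminal denotes a set of strings, and each rule contributes one product term to the equation for its left-hand side; the language is the least solution of the resulting system.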
Much confusion, wasted effort, and dissension have re- 
sulted from this state of affairs. In the absence of a rigorous 
semantics for a given grammar formalism, the user, critic, 
or implementer of the formalism risks misunderstanding the 
intended interpretation of a construct, and is in a poor posi- 
tion to compare it to alternatives. Likewise, the inventor of 
a new formalism can never be sure of how it compares with 
existing ones. As an example of these difficulties, two simple
changes in the implementation of the ATN formalism,
the addition of a well-formed substring table and the use
of a bottom-up parsing strategy, required a rather subtle
and unanticipated reinterpretation of the register-testing
and -setting actions, thereby imparting a different meaning
to grammars that had been developed for initial top-down
backtrack implementation [22].
Rigorous definitions of grammar formalisms can and
should be made available. Looking at grammar formalisms
as just a special case of computer languages, we can take
advantage of the machinery of denotational semantics [20]
to provide a precise specification of their meaning. This
approach can elucidate the structure of the data objects
manipulated by a formalism and the mathematical relationships
among various formalisms, suggest new possibilities
for linguistic analysis (the subject matter of the formalisms),
and establish connections between grammar formalisms
and such other fields of research as programming-language
design and theories of abstract data types. This
last point is particularly interesting because it opens up
several possibilities, among them that of imposing a type
discipline on the use of a formalism, with all the attendant
advantages of compile-time error checking, modularity, and
optimized compilation techniques for grammar rules, and
that of relating grammar formalisms to other knowledge
representation languages [1].
²This use of the term "semantics" should not be confused with the
more common usage denoting that portion of a grammar concerned
with the meaning of object sentences. Here we are concerned with the
meaning of the metalanguage.
As a specific contribution of this study, we elucidate 
the nature of the feature systems used in augmented phrase- 
structure grammar formalisms, in particular those of recent 
versions of generalized phrase structure grammar (GPSG)
[5,15], lexical functional grammar (LFG) [2] and PATR-II
[18,17]; we find that the mathematical structures developed
for this purpose contain an operation of feature generaliza- 
tion, not available in those grammar formalisms, that can 
be used to give a partial account of the effect of coordina- 
tion on syntactic features. 
Just as studies in the semantics of programming lan- 
guages start by giving semantics for simple languages, so 
we will start with simple grammar formalisms that capture 
the essence of the method without an excess of obscuring 
detail. The present enterprise should be contrasted with 
studies of the generative capacity of formalisms using the 
techniques of formal language theory. First, a precise definition
of the semantics of a formalism is a prerequisite for such
generative-capacity studies, and this is precisely what we 
are trying to provide. Second, generative capacity is a very 
coarse gauge: in particular, it does not distinguish among 
different formalisms with the same generative capacity that 
may, however, have very different semantic accounts. Fi- 
nally, the tools of formal language theory are inadequate to 
describe at a sufficiently abstract level formalisms that are 
based on the simultaneous solution of sets of constraints 
[9,10]. An abstract analysis of those formalisms requires a
notion of partial information that is precisely captured by
the constructs of denotational semantics.
2. Denotational Semantics 
In broad terms, denotational semantics is the study of 
the connection between programs and mathematical enti- 
ties that represent their input-output relations. For such 
an account to be useful, it must be compositional, in the 
sense that the meaning of a program is developed from the 
meanings of its parts by a fixed set of mathematical oper- 
ations that correspond directly to the ways in which the 
parts participate in the whole. 
For the purposes of the present work, denotational se- 
mantics will mean the semantic domain theory initiated 
by Scott and Strachey [20]. In accordance with this approach,
the meanings of programming language constructs
are certain partial mappings between objects that represent 
partially specified data objects or partially defined states of 
computation. The essential idea is that the meaning of a 
construct describes what information it adds to a partial 
description of a data object or of a state of computation. 
Partial descriptions are used because computations in gen- 
eral may not terminate and may therefore never produce a 
fully defined output, although each individual step may be 
adding more and more information to a partial description 
of the undeliverable output. 
Domain theory is a mathematical theory of consider- 
able complexity. Potential nontermination and the use of 
functions as "first-class citizens" in computer languages ac- 
count for a substantial fraction of that complexity. If, as is 
the case in the present work, neither of those two aspects 
comes into play, one may be justified in asking why such 
a complex apparatus is used. Indeed, both the semantics 
of context-free grammars mentioned earlier and the seman- 
tics of logic grammars in general can be formulated using 
elementary set theory [7,21].
However, using the more complex machinery may be 
beneficial for the following reasons: 
• Inherent partiality: many grammar formalisms operate
in terms of constraints between elements that do
not fully specify all the possible features of an element.
• Technical economy: results that require laborious
constructions without utilizing domain theory can be
reached trivially by using standard results of the theory.
• Suggestiveness: domain theory brings with it a rich
mathematical structure that suggests useful operations
one might add to a grammar formalism.
• Extensibility: unlike a domain-theoretic account, a
specialized semantic account, say in terms of sets,
may not be easily extended as new constructs are
added to the formalism.
3. The Domain of Feature Structures
We will start with an abstract denotational description 
of a simple feature system which bears a close resemblance 
to the feature systems of GPSG, LFG and PATR-II, al- 
though this similarity, because of its abstractness, may not 
be apparent at first glance. Such feature systems tend to 
use data structures or mathematical objects that are more 
or less isomorphic to directed graphs of one sort or an- 
other, or, as they are sometimes described, partial func- 
tions. Just what the relation is between these two ways 
of viewing things will be explained later. In general, these 
graph structures are used to encode linguistic information 
in the form of attribute-value pairs. Most importantly, partial
information is critical to the use of such systems, for
instance, in the variables of definite clause grammars [12]
and in the GPSG analysis of coordination [15]. That is, the
elements of the feature systems, called feature structures
(alternatively, feature bundles, f-structures [2], or terms),
can be partial in some sense. The partial descriptions, being
in a domain of attributes and complex values, tend to be
equational in nature: some feature's value is equated with
some other value. Partial descriptions can be understood
in one of two ways: either the descriptions represent sets
of fully specified elements of an underlying domain or they
are regarded as participating in a relationship of partiality
with respect to each other. We will hold to the latter view
here.
What are feature structures from this perspective? 
They are repositories of information about linguistic enti- 
ties. In domain-theoretic terms, the underlying domain of 
feature structures F is a recursive domain of partial func- 
tions from a set of labels L (features, attribute names, at- 
tributes) to complex values or primitive atomic values taken 
from a set C of constants. Expressed formally, we have the 
domain equation 
$$F = [L \to F] + C$$
The solution of this domain equation can be understood as
a set of trees (finite or infinite) with branches labeled by
elements of L, and with other trees or constants as nodes.
The branches $l_1, \ldots, l_m$ from a node $n$ point to the values
$n(l_1), \ldots, n(l_m)$ for which the node, as a partial function, is
defined.
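For finite trees, the domain equation can be illustrated with nested dictionaries: a dictionary plays the role of a partial function from labels to values, and a string plays the role of a constant. This is only a sketch under that assumption; the names `value_at` and `UTHER` are hypothetical, and infinite trees are not representable this way.

```python
from typing import Optional, Sequence, Union

# An element of F: either a constant from C (a str) or a partial
# function from labels in L to further elements of F (a dict).
Feature = Union[str, dict]

def value_at(f: Feature, path: Sequence[str]) -> Optional[Feature]:
    """Follow the labels in `path` from f; return None where f is undefined."""
    for label in path:
        if not isinstance(f, dict) or label not in f:
            return None  # the partial function is undefined on this label
        f = f[label]
    return f

# Hypothetical feature structure for a third-person singular noun phrase.
UTHER = {"agr": {"num": "sg", "per": "3"}}
```

Here `value_at(UTHER, ["agr", "num"])` yields the constant `"sg"`, while a path through an unknown label, or a path that continues below a constant, is simply undefined.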
4. The Domain of Descriptions 
What the grammar formalism does is to talk about F, 
not in F. That is, the grammar formalism uses a domain of 
descriptions of elements of F. From an intuitive standpoint, 
this is because, for any given phrase, we may know facts 
about it that cannot be encoded in the partial function 
associated with it.
A partial description of an element n of F will be a set 
of equations that constrain the values of n on certain labels. 
In general, to describe an element $x \in F$ we have equations
of the following forms:
$$(\cdots(x(l_{i_1}))\cdots)(l_{i_m}) = (\cdots(x(l_{j_1}))\cdots)(l_{j_n})$$
$$(\cdots(x(l_{i_1}))\cdots)(l_{i_m}) = c_k ,$$
which we prefer to write as
$$\langle l_{i_1} \cdots l_{i_m} \rangle = \langle l_{j_1} \cdots l_{j_n} \rangle$$
$$\langle l_{i_1} \cdots l_{i_m} \rangle = c_k$$
with $x$ implicit. The terms of such equations are constants
$c \in C$ or paths $\langle l_{i_1} \cdots l_{i_m} \rangle$, which we identify in what follows
with strings in $L^*$. Taken together, constants and paths
comprise the descriptors.
Using Scott's information systems approach to domain 
construction [16], we can now build directly a characterization
of feature structures in terms of information-bearing
elements, equations, that engender a system complete with 
notions of compatibility and partiality of information. 
The information system $\mathcal{D}$ describing the elements of
F is defined, following Scott, as the tuple
$$\mathcal{D} = (P, \Delta, \mathrm{Con}, \vdash) ,$$
where $P$ is a set of propositions, Con is a set of finite subsets
of $P$, the consistent subsets, $\vdash$ is an entailment relation
between elements of Con and elements of $P$, and $\Delta$ is a
special least informative element that gives no information
at all. We say that a subset $S$ of $P$ is deductively closed
if every proposition entailed by a consistent subset of $S$ is
in $S$. The deductive closure $\overline{S}$ of $S \subseteq P$ is the smallest
deductively closed subset of $P$ that contains $S$.
The descriptor equations discussed earlier are the 
propositions of the information system for feature structure 
descriptions. Equations express constraints among feature 
values in a feature structure and the entailment relation 
encodes the reflexivity, symmetry, transitivity and substi- 
tutivity of equality. More precisely, we say that a finite set 
of equations E entails an equation e if 
• Membership: $e \in E$
• Reflexivity: $e$ is $\Delta$ or $d = d$ for some descriptor $d$
• Symmetry: $e$ is $d_1 = d_2$ and $d_2 = d_1$ is in $E$
• Transitivity: $e$ is $d_1 = d_2$ and there is a descriptor $d$
such that $d_1 = d$ and $d = d_2$ are in $E$
• Substitutivity: $e$ is $d_1 = p_1 \cdot d_2$ and both $p_1 = p_2$ and
$d_1 = p_2 \cdot d_2$ are in $E$
• Iteration: there is $E' \subseteq E$ such that $E' \vdash e$ and for all
$e' \in E'$, $E \vdash e'$
With this notion of entailment, the most natural definition
of the set Con is that a finite subset $E$ of $P$ is consistent if
and only if it does not entail an inconsistent equation, which
has the form $c_1 = c_2$, with $c_1$ and $c_2$ as distinct constants.
An arbitrary subset of $P$ is consistent if and only if all
its finite subsets are consistent in the way defined above.
The consistent and deductively closed subsets of $P$ ordered
by inclusion form a complete partial order or domain D,
our domain of descriptions of feature structures.
Deductive closure is used to define the elements of D 
so that elements defined by equivalent sets of equations are 
the same. In the rest of this paper, we will specify elements 
of D by convenient sets of equations, leaving the equations 
in the closure implicit. 
The inclusion order $\sqsubseteq$ in D provides the notion of
a description being more or less specific than another.
The least-upper-bound operation $\sqcup$ combines two descriptions
into the least instantiated description that satisfies
the equations in both descriptions, their unification. The
greatest-lower-bound operation $\sqcap$ gives the most instantiated
description containing all the equations common to
two descriptions, their generalization.
The foregoing definition of consistency may seem very 
natural, but it has the technical disadvantage that, in gen- 
eral, the union of two consistent sets is not itself a consistent 
set; therefore, the corresponding operation of unification 
may not be defined on certain pairs of inputs. Although 
this does not cause problems at this stage, it fails to deal 
with the fact that failure to unify is not the same as lack of 
definition and causes technical difficulties when providing 
rule denotations. We therefore need a slightly less natural 
definition. 
First we add another statement to the specification of 
the entailment relation: 
• Falsity: if $e$ is inconsistent, $\{e\}$ entails every element
of $P$.
That is, falsity entails anything. Next we define Con to be
simply the set of all finite subsets of $P$. The set Con no
longer corresponds to sets of equations that are consistent
in the usual equational sense.
With the new definitions of Con and $\vdash$, the deductive
closure of a set containing an inconsistent equation is the
whole of $P$. The partial order D is now a lattice with top
element $\top = P$, and the unification operation $\sqcup$ is always
defined and returns $\top$ on unification failure.
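The lattice operations can be sketched for a deliberately restricted case: descriptions that contain only equations of the form path = constant, represented as dictionaries from paths (tuples of labels) to constants. This restriction, the sentinel `TOP` and the function names are assumptions made for illustration; equations between two paths, and hence the full deductive closure, are not modelled here.

```python
# Sentinel standing for the inconsistent description (the top element).
TOP = object()

def unify(d1, d2):
    """Least upper bound: all equations of both, or TOP on a clash."""
    if d1 is TOP or d2 is TOP:
        return TOP
    merged = dict(d1)
    for path, const in d2.items():
        if merged.get(path, const) != const:
            return TOP  # two distinct constants equated: falsity
        merged[path] = const
    return merged

def generalize(d1, d2):
    """Greatest lower bound: the equations common to both descriptions."""
    if d1 is TOP:
        return d2  # TOP contains every equation
    if d2 is TOP:
        return d1
    return {path: c for path, c in d1.items() if d2.get(path) == c}
```

With this representation unification is total, as in the text: a clash does not leave the result undefined but yields the top element, which can then be passed on to further operations.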
We can now define the description mapping $\delta : D \to F$
that relates descriptions to the described feature structures.
The idea is that, in proceeding from a description $d \in D$ to
a feature structure $f \in F$, we keep only definite information
about values and discard information that only states
value constraints, but does not specify the values themselves.
More precisely, seeing $d$ as a set of equations, we
consider only the subset $\lfloor d \rfloor$ of $d$ with elements of the form
$$\langle l_1 \cdots l_m \rangle = c_k .$$
Each $e \in \lfloor d \rfloor$ defines an element $f(e)$ of F by the equations
$$f(e)(l_1) = f_1$$
$$f_{i-1}(l_i) = f_i$$
$$f_{m-1}(l_m) = c_k ,$$
with each of the $f_i$ undefined for all other labels. Then, we
can define $\delta(d)$ as
$$\delta(d) = \bigsqcup_{e \in \lfloor d \rfloor} f(e)$$
This description mapping can be shown to be continuous
in the sense of domain theory, that is, it has the properties
that increasing information in a description leads
to nondecreasing information in the described structures
(monotonicity) and that if a sequence of descriptions approximates
another description, the same condition holds
for the described structures.
Note that $\delta$ may map several elements of D on to one
element of F. For example, the elements given by the two
sets of equations
$$\langle f\ h \rangle = c \qquad \langle g\ i \rangle = c$$
describe the same structure, because the description mapping
ignores the link between $\langle f\ h \rangle$ and $\langle g\ i \rangle$ in the first
description. Such links are useful only when unifying with
further descriptive elements, not in the completed feature
structure, which merely provides feature-value assignments.
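A rough sketch of the description mapping follows. It assumes that paths are non-empty tuples of labels and constants are strings, and it closes the equations under reflexivity, symmetry and transitivity only (via union-find); substitutivity is deliberately left out, so this is a simplification of the entailment relation rather than an implementation of it. The names `closure_classes`, `delta` and the example equation sets are hypothetical.

```python
def closure_classes(equations):
    """Group descriptors into classes of provably equal descriptors."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for lhs, rhs in equations:
        parent[find(lhs)] = find(rhs)
    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return list(classes.values())

def delta(equations):
    """Keep only path = constant information and build the feature tree."""
    tree = {}
    for cls in closure_classes(equations):
        consts = {d for d in cls if isinstance(d, str)}
        if len(consts) != 1:
            continue  # no definite value (clashes are not handled in this sketch)
        (const,) = consts
        for path in (d for d in cls if isinstance(d, tuple)):
            node = tree
            for label in path[:-1]:
                node = node.setdefault(label, {})
            node[path[-1]] = const
    return tree

# Hypothetical descriptions: the first links two paths and adds a pure
# constraint between <k> and <m>; the second only assigns values.
LINKED = [(("f", "h"), ("g", "i")), (("g", "i"), "c"), (("k",), ("m",))]
ASSIGNED = [(("f", "h"), "c"), (("g", "i"), "c")]
```

Both equation sets yield the same tree: the constraint between `("k",)` and `("m",)` states no value and is discarded, which is the many-to-one behaviour of the mapping in miniature.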
Informally, we can think of elements of D as directed
rooted graphs and of elements of F as their unfoldings as
trees, the unfolding being given by the mapping $\delta$. It is
worth noting that if a description is cyclic, that is, if it has
cycles when viewed as a directed graph, then the resulting
feature tree will be infinite.³
Stated more precisely, an element $f$ of a domain is finite
if for any ascending sequence $\{d_i\}$ such that $f \sqsubseteq \bigsqcup_i d_i$,
there is an $i$ such that $f \sqsubseteq d_i$. Then the cyclic elements
of D are those finite elements that are mapped by $\delta$ into
nonfinite elements of F.
5. Providing a Denotation for a Grammar
We now move on to the question of how the domain D 
is used to provide a denotational semantics for a grammar 
formalism. 
We take a simple grammar formalism with rules consisting
of a context-free part over a nonterminal vocabulary
$\mathcal{N} = \{N_1, \ldots, N_k\}$ and a set of equations over paths in
$([0..\infty] \cdot L^*) \cup C$. A sample rule might be
$$S \to NP\ VP$$
$$\langle 0\ \mathit{subj} \rangle = \langle 1 \rangle$$
$$\langle 0\ \mathit{predicate} \rangle = \langle 2 \rangle$$
$$\langle 1\ \mathit{agr} \rangle = \langle 2\ \mathit{agr} \rangle$$
This is a simplification of the rule format used in the PATR-II
formalism [18,17]. The rule can be read as "an S is an
NP followed by a VP, where the subject of the S is the
NP, its predicate the VP, and the agreement of the NP
the same as the agreement of the VP".
More formally, a grammar is a quintuple
$G = (\mathcal{N}, S, L, C, R)$, where
• $\mathcal{N}$ is a finite, nonempty set of nonterminals $N_1, \ldots, N_k$
• $S$ is the set of strings over some alphabet (a flat domain
with an ancillary continuous function concatenation,
notated with the symbol $\cdot$).
• $R$ is a set of pairs $r = \langle N_{r_0} \to N_{r_1} \cdots N_{r_m}, E_r \rangle$,
where $E_r$ is a set of equations between elements of
$([0..m] \cdot L^*) \cup C$.
As with context-free grammars, local ambiguity of a 
grammar means that in general there are several ways of 
assembling the same subphrases into phrases. Thus, the
semantics of context-free grammars is given in terms of 
sets of strings. The situation is somewhat more compli- 
cated in our sample formalism. The objects specified by 
the grammar are pairs of a string and a partial description. 
Because of partiality, the appropriate construction cannot 
be given in terms of sets of string-description pairs, but 
rather in terms of the related domain construction of powerdomains
[14,19,16]. We will use the Hoare powerdomain
$P = P_M(S \times D)$ of the domain $S \times D$ of string-description
pairs. Each element of $P$ is an approximation of a transduction
relation, which is an association between strings and
their possible descriptions.
³More precisely a rational tree, that is, a tree with a finite number of
distinct subtrees.
We can get a feeling for what the domain $P$ is doing
by examining our notion of lexicon. A lexicon will be an
element of the domain $P^k$, associating with each of the $k$
nonterminals $N_i$, $1 \le i \le k$, a transduction relation from the
corresponding coordinate of $P^k$. Thus, for each nonterminal,
the lexicon tells us what phrases are under that nonterminal
and what possible descriptions each such phrase
has. Here is a sample lexicon:
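NP: $\{\ \langle$ "Uther", $\{\langle \mathit{agr}\ \mathit{num} \rangle = \mathit{sg},\ \langle \mathit{agr}\ \mathit{per} \rangle = 3\} \rangle$,
      $\langle$ "many knights", $\{\langle \mathit{agr}\ \mathit{num} \rangle = \mathit{pl},\ \langle \mathit{agr}\ \mathit{per} \rangle = 3\} \rangle\ \}$
VP: $\{\ \langle$ "storms Cornwall", $\{\langle \mathit{agr}\ \mathit{num} \rangle = \mathit{sg}\} \rangle$,
      $\langle$ "sit at the Round Table", $\{\langle \mathit{agr}\ \mathit{num} \rangle = \mathit{pl}\} \rangle\ \}$
S: $\{\}$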
By decomposing the effect of a rule into appropriate
steps, we can associate with each rule $r$ a denotation
$$[\![r]\!] : P^k \to P^k$$
that combines string-description pairs by concatenation
and unification to build new string-description pairs for the
nonterminal on the left-hand side of the rule, leaving all
other nonterminals untouched. By taking the union of the
denotations of the rules in a grammar (which is a well-defined
and continuous powerdomain operation), we get a
mapping
$$T_G(e) \stackrel{\mathrm{def}}{=} \bigcup_{r \in R} [\![r]\!](e)$$
from $P^k$ to $P^k$ that represents a one-step application of all
the rules of G "in parallel."
We can now provide a denotation for the entire grammar
as a mapping that completes a lexicon with all the
derived phrases and their descriptions. The denotation of
a grammar is the function that maps each lexicon $e$ into the
smallest fixed point of $T_G$ containing $e$. The fixed point is
defined by
$$[\![G]\!](e) = \bigsqcup_{i=0}^{\infty} T_G^i(e) ,$$
as $T_G$ is continuous.
It remains to describe the decomposition of a rule's effect
into elementary steps. The main technicality to keep in
mind is that rules state constraints among several descriptions
(associated with the parent and each child), whereas
a set of equations in D constrains but a single description.
This mismatch is solved by embedding the tuple
$(d_0, \ldots, d_m)$ of descriptions in a single larger description,
as expressed by
$$\langle i \rangle = d_i, \quad 0 \le i \le m ,$$
and only then applying the rule constraints, now viewed as
constraining parts of a single description. This is done by
the indexing and combination steps described below. The
rest of the work of applying a rule, extracting the result, is
done by the projection and deindexing steps.
The four steps for applying a rule
$$r = \langle N_{r_0} \to N_{r_1} \cdots N_{r_m}, E_r \rangle$$
to string-description pairs $\langle s_1, d_1 \rangle, \ldots, \langle s_k, d_k \rangle$ are as follows.
First, we index each $d_{r_j}$ into $d_j^j$ by replacing every
path $p$ in any of its equations with the path $j \cdot p$. We
then combine these indexed descriptions with the rule by
unifying the deductive closure of $E_r$ with all the indexed
descriptions:
$$d = \overline{E_r} \sqcup \bigsqcup_{j=1}^{m} d_j^j$$
We can now project $d$ by removing from it all equations
with paths that do not start with 0. It is evident
that the result $d^0$ is still deductively closed. Finally, $d^0$ is
deindexed into $d_{r_0}$ by removing 0 from the front of all paths
$0 \cdot p$ in its equations. The pair associated with $N_{r_0}$ is then
$\langle s_{r_1} \cdots s_{r_m}, d_{r_0} \rangle$.
It is not difficult to show that the above operations
can be lifted into operations over elements of $P^k$ that leave
untouched the coordinates not mentioned in the rule and
that the lifted operations are continuous mappings. With
a slight abuse of notation, we can summarize the foregoing
discussion with the equation
$$[\![r]\!] = \mathit{deindex} \circ \mathit{project} \circ \mathit{combine}_r \circ \mathit{index}_r$$
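The four steps can be sketched operationally on a single pair of children. The sketch below is an assumption-laden illustration, not the denotation itself: descriptions are kept as a graph with union-find nodes (a standard way to realize substitutivity), only path = constant equations are extracted at the end, unification failure is reported as `None` in place of the top element, and the class and function names are hypothetical.

```python
class Description:
    """Path equations kept as a graph: union-find nodes with labelled arcs."""

    def __init__(self):
        self.parent = [0]      # union-find forest; node 0 is the root
        self.arcs = [{}]       # arcs[n]: label -> node
        self.const = [None]    # const[n]: atomic value, if any
        self.failed = False    # set when two distinct constants are equated

    def find(self, n):
        while self.parent[n] != n:
            n = self.parent[n]
        return n

    def node(self, path):
        """Node reached by `path` from the root, creating arcs as needed."""
        n = 0
        for label in path:
            n = self.find(n)
            if label not in self.arcs[n]:
                self.parent.append(len(self.parent))
                self.arcs.append({})
                self.const.append(None)
                self.arcs[n][label] = len(self.parent) - 1
            n = self.arcs[n][label]
        return self.find(n)

    def merge(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        if None not in (self.const[a], self.const[b]) and self.const[a] != self.const[b]:
            self.failed = True
        self.parent[b] = a
        if self.const[a] is None:
            self.const[a] = self.const[b]
        for label, target in list(self.arcs[b].items()):
            if label in self.arcs[a]:
                self.merge(self.arcs[a][label], target)  # substitutivity
            else:
                self.arcs[a][label] = target

    def equate(self, lhs, rhs):
        """Add lhs = rhs; rhs is a path (tuple) or a constant (str)."""
        if isinstance(rhs, tuple):
            self.merge(self.node(lhs), self.node(rhs))
        else:
            n = self.node(lhs)
            if self.const[n] not in (None, rhs):
                self.failed = True
            self.const[n] = rhs

    def equations_under(self, prefix):
        """Project and deindex: path = constant equations below `prefix`."""
        out = {}

        def walk(n, path, seen):
            n = self.find(n)
            if n in seen:
                return  # cyclic description: stop unfolding
            if self.const[n] is not None:
                out[path] = self.const[n]
            for label, target in self.arcs[n].items():
                walk(target, path + (label,), seen | {n})

        walk(self.node(prefix), (), frozenset())
        return out

def apply_rule(rule_equations, children):
    """children: (string, equations) pairs for the right-hand side, in order."""
    d = Description()
    for j, (_, equations) in enumerate(children, start=1):       # index
        for lhs, rhs in equations:
            d.equate((j,) + lhs, (j,) + rhs if isinstance(rhs, tuple) else rhs)
    for lhs, rhs in rule_equations:                               # combine
        d.equate(lhs, rhs)
    string = " ".join(s for s, _ in children)
    return string, (None if d.failed else d.equations_under((0,)))

# The sample rule S -> NP VP and three sample lexical entries.
RULE = [((0, "subj"), (1,)), ((0, "predicate"), (2,)), ((1, "agr"), (2, "agr"))]
UTHER = ("Uther", [(("agr", "num"), "sg"), (("agr", "per"), "3")])
KNIGHTS = ("many knights", [(("agr", "num"), "pl"), (("agr", "per"), "3")])
STORMS = ("storms Cornwall", [(("agr", "num"), "sg")])
```

Applying `apply_rule(RULE, [UTHER, STORMS])` produces a description for the sentence in which the subject's number is singular, whereas pairing `KNIGHTS` with `STORMS` equates `sg` with `pl` and therefore fails.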
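In the case of the sample lexicon and one-rule grammar
presented earlier, $[\![G]\!](e)$ would be
NP: $\{ \ldots$ as before $\ldots \}$
VP: $\{ \ldots$ as before $\ldots \}$
S: $\{\ \langle$ "Uther storms Cornwall", $\{\langle \mathit{subj}\ \mathit{agr}\ \mathit{num} \rangle = \mathit{sg}, \ldots\} \rangle$,
     $\langle$ "many knights sit at the Round Table", $\{\langle \mathit{subj}\ \mathit{agr}\ \mathit{num} \rangle = \mathit{pl}, \ldots\} \rangle$,
     $\langle$ "many knights storms Cornwall", $\top \rangle\ \}$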
6. Applications 
We have used the techniques discussed here to analyze 
the feature systems of GPSG [15], LFG [2] and PATR-II
[17]. All of them turn out to be specializations of our domain
D of descriptions. Figure 1 provides a summary of two
of the most critical formal properties of context-free-based
grammar formalisms, the domains of their feature systems
(full F, finite elements of F, or elements of F based on
nonrecursive domain equations) and whether the context-free
skeletons of grammars are constrained to be off-line
parsable [13], thereby guaranteeing decidability.
                 DCG-II(a)   PATR-II   LFG        GPSG(b)
FEATURE SYSTEM   full        finite    finite     nonrec.
CF SKELETON      full        full      off-line   full

(a) DCGs based on Prolog-II, which allows cyclic terms.
(b) HPSG, the current Hewlett-Packard implementation derived
from GPSG, would come more accurately under the PATR-II
classification.

Figure 1: Summary of Grammar System Properties
Though notational differences and some grammatical 
devices are glossed over here, the comparison is useful as 
a first step in unifying the various formalisms under one 
semantic umbrella. Furthermore, this analysis elicits the 
need to distinguish carefully between the domain of fea- 
ture structures F and that of descriptions. This distinction 
is not clear in the published accounts of GPSG and LFG, 
which imprecision is responsible for a number of uncertain- 
ties in the interpretation of operators and conventions in 
those formalisms. 
In addition to formal insights, linguistic insights have 
also been gleaned from this work. First of all, we note
that while the systems make crucial use of unification, generalization
is also a well-defined notion therein and might
indeed be quite useful. In fact, it was this availability of the
generalization operation that suggested a simplified account
of coordination facts in English now being used in GPSG
[15] and in an extension of PATR-II [8]. Though the issues
of coordination and agreement are discussed in greater de- 
tail in these two works, we present here a simplified view of 
the use of generalization in a GPSG coordination analysis. 
Circa 1982 GPSG [6] analyzed coordination by using a
special principle, the conjunct realization principle (CRP),
to achieve partial instantiation of head features (including
agreement) on the parent category. This principle, together
with the head feature convention (HFC) and control agreement
principle (CAP), guaranteed agreement between the
head noun of a subject and the head verb of a predicate in
English sentences. The HFC, in particular, can be stated
in our notation as $\langle 0\ \mathit{head} \rangle = \langle n\ \mathit{head} \rangle$ for $n$ the head of 0.
A more recent analysis [4,15] replaced the conjunct realization
principle with a modified head feature convention
that required a head to be more instantiated than the
parent, that is: $\langle 0\ \mathit{head} \rangle \sqsubseteq \langle n\ \mathit{head} \rangle$ for all constituents
$n$ which are heads of 0. Making coordinates heads of
their parent achieved the effect of the CRP. Unfortunately, 
since the HFC no longer forced identity of agreement, a 
new principle--the nominal completeness principle (NCP), 
which required that NP's be fully instantiated--was re- 
quired to guarantee that the appropriate agreements were 
maintained. 
Making use of the order structure of the domains we 
have just built, we can achieve straightforwardly the effect 
of the CRP and the old HFC without any notion of the 
NCP. Our final version of the HFC merely requires that 
the parent's head features be the generalization of the head 
features of the head children. Formally, we have: 
$$\langle 0\ \mathit{head} \rangle = \bigsqcap_{i \in \text{heads of } 0} \langle i\ \mathit{head} \rangle$$
In the case of parents with one head child, this final HFC
reduces to the old HFC requiring identity; it reduces to the
newer one, however, in cases (like coordinate structures)
where there are several head constituents.
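The effect of this final HFC can be illustrated on head features restricted to path = constant equations, represented as dictionaries from paths to constants. The restriction, the function names and the sample feature values are assumptions for the sake of the sketch; at least one head child is assumed.

```python
from functools import reduce

def generalize(d1, d2):
    """Greatest lower bound: keep the equations the two descriptions share."""
    return {path: c for path, c in d1.items() if d2.get(path) == c}

def parent_head(head_children):
    """Head features of the parent: generalization over all head children."""
    return reduce(generalize, head_children)

# Hypothetical head features of two coordinated noun phrases.
SINGULAR_NP = {("agr", "num"): "sg", ("agr", "per"): "3"}
PLURAL_NP = {("agr", "num"): "pl", ("agr", "per"): "3"}
```

With a single head child, `parent_head` returns that child's features unchanged, which is the identity required by the old HFC. With the two coordinated noun phrases, only the shared person feature survives on the parent, so each head is at least as instantiated as the parent.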
Furthermore, by utilizing an order structure on the do- 
main of constants C, it may be possible to model that trou- 
blesome coordination phenomenon, number agreement in 
coordinated noun phrases [8,15].
7. Conclusion 
We have approached the problem of analyzing the 
meaning of grammar formalisms by applying the techniques 
of denotational semantics taken from work on the semantics 
of computer languages. This has enabled us to 
• account rigorously for intrinsically partial descrip- 
tions, 
• derive directly notions of unification, instantiation 
and generalization, 
• relate feature systems in linguistics with type systems 
in computer science, 
• show that feature systems in GPSG, LFG and PATR-II
are special cases of a single construction,
• give semantics to a variety of mechanisms in grammar 
formalisms, and 
• introduce operations for modeling linguistic phenom- 
ena that have not previously been considered. 
We plan to develop the approach further to give accounts
of negative and disjunctive constraints [8], besides
the simple equational constraints discussed here. 
On the basis of these insights alone, it should be clear 
that the view of grammar formalisms as programming lan- 
guages offers considerable potential for investigation. But, 
even more importantly, the linguistic discipline enforced 
by a rigorous approach to the design and analysis of gram- 
mar formalisms may make possible a hitherto unachievable 
standard of research in this area. 
References 
[1] Ait-Kaci, H. "A New Model of Computation Based
on a Calculus of Type Subsumption." Dept. of Computer
and Information Science, University of Pennsylvania
(November 1983).
[2] Bresnan, J. and R. Kaplan. "Lexical-Functional
Grammar: A Formal System for Grammatical Representation."
In J. Bresnan, Ed., The Mental Representation
of Grammatical Relations, MIT Press, Cambridge,
Massachusetts (1982), pp. 173-281.
[3] Colmerauer, A. "Metamorphosis Grammars." In L.
Bolc, Ed., Natural Language Communication with
Computers, Springer-Verlag, Berlin (1978). First
appeared as "Les Grammaires de Métamorphose,"
Groupe d'Intelligence Artificielle, Université de Marseille II
(November 1975).
[4] Farkas, D., D.P. Flickinger, G. Gazdar, W.A. Ladusaw,
A. Ojeda, J. Pinkham, G.K. Pullum, and P.
Sells. "Some Revisions to the Theory of Features
and Feature Instantiation." Unpublished manuscript
(August 1983).
[5] Gazdar, Gerald and G. Pullum. "Generalized Phrase
Structure Grammar: A Theoretical Synopsis." Indiana
University Linguistics Club, Bloomington, Indiana
(1982).
[6] Gazdar, G., E. Klein, G.K. Pullum, and I.A. Sag.
"Coordinate Structure and Unbounded Dependencies."
In M. Barlow, D. P. Flickinger and I. A.
Sag, eds., Developments in Generalized Phrase Structure
Grammar. Indiana University Linguistics Club,
Bloomington, Indiana (1982).
[7] Harrison, M. Introduction to Formal Language Theory.
Addison-Wesley, Reading, Massachusetts (1978).
[8] Karttunen, Lauri. "Features and Values." Proceedings
of the Tenth International Conference on Computational
Linguistics, Stanford University, Stanford,
California (4-7 July, 1984).
[9] Kay, M. "Functional Grammar." Proceedings of the
Fifth Annual Meeting of the Berkeley Linguistic Society,
Berkeley Linguistic Society, Berkeley, California
(February 17-19, 1979), pp. 142-158.
[10] Marcus, M., D. Hindle and M. Fleck. "D-Theory:
Talking about Talking about Trees." Proceedings of
the 21st Annual Meeting of the Association for Computational
Linguistics, Boston, Massachusetts (15-17
June, 1982).
[11] Pereira, F. "Extraposition Grammars." American
Journal of Computational Linguistics 7, 4 (October-December
1981), 243-256.
[12] Pereira, F. and D. H. D. Warren. "Definite Clause
Grammars for Language Analysis: A Survey of the
Formalism and a Comparison with Augmented Transition
Networks." Artificial Intelligence 13 (1980),
231-278.
[13] Pereira, F. C. N., and David H. D. Warren. "Parsing
as Deduction." Proceedings of the 21st Annual Meeting
of the Association for Computational Linguistics,
Boston, Massachusetts (15-17 June, 1983), pp. 137-144.
[14] Plotkin, G. "A Powerdomain Construction." SIAM
Journal of Computing 5 (1976), 452-487.
[15] Sag, I., G. Gazdar, T. Wasow and S. Weisler. "Coordination
and How to Distinguish Categories." Report
No. CSLI-84-3, Center for the Study of Language
and Information, Stanford University, Stanford, California
(June, 1982).
[16] Scott, D. "Domains for Denotational Semantics." In
ICALP 82, Springer-Verlag, Heidelberg (1982).
[17] Shieber, Stuart. "The Design of a Computer Language
for Linguistic Information." Proceedings of
the Tenth International Conference on Computational
Linguistics (4-7 July, 1984).
[18] Shieber, S., H. Uszkoreit, F. Pereira, J. Robinson and
M. Tyson. "The Formalism and Implementation of
PATR-II." In Research on Interactive Acquisition and
Use of Knowledge, SRI Final Report 1894. SRI International,
Menlo Park, California (1983).
[19] Smyth, M. "Power Domains." Journal of Computer
and System Sciences 16 (1978), 23-36.
[20] Stoy, J. Denotational Semantics: The Scott-Strachey
Approach to Programming Language Theory. MIT
Press, Cambridge, Massachusetts (1977).
[21] van Emden, M. and R. A. Kowalski. "The Semantics
of Predicate Logic as a Programming Language."
Journal of the ACM 23, 4 (October 1976), 733-742.
[22] Woods, W. et al. "Speech Understanding Systems:
Final Report." BBN Report 3438, Bolt Beranek and
Newman, Cambridge, Massachusetts (1976).
