Talking About Trees 
Patrick Blackburn 
Department of Philosophy, Rijksuniversiteit Utrecht 
Heidelberglaan 8, 3584 CS Utrecht. Email: patrick@phil.ruu.nl 
Claire Gardent 
GRIL, Université de Clermont Ferrand, France, and
Department of Computational Linguistics, Universiteit van Amsterdam 
Spuistraat 134, 1012 VB Amsterdam. Email: claire@mars.let.uva.nl 
Wilfried Meyer-Viol 
Centrum voor Wiskunde en Informatica 
Kruislaan 413, 1098 SJ Amsterdam. Email: W.Meyer.Viol@cwi.nl 
Abstract 
In this paper we introduce a modal lan- 
guage L T for imposing constraints on trees, 
and an extension LT(L F) for imposing constraints on trees
decorated with feature
structures. The motivation for introducing 
these languages is to provide tools for for- 
malising grammatical frameworks perspic- 
uously, and the paper illustrates this by 
showing how the leading ideas of GPSG can
be captured in LT(LF). 
In addition, the role of modal languages 
(and in particular, what we have called 
layered modal languages) as constraint for- 
malisms for linguistic theorising is discussed 
in some detail. 
1 Introduction 
In this paper we introduce a modal language L T
for talking about trees, and an extension LT(L F)
for talking about trees decorated with feature structures.
From a logical point of view this is a natural
thing to do. After all, the trees and feature
ural thing to do. After all, the trees and feature 
structures used in linguistics are simple graphical 
objects. To put it another way, they are merely 
rather simple Kripke models, and modal languages 
are probably the simplest languages in which non- 
trivial constraints can be imposed on such structures. 
Moreover the approach is also linguistically natural: 
many of the things linguists need to say about trees 
(and feature structures) give rise to modal operators 
rather naturally, and indeed our choice of modalities 
has been guided by linguistic practice, not logical 
convenience. 
There are several reasons why we think this path
is an interesting one to explore, but two are of
particular relevance to the present paper. 1 First,
we believe that it can lead to relatively simple and 
natural formalisations of various grammatical frame- 
works. In our view, neither simplicity nor natural- 
ness are luxuries: unless a formalisation possesses a 
high degree of clarity, it is unrealistic to hope that 
it can offer either precise analyses of particular sys- 
tems or informative comparisons of different frame- 
works. We believe our approach has the requisite 
clarity (largely because it arose by abstracting from 
linguistic practice in a rather direct manner) and 
much of this paper is an attempt to substantiate
this. Second, L T can be combined in a very natural
way with feature logics to yield simple systems
which deal with configurational concepts, complex 
categories and their interaction. The key idea is to 
perform this combination of logics in a highly con- 
strained way which we have called layering. Layer- 
ing is a relatively new idea in modal logic (in fact the 
only paper devoted exclusively to this topic seems to 
be [Finger and Gabbay 1992]), and it seems to provide
the right level of expressive power needed to
model many contemporary grammar formalisms. 
The paper is structured as follows. In section 2 we 
define the syntax and semantics of L T, our modal 
language for imposing constraints on tree structure. 
1 Lurking in the background are two additional, rather
more technical, reasons for our interest. First, we believe
that being explicit about tree structure in our logical object
languages (instead of, say, coding tree structure up as
just another complex feature) may make it easier to find
computationally tractable logics for linguistic processing.
Second, we believe that logical methods may interact
fruitfully with the mathematical literature on tree admissibility
(see [Peters and Ritchie 1969], [Rounds 1970]
and [Joshi and Levy 1977]). However, we won't explore
these ideas here.
In section 3 we put L T to work, showing how it can 
be used to characterise the parse trees of context free 
phrase structure grammars. In section 4 we consider 
how the structured categories prevalent in modern 
linguistic formalisms are dealt with. Our solution 
is to introduce a simple feature logic L F for talk- 
ing about complex categories, and then to layer L T 
across L F. The resulting system LT(L F) is capable 
of formulating constraints involving the interaction 
of configurational and categorial information. Section
5 illustrates how one might use this expressive
power by formulating some of the leading ideas of
GPSG in LT(L F). We conclude the paper with some
general remarks on the use of modal languages as 
constraint formalisms in linguistics. 
2 The language L T 
The primitive alphabet of the language LT(Prop) 
contains the following items: two constant symbols 
s and t, some truth functionally adequate collection 
of Boolean operators, 2 two unary modalities $\downarrow$ and
$\uparrow$, a binary modality $\Rightarrow$, a modality of arbitrary positive
arity $\bullet$, the left bracket ( and the right bracket
). In addition we have a set of propositional symbols 
Prop. We think of the symbols in Prop as given to us 
by the linguistic theory under consideration; differ- 
ent applications may well result in different choices 
of Prop. To give a very simple example, Prop might 
be {S, NP, VP, N, V, DET, CONJ}.
The wffs of LT(Prop) are defined as follows. First,
all elements of Prop are LT(Prop) wffs, and so are
the constant symbols s and t. Second, if $\phi$, $\psi$ and
$\phi_1, \ldots, \phi_n$ ($n \geq 1$) are LT(Prop) wffs, then so are $\neg\phi$,
$(\phi \wedge \psi)$, $\uparrow\phi$, $\downarrow\phi$, $(\phi \Rightarrow \psi)$ and $\bullet(\phi_1, \ldots, \phi_n)$. Third,
nothing else is an LT(Prop) wff. In what follows, we
will assume that some choice of Prop has been fixed 
and drop all subsequent mention of it. That is, we'll 
usually speak simply of the language L T. 
The semantics of L T is given in terms of finite 
ordered trees. We regard finite trees T as quadruples 
of the form $(\Gamma, >, \Theta, root)$, where $\Gamma$ is a finite set of
nodes, > is the 'mother of' relation between nodes
('u > v' means 'u is the mother of v'), $\Theta$ ($\subseteq \Gamma$) is
the set of terminal nodes and root is the root node of
the tree. As for the precedence ordering, we define:
Definition 2.1 (Finite ordered trees) 
A finite ordered tree is a pair $O = (T, \lambda)$ where T
is a tree $(\Gamma, >, \Theta, root)$ and $\lambda$ is a function assigning
to each node u in $\Gamma$ a finite sequence of nodes
of $\Gamma$. For all nodes $u \in \Gamma$, $\lambda(u)$ must satisfy two
constraints. First, $\lambda(u)$ must contain no repetitions.
Second, $\lambda(u) = (u_1, \ldots, u_k)$ iff $u > u_1, \ldots, u > u_k$
and there is no node $u'$ in $\Gamma$ such that $u > u'$ and $u'$
does not occur in the sequence $\lambda(u)$. []
2 In what follows we treat $\neg$ and $\wedge$ as primitive, and
the other Boolean symbols such as $\vee$ (disjunction), $\rightarrow$
(material implication), $\leftrightarrow$ (material equivalence), $\top$ (constant
true) and $\bot$ (constant false) as abbreviations defined
in the familiar manner.
(In short, the repetition-free sequence assigned to a
node u by $\lambda$ consists of all and only the nodes immediately
dominated by u; the sequence gives us a
precedence ordering on these nodes. Note that it
follows from this definition that $\lambda(u)$ is the null
sequence iff u is a terminal node.) Next, we define
a model M to be a pair (O, V) where O is a finite
ordered tree $(T, \lambda)$ and V is a function from Prop to
the powerset of $\Gamma$. That is, V assigns to each propositional
symbol a set of nodes. Given any model M and
any node u of M, the satisfaction relation $M, u \models \phi$
(that is, the model M satisfies $\phi$ at node u) is defined
as follows:
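\begin{align*}
M, u &\models p &&\text{iff } u \in V(p), \text{ for all } p \in \mathrm{Prop} \\
M, u &\models s &&\text{iff } u = root \\
M, u &\models t &&\text{iff } u \in \Theta \\
M, u &\models \neg\phi &&\text{iff not } M, u \models \phi \\
M, u &\models \phi \wedge \psi &&\text{iff } M, u \models \phi \text{ and } M, u \models \psi \\
M, u &\models \downarrow\phi &&\text{iff } \exists u' \in \Gamma\,(u > u' \text{ and } M, u' \models \phi) \\
M, u &\models \uparrow\phi &&\text{iff } \exists u' \in \Gamma\,(u' > u \text{ and } M, u' \models \phi) \\
M, u &\models \phi \Rightarrow \psi &&\text{iff } \forall u' \in \Gamma\,(M, u' \models \phi \text{ implies } M, u' \models \psi) \\
M, u &\models \bullet(\phi_1, \ldots, \phi_k) &&\text{iff } length(\lambda(u)) = k \text{ and } M, \lambda(u)(i) \models \phi_i \text{ for all } 1 \leq i \leq k
\end{align*}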
(In the satisfaction clause for $\bullet$, $\lambda(u)(i)$ denotes the
ith element of the sequence assigned to u by $\lambda$.) If
$\phi$ is satisfied at all nodes of a model M, then we say
that $\phi$ is valid on M and write $M \models \phi$. The notion of
validity has an important role to play for us. As we
shall see in the next section, we think of a grammar
G as being represented by an L T wff $\phi^G$. The trees
admitted by the grammar are precisely those models
on which $\phi^G$ is valid.
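The satisfaction definition can be read almost directly as a recursive procedure. The following is a minimal sketch, not part of the original formalism: the tuple encoding of wffs and the names `Node`, `Model`, `sat` and `valid` are illustrative assumptions. Terminal nodes are identified as nodes with an empty daughter sequence, which the definition of finite ordered trees licenses.

```python
class Node:
    """A tree node: the propositional symbols true at it, plus ordered daughters."""
    def __init__(self, labels, daughters=()):
        self.labels = set(labels)
        self.daughters = list(daughters)   # plays the role of lambda(u)


class Model:
    """A finite ordered tree with its valuation, indexed for mother lookup."""
    def __init__(self, root):
        self.root = root
        self.nodes = []
        self.mother = {}                   # id(daughter) -> mother node
        stack = [root]
        while stack:
            u = stack.pop()
            self.nodes.append(u)
            for d in u.daughters:
                self.mother[id(d)] = u
                stack.append(d)


def sat(m, u, wff):
    """Return True iff wff is satisfied at node u of model m."""
    op = wff[0]
    if op == "prop":
        return wff[1] in u.labels
    if op == "s":
        return u is m.root
    if op == "t":
        return not u.daughters             # lambda(u) is the null sequence
    if op == "not":
        return not sat(m, u, wff[1])
    if op == "and":
        return sat(m, u, wff[1]) and sat(m, u, wff[2])
    if op == "down":
        return any(sat(m, d, wff[1]) for d in u.daughters)
    if op == "up":
        mother = m.mother.get(id(u))
        return mother is not None and sat(m, mother, wff[1])
    if op == "imp":                        # quantifies over every node of the tree
        return all(not sat(m, v, wff[1]) or sat(m, v, wff[2]) for v in m.nodes)
    if op == "bullet":
        args = wff[1:]
        return len(u.daughters) == len(args) and all(
            sat(m, d, a) for d, a in zip(u.daughters, args))
    raise ValueError(f"unknown operator: {op!r}")


def valid(m, wff):
    """A wff is valid on a model iff it is satisfied at all of its nodes."""
    return all(sat(m, u, wff) for u in m.nodes)


def P(name):
    """Shorthand for a propositional symbol."""
    return ("prop", name)


# A local tree: NP immediately and exhaustively dominating DET then N.
det, n = Node({"DET"}), Node({"N"})
np = Node({"NP"}, [det, n])
model = Model(np)
np_rule = ("imp", P("NP"), ("bullet", P("DET"), P("N")))
```

On this small model `valid(model, np_rule)` holds, whereas swapping the order of the arguments to `"bullet"` makes it fail, reflecting the fact that the last clause is sensitive to precedence.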
Note that L T is just a simple formalisation of lin- 
guistic discourse concerning tree structure. First, 
$\downarrow$ and $\uparrow$ enable us to say that a daughter node, or a
mother node, does (or does not) bear certain pieces of
linguistic information. For example, $\downarrow\phi$ insists that
the information $\phi$ is instantiated on some daughter
node. Second, $\Rightarrow$ enables general constraints about
tree structure to be made: to insist that $\phi \Rightarrow \psi$ is to
say that any node in the tree that bears the information
$\phi$ must also bear the information $\psi$. Finally, $\bullet$
enables admissibility conditions on local trees to be
stated. That is, $\bullet$ is a modal operator that embodies
the idea of local tree admissibility introduced by
[McCawley 1968]. 3 It enables us to insist that a node
3 As well as McCawley's article, see also [Gazdar 1979].
Our treatment of node admissibility conditions has been
heavily influenced by Gazdar's paper and later work in
the GPSG tradition (for example, [Gazdar et al. 1985]).
must immediately and exhaustively dominate nodes 
bearing the listed information, and in the following 
section we'll see how to put it to work. 4 
Although the design of L T was guided by linguistic 
considerations, with the exception of • it turns out 
to be a rather conventional language. 5 In particular, 
$\Rightarrow$ is what modal logicians call strict implication, and
$\downarrow$ and $\uparrow$ together form a particularly simple example
of what modal logicians like to call a 'tense logic'. 
In what follows we will occasionally make use 
of the following (standard) modal abbreviations: 
$\Box\phi =_{def} \top \Rightarrow \phi$ (that is, $\phi$ is satisfied at all nodes)
and $[\uparrow]\phi =_{def} \neg\uparrow\neg\phi$ (that is, $\phi$ is true at the mother
of the node we are evaluating at, if in fact this node
has a mother).
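Unfolding the two abbreviations against the satisfaction clauses of section 2 confirms the glosses given in parentheses. The derivation below is a sketch; the bracket notation $[\uparrow]\phi$ for the second abbreviation $\neg\uparrow\neg\phi$ is our own choice of symbol.

```latex
\begin{align*}
M, u \models \Box\phi
  &\iff M, u \models \top \Rightarrow \phi \\
  &\iff \forall u' \in \Gamma\,(M, u' \models \top \text{ implies } M, u' \models \phi) \\
  &\iff \forall u' \in \Gamma\; M, u' \models \phi \\[6pt]
M, u \models [\uparrow]\phi
  &\iff M, u \models \neg\uparrow\neg\phi \\
  &\iff \text{there is no } u' \in \Gamma \text{ with } u' > u \text{ and } M, u' \not\models \phi \\
  &\iff \forall u' \in \Gamma\,(u' > u \text{ implies } M, u' \models \phi)
\end{align*}
```

At the root no $u'$ with $u' > u$ exists, so $[\uparrow]\phi$ holds there vacuously, which is the "if in fact this node has a mother" proviso.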
3 Talking about trees 
In this section we show by example how to use L T to 
formulate node admissibility conditions that will pick 
out precisely the parse trees of context free gram- 
mars. Consider the context free phrase structure 
grammar G = (S, N, T, P) where S is the start sym- 
bol of G; where N, the set of non-terminal symbols, 
is {S, NP, VP, N, V, DET, CONJ }; where T, the 
set of terminal symbols, is {the, a, man, woman, don- 
key, beat, stroke, and, but}; and where P, the set of 
productions, is: 
S → NP VP | S CONJ S
NP → DET N
VP → V NP
N → man | woman | donkey
V → beat | stroke
DET → the | a
CONJ → and | but
Let's consider how to capture the parse trees of this 
grammar by means of constraints formulated in L T. 
The first step is to fix our choice of Prop. We choose 
it to be N U T, that is, all the terminal and non- 
terminal symbols of G. The second step is to capture 
the effect of the productions P. We do so as follows. 
Let $\phi^P$ be the conjunction of the following wffs:
4 The reader will doubtless be able to think of other
interesting operators to add. For example, adding operators
$\downarrow^*$ and $\uparrow^*$ which explore the transitive closures of the
daughter-of and mother-of relations respectively enables
GB discourse concerning command relations to be modeled;
while weakening the definition of $\bullet$ to ignore the
precedence ordering on sisters, and adding a new unary
modality to take over the task of regulating linear precedence,
permits the ID/LP format used in GPSG to be naturally
incorporated. Such extensions will be discussed in
[Blackburn et al. forthcoming]; however, for the purposes
of the present paper we will be content to work with the
simpler set of operators we have defined here.
5 Indeed, on closer inspection, even $\bullet$ can be regarded
as an old friend in disguise; such logical issues will be
discussed in [Blackburn et al. forthcoming].
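\begin{align*}
S &\Rightarrow \bullet(NP, VP) \vee \bullet(S, CONJ, S) \\
NP &\Rightarrow \bullet(DET, N) \\
VP &\Rightarrow \bullet(V, NP) \\
N &\Rightarrow \bullet(man) \vee \bullet(woman) \vee \bullet(donkey) \\
V &\Rightarrow \bullet(beat) \vee \bullet(stroke) \\
DET &\Rightarrow \bullet(the) \vee \bullet(a) \\
CONJ &\Rightarrow \bullet(and) \vee \bullet(but)
\end{align*}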
Note that each conjunct licences a certain informa- 
tion distribution on local trees. Now, any parse tree 
for G can be regarded as an L T model, and because 
each conjunct in $\phi^P$ mimics in very obvious fashion
a production in G, each node of a G parse tree is
licenced by one of the conjuncts. To put it another
way, $\phi^P$ is valid on all the G parse trees.
However we want to capture all and only the parse 
trees for G. 
This is easily done: we need merely express
in our language certain general facts about parse
trees. First, we insist that each node is labeled
by at least one propositional symbol: $\Box(\bigvee_{p \in N \cup T} p)$
achieves this. Second, we insist that each node is
labeled by at most one propositional symbol:
$(p \Rightarrow \bigwedge_{q \in (N \cup T) \setminus \{p\}} \neg q)$ achieves this. Third, we insist
that the root of the tree be decorated by the start
symbol S of the grammar: $s \Rightarrow S$ achieves this.
Fourth, we insist that non-terminal symbols label
only non-terminal nodes: $\bigwedge_{p \in N}(p \Rightarrow \neg t)$ achieves
this. Finally, we insist that terminal symbols label
terminal nodes: $\bigwedge_{p \in T}(p \Rightarrow t)$ achieves this. Call the
conjunction of these five wffs $\phi^U$; note that, modulo
our choice of the particular sets N and T, $\phi^U$
expresses universal facts about parse trees.
Now, for the final step: let $\phi^G$ be $\phi^P \wedge \phi^U$. That
is, $\phi^G$ expresses both the productions P of G and
the universal facts about parse tree structure. It is
easy to see that for any model M, $M \models \phi^G$ iff M is
(isomorphic to) a parse tree for G. Indeed, it is not
hard to see that the method we used to express G 
as a formula generalises to any context free phrase 
structure grammar. 
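To make the construction concrete, the conditions expressed by the production conjuncts and by the five universal facts can be checked directly on a candidate tree. The sketch below is an illustration under stated assumptions rather than a general evaluator: trees are encoded as `(label, daughters)` pairs, so the "exactly one symbol per node" conditions hold by construction, and the names `PRODUCTIONS`, `TERMINALS` and `is_parse_tree` are hypothetical.

```python
# Productions of the example grammar G, keyed by non-terminal symbol.
PRODUCTIONS = {
    "S": [("NP", "VP"), ("S", "CONJ", "S")],
    "NP": [("DET", "N")],
    "VP": [("V", "NP")],
    "N": [("man",), ("woman",), ("donkey",)],
    "V": [("beat",), ("stroke",)],
    "DET": [("the",), ("a",)],
    "CONJ": [("and",), ("but",)],
}
TERMINALS = {"the", "a", "man", "woman", "donkey", "beat", "stroke", "and", "but"}


def is_parse_tree(tree, is_root=True):
    """Check a (label, daughters) tree against the constraints for G."""
    label, daughters = tree
    if is_root and label != "S":
        return False                      # the root must bear the start symbol
    if label in TERMINALS:
        return not daughters              # terminal symbols label terminal nodes
    if label not in PRODUCTIONS:
        return False                      # every node bears a symbol of N or T
    if not daughters:
        return False                      # non-terminals label non-terminal nodes
    # The local tree must match one disjunct of the conjunct for this label,
    # in the given left-to-right order.
    if tuple(d[0] for d in daughters) not in PRODUCTIONS[label]:
        return False
    return all(is_parse_tree(d, is_root=False) for d in daughters)


def leaf(word):
    """Build a terminal node."""
    return (word, [])


np_subject = ("NP", [("DET", [leaf("the")]), ("N", [leaf("donkey")])])
np_object = ("NP", [("DET", [leaf("a")]), ("N", [leaf("man")])])
vp = ("VP", [("V", [leaf("beat")]), np_object])
good = ("S", [np_subject, vp])
bad = ("S", [vp, np_subject])             # daughters in the wrong order
```

Here `is_parse_tree(good)` succeeds and `is_parse_tree(bad)` fails. A tree rooted in NP fails too, because the root is not labeled by the start symbol.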
4 Trees decorated with feature 
structures 
The previous section showed that L T is powerful 
enough to get to grips with interesting 'languages' in
the sense of formal language theory; but although 
natural language syntacticians may use tools bor- 
rowed from formal language theory, they usually 
have a rather different conception of language than 
do computer scientists. One important difference is 
that linguists typically do not consider either non- 
terminal or terminal symbols to be indivisible atoms, 
rather they consider them to be conglomerations 
of 'lower level' information called feature structures. 
For example, to say that a node bears the informa- 
tion NP is to say that the node actually bears a lot 
of lower level information: for example, the features 
+N, -V, and BAR 2. Moreover (as we shall see 
in section 5) the constraints that a tree must sat- 
isfy in order to be accepted as well formed will typi- 
cally involve this lower level information. The feature 
co-occurrence restrictions and feature instantiation 
principles of GPSG are good examples of this.
Is it possible to extend our framework to deal with 
such ideas? More to the point, is there a simple ex- 
tension of our framework that can deal with com- 
plex categories? Layered modal languages provide 
what is needed. Semantically, instead of associating 
each tree node with an atomic value, we are going to 
associate it with a feature structure. Syntactically, 
we are going to replace the propositional symbols of 
L T with wffs capable of talking about this additional
structure. To be more precise, instead of defining the
wffs of L T over a base Prop consisting of primitive
propositional symbols, we are going to define them 
over a base of modal formulas. That is, we will use 
a language with two 'layers'. The top layer L T will 
talk about tree structure just as before, whereas the 
base layer (or feature layer) L F will talk about the
'internal structure' associated with each node. 6 
Clearly the first thing we have to do is to define our 
feature language L F and its semantics. We will as- 
sume that the linguistic theory we are working with 
tells us what features and atoms may be used. That 
is, we assume we have been given a signature $(\mathcal{F}, \mathcal{A})$
where both $\mathcal{F}$ and $\mathcal{A}$ are non-empty finite or denumerably
infinite sets, the set of features and the set of
atomic information respectively. Typical elements of
$\mathcal{F}$ might be CASE, NUM, PERSON and AGR; while typical
elements of $\mathcal{A}$ might be genitive, singular, plural,
1st, 2nd, and 3rd.
The language L F (of signature $(\mathcal{F}, \mathcal{A})$) contains
the following items: all the elements of $\mathcal{A}$ (which we
will regard as propositional symbols), a truth functionally
adequate collection of Boolean connectives,
and all the elements of $\mathcal{F}$ (which we will regard as
one place modal operators). The set of wffs of L F
is the smallest set containing all the propositional
symbols (that is, all the elements of $\mathcal{A}$) closed under
the application of the Boolean and modal operators
(that is, the elements of $\mathcal{F}$). Thus a typical wff of
L F might be the following:
$$\langle\mathrm{AGR}\rangle\langle\mathrm{PERSON}\rangle\mathit{3rd} \wedge \langle\mathrm{CASE}\rangle\mathit{genitive}.$$
Note that this wff is actually something very familiar, 
namely the following Attribute Value Matrix: 
[ AGR   [ PERSON  3rd ] ]
[ CASE  genitive        ]
6 As was mentioned earlier, at present there is relatively
little published literature on layered modal languages.
The most detailed investigation is that of
[Finger and Gabbay 1992], while [de Rijke 1992] gives a
brief account in the course of a general discussion on the
nature of modal logic.
Indeed the wffs of L F are nothing but straightforward
'linearisations' of the traditional two dimensional
AVM format. Thus it is unsurprising that the
semantics of L F is given in terms of feature struc- 
tures: 
Definition 4.1 (Feature structures) 
A feature structure of signature $(\mathcal{F}, \mathcal{A})$ is a triple
F of the form $(W, \{R_f\}_{f \in \mathcal{F}}, V)$, where W is a non-empty
set, called the set of points; for all $f \in \mathcal{F}$, $R_f$
is a binary relation on W that is a partial function;
and V is a function that assigns to each propositional
symbol (that is, each $\alpha \in \mathcal{A}$), a subset of W. 7 []
Our satisfaction definition for L F wffs is as follows. 
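For any $F = (W, \{R_f\}_{f \in \mathcal{F}}, V)$ and any point $w \in W$:
\begin{align*}
F, w &\models \alpha &&\text{iff } w \in V(\alpha), \text{ for all } \alpha \in \mathcal{A} \\
F, w &\models \neg\phi &&\text{iff not } F, w \models \phi \\
F, w &\models \phi \wedge \psi &&\text{iff } F, w \models \phi \text{ and } F, w \models \psi \\
F, w &\models \langle f \rangle\phi &&\text{iff } \exists w'\,(w R_f w' \text{ and } F, w' \models \phi)
\end{align*}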
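The four clauses above can be sketched as a small recursive evaluator. This is an illustrative encoding, not the authors' implementation: the class `FeatureStructure`, the function `sat_f` and the tuple representation of wffs are assumptions. Storing transitions in a dictionary keyed by (point, feature) guarantees that each feature relation is a partial function, as the definition of feature structures requires.

```python
class FeatureStructure:
    """Points are arbitrary hashable values; transitions form partial functions."""
    def __init__(self, transitions, valuation):
        self.transitions = dict(transitions)   # (point, feature) -> point
        self.valuation = {a: set(ps) for a, ps in valuation.items()}


def sat_f(fs, w, wff):
    """Return True iff the feature-language wff is satisfied at point w of fs."""
    op = wff[0]
    if op == "atom":
        return w in fs.valuation.get(wff[1], set())
    if op == "not":
        return not sat_f(fs, w, wff[1])
    if op == "and":
        return sat_f(fs, w, wff[1]) and sat_f(fs, w, wff[2])
    if op == "feat":                           # ("feat", f, phi) encodes <f>phi
        target = fs.transitions.get((w, wff[1]))
        return target is not None and sat_f(fs, target, wff[2])
    raise ValueError(f"not a feature-language operator: {op!r}")


# The AVM [AGR [PERSON 3rd], CASE genitive], rooted at point 0.
avm = FeatureStructure(
    transitions={(0, "AGR"): 1, (1, "PERSON"): 2, (0, "CASE"): 3},
    valuation={"3rd": {2}, "genitive": {3}},
)
typical_wff = (
    "and",
    ("feat", "AGR", ("feat", "PERSON", ("atom", "3rd"))),
    ("feat", "CASE", ("atom", "genitive")),
)
```

The wff is satisfied at point 0 of `avm` but not at point 1, which has no CASE transition.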
With L F and its semantics defined, we are ready 
to define a language for talking about trees decorated 
with feature structures: the language LT(LF), that 
is, the language L T layered over the language L F. 
That is, we choose Prop to be L F and then make the 
L T wffs on top of this base in the usual way. 8 As a
result, we've given an 'internal structure' (namely, a 
modal structure, or AVM structure) to the proposi- 
tional symbols of L T. This is the syntactical heart 
of layering. 
Definition 4.2 (Feature decorated trees) 
By a (finite ordered) feature structure decorated tree 
(of signature $(\mathcal{F}, \mathcal{A})$) is meant a triple (O, D, d)
where O is a finite ordered tree, D is a function
that assigns to each node u of O a feature structure
(of signature $(\mathcal{F}, \mathcal{A})$), and d is a function that
assigns to each node u of O a point of D(u). That
is, $d(u) \in D(u)$. 9 []
It is straightforward to interpret LT(L F) wffs on
feature structure decorated trees: indeed all we have 
7 For detailed discussion of this definition see
[Blackburn 1991, 1992] or [Blackburn and Spaan 1991, 1992].
For present purposes it suffices to note that it includes as
special cases most of the well known definitions of feature
structures, such as that of [Kasper and Rounds 1986].
8 This is worth spelling out in detail. The wffs of the
language LT(L F) (of signature $(\mathcal{F}, \mathcal{A})$) are defined as follows.
First, all L F wffs (of signature $(\mathcal{F}, \mathcal{A})$) are LT(L F)
wffs, and so are the constant symbols s and t. Second,
if $\phi$, $\psi$ and $\phi_1, \ldots, \phi_n$ are LT(L F) wffs then so are $\neg\phi$,
$\phi \wedge \psi$, $\uparrow\phi$, $\downarrow\phi$, $\phi \Rightarrow \psi$ and $\bullet(\phi_1, \ldots, \phi_n)$. Third, nothing
else is an LT(L F) wff.
9 In a number of recent talks Dov Gabbay has advocated
the idea of 'fibering' one set of semantic entities
over another. This is precisely what's going on
here: we're fibering trees over feature structures. Fibered
structures are the natural semantic domains for layered
languages.
to do is alter the base clause of the L T definition. So, 
let M = (O, D, d) be a feature structure decorated 
tree, and u be any node in O. Then for all wffs $\phi \in$ L F:
$$M, u \models \phi \text{ iff } D(u), d(u) \models \phi.$$
In short, when in the course of evaluating an LT(L F)
wff at a node u we encounter an L F wff (that is,
when we reach 'atomic' level) we go to the feature
structure associated with u (that is, D(u)), and start
evaluating the L F wff at the point d(u). This change
at the atomic level is the only change we need to
make: all the other clauses (that is, the clauses for s
and t, the Boolean operators, and for $\Rightarrow$, $\downarrow$, $\uparrow$ and $\bullet$) are
unchanged from the L T satisfaction definition given 
in section 2. 
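The effect of changing only the base clause can be sketched as follows. This is a deliberately partial illustration under stated assumptions: only the Boolean, daughter and local-tree clauses of the tree layer are implemented (the remaining clauses are as in section 2), and the names `DNode`, `sat_feature` and `sat_layered`, together with the tuple encoding of wffs, are hypothetical.

```python
def sat_feature(trans, val, w, wff):
    """Feature-layer satisfaction at point w; trans maps (point, feature) to a point."""
    op = wff[0]
    if op == "atom":
        return w in val.get(wff[1], set())
    if op == "not":
        return not sat_feature(trans, val, w, wff[1])
    if op == "and":
        return all(sat_feature(trans, val, w, x) for x in wff[1:])
    if op == "feat":                      # ("feat", f, phi) encodes <f>phi
        nxt = trans.get((w, wff[1]))
        return nxt is not None and sat_feature(trans, val, nxt, wff[2])
    raise ValueError(f"not a feature-layer operator: {op!r}")


class DNode:
    """A tree node u carrying D(u) = (trans, val) and its start point d(u)."""
    def __init__(self, trans, val, start, daughters=()):
        self.trans, self.val, self.start = trans, val, start
        self.daughters = list(daughters)


def sat_layered(u, wff):
    """Tree-layer clauses first; anything else is handed to the feature layer."""
    op = wff[0]
    if op == "not":
        return not sat_layered(u, wff[1])
    if op == "and":
        return all(sat_layered(u, x) for x in wff[1:])
    if op == "down":
        return any(sat_layered(d, wff[1]) for d in u.daughters)
    if op == "bullet":
        args = wff[1:]
        return len(args) == len(u.daughters) and all(
            sat_layered(d, a) for d, a in zip(u.daughters, args))
    # Base clause: evaluate the feature-layer wff in D(u), starting at d(u).
    return sat_feature(u.trans, u.val, u.start, wff)


genitive = ("feat", "CASE", ("atom", "genitive"))
daughter = DNode({(0, "CASE"): 1}, {"genitive": {1}}, 0)
mother = DNode({}, {"np": {0}}, 0, [daughter])
```

Because `sat_feature` rejects tree operators, a tree modality can never occur underneath a feature modality. This mirrors the constrained way in which the two languages are combined: the tree layer sits on top of the feature layer, never the other way round.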
To close this section, a general comment. LT(L F) 
is merely one, rather minimalist, example of a lay- 
ered modal language. The layering concept offers 
considerable flexibility. By enriching either the L T 
component, the L F component, or both, one can tai- 
lor constraint languages for specific applications. In- 
deed, it's worth pointing out that one is not forced 
to layer L T over a modal language at all. One could 
perfectly well layer L T across a first order feature 
logic or over a fragment of such a first order logic 
(such as the Schönfinkel-Bernays fragment explored
in [Johnson 1991]), 10 and doubtless the reader can
imagine other possibilities. That said, we're struck 
by the simplicity of purely modal layered languages 
such as LT(LF), and we believe that there are good 
theoretical reasons for being interested in modal ap- 
proaches (these are discussed at the end of the pa- 
per). Moreover, as we shall now see, even the rather 
simple collection of operators offered by LT(L F) is
capable of imposing interesting constraints on syn- 
tactic structures. 
5 LT(L F) and linguistic theory 
At this stage, it should be intuitively clear why LT(L F)
is well suited for modeling contemporary linguistic
theories. On the one hand, the L T part of
the language lets us talk directly about tree struc- 
ture, thus clearly it is a suitable tool for imposing 
constraints on constituent structure. On the other 
hand, the L F part of the language permits the de- 
scription of complex (rather than atomic) categories; 
and nowadays the use of such categories is standard. 
The aim of this section is to give a concrete illustra- 
tion of how LT(L F) can be used to model modern 
linguistic theories. The theory we have chosen for 
this purpose is GPSG. In what follows we sketch how 
some of the leading ideas of GPSG can be captured 
using LT(L F) wffs.
10 Layering over first order languages is treated in
[Finger and Gabbay 1992].
5.1 Complex categories 
One of the fundamental ideas underlying GPSG (and 
indeed many other contemporary syntactic theories) 
is that a linguistic category is a complex object con- 
sisting of feature specifications, where feature speci- 
fications are feature/value pairs, and a value is either 
an atom or is itself a category. In LT(L F), this idea
is easily modeled since LT(L F) contains L F, a language
specifically designed for talking about feature
structures. To give a simple example, consider the 
following complex category: 
[ NOUN  -   ]
[ VERB  +   ]
[ BAR   two ]
This is naturally represented by the following
L F wff:
$$\neg\mathit{noun} \wedge \mathit{verb} \wedge \langle\mathrm{BAR}\rangle\mathit{two}$$
where the attribute BAR is represented by a modality 
and the atomic symbols and Boolean features are 
represented by propositional symbols. This wff is 
satisfied at any point w in a feature structure such 
that noun is false at w, verb is true at w, and the 
propositional information two is reachable by making 
a BAR transition from w. 
5.2 Admissibility constraints on local trees 
The heart of GPSG is a collection of interacting
principles governing the proper distribution of fea- 
tures within syntactic trees. Central to this the- 
ory is the concept of admissibility constraints on 
local trees. Very roughly, 11 the idea is that a lo- 
cal tree is admissible if it is a projection of an im- 
mediate dominance rule (that is, each node in the 
tree corresponds in some precisely defined way to 
exactly one category in the rule) and it satisfies all 
of the grammar principles; these include feature co- 
occurrence restrictions (FCRs), feature specification 
defaults (FSDs), linear precedence (LP) statements, 
and universal feature instantiation principles (UIPs). 
In what follows, we show how LT(L F) can be used 
to model some of these admissibility conditions on 
local trees: section 5.2.1 shows how to model phrase 
structure restrictions and section 5.2.2 concentrates 
on FCRs. Finally, in section 5.2.3 we sketch an 
LT(L F) treatment of the GPSG UIPs. 
5.2.1 Phrase structure restrictions 
In GPSG, restrictions on constituent structure are 
expressed by a set of ID/LP statements. As the name 
indicates, I(mmediate) D(ominance) statements en- 
code immediate dominance restrictions on local trees 
(for instance, the ID rule A --* B, C licenses any local 
tree consisting of a mother node labeled with cate- 
gory A and exactly two daughter nodes labeled with 
11 For a more precise formulation of the constraints on
tree admissibility, see [Gazdar et al. 1985, page 100].
categories B and C respectively), whereas LP statements
define a linear precedence relation between sister
nodes (for example, the LP statement $C \prec B$
states that in any local tree with sisters labeled B 
and C, the C node must precede the B node). 
Strictly speaking, such restrictions cannot be modeled
in LT(L F). The reason for this is trivial. As has
already been pointed out, the satisfaction definition 
for $\bullet$ makes use of both the immediate dominance
and linear precedence relations. In a full-blooded at- 
tempt to model GPSG, we would probably define a 
variant modal operator o of • that did not make use 
of the precedence relation, and introduce an addi- 
tional modal operator (say ~,) to control precedence. 
However, having made this point, we shall not pur- 
sue the issue further. Instead, we will show how the 
present version of LT(L F) allows for the encoding of 
phrase structure rules involving complex categories. 
As was shown in section 3, rules involving atomic 
categories can be modeled in a fairly transparent way 
using $\Rightarrow$, $\bullet$ and $\vee$. For instance,
$$S \Rightarrow \bullet(NP, VP) \vee \bullet(S, CONJ, S)$$
captures the import of the following two phrase 
structure rules: 
S → NP VP
S → S CONJ S
In these rules the information associated with each 
node of the tree is propositional in nature, that 
is, non-structured. However because LT(L F) allows 
one to peer into the internal structure of nodes, 
this way of modeling phrase structure rules extends 
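straightforwardly to rules involving complex categories:
it suffices to replace the propositional symbols
by L F wffs. For example, the phrase structure
rule:
[ NOUN  -   ]      [ NOUN    -     ]   [ NOUN  +   ]
[ VERB  +   ]  →   [ VERB    +     ]   [ VERB  -   ]
[ BAR   two ]      [ BAR     zero  ]   [ BAR   two ]
                   [ SUBCAT  trans ]
can be formulated as the following LT(L F) wff:
$$(\neg\mathit{noun} \wedge \mathit{verb} \wedge \langle\mathrm{BAR}\rangle\mathit{two}) \Rightarrow \bullet((\neg\mathit{noun} \wedge \mathit{verb} \wedge \langle\mathrm{BAR}\rangle\mathit{zero} \wedge \langle\mathrm{SUBCAT}\rangle\mathit{trans}),\ (\mathit{noun} \wedge \neg\mathit{verb} \wedge \langle\mathrm{BAR}\rangle\mathit{two}))$$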
That is, the L F wffs give the required 'internal 
structure' in the obvious way. 
5.2.2 Feature co-occurrence restrictions 
FCRs encode restrictions on the distribution of 
features within categories. More specifically, they 
express conditional or bi-conditional dependencies 
between feature specifications occurring within the 
same category. For instance, the FCR: 
[INV +] ⊃ [AUX +, VFORM fin] (FCR1)
states that any category with feature specification
INV + must also contain the feature specifications
AUX + and VFORM fin. In other words, any inverted
constituent must be a finite auxiliary.
FCRs are naturally expressed in LT(L F) by using
the $\Rightarrow$ connective. Thus, FCR1 can be captured by
means of the following schema:
$$\mathit{inv} \Rightarrow (\mathit{aux} \wedge \langle\mathrm{VFORM}\rangle\mathit{fin})$$
This says that for any ordered tree and any node
u in this tree, if the feature structure associated with
u starts with the point w and inv is true at w, then
aux is also true at w and furthermore, the propositional
information fin is reachable from w by making
a VFORM transition to some other point w'.
5.2.3 Universal principles 
In this section, we show that LT(L F) allows us
to axiomatize the main content of GPSG's three feature
instantiation principles, namely the foot feature
principle, the head feature convention and the control
agreement principle.
Consider first the foot feature principle (FFP). 
This says that: 
Any foot feature specification which is in- 
stantiated on a daughter in a local tree must 
also be instantiated on the mother category
in that tree. [Gazdar et al. 1985, page 81] 12
So, assume that our GPSG theorising has resulted 
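in a signature $(\mathcal{F}, \mathcal{A})$ which includes the feature FOOT.
We capture the FFP by means of the following
schema (here $[\uparrow]\phi$ abbreviates $\neg\uparrow\neg\phi$, as in section 2):
$$\langle\mathrm{FOOT}\rangle\phi \Rightarrow [\uparrow]\langle\mathrm{FOOT}\rangle\phi.$$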
This says that for any node u, if the information $\phi$
is reachable by making a FOOT transition in the feature
structure associated with u, then it must also
be possible to obtain the information $\phi$ by making
a FOOT transition in the feature structure associated
with the mother of u. That is, FOOT information
percolates up the tree. So for instance, if
three sister nodes $u_1$, $u_2$ and $u_3$ of a tree bear the
information $\langle\mathrm{FOOT}\rangle\phi_1$, $\langle\mathrm{FOOT}\rangle\phi_2$ and $\langle\mathrm{FOOT}\rangle\phi_3$
respectively, then the feature structure associated
with the mother node must bear the information
$\langle\mathrm{FOOT}\rangle\phi_1 \wedge \langle\mathrm{FOOT}\rangle\phi_2 \wedge \langle\mathrm{FOOT}\rangle\phi_3$. Incidentally, it
then follows from the semantics of L F that this node
bears the information $\langle\mathrm{FOOT}\rangle(\phi_1 \wedge \phi_2 \wedge \phi_3)$. That
is, the three pieces of foot information are unified.
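The unification step relies on the requirement in Definition 4.1 that each feature relation $R_f$ is a partial function. A sketch of the argument for two conjuncts, at an arbitrary point w of a feature structure F (the three-conjunct case is analogous):

```latex
\begin{align*}
F, w \models \langle f \rangle\phi_1 \wedge \langle f \rangle\phi_2
  &\implies \exists w_1, w_2\,
     (w R_f w_1,\ F, w_1 \models \phi_1,\ w R_f w_2,\ F, w_2 \models \phi_2) \\
  &\implies w_1 = w_2
     \quad \text{(since } R_f \text{ is a partial function)} \\
  &\implies F, w_1 \models \phi_1 \wedge \phi_2 \\
  &\implies F, w \models \langle f \rangle(\phi_1 \wedge \phi_2)
\end{align*}
```

Without partial functionality the second step fails, so over arbitrary Kripke models the conjunction of the diamonds would not entail the diamond of the conjunction.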
12 This axiom is actually a simplified version of the FFP
in that it ignores the distinction between inherited and
instantiated features. See section 5.3 for discussion of
this point.
Consider now the head feature convention (HFC). 
A simplified version of the HFC can be stated as 
follows: 13
Any head feature carried by the head
daughter is carried by the mother and vice
versa.
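Assuming a signature $(\mathcal{F}, \mathcal{A})$ which includes the
feature HEAD-FEATURE and the atomic information
head, we capture the HFC by means of the following
schema:
\begin{gather*}
(\mathit{head} \wedge \langle\mathrm{HEAD\text{-}FEATURE}\rangle\phi) \Rightarrow [\uparrow]\langle\mathrm{HEAD\text{-}FEATURE}\rangle\phi \\
\wedge \\
(\mathit{head} \wedge \uparrow\langle\mathrm{HEAD\text{-}FEATURE}\rangle\phi) \Rightarrow \langle\mathrm{HEAD\text{-}FEATURE}\rangle\phi
\end{gather*}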
The first conjunct says that whenever the feature 
structure associated with a node u marks it as a head 
node, and the information $\phi$ is reachable by making
a HEAD-FEATURE transition, then one can also reach
the same information $\phi$ by making a HEAD-FEATURE
transition in the feature structure associated with 
the mother of u. The second conjunct works analo- 
gously to bring HEAD-FEATURE information down to 
the head daughter. 
Finally, we sketch how the effect of the more elaborate
control agreement principle (CAP) can be captured.
GPSG formulates CAP by making use of
the Montagovian semantic type assignments. As we 
haven't discussed semantics, we're going to assume 
that the relevant type information is available inside 
our feature structures. With this assumed, our for- 
mulation of CAP falls into three steps: first, defining 
the notions of controller and controllee (or target in 
GPSG terminology); second, defining the notion of a
control feature; and third, defining the instantiation 
principle. We consider each in turn. Controller and 
controllee are defined as follows: 14 
A category C is controlled by another category
C' in a constituent $C_0$ if one of the
following situations obtains at a semantic
level: either C is a functor that applies
to C' to yield a $C_0$, or else there
is a control mediator C'' which combines
with C and C' in that order to yield a $C_0$.
[Gazdar et al. 1985, page 87]
Further, a control mediator is a head category
whose semantic type is $\langle VP, \langle NP, VP \rangle\rangle$ where VP
denotes the type of an intransitive verb phrase and
NP that of a generalised quantifier. The first step is
to formulate the notions of controller and controllee. 
13 The exact formulation of the HFC implies that only
free feature specifications are taken into account. See
section 5.3 for discussion of this point. 
14 Again this is somewhat simplified in that the final
GPSG definition of control only takes into account so- 
called x-features so as to ignore perturbations of se- 
mantic types introduced by the presence of instantiated 
features. 
We do this with the following three wffs (a and b are
metavariables over semantic types, and np and vp 
correspond to the NP and VP above): 
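$$
\begin{array}{l}
\bullet(\langle\text{TYPE}\rangle a/b,\ \langle\text{TYPE}\rangle a) \Rightarrow \bullet(\mathit{controllee},\ \mathit{controller}) \\
\bullet(\langle\text{TYPE}\rangle a,\ \langle\text{TYPE}\rangle a/b) \Rightarrow \bullet(\mathit{controller},\ \mathit{controllee}) \\
\bullet(\top,\ \langle\text{TYPE}\rangle vp/(np/vp),\ \top) \Rightarrow \bullet(\mathit{controller},\ \top,\ \mathit{controllee})
\end{array}
$$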
Control features are SLASH and AGR and are not
mutually exclusive. The problem is to decide which
should actually function as control feature when both
of them are present on the controllee category. In effect,
in case of conflict (cf. [Gazdar et al. 1985, 89]),
SLASH is the control feature if it is inherited, else
AGR is. As we have no way to distinguish here between
inherited and instantiated feature values, we
will (again) give a simplified axiomatisation of con- 
trol features, namely: 
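$$
\begin{array}{l}
\langle\text{SLASH}\rangle\psi \Rightarrow \langle\text{CONTROL\_FEAT}\rangle\psi \\
(\neg\langle\text{SLASH}\rangle\top \wedge \langle\text{AGR}\rangle\phi) \Rightarrow \langle\text{CONTROL\_FEAT}\rangle\phi
\end{array}
$$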
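Read procedurally, the two axioms above give SLASH priority over AGR. The sketch below illustrates that reading only; it assumes a category is a flat dictionary from feature names to values, which is far cruder than the feature structures of the paper, and the function name control_feature is hypothetical.

```python
def control_feature(category):
    """Return the (feature, value) pair forced onto CONTROL_FEAT, or None.

    `category` is a hypothetical flat dict standing in for a feature structure.
    """
    if "SLASH" in category:
        # First axiom: whatever SLASH reaches, CONTROL_FEAT reaches too.
        return ("SLASH", category["SLASH"])
    if "AGR" in category:
        # Second axiom: AGR supplies the value only when SLASH is undefined.
        return ("AGR", category["AGR"])
    return None  # neither axiom applies


print(control_feature({"SLASH": "NP", "AGR": "NP[3sg]"}))
print(control_feature({"AGR": "NP[3sg]"}))
```

As in the axioms themselves, nothing here distinguishes inherited from instantiated values, so the conflict case is always resolved in favour of SLASH.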
Finally, we turn to the CAP itself. This says 
that the value of the control feature of the controllee 
is identical with the category of the controller. In $L^T(L^F)$:
(~ (controller)A ~ (controllee)) ::~
I ((controller A ~) 
(controllee A (CONTROL_FEAT)C)) 
5.3 Discussion 
In the preceding sections, we showed how $L^T(L^F)$
could be used to capture some of the leading concepts
of GPSG. Although the account involves many simplifications
and shortcomings, the examples should
illustrate how to use $L^T(L^F)$: one expresses linguistic
principles as $L^T(L^F)$ wffs, and only those (decorated)
trees validating all these wffs are considered
well-formed. What we hope to have shown is that
$L^T(L^F)$ is a very natural language for expressing the
various types of theoretical constructs developed in
GPSG and, more generally, in most modern theories of
grammar. Complex categories can be described using
the $L^F$ part of the language while general information
concerning the geometry of trees and the distribution
of feature specifications within those trees
can be stated using the full language. More specifically,
the bullet operator $\bullet$ provides an easy way to
express phrase structure constraints while the strict 
implication operator $\Rightarrow$ allows one to express various
types of constraints on the distribution of features in
trees. When used to connect two $L^F$ wffs, $\Rightarrow$ expresses
generalisations over the internal structure of
categories (as illustrated in section 5.2.2 on FCRs),
whereas when used together with $\uparrow$, $\downarrow$ and $\bullet$ it allows
information sharing between feature structures asso- 
ciated with different nodes in the tree (cf. section 
5.2.3). 
As already repeatedly mentioned, there remain 
many shortcomings in our approach to modeling 
GPSG. To close this discussion let's consider them a 
little more closely; this will lead to some interesting 
questions about the nature of linguistic theorising. 
The first type of shortcoming involves lack of expressivity
in $L^T(L^F)$ and is illustrated by the impossibility
of expressing ID/LP statements (cf. section
5.2.1). As already indicated, we don't regard such 
shortcomings as a failure of the general modal ap- 
proach being developed here. With a slightly differ- 
ent choice of modal language, an adequate modeling 
of ID/LP statements could be attained. More gener- 
ally, we think it is important to explore a wide range 
of modal languages for linguistic theorising, for we 
believe that it may be possible to usefully classify 
differing linguistic theories in terms of the different 
modal operators required to formalise them. A theo- 
retical justification for our confidence will be given in 
the following section; here we'll simply say that we 
think this is a feasible way of addressing the questions
raised in [Shieber 1988] concerning the comparative
expressivity and computational complexity
of grammatical formalisms. 
The second type of shortcoming is more serious 
and potentially far more interesting. Two cases 
in point are (i) the distinction made in GPSG between
instantiated and inherited features and (ii) the
GPSG notion of a free feature. Briefly, inherited features
are features whose presence on categories in
trees is directly determined by an ID rule whereas
instantiated features are non-inherited features (cf.
[Gazdar et al. 1985, page 76]). Furthermore, given
a category C occurring in a tree $\tau$ such that $\tau$ is a
projection of some ID rule that satisfies the FFP and
the CAP, a feature specification is said to be free in
C iff it is compatible with the information contained
in C (cf. [Gazdar et al. 1985, page 95] for a more
precise definition of free features). The problem in 
both cases is that derivational information needs to 
be taken into account. In the first case, the source 
of the feature specification must be known (does it 
stem from an ID rule or from some other source?). In 
the second case, we must know that both CAP and 
FFP are already being satisfied by the category un- 
der consideration. There is an essentially dynamic 
flavour to these ideas, something that goes against 
the grain of the essentially static tree descriptions 
offered by $L^T(L^F)$. Whether this dynamic aspect is
in fact required, and how it could best be modeled, 
we leave here as open research questions. 
6 But why modal languages? 
To close this paper we wish to discuss an issue that 
may be bothering some readers: why were modal 
languages chosen as the medium for expressing con- 
straints on trees and feature structures? A reader 
unfamiliar with the developments that have taken 
place in modal logic since the early 1970's, and in
particular, unfamiliar with the emergence of modal 
correspondence theory, may find the decision to work 
with modal languages rather odd; surely it would 
be more straightforward to work in (say) some ap- 
propriate first order language? However we believe 
that there are general reasons for regarding modal 
languages as a particularly natural medium for lin- 
guistic theorising, and what follows is an attempt to 
make these clear. 
The first point that needs to be made about modal 
languages is that they are nothing but extremely 
simple languages for talking about graphs. Unfortu- 
nately, the more philosophical presentations of modal 
logic tend to obscure this rather obvious point. In 
such presentations the emphasis is on discussing such 
ideas as 'possible worlds' and 'intensions'. Such dis- 
cussions have their charms, but they make it very 
easy to overlook the fact that the mathematical 
structures on which these ideas rest are extremely 
simple: just sets of nodes decorated with atomic in- 
formation on which a transition relation is defined. 
Kripke models are nothing but graphs. 
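To make the point concrete, here is a minimal sketch of a tree treated as a Kripke model: a set of nodes, a daughter relation, and a set of atoms per node. The tuple encoding of formulas and the names holds, daughters and atoms are assumptions made for illustration only; "up" and "down" stand for the mother and daughter modalities.

```python
def holds(formula, node, daughters, atoms):
    """Evaluate a modal formula at `node` of a tree viewed as a Kripke model."""
    op = formula[0]
    if op == "atom":
        return formula[1] in atoms[node]
    if op == "not":
        return not holds(formula[1], node, daughters, atoms)
    if op == "and":
        return all(holds(f, node, daughters, atoms) for f in formula[1:])
    if op == "down":   # some daughter of `node` satisfies the argument
        return any(holds(formula[1], d, daughters, atoms)
                   for d in daughters[node])
    if op == "up":     # the mother of `node` satisfies the argument
        return any(holds(formula[1], m, daughters, atoms)
                   for m, ds in daughters.items() if node in ds)
    raise ValueError(f"unknown operator: {op!r}")


# The transition relation (mother to daughters) and the atomic decoration.
daughters = {"s": ["np", "vp"], "np": [], "vp": ["v"], "v": []}
atoms = {"s": {"S"}, "np": {"NP"}, "vp": {"VP"}, "v": {"V"}}

print(holds(("down", ("atom", "VP")), "s", daughters, atoms))
print(holds(("up", ("atom", "S")), "vp", daughters, atoms))
```

Nothing beyond a dictionary for the relation and a dictionary for the decoration is needed, which is the sense in which the underlying structures are simple.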
The second point is even more important. Modal 
languages are not some strange alternative to classi- 
cal languages; rather, they are relatively constrained 
fragments of such languages. If a problem has been 
modeled in a modal language then it has, ipso facto,
been modeled in a classical language; and moreover,
it has been modeled in a very resource-conscious way.
The point deserves a little elaboration. Ever since 
the early 1970's, one of the most important branches 
of research in technical modal logic has been modal 
correspondence theory (see [van Benthem 1984] and
references therein), the systematic study of the in- 
terrelationships between modal languages on the one 
hand, and various classical logics (first order, infini- 
tary, and second order) on the other. Modal corre- 
spondence theory rests on the following simple ob- 
servation. It is usually possible to view modal oper- 
ators as logical 'macros'; essentially modal operators 
are a prepackaging of certain forms of quantification 
that are available in classical languages. To give a 
simple example, we might view a statement of the 
form $\uparrow\psi$ as a shorthand for the first order expression
$\exists y(y > z \wedge \phi(y))$, where $\phi(y)$ is a certain first
order wff called the standard translation of $\psi$. 15 For
15 This is somewhat impressionistic; for the full story
consult [van Benthem 1984]. For a discussion of the fun-
present purposes the details aren't particularly important;
the key point to note is that the $\uparrow$ operator is
essentially a neat notation which embodies a limited 
form of first order quantificational power: namely 
the ability to quantify over mother nodes. More gen- 
erally, modal languages eschew the quantificational 
power that classical languages achieve through the 
use of variables and binding, in favour of a variable 
free syntax in which quantification is performed us- 
ing operators. Expressive power is traded for syntac- 
tic simplicity. 
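The 'macro' view can be illustrated by a small translator that expands the mother and daughter modalities into first order quantification. This is a sketch in the spirit of the standard translation, not the definition from the correspondence theory literature: the tuple encoding of formulas, the function names, and the reading of y > z as "y is the mother of z" are all assumptions made here.

```python
import itertools


def _st(formula, var, fresh):
    """Translate `formula`, evaluated at first order variable `var`."""
    op = formula[0]
    if op == "atom":
        return f"{formula[1]}({var})"
    if op == "not":
        return f"¬{_st(formula[1], var, fresh)}"
    if op == "and":
        left = _st(formula[1], var, fresh)
        right = _st(formula[2], var, fresh)
        return f"({left} ∧ {right})"
    if op in ("up", "down"):
        y = next(fresh)  # a variable not used so far
        # Assumed reading: a > b means "a is the mother of b".
        rel = f"{y} > {var}" if op == "up" else f"{var} > {y}"
        return f"∃{y}({rel} ∧ {_st(formula[1], y, fresh)})"
    raise ValueError(f"unknown operator: {op!r}")


def translate(formula, var="z"):
    """Expand a variable-free modal formula into a first order string."""
    fresh = (f"y{i}" for i in itertools.count())
    return _st(formula, var, fresh)


print(translate(("up", ("atom", "S"))))
print(translate(("up", ("down", ("atom", "NP")))))
```

Each modality introduces exactly one bound variable, and that variable is only ever related to the current node; this is the limited form of quantification that the operator notation packages up.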
The relevance of these points for linguistics should 
be clear. Linguistic theorizing makes heavy use of 
graph structures; trees and feature structures are ob- 
vious examples. Thus modal languages can be used 
as constraint formalisms; what correspondence the- 
ory tells us is that they are particularly interesting 
ones, namely formalisms that mesh neatly with the 
linguists' quest for revealing descriptions using the 
weakest tools possible. 
Acknowledgements: We would like to thank Jo- 
han van Benthem, Gerald Gazdar, Maarten de Ri- 
jke, Albert Visser and the anonymous referees for 
their comments on the earlier draft of this paper. 
Patrick Blackburn would like to acknowledge the fi- 
nancial support of the Netherlands Organization for 
the Advancement of Research (project NF 102/62-356
'Structural and Semantic Parallels in Natural
Languages and Programming Languages'). 
References 
[Blackburn 1991] Blackburn, P.: 1991, Modal Logic
and Attribute Value Structures. To appear in
Diamonds and Defaults, edited by M. de Rijke,
Studies in Logic, Language and Information,
Kluwer.
[Blackburn and Spaan 1991] Blackburn, P. and
Spaan, E.: 1991, On the Complexity of Attribute
Value Logics. Proceedings of the Eighth
Amsterdam Colloquium, edited by P. Dekker
and M. Stokhof, Philosophy Department,
Amsterdam University, The Netherlands.
[Blackburn and Spaan 1992] Blackburn, P. and
Spaan, E.: 1992, A Modal Perspective on the
Computational Complexity of Attribute Value
Grammar. To appear in Journal of Logic,
Language and Information.
[Blackburn 1992] Blackburn, P.: 1992, Structures,
Languages and Translations: the Structural
Approach to Feature Logic. To appear in
Constraints, Language and Computation, edited by
C. Rupp, M. Rosner and R. Johnson, Academic
Press.
[Blackburn et al. forthcoming] Blackburn, P.,
Gardent, C., and Meyer-Viol, W.: Modal Phrase
Structure Grammars. In preparation.
damental correspondences involved in feature logic see 
[Blackburn 1992].
[de Rijke 1992] de Rijke, M.: 1992, What is Modal
Logic? In Logic at Work, proceedings of the
Applied Logic Conference, CCSOM, University
of Amsterdam, 1992.
[Finger and Gabbay 1992] Finger, M. and Gabbay,
D.: 1992, Adding a Temporal Dimension to a
Logic System. Journal of Logic, Language and
Information, 1, pp. 203-233.
[Gazdar 1979] Gazdar, G.: 1979, Constituent
Structures. Manuscript, Sussex University.
[Gazdar et al. 1985] Gazdar, G., Klein, E., Pullum,
G., and Sag, I.: 1985, Generalised Phrase
Structure Grammar. Basil Blackwell.
[Johnson 1991] Johnson, M.: 1991, Features and
Formulas. Computational Linguistics, 17, pp.
131-151.
[Joshi and Levy 1977] Joshi, A. and Levy, S.: 1977,
Constraints on Structural Descriptions: Local
Transformations. SIAM Journal of Computing,
6, pp. 272-284.
[Kasper and Rounds 1986] Kasper, R. and Rounds,
W.: 1986, A logical semantics for feature
structures. Proceedings of the 24th Annual Meeting of
the Association for Computational Linguistics,
Columbia University, New York, pp. 257-266.
[McCawley 1968] McCawley, J.: 1968, Concerning
the Base Component of a Transformational
Grammar. Foundations of Language, 4, pp. 55-81.
[Peters and Ritchie 1969] Peters, S. and Ritchie, R.:
1969, Context-Sensitive Immediate Constituent
Analysis - Context-Free Languages Revisited.
Proceedings ACM Symposium on Theory of
Computing, Association for Computing Machinery,
pp. 1-10.
[Rounds 1970] Rounds, W.: 1970, Tree-Oriented
Proofs of Some Theorems in Context-Free and
Indexed Languages. Proceedings ACM Symposium
on Theory of Computing, Association for
Computing Machinery, pp. 210-216.
[Shieber 1988] Shieber, S.: 1988, Separating
Linguistic Analyses from Linguistic Theories. In
Natural Language Parsing and Linguistic Theories,
edited by U. Reyle and C. Rohrer, Reidel.
[van Benthem 1984] van Benthem, J.: 1984,
Correspondence Theory. In Handbook of Philosophical
Logic, 2, edited by D. Gabbay and F. Guenthner,
Reidel.
