COORDINATION IN UNIFICATION-BASED GRAMMARS 
Richard P. Cooper 
Department of Psychology 
University College London 
London WC1E 6BT, U.K. 
JANET: ucjtrrc@ucl.ac.uk 
ABSTRACT 
Within unification-based grammar formalisms, 
providing a treatment of cross-categorial coor- 
dination is problematic, and most current solu- 
tions either over-generate or under-generate. In 
this paper we consider an approach to coordi- 
nation involving "composite" feature structures, 
which describe coordinate phrases, and present 
the augmentation to the logic of feature struc- 
tures required to admit such feature structures. 
This augmentation involves the addition of two 
connectives, composite conjunction and compos- 
ite disjunction, which interact to allow cross- 
categorial coordination data to be captured ex- 
actly. The connectives are initially considered to 
function only in the domain of atomic values, be- 
fore their domain of application is extended to 
cover complex feature structures. Satisfiability 
conditions for the connectives in terms of deter- 
ministic finite state automata are given, both for 
the atomic case and for the more complex case. 
Finally, the Prolog implementation of the connec- 
tives is discussed, and it is illustrated how, in the 
atomic case, and with the use of the freeze/2 
predicate of second generation Prologs, the con- 
nectives may be implemented. 
The Problem 
Given a modern unification-based grammar, 
such as HPSG, or PATR/FUG-styIe grammars, 
where feature structure descriptions are associ- 
ated with the constituents of the grammar, and 
unification is used to build the descriptions of 
constituents from those of their subconstituents, 
providing a treatment of coordination, especially 
cross-categorial coordination, is problematic. It 
is well known that coordination is not restricted 
to like categories (see (1)), so it is too restric- 
tive to require that the syntactic category of a 
coordinate phrase be just the unification of the 
syntactic categories of the conjuncts. Indeed, the 
data suggest that the syntactic categories of the 
conjuncts need not unify. 
(1) a. Tigger became famous and a com- 
plete snob. 
b. Tigger is a large bouncy kitten and 
proud of it. 
Furthermore, it is only possible to coordinate 
certain phrases within certain syntactic contexts. 
Whilst the examples in (1) are grammatical, those 
in (2) are not, although the same constituents are 
coordinated in each case. 
(2) a. *Famous and a complete snob chased 
Fido. 
b. *A large bouncy kitten and proud of 
it likes Tom. 
The difference between the examples in (1) and 
(2) is the syntactic context in which the coordi- 
nated phrase appears. The relevant generalisa- 
tion, made by Sag et al. (1985) with respect to 
GPSG, is that constituents may coordinate if and 
only if the description of each constituent unifies 
with the relevant description in the grammar rule 
which licenses the phrase containing the coordi- 
nate structure. Example (la) is grammatical be- 
cause the phrase structure rule which licenses the 
constituent became famous and a complete snob 
requires that famous and a complete snob unify 
with the partial description of the object sub- 
categorised for by became, and the descriptions 
of each of the conjuncts, famous and a complete 
snob, actually do unify with that partial descrip- 
tion: became requires that its object be "either an 
NP or an AP", and each of famous and a com- 
plete snob is "either an NP or an AP". (lb) is 
grammatical for analogous reasons, though is is 
less fussy about its object, also allowing PPs and 
predicative VPs to fill the position. (2a) is un- 
grammatical as chased requires that its subject 
be a noun phrase. Whilst this is true of a com. 
167 - 
plete snob, it is not true of famous, so the descrip- 
tion of famous does not unify with the descrip- 
tion which chase requires of its subject. (2b) is 
ungrammatical for similar reasons. 
Two Approaches to a Solution 
Two approaches to this problem are immediate. 
Firstly, we may try to capture the intuition that 
each conjunct must unify with the requirements 
of the appropriate grammar rule by generalising 
all grammar rules to allow for coordinated phrases 
in all positions. This general approach follows 
that of Shieber (1989), and involves the use of 
semi-unification. Note that this does not involve 
a grammar rule licensing coordinate constituents 
such as a and fl: following this approach c~ and 
/~ can never be a constituent in its own right. 
An alternate approach is to preserve the orig- 
inal grammar rules, but generalise the notion of 
syntactic category to license composite categories 
-- categories built from other categories -- and 
introduce a rule licensing coordinate structures 
which have such composite syntactic categories. 
That is, we introduce a grammar rule such that 
if a and ~ are constituents, then a and ~ is also 
a constituent, and the syntactic category of this 
constituent is a composite of the syntactic cate- 
gories of a and ft. 
Within a unification-based approach, this gen- 
eralisation of syntactic category requires a gener- 
alisation of the logic of feature structures, with 
an associated generalisation of unification. This 
is the approach which we adopt in this paper. 
One of the consequences of this approadl is that 
for (almost) any constituents a and fl, the gram- 
mar should also license the string a and fl as 
a constituent, irrespective of whether there axe 
any contexts in which this constituent may occur. 
Thus our grammar might admit in the garden and 
chases Fido as a constituent, though there may 
be no contexts which license such a constituent. 
Our approach differs from other approaches 
to cross-categorial coordination (such as those 
employing generalisation, or that of Proudiau 
& Goddeau (1987)) which have been suggested 
in the unification grammar literature in that 
it involves a real augmentation of the logic of 
feature structures. Other approaches which do 
not involve this augmentation tend to ovel- 
generate (the approaches employing general. 
isation) or under-generate (the approach of 
Proudian & Goddeau). 
Generalisation over-generates because in gen- 
eralisation conflicting values are ignored. In the 
ease of became, assuming that we analyse became 
as requiring an object whose description unifies 
with \[CATEGORy NP V AP\], generaiisation would 
license (la), as well as both of the examples in (3). 
(3) a. *Tigger became famous and in the 
garden. 
b. *Tigger became a complete snob and 
in the garden. 
This is because the generalisation of the de- 
scriptions of the two conjuncts (\[CATEGORY AP\] 
and \[CATEGORY PP\] in the case of (3a) and \[CAT- 
gooltv NP\] and \[CATEGORY PP\] in the case of 
(3b)) is in each case \[CATEGORY _l_\], which uni- 
fies with the \[CATEGORY NP V AP\] requirement 
of became. 
It is not clear how the approach of Proudian & 
Goddeau could be applied to the became example: 
the disjunctive :subcategorisation requirements of 
became cannotbe treated within their approach. 
For further details see Cooper (1990). 
Composite Atomic Values 
Following Kasper & Rounds (1990), and ear- 
lier work by the same authors (Rounds & Kasper 
(1986) and Kasper & Rounds (1986)), we adopt 
a logical approach to feature structures via an 
equational logic. The domain of well-formed for- 
mulae is defined inductively in terms of a set A of 
atomic values and a set L of labels or attributes. 
These formulae are interpreted as descriptions of 
deterministic finite state automata. 
In the formulation of Kasper & Rounds, these 
automata have atomic values assigned to (some 
of) their terminal states. A simplifed reading of 
the coordination data suggests that these values 
need not be atomic, and that there is structure 
on this domain of "atomic" values. To model this 
structure we introduce an operator "~", which 
we term composite conjunction, such that if a and 
\]~ are atomic values, then a ,~/~ is also an atomic 
Value. Informally, if a large bouncy kitten is de- 
scribed by the pair \[CATEGORY NP\] and proud of 
it is described by the pair \[CATEGORY AP\], then 
any coordination of those constituents, such as 
neither a large bouncy kitten nor proud of it will 
be described by the pair \[CATEGORY NP ~ AP\]. 
Before discussing satisfiability, we consider 
some of the properties of ~ : 
- 168 - 
• ^ is symmetric: a noun phrase coordinated 
with an adjectival phrase is of the same cate- 
gory as an adjectival phrase coordinated with 
a noun phrase. Thus for all atomic values a 
and/~, we require 
• ^ is associative: in constructions involv- 
ing more than two conjuncts the category of 
the coordinate phrase is independent of the 
bracketing. Hence for all atomic values a,/~ 
and % we require 
^t = 
• ^ is idempotent: the conjunction of two (or 
more) constituents of category x is still of 
category x: Hence for all atomic values a, 
we require 
These properties exactly correspond to the prop- 
erties required of an operator on finite sets. For 
full generality we thus take ^ to be an operator 
on finite subsets of atomic values rather than a 
binary operator satisfying the above conditions, 
but for simplicity use the usual infix notation for 
the binary case. 
Given one further requirement, that for any a 
(and hence that a^a = ^ {a}) the use of an op- 
erator on sets directly reflects all of the above 
properties: 
= = 
-^-= ^{.} 
Given this structure on the domain of atomic 
values, we restate the satisfiability require- 
ments. We deal in terms of deterministic finite 
state automata (DFSAS) specified as six-tuples, 
(Q, q0, L, 5, A, lr), where 
• Q is a set of atoms known as states,: 
• q0 is a particular element of Q known as the 
start state, 
• L is a set of atoms known as labels, 
• 6 is a partial function from \[Q x L\] to Q 
known as the transition function, 
• A is a set of atoms, and 
• ~r is a function from final states (those states 
from which according to/f there are no tran- 
sitions) to A. 
To incorporate conjunctive composite values 
we introduce structure on A, requiring that for 
all finite subsets X of A, ^ X is in A. Satisfiabil- 
ity of formulae involving composite conjunction 
is defined as follows: 
• -4~ ~{ax, .... a,~}iff.4=(Q,qo, L,6,A, tr) 
~ where 6(q0,/) is undefined for each I in L and 
a'(qo) -" ^ {al,..., a,~}. 1 
This is really just the same clause as for all atomic 
values: 
• .4 ~ aiff.A = (Q, qo,L,6,A,~r) where ~(q0,1) 
is undefined for each 1 in L and ~r(q0) - a. 
As such nothing has really changed yet, though 
note that by an "atomic value" now we mean an 
element of the domain A. The structure which 
we have introduced on A means that strictly 
speaking these values are not atomic. They are, 
however, "atomic" in the feature structure sense: 
they have no attributes. 
The real trick in handling composite conjunc- 
tive formulae correctly, however, comes in the 
treatment of disjunction. We introduce to the 
syntax a further connective ~, composite dis- 
junction. As the name suggests, this is the ana. 
logue of disjunction in the domain of composite 
values. Like standard disjunction v exists only 
in the syntax, and not in the semantics. For sat- 
isfiability we have: 
• .4 ~ (aV3) where a,/~EA iffA~a or 
: 
More generally: 
• .4 ~ ~& where ~CA and 4~ is finite iff 
.4 ~ ~ ~' for some subset (I)' of 4). 
With this connective, disjunctive subcategori- 
sation requirements may be replaced with com- 
posite disjunctive requirements. The intuition be- 
hind this modifcation stems from the fact that 
if a constituent has a disjunctive subcategorisa- 
ti0n requirement, then that requirement can be 
met by any of the disjuncts, or the composite 
of those disjuncts. To illustrate this reconsider 
1For aimplidty we ignore connectivity of D~Xs. If con- 
nectivity is to be included in the definitions, then in this 
case Q must he the singleton {qo}. 
- 169 - 
the example in (la). Originally the subcategori- 
sation requirements of became might have been 
stated with the disjunctive specification \[CAT~.- 
GORY NP V AP\]. This could be satisfied by either 
an NP or an AP, but not by a conjunctive com- 
posite composed of an NP and an AP, i.e., not by 
the result of conjoining an NP and an AP. To al- 
low this we respecify the requirements on the sub- 
categorised for object as \[CATEGORY NP~tAP\]. 
This requirement may be legitimately met by ei- 
ther an NP or an AP or a conjunctive composite 
NP~AP. 
Composite Feature 
Structures 
This use of an algebra of atomic values allows 
composites only to be formed at the atomic level. 
That is, whilst we may form a ,'~/3 for a, f~ atomic, 
we may not form a ~/3 where a,/3 are non-atomic 
feature structures. However, such composites do 
appear to be useful, if not necessary. In par- 
ticular, in an HesG-like theory, the appropriate 
thing to do in the case of coordinate structures 
seems to be to form the composite of the HEAD 
features of all conjuncts. The above approach 
to composite atoms does not immediately gen- 
eralise to allow composite feature structures. In 
particular, whilst the intuitive behaviour of the 
connectives should remain as above, the seman- 
tic domain must be revised to allow a satisfactory 
rendering of satisfiability. 
With regard to syntax we revert back to an 
unstructured domain A of atoms but augment 
the system of Kasper & Rounds (1990) with two 
clauses licensing composite formulae: 
• A & is a valid formula if q) is a finite set, each 
element of which is a valid formula; 
e ~ (I) is a valid formula if (I) is a finite set, each 
element of which is a valid formula. 
The generalisation of satisfiability holds for 
composite disjunction: 
• .A ~ ~4 & iff .A ~ ,'~ 4' for some subset (I)' of 
(I'. 
We must alter the semantic domain, the domain 
of deterministic finite state automata, however, 
to allow a sensible rendering of satisfaction of 
composite conjunctive formulae -- we need some- 
thing like composite states to replace the compos- 
ite atomic values of the preceding section. 
In giving a semantics for ~ we take advantage 
of the equivalence of ,'~ {a} and a. We begin by 
generalising the notion of a deterministic finite 
state automaton such that the transition function 
maps states to sets of states: 
A generalised deterministic finite state automa- 
ton (GDFSA) is a tuple (Q, qo, L, 6,A, 7r), where 
• Q is a set of atoms known as states, 
• qoEPow(Q) is a distinguished set of states 
known as the start state set, 
• L is a set of atoms known as labels, 
• 6 is a partial function from \[Q x L\] to 
Pow(Q), 
• A is a set of atoms, and 
• ~ is a partial assignment of atoms to final 
states. 
Any DFSA ,A "-- (Q, qo,L,~f,A,~) has a corre- 
sponding ¢DFSA .A' given by (Q, {q0}, L, 6', A, It) 
where 6'(q, I) ----{6(q, l)}. 
Given a GDFSA .4 we define satisfiability of 
conjunctive, disjunctive and atomic formulae as 
usual. There is a slight differences in satisfiabil- 
ity of path equations: 
• .A ~ l : ~b: iff .Aft is defined and .A/! ~ ~, 
where if ~4: = (Q, {q}, L, 6, A, ~), then .Aft = 
(Q, 6(q, 0, L, 6, A, 
This clause has been altered to enforce the re- 
quirement that q0 be a singleton, and that 6 maps 
this single element to a set. 2 
The extensions for V and ~ are: 
• .A ~ V • iff .A ~ ,~ (I) I for some subset (I)~ of 
4~ (as above). 
• .A ~ ~ iff for each ~b E 4~, there exists a 
q' E q0 such that (Q,{q'},L,6,A,7 0 ~ ~. 
Note that in the case of • a singleton, this last 
clause reduces to .A ~ ,'~ {~} iff ¢4 ~ d. 
The reason why the satisfiability clauses for 
these connectives are so simple resides principally 
in the equivalence of ,~ {a} and a. We cannot fol- 
low this approach in giving a semantics for stan- 
dard set valued attributes because in the case of 
sets we want {~} and ~ to be distinct. 
2Again we are ignoring connectivity. 
- \]70 - 
Properties of Composites 
The properties of composite feature structures 
and the interaction of ~ and ~ may be briefly 
summarised as follows: 
• Disjunctive composite feature structures are 
a syntactic construction. Like disjunctive 
feature structures they exist in the language 
but have no direct correlation with objects 
in the world being modelled. 
* Conjunctive composite feature structures de- 
scribe composite objects which do exist in 
the world being modelled. 
* A disjunctive composite feature structure de- 
scribes an object just in ease one of the dis- 
juncts describes the object, or it describes a 
composite object. 
• A disjunctive composite feature structure de- 
scribes a composite object just in case each 
object in the composite is described by one 
of the disjuncts. 
• A conjunctive composite feature structure 
describes an object just in case that object 
is a composite object consisting of objects 
which are described by each of the descrip- 
tions making up the conjunctive composite 
feature structure. 
The crucial point here is that conjunctive 
composite objects exist in the described world 
whereas disjunctive composite objects do not. 
An Example 
To illustrate in detail the operation of composites 
we return to the example of (la). In an nPSG-like 
formalism (see Pollard & Sag (1987)) employing 
composites, the object subcategorised for by be- 
came would be required to satisfy: 
I SYNILO C I HEAD L SUBCAT 
According to our satisfiability clauses above, 
this may be satisfied by: 
• an AP such as famous, having description 
PHON 
SYN\[LOC HEADIMAJ SUBCAT 
• an NP such as a complete snob, having de- 
scription 
PHOt~ a complete snob \] 
s..cA, <)jj 
• or ~a AP ~ NP such as famous and a com- 
plete snob, having description s 
sPHON famous and a complete snob " 
The subcategorlsation requirements may not, 
however, by satisfied by, for example, a PP, or any 
conjunctive composite containing a PP. Hence the 
examples in (3) are not Mmitted. 
Implementation Issues 
The problems of implementing a system involv- 
ing composites really stem fromtheir requirement 
for a proper implementation of disjunction. Im- 
plementation may be approached by adopting a 
strict division between the objects of the language 
and the objects of the described world. Accord- 
ing to this approach, and in Prolog, Prolog terms 
~re taken to correspond to the objects in the se- 
mantic domain, with Prolog clauses being inter- 
preted much as in the syntax of an equational 
logic, as constraints on those terms. Conjunctive 
constraints correspond to unification. The for- 
mation of conjunctive composites is also no prob- 
lem: such objects exist in the semantic domain, so 
structured terms may be constructed whose sub- 
terms are the elements of the composite. Thus 
if we implement the composite connectives as bi- 
nary operators, * for ~ and + for ~, we may 
form Prolog terms (A * B) corresponding to con- 
junctive composites. Disjunction, and the use of 
disjunctive composites, cannot, however, be im- 
plemented in the same way. The problem with 
disjunction is that we cannot normally be sure 
which disjunct is appropriate, and a term of the 
form (A + B) will not unify with the term A, as 
is required by either form of disjunction. The 
freeze/2 predicate of many second generation 
Prologs provides some help here. For standard 
aWe assume that the rule licensing coordinate struc- 
tures unifies all corresponding values (such as the vahies 
for each SUBCAT attribute) except for the values of the 
HEAD attributes. The value of the HEAD attribute of the 
coordinate structure is the composite of the values of the 
HEAD attribute of each conjunct. 
171 - 
disjunction, we might augment feature structure 
unification clauses (using <=> to represent the 
unification operator and \/to represent standard 
disjunction) with special clauses such as: 
A <=> CA1 \1 A2) :- 
freeze(A, ((A <=> hl);. 
(A <=> A2))) 
Similarly for composite disjunction, we might 
augment the unification clauses with: 
A <=> (AI + A2) :- 
freeze(A, ((A <--> A1); 
(A <=> A2); 
CA <=) (A1 * A2)))) 
The idea is that the ~reeze/2 predicate de- 
lays the evaluation of disjunctive constraints un- 
til the relevant structure is sufficiently instanti- 
ated. Unfortunately, "sufficiently instantiated" 
here means that it is nonvar. Only in the case 
of atoms is this normally sufficient. Thus the 
above approach is suitable for the implementa- 
tion of composites at the level of atoms, but not 
suitable in the wider domain of composite feature 
structures. 
Concluding Remarks 
In giving a treatment of coordination, and in 
particular cross-categorial coordination, within a 
unification-based grammar formalism we have in- 
troduced composite feature structures which de- 
scribe composite objects. A sharp distinction is 
drawn between syntax and semantics: in the se- 
mantic domain there is only one variety of com- 
posite object, but in the syntactic domain there 
are two forms of composite description, a con- 
junctive composite description and a disjunctive 
composite description. Satisfiability conditions 
are given for the connectives in terms of a gener- 
alised notion of deterministic finite state automa- 
ton. Some issues which arise in the Prolog imple- 
mentation of the connectives are also discussed. 
REFERENCES 
Cooper, Richard. Classification-Based Phrase 
Structure Grammar: an Extended Revised 
Version of :HPSG. Ph.D. Thesis, University 
of Edinburgh. 1990. 
Kasper, Robert & William Rounds. A Logical 
Semantics for Feature Structures. In Pro- 
ceedings of the ~4 th ACL, 1986, 257-265. 
Kasper, Robert & William Rounds. The Logic 
of Unification in Grammar. Linguistics and 
Philosophy, 13, 1990, 35-58. 
Pollard, Carl & Ivan Sag. Information-Based 
Syntax and Semantics, Volume 1: Funda- 
mentals. 1987, CSLI, Stanford. 
Proudian, Derek & David Goddeau. Constitu- 
ent Coordination in HPSG. CSLI Report 
#CSLI-87-97, 1987. 
Rounds, William & Robert Kasper. A Complete 
Logical Calculus for Record Structures Rep- 
resenting Linguistic Information. In Proceed- 
ings of the 1 °t IEEE Symposium on Logic in 
Computer Science, 1986, 38-43. 
Sag, Ivan, Gerald Gazdar, Thomas Wasow and 
Steven Weisler. Coordination and How to 
Distinguish Categories. Natural Language 
and Linguistic Theory, 3, 1985, 117-171. 
Shieber, Stuart. Parsing and Type Inference for 
Natural and Computer Languages. Ph.D. 
Thesis, Stanford University, 1989. 
ACKNOWLEDGEMENTS 
This research was carried out at the Cen- 
tre for Cognitive Science, Edinburgh, under 
Commonwealth Scholarship and Fellowship Plan 
AU0027. I am grateful to Robin Cooper, William 
Rounds and Jerry Seligman for discussions con- 
cerning this work, as well as to two :anonymous 
referees for their comments on an earlier version 
of this paper. All errors remain, of course, my 
own. 
- 172 - 
