Coordination in an A xlomatl  Grammar* 
David Milwa.rd 
University of Cambridge 
Computer Laboratory 
New Museums Site, Pembroke Street, Cambridge, CB2 3QG, England 
drml@cl.cam.ac.uk 
Abstract 
For some time there has been interest in the idea of 
parsing as deduction. Here we present a grammati- 
cal tbrmalism, 'Axiomatic Grammar', which is based 
upon a small number of linguistically motivated ax-- 
toms and deduction rules. Each axiom or rule com- 
bines a 'category' with a string of words to form a 
further category. This contrasts with the usual 'tree- 
structure' approach to syntactic analysis where con- 
stituents are combined with each other to form a 
further constituent. 
We describe a grammar for English which has 
a good coverage of 'non-constituent' coordination. 
The grammar has been integrated with a toy se- 
mantics, and has been implemented in a left-to- 
right parser with incremental semantic interpreta- 
tion. The parser does not suffer fi'om spurious am- 
biguity. 
I Introduction 
Coordi:uation is a particularly troublesome phe- 
nomenon to account for in theories of syntax based 
upon phrase structure rules. Acceptable examples of 
'non-constituent' coordination such as: 
(1) Jolhn gave Mary a book and Peter a paper 
(2) Ben likes and Fred admires Mary 
have led some to abandon a single level of gram- 
matical description, and others to abandon phrase 
structure rules. 
An example of the former approach is Modi- 
fier Structure Grammar (Dahl and McCord, 1983), 
which was justified as tbllows: 
... it appears that a proper and general 
treatment must recognise coordination as 
a "metagrmnmatieal' construction, in the 
sense that metarules, general system oper- 
ations, or 'second-pass' operations such as 
transformations, are needed for its formu- 
lation. 
Modifier Structure Grammar embeds its rules for co- 
ordination into the parsing algorithm (there are close 
*This research was supported by an SE\[IC resem'ch 
studentship. 
parallels with the SYSCONJ system (Woods, 1973)). 
In order to parse sentence (2), the state of the parser 
at the point immediately before 'Fred' is matched to 
the state immediately before 'Ben'. 'Fred admires' is 
then parsed, and the resulting state is merged with 
the state after parsing 'lien likes'. 
The alternative approach to dealing with coordi- 
nation uses a single level of grammatical description, 
but uses a weaker notion of constituency than phrase 
structure grammar. It is presently exemplified by 
proposals to extend Categorial Grammar with For- 
ward Composition, the Product operator, Subject 
Type-Raising etc. 
Categorial Grammar, just like phrase structure 
grammar, is based upon the combination of one 
or more constituents to form a further constituent. 
In order to deal with coordination, the category 
(X\X)/X 1 is assigned to the conjunction, or, n\]ore 
usually, a phrase structure rule is invoked of the form: 
X-~X conj X 
In either case, each conjunct has to be assigned a 
category. Extensions to Categorial Grammar pro- 
vide a greater coverage of coordination phenomena 
by allowing a greater number of strings to form cat- 
egories. For example, to accept both (2) and (3), 
(3) Ben likes Mary and admires Jane 
an extended grammar must allow 'Ben likes' to form 
a category which can combine with 'Mary' to form a 
sentence, and 'likes Mary' to form a category which 
can combine with 'Ben' to form a sentence. The 
consequence of this is that the simple sentence 'Ben 
likes Mary' can be assigned at least two different syn- 
tactic structures: (Ben (likes Mary)) or ((Ben likes) 
Mary), which both correspond to the same readi,g 
(the sentence is spuriously ambiguous according to 
the grammar). 
Axiomatic Grammar avoids the problem of spu- 
rious ambiguity by avoiding the need to assign cat- 
egories to conjuncts. Although the formalism was 
developed during research into extended Categorial 
Grammar, the separation of grammatical informa- 
tion into axioms and rules makes its treatment of 
coordination look similar to that in a metalevel ap- 
proach such as Modifier Structure Grammar. '\]'his 
1 Capita\] letters will be used to denote variables througl~out 
this paper. 
l 207 
similarity is easiest to show if we introduce the cen- 
tral notion of 'category transition' through the idea 
of state transition. 
Consider a left-to-right parse of the sentence 'a 
man sits' based upon a phrase structure grammar 
including the rules: 
s --~ np vp np --+ det n 
Initially we start in a state expecting a sentence 
(which we can encode as a list (s)). After absorb- 
ing the determiner, 'a', we can move to a new state 
which expects a noun followed by a verb-phrase (en- 
coded as a list (n,vpl). Following the absorption of 
'man', we can move to a state expecting just a verb- 
phrase, and following the absorption of 'sits' we have 
a successful parse since there is no more input and 
nothing more expected. The transitions between the 
encodings of the states are as follows: 
(s) + "a" --* (n,vp) where a:det 
(n,vp} + "man"-+ (vp} where man:n 
(vp) + "sits" ~ 0 where sits:vp 
('a:det' means lhat the word "a' is a determiner) 
Instead of deriving these transitions from the phrase 
structure rules, consider directly supplying axioms of 
the form: 
<s> + "w" --+ <n,vp) where W:det 
(n,vp> -{- :'W" ---+ (vp> where W:n 
<vp} + "W" --+ <> where W:vp 
If constituent names are replaced by category speci- 
fications, generalisations become possible. The three 
axioms: 
<s) + "W" --* <n,s\np) where W:np/n 
(n,s\np) + "W" --+ (s\np) where W:n 
(s\np) + "W" --* (> where W:s\np 
('np/n' is an np requiring a noun on its right, and 
's\np' is a sentence requiring an up on its left) 
are instantiations of the axioms: 
<X) ® R + "W" -+ <Z,X\Y) • R where W:Y/Z 
(X) ® R + "W" --+ R where W:X 
('X' is the head and 'R', the tail of the list encoding 
the state. '.' denotes concatenation, so '(n,s\np}' is 
equivalent to '(n) • <s\np>') 
The 'encoded states' will roughly correspond to 
'principal' categories in Axiomatic Grammar, and 
the axioms above to the axioms of Prediction and 
Composition. 
The rule for coordination in Axiomatic Grammar 
is stated in terms of principal category transition. 
For example, the acceptability of sentence (2) is de- 
pendent upon a proof that the two strings "Ben likes" 
and "Fred admires" both take us from the initial 
category (corresponding to a parsing state expect- 
ing a sentence) to a second category (corresponding 
to a state expecting a noun-phrase). The rule will 
be stated formally after a general description of the 
formalism. 
2 The Basics 
Axiomatic Grammar is mainly lexically based, with 
lexical entries containing both subcategorisation and 
order information. An association of a word with 
a 'lexical' category is given by an expression of the 
form: 
word: LEX-CAT 
Each lexical category is a feature valued structure. 
The features of interest are 'cat', which gives the 
base type of the category ('s','np', or 'n'), and 'left' 
and 'right' which contain lists of 'arguments'. Each 
argument is itself a lexical category. Categories are 
complete if the argument lists are empty. As an ex- 
ample, consider the lexical entries for the determiner 
'the' and the transitive verb 'likes': \[ \[ 
cat = np 
left = 0 
the: 
right = ( 
"ca% -- 8 
left = ( 
likes: 
right = ( 
\[ 
ca% -~ n \] I left = 0 ) 
righZ = 0 
\] cat =np left = 0 ) right = 0 
cat = ~p \] 
left = 0 J ) right = 0 
We can read the category for 'likes' as follows: given 
a complete noun-phrase on the left and a complete 
noun-phrase on the right, we can form a complete 
sentence. It is Worth comparing this category with 
the category generally assigned to 'likes' by a Cate- 
gorial Grammar: 
(S \ NP)/ NP 
The categories differ in two respects. Firstly, the 
Categorial Grammar category not only provides in- 
formation as to what is on the left, and what is on 
the right, but also determines the order in which each 
argument is to be absorbed (in the above, the argu- 
ment on the right must be absorbed first, followed 
by the argument on the left). Secondly, whereas 
the Categorial Grammar category would be regarded 
as having the syntactic type 'np~-~(np--+s)', the Ax- 
iomatic Grammar category is regarded as having the 
base type 's'. This difference has a bearing on the 
treatment of modifiers (discussed later). 
When a string of words is absorbed it causes a 
transition between principal categories. A princi- 
pal category is again a feature structure, the feature 
of interest being the 'right' feature i.e. the list of 
argmnents 2 required on the right. A parse of a sen- 
tence consists of a proof that, starting with a prin- 
cipal category which requires a sentence, we can end 
2 Arguments are again lexical categories. 
208 2 
up with a complete principal category. I'br example, 
Lo prove that 'Ben sits' is a sentence we prove the 
statement 3 
1 (} ) "Ben sits" r = 0 4 
r () 
llenceforth, the convention is adopted that left or 
right argument lists which are not specified are 
empty. This allows us to rewrite the statement above 
rather more compactly as: 
A proof of a parse is performed using rules and ax- 
ioms. An a×aom declares that a string of words per- 
{brms a transition between two principal categories. 
Axioms are either simple statements, or restricted 
~tatements of the form: 
Co String C1 where .... 
Three axiorns will be discussed here 5. The first, Iden- 
i;ity, merely declares that an empty string performs 
the identity transition i.e. 
Co ";; C0 
The other axioms, Prediction and C, omposition, work 
on strings consisting of a single word. They have the 
format: 
Co "W" C1 where W:LEX-CAT 
rFhe flfll definitions, given in Figure 1, should become 
clearer as we work through an example. 
A deduction rule in Axiomatic Grammar declares 
that a string of words performs a transition between 
t~vo principal categories provided that certain sub- 
st.rings perform certain transitions i.e. rules have the 
format6: 
Co String0 C1, ... ,C~ String, C,~+1 
Ca String C~ 
(subscripted strings are substrings of'String') 
The consequent of a rule (the statement tinder the 
line) can be proved by proving all the antecedents 
(the staternents above the line). 
3Feature names arc abbreviated in an obvious manner. 
Simple statements have the general form: 
Co String C1 
('Co'and 'C1' are principal categories, and 'String' is a string 
4 The corresponding state transition would be: 
(s) + "Ben sits" -+ 0 
5A fourth axiom is used for topicalisation. 
6This is actually the form of simple rules. As with aximr~s, 
rules may be restricted using a 'where' clause. 
3 An Example Proof 
In order to prove that 'Ben sits' is a sentence, we need 
to use all the axioms, and two rules, Sequencing and 
Optional Reduction. The relevant proof tree is given 
in Figure 2 7. 
The Prediction Axiom is restricted in English to 
the case where a category requires a sentence on the 
right, and the word encountered has a lexical cate- 
gory of base type noun-phrase. Thus starting with 
the principal category: 
we can absorb the proper-name 'BeeF, which has 
the lexical category, \[c = up l, to form a princi- 
pal category, 'c0', which requires first an optional 
noun-phrase modifier (e.g. a non-restrictive relative 
clause), and then a sentence which requires a noun- 
phrase (a verb-phrase) i.e. 
= ,,p\]l ' np\]/ I 
(the use of parenlheses around the base type of the 
noun-phrase modifier denotes optionalily s) 
Writing this as a statement in the logic, we have a 
proof that: 
\[r = (\[c = s\])\] "Ben" cO 
The Sequencing Rule 9 is used to combine the effects 
of the absorption of two strings. The rule declares 
that if one string defines a transition from Category0 
to Category1, and another defines a transition from 
Category1 to Category2, then the combined string 
defines a transition from Category0 to Category2 i.e. 
Co String0 Ca, CI String 1 C2 
Co String 0 ® String 1 C~ 
(here 'e' denotes concatenation of word strings e.g. 
"Ben"® "sits" is equivalent to "Ben sits") 
For this example, we can instantiate the Sequenc- 
ing Rule as follows: 
i j, \] " eo" cO, co \[ \] 
It: ,,B0o \[ \] 
7At this stage no restrictions have been imposed upon the 
ordering of the rules, and more than one proof tree is possible. 
However, it is relatively trivial to prove the existence of a 
nolwnal proof strategy which supplies a single proof tree for a 
given sentence and a possible semantics (Milward, 1990). 
8Parentheses are used as a shorthand. The lexical cate- 
gories ill argument lists actually include a featm'e 'opt', which 
is set to an uninstantiated variable when the argument is op. 
tional, to 'true' if the argument is compulsory. 
9The name 'Sequencing Rule' is due to a loose correspon- 
dence between the grammar and the Floyd Iloare Ihlles for 
Axiomatic Semantics of programming languages 
3 209 
COMPOSITION 
:r = ( 1 L ). r = R',( 
r R 
"1 1 
c = (X) \]),R"\[ where J J !=x W: = L 
RoR' 
PREDICTION 
' 1 ( c = Y ) ).R" 
('.' denotes concatenation of lists. Optional arguments have the value of the 'cat' feature in parentheses. 
X may be instantiated to the base types 's','np', or 'n'; L,I~,R' and R" lo lists of categories) 
Figure 1: Axioms 
.............. IDENTITY \[\]'"' \[1 
................. COMPOSITION .............. OPTIONAL REDN i-.1 
L d 
............................................................. SEQUENCING 
cl"sits" \[\] 
............................................ PREDICTION ................... OPTIONAL REDN 
........................................................................................................... SEQUENCING 
( "cO','cl' and 'c2' are principal categories mentioned i'a the text) 
Figure 2: Proof Tree for 'Ben sits' 
We can thus obtain a proof of the whole sentence by 
proving the antecedents to the rule. The first, has 
already been proved, so we are left to prove: 
cO "sits" \[\] 
The head of the argument list of cO is an optional 
noun-phrase modifier, Optional categories at the 
head of the argument list of a principal category can 
be deleted by the use of the Optional Reduction Rule 
which is as follows: 
\[r = R"\] String C 
r = ( 1 = L )* String C 
r = R 
We instantiate the Optional Reduction Rule to: 
cl "sits" \[\] 
c0"sits"\[\] 
in which 'el' is cO without the optional modifier i.e. 
r=( ( c=np) ) 
Tile proof now consists of proving the antecedent of 
tile Optional Reduction Rule i.e. 
r = ( (\[c : ,p\]/ ) 
This can be proved using first the Composition Ax- 
iom, then tile Sequencing Rule followed by Optional 
Reduction, and finally the Identity Axiom. 
The Composition Axiom 1° absorbs a word which 
has the same base category as the head of the a~r- 
gument list of a principal category. Since the word 
'sits' has the following category: 
\[c_-s \] 
i = (\[c: np 
the Composition Axiom can be used to absorb 'sits' 
and get us to the category 'c2" 
l°The name 'Composition' is due to the sinfilarity with 
the rule of generalised I~rward Composition in a Catcgorial 
G r&mlll&r. 
210 4 
Using the Sequencing Rule once more, we can prove 
the whole given a proof of 
r:l i (\[~:~\])) .... 
which can be proved by first invoking the Optional 
Reduction Rule. The optional sentcntial modifier is 
then deleted, leaving ,is with a proof of 
\[\] .... \[\] 
which is true by the Identity Axiom. 
4 tgul.es and Lexical Items 
So tar we have introduced three axioms which are 
used by the grammar, and two rules. Before consid- 
ering further rules it is worth discussing the grammar 
as it stands. 
The effect of the axioms, Prediction and Compo- 
sition, is to absorb a word and to predict an op- 
tional modifier for the base type. For exarnple, in 
parsing 'the girl' a noun-phrase modifier is predicted 
after parsing 'the' and a noun-modifier is predicted 
after parsing 'girl'. Thus, given a treatment of non- 
restrictive relatives, we could parse something like: 
(4) The girl outside, who has been waiting a long 
time, looks frozen 
Moreover, alter parsing a noun modifier, another 
noun modifier is predicted (the base type of a noun 
modifier is, after all, a noun). Thus we could also 
parse 
(5) The girl outside in the red dress with the large 
man .... 
Although the treatment of noun and noun-phrase 
modification looks reasonably traditional, the treat- 
ment of verbal modification is less so. Since the base 
type of a verb is a sentence, a modifier for the verb 
has the same type as a sentential modifier. For ex- 
ample, in: 
(6) John hit the ball with a racket 
the action of the Composition Axiom is to add an 
optional sentential modifier onto tile end of the sub- 
categorisation list of the verb 'hit', and then to add 
this list onto the list of expected arguments i.e. after 
absorbing "hit" the principal category becomes: 
":= I ~ = np , i: (\[~ = s\]l 
A successful proof of the sentence is achieved by giv- 
ing 'witlf a lexical entry: 
Sentences such as 'John decided to sack Mary in se- 
cret? are correctly treated as being structurally am- 
biguous, since 'in secret' may modify the 's' intro- 
duced by 'decided' or the %' introduced by 'sack'. 
The grammar which has been described so far ira=. 
poses a strict notion of word order. This seems par- 
ticularly inappropriate fbr relative clauses which can 
be extraposed from a position following the subject 
noun-phrase to after the verb-phrase. Consider the 
sentence: 
(7) Children arrived who only spoke English 
The present grammar treats this case by allow- 
ing heavy noun and noun-phrase modifiers to swa.p 
places with categories having a base type ~s'. Thus 
the principal category created afLer absorbing "Chil- 
dren" : \[ \]\[ 
can be transformed into: 
i <\[~ : .,~p\]> ) 
r : / 1 /It : ' : / 
The possibility is being considered of replacing lists 
of arguments by sets of arguments associated with 
linear precedence constraints (along the lines of 
work done on bounded discontinuous constituency 
(Rcape, 1989)). 
Finally, let us consider the particular restriction 
which was made to the Prediction Rule for English. 
The effect of the restriction is that the only accept- 
able lexical entries with left arguments are either of 
the form 
1 = ( c =np ) or 1 = ( c = X ) 
r= R r = R 
i.e. verbs (which require a noun-phrase subject on 
their left), or modifiers of the base types. 
5 The Coordination Rule 
The Coordination Rule is as follows: 
Co String0 C1, Co String 1 C1 
Co String0 ® "W" • String I C1 
where W 6 {and,or,but\] 
This contrasts with the phrase structure rule: 
X --, X conj X 
which can be expressed in deduction rule format as: 
String 0 : X, String 1 : X 
String 0 g "W" • String I : X 
where W 6 {and,or,but} 
Both rules allow nested and iterated conjunction, 
5 211 
however, whereas the phrase structure rule enforces 
that conjuncts are of the same category, the Coordi- 
nation Rule enforces that each conjunct defines the 
same transition between principal categories. 
We can show the expressive power of the Coordi- 
nation Rule by considering some examples. The first 
is an example of 'unbounded' Right-Node Raising: 
(8) John admires, but Mary thinks he loves, the new 
teacher 
This can be proved by separately proving that both 
"John admires" and "Mary thinks he loves" perform 
a transition between the initial principal category, 
\[r = (\[c = s\]) \] and the category: 
r = ( ~ = np , 1 (\[¢ = s\]) ) 
The proof is completed by proving that "the teacher" 
defines a transition between this category and the 
complete principal category, \[ \]. 
The second example involves sharing on both the 
right and the !eft: 
(9) He lent John a book, and Mary a paper, about 
subjacency . 
This example, which has been used to argue for the 
addition of recta-rules to Categorial Grammar (Mor- 
rill, 1987), is of interest when the required reading is 
where the noun modifier 'about subjacency' applies 
to both the book and the paper. To prove the sen- 
tence we first prove that "He lent" performs a tran- 
sition between the initial principal category and the 
category: 
We then prove separately that "John a book", and 
"Mary a paper" perform a transition between this 
category and the category: 
\[ 1\[ 1\] r : I / : nil = '\]1 / 
Finally we prove that the string "about subjacency" 
takes us from this category to the complete category, \[\]. 
The basic grammar does accept some sentences 
which are generally regarded as unacceptable, and 
extra features are needed to constrain the rules. The 
situation with the basic grammar is not, however, 
as bad as with many extended Categorial Gram- 
mars. The Coordination Rule enforces a parallelism 
between conjuncts in a similar manner to the par- 
allelism enforced by the phrase structure rule men- 
tioned above. This can be contrasted with assign- 
ing conj unctions the polymorphic category (X\X)/X, 
which allows sentences like: 
(10) John likes Mary and, or Peter likes Joan and 
Anne 
Further parallelism is enforced by tile particular 
treatment of wh-movement n which, for example, 
predicts the acceptability of (11) but not of (12): 
(11) The book arrived which John had shown Mary 
and given to Peter 
(12) *The book arrived which John had shown 
Mary and to Peter 
6 Conclusion 
This paper has introduced Axiomatic Grammar, and 
has given some justification for particular axioms and 
rules chosen for English. The formalism itself has 
been left very much underspecified, and further re- 
search is required both into its applicability to other 
languages, and into its formal properties. 
A larger grammar for English has been imple- 
mented, including a treatment of wh-movement and 
verbal ellipsis (gapping). The parser works word- 
by-word from left-to-right, and was designed so that 
incorporation of the coordination rule does not slow 
down parsing in general. 
Axiomatic Grammar fits in naturally with an in- 
cremental approach to semantic interpretation, or 
with semantics based upon state change. The 
present, grammar is integrated with a toy semantics 
based upon the incremental (but non-monotonic) ac- 
cumulation of constraints. 
11 The rules for wh-movement involve the use of a featm'e 
on principal categories which 'stacks' extracted elements, and 
the use of further features to control extraction sites. 

References 

Dahl, Veronica and McCord, Michael C. (1983). 
Treating Coordination in Logic Grammars. 
Computational Linguistics, 9(2):69-91. 

Milward, David R. (1990). Ph.D. Thesis. Forthcom- 
ing. 

Morrill, Glyn (1987). Meta-Categorial Grammar. 
In Haddock, Nicholas, Klein, Ewan, and Mor- 
rill, Glyn (eds.), Categorial Grammar, Unifica- 
tion Grammar and Parsing, Centre for Cognitive 
Science, University of Edinburgh. 

Reape, Mike (1989). A Logical Treatment of Semi- 
Free Word Order and Bounded Discontinuous 
Constituency. In European ACL, pages 103-110. 

Woods, W. (1973). An Experimental Parsing System 
for Transition Network Grammars. In l%ustin, R. 
(ed.), Natural Language Processing, Algorithmics 
Press. 
