On compositional semantics 
WIodek Zadrozny 
IBM Research 
T. J. Watson Research Center 
Yorktown Heights, NY 10598 
WLODZ @ WATSON.1BM.COM 
Abstract. We prove a theorem stating that any 
semantics can be encoded as a compositional 
semantics, which means that, essentially, the 
standard definition of compositionality is for- 
mally vacuous. We then show that when one re- 
quires compositional semantics to be 
"systematic" (that is the meaning function can- 
not be arbitrary, but must belong to some class), 
one can easily distinguish between compositional 
and non-compositional semantics. We also pre- 
sent an example of a simple grammar for which 
there is no "systematic" compositional seman- 
tics. This implies that it is possible to distinguish 
"good" and "bad" grammars oll the basis of 
whether they can have compositional semantics. 
As a result, we believe that the paper clarifies the 
concept of compositionality and opens a possi- 
bility of making systematic comparisons of dif- 
ferent systems of grammars and NLU programs. 
l.lntroduction. 
Compositionality is defined as the property that 
the meaning of a whole is a function of the me- 
aning of its parts (cf. e.g. Keenan and Faltz 
(1985),pp.24-25)? This definition, although intu- 
itively clear, does not work formally. For in- 
stance, Ilirst (1987) pp.27-43 claims that the 
semantics of Woods (1967) and (Woods, 1969), 
is not compositional, because "the interpretation 
of the word depart varies as different preposi- 
tional phrases are attached to it": 
AA-57 departs from Boston 
= > depart(aa-57, boston). 
AA-57 departs from Boston to Chicago 
= > connect(aa-57, boston, chicago). 
AA-37 departs from Boston on Monday 
= > dday(aa-57, boston, monday). 
AA-57 departs from Boston at 8:00 a,m. 
= > equal(dtime(aa-57, boston), 8:00am). 
AA-57 departs from Boston after 8:00 a.m. 
= > greater(dtime(aa-57,boston),8:l10am). 
AA-57 departs from Boston before 8:00 a.m. 
= > greater(8:0Oam,dtime(aa-57,boston)). 
Although this semantics does look like non- 
compositional, it is easy to create a function that 
produces the meanings of all these sentences 
from the meanings of its parts -- we can simply 
define such a function by cases: the meaning of 
departs~from/ is connect, the meaning of 
departs/from/on is dday, and so on. Hirst there- 
fore changes the definition of compositionality 
to "the meaning of a whole is a Lystematic me- 
aning of the parts" (op. cit. p.27.; tile emphasis 
is ours), but without defining the meaning of the 
word "systematic." 
in this paper we show that, indeed, tlirst was 
right in assuming that the standard definition of 
compositionality has to be amended. Namely, 
we prove a theorem stating that any semantics 
can be encoded as a compositional semantics, 
which means that, essentially, the standard defi- 
nition of compositionality is formally vacuous. 
We then show that when one requires composi- 
An equivalent definition, e.g. Partee et al. (1990), postulates the existence of a homomorphism from syntax to semantics. 
ACTES DE COLING-92, NANrEs, 23-28 ^Ot~T 1992 2 6 0 PROC. OF COLING-92, NANTES, AUO. 23-28, 1992 
tional semantics to be "systematic" (i.e. the me- 
aning function must belong to some class), one 
can easily distinguish between compositional 
and non-compositional semantics. We also give 
an example of a simple grammar lor which there 
is no "systematic" compositional semantics". 
This result implies that it is possible to distin- 
guish "good" and "bad" grammars on the basis 
of whether they can have a compositional se- 
mantics with a meaning function belonging to a 
certain class. As a result, we believe that the 
paper finally clarifies the concept of composi- 
tionality and opens a possibility of making sys- 
tematic comparisons of different systems of 
grammars and NLU programs. 
2.Some compositional meaning 
function can always be found 
Compositional semantics, or CS, is usually de- 
fined as a functional dependence of the mean- 
ing of an expression on the meanings of its 
parts. One of the first natural questions 
we might want to ask is whether a set of NL 
expressions, i.e. a language, can have some CS. 
This question has been answered positively by 
van Benthem (1982). ttowever his result says 
nothing about what kinds of things shoukl be 
assigned e.g. to nouns, where, obviously, we 
would like nouns to be mapped into sets of en- 
tities, or something like that. That is, we want 
semantics to encode some basic intuitions, e.g. 
that nouns denote sets of entities, and verbs de- 
note relations between entities, and so on. 
So what about having a compositional semantics 
that agrees with intuitions? That is, the ques- 
tions is whether after deciding what sentences 
and their parts mean, we can find a function that 
would compose the meaning of a whole from 
the meanings of its parts. 
The answer to this question is somewhat dis- 
turbing. It turns out that whatever we decide 
that some language expressions should mean, it 
is always possible to produce a ffmction that 
would give CS to it (see below tor a more precise 
formulation of this fact). The upshot is that 
compositionality, as commonly defined, is not a 
strong constraint on a semantic theory. 
The intuitions behind this result can be illus- 
trated quite simply: Consider tile language 
of finite strings of digits from 0 to 7. Let's fix a 
random function (i.e. an intuitively bizarre func- 
tion) from this language into {0,1). Let the me- 
aning function be defined as the value of the 
string as the corresponding number in base 8 if 
tile value of the function is 0, and in base 10, 
otherwise. Clearly, the meaning of any string 
is a composition of the meanings of digits (notice 
that the values of the digits are the same in both 
bases). But, intuitively, this situation is different 
fi'om standard cases when we consider only one 
base and the meaning of a string is given by a 
simple lbrmula relizrring only to digits and their 
positions in the string, The theorem we prove 
below shows that however complex is the lan- 
guage, aqd whatever strange meanings we want 
to assign to its expressions, we can always do it 
compositionally. 
One of the more bizarre consequences of this 
fact is that we do not have to start building 
compositional semantics fbr NL beginning with 
assigning meanings to words. We can do equally 
well by assigning meanings to phonems or even 
LETTFRS, assuring that, for any sentence, the 
intuitive meaning we associate with it would be 
a lhnction of the meaning of the letters from 
which this sentence is composed. 
PROVING EXISTENCE OF C()MPOSI- 
TIONAL SEMANTICS 
Let S be any collection of expressions (intu- 
itively, sentences and their parts). Let M be a set 
s.t. for any seS, there is m=m(s) which is a 
member of M s.t. n, is the meaning of s. We 
want to show that there is a cmnpositional se- 
mantics for S which agrees with the function as- 
sociating m with re(s) , which will be denoted by 
re(x). 
Since elements of M can be of any type, we do 
not automatically have (for all elements of S) 
ACqES DE COL1NG-92, NANTES, 23-28 AO~r 1992 2 6 1 Pgoc. OF COLING-92. NANrES. AUG. 23-28, 1992 
m(s.t) = m(s)#m(t) (where # is some operation 
on the meanings). To get this kind of homo- 
morphism we have to perform a type raising 
operation that would map elements of S into 
functions and then the functions into the re- 
quired meanings. We begin by trivially extending 
the language S by adding to it an ~end of ex- 
pression H character $, which may appear only 
as the last element of any expression. The pur- 
pose of it is to encode the function re(x) in the 
following way: The meaning function tz that 
provides compositional semantics for S maps it 
into a set of functions in such a way that 
l~(s.t) = t~(s)(#(t)). We want that the original 
semantics be easily decoded from p(s), and 
therefore we require that, for all . s, 
I~(s.$) = re(s). Note that such a type raising op- 
eration is quite common both in mathematics 
(e.g. 1 being a function equal to 1 for all values) 
and in mathematical linguistics. Secondly, we 
assume here that there is only one way of com- 
posing elements of S -- by concatenation 2 but 
all our arguments work for languages with many 
operators as well. 
THEOREM. Under the above assumptions. 
There is a function ~t s.t, for all s, 
#(s.t) = #(s)(tt(t)) , and l~(s.$) = re(s). 
Proof. See Section 5.1. 
3.What do we really want from 
compositional semantics? 
In view of the above theorem, any semantics is 
equivalent to a compositional semantics, and 
hence it would be meaningless to keep the defi- 
nition of eompositionality as the existence of a 
homomorphism from syntax to semantics with- 
out imposing some conditions on this homo- 
morphism. Notice that requiring the 
computability of the meaning function won't do. s 
Proposition. If the original function re(x) is 
computable, so is the meaning function/~(x). 
Proof. See the proof of the solution lemma in 
Aczel (1987). 
3.1 What do we really want? 
We have some intuitions and a bunch of exam- 
ples associated with the concept of composi- 
tionality; e.g. for NP -> Adj N , we can map 
nouns and adjectives into sets and the concat- 
enation into set intersection, and get an intu- 
itively correct semantics for expressions like 
"grey carpet", "blue dog", etc. 
There seem to be two issues here: (1) Such pro- 
cedures work for limited domains like "everyday 
solids ~ and colors; (2) The function that com- 
poses the meanings should be "easily" definable, 
e.g. in terms of boolean operations on sets. This 
can be made precise for instance along the lines 
of Manaster-Ramer and Zadrozny (1990), where 
we argue that one can compare expressive 
power of various grammatical formalisms in 
terms of relations that they allow us to define; 
the same approach can be applied to semantics, 
as we show it below. 
3.1,4 simple grammar without a system- 
atic semantics 
If meanings have to be expressed using certain 
natural, but restricted, set of operations, it may 
turn out that even simple grammars do not have 
a compositional semantics. 
Consider two grammars of numerals in base 10: 
Grammar ND 
• N <--ND 
• N<--D 
• D <-0111213141516171819\[ 
And the second grammar 
2 We do not assume that concatenation is associative, that is (a.(b.c)) = ((a.h).c). Intuitively, this means that we assign se- 
mantics to parse trees, not to strings of words. But the method of proof can be modified to handle the case when concat- 
enation is associative. 
Also, note that in mathematics (where semantics obviously is compositional) we can talk about noncomputable functions, 
and it is usually clear what we postulate about them. 
AL-TRS DE COLING-92, NANTES. 23-28 AOUT 1992 2 6 2 PRoc. OF COLING-92, NANTES, AUG. 23-28, 1992 
Grammar DN 
• N <--DN 
• N<--D 
* D <--01112131415161718191 
For the grammar ND, the meaning of any nu- 
meral can be expressed in the model 
(Nat, +, x, 10) as 
#(N D) = 10 x #(N) + u(D) 
that is a polynomial in two variables with coef- 
ficients in natural numbers. 
For the grammar DN, we can prove that no 
such a polynomial exists, that is 
Theorem. There is no polynomial p in the vari- 
ables #(D), ~(N) such that 
#(D iV) = p(p.(D), #(N)) 
and such that the value of #(D N) is the number 
expressed by the string D N in base I0. 
Proof. See Section 5.2. 
But notice that there is a compositional seman- 
tics for the grammar DN that does not agree 
with intuitions: #(D N) = 10 × #(N) +/~(D), 
which corresponds to reading the number back- 
wards. And there are many other semantics cor- 
responding to all possible polynomials in #(D) 
and/I(N). 
Also observe that (a) if we specify enough values 
of the meaning function we can exclude any 
particular polynomial; (b) if we do not restrict 
the degree of the polynomial, we can write one 
that would give any values we want on a finite 
number of words in the grammar. 
The moral is that not only it is natural to restrict 
meaning functions to, say, polynomials, but to 
further restrict them, e.g. to polynomials of de- 
gree 1. Then by specifying only three values of 
the meaning function we can (a) have a unique 
compositional semantics lbr the first granmaar; 
(b) show that there is no compositional seman- 
tics tbr tile second grammar (directly from the 
proof of the above theorem). 
4. Conclusions 
4.1. Relevance for theories of grammar 
4. I. 1. On reduction of syntax to lexieal meanings 
T. Wasow on pp.204-205 of Sells (1985) writes: 
It is interesting that contemporary syn- 
tactic theories seem to be converging on 
the idea that sentence structure is gen- 
erally predictable from word meanings 
\[...\]. \[...\] The surprising thing (to lin- 
guist) has been how little needs to be 
stipulated beyond lexical meaning. \[_.\] 
The reader should notice that the meaning func- 
tion m in our main theorem is arbitrary. In par- 
ticular we can take re(s) to be the preferred 
syntactic analysis of the string s. The theorem 
then confirms the above observation: indeed, 
the syntax can be reduced to lexical meanings. 
At the same time, it both trivializes it and calls 
out for a deeper explanation. It trivializes the 
reduction to lexical meanings, since it also says 
that with no restriction on the types of meanings 
permitted, syntax can be reduced to the meaning 
of phonems or letters. The benefits of the re- 
duction to lexical meanings would have to be 
explained, especially if such meanings refer to 
abstract properties such as binding features, 
BAR, or different kinds of subcategorization. 
It is the view of this author (cf. Zadrozny and 
Manaster-Ramer (1997)) and, implicitly, of Fill- 
more et al. (1988), that such a reductionist ap- 
proach is inappropriate. But we have no room 
to elaborate it here. 
4.1.2. On good and bad grammars 
By introducing restrictions on semantic func- 
tions, i.e. demanding the systematicity of se- 
mantics, we can for the first time formalize the 
intuitions that linguists have had for a long time 
about "good" and "bad" grammars (cf. Manast- 
er-Ramcr and Zadrozny (1992)). This allows us 
ACrF~ DE COLING-92. NANTES, 23-28 AOI3T 1992 2 63 PROC. of COLING-92. NANTES, AUO. 23-28, 1992 
to begin dealing in a rigorous way with the 
problem (posed by Marsh and Partee) of con- 
straining the power of the semantic as well as the 
syntactic components of a grammar. 
We can show for instance (ibid.) that some re- 
strictions have the effect of making it in principle 
impossible to assign correct meanings to arbi- 
trary sets of matched singulars and plurals if the 
underlying grammar does not have a unitary rule 
of reduplication. Thus, grammars such as 
wrapping grammars (Bach (1979)), TAGs (e.g., 
Joshi (1987)), head grammars, LFG (e.g., Sells 
(1985)) and queue grammars (Manaster-Ramer 
(1986)), all of which can generate such a lan- 
guage, all fail to have systematic semantics for 
it. On the other hand, we can exhibit other 
grammars (including old-fashioned trans- 
formational grammars) which do have systcm- 
atic semantics for such a language. 
4.2. What kind of semantics for NL? 
In view of the above results we should perhaps 
discuss some of the options we have in semantics 
of NL, especially in context of NLU by com- 
puters. To focus our attention, let's consider the 
options we have to deal with the semantics of 
depart as described in Section 1. 
• Do nothing. That is, to assume that the se- 
mantics is given by sets of procedures asso- 
ciated with particular patterns; e.g. "X 
departs from Y" gets translated into 
"depart(X,Y)". 
• Give it semantics a la Montague, for in- 
stance, along the lines of Dowry (1979) (see 
esp. Chapter 7). Such a semantics is quite 
complicated, not very readable, and it is not 
clear what would be accomplished by doing 
this. However note that this doesn't mean 
that it would not be computational -- ttobbs 
and Rosenschein (1977) show how to trans- 
late Montagovian semantics into Lisp func- 
tions. 
Restrict the meaning of compositionality re- 
quiring for example that the meaning of a 
verb is a relation with the number of argu- 
ments equal to the number of arguments of 
the verb. If the PP following the verb is 
treated as one argument, there is no com- 
positional semantics that would agree with 
the intended meanings of the example sen- 
tences. This would formally justify tile argu- 
ments of t lirst. 
• Recognize that depending on the PPs the 
meaning of "X departs PP" varies, and de- 
scribe this dependence via a set of meaning 
postulates (Bernth and Lappin (1991) show 
how to do it in a computational context). In 
such a case the semantics is not given di- 
rectly as a homomorphism from syntax into 
some algebra of meanings, but indirectly, by 
restricting the class of such algebras by the 
meaning postulates. 
* Admit that the separation of syntax and se- 
mantics does not work in practice, and work 
with representations in which form and me- 
aning are not separated, that is, there cannot 
be a separate syntax, except in restricted 
domains or for small fragments of language. 
This view of language has been advocated 
by Fillmore et al. (1988), and shown in Za- 
drozny and Manaster-Ramer (1997), Za- 
drozny and Manaster-Ramer (1997) to be 
computationally feasible. 
• tiope that the meaning will emerge from 
non-symbolic representations, as advocated 
by the "connectionists." 
5.The proofs 
5.1. The Existence of compositional me- 
aning functions 
Let S be any collection of expressions (intu- 
itively, sentences and their parts). Let M be a 
set s.t. for any swhich is a member of S, there 
ism=m(s) which isa member of Ms.t.m is 
the meaning of s. We want to show that there 
is a compositional semantics for S which agrees 
with the function associating m with m(s) , 
which will be denoted by rn(x). To get the ho- 
Afzr~:s DE COLING-92, NANTES. 23-28 AO13T 1992 2 6 4 PROC. OI~ COLING-92. NANTES, AUO. 23-28, 1992 
momorphism from syntax to semantics we have 
to perfbrm a type raising operation that Would 
map elements of S into fhnctions and then the 
functions into the required meanings. 
As we have described it in Section 2, we extend 
S by adding to it the "end of expression" char- 
acter $, which may appear only as the last ele- 
ment of any expression. Under these 
assumptions we prove: 
THEOREM. There is a function ~t s.t, for all s, 
#(s.t) = #(s)(tt(t)) , and 
~(s.$) : m(s). 
Proof. Let /(0),1(1) .... , t(a) enumerate S. We 
can create a big table specifying meaning values 
for aU strings and their combinations. Then the 
conditions above can be written as a set of 
equations shown in the figure below 
t,(t(0)) ~ { < $, m(t(0)) >, < it(t(0)),/t(t(0).t(0)) > ..... < #(t(u)), it(t(O).t(a)) > .... } 
#(t(I)) ~ { < $, m(t(1)) >, < ~t(t(0)),/~(t(1).t(0)) > ..... </~(t(~)), tt(t(I).t(a)) > .... } 
tt(t(a)) = { < $, m(t(a)) >, < #(t(0)), ~(/(a).t(0)) > ..... < #(t(a)), #(t(~).t(a)) > .... } 
Continuing the proof: By the solution lemma 
(Aczel (1987) and Barwise and Etchemendy 
(1987)) this set of equations has a solution 
(unique), which is a function. 
To finish the proof we have to make sure that 
the equation #($)= $ holds. Formally, this re- 
quires adding the pair < $, $ > into the graph of 
tt that was obtained from the solution letmna. 
\[\] 
We have directly specified the function as a set 
of pairs with appropriate values. Note that that 
there is place for recursion in syntactic catego- 
ries. Also, if a certain string dues not belong to 
the language, we assume that the corresponding 
value in this table is undefined; thus # is not ne- 
cessarily defined for all possible concatenations 
of strings of S. 
Note: The above theorem has been proved in set 
theory with the anti-loundation axiom, ZFA. 
This set theory is equiconsistent with the stand- 
ard system of ZFC, thus the theorem does not 
assume anything more than what is needed for 
"standard mathematical practice". Furthermore, 
ZFA is better suited as foundations for seman- 
tics of NL than ZFC (Barwise and Etchemendy 
(1987)). 
5.2. A grammar without compositional 
semantics 
Vor the grammar DN, we can prove that no 
such a polynomial exists, that is 
1qaeorem. There is no polynomial p in the vari- 
ables #(D), #(N) such that 
#( D IV) = p(la( D ), I~( N) ) 
and such that the value of#(D P0 is the number 
expressed by the string D N in base 10. 
Proof. We are looking for 
~(~ ~0 = p(u(~, ~(O)) 
= #(D) x (1() tength(~) + p(N) 
where the function p must be a polynmnial in 
these two variables. But such a polynomial does 
not exist, since it would have to be equal to 
#(N) for p(N) in tile interval 0..9, and to 
/~(D) × 10+/~(N) for /~(N) in 10..99, and to 
#(D) × 100 + ~t(N) for ~t(N) in 100..999, and so 
on. And if the degree of this polynomial was less 
than l~, lbr some n, it would have to be equal 
identically to /~(D) × 10" +/~(N) , since it would 
share with it all the values in l@..10 ~ - 1, and 
therefore could not give correct values on the 
other intervals. 
AcrEs DE COLING-92, NANTES, 23-28 AOl\]'r 1992 2 6 5 PROC. OF COL1NG-92, NANTES, Au(L 23-28, 1992 
Acknowledgements. Alexis Manaster Ramer brought 
to my attention the need to clarify the meaning of 
compositionality, and commented on various aspects 
of this work. I also benefited greatly from a discussion 
with John Nerborme and exchanges of e-mail with 
Feruando Pereira and Nelson Correa. (Needless to 
say, all the remaining faults of the paper are mine). 
Parts of this paper were written at the University of 
Kaiserslautern; I'd like to thank Alexander yon Hum- 
boldt Stiftung for supporting my visit there. 
ALT~ DE COLING-92, NANT~, 23-28 AOt3"r 1992 2 6 6 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 

References 

P. Aczel (1987). Lectures on Non-Well-Founded Sets. 
Stanford, CA: CSLI Lecture Notes. 

E. Bach (1979). Control in Montague Grammar. 
Linguistic Inquiry, 515-531. 

J. Barwise & J. Etchemendy (1987). The Liar. New 
York, NY: Oxford University Press. 

A. Beruth & S. Lappin (1991). A meaning postulate 
based inference system for natural language. 
RC 16947, Yorktown lteights, NY: IBM T.J. 
Watson Research Center. 

D. R. Dowry (1979). Word Meaning and Montague 
Grammar. Dordrecht, Holland: D. Reidel. 

C. J. Fillmore, P. Kay, & M. C. O'Connor (1988). 
Regularity and idiomatieity in grammatical 
constructions. Language, 3, 501-538. 

G. tlirst (1987). Semantic interpretation and the re- 
solution of ambiguity. Cambridge, Great 
Britain: Cambridge University Press. 

J. llobbs & S. Rosenschein (1977). Making compu- 
tational sense of Montague's intentional logic. 
Artificial Intelligence, 3, 287-306. 

A. K. Joshi (1987). An Introduction to Tree Adjoin- 
ing Grammars: In A. Manaster-Ramer (Ed.), 
Mathematics of Language (pp. 87-114). 
Amsterdam/Philadelphia: John Benjamins. 

E. L. Keenan & L. M. Faltz (1985). Boolean Seman- 
tics for Natural Language. Dordrecht, 
Holland: D Reidel. 

A. Manaster-Ramer (1986). Copying in Natural Lan- 
guages, Context-Freeness, and Queue G-tam- 
mars. Proc. ACL'86, 85-89. 

A. Manaster-Ramer & W. Zadrozny (1990). 
Expressive Power of Grammatical Formalisms 
(Proceedings of Coling-90). tlelsinki, 
Finnland: Universitas ltelsingiensis. 

A. Manaster-Ramer & W. Zadrozny (1992). System- 
atic Semantics. in preparation. 

B. H. Partee, A. t. Meulen, & R. E. Wall (1990). 
Mathematical Methods in Lingusitics. Dor- 
dreeht, The Netherlands: Kluwer. 

P. Sells (1985). Lectures on Contemporary Syntactic 
Theories. Stanford, CA: CSLI lecture Notes (3). 

J. van Benthem (1982). The Logic of Semantics: In 
Fred Lanman & Frank Veltman (Eds.), 
Varieties of Formal Semantics. Dordrecht, 
Holland: Foils. 

W. A. Woods (1967). Semantics for a question an- 
swering system. Harvard University. 
PhDThesis 

W. A. Woods (1969). Procedural semantics for a 
question answering machine. AFIPS confer- 
ence proceedings, 33, 457-471. 

W. Zadrozny & A. Manaster-Ramer (1997a). The 
Significance of Constructions. submitted. 

W. Zadrozny & A. Manaster-Ramer (199~). Assem- 
bling Constructions. submitted. 
