COOKING UP REFERRING EXPRESSIONS 
Robert Dale 
Centre for Cognitive Science, University of Edinburgh 
2 Buccleuch Place, Edinburgh EH8 9LW, Scotland 
email: rda%uk.ac.ed.epistemi@nss.cs.ucl.ac.uk
ABSTRACT 
This paper describes the referring expression 
generation mechanisms used in EPICURE, a com- 
puter program which produces natural language 
descriptions of cookery recipes. Major features of 
the system include: an underlying ontology which 
permits the representation of non-singular entities; 
a notion of discriminatory power, to determine
what properties should be used in a description; 
and a PATR-like unification grammar to produce 
surface linguistic strings. 
INTRODUCTION 
EPICURE (Dale 1989a, 1989b) is a natural lan- 
guage generation system whose principal concern 
is the generation of referring expressions which 
pick out complex entities in connected discourse. 
In particular, the system generates natural lan- 
guage descriptions of cookery recipes. Given a top 
level goal, the program first decomposes that goal 
recursively to produce a plan consisting of oper- 
ations at a level of detail commensurate with the 
assumed knowledge of the hearer. In order to de- 
scribe the resulting plan, EPICURE then models 
its execution, so that the processes which produce 
referring expressions always have access to a rep- 
resentation of the ingredients in the state they are 
in at the time of description. 
This paper describes that part of the system 
responsible for the generation of subsequent refer- 
ring expressions, i.e., references to entities which 
have already been mentioned in the discourse. The 
most notable features of the approach taken here 
are as follows: (a) the use of a sophisticated un- 
derlying ontology, to permit the representation of 
non-singular entities; (b) the use of two levels of se- 
mantic representation, in conjunction with a model 
of the discourse, to produce appropriate anaphoric 
referring expressions; (c) the use of a notion of
discriminatory power, to determine what properties
should be used in describing an entity; and (d) the 
use of a PATR-like unification grammar (see, for
example, Karttunen (1986); Shieber (1986)) to produce
surface linguistic strings from input semantic
structures. 
THE REPRESENTATION OF 
INGREDIENTS 
In most natural language systems, it is assumed 
that all the entities in the domain of discourse are 
singular individuals. In more complex domains, 
such as recipes, this simplification is of limited 
value, since a large proportion of the objects we 
find are masses or sets, such as those described 
by the noun phrases two ounces of salt and three 
pounds of carrots respectively. 
In order to permit the representation of enti- 
ties such as these, EPICURE makes use of a notion 
of a generalized physical object or physobj. This
permits a consistent representation of entities irre- 
spective of whether they are viewed as individuals, 
masses or sets, by representing each as a knowledge 
base entity (KBE) with an appropriate structure
attribute. The knowledge base entity corresponding
to three pounds of carrots, for example, is that 
shown in figure 1. 
A knowledge base entity models a physobj in a 
particular state. An entity may change during the 
course of a recipe, as processes are applied to it: 
in particular, apart from gaining new properties 
such as being peeled, chopped, etc., an ingredient's 
structure may change, for example, from set to 
mass. Each such change of state results in the 
creation of a new knowledge base entity. Suppose, 
for example, a grating event is applied to our three
pounds of carrots between states s0 and s1: the
entity shown in figure 1 will then become a mass of
grated carrot, represented in state s1 by the KBE
shown in figure 2.
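
The idea that each change of state yields a new knowledge base entity, rather than an update to the old one, can be illustrated with a small sketch. The Python below is an illustration only and not EPICURE's own code: the class name KBE, the function apply_grating and the dictionary layout are assumptions introduced here, and dropping the packaging attribute on grating is inferred from the difference between figures 1 and 2.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class KBE:
    """One physobj in one particular state (hypothetical layout)."""
    index: str             # identity of the underlying physobj, e.g. "x0"
    state: str             # state label, e.g. "s0"
    spec: Dict[str, Any]   # structure, quantity, substance, other properties


def apply_grating(entity: KBE, new_state: str) -> KBE:
    """Hypothetical grating event: build a NEW entity; the old one is untouched."""
    # Copy everything except the packaging, which no longer applies to a mass.
    spec = {k: v for k, v in entity.spec.items() if k != "packaging"}
    spec["structure"] = "mass"
    spec["grated"] = True
    return KBE(index=entity.index, state=new_state, spec=spec)


carrots_s0 = KBE("x0", "s0", {
    "structure": "set",
    "quantity": {"unit": "pound", "number": 3},
    "substance": "carrot",
    "packaging": {"structure": "individual", "shape": "carrot", "size": "regular"},
})
carrots_s1 = apply_grating(carrots_s0, "s1")
```

Both entities share the index x0, so a description generated at state s1 can draw on the grated mass while the set of whole carrots remains available as the record of state s0.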

KBE = [ index = x0
        state = s0
        spec  = [ structure = set
                  quantity  = [ unit = pound
                                number = 3 ]
                  substance = carrot
                  packaging = [ structure = individual
                                shape = carrot
                                size = regular ] ] ]

Figure 1: The knowledge base entity corresponding to three pounds of carrots

KBE = [ index = x0
        state = s1
        spec  = [ structure = mass
                  quantity  = [ unit = pound
                                number = 3 ]
                  substance = carrot
                  grated = + ] ]

Figure 2: The knowledge base entity corresponding to three pounds of grated carrot

BUILDING A REFERRING
EXPRESSION
To construct a referring expression corresponding
to a knowledge base entity, we first build a deep
semantic structure which specifies the semantic
content of the noun phrase to be generated. We call
this the recoverable semantic content, since it con- 
sists of just that information the hearer should 
be able to derive from the corresponding utter- 
ance, even if that information is not stated explic- 
itly: in particular, elided elements and instances of 
one-anaphora are represented in the deep semantic
structure by their more semantically complete
counterparts, as we will see below. 
From the deep semantic structure, a surface 
semantic structure is then constructed. Unlike the 
deep semantic structure, this closely matches the 
syntactic structure of the resulting noun phrase, 
and is suitable for passing directly to a PATR-like 
unification grammar. It is at the level of surface 
semantic structure that processes such as elision 
and one-anaphora take place. 
PRONOMINALIZATION 
When an entity is to be referred to, we first check
to see if pronominalization is possible. Some previous
approaches to the pronominalization decision
have taken into account a large number of
contextual factors (see, for example, McDonald 
(1980:218-220)). The approach taken here is rel- 
atively simple. EPICURE makes use of a discourse 
model which distinguishes two principal compo- 
nents, corresponding to Grosz's (1977) distinction 
between local focus and global focus. We call that 
part of the discourse model corresponding to the 
local focus cache memory: this contains the lex- 
ical, syntactic and semantic detail of the current 
utterance being generated, and the same detail for 
the previous utterance. Corresponding to global 
focus, the discourse model consists of a number 
of hierarchically-arranged focus spaces, mirroring
the structure of the recipe being described. These
focus spaces record the semantic content, but not
the syntactic or lexical detail, of the remainder
of the preceding discourse. In addition, we make
use of a notion of discourse centre: this is intuitively
similar to the notion of centering suggested
by Grosz, Joshi and Weinstein (1983), and corresponds
to the focus of attention in the discourse.
In recipes, we take the centre to be the result of 
the previous operation described. Thus, after an 
utterance like Soak the butterbeans the centre is
the entity described by the noun phrase the butterbeans.
Subsequent references to the centre can
be pronominalized, so that the next instruction in
the recipe might then be Drain and rinse them.
Following Grosz, Joshi and Weinstein (1983), 
references to other entities present in cache mem- 
ory may also be pronominalized, provided the cen- 
tre is pronominalized. 1 
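
The pronominalization conditions just described can be summarised in a short sketch. This is a simplified illustration under stated assumptions, not the system's implementation: the names DiscourseModel and can_pronominalize are hypothetical, cache memory is reduced to a set of entity indices (the lexical, syntactic and semantic detail is omitted), and focus spaces are shown only as plain sets.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class DiscourseModel:
    """Simplified discourse model (hypothetical layout)."""
    centre: Optional[str] = None                      # index of the current centre
    cache: Set[str] = field(default_factory=set)      # entities in the current/previous utterance
    focus_spaces: List[Set[str]] = field(default_factory=list)  # earlier discourse


def can_pronominalize(referent: str, model: DiscourseModel,
                      centre_pronominalized: bool) -> bool:
    """Return True if the referent may be realized as a pronoun."""
    # References to the centre can always be pronominalized.
    if referent == model.centre:
        return True
    # Other entities need to be in cache memory AND need the centre
    # itself to be pronominalized in the same utterance.
    return referent in model.cache and centre_pronominalized


model = DiscourseModel(centre="x1", cache={"x1", "x2"}, focus_spaces=[{"x3"}])
```

Under these assumptions, an entity such as x3 that survives only in a focus space is never pronominalized, which corresponds to the restriction to the current and previous utterance.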
If the intended referent is the current centre, 
then this is marked as part of the status infor- 
mation in the deep semantic structure being con- 
structed, and a null value is specified for the struc- 
ture's descriptive content. In addition, the verb 
case frame used to construct the utterance speci- 
fies whether or not the linguistic realization of the 
entity filling each case role is obligatory: as we 
will see below, this allows us to model a common 
linguistic phenomenon in recipes (recipe context
empty objects, after Massam and Roberge (1989)). 
For a case role whose surface realization is obliga- 
tory, the resulting deep semantic structure is then 
as follows: 

DS = [ index = x
       sem   = [ status = [ given = +
                            centre = +
                            oblig = + ]
                 spec   = [ type = φ ] ] ]

This will be realized as either a pronoun or an 
elided NP, generated from a surface semantic struc- 
ture which is constructed in accordance with the 
following rules: 
• If the status includes the features [centre, +]
and [oblig, +], then there should be a corresponding
element in the surface semantic
structure, with a null value specified for the
descriptive content of the noun phrase to be
generated;
• If the status includes the features [centre, +]
and [oblig, -], then this participant should
be omitted from the surface semantic structure
altogether.

1 We do not permit pronominal reference to entities last
mentioned before the previous utterance: support for this
restriction comes from a study by Hobbs, who, in a sample
of one hundred consecutive examples of pronouns from
each of three very different texts, found that 98% of
antecedents were either in the same or previous sentence
(Hobbs 1978:322-323). However, see Dale (1988) for a
suggestion as to how the few instances of long-distance
pronominalization that do exist might be explained by means
of a theory of discourse structure like that suggested by
Grosz and Sidner (1986).

In the former case, this will result in a pronominal 
reference as in Remove them, where the surface se- 
mantic structure corresponding to the pronominal 
form is as follows: 

SS = [ index = x
       sem   = [ status = [ given = +
                            centre = +
                            oblig = + ]
                 spec   = [ agr  = [ num = pl
                                     case = acc ]
                            desc = φ ] ] ]

However, if the participant is marked as non-obligatory, 
then reference to the entity is omitted, as in the 
following: 
Fry the onions. 
Add the garlic φ.
Here, the case frame for add specifies that the indirect
object is non-obligatory; since the entity
which fills this case role is also the centre, the 
complete prepositional phrase to the onions can 
be elided. Note, however, that the entity corre- 
sponding to the onions still figures in the deep 
semantic structure; thus, it is integrated into the 
discourse model, and is deemed to be part of the 
semantic content recoverable by the hearer. 
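
The two rules for references to the centre can be sketched as a single function from a deep semantic structure to a surface semantic structure. This is a minimal sketch under assumptions: the function name surface_for_centre and the helper make_ds are hypothetical, feature structures are plain nested dictionaries, None stands in for the null descriptive content, and agreement features are left out.

```python
from typing import Any, Dict, Optional

FeatureStructure = Dict[str, Any]


def surface_for_centre(ds: FeatureStructure) -> Optional[FeatureStructure]:
    """Map a deep structure for the centre to a surface structure, or to None."""
    status = ds["sem"]["status"]
    if not status.get("centre"):
        raise ValueError("this sketch only covers references to the centre")
    if status.get("oblig"):
        # Obligatory case role: keep the participant, with null descriptive
        # content, so that the grammar realizes it as a pronoun.
        return {"index": ds["index"],
                "sem": {"status": dict(status), "spec": {"desc": None}}}
    # Non-obligatory case role: omit the participant altogether (elision).
    return None


def make_ds(index: str, oblig: bool) -> FeatureStructure:
    """Hypothetical helper that builds a deep structure for the centre."""
    return {"index": index,
            "sem": {"status": {"given": True, "centre": True, "oblig": oblig},
                    "spec": {"type": None}}}


pronoun_ss = surface_for_centre(make_ds("x", oblig=True))   # as in "Remove them"
elided_ss = surface_for_centre(make_ds("x", oblig=False))   # as in "Add the garlic"
```

Note that the deep structure exists in both cases; only the surface structure differs, which is what keeps the elided participant recoverable.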
FULL DEFINITE NOUN PHRASE 
REFERENCE 
If pronominalization is ruled out, we have to build 
an appropriate description of the intended refer- 
ent. In EPICURE, the process of constructing a 
description is driven by two principles, very like 
Gricean conversational maxims (Grice 1975). The 
principle of adequacy requires that a referring
expression should identify the intended referent
unambiguously, and provide sufficient information to
serve the purpose of the reference; and the
principle of efficiency, pulling in the opposite direction,
requires that the referring expression used must 
not contain more information than is necessary for 
the task at hand. 2 

2 Similar considerations are discussed by Appelt (1985).

DS = [ index = x
       sem   = [ status = [ given = +
                            unique = + ]
                 spec   = [ agr  = [ countable = +
                                     number = pl ]
                            type = [ category = olive
                                     props = [ size = regular
                                               pitted = + ] ] ] ] ]

Figure 3: The deep semantic structure corresponding to the pitted olives

SS = [ index = x
       sem   = [ status = [ given = +
                            unique = + ]
                 spec   = [ agr  = [ countable = +
                                     number = pl ]
                            desc = [ head = olive
                                     mod = [ head = pitted ] ] ] ] ]

Figure 4: The surface semantic structure corresponding to the pitted olives

These principles are implemented in EPICURE
by means of a notion of discriminatory power.
Suppose that we have a set of entities U such that
$U = \{x_1, x_2, \ldots, x_n\}$
and that we wish to distinguish one of these entities,
$x_1$, from all the others. Suppose, also, that
the domain includes a number of attributes ($a_1$, $a_2$,
and so on), and that each attribute has a number
of permissible values {$v_{n,1}$, $v_{n,2}$, and so on}; and
that each entity is described by a set of
attribute-value pairs. In order to distinguish $x_1$ from the
other entities in U, we need to find some set of
attribute-value pairs which are together true of $x_1$,
but of no other entity in U. This set of
attribute-value pairs constitutes a distinguishing description
of $x_1$ with respect to the context U. A minimal
distinguishing description is then a set of such
attribute-value pairs, where the cardinality of that
set is such that there are no other sets of
attribute-value pairs of lesser cardinality which are sufficient
to distinguish the intended referent.
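
The definition of a minimal distinguishing description can be made concrete with a brute-force search over sets of attribute-value pairs in order of increasing cardinality. This is a sketch of the definition only, and is not claimed to be the procedure EPICURE uses: the function name minimal_description, the flat dictionary encoding of entities and the example entities are all assumptions introduced for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Optional, Tuple

Entity = Dict[str, str]
Pair = Tuple[str, str]


def minimal_description(target: str,
                        context: Dict[str, Entity]) -> Optional[FrozenSet[Pair]]:
    """Smallest set of attribute-value pairs true of target and of no other entity."""
    pairs = sorted(context[target].items())
    others = [entity for name, entity in context.items() if name != target]
    # Try sets of size 1, then 2, and so on, so the first hit is minimal.
    for size in range(1, len(pairs) + 1):
        for subset in combinations(pairs, size):
            shared = any(all(other.get(a) == v for a, v in subset)
                         for other in others)
            if not shared:
                return frozenset(subset)
    return None  # the target cannot be distinguished in this context


context = {
    "x1": {"category": "olive", "pitted": "+"},
    "x2": {"category": "olive", "pitted": "-"},
    "x3": {"category": "capsicum", "pitted": "-"},
}
pitted_olives = minimal_description("x1", context)
unpitted_olives = minimal_description("x2", context)
```

In this hypothetical context a single pair suffices for x1, whereas x2 needs both its category and its pitted value. Exhaustive search of this kind grows quickly with the number of properties, which is one motivation for ranking pairs by how well they discriminate.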

DS = [ index = x2
       sem   = [ status = [ given = +
                            unique = + ]
                 spec   = [ agr  = [ number = sg
                                     countable = + ]
                            type = [ category = capsicum
                                     properties = [ colour = red
                                                    size = small ] ] ] ] ]

Figure 5: The deep semantic structure corresponding to the small red capsicum

SS = [ index = x2
       sem   = [ status = [ given = +
                            unique = + ]
                 spec   = [ agr  = [ number = sg
                                     countable = + ]
                            desc = [ ... ] ] ] ]

Figure 6: The surface semantic structure corresponding to the small red one

We find a minimal distinguishing description
by observing that different attribute-value pairs
differ in the effectiveness with which they
distinguish an entity from a set of entities. Suppose
U has N elements, where $N > 1$. Then, any
attribute-value pair true of the intended referent
$x_1$ will be true of n entities in this set, where
$n \geq 1$. For any attribute-value pair <a, v> that
is true of the intended referent, we can compute
the discriminatory power (notated here as F) of
that attribute-value pair with respect to U as
follows:
$$F(\langle a, v \rangle, U) = \frac{N - n}{N - 1}, \qquad 1 \leq n \leq N$$
F thus has as its range the interval [0,1], where
a value of 1 for a given attribute-value pair
indicates that the attribute-value pair singles out the
intended referent from the context, and a value of
0 indicates that the attribute-value pair is of no
assistance in singling out the intended referent.
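
As a worked illustration of the measure F, the following sketch computes (N - n) / (N - 1) for individual attribute-value pairs. The function name discriminatory_power, the dictionary encoding and the three example entities (including the attribute edible) are hypothetical and serve only to exercise the formula.

```python
from typing import Dict, Tuple

Entity = Dict[str, str]


def discriminatory_power(pair: Tuple[str, str], target: str,
                         context: Dict[str, Entity]) -> float:
    """F(<a, v>, U) = (N - n) / (N - 1) for a pair that is true of the target."""
    attribute, value = pair
    if context[target].get(attribute) != value:
        raise ValueError("the pair must be true of the intended referent")
    N = len(context)
    if N <= 1:
        raise ValueError("F is only defined for contexts with N > 1")
    # n counts every entity in U, the target included, of which the pair is true.
    n = sum(1 for entity in context.values() if entity.get(attribute) == value)
    return (N - n) / (N - 1)


context = {
    "x1": {"category": "olive", "pitted": "+", "edible": "+"},
    "x2": {"category": "olive", "pitted": "-", "edible": "+"},
    "x3": {"category": "capsicum", "pitted": "-", "edible": "+"},
}
f_pitted = discriminatory_power(("pitted", "+"), "x1", context)        # n = 1
f_category = discriminatory_power(("category", "olive"), "x1", context)  # n = 2
f_edible = discriminatory_power(("edible", "+"), "x1", context)        # n = 3
```

With N = 3, the pair true only of x1 scores 1, the pair shared with one other entity scores 1/2, and the pair true of every entity scores 0, matching the two endpoints of the interval described in the text.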
Given an intended referent and a set of entities
from which the intended referent must be
distinguished, this notion is used to determine which set
of properties should be used in building a
description which is both adequate and efficient. 3 There
remains the question of how the constituency of
the set U of entities is determined: in the present
work, we take the context always to consist of the
working set. This is the set of distinguishable
entities in the domain at any given point in time: the
constituency of this set changes as a recipe
proceeds, since entities may be created or destroyed. 4

3 Strictly speaking, this mechanism is only applicable in
the form described here to those properties of an entity
which are realizable by what are known as absolute (or
intersective or predicative) adjectives (see, for example, Kamp
(1975), Keenan and Faltz (1978)). This is acceptable in
the current domain, where many of the adjectives used are
derived from the verbs used to describe processes applied
to entities.

Suppose, for example, we determine that we 
must identify a given object as being a set of olives 
which have been pitted (in a context, for example, 
where there are also olives which have not been 
pitted); the corresponding deep semantic structure
is then as in figure 3.
Note that this deep semantic structure can
be realized in at least two ways: as either the
olives which have been pitted or the pitted olives.
Both forms are possible, although they correspond
to different surface semantic structures. Thus,
the generation algorithm is non-deterministic in
this respect (although one might imagine there are
other factors which determine which of the two
realizations is preferable in a given context). The
surface semantic structure for the simpler of the
two noun phrase structures is as shown in figure 4.

4 A slightly more sophisticated approach would be to
restrict U to exclude those entities which are, in Grosz and
Sidner's (1986) terms, only present in closed focus spaces.
However, the benefit gained from doing this (if indeed it is a
valid thing to do) is minimal in the current context because
of the small number of entities we are dealing with.

DS = [ index = x
       sem   = [ status = [ ]
                 spec   = [ agr   = [ number = pl
                                      countable = + ]
                            quant = [ agr  = [ number = 3
                                               countable = + ]
                                      type = [ category = pound ] ]
                            subst = [ agr  = [ number = pl
                                               countable = + ]
                                      type = [ category = carrot ] ] ] ] ]

Figure 7: The deep semantic structure corresponding to three pounds of carrots

ONE ANAPHORA 
The algorithms employed in EPICURE also permit 
the generation of one-anaphora, as in
Slice the large green capsicum. 
Now remove the top of the small red one. 
The deep semantic structure corresponding to the 
noun phrase the small red one is as shown in fig- 
ure 5. 
The mechanisms which construct the surface 
semantic structure determine whether one-anaphora 
is possible by comparing the deep semantic struc- 
ture corresponding to the previous utterance with 
that corresponding to the current utterance, to 
identify any elements they have in common. The 
two distinct levels of semantic representation play
an important role here: in the deep semantic
structure, only the basic semantic category of the
description has special status (this is similar to
Webber's (1979) use of restricted quantification), whereas
the embedding of the surface semantic structure's
desc feature closely matches that of the noun phrase
to be generated. For one-anaphora to be possible,
the two deep semantic structures being
compared must have the same value for the feature
addressed by the path <sem spec type category>.
Rules which specify the relative ordering of ad- 
jectives in the surface form are then used to build 
an appropriately nested surface semantic structure 
which, when unified with the grammar, will result 
in the required one-anaphoric noun phrase. In the 
present example, this results in the surface seman- 
tic structure in figure 6. 
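
The category test for one-anaphora, followed by adjective ordering, can be sketched as follows. This is an illustration under assumptions rather than the system's mechanism: the comparison is made between two noun-phrase-level deep structures, the names get_path, one_anaphora_possible and realize are hypothetical, the ordering rule (size before colour) is invented for the example, and only definite singular noun phrases are handled. The string is built directly here, whereas the system builds a nested surface semantic structure and leaves realization to the grammar.

```python
from typing import Any, Dict, Optional, Sequence

FeatureStructure = Dict[str, Any]
CATEGORY_PATH = ("sem", "spec", "type", "category")
ADJECTIVE_ORDER = ("size", "colour")  # hypothetical ordering rule


def get_path(fs: Any, path: Sequence[str]) -> Optional[Any]:
    """Follow a feature path through nested dictionaries; None if it is absent."""
    for key in path:
        if not isinstance(fs, dict) or key not in fs:
            return None
        fs = fs[key]
    return fs


def one_anaphora_possible(previous: FeatureStructure,
                          current: FeatureStructure) -> bool:
    """True if both structures share a value at <sem spec type category>."""
    category = get_path(current, CATEGORY_PATH)
    return category is not None and category == get_path(previous, CATEGORY_PATH)


def realize(ds: FeatureStructure, use_one: bool) -> str:
    """Build a definite singular noun phrase, optionally with 'one' as head."""
    type_ = ds["sem"]["spec"]["type"]
    properties = type_.get("properties", {})
    adjectives = [properties[a] for a in ADJECTIVE_ORDER if a in properties]
    head = "one" if use_one else type_["category"]
    return " ".join(["the", *adjectives, head])


def capsicum(colour: str, size: str) -> FeatureStructure:
    """Hypothetical helper building a deep structure for a capsicum."""
    return {"sem": {"spec": {"type": {"category": "capsicum",
                                      "properties": {"colour": colour,
                                                     "size": size}}}}}


previous_ds = capsicum("green", "large")
current_ds = capsicum("red", "small")
phrase = realize(current_ds, one_anaphora_possible(previous_ds, current_ds))
```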
PSEUDO-PARTITIVE NPS
Partitive and pseudo-partitive noun phrases, ex- 
emplified by half of the carrots and three pounds of 
carrots respectively, are very common in recipes; 
EPICURE is capable of generating both. So, for 
example, the pseudo-partitive noun phrase three 
pounds of carrots (as represented by the knowledge 
base entity shown in figure 1) is generated from the 
deep semantic structure shown in figure 7 via the 
surface semantic structure shown in figure 8. 
The generation of partitive noun phrases re- 
quires slightly different semantic structures, de- 
scribed in greater detail in Dale (1989b). 

SS = [ index = x
       sem   = [ status = [ given = - ]
                 spec   = [ agr  = [ countable = +
                                     number = 3 ]
                            desc = [ spec1 = [ agr  = [ countable = +
                                                        number = 3 ]
                                               desc = [ head = pound ] ]
                                     spec2 = [ agr  = [ countable = + ]
                                               desc = [ head = carrot ] ] ] ] ] ]

Figure 8: The surface semantic structure corresponding to three pounds of carrots

THE UNIFICATION GRAMMAR
Once the required surface semantic structure has
been constructed, this is passed to a unification
grammar. In EPICURE, the grammar consists of
phrase structure rules annotated with path equa- 
tions which determine the relationships between 
semantic units and syntactic units: the path equa- 
tions specify arbitrary constituents (either com- 
plex or atomic) of feature structures. 
There is insufficient space here to show the entire
NP grammar, but we provide some representative
rules in figure 9 (although these rules are
expressed here in a PATR-like formalism, within EPICURE
they are encoded as PROLOG definite clause
grammar (DCG) rules (Clocksin and Mellish 1981)).
Applying these rules to the surface semantic struc- 
tures described above results in the generation of 
the appropriate surface linguistic strings. 
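
The core operation behind path equations of this kind is the unification of feature structures. The sketch below shows that operation on nested dictionaries; it is a deliberate simplification and an assumption of this illustration, since it treats feature structures as trees and so does not model the structure sharing (reentrancy) that PATR-style formalisms support. The names unify and UnificationFailure are hypothetical.

```python
from typing import Any


class UnificationFailure(Exception):
    """Raised when two feature structures carry conflicting atomic values."""


def unify(a: Any, b: Any) -> Any:
    """Return the most general structure containing the information in a and b."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            # Shared features are unified recursively; new ones are added.
            result[key] = unify(result[key], value) if key in result else value
        return result
    if a == b:
        return a
    raise UnificationFailure(f"cannot unify {a!r} with {b!r}")


noun = {"agr": {"number": "pl"}, "head": "olive"}
determiner_constraint = {"agr": {"countable": "+"}}
combined = unify(noun, determiner_constraint)
```

Unifying structures whose agr values disagree, for example number = pl against number = sg, raises UnificationFailure; this is how an equation such as <Det syn agr> = <N1 syn agr> would rule out a mismatched determiner.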
CONCLUSION 
In this paper, we have described the processes used 
in EPICURE to produce noun phrase referring ex- 
pressions. EPICURE is implemented in C-PROLOG 
running under UNIX. The algorithms used in the 
system permit the generation of a wide range of 
pronominal forms, one-anaphoric forms and full 
noun phrase structures, including partitives and 
pseudo-partitives. 
ACKNOWLEDGEMENTS 
The work described here has benefited greatly from 
discussions with Ewan Klein, Graeme Ritchie, Jon
Oberlander, and Marc Moens, and from Bonnie 
Webber's encouragement. 

NP -> Det N1
    <Det sem>        = <NP sem status>
    <NP syn agr>     = <NP sem spec agr>
    <N1 syn agr>     = <NP syn agr>
    <Det syn agr>    = <N1 syn agr>
    <N1 sem>         = <NP sem spec desc>

N1 -> N
    <N sem>          = <N1 sem head>

N1_1 -> AP N1_2
    <AP sem>         = <N1_1 sem mod>
    <N1_2 sem head>  = <N1_1 sem head>

NP_1 -> NP_2 N1
    <NP_2 sem>        = <NP_1 sem spec desc spec1>
    <N1 sem>          = <NP_1 sem spec desc spec2>
    <N1 syn agr>      = <NP_1 sem spec agr>
    <NP_2 sem status> = <NP_1 sem status>

NP_1 -> NP_2 PP
    <NP_2 sem status> = <NP_1 sem status>
    <NP_2 sem>        = <NP_1 sem spec desc spec>
    <PP sem>          = <NP_1 sem spec desc set>

Figure 9: A fragment of the noun phrase grammar

REFERENCES
Appelt, Douglas E. (1985) Planning English Referring
Expressions. Artificial Intelligence, 26, 1-33.
Clocksin, William F. and Mellish, Christopher S.
(1981) Programming in Prolog. Berlin: Springer-Verlag.
Dale, Robert (1988) The Generation of Subsequent
Referring Expressions in Structured Discourses.
Chapter 5 in Zock, M. and Sabah, G. (eds.) Advances
in Natural Language Generation: An Interdisciplinary
Perspective, Volume 2, pp58-75. London:
Pinter Publishers Ltd.
Dale, Robert (1989a) Generating Recipes: An Overview
of EPICURE. Extended Abstracts of the Second
European Natural Language Generation Workshop,
Edinburgh, April 1989.
Dale, Robert (1989b) Generating Referring Expressions
in a Domain of Objects and Processes.
PhD Thesis, Centre for Cognitive Science, University
of Edinburgh.
Grice, H. Paul (1975) Logic and Conversation. In
Cole, P. and Morgan, J. L. (eds.) Syntax and Semantics,
Volume 3: Speech Acts, pp41-58. New
York: Academic Press.
Grosz, Barbara J. (1977) The Representation and
Use of Focus in Dialogue. Technical Note No. 151,
SRI International, Menlo Park, Ca., July, 1977.
Grosz, Barbara J., Joshi, Aravind K. and Weinstein,
Scott (1983) Providing a Unified Account of
Definite Noun Phrases in Discourse. In Proceedings
of the 21st Annual Meeting of the Association
for Computational Linguistics, Massachusetts
Institute of Technology, Cambridge, Mass., 15-17
June, 1983, pp44-49.
Grosz, Barbara J. and Sidner, Candace L. (1986) 
Attention, Intentions, and the Structure of Dis- 
course. Computational Linguistics, 12, 175-204. 
Hobbs, Jerry R. (1978) Resolving Pronoun Refer- 
ences. Lingua, 44, 311-338. 
Kamp, Hans (1975) Two Theories about Adjec- 
tives. In Keenan, E. L. (ed.) Formal Semantics of 
Natural Language: Papers from a colloquium spon- 
sored by King's College Research Centre, Cam- 
bridge, pp123-155. Cambridge: Cambridge Uni- 
versity Press. 
Karttunen, Lauri (1986) D-PATR: A Development 
Environment for Unification-Based Grammars. In 
Proceedings of the 11th International Conference 
on Computational Linguistics, Bonn, 25-29 Au- 
gust, 1986, pp74-80. 
Keenan, Edward L. and Faltz, Leonard M. (1978) 
Logical Types for Natural Language. UCLA Occa- 
sional Papers in Linguistics, No. 3. 
McDonald, David D. (1980) Natural Language Gen- 
eration as a Process of Decision-Making under Con- 
straints. PhD Thesis, Department of Computer 
Science and Electrical Engineering, MIT. 
Massam, Diane and Roberge, Yves (1989) Recipe
Context Null Objects in English. Linguistic Inquiry,
20, 134-139.
Shieber, Stuart M. (1986) An Introduction to
Unification-based Approaches to Grammar. Chicago, Illinois:
The University of Chicago Press.
Webber, Bonnie Lynn (1979) A Formal Approach 
to Discourse Anaphora. London: Garland Pub- 
lishing. 
