A Meta-Algorithm for the Generation of Referring Expressions
Emiel Krahmer, Sebastiaan van Erk and André Verleg
TU/e, Eindhoven University of Technology,
The Netherlands
email: E.J.Krahmer@tue.nl
Abstract
This paper describes a new approach
to the generation of referring expres-
sions. We propose to formalize a scene
as a labeled directed graph and describe
content selection as a subgraph con-
struction problem. Cost functions are
used to guide the search process and
to give preference to some solutions
over others. The resulting graph al-
gorithm can be seen as a meta-algorithm
in the sense that defining cost functions
in different ways allows us to mimic —
and even improve — a number of well-
known algorithms.
1 Introduction
The generation of referring expressions is one
of the most common tasks in natural language
generation, and has been addressed by many re-
searchers in the past two decades (including Ap-
pelt 1985, Dale 1992, Reiter 1990, Dale & Had-
dock 1991, Dale & Reiter 1995, Horacek 1997,
Stone & Webber 1998, Krahmer & Theune 1999
and van Deemter 2000). As a result, there are
many different algorithms for the generation of
referring expressions, each with its own object-
ives: some aim at producing the shortest possible
description, others focus on efficiency or realistic
output. The degree of detail in which the various
algorithms are described differs considerably, and
as a result it is often difficult to compare the vari-
ous proposals. In addition, most of the algorithms
are primarily concerned with the generation of de-
scriptions only using properties of the target ob-
ject. Consequently, the problem of generating re-
lational descriptions (i.e., descriptions which in-
corporate references to other objects to single out
the target object) has not received the attention it
deserves.
In this paper, we describe a general, graph-
theoretic approach to the generation of referring
expressions. We propose to formalize a scene
(i.e., a domain of objects and their properties and
relations) as a labeled directed graph and describe
the content selection problem —which proper-
ties and relations to include in a description for
an object?— as a subgraph construction problem.
The graph perspective has three main advantages.
The first one is that there are many attractive al-
gorithms for dealing with graph structures. In
this paper, we describe a branch and bound al-
gorithm for finding the relevant subgraphs, where
we use cost functions to guide the search pro-
cess. Arguably, the proposed algorithm is a meta-
algorithm, in the sense that by defining the cost
function in different ways, we can mimic various
well-known algorithms for the generation of re-
ferring expressions. A second advantage of the
graph-theoretical framework is that it does not run
into problems with relational descriptions, due to
the fact that properties and relations are formal-
ized in the same way, namely as edges in a graph.
The third advantage is that the combined usage
of graphs and cost-functions paves the way for
a natural integration of traditional rule-based ap-
proaches to generation with more recent statist-
ical approaches (e.g., Langkilde & Knight 1998,
Malouf 2000) in a single algorithm.
The outline of this paper is as follows. In sec-
tion 2, we describe how scenes can be described as
labeled directed graphs and show how content se-
lection can be formalized as a subgraph construc-
tion problem. Section 3 contains a sketch of the
branch and bound algorithm, which is illustrated
with a worked example. In section 4 it is argued
that by defining cost functions in different ways,
we can mimic various well-known algorithms for
the generation of referring expressions. We end
with some concluding remarks in section 5.
2 Graphs
Consider the following scene:
[Figure 1 is a line drawing of the scene: two dogs and two doghouses.]
Figure 1: An example scene
In this scene, as in any other scene, we see a finite set of entities D with properties P and relations R. In this particular scene, D = {d₁, d₂, d₃, d₄} is the set of entities, P = {dog, chihuahua, doghouse, small, large, white, brown} is the set of properties and R = {next to, left of, right of, contain, in} is the set of relations. A scene can be represented in various ways. One common representation is to build a database, listing the properties of each element of D:

d₁ : dog (d₁), brown (d₁), ..., in (d₁, d₃)
... ... ... ...
d₄ : doghouse (d₄), white (d₄), ..., right of (d₄, d₃)
Here we take a different approach and represent a scene as a labeled directed graph. Let L = P ∪ R be the set of labels (with P and R disjoint, i.e., P ∩ R = ∅). Then a labeled directed graph is a pair G = ⟨V, E⟩, where V ⊆ D is the set of vertices (or nodes) and E ⊆ V × L × V is the set of labeled directed arcs (or edges). The scene given in Figure 1 can be represented by the graph in Figure 2. Keep in mind that the dᵢ labels are only added to ease reference to nodes. Notice also that properties (such as being a dog) are always modelled as loops, i.e., edges which start and end in the same node, while relations may (but need not) have different start and end nodes.
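To make this representation concrete, a scene graph can be sketched in Python as a plain set of labeled triples (u, label, v). The particular relational edges chosen below are illustrative; the figure's full edge set is not reproduced here.

```python
# Sketch: the scene graph of Figure 2 as a set of labeled directed edges
# (u, label, v).  A property is a loop (u == v); a relation is not.  The
# relational edges below are a partial, illustrative reconstruction.
SCENE = {
    ("d1", "dog", "d1"), ("d1", "chihuahua", "d1"),
    ("d1", "brown", "d1"), ("d1", "small", "d1"),
    ("d2", "dog", "d2"), ("d2", "chihuahua", "d2"),
    ("d2", "brown", "d2"), ("d2", "small", "d2"),
    ("d3", "doghouse", "d3"), ("d3", "white", "d3"), ("d3", "large", "d3"),
    ("d4", "doghouse", "d4"), ("d4", "white", "d4"), ("d4", "large", "d4"),
    ("d1", "in", "d3"), ("d3", "contains", "d1"),  # only d1 sits in a doghouse
    ("d3", "next_to", "d4"), ("d4", "next_to", "d3"),
    ("d2", "next_to", "d4"), ("d4", "next_to", "d2"),
    ("d1", "next_to", "d2"), ("d2", "next_to", "d1"),
}

def graph_nodes(edges):
    """The vertex set V: every endpoint of some edge."""
    return {u for u, _, v in edges} | {v for u, _, v in edges}

def properties(edges, u):
    """Labels of the looping edges at u, i.e. the properties of u."""
    return {lab for a, lab, b in edges if a == u == b}

print(sorted(graph_nodes(SCENE)))       # ['d1', 'd2', 'd3', 'd4']
print(sorted(properties(SCENE, "d3")))  # ['doghouse', 'large', 'white']
```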
Now the content determination problem for referring expressions can be formulated as a graph construction task. In order to decide which information to include in a referring expression for an object d ∈ V, we construct a connected directed labeled graph over the set of labels L and an arbitrary set of nodes, but including d. This graph can be understood as the "meaning representation" from which a referring expression can be generated by a linguistic realizer. Informally, we say that a graph refers to a given entity iff the graph can be "placed over" the scene graph in such a way that the node being referred to is "placed over" the given entity and each edge can be "placed over" an edge labeled with the same label. Furthermore, a graph is distinguishing iff it refers to exactly one node in the scene graph.
Consider the three graphs in Figure 3. Here and elsewhere circled nodes stand for the intended referent. Graph (i) refers to all nodes of the graph in Figure 2 (every object in the scene is next to some other object), graph (ii) can refer to both d₁ and d₂, and graph (iii) is distinguishing in that it can only refer to d₁. Notice that the three graphs might be realized as something next to something else, a chihuahua and the dog in the doghouse, respectively. In this paper, we will concentrate on the generation of distinguishing graphs.
Formally, the notion that a graph H = ⟨V_H, E_H⟩ can be "placed over" another graph G = ⟨V_G, E_G⟩ corresponds to the notion of a subgraph isomorphism. H can be "placed over" G iff there exists a subgraph G′ = ⟨V_G′, E_G′⟩ of G such that H is isomorphic to G′. H is isomorphic to G′ iff there exists a bijection π : V_H → V_G′ such that for all nodes v, w ∈ V_H and all l ∈ L:

(v, l, w) ∈ E_H ⇔ (π.v, l, π.w) ∈ E_G′

In words: the bijective function π maps all the nodes in H to corresponding nodes in G′, in such a way that any edge with label l between nodes v
[Figure 2 shows the scene graph: nodes d1 and d2 each carry the loops dog, chihuahua, brown and small; nodes d3 and d4 each carry the loops doghouse, white and large; the nodes are further connected by contains, in, next_to, left_of and right_of edges.]
Figure 2: A graph representation of Figure 1.
[Figure 3 shows three graphs: (i) a node with a next_to edge to a second node; (ii) a node with a chihuahua loop; (iii) a dog node with an in edge to a doghouse node.]
Figure 3: Some graphs for referring expressions, with circles around the intended referent.
and w in H is matched by an edge with the same label between the G′ counterparts of v and w, i.e., π.v and π.w respectively. When H is isomorphic to some subgraph of G by an isomorphism π, we write H ⊑π G.
Given a graph H and a node v in H, and a graph G and a node w in G, we define that the pair (v, H) refers to the pair (w, G) iff H is connected and H ⊑π G and π.v = w. Furthermore, (v, H) uniquely refers to (w, G) (i.e., (v, H) is distinguishing) iff (v, H) refers to (w, G) and there is no node w′ in G different from w such that (v, H) refers to (w′, G). The problem considered in this paper can now be formalized as follows: given a graph G and a node w in G, find a pair (v, H) such that (v, H) uniquely refers to (w, G).
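For small graphs, these definitions can be prototyped directly by brute force: enumerate the injective mappings of H's nodes into G's nodes that send v to w and preserve labeled edges. A naive, exponential sketch (the toy scene and all names are ours, not the paper's implementation):

```python
from itertools import permutations

def refers(v, H, w, G):
    """(v, H) refers to (w, G): H is connected (assumed here) and there is
    a label-preserving injection pi of H's nodes into G's with pi[v] == w.
    Naive exponential search, for illustration only."""
    hn = sorted({u for u, _, x in H} | {x for u, _, x in H} | {v})
    gn = sorted({u for u, _, x in G} | {x for u, _, x in G})
    for image in permutations(gn, len(hn)):
        pi = dict(zip(hn, image))
        if pi[v] == w and all((pi[a], lab, pi[b]) in G for a, lab, b in H):
            return True
    return False

def uniquely_refers(v, H, w, G):
    """(v, H) is distinguishing for w: it refers to w and to no other node."""
    gn = {u for u, _, x in G} | {x for u, _, x in G}
    return refers(v, H, w, G) and all(
        not refers(v, H, other, G) for other in gn - {w})

# A two-dog toy scene: only d1 is in a doghouse.
G = {("d1", "dog", "d1"), ("d2", "dog", "d2"),
     ("d3", "doghouse", "d3"), ("d1", "in", "d3")}
H = {("x", "dog", "x"), ("x", "in", "y"), ("y", "doghouse", "y")}
print(uniquely_refers("x", H, "d1", G))                  # True: the dog in the doghouse
print(uniquely_refers("x", {("x", "dog", "x")}, "d1", G))  # False: "the dog" is ambiguous
```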
Consider, for instance, the task of finding a pair (v, H) which uniquely refers to the node labeled d₁ in Figure 2. It is easily seen that there are a number of such pairs, three of which are depicted in Figure 4. We would like to have a mechanism which allows us to give certain solutions preference over other solutions. For this purpose we shall use cost functions. In general, a cost function c is a function which assigns a positive number to each subgraph of a scene graph. As we shall see, by defining cost functions in different ways, we can mimic various algorithms for the generation of referring expressions known from the literature.
A note on problem complexity The basic decision problem for subgraph isomorphism (i.e., testing whether a graph H is isomorphic to a subgraph of G) is known to be NP complete (see e.g., Garey & Johnson 1979). Here we are interested in connected H, but unfortunately that
[Figure 4 shows three distinguishing graphs for d₁: (i) a node with dog, small and brown loops and an in edge to a second node; (ii) a dog node with an in edge to a doghouse node; (iii) a node with an in edge to a node carrying doghouse, large and white loops.]
Figure 4: Three distinguishing node-graph pairs referring to d₁ in Figure 2.
restriction does not reduce the theoretical complexity. However, as soon as we define an upper bound k on the number of edges in a distinguishing graph, the problem loses its intractability and becomes solvable in polynomial O(nᵏ) time. Such a restriction is rather harmless for our current purposes, as it would only prohibit the generation of distinguishing descriptions with more than k properties, for an arbitrarily large k. In general, there are various classes of graphs for which the subgraph isomorphism problem can be solved much more efficiently, without postulating upper bounds. For instance, if G and H are planar graphs the problem can be solved in time linear in the number of nodes of G (Eppstein 1999). Basically, a planar graph is one which can be drawn on a plane in such a way that there are no crossing edges (thus, for instance, the graph in Figure 2 is planar). It is worth investigating to what extent planar graphs suffice for the generation of referring expressions.
3 Outline of the algorithm
In this section we give a high-level sketch of
the algorithm. The algorithm (called make-
ReferringExpression) consists of two main
components, a subgraph construction algorithm
(called findGraph) and a subgraph isomorphism
testing algorithm (called matchGraphs). We
assume that a scene graph a50 a15a152a51a54a53a153a23a57a56a59a58 is given.
The algorithm systematically tries all relevant
subgraphs a73 of the scene graph by starting with
the subgraph containing only the node a92 (the
target object) and expanding it recursively by
trying to add edges from a50 which are adjacent to
the subgraph a73 constructed so far. In this way
we know that the results will be a connected sub-
graph. We refer to this set of adjacent edges as the
a73 neighbors in
a50 (notation: a50 .neighbors(
a73 )).
The algorithm returns the cheapest distinguishing
subgraph a73 which refers to a92 , if such a distin-
guishing graph exists, otherwise it returns the
empty graph a154a48a15a34a51a149a49a89a23a25a49a29a58 .
3.1 Cost functions
We use cost functions to guide the search process and to give preference to some solutions over others. If H = ⟨V_H, E_H⟩ is a subgraph of G, then the cost of H, notation c(H), is given by summing over the costs associated with the nodes and edges of H. Formally:

c(H) = Σ_{v ∈ V_H} c(v) + Σ_{e ∈ E_H} c(e)
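With graphs represented as edge sets, the summation can be written down directly; the uniform unit costs used as defaults below are the setting of the worked example in section 3.2:

```python
def cost(H, node_cost=lambda v: 1, edge_cost=lambda e: 1):
    """c(H) = sum of node costs plus sum of edge costs, per the formula
    above.  Uniform unit costs by default."""
    nodes = {u for u, _, v in H} | {v for u, _, v in H}
    return sum(node_cost(v) for v in nodes) + sum(edge_cost(e) for e in H)

# Two nodes and three edges, as in the worked example: 5 points.
H = {("x", "dog", "x"), ("x", "in", "y"), ("y", "doghouse", "y")}
print(cost(H))  # 5
```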
We require the cost function to be monotonic. That is, adding an edge to a (non-empty) graph can never result in a cheaper graph. Formally:¹

∀G′ ⊆ G : ∀e ∈ G.edges : G′.cost ≤ (G′ + e).cost

This assumption helps reduce the search space substantially, since extensions of subgraphs with a cost greater than the best subgraph found so far can safely be ignored. The cost of the empty, undefined graph is infinite, i.e., c(⊥) = ∞.
3.2 Worked example
We now illustrate the algorithm with an example. Suppose the scene graph G is as given in Figure 2, and that we want to generate a referring expression for object d₁ in this graph. Let us assume for the sake of illustration that the cost function is defined in such a way that adding a node or an edge always costs 1 point. Thus: for each v ∈ V_G and for each e ∈ E_G : c(v) = c(e) = 1.

¹ Here and elsewhere, we use the following notation. Let G = ⟨V, E⟩ be a graph and e an edge; then G + e is the graph ⟨V ∪ {e.node1, e.node2}, E ∪ {e}⟩.
makeReferringExpression(v) {
    bestGraph := ⊥;
    H := ⟨{v}, ∅⟩;
    return findGraph(v, bestGraph, H);
}

findGraph(v, bestGraph, H) {
    if (bestGraph.cost ≤ H.cost) then return bestGraph fi;
    distractors := { n | n ∈ G.nodes ∧ matchGraphs(v, H, n, G) ∧ n ≠ v };
    if (distractors = ∅) then return H fi;
    for each edge e ∈ G.neighbors(H) do
        I := findGraph(v, bestGraph, H + e);
        if I.cost ≤ bestGraph.cost then bestGraph := I fi;
    rof;
    return bestGraph;
}

Figure 5: Sketch of the main function (makeReferringExpression) and the subgraph construction function (findGraph).
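The control structure of Figure 5 can be sketched in runnable form. For brevity this sketch substitutes a naive permutation-based matcher for matchGraphs and hard-codes uniform unit costs, so it illustrates the branch and bound scheme rather than reproducing the paper's implementation; the toy scene is ours.

```python
from itertools import permutations

def nodes_of(E):
    """Vertices touched by an edge set."""
    return {u for u, _, v in E} | {v for u, _, v in E}

def refers(v, H, w, G):
    """Naive stand-in for matchGraphs: try every injective mapping of
    H's nodes into G's nodes that sends v to w."""
    hn = sorted(nodes_of(H) | {v})
    for image in permutations(sorted(nodes_of(G)), len(hn)):
        pi = dict(zip(hn, image))
        if pi[v] == w and all((pi[a], l, pi[b]) in G for a, l, b in H):
            return True
    return False

def neighbors(H, v, G):
    """Edges of G adjacent to the subgraph H (which always contains v)."""
    hn = nodes_of(H) | {v}
    return {e for e in G - H if e[0] in hn or e[2] in hn}

def find_graph(v, H, best, G):
    """Branch and bound as in Figure 5.  H is a set of edges; None plays
    the role of the undefined graph, with infinite cost."""
    def c(E):  # uniform costs: 1 per node, 1 per edge
        return float("inf") if E is None else len(nodes_of(E) | {v}) + len(E)
    if c(best) <= c(H):
        return best                       # not worth extending H
    distractors = [n for n in nodes_of(G) if n != v and refers(v, H, n, G)]
    if not distractors:
        return H                          # H is distinguishing
    for e in neighbors(H, v, G):
        found = find_graph(v, H | {e}, best, G)
        if c(found) <= c(best):
            best = found
    return best

# A reduced toy scene: two dogs, one doghouse, only d1 is in it.
G = {("d1", "dog", "d1"), ("d2", "dog", "d2"),
     ("d3", "doghouse", "d3"), ("d1", "in", "d3"), ("d3", "contains", "d1")}
best = find_graph("d1", set(), None, G)
print(best)  # a cheapest distinguishing edge set for d1, e.g. the 'in' edge
```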
[Figure 6 shows three values for H: (i) the single node d₁; (ii) a node with chihuahua and brown loops and a left_of edge to a second node; (iii) a node with an in edge to a second node.]
Figure 6: Three values for H in the generation process for d₁.
(In the next section we describe a number of more interesting cost functions and discuss the impact these have on the output of the algorithm.) We call the function makeReferringExpression (given in Figure 5) with d₁ as parameter. In this function the variable bestGraph (for the best solution found so far) is initialized as the empty graph and the variable H (for the distinguishing subgraph under construction) is initialized as the graph containing only node d₁ ((i) in Figure 6). Then the function findGraph (see also Figure 5) is called, with parameters d₁, bestGraph and H. In this function, it is first checked whether the costs of H (the graph under construction) are higher than the costs of the bestGraph found so far. If that is the case, it is not worth extending H since, due to the monotonicity constraint, it will never end up being cheaper than the current bestGraph. The initial value of bestGraph is the empty, undefined graph, and since its costs are infinite, we continue. Then the set of distractors (the objects from which the intended referent should be distinguished, Dale & Reiter 1995) is calculated. In terms of the graph perspective this is the set of nodes in the scene graph G (other than the target node v) to which the graph H refers. It is easily seen that the initial value of H, i.e., (i) in Figure 6, refers to every node in G. Hence, as one would expect, the initial set of distractors is G.nodes − {d₁}. Next we check whether the current set of distractors is empty. If so, we have managed to find a distinguishing graph, which is subsequently stored in the variable bestGraph. In this first iteration, this is obviously not the case and we continue, recursively trying to extend H by adding adjacent (neighboring) edges until either a distinguishing graph has been constructed (all distract-
matchGraphs(v, H, w, G) {
    if H.edges(v, v) ⊈ G.edges(w, w) then return false fi;
    matching := { π.v = w };
    Y := H.neighbors(v);
    return matchHelper(matching, Y, H);
}

matchHelper(matching, Y, H) {
    if |matching| = |H| then return true fi;
    if Y = ∅ then return false fi;
    choose a fresh, unmatched y from Y;
    Z := { z ∈ G | z might be matched to y };
    for each z ∈ Z do
        if z is a valid extension of the mapping
        then if matchHelper(matching ∪ { π.y = z }, Y, H) then return true fi;
        fi;
    rof;
    return false;
}

Figure 7: Sketch of the function testing for subgraph isomorphism (matchGraphs).
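A runnable sketch in the spirit of Figure 7 (names and the toy scene are ours; the computation of the candidate set is folded into a validity check per candidate z):

```python
def edges_between(E, a, b):
    """Labels on edges from a to b in edge set E."""
    return {lab for x, lab, y in E if (x, y) == (a, b)}

def match_graphs(v, H, w, G):
    """Recursive matcher following the shape of Figure 7: check the loops
    at v against those at w, start from pi = {v: w}, then grow the matching
    one fresh adjacent node of H at a time, backtracking over images."""
    if not edges_between(H, v, v) <= edges_between(G, w, w):
        return False
    h_nodes = {u for u, _, x in H} | {x for u, _, x in H} | {v}
    g_nodes = {u for u, _, x in G} | {x for u, _, x in G}

    def helper(pi):
        if len(pi) == len(h_nodes):     # every node of H is matched
            return True
        # a fresh node of H adjacent to the matched part
        fringe = [y for y in sorted(h_nodes - pi.keys())
                  if any(a in pi or b in pi for a, _, b in H if y in (a, b))]
        if not fringe:
            return False
        y = fringe[0]
        for z in sorted(g_nodes - set(pi.values())):
            # z is valid if it has y's loops and matches y's edges to and
            # from the already-matched nodes
            ok = (edges_between(H, y, y) <= edges_between(G, z, z) and
                  all(edges_between(H, y, x) <= edges_between(G, z, pi[x]) and
                      edges_between(H, x, y) <= edges_between(G, pi[x], z)
                      for x in pi))
            if ok and helper({**pi, y: z}):
                return True
        return False

    return helper({v: w})

# Toy scene and the "dog in the doghouse" graph:
G = {("d1", "dog", "d1"), ("d2", "dog", "d2"),
     ("d3", "doghouse", "d3"), ("d1", "in", "d3")}
H = {("x", "dog", "x"), ("x", "in", "y"), ("y", "doghouse", "y")}
print(match_graphs("x", H, "d1", G))  # True: only d1 sits in a doghouse
print(match_graphs("x", H, "d2", G))  # False
```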
ors are ruled out) or the costs of H exceed the costs of the bestGraph found so far. While bestGraph is still the empty graph (i.e., no distinguishing graph has been found yet), the algorithm continues until H is a distinguishing graph. Which distinguishing graph is found first (if one or more exist) depends on the order in which the adjacent edges are tried. Suppose for the sake of argument that the first distinguishing graph to be found is (ii) in Figure 6. This graph is returned and stored in bestGraph. The costs associated with this graph are 5 points (two nodes and three edges). At this stage in the generation process only graphs with lower costs are worth investigating, which yields a drastic reduction of the search space. In fact, there are only a few distinguishing graphs which cost less. After a number of iterations the algorithm will find the cheapest solution (given this particular, simple definition of the cost function), which is (iii) in Figure 6.
3.3 Subgraph Isomorphism testing
Figure 7 contains a sketch of the part of the algorithm which tests for subgraph isomorphism, matchGraphs. This function is called each time the distractor set is calculated. It tests whether the pair (v, H) can refer to (w, G), or put differently, it checks whether there exists an isomorphism π such that H ⊑π G with π.v = w. The function matchGraphs first determines whether the looping edges starting from node v (i.e., the properties of v) match those of w. If not (e.g., v is a dog and w is a doghouse), we can immediately discard the matching. Otherwise we start with the matching π.v = w, and expand it recursively. In each recursion step a fresh and as yet unmatched node y from H is selected which is adjacent to one of the nodes in the current matching. For each y we calculate the set Z of possible nodes in G to which y can be matched. This set consists of all the nodes in G which have the same looping edges as y and the same edges to and from other nodes in the domain of the current matching function π:
Z := { z | z ∈ G.nodes ∧
           H.edges(y, y) ⊆ G.edges(z, z) ∧
           ∀x ∈ H.neighbors(y) ∩ Dom(π) :
               (H.edges(y, x) ⊆ G.edges(z, π.x) ∧
                H.edges(x, y) ⊆ G.edges(π.x, z)) }
The matching can now be extended with π.y = z, for z ∈ Z. The algorithm then branches over all these possibilities. Once a mapping π has been found which has exactly as many elements as H has nodes, we have found a subgraph isomorphism. If there are still unmatched nodes in H, or if all possible extensions with a node y have been checked and no matching could be found, the test for subgraph isomorphism has failed.
3.4 A note on the implementation
The basic algorithm outlined in Figures 5 and 7 has been implemented in Java. Various optimizations increase the efficiency of the algorithm, as certain calculations need not be repeated in each iteration (e.g., the set G.neighbors(H)). In addition, the user has the possibility of specifying the cost function in a way which he or she sees fit.
4 Search methods and cost functions
Arguably, the algorithm outlined above is a meta-
algorithm, since by formulating the cost func-
tion in certain ways we can simulate various al-
gorithms known from the generation literature.
4.1 Full (relational) Brevity Algorithm
The algorithm described in the previous section
can be seen as a generalization of Dale’s (1992)
Full Brevity algorithm, in the sense that there
is a guarantee that the algorithm will output the
shortest possible description, if one exists. It is
also an extension of the Full Brevity algorithm,
since it allows for relational descriptions, as does
the Dale & Haddock (1991) algorithm. The latter
algorithm has a problem with infinite recursions;
in principle their algorithm could output descrip-
tions like “the dog in the doghouse which con-
tains a dog which is in a doghouse which ...etc.”
Dale & Haddock propose to solve this problem
by stipulating that a property or relation may only
be included once. In the graph-based model de-
scribed above the possibility of such infinite re-
cursions does not arise, since a particular edge is
either present in a graph or not.²
² Notice incidentally that Dale's (1992) Greedy Heuristic
algorithm can also be cast in the graph framework, by sort-
ing edges on their descriptive power (measured as a count
of the number of occurrences of this particular edge in the
scene graph). The algorithm then adds the most discrimin-
ating edge first (or the cheapest, if there are various equally
distinguishing edges) and repeats this process until a distin-
guishing graph is found.
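The footnote's greedy variant can be sketched as follows, restricted to looping edges for brevity; the descriptive-power measure (rarity of the edge's label in the scene graph) is our reading of the footnote, and the toy scene is ours.

```python
from collections import Counter

def greedy_description(v, G):
    """Sketch of the Greedy Heuristic recast on graphs: try the loops at v
    in order of descriptive power (rarest label first) and keep an edge
    only if it rules out at least one remaining distractor."""
    label_freq = Counter(lab for _, lab, _ in G)
    candidates = sorted((e for e in G if e[0] == v == e[2]),
                        key=lambda e: label_freq[e[1]])
    nodes = {u for u, _, x in G} | {x for u, _, x in G}
    chosen, distractors = set(), nodes - {v}
    for e in candidates:
        if not distractors:
            break
        survivors = {d for d in distractors if (d, e[1], d) in G}
        if survivors < distractors:          # rules out something new
            chosen.add(e)
            distractors = survivors
    return chosen if not distractors else None  # None: no distinguishing set

G = {("d1", "dog", "d1"), ("d1", "small", "d1"),
     ("d2", "dog", "d2"), ("d2", "large", "d2"),
     ("d3", "doghouse", "d3")}
print(greedy_description("d1", G))  # {('d1', 'small', 'd1')}
```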
4.2 Incremental Algorithm
Dale & Reiter’s (1995) Incremental Algorithm,
generally considered the state of the art in this
field, has the following characteristic properties.
(1) It defines a list of preferred attributes, list-
ing the attributes which human speakers prefer
for a certain domain. For example, when dis-
cussing domestic animals, speakers usually first
describe the “type” of animal (dog, cat), before
absolute properties such as “color” are used. If
that still is not sufficient to produce a distin-
guishing description, relative properties such as
“size” can be included. Thus, the list of preferred
attributes for this particular domain could be
⟨type, color, size⟩. The Incremental Algorithm
now simply iterates through this list, adding a
property if it rules out any distractors not pre-
viously ruled out. (2) The algorithm always in-
cludes the “type” attribute, even if it is not distin-
guishing. And (3) the algorithm allows subsump-
tion hierarchies on certain attributes (most notably
for the “type” attribute) stating things like a fox ter-
rier is a dog, and a dog is an animal. In such a hier-
archy we can specify what the basic level value is
(in this case it is dog). Dale & Reiter claim that
there is a general preference for basic level values,
and hence their algorithm includes the basic level
value of an attribute, unless values subsumed by
the basic level value rule out more distractors.
These properties can be incorporated in the
graph framework in the following way. (1) The
list of preferred attributes can easily be modelled
using the cost function. All “type” edges should
be cheaper than all other edges (in fact, they could
be for free), and moreover, the edges correspond-
ing to absolute properties should cost less than
those corresponding to relative ones. This gives
us exactly the effect of having preferred attributes.
(2) It also implies that the “type” of an object is
always included if it is in any way distinguishing.
That by itself does not guarantee that type is al-
ways included. The most principled and effi-
cient way to achieve that would be to reformu-
late the findGraph algorithm in such a way that
the “type” loop is always included. (Given such a
minor modification, the algorithm described in the
previous section would output (iii) from Figure 3
instead of (iii) from Figure 6 when applied to d₁.)
Such a general modification might be undesirable
from an empirical point of view however, since
in various domains it is very common to not in-
clude type information, for instance when the do-
main contains only objects of the same type (see
van der Sluis & Krahmer 2001). (3) The subsump-
tion hierarchy can be modelled in the same way
as preferred attributes are: for a given attribute,
the basic level value should have the lowest costs
and the values farthest away from the basic level
value should have the highest costs. This implies
that adding an edge labeled dog is cheaper than
adding an edge labeled chihuahua, unless more
(or more expensive) edges are needed to build
a distinguishing graph including dog than are re-
quired for the graph including chihuahua. Assum-
ing that the scene representation is well-defined,
the algorithm never outputs a graph which con-
tains both dog and chihuahua, since there will al-
ways be a cheaper distinguishing graph omitting
one of the two edges.
So, we can recast the Incremental Algorithm
quite easily in terms of graphs. Note that the
original Incremental Algorithm only operates on
properties, looped edges in graph terminology. It
is worth stressing that when all edges in the scene
graph are of the looping variety, testing for sub-
graph isomorphism becomes trivial and we re-
gain polynomial complexity. However, the above
graph-theoretical formalization of the Incremental
Algorithm does not fully exploit the possibilities
offered by the graph framework and the use of cost
functions. First, from the graph-theoretical per-
spective the generation of relational descriptions
poses no problems whatsoever, while the incre-
mental generation of relational descriptions is by
no means trivial (see e.g., Theune 2000, Krahmer
& Theune 1999). In fact, while it could be argued
to some extent that incremental selection of prop-
erties is psychologically plausible, this somehow
seems less plausible for incremental generation of
relational extensions.³ Notice that the use of a
³ As Dale & Reiter (1995:248) point out, redundant prop-
As Dale & Reiter (1995:248) point out, redundant prop-
erties are not uncommon. That is: in certain situations people
may describe an object as “the white bird” even though
the simpler “the bird” would have been sufficient (cf. Pech-
mann 1989, see also Krahmer & Theune 1999 for discus-
sion). However, a similar argument seems somewhat far-
fetched when applied to relations. It is unlikely that someone
would describe an object as “the dog next to the tree in front
of the garage” in a situation where “the dog in front of the
garage” would suffice.
cost function to simulate subsumption hierarch-
ies for properties carries over directly to relations;
for instance, the costs of adding an edge labeled
next to should be less than those of adding one
labeled left of or right of. Hence, next to will be pre-
ferred, unless using left of or right of has more dis-
criminative power. Another advantage of the way
the graph-based algorithm models the list of pre-
ferred attributes is that more fine-grained distinc-
tions can be made than can be done in the Incre-
mental Algorithm. In particular, we are not forced
to say that values of the attribute “type” are always
preferred over values of the attribute “color”. In-
stead we have the freedom to assign edges labeled
with a common type value (e.g., dog) a lower cost
than edges labeled with uncommon colors (such
as Vandyke-brown), while at the same time edges
labeled with obscure type values, such as Polish
Owczarek Nizinny sheepdog, can be given a higher
cost than edges labeled with common colors such
as brown.
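One way to realize such fine-grained preferences is a simple label-to-cost table; every number below is an illustrative assumption, not a claim about what the implemented system uses.

```python
# Hypothetical label-to-cost table encoding the preferences discussed
# above: basic-level types free, common colors cheap, relative properties
# and marked relations dearer, obscure values expensive.
EDGE_COST = {
    "dog": 0, "doghouse": 0,                      # basic-level type values
    "chihuahua": 1,                               # more specific type value
    "brown": 1, "white": 1,                       # common absolute properties
    "small": 2, "large": 2,                       # relative properties
    "next_to": 1, "left_of": 2, "right_of": 2,    # next_to is preferred
    "vandyke_brown": 3,                           # uncommon color
    "polish_owczarek_nizinny_sheepdog": 4,        # obscure type value
}

def edge_cost(edge, table=EDGE_COST, default=2):
    """Cost of adding one labeled edge under the table above."""
    _, label, _ = edge
    return table.get(label, default)

# A common type value is cheaper than a rare color, while an obscure type
# value is dearer than a common color, as argued in the text:
print(edge_cost(("d1", "dog", "d1")) <
      edge_cost(("d1", "vandyke_brown", "d1")))   # True
print(edge_cost(("d1", "brown", "d1")) <
      edge_cost(("d1", "polish_owczarek_nizinny_sheepdog", "d1")))  # True
```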
4.3 Stochastic cost functions
One of the important open questions in natural
language generation is how the common, rule-
based approaches to generation can be combined
with recent insights from statistical NLP (see e.g.,
Langkilde & Knight 1998, Malouf 2000 for par-
tial answers). Indeed, when looking at the Incre-
mental Algorithm, for instance, it is not directly
obvious how statistical information can be integ-
rated in the algorithm. Arguably, this is differ-
ent when we have cost functions. One can easily
imagine deriving a stochastic cost function from a
sufficiently large corpus and using it in the graph-
theoretical framework (the result looks like but is
not quite a Markov Model). As a first approxima-
tion, we could define the costs of adding an edge
a133
a99a165a164
a101 in terms of the probability a13
a99a165a164
a101 that
a164 oc-
curs in a distinguishing description (estimated by
counting occurrences):
a133
a99a165a164
a101a105a15a63a212 log
a26
a99
a13
a99a165a164
a101a224a101
Thus, properties which occur frequently are
cheap, properties which are relatively rare are
expensive. In this way, we would probably derive
that dog is indeed less expensive than Vandyke
brown and that brown is less expensive than Polish
Owczarek Nizinny sheepdog.
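A sketch of such a stochastic cost function, with a toy "corpus" of edge labels standing in for real corpus counts:

```python
import math
from collections import Counter

def stochastic_costs(corpus_labels):
    """c(e) = -log2 P(e), with P(e) estimated by the relative frequency
    of the edge label in a (hypothetical) corpus of distinguishing
    descriptions."""
    counts = Counter(corpus_labels)
    total = sum(counts.values())
    return {lab: -math.log2(n / total) for lab, n in counts.items()}

# Toy counts: "dog" is frequent, "vandyke_brown" rare.
corpus = ["dog"] * 8 + ["brown"] * 7 + ["vandyke_brown"] * 1
c = stochastic_costs(corpus)
print(c["dog"] < c["brown"] < c["vandyke_brown"])  # True: frequent = cheap
```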
5 Concluding remarks
In this paper, we have presented a general graph-
theoretical approach to content-determination for
referring expressions. The basic algorithm has
clear computational properties: it is NP com-
plete, but there exist various modifications (a
ban on non-looping edges, planar graphs, upper
bound to the number of edges in a distinguish-
ing graph) which make the algorithm polynomial.
The algorithm is fully implemented. The graph
perspective has a number of attractive proper-
ties. The generation of relational descriptions is
straightforward; the problems which plague some
other algorithms for the generation of relational
descriptions do not arise. The use of cost func-
tions allows us to model different search meth-
ods, each restricting the search space in its own
way. By defining cost functions in different ways,
we can model and extend various well-known al-
gorithms from the literature such as the Full Brev-
ity Algorithm and the Incremental Algorithm. In
addition, the use of cost functions paves the way
for integrating statistical information directly in
the generation process.⁴
Various important ingredients of other genera-
tion algorithms can be captured in the algorithm
proposed here as well. For instance, Horacek
(1997) points out that an algorithm should not col-
lect a set of properties which cannot be realized
given the constraints of the grammar. This prob-
lem can be solved, following Horacek’s sugges-
tion, by slightly modifying the algorithm in such
a way that for each potential edge it is immediately investigated whether it can be expressed by the realizer.

A final advantage of the graph model that certainly deserves further investigation is the following. We can look at a graph such as that in Figure 2 as a Kripke model. The advantage of this way of looking at it is that we can use tools from modal logic to reason about these structures. For example, we can reformulate the problem of determining the content of a distinguishing description in terms of hybrid logic (see e.g., Blackburn 2000) as the search for a formula φ such that

@_v φ ∧ ∀w ((w ≠ v) → ¬@_w φ)

In words: when we want to refer to node v, we are looking for that distinguishing formula φ which is true of ("at") v but not of any w different from v. One advantage of this perspective is that logical properties which are usually considered problematic from a generation perspective (such as not having a certain property) fit in very well with the logical perspective.

Van Deemter's (2000) proposal to generate (distributional) distinguishing plural descriptions (such as the dogs) can also be modelled quite
easily. Van Deemter’s algorithm takes as input a
set of objects, which in our case translates into a
set of nodes from the scene graph. The algorithm
should be reformulated in such a way that it tries to
generate a subgraph which can refer to each of the
nodes in the set, but not to any of the nodes in the
scene graph outside this set. Krahmer & Theune
(1999) present an extension of the Incremental Al-
gorithm which takes context into account. They
argue that an object which has been mentioned in
the recent context is somehow salient, and hence
can be referred to using fewer properties. This
is modelled by assigning salience weights to ob-
jects (basically using a version of Centering The-
ory (Grosz et al. 1995) augmented with a recency
effect), and by defining the set of distractors as
the set of objects with a salience weight higher than or equal to that of the target object. In terms of the
graph-theoretical framework, one can easily ima-
gine assigning salience weights to the nodes in the
scene graph, and restricting the distractor set es-
sentially as Krahmer & Theune do. In this way,
distinguishing graphs for salient objects will gen-
erally be smaller than those of non-salient objects.
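A minimal sketch of this salience-based restriction (node names and weights are invented): only nodes at least as salient as the target count as distractors, so salient targets leave fewer objects to rule out.

```python
# Hypothetical salience weights for three nodes of a scene graph;
# d1 and d2 are assumed to have been mentioned recently.
salience = {'d1': 10, 'd2': 10, 'd3': 2}

def distractors(target, nodes):
    """Distractor set restricted as in Krahmer & Theune (1999): only
    nodes whose salience is at least that of the target are kept."""
    return {n for n in nodes
            if n != target and salience[n] >= salience[target]}

# Referring to the salient d1 only requires ruling out d2, so its
# distinguishing graph can be smaller than that of non-salient d3:
assert distractors('d1', salience) == {'d2'}
assert distractors('d3', salience) == {'d1', 'd2'}
```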
Acknowledgements
Thanks are due to Alexander Koller, Kees van Deemter, Paul Piwek, Mariët Theune and two anonymous referees for discussions and comments on an earlier version of this paper.

References
Appelt, D. (1985), Planning English Referring Expres-
sions, Artificial Intelligence 26:1-33.
Blackburn, P. (2000), Representation, Reasoning, and
Relational Structure: A Hybrid Logic Manifesto,
Logic Journal of the IGPL 8(3):339-365.
Dale, R. (1992), Generating Referring Expressions:
Constructing Descriptions in a Domain of Objects
and Processes, MIT Press, Cambridge, Massachu-
setts.
Dale, R. & N. Haddock (1991), Generating Refer-
ring Expressions Involving Relations, Proceedings
of EACL, Berlin, 161-166.
Dale, R. & E. Reiter (1995), Computational Interpretations of the Gricean Maxims in the Generation of Referring Expressions, Cognitive Science 18:233-263.
van Deemter, K. (2000), Generating Vague Descrip-
tions, Proceedings INLG, Mitzpe Ramon.
Eppstein, D. (1999), Subgraph Isomorphism in Planar
Graphs and Related Problems, J. Graph Algorithms
and Applications 3(3):1-27.
Garey, M. & D. Johnson (1979), Computers and
Intractability: A Guide to the Theory of NP-
Completeness, W.H. Freeman.
Grosz, B., A. Joshi & S. Weinstein (1995), Centering:
A Framework for Modeling the Local Coherence
of Discourse, Computational Linguistics 21(2):203-
225.
Horacek, H. (1997), An Algorithm for Generating Ref-
erential Descriptions with Flexible Interfaces, Pro-
ceedings of the 35th ACL/EACL, Madrid, 206-213.
Krahmer, E. & M. Theune (1999), Efficient Generation
of Descriptions in Context, Proceedings of Work-
shop on Generation of Nominals, R. Kibble and K.
van Deemter (eds.), Utrecht, The Netherlands.
Langkilde, I. & K. Knight (1998), The Practical Value of N-Grams in Generation, Proceedings INLG, Niagara-on-the-lake, Ontario, 248-255.
Malouf, R. (2000), The Order of Prenominal Adjectives in Natural Language Generation, Proceedings of the 38th ACL, Hong Kong.
Pechmann, T. (1989), Incremental Speech Production and Referential Overspecification, Linguistics 27:98-110.
Reiter, E. (1990), The Computational Complexity of Avoiding Conversational Implicatures, Proceedings of the 28th ACL, 97-104.
van der Sluis, I. & E. Krahmer (2001), Generating
Referring Expressions in a Multimodal Context:
An Empirically Motivated Approach, Proceedings
CLIN, W. Daelemans et al. (eds), Rodopi, Amster-
dam/Atlanta.
Stone, M. & B. Webber (1998), Textual Economy
Through Close Coupling of Syntax and Semantics,
Proceedings INLG, Niagara-on-the-lake, Ontario,
178-187.
Theune, M. (2000), From Data to Speech: Language
Generation in Context, Ph.D. dissertation, Eind-
hoven University of Technology.
