Strategies for Comparison in Encyclopaedia Descriptions 
Maria Milosavljevic and Robert Dale 
Microsoft Research Institute 
Macquarie University 
North Ryde NSW 2109 Australia 
{mariam,rdale}@mpce.mq.edu.au
Abstract 
Comparisons are typically employed to distin- 
guish similar entities, or to illustrate a prop- 
erty of an entity by referring to another com- 
monly known entity which shares that prop-
erty. Based on an analysis of a corpus of 
encyclopaedia texts, we define three types of 
comparisons and outline some strategies for 
applying these in the generation of entity de- 
scriptions. We describe how these compar- 
ison strategies are used within the PEBA-II 
hypertext generation system to generate de- 
scriptions of animals. 
1 Introduction and Aims 
In this paper, we outline some strategies 
for comparison which we use in PEBA-II, 
a hypertext generation system which pro- 
duces encyclopaedia descriptions of entities as 
World Wide Web (www) documents, based 
on an underlying taxonomic knowledge base. 
PEBA-II is part of a larger research pro- 
gramme built around the idea of an intelli- 
gent on-line encyclopaedia, where the descrip- 
tions produced by the system vary for differ- 
ent users and at different times. Our work 
is grounded in the domain of animal descrip- 
tions, although similar issues arise in many 
other domains. 
Comparisons are widespread within exist- 
ing encyclopaedia descriptions. In particular, 
when describing a new concept to a user, a 
comparison may be made with reference to 
other known concepts or ideas, enabling the 
hearer to more easily process and understand 
the new material (see [Milosavljevic 1996]).
So, for example, in the context of animal de- 
scriptions, if the user knows about the porcu- 
pine and requests a description of the echidna, 
then we might describe the echidna by high- 
lighting both its similarities to and differences 
from the porcupine. 
Clearly this requires us to make use of some 
notion of a user model, and in a way that is 
distinct from previous work in user-modelling 
in text generation. Our aim is to produce 
texts which introduce new concepts by ref- 
erence to existing knowledge the user is as- 
sumed to have, thus employing the user model 
to greater advantage; past research (see, in 
particular, [Paris 1987]) has concentrated on
avoiding the production of texts which repeat 
what the user already knows. 
By making use of a discourse model, we 
can also generate comparisons that take ac- 
count of entities that have been mentioned 
in the previous discourse. This is particu- 
larly important in the context of the dynamic 
construction of hypertext documents from 
an underlying representation: by employing 
text generation techniques, we can produce 
context-dependent descriptions which vary 
depending on the information which has al- 
ready been presented to the user, thus over- 
coming some of the limitations of hypertext 
documents which have been constructed sim- 
ply by breaking an existing linear text into 
pieces. As has been noted by others (see, 
for example, [Reiter et al. 1992] and [Moore
1989]), the dynamic generation of hypertext
also permits the user to effectively drive the 
text generation system, relieving the
system of some of the responsibility of reason-
ing about what to present to the user. 
In Section 2, we provide an overview of the 
PEBA-II system; in Section 3, we provide a def- 
inition of comparison and identify three types 
of comparison on the basis of a corpus anal- 
ysis; and in Section 4 we describe how the 
corresponding discourse strategies are imple- 
mented in PEBA-II. Section 5 ends the paper 
by pointing to some future research directions. 
2 An Overview of PEBA-II 
The architecture of the PEBA-II system is 
shown in Figure 1; the components are as fol- 
lows. 
The knowledge base that currently under- 
lies the system has been hand-constructed 
from an analysis of encyclopaedia descriptions 
of animals and constitutes a taxonomy of the 
Linnaean animal classes with their associated 
properties. Particular properties may also be 
labeled as distinguishing for a specific class. 
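The kind of taxonomic knowledge base described here might be sketched as follows (an illustrative sketch only; the entities, properties and representation shown are ours, not PEBA-II's actual knowledge base):

```python
# A toy taxonomy: each entity maps to (parent, own properties).
# Entities and properties are illustrative, not PEBA-II's.
taxonomy = {
    "animal":    (None, {}),
    "mammal":    ("animal", {"skin-covering": "fur"}),
    "monotreme": ("mammal", {"reproduction": "lays eggs"}),
    "echidna":   ("monotreme", {"covering": "spines"}),
}

# Particular properties may be marked as distinguishing for a class.
distinguishing = {
    "monotreme": ["reproduction"],
}

def properties(entity):
    """Collect an entity's properties, inherited down the taxonomy;
    a more specific class's value overrides an ancestor's."""
    props = {}
    while entity is not None:
        parent, own = taxonomy[entity]
        for attr, val in own.items():
            props.setdefault(attr, val)  # keep the most specific value
        entity = parent
    return props
```

On this representation, `properties("echidna")` would inherit the Monotreme's egg-laying property as well as the echidna's own spines.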
The plan library consists of discourse plans 
which are used by the text planning compo- 
nent. Currently, the system makes use of two 
high level discourse plans, which we name 
identify and compare-and-contrast. The iden- 
tify discourse plan is used to describe an entity 
and the compare-and-contrast discourse plan is 
used to compare two entities. These discourse 
plans are similar in spirit but rather differ- 
ent in content to the similarly-named schemas 
used by McKeown [1985], with a number of
the differences arising from the fact that we 
are generating hypertext pages. 
A new discourse goal is generated by the 
user clicking on a hypertext link in the cur- 
rent document being viewed. Given this new 
goal, the text planning component selects any 
relevant information from the knowledge base 
Figure 1: The architecture of the PEBA-II sys-
tem. (Diagram: communicative/discourse goals
enter the Text Planning Component, which con-
sults the Animal Facts knowledge base and pro-
duces a Discourse Plan; the Surface Realisation
Component, drawing on a Lexicon, renders this
as HTML text for display in a World Wide Web
viewer, from which new HTML commands re-
turn to the system.)
and organises the information according to 
the current discourse plan. The leaves of the 
instantiated discourse plan are then realised 
via a simple template mechanism. 1 
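The template mechanism for realising the leaves of an instantiated discourse plan might be sketched as follows (a minimal sketch; the template names and strings are illustrative, not the system's actual templates):

```python
# Each discourse-plan leaf names a template; slots are filled in
# by simple string substitution. Templates shown are illustrative.
templates = {
    "isa": "The {entity} is a type of {class_}.",
    "has-attribute": "The {entity} has {value}.",
    "contrast": "The {entity} {value} whereas the {other} {other_value}.",
}

def realise(leaf):
    """Realise a discourse-plan leaf by filling its template."""
    name, slots = leaf
    return templates[name].format(**slots)

leaf = ("isa", {"entity": "Echidna", "class_": "Monotreme"})
print(realise(leaf))  # -> The Echidna is a type of Monotreme.
```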
The output from the PEBA-II system is a 
document marked up using a subset of HTML 
commands. This document may be displayed 
using any www document renderer such as 
Mosaic or Netscape. The user poses new 
discourse goals to the system by clicking on 
any of the hypertext tags, and the cycle 
continues. 2 
The combination of text generation and hy- 
pertext has been explored by others, most no- 
tably in Moore's [1989, 1995] PEA and in Re-
1Although we have experimented with using El-
hadad's [1992] FUF realisation engine, for the texts
we currently generate a template-based mechanism is
faster and seems quite adequate. Speed is important
in the context of Web-based generation: see Tulloch
and Dale [1995] for some ideas on addressing the prob-
lems here.
2A version of PEBA-II is available on the Web at
URL: http://www.mpce.mq.edu.au/msi/peba.html.
iter et al.'s [1992, 1995] IDAS. PEBA-II is clos-
est in concept to the IDAS system; a more de- 
tailed description of PEBA-II can be found in 
[Milosavljevic, Tulloch and Dale 1996]. Knott
et al. [1996] discuss some further issues in-
volved in combining hypertext with natural 
language generation. 
3 Defining Comparisons 
3.1 Data analysis 
A corpus analysis has been conducted to 
identify how comparisons are used in ency- 
clopaedia articles, so that these techniques
may be built into the PEBA-II system. In 
the first instance, we have concentrated on 
the domain of animal descriptions; we in- 
tend to widen the scope of this analysis to 
other domains in order to provide a more 
domain-independent theory of comparative 
forms. The two encyclopaedias analysed
were Microsoft Encarta [Microsoft 1995] and
Groliers Multimedia Encyclopaedia [Groliers
1992]; each encyclopaedia yielded around 1200
animal entries, and from these we collated 
a subcorpus of sentences involving compari- 
son. This subcorpus contains 1722 sentences 
from the Encarta corpus, and 1557 from the 
Groliers corpus. 
The aim of the corpus analysis was to 
reverse-engineer the comparisons found in an- 
imal descriptions in order to answer the fol- 
lowing questions: 
• What entities are compared in descrip-
tive texts and how do they relate to each
other? What properties of these entities
are used in comparisons?
• Why are particular entities compared?
Why are some entities better compara-
tors than others?
• What techniques do we need to build into
a text generation system to be able to
produce similar comparisons?
3.2 Some Definitions 
3.2.1 Comparison 
We will adopt the following definitions: 
A comparative proposition is a 
proposition whose purpose is to 
draw the hearer's attention to a dif- 
ference or a similarity that two en- 
tities have for the value of a shared 
attribute. 3 
A comparison is the linguistic re- 
alisation of a set of one or more 
comparative propositions, where the 
purpose of the set of propositions is 
to draw the hearer's attention to one 
or more differences or similarities be- 
tween two entities. 
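The definitions above can be rendered as simple data structures (a sketch; the field names here are ours, introduced for illustration rather than taken from the paper's implementation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Property:
    """A property is a tuple of an attribute and a value."""
    attribute: str   # e.g. "colour"
    value: str       # e.g. "red"

@dataclass(frozen=True)
class ComparativeProposition:
    """Draws attention to a difference or similarity between two
    entities for the value of a shared attribute."""
    entity_a: str
    entity_b: str
    attribute: str   # the shared attribute
    kind: str        # "similarity" or "difference"

# A comparison is then the realisation of a set of one or more
# comparative propositions, e.g.:
comparison = {
    ComparativeProposition("echidna", "porcupine", "covering", "similarity"),
    ComparativeProposition("echidna", "porcupine", "diet", "difference"),
}
```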
We have identified three different types of 
comparative forms that appear in descrip- 
tive texts, which we refer to here as DI- 
RECT COMPARISONS, CLARIFICATORY COM- 
PARISONS, and ILLUSTRATIVE COMPARISONS. 
Of these three types, only the first has been 
explored to any great degree in the context 
of natural language generation: both McKe- 
own [1985] and Maybury [1995] have looked
at various aspects of direct comparisons. 
3.2.2 Direct Comparisons 
A DIRECT COMPARISON is a comparison 
whose purpose is to compare two entities 
where neither entity is more central to the 
discourse than the other. In the context of 
a language generation system like PEBA-II, 
direct comparisons arise when the user en- 
ters a request such as: What is the difference 
between the Echidna and the African Porcu- 
pine? PEBA-II generates the text shown in 
Figure 2 in response to such a query. 
This text is essentially 'bi-focal': the 
echidna and the porcupine are equally impor- 
3In the terminology we adopt here, a PROPERTY is
a tuple consisting of an ATTRIBUTE and a VALUE; for
example, (colour, red).
The Echidna and the African Porcupine

The Echidna, also known as the spiny Anteater, is a type of Monotreme. The Monotreme is a type of Mammal
that lays eggs with leathery shells similar to reptiles. The African Porcupine is a type of Placental Mammal.
The Placental Mammal is a type of Mammal that carries its developing young inside the mother's womb.
Some Comparisons: 
• Like the African Porcupine, the Echidna has a browny black coat and paler-coloured spines. 
• The Echidna is found in Australia whereas the African Porcupine is found in Africa. 
• The Echidna is a carnivore and eats ants, termites and earthworms whereas the African Porcupine is a
herbivore and eats leaves, roots and fruit. 
• The Echidna is active at dawn and dusk whereas the African Porcupine is nocturnal. 
• The Echidna lives by itself whereas the African Porcupine either lives by itself or in groups. 
• The Echidna has a lifespan of 50 years in captivity whereas the African Porcupine has a lifespan of up to 
17 years. 
Peba-II Text Generation System
Figure 2: A direct comparison as generated by PEBA-II 
tant, and the purpose of the text is to deter-
mine their similarities and differences based 
on both their relationship within a taxon- 
omy of animals (their lowest common ances- 
tor) and their attributes. This is, of course, 
the same notion of comparison that is used in 
McKeown's [1985] TEXT system.
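Finding the lowest common ancestor that grounds a direct comparison might be sketched as follows (a sketch against a toy taxonomy; the class names are illustrative):

```python
# Toy parent links in a Linnaean-style taxonomy (illustrative).
parent = {
    "mammal": "animal",
    "monotreme": "mammal",
    "placental-mammal": "mammal",
    "echidna": "monotreme",
    "african-porcupine": "placental-mammal",
}

def ancestors(entity):
    """The chain from an entity up to the taxonomy root."""
    chain = [entity]
    while entity in parent:
        entity = parent[entity]
        chain.append(entity)
    return chain

def lowest_common_ancestor(a, b):
    """The most specific class that subsumes both entities."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    return None
```

For the echidna and the African porcupine, this yields the Mammal class, which is why the text in Figure 2 situates both animals relative to the Mammal before comparing their attributes.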
The key point here is that direct compar- 
isons are generally user-initiated. More inter- 
esting from the point of view of text genera- 
tion are clarificatory and illustrative compar- 
isons: here, the entity being described by the 
system is described in relation to some other 
entity chosen by the system. 
3.2.3 Clarificatory Comparisons 
A CLARIFICATORY COMPARISON is a compar- 
ison whose purpose is to describe an entity 
by distinguishing it clearly from another en- 
tity with which it might be confused or with 
which it shares a number of salient properties. 
In such cases we will refer to the first entity 
as the FOCUSED ENTITY, and to the second 
entity as the POTENTIAL CONFUSOR. 
The main difference between a clarificatory 
comparison and a direct comparison is that 
a clarificatory comparison is made within a 
text whose purpose is to describe one entity 
and not purely to provide a comparison be- 
tween two entities. A clarificatory compari- 
son serves to describe the focused entity; thus, 
it corresponds to the user entering a request 
such as What is the echidna? In such a case, 
instead of describing the echidna in isolation, 
the system may choose to describe it using a 
clarificatory comparison with the porcupine. 
There are two reasons why a clarificatory 
comparison might be used: 
• The focused entity might be extremely 
similar to another entity, and therefore 
often confused with that entity. In this 
case, it is important that, when describ- 
ing the focused entity, it is sufficiently 
distinguished from the potential confu- 
sor. 
• Alternatively, an entity sharing a number 
of salient features with the focused en- 
tity might already be known to the user; 
in such a case, a clarificatory comparison 
between these entities may aid the user's 
understanding of the focused entity. 
For example, consider the following text ex- 
tracted from the animal corpus: 
Sheep are hollow-horned ruminants
belonging to the genus Ovis, subor- 
der Ruminata, family Bovidae. Sim- 
ilar to goats, sheep differ in their 
stockier bodies, the presence of scent 
glands in face and hind feet, and 
the absence of beards in the males. 
Domesticated sheep are also more 
timid and prefer to flock and follow 
a leader. [Groliers 1992].
In this text, the focused entity (the sheep) is 
very similar and might often be confused with 
the comparator entity (the goat); this is par- 
ticularly true of some wild sheep. A reader 
who is familiar with the comparator entity 
will also more easily form a mental picture 
of what the focused entity is like. 
There are a number of interesting research 
issues here: 
• How is a comparator entity selected? For 
example, a very appropriate comparator 
for the echidna is the porcupine, but the 
two entities are not closely related within 
the Linnaean taxonomy of animal classes. 
The reason for the choice of compara- 
tor entity here lies in the fact that both 
animals possess sharp spines--this is the 
only salient property the animals share. 
• How do we make clarificatory compar- 
isons which do not cause the user to make 
incorrect inferences? For example, if a 
user who is not familiar with sheep re- 
quests a description of the sheep and the 
system describes the sheep by informing 
the user of its similarities with the goat 
and not their differences, then the user 
could be led to believe that the two an- 
imals are more similar than they are in 
reality. The text shown above very care- 
fully describes both similarities and dif- 
ferences for only the most salient features 
which clearly distinguish the animals. 
A user model is advantageous here since the 
importance of different attribute types will 
vary from person to person. For example, if 
external appearance is the most important at- 
tribute, then we would want to compare the 
echidna to the porcupine. If, on the other 
hand, reproduction is considered a more im- 
portant feature, then we might compare the 
echidna to the platypus. The geographical lo- 
cation of the user can also play an important 
role: for example, in the texts that we have ex- 
amined, squirrels are often used as compara- 
tors; but Australians are not necessarily fa- 
miliar with the features of squirrels, and some 
North Americans might only know of the ex- 
istence of black squirrels. 
3.2.4 Illustrative Comparisons 
An ILLUSTRATIVE COMPARISON is a compari- 
son whose purpose is to describe one or more 
attributes of an entity by referring to the same 
attribute(s) of another entity with which the 
user is familiar. The difference between an il- 
lustrative comparison and a clarificatory com- 
parison is that in an illustrative comparison, 
the comparator entity, although usually of a 
similar type (in this case, an animal), may 
only share one attribute with the focused en- 
tity, and is not necessarily similar in any other 
way to the focused entity. 
Here are some illustrative comparisons from 
our corpus: 
• Powerful and aggressive animals about
the size of a large dog, baboons have
strong, elongated jaws, large cheek
pouches in which they store food, and
eyes close together. [Microsoft 1995]
• [Aye-aye] are about the size of a large
cat and have long, bushy tails, a shaggy
brown coat, and large ears. [Microsoft
1995]
• About the size of a small fox, [the Aye-
aye] has a long, bushy tail, moderately
large eyes, thick fur, and a pair of en-
larged front teeth resembling those of ro-
dents. [Groliers 1992]
• This echolocation system, similar to that
of the bat, enables the dolphin to navigate
among its companions and larger objects
and to detect fish, squid, and even small
shrimp. [Microsoft 1995]
• Slightly larger than chinchillas, the
mountain viscachas have long, rab-
bitlike ears and a long squirrel-like
tail. [Microsoft 1995]
In each of these sentences, an illustrative com- 
parison is made so that the reader can more 
easily grasp the concept being described. In- 
stead of describing the size and proportion of 
the viscacha's ears in absolute terms, a refer- 
ence to the rabbit's ears makes it easier for 
the reader to understand what the ears really 
look like. 
There is a great deal of scope for tailoring 
descriptions to a user's knowledge here: for 
example, illustrating the size of the aye-aye 
with the fox might be appropriate for a user 
who is familiar with the fox; however this il- 
lustration might not be appropriate for some- 
one located in Australia, since the fox is not 
found in Australia. The features of a particu- 
lar animal (the sheep, for example) might also 
vary geographically. 
4 Implementing Comparison Strategies
Above, we identified three particular types of 
comparisons that are present in our corpus. 
In PEBA-II, each corresponds to a particular 
discourse strategy for generating a hypertext 
page. In this section, we describe how these 
strategies are implemented within PEBA-II. 
4.1 Choosing Amongst the Strategies
We are faced with two interdependent ques- 
tions: when do we decide to describe an entity 
by comparing it to another entity, and how do 
we decide which type of comparison to use? 
Recall from earlier that PEBA-II can address 
two different discourse goals: requests to de- 
scribe some specified entity, and requests to 
compare two specified entities. The latter 
discourse goal corresponds, of course, to the 
category of direct comparisons we identified 
above. As we noted earlier, direct compar- 
isons are thus user-initiated. We are more in- 
terested here in how PEBA-II decides when it is 
appropriate to use either a clarificatory com- 
parison or an illustrative comparison. Each 
becomes an option when PEBA-II has been
asked to describe some specified entity. A 
clarificatory comparison is generated when- 
ever the entity to be described is known to 
have a POTENTIAL CONFUSOR: our implemen- 
tation of this strategy is currently very sim- 
ple, and is described in Section 4.3. Illustra- 
tive comparisons are the focus of the current 
work, and we describe our approach to these 
in Section 4.4. 
4.2 Direct Comparisons 
As mentioned earlier, the PEBA-II system al- 
lows the user to request one of two actions: 
to describe a single entity or to compare 
two entities. A direct comparison is gener- 
ated by PEBA-II whenever the user requests 
a comparison between two entities. Using a 
corpus-derived property classification system, 
the discourse plan used here pairs up those at- 
tributes which are of a similar type (for exam- 
ple, measurements such as height and length) 
and compares their values. An example www 
page generated using this strategy is shown in 
Figure 2. 
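The attribute-pairing step described above might be sketched as follows (a sketch only; the type classification shown is illustrative, not the corpus-derived classification PEBA-II actually uses):

```python
# An illustrative classification of attributes into types; the
# discourse plan pairs attributes of the two entities that share
# a type, then compares their values.
attribute_type = {
    "height": "measurement",
    "length": "measurement",
    "habitat": "location",
    "diet": "food",
}

def pair_attributes(props_a, props_b):
    """Pair up the attributes of two entities that are of a
    similar (known) type, ready for value comparison."""
    pairs = []
    for attr_a, val_a in props_a.items():
        for attr_b, val_b in props_b.items():
            a_type = attribute_type.get(attr_a)
            if a_type is not None and a_type == attribute_type.get(attr_b):
                pairs.append(((attr_a, val_a), (attr_b, val_b)))
    return pairs

echidna = {"length": "45 cm", "diet": "ants and termites"}
porcupine = {"height": "25 cm", "diet": "leaves and roots"}
# Pairs length with height (both measurements) and diet with diet.
```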
4.3 Clarificatory Comparisons 
The purpose of a clarificatory comparison is 
to ensure that the reader does not confuse the 
entity being described with some other entity. 
Such confusions are possible when the entity 
being described is similar in relevant respects 
to some other entity. 
We could try to generate such clarifica- 
tory comparisons from first principles: when 
we have to describe some entity e, we could 
search the knowledge base for entities which 
share properties with e, and then use some 
mechanism to determine whether there is any 
chance that the two entities might be con- 
fused. We could then phrase our description 
of e to make sure that we distinguish e from 
such potential confusors. For example, in de- 
scribing the rabbit, it may be important to 
distinguish it from the very similar hare in or- 
der to avoid confusion. 4 There are problems 
with such an approach: searching the knowl- 
edge base in this way would be a very costly 
process: it assumes a rather more complete 
knowledge base than we may be able to rely 
on; and, most important of all, it assumes that 
we can determine likelihood of confusability 
on the basis of some metric--but it is not at 
all clear what such a metric might be. 
Our current solution to these problems is 
to sidestep them entirely: for each entity that 
has a potential confusor--for example, sheep 
and goats--we specify this explicitly in the 
4There are clearly ideas we might use here in Mc-
Coy's [1988] work on correcting a user's misconcep-
tions; however, the real issue here lies in determin-
ing whether such a misconception might arise from a
generated comparison (see Zukerman and McConachy
[1993, 1995] for some work in this area).
knowledge base by means of a clause of the 
following form: 
(hasprop sheep 
(potential-confusor goat)) 
Then, whenever we have to describe the 
sheep, we know immediately that it has a po- 
tential confusor in the goat, and invoke a dis- 
course strategy that makes an explicit com- 
parison between the two entities. The result- 
ing text includes a comparison with the goat 
but is aimed at describing the sheep and hence 
goes further than a direct comparison between 
the sheep and goat. 
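The strategy choice based on this explicit annotation might be sketched as follows (a minimal sketch; the hasprop clause is modelled here as a simple dictionary, and the entity names are illustrative):

```python
# Explicitly precomputed potential confusors, mirroring the
# (hasprop sheep (potential-confusor goat)) clause above.
potential_confusor = {
    "sheep": "goat",
    "rabbit": "hare",
}

def choose_strategy(entity):
    """When asked to describe an entity, use a clarificatory
    comparison whenever a potential confusor is recorded;
    otherwise fall back to the plain identify plan."""
    confusor = potential_confusor.get(entity)
    if confusor is not None:
        return ("clarificatory-comparison", entity, confusor)
    return ("identify", entity)
```

So a request to describe the sheep immediately triggers the clarificatory discourse strategy against the goat, with no knowledge-base search or confusability metric required.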
Hard-coding potential confusors might be 
considered an 'easy way out', although it is 
our view that this is one of many places in 
NLG where there is benefit in adopting so- 
lutions that make use of precomputed infor- 
mation in preference to working things out 
from first principles. For example, singling 
out potential comparator entities in this way 
is no different in principle to explicitly mark- 
ing in the knowledge base those properties 
which are distinguishing characteristics, a tac- 
tic that both McKeown [1985] and we our-
selves use. 5 We have adopted this philoso- 
phy for various design decisions made in the 
development of PEBA-II, so that, for exam- 
ple, we also make use of a phrasal lexicon as 
a repository of precomputed mappings from 
semantic units to multi-word lexico-syntactic 
resources (see [Becker 1979] for an early jus-
tification for this approach). Again, a similar 
philosophy underpins the use of precomputed 
lists of preferred attributes in the work on the 
generation of referring expressions reported in 
[Reiter and Dale 1992]. Our position is that
such methods can be a virtue rather than a 
vice, since they allow broad coverage systems 
to be built more quickly. 
5Note that Maybury [1995], on the other hand, out-
lines an algorithm for determining the distinguishing 
characteristics for an entity from first principles. 
4.4 Illustrative Comparisons 
Currently, most of our attention is focused on 
the third category of comparisons, those we 
have termed illustrative comparisons. These 
are cases where one or more attributes of an 
entity being described are compared to those 
of a common object with which the reader is 
assumed to be familiar. For the present dis- 
cussion, we will concentrate on the attributes 
of size and weight, and the mechanisms used 
to produce illustrative comparisons that in- 
dicate these attributes of the entity being de- 
scribed. These are probably two of the easiest 
properties to deal with; it remains to be seen 
to what extent the mechanisms we propose 
will generalise to other attributes. 
For illustrative comparisons, there are two 
questions to be answered: 
• How do we decide whether an illustrative 
comparator should be introduced? 
• How do we decide which comparator to 
choose when there are multiple candi- 
dates? 
We could perform these comparisons using a 
similar approach to that which we adopted 
for clarificatory comparisons: for each entity- 
attribute pair we could specify some entity 
that can be used as a comparator. Thus, we 
might have clauses in the knowledge base that 
look like the following: 
(hasprop baboon 
(illustrative-comparator size dog)) 
However, this would be unwieldy: part of the 
justification for taking this approach in the 
case of clarificatory comparisons is that we 
would expect a relatively small subset of the 
entities in the knowledge base to have poten- 
tial confusors, and so the cost of explicitly 
encoding a representation of these potential 
confusors is not too great. However, virtually 
any entity-attribute pair might be described 
using an illustrative comparison, and so we 
need some way of generalising the processing 
here. 
We do this by making use of the notion of 
a COMMON COMPARATOR SET. This is a set 
of entity types that can be compared against 
for illustrative purposes. For the moment, a 
common comparator set is defined for each at- 
tribute we might wish to describe; there may 
be some scope for interesting generalisations 
later. We focus here on the size and weight at- 
tributes: for both of these, our common com- 
parator set is the set 
(human, dog, cat) 
Note that the common comparator set for any
given attribute is:
• domain specific: different comparator
sets for size and weight will be appropri-
ate in different domains;
• user specific: it is likely that different
comparator sets will be appropriate for
different users; and
• in principle extensible, both directly and
indirectly: we can imagine the user ex-
plicitly being allowed to specify a set of
comparator objects, or we could dynam-
ically extend the set used on the basis of
the ongoing discourse history.
There may be ways of building or precom- 
piling a common comparator set automati- 
cally using the knowledge base and informa- 
tion from a user model, but for the moment 
we assume that it has been preconstructed. 
Given an entity e we want to describe and 
some attribute a of the entity we want to com- 
municate, we use the algorithm in Figure 3.

To describe attribute a of entity e (the focused
entity):
• Identify the comparator set Sa for attribute a
• Let Val = the median value of a for e
• For each ei in Sa, identify the location of Val
on the range of values that ei has for a
• Choose the best match:
  - choose the ei whose median value for a is
    closest to Val
  - if this does not select uniquely from
    amongst the comparator set, then
    choose the ei whose range for a is closest
    to Val

Figure 3: Choosing a comparator object

The procedure used here for finding the
best match is one that in our current exper-
iments looks acceptable, although it is likely
to be applicable only for a relatively narrow
range of attributes. There are a number of
obvious deficiencies, all of which we are cur-
rently exploring:
• Properties are not independent: for ex-
ample, we have found that, when deal-
ing with size, we also need to take ac- 
count of similarity of body-form in deter- 
mining which entity makes the best com- 
parator, and so our current mechanism 
distinguishes three different size mea- 
surements: height, length and shoulder- 
height. 
• Similarity and difference are not com-
pletely distinct: the similarity of two val- 
ues for a particular attribute should be 
viewed as a scale of similarity rather than 
as a binary distinction. 
• The user's degree of familiarity with the 
potential comparators can help in mak- 
ing a choice. 
• The degree of relatedness between the 
two entities can also play a role in choos- 
ing the best comparator. 
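The comparator-selection procedure of Figure 3 might be implemented as follows (a sketch; the comparator entities come from the common comparator set given earlier, but the value ranges, in centimetres, are invented for illustration):

```python
from statistics import median

# Common comparator set for a size attribute such as length,
# with an (invented) range of values in cm for each comparator.
comparator_ranges = {
    "human": (150, 190),
    "dog":   (30, 110),
    "cat":   (40, 55),
}

def choose_comparator(val, ranges):
    """Choose the comparator whose median value is closest to val;
    break ties by the distance from val to the comparator's range."""
    def range_distance(lo, hi):
        return 0 if lo <= val <= hi else min(abs(val - lo), abs(val - hi))

    def key(name):
        lo, hi = ranges[name]
        return (abs(median([lo, hi]) - val), range_distance(lo, hi))

    return min(ranges, key=key)
```

For a focused entity of about 50 cm, such as the platypus, the procedure selects the cat, matching the generated sentence below.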
So far, however, the results of the simple 
method we have outlined seem promising. For 
example, PEBA-II currently generates the fol- 
lowing sentences: 
• The platypus is about the same length as 
a domestic cat. 
• The baboon has about the same shoulder 
height as a domestic dog. 
Note that the use of a common compara- 
tor set in conjunction with the algorithm 
specified here means that we can separate 
the domain-specific aspects of the computa- 
tion from the domain-independent aspects; in 
principle, the aim is that the comparator set 
specifies domain-specific information, but the 
algorithm itself is domain independent. 
As always, our methodology is to pur- 
sue solutions that first assume a consider- 
able amount of precompiled knowledge and 
then introduce generalisability and flexibility 
through subsequent parameterisation, rather 
than beginning with a very limited coverage 
solution that works from first principles. It 
is our view that this methodology is the only 
one that is likely to be successful for broad 
coverage, practical NLG systems. 
5 Conclusions and Future Work
In this paper, we have:
• described the PEBA-II system, as an ex-
ample of a system which integrates nat-
ural language generation and hypertext
in the provision of user-tailored informa-
tion;
• defined some notions relevant to the
study of comparison; and
• looked at the concept of illustrative com-
parison in detail, with the aim of defining
a mechanism for generating such compar-
isons that embodies a clear distinction
between domain-dependent and domain-
independent information.
For future work, we intend to elaborate upon 
and extend further the techniques described 
here. In particular, we intend to make use 
of these notions in the generation of compar- 
isons which take account of the discourse his- 
tory; some examples of this phenomenon are 
discussed in [Dale and Milosavljevic 1996].
Acknowledgements 
We would like to thank Mike Johnson of Mac- 
quarie University and the members of the 
Language Technology Group at Microsoft Re- 
search Institute for many discussions related 
to this work. 

References 
J. D. Becker [1979] The Phrasal Lexicon. In Proceedings of the Conference on Theoretical Issues in Natural Language Processing, Cambridge, MA, pp. 70-77.
Robert Dale and Maria Milosavljevic [1996] Authoring on Demand: Natural Language Generation in Hypertext Documents. In Proceedings of the First Australian Document Computing Conference, Melbourne, Australia.
Michael Elhadad [1992] Using Argumentation to Control Lexical Choice: A Functional Unification Implementation. PhD Thesis, Columbia University.
Groliers [1992] The New Grolier Multimedia Encyclopaedia. Copyright Grolier Incorporated, (c) 1987-1992 Online Computer Systems, Inc.
Alistair Knott, Chris Mellish, Jon Oberlander and Mick O'Donnell [1996] Sources of Flexibility in Dynamic Hypertext Generation. In this volume.
Kathleen McCoy [1988] Reasoning on a Highlighted User Model to Respond to Misconceptions. Computational Linguistics, 14(3):52-63.
Kathleen McKeown [1985] Text Generation: Using Discourse Strategies and Focus Constraints to Generate Natural Language Text. Cambridge University Press.
Mark T. Maybury [1995] Using Similarity Metrics to Determine Content for Explanation Generation. Expert Systems with Applications, 8(4):513-525.
Microsoft [1995] Microsoft (R) Encarta '95 Encyclopaedia. Copyright (c) 1994 Microsoft Corporation. Copyright (c) 1994 Funk and Wagnalls Corporation.
Maria Milosavljevic [1996] Introducing New Concepts Via Comparison: A New Look at User Modeling in Text Generation. In Proceedings of the Fifth International Conference on User Modelling, Doctoral Consortium, pp. 228-230.
Maria Milosavljevic, Adrian Tulloch and Robert Dale [1996] Text Generation in a Dynamic Hypertext Environment. In Proceedings of the Nineteenth Australasian Computer Science Conference.
Johanna Moore [1989] A Reactive Approach to Explanation in Expert and Advice-Giving Systems. PhD Thesis, UCLA.
Johanna Moore [1995] Participating in Explanatory Dialogues. MIT Press, Cambridge, MA.
Cecile Paris [1993] User Modelling in Text Generation. Pinter Publishers.
Ehud Reiter and Robert Dale [1992] A Fast Algorithm for the Generation of Referring Expressions. In Proceedings of Coling-92, Nantes, France, August 1992.
Ehud Reiter, Chris Mellish and John Levine [1992] Automatic Generation of On-line Documentation in the IDAS Project. In Proceedings of the Third Conference on Applied Natural Language Processing, Trento, Italy, March-April 1992.
Ehud Reiter, Chris Mellish and John Levine [1995] Automatic Generation of Technical Documentation. Applied Artificial Intelligence, 9(2):259-287.
Adrian Tulloch and Robert Dale [1995] Speeding up Linguistic Realisation by Caching Previous Results. In Poster Proceedings of the Eighth Australian Joint Artificial Intelligence Conference, Canberra, November 1995.
Ingrid Zukerman and Richard McConachy [1993] Generating Concise Discourse that Addresses a User's Inferences. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI'93), pp. 1202-1207. Morgan Kaufmann Publishers.
Ingrid Zukerman and Richard McConachy [1995] WISHFUL: A Discourse Planning System that Considers a User's Inferences. Technical Report, Department of Computer Science, Monash University, Australia.
