Generating Natural Language Text in a Dialog System 
Mare Koit Media Saluveer 
Department of Programming 
2 Juhan Liivi Street 
Tartu State University 
202400 Tartu Estonia USSR 
Artificial Intelligence Laboratory 
78 Tiigi Street 
Tartu State University 
202400 Tartu Estonia USSR 
A~stract 
The paper deals with generation of natu- 
ral language text in a dialog system. The 
approach is based on principles underlying 
the dialog system TARLUS under development 
at Tartu State University. The main problems 
concerned are the architecture of a dialog 
system and its knowledge base. Much atten- 
tion is devoted to problems which arise in 
answering the user queries - the problems of 
planning an answer, the non-linguistic and 
linguistic phases of generating an answer. 
1. Introduction 
Several problem domains can be named whe- 
re the task of automatic generation of natu- 
ral language texts, not sentences, has beco- 
me the"hot topic", e.g. machine translation, 
dialog between the user and the computer in 
a dialog systam~ etc. Given the two centraI ~ 
roblems of natural language generation, 
hat of what to s.ay and that of how to say 
it, we discuss shortly the main components 
oF s dialog system which should enable the 
system to solve the problem of how to decide 
what to say, and then proceed to a more de- 
tailed treatment of the second task. The pre- 
sent approach draws heavily from the dialog 
system TARLUS being developed at Tartu State 
University, Estonia, USSR. Several modules 
of the system have been working independent- 
ly for some time already, and experiments are 
under way in merging these modules into a 
unified dialog system. Though the surface 
generation is carried out in the Estonian 
language the authors hope that a number of 
ideas which have found their way into this 
system will be of more general interest. 
2. Main Components of the Dialog System 
The dialog system consists of the follow- 
ing modules: Linguistic Processor, Turns" 
Interpreter, Turns" Generator, Planner, Dia- 
log Monitor, 5oluar~ In addition to this the 
dialog system includes several knowledge ba- 
ses for long-term knowledge: goals" know- 
ledge base, problem domain knowledge base, 
linguistic knowledge base, dialog knowledge 
base, partner's knowledge base and self- 
knowledge base. To store short-term know~- 
edge the system contains a number of models: 
of activated goals of the system, of the text 
of preceding dialog turns, of the communica- 
tive situation, of the partner, of the sys- 
tem itself. 
Dialog partners always follow certain 
goals in their interaction. Yhe goals of the 
dialog system may be thought of as implicit 
questions the system is seeking answers to 
during the dialog. The long-term goals of 
the system are kept in goals'~ knowledge base 
where they have attached to them priority 
assessments. In the course of interaction 
these goals (or subgoals) may rise or lower. 
For every new goal the system must set a 
priority assessment. 
There can be three types of questions in 
the dialog system: user questions to the 
system, the questions of the system to it- 
self, and the questions of the system to 
its user. Every guestion in its turn may 
concern either the problem domain or the 
process of interaction. The central notion 
of the dialog is a "turn". In natural dialog 
both interlocutors generate their turns in 
certain order and thus we may represent a 
dialog as a sequence of interchanging turns t;, 
where t~, t~,L etc. ere the turns of the first 
~terlocuto~, b b ~nd t1~ t~, etc. are the turns 
of the second interlocutor. Every turn may 
consist-of one or several communicative 
steps, e.g. the turn refusal may consist of 
the communicative steps REFUSAL + MOTIVATION, 
where REFUSAL dominates MOTIVATION. 
The dialog system functions 8s follows. 
The Linguistic Processor carries out mor- 
phological and syntactic analysis of the 
input turn with the help of linguistic knowl- 
edge base. A later task of the Linguistic 
Processor is the generation of the surface 
answer from its semantic representation. 
The Turns~ Interpreter fulfills several 
tasks: 
i) it constructs the semantic representation 
of the turn with the help of problem domain 
knowledge base. For the interpretation of the 
folIowing turns it may be necessary to take 
into account the preceding turns of the~pa~t- 
hers. To this end the Turns" Interpreter 
simultaneously constructs the semnatic inter- 
pretation of the dialog text already 8na~i ? 
lysed. 
ii) the Turns" Intenpreter should recognize 
in a turn the corresponding communicative 
step(s). For this task it uses the dialog 
knowledge base. At the same time the Turns" 
Interpreter forms the model of communicative 
situation, by supplementing it with typical 
structures of recognized communicative steps 
and combining them into bigger units on the 
basis of recognized turns and turn cycies. 
iii) it establishes the activated goals of 
the dialog system during a dialog procee- 
ding from the turn under interpretation. The 
questions which the user poses to the system 
and the questions that the system formuIates 
576 
on the basis of recognized communicative 
steps with the he~p of "interest rules" are 
carried into the model of goals and supplied 
with priority assessments. 
When the Turns" Interpreter has finished 
its job the Planner scans the goals in the 
goaIs knowledge base and the model of acti- 
vated goa\]s. It a~so tries to find answers 
to the remaining questions by addressing 
the Solver when the question concerns the 
problem domain, or the Dialog Monitor when 
the problem is about communication, 
The Turns" Generator selects the type of 
answer turn on the basis of the set of ques- 
tions chosen by Planner and the communicati ~ 
ve steps which will form the prospective 
turn. For instance, the question may be con- 
veyed as communicative steps ANSWER, EX- 
PRESSION OF DOUBT, etc. The question that 
has not yet been answered may be conveyed 
as communicative steps QUESTIONj REQUEST, 
OROER, etc. Secondly, the Turns Generator 
constructs the semantic representation of 
the future turn end adds it to the model of 
text of preceding dialog turns. The genera- 
tion of an answer turn is finished by the 
Linguistic Processor which transforms seman- 
tic representation of a turn into a text. 
To organize cooperation between the dif- 
ferent modules is a very complicated process 
We share the opinion that it cannot proceed 
linearly but is rather organized as coope- 
ration between experts permanently exchan- 
ging information among themselves (Oim et 
el., 1984), 
3. Knowledge Used in Text Generation 
3.1. Dialog knowledge base 
Dialog knowledge base contains type 
structures of communicative steps and rules 
which the dialog system uses in interpreting 
the replies of its partner and generating 
its own turns. 
\[he main structural components of a com- 
municative step are: 
SETTING - facts describing the situation 
where the given communicative 
step takes place: preconditions 
which hold about the author of 
that step or, in the author's 
mind, about the partner as well as 
about the objective reality 
PLOT - contents/theme of the given commu- 
nicative step 
GOAL - communicative goal of the author 
of the step 
CONSEO - hhe outcome of the communicative 
step~ i.e. the changes in the com-. 
municatiue situation which take 
place as a result of that communi- 
cative step. 
For the cooperation with the dialog sys- 
tem to be natural the dialog knowledge base 
should also contain certain rules of commu- 
nication which the system should follow when 
carrying out the dialog. Among them the most 
important are the principle of cooperation I 
Grice 1975) and the principle of politeness 
Leech 1983). In dialog knowledge base these 
principles ere contained as fixed types of 
rul es. 
In the following we exemplify some of ' 
these rules together with some examples of 
using them in live dialogs. 
I. The general form of rules of behavior 
is as follows: 
IF ~situatlon~ THEN ~action~ 
For instance, 
IF interlocutors A and B have a common 
goal G and A thinks that there exists 
an obstacle on the way of achieving G 
THEN A has the right to demand from B the 
discussion of that obstacle and dis- 
covering of the possible ways of or@re 
coming it 
These rules, on the one hand, limit the 
activity of the author of a turn in oon~ 
tructino his turns and, on the other hand, 
help the addressee understand these turns 
by drawing implicatures. 
Implicetures are inferences drawn if two 
conditions are met. First, the turn of the 
partner violates a principle of communica- 
tion and, secondly, the communicative situ- 
ation does not contain any clues that it is 
done intentionally. Therefore the addressee 
starts making hypotheses, i.e. drawing infe- 
rences which help him construct such an in~ 
terpretation for the input turn which satis~ 
flee the principle of cooperation. If there 
are no counterarguments to this hypothesis 
the addressee supposes this to be the inten- 
ded meaning of the input turn. 
Drawing of implicatures in the dialog 
system proceeds according to special proce~ 
dures oased on rules of behavior. 
2= A special case of rules of behavior 
are ra metic inference rules: 
IF ~type of communicative step~ 
THEN ~default GOAL~> 
Where GOAL" is a goal inferred by default 
from the GOAL of the author of the turn. For 
instance, when A asks B how to reach a goal 
(e.g. How can I get to the railway station?) 
then his goal may be to achieve that result 
(i.e. to be st the station). 
3. Rules of interest have the form 
<interest source>~<question/problem> 
They determine from the type of a commu- 
nicative step the questions, or "interests", 
which the interlocuto~ must find answers to. 
They typically concern such problems, as 
= what does the author suppose about the 
addressee when asking a question or 
making a proposal 
- does the claim of the author hold 
- ere there any obstacles to the plan 
put forward in a communicative stept 
etc. 
As interestsources may function various 
communicative situations with wide differen- 
ce in their complexity. The questions they 
trigger are typically related to the struc- 
tural parts SETTING and GOAL of the commu- 
nlcative s~tep. 
Here are some examples of the rules (Am 
author, B= addressee): 
A: REFUSAL ~ B: Why A refused (What 
is the MOTIVE of REFUSAL)? 
577 
A: CLAIM P ~ B: Does P hold? 
4. Rules of logical inference use data 
within one communicative step ..... only. We have 
treated that type of rules in detail in 
(Oim, 5aluveer~ 1985) and w~ll therefore not 
discuss them here. 
5. Rules of turn ,compilation~ are used in 
constructing and interpreting a turn which 
consists of more then one communicative steps. 
For examplep a turn expressing a refusal may 
consist of only one communicative step 
REFUSAL, but more common are such combinati- 
ons as REFUSAL plus MOTIVE, only MOTIVE, 
REFUSAL plus ALTERNATIVES, etc.. The rules of 
turn compilatlon fix the possible combinati- 
ons of communicative steps and their possib- 
le sequence in a turn (the sequence of turns 
is important because the steps ere not simp- 
ly linearly ordered but there exist fixed 
subordlnation relations between the communi- 
cative steps within a turn). These rules ha- 
ve the general form 
type of turn ~ CI~ C2~ ..., C k 
(k~l),where CI,O.. , C k are types of communi- 
cative steps~ 
6. To rules of dialo~ coherence belong 
first and foremost rules which determine 
from the components SETTING and CONSEQ of a 
partner's turn the contents of the component 
SETTING in the other partner's turn. 
There are several subgroups within this 
general group: 
(i) default rules are used in such situa- 
tions in a dialog where the turn of a partner 
is "blank", i.e. when the partner does not 
answer to a remark. E°G. t when somebody asks 
"Don't you belive me?" then a silence from 
his partner is equal to a negatiue turn (the 
partner does not belive the author). 
(ii) rules determining cycles of turns in 
coherent dialog. It has been pointed out 
that as a minimal unit in interaction func- 
tions not a pair of turns but a triplette.g. 
A: INITIATION 
B: ~EACTION 
A: ACCEPTANCE OF REACTION 
A: Would you pass the sugar, please. 
B: Here you are. 
A: Thanks. 
Here are some other examples of such tri- 
plets in dialogs: 
A: QUESTION - B: ANSWER - A: THANKING 
A: QUESTION - B: SPECIFYING QUESTION - A: 
ANSWER TO SPECIFYING QUESTION - B: ANSWER 
A: THANKING - B: ACCEPTANCE OF THANKING 
A: Can I have a bottle of brandy? 
B: Are you twenty-one? 
A: No. 
B: No. 
The rules of these two groups may be best 
represented in the form of augmented transi- 
tion networks where types of communicative 
steps correspond to nodes and the steps 
which can follow one another in a dialog are 
connected with arcs (Me:zing, 1980). 
3.2. Linguistic knowledge base 
This base includes knowledge about mor- 
phology= syntax and to a certain degree of 
semantics of the language. 
The lexicon stores declarative knowledge 
of the language in the form of following 
entries: 
<primary form> <stem> ~type of stem> 
<semantic characteristics of word> 
Morphological rules should guarantee the 
morphological analysis end synthesis of the 
words used, i.e. a transition from the word 
form to its morphological representation 
(number I case, tense, person) in analysis 
and the reversed transition in generation. 
The output of syntactic analysis (and in- 
put to syntactic generation) is a tree of 
dependencies. 
In order to reduce the number of possible 
resulting dependency trees we may use in- 
stead of purely syntactic rules suntactico- 
semantic rules which combine syntactic and 
semantic features of a word: 
word 1 word 2 
IF morphological morphological 
information I information 2 + 
semantic semantic 
characteristics I characteristics 2 
THEN word I relati°n~ word 2 
Linguistic knowledge base is used mainly 
by the Linguistic Processor. During parsing 
the input to the Linguistic Processor is the 
user's utterance in natural language, the 
output is the syntactic representation of 
the turn in the form of dependency trees. In 
surface generation the input to the Linguis- 
tic Processor is the dependency tree(s) and 
the output is an answer turn in natural lan- 
guage. 
3.5. Problem domain knowledge base 
To this base belong definitions of all the 
objects and relations between them in that 
problem domain and ale5 the methods of solv- 
ing the problems the system deals with. The 
definitions of objects end their relations 
may be represented in the form of frames, 
the algorithms of solving problems as proce- 
dures with parametres. Some procedures may 
be fillers of frame slots. This knowledge ba- 
se is used by both the Turns Generator and 
Interpreter, as well as by the Planner (when 
solving problems which have cropped up du- 
ring the dialog). 
4. Answering the User: Text Generation 
4.1. Planning the answer 
In planning its answer the dialog system 
proceeds from its current activated goals. 
The Planner choses questions from the model 
of goals which then underlie the output 
turn. The choice is made according to the 
priorities of the questions, which may con- 
cern either the problem domain or interacti- 
on. Planning the answer turn is carried out 
578 
simultaneously with interpreting the user's 
input turn. In case of questions which are 
connected with the problem domain the Plann- 
er makes use of the Solver. The Solver tries 
to answe~ the questions put to the system by 
the use~snW~o¢?~y the system itself and 
marks in the model of goals these questions 
which it has succeeded in finding an answer 
to~ In order to answer questions about inter- 
action the Planner turns to the Dialog Moni- 
tor. Most questions about interaction belong 
to the domain "system questions to itself". 
Rare exception are ouestions of the type 
"How dare you speak to me like this?" The 
dialog system usually does not direct the 
questions about interaction to its partner 
except in cases when the partner's turn some- 
how concerns the dialog system as a "persona- 
lity" (in man-machine dialog this is yet an 
unimportant aspect of interaction). To find 
out such questions the dialog system uses 
its knowledge about dialog, as well as inte- 
rest rules and dialog coherence rules and 
its own and the partner's models. In inter- 
preting a turn the dialog system must also 
check whether the partner has stuck to aIl 
rules of communication. In the opposite ca- 
se a question appers in the column "system 
questions to the user" which the dialog sys- 
tem may ask about the violation of communi- 
cative rules. 
4.2. Non-lingulstic synthesis of the answer 
As a r0~sult of the work of the Planner in 
the model of goals of the dialog system the- 
re ere a number of questions from the system 
to its user from which the Turns'Generator 
must construct the semantic representation 
of the future turn. With the highest priori- 
ty are questions about the problem domain. 
The Turns'Generator determines 
i) the possible types of answer turns - 
enswer~ refusal, request etc° 
ii) the choice among the possible alterna- 
tives with the help of rules of behavior 
iii) the use of rules of turn compilation by 
deciding which types of communicative 
steps the turn of the dialog system must 
consist ofp end filling in concrete in- 
formation to the chosen typical structu- 
res of communicative steps. 
As a result of all these actions the sem- 
antic representation of the future turn Is 
formed. 
4.3. Linguistic synthesis of the answer 
The generation of surface text from its 
semantic representation takes place in the 
Linguistic Processor and can be divided into 
three stages: transformation of semantic re- 
presentation into syntactic (semantic synthe- 
sis), transformation of syntactic representa- 
tion Into morphological (syntactic synthesis) 
and transformation of morphological represen~ 
ration into the surface text (morphological 
synthesis). 
4.3.1. Semantic synthesis 
In the process of semantic synthesis it is 
necessary to ~slice~ the semantic represents- 
tion of the future text (text frame in case 
of TARLUS) into sentence representations~ : 
i.e. sentence frames. To achieve this the 
frames which belong to the semantic catego- 
ry of ACTION. must be separated from one en~ 
other according to their sequence in time. 
Every action frame is transformed into a de- 
pendency tree of a (simple) sentence. A num- 
ber of slots in the action frame containing 
irrelevant information from the point of 
view of the user are disposed of (e.g. slot 
SUP referring to the generic notion of a ca- 
tegory, procedural slots in frames, etc.). 
The remaining slot fillers serve as labels 
of the nodes of the dependency tree, while 
slot ~ames serve as labels for the arcs on 
the tree. The preliminary order of the nodee 
wilI be determined by the corresponding verb 
patterns for that action. A verb pattern 
will determine the order of verb and its at- 
tributes In an isolated sentence but not in 
actual text. Verb patterns depend upon tar- 
get language. 
The text frame is composed of either ter- 
minal or conceptual frames. The names of the 
former are words of the target (i.e. Esto- 
nian) language. The names of the latter are 
names of semantic categories, e.g. ACTION, 
ANIMAL, TRANSFER, etc. If the dependency 
tree node is labelled by a semantic catego- 
ry, a word of the target language must be 
substituted for it depending on the context. 
E.g. instead of the conceptual frame TRANS- 
FER the system has to choose one of the 
words from the list of verbs such as ~uy, 
borrow, rob, make a present, etc,, The pro- 
o~-~hoos~ng a correct lexical item 
for a semantic category node is a category- 
oriented pass through a binary tree each 
node of which presents a discrimination pro- 
cedure. The tree gradually limits the set of 
possible candidates until finally there will 
~e only one word lefto The choice of a word 
among nea~ synonyms is a means for achieving 
a greater coherence of the text (cf. lexe- 
mes like steel, pilfer, nab~ purloin, etc. 
for the semantic notion of stealing). When 
choosing among these near synonyms the Lin- 
guistic Processor should also take into acc- 
ount the model of the partner: the output 
text shoul not contain words which are un- 
known to him. 
4.3.2. Syntactic synthesis 
This stage can in its turn divided into 
two steps: first, transformations on depen- 
dency trees with the aim of achieving grea- 
ter coherence of the text and, secondly, 
suppying the lexemes with morphological in- 
formation. 
To achieve a greater smoothness of the 
output text it is necessary to perform some 
modifications on the dependency trees during 
this phase of generation: 
i) reordering of nodes 
The primary order of nodes in a dependen- 
cy tree is determined by the verb pattern 
which does not take into account the place 
of the sentence in the text. Therefore, it 
will be necessary sometimes to change the 
order of nodes: the nodes expressing the 
theme of the sentence will be placed higher 
579 
and those representing the theme will be 
placed lower. To accomplish this reordering 
of nodes,, a mechanism of three stacks is 
used, The first two stacks contain the labels 
of the two immediately preceding dependency 
trees~ In the third stack those labels which 
have occured in the previous two stacks are 
also placed lower. Experiments have shown 
that it is sufficient to take into account 
the word order of only two immediately prece- 
ding sentences. Even more - if the system 
"remembers" too much from the preceding in- 
formation, the smoothness of the text may 
get losto The use of thls, method allows us 
to get text (2) instead of text (1): 
(1) John took a book from John's briefcase. 
John gave Mary the book,~ 
John left John's briefcase, on the table. 
(2) John took a book from John's briefcase. 
The book he gave to Mary. 
John's briefcase John left on the table. 
ii) Use of pronouns 
A pronoun may be substituted for a lexeme 
corresponding to a node of a dependency tree 
according to special rules. The application 
of these rules gives us text (3) instead of 
text (2): 
(3) John took a book from his briefcase. 
The book he gave to Mary. 
His briefcase John left on the table. 
iii) Deletion of repeated phrases 
If there exist similar subtrees in two de- 
pendency trees then in the second tree the 
stem may be substituted for the subtree. The 
result is text (4)t 
(4) John took a book from his briefcase. 
The book he gave to Mary. 
The briefcase he left on the table. 
iv) Integration of two or more dependency 
trees into a coherent graph. 
One of the rules in this domain states: 
IF in several immediately following depen- 
dency trees one and the same lexeme 
fulfills the role of agent/patient 
THEN all these trees may be integrated into 
one coherent graph by removing from 
the second tree downward the nodes 
with identical label, and connecting 
arcs to the corresponding node of the 
first tree 
This rule helps us get text (5) instead 
of text (1): 
(5) John took a book from John's briefcasep 
gave Msry the book and left on 
the table John's briefcase. 
The use of these rules in different order 
results in different, output texts, 
To ascribe morphological information to 
lexemes syntactic rules are used which deter- 
mine from the syntactlco- semantic relations 
between two words the morphological charac- 
teristics of the words., and as a result we 
get the morphological representation of the 
text, 
580 
4.3.3. Morphological synthesis 
On the basis of morphological representa- 
tion with the help of primary forms of words 
and their morphological characteristics con- 
crete word forms are built. If it is possible 
to construct several parallel forms as~ for 
example, are short and long forms of the plu- 
ral nouns in Estonian, then the choice of one 
of them is an additional means for achieving 
fluency of the text. 
From the above mentioned facts it may be 
concluded that coherent text generation dif- 
fers in many respects from single sentence 
generation, and the regularities governing 
this process must be taken into account from 
the very start of the generation process. 
5. Conclusion 
Several modules are involved in genera- 
tion of natural language turns in a natural 
language dialog system. First, the Planner 
chooses among the currently activated goals 
of the dialog system those which the system 
would carry out in its reply to the user. 
Secondly, Turns'Generator constructs the 
semantic representation of the answer turn, 
i.e~ chooses the necessary communicative 
steps for carrying out the goals laid down 
by the Planner. It also fills in these steps 
with concrete data by using knowledge about 
the problrm domain and/or about the communi- 
cative process. The Linguistic Processor gi~ 
yes the finishing touches to the surface text. 
But before a turn is communicated to the 
user there should be a ~'check-up" unterpre- 
tatlon of it - the dialog system carries out 
the possible interpretations of th~s turn in 
its "mind" t using additional knowledge from 
its model of the partner and its knowledge 
base. The result is compared with the inten- 
ded meaning of the turn and in case of dis- 
crepansies the system should ~eturn to the 
module where the interpretations were still 
similar in order to try another possible way 
of genera*long the answer turn~ 
Of the modules listed above there is a 
complete version of the Linguistic Processor 
both for parsing and generation running on a 
Ryad 1060 computer in the language PL/I~ The 
Turns'Generate~p~is being tested at the pre- 
sent time. 
References 

Grice H.P. 1975.Logic and Congersation. - 
Syntax and Semantics, uol~ 3. Speech Acts. 
New Y~rk: Academic Press. 

Leech 0. 1983. Principles of pragmatlcs. 
Cambridge: Cambridge University Press. 

Metzing D. 1980. ATNs used as a procedural 
dialog model. Prec. COLING-80, Tokyo, 
487-491. 

Olm H,, Kolt M., Litvak S., Roosmaa T., Salu- 
veer M. 1984. Reasoning and discourse: 
experts as a link between high and low 
level inferences.- Papers on Artificial 
Intelligence, eel.VII, Tartu, I?@~90. 

Oim H., 5aluveer M. 1985° Frames in linguis- 
tic descriptions. - Quaderni di Semantica, 
vol. VI, N 2, 282-292. 
