A Computational Model of Incremental 
Utterance Production in Task-Oriented Dialogues 
Kohji Dohsaka 
NTT Basic Research Laboratories 
3-1 Morinosato-wakamiya, 
Atsugi, Kanagawa, 243-01 Japan 
dohsaka@at om. brl. ntt. j p 
Akira Shimazu 
NTT Basic Research Laboratories 
3-1 Morinosato-wakamiya, 
Atsugi, Kanagawa, 243-01 Japan 
shimazu@at om. brl. ntt. jp 
Abstract 
This paper presents a comtmtational 
model of incremental utterance produc- 
tion in task-oriented dialogues. This 
model incrementally produces utter- 
antes to propose the solution of a given 
problem, while simultaneously solving 
the problem in a stepwise manner. Even 
when the solution has been partially de- 
termined, this model starts utterances 
to satisfy time constraints where pauses 
in mid-utterance must not exceed a cer- 
tain length. The results of an analysis 
of discourse structure in a dialogue cor- 
pus are presented and the fine structure 
of discourse that contributes to the in- 
cremental strategy of utterance produc- 
tion is described. This model utilizes 
such a discourse structure to incremen- 
tally produce utterances constituting a 
discourse. Pragmatic constraints are ex- 
ploited to guarantee the relevance of dis- 
courses, which are evaluated by an utter- 
ance simulation experiment. 
1 Introduction 
Dialogues occur in real-time and so are suscep- 
tible to time constraints. For example, dialogue 
participants must produce utterances under time 
constraints where pauses in mid-utterance must 
not exceed a certain length. Moreover, partici- 
pants are inference-limited (Walker and Rainbow 
1994). Due to time constraints and limits in ilffer- 
ence, dialogue participants cannot help producing 
utterances incrementally. Incremental utterance 
production is characterized like this: speakers pro- 
duce utterances while deliberating what to say~ 
and refine what they will say while articulating 
the first part of their utterances. 
The incremental strategy of utterance produc- 
tion plays a crucial role in spoken dialogues in two 
respects. First, it helps speakers to satisfy time 
constraints on pauses. This is crucial since lengthy 
pauses imply the transition of a turn from the cur- 
rent speaker to others. Second, it helps hearers to 
easily understand utterances since it enables the 
piecemeal transmission of information. 
This paper presents a computational model of 
incremental utterance production in task-oriented 
dialogues. This model produces utterances to pro- 
pose the solution of a given problem while simulta- 
neously solving the problem in a stepwise manner. 
To satisfy time constraints on pauses, this model 
starts utterances even when the solution has not 
been fully determined and refines on the solution 
during the articulation of utterances. 
We present the results of an analysis of dis- 
course structure in a corpus of Japanese task- 
oriented dialogues and show that the fine struc- 
ture of discourse prevails in spoken dialogues and 
the predominant discourse structure contributes 
to the incrementM strategy. Based on such a dis- 
course structure, this model incrementally pro- 
duces utterances constituting a discourse. How- 
ever, the incremental strategy is subject to gen- 
erating irrelevant discourses. To guarantee the 
relevance of discourses, this model utilizes prag- 
matic constraints and a context model, which are 
evaluated in an utterance simulation experiment. 
2 Related Research 
Recent studies of human speech production (Lev- 
elt 1989) show that human speakers frequetatly 
use the incremental strategy of utterance produc- 
tion. This paper is concerned with a computa- 
tional model of incremental utterance production. 
Computational models for the incremental 
syntactic formulation of a sentence were pro- 
posed (Kempen and HoenKamp 1987; De Smedt 
and Kempen 1991). Although incremental syntac- 
tic formulation is an important issue, we do not 
address this here. 
POPEL is a parallel and incremental nat- 
ural language generation system (Finkler and 
Schauder 1992; Reithinger 1992). In POPEL, the 
"what to say" component determines the content 
to be generated and gradually carries it over to 
the "how to say" component, which formulates a 
sentence incrementally. POPEL can generate dis- 
304 
(1.) aiko-ishida made desune / (2) itte/ <hai> 
PN to COl'ULA go 
(to The iiko-ishida station) (go) 
(3) sokode basu nandesuga / ~to 
then bus COI'ULA FIId,I,;I( 
(then b.s) (ullm) 
(d) morinosato-aoyama-ikitoiu <hal> basu ga 
PN nametI btls SuII.I 
am-node / 
exist-CAUSAl, 
(as there is a Ires itamed morinosato-aoyama-iki) 
(5) sore ni notte-moratte / <hal> 
it OBJ gc't on 
(get ()it it) 
(Note: <hal> shows that the dialogue partner in- 
serts an utterance to provide acknowledgnmnt.) 
Figure 1: Part of transcription of dialogue 
courses using eontextuM information, tlowever, 
it d(les not allow for the line structure ()f dis- 
course prevailing in st)oken diMogues. We l)resent 
a eomlmtational model of incremental utterance 
l)roduetion using the line structure of discourse. 
Carletta, Caley, and Isard (1993) proi)osed an 
architecture for time-constrained la.nguage pro- 
duction. As for phenomena peculiar to st)oken di- 
alogues, they focused on tlesil;¢Ltion an(l self-tel)air. 
Although our model ca,n 1)roduce filler terms and 
repair prior utterances, our chief concern is the 
tine structure of spoken discourse, which is closely 
related to incremental utterance pro(tnction. 
3 Discourse Structure Analysis 
We analyzed the discourse structure in a corpus 
of task-oriented diah)gues, which were collected by 
the folh)wing method. The subjects were ninety 
native Japaimse. In each diah)gue, two subjects, 
N and E, were ~Lsked to converse by telephone 
to lind a solution to the l)roblem of how N could 
get from one place to another. Subjects were cho- 
sen such that E had enough knowledge to solve 
the problem but N did not. Eigilty dialogues 
were recorded and transcribed. Fifteen dialogues 
were randomly chosen for analysis. The discourse 
structure was analyzed in terms of information 
units and discourse relations. 
a.1 Analyzing information units 
Speakers organize tile information to t)e conveyed 
to information units (Halliday 1994), which are 
the units for traitsmission of information. The in- 
formation units (IKs for short) are regarded as 
minilnal components of discourse structure. We 
assume that IUs a,re realized by grammatical lie- 
vices: a clause realizes an 1 U, an inteljectory word 
realizes an 1U, and a tiller term shows the end of 
an IU. Figure 1 shows pa.rt of the transcription of 
a dialogue where a diahlgue participant prol)oses 
a domMn l)lan. Tile symbol "/" separates the IUs. 
Table 1: Frequency distribution for information units 
Clause 929 
Interjectory word 665 
PP or NP 279 
Conjun(:tion 84 
Sequence of PPs or NPs 41 
Others 14 
Total 2012 
Tal)le 1 shows the frequency distribution for tim 
/~ramnlatiea\] (:ategoric's of IU, where NP and I'P 
mean noun 1)hrase and 1)ostl)ositional phrase. '.\['he 
average nmnber of NPs in an IU as a clause, NP, 
Pl', or sequen(-e of NPs and Pl?s is 1.01 in the 
tifteen dialogues. The vm'ianee is 0.28. 
This reslflt indicates that small IUs are fre- 
quently used. For example, althougil IU (1) in 
Figure 1 descril)es only a part of a domain ac- 
tion, it is regarded as ail IU siil.ce it has a e(/pula 
("desu") an(1 a sentenee-linal t)article ("ne"). 
3.2 Analyzing discourse relations 
Discourse relations between adjacent discourse 
segments w(,.re examined. A (liscourse segment 
is an IU or a sequence of IUs. For discourse re- 
ta.tions, we here adot)ted those used in Rhetori- 
(:al Structure Theory (M~nn and Thomt)son 1988) 
and tlere followed Hovy (1993) to classify them 
into semantic and interpersonal ones. Figure 2 
shows discourse relations tllat appear in the dis- 
course displayed in Figure 1. The small IUs are hi- 
erarchically related. This results ill the fine struc- 
ture of diseom:se. 
Table 2 shows tile frequency distributions for 
discourse relations in tile fifteen diak)gues. Let us 
consider the role that tile l)redominant relations, 
Elaboration, Circumstance, and Motivation, l)lay in 
tile inereinental strategy of utterance t)roduction.* 
First, Elaboration is exploited to describe domain 
actions, states or objects in a piecemeal fashion. 
Elaboration enables speakers to distribute the con- 
tent to be conveyed among different lUs. '_Ptfis re- 
lation is useful for the incremental strategy since 
it allows speakers to begin uttering even when the 
content has not been fully determined. 
Second, Circumstance is the relation between 
two segments, a nucleus and a satellite. The m> 
eleus describes a domain action or state. The 
satellite describes the circumstances where the m> 
cleus is interpreted, such as the preconditions of 
a domain action. There are 41 cases where the 
satellite describes a precondition of a domain ac- 
tion, which amounts to 68% of all cases. The con- 
stituents of a domain action are often referred to 
in its preconditions. We see ~ typical ease in tile 
relation l)etween (4) and (5) in Figure 1. (5) de- 
scribes the action of getting on a bus and (4) de- 
1We found no direct reh~tionship between Sequence 
and the increme.nt~d strategy. 
305 
Sequence / 
Circumstance, Motivation / 
Elaboration Elaboration 
(1) (2) (3) (4) (5) 
Figure 2: Discourse relations in Figure 1 
Table 2: Distribution for discourse relations 
Elaboration 305 Sequence 
74 Cirucumstance 60 
Result 25 
Condition 25 
Purpose 2 
Contrast 1 
(a) Semantic relations 
Motivation \[i l Background 
Evidence Interpretation 
Concession Enablement 
(b) Interpersonal relations 
scribes the existentional status of the bus ms the 
precondition of the action. By utilizing this rela- 
tion, speakers can distribute the content of a do- 
main action between two IUs. They can pick up 
a constituent of an action and describe it before 
describing the whole content of the action. Thus 
Circumstance is useful for the incremental strategy. 
Finally, Motivation is mainly used for describing 
a domain action as a nucleus and motivating ad- 
dressees to adopt the action by presenting a fact 
as a satellite. In typical cases, speakers motivate 
addressees to adopt an action by asserting that 
its precondition is satisfied. In such cases, Moti- 
vation occurs together with Circumstance and con- 
tributes to the incrementa.1 strategy in the same 
way as Circumstance. 
4 The Model 
As shown in Figure 3, this model is composed of 
five modules: a problem solver, an utterance plan- 
ner, an utterance controller, a text-to-speech con- 
verter, and a pause monitor. The problem solver 
makes domain plans that solve a given problem. 
The utterance planner makes utterance plans to 
propose domain plans. Pragmatic constraints and 
a context model are used to generate relevant dis- 
courses. According to utterance plans, the ut- 
terance controller sends linguistic expressions to 
tile text-to-speech converter. The pause monitor 
watches the length of pauses and signals the utter- 
ance planner and controller when the pause length 
exceeds a given length. 
These modules work in parallel. Both domain 
plans and utterance plans are made in a stepwise 
manner using the hierarchical planning mecha- 
nism (Russel and Norvig 1995: Chap.12). This 
model starts to make an utterance plan before a 
fully determined domain plan has been obtained. 
When a pause exceeds the time limit, the utter- 
ance planner sends the utterance controller an ut- 
Input: a domain problem Parallel Modules / 
\[ ProlflcnlSolvcr \] \[~f Pragmatic ~'~ 
k,, ConstraintsJ domain plans ~ ../"\] 
\[- Utterance Planner \]i Context Model utterance plans 
Utterance Controller 
expressions 
Output: utterances 
Figure 3: Model overview 
terence plan obtained within the time limit. A 
domain plan is refined during the planning and ar- 
ticulaton of utterances. Based on a refined domain 
plan, the utterance plan is replanned. When the 
utterance controller is not given utterance plans 
within the time limit, it produces a filler term. 
5 Pragmatic Constraints 
Pragmatic constraints are required to guarantee 
the relewmce of discourses. This model exploits 
the following pragmatic constraints. 
(cl) Avoid conveying redundant information. 
((:2) Pronominalize objects in the focus of atten- 
tion (Grosz and Sidner 1986). 
(c3) Be relevant according to the attentionM state. 
?\['he context model records the information that 
has been conveyed and tracks the attentional 
state. For example, consider the domain action 
of moving from one location 11 to another 12. To 
describe such a domain action with verbs such as 
"iku(go)", It must be in focus. Otherwise, the de- 
scription is irrelevant. After such an action has 
been described, 12 is in the focus. Moreover, any 
object marked as a topic becomes a focused one. 
6 Problem Solving 
We outline the problem solver using a sample 
problem of how to move from the Musashino Cen- 
ter to the Atsugi Center on the map in Figure 4. 
Tile problem solver first makes an abstract do- 
main plan, which is a sequence of three actions el, 
a2, and a3 : moving from the Musashino Center 
to the nearest station by bus, moving to the sta- 
tion nearest the Atsugi Center, and then moving 
to the Atsugi Center by bus. This plan is written 
as (rl). The contents of these actions are written 
as (r2). Expression cont(X, Y) means that the 
content of X is represented as a set Yof literals. 
(rl) plan(\[el, a2, a3\]) 
(r2) cont(al, {type(el, move), source(el, xl), 
manner(el, x2), dest(al, x3)}) 
cont(xl, {type(M, building), 
named(xl, "musashino senta~")}) 
eont(x2, {type(x2, bus)}) 
306 
Musashino Center 
Mitaka KichijojiW""~ ...~hinjuku 
j.~-Shimokitazawa /x  "Zo0a  uL4 
Atsugi Center 
Figure 4: Sample map 
cont(x3, {type(x3, station), nearest(x3, xl)}) 
cont(a2, {type(a2, move), source(a2, x3), 
dest(a2, x4)}) 
cont(aa, {type(a3, inove), source(a3, x4), m  n.er(a3, xS), de t(a3, x6)}) 
cont(x4, {type(x4, station), nemest(x4, x6)}) 
cont(x5, {type(x5, bus)}) 
cont(x6, {tytte(x6, building), 
named(x6, "atsugi sentaa") }) 
The problem solver tries to make a more con- 
crete plan. When more tha,n one domain 1)lan is 
possible, it chooses tile domain i)lan that requires 
the shortest execution time. In this domain, the 
domain plan is a sequence of actions a/t, a5, a6 and 
aT: moving from the Musashino Center to Kichi- 
joji station by bus, moving to Shimokitazawa sta- 
tion by tile Inokashira IAne, moving to Aiko-ishida 
station by the Odakyu Line, ~md then moving to 
the Atsugi Center by bus. Part of the content of 
this plan is represented as follows. 
(r3) phm(\[a4, a5, a6, a71) 
(r4) cont(a4, {type(M, move), source(a4, xl), 
manner(a4, x2), dest(a4, x7)}) 
cont(xT, {type(xT, station), 
named(x7, "kichijoji" )}) ...... 
7 Utterance Planning 
An utterance plan is a sequence of colnmnnieative 
actions that achieves a communicative goal. It is 
refined in a stepwise manner. A sequence of sur- 
face communicative actions corresponding to the 
uttering of linguistic ext)ressions is finally planned. 
7.1 Communicative goals 
Generation systems engaging in dialogues must 
record communicative goals related to commu- 
nicative actions (Moore and Paris 1994). Com- 
municative goals used here are: 
• persuaded-plan(P): dialogue partner is per- 
shaded to adopt dommn plan P. 
• persuaded-act(A): dialogue partner is per- 
suaded to adopt domain action A. 
• described-event(E, C, At): domain event E is 
described as an event having content C an(t 
attitude At toward E is also described. 
• dc.scribed-obj(O, 6): domain object O is de- 
scribed ~s an object having content C. 
• dcscribcd-them.a-rel(l?~, O, E): thematic rela- 
tion It is described, which domain object O 
bears to domain event E. 
When the domain t)lan (rl) is obtained, (r5) is 
given as the initiM communicative goal. 
(1"5) persuadeA-plan(\[al, a2, a3\]) 
7.2 Surface coinmunicative actions 
Sllrfa(;e commnnicativ(, actions used here are: 
• sv.rfacc-desc-cvent(E, C, At): utter expres- 
sions tO descrit)e, domain event E iLq all event 
having content C and des(-ribe attitude At to- 
ward E. 
• surface-desc-obj(O, C, It): utter expressions to 
describe doinain object O as an object having 
content C and bearing thenmtic relation R to 
a certain event. 
7.3 Planning utterances based on tile 
fine structure of discourse 
An utterance pbm is elaborated using action 
schemata and decomposition methods. An action 
schema consists of an action description, appli- 
cability constraints and an effe(:t. 2 It defines a 
communicative action. A decomposition inetho(l 
consists of an action description, applieal)ility con- 
straints and a plan. It specifies how an action is 
decolnposed to a detailed phm. 
The following schema (r6) defines the commu- 
nic;ttive action of proI)osing a domain plan by us- 
ing Sequence. The decomposition method (rT) 
specifies how the ~mtion is decomposed to a se- 
quence of finer actions. :~ 
(1"6) Aet(propose-acts-in-seq( * P), 
Constr: plan( . P), 
Effect: persuaded-plan( * P) ) 
(r7) Decomp(propose-acts-in-seq(\[*Act l *Rest\]), 
Constr: *Rest ¢ \[\], 
Plan: \[aehieve(persuaded-aet( * Act ) ), 
propose-acts-in-scq( * Rest ) \] ) 
In these representations, achieve(P) designates 
an action that achieves goal P. Notation \[H I L\] 
specifies a list, where H is the head of the list and 
L is the rest. Symbols starting with "*" represent 
variables. By applying (r6) and (r7) to the initial 
communicative goal (rS), the following utterance 
plan is obtained: 
(r8) achieve(pers,laded-act (al)), 
achieve(persuaded-act(a2)), 
achieve(persuaded-act(a3)). 
2In this paper, we do not consider a precondition 
for an action schema. 
aWe have omitted other method to avoid intinite 
reeursive application of the method (r7). 
307 
(r9) Act(propose-act@A), Effect: persuadcd-act( * A ) ) 
(rl0) Decomp(proposc-act( * A ), Constr: cont( , A, *C), Plan: achicvc( dcscribcd-cvcnt( , A, ,C, proposal)) 
(rl 1)Act(describc-cvcnt-by-elaboration(,E, *C, *At), Effect: described-cvcnt( , E, *C, *At)) 
(r12) Decomp( describc-event-by-elaboration( , E, *C, *At), Constr: * Thema E *CA 
*Thcma =.. \[*R, *E, *0\] A *R ¢ type A cont(*O, ,ObjC) A *Rest = *C - {*Thema} 
plan: \[ chi  e( descr  cd-obj( ,O, *ObjC) ), *R, *0, *E) ), 
ach, ie e( d sc i ed-e e, t( , E, , Re  t, , At ) ) \] ) 
(r13) Act( describe-obj-with-thcma( *O, *C, *R, *E), 
Effect: dcscribcd-obj( *O, *C)A described-thema-rcl( ,R, *0, *E) ) 
(r14) Decoinp(dcscribe-obj-with-thcrna(,O, *C, *R, *E), Plan: surface-desc-obj(,O, *C, *R)) 
(r15) Act( dcscribc-cvcnt-type( ,E, *C, *At), Constr: *C = {type(*E, *T)}, 
Effect: describcd-cvcnt( , E, *C, *At)) 
(r16) De~comp( describc-event-type( , E, *C, *At), Plan: surface-desc-event( * E, *C, *At)) 
Figure 5: Action schemata and decomposition methods for proposing domain action 
(r8) is decomposed by applying the action 
schemata and decomposition methods shown in 
Figure 5. These schemata define communicative 
actions for proposing a domain action while elabo- 
rating the content of the action in a stepwise man- 
ner. They reflect the results of a discourse struc- 
ture analysis, which show that speakers tend to 
distribute the constituents of a domain action into 
different IUs by using EI,ABORATION. In (r12), 
notation F(X, 17,...) =.. \[F, X, Y,...\] is used for 
decomposing term F(X, Y,...) into relation F and 
arguments X, Y, .... 
When domain objects are linguistically realized 
by the surfaee-desc-obj in (r14), pragmatic con- 
straint (c2) is exploited to t)ronominalize focused 
objects. In addition, according to constraint (c3), 
the objects that are not in focus need to be topi- 
ealized if they must be in focus. 
By applying these schemata to the first action 
in (r8), the following utterance plan is obtained. 
Thematic relations are chosen in default order 
when (r12) is applied. 
(r17)surface-desc-obj(xl, {type(xl, building), 
named(x1, "mnsashino sentaa")}, source), 
surface-desc-obj(x2, {type(x2, bus)}, manner), 
surface-desc-obj(x3, {type(x3, station), nearest(xa, xl)}, dest), 
surface-desc-event(al, {type(a1, move)}, 
proposal). 
According to utterance plan (r17), this model 
can start the following utterances to satisfy the 
time constraints before Obtaining a concrete do- 
main plan such as (r3). 
(ul)musashino sentaa kara-wa desune/ 
PN from-Topic COPULA 
(from the Musashino Center) 
basu de/ mo~ori-no-eki made/ ikimasu/ 
bus by nearest station to go 
(by bus) (to the nearest station) (go) 
For brevity, we have omitted action schemata 
and decomposition methods for utterance plan- 
ning using MOTIVATION and CIRCUMSTANCE. 
7.4 Replanning utterance plans 
While planning and articulating utterances using 
an abstract domain plan, a more concrete domain 
plan is being made. When a more concrete do- 
main plan is obtained, an utterance plan is re- 
planned. For example, consider the case where 
a concrete domain plan, (r3), is obtained during 
the production of utterance (ul). The following 
utterance plan is replanned: 
(r18) surface-desc-obj(xl, {type(x1, building), 
named(M, "nmsashino sentaa")}, source), 
surface-desc-obj (x2,{type(x2, bus)},manner), 
surface-desc-obj(x7, {type(x7, station), 
named(x7, "kichijoji")}, dest), 
surface-desc-event (a4, {type(a4, move) }, 
proposal). 
We assume that plan (r18) is obtained when 
this model finishes uttering "moyori-no-eki made" 
in utterance (ul). Then (ul) is interrupted and 
utterances follow based on (r18). Consequently, 
the following utterances are produced: 
(u2)musashino sentaa kara-wa desune/ 
PN from-Toplc COPULA 
(from the Musashino Center) 
basu de/ moyori-no-eki made/ 
bus by nearest station to 
(by bus) (to the nearest station) 
kichijoji made desune / ikimasu / 
PN to COPULA go 
(to Kichijoji station) (go) 
In the above, the redundant information is not 
restated according to pragmatic constraint @1). 
Self-repair occurs: "moyori-no-eki made" is re- 
placed by "kichijoji made". 
8 Experiments 
This model has been implemented in Com- 
mon Lisp. A logical constraint unification sys- 
tern (Nakano 1991) is used in the planners. The 
domain planner includes 18 action schemata and 
16 decomposition methods. The utterance plan- 
308 
(el) Musashino sentaa kara-wa desune / 
PN front-Topic, COPULA 
(from the Musashino Center) 
(e2) ~to Kichijoji made / dete-kudasai / 
FILI,F,I~ PN to go-t)le~se 
(crm to Kichijoji station) (please go) 
@3) ~to desune sorekara inokashira-sen de 
FILLER then PN by 
(erm then by the Inokastfira Line) 
(e4) odakyu-sen hi/ norikaete / 
PN for change 
((:hange train for the Odakyu Line) 
basu de / moyori-no-eki made / 
bus by nearest station to 
(by Ires) (to the nearest station) 
desune / shimokitazawa made / 
COI'UI,A PN to 
(to Shimokitazawa station) 
aiko-ishida made / ikimasu / ..... 
PN to go 
(to Aiko-ishida station) (go) 
Figure 6: Discourse generated by implemented system 
ner includes 16 action schemata and 16 decom- 
position methods. We ewduated pragmatic con- 
straints in an utterance simulation experiment, 
where discourses generate.d with the constraints 
were contpared with those generated without 
them. A map including 120 h)cations such as sta- 
tion and 25 railroad lines w~s used. The pause 
length limit was ().5 see. 
When pragmatic constraints were used, this im- 
plemented systeIn generated relevant discourses. 
Figure 6 shows the discourse generated when the 
problem of inoving frolIl the Mus~ksliino Center to 
the Atsugi Center was given. Filler terms such as 
gto were produced to satisfy the time constraints. 
Pragmatic constraint (el) was used ill (e2), ~Uq ex- 
plained in section 7.4. Constraint (c2) was used 
to zero-proImminalize stations in the focus of at- 
tention. Constraint (e3) was used in ((:1) to top- 
icalize the Musashino Center. Topicalization was 
also used in other cases where the system must 
shift the focus of attention to the location al- 
ready described in the preceding discourse. Such 
cases happened when the system started utter- 
antes based on an abstract domain I)lan, took a 
long time to obtain a more concrete plan, and then 
elaborated on a route from a location that was 
not in focus based on the concrete plan. With- 
out prt~gmatic constraints, the system generated 
irrelevant and excessively redundant discourses. 
9 Conclusion 
We presented a computational model of producing 
utterances incrementally so as not to make exces- 
sively long pauses. We presented the results of an 
anMysis of discourse structure and showed that 
speakers frequently use small information units 
and exploit the fine structure of discourse that 
contributes to the increlnentM production strat- 
egy. This model can utilize such a discourse struc- 
ture to incrementally produce utterances accord- 
ing to pragmatic constraints. These were ewdu- 
ated by an utterance simulation experiment. 
References 
Carletta, Jean; Caley, Richard; and Isard, Stephen. 
itte / 
go 
(go) 
(1993). A system architecture for simulating time- 
constrained language production. Research Paper 
ItCRC/RP-43, University of Edinburgh. 
I)e Smedt, Koenraad, and Keinpen, Gerard. (1991). 
Segment grammar: A fornmlism for incremental 
sentence generation, in Natural Language Gen- 
eTnlion in ArtiJieial Intelligence and Computa- 
tional Linguistics. Edited by Cdcile L. Paris, 
William R. Swartout, and William C. Mann, 
Khlwer Academic Publishers, 329-349. 
Finkler, ~Volfgang, and Schauder, Ann. (1992). Effects 
of incrementM output on incrementM natural lan- 
guage generation, ht Proc. of 10th ECAI, 505-507. 
Grosz, Barbara J., and Sidner, Candace L. (1986). At- 
tention~ intentions~ and the structure of discourse. 
Computational Linguistics, 12, 175-204. 
lIalliday, M. A. K. (1994). An Introduction to l,'unc- 
lional Grammar. Fdward Arnold. 
\]lovy, F, duard II. (\] 993). Automated discourse gener- 
ation using discourse structure relations. Artificial 
Intelligence, 63, 34t-385. 
Kempen, Gerard, and Iloenkamp, Edward. (1987). An 
incremental procedural grammar for sentence for- 
mulation. Cognitive Science, 11, 201-258. 
Levelt, Willem J. M. (1989). Speaking: Prom Inten- 
tions to Articulation. The MIT Press. 
Mann, William C., and TholnI)son , Sandra A. (1988). 
Rhetorical structure theory: Towards a functional 
theory of text organizatiou. Text, 8(3), 243-281. 
Moore, Johanna D., and Paris, Cdcile L. (1994). Plan- 
ning text for advisory dialogues: capturing inten- 
tional aim rhetorical information. Computational 
Linguistics, 19(4), 651-694. 
Nakano, Mikio. (1991). Constraint projection: An ettl- 
cient treatment of disjunctive feature descriptions. 
In Proc. of 29th ACL, 307-314. 
Reithinger, Norbert. (1992). The performance of an 
incrementM generation for multi-modM dialog con- 
tributions. In Aspects of Automated Natural Lan- 
guage Generation. Edited by Robert Dale, Ell- 
uard tlovy, l)ietmar RSsner, and Oliviero Stock, 
Springer-Verlag, 263-276. 
Russel, Stuart, and Norvig, Peter. (1995). Artificial 
Intelligence: A Modern Approach. Prentice IIall. 
Walker, Marilyn A., and Rainbow, Owen. (1994). The 
role of cognitive modeling in achieving communica- 
tive intentions. In Proe. of 7th International Con- 
ference on Natural Language Generation. 
309 
