Abstract: The aim of the presented research is the development
of a linguistic model of the functional concepts topic and
focus that can be used in natural language processing systems.
The paper deals with two points of investigation: the first point
concerns the identification of the topic and focus of an utterance.
Within the frame of the linguistic discussion on such concepts,
topic and focus will be considered as semantic, pragmatic and
intonational rather than as syntactic phenomena. An operational
definition of topic and focus is obtained on the basis of basic
semantic-pragmatic categories which are defined in relation to a
specified context. The second point concerns the integration of
the topic and focus identification rules in a system for natural
language generation. The aim of the application is the validation
of the developed topic and focus model with respect to some
aspects of the generation process like thematic progression and
accent mapping. Moreover the identification of topic and focus
can be used to make predictions about the thematic progression
and the accent mapping in the blocks world texts. For the
prediction of indefinite pronouns like "one" and of definite articles
within a noun phrase it is necessary to recur to the
semantic-pragmatic categories.
Vincenza PIGNATARO
Fak. f. Linguistik u. Literaturwiss.
Universität Bielefeld
Postfach 8640
D-4800 Bielefeld 1
The importance of contextual factors for the whole communication
process and for the subprocesses running in parallel, like the
distribution of information and the marking of topic and focus, is
the reason for attempting a definition of the "context" for this
restricted domain. This new approach is designed to replace the
traditional simple question criterion used in determining the topic
and focus of single sentences and to make a genuine
semantic-pragmatic definition of topic, salient topic and focus
possible. It extends certain ideas of the Prague School /cf.
Hajičová/Vrbová 1982, Hajičová/Sgall 1985, Sgall et al. 1986/.
The present generation model is developed as a simulation of
the generation of simplified German texts taken from blocks
world experiments, in which a speaker has to advise the hearer
on how to build a pyramid, a bridge and a [...]ade. The
underlying discourse model consists of the action sequence TAKE,
PUT and PROVE, which was found to be constant in the
produced conversations. The number of blocks involved in the
TAKE action determines the number of the following PUT
actions.
First a language L1 is defined, which allows a description of
the world of the experiments using statements like green(b1),
on(b1,b2), UHB(b1) meaning that b1 is an element of the
unordered set of the hearer's blocks. But for the communication
process the knowledge and experiences of the participants are as
important as the tangible things around them. Therefore a
language L2 is defined, which allows speaking about the
assumptions of the speaker about the hearer and his world. In
fact in the experiment the speaker and the hearer do not see each
other; they mainly rely on assumptions about their mutual
knowledge /Schiffer 1972/.
Statements of L2 are for example amk(a1), where amk is a property
symbol of L2 and a1 is the statement on(b1,b2). amk(on(b1,b2))
means that it is assumed to be mutual knowledge that the block
b1 is on the block b2. Other examples of statements of L2 are
id(b1), meaning that the block b1 has been identified in a TAKE
action, ueq(TAKE), meaning that TAKE can be interpreted
unequivocally, and apl(li), meaning that li is assumed to be a
potential position for the moved blocks.
For practical reasons we consider the context C to be a pair of
sets of statements in the language L2: <CO1, CO2>. CO1
contains statements about the world of the speaker and those
assumptions about mutual knowledge which remain unchanged during
an experiment. CO2 contains the speaker's assumptions about the
hearer's blocks, their actual and their potential positions.
1.2 Operationalisation of the categories assignment
The units of analysis are semantic representations of utterances
from the blocks world texts. To every element of the semantic
representation some semantic categories will be operationally
assigned. In this section a formal definition of the units of
analysis and of the operational rules is given. Every semantic
representation of an illocutionary plan is an ordered set IP =
<x1,...,xn>, where x1 is the verb and the remaining elements x2
to xn correspond to the elements of the case frame of the verb
x1. For every element x of IP there is an individual constant in
the language L2 referred to as x*. The assignment of
semantic-pragmatic categories to the elements of IP is a function which
maps every pair (x,C), where x ∈ IP and C is the context, onto
the semantic-pragmatic categories of x, representing the status of
x with respect to C.
The contextual labels are given (g), chosen (ch),
mentioned (m), mentioned in the previous sentence (mp) and
their negations ¬ch, ¬m, ¬mp. ¬g does not occur. These
symbols build the alphabet A := {ch, m, mp, ¬ch, ¬m, ¬mp}.
The operationalisation criteria are:
(1) If x ∈ IP, then:
  (i) if there is a property y of L2 such that y(x*) ∈ CO1 ∪
      CO2 and for every other object
      x': y(x'*) ∉ CO1 ∪ CO2, then g(x). This criterion
      applies e.g. in case there is only one x* for which the
      property hearer(x*) holds.
  (ii) If amk(x*) and ueq(x*) ∈ CO1 ∪ CO2, then g(x). This
      criterion applies e.g. for x = TAKE being an element of
      the action sequence <TAKE,PUT,PROVE>, which is
      considered to be assumed mutual knowledge and for
      the hearer unequivocally interpretable.
(2) If x ∈ IP, UHB(x*) ∈ CO2, then ch(x) and ¬m(x). This
    criterion applies e.g. to the elements of the unordered set
    of the hearer's blocks.
(3) If x ∈ IP, id(x*) ∈ CO2, then:
  (i) if x is the first object in a sequence of PUT actions,
      then ch(x), m(x), mp(x).
  (ii) If x is neither the first nor the last object in a
      sequence of PUT actions, then ch(x), m(x) and ¬mp(x).
  (iii) If x is the last object of the sequence of PUT actions,
      then ¬ch(x), m(x) and ¬mp(x).
  (iv) If x is the only object of the single PUT action, then
      ¬ch(x), m(x) and mp(x).
(4) If x ∈ IP and apl(x*) ∈ CO2, then ch(x) and ¬m(x). This
    criterion applies e.g. when the speaker assumes that there is
    a position on(bi), among others, that can be potentially
    occupied by the block being moved.
The labels ch, ¬ch mirror the steps of the problem solving while
the labels m, mp, ¬m, ¬mp directly refer to the dynamics of
the utterance production.
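The criteria (1) to (4) can be read as a lookup procedure over the two context sets. The following Python sketch is one possible reading under stated assumptions, not the implementation used in the research: statements are modelled as (predicate, constant) pairs, the names assign_categories, BOOKKEEPING, PUT_LABELS and put_position are hypothetical, the order in which the criteria are tried is an assumption, and the predicates tested by the other criteria are assumed not to count as identifying properties for criterion (1 i). For case (3 iv) the sketch uses the labels shown for the single PUT action in Example 2.

```python
# Predicates tested directly by criteria (1 ii), (2), (3) and (4). Excluding them
# from criterion (1 i) is an assumption of this sketch.
BOOKKEEPING = {"amk", "ueq", "UHB", "id", "apl"}

# Labels for criterion (3), keyed by the object's place in the PUT sequence.
PUT_LABELS = {
    "first": ("ch", "m", "mp"),     # (3 i)
    "middle": ("ch", "m", "¬mp"),   # (3 ii)
    "last": ("¬ch", "m", "¬mp"),    # (3 iii)
    "only": ("¬ch", "m", "mp"),     # (3 iv), labels as in Example 2
}


def assign_categories(x, co1, co2, put_position=None):
    """Return the contextual labels of element x with respect to C = <CO1, CO2>."""
    context = co1 | co2
    # (1 ii): x is assumed mutual knowledge and unequivocally interpretable.
    if ("amk", x) in context and ("ueq", x) in context:
        return ("g",)
    # (1 i): some property holds of x and of no other object.
    for pred, const in context:
        if const == x and pred not in BOOKKEEPING:
            if not any(p == pred and c != x for p, c in context):
                return ("g",)
    # (2): x belongs to the unordered set of the hearer's blocks.
    if ("UHB", x) in co2:
        return ("ch", "¬m")
    # (3): x was identified in a TAKE action; labels depend on its position.
    if ("id", x) in co2:
        return PUT_LABELS[put_position]
    # (4): x is an assumed potential position for the moved block.
    if ("apl", x) in co2:
        return ("ch", "¬m")
    return ()


# Context of a TAKE plan <TAKE, ADRESSEE, OBJECT>
co1 = {("amk", "TAKE"), ("ueq", "TAKE"), ("hearer", "ADRESSEE")}
co2 = {("UHB", "OBJECT")}
ip_prime = [assign_categories(x, co1, co2) for x in ("TAKE", "ADRESSEE", "OBJECT")]
```

Representing each statement as a pair keeps the membership tests of the criteria as plain set lookups; a fuller model would need nested statements such as amk(on(b1,b2)).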
1.3 Definition of Topic (t), Salient Topic (st) and Focus (f)
topic rules:
(5) If g(x) ∈ IP then t(x).
(6) If (¬ch, m, mp)(x) ∈ IP then t(x).
salient topic rules:
(7) If (ch, m, mp)(x) ∈ IP then st(x).
(8) If (ch, m, ¬mp)(x) ∈ IP then st(x).
(9) If (¬ch, m, ¬mp)(x) ∈ IP then st(x).
The rules (7) and (8) can be replaced by the equivalent rule
(7*) If (ch, m)(x) ∈ IP then st(x).
focus rules:
(10) If (ch, ¬m)(x) ∈ IP then f(x).
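Rules (5) to (10) amount to a small decision table from label combinations to t, st or f. The sketch below is a minimal illustration in Python, assuming labels are given as tuples of strings; the function name classify and the choice to return None for an unlisted combination are assumptions of the sketch.

```python
def classify(labels):
    """Map a tuple of contextual labels to 't', 'st' or 'f' following rules (5)-(10)."""
    s = frozenset(labels)
    if "g" in s:                      # rule (5)
        return "t"
    if {"¬ch", "m", "mp"} <= s:       # rule (6)
        return "t"
    if {"ch", "m"} <= s:              # rule (7*), covering rules (7) and (8)
        return "st"
    if {"¬ch", "m", "¬mp"} <= s:      # rule (9)
        return "st"
    if {"ch", "¬m"} <= s:             # rule (10)
        return "f"
    return None                       # combination not covered by the rules


# Labelled plan <g(TAKE), g(ADRESSEE), (ch, ¬m)(OBJECT)>
markings = [classify(l) for l in [("g",), ("g",), ("ch", "¬m")]]
```

Since "m" and "¬m" are distinct strings, the subset test for rule (7*) does not fire on (ch, ¬m), so the focus rule remains reachable.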
1.4 Examples 
For lack of space I will not give a detailed specification of the
context. In order to give an idea about the relation between the
single arguments of the representation of the illocutionary plans
and their contextual status, example 1 will be presented in the
following order: rule number, assigned category and contextual
information.
The arguments of the illocutionary plan
<TAKE,ADRESSEE,OBJECT> yield the following labels:
(1 ii) ---> g(TAKE)
       amk(TAKE*) ∈ CO1, ueq(TAKE*) ∈ CO1
(1 i)  ---> g(ADRESSEE)
       hearer(ADRESSEE*) ∈ CO1
(2)    ---> ((ch, ¬m)(OBJECT))
       UHB(OBJECT*) ∈ CO2
Therefore the new IP' is <g(TAKE), g(ADRESSEE),
((ch, ¬m)(OBJECT))> and the application of rule (5) to the first
and second argument and of rule (10) to the third argument of
IP' gives
IP'': <t(TAKE), t(ADRESSEE), f(OBJECT)>.
The surface structure of the illocutionary plan IP'' would be:
"du nimmst einen roten Klotz", meaning "(you) take a red
block". Bold print within the examples designates possible
occurrences of accents and underlining highlights the words responsible
for the cohesion of the surface form.
Example 2
Application of the rules to
IP = <PUT,ADRESSEE,OBJECT,GOAL> gives:
(1 ii) ---> g(PUT),
(1 i)  ---> g(ADRESSEE),
(3 iv) ---> ((¬ch,m,mp)(OBJECT)),
(4)    ---> ((ch,¬m)(GOAL)).
In this case the new illocutionary plan IP' is:
<g(PUT), g(ADRESSEE), ((¬ch,m,mp)(OBJECT)),
((ch,¬m)(GOAL))>.
The application of rule (5) to the first and second argument, of
rule (6) to the third and of rule (10) to the fourth argument of
IP' gives
IP'': <t(PUT), t(ADRESSEE), t(OBJECT), f(GOAL)>.
The surface structure would be "du stellst ihn auf den Tisch",
meaning "put it on the table".
Example 3 
In order to illustrate the application of the salient topic rules we
assume that the following utterance is made as a consequence of
an illocutionary TAKE plan: "du nimmst einen roten und einen
blauen Klotz", meaning "take a red and a blue block". Two
illocutionary PUT plans would follow:
IP1 = <PUT,ADRESSEE,OBJECT1,GOAL>,
IP2 = <PUT,ADRESSEE,OBJECT2,GOAL>.
For the first, second and fourth argument of the sets IP1 and IP2
the same conditions as in the above PUT example hold. For the
third argument the following rules apply:
(3 i)   ---> ((ch,m,mp)(OBJECT1)),
(3 iii) ---> ((¬ch,m,¬mp)(OBJECT2)).
The new illocutionary plans are therefore:
IP1' = <g(PUT), g(ADRESSEE), ((ch,m,mp)(OBJECT1)),
((ch,¬m)(GOAL))>,
IP2' = <g(PUT), g(ADRESSEE),
((¬ch,m,¬mp)(OBJECT2)), ((ch,¬m)(GOAL))>.
The application of rule (5) to the first and second arguments, of
rule (7) to the third argument in IP1', of rule (9) to the third
argument of IP2' and of rule (10) to the fourth arguments of IP1'
and IP2' yields:
IP1'' = <t(PUT), t(ADRESSEE), st(OBJECT1), f(GOAL)>,
IP2'' = <t(PUT), t(ADRESSEE), st(OBJECT2), f(GOAL)>.
The surface structure would be: "du stellst den roten auf den
grünen und den blauen auf den roten", meaning "put the red on
the green and the blue on the red".
If in an illocutionary TAKE plan the third argument consists of a
list of many objects, then for every object OBJ ch(OBJ) and
¬m(OBJ) holds. This can be abbreviated by the expression
(ch,¬m)* of the formal language over the alphabet A. For every
third argument of an illocutionary PUT plan the following holds:
for the first object (ch,m,mp), for the objects 2 to n-1
(ch,m,¬mp) and for the last object (¬ch,m,¬mp). This can be
abbreviated by the expression
(ch,m,mp),(ch,m,¬mp)*,(¬ch,m,¬mp).
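The expression for the PUT sequence can be checked mechanically once each label triple is encoded as a single symbol. The Python sketch below is an illustration under assumptions: the letters a, b and c and the names put_labels and matches_put_pattern are hypothetical, and the sketch only covers sequences of at least two objects, which is what the expression describes.

```python
import re

FIRST = ("ch", "m", "mp")
MIDDLE = ("ch", "m", "¬mp")
LAST = ("¬ch", "m", "¬mp")

# Hypothetical one-letter encoding of the three label triples.
SYMBOL = {FIRST: "a", MIDDLE: "b", LAST: "c"}
# (ch,m,mp),(ch,m,¬mp)*,(¬ch,m,¬mp) written over the encoded symbols.
PATTERN = re.compile(r"ab*c")


def put_labels(n):
    """Labels for the n objects of a PUT sequence, n >= 2."""
    if n < 2:
        raise ValueError("the expression describes sequences of two or more objects")
    return [FIRST] + [MIDDLE] * (n - 2) + [LAST]


def matches_put_pattern(sequence):
    """True if the label sequence is a word of the expression above."""
    try:
        word = "".join(SYMBOL[labels] for labels in sequence)
    except KeyError:
        return False
    return PATTERN.fullmatch(word) is not None


three_objects = put_labels(3)
```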
If "du nimmst zwei/drei grüne Klötze", meaning "take two/three
green blocks", is uttered, then a cohesive succeeding utterance should
be "du stellst einen ..., einen ... und einen / den letzten ...",
meaning "put one ..., one ... and one / the last one ...". In case the
taken blocks were "two reds and a blue", the succeeding answers
must be "du stellst einen roten ..., einen roten ... und den
blauen ...", meaning "put one red ..., one red ... and the blue ...".
2. Integration in a generation system
The control of the dynamics of the conversation through the
labels mentioned (m), mentioned in the previous utterance (mp)
and the marking of utterance elements topic (t) and focus (f) are
only two of the various subprocesses that run parallel during the
main production processes. In the automatic generation of natural
language, spoken as well as written, the thematic progression of
a sequence of utterances and their formal cohesion must also be
taken into consideration. For the spoken language prosodic
cohesion must be considered additionally. Our rules for the
identification of topic (t), salient topic (st) and focus (f) guarantee the
coherence of the thematic progression /Daneš 1970/ of two or
more successive utterances of the action sequence. Two very
simple rules for thematic progression with the respective number
of the examples above are now given.
R1: The only focussed OBJECT of a TAKE action becomes the
topicalized OBJECT of the following PUT action (Ex. 1, 2).
R2: The two/three focussed OBJECTS of the TAKE action
become the OBJECT of the following two/three PUT actions
and will be labelled salient topic (Ex. 3).
Our topic, salient topic and focus identification rules also
allow us to make predictions about the distribution of accents. Indeed
an accent will be assigned to the elements labelled salient topic
(st) and focus (f); the topic elements (t) get no accents. In this
phase of the work accents are assigned to all arguments of the
proposition. The assignment of the accent to the adjective instead
of the noun in phrases like "...den roten..." involves application
of the same criteria inside lower level constituents. In order to
generate cohesive surface structures it is also necessary to know
when to use a definite article within noun phrases (the last one :
der letzte) or an indefinite pronoun (one : ein). This choice
depends on the pragmatic decision of taking one or more blocks
and on the properties shared by the objects in question. We work
under the assumption that only the parallel processing of semantic
and pragmatic information allows the choice of appropriate lexical
material. For this purpose, we will extend our set of semantic
categories to express whether a certain object is an underdetermined or
a determined element of a set. /For an extended discussion see
Pignataro 1987 and Pignataro (forthcoming)/.
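The accent prediction stated above (salient topic and focus elements receive an accent, topic elements do not) can be sketched as a filter over a marked plan. This is a minimal Python illustration; the function name accented and the representation of a marked plan as a list of (marking, element) pairs are assumptions, and the sketch does not model accent placement inside lower level constituents.

```python
def accented(marked_plan):
    """Return the elements of a marked plan that are predicted to carry an accent."""
    return [element for marking, element in marked_plan if marking in ("st", "f")]


# Marked plans of the two PUT actions in Example 3
ip1 = [("t", "PUT"), ("t", "ADRESSEE"), ("st", "OBJECT1"), ("f", "GOAL")]
ip2 = [("t", "PUT"), ("t", "ADRESSEE"), ("st", "OBJECT2"), ("f", "GOAL")]
accents = [accented(ip1), accented(ip2)]
```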
The generation model consists of four functions: F1, F2, F3
and F4. F1 maps an illocutionary plan IP and the context C onto
an illocutionary plan IP' with additional semantic-pragmatic
categories. F2 maps IP' onto IP'', i.e. semantic-pragmatic
categories onto topic, salient topic and focus. F3 maps IP'' onto surface
sentences. F4 maps C and IP onto the changed context C'.
(IP,C) --F1--> (IP') --F2--> (IP'') --F3--> Surf.Str
   |
   F4
   |
   v
  (C')
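The data flow between the four functions can be sketched as a loop that threads the context through a sequence of illocutionary plans. The Python sketch below only fixes the wiring: the function name run_pipeline is hypothetical, and f1 to f4 are passed in as arbitrary callables because their actual definitions are not reproduced here.

```python
def run_pipeline(plans, context, f1, f2, f3, f4):
    """Generate one surface sentence per plan and return them with the final context."""
    sentences = []
    for ip in plans:
        ip_prime = f1(ip, context)        # F1: (IP, C) -> IP'
        ip_double_prime = f2(ip_prime)    # F2: IP' -> IP''
        sentences.append(f3(ip_double_prime))  # F3: IP'' -> surface sentence
        context = f4(context, ip)         # F4: (C, IP) -> C'
    return sentences, context
```

Note that F4 takes the original plan IP and the old context, so the context update does not depend on the topic and focus marking of the current utterance; the updated context only affects the plans that follow.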

References

Daneš, F. (1970) Zur linguistischen Analyse der Textstruktur; in
Folia Linguistica IV, pp. 72-78.

Hajičová, E., Vrbová, J. (1982) "On the role of the Hierarchy of
Activation in the process of Natural Language Understanding";
in Horecký, J. (ed) Coling '82, Prague.

Hajičová, E., Sgall, P. (1985) "Towards an automatic Identification
of Topic and Focus"; Proceedings of the second Conference
of the European Chapter of the Association for Computational
Linguistics, Geneva, pp. 263-267.

Pignataro, V. (1987) Topik und Fokus in der Sprachproduktion;
KoLiBri Nr. 8, Universität Bielefeld.

Pignataro, V. (forthcoming)

Schiffer, S.R. (1972) Meaning; Oxford, at the Clarendon Press.

Sgall, P., Hajičová, E., Panevová, J. (1986) The Meaning of the
Sentence in its Semantic and Pragmatic Aspects; Dordrecht,
Reidel P.C.
