CHATR: a generic speech synthesis system
Alan W Black and Paul Taylor¹
ATR Interpreting Telecommunications Laboratories
2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-02, JAPAN
awb@itl.atr.co.jp or pault@cogsci.ed.ac.uk
Abstract
This paper describes a generic speech synthesis system called CHATR which is being developed at ATR. CHATR is designed in a modular way so that module parameters, and even which modules are actually used, may be set and selected at run time. Although some interdependencies exist between modules, CHATR offers a useful research tool in which functionally equivalent modules may be easily compared. It also acts as a simple system for those who are less interested in the internals of speech synthesis and just wish their computer to talk.
Topic: Speech synthesis, generic systems.
Introduction
There are many requirements for a speech synthesis system. In addition to high quality natural sounding speech output, the system should be flexible and not simply be hard-wired to particular algorithms. For example it should at least be the case that new words can be added to the lexicon. Other more general changes should also be possible, e.g. specification of new intonational tunes, varying of output voices, choice of phoneme set to be used (e.g. if a different lexicon is to be used), and even the choice of language being spoken. A researcher requires access to internal structures, ability to mix and match techniques, graphical display of utterances and compatibility with other systems.
But those who are uninterested in the internals of speech synthesis just want their computer to talk. To them the requirements of a synthesis system are different. Although they still want a degree of control over synthesis, it is real-time production of speech, machine independence, and ease of use that are the factors that are most important. As a well engineered system CHATR meets these requirements.
Because ATR's main speech project is in the area of speech translation systems, input to CHATR can be much richer than simple plain text. During translation, utterances are represented in a rich structure including syntactic, semantic and speech act information. Unlike a conventional text-to-speech system which needs to reconstruct this information from raw text, CHATR can use this explicit information directly and hence produce more accurate synthesis. Although CHATR does also support text-to-speech, current development has concentrated on the use of labelled input rather than raw text.
¹ Present address: HCRC, University of Edinburgh, 2 Buccleuch Place, Edinburgh EH8 9LW, Scotland.
CHATR is designed in a modular way so that functionally equivalent modules may exist within the system. Flow of control may be selected at run time, without recompilation. Within a speech synthesis research environment this is useful as it allows close comparison of components to identify differences. Thus equivalent modules may be tested interactively within exactly the same environment. For example, CHATR currently supports a number of different low level (waveform) synthesizers. This process is quite independent of the intonation or duration modules. CHATR's modularity allows synthesis of exactly the same utterance through different waveform synthesizers.
The next section discusses the internal representation of utterances within CHATR. Then the overall structure of the system is discussed, with some typical modules described. Finally the current configuration of CHATR is described, detailing its actual modules. Some discussion is also given of the shortcomings of the system and how we would like to see it improved.
Utterance representation
In conventional speech synthesis systems (such as MITalk [2]) a "pipeline" architecture is often used. Information is passed through a pipeline of modules. Each module defines what information is passed on to succeeding modules. But, if an earlier component does not pass on information which is later found to be useful down stream, all intermediate modules will need to be re-written to pass on that information. In contrast, CHATR uses a single "blackboard" representation for all aspects of an utterance. All modules have access to all parts. Although global, more than one utterance object may exist in the system at any time.
There are effectively two types of module which act on utterance objects. Synthesis modules will typically modify the contents of an utterance based on its current content (and other parameters). Other modules exist which are more general in nature, which, for instance, display an utterance, save its contents or play the synthesized waveform.
An utterance object consists of a number of streams. Each stream consists of an ordered list of cells. The number of streams can easily be changed in CHATR and not all streams need exist in all utterances. Typical streams are: words, syllables, and phonemes. Relations may be set between stream cells and so, for example, it is possible to find which word a syllable is in. The following diagram shows a typical stream structure for part of an utterance object.
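[Diagram: Word and Syllable streams (syllables marked for stress, e.g. unstressed), with links from each cell down to the cells it contains.]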
Each stream cell is linked to its preceding and succeeding cells. Cells contain all the appropriate information for that type of stream. For example our phoneme cell contains a name, phonetic features, a duration etc.
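
The stream and cell organisation described above can be illustrated with a minimal sketch. It is written in Python for brevity (CHATR itself is written in C and C++), and every class, field and stream name in it is an assumption made for illustration rather than CHATR's actual data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Cell:
    """One item in a stream, linked to its neighbours and to related cells."""
    name: str
    features: dict = field(default_factory=dict)
    prev: Optional["Cell"] = None
    next: Optional["Cell"] = None
    # Relations to cells in other streams, keyed by (hypothetical) stream name.
    related: Dict[str, List["Cell"]] = field(default_factory=dict)


class Utterance:
    """A set of named streams, each an ordered list of cells."""

    def __init__(self) -> None:
        self.streams: Dict[str, List[Cell]] = {}

    def append(self, stream: str, cell: Cell) -> Cell:
        """Add a cell to the end of a stream, creating the stream if needed."""
        cells = self.streams.setdefault(stream, [])
        if cells:
            cells[-1].next, cell.prev = cell, cells[-1]
        cells.append(cell)
        return cell

    @staticmethod
    def relate(a: Cell, a_stream: str, b: Cell, b_stream: str) -> None:
        """Record a two-way relation between cells of different streams."""
        a.related.setdefault(b_stream, []).append(b)
        b.related.setdefault(a_stream, []).append(a)


utt = Utterance()
word = utt.append("Word", Cell("hello"))
syl1 = utt.append("Syllable", Cell("hel", {"stress": 0}))
syl2 = utt.append("Syllable", Cell("lo", {"stress": 1}))
# A silence phoneme that belongs to no syllable or word.
silence = utt.append("Phoneme", Cell("#", {"duration": 0.1}))
for syl in (syl1, syl2):
    Utterance.relate(word, "Word", syl, "Syllable")

# Which word is the second syllable in?
owner = syl2.related["Word"][0].name
```

Because relations are stored per cell rather than imposed by the stream order, a cell such as the silence phoneme can exist without any parent in another stream.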
Note that although there will be hierarchical structure between streams this is not mandatory (e.g. the silence '#' phoneme above is not part of any syllable or word). For example in a treatment of intonation implemented within CHATR (based on [6]) the cells in the intonation stream are linked to syllables but no direct hierarchical relationship exists between intonation cells and phonemes.
The existing streams could even be ignored and others introduced if the current ones are inappropriate to some synthesis task. For example a different intonation model may require quite different streams from the Taylor model currently implemented. Streams must be defined at compile time but may be selected per utterance at synthesis time. That is, defining many different streams does not impinge on the size or efficiency of the utterance structures built.
Levels of input 
CHATR offers input at many levels. At the most abstract it can accept linguistic descriptions of utterances from which it can generate prosodic phrasing and intonational tune through a rule driven process (described in [3]). Alternatively, input may explicitly include prosodic phrasing and intonational features specifying tune. This second level allows much more explicit control over phrasing and intonation. A third level allows an even greater degree of control, specifying individual phonemes, durations and F0 target values (or a slightly higher symbolic description of F0). At the lowest level, waveforms themselves can be specified, allowing CHATR to generate any arbitrary sound. These differing levels of input allow a user of CHATR to specify the form of an utterance in as much detail as is desired.
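
One way to picture these levels is as entry points into a fixed sequence of processing stages: the more detailed the input, the fewer stages remain to be generated. The following Python sketch shows the idea only; the stage names and level names are hypothetical labels for the four levels described above, not CHATR identifiers.

```python
# Processing stages in order, from most abstract to the waveform itself.
STAGES = ["phrasing_and_tune", "segments_and_f0", "waveform"]

# Hypothetical names for the four input levels, mapped to the number of
# leading stages that the input already supplies.
INPUT_LEVELS = {
    "linguistic_description": 0,      # everything still has to be generated
    "prosodic_phrasing_and_tune": 1,  # phrasing and tune given explicitly
    "phonemes_durations_f0": 2,       # segments, durations, F0 targets given
    "waveform": 3,                    # nothing left to synthesize
}


def stages_to_run(input_level: str) -> list:
    """Return the stages that must still be generated for this input level."""
    return STAGES[INPUT_LEVELS[input_level]:]


remaining = stages_to_run("phonemes_durations_f0")
```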
Multiple levels of input are useful in synthesis research. For example, naturally occurring durations and/or pitch may be explicitly specified in the input, allowing exact control over parts of the synthesis, thus emphasizing the other parts under investigation.
There is currently a fixed number of input types. Although new levels can easily be added, it would be better if an utterance could be specified at any level of precision in any stream and synthesis modules could be used to fill in missing parts. This has not yet been added to the system but some discussion is given below.
Overall structure 
A command language based on Lisp is provided so the user can execute commands such as synthesize, play, set intonation statistics, define a phoneme set, etc. The Lisp, although a full language, is designed merely for control rather than encoding speech synthesis algorithms. Lisp list structures are used to represent most of the ASCII data in the system (e.g. duration statistics, lexicons, phoneme set definitions etc.). This means that data can easily be changed and (re)loaded into the system. As all these files are s-expressions no new file i/o routines are required.
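
To illustrate why a single s-expression reader is enough for all such data files, here is a minimal reader sketched in Python. It is not CHATR's reader (which is part of its Lisp command system); the lexicon entry format shown is a hypothetical example, and error handling for unbalanced input is omitted.

```python
import re

# Tokens: parentheses, double-quoted strings, or bare symbols and numbers.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()]+')


def read_sexp(text: str):
    """Parse one s-expression into nested Python lists, strings and ints."""
    tokens = TOKEN.findall(text)
    pos = 0

    def parse():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(parse())
            pos += 1  # skip the closing parenthesis
            return items
        if tok.startswith('"'):
            return tok[1:-1]
        try:
            return int(tok)
        except ValueError:
            return tok  # a symbol, kept as a plain string

    return parse()


# Hypothetical lexicon entry: a word, then syllables as (phonemes stress).
entry = read_sexp('("hello" ((h @) 0) ((l ou) 1))')
```

The same function would read duration statistics or phoneme set definitions equally well, since only the interpretation of the resulting lists differs.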
Flow of control, i.e. which modules are called, can also be specified in Lisp. Thus functionally equivalent modules may be selected interactively at run time.
The system consists of one large executable which includes a number of different modules. Modules may be written in C or C++ (or in fact any other language if an interface to the stream and utterance structures is provided for that language). It may have been possible to write the whole CHATR system in Lisp (or Prolog or some other language designed for symbolic manipulation). This however was specifically decided against as, in addition to the symbolic aspect of CHATR, we also wish the signal processing aspects of speech synthesis to be efficient (and many such algorithms already exist in C). Although most Lisp systems support C interfaces they are typically non-standard, and portability of the whole system was an important criterion.
Modules 
A number of modules exist in the system but not all are used for the synthesis of all utterances. Utterance modules are those functions that are given a single utterance object as an argument. Typically they will access a number of streams and create (or modify) another stream. Selection of which modules get called is based on the input type of the utterance, global options and the specified path.
Let us look at one typical module: the lexicon module. Our current lexicon module allows the construction and use of lexicons whose entries specify syllables, stress and phonemes for a given word (which is identified by a character string plus optional features). When the lexicon module is called the desired words are already set up in the word stream. The lexicon module looks up each word in the lexicon and creates the syllable and phoneme streams with the information found in the lexical entry (words not found can optionally be treated by letter to sound rules, be ignored or cause synthesis to abort).
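
The behaviour of the lexicon module described above can be sketched as follows. This is a simplified Python illustration, not the CHATR implementation: the utterance is reduced to a dictionary of lists, and the lexicon format, the function name and the `on_missing` option values are all assumptions.

```python
# Hypothetical lexicon: word -> list of (phonemes, stress) syllables.
LEXICON = {
    "hello": [(["h", "@"], 0), (["l", "ou"], 1)],
}


def lexicon_module(utt: dict, lexicon: dict, on_missing: str = "ignore",
                   letter_to_sound=None) -> None:
    """Fill utt["Syllable"] and utt["Phoneme"] from the words in utt["Word"]."""
    utt["Syllable"], utt["Phoneme"] = [], []
    for word in utt["Word"]:
        entry = lexicon.get(word)
        if entry is None:
            if on_missing == "lts":
                # Fall back to caller-supplied letter to sound rules.
                entry = letter_to_sound(word)
            elif on_missing == "ignore":
                continue
            else:  # "abort"
                raise KeyError(f"word not in lexicon: {word}")
        for phonemes, stress in entry:
            utt["Syllable"].append({"word": word, "stress": stress})
            utt["Phoneme"].extend(phonemes)


utt = {"Word": ["hello", "xyzzy"]}
lexicon_module(utt, LEXICON)  # the unknown word "xyzzy" is ignored by default
```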
Some modules offer choices between functionally equivalent modules by simply setting global parameters. For example we have two modules which predict durations for phonemes. One is based on the Klatt duration rules in [2, Ch. 9], while the second is based on Campbell's work [4]. Selection between them is simply by a command of the form
chatr> (Parameter Duration_Method KLATT)
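
The effect of such a command can be sketched as a lookup in a registry of equivalent modules, keyed by a global parameter. The Python below is an illustration of that dispatch pattern only: the registry, the decorator and the method name CAMPBELL are assumptions, and the duration values are placeholders, not the rules of [2] or the model of [4].

```python
PARAMETERS = {}        # global parameter table
DURATION_METHODS = {}  # registry of functionally equivalent duration modules


def parameter(name: str, value: str) -> None:
    """Python stand-in for the (Parameter NAME VALUE) command."""
    PARAMETERS[name] = value


def duration_method(name: str):
    """Decorator registering a duration module under a method name."""
    def register(fn):
        DURATION_METHODS[name] = fn
        return fn
    return register


@duration_method("KLATT")
def klatt_durations(phonemes):
    # Placeholder values only; the real rules are described in [2, Ch. 9].
    return [0.08 for _ in phonemes]


@duration_method("CAMPBELL")  # hypothetical method name
def campbell_durations(phonemes):
    # Placeholder values only; the real model is described in [4].
    return [0.10 for _ in phonemes]


def duration_module(phonemes):
    """Dispatch to whichever duration module is currently selected."""
    return DURATION_METHODS[PARAMETERS["Duration_Method"]](phonemes)


parameter("Duration_Method", "KLATT")
klatt = duration_module(["h", "@"])
parameter("Duration_Method", "CAMPBELL")
campbell = duration_module(["h", "@"])
```

Callers of `duration_module` never name a particular method, which is what lets the two implementations be swapped at run time without recompilation.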
Another area where selection of equivalent modules is common is the low-level synthesis methods. We wish to allow comparison of different forms of waveform synthesis based on the same utterance. Currently, CHATR offers a number of synthesis methods: Klatt formant synthesis, LPC based diphone synthesis, and a number of concatenative synthesis methods (each with its own internal options to choose between different unit selection strategies). The same utterance (with the same segments, durations and intonation) can be resynthesized with a different waveform generation module, allowing direct comparison between methods. New low level synthesis methods can easily be added: each takes an utterance structure as a parameter and generates a waveform on return.
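
The interface just described (utterance in, waveform out) is what makes methods directly comparable. A minimal Python sketch follows; the two toy "synthesizers" are stand-ins invented for illustration and bear no relation to the Klatt, LPC or concatenative methods, and the segment format (name, duration in milliseconds) is an assumption.

```python
import math
from typing import Callable, Dict, List

Utterance = Dict[str, list]                    # stand-in utterance structure
Synthesizer = Callable[[Utterance], List[float]]

RATE = 8000  # samples per second, chosen arbitrarily for the sketch


def total_samples(utt: Utterance) -> int:
    """Number of samples implied by the segment durations (milliseconds)."""
    return RATE * sum(ms for _, ms in utt["Segment"]) // 1000


def sine_synth(utt: Utterance) -> List[float]:
    """Toy method: a 100 Hz sine lasting the total segment duration."""
    n = total_samples(utt)
    return [math.sin(2 * math.pi * 100 * i / RATE) for i in range(n)]


def silence_synth(utt: Utterance) -> List[float]:
    """Toy method: silence of the same length."""
    return [0.0] * total_samples(utt)


SYNTHESIZERS: Dict[str, Synthesizer] = {
    "sine": sine_synth,
    "silence": silence_synth,
}

# The same utterance, resynthesized by each method for direct comparison.
utt = {"Segment": [("h", 50), ("@", 75)]}
waves = {name: synth(utt) for name, synth in SYNTHESIZERS.items()}
```

Adding a new method under this scheme means writing one more function with the same signature and registering it.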
Certain other modules in CHATR are not directly part of the synthesis process. Audio output is provided for through a general module that plays the waveform stream of an utterance. We provide a number of mechanisms to do this. We wish CHATR to be independent of hardware so we offer support for audio servers. These are network transparent systems that allow access to audio hardware. In the same paradigm as X Windows for graphics, audio servers offer a uniform access method for various audio devices such that waveforms (which internally describe their encoding, byte order and sampling frequency) can easily be played. We also offer command driven play routines to ensure CHATR will work on any machine with audio output.
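
The value of a waveform that describes its own encoding, byte order and sampling frequency can be shown with a small Python sketch. The field names and the restriction to 16-bit samples are assumptions for illustration; this is not the format used by CHATR or by any particular audio server.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class Waveform:
    """Sample data plus the description needed to play it on any device."""
    samples: bytes    # raw sample data
    encoding: str     # only "int16" is handled in this sketch
    byte_order: str   # "big" or "little"
    sample_rate: int  # samples per second


def decode(wave: Waveform) -> List[int]:
    """Return samples as Python ints, whatever the stored byte order."""
    if wave.encoding != "int16":
        raise ValueError(f"unsupported encoding: {wave.encoding}")
    prefix = ">" if wave.byte_order == "big" else "<"
    count = len(wave.samples) // 2
    return list(struct.unpack(f"{prefix}{count}h", wave.samples))


# The same three samples stored in both byte orders.
big = Waveform(struct.pack(">3h", 1, -2, 300), "int16", "big", 16000)
little = Waveform(struct.pack("<3h", 1, -2, 300), "int16", "little", 16000)
```

A player that consults these fields can handle data produced on a machine with a different byte order without any out-of-band configuration.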
Similarly a display module is offered that can graphically display an utterance's waveform, phonemes, words, syllables, intonation etc. Rather than incorporate a full graphical display mechanism in CHATR itself we offer interfaces to other systems with graphics capability. Currently we support two systems: Entropic's waves+ system and King, a free-software speech graphics package.
Example synthesis 
As stated above CHATR offers many levels of synthesis but here we will discuss one particular configuration. One of the uses CHATR is put to is in a speech translation system. The translation part of the system generates syntactic and semantic trees (represented as feature structures) of the utterance to be spoken. This is used as the input to one of CHATR's input modes. The following diagram sketches the information flow.
[Diagram: information flow between STREAMS (SPhrase (feature structure), Phonological Word, Segment, Waveform) and MODULES (discourse rules etc., synthesis methods).]
The input specifies speech act information and topic/focus information. A rule driven system translates this input to a lower level form. Prosodic phrasing is generated from syntactic structure and special features in the input. Intonation tune is generated based on speech act and topic markers. The result, held in the Phonological Word stream, is then passed to lower levels. The lexicon is used to find a word's syllable structure and default pronunciation. An intonation module generates F0 target points based on the generated intonation features (and speaker specific intonation parameters). A duration module generates phoneme durations based on phoneme context and intonational features. All this low-level information is brought together in the segment stream. Depending on selection, one low-level synthesis module is then called to generate a waveform based on the information in the segment stream.
Using parameter settings to select the form of synthesis required means that CHATR can easily be used for multi-speaker synthesis and also, we hope, for multi-language synthesis.
Implementation 
CHATR is written in a mixture of ANSI C and C++. The core architecture is written in C, but perhaps C++ would be more suitable as the core objects (utterances and streams) fit well into the object-oriented paradigm. Other modules are written in C or C++ for reasons related to history as well as appropriateness. The Lisp command system is in fact a small Scheme interpreter written specially for CHATR. An interactive command line interface offers command line editing, history and completion for commands, their arguments, variables and file names. This interface makes the system significantly easier to use, though CHATR may also be used in batch mode.
The system runs on a number of different architectures (including those with different byte order), among them Sun SPARCs, HPs, DECstations and 386BSD. It should port to any system with an ANSI C and C++ compiler.
Discussion 
One enhancement to the system currently being discussed is a much more formal definition of modules. A module has prerequisites and provides some result. It should be possible to explicitly declare these so that a module will only be invoked when the necessary prerequisites are met.
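
As a sketch of what such explicit declarations might allow, the following Python fragment runs each module only once the streams it requires exist. This illustrates the proposed enhancement rather than anything implemented in CHATR; the `Module` record, the `run_modules` scheduler and the stream names are all hypothetical.

```python
from typing import Callable, List, NamedTuple, Set


class Module(NamedTuple):
    name: str
    requires: Set[str]  # streams that must exist before the module runs
    provides: Set[str]  # streams the module promises to create
    run: Callable[[dict], None]


def run_modules(utt: dict, modules: List[Module]) -> List[str]:
    """Invoke each module once its prerequisites are met; return the order."""
    pending, order = list(modules), []
    while pending:
        ready = [m for m in pending if m.requires.issubset(utt.keys())]
        if not ready:
            names = ", ".join(m.name for m in pending)
            raise RuntimeError(f"unmet prerequisites for: {names}")
        for m in ready:
            m.run(utt)
            if not m.provides.issubset(utt.keys()):
                raise RuntimeError(f"{m.name} did not provide its result")
            order.append(m.name)
            pending.remove(m)
    return order


def creates(*streams: str) -> Callable[[dict], None]:
    """Build a placeholder module body that just creates empty streams."""
    def run(utt: dict) -> None:
        for s in streams:
            utt[s] = []
    return run


# Modules listed in the wrong order on purpose.
modules = [
    Module("waveform", {"Segment"}, {"Waveform"}, creates("Waveform")),
    Module("duration", {"Phoneme"}, {"Segment"}, creates("Segment")),
    Module("lexicon", {"Word"}, {"Syllable", "Phoneme"},
           creates("Syllable", "Phoneme")),
]
order = run_modules({"Word": ["hello"]}, modules)
```

With declarations of this kind the order of invocation follows from the data dependencies, and a missing prerequisite is reported instead of causing a failure inside a module.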
A much clearer way of dumping an utterance in a form which can be easily reloaded is also required. As we wish to allow CHATR to interface with other existing speech synthesis systems this may involve communication with a completely separate program. Being able to dump the full utterance structure and convert it to some alternative form for another program to operate on and then convert it back, reload and continue is something that would make CHATR much more useful in cooperating with other synthesis programs.
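
The dump, external edit and reload cycle described above might look like the following Python sketch. JSON is used here purely as a convenient stand-in for whatever interchange form is eventually chosen (the system's own data files are s-expressions), the function names are hypothetical, and relations between cells are left out for brevity.

```python
import json


def dump_utterance(utt: dict) -> str:
    """Serialize streams and their cells to a text form."""
    return json.dumps(utt, sort_keys=True)


def load_utterance(text: str) -> dict:
    """Rebuild the stream dictionary from its dumped text form."""
    return json.loads(text)


utt = {
    "Word": [{"name": "hello"}],
    "Segment": [
        {"name": "h", "duration": 0.05},
        {"name": "@", "duration": 0.075},
    ],
}
text = dump_utterance(utt)
# Stand-in for a separate program modifying the dumped form.
edited = text.replace("0.075", "0.1")
reloaded = load_utterance(edited)
```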
Complete freedom of development can sometimes be too general. Although as a system CHATR does not restrict how modules interact, if we are to be able to compare similar sub-systems it is necessary that those sub-systems act on the same data. Hence our low-level synthesis methods actually work from the information in the segment stream. That is, they take exactly the same input. Even when existing synthesizers are integrated into CHATR we encourage use of the segment stream as a common intermediate stage between high-level synthesis and low-level waveform generation.
Other laboratories are also aware of the problems of multiple synthesizers and require a common environment for their development. COMPOST [1] is one such system. Unlike CHATR it introduces a new language for high level synthesis specification but like CHATR it offers a choice of low level synthesizers that can be selected for each utterance.
CHATR's currently implemented features include: a well defined architecture, multiple types of input, choice of waveform synthesis methods, parameterized intonation features, two duration modules, abstract phoneme sets, a text-to-speech module, graphical displays and an utterance object inspector. Current expansion includes improving unit selection for concatenative synthesis and integrating ν-TALK [5], a Japanese non-uniform unit concatenative synthesis system, making CHATR into a multi-language synthesis system.
Thus CHATR may be used at many levels. First, it may be used simply as a black box speech synthesizer. Simple control of voice is possible and the text-to-speech component is adequate for many purposes. At a deeper level, CHATR can be used interactively, allowing experimentation with intonational features and rules, resynthesizing existing utterances with modified durations and pitch, and building new unit databases, all without modification of C sources. At the deepest level CHATR may be used to develop new synthesis algorithms: unit selection strategies, new intonation modules etc. may easily be added, building cleanly on the existing architecture. In summary CHATR goes a fair way to meet our original criteria.
Acknowledgements: The authors wish to acknowledge the help and comments by Nick Campbell and Norio Higuchi and the members of Department 2.
References 
[1] M. Alissali and G. Bailly. COMPOST: a client-server model for applications using text-to-speech systems. In Proceedings of EUROSPEECH '93, volume 3, pp 2095-2098, 1993.
[2] J. Allen, M. Hunnicutt, and D. Klatt. Text-to-speech: The MITalk system. Cambridge University Press, Cambridge, UK, 1987.
[3] A. W. Black and P. Taylor. A framework for generating prosody from high level linguistic descriptions. In Proceedings of the Acoustics Society of Japan, pp 239-240, 3-8-5, Spring, 1994.
[4] N. Campbell. Syllable-based segmental duration. In G. Bailly and C. Benoit, eds, Talking Machines, pp 211-225. North Holland, 1992.
[5] Y. Sagisaka, N. Kaiki, N. Iwahashi, and K. Mimura. ATR ν-TALK speech synthesis system. In ICSLP 92, volume 1, pp 483-486, 1992.
[6] P. Taylor. A Phonetic Model of English Intonation. PhD thesis, Edinburgh University, 1992.
