Ol~.~l_(Sf\]{~; ~'~t'fLINgt)NS 1tl i%}{F4I)oSp\[EI$~ 
U.I~@~JIE'tt P~II;I{DIS fRi F~(B;i~ 
Pierre TRESCASES
University of Western Ontario, Canada
Matthew CROCKER
University of British Columbia, Canada
This paper presents the development of
computer programs used for transcribing French
text into phonetic speech. Based on an earlier
program (Maggs & Trescases, 1980) made up of
fewer than a thousand text-to-phonetics rules,
with a component of prosodic rules for synthetic
speech, the present research was primarily
aimed at developing the best possible algorithm
to account for practically any word in the
French general lexicon, as opposed to most
frequently used words only. In order to
considerably enhance the original rules, these
were systematically tested against a 50,000 word
French pronunciation dictionary and an equally
large corpus of texts, which were all entered
in an IBM PC-XT. At the same time, a set of
syntactic rules was developed for most liaisons,
mandatory and forbidden, and homographs to be
found in French speech. The result is a set of
some 4,000 conversion rules. Tested against the
50,000 words of the pronunciation dictionary,
they yield a low percentage of errors. Errors in
text are similarly minimal, and due mainly to
foreign (English) words. To allow faster
processing while accounting for most commonly
used words in French, a compact set of 2,000
rules has been developed. It is essentially
based on a statistical analysis yielding most
frequently used general rules. The aim of both
algorithms has been to make possible better
text-to-speech software for French.
1 o I\]~t lm-~&mLi~n 
The transcription programs operate by
pattern matching the source text with the set of
text-to-speech rules.
The system is actually a three pass one.
The first, the primary objective of the
research, converts any text using the traditional
spelling system into phonetic symbols using the
International Phonetic Alphabet (IPA). A rule
has thus the general form:
context1 [text] context2 = phonetics;
This reads: in a given context, if any,
indicated at left and/or right of brackets, the
written form involved in the rule yields the
phonetic representation appearing in the second
part of the equation. For instance,
 th[ai+]=aajj; (cf. Appendix B)
means that the written form aï preceded by th
(itself preceded by a space indicating that it
must be the start of a lexeme), and no matter
what follows (since there is no space before the
equal sign, making the rule applicable not only
to thaï but to Thaïlande, thaïlandais, etc.),
should be pronounced /aj/ (which in our software
phonetic system, see Appendix A, is represented
by aajj).
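The rule notation described above can be illustrated with a minimal Python sketch. It is not the authors' program: the names Rule, parse_rule and matches are hypothetical, contexts are treated as literal strings, and macro symbols such as V or C are deliberately left out.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    left: str       # literal context required before the bracketed text
    target: str     # written form rewritten by the rule
    right: str      # literal context required after the bracketed text
    phonetics: str  # output in the software phonetic system


# context1 [text] context2 = phonetics;
RULE_RE = re.compile(
    r"^(?P<left>[^\[]*)\[(?P<target>[^\]]*)\](?P<right>[^=]*)=(?P<phon>[^;]*);$"
)


def parse_rule(line: str) -> Rule:
    """Split one rule line into its four parts (assumed syntax, no macros)."""
    m = RULE_RE.match(line)
    if m is None:
        raise ValueError(f"not a rule: {line!r}")
    return Rule(m["left"], m["target"], m["right"], m["phon"])


def matches(rule: Rule, text: str, pos: int) -> bool:
    """True if the rule applies to `text` at index `pos`."""
    end = pos + len(rule.target)
    return (
        text.startswith(rule.target, pos)
        and text[:pos].endswith(rule.left)
        and text.startswith(rule.right, end)
    )


# The leading space in the rule stands for "start of a lexeme".
rule = parse_rule(" th[ai+]=aajj;")
```

With this sketch, the rule applies to " thai+landais " at index 3 but not to a word in which th is not lexeme-initial, which is the behaviour the leading space is meant to express.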
The second system converts the phonetic
representation into a more specific one, which
determines the appropriate duration and
inflection of each phoneme. The third converts
the more specific representation into actual
codes or frequencies which are required by a
specific speech synthesizer. Each phase uses
precisely the same transcription mechanism,
that is, the source text is transcribed, then
the basic phonetics are transcribed, then the
specific phonetics are transcribed -- all using
the same procedure. The only change is the set
of rules being used for each purpose.
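The idea of three passes sharing one transcription procedure can be sketched as follows. This is an assumption-laden toy, not the original system: rules are reduced to (pattern, replacement) pairs without context, and the second and third rule sets (S1, U2 and the numeric codes) are invented purely for illustration.

```python
from typing import List, Tuple

RuleSet = List[Tuple[str, str]]  # (pattern, replacement), tried in order


def transcribe(text: str, rules: RuleSet) -> str:
    """Scan left to right; at each position apply the first rule that matches."""
    out, pos = [], 0
    while pos < len(text):
        for pattern, replacement in rules:
            if pattern and text.startswith(pattern, pos):
                out.append(replacement)
                pos += len(pattern)
                break
        else:
            out.append(text[pos])  # no rule applies: copy the symbol through
            pos += 1
    return "".join(out)


def run_passes(text: str, passes: List[RuleSet]) -> str:
    """Feed the output of each pass to the next; only the rule set changes."""
    for rules in passes:
        text = transcribe(text, rules)
    return text


# Hypothetical rule sets for the three passes.
SPELLING_TO_PHONETICS: RuleSet = [("ch", "sh"), ("ou", "uu")]
PHONETICS_TO_SPECIFIC: RuleSet = [("sh", "S1"), ("uu", "U2")]
SPECIFIC_TO_CODES: RuleSet = [("S1", "<17>"), ("U2", "<42>")]

PASSES = [SPELLING_TO_PHONETICS, PHONETICS_TO_SPECIFIC, SPECIFIC_TO_CODES]
```

Here run_passes("chou", PASSES) goes through "shuu" and "S1U2" before reaching the made-up synthesizer codes; the same transcribe function does all the work in each pass.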
Our initial basic program, written in
1979-80 with Peter B. Maggs at the University of
Illinois at Urbana, and acquired in 1983 by the
Catholic School Board of Montreal (CECM) for use
in a secondary school for the handicapped,
consisted of less than a thousand text-to-
phonetics and speech synthesis rules - that is,
less than the thoroughly researched "maximum"
program of 1,158 rules of Nina Catach (1984).
Ours had been tested against the 5,000 word
Juilland Frequency Dictionary of French Words
(1970), and an equally small corpus of texts
(see Maggs & Trescases, 1980). It was therefore
felt that the conversion rules and, above all,
the treatment of liaison and homographs could
greatly be improved upon, in order to yield
greater accuracy in the phonetic transcription
of non technical texts. This was made possible
in the fall of 1984 after receiving two major
grants from the Social Sciences and Humanities
Research Council (SSHRC) of Canada, and further
assistance from the University of New Brunswick.
The research started in the fall of 1984
had a triple objective.
First, it sought to develop a more complex
set of phonetization rules for common, standard
French, and not for a specific variety of
French, such as the one from France or Quebec.
The rules are therefore based on the
phonological system of "standard" French,
including the nasal vowel /œ̃/ and the borrowed
English phoneme /ŋ/ but using only front /a/ to
transcribe the two a's, and /nj/ to transcribe
the dorso-palatal nasal /ɲ/. Geminates within
words were generally not accounted for.
The main objective was to develop the best
possible set of rules for transcription of any
word in texts based on the general lexicon of
French. To this effect rules should yield a
correct transcription for most frequent cases of
homographs, and most commonly used "exceptions"
and rare words.
The algorithm thus developed was intended
for efficient use in micro-computers. In an
effort to achieve accurate transcription, one
had to be careful not to overextend the
program. A balance had continually to be struck
between writing rules that cannot possibly be
finite, if aimed at exhausting all the
possibilities of the lexicon and of speech, and
program manageability and efficiency. The task
of attempting to take into account new
exceptions to the rules, and all possible
contexts for homographs and liaisons, without
mentioning English words and abbreviations, is
virtually never-ending.
Various constraints influenced the approach
taken.
3.1 Constraints 
The approach followed was contingent on the
various constraints under which research had to
be conducted (over a three-year period by a
computer programmer and the main investigator,
with a full teaching load). Research had
therefore to be based on the original program
developed four years earlier. An overhaul of
the set of rules taking into account
syllabification and the graphemic system of
French as set forth by N. Catach (1980) was not,
for all practical purposes, possible.
Consequently, if the above-described
phonological system chosen for representing
common French pronunciation differs little from
the one used by N. Catach (1984), our graphic
system is not always based on graphemes, now
considered as the units in this system. As a
result, for instance, we have rules for the
graphic forms aill and ail instead of rules for
the graphemes ill and il preceded by a. This,
and the lack of syllabification, is no doubt a
cause of some lesser efficiency in the text to
phonetics correspondences. Whereas N. Catach's
program consists of 240 basic grapheme-phoneme
rules, ours is made up of some 325 such rules,
directly evolved from the original basic
program.
However, as said earlier, our primary 
objective in the present research was to 
correctly transcribe the maximum number of words 
belonging to the general lexicon while dealing 
with homographs and liaison in discourse in a 
more sophisticated way. 
3.2 Tools 
In order to greatly improve upon the
original program, rules had to be tested against
large corpuses.
The first was made up of the French
Pronunciation Dictionary by A. Martinet and
H. Walter (1973) based on actual usage, and not
on a prescriptive norm. The dictionary is made
up of some 50,000 words, out of which 10,000,
chosen for the investigation, register
variations in pronunciation. A computer program
was written for checking the pronunciation of
each word, as given by the program rules,
against the first - or only - and any other
pronunciation registered in the dictionary (cf.
Appendix C). Statistics of matches against
first, second, ..., tenth pronunciation were
simultaneously computed, as shown below for all
words starting with letter a:
-----------> File Statistics
Total words in file     : 4322
Total words checked     : 4322
Total words listed      :  525
Total number of matches : 4022   %=93.06
=========> Breakdown of matches:
Matches in pron # 1 are : 3630   %=83.99
Matches in pron # 2 are :  372   %= 8.61
Matches in pron # 3 are :   14   %= 0.32
Matches in pron # 4 are :    2   %= 0.05
Matches in pron # 5 are :    2   %= 0.05
Matches in pron # 6 are :    0   %= 0.00
Matches in pron # 7 are :    2   %= 0.05
Matches in pron # 8 are :    0   %= 0.00
Matches in pron # 9 are :    0   %= 0.00
Matches in pron #10 are :    0   %= 0.00

-----------> Neutral File Statistics
Total words in file     : 4322
Total words checked     : 4322
Total words listed      :  525
Total number of matches : 4274   %=98.89
=========> Breakdown of matches:
Matches in pron # 1 are : 4039   %=93.45
Matches in pron # 2 are :  225   %= 5.21
Matches in pron # 3 are :    6   %= 0.14
Matches in pron # 4 are :    4   %= 0.09
Matches in pron # 5 are :    0   %= 0.00
Matches in pron # 6 are :    0   %= 0.00
Matches in pron # 7 are :    0   %= 0.00
Matches in pron # 8 are :    0   %= 0.00
Matches in pron # 9 are :    0   %= 0.00
Matches in pron #10 are :    0   %= 0.00
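The kind of bookkeeping behind these statistics can be sketched in Python. This is only an illustration under assumptions: the function names and the toy entries are hypothetical, and each entry is taken to be a pair (pronunciation produced by the rules, list of pronunciations registered in the dictionary).

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def first_match_rank(predicted: str, variants: List[str]) -> Optional[int]:
    """1-based rank of the first registered pronunciation equal to the rule output."""
    for rank, variant in enumerate(variants, start=1):
        if variant == predicted:
            return rank
    return None


def percent(part: int, total: int) -> float:
    """Percentage rounded to two decimals, as in the print-outs."""
    return round(100.0 * part / total, 2)


def match_statistics(entries: List[Tuple[str, List[str]]]) -> Dict:
    """Count matches overall and per pronunciation rank (entries must be non-empty)."""
    total = len(entries)
    by_rank: Counter = Counter()
    for predicted, variants in entries:
        rank = first_match_rank(predicted, variants)
        if rank is not None:
            by_rank[rank] += 1
    matched = sum(by_rank.values())
    return {
        "total": total,
        "matched": matched,
        "percent": percent(matched, total),
        "by_rank": {r: (n, percent(n, total)) for r, n in sorted(by_rank.items())},
    }


# Toy data, not taken from the dictionary.
ENTRIES = [("aa", ["ah", "aa"]), ("an", ["an"]), ("xx", ["yy"]), ("bb", ["bb", "bx"])]
STATS = match_statistics(ENTRIES)
```

The percentages in the tables are consistent with this computation: 3630 matches out of 4322 words give 83.99, and 4022 out of 4322 give 93.06.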
As a result, non matches as recorded on 
print-outs allowed for systematic correction of 
faulty rules, and, consequently, continuous 
improvements to the algorithm. It was moreover 
possible, when a word was mispronounced in the 
dictionary or in a text, to detect which rule, 
or ordering, was involved. The following
example shows how all the rules involved in the
pronunciation of a word could be listed:
    [ ]=
    []=
    [p]=pp
    [e]CC=eh
    [ch]=sh
    [b]=bb
    [l]=ll
    pechbl[en]de =ehnn
    [d]=dd
    [e] =
    [ ]=
    French: pechblende    IPA: ppehshbbllehnndd
As numerous non matches - not considered as
genuine pronunciation errors but rather as
possible variations - resulted from
transcriptions with open instead of closed
unstressed vowels, or vice-versa, in the case
of E, O and EU, the neutralization applying to A
(selection of one phoneme, the front one) was
extended for correction's sake. Real
pronunciation errors were then made to stand out
in the print-outs, and facilitated systematic
correction of errors in the rules. As far as
the graphic-to-phonetics rules themselves are
concerned, to the exclusion of other types of
rules (for liaisons, homographs, etc.),
correction stopped when it was felt that the
rules produced an optimal phonetic transcription
(cf. Appendix D) considering the system used.
Given the fact that there are many permitted
variations even within standard pronunciation,
in particular in unstressed syllables,
compromises often had to be reached in the
development of the rules.
It should not, however, be construed from
the use of Martinet & Walter's dictionary that
it was used as a model of "good" French
pronunciation to be reproduced. It was
basically used, as mentioned earlier, as a
tool for systematic corrections of the rules, as
well as a reference dictionary registering
variations in pronunciation. In the development
of the conversion rules between graphic forms
and their phonetic representation, constant use
was made of other reference works, mainly of Le
Petit Robert (PR) and the Dictionnaire de la
prononciation by A. Lerond (Larousse, 1980). In
fact no effort at all was made to reproduce some
features to be found in the general
pronunciation revealed by the data such as the
pronunciation of geminates and the use of /œ̃/ vs /ɛ̃/.
The second corpus, made up of texts, as
opposed to lists of single words, also consisted
of some 50,000 words. These were mainly
articles found in French and Quebec general
information magazines. They were not only
intended as a further way of checking on
graphic-phonetic rules but, above all, as a
means of enhancing rules for liaisons and
homographs.
We will now briefly discuss two major areas
on which the research focused.
4. Treatment of liaisons 
Improvements to the earlier basic program
proceeded along two lines: a) increasing the
number of words causing obligatory, and to a
lesser extent semi-obligatory, liaisons; b)
refining rules so as to take more liaison contexts
into account without creating forbidden liaisons
at the same time.
4.1 Selection of words making up liaison rules
Here again, a balance had to be struck
between the will to take as many liaison
contexts as possible into account, and the
overextension of the rules, not to mention that
absolutely all liaison possibilities cannot be
allowed for. Selection of words causing liaison
- to be added to lists in the original program
based on the Juilland frequency list - was
further made according to a) frequencies given
in the Basic Orthographic Lists or LOB (Catach,
1984); b) linguistic awareness, since it was no
longer a matter of focusing on a 5,000 but on a
50,000 word lexicon. Memory limitations on the
program contributed moreover all along to reduce
the list of liaison rules as much as possible.
To this effect, semi-obligatory liaisons were
generally only entered for verbs, and the
decision was finally taken because it was felt
they made comprehension of synthetic speech
easier.
The program includes close to 300 - mostly
obligatory - liaison rules, out of which some
180 for adjectives alone. Here is an example of
such a rule (cf. Appendix B):
vil[ain ]L=ehnn;
It reads: in a liaison context L, the masculine
adjective ending is pronounced like the
feminine.
In the earlier programs, all liaison rules
looked like the above. Liaison was always
applied, except before a consonant or so-called
aspirated h words. These have to be listed, and
liaison is prevented before them through the use
of a macro symbol referring to their list.
However, applying liaison to all other vowel
initial words is bound to cause unwanted
liaisons such as in:
Il est vilain et réussit.
4.2 Use of macros
It was consequently necessary to introduce
constraints on the context of application of
liaison rules, and this was accomplished through
the use of two different macro symbols. The
first, E, prevents application of the liaison
rule only before a consonant initial or an
aspirated h word. The second, L, restricts
liaison before entire parts of speech, in this
case before prepositions (P), relative and
interrogative pronouns (R), conjunctions (J) and
verbs (B). These complex macros - as opposed to
simple ones such as V for vowel and C for
consonant - therefore require lists of words to
be available to the system. To increase
efficiency, such lists had to be drawn up and
ordered in terms of the frequency of the words
to be included.
By preventing liaison before vowel initial
words belonging to specific parts of speech
represented by other macro symbols, and included
in lists available to the system, the use of the
macro symbol L in liaison rules allows for a
correct transcription, without liaison, of
premier, usually a determiner, in such a context
as:
Le premier/et mon second se ressemblent.
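The difference between the two macro symbols can be sketched as a small decision function. This is a hedged illustration, not the original code: the word lists are tiny hypothetical samples, and only the first letter of the following word is inspected.

```python
# Hypothetical, deliberately tiny word lists; the real lists were far longer.
ASPIRATED_H = {"haricot", "héros", "hibou"}
BLOCKING_LISTS = {
    "P": {"à", "en", "avec"},        # prepositions
    "R": {"où"},                     # relative and interrogative pronouns
    "J": {"et", "ou"},               # conjunctions
    "B": {"est", "a", "arrive"},     # verbs
}
VOWEL_LETTERS = "aeiouyàâéèêëîïôûù"


def vowel_initial(word: str) -> bool:
    """Vowel-initial for liaison purposes: a vowel letter or a non-aspirated h."""
    w = word.lower()
    if not w:
        return False
    if w[0] == "h":
        return w not in ASPIRATED_H
    return w[0] in VOWEL_LETTERS


def liaison_applies(macro: str, next_word: str) -> bool:
    """E: any vowel-initial word; L: vowel-initial and not in a blocking list."""
    if not vowel_initial(next_word):
        return False
    if macro == "E":
        return True
    if macro == "L":
        w = next_word.lower()
        return not any(w in words for words in BLOCKING_LISTS.values())
    raise ValueError(f"unknown macro: {macro!r}")
```

Under these assumptions a rule written with L leaves premier unlinked before et, while a rule written with E would still link it.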
It must, however, be obvious that the
treatment of ambiguities, in the case of liaison
or homographs, through the immediate context of
the word involved in the rule, has its
limitations. Besides the fact that not all parts
of speech are represented, macros and the lists
they make available to the system are far from
being exhaustive.
4.3 Use of hard vs. soft hyphen
The presence of a hyphen after, or before
a word, or part of a word, to which a liaison
rule applies, is bound to modify the nature of
the liaison. For instance, if final s in États
can be pronounced /z/ in des États unis par des
liens économiques, liaison must be made in les
États-Unis. Similarly a third person singular
verbal ending in d should be pronounced /t/ when
followed by a hyphen and a personal pronoun
(inversion of the subject) whereas liaison is
optional or forbidden in other cases.
For such reasons, two different types of
hyphen were introduced. The presence of a hard
hyphen (H) in a rule will cause the rule to apply
only if there is a hyphen in the text. The rule:
[ed]HV=eett;
will make the final ed be pronounced /et/ in a
liaison context such as in sied-il, with a
mandatory hyphen.
Use of hard hyphen has thus several
advantages as it restricts the application of a
rule, in liaison as well as other cases, and
therefore reduces the overall size of the
program. It proved particularly useful in the
case of compounds. For instance, it allows for
generalization of liaison involving the final
consonant of the first element of a compound, as
in
[t]H=tt; (pot-au-feu and also dit-il)
Any such generalization will of course
entail listing rules accounting for exceptions
(chat-huant) and will cause some new compounds
to be incorrectly pronounced (...).
On the other hand, the use of a soft hyphen
(-) in a rule will allow the rule to be applied
whether or not the word is preceded and/or
followed by a hyphen. The rule
-mar[s] =ss;
will cause pronunciation of the final s whether
the word appears on its own, is found in a
compound (champ-de-mars) or in a phrase such as
en février-mars. The use of soft hyphen helps
solve the problem of the unsystematic use of
hyphen in French compounds, thus reducing
pronunciation errors in computer processed
texts.
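The contrast between the two hyphen symbols can be sketched with regular expressions. The translation below is an assumption made for illustration: a hard hyphen is read as "a hyphen must be present", a soft hyphen as "a hyphen, a space, or the edge of the text"; the helper name boundary_pattern is hypothetical.

```python
import re

HARD, SOFT = "H", "-"


def boundary_pattern(symbol: str) -> str:
    """Regex fragment for a rule boundary symbol (assumed semantics)."""
    if symbol == HARD:
        return r"-"                # the hyphen must be there
    if symbol == SOFT:
        return r"(?:-|\s|^|$)"     # hyphen, whitespace, or edge of the text
    raise ValueError(f"unknown boundary symbol: {symbol!r}")


# Soft hyphens on both sides: final s of "mars" handled in all three settings.
MARS = re.compile(boundary_pattern(SOFT) + r"mars" + boundary_pattern(SOFT))

# Hard hyphen followed by a vowel: final "ed" only before a real hyphen.
ED_HARD = re.compile(r"ed" + boundary_pattern(HARD) + r"(?=[aeiouy])")
```

With these patterns, MARS is found in "mars", "champ-de-mars" and "en février-mars" but not in "marsouin", while ED_HARD is found in "sied-il" only because the hyphen is actually written.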
As mentioned above, use of macro symbols,
referring to special lists of words available to
the system, makes possible generalization
concerning the context surrounding graphic forms
appearing in rules. They were particularly
useful, as new symbols were created to this
effect, in the treatment of ambiguities, which
constitute a major problem to be solved in
developing text-to-phonetics rules for French.
There are numerous cases of homographs in French
involving various problems (le bus vs. tu bus,
nous inventions des inventions, etc.). The
most complex and frequent one no doubt involves
the -ent final not pronounced in verbal endings
but otherwise pronounced /ã/. This case will be
used to illustrate our general treatment of
ambiguity in pronunciation, as summed up in the
following chart:
TREATMENT OF -ENT
CATEGORIES
I    Rules for adjectives causing liaison
     ex.: appar(ent) L=/ãt/;
II   Other exceptions in -ENT
     ex.: abriv(ent)=/ã/;
III  Rules of disambiguation (graphic forms of endings)
     ex.: em(ent)=/ã/;
IV   Syntactic rules
     ex.: I W(ent)=; (ils affluent)
V    Homographs
     ex.: S afflu(ent)=/ã/;
     (un affluent vs. ils affluent)
VI   General rule: (ent)=;
In the order of application of rules,
Part I is made up of most frequent -ent
adjectives involved in liaisons. Part II
includes all frequent words whose ending is
pronounced and cannot be derived from general
rules (around 200). Part III is made up of
general rules existing in the earlier program,
and taking into account the immediate context
before the -ent final. For example, all words
ending in -ement have their final pronounced.
Part VI constitutes the most general rule,
whereby the final is not pronounced. Parts IV
and V were additions to the program, and make
abundant use of macro symbols in an effort to
reduce ambiguity by defining the context beyond
the immediate graphemes preceding the form on
which the rule bears.
For instance, the rule
I W[ent]=; (ils affluent)
where I can be any personal pronoun
and W is any lexeme, or part of it, preceded by
a space, causes any final -ent, in a word
immediately preceded by a personal pronoun, not
to be pronounced.
Similarly, the rule
M A W Ws W[ent] =; (les très heureux
parents rayonnent de
joie)
makes use of M for plural determiners, and A,
which can be any adverb from a special list or
nothing.
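The ordered application of the six categories of -ent rules can be sketched as a cascade in which the first applicable category decides. Everything below is a simplified assumption: the word lists are small hypothetical samples, only the immediately preceding word is consulted, and the liaison /t/ of Part I is not modelled. The output "an" is the software symbol used for the nasal vowel in the appendices.

```python
# Small hypothetical samples standing in for the program's lists.
LIAISON_ADJECTIVES = {"apparent", "excellent"}      # Part I
PRONOUNCED_EXCEPTIONS = {"abrivent", "occident"}    # Part II
PERSONAL_PRONOUNS = {"ils", "elles"}                # macro I (Part IV)
SINGULAR_DETERMINERS = {"un", "le", "cet"}          # macro S (Part V)
HOMOGRAPHS = {"affluent", "président"}              # Part V


def ent_ending(prev_word: str, word: str) -> str:
    """Return 'an' if final -ent is pronounced, '' if it is silent."""
    prev_word, word = prev_word.lower(), word.lower()
    if not word.endswith("ent"):
        raise ValueError(f"{word!r} does not end in -ent")
    if word in LIAISON_ADJECTIVES:          # I   adjectives causing liaison
        return "an"
    if word in PRONOUNCED_EXCEPTIONS:       # II  other exceptions
        return "an"
    if word.endswith("ement"):              # III ending-based disambiguation
        return "an"
    if prev_word in PERSONAL_PRONOUNS:      # IV  syntactic rule: pronoun + verb
        return ""
    if prev_word in SINGULAR_DETERMINERS and word in HOMOGRAPHS:  # V
        return "an"
    return ""                               # VI  general rule: silent
```

The ordering matters: ils affluent is caught by Part IV before Part V can fire, whereas un affluent reaches Part V and keeps its nasal vowel.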
The program includes some 100 rules of the
kind, which evolved from the analysis of the
texts. They were tested against these and such
nonsense sentences as this one, correctly
transcribed by the program: Les parents du
président président en occident le serpent et
serpent le froment de l'opulent président.
Naturally the same rules that produce a
correct pronunciation in the above sentence
could produce errors in others. It is obvious
again that this treatment of ambiguity is not
error proof for several reasons: 1) more remote
contexts than the ones that can be defined by
such rules are sometimes involved; 2) not all
parts of speech, in particular adjectives, are
represented by a macro symbol; 3) special lists
made available to the system are not exhaustive;
4) in the present state of the program, some
symbols could not indifferently be used right
and left of the brackets in the rules, thus
restricting generalization; 5) finally, it will
take a sophisticated semantic and syntactic text
analysis to interpret certain cases of ambiguity
(ils ont tous les livres) or sociolinguistic
variation (...). Only the use of huge corpuses
of texts will increase the correctness of the rules
developed for dealing with homographs in
French. Time and memory limitations prevented
any further development of the rules, which
appear to be adequate for all practical
purposes, and in the present state of the
research.
In the absence, and impossibility, of one
norm of French pronunciation, there can be no
absolutely rigorous method of evaluating the
correctness of graphic to phonetics rules,
especially in discourse. The dictionary of
pronunciation entered into the computer in order
to test our rules has nevertheless allowed for
an assessment of the rules developed as applied
to a lexicon.
The pronunciation of each headword by the
rules could be matched against any pronunciation
registered in the dictionary, and words not
matching first pronunciation were listed, as
shown in the treatment of the first page of the
dictionary (cf. Appendix C). Matches, with or
without neutralization of double timbre vowels,
could then be computed as shown in the
statistics for letter A (cf. Appendix D).
Statistics for the entire dictionary indicate a
9(\]~!)% and 96.2% match average respectively with
or without neutralization. Matches against the
first pronunciation only average respectively
lil.6N and 89.7%. It should be noted that a
proportion of errors is caused by treatment of
some graphic signs used in the dictionary (hyphens
and apostrophes found in compounds etc.) and
that our use of /nj/ for /ɲ/, /œ̃/ for some of the
dictionary's /ɛ̃/, and the fact that we
did not generally attempt to produce geminates
likewise failed to produce matches. The sole
purpose of the above-mentioned statistics is to
help give a general idea of the performance of
the rules as applied to a pronunciation
dictionary.
Statistics were computed on the application
of all the rules to both corpuses of 50,000
words each -- lexicon and texts. For instance,
rules applied more than 10,000 times or so are
the following:
3\]. 99(\] tat; 18 822 :e=l"; 
26 2Z9 {f"ieel) e :=~ 17 325 d=d~ 
25 891. :t_<'-:L~ .1..7 068 \[J=p; 
.1.¢, 941; m=ui,~ \[4 915 n:,:iq; 
1,5 725 (e)gu ~ 12 /!:18 6=e~ 
J4 382 e~:u; l,\[i 83\] il::y!; 
\]!! 1.2\] i:,:l<~ 9 457 ll::h~ 
These statistics could also be used to
determine the frequency of each grapheme. They
indicate, for instance, that the frequency for
.'!.~ j:{ and i} Imu been re<.clp{~ukJw-;ly 9'I',~ 2~ uHd 
:leli~; i:.Iiriil-;!~ coulparod ko ";9~; "l w nn(I \]~'4 a<'\] 
quoted by N. Catach (1980).
The ordering of rules was also confirmed by
the statistics, as more general rules could be
applied more often than the rules listed before in
the program.
Finally, during the somewhat limited time
allowed for the research, statistics on
frequency of application of the rules were
mainly exploited to develop a more compact
program for faster use with a speech
synthesizer. Without going into particulars
here, the less applied rules, that is under ten
times (or so), were eliminated, unless it was felt
that the low frequency was mainly due to the
insufficient size of the corpus of text (made up of
some 50,000 words). We therefore relied on a
shorter dictionary of some 16,000 words, Le
Nouveau Larousse des débutants, Édition
canadienne, 1981, for deciding which less
used rules were to be kept. The selection process
resulted in a compact set of 2,000 rules, half
the size of the complete program, both sets
including some 21~llJ rules for English words. They
give an acceptable pronunciation for roughly 2/3
of English words likely to be found in a French
dictionary. It is estimated that around 5~Jll
rules are needed for the 5LIe non integrated
English borrowings registered in the dictionary
used; this represents a ratio of 2/3 of a rule
for every English word taken into account. With
regards to length of the rules and efficiency of
the program, English words, as well as proper
nouns and abbreviations (...), constitute
complex problems still to be solved.
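The selection of a compact rule set from application counts can be sketched as follows. This is a hypothetical illustration: the function name, the threshold default and the keep set (rules retained by hand despite a low count) are assumptions, and rules are represented by plain strings.

```python
from collections import Counter
from typing import FrozenSet, Iterable, List


def prune_rules(
    rules: List[str],
    applied: Iterable[str],
    min_count: int = 10,
    keep: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Keep rules applied at least `min_count` times, or explicitly protected.

    `applied` is the log of rule applications gathered while transcribing a
    corpus; the original order of `rules` is preserved in the result.
    """
    counts = Counter(applied)
    return [r for r in rules if counts[r] >= min_count or r in keep]


# Toy log: rule "a" fired 12 times, "b" 3 times, "c" never.
RULES = ["a", "b", "c"]
LOG = ["a"] * 12 + ["b"] * 3
```

Preserving the original order is a deliberate choice in this sketch, since more specific rules have to stay ahead of the general ones they override.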
Finally, statistics on the application of
rules to corpuses could be directly used to
completely reorder the rules according to
frequency, and therefore greatly increase the
efficiency of the program. This, and other
matters touched upon, would be sufficient to
prove that, although far less complex to develop
than rules for speech synthesis itself,
text-to-phonetics rules for French will remain a
good field.
TABLE OF SYMBOLS 
APPENDIX A 
Phonetic D_ictionary/Computer Program Output 
O. ~ ~ an 
E ~ eh 
en 
> oh 
oh ~ on 
ui 
O- th % 
~ ew 
gn 
~ ng 
J ÷ ah 
~ zh 
'~ /% eu 
a a s8 
b b bb 
Z Z 
The following denote how the accents are
transcribed when text is input:
Acute ................ <
Grave ................ >
Circumflex ........... *
Diaeresis ............ +
Cedilla .............. %
ZZ 
APPENDIX C 
LIST ERRORS ON IST PRONUNCIATION 
a IPA: as 
@ (abejlmpvwx) - ah NEUT Match 
a (dgknrty) - aa Maoh Neut Match 
an- IPA: an 
an • Bann 
abaisaer IPA: aabbeesaee 
ab<ae (cdgjlpvw) - aabbehasee NEUT Match 
abese (bkmnrtxy) - aabbeessee Match NEUT Match 
abasourdissant IPA: aabbaazzuurrddiiaaan 
abasurdia@ ~ (aokmnptvw) - aabbaaasuurrddiissan 
abazurdisB~(djlrxy) ~ aabbaazzuurrddiissan 
Match NEUT Match 
RULE LISTING 
APPENDIX B 
; Cai\]guer =am 
Ca\] =as! \[ai\]guez =eel 
Ca>\]=as; Eai\]gue<=eel 
urEaeus\] =eeyysm; \[ai\]gu1=e~; 
\[ae<\]=aaee~ £ai\]Ker =~e l 
\[ae>\]maaehl \[~i\]Kez =eel 
C~\]schn~ =eh; £ai\]Km<=eel 
mEae+\]=aaeh! Eai\]Ki=ee l 
m\[ae\]=aaeh~ Eai\]Ku=ee; 
-pEae\]lla-=aaeh$ £alm\]V=ehmm; 
tEae\]l =~aeh; Ealm\]=en~ 
Cae\]=~e; ~ain\]V=ehnn i 
\[ai*\]Cer =eel carE\[sin \]L=Qhnn; 
\[ai*\]Cez ~ee; proch\[ain \]L=ehnn~ 
\[ai~\]Ce<=~el s\[ain \]L=ehnn~ 
Eai*\]Ci=e~; scud\[sin \]L=ehnr~; 
Eai~\]~eh~ v\[ain \]L=ehnnj 
dj\[ai+n\] =aaiinn; vii\[sin \]L=ehnn; 
JCai+n\] =aaiinn~ £ain\]=en; 
\[ai+\]V=aajj! Ea~\]maa; 
samourEai+\] =aajj; TEai\]sV~eu~ 
-tEai+\]=aaJj| assEaIJ =aajj~ 
th\[ai+\]=aajj; gEai\] =ee l 
Eai+3=aaii| Cai\]=eh l 
\[aient\]H=ehtt; \[a\]nV=aa; 
\[aient \]E=~htt! Ea\]nn=aa; 
Eaimnt\]=eh! -grEands-3L=anzz; 
sCalm\]tt=ehjJeh; -\[an\]h=aann; 
g\[aie\] =e~; cy£an\]hydrlque =aann; 
\[ale\]CeK~eel deans \]E=anzz~ 
\[aie\]Ci=ee; deans\] =an; 
vrEales \]L=eezz~ dean\] -aann; 
\[aie\]=eh; hetm\[an\] =aann; 
\[a~illi\]V=aaJj; karmEan\] =aann l 
Eailli\]e =aajjil; ombudsm\[an\] =aann l 
joEailli\]e=aallJJl p~an\]helle<nJ=aann~ 
Ea~ill\]V=aajjl pirEan\]ha =aann; 
Eaill\]V=aajjl s\[an3heKdrin=aann; 
Eail\]V=ehll; -\[and3-=ehnndd; 
cocktEail\]=ehll; -barmEan\]-=aann; 
\[ail\]=aaJJl -fEan\]-=aann; 
\[ai\]Cer =ee gentlem\[an\] =aann l 
\[ai\]Cez =e~ -~£an\]-=aann~ 
\[ai\]Ce<=ee~ recordmCan\] ~aann; 
\[ai\]Ci=ee~ \[an\]=an~ 
\[ai\]Cu~ee~ £a\]mV=aa; 
Eai\]gner =eel Ea\]mm=aal 
\[ai\]gnez =ee I rh\[amn\]ace<e =aammnn; 
Eai\]gn~<=~e; £a\]mn=aa; 
£aiJgrier =e~; \[am\]b=an! 
APPENDIX D 
I' europe un re*ve impossible ? 
iI ewrrohpp un rrehvv enppohasibbIl?. 
non ! 
nnon.. 
I' europe existe ,elle ne s' eat j' amais aussi 
bien port~¢e I 
II ewrrohpp ehggzziisett /. ehli nneu se eh zh 
aammehzzooaeii bbjjen ppohrrttee.. 
des europe<ens convaincus de i' avenir deIeur 
continent 
ddeezzewrrohppeeen kkonvvenkkyy ddeu Ii 
aavvnniirr ddeu iioerr kkonttiinnan /. 
on en entend beaucoup . 
/ohnnannnanttan bbookkuu.. 
mais parfoia , le lieu d' ou~i' on parIe imports 
autant que ce qu' on a a> dire = bruno kreisky 
/mmeh ppaarrffwwaa /. lieu lljjew dd uu II co 
ppasrrII enppehrrkt /oottan kkeu aseu kk ohnnaa 
/aa ddiirr /. bbrryynnoo kkrreheskkii. 
, I' ex-ehancelier d' autriche il a pris sa 
retraite 1' an pase~(, 
/. II ehkksshanaseulljjee dd oottrriiah /Jill aa 
pprrii ssaa rreuttrrehtt II an ppaaaaee /. 

References

Catach, N., L'Orthographe française, Paris,
Nathan, 1980.

...................., La phonétisation automatique du français,
Paris, Nathan, 1984.

Lerond, A., Dictionnaire de la prononciation,
Paris, Larousse, 1980.

Martinet, A. et Walter, H., Dictionnaire de la
prononciation française dans son usage réel,
Paris, France-Expansion, 1973.

Trescases, P. et Maggs, P.B., "De l'écrit à
l'oral. Un programme sur ordinateur pour
machine à parler à l'usage des aveugles
francophones", Le français moderne,
juillet 1980, n° 3, pp. 224-244.

Le Petit Robert, 1978.

Nouveau Larousse des débutants, édition
canadienne, 1981.
