American Journal of Computational Linguistics 
Microfiche 57
PHONOLOGICAL RULES FOR A
TEXT-TO-SPEECH SYSTEM
SHARON HUNNICUTT 
This work was supported originally by the Joint Services Electronics
Program, Contract DAAB07-71-C-0300, and more recently
by the National Science Foundation (Grant EPP74-124353).
Copyright (C) 1976
Association for Computational Linguistics
Summary 
The phonological rules discussed in this paper are part of a system
which has been under development at M.I.T. to convert unrestricted text to
speech. The system utilizes a morph lexicon and a vocal tract model.
Although most of the linguistic analysis is done by decomposing words into
their constituent morphemes, such a system is not sufficient for unrestricted
text. In order to attain the competence of a comprehensive system, it was
necessary to develop a scheme for dealing with unrecognizable words. This
is called the "letter-to-sound" system.
When a decomposition fails, that is, when a word cannot be decomposed
into its constituent morphs or when it is too infrequent in the English
language to be included in the morph lexicon, the "letter-to-sound" system
is invoked. The letter string which it receives is converted into a stressed
phoneme string using two sets of ordered phonological rules. The first set
to be applied converts letters to phonemes, first stripping affixes, then
converting consonants and finally converting vowels and affixes. The second
set consists of ordered rules which determine the stress contour of
the phoneme string.
These rules were developed by a process of extensive statistical analysis
of English words. The form of the rules reflects the fact that pronunciation
of vowels and vowel digraphs, consonants and consonant clusters, and
prefixes and suffixes is highly dependent upon context. The method of ordering
rules allows converted strings which are highly dependable to be used as
context for those requiring a more complex framework. Detailed studies
of allowable suffix combinations and the effect of suffixation on stress
and vowel quality have also provided for more reliable results.
Table of Contents
I. The Text-to-Speech System
II. Letter-to-Sound
III. Lexical Stress Placement
IV. Reliability
Appendix
Phoneme Table
Listing of Letter-to-Sound Rules
Figures
1. Application of Letter-to-Sound Rules
2. Cyclic Rules (First Phase); Domain of Application
3. Notation
4. Stress Rules - Flow Chart
5. Stress Placement Rules
Examples
A. PTOLEMAIC
B. TABLE
C. CARIBOU
D. SCENARIO
E. SUBVERSION
F. SCIENCE
Lexical Stress Placement
Main Stress Rule
Stressed Syllable Rule
Alternating Stress Rule
Destressing Rule
Compound Stress Rule
Strong First Syllable Rule
Cursory Rule
Vowel Reduction Rule
A Complete Example
I. The Text-to-Speech System 
A system to convert unrestricted printed text to speech has been under
development for several years at M.I.T. (refs. 1,2,3). The approach has been
to model the process employed by a native speaker of English when reading
aloud. In order to develop correct computational algorithms for the
pronunciation of English words, it has been necessary to reflect the basic
nature of linguistic processes. Consequently, considerable emphasis has been
placed on the development of morphological and phonological analysis, stress
patterns, parsing systems and prosodic correlates.
It is possible, using the current system, to convert any English word
or string of words in a textual representation to intelligible speech. In
order to effect this conversion, a number of subsystems are utilized,
including morphological analysis, letter-to-sound rules, stress rules and
phonemic speech synthesis. Prosodic studies are now in progress; experimental
parameters for f0 contours and timing patterns will soon be included
in the system.
The letter-string which represents a word is usually converted to a
phoneme string in preparation for speech synthesis by a process of
morphological analysis. A morph lexicon containing approximately 11,000 entries
has been developed and is used in conjunction with a morph decomposition
algorithm. Included in the lexicon are two major classes of morphs. One
class is composed of roots such as "trust" and "snow," i.e., words which can
occur alone, and bound roots such as -ceive (perceive, receive, conceive),
rot- (rotary, rotate, rotor) and -miss- (dismiss, missive, permissiveness).
Affixes make up the second class and may be attached either to roots or to
bound morphemes. Accompanying each lexical entry is its phonemic
representation, its morph class and its part(s) of speech.
Algorithmic decomposition of letter-strings models the procedure used
by a native speaker when confronted by a word which he does not immediately
recognize or has previously not encountered. If the word is not immediately
recognizable, i.e., if it is not in the "lexicon" of the native speaker in
its entirety, an attempt will be made to break it apart into its constituent
morphs. Such a process is probably used when one reads a word such as
"antidisestablishmentarianism," "earthrise" or "cranapple" for the first time.
The algorithm also models the ability to recognize mutations such as
the dropping of a final silent [e] (observe - observance), the doubling of
a final consonant (red - reddest) and the substitution of [i] for final [y]
preceding vocalic suffixes (glory - glorious). Morphophonemic rules are
also included, modeling the ability to give a correct pronunciation for any
plural (horses, cats, dogs) or past tense (quieted, hushed, whispered), and
to palatalize in appropriate contexts (sculpt - sculpture, confuse -
confusion).
Another feature of the algorithm is a set of selectional rules which,
although very simple in form, choose the correct morphemic analysis from all
possibilities in a large number of cases. A standard form for sequences of
morphemes is compared with each possibility, and rules describing preferred
composition are used in a pairwise comparison leading to an acceptable
result.
The word "formally" provides an example in which the only rule
needed is one stating that a root followed by a suffix is preferable to two
concatenated roots. Possible decompositions of the word "formally" are
detected as follows (R represents "root" and S, "suffix"):
    form (R) + al (S) + ly (S)        form (R) + ally (R)
    for (R) + mall (R) + y (S)        form (R) + all (R) + y (S)
It is clear that the correct decomposition is the only candidate having the
form of a single root followed by suffixes.
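The pairwise selection described for "formally" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate list is copied from the example above, while the function names (`root_count`, `prefer`, `choose_analysis`) and the use of a root count as the preference measure are hypothetical stand-ins for the single rule that a root followed by a suffix is preferable to two concatenated roots.

```python
from functools import reduce

# Candidate analyses of "formally" as (morph, class) pairs: "R" = root, "S" = suffix.
CANDIDATES = [
    [("form", "R"), ("al", "S"), ("ly", "S")],
    [("form", "R"), ("ally", "R")],
    [("for", "R"), ("mall", "R"), ("y", "S")],
    [("form", "R"), ("all", "R"), ("y", "S")],
]

def root_count(analysis):
    """Number of roots in one candidate decomposition."""
    return sum(1 for _, morph_class in analysis if morph_class == "R")

def prefer(a, b):
    """Pairwise rule (assumed form): fewer concatenated roots wins; ties keep the earlier candidate."""
    return a if root_count(a) <= root_count(b) else b

def choose_analysis(candidates):
    """Compare candidates two at a time and keep the preferred one."""
    return reduce(prefer, candidates)

best = choose_analysis(CANDIDATES)
print(" + ".join(morph for morph, _ in best))
```

Only the first candidate contains a single root, so it survives every pairwise comparison. A fuller rule set would need further preferences to break ties between analyses with the same number of roots.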
When a complete morphemic analysis fails, that is, when a word encountered
by a native speaker cannot be separated into its constituent morphemes,
or when it is so infrequent in the English language that he has previously
not encountered it, a "letter-to-sound" system is invoked, i.e., an attempt
is made to sound out the word letter by letter. The competence of a native
speaker which allows him to perform this conversion is based on
correspondences between letters and sounds in English which have been
internalized through experience. (A native speaker will also apply the same
correspondences to a foreign word from any language of which he has no
knowledge.) A scheme to model this process must be made available, sequenced
after the decomposition algorithm, in order to be able to convert unrestricted
text to speech. The text-to-speech system includes a phonological model having
a two-phase structure: the first phase is a set of rules which converts
letters to phonemes, and the second is an algorithm for placement of stress
on the converted phonemes.
This system has been implemented in BCPL on a DEC PDP-9 and on a PDP-10
in MAC LISP.
One might ask why, with a set of letter-to-sound rules available, it
is necessary to have a lexicon. There are three reasons; all are restrictions
which must be imposed on any viable letter-to-sound system. First, it has
been observed that high-frequency words, perhaps because of extensive use, do
not always follow letter-to-sound rules. For example, the only instance in
which a final [f] is pronounced as /v/ is in the word "of." The letter [w]
following a consonant is generally pronounced like the [w] in "sweet" or
"twill," but in the word "two" it is not pronounced at all. A study (ref. 5)
of the 200 most frequent words in English according to the Brown corpus
(ref. 6) was made to determine their regularity of pronunciation. It was found
that although the regular case is that a final [e] preceded by a single
consonant (other than [r]) lengthens the preceding vowel, four of the 200
words, i.e., "have" (compare "behave," "shave"), "one," "some" and "come"
(compare "lone," "ode") are exceptions. The case of initial [th] is even more
irregular among high-frequency words. In most English words, initial [th] is
unvoiced as in "thistle," "thin," "thesis." However, twelve of the 200 words
began with voiced [th].
Secondly, it must be recognized that the letter-to-sound rules which
operate within a morpheme do not necessarily apply across morph boundaries.
In particular, the pronunciation of compounds requires a lexicon. Such words
as "hothouse" or "potherb" might otherwise appear to contain the consonant
cluster [th], and the morph-final silent [e] in "houseboat" would certainly not
be silent if the word were not recognized as a compound. The application of
letter-to-sound rules must therefore be restricted to words containing no more
than a single root.
Thirdly, foreign words which retain their original pronunciation must
be lexical entries. The entries may be made in the same way a native speaker
of English would add a foreign word to his vocabulary, i.e., by pronouncing it
as if it were an English word (using English letter-to-sound rules) until
informed of its correct pronunciation, and then placing it in his mental
lexicon.
It is apparent, then, that both morphological and phonological
systems are necessary and that together, in sequence, they can provide a
phonemic representation for any English word presented for conversion to
speech.
Although, at present, there is no interaction between the decomposition
and letter-to-phoneme algorithms, a more efficient system could be
developed in the future. The size of the lexicon could be reduced, for
example, by application of stress placement rules to the output of the
decomposition algorithm or by omitting unnecessary phonemic representations.
II. Letter-to-Sound
The conversion of a letter string to a phoneme string in the letter-to-
sound program proceeds in three stages. In the first stage, prefixes and
suffixes are detected (cf. Figure 1). Such affixes appear in the list of
phonological rules. Each suffix is classified according to (1) its possible
parts of speech, (2) the possible parts of speech of a suffix preceding it,
(3) its restriction or lack of restriction to word-final position and (4) its
ability to change a preceding [y] to [i] or to cause the omission of a
preceding [e]. Prefixes are given no further specification.
Detection of suffixes proceeds in a right-to-left, longest-match-first
fashion. When no additional suffixes can be detected, or when a possible
suffix is judged as syntactically incompatible with its right-adjacent suffix
by a parts-of-speech test using classifications (1) and (2) above, the
process is terminated. Finally, prefixes are detected left-to-right, also by
longest match first. If at any time the removal of an affix would leave no
consonant or no vowel in the remainder of the word, the affix is not removed.
Example: dictatorship
    dict + ate + or + ship       possible suffix analysis
        ship: (a) nominal suffix
              (b) follows nominal suffix
        or:   (a) nominal suffix
              (b) follows verbal suffix
        ate:  (a) verbal, nominal and adjectival
              (b) follows verbal, nominal and adjectival suffixes
    dict + ate + or + ship       parts of speech are compatible;
                                 analysis accepted
Example: passing
    pa + s + s + ing             possible suffix analysis
        ing:  (a) nominal and verbal suffix
              (b) follows nominal or verbal suffix
        s:    (a) nominal and verbal suffix
              (b) follows nominal and verbal suffixes
              (c) appears only in word-final position
                                 unacceptable analysis
    pass + ing                   correct analysis
Example: finishing
    fin + ish + ing              possible suffix analysis
        ing:  (a) nominal and verbal suffix
              (b) follows nominal or verbal suffix
        ish:  (a) adjectival
              (b) follows nominal or adjectival suffix
                                 parts of speech not compatible
    finish + ing                 correct analysis: root functions as verb
                                 with verbal ending, "ish"
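The suffix-detection procedure and the three examples above can be approximated by the following Python sketch. It is a simplified illustration, not the original program: the `Suffix` record, the function names and the toy table are hypothetical, the part-of-speech entries are copied from the examples, and the `drops_e` flag on "or" is an assumption made so that "dictator" can be related to "dictate".

```python
from dataclasses import dataclass

VOWEL_LETTERS = set("aeiouy")  # assumption: [y] is counted as a vowel letter
N, V, A = "nominal", "verbal", "adjectival"

@dataclass(frozen=True)
class Suffix:
    pos: frozenset                 # (1) possible parts of speech
    follows: frozenset             # (2) parts of speech allowed for a preceding suffix
    word_final_only: bool = False  # (3) restricted to word-final position
    drops_e: bool = False          # (4) may cause omission of a preceding [e]

# Toy table; entries follow the worked examples, drops_e on "or" is assumed.
SUFFIXES = {
    "ship": Suffix(frozenset({N}), frozenset({N})),
    "or": Suffix(frozenset({N}), frozenset({V}), drops_e=True),
    "ate": Suffix(frozenset({V, N, A}), frozenset({V, N, A})),
    "ing": Suffix(frozenset({N, V}), frozenset({N, V})),
    "ish": Suffix(frozenset({A}), frozenset({N, A})),
    "s": Suffix(frozenset({N, V}), frozenset({N, V}), word_final_only=True),
}

def has_vowel_and_consonant(text):
    """True if the remainder keeps at least one vowel and one consonant letter."""
    return (any(c in VOWEL_LETTERS for c in text)
            and any(c not in VOWEL_LETTERS for c in text))

def candidates(stem, right):
    """Suffixes matching the end of the stem, longest first, as (suffix, remainder)."""
    forms = [stem]
    if right is not None and SUFFIXES[right].drops_e:
        forms.append(stem + "e")  # undo a possible omission of [e]
    found = [(name, form[:-len(name)])
             for form in forms for name in SUFFIXES
             if form.endswith(name) and len(form) > len(name)]
    return sorted(found, key=lambda pair: len(pair[0]), reverse=True)

def strip_suffixes(word):
    """Right-to-left, longest-match-first suffix detection; returns (root, suffixes)."""
    stem, suffixes, right = word.lower(), [], None
    while True:
        viable = [(name, rest) for name, rest in candidates(stem, right)
                  if has_vowel_and_consonant(rest)]
        if not viable:
            break                                   # no further suffix detected
        name, rest = viable[0]
        entry = SUFFIXES[name]
        if right is not None:
            if entry.word_final_only:
                break                               # e.g. the [s] in "passing"
            if not (entry.pos & SUFFIXES[right].follows):
                break                               # parts of speech incompatible
        suffixes.insert(0, name)
        stem, right = rest, name
    return stem, suffixes

for word in ("dictatorship", "passing", "finishing"):
    root, found = strip_suffixes(word)
    print(word, "->", " + ".join([root] + found))
```

Candidates that would leave no vowel or no consonant are skipped rather than ending the search, so a shorter suffix can still be tried; the word-final and part-of-speech tests, by contrast, terminate the process as the text describes.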
The domain of application of the second stage rules excludes any
previously recognized affixes and is assumed to be a single morpheme. This
stage is intended primarily for consonant rules and proceeds from the left of
the string to the right. Extending the domain to the whole letter string
once again for the third stage, a phonemic representation is given to affixes
and to vowels and vowel digraphs (cf. Figure 1).
Phonemic representations are produced by a set of ordered rules which
convert a letter string to a phoneme string in a given context. Both left
and right contexts are permitted in the expression of a rule, and may contain
variables as well as letters or phonemes. Any one context may be composed of
either letters and letter variables or of phonemes and phoneme variables.
Combination of these possibilities for both left and right contexts allows
for four possible context types. One type of rule, for example, makes it
possible to convert a particular letter string to a phoneme string only if the
left context is a specified phoneme string and the right context is a specified
letter string.
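One way to picture a rule with mixed contexts is the Python sketch below. The data layout is an assumption made for illustration: `Context`, `Rule` and `rule_applies` are hypothetical names, each context is a sequence of "slots" (sets of acceptable symbols, which is how a variable is modeled here), and the phoneme list holds `None` for letters not yet converted. The sample rule is the one the paper describes for [a] preceded by the phoneme /w/ and followed by the letters [rC].

```python
from dataclasses import dataclass
from typing import Optional

CONSONANT_LETTERS = frozenset("bcdfghjklmnpqrstvwxz")

@dataclass(frozen=True)
class Context:
    kind: str     # "letters" or "phonemes"
    slots: tuple  # one set of acceptable symbols per position (a set models a variable)

@dataclass(frozen=True)
class Rule:
    target: str                       # letter string to be converted
    output: str                       # label standing in for the phoneme string
    left: Optional[Context] = None
    right: Optional[Context] = None

def _side_matches(ctx, letters, phonemes, start):
    """Check one context against consecutive positions beginning at `start`."""
    if ctx is None:
        return True
    end = start + len(ctx.slots)
    if start < 0 or end > len(letters):
        return False
    source = letters if ctx.kind == "letters" else phonemes
    return all(source[pos] in slot
               for pos, slot in zip(range(start, end), ctx.slots))

def rule_applies(rule, letters, phonemes, i):
    """True if `rule` can convert the letters starting at index i."""
    n = len(rule.target)
    if letters[i:i + n] != rule.target:
        return False
    left_start = i - len(rule.left.slots) if rule.left else 0
    return (_side_matches(rule.left, letters, phonemes, left_start)
            and _side_matches(rule.right, letters, phonemes, i + n))

# Left context in phonemes (/w/), right context in letters ([r] + consonant other than [r]).
WARP_RULE = Rule(
    target="a",
    output="[a] as in warp",
    left=Context("phonemes", (frozenset({"w"}),)),
    right=Context("letters", (frozenset({"r"}), CONSONANT_LETTERS - {"r"})),
)

print(rule_applies(WARP_RULE, "warp", ["w", None, "r", "p"], 1))
print(rule_applies(WARP_RULE, "carp", ["k", None, "r", "p"], 1))
```

Because each side independently chooses letters or phonemes, the same two classes cover all four context types mentioned above.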
The method of ordering rules allows converted strings which are highly
dependable to be used as context for those requiring a more complex
framework. Because the pronunciation of consonants is least dependent upon
context, phonological rules for consonants are applied first, i.e., in the
second stage. Rules for vowels and affixes, requiring more specification of
environment, are applied in the third and final stage. With the benefit of a
previously converted consonant framework and the option of including as
context any phoneme to the left of a string under consideration, the task of
converting vowels and affixes is simplified.
DENOMINATIONS                     Input
DENOMIN + ATE + ION + S           Stage 1: (a) recognition and isolation
                                               of suffixes
DE = NOMIN + ATE + ION + S                 (b) recognition and isolation
                                               of prefixes
-- = n-m-n + --- + --- + -        Stage 2: conversion of consonants in
                                           root
dɪ = n-m-n + --- + --- + -        Stage 3: (a) conversion of prefix
dɪ = namɪn + --- + --- + -                 (b) conversion of vowels in
                                               root
dɪ = namɪn + eʃ + ən + z                   (c) conversion of suffixes
                                  Result of Stress Placement Rules
All phonemes are given in IPA symbols. A dash (-) serves as a
place-holder for a letter which has not yet been converted; an equals
sign (=) follows each prefix; a plus (+) precedes each suffix. The result
of stress placement rules is also given.
Figure 1
Application of Letter-to-Sound Rules
Within the two sets of rules for conversion of consonants and vowels,
ordering proceeds from longer strings to shorter strings and, for each
string, from specific context to general context. The rule for pronunciation
of [cch], then, appears before the rules for [cc] and [ch], each of
which is ordered before rules for [c] and [h]. Procedures for the recognition
of prefixes and suffixes also require an ordering: the prefixes
[com] and [con] must be ordered before [co]; any suffix ending with the
letter [s] must be recognized before the suffix consisting of that letter
only.
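The ordering principle can be illustrated with a small sort in Python. The triples and the function name `order_rules` are hypothetical, and using the number of context symbols as the measure of "specific before general" is an assumption; the source states the principle but not how specificity is measured.

```python
def order_rules(rules):
    """Sort (target, context_size, label) triples: longer targets first,
    then rules with more context symbols before more general ones."""
    return sorted(rules, key=lambda rule: (-len(rule[0]), -rule[1]))

# Hypothetical rule stubs for the letters discussed in the text.
RULES = [
    ("c", 0, "general rule for [c]"),
    ("ch", 0, "general rule for [ch]"),
    ("c", 1, "rule for [c] with one context symbol"),
    ("cch", 0, "general rule for [cch]"),
    ("h", 0, "general rule for [h]"),
    ("cc", 0, "general rule for [cc]"),
]

ordered = order_rules(RULES)
for target, context_size, label in ordered:
    print(f"{target:4} {context_size}  {label}")
```

With this ordering, a matcher that scans the list from the top and stops at the first applicable rule never lets a short, general rule pre-empt a longer or more specific one.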
As an example of ordering rules for a particular string, consider
the vowel [a] and assume that it is followed by the letter [r]. This [a]
may be pronounced like the [a] in "warp," "lariat" or "carp" depending
upon specification of further context. It is pronounced like the [a] in
"carp" if it is followed by [r] and another consonant (other than [r])
and if it is preceded by any consonant phoneme except /w/ (note "quarter,"
"wharf"). Consequently, a rule for [a] in the context of being preceded
by the phoneme /w/ and followed by the sequence [rC] (ref. 7) is placed
ahead of the "carp" rule in the set of rules. Specification of a left context
in the rule for the [a] in "carp" is subsequently unnecessary. If the [a] is
preceded by a /w/, this rule will never be reached; if preceded by a vowel, a
rule for vowel digraphs will already have applied. Using this method, rules
may be stated simply and without redundancy.
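A minimal Python sketch of this cascade follows. The function name and the descriptive return labels are hypothetical (labels are used instead of phoneme symbols), [y] is assumed to count as a vowel letter, and the word-final case ("far," "war") follows the related rule the paper states under example C rather than the paragraph above.

```python
VOWEL_LETTERS = set("aeiouy")  # assumption: [y] is counted as a vowel letter

def a_before_r(left_phoneme, following):
    """Classify [a] before [r].

    left_phoneme: phoneme immediately to the left of the [a] ('' if none).
    following:    letters after the [a]; must begin with 'r'.
    """
    if not following.startswith("r"):
        raise ValueError("expected the letters after [a] to begin with [r]")
    after_r = following[1:2]
    r_then_vowel_or_r = after_r != "" and (after_r in VOWEL_LETTERS or after_r == "r")

    # Rule 1 (ordered first): left context /w/, [r] not followed by a vowel or [r].
    if left_phoneme == "w" and not r_then_vowel_or_r:
        return "as in warp"
    # Rule 2: same right context; no left context is needed because
    # every /w/ case has already been taken by rule 1.
    if not r_then_vowel_or_r:
        return "as in carp"
    # Rule 3: [r] followed by a vowel or another [r].
    return "as in lariat"

for left, rest, word in [("w", "rp", "warp"), ("k", "rp", "carp"),
                         ("l", "riat", "lariat"), ("w", "rter", "quarter")]:
    print(word, "->", a_before_r(left, rest))
```

The second rule carries no left-context test at all, which is the economy the ordering is meant to buy.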
Development of the set of phonological rules was begun by informal
inspection and reference to published works, e.g., Venezky (ref. 8). By a
process of extensive statistical analysis, other rules were added and ordered
appropriately. The principal source of words was the Merriam Webster Pocket
Dictionary (ref. 9). A computer print-out was generated in which all words
containing each letter and each specified cluster of letters were isolated.
Within each category, words were sorted alphabetically according to the
right-hand context of the letter(s) under consideration. In addition, Walker's
rhyming dictionary (ref. 10) was used to determine pronunciation of suffixes
and the effect of suffixation on preceding phonemes. Words from the Brown
Corpus, the Heritage English Dictionary (ref. 11) and Stedman's Medical
Dictionary (ref. 12) have been used in testing procedures.
Examples of Rule Application
In this section, a number of words will be analyzed according to the
phonological rule program. Intermediate output, i.e., the results of the
first and second stages, will be provided for each word, and the rules which
have been applied to produce this output will be discussed. Generalizations
of these rules and rules which are believed to be related will be included
in the discussion whenever possible. All phonemes are given in IPA symbols;
a dash (-) is a place-holder for a letter which is to be converted in a later
stage. The result of application of stress rules (to be discussed later)
is given without comment following each derivation.
A. PTOLEMAIC         Input
PTOLEMA + IC         Result of Stage 1
t-l-m- + --          Result of Stage 2
talɛme + ɪk          Result of Stage 3
                     Final result after stress: primary stress
                     appears over the /e/ and secondary stress
                     over the /a/
In the first stage, [ic] is recognized as a suffix and a plus (+) is
inserted to its left. Since no other affixes are recognized, Stage 1 is
terminated.
Morph-initial [pt] is pronounced /t/, and [l] and [m] are given the
pronunciations /l/ and /m/ respectively, according to the most general rule
in the rule sequence for each. (The most general rule is the final rule in
the rule sequence and contains no specified context.)
In Stage 3, the contexts of the vowels [o] and [e] are not among
those contexts specified in the sequence of rules, and these vowels are
pronounced according to the final, context-independent rule. The vowel [a],
on the other hand, precedes another vowel and, for this reason, is lengthened
(tensed). The suffix [+ic] is word-final and receives the pronunciation /ɪk/.
In the final result, stress rules have been applied and unstressed non-tense
vowels have been reduced.
Generalizations and Related Rules
1.) Morph-initial [pn] and [ps] are given the pronunciations /n/ and
    /s/ respectively, the [p] remaining silent as in morph-initial [pt].
2.) Vowels in pre-vocalic position are usually lengthened (tensed).
3.) There is only one context in which [ic] is pronounced /ɪs/ rather
    than /ɪk/, i.e., preceding the variable representing the vowels
    [i], [e] and [y].
B. TABLE             Input
TABLE                Result of Stage 1
t-bl̩-                Result of Stage 2
tebl̩                 Result of Stage 3
tebl̩                 Final result after stress (primary stress
                     over the /e/)
The result of Stage 1, in this case, is the same as the input since
no affixes are detected.
The letter [t] is pronounced by the most general rule in its rule
sequence, and [b] has only one given pronunciation. However, [l] precedes
morph-final [e] and is itself preceded by another consonant, [b]. In this
context, [l] is syllabic.
The sequence [ble] now forms a very specific context for the third
stage. The letter [a] when followed by [Cle] is lengthened if the consonant,
C, is neither [r] nor [l]. The vowel [e] is morph-final and therefore
silent.
Generalizations and Related Rules
1.) The rules for [bt] and [mb] in which the [b] is silent are
    sequenced preceding the single rule for [b].
2.) All vowels except [e], if located in the first syllable of a
    morph, are long if followed by [Cle#] where C is neither [r]
    nor [l]. Examples are "maple," "bible," "ogle" and "bugle."
    An exception is "triple." The letter [e] appears to be long
    in this context only if it is part of a vowel digraph, e.g.,
    the vowel in "treble" is short, but the vowel digraphs in
    "eagle," "people" and "beetle" are long. Vowels in this
    context which do not appear in the first syllable must be
    converted to short pronunciations so that they will not be given
    primary stress by the stress rules, e.g., "monocle," "barnacle."
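The [Cle#] generalization lends itself to a pattern test. The Python sketch below is an approximation under stated assumptions: the regular expression and the function name are hypothetical, "first syllable" is approximated as the first vowel letter of the morph, and vowel digraphs ("eagle," "people," "beetle") are left to separate digraph rules, so the function reports False for them.

```python
import re

# Optional leading consonants, one vowel letter other than [e], one consonant
# other than [r] or [l], then morph-final [le].
LONG_BEFORE_CLE = re.compile(r"^[^aeiou]*[aiou][^aeioulr]le$")

def first_vowel_is_long(morph):
    """True if the first-syllable vowel meets the [Cle#] lengthening context."""
    return LONG_BEFORE_CLE.match(morph.lower()) is not None

for word in ("maple", "bible", "ogle", "bugle", "table",
             "treble", "monocle", "barnacle", "triple"):
    print(f"{word:9}", first_vowel_is_long(word))
```

The pattern accepts "triple," which the text lists as an exception; a complete system would need that word in the lexicon or an earlier, more specific rule.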
C. CARIBOU           Input
CARIBOU              Result of Stage 1
k-r-b--              Result of Stage 2
kærɪbu               Result of Stage 3
kærɪbu               Final result after stress (primary stress
                     over the /æ/)
During Stage 1, no affixes are detected. Converting consonants in
Stage 2, we find that [r] is pronounced according to the most general rule
in its rule sequence and that [b] has only one given pronunciation. The
letter [c], because it precedes [a], is pronounced /k/.
When [a] precedes [r] which, in turn, precedes either a vowel or
another [r] within the same morph, it usually has the pronunciation /æ/.
The letter [i], following its most general pronunciation, is assigned the
phoneme /ɪ/. Morph-final [ou] is given the pronunciation /u/.
Generalizations and Related Rules
1.) The letter [r] is syllabic if preceded by a consonant other
    than [r] and followed by a morph-final [e] (e.g., "acre"),
    or the inflectional suffixes [+s] or [+ed].
2.) The letter [c] is palatalized in some cases, e.g., "special"
    (context: [V-iV]), "ancient" (context: [n-iV]). It is
    assigned the phoneme /s/ later in its rule sequence if it is
    followed by [e], [i] or [y]. It may be noted that this is the
    same context which assigns the pronunciation /ɪs/ to the suffix
    [+ic]. If [c] is followed by [a], [o] or [u], it is usually
    pronounced /k/, as in this example.
3.) When [a] precedes [r] and [r] is not followed by either a vowel
    or another [r] within the same morph, [a] is pronounced /a/ (e.g.,
    "far," "cartoon") unless preceded by the phoneme /w/ (e.g.,
    "warble," "warp," "war," "wharf," "quarter").
4.) In a word such as "macaroon," the [a] preceding [rV] is assigned
    the pronunciation /æ/ in the phonological rules and is reduced to
    schwa in the stress rules because it is unstressed.
D. SCENARIO          Input
SCENARIO             Result of Stage 1
ss-n-r--             Result of Stage 2
ssɛnærio             Result of Stage 3
sɪnærio              Final result after stress
In Stage 2, we find that the consonant cluster [sc], like the letter
[c], usually has the sound of /s/ preceding [e], [i] or [y]. The letter [r]
does not occur in any context given in its rule sequence and is therefore
given its most general pronunciation. There is only one rule for the
pronunciation of [n].
Moving on to Stage 3, the vowel [e] receives the pronunciation /ɛ/
given by its most general rule. The vowel [a] follows the rule given in the
previous example. The vowel [o] is morph-final and has the feature
[-constricted pharynx], and is lengthened accordingly. Because the vowel [i]
precedes another vowel, it is lengthened also (ref. 13).
Generalizations and Related Rules
1.) The consonant cluster [sc] is given the representation of a
    double phoneme because the information that it is orthographically
    a double consonant is needed both in the vowel rules and
    in the rules for stress. It is later reduced to a single phoneme.
2.) Two other contexts for [sc] must be ordered before the rule which
    applied to "scenario." If [sc] precedes [i] followed by another
    vowel, and certain letters precede [sc], a palatalization
    effect is observed. When preceded by a vowel in this context,
    [sc] becomes /ʃ/, e.g., "prescience"; when preceded by an [n],
    it becomes /ʃ/ or /tʃ/, e.g., "conscious."
3.) The pronunciation [sc] receives in "scenario" is also found
    preceding syllabic [l] (see example B).
4.) If none of the contexts mentioned in 2.) or 3.) are found, the
    phonemic representation of [sc] becomes /sk/.
5.) The reduction of /ɛ/ to /ɪ/ occurs in the stress rules.
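The ordered contexts for [sc] can be summarized in a short Python sketch. The function name and argument layout are hypothetical, the test for syllabic [l] is simplified to a following morph-final "le" (the word "muscle" used to exercise it is an illustration, not an example from the source), and where the source allows either of two palatal pronunciations after [n] only one is returned.

```python
VOWEL_LETTERS = set("aeiou")

def sc_phonemes(left, right):
    """Phoneme string for the cluster [sc].

    left:  letter preceding [sc] ('' if the cluster is morph-initial).
    right: letters following [sc] within the morph.
    """
    before_i_vowel = (len(right) >= 2 and right[0] == "i"
                      and right[1] in VOWEL_LETTERS)
    # Palatalization contexts are ordered first.
    if before_i_vowel and left in VOWEL_LETTERS:
        return "ʃ"     # "prescience"
    if before_i_vowel and left == "n":
        return "ʃ"     # "conscious" (the source also allows /tʃ/ here)
    # Double phoneme before [e], [i] or [y]; later reduced to a single /s/.
    if right[:1] in {"e", "i", "y"}:
        return "ss"    # "scenario", "science"
    # Same pronunciation before syllabic [l] (simplified test).
    if right == "le":
        return "ss"
    return "sk"        # no listed context applies

for left, right, word in [("", "enario", "scenario"), ("", "ience", "science"),
                          ("e", "ience", "prescience"), ("n", "ious", "conscious"),
                          ("", "ale", "scale")]:
    print(f"{word:11}", sc_phonemes(left, right))
```

Morph-initial [sc] has an empty left context, so "science" falls through the palatalization tests and receives the double /ss/, in line with the treatment of example F.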
E. SUBVERSION        Input
SUB = VERS + ION     Result of Stage 1
--- = v-rʒ + ---     Result of Stage 2
sʌb = vʌrʒ + ən      Result of Stage 3
sʌb = vɝʒ + ən       Final result after stress (secondary stress
                     over the /ʌ/, primary stress over the /ɝ/)
In this example, the suffix [+ion] and the prefix [sub=] are recognized
in Stage 1.
There is only one pronunciation provided for the consonant [v], and
[r], because it does not fit a specified context for syllabic [r], is given
the standard pronunciation. The letter [s] is followed by the sequence [+iV],
making it a candidate for palatalization. The palatalization rule which
applies assigns the phoneme /ʒ/.
In the final stage of letter-to-phoneme conversion, the affixes and
vowels are considered. The prefix [sub=] has only one possible pronunciation.
The letter [e], because it precedes the sequence [rC] where the
consonant, C, is not an [r], is given the pronunciation /ʌ/. The palatal
phoneme /ʒ/ now forms a left context for the suffix [+ion], which, being
word-final, is pronounced /ən/.
Generalizations and Related Rules
1.) Because [+s] is marked as occurring in word-final position only,
    the [s] preceding [+ion] is not recognized as a suffix. This
    step also prevents the [er] preceding the [s] from consideration
    as a possible suffix.
2.) When an [s] preceding the sequence [+iV] or [iV] is preceded by
    either a vowel or an [r], it is usually pronounced /ʒ/. Some
    examples are "revision," "artesian," "Persian" and "dispersion";
    two exceptions are "controversial" and "torsion." When [s] is
    preceded by [l], and when it occurs as part of the consonant
    cluster [ss], the phoneme preceding the vowel sequence is /ʃ/,
    e.g., "emulsion," "Russian." A third pronunciation is observed
    when [s] is preceded by [n], e.g., "transient," "comprehension."
3.) The sequence /ʌr/ is later changed to /ɝ/.
4.) The sequence [ion] following a non-palatalized consonant is
    pronounced /iən/, e.g., "oblivion," "criterion," "champion."
5.) The suffix [+ion] may be given other pronunciations if not
    morph-final. For example, it is pronounced /+ian/ in "ganglionic"
    and "histrionic."
F. SCIENCE           Input
SCI + ENCE           Result of Stage 1
ss- + ----           Result of Stage 2
ssaɪ + ɛns           Result of Stage 3
saɪ + ɪns            Final result after stress (primary stress
                     over the /aɪ/)
In Stage 1, [+ence] is recognized as a suffix. The consonant cluster
[sc] precedes [i] and is therefore given the pronunciation /ss/, later
changed to /s/ as described earlier. The pronunciation of [i] preceding a
vowel as /aɪ/ is a consequence of its left context being a morph-initial
consonant cluster.
Generalizations and Related Rules
1.) Although [+ience] is a possible suffix, it is not recognized as
    such in this case because of the requirement that at least one
    consonant and one vowel remain in the "root." This stipulation
    forces the correct suffix, [+ence], to be recognized.
2.) Because [sc] is morph-initial, it is not palatalized even though
    it precedes the sequence [iV].
3.) The letter [i] is also pronounced /aɪ/ if it is followed by a
    vowel and is morph-initial, e.g., "iota," "iambic."
NEUTRALIZATIONS
NEUTR + AL + IZE + ATE + ION + S
cycle 1: NEUTR
cycle 2: NEUTR + AL
cycle 3: NEUTR + AL + IZE
cycle 4: NEUTR + AL + IZE + ATE
cycle 5: NEUTR + AL + IZE + ATE + ION
cycle 6: NEUTR + AL + IZE + ATE + ION + S
         (no cycle actually applied because
         +S is in a special stress-exclusion
         category)
Figure 2
Cyclic Rules (First Phase)
Domain of Application
NOTATION
V           vowel
C           consonant
C_i         at least i consonants
C^j         at most j consonants
C_i^j       at least i consonants, at most j consonants
X, Y, Z     variables
weak syllable: a short vowel followed by no more than one consonant
            (a syllable begins with a vowel and terminates (a) immediately
            before the next vowel or (b) immediately before a formative
            boundary if one occurs before the next vowel)
feature:    a feature, e.g., [-long], [1 stress], or a phoneme with
            specified feature(s)
alternation: either A or B or ... or P
( )         optional element; material in parentheses is neglected if
            and only if it does not correspond to context in the word
            under consideration -- word context is compared with rule
            context by first comparing it with the maximum string in
            the rule, i.e., with all parentheses removed, and then by
            ignoring parenthesized material beginning at the innermost
            parentheses and proceeding to the outermost parentheses
domain of rule: formative boundaries of string under consideration for
            cyclic rules, word boundaries for last cycle and for
            non-cyclic rules
subscripts: marking appearance of optional elements
conditional: actual condition given below rule
assignment: assign feature(s) [y] to element X in the context YXZ
assignment with features: assign feature(s) [y] to an element X with
            specified feature(s) [z] in the context YXZ
Figure 3. Notation
Figure 4. Stress Rules - Flow Chart
(Boxes recoverable from the chart, in order: START; strip off all suffixes
for first domain of application; apply Alternating Stress Rule; apply
Strong First Syllable Rule; apply Cursory Rule; apply Vowel Reduction Rules.)
Main Stress Rule (cyclic)
    Conditions: (1) no stress placement to the left of a prefix boundary
                (2) if right-most morph is a suffix, test for special
                    stress placement category; assign [1 stress] or skip
                    cycle according to category
Stressed Syllable Rule (cyclic)
    Conditions: (1) Y contains no primary stress
                (2) no stress placement to the left of a prefix boundary
Alternating Stress Rule (cyclic)
    V -> [1 stress] / [X ___ C_0 (V) ...
Destressing Rule (non-cyclic)
    Conditions: (1) if ( ) is not present, ( ) must be present
                (2) not applied to the first vowel if applied to
                    second vowel
Compound Stress Rule (non-cyclic)
    Conditions: (1) Y contains no 1 stress
                (2) if right-most morph is a suffix, check for special
                    stress retention or stress exclusion category
                    and reassign [1 stress] according to category
Strong First Syllable Rule (non-cyclic)
    V -> [1 stress] / ... C_0 ...
Cursory Rule (non-cyclic)
    Condition:  (1) if right-most morph is a suffix, check for stress
                    exclusion category
Vowel Reduction Rules (non-cyclic)
Figure 5. Stress Placement Rules
I. Lexical Stress Placement 
The stress rules which have been implemented are a modification of 
a set of ordered rules developed by Halle. 
~odifications fall into three 
categories: 
(1) adjustments due to the corrdition that input is completely 
phonemic, (2 reduction of the numbeir of str~ 4s to 1 stress 
(primary) 2 .* stress (stress less tban primary) ~LIU 0 stress, and (3) 
addition of special suffiwdependent stress categories. 
Application of the rules proceeds in two phases. The first phase
consists of the application of three ordered rules which are applied cycli-
cally, first to the root, then to the root and left-most suffix combined.
The process continues with one more suffix adjoined to the string under
consideration before each cycle begins until the end of the word is reached.
This cyclic phase is devoted solely to the placement of primary stress.
The second, non-cyclic phase includes the application to the entire word
of ordered rules and reduces all but one of the primary stress marks to
secondary or zero stress.
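The growth of the cyclic domain described above can be sketched as follows. This is only an illustration: the function name `cyclic_domains`, the use of "+" as a suffix boundary marker, and the sample morph strings are assumptions, and the cycles that are skipped for special suffixes are not modeled.

```python
from typing import Iterator, List


def cyclic_domains(root: str, suffixes: List[str]) -> Iterator[str]:
    """Yield the string seen by the cyclic rules on each successive cycle."""
    domain = root
    yield domain                      # cycle 1: the root alone
    for suffix in suffixes:           # left-most suffix first
        domain = domain + "+" + suffix
        yield domain                  # one more suffix adjoined per cycle


# Hypothetical morph strings, for illustration only.
for cycle, domain in enumerate(cyclic_domains("root", ["a", "b"]), start=1):
    print(cycle, domain)
```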
In the following section, stress placement rules will be given in sym-
bolic form. Each rule which contains more than one case is broken down into
cases for which brief descriptions and examples are given. The rules are
listed in the order in which they apply and are marked either "cyclic" or
"non-cyclic." Particular modifications to each rule will be given at the end
of the discussion about that rule under the subheading Modifications. (See
Figure 3 for an explanation of notation, Figure 4 for a flow chart of the
stress rules and Figure 5 for the complete set of stress placement rules in
linguistic notation.)
Main Stress Rule (cyclic) 
Conditions: (1) no stress placement to the left of a prefix
boundary
(2) if right-most morph is a suffix, test for special
stress placement category; assign [1 stress] or skip
cycle according to category
Case 1. (Maximum string; all parentheses removed) 
(a) Assign 1 stress to the vowel in a syllable preceding a weak 
cluster followed by a morph-final syllable containing a 
short vowel and zero or more consonants: 
1 
difficult dzf fzk~lt 
(b) Assign 1 stress to the vowel in a syllable preceding a weak
cluster followed by a morph-final vowel:
1 
oregano 3rEnmno 
(c) Assign 1 stress to the vowel in a syllable preceding a vowel
followed by a morph-final syllable containing a short vowel
and zero or more consonants:
1 
secretariat s~krtt~ri~ t 
(d) Assign 1 stress to the vowel in a syllable preceding a vowel
followed by a morph-final vowel:
1 
oratorio 3r &t~rio 
Case 2. (Innermost parenthesized string excluded)
(a) Assign 1 stress to the vowel in a syllable preceding a short
vowel and zero or more consonants:
1 
edit ldrt 
bitumen 
1 
bdjtum~n 
(b) Assign 1 stress to the vowel in a syllable preceding a
morph-final vowel:
1 
agenda 
Case 3. (All parenthesized strings excluded) 
(a) Assign 1 stress to the vowel in the last syllable: 
1 1 
stand  stænd        go  go
1 
parole p~rbl hurricane Impken 
(reduced to 2 stress by a later rule) 
Conversion of the Main Stress Rule into algorithmic form is facilitated
by ordering the above cases in the following manner:
Algorithmic Order of Application: (1) If the final syllable is the
only syllable, or if it consists of a long vowel followed by at least one
consonant, the final vowel receives primary stress. Otherwise, (2) if there
are only two syllables, or if the penultimate syllable terminates in more
than one consonant or if it consists of a long vowel followed by at least one
consonant, the penultimate vowel receives primary stress. Otherwise, (3)
the antepenultimate vowel receives primary stress.
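The three-way decision above can be written out as a short sketch. The `Syllable` record and the function name are assumptions made for illustration. A "syllable" here is taken to be a vowel plus the consonants that follow it up to the next vowel, which is one reading of "terminates in". The prefix-boundary and suffix conditions of the rule are not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Syllable:
    long_vowel: bool  # True if the vowel is long (tense)
    coda: int         # consonants between this vowel and the next vowel (or the end)


def main_stress_index(syllables: List[Syllable]) -> int:
    """Return the index of the syllable that receives primary stress."""
    n = len(syllables)
    final = syllables[-1]
    # (1) only syllable, or long vowel followed by at least one consonant
    if n == 1 or (final.long_vowel and final.coda >= 1):
        return n - 1
    penult = syllables[-2]
    # (2) two syllables, or a strong penultimate syllable
    if n == 2 or penult.coda > 1 or (penult.long_vowel and penult.coda >= 1):
        return n - 2
    # (3) otherwise the antepenultimate vowel
    return n - 3


# "difficult": three short vowels followed by 1, 1 and 2 consonants.
difficult = [Syllable(False, 1), Syllable(False, 1), Syllable(False, 2)]
# "agenda": the penultimate vowel is followed by two consonants.
agenda = [Syllable(False, 1), Syllable(False, 2), Syllable(False, 0)]
# "parole": the final vowel is long and followed by one consonant.
parole = [Syllable(False, 1), Syllable(True, 1)]
print(main_stress_index(difficult), main_stress_index(agenda), main_stress_index(parole))
```

With these encodings the sketch selects the first syllable of "difficult", the second of "agenda" and the last of "parole", in agreement with the examples given for Cases 1 to 3.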
Modifications: The presence of the optional vowel immediately preced-
ing another vowel and the presence of the morph-final vowel are necessary
modifications of the Main Stress Rule due to the difficulty of retrieving
the long (tense) pronunciation of a laxed vowel when its orthographic
representation is no longer available.
The Main Stress Rule, as developed by Halle, applies only to roots
which function as nouns and to suffixed forms. However, until parsing
methods are further developed, it will not be possible to take advantage
of known parts of speech.[15] For this reason, the Main Stress Rule is cur-
rently applied to all roots.
The suffixes referred to in Condition (2) fall into two categories. 
Some suffixes are marked to force stress to be placed on either the final or 
the penultimate syllable of the root and suffixes under consideration. This 
placement of stress replaces the MSR on the cycle in which the special 
suffix is the right-most morph. These suffixes are listed below with the 
phonemic representation which actually appears as input. 
Suffix - phonemic input, stress placement and later treatment. Example (stress pattern)
EE - /i/, final-syllable stress, retained by special categorization. trainee (2 1)
EER - /ɪr/, final-syllable stress, retained by special categorization. buccaneer (2 1)
ESCE - /ɛs/, final-syllable stress, retained by special categorization. luminesce (2 1)
ESQUE - /ɛsk/, final-syllable stress, retained by special categorization. arabesque (2 1)
ETTE - /ɛt/, final-syllable stress, retained by special categorization. marionette (2 1)
OON - /un/, final-syllable stress, retained by special categorization. spitoon (1)
SELF - /sɛlf/, final-syllable stress, retained by special categorization. herself (2 1)
FUL - /fUl/, final-syllable stress, later reduced by Compound Stress Rule. bushelful (1 2)
HOOD - /hUd/, final-syllable stress, later reduced by Compound Stress Rule. womanhood (1 2)
IFY - /ɪfaj/, final-syllable stress, later reduced by Compound Stress Rule. humidify (2 1 2)
IZE - /ajz/, final-syllable stress, later reduced by Compound Stress Rule. radicalize (1 2)
OID - /ɔjd/, final-syllable stress, later reduced by Compound Stress Rule. ovoid (1 2)
SHIP - / p, final-syllable stress, later reduced by Compound Stress Rule. friendship (1 2)
ISM - ham/, penultimate-syllable stress, later reduced by special categorization. romanticism (2 1 2)
ARY - / aeri/, / & i, /&rr/, penultimate-syllable stress, reduced by Compound Stress Rule. circumlocutionary (2 1 2)
      penultimate-syllable stress, deleted by Cursory Rule. infirmary (2 1)
ORY - 3, 3, penultimate-syllable stress, reduced by Compound Stress Rule. inhibitory (2 1 2)
      penultimate-syllable stress, deleted by Cursory Rule. refractory (1)
ERY - r, penultimate-syllable stress, reduced by Compound Stress Rule. stationery (1 2)
      penultimate-syllable stress, deleted by Cursory Rule. slippery (1)
ATORY - ti, penultimate-syllable stress, reduced by Compound Stress Rule. systematory (1 2)
ITION - I, penultimate-syllable stress, retained. sedition (1)
IFIC - /ɪfɪk/, /ɪfɪs/, penultimate-syllable stress, retained. specific (1)
      penultimate-syllable stress, reduced by Destressing Rule. specificity (2 1)
IC - /ɪk/, /ɪs/, penultimate-syllable stress, retained. orthographic (2 1)
      penultimate-syllable stress, reduced by Compound Stress Rule. simplicity (2 1)
The other category of suffixes referred to in Condition (2) does not
affect stress; the cycle in which such a suffix is right-most in the domain
is skipped. Later cycles, however, do include the suffix as part of their
domain of application. These suffixes are listed below, and are accom-
panied by examples demonstrating their inclusion in this category.
1 
ABLE: (a) all words terminating in KABLE, e.g., eradicable, 
1 
communicable
1 1 1 1
(b) formidable, noticeable, manageable, knowledgeable
ABLY: as with ABLE above 
1 12 12 
AGE: (a) brigandage, vagabondage, chaperonage
1 2
(b) anecdotage -- at the time Walker's rhyming dictionary
was compiled, the two stresses were interchanged
1 
DOM: (a) bachelordom 
1 1 
(b) words such as Christendom and martyrdom do not supply
evidence; "dom" must be considered a separate syllable,
i.e., the syllable preceding "dom" is not strong
1 2 1 1 
ED: (a) opinionated, talented, shepherded
(b) Exceptions occur in words with no secondary stress and
with primary stress more than two syllables to the left
before the affixation of +ed, e.g.,
1 2 1 2
precedented, interested (in some dialects)
(c) Note that ED has no vowel in its pronunciation if not pre-
ceded by [t] or [d], i.e., it is not a separate syllable.
2 1 2 1
EER: (a) caravaneer, charioteer
EN: There is no evidence of stress change due to the suffixa-
tion of "en"; most words to which it is added are 1-syllable
roots.
1 1 2
ER: countenancer, circumscriber
ESQUE: Raphaelesque, harlequinesque 
1 1 1 
ES: (a) privileges, cartilages, luridnesses
1
(b) impoverishes
(c) Note that ES (and S affixed to morph-final [e]) has no vowel
in its pronunciation if not preceded by s,z,H,E,f, or r.
EST: There is no evidence of stress change due to the suffixation
of "est."
1
ETH: seventieth
FUL: No evidence of stress change
1 2 
HOOD: parenthood 
1 1 
IBLE: (a) eligible, intelligible
1 1
(b) words such as putrescible and fermentescible are not
exceptions; the verbal ending ESCE always carries primary
stress
IBLY: as with IBLE above
1 1
ILE: replicatile, fluviatile
1 1
ING: (a) conveyancing, countenancing
(b) Exceptions may occur in those contexts mentioned under ED.
In the case of "countenancing," the syllable consisting
of "en" is generally so reduced that it is imperceptible
as a syllable.
ISH: (adjectival) amateurish, sycophantish 
1 2 1 2
ISM: (a) Pharisaism, Sadduceeism
1 2 2 1 2
(b) invalidism, theatricalism
1 2 1 2
(c) vagabondism, monarchism
1 2 1 2 1 2
IZE: (a) standardize, jeopardize, energize
1 2 1 2 1 2 1 2
(b) radicalize, memorialize, secularize, proselytize
1 1 1
LESS: conscienceless, characterless, objectless
LET: No evidence of stress change due to the affixation of "let."
1 1 1 
LY: particularly, passionlessly, precipitously 
1 1
MENT: (a) Words such as government and sojournment indicate that
MENT should be placed in this category.
(b) Most words of four or more syllables are given alternate
pronunciations corresponding to the placement of MENT in
either this category or in the category of regular stress
placement, e.g.,
1 1 1 1
advertisement / advertisement, medicament / medicament
1 1
NESS: disinterestedness, haphazardness
1 1 1 2
OR: governor, warrantor, incubator
1 1 1
RY: heraldry, wizardry, charlatanry
SELF: No evidence of stress change
1 2
SHIP: (a) umpireship
1 2 1 2 1 2 1 2
(b) advocateship, candidateship, laureateship, cardinalship
SOME: No evidence of stress change 
1 1 
TY: sheriffalty, suzerainty 
1 
1 1 
URE: judicature, triplicature, caricature 
All other suffixes not in the above categories receive stress according
to the general form of the Main Stress Rule.
Stressed Syllable Rule (cyclic) 
Conditions: (1) Y contains no primary 
stress 
(2) no stress placement to 
the left of a prefix 
boundary 
Case 1. (Maximum string: all parentheses removed) 
~4 1 .tress] x c vco [ 1 stress y 3 
(a) Assign 1 stress to the vowel in a syllable preceding a weak
cluster followed by a vowel and any number of consonants which
is followed by the right-most primary-stressed vowel:
- 
oxygenate aks~ j €net 
(stress on final syllable later reduced) 
(b) Assign 1 stress to the vowel in a syllable preceding a vowel
followed by a vowel and any number of consonants which is
followed by the right-most primary-stressed vowel:
stereobate 
01 
st~riobet 
(stress on final syllable later reduced) 
Case 2. (Innermost parenthesized string excluded)
V -> [1 stress] / [X __ C0 V C0 [V, 1 stress] Y]
(a) Assign 1 stress to the vowel two syllables to the left of the 
right-most primary-stressed vowel: 
1 1 
prapxg cxnd 3.L 
propaganda 
(stress on left-most stressed vowel later 
reduced) 
Case 3. (Next innermost parenthesized string excluded)
V -> [1 stress] / [X __ C0 [V, 1 stress] Y]
(a) Assign 1 stress to the vowel one syllable to the left of the
right-most primary-stressed vowel, i.e., to the vowel in the
first syllable of the root:
hormone 
'0 1 
h.zrmon 
Case 4. (All parenthesized strings excluded)
V -> [1 stress] / [X __ C0]
(a) Assign 1 stress to the vowel in the last syllable, i.e., to the
vowel in the only syllable of the root:
1
stand  stænd
(Assigning 1 stress to a vowel which already carries 1 stress
has no effect unless the rule specifies, as in the Compound
Stress Rule, that the vowel must previously be 1-stressed.)
Algorithmic Order of Application: (1) If the right-most syllable
containing primary stress is the left-most syllable in the word, no stress
is assigned. Otherwise, (2) if the syllable preceding the right-most
stressed syllable is the only syllable preceding it, assign primary stress
to the vowel in that syllable. Otherwise, (3) if the second syllable to the
left of the right-most stressed syllable is the left-most syllable, or if it
terminates in more than one consonant or consists of a long vowel followed
by at least one consonant, assign primary stress to the vowel in that syll-
able. Otherwise, (4) the vowel in the third syllable to the left of the
right-most stressed syllable receives stress.
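The four steps above can be sketched as a function of the syllable list and the position of the right-most primary stress. The `Syllable` record, the function name and the syllable encoding (a vowel plus the consonants that follow it) are illustrative assumptions, and the prefix-boundary condition is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Syllable:
    long_vowel: bool  # True if the vowel is long (tense)
    coda: int         # consonants between this vowel and the next vowel (or the end)


def stressed_syllable_index(syllables: List[Syllable], stressed: int) -> Optional[int]:
    """Return the index that receives the additional primary stress, or None.

    `stressed` is the index of the right-most syllable carrying primary stress.
    """
    if stressed == 0:            # (1) nothing to the left
        return None
    if stressed == 1:            # (2) exactly one syllable to the left
        return 0
    second = syllables[stressed - 2]
    strong = second.coda > 1 or (second.long_vowel and second.coda >= 1)
    if stressed == 2 or strong:  # (3) second syllable to the left
        return stressed - 2
    return stressed - 3          # (4) third syllable to the left


# "oxygenate": final syllable stressed, second syllable weak.
oxygenate = [Syllable(False, 2), Syllable(False, 1), Syllable(False, 1), Syllable(True, 1)]
# "propaganda": primary stress already on the third syllable.
propaganda = [Syllable(False, 1), Syllable(False, 1), Syllable(False, 2), Syllable(False, 0)]
print(stressed_syllable_index(oxygenate, 3), stressed_syllable_index(propaganda, 2))
```

Under these encodings both words receive the additional stress on their first syllable, as in the examples for Cases 1 and 2.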
Modifications: The optional vowel in pre-vocalic position appears in
the Stressed Syllable Rule as well as in the Main Stress Rule. Its presence
prevents words such as "stereobate," "alveolate," and "heliotrope" from
being stressed incorrectly.
The Stressed Syllable Rule, as developed by Halle, places stress on
the final syllable of the non-nouns which have been excluded from the domain
of application of the Main Stress Rule. Words for which the categorization
of noun/non-noun appears to be most useful are those in which a one-syllable
prefix precedes a one-syllable root or bound morpheme, e.g., [permit]N vs.
[permit]V and [insult]N vs. [insult]V, where the noun carries 1 stress on
the prefix and the verb carries 1 stress on the root. Because there are many
more verbs of this sort than nouns, the Stressed Syllable Rule has been
modified to prevent the retraction of stress into a prefix. The effect of
this modification is to produce only the verbal pronunciation of
two-syllable noun/verb pairs. Another, more positive, effect is the correct
placement of stress in verbs such as edit, inhibit and pummel. However,
two-syllable nouns of the form "prefix-root" which have no verbal
counterpart are stressed incorrectly, e.g., empire, inverse. (This
modification will be removed or changed after a parsing algorithm is
incorporated in the system.)
Alternating Stress Rule (cyclic)
Case 1. (Maximum string)
V -> [1 stress] / [X __ C0 (V) V C0 [V, 1 stress] C0]
(a) Assign 1 stress to the vowel three syllables to the left of a
primary-stressed vowel occurring in the last syllable if the
following syllable contains only a vowel:
heliotrope 
(stress in last syllable later reduced) 
Case 2. (Parenthesized string excluded)
(a) Assign 1 stress to the vowel two syllables to the left of a
primary-stressed vowel occurring in the last syllable:
1 1 1
gelatinate Te1~ t'lnet
(stress in first syllable later deleted; stress in last
syllable later reduced)
Algorithmic Order of Application: (1) If there are at least two
syllables preceding a primary-stressed vowel in the last syllable of the
phoneme string, and if the first of these two syllables is composed of more
than a single vowel, place primary stress on the vowel two syllables to the
left of the vowel with primary stress. Otherwise, (2) if there are at
least three syllables to the left, the second of which is composed of a
single vowel, place primary stress on the vowel three syllables to the left
of the vowel with primary stress. Otherwise, (3) no stress assignment is
made.
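A minimal sketch of this ordering follows. It assumes that each syllable is encoded as a vowel plus the consonants that follow it, so that "composed of a single vowel" corresponds to a count of zero following consonants; the function name and the list encoding are illustrative only.

```python
from typing import List, Optional


def alternating_stress_index(codas: List[int], stresses: List[int]) -> Optional[int]:
    """Return the index that receives primary stress, or None.

    codas[i] is the number of consonants following vowel i before the next
    vowel; stresses[i] is the current stress (0, 1 or 2) of vowel i.
    """
    if not stresses or stresses[-1] != 1:
        return None                  # rule needs primary stress in the last syllable
    last = len(stresses) - 1
    if last >= 2 and codas[last - 2] > 0:
        return last - 2              # (1) two syllables to the left
    if last >= 3 and codas[last - 2] == 0:
        return last - 3              # (2) three syllables to the left
    return None                      # (3) no assignment


# "gelatinate": every vowel is followed by one consonant.
print(alternating_stress_index([1, 1, 1, 1], [1, 0, 0, 1]))
# "heliotrope": the second vowel directly precedes the third vowel.
print(alternating_stress_index([1, 0, 2, 1], [1, 0, 0, 1]))
```

With these encodings the sketch selects the second syllable of "gelatinate" and the first syllable of "heliotrope", matching Cases 2 and 1 respectively.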
1 2 1 2
Exceptions: Note that words such as peregrinate, oxygenate and
1 2
metropolitanate, which are correctly stressed by the Stressed Syllable Rule,
are stressed incorrectly thereafter by the Alternating Stress Rule.
Modification: The optional vowel in pre-vocalic position appears
in the Alternating Stress Rule as well as in the Main Stress and Stressed
Syllable rules.
Proposed Modification: The restriction of the Alternating Stress
Rule to words in which a prefix boundary does not precede the final primary-
stressed syllable could be constrained to verbs. Such a constraint would
provide the correct stress placement in nouns and adjectives such as
1 2 1 2 1 2 1 2
multiform, contraband, intercept and miniskirt while retaining correct
2 1 2 1 2 1
stress placement in the verbs intercept, contradict and comprehend. Such
a modification would require moving the Strong First Syllable Rule in
Halle's scheme to follow the Compound Stress Rule, assigning [2 stress] in
the same context in which [1 stress] was previously assigned. This modifi-
cation has already been implemented in this program for independent reasons,
and is discussed under the heading Modifications in the Strong First
Syllable Rule.
Destressing Rule (non-cyclic, applicable to all vowels having required
context)
Conditions: (1) if ( )a is not present, ( )b must be present
(2) not applied to first vowel if applied to second vowel
Case 1. (Vowel to be reduced not in first syllable) 
+tS 3 
(a) Shorten and destress any vowel not in the first syllable which
is followed by a single consonant and a stressed vowel:
1 sol 
instrumental Tnstrumf nt ~1 
( /u/ reduced to /U/, later to /a/) 
Case 2. (Vowel to be reduced is in first syllable) 
(a) Destress a non-long vowel in the first syllable which is
followed by a single consonant and a stressed vowel:
flO1 1 
gelat-inate jrl~ trnet 
Algorithmic Application: (1) If a vowel not in the first syllable
is immediately followed by one consonant and a vowel which has previously
been assigned primary stress, shorten (lax) it if it is long, and remove
any stress it has been assigned. (2) If a short vowel is in the first
syllable, and is immediately followed by one consonant and a vowel which has
previously been assigned primary stress, and if (1) does not apply to the
vowel in the second syllable, remove any stress that has been assigned to
the vowel in the first syllable.
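The two steps can be sketched as below. The `Syllable` record and function name are assumptions for illustration, and the contexts are evaluated against the stresses as they stood before the rule applied, which is one possible reading of "applicable to all vowels having required context".

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class Syllable:
    long_vowel: bool  # True if the vowel is long (tense)
    coda: int         # consonants between this vowel and the next vowel (or the end)
    stress: int       # 0, 1 or 2


def destress(syllables: List[Syllable]) -> List[Syllable]:
    """Apply the Destressing Rule and return a new syllable list."""
    n = len(syllables)

    def before_primary(i: int) -> bool:
        # exactly one consonant, then a vowel that carries primary stress
        return i + 1 < n and syllables[i].coda == 1 and syllables[i + 1].stress == 1

    out = [replace(s) for s in syllables]
    for i in range(1, n):                       # (1) vowels not in the first syllable
        if before_primary(i):
            out[i].long_vowel = False
            out[i].stress = 0
    # (2) short first vowel, only if (1) does not apply to the second vowel
    if n > 0 and not syllables[0].long_vowel and before_primary(0) and not before_primary(1):
        out[0].stress = 0
    return out


# "gelatinate" after the cyclic rules: 1 stress on syllables 1, 2 and 4.
gelatinate = [Syllable(False, 1, 1), Syllable(False, 1, 1),
              Syllable(False, 1, 0), Syllable(True, 1, 1)]
print([s.stress for s in destress(gelatinate)])
```

For the "gelatinate" encoding the first-syllable stress is removed and the stresses on the second and last syllables are kept, as in the Case 2 example.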
Modification: The single required consonant preceding the primary-
stressed vowel has been changed from C_0^1 (zero or one consonant) to C
(one consonant) so that pre-vocalic vowels are not shortened.
Compound Stress Rule (non-cyclic) 
This rule, as developed by Halle, applies to both compounds and
non-compounds. As it applies to words converted by letter-to-phoneme rules
in the program, and therefore to non-compounds only, its effect is to locate
the primary stress which is to be retained. All other primary stress is
reduced to secondary. Halle has used the Nuclear Stress Rule for both phrase-
level stress and the reduction of secondary to tertiary stress in lexical
items. Neither is necessary in this algorithm; the Nuclear Stress Rule has
therefore been omitted.
Conditions: (1) Y contains no 1 stress
(2) if right-most morph is a suffix, check for special
stress retention or stress exclusion category and
reassign [1 stress] according to category
Case 1. (Maximum string) 
v 
[I stress]+Il stress, / EX - wco [-sti:6s] a 
(a) Retain 1 stress on a vowel if it is followed by at least one
syllable and a word-final unstressed /i/. Reduce all other
1 stress to 2 stress:
legendary
Case 2. (Innermost parenthesized string excluded)
r v 7 
I1 stress)~[l stress] / a x YVC~~ 
(a) Retain 1 stress on a vowel if it is followed by at least one 
syllable. Reduce all other 1 stress to 2 stress: 
1 f12 P* 1 
hurricane h ~r rten gastritis g~stra trs 
iv 1 
trinitarian trmrtXri3n 
Case 3. (All parenthesized strings excluded) 
st~essj-t~ stress] /EX Y tij 
(a) Retain 1 stress on the only vowel to which it has been assigned:
1
stand  stænd
1 
edit ~d~t 
1 
difficult drf f rkdt 
Algorithmic Order of Application: (1) If primary stress occurs only
once, no changes are made. Otherwise, (2) if the right-most vowel with
primary stress is followed by at least one more syllable, the right-most of
which is not composed of an unstressed /i/, it retains primary stress and
all other primary stress is reduced to secondary. (3) If the right-most
vowel with primary stress is (a) the right-most vowel in the word, or (b)
the right-most vowel with the exception of a final syllable composed of an
unstressed /i/, the first primary-stressed vowel to its left retains primary
stress and all other primary stress is reduced to secondary.
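The selection of the retained primary stress can be sketched as follows. The function name and the list encoding are illustrative assumptions, and the suffix categories of Condition (2) are not modeled. One case is not spelled out in the algorithmic statement: a right-most primary stress followed by two or more syllables, the last of which is an unstressed /i/. The sketch assumes, following the description of Case 1, that the stress is retained there.

```python
from typing import List


def compound_stress(stresses: List[int], final_is_unstressed_i: bool = False) -> List[int]:
    """Keep one primary stress and reduce every other primary stress to 2."""
    primaries = [i for i, s in enumerate(stresses) if s == 1]
    if len(primaries) <= 1:
        return list(stresses)                 # (1) nothing to reduce
    rightmost = primaries[-1]
    following = len(stresses) - 1 - rightmost
    if final_is_unstressed_i and following > 0:
        following -= 1                        # a final unstressed /i/ does not count
    # (2) keep the right-most primary if a syllable still follows it,
    # (3) otherwise keep the first primary to its left
    keep = rightmost if following > 0 else primaries[-2]
    return [(1 if i == keep else 2) if s == 1 else s for i, s in enumerate(stresses)]


print(compound_stress([1, 0, 1]))                                   # e.g. hurricane
print(compound_stress([1, 1, 0]))                                   # e.g. gastritis
print(compound_stress([1, 0, 1, 0], final_is_unstressed_i=True))    # e.g. legendary
```

The stress lists passed in are the assumed outputs of the earlier rules for the three example words; the sketch leaves primary stress on the first, second and first syllable respectively.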
Modifications: As mentioned previously, input to the stress rules
from the letter-to-phoneme program does not include compounds. The part
of the rule designed for compounds is, therefore, omitted.
This rule formerly contained the letter [y] instead of the phoneme /i/;
/i/ has been substituted due to unavailability of the original orthography.
The suffixes referred to in Condition (2) fall into two categories.
Those suffixes discussed under Condition (2) of the Main Stress Rule which
do not affect stress placement are excepted from the domain of the Compound
Stress Rule if they are either word-final or precede another word-final
suffix in the same category.
The other category of suffixes is marked for special stress retention.
The following suffixes retain primary stress in word-final position under
Condition (2) of the Compound Stress Rule:
2 1 2 1
EE: trainee, legatee
2 1 2 1
EER: buccaneer, engineer
2 1 2 1
ESCE: luminesce, acquiesce
2 1 2 1
ESQUE: arabesque, Romanesque
2 1 2 1
ETTE: marionette, majorette
2 1 1
OON: macaroon, baboon
The following suffix does not retain primary stress on the penulti-
mate syllable under Condition (2) of the Compound Stress Rule:
1 2 2 1 2
ISM: Babism, Romanticism
Note: This categorization is equivalent to the statement that
syllabic [m] does not function as a syllable in morph-final
position. The same stress pattern appears in words ending
in [ithm], although it is not included here as a suffix, e.g.,
1 2 1 2
logarithm, algorithm.
The same categorization should be extended to morph-final
syllabic [l]. However, it does not function as a suffix,
e.g.,
1
corpuscle
The original set of stress rules included the Trisyllabic Shortening
Rule at this point in the ordering. The rule was stated as follows:
Condition: (1) does not apply to /u/
Test results indicated mispronunciations arising from its application. A
study[16] was undertaken to determine the usefulness of this rule and to
uncover problem areas which might lead to a more proper resolution of
observed effects for which the Trisyllabic Shortening Rule was formulated.
It was found that a restatement of phonological rules, including the require-
ment of a short vowel in a one-syllable root preceding a single consonant
and certain suffixes, obviated the need for the Trisyllabic Shortening Rule
in the set of stress rules.
Strong First Syllable Rule (non-cyclic) 
Condition: (1) ( )a or ( )b must be present
Case 1. (Maximum string) 
(a) Assign 2 stress to the vowel in the first syllable if it is long
and followed by at least two consonants:
0 11 
hydrosanitation h adros~nrteJan 
Case 2. (First subscripted optional string excluded) 
~+\2 stress] / KC/* CO 
(a) 
Assign 2 stress to the vowel in the first syllable if it is 
followed by at least two consonants: 
1 1 1 
Case 3. (Second subscripted optional string excluded)
V+CZ stress] / C, [z, 1 Y 3 
(a) Assign 2 stress to the vowel in the first syllable if it is long:
dielectric 
011 
dajElcktnk 
Algorithmic Application: If the first syllable is strong, i.e., if
it contains either a long vowel or two or more consonants, assign the vowel
secondary (2) stress, as stated in the cases above.
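The test for a strong first syllable can be sketched in a few lines. The sketch follows the case descriptions, which assign 2 stress, and it additionally assumes that a first vowel that already carries stress is left unchanged; the text does not state that restriction. The function name and parameters are illustrative.

```python
def strong_first_syllable(long_vowel: bool, coda: int, stress: int) -> int:
    """Return the stress of the first vowel after the rule has applied.

    coda is the number of consonants following the first vowel.
    """
    strong = long_vowel or coda >= 2
    if strong and stress == 0:   # assumption: an existing stress is not overwritten
        return 2
    return stress


print(strong_first_syllable(long_vowel=True, coda=0, stress=0))   # long first vowel
print(strong_first_syllable(long_vowel=False, coda=1, stress=0))  # weak first syllable
```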
Modifications: This rule has been extended to include both the first
syllable of the root and the first syllable of the left-most prefix.
This rule has been moved to follow the Compound Stress Rule to pre-
vent the retention of primary stress in prefixes by the Compound Stress Rule
in words such as recruit and intend.
Cursory Rule (non-cyclic)
Condition: (1) if right-most morph is a suffix, check for stress
exclusion category
Algorithmic Application: (only one case of the Cursory Rule) The 
vowel following the primary-stressed vowel, if it is not the last vowel in 
the word, is shortened and its stress removed. 
Examples: infirmary (2 1), cursory, curative (/e/ is shortened and later
reduced to /ə/)
Modifications: Pre-vocalic vowels are not shortened.
The suffixes discussed under the Main Stress Rule which do not affect
stress placement are excepted from the domain of the Cursory Rule if they
are either word-final or precede another word-final suffix in the same cate-
gory.
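The single case of the Cursory Rule can be sketched as below. The `Vowel` record and function name are illustrative assumptions. The sketch also assumes that a pre-vocalic vowel still loses its stress and only keeps its length, and it does not model the suffix exclusion category.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class Vowel:
    long: bool                # True if the vowel is long (tense)
    stress: int               # 0, 1 or 2
    prevocalic: bool = False  # True if the vowel directly precedes another vowel


def cursory_rule(vowels: List[Vowel]) -> List[Vowel]:
    """Shorten and destress the vowel after the primary-stressed vowel."""
    out = [replace(v) for v in vowels]
    for i, v in enumerate(vowels):
        if v.stress == 1:
            j = i + 1
            if j < len(vowels) - 1:          # the following vowel is not the last one
                out[j].stress = 0
                if not out[j].prevocalic:    # pre-vocalic vowels are not shortened
                    out[j].long = False
            break
    return out


# "infirmary" after the Compound Stress Rule: stresses 2 1 2 0.
infirmary = [Vowel(False, 2), Vowel(False, 1), Vowel(False, 2), Vowel(True, 0)]
print([v.stress for v in cursory_rule(infirmary)])
```

For the "infirmary" encoding the reduced penultimate-syllable stress is deleted, in line with the entry for that word in the suffix list.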
Vowel Reduction Rule (non-cyclic, applicable to all vowels having required
context)
Case 1. (reduction of /ɛ/ and /ɪ/)
21 
ptolemaic 
Case 2. (reduction of other short non-stressed vowels)
1 
curator ( 1x1 +la/,, 131 + /a/) 
Algorithmic Application: All non-long unstressed vowels are
reduced, /ɛ/ and /ɪ/ to reduced /ɪ/, and all others to /ə/.
Modification: The phonemes /ɛ/ and /ɪ/ are reduced to reduced /ɪ/ rather
than to /ə/.
A Stress-Dependent Letter-to-Phoneme Rule
The rule which follows appears to be stress-dependent and was placed
in the stress placement section rather than with other letter-to-phoneme
rules:
Rule: The phoneme /t/ is changed to /č/ and the phoneme /d/ to /ǰ/
if it is not in the initial consonant cluster and precedes unstressed /u/
or /U/, or if it precedes unstressed /ə/ which was /u/ or /U/ before applica-
tion of stress placement rules.
2 1 2 1
Example: perpetuity (/t/) perpetual (/č/)
Examples (showing words which do not fit the context of the rule and
therefore retain /t/ or /d/ as pronunciation):
1 1 1 1 1
(a) tutor, duty, studious, duration, tureen
In these cases, the /t/ or /d/ is in the initial consonant
cluster.
2 1 1 1
(b) adumbration, modus, status
The /t/ or /d/ in these cases is not in the initial consonant
cluster, but precedes unstressed /ə/ which was not /u/ or
/U/ before application of stress rules.
2 1 2 1 1 1 2
(c) institution, centurion, Hindu, constitute
In the above cases, /t/ or /d/ is not in the initial conso-
nant cluster and precedes stressed /u/ or /U/.
The stress program has been modified to effect this change. The
phonemes /t/ and /d/ preceding unstressed /u/ or /U/ not in the first syl-
lable are changed following the cyclic rules which place all stress. After
the Destressing Rule and the Cursory Rule, a change is also made if the
destressed (and possibly shortened) vowel was previously a /u/ or /U/ and
not in the first syllable.
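The context test of this rule can be sketched as a small function. The ASCII symbols used here ("ch" and "j" for the two affricates, "@" for the reduced vowel, "^" for some other vowel) are stand-ins chosen for the sketch, not the phonemic alphabet of the system, and the function name and parameters are likewise hypothetical.

```python
from typing import Optional

AFFRICATE = {"t": "ch", "d": "j"}   # ASCII stand-ins for the two affricates


def palatalize(consonant: str, in_initial_cluster: bool, next_vowel: str,
               next_is_stressed: bool, original_vowel: Optional[str] = None) -> str:
    """Return the phoneme to use for /t/ or /d/ before the given vowel.

    original_vowel is the vowel as it was before the stress rules applied;
    when omitted it is taken to be the same as next_vowel.
    """
    if consonant not in AFFRICATE or in_initial_cluster or next_is_stressed:
        return consonant
    original = next_vowel if original_vowel is None else original_vowel
    if next_vowel in ("u", "U"):
        return AFFRICATE[consonant]
    if next_vowel == "@" and original in ("u", "U"):
        return AFFRICATE[consonant]
    return consonant


print(palatalize("t", False, "u", False))        # as in "perpetual"
print(palatalize("t", False, "u", True))         # as in "perpetuity"
print(palatalize("t", True, "u", True))          # as in "tutor"
print(palatalize("t", False, "@", False, "^"))   # as in "status"
```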
A Complete Example
[The phonemic strings of this worked example are not legible in the source; the sequence of steps is retained.]
Input
Result of Stage 1
Result of Stage 2
Result of Stage 3
Main Stress Rule, cycle 1
(domain: multi=nukliol)
Stressed Syllable Rule, cycle 1
Alternating Stress Rule, cycle 1
Main Stress Rule, cycle 2
(domain: multi=nukliol+et)
Stressed Syllable Rule, cycle 2
Alternating Stress Rule, cycle 2
(There are no further cycles
since Condition (2) of the MSR
applies to +ED)
Strong First Syllable Rule
Destressing Rule
Compound Stress Rule
Vowel Reduction Rule
Final Result
IV Reliability 
Two studies have been made to determine the accuracy of phonolo-
gical and stress placement rules and to select a minimal set of rules
which will produce accurate results in as many cases as possible. The
set of letter-to-phoneme rules used in the first testing procedure
contained 534 rules: included were 127 consonant rules, 46 prefix rules
(giving pronunciations for 40 prefixes), 155 suffix rules (covering 96
suffixes) and 206 vowel rules. The Trisyllabic Shortening Rule was
included in the set of stress rules. A sample of 4,725 words from the
Brown Corpus was tested with the following results.
                           Number of Words    Percentage given
                                              acceptable pronunciation
7-letter words                  1,174               73
12- to 21-letter words            637               65
An acceptable pronunciation is one which is given in Webster's
Third International Dictionary,[17] either preferred or alternate. Of the
2,375 1- to 5-letter words which received acceptable pronunciations,
2,135 were given preferred pronunciations, 228 were given alternate pro-
nunciations and 12 received the verbal pronunciation of noun/verb pairs.
A table of frequency of use and statistical accuracy of each rule
was derived from this study. These results led to the removal of the
Trisyllabic Shortening Rule and to the formulation of eight sets of
phonological rules ranging from a maximal set of 557 rules to a minimal
set of 277 rules.
In the second study, these eight sets of rules were each applied to
a new group of test words which was composed of a random sampling of six-
letter words from the Brown Corpus (250 words), the Heritage English
Dictionary (150 words) and Stedman's Medical Dictionary (100 words).
Results of this study are as follows:
Number of Rules    Percentage given acceptable pronunciation
                   Heritage    Brown Corpus    Stedman's
557                   73            69             65
531                   73            69             64
453                   72            69             64
413                   72            67             53
359                   70            65             49
308                   68            64             43
286                   67            63             44
277                   66            62             43
Note: The addition of special medical prefixes would increase the accuracy
of rules applied to the sample from Stedman's Medical Dictionary by approxi-
mately ten per cent.
The set of rules currently being used in the text-to-speech system
is the set containing 413 rules. A list of the maximal set of 557 rules
together with instructions for extracting the other sets of rules is given
in the appendix.
There are a number of problem areas, many of which derive from the
lack of a lexicon. Problems of this type include incorrect suffix or prefix
recognition and the treatment of compounds as single morphs. Some examples
from each problem area are given below:
Mispronunciation of single vowel:
international - /e/        modeled - /e/
menu - /u/                 strategically - /ɛ/
environmental - /ɪ/        buried - /PI
hotels - /a/               two -
The pronunciations of the underlined vowels in the contexts above
are encountered infrequently and, in most cases, are not predictable. In
the word international, the context which determines the pronunciation of
[a] is the right-hand context [C+iV]. A long vowel almost always is found in
this context, as in nation, station, explanation, observational, sensational.
A short [e] is usually found preceding [ic], e.g., malefic, angelic, systemic,
photogenic, and is long only in a few words, e.g., strategic, scenic, and
the suffix -plegic. There are very few words ending in the vowel [u];
most are either low frequency words or proper names. The palatalization
in menu is not found in other words with final [u], e.g., flu, emu, gnu,
impromptu. The word two is very irregular in pronunciation. Most words
ending in [o] such as go, no, so, calico, echo have the sound /o/. It may
be noted, however, that two other words which, like two, are very high fre-
quency words, have the same pronunciation of final [o] as two, i.e., do, to.
The mispronunciation of the [e] in modeled is due to the assumption, lacking
a lexicon, that the morphemic analysis is model + ed.
Mispronunciation of vowel digraph:
said
break - /i/
forfeit
endowed
shoes
should, would
theirs
The reasons for mispronunciation of the vowel digraphs underlined above
fall into a number of categories. There are very high-frequency words,
said, should and would, which do not follow letter-to-sound rules. Said
may be contrasted with the words laid, maid, paid, and raid; the words
should and would contrast with mould, shoulder and boulder. The sequence
[eir] as in theirs, heir, weir is not found frequently in English, nor is
the sequence [feit] as in forfeit, surfeit and counterfeit. Rules for [ei]
in these two contexts were considered unproductive. Final [oe] in English
is usually pronounced as in oboe, toe and foe; the pronunciation found in
shoe, and also in canoe, is rare. Rules governing the pronunciation of
[ow] (endowed) and [ui] (guitars) are statistically based. Although there are
many words in which similar non-context-dependent pronunciations occur,
e.g., cow, allow, eyebrow and build, guilt, guinea, other pronunciations
are statistically more likely, e.g., those found in shadow, glow, follow
and bruise, juice, nuisance. The pronunciation of break is not predictable:
the word steak has the same digraph pronunciation, but other similar
words such as creak, freak and streak are pronounced like the majority of
words containing the digraph [ea].
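The statistical basis of digraph rules such as those for [ow] and [ui] can
be illustrated with a minimal sketch. It is not the system's rule set: the
name most_likely_pronunciation, the phoneme labels and the tallies are all
hypothetical, chosen only to show how selecting the most frequent
pronunciation gives a default that necessarily mispronounces the minority
words.

```python
from collections import Counter

# Hypothetical tallies of how often each pronunciation of a digraph occurs
# in some word list. The numbers are illustrative, not taken from the study.
DIGRAPH_COUNTS = {
    "ow": Counter({"o": 60, "aw": 40}),  # shadow, glow vs. cow, allow
    "ui": Counter({"u": 55, "I": 45}),   # bruise, juice vs. build, guilt
}


def most_likely_pronunciation(digraph):
    """Return the most frequent pronunciation recorded for a digraph."""
    counts = DIGRAPH_COUNTS[digraph]
    pronunciation, _ = counts.most_common(1)[0]
    return pronunciation


default_ow = most_likely_pronunciation("ow")
default_ui = most_likely_pronunciation("ui")
```

With tallies of this shape, a word such as endowed receives the majority
pronunciation of [ow] and is therefore mispronounced, which is the kind of
error listed in this category.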
Mispronunciation of single consonant:
of - /f/                   corps - /p/
eager - /j/                exhaust - /h/
two - /w/                  physiological - /s/
deserts - /s/              schizophrenic - /z/
The consonants underlined above are either silent or have unpredictable
or unusual pronunciations. Silent consonants are found in two, corps
and exhaust. The word two is a high-frequency word in which both the [w] and
[o] have unusual pronunciations. Silent [w] is rare, although it is also
found in the word sword. Final silent [p], as found in corps, is also rare.
(This word is considered in this section because the pronunciations of both
the [r] and the [p] are determined by rules for single consonants.) There are
a few words, like exhaust, in which [h] is silent following [ex], e.g., exhibit,
exhilarate, exhort and exhume. However, this rule is not sufficiently
productive to merit inclusion.
The letter [g] preceding [e], [i] and [y] in English usually has a soft sound
as in integer and wager. In particular, many words ending in [ger] are a
combination of a root with final [e] and the suffix [er], e.g., manager,
merger, all of which have a soft [g] sound. The pronunciation of the [g] in
eager is unusual and not predictable. Another pronunciation which is
frequently unpredictable, i.e., not context-dependent, is that of the letter [s]
between vowels. The rule for this context predicts the more frequent sound
/s/, whereas the sound /z/ is found in deserts and physiological. The
letter [z] in schizophrenic has the rare pronunciation /ts/, and the word of,
as previously discussed, is the only English word in which a final [f] is
pronounced /v/.
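Context rules for single consonants of the kind described in this section
can be sketched as follows. This is only an illustration under stated
assumptions: the function names g_sound and s_sound, the one-letter
phoneme labels and the treatment of vowels as the five letters a, e, i, o, u
are hypothetical simplifications, not the rules of the system itself.

```python
VOWELS = set("aeiou")  # simplifying assumption: [y] is not treated as a vowel


def g_sound(word, i):
    """Soft /j/ when the [g] at index i precedes e, i or y; hard /g/ otherwise."""
    following = word[i + 1:i + 2]  # empty string when [g] is word-final
    return "j" if following in ("e", "i", "y") else "g"


def s_sound(word, i):
    """Return /s/ for an [s] between vowels; None for contexts not covered here."""
    between_vowels = (
        0 < i < len(word) - 1
        and word[i - 1] in VOWELS
        and word[i + 1] in VOWELS
    )
    return "s" if between_vowels else None


regular = g_sound("wager", 2)      # soft [g], as expected
exception = g_sound("eager", 2)    # also soft: the rule cannot see the exception
intervocalic = s_sound("deserts", 2)
```

Because both rules look only at neighboring letters, eager receives the
same soft [g] as wager, and the [s] of deserts receives /s/; neither error can
be avoided without information beyond the letter context.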
Mispronunciation of consonant cluster:
chef - /č/                 laugh - /-/
would, should - /ld/       cliches - /č/
calf - /lf/                issue - /s/
tsar - /ts/                these - /θ/
Consonant clusters are infrequently mispronounced. The cluster [ch]
is the most frequent problem in this category, its pronunciation being
determined, in many cases, by the Greek or Latin origin of the word in
which it appears. The pronunciation /ʃ/ as in chef and cliches is less
frequent than either /č/, e.g., church, or /k/, e.g., chemical. Morph-final
[gh] may be pronounced either /f/ as in laugh, enough and cough or,
with slightly higher probability, not pronounced, as in high, weigh and
dough.
Unusual and rare pronunciations of the clusters ts, ss, ld and lf
are found in other words above. The pronunciation of [ss] as /ʃ/ is
found preceding certain suffixes, e.g., depression, fissure, but rarely
within a morph (issue, above). Russian orthography is still reflected
in the English spelling of tsar even though the pronunciation has been
Anglicized. A silent [l] appears in could, would and calf. The words
could and would are high-frequency words and also differ from regular
pronunciation in the vowel digraph [ou]. Although half, like calf, also has
a silent [l], in most words a final [lf] is pronounced /lf/, e.g., elf, shelf,
self, gulf.
The high-frequency-word pronunciation of morph-initial [th] as
/ð/, e.g., these, then, the, has been discussed previously.
Incorrect suffix recognition:
water
thus
relying - /ril+i+ɪŋ/
heated - /hi+et+ɪd/
disengagement
exist
alas
Almost all problems in this category arise from the lack of a morph
lexicon. Words are pronounced incorrectly because letter strings in a
root which appear to be suffixes are converted to phonemes using rules
for suffix pronunciation. It may be seen in the examples above that a
mistake in morph analysis can cause obvious errors in pronunciation.
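The way suffix stripping without a lexicon goes wrong can be shown with a
minimal sketch. The suffix list, the min_root threshold and the function
name strip_suffixes are hypothetical and far simpler than the system's own
affix rules; the sketch only demonstrates the failure mode described above.

```python
# Hypothetical suffix list; the system's actual list is not reproduced here.
SUFFIXES = ["ment", "ing", "ist", "ed", "er", "us", "as"]


def strip_suffixes(word, min_root=2):
    """Repeatedly peel the longest matching suffix, with no lexicon check.

    Returns (root, suffixes), where suffixes are listed left to right.
    """
    stripped = []
    while True:
        for suffix in sorted(SUFFIXES, key=len, reverse=True):
            if word.endswith(suffix) and len(word) - len(suffix) >= min_root:
                stripped.insert(0, suffix)
                word = word[: -len(suffix)]
                break
        else:
            # No suffix matched on this pass: the analysis is finished.
            return word, stripped


water = strip_suffixes("water")
thus = strip_suffixes("thus")
exist = strip_suffixes("exist")
```

Here water, thus and exist are split into wat + er, th + us and ex + ist, so
the final letters of each root would be pronounced by suffix rules rather
than as part of the root.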
Incorrect prefix recognition:
unit - /an=ɪt/
decimal
cool - /ko=al/
emerald
ream - /ri=am/
encouragement
deem - /di=ɛm/
experimenters
Mistakes in morph analysis produce pronunciation errors, as in the
previous category.
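The effect of a lexicon on this kind of error can be sketched briefly. The
prefix list, the four-word stand-in for a morph lexicon and the function
name split_prefix are hypothetical; the sketch assumes only that words
found in a lexicon are not submitted to prefix analysis.

```python
# Hypothetical data: a few prefixes and a tiny stand-in for a morph lexicon.
PREFIXES = ["un", "co", "re", "de"]
LEXICON = {"unit", "cool", "ream", "deem"}


def split_prefix(word, lexicon=None):
    """Return (prefix, remainder); consult the lexicon first when one is given."""
    if lexicon is not None and word in lexicon:
        return "", word  # known word: no prefix analysis is attempted
    for prefix in PREFIXES:
        # Require at least two letters to remain after the prefix.
        if word.startswith(prefix) and len(word) > len(prefix) + 1:
            return prefix, word[len(prefix):]
    return "", word


without_lexicon = split_prefix("unit")
with_lexicon = split_prefix("unit", LEXICON)
```

Without the lexicon, unit is analyzed as un + it, as in the list above;
with it, the word is left whole, while a word absent from the lexicon is
still analyzed by the prefix rules.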
Non-recognition of prefix:
apart
refer
dissent
hydrocele
thermoinhibitory
berated - /bɛr+et+ɪd/
correspondent
pericardiorrhaphy
There are many technical prefixes which have not been included in
the prefix list. These may be added by a user with particular technical
needs. A few prefixes, such as [a] in apart and [e] in eject, have not been
included because a high error rate would result, i.e., all words beginning
in [a] or [e] would be incorrectly analyzed. In the remaining cases, prefixes
were incorrectly analyzed as part of a root after suffixes were incorrectly
removed. Errors in pronunciation, and particularly in stress, are the result.
Incorrect stress:
Most of the words in this category have unusual stress patterns
which are unpredictable. A comparison with similar words shows the regular
stress pattern:
motel            morsel, gavel, fuel
palette          majorette, curette, suffragette
sonata           vertebra, camera, automata
urea             trachea, azalea, miscellanea
uncomfortable    unaffordable, uncontestable
lunatic          erratic, fanatic, aromatic
renegade         brigade, serenade, marinade
The word selects is stressed incorrectly due to lack of information
concerning its part of speech (cf. discussion of modifications in the Main
Stress Rule).
The results of this study indicate that the letter-to-phoneme system
is quite powerful, even in isolation. When considered in the domain of the
overall text-to-speech system, in which a lexicon is available for
high-frequency words and compounds, the letter-to-phoneme system should be
highly reliable.

References

Allen, J. "Speech Synthesis from Unrestricted Text" in Speech Synthesis,
edited by J. L. Flanagan and L. R. Rabiner, Stroudsburg, 1973.

Allen, J. "Reading Machines for the Blind: The Technical Problems and
Methods Adopted for their Solution," IEEE Transactions on Audio and
Electroacoustics, Vol. AU-21, No. 3, June 1973, 259-264.

Allen, J. "Synthesis of Speech from Unrestricted Text," Proceedings of
the IEEE, Vol. 64, No. 4, April 1976, 433-442.

Hunnicutt, S. "Pronunciation of High-Frequency English Words," Natural
Language Processing Group Memo, M.I.T., 1973.

Kucera, H. and Francis, W. N. Computational Analysis of Present-Day
American English, Providence, R.I., Brown University Press, 1967.

Venezky, R. A Study of English Spelling-to-Sound Correspondences on
Historical Principles, Ann Arbor, 1965.

Walker, J. Rhyming Dictionary of the English Language, London, 1924.

The American Heritage Dictionary of the English Language, Paperback
edition, New York, 1973.

Stedman's Medical Dictionary, Baltimore, 1961.

(N. Chomsky and M. Halle, The Sound Pattern of English (New York, 1968), p. 74;
M. Halle and S. J. Keyser, English Stress (New York, 1971), p. 30)

16. Hunnicutt, S. "Removal of the Trisyllabic Shortening Rule from
Stress," Natural Language Processing Group Memo, M.I.T., 1925.

17. Webster's Third New International Dictionary, Springfield, 1966.
