Proceedings of the 2nd Workshop on Building Educational Applications Using NLP,
pages 77–84, Ann Arbor, June 2005. c©Association for Computational Linguistics, 2005
A Software Tool for Teaching Reading Based on Text-to-Speech  
Letter-to-Phoneme Rules 
 
Marian J. Macchi Dan Kahn 
E-Speech Corporation 
Princeton, NJ 08540 
  
mjm@espeech.com dk@espeech.com 
 
 
 
 
Abstract 
Native speakers of English who are good 
readers can “sound out” words or names 
from printed text, even if they have never 
seen them before, although they may not 
be conscious of the strategies they use.  
No tools are available today that can con-
vey that knowledge to learners, showing 
them the rules that apply in English text.  
We have adapted the letter-to-phoneme 
component of a text-to-speech synthesizer 
to a web-based software system that can 
teach word decoding to non-native speak-
ers of English, English-speaking children, 
and adult learners. 
1 Introduction 
Learning to read a language like English involves 
learning many different operations, including pho-
nemic awareness, word recognition, fluency, ver-
bal comprehension, and expression.  The research 
in this project focuses on the pronunciation aspect 
of reading from the printed page: understanding 
how letters, or graphemes, in words are related to 
sounds, or phonemes.   
Most people recognize that the relationship be-
tween English orthography and phonetic represen-
tation is complex and somewhat arbitrary. 
Although there is significant evidence that phono-
logical information plays an important role in word 
reading (Kayner, Foorman, Perfetti, Pesetsky, and 
Seidenberg, 2001), the precise role of “phonics 
rules”  that would allow a learner to “sound out” a 
printed word has been debated by educators as well 
as by cognitive psychologists, and many versions 
of phonics rules have been discussed by educators.  
A classic paper by Clymer (1963) argued that 
most of the phonics generalizations taught in ele-
mentary school are not valid most of the time. 
Clymer found that for many of the rules, there 
were so many exceptions that the rule had little 
utility as a generalization for teaching learners to 
sound out a word of English.  However, the 
Clymer results do not necessarily mean that phonic 
generalizations are not useful to readers.  Since 
Clymer, there have been many papers that have 
suggested alternate formulations of the letter-to-
phoneme rules for teaching reading.  For example, 
a recent study by Johnston (2001) found one rea-
son that Clymer considered phonics rules to be 
unreliable is because the rules he evaluated were 
too general.  Today, there is no consensus on a set 
of rules, nor does there exist any complete, explicit 
rule system that “decodes” any word or proper 
name of English for learners.  
E-Speech’s letter-to-phoneme (LTP) software, 
developed over many years for text-to-speech and 
speech recognition applications, uses proprietary 
rules to produce pronunciations for any input text. 
We have adapted the LTP software into a proto-
type web-based, interactive online system that tea-
ches word pronunciation by explicitly presenting 
rules for those words/names pronounced according 
to regular rules and by showing exceptions to the 
rules. The system allows students to view families 
of words that obey any given rule and to view 
words with the same letter patterns that obey diffe-
rent rules.  
Our intent is to develop a system that can pro-
vide phonics training for beginning readers, either 
children or adults who are native speakers of En-
77
glish, as well as for nonnative speakers of English 
and language-disabled learners. We envision the 
system either as part of an interactive dictionary or 
general language-teaching package or stand-alone 
as an instructional tool  for teaching word pronun-
ciation.   
A major challenge is to identify rules that are 
useful for learners and to present them effectively. 
We have begun to test our prototype system with 
nonnative speakers of English who were studying 
English as a second or foreign language. Our pre-
liminary results indicate that the software was (1) 
useful in improving nonnative speakers’ pronun-
ciation of English words; (2) effective at teaching 
both “basic” pronunciation rules, such as those 
commonly taught in phonics programs, and some 
novel, proprietary pronunciation rules. 
2 Software Design 
The Word Pronunciation tool allows a student to 
enter any word or name – whether it is in any dic-
tionary or not - and see our set of rules that account 
for its pronunciation.  The screen capture below 
shows the output for the word “photograph”.  The 
student can also hear the word pronounced, either 
normally or syllable-by-syllable.
1
  
 
Figure 1 : Word Pronunciation tool display 
                                                        
1
 The system uses the International Phonetic Alphabet (IPA) to repre-
sent phonetic transcriptions,  because most of our target population, 
adult foreign-born learners of English as a second language, were 
familiar with this alphabet, since it is used in many English learners’ 
dictionaries. 
 
In addition, a student can click on any rule and see 
other words obeying the same rule as well as 
words that are exceptions to the rule.
2
  
The Letter Pattern tool allows a student to enter 
a letter or sequence of letters (ie, a letter pattern) 
and see the rules that apply to that pattern and ex-
ceptions to those rules.  For example, a student 
confused by the fact that "how" and "snow" don't 
rhyme can enter the letter pattern "ow" and view 
the various generalizations (rules) that determine 
the pronunciation of this letter string in different 
contexts, as well as words that don't follow these 
generalizations (exceptions). The software under-
lying this tool allows the user to tailor the output to 
his needs of the moment.  For example, one can 
choose 1-syllable versus multisyllabic words as 
targets for the rules, how many sample words to 
output by default, and how big a vocabulary from 
which to draw words. A simple example of the 
operation of this tool is illustrated in Figure 2.  
 
Figure 2: Letter Pattern tool display 
 
While we developed the Letter Pattern tool for ge-
neral use by learners, we used its underlying search 
engine in exercises designed to diagnose and teach 
pronunciation rules. 
We implemented a framework for a self-paced 
set of exercises that allows the user to work alone 
to diagnose his pronunciation-rule weaknesses and 
learn the rules necessary to correct his errors.  We 
used this framework to assess the effectiveness of 
our rules and system for learners. 
In the typical exercise interaction, the user sees 
a sequence of words and must choose the correct 
pronunciation for each.  He indicates his choice of 
                                                        
2
 In this prototype, we worked on presenting the segmental rules, that 
is, rules for pronouncing phonemes. Although our letter-to-phoneme 
software assigns lexical stress (to indicate which syllable bears pri-
mary stress in polysyllabic words), the stress assignment algorithm is 
quite complicated. Although the algorithm is accurate, it is too com-
plicated for a human to apply for learning.  We also ignored the rules 
for morphological decomposition, such as for analyzing “walking’ 
into “walk” plus “ing” or “snowman” into “snow” plus “man”. 
Words with the letter pattern eigh 
Top 60000 words; Rules for vowel only 
 
1 rule (Common words shown first) 
     eigh →  eɪ    32 words    weight /weɪ t/    
more words 
Exception    aɪ     7 words     height  /haɪ t/    
more words 
Type a word or name: 
photograph                                                      PRONOUNCE IT     .
  
Word: photograph 
 
Pronunciation Rules   Click to see: 
Rule: ph →  f                      more words & exceptions 
Rule: o →  oʊ  in o.V                 more words & exceptions 
Rule: t →  t                                   more words & exceptions 
Rule: o →  ɑ :, Reduction ɑ : →  ə        more words & exceptions 
Rule: g →  g                                  more words & exceptions 
Rule: r →  r                                   more words & exceptions 
Rule: a →  æ                                   more words & exceptions 
Rule: ph →  f                      more words & exceptions 
Pronunciation:  ‘foʊ   tə   græf    
Syllable-by-Syllable :  ‘foʊ     tə       græf    
78
pronunciation by clicking on one of several op-
tions, represented in the International Phonetic Al-
phabet (IPA) or by clicking on a speaker-symbol, 
so that he can hear the options spoken. An example 
of a test item would be the nonsense word “doke”. 
If the user chooses an incorrect pronunciation for 
this word, he is told the correct pronunciation, as 
well as the relevant rule, which in this case is that 
an 'o' followed by a single consonant followed by a 
final 'e' is pronounced /oʊ /.  The user can choose 
to see actual examples of the rule in action 
(“smoke”, “home”, etc.) and other rules involving 
'o' (eg., the default pronunciation /ɑ :/, as in “hot”). 
In some cases, the exercises tested real English 
words, and in others, “nonsense” words (words 
that do not exist in English but are possible as 
words, because they have letter sequences occur-
ring in English words). Only through knowing the 
general rules of English pronunciation can a stu-
dent correctly predict the pronunciation of words 
he has never seen before. Figure 3 shows the exer-
cise for knowledge of the letter “a” in the nonsense 
word “jate." 
Figure 3. Exercise example 
 
A wrong answer would cause the screen in Figure 
4 to appear, in an attempt to teach the student the 
rule he apparently hadn’t mastered. 
Figure 4. Exercise feedback lesson example 
This “lesson” screen highlights the relevant pro-
nunciation rule in the word.  Because subjects told 
us that our pronunciation rule syntax, derived from 
our LTP rules, was “too mathematical” and hard to 
understand, the screen also displays an English 
language explanation for each rule (e.g., “In the 
letter pattern a – any letter – e, the letter a is pro-
nounced as /eɪ /”). In our prototype, we developed 
a simple text-generation algorithm to translate 
from our “mathematical” rule syntax (“a → /eɪ /, in 
a.e”) into normal English for the rules that we tes-
ted in our evaluation.  Going forward, however, we 
will need to produce the explanations via a more 
sophisticated algorithm or simply hand-prepare 
explanations for the rules. 
A subject can click on the “See all words” link 
to see more English words in which that rule ap-
plies. After the “lesson” the learner is given the 
opportunity to try again, in order to reinforce the 
correct pronunciation.  
The design of the prototype incorporates several 
features that are important to its extension to a full  
learning system.  First, the set of exercises is table-
driven, so that is relatively easy to add a new set of 
exercises.  This feature is important since a com-
plete system will need a large number of exercises. 
Second, the system is designed so that the corpus 
of words that serve as examples of the rules can be 
changed easily.  This feature is important since 
different user groups (e.g., adult nonnative spea-
kers, children, speakers with reading disabilities) 
may require different kinds of words as examples. 
3 Experimental Results 
In addition to developing lexical resource tools, we 
conducted an experiment to determine (1) if our 
software could be useful in teaching nonnative 
speakers of English how to pronounce English 
words, and if so, (2) if both commonly-taught pro-
nunciation rules and pronunciation rules that are 
idiosyncratic to the E-Speech letter-to-phoneme 
system can effectively be taught.  
We considered testing the lexical resource tools 
directly by giving students lists of words and in-
structing them to use the tools to learn the pronun-
ciation rules for the words.  However, we felt that a 
more efficient way of testing our software would 
be to develop a set of exercises to diagnose and 
teach various pronunciation rules and then to test 
how effectively students learned from the exer-
cises. We developed the design of the exercises 
You may need help with the pronunciation rule for a in jate. 
The correct pronunciation for jate is ʤ eɪ t  
Here are the pronunciation rules for all the letters in jate 
      Pronunciation rule  j  → ʤ  
→  Pronunciation rule a  →  eɪ , in a . e 
      Pronunciation rule t  →  t 
      Pronunciation rule e# → (not pronounced) 
 
Let’s look at the rule for pronouncing a in this word: 
→  Pronunciation rule a → eɪ , in a . e 
This rule says: 
In the letter pattern a – any letter – e, the letter a is pronounced as eɪ .
Examples: There are 2049 words in English where this rule applies.  
Here are 5 of them: 
 made meɪ d  
 state steɪ t  
 make meɪ k  
 same seɪ m  
 place pleɪ s    
   see all words 
Let’s get started. Here is item 1 of 22 items. 
Choose the correct pronunciation for this word. 
79
based on informal comments and results of pretests 
with more than 40 nonnative speakers of English. 
 
3.1 Experimental Design 
 
We sought to improve nonnative speakers’ word 
pronunciation competence, aiming toward giving 
them the competence of native speakers of English.  
Therefore, we included both native and nonnative 
speakers as subjects. 10 nonnative speakers of 
English and 7 native speakers of English success-
fully completed the final set of exercises.   Six 
non-native subjects were undergraduates or gradu-
ate students at Montclair State University who had 
been assigned to an English as a Second Language 
course based on their performance on an English 
language test administered by the university.  The 
other four were nonnative speakers of English in 
Brazil, Bolivia, and Germany.  Native languages of 
the subjects were German, Portuguese, Korean, 
Spanish, Polish, Bangla (Bangladesh), and Urhobo 
(Nigeria).  The native English-speaking subjects 
were high school or college students who grew up 
in New Jersey.
3
   
The subjects were assigned logins to the system 
and were instructed to complete a series of exer-
cises, each of which would present different Eng-
lish pronunciation rules.  A subject logged in to the 
system with a web-browser over the internet, saw a 
printed word and a set of possible pronunciations 
for the word (as described above). The student was 
instructed to listen to the set of choices and to 
choose the pronunciation that he thought was cor-
rect.   Subjects were told that each exercise would 
consist of two parts.  The first part of each exercise 
would identify the pronunciation rules with which 
a subject might need help and then teach the rule; 
the second part of the exercise would determine 
whether teaching the pronunciation rules was ef-
fective.  In the teaching part of the exercise, each 
rule was presented several times, as it applied to 
different words.  Subjects were allowed to repeat 
the first part of each exercise as many times as they 
wished, until they felt comfortable about proceed-
ing to the test part of the exercise. 
                                                        
3
 Nonnative subjects were told that we had developed software for teaching 
word pronunciation and we needed nonnative speakers to try the software and 
see if it helped them to improve their word pronunciation. English-speaking 
subjects were told that we had developed software for teaching word pronuncia-
tion to nonnative subjects, and that we needed to compare the students’ per-
formance with native speakers’ performance. 
 
Our software logged the students’ choice for 
each word in each part of each exercise and scored 
it as correct (1) or incorrect (0).  We computed the 
percentage of correct choices, which we call the 
word pronunciation score.  We also logged the 
number of times a student practiced with the diag-
nosis/lesson portion of each exercise, and the 
amount of time a student spent with each item. 
The basic exercises were: 
Basic: 1-syllable nonsense words representing 
“basic” rules, rules that are extremely common in 
English words. These are productive rules (English 
speakers apply them in nonsense words), and rules 
capturing these generalizations are commonly 
taught in phonics programs. Specifically the exer-
cise teaches: 
• a is pronounced /eɪ / in the letter sequence a - any 
letter - e , as in make  
• a is pronounced /æ/ by default, as in cat and analo-
gous rules for the letters e, i, o, u. 
 
The other exercises taught and tested rules from 
the LTP system, using English words rather than 
nonsense words as the material.  These were: 
LTP1: the basic rules for the letter a, plus the 
trisyllabic laxing rule (which we call the 3-syllable 
rule), which causes underlying long vowels and 
diphthongs to shorten to a lax vowel in antepenul-
timate syllables:  
• a is pronounced /eɪ /, in a - any letter - e, as in make  
• a is pronounced /æ/ in a – any letter - e when a is 3 
syllables from the end of a word (the “3-syllable 
rule”),  as in tragedy  
• a is pronounced /æ/,by default, as in cat  
 
LTP2: rules for a before the letter l:  
• a is pronounced /ɔ :/ in all at the end of the word, as 
in ball 
• a is pronounced /ɔ :/, in alt, as in salt  
• a is pronounced /eɪ /, in a - any letter - e, as in sale 
and make  
• a is pronounced /æ/ by default, as in pal and cat  
 
LTP3: rules for the letter a when it is followed 
by the letter r:  
• a is pronounced /ɔ :/ in war  
• a is pronounced /æ/ in arr followed by any vowel , 
as in carry  
• a is pronounced /ɑ :/  in ar at the end of the word, as 
in car, and in ar followed by any consonant, as in 
part  
• a is pronounced /eɪ /, in a - any letter - e , as in care  
 
LTP4: rules for the letter a when it is preceded 
by the phoneme /w/:  
• a is pronounced  /ɔ :/  in war 
80
• a is pronounced /ɑ :/  in /w/a, as in watch and qual-
ity … except … 
• a is pronounced /æ/ in /w/a before the phonemes /k/, 
/g/, /m/, /ŋ /, as in wag, swam, quack, swang 
• a is pronounced /eɪ / in a - any letter - e , as in wake  
 
Since each of these exercises included only several 
rules, the final exercise, LTP-all, recapped the 
other exercises, in order to assess how well stu-
dents could integrate all the rules. 
LTP-all: an integrated exercise:  all the rules for 
the letter a that were presented in the previous ex-
ercises, plus the rule 
• a → /eɪ / in aste, as in paste 
We chose this particular set of LTP rules be-
cause they would allow us to compare the “basic” 
rules, common to many phonics programs, and 
rules in our LTP system that are not taught in 
phonics programs.   All the non-basic rules in our 
experiment were pronunciation rules for the letter 
“a” and were chosen because they applied to many 
English words and represented a variety of formal 
types of rules.  For example, some had relatively 
simple contexts (e.g., “alt”), and some had compli-
cated contexts. For most of the rules, the context 
was specified in terms of the surrounding letters 
(for example, the letter a when followed by any 
letter and the letter e).  For one rule, the context 
was specified in terms of the surrounding pho-
nemes.  This latter type of rule is complicated be-
cause it requires that a learner first identify the 
phonemes for the letters surrounding the target let-
ter “a”. 
In the exercise on “basic rules”, we used non-
sense words to teach and test pronunciation rules.  
Our reasoning was that the strongest test of 
whether a student knows the rules is to test his 
pronunciation of nonsense words, since the only 
way he could possibly know how to pronounce a 
nonsense word is by applying the rules. Further, 
we felt there was strong evidence that the “basic 
rules” are productive in English.  That is, native 
speakers of English know these rules and apply 
them in novel words and nonsense words. For ex-
ample, English speakers pronounce the “a” in non-
sense words with “a” – consonant – “e” at the end 
of a word, (e.g., “pake”, “glape”, “nade”) as /eɪ /. 
Consequently, we felt that teaching and testing 
with nonsense words would help give nonnative 
speakers the same competence as native speakers. 
However, for the other exercises we used real 
English words.  Our reasoning was that the non-
basic rules, although they apply to classes of Eng-
lish words, may not be productive in English. That 
is, native speakers of English might not apply the 
rules to nonsense words, even though the rule gov-
erns a class of existing English words. For exam-
ple, in English, “oo” is most commonly /u:/ (as in 
“coo” and “cool”), but when the “oo” is followed 
by a final “k”, the vowel is almost always pro-
nounced /ʊ / (e.g., “took”, “book”, “cook”, 
“brook”, “crook”, “snook”, though there are a few 
exceptions: “kook”, “spook”).   The question thus 
arises whether native speakers of English, who 
obviously know how to pronounce these words, 
have internalized the “ook rule” and apply it in 
novel words. Native speakers do not always pro-
nounce novel “ook” words with /ʊ /; instead, they 
sometimes use /u:/ in nonsense words, like 
“mook”, “dook”, “vook” (see Treiman et. al., 
2003). Of course, the fact that a rule is not produc-
tive does not mean that it is not useful for teaching 
students how to pronounce words; clearly it would 
be useful for students to know that “ook” is usually 
pronounced /ʊ k/.  However, since we wanted to 
compare nonnatives’ performance to natives’ per-
formance, and we were primarily concerned with 
teaching nonnative speakers how to pronounce real 
English words, we chose to teach and test real, as 
opposed to nonsense, words.
4
   
We did include one test of non-basic-rules that 
used nonsense words, anticipating that native 
speakers might perform differently from nonna-
tives, if the nonnatives, who had been explicitly 
taught pronunciation rules, applied them to non-
sense words, even if natives did not apply them 
productively in nonsense words. 
 
 
3.2 Experimental Results 
 
We present our results informally, without sta-
tistical analysis of significance, primarily because 
we have to date collected data a relatively small 
number of subjects. Consequently, we interpret our 
results as preliminary. 
                                                        
4
 We attempted to choose words that have relatively low frequency-of-
occurrence, to minimize the chances that a nonnative speaker would simply 
know the word. 
 
81
50
55
60
65
70
75
80
85
90
95
100
Native NonNative
Subjects
W
o
r
d
 P
r
onu
nc
i
a
t
i
on S
c
o
r
e
 
(
%
 
co
r
r
e
c
t
)
Before
After
 
Figure 5. Overall Word Pronunciation Scores 
 
Figure 5 shows word pronunciation scores aver-
aged over all subjects and exercises, tabulated as 
“before” (word pronunciation scores before sub-
jects were offered any lessons) and “after” (word 
pronunciation scores from the test parts of the ex-
ercises, after the lessons). As would be expected, 
native speakers had higher word pronunciation 
scores than nonnative speakers.  Further, nonnative 
speakers had higher word pronunciation scores 
after completing the lessons than they did before 
the lessons, although, overall, they did not achieve 
native speakers’ level of word pronunciation.  
Thus, our data suggests that, overall, nonnative 
speakers were able to learn aspects of word pro-
nunciation from our system. 
50
55
60
65
70
75
80
85
90
95
100
Native                                            NonNative
Subject
W
o
r
d P
r
o
nunc
i
a
t
i
on S
c
or
e
 (
%
 co
r
r
ect
)
Before
After
 
Figure 6. Word Pronunciation Scores by Subject 
 
Figure 6 indicates that there was wide variability 
among the subjects.  Some nonnative subjects’ 
scores increased much more than others’, and sev-
eral subjects’ scores did not increase or increased 
only slightly.  Nonnative subjects with higher “be-
fore” scores, in general, did not increase as much 
as the nonnatives with low “before” scores, proba-
bly because their scores were high to start with. 
 
Figure 7. Word Pronunciation Scores by Exercise 
(B=Basic, L1=LTP1, L2=LTP2, L3=LTP3, L4=LTP4, 
La=LTP-all) 
Figure 7 presents the same data, collapsed 
across subjects, for the different exercises, which 
represented different sets of pronunciation rules. 
We wanted to know whether some exercises 
proved more learnable than others. In general, as 
shown at the right side of the Figure 7, for nonna-
tive speakers, for each exercise, the word pronun-
ciation scores were higher after the lessons than 
before, although the effects were greater for some 
exercises than others. For native speakers, in con-
trast, there were no systematic differences in the 
before versus after scores.  However, overall, 
scores were higher for some exercises than for 
other exercises even for native speakers.  Examina-
tion of the native speakers’ “incorrect” responses 
suggests that dialectal issues may have caused 
some native speakers to choose responses that we 
did not anticipate.  For example, for the word 
“waffle”, some native subjects chose the pronun-
ciation /wɔ :fə l/, although we had assumed that the 
pronunciation in these subjects’ dialect was 
/wɑ :fə l/. 
Figures 8 and 9 suggest that some rules were 
useful to nonnative subjects. For example, the "ba-
sic" rules, in general, were effective; the nonsense 
words tested after the lessons elicited higher scores 
than those tested before any lessons.  Of the other, 
letter-to-phoneme-based rules, the "war" rule and 
the 3-syllable rule seemed to be effective (the “be-
fore” bar for “war” is not displayed in Figure 9, 
because the before score was extremely low, only 
20%).   
50
60
70
80
90
100
Native                    Nonnative 
Exercise 
W
o
r
d
 P
r
onunc
i
a
t
i
o
n S
c
o
r
e
 (% c
o
r
r
e
c
t
)
 
After
  B   L1 L2  L3 L4  La     B   L1 L2  L3 L4  La
Before
82
Figure 8. Word Pronunciation Scores for Basic 
Rules 
 
LTP Rules 
(in words)
40
50
60
70
80
90
100
a
rC
/w
/
a
a-
d
e
fa
u
l
t
al
l#
a.
e
al
t
ar
r
V
3-
s
y
l
l
w
a
r
ar
C
/
w/
a
a
-
d
e
fa
u
lt
a
l
l#
a.
e
a
l
t
a
r
rV
3
-s
y
l
l
w
a
r
Native                            Nonnative
Before
After
 
Figure 9. Word Pronunciation Scores for LTP 
Rules 
 
A complicated rule, the /w/a rule (i.e., the rule that 
the letter "a" after the phoneme /w/ is pronounced 
/ɑ :/), appeared not to be useful to nonnative sub-
jects. We found no evidence for the effect of teach-
ing for another rule, the “aste” rule, because all 
nonnative subjects knew the pronunciation of the 
“aste” words before the lessons. However, there 
were differences between subjects. One source of 
this difference is probably due to differences in the 
nonnative subjects’ pre-existing English knowl-
edge; that is  some subjects knew some word pro-
nunciations in advance of the lessons. 
Consequently, for some rules we obtained data for 
only a few subjects. 
As discussed above, since the items in the exer-
cises testing our letter-to-phoneme rules were Eng-
lish words (even though there were not common 
words) how do we know that subjects’ perform-
ance was due to our lessons; perhaps the subjects 
simply knew the words’ pronunciation before par-
ticipating in our experiments? How well do nonna-
tive speakers apply the rules they learned to words 
that we can be sure they have never seen before? 
One of our exercises, LTP-all included a section 
that contained only nonsense words (e.g., “later-
ous”, “plar”, “swarg”, “falt”). 
Integrated Test
(after)
W
o
r
d
s
W
o
r
d
s
N
o
n
W
o
r
d
s
N
o
n
W
o
r
d
s
50
60
70
80
90
100
Native NonNative
Subject
W
o
r
d
 P
r
onunc
i
a
t
i
on 
S
c
o
r
e
 (
%
 
co
rr
ec
t
)
 
Figure 10.  Word Pronunciation Scores for non-
words versus words in the Integrated Rules Test 
LTP-all after lessons 
 
Figure 10 presents the results of the nonsense 
word portion of LTP-all:  pronunciation scores for 
real English words versus nonsense words for non-
native subjects and the analogous scores for native 
speakers, collapsed across subjects.  Although 
there were between-subject differences,  on aver-
age, both sets of subjects had lower scores for non-
sense words than for words.  If subjects based their 
scores for all test items – words as well as non-
sense words – entirely on the word pronunciation 
rules that we included in our exercises, then we 
would expect their scores to be the same for words 
and nonsense words.  Since words have an empiri-
cally correct pronunciation (they are given in a 
dictionary, for example), native speakers may be 
relying on a stored phonological representation for 
the word items.  For nonsense words, however, the 
subjects must rely on rules or other principles.   If 
subjects used rules or principles for pronouncing 
nonsense words different from the ones we ex-
pected, then the nonsense word scores would be 
lower than those for words. However, the pronun-
ciation scores for the nonsense words for the non-
native subjects were higher than that for natives.   
This fact suggests that the nonnatives were, in fact, 
applying the pronunciation rules they had learned 
in our lessons to the nonwords. 
Basic Rules  
40 
50 
60 
70 
80 
90 
Before
 100 
After
  (in nonsense words) 
   a  e  i  o  u  a  e  i  o  u     a  e  i  o  u  a e   i  o  u  
vowel-anyletter–e   vowel default  vowel-anyletter-e   vowel default 
   Native                     Nonnative 
83
In summary, our preliminary results indicate 
that: (1) our software was useful in teaching non-
native speakers of English how to pronounce Eng-
lish words; (2) both “basic” pronunciation rules 
and some novel, proprietary pronunciation rules 
were useful for teaching word pronunciation. 
 
4 Future Research 
 
The major directions for our future research are to 
select and reformulate LTP rules that are useful for 
teaching and then to obtain more user data for 
nonnative speakers and for native-English-
speaking children learning to read. 
First, we intend to produce a complete learner’s 
rule system for English, based on the entire set of 
text-to-speech LTP rules.  Since our text-to-speech 
system contains roughly 800 LTP rules, which we 
believe is too large a number for learners, a signifi-
cant task is to reduce the number of rules.    We 
intend to focus on rules that apply to large numbers 
of words and remove rules that apply to few words, 
relegating the words to which they apply to the 
exceptions dictionary.  We also intend to attempt 
to collapse rules that apply in similar contexts.   
Second, we need to determine whether learners 
can, in general, understand a pronunciation system 
that requires rule ordering. In our system, some 
rules are labeled as “default” rules, meaning that 
they apply if no other rules apply. Consequently, 
learners must know all the rules in order to know 
when to apply the default rule.  If the number of 
rules is too large, learners may need a system in 
which each rule is independently unambiguous.  
Third, we need to recruit more nonnative speak-
ers of English who are good candidates for improv-
ing their word pronunciation skills.  Some of our 
subjects in this experiment had relatively high 
word pronunciation scores before exposure to our 
lessons, so that observing any effect of our lessons 
was inherently limited.  Therefore, we would like 
to recruit more subjects with lower word pronun-
ciation ability, in order to get a better picture of the 
effectiveness of our system.  Can we predict which 
students will have high pronunciation scores and 
which will have low scores based on a student’s 
report of his or her experience in English 
50
60
70
80
90
100
05101520
Years studied English
W
o
r
d
 P
r
o
n
un
ci
at
i
o
n 
S
c
o
r
e (
%
 c
o
r
r
e
ct
)
 
Figure 11. Relation between Word Pronunciation 
Score and Years Studied English 
 
Figure 11 presents the average word pronuncia-
tion scores for each subject at the beginning of the 
exercises, before doing lessons.  As shown, the 
reported length of time a student had studied Eng-
lish was not correlated with word pronunciation 
scores.  Consequently, we cannot depend on a stu-
dent’s reported length of time studying English as 
predictive of his word pronunciation abilities.  In-
stead, we will need to screen prospective subjects 
via pretesting. 
Finally, although we have developed our system 
for nonnative speakers, we would like to test our 
system with native-English-speaking children 
learning to read.  However, it is likely that the user 
interface and corpus of exemplar words will need 
to be different for the child population. 
Acknowledgments 
This work was supported in part by an SBIR grant from 
the National Science Foundation. 

References  
Clymer, T. (1963). The utility of phonic generalizations 
in the primary grades. The Reading Teacher, 50, 182-
187. 
Johnston, F. P. (2001). The utility of phonic generaliza-
tions: Let’s take another look at Clymer’s conclusions, 
The Reading Teacher, 55, 132-143 
Rayner, K., Foorman, B. R., Perfetti, C. A., Pesetsky,  
D., and Seidenberg, M. S. 2001.  How psychological 
science informs the teaching of reading. Psychologi-
cal science in the Public Interest, vol. 2. 2: 31-94. 
Treiman, R, Kessler, B, and Bick, S. 2003.  Influence of 
consonantal context on the pronunciation of vowels: 
a comparison of human readers and computational 
models. Cognition 88: 49-78. 
