Proceedings of the Eighth Meeting of the ACL Special Interest Group on Computational Phonology at HLT-NAACL 2006, pages 11–20,
New York City, USA, June 2006. c©2006 Association for Computational Linguistics
Improving Syllabification Models with Phonotactic Knowledge
Karin M¨uller
Institute of Phonetic Sciences
University of Amsterdam
kmueller@science.uva.nl
Abstract
We report on a series of experiments
with probabilistic context-free grammars
predicting English and German syllable
structure. The treebank-trained grammars
areevaluatedonasyllabificationtask. The
grammar used by M¨uller (2002) serves
as point of comparison. As she evalu-
ates the grammar only for German, we re-
implement the grammar and experiment
with additional phonotactic features. Us-
ing bi-grams within the syllable, we can
model the dependency from the previous
consonant in the onset and coda. A 10-
fold cross validation procedure shows that
syllabification can be improved by incor-
porating this type of phonotactic knowl-
edge. ComparedtothegrammarofM¨uller
(2002), syllable boundary accuracy in-
creases from 95.8% to 97.2% for En-
glish, and from 95.9% to 97.2% for Ger-
man. Moreover, our experiments with
different syllable structures point out that
there are dependencies between the on-
set on the nucleus for German but not
for English. The analysis of one of our
phonotactic grammars shows that inter-
esting phonotactic constraints are learned.
For instance, unvoiced consonants are the
most likely first consonants and liquids
and glides are preferred as second conso-
nants in two-consonantal onsets.
1 Introduction
In language technology applications, unknown
words are a continuous problem. Especially, Text-
to-speech (TTS) systems like those described in
Sproat (1998) depend on the correct pronunciation
of those words. Most of these systems use large pro-
nunciation dictionaries to overcome this problem.
However, the lexicons are finite and every natural
language has productive word formation processes.
Thus, a TTS system needs a module which con-
verts letters to sounds and a second module which
syllabifies these sound sequences. The syllabifica-
tion information is important to assign the stress sta-
tus of the syllable, to calculate the phone duration
(Van Santen et al. (1997)), and to apply phonologi-
cal rules (Kahn (1976), Blevins (1995)). Many au-
tomatic syllabification methods have been suggested
e.g., (Daelemans and van den Bosch, 1992; Van den
Bosch, 1997; Kiraz and M¨obius, 1998; Vroomen
et al., 1998; M¨uller, 2001; Marchand et al., to ap-
pear 2006). M¨uller (2001) shows that incorporat-
ing syllable structure improves the prediction of syl-
lable boundaries. The syllabification accuracy in-
creases if the onset and coda is more fine-grained
(M¨uller, 2002). However, she only incorporates par-
tial phonotactic knowledge in her approach. For in-
stance, her models cannot express that the phoneme
/l/ is more likely to occur after an /s/ than after a
/t/ in English. The information that a phoneme is
very probable in a certain position (here, the /l/ ap-
pears as second consonant in a two-consonantal on-
setcluster)willnotsufficetoexpressEnglishphono-
tactics of an entire consonant cluster. Moreover, she
11
only reports the performance of the German gram-
mar. Thus, we are interested if the detection of syl-
lable boundaries can be improved for both English
and German by adding further phonotactic knowl-
edge to a grammar.
Phonotactic constraints within the onset or coda
seem to be important for various tasks. Listeners in-
deed use phonotactic knowledge from their mother
language in various listening situations. Vitevitch
and Luce (1999), e.g., showed if English speak-
ers have to rate nonsense words how “English-like”
the stimuli are, highly probable phonotactic stimuli
were rated more “English-like” than stimuli with a
lower probability. Speakers make also use of their
phonotactic knowledge when they have to segment
a sequence into words. In a words spotting task,
Weber and Cutler (2006) found evidence that speak-
ers of American English can segment words much
easier when the sequence contains phonotactic con-
straints of their own language.
Beside many perception experiments which show
that phonotactic constraints are useful information,
many different methods have been suggested to
model phonotactic constraints for language tech-
nology applications. Krenn (1997), for instance,
uses Hidden Markov Models to tag syllable struc-
ture. The model decides whether a phoneme be-
longs to the onset, nucleus or coda. However, this
model does not incorporate fine-grained phonotac-
tics. Belz (2000) uses finite state automatons (FSA)
to model phonotactic structure of different sylla-
ble types. We use similar positional features of
syllables. Moreover, Carson-Berndsen (1998) and
Carson-Berndsen et al. (2004) focus on automat-
ically acquiring feature-based phonotactics by in-
duction of automata which can be used in speech
recognition. In our approach, we concentrate on
explicit phonotactic grammars as we want to test
different suggestions about the internal structure of
words from phonological approaches (e.g. Kessler
and Treiman (1997)). We assume, for instance, that
codas depend on the previous nucleus and that on-
sets depend on the subsequent nucleus.
In this paper, we present experiments on a series
of context-free grammars which integrate step by
step more phonological structure. The paper is or-
ganized as follows: we first introduce our grammar
developmentapproach. Insection3, wedescribeour
experiments and the evaluation procedure. The sub-
sequentsection4showswhatkindofphonotacticin-
formation can be learned from a phonotactic gram-
mar. Last, we discuss our results and draw some
conclusions.
2 Method
We build on the approach of M¨uller (2001) which
combines the advantages of treebank and brack-
eted corpora training. Her method consists of four
steps: (i) writing a (symbolic i.e. non-probabilistic)
context-free phonological grammar with syllable
boundaries, (ii)trainingthisgrammaronapronunci-
ation dictionary which contains markers for syllable
boundaries (see Example 1; the pre-terminals “X[”
and “X]” denote the beginning and end of a sylla-
ble such that syllables like [strIN] can be unambigu-
ously processed during training), (iii) transforming
the resulting probabilistic phonological grammar by
dropping the syllable boundary markers1 (see Ex-
ample 2), and (iv) predicting syllable boundaries
of unseen phoneme strings by choosing their most
probable phonological tree according to the trans-
formed probabilistic grammar. The syllable bound-
aries can be extracted from the Syl node which gov-
erns a whole syllable.
(1) Word → X[ Sylone ]X
(2) Word → Sylone
We use a grammar development procedure to de-
scribe the phonological structure of words. We ex-
pect that a more fine-grained grammar increases the
precision of the prediction of syllable boundaries as
more phonotactic information can be learned. In the
following section, we describe the development of a
series of grammars.
2.1 Grammar development
Our point of comparison is (i) the syllable com-
plexity grammar which was introduced by M¨uller
(2002). We develop four different grammars: (ii) the
phonotactic grammar, (iii) the phonotactic on-nuc
grammar (iv) the phonotactic nuc-coda grammar
and (v) the phonotactic on-nuc-coda grammar. All
five grammars share the following features: The
grammarsdescribeawordwhichiscomposedofone
1We also drop rules with zero probabilities
12
to n syllables which in turn branch into onset and
rhyme. The rhyme is re-written by the nucleus and
the coda. Onset or coda could be empty. Further-
more, all grammar versions differentiate between
monosyllabic and polysyllabic words. In polysyl-
labic words, the syllables are divided into syllables
appearing word-initially, word-medially, and word-
finally. Additionally, the grammars distinguish be-
tween consonant clusters of different sizes (ranging
from one to five consonants).
We assume that phonotactic knowledge within
the onset and coda can help to solve a syllabifica-
tion task. Hence, we change the rules of the syl-
lable complexity grammar (M¨uller, 2002) such that
phonotactic dependencies are modeled. We express
the dependencies within the onset and coda as well
as the dependency from the nucleus by bi-grams.
2.1.1 Grammar generation
The grammars are generated automatically (using
perl-scripts). Asallpossiblephonemesinalanguage
are known, our grammar generates all possible re-
write rules. This generation process naturally over-
generates, which means that we receive rules which
will never occur in a language. There are, for in-
stance, rules which describe the impossible English
onset /tRS/. However, our training procedure and
our training data make sure that only those rules will
be chosen which occur in a certain language.
The monosyllabic English word string is used as
a running example to demonstrate the differences
of the grammar versions. The word string is tran-
scribed in the pronunciation dictionary CELEX as
([strIN]) (Baayen et al., 1993). The opening square
bracket, “[“, indicates the beginning of the syllable
and the closing bracket, “]”, the end of the syllable.
The word consists of the tri-consonantal onset [str]
followed by the nucleus, the short vowel [I] and the
coda [N].
In the following paragraphs, we will introduce the
different grammar versions. For comparison rea-
sons, we briefly describe the grammar of M¨uller
(2002) first.
2.1.2 Syllable complexity grammar (M¨uller,
2002)
The syllable complexity grammar distinguishes
between onsets and codas which contain a differ-
ent number of consonants. There are different
rules which describe zero to n-consonantal onsets.
Tree (3) shows the complete analysis of the word
string.
(3) Word
Sylone
a8a8
a8a8
a8
a72a72
a72a72
a72
Onsetone
Onone.3.1
a8a8 a72a72s On
one.3.2
a8a8 a72a72t On
one.3.3
r
Rhymeone
a8a8a8 a72a72a72Nucleus
one
I
Codaone.1
Coone.1.1
N
(4) Onone.3.1 → s Onone.3.2
(5) Onone.2.1 → s Onone.2.2
Rule 4, e.g., describes a tri-consonantal onset, e.g.,
[str]. This rule occurs in example tree 3 and will
be used for words such as string or spray. Rule (5)
describes a two-consonantal onset occurring in the
analysis of words such as snake or stand. However,
this grammar cannot model phonotactic dependen-
cies from the previous consonant.
2.1.3 Phonotactic grammar
Thus, we develop a phonotactic grammar which
differs from the previous one. Now, a consonant in
the onset or coda depends on the preceding one. The
rules express bi-grams of the onset and coda conso-
nants. The main difference to the previous gram-
mars can be seen in the re-writing rules involving
phonemic preterminal nodes (rule 6) as well as ter-
minal nodes for consonants (rule 7).
(6) X.r.C.s.t → C X.r.C+.s.t
(7) X.r.C.s.t → C
Rules of this type bear four features for a conso-
nant C inside an onset or a coda (X=On, Cod),
namely: the position of the syllable in the word
(r=ini, med, fin, one), the current terminal node
(C = consonant), the succeeding consonant (C+),
the cluster size (t = 1...5), and the position of a
consonant within a cluster (s = 1...5).
The example tree (8) shows the analysis of the
word string with the current grammar version. The
13
rule (9) comes from the example tree showing that
the onset consonant [t] depends on the previous con-
sonant [s].
(8) Word
Sylone
a8a8
a8a8
a8
a72a72
a72a72
a72a72
Onsetone.3
Onone.s.3.1
a8a8 a72a72s On
one.t.3.2
a8a8 a72a72t On
one.r.3.3
r
Rhymeone
a8a8a8 a72a72a72Nucleus
one
I
Codaone.1
Coone.t.1.1
N
(9) Onone.s.3.1 → s Onone.t.3.2
2.1.4 Phonotactic on-nuc grammar
We also examine if there are dependencies of the
firstonsetconsonantonthesucceedingnucleus. The
dependency of the whole onset on the nucleus is
indirectly encoded by the bi-grams within the on-
set. The phonotactic onset-nucleus grammar distin-
guishes between same onsets with different nuclei.
In example tree (12), the triconsonantal onset start-
ing with a phoneme [s] depends on the Nucleus [I].
Rule (10) occurs in tree (12) and will be also used
for words such as strict or strip whereas rule (11) is
used for words such as strong or strop.
(10) Onsetone.I.3 → Onone.s.3.1
(11) Onsetone.O.3 → Onone.s.3.1
(12) Word
Sylone.I
a8a8
a8a8
a8a8
a72a72
a72a72
a72a72
Onsetone.I.3
Onone.s.3.1
a8a8 a72a72s On
one.t.3.2
a8a8 a72a72t On
one.r.3.3
r
Rhymeone.I
a8a8a8 a72a72a72Nucleus
one.I
I
Codaone.1
Coone.N.1.1
N
2.1.5 Phonotactic nuc-coda grammar
The phonotactic nucleus-coda grammar encodes
the dependency of the first coda consonant on the
nucleus. The grammar distinguishes between codas
that occur with various nuclei. Rule 13 is used, for
instance, to analyze the word string, shown in Ex-
ample tree 15. The same rule will be applied for
words such as bring, king, ring or thing. If there is
a different nucleus, we get a different set of rules.
Rule 14, e.g., is required to analyze words such as
long, song, strong or gong.
(13) Codaone.I.1 → N Coone.t.1.1
(14) Codaone.O.1 → N Coone.t.1.1
(15) Word
Sylone
a8a8
a8a8
a8a8
a72a72
a72a72
a72a72
Onsetone.3
Onone.s.3.1
a8a8 a72a72s On
one.t.3.2
a8a8 a72a72t On
one.r.3.3
r
Rhymeone.I
a8a8a8 a72a72a72Nucleus
one.I
I
Codaone.I.1
Coone.N.1.1
N
2.1.6 Phonotactic on-nuc-coda grammar
The last tested grammar is the phonotactic onset-
nucleus-coda grammar. It is a combination of gram-
mar 2.1.4 and 2.1.5. In this grammar, the first con-
sonant of the onset and coda depend on the nucleus.
Tree 16 shows the full analysis of our running exam-
ple word string.
(16) Word
Sylone.I
a8a8
a8a8
a8a8
a72a72
a72a72
a72a72
Onsetone.I.3
Onone.s.3.1
a8a8 a72a72s On
one.t.3.2
a8a8 a72a72t On
one.r.3.3
r
Rhymeone.I
a8a8a8 a72a72a72Nucleus
one.I
I
Codaone.I.1
Coone.N.1.1
N
The rules of the subtree (17) are the same for words
such as string or spring. However, words with a dif-
ferent nucleus such as strong will be analyzed with
a different set of rules.
14
(17) Word
Sylone.I
a8a8
a8a8
a72a72
a72a72
Onsetone.I.3
Onone.s.3.1
Rhymeone.I
a8a8a8 a72a72a72Nucleus
one.I
I
Codaone.I.1
Coone.N.1.1
N
3 Experiments
In this section, we report on our experiments with
four different phonotactic grammars introduced in
Section 2.1 (see grammar 2.1.3-2.1.6), as well as
with a re-implementation of M¨uller’s less complex
grammar (M¨uller, 2002). All these grammars are
trained on a corpus of transcribed words from the
pronunciation lexicon CELEX. We use the full forms
of the lexicon instead of the lemmas. The German
lexicon contains 304,928 words and the English lex-
icon 71,493 words. Homographs with the same pro-
nunciation but with different part of speech tags are
taken only once. We use for our German exper-
iments 274,435 words for training and 30,492 for
testing (evaluating). For our English experiments,
we use 64,343 for training and 7,249 for testing.
3.1 Training procedure
We use the same training procedure as M¨uller
(2001). It is a kind of treebank training where we
obtain a probabilistic context-free grammar (PCFG)
by observing how often each rule was used in the
training corpus. The brackets of the input guaran-
tee an unambiguous analysis of each word. Thus,
the formula of treebank training given by (Charniak,
1996) is applied: r is a rule, let |r| be the number
of times r occurred in the parsed corpus and λ(r) be
the non-terminal that r expands, then the probability
assigned to r is given by
p(r) = |r|summationtext
rprime∈{rprime|λ(rprime)=λ(r)} |rprime|
After training, we transform the PCFG by drop-
ping the brackets in the rules resulting in an anal-
ysis grammar. The bracket-less analysis grammar is
used for parsing the input without brackets; i.e., the
phoneme strings are parsed and the syllable bound-
aries are extracted from the most probable parse.
In our experiments, we use the same technique.
The advantage of this training method is that we
learn the distribution of the grammar which maxi-
mizes the likelihood of the corpus.
3.2 Evaluation procedure
We evaluate our grammars on a syllabification task
which means that we use the trained grammars to
predict the syllable boundaries of an unseen corpus.
As we drop the explicit markers for syllable bound-
aries, the grammar can be used to predict the bound-
aries of arbitrary phoneme sequences. The bound-
aries can be extracted from the syl-span which gov-
erns an entire syllable.
Our training and evaluation procedure is a 10-fold
cross validation procedure. We divide the original
(German/English)corpusintotenpartsequalinsize.
We start the procedure by training on parts 1-9 and
evaluating on part 10. In a next step, we take parts
1-8 and 10 and evaluate on part 9. Then, we evaluate
on corpus 8 and so forth. In the end, this procedure
yields evaluation results for all 10 parts of the orig-
inal corpus. Finally, we calculate the average mean
of all evaluation results.
3.2.1 Evaluation Metrics
Our three evaluation measures are word accuracy,
syllable accuracy and syllable boundary accuracy.
Word accuracy is a very strict measure and does not
depend on the number of syllables within a word. If
a word is correctly analyzed the accuracy increases.
We define word accuracy as
# of correctly analyzed words
total # of words
Syllable accuracy is defined as
# of correctly analyzed syllables
total # of syllables
The last evaluation metrics we used is thesyllable
boundary accuracy. It expresses how reliable the
boundaries were recognized. It is defined as
# of correctly analyzed syllable boundaries
total # of syllable boundaries
The difference between the three metrics can
be seen in the following example. Let our
evaluation corpus consist of two words, transfer-
ring and wet. The transcription and the sylla-
ble boundaries are displayed in table 1. Let our
trained grammar predict the boundaries shown in
table 2. Then the word accuracy will be 50%
15
transferring trA:ns–f3:–rIN
wet wEt
Table 1: Example: evaluation corpus
transferring trA:n–sf3:–rIN
wet wEt
Table 2: Example: predicted boundaries
(1 correct word2 words ), the syllable accuracy will be 50%
(2 correct syllables4 syllables ), and the syllable boundary accu-
racy is 75% (3 correct syllable boundaries4 syllable boundaries ). The differ-
ence between syllable accuracy and syllable bound-
ary accuracy is that the first metric punishes the
wrong prediction of a syllable boundary twice as
the complete syllable has to be correct. The syllable
boundary accuracy only judges the end of the sylla-
bleandcountshowoftenitiscorrect. Mono-syllabic
words are also included in this measure. They serve
as a baseline as the syllable boundary will be always
correct. If we compare the baseline for English and
German (tables 3 and 4, respectively), we observe
that the English dictionary contains 10.3% monosyl-
labic words and the German one 1.59%.
Table 3 and table 4 show that phonotactic knowl-
edge improves the prediction of syllable bound-
aries. The syllable boundary accuracy increases
from 95.84% to 97.15% for English and from 95.9%
to 96.48% for German. One difference between the
two languages is if we encode the nucleus in the on-
set or coda rules, German can profit from this in-
formation compared to English. This might point at
a dependence of German onsets from the nucleus.
For English, it is even the case that the on-nuc and
the nuc-cod grammars worsen the results compared
to the phonotactic base grammar. Only the combi-
nation of the two grammars (the on-nuc-coda gram-
mar)achievesahigheraccuracythanthephonotactic
grammar. We suspect that the on-nuc-coda grammar
encodes that onset and coda constrain each other on
the repetition of liquids or nasals between /s/C on-
sets and codas. For instance, lull and mam are okey,
whereas slull and smame are less good.
4 Learning phonotactics from PCFGs
We want to demonstrate in this section that our
phonotactic grammars does not only improve syl-
grammar version word syllable syll bound.
accuracy accuracy accuracy
baseline 10.33%
(M¨uller, 2002) 89.27% 91.84% 95.84%
phonot. grammar 92.48% 94.35% 97.15%
phonot. on-nuc 92.29% 94.21% 97.09%
phonot. nuc-cod 92.39% 94.27% 97.11%
phonot. on-nuc-cod 92.64% 94.47% 97.22%
Table 3: Evaluation of four English grammar ver-
sions.
grammar version word syllable syll bound.
accuracy accuracy accuracy
baseline 1.59%
(M¨uller, 2002) 86.06% 91.96% 95.90%
phonot. grammar 87.95% 93.09% 96.48%
phonot. nuc-cod 89.53% 94.09% 97.01%
phonot. on-nuc 89.97% 94.35% 97.15%
phonot. on-nuc-cod 90.45% 94.62% 97.29%
Table 4: Evaluation of four German grammar ver-
sions.
labification accuracy but can be used to reveal in-
teresting phonotactic2 information at the same time.
Our intension is to show that it is possible to aug-
ment symbolic studies such as e.g., Hall (1992),
Pierrehumbert (1994), Wiese (1996), Kessler and
Treiman (1997), or Ewen and van der Hulst (2001)
with extensive probabilistic information. Due to
time and place constraints, we concentrate on two-
consonantal clusters of grammar 2.1.3.
Phonotactic restrictions are often expressed by ta-
bles which describe the possibility of combination
of consonants. Table 5 shows the possible combi-
nations of German two-consonantal onsets (Wiese,
1996). However, the table cannot express differ-
ences in frequency of occurrence between certain
clusters. For instance, it does not distinguish be-
tween onset clusters such as [pfl] and [kl]. If we con-
sider the frequency of occurrence in a German dic-
tionary then there is indeed a great difference. [kl] is
much more common than [pfl].
4.1 German
Our method allows additional information to be
added to tables such as shown in table 5. In what
follows, the probabilities are taken from the rules
of grammar 2.1.3. Table 6 shows the probability of
2Note that we only deal with phonotactic phenomena on the
syllable level and not on the morpheme level.
16
mono l R n m s v f t ts p k j z g
0.380 S 0.160 0.093 0.056 0.074 0.165 0.318 0.131
0.158 k 0.351 0.322 0.175 0.151
0.090 b 0.489 0.510
0.086 t 0.955 0.044
0.083 f 0.620 0.364 0.015
0.066 g 0.362 0.617 0.019
0.042 p 0.507 0.400 0.030 0.061
0.033 d 1.000
0.019 s 0.200 0.066 0.100 0.133 0.033 0.133 0.333
0.019 ts 1.000
0.011 pf 0.882 0.117
0.007 v 1.000
Table 6: German two-consonantal onsets in monosyllabic words - sorted by probability of occurrence
mono l r n m s v f t ts p k j z g w S d
0.322 s 0.157 0.001 0.099 0.060 0.001 0.004 0.223 0.150 0.174 0.006 0.120
0.148 k 0.375 0.390 0.003 0.003 0.030 0.196
0.093 b 0.420 0.574 0.004
0.083 f 0.591 0.333 0.075
0.079 p 0.480 0.457 0.056 0.005
0.072 g 0.283 0.709 0.006
0.068 t 0.686 0.039 0.274
0.048 d 0.822 0.112 0.065
0.035 h 0.089 0.910
0.018 T 0.857 0.047 0.095
0.014 S 0.878 0.030 0.030 0.060
0.004 m 1.000
0.003 n 1.000
0.002 l 1.000
0.002 v 1.000
Table 7: English two-consonantal onsets in monosyllabic words - sorted by probability of occurrence
Sonorants Obstruents
l R n m s v
Obstruents
p + + (+) - + -
t - + - - - (+)
k + + + (+) (+) +
b + + - - - -
d - + - - - -
g + + + (+) - -
f + + - - - -
v (+) + - - -
ts - - - - - +
pf + + - - - -
S + + + + - +
Table 5: (Wiese, 1996) German onset clusters
occurrence of German obstruents ordered by their
probability of occurrence. [S] occurs very often in
German words as first consonant in two-consonantal
onsets word initially. In the first row of table 6,
the consonants which occur as second consonants
are listed. We observe, for instance, that [St] is
the most common two-consonantal onset in mono-
syllabic words. This consonant cluster appears in
words such as Staub (dust), stark (strong), or Stolz
(pride). We believe that there is a threshold indicat-
ing that a certain combination is very likely to come
from a loanword. If we define the probability of a
two-consonantal onset as
p(onset ini 2) =def p(C1)×p(C2)
where p(C1) is the probability of the rule
X.r.C1.s.t → C1 X.r.C2.s.t
and p(C2) is the probability of the rule
X.r.C2.s.t → C2,
then we get a list of two-consonantal onsets ordered
by their probabilities:
p(St) > ... > p(sk) > p(pfl) > p(sl) > ... > p(sf)
These onsets occur in words such as Steg (foot-
bridge), stolz (proud), Staat (state), Skalp (scalp),
Skat (skat) Pflicht (duty), Pflock (stake), or Slang
(slang) and Slum (slum). The least probable
combination is [sf] which appears in the German
word Sph¨are (sphere) coming from the Latin word
sphaera. The consonant cluster [sl] is also a very
uncommon onset. Words with this onset are usually
loanwords from English. The onset [sk], however, is
an onset which occur more often in German words.
Most of the words are originally from Latin and are
translated into German long ago. Interestingly, the
onset [pfl] is also a very uncommon onset. Most
of these onsets result from the second sound shift
where in certain positions the simple onset conso-
17
nant /p/ became the affricate /pf/. The English trans-
lation of these words shows that the second sound
shift was not applied to English. However, the most
probable two-consonantal onset is [St]. The whole
set of two-consonantal onsets can be seen in Table 8.
4.2 English
English two-consonantal onsets show that unvoiced
first consonants are more common than voiced ones.
However, two combinations are missing. The alveo-
larplosives/t/and/d/donotcombinewiththelateral
/l/ in English two-consonantal onsets. Table 8 shows
the most probable two-consonantal onsets sorted by
their joint probability.
4.3 Comparison between English and German
The fricatives /s/ and /S/ are often regarded as extra
syllabic. Accordingtoourstudyontwo-consonantal
onsets, these fricatives are very probable first con-
sonants and combine with more second consonants
than all other first consonants. They seem to form
an own class. Liquids and glides are the most impor-
tantsecondconsonants. However,Englishprefers/r/
over /l/ in all syllable positions and /j/ over /w/ (ex-
cept in monosyllabic words) and /n/ as second con-
sonants. Nasals can only combine with very little
first consonants. In German, we observe that /R/ is
preferred over /l/ and /v/ over /n/ and /j/. Moreover,
the nasal /n/ is much more common in German than
in English as second consonants which applies espe-
cially to medial and final syllables.
When we compare the phonotactic restrictions of
two languages, itis also interesting toobserve which
combinations are missing. If certain consonant clus-
ters are not very likely or never occur in a language,
this might have consequences for language under-
standing and language learning. Phonotactic gaps
in one language might cause spelling mistakes in a
second language. For instance, a typical Northern
German name is Detlef which is often misspelled in
English as Deltef. The onset cluster /tl/ can occur
in medial and final German syllables but not in En-
glish. The different phonetic realization of /l/ may
play a certain role that /lt/ is more natural than /tl/ in
English.
Mono-syllabic: /st/>/kr/>/sk/>/kl/>/br/>/gr/>/sl/>/fl/>/sp/>/tr/
>/dr/>/bl/>/sw/>/pl/>/pr/>/sn/>/hw/>/kw/>/fr/>/gl/>/sm/>
/tw/>/Tr/>/Sr/>/fj/>/dj/>/kj/>/pj/>/mj/>/dw/>/hj/>/nj/>/tj/
>/vj/>/lj/>/sj/>/Tw/>/sf/>/Tj/>/Sw/>/km/>/kv/>/gw/>/Sn/
>/Sm/>/pS/>/bj/>/sr/>/sv/
Initial /pr/>/st/>/tr/>/kr/>/sp/>/sk/>/br/>/gr/>/fl/>/kl/>/fr/>
/bl/>/pl/>/sl/>/kw/>/dr/>/sn/>/sw/>/gl/>/hw/>/nj/>/sm/>/sj/
>/pj/>/Tr/>/mj/>/kj/>/dj/>/tw/>/tj/>/fj/>/hj/>/lj/>/bj/>/ps/
>/Sr/>/dw/>/sf/>/vj/>/gj/>/gw/>/pw/>/mn/>/Sm/>/Tj/>/Tw/
>/Sn/>/tsw/>/zj/>/pt/>/mw/>/kn/>/gz/
Medial: /st/>/tr/>/pr/>/sp/>/gr/>/kj/>/kr/>/kw/>/pl/>/br/>/tj/
>/lj/>/dj/>/dr/>/kl/>/nj/>/sk/>/mj/>/fr/>/pj/>/bl/>/fl/>/bj/
> /gl/ > /gj/ > /fj/ > /Sn/ > /sj/ > /vj/ > /Sj/ > /Tr/ > /vr/ > /gw/ > /sl/ >
/nr/ > /sw/ > /mr/ > /sn/ > /hj/ > /hw/ > /sm/ > /zj/ > /tSr/ > /rj/ > /sr/ >
/dw/>/Zr/>/Sr/>/jw/>/tSw/>/tSn/>/vw/>/Dr/>/dZr/>/dn/>/Tj/
>/tw/>/Sw/>/Zj/>/zr/>/zn/>/zw/>/Zw/>/dZj/>/dZn/>/dZw/
Final: /st/ > /tr/ > /kl/ > /bl/ > /gr/ > /dr/ > /pl/ > /br/ > /sk/ > /sp/ > /pr/
>/kr/>/tj/>/fr/>/nj/>/fl/>/lj/>/kw/>/dj/>/sj/>/kj/>/sl/>/gl/
>/hw/>/Sn/>/vr/>/Sj/>/vj/>/bj/>/pj/>/fj/>/Tr/>/mj/>/gw/>
/sr/>/sw/>/sm/>/nr/>/sn/>/tSr/>/mr/>/tw/>/dZr/>/zj/>/gj/>
/dZj/>/Sr/>/Zr/>/sf/>/nw/>/zr/>/Tj/>/rj/>/Dr/>/vw/>/dw/>
/dn/>/tSj/>/pw/>/jw/>/hj/>/St/>/Zw/>/tSn/>/Zj/>/pn/>/Dj/>
/dZn/>/zn/>/Sw/>/Zn/>/tSw/>/Tw/>/bd/>/tsj/>/Dw/
Monosyllabic: /St/>/tR/>/Sv/>/Sl/>/kl/>/fl/>/kR/>/Sp/>/bR/>
/bl/>/gR/>/SR/>/dR/>/fR/>/Sm/>/kn/>/gl/>/kv/>/pl/>/Sn/>
/tsv/>/pR/>/pfl/>/vR/>/sk/>/sl/>/tv/>/ps/>/sp/>/sv/>/sm/>
/pfR/>/pn/>/gn/>/sn/>/fj/>/sf/
Initial: /St/>/tR/>/pR/>/Sp/>/kR/>/Sv/>/gR/>/Sl/>/fR/>/kl/>
/bR/>/bl/>/fl/>/Sm/>/gl/>/tsv/>/pl/>/kv/>/kn/>/Sn/>/dR/>
/SR/ > /sk/ > /pfl/ > /ps/ > /gn/ > /sl/ > /sm/ > /sts/ > /sf/ > /sv/ > /ks/ >
/tv/>/vR/>/sn/>/mn/>/st/>/pn/>/sp/>/fj/>/pfR/>/mj/
Medial: /St/ > /tR/ > /bR/ > /fR/ > /Sl/ > /gR/ > /kR/ > /bl/ > /dR/ > /Sp/
>/kl/>/fl/>/pR/>/gl/>/Sv/>/SR/>/st/>/pl/>/ks/>/kv/>/gn/>
/Sn/>/Sm/>/kn/>/tsv/>/pfl/>/dl/>/dn/>/gm/>/sp/>/sn/>/fn/>
/bn/>/vj/>/xR/>/tn/>/sl/>/vR/>/sk/>/pj/>/ps/>/sts/>/xn/>/xl/
>/ml/>/Rn/>/Nn/>/NR/>/zn/>/zl/>/mn/>/tl/>/sf/>/ln/>/tsR/
>/tsl/>/sR/>/ft/>/zR/>/pfR/>/pt/>/nR/>/sg/>/pn/>/dm/>/tz/
>/sv/>/zv/>/tv/
Final: /St/ > /tR/ > /bl/ > /Sl/ > /bR/ > /fl/ > /kl/ > /dR/ > /gR/ > /Sp/ >
/kR/>/Sv/>/fR/>/SR/>/gl/>/ks/>/dl/>/pl/>/gn/>/pR/>/Sn/>
/Sm/>/kn/>/dn/>/kv/>/tsv/>/tl/>/ml/>/xl/>/tsl/>/gm/>/pfl/>
/Nl/ > /zl/ > /tn/ > /xR/ > /vR/ > /fn/ > /bn/ > /vj/ > /zn/ > /Nn/ > /pn/ >
/RR/>/mn/>/xn/>/zR/>/NR/>/lR/>/dZm/>/tsR/>/nl/>/gv/>/ps/
>/ft/>/pfR/>/tZl/>/nR/>/sp/>/st/>/sv/>/sk/>/sR/>/sn/>/sl/>
/sm/>/sts/
Table 8: Two-consonantal onsets ordered by joint
probability (top: English, bottom:German)
18
5 Discussion
Comparison of the syllabification performance with
other systems is difficult: (i) different approaches
differ in their training and evaluation corpus;
(ii) comparisons across languages are hard to inter-
pret; (iii) comparisons across different approaches
require cautious interpretations. Nevertheless, we
want to refer to several approaches that examined
the syllabification task. Van den Bosch (1997) in-
vestigated the syllabification task with five induc-
tive learning algorithms. He reported a general-
ization error for words of 2.22% on English data.
However, the evaluation procedure differs from ours
as he evaluates each decision (after each phoneme)
made by his algorithms. Marchand et al. (to ap-
pear 2006) evaluated different syllabification algo-
rithms on three different pronunciation dictionaries.
Their best algorithm (SbA) achieved a word accu-
racy of 91.08%. The most direct point of compari-
son are the results presented by M¨uller (2002). Her
approach differs in two ways. First, she only eval-
uates the German grammar and second she trains
on a newspaper corpus. As we are interested in
how her grammars perform on our corpus, we re-
implemented her grammars and tested both in our
10-fold cross evaluation procedure. We find that the
first grammar (M¨uller, 2001) achieves 85.45% word
accuracy, 88.94% syllable accuracy and 94.37% syl-
lable boundary accuracy for English and 84.21%,
90.86%, 95.36% for German respectively. The re-
sults show that the syllable boundary accuracy in-
creases from 94,37% to 97.2% for English and from
95.3% to 97.2% for German. The experiments point
out that phonotactic knowledge is a valuable source
of information for syllabification.
6 Conclusions
Phonotactic restrictions are important for language
perception and production. They influence the abil-
ity of children to segment words, and they help to
recognize words in nonsense sequences. In this
paper, we presented grammars which incorporate
phonotactic restrictions. The grammars were trained
and tested on a German and an English pronuncia-
tion dictionary. Our experiments show that English
and German profit from phonotactic knowledge to
predict syllable boundaries. We find evidence that
German codas depend on the nucleus which does
not apply for English. The English grammars which
model the dependency of part of the onset or coda
on the nucleus worsen the syllabification accuracy.
However, the combination of both show a better per-
formance than the base phonotactic grammar. This
suggests that there are constrains in the selection of
the onset and coda consonants.
7 Acknowledgments
I would like to thank Paul Boersma who invited
me as a guest researcher at the Institute of Phonetic
Sciences of the University of Amsterdam. Special
thanks also to Detlef Prescher as well as to the three
anonymous reviewers, whose comments were very
useful while preparing the final version of this pa-
per.

References
HaraldR.Baayen, RichardPiepenbrock, andH.vanRijn.
1993. The CELEX lexical database—Dutch, English,
German. (Release 1)[CD-ROM]. Philadelphia, PA:
Linguistic Data Consortium, Univ. Pennsylvania.
Anja Belz. 2000. Multi-syllable phonotactic mod-
elling. In Proceedings of SIGPHON 2000: Finite-
State Phonology, Luxembourg.
Juliette Blevins. 1995. The Syllable in Phonological
Theory. In John A. Goldsmith, editor, Handbook
of Phonological Theory, pages 206–244, Blackwell,
Cambridge MA.
Julie Carson-Berndsen, Robert Kelly, and Moritz Neuge-
bauer. 2004. Automatic Acquisition of Feature-Based
Phonotactic Resources. In Proceedings of the Work-
shop of the ACL Special Interest Group on Computa-
tional Phonology (SIGPHON), Barcelona, Spain.
Julie Carson-Berndsen. 1998. Time Map Phonology. Fi-
nite State Models and Event Logics in Speech Recog-
nition, volume 5 of Text, Speech and Language Tech-
nology. Springer.
Eugene Charniak. 1996. Tree-bank grammars. In Pro-
ceedings of the Thirteenth National Conference on Ar-
tificial Intelligence, AAAI Press/MIT Press, Menlo
Park.
Walter Daelemans and Antal van den Bosch. 1992. Gen-
eralization performance of backpropagation learning
onasyllabificationtask. InM.F.J.DrossaersandANi-
jholt, editors, Proceedings of TWLT3: Connectionism
and Natural Language Processing, pages 27–37, Uni-
versity of Twente.
Colin J. Ewen and Harry van der Hulst. 2001.
The Phonological Structure of Words. An Introduc-
tion. Cambridge University Press, Cambridge, United
Kingdom.
Tracy Hall. 1992. Syllable structure and syllable related
processes in German. Niemeyer, T¨ubingen.
Daniel Kahn. 1976. Syllable-based Generalizations in
English Phonology. Ph.D. thesis, Massachusetts Insti-
tute of Technology, MIT.
Brett Kessler and Rebecca Treiman. 1997. Syllable
Structure and the Distribuation of Phonemes in En-
glish Syllables. Journal of Memory and Language,
37:295–311.
George Anton Kiraz and Bernd M¨obius. 1998. Mul-
tilingual Syllabification Using Weighted Finite-State
Transducers. In Proc. 3rd ESCA Workshop on Speech
Synthesis (Jenolan Caves), pages 59–64.
Brigitte Krenn. 1997. Tagging syllables. In Proceedings
of the 5th European Conference on Speech Commu-
nication and Technology, Eurospeech 97, pages 991–
994.
Yannick Marchand, Connie A. Adsett, and Robert I.
Damper. to appear 2006. Automatic syllabification in
English: A comparison of different algorithms. Lan-
guage and Speech.
Karin M¨uller. 2001. Automatic Detection of Syllable
Boundaries Combining the Advantages of Treebank
andBracketedCorporaTraining. In Proc. 39th Annual
Meeting of the ACL, Toulouse, France.
Karin M¨uller. 2002. Probabilistic Context-Free Gram-
mars for Phonology. In Proceedings of the Workshop
on Morphological and Phonological Learning at ACL
2002.
Janet Pierrehumbert. 1994. Syllable structure and word
structure: a study of triconsonantal clusters in English.
In Patricia A. Keating, editor, Phonological Structure
and Phonetic Form, volume III of Papers in Labo-
ratory Phonology, pages 168–188. University Press,
Cambridge.
Richard Sproat, editor. 1998. Multilingual Text-to-
Speech Synthesis: The Bell Labs Approach. Kluwer
Academic, Dordrecht.
Antal Van den Bosch. 1997. Learning to Pronounce
Written Words: A Study in Inductive Language Learn-
ing. Ph.D. thesis, Univ. Maastricht, Maastricht, The
Netherlands.
Jan P.H. Van Santen, Chilin Shih, Bernd M¨obius, Eve-
lyne Tzoukermann, and Michael Tanenblatt. 1997.
Multilingual duration modeling. In Proceedings of
the European Conference on Speech Communication
and Technology (Eurospeech), volume 5, pages 2651–
2654, Rhodos, Greece.
Michael S. Vitevitch and Paul A. Luce. 1999. Proba-
bilistic Phonotactics and Neighborhood Activation in
Spoken Word Recognition. Journal of Memory and
Language, (40):374–408.
Jean Vroomen, Antal van den Bosch, and Beatrice
de Gelder. 1998. A Connectionist Model for Boot-
strap Learning of Syllabic Structure. Language and
Cognitive Processes. Special issue on Language Ac-
quisition and Connectionism, 13(2/3):193–220.
Andrea Weber and Anne Cutler. 2006. First-language
phonotactics in second-language listening. Journal of
the Acoustical Society of America, 119(1):597–607.
Richard Wiese. 1996. The Phonology of German.
Clarendon Press, Oxford.
