Word Boundary Identification from Phoneme Sequence Constraints in Automatic
Continuous Speech Recognition
Jonathan HARRINGTON Gordon WATSON Maggie COOPER
The Centre for Speech Technology Research, University of Edinburgh, 80 South Bridge, Edinburgh 
EH1 1HN, Scotland. 
Abstract 
This paper explores the extent to which phoneme sequence
constraints can be used to identify word boundaries in continuous
speech recognition. The input consists of phonemic transcriptions
(without word boundaries indicated) of 145 utterances produced by
one speaker. The constraints are derived by matching the
complete set of three-phoneme sequences that can occur across word
boundaries to entries in large lexicons containing both citation and
reduced form pronunciations. Phonemic assimilatory adjustments
across word boundaries are also taken into account. The results
show that around 37% of all word boundaries can be correctly
identified from a knowledge of such phoneme sequence constraints
alone, and that this figure rises to 45% when a knowledge of one-
and two-phoneme words and all legal, word-initial and word-final,
two-phoneme sequences is taken into account. The possibility of
including such constraints in the architecture of a continuous
speech recogniser is discussed.
1. Introduction
The identification of word boundaries from continuous
speech by human listeners depends, in part, on an interaction
between prosodic, syntactic and semantic processing. Since,
however, this interaction is difficult to model in machines and
since some prosodic variables, such as sentence stress patterns, are
difficult to extract automatically from the acoustic waveform, the
identification of word boundaries must often be accomplished by
different kinds of processing in continuous speech recognisers: one
possibility, discussed in Lamel & Zue (1984) and explored in this
paper, depends on the incorporation of a knowledge of phoneme
sequence constraints. Phoneme sequence constraints are based on a
knowledge of phoneme sequences which do not occur
word-internally: for example, since there are no words which end
in /m g/, and since /m g l/ does not occur word-internally, a word
boundary must occur after /m/ (Lamel & Zue, 1984). Harrington,
Johnson & Cooper (1987) showed that word boundary CVC
sequences are often excluded word-internally in monomorphemic
words if the pre- and post-vocalic consonants are similar: thus, /s N
V N/ (N = nasal), /C l V l/, /f V p/, /g V ld, /z V ,iJ, /sh V sh/
are all excluded, or are at least extremely rare, word-internally in
British English Received Pronunciation (RP). In the study
discussed below, we extend the investigations of Lamel & Zue
(1984) and Harrington et al. (1987) by developing an algorithm for
the automatic identification of word boundaries from such
sequences in a continuous speech recogniser.
In the Alvey Demonstrator continuous speech recogniser
being developed at the Centre for Speech Technology Research
(CSTR), Edinburgh University (Figure 1), the identification of
word boundaries from a string of phonemes is accomplished by a
chart-parsing strategy which matches the lexicon from
left-to-right against a string of phonemic symbols that are
themselves derived from the phonetic processing of the
acoustic waveform. In this system, only complete parsings of the
phonemic units are passed to higher levels for syntactic and
semantic processing. The only possible parsing, therefore, of the
phonemic string /t ii ch i ng w i l/ is teaching+will, since there
are no other paths which parse the entire string of phonemes.
[Figure 1 (schematic): Parameterised Acoustic Signal -> Phonetic Rules
-> Phoneme String /t ii ch i ng w i l/ -> Chart-Parsing Strategy (arcs
TEA, TEE, TEACH, TEACHING) -> teaching will]
Fig. 1: A schematic outline of the components between acoustic 
waveform and lexical access in one of the continuous speech 
recognisers at Edinburgh University. 
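The requirement that only complete parsings are passed on can be illustrated with a minimal sketch. It is not the CSTR chart parser: it is a simple dynamic-programming stand-in, and the function name `complete_parses` and the toy lexicon are assumptions made for illustration only.

```python
def complete_parses(phonemes, lexicon):
    """Return every segmentation of `phonemes` into words from `lexicon`.

    `lexicon` maps a pronunciation (tuple of phoneme symbols) to a word.
    An empty result means that no path covers the whole string.
    """
    n = len(phonemes)
    # parses[i] holds all word sequences that cover phonemes[i:] exactly.
    parses = [[] for _ in range(n + 1)]
    parses[n] = [[]]
    for start in range(n - 1, -1, -1):
        for end in range(start + 1, n + 1):
            word = lexicon.get(tuple(phonemes[start:end]))
            if word is not None:
                for rest in parses[end]:
                    parses[start].append([word] + rest)
    return parses[0]


# Hypothetical toy lexicon; the real lexicon is far larger.
TOY_LEXICON = {
    ("t", "ii"): "tea",
    ("t", "ii", "ch"): "teach",
    ("t", "ii", "ch", "i", "ng"): "teaching",
    ("w", "i", "l"): "will",
}

print(complete_parses("t ii ch i ng w i l".split(), TOY_LEXICON))
```

With this toy lexicon, the arcs for tea and teach are found but lead nowhere, so only the path through teaching and will survives.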
The relationship between the identification of word
boundaries from phoneme sequence constraints and the
chart-parsing strategy outlined above can be clarified with respect
to Figure 1: at all points where the arcs do not overlap, it should be
possible to insert a word boundary from a knowledge of phoneme
sequence constraints. Since, therefore, the only point at which the
arcs are non-overlapping is between /ng/ and /w/, phoneme
sequence constraints should apply to insert a word boundary at
that point (there being no monomorphemic words in the English
language that contain a medial /ng w/). At the same time,
however, Figure 1 would seem to suggest that the prior
implementation of phoneme sequence constraints is superfluous,
since all word boundaries can be found from the chart-parsing
strategy. However, the application of phoneme sequence
constraints may enable recovery when the chart-parsing strategy
is unable to parse the phonemic string because of the incorrect
derivation of a particular phoneme. Suppose, for example, that the
acoustic-phonetic component incorrectly derives /oi ng/ from the
parameterised acoustic waveform instead of /i ng/ (Figure 2).
[Figure 2 (schematic): arcs TEA, TEE and TEACH over the phoneme string
/t ii ch oi ng w i l/; ERROR: cannot be parsed]
Fig. 2: The incorrect substitution of /oi/ for /i/ makes the above
sequence unparsable since /ch oi ng/ occurs neither
word-internally nor across word boundaries.
In this case, a left-to-right chart-parsing strategy would break off
at /ch/ because /ch oi ng/ is unparsable: there are no words that
end in /ch oi/ or begin with /oi ng/ and /oi/ is not usually considered
to be a word (aside from an exclamation) in the English language.
Since the strategy works from left-to-right, the phonemes which lie
to the right of this error would also remain unparsed: thus will
would not be derived from /w i l/, unless the chart-parsing
strategy were modified in some way to be able to cope with this
kind of error. If, on the other hand, phoneme sequence constraints
had been applied, a word boundary would have been inserted
between /ng/ and /w/. This would enable immediate recovery from
the kind of error described above: in this case, if the chart-parsing
strategy is unable to continue parsing phonemes at a particular
point (from /ch/ to /oi/ to /ng/) it can continue parsing from the
following word boundary (between /ng/ and /w/) that had been
automatically inserted by phoneme sequence constraints. The
prior application of phoneme sequence constraints, therefore,
breaks up a single string of phonemes into smaller units, which,
from the point of view of the left-to-right chart-parsing strategy,
are independent of each other. A by-product of the prior insertion
of word boundaries in this way is that the chart-parsing strategy
could parse each of these units in parallel (Figure 3).
[Figure 3 (schematic): Parameterised Acoustic Signal -> Phonetic Rules
-> m e n i th a ng k s f oo s e n d i ng m ii dh @ k o p i @ v y oo l e t @
-> Phoneme Sequence Constraint Processor
-> # m e n i # th a ng k s # f oo s e n d i ng # m ii dh @ k o p i @ v # y oo l e t @
-> Apply Chart-Parsing Strategy in Parallel
-> many thanks for sending me the copy of your letter]
Fig. 3: The prior application of phoneme sequence constraints 
would enable the chart-parsing strategy to apply in parallel from 
all the pre-identified word boundaries. 
Such a parallel strategy may be computationally faster than one 
which parses the string strictly from left-to-right. 
As in Harrington & Johnstone (1988), sentences
transcribed by a trained phonetician are used as the input data. 
The experiment does not take account, therefore, of any errors 
which may arise as a result of inaccuracies in the automatic 
extraction of the phonemes from the acoustic signal by the 
phonetic rule component of a continuous speech recogniser. 
2. Method I
2.1 Word boundary sequences 
In order to identify phoneme sequences which are excluded 
word-internally (and which therefore signal the presence of a word 
boundary), it is necessary to determine a priori the complete set of 
three phoneme sequences which can occur across word boundaries. 
For this purpose, a 'Word-lexicon' of the 23,000 most frequent 
words (including many derivational and inflectional 
morphological variants and compounds) in part of the 
Lancaster-Oslo-Bergen corpus (Johannson, Leech & Goodluck, 
1978) was used with each word keyed to one citation form and zero
or more reduced form pronunciations. The citation form entry, 
which is often identical to the one given in Gimson (1984), 
corresponds to a phonemicisation of an isolated production of the 
word at a moderately slow tempo. The reduced forms include 
variant phonemieisations of the same words which might occur in 
faster speech productions. In general, three different kinds of 
reduction rules are included: alternation rules in which segments 
are in free variation (e.g. /oo k sh @ n/, /o k sh @ n/, auction);
deletion rules in which single segments may be deleted (/o k sh n/
from /o k sh @ n/, auction); and word-internal assimilation rules
(/g u b b ai/ from /g u d b ai/, good-bye). The rules do not take into
account phonological assimilation across word boundaries (see
Harrington, Laver & Cutting (1986) for further details of the
reduction rules). The reduced forms were derived from the citation
forms by rule using a software package running on Xerox-1100
workstations in Interlisp-D (Cutting & Harrington, 1986). After
the application of the reduction rules on the 23,000 word lexicon,
around 70,000 reduced forms were derived (on average, therefore,
each word is associated with 4 different pronunciations).
In order to derive the complete set of possible three-phoneme
sequences that occur across word boundaries, all final
two phonemes (PP#) were paired with all initial phonemes (#P) of
all citation and reduced forms, thus deriving the complete set of
PP#P sequences (where P is any phoneme); and all final phonemes
(P#) were paired with the first two phonemes (#PP) of all citation
and reduced forms, thus deriving the complete set of P#PP
sequences. This pairing operation produced a total of 62,670
different three-phoneme sequences.
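The pairing operation can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the software used in the study: pronunciations are assumed to be non-empty tuples of phoneme symbols, and the function name `boundary_trigrams` is hypothetical.

```python
def boundary_trigrams(pronunciations):
    """Build the PP#P and P#PP sets from a collection of pronunciations.

    Each pronunciation is a non-empty sequence of phoneme symbols;
    "#" marks the word boundary in the returned tuples.
    """
    pronunciations = [tuple(p) for p in pronunciations]
    final_one = {p[-1:] for p in pronunciations}
    final_two = {p[-2:] for p in pronunciations if len(p) >= 2}
    initial_one = {p[:1] for p in pronunciations}
    initial_two = {p[:2] for p in pronunciations if len(p) >= 2}
    # PP#P: every word-final pair followed by every word-initial phoneme.
    pp_p = {f + ("#",) + i for f in final_two for i in initial_one}
    # P#PP: every word-final phoneme followed by every word-initial pair.
    p_pp = {f + ("#",) + i for f in final_one for i in initial_two}
    return pp_p | p_pp


toy = [("t", "ii", "ch"), ("w", "i", "l")]
for sequence in sorted(boundary_trigrams(toy)):
    print(" ".join(sequence))
```

Because every final is paired with every initial, the set grows with the product of the numbers of distinct word edges rather than with the number of words.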
Subsequently, it was necessary to take into account some
of the modifications to word boundary sequences which occur as a
result of assimilatory processes since, as stated above, these were
not included in the reduction rules. In order to take into account
the realisation of /r/ in phrases such as /dh e@ r aa m e n i/ (there
are many) and 'intrusive /r/' (/dh ii ai d i @ r i z/, the idea is), the
sequences in (1) were paired with all word-initial vowel phonemes
that occurred in the Word-lexicon:
(1) /u@ r, e@ r, i@ r, @ r, @@ r, oo r, aa r/
thus deriving, for example, /@ r # i/ (measure is), /aa r # au/ (far
out) etc. In addition, /r/ was paired with all #VP sequences in the
Word-lexicon where V is any word-initial vowel and P is any
phoneme. This pairing operation results in sequences such as /r #
i z/ (measure is), /r # au t/ (far out) etc.
In order to account for the assimilation of alveolars to
bilabials preceding bilabials, all PPt # sequences (where P is any
phoneme and Pt is one of /t, d, n/) were extracted from the
Word-lexicon. Final /t/, /d/, /n/ were then changed to /p/, /b/ and /m/
respectively (thus the PPt # sequences /ii t #/, /ii d #/, /ii n #/ were
changed to /ii p #/, /ii b #/, /ii m #/). The changed sequences were
then paired with the labial consonants /p, b, m, f, v, w/. This pairing
operation produces sequences such as /ii p # b/ (eat by), /ou m # f/
(shown few), /@@ m # w/ (burn wood).
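A minimal sketch of this assimilation step is given below. The mapping and the labial set are taken from the text; the function name `bilabial_assimilation_sequences` and the tuple representation of pronunciations are assumptions for illustration.

```python
ALVEOLAR_TO_BILABIAL = {"t": "p", "d": "b", "n": "m"}
LABIALS = ("p", "b", "m", "f", "v", "w")


def bilabial_assimilation_sequences(pronunciations):
    """Derive assimilated PP#P sequences from words ending in /t, d, n/."""
    sequences = set()
    for p in pronunciations:
        if len(p) >= 2 and p[-1] in ALVEOLAR_TO_BILABIAL:
            # Replace the final alveolar by the corresponding bilabial.
            changed = (p[-2], ALVEOLAR_TO_BILABIAL[p[-1]])
            # Pair the changed final with every labial consonant.
            for labial in LABIALS:
                sequences.add(changed + ("#", labial))
    return sequences


toy = [("ii", "t"), ("sh", "ou", "n")]
for sequence in sorted(bilabial_assimilation_sequences(toy)):
    print(" ".join(sequence))
```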
A similar procedure was used to take account of the
instability of some of the alveolars before palatals and velars as
shown in Table I below.
Rule            PP#P            P#PP            Example
/s/ to /sh/:    oo sh # sh      sh # sh uu      (horse shoe)
/z/ to /zh/:    i zh # sh       zh # sh u@      (is sure)
/t/ to /ch/:    a ch # y        ch # y oo       (at your)
/d/ to /jh/:    i jh # y        jh # y uu       (did you)
/t/ to /k/:     ai k # k        k # k uh        (might come)
/d/ to /g/:     ii g # k        g # k l         (need cleaning)
/n/ to /ng/:    e ng # k        ng # k a        (when can)
Table I: Some of the word boundary assimilation cases considered
in the derivation of word boundary sequences.
Consideration was given to some deletion rules across word
boundaries such as the deletion of the alveolar stop in /f aa s # s p ii
ch/ (fast speech). In this case, a complete list of three-phoneme
sequences occurring word-finally was made from the Word-lexicon
where the penultimate consonant was a fricative and the final
consonant an alveolar stop. The final alveolar stop was deleted and
the resulting two-phoneme sequence was paired with all members
of #P (thus /aa s t #/ (fast) -> /aa s #/ (fast) -> /aa s # s/, fast
speech). All word boundary sequences which resulted from the
inclusion of these assimilation rules were added to the previously
derived P#PP and PP#P sequences, thus producing a total of
69,819 word boundary sequences.
2.2 Word boundary sequences excluded word-internally
We now wished to determine which word boundary
sequences do not occur word-internally (since these enable the
automatic detection of a word boundary). However, it is clear from
the phonology literature (Fudge, 1969; Clements & Keyser, 1983)
that sequential constraints on phonemes are not upheld across
many morpheme boundaries. For example, it is well documented
(Rockey, 1973) that only alveolars and palato-alveolars may follow
/au/ (town, howl, couch). But such a constraint is not upheld
word-internally across the morpheme boundary in a compound
such as cowboy, /k au b oi/. Similarly, /uu au t/ does not occur
morpheme-internally, but does occur in compounds such as
throughout. Since the Word-lexicon includes compounds,
sequences such as /uu au t/ would be considered to occur
word-internally and would therefore be excluded from the list of
phoneme sequence constraints that enable the automatic detection
of a word boundary from a string of phonemes. But this has the
unfortunate effect that a word boundary would not be inserted in
the sequence through outer, /th r uu au t @/. Since in fact we prefer
word boundaries to be inserted wherever possible, all compounds
were removed from the Word-lexicon, as a result of which
/uu au t/ would be included as a possible phoneme sequence
constraint. Consequently, we would expect a word boundary to be
inserted in both through outer and throughout. This implies
either that throughout must be stored as /th r uu # au t/ in the
lexicon which the chart-parsing strategy matches against the
phonemic string, or else that morphological rules must apply after
the phoneme sequence constraint processor to remove the medial #
in throughout.
A similar argument applies to inflectional morpheme
boundaries. For example, /n th s/ is excluded morpheme-internally
but does occur across stem/inflectional suffix boundaries (months).
For the reasons outlined above, morphological variants with
regular inflections (plurals, present and past tense suffixes) were
removed from the Word-lexicon. Excluding these inflectional
morphological variants has the (undesirable) effect that a
boundary will be inserted between /th/ and /s/ in three months time,
/th r ii m uh n th # s t ai m/. However, some inflectional
morphological rules, which apply after the phoneme sequence
constraint processor, are designed to convert these boundaries into
morpheme (M) boundaries (see section 4 below).
Finally, it is also the case that many sequences that are
excluded monomorphemically (e.g. /m ei sh/) can occur
word-internally in derived morphological variants
(/k o n f @ m ei sh @ n/, confirmation). A similar case could
be made for removing derivational variants from the Word-lexicon
and applying morphological rules to remove the # boundary from
sequences such as /k o n f @ m # ei sh @ n/ which would
result after the application of the phoneme sequence constraint
processor. However, derivational variants were not removed, in
part due to the complexity of the interaction between the
inflectional and derivational morphological rules that would have
to apply after word boundaries had been inserted automatically.
Only compounds and regular morphologically inflected
variants were removed from the Word-lexicon; henceforth, the
resulting lexicon with such entries removed will be referred to as
the Morpheme-lexicon. The Morpheme-lexicon contained around
12,000 lexical entries after these morphological variants had been
removed from the 23,000 Word-lexicon.
All word boundary sequences, including those which
account for the assimilatory processes described in 2.1, were
placed in one file and the medial word boundary symbol was
removed. After all duplicate entries had been removed, the
resulting file was matched against the Morpheme-lexicon in order
to determine which boundary sequences do not occur
'morpheme'-internally. The matching algorithm for this purpose
was a UNIX shell script running on a 12 MB Masscomp: it outputs
the frequency with which the word boundary sequences occur
word-internally in a given lexicon.
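The matching step can be sketched in a few lines. The original was a UNIX shell script reporting frequencies; the Python stand-in below only tests presence or absence, and its function names and data layout are assumptions made for illustration.

```python
def internal_trigrams(lexicon_pronunciations):
    """Collect all three-phoneme sequences found inside any pronunciation."""
    found = set()
    for p in lexicon_pronunciations:
        for i in range(len(p) - 2):
            found.add(tuple(p[i:i + 3]))
    return found


def excluded_word_internally(boundary_sequences, lexicon_pronunciations):
    """Keep the boundary sequences whose phonemes never occur lexicon-internally."""
    internal = internal_trigrams(lexicon_pronunciations)
    kept = set()
    for sequence in boundary_sequences:
        # Remove the medial boundary symbol before matching.
        stripped = tuple(s for s in sequence if s != "#")
        if stripped not in internal:
            kept.add(sequence)
    return kept


toy_lexicon = [("t", "ii", "ch", "i", "ng"), ("w", "i", "l")]
candidates = {("i", "ng", "#", "w"), ("t", "#", "ii", "ch")}
print(excluded_word_internally(candidates, toy_lexicon))
```

In this toy case /i ng w/ never occurs inside a lexical entry and so survives as a constraint, whereas /t ii ch/ occurs inside teaching and is discarded.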
2.3 The word boundary identification algorithm 
All word boundary sequences which did not occur
'morpheme'-internally were compiled into a discrimination tree in
which, working from left to right, common phonemes share
identical branches. At the end of each branch, an instruction is
included for where the boundary should be inserted if the
sequence is found in an input phonemic string (Figure 4).
[Figure 4 (schematic): MATCH TREE AGAINST PHONEME STRING
(P1 P2 P3 P4 P5 P6 P7 P8 ...) -> INSERT ANY BOUNDARIES
(P1 P2 # P3 P4 P5 P6 P7 P8 ...) -> SHIFT WINDOW ONE PHONEME
AND MATCH TREE AGAINST PHONEME STRING]
Figure 4: The process by which the tree containing the phoneme
sequence constraints is matched against a phonemic input.
In the case of /d b a/, for example, the boundary must be inserted
after /d/, since there are no entries in the Morpheme-lexicon with
final /d b/. However, since there are entries that both end in
/dh @/ and begin with /@ d/, /dh @ d/ cannot be unambiguously
parsed: in this case a '?' is inserted after the first phoneme of the
word boundary sequence. /dh ? @ d/ means, therefore, that a word
boundary occurs either after /dh/, or after /@/.
For any given input phonemic string, the algorithm
matches three phonemes at a time against the tree (Figure 4) from
left-to-right through the string. If they match, a boundary is
inserted at the appropriate place. Subsequently, the fixed window
of three phonemes shifts one phoneme to the right and the new
sequence is matched in the same way. Thus, the matching
algorithm steps through the input string one phoneme at a time
with a window width of three phonemes until the end of the string
is reached.
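The tree and the sliding window can be sketched as below. This is a hedged illustration, not the implementation used in the study: the nested-dictionary tree, the encoding of instructions (1 or 2 for a boundary after that many phonemes, '?' for the ambiguous case) and all names are assumptions.

```python
LEAF = "<instruction>"  # hypothetical key under which a branch stores its instruction


def build_tree(constraints):
    """Compile {three-phoneme tuple: instruction} into a tree with shared prefixes."""
    tree = {}
    for sequence, instruction in constraints.items():
        node = tree
        for phoneme in sequence:
            node = node.setdefault(phoneme, {})
        node[LEAF] = instruction
    return tree


def lookup(tree, window):
    """Follow the window down the tree; return the instruction or None."""
    node = tree
    for phoneme in window:
        node = node.get(phoneme)
        if node is None:
            return None
    return node.get(LEAF)


def insert_boundaries(phonemes, tree):
    """Slide a three-phoneme window over the string and insert '#' or '?'."""
    marks = {}  # index of the phoneme that the symbol precedes -> symbol
    for i in range(len(phonemes) - 2):
        instruction = lookup(tree, phonemes[i:i + 3])
        if instruction == "?":
            marks.setdefault(i + 1, "?")   # never overwrite a definite '#'
        elif instruction is not None:
            marks[i + instruction] = "#"
    output = []
    for i, phoneme in enumerate(phonemes):
        if i in marks:
            output.append(marks[i])
        output.append(phoneme)
    return output


tree = build_tree({("i", "ng", "w"): 2, ("dh", "@", "d"): "?"})
print(" ".join(insert_boundaries("t ii ch i ng w i l".split(), tree)))
```

A definite boundary is allowed to override a '?' at the same position, which is one possible design choice; the text does not state how such collisions were handled.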
Phonemic transcriptions (excluding stress or boundary
symbols) were made by a trained phonetician of 145 sentences
produced by one RP speaker. The average numbers of words per
utterance and phonemes per word were 10.73 and 4.04
respectively. The sentences were taken from a 'phonemically
balanced' passage constructed for the speech recognition project at
Edinburgh University; sentences from Section H of the
Lancaster-Oslo-Bergen corpus (Johannson, Leech and Goodluck,
1978); and sentences from a corpus of business dictation collected
at CSTR. The transcribed sentences, which clearly do not contain
any errors that could have arisen as a result of phonetic processing
of the acoustic waveform by a speech recogniser, were input to the
algorithm schematically outlined in Figure 4.
3. Results I 
The statistics on the automatically inserted # boundaries
are shown in Table II.
Target number of word boundaries                 1411
Total number of inserted # boundaries             592
  # correctly inserted                            523
  Remainder                                        69
    Reduced forms not accounted for                14
    Lexical items not accounted for                 7
    Corresponding to morpheme boundaries           44
Table II: Word boundaries automatically inserted in the 145 
phonemically transcribed utterances. 
The results show that 523/1411 (37%) of the target word
boundaries were correctly detected. However, there were 69
automatically inserted # boundaries which did not correspond to
word boundaries in the original utterances. Of these, 14 were
incorrectly inserted because of the presence of reduced
phonological forms in the utterances (e.g. /w @ dh/ for with) which
we had failed to generate by rule; and 7 were inserted because
some words occurred in the utterances that had not been included
in the Word-lexicon (most of these were proper names). 44 #
boundaries were inserted at morpheme boundaries, both in
compounds (/h au # e v @/ for however) and preceding inflectional
suffixes (/s ii m # z/ for seems). In the next section, some
morphology rules are described which attempt to convert the # at
stem/suffix boundaries in cases such as /s ii m # z/ into morpheme
boundaries. Finally, 244 '?' were inserted at appropriate points
(i.e. for each /P?QR/, where /PQR/ are phonemes, either /P#QR/ or
/PQ#R/ occurred in the original utterances). The next section also
describes rules for converting some of these '?' boundaries into
definite # boundaries.
4. Method II 
4.1 Morphology rules 
The phonemic strings with the word boundaries inserted 
by the matching algorithm in Figure 4 are input to a second stage 
of processing which uses four additional sources of knowledge: 
PHON1 and PHON2 (a list of all one and two phoneme words in 
the Morphology-lexicon) and #PP and PP# (a list of all legal 
word-initial and word-final two phoneme sequences). Since these 
data are extracted from the Morphology-lexicon, they take 
account of phonologically reduced variants, but not the 
morphological variants that were excluded from the Word-lexicon. 
The morphology rules test whether the two phonemes that
occur to the right of an automatically inserted # are legal with
respect to PHON1, PHON2, #PP and PP#. If they are not, the
assumption is made that the # occurs across a stem/inflectional
morpheme boundary. Morphological rules are then applied to shift
the # to the correct place, if possible. Consider for example, the
phrase boys and girls in... which, after the application of the first
stage of processing, was analysed as:
(2) b oi # z a n ? g @@ l # z i n
The insertion of the word boundaries at this first stage of
processing is attributable to the fact that neither /b oi z/ nor
/g @@ l z/ occurred in the Morphology-lexicon. Furthermore,
since there are no words that begin with /oi z/ nor /l z/, the relevant
sequences would be stored as /b oi # z/ and /@@ l # z/ in the tree in
Figure 4. The following test is now performed on the two phonemes
to the right of the first # in (2):
(3) If /z a/ is not in #PP rewrite /oi # z a/ as /oi M z # a/
    else rewrite /oi # z a/ as /oi M? z a/.
Informally, (3) states that if /z a/ cannot begin words (according to
the Morphology-lexicon), /z/ must be an inflectional suffix of the
previous word: therefore place an 'M' (morpheme boundary) before
/z/ and shift the # symbol to the right of /z/. Alternatively, if /z a/
does begin words in the Morphology-lexicon, it is impossible to
determine whether /z/ is a plural suffix or the first phoneme of a
following word. In this case, M? is used to denote these two
possibilities: it is an abbreviation for either /oi M z # a/ or /oi # z
a/. In fact, since there are no words that begin with /z a/, (2) is
analysed as /oi M z # a/. A solution with M? would occur if boys are
were analysed at the first stage of processing as:
(4) b oi # z aa
since in this case /z/ can also be the first phoneme of a word (Czar).
A test is often performed with respect to PHON1 and/or
PHON2 rather than #PP. This occurs in the following example, in
which two # symbols have been automatically inserted in close
proximity at the first stage of processing:
(5) b i g i n # z @ # t ai p # (begins a type)
In this case, a test is made to determine whether /z @/ occurs in
PHON2 (i.e. whether it is a two-phoneme word). Since it is not,
(5) is reanalysed as /b i g i n M z # @ # t ai p #/.
The test in (3) above is only made if the structural
description of phonemes to the left and right of the # is met by
certain conditions. Specifically, the test is performed in contexts
such as those given in Table III.
PAST TENSE
  {p, k, f, th, s, sh} # t                        (tapped, missed, wished)
  voiced phonemes excluding /d/ # d               (paved, seemed)
PLURALS/PRESENT TENSE
  {p, t, k, f, th} # s                            (mats, picks, meets)
  voiced phonemes excluding /z, zh, jh/ # z       (tabs, sings)
Table III: Some of the contexts in which the morphology rules
apply.
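The rewrite in (3) can be sketched as a single token-level operation. The sketch below covers only the #PP variant of the test; it omits the PHON1/PHON2 variant used in (5) and the context checks of Table III, and the function name `apply_suffix_rule` and its arguments are hypothetical.

```python
def apply_suffix_rule(tokens, index, initial_pairs):
    """Apply rule (3) at tokens[index], which must be '#'.

    tokens[index + 1] is the candidate suffix phoneme and tokens[index + 2]
    the phoneme after it. `initial_pairs` plays the role of #PP.
    """
    suffix, following = tokens[index + 1], tokens[index + 2]
    before, after = tokens[:index], tokens[index + 3:]
    if (suffix, following) not in initial_pairs:
        # The pair cannot begin a word: the suffix belongs to the previous word.
        return before + ["M", suffix, "#", following] + after
    # The pair can begin a word: leave both readings open with M?.
    return before + ["M?", suffix, following] + after


initial_pairs = {("z", "aa")}  # toy #PP set, assumed for illustration
print(" ".join(apply_suffix_rule("b oi # z a n".split(), 2, initial_pairs)))
print(" ".join(apply_suffix_rule("b oi # z aa".split(), 2, initial_pairs)))
```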
4.2 Resolving Ambiguities 
The four sets of data PHON1, PHON2, #PP and PP# are 
also used to convert some '?' symbols into definite (#) word 
boundaries. In order to resolve the hypothetical ambiguity 
/ABC?DEF/, for example, it is first expanded into the two possible 
cases it represents in (7) and (8) below: 
(7) ABC#DEF 
(8) ABCD#EF 
An attempt is then made to prove that either (7) or (8) is illegal (on 
the basis that, if (7) is illegal, ABC?DEF must correspond to the 
representation in (8) and vice-versa). (7) can be proved illegal if (9) 
is true: 
(9) Either C is not in PHON1 and BC is not in PP# 
Or D is not in PHON1 and DE is not in #PP 
An informal interpretation of (9) is the following. If C is not a
one-phoneme word, test whether BC is a legal two-phoneme
sequence that can end words; if C is not a one-phoneme word and
BC cannot end words, then (7) must be illegal. Otherwise, if (7)
cannot be shown to be illegal on the basis of the phonemes that
precede #, the phonemes that follow # are considered. In this case
if D is not a one-phoneme word and if DE cannot begin a word, (7)
must be illegal. Otherwise, (7) cannot be shown to be illegal and so
the following (similar) test is applied to (8):
(10) (8) is illegal if:
Either D is not in PHON1 and CD is not in PP#
Or E is not in PHON1 and EF is not in #PP.
If neither (7) nor (8) can be proved illegal, the '?' cannot be resolved
into #.
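Tests (9) and (10) share one shape and can be sketched as a single predicate applied to each candidate split. The function names, argument names and the toy data sets below are assumptions for illustration; only the logic of the two tests is taken from the text.

```python
def split_is_illegal(left, right, phon1, final_pairs, initial_pairs):
    """Tests (9)/(10): `left` and `right` are the phonemes on either side of '#'."""
    left_bad = (left[-1] not in phon1
                and tuple(left[-2:]) not in final_pairs)
    right_bad = (right[0] not in phon1
                 and tuple(right[:2]) not in initial_pairs)
    return left_bad or right_bad


def resolve(phonemes, q, phon1, final_pairs, initial_pairs):
    """Resolve a '?' standing before phonemes[q].

    The two readings place '#' before phonemes[q] or before phonemes[q + 1].
    Returns the surviving split index, or None if the '?' cannot be resolved.
    """
    legal = [c for c in (q, q + 1)
             if not split_is_illegal(phonemes[:c], phonemes[c:],
                                     phon1, final_pairs, initial_pairs)]
    return legal[0] if len(legal) == 1 else None


# ABC?DEF with toy sets in which only BC can end a word and only DE can begin one.
phonemes = list("ABCDEF")
print(resolve(phonemes, 3, set(), {("B", "C")}, {("D", "E")}))
```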
When two '?' symbols occur in close proximity, an
expansion is made into four alternatives. If three of the
alternatives can be proved illegal, both '?' symbols can be resolved
as definite # symbols. For example, after the first stage of
processing, measuring the gun was analysed as:
(11) /m e zh r i ng # dh ? @ ? g uh n/
This expands into the following alternatives:
(12) /m e zh r i ng # dh # @ # g uh n/
(13) /m e zh r i ng # dh # @ g # uh n/
(14) /m e zh r i ng # dh @ # # g uh n/
(15) /m e zh r i ng # dh @ # g # uh n/
(12) and (13) must be illegal since /dh/ is not a one-phoneme word
((13) is additionally illegal since /@ g/ is not a possible
two-phoneme word). (15) is illegal since /g/ is not a one-phoneme
word. Therefore (14) is the only possible analysis of (11).
This type of expansion into four possibilities is only made
when 3 phonemes, or fewer, occur between the two '?' symbols: if
more than three phonemes intervene, the result of resolving both
'?' symbols together is the same as if each '?' symbol were considered
separately.
Finally, the example with two '?' symbols in (11) is
extended to the general case in which n '?' symbols occur in close
proximity to one another (i.e. a series of n '?' symbols with 3, or
fewer, phonemes between successive '?' symbols). These expand
into 2^n alternatives. As in the example above, if 2^n - 1 alternatives
can be proved illegal, all n '?' symbols can be converted to #
symbols.
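The expansion into 2^n alternatives, followed by elimination, can be sketched as below. The legality check is deliberately simplified: it only requires that one- and two-phoneme stretches enclosed by two boundaries be known words, and it ignores the #PP and PP# tests. All names and the toy word sets are assumptions for illustration.

```python
from itertools import product


def expand(tokens):
    """Return the 2**n readings of a token list containing n '?' symbols.

    A '?' before a phoneme stands for a boundary either before or after
    that phoneme. Coinciding boundaries are merged into a single '#'.
    """
    phonemes, fixed, open_positions = [], set(), []
    for token in tokens:
        if token == "#":
            fixed.add(len(phonemes))
        elif token == "?":
            open_positions.append(len(phonemes))
        else:
            phonemes.append(token)
    readings = []
    for shifts in product((0, 1), repeat=len(open_positions)):
        cuts = fixed | {q + s for q, s in zip(open_positions, shifts)}
        reading = []
        for i, phoneme in enumerate(phonemes):
            if i in cuts:
                reading.append("#")
            reading.append(phoneme)
        if len(phonemes) in cuts:
            reading.append("#")
        readings.append(reading)
    return readings


def is_legal(reading, phon1, phon2):
    """Simplified check on short words enclosed by a '#' on both sides."""
    segments = " ".join(reading).split("#")
    for segment in segments[1:-1]:
        word = tuple(segment.split())
        if len(word) == 1 and word[0] not in phon1:
            return False
        if len(word) == 2 and word not in phon2:
            return False
    return True


readings = expand("# dh ? @ ? g uh n".split())
survivors = [r for r in readings if is_legal(r, {"@"}, {("dh", "@")})]
print([" ".join(r) for r in survivors])
```

Applied to the tail of (11), the four readings correspond to (12) to (15) in order, and only the reading corresponding to (14) survives the simplified check.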
4.3 Order of rules 
After the application of the first stage of the word
boundary insertion rules, expansion rules apply in which each '?'
symbol is expanded into two alternatives. The morphology rules
apply to each of these expanded alternatives and at all other
points in the utterance at which their structural description is met.
Only after the morphology rules have applied can any of the
alternatives be eliminated. The morphology rules must apply before
eliminating alternatives, otherwise some alternatives might be
incorrectly eliminated. This can be illustrated with the example
boys and girls which, after the first stage of processing, was
analysed as /b oi # z a n ? g @@ l # z/. This expands into:
(16) b oi # z a n # g @@ l # z
(17) b oi # z a n g # @@ l # z
If the elimination rules applied prior to morphological rules, both
(16) and (17) would be eliminated, since /z a/ is not in #PP (and
(17) is illegal since /n g/ is not in PP#). If, on the other hand, the
morphology rules apply first, (18) and (19) would be derived from
(16) and (17) respectively:
(18) b oi M z # a n # g @@ l M z #
(19) b oi M z # a n g # @@ l M z #
Only (19) would be eliminated, on the grounds that /n g/ is not a
legal two-phoneme sequence occurring word-finally.
A further illustration of the interaction between the
expansion rules, morphological rules and elimination of
alternatives is shown in (20 - 33) below. After the first stage of
processing, months tie (from a sentence in a gardening manual,
'after a few months, tie in more growth') was analysed as /m uh n
th ? s t ? ai/. This expands to four alternatives:
(20) m uh n th # s t # ai i n
(21) m uh n th # s t ai # i n
(22) m uh n th s # t # ai i n
(23) m uh n th s # t ai # i n
Morphology rules are applied to the four alternatives:
(24) m uh n th M s # t # ai i n
(25) m uh n th M? s t ai # i n
(26) m uh n th s M t # ai i n
(27) m uh n th s M? t ai # i n
(25) and (27) are further expanded into the two alternatives they
represent. This gives a total of 6 alternatives:
(28) m uh n th M s # t # ai i n (from (24))
(29) m uh n th M s # t ai # i n (from (25))
(30) m uh n th # s t ai # i n (from (25))
(31) m uh n th s M t # ai i n (from (26))
(32) m uh n th s M t # ai # i n (from (27))
(33) m uh n th s # t ai # i n (from (27))
In eliminating the alternatives, a slight modification has to be
made to the rules: rather than referring to two segments to the left
and right of #, they refer to the two segments to the left of an M
symbol (if present) and to two segments to the right of #. But the
segments that intervene between an M and # are ignored. The
following test would therefore be made to test the legality of (29):
(34) (29) is illegal if:
Either /th/ is not in PHON1 and /n th/ is not in PP#
Or /t ai/ is not in PHON2
It is possible to eliminate (28) since /t/ is not in PHON1. (31), (32)
and (33) can be eliminated since /th s/ does not occur in PP# (final
/th s/ occurring only across a stem/inflectional suffix boundary).
(29) and (30) remain, and are collapsed into one representation in
(35) using the M? notation:
(35) m uh n th M? s t ai # i n
The analysis shows therefore that /m uh n th ? s t ? ai/
corresponds to either months tie in or month sty in.
5. Results II 
The statistics on the automatically inserted # boundaries 
are shown in Table IV. 
Target number of word boundaries                 1411
Total number of inserted # boundaries             690
  # correctly inserted                            645
  Remainder                                        45
    Reduced forms not accounted for                14
    Lexical items not accounted for                10
    Corresponding to morpheme boundaries           21
Table IV: Word boundaries automatically inserted after the
application of the morphology, expansion and elimination rules.
;The results show that 645/1411 (45.7%) of the target word 
boundaries were correctly detected. This is an increase of around 
9% compared with the result obtained prior to the application of 
the rules described in the preceding section. 24 # boundaries were 
inserted at inappropriate points, either because of the presence of 
229 
because of the presence of reduced forms in the utterances that we 
had not derived by rule, or because of lexical items that had not 
been included in the word-lexicon. All 21 inserted # symbols that 
corresponded to morpheme boundaries were inserted medially in 
compounds (e.g. how#ever, there#fore), while all automatically 
inserted # symbols that had occurred at stem/inflectional suffix 
boundaries (/s ii m # z/ for seems) were converted to M or M?
symbols using the morphology rules described above. 
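The counts in Table IV can be cross-checked with a few lines of Python. The variable names are illustrative only. The numbers are those reported in the table and in the paragraph above.

```python
# Counts taken from Table IV.
target_boundaries = 1411
inserted = 690
correct = 645
reduced_forms = 14
lexical_items = 10
morpheme_boundaries = 21

# Inserted boundaries that do not coincide with a target word boundary.
remainder = inserted - correct

# The remainder should be fully accounted for by the three sub-categories.
assert remainder == reduced_forms + lexical_items + morpheme_boundaries

# Boundaries inserted at inappropriate points exclude the compound-internal
# (morpheme boundary) cases.
inappropriate = reduced_forms + lexical_items

# Percentage of target word boundaries correctly detected.
pct_correct = round(100 * correct / target_boundaries, 1)

print(remainder, inappropriate, pct_correct)
```

The remainder of 45 splits into 24 inappropriate insertions and 21 compound-internal boundaries, and 645 out of 1411 rounds to 45.7%.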
An approximate measure of the probability of a word 
boundary being incorrectly inserted can be made as follows. 
Firstly, since it was our intention that the algorithm should insert 
# symbols not only between words but also within compounds, the 
target number of boundaries to be identified can be considered to 
be 1411 (the number of word boundaries in the utterances) plus 78 
(the number of boundaries occurring within compounds), i.e. 1489.
Of these (see Table IV), 645 + 21 = 666 (44.7%) were correctly 
inserted. The probability of a word boundary being incorrectly 
inserted, either as a result of a reduced form which was not 
derived by rule, or because of the omission of a word from the 
Word-lexicon, is given by: 
(36) (24/(666 + 24) x 100)% = 3.5%.
6. Discussion
This study has shown that around 45% of all word 
boundaries can be correctly identified from a knowledge of 
three-phoneme sequences that occur across word boundaries but 
which do not occur word-internally together with a knowledge of 
one- and two-phoneme words and all two-phoneme sequences that 
can begin and end words. The result is based on 
hand-transcriptions which can be considered analogous to the 
phonemic strings that would be extracted automatically from the 
acoustic speech signal if the recogniser made no errors in this 
derivation. 
A current area of investigation is to identify the set of 
phoneme sequences which occur neither across a word boundary
nor word-internally. Such phoneme sequences can be easily 
obtained from the data sets discussed in this paper and they would 
enable errors to be detected in the acoustic-phonetic stage of 
processing in a continuous speech recogniser. Some examples of 
these sequences are given in (37): 
(37) /l z ng/, /aa dh l/, /e w n/
For example, /e w n/ must be illegal since it does not occur
word-internally and because it does not occur across word
boundaries (both /e # w n/ and /e w # n/ must be ruled out on the
grounds that there are no words which end in /e/ or /e w/). The
incorporation of this kind of knowledge would enable an error to be
detected if such a sequence were derived automatically after the
acoustic-phonetic stage of processing. 
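The reasoning applied to /e w n/ can be sketched as a small Python check. This is an illustration under stated assumptions, not the procedure used in the study: the six-word lexicon is a hypothetical toy sample transcribed in the alphabet of this paper, and the function names are invented for the example. A three-phoneme sequence is treated as possible if it occurs word-internally or if it can be split by one or two word boundaries in a way that the lexicon allows.

```python
from typing import Iterable, Sequence, Set, Tuple

Phonemes = Tuple[str, ...]

# Hypothetical toy lexicon: head, win, name, the, now, man.
TOY_LEXICON = [
    ("h", "e", "d"), ("w", "i", "n"), ("n", "ei", "m"),
    ("dh", "@"), ("n", "au"), ("m", "a", "n"),
]


def build_tables(lexicon: Iterable[Sequence[str]]):
    """Collect word-internal trigrams, word-initial and word-final
    sequences of length 1 and 2, and the set of whole words."""
    internal: Set[Phonemes] = set()
    initial: Set[Phonemes] = set()
    final: Set[Phonemes] = set()
    words: Set[Phonemes] = set()
    for word in lexicon:
        w = tuple(word)
        words.add(w)
        for i in range(len(w) - 2):
            internal.add(w[i:i + 3])
        for n in (1, 2):
            if len(w) >= n:
                initial.add(w[:n])
                final.add(w[-n:])
    return internal, initial, final, words


def is_possible(trigram: Phonemes, internal, initial, final, words) -> bool:
    """Return True if the trigram can occur word-internally or across
    word boundaries according to the tables."""
    a, b, c = trigram
    if trigram in internal:                      # a b c inside one word
        return True
    if (a,) in final and (b, c) in initial:      # a # b c
        return True
    if (a, b) in final and (c,) in initial:      # a b # c
        return True
    # a # b # c, where b is itself a one-phoneme word
    return (a,) in final and (b,) in words and (c,) in initial


INTERNAL, INITIAL, FINAL, WORDS = build_tables(TOY_LEXICON)
E_W_N_POSSIBLE = is_possible(("e", "w", "n"), INTERNAL, INITIAL, FINAL, WORDS)
```

With the toy lexicon, /e w n/ is rejected because no word ends in /e/ or /e w/, whereas /d w i/ is accepted as the junction of head and win. A recogniser could flag any rejected sequence in its phonemic output as a probable acoustic-phonetic error.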
8. Notes
1 The CSTR Machine Readable Phonemic Alphabet for RP
used in this paper is shown below:
/p/   pea        /f/   fan        /l/   lee
/b/   bead       /v/   van        /r/   road
/t/   tea        /th/  think      /w/   win
/d/   day        /dh/  then       /y/   you
/k/   key        /s/   sing       /m/   man
/g/   gay        /z/   zoo        /n/   name
/ch/  chew       /sh/  shoe       /ng/  sing
/jh/  judge      /zh/  measure    /h/   hat
/ii/  we         /o/   hot        /ei/  stay
/i/   hit        /oo/  saw        /ai/  sigh
/e/   head       /u/   could      /oi/  toy
/a/   had        /uu/  who        /au/  now
/aa/  hard       /@/   the        /ou/  go
/i@/  here       /u@/  sure       /e@/  there
/@@/  first
This research was supported by SERC grant number GR/D29628 
and is part of an Alvey funded project in continuous speech 
recognition. Our thanks to John Laver and Briony Williams for 
many helpful comments in the preparation of this manuscript. 

References 

Clements G.N. & Keyser S.J. (1983) CV Phonology. A Generative
Theory of the Syllable. MIT Press: Cambridge Mass. 

Cutting D. & Harrington J.M. (1986) Phongram: a phonological 
rule interpreter. In (Lawrence R. ed.) Proceedings of the Institute of 
Acoustics, 8, 461-469. Institute of Acoustics: Edinburgh.

Fudge E.C. (1969) Syllables. Journal of Linguistics 5, 253-286.

Gimson A.C. (1984) English Pronouncing Dictionary (Revised 
edition, originally compiled by D. Jones). Dent: London. 

Harrington J.M. & Johnstone A. (1988, in press) The effects of 
equivalence classes on parsing phonemes into words in continuous
speech recognition. Computer Speech & Language. 

Harrington J.M., Johnson I. & Cooper M. (1987) The application of
phoneme sequence constraints to word boundary identification in
automatic, continuous speech recognition. In (Laver J. & Jack M.
eds.) European Conference on Speech Technology, Vol. 1, 163-166. 

Harrington J.M., Laver J. & Cutting D. (1986) Word-structure 
reduction rules in automatic, continuous speech recognition. In 
Proceedings of the Institute of Acoustics (R. Lawrence ed.) 8, 
451-460. Institute of Acoustics: Edinburgh. 

Johannson S., Leech G.N. & Goodluck H. (1978) The
Lancaster-Oslo/Bergen Corpus of British English. Department of
English, Oslo University. 

Lamel L. & Zue V.W. (1984) Properties of consonant sequences
within words and across word boundaries. Proceedings ICASSP
42.3.1 - 42.3.4. 

Rockey D. (1973) Phonetic Lexicon. Heyden: Oxford.
