DEFAULT FINITE STATE MACHINES 
AND FINITE STATE PHONOLOGY 
Gerald Penn 
Computational Linguistics Program 
Carnegie Mellon University 
Pittsburgh, PA 15213 
Internet: penn@lcl.cmu.edu 
Richmond Thomason 
Intelligent Systems Program 
University of Pittsburgh 
Pittsburgh, PA 15260 
Internet: thomason@isp.pitt.edu
Abstract 
We propose DFSM's as an extension of finite state 
machines, explore some of their properties, and in- 
dicate how they can be used to formalize naturally 
occurring linguistic systems. We feel that this
implementation of two-level rules may be more
linguistically natural and easier to work with
computationally. We provide complexity results that
shed light on the computational situation. 
INTRODUCTION 
Two-level phonology combines the computational 
advantages of finite state technology with a for- 
malism that permits phenomena to be described 
with familiar-looking rules. The problem with 
such a scheme is that, in practice, the finite state 
machines (FSM's) can grow too large to be man- 
ageable; one wants to describe them and to run 
them without having to deal with them directly. 
The KIMMO approach¹ seeks to achieve this by
(1) decomposing the computational process into
a battery of parallel finite state machines and
(2) compiling rules (which notationally resemble
familiar phonological rules, but which are
interpreted declaratively) into these parallel finite
state implementations. But the KIMMO formalism
unfortunately gains no tractability in the process
of compilation. Moreover, the compiler is com- 
plex enough to create software engineering prob- 
lems, and this has led to practical difficulties, 
which in turn have made the KIMMO technol- 
ogy less generally available than one might wish. 
Here, we describe a different finite-state
foundation for two-level rules, involving generalizations
of FSM's which we call Default Finite State Ma- 
chines (DFSM's). Whether or not this approach 
remains intractable after compilation is an open 
question; but even without compilation, we be- 
lieve that it has some conceptual advantages as 
well. 
¹ See the discussion and references in (Sproat, 1992).
DFSM's extend FSM's (specifically, finite- 
state transducers) so that transitions can be 
context-sensitive, and enforce a preference for the 
maximally specific transitions. The first change 
allows phonological rules to appear as labels of 
transition arcs in transducers; the second change 
incorporates the elsewhere condition into the
computational model.² DFSM's can be implemented
directly, although there may be a method to com- 
pile them into a more efficient machine. We be- 
lieve that either approach will be feasible for re- 
alistic linguistic applications (though, of course, 
not in the theoretically worst case). In particular,
the direct implementation of DFSM's is very
straightforward; no rule compiler is needed, since 
rules are labels on the arcs of the machines them- 
selves. This implementation may not provide an 
optimal allocation of space and time usage at run 
time, but we believe that it will be adequate for 
testing and research purposes. 
This presentation of DFSM's is confined to 
defining the basic ideas, presenting some examples
of linguistic description, and providing a partial
complexity analysis. In later work, we hope
to explore descriptive and implementational issues 
further. 
NOTATIONAL PRELIMINARIES 
We assume an alphabet $\mathcal{L}$, with a reserved symbol
$0 \notin \mathcal{L}$ for insertions and deletions. A replacement
over $\mathcal{L}$ is a pair of the form $\boldsymbol{l} = (l, l')$ where (1)
$l \in \mathcal{L}$ and (2) $l' \in \mathcal{L}$ or $l' = 0$; Replacements$_\mathcal{L}$
is the set of replacements over $\mathcal{L}$. US-strings$_\mathcal{L}$ is
the set of strings over the set $\mathcal{L}^2 \cup [\mathcal{L} \times \{0\}]$ of
replacements.
² The elsewhere condition is built into an implementation
(due to Karttunen) of the TWOL rule compiler;
see (Dalrymple et al., 1993), pp. 28-32. But on this
approach, default reasoning and the elsewhere condition
are not employed at a level of computation that is
theoretically modeled; this reasoning is simply a
convenient feature of the code that translates rules into
finite state automata.
We use roman letters to denote themselves:
for instance, 'l' denotes the letter l. Boldface
letters denote constant replacements: for instance, $\mathbf{l}$
is the pair $(l, l)$. Moreover, $\epsilon$ is the empty string
over $\mathcal{L}$, and $\boldsymbol{\epsilon}$ is the empty string over the
$\mathcal{L}$ replacements. When the name of a subset of $\mathcal{L}$ (e.g.
C) is written in boldface (e.g. $\mathbf{C}$), the set of identity
pairings is intended (e.g., $\mathbf{C} = \{l{:}l \mid l \in C\}$).
We use ordinary italics as variables over letters,
and boldface italics as variables over replacements
and strings of replacements. Ordinarily, we
will use $\boldsymbol{l}$ for replacements and $\boldsymbol{x}$, $\boldsymbol{y}$ for strings of
replacements. Finally, we use '$l{:}l'$' for the pair $(l, l')$.
Where $\boldsymbol{x} \in$ US-strings$_\mathcal{L}$, U-String($\boldsymbol{x}$) is the
underlying projection of $\boldsymbol{x}$, and S-String($\boldsymbol{x}$) is its
surface projection. That is, if $\boldsymbol{x} = (x, x')$, then
U-String($\boldsymbol{x}$) $= x$ and S-String($\boldsymbol{x}$) $= x'$.
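The notation above can be made concrete with a small sketch. It assumes a US-string is stored as a tuple of (underlying, surface) pairs, with the character "0" as the reserved null symbol. The function names are illustrative choices and are not taken from the paper.

```python
# A replacement is a pair (underlying, surface); "0" is the reserved null symbol.
NULL = "0"

def us_string(underlying, surface):
    """Pair two equal-length symbol sequences into a US-string."""
    if len(underlying) != len(surface):
        raise ValueError("underlying and surface must have equal length")
    return tuple(zip(underlying, surface))

def u_string(x):
    """Underlying projection of a US-string."""
    return "".join(u for u, _ in x)

def s_string(x):
    """Surface projection of a US-string."""
    return "".join(s for _, s in x)

def constant(letters):
    """Boldface notation: the set of identity pairings l:l."""
    return {(l, l) for l in letters}

# The pairing of #kiss+s# with 0kisses0 used later in the paper.
x = us_string("#kiss+s#", "0kisses0")
```

Under this representation the sixth replacement of `x` is the pair `("+", "e")`, i.e. +:e.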
RULE NOTATION AND 
EXAMPLES 
The rules with which we are concerned are like 
the rewrite rules of generative phonology; they are 
general, context-conditioned replacements. That 
is, a rule allows a replacement if (1) the replace- 
ment belongs to a certain type, and (2) the sur- 
rounding context meets certain constraints. 
If we represent the contextual constraints ex- 
tensionally, as sets of strings, a rule will consist of 
three things: a replacement type, and two sets of 
US-Strings. Thus, we can think of a rule as a triple 
(X, Y, F), where X and Y are sets of US-strings. 
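Imagine that we are given a replacement instance $\boldsymbol{l}$
in a context $(\boldsymbol{x}, \boldsymbol{y})$, where $\boldsymbol{x}$ and $\boldsymbol{y}$ are US-strings.
This contextualized replacement $(\boldsymbol{x}, \boldsymbol{l}, \boldsymbol{y})$ satisfies
the rule if $\boldsymbol{x} \in X$, $\boldsymbol{y} \in Y$, and $\boldsymbol{l} \in F$.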
For linguistic and computational purposes,
the sets that figure in rules must somehow be
finitely represented. The KIMMO tradition uses
regular sets, which can of course be represented
by regular expressions, for this purpose. We have
not been able to convince ourselves that regular
sets are needed in phonological applications.³
Instead, we make the stronger assumption that
contexts can be encoded by finite sets of strings. A
string satisfies such a context when one end of it
matches one of the strings in this set: the right end
for a left context, the left end for a right context.
(Note that satisfaction is not the same as
membership; infinitely many strings can satisfy a finite
set of strings.) Assuming a finite alphabet, all
replacement types will be finite sets. With these
assumptions, a rule can be finitely encoded as a
pair $((X, Y), F)$, where the sets $X$ and $Y$ are
finite, and $F$ is a replacement type.
³ The issue here is whether there are any linguistically
plausible or well-motivated applications of the
Kleene star in stating phonological rules. For instance,
take the English rule that replaces "e by 0 after a
morpheme boundary preceded by one or more consonants
preceded by a vowel." You could represent
the context in question with the regular expression
VC*C; but you could equally well use VC | VCC |
VCCC | VCCCC. The only way to distinguish the two
rule formulations is by considering strings that violate
the phonotactic constraints of English; but as far
as we can see, there are no intuitions about the results
of applying English rules to underlying strings
like typppppe+ed. We do not question the usefulness
of regular expressions in many computational applications,
but are not convinced that they are needed in
a linguistic setting. We would be interested to see a
well motivated case in which the Kleene star is linguistically
indispensable in formulating a two-level phonological
rule. Such a case would create problems for the
approach that we adopt here.
Rule encodings, rule applicability and satisfaction
are illustrated by the rule examples given
below. The ideas are further formalized in the
next section.
Language:
Let $\mathcal{L}$ = {a, b, ..., z, +, #, ', `}
Declare the following subsets of $\mathcal{L}$:
C = {b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, w,
x, y, z}
Csib = {s, x, z}
Example rules:
Example 1
Rule encoding: $((\{\boldsymbol{\epsilon}\}, \{\boldsymbol{\epsilon}\}), \{(+, 0)\})$
Rule notation: + → 0 / _
Rule description: Delete +.
Example 2
Rule encoding: $((\mathbf{C}, \{(+, 0)\}), \{(y, i)\})$
Rule notation: y → i / C _ +:0
Rule description: Replace y by i before a morpheme
boundary and after a constant US-consonant,
i.e. after $(l, l)$, where $l \in C$.
Example 3
Rule encoding: $((\{\mathbf{sh}\}, \{\mathbf{l}^\frown(\#, 0) \mid l \in C_{sib}\}), \{(+, e)\})$
Rule notation: + → e / sh _ Csib #:0
Rule description: Replace + with e after sh and
before a suffix in Csib.
Example rule applications:
1. The rule encoded in Example 1 is satisfied by
(+,0) in the context ($\mathbf{cat}$, $\mathbf{s}$) because (1) for
some $\boldsymbol{x}$, $\mathbf{cat} = \boldsymbol{x}^\frown\boldsymbol{\epsilon}$, (2) for some $\boldsymbol{y}$, $\mathbf{s} = \boldsymbol{\epsilon}^\frown\boldsymbol{y}$,
and (3) $(+, 0) \in \{(+, 0)\}$.
2. The rule encoded in Example 2 is not satisfied
by (y,i) in the context (spot +:t, +:0 ness)
because there is no $\boldsymbol{x}$ such that spot +:t $= \boldsymbol{x}^\frown\mathbf{l}$,
where $l \in C$.
3. The rule encoded in Example 3 is not satisfied
by (+,0) in the context (ash, s #:0). In
fact, the context is satisfied: (1) $\mathbf{ash} = \boldsymbol{x}^\frown\mathbf{sh}$
for some $\boldsymbol{x}$ and (2) s #:0 $= \mathbf{l}^\frown(\#, 0)^\frown\boldsymbol{y}$ for some
$l \in C_{sib}$ and some $\boldsymbol{y}$. (3.1) Moreover, the underlying symbol of
the replacement (namely, +) matches the argument
of the rule's replacement function. Under
these circumstances, we will say that the
rule is applicable. But the rule is not satisfied,
because (3.2) the surface symbol of the replacement
(namely, 0) does not match the value
of the rule's replacement function (namely, e):
thus, $(+, 0) \notin \{(+, e)\}$.
INDEXED STRINGS AND RULES 
We now restate the above ideas in the form of 
formal definitions. 
Definition 1. Context type. 
A context type is a pair $\mathcal{C} = (X, Y)$, where $X$
and $Y$ are sets of US-Strings.
Definition 2. Indexed US-strings. 
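An indexed US-String over $\mathcal{L}$ is a triple
$(\boldsymbol{x}, \boldsymbol{l}, \boldsymbol{y})$, where $\boldsymbol{x}, \boldsymbol{y} \in$ US-strings$_\mathcal{L}$ and $\boldsymbol{l} \in$
Replacements$_\mathcal{L}$.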
An indexed US-string is a presentation of a 
nonempty US-string that divides the string into 
three components: (1) a replacement occurring in 
the string, (2) the material to the left of that re- 
placement, and (3) the material to the right of it. 
Where $(\boldsymbol{x}, \boldsymbol{l}, \boldsymbol{y})$ is an indexed string, we call $\boldsymbol{x}$ the
left context of the string, $\boldsymbol{y}$ the right context of
the string, and $\boldsymbol{l}$ the designated replacement of the
string.
A rule licenses certain sorts of replacements 
in designated sorts of environments, or context 
types. For instance, we may be interested in the 
environment after a consonant and before a mor- 
pheme boundary. Here, the phrase "after a con- 
sonant" amounts to saying that the string before 
the replacement must end in a consonant, and the 
phrase "before a morpheme boundary" says that 
the string after the replacement must begin with a
morpheme boundary. Thus, we can think of a
context type as a pair of constraints, one on the 
US-string to the left of the replacement, and the 
other on the US-string to its right. If we identify 
such constraints with the set of strings that satisfy 
them, a context type is then a pair of sets of US- 
strings; and an indexed string satisfies a context 
type in case its left and right context belong to the 
corresponding types. 
Definition 3. Replacement types. 
A replacement type over $\mathcal{L}$ is a partial function
$F$ from $\mathcal{L} \cup \{0\}$ to $\mathcal{L} \cup \{0\}$. (Thus, a replacement
type is a certain set of replacements.) $Dom(F)$
is the domain of $F$.
Definition 4. Rules. 
A rule is a pair $\mathcal{R} = (\mathcal{C}, F)$, where $\mathcal{C}$ is a context
type and $F$ is a replacement type.
Definition 5. Rule applicability. 
A rule $((X, Y), F)$ is applicable to an indexed
string $(\boldsymbol{x}, (l, l'), \boldsymbol{y})$ if and only if $\boldsymbol{x} \in X$, $\boldsymbol{y} \in Y$,
and $F(l)$ is defined, i.e., $l \in Dom(F)$.
Definition 6. Rule satisfaction. 
An indexed string $(\boldsymbol{x}, (l, l'), \boldsymbol{y})$ satisfies a rule $((X, Y), F)$
if and only if $\boldsymbol{x} \in X$, $\boldsymbol{y} \in Y$, and $F(l) = l'$.
The above definitions do not assume that the 
contexts are finitely encodable. But, as we said, we
are assuming as a working hypothesis that phonological
contexts are finitely encodable; this idea
was incorporated in the method of rule encoding 
that we presented above. We now make this idea 
explicit by defining the notion of a finitely encod- 
able rule. 
Definition 7. LeftExp( X ), RightExp( X ) 
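$LeftExp(X) = \{\boldsymbol{z}^\frown\boldsymbol{x} \mid \boldsymbol{x} \in X\}$
$RightExp(X) = \{\boldsymbol{x}^\frown\boldsymbol{z} \mid \boldsymbol{x} \in X\}$
(where $\boldsymbol{z}$ ranges over US-strings$_\mathcal{L}$)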
Definition 8. Finite encodability 
A subset $X$ of US-strings$_\mathcal{L}$ is left-encoded by a
set $U$ in case $X = LeftExp(U)$, and is right-encoded
by a set $V$ in case $X = RightExp(V)$. (It is
easy to get confused about the usage of "left" 
and "right" here; in left encoding, the left of 
the encoded string is arbitrary, and the right 
must match the encoding set. We have chosen 
our terminology so that a left context type will 
be left-encoded and a right context type will be 
right-encoded.) 
A context type $\mathcal{C} = (X, Y)$ is encoded by a pair
$(U, V)$ of sets in case $U$ left-encodes $X$ and $V$
right-encodes $Y$.
A rule $\mathcal{R} = (\mathcal{C}, F)$ is finitely encoded by a rule
encoding structure $((U, V), G)$ in case $(U, V)$
encodes $\mathcal{C}$, $G = F$, and $U$ and $V$ are finite.
In the following material, we will not only con- 
fine our attention to finitely encodable rules, but 
will refer to rules by their encodings; when the 
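notation $((X, Y), F)$ appears below, it should be
read as a rule encoding, not as a rule. Thus, for
instance, the indexed string $(\mathbf{cat}, {+}{:}0, \mathbf{s})$ satisfies
the rule (encoded by) $((\{\boldsymbol{\epsilon}\}, \{\boldsymbol{\epsilon}\}), \{(+, 0)\})$, even
though $\mathbf{cat} \notin \{\boldsymbol{\epsilon}\}$.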
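Definitions 5 through 8 can be illustrated with a short sketch. It assumes a rule encoding is stored as `((U, V), F)`, where `U` and `V` are finite sets of tuples of replacements and `F` is a dictionary from underlying to surface symbols. All names here are illustrative and not part of the paper.

```python
C = "bcdfghjklmnpqrstvwxyz"
C_SIB = "sxz"
EPS = ()  # the empty US-string

def const(text):
    """Constant replacements for each character: 'sh' -> (s:s, h:h)."""
    return tuple((ch, ch) for ch in text)

def matches_left(U, left):
    """left is in LeftExp(U): some u in U is a suffix of left."""
    return any(left[len(left) - len(u):] == u for u in U if len(u) <= len(left))

def matches_right(V, right):
    """right is in RightExp(V): some v in V is a prefix of right."""
    return any(right[:len(v)] == v for v in V)

def applicable(rule, left, repl, right):
    """Definition 5: contexts match and F is defined on the underlying symbol."""
    (U, V), F = rule
    return matches_left(U, left) and matches_right(V, right) and repl[0] in F

def satisfies(rule, left, repl, right):
    """Definition 6: applicable, and F maps the underlying symbol to the surface one."""
    (U, V), F = rule
    return applicable(rule, left, repl, right) and F[repl[0]] == repl[1]

# Example 1: + -> 0 / _
rule1 = (({EPS}, {EPS}), {"+": "0"})
# Example 2: y -> i / C _ +:0
rule2 = (({const(c) for c in C}, {(("+", "0"),)}), {"y": "i"})
# Example 3: + -> e / sh _ Csib #:0
rule3 = (({const("sh")}, {const(c) + (("#", "0"),) for c in C_SIB}), {"+": "e"})
```

With these encodings, `rule1` is satisfied by +:0 in the context (cat, s), `rule2` is not even applicable after the non-constant replacement +:t, and `rule3` is applicable to +:0 in the context (ash, s #:0) but not satisfied by it, matching the three example applications given earlier.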
SPECIFICITY OF CONTEXT 
TYPES AND RULES 
We have a good intuitive grasp of when one con- 
text type is more specific than another. For in- 
stance, the context type preceded by a back vowel 
is more specific than the type preceded by a vowel; 
the context type followed by an obstruent is nei- 
ther more nor less specific than the type followed 
by a voiced consonant; the context type preceded 
by a vowel is neither more nor less specific than 
the type followed by a vowel. 
Since we have identified context types with 
pairs of sets of strings, we have a very natural way 
of defining specificity relations such as "more spe- 
cific than", "equivalent", and "more specific than 
or equivalent": we simply use the subset relation. 
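Definition 9. $\mathcal{C} \le \mathcal{C}'$.
Let $\mathcal{C} = (X_1, Y_1)$ and $\mathcal{C}' = (X_2, Y_2)$ be context
types. $\mathcal{C} \le \mathcal{C}'$ if and only if $X_1 \subseteq X_2$ and
$Y_1 \subseteq Y_2$.
Definition 10. $\mathcal{C} \equiv \mathcal{C}'$.
$\mathcal{C} \equiv \mathcal{C}'$ if and only if $\mathcal{C} \le \mathcal{C}'$ and $\mathcal{C}' \le \mathcal{C}$.
Definition 11. $\mathcal{C} < \mathcal{C}'$.
$\mathcal{C} < \mathcal{C}'$ if and only if $\mathcal{C} \le \mathcal{C}'$ and $\mathcal{C}' \not\le \mathcal{C}$.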
It is not in general true that if $LeftExp(X) \subseteq
LeftExp(Y)$, then $X \subseteq Y$; for instance,
$LeftExp(\{aa, ba\}) \subseteq LeftExp(\{a\})$, but $\{aa, ba\} \not\subseteq
\{a\}$. However, we can easily determine the
specificity relations of two contexts from their finite
encodings:
Lemma 1. $LeftExp(X) \subseteq LeftExp(Y)$ iff for all
$x \in X$ there is a $y \in Y$ such that for some $z$, $x =
z^\frown y$. Similarly, $RightExp(X) \subseteq RightExp(Y)$ iff
for all $x \in X$ there is a $y \in Y$ such that for some
$z$, $x = y^\frown z$.
Proof of the lemma is immediate from the def- 
initions. It follows from the lemma that there is a 
tractable algorithm for testing specificity relations 
on finitely encodable contexts: 
Lemma 2. Let $\mathcal{C}$ be finitely encoded by $(X_1, X_2)$
and $\mathcal{C}'$ be finitely encoded by $(Y_1, Y_2)$. Then
there is an algorithm for testing whether $\mathcal{C} \le \mathcal{C}'$
that is no more complex than $O(m \times n \times k)$, where
$m = \max(|X_1|, |X_2|)$, $n = \max(|Y_1|, |Y_2|)$, and $k$
is the length of the longest string in $Y_1 \cup Y_2$.
Proof. Test whether for each $x_1 \in X_1$ there
is a $y_1 \in Y_1$ that matches the end of $x_1$. Then
perform a similar test on $X_2$ and $Y_2$, this time
matching the beginning of each $x_2 \in X_2$.
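The test described in Lemmas 1 and 2 can be sketched as follows. For brevity, this illustration lets ordinary Python strings stand in for US-strings (one character per replacement). That is a simplification for the example, not the paper's representation, and the function names are hypothetical.

```python
def left_exp_subset(X, Y):
    """LeftExp(X) is a subset of LeftExp(Y): every x in X ends with some y in Y."""
    return all(any(len(y) <= len(x) and x[len(x) - len(y):] == y for y in Y)
               for x in X)

def right_exp_subset(X, Y):
    """RightExp(X) is a subset of RightExp(Y): every x in X begins with some y in Y."""
    return all(any(x[:len(y)] == y for y in Y) for x in X)

def leq(c1, c2):
    """Definition 9 on encodings: c1 is at least as specific as c2."""
    (x1, x2), (y1, y2) = c1, c2
    return left_exp_subset(x1, y1) and right_exp_subset(x2, y2)

def strictly_more_specific(c1, c2):
    """Definition 11 on encodings."""
    return leq(c1, c2) and not leq(c2, c1)

VOWELS = set("aeiou")
BACK_VOWELS = set("aou")
after_back_vowel = (BACK_VOWELS, {""})
after_vowel = (VOWELS, {""})
before_vowel = ({""}, VOWELS)
```

Each comparison inspects at most `k` symbols for each of at most `m × n` pairs of strings, which is where the bound in Lemma 2 comes from. The three sample contexts reproduce the intuitions stated at the start of the section: "preceded by a back vowel" is strictly more specific than "preceded by a vowel", while "preceded by a vowel" and "followed by a vowel" are incomparable.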
DFSM'S 
A DFSM's transitions are labelled with finitely en- 
codable rules rather than with pairs of symbols. 
Moreover, nondeterminism is restricted so that in
case of conflicting transitions, a maximally specific
transition must be selected. The critical definition
is that of minimal satisfaction of an arc by
an indexed path, where an indexed path represents
a DFSM derivation, by recording the state
transitions and replacements that are traversed in
processing a US-String.
Definition 12. Arcs.
An arc over a set $S$ of states and alphabet $\mathcal{L}$ is
a triple $A = (s, s', \mathcal{R})$, where $s, s' \in S$ and $\mathcal{R}$ is
a rule over $\mathcal{L}$.
Definition 13. DFSMs.
A DFSM on $\mathcal{L}$ is a structure $\mathcal{M} = (S, i, T, \mathcal{A})$,
where $S$ is a finite set of states, $i \in S$ is the
initial state, $T \subseteq S$ is the set of terminal states,
and $\mathcal{A}$ is a set of arcs over $S$ on $\mathcal{L}$.
Definition 14. Paths.
A path $\pi$ or $\pi(s_0, s_n)$ over $\mathcal{M}$ from state $s_0$ to
state $s_n$ is a string $s_0 \boldsymbol{l}_1 s_1 \boldsymbol{l}_2 \ldots \boldsymbol{l}_n s_n$, where for
all $m$, $0 \le m \le n$, $s_m$ is a state of $\mathcal{M}$ and
$\boldsymbol{l}_m \in$ US-strings$_\mathcal{L}$.
Remark 1: $n \ge 0$, so that the simplest possible
path has the form $s$, where $s$ is a state. Remark 2:
we use the notations $\pi$ and $\pi(s, s')$ alternatively
for the same path; the second notation provides a
way of referring to the beginning and end states
of the path.
Definition 15. Recovery of strings from paths.
Let $\pi = s_0 \boldsymbol{l}_1 s_1 \boldsymbol{l}_2 \ldots \boldsymbol{l}_n s_n$. Then $String(\pi) =
\boldsymbol{l}_1 \ldots \boldsymbol{l}_n$.
Definition 16. Indexed paths.
An indexed path over $\mathcal{M}$ is a triple $(\pi, \boldsymbol{l}, \pi')$
where $\pi, \pi'$ are paths, and $\boldsymbol{l} \in$ US-strings$_\mathcal{L}$.
$(\pi, \boldsymbol{l}, \pi')$ is an indexing of path $\sigma$ if and only if
$\sigma = \pi^\frown \boldsymbol{l}^\frown \pi'$.
Definition 17. Applicability of an arc to an indexed
path.
An arc $(u, u', \mathcal{R})$ is applicable to an indexed
path $(\pi(s, t), \boldsymbol{l}, \pi'(s', t'))$ if and only if $t = u$ and
the rule $\mathcal{R}$ is applicable to the indexed string $(String(\pi), \boldsymbol{l}, String(\pi'))$.
Definition 18. Satisfaction of an arc by an indexed
path.
$(\pi(s, t), \boldsymbol{l}, \pi'(s', t'))$ satisfies an arc $(u, u', \mathcal{R})$ if
and only if $t = u$, $s' = u'$, and the indexed
string $(String(\pi), \boldsymbol{l}, String(\pi'))$ satisfies the rule
$\mathcal{R}$.
Definition 19. Minimal satisfaction of an arc by
an indexed path.
$(\pi, \boldsymbol{l}, \pi')$ minimally satisfies an arc $A = (s, s', \mathcal{R})$
of $\mathcal{M}$ if and only if $(\pi, \boldsymbol{l}, \pi')$ satisfies $A$ and there
is no state $s''$ and arc $A' = (s, s'', \mathcal{R}')$ of $\mathcal{M}$ such
that $A'$ is applicable to $(\pi, \boldsymbol{l}, \pi')$
and $\mathcal{R}' < \mathcal{R}$.
As we said, the above definition is the cru- 
cial component of the definition of DFSM's. Ac- 
cording to this definition, to see whether a DFSM 
derivation is correct, you must check that each 
state transition represents a maximally specific 
rule application. This means that at each stage the 
DFSM does not provide another arc with a competing
replacement and a more specific context.
("Competing" means that the underlying symbols
of the replacement match; a replacement competes
even if the surface symbol does not match the letter
in the US-String being tested.)⁴
Definition 20. Indexed path acceptance by a
DFSM.
$\mathcal{M} = (S, i, T, \mathcal{A})$ accepts an indexed path
$(\pi, \boldsymbol{l}, \pi')$ if and only if there is an arc $A' =
(s, s', \mathcal{R})$ of $\mathcal{M}$ that is minimally satisfied by
$(\pi, \boldsymbol{l}, \pi')$.
Definition 21. Path acceptance by a DFSM.
$\mathcal{M} = (S, i, T, \mathcal{A})$ accepts a path $\pi(s, s')$ if and
only if $\mathcal{M}$ accepts every indexing of $\pi$, $s = i$,
and $s' \in T$.
Definition 22. US-String acceptance by a DFSM.
$\mathcal{M}$ accepts $\boldsymbol{x} \in$ US-strings$_\mathcal{L}$ if and only if there
is a path $\pi$ such that $\mathcal{M}$ accepts $\pi$, where $\boldsymbol{x} =
String(\pi)$.
Definition 23. Generation of SF from UF by a
DFSM.
$\mathcal{M}$ generates a surface form $x'$ from an underlying
form $x$ (where $x$ and $x'$ are strings over $\mathcal{L}$)
if and only if there is a $\boldsymbol{z} \in$ US-strings$_\mathcal{L}$ such
that $\mathcal{M}$ accepts $\boldsymbol{z}$, where U-String($\boldsymbol{z}$) $= x$ and
S-String($\boldsymbol{z}$) $= x'$.
EXAMPLE: SPELLING RULES 
FOR ENGLISH STEM+SUFFIX 
COMBINATIONS 
The following is an adaptation of the treatment in
Antworth (1990) of English spelling rules, which
in turn is taken from Karttunen and Wittenburg
(1983).
⁴ This use of competition builds some directional
bias into the definition of DFSM's, i.e., some preference
for their use in generation. Even if we are using
DFSM's for recognition, we will need to verify that
the recognized string is generated from an underlying
form by a derivation that does not allow more specific
competing derivations.
• $\mathcal{M} = (S, i, T, \mathcal{A})$, where $S = \{i, s, t\}$ and $T = \{t\}$.
- Task of i: Begin and process left word boundary.
- Task of s: Process stem and suffixes.
- Task of t: Quit, having processed right word
boundary.
• Remark: the small number of states is deceptive, 
since contexts are allowed on the arcs. An equiv- 
alent finite-state transducer would have many 
hundreds of states at least. 
• Remark: the relatively small number of arcs 
enumerated below is also deceptive, since two
of these "arcs," arc 3 and arc 13, are actually
schemes. In the following discussion we will 
speak loosely and refer to these schemes as arcs; 
this will simplify the discussion and should cre- 
ate no confusion. 
• Declare the following subsets of $\mathcal{L}$:
Ltr = {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r,
s, t, u, v, w, x, y, z}
C = {b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, w,
x, y, z}
Csib = {s, x, z}
Cpal = {c, g}
V = {a, e, i, o, u}
Vbk = {a, o, u}
Where $s, s' \in S$, let $\mathcal{A}_{s,s'} = \{A \mid A \in \mathcal{A}$ and
for some $\mathcal{R}$, $A = (s, s', \mathcal{R})\}$. We present arcs
by listing the rules associated with the arcs, for 
each appropriate pair (s, s') of states. We will 
give each arc a numerical label, and give a brief 
explanation of the purpose of the arc. 
• Arcs in $\mathcal{A}_{i,s}$:
1. # → 0 / _
Delete left word boundary.
• Arcs in $\mathcal{A}_{s,s}$:
2. + → 0 / _
Delete morpheme boundary.
3. l → l / _ : $l \in$ Ltr
Any underlying letter is normally unchanged.
4. ' → ' / _
Apostrophe is normally unchanged.
5. ` → ` / _
Stress is normally unchanged.
6. + → e / [Csib | ch | sh | y:i] _ s [+:0 | #:0]
Epenthesis before -s suffix.
7. y → i / C _ +:0
Spell y as i after consonant and before suffix.
8. y → y / C _ +:0 [i:i | ':']
Exception to Rule 7; cf. "trying", "fly's".
9. s → 0 / [+:0 | +:e] s +:0 ' _
Delete possessive's after plural suffix.
10. e → 0 / VCC+ _ +:0 V
Elision.⁵
11. e → e / VC+Cpal _ +:0 Vbk
Exception to Rule 10.
12. i → y / _ e:0 +:0 i
Spell i as y before elided e before i-initial suffix.
13. + → l / `:0 C+ V l:l _ [V | y] :
$l \in$ {b, d, g, l, m, n, p, r, t}
Gemination.
• Arcs in $\mathcal{A}_{s,t}$:
14. # → 0 / _
Delete right word boundary.
• Illustrations 
I. The derivation that relates #kiss+s# to 
0kisses0 proceeds as follows. 
1. Begin in state i looking at #:0. 
2. Follow arc 3 to s, recognizing k:k. (This is
the only applicable arc.)
3. Follow arc 3 to s, recognizing i:i. (This is the 
only applicable arc.) 
4. Follow arc 3 to s, recognizing s:s. (This is the 
only applicable arc.) 
5. Follow arc 3 to s, recognizing s:s. (This is the 
only applicable arc.) 
6. Follow arc 6 to s, recognizing +:e. (Arc 2 is 
also applicable here; but see the next illustra- 
tion.) 
7. Follow arc 3 to s, recognizing s:s. (This is the 
only applicable arc.) 
8. Follow arc 14 to t, recognizing #:0. (This is
the only applicable arc.)
II. No derivation relates #kiss+s# to 0kiss0s0.
Any such derivation would have to proceed like
the above derivation through Step 5. At the
next step, the conditions for two arcs are met:
arc 2 (replacing + with 0) and arc 6 (replacing
+ with e). Since the context of the latter
arc is more specific, it must apply; there is no
derivation from this point using arc 2.
⁵ Here, C+ can be any string of no more than four
consonants.
III. The derivation that relates #try+ing# to 
0try0ing0 proceeds as follows. 
1. Begin in state i looking at #:0. 
2. Follow arc 3 to s, recognizing t:t. (This is the
only applicable arc.)
3. Follow arc 3 to s, recognizing r:r. (This is the 
only applicable arc.) 
4. Follow arc 8 to s, recognizing y:y. (There are 
three applicable arcs at this point: arc 3, arc 
7, and arc 8. However, arcs 3 and 7 are illegal 
here, since their contexts are both less specific 
than arc 8's.) 
5. Follow arc 2 to s, recognizing +:0. (This is
the only applicable arc.)
6. Follow arc 3 to s, recognizing i:i. (This is the 
only applicable arc.) 
7. Follow arc 3 to s, recognizing n:n. (This is 
the only applicable arc.) 
8. Follow arc 3 to s, recognizing g:g. (This is the 
only applicable arc.) 
9. Follow arc 14 to t, recognizing #:0. (This is
the only applicable arc.)
IV. No derivation relates #try+ing# to 
0tri0ing0. Any such derivation would have to 
proceed like the above derivation through Step 
3. At the next step, arc 7 cannot be traversed, 
since arc 8 is also applicable and its context is 
more specific. Therefore, no arc is minimally 
satisfied and the derivation halts at this point. 
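The four illustrations can be checked mechanically. The sketch below is a minimal, brute-force reading of Definitions 17 through 23 and makes no claim to be an efficient or definitive implementation. It assumes rules are stored as `((U, V), F)` encodings over tuples of (underlying, surface) pairs, encodes only arcs 1, 2, 3, 6, 7, 8 and 14 of the example, and tries every combination of candidate surface symbols. All function and variable names are illustrative.

```python
from itertools import product

C = "bcdfghjklmnpqrstvwxyz"
LTR = "abcdefghijklmnopqrstuvwxyz"
EPS = ()

def const(text):
    """Constant replacements: 'sh' -> ((s, s), (h, h))."""
    return tuple((ch, ch) for ch in text)

def ends(left, u):
    return len(u) <= len(left) and left[len(left) - len(u):] == u

def begins(right, v):
    return right[:len(v)] == v

def applicable(rule, left, repl, right):
    (U, V), F = rule
    return (any(ends(left, u) for u in U)
            and any(begins(right, v) for v in V) and repl[0] in F)

def leq(c1, c2):
    """Context specificity on encodings, following Lemma 1."""
    (U1, V1), (U2, V2) = c1, c2
    return (all(any(ends(x, y) for y in U2) for x in U1)
            and all(any(begins(x, y) for y in V2) for x in V1))

def more_specific(r1, r2):
    return leq(r1[0], r2[0]) and not leq(r2[0], r1[0])

def accepts(arcs, initial, terminals, us):
    """Track the states reachable through minimally satisfied arcs."""
    current = {initial}
    for n, repl in enumerate(us):
        left, right = us[:n], us[n + 1:]
        live = [(s, t, r) for (s, t, r) in arcs
                if s in current and applicable(r, left, repl, right)]
        current = {t for (s, t, r) in live
                   if r[1][repl[0]] == repl[1]            # arc is satisfied
                   and not any(s2 == s and more_specific(r2, r)
                               for (s2, _, r2) in live)}  # and minimal
    return bool(current & terminals)

def generate(arcs, initial, terminals, underlying):
    """Try every candidate surface string (exponential in the worst case)."""
    options = [sorted({r[1][u] for (_, _, r) in arcs if u in r[1]})
               for u in underlying]
    return ["".join(surface) for surface in product(*options)
            if accepts(arcs, initial, terminals, tuple(zip(underlying, surface)))]

ARCS = [
    ("i", "s", (({EPS}, {EPS}), {"#": "0"})),                          # arc 1
    ("s", "s", (({EPS}, {EPS}), {"+": "0"})),                          # arc 2
    ("s", "s", (({EPS}, {EPS}), {l: l for l in LTR})),                 # arc 3
    ("s", "s", (({const(c) for c in "sxz"}
                 | {const("ch"), const("sh"), (("y", "i"),)},
                 {const("s") + (("+", "0"),), const("s") + (("#", "0"),)}),
                {"+": "e"})),                                          # arc 6
    ("s", "s", (({const(c) for c in C}, {(("+", "0"),)}), {"y": "i"})),  # arc 7
    ("s", "s", (({const(c) for c in C},
                 {(("+", "0"),) + const("i"), (("+", "0"), ("'", "'"))}),
                {"y": "y"})),                                          # arc 8
    ("s", "t", (({EPS}, {EPS}), {"#": "0"})),                          # arc 14
]
```

Calling `generate(ARCS, "i", {"t"}, "#kiss+s#")` searches both candidate replacements for + and keeps only the one licensed by arc 6. The same call on `"#try+ing#"` rejects y:i because arc 8 is applicable and more specific than arc 7.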
COMPUTATIONAL 
COMPLEXITY 
We now consider the complexity of using DFSM's 
to create one side of a US-string, given the other 
side as input. There are basically two tasks to be 
analyzed: 
• DFSM GENERATION: Given a DFSM, $D$,
over an alphabet, $\mathcal{L}$, and an underlying form, $u$,
does $D$ generate a surface form, $s$, from $u$?
• DFSM RECOGNITION: Given a DFSM, $D$,
over an alphabet, $\mathcal{L}$, and a surface form, $s$, does
$D$ generate an underlying form, $u$, from $s$?
These two tasks are related to the tasks of KIMMO 
GENERATION and KIMMO RECOGNITION, the 
various versions of which Barton et al. (1987) 
proved to be NP-complete or worse. 
Relationship to Kimmo 
The DFSM is not a generalization of KIMMO; it 
is an alternative architecture for two-level rules. 
KIMMO takes a programming approach; it provides
a declarative rule formalism, which can be
related to a very large FS automaton or to a system
of parallel FS automata. The automata are
in general too unwieldy to be pictured or managed
directly; they are manipulated using the rules. By
integrating rules into the automata, the DFSM
approach provides a procedural formalism that is
compact enough to be diagrammed and manipulated
directly.
DFSM rules are procedural; their meaning de- 
pends on the role that they play in an algorithm. 
In a DFSM with many states, the effect achieved 
by a rule (where a rule is a context-dependent re- 
placement type) will in general depend on how the 
rule is attached to states. In practice, however, the 
proceduralism of the DFSM approach can be lim- 
ited by allowing only a few states, which have a 
natural morphonemic interpretation. The English 
spelling example that we presented in the previ- 
ous section illustrates the idea. There are only four 
states. Of these, two of them delimit word process- 
ing; one of them begins processing by traversing a 
left word boundary, the other terminates process- 
ing after traversing a final word boundary. Of the 
remaining two states, one processes the word; all 
of the rules concerning possible replacements are 
attached to arcs that loop from this state to itself.
The other is a nonterminal state with no arcs
leading from it. In the example, the only purpose
of this state is to render certain insertions or
deletions obligatory, by "trapping" all US-strings in
which the operation is not performed in the
required context.
In cases of this kind, where the ways in which 
rules can be attached to arcs are very restricted, 
the proceduralism of the DFSM formalism is
limited. The uses of rules in such cases correspond
roughly to two traditional types of phonological 
constructs: rules that allow certain replacements 
to occur, and constraints that make certain re- 
placements obligatory. 
Although DFSM's are less declarative than 
KIMMO, we believe that it may be possible to 
interpret at least some DFSM's (those in which 
the roles that can be played by states are
limited) using a nonmonotonic formalism that
provides for prioritization of defaults, such as
prioritized default logic; see (Brewka, 1993). In this
way, DFSM's could be equated to declarative, ax- 
iomatic theories with a nonmonotonic consequence 
relation. But we have not carried out the details 
of this idea. 
Though it is desirable to constrain the num- 
ber of states in a DFSM, there may be appli- 
cations in which we may want more states than 
in the English example. For instance, one natu- 
ral way to process vowel harmony would multiply 
states by creating a word-processing state for each 
vowel quality. Multiple modes of word-processing 
could also be used to handle cases (as in many 
Athabaskan languages) where different morpho- 
phonemic processes occur in different parts of the 
word. 
If they are desired, local translations of the
four varieties of KIMMO rules⁶ into DFSM's are
available, by using only one state plus a sink state.
The following correspondences provide translations,
in polynomial time, to one or more DFSM
arcs:
Exclusion, u:s /<= LC _ RC: an arc u → s / LC _ RC
from the state to a sink state.
Context Restriction, u:s => LC _ RC: a loop
u → s / LC _ RC, and an arc u → s / _ to a
sink state.
Surface Coercion, u:s <= LC _ RC: a loop
u → s / LC _ RC, and for each surface character $s' \in
\mathcal{L}$, an arc u → s' / LC _ RC to a sink state.
Composite, u:s <=> LC _ RC: all of the arcs
mentioned in Context Restriction or Surface Coercion.
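The four correspondences are simple enough to write down as a function from a KIMMO rule to a list of arcs. The sketch below assumes an arc is a triple `(source, target, ((LC, RC), {u: s}))` with contexts kept as opaque strings in rule notation and `""` standing for the empty context. The state names `"q"` and `"sink"` are hypothetical. One further assumption, not stated in the text, is that the Surface Coercion sink arcs range over surface characters other than `s` itself.

```python
def translate(kind, u, s, lc, rc, alphabet, state="q", sink="sink"):
    """Return the DFSM arcs corresponding to one KIMMO rule u:s OP LC _ RC."""
    loop = (state, state, ((lc, rc), {u: s}))
    if kind == "exclusion":                       # u:s /<= LC _ RC
        return [(state, sink, ((lc, rc), {u: s}))]
    # u:s => LC _ RC: allowed in context, trapped everywhere else.
    restriction = [loop, (state, sink, (("", ""), {u: s}))]
    # u:s <= LC _ RC: in context, any other surface symbol is trapped
    # (assumption: s itself is skipped so the loop stays usable).
    coercion = [loop] + [(state, sink, ((lc, rc), {u: other}))
                         for other in alphabet if other != s]
    if kind == "restriction":
        return restriction
    if kind == "coercion":
        return coercion
    if kind == "composite":                       # u:s <=> LC _ RC
        return restriction + coercion[1:]
    raise ValueError(f"unknown rule kind: {kind}")

arcs = translate("composite", "y", "i", "C", "+:0", ["y", "i", "0"])
```

The number of arcs produced is at most linear in the size of the alphabet, consistent with the polynomial-time claim.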
Extended DFSM's 
The differences between KIMMO and DFSM's pro- 
hibit the complexity analysis for the correspond- 
ing two KIMMO problems from naturally extend- 
ing to an analysis of DFSM generation and recog- 
nition. In fact, we can define an extended DFSM 
(EDFSM), which drops the finite encodability re- 
quirement that KIMMO lacks, for which we have 
the following result: 
Theorem 1. EDFSM GENERATION is PSPACE- 
hard 
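Proof by reduction of REGULAR EXPRESSION
NON-UNIVERSALITY (see Figure 1). Given
an alphabet $\Sigma$, and a regular expression, $a \neq \emptyset$,
over $\Sigma$, we define an EDFSM over the alphabet
$\Sigma \cup \{\$\}$, where $\$ \notin \Sigma$. We choose one non-empty
string $\alpha \in L(a)$ of length $n$. The EDFSM first
recognizes each character in $\alpha$, completing the task
at state $s_0$:
$\alpha_i \rightarrow \alpha_i \;/\; (\mathcal{L}{:}\mathcal{L})^* \,\_\, (\mathcal{L}{:}\mathcal{L})^*$ ⁷
From $s_0$, there are two arcs, which map to different
states:
$\$ \rightarrow \$ \;/\; \boldsymbol{\Sigma}^* \,\_\, \boldsymbol{\Sigma}^*$
$\$ \rightarrow \$ \;/\; (\mathbf{a} + \$) \,\_\, (\mathbf{a} + \$)$
where the latter rule traverses to some state $s_1$,
with $\mathbf{a}$ being the expression which replaces each
atom, $b$, in $a$ by its constant replacement, $b{:}b$,
and likewise for $\boldsymbol{\Sigma}$.
From $s_1$, the EDFSM then recognizes $\alpha$ again,
terminating at the only final state. We provide
this EDFSM, along with the input $\alpha\$\alpha$, to EDFSM
GENERATION. This EDFSM can accept $\alpha\$\alpha$ if
and only if, at state $s_0$, the context $(\boldsymbol{\Sigma}^*, \boldsymbol{\Sigma}^*)$ is not
more specific than the context $((\mathbf{a} + \$), (\mathbf{a} + \$))$.
So, we have:
\[
\begin{aligned}
 & (\boldsymbol{\Sigma}^*, \boldsymbol{\Sigma}^*) \not< ((\mathbf{a} + \$), (\mathbf{a} + \$)) \\
\Leftrightarrow\ & (\boldsymbol{\Sigma}^*, \boldsymbol{\Sigma}^*) \not\le ((\mathbf{a} + \$), (\mathbf{a} + \$)) \text{ or } (\boldsymbol{\Sigma}^*, \boldsymbol{\Sigma}^*) \equiv ((\mathbf{a} + \$), (\mathbf{a} + \$)) \\
\Leftrightarrow\ & \Sigma^* \not\subseteq L(a + \$) \text{ or } \Sigma^* = L(a + \$) \\
\Leftrightarrow\ & \Sigma^* \not\subseteq L(a + \$), \text{ since } \$ \notin \Sigma \\
\Leftrightarrow\ & \Sigma^* \not\subseteq L(a) \cup \{\$\} \\
\Leftrightarrow\ & \Sigma^* \not\subseteq L(a), \text{ since } \$ \notin \Sigma \\
\Leftrightarrow\ & L(a) \neq \Sigma^* \quad (\text{we know } L(a) \subseteq \Sigma^*)
\end{aligned}
\]
The translation function is linear in the size of the
input. □
⁶ Sproat (1992), p. 145.
⁷ Unlike with normal DFSM's, we will use regular
expressions for the contexts themselves in EDFSM's,
not their encodings, since they may be infinite anyway.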
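[Diagram not reproduced. Recoverable arc labels: "write $\alpha$" (twice), $\$ \rightarrow \$ \;/\; (\mathbf{a} + \$) \,\_\, (\mathbf{a} + \$)$, and $\$ \rightarrow \$ \;/\; \boldsymbol{\Sigma}^* \,\_\, \boldsymbol{\Sigma}^*$.]
Figure 1. EDFSM constructed in Theorem 1.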
The Complexity of DFSM GENERATION 
Finite encodability foils the above proof tech- 
nique, since one can no longer express arbitrary 
regular expressions over pairs in the contexts of 
rules. In fact, as we demonstrated above, there 
is a polynomial-time algorithm for comparing the 
specificities of finitely-encodable contexts. Finite 
encodability does not, however, restrict the com- 
plexity of DFSM's enough to make DFSM GEN- 
ERATION polynomial time: 
Theorenl 2. I)I"SM GENERATION is NI L 
complete. 
Proof DFSM GENERATION is obviously in 
NP. The proof of NP-hardness is a reduction of 
3-SAT. Given an input formula, w, we construct 
a DFSM consisting of one state over an alphabet 
consisting of 0, 1, ~, one symbol, u~, for each vari- 
able in w, and one symbol, ej, for each conjunct 
in w. Let m be the number of variables in w, and 
n, the number of conjuncts. For each variable, ui, 
we add four loops: 
u, ~ 1 / #:# u1:£ ... u~-1:£--, 
u~ ~ 0 / #:# u1:£ ... u~-1:£--, 
ui-~ 1 / u/:l ui+l:£ ... um:£ £:£ 
u1:£ ... ul-x:£--, 
u~ ~ 0 / u~:0 ui+l:£ ... u,~:£ £:£ u1:£ ... u~-x:£-- 
The first two choose an assignment for a variable,
and the second two enforce that assignment's consistency.
For each conjunct, l_{j1} ∨ l_{j2} ∨ l_{j3}, where
the l's are literals, we also add three loops, one
for each literal. The loops enforce a value of 1
on the symbol u_{ji} if l_{ji} is a positive literal, or 0,
if it is negative. For example, for the conjunct
u_1 ∨ ¬u_3 ∨ u_4, we add the following three rules:
c_j → c_j / u_1:1 u_2:Σ ... u_m:Σ __
c_j → c_j / u_3:0 u_4:Σ ... u_m:Σ __
c_j → c_j / u_4:1 u_5:Σ ... u_m:Σ __
Thus, the input to DFSM GENERATION is
the above DFSM plus an input string created
by iterating the substring u_1 ... u_m c_j
for each conjunct. The input string corresponding
to the formula, (¬u_1 ∨ u_2 ∨ u_4) ∧
(¬u_2 ∨ u_3 ∨ ¬u_4) ∧ (u_1 ∨ u_2 ∨ u_3), would be
# u_1 u_2 u_3 u_4 c_1 u_1 u_2 u_3 u_4 c_2 u_1 u_2 u_3 u_4 c_3.
The DFSM accepts this input string if and only if the input
formula is satisfiable; and this translation is linear
in m + n. □
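The construction in the proof of Theorem 2 can be sketched programmatically. The following is a minimal illustration, assuming conjuncts are given as triples of signed integers (i for u_i, -i for ¬u_i) and rules are represented as (underlying, surface, left context) triples. These representations and the function names are assumptions made for the illustration, not part of the formalism. The sketch only builds the loops and the input string; it does not simulate a DFSM.

```python
def build_input(num_vars, clauses):
    """Input string as a token list: '#', then u1..um cj for each conjunct j."""
    tokens = ["#"]
    for j in range(1, len(clauses) + 1):
        tokens += [f"u{i}" for i in range(1, num_vars + 1)]
        tokens.append(f"c{j}")
    return tokens


def build_rules(num_vars, clauses):
    """Loops of the one-state DFSM as (underlying, surface, left_context)."""
    m = num_vars
    rules = []
    for i in range(1, m + 1):
        before = [f"u{k}:Σ" for k in range(1, i)]         # u_1 .. u_{i-1}
        after = [f"u{k}:Σ" for k in range(i + 1, m + 1)]  # u_{i+1} .. u_m
        for value in ("1", "0"):
            # First block: choose a value for u_i after the boundary symbol.
            rules.append((f"u{i}", value, ["#:#"] + before))
            # Later blocks: repeat the value u_i received one block earlier.
            rules.append(
                (f"u{i}", value, [f"u{i}:{value}"] + after + ["Σ:Σ"] + before)
            )
    for j, clause in enumerate(clauses, start=1):
        for literal in clause:
            i = abs(literal)
            value = "1" if literal > 0 else "0"
            after = [f"u{k}:Σ" for k in range(i + 1, m + 1)]
            # c_j may be passed only if some literal of conjunct j is satisfied.
            rules.append((f"c{j}", f"c{j}", [f"u{i}:{value}"] + after))
    return rules


if __name__ == "__main__":
    example = [(1, -3, 4)]  # the conjunct u_1 ∨ ¬u_3 ∨ u_4 with m = 4
    print("".join(build_input(4, example)))
    for rule in build_rules(4, example):
        print(rule)
```

For m variables and n conjuncts this representation yields 4m + 3n loops and an input string of 1 + n(m + 1) symbols.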
Compilation 
Of course, we should consider whether the com- 
plexity of DFSM GENERATION can be compiled 
out, leaving a polynomial-time machine which ac- 
cepts input strings. This can be formalized as the 
separate problem: 
• FIXED DFSM GENERATION: For some
DFSM, D, over alphabet, Σ, given an underlying
form, u, does D generate a surface form, s,
from u?
Whether or not FIXED DFSM GENERATION
belongs to P remains an open problem. It is, of
course, no more difficult than the general DFSM
GENERATION problem, and is thus in NP.
The method used in the proof
given above, however, does not naturally extend
to the case of FIXED DFSM GENERATION, since
we cannot, with a fixed DFSM, know in advance
how many variables to expect in a given input for- 
mula, without which we cannot use the same trick 
with the left context to preserve the consistency 
of variable assignment. 
Even more interestingly, the technique used in
the proof of PSPACE-hardness of EDFSM GENERATION
does not naturally extend to fixed
EDFSM's either; thus, whether or not FIXED
EDFSM GENERATION belongs to P is an open
question as well.⁸ Dropping finite encodability, of
course, affects the compilation time of the problem
immensely.
Nulls 
The two proofs we have given remain valid
if we switch all of the underlying forms with
their surface counterparts. Thus, without nulls,
EDFSM RECOGNITION is PSPACE-hard, DFSM
RECOGNITION is NP-complete, and, if FIXED
DFSM GENERATION is in P, then we can presumably
use the same compilation trick with the roles
of underlying and surface strings reversed to show
that FIXED DFSM RECOGNITION is in P as well.
If nulls are permitted in surface realizations,
however, DFSM RECOGNITION becomes much
more difficult, even with finite encodability
enforced:
Theorem 3. DFSM RECOGNITION with nulls is
PSPACE-hard.
Proof by reduction of CONTEXT-SENSITIVE
LANGUAGE MEMBERSHIP (see Figure 2). Given
a context-sensitive grammar and an input string
of length m, we let the input surface form to the
DFSM RECOGNITION problem be the same as the
input string. We then design a DFSM with an
alphabet equal to Σ ∪ {$, !}, where Σ is the
set of non-terminals plus the set of terminals. The
DFSM first copies each surface input symbol to
the corresponding position in the underlying form,
and then adds the pair $:0, completing the task
in a state s_0.
⁸It is quite unlikely, however, since the reduction
can probably be made with a different PSPACE-complete
problem, from which the NP-completeness
of FIXED EDFSM GENERATION would follow as a
corollary.
Having copied the string onto the underlying
side of the pair, the remainder of the recognized
underlying form will consist of rewritings of the
string for each rule application, and will be paired
with surface nulls at the end of the input string.
Each rewriting will be separated by a $ symbol,
and, as the string length changes, it will be padded
by ! symbols. For each rule α → β, we add a cycle
to the DFSM, emanating from state s_0, which first
writes j copies of the ! symbol to the underlying
form, where j = b - a, b = |β|, and a = |α|:
! → 0 / Σ:Σ (Σ:Σ ..._m Σ:Σ) __
j ≥ 0 since the rules are context-sensitive.
Figure 2. DFSM constructed in Theorem 3.
The cycle then copies part of the most recent
$-bounded string of symbols with a family of loops
of the form:
σ → 0 / σ:Σ (Σ:Σ ..._{m+j} Σ:Σ) __    (r1)
for each σ ∈ Σ. It then recognizes β, and writes
α, with:
α_1 → 0 / (β_1:Σ ..._b β_b:Σ) (Σ:Σ ..._{m+j+1-b} Σ:Σ) __,
followed by:
α_2 → 0 / __,
...
α_a → 0 / __
It then copies the rest of the most recent $-bounded
string, using a copy of the family of loops
in (r1), and then adds a new $ with a rule that
also ensures that this second loop has iterated the
appropriate number of times by checking that the
length has been preserved:
$ → 0 / $:Σ (Σ:Σ ..._m Σ:Σ) __    (r2)
The DFSM also has a loop emanating from s_0
which adds more ! symbols:
! → 0 / !:Σ (Σ:Σ ..._m Σ:Σ) __
All of the rule-cycles will use this to copy
previously-added ! symbols, as the string shrinks
in size. The proper application of this loop is also
ensured by the length-checking of (r2).
Finally, we add one arc to the DFSM from s_0
to the only final state which checks that the final
copy of the string contains only the distinguished
symbol, S:
$ → 0 / (!:Σ ..._{m-1} !:Σ) S:Σ $:Σ __
Thus, the DFSM recognizes the surface form
if and only if there is a series of rewritings from the
input string to S using the rules of the grammar,
and the translation is linear in the size of the input
string times the number of rules. □
Since there exist fixed context-sensitive grammars
for which the acceptance problem is NP-hard⁹,
the NP-hardness of FIXED DFSM RECOGNITION
with nulls follows as a corollary.
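To make the shape of the recognized underlying form in the proof of Theorem 3 more concrete, the following sketch lays out one such form for a toy grammar. It is an illustration under stated assumptions, not an implementation of the DFSM: grammar symbols are single characters, each rewriting is padded with ! on the left to length m (one reading of the offsets in (r1) and (r2)), and the function names, the step format (position, α, β) and the toy grammar are hypothetical.

```python
def undo_rule(string, pos, alpha, beta):
    """Undo one application of alpha -> beta: replace beta at `pos` by alpha."""
    if string[pos:pos + len(beta)] != beta:
        raise ValueError(f"{beta!r} does not occur at position {pos}")
    if len(alpha) > len(beta):
        raise ValueError("rule is contracting, so not context-sensitive")
    return string[:pos] + alpha + string[pos + len(beta):]


def underlying_form(w, steps, start="S"):
    """Underlying form for input w and a sequence of (pos, alpha, beta) steps.

    Layout assumed here: w, then '$', then each rewriting left-padded with
    '!' to length len(w) and followed by '$', then the '$' of the final arc.
    """
    m = len(w)
    parts = [w, "$"]
    current = w
    for pos, alpha, beta in steps:
        current = undo_rule(current, pos, alpha, beta)
        parts += ["!" * (m - len(current)) + current, "$"]
    if current != start:
        raise ValueError("rewritings do not end in the distinguished symbol")
    parts.append("$")
    return "".join(parts)


def surface_pairs(w, underlying):
    """Pair the underlying form with w followed by surface nulls ('0')."""
    return [(u, w[k] if k < len(w) else "0") for k, u in enumerate(underlying)]


if __name__ == "__main__":
    # Toy grammar (assumed for illustration): S -> aSb, S -> ab.
    form = underlying_form("aabb", [(1, "S", "ab"), (0, "S", "aSb")])
    print(form)
    print(surface_pairs("aabb", form))
```

Every underlying symbol after the first m is paired with a surface null, which is why the construction depends on nulls being permitted in surface realizations.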
CONCLUSION 
We claimed that DFSM's provide an approach to 
rules that is likely to seem more natural and in- 
tuitive to phonologists. Bridging the gap between 
linguistically adequate formalisms and computa- 
tionally useful formalisms is a long-term, commu- 
nity effort, and we feel that it would be premature 
to make claims about the linguistic adequacy of 
the approach; this depends on whether two-level 
approaches can be developed and deployed in a 
way that will satisfy the theoretical and explana- 
tory needs of linguists. A specific claim on which 
our formalism depends is that all natural two-level 
phonologies can be reproduced using DFSM's with 
finitely encodable rules. We feel that this claim is 
plausible, but it needs to be tested in practice. 
Computationally, our complexity work so far 
on DFSM's does not preclude the possibility that 
compilers for generation and recognition (with- 
out nulls) exist which will allow for polynomial- 
time behavior at run-time. Although this ques- 
tion must eventually be resolved, we feel that any 
implementation is likely to be simpler than that 
required for KIMMO, and that even a direct imple- 
mentation of DFSM's can prove adequate in many 
circumstances. We have not constructed an imple- 
mentation as yet. 
Like other two-level approaches, we have a 
problem with surface nulls. It is possible in 
most realistic recognition applications to bound 
the number of nulls by some function on the length 
of the overt input; and it remains to be seen 
whether a reasonable bound could sufficiently im- 
prove complexity in these cases. 
We have dealt with the problem of underlying 
nulls by simply ruling them out. This simplifies 
the formal situation considerably, but we do not 
believe that it is acceptable as a general solution; 
for instance, we can't expect all cases of epenthesis
to occur at morpheme boundaries. If underlying 
nulls are allowed, though, we will somehow need to 
limit the places where underlying nulls can occur; 
this is another good reason to pay attention to a 
phonotactic level of analysis. 
⁹Garey and Johnson (1979), p. 271.
ACKNOWLEDGEMENTS 
This material is based upon work supported under 
a National Science Foundation Graduate Research 
Fellowship. This work was funded by National 
Science Foundation grant IRI-9003165. We thank 
the anonymous referees for helpful comments. 

REFERENCES 
Evan Antworth. 1990. PC-KIMMO: a two-level
processor for morphological analysis. Dallas,
Texas: Summer Institute of Linguistics.
Edward Barton, Robert Berwick, and Eric
Ristad. 1987. Computational Complexity and
Natural Language. Cambridge, Massachusetts:
MIT Press.
Gerhard Brewka. 1993. Adding priorities and
specificity to default logic. DMG Technical Report.
Sankt Augustin, Germany: Gesellschaft für
Mathematik und Datenverarbeitung.
Mary Dalrymple, Ronald Kaplan, Lauri Karttunen,
Kimmo Koskenniemi, Sami Shaio, and
Michael Wescoat. 1987. Tools for Morphological
Analysis. Stanford, California: CSLI Technical
Report CSLI-87-108.
Michael Garey and David Johnson. 1979. 
Computers and Intractability: A Guide to the 
Theory of NP-completeness. New York, New 
York: Freeman and Co. 
Lauri Karttunen. 1991. "Finite-state constraints."
International conference on current
issues in computational linguistics. Penang,
Malaysia.
Lauri Karttunen and Kent Wittenburg. 1983.
"A two-level morphological analysis of English."
Texas Linguistic Forum 22, pp. 217-228.
Graeme Ritchie, Graham Russell, Alan Black 
and Stephen Pulman. 1992. Computational mor- 
phology. Cambridge, Massachusetts: MIT Press. 
Richard Sproat. 1992. Morphology and com- 
putation. Cambridge, Massachusetts: MIT Press. 
