CONSTRAINT PROPAGATION IN KIMMO SYSTEMS 
G. Edward Barton, Jr. 
M.I.T. Artificial Intelligence Laboratory 
545 Technology Square 
Cambridge, MA 02139 
ABSTRACT 
Taken abstractly, the two-level (Kimmo) morphological 
framework allows computationally difficult problems to 
arise. For example, N + 1 small automata are sufficient 
to encode the Boolean satisfiability problem (SAT) for for- 
mulas in N variables. However, the suspicion arises that 
natural-language problems may have a special structure -- 
not shared with SAT -- that is not directly captured in 
the two-level model. In particular, the natural problems 
may generally have a modular and local nature that dis- 
tinguishes them from more "global" SAT problems. By 
exploiting this structure, it may be possible to solve the 
natural problems by methods that do not involve combi- 
natorial search. 
We have explored this possibility in a preliminary way 
by applying constraint propagation methods to Kimmo gen- 
eration and recognition. Constraint propagation can suc- 
ceed when the solution falls into place step-by-step through 
a chain of limited and local inferences, but it is insuffi- 
ciently powerful to solve unnaturally hard SAT problems. 
Limited tests indicate that the constraint-propagation al- 
gorithm for Kimmo generation works for English, Turkish, 
and Warlpiri. When applied to a Kimmo system that en- 
codes SAT problems, the algorithm succeeds on "easy" 
SAT problems but fails (as desired) on "hard" problems. 
INTRODUCTION 
A formal computational model of a linguistic process 
makes explicit a set of assumptions about the nature of the 
process and the kind of information that it fundamentally 
involves. At the same time, the formal model will ignore 
some details and introduce others that are only artifacts 
of formalization. Thus, whenever the formal model and 
the actual process seem to differ markedly in properties, a 
natural assumption is that something has been missed in 
formalization -- though it may be difficult to say exactly 
what. 
When the difference is one of worst-case complexity, 
with the formal framework allowing problems to arise that 
are too difficult to be consistent with the received diffi- 
culty of actual problems, one suspects that the natural 
computational task might have significant features that 
the formalized version does not capture and exploit ef- 
fectively. This paper introduces a constraint propagation 
method for "two-level" morphology that represents a
preliminary attempt to exploit the features of local
information flow and linear separability that we believe are found
in natural morphological-analysis problems. Such a local 
character is not shared by more difficult computational 
problems such as Boolean satisfiability, though such prob- 
lems can be encoded in the unrestricted two-level model. 
Constraint propagation is less powerful than backtracking 
search, but does not allow possibilities to build up in com- 
binatorial fashion. 
TWO-LEVEL MORPHOLOGY 
The "two-level" model of morphology developed by
Kimmo Koskenniemi is attractive for putting morphological
knowledge to use in processing. Two-level rules mediate
the relationship between a lexical string made up of
morphemes from the dictionary and a surface string
corresponding to the form a word would have in text.
Equivalently, the rules correspond to finite-state transducers that
Figure 1: The automaton component of the Kimmo sys- 
tem consists of several two-headed finite-state automata 
that inspect the lexical/surface correspondence in paral- 
lel. The automata move together from left to right. (From 
Karttunen, 1983:176.) 
Figure 2: This is the complete Kimmo generator system for
solving SAT problems in the variables x, y, and z. The system
includes a consistency automaton for each variable in addition
to a satisfaction automaton that does not vary from problem
to problem.

ALPHABET x y z T F - ,
ANY =
END
"x-consistency" 3 3
    x x =
    T F =
1:  2 3 1
2:  2 0 2
3:  0 3 3
"y-consistency" 3 3
    y y =
    T F =
1:  2 3 1
2:  2 0 2
3:  0 3 3
"z-consistency" 3 3
    z z =
    T F =
1:  2 3 1
2:  2 0 2
3:  0 3 3
"satisfaction" 3 4
    = = - ,
    T F - ,
1.  2 1 3 0
2:  2 2 2 1
3.  1 2 0 0
END
can be used in generation and recognition algorithms as 
implemented in Karttunen's (1983) Kimmo system (and 
others). As shown in Figure 1, the transducers in the "au- 
tomaton component" (~ 20 for Finnish, for instance) all 
inspect the lexical/surface correspondence at once in order 
to implement the insertions, deletions, and other spelling 
changes that may accompany affixation or inflection. In- 
sertions and deletions are handled through null characters 
that are visible only to the automata. A complete Kimmo 
system also has a "dictionary component" that regulates 
the sequence of roots and affixes at the lexical level. 
Despite initial appearances to the contrary, the
straightforward interpretation of the two-level model in terms of
finite-state transducers leads to generation and recognition
algorithms that can theoretically do quite a bit of
backtracking and search. For illustration we will consider
the Kimmo system in Figure 2, which encodes Boolean 
satisfiability for formulas in three variables x, y, and z. 
The Kimmo generation algorithm backtracks extensively 
while determining truth-assignments for formulas accord- 
ing to this system. (See Barton (1986) and references cited 
therein for further details of the Kimmo system and of the 
system in Figure 2.) 
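The system of Figure 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the consistency tables are read with columns v/T, v/F, and "=", the satisfaction table with columns =/T, =/F, -/-, and ,/,, state 0 means rejection, and only state 2 of the satisfaction automaton is final. The names `consistency`, `satisfaction`, and `accepts` are hypothetical and do not come from any Kimmo implementation. The sketch checks a fully specified lexical/surface correspondence and performs no search.

```python
def consistency(var):
    """Return the transition function of the `var`-consistency automaton."""
    def step(state, pair):
        lex, surf = pair
        if lex != var:                 # "=" column: other pairs keep the state
            return state
        want = 2 if surf == "T" else 3  # state 2 remembers T, state 3 remembers F
        return want if state in (1, want) else 0   # 0 = inconsistent, reject
    return step


def satisfaction(state, pair):
    """Transition function of the satisfaction automaton (assumed column order)."""
    lex, surf = pair
    if lex == "-":
        return {1: 3, 2: 2, 3: 0}[state]
    if lex == ",":
        return {1: 0, 2: 1, 3: 0}[state]   # a clause may end only in state 2
    col = 0 if surf == "T" else 1
    return {1: (2, 1), 2: (2, 2), 3: (1, 2)}[state][col]


def accepts(lexical, surface):
    """Run all automata in parallel over the lexical/surface pairs."""
    machines = [consistency(v) for v in "xyz"] + [satisfaction]
    states = [1] * len(machines)
    for pair in zip(lexical, surface):
        states = [m(s, pair) for m, s in zip(machines, states)]
        if 0 in states:
            return False
    return states[-1] == 2             # the last clause must be satisfied


print(accepts("yz,x-y-z,-x,-y", "FT,F-F-T,-F,-F"))
print(accepts("yz,x-y-z,-x,-y", "TT,F-F-T,-F,-F"))
```

Because both strings are supplied, each automaton has exactly one next state at every step; nondeterminism enters only when one of the two strings must be found.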
Taken in the abstract, the two-level model allows com- 
putationally difficult situations to arise despite initial ap- 
pearances to the contrary, so why shouldn't they also turn 
up in the analysis of natural languages? It may be that 
they do turn up; indeed, the relevant mathematical re- 
ductions are abstractly based on the Kimmo treatment of 
vowel harmony and other linguistic phenomena. Yet one 
feels that the artificial systems used in the mathematical 
reductions are unnatural in some significant way -- that 
similar problems are not likely to turn up in the analysis 
of Finnish, Turkish, or Warlpiri. If this is so, then the re- 
ductions say more about what is thus-far unexpressed in 
the formal model than about the difficulty of morphological 
analysis; it would be impossible to crank the difficult prob- 
lems through the formal machinery, if the machinery could 
be infused with more knowledge of the special properties 
of natural language. 1 
MODULAR INFORMATION STRUCTURE
The ability to use particular representations and pro- 
cessing methods is underwritten by what may be called the 
"information structure" of a task -- more abstract than a 
particular implementation, and concerned with such ques- 
tions as whether a certain body of information suffices for 
making certain decisions, given the constraints of the prob- 
lem. What is it about the information structure of morpho- 
logical systems that is not captured when they are encoded 
1 The systems under consideration in this paper deal with
orthographic representations, which are somewhat remote from the "more
natural" linguistic level of phonology and contain both more and less
information than phonological representations.
as Kimmo systems? Are there significant locality princi- 
ples and so forth that hold in natural languages but not in 
mathematical systems that encode CNF Boolean satisfaction
problems (SAT)? Perhaps a better understanding of
the information relationships of the natural problem can 
lead to more specialized processing methods that require 
less searching, allow more parallelism, run more efficiently, 
or are more satisfying in some other way. 
A lack of modular information structure may be one 
way in which SAT problems are unnatural compared to 
morphological-analysis problems. Making this idea precise 
is rather tricky, for the Kimmo systems that encode SAT 
problems are modular in the sense that they involve vari- 
ous independent Kimmo automata assembled in the usual 
way. However, the essential notion is that the Boolean sat- 
isfaction problem has a more interconnected and "global" 
character than morphological analysis. The solution to 
a satisfaction problem generally cannot be deduced piece 
by piece from local evidence. Instead, the acceptability 
of each part of the solution may depend on the whole 
problem. In the worst case, the solution is determined 
by a complex conspiracy among the problem constraints 
instead of being composed of independently derivable sub- 
parts. There is little alternative to running through the 
possible cases in a combinatorial way. 
In contrast to this picture, in a morphological analy- 
sis problem it seems more likely that some pieces of the 
solution can be read off relatively directly from the input, 
with other pieces falling into place step-by-step through 
a chain of limited and local inferences and without the 
kind of "argument by cases" that search represents. We 
believe the usual situation is for the various complicating 
processes to operate in separate domains -- defined for in- 
stance by separate feature-groups -- instead of conspiring 
closely together. 
The idea can be illustrated with a hypothetical 
language that has no processes affecting consonants but 
several right-to-left harmony processes affecting different 
features of vowels. By hypothesis, underlying consonants 
can be read off directly. The right-to-left harmony pro- 
cesses mean that underlying vowels cannot always be iden- 
tified when the vowels are first seen. However, since the 
processes affect different features, uncertainty in one area 
will not block conclusions in others. For instance, the pro- 
cessing of consonants is not derailed by uncertainty about 
vowels, so information about underlying consonants can 
potentially be used to help identify the vowels. In such a 
scenario, the solution to an analysis problem is constructed 
more by superposition than by trying out solutions to in- 
tertwined constraints. 
A SAT problem can have either a local or global infor- 
mation structure; not all SAT problems are difficult. The 
unique satisfying assignment for the formula
$(\bar{y} \lor z) \,\&\, (x \lor y) \,\&\, \bar{x}$ is forced piece by piece; the
conjunct $\bar{x}$ forces x to be false, so y must be true
(from $x \lor y$), so finally z must be true (from $\bar{y} \lor z$). In
contrast, it is harder to see that the formula 
is unsatisfiable. The problem is not just increased length; 
a different method of argument is required. Conclusions 
about the difficult formula are not forced step by step as 
with the easy formula. Instead, the lack of "local informa- 
tion channels" seems to force an argument by cases. 
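The step-by-step forcing on the easy formula can be illustrated with a small Python sketch of unit propagation, a standard technique that is analogous to, but not the same as, the Kimmo constraint-propagation algorithm described later. A literal is written as a hypothetical `(variable, is_positive)` pair. Because the difficult formula is not reproduced in this text, the `hard` example below is our own stand-in (all four two-literal clauses over x and y, which is unsatisfiable), not the paper's formula.

```python
def propagate(clauses):
    """Assign variables forced by unit clauses; never guesses or backtracks."""
    assignment = {}
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            # Literals not yet falsified by the current partial assignment.
            live = [(v, pos) for v, pos in clause
                    if assignment.get(v, pos) == pos]
            satisfied = any(assignment.get(v) == pos for v, pos in clause)
            if not satisfied and len(live) == 1:
                var, pos = live[0]        # the only way left to satisfy the clause
                assignment[var] = pos
                changed = True
    return assignment


# Easy formula: (not-y or z) & (x or y) & not-x
easy = [(("y", False), ("z", True)),
        (("x", True), ("y", True)),
        (("x", False),)]

# Assumed stand-in for a "hard" formula: unsatisfiable, but no clause is unit.
hard = [(("x", True), ("y", True)), (("x", True), ("y", False)),
        (("x", False), ("y", True)), (("x", False), ("y", False))]

print(propagate(easy))
print(propagate(hard))
```

On the easy formula every variable is forced in turn. On the stand-in formula nothing is forced, so a local method learns nothing and an argument by cases is needed to show unsatisfiability.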
A search procedure of the sort used in the Kimmo sys- 
tem embodies few assumptions about possible modularity 
in natural-language phonology. Instead, the implicit as- 
sumption is that any part of an analysis may depend on 
anything to its left. For example, consider the treatment of 
a right-to-left long-distance harmony process, which makes 
it impossible to determine the interpretation of a vowel 
when it is first encountered in a left-to-right scan. Faced 
with such a vowel, the current Kimmo system will choose 
an arbitrary possible interpretation and arrange for even- 
tual rejection if the required right context never shows up. 
In the event of rejection, the system will carry out chrono- 
logical backtracking until it eventually backs up to the er- 
roneous choice point. Another choice will then be made, 
but the entire analysis to the right of the choice point will 
be recomputed -- thus revealing the implicit assumption 
of possible dependence. 
By making few assumptions, such a search procedure 
is able to succeed even in the difficult case of SAT prob- 
lems. On the other hand, if modularity, local constraint, 
and limited information flow are more typical than difficult 
global problems, it is appropriate to explore methods that 
might reduce search by exploiting this aspect of informa- 
tion structure. 
We have begun exploring such methods in a prelim- 
inary and approximate way by implementing a modular, 
non-searching constraint-propagation algorithm (see Win- 
ston (1984) and other sources) for Kimmo generation and 
recognition. The deductive capabilities of the algorithm 
are limited and local, reflecting the belief that morpho- 
logical analyses can generally be determined piece by piece 
through local processes. The automata are largely decoupled
from each other, reflecting an expectation that phonological
constraints generally will not conspire together in
complicated ways. 
The algorithm will succeed when a solution can be 
built up, piece by superimposed piece, by individual au- 
tomata -- but by design, in more difficult cases the con- 
straints of the automata will be enforced only in an approx- 
imate way, with some nonsolutions accepted (as is usual 
with this kind of algorithm). In general, the guiding as- 
sumption is that morphological analysis problems actually 
have the kind of modular and superpositional information 
structure that will allow constraint propagation to suc- 
ceed, so that the complexity of a high-powered algorithm 
is not needed. (Such a modular structure seems consonant 
with the picture suggested by autosegmental phonology, 
in which various separate tiers flesh out the skeletal slots 
of a central core of CV timing slots; see Halle (1985) and
references cited there.)
SUMMARIZING COMBINATIONS OF POSSIBILITIES
The constraint-propagation algorithm differs from the 
Kimmo algorithms in its treatment of nondeterminism. In 
terms of Figure 1, nondeterminism cannot arise if both
the lexical and surface strings have already been determined.
This is true because a Kimmo automaton lists only one 
next state for a given lexical/surface pair. However, in the 
more common tasks of generation and recognition, only 
one of the two strings is given. The generation task that 
will be the focus here uses the automata to find the surface 
string (e.g. tries) that corresponds to a lexical string (e.g.
try+s) that is supplied as input.
As the Kimmo automata progress through the input,
they step over one lexical/surface pair at a time. Some
lexical characters will uniquely determine a lexical/surface
pair; in generation from try+s the first two pairs must be
t/t and r/r. But at various points, more than one
lexical/surface pair will be admissible given the evidence so
far. If y/y and y/i are both possible, the Kimmo search
machinery tries both pairs in subcomputations that have 
nothing to do with each other. The choice points can po- 
tentially build on each other to define a search space that 
is exponential in the number of independent choice points. 
This is true regardless of whether the search is carried out 
depth-first or breadth-first. 2
For example, return to the artificial Kimmo system 
that decides Boolean satisfiability for formulas in variables 
x, y, and z (Figure 2). When the initial y of the
formula yz,x-y-z,-x,-y is seen, there is nothing to decide
between the pairs y/T and y/F. If the system chooses y/T 
first, the choice will be remembered by the y-consistency 
automaton, which will enter state 2. Alternatively, if the 
possibility y/F is explored first, the y-consistency
automaton will enter state 3. After yz,x... has been seen, the
x-, y-, and z-consistency automata may be in any of the 
2 See Karttunen (1983:184) on the difference in search order
between Karttunen's Kimmo algorithms and the equivalent procedures
originally presented by Koskenniemi.
following state-combinations: 
(3,3,2) (2,3,2)
(3,2,3) (2,2,3)
(3,2,2) (2,2,2)
(The combinations (3, 3, 3) and (2, 3, 3) are not reachable 
because the disjunction yz that will have been processed 
rules out both y and z being false, but on a slightly dif- 
ferent problem those combinations would be reachable as 
well.) The search mechanism will consider these possible 
combinations individually. 
Thus, the Kimmo machinery applied to a k-variable 
SAT problem explores a search space whose elements are 
k-tuples of truth-values for the variables, represented in the 
form of k-tuples of automaton states. If there are k = 3 
variables, the search space distinguishes among (T, T, T), 
(T, T, F), and so forth -- among $2^k$ elements in general.
Roughly speaking, the Kimmo machinery considers the el- 
ements of the search space one at a time, and in the worst 
case it will enumerate all the elements. 
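The search behaviour described above can be sketched in Python as a depth-first generator over the Figure 2 system. This is an illustration only, not Karttunen's implementation: `step`, `generate`, and the visit counter are hypothetical names, the table column order is assumed (v/T, v/F, "=" for consistency; =/T, =/F, -/-, ,/, for satisfaction), and nulls and the dictionary component are ignored.

```python
NAMES = ("x", "y", "z", "sat")   # three consistency automata plus satisfaction


def step(name, state, pair):
    """Next state of one Figure 2 automaton on a lexical/surface pair (0 = reject)."""
    lex, surf = pair
    if name == "sat":
        col = "TF-,".index(surf)
        return ((2, 1, 3, 0), (2, 2, 2, 1), (1, 2, 0, 0))[state - 1][col]
    if lex != name:                      # the "=" column: state unchanged
        return state
    return ((2, 3), (2, 0), (0, 3))[state - 1]["TF".index(surf)]


def generate(lexical):
    """Depth-first search for surface strings; returns (solutions, nodes visited)."""
    solutions, visited = [], 0

    def search(i, states, surface):
        nonlocal visited
        visited += 1
        if i == len(lexical):
            if states[-1] == 2:          # satisfaction must end in its final state
                solutions.append(surface)
            return
        ch = lexical[i]
        # A variable may surface as T or F; "-" and "," surface as themselves.
        for surf in ("TF" if ch in "xyz" else ch):
            nxt = tuple(step(n, s, (ch, surf)) for n, s in zip(NAMES, states))
            if 0 not in nxt:
                search(i + 1, nxt, surface + surf)

    search(0, (1,) * len(NAMES), "")
    return solutions, visited


solutions, visited = generate("yz,x-y-z,-x,-y")
print(solutions, visited)
```

Each recursive call carries one complete tuple of automaton states, so the alternatives at successive choice points are explored as separate subcomputations. The visit count exceeds the length of the formula because wrong guesses such as y/T are abandoned only when a later clause rejects them.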
Instead of considering the tuples in this space indi- 
vidually, the constraint-propagation algorithm summarizes 
whole sets of tuples in slightly imprecise form. For
example, the above set of state-combinations would be
summarized by the single vector
$\langle \{2,3\}, \{2,3\}, \{2,3\} \rangle$
representing the truth-assignment possibilities
$(x \in \{T,F\},\ y \in \{T,F\},\ z \in \{T,F\})$.
The summary is less precise than the full set of state-tuples 
about the global constraints among the automata; here, 
the summary does not indicate that the state-combinations 
(3, 3, 3) and (2, 3, 3) are excluded. The constraint-propa- 
gation algorithm never enumerates the set of possibilities 
covered by its summary, but works with the summary it- 
self. 
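The loss of precision can be checked directly with a short Python sketch. The six reachable state-combinations are those listed above; the variable names are illustrative only.

```python
from itertools import product

# The six reachable (x, y, z) consistency-state combinations listed above.
reachable = {(3, 3, 2), (2, 3, 2), (3, 2, 3), (2, 2, 3), (3, 2, 2), (2, 2, 2)}

# Summarize by listing the possible states of each automaton separately.
summary = tuple({combo[i] for combo in reachable} for i in range(3))

# The summary implicitly covers every combination of the per-automaton sets.
covered = set(product(*summary))

print(summary)
print(sorted(covered - reachable))   # [(2, 3, 3), (3, 3, 3)]
```

The summary stores three small sets instead of six tuples, at the cost of also covering the two excluded combinations (3, 3, 3) and (2, 3, 3).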
The imprecision that arises from listing the possible 
states of each automaton instead of listing the possible 
combinations of states represents a decoupling of the au- 
tomata. In addition to helping avoid combinatorial blowup, 
this decoupling allows the state-possibilities for different 
automata to be adjusted individually. We do not expect 
that the corresponding imprecision will matter for natural 
language: instead, we expect that the decoupled automata 
will individually determine unique states for themselves, a 
situation in which the summary is precise. 3 For instance, 
3 Obviously, this can be true in a recognition problem only if the
input is morphologically unambiguous, in which case it can still fail to
hold if the constraint-propagation method is insufficiently powerful to
[Figure 3 tableau: one row each for the x-consistency, y-consistency,
z-consistency, and satisfaction automata, with state sets at the
intercharacter positions of the input y z , x and a pair set for each
character.]
Figure 3: The constraint-propagation algorithm produces this representation when processing
the first few characters of the formula yz,x-y-z,-x,-y using the automata from Figure 2. At
this point no truth-values have been definitely determined.
in the case of generation involving right-to-left vowel har- 
mony, only the vowel harmony automaton should exhibit 
nondeterminism, which should be resolved upon process- 
ing of the necessary right context. The imprecision also 
will not matter if two constraints are so independent that 
their solutions can be freely combined, since the summary 
will not lose any information in that case. 
CONSTRAINT PROPAGATION 
Like the Kimmo machinery, the constraint-propagation 
machinery is concerned with the states of the automata at 
intercharacter positions. But when nondeterminism makes 
more than one state-combination possible at some position, 
the constraint-propagation method summarizes the possi- 
bilities and continues instead of trying a single guess. The 
result is a two-dimensional multi-valued tableau containing 
one row for each automaton and one column for each
intercharacter position in the input. 4 Figure 3 shows the first
few columns that are produced in generating from the SAT 
rule out invalid possibilities. Note that many cases of morphological
ambiguity involve bracketing (e.g. un[loadable]/[unload]able)
rather than the identity of lexical characters. Though the matter is not
discussed here, we propose to handle bracketing ambiguity and
lexical-string ambiguity by different mechanisms. In addition, for discussions
of morphological ambiguity, it becomes very important whether the
input representation is phonetic or non-phonetically orthographic.
4 An extra column is needed at each position where a null might be
inserted.
formula yz,x-y-z,-x,-y. The initial y can be interpreted
as either y/T or y/F, and consequently the y-consistency 
automaton can end up in either state 2 or state 3. Simi- 
larly, depending on which pair is chosen, the satisfaction 
automaton can end up in either state 1 (no true value seen) 
or state 2 (a true value seen). 
In addition to the states of the automata, the tableau 
contains a pair set for each character, initialized to
contain all feasible lexical/surface pairs (cf. Gajek et al., 1983)
that match the input character. As Figure 3 suggests, the 
pair set is common to all the automata; each pair in the 
pair set must be acceptable to every automaton. If one 
automaton has concluded that there cannot be a surface 
g at the current position, it makes no sense to let another 
automaton assume there might be one. The automata are 
therefore not completely decoupled, and effects may prop- 
agate to other automata when one automaton eliminates a 
pair from consideration. Such propagation will occur only 
if more than one automaton distinguishes among the pos- 
sible pairs at a given position. For example, an automaton 
concerned solely with consonants would be unaffected by 
new information about the identity of a vowel. 
Waltz's line-labelling procedure, the best-known early
example of a constraint-propagation procedure (cf. Winston,
1984), proceeds from an underconstrained initial
labelling by eliminating impossible junction labels. A label is
impossible if it is incompatible with every possible label at
some adjacent junction. The constraint-propagation
procedure for Kimmo systems proceeds in much the same way.
A possible state of an automaton can be eliminated in four 
ways: 
• The only possible predecessor of the state (given the 
pair set) is ruled out in the previous state set. 
• The only possible successor of the state (given the pair 
set) is ruled out in the next state set. 
• Every pair that allows a transition out of the state is 
eliminated at the rightward character position. 
• Every pair that allows a transition into the state is 
eliminated at the leftward character position. 
Similarly, a pair is ruled out whenever any automaton be- 
comes unable to traverse it given the possible starting and 
ending states for the transition. (There are special rules 
for the first and last character position. Null characters 
also require special treatment, which will not be described 
here.) 
The configuration shown in Figure 3 is in need of con- 
straint propagation according to these rules. State 1 of the 
satisfaction automaton does not accept the comma/comma 
pair, so state 1 is eliminated from the possible states {1,2}
of the satisfaction automaton after z. State 1 has there- 
fore been shown as cancelled. However, the elimination of 
state 1 causes no further effects at this point. 
The current implementation simplifies the checking 
of the elimination conditions by associating sets 
of triples with character positions. Each triple 
(old state, pair, new state) is a complete description of one 
transition of a particular automaton. The left, right, and 
center projections of each triple set must agree with the 
state sets to the left and right and with the pair set for the 
position, respectively. Figure 4 shows two of the triple-sets 
associated with the z-position in Figure 3. 
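The triple sets of Figure 4 can be recomputed with a short Python sketch. It assumes the consistency tables of Figure 2 are read with columns v/T, v/F, and "="; `consistency_step` and `triples` are hypothetical helper names, and pairs are written as strings such as "z/T".

```python
def consistency_step(var, state, pair):
    """Next state of the `var`-consistency automaton (0 would mean rejection)."""
    lex, surf = pair.split("/")
    if lex != var:                     # "=" column: state unchanged
        return state
    return ((2, 3), (2, 0), (0, 3))[state - 1]["TF".index(surf)]


def triples(var, left, pair_set, right):
    """All (old state, pair, new state) transitions allowed by the three sets."""
    found = set()
    for old in left:
        for pair in pair_set:
            new = consistency_step(var, old, pair)
            if new in right:
                found.add((old, pair, new))
    return found


pair_set = {"z/T", "z/F"}
y_triples = triples("y", {2, 3}, pair_set, {2, 3})
z_triples = triples("z", {1}, pair_set, {2, 3})
print(sorted(y_triples))
print(sorted(z_triples))
```

Projecting a triple set onto its first, second, or third component gives the left state set, the pair set, or the right state set respectively, which is what makes the agreement conditions simple to check.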
The nondeterminism of Figure 3 is finally resolved when 
the trivial clauses at the end of the formula yz,x-y-z,-x,-y
are processed. After x in the clause -x all of the consistency 
automata are noncommittal, i.e. can be in either state 2 or 
state 3. The satisfaction automaton was in state 3 before 
the x because of the minus sign and it can use either of 
the triples (3,x/T, 1) or (3,x/F,2). However, on the next 
step it is discovered that only state 2 will allow it to tra- 
verse the comma that follows the x. The triple (3,x/T, 1) 
is eliminated and the pair x/T goes with it. The elimina- 
tion of x/T is propagated to the x-consistency automaton, 
which loses the triple (2,x/T,2) and can no longer sup- 
port state 2 in the left and right state sets. The loss of 
state 2, in turn, propagates leftward on the x-consistency
line back to the initial occurrence of x. The possibility x/T 
is eliminated everywhere it occurs along the way. Finally, 
processing resumes at the right edge. 
In similar fashion, the trivial clause -y eliminates the 
possibility y/T throughout the formula. However, this time 
the effects spread beyond the y-automaton. When the pos- 
sibility y/T is eliminated from the first pair-set in Figure 3, 
the satisfaction automaton can no longer support state 2 
between the y and z. This leaves (1,z/T,2) as the only 
active triple for the satisfaction automaton at the second 
character position. Thus z/F is eliminated and z is forced 
to truth. When everything settles down, the "easy" for- 
mula yz,x-y-z,-x,-y has received the satisfying truth- 
assignment FT, F-F-T, -F, -F. 
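The whole computation can be approximated by the following Python sketch. It is a simplification under stated assumptions, not the implementation described here: instead of moving left to right and propagating on demand, it initializes every state set, pair set, and triple set and then shrinks them to a global fixpoint; nulls and the special first/last-position rules are reduced to fixing the start state and the final states. All names (`step`, `propagate`, `render`, `FINAL`) are hypothetical, the input is assumed non-empty, and the `hard` formula is our own stand-in rather than the paper's.

```python
NAMES = ("x", "y", "z", "sat")
FINAL = {"x": {1, 2, 3}, "y": {1, 2, 3}, "z": {1, 2, 3}, "sat": {2}}


def step(name, state, pair):
    """Next state of one Figure 2 automaton on a lexical/surface pair (0 = reject)."""
    lex, surf = pair
    if name == "sat":
        col = "TF-,".index(surf)
        return ((2, 1, 3, 0), (2, 2, 2, 1), (1, 2, 0, 0))[state - 1][col]
    if lex != name:
        return state
    return ((2, 3), (2, 0), (0, 3))[state - 1]["TF".index(surf)]


def propagate(lexical):
    """Shrink state, pair, and triple sets until nothing changes; no search."""
    n = len(lexical)
    pairs = [{(c, s) for s in ("TF" if c in "xyz" else c)} for c in lexical]
    states = {a: [{1}] + [{1, 2, 3} for _ in range(n - 1)] + [set(FINAL[a])]
              for a in NAMES}
    triples = {a: [{(q, p, step(a, q, p)) for q in (1, 2, 3) for p in pairs[j]
                    if step(a, q, p) != 0} for j in range(n)]
               for a in NAMES}
    changed = True
    while changed:
        changed = False
        for a in NAMES:
            for j in range(n):
                # Keep only triples consistent with both state sets and the pair set.
                keep = {(q, p, r) for q, p, r in triples[a][j]
                        if q in states[a][j] and p in pairs[j]
                        and r in states[a][j + 1]}
                # The left, center, and right projections replace the three sets.
                new = (keep, {q for q, _, _ in keep},
                       {p for _, p, _ in keep}, {r for _, _, r in keep})
                old = (triples[a][j], states[a][j], pairs[j], states[a][j + 1])
                if new != old:
                    changed = True
                    triples[a][j], states[a][j], pairs[j], states[a][j + 1] = new
    return pairs


def render(pairs):
    """Show the surface side of each pair set; undetermined positions in braces."""
    out = []
    for options in pairs:
        surf = sorted(s for _, s in options)
        out.append(surf[0] if len(surf) == 1 else "{" + ",".join(surf) + "}")
    return "".join(out)


print(render(propagate("yz,x-y-z,-x,-y")))
hard = "xy,x-y,-xy,-x-y"        # assumed example: unsatisfiable, nothing is forced
print(render(propagate(hard)))
```

Because the pair set at each position is shared by all automata, an elimination made by the satisfaction automaton reaches the consistency automata through the center projection, and from there spreads along their rows. On the stand-in formula every variable keeps both values, so the summary remains uninformative even though no satisfying assignment exists.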
ALGORITHM CHARACTERISTICS
The constraint-propagation algorithm shares with the 
Waltz labelling procedure a number of characteristics that 
prevent combinatorial blowup: 5 
• The initial possibilities at each point are limited and 
non-combinatorial; in this case, the triples at some po- 
sition for an automaton can do no worse than to encode 
the whole automaton, and there will usually be only a 
few triples. It is particularly significant that the
number of triples does not grow combinatorially as more
automata are added. 
• Possibilities are eliminated monotonically, so the lim- 
ited number of initial possibilities guarantees a limited 
number of eliminations. 
• After initialization, propagation to the neighbors of a 
visited element takes place only if a possibility is elim- 
inated, so the limited number of eliminations guaran- 
tees a limited number of visits. 
• Limited effort is required for each propagator visit. 
However, we have not done a formal analysis of our im- 
plementation, in part because many details are subject to 
change. It would be desirable to replace the weak notion 
of monotonic possibility-elimination with some (stronger) 
notion of indelible construction of representation, based if 
possible on phonological features. Methods have also been 
envisioned for reducing the distance that information must 
be propagated in the algorithm. 
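The counting argument behind the first two characteristics can be written out informally. This is not a formal analysis of the implementation; it bounds only the number of initial possibilities, ignores nulls, and uses notation of our own: $n$ character positions, a set $A$ of automata, $Q_a$ the states of automaton $a$, $P_j$ the initial pair set at position $j$, and $T_{a,j}$ the initial triple set of $a$ at $j$. Since a Kimmo automaton lists only one next state for a given state and pair, $|T_{a,j}| \le |Q_a|\,|P_j|$.

```latex
% Total number of initial triples, summed over positions and automata.
\[
  \sum_{j=1}^{n} \sum_{a \in A} |T_{a,j}|
  \;\le\; \sum_{j=1}^{n} \sum_{a \in A} |Q_a|\,|P_j|
  \;\le\; n \cdot \max_{j} |P_j| \cdot \sum_{a \in A} |Q_a| .
\]
% Compare the number of state-combinations a search may distinguish
% at a single position:
\[
  \prod_{a \in A} |Q_a| .
\]
```

The first quantity grows with the sum of the automaton sizes, the second with their product. Because triples are only ever eliminated, the number of eliminations is bounded by the first quantity as well.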
The relative decoupling of the automata and the
general nature of constraint-propagation methods suggest that
a significantly parallel implementation is feasible.
However, it is uncertain whether the constraint-propagation
method enjoys an advantage on serial machines. It is
clear that the Kimmo machinery does combinatorial search 
while the constraint-propagation machinery does not, but 
5 Throughout this paper, we are ignoring complications related to
the possibility of nulls.
automaton      left states  pair set  right states  triples
y-consistency  {2,3}        z/T, z/F  {2,3}         (2,z/T,2) (3,z/T,3) (2,z/F,2) (3,z/F,3)
z-consistency  {1}          z/T, z/F  {2,3}         (1,z/T,2) (1,z/F,3)
Figure 4: When the active transitions of each automaton are represented by triples, it is easy 
to enforce the constraints that relate the left and right state-sets and the pair set. The left 
configuration is excerpted from Figure 3, while the right configuration shows the underlying 
triples. The set of triples for the y-consistency automaton could easily be represented in more 
concise form. 
we have not investigated such questions as whether an ana- 
logue to BIGMACHINE precompilation (Gajek et al., 1983) 
is possible for the constraint-propagation method. BIG- 
MACHINE precompilation speeds up the Kimmo machin- 
ery at a potentially large cost in storage space, though it 
does not reduce the amount of search. 
The constraint-propagation algorithm for generation 
has been tested with previously constructed Kimmo au- 
tomata for English, Warlpiri, and Turkish. Preliminary re- 
sults suggest that the method works. However, we have not 
been able to test our recognition algorithm with previously 
constructed automata. The reason is that existing Kimmo 
automata rely heavily on the dictionary when used for 
recognition. We do not yet have our Kimmo dictionaries 
hooked up to the constraint-propagation algorithms, and 
consequently an attempt at recognition produces mean- 
ingless results. For instance, without constraints from 
the dictionary the machinery may choose to insert suffix- 
boundary markers + anywhere because the automata do 
not seriously constrain their occurrence. 
Figure 5 shows the columns visited by the algorithm 
when running the Warlpiri generator on a typical example, 
in this case a past-tense verb form ('scatter-PAST') taken 
from Nash (1980:85). The special lexical characters I and 
<u2> implement a right-to-left vowel assimilation process. 
The last two occurrences of I surface as u under the influ- 
ence of <u2>, but the boundary # blocks assimilation of the 
first two occurrences. Here the propagation of constraints 
has gone backwards twice, once to resolve each of the two 
sets of I-characters. The final result is ambiguous because 
our automata optionally allow underlying hyphens to ap- 
pear on the surface, in accordance with the way morpheme 
boundaries are indicated in many articles on Warlpiri. 
The generation and recognition algorithms have also 
been run on mathematical SAT formulas, with the
desired result that they can handle "easy" but not
"difficult" formulas as described above. 6 For the easy formula
$(\bar{y} \lor z) \,\&\, (x \lor y) \,\&\, \bar{x}$ constraint propagation determines the
solution $(T \lor T) \,\&\, (F \lor T) \,\&\, F$. But for the hard formula
constraint propagation produces only the wholly
uninformative truth-assignment
$(\{T,F\} \lor \{T,F\} \lor \{T,F\}) \,\&\, (\{T,F\} \lor \{T,F\})$
$\&\, (\{T,F\} \lor \{T,F\}) \,\&\, (\{T,F\} \lor \{T,F\})$
$\&\, (\{T,F\} \lor \{T,F\}) \,\&\, (\{T,F\} \lor \{T,F\})$
Since we believe linguistic problems are likely to be more 
like the easy problem than the hard one, we believe the 
constraint-propagation system is an appropriate step to- 
ward the goal of developing algorithms that exploit the 
information structure of linguistic problems.
6 Note that the current classification of formulas as "easy" is
different from polynomial-time satisfiability. In particular, the restricted
problem 2SAT can be solved in polynomial time by resolution, but not
every 2SAT formula is "easy" in the current sense.
  0  1  2  3  4  5
     1  2  3  4
        2  3  4  5  6  7  8  9 10 11 12 13
                       7  8  9 10 11 12
                          8  9 10 11 12 13 14
pIrrI#kIjI-rn<u2>: result ambiguous, pirri{0,-}kuju{-,0}rnu
Figure 5: This display shows the columns visited by the constraint-propagation algorithm when 
the Warlpiri generator is used on the form pIrrI#kIjI-rn<u2> 'scatter-PAST'. Each reversal
of direction begins a new line. Leftward movement always begins with a position adjacent to 
the current position, but it is an accidental property of this example that rightward movement 
does also. The final result is ambiguous because the automata are written to allow underlying 
hyphens to appear optionally on the surface. 
ACKNOWLEDGEMENTS 
This report describes research done at the Artificial 
Intelligence Laboratory of the Massachusetts Institute of 
Technology. Support for the Laboratory's artificial intel- 
ligence research has been provided in part by the Ad- 
vanced Research Projects Agency of the Department of 
Defense under Office of Naval Research contract N00014- 
80-C-0505. This research has benefited from guidance and 
commentary from Bob Berwick. 
REFERENCES 
Barton, E. (1986). "Computational Complexity in Two- 
Level Morphology," ACL-86 proceedings (this volume). 
Gajek, O., H. Beck, D. Elder, and G. Whittemore (1983).
"LISP Implementation [of the KIMMO system]," Texas
Linguistic Forum 22:187-202.
Halle, M. (1985). "Speculations about the Representa- 
tion of Words in Memory," in V. Fromkin, ed., Pho- 
netic Linguistics: Essays in Honor of Peter Ladefoged, 
pp. 101-114. New York: Academic Press. 
Karttunen, L. (1983). "KIMMO: A Two-Level
Morphological Analyzer," Texas Linguistic Forum 22:165-186.
Nash, D. (1980). Topics in Warlpiri Grammar. Ph.D. the- 
sis, Department of Linguistics and Philosophy, M.I.T., 
Cambridge, Mass. 
Winston, P. (1984). Artificial Intelligence, second edition. 
Reading, Mass.: Addison-Wesley. 
