Phonological Analysis in Typed 
Feature Systems 
Steven Bird* 
University of Edinburgh 
Ewan Klein* 
University of Edinburgh 
Research on constraint-based grammar frameworks has focused on syntax and semantics largely to 
the exclusion of phonology. Likewise, current developments in phonology have generally ignored 
the technical and linguistic innovations available in these frameworks. In this paper we suggest 
some strategies for reuniting phonology and the rest of grammar in the context of a uniform 
constraint formalism. We explain why this is a desirable goal, and we present some conservative 
extensions to current practice in computational linguistics and in nonlinear phonology that we 
believe are necessary and sufficient for achieving this goal. 
We begin by exploring the application of typed feature logic to phonology and propose a 
system of prosodic types. Next, taking HPSG as an exemplar of the grammar frameworks we have 
in mind, we show how the phonology attribute can be enriched so that it can encode multi-tiered, 
hierarchical phonological representations. Finally, we exemplify the approach in some detail for the 
nonconcatenative morphology of Sierra Miwok and for schwa alternation in French. The approach 
taken in this paper lends itself particularly well to capturing phonological generalizations in terms 
of high-level prosodic constraints. 
1. Phonology in Constraint-Based Grammar 
Classical generative phonology is couched within the same set of assumptions that 
dominated standard transformational grammar. Despite some claims that "deriva- 
tions based on ordered rules (that is, external ordering) and incorporating interme- 
diate structures are essential to phonology" (Bromberger and Halle 1989:52), much 
recent work has tended toward a new model, frequently described in terms of con- 
straints on well-formedness (Paradis 1988; Goldsmith 1993; McCarthy and Prince 1993; 
Prince and Smolensky 1993). While this work has an increasingly declarative flavor, 
most versions retain procedural devices for repairing representations that fail to meet 
certain constraints, or for constraints to override each other. This view is in marked 
contrast to the interpretation of constraints in grammar frameworks like LFG, GPSG, 
and HPSG 1 and in constraint programming systems more generally (Jaffar and Lassez 
1987; Smolka 1992). In such approaches, constraints cannot be circumvented, there 
are no 'intermediate structures,' and the well-formedness constraint (Partee 1979) is 
observed (i.e. ill-formed representations can never be created). The advantage of these 
frameworks is that they allow interesting linguistic analyses to be encoded while 
remaining computationally tractable. 
* University of Edinburgh, Centre for Cognitive Science, 2 Buccleuch Place, Edinburgh EH8 9LW, U.K.
E-mail: steven.ewan@cogni.ed.ac.uk
1 Lexical Functional Grammar (Kaplan and Bresnan 1982), Generalized Phrase Structure Grammar
(Gazdar, Klein, Pullum, and Sag 1985), and Head-Driven Phrase Structure Grammar (Pollard and Sag
1987).
© 1994 Association for Computational Linguistics 
Computational Linguistics Volume 20, Number 3 
Here, we are interested in the question of what a theory of phonology ought to look 
like if it is to be compatible with a constraint-based grammar framework. This issue 
has already received attention, 2 although a thoroughgoing integration of phonology 
into constraint-based grammars has yet to be attempted. To ease exposition, we shall 
take HPSG as a suitably representative candidate of such approaches. Although we are
broadly committed to a sign-oriented approach to grammar, none of our proposals 
depends crucially on specific tenets of HPSG. 
Rather than attempting to theorize at an abstract level about constraint-based 
phonology, we shall engage in two case studies intended to give a concrete illustra- 
tion of important issues: these involve templatic morphology in Sierra Miwok and 
schwa alternation in French. Before launching into these studies, however, we present 
an overview of some aspects of phonology that present a challenge to standard as- 
sumptions taken in sign-oriented constraint-based grammars. Then we describe a (sim- 
plified) version of HPSG that will make it possible to illustrate the approach without 
irrelevant technical machinery. 
1.1 The Challenge of Phonology 
Given that the dominant focus of most research in constraint-based grammar has 
been syntax and semantics, it is not surprising that the phonological content of words 
and phrases has been largely limited to orthographic strings, supplemented with a 
concatenation operation. How far would such representations have to be enriched if 
we wanted to accommodate a more thoroughgoing treatment of phonology? 
As remarked earlier, recent work in theoretical phonology has apparently moved 
closer to a constraint-based perspective, and is thus a promising starting point for our 
investigation. Yet there are at least three challenges that confront anyone looking into 
theoretical phonology from the viewpoint of computational linguistics. Most striking 
perhaps is the relative informality of the language in which theoretical statements are 
couched. Bird and Ladd (1991) have catalogued several examples of this: notational 
ambiguity (incoherence), definition by example (informality), variable interpretation 
of notation depending on subjective criteria (inconsistency), and uncertainty about 
empirical content (indeterminacy). When a clear theoretical statement can be found, 
it is usually expressed in procedural terms, which clouds the empirical ramifications,
making a theory difficult to falsify. Finally, even when explicit and nonprocedural
generalizations are found, they are commonly stated in a nonlinear model, which 
clearly goes beyond the assumptions about phonology made in HPSG as it currently 
stands. 
We approach these challenges by adopting a formal, nonprocedural, nonlinear 
model of phonology and showing how it can be integrated into HPSG, following on 
the heels of recent work by the authors (Bird and Klein 1990; Bird 1992; Klein 1992). 
One of the starting assumptions of this work is that phonological representations 
are intensional, i.e. each representation is actually a description of a class of utterances. 
Derivations progress by refining descriptions, further constraining the class of denoted 
objects. Lexical representations are likewise partial, and phonological constraints are 
cast as generalizations in a lexical inheritance hierarchy or in a prosodic inheritance 
hierarchy. When set against the background of constraint-based grammar, this intensional
approach is quite natural (cf. Johnson [1988]). Moreover, some recent thinking on
the phonology-phonetics interface supports this view (Pierrehumbert 1990; Coleman
1992). However, it represents a fundamental split with the generative tradition, where
rules do not so much refine descriptions as alter the objects themselves (Keating 1984).
2 (Bach and Wheeler 1981; Wheeler 1981; Bird 1990; Cahill 1990; Coleman 1991; Scobbie 1991; Bird 1992;
Walther 1992; Mastroianni 1993; Russell 1993)
While it is clearly possible to integrate an essentially generative model into the 
mold of constraint-based grammar (Krieger, Pirker, and Nerbonne 1993), it is less clear 
that this is the approach most phonologists would wish to take nowadays. It is be- 
coming increasingly apparent that rule-based relationships between surface forms and 
hypothetical lexical forms are unable to capture important generalizations about sur- 
face forms. This concern was voiced early in the history of generative phonology, when 
Kisseberth (1970) complained that such rules regularly conspire to achieve particular 
surface configurations, but are unable to express the most elementary observations 
about what those surface configurations are. As a criticism of rule-based systems, 
Kisseberth's complaint remains valid and has been echoed several times since then 
(Shibatani 1973; Hooper 1976; Hudson 1980; Manaster Ramer 1981). However, recent 
work in phonology has moved away from models involving rules that relate lexical 
and surface forms toward models involving general systems of interacting constraints, 
where this problem has been side-stepped. 
Accordingly, we avoid the theoretical framework of early generative phonology, 
focusing instead on encoding phonological constraints in a constraint-based grammar 
framework. We present an overview of the grammar framework in the next section. 
1.2 Motivation 
At this point, we should briefly address the question: What is gained by integrating 
phonology into a constraint-based grammar? One pragmatic answer is that approaches 
like HPSG have already taken this step, by introducing a PHONOLOGY attribute that 
parallels attributes for SYNTAX and SEMANTICS. Since, as we have already pointed out, 
the value of PHONOLOGY needs to be enriched somehow if it is to be linguistically 
adequate, it is reasonable to ask whether the formalism allows insightful statements 
of phonological generalizations. 
An objection might take the following form: phonology is formally less complex 
than syntax, as shown by the body of work on finite state analyses of phonology (cf. 
Section 1.4). Hence, it is inappropriate to encode phonology in a general purpose for- 
malism that has been designed to accommodate more complex phenomena. As a first 
response, we would maintain that formalisms should not be confused with theories. 
Certainly, we want to have a restrictive theory of phonology and its interactions with 
other levels of grammar. But we view constraint-based formalisms as languages for 
expressing such theories, not as theories themselves. Moreover, the fact that we use a 
uniform constraint formalism does not force us to use homogeneous inferential mech- 
anisms for that formalism; this issue is discussed further in Section 1.4 and Section 6. 
A further question might be: do natural language grammars require the kind 
of interaction between phonology and other levels of grammar made possible by 
constraint-based formalisms? This is not the place to explore this issue in the detail 
it deserves. However, even if we accept the contention of Pullum and Zwicky (1984) 
that the interactions between phonology and syntax (narrowly construed) are highly 
restricted, there are still good reasons for wanting to accommodate phonological rep- 
resentations as one of the constraints in a sign-based grammar framework. 
To begin with, it is relatively uncontroversial that morphology needs to be in- 
terfaced with both syntax and phonology. Approaches like that of Krieger and Ner- 
bonne (in press) have shown that both derivational and inflectional morphology can 
be usefully expressed within the constraint-based paradigm. Taking the further step 
of adding phonology seems equally desirable. 
Second, the use of typed feature structures within the lexicon has been strongly 
argued for by Briscoe (1991) and Copestake et al. (in press). That is, even when we 
ignore syntactic combination, constraint-based grammar frameworks turn out to be 
well suited to expressing the category and semantic information fields of lexical entries. 
But the interaction of phonology with categorial information inside the lexicon is well 
documented. Lexical phonology (Kiparsky 1982) has shown in detail how phonological 
phenomena are conditioned by morphologically specified domains. If direct interaction 
between phonology and morpho-syntax is prohibited, one can only resort to ad hoc 
and poorly motivated diacritic features. 
Turning to a different empirical domain, it can be argued that focus constructions 
exhibit an interaction between information structure (at the semantic-pragmatic level) 
with prosodic structure (at the phonological level). This interaction can be directly 
expressed in a sign-oriented approach. In other frameworks it is common practice to 
avoid direct reference to phonology by invoking a morpho-syntactic FOCUS feature 
(e.g. Selkirk 1984; Rooth 1985); the mediation of syntax in this way appears to be more 
an artifact of the grammar architecture than an independently motivated requirement. 
Equally, it has been argued that the phenomenon of heavy NP shift is a kind of syntax- 
phonology interaction that is simply stated in a constraint-based approach, where the 
linear precedence constraints of syntax are sensitive to the phonological category of 
weight (Bird 1992). 
1.3 Theoretical Framework 
Typed feature structures (Carpenter 1992) impose a type discipline on constraint-based 
grammar formalisms. A partial ordering over the types gives rise to an inheritance 
hierarchy of constraints. As Emele and Zajac (1990) point out, this object-oriented 
approach brings a number of advantages to grammar writing, such as a high level of 
abstraction, inferential capacity and modularity. 
On the face of it, such benefits should extend beyond syntax--to phonology for 
example. Although there have been some valuable efforts to exploit inheritance and 
type hierarchies within phonology (e.g. Reinhard and Gibbon 1991), the potential of 
typed feature structures for this area has barely been scratched so far. In this section, we 
present a brief overview of HPSG (Pollard and Sag 1987), a constraint-based grammar 
formalism built around a type system that suits our purposes in phonology. 
In order to formulate the type system of our grammar, we need to make two 
kinds of TYPE DECLARATION. The first kind contains information about the subsumption 
ordering over types. For example, the basic grammar object in HPSG is the feature 
structure of type sign. The type sign has some SUBTYPES. If $\sigma$ is a subtype of $\tau$, then $\sigma$
provides at least as much information as $\tau$. A type declaration for sign defines it as
the following disjunction of subtypes: 3
Example 1
$$\textit{sign} \Rightarrow \textit{morph} \lor \textit{stem} \lor \textit{word} \lor \textit{phrase}$$
The second kind of declaration is an APPROPRIATENESS CONDITION. That is, for each 
type, we declare (all and only) the attributes for which it is specified, and additionally 
the types of values which those attributes can take. 4 For example, objects of type sign
could be constrained to have the following features defined:
Example 2
$${}_{\textit{sign}}\begin{bmatrix} \text{PHON} : \textit{phon} \\ \text{SYNSEM} : \textit{synsem} \\ \text{DTRS} : \textit{list} \end{bmatrix}$$
3 The constraints proposed here deviate in various respects from the standard version of HPSG. We
follow Carpenter (1992) in using the notation $\sigma \Rightarrow \phi$ to specify that type $\sigma$ satisfies constraint $\phi$.
4 We are using what Carpenter (1992) calls TOTAL WELL-TYPING. That is, (i) the only attributes and values
that can be specified for a given feature structure of type $\tau$ are those appropriate for $\tau$; and (ii) every
feature structure of type $\tau$ must be specified for all attributes appropriate for $\tau$.
That is, feature structures of type sign must contain the attributes PHON (i.e. phonol- 
ogy), SYNSEM (i.e. syntax/semantics), 5 and DTRS (i.e. daughters) and these attributes 
must take values of a specific type (i.e., phon, synsem, and list, respectively). A further 
crucial point is that appropriateness conditions are inherited by subtypes. For exam- 
ple, since morph is a subtype of sign, it inherits all the constraints obeyed by sign. 
Moreover, as we shall see in Section 3.2, it is subject to some further appropriateness 
conditions that are not imposed on any of its supertypes. 
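To make the two kinds of declaration concrete, the following is a minimal Python sketch of Examples 1 and 2. The names `SUBTYPES`, `APPROP`, `supertypes`, `appropriate`, and `totally_well_typed` are our own illustrative choices, and feature structures are flattened to a mapping from attributes to type names; this is an assumption made for brevity, not a rendering of any existing HPSG implementation.

```python
# Hypothetical encoding of the declarations in Examples 1 and 2.
# SUBTYPES maps a type to its immediate subtypes; APPROP maps a type to the
# attributes it introduces, paired with the type required of each value.
SUBTYPES = {
    "sign": ["morph", "stem", "word", "phrase"],
    "list": ["nelist", "elist"],
}
APPROP = {
    "sign": {"PHON": "phon", "SYNSEM": "synsem", "DTRS": "list"},
}


def supertypes(t):
    """Return t followed by all of its ancestors, most specific first."""
    chain = [t]
    for parent, children in SUBTYPES.items():
        if t in children:
            chain.extend(supertypes(parent))
    return chain


def is_subtype(sub, sup):
    """Every type counts as a subtype of itself."""
    return sup in supertypes(sub)


def appropriate(t):
    """Attributes declared on t together with those inherited from supertypes."""
    attrs = {}
    for ancestor in reversed(supertypes(t)):
        attrs.update(APPROP.get(ancestor, {}))
    return attrs


def totally_well_typed(t, fs):
    """fs maps each attribute to the type name of its value.

    Total well-typing: exactly the appropriate attributes are present, and
    each value's type is a subtype of the declared value type.
    """
    required = appropriate(t)
    return set(fs) == set(required) and all(
        is_subtype(fs[a], required[a]) for a in required
    )
```

Under these assumptions `appropriate("morph")` returns the three attributes declared for sign, which is the inheritance of appropriateness conditions described above.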
Continuing in the same vein, we can assign appropriateness conditions to the types 
synsem and phon that occurred as values in (2), (simplifying substantially from standard 
HPSG). Here we give the constraints for synsem. The type phon will be discussed in 
Section 2. 
Example 3 
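$${}_{\textit{synsem}}\begin{bmatrix} \text{CAT} : \textit{cat} \\ \text{AGR} : \textit{agr} \\ \text{SUBCAT} : \textit{list} \\ \text{SEM} : \textit{semantics} \end{bmatrix}$$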
To conclude this section, we shall look very briefly at matters of interpretation 
and inference. As shown by Carpenter (1992) and Zajac (1992, in press), we can use 
constraint resolution to carry out type inference for feature terms. Following Zajac, let 
us say that a GROUND feature term is a term all of whose type symbols are minimal 
(i.e., the most specific types in the hierarchy immediately above $\bot$). A WELL-TYPED
feature term is one that obeys all the type definitions. Then the meaning of a feature 
term F is given by the set of all well-typed ground feature terms that are subsumed by 
F. Evaluating F, construed as a query, involves describing F's denotation; for example, 
enumerating all the well-typed ground feature terms it subsumes. Since the type definitions
obeyed by F might be recursive, its denotation is potentially infinite. Consider
for example the following definitions (where "nelist" and "elist" stand for nonempty list
and empty list respectively, and $\top$ subsumes every type):
Example 4
a. $\textit{list} \Rightarrow \textit{nelist} \lor \textit{elist}$
b. ${}_{\textit{nelist}}\begin{bmatrix} \text{FIRST} : \top \\ \text{REST} : \textit{list} \end{bmatrix}$
5 Earlier versions of HPSG kept syntax and semantics as separate attributes, and we shall sometimes 
revert to the latter when borrowing examples from other people's presentations. 
Here, the denotation of the type symbol list is the set of all possible ground lists. 
In practice, a constraint solver could recursively enumerate all these solutions; an 
alternative proposed by Zajac would be to treat the symbol LIST as the best finite 
approximation of the infinite set of all lists. 
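The recursive enumeration mentioned here can be pictured with a short Python sketch. It is only an illustration under a simplifying assumption: the open-ended value $\top$ of FIRST is replaced by a small finite set of atoms (the parameter `atoms`, a hypothetical name), since otherwise there would be nothing finite to enumerate at each position.

```python
from itertools import count, islice, product


def ground_lists(atoms):
    """Lazily enumerate every ground list over `atoms`, shortest first.

    This mirrors Example 4: a list is either an elist (the empty tuple here)
    or a nelist whose FIRST is some atom and whose REST is again a list.
    The generator never terminates, reflecting the infinite denotation.
    """
    for length in count(0):
        for items in product(atoms, repeat=length):
            yield items


# Only a finite prefix of the denotation can ever be inspected.
first_seven = list(islice(ground_lists(["a", "b"]), 7))
```

A solver that enumerates solutions in this way must be consumed lazily, which is one way of seeing why a finite approximation of the set of all lists is attractive.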
1.4 Finite-State Phonology 
Over the last decade much has been written on the application of finite-state trans- 
ducers (FSTs) to phonology, centering on the TWO-LEVEL MODEL of Koskenniemi (1983). 
Antworth (1990) and Sproat (1992) give comprehensive introductions to the field. The 
formalism is an attractive computational model for 1960s generative phonology. How- 
ever, as has already been noted, phonologists have since moved away from complex 
string rewriting systems to a range of so-called nonlinear models of phonology. The 
central innovation of this more recent work is the idea that phonological representa- 
tions are not strings but collections of strings, synchronized like an orchestral score. 
There have been some notable recent attempts to rescue the FST model from its 
linearity in order to encompass nonlinear phonology (Kay 1987; Kornai 1991; Wiebe 
1992). However, from our perspective, these refinements to the FST model still admit 
unwarranted operations on phonological representations, as well as rule conspiracies 
and the like. Rather, we believe a more constrained and linguistically appealing approach
is to employ finite-state automata (FSAs) in preference to FSTs, since it has
been shown how FSAs can encode autosegmental representations and a variety of constraints
on those representations (Bird and Ellison 1994). The leading idea in this work
is that each tier is a partial description of a string, and tiers are put together using the 
intersection operation defined on FSAs. 
Apart from being truer to current phonological theorizing, this one-level model 
has a second important advantage over the two-level model. Since the set of FSAs forms 
a Boolean lattice under intersection, union, and complement (a direct consequence of 
the standard closure properties for regular languages), we can safely conjoin ('unify'), 
disjoin, and negate phonological descriptions. Such a framework is obviously compat- 
ible with constraint-based grammar formalisms, and there is no reason in principle 
to prevent us from augmenting HPSG with the data type of regular expressions. In 
practice, we are not aware of any existing implementations of HPSG (or other feature- 
based grammars) that accommodate regular expressions. Ideally, we would envisage a 
computational interpretation of typed feature structures where operations on regular 
expression values are delegated to a specialized engine that manipulates the corre- 
sponding FSAs and returns regular expression results. 6 This issue is discussed further 
in Section 6. 
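The closure properties appealed to above can be illustrated with a small Python sketch of the textbook product and complement constructions on complete deterministic automata. This is not the autosegmental encoding of Bird and Ellison (1994); the class `DFA`, the two toy descriptions, and the alphabet of C and V symbols are assumptions introduced purely for illustration.

```python
from itertools import product


class DFA:
    """A complete deterministic automaton over a fixed alphabet."""

    def __init__(self, alphabet, delta, start, finals):
        self.alphabet, self.delta = alphabet, delta
        self.start, self.finals = start, set(finals)
        # Every state appears as a source or a target of some transition.
        self.states = {start} | {q for q, _ in delta} | set(delta.values())

    def accepts(self, string):
        q = self.start
        for symbol in string:
            q = self.delta[(q, symbol)]
        return q in self.finals


def intersect(m1, m2):
    """Product construction: accept exactly the strings both machines accept."""
    delta = {
        ((p, q), a): (m1.delta[(p, a)], m2.delta[(q, a)])
        for p, q in product(m1.states, m2.states)
        for a in m1.alphabet
    }
    finals = {(p, q) for p in m1.finals for q in m2.finals}
    return DFA(m1.alphabet, delta, (m1.start, m2.start), finals)


def complement(m):
    """Negate a description by swapping final and non-final states."""
    return DFA(m.alphabet, m.delta, m.start, m.states - m.finals)


# Two partial descriptions of the same string (hypothetical examples):
# "begins with C" and "ends with V".
starts_c = DFA("CV", {(0, "C"): 1, (0, "V"): 2, (1, "C"): 1, (1, "V"): 1,
                      (2, "C"): 2, (2, "V"): 2}, 0, {1})
ends_v = DFA("CV", {(0, "C"): 0, (0, "V"): 1, (1, "C"): 0, (1, "V"): 1}, 0, {1})
both = intersect(starts_c, ends_v)
```

Conjoining the two descriptions yields a machine that accepts CV and CVCV but rejects CVC, while `complement(both)` accepts exactly the strings that `both` rejects.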
1.5 Overview of the Paper 
The structure of the paper is as follows. In the next section, we present our assump- 
tions about phonological representations and phenomena, couched in the framework 
of typed feature logic. In Section 3 we discuss our view of the lexicon, borrowing heav- 
ily on HPSG's lexical type hierarchy, and developing some operations and representa- 
tions needed for morphology. The next two sections investigate various applications 
of the approach to two rather differing phenomena, namely Sierra Miwok templatic 
morphology and French schwa. Section 6 discusses some implementation issues. The 
paper concludes with a summary and a discussion of future prospects. 
6 A similar approach is envisaged by Krieger, Pirker, and Nerbonne (1993). 
2. String-Based Phonology 
In this section we present a string-based phonology based on the HPSG list notation. 
We present the approach in Section 2.1 and Section 2.2, concluding in Section 2.3 with 
a discussion of prosodic constituency. 
2.1 List Notations 
As a concession to existing practice in HPSG, we have taken the step of using lists in 
place of strings. We shall use angle bracket notation as syntactic sugar for the standard 
FIRST/REST encoding. 
We shall assume that the type system allows parameterized types of the form 
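$\textit{list}(\alpha)$, where $\alpha$ is an atomic type.
Example 5
$$\textit{list}(\alpha) \Rightarrow \textit{elist}(\alpha) \lor \textit{nelist}(\alpha)$$
$${}_{\textit{nelist}(\alpha)}\begin{bmatrix} \text{FIRST} : \alpha \\ \text{REST} : \textit{list}(\alpha) \end{bmatrix}$$
We can now treat $\alpha^{*}$ and $\alpha^{+}$ as abbreviations for $\textit{list}(\alpha)$ and $\textit{nelist}(\alpha)$ respectively.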
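Another useful abbreviatory notation is parenthesized elements within lists. We
shall interpret $\langle a\,(b) \rangle \frown L$, a list consisting of an a followed by an optional b concatenated
with an arbitrary list L, as the following constraint:
Example 6
$${}_{\textit{list}}\begin{bmatrix} \text{FIRST} : a \\ \text{REST} : {}_{\textit{list}}\begin{bmatrix} \text{FIRST} : b \\ \text{REST} : L \end{bmatrix} \lor L \end{bmatrix}$$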
We shall see applications of these list notations in the next section. 
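As an informal aid, the list abbreviations of this section can be paraphrased as Python predicates over ordinary lists. The function names are hypothetical, Python classes stand in for atomic types, and equality of elements stands in for type checking of a and b; none of this is part of the formalism itself.

```python
def is_list_of(xs, alpha):
    """The starred abbreviation: every element has type alpha (empty list allowed)."""
    return all(isinstance(x, alpha) for x in xs)


def is_nelist_of(xs, alpha):
    """The plus abbreviation: as above, but at least one element is required."""
    return len(xs) > 0 and is_list_of(xs, alpha)


def tails(xs, a, b):
    """Return every list L such that xs == [a] + L or xs == [a, b] + L.

    This corresponds to the disjunction in Example 6: the REST of the list
    is either L itself or a list whose FIRST is b and whose REST is L.
    """
    if not xs or xs[0] != a:
        return []
    result = [xs[1:]]              # reading in which the optional b is absent
    if len(xs) > 1 and xs[1] == b:
        result.append(xs[2:])      # reading in which the optional b is present
    return result
```

Note that a list beginning with a followed by b has two readings, since the b may either realize the optional element or belong to L.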
2.2 A Prosodic Type Hierarchy 
A PROSODIC TYPE HIERARCHY is a subsumption network akin to the lexical hierarchy of 
HPSG (Pollard and Sag 1987). The type constraints we have met so far can be used to 
define a type hierarchy, which for present purposes will be a Boolean lattice. In this 
section we present in outline form a prosodic hierarchy that subsequent analyses will 
be based on. Example (7) defines the high-level types in the hierarchy. 
Example 7 
$$\textit{phon} \Rightarrow \textit{utterance} \lor \textit{phrase} \lor \textit{foot} \lor \textit{syl} \lor \textit{segment}$$
Each of these types may have further structure. For example, following Clements 
(1985:248) we may wish to classify segments in terms of their place and manner of 
articulation, using the following appropriateness declaration. 
Example 8 
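$${}_{\textit{segment}}\begin{bmatrix}
\text{LARYNGEAL} : \begin{bmatrix} \text{SPREAD} : \textit{boolean} \\ \text{CONSTRICTED} : \textit{boolean} \\ \text{VOICED} : \textit{boolean} \end{bmatrix} \\
\text{SUPRALARYNGEAL} : \begin{bmatrix}
\text{MANNER} : \begin{bmatrix} \text{NASAL} : \textit{boolean} \\ \text{CONTINUANT} : \textit{boolean} \\ \text{STRIDENT} : \textit{boolean} \end{bmatrix} \\
\text{PLACE} : \begin{bmatrix} \text{CORONAL} : \textit{boolean} \\ \text{ANTERIOR} : \textit{boolean} \\ \text{DISTRIBUTED} : \textit{boolean} \end{bmatrix}
\end{bmatrix}
\end{bmatrix}$$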
Suppose now that we wished to use these structures in a constraint for English homor- 
ganic nasal assimilation. This phenomenon does not occur across phonological phrase 
boundaries and so the constraint will be part of the definition of the type (phonologi- 
cal) phrase. Let us assume that a phrase is equivalent to segment*, i.e. a list of segments. 
Informally speaking, we would like to impose a negative filter that bars any nasal 
whose value for place of articulation differs from that of the stop consonant that im- 
mediately follows. Here, we use SL as an abbreviation for SUPRALARYNGEAL, CONT for 
CONTINUANT, MN for MANNER, and PL for PLACE. 
Example 9 
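$$\neg\; {}_{\textit{phrase}}\left\langle \ldots\; {}_{\textit{segment}}\begin{bmatrix} \text{SL} : \begin{bmatrix} \text{MN}|\text{NASAL} : + \\ \text{PL} : \boxed{1} \end{bmatrix} \end{bmatrix} \; {}_{\textit{segment}}\begin{bmatrix} \text{SL} : \begin{bmatrix} \text{MN}|\text{CONT} : - \\ \text{PL} : \neg\boxed{1} \end{bmatrix} \end{bmatrix} \;\ldots \right\rangle$$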
While the abbreviatory conventions in this filter might appear suspicious, it is straight- 
forwardly translated into the constraint in (10). This constraint is divided into three 
parts. The first simply requires that hna be a subtype of list(segment). The second part 
is lifted from (9), ensuring that the first two positions in the list do not violate the 
assimilation constraint. The third part propagates the assimilation constraint to the 
rest of the list. 
Example 10 
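$$\textit{hna} \equiv \textit{list}(\textit{segment}) \;\land\; \neg\begin{bmatrix} \text{FIRST}|\text{SL} : \begin{bmatrix} \text{MN}|\text{NASAL} : + \\ \text{PL} : \boxed{1} \end{bmatrix} \\ \text{REST}|\text{FIRST}|\text{SL} : \begin{bmatrix} \text{MN}|\text{CONT} : - \\ \text{PL} : \neg\boxed{1} \end{bmatrix} \end{bmatrix} \;\land\; \begin{bmatrix} \text{REST} : \textit{hna} \end{bmatrix}$$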
Standard techniques can now be used to move the negation in (10) inward. 7 Since 
constraints on adjacent list elements generally seem to be more intelligible in the 
format exhibited by (9), we shall stick to that notation in the remainder of the paper. 
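The three-part structure of (10) can be paraphrased procedurally in Python. In this sketch segments are flattened to dictionaries with the hypothetical keys `nasal`, `cont`, and `place`, standing in for the paths SL|MN|NASAL, SL|MN|CONT, and SL|PL; place is reduced to a single label rather than a bundle of boolean features. These simplifications are ours.

```python
def seg(nasal, cont, place):
    """Build a simplified segment (hypothetical flat encoding)."""
    return {"nasal": nasal, "cont": cont, "place": place}


def violates(first, second):
    """The configuration barred by the filter in (9):
    a nasal followed by a stop with a different place of articulation."""
    return (first["nasal"] and not second["cont"]
            and first["place"] != second["place"])


def hna(segments):
    """Check the first two positions, then propagate to the rest of the list."""
    if len(segments) < 2:
        return True
    return not violates(segments[0], segments[1]) and hna(segments[1:])


# A tiny illustrative inventory.
N_COR = seg(True, False, "coronal")
M_LAB = seg(True, False, "labial")
P_LAB = seg(False, False, "labial")
A_VOW = seg(False, True, None)

ok = hna([A_VOW, M_LAB, P_LAB, A_VOW])    # homorganic cluster
bad = hna([A_VOW, N_COR, P_LAB, A_VOW])   # coronal nasal before labial stop
```

The recursion on `segments[1:]` plays the role of the conjunct [REST : hna], with short lists treated as trivially well formed.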
2.3 Prosodic Constituency 
One standard phonological approach assumes that prosodic constituency is like phrase 
structure (Selkirk 1984). For example, one might use a rewrite rule to define a (phono- 
logical) phrase as a sequence of feet, and a foot as a sequence of syllables:
Example 11 
a. $\textit{phrase} \rightarrow \textit{foot}^{+}$
b. $\textit{foot} \rightarrow \textit{syl}^{+}$
Within the framework of HPSG, it would be simple to mimic such constituency by 
admitting a feature structure of type phrase whose DTRS (i.e. daughters) are a list of 
feature structures of type foot, and so on down the hierarchy. However, there appears 
to be no linguistic motivation for building such structure. Rather, we would like to 
say that a phrase is just a nonempty list of feet. But a foot is just a list of syllables, and 
if we abandon hierarchical structure (e.g. by viewing lists as strings), we seem to be 
stuck with the conclusion that phrases are also just lists of syllables. In a sense this is 
indeed the conclusion that we want. However, not any list of syllables will constitute a 
phrase, and not every phrase will be a foot. That is, although the data structure may be 
the same in each case, there will be additional constraints that have to be satisfied. For 
example, we might insist that elements at the periphery of phrases are exempt from 
certain sandhi phenomena; and similarly, that feet have no more than three syllables, 
and only certain combinations of heavy and light syllables are permissible. Thus, we 
shall arrive at a scheme like the following, where the Ci indicate the extra constraints: 8 
Example 12 
a. $\textit{phrase} \equiv \textit{foot}^{+} \land C_1 \land \ldots \land C_k$
b. $\textit{foot} \equiv \textit{syl}^{+} \land C_1 \land \ldots \land C_n$
This concludes our discussion of string-based phonology. We have tried to show how 
a phonological model based on FSAs is compatible with the list notation and type 
regime of HPSG. Next we move on to a consideration of morphology and the lexicon.
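The scheme in (12) treats phrases and feet as the same flat data structure, distinguished only by the constraints they satisfy. The following Python sketch illustrates this under loudly hypothetical assumptions: syllables are reduced to the weight labels "H" and "L", the size limit of three syllables is taken from the discussion above, and the requirement that a foot begin with a heavy syllable is invented here solely to give the foot type some content.

```python
def is_foot(syls):
    """foot = syl+ together with extra constraints.

    The three-syllable limit comes from the text; the heavy-initial
    requirement is a made-up placeholder for language-particular constraints.
    """
    return 1 <= len(syls) <= 3 and syls[0] == "H"


def is_foot_plus(syls):
    """True if the flat list can be cut into one or more consecutive feet."""
    return any(
        is_foot(syls[:k]) and (k == len(syls) or is_foot_plus(syls[k:]))
        for k in range(1, min(3, len(syls)) + 1)
    )


def is_phrase(syls, constraints=()):
    """phrase = foot+ conjoined with further constraints on the same list."""
    return is_foot_plus(syls) and all(c(syls) for c in constraints)
```

No tree of daughters is built at any point: `is_phrase(["H", "L", "L", "H"])` succeeds because the list can be read as two feet, and additional phrase-level constraints are simply further predicates over the same list.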
7 These techniques employ the following equivalences:
$$\neg\begin{bmatrix} \text{A} : \phi \\ \text{B} : \psi \end{bmatrix} \equiv \neg[\text{A} : \phi] \lor \neg[\text{B} : \psi]$$
$$\neg[\text{A} : \phi] \equiv [\neg(\text{A} : \top)] \lor [\text{A} : \neg\phi]$$
Here $\neg(\text{A} : \top)$ indicates that the attribute A is not appropriate for this feature structure.
8 Sproat and Brunson (1987) have also proposed a model in which prosodic constituents are defined as
conjunctions of constraints.
3. Morphology and the Lexicon 
3.1 Linguistic Hierarchy 
The subsumption ordering over types can be used to induce a hierarchy of grammat- 
ically well-formed feature structures. This possibility has been exploited in the HPSG 
analysis of the lexicon: lexical entries consist of the idiosyncratic information particu- 
lar to the entry, together with an indication of the minimal lexical types from which it 
inherits. To take an example from Pollard and Sag (1987), the base form of the English 
verb like is given in Example 13. 
Example 13 
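$${}_{\textit{main} \,\land\, \textit{base} \,\land\, \textit{strict-trans}}\begin{bmatrix}
\text{PHON} : \langle \text{l a ɪ k} \rangle \\
\text{SYN}|\text{LOC}|\text{SUBCAT} : \langle \text{NP}\boxed{1}, \text{NP}\boxed{2} \rangle \\
\text{SEM}|\text{CONT} : \begin{bmatrix} \text{RELN} : \textit{like} \\ \text{LIKER} : \boxed{2} \\ \text{LIKEE} : \boxed{1} \end{bmatrix}
\end{bmatrix}$$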
Since main is a subtype of verb, the entry for like will inherit the constraint that its 
major class feature is V; by virtue of the type strict-trans, it will inherit the constraint 
that the first element in the SUBCAT list is an accusative NP, while the second ele- 
ment is a nominative NP; and so on for various other constraints. Figure 1 shows a 
small and simplified portion of the lexical hierarchy in which the verb like is a leaf 
node. 
Along the phonological dimension of signs, lexical entries will have to observe 
any morpheme or word level constraints that apply to the language in question. When 
words combine as syntactic phrases, they will also have to satisfy all constraints on 
well-formed phonological phrases (which is not to say that phonological phrases are 
isomorphic with syntactic ones). In the general case, we may well want to treat words 
in the lexicon as unsyllabified sequences of segments. It would then follow that, for example,
the requirement that syllable-initial voiceless obstruents be aspirated in English
would have to be observed by each syllable in a phrase (which in the limiting case,
might be a single word), rather than lexical entries per se.
Figure 1
A portion of the lexical hierarchy (nodes: lexical-sign, verb, main, base, unsaturated, trans, strict-trans, like).
In some languages we may require there to be a special kind of interaction between 
the lexical and the prosodic hierarchy. For example, Archangeli and Pulleyblank (1989) 
discuss the tongue root harmony of Yoruba, which is restricted to nouns. If atr (i.e. 
advanced tongue root) was the type of harmonic utterances, then we could express 
the necessary constraint thus: 
Example 14 
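$${}_{\textit{noun}}\begin{bmatrix}
\text{PHON} : \textit{phon} \land \textit{atr} \\
\text{SYN}|\text{LOC}|\text{HEAD} : \begin{bmatrix} \text{MAJ} : \textit{noun} \\ \text{LEX} : + \end{bmatrix}
\end{bmatrix}$$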
This kind of constraint is known as a morpheme structure constraint, and phonol- 
ogists have frequently needed to have recourse to these (Kenstowicz and Kisseberth 
1979). Another interaction between prosody and morphology is the phenomenon of 
prosodic morphology, an example of which can be found in Section 4. 
3.2 Morphological Complexity 
Given the syntactic framework of HPSG, it seems tempting to handle morphological 
complexity in an analogous manner to syntactic complexity. That is, morphological 
heads would be analyzed as functors that subcategorize for arguments of the appro- 
priate type, and morphemes would combine in a Word-Grammar scheme. Simplifying 
drastically, such an approach would analyze the English third person singular present 
suffix -s in the manner shown in (15), assuming that affixes are taken to be heads. 
Example 15 
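$${}_{\textit{affix}}\begin{bmatrix} \text{PHON} : \langle \text{s} \rangle \\ \text{SYNSEM}|\text{SUBCAT} : \{ \textit{verb-stem} \} \end{bmatrix}$$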
By adding appropriately modified versions of the Head Feature Principle, Subcategor- 
ization Principle, and linear order statements, such a functor would combine with a 
verb stem to yield a tree-structured sign for walks. 
Example 16 \[P ON: EP ON  
{ver _s e  \[PHON 
verb 
While one may wish to treat derivational morphology in this way (cf. Krieger 
and Nerbonne [in press]), a more economical treatment of inflectional morphology
is obtained if we analyze affixes as partially instantiated word forms. 9 Example (17) 
illustrates this for the suffix -s, where 3ps is a subtype of sign. 
9 See Riehemann (1992) for a detailed working out of this idea for German derivational morphology. 
Example 17 
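$${}_{\textit{3ps}}\begin{bmatrix}
\text{PHON} : \boxed{1} \frown \boxed{2} \\
\text{MORPH} : {}_{\textit{affix-morph}}\begin{bmatrix}
\text{STEM} : {}_{\textit{verb-stem}}\begin{bmatrix} \text{PHON} : \boxed{1} \end{bmatrix} \\
\text{AFFIX} : {}_{\textit{suffix}}\begin{bmatrix} \text{PHON} : \boxed{2}\langle \text{s} \rangle \end{bmatrix}
\end{bmatrix}
\end{bmatrix}$$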
Note that we have added to sign a new attribute MORPH, with a value morph. The 
latter has two subtypes, affix-morph and basic-morph, depending on whether the value 
contains a stem and affix or just a stem. 
Example 18
$$\textit{morph} \Rightarrow \textit{affix-morph} \lor \textit{basic-morph}$$
While both of these types will inherit the attribute STEM, affix-morph must also be 
defined for the attribute AFFIX: 
Example 19 
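a. ${}_{\textit{morph}}\begin{bmatrix} \text{STEM} : \textit{stem} \end{bmatrix}$
b. ${}_{\textit{affix-morph}}\begin{bmatrix} \text{AFFIX} : \textit{affix} \end{bmatrix}$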
Moreover, affix has two subtypes: 
Example 20
affix ⇒ prefix ∨ suffix
Thus, (17) is a third person singular verb form whose stem is unspecified. 
As indicated in Section 1.3, we can take the interpretation of a complex type to 
be equivalent to the disjunction of all of its subtypes. Now, suppose that our lexicon 
contained only two instances of verb-stems, namely walk and meet. Then (17) would 
evaluate to exactly two fully specified word forms, where verb-stem was expanded 
to the signs for walk and meet respectively. Example 21 illustrates the first of these 
options. 
Example 21
3ps [ PHON  : [1] ⌢ [2]
      MORPH : affix-morph [ STEM  : verb-stem [ PHON   : [1] ⟨w ɔ k⟩
                                                SYNSEM : [ CAT : verb ] ]
                            AFFIX : suffix [ PHON : [2] ⟨s⟩ ] ] ]
Of course, this statement of suffixation would have to be slightly enriched to 
allow for the familiar allomorphic alternation -s ~ -z ~ -əz. The first pair of allomorphs can
be handled by treating the suffix as unspecified for voicing, together with a voicing 
assimilation rule similar to the homorganic nasal rule in (9). The third allomorph 
would admit an analysis similar to the one we propose for French schwa in Section 5. 
A second comment on (21) is that the information about ordering of affixes relative 
to the stem should be abstracted into a more general pair of statements (one for prefixes 
and one for suffixes) that would apply to all morphologically complex lexical signs 
(e.g. of type affixed); this is straightforward to implement: 
Example 22
a. affixed [ PHON  : [1] ⌢ [2]
             MORPH : [ STEM|PHON : [2]
                       AFFIX : prefix [ PHON : [1] ] ] ]
b. affixed [ PHON  : [1] ⌢ [2]
             MORPH : [ STEM|PHON : [1]
                       AFFIX : suffix [ PHON : [2] ] ] ]
Given this constraint, it is now unnecessary to specify the phonology attribute for 
feature terms like (21). 
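The expansion of (17) over a two-stem lexicon, together with the ordering statements in (22), can be sketched in Python. This is only an illustration under stated assumptions: the names LEXICON, affixed_phon, and third_singular are hypothetical, segments are plain strings, and the transcriptions of walk and meet are simplified.

```python
# Hypothetical toy lexicon: each verb stem maps to its PHON value,
# a tuple of segments (the transcriptions are simplified assumptions).
LEXICON = {
    "walk": ("w", "ɔ", "k"),
    "meet": ("m", "i", "t"),
}

SUFFIX_S = ("s",)


def affixed_phon(stem, affix, kind):
    """PHON of an affixed sign, following the two statements in (22):
    a prefix precedes the stem and a suffix follows it."""
    if kind == "prefix":
        return affix + stem
    if kind == "suffix":
        return stem + affix
    raise ValueError(f"unknown affix type: {kind}")


def third_singular(lexicon):
    """Expand the underspecified 3ps form into one word per verb stem,
    mirroring the reading of a type as the disjunction of its subtypes."""
    return {
        name: {
            "PHON": affixed_phon(phon, SUFFIX_S, "suffix"),
            "MORPH": {"STEM": name, "AFFIX": "suffix"},
        }
        for name, phon in lexicon.items()
    }


forms = third_singular(LEXICON)
print(forms["walk"]["PHON"])
```

Because the top-level PHON is computed from the stem and affix alone, an entry needs to state only its MORPH value, which is the point made about (21) above.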
Additionally, it is straightforward to prevent multiple copies of the -s suffix
from being attached to a word by ensuring that 3ps and verb-stem are disjoint.
3.3 Morphophonological Operations 
In and of itself, HPSG imposes no restrictions on the kind of operations that can be 
performed in the course of composing morphemes into words, or words into phrases. 
As an illustration, consider the data from German verb inflections analyzed by Krieger, 
Pirker, and Nerbonne (1993). As they point out, the second person singular present 
inflection -st has three different allomorphs, phonologically conditioned by the stem: 
Example 23 
sag+st    arbaɪt+əst    mɪks+t
'say'     'work'        'mix'
Although the main thrust of their paper is to show how an FST treatment of this 
allomorphy can be incorporated into an HPSG-style morphological analysis, from a 
purely formal point of view, the FST is redundant. Since the lexical sign incorporates 
the phonologies of both stem and affix, segments can be freely inserted or deleted in 
constructing the output phonology. This is exemplified in (24) for arbeitest and mixt 
respectively. 
Example 24
a. 2ps [ PHON  : [1] ⌢ ⟨ə⟩ ⌢ [2]
         MORPH : [ STEM   : [1] ⟨... {t, d}⟩
                   SUFFIX : [2] ⟨s t⟩ ]
         SYN|LOC|HEAD|AGR : [ NUM : sg
                              PER : 2nd ] ]
b. 2ps [ PHON  : [1] ⌢ [2]
         MORPH : [ STEM   : [1] ⟨... {s, z, x}⟩
                   SUFFIX : ⟨s⟩ ⌢ [2] ⟨t⟩ ]
         SYN|LOC|HEAD|AGR : [ NUM : sg
                              PER : 2nd ] ]
That is, we can easily stipulate that ə is intercalated in the concatenation of stem and
suffix if the stem ends with a dental stop (i.e., either t or d), and that the s of the suffix is
omitted if the stem ends with an alveolar or velar fricative. Although an actual analysis
along these lines would presumably be stated as a conditional, depending on the
form of the stem, the point remains that all the information needed for manipulating
the realization of the suffix (including the fact that there is a morpheme boundary)
is already available without resorting to two-level rules.¹⁰ Of course, the question
this raises is whether such operations should be permitted, given that they appear to 
violate the spirit of a constraint-based approach. The position we shall adopt in this 
paper is that derivations like (24) should in fact be eschewed. That is, we shall adopt 
the following restriction: 
Phonological Compositionality: 
The phonology of a complex form can only be produced by either unifying or con- 
catenating the phonologies of its parts. 
We believe that some general notion of phonological compositionality is method- 
ologically desirable, and we assume that Krieger, Pirker, and Nerbonne would adopt 
a similar position to ours. The specific formulation of the principle given above is 
intended to ensure that information-combining operations at the phonological level 
are monotonic, in the sense that all the information in the operands is preserved in 
the result. As we have just seen, the constraint-based approach does not guarantee 
this without such an additional restriction. 
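As a rough illustration of what the principle excludes, the following Python sketch tests whether an output phonology is a plain concatenation of its parts. The function name is_compositional is hypothetical, strings stand in for segment sequences, and unification (the other licensed operation) is not modeled.

```python
def is_compositional(stem, affix, whole):
    """True if `whole` is a concatenation of the two parts, in either order.
    Unification of the parts is deliberately left out of this sketch."""
    return whole in (stem + affix, affix + stem)


# (stem, affix, attested output) for the three German forms in (23).
examples = {
    "sagst": ("sag", "st", "sagst"),
    "arbeitest": ("arbaɪt", "st", "arbaɪtəst"),  # ə has been inserted
    "mixt": ("mɪks", "st", "mɪkst"),             # s of the suffix is missing
}

for word, (stem, affix, whole) in examples.items():
    print(word, is_compositional(stem, affix, whole))
```

Only sagst passes the test. The other two outputs contain more or less material than their parts supply, which is the kind of non-monotonic combination seen in (24).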
4. Sierra Miwok Templatic Morphology 
Nonconcatenative morphology has featured centrally in the empirical motivations for
autosegmental phonology, since McCarthy's demonstration that the intercalation of 
vowels in Arabic consonantal verb roots could be elegantly handled within this frame- 
work (McCarthy 1981). This section presents an approach to intercalation that uses key 
10 This approach of using restructuring devices in the process of a derivation has been explored in the 
context of extended Montague frameworks by Wheeler (1981) and Hoeksema and Janda (1988). 
insights from autosegmental phonology. However, these insights are captured within a
constraint-based grammar in which the inflectional paradigm is realized as an inheritance
hierarchy of partially instantiated stem forms (cf. Reinhard and Gibbon [1991]). We also show
that autosegmental association of consonants and vowels to a skeleton can be mod- 
eled by reentrancy. Rather than classical Arabic, we use the simpler data from Sierra 
Miwok that Goldsmith (1990) chose to illustrate the phenomenon of intercalation in 
his textbook. 
This section is divided into four subsections. In Section 4.1 we present an overview 
of the data, and in Section 4.2 we briefly show what a traditional generative analysis 
might look like. Our encoding of association by reentrancy is given in Section 4.3, 
while Section 4.4 contains our constraint-based analysis of Sierra Miwok stem forms. 
4.1 Descriptive Overview 
As mentioned above, Goldsmith (1990) takes data from Sierra Miwok verb stems to 
illustrate morphologically determined alternations in skeletal structure. He discusses 
three of the four types of stem, where the division into types depends primarily 
on the syllable structure of the basic form, which is the form used for the present 
tense. The three types are given the following autosegmental representations by Gold- 
smith: 
Example 25
a. Type I      k  i  c  a     w
               |  |  |  | \   |
               C  V  C  V  V  C
b. Type II     c  e  l  k  u
               |  |  |  |  |
               C  V  C  C  V
c. Type III    h  a  m     e
               |  |  | \   |
               C  V  C  C  V
As shown in (26), each type has forms other than the basic one, depending on the 
morphological or grammatical context; these additional forms are called second, third, 
and fourth stems. 
Although the associations of vowels and consonants exhibited above are taken as 
definitional for the three stem Types, from the data in (26) it appears that the distinction 
is only relevant to so-called Basic stem forms. 
Example 26
Gloss            Basic stem   Second stem   Third stem   Fourth stem
Type I
bleed            kicaaw       kicaww        kiccaw       kicwa
jump             tuyaaŋ       tuyaŋŋ        tuyyaŋ       tuyŋa
take             patiit       patitt        pattit       patti
roll             huteel       hutell        huttel       hutle
Type II
quit             celku        celukk        celluk       celku
go home          woʔlu        woʔull        woʔʔul       woʔlu
catch up with    nakpa        nakapp        nakkap       nakpa
spear            wimki        wimikk        wimmik       wimki
Type III
bury             hamme        hameʔʔ        hammeʔ       hamʔe
dive             ʔuppi        ʔupiʔʔ        ʔuppiʔ       ʔupʔi
speak            liwwa        liwaʔʔ        liwwaʔ       liwʔa
sing             milli        miliʔʔ        milliʔ       milʔi
4.2 Segmental Analysis 
Goldsmith (1990) has shown just how complex a traditional segmental account of 
Sierra Miwok would have to be, given the assumption that all of the stem forms are 
derived by rule from a single underlying string of segments (e.g. that kicaww, kiccaw 
and kicwa are all derived from kicaaw). Here, we simplify Goldsmith's analysis so that 
it just works for Type I stems. The left-hand column of (27) contains four rules, and 
these are restricted to the different forms according to the second column. 
Example 27
Rules                      Form   Second   Third    Fourth
                                  kicaaw   kicaaw   kicaaw
Vi → ∅ / C __ Vi C]        all    kicaw    kicaw    kicaw
Ci → CiCi / __ ]           2      kicaww   --       --
Ci → CiCi / [CV __ V       3      --       kiccaw   --
VC] → CV                   4      --       --       kicwa
                                  kicaww   kiccaw   kicwa
Thus, the first rule requires that a vowel Vi is deleted if it occurs after a consonant 
and immediately before an identical vowel Vi that in turn is followed by a stem-final 
consonant. Goldsmith soundly rejects this style of analysis in favor of an autosegmental 
one: 
This analysis, with all its morphologically governed phonological rules, 
arbitrary rule ordering, and, frankly, its mind-boggling inelegance, 
ironically misses the most basic point of the formation of the past 
tense in Sierra Miwok. As we have informally noted, all the second 
stem forms are of the shape CVCVCC, with the last consonant a geminate,
and the rules that we have hypothetically posited so far all
endeavor to achieve that end without ever directly acknowledging it.
(Goldsmith 1990:87) 
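To make the rule-based derivation in (27) concrete, here is a minimal Python sketch of the four Type I rules as regular-expression rewrites. The function names are hypothetical, and the sketch assumes that every symbol other than a, e, i, o, u is a consonant and that the end of the string is the stem boundary.

```python
import re

V = "aeiou"
C = f"[^{V}]"  # assumption: any non-vowel symbol counts as a consonant


def shorten(stem):
    """Vi -> 0 / C __ Vi C]  (applies to all derived forms)."""
    return re.sub(rf"(?<={C})([{V}])\1(?={C}$)", r"\1", stem)


def geminate_final(stem):
    """Ci -> CiCi / __ ]  (second stem)."""
    return re.sub(rf"({C})$", r"\1\1", stem)


def geminate_medial(stem):
    """Ci -> CiCi / [CV __ V  (third stem)."""
    return re.sub(rf"^({C}[{V}])({C})(?=[{V}])", r"\1\2\2", stem)


def metathesize(stem):
    """VC] -> CV  (fourth stem)."""
    return re.sub(rf"([{V}])({C})$", r"\2\1", stem)


def derive(basic):
    """Derive the second, third, and fourth stems from a Type I basic stem."""
    short = shorten(basic)
    return {
        "second": geminate_final(short),
        "third": geminate_medial(short),
        "fourth": metathesize(short),
    }


print(derive("kicaaw"))
```

The sketch reproduces the Type I rows of (26), but each target shape is reached only as a side effect of the ordered rewrites, which is the weakness Goldsmith identifies.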
4.3 Association 
We shall not attempt here to give a general encoding of association, although the 
technique used in Section 5.4 could be applied to achieve this end. Moreover, like
Goldsmith we shall ignore the role of syllable structure in the data, though it clearly 
does play a role. Instead, we shall confine our attention to the manner in which skeletal 
slots are linked to the consonant and vowel melodies. Consider again the skeletal 
structure of Type I verb stems shown in (25a). As Goldsmith (1990) points out, there 
is a closely related representation that differs only in that the CV information is split 
across two tiers (and which allows a much more elegant account of metathesis and 
gemination): 
Example 28
consonantal melody   k     c        w
                     |     |        |
skeleton             X  X  X  X  X  X
                        |     | /
vowel melody            i     a
The diagram in Example 28 can be translated into the following feature term: 
Example 29
phon [ CON  : ⟨[1]k [2]c [3]w⟩
       VOW  : ⟨[4]i [5]a⟩
       SKEL : ⟨[1] [4] [2] [5] [5] [3]⟩ ]
That is, since association in (28) consists of slot-filling (rather than the more general 
temporal interpretation), it can be adequately encoded by coindexing. 
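Coindexing of this kind corresponds closely to object sharing in a programming language. The following Python sketch is one possible rendering, not the authors' implementation: Node and spell are hypothetical names, and a dictionary stands in for the feature term.

```python
class Node:
    """A melody element; one object may fill several skeletal slots."""

    def __init__(self, segment):
        self.segment = segment


# Melody elements for kicaaw.
k, c, w = Node("k"), Node("c"), Node("w")
i, a = Node("i"), Node("a")

# The skeleton reuses the melody objects themselves (reentrancy),
# so the long vowel is one `a` occupying two adjacent slots.
phon = {
    "CON": [k, c, w],
    "VOW": [i, a],
    "SKEL": [k, i, c, a, a, w],
}


def spell(phon):
    """Read the segments off the skeleton, slot by slot."""
    return "".join(node.segment for node in phon["SKEL"])


print(spell(phon))
print(phon["SKEL"][3] is phon["SKEL"][4])
```

The identity test on the fourth and fifth slots distinguishes one doubly linked vowel from two separate but equal vowels, which is the distinction the tags in the feature term are meant to record.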
4.4 Basic Stem Forms 
The analysis starts from the assumption that the Sierra Miwok lexicon will contain 
minimally redundant entries for the three types of verb root. Let us consider the root 
corresponding to the basic stem form kicaaw. We take the unpredictable information 
to be the consonantal and vowel melodies, the valency, the semantics, and the fact it 
is a Type I verb stem. This is stated as (30), together with the declaration that lex-bleed 
is a subtype of v-root-I. 
Example 30
lex-bleed [ PHON   : phon [ CON : ⟨k c w⟩
                            VOW : ⟨i a⟩ ]
            SYNSEM : synsem [ SUBCAT : ⟨NP⟩
                              SEM : bleed ] ]
Notice that we have said nothing about how the melodies are anchored to a 
skeleton--this will be a task for the morphology. Additionally, this entry will inherit 
various properties by virtue of its type v-root-I. The three types of verb root share at 
least one important property, namely that they are all verbs. This is expressed in the 
next two declarations: 
Example 31
a. v-root ⇒ v-root-I ∨ v-root-II ∨ v-root-III
b. v-root [ SYNSEM|CAT : verb ]
We shall also assume, for generality, that every v-root is a root, and that every root is 
a morph. Anticipating the rest of this section, we show how all the postulated types 
are related in Figure 2. The next step is to show how a v-root-I like (30) undergoes 
morphological modification to become a basic verb stem; that is, a form with skeletal 
structure. Our encoding of the morphology will follow the lines briefly sketched in 
Section 3.2. 
We begin by stating types that encode the patterns of skeletal anchoring associated 
with the three types of basic stem. 
Example 32
phon ⇒ template-I ∨ template-II ∨ template-III
[Figure: type hierarchy diagram with nodes sign, phon, morph, morph-dtrs, word, phrase, stem, affixed, basic, root, affix, v-root, v-root-I, v-root-II, v-root-III, basic-I, basic-II, basic-III, bas-morph-dtrs, aff-morph-dtrs]
Figure 2
Sierra Miwok type hierarchy.
The appropriateness constraints on these types are given in (33). As an aid to 
readability, the numerical tags are supplemented with a C or a V to indicate the type 
of value involved. 
Example 33
a. template-I [ CON  : ⟨[1]c [2]c [3]c⟩
                VOW  : ⟨[4]v [5]v⟩
                SKEL : ⟨[1]c [4]v [2]c [5]v [5]v [3]c⟩ ]
b. template-II [ CON  : ⟨[1]c [2]c [3]c⟩
                 VOW  : ⟨[4]v [5]v⟩
                 SKEL : ⟨[1]c [4]v [2]c [3]c [5]v⟩ ]
c. template-III [ CON  : ⟨[1]c [2]c⟩
                  VOW  : ⟨[3]v [4]v⟩
                  SKEL : ⟨[1]c [3]v [2]c [2]c [4]v⟩ ]
Each of these types specializes the constraints on the type phon, and each can be unified 
with the phon value earlier assigned to the root form of kicaaw in (30). In particular, 
the conjunction of constraints given in (34) evaluates to (29), repeated here: 
Example 34
phon [ CON : ⟨k c w⟩
       VOW : ⟨i a⟩ ] ∧ template-I
Example 29
phon [ CON  : ⟨[1]k [2]c [3]w⟩
       VOW  : ⟨[4]i [5]a⟩
       SKEL : ⟨[1] [4] [2] [5] [5] [3]⟩ ]
However, we also need to specify the dependency between the three types of verb 
root, and the corresponding phonological exponents that determine the appropriate 
basic stem forms (cf. Anderson [1992]). As a first attempt to express this, let us say
that stem can be either basic or affixed:
Example 35
stem ⇒ affixed ∨ basic
Type declaration (35) ensures that basic will inherit from stem the following constraint, 
namely that its SYNSEM value is to be unified with its MORPH'S ROOT'S SYNSEM value: 
Example 36
stem [ SYNSEM : [1]
       MORPH|ROOT|SYNSEM : [1] ]
We could now disjunctively specify the following three sets of constraints on basic: 
Example 37
a. basic [ PHON  : phon ∧ template-I
           MORPH : [ ROOT : v-root-I ] ]
b. basic [ PHON  : phon ∧ template-II
           MORPH : [ ROOT : v-root-II ] ]
c. basic [ PHON  : phon ∧ template-III
           MORPH : [ ROOT : v-root-III ] ]
Although the example in question does not dramatize the fact, this manner of en- 
coding morphological dependency is potentially very redundant, since all the common 
constraints on basic have to be repeated each time.¹¹ In this particular case, however,
it is easy to locate the dependency in the phon value of the three subtypes of v-root, as 
follows: 
Example 38
v-root-I [ PHON : template-I ]
v-root-II [ PHON : template-II ]
v-root-III [ PHON : template-III ]
We then impose the following constraint on basic: 
Example 39
basic [ PHON : [1]
        MORPH|ROOT : v-root [ PHON : [1] ] ]
By iterating through each of the subtypes of v-root, we can infer the appropriate 
value of PHON within MORPH'S ROOT, and hence infer the value of PHON at the top level 
of the feature term. Example 40 illustrates the result of specializing the type v-root to lex-bleed: 
11 In an attempt to find a general solution to this problem in the context of German verb morphology, 
Krieger and Nerbonne (in press) adopt the device of 'distributed disjunction' to iteratively associate 
morphosyntactic features in one list with their corresponding phonological exponents in another list. 
Example 40
basic [ PHON   : [1] phon [ CON  : ⟨[3]k [4]c [5]w⟩
                            VOW  : ⟨[6]i [7]a⟩
                            SKEL : ⟨[3] [6] [4] [7] [7] [5]⟩ ]
        SYNSEM : [2] synsem [ CAT : verb
                              SUBCAT : ⟨NP⟩
                              SEM : bleed ]
        MORPH  : template-morph [ ROOT : v-root-I [ PHON : [1]
                                                    SYNSEM : [2] ] ] ]
Exactly the same mechanisms will produce the basic stem for the other two types of 
verb root. For an account of the other alternations presented in Goldsmith's paradigm, 
and for some discussion of how lexical and surface forms determine each other, see 
Klein (1993). 
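The way a root's type determines its skeletal template can be approximated with ordinary class inheritance. The Python sketch below is an analogy only: the class names, the slot labels C1, V1, and so on, and the method basic_stem are hypothetical, and the three templates are read off the basic stem shapes in (26).

```python
class VRoot:
    """Hypothetical stand-in for the type v-root; subtypes fix the template."""

    template = None  # supplied by each subtype, as in (38)

    def __init__(self, con, vow):
        self.con, self.vow = tuple(con), tuple(vow)

    def basic_stem(self):
        """Anchor the melodies to the template inherited from the subtype."""
        slots = {f"C{n}": seg for n, seg in enumerate(self.con, start=1)}
        slots.update({f"V{n}": seg for n, seg in enumerate(self.vow, start=1)})
        # The melody must supply exactly the elements the template mentions;
        # a mismatch plays the role of unification failure.
        if set(self.template) != set(slots):
            raise ValueError("melody does not fit template")
        return "".join(slots[name] for name in self.template)


class VRootI(VRoot):
    template = ("C1", "V1", "C2", "V2", "V2", "C3")


class VRootII(VRoot):
    template = ("C1", "V1", "C2", "C3", "V2")


class VRootIII(VRoot):
    template = ("C1", "V1", "C2", "C2", "V2")


lexicon = {
    "bleed": VRootI("kcw", "ia"),
    "quit": VRootII("clk", "eu"),
    "bury": VRootIII("hm", "ae"),
}

for gloss, root in lexicon.items():
    print(gloss, root.basic_stem())
```

As in the analysis above, a lexical entry lists only its melodies and its root type. The skeleton comes from the type, so nothing about the template is repeated per entry.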
We have just seen an application of constraint-based phonology to Sierra Miwok. 
In order to illustrate some of the other expressive capabilities of the approach, we now 
turn to the phenomenon of French schwa. 
5. French Schwa 
Many phonological alternations can be shown to depend on properties of prosodic 
structure. In this section we show how the French phenomenon of schwa-zero al- 
ternation arises out of the interplay of various syllable structure requirements. This is 
done by introducing a system of prosodic types for syllables and a special type declar- 
ation showing how a string of segments can be 'parsed' into syllables. The standard 
(but nonmonotonic) ONSET MAXIMIZATION PRINCIPLE is reinterpreted in the system, as 
well as the exceptions to this principle due to a class of words known as h-aspiré
words. We also show how a certain kind of disjunction can be used to deal with free 
variation. As we shall see, some linguistic analyses are more amenable to a declarative 
encoding than others. In order to demonstrate this, it will be necessary to go into some 
detail concerning the linguistic data. 
This section is divided into four subsections. In Section 5.1 we present a descriptive 
overview of the data,¹² and in Section 5.2 we sketch a traditional generative analysis. A
more recent, nonlinear analysis appears in Section 5.3 while our own, constraint-based 
version is presented in Section 5.4. 
5.1 Descriptive Overview 
Unlike schwa in English, the French schwa (or mute e) is a full vowel, usually realized
as the low-mid front rounded vowel œ (and sometimes as the high-mid front rounded
vowel ø in certain predictable environments). Its distinctive characteristic is that under
12 The data is from standard French taken from (cited) literature, although in some instances we have 
found speakers with different acceptability judgments than reported here. See Morin (1987) for a 
discussion of some problems with the treatment of French data in the literature. 
certain conditions, it fails to be realized phonetically.¹³ From now on we shall use the
term 'schwa' to refer to the vowel with this characteristic, rather than to the segment ə.
Although schwa is associated with orthographic e, not all es will concern us here.
For example, the orthographic e of samedi [sam.di] 'Saturday' can be taken to indicate
that the previous vowel should not be nasalized, while the final e of petite [pœ.tit]
indicates that the final t should be pronounced. In morphology, orthographic e marks 
feminine gender, first-conjugation verbs, and subjunctive mood. 
Instead, we shall be concerned with the pattern of realization and non-realization 
exhibited by schwa--a pattern that we interpret as grounded in the alternation of two 
allophones of schwa: œ and ∅ (zero). This alternation is manifested in forms like (41),¹⁴
where the dots indicate syllable boundaries.
Example 41
a. six melons [si.mœ.lɔ̃] ~ [sim.lɔ̃]
b. sept melons [sɛt.mœ.lɔ̃], *[sɛtmlɔ̃]
Observe that while six melons can be pronounced with or without the schwa, 
sept melons requires the schwa in order to break up the tml cluster that would oth- 
erwise be formed. Unfortunately, the conditions on the distribution of schwa are not 
as simple (and purely phonological) as this example implies. As we shall see, schwa 
alternation in French is governed by an interesting mixture of lexical and prosodic 
constraints. 
In the remainder of this section, we dispel the initial hypothesis that arises from 
(41), namely that schwa alternation is to be treated as a general epenthesis process.¹⁵
Consider the following data (Morin 1978:111). 
Example 42
Cluster   Schwa Possible/Obligatory   Schwa Impossible
rdr       bordereau [bɔr.dœ.ro]       perdrix [pɛr.dri]
rʃ        derechef [dœ.rœ.ʃɛf]        torchon [tɔr.ʃɔ̃]
skl       squelette [skœ.lɛt]         sclérose [skle.roz]
ps        dépecer [de.pœ.se]          éclipser [ek.lip.se]
The table in (42) gives data for the clusters [rdr], [rʃ], [skl] and [ps]. In the first column
of data, the œ is possible or obligatory, while in the second column, it is absent. Thus,
we see that the appearance of œ cannot be predicted on phonotactic grounds alone.
Consequently, we shall assume that schwa must be encoded in lexical representations. 
Note that it is certainly not the case that a lexical schwa will be posited wherever 
there is an orthographic e. Consider the data in (43), where these orthographic es are 
underlined. 
13 The data used in this section is drawn primarily from the careful descriptive work of Morin (1978) and 
Tranel (1987b). The particular approach to French schwa described in the following paragraphs most 
closely resembles the analysis of Tranel (1987a). 
14 We shall not be concerned with another œ~∅ alternation known as elision. This is a phonologically
conditioned allomorphy involving alternations such as le ~ l', for example, le chat [lœ.ʃa], l'ami [la.mi].
15 This epenthesis hypothesis was advanced by Martinet (1972). 
Example 43
Orthography   With Schwa     Without Schwa
bordereau     [bɔr.dœ.ro]
fais-le       [fɛ.lø]
six melons    [si.mœ.lɔ̃]     [sim.lɔ̃]
pelleterie                   [pɛl.tri]
In a purely synchronic analysis there is no basis for discussing an alternating vowel 
for bordereau, fais-le and pelleterie. Many orthographic es that are not in the first syllable
of a word come into this category. 
Accordingly, we begin our analysis with three background assumptions: the alter- 
nating schwa is (i) prosodically conditioned, (ii) lexically conditioned, and (iii) not in 
direct correspondence with orthographic e. Next we present a generative analysis of 
schwa due to Dell, followed by an autosegmental analysis due to Tranel. We conclude 
with our own, syllable-based analysis. 
5.2 A Traditional Generative Analysis 
The traditional approach to vowel-zero alternations is to employ either a rule of 
epenthesis or a deletion rule. Dell discusses the case of the word secoue, whose
pronunciation is either [sku] or [sœku], in a way that parallels (41).
In order to account for alternations such as that between [sku] and
[sœku] there are two possibilities: the first consists of positing the
underlying representation /sku/ where no vowel appears between
/s/ and /k/, and to assume that there exists a phonological rule of
epenthesis that inserts a vowel œ between two consonants at the
beginning of a word when the preceding word ends in a consonant ....
The second possibility is preferable: the vowel [œ] that appears in
Jacques secoue is the realisation of an underlying vowel /ə/ which can
be deleted in certain cases. We shall posit the VCE1 rule, which deletes
any /ə/ preceded by a single word-initial consonant when the
preceding word ends in a vowel.
VCE1: ə → ∅ / V #1 C __
(Dell 1980:161f)
Suppose we were to begin our analysis by asking the question: how are we to 
express the generalization about schwa expressed in the above rule? Since our declar- 
ative, monostratal framework does not admit deletion rules, we would have to give 
up. As we shall see below, however, we begin with a different question: how are we to 
express the observation about the distribution of schwa that Dell encodes in the above 
rule? 
There is another good reason for taking this line. As it happens, there is an em- 
pirical problem with the above rule, which Dell addresses by admitting a potentially 
large number of lexical exceptions to the rule and by making ad hoc stipulations (Dell 
1980). Additionally, adding diacritics to lexical entries to indicate which rules they 
undergo and employing rules that count # boundaries would seem to complicate a 
grammar formalism unnecessarily. As we saw above for the discussion of the word 
bordereau, in the approach taken here we have the choice between positing a stable 
œ or one that alternates with zero (i.e. a schwa) in the lexicon, whereas Dell must
mark lexical items to indicate which rules they must not undergo. There is also some
evidence for a distinction between the phonetic identity of the œ allophone of schwa
and the phonetic identity of a nonalternating lexical œ in some varieties of French,
requiring that the two be distinguished phonologically (Morin 1978).
Thus, the fact that Dell's analysis involves deletion does not provide a signifi- 
cant stumbling block to our approach. However, Dell employs another procedural 
device, namely rule ordering, in the application of the rule. In discussing the phrase 
vous me le dites [vu.m(œ).l(œ).dit], in which either schwa (but not both) may be omitted,
Dell writes: 
VCE1 begins on the left and first deletes the schwa of me, producing 
/vu#m#lə#dit/. But VCE1 cannot operate again and delete the schwa
of le, for, although this schwa was subject to the rule in the original 
representation, it no longer is once the schwa of me has been dropped. 
In other words, the first application of VCE1 creates new conditions 
that prevent it from operating again in the following syllable (Dell 
1980:228). 
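The dependence on the mode of rule application that Dell describes can be seen in a small Python sketch. It is an illustration under stated assumptions rather than Dell's own formalization: VCE1 is rendered as a regular expression over strings with # boundaries, the rule is treated as applying wherever it can, and the function names are hypothetical.

```python
import re

# VCE1 as a regular expression: ə preceded by V # C (fixed-width lookbehind).
VOWELS = "aeiouə"
VCE1 = re.compile(rf"(?<=[{VOWELS}]#[^{VOWELS}#])ə")


def apply_left_to_right(form):
    """Delete one schwa at a time, re-checking the context after each deletion."""
    while True:
        match = VCE1.search(form)
        if match is None:
            return form
        form = form[:match.start()] + form[match.end():]


def apply_simultaneously(form):
    """Delete every schwa whose context is met in the original form."""
    return VCE1.sub("", form)


underlying = "vu#mə#lə#dit"
print(apply_left_to_right(underlying))
print(apply_simultaneously(underlying))
```

Left-to-right application stops after the schwa of me, since the schwa of le is then no longer preceded by a vowel-final word. Simultaneous application would delete both, so the correct outcome depends on a procedural choice that a declarative statement cannot make.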
Again, we are not interested in encoding Dell's particular generalization, and in 
fact we are unable to. Rather, it is necessary to look at the underlying observation 
about the distribution of schwa. The observation is that schwa does not appear as its 
zero allophone in consecutive syllables. This observation is problematic for us, in that 
it refers to two levels of representation, an underlying (or lexical) level involving a 
schwa segment, and a surface level involving a zero allophone. We cannot formulate 
this observation monostratally. However, we can come up with a different observation, 
namely that the vowel is never omitted if the omission results in unacceptable syllable 
structure. In the case of Dell's example, vous me le dites, if both schwas are omitted the 
result is a [vml] cluster, which cannot be broken up into a valid coda-onset sequence.
This new observation makes a different empirical prediction, namely that schwa can 
be omitted in consecutive syllables just in case the result is syllabifiable. As we shall 
see below in (51), this prediction is actually borne out. 
Before proceeding with our own analysis, we present an overview of an autoseg- 
mental analysis of French schwa due to Tranel. This analysis is interesting because 
it demonstrates the oft-repeated phenomenon of enriched representations leading to 
dramatically simplified rule systems. Given the heavy restriction on rules in a mono- 
stratal framework, it will be more natural to take Tranel's (rather than Dell's) analysis 
as our starting point. 
5.3 Tranel's Analysis 
Tranel (1987a) provides an insightful analysis of French schwa cast in the framework of 
autosegmental phonology. In this section we give an overview of this analysis. In the 
following section we shall endeavour to provide an empirically equivalent analysis. 
Tranel adopts a CV skeleton tier and a segmental tier. Schwa is represented as an 
unlinked vowel, as shown in the following representation for melons. 
Example 44
C     C  V
|     |  |
m  œ  l  ɔ̃
On top of this two-tiered structure, Tranel proposes a level of hierarchical organ- 
ization for representing syllable structure. Tranel adopts the two syllable formation 
rules given in (45). A third (unstated) rule is assumed to incorporate unsyllabified 
consonants into coda position. 
Example 45
a. Basic syllable formation         b. Schwa-syllable formation
                 σ                                   σ
                / \                                 / \
               O   R                               O   R
               |   |                               |   |
   C   V   →   C   V                   C       →   C   V
       |           |                                   |
      [F]         [F]                      œ           œ
Note that (45a) does not apply to the mœ sequence in (44), as the schwa is not linked
to a V node as required on the left-hand side of rule (45a). (Tranel later adopts a
refinement to (45a), preventing it from applying if the V is the first vowel of an h-aspiré
morpheme.) For the phrases six melons and sept melons, the basic syllable formation rule 
builds the following structures. 
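Example 46
[Diagram: syllable structures built by basic syllable formation. For six melons (s i m œ l ɔ̃), si and lɔ̃ each form a syllable with onset O and rhyme R, and m is left unsyllabified. For sept melons (s ɛ t m œ l ɔ̃), sɛ and lɔ̃ form syllables, and t and m are left unsyllabified. The œ is unlinked in both.]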
The remaining consonants must either be syllabified leftward into an unsaturated 
coda or remain unsyllabified and rescued by the schwa syllable formation rule. For 
six melons, both options are possible, as illustrated below. Note that the unlinked œ is
assumed to be phonetically uninterpreted. 
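Example 47
[Diagram: the two syllabifications of six melons. In the first, m is attached to the rhyme of the first syllable and œ stays unlinked, giving sim.lɔ̃. In the second, schwa-syllable formation builds a syllable over m and œ, giving si.mœ.lɔ̃.]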
This gives us the two options, [sim.lɔ̃] and [si.mœ.lɔ̃], according with the observation
in (41). For sept melons, however, there is just the one option. The t must be syllabified 
into the preceding coda, and the m requires the presence of schwa, and so we have 
[sɛt.mœ.lɔ̃]. Further examples of this particular kind of schwa alternation are given
below (Tranel 1987b:91). 
Example 48
Schwa Required                           Schwa Optional
de qui parlez-vous? [dœkiparlevu]        vous parlez de qui? [vuparled(œ)ki]
te casse pas la tête [tœkaspɑlatɛt]      ne te casse pas la tête [nœt(œ)kɑspɑlatɛt]
debout [dœbu]                            il est debout [iled(œ)bu]
depuis quatre ans [dœpɥikatrɑ̃]           c'est depuis quatre ans [sɛd(œ)pɥikatrɑ̃]
dedans [dœdɑ̃]                            là-dedans [lad(œ)dɑ̃]
je joue [ʒœʒu]                           mais je joue [mɛʒ(œ)ʒu]
le lait [lœlɛ]                           dans le lait [dɑ̃l(œ)lɛ]
ce salon [sœsalɔ̃]                        dans ce salon [dɑ̃s(œ)salɔ̃]
So far, we have seen the case where the leftward syllabification of a consonant 
licenses the omission of schwa. Now we turn to a similar case, but where the consonant 
syllabifies rightward into a following onset provided that the resulting onset cluster 
is permitted. The data in (49) are from Tranel (1987b:92). 
Example 49
secoue pas la tête   [sku.pɑ.la.tɛt] ~ [sœ.ku.pɑ.la.tɛt]   'don't shake your head'
je pense pas         [ʃpɑ̃s.pɑ] ~ [ʒœ.pɑ̃s.pɑ]               'I don't think so'
ce bon à rien        [zbɔ̃.a.rjɛ̃] ~ [sœ.bɔ̃.a.rjɛ̃]           'this good-for-nothing'
Tranel gives two additional syllable formation rules, shown in (50). 
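Example 50
a. Onset accretion
   [Rule diagram: a consonant [F] immediately preceding an onset consonant [G] is adjoined to that onset.]
   Restriction: must create a valid onset
b. Onset accretion across schwa
   [Rule diagram: a consonant [F] separated from an onset consonant [G] by an unlinked œ is adjoined to that onset.]
   Restriction: [F] must be word-initial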
Rule (50a) incorporates as many consonants as possible into an onset so long as the 
onset conforms to the phonotactic constraints of the language. Rule (50b), of most 
interest here, allows for a consonant to be incorporated into a following onset even if 
there is an intervening schwa, provided that the consonant is word-initial (and that 
the resulting onset is allowable). The intervening schwa remains unpronounced. Rule 
(50b), which is optional, correctly captures the alternations displayed in (49). This rule 
is restricted to apply word-initially "so as to avoid the generation of word-internal
triliteral consonant clusters from underlying /CCəC/ sequences (compare marguerite
/margərit/ [margərit] *[margrit] and margrave /margrav/ [margrav] *[margərav])"
(Tranel 1987a:852). Thus, although many CCC sequences are acceptable phonologically, 
they are not permitted if a schwa is available to break up the cluster. 
We also note that Tranel's analysis (Tranel 1987a) gives the correct result for cases 
of deletion of schwa in consecutive syllables. Consider the following data. 
Example 51
a. on ne se moque pas [ɔ̃n.smɔk.pɑ] (Valdman 1976:120)
b. sur le chemin [syl.ʃmɛ̃] (Morin 1978:82)
For both of these cases we observe an "underlying" C1œC2œ pattern, but where both
œs are omitted and where C1 syllabifies into the preceding coda and C2 syllabifies
into the following onset. 
To conclude, we can summarize the empirical content of Tranel's analysis as 
follows: 
(a) Every consonant must be syllabified. 
(b) Schwa must be realized if it provides the syllable nucleus for an 
immediately preceding consonant that: 
(i) cannot be syllabified into a coda, and 
(ii) cannot form a permissible (word) onset with an immediately 
following consonant. 
Naturally, this statement is not the last word on French schwa and there may be 
ways in which it needs to be revised, such as for the treatment of word-final schwas 
and thematic schwas (Tranel 1987a:855ff). However, since our purpose is primarily to 
illustrate the workings of the theoretical model, we shall take the above statement as 
a well-defined starting point on which to base the following analysis. 
5.4 A Constraint-Based Analysis 
Given our formal semantics for the autosegmental notation, it would be a relatively 
straightforward matter to implement Tranel's analysis directly, especially since the 
rules only involve the building of structure, and there is no use of destructive processes. 
Tranel's analysis is fully declarative. 
However, as it happens, there is no need for us to adopt the rich representation 
Tranel employs. We can simulate his analysis using a single tier (rather than two) while 
retaining a representation of syllable structure. Observe that the use of the CV tier and 
the melody tier was motivated solely by the need to have a floating autosegment, the 
œ. It is equivalent to collapse these two tiers, using the alternation œ~∅ in place of the
floating ce. This style of approach to zero alternations, which dates back to Bloomfield 
(1926), will employ the parenthesis notation for optional items that was defined in 
Section 2.1. We follow Tranel in representing syllable structure and we shall do this 
using the notation shown in (52). 16 
16 Our analysis is not crucially tied to this particular version of syllable structure, which is most closely 
related to the proposals of Kahn (1976) and Clements and Keyser (1983). 
Example 52
[ ONS  : onset
  NUC  : nucleus
  CODA : coda    ]syl
An independent tier that represents syllable structure will be encoded as a se- 
quence of such syllables, where the segmental constituents of the syllable structure 
are coindexed with a separate segmental tier, as defined in (53). Note that the indices 
in (53) range over lists that may be empty in the case of onsets and codas, and that 
the type phrase denotes phonological phrases. 
Example 53
a. [ SYLS : ⟨ [ ONS : [1], NUC : [2], CODA : [3] ]syl ⟩ ∘ [4]
     SEGS : [1] ⌢ [2] ⌢ [3] ⌢ [5] ]phrase
   where [ SYLS : [4], SEGS : [5] ]phrase
b. [ SYLS : ⟨ ⟩
     SEGS : ⟨ ⟩ ]phrase
The notation of (53) states that in order for something to be a well-formed phrase, 
its sequence of segments must be parsed into a sequence of well-formed syllables. In 
more familiar terms, one could paraphrase (53) as stating that the domain of syllabi- 
fication in French is the phrase. 
As a simple illustration of the approach, consider again the word melons. The pro- 
posed lexical representation for the phonology attribute of this word is 
[SEGS : ⟨m (œ) l ɔ̃⟩]. When we insist that any phrase containing this word must
consist of a sequence of well-formed syllables, we can observe the following pattern of
behavior for six melons. 
Example 54
a. [ SYLS : ⟨ [ ONS : ⟨s⟩, NUC : ⟨i⟩, CODA : ⟨⟩ ]syl,
              [ ONS : ⟨m⟩, NUC : ⟨œ⟩, CODA : ⟨⟩ ]syl,
              [ ONS : ⟨l⟩, NUC : ⟨ɔ̃⟩, CODA : ⟨⟩ ]syl ⟩
     SEGS : ⟨s i m œ l ɔ̃⟩ ]phrase
b. [ SYLS : ⟨ [ ONS : ⟨s⟩, NUC : ⟨i⟩, CODA : ⟨m⟩ ]syl,
              [ ONS : ⟨l⟩, NUC : ⟨ɔ̃⟩, CODA : ⟨⟩ ]syl ⟩
     SEGS : ⟨s i m l ɔ̃⟩ ]phrase
Observe in the above example that the syllabic position of m is variable. In Example 54a
m is in an onset while in 54b it is in a coda. Therefore, it is inappropriate to
insist that the syllabic affiliation of segments is determined lexically. Rather, we have
opted for the prosodic type phrase, insisting that anything of this type consists of one
or more well-formed syllables (cf. Example 11).
[Figure 3: Parts of French type hierarchy. Two hierarchies, one rooted in onset (with subtype internal-onset) and one rooted in coda (with subtype internal-coda). Sequence patterns legible in the figure: ⟨obs, liq⟩, ⟨(cons), (glide)⟩, ⟨s, stop, liq⟩, ⟨cons, cons⟩, ⟨obs, son⟩, ⟨(cons)⟩.]
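The pattern in (54) can also be sketched procedurally. The following Python fragment is only an illustration under stated assumptions, not part of the formal proposal: it expands the optional œ of a SEGS list and then enumerates every division of the result into onset-nucleus-coda syllables, in the spirit of (53). The names `expand`, `parses`, `show`, and `VOWELS` are hypothetical, the vowel inventory is a small assumed sample, and none of the phonotactic restrictions of Figure 3 are applied, so the enumeration overgenerates.

```python
from itertools import product

O_NASAL = "ɔ\u0303"                      # nasalized open o, as in melons
VOWELS = {"i", "œ", O_NASAL}             # assumed sample inventory, not exhaustive

def expand(segs):
    """Return every fully specified segment list for SEGS, where a 1-tuple
    such as ("œ",) marks an optional segment (the parenthesis notation)."""
    choices = [[[s[0]], []] if isinstance(s, tuple) else [[s]] for s in segs]
    return [[x for part in combo for x in part] for combo in product(*choices)]

def parses(segs):
    """Enumerate all divisions of segs into (onset, nucleus, coda) triples."""
    if not segs:
        return [[]]                       # base case: no segments, no syllables
    first_vowel = next((i for i, s in enumerate(segs) if s in VOWELS), None)
    if first_vowel is None:
        return []                         # stranded consonants: no parse
    onset, nucleus = tuple(segs[:first_vowel]), segs[first_vowel]
    rest = segs[first_vowel + 1:]
    results = []
    for j in range(len(rest) + 1):        # the coda takes the first j consonants
        if any(s in VOWELS for s in rest[:j]):
            break                         # a coda may not contain a vowel
        for tail in parses(rest[j:]):
            results.append([(onset, nucleus, tuple(rest[:j]))] + tail)
    return results

def show(parse):
    """Render a parse in dotted transcription, with '.' between syllables."""
    return ".".join("".join(o) + n + "".join(c) for o, n, c in parse)

six_melons = ["s", "i", "m", ("œ",), "l", O_NASAL]
for segs in expand(six_melons):
    print(sorted(show(p) for p in parses(segs)))
```

Both syllabifications in (54) appear among the enumerated parses: m surfaces as an onset when œ is realized and as a coda when it is omitted. The additional parses are the ones that the constraints introduced below are intended to exclude.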
Now consider the case of the phrase sept melons. This is similar to the situation in 
(54), except that we must find a way of ruling out the tml cluster as a valid coda-onset 
sequence. We are not aware of any exhaustive study of possible French consonant clusters,
although one can find discussions of particular clusters (e.g., Tranel [1987b:95ff]
shows that CLj onset clusters are not tolerated). Consequently, the two hierarchies in
Figure 3 are necessarily preliminary, and are made more for the sake of being explicit 
than for their precise content. Note that parentheses indicate optionality, so, for ex- 
ample, both onsets and codas are allowed to be null. Additional stipulations will be 
necessary to ensure that an intervocalic consonant is syllabified with the material to 
its right. We can do this by preventing an onsetless syllable from following a closed 
syllable, with the type onset-max-1. 
Example 55
onset-max-1 ≡ ¬ ⟨ … [ CODA : nelist ]syl [ ONS : elist ]syl … ⟩phrase
Now consider again the phrase six melons. The syllabification *[si.mœl.ɔ̃] would
be represented as follows:
Example 56
⟨ [ ONS : ⟨s⟩, NUC : ⟨i⟩ ]syl,
  [ ONS : ⟨m⟩, NUC : ⟨œ⟩, CODA : nelist⟨l⟩ ]syl,
  [ ONS : elist, NUC : ⟨ɔ̃⟩ ]syl ⟩
Observe that this list of syllables contains a violation of (55), so [si.mœl.ɔ̃] is ruled
out. Now that we have considered vowel-consonant-vowel (VCV) sequences, we shall 
move on to more complex intervocalic consonant clusters. 
Although the constraints in Figure 3 produce the desired result for VLLV clusters
(L=liquid), by assigning each liquid to a separate syllable (Tranel 1987b), there
is still ambiguity with VOLV clusters (O=obstruent), which are syllabified as V.OLV
according to Tranel. We can deal with this and similar ambiguities by further refining
the classification of syllables and imposing suitable constraints on syllable sequences.
Here is one way of doing this, following the same pattern that we saw in (55).
Example 57
onset-max-2 ≡ ¬ ⟨ … [ CODA : ⟨… obs …⟩ ]syl [ ONS : ¬⟨… obs …⟩ ]syl … ⟩phrase
This constraint states that it is not permissible to have an obstruent in a syllable
coda if the following onset lacks an obstruent. Equivalently, we could say that if a
syllable coda contains an obstruent then the following onset must also contain an
obstruent. To see why these constraints are relevant to schwa, consider the case of
demanderions (also discussed by Tranel [1987b]). The constraints in Figure 3 rule out
*[dœ.mɑ̃.drjɔ̃], since the onset cluster drj is too complex. The constraint in (57)
rules out *[dœ.mɑ̃d.rjɔ̃], where the obstruent d is assigned to the preceding syllable to
leave an rj onset. The remaining two possible pronunciations are [dœ.mɑ̃.dœ.rjɔ̃] and
[dœ.mɑ̃.dri.jɔ̃], as required. (Note that the ions suffix has the two forms, [jɔ̃] and [ijɔ̃].)
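The two sequence constraints (55) and (57) can be read as filters on lists of syllables. The sketch below is a hedged illustration in Python, assuming a syllable is an (onset, nucleus, coda) triple; the helper names `syl`, `onset_max_1`, and `onset_max_2` and the obstruent set are hypothetical stand-ins. The cluster restrictions of Figure 3 are not modelled, so a form such as *[dœ.mɑ̃.drjɔ̃] is not excluded by this fragment.

```python
OBSTRUENTS = {"p", "t", "k", "b", "d", "g", "f", "s", "v", "z"}  # assumed sample
A_NASAL = "ɑ\u0303"   # nasalized a
O_NASAL = "ɔ\u0303"   # nasalized open o

def syl(onset, nucleus, coda=""):
    """Build an (onset, nucleus, coda) triple from strings of one-character segments."""
    return (tuple(onset), nucleus, tuple(coda))

def onset_max_1(syls):
    """(55): no closed syllable may be followed by an onsetless syllable."""
    return not any(prev[2] and not nxt[0] for prev, nxt in zip(syls, syls[1:]))

def onset_max_2(syls):
    """(57): a coda obstruent requires an obstruent in the following onset."""
    def has_obs(cluster):
        return any(seg in OBSTRUENTS for seg in cluster)
    return not any(has_obs(prev[2]) and not has_obs(nxt[0])
                   for prev, nxt in zip(syls, syls[1:]))

# *[si.mœl.ɔ̃], the representation in (56)
six_melons_bad = [syl("s", "i"), syl("m", "œ", "l"), syl("", O_NASAL)]

# Three candidate syllabifications of demanderions
d_coda = [syl("d", "œ"), syl("m", A_NASAL, "d"), syl("rj", O_NASAL)]
schwa_kept = [syl("d", "œ"), syl("m", A_NASAL), syl("d", "œ"), syl("rj", O_NASAL)]
ij_suffix = [syl("d", "œ"), syl("m", A_NASAL), syl("dr", "i"), syl("j", O_NASAL)]

for name, cand in [("six melons (56)", six_melons_bad), ("d in coda", d_coda),
                   ("schwa kept", schwa_kept), ("[ij] suffix", ij_suffix)]:
    print(name, onset_max_1(cand), onset_max_2(cand))
```

Treating each constraint as an independent predicate mirrors the declarative reading: a candidate is admitted only if it satisfies all of them, and no ordering among the checks is involved.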
Now let us consider the case of h-aspiré words. These vowel-initial words do
not tolerate a preceding consonant being syllabified into the word-initial onset. What
happens to the V.CV and V.OLV constraints when the second vowel is in the first
syllable of an h-aspiré word, as we find in sept haches [sɛt.aʃ], *[sɛ.taʃ] and quatre haches
[katr.aʃ], *[kat.raʃ], *[ka.traʃ]? Here, it would appear that Tranel's analysis breaks
down. Our conjecture is that the constraints in (55) and (57) should only apply when
the second syllable is not an h-aspiré syllable. So we need to introduce a further
distinction in syllable types, introducing ha-syl for h-aspiré syllables and nha-syl for
the rest.
Example 58
syl ⇒ ha-syl ∨ nha-syl
Now ha-syl is defined as follows: 
Example 59 
[ ONSET : elist ]ha-syl
Accordingly, the constraints (55) and (57) are refined, so that the second syllable 
is of the type nha-syl. The revised constraints are given in (60). 
Example 60
a. onset-max-1′ ≡ ¬ ⟨ … [ CODA : nelist ]syl [ ONS : elist ]nha-syl … ⟩phrase
b. onset-max-2′ ≡ ¬ ⟨ … [ CODA : ⟨… obs …⟩ ]syl [ ONS : ¬⟨… obs …⟩ ]nha-syl … ⟩phrase
Now, h-aspiré words will be lexically specified as having an initial ha-syl. However,
we must not specify any more syllable structure than is absolutely necessary.
Example 61 displays the required constraint for the word haut. 
Example 61
[ PHON : [ SYLS : ⟨ ha-syl … ⟩
           SEGS : ⟨o⟩ ]phon
  SYNSEM|CAT : noun ]lexical-sign
So although syllabification operates at the phrase level rather than the morpheme 
level (see Example 53), we are still able to impose lexically conditioned constraints on 
syllable structure directly. 
It remains to be shown how this treatment of h-aspiré bears on schwa. Fortunately,
Tranel (1987b:94) has provided the example we need. Consider the phrase dans le haut
[dɑ̃.lœ.o]. This contains the word le [l(œ)], which is lexically specified as having an
optional œ, indicated by parentheses. 17 There are three possible syllabifications, only
the last of which is well formed.
Example 62
a. * ⟨ [ ONS : ⟨d⟩, NUC : ⟨ɑ̃⟩, CODA : ⟨⟩ ]syl,
       [ ONS : ⟨l⟩, NUC : ⟨o⟩, CODA : ⟨⟩ ]ha-syl ⟩
b. * ⟨ [ ONS : ⟨d⟩, NUC : ⟨ɑ̃⟩, CODA : ⟨l⟩ ]syl,
       ( [ ONSET : ⟨⟩, NUC : ⟨œ⟩, CODA : ⟨⟩ ]syl ),
       [ ONSET : ⟨⟩, NUC : ⟨o⟩, CODA : ⟨⟩ ]ha-syl ⟩
c.   ⟨ [ ONS : ⟨d⟩, NUC : ⟨ɑ̃⟩, CODA : ⟨⟩ ]syl,
       [ ONSET : ⟨l⟩, NUC : ⟨œ⟩, CODA : ⟨⟩ ]syl,
       [ ONSET : ⟨⟩, NUC : ⟨o⟩, CODA : ⟨⟩ ]ha-syl ⟩
The syllabification in (62a) is unavailable, since the syllable corresponding to the word 
haut is lexically specified as ha-syl, which means that its onset must be an elist from 
(59). The syllabifications in (62b) are likewise unavailable since these both consist of a 
syllable with a coda followed by a syllable without an onset, in contravention of (60a). 
This only leaves (62c), which corresponds to the attested form [dɑ̃.lœ.o].
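The interaction of (59) and (60a) on the candidates in (62) can be checked mechanically. The following Python sketch is illustrative only: the class `Syl`, its `h_aspire` flag (standing in for the ha-syl/nha-syl distinction), and the function names are assumptions introduced here. For (62b) only the variant with a realized œ is modelled, and constraint (60b) and the restrictions of Figure 3 are left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Syl:
    onset: tuple
    nucleus: str
    coda: tuple = ()
    h_aspire: bool = False     # True for ha-syl, False for nha-syl

def ha_syl_ok(syls):
    """(59): an h-aspiré syllable must have an empty onset."""
    return all(not s.onset for s in syls if s.h_aspire)

def onset_max_1_prime(syls):
    """(60a): no closed syllable may precede an onsetless nha-syl."""
    return not any(prev.coda and not nxt.onset and not nxt.h_aspire
                   for prev, nxt in zip(syls, syls[1:]))

def well_formed(syls):
    """Conjunction of the two constraints modelled in this sketch."""
    return ha_syl_ok(syls) and onset_max_1_prime(syls)

A_NASAL = "ɑ\u0303"   # nasalized a, as in dans
candidates = {
    "62a": [Syl(("d",), A_NASAL), Syl(("l",), "o", h_aspire=True)],
    "62b": [Syl(("d",), A_NASAL, ("l",)), Syl((), "œ"), Syl((), "o", h_aspire=True)],
    "62c": [Syl(("d",), A_NASAL), Syl(("l",), "œ"), Syl((), "o", h_aspire=True)],
}
for label, syls in candidates.items():
    print(label, well_formed(syls))

# sept haches: the closed syllable may precede the onsetless ha-syl
sept_haches = [Syl(("s",), "ɛ", ("t",)), Syl((), "a", ("ʃ",), h_aspire=True)]
sept_haches_bad = [Syl(("s",), "ɛ"), Syl(("t",), "a", ("ʃ",), h_aspire=True)]
```

Because the h-aspiré property is carried by the syllable itself, the lexical entry only needs to mark its first syllable; the phrase-level constraints then apply without any word-specific stipulation.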
We conclude this section with an example derivation for the phrase on ne se moque 
pas [ɔ̃n.smɔk.pɑ], which was presented in (51). We assume that at some stage of a
derivation, the PHON attribute of a sign is as follows: 
Example 63
[ SEGS : ⟨ɔ̃⟩ ⌢ ⟨n (œ)⟩ ⌢ ⟨s (œ)⟩ ⌢ ⟨m ɔ k⟩ ⌢ ⟨p ɑ⟩ ]phon
17 As stated above, we do not address the phenomenon of elision here; this example shows that an 
analysis of elision would not require a separate stipulation for h-aspiré words.
When the appropriate grammatical conditions are met, this phonology attribute will 
be given the type phrase. The definition in (53) will accordingly specialize the SYLS 
attribute. One possible specialization is given in Example 64.
Example 64
[ SYLS : ⟨ [ ONS : ⟨⟩, NUC : ⟨ɔ̃⟩, CODA : ⟨n⟩ ]syl,
           [ ONSET : ⟨s m⟩, NUC : ⟨ɔ⟩, CODA : ⟨k⟩ ]syl,
           [ ONS : ⟨p⟩, NUC : ⟨ɑ⟩, CODA : ⟨⟩ ]syl ⟩
  SEGS : ⟨ɔ̃ n s m ɔ k p ɑ⟩ ]phrase
The reader can check that the onset and coda sequences comply with the constraints in 
Figure 3, that the first syllable can have an empty onset because there is no preceding 
syllable that could have a coda that matches the requirements of (60a), and that the 
obstruent k is permitted by constraint (60b) to appear in the coda of the second syllable 
because there is another obstruent p in the following onset. 
This concludes our discussion of French schwa. We believe our treatment of schwa 
is empirically equivalent to that of Tranel (1987a), except for the analysis of h-aspiré.
Several empirical issues remain, but we are optimistic that further refinements to 
our proposals will be able to take additional observations on board. Notwithstanding 
such further developments, we hope to have demonstrated that the procedural de- 
vices of deletion and rule ordering are unnecessary in a typed feature-based grammar 
framework, and that constraints represent a perspicuous way of encoding linguistic 
observations. 
6. Prospects for Implementation 
In the preceding sections we have shown how the use of parameterized lists in HPSG 
is sufficient for encoding a variety of phonological generalizations. While we like this 
approach for the purposes of specification and exposition, as stated in Section 1.4, 
we actually envisage an implementation employing finite-state automata for string 
manipulation. This is simply because we favor the use of existing well-understood 
technology when it comes to producing an efficient implementation. 
As we have already explained in Section 1.4, we have linguistic reasons for not 
wishing to use finite-state transducers and the concomitant two-level model, and in- 
stead are interested in exploring the prospects of integrating our work with the au- 
tomaton model of Bird and Ellison (1994). In this section we give an overview of this 
automaton model and briefly outline the view of automata as types that was originally 
proposed in Bird (1992). 
6.1 One-Level Phonology 
For a variety of reasons already laid out in Section 1, we would like to achieve a closer 
integration between phonology and constraint-based grammar frameworks like HPSG. 
However, for such an integration to work, it is necessary to adopt a rather unusual 
view of phonology; one characterized by such notions as compositionality, intensional- 
ity, and lexicalism, and which has come to be called constraint-based phonology (Bird 
1990). 
Recently, Bird and Ellison (1994) have reinterpreted the constraint-based approach
to phonology using finite-state automata. Nonlinear phonological representations and
rules are encoded as automata. The key insight is that if autosegmental association
is viewed as overlap between intervals with duration (Bird and Klein 1990), then the
overlap can be simulated by using synchronization primitives on automata. Figure 4
illustrates this idea. The diagram on the left of Figure 4 shows two temporal intervals
x and y that overlap during the shaded period. On the right, the intervals x and
y themselves are represented as sequences of contiguous tape cells where each cell
contains a copy of the appropriate information (here, simply repeats of x and y).
Again, the shaded period indicates the period of 'overlap' of the two intervals. The
reader is referred to Bird and Ellison (1994) for further details.
[Figure 4: Two views of autosegmental association.]
Although this kind of phonology employs formal devices very similar to the two- 
level FST model, there are some important differences in how the two models are used. 
In the two-level model the traditional distinction in phonology between RULES and 
REPRESENTATIONS is evident in the transducers and tapes respectively. As in constraint- 
based grammar more generally, one-level phonology does not have this distinction; 
rules and representations alike are interpreted as automata. Figure 5 illustrates this 
difference. 
Now that we have outlined the one-level model and briefly discussed its relation- 
ship with the two-level model, we shall sketch the link to typed feature systems. 
[Figure 5: Comparison of two-level and one-level phonology. (a) Two-level phonology: a lexical representation and a surface representation, related by rules. (b) One-level phonology: a lexical representation, a lexical constraint, a prosodic constraint, and a surface representation.]
6.2 Types as Automata 
A type denotes a set of objects. Thus, types are descriptions, and they can be combined 
using the familiar operations of meet, join, and negation. Similarly, an automaton denotes
a set of objects, namely strings (or automaton tapes). And likewise, the operations
of meet, join, and negation are defined for automata and correspond to intersection, 
union, and complement of the corresponding sets. Of course, a further operation of 
concatenation is defined for automata. We envisage a system for processing linguis- 
tic descriptions that implements a subset of the types (which we might simply call 
STRING types) as finite-state automata over some predefined alphabet. When the infer- 
ence engine requires that two string types be 'unified,' the meet of the corresponding 
automata will be formed. 
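As a concrete illustration of forming the meet of two string types, the sketch below uses the standard product construction for deterministic finite automata. It is a minimal Python example under stated assumptions, not the implementation of Bird and Ellison (1994): the class `DFA`, the function `meet`, and the two toy string types over the alphabet {C, V} are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DFA:
    start: object
    finals: set
    delta: dict          # (state, symbol) -> state; a missing entry means rejection

    def accepts(self, string):
        state = self.start
        for symbol in string:
            if (state, symbol) not in self.delta:
                return False
            state = self.delta[(state, symbol)]
        return state in self.finals

def meet(a, b):
    """Product construction: the result accepts exactly the strings accepted by both."""
    start = (a.start, b.start)
    delta, finals = {}, set()
    seen, todo = {start}, [start]
    symbols = {sym for _, sym in a.delta} & {sym for _, sym in b.delta}
    while todo:
        state = todo.pop()
        p, q = state
        if p in a.finals and q in b.finals:
            finals.add(state)
        for sym in symbols:
            if (p, sym) in a.delta and (q, sym) in b.delta:
                nxt = (a.delta[(p, sym)], b.delta[(q, sym)])
                delta[(state, sym)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    return DFA(start, finals, delta)

# String type 1: strings over {C, V} that end in V.
ends_in_v = DFA(0, {1}, {(0, "C"): 0, (0, "V"): 1, (1, "C"): 0, (1, "V"): 1})
# String type 2: strings with no two adjacent Cs.
no_cc = DFA(0, {0, 1}, {(0, "C"): 1, (0, "V"): 0, (1, "V"): 0})

both = meet(ends_in_v, no_cc)
print([w for w in ("CVCV", "CCV", "CVC", "V") if both.accepts(w)])
```

Only states reachable from the paired start state are built, so the size of the result depends on how the two descriptions interact rather than on the full product of their state sets.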
Although these string types may be declared as the appropriate values for certain 
attributes in a typed feature system, string types are only declared in terms of the 
basic alphabet and other string types. It is not possible to employ non-string types in 
the definition of string types. This is a severe restriction, since list types (say, in HPSG) 
allow arbitrary feature structures as elements, and we would like to be able to do the 
same for string types. Work on overcoming this limitation is currently in progress, and 
builds on the well-known similarity between feature structures and automata, when 
viewed as directed graphs (Kasper and Rounds 1986). 
7. Conclusion 
In this paper, we have tried to give the reader an impression of how two rather different 
phonological phenomena can be given a declarative encoding in a constraint-based 
grammar. Although we have focused on phonology, we have also placed our analyses 
within a morphological context as befits the multi-dimensional perspective of HPSG. 
The formal framework of HPSG is rather powerful; certainly powerful enough to 
capture many analyses in the style of classical generative phonology in which arbi- 
trary mappings are allowed between underlying and surface representations. We have 
limited ourselves further by allowing only one phonological stratum in the grammar, 
and by adopting a notion of phonological compositionality that supports monotonicity. 
These restrictions make it much harder to carry over generalizations that depended on 
a procedural rule format. This is not a handicap, we contend, since it is heuristically 
valuable to view the data in a new light rather than just coercing traditional analyses 
into a modern grammar formalism. 
So what is a constraint-based style of phonological analysis? An important key, we 
claim, is the use of generalizations expressed at the level of prosodic types. Coupled 
with a systematic underspecification of lexical entries and a regime of type inheritance, 
this allows us to have different levels of linguistic abstraction while maintaining a 
'concrete' relation between lexical and surface representations of phonology. 
We hope to have given enough illustration to show that our approach is viable. In 
future, we wish to extend these same techniques to a typologically diverse range of 
other linguistic phenomena. A second important goal is to show how the technology 
of finite-state automata can be invoked to deal with phonological information in HPSG. 
For although we have placed phonology within a general framework of linguistic 
constraints, the analyses we have presented only involve manipulation of regular 
expressions. 
Acknowledgments 
This research is funded by the U.K. Science and Engineering Research Council, under
grant GR/G-22084 Computational Phonology: A Constraint-Based Approach, and has been
carried out as part of the research program of the Human Communication Research
Centre, supported by the U.K. Economic and Social Research Council. We are
grateful to Ted Briscoe, Tomaz Erjavec, Danièle Godard, John Nerbonne, Marc
Plenat, Bernard Tranel, and Ivan Sag for discussions and correspondence relating to
this work, and to two anonymous reviewers for their helpful suggestions.
References 
Anderson, S. R. (1992). A-Morphous 
Morphology, Volume 62 of Cambridge 
Studies in Linguistics. Cambridge 
University Press. 
Antworth, E. (1990). PC-KIMMO: A 
Two-Level Processor for Morphological 
Analysis. Summer Institute of Linguistics. 
Archangeli, D., and Pulleyblank, D. (1989). 
"Yoruba vowel harmony." Linguistic 
Inquiry, 20, 173-217. 
Bach, E., and Wheeler, D. W. (1981). 
"Montague phonology: A first 
approximation." University of 
Massachusetts Occasional Papers in 
Linguistics, 7, 27-45. 
Bird, S. (1990). Constraint-based phonology. 
Doctoral dissertation, University of 
Edinburgh. To be published in revised 
form by Cambridge University Press, 
Studies in Natural Language Processing. 
Bird, S. (1992). "Finite-state phonology in 
HPSG." In Proceedings of the 14th 
International Conference on Computational 
Linguistics (COLING-92). 74-80. 
Bird, S., and Ellison, T. M. (1994). 
"One-level phonology: Autosegmental 
representations and rules as finite 
automata." Computational Linguistics, 20, 
55-90. 
Bird, S., and Klein, E. (1990). "Phonological 
events." Journal of Linguistics, 26, 33-56. 
Bird, S., and Ladd, D. R. (1991). "Presenting 
autosegmental phonology." Journal of 
Linguistics, 27, 193-210. 
Bloomfield, L. (1926). "A set of postulates 
for the science of language." Language, 2, 
153-164. Reprinted in Readings in 
Linguistics I: The Development of Descriptive
Linguistics in America 1925-56, edited by 
Martin Joos. 26-31. 
Briscoe, T. (1991). "Lexical issues in natural 
language processing." In Natural Language 
and Speech, edited by E. Klein and 
F. Veltman, 39-68. Springer-Verlag.
Bromberger, S., and Halle, M. (1989). "Why 
phonology is different." Linguistic Inquiry, 
20, 51-70. 
Cahill, L. J. (1990). "Syllable-based 
morphology." In Proceedings, 13th 
International Conference on Computational 
Linguistics, Volume 3, edited by 
H. Karlgren, 48-53. 
Carpenter, B. (1992). The Logic of Typed 
Feature Structures. Volume 32 of Cambridge 
Tracts in Theoretical Computer Science. 
Cambridge University Press. 
Clements, G. (1985). "The geometry of 
phonological features." Phonology 
Yearbook, 2, 225-252. 
Clements, G. N., and Keyser, S. J. (1983). CV 
Phonology: A Generative Theory of the 
Syllable. MIT Press. 
Coleman, J. S. (1991). Phonological 
representations: Their names, forms and
powers. Doctoral dissertation, University 
of York. 
Coleman, J. S. (1992). "The phonetic 
interpretation of headed phonological 
structures containing overlapping 
constituents." Phonology, 9, 1-44. 
Copestake, A.; Sanfilippo, A.; Briscoe, T.; 
and de Paiva, V. (in press). "The 
ACQUILEX LKB: An introduction." In 
Default Inheritance in Unification Based 
Approaches to the Lexicon, edited by 
T. Briscoe, A. Copestake, and V. de Paiva. 
Cambridge University Press. 
Dell, F. (1980). Generative Phonology and
French Phonology. Cambridge University 
Press. 
Emele, M. C., and Zajac, R. (1990). "Typed 
unification grammars." In Proceedings, 
13th International Conference on 
Computational Linguistics. 293-298. 
Gazdar, G.; Klein, E.; Pullum, G.; and Sag, I. 
(1985). Generalized Phrase Structure 
Grammar. Blackwell. 
Goldsmith, J. (1990). Autosegmental and 
Metrical Phonology. Blackwell. 
Goldsmith, J. A. (1993). "Harmonic 
phonology." In The Last Phonological Rule: 
Reflections on Constraints and Derivations, 
edited by J. A. Goldsmith, 21-60.
University of Chicago Press. 
Hoeksema, J., and Janda, R. (1988). 
"Implications of process morphology for 
categorial grammar." In Categorial 
Grammars and Natural Language Structures, 
edited by R. T. Oehrle, E. Bach, and D. W. 
Wheeler, 199-247. Reidel. 
Hooper, J. (1976). An Introduction to Natural 
Generative Phonology. Academic Press. 
Hudson, G. (1980). "Automatic alternations 
in non-transformational phonology." 
Language, 56, 94-125. 
Jaffar, J., and Lassez, J.-L. (1987). 
"Constraint logic programming." In ACM 
Symposium on Principles of Programming 
Languages, 111-119.
Johnson, M. (1988). Attribute-value logic and 
the theory of grammar. Doctoral 
dissertation, Stanford University. 
Kahn, D. (1976). Syllable-based Generalizations 
in English Phonology. Indiana University 
Linguistics Club. 
Kaplan, R. M., and Bresnan, J. (1982).
"Lexical-Functional Grammar: A formal
system for grammatical representation." 
In The Mental Representation of Grammatical 
Relations, edited by J. Bresnan. MIT Press. 
Kasper, R. T., and Rounds, W. C. (1986). "A 
logical semantics for feature structures." 
In Proceedings, 24th Annual Meeting of the 
Association for Computational Linguistics. 
257-266. 
Kay, M. (1987). "Nonconcatenative 
finite-state morphology." In Proceedings, 
Third Meeting of the European Chapter of the 
Association for Computational Linguistics. 
2-10. 
Keating, P. (1984). "Phonetic and 
phonological representation of stop 
consonant voicing." Language, 60, 286-319. 
Kenstowicz, M., and Kisseberth, C. (1979). 
Generative Phonology: Description and 
Theory. Academic Press. 
Kiparsky, P. (1982). Lexical Morphology and 
Phonology. Hanshin Publishing Co. 
Kisseberth, C. W. (1970). "On the functional 
unity of phonological rules." Linguistic 
Inquiry, 1, 291-306.
Klein, E. (1992). "Data types in 
computational phonology." In Proceedings, 
14th International Conference on 
Computational Linguistics (COLING-92). 
149-155. 
Klein, E. (1993). "An HPSG approach to 
Sierra Miwok verb stems." In Phonology 
and Computation, edited by T. M. Ellison 
and J. M. Scobbie. University of 
Edinburgh, 19-35. 
Kornai, A. (1991). Formal phonology. Doctoral 
dissertation, Stanford University. 
Koskenniemi, K. (1983). Two-level morphology: 
A general computational model for word-form 
recognition and production. Doctoral 
dissertation, University of Helsinki. 
Krieger, H.-U., and Nerbonne, J. (in press). 
"Feature-based inheritance networks for 
computational lexicons." In Default 
Inheritance in Unification Based Approaches to 
the Lexicon, edited by T. Briscoe, 
A. Copestake, and V. de Paiva. 
Cambridge University Press. 
Krieger, H.-U.; Pirker, H.; and Nerbonne, J. 
(1993). "Feature-based allomorphy." In 
Proceedings, 31st Annual Meeting of the 
Association for Computational Linguistics, 
140-147. 
Manaster Ramer, A. (1981). How abstruse is 
phonology? Doctoral dissertation, 
University of Chicago. 
Martinet, A. (1972). "La nature 
phonologique d'e caduc." In Papers in 
Linguistics and Phonetics to the Memory of 
Pierre Delattre, edited by A. Valdman. 
Mouton. 
Mastroianni, M. (1993). Attribute Logic 
Phonology. CMU-LCL 93-4, Carnegie 
Mellon University. 
McCarthy, J. (1981). "A prosodic theory of 
non-concatenative morphology." 
Linguistic Inquiry, 12, 373-418. 
McCarthy, J., and Prince, A. (1993).
"Prosodic morphology I: Constraint
interaction and satisfaction." Unpublished 
Report. 
Morin, Y.-C. (1978). "The status of mute 'e'." 
Studies in French Linguistics, 1, 79-140. 
Morin, Y.-C. (1987). "French data and 
phonological theory." Linguistics, 25, 
815-843. 
Paradis, C. (1988). "On constraints and 
repair strategies." The Linguistic Review, 6, 
71-97. 
Partee, B. H. (1979). "Montague grammar 
and the well-formedness constraint." In 
Syntax and Semantics 10: Selections from the 
Third Groningen Round Table, edited by 
F. Heny and H. Schnelle, 275-313.
Academic Press. 
Pierrehumbert, J. (1990). "Phonological and 
phonetic representation." Journal of 
Phonetics, 18, 375-394. 
Pollard, C., and Sag, I. (1987). 
Information-Based Syntax and Semantics. 
Volume 13 of CSLI Lecture Notes. Stanford: 
Center for the Study of Language and 
Information. 
Prince, A. S., and Smolensky, P. (1993). 
"Optimality theory: Constraint interaction 
in generative grammar." Technical 
Report 2, Center for Cognitive Science, 
Rutgers University. 
Pullum, G. K., and Zwicky, A. M. (1984). 
"The syntax-phonology boundary and 
current syntactic theories." In Ohio State 
University Working Papers in Linguistics: 
Papers on Morphology, edited by A. Zwicky 
and R. Wallace. Ohio State University. 
Reinhard, S., and Gibbon, D. (1991). 
"Prosodic inheritance and morphological 
generalizations." In Proceedings, 5th 
European ACL Meeting, 131-136. 
Riehemann, S. (1992). Word formation in 
lexical type hierarchies: A case study of 
bar-adjectives in German. Master's thesis, 
Department of Linguistics, University of 
Tübingen.
Rooth, M. (1985). Association with focus. 
Doctoral dissertation, University of 
Massachusetts-Amherst. 
Russell, K. (1993). A constraint-based approach 
to phonology. Doctoral dissertation, 
University of Southern California. 
Scobbie, J. (1991). Attribute-value phonology. 
Doctoral dissertation, University of 
Edinburgh. 
Selkirk, E. (1984). Phonology and Syntax. MIT 
Press. 
Shibatani, M. (1973). "The role of surface 
phonetic constraints in generative 
phonology." Language, 49, 87-106. 
Smolka, G. (1992). "Feature constraint logics 
for unification grammars." Journal of Logic 
Programming, 12, 51-87. 
Sproat, R. (1992). Morphology and 
Computation. Natural Language 
Processing. MIT Press. 
Sproat, R., and Brunson, B. (1987). 
"Constituent-based morphological 
parsing: A new approach to the problem 
of word-recognition." In Proceedings, 25th 
Annual Meeting of the Association for 
Computational Linguistics. 65-72. 
Tranel, B. (1987a). "French schwa and 
nonlinear phonology." Linguistics, 25, 
845-866. 
Tranel, B. (1987b). The Sounds of French: An
Introduction. Cambridge University Press. 
Valdman, A. (1976). Introduction to French 
Phonology and Morphology. Newbury 
House. 
Walther, M. (1992). Deklarative Silbifizierung 
in einem constraintbasierten 
Grammatikformalismus. Master's thesis, 
University of Stuttgart. 
Wheeler, D. (1981). Aspects of a Categorial 
Theory of Phonology. Doctoral dissertation, 
University of Massachusetts-Amherst. 
Wiebe, B. (1992). Modelling autosegmental 
phonology with multi-tape finite state 
transducers. Master's thesis, Simon Fraser 
University. 
Zajac, R. (1992). "Inheritance and 
constraint-based grammar formalisms." 
Computational Linguistics, 18, 159-182. 
Zajac, R. (in press). "Issues in the design of 
a language for representing linguistic 
information based on inheritance and 
feature structures." In Default Inheritance in 
Unification Based Approaches to the Lexicon, 
edited by T. Briscoe, A. Copestake, and 
V. de Paiva. Cambridge University Press. 
