Tractability and Structural Closures in Attribute Logic Type Signatures
Gerald Penn
Department of Computer Science
University of Toronto
10 King’s College Rd.
Toronto M5S 3G4, Canada
gpenn@cs.toronto.edu
Abstract
This paper considers three assumptions
conventionally made about signatures
in typed feature logic that are in po-
tential disagreement with current prac-
tice among grammar developers and
linguists working within feature-based
frameworks such as HPSG: meet-semi-
latticehood, unique feature introduc-
tion, and the absence of subtype cover-
ing. It also discusses the conditions un-
der which each of these can be tractably
restored in realistic grammar signatures
where they do not already exist.
1 Introduction
The logic of typed feature structures (LTFS, Car-
penter 1992) and, in particular, its implementa-
tion in the Attribute Logic Engine (ALE, Car-
penter and Penn 1996), have been widely used
as a means of formalising and developing gram-
mars of natural languages that support computa-
tionally efficient parsing and SLD resolution, no-
tably grammars within the framework of Head-
driven Phrase Structure Grammar (HPSG, Pollard
and Sag 1994). These grammars are formulated
using a vocabulary provided by a finite partially
ordered set of types and a set of features that must
be specified for each grammar, and feature struc-
tures in these grammars must respect certain con-
straints that are also specified. These include ap-
propriateness conditions, which specify, for each
type, all and only the features that take values
in feature structures of that type, and with which
types of values (value restrictions). There are also
more general implicational constraints of the form
$\tau \rightarrow \phi$, where $\tau$ is a type, and $\phi$ is an expression
from LTFS’s description language. In LTFS
and ALE, these four components, a partial order
of types, a set of features, appropriateness declara-
tions and type-antecedent constraints can be taken
as the signature of a grammar, relative to which
descriptions can be interpreted.
LTFS and ALE also make several assump-
tions about the structure and interpretation of this
partial order of types and about appropriateness,
some for the sake of generality, others for the
sake of efficiency or simplicity. Appropriate-
ness is generally accepted as a good thing, from
the standpoints of both efficiency and representa-
tional accuracy, and while many have advocated
the need for implicational constraints that are even
more general, type-antecedent constraints at the
very least are also accepted as being necessary and
convenient. Not all of the other assumptions are
universally observed by formal linguists or gram-
mar developers, however.
This paper addresses the three most contentious
assumptions that LTFS and ALE make, and how
to deal with their absence in a tractable manner.
They are:
1. Meet-semi-latticehood: every partial order
of types must be a meet semi-lattice. This
implies that every consistent pair of types has
a least upper bound.
2. Unique feature introduction: for every fea-
ture, F, there is a unique most general type to
which F is appropriate.
3. No subtype covering: there can be feature
structures of a non-maximally-specific type
that are not typable as any of its maximally
specific subtypes. When subtype covering
is not assumed, feature structures themselves
can be partially ordered and taken to repre-
sent partial information states about some set
of objects. When subtype covering is as-
sumed, feature structures are discretely or-
dered and totally informative, and can be
taken to represent objects in the (linguistic)
world themselves. The latter interpretation is
subscribed to by Pollard and Sag (1994), for
example.
All three of these conditions have been claimed
elsewhere to be either intractable or impossible
to restore in grammar signatures where they do
not already exist. It will be argued here that: (1)
restoring meet-semi-latticehood is theoretically
intractable, for which the worst case bears a dis-
quieting resemblance to actual practice in current
large-scale grammar signatures, but nevertheless
can be efficiently compiled in practice due to the
sparseness of consistent types; (2) unique feature
introduction can always be restored to a signature
in low-degree polynomial time, and (3) while type
inferencing when subtype covering is assumed is
intractable in the worst case, a very elegant con-
straint logic programming solution combined with
a special compilation method exists that can re-
store tractability in many practical contexts. Some
simple completion algorithms and a corrected NP-
completeness proof for non-disjunctive type infer-
encing with subtype covering are also provided.
2 Meet-semi-latticehood
In LTFS and ALE, partial orders of types are as-
sumed to be meet semi-lattices:
Definition 1 A partial order, $\langle P, \sqsubseteq \rangle$, is a meet semi-lattice iff for any $x, y \in P$, $x \sqcap y\downarrow$ (i.e., the meet of $x$ and $y$ is defined).
$\sqcap$ is the binary greatest lower bound, or meet operation, and is the dual of the join operation, $\sqcup$, which corresponds to unification, or least upper bounds (in the orientation where $\bot$ corresponds to the most general type). Figure 1 is not a meet semi-lattice because some pairs of its elements do not have a meet.
In the finite case, the assumption that every pair of types has a meet is equivalent to the assumption that every consistent set of types, i.e., types with an upper bound, has a join. It is theoretically convenient when discussing the unification of feature structures to assume that the unification of two consistent types always exists. It can also be more efficient to make this assumption as, in some representations of types and feature structures, it avoids a source of non-determinism (selection among minimal but not least upper bounds) during search.
Figure 1: An example of a partial order that is not a meet semi-lattice (diagram over the elements a, b, c, d, e, f and g).
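As an illustration of Definition 1, the following is a minimal Python sketch that tests whether a small finite hierarchy is a meet semi-lattice. The two hierarchies, the element names and the function names are hypothetical and are not taken from Figure 1; pairs `(x, y)` in `covers` are assumed to mean that `x` is immediately more general than `y`.

```python
from itertools import combinations

def closure(elements, covers):
    """Reflexive-transitive closure; (x, y) means x is at least as general as y."""
    leq = {(e, e) for e in elements} | set(covers)
    changed = True
    while changed:
        changed = False
        for x, y in list(leq):
            for y2, z in list(leq):
                if y == y2 and (x, z) not in leq:
                    leq.add((x, z))
                    changed = True
    return leq

def meet(x, y, elements, leq):
    """Greatest lower bound of x and y, or None if it is undefined."""
    lower = [l for l in elements if (l, x) in leq and (l, y) in leq]
    greatest = [g for g in lower if all((l, g) in leq for l in lower)]
    return greatest[0] if greatest else None

def is_meet_semilattice(elements, covers):
    leq = closure(elements, covers)
    return all(meet(x, y, elements, leq) is not None
               for x, y in combinations(elements, 2))

# Hypothetical example: c and d have two maximal lower bounds (a and b), so no meet.
broken = (["bot", "a", "b", "c", "d"],
          [("bot", "a"), ("bot", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")])
# The same hierarchy with one extra type, ab, inserted between {a, b} and {c, d}.
repaired = (["bot", "a", "b", "ab", "c", "d"],
            [("bot", "a"), ("bot", "b"), ("a", "ab"), ("b", "ab"), ("ab", "c"), ("ab", "d")])

print(is_meet_semilattice(*broken), is_meet_semilattice(*repaired))
```

The quadratic loop over pairs is only meant for toy inputs; it makes no attempt at the sparse representations discussed later in this section.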
Just because it would be convenient for unifica-
tion to be well-defined, however, does not mean
it would be convenient to think of any empiri-
cal domain’s concepts as a meet semi-lattice, nor
that it would be convenient to add all of the types
necessary to a would-be type hierarchy to ensure
meet-semi-latticehood. The question then natu-
rally arises as to whether it would be possible,
given any finite partial order, to add some extra
elements (types, in this case) to make it a meet
semi-lattice, and if so, how many extra elements
it would take, which also provides a lower bound
on the time complexity of the completion.
It is, in fact, possible to embed any finite partial
order into a smallest lattice that preserves exist-
ing meets and joins by adding extra elements. The
resulting construction is the finite restriction of
the Dedekind-MacNeille completion (Davey and
Priestley, 1990, p. 41).
Definition 2 Given a partially ordered set, $P$, the Dedekind-MacNeille completion of $P$, $\langle DM(P), \subseteq \rangle$, is given by:
$DM(P) = \{ A \subseteq P \mid A^{ul} = A \}$
where $A^{u}$ is the set of upper bounds of $A$ in $P$, and $A^{ul}$ is the set of lower bounds of $A^{u}$.
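Definition 2 can be applied directly to a very small partial order by enumerating subsets. The sketch below does this in Python under the usual reading of the definition, in which the upper bounds of a set are taken first and then the lower bounds of that result; the four-element example and all names are hypothetical, and the enumeration is exponential, so it is only suitable for illustration.

```python
from itertools import combinations

def dedekind_macneille(elements, leq):
    """Return every subset A of `elements` with A^{ul} = A, as frozensets."""
    def upper(A):  # A^u: common upper bounds of A
        return frozenset(u for u in elements if all((a, u) in leq for a in A))

    def lower(A):  # A^l: common lower bounds of A
        return frozenset(l for l in elements if all((l, a) in leq for a in A))

    closed = set()
    for size in range(len(elements) + 1):
        for A in combinations(elements, size):
            closed.add(lower(upper(A)))  # A^{ul} is always a closed set
    return closed

# Hypothetical example: a and b both lie below c and d, and nothing else is ordered.
elements = ["a", "b", "c", "d"]
leq = {(e, e) for e in elements} | {("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")}
completion = dedekind_macneille(elements, leq)
print(sorted(sorted(A) for A in completion))
```

On this example the completion has seven elements: images of the four original elements, a new element {a, b} that serves as the missing join of a and b, and a new least and greatest element.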
This route has been considered before in the context of taxonomical knowledge representation (Aït-Kaci et al., 1989; Fall, 1996). While meet semi-lattice completions are a practical step towards providing a semantics for arbitrary partial orders, they are generally viewed as an impractical preliminary step to performing computations over a partial order. Work on more efficient encoding schemes began with Aït-Kaci et al. (1989), and this seminal paper has in turn given rise to several interesting studies of incremental computations of the Dedekind-MacNeille completion in which LUBs are added as they are needed (Habib and Nourine, 1994; Bertet et al., 1997). This was also the choice made in the LKB parsing system for HPSG (Malouf et al., 2000).
123 124 134 234
1 2 3 4
Figure 2: A worst case for the Dedekind-MacNeille completion at $n = 4$.
There are partial orders $P$ of unbounded size for which $|DM(P)| = \Theta(2^{|P|})$. As one family of worst-case examples, parametrised by $n$, consider a set $S = \{1, \ldots, n\}$, and a partial order $P$ defined as all of the size $n - 1$ subsets of $S$ and all of the size $1$ subsets of $S$, ordered by inclusion. Figure 2 shows the case where $n = 4$. Although the maximum subtype and supertype branching factors in this family increase linearly with size, the partial orders can grow in depth instead in order to contain this.
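The growth of this family can be checked by brute force for small $n$. The following Python sketch builds the partial order of singleton and size $n - 1$ subsets and counts the closed sets of Definition 2; the function names are hypothetical, and the enumeration is only feasible for very small $n$.

```python
from itertools import combinations

def worst_case_poset(n):
    """Size-1 and size-(n-1) subsets of {1, ..., n}, ordered by inclusion (n >= 3)."""
    S = range(1, n + 1)
    singles = [frozenset(c) for c in combinations(S, 1)]
    large = [frozenset(c) for c in combinations(S, n - 1)]
    return singles + large

def completion_size(P):
    """Number of subsets A of P with A^{ul} = A, found by exhaustive enumeration."""
    closed = set()
    for size in range(len(P) + 1):
        for A in combinations(P, size):
            ups = [u for u in P if all(a <= u for a in A)]                 # A^u
            closed.add(frozenset(l for l in P if all(l <= u for u in ups)))  # A^{ul}
    return len(closed)

for n in (3, 4, 5):
    P = worst_case_poset(n)
    print(n, len(P), completion_size(P))
```

For these values the partial order has $2n$ elements while the completion has $2^n$, which includes the added greatest and least elements.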
That yields something roughly of the form shown in Figure 3, which is an example of a recent trend in using type-intensive encodings of linguistic information into typed feature logic in HPSG, beginning with Sag (1997). These explicitly isolate several dimensions¹ of analysis as a means of classifying complex linguistic objects. In Figure 3, specific clausal types are selected from among the possible combinations of CLAUSALITY and HEADEDNESS subtypes. In this setting, the parameter $n$ corresponds roughly to the number of dimensions used, although an exponential explosion is obviously not dependent on reading the type hierarchy according to this convention.
There is a simple algorithm for performing this completion, which assumes the prior existence of a most general element ($\bot$), given in Figure 4.
¹ It should be noted that while the common parlance for these sections of the type hierarchy is dimension, borrowed from earlier work by Erbach (1994) on multi-dimensional inheritance, these are not dimensions in the sense of Erbach (1994) because not every $n$-tuple of subtypes from an $n$-dimensional classification is join-compatible.
Most instantiations of the heuristic, “where there
is no meet, add one” (Fall, 1996), do not yield
the Dedekind-MacNeille completion (Bertet et al.,
1997), and other authors have proposed incremen-
tal methods that trade greater efficiency in com-
puting the entire completion at once for their in-
crementality.
Proposition 1 The MSL completion algorithm is correct on finite partially ordered sets, $P$, i.e., upon termination, it has produced $DM(P)$.
Proof: Let $C(P)$ be the partially ordered set produced by the algorithm. Clearly, $P \subseteq C(P)$. It suffices to show that (1) $C(P)$ is a complete lattice (with $\top$ added), and (2) for all $c \in C(P)$, there exist subsets $A, B \subseteq P$ such that $c = \bigvee_{C(P)} A = \bigwedge_{C(P)} B$.²
Suppose there are $c, d \in C(P)$ such that $c \sqcap d\uparrow$. There is a least element, so $c$ and $d$ have more than one maximal lower bound, $l_1, l_2$ and others. But then $\{l_1, l_2\}$ is upper-bounded and $l_1 \sqcup l_2\uparrow$, so the algorithm should not have terminated. Suppose instead that $c \sqcup d\uparrow$. Again, the algorithm should not have terminated. So $C(P)$ with $\top$ added is a complete lattice.
Given $c \in C(P)$, if $c \in P$, then choose $A_c = B_c = \{c\}$. Otherwise, the algorithm added $c$ because of a bounded set $\{t_1, t_2\}$, with minimal upper bounds, $u_1, \ldots, u_k$, which did not have a least upper bound, i.e., $k > 1$. In this case, choose $A_c = A_{t_1} \cup A_{t_2}$ and $B_c = \bigcup_{1 \le i \le k} B_{u_i}$. In either case, clearly $c = \bigvee_{C(P)} A_c = \bigwedge_{C(P)} B_c$ for all $c \in C(P)$. $\Box$
Termination is guaranteed by considering, af-
ter every iteration, the number of sets of meet-
irreducible elements with no meet, since all com-
pletion types added are meet-reducible by defini-
tion.
In LinGO (Flickinger et al., 1999), the largest publicly-available LTFS-based grammar, and one which uses such type-intensive encodings, there are 3414 types, the largest supertype branching factor is 19, and although dimensionality is not distinguished in the source code from other types, the largest subtype branching factor is 103. Using supertype branching factor for the most conservative estimate, this still implies a theoretical maximum of approximately 500,000 completion types, whereas only 893 are necessary, 648 of which are inferred without reference to previously added completion types.
² These are sometimes called the join density and meet density, respectively, of $P$ in $DM(P)$ (Davey and Priestley, 1990, p. 42).
fin-wh-fill-rel-cl inf-wh-fill-rel-cl red-rel-cl simp-inf-rel-cl wh-subj-rel-cl bare-rel-cl
fin-hd-fill-ph inf-hd-fill-ph fin-hd-subj-ph
wh-rel-cl non-wh-rel-cl hd-fill-ph hd-comp-ph hd-subj-ph hd-spr-ph
imp-cl decl-cl inter-cl rel-cl hd-adj-ph hd-nexus-ph
clause non-clause hd-ph non-hd-ph
CLAUSALITY HEADEDNESS
phrase
Figure 3: A fragment of an English grammar in which supertype branching distinguishes “dimensions” of classification.
Whereas incremental compilation methods rely
on the assumption that the joins of most pairs of
types will never be computed in a corpus before
the signature changes, this method’s efficiency re-
lies on the assumption that most pairs of types
are join-incompatible no matter how the signa-
ture changes. In LinGO, this is indeed the case:
of the 11,655,396 possible pairs, 11,624,866 are
join-incompatible, and there are only 3,306 that
are consistent (with or without joins) and do not
stand in a subtyping or identity relationship. In
fact, the cost of completion is often dominated
by the cost of transitive closure, which, using a
sparse matrix representation, can be completed for
LinGO in about 9 seconds on a 450 MHz Pentium
II with 1GB memory (Penn, 2000a).
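The figures quoted for LinGO can be checked with a few lines of arithmetic. The Python sketch below uses only numbers stated in the text; the variable names are hypothetical, and reading the "theoretical maximum" as $2^{19}$ is an assumption based on the largest supertype branching factor of 19.

```python
types = 3414
pairs = types ** 2               # ordered pairs of types, as counted in the text
incompatible = 11_624_866        # join-incompatible pairs reported for LinGO
upper_bound = 2 ** 19            # assumed source of the "approximately 500,000" estimate

print(pairs)                     # total number of pairs
print(incompatible / pairs)      # fraction of pairs that are join-incompatible
print(upper_bound)               # worst-case number of completion types
```

Under this reading, well over 99% of all pairs of types are join-incompatible, which is the sparseness that the compile-time completion depends on.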
While the continued efficiency of compile-time
completion of signatures as they further increase
in size can only be verified empirically, what can
be said at this stage is that the only reason that sig-
natures like LinGO can be tractably compiled at
all is sparseness of consistent types. In other ge-
ometric respects, it bears a close enough resem-
blance to the theoretical worst case to cause con-
cern about scalability. Compilation, if efficient,
is to be preferred from the standpoint of static
error detection, which incremental methods may
elect to skip. In addition, running a new signa-
ture plus grammar over a test corpus is a frequent
task in large-scale grammar development, and in-
cremental methods, even ones that memoise pre-
vious computations, may pay back the savings in
compile-time on a large test corpus. It should also
be noted that another plausible method is compi-
lation into logical terms or bit vectors, in which
some amount of compilation (ranging from linear-
time to exponential) is performed with the remain-
ing cost amortised evenly across all run-time uni-
fications, which often results in a savings during
grammar development.
3 Unique Feature Introduction
LTFS and ALE also assume that appropriateness
guarantees the existence of a unique introducer for
every feature:
Definition 3 Given a type hierarchy, $\langle T, \sqsubseteq \rangle$, and a finite set of features, $\mathit{Feat}$, an appropriateness specification is a partial function, $\mathit{Approp} : \mathit{Feat} \times T \rightarrow T$ such that, for every $F \in \mathit{Feat}$:
• (Feature Introduction) there is a type $\mathit{Intro}(F) \in T$ such that:
- $\mathit{Approp}(F, \mathit{Intro}(F))\downarrow$, and
- for every $t \in T$, if $\mathit{Approp}(F, t)\downarrow$, then $\mathit{Intro}(F) \sqsubseteq t$, and
• (Upward Closure / Right Monotonicity) if $\mathit{Approp}(F, s)\downarrow$ and $s \sqsubseteq t$, then $\mathit{Approp}(F, t)\downarrow$ and $\mathit{Approp}(F, s) \sqsubseteq \mathit{Approp}(F, t)$.
Feature introduction has been argued not to be
appropriate for certain empirical domains either,
although Pollard and Sag (1994) do otherwise ob-
serve it. The debate, however, has focussed on
whether to modify some other aspect of type infer-
encing in order to compensate for the lack of fea-
ture introduction, presumably under the assump-
tion that feature introduction was difficult or im-
possible to restore automatically to grammar sig-
natures that did not have it.
1. Find two elements, $t_1, t_2$ with minimal upper bounds, $u_1, \ldots, u_k$, such that their join $t_1 \sqcup t_2$ is undefined, i.e., $k > 1$. If no such pair exists, then stop.
2. Add an element, $c$, such that:
• for all $1 \le i \le k$, $c \sqsubseteq u_i$, and
• for all elements $t$, $t \sqsubseteq c$ iff for all $1 \le i \le k$, $t \sqsubseteq u_i$.
3. Go to (1).
Figure 4: The MSL completion algorithm.
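The steps of Figure 4 can be sketched in Python as follows. This is an illustrative reading of the figure, not the ALE implementation: the hierarchy is assumed to be given as a dictionary mapping each element to the set of elements at or above it (reflexive and transitive), the names `msl_completion`, `bowtie` and the generated names `x1`, `x2`, ... are hypothetical, and no attempt is made to exploit sparseness.

```python
def msl_completion(up):
    """up[e] is the set of elements at or above e. Returns (completed up, added names)."""
    up = {e: set(s) for e, s in up.items()}
    added = []
    while True:
        found = None
        elems = list(up)
        for i, t1 in enumerate(elems):
            for t2 in elems[i + 1:]:
                common = up[t1] & up[t2]          # all upper bounds of t1 and t2
                minimal = {u for u in common
                           if not any(v != u and u in up[v] for v in common)}
                if len(minimal) > 1:              # step 1: the join is undefined
                    found = (common, minimal)
                    break
            if found:
                break
        if found is None:
            return up, added
        common, minimal = found
        x = "x%d" % (len(added) + 1)              # hypothetical fresh name
        for t in elems:                           # step 2: t is below x iff t is
            if minimal <= up[t]:                  # below every minimal upper bound
                up[t].add(x)
        up[x] = {x} | common                      # x is below every upper bound
        added.append(x)

# Hypothetical input: a and b have two minimal upper bounds, c and d.
bowtie = {"bot": {"bot", "a", "b", "c", "d"},
          "a": {"a", "c", "d"}, "b": {"b", "c", "d"},
          "c": {"c"}, "d": {"d"}}
completed, added = msl_completion(bowtie)
print(added, sorted(completed["x1"]))
```

On the example, a single completion type is added between {a, b} and {c, d}, after which every consistent pair has a least upper bound.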
Just as with the condition of meet-semi-
latticehood, however, it is possible to take a
would-be signature without feature introduction
and restore this condition through the addition
of extra unique introducing types for certain
appropriate features. The algorithm in Figure 5
achieves this. In practice, the same signature completion type, $c$, can be used for different features, provided that their minimal introducers are the same set, $M$. This clearly produces a partially ordered set with a unique introducing type for every feature. It may disturb meet-semi-latticehood, however, which means that this completion must precede the meet semi-lattice completion of Section 2. If generalisation has already been computed, the signature completion algorithm runs in $O(fn)$, where $f$ is the number of features, and $n$ is the number of types.
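A minimal Python sketch of the introduction completion of Figure 5 follows. It assumes that the hierarchy is given as a dictionary from each type to the set of types at or above it, that appropriateness is a dictionary from (feature, type) pairs to value types, and that meets of value restrictions exist. All names, including the generated `intro1`, are hypothetical. The sketch also assumes that no other feature is appropriate to a type more general than a newly added introducer; otherwise upward closure would have to be re-established for the new type.

```python
def generalise(values, up):
    """Meet of `values`: the most specific type at or below all of them."""
    lower = [l for l in up if set(values) <= up[l]]
    return next(g for g in lower if all(g in up[l] for l in lower))

def introduction_completion(up, approp):
    up = {t: set(s) for t, s in up.items()}
    approp = dict(approp)
    introducers = {}                    # minimal-introducer set -> shared new type
    for f in sorted({f for (f, _) in approp}):
        carriers = {t for (g, t) in approp if g == f}
        minimal = frozenset(m for m in carriers
                            if not any(t != m and m in up[t] for t in carriers))
        if len(minimal) <= 1:
            continue                    # f already has a unique introducer
        if minimal not in introducers:
            x = "intro%d" % (len(introducers) + 1)   # hypothetical fresh name
            for t in list(up):
                if minimal <= up[t]:    # t is below x iff t is below every m in M
                    up[t].add(x)
            up[x] = {x}.union(*(up[m] for m in minimal))
            introducers[minimal] = x
        x = introducers[minimal]
        approp[(f, x)] = generalise([approp[(f, m)] for m in minimal], up)
    return up, approp

# Hypothetical signature: f is introduced separately at a and at b.
up = {"bot": {"bot", "bool", "plus", "minus", "a", "b"},
      "bool": {"bool", "plus", "minus"}, "plus": {"plus"}, "minus": {"minus"},
      "a": {"a"}, "b": {"b"}}
approp = {("f", "a"): "plus", ("f", "b"): "minus"}
new_up, new_approp = introduction_completion(up, approp)
print(new_approp[("f", "intro1")], sorted(new_up["intro1"]))
```

Keying the new types on the set of minimal introducers reflects the remark above that one completion type can serve several features.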
4 Subtype Covering
In HPSG, it is generally assumed that non-
maximally-specific types are simply a convenient
shorthand for talking about sets of maximally
specific types, sometimes called species, over
which the principles of a grammar are stated. In a
view where feature structures represent discretely
ordered objects in an empirical model, every
feature structure must bear one of these species.
In particular, each non-maximally-specific type
in a description is equivalent to the disjunction of
the maximally specific subtypes that it subsumes.
1. Given candidate signature, $S$, find a feature, F, for which there is no unique introducing type. Let $M$ be the set of minimal types to which F is appropriate, where $|M| > 1$. If there is no such feature, then stop.
2. Add a new type, $c$, to $S$, to which F is appropriate, such that:
• for all $m \in M$, $c \sqsubseteq m$,
• for all types, $t$ in $S$, $t \sqsubseteq c$ iff for all $m \in M$, $t \sqsubseteq m$, and
• $\mathit{Approp}(F, c) = \mathit{Approp}(F, m_1) \sqcap \mathit{Approp}(F, m_2) \sqcap \cdots \sqcap \mathit{Approp}(F, m_{|M|})$, the generalization of the value restrictions on F of the elements of $M$.
3. Go to (1).
Figure 5: The introduction completion algorithm.
There are some good reasons not to build this assumption, called “subtype covering,” into LTFS or its implementations. Firstly, it is not an appropriate assumption to make for some empirical domains. Even in HPSG, the denotations of parametrically-typed lists are more naturally interpreted without it. Secondly, not to make the assumption is more general: where it is appropriate, extra type-antecedent constraints can be added to the grammar signature of the form:
$n \rightarrow s_1 \vee \cdots \vee s_i$
for each non-maximally-specific type, $n$, and its $i$ maximal subtypes, $s_1, \ldots, s_i$. These constraints become crucial in certain cases where the possible permutations of appropriate feature values at a type are not covered by the permutations of those features on its maximally specific subtypes. This is the case for the type, verb, in the signature in Figure 6 (given in ALE syntax, where sub/2 defines the partial order of types, and intro/2 defines appropriateness on unique introducers of features). The combination, AUX:$-$ $\wedge$ INV:$+$, is not attested by any of verb’s subtypes. While there are arguably better ways to represent this information, the extra type-antecedent constraint:
verb $\rightarrow$ aux_verb $\vee$ main_verb
is necessary in order to decide satisfiability correctly under the assumption of subtype covering. We will call types such as verb deranged types. Types that are not deranged are called normal types.
bot sub [verb,bool].
bool sub [+,-].
verb sub [aux_verb,main_verb]
     intro [aux:bool,inv:bool].
aux_verb intro [aux:+,inv:bool].
main_verb intro [aux:-,inv:-].
Figure 6: A signature with a deranged type.
4.1 Non-Disjunctive Type Inference under Subtype Covering is NP-Complete
Thirdly, although subtype covering is, in the author’s experience, not a source of inefficiency in practical LTFS grammars, when subtype covering is implicitly assumed, determining whether a non-disjunctive description is satisfiable under appropriateness conditions is an NP-complete problem, whereas this is known to be polynomial time without it (and without type-antecedent constraints, of course). This was originally proven by Carpenter and King (1995). The proof, with corrections, is summarised here because it was never published. Consider the translation of a 3SAT formula into a description relative to the signature given in Figure 7. The resulting description is always non-disjunctive, since logical disjunction is encoded in subtyping. Asking whether a formula is satisfiable then reduces to asking whether this description conjoined with trueform is satisfiable. Every type is normal except for truedisj, for which the combination, DISJ1:falseform $\wedge$ DISJ2:falseform, is not attested in either of its subtypes. Enforcing subtype covering on this one deranged type is the sole source of intractability for this problem.
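The translation itself is not spelled out in the text, so the following Python sketch shows only one plausible encoding, under stated assumptions: clauses are lists of literals such as `"p"` or `"-q"`, each propositional symbol becomes a shared (re-entrant) variable constrained to the type propsymbol, negative literals use the type neg, and longer conjunctions and disjunctions are nested to the right. The function names and the exact shape of the output are hypothetical; the type and feature names are those of Figure 7.

```python
def literal(lit):
    """Translate 'p' or '-p' into a description over a shared variable P."""
    name = lit.lstrip("-").upper()          # one variable per propositional symbol
    if lit.startswith("-"):
        return "(neg, neg:%s)" % name
    return "(propsymbol, %s)" % name

def nest(kind, parts):
    """Right-nested binary description of type `kind` ('conj' or 'disj')."""
    if len(parts) == 1:
        return parts[0]
    return "(%s, %s1:%s, %s2:%s)" % (kind, kind, parts[0], kind, nest(kind, parts[1:]))

def describe(clauses):
    """Description whose satisfiability is meant to mirror that of the formula."""
    body = nest("conj", [nest("disj", [literal(l) for l in c]) for c in clauses])
    return "(trueform, %s)" % body

print(describe([["p", "-q", "r"], ["-p", "q", "r"]]))
```

The output contains no disjunction operator: all of the choice points are hidden in the subtypes of disj, conj, neg and propsymbol, which is where subtype covering has to do the work.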
4.2 Practical Enforcement of Subtype Covering
Instead of enforcing subtype covering along with type inferencing, an alternative is to suspend constraints on feature structures that encode subtype covering restrictions, and conduct type inferencing in their absence. This restores tractability at the cost of rendering type inferencing sound but not complete. This can be implemented very transparently in systems like ALE that are built on top of another logic programming language with support for constraint logic programming such as SICStus Prolog. In the worst case, an answer to a query to the grammar signature may contain variables with constraints attached to them that must be exhaustively searched over in order to determine their satisfiability, and this is still intractable in the worst case. The advantage of suspending subtype covering constraints is that other principles of grammar and proof procedures such as SLD resolution, parsing or generation can add deterministic information that may result in an early failure or a deterministic set of constraints that can then be applied immediately and efficiently. The variables that correspond to feature structures of a deranged type are precisely those that require these suspended constraints.
bot sub [bool,formula].
bool sub [true,false].
formula sub [propsymbol,conj,disj,neg,
             trueform,falseform].
propsymbol sub [truepropsym,
                falsepropsym].
conj sub [trueconj,falseconj1,
          falseconj2]
     intro [conj1:formula,
            conj2:formula].
trueconj intro [conj1:trueform,
                conj2:trueform].
falseconj1 intro [conj1:falseform].
falseconj2 intro [conj2:falseform].
disj sub [truedisj,falsedisj]
     intro [disj1:formula,
            disj2:formula].
truedisj sub [truedisj1,truedisj2].
truedisj1 intro [disj1:trueform].
truedisj2 intro [disj2:trueform].
falsedisj intro [disj1:falseform,
                 disj2:falseform].
neg sub [trueneg,falseneg]
    intro [neg:propsymbol].
trueneg intro [neg:falsepropsym].
falseneg intro [neg:truepropsym].
trueform sub [truepropsym,trueconj,
              truedisj,trueneg].
falseform sub [falsepropsym,falseconj1,
               falseconj2,falsedisj,falseneg].
Figure 7: The signature reducing 3SAT to non-disjunctive type inferencing.
Given a diagnosis of which types in a signature
are deranged (discussed in the next section),
suspended subtype covering constraints can be
implemented for the SICStus Prolog implemen-
tation of ALE by adding relational attachments
to ALE’s type-antecedent universal constraints
that will suspend a goal on candidate feature
structures with deranged types such as verb
or truedisj. The suspended goal unblocks
whenever the deranged type or the type of one
of its appropriate features’ values is updated to
a more specific subtype, and checks the types of
the appropriate features’ values. Of particular use
is the SICStus Constraint Handling Rules (CHR; Frühwirth and Abdennadher, 1997) library,
which has the ability not only to suspend, but to
suspend until a particular variable is instantiated
or even bound to another variable. This is the
powerful kind of mechanism required to check
these constraints efficiently, i.e., only when nec-
essary. Re-entrancies in a Prolog term encoding
of feature structures, such as the one ALE uses
(Penn, 1999), may only show up as the binding
of two uninstantiated variables, and re-entrancies
are often an important case where these con-
straints need to be checked. The details of this
reduction to constraint handling rules are given in
Penn (2000b). The relevant complexity-theoretic
issue is the detection of deranged types.
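The behaviour of such a suspended goal can be imitated outside Prolog. The Python sketch below is not the CHR encoding of Penn (2000b); it only illustrates, for the deranged type verb of Figure 6, the three outcomes a woken goal can have: failure, deterministic resolution to one subtype, or renewed suspension. The table, the function name and the representation of partially known values as sets of maximal types are all hypothetical.

```python
# Value requirements of verb's maximal subtypes, as sets of maximal value types.
SUBTYPES = {
    "aux_verb": {"aux": {"+"}, "inv": {"+", "-"}},
    "main_verb": {"aux": {"-"}, "inv": {"-"}},
}

def check_verb(aux, inv):
    """aux and inv are the sets of values still possible for each feature.

    Returns ('fail', None) if no subtype is compatible, ('resolve', subtype)
    if exactly one is, and ('suspend', None) if the goal must keep waiting.
    """
    live = [name for name, req in SUBTYPES.items()
            if aux & req["aux"] and inv & req["inv"]]
    if not live:
        return ("fail", None)
    if len(live) == 1:
        return ("resolve", live[0])
    return ("suspend", None)

print(check_verb({"+", "-"}, {"+", "-"}))   # nothing known yet
print(check_verb({"+", "-"}, {"+"}))        # INV has been refined to +
print(check_verb({"-"}, {"+"}))             # the unattested combination
```

In a real implementation the function would be re-run whenever one of the feature values is refined or bound to another variable, which is the wake-up condition described above.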
4.3 Detecting Deranged Types
The detection of deranged types themselves is
also a potential problem. This is something that
needs to be detected at compile-time when sub-
type covering constraints are generated, and as
small changes in a partial order of types can have
drastic effects on other parts of the signature be-
cause of appropriateness, incremental compila-
tion of the grammar signature itself can be ex-
tremely difficult. This means that the detection of
deranged types must be something that can be per-
formed very quickly, as it will normally be per-
formed repeatedly during development.
A naive algorithm would be, for every type, to expand the product of its features’ appropriate value types into the set, $A$, of all possible maximally specific products, then to do the same for the products on each of the type’s $i$ maximally specific subtypes, forming sets $B_i$, and then to remove the products in the $B_i$ from $A$. The type is deranged iff any maximally specific products remain in $A \setminus (\bigcup_i B_i)$. If the maximum number of features appropriate to any type is $a$, and there are $t$ types in the signature, then the cost of this is dominated by the cost of expanding the products, $t^{a}$, since in the worst case all features could have $\bot$ as their appropriate value.
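The naive procedure can be sketched in Python on the signature of Figure 6. The tables below transcribe that figure by hand; the table and function names are hypothetical, and the sketch handles only the features of the type being tested.

```python
from itertools import product

# Maximal subtypes of each value type in Figure 6.
MAXIMAL = {"bool": {"+", "-"}, "+": {"+"}, "-": {"-"}}
# Value restrictions after upward closure.
APPROP = {
    "verb": {"aux": "bool", "inv": "bool"},
    "aux_verb": {"aux": "+", "inv": "bool"},
    "main_verb": {"aux": "-", "inv": "-"},
}
MAXIMAL_SUBTYPES = {"verb": ["aux_verb", "main_verb"]}

def products(t, feats):
    """All maximally specific value tuples for `feats` at type t."""
    return set(product(*(sorted(MAXIMAL[APPROP[t][f]]) for f in feats)))

def uncovered(t):
    """Products at t (the set A) not attested by any maximal subtype (the B_i)."""
    feats = sorted(APPROP[t])
    A = products(t, feats)
    for s in MAXIMAL_SUBTYPES[t]:
        A -= products(s, feats)
    return A

print(uncovered("verb"))
```

The single product that remains, with AUX valued $-$ and INV valued $+$, is the unattested combination that makes verb deranged.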
A less naive algorithm would treat normal (non-
deranged) subtypes as if they were maximally spe-
cific when doing the expansion. This works be-
cause the products of appropriate feature values of
normal types are, by definition, covered by those
of their own maximally specific subtypes. Maxi-
mally specific types, furthermore, are always nor-
mal and do not need to be checked. Atomic types
(types with no appropriate features) are also triv-
ially normal.
It is also possible to avoid doing a great deal of the remaining expansion, simply by counting the number of maximally specific products of types rather than by enumerating them. For example, in Figure 6, main_verb has one such product, AUX:$-$ $\wedge$ INV:$-$, and aux_verb has two, AUX:$+$ $\wedge$ INV:$+$, and AUX:$+$ $\wedge$ INV:$-$. verb, on the other hand, has all four possible combinations, so it is deranged. The resulting algorithm is thus given in Figure 8. Using the smallest normal subtype cover that we have for the product of $t$’s feature values, we iteratively expand the feature value products for this cover until they partition their maximal feature products, and then count the maximal products using multiplication. A similar trick can be used to calculate $\mathit{maximal}$ efficiently.
For each type, $t$, in topological order (from maximally specific down to $\bot$):
• if $t$ is maximal or atomic then $t$ is normal. Tabulate $\mathit{normals}(t) = \{t\}$, a minimal normal subtype cover of the maximal subtypes of $t$.
• Otherwise:
1. Let $N = \bigcup_{s \in \sigma(t)} \mathit{normals}(s)$, where $\sigma(t)$ is the set of immediate subtypes of $t$.
2. Let $a$ be the number of features appropriate to $t$, and let $\Pi = \{\langle s_1, \ldots, s_a \rangle \mid s_i = \mathit{Approp}(F_i, s),\ \mathit{Approp}(F_i, t)\downarrow,\ s \in N\}$.
3. Given $\pi_1, \pi_2 \in \Pi$ such that $\pi_1 \sqcup \pi_2\downarrow$ (coordinate-wise):
- if $\pi_1 \sqsubseteq \pi_2$ (coordinate-wise), then discard $\pi_2$,
- if $\pi_2 \sqsubseteq \pi_1$, then discard $\pi_1$,
- otherwise replace $\{\pi_1, \pi_2\}$ in $\Pi$ with: $\{\langle u_1, \ldots, u_a \rangle \mid u_i$ immed. subtype of $s_i$ in $\pi_1\} \cup \{\langle u_1, \ldots, u_a \rangle \mid u_i$ immed. subtype of $s_i$ in $\pi_2\}$.
Repeat this step until no such $\pi_1, \pi_2$ exist.
4. Let $\Delta = \prod_{F : \mathit{Approp}(F, t)\downarrow} \mathit{maximal}(\mathit{Approp}(F, t)) - \sum_{\langle u_1, \ldots, u_a \rangle \in \Pi} \prod_{1 \le i \le a} \mathit{maximal}(u_i)$, where $\mathit{maximal}(s)$ is the number of maximal subtypes of $s$.
5. if $\Delta \neq 0$, then $t$ is deranged; tabulate $\mathit{normals}(t) = N$ and continue. Otherwise, $t$ is normal; tabulate $\mathit{normals}(t) = \{t\}$ and continue.
Figure 8: The deranged type detection algorithm.
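The counting idea can be sketched in Python for the signature of Figure 6. This is a simplification of Figure 8 rather than an implementation of it: it assumes that the feature value products of the subtypes are already pairwise disjoint, so that step 3 of the figure has nothing to do, and all table and function names are hypothetical. The second table is an invented variant of the signature, included only to show a normal type.

```python
from math import prod

# Number of maximal subtypes of each value type in Figure 6.
MAXIMAL_COUNT = {"bool": 2, "+": 1, "-": 1}

FIGURE_6 = {
    "verb": {"aux": "bool", "inv": "bool"},
    "aux_verb": {"aux": "+", "inv": "bool"},
    "main_verb": {"aux": "-", "inv": "-"},
}
# Hypothetical variant in which main_verb leaves INV unrestricted.
VARIANT = dict(FIGURE_6, main_verb={"aux": "-", "inv": "bool"})

def maximal_products(t, approp):
    """Number of maximally specific feature value products at type t."""
    return prod(MAXIMAL_COUNT[v] for v in approp[t].values())

def deranged_by_counting(t, subtypes, approp):
    """True iff some product at t is not covered (assumes disjoint subtype products)."""
    covered = sum(maximal_products(s, approp) for s in subtypes)
    return maximal_products(t, approp) - covered != 0

print(deranged_by_counting("verb", ["aux_verb", "main_verb"], FIGURE_6))
print(deranged_by_counting("verb", ["aux_verb", "main_verb"], VARIANT))
```

For Figure 6 the difference is $4 - (2 + 1) = 1$, so verb is deranged; in the variant it is $4 - (2 + 2) = 0$, so verb would be normal.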
The complexity of this approach, in practice, is much better: $O(t\,b^{da})$, where $b$ is the weighted mean subtype branching factor of a subtype of a value restriction of a non-maximal non-atomic type’s feature, and $d$ is the weighted mean length of the longest path from a maximal type to a subtype of a value restriction of a non-maximal non-atomic type’s feature. In the Dedekind-MacNeille
completion of LinGO’s signature, $b$ is 1.9, $d$ is 2.2, and the sum of $b^{da}$ over all non-maximal types with arity $a$ is considerably smaller than the sum of $\mathit{maximal}^{a}(t)$ over every non-maximal type, $t$. Practical
performance is again much better because this al-
gorithm can exploit the empirical observation that
most types in a realistic signature are normal and
that most feature value restrictions on subtypes do
not vary widely. Using branching factor to move
the total number of types to a lower degree term is
crucial for large signatures.
5 Conclusion
Efficient compilation of both meet-semi-
latticehood and subtype covering depends
crucially in practice on sparseness, either of
consistency among types, or of deranged types,
to the extent it is possible at all. Closure for
unique feature introduction runs in linear time in
both the number of features and types. Subtype
covering results in NP-complete non-disjunctive
type inferencing, but the postponement of these
constraints using constraint handling rules can
often hide that complexity in the presence of
other principles of grammar.

References

H. Aït-Kaci, R. Boyer, P. Lincoln, and R. Nasr. 1989.
Efficient implementation of lattice operations. ACM
Transactions on Programming Languages and Sys-
tems, 11(1):115-146.

K. Bertet, M. Morvan, and L. Nourine. 1997. Lazy
completion of a partial order to the smallest lattice.
In Proceedings of the International KRUSE Sympo-
sium: Knowledge Retrieval, Use and Storage for Ef-
ficiency, pages 72-81.

B. Carpenter and P.J. King. 1995. The complexity
of closed world reasoning in constraint-based grammar theories. In Fourth Meeting on the Mathematics of Language, University of Pennsylvania.

B. Carpenter and G. Penn. 1996. Compiling typed
attribute-value logic grammars. In H. Bunt and
M. Tomita, editors, Recent Advances in Parsing
Technologies, pages 145-168. Kluwer.

B. Carpenter. 1992. The Logic of Typed Feature Struc-
tures. Cambridge.

B. A. Davey and H. A. Priestley. 1990. Introduction
to Lattices and Order. Cambridge University Press.

G. Erbach. 1994. Multi-dimensional inheritance. In
Proceedings of KONVENS 94. Springer.

D. Flickinger et al. 1999. The LinGO English
resource grammar. Available on-line from

A. Fall. 1996. Reasoning with Taxonomies. Ph.D. the-
sis, Simon Fraser University.

T. Frühwirth and S. Abdennadher. 1997. Constraint-Programmierung. Springer Verlag.

M. Habib and L. Nourine. 1994. Bit-vector encoding for partially ordered sets. In Orders, Algorithms,
Applications: International Workshop ORDAL '94
Proceedings, pages 1-12. Springer-Verlag.

R. Malouf, J. Carroll, and A. Copestake. 2000. Efficient feature structure operations without compilation. Journal of Natural Language Engineering,
6(1):29-46.

G. Penn. 1999. An optimized prolog encoding of
typed feature structures. In Proceedings of the
16th International Conference on Logic Programming (ICLP-99), pages 124-138.

G. Penn. 2000a. The Algebraic Structure of Attributed
Type Signatures. Ph.D. thesis, Carnegie Mellon
University.

G. Penn. 2000b. Applying Constraint Handling Rules to HPSG. In Proceedings of the
First International Conference on Computational
Logic (CL2000), Workshop on Rule-Based Constraint Reasoning and Programming, London, UK.

C. Pollard and I. Sag. 1994. Head-driven Phrase
Structure Grammar. Chicago.

I. A. Sag. 1997. English relative clause constructions.
Journal of Linguistics, 33(2):431-484.
