UNIFYING DISJUNCTIVE FEATURE STRUCTURES 
LENA S'FROMBACK 
Deparlment of Coinputer ,and htforntatiol~ Science 
Link0ping University 
S-58183 LinkOping, Sweden 
Telephone +46 13282676 
elnail: lcsti(Wida, liu.se 
Abstract 
This paper describes an algorithm for unifying dis- 
junctive feature structnres. Unlike previous algo- 
rithms, except Eisele & l)6n'e (1990), this algorithm 
is as fast as an algorithm withont disjunction when 
disjunctions do not participate in the unification, it is 
also as fast as an algorithm handling only local dis- 
junctions when there are only local disjunctions, and 
expensive only in tile case of unifying fnll disjunc- 
tion. The description is given in the fiamework of 
graph unification algoritbnls which ulakes it easy to 
implement as an extension of such an algorithm. 
1 Introduction 
Disjunction is all important extension to feature 
structure languages since it increases the compact- 
uess of the descriptions. The mmn problem with in- 
cluding disjunction in tile structures is that the 
unification operation becomes NP-complete. There- 
lore there have been many proposals on how to uni- 
fy disjunctive feature structures, the most important 
being Karttunen's (1984) unification with con- 
straints, Kasper's (1987) unification by successive 
approximation, Eisele & D0rru's (1988) value unifi- 
cation and lately Eisele & D0rre's (1990a, b) unifi- 
cation with named disjunctions. Since Kasper's and 
Eisele & D0rre's algorithms seem to be more gener- 
al and efficient than Karttunen's algorithm I will re- 
strict my discussion to them. 
hi Kasper's algorithin the structures to be unified 
are divided into two parts, one that does not contain 
any disjunctions and one that is a conjunction of all 
disjunctions in the structure. Tile idea is to unify the 
non-disjunctive parts first and then unify the result 
with the disjunctions, thus trying to exclude as many 
alternatives as possible. The last step is to compare 
all disjunctions with each other, making it possible 
to discard further alternatives. At is this comparison 
that is expensive. The algorithm is always expensive 
for disjunctions, regardless of whether they coutain 
path equivalences or not and independent of wheth- 
er they are affected by the unification or not. This is 
due to the representation, where all disjunctions are 
moved to the top level of the strncture, which means 
that larger parts of the structures are moved into the 
disjunctions and must be compared by the algo- 
rithm. Carter (1990) has made a development of this 
algorithm which improves the efficiency when nsed 
together with bottom-up parsing. 
Eisele & D01Te'S (1988) approach is based on the 
fact that unification of path equivalences should re- 
turn uot only a local value, but also a global value 
that affects some other part of the struetm'e. Their 
solution is to compute tbe local value and save tile 
global value a~s a global Jesuit. The global results 
will be unified with the result of the first unification. 
This new unification can also generate a new global 
disjunction so that the unification with global results 
will be repeated until no new global result is gener- 
ated. This solution generates at least one, but otten 
more than one, exUa nnification for each path equiv- 
alence. Thus, tile algorithm is always expensive for 
path equivalences, regardless of whether they are 
contained inside disjuncttous or not. 
Tbe approach taken by Eisele & D0rre (1990) is 
similar to file approach taken in tills paper. They use 
'nmned disjunction' (Kaplan & Maxwell 1989) and 
one of their central ideas i.e. to use a disjunction as 
the value of a variable to decide when the value is 
dependent on the choice in some disjunction is simi- 
itu" to the way of unifying variables in the present 
paper, ltowevcr, they use feature terms for repre- 
setmug the structures and their algorithm is de- 
scribed by a set of rewrite rules lot feature terms. 
This makes the algorithm different from algorittuns 
described for graph unification. 
What is special with the algorithm in the present 
paper is filat it is 
1. As efficient as au algorithm not handling disjunc- 
tion wlleu the participating structures do not con- 
tain any disjuuclions. 
2. As efficient as an algorithm allowing only local 
disjunctions when the participating structures 
only contain such disjunction. 
3. Expensive only when non-local disjunction is in- 
volved. 
The description is given in a way that makes the 
algorithin easy to implement as an extensron of a 
graph unification algorithm. 
2 The Formulas 
Feature structures are represented by fornlulas. The 
syntax of the formulas, especially the way of con- 
structing complex graphs, is chosen so as to get a 
close relation to feature st~ uctmes. This also makes 
it easy to construct a unification procedure s~milar to 
ACTES DE COLING-92, NANTES. 23 28 AOt';r 1992 116 7 PROC. O1: COLING-92. NANTES, AUG. 23-28, 1992 
graph unification and give the formulas a semantics 
based on graph models. For disjunction a generali- 
zation of Kaplan & Maxwell's (1989) 'named dis- 
junction' is used. Their idea is to give the 
disjunctions names so that it is possible to restrict 
the choices in them. Kaplan and Maxwell use only 
binary disjunctions, and if the left alternative in one 
disjunction is chosen the left alternative in all dis- 
junctions with the same name has to be chosen. In 
this paper I do not restrict the algorithm to binary 
disjunctions. Instead of giving the disjunction a 
name I give each alternative a name. Alternatives 
with the same name are then connected so that if 
one of them is chosen we also have to choose all the 
others. 
We assume four basic sets A, F, X and E of atoms, 
feature attributes, variables and disjunction switches 
respectively. These sets contain symbols denoted by 
strings. They are all assumed to be enumerable and 
pmrwise disjoint. From these basic sets we define 
the set S of feature structures. S contains the follow- 
ing structures: 
• T : no information 
• .L : failure 
• aforallaE A:atoms 
• x for all x E X : variables 
• \[ft:sl ..... fn:sn\] for anyf i E F, s i E S, n > 0 such 
that fr- ~ for i~j: complex feature slructure 
• {ot:Sl,...,On:Sn}fOranyoiE )2, siE S, n20 
such that of,-~crj for i~j : disjunction 
A formula is defined to be a pair (s, v) where s is a 
feature structure and v:X-)S a valuation function 
that assigns structures to variables. We demand that 
the formulas are acyclic. 
An example of a formula is given in figure 1. Var- 
iables are denoted by using the symbol # and a 
number. The same formula is also given in matrix 
format which will be used to make the examples 
easier to read. 
(\[a: \[e:#1\],b:3,c:#l\], {(#1, \[d:4\])} 
Figure 1 
We can observe that according to this definflion 
formulas are not unambiguously determined. The 
same formula can for example be expressed with 
different variables. There is also nothing said about 
the value of the valuation function v for variables 
not occurring in the formulas. 
3 Semantics 
The semantics given for these formulas is similar to 
the one given by Kasper & Rounds (1986) for their 
logic of feature structures. This logic is modified in 
the same way as in Reape (1991) to allow for the 
use of variables instead of equational coustraints as 
used by Kasper and Rounds. As Kasper and Rounds 
I will use a graph model for the formulas where 
each formula is satisfied by a set of graphs. I will 
use b to denote the transition function between 
nodes in the graph. We also need to define a valua- 
tion to describe the semantics of variables. Given a 
graph a valuation is a function V:X-->N. By this 
fnnction every variable is assigned a node in the 
graph as its value. 
Satisfaction is defined by the following rules. The 
model M = (G, V, L) where G is a graph, V a valua- 
tion and L a subset of the switches occurring in the 
formula.satisfies a formula at node i iff it fulfils any 
of these cases. 1 will use the notion sat(i) if node i in 
the graph satisfies a formula. 
• M sat(i) {T, v) for all v 
• M sat(i) (t, v) for no v 
• M sat(i) (a, v) iff node i in G is the leaf a E A 
• M sat(i) (x, v) iff V(x)=i and M sat(i) (v(x), v) 
• M sat(i) (\[fl:Sl ..... fn:Sn\], v) iff for all k = 1 ... n 
~(if~z)=jk and M sat(jr :) (s k, v) 
• M sat(i) ({ol:s I ..... On:Sn}, v) iffprecisely one of 
o t ... o n is in L and M sat(i) (s k, v) for k such that 
OkE L 
These rules correspond to the usual sansfaction 
definitions for feature structures. The snbset of 
switches L forces us to choose exactly one alterna- 
tive in each disjunction and the model should satisfy 
this alternative. 
4 Unification 
in this section I will define a set of rewrite rules for 
computing the unification of two formulas. 1 will 
start by inu'oducing the operator ^ into our formu- 
las. The syntax and semantics is given by the fol- 
lowing rules: 
• M sat(i)fst/,fs;~ ifffs I andfs 2 are formulas and M 
sat(i) fs I and M sat(i) fs 2 
• M sat(i) (SlAS2, v) iffM sat(i) (s 1, v} and M sat(i) <s2, 
v) 
The operator ^ can be viewed as the unification 
operator. By the definition we can see that it is inter+ 
preted as a conjunction or intersection of the two 
participating formulas, which is the normal interpre- 
tation of unification. The task of unifying two for- 
mulas is then the task of rewriting two tormulas 
containing ^ into a formula not containing A. Here 
we can note that since a formula is not unambigu- 
ously determined the unified formula is not unique. 
Actually there is a set of formulas that all have the 
AerE.s DE COLING-92, NANTES. 23-28 AOt~'r 1992 l 1 6 8 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
same model as the unification of the l"ormulas. The 
aim here is to compute one of these formulas as a 
representative for this set, and thus a representative 
for the unification offs t and fs 2. The rewrite rules 
given below correspond to the unification algorithm 
for formulas not containing disjunction. 
1, (s t, vl)A(s 2, v,~) ~ (slAs 2, v) if v I and v 2 are dis- 
joint and v(x)=vl(x) for all x in v 1, v(x)=vHx) for 
all x in v),. 
2. (~/~2, v)~(s2~ 1, v) 
3. (T,~, v) ~ (s, v) 
4. (aAa, v) r. (a, v) where aEA 
5. (a/49, v) r. (±, v) where ae:b and a,bEA 
6. (a^lfl:Sl..~,:sn\], v) ~ (.1_, v) where a6 A 
7. <_t~,, v> ~ <±, v> 
8. <xm, v> - <x, v~> 
where (v(x)^s, v) ~ (s t , Vl) and xeX, v;~(x)=s I 
and v2=v I for all other variables 
9. (\[fll:Sll"fln:Sln\]m\[f21:s21"f2m:SemI, v)~ (s, v e ) 
where s is tim complex feature structure contam- 
ing: 
fl,:suj for anyj such that fLr~fek for all k 
-~-~f)):s)i for anyj such that/2f,-eflt_ _ for all k 
f lj.'S3i for any j,k such that f u=f,~ t where (s ljAs2t, 
V(F1) ) ~ (S3i, Vi) 
and i describes some enumerauon of the result- 
ing formulas vo=v and <x3p, vp) is the last of tim 
formulas. 
The first rule is a kind of entry rule and can be in- 
terpreted as saying that it is possible to unify two 
formulas if the variables occumng within them are 
disjoint. "l~le second rule says that unification is 
commutative, and are used to avoid duplicating the 
other rules. The next rule says that T unifies with 
everything. Rules four to six says that an atom only 
unifies with itself and becomes failure when unified 
with some other atom or a complex structnre. The 
seventh rule says that unifying failure always yields 
failure. The eighth rule deals with unification of var- 
iables. Here we have to start with unifying the value 
of the variable with the other saucmre. This unifica- 
tion gives a new pair of feature structure and valua- 
tion function as result where the new valuation 
function contains the changes of variables that have 
been made during this unification. The result of the 
unification of a variable is the pair of the variable 
and the new valuation function where the value of 
the variable is replaced with the unified one. Rule 
nine deals with the umfication of two complex fea- 
ture structures and says that the result is the struc- 
ture obtained by unifying the values of the common 
attributes of the two structures and then adding all 
atmbutes that occurs in either of the structures to the 
result. 
Figure 2 gives an example that illustrates what 
modifications that must be made to the rewrite rules 
to be able to handle unification of disjunction. Uni- 
fying a disjunction is basically unifying each of its 
alternatives. But the exmnple also shows what mnst 
o{ol 
.2:#libel ^I °:\[)' 
lb.. #1 
I (/: I 
ib: 
o1: 
02:#1 
#1 
C.' I \[e t 
\[a. 31 
Figme 2 
happen if a variable occurs within the disjuncUon. 
The value of the variable is global sitice it can affect 
parts of the structure outside the disjunction. There- 
fore this value must be dependent on what alterna- 
tive that is chosen m the disjunction. This is done by 
representing the value of the variable as a new dis- 
junction where we only choose the unified value if 
the alternative o 7 is chosen, qb express this in the 
rewrite rule we index all rules by the list of switches 
that are Uaversed in the formula. This is expressed 
by replacing the m with __x in all rules where X is a 
list of the switches passed to reach this point of the 
unification. We also need to split rule 8 into two 
rules depending on if any disjunctions have been 
passed to reach the variable. The new rules are giv- 
en below and we assume that the switches occurring 
in each formula are unique. 
8.a(xm, v) ~0 (x, v~_:{'.st)) 
where (v(x)^s, v) ~ (s t, v 1) x~X, vHx)=s I and 
v2=v I for all other variables 
8.b(x,~', v) ~ lot .... 'O(x, v~) , 
where (v(x)^s, v) ~ol ... om (si, Vl), xCX, 
vJx)={ol :l o2: I...\[ o~:sl o ..... :v(x)... I 
Onow2: v(x) } cr new I : v(x) }, v~ = v I for all other vari - 
ables and Onewi is a switch name not used before. 
10.({Ol:Sll...On:Sln}AS, v).,----X,~ ({Ol:S21...Crn:S2n }, Vn) 
where (Sli^S, v(i 1)) i~°lu'~ (s2i, vi) and v o =v 
In StrOmblick (1991, 1992) these rewrite rules are 
proved to compute the unification of two foimulas. 
5 Discussion 
The syntax and semantics of the formulas are very 
similar to what is given in Reape (1991 pp 35) 
which is a development of the semantics given in 
Kasper & Rounds (1986) that allows the use of vari- 
ACrEs DE COLING-92, NANa~2S, 23-28 XOt;r 1992 l 1 6 9 PROC. OF COLING-92, NANTES, AUO. 23-28, 1992 
ables to express equational constraints. The differ- 
ence is that I use formulas of the form \[/l:sl...f,:sn\] 
instead of an ordinary conjunction and that we use 
named disjunction. This restricts the syntax of the 
formulas somewhat and makes them closer to ordi- 
nary feature structures. The restricted syntax is also 
the reason why we need to include a valuation func- 
tion in the formulas. 
It is easy to represent the formulas as ordinary di- 
rected acyclic graphs where variables are represent- 
ed as references to the same substructure in the 
graphs. If we think of the formulas as graphs it is 
also easy to compare the rewrite rules 1-9 above 
with an ordinary graph unification algorithm. Doing 
this we can conclude that each of the rewrite roles 
three to nine corresponds to a case in the unification 
algorithm. The only difference is that when varia- 
bles are represented as reentrant subgraphs we never 
have to look-up the variable to find its value. The 
main advantage with defining unification by a set of 
rewrite rules is that the procedure can be proved to 
be correct. 
6 Detection of failure and 
improvements 
The problem with the rewrite rules is that they 
sometimes produces formulas which have no model. 
Such formulas must be detected in order to know 
when the unification fails. As long as the formulas 
only contain local disjunction this is not a problem 
and it is easy to change the rewrite rules in order to 
propagate a failure to the top level in the formula. 
The ninth rule is, for example, changed to return (.±, 
vp) whenever any of the values of the attributes in 
the resulting formula is fail. 
When nonlocal disjunction is included we must 
find some of keeping track of which choices of 
switches in the disjunctions that represent a failure. 
This can be done by building a tree where the paths 
represents possible choices of switches and the leaf 
nodes in the tree contains a value that is false if this 
choice represents a subset of switches for which the 
formula has no model and true otherwise. Figure 3 
shows an example of a formula and its correspond- 
ing choice tree. To reach the leaf b in the tree the 
switches 0.1, 03, and crn have been chosen and or2, 
0.4, and 03 have not. So 0.3 is both chosen and not 
chosen and the value of this leaf must be false. Con- 
tinuing this reasoning for the other paths in the tree 
we could see that the leafs b, e, and f must have the 
value false and the other leafs must have the value 
true. If some value of an alternative is .1_ the corre- 
sponding leafs in the choice tree must be false. If 
we, for example assume that the value of or4 is fail 
we must assign false to the leafs c,f, and g. 
Choice trees can be built ones for each formula 
and merged during the unification of formulas. A 
better solution is to only build the choice trees when 
they are needed, i.e. when a disjunction alternatave 
O2:#1 { O3: ... } 
On.' 
{ O3: ~l } 
04: 
03 f:¢," true a 
03. "~n "",~- false O 
ol ~-. 
/" o4 "~- true e / 
~\ 03 ~" trlte d 
,,2 "C~"'~-*-I alse, 
.. 03 ._~- false f 
on" J'~ true g 
Figure 3 
where the disjunction shares some switch name with 
another disjunction fails. If this is done we only 
have to do the expensive work when really needed 
which is when we have failure in a non-local dis- 
junction and achieves a better performance of the al- 
gorithm for all other cases. 
Str6mhiick (1991, 1992) discusses how the choice 
tree is best used. The papers also discuss how the 
choice tree can be used to remove failed alternatives 
from a formula without destroying the interpretation 
of the formula. The main idea here is to see what 
switches that must be chosen to reach each disjunc- 
tion alternative in the formula. For this set of 
switches we find all leafs in the choice nee that can 
be reached if these switches are chosen. If all these 
leafs are false the alternative should be removed. 
For example, if we assume that the value of 0.4 in 
figure 3 is fail and that we have assigned false to the 
corresponding leafs in the choice tree, we can also 
see that there is no way of reaching a leaf with the 
value true if we have to choose tin. In this case we 
can as well remove both 04 and on from the feature 
StrUCture. 
The two papers mentioned above also discuss 
vmious improvements that can be made in order to 
get a more efficient algorithm. Most important here 
is that we can build only parts of the choice tree and 
that the notion of switches for a disjunction can be 
extended to allow sets of switches in order to avoid 
creating too many new disjunctions. 
7 Implementation 
The algorithm has been implemented in Xerox 
Common Lisp and is running on the Sun Sparcsta- 
tions. 
ACTES DE COLING-92. NANTES, 23-28 AOt~'r 1992 1 l 7 0 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
8 Complexity 
To analyze the complexity of this algorith m 1 will 
look at threc cases. If we assume that there are no 
disjunctions in the formulas the procedure can be 
implemented almost linearly. If we have local dis- 
junction in the formulas, i.e. disjunctions which do 
not contain variables and which not are connected 
by switch nantes, the total complexity becomes ex- 
ponential on the maximum depth of disjunctions oc- 
curring within each other. For the third case we have 
to add the complexity for the removal strategies 
when alternatives have failed. The complexity for 
this procedure is also exponential in the size of a", 
where a is the total nnmbcr of alternatives OccutTing 
in the formulas. For a more complete discussion of 
the complexity see StrOmhack (1991, 1992) 
When considering complexity one must remem- 
ber that the second case will only be pcrforn~ed 
when there are disjunctions in the formula and when 
these disjunctions are actually affecWA by the unifi- 
cation. Disjunctions in some subpart of the formula 
not affected by the unification never affect the com- 
plexity. It is also reasonable to assume that m most 
case.q when a disjunction really participates in the 
unification, some of its alternatives will be removed 
due to failure. The same thing holds for the last 
case; it will only be performed when some global al- 
ternative has failed. This means that this procedure 
can at most be performed once for each ordinary al- 
ternative in the initial formulas. 
Comparing this to the other proposed alternatives 
we can see that Kasper's (1987) algorithm has a bet- 
ter worst case complexity (2a/2). On the other hand 
this complexity holds for all disjunctions in the 
structure regardless of whetlmr they arc ',fffected by 
the unification or not. The algorithm by Eisele & 
D0rre (1988) has a similar worst case complexity. 
The disadvantage here is that this 'algorithm is ex- 
pensive even if the structures do not contain any dis- 
junctions at all. The third algorithm (Eisele & D~3rre 
1990a, b) will also be NP-complete in rite worst case 
and will probably have a stinilar performartce com- 
pared to the algorithm descritxxl in this paper. 
9 Conclusion 
This paper describes an algot~ithm for unifying 
disjunctions which calls for as little computation as 
possible for each ease. Disjunctions only affect the 
complexity when they directly parucipate ~ are 
affected by the unification, which is the only case 
when we expand to disjunctive normal form. The 
most expensive work is done only when there is a 
failure in a disjunction which affects some other part 
of the structure. The only algorithm that shows sim- 
ilar complexity is the algorithm proposed by Eisele 
& D0rre (1990). However the description given by 
Eisele and DOne is harder to relate and implement 
as a graph unification algorithm. This paper shows 
that it is possible to use Snililar ideas together with 
graph unification. The de,,;cription given here is fair° 
ly easy to implement as mi extension of a graph uni~ 
ficatiou algorithm. 
Acknowledgements 
This work is part of the project I)ynamic ! ,anguage 
tlndcrstanding suppmted by the Swedish Council 
for Research in the Itumauities and the Swedish 
Bored for Industrial and "l~echnical Development. 1 
would also like to thank Lars Ahrcnberg and "lhre 
Laugholln for valuable comments on this work. 
References 
Crater, David (1990). Efficient Disjunctive Unification 
for Bottom-Up l'msmg. Proe. 13th International Confer- 
ence on Computational Linguistics, vol. 3, pp 70-75. 
Eisele, Andreas and Jochen D0rre (1988). Unification of 
Disjunctive Feattne Descriptions. Proc. 26th Annual 
Meeting of the Association fi~r Computational Linguis- 
tics, pp 286-294. 
Eisele, Andreas and Jochen DOtre (1990a). Disjt/r~ctive 
Unification. IWBS Rep()rt 124, IWtIS, IBM Deutsehlat~d, 
W. Gemmny, May 1990. 
Eisele, Andreas eald Jochen DOne (19tX)b). Feature Logic 
with Disjtmctive Unification. Proe. 13th International 
Conference on Compntatiomll Linguistics, vol. 2, pp 100o 
105. 
Kalttnnen, Lauri (1984). Featttres and Values. lOth Inter- 
national Conference on Computational l dnguistics122nd 
Annual Meeting of the Association for Computational 
Linguistics, Stanford, California, pp 28-33. 
Karttenen, Lauri (1986). D-PATR: A Developnlent EaWl- 
ronment for Unification Based Grammars. Proc. llth In- 
ternational Conference on Computational Linguistics, 
Bonn, Federal Republic of Gemmny, pp 74~80. 
Kaplall, Ronald M. mid John T. Maxwell lit (1989). An 
Overview of Disjunctive Constraint Satisfaction. Proc. 
International Workshop on I'arsing Technologies, Pitts- 
bulgh, Pennsylvania, pp 18-27. 
Kasper, Robert T. (1987). A ihtihcatien Method for Dis- 
junctive Feature Descriptions. 25th Annual Meeting & 
the Association for Computational Linguistics. pp 235- 
242. 
Reape, Mike (199 l). An Introduction to file Semantics of 
Unification-Based Grmnmar Formalisms. Deliverable 
R3.2.A DYANA - ESPRIT B~ic Research Action BR 
3175. 
Rotmds, Willianl C. and Robert Kasper (1986). A Com- 
plete Logical Calculus for Record StmcttHes Represent- 
ing Linguistic Information. Proe. Symposium on Logic in 
Computer Science, Cambridge Massachusetts, pp 39 - 43 
Str0mback, Lena (1991). Unifying Disjuucti ve Feature 
Structures. Teclmical Report LiTH-1DA-R-91-34, l~e- 
partmeslt of Computer and lnfommtion Science, 
LinkOping Univelsity, Link0pixlg, Sweden. 
Str~,mbitck, Lena (1992). Studies in Extended Uni\]ication 
Formalisms for Linguistic Description. Licentiate thesis. 
Depmtment of Compute; and hfformation Scie~:e, 
Link6ping University, LinktJping, Sweden. 
ACRES Ul! COLING-92, NANTES, 23-28 AO(7l' 1992 l 1 7 1 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
