"YD/2---A Type Description Language 
for Constraint-Based Grammars 
Hans-Ulrich Krieger, Ulrich Scldifer 
{krieger,schaefer}@dfki.uni-sb.de 
(\]CITIIIaAt Research (Je, d;cr tbr Artificial lnte, lligcnc(; (I)FI(I) 
Stuhlsm;zenhmtsweg 3, 1)-(;(;\] 23 S~m'l)rii(:kcn, Gc~ llla,lly 
Abstract 2 Motivation 
This paper presents "FD~g, a typed feal, ure-bascd rein'c- 
sent, el;ion language att(l ild'erence system. Type defini- 
tions in TDL consis~ of type and feature constraints over 
the Imolean cmme(Mves. TD£ supports open- and closed- 
worM reasoning over types and allows for par|;itions and 
iueompal;i\[)le I.ypes. Working with l)artially as well as 
with lhlly expanded types is possible. E\[lieienl. reasoning 
in "\]'1)12 is accomplished through Sl)Ccialized mtMules. 
Tot)teal Pat)er. 7t~pie Area: sofl,w;u'e fin" NI,I', gram-- 
mar \['t)i'midism for Lyl)ed \['e~H.Hi't: st,rut:lures. 
1 Introduction 
Over l;he last lb.w years, eonsl;raint-based grammar 
tbrmalisms have become the predominmtt t)ar;tdigm 
in natural language processing ~uld (:oulptltal;iollal 
linguistics. Their success stems from tim feet that 
I;hey e~LIl be seen as ;t ttioltoLoIti(:, highqe.vel rel)resew- 
I;ation language for linguistic knowledge which can be 
given tt l)reeise m~d;hetn~d:ical semantics. The mMn 
idea of representii~g as much linguistic knowledge as 
possit)h~ through a mfique dater type <died fi'atu*'e 
struetur,, Cdlows the inl,egl:,ttion of differenl; des(:l;i 1) 
lion levels withoul, taking care of interface probh!lnS. 
While the tirst N)l)roaehes relied (m almotate(t I)hrase 
sLIJll(:l;lll?e FIlles (e.g., PAr\['\[{, • \[l), 1110(1(21711 formalisms 
try l,o specify grmnmal, ical knowledge as well as lexi- 
con entries entirely through feature, sLruetures, h, or- 
der to au:hieve t, his goal, one must enrich the exl)res- 
sive power of the lirst imilication-based formMisms 
with different forms of disjunctive deseril)tions, l,at- 
er, oLher operations came int,o play, e.g., (classical) 
negation. Ol,her proposals consider the integration of 
funcl;ionM/relatiomd del)endencies into t,he \[brmMism 
which make them in gelleral 'l'uriug-c.omph2te ((e.g., 
AI, I'; \[4\]). However l,he mosl; important ext(msion 1;o 
\['ormalisms eonsisl;s of tire incorporation of types, lbl: 
instance in modern systems like TI"S \[15\], CUI e \[tl\], 
or "FD£ \[7\]. Types are ordered hierm:ehically its it is 
known front ol)jeet-oriented t)rogranmdng languages. 
This leads l,o multiple inheritlmee in the description 
of linguistic entities. Finally, reeursive types are nee-. 
essary 1;o describe at lelust phrase-structure rccursion 
which is inherent in all gramnta.r ibrmalisms which 
are nol; l)rovidcd with a context-free loaekbone. 
ht the next section, we argue for the need and rel-. 
evmtce of using types in CL and Nl,l ). AO;er that, we 
give an overview of 7"1)£ and il,s specialized inl~:rence 
modules. EspeeiMly, we have it closer look ()it the 
novel features of J'D.~ and ltresenl, the techniques we 
h~tve employed in iulltlein(mLing "~1)£, 
Mode.rn tylmd mdtieaJ,ion-.ba,sed granunax foruudimns 
differ from etMy unt, yped systcnis in that they high-- 
light tile notion of a fi!ature type. Tyl)es C~L\[1 t)e 3,1'- 
ranged hierarchically, where a subtype inherits mono- 
tonicMly all the inlbrmation frolu its supertypes and 
unification plays l;he role of the primary in%rm~ttion- 
coml)imng operation. A tN)e definition elm be seen as 
;m M)breviation for ~ (:Oml~lex exl)ressi(m, (:(msisl.ing 
of I.ype eonstraiuts (eoncerning the sub-/sup(~rl.yp(: 
rehLtionship) ;rod feature constraints (stat,ing the :~1' 
propriate M,i;ribut.es and t;he, ir values) over the c(m 
ueet.ives A, V, and -,. Types serve its abbrcvi~tfions 
Ibr lexicon e, ntries, 11) rule s(:helu;d.;'L, and mfiv(~rsa.I 
its well as kmguage-specilic principles as is l'amilinr 
Doln IlPSG. Ih~sides using Lyt)cs as an abl)revia|.ion~ 
\[tl lt|(!~ltS ~tS temlthd,es rare, I,}lere are o|,hel; ~t(lv~ttlliages 
as well which cmmot be a(:(;Oml)lished by te.nl)la.i.es: 
• STRU(JTUIt.ING KNOWI,b;D(',I!; 
Types together with the l)ossibility to order 
then, hier;u'ehieally allow for a luodul;u" aHd 
ele~m way to r,~l)rcs('.nl, lingulsLic kuowle(lge nd 
equ~d,ely. Moreow:r, generalizntions can be put, 
a.t the apl)l'Ol~ri;d,e h:vels of re.13resenl;atioti. 
• I,~FFI('IENT I'I{,OCI,'SSIN(I 
Certain I,yl)e eonsLrainl,s (;all I)e (:ompih~d iltl,o el: 
ficient represenl,al;ious like. bit veeLors \[I\], where 
a (l\[,l/ (grcgd;esL h)wer bOUlld), L \[Jl~, (leaM; Ul)p(~r 
I),mnd), or a, ~ (Lyl,e s,d)SUml)liot 0 eOmlml;m, iot~ 
reduces to low-h'vel bit Inanilmlatio,i ; see Seel,iou 
3.2. Moreover, types release mltyt)ed uniliei~tio. 
fi'om eXlmnSive COmlmI.M;iou through lhe i)ossi 
bility to declare them incoml)al;ilde, lu iuhligi(m, 
working with t.yf)e ua.mes only or with partiMly 
expanded l;ypes minimizes the costs of copying 
sl, ruet;ures during processing. 'Phis can only be 
a.ccomplished i\[' the sysLent m~ukes at Uleeh;LILiSln 
for type exlmnsion available; see Se(:l,ion ;L4. 
• TYPE (JIIECKIN(I 
'Fype deliniti(ms allow n gramm~riml to (leelar(~ 
which attributes are al)l)rOl)riate lkq' a given l.yl)e 
and which types m:e a.l)prol)riate for a given at.. 
tribute, therelb.'e disallowiug one to write il~(:(m 
sistent, feat, m'e structures. Again, type expansioll 
is necess;try to determine the glol)M etmsist,eney 
of it given description. 
• RECIJltSIVI,\] TYI'ES 
l{ecursive l,ypes give it glmlmnar writ.or the op- 
porl.unity to formulnl.e cerl.Mn fimel.ion.s or re-- 
lations as recm'sivc type specific.;ttions. \York 
ing in the type deduel,io|l i)~-tra(ligm el\]i'orecs a, 
grammar writer 1,o rel)la(:e the eonl;exl;..fl'ee back. 
89,3 
bone through recursive types. Here, parameter- 
ized delayed type expansion is the ticket to the 
world of controlled linguistic deduction \[13\]; see 
Section 3.4. 
3 TD£ 
TDZ: is a unificatiol,-based grammar development en- 
vironment and run time system snpporting HPSG- 
like grammars. Work on TD£ has started within the 
DISCO project of the DFKI \[14\] (this volume). The 
DISCO grammar currently consists of approx. 900 
type specifications written in TD£ and is the largest 
HPSG grammar for German \[9\]. The core engine of 
DISCO consists of T/I£ and the feature constraint 
solver //D/A~ \[3\]. ND/~ itself is a powerful untyped 
unification machinery which allows the use of dis- 
tributed disjunctions, general negation, and fllnction- 
al dependencies. The modules communicate through 
an interface, and this connection mirrors exactly the 
way an abstract typed unification algorithm works: 
two typed feature structures can only be unified if 
the attached types are definitely compatible. This 
is accomplished by the unifier in that ~ handles 
over two typed feature structures to TD£ which gives 
back a simplified form (plus additional information; 
see Fig. 1). The motivation for separating type and 
featnre constraints and processing them in special- 
ized modules (which again might consist of special- 
ized components as is the case in 73)£) is twofold: (i) 
this strategy reduces the complexity of tile whole sys- 
tem, thus making the architecture clear, and (ii) leads 
to a higher performance of the whole system because 
every module is designed to cover only a specialized 
task. 
3.1 TD£ Language 
7"D£ supports type definitions consisting of type con- 
straints and feature constraints over the operators 
A, V, -1, and ® (xor). The operators are general- 
ized in that they can connect feature descriptions, 
coreference tags (logical variables) as well as types. 
77)£ distinguishes between arm types (open-world se- 
mantics), sort types (closed-world semantics), built-in 
types (being made available by the underlying COM- 
MON LISP system), and atoms. Recursive types are 
explicitly allowed and handled by a sophisticated lazy 
type expansion mechanism. 
In asking for the greatest lower bound of two awn 
types a and b which share no common subtype, TD£ 
always returns a A b (open-world reasoning), and not 
2_. The reason for assuming this is manifold: (i) par- 
tiality of our linguistic knowledge, (ii) approach is 
in harmony with terminological (KL-ONE-like) lan- 
guages which share a similar semantics, (iii) impor- 
tant during incremental grammar/lexicon construe- 
tion (which has been shown usefid in our project), 
and (iv) one must not write superfluous type defini- 
tions to guarantee successful type unifications during 
processing. 
The opposite case holds for the C, LB of sort types 
(closed-world approach). Furthermore, sort types dif- 
fer in another point from avm types in that they arc 
not fllrther structured, as is the case for atoIns. More- 
over, 779£ oilers the possibility to declare partitions, 
a feature heavily used in IfPSG. In addition, one can 
declare sets of types as incompatible, meaning that 
the conjunction of them yields ±, so that specific avm 
types can be closed. 
7"D£ allows a grammarian to define and use param- 
eterized templates (macros). There exists a special 
instance definition facility to ease the writing of lex- 
icon entries which differ from nor,hal types in that 
they are not entered into the type hierarchy. Input 
given to TD£ is parsed by a Zebu-generated LALR(1) 
parser \[8\] to allow for an intuitive, hi9h-level input 
syntax and to abstract fi'om uninteresting details im- 
posed by the unifier and the underlying Lisp systenr. 
The kernel of TD£-. (and of most other monoton- 
ic systems) can be given a set-theoretical semantics 
Mong the lines of \[12\]. It is easy to translate TD£. 
statements into denotation-preserving expressions of 
Smolka's feature logic, thus viewing 7"D£ only as syn- 
tactic sugar for a restricted (decidable) subset of first- 
order logic. Take for instance the following feature 
description O written as an attribute-vMue matrix: 
np 
\[ agr'eement \] 
¢= A~\[\] NUM sO 
PERS 3rd SUBJ \[\] 
It is not hard to rewrite this two-dimensionM de- 
scription to a flat first-order formula, where at- 
tributes/features (e.g., .~GR) are interpreted as binary 
relations and types (e.g., up) as unary predicates: 
3~. ,~p(¢) A ,Ga(e,, ~) A ,,a,°~em~,,t(~) A RUM(x, sg) 
A PERS(x, o°7"(1) A SUBJ(¢, x) 
The corresponding VD£ type definition of ¢ looks as 
follows (actually &; is used on the keyboard instead 
of A, \[instead of V,~instead of ~): 
¢ := np A \[AGR #x A agreement A \[NUll st, PERS at'd\], 
SUBJ #~\]. 
3.2 Type Hierarchy 
The type hierarchy is either called directly by the 
control machinery of TD£ during the detinition of a 
type (type classification) or indirectly via the simpli- 
tier both at definition and at run time (type unifica- 
tion). 
3.2.1 Encoding Method 
The implementation of the type hierarchy is based 
on A'/t-Kaci's encoding technique for partial orders 
\[1\]. Every type t is assigned a code 7(t) (represented 
via a bit vector) such that 7(0 reflects tile reflexive 
transitive closure of the subsumption relation with 
respect to t. Decoding a code c is realized either 
by a look-up OFF 3t . 7-1(c) = t) or by computing 
the "maximal restriction" of the set of types whose 
codes are less than c. l)eper, ding on the encoding 
method, the hierarchy occupies O(n logn) (compact 
encoding) resp. O(n 2) (transitive closure encoding) 
bits. ltere, GLB/LUB operations directly correspond 
to bit-or/and instructions. GI,B, I, UB and ~ com- 
putations 1-1ave the nice property that they can be 
carried out in this tYamework in O(n), where n is the 
894 
~\] , \[...\] ........... _> 
Query 
{~ 1, \[...1 
up/ . Result 
~\]\]A\[~ 
<bl,N> ~ 
pe hiera,rchy 
-<~-- (,tAb) 
TZ)£ 
<{~, ,, A ~, _q, {yo~, ~o, ~ai~)> 
Figure 1: htterfa('e between "FDE and ll/)'&& DepetMiug on the type hierarchy and the type of ~ and \[~, 
TD£. either returns c (c is delinitely the (;LB of a and b) or a A b (open-world reasoning) resl). ~L (clo.se<l-world 
reasoning 9 if there doesn't exist a single type which is ecplal to the GLB of a and b. In addition, 7"DL: determitws 
whether tlDi32: must carry out lbature term unification (yes) or not (no), i.e., the return type contains all the 
information one needs to work on prol>erly (fail signals a global unification lhilure). 
number of (,ypes. 1 
Aitq(aci's nmthod has been extended in 7'D£ to 
cover the ol)en-world nature of avm types in thai; po- 
tential (\]I,I~/LUB cmMidates (calculated front their 
codes) must be verified. Why so'. e Take the. lbllowing 
ex~mrple to see why this is ne.cessary: 
a: := J/ Az 
x' := y' A z' A\[. 1\] 
l)uring processing, one can definitely substitute y A z 
through % I)ut rewriting !I' A z' to a:' is not correct, 
because x' difl'ers fi'om f A z'- a/ is more speciiic as 
a coltseqtlellCX: of the l~e;~ture consl, r~t\[llt \[tt 1\]. So We 
make ;~ distinction between the "internal" gre;~test 
lower bound GI,B4, concerning only the type sub 
sumptiot~ relation i)y using Ait-Kaci's method alone 
(which is however used for sort types) and the :'ex- 
t(,rmq" one, GIA}c , which takes the subsumption re- 
lation over fi;ature structures into &(:COtlllt. 
With Gl,l)-< and GLIJc in mind, we (:m~_ define, a 
generalized (~,B operation infbrmally by the follow- 
ing table. This GLI} operation, is actually used during 
type mfitication (jr(.' :: feature constraint): 
-di~g- ,Tis,,~-.~oT! ,~fft,~i,u-F_ f,;7- 
\[_f,'~ _lLs,'-~. \[ ± l - Is-.-- ;.j 
?lJh('A'c 
a,Jmj < > Gl.llE(.vmi, a,,m~) -..vm,:~ 
(IUIIL 1 #,:--~? (ll~?lt I ~ (llHII. 2 
I. .L ~-->. C\[,I\]~ (amnl, arm2) -- J_, via an 
explicit incomp~ttibility declara~tion 
aural A aline!, otherwise (open world) 
~.. ,,,,,,,,~ <=. exp~md(,,,,,,,,,.~) ni~:~,, ¢ z _L, 
otherwise 
sor't.~ ¢-=> (\]l,I}5(sortt, .sorte) = ~orta 
3. sort t 4;:._~ sort1 == sort2 
.L, otherwise (closed worM) 
at, oral ,~ ~--'.~ type-of(a/oral ,~) ~ sort.& i, 
,1. where sort~,l is ~ built-in 
J., otherwise 
5.. atoHtt #,--~ o{oHt I =: (tlont 2 
±, otherwise 
T 4~> f,:l VI fc~ ¢ _l_ 6.. _L, 
otherwise 
The encoding algorithm is also exl,m,ded towards 
the rcdcJiuition of types and the use of undcJlmd 
types, an essentiM lmrt of at, im:remental gram- 
mar/lexicon dew.qopmetd, systenl, ll.edetining a I,ype 
means not oldy to m~ke changes local to this type. 
h,stead, (.,lie }I:4S to redefil,e all depcndcul.s of this 
type -all subtypes in case of a conjunctive l;ype def 
itdtion and all disjunction alternatives for at disjuuc- 
tive type speeilication plus, in both cases, all types 
which use these types in their de\[inition. The depen- 
dent types o\[ a l.ype t can be characterized gr~q)h- 
theoretically via l,he strongly c(mnected component 
of t with respect to the depe,Mency relation. 
3.2.2 D(molnI)osing Type Definitions 
Conjm~ctivc, e.g., x := J/A z ~tnd disju,u;tivc t!lp('. 
specificalio)~s, e.g., a/ ::-= f V z / are entered difl'er- 
ently into the hier~u'chy: :c inherits from its s,,per- 
l;ypes 9 and z, whereas x' delines itse|f through its 
IActuMly, one can choose, in 7"DE I)ctwccn the two 
encoding I:cchniques and between bit vectors and bignums 
ill COMMON \[ASP for the representatiou of the codes, h, 
our I,l.ql' implelnentaLion: operatimm on bignulns are. a 
magtfil;ude faster than on bi~ vectors. 
895 
\J 
z luAvAwl 
y 
Figure 2: The intermediate types luAH and NAvAwl 
are introduced by TD£ during the type delinitions 
,= := uA',, A \[a 0\] and Y := wA v A,*A \[a 1\]. 
alternatives !/ and z'. This distinction is represent- 
ed through tile use of different kinds of edges in the 
type graph (bold edges denote disjunction elements; 
see Fig. 3). But; it is worth noting that both of tllem 
express subsumption (x ~ y and x' >-_ y') and that 
the GLB/LUB operations must work properly over 
"conjunctive" as well as "disjunctive" subsumption 
links. 
TD£ decomposes complex definitions consisting of 
A, V, and ~ by introducing intermediate types, so 
that the resulting expression is either a pure covjunc- 
lion or" a disjunction of type symbols. Intermediate 
type names are enclosed in vertical bars (ef. the in- 
termediate types \[u A v I and lu A v A w{ in Fig. 2). 
Tile same technique is applied when using • (see 
Fig. 3). (b will be decomposed into A, V and ~, plus 
additional intermediates. For each negated tyt)e ~t, 
7"1)£ introduces a new intermediate type symbol I-'tl 
having the definition ~t and dechu'es it incompatible 
with t (see Section 3.2.a). I,~ addition, if t is not 
already present, T/)£ will add t as a new type to the 
hierarchy (see types \[~b\[ allcl \]-el in Fig. 3). 
Let's consider the example a := b ® c. The de- 
composition can be stated informally by the follow- 
ing rewrite steps (assuming that the user tu~s chosen 
CNF): 
a := bOc 
. := (~ A -~(-) v (-~ A c) 
. := (b v -~b) A (b v c) A (-~ V ~) A (-,e V e) 
,, := (~ v e) A (~ v ~) 
. := I~vel A I~bWel 
3.2.3 Incompatible Types and Bottom 
Propagation 
Incompatible lypes lead to the introduction of spe- 
cialized bottom symbols (see Fig. 3 and 4) which how- 
ever are identified in the underlying logic in that they 
denote the empty set. These bottom symbols must be 
propagated downwards by a mechanism called bottom 
propagation which takes place at definition time (see 
Fig. 4). Note that it is important to take not only 
subtypes of incompatible types into account but also 
disjunction elements as the following example shows: 
T 
-k(b, ~b\] J-{e,~c} 
Figure 3: Decomposing a := b®c, such that a inherits 
from tile intermediates IbVc\[ and b/,v~cl. 
.k -- a A b. } _~C+ a A bi := J- and a A b~ = J_ b := bl V b.). 
One might expect; that incompatibility statements 
together with feature term unification no longer lead 
to a monotonic, set-theoretical semantics. But this 
is not the case. To preserve monotonicity, one must 
assume a 2-level interpretation of tgpcd feature struc- 
tures, where feature constraints and t, ype constraints 
might denote diflb.rent sets of objects and the glob~ 
al interpretation is determined by the intersection of 
the two sets. Take for instance the type definitions 
A := \[a 1\] and 13 := \[b 1\], plus the user declaration 
J- = A A B, meaning that A and B are incompatible. 
Tl,en A A B will simplify to J_ although the corre- 
sponding feature structures of A and \[t successfully 
unify to \[a 1, b 1\], thus the global interpretation is ±. 
3.3 Symbolic Simplifier 
\[File simplifier operates on arbitrary TD~ expressions. 
Simplitication is done at definition time and at run 
time when typed unification takes place (cf. \]rig. 1). 
The main issue of symbolic simplitication is to avoid 
(i) unnecessary feature constraint unification and (it) 
queries to the type hierarchy by simply applying 
"syntactic" reduction rules. Consider all expression 
like x~ A... A xi . . . A "~a:i , . . A xn. The shnplilier will 
detect .k by simply applying reduction rules. 
The simplification schemata are well known from 
the propositional calculus. They are hard-wired in 
the implementation to speed up computation. For- 
really, type simplitication in "FD£ can be character- 
ized as a term rewriting system. A set of reduction 
rnles is applied until a normal form is reached. Con- 
fluence and termination is guaranteed by imposing 
a total generalized lexicwraphic order on terms (see 
below). In addition, this order has the nice effects 
of neglecting eommutativity (which is expensiw.' and 
might lead to termination problems): there is only 
one representative lbr a given formula. Therefore, 
memoizatiou is cheap and is employed in TD£ to 
reuse precomputed results of simplilied expressions 
(one must not cover all permutations of a formula). 
Additional reduction rules are applied at run time 
using "semantic" inlbrmation of the type hierarchy 
(GLB, LUB, and ~). 
896 
\[ l- 
d ',:: b A \[p t'\]. 
< ::: b A \[p --\]. 
. t, ---+ <, //~. c 
.l-{a,b>,:} I-{a,b,c} 
Iqgure d: IJottom propagat, icm trigg'ered throltg'h the :mbEglWS d aud c of b, ,so f, ha.L a A d A c as well w; a A ,.: A c 
will simI>lil ~ to _L during processing. 
a.a.1 Nornial Forlli 
hi order to reduce ;ui m;1)il;rary l,yl m expression to 
it simpler expression, Siml)lifi(:al;ion rules inusl; I)e a\])- 
plied. So we have to deline wh;Ll, it, lfie0Al.q for &ll 
expressioll t() t)(; "SJlllllle'. Ollo, CilJl eil;he, r (:boo,q(; the 
coujimcl,ivc or disjuimt, ive nol:maJ tbrm. The ~tdwtlr- 
I, a gcs of CN I"/I)NF are: 
i UNIQIIF, NES,q 
<l'yl)e ('.XlJl:ossiolls ill llOl'lll~t\[ \[O1"111 ttl;C IllliqllO, 
niodulo (;onunutal;ivil,y. Sorl,ing l,yllc extJressions 
according t,o ~ t;oi,;d lexi(Jographic order will lead 
i;o a i:otM uiliqueness of l,yl)e exlli:essions (,<-;ee, 
Section 3.3.3). 
• LINEAI{I'FY 
'\['ype expressions in liOi'lllal \['orlll ;~i:e linear. Ar 
bil;r;tly llesl.ed expl:essi()iis c:itii lie 1,ra.nsfortxied 
inl,o llal (JXl)i'OS,'-;iOllS. This l\[l;,ty l'(',dlil;(? i,he COHI 
plexil,y of later siniplilicatious, e,g;., ;d; rllil time. 
• (Jt)Ml'a ItABII,1TY 
This l)roperi,y is a colls(xlll(;lt(:(! of the two oi;hel; 
proli(;l:tie,<;. (~'ni(lue aal(I line,u: exl)ressions lnake 
it; easy i,o liild O1" 1,() cOUllm,'e (sul))expl:essions. 
This is itlll)Ort, allt, \[or the liierlloiz~d;ioli lx;t:hliique 
described in Scctioii 3.3.4. 
3.3.2 ll,eduction Rnl(~s 
lu order to reach a normal forui, it; would suffice 
to at)l)ly only the s(:ll.etlt;t|;;~ \['or (lf)ll})\[(~ neg~l,ion, dis- 
I,ribul, ivity, and De Morgan's h~ws. Ilowever, in the 
worst case, t, hcsc I, hr('(; rtlles wouht blow iI t) i,he leugl;h 
of th(~ normal lbrm to eXl)OnCnl,ial size ((:omp~u'ed 
with \],he mtull)er o\[ lit, erals iu the originaJ expres- 
sion). To ~o,'oi(l I, his, ()(;her titles ;tr(' use(I intermedi- 
ately: idempotcnce, idenl, ity, al)sorpl,ioih etc. If they 
can l)e applied, t, he.y alw~tys re(tilt:t; l,he lengl,h of I,hc 
expressions, l'\]specia.lly w\[; run l, ime, llu(; also al; del L 
\]nil;ion tilne, i\[, is use\['ul to eXldOi\[, infbrmM, i(m \['rt)ln 
the, t,ype hi(warchy. I"url,h(:r siml)lilit:al, ious are l)ossi- 
hie by ~csking lbr l,h(; (:ll,lt, \],till, altd ~. 
3.3.3 Lc'xlcogral)hic Order 
To avoid the al)pli('ation of l;ltc cotmnutativil, y rule, 
wc introduc(~ ;~ to(,al lcxicographic order on tyllc cx- 
lU'essious. Together with I)NF/(TNI,', we ol)taiil a 
unique sorl;ed normal fornt tbr an arbitrary l;y\[)e ex- 
pression. This guarant,ees fast (:oinparabilil,y. 
We define I;he order <NF on 7>ary normal forms: 
t~,lpe <N~; neqaled type <NI; conjunction <NI,' dis,- 
,\]'?trtCti01~ <NI" symbol <NI" striu9 <NF ~lltll21J(~F. l"ot' 
the coinl)arisoil of atoms, st;rings, and type names, 
we use the lexic, ographical order on strings ;rod lbr 
llitlllt)(!l:S \[,h(~ ordering < ou n;ttural IIIIlH\[)OI'S. 
l",x;unple: a <NI; b <NI; bb <NI; -m(t <NI; c.z A b <NI,' 
a A -,a <NI; a V b <NJ" (t V b V c <NI; a V i 
:1.3.4 Memoizalfioit 
The memoization t, cchnique describe, d in \[10\] hw-; 
1)een ad;q)ted in order to reuse precomlml,ed resull;s o\]' 
l.ype sinq)li\[i<:at,ion. The lexicogral>hically sorted nor- 
lnM f¢)rni guar;uitees fast; ~u:cess 1;o lU:CCOlnlml;e(l l,ype 
sinll)lilications. Memoiza|;ion resull, s are also used by 
the recursive simplific;d;ion algorit;hm (;o exploit, pre- 
conllmted results for subexln:cssions. 
Some enqfirical results show I;he usefulness of nteui- 
oization, The current DISCO grallltlUtr \]'t)r Q',0r- 
lI|~l,ll co118i81,8 o\]' 88 F) types ;uld 27 tentl~latx:s. AI: 
ter a lull (,ylm expausion of a toy lexicon of 244 in 
s(,;tltces/elll, ries, the lnemoiz;tl, ion table txmtaium ap- 
prox. 3000 cnl;ries (literals m'c noL lneuloized). 18000 
results have been reused ~tt; lc'asl; once (some up t;(~ 
600 tiines) of whMl 90 % ~re proper sinlplilica(,ions 
(i.e., the siinplilicd formulae m:e really shorter th~m 
t, he unsimplilied ones). 
3.4 Type Exlmnsion and Control 
Wc noted earlier I, hat types allow us to refer to c(m,-- 
pIex constraints folirougli tim use o\[ symbol nantes. 
l/,ecolml, rucl, ing |,he consl, r;tinl,s which determilm a 
I,ype (rept:eseltted as a \['eature sl,rucl;ure) requires a 
complex ol)er;-ttion called Qjpc c,7,Tmusz'om This is 
COml);tr;tble to (Jat'lmnl;er's lolalhj wcll-l~jpcdncss \[5\]. 
3.4.1 Motivation 
In ~J'l)l~, I,he mot, iwttioll for type expansion is m;m- 
iibl(l: 
• CONSISTI,;NCY 
AI, definition time, type expansion del,ermiues 
whc|;her tim st:l, of |,ype delinil;ion,s (grammar and 
lexicon) is consistent. At; run time, t, ype exi);m- 
sion is involved in checking the satis\[i;d)ility of 
l;he unilical;ion of two part,\]ally explm(h.'d typed 
fe;d,ure s(;rucl, lures, e.g., during parsing. 
897 
• ECONOMY 
From the standpoint of efficiency, it; does make 
sense to work only with small, partially expand- 
ed structures (if possible) to speed up feature 
term unification and to reduce the antount of 
copying. At the end of processing however, one 
has to snake the result/constraints explicit. 
• ItECURSION 
l{ecursive types are inherently present in modern 
constraint-based granmlar theories like IIPSG 
which are not provided with a context-free back- 
bone. Moreover, if the formalism does not al- 
low fnnctionM or relational constraints, one tnust 
specify certain flmctions/relations like append 
through recurslve types. Take for instance Ait- 
Kaci's version of the append type which (:ass be 
stated in "\]-DE as follows: 
append := appendo V appendl. 
aN)endo := \[FRONT < >, 
BACK #1A list, 
WHOLE #1\]. 
append, := \[FRONT < #first. #~v.stl >, 
BACK #back A list, 
WHOLE < #first. #rest2 >, 
PATCH append A \[FRONT #restl, 
BACK #back, 
WHOLE #rest2\]\]. 
o TYPE DEDUCTION 
Parsing and generation can be seen in the light of 
type deduction as a uniforin process, where ideal- 
ly only the phonology (for parsing) or the seman- 
tics (for generation) must be giw'.n. Type expan- 
sion together with a sufficiently specified gram- 
mar then is responsible in both cases for cov- 
strncting a fully specified feature structure which 
is maximal informative and compatible with the 
input. Itowever, \[la\] has shown that type ex- 
pansion without sophistieated control strategies 
is in Illany cases inelficient and moreover does 
not guarantee termination. 
3.4.2 Controlled Type. Exliansion 
Uszkoreit \[la\] introduced a new strategy tbr lin- 
guistic processing called controlled linguistic deduc- 
lion. Ills approaeh permits the.specitication of lit> 
gnistic performance models without giving up the 
declarative basis of linguistie competence, especial- 
ly monotonicity and eompleteness. The ewduation of 
both cm0nnctive and disjunctive constraints can be 
controlled in this framework. For conjunctive con- 
straints, the one with the highest faihtre probability 
should be evahtated first. For disjunctive ones, a suc- 
cess measure is used instead: the alternative with the 
highest success probability is used until a unification 
fails, in which case one has to backtrack to the next 
best alternative. 
7'D£ and /./D~de snpport this strategy in that ev- 
ery feature structnre can be associated with its sue- 
cess/faihtre potentiM such that type expansion can be 
sensitive to these settings. Moreover, one can make 
other decisions as well during type expansion: 
• only regard structures which art subsumed by a 
given type resp. the opposite case (e.g., expand 
the type subcat-list always or never expand the 
type daughters) 
• take into &ccouttt only structures under cer- 
tain paths or again assume the oliposite case 
(e.g., always expand the wtlue nailer path 
SYNSEMILOCICAT; in addition, it is possible to 
employ path pattenls in the sense of pattern 
matching) 
• set the depth of type expansion for a given type 
Note that we are not restricted to apply only one 
of these settings-- they can be used in combination 
and can be changed dynamically during processing. 
It does make sense, tbr instance, to expand at cer- 
tain well-defined points during parsing the (partial) 
information obtained so far. If this will not resnlt in a 
failure, one can throw away (resp. store) this flflly ex- 
panded feature structure, working on with the older 
(and smaller) one. tlowever, if the information is in- 
consistent, we luust backtrack to older stages in com- 
putation. Going this way which of course assumes 
/seuristic knowledge (language as well as grammar- 
specific knowledge) results in faster processing and 
copying. Moreover, the inference engine lllnst be able 
to handle possibly illconsistenl, knowledge, e.g., in 
cast of a chart parser to allow for a third kind of 
edge (besides active and passive ones). 
3.4.3 Reem'siw; Types, hnplenmntational 
Issues, and Undeeidabillty 
The set of all recursive types of a given gram- 
mar/lexicon can be precompiled by employing the 
dependency graph of this type system. This graph 
is updated every time a new type delhfition is added 
to the system. Thus detecting whether a given type 
is recnrsive or not reduces to a simple table look--up. 
ltowever l, he expansion of a recnrsive type itself is a 
little bit harder. In T'D£, we are using a lazy expan- 
sion technique whMt only makes those constraints 
explicit which are really new. To pslt it in anoth-. 
er way: if no (global or local) control information 
is specified to guide a specific expansion, a recnrsive 
type will be be expanded under all its paths (local 
plus inherited paths) until one reaches a point where 
the information is already given in a prcJi:r path. We 
call such an expanded structure a resolved typeil t?~.a - 
ture structure. Of course, there are inlinitely many 
resolved feature structures, but this structure is the 
most general resolved one. 
Take lbr instance the append example l¥om the 
1)revions section, append is of course a recursive 
type because one of its alternatives, viz., append 1 
uses append under the PATCH attrilmte. Exl)and- 
ing append with no additional information sup- 
plied (especiMly no path leading inside appcndl, 
e.g., PATCH I PATCH I PATCH) yields a disjunctive feature 
structure where both append o and append I are sub- 
stituted by their definitiorl. The expansion then stops 
if no other informatioll enforce a fisrther expansion. 
In practice, one has to keep track of the visited 
paths and visited typeil feature structures to avoid 
unnecessary expansion. 3'0 make expansion more el L 
ficient, we mark structures whether they are fully ex- 
panded or not. A feature strnetnre is then fully ex- 
panded iff all of its substructures are fully expanded. 
This simple idea leads to a massive reduction of the 
search space when dealing wills incremental expan- 
sion (e.g., during parsing). 
898 
It is worth noting that the sat|st|ability of fea- 
ture descriptions admitting recursive type equa- 
tions/detinitions is in general undecidable. Rounds 
and Manaster-ll, aumr \[11\] were the tirst having shown 
that a t(asper-ll.ounds logic enriched with recnrsive 
types allows one to encode it Turing machine, lie- 
cause our logic is much more richer, we imlne(liately 
get; the sanle result tbr TD£. 
itowever, one can choose in 7"l)£ between a com- 
plete expansion algorithm which may not terniinate 
and a non-comf)lete on(.' to guarantee tcrmin~-ttion (see 
\[2\] and \[5, Ch. 1,5\] for similar prol,osals ). The latter 
ease heavily depends on the notion of resolvedness 
(see above). In both cases, the depth of the search 
space can be restricted by specifying a maximal path 
length. 
4 Comparison with other Systems 
7'D/~ is tmique in that it iml)lemevts many novel fea- 
tures not found in other systems like ALE \[4\], I,IFI'; 
\[2\], (7,: TIeS \[15\]. Of course, these systems l,rovide 
other l~atures whiclt are not present in our formal- 
|sin. What makes 7,D£ unique in COmlTarison to them 
is the distinction open vs. closed world, the awdlabil- 
ity of the full boolean connectives and distributed 
disjunctions (via UD/~), as well as an imphmte,lted 
lazy type expansion mechaifism for reeursive types 
(as compared with LIFE). AI,E, \[br instance, neither 
allows dis|mint|re nor recurs|re tyl)es and enforces 
the l;ype hierarchy to be a I?,CPO. IIowever, il; makes 
recursion available througl, detinite relations and in- 
corporates special mechanisms \[br eml)ty categories 
and lexical rules. TFS comes up with a closed worhl, 
the unawdlability of negative information (only im- 
plicitly present) and only a poor tbrm of disjunctive 
information but performs parsing and generation en- 
tirely through type deduction (in fact, it was the tirst 
system). LIF'I'3 comes closest to us but l)rovides a se- 
mantics for types that is similar to TFS. Moreover 
the lack of negative information and distributed dis- 
junctions makes it again comparal)le with TFS. LIFF 
as a whole can be seen as an extension of PROI,O(~ (as 
was the case for its predecessor LO('HN), where tirst- 
order terms are rel)laced by .~-terms. In this sense, 
I,IFF, is rMmr than onr fomalism in that it offers a 
fifll relational calculus. 
5 Summary and Outlook 
In this pal)er , we have presented 7,D£, a typed tha- 
ture forlnMism thg~t integrates a |)owerflfl feature con- 
strMnt solver and type system. 13oth of t\]tem provide 
the boolean connectives A, V, and ~, where a con> 
l)lex exl)ression is decomposed by emphTying interme- 
diate types. Moreover, recursive types are supported 
as well. lit 7,D/2, a grammar writer decides whether 
types liw. ~ in an open or a closed world. This ef- 
fe.cts (\]Ltl and LIJI\] computations. '|'he type system 
i~,self consists of several inference components, each 
designed to cover etficiently a specific task: (i) a tilt 
vector encoding o\[ the hierarchy, (ii) a fast symbolic 
simplilier for complex type expressions, (iii) memo 
ization t;(7 cache preeomI)uted results, and (iv) a so- 
phisticated type expansion nmchanism. The system 
as described in this paper has been implemented in 
COMMON IASP and integrated in tile I)ISCO environ- 
men| \[14\]. 
The next lll;kjor version of 7,D£ will be integrat- 
ed into a declarative sl)ecilication langttage which al- 
lows linguists to define eoutrol kuowledge that can be 
nsed during proe~.'ssing. In addition, certain forms of 
know|edge, compilation will be made availa/fle in fu- 
ture versions o\[' TD/~, e.g., the autolnatic detection o\[' 
syntactic ineonq)atibilities between tyl)es , so that a 
type eOmlmtation can subsl,itute an extensive feature 
term unification. 
Re \['erences 
\[1\] llassan Ai't-Kaci, Robert lloye.r, Pa(a'ick Lincoln, ~-tlld 
t{oger Nasr. I*;flieient; implementation of lattice op 
erations. ACM Transactions ou l'rogrammin 9 Lan- 
(lUages aud Sgstcms, 11 (1.):115- 146, January 1989. 
\[2\] IIassan M'l,-Kaci, Andreas Podelski, and Seth Copen 
Goldstein. Ovder-sort, ed R:aLure timory uni\[icaLion. 
Teclm. Hcport 32, \])EC Paris l/.esem'ch |,lb., tg.')a. 
\[3\] Rolf \]~a(:kofcll and Cln'ist;oph Weyers. UDi/g'c a fca- 
|.life t:onst, raint solver with distributed disiunction 
and classical negation. Teclmical report., D \["K \[, Sct\[tt'- 
brficken, (\]ermmty, 1!/9-1. Forthconting. 
\[4\] Bob Carpenter. ALE |;he al;tribu(:e logic engine us- 
er's guide. Version ft. Technical report., Labm'atoty 
for Computal;iona\[ l,inguisl;ics. Carnegie Mellou Uni- 
versity, Pit(.sburgh, PA, 1992. 
\[5\] Bob Carpenter. The Logic of ~lilped Feature Struc- 
tm'cs. Cambridge University Press, Cmnbridge, 19!)2. 
\[6\] .lochen l)Srre and Michael Dovna. CUF .a formal- 
ism for lhtguisi, ic knowledge representation. In 
£1 ' l) ~ ~lt'{~ , editor, Comp.utational Aspects cff Cou- 
straint- B,sed Linguistic Description. DYANA, 1!)9:L 
\[7\] llans-UMch Krieger and Ulrich Sch'ffer. "\['1)£. -a 
type description language for HPSG. Pm't 2: user 
guide. Technical report, DFKI, Saarbrlleken, (k't'- 
many, 19!)-1. Forthcoming. 
\[8\] Joachiln l~aul)sch. Zebu: A tool for speeifyil~g rc 
versible \],ALR(I) parsers. '\]'ethnical re.porl;, IIewleM;- 
Packard, 1993. 
\[9\] Klaus Netter. Ar(:|dteci;m'e and coverage of |;he 1)\[,%. 
CO grammar. In S. Busemamt aim K. IIarbusch, 
eds., t'roc, of the DFKI Workshop on N, tural Lau- 
9tu,/e Systems: Modularity and ltc- Usability~ 1993. 
\[10\] Peter Norvig. Techniques fin' mttomal:ic memoizai;ion 
with applications t;o (:oul;ex|;-ft'ee pro'sing. Computa- 
tional Linguistics, 17(1):91--98, 1991. 
\[11\] William C. \]{O/llttls and Alexis Mg.tli.~tsl;tw-l{~-ttllel'. A 
logical version of fimctional gr;unmsu'. In Procccdi,,qs 
of the A UL, pages 89 4)6, 1987. 
\[t2\] GerL Smolka. A feature logic with subsorl;s. I,II,O(', 
l/.eporL 33, IBM (;ermany, Sl;ut.tgart, 1988. 
\[13\] llans Uszkoreit. Strat;cgies for adding control infof 
lint|ion (.o declarative gr;munars, in l'roccediuya of 
the ACL, l)agcs 237 245, 1991. 
\[14\] H. Uszkoreit., R. Backofen, S. lhlsemann, A.K. l)i 
agne, E.A. Ilinkehnan, W. Kasper, B. Kiefcr, t\[.- 
/j. KT"ieger, K. Netter, G. Neummm, S. Oel)en , and 
S.\[ ). Sl)ackman. DISCO- an HPSC;-bas(:d NI,P sys.- 
tern aim il;s app|h:al;ion for aplmintlnent uchedulhlg. 
In Proceediuqs of COLING, 1994. 
\[15\] Hdmi Zajac. Inheritance and constraint-bas('.d 
grammar formal|sins. Computational Linguistics, 
J8(2): tat) ~82, 1,,),,)2. 
899 
