Reverse Queries in DATR* 
Hagen Langer 
(~niversity of ()stall)flick, Germany 
hlai~ger((~)jupite, r.rz.mfi-osnal)rue(:k.de 
Abstract 
I)ATI{ is a declarative re.presentation language ti)r lex-. 
ical iifformation and as such, fit prin(:iple, neul;ral with 
resl)(;ct; 1;o i)arl;icul&r l)rocessing st,rat,egies. Previous 
DATR (:l)mt)iler/inl;erI)ret(!r sy,qt(!ms support only one 
al:l:e.4s ,%rat,egy ~hnt, closely resembles the set, of inti~r-. 
otlce rllleS of the procedm:sd s(mumti(:s of \])A.Tli (Evmls 
& C,~tz(lar 1989a). In this i/al)er w(! present, an alt,ern;> 
1;ivc access st,r~tl;egy ('ri:'uc'l'.s('. q'ucr!/ .stral, cgy ) ~br a him- 
trivial subsel; of I)A'F\]/. 
1 The Reverse Query Problem 
DATR (Evans & Gazdm" 1989@ has l)ecome. Olte of the 
iiiosl; widely used fornlatl languages tin' the I'(~l)t'ese.tll;;t- 
t,ion of lexicad infornlat,ion. !)N\['ll ~q)plil:ations ha.re 
been (h~velol)ed for a wide variety of lmlguages (includ- 
ing English, .lat/mmse , Kikuyu, Arabi(:, l,at,in, and oth- 
ers) ;rod mmly different; subdonudns of le, xical rel)resen- 
tat,ion, including inilect,ional morphology, undt~rspeci- 
fication l)honltlogy, nlm-(:onca.t,enative morphophonol- 
ogy, lexicaI senlanti(:s, and tone systems I. 
We presutlI)OSe that the reader of the llresenl; paper is 
\[)mlilia.r with the basic Datur(!s of \])AT\[/. as spe(:ilie(1 
in Ewms & Gazda.r \[1989@ 
7'he ;all;(tu;tcy of st lexi(:on repr(~se.nt,;~t;ion fornmlisln de- 
pends basically on two major fact,ors: 
• it,s declarative c:cpres.sivenes.s: is the ff)rmalism, in 
prin(:iple, i:al)al)le of rel)resent,ing l;he phenomena in 
• This research was partly SUpl)orted by the (~ermau l,'e(h!ral 
Ministry of Heseareh and Technology (BMfI', project VEI~P,- 
MOBIl,) at the University of l~ielelk~ld. I would like tel thank 
1)afy(Id Gibbon for vely ttseful COllltHeltL8 o11 ali earlier draft of 
I;his paper. 
1 See Cahill \[\[9.93\], Gibbon \[ \[9!)2\], (lazdm" \[I 9921, ;rod Kilbm'y 
\[1992\] for reeelll; I)ATR applicatious in Lhese areas. An informal 
introducl, ion I,o I)ATR is given in (lazdar \[19!10\]. 'l'he sl.andatd 
syntax and semantics of I)ATI{ is defined in I,iwms gz (~az(lar 
\[198!)a, 19891)\]. hul)lementation issues are discussed iu (:libbon 
& Almua \[1991\], Jenkins \[1990\], aud in Gibbon \[19931 . M,)ser 
\[I 992a, 1992b, 1992(:, 1992d\] provides interesting insights into the 
fl~rmal properties of I)N\['I/(see also the I)A'\['I/ represen/ations of 
finil,e state allLomal.a, dilI'e~ent kiiMs of logics, regisl, er operations 
ere. in Evans & (l~z(la,' \[1990\], and l,;ml;er \[1993\]). Andry et 
al. \[19931 describe how I)ATR can lm used in speech-oriented 
~tl)l)lieal.ion.~;. 
qll(*,st,ion~ &lid does it allow lbr a.n explicit t,re;~t,- 
lllonl; of generalisat,ions, subgene, ralisations, ;rod ex-- 
Cel)tions'. ~ 
• its l'ailg~e of acct',ssing .strategies: are th(w0, &cc0s.qillt, ~ 
strategies for all apl)lical;ions which 1)rt'.suppose a lex- 
icon (e.g. parsing, general;ion, ...), a.nd tlo t,hey sup 
porl; t;he development, Ill;:tillt,t!ll~tllt:(}, ;-I, II(\] evahmLion 
of lexi(:a in an a(h~(ltlat,l~ manner? 
Most; of t,h(! previous work oil i)A~I'l/ has focussed ou 
t,hc forlnc'r set, of (:rigeria, i.e. t,he det;larative features 
of l;he language, its exl)ressive i:~Lpalfili|;ies, mid its a(t- 
equ;ti:y l()r Lhe r(>forinul~l;ion of l)r(>l;h(~oret,i(; int()rln~tl 
linguistic concepts. This paper is mainly con(:erImd 
with f;he latter set of criteria of adequacy. However, in 
the (:ase of I)ATI{, the limited access iu only one di- 
re(:tion lms led to a somewhat l)ro(:edural view of \[;he 
language whi(:h, in 1)artil:ular cases, has also had a.n 
impact on the declarative rel)resenl;al;ion,q I;hem,qelves. 
I)AT\]/. has ofl;en been (:h~r;u:I;erised as a fiim.ctional ttild 
d(ttg'.l'glti'lti.s't{(: 1}LllglHtglL These fe}Ll;llt'(hq 31'o, ()f COllt'SO, 
not prolmrl;ies of the bmgm~ge it,self, but rather of I;he 
la.uguage l;ogether with a particulm: procedural ild;er 
pre.t,ation. Actually, l;he t,erm deterministic is ill)i; }l,l)- 
t)lic~fl)le to a declarative l~mguage, but only makes s(!ltse 
if applied to a procedural laalgua.ge or a particuta.r pro- 
cedural intert)retal;ion of a langnage. The I)ATR in 
terpreter/couq/iler systems develol)Cd st) t~l '2 have in 
COmlnon that (,hey supt)orL italy one way of accessing 
the inli)rmat, ion relIres(mt(~'(1 in & I)ATR theory. 'Fhis 
access st;ral;egy, whi(:h we will refer to as the sl, anda'rd 
pT"ocedur'al intcrprctatio'n of \])ATR, closely resembh~s 
the inference rules defined in Evans & Gaz(lar \[11989a\]. 
Even if one considers DATR neither a.s a tool for i)a.rs - 
ing nor for generatioll tasks, \[)lit, rather as a purely ret/ 
resent,ational device, the one-way-only access to DATR 
t,heories turns ollt, to 1)e OllO ()f the major drawbacks of 
t;he model. 
One (If (;tie i:bdins stated for DATR in F, wms &. Gaz- 
(l&r \[\] 989\] is t,haA; it is i:onqnttationally l;ra(:l;able, lhlt~ 
for many practical purpl/Ses, including lexicon iIevelo 1) 
tnt!llL sl, lld ew~,lual;ion, it, is llOt, sufficient, t,hal; t,her( ~. is ,:~lly 
21)ATI;i/ impl-et;mn~ati ..... i,ave I ........ leveloped by iC Evans 
(I)A'\['I(90), I). (lit)bon (I)I)ATI{, ODE), A. Sikorski (TPI)A- 
'I'll,q), .l. Kilbury (QI)ATII), (I. I)rexel (YAI)\]"), M. I)uda (I IU I~ 
I)ATII), mid other.s. 
1089 
arbitrary accessing strategy at all, bnt there should be 
an appropriate way for accessing whatever information 
that is necessary for the purpose in question. This is a 
strong motivation for investigating alternative strate- 
gies for processing DATR representations. This paper 
is concerned with the reverse query problem, i.e. the 
problem how a given DATR value can be mapped onto 
the queries that evaluate to it. A standard query con- 
sists of a node and a path, e.g. Sheep:<orth plur>, an<l 
evaluates to a sequence, of atoms (value), e.g. sheep. A 
reverse query, on the other hand, starts with the value, 
e.g. sheep, and queries the set of node-path pairs which 
evaluate to it, for instance, Sheep:<orth sing> and 
Sheep:<orth plur>. Our solution can be be regarded 
as an inversion of the parsing-as-deduction al)proach of 
the logic programming tradition, since we treat reverse- 
query theorem proving as a parsing problem. We adopt 
a wellknown strategy frora parsing technology: we iso- 
late the context-fi'ee "backbone" of DATR and use a 
modified chart-parsing algorithm for CF-PSG as a the- 
orem prover for reverse queries. 
I, br the purposes of the present paper we will intro- 
duce a DATR notation that slightly differs fi'om the 
standard notation given in Evans & Gazdar \[1989\] in 
the following respects: 
• the usual DATR abbreviation conventions are spelled 
out 
* the global environment of a DATR descriptor is ex- 
plicitly represented (even if it is uninstantiated) 
• each node-path pair N:P is associated with the set 
of extensional suffixes of N:P that are defined within 
the DATR theory 
In standard DATR notation, what one might call a 
non-terminal symbol, is a node-path pair (or an abbre- 
viation for a node-path pair). In our notation a DATR 
nonterminal symbol is an ordered set \[N, P, (7, N', P'\]. 
N and N ~ are nodes or variables ranging over nodes. 
P and P' are paths or variables ranging over paths. C 
is the set of path suffixes of N:P. 
A DATR terminal symbol of a theory 0 is an atom 
that has at least one occurence in a sentence in 0 where 
it is not an attribute, i.e. where it does not occur in a 
path. 
The suffix-set w.r.t, a t)refix p and a set of sequences S 
(written as alp, S)) is the set of the remaining suifixes 
of strings in S which contain thc prefix p: alp, S) - 
{slp^s ~ S}. 
Let N:P be the left hand side of a DATR sentence of 
some DATR theory 0. Let be II the set of pat, hs occur- 
ring under node N in 0. The path extension constraint 
of P w.r.t. N and 0 (written as C(P,N,O), or simply 
c) is defined as: C(P, N, O) = G(I", n). 
Thus, the constraint of a path P is the set of path suf- 
fixes extending P of those paths that have P as a prefix. 
Example: Consider the DATR theory 0: 
N: 
<> == 0 
<a> == 1 
<a b> == 2. 
The constraint of <> (w.r.t. N and 0) is {<a>,<a 
b>}, the constraint of <a> is {< b >}, and the con- 
straint of <a b> is ~. 
We s W that a sequence S - st . .. s,~ (1 _< n) satisfies a 
constraint C ill {a: 6 cl.ax = s} - ~ (i.e. a sequence 
S satisfies a constraint C iff there is no pretix of S in 
C). 
Now having defined some basic notions, we can give 
the rules that map standard DATR notation ont;o our 
representation: 
Mapping rules 
N:P =-- 0 
N:P == atonl ::~ 
N:P == N2:1'2 ::> 
N:P == N2 => 
N:P == P2 :::> 
N:P =-- "N2:P2" 
N:P =-- "N2" 
N:P == "P2" :=> 
\[N,P,C,N',P'\] -~ e 
\[N,P,C,N',P'\] -+ atom 
\[N,P,C,N',P'\] --+ \[N2,P2,C,N',P'\] 
\[N,P,C,N',P'\] -+ \[N2,P,C,N',P'\] 
\[N,P,C,N',P'\] --;. \[N,P2,C,N',P'\] 
\[N,P,C,N',P'\] -~ \[N2,P2,C,N%P2\] 
\[N,P,C,N',P'\] -~ \[N2,P',C,N2,P'\] 
\[N,P,C,N',P'\] --+ \[N',I'2,C,N',I'2\] 
llow these inat)ping principles work can 1)erhaps best he 
claritied by a larger example. Consider the small DAq'R the- 
ory, below, wifich we will use ms an example case throughout 
this paper: 
House: 
<> == Noutl 
<root> == house. 
Sheep: 
<> :: Noun 
<root.> == sheep 
<affix plur> :--- . 
Foot: 
<> == Sheep 
<root> == foot 
<root plur> == feet. 
Noun: 
<orth> --= "<root>" "<affix>" 
<affix sing> == 
<affix sing gen> == s 
<affix plur> == s. 
The application of the mapping rules to the DATR the- 
ory above yields tile following result; (unstantiated vari- 
ables are indicated by bold letters): 
\[Ifouse,<>,{<root>},N',P'\] ~ \[Noun,<>,{<root>},N',P'\] 
\[House,<root>,{},N',P'\] ~ houso 
\[Sheep,<>,{<root>,<atflx plur>},N',P'\] -} 
\[Noun,< >,{ <root >,<affix plur> },N',P'\] 
\[Sheep,<root>,~,N',P '\] --+ sheep 
\[Sheep,<affix plur>,ql,N',P'\] --~ e 
\[Foot,<>,{<root>,<root plur>},N',P'\]--} 
\[Sheep,<>,{<root>,<root plur>},N',P'\] 
\[Foot,<root>,{<plur>},N',P'\] -, foot 
7090 
\[l.bot,<root plur>,(b,N',P'\] ~ feet 
\[Noun,<orth>,~,N',P'\] --* \[N',<root>,~,N',<root>\] 
\[N',<aflix>,(t,N',<aflix>\] 
\[Noun,<allix sing>,{<gen>},N',P'\] -+ e 
\[Nmm,<a\[\[ix sing gen>,~,N',P'\] ~ s 
\[Noun,<aI\[ix plur>,0,N',l"\] --~ s 
The general aim of this (somewhat redundant;) notation 
is 1;o lint everyl;hing that is needed for drawing infm- 
trices from a sentence (especially its global enviromnent 
mM possibly compel;ing clauses al; the same node) into 
t, he rcpresenl;ation of the. sentxmc(; itself. Similar inter- 
hal representations are used in several I)ATII. imple- 
lnentations. 
2 Inference in DATR 
Bol;h sl;mMmd inference a.nd reverse query inference 
can be regarded as COmlflex sul)stil;ul;ion Ol)eral, ions de- 
fined for sequences of DATR terminal and iiolt-l;Crtllinal 
symbols which apply if particular real;thing crit(wia ~rr(: 
sal;istk!d. In case of DATI{. standa.rd procedural Selnan- 
tics, a step of inference is tim substitution of a I)ATt{ 
IlonternfinM by a sequcnt:e of \])A'FR torminal and non- 
ternfinal symbols. The matching criterion applies to a 
givon DAT\]{ query and the left hmld sides of the sen- 
tenets of the 1)A'HI, theory, if the LfIS of a I)ATII 
sentences satisfies the matching criterion, a modified 
vcrsioIl of the right ha.IM side is sttl)sl.il.lll;ed lbr the 
LItS. Since the maL(:hing criterion is such l;hat there is 
at most one sent0.nce in a t)A:.HI theory with a match- 
ing I,HS, DATR standard inDrence is determilfistic mM 
functional. The starting point of DA'FR staiMm:d in- 
ference is single nonterminal a.nd tim derivation process 
terminates if a Se(lUenc('. of I.ernfinals is obl;ailmd (or if 
there is no IAIS in the theory that sa.l;isfics the matching 
criterion, in which case the process of inference termi- 
tortes with a failure). 
In terms of DAq'\]I. roverse query t)rocedural semmt- 
tics, a step of inti;ren(;e is the. substitution of a sub- 
sc;qll{m(:(~ of a given sequence of I)ATR. terminal and 
non-terminal symt)ols by a. I)ATlt non-ternfinal. Tim 
matching criterion applies l,o the subsequence and the. 
right hand sides of the sentences o\[ the DATR theory. 
If the matching criterion is satisfied, a modifie.d ver- 
sion of the LHS of the I)ATlt sentence is substituted 
for the m~tching subsequencc. In contrast to I)A'FI/, 
standard inli!rmm(!, the matching c:riterion is sut:h that 
there might be several I)AT\]/. senl;encos in a given t;hc- 
ory which satisfy il;. DA\[I'II reverse query iM'erence is 
hence neither flmctional, nor deterministic. Starting 
poinI; of a reverse query is a sequence of l;(n:lninals (a 
valll(!). A th',rivati(m (,erminaI;cs, if the substitutions 
finally yield a singh; nonter\]uinal with identical \]oc, al 
and global cnvirolmmnt (or if there are no matching 
sentences in the theory, in which case the dcrivatioil 
fails). 
We now define the inaA;(:hing criteria for I)ATR terminal 
symbols, I)ATI{ nonterminM symbols and sequences of 
DATft symbols. These matching criteria relate extra> 
sional lemlnal;a (i.e. already derived tmrtial analyses) 
to I)ATR definil;ional sentences (i.e. "rules" that may 
yield a fm'tho, r roduction) w.r.t, a given DATR theory 
0. 
A term.thai symbol t, 'matches another tc.r'minal sy'm- 
bolt 2 ifl' t, - t2. We also say that t, rrtatt'Jte.s t.2 
with art arbit~nry suJfi:c and art empty constTnint h, of 
der to provide compatibility with the definitions tbr 
nontermimfls, below. 
1. A nontcrmi'nal IN, 1'1, C1, N', P'\] matches another 
nonto.rminal \[N, 12.2, C2, N', Pq with a s~tf.Jirr E a'nd a 
constraint C2 if (@ H'2 = P~E, &n(l (l)) E s;~|;isfies C1. 
2. A nonterminal IN, P~, C1, N', i"\] match.ca anotlmr 
nont, o.rminal \[N, P.e, C2, N', I"\] with an e.rnpt~/ s'uf/i:c 
a'ttd a constraint a(.\[~,Cu) if (a) P, = I~AI,:, and (b) 
E satisfies C~. 
Example: The non-terminal symbol \[Node, <ab>, 
{<c d e>},Nf,P\[I matches \[Node,<~ b c d>, 
~, N~, l~\] with suffix ,5' = <c d> and constraint ~. 
l?rom the definitions, giwm abovo., we can derive the 
matching criterion for sequences: 
1. The ernpt!/ sequence matches the empty sequence 
with a.n empty suffix and constraint V). 
2. A non-empty sequence of (terminal and non- 
tcrmilml) symbols s'~ ... s',~ (1 < n) matches another se- 
quen(:e of (terminal and non-terminal) symbols s j ... s,, 
with suttix E mM constraint C if 
(a) for ca.all symbol sl (1 < i < n): s{ m~l;cho, s s,. with 
suffix /3 and constradnt Ci, and 
(b) C = C~ u (& ... o C.. 
To put it roughly, this definition requires thai: the sym- 
bols of the sequences match one another with the sarrte 
(possibly eml>ty) suffix. Tho. re'suiting constraint of the 
s('.quence is t, he ration of the constraints of the sylnbols. 
Example: The string of nontcrminal symbols 
\[N1,<a>,C~,N'I,P'I\] \[N2,<x>,C>N'2,P'2\] 
matches \[Nl,<a b>,{<c>,<d>},N'l,P'l\] \[N2,<x b>, 
{ <e> } ,N'2,P'2\] with suffix <b> ~md constraint { <c>, 
<d>, <e>}. :~ 
aThe matching criteria, defined above, do not; cover non- 
t, erminals with evaluable paths, i.e. paths that include (an 
arbitrary nu tuber of possibly recursively e.mbcdded) nonter- 
mimds. The matching cril, erion for nonterminals has to be 
extended in order to account fl)r sLatemcnts with evaluabh~ 
paths: l,et, lit! eval(tt, e, 0) a funcLion I;hat maps a sl;ring 
of I)ATR t, erminal attd nonl, erHlinal symbols (~ = At ... A,, 
on|;o a string of I)NH/. terminals ~' such that (a) each ter- 
minal synfl)ol Ai(I < i < rt) in (~ is mapped onl, o il, self in 
:~, and (b) each nonU'*minal Aj \[Nj, l}, (5'~, Nj, \[j\](l < 
j < rl,) in ~ is mapped onto ell(; se, quence, a~... aj' in c~' 
such t;hat, N'j : l'^j e = aj'.., aj' in 0. ,A, refers to (recur- 
1091 
3 The Algorithm 
Metaphorically, DATR can be regarded as a formal- 
ism that exhibits a context-free backbone 4. In anal-- 
ogy to a eontext-flee phrase structure rule, a DATR 
sentence has a left hand side that consists of exactly 
one non-terminal symbol (i.e. a node-path pair) and 
a right hand side that consists of an arbitrary num- 
ber of non-terminal and terminal symbols (i.e. DATR 
atoms). IIl contrast to context-free phrase structure 
grmmnar, DATR nonterminals are not atomic sym- 
hols, but highly structured complex objects. Addition- 
ally, DATR difli?rs from CF-PSG in that there is not a 
unique start symbol but a possibly infinite set of them 
(i.e. the set of node-path pairs that, taken as the. start- 
ing point of a query, yMd a value). 
Despite these differences, the basic similarity of 
DATR sentences and CF-PSG rules suggests that, in 
principle, any parsing algorithm for CF-PSGs couhl 
be a suitable starting point for constructing a reverse 
query algorithm for DATR. The algorithm adopted 
here is a bottom-up chart parser. 
A chart parser is an abstract machine that performs 
exactly one action. This action is monotonically adding 
items to an abstract data-structure called ehart, which 
might be thought of as a graph with annotated arcs 
(which are also often referred to as edges) or a matrix. 
There are basically two diff'erent kinds of items: 
• inactive items (which represent completed amdyses 
of substrings of the input string) 
• active items (which represent incomplete analyses of 
substrings of the input string) 
if one thinks of a chm't in terms of a graph structure 
consisting of vertices connected by arcs, then an item 
can be defined as a triple (START, END, LABEL), 
where START and END are vertices connected by an 
arc labeled with LABEL. Active and inactive items 
ditfer with respect to the structure of the label, in- 
active items are labeled with a category representing 
the analysis of the substring given by the START and 
END position. An active item is labeled with a cate- 
gory representing the analysis for a substring starting 
at; START and ending at sorne yet unknown position 
X (END < X) and a list of categories that still have to 
sire) DATR path extension (of. Evans & ('azdar 1989a). 
Notice that e has no index and thus has to be the same 
tbr all nonterminals Aj. Let X1 IN, 15, Ct, N', P'\] be 
a nonterminal symbol including an evaluable path PI. Xt 
matches \[N, P'2, C2, N', P'\] with a suffix /3. and a constraint 
(L, if (at eval(Pt, 1,/, 0) = 7r, and (b) \[N, real'. ', C~, N', P'\] 
matches \[N, P'2, C~, N ~,/)q with suffix 15' and constraint C., 
(according to the matching criteria, defined above). 
4The similarity of certain I)ATR sentences and context- 
free phrase structure rules has first been mmltioned in Gilt- 
bon \[1992\]. 
l)e i)roven to he proper analyses of a sequence of con- 
nected substrings starting at END and ending at X. 
For the purpose of processing DATR rather than CI,'- 
PSGs, each active item is additionally associated with 
a path sutfix. Thus an active item has the structure: 
(START,END,CAT0, CATj ... CAT,, SUFFIX) 
Consider the following examples: the inactive item 
(0, 1, \[House,<orth sing>,{<gen>},House,P'\]) 
represents the intbrmation that the substring of the 
input string consisting of the first symbol is the 
vahm of the query House:<orth sing> (with arty ex- 
tensional path suffix, but not gcn) in the global 
environment that consists of the node House and 
some still uninstantiated path P'. The active item 
((),l,\[Noun, <orth>,0,House,P'\], 
\[Itouse,<affix>,O,House,P'\],e) 
represents the information that there is a t)artial anal- 
ysis for a substring of the input string that starts with 
the first symbol and ends somewhere to the right. This 
substring is the value of the query Noun:<orth> within 
the global environment consisting of the node House 
and some uninstantiated glohal path P', if there is a 
substring starting from vertex 1 that turns out to he 
the value of the query Itouse:< a~ix> in the same global 
environment .IIousc:P '. 
The general aim is to get all inactive items la-. 
heled with a start symbol (i.e. a DATR nonterminal 
with identical local and global environment) for the 
whole string which a derivable from the given gram- 
mar. There are different strategies to achieve this. The 
one we have adopted here is hased on a chart-parsing 
algorithm proposed in Kay \[1980\]. 
Here is a brief description of the. procedures: 
• parse is the main proeedm:e that scans the inl)ut , 
increments the pointer to the current chart position, 
and invokes the other procedures 
• reduce searches t;he DATR theory for appropriate 
rules in order to achieve fllrther reductions of inac- 
tiw'~ items 
• add-epsilon applies epsik)n productions 
• complete combines inactive and active items 
• add-item adds items to the chart 
We will now giw'~ a more detailed description of the 
procedures in a pseudo-code notation (the input argu- 
ments of a procedure are given in parentheses after the 
procedure nainc). Since the only chart-modif)ing ot> 
('.ration is carried out as a side effcc.t of the procedure 
add-item, the,'e are no output wdues, at all. 
The procedure parse takes as input arguments a ver- 
tex that indicates the current chart position (in the 
initial state this 1)osition is 0) and the suffix of the 
1092 
input string sUu'ting at this position. As long its the 
re.intoning suItix of tlm inlmt string is n(m-(;mpty, parse 
calls the procedures add-cpsilon, red'ace, and complete, 
ilICI'{~IIR!II|;S I,h{! pointer to l;he currellt, ch;41:l; position, 
and si;m'i,s again with t.he new currelg; vcrLex. 
procedm'e parsc(VEl{~Fl'3X, S I .. • ,% ) 
variables: 
VER3PEX, NEXT-.VER3~EX (integer) 
SI ... Sn (string of t)A'I'\]:L symbols) 
data: A DATR theory 0 
I) egin 
if n > 0 
then 
NEX'\[':VIBIIlI'I+,X :-- VI,;I/I\['EX + 1 
call-proe ad<l:epsilon (VEI{;I'EX) 
call-proc reduce(VEllSl'EX, $1, NI,'XT-VFI/TI,;X) 
call-proe complete(VEI(Pl,;X, $1, NEX'I'-VEIIlFI,;N) 
eall-proe parse.(NEXT-VEIIS\['F,X,S2 ... S,,) 
else add-e.psilon (VEI/flT;X) 
en(i 
The 17r<>ce<hn'e add-cpsilo'n ill,'-;(!l't;s ar<:s For the epsihm 
pr{}(lu{:ti{}ns inLo l;he charL: 
procedure add- c.p,+il<m(V E\]{SI TBX) 
variables: VI,;R/IT;X (integer) 
data: A I)ATR, the,{Try 0 
i}egin 
for-each rule CAT ~ e: in 0 
ca.ll-proc redu{:e(VEl{l'EX, CAT, V\],;II3T;X) 
call-proc (:omplete(Vl,;R'l?EX, CAT, VER\[PIBX) 
end 
The, l}lO(:<~durc 'reduce Lakes all inactive item as tim in- 
l)tll; a,rgumcnL and s{~;~l{;h{!s l,}lO I)ATll, Llmory for tulcs 
thai; have a mat(:hinp; le, fl;-c{>ruer <:at<~g(}ry. t,'or ea,(:h 
such rule f{mn{1, 'rc.d'acc inv{}kes tim lTr{Tc{~<htr{~ add..itcm. 
procedm'e "red'ucc(Vi ,CATI,V2) 
data: A I)ATR theory 0 
begin 
if is-tx~rminal(CA'l't ) 
then 
fi}r-e.aeh rllle 
\[No,Po,Co,N'o,P'o\] > CNI'j ...CAT,, in 0 
call-t}roc a<ld-itcn~(Vl ,V2, \[No ,1 }o ,C{}, N 'o, t' 'o\], 
CATI ... CAT,,,X) 
else 
for-each rule 
\[No,l o, ,0,N 0,I 0\] > CATt ...({A'I',, in 0 
such t;ha.t (JAfl' matches (JATI with snlIix S 
;rod constraint C 
ealt-proc add il;{!in(Vt ,V2, 
\[N{,,I'{,,C u o(S,C0),N'{,,P'(,\], (::A%...CA%,S) 
elld 
The procedure complete takes ;m inat:tiv(~ il;{;nl as a.n 
inpuL ~o',glllllelll; ittK\[ s(};/,7'c}los |;h(! {;h&l'|; for active il;elnS 
whi(:h ('an tTe, c(TmI)leted wiLh it,. 
procedure complete(V1 ,CAT,V2) 
data: A <:hart CII 
begin 
if is-terminal (CAT) 
|;hen for-each a.cl;ive item 
(Vo,V, ,CATo,CAT, CAT2 ... CAT,~,S) in CII 
eall-proc add-item(Vo ,V2,M,CAT2 ... CA'\]?,, ,S) 
else for-each act;iv(', il;em 
(V0 ,V~,\[No,I'o,(-Jo,N'o,I"o\],CAT, ... CAT,, ,S) in CII 
such l;hat (JA'FI lna.t(:hes CAT with consl;rainL (', 
and su\[lix ,S 
eall-proc 
add-iten,(V,},V2,\[No,Po,o(S, G,)U C, 
N',P'\],CAT2 ... Cat,~,S) 
end 
The procedure add.item is t;\[1(.' chart-modifying ope.r- 
al;i{Tn. \[L t,akes an a{%ive item its an inlmt argttnw.tit. 
\[f Lhis acLive i{;em has no 1)ending categories, it L'; re- 
garded as a.n inactiw' item. In this case add-item ins(!rl, s 
a new (:harl enLry for t;he ilxm~, provided il; is not al- 
r('.ady includ{;d in l;he chart, and calls the procedures 
reduce ;rod cornplcl.< If tit(: item is an active item, then 
it; is inserted hfl;o the (:hart;, provided it, is not ah'eady 
inside. 
) i ~ }I procedm'e add-il.cm(V~ ,V:,\[No,l o,(,o,N o,t o\], 
(~A'l'~ ... CAT,,S) 
data: A charl, CII 
begin 
if CAT, ... CA'\[',~ -: e 
the.n 
if ' ~A -, , , ), (V~,V~,\[No,I oS,Co,N o,1 o\]) E CIt 
then end 
,',Is,, (3\[ ::: CII tO (Vt ,V2,\[No,I'~S,Co,N',,,P'o\]) 
else 
if 
(V, ,V~,\[No,Po,Co,N'o,I"o\],CAT~ ... (;AT,,S) (- CII 
then e.nd 
else CH :- CII tO 
(VI,V2,\[N{I ' C ,v, ,},lCv~, ...CAT,,,S) 
end 
4 Cycles 
A hard problem ior I)ATR interpr(~ters are c:vclc,% i.e. 
I)ATI(, statements and sets of I)N.\['I{ statements wlfic, h 
involve r(;(:ursive detiifitions such thai; standard infer- 
ence (71 reverse-query illf(!r(',Iic(~ (\[o(;s i1(7|; necess:u'ily Ix'r- 
ininate afLer a linite mlmber of steps of iMi,rence. Here 
&l'(: SOIlIO (!X~l,IilpleS O\[' cycles: 
• simple cgclc.s: N:<a> -:- <a>. 
• path lc'n.gthcning cycle.s: N:<a> =-- <a a>. 
• paUt .sh.orte'ain.9 C?lclc.s: N:<a a> =-= <a>. 
1093 
While simple cycles have to be considered as semanti- 
cally ill-formed and thus typically occur as typing er- 
rors only, both path lengthening and path shortening 
cycles occur quite frequently in many DATR represen- 
tations. Note that path lengthening cycles turn out to 
be path shortening cycles in the reverse query direc- 
tion, and vice versa. The DATR inference engine can 
be prevented from going lost in path-lengthening and 
path-shortening cycles by a limit on path length. This 
finite bound on path length can be integrated into our 
algorithm by modifying the add-item procedure such 
that only items with a path shorter than the permitted 
maximum path length are added to the chart. 
5 Complexity 
CF-PSG parsing is known to have a cubic complexity 
w.r.t, the length of the input string. Though it is cru- 
cial for our approach that we exploit the CF-backbone 
of DATR for computing reverse queries, this result is 
of no significance, here. I)ATR is %1ring-equivalent 
(Moser 1992d), and ~ISMng-equivalence has also been 
shown for a proper subset of DATR (Langer 1993). 
These theoretical results may a priori outrule DATR 
as an implementation language for large scale real time 
applications, but not as a develot)ment environment for 
prototype lexica which can be transformed into efficient 
task-specific on-line lexica (Andry et al. 1992). With 
a finite bound on path length our algorithm works, in 
practice 5, fast enough to be regarded as a usefifl tool 
for the development of small and medium scale lexica 
in DATR. 
6 Conclusions 
We have proposed an algorithm for the evaluation of 
reverse queries in DATR. This algorithm makes DATR- 
based representations applicable for various parsing 
tasks (e.g. morphological parsing, lexicalist syntactic 
parsing), and provides an important tool for lexicon 
development and evaluation in DATR. 
References 
\[Andry et al. 1992\] F~'an(;ois Andry, Nor- 
man M. lh'aser, Scott McGlashan, Siraon Thorn- 
ton & Nick J. Youd \[1992\]: Making DATR Work 
for Speech: Lexicon Compilation in SUNDIAL. In: 
Cornp. Ling. Vol. 18, No. 3, pages 245-267. 
5A prolog implementation of the algorithm described in this 
paper is freely available as a DOS executable program. Please, 
contact the author for fllrther information. 
\[Cahill 1993\] Lynne J. Cahill: Morphonology in the 
Lexicon. in Sixth Co@fence of the E.uropean Chap- 
ter of the Association for Computational Linguis- 
tics, pages 87-96, 1993. 
\[Evans & Gazdar 1989a\] Roger Evans & Gerald Gaz- 
dar: Inference in DATR. In li'o~tr'th Conference of 
the European Chapter of th, e Association for" Com- 
putational Linguistics, pages 66-71, 1989. 
\[Evans & Gazdar 1989b\] Roger Ewms & Gerald Gaz- 
dar: The Semantics of DATR. In: Anthony G. 
Cohn led.\]: P~vcecdings of th, e Seventh Conference 
of the Society for" the Study of Artificial Intelligence 
and Simulation of Behaviour, pages 79-87,London 
1989, Pitman/Morgan Kaufinann. 
\[Evans k Gazdar (eds.) 1990\] Evans, Roger & Gerald 
Gazdar \[eds.\]: The DATR Papers. Brighton: Uni- 
versity of Susse× Cognitive Science Research Paper 
CSRP 139, 1990. 
\[Gazdar 1992\] Gerald Gazdar: Paradigm I~mction 
Morphology in DATR. In: L. J. Cahill & ll,iehard 
Coates \[eds.\]: Sussex Papers in General and Com- 
putational Linguistics: Presented to the Linguis- 
tic Association of Great Britain Conference at 
Brighton Polytechnic, 6th-8th April 1992. Cogni- 
tive Science Research Paper (CSRP) No. 239. Uni- 
versity of Sussex, 1992, pages 43-54. 
\[Gibbon 1992\] Dafydd Gibbon: ILEX: A linguistic ap- 
proach to computational lexiea. In: Ursula Klenk 
led.\]: Computatio Linguae. AuNiitze zur algorith- 
mischen nnd quantitativen Analyse der Sprache, 
pages 32-53. 
\[Gibbon 1993\] Dafydd Gibbon: Generalised DATR 
for flexible access: Prolog specification. En- 
glish/Linguistics Occasional Papers 8. University of 
Bielefeld. 
\[Gibbon & Ahona 1991\] Dafydd Gibbon & Firmin 
Ahoua: DDATR: un logieiel de traitement 
d'hdritage par ddfaut pour la moddlisation lexi- 
cal. Chiers Ivoriens de Recherche Linguistique (eivl) 
27. Universitd Nationale de C6te d'Ivoire. Abidjan, 
1991, pages 5-59. 
\[Jenkins 1990\] Elizabeth A. Jenkins: Enhancements 
to tbe Sussex Prolog DATR hnplemeutation. In: 
Evans & Gazdar \[eds.\] \[1990\], pp. 41-61. 
\[Kay 1980\] Martin Kay: Algorithm Schemata and 
Data Structures in Syntactic Processing. XEROX, 
Palo Alto. 
\[Kiltmry 1993\] James Kilbury: Para<ligm-Based 
Derivational Morphology. In: Giinther GSrz led.I: 
KONVENS 92. Springer, Berlin etc. 1992, pages 
159-168. 
1094 
\[Langer 1993\] lt;~gen Langer: DATR without nodes 
mM global inheritance. In: Proc. of 4. F;~chtagung 
DcklaTntivc und prozedurab~ Aspcktc d~r @Tnchvcr- 
arbcitung der \])GfS/CI~, University of Iiamburg, 
pages 7\]-76. 
\[Moser 1992a\] I,ionel Moser: DATR Paths as Argu- 
ments. Cognitive Science Research l'al)e.r CSR,P 
216, University of Sussex, Brighton. 
\[Moser 1992b\] Lionel Moser: Lexical Consl;r~thlts in 
/)AT1L Cognitive Science Resea.rch Paper CSRP 
215, University of Susse, x, Brighton. 
\[Moser 1992(:\] Lionel Moser: Evaluation in DATR 
is co-NP-Itard. Cognitive St:ience \[1.ese;~rch P;tper 
CSRP 240, University of Sussex, Brighton. 
\[Moser 1992d\] Lionel Moser: Simulating Turing M~L- 
chines in DATR. Cognitive Scien(:(~ Research Paper 
CSRP 2411, Univ(,rsity of Sussex, Brighton. 
1095 
