Datenbank-DIALOG and the Relevance of Habitability 
Harald Trost* 
DFKI GmbH 
Stuhlsatzenhausweg 3 
D-6600 Saarbrficken, Germany 
Wolfgang Heinz, Johannes Matiasek 
Ernst Buchberger 
~SFAI 
Schottengasse 3, A-1010 Wien, Austria 
j ohn@ai, univi e. ac. at 
1 Introduction 
The paper focusses on the issue of habitability and how it 
is accounted for in Datenbank-DIALOG 1. Examples from 
the area of comparisons and measures--both ilnportant 
for many application domains and non-trivial from a lin- 
guistic point of view--demonstrate how design strategies 
can SUl)port the development of a habitahle system. 
Datenbank-DIALOG is a German language interface to 
relational databases. Since the development of a first 
prototype (1985-88) it has been tested in different en- 
viromnents and continually been improved. Currently, 
in a large field test, Datenbank-DIALOG interfaces to a 
database about AI research in Austria. Questions sent 
by einail 2 are answered automatically. 
The system consists of four main components. The 
scanner breaks up the natural language query into to- 
kens for morphological analysis. The parser l)erforms 
syntactic and semantic analysis creating one or--in case 
of ambiguities--more caseframes containing the query 
representation at the domain level. The interpretation 
of the query is performed in three stages. The mapping 
from domain-level to database-level predicates results in 
a DB-Caseframe, then a linearization step produces the 
Logical Form and finally a syntactic transformation leads 
to an SQL query. The answer is then given directly by 
the DBMS as the result of executing the SQL query. 
2 Habitability 
Experiments with NLIs indicate that the linguistic cov- 
erage of state-of-the-art systems is adequate. Savings in 
training time outweigh problems with queries not han- 
dled. Very important though is that the system behaves 
in a predictable way, i.e. is habitable, so that users can 
learn the types of acceptable queries very fast. Other- 
*at OFAI during development of Datenbank-DIALOG 
IThe (levdopment was carried out at the Austrian Re- 
search Institute for Artificial Intelligence (OFAI)jointly with 
"Software Management GmbH', Vienna and has been spon- 
sored by the Austrian Government within the "Mikroelek- 
tronik F6rderungsprogramm, Schwerpunkt $7". See also: 
H. Trost, E. Buchberger, W. Heinz, Ch. H6rtnagl and J. Ma- 
tiazek. Datenbank-DIALOG -A German Language Interface 
for Relational Databases. In Applied Artificial Intelligence, 
1:181-203, 1987. 
2 Email address: aif orsch~ai, univie, ac. at 
wise, they will either face a continuously high rejection 
rate or--more likely, since humans adapt much better 
than computers--formulate their queries ill an unneces- 
sarily simple and inefficient way. 
Hahitability cannot be judged solely on syntactic cov- 
erage. Queries must be correctly interpreted syntac- 
tically, semantically and pragmatically. While syntac- 
tic coverage depends solely on tile parser, semantic and 
pragmatic coverage must be considered with respect to 
the contents of the database to which the NLI connects. 
The grammar of Datenbank-DIALOG is completely 
domain-independent, designed to make the accepted 
suhlanguage as consistent as l)ossihle. Recent advances 
of linguistic theory were incorporated in its development, 
thus also facilitating implementation and maintenance. 
Two examples for this strategy are the treatment of de- 
terminers and verb-second (V2). 
Using results of Generalized Quantifiers Theory for 
natural language quantifiers (e.g. conservativity) a for- 
mal correspondence between GQ-formulas (representing 
the logical form of a query) and SQL-statements (for- 
mulas over the relational calculus) was estal)lished and 
implemented. This gives a sound theoretical basis for 
semantic interpretation and SQL generation. All exten- 
sional natural language determiners can be handled-- 
matching the extensional nature of databases. 
In German finite verbs occur ill second position in 
main clauses (V2) and in final position in subordinate 
clauses. Ideas from (_;B-Theory are used for a uniform 
treatment. V2 is considered to be the result of a move- 
rnent from an underlying final position in the verb cluster 
to clause initial cornplementizer position. This move- 
ment is iml)lemented as a relation between the "moved" 
finite verb and its trace. In the case of main clauses, 
Vlin is "moved back" to the end of the verb cluster, 
and now the same mechanism applies uniformly. Thus 
both clause types are subject to the same syntactic and 
semantic constraints (which thus need only be stated 
once) and give rise to the same interpretation. 
3 Comparisons 
A central concern ill querying databases are cornparisons 
between various kinds of objects. Comt)arisons involve 
a relation between values associated with a dimension 
and units of measure. Values may be given explicitly or 
implicitly by derivation (thus including superlatives). 
  
 241 
Linguistic means for expressing colnparison vary 
widely. Usually, comparison is associated with grad- 
able adjectives and adverbs in various syntactic con- 
structions: hal ein hgheres Gehalt (Aux+A/NP); verdient 
mehr (V+Adv); rail einem hgheren Gehall (A/PP). The 
interpretation of those expressions should be the same. 
Datenbank-DIALOG uses a compositional semantics and 
separates the lexical item from the underlying semantic 
relation, which may be shared by different words. 
More prol)lerns arise when specifying the value for the 
cornparison: ein hb'heres Gehalt als 20.000,-; ein Gehalt 
yon mehr als 20.000,-; mehr als 20.000,- Gehalt; verdi- 
ent mehr als 20.000,-. The comparative and the value 
may be adjacent or not, and show up a.s PPs, complex 
determiners or adverbial phrases. All these constructions 
map onto the same semantic representation, a relation, a 
value along with a dimension and a unit--thus allowing 
to compare values with different units--and a compared 
object. This assures a uniform semantic treatment. 
In verdicnl mehr als Dr. Haid the value is specified 
only implicitly by referring to the salary of Dr. Haid. 
Despite the different structure of the corresponding SQL 
queries the user will hardly notice this fundarnental dif- 
ference. For a habitable system it is necessary to pro- 
vide solutions to both types of comparisons. Datenbank- 
DIALOG recognizes the different interpretations by the 
semantic type associated with the value of the phrase to 
be cornpared. If the value has the correct dimension, it 
may safely be inserted a.s an argulnent into the cornpar- 
ison relation. Otherwise, Datenbank-DIALOG constructs 
a subquery giving the value by using the dominating 
relation and fitting the comparison object into the "sub- 
ject" slot of the attribute. The resulting structure is 
processed analogously to a top-level query. As a conse- 
quence, anal)hora resolution may be apl)lied enabling Da- 
tenbank-DIALOG to give a correct interpretation of Weri 
verdient mehr als seini Vorgeselzler'? 
Domain predicates need not uniquely deterrnine the 
relation and attributes of a corresponding predicate in 
the database. Datenbank-DIALOG splits the interpreta- 
tion of an utterance into two stages: An interpretation 
in the domain model, i.e. a caseframe, which is then 
mapped (using a translation table) to an interpretation 
in the database lnodel, i.e. a DB-caseframe. 
This approach allows to interpret superficially simi- 
lar queries as quite different SQL queries. Attributes 
with the same meaning stored in different tables (nurse- 
salary vs. doctor-salary) can be treated as well as de- 
rived attributes (salary computed from basic + variable 
salary)--in short, the user should not need to know 
about the actual encoding of information. An interesting 
instance of this prillciple is the interpretation of Wieviele 
Patienten behandelt Dr. Haid. Whereas in one database 
rnodel the imrnber of patients is stored explicitly and 
can be treated analogously to the salary above, other 
database models contain this "attribute" only implicitly: 
the number of patients haz to be computed (counted) by 
the SQL query. To obtain these quite different interpre- 
tations, only a different mapping of the (contents of the) 
argurnent-slot of the predicate Treatment in the transla- 
tion step between domain and databa.se level is required. 
A special case (where irnplicit attributes must be made 
explicit) are queries involving the comparison of two sub- 
queries. This cannot be expressed in a single SQL query. 
A ternporary table has to be created containing the rel- 
evant count-attribute together with information on the 
object hearing that attribute. The actual comparison 
can then be made with the now explicit attribute. 
Most problems with comparatives also occur with su- 
perlatives and are dealt with in an analogous way. 
One interesting phenomenon which has no direct par- 
allel in comparative structures shows in Welcher Arzl, 
der in der Ambulanz arbeilet, verdient am meislen? In 
most cases Who has the highest salary among the doctors 
working in the casualty department? is the most plausi- 
ble interpretation and should be preferred. To produce 
this reading, a kind of copying has to he performed: not 
only must the dominating relation be copied but also the 
restrictions on the subject slot (i.e., on the bearer of the 
attribute) have to be inherited. In Datenbank-DIALOG 
this copying works on the ca.seframe representation, and 
thus is able to handle restrictions resulting from differ- 
ent syntactic constructions, such as the lexicon (Ambu- 
lanzarzl), APs (in der Ambulanz arbeitende Arzl), PPs 
(Arzt aus der Ambulanz), NPs (Arzt der Ambulanz) and 
relative clauses ( Arzt, der in der Ambulanz arbeitet). All 
these constructions end up as modifications in the case- 
frame due to the compositional nature of our approach. 
Thus a unified solution for inheritance of modifiers in 
their various forms is achieved. 
A correct comparison is only possible if compared val- 
ues are of the same dimension and share a unit of 
measure. Differences and incompatibilities may arise 
in different places: frorn special formatting conventions 
(e.g. 20000, 20.000,-, $20), when the user specifies a di- 
mension and unit of measure verbatim (e.g. "10 Meter"), 
from the database, where coinparable columns may be 
a.ssociated with different units of Inca.sure. Datenbank- 
DIALOG solves this problem--by defining a normalized 
form azsociating values with units and transformation 
rules between measures of different units--at the scanner 
level (l)atterns, e.g. (late formats), at the parser level 
(linguistic information to fill the slots in the normalized 
value frame), at the interpretation level (procedures to 
transform constant values from one unit to another), at 
the database level (transformation fimctions of the query 
language). 
4 Summary 
Habitability is a rnost iinl)ortant feature of NLIs. Us- 
ing comparison as examl)le we have shown how the de- 
sign of Datenbank-DIALOG enhances habitability, in par- 
ticular by: giving a uniform interpretation to semanti- 
cally equivalent user queries of different syntactic and 
morphological appearance; enabling users to enter data 
in the form most convenient to them (formatting, unit 
conversion); removing the need for users to know about 
the database representations of the concepts they use 
(domain concepts vs. databa.se relations and attributes, 
implicit flmctions, unit conversion); making ambiguities 
explicit; and incorporating presupl)ositions (relation and 
restriction copying). 
