GOAL ORIENTED PARSING: IMPROVING THE EFFICIENCY OF 
NATURAL LANGUAGE ACCESS TO RELATIONAL DATA BASES 
Giovanni Guida 
Hilan Polytechnic Artificial Intelligence Project 
Politecnico di Milano 
Milan, Italy 
This paper is devoted to present a new appro- 
ach to natural language understanding which is 
called here goal-oriented parsing. The interac- 
tion in natural language with artificial sy- 
stems (robots, data base systems, program gene- 
rators, question-answering systems, etc.) does 
not require in most cases of actual interest 
a full (human like) comprehension of natural 
language in all its details and nuances. A par- 
tial understanding is often enough ,wich ex- 
tracts from the natural language expressions 
the only significant information which is ne- 
cessary to construct a correct formal input for 
the target system. In such a model of comprehen- 
sion the same meaning is assigned to several 
different natural language expressions, thus 
defining a many-to-one mapping between natural 
language sentences and corresponding formal re- 
presentations. We argue that a bounded scope, 
restricted, goal-oriented understanding of natu- 
ral language may greatly increase the efficency 
of representation models and parsing algorithms, 
thus allowing the construction of effective 
systems. This claim is supported by the design 
and implementation of a natural language inter- 
face to a relational data base called NLI and 
developed at the Milan Polytechnic Artificial 
Intelligence Project. In the paper the archi- 
tecture of the system, the linguistic models, 
and the parsing algorithms are presented and 
illustrated through selected examples. Promising 
directions for future research are outlined as 
well. 
l.lntroduction 
The development of natural language under- 
standing systems has received in the last years 
a growing interest, both in the area of compu- 
tational linguistics and artificial intelli- 
gence. In this field a lot of running systems 
have been implemented which are based on diffe- 
rent linguistic models and parsing algorithms. 
This paper is devoted to illustrate NLI, a 
natural language understanding system which has 
been developed at the Milan Polytechnic Arti- 
ficial Intelligence Project for the inquiry 
in Italian of a relational data base. NLI is ba- 
sed on a new approach to natural language under- 
standing called here goal-oriented comprehension. 
It is claimed that the understanding activity 
may be correctly defined only if the purpose 
or goal of the comprehension is clearly specified, 
and that the same sentence may have several dif- 
ferent meanings depending on the purpose for 
which it is considered. It follows that the com- 
plexity of the natural language understanding 
task depends, in addition to the inner linguistic 
or structural complexity of the sentence which is 
considered, on the complexity and extent of the 
scope of the understanding activity, Therefore, 
we can argue that, if one confines the understan- 
ding activity to well defined and bounded goals 
(as it is generally possible in most of the natu- 
ral language processing applications), it should 
be possible to develop ad hoc linguistic models 
and parsing algorithms which are particulary 
fitting and efficent. Such models and algorithms 
will obviously strictly depend on the particular 
goal which is considered and will generally not 
hold for very different scopes. Let us outline 
that goal-oriented understanding implies a boun- 
ded scope, incomplete,and in some way diagonal 
comprehension, in which any detail and nuance 
which is not relevant to the goal is ignored. 
In the design of NLI the classical topic of 
data base inquiry has been chosen as application 
domain and the goal of the understanding process 
has been defined as that one of translating the 
user's natural language queries into the formal 
query language QBE (Query-By-Examplel5). The ado- 
pted liguistic model is strongly semantics based 
and doesn't take into account,as much as possible, 
the syntactic aspects of the sentence to be ana- 
lysed I. The linguistic information needed for 
the parsing is stored in a lexicon and a base of 
the heuristics,and the parsing algorithms are 
strictly dependent on the particular goal which 
is considered.NLl has been so far developed in 
two versions,NLl-I and NLI-2; the latter one, 
which is presently in an advanced development 
--550 
stage, is the subject of this work. 
The paper is organized in the following way: 
in section two the version NLI-I is shortly illu- 
strated and the design specifications and crite- 
ria for NLI-2 are discussed; section three is 
devoted to present the overall architecture of 
the system and the parsing algorithms which have 
been implemented; section four shows some selec- 
ted examples of comprehension; in section five 
some promising directions for future research are 
illustrated together with some conclusive remarks. 
2.Previous work and experimental activity 
In this section the first prototype version 
of the system NLI(NLI-I) is shortly illustrated. 
The functional requirements and the basic techni- 
cal decisions are then presented, on which a se- 
cond version of NLI (NLI-2) is presently being 
developed. 
NLI-I has been designed and implemented at the 
Milan Polytechnic Artificial Intelligence Project 
in the years 1977-784,5 with the following aims: 
- to evaluate through a sample application the 
new model of natural language understanding 
proposed and the designed algorithms for seman- 
tic-based parsingl; 
- to study in a particular case the most critical 
aspects involved by the developent of actual 
applications,such as efficiency, memory occu- 
pations, creation and management of the dictio- 
naries, evaluation~of the understanding capabi- 
lities, etc. 
The inquiry in natural language(Italian) of 
a toy relational data base representing the cata- 
logue of a library has been chosen as an appli- 
cation domain for NLI-I. The architecture of the 
system has been structured in three sequential 
modules. The first module scanns the input sen- 
tence, searches in the vocabulary each word 
which has been recognized through a simplified 
lexical analysis, and generates an internal re- 
presentation of the sentence in which the mea- 
ning of each word and elementary construct is 
embedded. The second module is devoted to reduce 
the ambiguities and to check the correctness of 
the structure proposed (through an algorithm ba- 
sed on set intersectionl); it produces a semantic 
tree which represents the formal internal repre- 
sentation of the input sentence. The third modu- 
le translates the semantic three into an equiva- 
lent sentence expressed in a formal query langua- 
ge for relational data bases (SEQUEL). 
NLI-I has been implemented in FORTRAN first 
on an UNIVAC II00 and, later, on a Digital 
PDP-II/34 computer. It requires about 15 Kbytes 
memory with a lexicon of 200 words. The parsing 
of a simple sentence(a single query of about 
10-15 words) needs a few seconds and more complex 
sentences need up to 20-30 seconds. 
The research activity done with NLI-I has 
highlighted some relevant aspects concerning the 
development and actual implementation of NLI, 
which will be further considered in the design 
of NLI-2. We list some of these below: 
- the necessity of developing an high-level dia- 
gnostic subsystem to be used by the applica- 
tion designer(often,the end-user himself) in 
the tuning activity; 
- the need for an interactive system for the in- 
cremental definition and for the management of 
the vocabulary(to be utilized also in the de- 
sign of a new application versions of NLI); 
- the necessity of refining the parsing algori- 
thms and of defining a base of the heuristics 
to be utilized in the most complex cases of 
comprehension; 
- the problem of the optimal content of the voca- 
bulary (increasing the content of the vocabula- 
ry doesn't always improve the understanding ca- 
pabilities of the system!); 
- the problem concerning the need of inserting 
in the vocabulary all the words which consti- 
tute the content of the data base. 
For greater details about NLI-I the reader 
is referred to the literature 5. 
The new version NLI-2 has been designed in 
1979 on the base of the experience acquired 
through NLI-I. The main methodological assump- 
tions, funcitonal specifications,and technical 
decisions on which NLI-2 is based are illustrated 
below: 
- goal-oriented understanding; 
- semantics directed parsing; 
- hierarchical organization of the linguistic 
information supplied to the system in two mo- 
dules:the lexicon and the base of the heuri- 
stics; 
- development of a generalized parsing algorithm 
independent of the content of the lexicon and 
of the base of the heuristics, of the structure 
of the data base, and of the particular natu- 
ral language which is used; 
- developent of an high-level diagnostic module; 
- design of a fully interactive system for the 
incremental definition and dynamic management 
of the content of the lexicon and of the base 
of the heuristics and for the implementation 
of new versions of the system; 
implementation of the appropriate human engi- 
neering features needed to improving the usa- 
bility of the system. 
--551 - 
The natural language used for the inquiry is 
Italian and the output of the system is the tran- 
slation of the query in QBE (Query-By-Examplel4). 
The data base adopted for the development of the 
sample application of NLI-2 is concerned with 
a department store description and is just the 
same used for the illustration of QBE (a complete 
definition of the data base is reported in the 
Appendix). In designing NLI-2 it has been decided 
of not inserting anything about the content of 
the data base in the lexicon of the system,in or- 
der to keep its dimensions as small as possible. 
This choice implies that the user indicates expli- 
citely(through a pair of special symbols") in the 
input sentence the words which denote values of 
domains of the data base. The adequacy and effec- 
tiveness of this decision will be evaluated du- 
ring the experimental activity to be done with 
NLI-2 and could be removed if considered unappro- 
priate. 
The next section will be entirely devoted to 
the illustration of the global structure of NLI-2 
and of the parsing algorithms which have been 
developed. 
3.System architecture and parsing algorithms 
NLI-2 is based on a modular structure in which 
there is a distinct division between the data(vo- 
cabulary,data base model, formal query language 
structure) and the programs(analyser,generator, 
vocabulary management). The system architecture 
is illustrated in Figure I. 
The monitor is devoted to open the job(it in- 
forms the user about the system capabilities and 
present to him an appropriate menu of different 
options), to interprete the user's commands, and 
to manage the activities of the lower modules. 
The user's control messages accepted by the moni- 
tor are: 
- SHELP : detailed informations and instructions 
on the operation of the system are supplied; 
- SU : the understanding cycle of the natural 
language queries is activated; the system is 
ready to run in U-mode; 
- SV : the module for the management of the voca- 
bulary is activated; the system can be used for 
the definition,updating,and tuning of the voca- 
bulary in the V-mode; 
- $STOP : returns the control to the monitor for 
U-mode and V-mode operations; 
- SEND : closes the job. 
The analyser, which is activated by the moni- 
tor when running in the U-mode,accepts the natu- 
ral language query in input and generates an ap- 
propriate internal representation or, if some 
step of the parsing fails, a diagnostlc message 
to be displayed to the user. The analyser opera- 
tes in a cyclic way: when the parsing of a sen- 
MONITOR 
wl I I 
ANALYSER 
I 
VOCABULARY 
MANAGEMENT 
4 I 
GENERATOR 
I 
VOCABULARY 
J DATA BASE 
MODEL 
FORMAL QUERY LANGUAGE 
STRUCTURE 
Figure 1 - NLI-2 architecture 
552- 
tence is concluded it is automatically reset and 
is ready to accept a new query. Queries are not 
stored by the parser, so that the user is not 
allowed to carry on a dialog with the system in 
which new queries can refer to old ones(or to 
their answers). The module for the vocabulary 
management can be activated by the monitor or 
directly by the analyser when a fail occurs. It 
supplies the user with some basic functions(some 
of them mainly relating to text editing) for a 
flexible management of the vocabulary(search, 
display,delate,add,update,etc.).The generator 
receives in input the internal representation 
produced by the analyser and translates it into 
the formal query language QBE. 
Let us describe now the internal structure 
of the vocabulary. It is composed of two parts: 
the lexicon~ which contains the words and the 
elementary construct referring to the applica.- 
tion domain in which the system operates,and the 
base of the heuristics , which embeds further 
linguistic information(sometimes also of syntac- 
tic nature) necessary to understand complex and 
often ambiguous constructs. 
The lexicon is organized in 26 alphabetic 
groups,each one of them contains all the words 
beginning by the same letter. Each word (or e- 
lementary construct) is bound to all its possi- 
ble meanings,thus yelding a record of the lexi- 
con. A record is in fact composed of two parts: 
- the left part contains a word or a (short) 
sequence of words representing a simple con- 
struct; morphology is (generally)not taken into 
account in the parsing,but the wordsare repre- 
sented in such a way to be recognized in all 
possible forms(conjugated verbs,inflected nouns, 
ect.); 
- the right part embeds the representation of the 
semantics of the word within the application 
domain which is considered. 
The right part of each record has a different 
structure depending on the semantictype to which 
the word stored in the left part belongs. In our 
application a word can denote three different 
types of information concerning: 
the identification of the relations and of 
the domains involved by the query; 
the logical connection between relations and 
domains; 
the specification of the required output. 
Therefore, in relation to the three above pre- 
sented activities, we define the following se- 
mantic types: object, connective, and function, 
respectively. 
The general structure of an object record is: 
li I 111 12222 0 N1.LI=N2.M ~ ..... N1.MI=N2.M ~ 
where: 
- P denotes a word or a simple costruct; 
- 0 denotes that the record is of type object; 
- each pair N~.M~ denotes that the wo~d P refers 
to the domain IN~ in the relation M~ (relations 
and domains are ~epresented by positive inte - 
gers); the = symbol separates equal domains 
belonging to different relations; if P refers 
to a relation without specifying the domain, 
N~ is replaced by the special symbol $;each 
f~eld of the record contains a different mea- 
ning of the object P. 
The structure of a connective record is: 
where: 
P denotes a word or a simple costruct; 
C denotes that the record is of type connecti- 
ve; 
- each pair X. : Y. denotes a possible meaning 
1 . 1 of the connectlve P; namely, X: represents the 
I pattern in which P may appear(position of P and 
type of the adjacent words),Y: denotes(through 
I a pointer) the function which must be applied 
or the action which must be performed during 
the parsing to take correctly into account the 
meaning of the connective P. 
The structure of a function record is: 
where : 
- P denotes a word or a simple construct; 
- F denotes that the record is of type function; 
- each Z. denotes a possible meaning of the fun- 
ction ~ and represents (through a pointer) an 
action which must be performed in the parsing. 
The base of the heuristics is constituted by 
a bag of heuristic rules(of the type precondi- 
tion-action), which allow to represent linguistic 
informations which are not comtained in the lexi- 
con but are still needed for understanding com- 
plex sentences and for the resolution of ambigui- 
ties. The heuristic rules are selected and acti- 
vated during the parsing,whenever a critical si- 
tuation occures,on the base of a pattern direc- 
ted invocation algorithm. 
Let us outline that the vocabulary(both its 
structure and content) is strctly dependent on 
--553-- 
the particular application to which the system 
is devoted. This feature,which represents a quite 
rigid bound to the flexibility of the system, is, 
on the other hand, a straightforward conse- 
quence of the concept of goal-oriented understan- 
ding. The issue of designing system architectures 
which allow a flexible handling of the purpose 
and domain of the comprehension is considered 
as a promising and ambitious topic for future 
research,as it will be illustrated in the con- 
clusions. 
Let us illustrate now the activity of the ana- 
lyser. It accept as input a natural language que- 
ry and supplies an internal formal representation 
of it which is not far from a QBE expression; in 
such away the role of the generator is confined, 
within NLI-2, to a simple editing activity. The 
activity of the analyser can be splitted in four 
steps: 
I. scanning of the natural language input senten- 
ce, search in the lexicon,and generation of 
a first-level internal representation; 
2. parsing,(partial or full) resolution of the 
ambiguities, and generation of the second-le- 
vel internal representation; 
3. intersection,i.e, verification of the consi- 
stency of the proposed structure and resolu- 
tion of the possibly remaining ambiguities; 
4. generation of the correct formal representa- 
tion or, if any fail has occured, of the ap- 
propriate diagnostics. 
In step I. the input sentence is first scanned 
and an internal representation of it is construc- 
ted( first-level internal representation).The 
searching of the words in the lexicon is indexed 
sequencial(with a one-level index); if a word 
or elementary construct is found in the lexicon 
it is replaced by all the record to which it 
refers; on the other hand, if a word is not re- 
cognized, it is enclosed between brackets in the 
first-level internal representation and it is 
successively ignored in the parsing. The words 
which denote values of domains,i.e, which appear 
in the content of the data base, are preceded 
and followed by the special symbol" in the input 
sentence and remain unaltered in the internal 
representation. The first-level internal repre- 
sentation reflects the ordering of the words in 
the input sentence and embeds all the linguistic 
information which can be obtained from the lexi- 
con. 
In step 2. the relations involved by the 
user's query are first recognized through a deta- 
iled analysis of the words of type object. This 
activity cannot yeld, in general, a definite and 
unambiguous result since some objects may refer 
to different relations. A pattern directed invo- 
cation of appropriate heuristics may contribute 
in eliminating some (or also all)of the ambigui- 
ties. After the correct relations are individua- 
ted, their logical schemata are extracted from 
the data base model and a new phase starts aiming 
to individuate the domains referred to in the in- 
put sentence. The possible ambiguities may be re- 
solved through an appropriate use of the heuri- 
stics. The domains which have a role in the 
user's query are then labelled in order to be 
further processed in the following steps of the 
parsing. The words enclosed between pairs of " 
symbols are then considered in order to find the 
appropriateassignement of these values to the 
relating domains. Different criteria may be used 
to perform such an assignement(e.g., the conti- 
guity of a value and an object),but, in any case, 
the assignement is not a definitive one until it 
is confirmed by an heuristic or by the following 
step 3.(intersection), A tentative interpretation 
of the input sentence(second-level internal re- 
presentation) is then produced,which will be 
further refined and completed in the following 
steps of the parsing. 
The analysis of the connectives constitutes 
the kernel of step 3.;it is organized in two 
phases: 
singling out the correct semantics of a connec- 
tive on the base of its position in the senten- 
ce and of the type of the words to which it 
refers(appropriate heuristics have to be utili- 
zed to resolve possible ambiguities); 
verification of the proposed interpretation 
of the sentence segment to which the connecti- 
ve belongs(this activity is performed through 
set intersection algorithms I, what gives the 
reason for the name assigned to step 3.). 
In step 4. the functions~which possibly ap- 
pear in the input sentence, are considered and 
the domains to which they apply are individuated. 
If the parsing terminates correctly the formal 
representation of the sentence is generated which 
will be later translated in QBE by the generator. 
In the following section some complete exam- 
ples of comprehension are illustrated. 
4.Parsing sample sentences 
In this section we are going to present some 
examples of parsing extracted from the sample 
application in which NLI-2 is presently working 
(the department store data base which is descri- 
bed in the Appendix). 
--554-- 
Example I. 
Input sentence: 
<TROVA I REPARTI CHE VENDONO PRODOTTI FORNITI 
DA "PARKER"> (I) 
The first-level internal representation,after 
step I. of the parsing is concluded,results: 
<TROVA/F/PV/ (I) REPARTI/O/I.4=I.2 (CHE) VENDONO 
/C/XISX2,XI=O,X2=V:PA2.2/YISY2,YI=O,Y2=O:PR2 
PRODOTTI/O/2.2=I.3=I.4 FORNITI DA /C/XISX2,XI=O, 
x2=V:PA2.3/YlgY2,YI=O,Y2=C:PR3 "PARKER"> 
In step 2. the relations involved by the user's 
request are first recognized, 
<VENDITE (REPARTO,ARTICOLO) > 
< FORNITORI (ARTICOLO,FORNITORE)> 
and, then, the domains referred to in the input 
sentence are labelled through the letters A and 
B: 
< VENDITE (REPARTO:A,ARTICOLO:B)> 
< FORNITORI (ARTICOLO:B,FORNITORE)> 
The second-level internal representation (propo- 
sing a tentative assignement of values to the 
labelled domains) results: 
< VENDITE (REPARTO:A,ARTICOLO:"PARKER") > 
< FORNITORI (ARTICOLO:"PARKER",FORNITORE)> 
After step 3. we get: 
< VENDITE (REPARTO:A,ARTICOLO:B) > 
< FORNITORI (ARTICOLO:B FORNITORE:"PARKER") > 
and, after step 4., the correct formal represen- 
tation in QBE results: 
VENDITE (REPARTO: P, ARTICOLO:B) 
FORNITORI (ARTICOLO:B, FORNITORE: "PARKER") 
PV is a pointer to a routine which puts the spe- 
cial symbol P in the domain which follows the 
function TR~A. PA is a pointer to a routine 
which assigns X; to domain 2 of relation 2 (in 
the case VENDONO),or to domain 2 of the relation 
3 (in the case FORNITI DA); PA is activated if 
the connective to which it is bound connects an 
object (0) to a particular value of a domain(V), 
as it arrives in both cases VENDONO and FORNITI 
DA. The second meaning of these two connectives, 
(I) find the departments which sell products sup- 
plied by Parker 
represented by the pointer PR, is not considered 
in this parsing. 
Example 2. 
Input sentence: 
<VOGLIO I NOMI DEGLI IMPIEGATI CHE GUADAGNANO 
PIU' DEL LORO DIRIGENT\[> (2) 
First-level internal representation: 
<VOGLIO/F/PV (I) NOMI DEGLI IMPIEGATI/O/I.I 
(CHE) GUADAGNANO/O/2.1 PIU' DEL/F/PM (LORO) 
DIRIGENTE/O/3.1> 
Second-level internal representation: 
< IMPIEGATI (NOME:A, SALARIO:B, DIRIGENTE:C, 
REPARTO) > 
After step 3. we get: 
< IMPIEGATI (NOME:A, SALARIO:B, DIRIGENTE:C, 
REPARTO) > 
In step 4., after the analysis of the functions, 
we have: 
< IMPIEGATI (NOME:P, SALARIO:B, DIRIGENTE:C, 
REPARTO)> 
In this representation there are two domains(SA- 
LARIO and DIRIGENTE) whose role is not yet under- 
stood; this pattern activatesaparticular heuri- 
stics which allows two obtain the following cor- 
rect formal representation: 
IMPIEGATI (NOME:P, SALARIO> B, DIRIGENTE:C, 
REPARTO) 
IMPIEGATI (NOME:C, SALARIO:B DIRIGENTE, 
REPARTO) 
Example 3. 
Input sentence: 
< VORREI I NOMI DELGI IMPIEGATI CHE LAVORANO 
NELLA DIVISIONE "CANCELLERIA" > (3) 
First-level internal representation: 
< VORREI/F/PV (I) NOMI DEGLI IMPIEGATI/O/I.I 
(2) I want to know the names of the employees, 
who earn more than their managers 
(3) I would like to know the names of the emplo- 
yees,who work in the writing-materials department 
--555 
(CHE) (LAVORANO) (NELLA) (DIVISIONE) 
"CANCELLERIA"> 
Second-level internal representation: 
< IMPIEGATI (NOME:"CANCELLERIA", SALARIO, 
DIRIGENTE, REPARTO) > 
After step 3. we get: 
<IMPIEGATI (NOME: "CANCELLERIA", SALARIO, 
DIRIGENTE, REPARTO) > 
Step 4. fails since the assignement of the spe- 
cial symbol P to the domain NOME, which is requi- 
red by the function VORREI (pointer PV), can not 
be executed, being the value "CANCELLERIA" alrea- 
dy present in the domain NOME. The following 
diagnostic message is therefore generated: 
>analisi fallita nella fase 4 
>impossiblit~ di effettuare un corretto assegna- 
>mento del valore "CANCELLERIA" ad un opportuno 
dominio 
>termini ignorati nell'analisi: 
I, CHE, LAVORANO, NELLA, DIVISIONE 
>rappresentazioni interne generate: (4) 
(vedi sopra) 
Let us note that,if we insert in the lexicon the 
new record DIVISIONE/O/I.4=I.2, it is possible 
to obtain the following correct parsing: 
IMPIEGATI (NOME:P, SALARIO,DIRIGENTE, REPARTO: 
"CANCELLERIA") 
Example 4. 
Input sentence: 
<DAMMI I NOMI DEI FORNITORI DI "OROLOGI">(5) 
(4) parsing failed in step 4 
the value"CANCELLERIA" can not be correctly 
assigned to an appropriate domain 
words ignored in the parsing: 
I, CHE, LAVORANO, NELLA, DIVlSIONE 
internal representations generated: 
(see above) 
(5) find the names of the supplyers of watches 
First-level internal representation: 
<DAMMI/F/PV (I) (NOMI) (DEI) FORNITORI/O/2.3 
(DI) "OROLOGI"> 
Second-level internal representation: 
<FORNITORI (ARTICOLO, FORNITORE:"OROLOGI")> 
After step 3.we get: 
<FORNITORI (ARTICOLO, FORNITORE:"OROLOGI")> 
Step 4. fails because of the same reasons as 
in Example 3. In this case, however,the system 
can correctly parse the input sentence only by 
inserting in the lexicon the word OROLOGI which, 
on the other hand,is part of the content of the 
data base. 
5.Conclusions 
In the paper the NLI system has been illustrated, 
with particular attention to the NLI-2 version. 
It is devoted to the inquiry in natural language 
(Italian) of a relational data base. NLI is based 
on a modular architecture which allows the desi- 
gner , or even the user, to define his own ap- 
plication in a fully inte.ractive and incremental 
way. The linguistic model supported by the system 
is that one of goal-oriented understanding,and 
the parsing algorithms adopted are mainly seman- 
tics directed. 
The implementation of NLI-2 is now being com- 
pleted and the system will be used in the future 
for developing the following research acti- 
vities: 
definition of performance evaluation criteria 
for measuring the understanding capabilities 
of natural language systems; 
evaluation of the complexity of the parsing 
algorithms in relation to the complexity of the 
input sentence and of the scope of the compre- 
hension; 
investigation on the flexibility of the system 
in relation to small variations in the structu- 
re and content of the data base and in the ado- 
pted natural language; 
definition of new system architectures allowing 
a full flexibility in relation to the purpose 
of the understanding and to the application 
doamin. 
Appendix 
This Appendix is devoted to present the sample 
relational data base which has been adopted in 
the development of NLI-2. This is concerned with 
the activity of a department store and it is just 
the same utilized in the definition of QBE 15. The 
--556-- 
data base is sketched below through a set of ta- 
bles in which both the names of relations and 
domains and their internal representation(pairs 
of integers) is given. 
I.S IMPIEGATO (I.I NOME, 1.2 SALARIO, 
1.3 DIRIGENTE, 1.4 REPARTO) 
2.$ VENDITE (2.1 REPARTO, 2.2 ARTICOLO) 
3.3 FORNITORI (3.1ARTICOLO, 3.2 FORNITORE) 
4.3 TIPO (4.1ARTICOLO, 4.2 COLORE, 4.3 TAGLIA) 
Acknowledgments 
I am indebt to E. Cuccurullo for the implemen- 
tation of NLI-2 and for stimulating discussions. 

References 

I. Burger J.F, Leal A., Shoshani A. Semantic-ba- 
sed parsing and a natural language interface 
for interactive data managelent. Proc. 13th' 
Conf. of the ACL, Boston, 1975 

2. Codd E.F. A relational model of data for lar- 
ge shared data banks. Comm.ACM 13,6 (1970), 
377-387. 

3. Codd E.F. Seven steps to rendezvous with the 
casual user. IBM Research Report RJ1333, San 
Jos6,Cal. ,1974. 

4. Guida G. Ideas about design of natural langua- 
ge interfaces to query systems. Proc. of a 
Workshop on Natural Lang.uage for Interaction 
with Data Bases, IIASA CP-78-9, Laxemburg, 
Austria, 1978, 265-279. 

5. Guida G. Natural language interfaces to com- 
puter system: An experimental project. Alta 
Frequenza XLVII, 9 (1978), 668-674. 

6. Guida G., Somalvico M. A two level modular 
system for natural language data base query 
applications. Proc. 6th Int. Joint Conf on 
Artificial Intelligence0 Tokio,Japan, 1979, 
345-347. 

7. Harris L.R. Experience with ROBOT in 12 com- 
mercial natural language data base query ap- 
plications. Proc.6th Int. Joint Conf. on Ar- 
tificial Intelligence, Tokio,Japan, 1979, 
365-368. 

8. Hendrix G.G., Sacerdoti E.D., Sagalowicz D., 
Slocum J. Developing a natural language inter- 
face to complex data. ACM Trans. on Data Base 
Systems 3,2 (1978), I05-147. 

9. Kaplan S.J., Mays E., Joshi A.K. A technique 
for managing the lexicon in a natural langua- 
ge interface to a changing data base. Proc. 
6th Int.Joint Conf. on Artificial Intelligen- 
c_e_e,Tokio,Japan, 1979, 463-465. 

I0. Lehmann H.The USL project. Its objectives 
and status. Proc.lnt, Technical Conf. on Re- 
lational Data Base Systems, IBM Scientific 
Center,Bari, Italy, 1976, 7-38. 

II. Mylopoulos J., Borgida A., Cohen ~Roussopo- 
ulos H., Tsotsos J., Wong H. TORUS : A step 
towards bridging the gap between data bases 
and the casual user. Information Systems 2 
(1976), 49-64. 

12. Schank R.C., DeOong G. Purposive understan- 
ding. Machine Intelligence 9, J.E. Hayes, 
D.Michie, L.l.Mikulich (Eds.), Ellis Horwood, 
Chichester, 1979, 459-478. 

13. Waltz D.L. Natural language access to a large 
data base: an engineering approach. Proc. 
4th Int. Joint Conf. on Artificial Intelli- 
gence:Tbilisi, USSR, 1975, 868-872. 

14. Waltz. D.L., Goodman B.A. Writing a natural 
language data base system. Proc. 5th Int, 
Joint on ArtifiCial Intelligence, Cambridge, 
Mass., 1977, 144-150. 

15. Zloof M.M. Query by example. Proc. Nat. Com- 
puter Conf., AFIPS Press, Vol. 44, 1975, 
431-438. 
