INTERACTION WITH A LIMITED OBJECT DOMAIN - 
ZAPSIB PROJECT 
A.S.Narin'yani 
Computing Center, Siberian Branch USSR Ac. Sci. 
630090 Novosibirsk, USSR 
Abstract. The report presents the basic
principles of the ZAPSIB project, which
aims at developing a modular series of
linguistic processors (L-processors)
designed for natural language (NL)
interaction with applied data bases. The
general structure of the ZAPSIB processors
and the functions of the main modules are
discussed, as well as the technology of
the project, including the problem of
adapting a processor to the object domain
of the interaction.
1. Basic principles
In launching the project the authors x)
were aware of the special requirements of
commercial systems, which differ in
principle in many respects from the
experimental programs developed as their
prototypes at the beginning of the applied
direction of our NL work. 1, 2
This position is embodied in the basic
principles of the project, which can be
formulated as follows:
(a) Giving up the realization of any
"generalized" scheme of interaction (an
"average" user - an "average" object
domain). No scheme of that kind is
possible in principle: customers' demands
can differ decisively in the main
parameters of the interaction, such as
- the degree of restriction of the NL
syntax;
- the contents and complexity of the
object domain;
- the lexicon size;
- the computer's resources;
- the efficiency of the L-processor, etc.
For some of the parameters the limits of
those demands can vary by a factor of 100,
1000 or even 10 000. In this spectrum of
diversity it is not possible to single out
one or two dominant stereotypes:
practically every customer needs his own
L-processor, adequate to his special
conditions and interaction domain.
This situation determines the strategy of
the project: it plans the development of
not one but a series of L-processors with
the same general structure, whose basic
modules are realized as sequences of
successively extended and compatible
versions. Implementing this principle is
expected to allow a more adequate choice
of L-processor configuration for a
particular user.
x) The project is being carried out by the
A.I. Laboratory of the Computing Center of
the Siberian Div. of the USSR Acad. Sci.
(b) Each L-processor is to be partitioned
into a universal and an adaptable part.
The latter covers all the information
depending on the domain of application and
includes
- the data base structure: objects, their
attributes and relations;
- the lexicon of the interaction domain,
including the vocabulary, standard
word-complexes and denotations;
- the syntax of the formal language of the
system the L-processor works with.
To specify the adaptable part of an
L-processor while "tuning" it to the
object domain, the processor's modules
are supplemented with special means. To
make the adaptation more effective, the
professional carrying out this process is
provided with a high-level declarative
language and a set of specialized
metaprocessors which compile the "outer"
specification into the inner
representation.
The complex of these metaprocessors
composes the STEND system, which is
constructed specially to ensure maximal
comfort and effectiveness of the
adaptation procedure (fig.1).
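The relation between the "outer" specification and the inner
representation can be illustrated by a minimal Python sketch. Everything
in it is hypothetical: the specification format, the names OUTER_SPEC and
compile_spec, and the toy domain are assumptions made for illustration
and are not the declarative language or the metaprocessors of STEND.

```python
# Hypothetical outer specification of the adaptable part for a toy domain.
OUTER_SPEC = {
    # Data base structure: objects and their attributes.
    "objects": {
        "employee": ["name", "salary", "department"],
        "department": ["title", "head"],
    },
    # Lexicon of the interaction domain: concept -> words that denote it.
    "lexicon": {
        "employee": ["employee", "worker"],
        "salary": ["salary", "pay", "wage"],
        "department": ["department", "division"],
    },
}


def compile_spec(spec):
    """Compile the outer specification into lookup tables.

    The returned tables stand in for the inner representation that the
    universal part of a module would consult at run time.
    """
    word_to_concept = {}
    for concept, words in spec["lexicon"].items():
        for word in words:
            word_to_concept[word.lower()] = concept

    attribute_owners = {}
    for obj, attributes in spec["objects"].items():
        for attribute in attributes:
            attribute_owners.setdefault(attribute, set()).add(obj)

    return {
        "word_to_concept": word_to_concept,
        "attribute_owners": attribute_owners,
    }


INNER = compile_spec(OUTER_SPEC)
```

In this sketch only OUTER_SPEC would change from one object domain to
another; compile_spec and the code that reads INNER play the role of the
universal part.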
(c) Shortcomings of the traditional
"syntactic analysis -> semantic analysis"
sequence are well known:
- This scheme can process only
"syntactically normal" texts. Any
violation of the norm (which is the rule
rather than the exception for a mass
user) leads to failures.
- In principle this scheme is based on the
assumption that a "complete" formal NL
model exists. But no such model has been
elaborated so far, and most probably none
will be available during the next ten
years.
- Even the rather rough approximations of
the model developed recently are
cumbersome, expensive and too damaging to
efficiency for a commercial-type system.
Semantically-oriented analysis of text,
based on maximal utilization of the
semantic "foundation" of a message and
using syntactic information as locally as
possible to eliminate superfluous
meanings, seems free of the shortcomings
mentioned and much more adequate as a
model of the understanding process. 2, 3, 4
[Figure: a module of the ZAPSIB L-processor
with the blocks "Module's input",
"Universal part", "Adaptable part (inner
representation)" and "Module's output";
the STEND system with the blocks
"Specification of adaptable part (outer
representation)" and "Adaptation
metaprocessor".]
Fig.1. A module of a ZAPSIB L-processor
and the scheme of its adaptation
through the STEND system.
The sphere of application of the approach
is limited now to restricted object
domains, and the 'user - applied data
base' interface is one of the most
topical examples of such a problem.
To realize the semantically-oriented
analysis, the ZAPSIB L-processors are
supplemented with special means that make
it possible to specify and use detailed
data about the interaction domain.
(d) The main procedure of the analysis is
organized as a non-deterministic
bottom-up parse process, one- or
multi-variant, depending on the processor
version. This organization corresponds
optimally to the chosen formal apparatus,
based on the notion of component, which
generalizes the means of dependency and
constituent grammars.
2. General scheme of 
ZAPSIB L-processors 
The minor versions of the ZAPSIB
L-processors now under development have
the general scheme shown in fig.2.
Preprocessing module includes
- lexical analysis, which decomposes the
string of input text into words, numbers
in various notations and letter-digit
denotations;
- assemblage of word-complexes, i.e.
standard combinations of lexemes which
are used as an integral semantic unit at
further stages of the analysis (War and
Peace, International Federation of
Information Processing, etc.).
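The two preprocessing steps can be sketched in Python as follows. This
is only an illustration under stated assumptions: the token pattern, the
table WORD_COMPLEXES and the function names are hypothetical and are not
taken from the ZAPSIB implementation.

```python
import re

# Hypothetical table of word-complexes, stored as lower-case word tuples.
WORD_COMPLEXES = [
    ("war", "and", "peace"),
    ("international", "federation", "of", "information", "processing"),
]

# Alternatives are tried left to right: letter-digit denotations (B-12),
# then numbers, then plain words. Anything else is skipped.
TOKEN_RE = re.compile(
    r"(?P<denotation>[A-Za-z]+-?\d[\w-]*)"
    r"|(?P<number>\d+(?:\.\d+)?)"
    r"|(?P<word>[A-Za-z]+)"
)


def lexical_analysis(text):
    """Split the input string into (kind, text) pairs."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


def assemble_word_complexes(tokens, complexes=WORD_COMPLEXES):
    """Merge known word sequences into single ("complex", text) units."""
    ordered = sorted(complexes, key=len, reverse=True)  # longest first
    result = []
    i = 0
    while i < len(tokens):
        for pattern in ordered:
            window = tokens[i:i + len(pattern)]
            if len(window) == len(pattern) and all(
                kind == "word" and text.lower() == expected
                for (kind, text), expected in zip(window, pattern)
            ):
                joined = " ".join(text for _, text in window)
                result.append(("complex", joined))
                i += len(pattern)
                break
        else:
            # No word-complex starts here: keep the token unchanged.
            result.append(tokens[i])
            i += 1
    return result


def preprocess(text):
    return assemble_word_complexes(lexical_analysis(text))
```

For instance, preprocess("Who ordered War and Peace for room B-12 in
1980?") yields "War and Peace" as one unit of kind "complex", "B-12" as
a denotation and "1980" as a number.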
Main process operates with a system of
rules, each of them being a production
realized in a high-level
context-dependent grammar. The system
includes special means to control the
partial ordering of rule application. The
level of the grammar and of the control
means depends on the L-processor version.
At the module's output one or more (in
the case of an ambiguous result of the
analysis) acyclic parse graphs are
formed.
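A toy Python sketch of a multi-variant bottom-up reduction is given
below. It rests on several simplifying assumptions that are not from the
project itself: productions are context-free patterns over category
names, the partial ordering of rules is reduced to integer levels,
non-determinism is modelled by exhaustive search, and results are trees
rather than general acyclic graphs. The names Rule and parse are
hypothetical.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    name: str
    level: int       # assumption: rules of a lower level are applied first
    pattern: tuple   # sequence of categories to be reduced
    result: str      # category of the node that replaces them


def parse(items, rules):
    """Return every distinct reduction of `items` to a single node.

    A node is a pair (category, payload); the payload is the word for a
    leaf and a tuple of child nodes otherwise.
    """
    results = []
    seen = set()

    def step(state):
        if state in seen:
            return
        seen.add(state)
        if len(state) == 1:
            results.append(state[0])
            return
        # Collect every place where some rule could be applied.
        reductions = []
        for rule in rules:
            n = len(rule.pattern)
            for i in range(len(state) - n + 1):
                categories = tuple(node[0] for node in state[i:i + n])
                if categories == rule.pattern:
                    reductions.append((rule, i))
        if not reductions:
            return  # dead end: this variant is abandoned
        lowest = min(rule.level for rule, _ in reductions)
        for rule, i in reductions:
            if rule.level == lowest:
                n = len(rule.pattern)
                node = (rule.result, state[i:i + n])
                step(state[:i] + (node,) + state[i + n:])

    step(tuple(items))
    return results


COORD = [Rule("coord", 1, ("ITEM", "AND", "ITEM"), "ITEM")]
AMBIGUOUS = parse(
    [("ITEM", "a"), ("AND", "and"), ("ITEM", "b"),
     ("AND", "and"), ("ITEM", "c")],
    COORD,
)

ORDERED_RULES = [
    Rule("obj_of_obj", 1, ("OBJ", "OF", "OBJ"), "OBJ"),
    Rule("attr_of_obj", 2, ("ATTR", "OF", "OBJ"), "QUERY"),
]
ORDERED = parse(
    [("ATTR", "salary"), ("OF", "of"), ("OBJ", "employee"),
     ("OF", "of"), ("OBJ", "department")],
    ORDERED_RULES,
)
```

Here AMBIGUOUS holds two variants (the two bracketings of "a and b and
c"), while in ORDERED the level-1 rule is applied before the level-2
rule, so the dead-end reduction of "salary of employee" is never tried
and a single QUERY node results.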
Postprocessing comprises three stages:
- elimination of local ambiguities with
the help of the global information about
the text meaning formed by the end of the
parse;
- synthesis of the semantic
representation of the text according to
the parse graph;
- generation of the output representation
of the text meaning in the formal
language of the user's system.
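The three stages above can be pictured with a small Python sketch. All
of it is an assumption made for illustration: readings are plain
dictionaries, the "global information" is just the set of objects
established by the text, and the GET ... OF ... WHERE notation is a
made-up stand-in for the formal language of a user's system.

```python
def eliminate_ambiguity(variants, known_objects):
    """Stage 1: keep readings consistent with the objects of the text."""
    consistent = [v for v in variants if v["object"] in known_objects]
    # If the global information rules out everything, keep all variants.
    return consistent or variants


def synthesize_meaning(reading):
    """Stage 2: build a flat semantic frame from the chosen reading."""
    return {
        "target": reading["attribute"],
        "object": reading["object"],
        "conditions": sorted(reading.get("conditions", {}).items()),
    }


def generate_query(frame):
    """Stage 3: express the frame in a hypothetical query notation."""
    query = f"GET {frame['target']} OF {frame['object']}"
    if frame["conditions"]:
        clauses = " AND ".join(
            f"{key} = {value!r}" for key, value in frame["conditions"]
        )
        query += f" WHERE {clauses}"
    return query


def postprocess(variants, known_objects):
    readings = eliminate_ambiguity(variants, known_objects)
    return [generate_query(synthesize_meaning(r)) for r in readings]


VARIANTS = [
    {"attribute": "head", "object": "department",
     "conditions": {"title": "sales"}},
    {"attribute": "head", "object": "employee"},
]
QUERIES = postprocess(VARIANTS, known_objects={"department"})
```

With these inputs the second reading is discarded at stage 1 and
QUERIES contains the single string GET head OF department WHERE
title = 'sales'.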
Model of interaction domain incorporates
all the semantic and pragmatic
information concerning the interaction
domain that is necessary for the
operation of all other modules.
Feed-back with the user serves, if
necessary, to clarify the user's
intentions and verify the results of the
analysis. The ZAPSIB strategy regards an
appeal to the user as an extreme measure
reserved for the most urgent cases.
Each of the main modules is in its turn 
a complex of modules and this provides 
sufficient flexibility and compatibility 
of different versions of the modules. 
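The idea of assembling a processor from interchangeable module versions
can be sketched in Python as below. The function build_processor and the
stand-in modules are hypothetical; they only show that versions sharing
the same input and output conventions can be swapped without touching
the other modules.

```python
def build_processor(*stages):
    """Chain module versions into one callable configuration."""
    def run(text):
        data = text
        for stage in stages:
            data = stage(data)
        return data
    return run


# Two hypothetical, compatible versions of the same module.
def preprocess_v1(text):
    return text.lower().split()


def preprocess_v2(text):
    # Extended version: additionally strips punctuation from the words.
    words = (word.strip(".,?!") for word in text.lower().split())
    return [word for word in words if word]


def main_process(tokens):
    # Placeholder for the parse: wraps the tokens in a trivial structure.
    return {"tokens": tokens}


def postprocess(parse):
    # Placeholder output in a made-up formal notation.
    return "QUERY(" + ", ".join(parse["tokens"]) + ")"


small = build_processor(preprocess_v1, main_process, postprocess)
larger = build_processor(preprocess_v2, main_process, postprocess)
```

Both configurations accept the same input; small("Show salary?") keeps
the question mark inside the last token, while larger("Show salary?")
removes it.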
3. Technology of the project 
For the development of individual modules
as well as "assembled" configurations we
use a two-stage technological cycle:
(1) Creation of a working pilot program in
the very high-level SETL language;
(2) Transfer of the SETL program into the
implementation language (PL/I).
Such a technology helps to cut the effort
spent on developing the universal part of
the software by up to a factor of three.
Special attention in the project is paid
to automation of the procedure of
adapting the L-processor to the user's
object domain. The adaptation is expected
to be carried out on the pilot
"L-processor - data base" tandem by means
of the STEND system. 5, 6 Provided with a
set of specialized dialogue means, the
system makes it possible to carry out the
procedure by direct interaction with any
of the L-processor modules.
[Figure: blocks "Natural language text",
"Preprocessing", "Main process",
"Postprocessing", "Representation of the
text meaning in the user's system
language", "The user's system" and
"Answer formulation", together with
"Model of interaction domain", "Feed-back
with the user", "Vocabulary", "Lexical
rules" and "Main rules".]
Fig. 2. General scheme of ZAPSIB L-processors.

References 

1. Narin'yani A.S., Yakhno T.M. A
context-dependent grammar with
discontinuous constituents in a bottom-up
analysis system. - In: Interaction with a
computer in natural language. -
Novosibirsk: 1978, pp. 157-165. (In
Russian.)

2. Levin D.Ya., Narin'yani A.S. An
experimental miniprocessor: semantically
oriented analysis. - In: Interaction with
a computer in natural language. -
Novosibirsk: 1978, pp. 223-233. (In
Russian.)

3. Narin'yani A.S. AI Work in the
Computer Center of the Siberian Branch of
the USSR Acad. of Sciences. - In: Machine
Intelligence, Ellis Horwood Ltd. 1979,
V.9.

4. Narin'yani A.S. ZAPSIB linguistic
processors (part 1 - aims of the
project). - Novosibirsk, 1979. - 22 p.
(Preprint / Computing Center, Siberian
Branch, USSR Acad. Sci., 199). (In
Russian.)

5. Narin'yani A.S. ZAPSIB linguistic
processors (part 2 - general scheme and
main modules). - Novosibirsk, 1979. -
48 p. (Preprint / Computing Center,
Siberian Branch, USSR Acad. Sci., 202).
(In Russian.)

6. Levin D.Ya. STEND - a system for the
adaptation of linguistic processors. -
Novosibirsk, 1980. - 29 p. (Preprint /
Computing Center, Siberian Branch, USSR
Acad. Sci., 238). (In Russian.)
