<?xml version="1.0" standalone="yes"?>
<Paper uid="C80-1086">
  <Title>INTERACTION WITH A LIMITED OBJECT DOMAIN - ZAPSIB PROJECT</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
INTERACTION WITH A LIMITED OBJECT DOMAIN -
ZAPSIB PROJECT
</SectionTitle>
    <Paragraph position="0"> Abstract. The report presents the basic principles of the ZAPSIB project, aimed at the development of a modular series of linguistic processors (L-processors) designed for natural language (NL) interaction with applied data bases. The general structure of the ZAPSIB processors and the functions of the main modules are discussed, as well as the technology of the project, including the problem of adapting a processor to the object domain of the interaction. 1. Basic principles. In launching the project the authors x) were aware of the special requirements of commercial systems, which differ in many respects from the experimental programs developed as their prototypes at the beginning of the applied direction of our NL work.7, 2 This position was implemented in the basic principles of the project, which can be formulated as follows: (a) Giving up the realization of any "generalized" scheme of interaction (an "average" user - an "average" object domain). No scheme of that kind is possible in principle: customers' demands can differ decisively on the main parameters of the interaction, such as  - the degree of restriction of the NL syntax; - contents and complexity of the object domain; - the lexicon size; - the computer's resources; - the efficiency of the L-processor, etc.</Paragraph>
    <Paragraph position="1">  For some of the parameters the limits of those demands can vary by up to 100, 1000 or even 10 000 times. In this spectrum of diversity it is not possible to extract one or two dominant stereotypes: practically every customer needs his own L-processor, adequate to his special conditions and interaction domain.</Paragraph>
    <Paragraph position="2"> This situation determines the strategy of the project: it provides for the development of not one but a series of L-processors with the same general structure, whose basic modules are realized as sequences of successively extended, compatible versions. Implementing this principle is expected to allow a more adequate choice of L-processor configuration for a particular user. [x) The project is being carried out by the A.I. Laboratory of the Computing Center of the Siberian Division of the USSR Academy of Sciences.] (b) Each L-processor is to be partitioned into universal and adaptable parts. The latter covers all the information depending on the domain of application and includes - the data base structure: objects, their attributes and relations; - the lexicon of the interaction domain, including the vocabulary, standard word-complexes and denotations;</Paragraph>
    <Paragraph position="3"> - the syntax of the formal language of the system the L-processor works with.</Paragraph>
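As an illustration only, the three components of the adaptable part listed above could be held in one small declarative record. The sketch below is written in Python, not in the declarative language of the STEND system (which the report does not show), and every name in it (AdaptablePart, ObjectType, the sample "employee" domain, the FIND template) is a hypothetical choice made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectType:
    """One kind of object in the data base, with its attributes and relations."""
    name: str
    attributes: list[str] = field(default_factory=list)
    # relation name -> name of the related object type
    relations: dict[str, str] = field(default_factory=dict)


@dataclass
class AdaptablePart:
    """Hypothetical container for the domain-dependent part of an L-processor."""
    objects: dict[str, ObjectType] = field(default_factory=dict)
    vocabulary: dict[str, str] = field(default_factory=dict)      # word -> domain concept
    word_complexes: list[tuple[str, ...]] = field(default_factory=list)
    target_syntax: dict[str, str] = field(default_factory=dict)   # concept -> output template

    def undefined_concepts(self) -> set[str]:
        """Concepts named by the lexicon but not declared in the data base structure."""
        known = set(self.objects)
        for obj in self.objects.values():
            known.update(obj.attributes)
        return set(self.vocabulary.values()) - known


# Assumed toy domain, used only to exercise the consistency check.
domain = AdaptablePart(
    objects={
        "employee": ObjectType("employee", ["name", "salary"], {"works_in": "department"}),
        "department": ObjectType("department", ["title"]),
    },
    vocabulary={"worker": "employee", "pay": "salary", "boss": "manager"},
    word_complexes=[("computing", "center")],
    target_syntax={"employee": "FIND employee"},
)
print(domain.undefined_concepts())
```

A check of this kind is one plausible service a tuning tool could offer: here it reports that the word "boss" points at a concept ("manager") that the data base structure never declares.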
    <Paragraph position="4"> To specify the adaptable part of an L-processor while "tuning it in" to the object domain, the processor's modules are supplemented with special means. To make the adaptation more effective, the specialist carrying out this process is provided with a high-level declarative language and a set of specialized metaprocessors which compile the "outer" specification into the inner representation. Together these metaprocessors make up the STEND system, which is constructed specially to ensure maximal comfort and effectiveness of the adaptation procedure (fig.1). (c) Shortcomings of the traditional "syntactic analysis → semantic analysis" sequence are well known: - This scheme can process only "syntactically normal" texts. Any violation of the norm (which is the rule rather than the exception for a mass user) leads to failures.</Paragraph>
    <Paragraph position="5"> - In principle this scheme rests on the assumption that a "complete" formal NL model exists. But no such model has been elaborated so far, and most probably none will be available within the next ten years.</Paragraph>
    <Paragraph position="6"> - Even the rather rough approximations of such a model developed recently are cumbersome, expensive and too damaging to efficiency for a commercial-type system. Semantically-oriented analysis of text, based on maximal use of the semantic "foundation" of a message and using syntactic information as locally as possible to eliminate superfluous meanings, seems free of the shortcomings mentioned and much more adequate as a model of the understanding process. 2,3,4</Paragraph>
    <Paragraph position="8"> Fig.1. A module of a ZAPSIB L-processor and the scheme of its adaptation through the STEND system.</Paragraph>
    <Paragraph position="9"> The sphere of application of the approach is at present limited to restricted object domains, and the 'user - applied data base' interface is one of the most topical examples of such a problem.</Paragraph>
    <Paragraph position="10"> For the realization of semantically-oriented analysis, the ZAPSIB L-processors are equipped with special means that make it possible to specify and use detailed data about the interaction domain.</Paragraph>
    <Paragraph position="11"> (d) The main procedure of the analysis is organized as a non-deterministic bottom-up parsing process, one- or multi-variant, depending on the processor version. This organization corresponds optimally to the chosen formal apparatus, which is based on the notion of a component that generalizes the means of dependency and constituent grammars.</Paragraph>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. General scheme of ZAPSIB L-processors
</SectionTitle>
    <Paragraph position="0"> The minor versions of the ZAPSIB L-processors now under development follow the general scheme shown in fig.2.</Paragraph>
    <Paragraph position="1"> The preprocessing module includes - lexical analysis, which decomposes the input text string into words, numbers in various notations and letter-digit denotations; - assemblage of word-complexes, i.e.</Paragraph>
    <Paragraph position="2"> standard combinations of lexemes which are used as a single integral semantic unit at further stages of the analysis (War and Peace, International Federation of Information Processing, etc.).</Paragraph>
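The two preprocessing steps described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the ZAPSIB implementation: the token classes, the regular expression, the WORD_COMPLEXES list and the greedy longest-match strategy are all choices made for this example, and the function names are hypothetical.

```python
import re

# Hypothetical word-complex list; in an L-processor it would come from the domain lexicon.
WORD_COMPLEXES = [
    ("war", "and", "peace"),
    ("international", "federation", "of", "information", "processing"),
]


def classify(token: str) -> str:
    """Assign a token to one of the three classes named in the text."""
    if re.fullmatch(r"\d+(?:\.\d+)?", token):
        return "number"
    if token.replace("-", "").isalpha():
        return "word"
    return "denotation"  # mixed letter-digit strings such as "C80-1086"


def lexical_analysis(text: str) -> list[tuple[str, str]]:
    """Decompose the input string into (class, token) pairs; punctuation is dropped."""
    raw = re.findall(r"\d+\.\d+|[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)
    return [(classify(tok), tok) for tok in raw]


def assemble_word_complexes(tokens, complexes=WORD_COMPLEXES):
    """Replace each known word-complex by a single unit, preferring the longest match."""
    ordered = sorted(complexes, key=len, reverse=True)
    lowered = [tok.lower() for _, tok in tokens]
    result = []
    i = 0
    while i != len(tokens):
        for cx in ordered:
            if tuple(lowered[i:i + len(cx)]) == cx:
                joined = " ".join(tok for _, tok in tokens[i:i + len(cx)])
                result.append(("complex", joined))
                i += len(cx)
                break
        else:
            # No word-complex starts here: keep the token unchanged.
            result.append(tokens[i])
            i += 1
    return result


units = assemble_word_complexes(lexical_analysis("Who wrote War and Peace in 1869?"))
print(units)
```

In this run the three lexemes "War", "and", "Peace" reach the later stages as one unit, while "1869" is passed on as a number.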
    <Paragraph position="3"> The main process operates with a system of rules, each of which is a production of a high-level context-dependent grammar. The system includes special means to control the partial ordering of rule application. The level of the grammar and of the control means depends on the L-processor version. At the module's output one or more (in the case of an ambiguous result of the analysis) acyclic parse graphs are formed.</Paragraph>
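To make the role of rule ordering concrete, here is a deliberately simplified Python sketch: a greedy, single-variant bottom-up reduction in which rules are tried in a fixed priority order. It is not the ZAPSIB grammar formalism, which is context-dependent, may be multi-variant and produces parse graphs. The rule format, the category names and the function reduce_bottom_up are all assumptions made for this example.

```python
# Each hypothetical rule is (priority, right-hand side, left-hand side);
# a lower priority number means the rule is tried earlier.
RULES = [
    (1, ("ADJ", "NOUN"), "NG"),    # adjective + noun -> noun group
    (2, ("NOUN",), "NG"),          # bare noun -> noun group
    (3, ("VERB", "NG"), "QUERY"),  # verb + noun group -> query
]


def reduce_bottom_up(categories, rules):
    """Repeatedly apply the first applicable rule, in priority order, at its leftmost match.

    Termination is only guaranteed for rule sets without cyclic unit rules.
    """
    seq = list(categories)
    ordered = sorted(rules, key=lambda rule: rule[0])
    changed = True
    while changed:
        changed = False
        for _, rhs, lhs in ordered:
            n = len(rhs)
            for i in range(len(seq) - n + 1):
                if tuple(seq[i:i + n]) == rhs:
                    seq[i:i + n] = [lhs]  # replace the matched span by its parent category
                    changed = True
                    break
            if changed:
                break
    return seq


# Categories for an assumed request such as "show large cities".
print(reduce_bottom_up(["VERB", "ADJ", "NOUN"], RULES))
```

If the bare-noun rule were given priority over the adjective-noun rule, the same input would stall at VERB ADJ NG, which is the kind of failure that explicit control over the order of rule application is meant to prevent.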
    <Paragraph position="4"> Postprocessing comprises three stages: - elimination of local ambiguities with the help of the global information about the text meaning accumulated by the end of the parse; - synthesis of the semantic representation of the text according to the parse graph; - generation of the output representation of the text meaning in the formal language of the user's system.</Paragraph>
    <Paragraph position="5"> The model of the interaction domain incorporates all the semantic and pragmatic information that concerns the interaction domain and is necessary for the operation of all the other modules.</Paragraph>
    <Paragraph position="6"> Feedback with the user serves, if necessary, to clarify the user's intentions and to verify the results of the analysis. The ZAPSIB strategy regards turning to the user as an extreme measure, reserved for the most urgent cases.</Paragraph>
    <Paragraph position="7"> Each of the main modules is in turn a complex of modules, and this provides sufficient flexibility and compatibility of different versions of the modules.</Paragraph>
    <Paragraph position="8">  3. Technology of the project. For the development of individual modules as well as "assembled" configurations we use a two-stage technological cycle: (1) creation of a working pilot program in the very high-level language SETL; (2) transfer of the SETL program into the instrumental language (PL/I).</Paragraph>
    <Paragraph position="9"> Such a technology helps to cut the effort spent on developing the universal part of the software by up to a factor of three.</Paragraph>
    <Paragraph position="10"> Special attention in the project is paid to automating the procedure of adapting the L-processor to the user's object domain. The adaptation is expected to be carried out on the pilot "L-processor - data base" tandem by means of the STEND system. 5, 6 Provided with a set of specialized dialogue means, the system makes it possible to carry out the procedure by direct interaction with any of the L-processor modules.</Paragraph>
  </Section>
</Paper>