<?xml version="1.0" standalone="yes"?>
<Paper uid="A83-1013">
  <Title>III ANALYSIS</Title>
  <Section position="4" start_page="0" end_page="82" type="metho">
    <SectionTitle>
II SPECIFIC APPROACH
</SectionTitle>
    <Paragraph position="0"> Our objective is to do better than this by making more use of powerful, but still non-domain-dependent semantics in the front-end linguistic analysis.</Paragraph>
    <Paragraph position="1"> Doing this should have two advantages: restraining syntax, and providing a good platform for domain-dependent semantic processing. However, the overall architecture of the front end still follows the Konolige model in maintaining a clear-cut separation between the different kinds of knowledge to be utilised, keeping the bulk of the domain-dependent knowledge in declarative form, and attempting to minimise the consequences of changes in the front end environment, whether of domain or database model, to promote smooth transfers of the front end from one back end database management system to another.</Paragraph>
    <Paragraph position="2"> We believe that there is a lot of mileage to be got from non-task-specific semantic analysis of user requests, because their resulting rich, explicit, and normalised meaning representations are a good starting point for subsequent task-specific operations, and specifically are better than either syntax trees, or the actual input text of e.g. the PLANES approach. Furthermore, since the domain world is (in some sense) a subset of the real world, it is possible to interpret descriptions of it using the same semantic apparatus and representation language as is used by the natural language analyser, which should allow easy and reliable linking of the natural language input words, domain world objects and relationships and data language terms and expressions. Since the connections between these do not appear hard-wired in the lexicon, but are established on the basis of matching rich semantic patterns, no changes at all should be required in the lexicon as the application moves from one domain or database to another, only expansions to allow for the semantic definitions of new words relevant to the new application.</Paragraph>
    <Paragraph position="3"> The approach leads to an overall front end structure as follows:  Each process in the diagram above operates on the output of the previous one. Processes 1 and 2 constitute the analysis phase, and processes 3 and 4 the translation phase. Such a system has essentially been constructed, and is under active test; a detailed account of its components and operations follows.</Paragraph>
    <Paragraph position="4"> For the purposes of illustration we shall use questions addressed to the Suppliers and Parts relational database of (Date 1977). This has three relations with the following structure: Supplier(Sno, Sname, Status, Scity), Part(Pno, Pname, Colour, Weight, Pcity), and Shipments(Sno, Pno, Quantity).</Paragraph>
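    <Paragraph> As a reading aid, the three relations named above can be written down as a small Python table. This is only a sketch: the dictionary layout and the two helper functions are illustrative assumptions and are not part of the system described in the paper, which was implemented in LISP.</Paragraph>
    <Paragraph><![CDATA[
```python
# Schema of the Suppliers and Parts database, exactly as listed in the text.
SCHEMA = {
    "Supplier": ("Sno", "Sname", "Status", "Scity"),
    "Part": ("Pno", "Pname", "Colour", "Weight", "Pcity"),
    "Shipments": ("Sno", "Pno", "Quantity"),
}


def relations_with(attribute):
    """Names of the relations that carry the given attribute (hypothetical helper)."""
    return sorted(name for name, attrs in SCHEMA.items() if attribute in attrs)


def shared_attributes(left, right):
    """Attributes common to two relations, i.e. candidate join columns (hypothetical helper)."""
    return [a for a in SCHEMA[left] if a in SCHEMA[right]]
```
]]></Paragraph>
    <Paragraph> The point of the sketch is that Shipments shares Sno with Supplier and Pno with Part, which is what later allows it to link the other two relations.</Paragraph>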
  </Section>
  <Section position="5" start_page="82" end_page="84" type="metho">
    <SectionTitle>
III ANALYSIS
A. The Analyser
</SectionTitle>
    <Paragraph position="0"> The natural language analyser has been described in detail elsewhere (Boguraev 1979), (Boguraev and Sparck Jones 1982), and only a brief summary will be presented here. It has been designed as a general purpose, domain- and task-independent language processor, driven by a fairly extensive linguistically-motivated grammar and controlled in its operation by variegated application of a rich and powerful semantic apparatus. Syntactically-controlled constituent identification is coupled with the judgemental application of semantic specialists: following the evaluation of the semantic plausibility of the constituent at hand, the currently active processor either aborts the analysis path or constructs a meaning representation for the textual unit (noun phrase, complementiser, embedded clause, etc.) for incorporation into any larger semantic construct. The philosophy behind the analyser is that syntactically-driven analysis (which is a major prerequisite for domain- and/or task-independence) is made efficient by frequent and timely calls to semantic specialists, which both control blind syntactic backtracking and construct meaning representations for input text without going through the potentially costly enumeration of intermediate syntactic trees. The analyser can therefore operate smoothly in environments which are syntactically or lexically highly ambiguous.</Paragraph>
    <Paragraph position="1"> To achieve its objectives the program pursues a passive parsing strategy based on semantic pattern matching of the kind proposed by (Wilks 1975). Thus the semantic specialists work with a range of patterns referring to narrower or broader word classes, all defined using general semantic primitives and ultimately depending on formulae which use the primitives to characterise individual word senses. However, the application of patterns in the search for input text meaning is more effectively controlled by syntax in this system than in Wilks'.</Paragraph>
    <Paragraph position="2"> The particular advantages of the approach in the database application context are the powerful and flexible means of representing linguistic and world knowledge provided by the semantic primitives, and the ease with which 'traps for the unexpected' can be procedurally encoded. The latter means that the system can readily deal with the kinds of problems generated by unconstrained natural language text which provoke untoward 'ripple' effects when large semantic grammars are modified. The semantic primitive foundation for the analyser provides a good base for the whole front end, since the comprehensive inventory of primitives can be exploited to characterise both natural language and data language terms and expressions, and to reconcile the user's view of the database domain with the actual administrative organisation of the database.</Paragraph>
    <Paragraph position="3"> For present purposes, the form and content of the outputs of the natural language analyser are more important than the means by which they are derived (for these see Boguraev and Sparck Jones 1982). The meaning representations output by the analyser are dependency structures with clusters of case-labelled components centred around main verb or noun elements. Apart from the structure of the dependency tree itself, and group identifying markers like 'ins' and 'modality', the substantive information in the meaning representation is provided by the case labels, which are drawn from a large set of semantic relation primitives forming part of the overall inventory of primitives, and by the semantic category primitive characterisations of lexically-derived items.</Paragraph>
    <Paragraph position="4"> The formulae characterising word senses may be quite rich. The fairly straightforward characterisation of 'supplier1', representing one sense of &quot;supplier&quot;, is</Paragraph>
    <Paragraph position="6"> meaning approximately that some sort of organisation (which may reduce to an individual) gives entities.</Paragraph>
    <Paragraph position="7"> The meaning representation for the whole sentence &quot;Suppliers live in cities&quot; (with the formulae for individual units abbreviated, for space reasons, to their head primitives) is</Paragraph>
    <Paragraph position="9"> where @@agent and @@location are case labels. &quot;The parts are coloured red&quot; will be analysed as (clause ... (v (be2 ... be (... (part1 ... (@@number ...))) (@@state (... (colour1 ... sign) (val (red1 ... sense))))))), and &quot;Who supplies green parts?&quot; will give rise to the structure: (clause ... (type question) (v (supply1 ... give (@@agent (n (query (dummy)))) ... (trace (clause v agent)) (clause</Paragraph>
    <Paragraph position="11"> (green1 ... sense))))))))))))).</Paragraph>
    <Paragraph position="12"> As these examples show, the analyser's representations combine expressive power with structural simplicity. Further, the power of the semantic category primitives used to identify text message patterns means that it is possible to achieve far more semantic analysis of a question, far earlier in the front end processing, than can be achieved with front ends conforming to the Konolige model. The effectiveness of the analyser as a general natural-language processing device has been demonstrated by its successful application to a range of natural language processing tasks. There is, however, a price to pay, in the database context, for its generality. Natural language makes common use of vague concepts (&quot;have&quot;, &quot;do&quot;), almost content-empty markers (&quot;be&quot;, &quot;of&quot;), and opaque constructions such as compound nouns. Clearly, front ends where domain-specific information can provide leverage in interpreting these input text items have advantages,</Paragraph>
    <Paragraph position="13"> and it is not clear how a principled solution to the problems they present can be achieved within the framework of a general-purpose analyser of the kind described. To provide a domain-specific interpretation of, for example, compounds like &quot;supplier city&quot;, an interface would have to be provided characterising domain knowledge in the semantic terms familiar to the parser, and guaranteeing the provision of explicit structural characterisations of the text constituent which would be available for further exploitation by the parser.</Paragraph>
    <Paragraph position="14"> To avoid invoking domain knowledge in this way in analysis we have been obliged to accept question interpretations which are incomplete in limited respects. That is, we push the ordinary semantic analysis procedures as far as they will go, accepting that they may leave 'dummy' markers in the dependency structure and compound nominals with ambiguous member words and no explicit extracted structure.</Paragraph>
    <Paragraph position="15"> B. The Extractor. While the meaning representations constructed by the natural language analyser are general and informative enough to be able to support different tasks in different applications for different domains, they are not necessarily the best form of representation for question answering, and specifically for addressing a coded database. After the initial determination of question meaning,</Paragraph>
    <Paragraph position="16"> therefore, the question is subjected to task-oriented, though not yet domain- and database-oriented, processing. Imposing domain world and database organisation restrictions on the question at this stage would be premature, since it could complicate or even inhibit possible later inference operations. The idea of providing a system component addressing a general linguistic task, without throwing away any detailed information not in fact needed for some specific instance of that task, like natural language distinctions between quantifiers ignored by the database system, is also an attractive one.</Paragraph>
    <Paragraph position="17"> The extractor thus emphasises the fact that the input text is a question, but carries the detailed semantic information provided by the analyser forward for exploitation in the translation phase of the processing.</Paragraph>
    <Paragraph position="18"> A good way to achieve a question formulation abstracted from the low-level organisation of the database is to interpret the user's input as a formal query. However our extractor, unlike the equivalent processors described by (Woods 1972), (Warren and Pereira 1981) and (Grosz et al 1982), does not make any use of domain-dependent information, but constructs a logic expression whose variable ranges and predicate relationships are defined in terms of the general semantic primitives used for constructing the input question meaning representation. The logic representation of the question which is output by the extractor highlights the search aspects of the input, formalising them so that the subsequent processes which will eventually generate the search specification for the database management system can locate and focus on them easily; at the same time, the semantic richness of the original meaning representation is maintained to facilitate the later domain-oriented translation operations.</Paragraph>
    <Paragraph position="19"> The syntax of the logic representation closely follows that defined by (Woods 1978): (For &lt;quantifier&gt; &lt;variable&gt; / &lt;range&gt; : &lt;restrictions on variable&gt; - &lt;proposition&gt; ), where each of the restrictions, or the proposition, can themselves be quantified expressions. The rationale for such quantified expressions as media for questions addressed towards an abstract database has been discussed by Woods. As we accept this, we have developed a transformation procedure which takes the meaning representation of an input question and constructs a corresponding logic representation in the form just described. Thus for the question &quot;Who supplies green parts?&quot; analysed in Section A, we obtain (For Every $Var1 / query : (For Every $Var2 / part1 : (colour1 $Var2 green1) - (supply1 $Var1 $Var2)) (Display $Var1)),</Paragraph>
    <Paragraph position="20"> where the lexically-derived items indicating the ranges of the quantified variables ('query', 'part1'), the relationships between the variables ('supply1') and the predicates and predicate values ('colour1', 'green1') in fact carry along with them their semantic formulae: these are omitted here, and in the rest of the paper, to save space.</Paragraph>
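    <Paragraph> The nesting of the quantified expression for &quot;Who supplies green parts?&quot; may be easier to follow as a data structure. The sketch below is an illustration only: the class names Pred and For, the field names, and the two traversal functions are assumptions introduced here, and the reading of the inner quantified expression as the restriction on $Var1 is one interpretation of the printed form. Semantic formulae are omitted, as in the text.</Paragraph>
    <Paragraph><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Pred:
    """An atomic predication such as (supply1 $Var1 $Var2)."""
    name: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class For:
    """(For <quantifier> <variable> / <range> : <restriction> - <proposition>)."""
    quantifier: str
    variable: str
    var_range: str
    restriction: Optional[Union["For", Pred]]
    proposition: Union["For", Pred]


def quantified_variables(expr):
    """List (variable, range) pairs, outermost quantifier first."""
    if expr is None or isinstance(expr, Pred):
        return []
    return ([(expr.variable, expr.var_range)]
            + quantified_variables(expr.restriction)
            + quantified_variables(expr.proposition))


def atomic_predications(expr):
    """List the atomic predications in restriction-then-proposition order."""
    if expr is None:
        return []
    if isinstance(expr, Pred):
        return [expr]
    return atomic_predications(expr.restriction) + atomic_predications(expr.proposition)


# "Who supplies green parts?" as given in the text.
WHO_SUPPLIES_GREEN_PARTS = For(
    "Every", "$Var1", "query",
    For("Every", "$Var2", "part1",
        Pred("colour1", ("$Var2", "green1")),
        Pred("supply1", ("$Var1", "$Var2"))),
    Pred("Display", ("$Var1",)),
)
```
]]></Paragraph>
    <Paragraph> Walking the structure recovers the two variable ranges ('query', 'part1') and the three predications, which are the items the later translation stages have to connect to domain concepts.</Paragraph>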
    <Paragraph position="21"> The extractor is geared to seek, in the analyser's dependency structures, the simple propositions (atomic predications) which make up the logic representation. Following the philosophy of the semantic theory underlying the analyser design, these simple propositions are identified with the basic messages, i.e. semantic patterns, which drive the parser and are expressed in the meaning representations it produces as verb and noun group clusters of case-related elements. In order to 'unpack' these, the extractor looks for the sources of atomic predicates as 'SVO' triples, identifiable by a verb (or noun) and its case role fillers, which can be extracted quite naturally in a straightforward way from the dependency structure.</Paragraph>
    <Paragraph position="22"> Depending both on the semantic characterisation of the verb and its case arguments, and on the semantic context as defined by the dependency tree, the triples are categorised as belonging to one of two types: [$Obj $Link $Obj], or [$Obj $Poss $Prop], where the $Obj, $Link, or $Prop items are further characterised in semantic terms. It is clear that the 'basic messages' that the extractor seeks to identify as a preliminary step to constructing the logic representation define either primitive relationships between objects, or properties of those same objects. Thus the meaning representation for &quot;part suppliers&quot; will be unpicked as a 'dummy' relationship between &quot;suppliers&quot; and &quot;parts&quot;, i.e. as [$Obj1(supplier1) $Link1(dummy) $Obj2(part1)],</Paragraph>
    <Paragraph position="23"> while &quot;green parts&quot; will be interpreted as</Paragraph>
    <Paragraph position="25"> Larger constructs can be similarly decomposed: thus &quot;Where do the status 32 red parts suppliers live?&quot; will be broken down into the following set of triples:</Paragraph>
    <Paragraph position="27"> It must be emphasised that while there are parallels between these structures and those of the entity-attribute approach to data modelling, the forms of triple were chosen without any reference to databases. As noted earlier, they naturally reflect the form of the 'atomic propositions', i.e. basic messages, used as semantic patterns by the natural language analyser.</Paragraph>
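    <Paragraph> The two triple types can be sketched as a small record type. Everything in this sketch beyond the &quot;part suppliers&quot; triple quoted in the text is an assumption made for illustration: the Triple class, the kind labels, the helper functions, and in particular the property triple shown for &quot;green parts&quot;, whose printed form is not available here and is guessed from the predication (colour1 $Var2 green1) in the logic representation.</Paragraph>
    <Paragraph><![CDATA[
```python
from typing import NamedTuple


class Triple(NamedTuple):
    kind: str    # "link" for [$Obj $Link $Obj], "prop" for [$Obj $Poss $Prop]
    first: str   # the $Obj the triple is about
    middle: str  # the $Link or $Poss item
    last: str    # the second $Obj, or the $Prop value


def unresolved(triples):
    """Relationship triples whose link is still the 'dummy' marker."""
    return [t for t in triples if t.kind == "link" and t.middle == "dummy"]


def objects(triples):
    """The set of object word senses mentioned in a list of triples."""
    found = set()
    for t in triples:
        found.add(t.first)
        if t.kind == "link":
            found.add(t.last)
    return found


# From the text: [$Obj1(supplier1) $Link1(dummy) $Obj2(part1)]
PART_SUPPLIERS = [Triple("link", "supplier1", "dummy", "part1")]
# Assumed form for "green parts" (not taken from the text).
GREEN_PARTS = [Triple("prop", "part1", "colour1", "green1")]
```
]]></Paragraph>
    <Paragraph> Keeping the 'dummy' link explicit, rather than guessing a relationship at this stage, mirrors the decision to defer domain-specific interpretation to the translator.</Paragraph>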
    <Paragraph position="28"> For completeness, the triples underlying the earlier question &quot;Who supplies green parts?&quot; are</Paragraph>
    <Paragraph position="30"> The sets of interconnected triples are derived from the meaning representations by a fairly simple recursive procedure. The next stage of the extraction process restructures the triples tree into a skeleton quantified structure, the logic representation, to be passed forward to the translator generating the formal query representation. Whenever more explicit information regarding the interpretation of the input as a question can be extracted from the meaning representation, this is incorporated into the logic representation. Thus the processing includes identification and scoping of quantifiers following the approach adopted by Woods, and establishing the aspect, modality and focus of the question. Like anyone else, we do not claim to provide a comprehensive treatment of natural language quantifiers, and indeed in practice have not implemented processes for all the quantifiers handled by LUNAR.</Paragraph>
    <Paragraph position="31"> The logic representation defines the logical content and structure of the information the user is seeking. It may, as noted, be incomplete at points where domain reference is required, e.g. in the interpretation of compound nouns; but it carries along, to the translator, the very large amount of semantic information provided by the case labels and formulae of the meaning representation, which should be adequate to pinpoint the items sought by the user and to describe them in terms suited to the database management system, so they may be accessed and retrieved.</Paragraph>
  </Section>
  <Section position="6" start_page="84" end_page="87" type="metho">
    <SectionTitle>
IV TRANSLATION
</SectionTitle>
    <Paragraph position="0"> A. The Translator. In the process of transforming the semantic content of the user's question into a low-level search representation geared to the administrative structure of the target database, it is necessary to reconcile the user's view of the world with the domain model. Before even attempting to construct, say, a relational algebra expression to be interpreted by the back-end database management system, we must try to interpret the semantic content of the logic representation with reference to the segment or variant of the real world modelled by the database.</Paragraph>
    <Paragraph position="1"> An obvious possibility here is to proceed directly from the variables and predications of the logic representation to their database counterparts. For example,</Paragraph>
    <Paragraph position="3"> can be mapped directly onto a relation Shipments in the Suppliers and Parts database. The mapping could be established by reference to the lexicon and to a schedule of equivalences between logical and database structures.</Paragraph>
    <Paragraph position="4"> This approach suffers, however, from severe problems: the most important is that end users do not necessarily constrain their natural language to a highly limited vocabulary. Even in the simple context of the Suppliers and Parts database, it is possible to refer to &quot;firms&quot;, &quot;goods&quot;, &quot;buyers&quot;, &quot;sellers&quot;, &quot;provisions&quot;, &quot;customers&quot;, etc. In fact, it was precisely in order to bring variants under a common denominator that semantic grammars were employed. We, in contrast, have a more powerful, because more flexible, semantic apparatus at our disposal, capable of drawing out the similarities between &quot;firms&quot;, &quot;sellers&quot;, and &quot;suppliers&quot;, as opposed to taking them as read. Thus a general semantic pattern which will match the dictionary definitions of all of these words is (((*ent obje) give) (subj *org)). Furthermore, if instead of attempting to define any sort of direct mapping between the natural language terms and expressions of the user and corresponding domain terms and expressions, we concentrate on finding the common links between them, we can see that even though the domain and, in turn, database terms and expressions may not mean exactly the same as their natural language relatives or sources, we should be able to detect overlaps in their semantic characterisations.</Paragraph>
    <Paragraph position="5"> It is unlikely that the same or similar words will be used in both natural and data languages if their meanings have nothing in common, even if they are not identical, so characterising each using the same repertoire of semantic primitives should serve to establish the links between the two. Thus, for example, one sense of the natural language word &quot;location&quot; will have the formula (this (where spread)) and the data language word &quot;&amp;city&quot; referring to the domain object &amp;city will have the formula (((man folk) wrap) (where spread)), which can be connected by the common constituent (where spread).</Paragraph>
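    <Paragraph> The idea of connecting two formulae through a common constituent can be illustrated with a very small sketch. It assumes that a formula is encoded as nested Python tuples of primitive names and that two formulae are linked whenever they share a bracketed sub-formula; the pattern matcher in the system described here is more flexible than this, and the function names are hypothetical.</Paragraph>
    <Paragraph><![CDATA[
```python
def subformulae(formula):
    """Yield the formula itself and every bracketed sub-formula inside it."""
    if isinstance(formula, tuple):
        yield formula
        for part in formula:
            yield from subformulae(part)


def common_constituents(a, b):
    """Bracketed sub-formulae that occur in both formulae."""
    return set(subformulae(a)) & set(subformulae(b))


# Formulae quoted in the text, encoded as nested tuples (assumed encoding).
LOCATION = ("this", ("where", "spread"))
CITY = ((("man", "folk"), "wrap"), ("where", "spread"))
```
]]></Paragraph>
    <Paragraph> For the two formulae above, the only shared constituent is (where spread), which is the link named in the text.</Paragraph>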
    <Paragraph position="6">  One distinctive feature of our front end design, the use of general semantics for initial question interpretation, is thus connected with another: the more stringent requirements imposed on natural language to data language translation by the initial unconstrained question interpretation can be met by exploiting the resources for language meaning representation initially utilised for the natural language question interpretation. We define the domain world modelled by the database using the same semantic apparatus as the one used by the natural language front end processor, and invoke a flexible and sophisticated semantic pattern matcher to establish the connection between the semantic content of the user question (which is carried over in the logic representation) and related concepts in the domain world. Taking the next step from a domain world concept or relationship between domain world objects to their direct model in the administrative structure of the database is then relatively easy.</Paragraph>
    <Paragraph position="7"> Since the domain world is essentially a closed world restricted in sets if not in their members, it is possible to describe it in terms of a limited set of concepts and relationships: we have possible properties of objects and potential relationships between them. We can talk about &amp;suppliers and &amp;parts and the important relationship between them, namely that &amp;suppliers &amp;supply &amp;parts. We can also specify that &amp;suppliers &amp;live in &amp;cities, &amp;parts can be &amp;numbered, and so on.</Paragraph>
    <Paragraph position="8"> We can thus utilise, either explicitly or implicitly, a description of the domain world which could be represented by dependency structures like those used for natural language. The important point about these is the way they express the semantic content of whole statements about the domain, rather than the way they label individual domain-referring terms as, e.g. &quot;&amp;supplier&quot; or &quot;&amp;part&quot;. It is then easy to see how the logic representation for the question &quot;What are the numbers of the status 30 suppliers?&quot;,</Paragraph>
    <Paragraph position="10"> can be unpacked by semantic pattern matching routines to establish the connection between &quot;supplier1&quot; and &quot;&amp;supplier&quot;, &quot;number1&quot; and &quot;&amp;number&quot;, and so on. In the same way the logic representations for &quot;From where does Blake operate?&quot; and &quot;Where are screws found?&quot; can be analysed for semantic content which will establish that &quot;Blake&quot; is a &amp;supplier, &quot;operate&quot; in the context of the database domain means &amp;supply, and &quot;where&quot; is a query marker acting for &amp;city from which the &amp;supplier Blake &amp;supplies (as opposed to street corner, bucket shop, or crafts market); similarly, &quot;screw&quot; is an instance of &amp;part and the only locational information associated with &amp;parts in the database in question is the &amp;city where they are stored. All this becomes clear simply by matching the underlying semantic primitive definitions of the natural language and domain world words, in their propositional contexts.</Paragraph>
    <Paragraph position="11"> The translator is also the module where domain reference is brought in to complete the interpretation of the input question where this cannot be fully interpreted by the analyser alone.</Paragraph>
    <Paragraph position="12"> The semantic pattern-matching potential of the translation module can be exploited to determine the nature of the unresolved domain-specific predications (both 'dummy' relationships and those implicit in compound nominals), and vacuously defined objects ('query' variables). Thus the fragment of logical form for &quot;... London suppliers of parts ..&quot;, namely  Apart from the fact that semantic pattern matching seems to cope quite successfully with unexpected inputs ('unexpected' in the sense that in the alternative approach no mapping function would have been defined for them, thus implying a failure to parse and/or interpret the input question), having a general natural language analyser at our disposal offers an additional bonus: the description of the domain world in terms of semantic primitives and primitive patterns can be generated largely automatically, since the domain world can be described in natural language (assuming, of course, an appropriate lexicon of domain world words and definitions) and the descriptions simply analysed as utterances, producing a set of semantic structures which can subsequently be processed to obtain a repertoire of domain-relevant forms to be exploited for the matching procedures.</Paragraph>
    <Paragraph position="13"> B. The Convertor. Having identified the domain terms and expressions, we have a high-level database equivalent of the original English question. A substantial amount of processing has pinpointed the question focus, has eliminated potential ambiguities, has resolved domain-dependent language constructions, and has provided fillers for 'dummy' or 'query' items. Further, the system has established that &quot;London&quot; is a &amp;city, for example, or that &quot;Clark&quot; is a specific instance of &amp;supplier. The processing now has to make the final transition to the specific form in which questions are addressed to the actual database management system. The semantic patterns on which the translator relies, for example defining a domain word &quot;&amp;supplier&quot; as (((*ent obje) give) (subj *org)), while adequate enough to deduce that Clark is a &amp;supplier, are not informative enough to suggest how &amp;suppliers are modelled in the actual database.</Paragraph>
    <Paragraph position="14"> Again, the obvious approach to adopt here is the mapping one, so that, for instance, we have: &amp;supplier ==&gt; relation Supplier; Clark ==&gt; tuple of relation Supplier such that Sname=&quot;Clark&quot;. But this approach suffers from the same limitations as direct mapping from logic representation to search representation; and a more flexible approach using the way the database models the domain world has been adopted.</Paragraph>
    <Paragraph position="15"> In the previous section we discussed how the translator uses an inventory of semantic patterns to establish the connection between natural language and domain world words. This inventory is not, however, a flat structure with no internal organisation. On the contrary, the semantic information about the domain world is organised in such a way that it can naturally be associated with the administrative structure of the target database. For example, in a relational database, a relation with tuples over domains represents properties of, or relationships between, the objects in the domain world. The objects, properties and relationships are described by the semantic apparatus used for the translator, and as they also underlie, at not too great remove, the database structure, the domain world concepts or predications of the query representation act as pointers into the data structures of the database administrative organisation.</Paragraph>
    <Paragraph position="16"> For example, given the relation Supplier over the domains Sname, Sno, Status and Scity, the semantic patterns which describe the facts that in the domain world &amp;suppliers &amp;have &amp;status, &amp;numbers, &amp;names and &amp;live in &amp;cities are cross-linked, in the sense that they have the superstructure of the database relation Supplier imposed over them. We can thus use them to avoid explicit mapping between query data references and template relational structures for the database. From the initial meaning representation for the question fragment &quot;... Clark, who has status 30 ...&quot; through to the query representation, the semantic pattern matching has established that Clark is an instance of &amp;supplier, that the relationship between the generic &amp;supplier and the specific instance of &amp;supplier (i.e. Clark) is that of &amp;name, and that the query is focussed on his &amp;status (whose value is supplied explicitly).</Paragraph>
    <Paragraph position="17"> Now from the position of the query predication (&amp;status &amp;supplier 30) in the characterisation of the relation Supplier, the system will be able to deduce that the way the target database administrative structure models the question's semantic content is as a relation derived from Supplier with &quot;Clark&quot; and &quot;30&quot; as values in the columns Sname and Status respectively.</Paragraph>
    <Paragraph position="18"> The convertor thus employs declarative knowledge about the database organisation and the correspondence between this and the domain world structure to derive a generalised relational algebra expression which is an interpretation of the formal query in the context of the relational database model of the domain. We have chosen to gear the convertor towards a generalised relational algebra expression, because both its simple underlying definition and the generality of its data structures within the relational model allow easy generation of final low-level search representations for different specific database access systems.</Paragraph>
    <Paragraph position="19"> To derive the generalised relational algebra form of the question from the query representation, the convertor uses its knowledge of the way domain objects and predications are modelled in the database to establish a primary or derivable relation for each of the quantified variables of the query representation. These constituents of the algebra expression are then combined, with an appropriate sequence of relational operators, to obtain the complete expression.</Paragraph>
    <Paragraph position="20"> The basic premise of the convertor is that every quantified variable in the formal representation can be associated with some primary or computable relation in the target database; restrictions on the quantified variables specify how, with that relation as a starting point, further relational algebra computations can be performed to model the restricted variable; the process is recursive, and as the query representation is scanned by the convertor, variables and their associated relational algebra expressions are bound by an 'environment-type' mechanism which provides all the necessary information to 'evaluate' the propositions of the query. Thus conversion is evaluating a predicate expression in the context of its semantic interpretation in the domain world and the environment of the database models for its variables.</Paragraph>
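    <Paragraph> The binding of variables to relational expressions can be sketched in a few lines, using the London suppliers and red parts example from this section. The sketch is a simplification under stated assumptions: the two lookup tables stand in for the convertor's declarative knowledge, the domain term &amp;colour and the function names bind and evaluate are hypothetical, and the output is a plain string rather than a real algebra object.</Paragraph>
    <Paragraph><![CDATA[
```python
# Assumed declarative knowledge: which relation models each domain concept,
# and which column models each property of a concept.
CONCEPT_RELATION = {"&supplier": "Supplier", "&part": "Part", "&supply": "Shipments"}
PROPERTY_COLUMN = {("&supplier", "&city"): "Scity", ("&part", "&colour"): "Colour"}


def bind(variable, concept, restrictions, env):
    """Bind a quantified variable to its primary relation, narrowed by each restriction."""
    expr = CONCEPT_RELATION[concept]
    for prop, value in restrictions:
        column = PROPERTY_COLUMN[(concept, prop)]
        expr = f'(select {expr} where {column} equals "{value}")'
    env[variable] = expr
    return expr


def evaluate(predicate, left_var, right_var, env):
    """Evaluate a two-place domain predication in the environment of its variables."""
    link = CONCEPT_RELATION[predicate]
    return f"(join {env[left_var]} (join {link} {env[right_var]}))"


env = {}
bind("$Var1", "&supplier", [("&city", "London")], env)
bind("$Var2", "&part", [("&colour", "red")], env)
EXPRESSION = evaluate("&supply", "$Var1", "$Var2", env)
```
]]></Paragraph>
    <Paragraph> Because each variable carries its own relational expression in the environment, the proposition can be evaluated without consulting a fixed table of question-to-query templates.</Paragraph>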
    <Paragraph position="21"> For example, given the query representation fragment for the phrase &quot;... all London suppliers who supply red parts ...&quot;, namely  $Var1 will initially be bound to the primary relation Supplier, which will subsequently be restricted to those tuples where Scity is equal to &quot;London&quot;. Similarly, $Var2 will be associated with a partial relation derived from Part, for which the value of Colour is &quot;red&quot;. Evaluating the proposition (&amp;supply $Var1 $Var2), whose domain relationship is modelled in the database by Shipments, will in the environment of $Var1 and $Var2 yield the relational expression (join (select Supplier where Scity equals &quot;London&quot;) (join Shipments (select Part where Colour equals &quot;red&quot;))). At this point, the information that the user wants has been described in terms of the target relational database: names of files, fields and columns. The search description has, however, still to be given the specific form required by the back-end database management system. This is achieved by a fairly straightforward application of standard compiling techniques, and does not deserve detailed discussion here. At present we can generate search specifications in three different relational search languages. Thus the final form in the local search language Salt of the example question &quot;Who supplies green parts?&quot; is</Paragraph>
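    <Paragraph>
The compiling step from a generalised algebra expression to a specific search language can be illustrated with a small code generator. The sketch below is not Salt and not one of the three target languages of the system. As an assumption made purely for illustration, it targets SQL and runs the result with Python's sqlite3 module. The nested-tuple encoding of the expression, the functions flatten and to_sql, and the sample rows are all hypothetical.

```python
import sqlite3

# The algebra expression as nested tuples (a hypothetical encoding):
#   ("rel", name) | ("select", source, field, value) | ("join", left, right)
EXPR = ("join",
        ("select", ("rel", "Supplier"), "Scity", "London"),
        ("join", ("rel", "Shipments"),
                 ("select", ("rel", "Part"), "Colour", "red")))


def flatten(expr, tables, conditions):
    """Walk the expression, collecting relation names and equality tests."""
    op = expr[0]
    if op == "rel":
        tables.append(expr[1])
    elif op == "select":
        flatten(expr[1], tables, conditions)
        conditions.append((expr[2], expr[3]))
    elif op == "join":
        flatten(expr[1], tables, conditions)
        flatten(expr[2], tables, conditions)
    else:
        raise ValueError(f"unknown operator: {op}")


def to_sql(expr):
    """Emit one flat SQL query and its parameter list.

    Moving every selection into a single WHERE clause is only safe here
    because each field name occurs in exactly one relation.
    """
    tables, conditions = [], []
    flatten(expr, tables, conditions)
    sql = "SELECT * FROM " + " NATURAL JOIN ".join(tables)
    if conditions:
        sql += " WHERE " + " AND ".join(f"{field} = ?" for field, _ in conditions)
    return sql, [value for _, value in conditions]


# Invented sample data, just enough to run the generated query.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE Supplier (Sno TEXT, Sname TEXT, Scity TEXT);
CREATE TABLE Part (Pno TEXT, Pname TEXT, Colour TEXT);
CREATE TABLE Shipments (Sno TEXT, Pno TEXT);
INSERT INTO Supplier VALUES ('S1', 'Smith', 'London'), ('S2', 'Jones', 'Paris');
INSERT INTO Part VALUES ('P1', 'Nut', 'red'), ('P2', 'Bolt', 'green');
INSERT INTO Shipments VALUES ('S1', 'P1'), ('S1', 'P2'), ('S2', 'P1');
""")

sql, params = to_sql(EXPR)
rows = db.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Because the generator consumes only the algebra expression, a generator for another search language could be written against the same three node types without touching the earlier stages, which is the portability argument made for the generalised algebra form.
    </Paragraph>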
  </Section>
  <Section position="7" start_page="87" end_page="87" type="metho">
    <SectionTitle>
V IMPLEMENTATION
</SectionTitle>
    <Paragraph position="0"> All of the modules have been implemented (in LISP). The convertor is at present restricted to relational databases, and we would like to extend it to other models. The system has so far been tested on Suppliers and Parts, which is a toy database from the point of view of scale and complexity, but which is rich enough to allow questions presenting challenges to the general semantics approach to question interpretation. To illustrate the performance of the front end, we show below the query representations and final search representations for some questions addressed to this database. Work is currently in progress to apply the front end to a different (relational) database containing planning information: this simulates IBM's TQA database (Damerau 1980). Most of the work in this is likely to come in writing the lexical entries needed for the new vocabulary. Longer-term developments include validating each step of the translation by generating back into English, and extending the front end, and specifically the translator, with an inference engine.</Paragraph>
    <Paragraph position="1"> Clearly, in the longer term, database front ends will have to be provided with an inference capability. As Konolige points out, in attempting to insulate users, with their particular and varied views of the domain of discourse, from the actual administrative organisation of the database, it may be necessary to do an arbitrary amount of inferencing exploiting domain information to connect the user's question with the database. An obvious problem with front ends that do not clearly separate different processing stages is that it may be difficult to handle inference in a coherent and controlled way. Insofar as inference is primarily domain-based, it seems natural in a modular front end to provide an inference capability as an extension of the translator. This should serve both to localise inference operations and to facilitate them, because they can work on the partially-processed input question. However, the inference engine requires an explicit and well-organised domain model, and specifically one which is rather more comprehensive than current data models, or than the rather informal conceptual schema we have used to drive the translator.</Paragraph>
    <Paragraph position="2"> We hope to begin work on providing an inference capability in the near future, but it has to be recognised that even for the restricted task of database access, it may prove impossible to confine inference operations to a single module: doing so would imply, for example, that compound nouns will generally only be partly interpreted in the analysis and extraction phases. Starting with inference limited to the translation module is therefore primarily a research strategy for tackling the inference problem.</Paragraph>
    <Paragraph position="3"> * Green parts are supplied by which suppliers?  and (Q1-var2.Scity = &quot;Paris&quot;) and (Q1-var1.Colour = &quot;blue&quot;)</Paragraph>
  </Section>
</Paper>