<?xml version="1.0" standalone="yes"?> <Paper uid="C90-2055"> <Title>(*) Many people are taking part in the development of SESAME. For the Linguistic aspects: Blandine Gelain, Stéphane Guez, Jean-Michel Liaunet, Fariba Ommani, and Zhengce Peng, and for the Information System and User Interface aspects: Pascal Fischer, Elie Kerbaje, Laurent Lacote, and Arnaud Villemin.</Title> <Section position="3" start_page="0" end_page="318" type="metho"> <SectionTitle> THE INFORMATION SYSTEM ENVIRONMENT </SectionTitle> <Paragraph position="0"> The information system environment is used to design the information system parts of a particular application, which consist essentially of the conceptual schema of the application and the mapping rules between the conceptual level and the relational level of the data base.</Paragraph> <Paragraph position="1"> The conceptual schema is a set of specifications which describe the semantic structure of the data base. It is specified in an entity relationship (ER) model. This model contains the traditional concepts (entity, relationship, property) used in standard design methods (Merise, Yourdon, IDA...). We have chosen to use this type of model instead of a knowledge representation language or a semantic network, in order to ease the implementation of applications by people accustomed to standard data base tools. To respect the purpose of SESAME, it was essential not to fall into the trap where only the designers of SESAME would be able to develop applications. See (Grosz et al. 1987) for a review of existing systems with regard to the problem of portability.</Paragraph> <Paragraph position="2"> We have extended the ER model to include multivalued properties, structured value domains (e.g. the domain date is built from the domains day, month and year), and the generalization/specialization of entity types through the definition of inheritance relations between entity types.
It is possible to specify on a schema the dependencies between entities, which makes the generation of a normalized schema easier for the mapper. It is also on this schema, at the conceptual level, that access rights for confidential data can be specified. They have their counterpart at the linguistic level: only words expressing authorized concepts will be accessible to a particular user.</Paragraph> <Paragraph position="3"> The mapper produces a set of mapping rules, which are rewriting rules that link the conceptual schema with the relational schema of the data base. An additional module contains a description of the specific features of the D.B.M.S. used, in order to fill the gap between standard SQL and the actual SQL.</Paragraph> </Section> <Section position="4" start_page="318" end_page="318" type="metho"> <SectionTitle> THE LINGUISTIC ENVIRONMENT </SectionTitle> <Paragraph position="0"> The linguistic environment is used to build the linguistic knowledge bases of a particular application. The unification grammar formalism is used to describe the lexicon and the grammar (Shieber 1986).</Paragraph> <Paragraph position="1"> A lexicon editor is used to generate the lexicon in the unification grammar formalism. The first source of information used is the conceptual model of the data base: all the concepts have to be associated with words, and the semantics of the natural language interface is the semantics of the conceptual modelling of the application information system. But a natural language query may also contain semantic relations which are not directly expressed in the conceptual schema. These &quot;virtual&quot; semantic relations are defined in the linguistic conceptual schema as rewriting rules on &quot;real&quot; relations from the conceptual schema.
These extensions of the lexicon are made possible by the analysis of a domain corpus.</Paragraph> <Paragraph position="2"> The grammar is described in the unification grammar formalism: each grammatical category can be associated with a feature structure represented as a tree. Syntactic as well as semantic constraints are expressed as constraints on the trees (feature equations) and operated through unification.</Paragraph> <Paragraph position="3"> A grammar rule is made of a rewriting rule and a set of equations which specify the syntactic constraints. Semantic constraints (selection restrictions) are also expressed as feature equations. The values of these features specify the semantic types of the application. The lexicon is described in the same formalism. Each lexical entry is associated with a set of equations which specify the category of the word as well as the value of certain features. The semantics of a natural language query is represented with a logical formalism. The construction principle of the semantic representation is compositionality. Each syntactic rule is associated with equations which express the rules of semantic composition (Moore 1989).</Paragraph> <Paragraph position="4"> The grammar and the lexicon are compiled into a Prolog program. Unification, which is a basic Prolog operation, is thus directly and efficiently used.</Paragraph> <Paragraph position="5"> np :- det, noun, n pp &amp; [np, agr, number] = [noun, agr, number],</Paragraph> <Paragraph position="7"> Simplified descriptions of a grammar rule and a lexicon entry. The linguistic coverage of the grammar and the lexicon is the sub-language of data base query, which includes the processing of expressions concerning the sorting of information, comparisons, etc. The grammar also processes coordination and pronominal reference, and it detects ambiguities. The coverage is large enough that the present grammar should fit any standard application without any major addition.
Only very specialized applications will require significant changes, mainly at the level of noun phrases. This differs from NaturalLink, which only provides a formalism: the semantic grammar and the semantic representation building rules have to be written by the application developer (Texas Instruments 1985).</Paragraph> </Section> <Section position="5" start_page="318" end_page="319" type="metho"> <SectionTitle> THE QUERY ENVIRONMENT </SectionTitle> <Paragraph position="0"> In the SESAME project, we have taken great care with the user interface, which is the only way to provide friendly access to the data base. The query environment provides the user with several powerful functions. If the analysis of a freely typed query succeeds, then the SQL translation is completed after a dialogue with the user in order to specify the form of the answer. If the analysis fails, the list of possible continuations after the failure point is proposed. The user can select a word from this list or type it directly. The remaining part of the sentence, which has not been parsed, is also displayed so that the user can use it directly or edit it to complete the query. The user can also choose to complete the query in guided mode with the help of dynamically synthesized menus. For more information on these techniques, see (Rincel &amp; Sabatier 1989). A graphic query interface has been specified, but not yet implemented.</Paragraph> <Paragraph position="2"> ER-SQL is the query language of the conceptual schema. It is an SQL-like language where joins are replaced by semantic paths. The translation into ER-SQL is completed in two steps. In the first step, the logical form, which contains virtual relations, is translated into an equivalent logical form which only contains real predicates, using the linguistic conceptual schema.
The second step transforms a query expressed in a logical language with quantified variables into a query expressed in an algebraic language (ER-SQL) operating on the conceptual model.</Paragraph> <Paragraph position="3"> The ER-SQL query is translated into a standard SQL query, using the mapping rules generated by the information system environment. The translation process keeps track of the direct link between the conceptual data of the ER-SQL query and the data which will make up the answer when the query is sent to the D.B.M.S. This information is kept in the form of conceptual data / data base data mapping rules. These rules define the semantics of the content of the relational table that makes up the result of the query. The results management environment will use these mapping rules to present and display the results of a query. A writer module refines the query expressed in standard SQL to fit the specific features of the particular D.B.M.S. used for the application.</Paragraph> <Paragraph position="4"> Query environment. A history module provides the user with tools to memorize, organize and manage the queries of a working session and their results. Usually 70% of the queries belong to a small fixed set; SESAME includes tools to manage a library of queries with parameters to be specified by the user, with the advantage that queries in this library are expressed in natural language. A help system, implemented with hypertext tools, can be called at any point in the user interface. This help facility is implemented through a hypertext system integrated into the project's graphic toolbox. The tutorial provides the user with a demo of the product and a learning session, including sketches with choice points so that the user can control the demo.</Paragraph> <Paragraph position="5"> The logical form produced by the natural language interface is translated into a query expressed in ER-SQL.
ER-SQL is the query language of the conceptual schema.</Paragraph> </Section> <Section position="6" start_page="319" end_page="320" type="metho"> <SectionTitle> THE RESULTS MANAGEMENT ENVIRONMENT </SectionTitle> <Paragraph position="0"> The results management environment must be able to take the answers of the D.B.M.S. and to present them in a form that the user can exploit. A broad choice of possibilities is given to the user for the presentation of the results: the environment in which the results are to be exploited and the type of tool used for this exploitation. This module uses the conceptual data / data base data mapping rules to retrieve the information specified at the conceptual level from the relational table which is returned as the result of the query. The results can then be presented in the terms the user chose to express the query, and not with the logical names provided by the data base. This module will make it possible to present the structured domains and the multivalued properties, which do not exist at the relational level.</Paragraph> </Section> </Paper>