File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/86/c86-1020_intro.xml
Size: 3,360 bytes
Last Modified: 2025-10-06 14:04:33
<?xml version="1.0" standalone="yes"?> <Paper uid="C86-1020"> <Title>Towards a Dedicated Database Management System for Dictionaries</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> As the means of natural language processing are gradually reaching a stage where the realisation of large-scale projects like EUROTRA becomes more and more feasible, the demand for lexical databases increases. Unfortunately, this demand is not easy to meet, because lexical databases are exceedingly expensive. The two main reasons for this are the following: * The manual labour involved in the coding of entries is time-consuming.</Paragraph> <Paragraph position="1"> * The possibilities for taking over or cumulating existing machine-readable dictionaries are rather limited, because existing dictionaries usually contain only a part of the information needed for a certain project. Severe consistency problems and the need for manual post-editing are the result of this (-> [Hess et al. 1983]).</Paragraph> <Paragraph position="2"> As long as there is no general agreement on the kind of information which should be stored in a dictionary, and therefore no universally applicable lexical database, we will have to live with these problems. The important question for the time being is whether we can alleviate them. This paper argues that the best way to do that is to construct a dedicated database management system (DBMS). It presents a prototype proposal which was conceived in a doctoral thesis [Domenig 1986] and which is the basis for a project that ISSCO has recently started in conjunction with the Swiss National Fund.</Paragraph> <Paragraph position="3"> Because of the limited space at our disposal, we will mainly explain the most uncommon feature of the system, its morphological capabilities.
We will not go into all of the monitoring and manipulation functions which alleviate the task of lexicography. The reader may infer the potential for them, however, if he remembers the following fact: as both the 'static' and the 'dynamic' information about entries (features and morphological processes, respectively) are coded within the system, they can both be accessed and controlled quite easily.</Paragraph> <Paragraph position="4"> 2. The requirements for a lexical database In our opinion, a lexical database should not be a mere collection of 'static' data, i.e. a set of morphemes with associated features. It should comprise morphological processes which enable it to serve as a real-time word-analyser used in a message-switching environment (e.g. a local area network). Moreover, the DBMS should control the consistency of the data as far as possible, so that only plausible combinations of features and morphological processes can be associated with entries. This differs very much from the 'traditional' concept of lexical databases, where the entries consist of strings with associated features and the morphological interpretation is done outside of the database in a program. Naturally, the control over consistency is much more efficient and also easier to maintain if both 'static' and 'dynamic' information are coded within the database.</Paragraph> <Paragraph position="6"/> </Section> </Paper>