<?xml version="1.0" standalone="yes"?> <Paper uid="J87-3008"> <Title>A COMPUTATIONAL FRAMEWORK FOR LEXICAL DESCRIPTION</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> 1. METHODOLOGY </SectionTitle> <Paragraph position="0"> As can be judged from the review in Ingria (1986), there is a wide variety of techniques and sub-systems used for handling lexical information within natural language processing systems. In many systems, particularly experimental ones, the lexicon module is fairly small and rudimentary, as the vocabulary is limited and the research is not primarily concerned with lexical issues. Copyright 1987 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the CL reference and this copyright notice are included on the first page. To copy otherwise, or to republish, requires a fee and/or specific permission.</Paragraph> <Paragraph position="1"> On the other hand, theoretical linguists have often discussed regularities that occur within the lexicon, primarily in the areas of morphology (word structure) and lexical redundancy (generalisations across lexical entries). We have designed a related set of rule-formalisms and structures which embody a linguistically-motivated theory of lexical structure, and have implemented these techniques in software which can serve as a general lexical module within a natural language parsing system. This is of theoretical interest as it presents a computer-tested set of mechanisms which fulfil, in an integrated way, some of the roles that linguists have posited for morphological and lexical rules. From a practical point of view, it defines a software module which is largely rule-driven and so can be tailored to different vocabularies, and perhaps even to various languages. 
Although it has been designed with syntactic parsing as the main intended application, most of the linguistic mechanisms and descriptions are independent of their use within a parser.</Paragraph> <Paragraph position="2"> It is important to bear in mind the distinction between a linguistic mechanism and a linguistic description which uses that mechanism. We have not only developed a related set of formalisms, all with a clear computational interpretation, but have also devised a description of a large subset of English morpho-syntactic phenomena using these formalisms. Although the adequacy of the mechanism and of the description are mutually interdependent, it is important to maintain this distinction when appraising the work reported here, particularly when considering its possible extension to other vocabulary or other languages. Another important issue when considering a computationally feasible system is the question of how to interpret a rule-notation procedurally. Linguistic formalisms tend to be discussed as declarative statements of regularities within the language, and it is not always clear what the appropriate interpretation is when the rules have to be used for processing data. For example, the Feature Co-occurrence Restrictions of Gazdar et al. (1985) define arbitrary logical constraints to which feature-sets (categories) must conform. A computational implementation has (at least) two ways in which these statements could be interpreted -- as recipes for filling in extra features, or as filters for rejecting ill-formed categories (cf. Stanley (1967)). It is not at once apparent whether a linguist writing FCR statements would accept both of these as equally &quot;natural&quot; interpretations. Whatever algorithmic interpretation is chosen for a rule notation, it should be computationally tractable and fairly obvious to the reader. 
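To make the two readings of a co-occurrence restriction concrete, the following Python sketch models a restriction as an antecedent/consequent pair of feature sets. It is not taken from the system described here: the dictionary encoding of categories, the function names and the sample rule are all illustrative assumptions.

```python
# Illustrative sketch only. A category is a dict from feature name to value;
# a restriction says "any category containing ANTECEDENT must contain CONSEQUENT".
# The rule below is a hypothetical example, not a quotation from any grammar.
ANTECEDENT = {"INV": "+"}
CONSEQUENT = {"AUX": "+"}


def matches(category, features):
    """True if every feature-value pair in `features` is present in `category`."""
    return all(category.get(name) == value for name, value in features.items())


def clashes(category, features):
    """True if `category` gives some feature a value other than the one in `features`."""
    return any(
        name in category and category[name] != value
        for name, value in features.items()
    )


def apply_as_filter(category, antecedent, consequent):
    """Filter reading: accept a category only if it already obeys the restriction."""
    return (not matches(category, antecedent)) or matches(category, consequent)


def apply_as_fill(category, antecedent, consequent):
    """Recipe reading: add any missing consequent features.

    Returns a new category, or None if an existing value clashes with the rule.
    """
    if not matches(category, antecedent):
        return dict(category)
    if clashes(category, consequent):
        return None
    return {**category, **consequent}


if __name__ == "__main__":
    underspecified = {"INV": "+"}
    print(apply_as_filter(underspecified, ANTECEDENT, CONSEQUENT))
    print(apply_as_fill(underspecified, ANTECEDENT, CONSEQUENT))
```

The two readings diverge on an underspecified category: under these assumptions the filter rejects a category marked only INV +, whereas the recipe completes it with AUX +. Either reading is tractable, but a notation should make clear which one is intended.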
This has led us, particularly in the area of lexical redundancy rules, to opt for notations which have a very obvious and explicitly defined procedural interpretation.</Paragraph> <Paragraph position="3"> A further methodological question which arises when giving a computational interpretation to declarative statements of lexical regularities is whether a rule notation is best regarded as a notational device which allows the linguist to write more succinct entries, but which is not used directly in the computation of the association between a character string and a lexical entry. In terms of the implementation, this is the question of whether a rule-system is an aid to the entry of data by the linguist (and can be used for some form of pre-processing) or is a mechanism which is used in a more general or efficient look-up procedure.</Paragraph> <Paragraph position="4"> In designing linguistic rule-formalisms, there is traditionally a trade-off between the power of the mechanism and the substance of the linguistic claims or theories embodied in the notation. We have generally opted for fairly powerful techniques, in the interests of achieving generality and flexibility, for two reasons. Firstly, we were not sure initially what facilities would be needed for an adequate description of lexical phenomena in English, and so had to allow scope for experimentation.</Paragraph> <Paragraph position="5"> It would be possible, in the light of regularities within our description, to devise a more restricted set of rule formalisms if this was desired. 
Secondly, we wished to design and implement a set of tools which could be used by computational linguists of a variety of theoretical persuasions and with varying needs, and hence we felt it would be too restrictive to tailor the rule systems to the minimum that our description of English demanded.</Paragraph> <Paragraph position="6"> We shall start by giving an informal description of the overall system, then outline some of the rule systems in more detail, and finally summarise our description of English word-structure.</Paragraph> </Section> </Paper>