<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-1040">
  <Title>Noun Phrasal Entries in the EDR English Word Dictionary</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> The dictionary construction project at the Japan Electronic Dictionary Research Institute, Ltd. (EDR) in Tokyo began in 1986 and is almost certainly the largest lexicon construction project for computational purposes in the world. This paper describes some aspects of the construction of the English language dictionary, in particular a project to verify and enhance information on noun phrases in the English Word Dictionary undertaken by the Computing Research Laboratory at New Mexico State University and the University of Sheffield. We believe the work so far raises issues of wider linguistic interest which require practical solutions so that the large-scale lexicon project can proceed. We hope that this paper will show the complexity, diversity, and richness of the content of the EDR English Word Dictionary.</Paragraph>
    <Paragraph position="1"> The key idea has been to construct a system of features, categories and structures for encoding English words and phrases that is, at the same time, universal, or at least sufficiently universal to code both English and Japanese, two very different languages indeed. This is particularly evident in the use of left and right &quot;adjacency attributes&quot; in both the English and Japanese dictionaries. This general idea is a very natural outcome of the general state of linguistic theory, at least in the generative tradition, in its broadest sense: one which emphasises universality in its feature sets and structural constraints, but which has also evolved by a long and tortuous route to the current position where the lexicon is primary in a linguistic system, and all other levels of linguistic analysis can be seen as a projection from that level. The alphabet-soup grammar theories that are now current all share that assumption to some degree.</Paragraph>
    <Paragraph position="2"> Thus, a practical attempt to construct a lexicon on principles as universal as possible for computational use is indeed a project broadly consistent with the state of generative theory. Almost all other lexicon construction projects under way with computation as a main goal (e.g. COMLEX, CUP, Procter 1992, and see Wilks, Slator and Guthrie, in press) are designed principally for English, although CUP intends to augment its structures from non-English corpora as soon as is feasible, and a COMLEX for Spanish is already under discussion. Nonetheless, the sheer scale of the EDR enterprise (see below) and its explicitly universalist assumptions do make it unique. We will now outline briefly the general structure of the dictionaries in the project and then proceed directly to some of the theoretical and computational choices that have been made in the English lexicon.</Paragraph>
  </Section>
</Paper>