<?xml version="1.0" standalone="yes"?>
<Paper uid="P91-1047">
  <Title>Discovering the Lexical Features of a Language</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> This paper examines the possibility of automatically discovering the lexical features of a language. There is strong evidence that the set of possible lexical features which can be used in a language is unbounded, and thus not innate. Lakoff [Lakoff 87] describes a language in which the feature +woman-or-fire-or-dangerous-thing exists. This feature is based upon ancient folklore of the society in which it is used. If the set of possible lexical features is indeed unbounded, then it cannot be part of the innate Universal Grammar and must be learned. Even if the set is not unbounded, the child is still left with the challenging task of determining which features are used in her language.</Paragraph>
    <Paragraph position="1"> If a child does not know a priori what lexical features are used in her language, there are two sources for acquiring this information: semantic and syntactic cues. A learner using semantic cues could recognize that words often refer to objects, actions, and properties, and from this deduce the lexical features noun, verb, and adjective. Pinker [Pinker 89] proposes that a combination of semantic cues and innate semantic primitives could account for the acquisition of verb features. He believes that the child can discover semantic properties of a verb by noticing the types of actions typically taking place when the verb is uttered. Once these properties are known, says Pinker, they can be used to reliably predict the distributional behavior of the verb. However, Gleitman [Gleitman 90] presents evidence that semantic cues are not sufficient for a child to acquire verb features, and believes that the use of this semantic information in conjunction with information about the subcategorization properties of the verb may be sufficient for learning verb features.</Paragraph>
    <Paragraph position="2"> This paper takes Gleitman's suggestion to the extreme, in hope of determining whether syntactic cues may not just aid in feature discovery, but may be all that is necessary. We present evidence for the sufficiency of a strictly syntax-based model for discovering the lexical features of a language. *The author would like to thank Mitch Marcus for valuable help. This work was supported by AFOSR jointly under grant No. AFOSR-90-0066, and by ARO grant No. DAAL 03-89-C0031 PRI.</Paragraph>
    <Paragraph position="3"> The work is based upon the hypothesis that whenever two words are semantically dissimilar, this difference will manifest itself in the syntax via lexical distribution (in a sense, playing out the notion of distributional analysis [Harris 51]). Most, if not all, features have a semantic basis.</Paragraph>
    <Paragraph position="4"> For instance, there is a clear semantic difference between most count and mass nouns. But while meaning specifies the core of a word class, it does not specify precisely what can and cannot be a member of a class.</Paragraph>
    <Paragraph position="5"> For instance, furniture is a mass noun in English, but is a count noun in French. While the meaning of furniture cannot be sufficient for determining whether it is a count or mass noun, the distribution of the word can.</Paragraph>
    <Paragraph position="6"> Described below is a fully implemented program which takes a corpus of text as input and outputs a fairly accurate word class list for the language in question. Each word class corresponds to a lexical feature. The program runs in O(n^3) time and O(n^2) space, where n is the number of words in the lexicon.</Paragraph>
  </Section>
</Paper>