<?xml version="1.0" standalone="yes"?>
<Paper uid="C80-1017">
  <Title>RELATIVE SEMANTIC COMPLEXITY IN LEXICAL UNITS</Title>
  <Section position="1" start_page="0" end_page="116" type="abstr">
    <SectionTitle>
RELATIVE SEMANTIC COMPLEXITY IN LEXICAL UNITS
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Summary
</SectionTitle>
      <Paragraph position="0"> The lexical component of a human language is typically heterogeneous and extremely complex. Before we can come to grips with the underlying lexical organization, we must reduce the bewildering complexity. Methods must be elaborated by which the interrelations between the units of the lexicon can be elucidated. This paper describes how a Swedish lexical material stored in a computer has been semantically stratified as a stage in the semantic analysis of the items included in the data base. In particular, a minor subset of the lexical items, consisting of current words in the language, has been selected as metalanguage in the definitions. It is argued that, in this way, a means of describing the relative semantic complexity in lexical units is provided.</Paragraph>
      <Paragraph position="1"> Introduction The semantic and syntactic inter-relations between the lexical units of a human language are notoriously complex and intricate, whether considered from the individual language-user's point of view or from the perspective of the collective competence of a language community. Indeed, they are so complex that, when it comes to thorough semantic analysis, scholars have only been able to handle small portions of the lexicon at a time. The typical lexico-semantic study has therefore concerned single lexical items or small groups of semantically interrelated items, in particular so-called word-fields or semantic domains.</Paragraph>
      <Paragraph position="2"> On the other hand, there seems to be a growing sentiment among linguists that the lexical component is very basic to the functioning of language.</Paragraph>
      <Paragraph position="3"> The crucial role of the lexicon cannot, however, be adequately understood unless the scope is widened. Detailed knowledge is, admittedly, quite indispensable in constructing an overall model of the lexicon; but large-scale lexical investigations are just as necessary in order to reveal the underlying principles of lexical organization. Consequently, computer-based lexicology should rank high as a branch of computational and theoretical linguistics. The Heterogeneity of Lexicons Lexical inventories that have developed spontaneously do not usually constitute neat and clear-cut systems.</Paragraph>
      <Paragraph position="4"> They are typically skewed in the sense that many phenomena which may seem quite marginal have nonetheless given rise to a rich vocabulary, in contrast to the lexical sparsity characterizing several domains that are logically more fundamental to man. To take just one example, there are, in Swedish, rather few expressions for eating while there is a great variety of verbs for making all sorts of noises displaying only minor acoustical (and perceptual) differentiation. Our creative capacity simply seems to be more nourished by our imagination with regard to sounds than by our imagination with regard to food consumption. That the asymmetry is quite arbitrary is emphasized by the fact that other essential human activities may produce a rich vocabulary. For instance, very fine distinctions can, in Swedish, be expressed monomorphemically in the field of walking. Such disproportions as those just mentioned are basically due to historical accidents, i.e. more or less pure chance. Consequently, they are language-specific rather than universal and cannot be ascribed to any general tendencies in the human mind. The same holds for all culture-dependent expressions. Thus, if the lexicons of many languages tend to contain words for buildings and vehicles, it is primarily because human beings tend to develop such things and, secondarily, need to name them. It can be concluded that the reason for the recurrence of such terms in various languages all over the world is not essentially (psycho)linguistic but, rather, a corollary of comparable extra-linguistic circumstances.</Paragraph>
      <Paragraph position="5"> Cultural conditions may also give rise to other types of lexical heterogeneity. The lexicon of a language may be viewed as comprising different strata, some of which contain common words used by everyone, others containing words used exclusively by specialists. Technical language - where &quot;technical&quot; should be taken in a broad sense - in various fields, such as medicine, law, economy, technology, etc.; some forms of language used in certain professions or by certain socially defined groups, like traders, priests, or outlaws - these are examples of vocabulary strata that are likely to be fully mastered only by relatively few individuals. It is to be deplored when the language of professional debaters, for instance in politics and esthetics, also develops in this direction, as is often the case.</Paragraph>
      <Paragraph position="6"> Other strata of language may be quite familiar to a majority of the language-users although they are less frequently employed, being tied up with different styles, registers, or contextual settings. This may apply to the vocabulary of honorific language, religious language, etc. Such differentiation in vocabularies as has been exemplified here is manifested in a language-specific way, but the very existence of differentiation is a universal trait. It has been suggested that lexical inventories can be sub-divided into various domains obeying different sets of rules that govern the relations between language and reality.</Paragraph>
      <Paragraph position="7"> In other words, there may well be various kinds of word meanings (cf. Fillmore 1978).</Paragraph>
      <Paragraph position="8"> Information about a many-splendoured world is to be conveyed by means of language. The phenomena referred to are quite different in nature, and so the semantic content of lexical items may vary accordingly.</Paragraph>
      <Paragraph position="9"> In most authentic vocabularies there is a gradient ranging from more or less purely grammatical operators and structure-dependent items (such as the copula, connectives, quantifiers, etc.), over items that are partly system-oriented, partly more semantically weighted (e.g. pronouns, deictic expressions, prepositions), all the way to items simply indexing &quot;encyclopedic&quot; phenomena. There is much fluctuation from language to language in this regard, since the division of labour between vocabulary and grammar proper may vary. Thus the proportion of words with primarily grammatical functions may differ to a high degree between languages. However, the grammar-oriented part of the vocabulary tends to be shared by most speakers, more differentiation being found at the other extreme.</Paragraph>
      <Paragraph position="10"> Fillmore has mentioned a number of ways in which languages may differ with respect to word semantics. There are such features as relative analyticity, i.e. the degree of semantic transparency characterizing the total lexical system; taxonomic depth, by which is meant the proportion of particular as compared to generic terms; patterns of meaning extension; areas of synonymy elaboration; collocational patterns; etc. (Fillmore 1978, pp. 155-157). In fact, different domains within the vocabulary of a single language may vary a great deal in these respects. For instance, terminology is often, although not always, harder to analyse than are common words. In particular, terminology tends to invite heavy borrowing of foreign lexical material; in this way the portion of arbitrary lexical units increases.</Paragraph>
      <Paragraph position="11"> It cannot be doubted that somewhere behind the confusing complexity of the lexicon there is a clue as to what human beings find imperative to recognize as delimited concepts. The categorization reflected by lexical inventories is considerably disguised through the heterogeneity which is a basic characteristic of the lexical component, as has been emphasized repeatedly. As a first step, then, methods must be elaborated by which the complexity can be duly handled. In particular, the semantic redundancy of the authentic lexicon must be reduced.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="116" type="sub_section">
      <SectionTitle>
Reducing Redundancy
</SectionTitle>
      <Paragraph position="0"> It is very natural in lexico-semantic analysis to take word definitions as a point of departure. It can be argued that a defined word is semantically more complex than each word used in the definition of that word. Also, it is a well-known fact that circularity very easily creeps into definitions.</Paragraph>
      <Paragraph position="1"> Although circularity in definitions has occasionally been the target of investigation and has served successfully as a basis for determining semantic relatedness (e.g. Calzolari 1977), it should, ideally, be controlled.</Paragraph>
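      The kind of control over circularity mentioned above can be illustrated with a minimal sketch in Python. It is not the procedure used in the project described here: the toy entries, the name DEFINITIONS and the function find_cycle are hypothetical, and the sketch assumes that each headword is simply mapped to the list of words used in its definition. A word without an entry of its own is treated as undefined, i.e. as a candidate primitive.

```python
# Hypothetical toy data: headword -> words used in its definition.
DEFINITIONS = {
    "to angle": ["to fish", "hook", "line"],
    "to fish": ["try", "to catch", "fish"],
    "to catch": ["get", "hold"],
    "big": ["large"],
    "large": ["big"],
}


def find_cycle(definitions, start):
    """Return one circular chain of headwords reachable from start, or None."""
    path = []         # headwords on the current depth-first path
    on_path = set()   # same content as path, for fast membership tests
    finished = set()  # headwords already shown to lead to no cycle

    def visit(word):
        if word in on_path:
            # The word defines itself indirectly: report the loop.
            return path[path.index(word):] + [word]
        if word in finished or word not in definitions:
            return None  # undefined words are treated as primitives
        path.append(word)
        on_path.add(word)
        for used in definitions[word]:
            cycle = visit(used)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.remove(word)
        finished.add(word)
        return None

    return visit(start)


print(find_cycle(DEFINITIONS, "big"))       # a circular pair
print(find_cycle(DEFINITIONS, "to angle"))  # no circularity
```

      With these toy entries the first call reports the chain big, large, big, while the second finds no circularity, since every chain starting from to angle ends in undefined words.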
      <Paragraph position="2"> One way of achieving maximal reduction of semantic redundancy in the lexicon is, of course, to define all lexical entries by means of an effective metalanguage, e.g. a minimal defining vocabulary. Our interest can then be focused on this minimal word-list on the assumption that it covers the same semantic range as the complete vocabulary defined by it. In practical lexicography, defining vocabularies have been utilized in, for instance, The General Basic English Dictionary (1942); Michael West, An International Reader's Dictionary (1965); and, in a project having much wider scope and, therefore, holding greater theoretical interest, in Longman's Dictionary of Contemporary English (1978).</Paragraph>
    </Section>
    <Section position="3" start_page="116" end_page="116" type="sub_section">
      <SectionTitle>
Longman's Dictionary of Contemporary
</SectionTitle>
      <Paragraph position="0"> Defining vocabularies are intuitively attractive. They seem to capture the notion of basic vocabulary, the general lexical subset included in everybody's vocabulary. In some exceptional cases it is very easy to isolate this subset. In Dyirbal, for instance, a Queensland Australian language, there is a special vocabulary used in certain social contexts; hence it is referred to as &quot;mother-in-law language&quot; (Dixon 1971). In this subsystem, Dyangul, the same grammatical rules apply, but the vocabulary is very restricted so that, for instance, each Dyangul verb corresponds to several in the common language. Therefore, the Dyangul vocabulary can be taken as a model for a semantic classification of words in Dyirbal.</Paragraph>
      <Paragraph position="1"> A slight disadvantage in using defining vocabularies is the levelling of depth in the linguistic analysis. The lexicon is considered on two fixed levels alone: that of the lexical entries and that of the basic defining words. As is well known, however, lexical units play very different roles in the language they are part of. Not infrequently, the semantic interrelations within given sets can only be represented in a multi-layered fashion. I do not wish to claim that the human lexicon is, in any strict sense, hierarchically organized, but various subdivisions of it may well be.</Paragraph>
      <Paragraph position="2"> For instance, to catch something means roughly 'to get hold of something', to fish means 'to try to catch fish', and to angle means 'to fish with a hook and line'. Consistent use of a minimal defining vocabulary would yield definitions like 'to try to get hold of fish with a hook and line' for to angle. This is by no means a totally inadequate definition. To angle is clearly related to verbal expressions like to get hold of; the semantic relatedness becomes apparent in a comparison with other verbs, such as to interrupt, to sneeze, or to twinkle. The verbal acts designated by to catch, to fish, and to angle are, however, not absolutely on a par with each other. Both to fish and to angle &quot;contain&quot; an element of catching. It can be argued that they differ from each other, and from to catch, in the way the catching is specified. To fish explicates the object caught, viz. 'fish'. That fish is caught is presupposed by to angle as well, but with the additional specification of the fishing method employed. However, the two types of specification are not equal with respect to the verbal act 'to catch'. While 'to catch' is presupposed as an element in to fish, the whole meaning 'to try to catch fish' is incorporated in to angle.</Paragraph>
      <Paragraph position="3"> The relations can be expressed by bracketing in the following manner:
to catch - '(to try to get hold of)'
to fish - '(to catch [= to try to get hold of] (fish))'
to angle - '(to fish [= to catch (= to try to get hold of) (fish)] (with a hook and line))'
The closer relationship between to fish and to angle may be indicated by making use of to fish in the definition of to angle. Parallel treatment of pairs or groups of verbs to the effect that one verb may contain not only the general semantic properties of another verb but actually the other verb itself has been suggested by, among others, Binnick (1971) and Fillmore (1978). In fact, this relative semantic stratification of the lexicon is rather similar to Weinreich's strategy for investigating the semantic content of the lexical inventory. Weinreich gives the following presentation:
[...] and stratum-1 terms, without circularity
Stratum n: terms whose definitions contain only terms of strata 0, 1, 2, ... n-1.
He concludes that the metalanguage will be made up of the complete ordinary language except for stratum n (Weinreich 1962). A similar line of reasoning underlies the organization of the Swedish lexical material analysed in the project Lexical Data Base, carried out at the Department of Computational Linguistics, University of Göteborg. A minimal defining vocabulary is, in principle, utilized in definitions. In addition, however, words not included in the defining vocabulary proper are occasionally allowed in definitions, with the requirement that they should be ultimately reducible to strict defining vocabulary units. The minimal defining vocabulary comprises words denoting very fundamental concepts pertaining to physical elements and forces, geometrical notions, topographical properties, state and movement, location, time, causation, basic organisms, physical and mental functions of organisms, etc., as well as more culture-sensitive and conventionalized concepts, such as colours, artefacts, social conditions, and the like.</Paragraph>
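      The bracketing and the stratification discussed above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the organization of the data base itself: the names DEFINITIONS, bracketed, flat and stratum are hypothetical, a definition is modelled as a list of parts, and any part that has no entry of its own is assumed to be phrased in the defining vocabulary already (stratum 0). Unlike the notation in the text, the sketch uses square brackets at every level of embedding.

```python
# Hypothetical toy lexicon: headword -> parts of its definition.
# A part that is itself a headword is expanded recursively.
DEFINITIONS = {
    "to catch": ["to try to get hold of"],
    "to fish": ["to catch", "fish"],
    "to angle": ["to fish", "with a hook and line"],
}


def bracketed(word):
    """Expand a headword into a nested, bracketed definition."""
    parts = []
    for part in DEFINITIONS[word]:
        if part in DEFINITIONS:
            inner = bracketed(part)[1:-1]  # drop the outer parentheses
            parts.append(part + " [= " + inner + "]")
        else:
            parts.append(part)
    tail = "".join(" (" + p + ")" for p in parts[1:])
    return "(" + parts[0] + tail + ")"


def flat(word):
    """Reduce a headword to defining-vocabulary units only."""
    out = []
    for part in DEFINITIONS[word]:
        out.append(flat(part) if part in DEFINITIONS else part)
    return " ".join(out)


def stratum(word):
    """Stratum 0 for undefined parts, else one above the deepest part."""
    if word not in DEFINITIONS:
        return 0
    return 1 + max(stratum(part) for part in DEFINITIONS[word])


for headword in DEFINITIONS:
    print(headword, stratum(headword), bracketed(headword))
print(flat("to angle"))
```

      On this toy lexicon, flat reproduces the levelled definition 'to try to get hold of fish with a hook and line' for to angle, whereas bracketed and stratum keep the layering visible: to catch, to fish and to angle come out on strata 1, 2 and 3 respectively.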
      <Paragraph position="4"> A larger subset than the defining vocabulary is the so-called fully defined vocabulary. This part of the vocabulary is provided with elaborated definitions. Together with the defining vocabulary it makes up the semantic hard core of the lexicon taken as a whole. We are not likely to find more candidates for this part of the vocabulary no matter how much material is included in the data base. Instead, new material tends to be of a more specific kind, e.g. terminology known by only a few people, almost obsolete words, nonpermanent compounds that have barely passed the threshold of lexicalization, but which are easily analysable in terms of the well-defined part of the vocabulary; in short, words which do not add anything further to the basic semantic system of the lexicon. These latter items are not assigned any proper definitions but are semantically specified more summarily.</Paragraph>
      <Paragraph position="5"> Thus the data base is, in principle, divided into three strata: (1) the defining vocabulary, whose units are axiomatic in a logical sense and highly restricted in number; (2) the fully defined vocabulary, whose units have carefully formulated definitions based on the defining vocabulary;</Paragraph>
    </Section>
  </Section>
</Paper>