<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1007">
  <Title>Structural properties of Lexical Systems: Monolingual and Multilingual Perspectives</Title>
  <Section position="4" start_page="0" end_page="51" type="intro">
    <SectionTitle>
2 Structure of lexical systems
</SectionTitle>
    <Paragraph position="0"> Lexical systems as formal models of natural language lexica are closely related to the &quot;-Net&quot; generation of lexical databases, whose best-known representatives are undoubtedly WordNet (Fellbaum, 1998) and FrameNet (Baker et al., 2003).</Paragraph>
    <Paragraph position="1"> However, lexical systems possess some very specific characteristics that clearly distinguish them from other lexicographic structures. We will first characterize the two main current approaches to the structuring of lexical models and then present lexical systems relative to them.</Paragraph>
    <Section position="1" start_page="0" end_page="51" type="sub_section">
      <SectionTitle>
2.1 Dictionary- vs. net-like lexical databases
</SectionTitle>
      <Paragraph position="0"> Dictionary-like databases as texts. The most straightforward way of building lexical databases is to use standard dictionaries (i.e. books) and turn them into electronic entities. This is the approach taken by most publishing companies (e.g. American Heritage (2000)), with varying degrees of sophistication. The resulting products can be termed dictionary-like databases. They are mainly characterized by two features.</Paragraph>
      <Paragraph position="1"> * They are made up of word (word sense) descriptions, called dictionary entries.</Paragraph>
      <Paragraph position="2"> * Dictionary entries can be seen as &quot;texts,&quot; in the most general sense.</Paragraph>
      <Paragraph position="3"> Consequently, dictionary-like databases are above all huge texts, consisting of a collection of much smaller texts (i.e. entries).</Paragraph>
      <Paragraph position="4"> It seems natural to consider electronic versions of standard dictionaries as texts. However, formal lexical databases such as the multilingual XML-based JMDict (Breen, 2004) are also textual in nature. They are collections of entries, each entry consisting of a structured text that &quot;tells us something&quot; about a word. Even databases encoding relational models of the lexicon can be 100% textual, and therefore dictionary-like. Such is the case of the French DiCo database (Polguere, 2000), which we have used for compiling our lexical system. As we will see later, the original DiCo database is nothing but a collection of lexicographic records, each record being subdivided into fields that are basically small texts. Although the DiCo is built within the framework of Explan- [...] database, in the sense of WordNet or FrameNet.
 Net-like databases as graphs. Most lexical models, even standard dictionaries, are relational in nature. For instance, all dictionaries define words in terms of other words and use pointers such as 'Synonym' and 'Antonym.' However, their structure does not reflect their relational nature. The situation is totally different with true net-like databases. They can be characterized as follows.</Paragraph>
      <Paragraph position="5"> * They are graphs--huge sets of connected entities--rather than collections of small texts (entries).</Paragraph>
      <Paragraph position="6"> * They are not necessarily centered around words, or word senses. They use as nodes a potentially heterogeneous set of lexical or, more generally, linguistic entities.</Paragraph>
      <Paragraph position="7"> Net-like databases are, for many, the most suitable knowledge structures for modeling lexica. Nevertheless, databases such as WordNet pose one major problem: they are inherently structured according to a couple of hierarchizing and/or classifying principles. WordNet, for instance, is semantically oriented and imposes a hierarchical organization of lexical entities based, first of all, on two specific semantic relations: synonymy (through the grouping of lexical meanings within synsets) and hypernymy. Additionally, the part-of-speech classification of lexical units creates a strict partition of the database: WordNet is made up of four separate synset hierarchies (for nouns, verbs, adjectives and adverbs). We do not believe lexical models should be designed following a few rigid principles that impose a hierarchization or classification of data. Such structuring is of course extremely useful, even necessary, but should be projected &quot;on demand&quot; onto lexical models. Furthermore, there should not be a predefined, finite set of potential structuring principles; data structures should welcome any of them, and this is precisely one of the main characteristics of lexical systems, which will be presented shortly (section 2.2).</Paragraph>
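      <Paragraph> The idea of projecting a hierarchy or a classification &quot;on demand&quot; can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the design of lexical systems: the node and link data, the relation label typical_sound, and the function names project_relation and partition_by are all hypothetical.

```python
from collections import defaultdict

# Hypothetical flat lexical graph: nodes carry attributes,
# links carry a relation label. No hierarchy is stored.
NODES = {
    "dog": {"pos": "noun"},
    "animal": {"pos": "noun"},
    "hound": {"pos": "noun"},
    "bark": {"pos": "verb"},
}
LINKS = [
    ("dog", "hypernym", "animal"),
    ("dog", "synonym", "hound"),
    ("dog", "typical_sound", "bark"),  # hypothetical relation label
]


def project_relation(links, relation):
    """View of the graph restricted to one relation: source -> targets."""
    view = defaultdict(list)
    for source, label, target in links:
        if label == relation:
            view[source].append(target)
    return dict(view)


def partition_by(nodes, attribute):
    """Classification of nodes computed on request from one attribute."""
    classes = defaultdict(set)
    for name, attrs in nodes.items():
        classes[attrs[attribute]].add(name)
    return dict(classes)


# Two structurings derived from the same flat data, only when asked for.
hypernymy = project_relation(LINKS, "hypernym")
by_pos = partition_by(NODES, "pos")
```

 In this sketch the hypernymy view and the part-of-speech partition are both computed from the same unstructured set of nodes and links, so any other relation or attribute could serve as a structuring principle in the same way.</Paragraph>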
      <Paragraph position="8"> Texts vs. graphs: pros and cons</Paragraph>
      <Paragraph position="9"> It is essential to stress the fact that any dictionary-like database can be turned into a net-like database and vice versa. Of course, dictionary-like databases that rely on relational models are more compatible with graph encoding. However, there are always relational data in dictionaries, and such data can be extracted and &quot;reformatted&quot; in the form of nodes and connecting links.</Paragraph>
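      <Paragraph> The extraction of relational data described above can be sketched as follows. This is a minimal Python illustration, not the procedure used for the DiCo: the entry format, the field names synonym and antonym, and the function names entries_to_graph and graph_to_entries are all hypothetical.

```python
from collections import defaultdict

# Hypothetical dictionary-like database: each entry is a small record
# whose fields are texts (here, comma-separated pointers to other words).
ENTRIES = {
    "big": {"synonym": "large, huge", "antonym": "small"},
    "small": {"synonym": "little", "antonym": "big"},
}


def entries_to_graph(entries):
    """Turn textual entries into a set of nodes and a set of labeled links."""
    nodes = set()
    links = set()
    for headword, fields in entries.items():
        nodes.add(headword)
        for relation, text in fields.items():
            for target in (t.strip() for t in text.split(",")):
                if target:
                    nodes.add(target)
                    links.add((headword, relation, target))
    return nodes, links


def graph_to_entries(links):
    """Inverse direction: regroup links into one textual record per headword."""
    grouped = defaultdict(lambda: defaultdict(list))
    for source, relation, target in sorted(links):
        grouped[source][relation].append(target)
    return {
        head: {rel: ", ".join(targets) for rel, targets in fields.items()}
        for head, fields in grouped.items()
    }


nodes, links = entries_to_graph(ENTRIES)
```

 Calling graph_to_entries(links) regroups the links into records again, which illustrates the &quot;vice versa&quot; direction; in this sketch only the relational fields survive the round trip, and words that are the source of no link do not get a record back.</Paragraph>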
      <Paragraph position="10"> The important issue is therefore not one of exclusive choice between the two types of structures; it concerns what each structure is better at. In our opinion, the specialization of each type of structure is as follows.</Paragraph>
      <Paragraph position="11"> Dictionary-like structures are tools for editing (writing) and consulting lexical information. The linguistic intuition of lexicographers or users of lexical models performs best on texts. Both lexicographers and users need to be able to see the whole picture about words, and need the entry format at a certain stage, although other ways of displaying lexical information, such as tables, are extremely useful too! [Footnote: It is no coincidence that WordNet's so-called lexicographer files give a textual perspective on lexical items that is quite dictionary-like. The unit of description is the synset, however, and not the lexical unit. (See WordNet on-line documentation on lexicographer files.)]
 Net-like structures are tools for implementing dynamic aspects of lexica: wading through lexical knowledge, adding to it, revising it or inferring information from it. Consequently, net-like databases are believed by some (and we share this opinion) to have some form of cognitive validity. They are compatible with observations made, for instance, in Aitchison (2003) on the network nature of the mental lexicon. Last but not least, net-like databases can more easily integrate other lexical structures or be integrated by them.</Paragraph>
      <Paragraph position="12"> In conclusion, although both forms of structures are compatible at a certain level and have their own advantages in specific contexts of use, we are particularly interested in the fact that net-like databases are more prone to live an &quot;organic life&quot; in terms of evolution (addition, subtraction, replacement) and interaction with other data structures (connection with models of other languages, with grammars, etc.).</Paragraph>
    </Section>
    <Section position="2" start_page="51" end_page="51" type="sub_section">
      <SectionTitle>
2.2 Lexical systems: a new type of net-like lexical databases
</SectionTitle>
      <Paragraph position="0"> As mentioned above, most net-like lexical databases seem to focus on the description of just a few properties of natural language lexica (quasi-synonymy, hypernymic organization of word senses, predicative structures and their syntactic expression, etc.). Consequently, developers of these databases often have to gradually &quot;stretch&quot; their models in order to add the description of new types of phenomena that were not of primary concern at the outset. It is legitimate to expect that such grafting of new components will leave scars on the initial design of lexical models. The lexical structures we propose, lexical systems (hereafter</Paragraph>
    </Section>
  </Section>
</Paper>