<?xml version="1.0" standalone="yes"?>
<Paper uid="J03-2001">
  <Title>© 2003 Association for Computational Linguistics. A Model for Matching Semantic Maps between Languages (French/English, English/French)</Title>
  <Section position="2" start_page="0" end_page="157" type="abstr">
    <SectionTitle>
1. Goals
</SectionTitle>
    <Paragraph position="0"> This article presents a spatial model that projects the semantic space of a source language word onto a semantic space in the chosen target language. Although the study presented in this article can be described from various angles, we place it within the framework of artifactual simulations of the translation process, and more specifically, access to the target language's lexicon. The model is described as a construction process designed to reproduce cognitive functions and their extensions. Future research will include the study of the psycholinguistic validity of such a spatial representation.</Paragraph>
    <Paragraph position="1"> Now let us briefly describe the scientific basis of the study.</Paragraph>
    <Paragraph position="2"> * Three major areas are generally distinguished in the study of the translation process (see Vinay and Darbelnet [1996]): the lexicon (or the study of notions), sentence generation (putting words together), and the message (which brings communicative factors into play). The first area involves choosing the right word, which is usually left up to the intuition and expertise of the translator. Our model deals with accessing the lexicon of the target language starting from a notion in the source language. The utility of this research lies in the fact that different languages break down reality in different ways.</Paragraph>
    <Paragraph position="3"> * Although the translation process has been mastered by a number of experts, it is usually still dependent upon the utilization of tools like dictionaries. The model proposed here relies on semantic maps and [Computational Linguistics, Volume 29, Number 2] spreading of activation to the prime's neighboring concepts. As an alternative to these local semantic networks, Masson (1995) proposed a connectionist model that takes into account the subjects' reaction time during priming experiments (the correspondence is based on the assumption that semantic or phonological proximity and ease of access are correlated). Rouibah, Ploux, and Ji (2001) showed that experimental data on interactions between phonology and semantics could be simulated by distances on lexical maps. One advantage of this proposal is that experimental and artifactual findings converge; another is its ability to describe a real lexicon. Although the relevance of our model to the representation of the mental lexicon will not be discussed in this article (attempts to gain insight into this correlation are currently underway in other studies), this point is not unrelated to the suitability of our approach to modeling translation as a cognitive function.</Paragraph>
    <Paragraph position="4"> 2. Description of the Model
No two lexicons are related by a one-to-one correspondence (Abplanalp 1998). In other words, the way words are used to refer to extralinguistic reality varies across languages. Some examples of this are cross-language differences in color naming and, borrowing Chuquet and Paillard's (1989) English-French examples, differences like:
* room: pièce, chambre, bureau
(or in an abstract domain)
* esprit: mind, spirit, wit
Certain authors (Abplanalp 1998) insist on how impossible it is to translate at the word level and propose recourse to the conceptual level as a theoretical alternative. Concepts are thought to depend on human cognitive abilities that are general and shared by all. Although the correspondence between words and concepts remains a controversial topic of study (Reboul 2000), the concept/word opposition is nevertheless relevant to any model of translation, even an artifactual one like ours. As we shall see, even when heeding the specific organization and breakdown of each individual language, the matching operation does not take place at the word level but at the substrate level (defined below), where the set of meanings of each word &quot;cuts out&quot; a form.</Paragraph>
    <Paragraph position="5"> First, we will present the model we devised to describe the organization of languages. Then we will explain the source-to-target spreading method used.</Paragraph>
    <Section position="1" start_page="156" end_page="157" type="sub_section">
      <SectionTitle>
2.1 A Model Based on Semantic Similarity
</SectionTitle>
      <Paragraph position="0"> The model was initially developed on the basis of a semantic similarity: synonymy.</Paragraph>
      <Paragraph position="1"> Note, however, that the data and the model are independent, so this same framework can be used to organize other types of similarity (contextual, phonological [Rouibah, Ploux, and Ji 2001], etc.). Other authors also organize the lexicon or other kinds of knowledge on the basis of similarity. For example, in Edelman's (1998) spatial model of internal representations of the world's objects, spatial proximity reflects object similarity. WordNet (Fellbaum 1998) and EuroWordNet (Vossen 1998) organize the lexicon conceptually as a network of terms, each of which is associated with a partition into [Ploux and Ji, A Model for Matching Semantic Maps] Synsets (a Synset being a small group of synonyms that label a concept). Our model differs from Edelman's in that it deals with lexical semantics, not perceived objects. It also differs from Miller's (1990) approach in three respects:  use separate units to represent words or concepts (symbols, points in a space, nodes on a graph, etc.). Relationships between units are expressed as proximity links (in spatial models) or as arcs between nodes (in networks). Our model is spatial, but it differs from local models in that each term is represented by a region in the space, part of which it shares with other terms. This region is constructed automatically according to lexical similarity links (such as those given by a synonym dictionary). It is not the result of supervised learning, nor is it a manual, ontological description of how the lexicon is organized. The next section will break the semantic-space construction process into steps in presenting the initial data, the granular approach, and the resulting organization.</Paragraph>
    </Section>
    <Section position="2" start_page="157" end_page="157" type="sub_section">
      <SectionTitle>
2.2 Method
</SectionTitle>
      <Paragraph position="0"> French terms and one containing English terms) and a translation database (French-English, English-French) that maps each term to similar words in the other language.</Paragraph>
      <Paragraph position="1"> The links between an entry and the terms that follow it were not chosen &quot;by hand.&quot; The data were taken mainly from published dictionaries and thesauruses.</Paragraph>
      <Paragraph position="2"> It is updated and supplemented regularly by the addition of new links between words (synonymy or translation links). The method used to generate the French synonym database (described in detail in Ploux [1997]) was applied again to generate the English and translation databases. The first step required creating an intermediate database containing the set of all links attested in available work in lexicography. In this preliminary database, a term was deemed similar to another term if at least one lexicographer had established the link. The final database was obtained through symmetrization of the links produced in the first step. While maintaining the shifts in meaning that occur when there is nontransitivity and that, as we shall see, are essential for developing the model, we created new links to symmetrize any initially one-directional ones.</Paragraph>
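      <Paragraph> As an illustration only, the two construction steps described above (collecting every link attested by at least one source, then symmetrizing the links without adding transitive ones) can be sketched in Python. The function names, the dictionary representation, and the toy English entries are assumptions made for this sketch; they are not taken from the databases or from the procedure detailed in Ploux (1997).

```python
from collections import defaultdict


def merge_sources(sources):
    """Union of all sources: a link is kept if any one source attests it."""
    merged = defaultdict(set)
    for source in sources:
        for entry, similar_terms in source.items():
            merged[entry].update(similar_terms)
    return merged


def symmetrize(links):
    """Add the reverse of every link; no transitive link is ever created."""
    symmetric = defaultdict(set)
    for entry, similar_terms in links.items():
        symmetric[entry].update(similar_terms)
        for term in similar_terms:
            symmetric[term].add(entry)
    return {term: sorted(neighbors) for term, neighbors in symmetric.items()}


# Hypothetical toy data, not taken from the article's databases.
source_a = {"room": ["chamber"], "chamber": ["bedroom"]}
source_b = {"room": ["space"]}

database = symmetrize(merge_sources([source_a, source_b]))
print(database["chamber"])  # ['bedroom', 'room']
print(database["room"])     # ['chamber', 'space']
```

In this toy case "chamber" gains the reverse link to "room", while "bedroom" is not linked to "room": the nontransitive shift in meaning is preserved, as the text requires.</Paragraph>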
      <Paragraph position="3"> Table 1 gives a typical example of the structure of the initial data. Table 2 gives a global evaluation of the number of entries and links in the lexical databases. Note that we are not attempting here to define the term synonymy. We rely on lexicographic publications, which as Edmonds and Hirst (2002) remarked, &quot;have always treated synonymy
1 Masson's (1995) model assigns each concept a basin of attraction in a multidimensional space of activation. This framework authorizes a certain form of internal variability for the set of patterns corresponding to a concept. Nevertheless the basins are disjoint and do not overlap as do the nodes in local semantic networks. Furthermore, this model, built essentially for the purposes of validating hypotheses and comparing psycholinguistic results, is applicable only to a highly limited vocabulary and is therefore a poor representative of the natural lexicon.</Paragraph>
    </Section>
  </Section>
</Paper>