<?xml version="1.0" standalone="yes"?>
<Paper uid="C90-1017">
  <Title>STS: An Experimental Sentence Translation System</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Architecture of the system
</SectionTitle>
    <Paragraph position="0"> The basic architecture of the STS system is the familiar transfer model with its three main components: analysis, transfer and generation. To see how these components interact, consider the following (drastically oversimplified) sketch of the translation process. An input sentence in the source language (SL) is first mapped onto a formal representation for this language (its S-structure). This is done by the parser, on the basis of lexical information and detailed knowledge of the grammar of SL. The transfer component then maps the S-structure returned by the parser onto an appropriate S-structure in the target language (TL). The transfer is done constituent by constituent, in a top-down fashion, starting with the top S (or S) constituent. For each constituent, the lexical head is considered first: its lexeme is associated with a set of possible translations, i.e. one or more lexemes in the target language. Once the relevant lexeme has been selected, an appropriate structure is projected on the basis of its lexical properties</Paragraph>
    <Paragraph position="2"> and of the general rules and principles of target language grammar. Notice that the projection is done solely on the basis of information internal to the target language. 2 For a description of the parser used in the STS project, see Wehrli (1988).</Paragraph>
    <Paragraph position="3"> In other words, the interface structure is minimal, as it should be, and is almost entirely a matter of lexical mapping. The analysis and generation modules are completely independent of each other. This, again, is a desirable feature, in the sense that, for instance, the same parser can be used no matter what the target language might be. In fact, given that the S-structures in this system are justified solely in terms of the grammar, the parser (and the generator) do not have to be application dependent.</Paragraph>
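    <Paragraph> As a purely illustrative aside, the flow of control described above can be sketched in Python. Everything named here (SStructure, translate, the toy components and the one-entry dictionary) is a hypothetical stand-in and not part of the STS implementation; the sketch only shows that the parser and the generator never need to know about each other.
<![CDATA[
```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SStructure:
    """Hypothetical stand-in for an S-structure: a labelled constituent."""
    category: str                      # e.g. "S", "NP", "VP"
    head: str                          # lexeme of the lexical head
    children: List["SStructure"] = field(default_factory=list)


# Assumed signatures for the three components of the transfer model.
Parser = Callable[[str], SStructure]           # SL sentence to SL S-structure
Transfer = Callable[[SStructure], SStructure]  # SL S-structure to TL S-structure
Generator = Callable[[SStructure], str]        # TL S-structure to TL sentence


def translate(sentence: str, parse: Parser, transfer: Transfer,
              generate: Generator) -> str:
    """Chain analysis, transfer and generation, in that order."""
    sl_structure = parse(sentence)
    tl_structure = transfer(sl_structure)
    return generate(tl_structure)


# Toy components, invented only to exercise the pipeline.
TOY_BILINGUAL = {"dormir": "sleep"}


def toy_parse(sentence: str) -> SStructure:
    # Knows only about the source language.
    return SStructure(category="S", head=sentence.strip().lower())


def toy_transfer(sl: SStructure) -> SStructure:
    # The only place where both languages meet: a lexical mapping.
    return SStructure(sl.category, TOY_BILINGUAL.get(sl.head, sl.head))


def toy_generate(tl: SStructure) -> str:
    # Knows only about the target language.
    return tl.head


result = translate("Dormir", toy_parse, toy_transfer, toy_generate)
```
]]>
    Because translate receives its components as arguments, the same toy_parse could be paired with a different transfer function and generator for another target language, which is the independence property claimed above.</Paragraph>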
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Lexical database
</SectionTitle>
      <Paragraph position="0"> The lexical database is the central piece of the STS system. It contains crucial information used by the three active components of the system. This information is distributed over two monolingual lexicons (SL lexicon and TL lexicon) and one bilingual lexicon. We shall consider them in turn. We assume a static, or relational, conception of morphology, along the lines of Jackendoff (1975) and Wehrli (1985). According to this view, morphological relations between two or more lexical entries are expressed by a complex network of relations.</Paragraph>
      <Paragraph position="1"> A monolingual lexicon distinguishes three basic entities: lexeme, word and idiom. A lexeme is an abstract lexical unit, which can be compared, roughly speaking, to a standard dictionary entry. It stands for a whole class of morphological variants. By contrast, a word corresponds to a particular morphological instantiation of a lexeme. In other words, we make a clear distinction between features which may vary with inflexion and those which are invariant. To give an example, am, are, were, being, be, etc. are words, morphological variants of the lexeme &quot;be&quot;. Lexemes are associated with all the features which are independent of the morphological realization, such as semantic features, subcategorization features, and the like. Features which depend on inflexional markers (e.g. tense, number, person) are naturally attached to the words. In addition to words and lexemes, a monolingual lexicon also contains a list of idioms, i.e. phrases which have a fixed, non-compositional meaning, such as to kick the bucket or to be caught red-handed. The notion of lexeme turns out to be one of great significance. Not only does it make it possible to factor out basic syntactic and/or semantic properties shared by morphologically related words; it also provides the abstract lexical level which is relevant for lexical transfer.</Paragraph>
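      <Paragraph> The three entities just described can be pictured with a small data-structure sketch. It is an illustration under assumptions, not the actual STS lexicon format: the class names, field names and feature labels (such as "copula" or "1sg") are invented for the example.
<![CDATA[
```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Lexeme:
    """Abstract lexical unit carrying inflexion-independent features."""
    citation_form: str
    category: str
    invariant_features: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Word:
    """One morphological instantiation of a lexeme."""
    form: str
    lexeme: Lexeme
    inflexional_features: FrozenSet[str] = frozenset()


@dataclass
class MonolingualLexicon:
    words: Dict[str, List[Word]] = field(default_factory=dict)
    idioms: List[str] = field(default_factory=list)

    def add_word(self, word: Word) -> None:
        # A surface form may belong to several lexemes, hence a list.
        self.words.setdefault(word.form, []).append(word)

    def lexemes_for(self, form: str) -> List[Lexeme]:
        """Return the lexemes a surface form may instantiate."""
        return [w.lexeme for w in self.words.get(form, [])]


# Hypothetical entries; the feature labels are illustrative only.
BE = Lexeme("be", "V", frozenset({"copula"}))
lexicon = MonolingualLexicon(
    idioms=["kick the bucket", "be caught red-handed"]
)
for form, feats in [("am", {"present", "1sg"}),
                    ("were", {"past"}),
                    ("being", {"participle"})]:
    lexicon.add_word(Word(form, BE, frozenset(feats)))
```
]]>
      In this sketch the invariant features are stored once, on BE, while each Word carries only what varies with inflexion; looking up any of the forms leads back to the single lexeme that lexical transfer operates on.</Paragraph>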
      <Paragraph position="2">  The bilingual dictionary specifies the set of possible relations between lexemes of the source language and lexemes of the target language. Each entry in this dictionary specifies one SL lexeme and one TL lexeme. When a particular SL lexeme has more than one corresponding TL lexeme (e.g. aimer → to like, to love, etc.), the bilingual dictionary contains as many entries as there are correspondences. The bilingual dictionary contains other kinds of information as well. For instance, in the case of argument-taking elements, such as verbs or predicative adjectives, an entry of the bilingual dictionary must also specify how the arguments of the SL predicate match the arguments of the TL predicate.</Paragraph>
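      <Paragraph> A minimal sketch of such a dictionary is given here, assuming one record per SL/TL lexeme pair and an explicit argument map. The class names, the role labels ("subject", "object", "indirect_object") and the manquer/miss entry are assumptions introduced for illustration; only the aimer example comes from the text.
<![CDATA[
```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BilingualEntry:
    """One SL lexeme paired with exactly one TL lexeme."""
    sl_lexeme: str
    tl_lexeme: str
    # Maps each SL argument role to the TL role that realizes it.
    # Role labels are hypothetical.
    arg_map: Dict[str, str] = field(default_factory=dict)


class BilingualDictionary:
    def __init__(self) -> None:
        self._entries: Dict[str, List[BilingualEntry]] = {}

    def add(self, entry: BilingualEntry) -> None:
        # One entry per correspondence, grouped by SL lexeme.
        self._entries.setdefault(entry.sl_lexeme, []).append(entry)

    def candidates(self, sl_lexeme: str) -> List[BilingualEntry]:
        """All entries whose SL side is the given lexeme."""
        return list(self._entries.get(sl_lexeme, []))


IDENTITY = {"subject": "subject", "object": "object"}

bd = BilingualDictionary()
# aimer has two correspondences, hence two entries.
bd.add(BilingualEntry("aimer", "like", dict(IDENTITY)))
bd.add(BilingualEntry("aimer", "love", dict(IDENTITY)))
# Illustrative entry (not from the paper): the arguments are crossed,
# as in "tu me manques" / "I miss you".
bd.add(BilingualEntry("manquer", "miss",
                      {"subject": "object", "indirect_object": "subject"}))
```
]]>
      Keeping the argument map on the entry itself means that a mismatch in argument order is recorded as a lexical fact about one pair of lexemes rather than as a structural rule.</Paragraph>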
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="77" type="metho">
    <SectionTitle>
3 The transfer component
</SectionTitle>
    <Paragraph position="0"> The role of the transfer component is to map SL S-structures onto TL S-structures. In STS, this mapping is done indirectly, by means of two mechanisms: lexical transfer and lexical projection. Transfer applies to the syntactic structures returned by the parser. In a top-down fashion, starting with the main S-structure, the transfer procedure considers the lexical head of a phrase, looks it up in the bilingual dictionary and selects the most appropriate TL lexeme, based on contextual information and on features in the bilingual dictionary. Once a lexeme has been selected, a process of lexical projection creates a TL syntactic structure on the basis of the lexical properties of the TL lexeme and of the general syntactic properties of TL. In the next step, the transfer procedure considers the complements of the head, using the same strategy. In addition to lexical transfer and lexical projection, the transfer procedure is guided by the argument mapping information found in the bilingual dictionary, as mentioned above.</Paragraph>
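    <Paragraph> The top-down procedure described in this section can be sketched as a short recursive function. This is a simplified illustration under stated assumptions, not the STS code: Node, BILINGUAL, select_lexeme and project are hypothetical names, lexeme selection is reduced to picking the first candidate, and projection builds a bare node instead of a full syntactic structure.
<![CDATA[
```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Toy constituent: a lexical head plus its complements."""
    head: str
    complements: List["Node"] = field(default_factory=list)


# Toy bilingual dictionary: SL lexeme to candidate TL lexemes.
BILINGUAL: Dict[str, List[str]] = {
    "aimer": ["like", "love"],
    "homme": ["man"],
    "livre": ["book"],
}


def select_lexeme(candidates: List[str], context: Node) -> str:
    """Placeholder choice: take the first candidate.

    The text describes selection based on contextual information and
    dictionary features; that logic is not modelled here.
    """
    return candidates[0]


def project(tl_lexeme: str) -> Node:
    """Placeholder lexical projection: a bare TL node for the lexeme."""
    return Node(head=tl_lexeme)


def transfer(sl: Node) -> Node:
    """Transfer the head first, then recurse into the complements."""
    # Unknown lexemes fall back to the SL form (an assumption).
    candidates = BILINGUAL.get(sl.head, [sl.head])
    tl = project(select_lexeme(candidates, sl))
    tl.complements = [transfer(c) for c in sl.complements]
    return tl


source = Node("aimer", [Node("homme"), Node("livre")])
target = transfer(source)
```
]]>
    Note that transfer never inspects the shape of the source tree beyond heads and complements: whatever structure the target side has would come from project, which is where the target-language lexicon and grammar enter.</Paragraph>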
    <Paragraph position="1"> In the STS system, discrepancies between SL and TL, such as differences in word order or argument matching, can be handled quite naturally without the need for complex and ad hoc structural transfer rules. To illustrate this point, the fact that a sentence containing a modal verb must be assigned a bi-sentential structure in French, but not in English, follows from the lexical properties of modal verbs in the two languages: French modals are main verbs selecting an infinitival sentential complement, while English modals are not marked as main verbs. Within a linguistic theory which assumes that phrase structures are projected from their lexical elements, the structural differences between French and English sentences follow from the lexical differences between, say, pouvoir and can.</Paragraph>
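    <Paragraph> To make the modal example concrete, here is a toy sketch in which the number of clauses in the projected structure depends only on two lexical flags. The entry format, the flag names and the bracket labels (including Infl for the English modal) are assumptions made for illustration and are not taken from the STS lexicon.
<![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbEntry:
    """Hypothetical lexical entry for a modal."""
    lexeme: str
    is_main_verb: bool
    selects_infinitival_clause: bool = False


# Lexical properties as described in the text above.
POUVOIR = VerbEntry("pouvoir", is_main_verb=True,
                    selects_infinitival_clause=True)
CAN = VerbEntry("can", is_main_verb=False)


def project(modal: VerbEntry, verb: str) -> str:
    """Return a bracketed toy structure for modal plus verb.

    No rule here mentions French or English; the shape of the output
    is read off the lexical entry alone.
    """
    if modal.is_main_verb and modal.selects_infinitival_clause:
        # Main verb with a sentential complement: two clauses.
        return f"[S [VP {modal.lexeme} [S [VP {verb}]]]]"
    # Not a main verb: a single clause with the modal outside VP.
    return f"[S [Infl {modal.lexeme}] [VP {verb}]]"


french = project(POUVOIR, "partir")
english = project(CAN, "leave")
```
]]>
    Counting the S brackets in the two results gives two clauses for pouvoir and one for can, without any structural transfer rule relating the two trees.</Paragraph>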
    <Paragraph position="2"> (2)a. l'homme dont vous semblez avoir oublié le nom ne pourra-t-il pas vous fournir les renseignements dont vous avez besoin? b. won't the man whose name you seem to have forgotten be able to provide you with the information that you need? Such examples show that the STS system can successfully handle structures of a non-trivial level of complexity. The second example, in particular, shows the ability of this system to handle problems such as differences in word order, argument matching and idiomatic expressions.</Paragraph>
    <Paragraph position="3"> In addition, this model proved extremely efficient: total time for parsing and translation of the above sentences averages 150 ms/word (footnote 4), which is a crucial prerequisite for on-line interaction.</Paragraph>
  </Section>
  <Section position="5" start_page="77" end_page="77" type="metho">
    <SectionTitle>
4 Some examples
</SectionTitle>
    <Paragraph position="0"> Two examples of translations produced by the STS system are given below. In the first example, a is the input sentence, b the structure returned by the parser, and c the sentence returned by the translator (footnote 3). The structure has been omitted in the second example.</Paragraph>
    <Paragraph position="1">  c. the book at which you are looking seems to be easy to read.</Paragraph>
    <Paragraph position="2"> 3 The structures returned by the parser correspond to slightly simplified S-structure representations of a GB grammar. The indexes in (1b) express A'-binding or control relations with empty categories (e.g. [NP e]). 4 CPU time on a DEC Vaxstation II computer.</Paragraph>
  </Section>
</Paper>