<?xml version="1.0" standalone="yes"?>
<Paper uid="W99-0210">
  <Title>Coreference-oriented Interlingual Slot Structure &amp; Machine Translation</Title>
  <Section position="1" start_page="0" end_page="69" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> One of the main problems of many commercial and experimental Machine Translation (MT) systems is that they do not generate pronominal anaphora correctly. As mentioned in Mitkov (1996), resolving the anaphora and extracting the antecedent are key issues for a correct translation.</Paragraph>
    <Paragraph position="1"> In this paper, we propose an Interlingual mechanism that we have called Interlingual Slot Structure (ISS), based on the Slot Structure (SS) presented in Ferrández et al. (1997). The SS stores the lexical, syntactic, morphological and semantic information of every constituent of the grammar. The ISS mechanism allows us to translate pronouns between different languages. We have proposed and evaluated ISS for translation between Spanish and English. We have also compared pronominal anaphora resolution in English and Spanish in order to study the discrepancies between the two languages.</Paragraph>
    <Paragraph position="2"> This mechanism could be added to an MT system as an additional module to solve the anaphora generation problem.</Paragraph>
    <Paragraph position="3"> Introduction According to Mitkov (1996), establishing the antecedents of anaphora is of crucial importance for correct translation. It is essential to resolve the anaphoric relation when a language is translated into one that marks pronoun gender. In addition, anaphora resolution is vital when translating discourse rather than isolated sentences, since the anaphoric references to preceding discourse entities have to be identified. Unfortunately, the majority of Machine Translation (MT) systems do not deal with anaphora resolution, and their successful operation usually does not go beyond the sentence level. Another important aspect of the automatic translation of pronouns, as mentioned in Mitkov (1996), is the choice between two possible techniques: translation or reconstruction of referential expressions.</Paragraph>
    <Paragraph position="4"> In the first technique, source language pronouns are directly translated into target language pronouns without studying their relation with other words in the text.</Paragraph>
    <Paragraph position="5"> The second technique considers that the pronouns are not autonomous in their meaning/function but dependent on other units in the text. Then, a more natural way to treat pronouns in MT would be the following: (a) analysis has to determine the reference structure of the source text, i.e.</Paragraph>
    <Paragraph position="6"> coreference/cospecification relationships between anaphora and antecedents have to be determined, (b) this is the only information that is conveyed to the target language generator, (c) the target language generator generates the appropriate target language surface expression as a function of the target equivalent of the source antecedent and/or according to the rules of this language. Mitkov et al. (1995) adopt a similar approach.</Paragraph>
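    <Paragraph> Steps (a) to (c) can be illustrated with a minimal Python sketch. It is not the system described in this paper: the two-entry lexicon, the class Coreference and the function generate_target_pronoun are hypothetical names introduced only for illustration, and the sketch covers a single case, the English object pronoun "it" rendered in Spanish.

```python
from dataclasses import dataclass

# Toy bilingual lexicon (an assumption for illustration): English noun ->
# (Spanish noun, grammatical gender of the Spanish noun).
LEXICON = {
    "table": ("mesa", "fem"),
    "book": ("libro", "masc"),
}

# Spanish third-person singular direct-object pronouns by gender.
OBJECT_PRONOUN = {"masc": "lo", "fem": "la"}


@dataclass(frozen=True)
class Coreference:
    """Step (a): a source pronoun linked to its resolved antecedent."""
    pronoun: str
    antecedent: str


def generate_target_pronoun(link: Coreference) -> str:
    """Steps (b)-(c): only the coreference link reaches the generator, which
    picks the pronoun from the gender of the antecedent's target equivalent."""
    _target_noun, gender = LEXICON[link.antecedent]
    return OBJECT_PRONOUN[gender]


if __name__ == "__main__":
    # "I bought a table and painted it." -> the pronoun must agree with "mesa".
    print(generate_target_pronoun(Coreference("it", "table")))  # la
    # "I bought a book and read it." -> the pronoun must agree with "libro".
    print(generate_target_pronoun(Coreference("it", "book")))   # lo
```

The source pronoun is never translated directly; the target pronoun is rebuilt from the antecedent, which is why the same English "it" yields two different Spanish forms.</Paragraph>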
    <Paragraph position="7"> In this work, we present a proposal for an Interlingual (formal language without ambiguity) mechanism based on the second technique. Basically, a structure that stores the anaphora and its antecedent in the source language is used. From this structure, a similar one in the target language is generated. Using this new structure we will be able to generate the final surface structure of the original sentence.</Paragraph>
    <Paragraph position="8"> This paper has been supported by the CICYT number TIC97-0671-C02-02. In the following section we will describe the general purpose anaphora resolution system. The section after that will present the anaphora resolution module, where we will focus on the differences between the English and Spanish systems and report some evaluation results. After that, we will present our Interlingual mechanism, based on the English-Spanish discrepancy analysis. Finally, we will discuss the evaluation of some commercial MT systems and their problems in pronoun translation, and we will study how our proposal addresses them.</Paragraph>
    <Paragraph position="9"> 1 General purpose anaphora resolution system The general purpose anaphora resolution system with our Interlingual module is shown in Figure 1. There are two parallel processes, corresponding to anaphora resolution in English and in Spanish. These two processes are independent of each other and are connected by means of the Interlingual mechanism. The input of each process is a grammar defined by means of the grammatical formalism SUG (Slot Unification Grammar), Ferrández et al. (1997), Ferrández (1998a). A translator that transforms SUG rules into Prolog clauses has been developed. This translator provides a Prolog program that will parse each sentence. The system can carry out partial or full parsing of the text with the same parser and grammar. In this paper we use partial parsing (Slot Unification Partial Parser, SUPP). The partial parser SUPP, described in Martinez-Barco et al. (1998), works on unrestricted corpora in which the words are tagged with the grammatical categories obtained from the output of a &quot;part-of-speech (POS) tagger&quot;. In this paper, we have used the bilingual corpus (Blue Book, English and Spanish) CRATER (1994) for the evaluation of the anaphora resolution module.</Paragraph>
    <Paragraph position="10"> The output of the parsing module will be what we have called Slot Structure (henceforth SS) that stores the necessary information for linguistic phenomena resolution. This SS will be the input for the following module in which we deal with anaphora resolution as well as other linguistic phenomena (extraposition, ellipsis, ...).</Paragraph>
    <Paragraph position="11"> After applying the linguistic phenomena resolution algorithm we obtain a new slot structure (SS) that will store both the anaphora and their antecedents. This new structure in the source language will be the input for the Interlingual mechanism (Interlingual Slot Structure, ISS), which will obtain the corresponding slot structure in the target language. Using this new structure we will be able to generate the final surface structure of the original sentence.</Paragraph>
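    <Paragraph> The data flow from a resolved source slot structure to a target slot structure can be sketched in Python as follows. This is a hedged illustration only: the actual SS is produced by a Prolog parser and its format is not given here, so the Slot fields, the toy noun lexicon and the functions english_object_pronoun and to_target are all assumptions. The sketch translates from Spanish to English and uses the semantic type of the antecedent, since Spanish marks grammatical gender on object pronouns where English chooses between "it" and a personal pronoun.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Slot:
    """One constituent with the four kinds of information an SS is said to hold."""
    lexical: str                       # surface word
    syntactic: str                     # category, e.g. "noun" or "pronoun"
    morphological: str                 # gender/number tag, e.g. "fem-sg"
    semantic: str                      # coarse type, e.g. "person" or "object"
    antecedent: Optional[int] = None   # index of the antecedent slot, if resolved


# Hypothetical Spanish -> English noun lexicon.
NOUNS = {"mesa": "table", "hermana": "sister"}


def english_object_pronoun(antecedent: Slot) -> str:
    """Choose the English object pronoun from the antecedent's features."""
    if antecedent.semantic != "person":
        return "it"
    return "her" if antecedent.morphological.startswith("fem") else "him"


def to_target(source: List[Slot]) -> List[Slot]:
    """Map a resolved source structure to a target one, slot by slot.
    For brevity, all fields other than the word form are copied unchanged."""
    target = []
    for slot in source:
        if slot.syntactic == "pronoun" and slot.antecedent is not None:
            word = english_object_pronoun(source[slot.antecedent])
        else:
            word = NOUNS.get(slot.lexical, slot.lexical)
        target.append(replace(slot, lexical=word))
    return target


if __name__ == "__main__":
    thing = [Slot("mesa", "noun", "fem-sg", "object"),
             Slot("la", "pronoun", "fem-sg", "object", antecedent=0)]
    person = [Slot("hermana", "noun", "fem-sg", "person"),
              Slot("la", "pronoun", "fem-sg", "person", antecedent=0)]
    print([s.lexical for s in to_target(thing)])   # ['table', 'it']
    print([s.lexical for s in to_target(person)])  # ['sister', 'her']
```

The same Spanish pronoun "la" becomes "it" or "her" depending on its antecedent, which is the kind of discrepancy that a direct pronoun-to-pronoun translation cannot handle.</Paragraph>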
  </Section>
</Paper>