<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2226">
  <Title>Translating Idioms</Title>
  <Section position="4" start_page="1389" end_page="1389" type="metho">
    <SectionTitle>
3 A sketch of the translation process
</SectionTitle>
    <Paragraph position="0"> In this section, we show how idioms are handled in the French-to-English ITS-2 translation system, a transfer-based system that uses GB-style D-structure representations as interface structures. The general architecture of the system is given in figure 1 below.</Paragraph>
    <Section position="1" start_page="1389" end_page="1389" type="sub_section">
      <SectionTitle>
3.1 Idiom identification
</SectionTitle>
      <Paragraph position="0"> As we argued in the previous section, the task of identifying an idiom is best accomplished at the abstract level of representation (D-structure).</Paragraph>
      <Paragraph position="1"> ITS-2 uses the IPS parser (cf. Wehrli, 1992, 1997), which produces the structure (6) for the input (5a). (Footnote 5: In example (6), we use the following syntactic labels:)</Paragraph>
      <Paragraph position="3"> At this point, the structure is completely general, and does not contain any specification of idioms. The idiom recognition procedure is triggered by the "head of idiom" lexical feature associated with the head casser. This feature is associated with all lexical items which are heads of idioms in the lexical database.</Paragraph>
      <Paragraph position="4"> The task of the recognition procedure is (i) to retrieve the proper idiom, if any (casser might be the head of several idioms), and (ii) to verify that all the constraints associated with that idiom are satisfied. Idioms are listed in the lexical database roughly as illustrated in (7): (7) a. casser sa pipe 'to kick the bucket' b. 1: [DP ] 2: [V casser] 3: [DP POSS pipe] c. 1: [+human] 2: [-passive] 3: [+literal, -extraposition]</Paragraph>
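Step (i) of the recognition procedure, retrieval by head, can be pictured with a minimal Python sketch. The index `IDIOMS_BY_HEAD` and both function names are hypothetical and do not reflect the actual ITS-2 lexical database; the second idiom listed under casser is an illustrative assumption added only to show that one head may index several idioms.

```python
# Hypothetical index from a head lemma to the idioms it can head.
# Names and the second entry are illustrative assumptions, not ITS-2 data.
IDIOMS_BY_HEAD = {
    "casser": ["casser sa pipe", "casser les pieds"],
}


def has_head_of_idiom_feature(lemma):
    """Stand-in for the 'head of idiom' lexical feature."""
    return lemma in IDIOMS_BY_HEAD


def candidate_idioms(lemma):
    """Step (i): retrieve every idiom headed by this lemma, if any."""
    if not has_head_of_idiom_feature(lemma):
        return []  # recognition is never triggered for this head
    return list(IDIOMS_BY_HEAD[lemma])


print(candidate_idioms("casser"))
print(candidate_idioms("manger"))
```

Each candidate returned here would then go through step (ii), the verification of its constraints.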
    </Section>
  </Section>
  <Section position="5" start_page="1389" end_page="1390" type="metho">
    <Paragraph position="0"> Idiom entries specify (a) the canonical form of the idiom (mostly for reference purposes), (b) the syntactic frame with an ordered list of constituents, and (c) the list of constraints associated with each of the constituents.</Paragraph>
    <Paragraph position="1"> In our (rather simple) example, the lexical constraints associated with the idiom (7) state that the head is a transitive lexeme whose direct object has the fixed form "POSS pipe", where POSS stands for a possessive determiner coreferential with the external argument of the head (i.e. the subject). Furthermore, the subject constituent bears the feature [+human], and the head is marked as [-passive], meaning that this particular idiom cannot be passivized. Finally, the object is marked [+literal, -extraposition], which means that the direct object constituent cannot be modified in any way (not even pluralized) and cannot be extraposed.</Paragraph>
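The three-part entry and the constraint check described above might be sketched as follows in Python. This is an assumption-laden illustration, not the ITS-2 representation: the class `IdiomEntry`, the function `satisfies`, and the encoding of each constituent as a set of feature strings are all hypothetical, and the coreference between POSS and the subject is not modelled.

```python
from dataclasses import dataclass


@dataclass
class IdiomEntry:
    canonical: str     # (a) canonical form, mostly for reference
    frame: list        # (b) ordered list of constituents
    constraints: list  # (c) one set of required features per constituent


# Entry (7) under the hypothetical encoding described in the lead-in.
CASSER_SA_PIPE = IdiomEntry(
    canonical="casser sa pipe",
    frame=["subject", "casser", "POSS pipe"],
    constraints=[{"+human"}, {"-passive"}, {"+literal", "-extraposition"}],
)


def satisfies(entry, constituents):
    """Check that each analysed constituent bears the features its slot requires.

    constituents: one set of feature strings per frame slot, in frame order.
    """
    if len(constituents) != len(entry.frame):
        return False
    return all(
        required.issubset(found)
        for required, found in zip(entry.constraints, constituents)
    )


active = [{"+human"}, {"-passive"}, {"+literal", "-extraposition"}]
passive = [{"+human"}, {"+passive"}, {"+literal", "-extraposition"}]
print(satisfies(CASSER_SA_PIPE, active))
print(satisfies(CASSER_SA_PIPE, passive))
```

Under this encoding a passivized clause fails on the second slot, which mirrors the role of the [-passive] marking on the head.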
    <Paragraph position="2"> The structure in (6) satisfies all those constraints, provided that the possessive sa refers uniquely to Paul. It should be noted that even though an idiom has been recognized in sentence (5a), the sentence also has a semantically well-formed literal meaning. When ITS-2 runs in interactive mode, the user is asked whether the sentence should be taken literally or as an expression. In automatic mode, the idiom reading takes precedence over the literal interpretation.</Paragraph>
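The choice between the two readings can be summarised in a short Python sketch. The function `choose_reading` and its `ask_user` callback are hypothetical names introduced for illustration; how ITS-2 actually prompts the user is not described in the text.

```python
def choose_reading(idiom_recognized, interactive, ask_user=None):
    """Pick 'idiom' or 'literal' for a sentence whose literal parse is also well formed."""
    if not idiom_recognized:
        return "literal"
    if interactive:
        if ask_user is None:
            raise ValueError("interactive mode needs an ask_user callback")
        # ask_user is a hypothetical callback: True means "take it as an expression".
        return "idiom" if ask_user() else "literal"
    # Automatic mode: the idiom reading takes precedence.
    return "idiom"


print(choose_reading(True, interactive=False))
print(choose_reading(True, interactive=True, ask_user=lambda: False))
```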
    <Section position="1" start_page="1390" end_page="1390" type="sub_section">
      <SectionTitle>
3.2 Transfer and generation of idioms
</SectionTitle>
      <Paragraph position="0"> Once properly identified, an idiom is transferred like any other abstract lexical unit. In other words, an entry in our bilingual lexicon has exactly the same form whether the correspondence concerns simple lexemes or idioms. The corresponding target language lexeme may be a simple or a complex abstract lexical unit. For instance, our bilingual lexical database contains, among many others, the following correspondences (French / English): avoir besoin de X / need X; casser sa pipe / kick the bucket; faire la connaissance de X / meet X; avoir envie / feel like; quelle mouche a piqué / what has gotten. The generation of target language idioms follows essentially the same pattern as the generation of simple lexemes. The general pattern of generation in ITS-2 is the following: first, a maximal projection structure (XP) is projected on the basis of a lexical head and of the lexical specification associated with it. Second, syntactic operations (extraposition, passive, etc.) apply to the resulting structure, triggered either by lexical properties or by general features transferred from the source sentence. For instance, the lexical feature [+raising] associated with a predicate would trigger a raising transformation (NP movement from the embedded subject position to the relevant subject position). Subject-auxiliary inversion, topicalization, and auxiliary verb insertion are all examples of syntactic transformations triggered by general features derived from the source sentence.  age, which would avoid formulation (5a) to state that 'Paul has broken someone's pipe'.</Paragraph>
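The uniform treatment of idioms and simple lexemes in transfer can be illustrated with a small Python sketch. The dictionary `BILINGUAL_LEXICON` and the function `transfer` are hypothetical; the idiom pairs are the complete ones listed above, while the single-word pair is an assumption added only to show that both kinds of unit go through the same lookup.

```python
# Hypothetical bilingual lexicon: idioms and simple lexemes share one entry format.
BILINGUAL_LEXICON = {
    "avoir besoin de X": "need X",
    "casser sa pipe": "kick the bucket",
    "faire la connaissance de X": "meet X",
    "avoir envie": "feel like",
    "livre": "book",  # illustrative simple lexeme, not taken from the paper
}


def transfer(source_unit):
    """Map an abstract source lexical unit to its target unit, or None if unlisted."""
    return BILINGUAL_LEXICON.get(source_unit)


print(transfer("casser sa pipe"))
print(transfer("livre"))
```

Because recognition has already grouped the idiom into one abstract unit, the lookup itself needs no idiom-specific logic.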
      <Paragraph position="1"> The first step of the generation process produces a target language D-structure, while the second step derives S-structure representations. Finally, a morphological component will determine the precise orthographical/phonological form of each lexical head.</Paragraph>
      <Paragraph position="2"> In the case of target language idioms, the general pattern applies with few modifications. Step 1 (projection of D-structure) is based on the lexical representation of the idiom (which specifies the complete syntactic pattern of the idiom, as pointed out earlier) and produces structure (8a). Step 2, which here only concerns the insertion of the perfective auxiliary in position T°, derives the S-structure (8b). Finally, the morphological component derives sentence (8c). (8) a. [TP [DP Paul] [VP kick [DP the [</Paragraph>
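The three generation steps can be pictured with a toy Python pipeline. Everything here is a simplifying assumption: structures are flat lists of (lemma, features) pairs rather than trees, the function names and the `FORMS` table are hypothetical, and the resulting string is what the described steps would be expected to yield, not a quotation of (8c).

```python
def project(subject, idiom_pattern):
    """Step 1: build a (flattened) D-structure from the idiom's lexical pattern."""
    return [(subject, set())] + [(lemma, set()) for lemma in idiom_pattern]


def insert_perfective_aux(d_structure):
    """Step 2: add the auxiliary after the subject and mark the verb as a participle."""
    subject = d_structure[0]
    verb, verb_feats = d_structure[1]
    rest = d_structure[2:]
    aux = ("have", {"3sg"})
    return [subject, aux, (verb, verb_feats | {"past_participle"})] + rest


# Toy morphological table: (lemma, feature) pairs mapped to surface forms.
FORMS = {
    ("have", "3sg"): "has",
    ("kick", "past_participle"): "kicked",
}


def spell_out(s_structure):
    """Step 3: choose the surface form of each lexical head."""
    words = []
    for lemma, feats in s_structure:
        form = lemma
        for feat in feats:
            form = FORMS.get((lemma, feat), form)
        words.append(form)
    return " ".join(words)


d_structure = project("Paul", ["kick", "the", "bucket"])
s_structure = insert_perfective_aux(d_structure)
print(spell_out(s_structure))
```

Keeping the three steps as separate functions mirrors the division of labour in the text: projection from the lexicon, syntactic operations, then morphology.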
    </Section>
  </Section>
</Paper>