<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1017">
  <Title>An Efficient Kernel for Multilingual Generation in Speech-to-Speech Dialogue Translation</Title>
  <Section position="5" start_page="114" end_page="115" type="evalu">
    <SectionTitle>
4 Results
</SectionTitle>
    <Paragraph position="0"> Our approach of separating a generation module into a language-independent kernel and language-specific knowledge sources has been successfully implemented in a dialogue translation system. Furthermore, the adaptability to other generation tasks mentioned above has also been demonstrated: the generation module was adapted to a new application domain and to a completely different semantic representation language by adapting the microplanning knowledge sources to the new formalism.</Paragraph>
    <Paragraph position="1"> Footnote 5: Note that the node labels shown in Figures 7 and 8 are only a concession to readability. The TAG requirement that in an auxiliary tree the foot node must have the same category label as the root node is fulfilled. Figure caption (fragment): They are compiled from the corresponding lexical type MV_NP_TRANS_LE as defined in the HPSG grammar. Trees 3 and 4 differ only with respect to their feature structures, which are not shown in this figure.</Paragraph>
    <Paragraph position="2"> VM-GECO is fully implemented (in Common Lisp) and integrated into the speech-to-speech translation system Verbmobil for two output languages, English and German. The adaptation to Japanese generation will be performed in the current project phase. Our experience from adding German makes us confident that this can be done straightforwardly by creating the appropriate knowledge sources, without modifying the kernel generator. To give the reader a more detailed impression of the implementation of the generation component, we present some characteristic data of the English generator. The numbers for the German system, especially for lexicon size and processing time, are similar.</Paragraph>
    <Paragraph position="3"> The underlying English grammar is a lexicalized TAG which consists of 2844 trees. These trees were derived during an offline pre-processing step from 2961 HPSG lexical entries of the linguistically well-motivated English HPSG grammar written at CSLI. The microplanner's knowledge sources, in turn, consist of 2730 partially pre-processed microplanning rules, which are used in an integrated handling of structural and lexical decisions based on constraint propagation. The microplanning rules are, of course, specifically adapted to the underlying semantic representation formalism. Furthermore, the underlying lexicon covers the word list that has been constructed from a large corpus of the application domain of the Verbmobil system, i.e., negotiation dialogues in spontaneous speech.</Paragraph>
    <Paragraph position="4"> The TAG grammar resulting from the compilation step allows for highly efficient, lexically driven, robust syntactic generation, which mainly consists of tree adjunctions, substitutions, and feature unifications. The average overall generation time per sentence (up to length 24) is 0.7 seconds on a SUN ULTRA-1 machine; 68% of the runtime is needed for microplanning, while the remaining 32% is needed for syntactic generation.</Paragraph>
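    <Paragraph position="5"> As an illustration only, the two structural operations named above, substitution and adjunction, can be sketched in a few lines of Python (the system itself is implemented in Common Lisp). Everything in this sketch is an assumption made for exposition: the class Node, the functions find, substitute, adjoin and words, and the toy trees are hypothetical and are not taken from VM-GECO or the Verbmobil lexicon. Feature unification is omitted, and only the requirement that root, foot and target node share a category label is checked.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a TAG tree (feature structures are omitted in this sketch)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    subst: bool = False   # open substitution site awaiting an initial tree
    foot: bool = False    # foot node of an auxiliary tree


def find(node: Node, pred) -> Optional[Node]:
    """Return the first node in pre-order that satisfies pred, else None."""
    if pred(node):
        return node
    for child in node.children:
        hit = find(child, pred)
        if hit is not None:
            return hit
    return None


def substitute(site: Node, initial: Node) -> None:
    """Substitution: fill an open substitution site with an initial tree."""
    if not site.subst or site.label != initial.label:
        raise ValueError("substitution site and tree root must share a label")
    site.children, site.subst = initial.children, False


def adjoin(target: Node, auxiliary: Node) -> None:
    """Adjunction: splice an auxiliary tree in at target (modifies both trees)."""
    foot = find(auxiliary, lambda n: n.foot)
    if foot is None or foot.label != auxiliary.label or auxiliary.label != target.label:
        raise ValueError("root, foot and target labels must agree")
    # The subtree formerly below target moves below the foot node.
    foot.children, foot.foot = target.children, False
    target.children = auxiliary.children


def words(node: Node) -> List[str]:
    """Read off the yield of a tree (leaf labels, left to right)."""
    if not node.children:
        return [node.label]
    return [w for child in node.children for w in words(child)]


# Hypothetical toy grammar, not taken from the Verbmobil lexicon.
s = Node("S", [Node("NP", subst=True), Node("VP", [Node("V", [Node("meet")])])])
np = Node("NP", [Node("we")])
aux = Node("VP", [Node("VP", foot=True), Node("Adv", [Node("tomorrow")])])

substitute(find(s, lambda n: n.subst), np)        # fill the NP slot
adjoin(find(s, lambda n: n.label == "VP"), aux)   # add the adverbial at VP
print(" ".join(words(s)))  # prints "we meet tomorrow"
```

    In this sketch both operations modify the trees in place; a generator that reuses lexicon trees across sentences would have to copy them first.</Paragraph>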
    <Section position="1" start_page="115" end_page="115" type="sub_section">
      <SectionTitle>
4.1 Reusing the Kernel
</SectionTitle>
      <Paragraph position="0"> Besides its usability for multiple languages in Verbmobil, our kernel generation component has also proven adaptable to a very different semantic representation language (both systematically and terminologically) in another, still ongoing multilingual translation project (currently 12 languages). The project uses an interlingua-based approach to the semantic representation of utterances. The goal of this project is to overcome the international language barrier, which is exemplified by improving the transparency of a large corpus consisting of international law texts. Our part in this project is the realization and implementation of the German generation component.</Paragraph>
      <Paragraph position="1"> Because our core generator is language-independent, the adaptation of the generation component to this semantic representation was reduced to adapting the structural and lexical knowledge bases of the microplanning component and adding appropriate domain-specific extensions to the lexicon of the syntactic generator. With an average sentence length of 15 words, the average runtime per sentence on a SUN ULTRA-2 is less than 0.5 seconds. Currently, even the longest sentence (40 words) needs less than 2 seconds of runtime.</Paragraph>
      <Paragraph position="2"> Within Verbmobil, the generation component will also be used for text generation when producing protocols as described in (Alexandersson and Poller, 1998).</Paragraph>
    </Section>
  </Section>
</Paper>