<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2306">
  <Title>Semi-Automatic Generation of Dialogue Applications in the GEMINI Project [?]</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Application Generation Platform
</SectionTitle>
    <Paragraph position="0"> The main target of the GEMINI project is the development of a platform for generating interactive, multilingual and multi-modal dialogue interfaces to databases with a minimum of cost and human effort. The AGP is an integrated set of assistants to generate multi-modal dialogue applications in a semi-automatic way. Its open and modular architecture simplifies the adaptability of applications designed with the AGP to different use cases.</Paragraph>
    <Paragraph position="1"> Connecting to a different database, adding a new modality or changing a scripting language can be achieved by adding or replacing the appropriate component without touching the other aspects of dialogue design again.</Paragraph>
    <Paragraph position="2"> The AGP consists of assistants, which are tools (partly with a GUI) producing models. All these models generated within the AGP are described in GDialogXML (GEMINI Dialog XML), which is an object-oriented abstract dialogue modelling language. It was created during GEMINI for use with the AGP. See Figure 1 for an example of the GDialogXML syntax. For a detailed description of GDialogXML refer to (Hamerich et al., 2003).</Paragraph>
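The Figure 1 example is not reproduced here; the following is a purely hypothetical sketch of what an object-oriented dialogue model expressed in XML might look like. All element and attribute names are illustrative assumptions and are not taken from the actual GDialogXML specification described in (Hamerich et al., 2003).

```xml
<!-- Hypothetical sketch only: element names are illustrative, not the
     real GDialogXML vocabulary. A dialogue object declares its variables
     and a body that calls subdialogues and data access functions. -->
<Dialogue id="GetAccountBalance">
  <Variables>
    <Var name="accountNumber" type="xString"/>
  </Variables>
  <Body>
    <CallDialogue ref="AskAccountNumber">
      <FillVar name="accountNumber"/>
    </CallDialogue>
    <CallFunction ref="dbGetBalance">
      <Arg var="accountNumber"/>
    </CallFunction>
    <CallDialogue ref="SayBalance"/>
  </Body>
</Dialogue>
```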
    <Paragraph position="3">  All models in the AGP may be saved as libraries for future applications.</Paragraph>
    <Paragraph position="4"> As shown in Figure 2, the AGP is not supposed to complete its task without any human interaction. This is because there will always be different ways of retrieving specific information. Consequently, the designer of dialogue applications has to select the preferred flow of dialogue manually by confirming the proposals of the AGP components. Most of these operations are simple drag &amp; drop actions between various windows containing all relevant fields, which are created automatically by the preceding tools of the platform.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 AGP Architecture
</SectionTitle>
      <Paragraph position="0"> All components of the AGP are integrated into one framework. This eases the use of the platform and enables the designer to switch back and forth between the different tools in case he or she wants to add or modify certain dialogues.</Paragraph>
      <Paragraph position="1"> In Figure 2 the architecture of the AGP is illustrated.</Paragraph>
      <Paragraph position="2"> The whole AGP consists of three layers. These layers are described in more detail in the following sections.</Paragraph>
      <Paragraph position="3">  The framework layer is the first layer of the AGP (refer to Figure 2). It includes the application description assistant (ADA), the data modelling assistant (DMA), and the data connector modelling assistant (DCMA). As indicated by the black arrow in the upper left corner of Figure 2, all assistants are controlled manually.</Paragraph>
      <Paragraph position="4"> The designer has to provide the application description, which mainly consists of the modalities for which the AGP should generate dialogue scripts, the languages for which the dialogues should be available, the dialogue strategy for the resulting system, some settings for error handling and a rough application description containing the major dialogue steps and their respective slots.</Paragraph>
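The information bundled in such an application description might be sketched as follows. This is a minimal illustration assuming a simple record-like structure; all class, field, and value names are invented for this sketch and are not taken from the GEMINI implementation.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the data collected by the application
# description assistant (ADA); names are illustrative assumptions.
@dataclass
class DialogueStep:
    name: str
    slots: list = field(default_factory=list)  # slots to fill in this step

@dataclass
class ApplicationDescription:
    modalities: list        # e.g. ["speech", "web"]
    languages: list         # e.g. ["en", "el"]
    dialogue_strategy: str  # e.g. "system-driven"
    error_handling: dict    # settings for confirmation / reprompting
    steps: list             # major dialogue steps and their slots

banking = ApplicationDescription(
    modalities=["speech", "web"],
    languages=["en", "el"],
    dialogue_strategy="system-driven",
    error_handling={"max_reprompts": 2, "confirm_slots": True},
    steps=[DialogueStep("account_balance", ["account_number"])],
)
```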
      <Paragraph position="5"> The DMA helps create the data model, which consists of class descriptions. The attributes and elementary types of the data are also specified here. In this process, the GUI guides the designer, and libraries of previously created classes can be loaded. Furthermore, the DCMA helps create APIs and implementation references for application-specific data access functions. These functions can then be used in the runtime system without any knowledge of the underlying database.</Paragraph>
      <Paragraph position="6">  The retrievals layer (shown as the second layer in Figure 2) mainly consists of the retrieval modelling assistant (RMA). This layer is modality and language independent; therefore, no language- or modality-specific data is included here.</Paragraph>
      <Paragraph position="7"> The designer uses the RMA to create the abstract dialogue flow. It provides a user-friendly interface that accelerates the design process. Two main sources of information are used to automate the process: the data model and the data connector. Using the information in the data model, several dialogues are generated automatically: (1) candidate dialogues asking the user for an attribute (we call them 'get information dialogues') and (2) dialogues in which the system presents that attribute ('say information dialogues'). At the same time, all procedures from the data connector are available to the designer, who can drag &amp; drop any of the dialogues mentioned so far.</Paragraph>
      <Paragraph position="8"> In the ideal situation, where a dialogue only depends on items from the data model, it can be modelled with just three drag actions: (1) drag &amp; drop a get information dialogue, (2) drag &amp; drop a call to the database (from the data connector), and (3) drag &amp; drop a say information dialogue. All the values exchanged by these three functions are assigned automatically by the assistant, so the designer just has to press 'Accept' for all assignments.</Paragraph>
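The "three drag actions" pattern above can be illustrated with a minimal sketch. Assuming the three dragged dialogues behave like functions whose values are wired together automatically, the data flow looks like this; all function names and data are invented for illustration and do not come from the actual RMA.

```python
# Sketch of the ideal three-drag dialogue: get information from the
# user, call the database via the data connector, present the result.
# Names are illustrative assumptions, not GEMINI's actual API.

def get_information(slot, user_answers):
    """'Get information dialogue': asks the user for one attribute."""
    return user_answers[slot]

def db_lookup(database, key):
    """Data-connector call retrieving the matching record."""
    return database[key]

def say_information(value):
    """'Say information dialogue': presents the attribute to the user."""
    return f"Your balance is {value}."

def run_dialogue(database, user_answers):
    account = get_information("account_number", user_answers)  # drag 1
    balance = db_lookup(database, account)                     # drag 2
    return say_information(balance)                            # drag 3

print(run_dialogue({"42": "100 EUR"}, {"account_number": "42"}))
```

The assistant's automatic value assignment corresponds to passing each function's result straight into the next call; the designer only confirms the wiring.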
      <Paragraph position="9"> When the dialogue depends on data not contained in the data model (such as questions to the user that do not correspond to an object from the data model), the designer can use a set of four dialogue types: dialogues based on user input, on a variable, on a sequence, or on a loop. In all of them, conditional, switch-case and loop constructs can be inserted. Thus, the designer has both automation and great flexibility in dialogue design.</Paragraph>
      <Paragraph position="10"> The resulting output is called generic retrieval model (GRM), which consists of the modality and language independent parts of a dialogue, which is mainly the application flow. The GRM is modelled in an object-oriented way using GDialogXML and mainly consists of dialogue modules. A dialogue module can call other modules as subdialogues or can jump to another top level module. This way, the application flow of dialogues in GDialogXML is modelled.</Paragraph>
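The two control-flow mechanisms described above, calling a module as a subdialogue (control returns to the caller) versus jumping to another top-level module (control does not return), can be sketched with a tiny interpreter. This is an illustrative toy, not the actual GDialogXML runtime; module names and operation labels are assumptions.

```python
# Toy interpreter distinguishing subdialogue calls from top-level jumps.
# "call": push a return frame and enter the callee; control comes back.
# "jump": discard the call stack and continue in the target module.

def run(modules, start):
    trace, stack = [], [(start, 0)]
    while stack:
        name, pc = stack.pop()
        steps = modules[name]
        while pc < len(steps):
            op, target = steps[pc]
            pc += 1
            if op == "say":
                trace.append(target)
            elif op == "call":              # subdialogue: return afterwards
                stack.append((name, pc))
                name, steps, pc = target, modules[target], 0
            elif op == "jump":              # top-level jump: no return
                stack.clear()
                name, steps, pc = target, modules[target], 0
    return trace

modules = {
    "main":    [("say", "welcome"), ("call", "balance"), ("jump", "goodbye")],
    "balance": [("say", "your balance")],
    "goodbye": [("say", "bye")],
}
```

Running `run(modules, "main")` yields the utterances of all three modules in order: the subdialogue returns to `main`, while the jump to `goodbye` ends the flow there.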
      <Paragraph position="11"> As indicated by the dashed arrow, it may be necessary to do some manual fine-tuning on the GRM, as the complexity of the dialogue produced by the RMA depends on the application and may be rather high; moreover, there often exist several ways to implement the application.</Paragraph>
      <Paragraph position="12">  The dialogue layer is modality and language dependent, as now the modality extensions from the modality extension assistant (MEA) are added to the retrieval model. The extension files describe the input and output behaviour of an application for a specific modality. The current implementation of the AGP supports the generation of voice applications (speech modality) and web-based applications (web modality). For the speech modality, the extensions consist of links to grammar and prompt concepts, which are language and modality independent. For each language, there is a separate concept file containing the wording for the prompts and the names of the grammars used. Additionally, the modality extensions include special subdialogues that are specific to a single modality.</Paragraph>
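The separation described above, a modality extension referring to language-independent prompt and grammar concepts, with one concept file per language supplying the concrete wording and grammar names, might be organised as follows. The keys, file names, and structure here are assumptions for illustration, not GEMINI's actual file format.

```python
# Illustrative sketch: the speech extension links dialogue steps to
# abstract concepts; per-language tables resolve those concepts to
# concrete prompt wordings and grammar file names. All names invented.

speech_extension = {
    "ask_account": {"prompt": "PROMPT_ASK_ACCOUNT", "grammar": "GRAM_ACCOUNT"},
}

concepts = {
    "en": {"PROMPT_ASK_ACCOUNT": "Please say your account number.",
           "GRAM_ACCOUNT": "account_en.gram"},
    "de": {"PROMPT_ASK_ACCOUNT": "Bitte nennen Sie Ihre Kontonummer.",
           "GRAM_ACCOUNT": "account_de.gram"},
}

def resolve(step, language):
    """Resolve a dialogue step to its concrete prompt and grammar."""
    link = speech_extension[step]
    table = concepts[language]
    return table[link["prompt"]], table[link["grammar"]]

print(resolve("ask_account", "de"))
```

Adding a language then means adding one concept table; the extension and the dialogue flow stay untouched.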
      <Paragraph position="13"> All grammars and prompts of the AGP are handled in a global library, which enables quick reuse of components across applications.</Paragraph>
      <Paragraph position="14"> The GRM is enriched with the modality extensions in the Linker. The resulting model is called the dialogue model; it is processed by the speech script generator and/or the web-page script generator, depending on the modalities selected in the application description. For the speech modality, VoiceXML scripts with some additional CGI scripts are generated. The grammars are taken from the AGP grammar library or have to be generated with the MEA. For the web modality, a web-page script that enables dynamic web pages is generated from the dialogue model.</Paragraph>
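The modality-dependent generation step can be sketched as a simple dispatch: the dialogue model is fed to one script generator per selected modality. Generator names and their placeholder outputs are invented here; the real generators emit full VoiceXML (plus CGI scripts) and dynamic web pages.

```python
# Sketch of dispatching the dialogue model to per-modality generators.
# Function names and outputs are illustrative assumptions only.

def generate_voicexml(dialogue_model):
    return f"<vxml><!-- speech scripts for {dialogue_model} --></vxml>"

def generate_web(dialogue_model):
    return f"<html><!-- dynamic pages for {dialogue_model} --></html>"

GENERATORS = {"speech": generate_voicexml, "web": generate_web}

def generate_scripts(dialogue_model, modalities):
    """Run one generator per modality chosen in the application description."""
    return {m: GENERATORS[m](dialogue_model) for m in modalities}

scripts = generate_scripts("banking", ["speech", "web"])
```

Adding a new modality then amounts to registering one more generator, which matches the modular-architecture claim made earlier in the section.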
      <Paragraph position="15"> For the speech modality, some more tools are relevant, namely the language modelling tool and the vocabulary builder.</Paragraph>
      <Paragraph position="16"> To get the runtime system ready for use, a small amount of manual fine-tuning is required again; for example, the recogniser-dependent settings have to be adjusted for the VoiceXML platform.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Implementation of the AGP
</SectionTitle>
      <Paragraph position="0"> The initial prototype of the AGP of the GEMINI project was finished in summer 2003. This version's architecture is shown in Figure 2. In spring 2004 an extended and improved version of the AGP will be implemented. This version will cover additional features such as mixed-initiative dialogues with over-answering, advanced user modelling, natural language generation, and language identification.</Paragraph>
      <Paragraph position="1"> Multilingual dialogues will also be possible with this final version.</Paragraph>
      <Paragraph position="2"> All platform components have been implemented using Qt. As a result, the AGP runs on different operating systems.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Applications
</SectionTitle>
    <Paragraph position="0"> Two pilot applications have been generated with the AGP for evaluation and validation. Both applications were generated in a very user-friendly way, making use of the automatic multi-modal error handling capabilities of the AGP; refer to (Wang et al., 2003) for more details about the error handling in GEMINI.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 EG-Banking
</SectionTitle>
      <Paragraph position="0"> The voice banking application EG-Banking constitutes a voice portal for user-friendly, high-quality interactions for bank customers. The main functionality of EG-Banking includes a general information part (covering information on credit cards, accounts, and loans) available to the public and a transaction part (covering account flow, account balance, statements, etc.) available to customers of the bank only. The multilingual application is accessible via a cellular or fixed-network telephone.</Paragraph>
      <Paragraph position="1"> A manually refined version of the generated application is installed at Egnatia Bank in Greece and is used as a commercial product for phone banking.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 CitizenCare
</SectionTitle>
      <Paragraph position="0"> CitizenCare is an e-government dialogue system for citizen-to-administration interaction (via multiple channels like internet and public terminals), filled with content for an exemplary community. The main functionality is an interactive authority and information guide, providing different views like an administrative view, based on the hierarchical structure of the authorities, and a concern-oriented view, giving the citizen all the information needed to make use of services offered by public administration authorities.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Comparison to Other Approaches
</SectionTitle>
    <Paragraph position="0"> The GEMINI approach to setting up new dialogue applications differs in many respects from other proposals.</Paragraph>
    <Paragraph position="1"> In this section we compare our AGP with other existing approaches.</Paragraph>
    <Paragraph position="2"> Compared with the REWARD system of (Brondsted et al., 1998), the GEMINI AGP allows the generation of dialogues for several modalities. Additionally, in GEMINI we generate dialogues in standardised description languages (VoiceXML and XHTML), so we have no need to develop a special runtime system. As in the REWARD system, we placed strong emphasis on reusability.</Paragraph>
    <Paragraph position="3"> In (Polifroni et al., 2003) a rapid development environment for speech dialogues from online resources is described. The development process there first takes knowledge from various web applications and composes a database from it. This is one difference from our approach: our AGP requires an already filled database and allows the development of speech and web applications from it.</Paragraph>
    <Paragraph position="4"> Because of this, we do not need to extract any knowledge, which makes the GEMINI approach more domain independent. Another important difference is that the speech dialogue applications generated by the AGP are implemented in VoiceXML, which allows the generated dialogues to be executed with any VoiceXML interpreter.</Paragraph>
  </Section>
</Paper>