File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/98/p98-1024_intro.xml
Size: 10,373 bytes
Last Modified: 2025-10-06 14:06:32
<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1024"> <Title>Managing information at linguistic interfaces*</Title> <Section position="2" start_page="0" end_page="161" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> This paper describes the development and usage of interface representations for linguistic information within the Verbmobil project: a large distributed project for speech-to-speech translation of negotiation dialogues between English, German and Japanese. We take as our reference point the Verbmobil Research Prototype (Bub et al., 1997), but this is only one of a sequence of fully integrated running systems. The functional, formal and communicative role of interface terms within these systems, once instigated, has already been maintained through changes in the overall architecture and constitution of the Verbmobil consortium.</Paragraph> <Paragraph position="1"> There are two aspects to our story: * the practical and software engineering constraints imposed by the distributed development of a large natural language system.</Paragraph> <Paragraph position="2"> * the linguistic requirements of the translation task at the heart of the system.</Paragraph> <Paragraph position="3"> * This work was funded by the German Federal Ministry of Education, Science, Research and Technology (BMBF) in the framework of the Verbmobil project under grants 01 IV 701 R4 and 01 IV 701 N3. The responsibility for this article lies with the authors. 
Thanks to our many Verbmobil colleagues who helped to design and develop the results presented here, and also to the anonymous reviewers for giving useful hints to improve this paper.</Paragraph> <Paragraph position="4"> The prominence of the engineering requirements is further heightened by the fact that we are dealing with a spoken dialogue system which must strive towards real-time interaction.</Paragraph> <Paragraph position="5"> We proceed by describing the requirements of the Verbmobil Research Prototype, the actual contents of the Verbmobil Interface Term (henceforth VIT), the semantic formalism encoded within VITs and processing aspects. We conclude that VITs fulfill the joint goals of a functional interface term and an adequate linguistic representation within a single data structure.</Paragraph> <Section position="1" start_page="0" end_page="160" type="sub_section"> <SectionTitle> 1.1 Modularity </SectionTitle> <Paragraph position="0"> We are concerned here with the interface representations exchanged between modules that make use of traditional linguistic concepts. Figure 1 shows a simplified form of the Verbmobil Research Prototype architecture 1. The modules in the grey shaded area make use of VITs as linguistic interface terms and are loosely termed the &quot;linguistic&quot; modules of the system, in contrast to the other modules shown, which are chiefly concerned with the processing of acoustic signals. The linguistic design criteria for VITs derive mainly from the syntactic and semantic analysis module, labelled SynSem, and from the generation and transfer modules.</Paragraph> <Paragraph position="1"> One point that should be made at the outset is that these linguistic modules really are modules, rather than, say, organisational constructs in some larger constraints system. 
In practice, these modules are developed at different sites, may be implemented in different programming languages or use different internal linguistic formalisms, and, indeed, there may be interchangeable modules for the same function. This level of modularity, in itself, provides sufficient motivation for a common interface representation among the linguistic modules, allowing the definition of a module's functionality in terms of its I/O behaviour, but also providing a theory-independent lingua franca for discussions which may involve both linguists and computer scientists. The Verbmobil community is actually large enough to require such off-line constructs, too. 1In that we have excluded the modules that employ alternative techniques and express no interest in linguistic information.</Paragraph> </Section> <Section position="2" start_page="160" end_page="161" type="sub_section"> <SectionTitle> 1.2 Encoding of Linguistic Information </SectionTitle> <Paragraph position="0"> A key question in the design of an interface language is what information must be carried and to what purpose. The primary definition criterion within the linguistic modules has been the translation task. The actual translation operation is performed in the transfer module as a mapping between semantic representations of the source and target languages, see (Dorna and Emele, 1996). However, the information requirements of this process are flexible, since information from various levels of analysis is used in disambiguation within the transfer module, including prosody and dialogue structure. 
To a large extent the information requirements divide into two parts: * the expressive adequacy of the semantic representation; * representing other types of linguistic information so as to meet the disambiguation requirements with the minimum of redundancy.</Paragraph> <Paragraph position="1"> The design of the semantic representations encoded within VITs has been guided by an ongoing movement in representational semantic formalisms which takes as a starting point the fact that certain key features of a purely logical semantics are not fully defined in natural language utterances and typically play no part in translation operations. This has been most clearly demonstrated for cases where quantifier scope ambiguities are exactly preserved under translation. The response to these observations is termed underspecification and various such underspecified formalisms have been defined. In one sense underspecification is, in itself, a form of information management, in that only the information that is actually present in an utterance is represented, further disambiguation being derived from the context. In the absence of such contextual information further specificity can only be based on guesswork.</Paragraph> <Paragraph position="2"> While the management of information in the VIT semantics consists of leaving unsaid what cannot be adequately specified, the amount of information and also the type of information in the other partitions of the VIT (see Section 2.1) has been determined by the simple principle of providing information on justified demand. The information provided is also quite varied but the unifying property is that the requirements are defined by the specific needs of transfer, in distinguishing cases that are truly ambiguous under translation or need to be resolved.</Paragraph> <Paragraph position="3"> For example: (1) Geht das bei ihnen? a. Is it possible for you? b. Is it possible at your place? 
In (1), the German preposition bei displays an ambiguity between the experiencer reading (1a) and the spatial interpretation (1b). The resolution of this ambiguity requires in the first instance three pieces of information: the type of the verb predicate, the sort of the internal argument of bei and the sort of the subject. This, in turn, requires the resolution of the reference of the anaphor das, where morphosyntactic constraints come into play. If the referent has the sort time then the experiencer reading (1a) can be selected. This is the more usual result in the Verbmobil context. Should the referent be sortally specified as a situation, further information will be required to determine the dialogue stage, i.e. whether the time of appointment is being negotiated or its place. Only in the latter case is the spatial reading (1b) relevant. [Table: contents of a VIT] * combines a unique tag for the turn segment described by the current VIT and the word lattice path used in its linguistic analysis; * a triple consisting of the entry points for traversing the VIT representation; * labelled conditions describing the possibly underspecified semantic content of an utterance; * scope and grouping constraints, e.g. used for underspecified quantifier and operator scope representation; * sortal specifications for instance variables introduced in labelled conditions; * additional semantic and pragmatic information, e.g. discourse roles for individual instances; * morpho-syntactic features, e.g. number and gender of individual instances; * morpho-syntactic tense combined with aspect and sentence mood information, e.g. used for computing surface tense; * prosodic information such as accenting and sentence mood.</Paragraph> <Paragraph position="4"> (2) Dann würde das doch gehen a. Then, it WOULD be possible, after all.</Paragraph> <Paragraph position="5"> b. It would be possible, wouldn't it? 
Consider the discourse particle doch in (2) which can be disambiguated with prosodic information.</Paragraph> <Paragraph position="6"> When doch is stressed and the utterance has falling intonation, it functions as a pointer to a previous dialogue stage. Something that was impossible before turned out to be feasible at the utterance time. Then, doch is translated into after all and the auxiliary takes over the accent (2a) 2. If (2) has a rising intonation and the particle is not stressed, it signals the speaker's expectation of the hearer's approving response. In English, this meaning is conveyed by a question tag (2b). Lieske et al. (1997) provide a more detailed account of the use of prosodic information in Verbmobil.</Paragraph> <Paragraph position="7"> In addition to the information that is explicitly represented in the specified fields of a VIT, including the surface word order that can be inferred from the segment identification, and the resolution of underspecified ambiguities in context, transfer might require further information, such as domain-specific world knowledge, speech act or discourse stage information. 2We indicate prosodic accent with SMALL CAPITALS.</Paragraph> <Paragraph position="8"> This information can be obtained on demand from the resolution component (see Figure 1). This flexible approach to the information required for transfer is termed cascaded disambiguation (Buschbeck-Wolf, 1997) and is balanced against the fact that each level of escalation implies a correspondingly greater share of the permissible runtime.</Paragraph> </Section> </Section> </Paper>