<?xml version="1.0" standalone="yes"?> <Paper uid="A92-1010"> <Title>Integrating Natural Language Components into Graphical Discourse</Title> <Section position="4" start_page="72" end_page="73" type="metho"> <SectionTitle> 3 The SIC!-System </SectionTitle> <Paragraph position="0"> We demonstrate the combination of graphics and natural language output in the context of the SIC! system. SIC! is implemented using HyperNews (HyperNews, 1989), a hypermedia-like user-interface management system which can be controlled from a LISP client. SIC! offers information on conferences; its domain consists of abstract information on conferences including workshops, tutorials, persons, institutes, and conference topics. A user who wants to obtain information from SIC! poses a query, after which he can inspect the data retrieved. This is done by selecting a presentation form. In SIC! we use several cognitively motivated presentation forms (Kemer and Thiel, 1991).</Paragraph> <Paragraph position="1"> One of these forms is the ring presentation form (cf. Figure 1), which surveys the structure of the given data. All items are positioned on a virtual ring structure, and the relations between the concepts are presented as single lines. In our example the ring presentation form contains the categories workshop and topic. The concepts of the workshops are on the left side of the figure, those of the topics on the right. However, this presentation form is clearly limited with respect to the quantity of data that can be presented simultaneously.</Paragraph> <Paragraph position="2"> Consider, for example, the following situation: A user asks for a subset of the IJCAI89 workshops and their related topics. SIC! retrieves three workshops and several topics. If the user wants to get an overview of the data's structure, then the ring is the most adequate presentation form available in SIC! for this goal. But this causes a problem when the ring form cannot display as much data as SIC! retrieved. 
There are several ways of solving this problem, depending on the user's main point of interest. If he wants to see all the data, he might select a presentation form that can show unlimited amounts of information, e.g. the table presentation form (cf. Figure 2). Alternatively, he might reformulate his initial query so that less data is retrieved. But, importantly, the user cannot be aware of the details of the current retrieval situation and its implications simply from the graphical information displayed. Hence the system has to inform him of the situation and offer possible alternatives. This needs to be done in a way that enables users to grasp the situation and to choose the appropriate alternatives for their purpose. To achieve this, the system has to create a coherent informative act that is concise and yet unambiguous (in context), giving all the information necessary for the user to determine his future actions.</Paragraph> <Paragraph position="3"> Based on work by Feiner and McKeown (1990) on the coordination of text and graphics in explanation generation and by Lombardi (1989), who examined the assignment of information to media, we assume that text is the appropriate medium for informative acts in meta-dialogs, since a well-constructed text is not only concise and easy to understand, but also guarantees the necessary flexibility to meet any situation that arises. Graphical 'acts' cannot be constructed compositionally to express possibly unforeseen complex circumstances: novel graphics must first be learnt by users, a situation avoided by the generation of situationally appropriate natural language. Thus, generating natural language text, particularly text involving controlled and appropriate deployment of text-forming resources such as rhetorical relations, enhances the total coherence of the user's dialog with the system. 
Our hypothesis is that the user's understanding of the situation and its implications is increased by the natural language output, which becomes an intermediary between the various possibilities for information presentation.</Paragraph> </Section> <Section position="5" start_page="73" end_page="75" type="metho"> <SectionTitle> 4 KOMET/Penman Text Generation System </SectionTitle> <Paragraph position="0"> We are using the KOMET/Penman system (footnote 1) (Mann and Matthiessen, 1983) to generate the natural language output our system requires. KOMET/Penman is a domain-independent text generation system based on systemic-functional grammar (Halliday, 1978). It consists of extensive grammars of English and German (Matthiessen, 1990; Teich, 1991), a linguistically motivated ontology called the Upper Model (Bateman, Kasper, Moore and Whitney, 1989), a semantic interface that relates the categories of the conceptual ontology to their possible grammatical expressions in English and German (Matthiessen, 1990), and a basic lexicon containing English and German closed-class items and default lexical realizations for the concepts in the Upper Model ontology. The definition of the lexical items includes morphological information and sets of lexical features that determine the grammatical contexts in which items are to be selected.</Paragraph> <Paragraph position="1"> The Upper Model is the component of the system that is primarily responsible for mediating between the knowledge specific to any given domain and the general lexical and grammatical expressions that are provided by a language. Because it is possible to state how any particular Upper Model concept is to be realized, subordinating domain concepts to particular Upper Model concepts causes those domain concepts to inherit appropriate forms of expression. 
For example, concepts from the object-class are usually realized as nominal phrases, while concepts from the process-class (e.g., mental-process, verbal-process, action-process, relation-process) are often realized by clauses (footnote 2). The relationship between Upper Model and domain model is diagrammed in the context of its application for SIC! in Figure 3.</Paragraph> <Paragraph position="2"> Input to the KOMET/Penman text generation system is given in terms of the Sentence Plan Language (Kasper, 1989), of which we will see examples below. An SPL expression defines the semantic content of a sentence to be generated; it consists of a set of typed variables and relations defined between those variables. Both the types and the possible relations are defined either by the Upper Model directly or by concepts or relations in the domain model that have been subordinated to the Upper Model. In addition to this information, SPL expressions may also contain direct statements in terms of the grammar's semantic interface; in practical applications these are often abbreviated by use of macros (e.g. :tense present) or are defaulted.</Paragraph> <Paragraph position="3"> 1 The original Penman system was developed at the Information Sciences Institute of the University of Southern California; the KOMET system of GMD/IPSI builds on this, working towards multilinguality and enhanced text planning capabilities.</Paragraph> <Paragraph position="4"> 2 But not always: the existence of, for example, nominalizations motivates the maintenance of two distinct levels of representation. To interface SIC! with KOMET/Penman we have to provide several types of knowledge (cf. Figure 3): * A domain model, which is a taxonomy of knowledge specific to our application domain. We split the domain into two parts: an Information-Domain (I-Domain), which contains concepts related to the information that is shown by SIC!, e.g. workshops and topics (cf. 
Figure 1), and a Presentation-Domain (P-Domain), which contains concepts related to the way this information is presented by SIC!, e.g. ring, table. By splitting the domain model we increase the adaptability in case of changes in the underlying application domain, e.g. replacing the conference knowledge base with a knowledge base on research projects. Every concept in the domain model has to be linked to some Upper Model concept, from which it inherits attributes that enable KOMET/Penman to express the concept in a grammatically correct way. The I-Domain concepts can be generated automatically from the underlying SIC! knowledge bases (cf. Figure 3). Concepts can also be associated with lexical items.</Paragraph> <Paragraph position="5"> * A domain lexicon, containing the definitions of lexical items for all the words that may appear in the application domain.</Paragraph> <Paragraph position="6"> 6 Creating the Natural Language Output</Paragraph> <Section position="1" start_page="74" end_page="74" type="sub_section"> <SectionTitle> 6.1 Planning Sentences </SectionTitle> <Paragraph position="0"> As stated in our example above, we want to produce text that, in this case, informs the user that not all the information that was requested can be shown because the current presentation-form's capacity is limited. Furthermore, we need to offer possible actions which solve this problem. In Figure 4, we show the semantic input to KOMET/Penman, expressed in SPL, that would cause KOMET/Penman to generate the first sentence that we require, i.e., &quot;The presentation-form's capacity is exceeded.&quot; (a / ascription :domain (c / capacity :owned-by (p / presentation-form)) :range (e / exceeded)) 
Fig. 4: SPL-Plan for &quot;The presentation-form's capacity is exceeded.&quot; One type of abstract concept that the system requires is the status of a particular entity that may be displayed or used.</Paragraph> <Paragraph position="1"> For presentation-forms, possible statuses are exceeded and incomplete. These status concepts can then be attributed to objects by means of the Upper Model relation ascription, which has the roles ':domain' and ':range'. These roles hold the concepts that are related, in our example the presentation-form's capacity and exceeded. In general, ':domain' contains the essential concept of the relation while ':range' contains additional information. The P-Domain concept capacity has been modeled as an object.</Paragraph> </Section> <Section position="2" start_page="74" end_page="75" type="sub_section"> <SectionTitle> 6.2 Using Rhetorical Relations </SectionTitle> <Paragraph position="0"> Figure 5 shows a more complex SPL-plan which demonstrates some of the more advanced possibilities offered by KOMET/Penman. The most interesting aspect of this plan is the use of rhetorical relations based on Rhetorical Structure Theory (RST).</Paragraph> <Paragraph position="1"> RST is a theory of the organization of natural language texts (Mann and Thompson, 1987). Mann and Thompson studied a wide variety of English texts and observed that there are approximately 25 relations that usually occur between coherent portions of English text. An RST relation consists of two parts, a nucleus and a satellite. The nucleus is the part that is most essential to the speaker's purpose, while the satellite contains additional information. The satellite is more easily replaced than the nucleus because of the nucleus' central role in the thematic progression of the discourse. 
Even though some critics question the use of rhetorical relations in discourse structure theory (Grosz and Sidner, 1986), we use RST relations because they proved to be quite useful when we link portions of information. In KOMET/Penman, RST-relations are treated the same way as other relations, e.g. ascription, which we used in the plan shown in Figure 4. Fig. 5: SPL-Plan for &quot;... shown, because the presentation-form's capacity is exceeded&quot;.</Paragraph> <Paragraph position="2"> The SPL-plan shown in Figure 5 combines two relations: the ascription-relation, which we used in the SPL-plan in Figure 4, and the existence-relation. Existence is a so-called one-place relation, because it has only a :domain role and no :range. It is usually realized as &quot;There is ...&quot;, where :domain defines what exists. We link these two relations via an RST-relation called rst-nonvolitional-result. This RST-relation implies that the nucleus, which is defined in our :domain role, is a result of the satellite, defined in :range. One possible output is &lt;domain&gt;, because &lt;range&gt;, in our case &quot;&lt;There are concepts ...&gt;, because &lt;... capacity is exceeded&gt;&quot;. Because what is defined in the :domain (&quot;There are concepts that are not shown&quot;) is not volitional, we use rst-nonvolitional-result instead of rst-volitional-result. The fact that there is data that is not shown by the current presentation-form is essential to our informational purpose. Therefore this fact becomes the nucleus (represented by :domain) of our plan.</Paragraph> <Paragraph position="3"> RST-relations which ensure the connectivity between our text segments. 
Pragmatic coherence is supported by the mere fact that we are using text as a medium for meta-dialogs, as these are difficult to understand on a graphical level.</Paragraph> </Section> </Section> <Section position="6" start_page="75" end_page="75" type="metho"> <SectionTitle> 7 Controlling Multimodal Discourse </SectionTitle> <Paragraph position="0"> The dialog manager is one of the main components of our interface system (cf. Figure 6). It chooses interaction modes (graphic or text) and controls the navigation or exploration in the information space.</Paragraph> <Paragraph position="1"> In order to prevent the user from 'being lost in hyperspace', we guide the user by case-based dialog plans (Tissen, 1991). In a case-based planning system a new plan is generated by retrieving the plan which is most appropriate to the user's goals and adapting it dynamically during the ongoing dialog. Two types of adaptations can be distinguished: first, system-driven modifications using domain-dependent background knowledge, and second, corrections of misconceptions, handled interactively in meta-dialogs with the user. The dialog manager detects misconceptions, i.e. situations in which an intended goal cannot be realized, e.g. more items were retrieved than can be displayed in the current presentation form. The corrector operates on knowledge bases of misconceptions and correction rules, e.g. &quot;if there is a misconception like 'ring presentation: not all requested data can be presented in the ring' and there is no automatic plan modification possible then start a meta-dialog, which informs the user about the situation and offers alternatives.&quot; Because meta-dialogs are handled in text mode, the dialog manager requests the SPL creator to produce SPL plans. To this end, the dialog manager informs the SPL creator of the current misconception and the possible alternatives from which the user has to choose to resolve the situation. 
Then, the SPL creator produces the appropriate SPL plans by combining information on the misconception and possible alternatives with elements from the SPL library. The SPL plans are transformed into natural language text by the KOMET/Penman system. The resulting text is returned to the dialog manager, which presents it to the user.</Paragraph> <Section position="1" start_page="75" end_page="75" type="sub_section"> <SectionTitle> 6.3 Supporting Coherence </SectionTitle> <Paragraph position="0"> In his work on coherence in multi-modal discourse, Bandyopadhyay (1990) states that there are three levels of coherence: syntactic coherence, semantic coherence and pragmatic coherence. Syntactic coherence deals with the immediate connectivity among adjacent segments (in texts this is often called text cohesion). Semantic coherence ensures the well-formed thematic organization of a discourse. Discourse segments are connected by semantic ties (Hobbs, 1983). Bandyopadhyay defines a discourse to be pragmatically coherent if it is compatible with the addressees' interpretative ability. In our system syntactic coherence is enhanced by the way we present the natural language output in our graphical environment. Semantic coherence is supported by the use of</Paragraph> </Section> </Section> <Section position="7" start_page="75" end_page="76" type="metho"> <SectionTitle> 8 Controlling utterance selection </SectionTitle> <Paragraph position="0"> The IGiNG system is intended to produce user-adapted natural language output. It is an object-oriented system consisting of several object classes (cf. Figure 7). When IGiNG is requested to produce an utterance, it calls the utterance's express method, which first builds a list of plan-objects starting from the initial plan-object given by the utterance object. Then, it is determined whether complex or short statements are desired.</Paragraph> <Paragraph position="1"> 
This information is kept in a user-stereotype and determines in which direction the list of plan-objects is to be traversed. Now IGiNG tests each plan-object's select conditions. If all conditions are satisfied, the plan-object is selected; otherwise IGiNG tries the succeeding plan-object. Finally the plan defined by the plan-object is passed to the KOMET/Penman system, which generates the utterance.</Paragraph> <Paragraph position="2"> Figure 6: Integrating SIC! and KOMET/Penman. Example: Let us consider the IGiNG-objects given in Figure 7. IGiNG is requested to express rood-1. It builds a list of possible plan-objects, which is (plan-1 plan-2). As concise statements are desired, plan-1 is tested first. Because con-1 is not satisfied (it demands that user-level is low while the current user level is advanced), plan-1 is rejected. Next, plan-2 is tested. As all conditions are satisfied, the plan given by plan-2 is generated.</Paragraph> <Paragraph position="3"> Adapting to new situations: When a new situation is to be included, the following steps have to be performed: A new proposition-instance has to be defined as the successor of an existing proposition-class. If the new proposition is not a part of any of the existing proposition-classes, a new proposition-class should be defined first.</Paragraph> <Paragraph position="4"> For each of the possible utterances a parameterizable partial sentence plan has to be written, which is stored in a plan-object, together with a reference to the plan-object's select-condition-object. Of course, it is possible to use existing plan-objects, if they are suitable for the intended purpose. Finally the plan-objects have to be linked to select-conditions. As these are domain independent, preexisting select-conditions can be reused.</Paragraph> <Paragraph position="5"> These are all the steps necessary for defining new propositions. 
The new objects inherit from their ancestors all the functionality which is necessary for selection and expression.</Paragraph> </Section> </Paper>