File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/00/w00-1412_metho.xml
Size: 19,963 bytes
Last Modified: 2025-10-06 14:07:26
<?xml version="1.0" standalone="yes"?> <Paper uid="W00-1412"> <Title>Incremental Event Conceptualization and Natural Language Generation in Monitoring Environments</Title> <Section position="3" start_page="85" end_page="88" type="metho"> <SectionTitle> 2 Conceptualizing Events </SectionTitle> <Paragraph position="0"> [...] way that a sequence of clear-cut modules arises, which can work in parallel. Parallel processing has the advantage that several tasks can be done simultaneously; thus, while utterances are generated for some input, subsequent input can already be taken in and processed.</Paragraph> <Paragraph position="1"> Simultaneous conceptualization can be used as the basis of systems producing verbal messages when they detect a (possibly) safety-critical development while monitoring a safety-critical system, such as intensive care units, nuclear power plants, or airports. A module for the generation of natural language can be an effective enhancement for monitoring, mainly for two reasons. First, in most cases operators are busy observing many displays. Here the auditory presentation of information can make use of idle cognitive resources of the operators and thus reduce their workload in directing their attention to a development that may lead to hazardous situations (footnote 2). Second, the essential piece of information can be extracted from a highly complex set of multimodal information and presented by the system in a crisp way. Language is the best conceivable means to transfer information as pointedly as possible. Moreover, taking the dynamics of the permanently changing world into account has the advantage that safety-critical situations can be anticipated earlier and much more reliably. 
Conventional systems, in contrast, just compare actual measurements with allowed values and give a warning or an alarm when a violation occurs. But it is more useful, e.g. for a nurse, if the system tells her that a patient's blood pressure is rapidly dropping than that his blood pressure is already dangerously low.</Paragraph> <Paragraph position="2"> We see the proposed implementation of an incremental conceptualizer also as a means to gain [...] Footnote 2: The multimodal monitoring environment proposed here reflects the division of labor between the components in working memory (Baddeley, 1986), especially between the visuospatial sketchpad (VSSP) and the phonological loop. Since the observation of multiple display units puts a heavy strain on the VSSP, spoken natural language as input of critical information would use that subcomponent of working memory, namely the phonological loop, which is less strained.</Paragraph> <Paragraph position="3"> If the system has to produce descriptions about what it 'sees', the main conceptual task is building up conceptual entities representing spatio-temporal constellations of the external world, i.e. event conceptualization. Events emerge from dynamic input data, which are segmented by the conceptual system into meaningful units (Avrahami &amp; Kareev, 1994). They are therefore internal representations rather than external entities: &quot;[...] events arise in the perception of observers&quot; (Zacks, 1997). Consequently, a language production system designed for verbalizing what the system perceives has to deal with information stemming from multiple modalities, e.g. auditory and spatial. In particular, a continuous multimodal 'perceptual stream' has to be translated into discrete propositional output (preverbal messages) that can be encoded linguistically, cf. Levelt (1989). To meet such demands, three subtasks have to be solved: (1) the input stream has to be subdivided into 'perceptual units'; (2) conceptual representations have to be built up from these 'percepts', which (3) have to be combined into preverbal messages. 
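The three subtasks can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the model described in this paper: the input is taken to be a list of speed readings, a perceptual unit is assumed to be a maximal run of samples with the same motion state, and the names Sample, segment, conceptualize and to_preverbal are hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Sample:
    t: float      # timestamp (assumed unit: seconds)
    speed: float  # speed reading (assumed unit: knots)


def segment(stream):
    """Subtask 1: cut the stream into perceptual units of uniform motion state."""
    units = []
    for moving, group in groupby(stream, key=lambda s: s.speed > 0):
        units.append((moving, list(group)))
    return units


def conceptualize(units):
    """Subtask 2: build one event concept per perceptual unit."""
    events = []
    for moving, samples in units:
        events.append({
            "sort": "MOVE" if moving else "STOP",
            "start": samples[0].t,
            "end": samples[-1].t,
        })
    return events


def to_preverbal(events, actor):
    """Subtask 3: combine the concepts into simple propositional tuples."""
    return [(e["sort"].lower(), actor, e["start"], e["end"]) for e in events]


# Hypothetical readings: standing, moving for two samples, standing again.
stream = [Sample(0, 0.0), Sample(1, 5.0), Sample(2, 6.0), Sample(3, 0.0), Sample(4, 0.0)]
messages = to_preverbal(conceptualize(segment(stream)), "ck314")
print(messages)
```

For these five samples the sketch produces a STOP, a MOVE, and a second STOP unit, each turned into one message tuple. A real conceptualizer would of course segment on far richer criteria than a speed threshold.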
For the time being, we take the input stream to be strictly sequential, but later versions of our model will compute simultaneous events as well.</Paragraph> <Paragraph position="4"> According to Habel &amp; Tappe (1999), the function of the conceptualizer can be subdivided into the following processes: segmentation &amp; grouping, structuring, selection, and linearization. The first process operates on the (perceptual) input, which is segmented into meaningful basic units (segmentation), and--if possible--two or more of these units are grouped together to form more complex entities (grouping). The structuring process builds up multilevel hierarchical structures from these meaningful basic units.</Paragraph> <Paragraph position="5"> To exemplify these steps we use the scenario of monitoring the taxiing of an aircraft, viz. the movements of an aircraft from the terminal to its assigned runway and vice versa. Air traffic controllers who guide the movements of aircraft on the ground (surface movement controllers, SMC) have to rely mainly on visual information--either looking out of the window of the control tower or getting information from an airfield control monitor--and on communication with the aircraft crews. Yet, in some conditions, e.g. in low visibility, this method is not failsafe (although reliable). It forces the crews to decrease speed--and thus increases the number and duration of delays--but it also results in greater safety risks (footnote 3). A supporting system that monitors the occurrences on the taxiway can mitigate these effects. Footnote 3: The reports of incidents and accidents of the Australian Bureau of Air Safety Investigation are a rich source of occurrences that should not happen in civil aircraft operations.</Paragraph> <Paragraph position="6"> Figure 1. Monitoring the Taxiing of an Aircraft: Phases of a Complex Event. We will demonstrate the workings of our model of an incremental conceptualizer, which produces natural language messages for the SMC, with the example depicted in Figure 1. The flight with the number CK-314 shall taxi from the terminal via taxiway Echo to runway 27. The initial position of the airplane is A. It then starts to move until it reaches position B right before a junction, where it has to stop and wait until the way is clear before moving on. It then starts again and continues (C), but its velocity is too high at point D. Consequently the plane might not be able to branch off at the junction where it is supposed to turn left into runway 27. At that point the monitoring system generates a warning that the plane is in danger of missing the junction. (If the plane indeed misses the junction, an alert is generated, but our example does not include this.) For this task two kinds of information have to be available: the planned movements of the plane and its actual movements. While the former information could be handled directly by the conceptualizer because it is inherently discrete, the latter is information about a continuously changing world. Here the perpetual continuous input stream has to be transformed into discrete items. This process consists, on the one hand, of segmentations into discrete units, e.g. a STOP event, and, on the other hand, of some groupings. The movement from position B to position C, for example, contains--at least--three sub-phases corresponding to a straight, a curved, and a second straight section of the trajectory. These three phases can be distinguished by segmentation, but are combined by a grouping. Furthermore, the structuring process has to build up the different phases to form the TAXI event, cf. Figure 3. 
The third of the above-mentioned sub-processes, selection, has two functions: first, it detects that there is a conflict or a (possibly) safety-critical development or situation and decides that a warning has to be generated. Second, it selects the required information for a suitable warning or alert. Since the verbal warning can be given on different levels of detail, it is necessary to select appropriate events from the event hierarchy for further verbalization. On the one hand, it would not be adequate to produce a general warning like &quot;Taxiing problem&quot; (except perhaps when there is not enough time or no more information available at that moment); on the other hand, it would not be suitable to give an in-depth description of each part of the taxiing. Finally, the selected items are brought into an appropriate order by the fourth process, linearization. Before we describe the internal structure of the conceptualizer in section 3, we want to discuss the core idea of 'event conceptualization' in more detail. The first question to be answered is how a continuous (perceptual) input stream can be segmented into separate events. According to the cut hypothesis of Avrahami and Kareev (1994, p. 239), &quot;A sub-sequence of stimuli is cut out of a sequence to become a cognitive entity if it has been experienced many times in different contexts.&quot; This segmentation takes place in the 'eye of the observer'. Hence, event conceptualization partly depends on individual as well as on contextual factors. 
The idea of the cut hypothesis implies the existence of basic events, which are the building blocks in our experience used to trigger segmentation. They are minimal conceptual entities that human observers ascribe a beginning and an end to. Thus, they are perceptual wholes--although they may have an internal structure--and are therefore the basis for the interface between perception and cognition. Basic events can be grouped together to form complex events, e.g. assuming that the four basic events GRIP THE HANDLE OF A WINDOW, TURNING THE HANDLE, PULL, and LET GO THE HANDLE are perceived, the complex event OPENING A WINDOW can be built up. Furthermore, subsequent events of opening all windows of a room can be grouped to AIRING. But events can not only be grouped but also segmented: if the event OPENING A WINDOW is perceived, it can be segmented into the respective sub-events. We assume that hierarchical event structures, which are based on knowledge about the internal structure of prototypical events, e.g. in the format of scripts [...] 
Incremental processing is the 'piecemeal' and parallel processing of a sequential information stream. It is a specific kind of parallel processing in that the processes have a fixed order, which De Smedt &amp; Kempen (1987) describe as a 'cascade of processes', in analogy to a water cascade. This metaphor means that, for example, the grammatical encoding--including lexical access--of an utterance segment cannot take place until the information 'splashes down' from the conceptual encoding process. Figure 2 sketches such a cascade of dependent parallel processes in our model of the conceptualizer: the cascade consists of the processes construction, selection, linearization, and pvm-generation (preverbal-message-generation). These processes also constitute a pipeline in Reiter's (1994) sense, but they do work in parallel.</Paragraph> <Paragraph position="7"> One central parameter of incremental processing, which is highly relevant for the format of preverbal messages, is the size of the increments.</Paragraph> <Paragraph position="8"> Assume that a description (no warning this time) of the turning of flight CK-314 into taxiway Echo is to be given. This could be done by a proposition like turn(ck314, goal(tw-echo)), which is a potential increment for a preverbal message. 
Yet, such a proposition would have to be built up completely before the subsequent components can begin forming it into a sentence like 'Flight CK 314 turns into taxiway Echo.' Hence the formulator could not start processing the first element, say turn, as soon as it is received from the conceptualizer. In contrast to this, we opt for an architecture in which the selection of appropriate lemmas from the lexicon can start for parts of a preverbal message before other entities are built up on the preverbal message level.</Paragraph> <Paragraph position="9"> As a consequence, the dynamics of incremental processing demand a modified notion of preverbal messages. We conceive of them no longer as complete propositions--as is mostly the case in approaches combining Levelt's ideas with conceptual semantics--but as sequences of well-formed propositional structures on a sub-propositional level; in logical terminology: predicate symbols, functional expressions, terms, etc. The incremental formulator SYNPHONICS, which takes specific well-formed parts of propositions as input, follows these principles (Abb et al. 1996).</Paragraph> <Section position="1" start_page="87" end_page="88" type="sub_section"> <SectionTitle> 3.1 Coarse Architecture </SectionTitle> <Paragraph position="0"> In short, our conceptualizer performs the task 'Give warnings about (possibly) safety-critical developments and situations!' It operates on two different input streams: a discrete one, which contains the plans for the movements of the aircraft on the ground, and a continuous one, which originates in the sensors distributed over the taxiway. Since the conceptualizer cannot directly operate on the continuous input stream, this input information must be converted into a stream of discrete basic entities, which are basic events in this case. In our example a basic event is induced by sensor data sent to the monitoring system,</Paragraph> <Paragraph position="1"> e.g. 
that a particular aircraft passes its position.</Paragraph> <Paragraph position="2"> Since the other input stream is already discrete, it simply has to be adapted to the required input format of the conceptualizer, i.e. it has to be converted into basic events as well. We will neglect this process and concentrate on the continuous input stream.</Paragraph> <Paragraph position="3"> Based on Habel &amp; Tappe (1999) we propose a model of the conceptualizer as depicted in Figure 2. It consists mainly of four incremental (cascaded) processes that work on the blackboard-like current conceptual structure (CCR). At first sight, the use of a data structure to which more than one process has access seems to collide with the notion of a cascaded information stream. These processes are interdependent in such a way, however, that they indeed behave incrementally; e.g.</Paragraph> <Paragraph position="4"> the selection process cannot select anything that has not been inserted into the CCR (constructed).</Paragraph> <Paragraph position="5"> The CCR can be seen as a shared memory unit with a common data structure. A third kind of information is needed for a representation of the state of affairs: the constellation of the terminals, taxiways, runways, and the participating object(s), or, more generally, the spatial arrangement of the world and information about objects in it.</Paragraph> <Paragraph position="6"> For example, there is one node that stands for flight CK314, and all the nodes shown in Figure 3 are linked to it via an actor relation. Since this type of information is not in the focus of the present paper, we will not discuss it in the following.</Paragraph> <Paragraph position="7"> In addition to the cascaded processes there is a concept lexicon, accessible via a concept matcher: these modules, which are called by the construction process, find best matches for structures that can either be subsumed by a more complex concept or may represent still incomplete concepts. The first is necessary to build up hierarchical structures at all. The second is needed for the generation of expectations about developments in the near future. When, for example, flight CK-314 is at position D, the expectation is generated that it will go on straight at the next junction or that it will be unable to turn left at the next junction when keeping the current velocity (footnote 4). 
On the other hand, after the two nodes START1 and CHPOS1 (Figure 3) are constructed, these are given to the concept matcher for a subsumption test, which consists of trying to match the nodes onto more complex concepts. This yields that they can be joined together to a MOVE node (MOVE1). Thus, the matcher informs the construction process that a STOP event (STOP1) will probably occur in the near future, which illustrates the second function of the matcher: the generation of expectations. (Even the last MOVE of a sequence of MOVE events contains a STOP event, because aircraft stop at the beginning of the runway, which is the last event of the taxiing, before they commence the takeoff.) The construction process inserts these two new nodes together with the information that the STOP1 node is just a hypothesis up to now, nothing actually perceived.</Paragraph> <Paragraph position="8"> The four cascaded processes that constitute the 'heart' of the conceptualizer and that will be described in more detail in the following sections are: (1) construction, (2) selection, (3) linearization, (4) pvm-generation. The first process comprises the processes segmentation &amp; grouping as well as structuring of Habel &amp; Tappe (1999), apart from the segmentations that are already done in the pre-processing step.</Paragraph> </Section> </Section> <Section position="4" start_page="88" end_page="89" type="metho"> <SectionTitle> Footnote 4: The computation of the velocity is easily done from </SectionTitle> <Paragraph position="0"> the sensor data.</Paragraph> <Paragraph position="1"> The selection and the linearization processes correlate to the ones in Habel &amp; Tappe (1999): the first selects nodes for verbalization, while the second brings them into an appropriate order. The pvm-generation is an additional process and guarantees that the selection as well as the linearization have some time to change (the order of) the selected nodes before they are passed on to the formulator. We call this time span the latency time.</Paragraph> <Paragraph position="2"> 
For the implementation of this architecture and (a first version of) the algorithms we use a formalism called referential nets (Habel, 1986), which was developed to represent linguistic as well as common sense knowledge. Entities are represented by referential objects (refOs), which can be connected via relations, so that a network structure arises. The basic entities the pre-processing component produces already contain some information about what attributes (e.g. which sort) have to be ascribed to a refO. In the following we use symbolic constants to refer to refOs. 
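As a rough illustration of such a network, the following Python sketch stores refOs under symbolic labels and connects them with named relations. It is a minimal sketch under our own assumptions, not the referential-net formalism itself: the class names RefO and RefNet, the relation name included_in, and the sort Aircraft are hypothetical, while the actor relation and the sort A-Event are taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class RefO:
    label: str                                      # arbitrary symbolic constant
    attributes: dict = field(default_factory=dict)  # e.g. {"sort": "A-Event"}


class RefNet:
    """Toy network of refOs connected by named, directed relations."""

    def __init__(self):
        self.refos = {}
        self.relations = set()  # triples: (relation name, source label, target label)

    def add(self, label, **attributes):
        self.refos[label] = RefO(label, dict(attributes))
        return self.refos[label]

    def relate(self, name, source, target):
        if source not in self.refos or target not in self.refos:
            raise KeyError("both refOs must exist before they are related")
        self.relations.add((name, source, target))

    def targets(self, name, source):
        return sorted(t for (n, s, t) in self.relations if n == name and s == source)


net = RefNet()
net.add("ck314", sort="Aircraft")            # hypothetical sort name
net.add("TAXI", sort="A-Event")
net.add("MOVE1", sort="A-Event")
net.relate("actor", "TAXI", "ck314")
net.relate("actor", "MOVE1", "ck314")
net.relate("included_in", "MOVE1", "TAXI")   # hypothetical relation name
print(net.targets("actor", "MOVE1"))
```

The labels only serve as dictionary keys here; nothing in the sketch depends on what they are called.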
These are just arbitrary labels; the important point is that the refOs can be related to suitable refOs of subsequent processes, which, for example, stand for lexical items.</Paragraph> <Section position="1" start_page="88" end_page="89" type="sub_section"> <SectionTitle> 3.2 Construction </SectionTitle> <Paragraph position="0"> The construction process takes basic entities as input and builds up a hierarchical knowledge representation of the perceived states of affairs in the CCR. In the domain we discuss here, three relations are especially relevant for the representation of events: (temporal) inclusion (⊆), temporal precedence (≺), and the match of planned events onto actual events (μ). For the example described above, the sub-net of the referential net that contains the actual events (the ones that have sort A-Event) is depicted in Figure 3 (the velocity problem has just been detected). MOVE2, for example, is temporally included in the event TAXI (MOVE2 ⊆ TAXI), the event MOVE1 is the temporal predecessor of MOVE2 (MOVE1 ≺ MOVE2), and a matching between a planned and an actual event is expressed by μ, e.g. between MOVE1 and the planned movement from position A to B.</Paragraph> <Paragraph position="1"> MOVE is a label for complex events that consist of maximally three sub-events, namely</Paragraph> </Section> </Section> </Paper>