<?xml version="1.0" standalone="yes"?>
<Paper uid="W94-0312">
  <Title>Generating Event Descriptions with SAGE: a Simulation and Generation Environment</Title>
  <Section position="3" start_page="99" end_page="101" type="metho">
    <SectionTitle>
2. EVENTS
</SectionTitle>
    <Paragraph position="0"> In this section, we address the problem of representing and describing events. The goal is twofold: to identify the information that must be represented in order to take advantage of all the resources a language provides for describing events (which involves determining which distinctions the language supports), and to determine at what level that information should be represented and where the decisions that make those distinctions should be made.</Paragraph>
    <Paragraph position="1"> We first outline six different kinds of information needed for the expression of events: linear time, event type, temporal modifiers, event structure, argument structure, and agency. In section three, we describe the architecture of SAGE and show where the decisions supporting the distinctions in the expression of events are made.</Paragraph>
    <Section position="1" start_page="99" end_page="101" type="sub_section">
      <SectionTitle>
2.1 Information for Events
</SectionTitle>
      <Paragraph position="0"> First, in order to generate events, there needs to be a model of linear time. Most of the current work on tenses is based on a Reichenbachian-style analysis, which involves three temporal notions: point of speech, point of the event, and point of reference, as we showed above in examples (1) and (2).</Paragraph>
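The three-point analysis can be illustrated with a small sketch. The Python function below is an illustration only and is not part of SAGE; the function name and the use of plain numbers for the event (E), reference (R), and speech (S) points are assumptions. It covers only the simple and perfect configurations of the standard Reichenbachian scheme.

```python
def reichenbach_tense(event, reference, speech):
    """Name the English tense for Reichenbach's E, R and S time points."""
    # The relation of reference time to speech time gives past/present/future.
    if reference == speech:
        anchor = "present"
    elif speech > reference:
        anchor = "past"
    else:
        anchor = "future"
    # An event that precedes the reference time yields a perfect tense.
    if reference > event:
        return anchor + " perfect"
    if reference == event:
        return "simple " + anchor
    raise ValueError("posterior (prospective) configurations are not covered")
```

For instance, an event and reference point that coincide before the speech point (E = R = 1, S = 2) come out as the simple past, while E = 1, R = 2, S = 3 gives the past perfect.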
      <Paragraph position="1"> Another well-recognized distinction is that of event types, such as state, process, and transition, exemplified by the following examples:
 3. The mouse is under the table. (state)
 4. Fluffy ran. (process)
 5. Peter found his keys. (transition--achievement)
 6. Helga wrote a letter. (transition--accomplishment)
 While verbs have an intrinsic type (e.g. wait is a process and catch is a transition), these types also apply to whole phrases, since tense, aspect, adjuncts and arguments can compose with the type of the lexical head to form a new type:
 7. Fluffy ran into the kitchen. (process --&gt; transition)
 8. Helga is writing a letter. (transition --&gt; process)
 9. The mouse is caught. (transition --&gt; state)
 10. Roscoe builds houses. (transition --&gt; iteration)
 Four kinds of temporal adverbials can be distinguished and are linked to the event types. Duration adverbials modify processes, as in example (11a), but not transitions (11b); frame adverbials modify accomplishments, as in (12a), but not processes (12b); point adverbials modify achievements, as in (13); and frequency adverbials modify iterative events, as in (14).</Paragraph>
      <Paragraph position="2">  b) * Peter waited in the lobby in an hour. 13. Hank found the pen at four o'clock. 14. Martha writes letters frequently.</Paragraph>
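The link between adverbial kinds and event types amounts to a small lookup table. The following Python sketch is illustrative only; the table name, the function name, and the string labels are assumptions, and the table records just the four pairings stated in the text.

```python
# Which event type each kind of temporal adverbial modifies, per the text.
ADVERBIAL_TARGET = {
    "duration": "process",
    "frame": "accomplishment",
    "point": "achievement",
    "frequency": "iteration",
}

def adverbial_fits(adverbial_kind, event_type):
    """Return True when the adverbial kind is licensed for the event type."""
    return ADVERBIAL_TARGET[adverbial_kind] == event_type

# Example (12b): a frame adverbial ("in an hour") on a process ("waited").
print(adverbial_fits("frame", "process"))  # False
```

A generator could consult such a table before attaching a temporal adjunct, so that the starred combinations are never produced.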
      <Paragraph position="3"> It is also clear that events are not undifferentiated masses, but rather have subparts that can be picked out by the choice of phrase type or the addition of adverbial phrases. Moens &amp; Steedman (1988) identify three constituents of an event nucleus: a preparatory process, a culmination, and a consequent state. Nakhimovsky (1988) identifies five: preparatory, initial, body, final, and result, exemplified by the following (footnote 1):
 15. When the children crossed the road,
 a) they waited for the teacher to give a signal.
 b) they stepped onto its concrete surface as if it were about to swallow them up.
 c) they were nearly hit by a car.
 d) they reached the other side stricken with fear.
 e) they found themselves surrounded by strangers.</Paragraph>
      <Paragraph position="4"> Pustejovsky (1991) offers a much more compositional notion of event structure, where a transition is the composition of a process and a state. This analysis is more closely tied to the lexicon than Moens and Steedman's or Nakhimovsky's (and is offered in the context of a generative theory of lexical semantics). It accounts not only for the semantics of verbs, but also for their composition with adjuncts to form new types, as in (7) above.</Paragraph>
      <Paragraph position="5"> The participants of an event are those entities that act in or are acted upon in the event. The argument structure is the set of participants in the event that are grammaticized with respect to a particular lexicalization of the event, such as the agent, theme, source, and goal. For some event types (especially those that appear as examples in linguistics papers), the distinction between what is an argument and what is an adjunct is clear. For example, in "Fluffy ate a bone in the dining room yesterday", "Fluffy" (the agent) and "a bone" (the theme) are arguments, whereas the location and time are adjuncts. For other verbs, however, the distinction is not so clear, as in "Mickey slid into home plate", where the location is a necessary participant to the meaning, yet as a location it would be treated as an adjunct in most analyses.</Paragraph>
      <Paragraph position="6"> Agency in an event is an aspect of the argument structure, but since there are some important generalizations over this participant that are not true of others, we treat it separately. One of the most widely discussed syntactic variations is the active/passive alternation, which varies the placement or inclusion of the agent. As discussed in Meteer (1991), there are really many different motivations for what is often characterized as a single "switch" in generators. The degree of explicitness of the agent in different syntactic constructions can be seen in the following set of examples, from the explicit inclusion of the agent in the subject position in (a), to the movement of the agent to the by-phrase in (b), to the deletion of the agent in (c), to an adjectival construction in (d) using the past participle form of the verb, to a result construction in (e) that includes no indication of agency. Notice that the explicitness of the event's tense diminishes along with the agency.</Paragraph>
      <Paragraph position="7"> 1 Nakhimovsky, 1988, p.31.</Paragraph>
      <Paragraph position="9"> a) Peter tore the shirt.
 b) The shirt was torn by Peter.
 c) The shirt was torn yesterday.
 d) Peter wore the torn shirt yesterday.
 e) No one noticed the tear in the shirt. (cf. No one noticed the missing button.)</Paragraph>
      <Paragraph position="10"> Another argument that agency should be treated specially is made by Pustejovsky (1991) in his work in generative lexical semantics and event structure. Pustejovsky argues that some distinctions usually characterized by event type or argument structure are actually rooted in agency, such as the difference between verbs that are lexically transitions but have unaccusative and causative variants ("The door closed" vs. "Thelma closed the door"). Furthermore, the difference between the two types of transitions, accomplishments vs. achievements, is based on an agentive/non-agentive distinction. According to Pustejovsky, accomplishments (such as build, draw, and leave) include both the act and the causation in their semantics, whereas in achievements (such as win, find, and arrive) agency is not an intrinsic part of the semantics of the verb, but is rather based on something else, such as the configuration of elements (someone wins when they are at the front in some competition at a particular moment, given some particular evaluation function). This is substantiated by the interaction of "deliberately" with these verbs, shown in the examples below:</Paragraph>
      <Paragraph position="11"> 19. a. Helga deliberately drew a picture.
 b. *Helga deliberately found the pen.</Paragraph>
      <Paragraph position="12"> 20. a. Peter deliberately left the party.</Paragraph>
      <Paragraph position="13"> b. *Peter deliberately arrived at the party.</Paragraph>
      <Paragraph position="14"> Having identified the information necessary for the description of events, the next step in the research is to determine which levels should be responsible for the representation of the information. In particular, what aspects of the event description are * dependent on the event itself (a fact of the world/model); * dependent on the discourse context; * dependent on what linguistic resources are available (e.g. lexicon and syntax) and constraints on their composition.</Paragraph>
      <Paragraph position="15"> SAGE allows us to approach these questions experimentally: it provides a context in which to decide where the information is best represented and where the decisions are best made. In the next section, I describe SAGE, its components, and how they interact. I also indicate where in that architecture the information for event descriptions is represented.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="101" end_page="101" type="metho">
    <SectionTitle>
3. THE COMPONENTS OF SAGE
</SectionTitle>
    <Paragraph position="0"> SAGE is a package of integrated tools to aid in exploring the relationship between simulated events in a multi-agent environment, the narration of those events by a text generator, and the animation of the events with simple graphics. There are three main components to SAGE:</Paragraph>
    <Section position="1" start_page="101" end_page="101" type="sub_section">
      <SectionTitle>
3.1 The Modelling Component of SAGE
</SectionTitle>
      <Paragraph position="0"> The underlying program of SAGE, that is, the part in which objects and events are modelled, is a knowledge-based simulation system with two parts: the knowledge representation language and the simulator. The objects and events are modelled primarily in VSFL (the Very Simple Frame Language), which is an amalgamation of a knowledge representation language and an object-oriented programming language. As a descendant of KL-ONE (Brachman &amp; Schmolze 1985), it provides concept and role hierarchies and multiple inheritance of roles (including role restrictions and defaults) (footnote 3).</Paragraph>
      <Paragraph position="1"> The knowledge base in SAGE is what ties together the main components. It acts as a central resource, providing definitional information for types and relations. The type of an object controls its actions in the simulation, the way it is expressed by the generator, and how it is displayed by the graphics component. For example, if the generator is referring to the object #&lt;fluffy&gt;, which is of type dog, it uses the mapping of concept dog to the class of alternative expressions for named individual (such as using the name,</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="101" end_page="102" type="metho">
    <SectionTitle>
2 VSFL ("Very Simple Frame Language") and SCORE ("Sproket
</SectionTitle>
    <Paragraph position="0"> Core") were developed at BBN Systems and Technologies by Glenn Abrett and Jeff Palmucci, with assistance from Mark Burstein and Stephen Deutsch. VSFL is a reimplementation of SFL, which is a descendant of KL-ONE. SCORE is a reimplementation of the SPROKET simulator. See Abrett et al. 1989 for a more detailed description of these systems.</Paragraph>
    <Paragraph position="2"> 3 VSFL is "very simple" in that it does not support automatic classification and does not have a graphical editor (though it does have a graphical viewer). Its integration with CLOS (Common Lisp Object System) supports the creation of instances and the ability to associate methods with concepts.</Paragraph>
    <Paragraph position="3"> The integration with CLOS also provides more efficient slot accessors and other optimizations.</Paragraph>
    <Paragraph position="4"> a pronoun, "I" if fluffy is the speaker, a generic reference "a dog" if he is being introduced and not known, etc.). The graphics component uses the fact that the type "dog" inherits from "agent" and agents are drawn using triangles pointing in the direction the agent is facing. There is a core knowledge base which contains the set of concepts that are used by all domains, such as ACTION, OBJECT, LOCATION. This is similar to the upper model used in Penman (Bateman 1989) (footnote 4). Events are represented as goals and procedures in the simulator and are also linked to the knowledge base through their types, which are concepts in the knowledge base. This provides a classification of events into the three main event types: state, process, and transition. The parameters to those goals/procedures are the roles on the concept, defining the participants in the event, as well as associated information, such as location.</Paragraph>
    <Paragraph position="5"> The simulator SCORE is an event-based simulator that supports multiple agents executing parallel goals and actions. SCORE provides a language for declaratively representing the plans of agents, where a plan is a partial ordering of procedures and subgoals for accomplishing goals and handling contingencies. Goals define the intentions of agents (goals succeed or fail) and procedures define a sequence of actions and decision points (procedures complete or are interrupted). The primitives in this system are actions, which are simply Lisp functions.</Paragraph>
    <Paragraph position="6"> The hierarchical structure of the plans, with procedures defined in terms of subprocedures and actions, defines the structure of events, in the sense of Nakhimovsky, described above. The procedure for cross-the-road, for example, would be defined in terms of prepare-to-cross (look both ways, wait for traffic, wait for teacher's signal, etc.), step onto the road, walk across, step onto the other side, with a consequent change in that agent's state (more specifically, his location) from one side of the street to another. Note that in these terms, the constituents of an event are a fact of the model and the level of granularity that is represented, and not a linguistic issue. We can describe the event as a single action "cross the road", but with an animation component, each of the steps must be modelled as well (depending, of course, on the granularity of the animation, since if the "road" is a single line, then a single action might be adequate to move the agent across it).</Paragraph>
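The idea that event constituents fall out of the plan hierarchy, at whatever granularity the model represents, can be sketched as follows. This Python fragment is an illustration only: the Procedure class, the expand function, and the hyphenated step names are assumptions introduced here and do not reflect SCORE's actual plan language.

```python
from dataclasses import dataclass, field

@dataclass
class Procedure:
    name: str
    steps: list = field(default_factory=list)  # sub-procedures or primitive actions

def expand(procedure, depth):
    """List the constituents of a procedure down to the given granularity."""
    if depth == 0 or not procedure.steps:
        return [procedure.name]
    names = []
    for step in procedure.steps:
        names.extend(expand(step, depth - 1))
    return names

CROSS_THE_ROAD = Procedure("cross-the-road", [
    Procedure("prepare-to-cross", [
        Procedure("look-both-ways"),
        Procedure("wait-for-traffic"),
        Procedure("wait-for-teachers-signal"),
    ]),
    Procedure("step-onto-the-road"),
    Procedure("walk-across"),
    Procedure("step-onto-the-other-side"),
])
```

Calling expand(CROSS_THE_ROAD, 0) yields the single action, which is enough for a coarse description, while depth 1 or 2 yields the finer steps an animation component would need.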
    <Paragraph position="7"> When a goal/procedure is run, an instance of the event concept is created and the parameters are filled with instances of objects and other events. The start and end time and instances of subprocedures are filled in as the procedure runs, providing the event time necessary for the generation of tense. The simulator passes instances of actions to both the generator and graphics component,
 4 As yet we make no theoretical claim to the significance of our choice of which concepts live in the core. This is part of our ongoing research.</Paragraph>
    <Section position="1" start_page="102" end_page="102" type="sub_section">
      <SectionTitle>
7th International Generation Workshop * Kennebunkport, Maine * June 21-24, 1994
</SectionTitle>
      <Paragraph position="0"> which use the type hierarchy to know how to describe the action or how to update the display.</Paragraph>
    </Section>
    <Section position="2" start_page="102" end_page="102" type="sub_section">
      <SectionTitle>
3.2 The Generation Component of SAGE
</SectionTitle>
      <Paragraph position="0"> SPOKESMAN is composed of two major components: the text planner and the linguistic realization component. The text planner selects the information to be communicated explicitly, determines the perspective from which the information should be presented (e.g. whether something is viewed as an event, "Peter waited for a long time", or as an object, "the long wait"), determines the organization of the information, and chooses a mapping for the information onto the linguistic resources that the language provides.</Paragraph>
      <Paragraph position="1"> The linguistic realization component is MUMBLE-86 (McDonald 1984; Meteer, et al. 1987). It carries out the planner's specifications to produce an actual text. It ensures that the text is grammatical and handles all of the syntactic and morphological decision making.</Paragraph>
      <Paragraph position="2"> Both components use multiple levels of representation, beginning with objects from the application program through progressively more linguistic representations to the final text, as shown in Figure 4.</Paragraph>
      <Paragraph position="3">  Each representational level is a complete description of the utterance and provides constraints and context for the further decisions that go into its construction. This is an essential part of planning by progressive refinement, because the representation must constrain the planner so that it is not allowed to make decisions it will later have to retract. The representational levels also control the order of the decisions.</Paragraph>
      <Paragraph position="4"> The Text Structure, which is the central representation level of the text planner, provides a vocabulary for mapping from the terms of the model level to the linguistic terms of the generator. It is at this level that the content lexical items are selected and the semantic category of the constituents is determined. Events and their composition are handled in the style of Pustejovsky (described above). For example, a RUN-TO-LOCATION procedure in the simulator (which has a type of transition) is mapped to the composition of the lexical head "run" (with the agent from the WHO role of the procedure), which is lexically a process, with a goal locative adjunct (e.g. "to the kitchen"), which produces a transition as shown in the Text Structure tree in Figure 5. Constraints on the transition type indicate that only a frame adverbial (e.g. "in two minutes") can be added, and not a duration (e.g. "for two minutes").</Paragraph>
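The composition just described can be sketched in a few lines. The Python fragment below is an illustration under stated assumptions: the table names, function names, and the label "goal-locative" are invented here, and the tables contain only the one composition rule and the two adjunct constraints mentioned in the text, not the planner's actual rule set.

```python
# (type of the phrase so far, adjunct kind) -> resulting event type
COMPOSITION = {("process", "goal-locative"): "transition"}

# Temporal adjunct licensed by each resulting type, as stated in the text.
TEMPORAL_ADJUNCT = {"transition": "frame", "process": "duration"}

def compose(head_type, adjuncts=()):
    """Fold the adjuncts into the lexical head's type, one at a time."""
    result = head_type
    for adjunct in adjuncts:
        result = COMPOSITION.get((result, adjunct), result)
    return result

def licensed_temporal_adjunct(head_type, adjuncts=()):
    """Return the kind of temporal adverbial the composed phrase allows."""
    return TEMPORAL_ADJUNCT[compose(head_type, adjuncts)]
```

With the head "run" (a process) and a goal locative, compose returns a transition and only a frame adverbial is licensed; with the bare head, the phrase stays a process and a duration adverbial is licensed instead.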
      <Paragraph position="6"> The speaker could also choose not to express the entire transition as a kernel unit, but rather pick out only the process portion, as in "Jake ran", in which case the composition would also be of type process, which constrains the temporal adjuncts to be of type duration, rather than frame. (See Meteer, 1992, for a more complete description of the vocabulary of terms in the text structure and its role in the incremental composition of the text plan.) Another role of the Text Structure is to keep track of discourse level information, such as focus and what entities have been referenced and in what context. As Webber (1988) points out, tense can be used anaphorically, just as definite NPs and pronouns can, and the speaker must keep track of the changing temporal focus. It is the combination of the discourse specific information and the event time and speech time as defined by the simulator (footnote 5) that is needed to correctly generate English tense, as described above.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="102" end_page="103" type="metho">
    <SectionTitle>
4. EXAMPLE
</SectionTitle>
    <Paragraph position="0"> In this section, we look at the underlying structures for a narration of a simulation in the SAGE system. We focus on those elements at the interface between the underlying program and the generator. The simulation begins with each of the agents located at a position on the map (Figure 6). Fluffy the dog is assigned a goal of catching a mouse and Jake the mouse is assigned the goal of getting some cheese, which is located in the kitchen. The following simple paragraph, generated by Spokesman, describes each of their goals and actions and is produced incrementally as they are executed by the simulator:
 5 The simulator is the "speaker" in SAGE, since it is the component that has goals to express information and the model defined by the knowledge base is the intensional model of that speaker. The generator defines the possibilities for expression and executes the speaker's goals.</Paragraph>
    <Section position="1" start_page="103" end_page="103" type="sub_section">
      <Paragraph position="0"> Fluffy wants to catch a mouse. He is looking for her.</Paragraph>
      <Paragraph position="1"> The mouse wants to get cheese. She is leaving a mouse-house.</Paragraph>
      <Paragraph position="2"> She is going toward it.</Paragraph>
      <Paragraph position="3"> Fluffy is chasing the mouse. He is going toward her. He caught her.</Paragraph>
      <Paragraph position="4"> The mouse didn't get the cheese.</Paragraph>
      <Paragraph position="5">  As described in Section 2 above, there are several different kinds of information needed to generate event descriptions. Since the underlying program in this case is an ongoing simulation, the linear time is easily available in the system. Figure 7 shows a graph of the events as they are created in the system, marked by their time. Since the generation is a "play-by-play" narration, the event time, reference time, and speaker time are usually the same, as is reflected in the use of the present tense in the text. An exception to this can be seen at the end of the above paragraph. Since the actions underlying these sentences are marked as completed by the simulator, the event time is before the speaker time, and thus the past tense is used. Another kind of information needed for generation is the event type. Note that in SAGE there is not a single notion of "event type", but rather two: one for the underlying knowledge base and the other in Ravel, the text planner. This reflects the difference between:</Paragraph>
      <Paragraph position="6"> * a concept's intrinsic type in the domain, which includes what objects it is related to (e.g. its parents, what slots it has), and how it functions in the underlying program (e.g. what methods it has or inherits), and
 * a concept's "expression type" in the text planner, which reflects the fact that the speaker can alter an object's expression type through lexical choice (e.g. nominalization) and the choice of tense, aspect and adjuncts.</Paragraph>
      <Paragraph position="7"> Portions of these two types of classification hierarchies are shown in Figure 8. They are mediated by the mappings in Ravel, which we describe below. Another kind of information that is represented in the underlying program and used by the generator is the difference between a goal, which represents an agent's intentions, and a procedure, which represents an agent's actions. In the example paragraph, this is reflected by the use of the matrix verb "want to" in the first and third sentences, where the "action" field of the goal event is "start"; by the use of the past tense in the sentence "He caught her", where the action field is "succeed"; and by the past tense and negation in the sentence "The mouse didn't get the cheese", where the action field is "fail". Instances of goals and procedures are shown in Figure 9. Each simulation event object has two parts: (1) the goal or event wrapper, which indicates the goal/procedure status, the relationship of this event to other events (is a super or sub event), and the time stamp; and (2) the action instance, which is an instance of an action type from the domain model with the fields filled in, indicating the various actors and objects acted on and other related information (note that this information is often but not always expressed as verb arguments).</Paragraph>
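The effect of a goal event's action field on the surface sentence can be sketched as a simple dispatch. This Python fragment is an illustration only; it bypasses the Text Structure and MUMBLE-86 entirely, and the function name, the tiny past-tense lexicon, and the string templates are assumptions introduced here.

```python
# Tiny hypothetical lexicon of irregular past forms used in the example text.
PAST = {"catch": "caught", "get": "got"}

def narrate_goal(action_field, agent, verb, theme):
    """Render a goal event according to its action field."""
    if action_field == "start":
        # An intention that has just been adopted: matrix verb "want to".
        return f"{agent} wants to {verb} {theme}."
    if action_field == "succeed":
        # A completed goal: event time precedes speech time, so past tense.
        return f"{agent} {PAST[verb]} {theme}."
    if action_field == "fail":
        # A failed goal: past tense plus negation.
        return f"{agent} didn't {verb} {theme}."
    raise ValueError(f"unknown action field: {action_field}")
```

For example, narrate_goal("start", "Fluffy", "catch", "a mouse") reproduces the first sentence of the generated paragraph, and the "fail" case reproduces its last.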
    </Section>
  </Section>
  <Section position="7" start_page="103" end_page="105" type="metho">
    <SectionTitle>
[Figure residue: simulation event instances, including the goal events CATCH-GOAL 7-2 and GET-FOOD 6-2]
</SectionTitle>
    <Paragraph position="0"> The connection from the underlying program to the text generator is made through the mapping tables. Mapping tables provide an association between a concept in the domain hierarchy and the set of linguistic resources that can be used to express instances of that concept. For example, the mapping tables shown in Figure 10 connect the goal and procedure events shown above to choices in the generator. Note that the mapping is conditional, so the goal event is mapped to a set of alternatives for expressing a state with an activity argument at the level of the Text Structure and to a tree family with the verb "want" when the action field is "start" and just uses the mapping for the action instance in other cases. The procedure event also adds nothing to the mapping, but just uses the mapping for the instance class ("watch-for" in the example above).
 (mapping-tables (find-class 'spr::goal-event)
   class-to-text-structure
   :condition (eq (spr::action self) 'spr::start)
   :realization-class state-to-activity-class
   :arguments (:agent (spr::who (core-event-object self))
 [Figure 10 (caption truncated): ... procedures in the text planner]</Paragraph>
    <Paragraph position="1"> Mapping tables for the action types catch and look-for (footnote 6) are shown below in Figure 11. Each has two mappings, one which offers alternatives at the Text Structure level and a second which offers choices at the linguistic specification level. Specifically, each realization class that is mapped to at the CLASS-TO-TEXT-STRUCTURE-MAPPING offers alternatives of different semantic expression categories (for example expressing the transition "catch" as a process by using the progressive aspect) and the opportunity to leave out optional arguments (even though they are available in the underlying structure, the speaker can choose to leave them out). The argument structure class inspects the choices that have been made in semantic category and arguments and selects the appropriate tree family. The specific elementary tree will not be selected until the level of the surface structure in Mumble-86, when syntactic context is available.
 6 I realize there is a confusion here between "look-for" and "watch-for". "Watch-for" is a child of "look-for" in the hierarchy (see Figure 8), and was probably introduced automatically by the system as the name of a procedure of type "look-for". While confusing, this exemplifies the kind of naming problems that come up in real systems, and since all of these examples are directly from running code, I hesitate to white them out. In fact, it is the relations among the concepts and their fields that distinguish them, not their symbol names, and it is the mappings that determine what lexical items are used to express them (though some mappings use the concept name as a default lexical item when none is specified).</Paragraph>
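The conditional character of the mapping tables can be sketched as a lookup. This Python fragment is an illustration only and is not the Ravel implementation: the dictionary and function names and the two realization-class names ending in "-realization-class" are hypothetical, and the assumption that a concept without its own mapping falls back on its parent's mapping (so that "watch-for" uses the one for "look-for") goes beyond what the text states.

```python
# Fragment of the concept hierarchy: child -> parent.
PARENTS = {"watch-for": "look-for", "look-for": "action", "catch": "action"}

# Concept -> realization class (hypothetical names).
MAPPINGS = {
    "look-for": "look-for-realization-class",
    "catch": "catch-realization-class",
}

def mapping_for(concept):
    """Find the nearest mapping at or above the concept in the hierarchy."""
    while concept is not None:
        if concept in MAPPINGS:
            return MAPPINGS[concept]
        concept = PARENTS.get(concept)
    return None

def realize_event(kind, action_field, action_concept):
    """Return (wrapper realization class, action realization class)."""
    if kind == "goal" and action_field == "start":
        # Only a starting goal adds the state-with-activity wrapper.
        return ("state-to-activity-class", mapping_for(action_concept))
    # Other goal events and all procedure events add nothing of their own.
    return (None, mapping_for(action_concept))
```

Under these assumptions, a starting catch goal is wrapped in the state-to-activity class, whereas a watch-for procedure is realized by the look-for mapping alone.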
    <Paragraph position="2">  The choices described above result in the Text Structure representation, as shown in Figure 12:</Paragraph>
  </Section>
</Paper>