<?xml version="1.0" standalone="yes"?> <Paper uid="J89-4002"> <Title>NATURAL LANGUAGE GENERATION FROM PLANS</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 SYSTEM OVERVIEW 2.1 PLANS AND PLANNERS </SectionTitle> <Paragraph position="0"> For this project, we have adopted a traditional AI view of planning and plans. According to this view, the task of planning to achieve a goal is that of finding a set of (instantaneous) actions which, when performed, will transform the world as it is (the &quot;initial state&quot;) to a new world (the &quot;final state&quot;), which is similar to the present world, but where in addition the goal is true.</Paragraph> <Paragraph position="1"> We assume that the plans produced by our planner are nonlinear (almost standard with AI planners since NOAH; Sacerdoti 1975); that is, they only partially specify the order in which the actions are to be performed. Furthermore we assume that the time constraints involved in a plan can be displayed in an action graph, where an action is represented by a point and a line going rightward from one action to another indicates that the first action must take place before the second (this is true of most, but not all, AI plans--see for instance Tsang 1986). Figure 1 shows an action graph for a nonlinear plan for building a house.</Paragraph> <Paragraph position="2"> We further assume that plans are in general hierarchical. By this we mean that the planner operates in a hierarchical manner (almost standard in AI planners since ABSTRIPS; Sacerdoti 1973), first producing a plan specified at a very abstract level, and then successively refining it until the required level of detail is obtained. At each stage a process of criticism may impose new orderings between actions whose relative ordering seemed to be unconstrained at the previous levels of abstraction.
For us, the history of this hierarchical expansion must be present in the final plan, since we assume no explicit interaction with the planner itself while it is operating. (We shall return in Section 5 to the question of whether the hierarchical plan structure is in fact a sufficient description of the planner's processing.) For concreteness, we have based our system on the output of a single AI planning program, even though there are a number of planning systems that could produce a similar style of output. The input to our natural language generator, then, is the translation into Prolog of the set of data structures created by Tate's (1976) NONLIN planner.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2.2 SYSTEM STRUCTURE AND PARAMETRIZATION </SectionTitle> <Paragraph position="0"> We have set ourselves the goal of generating from a NONLIN plan a single natural language text that explains the actions to be performed and why things have to be done this way. To a large extent the explanatory power of such an account depends on what information is represented in the plan in the first place. Although a system that produces a single monolog from a plan is more restricted than, say, an interactive system that can be asked to explain parts of the plan selectively, a number of possible applications do suggest themselves (for instance, the automatic generation of instruction manuals), and the monolog task does provide us with an excellent way of studying the problems of automatically generating large texts.</Paragraph> <Paragraph position="1"> We have attempted to factor out domain-dependence as much as possible in the generation system by having it rely heavily on knowledge expressed in a declarative fashion.
Given a particular target natural language, a specific lexicon then needs to be provided for the domain in which the plans are to be generated (we have considered cookery, house building, car maintenance, central heating installation, and the &quot;blocks world&quot;). [Computational Linguistics, Volume 15, Number 4, December 1989, p. 234; Chris Mellish and Roger Evans, Natural Language Generation from Plans.] This provides linguistic representations corresponding to the objects, states, and actions that will arise in generated plans. These lexical representations are supplemented by domain-dependent rewrite rules that can be used to reveal hidden additional structure in the planner's representation of the domain. Even with the target natural language fixed and a particular domain given, there are still in general many possible plans from which natural language could potentially be generated (indeed, many man-years of AI research were devoted to developing plans simply in the &quot;blocks world&quot;). Our natural language generation system can be thought of as consisting of four processing stages, centering on the construction and manipulation of an expression of our special message language, as follows:</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Message Planning Message Simplification Compositional Structure Building Linearization and Output </SectionTitle> <Paragraph position="0"> Message Planning is the interface between the generator and the outside world. At this stage, the generator must decide &quot;what to say,&quot; i.e., which objects and relationships in the world are to be expressed in language and in roughly what order. The output of the message planner is an expression in the message language which, following McDonald, we will call the message.
The idea is that message planning may be a relatively simple process and that the resulting message is then &quot;cleaned up&quot; and simplified by localized rewrite operations on the expression (&quot;Message Simplification&quot;). The message is a nonlinguistic object, and the task of structure building is to build a first description (a functional description much as in Functional Grammar; Kay 1979) of a linguistic object that will realize the intended message. We assume here that a &quot;linguistically motivated&quot; intermediate representation of the text is of value (this is argued for, for instance, by McDonald 1983). Our structure builder is purely compositional, and so the amount of information that it can take into account is limited. We treat structure-building as a recursive descent traversal of the message, using rules about how to build linguistic structures that correspond to local patterns in the message. During this, a simple grammatical constraint satisfaction system is used to enforce grammaticality and propagate the consequences of syntactic decisions. The recursive descent terminates when it reaches elements of the message for which there are entries in the system's lexicon.</Paragraph> <Paragraph position="1"> Once a structural description of a text has been produced, it is necessary to produce a linear sequence of words. Our structural descriptions contain only dominance information and no ordering information, and so a separate set of rules is used to produce a linearization.</Paragraph> <Paragraph position="2"> This is akin to the ID/LP distinction used in GPSG (Gazdar et al.
1985).</Paragraph> <Paragraph position="3"> The resulting system is similar to McDonald's (1983) model, in that it is basically a direct production system that utilizes an intermediate syntactic representation.</Paragraph> <Paragraph position="4"> The system is also similar to McDonald's in its emphasis on local processing, although there is no attempt to produce a psychological model in our work. Our constraint satisfaction system is implemented efficiently by unification, however, so that the effects of local decisions can propagate globally without the need for explicit global variables. This is used, for instance, to enforce a simple model of pronominalization (based on that of Dale 1986).</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="2" type="metho"> <SectionTitle> 3 A WORKED EXAMPLE </SectionTitle> <Paragraph position="0"> As an illustration of the various mechanisms contained within the system, we present in this section an example of the system in operation. The example is taken from a demonstration application, showing how the language generator might be attached to an expert system. The scenario is as follows: we have an expert system whose function is to diagnose car-starting faults. The expert system asks questions to which the user can either give an answer or type &quot;how,&quot; meaning &quot;how can I find out the answer?&quot; In this latter case, the expert system invokes a planner to work out what the user has to do, and passes the resultant plan to the language generator, which produces text giving instructions to the user. The expert system then asks its original question again.</Paragraph> <Paragraph position="1"> In our demonstration system, the expert system is in fact just a binary decision tree.
At each internal node there is a yes-no question and a planner goal, to be used if the user responds with &quot;how.&quot; At each leaf node there is a recommendation and a planner goal--here &quot;how&quot; is interpreted as &quot;how do I carry out your recommendation?&quot; To make the demonstration more varied, the system keeps track of what it has already told the user to do, so that, for example, accessing the carburetor jet will be described differently depending on whether the air filter (which is on top of the carburetor) has already been checked.</Paragraph> <Paragraph position="2"> We pick up the example at a point where it has been ascertained that the battery is OK, but that there is no spark on the spark plugs. The next step is to test for a spark at the distributor. The system asks: Is there a spark at the distributor? and we respond with &quot;how.&quot; The NONLIN plan goal associated with the above question is {tested dist_spark} that is, &quot;make a plan to achieve the state in which we have tested the distributor spark.&quot; The planner assumes that we have done nothing already and are standing at the front of the car, looking at the engine.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 THE PLAN </SectionTitle> <Paragraph position="0"> The plan produced by NONLIN for this example case is a totally ordered sequence of six actions as follows:
{act {detached dirt_cover engine}} --- detach the dirt cover from the engine
{act {detached coil_lead dist_cap}} --- detach the coil lead from the distributor cap
{act {located mech cab}} --- go to the cab
{act {started engine}} --- start the engine
{act {located mech frontofcar}} --- go to the front of the car
{act {observed spark coil_lead}} --- observe whether there is a spark on the coil lead
However, although this is the plan at its lowest
level, the plan structure returned by NONLIN also includes the hierarchical expansion history of the plan. The plan started out as just the original goal itself, and was successively expanded to greater levels of detail until the primitive actions given above were obtained. The expansion hierarchy for this plan is shown in Figure 2. As well as this hierarchical structure (and the ordering information not shown in this diagram), NONLIN returns information about preconditions in the plan--where they are needed and where they are established. So, for example, the condition {goal {located mech cab}} is required by node 14 and made true by node 13 (and made false again by node 15).</Paragraph> </Section> <Section position="2" start_page="0" end_page="2" type="sub_section"> <SectionTitle> 3.2 THE TEXT </SectionTitle> <Paragraph position="0"> All this information is extracted from NONLIN's data structures, converted into Prolog clauses, and passed to the language generator. The generator looks for ways to break up and organize the information in the plan to produce comprehensible text. This process is described in more detail in Section 4, but to see what it does in this case, we shall concentrate on just one fragment of the above plan, namely nodes 3, 5, 6, 7, and 8. These nodes represent the expansion of the following NONLIN operator:
actschema tested_4
pattern {act {tested dist_spark}}
conditions unsupervised {goal {accessible dist}} at self
supervised {goal {detached coil_lead dist_cap}} at 4 from 1
supervised {goal {located mech frontofcar}} at 4 from 3
supervised {goal {started engine}} at 4 from er.Ld;
This operator expands the high level action &quot;do something that causes the distributor spark to have been tested&quot; (that is, &quot;test the distributor spark&quot;) into four subgoals (not actions, since if they are already in effect, nothing further needs to be done), the first three of which must precede the last.
Thus the plan here is to ensure that the coil lead is detached from the distributor cap, the engine is started, and the mechanic is at the front of the car, and then to observe whether there is a spark on the coil lead.
[Figure 2: expansion hierarchy for the plan. Node labels:
1: {goal {tested dist_spark}}
2: {goal {accessible dist}}
3: {act {tested dist_spark}}
4: {act {accessible dist}}
5: {goal {detached coil_lead dist_cap}}
6: {goal {started engine}}
7: {goal {located mech frontofcar}}
8: {goal {observed spark coil_lead}}
9: {goal {detached dirt_cover engine}}
10: {goal {located mech cab}}
11: {act {detached dirt_cover engine}}
12: {act {detached coil_lead dist_cap}}
13: {act {located mech cab}}
14: {act {started engine}}
15: {act {located mech frontofcar}}
16: {act {observed spark coil_lead}}]</Paragraph> <Paragraph position="1"> The subplan gives rise to the following piece of text:</Paragraph> </Section> </Section> <Section position="6" start_page="2" end_page="2" type="metho"> <SectionTitle> TESTING THE DISTRIBUTOR SPARK </SectionTitle> <Paragraph position="0"> Testing the distributor spark involves detaching the coil lead from the distributor cap, starting the engine, going to the front of the car and then observing whether the coil lead sparks.</Paragraph> <Paragraph position="1"> If you go to the front of the car now you will not be at the wheel afterwards. However in order to start the engine you must be at it. Therefore before going to the front of the car you must start the engine. If you start the engine now you will not be at the front of the car afterwards. However in order to detach the coil lead from the distributor cap you must be at the front of the car. Therefore before starting it you must detach the coil lead from the distributor cap. Detach the coil lead from the distributor cap. After this start the engine.
After this go to the front of the car. After this observe whether the coil lead sparks. You have now finished testing the distributor spark. Notice that this text is not just a description of the actions as specified by the plan operator. Nor is it the fully detailed plan of everything to be done. It is a description of the plan operator in the context of the current plan, embellished with additional useful information from that context. It includes references to actions at several different levels of abstraction, as well as information about ordering constraints present in the plan but not present in the basic plan operator.</Paragraph> <Section position="1" start_page="2" end_page="2" type="sub_section"> <SectionTitle> 3.3 THE MESSAGE </SectionTitle> <Paragraph position="0"> The first step in the generation of this text is to convert the plan data into an expression in the generator's intermediate message language. The message language and the strategies for carrying out this conversion are discussed more fully in Section 4; here we concentrate on those aspects particularly relevant to this text.</Paragraph> <Paragraph position="1"> The overall strategy applied to our subplan is to construct an embedding: an introduction-body-conclusion structure in which the introduction explains how the action is expanded, the body explains how to execute the expansion, and the conclusion makes the point that by executing the expansion, the higher level action is achieved. This strategy is appropriate because the body of the expansion is relatively simple. For more complex examples, where it is not practical to attempt to describe the expansion merely as an introductory sentence of a paragraph, an alternative strategy would be employed.</Paragraph> <Paragraph position="2"> This strategy decision gives us the general shape of the text, and the components of the embedding are straightforwardly constructed, by reference to the local &quot;shape&quot; of the plan fragment.
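The shape of such an embedding can be sketched roughly as follows. This is a minimal illustration in Python, not the system's Prolog code: the constructor names embed, expansion, do and now are taken from the message language described in this paper, but the argument layout, the helper name build_embedding and the atom done are assumptions made only for this example, and the ordering justifications are left out.

```python
# Hypothetical sketch: message terms are nested tuples (functor, arg1, ...).
# The argument layout below is an assumption, not the paper's definition.

def build_embedding(action, steps):
    """Build an introduction-body-conclusion term for one expanded action."""
    # Introduction: say how the high level action is expanded.
    intro = ("expansion", action, list(steps))
    # Body: one instruction per step, in order of execution.
    body = [("do", step) for step in steps]
    # Conclusion: the higher level action has now been achieved.
    conclusion = ("now", ("done", action))
    return ("embed", intro, body, conclusion)


message = build_embedding(
    "test_dist_spark",
    ["detach_coil_lead", "start_engine", "go_to_front", "observe_spark"],
)
```

The three parts correspond to the opening sentence, the sequence of instructions and the closing sentence of the text in Section 3.2.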
Here the actions are linearly ordered, which suggests presenting them in the order of execution. At the same time the message is embellished with the justifications for the action ordering. In the above plan operator, the first three actions were unordered, but lower level considerations (concerning where the mechanic is at a given time) impose an ordering on them in the actual plan returned. Message elements are added to explain these ordering requirements, and in this case these necessarily appeal to the lower level actions of moving about.</Paragraph> <Paragraph position="3"> The resulting message expression is too large for easy display, so we shall concentrate on a small part of it, the part corresponding to the three sentences: If you go to the front of the car now you will not be at the cab afterwards. However in order to start the engine you must be at it. Therefore before going to the front of the car you must start the engine.</Paragraph> <Paragraph position="4"> The initial message expression for this text is: achieve (goal (located (mech, frontofcar))))) where expressions like goal(located(mech,frontofcar)) and goal(started(engine)) are straight NONLIN expressions, translated literally into Prolog. This expression can be read approximately as &quot;the hypothetical result of going to the front of the car is that you will not be in the cab, and this contrasts with the prerequisite of being in the cab to start the engine. This combination implies you should start the engine before you go to the front of the car.&quot; And of course, that is more or less what the produced text says.</Paragraph> </Section> </Section> <Section position="7" start_page="2" end_page="2" type="metho"> <SectionTitle> 3.4 SIMPLIFYING THE MESSAGE </SectionTitle> <Paragraph position="0"> The above message contains a number of redundancies, which will lead to inelegant text if it is used for generation.
First of all, it contains various occurrences of wait([ ]) (the action of waiting for nothing). These are inserted because in general at certain points of the subplan being explained, one is forced to wait for the conclusion of actions being performed in other subplans; this time, however, there are no such critical actions in other parts of the plan. Second, the NONLIN expressions have been inserted verbatim, without any consideration of whether they could be expressed more elegantly in the message language. Message simplification concerns rewriting the message expression generated by message planning into one that is &quot;simpler&quot; in some sense. This is achieved by applying rewrite rules to components of the whole message. Two kinds of rewrite rules are used: the first kind perform domain-independent structural simplifications to message expressions, and the second domain- or language-dependent alterations. For example, the following two rules together dispose of the redundant wait([ ]) terms of the above message ([ ] denotes the empty action here):</Paragraph> <Paragraph position="2"> These are to be read as rewrite rules, with the expressions on the left of the &quot;--->&quot; being rewritten to the expressions on the right. Variables are denoted by names beginning with capital letters. The operation of these rules is entirely domain-independent. One of the domain-dependent rules rewrites mech (the mechanic) as user, to indicate that the mechanic is the same as the person to whom the instructions are being given.
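As a rough illustration of how rewrite rules of this kind might be applied to a message, the following Python sketch rewrites a term bottom-up until no rule applies. It is not the system's Prolog implementation: the tuple encoding of terms, the function names and the control strategy are assumptions made for the example. The two rules shown are the mech-to-user rule just mentioned and the state-normalizing rule for located given in this section.

```python
# Terms are nested tuples (functor, arg1, ...); atoms are plain strings.
# The encoding and the rule functions are illustrative assumptions.

def rule_mech_to_user(term):
    # Domain-dependent: the mechanic is the person being instructed.
    if term == "mech":
        return "user"
    return None


def rule_located(term):
    # goal(located(X, Y)) is rewritten to state(X, located(Y)).
    if (isinstance(term, tuple) and len(term) == 2 and term[0] == "goal"
            and isinstance(term[1], tuple) and len(term[1]) == 3
            and term[1][0] == "located"):
        _, x, y = term[1]
        return ("state", x, ("located", y))
    return None


RULES = [rule_mech_to_user, rule_located]


def rewrite(term):
    """Rewrite the arguments first, then the term itself, to a fixed point."""
    if isinstance(term, tuple):
        term = (term[0],) + tuple(rewrite(arg) for arg in term[1:])
    for rule in RULES:
        result = rule(term)
        if result is not None:
            return rewrite(result)
    return term


simplified = rewrite(("goal", ("located", "mech", "cab")))
# simplified == ("state", "user", ("located", "cab"))
```

Because each rule inspects only a local pattern, further rules can be added to the list without changing the traversal.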
Another set of rules rewrites states into a form where the affected person or object is explicit (this representation allows the system to collapse together multiple states involving the same person): goal (located (X,Y)) ---> state (X, located (Y)).</Paragraph> <Paragraph position="3"> More substantial examples include the following rules for talking about moving around: achieve (state (user, located (Y))) ---> go_to (user,Y).</Paragraph> <Paragraph position="4"> result (go_to (X,Y),state (user, located (Y))) ---> do (go_to (X,Y)).</Paragraph> <Paragraph position="5"> The first causes a phrase such as &quot;get to be at the front of the car&quot; to be rewritten as &quot;go to the front of the car.&quot; The second removes redundancy in a sentence like &quot;Go to the front of the car and you will be at the front of the car,&quot; rewriting it as simply &quot;Go to the front of the car.&quot; Once all the rewrite rules have been applied, our simplified example message looks like this:
implies (
  contra_seq (
    hypo_result (user, go_to (user, function (front, car)), state (user, not (located (cab)))),
    prereqs (user, start (engine), state (user, located (cab)))),
  neccbefore (user, start (engine), go_to (user, function (front, car))))</Paragraph> </Section> <Section position="8" start_page="2" end_page="2" type="metho"> <SectionTitle> 3.5 COMPOSITIONAL STRUCTURE BUILDING </SectionTitle> <Paragraph position="0"> The next stage is to build a linguistic structure from this message expression. The structure-building component uses an ordered set of rules for mapping from local message patterns to linguistic structures. It is similar to the system described in Chester (1976) in the local nature of its operation, but Chester builds sentences directly, rather than via structural representations.</Paragraph> <Paragraph position="1"> Mann et al. (1982) would call our system a &quot;direct translation&quot; system.
A system built in this fashion has the advantage of a very simple control structure and has the potential of having its principles expressed in modular, independent rules.</Paragraph> <Paragraph position="2"> Our linguistic structural descriptions are similar to the functional descriptions used in Functional Grammar (Kay 1979; Kay 1984). For example, the following is a slightly simplified version of the rule used to realize the hypo_result construct above:</Paragraph> <Paragraph position="4"> In this rule, the left hand side of the ---> is a Prolog pattern to be matched with part of the message (symbols beginning with uppercase letters represent variables, which are to match subexpressions of the message). The right hand side is a functional description, describing the English phrase that is to render that part. In these functional descriptions, expressions preceded by dollar signs represent the places where further information will be contributed by the expansion of subparts of the message.
Thus the &quot;agent&quot; value is obtained by recursively matching the value of the variable Agent (that is, the first argument in the hypo_result term) against the structure rules.</Paragraph> <Paragraph position="5"> This rule is responsible for sentences like: If you start the engine now you will not be at the front of the car afterwards.</Paragraph> <Paragraph position="6"> The rule provides the basic template for the sentence: it is a combination of two sentences using the conjunction &quot;if.&quot; The first sentence is present tense active, with agent specified by the Agent argument and predicate by the Act argument, and has an adverbial modifier &quot;now.&quot; The second sentence is a future tense expression of the State given as argument, with adverbial modifier &quot;afterwards.&quot; The presence of samesentence ensures that the whole is a single sentence and that the two subclauses have the same focus.</Paragraph> <Paragraph position="7"> The recursive structure building process &quot;bottoms out&quot; when a message element is reached for which a linguistic realization appears in the domain-dependent lexicon. Domain states and properties (Section 4.1) are provided with lexical entries that describe how to realize them as VPs.
Such entries could be written as structure building rules in the same format as the above hypo_result rule, but in practice it is convenient to use a more compact notation: lx (accessible, be, [attr: @ accessible]).</Paragraph> <Paragraph position="8"> lx (answer, answer, [obj: @ 'the question']).</Paragraph> <Paragraph position="9"> lx (start (Z), start, [obj: Z]).</Paragraph> <Section position="1" start_page="2" end_page="2" type="sub_section"> <SectionTitle> </SectionTitle> <Paragraph position="0"> These entries indicate how each of accessible (a property), answer, and start(Z) (actions) can be realized in English, by providing a verb and a specification for the complements to follow it. In the first two, the complement phrases are specified directly as constant strings (indicated by the '@' sign). In the last one, the filler of the obj role (the direct object) will be whatever phrase is used to realize the object Z being started. Additional rules provide possible fixed phrases to realize such domain objects: referent (engine, @engine).</Paragraph> <Paragraph position="1"> When a domain object like engine comes to be realized, either the fixed phrase provided (prefixed by 'the') will be used, or a pronoun with the appropriate gender and number will be chosen. It is clearly a limitation of our system that no other possibilities are currently allowed, but in some sense this reflects the fact that plans come with a great deal of information about the actions to be performed but very little information about the objects involved in them.</Paragraph> <Paragraph position="2"> Structure building rules are ordered, so rules for more specific patterns can be placed before rules for less specific ones, without the latter having to explicitly provide negative conditions.
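The effect of rule ordering can be sketched as a first-match dispatch over a rule list. The Python fragment below is only an illustration under stated assumptions: the matcher, the rule table and both example rules are hypothetical, and the right hand sides are plain strings rather than the functional descriptions the system actually builds.

```python
# Patterns and terms are nested tuples; strings starting with an uppercase
# letter are variables. All rules here are hypothetical examples.

def match(pattern, term, bindings=None):
    """Return variable bindings if the pattern matches the term, else None."""
    bindings = dict(bindings or {})
    if isinstance(pattern, str) and pattern[:1].isupper():
        if pattern in bindings and bindings[pattern] != term:
            return None
        bindings[pattern] = term
        return bindings
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for sub_pattern, sub_term in zip(pattern, term):
            bindings = match(sub_pattern, sub_term, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == term else None


RULES = [
    # More specific pattern first.
    (("go_to", "user", "here"), "stay where you are"),
    # Less specific pattern second; it needs no "Place is not here" test.
    (("go_to", "user", "Place"), "go to {Place}"),
]


def realize(term):
    """Apply the first rule whose pattern matches; later rules are not tried."""
    for pattern, template in RULES:
        bindings = match(pattern, term)
        if bindings is not None:
            return template.format(**bindings)
    raise ValueError("no rule matches this term")
```

With this table, realize(("go_to", "user", "cab")) falls through to the second rule, while realize(("go_to", "user", "here")) is caught by the first.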
In addition, rule application is deterministic, in that, once the left hand side of a rule has matched a piece of message and the right hand side structure (minus the parts that require recursive rule matching) has been successfully built, no other rule will ever be tried for that portion of the message.</Paragraph> <Paragraph position="3"> As well as the usual specifications of features and values (for example, conjn and first used above), functional descriptions can also contain specifications of properties, such as s and samesentence, that the relevant construction must have. Some of these properties (such as s) are intrinsic--essentially just features without values. Others are &quot;macros&quot; for bundles of simpler feature-value and property specifications. For example, samesentence is defined as being shorthand for a bundle of feature-value pairs that limit the possibilities for focus movement in and around the structure described. A collection of such macros enables us to implement what is essentially Dale's (1986) model of how discourse structure constrains pronominalization, which was inspired by the work of Grosz (1977) and Sidner (1979). The use of a macro like samesentence (keyed off particular structures in the message) sets up an environment that will allow certain pronominalizations but exclude others. The choice of whether to pronominalize or not is then made on a local basis. It is interesting to compare this scheme to that of McKeown (1982), which also makes focus decisions on a local basis. McKeown's approach, almost the opposite to ours, is to take certain focus priorities as primary and then to attempt to select material in accordance with these. Our approach, which involves considering focus only after the material has already been organized, regards pronominalization more as a last-minute lexical optimization than as something that is planned in advance.
We have considered incorporating some means of focus annotation in the message, but it is not always clear at this level what the focus should be. We have thus preferred to allow message planning simply to place general constraints on focus movement.</Paragraph> <Paragraph position="4"> As the message is traversed by the structure building rules, more and more information accumulates about the output functional description and its components.</Paragraph> <Paragraph position="5"> As is usual in unification grammar, in the written form structural descriptions are sideways open, that is, an object satisfying the description is required to have the features listed, but may have any other features in addition. Thus our structure building rules only provide the framework of the final functional description. The rest is filled in by a simple grammatical constraint satisfaction system. This enforces grammaticality and handles the propagation of feature values that are shared between different phrases (for instance, number agreement). The constraint satisfaction system is based on the use of a declarative &quot;grammar specification&quot; of the types of legal descriptions and the constraints they must satisfy. 
This specification is compiled into a representation that essentially treats every property and feature as a macro for a bundle of conclusions that follow from its being involved in a description.</Paragraph> </Section> </Section> <Section position="9" start_page="2" end_page="239" type="metho"> <SectionTitle> 3.6 CONSTITUENT ORDERING </SectionTitle> <Paragraph position="0"> The final task once the linguistic structure has been built is to determine the order in which the constituents are to be produced, and to locate the actual words to be used.</Paragraph> <Paragraph position="1"> Substructure ordering is determined by ordering rules.</Paragraph> <Paragraph position="2"> The ordering rules are applied to a structural description in much the same way that structure-building rules are applied to the message; that is, recursively and compositionally. The left hand side of an ordering rule is a pattern that is matched against the structural description. The right hand side of the first rule whose pattern matches is then taken as a template determining which parts of the description are to be realized as phrases and in what order. For example, here is a rule for VP ordering: [vp, mainverb = V, adv = A, compls = C] ---> [V,C,A].</Paragraph> <Paragraph position="3"> This rule ensures that a verb is realized before its complements, which are realized before any adverbial modifiers, producing VPs like: go to the front of the car now Each application of an ordering rule returns an ordered list of functional descriptions. These are then recursively subjected to ordering rules, to determine their relevant subphrases and the order these should be realized in. The recursion &quot;bottoms out&quot; when a functional description of type word is reached. The end result is a list of word descriptions each containing features detailing aspects of the morphology.
These are passed to the morphology component (currently expressed as raw Prolog code), which will then output the appropriately inflected word.</Paragraph> </Section> <Section position="10" start_page="239" end_page="240" type="metho"> <SectionTitle> 4 PLANS AND MESSAGES 4.1 THE MESSAGE LANGUAGE </SectionTitle> <Paragraph position="0"> In some ways, a natural language generation system is like an optimizing compiler. Producing some sort of natural language from a symbolic input is not a task of great difficulty, but producing text that is smooth and readable is a challenge (and in general quite beyond the state of the art). With both tasks one has the option of planning the text and simplifying its form either in a single pass or in multiple passes. In language generation, McDonald's MUMBLE (McDonald 1983) produces and simplifies linguistic structures within a single pass of the input. Although the modeling of human language production may require a theory of this kind in the end, the result is a system where it can be hard to separate out the different principles of structure building and simplification, because these have all been conflated for reasons of efficiency. We have thus opted for a multi-pass system. Multi-pass optimizing compilers need to have specialized internal languages (virtual machine codes) more abstract than the output machine codes and in terms of which optimizations can be stated. The analog in a natural language generation system would be a message language that could express at a level more abstract than linguistic structure the goals and intended content of a piece of language to be generated. We can see elements of such a language in the &quot;realization specifications&quot; of McDonald and Conklin (1982) and in the &quot;protosentences&quot; of Mann and Moore (1981).
A crucial part of our own system is the use of a message language specialized for the explanation of plans.</Paragraph> <Paragraph position="1"> Our message language is a language specifically devised for expressing the objects that arise in plans and the kinds of things one might wish to say about them.</Paragraph> <Paragraph position="2"> The main types of statements (&quot;utterances&quot;) that can be made at present as part of our generated text are shown in Figure 3. These &quot;utterances&quot; mention actions and states, which could be domain actions and states (as appearing in the plan) or complex actions and states, formed according to the rules in Figure 4. The message language provides for the description of actions being carried out involving different agents and objects (both represented as &quot;objects&quot;; see Figure 5), although NONLIN provides no indication about who is responsible for any given part of a plan. Thus the agent of an action defaults to user for an action that is properly part of the current subplan and someone for an action that has been included in the description but is properly part of another subplan. In this way, each part of the plan is explained from the point of view of the person executing it, with no assumption that the same person will be executing other parts of the plan.</Paragraph> <Paragraph position="4">
--- one action must take place before another;
do(ACTION) --- instruction to perform an action;
result(ACTION,STATE) --- as 'do', but also mentioning an effect of the action;
hypo_result(OBJECT,ACTION,STATE) --- if the agent carried out the action, the state would hold;
expansion(ACTION,ACTION) --- describing the expansion of an action into subactions;
prereqs(OBJECT,ACTION,STATE) --- describing the prerequisites of an action, with the assumption that a given agent will perform it;
needed(OBJECT,ACTION,STATE) --- describing the reason why a STATE is needed, so that OBJECT can perform ACTION;
causes(STATE,STATE) --- once the first state holds, so does the second;
now(STATE) --- indicating that some state now holds.
Figure 3 Types of Basic Utterance.</Paragraph> <Paragraph position="5"> A message consists of a number of &quot;utterances&quot; linked together by various organizational devices. These indicate various kinds of sequencing and embedding (Figure 6). Most are simply ways to string together two &quot;utterances,&quot; with an appropriate conjunction being suggested, according to what kind of link there is between the two. The embed construction is used to indicate a discourse segment that has an introductory section, a body, and a concluding section. Hence it has three parts. The idea is that the explicit marking of such structures in the message language will enable linguistic decisions (for instance concerning pronominalization) to be made more intelligently. In general, the domain-dependent lexicon need only supply a single linguistic representation for the simplest form of a domain action or property.
The linguistic realizations of the more complex forms allowed by the message language are then dealt with automatically by the system (Figure 7).</Paragraph> </Section> <Section position="11" start_page="240" end_page="240" type="metho"> <SectionTitle> 4.2 FROM PLAN TO MESSAGE </SectionTitle> <Paragraph position="0"> A plan with 30 or so actions contains a great deal of material, spelling out the necessary partial ordering between the actions and their preconditions and effects.</Paragraph> <Paragraph position="1"> A crucial task in message planning is cutting this material down into pieces small enough to be rendered as independent pieces of text. In a domain-independent system for plan explanation, the only structure that such a &quot;chunking&quot; can make use of is the abstraction hierarchy and the local &quot;shape&quot; of the action graph. Even this is unfortunately limited by the fact that the abstraction hierarchy may represent a view of the domain that is convenient to the plan generator, but not to the plan executor.</Paragraph> <Paragraph position="2"> The abstraction hierarchy tells us how certain actions at a particular level of the plan arise from the expansion of a single action at a more abstract level. Such a group of actions is an obvious candidate for explaining as a single chunk. Thus our basic strategy is first to talk about the plan at the most abstract level, then discuss the expansion of the actions at that level, then discuss the expansion of the actions involved in that, and so on.
In general, then, at any time we are concerned with 1) selecting out the portion of the plan that corresponds to the expansion of a single abstract action and 2) describing this, given that whole subsets of the actions in it are to be treated as single actions.</Paragraph> <Paragraph position="3"> The first of these is not trivial because, as the result of successive criticisms, the set of actions in the expansion of a more abstract action may no longer be a simple connected piece of the plan. As an example of this, Figure 8 is the action graph for a house-building plan, with the actions that are in the expansion of &quot;installing the services&quot; blocked out.</Paragraph> <Paragraph position="4"> To describe this set of actions and their timing in the plan, it is necessary to describe other actions whose timing is closely coupled to them. The actions to be included in the explanation of the expansion are obtained by a &quot;closure&quot; operation--a process of tracing through all possible paths going forward in time between actions in the expansion. Any other actions encountered on these paths are deemed necessary to be included in the description. We call these actions &quot;intruders.&quot; Thus the actions described form the minimal convex graph that includes the desired actions (Figure 9).</Paragraph> <Paragraph position="5"> Once the lowest-level plan actions corresponding to a single abstract action have been isolated, the &quot;shape&quot; of this part of the plan at the current level of abstraction needs to be determined. The current point in the abstraction hierarchy specifies the set of actions that can be mentioned in this part of the text. If one of these is an abstract action, in general there will be a whole class of lowest-level actions that need to be described simply as parts of this action, whose internal structure will be described later. 
The lowest-level actions are thus grouped into subsets, and what is to be explained are relationships between these subsets, rather than relationships between primitive actions. Technically, the &quot;chunking&quot; imposed by the current layer of the abstraction hierarchy defines an equivalence relation, and we are interested in the quotient plan with respect to this relation.</Paragraph> <Paragraph position="6"> do(examine(filter_bolts)). do(complete(examine(filter_bolts))).</Paragraph> <Paragraph position="7"> do(delegate(someone,examine(filter_bolts))).</Paragraph> <Paragraph position="9"> Examine the filter bolts.</Paragraph> <Paragraph position="10"> Finish examining the filter bolts. Have someone examine the filter bolts. You can now examine the filter bolts. The filter bolts have now been examined. You are now examining the filter bolts. The plug leads are now in position.</Paragraph> <Paragraph position="11"> Get the plug leads to be in position.</Paragraph> <Paragraph position="12"> Wait until the plug leads are in position. Once the plug leads are in position you can examine the filter bolts.</Paragraph> <Paragraph position="13"> We can define the usual plan relationships between the relevant subsets of actions in a natural way. For instance, we say that one subset comes before another if and only if each element of the first comes before each element of the second. Because such a demanding criterion will apply quite rarely, in general there will be a great deal of parallelism in subplans whose actions are not at the most detailed level. Once a piece of the whole plan has been extracted and its &quot;shape&quot; (relative to some given equivalence relation) established, rhetorical strategies are applied to decide how particular parts are to be presented.
The message created depends directly on the structure of the justified plan. Thus, for instance, the expansion of a complex action gives rise to a section of text represented by a message of the form: where Action is the action described, Intro is an introductory message, which describes the prerequisites of the main action and the set of actions in its expansion (unless there are too many of them), and Body describes the action graph expanding the main action. The main strategies for describing action graphs are the lump strategy, forwards description, and backwards description (Figure 10). The lump strategy applies if a piece of the action graph is a self-contained &quot;lump&quot; between two actions A and B, with no links between any actions inside the &quot;lump&quot; and any actions outside. [Figure 8: Distribution of Expression of an Abstract Action.] If the subgraph between the actions is sufficiently complex (has more structure than two simple actions in parallel), the strategy suggests that its explanation should be postponed to a separate &quot;section.&quot; Meanwhile the whole subgraph is incorporated into the current explanation as if it were a simple action (this is, of course, the same strategy that is applied for an action that is above the primitive level in the abstraction hierarchy). Forwards description is deemed appropriate when the action graph is a right-branching structure; in this case the actions are generally dealt with in time order, giving a message of one of the forms:</Paragraph> <Paragraph position="15"> where Act is the first action, State a state that it makes true, and &quot;...&quot; is the message derived from the subsequent actions. When the action graph is a left-branching structure, however, the strategy of backwards description is suggested.
This gives rise to messages of the form:</Paragraph> <Paragraph position="17"/> </Section> <Section position="12" start_page="240" end_page="244" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> parallel (Act, achieve (State))))) where Act is the first action, with preconditions Pres and effects State, and &quot;...&quot; is the message derived from the subsequent actions. [Figure 10: Rhetorical Strategies. Sample text: &quot;Before you can do A you must do B and C.&quot;]</Paragraph> <Paragraph position="1"> All of these kinds of messages require the insertion of preconditions and effects of actions. It is necessary for the system to compute those preconditions and effects that are actually relevant for the current plan, rather than simply the total sets of preconditions and effects. This amounts to determining the justifications for the action ordering chosen. The justification for action A coming before action B can be of one of two types. Either A is needed to create a state where a precondition of B is true, or A comes before B because otherwise B would create a state where a precondition of A was not true. The two different possibilities give rise to different modes of presentation, but if the justification is redundant or not available from the plan, it is simply omitted.</Paragraph> <Section position="1" start_page="240" end_page="244" type="sub_section"> <SectionTitle> 4.3 IMPROVING THE MESSAGE </SectionTitle> <Paragraph position="0"> As the last section suggests, the initial version of the message is put together in a very direct way from the structure of the plan. As a result, it is often unnecessarily cumbersome. Message simplification concerns rewriting the message expression generated by message planning into one that is &quot;simpler&quot; in some sense.
Since the amount of material we wish to deal with could be large, we have avoided considering expensive global simplification techniques in favor of emphasizing local simplification techniques analogous to &quot;peephole&quot; optimizing techniques in compiling. Of course, a crucial difference between language generation and compilation is that in the former there is no clear notion of what &quot;optimality&quot; is. In the absence of a formal and detailed psychological theory of discourse comprehension, researchers in natural language generation are reduced more or less to using their intuitions about whether one way of phrasing something is &quot;easier to understand&quot; than another. We have regretfully had to follow the same course in designing and evaluating our own system. The domain-independent simplification rules used by our message simplification system are treated equally, but conceptually they seem to be of four main types.</Paragraph> <Paragraph position="1"> Members of the first type tidy up special cases that could as easily be detected when the expression is constructed. Here is an example of such a rule ([ ] denotes the empty utterance): (1) neutral_seq (X, [ ]) --&gt; X.</Paragraph> <Paragraph position="3"> Thus any utterance expression of type neutral_seq will be rewritten by this rule if its second component is empty. Such an expression is rewritten simply to its first component. Incorporating such rules into the simplification stage means that the message-planning component can be simpler and more perspicuous.</Paragraph> <Paragraph position="4"> The second kind of rule expresses knowledge about planning and plan execution.
Here are two such rules: (2) achieve (state (user, done (Act))) --&gt; Act.</Paragraph> <Paragraph position="5"> (3) parallel (X, wait (Y)) --&gt; then (X, wait (Y)).</Paragraph> <Paragraph position="6"> Rule (2) expresses the fact that the only way to create a state where you have done an action is to do the action.</Paragraph> <Paragraph position="7"> Rule (3) expresses the fact that waiting is an action that is always postponed until there is nothing else to do.</Paragraph> <Paragraph position="8"> Both of these principles are useful in finding the best way to express a given action.</Paragraph> <Paragraph position="9"> A third kind of rule really reflects the linguistic coverage of the system in an indirect manner. If there is a special way available for saying a particular kind of thing, then that should be preferred to using a more general technique. Here is such a rule: (4) prereqs (user, X, state (user, done (Y))) --&gt; neccbefore (user, Y, X).</Paragraph> <Paragraph position="10"> This rule is about a special case of the prereqs structure arising in the message. When one is calling attention to the prerequisite(s) of an action X, a special case arises when the only prerequisite is the achievement of another action Y. In this case, the prerequisites statement amounts to saying simply that Y must happen before X.</Paragraph> <Paragraph position="11"> In general, one would expect that expressing the statement in this second way would result in a simpler piece of text than using a general-purpose strategy for expressing prereqs statements. It is arguable that such rules should really exist as special-case structure-building rules. Such an approach would, however, preclude the use of simplification rules that made further use of the output of such rules.</Paragraph> <Paragraph position="12"> Finally, there are rules that are motivated by notions of simplicity of structure.
For instance, the rule: (5) time_parallel (do (X), do (Y)) --&gt; do (parallel (X, Y)).</Paragraph> <Paragraph position="13"> results in an expression with one fewer &quot;connective.&quot; Such rules should really be backed up by a (perhaps psychological) theory of the complexity of messages.</Paragraph> <Paragraph position="14"> Here is an example of how a message language expression can be simplified using these rules.</Paragraph> <Paragraph position="15"> neutral_seq (prereqs (user, achieve (state (user, done (a1))), state (user, done (parallel (a2, wait (s))))), [ ]) is simplified by rule (1) to: prereqs (user, achieve (state (user, done (a1))), state (user, done (parallel (a2, wait (s))))) which is simplified by rule (2) to: prereqs (user, a1, state (user, done (parallel (a2, wait (s))))) which is simplified by rule (3) to: prereqs (user, a1, state (user, done (then (a2, wait (s))))) which is simplified by rule (4) to: neccbefore (user, then (a2, wait (s)), a1). Here the simplification would result in the difference between a text like: In order to get you to have washed the baby you must have undressed the baby and waited until the bath is full.</Paragraph> <Paragraph position="16"> and one like the following: You must undress the baby and then wait until the bath is full before you can wash the baby. The rewrite rules we have discussed so far in this section are independent of the domain in which the plan is made. Our system also allows for domain-dependent rules to be provided for a given planning domain. This provides a way of automatically rewriting every occurrence of a given expression coming from the planner into another given expression. One purpose of this kind of rule is to provide a translation for states, which may be primitive objects to the planner but are required to be somewhat more complex by the generator.
For example, in the car domain, there is a rule that rewrites the planner primitive positioned (X) to be the complex term state (X, positioned). Domain-dependent rewrite rules can also be used to show correspondences between action and state names that seem independent but are in fact strongly connected. For instance, in the house-building plan, there is an action lay_basement_floor and a domain state basement_floor_laid (not a legal message state). Not surprisingly, the second is an effect of the first and can only come about by the first having been done. Given that we can deal with complex states and actions, we would do well to replace the second by a formula involving the first, in fact state (user, done (lay_basement_floor)). In this way we can simplify certain expressions in the message. For instance, the expression: do (achieve (basement_floor_laid)) is equivalent to: do (achieve (state (user, done (lay_basement_floor)))) which simplifies to: do (lay_basement_floor) by simplification rule (2) above. Given this domain-dependent rule and the simplifications thus enabled, the expression do (achieve (basement_floor_laid)) would be realized as something like &quot;lay the basement floor,&quot; rather than &quot;get the basement floor to be laid,&quot; which would arise from a more straightforward encoding of the state basement_floor_laid in terms of verbs and cases. Domain-dependent rewrite rules allow us, in principle, to enrich indefinitely the semantics of actions and states represented in the plan. They thus provide one way of compensating for the shallowness of the planner's representation. The basic framework on which the plan actions and states hang is, however, fixed by the planner and cannot be changed by the generator. Thus not all deficiencies of the planner can be rectified by this method. The extensive use of domain-dependent rewrite rules is in any case unattractive, as it takes away from the domain-independence of the system.
We will return to this topic later.</Paragraph> </Section> <Section position="2" start_page="244" end_page="244" type="sub_section"> <SectionTitle> 4.4 KNOWLEDGE SOURCES IN MESSAGE CONSTRUCTION </SectionTitle> <Paragraph position="0"> Before we leave our discussion of how messages are constructed, it is useful to summarize the different knowledge sources that have an effect on the text generated from a plan. The gross organization of the message is determined by rhetorical strategies that look for patterns in the plan structure. Such strategies are specific only to the kind of plan that we are taking as input (i.e., hierarchical, nonlinear plans). Message simplification is usually responsible for the finer-grain structure of the message, as its rewrite rules operate strictly locally in the message. The domain-independent rewrite rules exploit the redundancy in the message language and express heuristics about how a given proposition might be expressed most simply. Such rules embody simple knowledge about planning and the facilities of the message language. Finally, domain-dependent rewrite rules enable some of the hidden structure in the planner's representation to be revealed.</Paragraph> <Paragraph position="1"> Once a final message has been decided on, its realization as text makes use of structure-building rules that depend on the natural language being used. At this point most of the significant decisions have already been made. [Figure 11: Installing the Services.]</Paragraph> <Paragraph position="2"> The structure-building rules are able to make a limited choice among possible syntactic structures and are able to introduce pronominalization where it seems appropriate, but their scope is heavily constrained by the message.
During structure building, a domain-dependent lexicon makes available a verb entry for each domain state and action, as well as a fixed NP that can be used to denote each domain object. Although it is useful to assess the effectiveness of the system by considering the text output, many of the more interesting problems with the system are really already visible at the message stage.</Paragraph> </Section> </Section> <Section position="13" start_page="244" end_page="244" type="metho"> <SectionTitle> 5. DISCUSSION 5.1 FURTHER EXAMPLES </SectionTitle> <Paragraph position="0"> The system has been tested using a number of different domains with rather different characteristics, and the results have been correspondingly varied. One domain that seems to work fairly well is that of cookery recipes such as the following:</Paragraph> </Section> <Section position="14" start_page="244" end_page="244" type="metho"> <SectionTitle> MAKING PAPRIKA POTATOES AND SMOKED SAUSAGES </SectionTitle> <Paragraph position="0"> Melt the fat, fry the onion in it, add the flour to it and add the paprika to it. After this, stir the sauce until it boils. Meanwhile peel the potatoes and cut them into pieces. After this, add them to the sauce, cover the pan and make the sauce boil, stirring the sauce occasionally. Meanwhile cook the sausages. After this, add them to the sauce.</Paragraph> <Paragraph position="1"> This text was actually produced from a &quot;mockup&quot; of plausible planner output, rather than a real plan, and did not include enough information (about preconditions, effects, etc.) to warrant the system adding justifications about ordering. 
This does not seem to matter too much, probably because cookery recipes are traditionally presented as instructions to be followed more or less blindly.</Paragraph> <Paragraph position="2"> For an example where our techniques produce a less pleasing result, consider the &quot;installing the services&quot; extract from the house-building plan (discussed above) shown in Figure 11. In this action graph (which shows no preconditions or effects), we have indicated the actions with abbreviated names. Those actions in lowercase are not actually part of installing the services (but are &quot;intruder&quot; actions that are nevertheless crucial to this part of the plan); they will be described elsewhere in the text. Here is the English produced for this plan fragment:</Paragraph> </Section> <Section position="15" start_page="244" end_page="244" type="metho"> <SectionTitle> INSTALLING THE SERVICES </SectionTitle> <Paragraph position="0"> Installing the services involves finishing the electrical work and laying the storm drains.</Paragraph> <Paragraph position="1"> You must paint the house before finishing the electrical work.</Paragraph> <Paragraph position="2"> In order to paint the house you must have installed the finished plumbing and installed the kitchen equipment. You must lay the finished flooring before installing the finished plumbing and installing the kitchen equipment. You must fasten the plaster and plaster board before laying the finished flooring. 
In order to fasten the plaster and plaster board you must have installed the air conditioning and installed the rough plumbing and installed the rough wiring.</Paragraph> <Paragraph position="3"> Install the drains and then install the air conditioning, installing the rough plumbing.</Paragraph> <Paragraph position="4"> Meanwhile install the rough wiring. You can now fasten the plaster and plaster board.</Paragraph> <Paragraph position="5"> You can now lay the finished flooring.</Paragraph> <Paragraph position="6"> You can now install the finished plumbing and install the kitchen equipment.</Paragraph> <Paragraph position="7"> You can now paint the house.</Paragraph> <Paragraph position="8"> You can now finish the electrical work.</Paragraph> <Paragraph position="9"> Meanwhile lay the storm drains.</Paragraph> <Paragraph position="10"> You have now finished installing the services.</Paragraph> <Paragraph position="11"> This account is basically comprehensible, but is repetitive and quite hard to follow. One reason for the repetition is that the subject matter is really very boring and uninformative, and it would be quite a challenging task for a human being to produce interesting and readable text from the same information. We discuss below some other reasons why this text is less than optimal.</Paragraph> <Section position="1" start_page="244" end_page="244" type="sub_section"> <SectionTitle> 5.2 DEFICIENCIES IN PLANS </SectionTitle> <Paragraph position="0"> Although generating explanations from the output of an AI planner appears to be a promising application of natural language generation research, there are a number of special problems that we have encountered with this task.
Indeed, we can explain some of the deficiencies in the text we have been able to generate purely in terms of deficiencies of the planner and/or its plans.</Paragraph> <Paragraph position="1"> Some problems stem from the use of plan operators not designed with text generation in mind, and can be solved within the scope of the planning system. More serious are problems that arise because of deficiencies in the planner methodology itself. In the development of our system we have encountered a number of these, ranging from trivial to quite fundamental. Some of these are properties of NONLIN in particular; others apply more generally to most AI planning systems. It is not appropriate to discuss the full details of these problems here, but we shall mention some of the main points.</Paragraph> </Section> </Section> <Section position="16" start_page="244" end_page="246" type="metho"> <SectionTitle> GRANULARITY </SectionTitle> <Paragraph position="0"> One might ask why, unlike in the cookery recipe, there is no pronominalization in the text for installing the services. The coherence of the text would be improved considerably by the judicious use of pronouns. Unfortunately, whereas in the cookery domain (which we encoded by hand) a particular action of 'frying' is treated as an instance of a general action that can potentially be applied to different objects, in the house-building domain the objects acted on by an action are fundamentally built into that action.
The difference can be seen from example lexical entries from the two domains: lx (fry (Food, In), fry, [obj: Food, in: In]).</Paragraph> <Paragraph position="1"> lx (install_rough_wiring, install, [obj : @ 'the rough wiring']).</Paragraph> <Paragraph position="2"> To use pronominalization, one needs to be able to determine that the same domain object is being mentioned several times, but only the first type of representation here actually supports the representation of domain objects. What has happened here is that, from the point of view of making a house-building plan, the planner cannot make use of properties of a general action like 'install,' and so its representation of actions is at a coarser level of granularity than that required to produce good text.</Paragraph> <Paragraph position="3"> In common with most plans in traditional AI work, NONLIN plans encode only very weak information about causality and temporal relationships. For instance, when there is an action that achieves an effect, there is no way to tell from the plan whether we are dealing with an instantaneous action, an extended action that terminates as soon as the effect is achieved, or an extended action where the effect is achieved sometime during the execution. A natural language like English provides ways of distinguishing between these cases: Turn on the switch and the light will be on.</Paragraph> <Paragraph position="4"> Pour in the water until the bucket is full.</Paragraph> <Paragraph position="5"> Prepare a chicken curry so that the chicken scraps are used up.</Paragraph> <Paragraph position="6"> Because there is no way to distinguish between these in the NONLIN representation of effects, our generator is forced to try to find a neutral way to express all of them. As a result, there is a homogeneity in the text that is not necessarily reflected in the actual plan execution.
Again the problem can be thought of as a mismatch between the granularity of the representation used for planning and that needed to exploit the facilities of the natural language.</Paragraph> <Paragraph position="7"> The effect of the granularity problem can be lessened by allowing the plan generator to provide deeper information about the internal structure of actions and states through domain-dependent rewrite rules. Our message language allows us to talk about repeated actions, for instance, and so we can specify that certain domain actions are really shorthand for more complex expressions: fill_bucket --&gt; repeat (pour (water, bucket), state (bucket, full)).</Paragraph> <Paragraph position="8"> Messages containing these complex actions can then be simplified by domain-independent rules like: result (repeat (Act, State), State) --&gt; do (repeat (Act, State)).</Paragraph> <Paragraph position="9"> Similarly we can use domain-dependent rewrite rules to introduce tokens standing for domain objects and hence give us a basis for pronominalization. The more one relies on domain-dependent rewrite rules for good text, however, the less one can claim to have a domain-independent basis for generating text from plans.</Paragraph> </Section> <Section position="17" start_page="246" end_page="246" type="metho"> <SectionTitle> CONCEPTUAL FRAMEWORK </SectionTitle> <Paragraph position="0"> Domain-dependent rewrite rules can be regarded as a way of embellishing the planner's output to match up better with the requirements for generation. The basic framework of the plan is, however, something that cannot be changed unless the generator itself is to start doing some of the planning. Assuming that there is some point in distinguishing the planner from the generator, the generator is therefore sometimes faced with a mismatch between self-evident concepts in the planner's conceptual framework and those concepts that can be expressed simply in natural language.
Consider, for instance, the notion of (primitive) actions that are unordered in the plan. If two actions have no ordering relation between them, then this indicates that the actions can be performed in any order relative to one another. To express the plan's semantics correctly, one should therefore make use of expressions like: Install the drains and install the rough wiring, in any order.</Paragraph> <Paragraph position="1"> In practice, however, we have chosen to map such a piece of plan into a message like: do (parallel (install_drains, install_rough_wiring)).</Paragraph> <Paragraph position="2"> which then gives rise to a text such as: Install the drains. Meanwhile install the rough wiring. Treating unordered actions as parallel actions may indeed both produce good text and even capture the reality of plan execution, as in: Make the sauce boil, stirring the sauce occasionally.</Paragraph> <Paragraph position="3"> but this will only be so if at least one of the actions takes place over a period of time and the actions can be and are recommended to be executed concurrently. There is, of course, no way to determine from the planner's representation whether this is so. Indeed, since the planner regards all primitive actions as essentially instantaneous, in all cases it is in some sense incorrect to express the planner's recommendations in this way. If the correct execution of the plan were critical, for instance, then it could be very dangerous to hide the limited way in which the planner views the world as we have done. It might thus be suggested that a generator working from plans could and should always strive to convey the plan semantics accurately, even if this involves long-winded and unnatural prose.
But in the end one is faced with an incompatibility between the planner's conceptual framework and the limits of what our language can express. For instance, it is unclear how the action of &quot;installing the rough wiring&quot; can be expressed in English in such a way that the action can only be interpreted as an instantaneous action, which is the way the planner sees it.</Paragraph> </Section> <Section position="18" start_page="246" end_page="246" type="metho"> <SectionTitle> EXPLANATORY POWER </SectionTitle> <Paragraph position="0"> The texts that we have generated from plans are intended to do more than simply tell the reader how to execute a series of actions. We always hoped that the justification structure built by the planner would also help us to explain why the given actions, with the ordering described, are the right ones to achieve the plan's goal. In practice, however, our texts have failed to be explanatory for a number of reasons. One problem is that, unlike instructions generated by human beings, our texts only tell you what to do, and not what not to do. It is often just as important for a person to be warned about the unpleasant consequences of doing the wrong thing as it is to be told what the right thing is.</Paragraph> <Paragraph position="1"> Unfortunately, the notion of &quot;plan&quot; we have adopted only makes reference to the successful actions, even though the plan generator may have spent a lot of time exploring other possibilities that did not work out. It might therefore be appropriate, in future work, to consider natural language generation based on the trace of a planning system, rather than on the final result.</Paragraph> <Paragraph position="2"> Similarly, in many of the texts produced by our system the reader is told what to do but is given no illumination as to why things have to be done in this way. 
Unfortunately, although in principle every plan is justified by earlier actions achieving the preconditions of later actions, many plans do not contain this information in a useful form--in the house-building plan, for instance, the only preconditions that are required for an action in this plan to be performed are the successful completion of previous actions. That is, the person who has encoded the operators in terms of which the plan is constructed has &quot;compiled in&quot; certain ordering constraints without using the language of preconditions and effects effectively to explain them. One is reminded here of the problems that Swartout (1983) encountered in producing explanations from expert systems. The problem was that just because a set of rules was sufficient to produce expert behavior did not mean that those rules contained anything illuminating to put into explanations. Similarly in the planning area, there is no reason why a set of operators that are effective for producing useful plans need contain anything very interesting that can be put into a natural language account. Unfortunately, one cannot necessarily expect machine-generated plans to come at the right level of detail to be really useful to a human being. For instance, a house-building plan that enabled one to see why the rough plumbing must be installed after the drains (presumably because otherwise it is hard to make the pipes line up) would be very large, and it would be well beyond the state of the art for such a plan to be produced automatically. 
Moreover, such a plan would undoubtedly contain a lot of information that was blindingly obvious to a human reader and hence of no interest whatsoever.</Paragraph> </Section> <Section position="19" start_page="246" end_page="246" type="metho"> <SectionTitle> ARBITRARY PLANNER RESTRICTIONS </SectionTitle> <Paragraph position="0"> As is typical with application programs, most planners have particular features that represent non-standard or novel approaches to certain situations. This fact means that any natural language generator using plans as input must customize itself somewhat to the peculiarities of the particular planner it is working with. One problem peculiar to NONLIN's representation language concerns the manner in which preconditions are specified.</Paragraph> <Paragraph position="1"> In NONLIN an operator specifies how a goal is expanded to a network of subgoals. As was observed in the earliest AI planners, the most common case is that a goal has preconditions, that is, goals that must be true before the given goal can be achieved. In NONLIN, one has to use an expansion to represent this, with the consequence that one wants the original goal itself to occur in the expansion (that is, Goal expands to Pre_1, ..., Pre_n, Goal). NONLIN will not allow this, so there have to be two distinct representations of the goal. So in the car example, we have {goal ...} for the high-level goal and {act ...} for the low-level version (although this scheme might not work in every domain). By this means we can make NONLIN behave, but give ourselves a linguistic problem--every action occurs twice. 
For example, we might get:</Paragraph> </Section> <Section position="20" start_page="246" end_page="246" type="metho"> <SectionTitle> STARTING THE ENGINE </SectionTitle> <Paragraph position="0"> Go to the cab and then start the engine and you will have finished starting it.</Paragraph> <Paragraph position="1"> Roughly speaking, the distinction is between 'starting the engine' (the whole task) and 'actually starting the engine' (the specific operation). To some extent we can avoid the problem by using different phrases (e.g., 'turn on the engine'), but this does not make the generation task easier.</Paragraph> </Section> <Section position="21" start_page="246" end_page="246" type="metho"> <SectionTitle> 5.3 DEFICIENCIES IN OUR APPROACH </SectionTitle> <Paragraph position="0"> The problems with our natural language accounts are, of course, not entirely due to deficiencies in the plans we are working on. We have deliberately held closely to some basic guiding principles to evaluate their applicability. So it is important to pinpoint their failings in our current system and mention possible alternative approaches.</Paragraph> </Section> <Section position="22" start_page="246" end_page="247" type="metho"> <SectionTitle> RELYING ON PLAN STRUCTURE </SectionTitle> <Paragraph position="0"> To build a domain-independent system to generate text from plans, we have deliberately tried to use only information that the planner itself understands; i.e., information about the structure of the plan. One of the fundamental tenets of our approach was thus that the plan abstraction hierarchy would be a useful source of information about how the text should be organized.</Paragraph> <Paragraph position="1"> But our experience suggests that it may not be as useful as one might think. 
As well as the kinds of problems described above (which might be corrected in a different planning system), there seem to be more general discrepancies between the kind of abstraction useful to a planner and the kind useful to a text generator.</Paragraph> <Paragraph position="2"> For example, our car domain and many of the blocks-world plans that have been studied in AI tend to have a deeper abstraction hierarchy than one might expect from the apparent simplicity of the tasks. A generator that tries to exploit every level of the hierarchy ends up producing too much structure in its text. Thus the example used in Section 3 also has a 'section':</Paragraph> </Section> <Section position="23" start_page="247" end_page="247" type="metho"> <SectionTitle> GAINING ACCESS TO THE DISTRIBUTOR </SectionTitle> <Paragraph position="0"> Detach the dirt cover from the engine and you will have finished gaining access to the distributor.</Paragraph> <Paragraph position="1"> Here there is a level of abstraction that is useful to the planner, but not to the human reader: it would probably have been better to insert this section &quot;in-line&quot; in the higher-level description. On the other hand, the single &quot;section&quot; devoted to installing the services in the house-building plan would have gained from being broken up in some way. There may be a linguistic solution to the problem of whether a piece of information deserves a full &quot;section,&quot; perhaps in terms of a domain-dependent model of what is and is not worth saying, or the problem may point to a fundamental difference between the ways the planner and a human perceive the planning task. Either way, what is clear is that the planner's abstraction hierarchy alone is not fully adequate for text generation. 
Whether we can devise general principles for producing alternative decompositions of plans more suitable for text generation remains an open research area.</Paragraph> </Section> <Section position="24" start_page="247" end_page="248" type="metho"> <SectionTitle> REPETITION </SectionTitle> <Paragraph position="0"> We have commented above on reasons why the raw material we can gain from plans is liable to lead to repetitiveness in the text. Even if we managed to enrich the plan representations suitably, however, the generator would still be deficient when the input really is uniform. In particular, the uniformity of the text output often leads to unwanted ambiguities, simply because of the lack of variation in the stylistic devices used. For instance, in the following excerpt it is unclear whether the potato peeling is supposed to be &quot;in parallel with&quot; melting the fat, or just with stirring the sauce: Melt the fat ....</Paragraph> <Paragraph position="1"> After this, stir the sauce until it boils.</Paragraph> <Paragraph position="2"> Meanwhile peel the potatoes and cut them into pieces.</Paragraph> <Paragraph position="3"> We originally hoped to overcome the problem of repetition by providing several structure-building rules for each type of message language construction, which would be sensitive to the form of the objects involved in the construction. To some extent we succeeded in producing such rules, but the effect on the text was not great. The problem here is that, even with these extra rules, our structure-building is based solely on local patterns in the message, whereas the problem of repetition can only be solved by a global planning of the text. 
It might be possible to gain some improvement in our system by having the choice of structure-building rules be determined partially by some random factor, but a proper solution requires a more radical redesign.</Paragraph> </Section> <Section position="25" start_page="248" end_page="248" type="metho"> <SectionTitle> LINGUISTIC SIMPLIFICATIONS </SectionTitle> <Paragraph position="0"> There are a number of stylistic issues that the system cannot easily accommodate. For instance, operations such as &quot;heavy NP shift,&quot; segmentation into sentences, coordination, and ellipsis all require detailed stylistic control and evaluation. The message language is deliberately nonlinguistic and so can only approximately represent the kinds of language-dependent stylistic information such processing needs. For instance, rewrite rules can decide how to group information on the basis of the complexity of the message, but this only indirectly reflects the complexity of the text that will be generated. The effective use of different stylistic devices depends in the end on simplifications that are justified on linguistic, rather than conceptual, grounds, and this suggests that our architecture should really incorporate a style module capable of reasoning at this level. Such a style module would necessarily have to take a more global view, looking at the overall linguistic effect of the localized basic text generation processes. It might be possible to introduce linguistic simplifications at structure-building time, relaxing the requirement of compositionality (indeed, this is how McDonald 1983 operates). We believe, however, that it would be preferable to attempt to treat it at least conceptually as a subsequent processing stage.</Paragraph> </Section></Paper>