<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0601">
  <Title>Techniques for Text Planning with XSLT</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Text Planning in COMIC
</SectionTitle>
    <Paragraph position="0"> Broadly speaking, text planning in COMIC follows the standard pipeline model of natural language generation (Reiter and Dale, 2000). The input to the COMIC text planner, from the dialogue manager, specifies the content of the description at a high level; the output consists of logical forms for the OpenCCG realizer.</Paragraph>
    <Paragraph position="1"> The module is implemented in Java and uses Apache Xalan to process the XSLT templates. The initial implementation of the presentation-planning module (of which the XSLT-based sentence planner described here is just a part) took approximately one month. After that, the module was debugged and updated incrementally over a period of several months, during which time additional templates were created to support updates in the OpenCCG grammar. The development process was made easier by the ability to use OpenCCG to parse a target sentence, and then base a template on the resulting logical form.</Paragraph>
    <Paragraph position="2"> The current presentation planner uses 14 templates for content structuring and aggregation (Section 3.2), and just over 100 to build the logical forms (Section 3.3). The tasks described here run quickly (on the order of hundreds of milliseconds); most of the module's time is spent communicating with other modules in the system.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Content Selection
</SectionTitle>
      <Paragraph position="0"> The features of the available designs are stored in the system ontology. This is represented in DAML+OIL (soon to be OWL) and includes tile properties such as style, colour, and decoration.</Paragraph>
      <Paragraph position="1"> There is also canned-text commentary associated with some features (e.g., the Tuscan country home text in (1)). The ontology instance corresponding to design (&quot;tileset&quot;) 9 is shown in Figure 2.</Paragraph>
      <Paragraph position="2"> For a description like (1), the dialogue manager specifies only the tileset to be described, and optionally a set of features to include in the description. Figure 3 shows a dialogue-manager message indicating that tileset 9 should be described, and that the description must include the colour.</Paragraph>
      <Paragraph position="3"> To select the content of the description, we first retrieve all of the features of the indicated design from the ontology, using the Jena semantic web framework. We then use the system dialogue history to filter the retrieved features by removing any that have already been described to the user. Finally, we add back to the set any features specifically requested by the dialogue manager, even if they have been included in a previous description.</Paragraph>
      <Paragraph position="4"> [...] containing ontology instances to be validated easily against an XML schema.</Paragraph>
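      <Paragraph> The three selection steps above (retrieve, filter by dialogue history, add back requested features) can be sketched with plain set operations. This is only an illustrative sketch: the function name, the parameter names, and the guard against requested features that the design does not have are assumptions, and the actual module retrieves features through Jena rather than from Python sets.</Paragraph>
      <Paragraph><![CDATA[
```python
def select_content(all_features, already_described, requested):
    """Return the set of features to include in a description.

    all_features      -- every feature of the design (from the ontology)
    already_described -- features the dialogue history marks as presented
    requested         -- features the dialogue manager explicitly asked for
    """
    # Steps 1 and 2: start from all features, drop those already described.
    selected = set(all_features) - set(already_described)
    # Step 3: add back requested features, even if described before.
    # Assumption: a requested feature the design lacks is ignored.
    selected |= set(requested) & set(all_features)
    return selected


features = {"style", "colour", "decoration"}
history = {"style", "colour"}
print(sorted(select_content(features, history, {"colour"})))
# prints ['colour', 'decoration']
```
]]></Paragraph>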
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Content Structuring
</SectionTitle>
      <Paragraph position="0"> The result of content selection is an unordered set of tileset features; this set is converted into a text plan as follows. First, for each selected feature, a message is created in XML that combines the information gathered from the ontology with information from the system dialogue history. Figure 4 shows the messages corresponding to the colour feature and to the associated canned-text commentary. The dialogue-history information is included in the same-as-last (i.e., whether this value is the same as the corresponding value of the previous tileset) and already-said attributes.</Paragraph>
      <Paragraph position="1"> The unordered set of messages is converted to an ordered list using a small number of heuristics: for example, features requested by the dialogue manager are always put at the start of the list, while canned-text commentary always goes immediately after the feature to which it refers. These heuristics provide a partial ordering, which is then converted to a total ordering by breaking ties at random.</Paragraph>
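      <Paragraph> The two ordering heuristics named above, together with random tie-breaking, can be sketched as follows. The dictionary keys (type, feature), the function name, and the handling of commentary whose feature has no value message are assumptions made for illustration; the sketch also assumes at most one commentary per feature.</Paragraph>
      <Paragraph><![CDATA[
```python
import random


def order_messages(messages, requested, rng=None):
    """Turn an unordered collection of messages into an ordered list."""
    if rng is None:
        rng = random.Random()
    values = [m for m in messages if m["type"] == "value"]
    commentary = {m["feature"]: m for m in messages if m["type"] == "commentary"}

    # Break ties at random: shuffle first, then apply a stable sort so the
    # shuffled order survives within each group.
    rng.shuffle(values)
    # Heuristic 1: requested features go first (False sorts before True).
    values.sort(key=lambda m: m["feature"] not in requested)

    ordered = []
    for message in values:
        ordered.append(message)
        # Heuristic 2: commentary immediately follows its feature.
        if message["feature"] in commentary:
            ordered.append(commentary[message["feature"]])

    # Assumption: commentary without a matching value message goes last.
    placed = {m["feature"] for m in values}
    ordered.extend(c for f, c in commentary.items() if f not in placed)
    return ordered
```
]]></Paragraph>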
      <Paragraph position="2"> The next step is to aggregate the flat list of messages. In many NLG systems, aggregation is done at the syntactic level; in COMIC, we instead work at the conceptual level. Because we produce multiple alternative syntactic structures (see Section 4), we can be confident that, whatever the final set of messages, some syntactic structure will be available to realize them.</Paragraph>
      <Paragraph position="3"> The aggregation is done using a set of XSLT templates that combine adjacent messages based on various criteria. For example, the template shown in Figure 5 combines a feature-value message with the associated canned-text commentary. Figure 6 shows the combined message that results when the messages in Figure 4 are processed by this template.</Paragraph>
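      <Paragraph> As a rough illustration of one aggregation pass, the sketch below combines a feature-value message with a directly following commentary on the same feature. It is written in Python over ElementTree rather than XSLT, and the element and attribute names (messages, message, type, feature, id), the elaboration type, and the joining of IDs with a plus sign are all assumptions, not the schema used in COMIC.</Paragraph>
      <Paragraph><![CDATA[
```python
import xml.etree.ElementTree as ET


def aggregate(messages):
    """One pass: merge each value message with an adjacent commentary
    message about the same feature; copy everything else unchanged."""
    result = ET.Element(messages.tag)
    items = list(messages)
    i = 0
    while i < len(items):
        cur = items[i]
        nxt = items[i + 1] if i + 1 < len(items) else None
        if (nxt is not None
                and cur.get("type") == "value"
                and nxt.get("type") == "commentary"
                and cur.get("feature") == nxt.get("feature")):
            # Wrap the pair in a new combined message (hypothetical format).
            combined = ET.SubElement(result, "message", {
                "type": "elaboration",
                "id": cur.get("id") + "+" + nxt.get("id"),
            })
            combined.append(cur)
            combined.append(nxt)
            i += 2
        else:
            result.append(cur)
            i += 1
    return result
```
]]></Paragraph>
      <Paragraph> Running such a pass repeatedly until nothing changes would give multi-level aggregation; that loop is omitted here.</Paragraph>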
      <Paragraph position="4"> The sentence boundaries in the final text are determined by the content structure: each message that remains after aggregation corresponds to exactly one sentence in the output.</Paragraph>
      <Paragraph position="5"> [...] tests, and aggregation is performed in several passes to allow multi-level aggregation. The set namespace refers to a Java Set instance that stores message IDs to avoid processing a message twice.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Sentence Planning
</SectionTitle>
      <Paragraph position="0"> After the content of a description has been selected and structured, the logical forms to send to the realizer are created by applying further XSLT templates. Every such template matches a message with particular properties, and produces a logical form for the realizer, possibly combining the results of other templates to produce its own final result.</Paragraph>
      <Paragraph position="1"> XSLT modes are used to select different templates in different target syntactic contexts.</Paragraph>
      <Paragraph position="2"> Two sample templates are shown in Figure 7.</Paragraph>
      <Paragraph position="3"> The first template produces the logical form for a sentence (mode=&quot;s&quot;) describing the colours of a tileset (e.g., The tiles are terracotta and beige).</Paragraph>
      <Paragraph position="4"> The second template creates a logical form representing a commentary message as a verb phrase (mode=&quot;vp&quot;), and then appends it as an elaboration to a sentence about the same property. (Canned-text commentary is represented in the realizer lexicon as a multi-word verb.)</Paragraph>
      <Paragraph position="5"> When the messages in Figure 6 are transformed by these templates, the result is the logical form shown in Figure 8, which corresponds to the sentence The tiles are terracotta and beige, giving the room the feeling of a Tuscan country home.</Paragraph>
      <Paragraph position="6"> Referring expressions are generated based on the number of mentions of the referent: the first reference gets a full NP (e.g., this design), while subsequent mentions are pronominalized.</Paragraph>
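      <Paragraph> The mention-counting rule for referring expressions can be sketched in a few lines. The class name, the method name, and the default pronoun are assumptions for illustration; in COMIC the choice is made inside the XSLT templates and yields a logical form rather than a string.</Paragraph>
      <Paragraph><![CDATA[
```python
from collections import Counter


class ReferenceTracker:
    """Choose a referring expression from the number of prior mentions."""

    def __init__(self):
        self.mentions = Counter()

    def refer(self, referent, full_np, pronoun="it"):
        # Count this mention, then decide: first mention gets the full NP,
        # every later mention is pronominalized.
        self.mentions[referent] += 1
        return full_np if self.mentions[referent] == 1 else pronoun


tracker = ReferenceTracker()
print(tracker.refer("tileset9", "this design"))  # prints this design
print(tracker.refer("tileset9", "this design"))  # prints it
```
]]></Paragraph>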
      <Paragraph position="7"> The logical form created for each top-level message is sent to the OpenCCG realizer, which then generates and returns the corresponding surface form. As described below, the logical forms may incorporate alternatives, in which case the realizer chooses the logical form to use.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Sending Alternatives to the Realizer
</SectionTitle>
    <Paragraph position="0"> Many messages can be realized by several different logical forms. For example, to inform the user [...] and This design is country. Often, the text planner has no reason to prefer one alternative over another. Rather than picking an arbitrary option within the text planner (as did, e.g., van Deemter et al. (1999)), we instead defer the choice and send all of the valid alternatives to the realizer, in a packed representation. This makes the implementation of the text planner more straightforward. Figure 9 shows an example of such a logical form, incorporating both of the above options under a &lt;one-of&gt; element.</Paragraph>
    <Paragraph position="1"> To process a logical form with embedded alternatives, the COMIC realizer makes use of the same n-gram language models that it uses to guide its search for the realization of a single logical form.</Paragraph>
    <Paragraph position="2"> Since OpenCCG cannot yet handle the realization of logical forms with embedded alternatives directly (though this capability is planned), in the current system the packed alternatives are first multiplied out into a list of top-level alternatives, whose order is randomly shuffled. The realizer then computes the best realization for each top-level alternative in turn, keeping track of the overall best-scoring complete realization, until either the anytime time limit is reached or the list is exhausted. To allow for some free variation, a new realization's score must exceed the current best one by a certain threshold before it is considered significantly better.</Paragraph>
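    <Paragraph> The procedure just described (multiply out, shuffle, then search with a time limit and a score threshold) can be sketched as below. This is a simplified stand-in, not the OpenCCG code: packed forms are modelled as nested tuples and strings with a hypothetical OneOf wrapper, the score callback replaces the realizer's n-gram scoring of its best realization, and the default threshold and time limit are arbitrary.</Paragraph>
    <Paragraph><![CDATA[
```python
import itertools
import random
import time


class OneOf:
    """A packed choice point holding several alternative sub-forms."""

    def __init__(self, *options):
        self.options = options


def expand(form):
    """Yield every alternative-free version of a packed form."""
    if isinstance(form, OneOf):
        for option in form.options:
            yield from expand(option)
    elif isinstance(form, tuple):
        # Cartesian product over the expansions of each part.
        parts = [list(expand(part)) for part in form]
        yield from itertools.product(*parts)
    else:
        yield form


def choose(packed, score, threshold=0.05, time_limit=1.0, rng=None):
    """Return the best alternative found before the time limit expires."""
    if rng is None:
        rng = random.Random()
    alternatives = list(expand(packed))
    rng.shuffle(alternatives)
    deadline = time.monotonic() + time_limit
    best, best_score = None, None
    for alternative in alternatives:
        # Always score at least one alternative, then respect the deadline.
        if best is not None and time.monotonic() >= deadline:
            break
        current = score(alternative)
        # A newcomer must beat the current best by more than the threshold.
        if best is None or current > best_score + threshold:
            best, best_score = alternative, current
    return best
```
]]></Paragraph>
    <Paragraph> Because of the shuffle and the threshold, two alternatives whose scores differ by less than the threshold can each win on different runs, which is the source of the free variation mentioned above.</Paragraph>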
    <Paragraph position="3"> As a concrete example, consider the case where the system must confirm that the user intends to refer to a tileset with a specific feature. The feature could be included in the logical form in two ways: it could be attached directly to the design node (2-3), or it could instead be included as a non-restrictive modifier (4).</Paragraph>
    <Paragraph position="4"> (2) Do you mean this country design?
 (3) Do you mean this design by Coem?
 (4) Do you mean this design, with tiles by Coem?
 When the modifier can be placed before design, as in (2), the directly-attached structure is acceptable. However, for some features, the modifier can only be placed after the modified noun, as in (3). In these cases, the preferred structure is instead the non-restrictive one in (4); this breaks the sentence into two intonational phrases, which makes it easier to understand when it is output by the speech synthesizer. This preference is implemented by including only sentences of the preferred type when building the language model for OpenCCG. The realizer will then give (4) a higher n-gram score than (3), and will therefore choose the desired structure.
 [Figure residue: fragments of two logical-form listings, one for &quot;The tiles are terracotta and beige, giving the room the feeling of a Tuscan country home.&quot; and one beginning &quot;This design is ...&quot;]</Paragraph>
    <Paragraph position="5"> In addition to simplifying the implementation, retaining multiple alternatives through the planning process also increases the robustness of the system, and provides a substitute for backtracking. Particularly during development, there may be times when a required template simply does not exist; for example, the second template in Figure 7 will fail if the canned-text commentary cannot be realized as a verb phrase. In such cases, the text planner prunes out the failing possibilities before sending the set of options to the realizer, using the template shown in Figure 10.</Paragraph>
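    <Paragraph> The pruning step can be illustrated with a small recursive function. The sketch assumes that a missing template leaves a marker element, here called fail, in the output; that marker name, the function name, and the rule that a one-of with no surviving options itself fails are assumptions, since the template in Figure 10 is not reproduced here.</Paragraph>
    <Paragraph><![CDATA[
```python
import xml.etree.ElementTree as ET


def prune(node):
    """Return a copy of node without failing alternatives,
    or None if the node as a whole cannot be realized."""
    if node.tag == "fail":
        return None
    copy = ET.Element(node.tag, dict(node.attrib))
    copy.text = node.text
    for child in node:
        kept = prune(child)
        if kept is None:
            if node.tag != "one-of":
                # A required part failed, so this whole node fails.
                return None
            # Inside one-of, simply drop the failing option.
            continue
        kept.tail = child.tail
        copy.append(kept)
    if node.tag == "one-of" and len(copy) == 0:
        # Every option failed, so the choice point fails too.
        return None
    return copy
```
]]></Paragraph>
    <Paragraph> Failure propagates upward until it reaches an enclosing one-of that still has another option, which is how the retained alternatives can stand in for backtracking.</Paragraph>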
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Related Work
</SectionTitle>
    <Paragraph position="0"> The work presented here continues in the tradition of several recent NLG systems that use what could be called generalized template-based processing. By generalized, we mean that, rather than manipulating flat strings with no underlying linguistic representation, these systems instead work with structured fragments, which are often processed recursively. Other systems that fall into this category include EXEMPLARS (White and Caldwell, 1998), D2S (van Deemter et al., 1999), Interact [...] (Becker, 2002).</Paragraph>
    <Paragraph position="1"> The main novel contribution of the text-planning approach described here is its use of an external realizer that processes logical forms with embedded alternatives. This eliminates the need to use a backtracking AI planner (Becker, 2002) or to make arbitrary choices when multiple alternatives are available (van Deemter et al., 1999). The realizer also uses a completely different algorithm from the XSLT template processing (bottom-up, chart-based search rather than top-down rule expansion), which allows it to deal with those aspects of NLG that are more easily addressed using this kind of processing strategy.</Paragraph>
    <Paragraph position="2"> Our approach to text planning draws from both the AI-planning and the template-based traditions in natural language generation. Most previous NLG systems that use AI planners use them primarily to do hierarchical decomposition of communicative goals; the work described here uses XSLT to achieve the same end, with a substitute for backtracking provided by the realizer's support for multiple alternatives. The system is nonetheless equally based on (generalized) template processing.</Paragraph>
    <Paragraph position="3"> This demonstrates that, rather than being in conflict, the two traditions actually have complementary strengths, which can usefully be combined in a single system (contra Reiter, 1995; cf. van Deemter et al., 1999).</Paragraph>
  </Section>
</Paper>