<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-1603">
  <Title>Ten Years After: An Update on TG/2 (and Friends)</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Shallow NLG is known as &quot;quick and dirty&quot; on the one hand, and as a practical approach to implementing real-world applications on the other. Its legitimization stems from practical success rather than from theoretical advantages. As with shallow analysis, methods have become acceptable that were rejected twenty-five years ago as linguistically unjustified. For instance, template-based NLG systems were known to be unscalable and inflexible. Besides, they were quite trivial and did not contribute to solving any research questions in the field. However, it became evident that many practical applications involving NLG required limited linguistic coverage, used canned text and/or templates, and badly needed improvements to make the NLG systems more flexible. A revival of template-based systems followed, and subsequent scientific discussions clarified the relation to more advanced NLG research themes. The papers in [Becker and Busemann, 1999] nicely show a continuum between template- and &quot;plan&quot;-based systems.</Paragraph>
    <Paragraph position="1"> Since its first implementation in 1995, the shallow NLG system TG/2 [Busemann, 1996] has been used as a component in several diverse applications involving NLG. Implemented in Common Lisp, TG/2 has been refined continuously over the years; a Java sibling implementation, called XtraGen, has since become available, and the grammar development environment eGram now allows the grammar writer to design large-scale grammars.</Paragraph>
    <Paragraph position="2"> Among the attractive properties of TG/2 is the quick development of new NLG applications with limited requirements on linguistic expressiveness. Numerous implementations show that TG/2 is well suited for simple dialogues, report generation (from database content), and even as a realizer for complex surface-semantic sentence representations.</Paragraph>
    <Paragraph position="3"> Besides, a better understanding of the pros and cons of TG/2 has emerged.</Paragraph>
    <Paragraph position="4"> The time has come to summarize these developments and, more generally, to reassess the value of TG/2 as a framework for specifying generation systems.</Paragraph>
    <Paragraph position="5"> In the following section, TG/2 is located on the NLG map, clarifying a few common misconceptions about what it can be used for. In Section 3 we sketch major use cases involving TG/2 that exhibit different degrees of &quot;shallowness&quot;. Section 4 summarizes the major extensions and refinements that have been implemented over the last decade, taking into account some critical comments from the literature. We then describe in Section 5 the need for, and the benefits of, the dedicated grammar development environment eGram, which supports the fast development of large rule sets. The paper concludes with an outlook on upcoming work.</Paragraph>
    <Paragraph position="6"> 2 What TG/2 is and What it isn't
TG/2 was originally described in [Busemann, 1996; Busemann and Horacek, 1998] as a template-based generator. To remind the reader of the main points, TG/2 is a flexible production system [Davis and King, 1977] that provides a generic interpreter for a separate set of user-defined condition-action rules representing the generation grammar.</Paragraph>
    <Paragraph position="7"> The generic task is to map a content representation, which must be encoded as a feature structure, onto a chain of terminal elements as defined by the rule set. The rules have a context-free categorial backbone used for standard top-down derivation guided by the input representation. The rules specify conditions on the input, the so-called test predicates, that determine their applicability. Due to the context-free backbone, each subtree of depth 1 in a derivation tree corresponds to the application of one rule. TG/2 is equipped with a constraint propagation mechanism that supports the establishment of agreement relations across the derivation tree. Figure 1 shows a sample rule.</Paragraph>
    <Paragraph position="8"> TG/2 production rules have a simple interpretation procedure that corresponds to the classical three-step evaluation cycle in production systems (matching, conflict resolution, firing) [Davis and King, 1977]. The algorithm starts from a (piece of the) input structure and a category.</Paragraph>
    <Paragraph position="9"> 1. Matching: Select all rules carrying the current category. Execute the tests for each of these rules on the input structure and add those passing their test to the conflict set.</Paragraph>
    <Paragraph position="10"> 2. Conflict resolution: Select an element from the conflict set, e.g. on the basis of some conflict resolution mechanism.</Paragraph>
    <Paragraph position="11"> 3. Firing: Evaluate its constraints (if any). For each right-hand side element, read the category, determine the substructure of the input, and go to step 1.</Paragraph>
    <Paragraph position="12"> The processing strategy is top-down and depth-first. The set of actions is fired from left to right. Failure to execute an action causes backtracking on the rule.</Paragraph>
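The three-step cycle and the top-down, depth-first strategy with backtracking can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the TG/2 interpreter: the names Rule, NonTerminal, generate and expand are hypothetical, conflict resolution is simply rule order, constraint evaluation is omitted, and a missing input substructure is treated as a failing action.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class NonTerminal:
    category: str
    path: tuple = ()  # attribute path to the relevant input substructure


@dataclass
class Rule:
    category: str
    test: Callable  # test predicate over the current input structure
    rhs: list       # items are str (canned text) or NonTerminal


def substructure(inp, path):
    """Follow an attribute path; None signals that the action fails."""
    for key in path:
        if not isinstance(inp, dict) or key not in inp:
            return None
        inp = inp[key]
    return inp


def generate(category, inp, rules):
    """Lazily yield derivations as lists of terminal strings."""
    # 1. Matching: rules of the current category whose test succeeds.
    conflict_set = [r for r in rules if r.category == category and r.test(inp)]
    # 2. Conflict resolution: plain rule order (an assumption of this sketch).
    for rule in conflict_set:
        # 3. Firing: if it yields nothing, the loop backtracks to the next rule.
        yield from expand(rule.rhs, inp, rules)


def expand(rhs, inp, rules):
    """Fire right-hand side elements from left to right, depth-first."""
    if not rhs:
        yield []
        return
    head, rest = rhs[0], rhs[1:]
    if isinstance(head, str):  # canned text is emitted as is
        for tail in expand(rest, inp, rules):
            yield [head] + tail
        return
    sub = substructure(inp, head.path)
    if sub is None:  # failing action: no derivation from this rule
        return
    for first in generate(head.category, sub, rules):  # back to step 1
        for tail in expand(rest, inp, rules):
            yield first + tail


# Invented toy grammar and input. The first S rule passes its test but fails
# when fired, because the input has no "level" attribute.
rules = [
    Rule("S", lambda i: i.get("type") == "warning",
         [NonTerminal("LEVEL", ("level",)), "warning for",
          NonTerminal("SUBST", ("pollutant",))]),
    Rule("S", lambda i: i.get("type") == "warning",
         ["Warning:", NonTerminal("SUBST", ("pollutant",)), "is high."]),
    Rule("SUBST", lambda i: i == "ozone", ["ozone"]),
]
content = {"type": "warning", "pollutant": "ozone"}
result = " ".join(next(generate("S", content, rules)))
print(result)  # Warning: ozone is high.
```

Using generators keeps the search lazy: taking only the first derivation corresponds to depth-first search that stops at the first success, while iterating further would enumerate alternative derivations.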
    <Paragraph position="13"> The right-hand side of a rule can consist of any mixture of terminal elements (canned text) and non-terminal categories, as in Figure 1. The presence of canned text is useful if the input does not express explicitly everything that should be generated. The grammar thus adds text to the output that does not have an explicit semantic basis in the input. With very detailed input and hence less &quot;implicit&quot; semantics, only a little canned text will be needed in the grammar, and the terminal elements of the grammar usually are word stems.</Paragraph>
    <Paragraph position="14"> Canned parts of the grammar are &quot;invented&quot;. This gives rise to the notion of &quot;shallow generation&quot;, as opposed to shallow analysis, where parts of the input text are ignored. TG/2 leaves complete freedom to use canned text, to mix it with context-free rules, or to stick to the more traditional distinction between (context-free) rules and the lexicon. [Busemann and Horacek, 1998] refer to the former kind as shallow and to the latter as in-depth generation. One may thus identify TG/2 applications on a scale ranging from more shallow to more in-depth systems. Figure 2 attempts to compare some TG/2-based NLG applications along this dimension. They will be discussed in Section 3.</Paragraph>
    <Paragraph position="15"> As mentioned above, there is no strict borderline between template-based and plan-based generation systems. While this insight resulted from comparing different systems, TG/2 implements this claim by forming a single framework that may host any approach ranging from pure canned text to completely lexicon-based. As Section 3 demonstrates, TG/2 can implement template-based systems and full-fledged realizers.</Paragraph>
    <Paragraph position="16"> In an attempt to relate existing NLG systems to the RAGS framework [Mellish et al., 2000], TG/2 was among the systems examined. It turned out that TG/2 differs from the principles underlying RAGS in that it does not support any of the levels of conceptual, semantic, rhetorical, document or syntactic representation, which were abstractly defined to capture many (most) NLG approaches. Rather, TG/2 entails a single mapping from input to output, and any tasks generally ascribed to components delivering the above intermediate representations must be encoded by one or several production rules. There is no pipeline of modules with intermediate representations, as ideally assumed in RAGS. During this experiment it actually became evident that TG/2 isn't a classical generation system at all.</Paragraph>
    <Paragraph position="17"> In non-trivial NLG applications, TG/2 is complemented by other components. On the output side, it can be hooked up to morphological inflection components using a shared representation of word stems and morpho-syntactic features. On the input side, TG/2 has been combined with a text structuring component in the TEMSIS application, with a context management system in COMET, and with a lexical choice component in the MUSI system.</Paragraph>
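To make the output-side coupling more concrete, the sketch below passes a generator's terminal chain through a separate inflection step. It is only an illustration under assumptions: the token shape (a stem paired with a feature dictionary), the toy PARADIGMS table, and the functions inflect and postprocess are hypothetical and do not describe the actual interface between TG/2 and its morphological components.

```python
# Toy inflection table standing in for a real morphological component.
# Keys are (stem, tense, number); the entries are invented for this example.
PARADIGMS = {
    ("be", "present", "singular"): "is",
    ("be", "present", "plural"): "are",
    ("value", None, "singular"): "value",
    ("value", None, "plural"): "values",
}


def inflect(stem, features):
    """Map a word stem plus morpho-syntactic features to a surface form."""
    key = (stem, features.get("tense"), features.get("number"))
    return PARADIGMS[key]


def postprocess(tokens):
    """Pass canned text through unchanged and inflect (stem, features) pairs."""
    words = []
    for token in tokens:
        if isinstance(token, str):
            words.append(token)
        else:
            stem, features = token
            words.append(inflect(stem, features))
    return " ".join(words)


# Hypothetical terminal chain mixing canned text and stems with features.
generator_output = [
    "The measured",
    ("value", {"number": "plural"}),
    ("be", {"tense": "present", "number": "plural"}),
    "above the threshold.",
]
sentence = postprocess(generator_output)
print(sentence)  # The measured values are above the threshold.
```

Keeping inflection outside the grammar means the rules only need to agree on stems and features, which is what allows the same rule set to be paired with different morphological components.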
  </Section>
</Paper>