<?xml version="1.0" standalone="yes"?>
<Paper uid="P92-1007">
  <Title>A Functional Approach to Generation with TAG 1</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Joshi (1987) described the relevance of Tree Adjoining Grammar (TAG) (Joshi, 1985; Schabes, Abeille &amp; Joshi, 1988) to Natural Language Generation. In particular, he pointed out how the unique factoring of recursion and dependencies provided by TAG made it particularly appropriate for deriving sentence structures from an input provided by a text planning component. Of particular importance is the fact that (all) syntactic dependencies and function-argument structure are localized in TAG trees.</Paragraph>
    <Paragraph position="1"> Shieber and Schabes (1991) discuss using Synchronous TAG for generation. Synchronous TAG provides a formal foundation to make explicit the relationship between elementary syntactic structures and their corresponding semantic counterparts, both expressed as elementary TAG trees. This relationship is made explicit by pairing the elementary trees in the syntactic and logical form languages, and associating the corresponding nodes. Shieber and Schabes (1990) describe a generation algorithm which &quot;parses&quot; an input logical form string, recording the adjoining and substitution operations necessary to build the string from its elementary components. The corresponding syntactic structure is then generated by doing the same set of operations (in reverse) on the corresponding elementary structures in the grammar describing the natural language.</Paragraph>
    <Paragraph position="2"> 1 This work is supported in part by Grant #H133E80015 from the National Institute on Disability and Rehabilitation Research. Support was also provided by the Nemours Foundation. We would like to thank John Hughes for his many comments and discussions concerning this work.</Paragraph>
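    <Paragraph> The record-and-replay idea behind the synchronous TAG generation algorithm described above can be illustrated with a minimal Python sketch. It is a deliberately simplified assumption and not Shieber and Schabes's algorithm: logical forms are nested tuples, the paired syntactic structures are plain string templates, and only substitution (no adjoining) is modeled. The names PAIRS, record_derivation, and replay are hypothetical.

```python
# Hypothetical paired lexicon: logical-form symbol -> syntactic template.
# Slots such as {0} stand for substitution sites linked across the pair.
PAIRS = {
    "hit": "{0} hit {1}",
    "think": "{0} thinks that {1}",
    "john": "John",
    "mary": "Mary",
}


def record_derivation(lf, path=()):
    """Walk a nested logical form and record, for each address (path),
    which elementary structure is used there."""
    if isinstance(lf, str):
        return [(path, lf)]
    head, *args = lf
    steps = [(path, head)]
    for i, arg in enumerate(args):
        steps += record_derivation(arg, path + (i,))
    return steps


def replay(steps):
    """Apply the recorded operations to the paired syntactic templates."""
    by_path = dict(steps)

    def build(path):
        template = PAIRS[by_path[path]]
        n_slots = template.count("{")  # number of substitution sites
        fillers = (build(path + (i,)) for i in range(n_slots))
        return template.format(*fillers)

    return build(())


# Toy logical form (an assumption): think(mary, hit(john, mary))
lf = ("think", "mary", ("hit", "john", "mary"))
steps = record_derivation(lf)
print(replay(steps))
```

    The derivation is kept as an explicit list of (address, structure) steps so that the semantic side and the syntactic side share nothing except that record, which is the property the pairing of elementary trees is meant to provide.</Paragraph>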
    <Paragraph position="3"> Note that the generation methodology proposed for synchronous TAG (and the hypothetical generator alluded to in (Joshi, 1987)) takes as input the logical form semantic representation and produces a syntactic representation of a natural language sentence which captures that logical form. While the correspondence between logical form and the natural language syntactic form is certainly an important and necessary component of any sentence generation system, it is unclear how finer distinctions can be made in this framework. That is, synchronous TAG does not address the question of which syntactic rendition of a particular logical form is most appropriate in a given circumstance. This aspect is particularly crucial from the point of view of generation. A full-blown generation system based on TAG must choose between various renditions of a given logical form on well-motivated grounds.</Paragraph>
    <Paragraph position="4"> Mumble-86 (McDonald &amp; Pustejovsky, 1985; Meteer et al., 1987) is a sentence generator based on TAG that is able to take more than just the logical form representation into account.</Paragraph>
    <Paragraph position="5"> Mumble-86 is one of the foremost sentence generation systems and it (or its predecessors) has been used as the sentence generation component of a number of natural language generation projects (e.g., (McDonald, 1983; McCoy, 1989; Conklin &amp; McDonald, 1982; Woolf &amp; McDonald, 1984; Rubinoff, 1986)). After briefly describing the methodology in Mumble-86, we will point out some problematic aspects of its design. We will then describe our architecture, which is based on interfacing TAG with a rich functional theory provided by functional systemic grammar (Halliday, 1970; Halliday, 1985; Fawcett, 1980; Hudson, 1981). We pay particular attention to those aspects which distinguish our generator from Mumble-86.</Paragraph>
    <Paragraph position="6">  content of what is to be generated along with the goals and rhetorical force to be achieved. While the form of the L-Spec is dependent on the particular application, for the purposes of this discussion we can think of it as a set of logical form expressions that describe the content to be expressed. Mumble-86 uses a dictionary-like mechanism to transform a piece of the L-Spec into an elementary TAG tree which realizes that piece. The translation process itself (performed in the dictionary) may be influenced by contextual factors (including pragmatic factors which are recorded as a side-effect of grammar routines), and by the goals recorded in the L-Spec itself. It is in this way that the system can make fine-grained decisions concerning one realization over another.</Paragraph>
    <Paragraph position="7"> Once a TAG tree is chosen to realize the initial subpiece, that structure is traversed in a left-to-right fashion. Grammar routines are run during this traversal to ensure grammaticality (e.g., subject-verb agreement) and to record contextual information to be used in the translation of the remaining pieces of the L-Spec. In addition to the grammar routines, as the initial tree is traversed, at each place where new information could be added into the evolving surface structure (called attachment points), the remaining L-Spec is consulted to see if it contains an item whose realization could be adjoined or substituted at that position.</Paragraph>
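    <Paragraph> The traversal just described can be pictured with a small Python sketch. It rests on strong simplifying assumptions: the surface structure is flattened to a list, grammar routines are omitted, and attachment points and L-Spec items are plain dictionaries. The function realize and the keys "kind", "attaches_at", and "words" are hypothetical and do not come from Mumble-86.

```python
def realize(initial_tree, remaining_spec):
    """Traverse a flattened surface structure left to right, consulting the
    remaining L-Spec items whenever an attachment point is reached."""
    words = []
    pending = list(remaining_spec)  # L-Spec items not yet realized
    for slot in initial_tree:
        if isinstance(slot, str):   # an ordinary word position
            words.append(slot)
            continue
        # Otherwise `slot` is an attachment point (a dict with a "kind" key).
        for item in pending:
            if item["attaches_at"] == slot["kind"]:
                words.extend(item["words"])
                pending.remove(item)
                break               # at most one item per attachment point
    return words, pending


# Toy input (an assumption): words interleaved with two attachment points.
tree = ["the", {"kind": "adjective"}, "dog", "barked", {"kind": "adverbial"}]
spec = [
    {"attaches_at": "adverbial", "words": ["loudly"]},
    {"attaches_at": "adjective", "words": ["old"]},
]
words, pending = realize(tree, spec)
print(" ".join(words))
```

    Note that the order of items in the toy L-Spec does not matter here: the surface order is fixed by where the attachment points occur in the structure being traversed.</Paragraph>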
    <Paragraph position="8"> In order for this methodology to work, McDonald &amp; Pustejovsky (1985) point out that they have to make some strong assumptions about the logical form input to their generator. Notice that the methodology described always starts generating from an initial tree, and other auxiliary or initial trees are adjoined or substituted into that initial structure. 3 As a result, in generating an embedded sentence, the generator must start with the innermost clause in order to ensure that the first tree chosen is an initial (and not an auxiliary) tree. Consider, for example, the generation of the sentence &quot;Who did you think hit John&quot;. Mumble-86 must start generating from the clause &quot;Who hit John&quot;, which is (roughly) captured in the tree shown in Figure 4. This surface structure would then be traversed. At the point labeled fr-node (an attachment point), the auxiliary tree representing &quot;you think&quot; in Figure 2 would be adjoined in.</Paragraph>
    <Paragraph position="9"> 3 An initial tree is a minimal non-recursive structure in TAG, while an auxiliary tree is a minimal recursive structure. Thus, an auxiliary tree is characterized as having a leaf node (which is termed the foot node) which has the same label as the root node. The tree in Figure 2 is an auxiliary tree. The adjoining operation essentially inserts an auxiliary tree into another tree. For instance, the tree in Figure 5 is the result of adjoining the auxiliary tree shown in Figure 2 into the initial tree shown in Figure 4 at the node labeled fr-node.</Paragraph>
    <Paragraph position="10"> Notice, however, that if Mumble-86 must work from the innermost clause out, then the initial L-Spec must be in a particular form which is not consistent with the &quot;logician's usual representation of sentential complement verbs as higher operators&quot; (McDonald &amp; Pustejovsky, 1985) [p. 101] (also noted by (Shieber &amp; Schabes, 1991)).</Paragraph>
    <Paragraph position="11"> Instead, Mumble-86 requires an alternative logical form representation which amounts to breaking the more traditional logical form into smaller pieces which reference each other. Mumble-86 must be told which of these pieces is the embedded piece that the processing should start with. Notice that this architecture is particularly problematic for certain kinds of verbs that take indirect questions. For instance, it would preclude the proper generation of sentences involving &quot;wonder&quot; (as in &quot;I wonder who hit John&quot;). Verbs which require the question to remain embedded are problematic for Mumble-86 since the main verb (wonder) would not be available when its inclusion in the surface structure needs to be determined. An additional requirement on the logical form input to the generator is that the lambda expression (representing a wh-question) and the expression containing the matrix trace be present in a single layer of specification. This, they claim, is necessary to generate an appropriate sentence form without the necessity of looking arbitrarily deep into the representation. This would mean that for sentences such as &quot;Who do you think hit John&quot;, the lambda expression would have to come with the &quot;hit John&quot; part of the input. We will show that our system does not place either of these restrictions on the logical form input and yet is able to generate the appropriate sentence without looking arbitrarily deep into the input specification. One can notice a few features of the system just described. First, because the dictionary translation process is context sensitive, the generation methodology is able to take more than just logical form into account. Note, however, that it is unclear what the theory is behind the realizations made. In addition, these decisions are encoded procedurally; thus the theory is rather difficult to abstract.</Paragraph>
    <Paragraph position="12"> It is also the case that Mumble-86 makes no distinction between decisions that are made for functional reasons and those that are made for syntactic reasons. Both kinds of information must be recorded (procedurally) in grammar routines so that they can be taken into account during subsequent translations. While the fact that the grammar is procedurally encoded and that functional</Paragraph>
  </Section>
</Paper>