<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-1034">
  <Title>Integrating Discourse Markers into a Pipelined Natural Language Generation Architecture</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Historically, work on NLG architecture has focused on integrating major disparate architectural modules such as discourse and sentence planners and surface realizers. More recently, as it was discovered that these components by themselves did not create highly readable prose, new types of architectural modules were introduced to deal with newly desired linguistic phenomena such as referring expressions, lexical choice, revision, and pronominalization.</Paragraph>
    <Paragraph position="1"> Adding each new module typically required an NLG system designer to justify not only the reason for including the new module (i.e., what linguistic phenomena it produced that had been previously unattainable) but also how it was integrated into the architecture and why its placement was reasonably optimal (cf. Elhadad et al., 1997, pp. 4-7). At the same time, Reiter (1994) argued that implemented NLG systems were converging toward a de facto pipelined architecture (Figure 1) with minimal-to-nonexistent feedback between modules.</Paragraph>
    <Paragraph position="2"> Although several NLG architectures were proposed in opposition to such a linear arrangement (Kantrowitz and Bates, 1992; Cline, 1994), these research projects have not continued, while pipelined architectures are still being actively pursued.</Paragraph>
    <Paragraph position="3"> In addition, Reiter concludes that although complete integration of architectural components is theoretically a good idea, in practical engineering terms such a system would be too inefficient to operate and too complex to actually implement. Significantly, Reiter states that fully interconnecting every module would entail constructing N(N-1) interfaces between them. As the number of modules N rises (i.e., as the number of large-scale features an NLG engineer wants to implement rises), the implementation cost rises quadratically. Moreover, this cost does not include modifications that are not component specific, such as multilingualism.</Paragraph>
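    <Paragraph> As a rough illustration of this interface count, the following Python sketch compares a fully interconnected design with a strict pipeline. It assumes that interfaces are directed (one per ordered pair of modules, giving N(N-1) in total) and that a pipeline needs one interface per adjacent pair of modules; the function names are hypothetical and do not come from Reiter's paper.

```python
def full_interfaces(n):
    """Directed interfaces needed when every module talks to every other one."""
    return n * (n - 1)


def pipeline_interfaces(n):
    """Interfaces in a strict pipeline: each module feeds only the next one."""
    return max(n - 1, 0)


# Print the module count, the fully connected count, and the pipeline count.
for n in (3, 5, 7):
    print(n, full_interfaces(n), pipeline_interfaces(n))
```

Under these assumptions, seven modules need 42 interfaces when fully interconnected but only 6 in a pipeline.</Paragraph>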
    <Paragraph position="4"> As text planners scale up to produce ever larger texts, the switch to multi-page prose will introduce new features, and consequently the number of architectural modules will increase. For example, Mooney's EEG system (Mooney, 1994), which created a full-page description of the Three Mile Island nuclear plant disaster, contains components for discourse knowledge, discourse organization, rhetorical relation structuring, sentence planning, and surface realization. Similarly, the STORYBOOK system (Callaway and Lester, 2002), which generated 2 to 3 pages of narrative prose in the Little Red Riding Hood fairy tale domain, contained seven separate components.</Paragraph>
    <Paragraph position="5"> This paper examines the interactions of two linguistic phenomena at the paragraph level: revision (specifically, clause aggregation, migration, and demotion) and discourse markers. Clause aggregation involves the syntactic joining of two simple sentences into a more complex sentence. Discourse markers link two sentences semantically without necessarily joining them syntactically. Because both of these phenomena produce changes in the text at the clause level, a lack of coordination between them can produce interference effects.</Paragraph>
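    <Paragraph> The kind of interference described above can be sketched with a toy Python example. Everything here is an assumption made for illustration: the string-based rules, the function names, the example sentences, and the relation-to-conjunction table are hypothetical and are not the technique implemented in this paper.

```python
def add_marker(sentences, index, marker):
    """Prefix one sentence with a discourse marker (toy rule)."""
    out = list(sentences)
    s = out[index]
    out[index] = marker + ", " + s[0].lower() + s[1:]
    return out


def aggregate(sentences, i, conjunction="and"):
    """Join sentences i and i+1 into one sentence (toy clause aggregation)."""
    first = sentences[i].rstrip(".")
    second = sentences[i + 1]
    joined = first + ", " + conjunction + " " + second[0].lower() + second[1:]
    return sentences[:i] + [joined] + sentences[i + 2:]


# Hypothetical mapping from a rhetorical relation to a compatible conjunction.
CONJUNCTION_FOR = {"contrast": "but"}


def coupled(sentences, i, relation):
    """Choose the conjunction from the relation instead of adding a marker first."""
    return aggregate(sentences, i, CONJUNCTION_FOR[relation])


text = ["The wolf was hungry.", "He did not eat."]

# Uncoordinated: the marker is added first, then aggregation runs blindly.
uncoordinated = aggregate(add_marker(text, 1, "However"), 0)
# Coordinated: one decision covers both the marker and the aggregation.
coordinated = coupled(text, 0, "contrast")

print(uncoordinated[0])
print(coordinated[0])
```

Run independently, the two steps yield "The wolf was hungry, and however, he did not eat.", whereas the coordinated version yields "The wolf was hungry, but he did not eat."</Paragraph>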
    <Paragraph position="6"> We thus hypothesize that the architectural modules corresponding to revision and discourse marker selection should be tightly coupled. We first summarize current work in discourse markers and revision, then provide examples where these phenomena interfere with each other, describe an implemented technique for integrating the two, and report on a preliminary system evaluation.</Paragraph>
  </Section>
</Paper>