<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-1604">
  <Title>Real-Time Stochastic Language Generation for Dialogue Systems</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Background
</SectionTitle>
    <Paragraph position="0"> The task of Content Determination is typically relegated to a module outside of the Generation component, such as a Task Manager or other reasoning component. This leaves Sentence Planning and Surface Realization as the main steps in dialogue generation, and this paper describes a module that performs both. The task of referential generation is not addressed, and it is assumed that each logical input is a single utterance, which removes the need for multiple-sentence generation.</Paragraph>
    <Paragraph position="1"> Traditionally, surface realization has been performed through templates or more complex syntactic grammars, such as the FUF/SURGE system [Elhadad and Robin, 1996].</Paragraph>
    <Paragraph position="2"> Template-based approaches produce inflexible output that must be changed in every new domain to which the system is ported. Symbolic approaches produce linguistically correct utterances, but require a syntactic input and typically have runtimes that are impractical for dialogue. Requiring word choice and most syntactic decisions to be finished beforehand puts a heavy burden on dialogue system designers.</Paragraph>
    <Paragraph position="3"> Stochastic approaches have recently provided a new way to reduce the need for syntactic input and to produce flexible generation in dialogue. HALogen [Langkilde-Geary, 2002] was one of the first stochastic generation systems, providing a two-phase approach that allows the system designer to use an under-specified input. The first phase uses a hand-written grammar that over-generates possible word orderings into a word forest. The second phase uses an n-gram language model to choose the highest-probability path through the forest, returning this path as the generated sentence. This approach was first used in a dialogue system in [Chambers and Allen, 2004] as an attempt to create a domain-independent surface realizer. A human evaluation showed a slight decline in naturalness when the system was moved to a new domain.</Paragraph>
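    <Paragraph> As a rough illustration of the two-phase idea described above, the following Python sketch selects the highest-probability path through a small set of over-generated word choices using a bigram model. It is not HALogen's implementation: the lattice of per-position alternatives is a simplified, hand-coded stand-in for the word forest, and the toy corpus, the add-one smoothing, the BOS/EOS markers, and the function names train_bigram, log_prob, and best_path are all assumptions made for this example.

```python
import math
from collections import Counter

BOS, EOS = "BOS", "EOS"  # hypothetical sentence-boundary markers


def train_bigram(corpus):
    """Count unigrams and bigrams over a toy corpus of token lists."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = [BOS] + sentence + [EOS]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed bigram log-probability log P(word | prev)."""
    vocab_size = len(unigrams)
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))


def best_path(lattice, unigrams, bigrams):
    """Viterbi search for the highest-scoring word sequence through the lattice."""
    # best maps the last word chosen so far to (score, path ending in that word)
    best = {BOS: (0.0, [])}
    for slot in lattice + [[EOS]]:
        step = {}
        for word in slot:
            step[word] = max(
                (score + log_prob(prev, word, unigrams, bigrams), path + [word])
                for prev, (score, path) in best.items()
            )
        best = step
    score, path = best[EOS]
    return path[:-1], score  # drop the EOS marker from the returned words


# Toy training data (assumed): stands in for the language model corpus.
corpus = [
    ["the", "train", "arrives", "soon"],
    ["a", "bus", "arrives", "late"],
    ["the", "trains", "arrive", "late"],
]

# Phase 1 stand-in (assumed): alternatives a grammar might over-generate.
lattice = [["the", "a"], ["train", "trains"], ["arrive", "arrives"], ["soon"]]

# Phase 2: let the bigram model pick the most probable path.
unigrams, bigrams = train_bigram(corpus)
path, score = best_path(lattice, unigrams, bigrams)
print(" ".join(path))  # the train arrives soon
```

    Because a bigram score depends only on the previous word, the search keeps one best partial path per candidate word instead of enumerating every combination of alternatives.</Paragraph>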
    <Paragraph position="4"> The stochastic approach was shown in [Langkilde, 2000] to produce good coverage of the Penn Treebank, but its runtime was slow, and others have suggested that the stochastic approach is not feasible for dialogue.</Paragraph>
  </Section>
</Paper>