<?xml version="1.0" standalone="yes"?>
<Paper uid="P95-1034">
  <Title>Two-Level, Many-Paths Generation</Title>
  <Section position="23" start_page="258" end_page="258" type="concl">
    <SectionTitle>
10 Conclusions
</SectionTitle>
    <Paragraph position="0"> Statistical methods give us a way to address a wide variety of knowledge gaps in generation. They even make it possible to load non-traditional duties onto a generator, such as word sense disambiguation for machine translation. For example, bei in Japanese may mean either American or rice, and sha may mean shrine or company. If for some reason the analysis of beisha fails to resolve these ambiguities, the generator can pass them along in the lattice it builds, offering alternatives such as the American shrine and the American company. In this case, the statistical model has a strong preference for the American company, which is nearly always the correct translation. Furthermore, our two-level generation model can implicitly handle both paradigmatic and syntagmatic lexical constraints, simplifying the generator's grammar and lexicon and enhancing portability. By retraining the statistical component on a different domain, we can automatically pick up the peculiarities of the sublanguage, such as preferences for particular words and collocations. At the same time, we take advantage of the strength of the knowledge-based approach, which guarantees grammatical inputs to the statistical component and reduces the amount of language structure that must be recovered from statistics. This approach addresses the problematic aspects of both pure knowledge-based generation (where incomplete knowledge is inevitable) and pure statistical bag generation (Brown et al., 1993) (where the statistical system has no linguistic guidance).</Paragraph>
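The disambiguation-by-lattice idea above can be sketched in a few lines: the generator emits all unresolved alternatives as a word lattice, and the statistical component picks the path the language model prefers. This is a minimal illustration, not the paper's implementation; the bigram counts are hypothetical stand-ins for a corpus-trained model.

```python
import itertools
import math

# Hypothetical bigram counts standing in for a corpus-trained model.
BIGRAM = {
    ("the", "american"): 50, ("the", "rice"): 5,
    ("american", "company"): 40, ("american", "shrine"): 1,
    ("rice", "company"): 2, ("rice", "shrine"): 1,
}

def score(path, smoothing=0.5):
    """Sum of log smoothed bigram counts along one path through the lattice."""
    return sum(math.log(BIGRAM.get(pair, 0) + smoothing)
               for pair in zip(path, path[1:]))

def best_path(lattice):
    """Enumerate every path through the lattice and keep the best-scoring one."""
    return max(itertools.product(*lattice), key=score)

# The ambiguous 'beisha' passed along as alternatives in the lattice:
lattice = [["the"], ["american", "rice"], ["shrine", "company"]]
print(" ".join(best_path(lattice)))  # -> the american company
```

Exhaustive enumeration is fine for a toy lattice; a real system would use dynamic programming over the lattice rather than materializing every path.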
    <Paragraph position="1"> Of course, the results are not perfect. We can improve on them by enhancing the statistical model, or by incorporating more knowledge and constraints in the lattices, possibly using automatic knowledge acquisition methods. One direction we intend to pursue is the rescoring of the top N generated sentences by more expensive (and extensive) methods, incorporating, for example, stylistic features or explicit knowledge of flexible collocations.</Paragraph>
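The proposed N-best rescoring can be sketched as a two-stage search: a cheap score (e.g., an n-gram model) selects a shortlist, and a costlier score re-ranks it. All names and the weighting scheme here are illustrative assumptions, not the paper's method.

```python
import heapq

def rerank(candidates, cheap_score, expensive_score, n=3, weight=0.5):
    """Two-stage search: take the top-n candidates by the cheap score,
    then re-rank that shortlist with a weighted sum of both scores."""
    shortlist = heapq.nlargest(n, candidates, key=cheap_score)
    return max(shortlist,
               key=lambda c: (1 - weight) * cheap_score(c)
                             + weight * expensive_score(c))

# Toy stand-ins: cheap mimics an n-gram score, expensive a costlier
# stylistic/collocational check (both hypothetical).
sentences = ["a b", "a c", "b c", "c c"]
cheap = {"a b": 3.0, "a c": 2.5, "b c": 2.0, "c c": 0.5}.get
expensive = {"a b": 0.1, "a c": 0.9, "b c": 0.8, "c c": 1.0}.get
print(rerank(sentences, cheap, expensive))  # -> a c
```

The expensive score is applied to only n candidates, so its cost is bounded regardless of how large the original candidate space is.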
  </Section>
</Paper>