<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1016"> <Title>Statistical Acquisition of Content Selection Rules for Natural Language Generation</Title> <Section position="6" start_page="0" end_page="0" type="relat"> <SectionTitle> 5 Related Work </SectionTitle> <Paragraph position="0"> Very few researchers have addressed the problem of knowledge acquisition for content selection in generation. A notable exception is the work of Reiter et al. (2000), which discusses a rainbow of knowledge engineering techniques (including direct acquisition from experts, discussion groups, etc.). They also mention analysis of target text, but they abandon it because it was impossible to know the actual criteria used to choose a piece of data. In contrast, in this paper, we show how the pairing of semantic input with target text in large quantities allows us to elicit statistical rules with such criteria.</Paragraph> <Paragraph position="1"> Aside from that particular work, there seems to exist some momentum in the literature for a two-level Content Selection process (e.g., Sripada et al. (2001), Bontcheva and Wilks (2001), and Lester and Porter (1997)).
For instance, Lester and Porter (1997) distinguish two levels of content determination: &quot;local&quot; content determination is the &quot;selection of relatively small knowledge structures, each of which will be used to generate one or two sentences&quot;, while &quot;global&quot; content determination is &quot;the process of deciding which of these structures to include in an explanation&quot;.</Paragraph> <Paragraph position="2"> Our technique, then, can be thought of as picking the global Content Selection items.</Paragraph> <Paragraph position="3"> One of the most felicitous Content Selection algorithms proposed in the literature is the one used in the ILEX project (Cox et al., 1999), where the most prominent pieces of data are first chosen (by means of hardwired &quot;importance&quot; values on the input) and intermediate, coherence-related new ones are later added during planning. For example, a painting and the city where the painter was born may be worth mentioning. However, the painter should also be brought into the discussion for the sake of coherence. Finally, while most classical approaches, exemplified by McKeown (1985) and Moore and Paris (1992), tend to perform the Content Selection task integrated with document planning, the recent interest in automatic, bottom-up content planners has put forth a simplified view where the information is entirely selected before the document structuring process begins (Marcu, 1997; Karamanis and Manurung, 2002). While this approach is less flexible, it has important ramifications for machine learning, as the resulting algorithm can be made simpler and more amenable to learning.</Paragraph> </Section> </Paper>