<?xml version="1.0" standalone="yes"?>
<Paper uid="J98-3004">
  <Title>Describing Complex Charts in Natural Language: A Caption Generation System</Title>
  <Section position="6" start_page="400001" end_page="400001" type="relat">
    <SectionTitle>
7. Related Work
</SectionTitle>
    <Paragraph position="0"> Most previous efforts in generating intelligent multimedia presentations have focused on coordinating natural language and graphical depictions of real world devices (e.g., military radios [Feiner and McKeown 1991] and coffee makers [Wahlster et al. 1993]) for generating instructions about their repair or proper use. These projects tackled important problems such as apportioning content to media and generating cross-references between them. Research has also focused on issues regarding the generation of coordinated presentations in applications where the graphics are familiar, or possess an obvious mapping between the data set and a graphical image (e.g., weather maps [Kerpedjiev 1992] and network diagrams [Marks and Reiter 1990]).</Paragraph>
    <Paragraph position="1"> Our work differs from these projects in two ways. The first difference concerns the type of data that our system deals with. Unlike the presentations generated by the systems mentioned above, presentations generated by SAGE are usually based on abstract or relational information (e.g., census reports, logistics data, hospital administration data, real estate sales data), lacking any obvious graphical depiction. Second, although our long term goal is to generate coordinated multimedia explanations using informational graphics and natural language, our focus in this paper was on generating effective natural language explanations about the graphical presentations. In order to do this, the system had to explicitly reason about the perceptual complexity of the presentation. Generating such captions is an important component of constructing multimedia explanations involving integrative graphical displays.</Paragraph>
    <Paragraph position="2"> The POSTGRAPHE system (Fasciano 1996; Fasciano and Lapalme 1996) is the closest related research effort. As in our work, POSTGRAPHE generates statistical graphics and accompanying captions. However, the issues considered in our work differ from those in POSTGRAPHE in several ways and both the text and the graphics generated by POSTGRAPHE emphasize aspects orthogonal to the ones considered in our project.</Paragraph>
    <Paragraph position="3"> For instance, POSTGRAPHE can take as input a list of aspects that should be conveyed by the presentation. (These goals are represented in the system as a predefined set of templates, such as, &quot;show the evolution of &lt;attribute-name-1&gt; with respect to &lt;attribute-name-2&gt;.&quot;) This information is then used by POSTGRAPHE not only to generate an appropriate type of diagram (e.g., a line chart), but also to generate a caption that explicitly captures the specific aspects of interest, such as: &quot;The profits were at their highest in 1975 and lowest in 1974, with about half their 1975 value.&quot; This is in contrast to our system, which does not reason about trends or relationships between different data points shown in the graphic. Instead, our work has focused on describing complex data-to-grapheme mappings and deriving metrics for perceptual complexity.</Paragraph>
    <!-- Computational Linguistics Volume 24, Number 3 -->
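    <Paragraph> The template-based goal representation attributed to POSTGRAPHE above can be illustrated with a minimal sketch. This is not POSTGRAPHE's actual code: the template wording is taken from the quoted example, while the names EVOLUTION_GOAL and instantiate_goal and the attribute values are hypothetical and chosen only for illustration.

```python
from string import Template

# Hypothetical goal template, modeled on the example quoted in the text.
EVOLUTION_GOAL = Template(
    "show the evolution of $attribute_1 with respect to $attribute_2"
)


def instantiate_goal(template: Template, attribute_1: str, attribute_2: str) -> str:
    """Fill a predefined goal template with two attribute names."""
    return template.substitute(attribute_1=attribute_1, attribute_2=attribute_2)


# Illustrative attribute names (assumed, not taken from POSTGRAPHE's input format).
goal = instantiate_goal(EVOLUTION_GOAL, "profits", "year")
print(goal)
```

A system working this way receives the communicative goal as input, which is the contrast drawn with an approach that derives its captions from the structure of the graphic.</Paragraph>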
    <Paragraph position="4"> This is due, in part, to the nature of the graphical presentations that the two systems can design. SAGE, for instance, is capable of designing novel graphical presentations for very complex data sets, using techniques such as multiple grapheme composition and space alignment to facilitate cross-attribute comparisons. The range of graphical capabilities in POSTGRAPHE is more limited. Because of this, and because its graphics are generated in response to an explicit user goal, user comprehension problems are less likely in POSTGRAPHE than in our system. Perhaps in light of this, POSTGRAPHE does not need to explicitly analyze its graphic presentations for potential ambiguities or perceptual complexities, and the captions accompanying the graphic do not take these factors into account.</Paragraph>
    <Paragraph position="5"> However, our current implementation, described in the paper, should not be confused with our long-term research agenda; it was designed as a framework to evaluate more sophisticated capabilities. These include some of the capabilities that POSTGRAPHE has, particularly those dealing with the generation of information about trends and patterns. We plan to extend the approach used by POSTGRAPHE to take into account both the writer's goals and domain- and data-specific aspects. To this end, we are developing a language to express presentation intentions, taking into account our experiences as well as the language used in POSTGRAPHE. Furthermore, whereas the sequence of presentation goals to be achieved is part of the input to POSTGRAPHE, our new framework generates these dynamically by integrating a data analysis module with a discourse planner. The data analysis module is being designed to identify all possible relevant aspects of the data based on the domain specification and an analysis task. The planner can use a variety of strategies to select and organize these aspects into complex arguments that can be realized as presentations combining both text and graphics (see Kerpedjiev et al. [1997] for further details on our new framework).</Paragraph>
    <Paragraph position="6"> 8. Conclusions and Future Work
Captions that explain novel or creative graphics can be crucial in understanding how data and various relations are expressed in them. This paper presents a framework for generating explanatory captions for information graphics. The system generates captions based on: (1) a representation of the structure of the graphical presentation and its mapping to the data it depicts, (2) a framework for identifying the perceptual complexity of graphical elements, and (3) the structure of the data expressed in the graphic.</Paragraph>
    <Paragraph position="7"> One of the strengths of our approach is that the system is able to generate surprisingly effective and comprehensible descriptions in the absence of a detailed semantic model for the domain. The captions shown in this document were generated using only the data characterization used by SAGE for designing the visual presentation and an extremely basic lexical representation. Thus, the caption generation mechanism can be quickly and easily transferred to another domain (the only thing required is a lexicon for the new terms). However, this is also a limitation, because under certain circumstances, the system generates seemingly odd descriptions. This occurs in cases where the underlying database representation happens to contain attribute specifications that differ from the way they would normally be described in discourse. For instance, if the database schema happened to relate house attributes such as house address, number of rooms, and sale price to the owner of the house, rather than the house itself, the system would generate statements such as &quot;John's sale price is ... &quot;. A secondary limitation of our implementation is that it does not generate general graphical annotations. While the system can (and does) highlight specific graphemes in the presentation if so required by the planner (currently done to single out the tuple being used in an example), the system does not coordinate the generation of graphical keys and the captions. This is because our speech act language does not permit bidirectional communication between the text planner and SAGE. The ability to specify arbitrary graphical annotations in the speech act language would make the current simple specification quite complex. As we extend the planning framework to generate both the text and the graphics, this will be remedied as well.</Paragraph>
    <!-- Mittal, Moore, Carenini, and Roth Generating Chart Captions -->
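    <Paragraph> The schema-dependence limitation described above can be made concrete with a toy sketch. It assumes, purely for illustration, that a description is built directly from whichever entity the schema attaches an attribute to; the function describe and the record values are invented here and are not part of SAGE or of the caption generator.

```python
def describe(entity: str, attribute: str, value: str) -> str:
    """Build a possessive description directly from a schema relation."""
    return f"{entity}'s {attribute} is {value}"


# Hypothetical tuple; every value below is invented for illustration.
record = {"owner": "John", "sale price": "$100,000"}

# Schema A attaches the attribute to the house itself.
house_centered = describe("The house", "sale price", record["sale price"])

# Schema B attaches the same attribute to the owner of the house.
owner_centered = describe(record["owner"], "sale price", record["sale price"])

print(house_centered)
print(owner_centered)
```

With no semantic model of the domain, nothing in such a generator can tell that the second sentence is odd: both are faithful readings of their schemas, which is why the same property yields both the portability and the limitation.</Paragraph>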
    <Paragraph position="8"> There are two ways to facilitate an effective use of a graphic: (1) explaining how the graphic expresses its data, and (2) conveying what aspects of the data are relevant to the current user's analysis task. In the work described in this paper, we have addressed the first issue. We are currently working on the second one.</Paragraph>
    <Paragraph position="9"> This chart presents information about house sales from data-set TS-2480. The y-axis shows the houses. The left edge of the bar shows the house's selling price whereas the right edge shows the asking price. The mark shows the agency estimate.</Paragraph>
    <Paragraph position="10"> Figure 22 Plan steps and the corresponding caption generated. (Terms such as DOM-2516 are pointers to domain concepts and attributes in the KB.)</Paragraph>
  </Section>
</Paper>