<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0602">
  <Title>Words and Pictures in the News</Title>
  <Section position="7" start_page="0" end_page="0" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> News photo captions are an interesting dataset, both for their unique textual properties and for the opportunities they provide to exploit relationships between text and image content. We have used these captions to illuminate the underlying topical structure of the collection. This research indicates that captions act much like very short summaries, emphasizing words that are strongly associated with underlying themes in the news. We are investigating how well this topical structure translates to more general collections of news articles. We have also shown that images can provide links between articles that are missed by textual analysis alone. Separately, we are investigating the possibility of building non-parametric models of celebrity faces. This line of research indicates that, by combining a face detector with an analysis of the linguistic conventions of the text, captions can be used as an almost-supervised dataset of people in the news.</Paragraph>
  </Section>
</Paper>