File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/02/c02-1098_intro.xml
Size: 3,328 bytes
Last Modified: 2025-10-06 14:01:24
<?xml version="1.0" standalone="yes"?> <Paper uid="C02-1098"> <Title>Annotation-Based Multimedia Summarization and Translation</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Multimedia content such as digital video is becoming a prevalent information source. Since the volume of such content is growing to huge numbers of hours, summarization is required to effectively browse video segments in a short time without missing significant content. Annotating multimedia content with semantic information such as scene/segment structures and metadata about visual/auditory objects is necessary for advanced multimedia content services. Since natural language text such as a voice transcript is highly manageable, speech and natural language processing techniques have an essential role in our multimedia annotation.</Paragraph> <Paragraph position="1"> We have developed techniques for semi-automatic video annotation integrating a multilingual voice transcription method, some video analysis methods, and an interactive visual/auditory annotation method. The video analysis methods include automatic color change detection, characterization of frames, and scene recognition using similarity between frame attributes.</Paragraph> <Paragraph position="2"> There are related approaches to video annotation. For example, MPEG-7 is an effort within the Moving Picture Experts Group (MPEG) of ISO/IEC that deals with multimedia content description (MPEG, 2002). MPEG-7 can describe indices, notes, and so on, so that necessary parts of content can be retrieved quickly. However, adding these descriptions by hand is costly, so methods for extracting them automatically through video/audio analysis are vitally important. 
Our method can be integrated into tools for authoring MPEG-7 data.</Paragraph> <Paragraph position="3"> The linguistic description scheme, which will be a part of the amendment to MPEG-7, should play a major role in this integration.</Paragraph> <Paragraph position="4"> Using such annotation data, we have also developed a system for advanced multimedia processing such as video summarization and translation. Our video summary is not just a shorter version of the original video clip, but an interactive multimedia presentation that shows keyframes of important scenes and their transcripts in Web pages and allows users to interactively modify the summary. The video summarization is customizable according to users' preferred size and keywords. When a user's client device is not capable of playing video, our system transforms the video into a document that is the same as a Web document in HTML format.</Paragraph> <Paragraph position="5"> The multimedia annotation can make delivery of multimedia content to different devices very effective. Dissemination of multimedia content will be facilitated by annotation on the usage of the content for different purposes, client devices, and so forth. Also, it provides object-level description of multimedia content, which allows a higher granularity of retrieval and presentation in which individual regions, segments, objects and events in image, audio and video data can be differentially accessed depending on publisher and user preferences, network bandwidth and client capabilities.</Paragraph> </Section> </Paper>