File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/01/w01-1017_intro.xml
Size: 2,949 bytes
Last Modified: 2025-10-06 14:01:18
<?xml version="1.0" standalone="yes"?> <Paper uid="W01-1017"> <Title>The Automatic Generation of Formal Annotations in a Multimedia Indexing and Searching Environment</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> MUMIS develops and integrates basic technologies for the automatic indexing of multimedia programme material; these technologies will be demonstrated within a laboratory prototype. Various technology components operating offline will generate formal annotations of events in the processed data. These formal annotations will form the basis for the integral online part of the MUMIS project, which consists of a user interface for querying videos. The indexing of the video material [...] line of time codes extracted from the various documents. (Footnote: [...] section Human Language Technology (HLT). For more information, see http://parlevink.cs.utwente.nl/projects/mumis/.) For this purpose the project uses data from different media sources (textual documents, radio and television broadcasts) in different languages (Dutch, English and German) to build a specialized set of lexicons and an ontology for the selected domain (soccer). It also digitizes non-text data and applies speech recognition techniques to extract text for annotation. The core linguistic processing for the annotation of the multimedia material consists of advanced information extraction techniques for identifying, collecting and normalizing significant text elements (such as the names of players in a team, goals scored, and time points or sequences) 
which are critical for the appropriate annotation of the multimedia material in the case of soccer.</Paragraph> <Paragraph position="1"> Because the project accesses and processes distinct media in distinct languages, it needs a novel type of merging tool to combine the semantically related annotations generated from those different data sources and to detect inconsistencies or redundancies within the combined annotations. The merged annotations will be stored in a database, where they will be combined with relevant metadata.2 Finally, the project will develop a user interface that enables professional users to query the database by selecting from menus based on structured annotations and metadata, and to view the video fragments retrieved to satisfy the query. It thus offers a tool for formulating queries about multimedia programmes and gaining direct, interactive access to the multimedia contents. This tool constitutes the online component of the MUMIS environment.</Paragraph> <Paragraph position="2"> 2 We see in this process of merging extracted information and combining it with metadata a fruitful basis for the identification and classification of content or knowledge from distinct types of documents.</Paragraph> </Section> </Paper>