<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0608">
  <Title>An Extensible Framework for Efficient Document Management Using RDF and OWL</Title>
  <Section position="3" start_page="1" end_page="2" type="intro">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> Digital documents are used ubiquitously to encode, preserve and exchange information, and thereby to share it across a community.</Paragraph>
    <Paragraph position="1"> As the volume of digital data grows exponentially, a principled way of managing these documents is necessary. Moreover, because information is distributed, uniform data access and retrieval must also take geographical and enterprise-level barriers into account.</Paragraph>
    <Paragraph position="2"> The ITEA (Information Technology for European Advancement) project Proteus (http://www.proteus-iteaproject.com/) has similar objectives.</Paragraph>
    <Paragraph position="3"> Proteus is a collaborative initiative of French, German and Belgian companies, universities and research institutes that aims to develop a generic European software platform for implementing web-based e-maintenance centers. It implements a generic architecture for integrated document management using enabling technologies such as XML, RDF and OWL. Most existing document management systems ([1], [2]) are limited in their scope of application or in the document formats they support, or simply neglect structure-based analysis. Given our requirements, however, only a multi-layered functional architecture can cover the various issues of distributed document management, such as localized vs. global structural constraints, conceptual definition of documents and reasoning-based discovery. (This material is based upon work supported by the ITEA (Information Technology for European Advancement) programme under Grant 01011 (2002).)</Paragraph>
    <Paragraph position="4"> Indeed, evolving technologies such as XML (eXtensible Markup Language), RDF (Resource Description Framework) and OWL (Web Ontology Language) provide a rich set of application frameworks that, if applied intelligently, can help a great deal in solving these problems. XML ([3]) is primarily designed for low-level structural descriptions. It provides a tree of structured nodes, which can be used efficiently to describe documents and to check their models against DTDs (Document Type Definitions) or XML Schemas. In addition, XML is both easy for humans to read and efficient for machines to interpret. However, problems arise if we deal only with the structural aspect. To extract semantic information from a document, there is no straightforward way other than to constrain the document by a schema or to hand-program an application to recognize certain document-specific semantics. Furthermore, if the schema changes over time, it will typically introduce new intermediate elements. This can invalidate certain queries and create inconsistencies in the semantic data model of the document.</Paragraph>
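    <Paragraph> The effect of a schema change on a structural query can be illustrated with a minimal sketch. It assumes a hypothetical report document with an author element; the element names and both schema versions are invented for illustration and are not taken from Proteus. The sketch uses only the Python standard library:
<![CDATA[
```python
import xml.etree.ElementTree as ET

# Hypothetical document under version 1 of an assumed schema.
DOC_V1 = "<report><author>Smith</author></report>"
# Version 2 of the assumed schema introduces an intermediate <meta> element.
DOC_V2 = "<report><meta><author>Smith</author></meta></report>"


def author_by_fixed_path(xml_text):
    """Look up the author with a path tied to the version 1 structure."""
    node = ET.fromstring(xml_text).find("author")  # direct child of the root only
    return None if node is None else node.text


def author_anywhere(xml_text):
    """Look up the first <author> element at any depth below the root."""
    node = ET.fromstring(xml_text).find(".//author")
    return None if node is None else node.text


if __name__ == "__main__":
    print(author_by_fixed_path(DOC_V1))  # query written against version 1
    print(author_by_fixed_path(DOC_V2))  # same query after the schema change
    print(author_anywhere(DOC_V2))       # depth-independent query
```
]]>
    The fixed-path query no longer finds the author once the intermediate element is introduced. The depth-independent query still does, but it remains a purely structural workaround: it carries no notion of what an author is.</Paragraph>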
    <Paragraph position="5"> RDF and OWL build upon the XML syntax to describe the actual semantics of a document and provide useful reasoning and inference mechanisms. RDF ([4]) specifies graphs of nodes connected by directed arcs. The arcs represent relational predicates, which are identified by URIs (Uniform Resource Identifiers), and the graph as a whole encodes a conceptual model of the real world. Unlike an XML schema, an RDF schema is a simple vocabulary language. Parsing the semantic graph yields a set of triples, which mimic predicate-argument conceptual structures.</Paragraph>
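    <Paragraph> As a rough illustration of the triple view of a graph, the following sketch stores subject-predicate-object triples as plain Python tuples and matches them against a pattern. It deliberately uses no RDF library; the namespace http://example.org/ and all resource names are hypothetical:
<![CDATA[
```python
from collections import namedtuple

# One arc of the graph: subject --predicate--> object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

EX = "http://example.org/"  # hypothetical namespace, for illustration only

GRAPH = [
    Triple(EX + "doc42", EX + "describes", EX + "pump7"),
    Triple(EX + "doc42", EX + "author", "Smith"),
    Triple(EX + "pump7", EX + "partOf", EX + "line3"),
]


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]


if __name__ == "__main__":
    for t in match(GRAPH, subject=EX + "doc42"):
        print(t.predicate, t.object)
```
]]>
    Because every statement has the same three-part shape, a query addresses predicates rather than positions in a tree, so it does not depend on how the serialized document happens to be nested.</Paragraph>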
    <Paragraph position="6"> OWL can be used on top of these semantic structures to perform logical reasoning and discover relations that are not stated explicitly.</Paragraph>
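    <Paragraph> One simple kind of implicit relation is the one that follows from a transitive property (OWL allows a property to be declared transitive). The sketch below is not an OWL reasoner; it only computes the transitive closure of a hypothetical partOf relation in plain Python, with invented resource names, to show what discovering unstated relations can mean:
<![CDATA[
```python
def transitive_closure(pairs):
    """Return all (a, b) pairs implied by treating the relation as transitive."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # a -> b and b -> d together imply a -> d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical asserted facts: (part, whole).
PART_OF = {("valve2", "pump7"), ("pump7", "line3"), ("line3", "plantA")}

# Facts that were never asserted but follow from transitivity.
inferred = transitive_closure(PART_OF) - PART_OF

if __name__ == "__main__":
    for part, whole in sorted(inferred):
        print(part, "partOf", whole)
```
]]>
    A real reasoner handles many more constructs than transitivity; the naive fixed-point loop here is chosen for readability, not efficiency.</Paragraph>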
    <Paragraph position="7"> In the following sections we discuss how we use these technologies to build a generic document management system. Section 2 describes document management and the Proteus architecture, and Section 3 discusses annotations. Section 4 gives a brief account of the model-theoretic access mechanisms enabled by OWL, and Section 5 describes data categories.</Paragraph>
  </Section>
</Paper>