<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0809">
  <Title>Automatic Creation of Interface Specifications from Ontologies</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5. Concluding remarks are found in Section 6.
2 Approaches to Knowledge Storage
</SectionTitle>
    <Paragraph position="0"> Efforts originating in various W3C and Semantic Web projects have produced several knowledge modeling standards: Resource Description Framework (RDF), DARPA Agent Mark-up Language (DAML), Ontology Interchange Language (OIL), and Web Ontology Language (OWL).2 As for intra-agent or intra-module communication languages, XML Schema (XMLS) and DTDs have become the standards for interface specifications, because instance documents can be validated automatically at run time and software is available for parsing and marshaling information represented in these formats.3  tor.exolabs.org) for marshaling XML documents.</Paragraph>
    <Paragraph position="1"> Current systems often feature both XMLS- or DTD-based communication languages and DAML- or RDF-based knowledge stores. However, these are often structurally and terminologically heterogeneous. Mappings from the message content to the ontology are often difficult and costly. Attempts to hand-craft XMLS or DTDs for defining communication between various processing modules show that several problems grow roughly linearly with the complexity of the domains to be defined by means of the individual representations. These problems are: (i) inconsistencies in modeling choices, e.g. elements versus attributes; (ii) inconsistencies in the hierarchy, e.g. flatness versus depth of individual branches; (iii) readability and understandability of the schemata. In a system involving multiple domains it becomes practically impossible to manually define suitable XML schemata for the modules that exchange information about the multitude of possible utterances. The ensuing inadequacies of the representations constitute a substantial obstacle for system development and functionality. Processing modules operating on such schemata also cannot apply inferencing algorithms directly to these structures, as they do not represent enough knowledge. As a result, individual knowledge stores have to be hand-crafted for specific components, which further increases the heterogeneity between the communicated and the modeled objects. Additionally, readability decreases as more complex XMLS structures, such as extension hierarchies or substitution groups, are used, and potential links to semantic web ontologies are lost or become costly.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 The Task: From Knowledge to Interfaces
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Interfaces
</SectionTitle>
      <Paragraph position="0"> Ontologies are a suitable means for knowledge representation, i.e. for the definition of an explicit and detailed model of the system's domains. That way, they provide a shared domain theory, which can be used for communication. Additionally, they can be employed for deductive reasoning and manipulations of models. The meaning of ontology constructs relies on a translation to some logic.</Paragraph>
      <Paragraph position="1"> This way, the inference implications of statements, e.g. whether a class can be related to another class via a subclass or some other relation, can be determined from the formal specification of the semantics of the ontology language. However, this does not make any claims about the syntactic appearance of the representations exchanged, e.g. an ordering of the properties of a class.</Paragraph>
      <Paragraph position="3"> An interface specification framework, such as XMLS or DTD, constitutes a suitable means for defining constraints on the syntax and structure of XML documents.</Paragraph>
      <Paragraph position="4"> Ideally, the definition of the content communicated between the components of a language technology system should relate both the syntax and the semantics of the XML documents exchanged. Those can then be seen as instances of the ontology represented as XMLS-based XML documents. However, this requires that the knowledge, originally encoded in the ontology, is represented in the XMLS syntax.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Ontology to XMLS transformation
</SectionTitle>
      <Paragraph position="0"> In the solution presented here, the knowledge representations to be expressed in XMLS are first modeled in OIL-RDFS or DAML+OIL as an ontology proper, using the advantages of the ontology engineering systems available, and are then transformed into a communication interface automatically by software developed for that purpose. Before showing how the problems mentioned in Section 2 can be minimized, we introduce the basic formal properties of the given source and target representations.</Paragraph>
      <Paragraph position="1"> Ontology representation languages: Domain knowledge stored in the ontology may be encoded using XML-based semantic mark-up languages, such as OIL or DAML+OIL. In the work reported here, we used an ontology defined in the OIL-RDFS syntax, but the basic transformation algorithms are equally applicable to OIL or DAML+OIL.</Paragraph>
      <Paragraph position="2"> OIL-RDFS is a representation format which allows any OIL ontology to be expressed in RDF syntax. This has the advantage that the ontology is partially understandable for non-OIL-aware RDFS applications. Additionally, it allows for all the formal semantics and reasoning support available for OIL. A detailed characterization of the formal properties of the OIL language can be found in Fensel et al. (2001).</Paragraph>
      <Paragraph position="3"> The semantics of OIL is based on a combination of frame and description logic extended with concrete datatypes. The FACT system4 can be used as a reasoning engine for OIL ontologies, providing some automated reasoning capabilities, such as class consistency or subsumption checking. The OIL language employs frame semantics and provides most of the modeling primitives commonly used in frame-based knowledge representation systems. Graphical ontology engineering front-ends and visualization tools are available for editing, maintaining, and visualizing the ontology.5</Paragraph>
      <Paragraph position="4"> XML Schema: XML schemata provide a grammar for prescribing the structure of XML documents, data typing, as well as inclusion and derivation mechanisms. XMLS definitions are themselves XML documents, which has the immediate advantage that all tools developed for XML, e.g. validation tools, can be immediately used for</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Differences between ontology languages and XMLS
</SectionTitle>
      <Paragraph position="0"/>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
XMLS
</SectionTitle>
    <Paragraph position="0"> In comparing ontology languages and XML schema it is important to realize that, while both of them provide vocabulary and structure to represent knowledge, the underlying formalisms show different formal properties since their purposes differ.</Paragraph>
    <Paragraph position="1"> An OIL-RDFS definition is a directed acyclic graph, whereas an XMLS definition establishes a tree structure. Ontology languages, such as OIL, provide much richer modeling primitives, i.e. classes and slots. They also incorporate the notion of (multiple) inheritance, which may be either explicitly stated or implied.</Paragraph>
    <Paragraph position="3"> XMLS have different modeling primitives, i.e. elements of certain types, which can be either simple or complex types. However, no precise semantic interpretation is assigned to them. There is no inheritance as such, but types can be derived by extension or restriction, i.e. types can share some structures between them. Generally the XMLS language is much richer in terms of its datatyping capabilities and grammar for prescribing the structure and content of the elements.</Paragraph>
    <Paragraph position="4"> In contrast to that, ontologies constitute high level domain models. Because of different formal properties of the underlying representation formalisms, a straight-forward mapping between them is not always possible. In some cases it may be rather intricate, so that special transformation algorithms are required. These algorithms are responsible for explicating and mapping knowledge structures from the ontology to XMLS.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Transforming OIL definitions into XMLS definitions
</SectionTitle>
      <Paragraph position="0"> In this section, we describe the algorithms employed by the transformation software and show how OIL definitions can be written in XMLS. Let us assume the existence of the ontology shown in Figure 1. Step 1 Mapping of class definitions: According to this ontology, the class WatchPerceptualProcess is a subclass of PerceptualProcess, and its instances have an object to be watched, i.e. AvEntertainment. The translation of the class definition header is done in a straightforward manner using the XML schema complexType construct to assign a name: &lt;complexType name="WatchPerceptualProcess"&gt; Step 2 Resolving inheritance: The notion of inheritance is important for many language processing algorithms operating either on the basis of an ontology or equivalent XMLS. It allows, for example, underspecification in semantic representations, when a more general class is used in a place where specific derived classes can also occur. The generalization hierarchy found in the ontology should be made explicit as an equivalent type extension structure in the XMLS.</Paragraph>
      <Paragraph position="1"> The subclass-of statement would thus be translated to XML schema. This corresponds to the type extension7 and results in a construct:  However, in more complicated cases, in particular when the class in question has subclasses and consequently shares some structures (slots) with its superclass, as in the example below, a direct mapping to the type extension in XML schema may result in a problem.</Paragraph>
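      <Paragraph><![CDATA[
The simple case of Steps 1 and 2 can be illustrated with a short sketch. It is not the transformation software described in the text: the helper names xs and class_to_complex_type are hypothetical, the ns prefix follows the paper's convention for an arbitrary namespace, and only a class header with at most one subclass-of statement is covered.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)  # serialize schema constructs with the xs: prefix


def xs(tag):
    """Return the namespace-qualified name of an XML Schema construct."""
    return f"{{{XS}}}{tag}"


def class_to_complex_type(name, superclass=None, ns="ns"):
    """Map a class definition header to a complexType (hypothetical helper).

    A subclass-of statement becomes a type extension whose base is the
    complex type of the superclass.
    """
    ctype = ET.Element(xs("complexType"), {"name": name})
    if superclass is not None:
        content = ET.SubElement(ctype, xs("complexContent"))
        ET.SubElement(content, xs("extension"), {"base": f"{ns}:{superclass}"})
    return ctype


watch = class_to_complex_type("WatchPerceptualProcess", "PerceptualProcess")
print(ET.tostring(watch, encoding="unicode"))
```

A class without a superclass yields an empty complexType that only carries the name, which corresponds to the header translation of Step 1.
]]></Paragraph>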
      <Paragraph position="2"> 7 As XML schema does not allow multiple inheritance to be expressed, only single (unary) inheritance can be handled. This is a constraint on the modeling side which should be taken into account. Also, ns: in our examples stands for any namespace, which can be assigned freely.</Paragraph>
      <Paragraph position="3">  some non-abstract types (i.e. types that can be instantiated, e.g. Agent) have derived types (e.g. Person). As some substructures are shared by the types, in the instance documents it is no longer possible to determine whether a specific element matches the content model of Agent, Person or possibly another type derived from Agent by extension, e.g. Animal.</Paragraph>
      <Paragraph position="4"> To avoid this sort of ambiguity, a specific algorithm is proposed. The basic idea is to move the information needed to identify the type of the element to additional, artificially created complex types.8 This way, each class in the ontology is translated to a set of three complex types. Type abstract: Type abstract serves to preserve the original class definition in the ontology, but is given the attribute abstract in the XML schema, i.e. it may never be instantiated, resulting in:  Type final: Type final is derived from Type abstract, adding no new content, but it is final, i.e. no further types may be derived from it. Type final is used when elements are given a type. There is no more ambiguity in the instance documents, as Type final does not have any derived types, e.g.:  Type: Type can be used by the elements because it is not derived from anything. It consists of a choice XMLS construct containing an element for each of the possible derivations of Type abstract, including Type abstract itself. As a result, any element tag of the choice that is a derivation of the given base type can actually be used in the instance document, as follows: 8 An alternative solution is to use xsi:type attributes in the instance documents. This, however, no longer makes inheritance structures visible through the element tags.</Paragraph>
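      <Paragraph><![CDATA[
The three-type scheme can be sketched as follows. This is an illustration under several assumptions, not the authors' implementation: the suffixes _abstract and _final, the helper names, and the convention of naming each element after its class with a lowercased first letter are all hypothetical, and the class hierarchy is passed in as a plain dictionary from class to superclass.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def xs(tag):
    """Return the namespace-qualified name of an XML Schema construct."""
    return f"{{{XS}}}{tag}"


def derivations(cls, parent_of):
    """Return cls followed by all of its direct and indirect subclasses."""
    found, queue = [], [cls]
    while queue:
        current = queue.pop(0)
        found.append(current)
        queue.extend(c for c, p in parent_of.items() if p == current)
    return found


def three_types(cls, parent_of, ns="ns"):
    """Build the abstract, final and choice types for one ontology class."""
    # Abstract type: keeps the class definition, can never be instantiated.
    abstract = ET.Element(
        xs("complexType"), {"name": f"{cls}_abstract", "abstract": "true"})
    parent = parent_of.get(cls)
    if parent is not None:
        content = ET.SubElement(abstract, xs("complexContent"))
        ET.SubElement(content, xs("extension"),
                      {"base": f"{ns}:{parent}_abstract"})

    # Final type: extends the abstract type without new content; no further
    # type may be derived from it.
    final = ET.Element(
        xs("complexType"), {"name": f"{cls}_final", "final": "#all"})
    content = ET.SubElement(final, xs("complexContent"))
    ET.SubElement(content, xs("extension"), {"base": f"{ns}:{cls}_abstract"})

    # Choice type: one element per derivation, each with an unambiguous type.
    plain = ET.Element(xs("complexType"), {"name": cls})
    choice = ET.SubElement(plain, xs("choice"))
    for sub in derivations(cls, parent_of):
        ET.SubElement(choice, xs("element"), {
            "name": sub[0].lower() + sub[1:], "type": f"{ns}:{sub}_final"})
    return abstract, final, plain


hierarchy = {"Agent": None, "Person": "Agent", "Animal": "Agent"}
for complex_type in three_types("Agent", hierarchy):
    print(ET.tostring(complex_type, encoding="unicode"))
```

Because every element in the choice refers to a final type, an instance document identifies the intended class through the element tag alone.
]]></Paragraph>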
      <Paragraph position="5">  Step 3 Mapping of slot constraints: Class descriptions in the ontology typically contain one or more slot constraints of certain kinds. For each slot constraint associated with a class, a sequence element of the corresponding complex type in XMLS is created, as in the example below:  The slot constraint has-watchable object of WatchPerceptualProcess is mapped to an element watchableObject. The cardinality of the slot is transformed to the cardinality of the element accordingly. The type of the element in a sequence is determined from the appropriate slot filler in the ontology. There is a variety of possibilities for specifying the domain of slots in the ontology (see Step 4 for a detailed discussion). As each of them requires special treatment while translating to XMLS, we discuss this in a separate step.</Paragraph>
      <Paragraph position="6"> Step 4 Resolving fillers of slot constraints: The domain of the slot in the ontology can be specified by class expressions of different complexity, by individuals or sets of individuals, and by some elementary datatypes. However, XMLS requires that an element representing a slot in the corresponding complex type definition is always given a single unambiguous type. This may be the case when a slot is filled with a single class having no subclasses; in other cases it is not directly expressible in XMLS and a special mechanism is required. We support the transformation of the following slot fillers: A non-final class definition: A non-final class in the ontology is a class having further subclasses in the generalization hierarchy. The semantics of such a class in the slot definition is that any instance of this class or any of its subclasses can fill the slot. As XMLS lacks the notion of implicit semantics, the subclass hierarchy of the respective non-final class must be made explicit. The complex type Type containing a choice running across all possible derivations is employed as an unambiguous type of the sequence element within the corresponding complex type definition (cf. Step 3).</Paragraph>
      <Paragraph position="7">  has-agent has-value Animal or Person The semantics of such an expression as slot filler is that any instance of the evaluated expression can fill the slot. As XMLS does not support the use of logical operators, placeholder complex types, e.g. Or PerceptualProcess agent in the example below, should be introduced into the XMLS at this point. These artificially created complex types consist of a choice of elements corresponding to each of the operands of the boolean class expression. The elements are given the type of the respective class.</Paragraph>
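      <Paragraph><![CDATA[
A minimal sketch of Step 3 is given below. The naming rule (drop the has- prefix and camel-case the remainder) is inferred from the single watchableObject example in the text, and the slot spelling has-watchable_object, the integer cardinalities and the helper names are assumptions made for illustration only.

```python
import re
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def xs(tag):
    """Return the namespace-qualified name of an XML Schema construct."""
    return f"{{{XS}}}{tag}"


def slot_to_element_name(slot):
    """Turn a slot name such as has-watchable_object into watchableObject."""
    parts = [p for p in re.split(r"[-_ ]+", slot) if p]
    if parts and parts[0] == "has":
        parts = parts[1:]
    if not parts:
        raise ValueError(f"cannot derive an element name from {slot!r}")
    return parts[0].lower() + "".join(p.capitalize() for p in parts[1:])


def add_slot(sequence, slot, filler_type, min_card=1, max_card=1, ns="ns"):
    """Append one sequence element for a slot constraint.

    The slot cardinality is copied to minOccurs/maxOccurs; max_card=None
    stands for an unbounded slot.
    """
    return ET.SubElement(sequence, xs("element"), {
        "name": slot_to_element_name(slot),
        "type": f"{ns}:{filler_type}",
        "minOccurs": str(min_card),
        "maxOccurs": "unbounded" if max_card is None else str(max_card),
    })


ctype = ET.Element(xs("complexType"), {"name": "WatchPerceptualProcess"})
sequence = ET.SubElement(ctype, xs("sequence"))
add_slot(sequence, "has-watchable_object", "AvEntertainment", max_card=None)
print(ET.tostring(ctype, encoding="unicode"))
```

The filler type is passed in as a ready-made name here; how that name is obtained for more complex fillers is the subject of Step 4.
]]></Paragraph>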
      <Paragraph position="9">  Similar algorithms exist for resolving AND and NOT operators. The only difference for AND is that elements of a placeholder complex type would not be combined in a choice, but in a sequence. For NOT, the choice of the complex type would contain elements for all classes, except the class which is the operand in the respective expression.</Paragraph>
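      <Paragraph><![CDATA[
The treatment of the three operators can be sketched in one function. The underscore-separated placeholder names, the element naming convention and the function name are assumptions; the grouping rules (choice for OR, sequence for AND, choice over all remaining classes for NOT) follow the description above.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def xs(tag):
    """Return the namespace-qualified name of an XML Schema construct."""
    return f"{{{XS}}}{tag}"


def placeholder_type(op, owner, slot, operands, all_classes=(), ns="ns"):
    """Create a placeholder complex type for a boolean class expression."""
    if op == "or":
        group, members = "choice", list(operands)
    elif op == "and":
        group, members = "sequence", list(operands)
    elif op == "not":
        # Every known class except the operand(s) may fill the slot.
        group = "choice"
        members = [c for c in all_classes if c not in operands]
    else:
        raise ValueError(f"unsupported operator: {op!r}")

    ctype = ET.Element(
        xs("complexType"), {"name": f"{op.capitalize()}_{owner}_{slot}"})
    container = ET.SubElement(ctype, xs(group))
    for cls in members:
        # Each element is given the type of the respective class.
        ET.SubElement(container, xs("element"), {
            "name": cls[0].lower() + cls[1:], "type": f"{ns}:{cls}"})
    return ctype


agent = placeholder_type("or", "PerceptualProcess", "agent",
                         ["Animal", "Person"])
print(ET.tostring(agent, encoding="unicode"))
```

For NOT, the caller has to supply the full list of classes, since the complement cannot be computed from the expression alone.
]]></Paragraph>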
      <Paragraph position="10"> Individuals: In the following ontology definition, the slot-constraint has-genre is filled with a set of individual values love, humor, science.</Paragraph>
      <Paragraph position="11"> class-def AvMedium slot-constraint has-genre has-value (one-of love humor science) The slot has-genre is translated to a sequence element genre of the complex type corresponding to the AvMedium class definition. In order to give this element an unambiguous type, a placeholder simple type OneOf AvMedium genre is introduced. The content model of this type is then restricted to the set of enumeration values corresponding to the set of individuals specified in the original slot definition.</Paragraph>
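      <Paragraph><![CDATA[
A sketch of the one-of case follows. The text does not state which base type the restriction uses, so xs:string is an assumption here, as are the underscores in the placeholder name and the function name.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def xs(tag):
    """Return the namespace-qualified name of an XML Schema construct."""
    return f"{{{XS}}}{tag}"


def one_of_type(owner, slot, individuals, base="xs:string"):
    """Create a placeholder simple type restricted to the given individuals."""
    stype = ET.Element(xs("simpleType"), {"name": f"OneOf_{owner}_{slot}"})
    restriction = ET.SubElement(stype, xs("restriction"), {"base": base})
    for value in individuals:
        ET.SubElement(restriction, xs("enumeration"), {"value": value})
    return stype


genre = one_of_type("AvMedium", "genre", ["love", "humor", "science"])
print(ET.tostring(genre, encoding="unicode"))
```

The sequence element genre of the AvMedium complex type would then refer to this placeholder type by name.
]]></Paragraph>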
      <Paragraph position="12">  Step 5 Mapping of individuals: The last step to be taken in transforming the ontology to XMLS is mapping the individuals which represent instances of specific classes in the ontology, e.g. The General and City Lights are modeled as instances of the AvMedium class.</Paragraph>
      <Paragraph position="13"> instance-of The General AvMedium instance-of City Lights AvMedium For instances of individual classes, a simple type with the name of the respective class is created. The content model of this simple type is an enumeration of specific values which correspond to the names of individuals in the ontology.</Paragraph>
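      <Paragraph><![CDATA[
Step 5 can be sketched by first grouping the instance-of statements by class and then emitting one enumeration per class. The input format (a list of individual and class pairs), the xs:string base type and the function name are assumptions for illustration.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def xs(tag):
    """Return the namespace-qualified name of an XML Schema construct."""
    return f"{{{XS}}}{tag}"


def individuals_to_simple_types(instance_of):
    """Create one simple type per class that enumerates its individuals."""
    by_class = defaultdict(list)
    for individual, cls in instance_of:
        by_class[cls].append(individual)

    simple_types = []
    for cls, individuals in by_class.items():
        stype = ET.Element(xs("simpleType"), {"name": cls})
        restriction = ET.SubElement(
            stype, xs("restriction"), {"base": "xs:string"})
        for individual in individuals:
            ET.SubElement(
                restriction, xs("enumeration"), {"value": individual})
        simple_types.append(stype)
    return simple_types


statements = [("The General", "AvMedium"), ("City Lights", "AvMedium")]
for simple_type in individuals_to_simple_types(statements):
    print(ET.tostring(simple_type, encoding="unicode"))
```
]]></Paragraph>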
      <Paragraph position="14">  Our approach to the automatic creation of interface specifications from an ontology has been successfully tested in SMARTKOM, a complex multi-modal dialogue system (Wahlster et al., 2001). The system comprises a large set of input and output modalities which the most advanced current systems feature, together with an efficient fusion and fission pipeline. SMARTKOM supports speech input with prosodic analysis, gesture input via infrared camera, recognition of facial expressions and their emotional states. On the output side, the system features a gesturing and speaking life-like character together with displayed generated text and multimedia graphical output.</Paragraph>
      <Paragraph position="15"> The system currently comprises nearly 50 modules running on Multiplatform, an integration software based on a parallel virtual machine (Herzog et al., 2003). The modules exchange messages whose content is encoded in XML. The interfaces are defined by a set of XML schemata. The part of these schemata containing the system's knowledge about application domains was obtained via the automatic transformation of an OIL-RDFS ontology (Gurevych et al., 2003b). Thus, all components of the system, e.g., the parser (Engel, 2002) and the dialogue manager (Löckelt et al., 2002), operate on a common knowledge store: the XML schemata resulting from the ontology transformation.</Paragraph>
      <Paragraph position="16"> In this trial, our initial hypothesis was confirmed: employing ontological knowledge for interface specifications makes them more consistent, better structured and more readable than manually defined interfaces. Some additional advantages that were not anticipated originally also resulted from this approach. Enhancing the OIL-RDFS datatyping capabilities: As previously stated, ontologies are a suitable means for specifying high-level domain knowledge. However, if knowledge represented in the ontology is to become part of the common XML schema-based representation exchanged between the modules, it is important to have a mechanism for referencing structures, i.e., datatypes defined elsewhere in a larger XML context.</Paragraph>
      <Paragraph position="17"> It should be noted that the datatyping capabilities of the OIL-RDFS ontology per se are very limited. Therefore, enabling references to XMLS datatypes within the ontology, or more generally, referencing any datatype defined elsewhere in a larger XML context, is in practice beneficial to the ontology.</Paragraph>
      <Paragraph position="18"> We provide a special mechanism which allows external datatypes to be employed in the ontology, e.g., NmToken (a built-in XML schema datatype) or derived datatypes, such as ns:TimeExpression (as parts of other ontologies, like a time ontology). To make use of this feature, external datatypes have to be modeled as instances in the ontology. Thus, they can be employed, e.g., as slot fillers. Consequently, the ontology converted to XML schema can be embedded in a larger XML context.</Paragraph>
      <Paragraph position="19"> Supporting multiple applications with a single ontology: Ontology construction is known to be labor- and cost-intensive. To reduce the cost of ontology design and maintenance, it is necessary to construct ontologies which are re-usable, i.e., support multiple applications and domains. This may, however, often have the side effect that an ontology covers more domains than are addressed by the specific system in question. Transforming the ontology to XMLS as is would then overload the domain model and slow down both the system and its development.</Paragraph>
      <Paragraph position="20"> As a solution to this problem we enabled different (system-dependent) views on a single ontology covering multiple domains. This solution requires that certain parts of the ontology are marked up as being relevant for the particular system in question. The mark-up is examined automatically and a decision is made as to which parts of the ontology are irrelevant for the specific system at hand. These parts are then skipped in the process of transformation to XMLS. As a result, the XML schema-based domain model contains exactly the knowledge relevant for a particular system and presents one of the possible views on the underlying ontology.</Paragraph>
      <Paragraph position="21"> Language processing tasks on XMLS: As a result of the transformation, as much knowledge as possible from the ontology is preserved and made explicit in the produced XMLS. Various linguistic operations, e.g., anaphora, bridging and metonymy resolution, or discourse processing techniques such as overlay (Alexandersson and Becker, 2003), can therefore work directly on the schemata.</Paragraph>
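      <Paragraph><![CDATA[
The view mechanism can be sketched as a filter that runs before the transformation. How the relevance mark-up is actually stored in the ontology is not described in the text, so the "systems" field, the system names and the RouteSegment class below are purely hypothetical.

```python
def select_view(ontology, system):
    """Split class definitions into those kept for a system and those skipped.

    ontology maps a class name to a definition carrying a set of system
    names under the (assumed) key "systems".
    """
    kept = {name: definition for name, definition in ontology.items()
            if system in definition["systems"]}
    skipped = sorted(name for name in ontology if name not in kept)
    return kept, skipped


ontology = {
    "WatchPerceptualProcess": {"systems": {"tv_guide"}},
    "AvMedium": {"systems": {"tv_guide", "cinema"}},
    "RouteSegment": {"systems": {"navigation"}},  # hypothetical class
}

view, skipped = select_view(ontology, "tv_guide")
print(sorted(view))  # classes passed on to the XMLS transformation
print(skipped)       # classes left out of this system's schema
```

Only the classes in the returned view would be handed to the transformation steps, so each system receives a schema restricted to its own domains.
]]></Paragraph>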
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Related Work
</SectionTitle>
    <Paragraph position="0"> The relation between ontologies and schema-languages has been addressed previously in the AI and Semantic Web communities. Gil and Ratnakar (2002) carried out a detailed comparison of semantic mark-up languages in the course of looking for a language suitable for developing user-oriented tools for the Semantic Web.</Paragraph>
    <Paragraph position="1"> Klein et al. (2000) relate ontologies to the XMLS language definition, applicable in the context of defining the content of on-line information sources. Their conclusion is that the two refer to different levels of abstraction and should therefore be used at different stages of the development of information sources. They also provide a translation procedure from OIL to XMLS which is similar to ours, yet differs in the technical details. In contrast to our work, Klein et al. (2000) do not verify their approach via a practical implementation, i.e., while it is stated that most of the steps could be automated, the focus of their work remains on a fairly theoretical level.</Paragraph>
    <Paragraph position="2"> The approach proposed herein has been implemented and successfully deployed in a language technology system.</Paragraph>
    <Paragraph position="3"> It is available as a free software project, thus enabling its practical re-use in other systems.</Paragraph>
    <Paragraph position="4"> Some research is also underway to explore the reverse direction, i.e. from XML schema to ontology content9.</Paragraph>
    <Paragraph position="5"> The motivation for that is twofold: firstly to enable reasoning about XML content for DAML-enabled software and secondly to create DAML content from XML in a quick and automated fashion. The main objective of our approach is, however, to bring semantics to XML documents, i.e., derive appropriate interface specifications from the given domain model, thereby enabling high-quality reasoning immediately on the XMLS level.</Paragraph>
  </Section>
</Paper>