<?xml version="1.0" standalone="yes"?> <Paper uid="W05-1617"> <Title>Exploiting OWL Ontologies in the Multilingual Generation of Object Descriptions</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Exporting M-PIRO ontologies to OWL </SectionTitle> <Paragraph position="0"> M-PIRO's ontological assumptions are very similar to those of OWL. As with M-PIRO, OWL assumes there are entity types, called classes, and entities, called individuals. M-PIRO's fields correspond to OWL's properties. Relationships between entities are expressed by defining object properties, that map entities to other entities, while attributes of entities are expressed via datatype properties, that map entities to literals of specific datatypes. It is, thus, relatively straightforward to export an M-PIRO ontology to OWL, as sketched below. There are actually three different versions of OWL, called OWL LITE, OWL DL, and OWL FULL, with increasing sophistication. The mapping from M-PIRO's ontologies to OWL produces ontologies in OWL LITE, which can be thought of as a subset of OWL DL and OWL FULL.</Paragraph> <Paragraph position="1"> When exporting M-PIRO ontologies to OWL, entity types give rise to class definitions; e.g., the 'vessel' entity type of Figure 1 leads to the following OWL class: One problem we have encountered is that OWL provides no mechanism to specify default values of properties. In M-PIRO, it is possible to introduce a generic entity per entity type, and the values of its fields are used as default values of all the entities in that type. For example, one could specify that kouroi, a kind of statue, were made in the archaic period, by introducing a 'generic-kouros' entity, similar to the 'generic-kylix' of Figure 1, and filling its 'creation-period' with 'archaic-period'. This would save us from having to specify the creation period of each individual kouros; their 'creation-period' fields would be left empty. 
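The default mechanism just described can be sketched as a lookup that falls back to the generic entity of an entity's type. This is only an illustrative sketch: the dictionary layout, the function name `field_value`, and the assumption that generic entities are named by prefixing the type name with 'generic-' are hypothetical, not M-PIRO's actual data model or API.

```python
# Hypothetical in-memory model; the names and layout are assumptions for illustration.
GENERIC_PREFIX = "generic-"  # assumed naming convention for generic entities

entities = {
    "generic-kouros": {"type": "kouros",
                       "fields": {"creation-period": "archaic-period"}},
    "kouros-1": {"type": "kouros", "fields": {}},  # field left empty
    "kouros-2": {"type": "kouros",
                 "fields": {"creation-period": "classical-period"}},  # fills the field itself
}


def field_value(entity_id, field):
    """Return the entity's own value, else the default from its type's generic entity."""
    entity = entities[entity_id]
    own = entity["fields"].get(field)
    if own is not None:
        return own
    generic = entities.get(GENERIC_PREFIX + entity["type"])
    if generic is None:
        return None
    return generic["fields"].get(field)
```

An entity with an empty field picks up the value of the generic entity, while an entity that fills the field itself keeps its own value.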
It is also possible to override default information: to specify that a particular kouros was created during the classical period, perhaps the art of an eccentric classical sculptor, one would fill its 'creation-period' with 'classical-period', and this would licence texts like &quot;Kouroi were created during the archaic period. However, this kouros was created during the classical period&quot;. We export generic entities as ordinary OWL individuals, but use a special prefix in their identifiers, which allows M-PIRO's system to assign them special status when reloading the ontology. Another system, however, that relies only on OWL's official semantics would have no way to realize that such individuals should be assigned special status.</Paragraph> <Paragraph position="2"> A second problem is that some of M-PIRO's datatypes (e.g., dates) do not correspond exactly to OWL's recommended datatypes. We have defined new datatypes in OWL, using XML SCHEMA, that correspond exactly to M-PIRO's datatypes, and we currently use those in the exported ontologies instead of the recommended OWL datatypes. We hope to modify M-PIRO's datatypes to correspond exactly to the recommended ones in future versions of M-PIRO's system.</Paragraph> <Paragraph position="3"> The mapping from M-PIRO ontologies to OWL that we sketched above has been fully implemented, and it now allows the authoring tool to export its ontologies in OWL. Apart from allowing other systems to reuse M-PIRO's ontologies, the mapping also opens up the possibility of generating object descriptions in both human-readable and machine readable forms. Every natural language description that M-PIRO produces can in principle also be rendered in a machine-readable form consisting of OWL individuals, this time using the mapping to translate into OWL the parts of the ontology that the system has decided to convey. 
For example, the English description of Figure 2 can be rendered in OWL as: M-PIRO's generator might also have included in the resulting text information deriving from the fields of the painter, e.g., the city the painter was born in, or other entities mentioned in the text. In that case, the OWL rendering of the description's content would include additional individuals, such as: In the machine-readable forms of the descriptions, the OWL individuals would include only properties corresponding to fields the generator has decided to convey, unlike when exporting the full ontology. That is, the OWL individuals may not include properties corresponding to fields deemed uninteresting for the particular end-user, or fields that have already been conveyed; e.g., the painter's city may have already been conveyed when describing another work of the same artist.</Paragraph> <Paragraph position="4"> It is thus possible to annotate the generated texts with OWL individuals representing their semantics. This would allow computer applications (e.g., Web agents visiting the site of a retailer that generates product descriptions using M-PIRO's technology) to reason about the semantics of the texts (e.g., locate items of interest). Alternatively, it is possible to define user types for both human users (e.g., 'expert', 'averageadult') and artificial agents acting for users of different interests and expertise (e.g., 'agent-expert', 'agent-averageadult'), and produce human-readable or machine-readable descriptions depending on the user type (in M-PIRO's demonstrators, there is a login stage where visitors select their types). 
The OWL ontology without its individuals (classes and properties only) can also be published on the Web to help the agents' developers figure out the structure and semantics of the OWL individuals their agents may encounter.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Importing OWL ontologies </SectionTitle> <Paragraph position="0"> When porting M-PIRO's system to a new domain, much of the authoring effort is devoted to defining entity types, and the fields that express attributes and relationships. This is a time-consuming process, partly because the ontology often has to be reshaped as more experience about the domain is gained. If a well-thought OWL ontology about the domain already exists, as will be the case with the gradual expansion of the Semantic Web, the authoring can be accelerated by importing the existing ontology into the authoring tool. Thereafter, the authors can focus on adding the necessary domain-dependent linguistic resources (micro-plans, lexicon entries, etc.), setting up the user stereotypes, and populating the ontology with entities that were not already present in the imported one. For the latter, we have developed software that allows the authoring tool to construct entities automatically from data in relational databases via ODBC; the authors only need to establish a mapping between the fields of the entity types and the attributes of the database's relations.</Paragraph> <Paragraph position="1"> As already mentioned, there are three versions of OWL (OWL LITE, OWL DL, OWL FULL) with increasing sophistication. The mapping from M-PIRO's ontologies to OWL of the previous section uses only a subset of OWL LITE. 
Hence, importing an arbitrary OWL ontology, as opposed to an OWL ontology exported by the authoring tool, is not simply a matter of following the inverse mapping of the previous section.</Paragraph> <Paragraph position="2"> Below we highlight the problems that arise when importing arbitrary OWL LITE ontologies, to offer a taste of the work that remains to be carried out to make M-PIRO's system fully compatible with OWL LITE. We also point to some additional problems that arise when one moves on to OWL DL and OWL FULL. The discussion is based on experiments we conducted with more than a dozen existing OWL ontologies.5 One of the main difficulties is that OWL (all versions) allows multiple inheritance, while M-PIRO does not (Section 2). Importing an ontology with multiple inheritance currently causes the process to fail. The need for multiple inheritance has also been noted by authors, who often encounter cases where, for example, a person has to be categorized as both painter and potter. We hope to support multiple inheritance in future versions; this requires, among other things, modifications in how the ontology is presented in the authoring tool.</Paragraph> <Paragraph position="3"> Another problem is that OWL (all versions) supports property inheritance. For example, there may be a property 'is-player-of', used to represent the relationship between soccer players and their teams, and another property 'is-goalkeeper-of', which associates goalkeepers with their teams. The latter is a subproperty of the former, in the sense that if X is the goalkeeper of Y, then X is also a player of Y. The import facilities of the authoring tool currently ignore subproperty inheritance, because there is no corresponding notion in M-PIRO's ontologies; i.e., the two properties would be treated as unrelated. 
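To make the subproperty relation concrete, the following minimal sketch derives the facts implied by a stated fact. It assumes, purely for illustration, that each property has at most one declared superproperty (OWL itself permits several); the table `SUBPROPERTY_OF` and both function names are hypothetical and are not part of M-PIRO's import facilities.

```python
# Hypothetical subproperty declarations: child property -> parent property.
# Simplification: at most one parent per property.
SUBPROPERTY_OF = {"is-goalkeeper-of": "is-player-of"}


def implied_properties(prop):
    """Return prop followed by every property it is (transitively) a subproperty of."""
    chain = [prop]
    while chain[-1] in SUBPROPERTY_OF:
        parent = SUBPROPERTY_OF[chain[-1]]
        if parent in chain:  # guard against cyclic declarations
            break
        chain.append(parent)
    return chain


def implied_facts(subject, prop, obj):
    """Return the stated fact plus the facts that follow from subproperty inheritance."""
    return [(subject, p, obj) for p in implied_properties(prop)]
```

With these declarations, stating that X is the goalkeeper of Y also yields the fact that X is a player of Y, whereas an importer that ignores the table sees two unrelated properties.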
Subproperty inheritance, however, could help the generator avoid expressing information that follows from other information it has already conveyed; e.g., if a user has been told that X is the goalkeeper of Y, avoid saying that X is also a player of Y. We hope to deal with subproperty inheritance in future work.</Paragraph> <Paragraph position="4"> A further complication is that OWL LITE allows the range of possible values of a property to be the intersection of several classes, while in M-PIRO's model the values of each field must come from a single, named entity type. A possible solution is to automatically create a new entity type in M-PIRO's ontology for each intersection in the OWL ontology, but this leads back to the single inheritance problem, because the intersection has to inherit from all the intersected types. This problem is more acute in OWL DL and OWL FULL, where several set operations (e.g., union, complement) between classes are allowed when specifying the ranges of properties.</Paragraph> <Paragraph position="5"> In OWL it is also possible to refine a property's range.</Paragraph> <Paragraph position="6"> For example, an ontology may specify that individuals of the class 'product' have a property 'made-by', which associates them with individuals of the class 'manufacturer'; there would be an rdfs:range in the definition of 'product' setting the range of 'made-by' to 'manufacturer'. We may then wish to specify that individuals of 'automobile', a subclass of 'product', accept as values of 'made-by' only individuals of 'automobile-manufacturer', a subclass of 'manufacturer'. There are mechanisms in OWL (all versions) to state this (allValuesFrom tag), but there is no equivalent mechanism in M-PIRO's ontological model. 
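The range refinement just described can be sketched as a check on whether a value's class is acceptable for a property on a given class. The tables and function names below are hypothetical. The sketch assumes single inheritance and an acyclic hierarchy, and it applies only the most specific refinement rather than OWL's full semantics, under which all applicable restrictions hold together.

```python
# Hypothetical class hierarchy (child -> parent) and property range declarations.
SUBCLASS_OF = {"automobile": "product",
               "automobile-manufacturer": "manufacturer"}
RANGE = {"made-by": "manufacturer"}
# Refinements in the spirit of allValuesFrom: (class, property) -> required class.
REFINED_RANGE = {("automobile", "made-by"): "automobile-manufacturer"}


def ancestors(cls):
    """Return cls followed by its superclasses, nearest first."""
    result = [cls]
    while result[-1] in SUBCLASS_OF:
        result.append(SUBCLASS_OF[result[-1]])
    return result


def required_range(subject_class, prop):
    """Return the most specific range that applies to prop on subject_class."""
    for cls in ancestors(subject_class):
        if (cls, prop) in REFINED_RANGE:
            return REFINED_RANGE[(cls, prop)]
    return RANGE.get(prop)


def value_allowed(subject_class, prop, value_class):
    """Check whether a value of value_class satisfies the applicable range."""
    needed = required_range(subject_class, prop)
    return needed is None or needed in ancestors(value_class)
```

Under these assumptions, an 'automobile' whose 'made-by' value is a plain 'manufacturer' is rejected, while the same value is accepted for a generic 'product'.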
We currently ignore range refinements when importing OWL ontologies, but this has the risk that authors may violate refinements (e.g., when adding individuals), creating ontologies that are no longer compatible with the imported ones.6 Additional work is needed to support OWL's (all versions) someValuesFrom, which allows stating that in set-valued properties (cf. M-PIRO's 'Many' column) at least one of the elements of each set-value should belong to a particular class. A further mechanism in OWL DL and OWL FULL (hasValue tag) allows specifying that all the individuals of a class have a particular value at some of their properties; e.g., that all wines of class 'burgundy' have 'dry' taste. Such information can be imported into M-PIRO's generic entities (Section 3), though the correspondence is not exact, as generic entities carry default information that may be overridden.</Paragraph> <Paragraph position="7"> As already pointed out (Section 1), M-PIRO does not allow relationships or attributes to be declared as one-to-one.</Paragraph> <Paragraph position="8"> In contrast, OWL (all versions) provides appropriate facilities, as well as facilities to declare properties (relationships or attributes) as transitive, symmetric, or the inverse of another one. All such declarations are currently ignored when importing OWL ontologies; again, this has the risk that the authors may modify the ontologies in ways that are incompatible with the ignored declarations. An additional problem in OWL FULL is that classes can be used as individuals, allowing the use of relationships to associate classes, as opposed to individuals; this violates M-PIRO's current ontological model.</Paragraph> <Paragraph position="9"> It should be clear, then, that there are still issues to be resolved in M-PIRO's ontological assumptions to make M-PIRO fully compatible with OWL LITE, and there are additional difficulties with OWL DL and OWL FULL. 
As discussed above, however, most of the necessary improvements appear to be within reach, at least for OWL LITE. Overall, it appears reasonable to conclude that future versions of NLG systems like M-PIRO's will be able to fully exploit OWL ontologies. (Footnote 6: ILEX and M-PIRO's core generation engine provide some support for such refinements, but M-PIRO's authoring tool does not.)</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Towards semantic browsers </SectionTitle> <Paragraph position="0"> We have so far proposed two ways in which OWL ontologies can be exploited in systems like M-PIRO's: first, the generated texts can be accompanied by OWL specifications of their semantics, with an OWL ontology establishing the semantic vocabulary; and, second, existing OWL ontologies can be imported, to accelerate the authoring. In both cases, the ontologies are linked to domain-dependent language resources (micro-plans, lexicon entries, etc.) and user stereotypes (the interest of each field per user type, etc.), but these additional resources are not part of the OWL ontologies: when exporting M-PIRO ontologies to OWL, the authoring tool produces additional proprietary XML files that contain the domain-dependent language resources and stereotypes; and when importing OWL ontologies developed by others, the additional resources have to be filled in by the authors. We argue below that agreeing upon standards on how the additional resources could be embedded in OWL ontologies would allow NLG systems like M-PIRO to play a central role in the Semantic Web.</Paragraph> <Paragraph position="1"> Note, first, that it is possible to represent in OWL M-PIRO's domain-dependent linguistic resources and user stereotypes.</Paragraph> <Paragraph position="2"> For example, micro-plans could be treated as individuals of a class 'Microplan' with subclasses 'ClausePlan' and 'Template'. 
In a similar manner, there would be a class 'Voice' with individuals 'active' and 'passive', and similarly for tenses, genders, supported languages, etc. There would also be a class 'LexiconEntry' with subclasses 'VerbEntry' and 'NounEntry', and individuals corresponding to the entries of the domain-dependent lexicon. (Classes corresponding to language resources could be grouped under a 'LinguisticResource' super-class.) Then, for example, the English micro-plan of Figure 1 would roughly be represented in OWL as: One complication is that we need to establish mappings from micro-plans to the properties (fields) they can express, and this requires using property names as values of other properties. This can be seen in the micro-plan above, where we used the property (field) name 'painted-by' as the value of property 'for-property' to signal that the micro-plan can express 'painted-by'. Using property names as values of properties, however, requires OWL FULL. There is a similar problem with noun entries, which have to be associated with classes (entity types) they can refer to: in the noun entry above, we used the class name 'vessel' as the value of property 'refersto-class'. Using class names as values of properties again requires OWL FULL. Similar problems arise with stereotypes.</Paragraph> <Paragraph position="3"> We are currently exploring how M-PIRO's domain-dependent language resources and stereotypes can be best embedded in OWL ontologies. This embedding will lead to 'language-enabled' ontologies, that will include all the resources a system like M-PIRO needs to render the ontologies in several natural languages. This opens up another possibility for publishing content on the Semantic Web: a site could publish only its language-enabled ontology (including the individuals that correspond, for example, to the items it sells), and the NLG technology to render the ontology in natural language could take the form of a browser plug-in. 
When visiting a site of this kind, a human user would be initially presented with an inventory of objects that can be described (e.g., product thumbnails). Selecting an object would transmit to the browser the ontology or its relevant parts, and it would be the responsibility of the NLG plug-in to produce an appropriate description in the user's language and tailor it to the user's type and interaction history. If the NLG community could establish standards for language-enabled ontologies, there could be different NLG plug-ins by different makers, perhaps each specialising in particular languages and user types, in the same way that there are different browsers for HTML. There could also be a market for developers of language-enabled ontologies for particular sectors (e.g., museums, retailers of computer equipment), who would sell their ontologies to organisations wishing to publish content in those sectors. The client organisations would only need to populate the ontologies with their own individuals (e.g., exhibits, products), possibly by reusing databases, and publish them at their sites. Artificial agents would interact directly with the ontologies of the various sites, invoking their own NLG plug-ins to report their findings in natural language.</Paragraph> <Paragraph position="4"> Establishing standards is, of course, far from trivial. For example, different NLG systems may require very different domain-dependent language resources, or make different assumptions on which resources are domain-dependent or independent. Nevertheless, we believe it is worth trying to move towards this direction, as there are large potential gains for both the NLG community and the users of the emerging Semantic Web. Furthermore, the effort to establish standards should proceed in cooperation with other fields that could exploit language-enabled ontologies. 
For example, the association between entity types and noun entries can be used for query expansion in information retrieval; and the association between micro-plans and ontology fields can be useful in information extraction systems that populate ontologies.</Paragraph> </Section> </Paper>