<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-0714">
  <Title>Algorithms for Ontological Mediation</Title>
  <Section position="4" start_page="102" end_page="104" type="metho">
    <SectionTitle>
3 Fundamentals
</SectionTitle>
    <Paragraph position="0"> Before proceeding with a discussion of algorithms for ontological mediation, we first set forth some assumptions and definitions, and make some clarifying remarks.</Paragraph>
    <Section position="1" start_page="102" end_page="103" type="sub_section">
      <SectionTitle>
3.1 Common words mean the same thing.
</SectionTitle>
      <Paragraph position="0"> We make the following simplifying assumption: Rule 1 If two agents are communicating about the same domain, then if both of them know some word, then it means the same thing to both of them.</Paragraph>
      <Paragraph position="1"> The rationale for this assumption is that when agents are communicating, each implicitly assumes that a word used by the other means the same thing as it does to it. People don't go around wondering whether each word they hear really means what they think it does, and their communication with other people is usually free of error. Of course, this assumption can lead to problems when common words really don't mean the same thing. Then it becomes the agents' duty to detect miscommunication. Work is being done in this area (see, for example, (McRoy, 1996)), but this is not the focus of our current research. We are more concerned with using mediation techniques to find correspondences between concepts in ontologies. This presupposes detection, since the agents have called a mediator to help them.</Paragraph>
      <Paragraph position="2"> 3.2 Ontologies The word &amp;quot;ontology&amp;quot; is used by many researchers to mean a variety of similar but distinct things. Without making a strong or precise statement as to what ontologies must necessarily be, we present some issues with respect to ontologies that our research addresses.</Paragraph>
      <Paragraph position="3">  Contrary to many ontology designers, who do not seem to distinguish between word (or symbol) and concept, we take an ontology to be an organization of an agent's concepts by some set of ontological relations. A concept is a particular agent's conceptualization of an element of the domain of discourse, and each concept can be denoted by one or more words. This way, words can be shared between agents, but concepts cannot. Naturally, we require a mapping between words and concepts to support reasoning about agents' concepts. For a given agent, we currently assume a 1-1, onto mapping between concepts and words. Presently, we do not have algorithms that give a proper treatment of polysemy or synonymy of words for ontological mediation.</Paragraph>
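The assumed 1-1, onto mapping between words and concepts can be sketched as a pair of dictionaries. The Agent class and its method names below are illustrative inventions for this sketch, not an API from the paper.

```python
# Minimal sketch of the Section 3.2 assumption: per agent, a 1-1, onto
# mapping between words and concepts. Class and method names are
# illustrative, not the paper's.

class Agent:
    def __init__(self, name, word_to_concept):
        self.name = name
        self._w2c = dict(word_to_concept)
        self._c2w = {c: w for w, c in self._w2c.items()}
        # 1-1: no two words may name the same concept.
        assert len(self._c2w) == len(self._w2c), "mapping must be 1-1"

    def concept(self, word):
        """The concept a word denotes for this agent (None if unknown)."""
        return self._w2c.get(word)

    def word(self, concept):
        """The unique word this agent uses to express a concept."""
        return self._c2w[concept]

    def knows(self, word):
        return word in self._w2c
```

Words are plain strings shared between agents, while concepts (here just opaque identifiers) stay private to each agent, mirroring the word/concept distinction above.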
      <Paragraph position="4">  If an ontological mediator is to find words in one ontology that have the same meaning as words in another ontology, the mediator must be thinking about the concepts in those ontologies. The notion of a &amp;quot;concept&amp;quot; is very slippery, and frequently means different things to different people. Therefore, for the purpose of describing these algorithms and their underlying theory, we make the following definitions.  1. For any agent A and domain element O, if A knows about or can think about O, then there exists a mental representation C in A's mind, which represents O. We write \[\[C\]\]A = O.  2. Concept: The mental entity C which exists in the mind of an agent and serves to represent some domain element for that agent.</Paragraph>
      <Paragraph position="5"> 3. OM-Concept: The mental entity C' which exists in the mind of the ontological mediator that is thinking about C, that is, thinking about some concept in the mind of another agent, and how that concept might fit into the agent's ontology.</Paragraph>
      <Paragraph position="7"> Note one important implication of the distinction: The &amp;quot;domain&amp;quot; of thought for an ontological mediator is not the same as the communicants' domain. Rather, the OM's domain is that of concepts in the communicants' ontologies. While the communicants are &amp;quot;thinking about&amp;quot; elements of their own domain, the OM is thinking about those concepts invoked by the communicant's thinking. Thus, whenever agent A uses a word W, it expresses some concept C, which in turn represents some domain entity O for A. Therefore, the first time OM hears A use W, OM builds in its own mind an om-concept C' to represent that concept. Hence \[\[C'\]\]OM = C, and of course \[\[C\]\]A = O.</Paragraph>
    </Section>
    <Section position="2" start_page="103" end_page="104" type="sub_section">
      <SectionTitle>
3.3 Ontological Relations
</SectionTitle>
      <Paragraph position="0"> An ontological relation is simply any relation commonly used in the organization of ontologies. Whether a relation is truly ontological is a matter of opinion, but, for example, some kind of subclass/superclass relation pair is almost always used to form a taxonomic hierarchy.</Paragraph>
      <Paragraph position="1"> 3.3.1 Hierarchical generalizers and specializers A hierarchical ontological relation is any ontological relation that organizes concepts into a hierarchy, taxonomy, or similar structure. Hierarchical relations are related to but distinct from transitive relations. For example, the transitive relation ancestor is related to the hierarchical relation parent.</Paragraph>
      <Paragraph position="2"> The hierarchical ontological relations are important for ontological mediation because they form the hierarchies organizing the concepts in the ontology. When a relation is hierarchical, we can think of it as having a direction or orientation, either as a generalizer, relating a concept to concepts above it (e.g., its &amp;quot;superconcepts&amp;quot;), and moving &amp;quot;up&amp;quot; the hierarchy, or as a specializer, relating a concept to concepts below it (its &amp;quot;subconcepts&amp;quot;), and moving &amp;quot;down&amp;quot;. For example, directSuperClass is a hierarchical generalizer, while directSubClass is a hierarchical specializer.</Paragraph>
      <Paragraph position="3"> The &amp;quot;up&amp;quot; and &amp;quot;down&amp;quot; directions are merely conventions, of course, in that they relate to the way we tend to draw pictures of hierarchies as trees. We start at some root concept or concepts and fan out via some hierarchical specializer. How do we know that directSubClass is the specializer (down direction) and that dircctSuper.</Paragraph>
      <Paragraph position="4"> Class is the generalizer (up direction)? We expect fan-out with specializers, that is, specializers tend to relate several subconcepts to a single superconcepts. For a pair of hierarchical relations R and R ~ (the converse of R), we examine the sets of concepts X = {xl3yR(x,y)} and Y = {YI3xR(x,Y)} * lflY\] &gt; IXI then R is a specializer, otherwise R is a generalizer.</Paragraph>
      <Paragraph position="5"> If R is a hierarchical relation, then R' is its converse, i.e., R(C1, C2) holds exactly when R'(C2, C1) holds. It follows naturally that if R is a generalizer, then R' is a specializer, and vice versa.</Paragraph>
      <Paragraph position="7"> We say that a concept P is a &amp;quot;parent&amp;quot; (with respect to R) of another concept C if R(C, P) for some hierarchical generalizer R. Likewise, we say that a concept C is a &amp;quot;child&amp;quot; of P if R(P, C) for some hierarchical specializer R.</Paragraph>
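The fan-out test above can be sketched directly: list the pairs (x, y) for which R(x, y) holds, and compare the sizes of the domain and range. The function name and toy relation data are illustrative.

```python
# Hedged sketch of the fan-out test in Section 3.3.1: R is a
# specializer when it relates more distinct range elements than
# domain elements (|Y| > |X|); otherwise it is a generalizer.

def classify(pairs):
    X = {x for x, _ in pairs}  # domain: concepts the question is asked of
    Y = {y for _, y in pairs}  # range: concepts returned as answers
    return "specializer" if len(Y) > len(X) else "generalizer"

# directSubClass-style fan-out: one superconcept, several subconcepts.
sub_pairs = [("animal", "dog"), ("animal", "cat"), ("animal", "fish")]
# The converse relation runs the other way and shows fan-in instead.
super_pairs = [(y, x) for x, y in sub_pairs]
```

With three subconcept pairs under one superconcept, the range is larger than the domain, so the relation is classified as a specializer; its converse comes out as a generalizer.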
    </Section>
    <Section position="3" start_page="104" end_page="104" type="sub_section">
      <SectionTitle>
3.4 Relation notation
</SectionTitle>
      <Paragraph position="0"> By convention, R(X,Y) means that Y bears the R relation to X; for example, we say subclass(animal, dog) to mean that dog is a subclass of animal. We choose this convention to reflect the question-asking approach, where questions are asked of the domain and answers are given in the range. For example, in &amp;quot;What are the subclasses of animal?&amp;quot; we have the question in terms of a relation: subclass(animal, ?x), or functionally, as in subclass(animal) = ?x.</Paragraph>
    </Section>
    <Section position="4" start_page="104" end_page="104" type="sub_section">
      <SectionTitle>
3.5 Tangled Hierarchies
</SectionTitle>
      <Paragraph position="0"> For many ontologies, the taxonomic hierarchy is structured as a tree (or as a forest), where any given concept can have at most one superconcept. Other ontologies can be tangled hierarchies with multiple inheritance. The techniques of ontological mediation presented here do allow for mediation with tangled hierarchies.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="104" end_page="104" type="metho">
    <SectionTitle>
4 Algorithms
</SectionTitle>
    <Paragraph position="0"> In this section, we discuss various algorithms for ontological mediation. We define word(C,A) to be the word that agent A uses to express concept C, and concept(W,A) to be the om-concept representing the concept that W expresses for A, if one exists, undefined otherwise. Also, let knows(A, W) be true if and only if concept(W,A) is defined, false otherwise.</Paragraph>
    <Paragraph position="1"> We define the following operations:  * Ontology(A) : returns the set of om-concepts that OM currently uses to represent concepts in A's ontology. * Agent(C) : returns a representation of the agent that C is an om-concept for. This representation is used to direct questions to the agent.</Paragraph>
    <Paragraph position="2"> The following algorithm exists in support of ontological mediation algorithms by asking questions of the communicants as needed to establish OM's knowledge of ontological relationships. Evaluate takes a relation R and an om-concept C, and returns a set of om-concepts such that Agent(C) believes R(\[\[C\]\]Agent(C), \[\[C'\]\]Agent(C)) for each om-concept C' in the set. Results are cached so that multiple calls to evaluate the same question do not result in multiple queries being issued.</Paragraph>
    <Paragraph position="3">  Algorithm Evaluate(R, C): set of om-concepts
  1. let A &lt;- Agent(C)
  2. Build a query Q in A's interlingua to ask ``What bears relation R to word(C, Agent(C))?''
  3. Issue Q to Agent(C). The response to the query will be a set of words S.</Paragraph>
    <Paragraph position="4">  4. let Answer &lt;- {}
  5. for V in S do
  6.   assert R(C, concept(V, A))
  7.   let Answer &lt;- Answer + concept(V, A)
  8. end for
  9. return Answer</Paragraph>
    <Paragraph position="6">  The first two algorithms below each take as arguments a word W used by agent S and not known by agent L, and return a set of om-concepts representing possible ontological translations. More formally, when X is the om-concept for which word(\[\[X\]\]OM, S) = W, given any om-concept Y in the set returned by the algorithm, there is reason to believe that \[\[\[\[X\]\]OM\]\]S = \[\[\[\[Y\]\]OM\]\]L.</Paragraph>
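Under the toy assumption that each agent can be modelled as a nested dict mapping relation to word to a set of answer words, Evaluate with its result cache might look like the following; the Mediator class and its whole interface are inventions for this sketch.

```python
# Illustrative rendering of Evaluate: ask the agent behind om-concept C
# what bears relation R to C's word, assert the resulting facts, and
# cache answers so a repeated question causes no repeated query.
# Agents are {relation: {word: set_of_words}} dicts; om-concepts are
# (agent, word) pairs. Both representations are assumptions.

class Mediator:
    def __init__(self, agents):
        self.agents = agents   # agent name -> relation tables
        self.cache = {}        # (R, agent, word) -> set of om-concepts
        self.facts = set()     # asserted R(C, C') triples

    def evaluate(self, R, agent, word):
        key = (R, agent, word)
        if key in self.cache:            # cached: no second query issued
            return self.cache[key]
        answer = set()
        for v in self.agents[agent].get(R, {}).get(word, set()):
            om_concept = (agent, v)
            self.facts.add((R, (agent, word), om_concept))
            answer.add(om_concept)
        self.cache[key] = answer
        return answer
```

The cache key includes the relation and the queried word, so evaluating the same question twice returns the stored answer set rather than re-querying the agent.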
    <Section position="1" start_page="104" end_page="104" type="sub_section">
      <SectionTitle>
4.1 Recursive over one relation (MedTax)
</SectionTitle>
      <Paragraph position="0"> The first algorithm explores an ontology along one hierarchical relation, given by parameter R. It is called MedTax because an obvious choice for R is either SubClass or SuperClass, which will result in exploration of the taxonomic hierarchies of the ontologies.</Paragraph>
      <Paragraph position="1">  Algorithm MedTax(W, S, L, R): set of om-concepts
  1. let Q &lt;- {}
  2. for P in Evaluate(R, concept(W,S)) do
  3.   if knows(L, word(P,S)) then
  4.     let Q &lt;- Q + concept(word(P,S), L)
  5.   else
  6.     let Q &lt;- Q U MedTax(word(P,S), S, L, R)
  7.   end if
  8. end for
  9. let F &lt;- {}
  10. for P in Q do
  11.   for C in Evaluate(R', P) do
  12.     if not knows(S, word(C,L)) then
  13.       let F &lt;- F + C
  14.     end if
  15.   end for
  16. end for
  17. return F</Paragraph>
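A hedged Python sketch of the MedTax idea: climb relation R from the unknown word in the speaker's ontology until words the listener knows are reached, then descend the converse relation in the listener's ontology, keeping candidates the speaker does not know. The agent representation (vocab set plus relation tables) and the example dialect data are assumptions for the sketch.

```python
# Sketch of MedTax over toy agents modelled as
# {"vocab": set_of_words, "rel": {relation: {word: set_of_words}}}.
# This representation and the example data are invented for illustration.

def med_tax(word, speaker, listener, R, R_conv):
    Q = set()
    for p in speaker["rel"].get(R, {}).get(word, set()):
        if p in listener["vocab"]:
            Q.add(p)                 # shared word: translation anchor
        else:
            Q |= med_tax(p, speaker, listener, R, R_conv)
    F = set()
    for p in Q:
        for c in listener["rel"].get(R_conv, {}).get(p, set()):
            if c not in speaker["vocab"]:
                F.add(c)             # candidate ontological translation
    return F

# "lift" is unknown to the American listener; both agents share "device".
british = {"vocab": {"lift", "device"},
           "rel": {"SuperClass": {"lift": {"device"}}}}
american = {"vocab": {"elevator", "device"},
            "rel": {"SubClass": {"device": {"elevator"}}}}
```

Climbing SuperClass from "lift" reaches the shared word "device"; descending SubClass in the listener's ontology from there yields "elevator", which the speaker does not know and so is returned as a candidate.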
    </Section>
    <Section position="2" start_page="104" end_page="104" type="sub_section">
      <SectionTitle>
4.2 Multiple relations (MedOnt)
</SectionTitle>
      <Paragraph position="0"> We can extend this algorithm to handle multiple hierarchical ontological relations, such as PartWhole. Now, each hierarchical ontological relation forms its own hierarchy, in which the unknown word is situated in the listener's ontology.</Paragraph>
      <Paragraph position="1"> Again, we find the translation of a word used by S but unknown to L by starting at the unknown word in the speaker's ontology, then crawling up (or down) the hierarchies of the speaker to points where ontological translations of the words at those points have already been made (or are easy to make immediately because the listener knows the word), then crawling back down (or up) the listener's hierarchies.</Paragraph>
      <Paragraph position="2">  Note that MedOnt is a union-forming algorithm, rather than an intersection-forming one. That is, it returns om-concepts that are found by exploring via one or more hierarchical relations, rather than restricted to having been found through every relation. It returns a set of candidates for ontological translation, and does not calculate which is the best one.</Paragraph>
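The union-forming behaviour can be sketched as a loop over relation pairs. The search function is passed in; the stub below stands in for the single-relation MedTax step, and its result table is invented for illustration.

```python
# Sketch of MedOnt: run the single-relation search over each
# hierarchical (relation, converse) pair and union the candidate sets.
# No attempt is made here to rank the candidates.

def med_ont(word, speaker, listener, relation_pairs, search):
    candidates = set()
    for R, R_conv in relation_pairs:
        candidates |= search(word, speaker, listener, R, R_conv)
    return candidates

# Stub standing in for MedTax; its per-relation results are invented.
def stub_search(word, speaker, listener, R, R_conv):
    table = {("SuperClass", "SubClass"): {"elevator"},
             ("HasPart", "PartOf"): {"elevator", "hoist"}}
    return table.get((R, R_conv), set())
```

Because the result is a union, a candidate found through any one hierarchy survives, even if other hierarchies never reach it.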
    </Section>
    <Section position="3" start_page="104" end_page="104" type="sub_section">
      <SectionTitle>
4.3 Choosing the best candidate (MedCount)
</SectionTitle>
      <Paragraph position="0"> This algorithm, unlike the previous algorithms, returns a pair: (1) the single om-concept representing the listener's concept which the mediator believes to be equivalent to the speaker's concept expressed by an unknown word W, and (2) a measure of the mediator's confidence in this ontological translation.</Paragraph>
      <Paragraph position="1"> We introduce the notation A =Y B to mean that concept A is known by OM to be equivalent to concept B with confidence measure Y.</Paragraph>
      <Paragraph position="2">  Choose the om-concept C such that the number of sets in CandidatesByRelations that contain C is maximized.</Paragraph>
      <Paragraph position="3"> let Y &lt;- the number of sets in CandidatesByRelations that contain C</Paragraph>
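The selection step can be sketched as a vote count over the per-relation candidate sets; returning the vote fraction as the confidence measure Y is an assumption about how the count is normalized, not something the surviving text specifies.

```python
# Sketch of MedCount's selection: count, for each candidate, how many
# per-relation candidate sets contain it; return the most frequent
# candidate and a confidence value. Normalizing by the number of
# relations tried is an assumption for this sketch.

from collections import Counter

def med_count(candidates_by_relation):
    votes = Counter(c for s in candidates_by_relation for c in set(s))
    if not votes:
        return None, 0.0               # a miss: no candidate found
    best, n = votes.most_common(1)[0]
    return best, n / len(candidates_by_relation)
```

A candidate confirmed by several hierarchies thus beats one found through a single relation, which is the point of counting over MedOnt's union.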
    </Section>
  </Section>
  <Section position="6" start_page="104" end_page="106" type="metho">
    <SectionTitle>
5 Experiments with WordNet
</SectionTitle>
    <Paragraph position="0"> The WordNet (Miller et al., 1993; Miller, 1995) lexical ontology organizes concepts called &amp;quot;synsets,&amp;quot; which are sets of words considered synonymous in a certain context. Primarily we are interested in some of WordNet's hierarchies, including the taxonomic hierarchy:</Paragraph>
    <Paragraph position="2"/>
    <Section position="1" start_page="104" end_page="104" type="sub_section">
      <SectionTitle>
5.1 Variables
</SectionTitle>
      <Paragraph position="0"> Since WordNet is such a large ontology, we controlled two independent binary variables in the experiment: Synonyms and AllowAllSenses. These are explained below.</Paragraph>
      <Paragraph position="1">  5.1.1 Synonyms One approach to WordNet is to consider each synset as a separate mental concept in the mind of the agent who uses WordNet as its ontology. When the agent expresses that concept, he uses one or more of the words in the synset. If so, the agent supports synonymy. However, deciding which synonym to use is difficult to say the least, and may be a reason why many if not most ontologies don't support synonymy.</Paragraph>
      <Paragraph position="2"> 5.1.2 AllowAllSenses The agent playing the role of WordNet receives queries from the ontological mediator, then in turn makes an appropriate access to its WordNet component. Each query returns a sequence of zero or more groups of output, one for each relevant synset the word was in. If AllowAllSenses was not set, the agent only reported the information from the first block, ignoring the others. Conversely, if AllowAllSenses was set, then the agent reported information from all synsets.</Paragraph>
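The AllowAllSenses switch amounts to truncating the per-sense output blocks. In this sketch the lookup result is modelled as a list of word lists, one per matching synset; the toy sense data for "lift" is invented.

```python
# Sketch of the AllowAllSenses behaviour: a WordNet-style lookup yields
# one block of related words per matching synset; with the flag off,
# only the first sense's block is reported. Toy data, not real WordNet.

def report(blocks, allow_all_senses):
    chosen = blocks if allow_all_senses else blocks[:1]
    return [w for block in chosen for w in block]

# "lift" as a toy polysemous query: sense 1 (elevator), sense 2 (boost).
blocks = [["elevator"], ["boost", "rise"]]
```

With the flag off, any information carried by the later senses is simply never seen by the mediator, which is what makes the variable matter in the experiment.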
    </Section>
    <Section position="2" start_page="104" end_page="106" type="sub_section">
      <SectionTitle>
5.2 Experiment
</SectionTitle>
      <Paragraph position="0"> We devised two agents, appropriately named &amp;quot;AMERICAN&amp;quot; and &amp;quot;BRITISH&amp;quot; because they were constructed to use the corresponding dialect of the English language.</Paragraph>
      <Paragraph position="1"> Both agents use the WordNet ontology, but are restricted from using words strictly from the other's dialect (they pretend not to know them). The dialect restrictions come from the Cambridge Encyclopedia of the English Language (Crystal, 1995, p. 309). We chose 57 word pairs where both words were present in WordNet as members of the same synset, for example, (lift, elevator), (patience, solitaire), (holiday, vacation), (draughts, checkers).</Paragraph>
      <Paragraph position="2"> We then tested the MedCount algorithm mediating from an American speaker to a British listener, and then vice versa from a British speaker to an American listener. There were four hierarchical relations used: SubClass, SuperClass, PartOf, and HasPart.</Paragraph>
      <Paragraph position="3"> When the mediator returns the correct word from the word pair, that is called a success. When the mediator returns some other word, that is called an error, and when the mediator cannot find any word for an ontological translation, that is called a miss.</Paragraph>
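The success/error/miss convention can be written as a small scoring helper. The mediate callable is a stand-in for a MedCount run; the toy answer table is invented, while the word pairs are the examples quoted above.

```python
# Scoring convention of Section 5.2: success when the paired word comes
# back, error for any other word, miss when nothing is returned.
# The mediate argument is a stand-in for the actual mediator.

def score(pairs, mediate):
    tally = {"success": 0, "error": 0, "miss": 0}
    for spoken, expected in pairs:
        answer = mediate(spoken)
        if answer is None:
            tally["miss"] += 1          # no translation found
        elif answer == expected:
            tally["success"] += 1       # the paired word came back
        else:
            tally["error"] += 1         # some other word came back
    return tally

pairs = [("lift", "elevator"), ("patience", "solitaire"),
         ("holiday", "vacation"), ("draughts", "checkers")]
# Invented mediator answers: one wrong word, one unanswered query.
toy_answers = {"lift": "elevator", "patience": "poker",
               "holiday": "vacation"}
```

Dividing the success count by the number of pairs gives the success rate reported in the tables.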
      <Paragraph position="4"> Table 1 summarizes the performance of the MedCount algorithm under combinations of AllowAllSenses (Sen) and Synonyms (Syn), showing the numbers of successes, errors, misses, success rate (Success/57 x 100%), the average certainty over all successes (Cer), and average CPU time, when the speaker is &amp;quot;BRITISH&amp;quot; and the listener is &amp;quot;AMERICAN.&amp;quot; Table 2 gives the same data for when the speaker is &amp;quot;AMERICAN&amp;quot; and the listener is &amp;quot;BRITISH&amp;quot;.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="106" end_page="106" type="metho">
    <SectionTitle>
6 Analysis
</SectionTitle>
    <Paragraph position="0"> The first remarkable difference between an American speaker vs. a British speaker is that the success rate plummets when Synonyms is turned off. This reflects a bias in WordNet for putting the American words first in the synsets. If the British word is at the end, it will not be reported when Synonyms is off, thus it will not be found, and the miss rate increases.</Paragraph>
    <Paragraph position="2"> Another reason for seemingly low success rates even with both Synonyms and AllowAllSenses on is a sort of polysemy inherent in dealing with WordNet.</Paragraph>
    <Paragraph position="3"> While WordNet isn't really polysemous in its underlying data structure since synsets provide a crisp distinction internally, any agent--human or machine--that uses the ordinary external interface to WordNet makes queries using single words that may have multiple senses (meanings) in WordNet, and thereby may uncover data on more than just one concept.</Paragraph>
    <Paragraph position="4"> It stands to reason that an agent would perform ontological mediation more correctly if that agent were sophisticated enough to understand that WordNet's responses (or the responses of any source that recognizes terms as synonymous) may include multiple distinct synsets, that each synset contains multiple synonymous terms, and that these should be organized as one concept, not many. While this sophistication is the subject of on-going research, presently the Ontological Mediator deals with single terms only, and cannot distinguish among ontology data for multiple word senses. Thus errors occur when there are too many translation candidates and the wrong one is picked.</Paragraph>
  </Section>
</Paper>