<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1042">
  <Title>INFERENCING IN INFORMATION RETRIEVAL</Title>
  <Section position="6" start_page="220" end_page="222" type="concl">
    <SectionTitle>
4. MAPPING QUERIES TO DOCUMENTS
</SectionTitle>
    <Paragraph position="0"> Locating relevant documents in a bibliographic database is a complex process that involves users - their knowledge of the subject matter, their understanding of the conventions of the database, their familiarity with the interface to that database - and it involves the relationship between the meaning of a query and the meaning of a relevant document.</Paragraph>
    <Paragraph position="1"> A query is generally directed to just one, or perhaps a few, aspects of a full document. The relationship between the query and document may be direct, or it may be quite indirect.</Paragraph>
    <Paragraph position="2"> The following examples from the test collection illustrate. (Footnote 3: As noted above, the queries were collected from two medical libraries. They consist primarily of search request forms filled in by users of these libraries. The language is, therefore, the natural language of the user, and it is directed to a human search specialist rather than to a computer interface.)</Paragraph>
    <Paragraph position="3"> A query in the clinical medicine research portion of the collection is, &quot;Causes, treatment, signs and symptoms of depression specifically in the post partum period (i.e., first year after childbirth or traceable to the event of childbirth).</Paragraph>
    <Paragraph position="4"> To include mild depression (also known as 'baby blues') to post partum psychosis.&quot; The title of a relevant citation is, &quot;A prospective study of postpartum psychoses in a high-risk group. Clinical characteristics of the current postpartum episodes.&quot; Here the title clearly answers at least part of the query directly and is, thus, deemed relevant.</Paragraph>
    <Paragraph position="5"> A somewhat less direct correspondence between the query and document is shown by an example from the health services research portion of the collection. The query is, &quot;Attitudes of health personnel as it relates to neoplasms, AIDS, and ALS.&quot; The title of one of the documents retrieved for this query is, &quot;The impact of a program to enhance the competencies of primary care physicians in caring for patients with AIDS.&quot; The abstract, while not directly discussing attitudes of physicians treating AIDS patients, does indicate that of 635 physicians interviewed, only 30 percent &quot;demonstrated adequate knowledge of practices necessary to deal with patients' AIDS-related symptoms and concerns.&quot; Our recent investigations have looked at the degree of similarity between the language of a query and the language of a relevant document. Our experiments involved parsing the query and document texts, extracting the constituent noun phrases, augmenting these with synonyms and other variants, and then attempting to map queries to relevant documents.</Paragraph>
    <Paragraph position="6"> We found that the mappings are almost never straightforward and almost always involve multiple inferences.</Paragraph>
    <Paragraph position="7"> Our current parsing system was able to handle about 45 percent of the 155 queries and about 55 percent of the 3,078 titles in the collection. As we analyzed particular phenomena, we parsed selected portions of some of the abstracts. Both the queries and the titles are generally complex noun phrases, but queries tend to be more elliptical and much less well-formed than titles. Abstracts consist of well-formed English sentences, but some of the structures found there are highly specialized. The following sentence from one of the abstracts illustrates: &quot;At 55-57 days of age, the animals were divided into the following dietary treatment groups: A) 4.5% fat [control fat (CF)]; B) CF + 1.0 MMOL ROA/kg diet (CF + ROA); C) 20.0% fat [high fat (HF)]; D) HF + ROA.&quot; Our investigations have indicated that the mapping between queries and documents involves a range of phenomena. When concepts do not map directly to each other, it is often the case that various types of relations between them are the key to a successful mapping. The synonymy relation is clearly of great importance to robust retrieval systems. The more synonyms or closely related terms there are available at search time, the more likely it is that a user will find the desired documents.</Paragraph>
    <Paragraph position="8"> (For example, see [15] for the view that traditional retrieval systems would be greatly improved by the addition of huge numbers of synonyms, or &quot;aliases&quot;.) The synonymy must, however, go beyond the word level to the phrase level. An example from our experiment illustrates. The fairly simple query is, &quot;Vitamin C and immunity&quot;. The title of a relevant citation is &quot;Effect of ascorbic acid on humoral and other factors of immunity in coal-tar exposed workers.&quot; (Footnote 4: For our purposes a document consists of a title and an abstract.)</Paragraph>
    <Paragraph position="9"> Both the Metathesaurus and the Dorland dictionary list &quot;vitamin C&quot; and &quot;ascorbic acid&quot; as synonyms, so, in this case, parsing the query and title, together with a look-up in our online resources, has the desired effect.</Paragraph>
    <Paragraph position="10"> Another example illustrates some of the more complex relations that may exist between concepts in queries and documents. The query is, &quot;Hematoporphyrin derivative treatment of tumors using a laser.&quot; The first sentence of a relevant citation is, &quot;Photoradiation with photosensitizing porphyrins offers a potentially useful approach to the diagnosis and treatment of certain human cancers.&quot; The system must recognize that hematoporphyrin is a kind of porphyrin, that tumors are related to cancer, and that the use of a laser is implied by photoradiation. Access to the knowledge contained in the Metathesaurus does, in fact, allow these inferences to be made. In the relevant sub-tree of the MeSH hierarchy, one of the constituent vocabularies in the Metathesaurus, hematoporphyrin is shown to be a narrower term than porphyrin, and the isa link is implied. (Figure: MeSH sub-tree containing porphyrin and hematoporphyrin.) By navigating through the interrelationships expressed in the Metathesaurus structure, the system is able to draw the appropriate inferences.</Paragraph>
    <Paragraph position="11"> Another example illustrates a somewhat more complex case.</Paragraph>
    <Paragraph position="12"> The query is, &quot;Ocular complications of Myasthenia Gravis&quot;. A relevant title is, &quot;Myasthenia gravis and recurrent retrobulbar optic neuritis: an unusual combination of diseases&quot;. Myasthenia gravis is a neuromuscular disorder and is generally associated with ocular complications of a muscular nature, such as ptosis, diplopia, and ophthalmoplegia. The optic neuritis mentioned in the title is, however, an inflammatory disorder. The correct inference can be made by referring to the Semantic Network, which has established the potential relation &quot;complicates&quot; between any two co-occurring diseases. In this case, then, the literature has actually instantiated the &quot;complicates&quot; relationship between the two normally unrelated disorders mentioned in the title.</Paragraph>
    <Paragraph position="13"> It is clear that while identifying noun phrases in queries and documents will improve the mapping capabilities of a retrieval system, noun phrase identification alone will not be capable of drawing many of the deeper inferences that are required. A fairly simple example makes the point. The query is, &quot;Thermography for indications other than breast.&quot; An obviously relevant title is, &quot;Use of thermogram in detection of meningitis.&quot; Here a system needs to know that &quot;breast&quot; actually refers to &quot;breast disorders&quot; and that &quot;other than&quot; is a negative operator. As we incorporate more semantics into our parser, some of these inferences should fall out.</Paragraph>
    <Paragraph position="14"> Most often the process of locating a relevant document involves mapping sets of concepts and their interrelationships in queries onto similar sets of concepts and interrelationships in documents. These interrelationships between major concepts may be explicit or they may be implicit. An example of an explicit relation is shown in the following query, &quot;Transillumination light scanning for use in the detection of diseases of the breast.&quot; A relevant title for this query is &quot;The value of diaphanography as an adjunct to mammography in breast diagnostics.&quot; Here the notion of using a particular technique to detect, or diagnose, the disorder is of paramount importance.</Paragraph>
    <Paragraph position="15"> An example of an implicit relationship is shown in the query, &quot;Neoplasia in kidney, heart, and liver transplant recipients.&quot; The user is probably interested in articles that discuss neoplasia arising as a result of the transplant (or more likely the immunosuppressive therapy associated with the transplant), but this is not directly stated. A relevant title for this query is, in fact, &quot;Development of incidence of cancer following cyclosporine therapy.&quot; In many cases, it will not be possible for a system to draw the appropriate inferences without the interactive aid of the user. This is most likely if only noun phrases are presented as a search statement. For example, if a query consists simply of the two terms &quot;rifampin&quot; and &quot;tuberculosis&quot;, multiple interpretations of the relationship between these terms are possible. The Semantic Network, for example, provides the following potential relationships between drugs and diseases: affects, prevents, complicates, treats, diagnoses, and causes.</Paragraph>
    <Paragraph position="16"> If the user is presented with the set of possible relations between drugs and diseases, a choice can be made and the query can be further refined.</Paragraph>
    <Paragraph position="17"> Our work to date has revealed a variety of inferences that must be made if the attempt to map a query to a relevant document is to be successful. We intend to continue our explorations of these phenomena, and we have begun to develop an approach to handling some of them. Our online sources of biomedical information have already proven to be of direct use in making some of the appropriate inferences.</Paragraph>
  </Section>
</Paper>