<?xml version="1.0" standalone="yes"?> <Paper uid="W02-0305"> <Title>MPLUS: A Probabilistic Medical Language Understanding System</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper describes the basic philosophy and implementation of MPLUS (M+), a robust medical text analysis tool that uses a semantic model based on Bayesian Networks (BNs). BNs provide a concise and useful formalism for representing semantic patterns in medical text, and for recognizing and reasoning over those patterns. BNs are noise-tolerant, and facilitate the training of M+.</Paragraph> <Paragraph position="1"> Introduction In the field of medical informatics, computerized tools are being developed that depend on databases of clinical information. These include alerting systems for improved patient care, data mining systems for quality assurance and research, and diagnostic systems for more complex medical decision support. These systems require data that is appropriately structured and coded. Since a large portion of the information stored in patient databases is in the form of free text, manually coding this information in a format accessible to these tools can be time consuming and expensive. In recent years, natural language processing (NLP) methodologies have been studied as a means of automating this task. There have been many projects involving automated medical language analysis, including deciphering pathology reports (Smart and Roux, 1995), physical exam findings (Lin et al., 1991), and radiology reports (Friedman et al., 1994; Ranum, 1989; Koehler, 1998).</Paragraph> <Paragraph position="2"> M+ is the latest in a line of NLP tools developed at LDS Hospital in Salt Lake City, Utah. Its predecessors include SPRUS (Ranum, 1989) and SymText (Koehler, 1998). These tools have been used in the realm of radiology reports, admitting diagnoses (Haug et al., 1997), radiology utilization review (Fiszman, 2002) and syndromic detection (Chapman et al., 2002). Some of the character of these tools derives from common characteristics of radiology reports, their initial target domain. Because of the off-the-cuff nature of radiology dictation, a report will frequently contain text that is telegraphic or otherwise not well formed grammatically. Our desire was not only to take advantage of phrasal structure to discover semantic patterns in text, but also to be able to infer those patterns from lexical and contextual cues when necessary.</Paragraph> <Paragraph position="3"> Most NLP systems capable of semantic analysis employ representational formalisms with ties to classical logic, including semantic grammars (Friedman et al., 1994), unification-based semantics (Moore, 1989), and description logics (Romacker and Hahn, 2000). 
M+ and its predecessors employ Bayesian Networks (Pearl, 1988), a methodology outside this tradition.</Paragraph> <Paragraph position="4"> This study discusses the philosophy and implementation of M+, and attempts to show how Bayesian Networks can be useful in medical text analysis.</Paragraph> <Paragraph position="5"> The M+ Semantic Model</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Semantic Bayesian Networks </SectionTitle> <Paragraph position="0"> M+ uses Bayesian Networks (BNs) to represent the basic semantic types and relations within a medical domain such as chest radiology reports.</Paragraph> <Paragraph position="1"> M+ BNs are structurally similar to semantic networks, in that they are implemented as directed acyclic graphs, with nodes representing word and concept types, and links representing relations between those types. BNs can also be viewed as frames, or slot-filler representations (Minsky, 1975). Each node is treated as a variable, with an associated list of possible values. For instance, a node representing &quot;disease severity&quot; might include the possible values {&quot;severe&quot;, &quot;moderate&quot;, &quot;mild&quot;}. Each value has a probability, either assigned or inferred, of being the true value of that node.</Paragraph> <Paragraph position="2"> In addition to providing a framework for representation, a BN is also a probabilistic inference engine. The probability of each possible value of a node is conditioned on the probabilities of the values of neighboring nodes, through a training process that learns a Bayesian joint probability function from a set of training cases.</Paragraph> <Paragraph position="3"> After a BN is trained, a node can be assigned a value by setting the probability of that value to 1, and the probabilities of the alternate values to 0. This results in a cascading update of the value probabilities in all unassigned nodes, in effect predicting what the values of the unassigned nodes should be, given the initial assignments. The sum of the probabilities for the values of a given node is constrained to equal 1, making the values mutually exclusive, and reflecting uncertainty if more than one value has a nonzero probability.</Paragraph> <Paragraph position="4"> Note that in this paper, &quot;BN instance&quot; refers to the state of a BN after assignments have been made.</Paragraph> <Paragraph position="5"> A training case for a BN is a list of node / value assignments. For instance, consider a simple BN for chest anatomy phrases, as shown in Figure 1.</Paragraph> <Paragraph position="6"> Figure 1. BN for simple chest anatomy phrases.</Paragraph> <Paragraph position="7"> A training case for this BN applied to the phrase &quot;right upper lobe&quot; could be a set of assignments pairing each word with its node (e.g. side = &quot;right&quot;), together with the assignment interpretation = *right-upper-lobe.</Paragraph> <Paragraph position="9"> In the context of Bayesian learning, this case has an effect similar to a production rule which states &quot;If you find the words 'right', 'upper' and 'lobe' together in a phrase, infer the meaning *right-upper-lobe&quot;. After training on this case, assigning one or more values from this case would increase the probabilities of the other values; for instance, assigning side = &quot;right&quot; would increase the probability of the value interpretation = *right-upper-lobe.</Paragraph>
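To make the training case and the cascading update concrete, consider the toy sketch below. This is not the M+ implementation: it assumes the open-source pgmpy library is available, and the node names (side, supinf, interpretation), their states, and all probabilities are invented for illustration.

```python
# Toy Figure-1-style anatomy BN; node names, states, and probabilities
# are invented for illustration, not taken from M+.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

model = BayesianNetwork([("side", "interpretation"),
                         ("supinf", "interpretation")])

side = TabularCPD("side", 3, [[0.3], [0.3], [0.4]],
                  state_names={"side": ["right", "left", "null"]})
supinf = TabularCPD("supinf", 3, [[0.3], [0.3], [0.4]],
                    state_names={"supinf": ["upper", "lower", "null"]})

# P(interpretation | side, supinf); columns follow pgmpy's ordering of
# parent state combinations: (right,upper), (right,lower), (right,null), ...
interp = TabularCPD(
    "interpretation", 3,
    [[0.90, 0.02, 0.30, 0.02, 0.02, 0.02, 0.30, 0.02, 0.05],   # *right-upper-lobe
     [0.02, 0.02, 0.02, 0.02, 0.90, 0.30, 0.02, 0.30, 0.05],   # *left-lower-lobe
     [0.08, 0.96, 0.68, 0.96, 0.08, 0.68, 0.68, 0.68, 0.90]],  # *null
    evidence=["side", "supinf"], evidence_card=[3, 3],
    state_names={"interpretation": ["*right-upper-lobe", "*left-lower-lobe", "*null"],
                 "side": ["right", "left", "null"],
                 "supinf": ["upper", "lower", "null"]})

model.add_cpds(side, supinf, interp)
model.check_model()

# "Assigning" side = "right" fixes P(side = right) at 1 and cascades to the
# unassigned nodes: P(interpretation = *right-upper-lobe) rises.
infer = VariableElimination(model)
print(infer.query(["interpretation"]))                              # priors only
print(infer.query(["interpretation"], evidence={"side": "right"}))  # after assignment
```

With the probabilities above, the marginal for *right-upper-lobe rises from roughly 0.17 with no assignments to roughly 0.40 once side = "right" is assigned, which is the cascading behavior described in the text.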
<Paragraph position="10"> Interpretive concepts such as *right-upper-lobe are atomic symbols that are either invented by the human trainer, or else obtained from a medical knowledge database such as the UMLS Metathesaurus. By convention, concept names in M+ are preceded with an asterisk.</Paragraph> <Paragraph position="11"> A medical domain is represented in M+ as a network of BNs, with word-level and lower concept-level BNs providing input to higher concept-level BNs. Figure 2 shows a partial view of the network of BNs used to model the M+ Head CT (Computerized Tomography) domain, instantiated with the phrase &quot;temporal subdural hemorrhage&quot;. Each BN instance is shown with a list of nodes and most probable values. Note that input nodes of higher BNs in this model have the same name as, and take input from, the summary nodes of lower BNs.</Paragraph> <Paragraph position="12"> Word-level BNs have input nodes named &quot;head&quot;, &quot;mod1&quot; and &quot;mod2&quot;, corresponding to the syntactic head and modifiers of a phrase.</Paragraph> <Paragraph position="13"> Each node in a BN has a distinguished &quot;null&quot; value, whose meaning is that no information relevant to that node, explicit or inferable, is present in the represented phrase.</Paragraph> <Paragraph position="14"> Figure 2. Partial view of the network of BNs for the M+ Head CT domain, instantiated with the phrase &quot;temporal subdural hemorrhage&quot;.</Paragraph> <Paragraph position="15"> One way in which M+ differs from its predecessor SymText (Koehler, 1998) is in the size and modularity of its semantic BNs. The SymText BNs group observation and disease concepts together with state (&quot;present&quot;, &quot;absent&quot;), change-of-state (&quot;old&quot;, &quot;chronic&quot;), anatomic location and other concept types. M+ trades the inferential advantages of such monolithic BNs for the modularity and composability of smaller BNs such as those shown in Figure 2. 
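The layering of word-level and concept-level BN instances can be pictured with a small data structure. The sketch below is a simplification under stated assumptions: the class, the concept symbols, and all node names other than head/mod1/mod2 and the summary node are invented, not taken from the M+ code.

```python
# Illustrative data structure for layered BN instances; class, concept
# symbols, and non-head/mod node names are invented, not from M+.
from dataclasses import dataclass, field
from typing import Dict

NULL = "null"  # distinguished value: nothing relevant in the phrase

@dataclass
class BNInstance:
    name: str
    # node -> most probable value; unfilled nodes default to "null"
    values: Dict[str, str] = field(default_factory=dict)

    def value(self, node: str) -> str:
        return self.values.get(node, NULL)

# Word-level instance: input nodes "head", "mod1", "mod2" (per the text),
# plus a summary node holding the inferred concept (symbol invented here).
word_level = BNInstance("anatomy_word_bn", {
    "head": "hemorrhage", "mod1": "temporal", "mod2": "subdural",
    "summary": "*subdural-hemorrhage",
})

# A higher concept-level BN reads the lower BN's summary node through an
# input node of the same name, as described for Figure 2.
concept_level = BNInstance("head_ct_finding_bn", {
    "summary": word_level.value("summary"),
})

print(concept_level.value("summary"))  # "*subdural-hemorrhage"
print(concept_level.value("change"))   # "null": nothing relevant in the phrase
```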
Figure 3 shows a single instance of the SymText Chest Radiology Findings BN, instantiated with the sentence &quot;There is dense infiltrative opacity in the right upper lobe&quot;.</Paragraph> <Paragraph position="16">
*observations : *localized upper lobe infiltrate (0.888)
*state : *present (0.989)
state term : null (0.966)
*topic concept : *poorly-marginated opacity (0.877)
topic term : opacity (1.0)
topic modifier : infiltrative (1.0)
*measurement concept : *null (0.999)
measurement term : null (0.990)
first value : null (0.998)
second value : null (0.999)
values link : null (0.999)
size descriptor : null (0.999)
*tissue concept : *lung parenchyma (0.906)
tissue term : alveolar (1.0)
*severity concept : *high severity (0.893)
severity term : dense (1.0)
*anatomic concept : *right upper lobe (0.999)
*anatomic link concept : *involving (1.0)
anatomic link term : in (1.0)
anatomic location term : lobe (1.0)
anatomic location modifier : null (0.999)
anatomic modifier side : right (1.0)
anatomic modifier superior/inferior : upper (1.0)
anatomic modifier lateral/medial : null (0.999)
anatomic modifier anterior/posterior : null (0.999)
anatomic modifier central/peripheral : null (0.955)
*change concept : *null (0.569)
change with time : null (0.567)
change degree : null (0.904)
change quality : null (0.923)
Figure 3. Instance of the SymText Chest Radiology Findings BN.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Parse-Driven BN Instantiation </SectionTitle> <Paragraph position="0"> M+ BNs are instantiated as part of the syntactic parse process. M+ syntactic and semantic analyses are interleaved, in contrast with NLP systems that perform semantic analysis after the parse has finished.</Paragraph> <Paragraph position="1"> M+ uses a bottom-up chart parser with a context-free grammar (CFG). As a word such as &quot;right&quot; is recognized by the parser, a word-level phrase object is created and a BN instance containing the assignment side = &quot;right&quot; is attached to that phrase. As larger grammatical patterns are recognized, the BN instances attached to subphrases within those patterns are unified and attached to the new phrases, as described in section 3. The result of this process is a set of completed BN instances, as illustrated in Figure 2. Each BN instance is a template containing word and concept-level value assignments, and the interpretive concepts inferred from those assignments. The templates themselves are nested in a symbolic expression, as described in section 2.3, to facilitate composing multiple BN instances in representations of arbitrary complexity.</Paragraph> <Paragraph position="2"> Each phrase recognized by the parser is assigned a probability, based on a weighted sum of the joint probabilities of its associated BN instances, and adjusted for various syntactic and semantic constraint violations. Phrases are processed in order of probability; thus the parse involves a semantically-guided best-first search. Syntactic and semantic analyses in M+ are mutually constraining. If a grammatically possible phrase is uninterpretable, i.e., if its subphrase interpretations cannot be unified, it is rejected. If the interpretation has a low probability, the phrase is less likely to appear in the final parse tree. Conversely, interpretations are constructed as phrases are recognized. The exception to this rule is when an ungrammatical fragment of text is encountered; M+ then uses a semantically-guided phrase repair procedure not described in this paper.</Paragraph>
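As a rough sketch of the semantically-guided best-first search described in this subsection, the following Python uses a priority queue over phrase probabilities. Every name, weight, and data layout here is an assumption, and BN scoring, CFG expansion, and unification are stubbed out; this is not the M+ implementation.

```python
# Sketch of semantically-guided best-first parsing; all names, weights,
# and stubs are invented for illustration, not taken from M+.
import heapq
from itertools import count
from typing import List, Optional

def bn_joint_probability(inst: dict) -> float:
    """Stand-in for the joint probability of a BN instance."""
    return inst["p"]

def unify(instances: List[dict]) -> Optional[List[dict]]:
    """Stand-in for unification of subphrase BN instances; None means the
    phrase is uninterpretable and must be rejected."""
    return instances

def grammar_extensions(phrase: dict) -> List[dict]:
    """Stub for CFG-driven combination of a phrase with its neighbors."""
    return []

def phrase_score(instances: List[dict], violations: int,
                 weight: float = 1.0, penalty: float = 0.1) -> float:
    # Weighted sum of BN instance joint probabilities, adjusted downward
    # for syntactic and semantic constraint violations.
    return weight * sum(map(bn_joint_probability, instances)) - penalty * violations

def best_first_parse(initial_phrases: List[dict]) -> List[dict]:
    tie = count()  # tie-breaker so dicts are never compared by heapq
    agenda = [(-p["score"], next(tie), p) for p in initial_phrases]
    heapq.heapify(agenda)
    finished = []
    while agenda:  # most probable phrase is processed first
        _, _, phrase = heapq.heappop(agenda)
        finished.append(phrase)
        for bigger in grammar_extensions(phrase):
            merged = unify(bigger["bn_instances"])
            if merged is None:
                continue  # grammatically possible but uninterpretable: reject
            bigger["score"] = phrase_score(merged, bigger["violations"])
            heapq.heappush(agenda, (-bigger["score"], next(tie), bigger))
    return finished
```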
</Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 The M+ Abstract Semantic Language </SectionTitle> <Paragraph position="0"> The probabilistic reasoning afforded by BNs is superior to classical logic in important ways (Pearl, 1988). However, BNs are limited in expressive power relative to first-order logics (Koller and Pfeffer, 1997), and commercially available implementations lack the flexibility of symbolic languages. Friedman et al. have made considerable headway in giving BNs many useful characteristics of first-order languages, in what they call probabilistic relational models, or PRMs (e.g., Friedman et al., 1999).</Paragraph> <Paragraph position="1"> While we are waiting for industry-standard PRMs, we have tried to make our semantic BNs more useful by combining them with a first-order language, called the M+ Abstract Semantic Language.</Paragraph> </Section> </Section> </Paper>