<?xml version="1.0" standalone="yes"?> <Paper uid="W02-0305"> <Title>MPLUS: A Probabilistic Medical Language Understanding System</Title> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> Abstract Semantic Language (ASL), </SectionTitle> <Paragraph position="0"> implemented within M+. Specifically, BNs are treated as object types within the ASL. There is a &quot;chest anatomy&quot; type, for instance, and a &quot;chest radiology findings&quot; type, corresponding to BNs of those same names. The interpretation of a phrase is an expression in the ASL, containing predicates that state the relation of BN instances to one another, and to the phrase they describe. For instance, the interpretation of &quot;hazy right lower lobe opacity&quot; could be the</Paragraph> <Paragraph position="2"> where #phrase1 identifies a syntactic phrase object, and #find1 and #loc1 are tokens representing instances of the findings BN (instanced with the words &quot;hazy&quot; and &quot;opacity&quot;) and the anatomic BN (instanced with &quot;right&quot;, &quot;lower&quot; and &quot;lobe&quot;), respectively. The relation 'head-of' denotes that the findings BN is the main or &quot;head&quot; BN for that phrase. Conversely, &quot;hazy right lower lobe opacity&quot; can be thought of as a findings-type phrase, with an anatomic-type modifier.</Paragraph> <Paragraph position="3"> This expression captures the abstract or &quot;skeletal&quot; structure of the interpretation, while the BN instances contain the details and specific inferences. One can think of the meaning of an expression like (located-at #find1 #loc1) in abstract terms, e.g. &quot;some-finding located-at some-location&quot;. Alternatively, the meaning of a BN token might be thought of as the most probable interpretive concept within that BN instance. 
In this case, (located-at #find1 #loc1) could mean &quot;*localized-infiltrate located-at *left-lower-lobe&quot;.</Paragraph> <Paragraph position="4"> Because the object types in the ASL are the abstract concept types represented by the BNs, semantic rules formulated in this language constitute an &quot;abstract semantic grammar&quot; (ASG). The ASG recognizes patterns of semantic relations among the BNs, and supports analysis and inference based on those patterns.</Paragraph> <Paragraph position="5"> It also permits rule-based control over the creation, instantiation, and use of the BNs, including defining pathways for information sharing among BNs using virtual evidence (Pearl, 1988).</Paragraph> <Paragraph position="6"> One use of the ASG is in post-parse processing of interpretations. After the M+ parser has constructed an interpretation, post-parse ASG productions may augment or alter this interpretation. One rule instructs &quot;If two pathological conditions exist in a 'consistent-with' relation, and the first condition has a state modifier (i.e. *present or *absent), and the second condition does not, apply the first condition's state to the second condition&quot;.</Paragraph> <Paragraph position="7"> For instance, in the ambiguous sentence &quot;There is no opacity consistent with pneumonia&quot;, if the parser doesn't correctly determine the scope of &quot;no&quot;, it may produce an interpretation in which *pneumonia lacks a state modifier, and is therefore inferred (by default) to be present. This rule correctly attaches (state-of *pneumonia *absent) to this interpretation.</Paragraph> <Paragraph position="8"> One important consequence of the modularity of the M+ BNs, and of the ability to nest them within the ASL, is that M+ can compose BN instances in expressions of arbitrary complexity. 
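The post-parse state-propagation rule described above can be sketched over a minimal rendering of an interpretation graph as a list of binary-predicate tuples. This is an illustrative Python sketch only; M+ itself is implemented in Common Lisp, and the tuple representation and token names here are assumptions, not the system's actual data structures.

```python
# Sketch of the post-parse ASG rule: if (consistent-with A B) holds, A has a
# state-of modifier and B does not, copy A's state to B.  The graph is modeled
# as a list of (relation, arg1, arg2) tuples -- an assumed representation.
def propagate_state(graph):
    out = list(graph)
    for rel, a, b in graph:
        if rel != "consistent-with":
            continue
        a_state = next((s for r, x, s in graph
                        if r == "state-of" and x == a), None)
        b_has_state = any(r == "state-of" and x == b for r, x, _ in graph)
        if a_state is not None and not b_has_state:
            out.append(("state-of", b, a_state))
    return out

# "There is no opacity consistent with pneumonia", with "no" mis-scoped so
# that only the opacity finding (#find1) carries *absent:
graph = [
    ("head-of", "#phrase1", "#find1"),
    ("consistent-with", "#find1", "#find2"),   # #find2 ~ *pneumonia
    ("state-of", "#find1", "*absent"),
]
graph = propagate_state(graph)
# graph now also contains ("state-of", "#find2", "*absent")
```

The rule only fills a missing state; an explicit state on the second condition is left untouched.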
For instance, it is straightforward to represent the multiple anatomic concepts in the phrase &quot;opacity in the inferior segment of the left upper lobe, adjacent to the heart&quot;:</Paragraph> <Paragraph position="10"> where the interpretive concepts of #anat1, #anat2 and #anat3 are *left-upper-lobe, *inferior-segment, and *heart, respectively.</Paragraph> <Paragraph position="11"> The set of binary predicates that constitutes a phrase interpretation in M+ forms a directed acyclic graph; thus we can refer to the interpretation as an interpretation graph. The interpretation graph of a new phrase is formed by unifying the graphs of its subphrases, as described in section 3.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.4 Advantages of Bayesian Networks </SectionTitle> <Paragraph position="0"> As mentioned, a BN training case bears a similarity to a production rule. It would be straightforward to implement the training cases as a set of rules, and apply them to text analysis using a deductive reasoning engine. However, Bayesian reasoning has important advantages over first-order logic, including: 1- BNs are able to respond gracefully to input &quot;noise&quot;. A semantic BN may produce reasonable inferences from phrasal patterns that only partially match any given training case, or that overlap different cases, or that contain words in an unexpected order. For instance, having trained on multi-word phrases containing &quot;opacity&quot;, the single word &quot;opacity&quot; could raise the probabilities of several interpretations such as *localized-infiltrate and *parenchymal-abnormality, both of which are reasonable hypotheses for the underlying cause of opacity on a chest x-ray film.</Paragraph> <Paragraph position="1"> 2- Bayesian inference works bidirectionally; i.e. it is abductive as well as deductive. 
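Both directions can be seen in a toy two-node network over a summary concept and a topic word, in the spirit of the &quot;opacity&quot; / *localized-infiltrate example above. The probabilities below are invented for illustration; they are not taken from the M+ networks.

```python
# Toy two-node Bayesian network (summary concept -> topic word) showing
# bidirectional inference.  All numbers are illustrative assumptions.
P_S = {"*localized-infiltrate": 0.3, "*other": 0.7}       # prior over summary
P_W_given_S = {                                           # word likelihoods
    "*localized-infiltrate": {"opacity": 0.8, "other-word": 0.2},
    "*other":                {"opacity": 0.1, "other-word": 0.9},
}

def p_summary_given_word(w):
    """Abductive direction: observe a word, infer the summary concept."""
    joint = {s: P_S[s] * P_W_given_S[s][w] for s in P_S}
    z = sum(joint.values())
    return {s: p / z for s, p in joint.items()}

# Observing "opacity" raises *localized-infiltrate well above its prior:
post = p_summary_given_word("opacity")
assert post["*localized-infiltrate"] > P_S["*localized-infiltrate"]

# Deductive (predictive) direction: assigning the summary value raises the
# probability of the topic word "opacity" above its marginal:
marginal_opacity = sum(P_S[s] * P_W_given_S[s]["opacity"] for s in P_S)
assert P_W_given_S["*localized-infiltrate"]["opacity"] > marginal_opacity
```

The same mechanics, scaled up to many word nodes sharing concept nodes, underlie the context effects described next.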
If instead of assigning values to word-level nodes one assigns the value of a summary node, the probabilities of word values that correlate highly with that summary will increase. For instance, assigning the value *localized-infiltrate will raise the probability that the topic word is &quot;opacity&quot;.</Paragraph> <Paragraph position="2"> Bi-directional inference provides a means for modeling the effects of lexical context. A value assignment made to one word node can alter value probabilities at unassigned word nodes, in a path of inference that passes through the connecting concept nodes. For instance, if a BN were trained on &quot;right upper lobe&quot; and &quot;left upper lobe&quot;, but had never seen the term &quot;bilateral&quot;, applying the BN to the phrase &quot;bilateral upper lobes&quot; would increase the probabilities of both &quot;left&quot; and &quot;right&quot;, suggesting that &quot;bilateral&quot; is semantically similar to &quot;left&quot; and &quot;right&quot;. This is one approach to guessing the node assignments of unknown words, a step in the direction of automated learning of new training cases.</Paragraph> <Paragraph position="3"> Similarly, if the system encounters a phrase with a misspelling such as &quot;rght upper lobe&quot;, it can note the orthographic similarity of &quot;rght&quot; to &quot;right&quot; and the fact that &quot;right&quot; is highly predicted by the surrounding words, and so determine that &quot;rght&quot; is a misspelling of &quot;right&quot;. 
The spell checker currently used by M+ employs this technique.</Paragraph> </Section> </Section> <Section position="3" start_page="0" end_page="5" type="metho"> <SectionTitle> 3 Generating Interpretation Graphs </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> </SectionTitle> <Paragraph position="0"> As mentioned, in M+ the interpretation graph of a phrase is created by unifying the graphs of its child phrases. High joint probabilities in the resulting BN instances are one source of evidence that the words thus brought together exist in the expected semantic pattern.</Paragraph> <Paragraph position="1"> However, corroborating evidence must be sought in the syntax of the text. Words which appear together in a training phrase may not be in that same relation in a given text. For instance, &quot;no&quot; and &quot;pneumonia&quot; support different conclusions in &quot;no evidence of pneumonia&quot; and &quot;patient has pneumonia with no apparent complicating factors&quot;. M+ therefore only attempts to unify sub-interpretations that appear, on syntactic grounds, to be talking about the same things. This is less constraining than production rules that look for words in a specific order, but more constraining than simply pulling key words out of a string of text.</Paragraph> <Paragraph position="2"> The following are examples of rules used to guide the unification of ASL interpretation graphs. For convenience, several shorthand functional notations are used: If P represents a phrase on the parse chart, rootbn(P) represents the root or head BN instance in P's interpretation graph, and type-of(root-bn(P)) is the BN type of root-bn(P). If A and B are sibling child phrases of parent phrase C, then C = parent-phrase(A,B). 
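The shorthand above can be made concrete with a small data structure. This is an assumed Python rendering for illustration (M+ itself is Lisp-based); the field and node names are invented, and only the same-type case is sketched, with type equality standing in for the full syntactic and semantic unifiability tests.

```python
# Assumed rendering of the notation: root-bn(P), type-of(bn), parent-phrase(A,B).
from dataclasses import dataclass, field

@dataclass
class BNInstance:
    bn_type: str                               # e.g. "anatomy", "findings"
    words: dict = field(default_factory=dict)  # node name -> slotted word

@dataclass
class Phrase:
    text: str
    root: BNInstance                           # head BN instance of the graph

def root_bn(p):
    return p.root

def type_of(bn):
    return bn.bn_type

def parent_phrase(a, b):
    """Unify two sibling sub-interpretations.  Here only same-type roots are
    composed; different-type roots would need a connecting relation (3.2)."""
    if type_of(root_bn(a)) != type_of(root_bn(b)):
        raise ValueError("different-type roots: a connecting path is required")
    merged = BNInstance(type_of(root_bn(a)),
                        {**root_bn(a).words, **root_bn(b).words})
    return Phrase(a.text + " " + b.text, merged)

a = Phrase("bilateral", BNInstance("anatomy", {"laterality": "bilateral"}))
b = Phrase("lower lobe", BNInstance("anatomy", {"vertical": "lower",
                                                "structure": "lobe"}))
c = parent_phrase(a, b)
# root-bn(c) is a single anatomy instance carrying all three word slots
```

In the real system the composed instance would be re-entered as evidence into the BN, whose joint probability then scores how plausible the grouping is.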
Note that for convenience, BN instances in the interpretation graphs in Figures 4 - 6 are represented alternately as the words slotted in those instances, and as the most probable interpretive concepts inferred by those instances.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 Same-type Unification </SectionTitle> <Paragraph position="0"> If phrase A syntactically modifies phrase B, then M+ assumes that some semantic relation exists between A and B. The nature of that relation is partly determinable from type-</Paragraph> <Paragraph position="2"> relation is simply one where root-bn(A) and root-bn(B) are partial descriptions of a single concept. If root-bn(A) and root-bn(B) are unifiable, M+ composes their input to form root-bn(parent-phrase(A,B)).</Paragraph> <Paragraph position="3"> If in addition there are two unifiable same-type BN instances X and Y linked to root-bn(A) and root-bn(B) respectively, via arcs of the same name, then X and Y also describe a single concept, and the arcs describe a single relationship. For instance, if X and Y describe the anatomic locations of root-bn(A) and root-bn(B), and if root-bn(A) and root-bn(B) are partial descriptions of a single &quot;finding&quot;, then X and Y are partial descriptions of a single anatomic location. In figure 4, in the Chest X-ray domain, the phrase &quot;bilateral hazy lower lobe opacity&quot; is interpreted by unifying the interpretations of its subphrases &quot;bilateral hazy&quot; and &quot;lower lobe opacity&quot;. Note that without any corresponding syntactic transformation, this rule brings about a &quot;virtual transformation&quot;, whereby words are grouped together within BN instances in a manner that reflects the conceptual structure of the text. 
In this example &quot;bilateral hazy lower lobe opacity&quot; is treated as (&quot;bilateral lower lobe&quot;) (&quot;hazy opacity&quot;).</Paragraph> </Section> <Section position="3" start_page="0" end_page="5" type="sub_section"> <SectionTitle> 3.2 Different-type Unification </SectionTitle> <Paragraph position="0"> If phrase A syntactically modifies phrase B, and type-of(root-bn(A)) <> type-of(root-bn(B)), then root-bn(A) and root-bn(B) represent different concepts within some semantic relation. M+ uses the ASG to identify that relation and to add it to the interpretation graph in the form of a path of named arcs connecting root-bn(A) and root-bn(B). This path may include implicit connecting BN instances.</Paragraph> <Paragraph position="1"> M+ is written in Common Lisp, with some C routines for BN access. The M+ architecture consists of six basic components: the parser, concept space, rule base, lexicon, ASL inference engine, and Bayesian network component.</Paragraph> <Paragraph position="2"> For instance, to interpret &quot;subdural hemorrhage&quot; in the Head CT domain, M+ attempts to unify the graphs for the subphrases &quot;subdural&quot; and &quot;hemorrhage&quot;, where type-</Paragraph> <Paragraph position="4"> identifies the connecting path for these two types as shown in figure 2, and adds that path to the interpretation as shown in figure 5. Note that this path contains instances of the &quot;observation&quot; and &quot;anatomy&quot; BN types. As mentioned, the parser is an implementation of a bottom-up chart parser with a context-free grammar.</Paragraph> <Paragraph position="5"> The concept space is a table of symbols representing types, objects and relations within the ASL. 
These include BN names, BN node value names, inter-BN relation names, and a small ontology of useful concepts. The rule base contains rules which comprise the syntactic grammar and ASG.</Paragraph> <Paragraph position="6"> The lexicon is a table of Lisp-readable word information entries, obtained in part from the UMLS Specialist Lexicon.</Paragraph> <Paragraph position="7"> The ASL inference engine combines symbolic unification with backward-chaining inference. It can be used to match an ASG pattern against an interpretation graph, and to perform tests associated with grammar rules.</Paragraph> </Section> <Section position="4" start_page="5" end_page="5" type="sub_section"> <SectionTitle> 3.3 Grammar Rule Based Unification </SectionTitle> <Paragraph position="0"> The Bayesian network component utilizes the Norsys Netica(TM) API, and includes a set of Lisp and C language routines for instantiating and retrieving probabilities from BNs.</Paragraph> <Paragraph position="1"> Individual grammar rules in M+ can recognize semantic relations, and add connecting arcs to the interpretation graph. For instance, M+ has a rule which recognizes findings-type phrases connected with strings of the &quot;suggesting&quot; variety, and connects their graphs with a 'consistent-with' arc. This is used to interpret &quot;opacity suggesting possible infarct&quot; in the Head CT domain, as shown in figure 6.</Paragraph> <Paragraph position="2"> Training M+
Porting M+ to a new medical domain involves gathering a corpus of training sentences for the domain, using the Netica(TM) graphical interface to create domain-specific BNs, and generating training cases for the new BNs.</Paragraph> <Paragraph position="3"> The most time-consuming task is the creation of training cases. We have developed a prototype version of a Web-based tool which largely automates this task. 
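One way such a tool can propose tentative training cases is by analogy with an already-interpreted seed phrase. The following is a hypothetical Python sketch, not the actual Web-based tool: the node names and the word-to-node case representation are invented for illustration, and the guessing heuristic shown (assign an unknown word to a node left unfilled by the known words) is deliberately minimal.

```python
# Hypothetical sketch: propose a tentative BN training case for a new phrase
# from a seed case mapping words to BN nodes.  Node names are invented.
seed_case = {"right": "laterality", "upper": "vertical", "lobe": "structure"}

def propose_case(new_phrase, seed):
    case, unknown = {}, []
    for w in new_phrase.split():
        if w in seed:
            case[w] = seed[w]          # known word keeps its node assignment
        else:
            unknown.append(w)
    # Nodes not filled by known words are candidates for the unknown words;
    # the human trainer reviews and corrects these guessed assignments.
    free_nodes = [n for n in seed.values() if n not in case.values()]
    for w, n in zip(unknown, free_nodes):
        case[w] = n
    return case

# "left" is unknown, so it is guessed into the vacant "laterality" node:
case = propose_case("left upper lobe", seed_case)
# case == {"upper": "vertical", "lobe": "structure", "left": "laterality"}
```

Each proposed case is emitted for human review rather than trusted outright, which matches the review-and-correct workflow described below.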
The basic idea is to enable M+ to guess the BN value assignments of unknown words, then use it to parse phrases similar to phrases already seen. For instance, having been trained on the phrase &quot;right upper lobe&quot;, the parser is able to produce reasonable parses, with some &quot;guessed&quot; value assignments, for &quot;left upper lobe&quot;, &quot;right middle lobe&quot;, &quot;bilateral lungs&quot;, etc. The BN assignments produced by the parse are output as tentative new cases to be reviewed and corrected by the human trainer.</Paragraph> <Paragraph position="4"> The training process begins with an initial set of interpreted &quot;seed&quot; phrases. From this set, the tool applies the parser to similar phrases, and so semi-automatically traverses ever-widening, semantically contiguous areas within the space of corpus phrases. As the training proceeds, the role of the human trainer increasingly becomes one of correcting and interpreting semantic patterns the system is able to discover on its own.</Paragraph> <Paragraph position="5"> To parse phrases containing unknown words, M+ uses a technique based on a variation of the vector space model of lexical semantic similarity (Manning and Schütze, 1999). As M+ encounters an unknown word, it gathers a list of training corpus words judged similar to that word, as predicted by the vector space measure. It then identifies BN nodes whose known values significantly overlap with this list, and provisionally assigns the unknown word as a new value for those nodes. The assignment resulting in the best parse tree is selected for the new provisional training case.</Paragraph> </Section> </Section> </Paper>