<?xml version="1.0" standalone="yes"?> <Paper uid="C96-1025"> <Title>Processing Metonymy: a Domain-Model Heuristic Graph Traversal Approach*</Title> <Section position="4" start_page="137" end_page="139" type="metho"> <SectionTitle> 3 Method </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="137" end_page="137" type="sub_section"> <SectionTitle> 3.1 Rationale </SectionTitle> <Paragraph position="0"> The input to the semantic analyser is the syntactic representation of a sentence produced by a previous large-coverage syntactic analyser (Bérard-Dugourd et al., 1989). This representation connects words, or predicates, with grammatical relations such as subject, object, oblique object, modifier, etc. The output of the semantic analyser is a conceptual graph on which pragmatic inferences are performed to enrich the representation.</Paragraph> <Paragraph position="1"> In the semantic lexicon, each word points to one or more conceptual representations. The grammatical link between two words in a sentence expresses a conceptual link between their two associated conceptual counterparts. The task of the semantic analyser is to identify this conceptual link.</Paragraph> <Paragraph position="2"> Rather than including the knowledge needed for this task in the semantic lexicon, or in a specific rule base, the program will examine the domain knowledge to resolve the link. The method relies on a heuristic path search algorithm that exploits the graphical aspects of the conceptual graphs formalism.</Paragraph> </Section> <Section position="2" start_page="137" end_page="138" type="sub_section"> <SectionTitle> 3.2 Domain knowledge </SectionTitle> <Paragraph position="0"> The main domain knowledge elements consist of the domain ontology (Fig. 
1), which is a subsumption hierarchy of concept types (henceforth simply 'types') and of relation types, and of a set of reference models attached to the main types.</Paragraph> <Paragraph position="1"> The reference model of a type represents knowledge about this type as a conceptual graph (Fig. 2). Basically, a conceptual graph is a bipartite graph with concept nodes (or concepts) labeled with a type plus an optional referent, and relation nodes labeled with relation types (Chein and Mugnier, 1992). A model of a given type has an identified head concept with the same type, and the network of its related concepts represents its associated knowledge. Since types are organised in an IS-A hierarchy, this knowledge is also inherited.</Paragraph> </Section> <Section position="3" start_page="138" end_page="138" type="sub_section"> <SectionTitle> 3.3 Semantic lexicon </SectionTitle> <Paragraph position="0"> The semantic analyser relies on a two-tier semantic lexicon: one for predicates, the other for grammatical relations. Predicates map to conceptual graphs; most of them are reduced to one concept, since most of the words in the lexicon are technical terms for which a type exists. Figure 3 reports some lexical entries.</Paragraph> <Paragraph position="1"> It is difficult to map grammatical relations to static, predefined conceptual representations, since their meaning in the domain depends on their context of use, and mostly on the predicates they link. Besides, one cannot hope to foresee all the possible uses of such a relation, partly because of the use of metonymy. The conceptual representation of an actual grammatical link will therefore be computed dynamically by the semantic analyser using its context: the linked predicates and domain knowledge. However, each grammatical relation may have conceptual preferences for types or for conceptual relations. These preferences are associated with the grammatical relation. 
Our grammatical relations include oblique complements, so that prepositions in our semantic lexicon are expressed under this second paradigm (Fig. 3).</Paragraph> </Section> <Section position="4" start_page="138" end_page="139" type="sub_section"> <SectionTitle> 3.4 Algorithm </SectionTitle> <Paragraph position="0"> Given an input triple predicate, grammatical relation, predicate (P1; Gr; P2), the semantic analyser first replaces the two predicates with their semantic entries -- two conceptual graphs. It then endeavours to link them, that is, to find a concept-level relation between their two head concepts C1 and C2 that, first, is compatible with the semantic preferences of grammatical relation Gr, and, second, conforms to the representational canon made of the reference models.</Paragraph> <Paragraph position="1"> The basic idea is to project the two head concepts onto the domain knowledge and find a plausible concept-level relation between the two.</Paragraph> <Paragraph position="2"> We implement this by heuristic graph traversal through the reference models and the type hierarchy, looking for a chain made of concepts and conceptual relations (i.e. a linear conceptual graph), which could link concepts of the same types as C1 and C2 and at the same time would satisfy the conceptual preferences of Gr. Semantic analysis then consists in solving recursively every grammatical link starting from the sentence head predicate and then joining the obtained conceptual chains to build the conceptual representation of the whole sentence. We focus here only on the link resolution algorithm.</Paragraph> <Paragraph position="3"> We consider that each predicate Pi is associated with the head concept Ci of a model Mi. Let Ti be the type of Ci. We also assume a partial order on types. We focus here only on the strategy for producing the set of all possible chains between C1 and C2. We can use three methods of increasing complexity to find chains to link C1 and C2: 1. 
Concept fusion: the two concepts may be redundant. If T1 ≤ T2 or T1 ≥ T2, then C1 and C2 could be merged, and an empty chain is returned.</Paragraph> <Paragraph position="4"> 2. Concept inclusion: a concept may be &quot;included&quot; in the other's model.</Paragraph> <Paragraph position="5"> (a) For every concept C' of type T' in M1 such that T' ≥ T2, every path between C1 and C' in M1 is a returned chain.</Paragraph> <Paragraph position="6"> (b) For every concept C' of type T' in M2 such that T' ≥ T1, every path in M2 between C' and C2 is a returned chain.</Paragraph> <Paragraph position="7"> 3. Model join: two arbitrary concepts in the two models could be joined.</Paragraph> <Paragraph position="8"> For every pair of concepts (C'1, C'2) where C'i of type T'i is in Mi, and such that T'1 ≤ T'2 or T'1 ≥ T'2, all the paths Paths1 between C1 and C'1 in M1 and Paths2 between C'2 and C2 in M2 are produced. Then, for every pair</Paragraph> <Paragraph position="10"> (p1, p2) of paths from Paths1 and Paths2, the chain made of the two paths where last(p1) is joined to first(p2) is returned.</Paragraph> <Paragraph position="11"> At this point, we are provided with all chains extracted from the pair of models (M1, M2).</Paragraph> <Paragraph position="12"> 3.4.3 Model identification.</Paragraph> <Paragraph position="13"> The models that associate knowledge to a given predicate P can be ranked according to their level of generality. The most specific model is the predicate definition in the semantic lexicon. The next one is the reference model associated with the type T of the head concept of the definition. Then, the following models are the reference models inherited along the ontology through supertypes of T. As the type hierarchy is, in our system, a tree (Bouaud et al., 1995), the models for a predicate are strictly ordered. Considering two grammatically linked predicates, the product of their models yields the model pairs that can potentially be used to look for possible chains. 
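The ranking of models and the enumeration of model pairs described above can be sketched as follows. This is a minimal illustration in Python, not the authors' Common Lisp implementation: the function names `ranked_models` and `model_pairs`, and the dictionary encodings of the type tree (`parent`) and of the reference models (`reference_models`), are assumptions made for the example. Pairs are sorted with the comparison the paper gives for model pairs (smaller maximum rank first, then smaller minimum rank); pairs that this comparison leaves unordered simply keep their enumeration order.

```python
from itertools import product


def ranked_models(definition, type_name, parent, reference_models):
    """Return the models of a predicate, most specific first.

    Rank 0 is the lexicon definition. The following ranks are the reference
    models of the head type and of its supertypes, found by walking up a
    tree-shaped hierarchy. Types without a reference model are skipped.
    All argument names and encodings are hypothetical.
    """
    models = [definition]
    current = type_name
    while current is not None:
        if current in reference_models:
            models.append(reference_models[current])
        current = parent.get(current)  # None once the root is passed
    return models


def model_pairs(models1, models2):
    """Enumerate every (rank1, rank2, model1, model2) tuple.

    The result is sorted so that more specific pairs come first:
    smaller max(rank1, rank2), then smaller min(rank1, rank2).
    Python's sort is stable, so ties keep the enumeration order.
    """
    pairs = [
        (r1, r2, m1, m2)
        for (r1, m1), (r2, m2) in product(enumerate(models1), enumerate(models2))
    ]
    pairs.sort(key=lambda p: (max(p[0], p[1]), min(p[0], p[1])))
    return pairs
```

With a toy hierarchy in which type A has supertype B, calling `ranked_models("def_a", "A", {"A": "B", "B": None}, {"A": "ref_A", "B": "ref_B"})` gives the definition first, then the reference models of A and B. Exploring the output of `model_pairs` from the front corresponds to starting from the most specific pair. 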
Such pairs are structured by a partial order based on the generality rank of their members: a model pair (m1, m2) is more specific than (m'1, m'2) if max_rank(m1, m2) is less than max_rank(m'1, m'2), or, if these are equal, if min_rank(m1, m2) is less than min_rank(m'1, m'2). 3.4.4 Heuristic chain selection.</Paragraph> <Paragraph position="14"> At this stage, we are provided with all the possible chains between P1 and P2 extracted from their models. The remaining problem is to choose the most appropriate chain to substitute for Gr.</Paragraph> <Paragraph position="15"> After some experimentation, we chose the following scheme.</Paragraph> <Paragraph position="16"> The best chain is selected according to five heuristic criteria: (1) satisfiability of Gr preferences; (2) most specific model pair, i.e., the use of the most specific knowledge associated with words is preferred; (3) simplest chain production method (see 3.4.2); (4) most specific or highest priority of Gr preferences; (5) shorter chain length. When multiple chains remain in competition, one is selected randomly.</Paragraph> <Paragraph position="17"> To reduce search, the link resolution strategy does not consider all possible chains, and implements the first two criteria directly in the chain production step. Chains that violate Gr preferences are discarded, and model pairs are explored starting from the most specific pair.</Paragraph> </Section> <Section position="5" start_page="139" end_page="139" type="sub_section"> <SectionTitle> 3.5 An example </SectionTitle> <Paragraph position="0"> Let us illustrate the resolution on example (2) (an angioplasty of segment II). The input triple is (angioplastie_f; de_f; segment_ii_f). The corresponding types, Angioplasty and Segment_II, are not compatible and the &quot;fusion&quot; method fails. 
The &quot;inclusion&quot; method also fails since no model for angioplastie_f includes a concept compatible with Segment_II, and no model for segment_ii_f includes a concept compatible with Angioplasty.</Paragraph> <Paragraph position="1"> However, with the &quot;join&quot; method, the algorithm identifies 6063 possible chains that satisfy the preferences attached to preposition de (Fig. 3).</Paragraph> <Paragraph position="2"> The selected chain uses the reference model of Angioplasty (Fig. 2) and the definition graph for segment_ii_f (Fig. 3), which are connected on concept Artery_Segment. The resulting conceptual representation joins the two corresponding paths:</Paragraph> <Paragraph position="4"> This representation reflects the fact that in the context of an 'angioplasty', 'segment II' is considered from the point of view of the physical artery segment the angioplasty is to act upon (instead of the spatial notion Segment_II expresses).</Paragraph> </Section> </Section> <Section position="5" start_page="139" end_page="140" type="metho"> <SectionTitle> 4 Implementation and results </SectionTitle> <Paragraph position="0"> This analyser has been implemented on top of a conceptual graph processing package embedded in Common Lisp. In its current state, the ontology contains about 1,800 types and 300 relation types; over 500 types have their own reference model; the lexicon defines over 1,000 predicates and about 150 grammatical relations and prepositions. The analyser correctly handles typical expressions found in our texts, including examples (2)-(5) (see Table 1). The complete processing chain has been tested on a set of 37 discharge summaries (393 sentences, 5,715 words) (Zweigenbaum et al., 1995). 
This corpus included development texts, so the results are somewhat optimistic; on the other hand, the system is in an incomplete state of development. The test consisted in code assignment and answering a fixed questionnaire, the gold standard being given by health care professionals. Overall recall and precision were measured at 48 % and 63 % on the coding task, and 66 % and 77 % on the questionnaire task. Table 1 (columns: #; phrase; total chains; method; models; partial chains selected). (2) 'angioplasty of segment II'; 6063; join; Angioplasty and 'segment II' definition; [Angioplasty]→(purported_obj)→[Artery_Segment] and [Artery_Segment]←(zone_of)←[Spatial_Object]→(spatial_role)→[Segment_II]. (3) 'angioplasty of a coronary artery'; 2387; inclusion; Angioplasty; [Angioplasty]→(purported_obj)→[Artery_Segment]←(part)←[Coronary_Artery]. (4) 'angioplasty of Mr X'; 3633; inclusion; Angioplasty; [Angioplasty]→(purported_obj)→[Artery_Segment]←(part)←[Human_Being]. (5) 'angioplasty of a stenosis'; 2217; inclusion; Angioplasty; [Angioplasty]→(purported_obj)→[Artery_Segment]←(i .... Ires)←[Stenosis].</Paragraph> <Paragraph position="1"> No evaluation has been performed on more basic components of the system; we can however provide statistics drawn from the global test for the semantic analyser. For 274 sentences received, the link resolution procedure was called on 8,749 grammatical links and explored 247,877 chains, with an average of 28 chains per call and 904 per sentence. The number of paths found depends heavily on the richness of the models used, which varies with the types involved. For instance, the model for type Angioplasty (involved in Table 1) is central in the domain. 
It is the most complex in the knowledge base and contains 54 concepts and 78 relations, which accounts for the greater number of paths found in these examples.</Paragraph> <Paragraph position="2"> However, inadequate expansions are sometimes made due to lack of models, or to their complexity, which makes the heuristic principles not selective enough. Such limitations also stem from a lack of &quot;actual&quot; semantic knowledge. The semantic analyser goes directly from grammatical relations to conceptual relations without any intermediate semantic representation. Useful information, such as the argumental or thematic structure of predicates (e.g., Mel'čuk et al. (1995), Pugeault et al. (1994)), could probably overcome some of its shortcomings.</Paragraph> </Section> </Paper>