<?xml version="1.0" standalone="yes"?> <Paper uid="H93-1029"> <Title>VALIDATION OF TERMINOLOGICAL INFERENCE IN AN INFORMATION EXTRACTION TASK</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 1. SOME BACKGROUND </SectionTitle> <Paragraph position="0"> Alembic is a natural language-based information extraction system that has been under development for about one year. As with many such systems, the information extraction process in Alembic occurs through pattern matching against the semantic representation of sentences.</Paragraph> <Paragraph position="1"> These representations are themselves derived from parsing the input text, in our case with a highly lexicalized neocategorial grammar \[1\].</Paragraph> <Paragraph position="2"> Experience has shown that this kind of approach can yield impressive performance levels in the data extraction task (see \[18\]). We have found--as have others--that meaningful results can be obtained despite only having sketchy sentence semantics (as can happen when there are widespread gaps in the lexicon's semantic assignments).</Paragraph> <Paragraph position="3"> In addition, because the parsing process normalizes the sentence semantics to a significant degree, the number of extraction patterns can be relatively small, especially compared to approaches that use only rudimentary parsing.</Paragraph> <Paragraph position="4"> Strict semantic pattern-matching is unattractive, however, in cases that presume some degree of inference. Consider the following example of an East-West joint venture: \[...\] Samsung signed an agreement with Soyuz, the external-trade organization of the Soviet Union, to swap Korean TV's and VCR's for pig iron from the Soviet Union.</Paragraph> <Paragraph position="0"> What makes this sentence an example of the given joint venture concept is an accumulation of small inferences: that Soyuz is a Soviet entity, that signing an agreement designates agreement between the signing parties, and that the resulting agreement holds between a Soviet and non-Soviet entity. Such examples suggest that it is far preferable to approach the extraction problem through a set of small inferences, rather than through some monolithic extraction pattern. This notion has been embodied in a number of earlier approaches, e.g. \[11\] or \[17\].</Paragraph> <Paragraph position="1"> The inferential approach we were interested in bringing to bear on this problem is the RHO framework. RHO is a terminological classification framework that ultimately descends from KL-ONE. Unlike most recent such systems, however, RHO focuses on terminological inference (rather than subsumption). And whereas most KL-ONE descendants sacrifice completeness for computational tractability, inference in RHO is complete in polynomial time if terminological axioms meet a normal form criterion.</Paragraph> <Paragraph position="2"> Nevertheless, before embarking on a significant development effort to implement the RHO framework under Alembic, we wanted to verify that the framework was up to the data extraction task. In particular, we were keen to ensure that the theoretical criterion that guarantees polynomial time completeness for RHO was actually met in practice.
Towards this end, my colleagues and I undertook an extensive empirical study whose goal was, among others, to validate this criterion.</Paragraph> <Paragraph position="3"> The present paper is a summary of our findings, with a special focus on RHO itself and on the validation task. We provide some suggestive interpretations of these findings, and touch on current and ongoing work towards bringing RHO to bear on the extraction task in Alembic.</Paragraph> </Section> <Section position="4" start_page="0" end_page="150" type="metho"> <SectionTitle> 2. THE RHO FRAMEWORK </SectionTitle> <Paragraph position="0"> The RHO framework, as noted above, arose in reaction to standard approaches to terminological reasoning, as embodied in most descendants of KL-ONE, e.g., CLASSIC \[4\], BACK \[13\], LOOM \[12\], and many others. This line of work has come to place a major emphasis on computing concept subsumption, i.e., the determination of whether a representational description (a concept) necessarily entails another description. In our view, this emphasis is mistaken.</Paragraph> <Paragraph position="1"> Indeed, this emphasis ignores the way in which practical applications have successfully exploited the terminological framework. These systems primarily rely on the operation of classification, especially instance classification.</Paragraph> <Paragraph position="2"> Although subsumption helps to provide a semantic model of classification, it does not necessarily follow that it should provide its computational underpinnings.</Paragraph> <Paragraph position="3"> In addition, the emphasis on complete subsumption algorithms has led to restricted languages that are representationally weak. As is well-known, these languages have been the subject of increasingly pessimistic theoretical results, from intractability of subsumption \[5\], to undecidability of subsumption \[15, 16\], to intractability of the fundamental normalization of a terminological KB \[14\]. Against this background, RHO was targeted to support instance classification, and thus departs in significant ways from traditional terminological reasoners. The most draconian departure is in separating the normal terminological notion of necessary and sufficient definitions into separate sufficiency axioms and necessity axioms. The thrust of the former is to provide the kind of antecedent inference that is the hallmark of classification, e.g., western-corp(x) ← corporation(x) & hq-in(x, y) & western-nation(y) (1) The role of necessity conditions is to provide consequent inference such as that typically associated with inheritance and sort restrictions on predicates, e.g.,</Paragraph> <Paragraph position="5"> Although both classes of axioms are expressed in the same syntactic garb, namely function-free Horn clauses, they differ with respect to their inferential import. If one thinks of predicates as being organized according to some taxonomy (see Fig. 1), then necessity axioms encode inference that proceeds up the hierarchy (i.e., inheritance), while sufficiency axioms encode inference that proceeds down the hierarchy (i.e., classification).</Paragraph> <Paragraph position="6"> The most interesting consequence of RHO's uniform language for necessity and sufficiency is that it facilitates the formulation of a criterion under which classification is guaranteed to be tractable. For a knowledge base to be guaranteed tractable, the criterion requires that there be a tree shape to the implicit dependencies between the variables in any given axiom in the knowledge base.</Paragraph>
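To illustrate how sufficiency axioms of this kind drive instance classification, here is a minimal sketch (ours, not RHO's actual classifier; the ground facts below are invented purely for the example) of axiom (1) encoded as a function-free Horn clause and applied to ground facts by naive forward chaining:

```python
from itertools import product

# Axioms as (head, body) pairs; variables are strings starting with "?".
# The single axiom below is axiom (1) from the text.
AXIOMS = [
    (("western-corp", "?x"),
     [("corporation", "?x"), ("hq-in", "?x", "?y"), ("western-nation", "?y")]),
]

# Invented ground facts, purely for illustration.
FACTS = {
    ("corporation", "Samsung"),
    ("hq-in", "Samsung", "South-Korea"),
    ("western-nation", "South-Korea"),
}

def match(literal, fact, theta):
    """Match one body literal against one ground fact, extending the bindings."""
    if len(literal) != len(fact) or literal[0] != fact[0]:
        return False
    for term, value in zip(literal[1:], fact[1:]):
        if term.startswith("?"):
            if theta.setdefault(term, value) != value:
                return False
        elif term != value:
            return False
    return True

def classify(facts, axioms):
    """Naive forward chaining: apply sufficiency axioms until a fixpoint is reached."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in axioms:
            for combo in product(facts, repeat=len(body)):
                theta = {}
                if all(match(lit, fact, theta) for lit, fact in zip(body, combo)):
                    derived = tuple(theta.get(t, t) for t in head)
                    if derived not in facts:
                        facts.add(derived)
                        changed = True
    return facts

print(("western-corp", "Samsung") in classify(FACTS, AXIOMS))  # True
```

This brute-force matcher is exponential in the length of an axiom body; the point of the criterion just stated is that, when every axiom's variable dependencies form a tree, classification can instead be carried out in polynomial time.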
<Paragraph position="7"> For the sample axioms above, Fig. 2 informally illustrates this notion of variable dependencies.</Paragraph> <Paragraph position="9"> Fig. 2: Variable dependency graphs for axiom (1), on the left, and axiom (4), on the right.</Paragraph> <Paragraph position="10"> Axiom (1), for example, mentions two variables, x and y. A dependency between these variables is introduced by the predicative term hq-in(x,y): the term makes the two variables dependent by virtue of mentioning them as arguments of the same predicate. As the axiom mentions no other variables, its dependency graph is the simple tree on the left of Fig. 2. Similarly, in axiom (4) the agreement predicate makes both y and z dependent on x, also yielding a tree. Finally, axioms (2) and (3) lead to degenerate trees containing only x. Since all the dependency relations between these variables are tree-shaped, the knowledge base formed out of their respective axioms is tractable under the criterion.</Paragraph> <Paragraph position="11"> A formal proof that tractability follows from the criterion appears in an appendix below, as well as in \[19\].</Paragraph> </Section> <Section position="5" start_page="150" end_page="151" type="metho"> <SectionTitle> 3. VALIDATING RHO </SectionTitle> <Paragraph position="0"> This formal tractability result is appealing, especially in light of the overwhelming number of intractability claims that are usually associated with terminological reasoning.</Paragraph> <Paragraph position="1"> Its correctness, however, is crucially dependent on a normal form assumption, and as with all such normal form criteria, it remains of little more than theoretical interest unless it is validated in practice. As we mentioned above, we strove to achieve such a validation by determining through a paper study whether the RHO framework could be put to use in the data extraction phase of Alembic.</Paragraph> <Paragraph position="2"> Towards this end, my colleagues and I assembled a set of unbiased texts on Soviet economics. The validation task then consisted of deriving a set of terminological rules that would allow RHO to perform the inferential pattern matching necessary to extract from these texts all instances of a pre-determined class of target concepts. The hypothesis that RHO's tractability criterion can be met in practice would thus be considered validated just in case this set of inference rules was tractable under the criterion.</Paragraph> <Paragraph position="3"> 3.1. Some assumptions At the time that we undertook the study, however, the Alembic implementation was still in its infancy. We thus had to make a number of assumptions about what could be expected out of Alembic's parsing and semantic composition components. In so doing, we took great pains not to require superhuman performance on the part of the parser, and restricted our expected syntactic coverage to phenomena that we felt were well within the state of the art. In particular, we rejected the need to derive S. As with many similar systems, Alembic uses a fragment parser that produces partial syntactic analyses when its grammar is insufficient to derive S. In addition, we exploited Alembic's hierarchy of syntactic categories, and postulated a number of relatively fine-grained categories that were not currently in the system.
This allowed us for example to assume we could obtain the intended parse of &quot;Irish-Soviet airline&quot; on the basis of the pre-modifiers being both adjectives of geographic origin (and hence co-ordinable). We also exploited the fact that the Alembic grammar is highly lexicalized (being based on the combinatorial categorial framework). This allowed us to postulate some fairly detailed subcategorization frames for verbs and their nominalizations. As is currently the case with our system, we assumed that verbs and their nominalizations are canonicalized to identical semantic representations.</Paragraph> <Paragraph position="4"> Elsewhere at the semantic level, we assumed basic competence at argument-passing, a characteristic already in place in the system. This allowed us, for example, to assume congruent semantics for the phrases &quot;Samsung was announced to have X'd&quot; and &quot;Samsung has X'd.&quot; 3.2. The validation corpus With these assumptions in mind, we assembled a corpus of data extraction inference problems in the area of Soviet economics. The corpus consisted of text passages that had been previously identified for an evaluation of information retrieval techniques in this subject area. The texts were drawn from over 6200 Wall Street Journal articles from 1989 that were released through the ACL-DCI. These articles were filtered (by extensive use of GREP) to a subset of 300-odd articles mentioning the then-extant Soviet Union. These articles were read in detail to locate all passages on a set of three pre-determined economic topics: 1. East-West joint ventures, these being any business arrangements between Soviet and non-Soviet agents.</Paragraph> <Paragraph position="5"> 2. Hard currency, being any discussion of attempts to introduce a convertible unit of monetary value in the former USSR.</Paragraph> <Paragraph position="6"> 3. Private cooperatives, i.e., employee-owned enterprises within the USSR.</Paragraph> <Paragraph position="7"> We found 85 such passages in 74 separate articles (1.2% of the initial set of articles under consideration).</Paragraph> <Paragraph position="8"> Among these, 47 passages were eliminated from consideration because they were just textual mentions of the target concepts (e.g. the string &quot;joint venture&quot;) or of some simple variant. These passages could easily be identified by Boolean keyword techniques, and as such were not taken to provide a particularly insightful validation of a complex NL-based information-extraction process! Unfortunately, this eliminated all instances of private cooperatives from the corpus, because in these texts, the word &quot;cooperative&quot; is a perfect predictor of the concept. An additional four passages were also removed during a cross-rater reliability verification. These were all amplifications of an earlier instance of one of the target concepts, e.g., &quot;U.S. and Soviet officials hailed the joint project.&quot; These passages were eliminated because the corpus collectors had differing intuitions as to whether they were sufficient indications in and of themselves of the target concepts, or were somehow pragmatically &quot;parasitic&quot; upon earlier instances of the target concept. The remaining 34 passages required some degree of terminological inference, and formed the corpus for this study.</Paragraph> </Section> <Section position="6" start_page="151" end_page="152" type="metho"> <SectionTitle> 4.
INFERENTIAL DATA EXTRACTION </SectionTitle> <Paragraph position="0"> We then set about writing a collection of terminological axioms to handle this corpus. As these axioms are propositional in nature, and the semantic representations produced by Alembic are not strictly propositional, this required specifying a mapping from the language of interpretations to that of the inference axioms.</Paragraph> <Section position="1" start_page="151" end_page="152" type="sub_section"> <SectionTitle> 4.1. Semantic representation in Alembic </SectionTitle> <Paragraph position="0"> Alembic produces semantic representations at the increasingly popular interpretation level \[2, 10\]. That is, instead of generating fully scoped and disambiguated logical forms, Alembic produces representations that are ambiguous with respect to quantifier scoping. For example, the noun phrase &quot;a gold-based ruble&quot; maps into an interpretation in which modifiers are assigned to the mods slot and generalized quantifiers to the quant slot. The proxy slot contains a unique variable designating the individuals that satisfy the interpretation. If this interpretation were to be fully mapped to a sorted first-order logical form, it would result in the following sentence, where gold is treated as a kind individual:</Paragraph> </Section> </Section> <Section position="7" start_page="152" end_page="152" type="metho"> <SectionTitle> ∃ P117 : ruble . basis-of(P117, gold) </SectionTitle> <Paragraph position="0"> Details of this semantic framework can be found in \[3\].</Paragraph> <Section position="1" start_page="152" end_page="152" type="sub_section"> <SectionTitle> 4.2 Conversion to propositional form </SectionTitle> <Paragraph position="0"> Axioms in RHO are strictly function-free Horn clauses, and as such are intended to match neither interpretations nor first-order logical forms. As a result, we needed to specify a mapping from interpretations to some propositional encoding that can be exploited by RHO's terminological axioms. In brief, this mapping hyper-Skolemizes the proxy variables in the interpretation and then recursively flattens the interpretation's modifiers. For example, the interpretation for &quot;a gold-based ruble&quot; is mapped to the following propositions: ruble(P117) basis-of(P117, gold) The interpretation has been flattened by pulling its modifier to the same level as the head proposition (yielding an implicit overall conjunction). In addition, the proxy variable has been interpreted as a Skolem constant, in this case the &quot;gensymed&quot; individual P117.</Paragraph> <Paragraph position="1"> This interpretation of proxies as Skolem constants is actually hyper-Skolemization, because we perform it on universally quantified proxies as well as on existentially quantified ones. Ignoring issues of negation and disjunction, this unorthodox Skolemization process has a curious model-theoretic justification (which is beyond our present scope). Intuitively, however, one can think of these hyper-Skolemized variables as designating the individuals that would satisfy the interpretation, once it has been assigned some unambiguously scoped logical form.</Paragraph>
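As an illustration only (the exact slot names and structure of Alembic interpretations are not reproduced in this extract, so the representation below is our assumption), the mapping from an interpretation to flattened, hyper-Skolemized propositions might be sketched as follows:

```python
import itertools

_counter = itertools.count(117)

def gensym(prefix="P"):
    """Mint a fresh Skolem constant, e.g. P117, P118, ..."""
    return f"{prefix}{next(_counter)}"

# A toy interpretation for "a gold-based ruble"; slot names are assumptions.
interp = {
    "proxy": "?p",                          # variable for the described individuals
    "head": "ruble",                        # head predicate
    "mods": [("basis-of", "?p", "gold")],   # modifiers, possibly nested
    "quant": "exists",                      # generalized quantifier (ignored here)
}

def propositionalize(interp, bindings=None):
    """Hyper-Skolemize the proxy and recursively flatten mods into ground atoms."""
    bindings = dict(bindings or {})
    bindings[interp["proxy"]] = gensym()    # Skolemize regardless of the quantifier
    atoms = [(interp["head"], bindings[interp["proxy"]])]
    for mod in interp["mods"]:
        if isinstance(mod, dict):           # nested interpretation: recurse
            atoms.extend(propositionalize(mod, bindings))
        else:                               # simple predication: substitute variables
            atoms.append(tuple(bindings.get(t, t) for t in mod))
    return atoms

print(propositionalize(interp))
# [('ruble', 'P117'), ('basis-of', 'P117', 'gold')]
```

The sketch mirrors the text in one essential respect: the proxy is replaced by a Skolem constant whether its quantifier is existential or universal, which is what licenses the inference over ambiguously scoped sentences discussed next.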
<Paragraph position="2"> To see this, say we had the following inference rule: m-loves-w(x, y) ← loves(x, y) & man(x) & woman(y) Now say this rule were to be applied against the semantics of the infamously ambiguous case of &quot;every man loves a woman.&quot; In propositionalized form, this would be: man(P118) woman(P119) loves(P118, P119)</Paragraph> <Paragraph position="4"> From this, the rule will infer m-loves-w(P118, P119). If we think of P118 and P119 as designating the individuals that satisfy the logical form of &quot;every man loves a woman&quot; in some model, then we can see that indeed the m-loves-w relation necessarily must hold between them. This is true regardless of whether the model itself satisfies the standard ∀-∃ scoping of the sentence or the notorious ∃-∀ scoping. This demonstrates a crucial property of this approach, namely that it enables inferential extraction over ambiguously scoped text, without requiring resolution of the scope ambiguity (and without expensive theorem proving).</Paragraph> </Section> </Section>
Table 1: Sufficiency rules required to cover the validation corpus.
target        | occurrences, n | sufficiency rules, r | rule density, r/n
joint venture | 12             | 17                   | 1.4
hard curr.    | 22             | 13                   | .59
<Section position="8" start_page="152" end_page="153" type="metho"> <SectionTitle> 5. FINDINGS </SectionTitle> <Paragraph position="0"> Returning to our validation study, we took this propositionalized representation as the basis for writing the set of axioms necessary to cover our corpus of data extraction problems. In complete honesty, we expected that the resulting axioms would not all end up meeting the tractability criterion. Natural language is notoriously complex, and even such classic simple KL-ONE concepts as Brachman's arch \[6\] do not meet the criterion.</Paragraph> <Paragraph position="1"> What we found took us by surprise. We came across many examples that were challenging at various levels: complex syntactic phenomena, nightmares of reference resolution, and the ilk. However, once the corpus passages were mapped to their corresponding interpretations, the terminological axioms necessary to perform data extraction from these interpretations all met the criterion.</Paragraph> <Paragraph position="2"> Table 1, above, summarizes these findings. To cover our corpus of 34 passages, we required between two and three dozen sufficiency rules, depending upon how one encoded certain economic concepts, and depending on what assumptions one made about argument-passing in syntax.</Paragraph> <Paragraph position="3"> We settled on a working set of thirty such rules.</Paragraph> <Paragraph position="4"> Note that this inventory does not include any necessity rules. We ignored necessity rules for the present purposes in part because they only encode inheritance relationships.</Paragraph> <Paragraph position="5"> The size of their inventory thus only reflects the degree to which one chooses to model intermediate levels of the domain hierarchy. For this study, we could arguably have used none. In addition, necessity rules are guaranteed to meet the tractability criterion, and were consequently of only secondary interest to our present objectives.</Paragraph> <Section position="1" start_page="153" end_page="153" type="sub_section"> <SectionTitle> 5.1.
Considerations for data extraction </SectionTitle> <Paragraph position="0"> From a data extraction perspective, these results are clearly preliminary. Looking at the positive side, we are encouraged that the rules for our hard currency examples were shared over multiple passages, as follows from their fractional rule density of .59 (see Table 1). The joint venture rules fared less well, mainly because the concept they encode is fairly complex, and can be described in many ways.</Paragraph> <Paragraph position="1"> Given our restricted data set, however, it is not possible to conclude how well either set of rules will generalize if presented with a larger corpus. What is clearly needed is a larger corpus of examples. This would allow us to estimate generalizability of the rules by considering the asymptotic growth of the rule set as it is extended to cover more examples. Unfortunately, constructing such a corpus is a laborious task, since the examples we are interested in are precisely those that escape simple automated search techniques such as Boolean keyword patterns. The time and expense that were incurred in constructing the MUC3/4 and TIPSTER corpora attest to this difficulty.</Paragraph> <Paragraph position="2"> We soon hope to know more about this question of rule generalizability. We are currently in the process of implementing a version of RHO in the context of the Alembic system, which is now considerably more mature than when we undertook the present study. We intend to exploit this framework for our participation in MUC5, as well as retool our system for the MUC4 task. As the TIPSTER and MUC4 data sets contain a considerably greater number of training examples than our Soviet economics corpus, we expect to gain much better insights into the ways in which our rule sets grow and generalize.</Paragraph> </Section> <Section position="2" start_page="153" end_page="153" type="sub_section"> <SectionTitle> 5.2. Considerations for RHO </SectionTitle> <Paragraph position="0"> From the perspective of our terminological inference framework, however, these preliminary results are quite encouraging indeed. We started with a very simple tractable inference framework, and studied how it could be applied to a very difficult problem in natural language processing. And it appears to work.</Paragraph> <Paragraph position="1"> Once again, one should refrain from reaching overly general conclusions based on a small test sample. And admittedly RHO gets a lot of help from other parts of Alembic, especially the parser and a rudimentary inheritance taxonomy. Further analyses, however, reveal some additional findings that suggest that RHO's tractability criterion may be of general validity to this kind of natural language inference.</Paragraph> <Paragraph position="2"> Most interestingly, the tractability result can be understood in the context of some basic characteristics of natural language sentence structure. In particular, axioms that violate the tractability criterion can only be satisfied by sentences that display anaphora or definite reference. For example, an axiom with the following right hand side: own(x, z) & scorn(x, y) & dislike(y, z) matches the sentences &quot;the man who owns a Ferrari scorns anyone who dislikes it/his car/that car/the car.&quot; It is impossible, however, to satisfy this kind of circular axiom without invoking one of these referential mechanisms (at least in English).
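To make the contrast concrete, here is a minimal sketch (ours, not the RHO implementation; it reads "implicit dependencies" as pairwise adjacency of variables that co-occur as arguments of one predicate, and the precise formal statement belongs to the paper's appendix) of a tree-shape check on an axiom body's variable dependency graph. The body of axiom (1) passes, while the circular body above does not:

```python
from collections import defaultdict

def dependency_graph(body):
    """Variables co-occurring as arguments of the same predicate become adjacent."""
    edges = defaultdict(set)
    variables = set()
    for _pred, *args in body:
        vs = [a for a in args if a.startswith("?")]
        variables.update(vs)
        for i, u in enumerate(vs):
            for v in vs[i + 1:]:
                edges[u].add(v)
                edges[v].add(u)
    return variables, edges

def is_tree_shaped(body):
    """True iff the variable dependency graph is connected and acyclic (a tree)."""
    variables, edges = dependency_graph(body)
    if len(variables) <= 1:
        return True                      # degenerate tree, like axioms (2) and (3)
    edge_count = sum(len(vs) for vs in edges.values()) // 2
    if edge_count != len(variables) - 1:
        return False                     # a tree on n nodes has exactly n-1 edges
    seen, stack = set(), [next(iter(variables))]
    while stack:                         # connectivity check
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges[node] - seen)
    return seen == variables

# Body of axiom (1): tree-shaped, hence tractable under the criterion.
axiom_1 = [("corporation", "?x"), ("hq-in", "?x", "?y"), ("western-nation", "?y")]
# The circular body discussed above: x-z, x-y, y-z form a cycle.
circular = [("own", "?x", "?z"), ("scorn", "?x", "?y"), ("dislike", "?y", "?z")]

print(is_tree_shaped(axiom_1))   # True
print(is_tree_shaped(circular))  # False
```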
This observation, which was made in another context in \[8\], suggests a curious alignment between tractable cases of terminological natural language inference and non-anaphoric cases of language use.</Paragraph> <Paragraph position="3"> It is particularly tantalizing that the cases where these terminological inferences are predicted to become computationally expensive are just those for which heuristic interpretation methods seem to play a large role (e.g., discourse structure and other reference resolution strategies).</Paragraph> <Paragraph position="4"> Though one must avoid the temptation to draw too strong a conclusion from such coincidences, one is still left thinking of Alice's ineffable words, &quot;Curiouser and curiouser.&quot;</Paragraph> </Section> </Section> </Paper>