<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0906">
  <Title>Entailment, Intensionality and Text Understanding</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Intensionality
</SectionTitle>
    <Paragraph position="0"> The detection of entailments and contradictions between pieces of text raises a number of technical challenges, including but not limited to the following. (a) Ambiguity is ubiquitous in natural language, and poses an especial problem for text processing, where longer sentences tend to increase grammatical ambiguity, and where it is not generally possible to enter into clarificatory dialogues with the text author. Ambiguity impacts ECD because semantic relations may hold under some interpretations but not under others. (b) Reference resolution in the broad sense of determining that two texts talk about the same things, rather than the narrower sense of intra-text pronoun resolution, is also crucial to ECD. Entailment and contradiction relations presuppose shared subject matter,  and reference resolution plays a role in establishing this. (c) World/domain knowledge, as we noted before, can be involved in establishing entailment and contradiction relations. (d) Representations that enable ECD must be derived from texts. What should these representations be  like, and how should they be derived? At a bare minimum some level of parsing to obtain predicate-argument structures seems necessary, but how much more than this is required? null We cannot address all of these issues in this paper, and so will focus on the last one. In particular, we want to point out that intensional constructions are commonplace in text, and that simple first-order predicate-argument structures are inadequate for detecting intensional entailments and contradictions. Within the formal semantics literature since at least Montague, the phenomena raised by intensionality are well known and extensively studied, though not always satisfactorily dealt with. Yet this has been poorly reflected in computational work relating language understanding and knowledge representation. Formal semanticists have the luxury of not having to perform automated inference on their semantic representations, and can trade tractability for expressiveness. Computational applications on the other hand have traded expressiveness for tractability, either by trying to shoe-horn everything into an ill-fitting first-order representation, or by coding up special purpose and not easily generalizable methods for dealing with particular intensional phenomena in special tasks and domains. None of these approaches are particularly satisfactory for the task of detecting substantial numbers of entailment and contradiction relations between texts. A more balanced trade-off is required, and we suggest at least one way in which machinery from formal semantics can be adapted to support this.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Intensionality is pervasive
</SectionTitle>
      <Paragraph position="0"> Intensionality extends beyond the conventional examples of propositional attitudes (beliefs, desires etc) and formal semanticists seeking unicorns. Any predication that has a a proposition, fact or property denoting argument introduces intensionality. Almost every lexical item that takes a clausal or predicative argument should be seen as intensional. As an anecdotal test of how common this is, inspection of 100 Eureka tips about the workaday world of printer and copier repair showed that 453 out of 1586 sentences contained at least one verb sub-categorizing for a clausal argument. Some randomly selected examples of intensional constructions are given in (5).</Paragraph>
      <Paragraph position="1"> (5) a. When the rods are removed and replaced it is very easy to hit the glass tab and break it off.</Paragraph>
      <Paragraph position="2"> b. The weight of the ejected sets is not sufficient to keep the exit switch depressed.</Paragraph>
      <Paragraph position="3"> c. This is a workaround but also disables the ability to use the duplex tray after pressing the &amp;quot;Interrupt&amp;quot; button, which should be explained to the customer.</Paragraph>
      <Paragraph position="4"> d. Machines using the defective toner may require repair or replacement of the Cleaner Assembly.</Paragraph>
      <Paragraph position="5"> Nor is intensionality confined to lexical items taking clausal or predicative arguments, as sentences (3) and (4) demonstrate. Prevention and causation (of central importance within the Eureka domain) are inherently intensional notions To say that &amp;quot;A prevented B&amp;quot; is to say that there was an occurrence of A and no occurrence of B, but  that had A not occurred B would have occurred. Similarly, to say that &amp;quot;A caused B&amp;quot; is to say that there was an occurrence of both A and B, but that had there been no occurrence of A there would have been no occurence of B.</Paragraph>
      <Paragraph position="6"> Both refer to things or events materialized in one context but not in another. It is plain that we cannot give a semantic analysis for (6a) along the lines of (6b) (6) a. Corrosion prevented continuous contact.</Paragraph>
      <Paragraph position="7"> b. a0a2a1a4a3a6a5a8a7a10a9a12a11a14a13a14a13a15a11a17a16a19a18a20a11a14a21a23a22a24a1a8a25a4a26a27a9a12a11a14a21a29a28a31a30a2a9a32a28a12a22a24a5a33a25</Paragraph>
      <Paragraph position="9"> since this asserts the existence of the continuous contact that was prevented. In (Condoravdi et al., 2001) we argued at some length that preserving a first-order analysis along the lines suggested by (Hirst, 1991) -- through the introduction of explicit existence predicates (6c) -is at best a partial solution. Not only are identity criteria for non-existent entities problematic, but (6c) also fails to capture significant monotonicity entailments: Corrosion preventing continuous contact does not imply that corrosion prevents contact of any form; but first order inference allows one to drop the a9a12a11a14a21a29a28a20a18a34a21a29a35a8a11a14a35a29a16a36a22a24a5a33a25 conjunct from (6c), yielding the representation one would expect for corrosion prevented contact.</Paragraph>
      <Paragraph position="10"> We do not completely rule out the possibility that some more sophisticated, ontologically promiscuous, first-order analysis (perhaps along the lines of (Hobbs, 1985)) might account for these kinds of monotonicity inferences. But a more overtly intensional analysis like (7) does not face this problem in the first place.</Paragraph>
      <Paragraph position="12"> In (7) we assume thata38a40a13a15a41a43a42a2a41a43a21a29a28 carries a lexical entailment that its second, propositional, argument is false. Thus (7) rules out the existence of continuous contact, but does not rule out the existence of any form of contact. Hirst, however, points out that allowing quantification over individuals into intensional contexts brings in its wake other well known difficulties: what does it mean for the same individual to exist in different possible worlds? In some sense, this is the trans-world analogue of the problematic identity criteria for non-existent individuals.</Paragraph>
      <Paragraph position="13"> In (Condoravdi et al., 2001) we proposed an alternative analysis, (8), based on viewing noun phrases as being concept denoting rather than individual denoting (Zimmermann, 1993).</Paragraph>
      <Paragraph position="15"> This says that there is some sub-type of corrosion, a4 , and some sub-type of continuous contact, a6 , such that concept a4 prevents concept a6 . This means, amongst other things, that there is some instance of a4 but no instance of a6 . Of course, just because there are no instances of continuous contact, it does not follow that there are no instances of contact, and (8) predicts the correct monotonicity entailments. Moreover, since concepts are functions from possible worlds to their extensions (sets of individuals), the issue of the trans-world identity of concepts does not arise: any particular concept expresses a single function, regardless of possible world.2  not mean that substitution of concepts that are co-extensive in one world is always truth preserving. Thus our use of concepts is intensional in the philosophically traditional sense, which is a point of clarification requested by one of our anonymous reviewers. null</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Detecting an Intensional Entailment
</SectionTitle>
      <Paragraph position="0"> In (Condoravdi et al., 2001) we went into greater depth about how an analysis like (8) formally predicts the right kinds of entailment. Our purpose here is not to repeat these arguments, still less to argue that ours is the only possible way of accounting for these facts. Rather, we want to show how this highly intensional, analysis can be deployed for practical ECD.</Paragraph>
      <Paragraph position="1"> As an example consider determining the possible mutual entailment between (3) and (4), repeated below.</Paragraph>
      <Paragraph position="2">  (9) Corrosion caused intermittent electrical contact. (10) Corrosion prevented continuous electrical contact.  The lexical semantics for a9a12a30a36a35a29a16a43a41 and a38a40a13a15a41a43a42a2a41a43a21a29a28 can be stated as follows (where we use the term &amp;quot;context&amp;quot; instead  of &amp;quot;possible world&amp;quot;): (11) If a38a40a13a15a41a43a42a2a41a43a21a29a28a12a22a18a11a20a19a15a3a21a11a23a22a43a25 is true in context a24 then (a) In context a24 the concept a11a25a19 is instantiated and concept a11a23a22 is uninstantiated, and (b) There is a context a26 that is maximally similar to a24 with the exception that a11a25a19 is uninstantiated in a26 , and in a26 concept a11a27a22 is instantiated.</Paragraph>
      <Paragraph position="3"> (12) If a9a12a30a36a35a29a16a43a41a2a22a18a11a20a19a14a3a21a11a23a22a43a25 is true in context a24 then (a) In context a24 the concept a11a25a19 is instantiated and concept a11a23a22 is also instantiated, and (b) There is a context a26 that is maximally similar to a24 with the exception that a11a25a19 is uninstantiated in a26 , and in a26 concept a11a27a22 is also uninstantiated.</Paragraph>
      <Paragraph position="4"> Applying these definitions to (3) and (4), on the assumption that both statements are true in some context a24 : (13) If cause(corrosion, intermittent-contact) is true in a24 then (a) In a24 there is an instance of corrosion and an instance of intermittent contact, and (b) There is a context a26 that is maximally similar to a24 except that there is no instance of corrosion, where there is no instance of intermittent contact; hence either there is no contact at all, or contact in a26 is non-intermittent (i.e. continuous).</Paragraph>
      <Paragraph position="5"> (14) If prevent(corrosion, continuous-contact) is true in a24 then (a) In a24 there is an instance of corrosion but no instance of continuuous contact; hence either there is no contact in a24 , or contact is non-continuous (i.e. intermittent). null (b) There is a context a26 that is maximally similar to a24 except that there is no instance of corrosion, where  there is an instance of continuous contact.</Paragraph>
      <Paragraph position="6"> Both (13) and (14) refer to a relation of maximal similarity between contexts, with respect to the instantiation of a particular concept. The nature of this relation has deliberately not been spelled out, as it is unnecessary to do so in order to detect the possible entailment relation between (13) and (14). Assuming that both are evaluated against the same initial context a24 , they both invoke counterfactual contexts a26 that are maximally similar to a24 with respect to the concept of corrosion. Moreover, provided we pick the right disjunctive alternatives for non-intermittent and non-continous contact, we can see that a24 and a26 have the same contents in both cases. Thus, whatever maximal similarity might turn out to be, (3) and (4) can be analysed as introducing the same contexts related in the same ways: that is, mutual entailment.3 Before describing how this example can be generalized to a scheme for detecting certain classes of intensional entailments and contradictions, we want to emphasize one point. The example makes free use of the notion of one context/possible world being maximally similar to another, with respect to the instantiation of a particular concept. Relations of maximum similarity between worlds are standard fare within formal, model-theoretic semantics, and alternative definitions abound. It is probably fair to say that the notion is not yet well understood. Fortunately for our example, full understanding of maximal similarity is not required. We only need to know that the same relation applies to the same initial context (a24 ) to pick out the same counterfactual contexts (a26 ). Of course, other examples may necessitate spelling out the relation in more detail. For instance, suppose we had the statement that rust caused intermittent contact, where rust is a sub-type of corrosion. This raises the question of how maximal similarity varies across the type hierarchy; i.e. how does a maximally similar context with no instance of rust compare to one with no instance of corrosion? To answer this, we still do not need to specify fully the maximal similarity relation; merely state some of its necessary properties. Ultimately, though, if we want to use such formal means to relate language to the world, then relations like maximal similarity will have to be fully spelled out. But this is not the task that ECD sets out to deal with.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 A General Approach to Intensional Entailments
</SectionTitle>
      <Paragraph position="0"> The example above points to a general, two stage strategy for ECD. First map texts to contexted clauses, showing what contexts there are, and what (atomic) facts hold 3Note that if we pick the other disjunctive alternative for non-intermittent contact, i.e. no instance of contact at all, then (3) can be shown to contradict (4): (3) says that corrosion causes an intermittent short circuit, while in (4) it intermittently breaks a contact should that be present. We do not yet have anything very useful to say about preferences between such interpretations, though we have been exploring the use of evidential reasoning. null in them. Then attempt to pair contexts between the two text representations, and use relatively limited inference to determine whether the facts in paired contexts entail or contradict. We will look at these two stages in turn.</Paragraph>
      <Paragraph position="1"> Contexted Clauses A contexted atomic clause comprises an atomic fact, plus the context in which the fact is supposed to hold. Borrowing McCarthy's notation4 we write (ist a0 contexta1a2a0 facta1 ) to state that some fact holds in some context. A list of flat contexted clauses is interpreted conjunctively. Consider the contexted clauses de- null Here we have a number of facts about what holds in the intitial context, t: that there is an instance of a sub-concept of corrosion but no instance of some sub-concept of (continuous) contact, and that the prevent relation holds between the corrosion and contact concepts. This relation also introduces a new context, prevent-context3, which is maximally similar to t with respect to corrosion. Within prevent-context3, alternative, counterfactual assertions about concept instantiations are made. Finally, and independently of any particular context, subconcept assertions are made. The first says that corrosion1 is some (unspecified) subconcept of the concept corrosion. This statement is not relativized to a context, since the concept hierarchy is assumed to be constant across all contexts (even though extensions of concepts can vary).</Paragraph>
      <Paragraph position="2"> The 'flattening' of (8) to derive (15b) proceeds via skolemization, conversion to clausal form, the relativization of each conjunct to a context and canonicalization to introduce extra contextual structure that is only implicit in linguistic forms (the context prevent-context3, corresponding to the counterfactual state of affairs the lexical entailments of prevent make reference to), or domain knowledge.</Paragraph>
      <Paragraph position="3"> The canonicalization process is both language and knowledge/ontology-driven,introducing a deeper level of semantic representation. Structures assembled by compositional semantics must thus be transformed to structures that are well-suited for making successive small, 4Though not borrowing McCarthy's view of contexts as subsumption-ordered logical micro-theories.</Paragraph>
      <Paragraph position="4"> automated inference steps. Performing comparison on canonicalized contexted representations reflects a computationally advantageous division of labor: highly directed use of world knowledge and inference in the service of creating meaning representations, followed by relatively lightweight inference procedures in the stage of determining inferential relations between texts. Further aspects of canonicalization to conceptual structure based on a linguistically independent knowledge representation are discussed in (Crouch et al., 2002), e.g. mapping word senses onto term in a domain appropriate concept hierarchy.</Paragraph>
      <Paragraph position="5"> A more complex example of flattening and canonicalization is (16), which is ambiguous between it being the removal of the sleeve that prevents breakage, or making the cable flexible that prevents breakage. The initial logical form for the second interpretation is shown in (16), and the packed contexted representation for both parses is (partially) shown in (18).</Paragraph>
      <Paragraph position="6">  (16) Removing a sleeve made the cable flexible, preventing breakage.</Paragraph>
      <Paragraph position="7"> (17) a0 C. subconcept(C, cable) &amp; def(C, ?A) &amp;  ment a1 a30 a2 a41a2a22a31a7a19a7a19a7 a25 is replaced by the new context name make-ctx3, and component clauses asserted within this new context. Note also how skolem functions like sleeve6(make-ctx3) take context terms as arguments, and how the hook for definite reference by &amp;quot;the cable&amp;quot;, def(C, ?A), is canonicalized to a concept equality, where part12KE45 is some recently mentioned machine part. Also, maxsim can be relativized either to an event type, removeev4, or a context, make-ctx3.</Paragraph>
      <Paragraph position="8"> Context Matching Having obtained contexted representations for two texts, ECD proceeds in two stages.</Paragraph>
      <Paragraph position="9"> First, by assuming that both texts describe the same initial context, locate sub-contexts introduced by the two texts that have parallel relations to the initial context. Second, for the contexts thus paired identify local entailments and contradictions using first-order reasoning. Given our use of concepts, much of this can be done using T-box reasoning from description logics. At present, we only view identical context relations as parallel, and do not give much consideration to the inheritance of propositional content between related contexts. A deeper level of matching would be based on an algebra of contexts detailing different types of context relations and their inheritance properties.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Feasibility of ECD
</SectionTitle>
    <Paragraph position="0"> The last section described one way of approaching intensionality in the setting of entailment and contradiction detection. Our intention has not been to claim that this is the &amp;quot;one true way&amp;quot; of dealing with intensional ECD. It is rather to demonstrate the claim that practical progress can be made in the area, and that formal model-theoretic semantics can make a contribution to this. However, the preceding discussion has arguably been at too abstract and theoretical a level to really demonstrate a claim of practical progress or feasibility. This section briefly discusses a prototype entailment and contradiction detection system (described at greater length in (Crouch et al., 2002)) in order to point out that current technology already makes it feasible to begin addressing ECD.</Paragraph>
    <Paragraph position="1"> The system has been developed around the Eureka collection of printer and copier repair tips. The full collection contains 30-40,000 free text documents. We have been focusing on a development subset of some 1,300 of these documents, including 15 pairs that have been pulled out for closer scrutiny because of known entailments and contradictions between them. We do not as yet have any testing data separate from our development data.</Paragraph>
    <Paragraph position="2"> The system maps each document into a set of contexted clauses, by means of full syntactic and semantic analysis followed by knowledge-based canonicalization. Document representations undergo (statistically filtered) pair-wise comparison to identify sentences within document pairs related by contradiction or entailment. We will describe the mapping and the comparison in turn.</Paragraph>
    <Paragraph position="3"> The first stage of mapping uses a broad coverage, hand coded Lexical Functional Grammar of English (Butt et al., 1998) and the parser from the Xerox Linguistic Environment (XLE) (Maxwell and Kaplan, 1993) to parse the documents. Parsing is robust in the sense that every sentence receives a functional-structure analysis, encoding grammaticalized predicate-argument structure. In about 25% of cases the functional-structures are fragmentary, either because of coverage gaps in the grammar or because of poor spelling and punctuation (to which the technicians writing the tips are prone). Fragments comprise longest span structures for constituents such as S, NP or PP that have been successfully analysed by the grammar. Ambiguity management via packing (Maxwell and Kaplan, 1989) allows the parser to efficiently5 find all possible analyses of each sentence according to the grammar, and represent the alternatives in a compact, structure-shared form. Evaluation of essentially the same grammar on a dependency annotated subset of section 23 of the UPenn Wall Street Journal gives the accuracy of best parses as 85%, increasing by another 4% for nonfragmentary analyses (Riezler et al., 2002). Stochastic selection of the most probable parse (not necessarily the best parse) gives an accuracy of 80%.</Paragraph>
    <Paragraph position="4"> Initial semantic interpretation is via an implementation of &amp;quot;glue semantics&amp;quot;, which uses linear logic deduction to assemble the meanings of words and phrases in a syntactically analysed sentence (Dalrymple, 1999). Semantic interpretation preserves the ambiguity packing in syntactic analysis (though currently not in an algorithmically optimal way), deals with such things as quantifier scoping, and incorporates lexico-semantic information not relevant to parsing.Despite theoretical proposals for dealing with anaphora and ellipsis in glue interpretion, e.g.</Paragraph>
    <Paragraph position="5"> (Crouch, 1999), this has not currently been implemented; hooks are placed in the representation to mark where subsequent canonicalization needs to resolve textual and domain dependencies like pronouns and compound noun interpretations. Semantic analysis is also robust, with about 65% of all sentences receiving full, non-fragmentaryanalyses (around 60% on WSJ-23).</Paragraph>
    <Paragraph position="6"> Canonicalization starts with a systematic flattening of logical forms: skolemizing quantifiers, replacing intensional arguments by new context names, and expanding out the intensional arguments within their new contexts.</Paragraph>
    <Paragraph position="7"> Rewrite rules are then applied, with the assistance of a TMS-based evidential reasoner, to further refine the resulting contexted clauses. Some rules are domain independent simplifications of alternate linguistic constructions onto the same underlying form. Others exploit ontological information to map words onto appropriate word senses or to identify domain appropriate pronouns antecedents. Others introduce additional contextual struc5It takes a morning to distribute the 1300 development documents across half a dozen workstations and perform syntactic and semantic analysis.</Paragraph>
    <Paragraph position="8"> ture, or eliminate irrelevant linguistically induced contexts. To promote domain-portability, care is being taken to write canonicalization rules in such a way as to distinguish between (a) domain independent rules, (b) general rules with an interface to domain dependent ontologies, and (c) domain specific hacks.</Paragraph>
    <Paragraph position="9"> Comparison of representations starts with statistical pre-filtering. This uses probabilistic latent semantic analysis to identify, on the basis of word occurrences, which documents are likely to have some content overlap (Brants and Stolle, 2002). For candidate pairs of documents thus identified, we employ a charitable form of reference resolution: if it is possible to identify clauses or contexts occurring in different documents, then identity is assumed. The Structure Mapping Engine (SME) (Forbus et al., 1989) is used to match contexts. The SME is a graph matching algorithm developed for the recognition of analogy. In our case it is used to match up structurally similar context structures containing structurally similar clauses. Having paired the contextual structures, limited ontological inference is then used to detect contradictions or entailments between the contents of matched contexts.</Paragraph>
    <Paragraph position="10"> In summary, the robust application of detailed, hand-coded rules to the syntactic and semantic analysis of open texts appears feasible, with syntax somewhat more advanced. Similar observations have been made by other researchers, e.g. (Siegel and Bender, 2002). Knowledge-based canonicalization is less well advanced. In part, progress depends on the construction of rules in many ways similar to the grammar rules and lexical entries of syntactic analysis. Progress also depends on the construction of appropriate ontologies.</Paragraph>
  </Section>
class="xml-element"></Paper>