<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-2165">
  <Title>On Inference-Based Procedures for Lexical Disambiguation</Title>
  <Section position="3" start_page="980" end_page="980" type="metho">
    <SectionTitle>
2 The Idea of Inference-Based Lexical Disambiguation
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="980" end_page="980" type="sub_section">
      <SectionTitle>
Lexical Disambiguation
</SectionTitle>
      <Paragraph position="0"> Lexical disambiguation is a procedure that determines, for a (lexically) ambiguous sentence within a discourse, which reading of the sentence is contextually appropriate. From a logical point of view, the resolution of a lexical ambiguity is usually reconstructed by an inference process which rules out a reading if our conceptual knowledge contradicts this reading in the given context. In order to illustrate this type of inference-based resolution procedure, let us consider the German sentence (1)  (1) Einige Ärzte haben eine Schwester.</Paragraph>
      <Paragraph position="1"> which contains the ambiguous lexical item 'Schwester'. Let us consider the two readings of (1), which have to be expressed in English by (2a,b).[4] (2) (a) Some physicians have a sister.</Paragraph>
      <Paragraph position="2"> (b) Some physicians have a nurse.</Paragraph>
      <Paragraph position="3"> These two readings are represented by the two (oversimplified) predicate-calculus formulas given in (3).[5] (3) (a) $\exists x(\mathit{Physician}(x) \land \exists y(\mathit{Sister}(y, x)))$ (b) $\exists x(\mathit{Physician}(x) \land \exists y(\mathit{Nurse}(y, x)))$ [Footnote 4, beginning not preserved: ... we abstract away from the others for the sake of simplicity.]</Paragraph>
      <Paragraph position="4"> [Footnote 5: Since we are primarily interested in the process, we abstract from further details, like temporal aspects.] Resolution of an ambiguity as in (1) is possible if it is embedded in a discourse which provides disambiguating information. If the discourse were continued as in (4)  (4) Einige Ärzte haben eine Schwester, mit der sie verheiratet sind.</Paragraph>
      <Paragraph position="5"> we could rule out the undesired reading given in (5). (5) $\exists x(\mathit{Phys.}(x) \land \exists y(\mathit{Sister}(y, x) \land \mathit{Married}(x, y)))$ This reading, which is expressed in English by (6) (6) Some physicians have a sister to whom they are married.</Paragraph>
      <Paragraph position="6"> can be ruled out, since according to our conceptual system nobody can be married to his sister. Since this part of our conceptual knowledge can be formalized, as in (7) (7) $\forall x \forall y(\mathit{Sister}(y, x) \rightarrow \neg\mathit{Married}(x, y))$</Paragraph>
      <Paragraph position="8"> the inappropriateness of reading (5) can be explicated from a logical point of view by the fact that we can derive a contradiction from that reading of (4) and our conceptual knowledge (meaning postulates).[6]</Paragraph>
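      <Paragraph> The contradiction between the Sister reading (5) and the postulate that nobody can be married to his sister can also be observed semantically. The following is only a minimal sketch, not part of the original procedure: it enumerates every interpretation of the three predicates over a hypothetical two-element universe and counts those that satisfy (5), with and without the postulate. The universe size and all function names are assumptions made for illustration, and a check over one small universe is of course no substitute for a proof.

```python
from itertools import product

DOMAIN = (0, 1)  # hypothetical two-element universe
PAIRS = tuple(product(DOMAIN, repeat=2))


def subsets(items):
    """Yield every subset of items as a frozenset."""
    for mask in product((False, True), repeat=len(items)):
        yield frozenset(item for item, keep in zip(items, mask) if keep)


def reading_5(phys, sister, married):
    """(5): some physician x has a sister y and is married to y."""
    return any(x in phys and (y, x) in sister and (x, y) in married
               for x in DOMAIN for y in DOMAIN)


def postulate_7(sister, married):
    """Sister postulate: if y is a sister of x, then x is not married to y."""
    return all((y, x) not in sister or (x, y) not in married
               for x in DOMAIN for y in DOMAIN)


def count_models(use_postulate):
    """Count interpretations satisfying (5), optionally together with the postulate."""
    total = 0
    for phys in subsets(DOMAIN):
        for sister in subsets(PAIRS):
            for married in subsets(PAIRS):
                if not reading_5(phys, sister, married):
                    continue
                if use_postulate and not postulate_7(sister, married):
                    continue
                total += 1
    return total


print(count_models(use_postulate=False) != 0)  # (5) alone has models
print(count_models(use_postulate=True) == 0)   # (5) plus the postulate has none
```

Both lines print True: reading (5) is satisfiable on its own, but no interpretation over this universe satisfies it together with the postulate.</Paragraph>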
    </Section>
  </Section>
  <Section position="4" start_page="980" end_page="981" type="metho">
    <SectionTitle>
3 The Intractability Problem
</SectionTitle>
    <Paragraph position="0"> Our inference-based reconstruction of the disambiguation process given in the previous section requires on the one hand that the meaning of the text is adequately represented in an appropriate (formal) representation language, which allows the encoding of conceptual knowledge as well. By requiring on the other hand the underlying logic to be sound and as far as possible complete, we run, of course, into well-known decidability problems.</Paragraph>
    <Paragraph position="1"> Without any further restrictions on the expressive power of the representation language and/or the underlying logic, the inconsistency of the representation of an arbitrary text and our conceptual knowledge is not decidable. Thus a natural language system whose resolver is based on such an inference system is not very useful, since an attempt to resolve an ambiguity is not guaranteed to terminate.</Paragraph>
    <Paragraph position="2"> Since the field of AI which deals with knowledge representation and retrieval has been worrying about the same problem for quite a long time, it is not surprising that approaches to cope with this problem within lexical disambiguation were directly adopted from knowledge representation.</Paragraph>
    <Paragraph position="3"> [Footnote 6: There is, of course, another procedure which is dual to the given one. The dual variant allows us to rule out a reading if this reading of the discourse contains redundant information, i.e., information which already follows from the meaning postulates. This procedure would exclude, e.g. for 'Einige Ärzte haben eine Schwester, mit der sie nicht verheiratet sind', the Sister reading, which is expressed in English by 'Some physicians have a sister to whom they are not married', since (7) implies for physicians who have a sister that they are not married to her.]</Paragraph>
    <Paragraph position="4"> According to the subject of the restriction used to ensure the tractability of the problem, we have to distinguish three main approaches.</Paragraph>
    <Paragraph position="5"> The simplest way to guarantee tractability of the disambiguation problem is by restricted computations. If the underlying logic of a resolver is known to be undecidable (e.g. the inference machine used in LILOG (Bollinger, Lorenz, and Pletat 1991)), the only chance to ensure termination is to stop the computation after a limited amount of resources (inference length, computation time, etc.) has been consumed. Since the termination behavior of such a system is, without any further empirical evidence, not in any way correlated with our cognitive capabilities and, without any further formal evidence, not in any way correlated with the behavior which we would expect if the disambiguation problem were nevertheless decidable, we have to rule out these approaches from a scientific point of view.</Paragraph>
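    <Paragraph> To make the objection against restricted computations concrete, here is a minimal sketch of a resource-bounded resolver for ground (variable-free) clauses. It is an illustration under assumptions, not the LILOG inference machine: the step budget, the three result strings and the function names are all hypothetical. The point is that the answer "gave up" says nothing about whether the clause set is consistent.

```python
def complement(literal):
    """Flip the sign of a propositional literal written as 'A' or '-A'."""
    return literal[1:] if literal.startswith("-") else "-" + literal


def resolvents(c1, c2):
    """All binary resolvents of two ground clauses (frozensets of literals)."""
    return [(c1 - {lit}) | (c2 - {complement(lit)})
            for lit in c1 if complement(lit) in c2]


def bounded_refute(clauses, max_steps):
    """Saturate under resolution, but stop once max_steps resolvents were built."""
    known = set(clauses)
    steps = 0
    changed = True
    while changed:
        changed = False
        for c1 in list(known):
            for c2 in list(known):
                for resolvent in resolvents(c1, c2):
                    if steps == max_steps:
                        return "gave up"      # budget exhausted, nothing learned
                    steps += 1
                    if len(resolvent) == 0:
                        return "inconsistent"  # empty clause derived
                    if resolvent not in known:
                        known.add(resolvent)
                        changed = True
    return "saturated"                         # no refutation exists


# Ground instance of reading (5) plus the sister postulate (P, S, M as atoms).
CLAUSES = [frozenset({"P"}), frozenset({"S"}), frozenset({"M"}),
           frozenset({"-S", "-M"})]

print(bounded_refute(CLAUSES, max_steps=100))
print(bounded_refute(CLAUSES, max_steps=0))
```

With a generous budget the contradiction is found; with a budget of zero the very same clause set yields "gave up", which is the uninformative outcome criticized above.</Paragraph>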
    <Paragraph position="6"> The second class of approaches achieves tractability by restricted representation languages.</Paragraph>
    <Paragraph position="7"> These restrictions allow one to base retrieval on a tractable logic which is sound and complete. In order to support the distinction between terminological and assertional knowledge, most formalisms of this class provide two different (restricted) representation languages: the terminological language and the assertional language.</Paragraph>
    <Paragraph position="8"> To use one of these knowledge representation formalisms (especially the tractable descendants of KL-ONE) for lexical disambiguation leads to problems which disqualify language restrictions as the only means to ensure tractability of the disambiguation problem. On the one hand it is, of course, possible to find examples of meaning postulates which are inexpressible in the restricted terminological languages (see e.g. the list given in Doyle and Patil 1991). But these counterexamples do not provide conclusive arguments, since the expressive power needed in order to formulate these counterexamples is still rather weak, and one could counter by moving a little bit of expressive power around. Much more crucial for disambiguation are the restrictions imposed on the assertional language.</Paragraph>
    <Paragraph position="9"> In BACK (Hoppe et al. 1993), for example, which is used by Quantz and Schmitz 1993 for disambiguation by storing the text representation in the ABox (assertional knowledge base) and the meaning postulates in the TBox (terminological knowledge base), it is not possible to represent (4) in an adequate way. We can only find representations whose models include the models of (5), but not a representation with exactly the same models. In order to see this, consider the set-theoretic versions of the satisfiability conditions of (5) and (7) (for a model with interpretation function $\mathcal{I}$) given in (8) and (9).  (8) $([\![\mathit{Phys.}]\!]^{\mathcal{I}} \cap \{x \mid \exists y((y, x) \in [\![\mathit{Sist.}]\!]^{\mathcal{I}} \cap [\![\mathit{Marr.}]\!]^{\mathcal{I}})\}) \neq \emptyset$ (9) $[\![\mathit{Sister}]\!]^{\mathcal{I}} \cap [\![\mathit{Married}]\!]^{\mathcal{I}} = \emptyset$ According to these conditions the BACK expressions (10) and (11) would be adequate representations of (5) and (7).</Paragraph>
    <Paragraph position="10"> (10) X :: Phys. and some(Sister and Married) (11) Sister and Married :&lt; nothing  Although (10) contradicts the TBox representation (11) of (7), it is not possible to use BACK to establish this inconsistency (incoherence), since BACK does not allow the conjunction of roles in the ABox (cf. Hoppe et al. 1993, p. 50), which is of course needed in (10) (the conjunction of the roles Sister and Married).</Paragraph>
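    <Paragraph> The set-theoretic conditions (8) and (9) can be evaluated directly on a concrete interpretation. The sketch below is an illustration only: it represents the extensions of the predicates as Python sets, and the two sample interpretations (constants a and b) are hypothetical. It does not model BACK itself.

```python
def satisfies_8(phys, sister, married):
    """(8): some physician x has a y with (y, x) in both Sister and Married."""
    in_both = sister.intersection(married)
    targets = {x for (_, x) in in_both}
    return len(phys.intersection(targets)) != 0


def satisfies_9(sister, married):
    """(9): the extensions of Sister and Married are disjoint."""
    return len(sister.intersection(married)) == 0


# Hypothetical interpretation: b is the sister of physician a, and the pair is married.
PHYS = {"a"}
SISTER = {("b", "a")}
MARRIED = {("b", "a")}

print(satisfies_8(PHYS, SISTER, MARRIED), satisfies_9(SISTER, MARRIED))
print(satisfies_8(PHYS, SISTER, set()), satisfies_9(SISTER, set()))
```

The first interpretation satisfies (8) but violates (9); emptying the Married extension restores (9) at the price of (8). Any witness for (8) lies in the intersection that (9) requires to be empty, so the two conditions cannot hold together.</Paragraph>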
    <Paragraph position="11"> Example (10) is, of course, just beyond the border of the permitted expressions, since it is in principle expressible but not allowed, and much more problematic (e.g. for 'donkey' sentences) is certainly the fact that variables are not explicitly available in these representation languages. But it should indicate the lack of expressive power, at least inasmuch as this is possible without a more general formal proof (which we cannot give here for lack of space). Since the correct disambiguating inferences can no longer be performed if the truth conditions of a discourse are boiled down in a way that allows it to be represented (somehow) in such a restricted assertional language, approaches which model lexical disambiguation on the basis of these knowledge representation formalisms must fail.</Paragraph>
    <Paragraph position="12"> Since an extension of the expressive power of the assertional languages would lead immediately to our original tractability problem, we have to give up the implicit assumption that lexical disambiguation presupposes the consistency of the discourse, if we do not want to give up lexical disambiguation altogether. Thus, we end up in the third class of approaches, which provide us with fully expressive languages to represent discourse and ensure tractability by limited inferences. In order to see whether the requirements of soundness and completeness can be adequately weakened, we have to study the inferences involved in lexical disambiguation more carefully.</Paragraph>
  </Section>
  <Section position="5" start_page="981" end_page="984" type="metho">
    <SectionTitle>
4 Towards Tractable Lexical Disambiguation
</SectionTitle>
    <Paragraph position="0"> Limiting inference is a well-known strategy employed for knowledge retrieval (e.g. Frisch and Allen 1982). By using incomplete theorem provers it is certainly possible to ensure tractability, but incompleteness is always a compromise which can be accepted as long as the prover computes the desired inferences completely (which is in fact hard to show).</Paragraph>
    <Paragraph position="1"> In contrast to knowledge retrieval, where incompleteness is assumed for utility reasons, inference systems used for lexical disambiguation have to be essentially incomplete. Otherwise we would get wrong results. In order to motivate our restrictions we proceed in three steps. In the first step we show that we need an incomplete (but sound) inference mechanism for lexical disambiguation, since a complete mechanism leads to wrong results. We specify a class of inconsistency proofs which contains the disambiguating inferences as a subclass. In the second step, we separate out those proofs which are in fact disambiguating, and illustrate in the last step that the discourse structure imposes further restrictions on the accessibility of premises.</Paragraph>
    <Section position="1" start_page="982" end_page="983" type="sub_section">
      <SectionTitle>
4.1 The Incompleteness and Decidability of Lexically Disambiguating Inference Mechanisms
</SectionTitle>
      <Paragraph position="0"> In order to develop our approach to lexical disambiguation, we work successively through some adequacy conditions which have to be satisfied by an adequate procedure. According to the discussion in section 3 we have to assume a fully expressive language for the representation of discourse. Assumption (I) is therefore as follows:</Paragraph>
      <Paragraph position="1"> (I) We have to assume a fully expressive language for the representation of texts. Semantic representations of natural language texts in this language do in general not satisfy conditions which make them decidable (see e.g. Rabin 1977 for standard conditions). To illustrate which kind of incompleteness we need, we assume that the meaning postulates and the discourse can be expressed in a first-order language without function symbols and identity. Although we think that one needs a more expressive language for an adequate representation of discourse, and that very often nonmonotonic reasoning is involved, the first-order case seems nevertheless representative, since we have to deal with the decidability problem. Moreover, we expect that the methodology we used can be applied to more expressive discourse representation languages in a similar way.</Paragraph>
      <Paragraph position="2"> For our conceptual knowledge on the other hand we make the much stronger assumption (II).</Paragraph>
      <Paragraph position="3">  (II) Conceptual knowledge is represented by a finite, consistent and decidable set of meaning postulates MP that does not contain logically valid subsets of formulas.[8] [Footnote 8: Since this condition is certainly not satisfied by our world knowledge, its integration in the disambiguation process would be a much harder problem.] Decidability of MP, i.e. the decidability of $MP \vdash \phi$ for a given formula $\phi$, results from the fact that MP does not make any absolute existential claim on the entities in the world, especially on their cardinality.[9] In order to be able to specify the incompleteness of our inference machinery in terms of a resolution logic, let us in the following assume that MP and the discourse are given in Skolem conjunctive form (SCF), i.e., as two universally quantified formulas whose matrices are in conjunctive normal form.</Paragraph>
      <Paragraph position="4"> Let us furthermore assume that we knew that the given discourse is consistent (we abstract here at first from the problem that this property is undecidable). We would then be able to determine the unsatisfiability of the discourse and MP by resolution.</Paragraph>
      <Paragraph position="5"> Let us take, for example, the set of clauses obtained from the SCFs of the meaning postulate (7) and the discourse (5) by the standard preparation procedures. If we abbreviate Physician by P, Sister by S and Married by M and use clause set notation (each conjunct of the matrix is represented as the set of its disjunctively connected literals), the unsatisfiability of (5) and (7) can be shown, since there is a resolution refutation, depicted in (12). (12) [resolution refutation of (5) and (7); the diagram is not preserved in this text]</Paragraph>
      <Paragraph position="7"> The whole problem is now that, despite the decidability of MP, the lexical disambiguation problem would still be undecidable if it presupposed a consistent discourse. Decidability of the lexical disambiguation problem results nevertheless from the fact that lexical disambiguation does not involve a complete understanding of the discourse. In order to illustrate that, let us consider the inconsistent lexically ambiguous sentences (13a,c), whose Sister readings are expressed in English by (13b,d).</Paragraph>
      <Paragraph position="8"> [Footnote 9: By checking several examples we found out that this property can be characterized model-theoretically as follows. There is a finite set of (up to isomorphism unique) finite models $\{M_1, .., M_n\}$ of MP such that each other finite model $M'$ of MP can successively be reduced to a model $M \in [M_k]$ by a chain of models $M = M^0 \prec M^1 \prec .. \prec M^j = M'$ of MP such that for each pair of models $M^i = (U^i, \Im^i)$, $M^{i+1} = (U^{i+1}, \Im^{i+1})$ there is a (partial) isomorphism $f$ from $U^{i+1} \setminus U^i$ in $U^i$ such that $\Im^i(R)$ is the set of tuples $(a_1, .., a_m)$ with $(b_1, .., b_m) \in \Im^{i+1}(R)$, and $a_l = b_l$ if $b_l \in U^i$, and $a_l = f(b_l)$ if $b_l \in U^{i+1} \setminus U^i$, for every relation symbol $R$. Since the infinite models of MP correspond to unions of infinite chains of such models (i.e. MP is a rather restricted $\forall\exists$-theory), we can reduce the test of $M' \models \phi$ for each model $M'$ of MP to a test of $M_k \models \phi$. Thus, we can decide $MP \vdash \phi$ by checking $M \models \phi$ for all models $M \in \{M_1, .., M_n\}$. But note that this does not allow us to test whether $\phi$ is valid or not.]</Paragraph>
      <Paragraph position="10">  (13) (a) Es gibt keine Schwestern, aber einige Ärzte haben eine, mit der sie nicht verheiratet sind.</Paragraph>
      <Paragraph position="11"> (b) There are no sisters at all, but some physicians have one to whom they are not married. (c) Es gibt keine Schwestern, aber einige Ärzte haben eine, mit der sie verheiratet sind.</Paragraph>
      <Paragraph position="12"> (d) There are no sisters at all, but some physicians have one to whom they are married.</Paragraph>
      <Paragraph position="13"> Although it is possible to derive a contradiction from the semantic representations of (13a,c), these proofs are by no means disambiguating inferences, since the meaning postulates are not involved. In order to be able to explain by inconsistency proofs why the Sister reading is excluded for (13c) but not for (13a), one has to assume an incomplete inference system.[10] Otherwise the system would not work correctly and would, of course, not necessarily terminate. Thus, our third assumption is: (III) Lexical disambiguation is very often possible although the discourse is inconsistent or its consistency is not known.</Paragraph>
      <Paragraph position="14"> What we are in fact looking for is a procedure which tests whether there is a consistent set of information pieces of the discourse which contradicts MP. In order to isolate the consistent information pieces provided by a (possibly inconsistent) discourse we use a discourse representation (and meaning postulates) in clause form. Since each single clause of such a representation must be satisfiable, we can identify the set of consistent information pieces provided by a discourse with the set of clauses of the discourse in SCF. On the basis of this set we can then test whether there is a consistent subset of these pieces which contradicts MP. Take as an example the clause representation of (13a) and our meaning postulate (7) depicted in (14a,b).</Paragraph>
      <Paragraph position="15">  (14) (a) $\{\neg S(u,v)\}$ $\{P(a)\}$ $\{S(b,a)\}$ $\{\neg M(a,b)\}$ (b) $\{\neg S(y,x), \neg M(x,y)\}$ ..</Paragraph>
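      <Paragraph> The test "is there a consistent subset of the discourse clauses which contradicts MP" can be illustrated by brute force on ground clauses. The following is only a sketch under simplifying assumptions: the universally quantified clause of (14a) is instantiated at the single pair (b, a), satisfiability is checked by truth tables, and the clause set for (13c) (with a positive M literal) is an assumed analogue of (14a). The function names are hypothetical.

```python
from itertools import combinations, product


def atom(literal):
    """Strip a leading negation sign from a literal string."""
    return literal.lstrip("-")


def satisfiable(clauses):
    """Truth-table test for ground clauses given as frozensets of literal strings."""
    atoms = sorted({atom(lit) for clause in clauses for lit in clause})
    for values in product((False, True), repeat=len(atoms)):
        world = dict(zip(atoms, values))
        # A literal holds when its atom's value differs from "is negated".
        if all(any(world[atom(lit)] != lit.startswith("-") for lit in clause)
               for clause in clauses):
            return True
    return False


def contradicting_subsets(discourse, mp):
    """Consistent subsets of the discourse clauses that are inconsistent with MP."""
    found = []
    for size in range(1, len(discourse) + 1):
        for subset in combinations(discourse, size):
            if satisfiable(subset) and not satisfiable(list(subset) + mp):
                found.append(subset)
    return found


MP = [frozenset({"-S(b,a)", "-M(a,b)"})]  # ground instance of (14b)
# (14a), with the universal clause instantiated at (b, a) only
D13A = [frozenset({"-S(b,a)"}), frozenset({"P(a)"}),
        frozenset({"S(b,a)"}), frozenset({"-M(a,b)"})]
# assumed clause form of (13c): the last literal is positive
D13C = D13A[:3] + [frozenset({"M(a,b)"})]

print(contradicting_subsets(D13A, MP))
print(len(contradicting_subsets(D13C, MP)))
```

For (13a) no such subset exists, so the Sister reading is not excluded; for (13c) the clauses S(b,a) and M(a,b) form a consistent subset that contradicts the postulate. The enumeration is exponential and only meant to make the criterion explicit.</Paragraph>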
      <Paragraph position="16"> That the Sister reading is not excluded for (13a) is then explicable by the fact that there is no consistent subset of clauses of (14a) which is inconsistent with MP. What is consistently said in the (inconsistent) discourse does not violate the meaning postulates in this case. [Footnote 10: For the sake of simplicity we were confined to short and simple examples and could therefore not avoid some artificiality. Moreover, an additional test based on the procedure sketched in footnote 6 would certainly exclude the Sister reading for (13a). But it is, of course, easy to construct more realistic examples where the inconsistency is much more hidden and does not affect the disambiguation.]</Paragraph>
      <Paragraph position="17"> In order to test this kind of incompatibility we have to demand that each resolution deduction starts with a clause from MP. This restriction prevents the attempt to prove the inconsistency of the discourse alone (at least if MP does not contain logically valid subsets of formulas, which we assume and are able to decide). It prevents us from proving the unsatisfiability of (14a,b), but we can still show the inconsistency of the clause representation of (13c) and (14b) as in (12).</Paragraph>
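      <Paragraph> The effect of demanding that a deduction starts with a clause from MP can be sketched with a small depth-bounded linear resolution search. This is an illustration under stated assumptions rather than the authors' implementation: terms are only variables and constants (no function symbols), side clauses are always input clauses, factoring is omitted, and the depth bound, the variable set and all function names are hypothetical. The clause set for (13c) is assumed to be (14a) with a positive M literal.

```python
from itertools import count

VARIABLES = {"x", "y", "u", "v"}  # every other term is a constant
_fresh = count()


def is_var(term):
    return term.split("_")[0] in VARIABLES


def walk(term, subst):
    """Follow variable bindings until an unbound term is reached."""
    while term in subst:
        term = subst[term]
    return term


def unify(args1, args2):
    """Unify two argument tuples without function symbols; None on failure."""
    subst = {}
    for s, t in zip(args1, args2):
        s, t = walk(s, subst), walk(t, subst)
        if s == t:
            continue
        if is_var(s):
            subst[s] = t
        elif is_var(t):
            subst[t] = s
        else:
            return None
    return subst


def substitute(clause, subst):
    return frozenset((sign, pred, tuple(walk(a, subst) for a in args))
                     for sign, pred, args in clause)


def rename(clause):
    """Give the variables of a side clause a fresh suffix."""
    tag = "_" + str(next(_fresh))
    return frozenset((sign, pred, tuple(a + tag if is_var(a) else a for a in args))
                     for sign, pred, args in clause)


def resolvents(center, side):
    side = rename(side)
    for lit1 in center:
        for lit2 in side:
            if lit1[1] == lit2[1] and lit1[0] != lit2[0]:
                subst = unify(lit1[2], lit2[2])
                if subst is not None:
                    yield substitute((center - {lit1}) | (side - {lit2}), subst)


def linear_refutation(tops, sides, depth=6):
    """Depth-bounded search for a linear refutation starting from a clause in tops."""
    def search(center, remaining):
        if len(center) == 0:
            return True
        if remaining == 0:
            return False
        return any(search(r, remaining - 1)
                   for side in sides for r in resolvents(center, side))
    return any(search(top, depth) for top in tops)


def neg(pred, *args):
    return (False, pred, args)


def pos(pred, *args):
    return (True, pred, args)


MP = [frozenset({neg("S", "y", "x"), neg("M", "x", "y")})]                 # (14b)
D13A = [frozenset({neg("S", "u", "v")}), frozenset({pos("P", "a")}),
        frozenset({pos("S", "b", "a")}), frozenset({neg("M", "a", "b")})]  # (14a)
D13C = D13A[:3] + [frozenset({pos("M", "a", "b")})]  # assumed clause form of (13c)

print(linear_refutation(MP, MP + D13A))         # start in MP, discourse (13a)
print(linear_refutation(MP, MP + D13C))         # start in MP, discourse (13c)
print(linear_refutation(MP + D13A, MP + D13A))  # unrestricted start, (13a)
```

Started from MP, the search finds no refutation for (13a) but does find one for (13c); once a discourse clause may serve as the starting point, the internal inconsistency of (13a) is detected as well, which is exactly the kind of proof the restriction is meant to exclude.</Paragraph>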
    </Section>
    <Section position="2" start_page="983" end_page="983" type="sub_section">
      <SectionTitle>
4.2 Disambiguating Inferences
</SectionTitle>
      <Paragraph position="0"> The restriction introduced above is by no means sufficient, since the proof procedure is not yet sensitive to the predicates representing the readings of an ambiguous lexical item. In order to illustrate this insufficiency let us consider the English translation of the Sister reading of (4), repeated in (15).</Paragraph>
      <Paragraph position="1"> (15) Some physicians have a sister to whom they are married.</Paragraph>
      <Paragraph position="2"> If we also assume (7) for English, then a contradiction would result although we did not regard 'sister' as ambiguous (at least in our oversimplified language domain). Hence, if (15) were embedded in a larger discourse we would have no chance to disambiguate other ambiguous lexical items, since we would get a contradiction for every reading of these items. That disambiguation is nevertheless possible in many of those cases can be made obvious e.g. by continuing (15) as in (16).</Paragraph>
      <Paragraph position="3"> (16) Some physicians have a sister to whom they are married. Some of these sisters admire stars who got an Oscar.</Paragraph>
      <Paragraph position="4"> The disambiguation of the ambiguous item 'star' should pose no problems, given that we had the right meaning postulates. Thus, we have to assume: (IV) Lexical disambiguation is very often possible although the discourse contradicts our conceptual knowledge.</Paragraph>
      <Paragraph position="5"> In order to disambiguate properly we have to consider only those consistent sets of information pieces which contain at least one occurrence of the predicate that represents one reading of the ambiguous lexical item. Therefore we have to demand in addition that each resolution deduction starts with a pair of clauses $A \in MP$ and $B$ from the discourse representation, where $B$ contains an occurrence of the predicate representing one reading of the ambiguous lexical item. This prevents disambiguating inferences for cases where there is no choice with respect to the interpretation of the discourse ('sister' has to be interpreted as Sister although there is a contradiction).</Paragraph>
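      <Paragraph> The difference between the two restrictions can be sketched on a ground version of (16). Everything specific in this example is an assumption made for illustration: the predicate names StarCelestial and StarPerson for the two readings of 'star', the hypothetical postulate that celestial stars do not win Oscars, the constant c, and the depth bound. The sketch only shows how the requirement on the first resolution step keeps the unrelated sister contradiction from excluding every reading.

```python
def complement(literal):
    return literal[1:] if literal.startswith("-") else "-" + literal


def resolvents(c1, c2):
    """Binary resolvents of two ground clauses (frozensets of literal strings)."""
    return [(c1 - {lit}) | (c2 - {complement(lit)})
            for lit in c1 if complement(lit) in c2]


def refutes(center, sides, depth):
    """Depth-bounded linear search for the empty clause."""
    if len(center) == 0:
        return True
    if depth == 0:
        return False
    return any(refutes(r, sides, depth - 1)
               for side in sides for r in resolvents(center, side))


def mentions(clause, predicate):
    return any(lit.lstrip("-").startswith(predicate + "(") for lit in clause)


def mp_initial_only(mp, discourse, depth=5):
    """Restriction of 4.1: the deduction merely starts with a clause from MP."""
    return any(refutes(top, mp + discourse, depth) for top in mp)


def reading_excluded(mp, discourse, predicate, depth=5):
    """Restriction of 4.2: the first step resolves A in MP with a B mentioning predicate."""
    for a in mp:
        for b in discourse:
            if mentions(b, predicate):
                for first in resolvents(a, b):
                    if refutes(first, mp + discourse, depth):
                        return True
    return False


# Hypothetical postulates: the sister postulate, and "celestial stars win no Oscars".
MP = [frozenset({"-Sister(b,a)", "-Married(a,b)"}),
      frozenset({"-StarCelestial(c)", "-Oscar(c)"})]


def discourse(star_predicate):
    """Ground clauses for (16) under one reading of 'star'."""
    return [frozenset({"Sister(b,a)"}), frozenset({"Married(a,b)"}),
            frozenset({star_predicate + "(c)"}), frozenset({"Oscar(c)"})]


for reading in ("StarCelestial", "StarPerson"):
    d = discourse(reading)
    print(reading, mp_initial_only(MP, d), reading_excluded(MP, d, reading))
```

With only the first restriction both readings are refuted, because the sister contradiction is always available. With the additional requirement on the starting pair, only the celestial reading is excluded, so 'star' can still be disambiguated.</Paragraph>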
    </Section>
    <Section position="3" start_page="983" end_page="984" type="sub_section">
      <SectionTitle>
4.3 Reflecting Discourse Structure
</SectionTitle>
      <Paragraph position="0"> For lexical disambiguation we assumed so far that the underlying inference machinery operates on the set of consistent information pieces provided by the discourse. This set was crucially dependent on what is said and not on what follows, since we were (especially in case of inconsistencies) not interested in the set of all logical consequences of a discourse. Hence, our procedure already reflects in a very weak sense the discourse structure, since we did not allow all conversions preserving logical equivalence, but only those needed to construct an SCF from the discourse.</Paragraph>
      <Paragraph position="1"> By converting the whole discourse into SCF we made all consistent information pieces provided by the discourse accessible for lexical disambiguation. Whether we need this entire set or just a rather limited subset of pieces which can be made accessible by locally restricted conversions into SCF is, for a first-order discourse, an empirical but not a formal problem. But if we consider discourse representations in more expressive languages (e.g. the language of an intensional logic) it becomes clear that we have to make only those consistent pieces accessible which result from first-order consequences of the discourse representation. Information in the scope of the intensional verb in (17a), whose Sister reading is expressed in English by (17b), is, for example, not accessible for lexical disambiguation.  (17) (a) Einige Ärzte versuchten ihre Schwestern zu heiraten.</Paragraph>
      <Paragraph position="2"> (b) Some physicians tried to marry their sisters.</Paragraph>
      <Paragraph position="3"> Since we cannot get an SCF of the first-order consequences of a (possibly inconsistent) discourse represented in a more expressive representation language, it is necessary to find exactly those logical equivalence preserving conversions which allow us to convert the discourse representation in such a way that the adequate set of consistent information pieces can be made accessible for the disambiguation by locally restricted conversions into SCF. But we must, of course, admit that further study is needed in order to be able to determine these conversions.</Paragraph>
    </Section>
  </Section>
</Paper>