<?xml version="1.0" standalone="yes"?>
<Paper uid="E93-1008">
  <Title>Disjunctions and Inheritance in the Context Feature Structure System</Title>
  <Section position="4" start_page="0" end_page="55" type="metho">
    <SectionTitle>
2 The Use of Disjunctions and Inheritance
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="54" type="sub_section">
      <SectionTitle>
Disjunctions
</SectionTitle>
      <Paragraph position="0"> Disjunctions are used to express ambiguity and capability. A first example is provided by the lexicon entry for German die (the, that, ...) in Figure 1. It may be nominative or accusative, and if it is singular the gender has to be feminine.</Paragraph>
      <Paragraph position="1"> Those parts of the term which are not inside a disjunction are required in any case. Such parts shall be shared by all &quot;readings&quot; of the term. The internal</Paragraph>
      <Paragraph position="3"> representation shall provide mechanisms which prevent the multiplication of independent disjunctions (into DNF).</Paragraph>
      <Paragraph position="4"> [Figure 2: the type trans (surviving fragment: dom : syn : categ : gvb : aktiv)]</Paragraph>
      <Paragraph position="6"> As a second example Figure 2 shows a type describing possible realizations of a transitive object. The outermost disjunction distinguishes whether the dominating predicate is in active or in passive voice.</Paragraph>
      <Paragraph position="7"> For active predicates either a noun (syn : categ : class : nomn) or a subsentence (syn : categ : class : ssent) is allowed. This way disjunctions describe and restrict the possible combinations of constituents.</Paragraph>
    </Section>
    <Section position="2" start_page="54" end_page="55" type="sub_section">
      <SectionTitle>
External Treatment of Disjunctions
</SectionTitle>
      <Paragraph position="0"> The KONTEXT grammar is a lexicalized grammar. This means that the possible combinations of constituents are described by the entries in the lexicon rather than in a separate, general grammar.</Paragraph>
      <Paragraph position="1"> A chart parser is used in order to decide which constituents to combine and to maintain the combinations. This means that some of the disjunctions concerning concrete combinations are handled not by the unification formalism, but by the chart. Therefore structure sharing for inheritance, which is extensively used by the parser, is even more important.</Paragraph>
      <Paragraph position="2"> Inheritance. Inheritance is used for two purposes: abstraction in the lexicon and non-destructive combination of chart entries. Figure 3 together with the type trans of Figure 2 shows an example of abstraction: the feature structure of trans is inherited (marked by $&lt;&gt;) into the structure for the lexeme spielen (to play) at the destination of the path syn : slots :. A virtual copy of the type structure is inserted. The type trans will be inherited by all the verbs which allow (or require) a transitive object. It therefore makes sense not only to inherit the structure into all these verbs on the level of grammar description, but also to share the structure in the internal representation, without copying it.</Paragraph>
      <Paragraph position="4"> Inheritance is also extensively used by the parser.</Paragraph>
      <Paragraph position="5"> It works bottom-up and has to try different combinations of constituents. For single words it just looks up the structures in the lexicon. Then it combines a slot of a functor with a filler. An example is given in Figure 4, which shows a trace of the chart for the sentence Kinder spielen eine Rolle im Theater. (Children play a part in the theatre.) In the sixth block, in the line starting with ... 4, the parser combines type _16 (for the lexicon entry of im) with type _17 (for Theater) and defines this combination dynamically as type _18. _16 is the functor, _17 the filler, and caspn the name of the slot. The combination is done by unification of feature structures by the CFS system.</Paragraph>
      <Paragraph position="6"> The point here is that the parser tries to combine the result _18 of this step more than once with different other structures, but unification is a destructive operation! So, instead of directly unifying the structures of, say, _7 and _18 (_11 and _18, ...), _7 and _18 are inherited into the new structure of _20. This way virtual copies of the structures are produced, and these are unified. It is essential for efficiency that a virtual copy does not mean that the structure of the type has to be copied. The lazy copying approach ([Kogure, 1990], and [Emele, 1991] for lazy copying in TFS with historical backtracking) copies only overlapping parts of the structure. CFS avoids even this by structure- and constraint-sharing.</Paragraph>
      <Paragraph position="7"> For common sentences in German, which tend to be rather long, a lot of types will be generated. They supply only a small part of the structure themselves (just the path from the functor to the filler and a simple slot-filler combination structure). The bulk of the</Paragraph>
      <Paragraph position="9"> structure is shared among the lexicon and all the different combinations produced by the parser.</Paragraph>
      <Paragraph position="10"> Recursive inheritance would be a means to combine phrases in order to analyze (and generate) without a parser (as in TFS). On the other hand, a parser is a controlled device which, e.g., knows about important paths in feature structures describing constituents, and which can perform steps in a certain sequence, while unification is in principle sequence-invariant. We think that recursion is not in principle impossible in spite of CFS's concurrent treatment of disjunctions, but we draw the borderline between the parser and the unification formalism such that the cases for recursion and iteration are handled by the parser. This seems to be more efficient.</Paragraph>
    </Section>
    <Section position="3" start_page="55" end_page="55" type="sub_section">
      <SectionTitle>
The Connection between Disjunctions and Types
</SectionTitle>
      <Paragraph position="0"> The relation between a disjunctive structure and a disjunct resembles the relation between a type and an instance: in a set-theoretic semantics (see below) the denotation of the former is a superset of the denotation of the latter. The difference is that a disjunctive structure is invalid, i.e. has the empty set as denotation, if each disjunct is invalid.</Paragraph>
      <Paragraph position="1"> A type, however, stays valid even when all its currently known instances are invalid. This distinction mirrors the uses of the two: inheritance for abstraction, disjunctions for complete enumeration of alternatives. When an external system, like the chart of the parser, keeps track of the relation between types and instances, disjunctions might be replaced by inheritance.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="55" end_page="57" type="metho">
    <SectionTitle>
3 Contexts and Inheritance
</SectionTitle>
    <Paragraph position="0"> This section introduces the syntax and semantics of CFS feature terms, defines contexts, and investigates the relation between type and instance concerning the validity of contexts. We want to define contexts such that they describe a certain reading of a (disjunctive) term, i.e. choose a disjunct for some or all of the disjunctions. We will define validity of a context such that the intended reading has a non-empty denotation.</Paragraph>
    <Paragraph position="1"> The CFS unification algorithm as described in [Böttcher, Könyves-Tóth 92] computes a set of invalid contexts for all unification conflicts, which are always conflicts between constraints expressed in the feature term (or in types). The purpose of the definition of contexts is to cover all possible conflicts, and to define an appropriate search space for the search procedure described in the last part of this paper.</Paragraph>
    <Paragraph position="2"> Therefore our definition of contexts differs from those in [Dörre and Eisele, 1990] or [Backofen et al., 1991]. Syntax and Semantics of Feature Terms. Let $A = \{a, \ldots\}$ be a set of atoms, $F = \{f, f_i, g_i, \ldots\}$ a set of feature names, $D = \{d, \ldots\}$ a set of disjunction names, $X = \{x, y, z, \ldots\}$ a set of type names, and $I = \{i, \ldots\}$ a set of instantiation names. The set of terms $T = \{t, t_1, \ldots\}$ is defined by the recursive scheme in Figure 5. A sequence of type definitions is</Paragraph>
    <Paragraph position="4"> The concrete syntax of CFS is richer than this definition. Variables are allowed to express path equations, and types can be unified destructively. Cyclic path equations (e.g. &lt;&gt; = &lt;g1 ... gm&gt;) are supported, but recursive type definitions and negation are not yet supported.</Paragraph>
    <Paragraph position="5"> In order to define contexts we define the set of disjunctions of a term, the disjuncts of a disjunction, and deciders as (complete) functions from disjunctions to disjuncts. $M_i$ is a mapping substituting all disjunction names $d$ by $i(d)$, where $i$ is unique for each instantiation.</Paragraph>
    <Paragraph position="7"> Figure 6 defines the interpretation $[\![t]\!]_c$ of deciders $c$ w.r.t. terms $t$ as subsets of some universe $U$ (similar to [Smolka, 1988], without sorts, but with named disjunctions and instantiations).</Paragraph>
    <Paragraph position="8"> Similar to deciders we define specializers as partial functions from disjunctions to disjuncts. We also define a partial order $\preceq_t$ on specializers of a term:</Paragraph>
    <Paragraph position="10"> The interpretation function can now be extended to specializers: if $c$ is a specializer of $t$, then $[\![t]\!]_c = \bigcup_{c' \in deciders(t) \,\wedge\, c' \preceq_t c} [\![t]\!]_{c'}$. A specializer is valid iff its denotation is not empty. For the most general specializer, the function $c_\top$ which is undefined on each disjunction, we get the interpretation of the term:</Paragraph>
    <Paragraph position="12"> Contexts will be objects of computation and representation. They are used in order to record validity for distributed disjunctions. We give our definition first, and a short discussion afterwards.</Paragraph>
    <Paragraph position="13"> For the purpose of explanation we restrict the syntax concerning the composition of disjunctions. We say that a disjunctive subterm $\{\ldots\}_d$ of $t$ is outwards in $t$ if there is no subterm $\{\ldots, t_j, \ldots\}_{d'}$ of $t$ with $\{\ldots\}_d$ a subterm of $t_j$. We require for each disjunctive subterm $\{\ldots\}_d$ of $t$ and each subterm $\{\ldots, t_j, \ldots\}_{d'}$ of $t$: if $\{\ldots\}_d$ is outwards in $t_j$ then each subterm $\{\ldots\}_d$ of $t$ is outwards in $t_j$. This relation between $d'$ and $d$ we define as $subdis(d', j, d)$. Figure 7 shows the definition of contexts.</Paragraph>
    <Paragraph position="14"> A specializer $c$ of $t$ is a context of $t$ iff $\forall d, d' \in dis(t)$: [condition of Figure 7 missing]. The set of contexts and a bottom element $\bot$ form a lattice $(\preceq_t, C_t^{\bot})$. The infimum operator of this lattice we write as $\sqcap_t$. We drop the index $t$ from operators whenever it is clear which term is meant.</Paragraph>
    <Paragraph position="15"> Discussion: E.g. for the term</Paragraph>
    <Paragraph position="17"> text. We exclude such specializers which have more general specializers $(d_1 \to 2)$ with the same denotation. For the same term $(d_2 \to 1)$ is not a context. This makes sense because no constraint expressed in the term is required in $(d_2 \to 1)$, whereas e.g. $a$ at the destination of $f$ is required in $(d_1 \to 1, d_2 \to 1)$. We will utilize this information about the dependency of disjunctions as it is expressed in our definition of contexts.</Paragraph>
    <Paragraph position="18"> In order to show what contexts are used for we define the relation is required in (requi) of subterms and contexts of t by the recursive scheme:</Paragraph>
    <Paragraph position="20"> The contexts in which some subterms of t are required, we call input contexts of t. Each value constraint at the destination of a certain path and each path equation is required in a certain input context.</Paragraph>
    <Paragraph position="21"> $a$ is required in $(d_1 \to 1)$ at the destination of $f$, and $e$ is required in $(d_2 \to 2)$ at the destination of $f$, and the conflict is in the infimum context $(d_1 \to 1) \sqcap (d_2 \to 2) = (d_1 \to 1, d_2 \to 2)$. This way each conflict is always in one context, and any context might be a context of a conflict. So the contexts are defined with the necessary differentiation and without superfluous elements.</Paragraph>
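    <Paragraph> The infimum of two contexts can be illustrated with a minimal sketch. It assumes a hypothetical encoding that is not part of CFS itself: a context is a dictionary from disjunction names to chosen disjuncts, and None stands for the bottom element. The sketch also assumes that both arguments already are contexts, so the subdis condition is not re-checked.

```python
from typing import Dict, Optional

# Hypothetical encoding: disjunction name -> number of the chosen disjunct.
Context = Dict[str, int]


def meet(c1: Context, c2: Context) -> Optional[Context]:
    """Infimum of two contexts; None plays the role of the bottom element."""
    merged = dict(c1)
    for d, j in c2.items():
        if merged.get(d, j) != j:
            return None  # the two contexts choose different disjuncts of d
        merged[d] = j
    return merged
```

With this encoding, meet({"d1": 1}, {"d2": 2}) yields the context defined on both disjunctions, while two contexts that disagree on a shared disjunction yield None.</Paragraph>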
    <Paragraph position="22"> We call the contexts of conflicts nogoods. It is not a trivial problem to compute the validity of a term or a context from the set of nogoods in the general case. This will be the topic of the last part (4).</Paragraph>
    <Paragraph position="23"> Instantiation. If x := t is a type, and x is inherited to some term x$&lt;&gt;i, then for each context c of x there is a corresponding context c' of x$&lt;&gt;i with the same denotation.</Paragraph>
    <Paragraph position="25"> Therefore each nogood of t also implies that the corresponding context of the instance term x$&lt;&gt;i has the empty denotation. It is not necessary to detect the conflicts again. The nogoods can be inherited.</Paragraph>
    <Paragraph position="26"> (In fact they have to be, because CFS will never compute a conflict twice.) If the instance is a larger term, the instance will usually be more specific than the type, and there might be conflicts between constraints in the type and constraints in the instance. In this case there are valid contexts of the type with invalid corresponding contexts of the instance. Furthermore, the inheritance can occur in the scope of disjunctions of the instance.</Paragraph>
    <Paragraph position="27"> We summarize this by the definition of the context mapping m_i in Figure 8.</Paragraph>
    <Paragraph position="29"/>
  </Section>
  <Section position="6" start_page="57" end_page="58" type="metho">
    <SectionTitle>
4 Computing Validity
</SectionTitle>
    <Paragraph position="0"> Given a set of nogood contexts, the disjunctions and the subdis-relation of a term, the question is whether the term is valid, i.e. whether it has a non-empty denotation. A nogood context $n$ means that $[\![t]\!]_n = \emptyset$. The answer to this question in this section will be an algorithm, which in CFS is run after all conflicts are computed, because an incremental version of the algorithm seems to be more expensive. We start with an example in order to show that simple approaches are not effective.</Paragraph>
    <Paragraph position="1"> [Figure 9: example term with its nogoods]</Paragraph>
    <Paragraph position="3"> For the term in Figure 9 the unification algorithm of CFS computes the shown nogoods. The term is invalid because each decider's denotation is empty.</Paragraph>
    <Paragraph position="4"> A strategy which looks for similar nogoods and tries to replace them by a more general one will fail. This example shows that it is necessary at least in some cases to look at (a covering of) more specific contexts.</Paragraph>
    <Paragraph position="5"> But before we start to describe an algorithm for this purpose we want to explain why the algorithm we describe does a little bit more. It computes all most general invalid contexts from the set of given nogoods. This border of invalid contexts, the computed nogoods, allows us afterwards to test cheaply whether a context is invalid or not. It is just the test $\exists n \in \text{Computed-Nogoods} : c \preceq_t n$. This test is frequently required during inspection of a result and during output. Moreover nogoods are inherited, and if these nogoods are the most general invalid contexts, computations for instances will be reduced.</Paragraph>
    <Paragraph position="6"> The search procedure for the most general invalid contexts starts from the most general context $c_\top$.</Paragraph>
    <Paragraph position="7"> It descends through the context lattice and modifies the set of nogoods. We give a rough description first and a refinement afterwards. Recursive procedure n-1 (for a context $c$):
1. if $\exists n \in \text{Nogoods} : c \preceq n$ then return 'bad'.</Paragraph>
    <Paragraph position="8"> 2. select a disjunction $d$ with $c$ undefined on $d$ and such that the specializer $(d \to j, d' \to c(d'))$ is a context. If no such disjunction exists, return 'good'.</Paragraph>
    <Paragraph position="9"> 3. for each $j \in sub(d)$ recursively call n-1 with $(d \to j, d' \to c(d'))$.</Paragraph>
    <Paragraph position="10"> 4. if each call returns 'bad', then replace all $n \in \text{Nogoods} : n \preceq c$ by $c$ and return 'bad'.</Paragraph>
    <Paragraph position="11"> 5. continue with step 2, selecting a different disjunction.
If we replace the fifth step by (5. return 'good'), n-1 becomes a test procedure for validity.</Paragraph>
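    <Paragraph> The five steps can be sketched in a few lines. This is an illustration under simplifying assumptions, not the CFS implementation: contexts are encoded as dictionaries from disjunction names to chosen disjuncts, the disjunctions are assumed not to be nested (so every extension of a context is again a context), and the names is_below, n1 and sub are hypothetical.

```python
from typing import Dict, List

# Hypothetical encoding: disjunction name -> number of the chosen disjunct.
Context = Dict[str, int]


def is_below(c: Context, n: Context) -> bool:
    """True if c is at least as specific as n: c agrees with n wherever n is defined."""
    return all(c.get(d) == j for d, j in n.items())


def n1(c: Context, sub: Dict[str, List[int]], nogoods: List[Context]) -> str:
    """Steps 1 to 5 of n-1 for a term whose disjunctions are not nested."""
    # Step 1: c is covered by a known nogood.
    if any(is_below(c, n) for n in nogoods):
        return "bad"
    # Steps 2 and 5: try every disjunction on which c is still undefined.
    for d in sub:
        if d in c:
            continue
        # Step 3: descend into each disjunct j of d.
        results = [n1({**c, d: j}, sub, nogoods) for j in sub[d]]
        # Step 4: every disjunct is bad, so c replaces the nogoods below it.
        if all(r == "bad" for r in results):
            nogoods[:] = [n for n in nogoods if not is_below(n, c)] + [c]
            return "bad"
    # No disjunction is left that makes c invalid.
    return "good"
```

For two binary disjunctions d1 and d2 with the nogoods (d1 → 1, d2 → 1), (d1 → 1, d2 → 2) and (d1 → 2), the call n1({}, ...) returns 'bad' and leaves the empty context as the single most general nogood. The sketch revisits contexts reached by different specialization sequences, which is the inefficiency the refinements address.</Paragraph>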
    <Paragraph position="12"> n-1 is not very efficient since it visits contexts more than once and since it descends down to the most specific contexts even in cases without nogoods. In order to describe the enhancements we write: $c_1$ is relevant for $c_2$ iff $c_1 \sqcap c_2 \neq \bot$.</Paragraph>
    <Paragraph position="13"> The algorithm implemented for CFS is based on the following ideas:
(a) select the nogoods relevant for $c$; return 'good' if there are none.
(b) specialize $c$ only by disjunctions on which at least one of the relevant nogoods is defined.
(c) order the disjunctions, and select in this order in the cycle of steps 2 to 4.</Paragraph>
    <Paragraph position="14"> (d) prevent multiple visits of contexts by different specialization sequences: if the selected disjunction is lower than some disjunction $c$ is defined on, do not select any disjunction in the recursive calls (do step 1 only).</Paragraph>
    <Paragraph position="15"> The procedure is best parametrized not only by the context $c$, but also by the selection of relevant nogoods, which is reduced in each recursive call (because only 'relevant' disjunctions are selected due to enhancement (b)). This makes the procedure stop at a depth linear in the number of disjunctions a nogood is defined on. Together with the ordering (c, d), every context which is more general than any nogood is visited once (step 1 visits due to enhancement (d) not counted), because these contexts are candidates for most general nogood contexts. For very few nogoods it might be better to use a different procedure searching 'bottom-up' from the nogoods (as [de Kleer, 1986, second part] proposed for ATMS).</Paragraph>
    <Paragraph position="16"> (a) reduces spreading by recognizing contexts without more specific invalid contexts. (b) might be further restricted in some cases: select only such $d$ with $\forall j \in sub(d) : \exists n \in \text{relevant-nogoods} : n(d) = j$. (b) in fact clusters disjunctions into mutually independent sets of disjunctions. This also ignores disjunctions for which there are currently no nogoods, thereby reducing the search space exponentially.</Paragraph>
    <Section position="1" start_page="58" end_page="58" type="sub_section">
      <SectionTitle>
Eliminating Irrelevant Disjunctions
</SectionTitle>
      <Paragraph position="0"> The algorithm implemented in CFS is also capable of a second task: it computes whether disjunctions are no longer relevant. This is the case if either the context in which the disjunctive term is required is invalid, or the contexts of all but one disjunct are invalid. Why is this an interesting property? There are two reasons: this knowledge reduces the search space of the algorithm computing the border of most general nogoods, and during inheritance neither the disjunction nor the nogoods for such disjunctions need to be inherited. It is most often during inheritance that a disjunction of a type becomes irrelevant in the instance. (Nobody would write down a disjunction which becomes irrelevant in the instance itself.) Structure- and constraint-sharing in CFS makes it necessary to keep this information because contexts of shared constraints in the type are still defined on this disjunction, i.e. the disjunction stays relevant in the type. Let the only valid disjunct of $d$ be $k$.</Paragraph>
      <Paragraph position="1"> The information that either the constraint can be ignored ($c(d) \neq k$) or the disjunction can be ignored ($c(d) = k$) is stored with the instantiation. The context mapping for the instantiation filters out either the whole context or the disjunction.</Paragraph>
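      <Paragraph> The filtering step of the context mapping can be sketched as follows. The encoding is an assumption made for illustration only: a context is a dictionary from disjunction names to chosen disjuncts, only_valid (a hypothetical name) records for each irrelevant disjunction d its only valid disjunct k, and None signals that the whole context is filtered out. The renaming of disjunctions per instantiation is left out.

```python
from typing import Dict, Optional

# Hypothetical encoding: disjunction name -> number of the chosen disjunct.
Context = Dict[str, int]


def map_context(c: Context, only_valid: Dict[str, int]) -> Optional[Context]:
    """Map a context of the type to the instance, or return None if it is dropped."""
    mapped: Context = {}
    for d, j in c.items():
        if d in only_valid:
            if j != only_valid[d]:
                return None  # c(d) differs from k: the constraint can be ignored
            continue  # c(d) equals k: the disjunction can be ignored
        mapped[d] = j
    return mapped
```

A context that is undefined on every irrelevant disjunction passes through unchanged.</Paragraph>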
      <Paragraph position="2"> The algorithm is extended in the following way: 4a. if $c$ is an input context of $t$, $d$ is a disjunction specializing $c$, and the subcontexts are also input contexts, and if all but one specialization deliver 'bad', then the disjunction is irrelevant for $t$.</Paragraph>
      <Paragraph position="3"> All subdisjunctions of subterms other than the one which is not 'bad' are irrelevant, too.</Paragraph>
      <Paragraph position="4"> Consequences. One consequence of the elimination of irrelevant disjunctions during inheritance is that an efficient implementation of contexts by bit vectors (as proposed in e.g. [de Kleer, 1986]) with a simple shift operation for context mappings will waste a lot of space. Either sparse coding of these bit vectors or a difficult compactifying context mapping is required. The sparse codings are just vectors of pairs of disjunction names and choices. Maybe someone will find a good solution to this problem. Nevertheless the context mapping does not consume much of the resources, and the elimination of irrelevant disjunctions is worth it.</Paragraph>
    </Section>
  </Section>
</Paper>