<?xml version="1.0" standalone="yes"?>
<Paper uid="E91-1007">
  <Title>HORN EXTENDED FEATURE STRUCTURES: FAST UNIFICATION WITH NEGATION AND LIMITED DISJUNCTION&#8224;</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1. PRELIMINARY CONCEPTS
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.1 Unification-based grammar formalisms
</SectionTitle>
      <Paragraph position="0"> Unification-based grammar formalisms constitute a cornerstone of many of the most important approaches to natural-language understanding (Shieber, 1986), (Colban, 1988), (Fenstad et al., 1989). The basic idea is that the parser generates a number of partial representations of the total parse, which are subsequently checked for consistency and combined by a second process known as a unifier. A common form of representation for the partial representations is that of feature structures, which are record-like data structures which are allowed to grow in three distinct ways: by adding missing values, by adding attributes, and by coalescing existing attributes (forcing them to be the same). The last operation may lead to cyclic structures, which we do not exclude. If the feature structure S2 is an extension of S1 (i.e., S1 grows into S2 by application of some sequence of the above rules), we write S1 ⊑ S2 and say that S1 subsumes S2. Intuitively, if S1 ⊑ S2, S2 contains more information than does S1. It is easy to show that ⊑ is a partial order on the class of all feature structures.</Paragraph>
      <Paragraph position="1"> Each feature structure represents partial information generated during the parse. To obtain the total picture, these partial components must be combined †The research reported herein was performed while the author was visiting the COSMOS Computational Linguistics Group of the Mathematics Department at the University of Oslo. He wishes to thank Jens Erik Fenstad and the members of that group for providing a stimulating research environment. Particular thanks are due Tore Langholm for many invaluable discussions regarding the interplay of logic, feature structures, and unification.</Paragraph>
      <Paragraph position="2"> into one consistent piece of knowledge. The formal process of unification is precisely this operation of combination. The most general unifier (mgu) S1 ⊔ S2 of feature structures S1 and S2 is the least feature structure (under ⊑) which is larger than both S1 and S2.</Paragraph>
      <Paragraph position="3"> Such an mgu exists if and only if S1 and S2 are consistent; that is, if and only if they subsume a common feature structure.</Paragraph>
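For illustration, here is a minimal sketch (not from the paper) in which feature structures are nested Python dictionaries with atomic values at the leaves; it deliberately ignores coalescing, reentrancy, and cyclic structures, all of which the formalism above permits.

```python
# Minimal sketch: feature structures as nested dicts (atoms at the leaves).
# Reentrancy and cycles, which the paper allows, are deliberately ignored here.

def unify(s1, s2):
    """Return the most general unifier of s1 and s2, or None if inconsistent."""
    if isinstance(s1, dict) and isinstance(s2, dict):
        result = dict(s1)
        for attr, val in s2.items():
            if attr in result:
                sub = unify(result[attr], val)
                if sub is None:
                    return None          # clash on a shared attribute
                result[attr] = sub
            else:
                result[attr] = val       # add the missing attribute
        return result
    return s1 if s1 == s2 else None      # atomic values must agree

# Example: the two partial analyses are consistent, so an mgu exists.
a = {"subj": {"agr": {"num": "sg"}}}
b = {"subj": {"agr": {"pers": "3"}}, "tense": "pres"}
print(unify(a, b))   # {'subj': {'agr': {'num': 'sg', 'pers': '3'}}, 'tense': 'pres'}
```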
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.2 Unification algorithms and this paper
</SectionTitle>
      <Paragraph position="0"> While the idea of a most general unifier is a pleasing theoretical notion, its real utility rests with the fact that there are efficient algorithms for its computation.</Paragraph>
      <Paragraph position="1"> The fastest known algorithm, identified by Ait-Kaci (1984), runs in time which is, for all practical purposes, linear in the size of the input (i.e., the combined sizes of the structures to be unified). In proposing any extension to the basic framework, a primary consideration must be the complexity of the ensuing unification algorithm. The principal contribution of the research summarized here is to provide an extension of ordinary feature structures, admitting negation and limited disjunction, while at the same time continuing to admit a provably efficient unification algorithm.</Paragraph>
      <Paragraph position="2"> Due to space limitations, we must omit substantial background material from this paper. Specifically, we assume that the reader is familiar with the notation and definitions surrounding feature structures (Shieber, 1986; Fenstad et al., 1989), as well as the traditional unification algorithm (Colban, 1990). We also have been forced to omit much detail from the description and verification of our algorithm. A full report on this work will be available in the near future.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. UNIFICATION IN THE PRESENCE
OF CONSTRAINTS
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Constraints on feature structures
</SectionTitle>
      <Paragraph position="0"> Not every feature structure is a possibility as the ultimate output of the parsing mechanism. Typically, there are constraints which must be observed. One way of ensuring this sort of consistency is to build the checks right into the grammar, so that the feature structures generated are always legitimate substructures of the final output. The CLG formalism (Damas and Varile, 1989) is an example of such a philosophy. In many ways, this is an attractive option, because it provides a unified context for expressing all aspects of the grammar. However, this approach has the disadvantage that it limits the use of independent parsing subalgorithms whose results are subsequently unified, since the consistency checks must be performed before the feature structures are presented to the unifier. Therefore, to maintain such independence, it would be a distinct advantage if some of the constraint checking could be relegated to the unification process.</Paragraph>
      <Paragraph position="1"> To establish a formal framework in which this is possible, we must start by extending our notion of a feature structure. Following the ideas of Moshier and Rounds (1987) and Langholm (1989), we define an extended feature structure to be a pair (N, K) in which K is a set of feature structures and N is the least element of K under the ordering ⊑. (Thus, by definition, K has a least element, and K determines N.) Think of N as the "current" feature structure, and K as the set of all structures into which N is allowed to grow. We define (N1, K1) ⊑ (N2, K2) to mean precisely that K2 ⊆ K1. In other words, the set of all structures which N2 can grow into is a subset of those which N1 can grow into. (It follows necessarily that N1 ⊑ N2 in this case.) Note that if we identify the ordinary feature structure N with the pair (N, {M | N ⊑ M}), we precisely recapture ordinary subsumption. Finally, the notion of unification associated with ⊑ is given by</Paragraph>
      <Paragraph position="3"> (N1, K1) ⊔ (N2, K2) = (M, K1 ∩ K2) if K1 ∩ K2 has a least element M; undefined otherwise.</Paragraph>
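To make this definition concrete, here is a toy sketch in which the admissible set K is an explicit finite list of nested-dict feature structures; in the paper K is instead characterized implicitly by a set of logical constraints, as described in the next subsection.

```python
# Toy sketch: extended feature structures (N, K) with K given as an explicit
# finite list of nested-dict feature structures (in the paper, K is defined by a logic).

def subsumes(s1, s2):
    """True iff s1 ⊑ s2 (every attribute/value of s1 is present in s2)."""
    if isinstance(s1, dict):
        return isinstance(s2, dict) and all(
            k in s2 and subsumes(v, s2[k]) for k, v in s1.items())
    return s1 == s2

def extended_unify(k1, k2):
    """(N1,K1) ⊔ (N2,K2): intersect the admissible sets; return (M, K) if a least element M exists."""
    k = [s for s in k1 if s in k2]                 # K1 ∩ K2
    for cand in k:
        if all(subsumes(cand, other) for other in k):
            return cand, k                         # the new least element N and the new K
    return None                                    # undefined: no least element (or empty K)

K1 = [{"num": "sg"}, {"num": "sg", "pers": "3"}]
K2 = [{"num": "sg", "pers": "3"}, {"num": "pl"}]
print(extended_unify(K1, K2))   # ({'num': 'sg', 'pers': '3'}, [{'num': 'sg', 'pers': '3'}])
```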
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Logical feature structures with constraints
</SectionTitle>
      <Paragraph position="0"> To operate on pairs of the form (N, K) algorithmically, we must have in place an appropriate representation for the set K. There are many possible choices; ours is to let it be the set of all structures satisfying a set of sentences in a particular logic. The logic which we use is a simple modification of the language of Rounds and Kasper (1986) (see also (Kasper and Rounds, 1990)) admitting negation but only binary path equivalences. Specifically, an atomic feature term is one of the following.</Paragraph>
      <Paragraph position="2"> Syntax and semantics of the atomic feature terms:
⊤  -  the identically true term.
⊥  -  the identically false term.
(α : a)  -  the path (nesting of attributes) α exists and terminates with label a.
(α ≈ β)  -  the paths α and β have a common end point (coalesced end points).</Paragraph>
      <Paragraph position="4"> In (α : a), the label a may be ⊤, denoting a missing value. The notation (α ≈ β) is borrowed from (Langholm, 1989), and has the same semantics as the corresponding path-equivalence construct of (Rounds and Kasper, 1986). A (general) feature term is built up from atomic feature terms using the connectives ∧, ∨, and ¬, with the usual semantics. In particular, the negation we use is the classical notion; a structure satisfies (¬φ) if and only if it does not satisfy φ. For any set Φ of feature terms, Mod(Φ) denotes the set of all feature structures for which each φ ∈ Φ is true. For a formal definition of satisfaction, we refer the reader to the above-cited references. Intuitively, any set of terms which defines a consistent rooted, directed graph is satisfiable. However, let us specifically remark that only nodes with no outgoing edges may have labels other than ⊤, that labels other than ⊤ may occur at at most one end point, that no two outgoing edges from the same node may have the same label, and that any term of the form (α : ⊥) is equivalent to ⊥, and so inconsistent.</Paragraph>
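A minimal sketch of how these atomic terms might be represented, with paths as tuples of attribute names and a TOP sentinel for the label ⊤; the class names are illustrative, not taken from the paper.

```python
# Sketch of the atomic term language (names are illustrative).
# A path is a tuple of attribute names; TOP stands for the label ⊤ ("just exists").
from dataclasses import dataclass
from typing import Tuple, Union

TOP = "⊤"

@dataclass(frozen=True)
class PathLabel:            # (α : a) -- path α exists and ends with label a
    path: Tuple[str, ...]
    label: str              # may be TOP, denoting a missing value

@dataclass(frozen=True)
class PathEq:               # (α ≈ β) -- paths α and β share an end point
    left: Tuple[str, ...]
    right: Tuple[str, ...]

Atom = Union[PathLabel, PathEq]

# Examples: (subj agr : sg), (subj agr ≈ pred agr), (obj : ⊤)
t1 = PathLabel(("subj", "agr"), "sg")
t2 = PathEq(("subj", "agr"), ("pred", "agr"))
t3 = PathLabel(("obj",), TOP)
print(t1, t2, t3, sep="\n")
```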
      <Paragraph position="5"> Now we define a logical extended feature structure (LoXF) to be an extended feature structure (N, K) in which K = Mod(Φ) for some consistent finite set Φ of feature terms. In particular, Mod(Φ) must have a least model. We also denote this pair by F(Φ).</Paragraph>
      <Paragraph position="7"/>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 Remark on negation
</SectionTitle>
      <Paragraph position="0"> A full discussion of the nature of negation in LoXF's is complex, and will be the focus of a separate paper. However, because this topic has received a great deal of attention (Moshier and Rounds, 1987), (Langholm, 1989), (Dawar and Vijay-Shanker, 1990), we feel it essential to remark here that F(Φ) does not have the "classical" negation semantics which can be determined by looking solely at the least element. Indeed, the appropriate definition is that F(Φ) satisfies ¬φ precisely when no member of Mod(Φ) satisfies φ; in other words, the structure NΦ is not allowed to be extended to satisfy φ.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.4 Unification algorithms for logical extended feature structures
</SectionTitle>
      <Paragraph position="0"> In view of the definition immediately above, it is easy to see that any unification algorithm for LoXF's must solve the following two problems in the course of attempting to unify F(Φ1) and F(Φ2).</Paragraph>
      <Paragraph position="1"> (u1) It must decide whether or not Φ1 ∪ Φ2 is consistent; i.e., whether or not there is a feature structure satisfying all sentences of both Φ1 and Φ2.</Paragraph>
      <Paragraph position="2"> (u2) In case that Φ1 ∪ Φ2 is satisfiable, it must also determine if there is a least model, and if so, identify it.</Paragraph>
      <Paragraph position="3"> Now it is well known that (u1) is an NP-complete problem, even if we disallow negation and path equivalence (Rounds and Kasper, 1986, Thm. 4). Therefore, barring the eventuality that P = NP, we cannot expect to allow Φ1 and Φ2 to be arbitrary finite sets of feature terms and still have a tractable algorithm for unification. One solution, which has been taken by a number of authors, such as Kasper (1989) and Eisele and Dörre (1988), is to devise clever algorithms which apply to the general case and appear empirically to work well on "typical" inputs, but still are provably exponential in the worst case. While such work is undeniably of great value, we here propose a companion strategy; namely, we restrict attention to pairs (N, Φ) such that the very nature of Φ guarantees a tractable algorithm.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3. HORN FEATURE LOGIC
</SectionTitle>
    <Paragraph position="0"> In the field of mathematical logic in general, and in the computational logic relevant to computer science in particular, Horn clauses play a very special role (Makowsky, 1987). Indeed, they form the basis for the programming language Prolog (Sterling and Shapiro, 1986) and the database language Datalog (Ceri et al., 1989). This is due to the fact that while they possess substantial representational power, tractable inference algorithms are well known. It is perhaps the main thesis of this work that the utility of Horn clauses carries over to computational linguistics as well.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Horn feature clauses
</SectionTitle>
      <Paragraph position="0"> A feature literal is either an atomic feature term (e.g., (α : a), (α ≈ β), or ⊥) or its negation. A feature clause is a finite disjunction ℓ1 ∨ ℓ2 ∨ ... ∨ ℓn of feature literals. A feature clause is Horn if at most one of the ℓi's is not negated. A Horn extended feature structure (HoXF) is a LoXF F(Φ) such that Φ is a finite set of Horn feature clauses.</Paragraph>
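As a minimal sketch (ours), a clause can be stored as a list of signed literals, and the Horn condition is then a one-line check.

```python
# Sketch: a feature clause as a list of literals (sign, atom); sign=False means negated.
# The Horn condition: at most one literal in the clause is not negated.
def is_horn(clause):
    return sum(1 for positive, _ in clause if positive) <= 1

# ¬(AA : a) ∨ ¬(B : a) ∨ (CCDDG : t)  --  a Horn clause (one positive literal)
c1 = [(False, ("AA", "a")), (False, ("B", "a")), (True, ("CCDDG", "t"))]
# (A : ⊤) ∨ (B : ⊤)  --  a positive disjunction, hence not Horn
c2 = [(True, ("A", "⊤")), (True, ("B", "⊤"))]
print(is_horn(c1), is_horn(c2))   # True False
```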
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 A taxonomy of Horn feature clauses
</SectionTitle>
      <Paragraph position="0"> Before moving on to a presentation of algorithms on HoXF's, it is appropriate to provide a brief sketch of the utility and limits of restricting our attention to collections of Horn clauses. Implication here is classical; in the case of ordinary propositional logic, we use the notation σ1 ∧ σ2 ∧ ... ∧ σm ⇒ ρ to denote the clause ¬σ1 ∨ ¬σ2 ∨ ... ∨ ¬σm ∨ ρ. Horn feature clauses may then be thought of as falling into one of the following four categories.</Paragraph>
      <Paragraph position="1"> (H1) A clause of the form σ, consisting of a single positive literal, is just a fact.</Paragraph>
      <Paragraph position="2"> (H2) A clause of the form ¬σ, consisting of a single negative literal, is a negated fact. In terms of HoXF's, if ¬σ ∈ Φ, this means that within F(Φ), no extension of NΦ in which σ is true is permitted. As a concrete example, a constraint stating that a subject may not have an attribute named "tense" would be of this form.</Paragraph>
      <Paragraph position="3"> (H3) A clause of the form σ1 ∧ σ2 ∧ ... ∧ σm ⇒ ρ is called a rule or an implication. Numerous examples of the utility of implication in linguistics are identified in (Wedekind, 1990, Sec. 1.3). Kasper's conditional descriptions (Kasper, 1988) are also a form of implication. More concretely, the requirement that a transitive verb requires a direct object is easily expressed in this form.</Paragraph>
      <Paragraph position="4"> (H4) A clause of the form σ1 ∧ σ2 ∧ ... ∧ σm ⇒ ⊥ is called a compound negation. The formalization of the constraint that a verb cannot be both intransitive and take a direct object is an example of the use of such a clause. The type of knowledge which is not recapturable using Horn feature logic is positive disjunction; i.e., formulas of the form σ1 ∨ σ2, with both σ1 and σ2 feature terms. Of course, this has nothing in particular to do with feature-term logic, but is a well-known limitation of Horn clauses in general. However, in accepting this limitation, we also obtain many key properties, including tractable inference and the following important property of genericity.</Paragraph>
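For concreteness, a small sketch (ours) of how the linguistic constraints mentioned in (H2)-(H4) might be written as Horn feature clauses, with atoms given as (path, label) pairs and ⊤ standing for "the path merely exists"; the attribute names are illustrative, not taken from the paper.

```python
# Illustrative encoding of the constraints above as Horn feature clauses.
# A clause is (antecedent_atoms, consequent); consequent None plays the role of ⊥.
TOP = "⊤"

# (H2) negated fact: a subject may not have an attribute named "tense".
no_tense_on_subject = ((("subj tense", TOP),), None)

# (H3) rule: a transitive verb requires a direct object.
transitive_needs_object = (
    (("cat", "verb"), ("trans", "+")),
    ("obj", TOP),
)

# (H4) compound negation: a verb cannot be both intransitive and take a direct object.
not_intransitive_with_object = (
    (("cat", "verb"), ("trans", "-"), ("obj", TOP)),
    None,
)

print(no_tense_on_subject, transitive_needs_object, not_intransitive_with_object, sep="\n")
```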
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Totally generic LoXF's
</SectionTitle>
      <Paragraph position="0"> Let now Φ be any finite set of feature terms. We say that Φ is totally generic if, for any set Ψ of facts (see (H1) above), if Mod(Φ ∪ Ψ) is nonempty then it contains a least element under ⊑. Intuitively, if we use Φ to define the LoXF F(Φ), total genericity says that however we extend the base feature structure NΦ (consistently with Φ), we will continue to have a LoXF. Remarkably, we have the following.</Paragraph>
      <Paragraph position="1"> 3.4 Theorem A set of feature terms Φ is totally generic if and only if it is equivalent to a set of Horn feature clauses.</Paragraph>
      <Paragraph position="2"> Proof outline: This result is essentially a translation of (Makowsky, 1987, Thm. 1.9) to the logic of feature structures. In words, it says that if (and only if) we work with HoXF's, condition (u2) above becomes superfluous (except for explicitly identifying the least model). □</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4. THE EXTENDED UNIFICATION
ALGORITHM
</SectionTitle>
    <Paragraph position="0"> It has been shown by Dowling and Gallier (1984) that satisfiability for finite sets of propositional Horn formulas can be tested in time linear in the length of the formulas. Their algorithms can easily be modified to deliver the least model as well. Since unification of HoXF's is essentially testing for satisfiability plus identifying the least model (see (u1)-(u2) above), a natural approach would be to adapt one of their algorithms. Essentially, this is what we do.</Paragraph>
    <Paragraph position="1"> Like theirs, our algorithm is forward chaining: we start with the facts and "fire" rules until no more can be fired, or until a contradiction appears. However, the adaptation is not trivial, because feature-term logic is more expressive than propositional logic. In particular, feature-term logic contains countably many tautologies which have no correlates in ordinary propositional logic. The main contribution of our algorithm is to implicitly recapture the full semantics of these tautologies while keeping the time complexity within reasonable bounds. Due to space limitations, we cannot present the full formality of the rather complex data structures. Rather, to highlight the key features, we step through an annotated example. We focus only upon the special problems inherent in the extension to feature-term logic, and assume familiarity with the forward-chaining algorithm in (Dowling and Gallier, 1984) and the graph unification algorithm in (Colban, 1990).</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 An example theory and extended feature graphs
</SectionTitle>
      <Paragraph position="0"> The set Ξ contains the following eight Horn feature clauses.</Paragraph>
      <Paragraph position="1"> (ξ1) (AA : a).</Paragraph>
      <Paragraph position="2"> (ξ2) (B : a).</Paragraph>
      <Paragraph position="3"> (ξ3) (AA : a) ∧ (B : a) ⇒ (CCDDG : t).
(ξ4) (A : ⊤) ∧ (C : ⊤) ⇒ (ABDDG : ⊤).
(ξ5) (AA ≈ B) ∧ (ABDDG : ⊤) ⇒ (ABDDEF : ⊤).
(ξ6) (ABDD : ⊤) ∧ (B : ⊤) ⇒ (CCD ≈ ABD).
(ξ7) (CCDD ≈ ABDD) ⇒ (AC : ⊤).
(ξ8) (ACD : ⊤) ⇒ (ACC : t).
Just as we may represent a set of atomic feature terms with a feature graph, so too may we represent, in part, a set of Horn feature clauses with an extended feature graph. Shown in Figure 1 below is the initial extended feature graph for the set Ξ, representing the state of inference before any deductions are made.</Paragraph>
      <Paragraph position="5"> Every path and every node label which occurs in some literal of Ξ is represented. The labels of all edges, as well as all non-⊤ node labels, are underscored, denoting that they are virtual, which means that they are only possibilities for the minimal model, and not yet actually part of it. The root node is denoted by (r), and nodes with value ⊤ are denoted with a •. Note that paths with common virtual end labels (e.g., AA and B) are not coalesced; virtual nodes and edges are never unified. As a result, the predecessors (along any directed path) of any actual node or edge are themselves actual. As inferences are made, edges and nodes become actual (depicted by deleting underscores), and actual nodes with common labels are ultimately coalesced.</Paragraph>
      <Paragraph position="6"> The final extended feature graph is shown in Figure 2 below. For easier visibility, actual edges are also highlighted with heavier lines.</Paragraph>
      <Paragraph position="7"> If we delete the remaining virtual nodes and edges, we obtain the graphical representation of the least model of Ξ.</Paragraph>
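Read off the listing at the start of this subsection, the example theory Ξ might be transcribed as data in the following way (a sketch in our own notation, not the paper's; paths are written as strings of attribute names).

```python
# The example theory Ξ, transcribed as (antecedent atoms, consequent atom) pairs.
# Atoms: ("label", path, label) or ("eq", path, path); "⊤" is the top label.
XI = [
    ([],                                             ("label", "AA", "a")),       # ξ1
    ([],                                             ("label", "B",  "a")),       # ξ2
    ([("label", "AA", "a"), ("label", "B", "a")],    ("label", "CCDDG", "t")),     # ξ3
    ([("label", "A", "⊤"), ("label", "C", "⊤")],     ("label", "ABDDG", "⊤")),     # ξ4
    ([("eq", "AA", "B"), ("label", "ABDDG", "⊤")],   ("label", "ABDDEF", "⊤")),    # ξ5
    ([("label", "ABDD", "⊤"), ("label", "B", "⊤")],  ("eq", "CCD", "ABD")),        # ξ6
    ([("eq", "CCDD", "ABDD")],                       ("label", "AC", "⊤")),        # ξ7
    ([("label", "ACD", "⊤")],                        ("label", "ACC", "t")),       # ξ8
]
print(len(XI), "clauses,", sum(1 for ante, _ in XI if not ante), "of them facts")
```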
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 Computing the minimal model of the example
</SectionTitle>
      <Paragraph position="0"> Now let us consider the process of actually obtaining the structure of Figure 2 from Ξ. In the propositional forward chaining approach, we start by pooling the facts that we know -- in this case {ξ1, ξ2}.</Paragraph>
      <Paragraph position="1"> We then look for rules whose left-hand sides have been satisfied. In the example, the left-hand side of ξ3 is satisfied, so we may fire that rule and add (CCDDG : t) to our list of known facts, exactly as in the propositional case. We may also conclude that (AA ≈ B), because both are actual paths which terminate with the same label a, and non-⊤ labels are unique. The representative extended feature graph at this point is shown in Figure 3 below.</Paragraph>
      <Paragraph position="2"> [Figure 3: the extended feature graph at this point.]</Paragraph>
      <Paragraph position="4"> There are other things which we may implicitly conclude, and which we must conclude to fire the other rules. For example, we may fire rule ξ4 at this point, because (AA : a) ⇒ (A : ⊤) and (B : a) ⇒ (B : ⊤) are both tautologies in the logic of feature terms, and so its left-hand side is satisfied. Thus, we may add (ABDDG : ⊤) to our list of known facts. Similarly, since, as noted above, (AA ≈ B) holds, we may fire rule ξ5 to conclude (ABDDEF : ⊤). Likewise, we may now fire rule ξ6 and conclude (CCD ≈ ABD).</Paragraph>
      <Paragraph position="5"> The representative extended graph structure at this point is shown in Figure 4 below.</Paragraph>
      <Paragraph position="6"> [Figure 4: the extended feature graph at this point.]</Paragraph>
      <Paragraph position="8"> We must eventually invoke a unification at the common end point of CCD and ABD. Such unification implicitly entails the tautology (CCD ≈ ABD) ⇒ (CCDD ≈ ABDD) and permits us to conclude that rule ξ7 should fire and add (AC : ⊤) to the set of facts of the least model. The result is represented by the final extended feature graph of Figure 2. Note that rule ξ8 never fires, and that there are virtual edges and nodes left at the conclusion of the process.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 A taxonomy of implicit rules for sets of Horn feature clauses
</SectionTitle>
      <Paragraph position="0"> As we remarked in the introduction to this section, to correctly adapt forward chaining to the context of HoXF's, we must implicitly include the semantics of countably many tautologies.</Paragraph>
      <Paragraph position="1"> These fall into three classes.</Paragraph>
      <Paragraph position="2"> (i1) Whenever an atomic term of the form (αβ : a) is determined to be true (αβ denotes the concatenation of α and β), and another term of the form (α : ⊤) occurs as an antecedent of a Horn feature clause (with either β not the empty string or else a ≠ ⊤), we must be able to automatically make the deduction of the tautology (αβ : a) ⇒ (α : ⊤) to conclude that (α : ⊤) is now true. We call this node and path subsumption. In computing the least model of Ξ, the deductions (AA : a) ⇒ (A : ⊤) and (B : a) ⇒ (B : ⊤) are examples of such rules.</Paragraph>
      <Paragraph position="3"> (i2) Whenever we deduce two terms of the form (α : a) and (β : a) to be true, with a ≠ ⊤, we must implicitly realize the semantics of the rule (α : a) ∧ (β : a) ⇒ (α ≈ β), due to the constraint that non-⊤ labels are unique. We call this label matching.</Paragraph>
      <Paragraph position="4"> In computing the least model of Ξ, the deduction (AA : a) ∧ (B : a) ⇒ (AA ≈ B) is a specific example.</Paragraph>
      <Paragraph position="5"> (i3) Whenever we coalesce two paths, we must perform local unification on the subgraph rooted at the point of coalescence. More precisely, if we coalesce the paths α and β, and the atom (αγ : a) is true, we must deduce that both (αγ ≈ βγ) and (βγ : a) are true; i.e., we must implicitly realize the compound rule (α ≈ β) ∧ (αγ : a) ⇒ (αγ ≈ βγ) ∧ (βγ : a). This is just a logical representation of local unification. In computing the least model of Ξ, a specific example is the deduction (CCD ≈ ABD) ∧ (CCDDG : t) ⇒ (CCDDG ≈ ABDDG) ∧ (ABDDG : t).</Paragraph>
      <Paragraph position="7"/>
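To give a flavour of (i1) and (i2), here is a small sketch (ours, not the paper's code) that generates the prefix deductions of node and path subsumption and the coalescings forced by label matching; paths are tuples of attribute names.

```python
# Sketch of the implicit deductions (i1) and (i2); paths are tuples of attributes.
TOP = "⊤"

def prefix_closure(path, label):
    """(i1) node/path subsumption: from (αβ : a) deduce (α : ⊤) for every proper prefix α."""
    derived = [(path[:i], TOP) for i in range(1, len(path))]
    return derived + [(path, label)]

def label_matches(facts):
    """(i2) label matching: paths carrying the same non-⊤ label must be coalesced."""
    eqs = []
    items = list(facts)
    for i, (p1, l1) in enumerate(items):
        for p2, l2 in items[i + 1:]:
            if l1 == l2 != TOP and p1 != p2:
                eqs.append((p1, p2))     # deduce (p1 ≈ p2)
    return eqs

facts = prefix_closure(("A", "A"), "a") + prefix_closure(("B",), "a")
print(label_matches(facts))   # [(('A', 'A'), ('B',))]  --  i.e., (AA ≈ B)
```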
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.4 Data structures
</SectionTitle>
      <Paragraph position="0"> To support these inferences, several specific data structures are maintained. They are sketched below.</Paragraph>
      <Paragraph position="1"> (d1) There is the list of clauses. Each clause has a counter associated with it, indicating the number of literals which remain to be fired before its left-hand side is satisfied. When this count drops to zero, the clause fires and its consequent becomes true.</Paragraph>
      <Paragraph position="2"> (d2) There is a list of atoms which occur in the antecedents of clauses. With each literal is associated a set of pointers, one to each clause of which it is an antecedent literal. When an atom becomes true, the appropriate clauses are notified, so they may decrement their counters.</Paragraph>
      <Paragraph position="3"> (d3) The working extended feature structure, as illustrated in Figures 1-4, is maintained throughout. (d4) For each node in the working extended feature structure, a list of atoms is maintained. If the node label is a, then each such atom in the list is of the form (α : a), with α a path from the root node to the node under consideration. When that node becomes actual, that atom is notified that it is now satisfied. (d5) For each non-⊤ node label a which occurs in some atom, a list of all virtual nodes with that label is maintained. When one such node becomes actual, the others are checked to see if an inference of the form (i2) should be made.</Paragraph>
      <Paragraph position="4"> (d6) For each atom of the form (α ≈ β) occurring as an antecedent in some clause, the nodes at the ends of these paths in the working extended feature structure are endowed with a common tag. Whenever nodes are coalesced, a check for such common tags is made, so the appropriate atom may be notified that it is now true.</Paragraph>
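The counter-and-pointer bookkeeping of (d1)-(d2) is essentially the propositional forward-chaining scheme of Dowling and Gallier. A minimal, purely propositional sketch of that core loop (ours; atoms are opaque keys, and the feature-specific machinery of (d3)-(d6), i.e., virtual nodes, label matching, coalescing, and local unification, is omitted) follows.

```python
from collections import defaultdict, deque

def forward_chain(clauses):
    """Counter-based Horn forward chaining (Dowling-Gallier style).
    Each clause is (antecedent_atoms, consequent_atom_or_None);
    consequent None plays the role of ⊥ (a compound negation).
    Returns (consistent, set_of_true_atoms)."""
    count = [len(ante) for ante, _ in clauses]          # (d1) unfired antecedents per clause
    watchers = defaultdict(list)                        # (d2) atom -> clauses it occurs in
    for idx, (ante, _) in enumerate(clauses):
        for atom in ante:
            watchers[atom].append(idx)
    true_atoms = set()
    queue = deque(cons for ante, cons in clauses if not ante)   # the facts
    while queue:
        atom = queue.popleft()
        if atom is None:
            return False, true_atoms                    # ⊥ derived: inconsistent
        if atom in true_atoms:
            continue
        true_atoms.add(atom)
        for idx in watchers[atom]:
            count[idx] -= 1
            if count[idx] == 0:
                queue.append(clauses[idx][1])           # fire the clause
    return True, true_atoms

clauses = [((), "p"), ((), "q"), (("p", "q"), "r"), (("r", "s"), None)]
print(forward_chain(clauses))   # (True, {'p', 'q', 'r'})
```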
    </Section>
    <Section position="5" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.5 Independent processes and unification
</SectionTitle>
      <Paragraph position="0"> The algorithm also maintains a ready queue of available processes. These processes are of three types.</Paragraph>
      <Paragraph position="1"> A process of the form Actual(α : a), when executed, makes the identified path and label actual in the extended feature graph. A process of the form Coalesce(n1, n2) coalesces the end points of the two nodes n1 and n2 in the extended feature graph. A process of the form Unify(n) performs a local unification at the subgraph rooted at node n, using an algorithm such as identified in (Colban, 1990). All processes in the ready queue commute; they may be executed in any order.</Paragraph>
      <Paragraph position="2"> To unify two distinct sets of terms (perhaps generated by independent parts of a parser), we join their two extended feature graphs at the root, merge the corresponding data structures, and add the command Unify(root) to the merged process queue. In other words, we perform a unification to match common information, and then continue with the inference process.</Paragraph>
    </Section>
    <Section position="6" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.6 The complexity of the unification algorithm
</SectionTitle>
      <Paragraph position="0"> Define the length of a literal to be the number of attribute name and attribute value occurrences in it. Thus, for example, length((AB ≈ CD)) = 4 and length((ABCD : a)) = 5. For a set Φ of Horn feature clauses, we further define the following quantities.</Paragraph>
      <Paragraph position="1"> L = The length of Φ; i.e., the sum of the lengths of all literals occurring in Φ.</Paragraph>
      <Paragraph position="2"> P = The number of distinct terms of the form (α ≈ β) which occur as the right-hand side of a rule in Φ.</Paragraph>
      <Paragraph position="3"> (Facts are not considered to be rules here.) m = The number of distinct attributes in the input. (If we collect all of the literals occurring in the clauses of Φ and discard any negation to yield a large pool of facts, then m is the number of edges in the graph representing the associated feature structure. If Φ is a set of positive literals to begin with, and hence represents an ordinary feature structure, then m represents the size of this feature structure.) We then have the following theorem.</Paragraph>
    </Section>
    <Section position="7" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.7 Theorem
</SectionTitle>
      <Paragraph position="0"> The worst-case time complexity of our HoXF unification algorithm is O(L + (P + 1) · m · ω(m)), where ω(m) is an inverse Ackermann function (which grows more slowly than any primitive recursive function; for all practical purposes ω(m) ≤ 5). □ This may be compared to the worst-case complexity of the usual algorithm for unifying ordinary feature structures, which is O(m · ω(m)). The increase in complexity over this simpler case is due to two factors. (c1) We must read the entire input; since literals may be repeated, it is possible that L &gt; m; hence the L term.</Paragraph>
      <Paragraph position="1"> (c2) Each time that we deduce that two nodes must be coalesced, we must perform a unification. This can occur at most P times - the number of times that a rule can assert a distinct coalescing of nodes.
4.8 Further remarks on the algorithm Note in particular that there are no restrictions on where path equivalences (e.g., (α ≈ β)) may occur in Horn feature clauses. In particular, unlike (Kasper, 1988), we do allow negated path equivalences. However, if we disallow path equivalences as consequents of rules, then the complexity of our algorithm becomes essentially that of the traditional unification algorithm (see (c2) above). It is primarily deducing path equivalences on the fly which results in the additional computational burden.</Paragraph>
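The ω(m) factor in both bounds is the usual price of maintaining node-coalescing classes with a union-find (disjoint-set) structure under union by rank and path compression. A minimal sketch of that ingredient (ours, not the paper's data structure) follows.

```python
# Minimal union-find with path compression and union by rank -- the source of the
# ω(m) (inverse Ackermann) factor in the complexity bounds above.
class UnionFind:
    def __init__(self):
        self.parent = {}
        self.rank = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        self.rank.setdefault(x, 0)
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])   # path compression
        return self.parent[x]

    def union(self, x, y):        # coalesce the classes containing x and y
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx      # attach the shallower tree under the deeper one
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1

uf = UnionFind()
uf.union("end(CCD)", "end(ABD)")        # the coalescing asserted by rule ξ6
print(uf.find("end(CCD)") == uf.find("end(ABD)"))   # True
```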
    </Section>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5. CONCLUSIONS, FURTHER DIRECTIONS, AND PROJECT STATUS
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 Conclusions and further directions
</SectionTitle>
      <Paragraph position="0"> We have identified HoXF's as an attractive compromise between ordinary feature structures (in which there is no way to express constraints on growth) and full logical feature theories (for which the unification problem is NP-complete). We view HoXF's not as the "best" approach, but rather as a tool to be used to build better overall unification-based grammar formalisms.</Paragraph>
      <Paragraph position="1"> The obvious next step is to develop an integrated framework in which HoXF's are employed to handle negation and the disjunction arising from implication, while other techniques handle more general disjunction and term subsumption (Smolka, 1988). Such an optimized approach could lead to much faster overall handling of negation and disjunction, but further work is clearly needed to bear this out.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 Status of the project
</SectionTitle>
      <Paragraph position="0"> While the algorithm has been spelled out in considerable detail, we have just begun to build an actual implementation of the HoXF unifier in the programming language Scheme.</Paragraph>
      <Paragraph position="1"> We expect to complete the implementation by the summer of 1991.</Paragraph>
    </Section>
  </Section>
</Paper>