<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-2076">
  <Title>DISJUNCTIVE FEATURE STRUCTURES AS HYPERGRAPHS</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. BASIC FRAMEWORK
2.1 Disjunctive feature structures as hypergraphs
</SectionTitle>
    <Paragraph position="0"> (Disjunctive) feature structures will be defined as directed acyclic hypergraphs. In a hypergraph (see Berge, 1970), arcs (hyperarcs) connect sets of nodes instead of pairs of nodes, as in usual graphs. We will consider hyperarcs as directed from their first node to all other nodes. More precisely, each hyperarc will be an ordered pair consisting of an input node $n_{i_0}$ and a (nonempty) set of output nodes $n_{i_1}, \ldots, n_{i_k}$. We will say that $(n_{i_0}, \{n_{i_1}, \ldots, n_{i_k}\})$ is a k-arc from $n_{i_0}$ to $n_{i_1}, \ldots, n_{i_k}$, that $n_{i_0}$ is an immediate predecessor of $n_{i_1}, \ldots, n_{i_k}$, and that $n_{i_1}, \ldots, n_{i_k}$ are immediate successors of $n_{i_0}$.</Paragraph>
    <Paragraph position="1"> A path¹ in a hypergraph is a sequence of nodes $n_{i_1}, \ldots, n_{i_p}$ such that for $j = 1, \ldots, p-1$, $n_{i_j}$ is an immediate predecessor of $n_{i_{j+1}}$. If there exists a path from a node $n_i$ to a node $n_j$, we will write $n_i \Rightarrow n_j$. A hypergraph is acyclic if there is no node $n_i$ such that $n_i \Rightarrow n_i$. A hypergraph has a root $n_0$ if for each node $n_i \neq n_0$, $n_0 \Rightarrow n_i$. The leaves of a hypergraph are those nodes with no successor. A path terminating with a leaf is a maximal path. Nodes with more than one immediate predecessor are called merging nodes.</Paragraph>
    <Paragraph position="2"> Definition 2.1 Let L be a set of labels and A be a set of atomic values. A (disjunctive) feature structure on (L, A) is a quadruple $F = (D, n_0, \lambda, \sigma)$, respecting the consistency conditions 2.1 below, where D is a finite directed acyclic hypergraph with a root $n_0$, $\lambda$ is a partial function from the 1-arcs of D into L, and $\sigma$ is a partial function from the leaves of D into A.</Paragraph>
    <Paragraph position="3"> Feature structures which have isomorphic hypergraphs, whose corresponding leaves have the same value, and whose corresponding feature-arcs have the same labels, are isomorphic. We will consider such feature structures to be equal up to isomorphism.</Paragraph>
    <Paragraph position="4"> Definition 2.2 Labeled 1-arcs are called feature-arcs.</Paragraph>
    <Paragraph position="5"> Non-labeled hyperarcs are called OR-arcs.</Paragraph>
    <Paragraph position="6"> Note that OR-arcs are usually k-arcs with k &gt; 1, but (non-labeled) 1-arcs can be OR-arcs as a special case. We will use a graphic representation for disjunctive feature structures in which OR-arcs are represented as k lines connected together² (see Fig. 2).</Paragraph>
    <Paragraph position="7"> Definition 2.3 The extended label of a given path is the concatenation of all labels along that path. We will use the notation $l_1 : l_2 : \ldots : l_n$ to represent extended labels. A maximal extended label from a node is an extended label for a maximal path from that node.</Paragraph>
    <Paragraph position="8"> ¹ We use this term in the sense usual in graph theory. It should not be confused with the term path used in many feature structure studies, which is a string of labels, and for which we will introduce the term extended label later in the paper. ² In some work involving AND/OR graphs, this convention is used for AND-arcs. This should not create further confusion.</Paragraph>
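    <Paragraph>As an illustration of extended labels, the following Python sketch enumerates the maximal extended labels from a node. It assumes a hypothetical encoding in which each hyperarc is a triple (input node, output nodes, label), with label None standing for a non-labeled arc, so that such arcs contribute nothing to the concatenation. The sample data and function names are illustrative assumptions.

```python
# Hypothetical encoding: (input_node, output_nodes, label);
# label None marks a non-labeled hyperarc (an OR-arc).
ARCS = [
    ("n0", ("n1",), "agr"),
    ("n1", ("n2", "n3"), None),
    ("n2", ("n4",), "num"),
    ("n3", ("n5",), "per"),
    ("n0", ("n6",), "cat"),
]

def maximal_extended_labels(arcs, node):
    """Extended labels (as tuples of labels) of all maximal paths from node."""
    outgoing = [arc for arc in arcs if arc[0] == node]
    if not outgoing:
        return {()}  # a leaf: only the empty path, with the empty label
    result = set()
    for _, outputs, label in outgoing:
        prefix = () if label is None else (label,)
        for successor in outputs:
            for rest in maximal_extended_labels(arcs, successor):
                result.add(prefix + rest)
    return result

def show(extended_label):
    """Render an extended label in the l1:l2:...:ln notation."""
    return ":".join(extended_label)
```

For the sample data, the maximal extended labels from n0 are agr:num, agr:per and cat. The recursion terminates because the hypergraph is assumed to be acyclic.</Paragraph>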
    <Paragraph position="10"> Feature structures must verify the following consistency conditions: (C1) No output node of an OR-arc is a leaf; (C2) Output nodes of OR-arcs are not merging nodes; (C3) All feature-arcs from the same node have different labels; (C4) No maximal extended label from a given node is a prefix of a non-maximal extended label obtained by following a different hyperarc from the same node. C1 and C2 constrain OR-arcs to represent only disjunctions. C3 and C4 are extensions of the determinism that is usually imposed on dags (no outgoing arcs with the same label from any given node). Definition 2.4 A dag feature structure is a feature structure with no OR-arc.</Paragraph>
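    <Paragraph>Conditions C1 to C3 are local and can be checked directly on the arcs. The Python sketch below does so under a hypothetical encoding of hyperarcs as triples (input node, output nodes, label), with label None for OR-arcs; condition C4, which requires comparing extended labels, is deliberately left out. The sample data GOOD and BAD and the function names are assumptions made for illustration.

```python
# Hypothetical encoding: (input_node, output_nodes, label); None marks an OR-arc.
GOOD = [
    ("n0", ("n1",), "agr"),
    ("n1", ("n2", "n3"), None),
    ("n2", ("n4",), "num"),
    ("n3", ("n5",), "per"),
    ("n0", ("n6",), "cat"),
]
BAD = [
    ("n0", ("n1", "n2"), None),  # n2 is a leaf (C1); n1 is merging (C2)
    ("n0", ("n3",), "g"),
    ("n3", ("n1",), "h"),
    ("n1", ("n4",), "f"),
    ("n1", ("n5",), "f"),        # duplicate label from n1 (C3)
]

def check_c1_c3(arcs):
    """Return the set of violated conditions among C1, C2 and C3."""
    inputs = {source for source, _, _ in arcs}
    preds = {}
    for source, outputs, _ in arcs:
        for node in outputs:
            preds.setdefault(node, set()).add(source)
    or_outputs = {n for _, outs, label in arcs if label is None for n in outs}
    violated = set()
    if any(n not in inputs for n in or_outputs):    # an OR output is a leaf
        violated.add("C1")
    if any(len(preds[n]) > 1 for n in or_outputs):  # an OR output is merging
        violated.add("C2")
    seen = set()
    for source, _, label in arcs:
        if label is not None:
            if (source, label) in seen:             # repeated label at a node
                violated.add("C3")
            seen.add((source, label))
    return violated

def is_dag_feature_structure(arcs):
    """Definition 2.4: no OR-arc at all."""
    return all(label is not None for _, _, label in arcs)
```

Here check_c1_c3(GOOD) returns the empty set, whereas BAD violates all three conditions.</Paragraph>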
    <Paragraph position="11"> Definition 2.5 A projection of a feature structure x is a hypergraph obtained by removing all but one output node of all OR-arcs of x.</Paragraph>
    <Paragraph position="12"> Therefore, a projection has only 1-arcs.</Paragraph>
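    <Paragraph>Projections can be enumerated by choosing one output node for every OR-arc. The Python sketch below assumes the same hypothetical triple encoding (input node, output nodes, label), with label None for OR-arcs. It additionally assumes that the parts of the hypergraph no longer reachable from the root are discarded along with the removed output nodes, which is one reading of Definition 2.5 rather than something the text states explicitly.

```python
from itertools import product

# Hypothetical encoding: (input_node, output_nodes, label); None marks an OR-arc.
ARCS = [
    ("n0", ("n1",), "agr"),
    ("n1", ("n2", "n3"), None),
    ("n2", ("n4",), "num"),
    ("n3", ("n5",), "per"),
]

def projections(arcs, root):
    """List the distinct projections, each given as a list of 1-arcs."""
    or_positions = [i for i, arc in enumerate(arcs) if arc[2] is None]
    result = []
    for choice in product(*(arcs[i][1] for i in or_positions)):
        candidate = list(arcs)
        for i, kept in zip(or_positions, choice):
            candidate[i] = (arcs[i][0], (kept,), None)  # keep one output node
        # Assumption: drop whatever is no longer reachable from the root.
        reached, frontier = {root}, [root]
        while frontier:
            node = frontier.pop()
            for source, outputs, _ in candidate:
                if source == node:
                    for out in outputs:
                        if out not in reached:
                            reached.add(out)
                            frontier.append(out)
        projection = [arc for arc in candidate if arc[0] in reached]
        if projection not in result:  # nested disjunctions may repeat
            result.append(projection)
    return result
```

The sample structure has a single 2-arc and therefore two projections, one through n2 and one through n3; in both, every remaining arc has exactly one output node.</Paragraph>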
    <Paragraph position="13"> Definition 2.6 A dag feature structure y is a dag-projection of a feature structure x if there exists some projection y' of x and a function h mapping nodes of y' into nodes of y such that: (1) the root of y' is mapped to the root of y; (2) if $(n_{i_0}, \{n_{i_1}\})$ is a feature-arc of y', then $(h(n_{i_0}), \{h(n_{i_1})\})$ is a feature-arc of y with the same label; (3) if $(n_{i_0}, \{n_{i_1}\})$ is a 1-OR-arc of y', then $h(n_{i_0}) = h(n_{i_1})$; (4) the value associated with a node $n_i$ in y' is the same as the value associated with $h(n_i)$ in y, or both have no value; (5) each feature-arc in y is the image of at least one feature-arc in y'.</Paragraph>
    <Paragraph position="14"> In other terms, a dag-projection is obtained from a projection by merging the input and output nodes of each 1-OR-arc, and merging paths with common prefixes to ensure determinism.</Paragraph>
    <Paragraph position="15"> Definition 2.7 A sub-feature structure rooted at a node $n_i$ is a quadruple composed of a sub-hypergraph rooted at that node, the root $n_i$, together with the restrictions of the label and value functions to this sub-hypergraph. The AND-part of a node is the sub-feature structure rooted at that node, starting with only the feature-arcs from that node. The OR-parts of a node are the different sub-feature structures rooted at that node, starting with each of the OR-arcs. The disjuncts of an OR-arc are the sub-feature structures rooted at each of the output nodes of that OR-arc. If a node has only one OR-arc, we will call its disjuncts the disjuncts of the node. [Actes de COLING-92, Nantes, 23-28 août 1992 / Proc. of COLING-92, Nantes, Aug. 23-28, 1992, p. 499]</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Representation language
</SectionTitle>
      <Paragraph position="0"> Definition 2.8 The representation language for (disjunctive) feature structures described above is defined by the following grammar:</Paragraph>
      <Paragraph position="2"> where F is the axiom, ε is the empty string, l belongs to the set of labels L, a belongs to the set of atomic values A, and i belongs to a set I of identifiers (we use the symbols [1], [2], etc.), disjoint from L. A formula of that language is called a (disjunctive) feature description.</Paragraph>
      <Paragraph position="3"> The mapping between feature structures and feature descriptions is straightforward (Fig. 3). Translating between feature descriptions and feature structures and checking that a description is valid (that is, corresponds to a valid feature structure) is computationally trivial, and does not rely on the (potentially expensive) application of equivalence rules as in Kasper and Rounds (1986).</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3. A TYPOLOGY OF NORMAL FORMS
</SectionTitle>
    <Paragraph position="0"> In this section, we will first define the disjunctive normal form (DNF) in terms of hypergraphs. We will then define a family of increasingly restricted normal forms, the most restricted of which is the DNF. One of them, the factored normal form (FNF), enables a clear definition of the &quot;format&quot; of a feature structure. It also imposes a strict hierarchical view of the data, and is exactly the class of feature structures that are reachable from the DNF through sequences of factoring operations. We believe that the FNF class is of great linguistic interest, since it is clear that disjunction is often used to reflect hierarchical organization, factoring, etc., and thus is more than just a space-saving device. In the sections that follow, factoring operations in the FNF class will be defined formally, along with appropriate extensions to the notions of subsumption and unification.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Disjunctive Normal Form
</SectionTitle>
      <Paragraph position="0"> Definition 3.1 A (disjunctive) feature structure is said to be in disjunctive normal form (DNF) if: (1) the root has only one OR-part, and no AND-part; (2) each disjunct is a dag feature structure; (3) all the disjuncts are disjoint and different (non null isomorphic).</Paragraph>
      <Paragraph position="1"> Note that the disjunctive normal form is defined for feature structures themselves, not for their descriptions. Definition 3.2 The disjunctive normal form of a given feature structure x, noted DNF(x), is a DNF feature structure in which the set of disjuncts $D_i$ is equal to the set of dag-projections of x.</Paragraph>
      <Paragraph position="2"> Definition 3.3 Two feature structures x and y are</Paragraph>
      <Paragraph position="4"/>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Typology of normal forms
</SectionTitle>
      <Paragraph position="0"> We can define several interesting restrictions on feature structures, which in turn define a typology of increasingly restricted normal forms.</Paragraph>
      <Paragraph position="1"> above define several normal forms: (1) 3.1: non-redundant normal form (NRNF); (2) 3.1 and 3.2: hierarchical normal form (HNF); (3) 3.1 and 3.3: AND-normal form (ANF); (4) 3.1, 3.2 and 3.3: layered normal form (LNF). Definition 3.5 In an ANF feature structure x, the AND-part of a node $n_i$ is a maximal AND-part of x if $n_i$ is the output node of no feature-arc.</Paragraph>
      <Paragraph position="2"> Definition 3.6 The layers of a LNF feature structure are defined recursively as follows: (1) Layer 0 is the AND-part of the root; (2) Layer n+1 is the set of (maximal) AND-parts of all the output nodes of OR-arcs originating in layer n. Let us now turn back to formats.</Paragraph>
      <Paragraph position="3"> Definition 3.7 The format of a dag feature structure is the set of maximal extended labels starting at its root. The format of a layer is the union of formats of all the maximal AND-parts in that layer.</Paragraph>
      <Paragraph position="4"> Definition 3.8 A LNF feature structure is said to be in factored normal form (FNF) if the following properties hold: (1) the formats of all layers are disjoint; (2) paths originating in two distinct maximal AND-parts of a layer n can merge only in a node belonging to an AND-part in a layer n' such that n' &lt; n. Fig. 3 shows the typology of normal forms. Note that the DNF is obviously in FNF.</Paragraph>
      <Paragraph position="5"> In the rest of the paper, we will study only the properties of FNF, in which formats are homogeneous. Definition 3.9 The format of a FNF feature structure x, noted f(x), is the sequence $\langle s_0, \ldots, s_n \rangle$ of the formats of each of its layers, in increasing order starting with the root.</Paragraph>
      <Paragraph position="6"> Definition 3.10 We will call sets of extended labels dag-formats, and sequences $\langle s_0, \ldots, s_n \rangle$ of dag-formats with all $s_i$ disjoint, d-formats.</Paragraph>
      <Paragraph position="7"> Proposition 3.2 If two FNF feature structures have the same DNF and the same format, they are equal.</Paragraph>
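      <Paragraph>As a small illustration of Definition 3.10, the Python sketch below represents a dag-format as a set of extended labels written as strings, and tests whether a sequence of dag-formats is a d-format. It assumes that "all disjoint" means pairwise disjoint; the function name is_d_format and the string encoding of extended labels are illustrative assumptions.

```python
from itertools import combinations

def is_d_format(sequence):
    """True if the dag-formats in the sequence are pairwise disjoint.

    Each dag-format is a set of extended labels such as "agr:num".
    """
    return all(a.isdisjoint(b) for a, b in combinations(sequence, 2))

# One dag-format per layer, in increasing order starting with the root.
layered = [{"cat"}, {"agr:num", "agr:per"}]
overlapping = [{"cat"}, {"cat", "agr:num"}]
```

With these values, is_d_format(layered) holds, while is_d_format(overlapping) does not because the extended label cat occurs in two dag-formats.</Paragraph>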
    </Section>
  </Section>
</Paper>