<?xml version="1.0" standalone="yes"?> <Paper uid="J88-1004"> <Title>IMPLEMENTING SYSTEMIC CLASSIFICATION BY UNIFICATION</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> IMPLEMENTING SYSTEMIC CLASSIFICATION BY UNIFICATION C. S. Mellish </SectionTitle> <Paragraph position="0"> School of Cognitive Sciences, University of Sussex 1 The &quot;system networks&quot; of Systemic Grammar provide a notation for declaring how combinations of properties may imply or be inconsistent with other combinations.</Paragraph> <Paragraph position="1"> Partial information about a linguistic entity can be recorded as a set of known properties, and a system network then enables one to infer which other properties follow from this and which other properties are incompatible with this. The possible descriptions allowed by a system network are partially ordered by the relationship of subsumption, where a description subsumes any description that is more specific than it, given the background constraints declared by the network. Given this partial ordering, the set of descriptions can be seen as forming a lattice with least upper bound and greatest lower bound operations. In a class of applications (such as parsing and generation) that require incremental description refinement, we are only really interested in forming new conjunctions (greatest lower bounds) and testing subsumption relationships.</Paragraph> <Paragraph position="2"> If one factors out the complexity of variable renaming and introduces special top and bottom elements, the set of logical terms also forms a lattice (the lattice of Generalised Atomic Formulae, or &quot;GAF lattice&quot;) under the partial ordering relation &quot;is equally or more instantiated than&quot; (Reynolds (1970)). In this lattice, the greatest lower bound operation is unification (Robinson (1965)). 
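As an illustration of unification acting as a greatest lower bound, here is a minimal Python sketch. It is not taken from the paper: the term encoding (tuples for compound terms, capitalised strings for variables, other strings for atoms) and the names unify, resolve and meet are assumptions made for this example, and None stands in for the bottom element.

```python
# Assumed encoding for this sketch only:
#   variable  -> a str starting with an uppercase letter, e.g. "X"
#   atom      -> any other str, e.g. "a"
#   compound  -> a tuple (functor, arg1, ..., argN), e.g. ("f", "X", "b")

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    # Follow variable bindings until an unbound variable or a non-variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    # Occurs check: does variable v appear inside term t?
    t = walk(t, subst)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, a, subst) for a in t[1:])
    return False

def unify(a, b):
    """Return a most general unifier as a dict, or None if none exists."""
    subst = {}
    stack = [(a, b)]
    while stack:
        x, y = stack.pop()
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_var(x):
            if occurs(x, y, subst):
                return None
            subst[x] = y
        elif is_var(y):
            if occurs(y, x, subst):
                return None
            subst[y] = x
        elif (isinstance(x, tuple) and isinstance(y, tuple)
              and x[0] == y[0] and len(x) == len(y)):
            stack.extend(zip(x[1:], y[1:]))
        else:
            return None  # clash of atoms or functors
    return subst

def resolve(t, subst):
    # Apply the substitution throughout a term.
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(arg, subst) for arg in t[1:])
    return t

def meet(a, b):
    """Greatest lower bound of two terms; None plays the role of bottom."""
    s = unify(a, b)
    return None if s is None else resolve(a, s)
```

Under these assumptions, meet(("f", "X", "b"), ("f", "a", "Y")) gives ("f", "a", "b"), the most general term that is at least as instantiated as both arguments, while meet(("f", "a"), ("f", "b")) gives None, mirroring unification failure.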
Unification is a primitive operation in most logic programming systems and is also the basis of various grammatical formalisms. It is therefore a relatively well understood operation and can be efficiently implemented.</Paragraph> <Paragraph position="3"> In this paper, we investigate to what extent it is possible to find structure-preserving mappings from the description spaces defined by system networks to sublattices of the GAF lattice. Where this is possible, we can use a fixed mapping from property names to logical terms to create terms that represent conjunctive descriptions (by unification) and to test subsumption (by testing &quot;less instantiated than&quot;). Incompatibility of descriptions is also indicated by unification failure.</Paragraph> <Paragraph position="4"> There are a number of reasons why it is interesting to investigate these possibilities: (1) They may result in more efficient or flexible ways of using system networks for inferencing.</Paragraph> <Paragraph position="5"> (2) They may enable language analysers or generators which involve unification for other reasons (e.g., analysers for GPSG (Gazdar et al. (1985)) or Functional Unification Grammar (Kay (1984))) to build in feature co-occurrence tests using the same mechanism. 
(3) They may enable us to make formal sense of various ad-hoc mechanisms used by logic programmers in natural language processing.</Paragraph> <Paragraph position="6"> (4) By exposing the nature of the relevant description spaces, they may open various possibilities for the implementation of other classification tasks, e.g., concept learning (Mellish forthcoming).</Paragraph> <Paragraph position="7"> (5) They may give us more insight into the semantics of system networks and the potential of unification.</Paragraph> </Section> <Section position="2" start_page="0" end_page="41" type="metho"> <SectionTitle> 1 HALLIDAY'S SYSTEM NETWORKS </SectionTitle> <Paragraph position="0"> System networks, as used in Halliday's Systemic Grammar (Hudson (1971), Kress (1976), Winograd (1983)), are a way of encoding the choices that must be made in the generation of a complex linguistic object and the inter-dependencies between them. There is actually nothing that makes such networks specific to linguistic applications, and so there is no reason why they cannot be applied to describing the choices involved in other complex situations. Copyright 1988 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the CL reference and this copyright notice are included on the first page. To copy otherwise, or to republish, requires a fee and/or specific permission.</Paragraph> <Paragraph position="1"> A system network can be viewed as a graph, some of whose nodes are annotated with symbols representing properties. The nodes are tied together by the use of four different &quot;connectives&quot;, which we shall designate by &quot;|&quot;, &quot;{&quot;, &quot;}&quot; and &quot;]&quot;. In order to be precise about 
exactly what system networks mean, we will present a logical interpretation, where each appearance of a &quot;connective&quot; in a network gives rise to a set of logical axioms relating the property symbols (interpreted as unary predicates) appearing with it.</Paragraph> <Paragraph position="2"> A fundamental concept in system networks is that of the choice system. A choice system indicates that, if a certain &quot;entry condition&quot; holds, then the object described must have exactly one of the properties mentioned in the system. Choice systems are denoted by use of the &quot;|&quot; connective. Thus, Figure 1 indicates that masculine, feminine and neuter are mutually exclusive and whenever an object has a gender it has one of these. In logic,</Paragraph> <Paragraph position="4"> where AMO (&quot;at most one of&quot;) is defined by:</Paragraph> <Paragraph position="6"> Incidentally, an alternative reading that might suggest itself, namely:</Paragraph> <Paragraph position="8"> is not adequate, because it allows spurious models, for instance where there is an object &quot;a&quot; which satisfies &quot;feminine(a)&quot; and &quot;masculine(a)&quot; but not &quot;gender(a)&quot;.</Paragraph> <Paragraph position="9"> Sometimes more than one choice will be relevant, given the same entry conditions. This is indicated by the &quot;{&quot; connective. For instance, as indicated in Figure 2, in some languages a noun may be either singular or plural, and also either masculine or feminine. Instances of the &quot;{&quot; connective can be translated into logic by simply treating the entry condition of the &quot;{&quot; as that of all the networks introduced on the right-hand side.</Paragraph> <Paragraph position="10"> Thus:</Paragraph> <Paragraph position="12"> The final two connectives concern complex entry conditions into networks. A conjunctive entry condition is denoted by &quot;}&quot;, as shown in Figure 3. 
This means simply:</Paragraph> <Paragraph position="14"> Finally, &quot;]&quot; introduces a disjunctive entry condition, so that the example provided in Figure 4 means: $\forall x\colon \mathit{personal}(x) \lor \mathit{demonstr}(x) \equiv \mathit{number}(x)$. By convention, uses of the four &quot;connectives&quot; can be connected together in any way, as long as &quot;loops&quot; are not created. That is, if one regards each connective as a set of arcs going from properties on its &quot;left&quot; to properties on its &quot;right&quot;, the resulting directed graph must be acyclic.</Paragraph> <Paragraph position="15"> As a larger example, Figure 5 depicts the system network for English pronouns presented in Winograd (1983). Here is a logical translation of selected parts.</Paragraph> <Paragraph position="17"> It is important to note that in this paper we consider system networks as a self-contained notation for describing certain types of choices (&quot;systemic choices&quot;) that are available in the construction of a complex (linguistic) object. We will be completely ignoring the philosophical differences between Systemic Grammar and other forms of generative grammar, and we will also completely ignore the other components that are required in a full Systemic Grammar, such as realisation rules.</Paragraph> <Paragraph position="18"> Computational Linguistics Volume 14, Number 1, Winter 1988</Paragraph> <Paragraph position="20"/> </Section> <Section position="3" start_page="41" end_page="42" type="metho"> <SectionTitle> 2 SUBSUMPTION AND THE LATTICE OF DESCRIPTIONS </SectionTitle> <Paragraph position="0"> The property symbols in a system network provide a basic vocabulary out of which descriptions can be built.</Paragraph> <Paragraph position="1"> The most obvious way to produce more complex descriptions is by conjunction and disjunction. 
The logical interpretation of such complex descriptions is straightforward, and we will often blur the distinction between a description and its interpretation. Thus, &quot;masculine &amp; singular&quot; corresponds to $\lambda x.\ \mathit{masculine}(x) \land \mathit{singular}(x)$, and &quot;masculine v feminine&quot; corresponds to $\lambda x.\ \mathit{masculine}(x) \lor \mathit{feminine}(x)$. A fundamental relationship between descriptions is subsumption: $d_1$ subsumes $d_2$ iff $\Sigma \models \forall x\colon d_2(x) \supset d_1(x)$, where $\Sigma$ is the set of axioms derived from the network. Note that our notion of subsumption depends vitally on $\Sigma$. This is a special case of what Buntine (1986) calls &quot;generalised subsumption&quot;. Intuitively, $d_1$ subsumes $d_2$ if, given the axioms $\Sigma$, $d_1$ is a more general description than $d_2$; that is, if $d_1$ describes all the objects accurately described by $d_2$ and maybe more. Subsumption is a partial ordering on descriptions, and the set of possible descriptions (properties and all possible finite conjunctions and disjunctions of descriptions made from them), ordered by subsumption, forms a lattice. In this lattice, the least upper bound of two descriptions is their disjunction and the greatest lower bound is their conjunction. Figure 6 is a picture of a portion of the lattice consisting of the descriptions derived from the pronoun network. In the picture, if there is a line going upwards from description $d_2$ to description $d_1$, then $d_1$ subsumes $d_2$. Two descriptions that are logically equivalent (e.g., &quot;personal&amp;singular&quot; is the same as &quot;singular&amp;case&quot;) give rise to a single node in the diagram (technically, we are interested in the quotient lattice of the free lattice generated by the property symbols, with respect to the congruence relation defined by $\Sigma$). To find the node for the conjunction of two descriptions, one finds the highest node that is &quot;below&quot; both, i.e., the greatest lower bound. 
Similarly, to find the node for the disjunction of two descriptions, one finds the lowest node which is &quot;above&quot; both.</Paragraph> </Section> <Section position="4" start_page="42" end_page="42" type="metho"> <SectionTitle> 3 INCREMENTAL DESCRIPTION REFINEMENT </SectionTitle> <Paragraph position="0"> The previous two sections introduced a simple language of descriptions and the use of system networks to express extra background information about (constraints on) the terms appearing in those descriptions. A notion of subsumption was defined which allowed this background information to be taken into account. But what are the operations that we need to carry out on descriptions in practical natural language processing systems, and does the structure we have described support these? In this paper, we will concentrate especially on a process that seems to arise in a number of contexts in natural language processing: incremental description refinement. Incremental description refinement (IDR) takes place when a target description is gradually being built of some individual, and information about this individual appears as a sequence of self-contained, independent data descriptions. For instance, the target could be the description of an English sentence and the data descriptions partial descriptions of this sentence like: &quot;the sentence is passive&quot;, &quot;the sentence is declarative&quot;, and &quot;the agent of the sentence is the speaker&quot;. At any point in an IDR process, the information that has accumulated so far may allow certain properties of the individual to be inferred, and so one would like to be able to interrogate the partial description that has been built. In particular, one would like to be able to answer questions about which descriptions are compatible and incompatible with the target description. 
To build an effective IDR system, one must have a way to represent the conjunction of an arbitrary set of pieces of information so that inconsistency and subsumption relationships with other descriptions can be easily detected. The term &quot;incremental description refinement&quot; was, we believe, originally coined by Bobrow and Webber (1980), but the notion of incrementally building descriptions has been influential in a number of AI projects. In natural language processing, IDR is relevant to both natural language parsing and generation. In parsing, it is natural to accumulate information about the structure of a phrase gradually as words are read. For instance, in the sentence &quot;The hairy sheep was...&quot;</Paragraph> <Paragraph position="3"> we know after reading the first three words that the gender of the subject noun phrase is &quot;neuter&quot; and after the next word we know that the number of that phrase is &quot;singular&quot;. It is important in parsing that we be able to accumulate pieces of information of this kind and detect inconsistencies if they arise. In generation, it is natural to want to allow different semantic and pragmatic factors to provide separate constraints on a sentence to be generated. For instance, one pragmatic goal may force a sentence to be passive; another forces it to have a given surface subject. This conjunction of constraints may be inconsistent with certain choices of the main verb (e.g., &quot;buy&quot; vs. &quot;sell&quot;). Again, there is a need to reason about partial descriptions that are built incrementally.</Paragraph> <Paragraph position="4"> In formal terms, the operations involved in IDR are simple. At any point, the information known about the target description can be represented without loss by a single &quot;partial description&quot;: the least upper bound of all the descriptions the target could be. 
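One concrete way to picture such a partial description, sketched here in Python under stated assumptions, is as the set of fully specified descriptions that are still possible; their disjunction is the least upper bound. The toy network (a noun choosing one number and one gender, after the Figure 2 example), the entry property name noun, and the names ALL, candidates, refine and subsumed_by are illustrative assumptions, not the paper's implementation.

```python
from itertools import product

# Assumed toy network: every noun has exactly one number and one gender.
NUMBERS = ("singular", "plural")
GENDERS = ("masculine", "feminine")

# Every complete property set the network axioms allow for one object:
# either no property at all (the object is not a noun), or noun plus
# one number plus one gender. ALL plays the role of "true".
ALL = frozenset(
    [frozenset()]
    + [frozenset(("noun", n, g)) for n, g in product(NUMBERS, GENDERS)]
)

def candidates(prop):
    """Complete descriptions in which the property prop holds."""
    return frozenset(d for d in ALL if prop in d)

def refine(partial, prop):
    """Greatest lower bound of a partial description and a property."""
    return partial.intersection(candidates(prop))

def subsumed_by(partial, prop):
    """True if prop holds in every remaining candidate, i.e. the
    property subsumes the partial description and can be inferred."""
    return partial.issubset(candidates(prop))
```

With this encoding, refining ALL by "singular" leaves two candidates, from which "noun" can be inferred but "masculine" cannot; refining the result further by "plural" leaves the empty set, which plays the role of "false".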
Initially, this is simply the most general description of all (&quot;true&quot;). When a new data description appears, the partial description is replaced by the greatest lower bound of it and the data description. This &quot;algorithm&quot; for IDR is in fact a special case of a more general classification algorithm given in Mellish (forthcoming). At any point, a contradiction is signaled by the partial description becoming the most specific description of all (&quot;false&quot;). Moreover, one can validly infer that the target is subsumed by a given description if the partial description is. The only operations that we need for IDR are subsumption checking and the computation of greatest lower bounds. This means that, in fact, we do not need the full lattice structure developed above; all we need is the meet semi-lattice (Birkhoff (1963)) that contains the possible data descriptions and all possible conjunctions of them.</Paragraph> <Paragraph position="5"> The above description of IDR is not dependent on descriptions being related to system networks, and indeed IDR has been used in quite different contexts. In this paper, however, we will confine ourselves to this case, and consider IDR where the data descriptions are precisely the properties mentioned in a system network.</Paragraph> <Paragraph position="6"> We are thus concerned with ways of computing and testing subsumption between conjunctions of properties, given the background information provided by the network axioms.</Paragraph> </Section></Paper>