<?xml version="1.0" standalone="yes"?> <Paper uid="J92-2003"> <Title>Feature Structures and Nonmonotonicity</Title> <Section position="4" start_page="187" end_page="189" type="metho"> <SectionTitle> 3. Feature Structures and Unification </SectionTitle> <Paragraph position="0"> Feature structures are often depicted as matrices of attribute-value pairs where values are either atoms or feature structures themselves and, furthermore, values may be shared by different attributes in the feature structure. Feature structures can be defined using a description language, such as the one found in PATR-II (Shieber 1986a) or in Kasper and Rounds (1986; 1990). For instance, 4a is a description of 4b.</Paragraph> <Paragraph position="2"> Following the approach of Kasper and Rounds (1986; 1990), and others, we represent feature structures formally as finite (acyclic) automata (the definition below is taken from Dawar and Vijay-Shanker 1990): Definition A finite acyclic automaton A is a 7-tuple ⟨Q, Σ, Γ, δ, q₀, F, λ⟩ where: 1. Q is a nonempty finite set of states, 2. Σ is a countable set (the alphabet), 3. Γ is a countable set (the output alphabet), 4. δ : Q × Σ → Q is a finite partial function (the transition function), 5. q₀ ∈ Q, 6. F ⊆ Q, 7. λ : F → Γ is a total function (the output function), 8. the directed graph (Q, E) is acyclic, where pEq iff for some l ∈ Σ, δ(p, l) = q, 9. for every q ∈ Q, there exists a directed path from q₀ to q in (Q, E), and 10. for every q ∈ F, δ(q, l) is not defined for any l. We will frequently write Q_A, Σ_A, etc. for the set of states of automaton A, the alphabet of A, etc.</Paragraph> <Paragraph position="3"> The relationship between the matrix notation and the automaton concept should be obvious. The following automaton M is, for instance, equivalent to the matrix in 4b.</Paragraph> <Paragraph position="5"> ⟨g f⟩ and ⟨g g⟩ are reentrant. 
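The automaton view can be made concrete with a small sketch. The Python fragment below is only an illustration under assumptions of our own: the state names, the sample transitions, and the helper names walk, reentrant, and value are hypothetical, and the sample is not the automaton M of 4b. It stores the transition function as a dictionary keyed by (state, label) pairs and the output function as a second dictionary over final states.

```python
# Hypothetical encoding of a feature structure as a finite acyclic automaton.
# delta plays the role of the transition function, lam of the output function.
Q0 = "q0"
delta = {
    ("q0", "f"): "q1",
    ("q0", "g"): "q2",
    ("q2", "f"): "q3",
    ("q2", "g"): "q3",  # same target state: paths g f and g g are reentrant
}
lam = {"q1": "a", "q3": "b"}  # final states carry atomic values


def walk(path, start=Q0):
    """Follow a sequence of labels; return the state reached, or None."""
    state = start
    for label in path:
        state = delta.get((state, label))
        if state is None:
            return None
    return state


def reentrant(path1, path2):
    """Two distinct paths are reentrant if they lead to the same state."""
    s1, s2 = walk(path1), walk(path2)
    return path1 != path2 and s1 is not None and s1 == s2


def value(path):
    """Atomic value at the end of a path, or None if there is none."""
    return lam.get(walk(path))
```

Under this encoding, structure sharing is simply identity of target states, which is why reentrancy needs no separate bookkeeping.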
Unification is defined in terms of subsumption, a relation that imposes a partial ordering on automata: Definition An automaton A subsumes an automaton B (A ⊑ B) iff there is a homomorphism h from A to B such that: 1. h(δ_A(q, l)) = δ_B(h(q), l), 2. λ_B(h(q)) = λ_A(q) for all q ∈ F_A, and 3. h(q₀A) = q₀B. Intuitively, A ⊑ B if B extends the information in A. A = B if A ⊑ B and B ⊑ A. Unification of two automata A and B (A ⊔ B) is the least upper bound of these automata under subsumption. If no upper bound exists, unification fails. The semantics of descriptions (sets of formulae of the description language) is given in terms of satisfaction: A ⊨ D iff for all φ ∈ D: A ⊨ φ; A ⊨ a iff Q = F = {q₀} and λ(q₀) = a; A ⊨ ⟨p⟩ : D iff δ(q₀, p) is defined and A/p ⊨ D; A ⊨ ⟨p₁⟩ = ⟨p₂⟩ iff δ(q₀, p₁) = δ(q₀, p₂).</Paragraph> <Paragraph position="6"> A/p is the automaton obtained from A by making δ(q₀, p) the initial state and removing all inaccessible states. There is always a unique minimal element in the subsumption hierarchy that satisfies a description D. This element is the denotation of D. [3] [Footnote 2: δ(q, pl) is defined for pl ∈ Σ* as δ(δ(q, p), l).] [Footnote 3: Much of the formal work on feature structures is concerned with the semantics of feature structure descriptions involving disjunction and negation. Such descriptions do not denote a unique feature structure, but denote sets of feature structures. Such extensions are not taken into consideration here.] Computational Linguistics Volume 18, Number 2</Paragraph> </Section> <Section position="5" start_page="189" end_page="189" type="metho"> <SectionTitle> 4. Default Unification </SectionTitle> <Paragraph position="0"> Default reasoning with feature structures requires the ability to modify feature structures nonmonotonically. Unification does not have this ability, as it can only replace a feature structure by more specific instances of that structure. 
Below, we define default unification as an operation that merges parts of one feature structure (the default argument) with another feature structure (the nondefault argument). We write A ⊔! B for the default unification of the default feature structure A and the nondefault feature structure B. The operation has the following characteristics: 1. It has a declarative semantics and is procedurally neutral. That is, if A = A' and B = B', then (A ⊔! B) = (A' ⊔! B').</Paragraph> <Paragraph position="1"> 2. It is monotonic only with respect to the nondefault argument. That is, B ⊑ (A ⊔! B) is always true, but in general A ⊑ (A ⊔! B) will not hold.</Paragraph> <Paragraph position="2"> 3. It never fails. If A is fully incompatible with B, (A ⊔! B) = B.</Paragraph> <Paragraph position="3"> 4. It gives a unique result.</Paragraph> <Paragraph position="4"> 5. Reentrancies in the nondefault argument may be replaced by a weaker set of reentrancies if necessary (this is the add conservatively operation of Shieber (1986b)).</Paragraph> <Paragraph position="5"> Intuitions about default unification appear to be clearer in those cases where feature structures do not contain any reentrancies. Therefore, we will first define default unification for this case, moving to the general case in Section 4.2. Section 4.3 deals with the incorporation of add conservatively.</Paragraph> <Section position="1" start_page="189" end_page="189" type="sub_section"> <SectionTitle> 4.1 Default Unification without Reentrancies </SectionTitle> <Paragraph position="0"> Subsumption suggests a straightforward definition of an operation that has properties 1-4 above.</Paragraph> <Paragraph position="1"> Definition Default Unification (first version) A ⊔! B = A' ⊔ B, where A' is the maximal (i.e. most specific) element in the subsumption ordering such that A' ⊑ A and A' ⊔ B is defined. From this definition of ⊔!, it follows immediately that properties 1-3 hold. 
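For reentrancy-free structures, the first version can be sketched in a few lines of Python. This is an illustrative sketch under assumptions that are not part of the definition: feature structures are encoded as nested dictionaries, atoms as strings, an empty dictionary as a node about which nothing is known, and default_unify is a hypothetical name. Rather than constructing A' explicitly, the function walks both arguments in parallel and keeps default information only where it does not clash with nondefault information, which for tree-shaped structures is intended to yield the same result as A' ⊔ B.

```python
import copy


def default_unify(default, nondefault):
    """Merge default into nondefault; nondefault information always wins."""
    # An empty nondefault node says nothing, so the default survives whole.
    if isinstance(nondefault, dict) and not nondefault:
        return copy.deepcopy(default)
    # Atom on either side: any clash is resolved in favour of the nondefault.
    if not isinstance(nondefault, dict) or not isinstance(default, dict):
        return copy.deepcopy(nondefault)
    # Both complex: keep every nondefault feature, refining it with defaults.
    result = {}
    for feature, value in nondefault.items():
        if feature in default:
            result[feature] = default_unify(default[feature], value)
        else:
            result[feature] = copy.deepcopy(value)
    # Default features that the nondefault argument never mentions are added.
    for feature, value in default.items():
        if feature not in nondefault:
            result[feature] = copy.deepcopy(value)
    return result
```

With the default {"cat": "v", "aux": "-", "agr": {"num": "sg"}} and the nondefault {"aux": "+", "agr": {"per": "3"}}, the sketch keeps aux as "+", adds cat, and merges the two agr values. The operation never fails, in line with property 3: two clashing atoms simply return the nondefault one.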
The fact that default unification has a unique result follows from the fact that A' is unique (up to isomorphism). [4] Note furthermore that from the requirement that A' must be maximal it follows that no information contained in A is left out in A ⊔! B unnecessarily.</Paragraph> </Section> </Section> <Section position="6" start_page="189" end_page="196" type="metho"> <SectionTitle> Footnote 4: Unicity of A' is proved as follows: Assume that there is an A'' such that (1) A'' ≠ A', (2) A' ⊑ A and </SectionTitle> <Paragraph position="0"> A'' ⊑ A, (3) A' ⊔ B and A'' ⊔ B are defined, and (4) both A' and A'' are maximal. We show that these assumptions are inconsistent. From (2) it follows that A' ⊔ A'' is defined and (A' ⊔ A'') ⊑ A. From (3) it follows that (A' ⊔ A'') ⊔ B is defined (since, if there are no reentrancies, it holds in general that if X ⊔ Y, Y ⊔ Z, and X ⊔ Z are defined, X ⊔ Y ⊔ Z is defined). But then, if (A' ⊔ A'') = A' or (A' ⊔ A'') = A'', either condition (1) or (4) is not met, or, if A' ⊔ A'' ≠ A' ≠ A'', condition (4) is not met. □ The definition of default unification above relies crucially on the fact that there is a unique maximal element A' unifiable with B. In Section 2, we argued that such an approach is only feasible for a limited domain. In particular, once reentrancies are introduced, A' is no longer guaranteed to be unique, and the definition above is therefore not easily generalized. Fortunately, it is also possible to define A ⊔! B without requiring unifiability of some element A' with B explicitly. This definition, which will be extended below, defines A ⊔! B in terms of the difference of the two arguments A and B.</Paragraph> <Paragraph position="2"> It should be obvious that characteristics 1-3 continue to hold. Uniqueness follows in this case from the fact that the difference operation will give a unique result. 
(A - B can be constructed from A by checking for each state in A whether it must be removed or not and ensuring that the resulting automaton is connected.) For instance, assuming A and B to be defined as in Example 6, we find that A - B is:</Paragraph> <Section position="1" start_page="191" end_page="192" type="sub_section"> <SectionTitle> Example 7 </SectionTitle> <Paragraph position="0"> Note that in A - B all parts that are identical in A and B are removed, whereas this was not the case for A', as defined in Definition 3.1. The outcome of default unification, however, is identical in both cases. The reason for this restriction on A - B will become apparent below.</Paragraph> <Paragraph position="1"> While default unification monotonically extends the nondefault argument (i.e. B ⊑ A ⊔! B) and nonmonotonically extends the default argument, the operation itself is monotonic in its default argument and nonmonotonic in its nondefault argument. The theorem below proves monotonicity for the default argument; that is, a more specific default argument will lead to a more specific outcome of default unification: Theorem 1 For all feature structures A, B, and C, not containing reentrancies, if A ⊑ B then (A ⊔! C) ⊑ (B ⊔! C).</Paragraph> <Paragraph position="2"> It suffices to show that (A - C) ⊑ (B - C), or in other words:</Paragraph> <Paragraph position="4"> If these two conditions are met, there is a homomorphism from A - C to B - C as required by the definition of subsumption. (Remember that there are no reentrancies.) Case (1): If λ(δ_{A-C}(q₀, p)) = a, then (i) λ(δ_B(q₀, p)) = a (since A - C ⊑ A ⊑ B) and from the definition of A - C it follows that (ii) there is no prefix p' of p such that δ_C(q₀, p') ∈ F_C, nor is δ_C(q₀, p) defined. 
From (i) and (ii) it follows that δ_{B-C}(q₀, p) is defined and that λ(δ_{B-C}(q₀, p)) = a.</Paragraph> <Paragraph position="5"> Case (2): If δ_{A-C}(q₀, p) is defined and δ_{A-C}(q₀, p) ∉ F_{A-C} (otherwise this case reduces to case (1)), it follows that (i) δ_B(q₀, p) is defined, and (ii) there is no prefix p' of p such that δ_C(q₀, p') ∈ F_C. From (i) and (ii) it follows that δ_{B-C}(q₀, p) is defined. □ Note, however, that addition of nondefault information does not necessarily lead to a more specific result. That is, the dual of Theorem 1 does not hold: Example 8 if B ⊑ C then (A ⊔! B) ⊑ (A ⊔! C) The reason is that addition of nondefault information may lead to a larger amount of default information being removed, and thus, the resulting feature structures A ⊔! B and A ⊔! C can be incompatible. An example that falsifies 8 is presented below.</Paragraph> <Paragraph position="6"> or λ(δ_B(q₀, p)) = a (since there are no reentrancies) and (ii) there is no prefix p' of p such that δ_C(q₀, p') ∈ F_C, nor is δ_C(q₀, p) defined. From (i) and (ii) it follows that λ(δ_{A-C}(q₀, p)) = a or λ(δ_{B-C}(q₀, p)) = a, and thus that λ(δ_E(q₀, p)) = a. Similarly, if δ_D(q₀, p) is defined (but not an end state), it follows that δ_A(q₀, p) or δ_B(q₀, p) is defined and that there is no prefix p' of p such that δ_C(q₀, p') ∈ F_C. Therefore, either δ_{A-C}(q₀, p) or δ_{B-C}(q₀, p) is defined, and thus δ_E(q₀, p) is defined. It follows that D ⊑ E.</Paragraph> <Paragraph position="7"> Case (2): If λ(δ_E(q₀, p)) = a, then λ(δ_{A-C}(q₀, p)) = a or λ(δ_{B-C}(q₀, p)) = a (since there are no reentrancies). Therefore, λ(δ_{A⊔B}(q₀, p)) = a and also, there is no prefix p' of p such that δ_C(q₀, p') ∈ F_C, nor is δ_C(q₀, p) defined. It follows that λ(δ_D(q₀, p)) = a. Similarly if δ_E(q₀, p) is defined but not an end state, δ_D(q₀, p) is defined. It follows that E ⊑ D. □ As long as Theorem 2 
holds, it is possible to define default unification by decomposing the default argument into simpler feature structures and adding these (nonmonotonically) to the nondefault argument. This approach appears to underlie some of the previous proposals, but is inadequate once reentrancies enter the picture.</Paragraph> </Section> <Section position="2" start_page="192" end_page="195" type="sub_section"> <SectionTitle> 4.2 Default Unification with Reentrancies </SectionTitle> <Paragraph position="0"> Taking reentrancies into account requires an extension of the difference operation.</Paragraph> <Paragraph position="1"> If we allow either default or nondefault information to refer to an extension of a nondefault or default reentrancy, respectively, there is in general no unique maximal element subsuming A and unifiable with B. A slight modification of Examples 2 and 3 will illustrate this.</Paragraph> <Paragraph position="2"> Example 10 [...] default reentrancy. A - B could be constructed from A by removing either the fact that ⟨f f⟩ : a or ⟨g f⟩ : b. In 11, nondefault information refers to an extension of a default reentrancy. In this case, we could either remove the reentrancy (and the fact that ⟨g f⟩ : a) or remove the fact that ⟨f f⟩ : a and ⟨g f⟩ : a and preserve the reentrancy. Neither solution subsumes the other. To avoid such problems, it is best to avoid interaction between reentrancies and other information altogether and to treat reentrant nodes in a similar fashion as atomic nodes. That is, we remove default reentrancies if they refer to defined parts of the nondefault automaton, and default information in general is removed if it refers to extensions of nondefault reentrancies. Thus, the difference operation can be extended as follows: Again, characteristics 1-4 of default unification mentioned in the introduction of this section hold. Uniqueness of the result follows from the fact that A - B is unique. 
(A - B can be constructed in this case as follows: for all paths p, if δ_A(q₀, p) = δ_A(q₀, p'), and p is defined in B, introduce a new value for δ_A(q₀, p) such that the automata that have δ_A(q₀, p) and δ_A(q₀, p') as initial state are isomorphic. Next, check for all states in the modified automaton whether they must be removed and ensure that the resulting automaton is connected.) The monotonicity properties of default unification also remain as before. The theorem below is the relevant generalization of Theorem 1.</Paragraph> <Paragraph position="3"> Theorem 3 For all feature structures A, B, and C, if A ⊑ B then (A ⊔! C) ⊑ (B ⊔! C). Proof It suffices to show that A - C ⊑ B - C, or in other words: 1. if λ(δ_{A-C}(q₀, p)) = a, then λ(δ_{B-C}(q₀, p)) = a,</Paragraph> <Paragraph position="5"> If these three conditions are met, there is a homomorphism from A - C to B - C as required by the definition of subsumption.</Paragraph> <Paragraph position="6"> Case (1): If λ(δ_{A-C}(q₀, p)) = a, then (i) λ(δ_B(q₀, p)) = a (since A - C ⊑ A ⊑ B) and (ii) from the definition of A - C, it follows that there is no prefix p' of p such that δ_C(q₀, p') ∈ F_C or δ_C(q₀, p') = δ_C(q₀, p''), nor is δ_C(q₀, p) defined. From (i) and (ii) it follows that δ_{B-C}(q₀, p) is defined and that λ(δ_{B-C}(q₀, p)) = a.</Paragraph> <Paragraph position="7"> Case (2): Similarly, if δ_{A-C}(q₀, p) = δ_{A-C}(q₀, p'), then (i) δ_B(q₀, p) = δ_B(q₀, p'), and (ii) there is no prefix p'' of p such that δ_C(q₀, p'') ∈ F_C or δ_C(q₀, p'') = δ_C(q₀, p'''), nor is δ_C(q₀, p) defined. From (i) and (ii) it follows that δ_{B-C}(q₀, p) = δ_{B-C}(q₀, p').</Paragraph> <Paragraph position="8"> Case (3): If δ_{A-C}(q₀, p) is defined and δ_{A-C}(q₀, p) ∉ F_{A-C} (otherwise this case reduces to case (1)) and δ_{A-C}(q₀, p) is not reentrant (otherwise this case reduces to case (2)), it follows that (i) δ_B(q₀, p) is defined, and (ii) there is no prefix p' of p such that [...] As in the previous section, it suffices to prove that (A - C) ⊔ (B - C) ⊑ (A ⊔ B) - C. 
From the fact that X ⊑ X' and Y ⊑ Y' implies (X ⊔ Y) ⊑ (X' ⊔ Y'), it follows that ((A - C) ⊔ (B - C)) ⊑ (A ⊔ B). Now, as in the previous proof, if some path p is atomic, reentrant, or merely defined in (A - C) ⊔ (B - C), it follows that (i) p is atomic, reentrant, or defined in A ⊔ B and (ii) there is no atomic or reentrant path p' in C that is a prefix of p, nor is p defined in C if p is atomic or reentrant in (A - C) ⊔ (B - C). It follows that p is atomic, reentrant, or defined in (A ⊔ B) - C. □ An illustration of this result is given below. Note that 12 also illustrates that the converse of Theorem 4 no longer holds.</Paragraph> <Paragraph position="10"/> </Section> <Section position="3" start_page="195" end_page="195" type="sub_section"> <SectionTitle> 4.3 Add Conservatively </SectionTitle> <Paragraph position="0"> Defining default unification as (A - B) ⊔ B will fail to capture the idea of Shieber's (1986b) add conservatively, as the difference operation completely removes a default reentrancy if one of the paths leading to it is also defined in the nondefault argument.</Paragraph> <Paragraph position="1"> However, linguistic applications, such as an encoding of the Head Feature Convention, indicate that a more subtle approach should be taken. In particular, if a default structure contains the information that ⟨p⟩ = ⟨p'⟩, whereas in the nondefault structure ⟨pl⟩ is defined for some feature l, we want to treat only l as an exception to the general rule that ⟨p⟩ = ⟨p'⟩, and preserve the information that ⟨pl'⟩ = ⟨p'l'⟩ (for l' ≠ l).</Paragraph> <Paragraph position="2"> We implement this idea using the following operation: Definition Let A and B be automata. The extension of A relative to B (Ext(A, B)) is the minimal (i.e. 
most general) element Ext(A, B) such that</Paragraph> <Paragraph position="4"> The automaton A is extended, sometimes somewhat redundantly, with reentrant paths that are extensions of paths already reentrant in A. Ext(A, B) is nevertheless usually more informative than A itself, as the addition of a path pl blocks unification with feature structures in which p receives an atomic value. Note furthermore that path extensions are not always possible; that is, if δ_A(q₀, p) ∈ F_A and δ_B(q₀, pl) is defined, there is no extension of A in which pl is defined. (This explains the "wherever possible".)</Paragraph> <Paragraph position="5"> In order to get all relevant path-extensions, the alphabet Σ will in general be the set of all features defined in the grammar, although in particular cases Σ can be restricted to a smaller set (the set of head-features, for instance).</Paragraph> <Paragraph position="6"> We are now ready to give a definition of default unification that incorporates the effects of add conservatively. To avoid confusion, we use the operator ⊔ac! for this extended version of default unification.</Paragraph> </Section> <Section position="4" start_page="195" end_page="196" type="sub_section"> <SectionTitle> Definition Default Unification (final version) </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> An example of default unification involving reentrancies is presented below. We assume that the set of features Σ = {f, g}.</Paragraph> <Paragraph position="3"> The example shows that default unification is slightly more restrictive than add conservatively, since the original reentrancy is removed even though A and B would have been unifiable. The reason is of course that this will guarantee uniqueness of the result of default unification, whereas this is not the case for add conservatively.</Paragraph> </Section> </Section> <Section position="7" start_page="196" end_page="198" type="metho"> <SectionTitle> 5. 
Linguistic Applications of Default Unification </SectionTitle> <Paragraph position="0"> In this section, we sketch how default unification can be incorporated in a grammar formalism and argue briefly that this can be an alternative to some of the extensions mentioned in Section 2.</Paragraph> <Section position="1" start_page="196" end_page="198" type="sub_section"> <SectionTitle> 5.1 Nonmonotonic Template Inheritance </SectionTitle> <Paragraph position="0"> In grammar formalisms such as PATR-II, feature structures are defined as sets of equations and templates. Each equation or template denotes a feature structure (i.e.</Paragraph> <Paragraph position="1"> the minimal feature structure that satisfies the equation or the equations that make up the template definition), and the denotation of a set of such elements is simply the unification of all their denotations. Incorporation of default unification requires that a distinction be made between default and nondefault information. In the notation used here, nondefault information is prefixed by a "!". The feature structure denoted by a definition that contains both default and nondefault information is arrived at by first unifying all default information and unifying all nondefault information. Next, the two feature structures are combined by means of default unification (⊔ac!).</Paragraph> <Paragraph position="2"> If templates are incorporated as default information, the feature structure denoted by the template is inherited nonmonotonically. (Monotonic inheritance is possible as well of course: this is achieved by prefixing a template with "!".) 
As an illustration, consider the following fragment, in which an attempt is made to encode some of the peculiarities of the English auxiliary system in a lexicalist grammar:</Paragraph> <Paragraph position="4"> ⟨subcat rest rest⟩ = empty !⟨subcat first subcat first nform⟩ = ⟨subcat rest first nform⟩ ).</Paragraph> <Paragraph position="5"> Adding the equations !⟨aux⟩ : + and !⟨inv⟩ : + [5] to the definition of AUX has an effect comparable to that of the overwrite-operation of Shieber (1986a, p. 60). The AUX template inherits from VERB by default, but the equations just mentioned block inheritance of the values for ⟨inv⟩ and ⟨aux⟩. However, default unification allows us to do more. An auxiliary does not subcategorize for an ordinary NP subject, nor does it subcategorize for a complement VP that subcategorizes for an ordinary NP subject. Rather, the restrictions to be placed on the nform of the subject are inherited from the embedded VP: Example 15 a. it will annoy Kim that she lost b. *Sue will annoy Kim that she lost This dependency between elements of the subcat list is encoded in the final equation, which also suppresses (or overwrites) the default value for ⟨nform⟩. The denotation of AUX is thus:</Paragraph> <Paragraph position="7"> The nonmonotonic inheritance regime is flexible enough to allow for exceptions to exceptions. Gazdar et al. (1985, p. 65) observe that at least in some dialects of English, the auxiliary might cannot occur in inverted structures. This is expressed in the following lexical entry, in which might inherits nonmonotonically from AUX, which itself inherits nonmonotonically from VERB: Example 17 might : ( AUX !⟨inv⟩ = - ).</Paragraph> <Paragraph position="8"> There is an important difference between the approach to nonmonotonic inheritance sketched here and the majority of inheritance-based formalisms used for Knowledge Representation, which has to do with the way in which templates are evaluated. 
If a template is used as part of the definition of another feature structure, all we need to know to determine the denotation of this feature structure is the denotation of this template (which is a feature structure). [Footnote 5: Note that the feature INV as used here indicates only whether a (lexical) item may occur in an inverted structure. It does not distinguish between inverted and noninverted clauses.] [Gosse Bouma, Feature Structures and Nonmonotonicity] How this template was defined (as a set of equations or as a combination of (more general) templates, as a combination of default and nondefault information or not) is completely irrelevant to its meaning. Thus, the denotation of AUX would remain as before, if we defined it as:</Paragraph> <Paragraph position="10"> Consequently, the denotation of might is not affected by this change in definition either.</Paragraph> <Paragraph position="11"> The role of classes (or frames) in inheritance-based systems, however, as described in, for instance, Touretzky (1986), is rather different. To determine the denotation of a class might that inherits from a class AUX, we not only need to know the contents of AUX, but also the classes from which AUX inherits. The latter is important for resolving multiple-inheritance conflicts. If the class might inherits from both AUX and VERB, for instance, and AUX in its turn inherits from VERB as well, information inherited from AUX must take precedence over information from VERB, as the former is more specific than the latter. In our nonmonotonic inheritance mechanism for templates, such reasoning is impossible. Adding the template VERB as default information to the definition of the template (or lexical entry) might would lead to a unification failure of the default information, and thus the definition as a whole would be considered illegal. 
[6] This is as it should be, we believe, given the fact that the inheritance hierarchy as such should not play a role in determining the meaning of templates. The denotation of the template AUX is the feature structure in 16 (i.e., whether it is defined as in 14 or as in 18 is irrelevant), and from that it is impossible to conclude that AUX inherits from VERB, and thus the kind of reasoning used to justify the resolution of feature conflicts used in Touretzky (1986) is not applicable in our case.</Paragraph> </Section> <Section position="2" start_page="198" end_page="198" type="sub_section"> <SectionTitle> 5.2 Lexical Defaults </SectionTitle> <Paragraph position="0"> The definition of auxiliaries above is still unsatisfactory in that it predicts that auxiliaries subcategorize for verbal complements that are specified as ⟨aux⟩ = -. Clearly, this requirement is too strong (although it is correct for the auxiliary do). One way to solve this problem is to redefine the AUX-template as:</Paragraph> </Section> </Section> </Paper>