File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/91/w91-0103_metho.xml
Size: 19,230 bytes
Last Modified: 2025-10-06 14:12:49
<?xml version="1.0" standalone="yes"?> <Paper uid="W91-0103"> <Title>Towards Uniform Processing of Constraint-based Categorial Grammars</Title> <Section position="4" start_page="12" end_page="13" type="metho"> <SectionTitle> 2 Constraint-based versions of categorial grammar </SectionTitle> <Paragraph position="0"> The formalism I assume consists of definite clauses over constraint languages in the manner of Höhfeld and Smolka (1988). The constraint language at least consists of the path equations known from PATR II (Shieber, 1989), augmented with variables. I write such a definite clause as: p :- q1, ..., qn, φ.</Paragraph> <Paragraph position="1"> where p and the qi are atoms and φ is a (conjunction of) constraint(s). The path equations are written as in PATR II, but each path starts with a variable:</Paragraph> <Paragraph position="2"> (X l1 ... ln) ≐ (X' l'1 ... l'm)    (X l1 ... ln) ≐ c</Paragraph> <Paragraph position="3"> where X, X' are variables, c is a constant, and l, l' are attributes. I also use some more powerful constraints that are written as atoms.</Paragraph> <Paragraph position="4"> This formalism is used to define what possible 'signs' are, by the definition of the unary predicate sign/1. There is only one nonunit clause for this predicate. The idea is that unit clauses for sign/1 are lexical entries, and the one nonunit clause defines the (binary) application rule. I assume that lexical entries are specified for their arguments in their 'subcat list' (sc). In the application rule a head selects the first (f) element from its subcat list, and the tail (r) of the subcat list is the subcat list of the mother; the semantics (sem) and strings (phon) are shared between the head and the mother:</Paragraph> <Paragraph position="5"> sign(M) :- sign(H), sign(A),
    (H synsem sc f) ≐ A,
    (M synsem sc) ≐ (H synsem sc r),
    (M synsem sem) ≐ (H synsem sem),
    (M phon) ≐ (H phon).</Paragraph> <Paragraph position="6"> I write such rules using matrix notation, where string(X) represents the value Y such that string(X, Y).</Paragraph> <Paragraph position="8"> The grammar also contains a number of lexical entries. Each of these lexical entries is specified for its subcat list, and for each subcat element the semantics and word-order domain are specified, such that they satisfy a termination condition to be defined in the following section. For example, this condition is satisfied if the semantics of each element in the subcat list is a proper subpart of the semantics of the entry, and each element of the subcat list is a proper subpart of the word-order domain of the entry. The phonology of a sign is defined with respect to the word-order domain with the predicate 'string'. This predicate simply defines a left-to-right depth-first traversal of a word-order domain and picks up all the strings at the terminals. It should be noted that the way strings are computed from the word-order domains implies that the string of a node is not necessarily the concatenation of the strings of its daughter nodes. In fact, the relation between the strings of nodes is defined indirectly via the word-order domains.</Paragraph> <Paragraph position="11"> The word-order domains are sequences of signs.</Paragraph> <Paragraph position="12"> One of these signs is the sign corresponding to the lexical entry itself. However, the domain of this sign itself is empty; other values can be shared.</Paragraph> <Paragraph position="13"> Hence the entry for an intransitive German verb such as 'schläft' (sleeps) is defined as in figure 1.
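To make these ingredients concrete, the following Prolog sketch renders the application clause and the 'string' traversal over plain terms instead of feature graphs. It is an illustration only, not the paper's code: the encoding sign(Sem, Sc, Dom, Phon) and all predicate names are assumptions of this sketch. Note that under plain SLD-resolution the application clause loops; this is exactly the problem the head-driven interpreter of the next section addresses.

% Illustrative encoding: a sign is sign(Sem, Sc, Dom, Phon), with Sc the
% subcat list and Dom the word-order domain (a list of signs).

% Application: the head selects the first element (f) of its subcat list;
% the tail (r), the semantics and the string are shared with the mother.
sign(sign(Sem, Sc, Dom, Phon)) :-
    sign(sign(Sem, [Arg|Sc], Dom, Phon)),
    sign(Arg).

% string(Dom, Str): left-to-right depth-first traversal of a word-order
% domain, picking up the strings at the terminals.
string([], []).
string([sign(_, _, [], Word)|Signs], Str) :-    % terminal: empty domain
    string(Signs, Rest),
    append(Word, Rest, Str).
string([sign(_, _, [D|Ds], _)|Signs], Str) :-   % embedded domain: recurse
    string([D|Ds], Sub),
    string(Signs, Rest),
    append(Sub, Rest, Str).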
I introduce some syntactic sugaring to make such entries readable. Firstly, XP_i will stand for a sign of category XP whose synsem : sem value is [i]. Furthermore, in lexical entries the synsem part is shared with the synsem part of an element of the word-order domain, which is furthermore specified for the empty domain and some string. I will write << string >> in a lexical entry to stand for the sign whose synsem value is shared with the synsem of the lexical entry itself; its dom value is the empty sequence and its phon value is string. The foregoing entry is abbreviated as:

[ synsem : sem : schlafen([1])
  sc : ([2] NP_1)
  dom : [3] ([2], << schläft >>)
  phon : string([3]) ]

Note that in this entry we merely stipulate that the verb preceded by the subject constitutes the word-order domain of the entire phrase. However, we may also use more complex constraints to define word-order constraints. In particular, as already stated above, LP constraints are defined which hold for word-order domains. I use the sequence-union predicate (abbreviated su) defined by Reape as a possible constraint as well. This predicate is motivated by clause union and scrambling phenomena in German. A linguistically motivated example of the use of this constraint can be found in section 4. The predicate su(A, B, C) is true in case the elements of the list C are the multiset union of the elements of the lists A and B; moreover, a < b in either A or B iff a < b in C. I also use the notation X ∪◦ Y to denote the value Seq, where su(X, Y, Seq). For example, su([a, d, e], [b, c, f], [a, b, c, d, e, f]) holds; [a, c] ∪◦ [b] stands for [a, c, b], [a, b, c] or [b, a, c]. In fact, I assume that this predicate is also used in the simple cases, in order to be able to spell out generalizations in the linear precedence constraints. Hence the entry for 'schlafen' is defined as follows, where I write lp(X) to indicate that the lp constraints should be satisfied for X. I have nothing to say about the definition of these constraints.</Paragraph> <Paragraph position="15">

[ synsem : sem : schlafen([1])
  sc : ([2] NP_1)
  dom : [3] ((<< schläft >>) ∪◦ ([2]))
  phon : string(lp([3])) ]

In the following I (implicitly) assume that for each lexical entry the following holds:

[ dom : [1]
  phon : string(lp([1])) ]
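Since su/3 is stated purely in terms of lists, it can be transcribed directly as a Prolog sketch (outside the constraint-language embedding):

% su(A, B, C): C is a multiset union of A and B in which the relative
% order of the elements of A and of B is preserved.
su([], [], []).
su([X|A], B, [X|C]) :- su(A, B, C).
su(A, [Y|B], [Y|C]) :- su(A, B, C).

The query su([a, c], [b], D) then enumerates exactly [a, c, b], [a, b, c] and [b, a, c], as in the example above.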
</Paragraph> </Section> <Section position="5" start_page="13" end_page="15" type="metho"> <SectionTitle> 3 Uniform Processing </SectionTitle> <Paragraph position="0"> In van Noord (1991) I define a parsing strategy, called 'head-corner parsing', for a class of grammars allowing more complex constraints on strings than context-free concatenation. Reape defines generalizations of the shift-reduce parser and the CYK parser (Reape, 1990b) for the same class of grammars. For generation, head-driven generators can be used (van Noord, 1989; Calder et al., 1989; Shieber et al., 1990). Alternatively I propose a generalization of these head-driven parsing and generation algorithms. The generalized algorithm can be used both for parsing and generation. Hence we obtain a uniform algorithm for both processes. Shieber (1988) argues for a uniform architecture for parsing and generation. In his proposal, both processes are (different) instantiations of a parameterized algorithm. The algorithm I define is not parameterized in this sense, but really uses the same code in both directions. Some of the specific properties of the head-driven generator on the one hand, and the head-driven parser on the other hand, follow from general constraint-solving techniques. We thus obtain a uniform algorithm that is suitable for linguistic processing. This result should be compared with other uniform schemes such as SLD-resolution or some implementations of type inference (Zajac, 1991, this volume), which clearly are also uniform but face severe problems in the case of lexicalist grammars, as such schemes do not take into account the specific nature of lexicalist grammars (Shieber et al., 1990).</Paragraph> <Paragraph position="3"> Algorithm. The algorithm is written in the same formalism as the grammar and thus constitutes a meta-interpreter. The definite clauses of the object-grammar are represented as</Paragraph> <Paragraph position="4"> rule(H, M, A) :- φ.</Paragraph> <Paragraph position="5"> for the rule sign(M) :- sign(H), sign(A), φ. The associated interpreter is a Prolog-like top-down backtrack interpreter where term unification is replaced by more general constraint-solving techniques (Höhfeld and Smolka, 1988; Tuda et al., 1989; Damas et al., 1991). The meta-interpreter defines a head-driven bottom-up strategy with top-down prediction (figure 2), and is a generalization of the head-driven generator (van Noord, 1989; Calder et al., 1989; van Noord, 1990a) and the head-corner parser (Kay, 1989; van Noord, 1991).</Paragraph> <Paragraph position="6">
connect(T, T).
connect(S, T) :- rule(S, M, A), prove(A), connect(M, T).</Paragraph>
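<Paragraph> The prove/1 clause of figure 2 did not survive this extraction. Given the rule/3 encoding above and the head-driven algorithms cited, a plausible reconstruction (a sketch only, with lex/1 assumed to enumerate the unit clauses, i.e. the lexical entries, of sign/1) is:

% Reconstruction of the missing top goal of figure 2, not the original:
% predict a lexical head for the goal, then connect it bottom-up.
prove(Goal) :-
    lex(Head),
    connect(Head, Goal).

In the actual algorithm the choice of Head is presumably constrained by the goal (head-driven prediction); nothing below depends on how that choice is made.</Paragraph>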
<Paragraph position="9"> In the formalism defined in the preceding section there are two places where non-termination may come in: in the constraints, or in the definite relations over these constraints. In this paper I am only concerned with the second type of non-termination; that is, I simply assume that the constraint language is decidable (Höhfeld and Smolka, 1988). For the grammar sketched in the foregoing section we can define a very natural condition on lexical entries that guarantees us termination of both parsing and generation, provided the constraint language we use is decidable.</Paragraph> <Paragraph position="10"> The basic idea is that for a given semantic representation or (string constraining a) word-order domain, the derivation tree that derives these representations has a finite depth. Lexical entries are specified for (at least) sc, phon and sem. The constraint merely states that the values of these attributes are dependent. It is not possible for one value to 'grow' unless the values of the other attributes grow as well. Therefore the constraint we propose can be compared with GB's projection principle if we regard each of the attributes to define a 'level of description'. Termination can then be guaranteed because derivation trees are restricted in depth by the value of the sem attribute.</Paragraph> <Paragraph position="11"> In order to define a condition to guarantee termination we need to be specific about the interpretation of a lexical entry. Following Shieber (1989) I assume that the interpretation of a set of path equations is defined in terms of directed graphs; the interpretation of a lexical entry is a set of such graphs. The 'size' of a graph simply is defined as the number of nodes the graph consists of. We require that for each graph in the interpretation of a lexical entry, the size of the subgraph at sem is strictly larger than each of the sizes of the sem parts of the (subgraphs corresponding to the) elements of the subcat list. I also require that for each graph in the interpretation of a lexical entry, the size of phon is strictly larger than each of the sizes of the (subgraphs corresponding to the) phon parts of the elements of the subcat list.</Paragraph> <Paragraph position="12"> Summarizing, all lexical entries should satisfy the following condition: Termination condition. For each interpretation L of a lexical entry, if E is an element of L's subcat list (i.e. (L synsem sc r* f) ≐ E), then: size[(E phon)] < size[(L phon)] and size[(E synsem sem)] < size[(L synsem sem)]. The most straightforward way to satisfy this condition is for an element of a subcat list to share its semantics with a proper part of the semantics of the lexical entry, and for the entry to include the elements of its subcat list in its word-order domain.</Paragraph> <Paragraph position="13"> Possible inputs. In order to prove termination of the algorithm we need to make some assumptions about possible inputs. For a discussion cf. van Noord (1990b) and also Thompson (1991, this volume). The input to parsing and generation is specified as the goal ?- sign(X0), φ.</Paragraph> <Paragraph position="14"> where φ restricts the variable X0. We require that for each interpretation of X0 there is, for parsing, a maximum of size[(X0 phon)], and, for generation, a maximum of size[(X0 synsem sem)].</Paragraph> <Paragraph position="15"> If the input has a maximum size for either semantics or phonology, then the uniform algorithm terminates (assuming the constraint language is decidable), because each recursive call to 'prove' will necessarily be a 'smaller' problem, and as the order on semantics and word-order domains is well-founded, there is a 'smallest' problem. As a lexical entry specifies the length of its subcat list, there is only a finite number of embeddings of the 'connect' clause possible.</Paragraph> </Section> <Section position="6" start_page="15" end_page="17" type="metho"> <SectionTitle> 4 Some examples </SectionTitle> <Paragraph position="0"> Verb raising. First I show how Reape's analysis of Dutch and German verb raising constructions can be incorporated in the current grammar (Reape, 1989; Reape, 1990a). For a linguistic discussion of verb-raising constructions the reader is referred to Reape's papers. A verb raiser such as the German verb 'versprechen' (promise) selects three arguments: a vp, an object np and a subject np. The word-order domain of the vp is unioned into the word-order domain of 'versprechen'. This is necessary because in German the arguments of the embedded vp can in fact occur to the left of the other arguments of 'versprechen', as in: es_i ihm_j jemand_k zu lesen_i versprochen_j hat_k (it him someone to read promised had, i.e. someone had promised him to read it).</Paragraph> <Paragraph position="1"> Hence, the lexical entry for the raising verb 'versprechen' is defined as in figure 3. The word-order domain of 'versprechen' simply is the sequence union of the word-order domain of its vp object, with the np object, the subject, and 'versprechen' itself. This allows any of the permutations (allowed by the LP constraints) of the np object, 'versprechen', the subject, and the elements of the domain of the vp object (which may contain signs that have been unioned in recursively).</Paragraph>
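<Paragraph> To see how sequence union licenses this scrambling, the su/3 sketch of section 2 can be run on word-level stand-ins for the signs involved (a simplification: real domains contain signs, and the LP constraints, not modelled here, filter the results):

% the domain of the vp object 'es zu lesen', unioned with the remaining
% material selected by 'versprechen':
?- su([es, zu_lesen], [ihm, jemand, versprochen], D).
D = [es, zu_lesen, ihm, jemand, versprochen] ;
D = [es, ihm, zu_lesen, jemand, versprochen] ;
...

Both lists keep their internal order, but 'es' may surface to the left of 'ihm' and 'jemand' although it belongs to the embedded vp, exactly as in the example sentence.</Paragraph>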
<Paragraph position="2"> Separable prefixes. The current framework offers an interesting account of separable prefix verbs in German and Dutch. For an overview of alternative accounts of such verbs, see Uszkoreit (1987)[chapter 4]. At first sight, such verbs may seem problematic for the current approach because their prefixes seem not to have any semantic content. However, in my analysis a separable prefix is lexically specified as part of the word-order domain of the verb. Hence a particle is not identified as an element of the subcat list. Figure 4 might be the encoding of the German verb 'anrufen' (call up). Note that this analysis conforms to the condition of the foregoing section, because the particle is not on the subcat list. The advantages of this analysis can be summarized as follows.</Paragraph> <Paragraph position="3"> Firstly, there is no need for a feature system to link verbs with the correct prefixes, as e.g. in Uszkoreit's proposal. Instead, the correspondence is directly stated in the lexical entry of the particle verb, which seems to me a very desirable result.</Paragraph> <Paragraph position="4"> Secondly, the analysis predicts that particles can 'move away' from the verb in case the verb is sequence-unioned into a larger word-order domain. This prediction is correct. The clearest examples are possibly from Dutch. In Dutch, the particle of a verb can be placed (nearly) anywhere in the verb cluster, as long as it precedes its matrix verb:

*dat jan marie piet heeft willen zien bellen op
 dat jan marie piet heeft willen zien op bellen
 dat jan marie piet heeft willen op zien bellen
 dat jan marie piet heeft op willen zien bellen
 dat jan marie piet op heeft willen zien bellen
 that john mary pete up has want see call
 (i.e. john wanted to see mary call up pete)

The fact that the particle is not allowed to follow its head word is easily explained by the (independently motivated) LP constraint that arguments of a verb precede the verb. Hence these curious facts follow immediately in our analysis (the analysis makes the same prediction for German, but because of the different order of German verbs, this prediction can not be tested).</Paragraph> <Paragraph position="5"> Thirdly, Uszkoreit argues that a theory of separable prefixes should also account for the 'systematic orthographic insecurity felt by native speakers', i.e. whether or not they should write the prefix and the verb as one word. The current approach can be seen as one such explanation: in the lexical entry for a separable prefix verb the verb and prefix are already there; on the other hand, each of the words is in a different part of the word-order domain.</Paragraph> </Section> <Section position="7" type="metho"> <SectionTitle> 5 HPSG Markers </SectionTitle> <Paragraph position="0"> In newer versions of HPSG (Pollard and Sag, 1991) a special 'marker' category is assumed for which our projection principle does not seem to work. For example, complementizers are analyzed as markers. They are not taken to be the head of a phrase, but merely 'mark' a sentence for some features. On the other hand, a special principle is assumed such that markers do in fact select for a certain type of constituent. In the present framework a simple approach would be to analyze such markers as functors, i.e. heads, that have one element in their subcat list:

[ synsem : sem : [1]
  sc : ([2] VP_FIN_1)
  dom : (<< dass >>, [2]) ]

However, the termination condition defined in the third section can not always be satisfied, because these markers usually do not have much semantic content (as in the preceding example).</Paragraph> <Paragraph position="1"> Furthermore these markers may also be phonetically empty; for example, in the HPSG-2 analysis of infinite vp's that occur independently such an empty marker is assumed.
Such an entry would presumably look as follows, where it is assumed that the empty marker constitutes no element of its own domain:</Paragraph> <Paragraph position="2">

[ synsem : sem : [1]
  sc : ([2] XP_1)
  dom : ([2]) ]

It seems, then, that analyses that rely on such marker categories can not be defined in the current framework. On the other hand, however, such markers have a very restricted distribution, and are never recursive. Therefore, a slight modification of the termination condition can be defined that takes into account such marker categories. To make this feasible we need a constraint such that markers can not apply arbitrarily. In HPSG-2 the distribution of the English complementizer 'that' is limited by the introduction of a special binary feature whose single purpose is to disallow sentences such as 'john said that that that mary loves pete'. It is possible to generalize this to disallow any marker to be repeatedly applied in some domain. The 'seed' of a lexical entry is this entry itself; the seed of a rule is the seed of the head of this rule, unless this head is a marker, in which case the seed is defined as the seed of the argument.</Paragraph> <Paragraph position="3"> In a derivation tree, no marker may be applied more than once to the same seed. This 'don't stutter' principle then subsumes the feature machinery introduced in HPSG-2, and parsing and generation terminate for the resulting system.</Paragraph> <Paragraph position="4"> Given such a system for marker categories, we need to adapt our algorithm. I assume lexical entries are divided (e.g. using some user-defined predicate) into markers and non-markers; markers are defined with the predicate marker(Sign, Name), where Name is a unique identifier. Other lexical entries are encoded as before; marktypes(L) defines L as the list of all marker identifiers. The idea simply is that markers are applied top-down, keeping track of the markers that have already been used. The revised algorithm is given in figure 5.
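Figure 5 is not reproduced in this extraction. The following Prolog sketch shows one way the revision might go, under the assumptions just listed (marker/2 and marktypes/1 as above, plus the hypothetical lex/1 of the earlier sketch for the remaining lexical entries): markers are applied top-down, and the list of marker identifiers still available for the current seed shrinks with each application.

% prove/1 initializes the markers available for the current seed.
prove(Goal) :-
    marktypes(Ms),
    prove(Goal, Ms).

% Top-down marker application: if Goal can be built by a rule whose head
% M is a marker, apply it and withdraw its identifier, so that no marker
% is applied twice to the same seed.
prove(Goal, Ms) :-
    marker(M, Name),
    select(Name, Ms, Ms1),
    rule(M, Goal, Arg),
    prove(Arg, Ms1).

% Otherwise proceed bottom-up from a non-marker lexical entry; this entry
% starts a new seed, so connect/2 (unchanged, except that it is assumed
% to use only rules with non-marker heads) calls the one-argument prove/1
% for the arguments, resetting the available markers.
prove(Goal, _) :-
    lex(Head),
    connect(Head, Goal).</Paragraph> </Section> </Paper>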