<?xml version="1.0" standalone="yes"?> <Paper uid="C86-1045"> <Title>Categorial Unification Grammars*</Title> <Section position="3" start_page="0" end_page="191" type="intro"> <SectionTitle> 0. Introduction </SectionTitle> <Paragraph position="0"> The work on merging strategies from unification grammars and categorial grammars has its origins in several research efforts that have been pursued in parallel. One of them is the grammar development on the PATR system (Shieber et al., 1983; Shieber, 1984) at SRI. For quite a while now I have been using the excellent facilities of PATR for the design and testing of experimental CUGs. Such grammars currently run on two PATR implementations: Stuart Shieber's Zetalisp version on the Symbolics 3600 and Lauri Karttunen's Interlisp-D version on the XEROX 1109. The work on CUGs has influenced our efforts to develop a larger PATR grammar, and will do so even more in the future.</Paragraph> <Paragraph position="1"> On the theoretical side, this work is part of ongoing research on such topics as word order variation, modification, and German syntax within projects at SRI and CSLI (Stanford University).</Paragraph> <Paragraph position="2"> The structure of the paper reflects the diverse nature of the enterprise. In the first section, I will introduce the basic notions of CUGs and demonstrate them through examples in PATR notation. The second section discusses the motivation for this work and some of its theoretical implications. The third section sketches a linguistically motivated CUG framework with a strong lexical syntax that accommodates word order variation.</Paragraph> <Paragraph position="3"> The paper concludes with a brief discussion of possible CUG approaches to long-distance dependencies.</Paragraph> <Paragraph position="4"> 1. Basic Notions of Categorial Unification</Paragraph> <Section position="1" start_page="0" end_page="191" type="sub_section"> <SectionTitle> Grammars 1.1. 
Unification Grammars and Categorial Grammars </SectionTitle> <Paragraph position="0"> Both terms, unification grammar (UG) and categorial grammar (CG), stand for whole families of related grammar formalisms whose basic notions are widely known.1 Yet, for the characterization of the class of formalisms I want to discuss, it will be useful to review the most central concepts of both UG and CG.</Paragraph> <Paragraph position="1"> Unification grammar formalisms employ complex feature structures as their syntactic representations.</Paragraph> <Paragraph position="2"> These structures encode partial information about constituents. Either term or graph unification is utilized as the main operation for checking, propagating, and merging the information in these complex representations. Most unification grammars also use the complex feature structures for the linking of syntactic and semantic information.</Paragraph> <Paragraph position="3"> In traditional categorial grammars, all information about possible syntactic combinations of constituents is encoded in their categories. Those grammars allow only binary combinations. One of the two combined constituents, the functor, encodes the combination function; the other constituent serves as the argument to this function. Instead of phrase structure rules, the grammar contains one or, in some formalisms, two combination rules that combine a functor and an argument by applying the function encoded in the functor to the argument constituent. Most categorial grammars only combine constituents whose terminal strings concatenate in the input string, but this need not be so. In most categorial grammar formalisms, it is assumed that the syntactic functor-argument structure mirrors the corresponding compositional semantics.</Paragraph> <Paragraph position="4"> There are usually two types of grammatical categories in a categorial grammar, basic and derived ones. 
Basic categories are just category symbols; derived categories are functions from one (derived or basic) category to another. A derived category that encodes a function from category A to category B might be written B/A if the functor combines with an argument to its right or B\A if it expects the argument to its left. Thus, if we assume just two basic categories, N and S, then N/S, S/N, N\S, S\N, (S\N)/N, (N/S)\(S\(N/N)), etc. are also categories. Not all of these categories will ever occur in the derivation of sentences. The set of actually occurring categories depends on the lexical categories of the language.</Paragraph> <Paragraph position="5"> Assume the following simple sample grammar: (2) Basic categories: N, S; lexical categories: N (Paul, Peter) It should be clear from my brief description that the defining characteristics of unification grammar have nothing to do with the ones of categorial grammar. We will see that the properties of both grammar types actually complement each other quite well.</Paragraph> <Paragraph position="6"> 1.2. A Sample CUG in PATR Notation Since the first categorial unification grammars were written in the PATR formalism and tested on the PATR systems implemented at SRI, and since PATR is especially well suited for the emulation of other grammar formalisms, I will use its notation.</Paragraph> <Paragraph position="7"> The representations in PATR are directed acyclic graphs (DAGs)2. Rules have two parts, a head and a body. The head is a context-free rewrite rule and the body is a DAG. 
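The unification operation that merges such partial descriptions can be sketched minimally over nested Python dictionaries. This is an illustrative toy, not the PATR implementation: real DAG unification uses structure sharing (reentrancy) rather than copying, and all feature names below are invented for the example.

```python
# Toy unification of feature structures encoded as nested dicts.
# Illustrative only: real PATR DAGs support reentrancy; this copies.

def unify(a, b):
    """Merge two partial feature structures; None signals failure."""
    if a is None or b is None:
        return None
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None          # feature clash
                result[feature] = merged
            else:
                result[feature] = value  # partial information just adds up
        return result
    return a if a == b else None         # atomic values must be identical

# Two partial descriptions of one constituent merge monotonically:
print(unify({"cat": "N", "agr": {"num": "sg"}},
            {"agr": {"num": "sg", "per": "3"}}))
# {'cat': 'N', 'agr': {'num': 'sg', 'per': '3'}}

# Conflicting atomic values make unification fail:
print(unify({"agr": {"num": "sg"}}, {"agr": {"num": "pl"}}))  # None
```

The key property on display is monotonicity: unification never discards information, it only adds to it or fails.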
Here is an example, a simple rule that forms a sentence by combining a noun phrase with a verb phrase.</Paragraph> <Paragraph position="8"> (4) head X0 → X1, X2 body in unification notation</Paragraph> <Paragraph position="10"> The rule states that two constituents X1 and X2 can combine to form a constituent X0 if the terminal string covered by X1 immediately precedes the terminal string of X2 and if the DAGs of X0, X1, and X2 unify with the X0, X1, and X2 subgraphs of the rule body, respectively.</Paragraph> <Paragraph position="11"> I will now show the most straightforward encoding of a categorial grammar in this notation. There are two types of constituent graphs. Constituent graphs for basic categories are of the following form: combine the constituents Peter and likes Paul, the result is a finite sentence.</Paragraph> <Paragraph position="12"> However, if the same rule is applied to the identical constituents likes Paul and likes Paul, again a finite sentence is obtained. This is so because the graph for likes Paul actually unifies with the value of arg in the same graph. This can be easily remedied by modifying the graph for the VP slightly. By stipulating that the argument must not have an unfilled argument position, one can rule out derived categories as subject arguments for the VP: 1.3. Extensions to the Basic Formalism In this subsection I want to discuss very briefly a few extensions of the basic model that make it more suitable for the encoding of natural-language grammars. The first one is the sorting of functors according to their own syntactic category. This move might alternatively be described as defining the type of a constituent by both a set of syntactic (and semantic) attributes and a function from categories to categories. This function is also expressed as the value of an attribute. For a basic category the value of the function attribute is NIL. 
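Both functional application and the saturation stipulation just described can be sketched as follows. The encoding (cat, function, arg, value attributes) loosely follows the paper's scheme, but the representation is a deliberate simplification, not PATR.

```python
# Sketch of functional application over category structures, including
# the stipulation that a subject argument must be saturated.
# Attribute names loosely follow the paper; details are illustrative.

FAIL = object()  # distinct failure marker, since None is a legal value

def unify(a, b):
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for k, v in b.items():
            if k in out:
                m = unify(out[k], v)
                if m is FAIL:
                    return FAIL
                out[k] = m
            else:
                out[k] = v
        return out
    return a if a == b else FAIL

def apply_functor(functor, argument):
    """Return the functor's value category if the argument fits its
    argument description; None otherwise."""
    fn = functor.get("function")
    if fn is None:
        return None                      # basic categories do not apply
    if unify(fn["arg"], argument) is FAIL:
        return None
    return fn["value"]

N = {"cat": "N", "function": None}
S = {"cat": "S", "function": None}
# VP = S\N; demanding "function": None of the argument rules out
# derived categories such as the VP itself as subject arguments:
VP = {"cat": "V", "function": {"dir": "Left", "arg": N, "value": S}}

print(apply_functor(VP, N))    # succeeds: Peter + likes Paul
print(apply_functor(VP, VP))   # None: likes Paul + likes Paul is blocked
```

Because the VP's own graph contains a filled `function` attribute, it cannot unify with an argument description that demands `function: None`, which is exactly the remedy described above.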
The following graph is a simplified example of a functor category (prenominal adjective in a language with case and number agreement within the NP).</Paragraph> <Paragraph position="14"> The combination rules need to be changed accordingly. This is the modified rule of forward functional application.</Paragraph> <Paragraph position="15"> value → functor argument</Paragraph> <Paragraph position="17"> <functor function dir> = Right.</Paragraph> <Paragraph position="18"> In a traditional categorial grammar, a derived category is exhaustively described by the argument and value categories. But often, syntacticians want to make more fine-grained distinctions. An example is VP modification. In a traditional categorial grammar, two different VP modifiers, let's say an adverb and an adverbial clause, would receive the same translation. (12) Peter called him angrily N (S\N)/N N (S\N)\(S\N) (13) Peter called him at work</Paragraph> <Paragraph position="20"> But what should be the category for very? If it receives the category ((S\N)\(S\N))/((S\N)\(S\N)) to allow the derivation of (14), the ungrammatical sentence (15) is also permitted.</Paragraph> <Paragraph position="21"> If functor categories are permitted to carry features of their own that are not necessarily bound to any features of their argument and value categories, this problem disappears. Adverbs and adverbial clauses could receive different features even if their categories encode the same combination function.</Paragraph> <Paragraph position="22"> Another solution to the problem involves the encoding of the difference in the value part of the functor. Yet this solution is not only unintuitive but also contradicts a linguistic generalization. It is unintuitive because there is no difference in the distribution of the resulting VPs. The only difference holds between the modifiers themselves. 
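The first solution, giving functors features of their own, can be illustrated with a small sketch. The feature name `lex` and all category labels below are invented for the example; they are not taken from the paper.

```python
# Sketch: two VP modifiers share one combination function but carry
# distinct features of their own, so 'very' can select adverbs only.
# The feature 'lex' and all values are invented for illustration.

def applies(functor, argument):
    """Check that the argument matches every feature the functor
    demands of its argument, including the functor-specific ones."""
    return all(argument.get(k) == v for k, v in functor["arg"].items())

# Same combination function (VP modifier), different own features:
angrily = {"combines": "VPmod", "lex": True}    # adverb
at_work = {"combines": "VPmod", "lex": False}   # adverbial phrase

# 'very' demands a *lexical* VP modifier as its argument:
very = {"arg": {"combines": "VPmod", "lex": True}}

print(applies(very, angrily))   # True:  'very angrily'
print(applies(very, at_work))   # False: *'very at work'
```

Since `lex` lives on the modifier itself rather than on its argument or value category, the two modifiers remain interchangeable as VP modifiers while still being distinguishable to a selector like very.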
The generalization that is violated by the encoding of the difference in the value subgraphs is the endocentricity of the VP. The modified VP shares all syntactic features with its head, the lower VP. Yet the feature that indicates the difference between adverbs and adverbial phrases could not be in both the argument and the value parts of the functor; otherwise iterations of the two types of modifiers as they occur in the following pair of sentences would be ruled out.</Paragraph> <Paragraph position="23"> (16a) Peter called him very angrily at work.</Paragraph> <Paragraph position="24"> (16b) Peter called him at work very angrily.</Paragraph> <Paragraph position="25"> Another augmentation is based on the PATR strategy for linking syntax and semantics. Most grammars written in PATR use the constituent graphs also for encoding semantic information. Every constituent has an attribute called trans or semantics. The value of this attribute contains minimally the internal semantic function-argument structure of the constituent, but may also encode additional semantic information. The separate encoding of the semantics allows for a compositional semantics even in constructions in which syntactic and semantic structure diverge, as in certain raising constructions. The following graph for a fictitious prenominal adjective that was introduced earlier contains translation attributes for the functor, the argument and the value. The meaning of the adjective is indicated by the atom Red.</Paragraph> <Paragraph position="26"> The graphs that are used in the highly simplified examples seem to exhibit an excessive degree of complexity and redundancy. However, the lexical approach to syntax is built on the assumption that the lexicon is structured. 
To create a lexicon that is structured according to linguistic generalizations, we introduced lexical templates early on in the development of PATR.</Paragraph> <Paragraph position="27"> Templates are graphs that contain structure shared by a class of lexical entries. Lexical graphs can be partially or fully defined in terms of templates, which themselves can be defined in terms of templates. If a template name appears in the definition of some graph, the graph is simply unified with the graph denoted by the template.</Paragraph> <Paragraph position="28"> The next augmentation is already built into the formalism. Categorial grammarians have long recognized the limitations of functional application as the sole mode of combining constituents. One of the obvious extensions to classical categorial grammar was the utilization of functional composition as a further combination mode. A good example of a categorial grammar that employs both functional application and functional composition is Steedman (1985). Forward functional composition permits the following combination of categories: A/B + B/C → A/C.</Paragraph> <Paragraph position="30"> The resulting category inherits the argument place for C from the argument B/C.</Paragraph> <Paragraph position="31"> Neither Steedman's nor any other CG I am aware of permits functional composition in its full generality. In order to prevent overgeneration, functional composition as well as other combination modes that are discussed by Steedman are restricted to apply to certain categories only. This somewhat violates the spirit of a categorial grammar. Steedman's combination rules, for instance, are not universal.</Paragraph> <Paragraph position="32"> In CUG, functional composition is subsumed under functional application. 
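A toy version of this subsumption can be sketched as follows. Categories are modelled as flat records rather than DAGs, and the `compose` flag is an invented stand-in for the bindings that a CUG functor's graph would state; the representation is illustrative, not the paper's encoding.

```python
# One combination rule covers application and composition; whether an
# unsaturated argument is tolerated is stated on the functor itself,
# as in the lexical entry for 'and'.  Representation is illustrative.

def combine(functor, argument):
    if not functor["args"] or functor["args"][0] != argument["value"]:
        return None                          # categories do not fit
    if argument["args"] and not functor["compose"]:
        return None                          # functor forbids composition
    # the value inherits the argument's unfilled argument places
    return {"value": functor["value"],
            "args": argument["args"] + functor["args"][1:],
            "compose": functor["compose"]}

A_over_B   = {"value": "A", "args": ["B"], "compose": True}
strict_A_B = {"value": "A", "args": ["B"], "compose": False}
B_plain    = {"value": "B", "args": [], "compose": False}
B_over_C   = {"value": "B", "args": ["C"], "compose": False}

print(combine(A_over_B, B_plain))     # application: value A, saturated
print(combine(A_over_B, B_over_C))    # composition: value A, still needs C
print(combine(strict_A_B, B_over_C))  # None: composition not licensed
```

A single rule thus yields plain application when the argument is saturated and composition when it is not, with the functor's own entry deciding which of the two is licensed.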
It is the functor category that determines whether simple functional application, or functional composition, or either one may take place.</Paragraph> <Paragraph position="33"> Conjunction is a good case for demonstrating this versatility.</Paragraph> <Paragraph position="34"> Consider the following sentences:3 (22a) Peter and Paul like bananas.</Paragraph> <Paragraph position="35"> (22b) Peter likes bananas and Paul likes oranges. (22c) Peter likes and buys bananas.</Paragraph> <Paragraph position="36"> The conjunction and may combine two simple argument categories (22a), two functors with one unfilled argument position (22b), or two functors with more than one unfilled argument position (22c). If the conjuncts have unfilled argument positions, the conjoined phrase needs to inherit them through functional composition. The simplified lexical graph for and is given under (23). In order to avoid a thicket of crossing edges, I have expressed some of the relevant bindings by indices.</Paragraph> <Paragraph position="38"> The most appealing feature of this way of utilizing functional composition is that no additional combinators are required. No restrictions on such a rule need to be formulated. It is only the lexical entries for functors that either demand, permit, or forbid functional composition.</Paragraph> <Paragraph position="39"> Extensions to the formalism that I have experimented with but cannot discuss within the frame of this paper are the use of multiple stacks for leftward and rightward arguments and the DCG-like encoding of the ordering positions in the graphs. In Sections 3 and 4, I will discuss further extensions of the formalism and specific linguistic analyses. The following section contains a summary of the motivations for working on and with CUG and the main objectives of this work.</Paragraph> </Section> </Section> </Paper>