<?xml version="1.0" standalone="yes"?> <Paper uid="C90-3002"> <Title>CLG(n): Constraint Logic Grammars</Title> <Section position="2" start_page="0" end_page="8" type="metho"> <SectionTitle> 1. System Overview </SectionTitle> <Paragraph position="0"> In CLG(2) the data types defined are variables, constants, typed feature structures, and lists and sets of typed feature structures. Typed feature structures can be seen as directed graphs with labelled arcs, every node being indexed with its type name.</Paragraph> <Paragraph position="1"> The main novel feature of CLG(2), and of the other members of the CLG(n) family, is its constraint language L, a slightly constrained form of first order predicate logic, including explicit quantification. Unification remains the sole building operation, under the control of complex constraints.</Paragraph> <Paragraph position="2"> The logical symbols of the complex constraint language consist of variables, constants, the logical connectives & (conjunction), | (disjunction), ~ (negation), -> (material implication), <-> (logical equivalence), the binary predicate symbol &quot;=&quot; and non-logical function and predicate symbols. The terms of the constraint language are variables, constants and path expressions. The atomic formulae are either equational constraints, i.e. formulae of the form t1=t2 for terms t1,t2, or r(t1,t2,...) for terms ti and relation symbols r. The complex constraints of CLG(2) are the non-atomic well-formed formulae of L, defined in the usual way: for well-formed formulae (constraints) C1, C2 and variable</Paragraph> <Paragraph position="4"> are also well-formed formulae (constraints). The S in the quantified constraints is used to restrict the domain of the quantification and can be omitted. The interpretation of the constraint language L is the standard interpretation of first order predicate logic. 
In other words, we do not resort to intuitionistic or other non-standard interpretations, unlike for instance Moshier & Rounds (1987). Examples of constraints are:</Paragraph> <Paragraph position="6"> In order to facilitate the statement of constraints, a macro facility is available in all members of the CLG(n) family, which is a generalization of PATR-II templates in that it can take a list of formal parameters. In CLG(2) this facility has been extended in a fashion akin to UD (Johnson & Rosner, 1989) to include recursive user-defined relations.</Paragraph> <Paragraph position="7"> An example of such a relation is:</Paragraph> <Paragraph position="9"> In section 3 it will be shown how such definitions contribute to the statement of linguistic principles. We now turn to the components of a CLG(2) grammar.</Paragraph> <Paragraph position="10"> Global type declarations: CLG(2) relies on a strong typing scheme similar to the concept of abstract data type.</Paragraph> <Paragraph position="11"> The following is a detail of the syntactic feature hierarchy used for one type of linguistic sign in one of the grammars implemented in CLG(2):</Paragraph> <Paragraph position="13"> Other systems that require typing information include HPSG (Pollard & Sag 1987) and UCG (Moens et al. 1989).</Paragraph> <Paragraph position="14"> Type information is used in CLG(2) both to structure the grammatical information and to achieve a more efficient implementation.</Paragraph> <Paragraph position="15"> Global constraints: these encode HPSG-style linguistic principles. A principle is of the form: partial-object-specification -> constraints. For instance, HPSG's Head Feature Principle could be expressed as:</Paragraph> <Paragraph position="17"> Partial descriptions of lexical signs. 
Lexical and phrasal descriptions both have the same format, consisting of a pair <DAG,CS> whose first element is a DAG specified by a set of equations and whose second element is a set of complex constraints. Both lexical and phrasal constraints have a number of alternative shorthand formats to suit user requirements.</Paragraph> <Paragraph position="18"> Partial descriptions of phrasal signs: these are the CLG(2) rules. A number of different equivalent rule formats are supported. For instance:</Paragraph> <Paragraph position="20"> are equivalent formulations.</Paragraph> </Section> <Section position="3" start_page="8" end_page="8" type="metho"> <SectionTitle> 2. Formal Semantics </SectionTitle> <Paragraph position="0"> We define in this section a denotational semantics for CLG(2) grammars in a similar way to what was done for CLG(0) grammars (Damas & Varile, 1989). For reasons of space, we present a slightly simplified version.</Paragraph> <Paragraph position="1"> Starting from primitive sets Labels and Atoms of attribute names and atomic values we would like to define the domain of objects and the domain of values as follows</Paragraph> <Paragraph position="3"> Note that to simplify the semantics we are assuming that every label can have as value a list of sub-objects.</Paragraph> <Paragraph position="4"> Given a set Vars of variable symbols and a set Preds of predicate symbols we define the following syntactic domains:</Paragraph> <Paragraph position="6"> where we assume that every path which occurs in a definition is associated with a formal argument.</Paragraph> <Paragraph position="8"> The Constraints component in a Grammar denotes the conjunction of all principles with the disjunction of the descriptions of all lexical and phrasal signs. 
The Path* component specifies which paths are involved in the dominance relation for the grammar.</Paragraph> <Paragraph position="9"> Given an object o and a path p we will extend o to paths by</Paragraph> <Paragraph position="11"> if o(p) has only one element and that element is not an atom, error otherwise. In what follows we will omit the handling of error values, which should produce error if any partial result leads to error.</Paragraph> <Paragraph position="12"> To define our semantic functions we still need the following domains:</Paragraph> <Paragraph position="14"> Now we define the following semantic functions: V[ e+e' ] r o = concatenate(V[ e ] r o, V[ e' ] r o); V[ e:e' ] r o = cons(V[ e ] r o, V[ e' ] r o). C, which assigns a truth value to every constraint, is defined by</Paragraph> <Paragraph position="16"> D is defined by taking, for each sequence of definitions pi(x1,...,xn) <-> Di, the least fixed point of the function H: PEnv --> PEnv defined by: H[ pi ] d (v1, ..., vn) = C[ Di ] d [vi/xi] o_nil, where o_nil is the empty object. We can now define G as follows: G[ < c, <p1, ..., pk>, Ds > ] o = T iff there is an environment r such that C[ c ] d r o = T and for every path pi such that o(pi) = <o1, ..., ol>: G[ < c, <p1, ..., pk>, <C1,...,Cn> > ] oj = T for j = 1,...,l, where d = D[ Ds ].</Paragraph> </Section> <Section position="4" start_page="8" end_page="8" type="metho"> <SectionTitle> 3. Complementizer-Trace Effects in CLG(2) </SectionTitle> <Paragraph position="0"> We will illustrate the expressive power of CLG(2) with an analysis of those phenomena traditionally known as complementizer-trace effects (Perlmutter, 1971; Chomsky & Lasnik, 1977). It is inspired by the HPSG framework (Pollard & Sag, 1987), but it departs from it in some respects.</Paragraph> <Paragraph position="1"> The most recent account of these phenomena within HPSG is that of Pollard (1985). 
There, he aims at showing that most of the GPSG insights (Gazdar, Klein, Pullum & Sag, 1985) can be preserved within a framework which does not express subcategorization directly in PS rules, and which does not make use of meta-rules.</Paragraph> <Paragraph position="2"> In our revision of the analysis we will follow Pollard (1989) in separating subject selection from complement selection. Our grammar incorporates, however, some radical differences, most of them concerned with the typing of feature structures, and the typology of lexical and phrasal categories it induces.</Paragraph> <Paragraph position="3"> In essence, our approach incorporates a much more articulated theory of minor categories, which assigns them a more privileged role than is generally assumed in PSG frameworks. We assume, then, that minor categories have a certain head-like status and, consequently, selectional properties (Chomsky, 1986; Warner, 1989).</Paragraph> <Paragraph position="4"> Thus, the top of our hierarchy of signs is as follows: The main difference between major and minor signs is that the latter contain information of type syntactic category and semantics only, while major signs may also contain binding information.</Paragraph> <Paragraph position="5"> Now consider the following schematic lexical entries for the English complementizers that and for, which are minor signs of type clitic:</Paragraph> <Paragraph position="7"> Where subj and compl abbreviate subject complements.</Paragraph> <Paragraph position="8"> And the schematic entries for the following verbs:</Paragraph> <Paragraph position="10"> Thus, subject extraction from a clausal complement of think or want is impossible if the complement has a complementizer, because it violates its selectional restrictions.</Paragraph> <Paragraph position="11"> We predict then that, in English, subject extraction is only possible with bridge verbs (e.g. 
think), and that it is always impossible with non-bridge verbs (e.g. want1, complain), while complement extraction is always possible (e.g. object extraction in object control verbs like want2). Note that the different syntactic properties of verbal complements (clauses, VPs) seem to have a direct semantic correlate in the property/proposition distinction which has been advocated in some recent analyses of control, e.g. Sag & Pollard (1988).</Paragraph> <Paragraph position="12"> The CLG(2) grammar which accounts for the above facts contains four rules and four principles. Two rules are the well known Complementation and Topicalization rules of standard HPSG.</Paragraph> <Paragraph position="13"> The other two are original: one, the Clitic Placement rule, licenses those structures in which a minor head is attached to a major head; it requires that the selectional restrictions of the minor head be satisfied and marks the mother node with whatever features come from the minor head (e.g., comp=that, when the complementizer is attached to a clause). The other rule is like Topicalization, but for subject binding. As for the principles, we have a Head Feature Principle, a Complementation Principle, a Binding Principle, and a Control Principle.</Paragraph> <Paragraph position="14"> As an example, we provide the CLG version of the Complementation Principle, which given its formulation has the direct consequence of performing gap introduction when some complement is not found:</Paragraph> <Section position="1" start_page="8" end_page="8" type="sub_section"> <SectionTitle> Complementation Principle </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> The slash is computed by merge by concatenating the slashes of each of the complement daughters with those elements of the compls list for which there is no matching daughter.</Paragraph> </Section> </Section> <Section position="5" start_page="8" end_page="8" type="metho"> <SectionTitle> 4. 
Implementation </SectionTitle> <Paragraph position="0"> The CLG(2) parser has been implemented in Prolog.</Paragraph> <Paragraph position="1"> A CLG(2) grammar is compiled by successively compiling type declarations, partial descriptions of phrasal signs, principles, user-defined relations and lexical information.</Paragraph> <Paragraph position="2"> This implementation uses a simple bottom-up parser with backtracking and handles constraints using an extension of the ideas described in Damas & Varile (1989).</Paragraph> <Paragraph position="3"> The parser is implemented as a predicate of the form derive(Tree,[Head|Input],Output) :- complete(Head,Input,Output,Tree). complete(Tree,Input,Input,Tree).</Paragraph> <Paragraph position="4"> complete(FirstDaughter,Input,Output,Tree) :-</Paragraph> <Paragraph position="5"> apply_rules(FirstDaughter,Input,Output1,Tree1), complete(Tree1,Output1,Output,Tree).</Paragraph> <Paragraph position="6"> where the apply_rules predicate is produced by compiling each grammar rule into a clause for this predicate, which attempts to apply the rule. These clauses also apply all the principles, which are partially evaluated at compile time. This technique usually results in verifying only those principles which are relevant for the particular rule. In the actual implementation the amount of backtracking involved is reduced by introducing other clauses for the complete predicate which handle rules known to have a fixed number of daughters.</Paragraph> <Paragraph position="7"> Constraints are handled in a way similar to the one described in Damas & Varile (1989) by adding two extra arguments to each of the predicates mentioned above.</Paragraph> <Paragraph position="8"> These arguments contain a list of constraints at clause entry and exit, respectively. From time to time a rewriting process is applied to the list of constraints, which may result in failure or in a new set of simpler constraints. 
Note that this rewriting process may also cause variable instantiation as a side effect.</Paragraph> <Paragraph position="9"> Constraints imposed by principles are implemented by a call to a predicate addconstraint which first attempts to decide if the constraint holds or not. If not enough information is available at that time for that purpose, the constraint is added to the list of unresolved constraints for later re-evaluation.</Paragraph> <Paragraph position="10"> However, the recursively defined constraints (e.g. the user-defined relations) have a special treatment.</Paragraph> <Paragraph position="11"> Backtracking is allowed in their application, but some restrictions are imposed, namely they are applied only when sufficiently instantiated to ensure that they finitely fail. In particular, for each recursively defined constraint, we must specify which are the minimum conditions of application (for instance which arguments may not be undefined).</Paragraph> <Paragraph position="12"> Constraints on complex objects require some care in their interpretation and implementation. Consider, for instance, an object description such as [syn.local.subj <NP1 > syn.local.compls <v[compls < >,comp for] | v[subj < NP1 >,compls < >,inf] > ]].</Paragraph> <Paragraph position="13"> which is represented internally as a complex term containing only variables plus a constraint on those variables. 
Note that if a variable that refers to an atomic value is involved in a simple equality constraint (or a conjunction of such constraints), that constraint can be evaluated at compile time.</Paragraph> <Paragraph position="14"> For the above example we could have (here in the user language, for simplicity) object(Spec,Const), and if in Spec we identify</Paragraph> <Paragraph position="16"/> <Section position="1" start_page="8" end_page="8" type="sub_section"> <SectionTitle> Final Remarks </SectionTitle> <Paragraph position="0"> It is clear that the highly structured nature of CLG(2) grammatical descriptions has a number of advantages with respect to more classical approaches, not least among them the possibility of expressing powerful generalizations about languages in a highly structured way while maintaining the necessary capability for expressing exceptions.</Paragraph> <Paragraph position="1"> A drawback of this approach, however, is that while it is possible to give a clean and simple formal semantics to each individual component, the formalization of the complete grammatical system is certainly more complex than desirable and, as a consequence, achieving an efficient implementation is unnecessarily complicated.</Paragraph> <Paragraph position="2"> We are currently investigating the possibility of making the type theory underlying the Global Declarations a first class citizen, namely the unifying formal framework for all the grammar components (at least for all non-lexical information).</Paragraph> <Paragraph position="3"> By this we mean that a type declaration system in the form of an algebra of sorts can cover essentially the expressive requirements of our current formalism while providing a simple and uniform formal framework for the whole.</Paragraph> </Section> </Section> <Section position="6" start_page="8" end_page="8" type="metho"> <SectionTitle> Acknowledgement </SectionTitle> <Paragraph position="0"> This work has been carried out within the framework of the 
Eurotra R&D programme for machine translation financed by the European Communities. We are especially grateful to a number of colleagues for their useful comments on earlier versions of CLG(n).</Paragraph> </Section> </Paper>