<?xml version="1.0" standalone="yes"?>
<Paper uid="P01-1033">
  <Title>Towards Abstract Categorial Grammars</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Relating ACGs to other grammatical formalisms
</SectionTitle>
    <Paragraph position="0"> In this section, we illustrate the expressive power of ACGs by showing how some other families of formal grammars may be subsumed. It must be stressed that we are not only interested in a weak form of correspondence, where only the generated languages are equivalent, but in a strong form of correspondence, where the grammatical structures are preserved.</Paragraph>
    <Paragraph position="1"> First of all, we must explain how ACGs may manipulate strings of symbols. In other words, we must show how to encode strings as linear λ-terms. The solution is well known: it suffices to represent strings of symbols as compositions of functions. Consider an arbitrary atomic type ∗, and define the type 'string' to be (∗ ⊸ ∗). Then, a string such as 'abbac' may be represented by the linear λ-term λx.a(b(b(a(c x)))), where the atomic strings 'a', 'b', and 'c' are declared to be constants of type (∗ ⊸ ∗). In this setting, the empty word (ε) is represented by the identity function (λx.x) and concatenation (+) is defined to be functional composition (λf.λg.λx.f (g x)), which is indeed an associative operator that admits the identity function as a unit.</Paragraph>
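    <Paragraph position="2"> The encoding of strings as compositions of functions can be sketched in Python. This is only an illustration under stated assumptions: ordinary Python strings stand in for the atomic type, Python does not enforce linearity, and the names atom, EMPTY, concat, and decode are hypothetical helpers that do not come from the paper.

```python
# Strings encoded as functions; concatenation is function composition.
# All names (atom, EMPTY, concat, decode) are illustrative assumptions.

def atom(symbol):
    """A constant of type 'string': prepends one symbol to its argument."""
    return lambda rest: symbol + rest

# The empty word is the identity function.
EMPTY = lambda x: x

def concat(f, g):
    """Concatenation as composition: (f + g)(x) = f(g(x))."""
    return lambda x: f(g(x))

def decode(encoded):
    """Read an encoded string back by applying it to the empty Python string."""
    return encoded("")

a, b, c = atom("a"), atom("b"), atom("c")

# 'abbac' written as lambda x. a(b(b(a(c x))))
abbac = lambda x: a(b(b(a(c(x)))))

print(decode(abbac))                     # abbac
print(decode(concat(concat(a, b), c)))   # abc
print(decode(concat(a, concat(b, c))))   # abc
print(decode(concat(EMPTY, a)))          # a
```

The two groupings of concat yield the same string, and EMPTY acts as a unit, which mirrors the associativity and identity properties stated above.</Paragraph>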
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 Context-free grammars
</SectionTitle>
      <Paragraph position="0"> Let G = &lt;T,N,P,S&gt; be a context-free grammar, where T is the set of terminal symbols, N is the set of non-terminal symbols, P is the set of rules, and S is the start symbol. We write L(G) for the language generated by G. We show how to construct an ACG GG = &lt;S1,S2,L,S&gt; corresponding to G.</Paragraph>
      <Paragraph position="1"> The abstract vocabulary S1 = &lt;A1,C1,t1&gt; is defined as follows:  1. The set of atomic types A1 is defined to be the set of non-terminal symbols N.</Paragraph>
      <Paragraph position="2"> 2. The set of constants C1 is a set of symbols in one-to-one correspondence with the set of rules P. 3. Let c ∈ C1 and let 'X → ω' be the rule corresponding to c. t1 is defined to be the function that assigns the type [[ω]]X to c, where [[·]]X obeys the following inductive definition: (a) [[ε]]X = X; (b) [[Yω]]X = (Y ⊸ [[ω]]X), for Y ∈ N; (c) [[aω]]X = [[ω]]X, for a ∈ T.</Paragraph>
      <Paragraph position="3"> The definition of the object vocabulary S2 = &lt;A2,C2,t2&gt; is as follows: 1. A2 is defined to be {∗}.</Paragraph>
      <Paragraph position="4"> 2. The set of constants C2 is defined to be the set of terminal symbols T.</Paragraph>
      <Paragraph position="5"> 3. t2 is defined to be the function that assigns the type 'string' to each c ∈ C2.</Paragraph>
      <Paragraph position="6"> It remains to define the lexicon L = &lt;F,G&gt;: 1. F is defined to be the function that interprets each atomic type a ∈ A1 as the type 'string'. 2. Let c ∈ C1 and let 'X → ω' be the rule corresponding to c. G is defined to be the function that interprets c as λx1. ... λxn.|ω|, where x1 ... xn is the sequence of λ-variables occurring in |ω|, and |·| is inductively defined as follows: (a) |ε| = λx.x; (b) |Yω| = y + |ω|, for Y ∈ N, and where y is a fresh λ-variable; (c) |aω| = a + |ω|, for a ∈ T.</Paragraph>
      <Paragraph position="7">  It is then easy to prove that GG is such that: 1. the abstract language A(GG) is isomorphic to the set of parse-trees of G.</Paragraph>
      <Paragraph position="8"> 2. the language generated by G coincides with the object language of GG, i.e., O(GG) = L(G).</Paragraph>
      <Paragraph position="9"> For instance, consider the CFG whose production rules are the following:</Paragraph>
      <Paragraph position="11"> which generates the language a^n b^n. The corresponding ACG has the following abstract language, object language, and lexicon:</Paragraph>
      <Paragraph position="13"/>
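      <Paragraph position="14"> The construction of this subsection can be sketched in Python. The sketch rests on several assumptions that are not taken from the paper: the two rules below are one standard grammar for a^n b^n (the rules of the example itself are missing from this copy), ordinary Python strings replace the functional encoding of strings, linear types are written as nested tuples with the marker '-o', and the names RULES, abstract_type, interpret, and realize are hypothetical.

```python
# Hypothetical CFG for a^n b^n (an assumption, not the paper's own listing):
#   r_eps: S derives the empty word;  r_rec: S derives a S b.
NONTERMINALS = {"S"}
RULES = {
    "r_eps": ("S", []),
    "r_rec": ("S", ["a", "S", "b"]),
}

def abstract_type(lhs, rhs):
    """Type of a rule constant: one linear argument per non-terminal on the
    right-hand side, terminals skipped, result type lhs."""
    if not rhs:
        return lhs                                       # clause (a)
    head, tail = rhs[0], rhs[1:]
    if head in NONTERMINALS:
        return (head, "-o", abstract_type(lhs, tail))    # clause (b)
    return abstract_type(lhs, tail)                      # clause (c)

def interpret(rhs):
    """Lexicon image of a rule constant: a function that takes one string per
    non-terminal (left to right) and splices them between the terminals."""
    def image(*args):
        pending = list(args)
        out = ""
        for sym in rhs:
            out += pending.pop(0) if sym in NONTERMINALS else sym
        return out
    return image

def realize(term):
    """Map an abstract term (rule name, list of subterms) to its object string."""
    name, children = term
    _, rhs = RULES[name]
    return interpret(rhs)(*[realize(child) for child in children])

# The abstract term r_rec (r_rec r_eps), read as a parse tree.
tree = ("r_rec", [("r_rec", [("r_eps", [])])])

print(abstract_type(*RULES["r_eps"]))   # S
print(abstract_type(*RULES["r_rec"]))   # ('S', '-o', 'S')
print(realize(tree))                    # aabb
```

Abstract terms play the role of parse trees, and realize computes their image in the object language, which is the sense in which the grammatical structure is preserved.</Paragraph>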
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 Regular grammars and rational
transducers
</SectionTitle>
      <Paragraph position="0"> Regular grammars being particular cases of context-free grammars, they may be handled by the same construction. The resulting ACGs (which we will call &quot;regular ACGs&quot; for the purpose of the discussion) may be seen as finite state automata. The abstract language of a regular ACG then corresponds to the set of accepting sequences of transitions of the corresponding automaton, and its object language to the accepted language.</Paragraph>
      <Paragraph position="1"> More interestingly, rational transducers may also be accommodated. Indeed, two regular ACGs that share the same abstract language correspond to a regular language homomorphism composed with a regular language inverse homomorphism.</Paragraph>
      <Paragraph position="2"> Now, by Nivat's theorem (Nivat, 1968), any rational transducer may be represented as such a bimorphism.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 Tree adjoining grammars
</SectionTitle>
      <Paragraph position="0"> The construction that handles the tree adjoining grammars of Joshi (Joshi and Schabes, 1997) may be seen as a generalization of the construction that we have described for context-free grammars. Nevertheless, it is somewhat more involved. For instance, it is necessary to triplicate the non-terminal symbols in order to distinguish the initial trees from the auxiliary trees.</Paragraph>
      <Paragraph position="1"> We do not have enough room in this paper to give the details of the construction. We will rather give an example. Consider the TAG with the following initial tree and auxiliary tree:  One of the keystones in the above translation is to represent an adjunction node A as a functional parameter of type A″ ⊸ A′. Abrusci et al. (1999) use a similar idea in their translation of the TAGs into non-commutative linear logic.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Beyond the multiplicative fragment
</SectionTitle>
    <Paragraph position="0"> The linear λ-calculus on which we have based our definition of an ACG may be seen as a rudimentary functional programming language. The results in Section 4 indicate that, in theory, this rudimentary language is powerful enough. Nevertheless, in practice, it would be useful to increase the expressive power of the multiplicative kernel defined in Section 2 by providing features such as records, enumerated types, conditional expressions, etc.</Paragraph>
    <Paragraph position="1"> From a methodological point of view, there is a systematic way of considering such extensions.</Paragraph>
    <Paragraph position="2"> It consists of enriching the type system of the formalism with new logical connectives. Indeed, each new logical connective may be interpreted, through the Curry-Howard isomorphism, as a new type constructor. Nonetheless, the possible additional connectives must satisfy the following requirements: 1. they must be provided with introduction and elimination rules that satisfy Prawitz's inversion principle (Prawitz, 1965) and the resulting system must be strongly normalizable; 2. the resulting term language (or at least an interesting fragment of it) must have a decidable matching problem.</Paragraph>
    <Paragraph position="3"> The first requirement ensures that the new types come with appropriate data constructors and discriminators, and that the associated evaluation rule terminates. This is mandatory for the applicative paradigm of Section 3. The second requirement ensures that the deductive paradigm (and consequently the transductive paradigm) may be fully automated.</Paragraph>
    <Paragraph position="4"> The other connectives of linear logic are natural candidates for extending the formalism. In particular, they all satisfy the first requirement. On the other hand, whether they satisfy the second requirement is, in most cases, an open problem.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 Additives
</SectionTitle>
      <Paragraph position="0"> The additive connectives of linear logic '&amp;' and '⊕' correspond respectively to the cartesian product and the disjoint union. The cartesian product allows records to be defined. The disjoint union, together with the unit type '1', allows enumerated types and case analysis to be defined. Consequently, the additive connectives offer a good theoretical ground to provide ACGs with feature structures.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 Exponentials
</SectionTitle>
      <Paragraph position="0"> The exponentials of linear logic are modal operators that may be used to go beyond linearity. In particular, the exponential '!' allows the intuitionistic implication '→' to be defined, which corresponds to the possibility of dealing with non-linear λ-terms. A need for such non-linear λ-terms is already present in the example of Section 2.3. Indeed, the way of getting rid of the second assumption we made at the beginning of Section 2.3 is to declare the logical symbols (i.e., the existential quantifier and the conjunction that occur in the interpretation of A in Lexicon L13) as constants of the object vocabulary S3. Then, the interpretation of A would be something like: λP.λQ. EXISTS (λx. AND (P x) (Q x)).</Paragraph>
      <Paragraph position="1"> Now, this expression must be typable, which is not possible in a purely linear framework. Indeed, the λ-term to which EXISTS is applied is not linear (there are two occurrences of the bound variable x). Consequently, EXISTS must be given ((e → t) ⊸ t) as a type.</Paragraph>
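      <Paragraph position="2"> The failure of linearity noted above can be checked mechanically. The following Python sketch assumes a hypothetical tuple representation of λ-terms (the constructors var, const, app, and lam are not from the paper) and takes a term to be linear when every λ binds exactly one occurrence of its variable.

```python
# Hypothetical term representation: tagged tuples built by these helpers.
def var(name): return ("var", name)
def const(name): return ("const", name)
def app(fun, arg): return ("app", fun, arg)
def lam(name, body): return ("lam", name, body)

def occurrences(term, name):
    """Count the free occurrences of the variable `name` in `term`."""
    tag = term[0]
    if tag == "var":
        return 1 if term[1] == name else 0
    if tag == "const":
        return 0
    if tag == "app":
        return occurrences(term[1], name) + occurrences(term[2], name)
    # Remaining case is "lam": an inner binder of the same name shadows it.
    if term[1] == name:
        return 0
    return occurrences(term[2], name)

def is_linear(term):
    """True when every lambda binds exactly one occurrence of its variable."""
    tag = term[0]
    if tag in ("var", "const"):
        return True
    if tag == "app":
        return is_linear(term[1]) and is_linear(term[2])
    return occurrences(term[2], term[1]) == 1 and is_linear(term[2])

# lambda x. AND (P x) (Q x): the bound variable x occurs twice.
body = lam("x", app(app(const("AND"), app(var("P"), var("x"))),
                    app(var("Q"), var("x"))))
# lambda P. lambda Q. EXISTS (lambda x. AND (P x) (Q x))
quantifier = lam("P", lam("Q", app(const("EXISTS"), body)))
# lambda f. lambda g. lambda x. f (g x): string concatenation, which is linear.
compose = lam("f", lam("g", lam("x", app(var("f"), app(var("g"), var("x"))))))

print(occurrences(body[2], "x"))   # 2
print(is_linear(quantifier))       # False
print(is_linear(compose))          # True
```

Only the innermost abstraction is non-linear; P and Q are each used once, which is consistent with giving a non-linear argument type to EXISTS alone.</Paragraph>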
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.3 Quantifiers
</SectionTitle>
      <Paragraph position="0"> Quantifiers may also play a part. Uses of first-order quantification, in a type logical setting, are exemplified by Morrill (1994), Moortgat (1997), and Ranta (1994). As for second-order quantification, it allows for polymorphism.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
6 Grammars as first-class citizens
</SectionTitle>
    <Paragraph position="0"> The difference we make between an abstract vocabulary and an object vocabulary is purely conceptual. In fact, it only makes sense relative to a given lexicon. Indeed, from a technical point of view, any vocabulary is simply a higher-order linear signature. Consequently, one may think of a lexicon L12 : S1 → S2 whose object language serves as the abstract language of another lexicon L23 : S2 → S3. This allows lexicons to be sequentially composed. Moreover, one may easily construct a third lexicon L13 : S1 → S3 that corresponds to the sequential composition of L23 with L12. From a practical point of view, this means that the sequential composition of two lexicons may be compiled. From a theoretical point of view, it means that the ACGs form a category whose objects are vocabularies and whose arrows are lexicons. This opens the door to a theory where operations for constructing new grammars from other grammars could be defined.</Paragraph>
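    <Paragraph position="5"> The idea that the sequential composition of two lexicons may be compiled into a single lexicon can be sketched in Python under a strong simplification: terms are first-order trees rather than higher-order linear λ-terms, and a lexicon is a dictionary from constants to functions. The toy vocabularies and the names apply_lexicon, compile_template, and builder are hypothetical and serve only as an illustration.

```python
def apply_lexicon(lexicon, term):
    """Interpret a tree (name, children) bottom-up with the given lexicon."""
    name, children = term
    return lexicon[name](*[apply_lexicon(lexicon, ch) for ch in children])

def compile_template(template, lexicon):
    """Turn a template over the middle vocabulary into a function over the
    target values of `lexicon`; ("hole", i) stands for the i-th argument."""
    def run(node, args):
        name, payload = node
        if name == "hole":
            return args[payload]
        return lexicon[name](*[run(ch, args) for ch in payload])
    return lambda *args: run(template, args)

def builder(name):
    """Constructor that rebuilds a tree node of the middle vocabulary S2."""
    return lambda *children: (name, list(children))

# Images of the S1 constants, written as S2 templates (toy example).
TEMPLATES = {
    "r_eps": ("eps", []),
    "r_rec": ("cat", [("a", []), ("cat", [("hole", 0), ("b", [])])]),
}
S2_BUILDERS = {name: builder(name) for name in ("eps", "a", "b", "cat")}

# L23 sends S2 constants to Python strings, standing in for S3.
L23 = {
    "eps": lambda: "",
    "a": lambda: "a",
    "b": lambda: "b",
    "cat": lambda x, y: x + y,
}

# L12 builds S2 trees; L13 is the compiled composition of L23 with L12.
L12 = {c: compile_template(t, S2_BUILDERS) for c, t in TEMPLATES.items()}
L13 = {c: compile_template(t, L23) for c, t in TEMPLATES.items()}

term = ("r_rec", [("r_rec", [("r_eps", [])])])
two_steps = apply_lexicon(L23, apply_lexicon(L12, term))
one_step = apply_lexicon(L13, term)
print(two_steps, one_step)   # aabb aabb
```

The compiled lexicon L13 never builds the intermediate S2 tree, yet it agrees with applying L12 and then L23 on this example.</Paragraph>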
  </Section>
</Paper>