Tree Adjoining Grammars in a Fragment 
of the Lambek Calculus 
V. Michele Abrusci
Università di Bari
Jacqueline Vauzeilles
Université Paris-Nord
Christophe Fouqueré
Université Paris-Nord
This paper presents a logical formalization of Tree Adjoining Grammar (TAG). TAG deals with 
lexicalized trees and two operations are available: substitution and adjunction. Adjunction is 
generally presented as an insertion of one tree inside another, surrounding the subtree at the 
adjunction node. This seems to contradict standard logic. We prove that some logical
formalisms, namely a fragment of the Lambek calculus, can handle adjunction. 
We represent objects and operations of the TAG formalism in four steps: first trees
or derived) and the way they are constituted, then the operations (substitution and adjunction), 
and finally the elementary trees, i.e., the grammar. Trees (initial or derived) are obtained as the 
closure of the calculus under two rules that mimic the grammatical ones. We then prove the equiv- 
alence between the language generated by a TAG grammar and the closure under substitution 
and adjunction of its logical representation. Besides this nice property, we relate parse trees to 
logical proofs, and to their geometric representation: proofnets. We briefly present them and give 
examples of parse trees as proofnets. This process can be interpreted as an assembling of blocks 
(proofnets corresponding to elementary trees of the grammar). 
1. Introduction 
This paper presents a logical formalization of Tree Adjoining Grammar (TAG) (Joshi, 
Levy and Takahashi 1975). TAG deals with lexicalized trees and two operations are 
available: substitution and adjunction. A set of (elementary) trees is associated to 
each lexical item. TAG is a tree-rewriting system: the derivation process consists in 
applying operations to trees in order to obtain a (derived) tree whose sequence of 
leaves is a sentence. Adjunction increases the expressive power of the formalism in 
such a way that noncontext-free languages can be represented although the parse 
process is done in polynomial time. Adjunction is generally presented as an insertion 
of one tree inside another, surrounding the subtree at the adjunction node. This seems 
to contradict standard logic, but we show (in Section 4) that some logical formalisms, 
namely a fragment of the Lambek calculus (LC, first introduced by Lambek [1958]),
can handle adjunction. 
We represent objects and operations of the TAG formalism in four steps: first trees 
(initial or derived) and the way they are constituted, then the operations (substitution 
and adjunction), and finally the elementary trees, i.e., the grammar. Labels occurring 
CILA, 70121 Bari, Italy. E-mail: abrusci@caspur.it
† LIPN-CNRS URA 1507, 93430 Villetaneuse, France. E-mail: cf@lipn.univ-paris13.fr
‡ LIPN-CNRS URA 1507, 93430 Villetaneuse, France. E-mail: jv@lipn.univ-paris13.fr
~) 1999 Association for Computational Linguistics 
Computational Linguistics Volume 25, Number 2 
in the grammar constitute the set of propositional variables we need. The sequent 
calculus is a restriction of the standard sequent calculus for LC: there are identity 
axioms (A t- A) and rules for introducing connectives (® at left-hand side, o-- at 
right-hand side). In LC, / is usually used for o-- and • for ®. We use this notation 
throughout the paper to relate our formalization to noncommutative linear logic: o-- is
the left implication, ⊗ is a noncommutative "and" connective. We prove that this
restricted calculus is closed under two rules that mimic the grammatical operations. 
Trees (initial or derived) are then obtained as the closure of the calculus under these 
two rules. In fact, trees are represented as (provable) sequents in an almost classical 
way. The right-hand side is the variable labeling the mother node of the tree. The 
left-hand side is a sequence of formulas of the following kinds: A for some leaf A of 
the tree, A o-- B1 ® ... ® Bn where A is the label of some internal node and Bi are the 
labels of its daughters, A o- A whenever A is a node where an adjunction can take 
place. This latter kind of formula can be grammatically interpreted as if such an A 
was split up into two nodes with the same label linked by some "soft" relation. The 
set of elementary trees of a TAG grammar G is then represented as a subset M of the
sequents in the closure of the calculus under the two previous rules. We then prove
the equivalence between the language generated in TAG by such a grammar G and
the closure under substitution and adjunction of the logical representation M. Note
that our interpretation of adjunction is very close to the use of quasi trees described
in Vijay-Shanker (1992).
Besides this equivalence property, we relate parse trees to logical proofs, and to 
their geometric representation, proofnets. We briefly present proofnets, and the corre- 
spondence between proofs and proofnets, and give examples of parse trees viewed as 
proofnets. This enables a new point of view on the parse process. This process can be 
interpreted as an assembling of blocks (proofnets corresponding to elementary trees 
of the grammar), and also as a circulation of information through links relating nodes 
of the proofnets. 
The remainder of the paper is organized in four parts. Section 2 describes the 
TAG formalism. We recall the terminology and show how substitution and adjunction 
operate on trees. Section 3 gives a survey of Lambek calculus viewed as a fragment 
of a noncommutative linear logic. We propose in Section 4 a logical formulation of 
TAG in a fragment of LC, and prove the correspondence between the two. Section 5 is 
devoted to the representation of proofs as proofnets; in this final section, we also study 
implications of this point of view. The proofs of propositions and theorems given in 
Section 4 are delayed to the appendix for the sake of clarity. 
2. Tree Adjoining Grammars 
The Tree Adjoining Grammar formalism is a tree-generating formalism introduced in 
Joshi, Levy, and Takahashi (1975), linguistically motivated (see, for example, Abeillé
et al. [1990] and Kroch and Joshi [1985]), and with formal properties studied in Vijay-
Shanker and Joshi (1985) and Vijay-Shanker and Weir (1994a, 1994b). A TAG is defined 
by two finite sets of trees composed by means of the substitution and adjunction 
operations. I 
1 Originally, there was no need for a substitution operation, as initial trees were always rooted at S, thus
labeling a sentence. In the Lexicalized-TAG formalism, this constraint disappears in favor of the 
substitution operation. Throughout the paper, we will use TAG to refer to the Lexicalized-TAG formalism. 
Abrusci, Fouqueré, and Vauzeilles Tree Adjoining Grammars
Definition 
A TAG G is a 5-tuple (VN, VT, S, I, A) where
• VN is a finite set of nonterminal symbols, 
• VT is a finite set of terminal symbols, 
• S is a distinguished nonterminal symbol, the start symbol,
• I is a set of initial trees, 
• A is a set of auxiliary trees. 
An elementary tree is either an initial tree or an auxiliary tree. Initial as well as
auxiliary trees are trees with at least one leaf labeled by a terminal symbol (the grammar
is a so-called lexicalized one). An auxiliary tree must furthermore have a leaf (the foot
node, marked with a star *) with the same label as the root node. Each nonterminal
node is marked as adjoinable or nonadjoinable (in this case, the node is marked NA).
Each internal node must obviously be labeled by a nonterminal symbol.2 A derived tree
is either an initial tree or a tree obtained from derived trees by means of the two
available operations.
We add two constraints on TAG grammars: a node X cannot have a unique daughter
labeled X, i.e., a unary branch from a node X to a single daughter X cannot be part of
a tree. This condition is in no way an important
constraint, as a grammar may always be transformed to conform to the constraint by
substituting a unique node X for the partial tree. However, our logical representation 
makes use of a trick based on such trees: we replace nodes marked adjoinable by such 
partial trees (there is no mark at all in our logical representation). We also suppose 
that the type of each tree is unambiguous: an initial tree has no leaf with the same 
label as the root node, an auxiliary tree has only one leaf with the same label as the 
root node. 
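The unambiguous-type condition can be checked mechanically. The following sketch is an illustration, not part of the formalism; it assumes a hypothetical nested-list encoding in which a tree is `[label, child, ...]`, a leaf is a plain string, and the foot node of an auxiliary tree carries a trailing star (e.g. `"S*"`):

```python
def frontier_labels(t):
    """Leaf labels of a tree, left to right."""
    if isinstance(t, str):
        return [t]
    return [x for c in t[1:] for x in frontier_labels(c)]

def tree_type(t):
    """Classify an elementary tree: 'initial' if no leaf shares the root
    label, 'auxiliary' if exactly one leaf (the starred foot) does."""
    root = t[0]
    matching = [l for l in frontier_labels(t) if l in (root, root + "*")]
    if not matching:
        return "initial"
    if len(matching) == 1 and matching[0] == root + "*":
        return "auxiliary"
    raise ValueError("ambiguous tree type")
```

On the trees of G1, the auxiliary tree β1 is recognized by its single foot leaf, while a tree such as NP over the and N is initial.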
To conform with the literature, we will use α to refer to an initial tree, β to refer to
an auxiliary tree, and γ to refer to some derived tree. Examples of initial and auxiliary
trees are given in Figure 1. Two TAGs are defined: G1 = ({S}, {a, b, c, d, ε}, S, {α1}, {β1})
(ε is the empty word) and G2 = ({S, VP, NP, N}, {the, man, walks}, S, {α2, α3, α4}, ∅).
The substitution operation is defined as usual. A nonterminal leaf of a tree may be 
expanded with a tree whose root node has the same label. We follow a conventional 
notation: leaves that accept substitution are marked with a down arrow ↓. This is not
to be interpreted as a restriction on substitution, but only as a visual indication of 
what remains to be substituted to get a complete sentence. The adjunction operation 
is a little bit more complicated. It supposes a derived tree with a nonterminal node, 
say X, possibly internal and not marked NA, and an auxiliary tree with root node X. 
The operation consists in: 
• excising the subtree with root labeled X in the derived tree, 
• inserting the auxiliary tree at node labeled X in the derived tree, 
2 In some versions, nonterminal nodes of elementary trees are labeled by a set of (auxiliary) trees that 
can be adjoined at this node. In case of the empty set, the node is obviously nonadjoinable. For the sake of clarity, we simplify the definition to only take into account the Boolean adjoinable property.
[Figure residue: α1 = S → ε; β1 = S → a S d with the inner S → b S*_NA c; α2 = NP → the N↓; α3 = N → man; α4 = S → NP↓ VP with VP → walks.]
Figure 1 
Elementary trees. 
• finally, inserting the excised subtree at the foot node (hence labeled X 
and marked with a star *) in the auxiliary tree.
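The two operations can be sketched directly on trees. This is a minimal illustration under an assumed encoding (a tree is `[label, child, ...]`, a leaf is a string, the foot node is the root label with a trailing `*`, and nodes are addressed by paths of child indices); it is not the authors' implementation:

```python
import copy

def substitute(tree, path, init):
    """Substitution: replace the leaf addressed by `path` (whose label
    matches init's root) by a copy of the initial tree `init`."""
    tree = copy.deepcopy(tree)
    node = tree
    for i in path[:-1]:
        node = node[i]
    node[path[-1]] = copy.deepcopy(init)
    return tree

def _plug_foot(aux, sub):
    """Replace the starred foot leaf of `aux` by the excised subtree `sub`."""
    out = [aux[0]]
    for c in aux[1:]:
        if isinstance(c, list):
            out.append(_plug_foot(c, sub))
        elif c == sub[0] + "*":
            out.append(sub)
        else:
            out.append(c)
    return out

def adjoin(tree, path, aux):
    """Adjunction at the node addressed by `path`: the subtree there is
    excised, the auxiliary tree takes its place, and the excised subtree
    is re-inserted at the foot node."""
    tree = copy.deepcopy(tree)
    parent, idx, node = None, None, tree
    for i in path:
        parent, idx, node = node, i, node[i]
    new = _plug_foot(copy.deepcopy(aux), node)
    if parent is None:
        return new
    parent[idx] = new
    return tree

def frontier(tree):
    """Sequence of leaves, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for c in tree[1:] for w in frontier(c)]
```

With α1 encoded as `["S", "eps"]` and β1 as `["S", "a", ["S", "b", "S*", "c"], "d"]`, adjoining β1 at the root of α1 and then again at the inner S reproduces the frontiers of γ3 and γ4 in Figure 2.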
Examples of these operations are given in Figure 2. To clearly show the adjunction
operation, the links of the adjoined tree β1 are represented by dashed lines in the
derived trees γ3 and γ4. Obviously, there is only one kind of link. We write γ1 →G γ2
when γ2 is the result of an adjunction or a substitution of an elementary tree of a
TAG G on the derived tree γ1; →*G is the reflexive, transitive closure of →G. The set
{γ | α ∈ G and α →*G γ} is represented by T(G). The language L(G) generated by a
TAG G is the set of strings, i.e., sequences of leaves of trees in T(G) when the leaves of
these trees are only labeled with terminal symbols, and whose root is the start symbol.
Hence, L(G1) = {aⁿbⁿcⁿdⁿ | n ≥ 0} and L(G2) = {the man walks}.
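The first language is the classic non-context-free witness; a direct recognizer (an illustration only, not part of the formalism) makes the polynomial-time recognition claim concrete:

```python
def in_L_G1(w):
    """Recognize {a^n b^n c^n d^n | n >= 0}, the language L(G1)
    (the empty word is written as the empty string)."""
    n = len(w) // 4
    return len(w) == 4 * n and w == "a" * n + "b" * n + "c" * n + "d" * n
```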
3. Lambek Calculus and Noncommutative Linear Logic 
Lambek calculus is well known; we give only the language and sequent calculus in 
Figure 3. 
Lambek calculus will be sufficient to formalize TAG (see the next section). In 
Figures 4, 5, and 6, we give three examples of proofs to show how the sequent calculus 
can be used. The first one (Figure 4) is a straightforward use of a Lambek-style parsing, 
given the two implications and a set of proper axioms corresponding to the words. 
The two other proofs do not use proper axioms at all: rules labeled lex are provable 
sequents; as these sequents are obviously provable we omit their proof tree. The second 
proof (Figure 5) is in the same spirit as the first. However, for this second proof, 
descriptions of lexical items are included in the sequents. At the same time, it can 
easily be compared to the third proof: in the second proof, the structural information 
is located at the head of each structure as one formula; in the third proof, one formula 
represents a syntactic tree of level 1. The third proof (Figure 6) interprets the Lambek
grammar in a derivation style: we only need one implication o-- and the connective
times ®. The proofs use cuts: they can be withdrawn using the cut elimination theorem, 
but we think the cuts help in understanding the process. The following sections include 
other examples and emphasize the usefulness of noncommutative linear logic in the 
linguistic domain. 
A natural way to extend Lambek calculus consists in embedding it in a classical 
system, in the sense that the connectives "and" and "or" are dual. Indeed, LC is an 
"intuitionistic" system as there can be only one conclusion in the sequents; this is not
Substitution of α3 on α2 gives γ1 [tree residue: NP → the (N → man)].
Substitution of γ1 on α4 gives γ2 [tree residue; frontier: the man walks].
Adjunction of β1 on α1 gives γ3 [tree residue, with the links of β1 dashed; frontier: a b ε c d].
Adjunction of β1 on γ3 gives γ4 [tree residue; frontier: a a b b ε c c d d].
Figure 2
Substitution and adjunction results.
the case with noncommutative linear logic. Allowing multiple conclusions may give 
valuable benefits from a linguistic point of view, but we will only consider in this 
paper the geometrical representation available for such a system, i.e., proofnets. In 
the appendix, we give a brief description of linear logic, and the relations between 
classical linear, and noncommutative linear logics. We hope this will help readers to 
understand the overall framework. 
4. The Calculus A (A Fragment of LC)
The formalization of TAG in LC relies mainly on a logical presentation of the two op- 
erations substitution and adjunction, together with a correspondence between proofs 
and trees. As already shown in the previous section, the substitution operation is noth- 
ing but the application of the cut rule restricted to atomic formulas, which we call the 
atomic cut rule. Interpreting the adjunction operation is really the main difficulty. The 
adjunction results from two atomic cut rules between the sequent corresponding to 
the adjunction tree and two suitable sequents corresponding to two subparts of the 
A ⊢ A (axiom)

Γ ⊢ A    Γ1, A, Γ2 ⊢ B              Γ ⊢ A    Δ ⊢ B
------------------------ (cut)      ---------------- (r-⊗)
Γ1, Γ, Γ2 ⊢ B                       Γ, Δ ⊢ A ⊗ B

Γ1, A, B, Γ2 ⊢ C                    Γ1, A, Γ2 ⊢ C    Δ ⊢ B
------------------ (l-⊗)            --------------------------- (l-o--)
Γ1, A ⊗ B, Γ2 ⊢ C                   Γ1, A o-- B, Δ, Γ2 ⊢ C

Γ, B ⊢ A                            Γ1, A, Γ2 ⊢ C    Δ ⊢ B
------------- (r-o--)               --------------------------- (l---o)
Γ ⊢ A o-- B                         Γ1, Δ, B --o A, Γ2 ⊢ C

B, Γ ⊢ A
------------- (r---o)
Γ ⊢ B --o A
Figure 3 
Language and sequent calculus for the Lambek calculus. 
John ⊢ NP
gives ⊢ ((NP --o S) o-- NP) o-- NP
Mary ⊢ NP
a ⊢ NP o-- N
book ⊢ N

[Proof residue: from the axioms NP ⊢ NP, S ⊢ S, and N ⊢ N, successive implication
steps derive NP, ((NP --o S) o-- NP) o-- NP, NP, NP o-- N, N ⊢ S, and cuts with respect
to the proper axioms above yield John, gives, Mary, a, book ⊢ S.]
Figure 4 
Lexicon and proof of John gives Mary a book: (Lambek-style) with proper axioms. 
[Proof residue: the lexical (lex) sequents NP o-- John, John ⊢ NP; NP o-- Mary, Mary ⊢ NP;
(NP o-- N) o-- a, a, N ⊢ NP; N o-- book, book ⊢ N; and
NP, (((NP --o S) o-- NP) o-- NP) o-- gives, gives, NP, NP ⊢ S are assembled by cuts into
NP o-- John, John, (((NP --o S) o-- NP) o-- NP) o-- gives, gives, NP o-- Mary, Mary, (NP o-- N) o-- a, a, N o-- book, book ⊢ S.]
Figure 5 
Proof of John gives Mary a book: (Lambek-style) two implications. 
tree where adjunction is done. Consider, for example, the following TAG grammar: 
Grammar G~ = {l, //~} 
¢ ailed 
b SNA C 
[Proof residue: the lexical (lex) sequents S o-- NP ⊗ VP, NP, VP o-- V ⊗ NP ⊗ NP, V o-- gives, gives, NP, NP ⊢ S;
NP o-- John, John ⊢ NP; NP o-- Mary, Mary ⊢ NP; NP o-- Det ⊗ N, Det o-- a, a, N ⊢ NP; and
N o-- book, book ⊢ N are assembled by cuts into
S o-- NP ⊗ VP, NP o-- John, John, VP o-- V ⊗ NP ⊗ NP, V o-- gives, gives, NP o-- Mary, Mary, NP o-- Det ⊗ N, Det o-- a, a, N o-- book, book ⊢ S.]
Figure 6 
Proof of John gives Mary a book: one implication and times. 
This set of trees may be viewed as a subset of the closure T(G2) under substitution 
(possibly with the declaration of adjunction nodes) of the following set of trees of 
level 1: 
Grammar G2 = [tree residue: α1 = S → ε, together with the level-1 trees S → a S d and S → b S_NA c]
Note that the result of the adjunction of the second tree of G1 on itself is exactly
the result of substitutions on trees of G2. However, it is obvious that trees resulting
from substitution operations on G2 do not always correspond to results of adjunction
operations on G1.
We logically represent the set of trees T(G2) as (the set of provable theorems of)
a calculus A(G2): the formulas are built with the alphabet {ε, a, b, c, d, S} and the set of
connectives {⊗, o--}, the sequent calculus consists of the axioms s ⊢ s and the rules (in
both axioms and rules, s is a propositional letter):

Γ ⊢ ε    Γ1, S, Γ2 ⊢ B        Γ ⊢ a ⊗ S ⊗ d    Γ1, S, Γ2 ⊢ B        Γ ⊢ b ⊗ S ⊗ c    Γ1, S, Γ2 ⊢ B
------------------------      ------------------------------        ------------------------------
Γ1, S o-- ε, Γ, Γ2 ⊢ B        Γ1, S o-- a ⊗ S ⊗ d, Γ, Γ2 ⊢ B        Γ1, S o-- b ⊗ S ⊗ c, Γ, Γ2 ⊢ B

Γ ⊢ A    Δ ⊢ B                s ⊢ s    Γ1, S, Γ2 ⊢ B
----------------- (⊗)         ------------------------
Γ, Δ ⊢ A ⊗ B                  Γ1, S o-- S, S, Γ2 ⊢ B
The introduction of a left implication (o--) corresponds to the building of a partial
tree. Such introductions are then restricted either to the formalization of the trees
of the grammar (the first three rules correspond exactly to the trees of G2), or to the
formalization of adjunction nodes (the formula s o-- s "marks" s as being an adjunction
node, i.e., the adjunction rule may be applied only on this kind of node, as will be
clear below).
The grammar G1 can then be logically represented as a subset M(G1) of the set of
provable sequents of the calculus A(G2):

M(G1) = {S o-- a ⊗ S ⊗ d, a, S o-- S, S o-- b ⊗ S ⊗ c, b, S, c, d ⊢ S ;  S o-- S, S o-- ε, ε ⊢ S}
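The passage from trees to sequents can be sketched mechanically. The following is a hypothetical encoding, not the authors' implementation: a node is a tuple `(label, adjoinable, children)`, the leaf ε is written `"e"`, and formulas are rendered as strings with `o-` and `*` as ASCII stand-ins for the two connectives. Each adjoinable node contributes its A o-- A marker just before the formula for its expansion; on β1 this reproduces the first sequent of M(G1).

```python
def seq(tree):
    """Seq(T): return (antecedent, succedent) for a tree T.

    A node is (label, adjoinable, children); a leaf has children == ().
    The antecedent lists one formula per subtree of depth 1, one per
    adjunction node, and one per leaf, in left-to-right order."""
    def formulas(node):
        label, adj, children = node
        if not children:
            return [label]                              # a leaf stands for itself
        out = [f"{label} o- {label}"] if adj else []    # adjunction marker
        out.append(f"{label} o- " + " * ".join(c[0] for c in children))
        for c in children:
            out += formulas(c)
        return out
    return formulas(tree), tree[0]
```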
In AB-grammars (Bar-Hillel 1953), only one implication is used without any "and" 
connective. The grammar would be represented in AB-grammars as two provable 
sequents (note that "daughters of a node" are explicitly ordered): 
((S o-- d) o-- S) o-- a, a, S o-- S, ((S o-- c) o-- S) o-- b, b, S, c, d ⊢ S and S o-- S, S o-- ε, ε ⊢ S
We will prove later that, besides the cut rule, there exists another derived rule for 
the calculus A(G2) (and in fact for each calculus of this kind) mimicking the adjunction
operation. Reducing the calculus, then, to a closure of the substitution and adjunction 
[Figure residue: the diagram relates the grammatical side (G' ⊆ CL(G'), closed under subst and adj) to the logical side (M(G') ⊆ CL(M(G')), closed under cut and adj) via the maps Seq and Tree.]
Figure 7 
Summary of the logical interpretation of the TAG formalism. 
rules on M(G1), we get exactly the logical representations of the set of trees under the
TAG grammar G1.
The adjunction rule must be logically justified: there must be only one way to 
combine the pieces (i.e., provable sequents corresponding to trees of level 1), given 
the substitution node, such that the order of the elements is as requested. 
To prove this, we show that for a suitable fragment of LC there is a unique way
to decompose a sequent Γ, a o-- A, Δ ⊢ B into two sequents Γ, a, Δ2 ⊢ B and Δ1 ⊢ A,
where Δ = Δ1, Δ2. In this section, we clarify the calculus A used to interpret TAG: it
includes a cut rule and an adjunction rule that mimic the grammatical operations. As
pointed out previously, these two rules are correct with respect to logic. We give the
basic properties satisfied by this calculus A. In order to represent TAG in LC, we first
construct the set G of subtrees of depth 1 of trees appearing in a TAG grammar G'. The
TAG grammar G' is then a subset of the closure T(G) of the set G under substitution
(indicated by subst) and the declaration of nodes where adjunction is not available
(indicated by NA). The interpretation of elements of G as provable sequents of A is
straightforward. This leads to a calculus A(G) where the operations are restricted with
respect to G. The TAG grammar G' is then in correspondence with a subset M(G')
of A(G) and we prove the equivalence between the language CL(G') generated by
G' and the set of sequents CL(M(G')) obtained by closure on M(G') by the cut and
adjunction rules (we use M instead of M(G') whenever there is no ambiguity). Proofs
of propositions are delayed to the appendix. The various components of our approach
are summarized in Figure 7.
Consider the following fragment A of LC:

Definition The Calculus A

• Alphabet of A: propositional letters a, b, ..., connectives ⊗, o--.
• Formulas: usual definition. A is a simple ⊗-formula iff A is a
  propositional letter or A is a formula b1 ⊗ ... ⊗ bn where b1, ..., bn are
  propositional letters. B is a o---formula iff B = a o-- A where a is a
  propositional letter and A is a simple ⊗-formula.
• Sequents: Γ ⊢ A, where Γ is a finite sequence of formulas and A is a
  formula.
• Sequent calculus:
  Axiom: a ⊢ a
  Rules:
    Γ ⊢ A    Δ ⊢ B           Γ ⊢ A    Γ1, C, Γ2 ⊢ B
    --------------- (⊗)      ------------------------ (o--)
    Γ, Δ ⊢ A ⊗ B             Γ1, C o-- A, Γ, Γ2 ⊢ B

In the following, we only consider sequents such that formulas in the left side are
either propositional letters, or o---formulas. So, in the rule introducing o--, C stands
for a propositional letter. As we have only one propositional letter before o--, we model
trees: C is the (unique) mother and the ⊗-formula A is the sequence of its daughters.
Proposition Main properties of calculus A
(proofs in the appendix)

1. If Γ ⊢ A ⊗ B is provable in A, then
   • A and B are simple ⊗-formulas;
   • there is a unique pair (Γ1, Γ2) s.t. Γ = Γ1, Γ2 and both the
     sequents Γ1 ⊢ A and Γ2 ⊢ B are provable in A.

2. If Γ, a o-- A, Δ ⊢ B is provable in A, then
   • A and B are simple ⊗-formulas;
   • there is a unique pair (Δ1, Δ2) s.t. Δ = Δ1, Δ2 and both the
     sequents Δ1 ⊢ A and Γ, a, Δ2 ⊢ B are provable in A.
   Such a pair (Δ1, Δ2) will be called "the splitting pair in Γ, a o-- A, Δ ⊢ B
   for A." Note that this pair can be computed easily: the first element Δ1
   of the splitting pair must satisfy a counting condition on each
   propositional letter occurring in it (see the appendix).

3. The calculus A is closed under the atomic cut rule (which we also call
   the substitution rule)

     Γ ⊢ a    Δ1, a, Δ2 ⊢ A
     ----------------------- (cut)
     Δ1, Γ, Δ2 ⊢ A

   i.e., if the sequents Γ ⊢ a and Δ1, a, Δ2 ⊢ A are provable in A, then the
   sequent Δ1, Γ, Δ2 ⊢ A is also provable in A.

4. The calculus A is closed under the adjoining rule

     Γ1, a, Γ2 ⊢ a    Δ, a o-- a, Λ ⊢ b
     ----------------------------------- (adj)
     Δ, Γ1, Λ1, Γ2, Λ2 ⊢ b

   where (Λ1, Λ2) is the splitting pair of Λ in Δ, a o-- a, Λ ⊢ b.

Note that Λ1 and Λ2 are uniquely defined from the premises, so the previous deduction is really a logical rule.
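These two rules can be run mechanically on sequents. The sketch below is an illustration, not the authors' implementation: a sequent is a pair (list of formula strings, succedent), with `o-` and `*` as ASCII stand-ins for the connectives. The splitting pair is found by our reading of the counting condition: in the prefix Δ1, each letter must occur, net of heads and bodies, exactly as often as in A. The positions of the foot occurrence of a (in the first premise) and of the a o-- a formula (in the second) are passed explicitly.

```python
from collections import Counter

def _net(formulas):
    """Net letter count: an atom, or the head of an 'a o- B' formula,
    counts +1; each letter in the body of an 'o-' formula counts -1."""
    c = Counter()
    for f in formulas:
        if " o- " in f:
            head, body = f.split(" o- ")
            c[head] += 1
            for x in body.split(" * "):
                c[x] -= 1
        else:
            c[f] += 1
    return c

def _trim(c):
    return {k: v for k, v in c.items() if v != 0}

def splitting_pair(delta, A):
    """The pair (Delta1, Delta2) with Delta = Delta1 Delta2 and
    Delta1 |- A balanced under the counting condition."""
    target = _trim(Counter(A.split(" * ")))
    for i in range(len(delta) + 1):
        if _trim(_net(delta[:i])) == target:
            return delta[:i], delta[i:]
    raise ValueError("no splitting pair")

def cut(seq1, seq2, i):
    """Atomic cut: plug seq1 = (Gamma, a) into the atomic occurrence of a
    at position i of seq2's antecedent."""
    (gamma, a), (delta, b) = seq1, seq2
    assert delta[i] == a
    return delta[:i] + gamma + delta[i + 1:], b

def adjoin_rule(seq1, j, seq2, i):
    """Adjoining rule: seq1 = (Gamma, a) with the foot occurrence of a at
    position j; seq2's antecedent carries 'a o- a' at position i. The
    conclusion is Delta, Gamma1, Lambda1, Gamma2, Lambda2 |- b."""
    (gamma, a), (delta, b) = seq1, seq2
    assert delta[i] == f"{a} o- {a}" and gamma[j] == a
    lam1, lam2 = splitting_pair(delta[i + 1:], a)
    return delta[:i] + gamma[:j] + lam1 + gamma[j + 1:] + lam2, b
```

On the two sequents of M(G1) (with ε written `"e"`), two successive applications of the adjoining rule produce the sequents of γ3 and of γ4, whose atoms read a b ε c d and a a b b ε c c d d.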
Definition The Calculus A(G)

Let G be a family of labeled trees, of depth 1, not of the form of a unary branch X over X.
Let T(G) be the closure of G under the rules:

• substitution, with or without the declaration of a new possibly internal
  point on which the adjoining operation may be performed,
• adjoining operation.

A(G) is the calculus obtained from A as follows:

• propositional letters are exactly all the labels of the trees in G,
• the rule (o--) is restricted as follows:

    Γ ⊢ A    Γ1, a, Γ2 ⊢ B
    ------------------------ (o--, G)
    Γ1, a o-- A, Γ, Γ2 ⊢ B

  where A, B are simple ⊗-formulas of the language of A(G), a is a
  propositional letter of the language of A(G), and one of the following
  cases occurs:
  -- A is a;
  -- A is a propositional letter b different from a, and the tree with root a
     and unique daughter b belongs to G;
  -- A is b1 ⊗ ... ⊗ bn, and the tree with root a and daughters b1, ..., bn
     belongs to G.
The following propositions state the correspondence between sequents and trees.
The first two provide a precise translation between the two notions. Basically, a sequent
Γ ⊢ a (in the previous language) is the logical equivalent of a tree with root a, and
there is exactly one formula in Γ for each leaf, for each subtree (of depth 1), for
each adjunction node, and nothing else. Seq() (respectively, Tree()) associates a sequent
(respectively, a tree) to each tree (respectively, each sequent), and we prove the two are
converse. The last three propositions are properties concerning the logical counterpart
of a TAG grammar. The last one is in fact the most important: the closure under
(logical) adjunction and substitution of the set of sequents corresponding to a set of
elementary trees is exactly the set of sequents corresponding to the closure under
(grammatical) adjunction and substitution of this set of elementary trees. In other
words, the logical calculus (the restricted logical calculus we defined above) and the
grammatical calculus (the TAG calculus) coincide.
Proposition Main properties of calculus A(G)
(proofs in the appendix)

Properties 1-4 of A are also properties of A(G). Moreover the following properties
hold for A(G):

• To T ∈ T(G), we associate a sequent Seq(T) of A(G) s.t.
  -- if a is the root of T, and the terminal points of T (ordered from
     left to right) are a1, ..., am, then Seq(T) is
       Γ ⊢ a
     where the sequence of all the propositional variables occurring
     in Γ is a1, ..., am and there is a formula c o-- c in Γ iff c is a
     possibly internal point of T on which the adjoining operation
     may be performed;
  -- Seq(T) is provable in A(G).

• To every provable sequent Γ ⊢ A in A(G), we associate Tree(Γ ⊢ A) s.t.
  -- if A is a propositional letter, then Tree(Γ ⊢ A) ∈ T(G) where the
     root is A, the terminal points (from left to right) are exactly all
     the propositional letters occurring in Γ and in the same order in
     which they occur in Γ, and the possibly internal points on which
     the adjoining operation may be performed are exactly all the
     propositional letters c s.t. c o-- c occurs in Γ;
  -- if A is b1 ⊗ ... ⊗ bn, and so Γ = Γ1, ..., Γn with the sequents
     Γi ⊢ bi provable in A(G) for every 1 ≤ i ≤ n, then Tree(Γ ⊢ A) is a
     sequence T1, ..., Tn of trees ∈ T(G), s.t. Ti = Tree(Γi ⊢ bi).

• If Γ ⊢ a is provable in A(G), then Seq(Tree(Γ ⊢ a)) = Γ ⊢ a. If T is a tree of
  G, then Tree(Seq(T)) = T.

• Let M be a set of provable sequents in A(G). Define CL(M) as follows:
  -- M ⊆ CL(M);
  -- (closure under atomic cut rule) if Γ ⊢ a ∈ CL(M) and
     Δ1, a, Δ2 ⊢ B ∈ CL(M), then Δ1, Γ, Δ2 ⊢ B ∈ CL(M);
  -- (closure under adjoining operation) if Γ1, a, Γ2 ⊢ a ∈ CL(M) and
     Δ, a o-- a, Λ ⊢ b ∈ CL(M), then Δ, Γ1, Λ1, Γ2, Λ2 ⊢ b ∈ CL(M), where
     (Λ1, Λ2) is the splitting pair of Λ in Δ, a o-- a, Λ ⊢ b;
  -- nothing else belongs to CL(M).

• If Γ ⊢ A ∈ CL(M), then Γ ⊢ A is provable in A(G).

• If G' ⊆ T(G), let CL(G') be the closure of G' under:
  -- substitution,
  -- adjoining operation.
  Clearly, CL(G') ⊆ T(G). Let M = {Seq(T) / T ∈ G'}, then
  CL(M) = {Seq(T) / T ∈ CL(G')}.
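For a grammar without auxiliary trees, such as the G2 of Figure 1 ({the, man, walks}), CL(M) is reached by the cut rule alone. The sketch below is an illustration under assumed encodings (sequents as hashable pairs of a tuple of formula strings and a succedent); the adjoining rule would be added analogously, and since CL(M) is in general infinite, the iteration is bounded:

```python
def cuts(seq1, seq2):
    """All results of the atomic cut rule plugging seq1 = (Gamma, a) into
    an atomic occurrence of a in seq2's antecedent."""
    (gamma, a), (delta, b) = seq1, seq2
    return [(delta[:i] + gamma + delta[i + 1:], b)
            for i, f in enumerate(delta) if f == a]

def closure(M, bound=10):
    """Bounded closure of M under the cut rule: add all cut results until
    a fixed point is reached or the bound runs out."""
    seqs = set(M)
    for _ in range(bound):
        new = {r for s1 in seqs for s2 in seqs for r in cuts(s1, s2)}
        if new <= seqs:
            break
        seqs |= new
    return seqs
```

Starting from the sequents of α2, α3, and α4, the closure contains the sequent whose atoms spell out the man walks.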
Starting from this last proposition, it is possible to prove that the language accepted
by a TAG grammar G' is exactly the language accepted by M(G'). We can define the
language accepted by such a calculus as follows: let us take only those sequents
in CL(M(G')) whose right part is the propositional variable S (the start symbol of
the grammar), and such that the propositional variables of the left part of the sequent
correspond to terminal symbols of the grammar, i.e., words of the language. The
language accepted by M(G') is then the set of sequences of words in the same order
as they appear in the previous sequents.
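This extraction step can be sketched as follows (an illustration under the same assumed string encoding of formulas, where the atoms are the formulas without an implication):

```python
def language(closure_seqs, terminals, start="S"):
    """The accepted language: from each sequent |- start whose atoms are
    all terminal symbols, read the atoms off from left to right."""
    words = set()
    for ante, succ in closure_seqs:
        atoms = [f for f in ante if " o- " not in f]
        if succ == start and atoms and all(a in terminals for a in atoms):
            words.add(" ".join(atoms))
    return words
```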
5. TAG Analysis Using Noncommutative Proofnets 
A proof in sequent calculus contains many useless properties in its contexts. Girard 
(1987) has defined, in a purely geometric way, a class of graphs of formulas, called 
proofnets: for each proof of a sequent ⊢ Γ in the one-sided sequent calculus for multi-
plicative linear logic, there is a corresponding proofnet whose conclusions are exactly
the formulas in Γ, and for each proofnet, there is at least one corresponding proof of
the sequent ⊢ Γ in the one-sided sequent calculus for multiplicative linear logic (where
Γ is a sequence of all the conclusions of the proofnet). Similarly, Abrusci (1991) de-
fined in a purely geometric way a class of graphs, called noncommutative proofnets, 
relative to multiplicative noncommutative linear logic. Roorda (1992) also described 
proofnets for Lambek calculus. Other criteria exist by now for characterizing proofnets 
for commutative or noncommutative, intuitionistic or nonintuitionistic linear logic. We 
present here Abrusci's criteria. 
5.1 Noncommutative Proofnets 
Proofnets are defined on one-sided sequent calculi. Presentations of the one-sided
sequent calculus and of proofnets are given in the appendix. Let us recall that ⅋ is
the "or" connective associated to ⊗ (the "and" connective), such that A --o B = A⊥ ⅋ B.
To every proof π of a sequent ⊢ Γ in the one-sided sequent calculus for multiplicative
noncommutative linear logic, we can associate (by induction on the construction of
the proof π) a noncommutative proofnet with conclusions Γ, i.e., an oriented planar
graph π′ of occurrences of formulas such that:
• The conclusions of π′ are exactly the formulas in Γ.
• π′ is a noncommutative proof structure, i.e., it is constructed by means of
  the following links3:
  -- Axiom-link (two conclusions A and A⊥, no premise)
  -- Cut-link (two premises A and A⊥, no conclusion)
  -- ⊗-link (two premises A and B, one conclusion A ⊗ B)
  -- ⅋-link (two premises A and B, one conclusion A ⅋ B)
  and every occurrence of formula is a premise of at most one link and is a
  conclusion of exactly one link.
• The translation π′ of π is a proofnet, i.e., it admits no shorttrip. A
  shorttrip is a trip that does not contain each node twice. A trip is a
  sequence of nodes, going from one node to another according to the
  graph and to a switch for each ⊗-link and each ⅋-link, in a
  bideterministic way: the traversal of nodes is done according to Figure 8.
  Every assignment for π′ is total: two integer variables are associated to
  each label (one for each "side" of the variable). Constraints are imposed
  on variables with respect to how trips are done throughout the net. The
  assignment is total if the set of constraints has a solution.
3 The ⅋-link is graphically distinguished from the ⊗-link. However this is a moot point because the graph has only one kind of edge.
[Figure residue: traversal rules for axiom-links, cut-links, ⊗-links, and ⅋-links; each ⊗-link and each ⅋-link has an L-switch and an R-switch.]
Figure 8 
Travels through proof structures. 
• π′ induces the linear order Γ of the conclusions, i.e., the precedence
  relation is a chain and each conclusion occurs exactly once in the chain.
Precise definitions, examples, explanations and the proof of the following theorem 
may be found in Abrusci (1995). 
Theorem
π′ is a noncommutative proofnet with conclusions Γ iff there exists a proof π of the
sequent ⊢ Γ in the sequent calculus for multiplicative noncommutative linear logic
such that π′ is associated to π.
Note that every noncommutative proofnet is a planar graph.
5.2 Parse Examples 
In this section, we give two simple examples of parses. The aim of this section is to 
show the strong connection between the structure of proofs of sequents and a standard 
TAG derived structure. Moreover, it emphasizes the interest of a proofnet approach as 
the syntax (and parsing process) is concretely designed as a logical manipulation of 
logical structures. In the next section, we develop this approach and show how lexical 
rules can be integrated into it. Finally, we briefly mention that this can also give a 
logical formalization of D-trees (Vijay-Shanker 1992). 
The first example requires only substitution, i.e., the cut rule from the logical point of
view. We first give the sequents (provable in A) associated to the lexical items. Their
meanings are straightforward, e.g., "John and Mary are noun phrases (NP)" or "saw
requires a complement NP to obtain a verb phrase (VP) and a subject NP to obtain
a sentence (S)." Note that VP is an adjunction node, so the sequent associated to the
item saw includes the formula VP o-- VP. The next example uses this specification.
John    NP o- John, John ⊢ NP
Mary    NP o- Mary, Mary ⊢ NP
saw     S o- NP ⊗ VP, NP, VP o- VP, VP o- V ⊗ NP, V o- saw, saw, NP ⊢ S
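These sequents can be transcribed directly into data. The sketch below is not from the paper; the tuple encoding, with ("*", b1, ..., bn) for a tensor of letters and ("<-", c, A) for c o- A, is our own hypothetical choice.

```python
# Hypothetical encoding of the lexical sequents (not from the paper):
# letters are strings, ("*", b1, ..., bn) stands for b1 (x) ... (x) bn,
# ("<-", c, A) stands for c o- A; a sequent is (antecedent list, conclusion).
LEXICON = {
    "John": ([("<-", "NP", "John"), "John"], "NP"),
    "Mary": ([("<-", "NP", "Mary"), "Mary"], "NP"),
    "saw": ([("<-", "S", ("*", "NP", "VP")), "NP",
             ("<-", "VP", "VP"),              # VP o- VP: the adjunction node
             ("<-", "VP", ("*", "V", "NP")),
             ("<-", "V", "saw"), "saw", "NP"], "S"),
}

def leaves(sequent):
    """The propositional letters of the antecedent, in order: the yield."""
    antecedent, _ = sequent
    return [f for f in antecedent if isinstance(f, str)]

print(leaves(LEXICON["saw"]))  # ['NP', 'saw', 'NP']
```

The yield of each sequent is exactly the frontier of the corresponding elementary tree.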
Computational Linguistics Volume 25, Number 2 
The first proof cuts Mary in first:
from S o- NP ⊗ VP, NP, VP o- VP, VP o- V ⊗ NP, V o- saw, saw, NP ⊢ S and NP o- Mary, Mary ⊢ NP, (cut) yields
S o- NP ⊗ VP, NP, VP o- VP, VP o- V ⊗ NP, V o- saw, saw, NP o- Mary, Mary ⊢ S; with NP o- John, John ⊢ NP, (cut) then yields
S o- NP ⊗ VP, NP o- John, John, VP o- VP, VP o- V ⊗ NP, V o- saw, saw, NP o- Mary, Mary ⊢ S.
The second proof cuts John in first and Mary in second, ending in the same sequent.
Figure 9
A(G) proofs of John saw Mary.
The proof associated to the analysis of John saw Mary requires two cuts. The two 
sequent proofs given in Figure 9 are the only two possibilities for this sentence in the 
fragment A(G). This pinpoints the fact that the order in which the cuts are done is 
not significant with respect to the derived structure. Proofnets allow the expression of 
this equivalence. Hence the two proofs have the same associated proofnet, given in 
Figure 10. For the sake of clarity, the cut rules are drawn as bold lines, and the subnets associated to
lexical items are circled. Obviously, if we delete the two cut lines, we are left with three
proofnets referring to (provable) sequents. The proofnet in Figure 10 still contains some
superfluous information, namely, nodes that cannot be targeted by the only available
operations in A(G): the cut rule and the adjunction rule on a propositional variable.
In fact, we only need to keep nodes (i) that refer to conclusions of the proofnet that are
propositional variables or negations of propositional variables (a cut can be done on
such a literal), and (ii) that belong to subgraphs of the following form (corresponding
to the existence of a formula A o- A in the left part of a sequent, i.e., its negation
A ⊗ A⊥ in the associated one-sided sequent):
[subgraph diagram: an axiom-link on A and A⊥ whose conclusions are joined by a ⊗-link into A ⊗ A⊥]
We can then simplify the graph and replace the internal logical machinery by black 
boxes (shown in the figures as solid black circles). The conclusions of each basic 
proofnet are labeled: outputs (i.e., conclusions that are propositional variables) are 
drawn as closed half circles, inputs (i.e., conclusions that are the negation of proposi- 
tional variables) are drawn as open half circles. Plain lines link black boxes to black 
boxes or conclusions, and subgraphs corresponding to adjunction points are drawn 
as dashed lines. The previous proofnet is then redrawn as in Figure 11. We obviously 
find the derived tree (neglecting some minor differences). The logical proofnet can 
then be seen as an "explanation" of the structure of the tree, that is to say, the oper-
ations available on the tree are the result of focusing on what can be done on the
proofnet. On the one hand, the use of black boxes is necessary to clarify the structure 
of the analysis; on the other hand, this hides proof details that can be useful for some
linguistic operations (as is the case for adjunction with respect to the classical struc- 
ture of a derived tree). We show in the next subsection another application of such a 
(logical) refinement. 
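The order-independence of the two cuts of Figure 9 can be checked mechanically: on sequents represented as lists, the atomic cut rule is just a splice of one antecedent into another. A minimal sketch, under a hypothetical tuple encoding (letters are strings, ("<-", c, A) for c o- A, ("*", ...) for tensors), not the paper's implementation:

```python
# Atomic cut as antecedent splicing, mirroring how substitution splices
# one tree under a leaf of another.
def cut(gamma_seq, delta_seq, pos):
    """From Gamma |- a and Delta1, a, Delta2 |- A derive Delta1, Gamma, Delta2 |- A;
    pos is the index of the occurrence of a being cut."""
    gamma, a = gamma_seq
    delta, concl = delta_seq
    assert delta[pos] == a, "cut formula must match"
    return (delta[:pos] + gamma + delta[pos + 1:], concl)

saw = ([("<-", "S", ("*", "NP", "VP")), "NP", ("<-", "VP", "VP"),
        ("<-", "VP", ("*", "V", "NP")), ("<-", "V", "saw"), "saw", "NP"], "S")
john = ([("<-", "NP", "John"), "John"], "NP")
mary = ([("<-", "NP", "Mary"), "Mary"], "NP")

# Cut Mary into the object NP (index 6), then John into the subject NP (index 1):
one_order = cut(john, cut(mary, saw, 6), 1)
# The other order of cuts yields the same final sequent, as in Figure 9:
other_order = cut(mary, cut(john, saw, 1), 7)
print(one_order == other_order)  # True
```

Both orders produce the single sequent of the derived tree, which is exactly the equivalence that the proofnet of Figure 10 expresses.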
The last example discussed in this section is the analysis of the sentence John saw 
Mary today. The sequent associated to the adverb today is the following one:
today    VP o- VP ⊗ today, VP, today ⊢ VP
[Figure 10 appears here: the proofnet for John saw Mary, with the cut rules drawn as bold lines and the subnets associated to lexical items circled.]
Figure 10
John saw Mary.
[Figure 11 appears here: the simplified proofnet with black boxes, matching the derived tree for John saw Mary, with root S and leaves John, saw, Mary.]
Figure 11
A simplified proof for John saw Mary.
[Figure 12 appears here: the proofnet for John saw Mary today, with the adjunction rule drawn as a double-thick dashed line.]
Figure 12
John saw Mary today.
The logical analysis includes the two operations substitution and adjunction, i.e., 
two cut rules and an adjunction rule. In Figure 12 the adjunction rule is shown as 
a double-thick dashed line: this (logically) mimics the adjunction as it is shown in 
the derived tree given in Figure 13. Note that the adverb has to be placed after the 
complement (rightmost in the proofnet) in order to keep the graph planar. The proofnet 
in Figure 14 is the proofnet corresponding to a cut-free proof. 
[Figure 13 appears here: the derived tree for John saw Mary today, with root S and leaves John, saw, Mary, today.]
Figure 13
A simplified proof for John saw Mary today.
[Figure 14 appears here.]
Figure 14
Cut-free proofnet for John saw Mary today.
5.3 On Some Extensions 
As usual in lexicalized formalisms, TAG states rules to generate the lexicon from a 
basic set of descriptions. Among these, we find rules for passivization, interrogative 
forms or wh-sentences. We focus here on one example (namely who) to show to what 
extent the previous paradigm can be used also to logically interpret these lexical rules. 
We expect this will help in understanding the underlying mechanisms. The formula- 
tion we propose is the simplest one. This is also closely related to the approach used 
in categorial grammars (the raising rule is simply the introduction of an implication; 
see also Joshi and Kulick [1995] for such a relation and the way who can be defined).
Figures 15, 16, and 17 present proofnets and simplified proofnets for the two noun 
adjuncts who John meets and who meets John. The analysis of complete sentences includ- 
ing these adjuncts is then similar to the process developed in the previous section. 
The corresponding (provable) sequents are given below. The basic lexical descriptions
are the following (we have deleted the adjunction declarations for the sake of clarity; the
(logical) adjunction rule has to be slightly extended in order to take care of these new
structures):
John    NP o- John, John ⊢ NP
meets   S o- NP ⊗ VP, NP, VP o- V ⊗ NP, V o- meets, meets, NP ⊢ S
who     N o- N ⊗ who ⊗ (S o- NP), N, who, S o- NP ⊢ N
Let M(G) denote the set of the three previous sequents. From these basic descriptions,
the following entries are computed, i.e., the part of the lexicon relevant to these words
[Figure 15 appears here.]
Figure 15
Cut-free proofnet for who John meets.
[Figure 16 appears here.]
Figure 16
Cut-free proofnet for who meets John.
(M(G′) denotes this new set):
John          NP o- John, John ⊢ NP
meets         S o- NP ⊗ VP, NP, VP o- V ⊗ NP, V o- meets, meets, NP ⊢ S
who meets _   N o- N ⊗ who ⊗ (S o- NP), N, who, (S o- NP) o- VP, VP o- V ⊗ NP, V o- meets, meets, NP ⊢ N
who _ meets   N o- N ⊗ who ⊗ (S o- NP), N, who, S o- NP ⊗ VP, NP, VP o- V ⊗ NP, V o- meets, meets ⊢ N
It should be noted that the two sequents given below are provable in the calculus
from M(G′) (cut and adjunction rules only).
who meets John    N o- N ⊗ who ⊗ (S o- NP), N, who, (S o- NP) o- VP,
                  VP o- V ⊗ NP, V o- meets, meets, NP o- John, John ⊢ N
who John meets    N o- N ⊗ who ⊗ (S o- NP), N, who, S o- NP ⊗ VP, NP o- John, John,
                  VP o- VP, VP o- V ⊗ NP, V o- meets, meets ⊢ N
But they are not provable with the cut and adjunction rules from M(G). In other
words, we should consider the construction of the language in two steps. The first step 
is the construction of the lexicon (a TAG grammar) from a basic set of descriptions 
using complex rules. The second step is the closure of the TAG grammar with the cut 
and adjunction rules. This point of view needs to be further developed but could be 
a first approach to a complete integration of lexicon and grammar. 
[Figure 17 appears here: the two simplified proofs, matching the noun-adjunct trees for who John meets and who meets John.]
Figure 17
Simplified proofs for who John meets and who meets John.
6. Conclusion 
The use of logic as a framework to describe natural language is not a new idea. Works 
on Lambek calculus and logic programming are famous examples. However, linguistic 
formalisms have fundamentally evolved in the past two decades. Though theoretical 
research has been done on unification and attribute-value structures, operations on 
syntactic trees have been investigated mainly by comparing different solutions (Vijay- 
Shanker and Weir 1994a, 1994b). In this paper, we consider another way to look at 
these operations. We focus on the adjunction operation available in Tree Adjoining 
Grammars, as it seems to be the simplest way to augment the expressive power of a 
formalism. We prove that noncommutative intuitionistic linear logic is a good frame- 
work and we define a fragment equivalent to TAG. We show, furthermore, to what 
extent geometric representations of proofs (proofnets) may be useful in understanding 
how black boxes (i.e., relations between nodes in a syntactic tree) help simplify a parse 
but also hide interesting mechanisms. There is still a lot to do in this direction. For 
one thing, generalized categorial grammars also have to be logically investigated, the 
objective being to relate GCG operations to logical operations (completed if necessary). 
The preceding discussions also show the relationship between our point of view and 
the idea of quasi trees developed by Vijay-Shanker (1992). He proposes to consider 
partial descriptions of trees, i.e., adjunction nodes represented by means of loose rela- 
tions whose meaning is a domination relation. In this case, the adjunction operation is 
identified by a pair of substitution operations. The strong relation with what precedes 
is clear. However, in order to take into account exactly this presentation, the axiom
of identity A ⊢ A, where A is a propositional variable, must be added to the calculus
A(G) given in Section 4. In this way, adjunction nodes can be deleted from sequents.
In this new calculus, the following rule is satisfied: 
from A ⊢ A and Γ, A o- A, Δ ⊢ B, infer Γ, Δ ⊢ B (adjunction)
Hence, we obtain the following equivalence: 
Proposition 
A parse tree is correct 
iff the two nodes in a domination relation have the same label 
iff there is a proof whose conclusions that are propositional variables are 
the words of the sentence in the same order, and without any formula of 
the form A o- A. 
Appendix 
A.1 A Brief Description of Noncommutative Linear Logic 
Linear logic was introduced by Girard (1987) as a "resource conscious logic." In other 
words, though classical logic deals with static descriptions, linear logic considers 
propositions as finite resources. Hence, while "A" and "A and A" are equivalent in 
classical logic, this is (generally) not the case in linear logic. The easiest technical way 
to investigate this difference is to consider the Gentzen sequent calculus for these log- 
ics. A sequent is of the form Γ ⊢ Δ where Γ and Δ stand for sequences of formulas
well-formed with respect to the language of the logic. It expresses the fact that the
(multiplicative) disjunction of formulas in Δ is a consequence of the (multiplicative)
conjunction of formulas in F. Remember that a sequent calculus is a set of rules spec- 
ifying the provable sequents, given a set of axioms. A proof of a sequent is then the 
successive application of sequent rules beginning with axioms, i.e., a tree with the 
proved sequent as the root of the tree (at the bottom) and whose leaves are axioms 
(on top). Besides axioms and rules introducing connectives at the right or left part of 
a sequent, we find structural rules that govern the structure of a sequent. In classical 
logic, the set of structural rules consists in weakening, contraction, and exchange (see
Figure 18, where A, B are formulas and Γ, Γ′, Δ, Δ′ are sequences of formulas). Weakening
and contraction allow the arbitrary copying of formulas: having a formula A as
a hypothesis or conclusion is equivalent to having it twice (or more). This point of 
view contradicts the notion of resource, hence these two structural rules are omitted in 
linear logic. However, special connectives, namely the exponentials of-course "!" and
why-not "?", have these properties. The exchange rule is responsible for commutativity
of the comma (in the right side and in the left side): the order of hypotheses or con- 
clusions does not matter. This rule is no longer valid in the noncommutative version 
of linear logic. 
However, and this is already true in linear logic, the logical interpretation of "and" 
and "or" is not as simple as it is in classical logic. We need to distinguish two "and" (⊗
meaning 'times' and & meaning 'with') and two "or" (⅋ meaning 'par' and ⊕ mean-
ing 'plus'), hence inducing four constants: 1, ⊤, ⊥, 0 (respective neutral elements for
the previous connectives). In fact, connectives are related in such a way that they form
two groups: the multiplicative group (⊗, ⅋, 1, ⊥) and the additive group (&, ⊕, ⊤, 0).
Hereafter, we use only the multiplicative group. There are obviously fundamental rea-
sons for this proliferation but these explanations are outside the scope of this paper. 
Negation and implication are however of special interest. In (commutative) linear logic,
there is only one negation (·)⊥ and one (linear) implication ⊸. In the noncommutative
case, negation and implication have to be split: there are a prenegation ⊥(·) and a
postnegation (·)⊥, and a preimplication o- and a postimplication ⊸. These two
implications are related to the two operations of the Lambek calculus: ⊸ with \ and o- with /. The implications may
From Γ ⊢ Δ, infer Γ, A ⊢ Δ (l-weakening) and Γ ⊢ Δ, A (r-weakening).
From Γ, A, A ⊢ Δ, infer Γ, A ⊢ Δ (l-contraction); from Γ ⊢ Δ, A, A, infer Γ ⊢ Δ, A (r-contraction).
From Γ, B, A, Γ′ ⊢ Δ, infer Γ, A, B, Γ′ ⊢ Δ (l-exchange); from Γ ⊢ Δ, B, A, Δ′, infer Γ ⊢ Δ, A, B, Δ′ (r-exchange).
Figure 18
Structural rules.
be defined in the following way: B o- A ≡ B ⅋ ⊥A and A ⊸ B ≡ A⊥ ⅋ B. In Figure 19,
we give the one-sided sequent calculus for the multiplicative fragment of noncommu-
tative linear logic (N-LL), and in Figure 3 in Section 3, the two-sided sequent calculus
for the multiplicative fragment of intuitionistic noncommutative linear logic (N-ILL):
the sequent calculi for N-LL and N-ILL satisfy the cut elimination
theorem, i.e., for each proof there exists a cut-free proof with the same conclusion;
however, we make use of cut rules in Section 4. Note that if Γ ⊢ A is provable in the
multiplicative intuitionistic noncommutative linear logic, then ⊢ (Γ*)⊥, A* is provable
in the multiplicative noncommutative linear logic, where:
• for each formula A of intuitionistic noncommutative linear logic, A* is a
formula of noncommutative linear logic defined as follows
-- p* = p, for every propositional letter p
-- (B ⊗ C)* = B* ⊗ C*, (B ⊸ C)* = (B*)⊥ ⅋ C*,
(B o- C)* = B* ⅋ ⊥(C*)
• for each finite sequence A1, ..., An of formulas of intuitionistic
noncommutative linear logic, (A1, ..., An)* = (A1)*, ..., (An)*
• for each finite sequence A1, ..., An of formulas of noncommutative linear
logic, (A1, ..., An)⊥ = (An)⊥, ..., (A1)⊥
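The (·)* translation and the reversing negation on sequences are simple enough to transcribe. Below is a sketch under an assumed tagged-tuple encoding of formulas, with ("t", B, C) for B ⊗ C, ("-o", B, C) and ("o-", B, C) for the two implications, ("par", X, Y) for ⅋, ("pre", A) for ⊥A, and ("post", A) for A⊥; none of these tag names come from the paper.

```python
# Sketch of the (.)* translation from N-ILL formulas to N-LL formulas,
# and of the reversing postnegation on sequences (hypothetical encoding).
def star(f):
    """(.)* on formulas: letters unchanged; implications become pars."""
    if isinstance(f, str):
        return f                                   # p* = p
    op, b, c = f
    if op == "t":                                  # (B (x) C)* = B* (x) C*
        return ("t", star(b), star(c))
    if op == "-o":                                 # (B -o C)* = (B*)^perp par C*
        return ("par", ("post", star(b)), star(c))
    if op == "o-":                                 # (B o- C)* = B* par perp(C*)
        return ("par", star(b), ("pre", star(c)))

def neg_seq(fs):
    """(A1, ..., An)^perp = (An)^perp, ..., (A1)^perp: note the reversal."""
    return [("post", f) for f in reversed(fs)]

print(star(("o-", "NP", "John")))   # ('par', 'NP', ('pre', 'John'))
print(neg_seq(["A", "B"]))          # [('post', 'B'), ('post', 'A')]
```

The reversal in `neg_seq` is what keeps the one-sided sequent planar, matching the noncommutative regime.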
A.2 The Calculus A (a Fragment of N-ILL) (proofs)
In this section, we give the proofs of the various propositions presented in the paper.
We repeat the definitions and propositions for clarity.
Definition The Calculus A
• Alphabet of A: propositional letters a, b, ..., connectives ⊗, o-.
• Formulas: usual definition. A is a simple ⊗-formula iff A is a
propositional letter or A is a formula b1 ⊗ ... ⊗ bn where b1, ..., bn are
propositional letters. B is a o--formula iff B = a o- A where a is a
propositional letter and A is a simple ⊗-formula.
• Sequents: Γ ⊢ A, where Γ is a finite sequence of formulas and A is a
formula.
Alphabet:
• propositional letters: a, b, c, ...
• for each propositional letter p and each integer n > 0, the postnegations p⊥···⊥ (n times) and the prenegations ⊥···⊥p (n times)
• connectives: ⊗, ⅋
Formulas: usual definition
Sequents: ⊢ Γ, where Γ is a finite sequence of formulas
Metalinguistic definition of A⊥ and ⊥A s.t. ⊥(A⊥) = (⊥A)⊥ = A, for every formula A:
• (p⊥···⊥ [n times])⊥ = p⊥···⊥ [n+1 times], (⊥···⊥p [n times])⊥ = ⊥···⊥p [n-1 times]
• ⊥(p⊥···⊥ [n times]) = p⊥···⊥ [n-1 times], ⊥(⊥···⊥p [n times]) = ⊥···⊥p [n+1 times]
• (B ⊗ C)⊥ = C⊥ ⅋ B⊥, (B ⅋ C)⊥ = C⊥ ⊗ B⊥
• ⊥(B ⊗ C) = ⊥C ⅋ ⊥B, ⊥(B ⅋ C) = ⊥C ⊗ ⊥B
Rules of the sequent calculus:
• ⊢ A⊥, A (axiom)
• from ⊢ Γ1, A, Γ2 and ⊢ A⊥, Δ, infer ⊢ Γ1, Δ, Γ2 (cut-1); from ⊢ Γ, A and ⊢ Δ1, A⊥, Δ2, infer ⊢ Δ1, Γ, Δ2 (cut-2)
• from ⊢ Γ, A and ⊢ B, Δ, infer ⊢ Γ, A ⊗ B, Δ (the ⊗-rules (r1-⊗) and (r2-⊗))
• from ⊢ Δ1, A, B, Δ2, infer ⊢ Δ1, A ⅋ B, Δ2 (r-⅋)
Figure 19
Language and sequent calculus for multiplicative noncommutative linear logic.
• Sequent calculus:
Axiom: a ⊢ a
Rules:
from Γ ⊢ A and Δ ⊢ B, infer Γ, Δ ⊢ A ⊗ B (⊗)
from Γ ⊢ A and Γ1, C, Γ2 ⊢ B, infer Γ1, C o- A, Γ, Γ2 ⊢ B (o-)
In the following, we only consider sequents such that the formulas in the left side are
either propositional letters or o--formulas. So, in the rule introducing o-, C stands
for a propositional letter. This amounts to considering trees: C is the (unique) mother
and the ⊗-formula A is the sequence of its daughters.
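The well-formedness restriction can be stated as a small predicate. A sketch, using the same hypothetical tuple encoding as elsewhere in these notes (letters as strings, ("*", b1, ..., bn) for tensors, ("<-", c, A) for c o- A); this is an illustration, not the paper's formalism.

```python
# Sketch of the formula restriction of the calculus A.
def is_simple_tensor(f):
    """A simple (x)-formula: a letter, or a tensor of letters only."""
    return isinstance(f, str) or (f[0] == "*" and
                                  all(isinstance(b, str) for b in f[1:]))

def is_left_formula(f):
    """Left sides may contain letters and o--formulas c o- A with c a letter
    (the mother) and A a simple tensor (the daughters)."""
    if isinstance(f, str):
        return True
    return f[0] == "<-" and isinstance(f[1], str) and is_simple_tensor(f[2])

print(is_left_formula(("<-", "S", ("*", "NP", "VP"))))  # True: S is the mother
print(is_left_formula(("<-", ("*", "a", "b"), "c")))    # False: mother not a letter
```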
Proposition Calculus A
1. If Γ ⊢ A ⊗ B is provable in A, then
• A and B are simple ⊗-formulas;
• there is a unique pair (Γ1, Γ2) s.t. Γ = Γ1, Γ2 and both the
sequents Γ1 ⊢ A and Γ2 ⊢ B are provable in A.
Proof
By induction on the proof π of Γ ⊢ A ⊗ B in A. Note that π cannot be an axiom.
• If the last rule in π is (⊗), then π ends with premises Ψ ⊢ C and Φ ⊢ D and
conclusion Ψ, Φ ⊢ C ⊗ D, with Γ = Ψ, Φ and A ⊗ B = C ⊗ D (disregarding the brackets).
If A = C and B = D, the result is obvious. If A ⊗ C′ = C and C′ ⊗ D = B,
then by induction hypothesis there exist unique Ψ1 and Ψ2 such that
Ψ1 ⊢ A and Ψ2 ⊢ C′ are provable, with Ψ = Ψ1, Ψ2; and then, by (⊗),
Ψ2, Φ ⊢ B is provable, so that Γ1 = Ψ1 and Γ2 = Ψ2, Φ are unique and
satisfy the property. If A = C ⊗ D′ and D′ ⊗ B = D, then by induction
hypothesis there exist unique Φ1 and Φ2 such that Φ1 ⊢ D′ and Φ2 ⊢ B
are provable, with Φ = Φ1, Φ2; and then, by (⊗), Ψ, Φ1 ⊢ A is provable, so
that Γ1 = Ψ, Φ1 and Γ2 = Φ2 are unique and satisfy the property.
• If the last rule in π is (o-), then π ends with premises Φ ⊢ C and Ψ1, a, Ψ2 ⊢ A ⊗ B
and conclusion Γ ⊢ A ⊗ B, with Γ = Ψ1, a o- C, Φ, Ψ2. We apply the induction hypothesis to
Ψ1, a, Ψ2 ⊢ A ⊗ B (a proof shorter than π). If Ψ1, a, Λ ⊢ A and Λ′ ⊢ B are
provable with Ψ2 = Λ, Λ′, then by (o-) Ψ1, a o- C, Φ, Λ ⊢ A is provable, so that
Γ1 = Ψ1, a o- C, Φ, Λ and Γ2 = Λ′ are unique and satisfy the property. If
Λ ⊢ A and Λ′, a, Ψ2 ⊢ B are provable with Ψ1 = Λ, Λ′, then by (o-)
Λ′, a o- C, Φ, Ψ2 ⊢ B is provable, so that Γ1 = Λ and Γ2 = Λ′, a o- C, Φ, Ψ2 are
unique and satisfy the property.
2. If Γ, a o- A, Δ ⊢ B is provable in A, then
• A and B are simple ⊗-formulas;
• there is a unique pair (Δ1, Δ2) s.t. Δ = Δ1, Δ2 and both the
sequents Δ1 ⊢ A and Γ, a, Δ2 ⊢ B are provable in A.
Such a pair (Δ1, Δ2) will be called "the splitting pair for Δ in
Γ, a o- A, Δ ⊢ B."
Proof 
By induction on the proof π of Γ, a o- A, Δ ⊢ B in A.
This pair can be computed easily: the first element Δ1 of the splitting pair must
satisfy a counting condition on each propositional variable occurring in it, as defined
below. This property will enable us to consider an adjunction rule based on such
splitting pairs. 
Definition
Let A be a simple ⊗-formula or a o--formula (calculus A) and a a propositional vari-
able. The number of positive occurrences p(a, A) (and negative occurrences n(a, A)) of
a in A is defined by:
• if A = a then p(a, A) = 1, n(a, A) = 0
• if A = b and b is a propositional variable distinct from a, then p(a, A) = 0,
n(a, A) = 0
• if A = B ⊗ C, then p(a, A) = p(a, B) + p(a, C), n(a, A) = n(a, B) + n(a, C)
• if A = B o- A1 ⊗ ... ⊗ An, then p(a, A) = p(a, B) and
n(a, A) = p(a, A1 ⊗ ... ⊗ An), as A1, ..., An are simple ⊗-formulas, cf. the
calculus A.
Let S be a sequent C1, ..., Cn ⊢ A whose formulas are as in the calculus A (⊗- and o--formulas):
• p(a, S) = p(a, A) + n(a, C1) + ... + n(a, Cn)
• n(a, S) = p(a, C1) + ... + p(a, Cn)
It is easy to prove (for S provable in the calculus A), by induction on a proof of S, that (i)
for each propositional variable a occurring in S, p(a, S) = n(a, S), and also that (ii) if S
is the sequent C1, ..., Cn ⊢ A then Cn is a propositional variable (we denote this variable
by e(S)). Moreover, for k ≤ n, if we denote the sequent C1, ..., Ck ⊢ A by Sk, then (iii)
p(a, Sk) ≥ n(a, Sk). We can then deduce that (iv), for k < n, there exists at least one
propositional variable a s.t. p(a, Sk) > n(a, Sk). Note that p(e(S), Sn-1) > n(e(S), Sn-1).
Proposition
Let S: Γ, B o- C, D1, ..., Dn ⊢ A be a provable sequent in A. Then the splitting pair
for D1, ..., Dn in S is uniquely determined by the sequent S′: D1, ..., Dj ⊢ C, j ≤ n,
such that for each propositional variable a occurring in S′, the following condition is
satisfied:
p(a, S′) = n(a, S′)
Proof 
Note that S′ is provable. Hence property (i) is true for such a sequent. The uniqueness
results from property (iv) stated previously.
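The counting condition makes the splitting pair effectively computable: scan the prefixes D1, ..., Dj until every propositional variable is balanced. A sketch under a hypothetical tuple encoding (letters as strings, ("*", ...) for tensors, ("<-", b, A) for b o- A); this is an illustration of the proposition, not the paper's algorithm.

```python
# Sketch: computing the splitting pair via the counting condition.
def p(a, f):
    """Positive occurrences of letter a in formula f."""
    if isinstance(f, str):
        return int(f == a)
    if f[0] == "*":
        return sum(p(a, g) for g in f[1:])
    return p(a, f[1])                 # ("<-", b, A): the head b is positive

def n(a, f):
    """Negative occurrences of a: only the body of a retro-implication."""
    return 0 if isinstance(f, str) or f[0] == "*" else p(a, f[2])

def balanced(a, ante, concl):
    """p(a, S) == n(a, S) for the sequent ante |- concl."""
    return p(a, concl) + sum(n(a, c) for c in ante) == sum(p(a, c) for c in ante)

def letters_of(f, acc):
    if isinstance(f, str):
        acc.add(f)
    else:
        for g in f[1:]:
            letters_of(g, acc)
    return acc

def splitting_pair(ds, c):
    """The unique (D1, D2) such that D1 |- c is balanced for every letter."""
    letters = set()
    for f in list(ds) + [c]:
        letters_of(f, letters)
    for j in range(len(ds) + 1):
        if all(balanced(a, ds[:j], c) for a in letters):
            return ds[:j], ds[j:]
    raise ValueError("no splitting pair: sequent not provable?")

# For the VP adjunction node of "saw", D = VP o- V*NP, V o- saw, saw, NP:
ds = [("<-", "VP", ("*", "V", "NP")), ("<-", "V", "saw"), "saw", "NP"]
d1, d2 = splitting_pair(ds, "VP")
print(len(d1), d2)  # 4 []
```

Here the first element of the splitting pair is the whole verb-phrase material (indeed VP o- V ⊗ NP, V o- saw, saw, NP ⊢ VP is provable), which is what adjunction at VP must surround.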
3. The calculus A is closed under the atomic cut rule:
from Γ ⊢ a and Δ1, a, Δ2 ⊢ A, infer Δ1, Γ, Δ2 ⊢ A (cut),
i.e., if the sequents Γ ⊢ a and Δ1, a, Δ2 ⊢ A are provable in A, then the
sequent Δ1, Γ, Δ2 ⊢ A is also provable in A.
Proof
By induction on the proof π of Γ ⊢ a, using properties 1 and 2. If π is an axiom,
the result is trivial. If π is not an axiom, the last rule in π is (o-), and so π ends with
premises Ψ ⊢ B and Φ1, b, Φ2 ⊢ a and conclusion Γ ⊢ a, with Γ = Φ1, b o- B, Ψ, Φ2.
By induction hypothesis, since Φ1, b, Φ2 ⊢ a is provable (with a shorter proof than π)
and Δ1, a, Δ2 ⊢ A is provable, then Δ1, Φ1, b, Φ2, Δ2 ⊢ A is provable; applying (o-) to
Ψ ⊢ B and Δ1, Φ1, b, Φ2, Δ2 ⊢ A, we get Δ1, Φ1, b o- B, Ψ, Φ2, Δ2 ⊢ A, i.e., Δ1, Γ, Δ2 ⊢ A.
4. The calculus A is closed under the adjoining rule:
from Γ1, a, Γ2 ⊢ a and Λ, a o- a, Δ ⊢ b, infer Λ, Γ1, Δ1, Γ2, Δ2 ⊢ b (adj),
where (Δ1, Δ2) is the splitting pair of Δ in Λ, a o- a, Δ ⊢ b.
Proof
Indeed, suppose the sequents Γ1, a, Γ2 ⊢ a and Λ, a o- a, Δ ⊢ b are provable in A.
Since Λ, a o- a, Δ ⊢ b is provable, by property 2 there is a unique pair (Δ1, Δ2)
s.t. Δ = Δ1, Δ2 and both the sequents Δ1 ⊢ a and Λ, a, Δ2 ⊢ b are provable in A. Now
since Γ1, a, Γ2 ⊢ a and Λ, a, Δ2 ⊢ b are provable in A, by property 3 the sequent
Λ, Γ1, a, Γ2, Δ2 ⊢ b is also provable in A; and now, since Δ1 ⊢ a and Λ, Γ1, a, Γ2, Δ2 ⊢ b
are provable in A, the sequent Λ, Γ1, Δ1, Γ2, Δ2 ⊢ b is also provable in A.
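The adjoining rule can likewise be read as a list operation: the splitting pair tells us where to re-enter the material that the auxiliary tree surrounds. A sketch with the same hypothetical tuple encoding; here `adj_pos` and `split_at` are supplied by hand, whereas the proposition above determines `split_at` via the counting condition.

```python
# Sketch of the adjoining rule: from Gamma1, a, Gamma2 |- a and
# Lambda, a o- a, Delta |- b, with splitting pair (Delta1, Delta2) of Delta,
# derive Lambda, Gamma1, Delta1, Gamma2, Delta2 |- b.
def adjoin(aux, main, adj_pos, split_at):
    """aux = (Gamma1 + [a] + Gamma2, a); main has a o- a at index adj_pos;
    split_at cuts Delta into (Delta1, Delta2)."""
    g, a = aux
    i = g.index(a)                      # position of the foot node a
    gamma1, gamma2 = g[:i], g[i + 1:]
    ante, b = main
    lam, delta = ante[:adj_pos], ante[adj_pos + 1:]
    d1, d2 = delta[:split_at], delta[split_at:]
    return (lam + gamma1 + d1 + gamma2 + d2, b)

# Adjoining "today" (VP o- VP (x) today, VP, today |- VP) at the VP node of "saw":
today = ([("<-", "VP", ("*", "VP", "today")), "VP", "today"], "VP")
saw = ([("<-", "S", ("*", "NP", "VP")), "NP", ("<-", "VP", "VP"),
        ("<-", "VP", ("*", "V", "NP")), ("<-", "V", "saw"), "saw", "NP"], "S")
result = adjoin(today, saw, adj_pos=2, split_at=4)
print([f for f in result[0] if isinstance(f, str)])  # ['NP', 'saw', 'NP', 'today']
```

The yield places the adverb after the complement, as in the derived tree of Figure 13.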
Definition The calculus A(G)
Let G be a family of labeled trees of depth 1, not of the form [tree diagram not reproduced]. Let T(G) be the
closure of G under the rules:
• substitution, with or without the declaration of a new possibly internal
point on which the adjoining operation may be performed;
• adjoining operation.
A(G) is the calculus obtained from A as follows:
• propositional letters are exactly all the labels of the trees in G,
• the rule (o-) is restricted as follows:
from Γ ⊢ A and Γ1, a, Γ2 ⊢ B, infer Γ1, a o- A, Γ, Γ2 ⊢ B (o-, G)
where A, B are simple ⊗-formulas of A(G), a is a propositional letter of
A(G), and one of the following cases occurs:
-- A is a;
-- A is a propositional letter b different from a, and the tree with root a and single daughter b belongs to G;
-- A is b1 ⊗ ... ⊗ bn, and the tree with root a and daughters b1, ..., bn belongs to G.
Proposition Calculus A(G)
Properties 1-4 of A are also properties of A(G). Moreover, the following properties
hold for A(G):
• To T ∈ T(G), we associate a sequent Seq(T) of A(G) s.t.:
(i) if a is the root of T, and the terminal points of T (ordered from
left to right) are a1, ..., am, then Seq(T) is
Γ ⊢ a
where in Γ the sequence of all the occurring propositional
variables is a1, ..., am and in Γ there is a formula c o- c iff c is a
possibly internal point of T on which the adjoining operation
may be performed;
(ii) Seq(T) is provable in A(G).
Proof
By induction on the class of all the trees of T(G).
Let T ∈ G, i.e., T is the depth-1 tree with root a and daughters b1, ..., bn.
Define Seq(T) ≡ a o- b1 ⊗ ... ⊗ bn, b1, ..., bn ⊢ a. Trivially, Seq(T) satisfies (i) and
(ii).
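For this base case, Seq(T) is immediate to construct. A small sketch under the hypothetical tuple encoding used above (not the paper's notation):

```python
# Seq(T) for a depth-1 tree with root a and daughters b1 ... bn:
# a o- b1 (x) ... (x) bn, b1, ..., bn |- a.
def seq_of_depth1(root, daughters):
    body = daughters[0] if len(daughters) == 1 else ("*",) + tuple(daughters)
    return ([("<-", root, body)] + list(daughters), root)

print(seq_of_depth1("S", ["NP", "VP"]))
# ([('<-', 'S', ('*', 'NP', 'VP')), 'NP', 'VP'], 'S')
```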
Let T be a tree obtained from a tree T1 ∈ T(G) with root a and a tree T2 ∈ T(G)
with a terminal point a, by substitution with the declaration that a is a point in T on
which the adjoining operation may be performed. Suppose b is the root of T2, and so b
is the root of T. By induction hypothesis, to T1 is associated a sequent Seq(T1) ≡ Γ ⊢ a
satisfying (i) and (ii), and to T2 is associated a sequent Seq(T2) satisfying (i) and (ii),
so that Seq(T2) ≡ Δ1, a, Δ2 ⊢ b where all the terminal points of T2 before a occur in
Δ1 in the same order as in T2 and all the terminal points of T2 after a occur in Δ2 in
the same order as in T2. Define Seq(T) ≡ Δ1, a o- a, Γ, Δ2 ⊢ b. It is easy to prove that
Seq(T) satisfies (i). Seq(T) is obtained from Seq(T1) and Seq(T2) by using (o-, G), so
that it is provable in A(G) since Seq(T1) and Seq(T2) are provable in A(G) by induction
hypothesis.
Let T be a tree obtained from a tree T1 ∈ T(G) with root a and a tree T2 ∈ T(G)
with a terminal point a, by substitution without the declaration that a is in T a point on
which the adjoining operation may be performed. Suppose b is the root of T2, and so b
is the root of T. By induction hypothesis, to T1 is associated a sequent Seq(T1) ≡ Γ ⊢ a
satisfying (i) and (ii), and to T2 is associated a sequent Seq(T2) satisfying (i) and (ii),
so that Seq(T2) ≡ Δ1, a, Δ2 ⊢ b where all the terminal points of T2 before a occur in
Δ1 in the same order as in T2 and all the terminal points of T2 after a occur in Δ2 in
the same order as in T2. Define Seq(T) ≡ Δ1, Γ, Δ2 ⊢ b. It is easy to prove that Seq(T)
satisfies (i). Seq(T) is obtained from Seq(T1) and Seq(T2) by using the atomic cut rule,
so that by property 3 it is provable in A(G) since Seq(T1) and Seq(T2) are provable in
A(G) by induction hypothesis.
Let T be a tree obtained by the adjoining operation from a tree T1 ∈ T(G) with root
a and a terminal point a, and a tree T2 ∈ T(G) with a possibly internal point a on
which the adjoining operation may be performed. Suppose b is the root of T2, and
so b is the root of T. By induction hypothesis, to T1 is associated a sequent Seq(T1)
satisfying (i) and (ii), so that Seq(T1) ≡ Γ1, a, Γ2 ⊢ a where all the terminal points of
T1 before a occur in Γ1 in the same order as in T1 and all the terminal points of T1
after a occur in Γ2 in the same order as in T1; and to T2 is associated a sequent Seq(T2)
satisfying (i) and (ii) so that Seq(T2) ≡ Λ, a o- a, Δ ⊢ b. Since Seq(T2) is provable
in A(G) by induction hypothesis, by property 2 there is a unique pair (Δ1, Δ2)
s.t. Δ = Δ1, Δ2 and the sequents Δ1 ⊢ a and Λ, a, Δ2 ⊢ b are both provable in A(G).
Define Seq(T) ≡ Λ, Γ1, Δ1, Γ2, Δ2 ⊢ b. It is easy to prove that Seq(T) satisfies (i). Seq(T)
is obtained from Seq(T1) and Seq(T2) by using the adjoining rule, so that by property 4
it is provable in A(G) since Seq(T1) and Seq(T2) are provable in A(G) by induction
hypothesis.
• To every provable sequent Γ ⊢ A in A(G), we associate Tree(Γ ⊢ A) s.t.:
-- if A is a propositional letter, then Tree(Γ ⊢ A) ∈ T(G) where the
root is A, the terminal points (from left to right) are exactly all
the propositional letters occurring in Γ and in the same order in
which they occur in Γ, and the possibly internal points on which
the adjoining operation may be performed are exactly all the
propositional letters c s.t. c o- c occurs in Γ;
-- if A is b1 ⊗ ... ⊗ bn, and so Γ = Γ1, ..., Γn with the sequents Γi ⊢ bi
provable in A(G) for every 1 ≤ i ≤ n, then Tree(Γ ⊢ A) is a
sequence T1, ..., Tn of trees ∈ T(G), s.t. Ti = Tree(Γi ⊢ bi).
Proof
By induction on the proof π of Γ ⊢ A.
Tree(a ⊢ a) = a.
If A = B ⊗ C and the last rule of π is (⊗) with principal formula B ⊗ C and premises
Γ1 ⊢ B and Γ2 ⊢ C, then Tree(Γ ⊢ B ⊗ C) = Tree(Γ1 ⊢ B), Tree(Γ2 ⊢ C).
If the last rule of π is (o-, G) with principal formula a o- a and premises Γ ⊢ a
and Δ1, a, Δ2 ⊢ b, then Tree(Δ1, a o- a, Δ2 ⊢ b) is the tree obtained by substitution from
Tree(Γ ⊢ a) and Tree(Δ1, a, Δ2 ⊢ b) with the declaration that the possibly internal point
a is a point on which the adjoining operation may be performed.
If the last rule of π is (o-, G) with principal formula a o- a and premises Γ ⊢ a and
Δ1, a, Δ2 ⊢ b1 ⊗ ... ⊗ bn, and a occurs in Γi s.t. Γi ⊢ bi, then Tree(Δ1, a o- a, Δ2 ⊢ b1 ⊗
... ⊗ bn) is obtained from Tree(Δ1, a, Δ2 ⊢ b1 ⊗ ... ⊗ bn) = Tree(Γ1 ⊢ b1), ..., Tree(Γn ⊢ bn)
by replacing Tree(Γi ⊢ bi) by the tree obtained by substitution from Tree(Γ ⊢ a) and
Tree(Γi ⊢ bi) with the declaration that the possibly internal point a is a point on which
the adjoining operation may be performed.
If the last rule of π is (o-, G) with principal formula a o- A and premises Γ ⊢ A
and Δ1, a, Δ2 ⊢ b, then Tree(Δ1, a o- A, Δ2 ⊢ b) is the tree obtained as follows: first add
a root a common to all the trees of Tree(Γ ⊢ A), by using the link from a to b1, ..., bn
(if A = b1 ⊗ ... ⊗ bn) or the link from a to b (if A = b), and then compose this tree with
the tree Tree(Δ1, a, Δ2 ⊢ b).
If the last rule of π is (o-, G) with principal formula a o- A and premises Γ ⊢ A and
Δ1, a, Δ2 ⊢ b1 ⊗ ... ⊗ bn, and a occurs in Γi s.t. Γi ⊢ bi, then Tree(Δ1, a o- A, Δ2 ⊢ b1 ⊗ ... ⊗
bn) is obtained from Tree(Δ1, a, Δ2 ⊢ b1 ⊗ ... ⊗ bn) = Tree(Γ1 ⊢ b1), ..., Tree(Γn ⊢ bn) by
replacing Tree(Γi ⊢ bi) by the tree obtained as above from Tree(Γ ⊢ A) and Tree(Γi ⊢ bi).
• If Γ ⊢ a is provable in A(G), then Seq(Tree(Γ ⊢ a)) = Γ ⊢ a. If T is a tree of
G, then Tree(Seq(T)) = T.
• Let M be a set of provable sequents in A(G). Define CL(M) as follows:
-- M ⊆ CL(M);
-- (closure under atomic cut rule) if Γ ⊢ a ∈ CL(M) and
Δ1, a, Δ2 ⊢ B ∈ CL(M), then Δ1, Γ, Δ2 ⊢ B ∈ CL(M);
-- (closure under adjoining operation) if Γ1, a, Γ2 ⊢ a ∈ CL(M) and
Λ, a o- a, Δ ⊢ b ∈ CL(M), then Λ, Γ1, Δ1, Γ2, Δ2 ⊢ b ∈ CL(M), where
(Δ1, Δ2) is the splitting pair of Δ in Λ, a o- a, Δ ⊢ b;
-- nothing else belongs to CL(M).
• If Γ ⊢ A ∈ CL(M), then Γ ⊢ A is provable in A(G).
Proof
By induction on CL(M).
If Γ ⊢ A ∈ M, then by hypothesis Γ ⊢ A is provable.
If Γ ⊢ A is obtained from two other sequents by the atomic cut rule, then Γ ⊢ A is
provable by property 3 since (by induction hypothesis) the two sequents are provable.
If Γ ⊢ A is obtained from two other sequents by the adjoining operation, then Γ ⊢ A is
provable by property 4 since (by induction hypothesis) the two sequents are provable.
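CL(M) can be enumerated breadth-first. The sketch below implements only the atomic-cut clause (the adjoining clause is omitted for brevity) and bounds the search depth, since the full closure is infinite in general; the tuple encoding is our hypothetical one, not the paper's.

```python
# Sketch: bounded enumeration of the cut-closure of a set of sequents.
# Sequents are hashable pairs (antecedent tuple, conclusion);
# ("<-", c, A) encodes c o- A, ("*", ...) a tensor of letters.
def atomic_cuts(s1, s2):
    """All ways of cutting s1 = Gamma |- a into s2 = Delta1, a, Delta2 |- B."""
    (gamma, a), (delta, b) = s1, s2
    for i, f in enumerate(delta):
        if f == a:
            yield (delta[:i] + gamma + delta[i + 1:], b)

def closure(m, depth):
    seen = set(m)
    for _ in range(depth):
        new = {s for s1 in seen for s2 in seen
               for s in atomic_cuts(s1, s2) if s not in seen}
        if not new:
            break
        seen |= new
    return seen

john = ((("<-", "NP", "John"), "John"), "NP")
mary = ((("<-", "NP", "Mary"), "Mary"), "NP")
saw = ((("<-", "S", ("*", "NP", "VP")), "NP", ("<-", "VP", "VP"),
       ("<-", "VP", ("*", "V", "NP")), ("<-", "V", "saw"), "saw", "NP"), "S")
cl = closure({john, mary, saw}, depth=2)

# The sequent of the derived tree for "John saw Mary" (Figure 9) is reached:
target = ((("<-", "S", ("*", "NP", "VP")), ("<-", "NP", "John"), "John",
           ("<-", "VP", "VP"), ("<-", "VP", ("*", "V", "NP")),
           ("<-", "V", "saw"), "saw", ("<-", "NP", "Mary"), "Mary"), "S")
print(target in cl)  # True
```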
• If G′ ⊆ G, let T(G′) be the closure of G′ under:
-- substitution;
-- adjoining operation.
Clearly, T(G′) ⊆ T(G). Let M = {Seq(T) / T ∈ G′}; then
CL(M) = {Seq(T) / T ∈ T(G′)}.
Proof 
The proof follows from the previous results.
References 
Abeillé, Anne, K. Bishop, S. Cote, and Yves
Schabes. 1990. A lexicalized tree-adjoining 
grammar for English. Technical Report 
MS-CIS-90-24, LINC LAB 170, Computer 
Science Department, University of 
Pennsylvania, Philadelphia, PA. 
Abrusci, Michele. 1991. Phase semantics and 
sequent calculus for pure 
noncommutative classical linear 
propositional logic. The Journal of Symbolic 
Logic, 56(4):1403-1451.
Abrusci, Michele. 1995. Noncommutative 
proof nets. In Jean-Yves Girard, Yves 
Lafont, and Laurent Regnier, editors, 
Advances in Linear Logic, volume 222. 
Cambridge University Press, 
pages 271-296. Proceedings of the 
Workshop on Linear Logic, Ithaca, NY, 
June 1993. 
Bar-Hillel, Yoshua. 1953. A 
quasi-arithmetical notation for syntactic 
description. Language, 29:47-58. 
Girard, Jean-Yves. 1987. Linear logic. 
Theoretical Computer Science, 50:1-102. 
Joshi, Aravind K. and Seth Kulick. 1995. 
Partial proof trees as building blocks for a 
categorial grammar. In Glyn Morrill and 
Richard T. Oehrle, editors, Formal 
Grammar, Proceedings of the Conference of the 
European Summer School of Logic, Language 
and Information, Barcelona, August. Also 
as Technical Report, Institute for Research 
in Cognitive Science, University of 
Pennsylvania, Philadelphia, PA, March 
1996. 
Joshi, Aravind K., Leon S. Levy, and M.
Takahashi. 1975. Tree adjunct grammars.
Journal of Computer and System Sciences, 
10(1):136-163. 
Kroch, Anthony S. and Aravind K. Joshi. 
1985. Linguistic relevance of tree 
adjoining grammars. Technical Report 
MS-CIS-85-18, LINC LAB 170, Computer 
Science Department, University of 
Pennsylvania, Philadelphia, PA. 
Lambek, Joachim. 1958. The mathematics of 
sentence structure. American Math. 
Monthly, 65:154-169. 
Roorda, Dirk. 1992. Proof nets for Lambek 
calculus. Journal of Logic and Computation, 
2(2):211-231. 
Vijay-Shanker, K. 1992. Using descriptions 
of trees in a tree adjoining grammar. 
Computational Linguistics, 18(4):481-517. 
Vijay-Shanker, K. and Aravind K. Joshi. 
1985. Some computational properties of 
tree adjoining grammars. In Proceedings of 
the 23rd Annual Meeting, pages 82-93. 
Association for Computational 
Linguistics. 
Vijay-Shanker, K. and David J. Weir. 1994a. 
The equivalence of four extensions of 
context-free grammars. Mathematical 
Systems Theory, 27:511-545. 
Vijay-Shanker, K. and David J. Weir. 1994b. 
Parsing some constrained grammar 
formalisms. Computational Linguistics, 
19(4):591-636. 
