Parsing French with Tree Adjoining Grammar: 
some linguistic accounts 
Anne ABEILLE* 
Department of Computer and Information Science 
University of Pennsylvania 
Philadelphia PA 19104-6389 USA 
Abstract 
We present the first sizable grammar written for TAG. 
We present the linguistic coverage of our grammar, and 
explain the linguistic reasons which lead us to choose the 
particular representations. We show that TAG formal- 
ism provides sufficient constraints for handling most of 
the linguistic phenomena, with minimal linguistic stipula- 
tions. We first state the basic structures needed for pars- 
ing French, with a particular emphasis on TAG's extended 
domain of locality that enables us to state complex sub- 
categorizaeion phenomena in a natural way. We then give 
a detailed analysis of sentential complements, because it 
has lead us to introduce substitution in the formalism, 
and because TAG makes interesting predictions. We dis- 
cuss the different linguistic phenomena corresponding to 
adjunction and to substitution respectively. We then movc 
on to support verb constructions, which are represented in 
a TAG in a simpler way than the usual double analysis. 
It is the first time support verb constructions are handled 
in a parser. We lastly give an overview of the treatment 
of adverbs, and suggest a treatment of idioms which make 
them fall into the same representations as 'free' structures. 
Introduction 
Tree Adjoining Grammar (TAG) was introduced by/Joshi 
et al. 1975/ as a formalism for linguistic description. A 
TAG's basic component is a finite set of elementary trees, 
each of which is a domain of locality, and can be viewed 
as a minimal linguistic structure. 
A TAG comprises of two kinds of elementary trees: 
initial trees (a), which are complete structures, usually 
rooted in S, with preterminals on all their leaves, and aux- 
iliary trees (fl), which are constrained to have exactly one 
leaf node labeled with a non-terminal of the same cate- 
gory as their root node. We have added lexical trees (6), 
which are initial trees corresponding to arguments. Their 
insertion in preterminal nodes of elementary trees, which 
serves as psedicates, is obligatory. 
Sentences of the language of a TAG are derived from the 
composition of an initial tree and any number of auxiliary 
trees by an operation called 'adjunction'. Adjunction in- 
*Visiting from University of Paris VII. This work was partially 
supported by a J. W. Zellidja grant, and also by ARO grant 
DAA29-84-9-007, DARPA grant N0014-85-K0018, NSF grants MGS- 
82-191169 slid DCR-84-10413 to the University of Pennsylvania. 
Thanks are due to Aravind Joshl, Anthony Kroch, and Yves Schahes. 
serts an auxiliary tree at one of the corresponding nodes of 
an elementary or a derived tree. Recursion is provided by 
the structure of the auxiliary trees which can adjoin into 
themselves. Adjunction allows the insertion of a complete 
structure at an interior node of another complete struc- 
ture. It appears to be a natural way of handling adverbs 
and modifiers in natural language. Three constraints can 
be associated to any node of an elementary tree : null 
adjunction (NA), obligatory adjunction (OA), and selec- 
tive adjunction (SA). Because of the formal properties of 
adjunction, the formalism is more powerful them Context- 
I~ee Gran~nar, .but only mildly so/Joshi 85/. Most of its 
linguistic properties come from the fact that it factors re- 
cursion from local dependencies. We are thus able to local- 
ize all dependencies such as subcategorization, agreement, 
and filler-gap relations. Because trees, and not categories, 
are considered ms the units of the grammar, TAGs have 
a broader domain of locality than usual phrase structure 
rules. 
We have added substitution to the formalism, essen- 
tially for descriptive purposes. Although adjunction is 
more powerful than substitution, and could be used to 
simulate it, it seems more natural to have substitution it- 
self for lexical insertion and for constructions in which the 
extra power of adjunction is not needed (section 2). We 
define a restrictive use of substitution: it inserts an ini- 
tial tree (or a tree derived from an initial tree), or a lex- 
ieal tree, into an elementary tree. Substitution is always 
obligatory and only one constraint, selectional substitu- 
tion, is defined. This improves the descriptive power of 
the formalism without changing its generative capacity. 
Features structures can be associated with each node 
of an elementary tree /Vijay-Shanker 87/. They permit 
the dynamic assignment of constraints. Features are also 
used for constraining the lexieal insertion of items such 
as prepositions in verbal complements, coinplementizer of 
sentential complement or determiner in NP. 
Our grammar currently covers the major basic and de- 
rived constructions, such as wh-question, relativization or 
cleft-extraction. We are also able to handle neutral and 
reciprocal verbs, middle and locative alternations , as well 
as argument reordering such as scrambling or heavy-NP 
shift. We refer the reader to /Abeilld 88b/ for a more 
complete presentation of the grammar. In this paper, we 
focus on some constructions which are of particular lin- 
guistic significance. 
1 Elementary trees and minimal 
linguistic structures 
Our framework is that of a lexicon-grammar /Gross 75/ 
and /Gross 81/. We view all basic structures as being 
produced by a lexical item in the lexicon. We adopt the 
notation of/Gross 75/and/Boons , Guillet, Lecl~re 76/. 
In this framework, as in a TAG, the linguistic unit is the 
sentence. We define 40 basic structures for French : 12 for 
verbs taking NP essential complements, and 28 for verbs 
taking sentential complements. 
1.1 Elementary trees for basic construc- 
tions 
Each of the first i2 structures are represented in 
the grammar by three initial trees corresponding to 
declarative sentences, complement clauses, and infini- 
tive clauses. Corresponding to No V N1 , we have 1. 
0( 4 S 
NPo,L VP 
N V SPl$ 
I I I Jean sims S 
Merle 
*eL J¢$ 
SOA SOA 
G 8 
q.el flPo~VP NP°I AVP 
\[ //~ PRO V SPlJ. 
. v .p,* I I I I ! 
elmer N Jean aires 
i I MJrle Merle 
NP's are substituted at the proper nodes in the trees. The 
structures a2 and a3, which would otherwise yield incom- 
plete sentences, bear an obligatory adjunction constraint 
on their root-nodes. We have to differentiate trees with 
infinitive from trees with tensed verb, because lexical in- 
sertion is defined on already inflected items, and because 
French does not allow lexical subject in infinitive clauses. 
We thus state this constraint as a basic structure of the 
grammar : in a3, the subject has to be non lexical (PRO). 
A verb is thus defined by its syntactic argument struc- 
ture, and the corresponding set of trees are associated with 
it. We refer to a given argument structure as a tree-family. 
The optionality of a given argument and the lexical value 
of the preposition (for verbs taking prepositional com- 
plements) are noted as part of the argument structure. 
A verb with more "than one possible argument structure 
will be duplicated. /Gross 81/shows that French verbs 
have no more than three essential arguments, including 
the subject. 2 
IIaving such trees associated with the lexical items, in- 
stead of a standard argument structure in the form of a list 
(or of a feature) and rules for sentence formation, provides 
us with an extended domain of locality that has interest- 
ing linguistic consequences. We do not manipulate basic 
categories, but tree-structures corresponding to minimal 
sentences (for a verb, or a predicative noun) or complete 
constituents (NP, for a non-predicative noun, AP, for a 
modifying adjective). We are thus able to state cross-level 
1For simplification, we do not put all the adjunction constraints 
these trees bear at their different nodss.J, marks substituti~a. 
2Leaving apart such examples as Jean parle 1 O0 F d Marie que 
Pierre viendra, which can undergo some kind of reanalysis. It is 
not\]always easy to distinguish essential complements from adjuncts 
altl~ough our formalism requires a clear-cut distinction. 
8 
dependencies often overlooked in grammars, because they 
can only be defined on the sentence as a basic unit. For 
example, the value of the determiner of the subject may 
depend on the verb, as shown in 1-2, but it also depends on 
the presence of a verbal complement in 3 ; the adjunction 
of the right adjective on the nominal complement depends 
on the lexical value of the verb in 4-5 : 
1) * Ce mot rime. 
~) Ces roots riment. 
3) Ce mot rime avec "banane'. 
4)*Jean mange un fntur g~teau. /Gross 81/ 
5) Jean pr@are un futur gdteau, s 
Lexical insertion, or adjunction of adverbs or modifiers 
such as relative clauses, depend on each element of the 
elementary tree, and not on just the immediately domi- 
nating node. They are difficult to capture by CFG rules 
such as S-+NP VP, or VP-*V NP. 
1.2 Elementary Trees for Derived Con- 
structions 
In a TAG, the standard derived constructions are rep- 
resented as elementary trees of the grammar. They are 
part of the tree family associated with a verb. Their 
properties are those of elementary trees, which must be 
complete structures and have;, their gaps bounded in the 
same tree they appear in, plus the properties given by 
adjunction and substitution, respectively. If one consid- 
ers the principles that are used for designing such fam- 
ilies, these principles will correspond to syntactic rules, 
or 'transformations', in derivation-based theories of gram- 
mar. Wh-question gives rise to the corresponding wh- 
elementary trees for each of the arguments of an el- 
ementary tree. For the initial tre e el, correspond- 
ing to the structure No V N1, we have for example : 
Ktt SSA ~:r SQ~ 
*., s I /~ 
I po~v q.~ "po vp qui N P 
. v I I 
\] I J """ 
Jean alma el MI!Ie 
The different local constraints account for the asymetry 
between subject and object movement, o~5 can be an au- 
tonomous sentence, whereas 0~4 is only an indirect ques- 
tion, and must have an auxiliary tree such as Ye sais S 
adjoined to it. Relative clauses are represented as auxil- 
iary trees rooted in NP which can then adjoin to the NP 
node they modify. Each elementary tree, corresponding 
to a declarative sentence, has thus corresponding auxiliary 
trees rooted in NP. Cleft-extraction is also represented by 
elementary trees. To say that a tree with a wh-element, 
or a relative pronoun, must he an elementary tree, derived 
from another elementary tree, provides us with strong pre- 
dictions: wh-movement is forced to apply only to elements 
present in an elementary tree, that is to arguments of our 
basic linguistic structures, and not to adjuncts. 
3*Thls word rhymes. These words rhyme. This word rhymes 
with 'banana'. *Jean is eating a future cake. Jean is making a 
future cake. 
Lexically dependent derivations comprise of middle, 
ergative, passive, or locative alternation. They are rep- 
resented as features associated with the proper verb, and 
correspond to sets of trees to be added to the tree family 
of the verb. One should notice that the verbal item to 
be marked is in fact a pair (lexical entry, argument struc- 
ture). Foc example, regarder has, at least, four argument 
structure,% that is to say four entries : 
a) NPo r~:garde NPa 
b) NPo rcgarde NP1 ( V-inf W) 
c) (NPo 4- So) regarde NP, 
d) NPo r~:garde que P (subj) 
Only regarder(a) has a passive. 
For mere surface reordering, we have the possibility of 
defining linear-precedence rules associated either with a 
tree-family, or with a specific tree, as described in/Joshi 
sV. 
2 Tbe treatment of complement 
clauses 
The representation of a verb taking a sententiM argu- 
ment can be viewed as the composition of two senten- 
tial structures. The standard way of composing two 
structures in a TAG is to have one adjoined to the 
other. Cc,mplement clauses can thus be represented as 
elementary trees, with 'matrix' sentences being auxil- 
iary trees adjoined to them, or vice versa. Following 
/Kroch and Joshi 85/ we prefer the former in order to 
account for wh-movement out of a complement clause) 
and to have unbounded dependencies falling out of the 
formalism. No V $1, for example, is represented by : 
qua !t 
N P ~ NP VP 
I A I A N V 81 A PRO V S 
i i" I Bob po rise penle perlser 
fll is adjoined to a3 to produce: 
6") Bob pease qn¢ Jean aime Marie 4. 
/?2 and/73 are cases of recursive adjunction; 7 is derived 
from/71 .... /?5 -~/72 --' ~3: 
7) Bob peuse que Paul pease que Max pease que Jean aime 
Marie ~ 
The wh-element and the corresponding gap are always in 
the same basic structure. Unbounded dependencies, which 
have always been a problem for generative grarrmaar, are 
thus represented in a straightforward way /Kroch, Joshi 
85/and /Kroch 86/: adjunction is not limited and does 
not destroy the gap-filler relations stated in the initial 
trees. For example: 
8) Quil penses-tu que Marie aime el ~ 
is derived t'rom Qni~ que Marie aime e~ ?, which is one of 
the Wh-trees corresponding to the initial tree : qne Marie 
aime Jean) and penses-tu is adjoined to it. The Wh-island 
4Bob tlfi~tks that Jean loves Mary 
5Bob thhtks that Paul thinks that Max thinks that John loves 
Mary 
6Who do you think that Marie loves ? 
constraint is no longer a constraint on movement, but be- 
comes a constraint on the structure of the elementary trees 
of the grammar. No elementary tree with two wh-elements 
is defined, and there is no means to derive 9 because there 
is no elementary tree corresponding to 10: 
9) *Qui~ te demandes-tu comment Jean a rencontrg e~ ? 
10) *Quii comment Jean a rencontrd ei f T 
This simple account fails short in the case of verb tak- 
ing two sentential arguments, such as Jean pr~#re perdre 
Marie h perdre don ~me, because an auxiliary tree is con- 
strained to have exactly one foot-node, and cannot adjoin 
to two initial trees at the same time. We use for this pur- 
pose substitution as an alternative operation. It replaces 
the leaf node of an elementary tree with an initial, or a 
lexical, tree (or a tree derived from an initial tree), pro- 
vided it has a root-node of the same category as that of 
the leaf-node of the elementary tree. 
Let us compare the linguistic properties derived from 
substitution and adjunction respectively. Substitution 
represents embeddment as the insertion of a complement 
clause at a leaf node of the matrix clause. Adjunction 
views it as the insertion of a matrix clause at any node of 
a complement clause. Constraints on the derivation are 
put in the matrix clause, when using substitution, and in 
the complement clause when using adjunction. Comple- 
ment clause which undergo wh-movement must be com- 
posed with their matrix clause by adjunction, because the 
matrix clause has to be inserted at an interior node (be- 
tween the Wh-element and the complementizer). If one 
uses substitution, on the other hand, insertion at an in- 
terior node will be blocked, and wh-movement out of the 
complement clause will be ruled out. Both operations are 
therefore complementary; in order to know whether to use 
one or the other, one has to ask whether wh-movement out 
of the embedded clause is possible or not. 
In the case of verbs taking both a sentential subject 
and a sentential object, we use substitution to represent 
the subject clause. This makes the well-known sentential- 
subject island constraint fall out from the formalism. We 
generate for example 11 and rule out 12: 
11) Qne Marie aiile en Gr&e ennnie Jean 
12) *Ofii qne Marie aiile el ennuie-t-ii Jean ? s 
The verb ennnyer is associated with the argument struc- 
ture So V NP1, which is represented as an initial tree 9. For 
verbs taking two sentential complements , wh-movement 
is normally allowed only out of one of the S-complements, 
usually the direct one. 
13) Jean dgduit que Marie a fail venir Bob de ce qu'on 
entend dn bruit, to 
14) Quil Jean d~duit-il que Marie a fail venir ei de ce 
qn'on entend du bruit ? 
15) * Quei Jean ddduit-ii que Marie ~ fail venir Bob de 
ce qn'on entend ~ .q 
Using adjunction for the clause subject to extraction and 
substitution for the other one rightly predicts the ungram- 
maticality of 15). 
z*Whoi do you wonder how Jean met ei ? 
8That Mary is going to Grece bothers Jean. 
9To account for the constraint in its full generality we substitute 
sententlal complements even in structures with no other sentential 
argument. 
l°John deduces that Mary invited Bob from hearing noise. 
on V NP1,L ~ ~ quo NP0$ VP 
I I / \ .J .\[ /'% 
...... "iT i,&i' 
3 The structure of NP : support 
verb constructions 
Modifiers of NP are treated like adjuncts in respect to 
sentential structures. Adjectives, for example, are repre- 
sented as auxiliary trees rooted in N, and they adjoin to 
the node they modify, either before or after the noun : 
16) Jean volt un camion bleu. 
17) Jean voil une jolie femme 11 
N N 
A A 
N A A N 
I I bleu \]oll~ 
16 and 17 are derived respectivdy from dean volt un 
camion, and Jean voil une femme. Adjectives produce 
then two types of structnres, one for their modifying 
nouns, and one for their being arguments of a sentence 
structure, such as NP0 V NPt A : 
18) ,lean lrouve Marie jo\[ie 1~ 
They are listed twice in the lexicon, except for so-called 
relational adjectives, which can Olfly be modifiers : 
19) C'esl une ddcision minisl~rielle 
20) *Celle d&ision esl minist&ielle. 13 
Prepositional phrases modifying NP receive the same 
treatment, and Jean volt une femme sans lard TM is de- 
rived from the adjunction of sans fard to Jean volt une 
fe~7$me. 
Complements of nouns can be either prepositional 
phrases or sentential complements. They can be viewed 
as a node in the lexieal tree yielded by the head-noun (to 
be substituted at any NP-node in any elementary tree). 
This is what we do for sentences such as : 
21) Jean ddsapprouve une enqu~le snr celle affaire 15 
The PP can only be moved together with the head noun it 
modifies, and extraction is ruled out for it. Because celte 
affaire is an NP substituted in it, extraction is blocked in 
the correct way. 
I I I I I I I 
Jean desapprouve une eequete sur cette affMre 
The derived constructions, such as wh-movement or cleft-. 
a 1 Jean sees a blue truck. Jean sees a pretty wom~m. 
12Jean finds Mary pretty. 
13Thls is a departmental derision. *This decision is departmentM 
lgJean sees a woman without make-up 
1s Jean disapproves of an inqlfiry into this affair. 
extraction, are defined on the nodes present in the elemen- 
tary tree. They are thus defined only for NP1 enquire, 
with or without its complement since the complement is 
optional, hut not on the PP sur cette affaire. We thus rule 
out: 
22) *Ear quoi Jean d&approuve-t-il nne enquire ? 
23) *C'esl sur cetle affaire que Jean ddsapprouve une 
cnqu~le. 
But sentences can be found which are of the same sur- 
face structure as 21) but in which the PP exhibit different 
syntactic properties: it seems to have properties of a nom- 
inal and of a verbal complement as well: 
25) Jean fail une enqu~te sur celie affaire. 16 
25) C'est une enqu~te sur cetle affaire que Jean fail. 
26) ,gur quoi Jean fail-il une enquEle .~ 
27) C'est sur celle affaire que Jean fail nne enquire. 
These constructions have been called 'support verb' sen- 
tences by/Gross 81/, because the verb gives only person 
and tense marking to the sentence ( with optionally some 
aspectual variat;on). The noun is the predicative head 
of the sentence and subcategorizes the subject. /Gross 
76/proposed to have two basic structures associated with 
these constructions, although they are not ambiguous, 
and they are problems for most formalisms/Abeill6 88a/. 
Itowever, they can be represented in a TAG in a natural 
way with only one basic structure. We consider the PP= 
node corresponding to sur celle affaire as belonging to the 
initial tree, which makes it arJ argument of the sentence 
as any verbal complement. But it is as the same time 
dominated by the noun enquire, and this accounts for its 
properties as nominal complement. 
$ 
NPO~, VP \[ 
N V$ NP 
D N P NP2~ I I I I I 
un. enquete sur cette sffalre 
The difference between 22-23 and 26-27 comes from the 
fact that wh-movement and cleft-extraction are defined 
only on the arguments (nodes) of elementary structures 
rooted in S. In a2 both NP-1 and the PP are available 
for movement. We are thus able to handle, in the gram.- 
mar, differences in syntactic properties concerning sen.- 
tenees which are exactly of the same string : (NP VP 
(NP (PP))). The resulting trees are the same, but one is 
an initial tree, while the other one is derived . 
We also find support verb constructions with nouns tak- 
ing sentential complements of NP, and we find pairs similar 
to 21-24 : 
28)Jean ale projel d'aller h New-York 
29)Jean critique le projet d'aller h New-York. r\[ 
In 28, the S-node corresponding to the sentential comple- 
ment of NP is part of the elementary tree, ,and the string 
Jean ale projet de S' is represented as an auxiliary tree. : 
In 29, there is only one NP-node as direct complement of 
16Jean makes aal inquiry into this affair. 
17Jean has a plan to go to New York/John opposes a plan to g~ 
to New Ybrk 
10 
critique, aud the complex NP is treated as a lexical tree, 
the sententlcd complenrent being substituted in it, before 
insertion in the complete sentence. Thus, extraction is 
made possible for 28 and not in 29: 
30) 04 dean a-t-il ie projet d'aller ? 
31) *Off a~an crilique-t-il le projet d'atler ? 
To represent support verb constructions with sentential 
complements as auxiliary trees accounts for unbounded 
dependencies: 
3f?) Oeq as&a l'impression que Jean nous a donnd l'idde 
de \]hire la proposition ... d'aller ei ? ~s 
We cone, taler all nouns taking complements aa hav- 
ing corresponding Support verbs that they subcategorize. 
They thus yield a tree family, just as verbs, which com- 
prises of the trees {br the support verb construction, and of 
the complex NP h'xical tree as well, which correspond to 
non-light verb constructions. So, a predicative noun will 
not be liste:l twice. Verbs, on the other hand, will be listed 
twice, as p:=:edicate for their 'plain' use, and as arguments 
for their support w;rb use, except for a few verbs which 
appear to be always support verbs : pratiquer, pew6trer 
or commet!re. 
Such a representation can be extended to Verb~Adj-PP 
constructions, such as: 
33) Jean e<;t contenl de son nouveau chapeau. 
34) Jean e:;t content que tout le monde le rega'lde. 19 
They are considered as S-initial trees yielded by the pred- 
icative adjective, and the node for complement, out of 
which extraction is possible, is present in it. Adjectives 
taldng complements produce then three kinds of tree struc- 
ture (sentential, attributive, modifying). 
We thus extend the set of elementary trees of our gram-- 
mar to the support verb constructions. They are projee- 
iiona of tile noun, or tile adjective, in the lexicon, and add 
40 basic ~3trnctures in our grammar. 
4 The adjunct;ion of adverbs 
Adverbs can be : 
-- 'lexical' ~Ldverbs : souvent, rarement 
• PP : a h',~it heures 
-. NP : ee j o,~r.-la 
- subordinate clauses : pendant que Jean lit le journal. 2° 
Lexic~d adw:rbs, PP introduced by prepositions, and 
subordinate clauseq iutrodnced by conjunctions, are repre- 
sented by/;he proper auxiliary tree(s) in the lexicon. The 
prepositional adverbs are listed under the value of their 
preposi~;ion; the bare-NP adverbs under that of their noun, 
and are considered cases of compound adverbs (see sec- 
tion 5). The subordinating conjunctions are represented 
>," auxilimy trees rooted in S, in which sentential trees 
(derived ol init;a/) m'e substituted : 
16Wherel to you. have ~he impression that Jean gave us the idea 
~,o ma.ke the snggestion....to go to ei ? 
\]g Jean it; t.appy about his new hat. Jean is happy that everybody 
ad~,ir~:s hin 
~°ot'iacn, stddom/ut right o'clod(/ that day/while Jean is reading 
the paper 
f/ ,o 
Co. l S$ / T V NPl$ 
/N I I ponaant quu le Journal 
The use of substitution, which forces the insertion of a 
sentential structure to take place at its root-node, pre- 
dicts that extraction is ruled out out of an adjunct: 
35) Marie regarde la tdld pendant que Jean lit le journal. 
36) :~Qui ' est ce que Marie regarde la tdld pendant que Jean 
lit ei 721 
Adverbials are represented as auxiliary trees usually 
rooted in S or in VP. Leaving aside the case of negation, 
which is a discontinuous constituent, corresponding to a 
tree rooted in V (because of the word-order), we consider 
most the adverbs to be rooted in S, in order to have a 
correspondence with such Wh-trees as 56 and/?7 , which 
have to be rooted in S : 
.&7 
Wh I S Adv Wh~ S Adv 
J I I I 
OtJ8 nd E I Ob el 
Although the formalism rules out extraction out of ad- 
juncts /Kroch 86/, it does not rule out wh-movement of 
the adverbial as a whole. It further predicts that only S- 
rooted adverbials give rise to wh-question: ~z 
37) Jean a ddplorg la desh'uction de Beirouth Est le ~ Juin. 
37 is analyzed as being ambiguous, between an S- and an 
NP- attachment of the adverbial. But the fronted Quand 
Jean a4-il d@lor( la destruction de Beirouth t¢:;1 ? is cor.. 
rectly disambiguated, because quand can only be adjoined 
to S. 
'l'he various positions of an adverb in a string, with the 
same attachment, is handled by linear precedence rules 
associated with the tree-structure it adjoins into /aoshi 8r/. 
For adverbs which are obligatory in a sentence, such 
as Jean va bien. 23, there are two possibilities: either 
to put an obligatory adjunetion constraint in a structure 
such as Jean va, or to treat the adverb as an argument 
of the elementary tree. We choose the latter, in order to 
maintain our claim that elementary trees correspond to 
semantic, as well as syntactic units. 
5 The representation of idioms 
Because in a TAG the linguistic unit is the sentence, not 
the word, entries comprising of several words can easily 
be defined. Compound phrases, which can be discontinu- 
ous constituents, are assigned a head that is usually either 
the item of the same category as the whole, or the most 
significant item. The head produces the subtree corre- 
21Marie is watching TV while Jean is reading the paper. *What 
is Mary watching TV while Jean is reading ei ? 
22Jea~l deplored the destruction of East Beirut on June, 4th 23Jean is doing fine. 
11 
sp0nding to the compound phrase, which will itself yield 
a tree-family in the case of a compound predicate (e.g. a 
compound verb). 
The internal structure of Sentential idioms is expanded 
more than that of 'free' sentences. For example, the NP 
subject is usually noted as an NP-node, open for substi- 
tution; if part of it is frozen, the corresponding node (D 
or N) is directly in the basic tree, and its lexical value 
is subcategorized by the verb. The heads for sentential 
idioms are the same as for 'free' sentences. For example, 
Jean voit un canard, which is a free sentence, is a tree of 
depth 1 : (NP (V NP)), whereas Jean chasse le canard, 
with the meaning of to hunt, has a frozen verb-determiner 
combination, and is represented by a tree of depth 2 : (NP 
(V (D N))). The verb chasserproduces also a tree of depth 
1, for its occurrence in free sentences, with the meaning 
of l0 chase. The parser will give two analyses, one corre- 
sponding to the idiomatic sense, the other to the literate 
interpretation. 
As for compound categories, we view basic categories 
as nodes which can be expanded if needed. If it is a sim- 
ple category, it will be treated as a preterminal, if it is a 
compound one, its internal structure will be specified. To 
have the precise internal structure is important in the case 
of idioms allowing some variations, or insertion. We thus 
have a unified representation for the complex determin- 
ers la majorit~ de and la grande majo~td de : the adjec- 
tive grande is adjoined to the noun majorild as to any N. 
.4 /-,, 
|11 N PP 
A raliorlte P NP,L 
I j grande de 
Conclusion 
Choosing the TAG formalism for parsing French has both 
computational and linguistic advantages. The linguistic: 
stipulations are minimized and the general organization 
of the grammar is simplified: all structures are stated in 
terms of surface structures, and there is a direct matching 
between the lexical information and the tree structures. 
The implementation of such a grammar leads to a new 
parsing strategy developed in/Schabes, Abefll6, Joshi 88/. i 
We have shown that TAG formalism is suited for build- 
ing a sizable grammar for a natural language, and further- 
more it allows one to state more local dependencies than 
other formalisms. We show that constraints on extraction 
out of complement clauses and syntactic properties of sup- 
port verb' constructions are handled in a natural way. We 
are using our current approach to build a TAG grammar 
for English along the same lines. 
The overall size of the French grammar amounts to 80 
basic structures (tree-families), which correspond to sim- 
ple verbs (12), verbs with sentential complements (28) 
support verb-noun combinations (20), and support verb- 
adjective combinations (20). An average tree-family com- 
prises of 15 trees, and the whole size of the grammar is: 
roughly 1200 trees. One should notice that what crucially 
matters is the number of tree-families, which is closed,, 
if we have been exhaustive. We have not incorporated 
yet pronominalization and coordination, the two major 
remaining phenomena. We have added selectional restrict. 
tions features to each predicate. We know how to limit the 
future growth of the grammar: if the deriwtion we want 
to add amounts to word-reordering, it is stated by adding 
a rule to the set of linear precedence rules associated ei- 
ther to the tree-family, or to one of the trees/Joshi 87/. 
If it is a lexical rule, a feature will be added to the pred- 
icative entries. In both cases, the size of the tree-database 
remains unchanged. If it is a syntactic rule, it adds the 
proper number of trees to at most each tree-family, so the 
multiplying factor is 80 in the worst case. 
Our grammar has been implemented in an Earley-type 
parser as defined in /Schabes and Joshi 1988/, and uses 
a dictionary which comprises of more than 4000 lexical 
items, that are the most common for French. 

References 
A. Abeilld (a), "Light Verb Constructions and Extraction out 
of NP in Tree-Adjoining Grammar", in Papers o\] the ~4th Re- 
gional Meeting of the Chicago Linguistic Society, 1988. 

A. Abeill6 (b), "A French Tree Adjoining Grammar", Tech- 
nical Report, Univ. of Pennsylvania, Philadelphia, 1988. 

J.P. Boons, A. Gnillet, C. Leel~re,La Structure des Phrases 
Simples en Fransais : Constructions Intransitives, Droz, 
Gen~ve, 1976. 

J.P. Boons, A. Guillet, C. Lecl~re,La Structure des Phrases 
Simples en Fran~ais : Classes de Constructions Transitives, 
Rapport de Recherche du LADL, Univ. Paris VII, 1976. 

M. Gross, "Sur quelques groupes nominaux complexes',, in 
Mdthodes en Grammaire Fran#aise, J.C. Chevalier et M. Gross 
(eds), Klincksieck, 1976. 

M. Gross, Mgthodes en Syntaxe, Paxis, Hermann, 1975. 

M. Gross, "Les bases empirlqnes de la notion de pr6dicat 
s6ma3atique', Langages, n°63, Laxonsse, Paris, 1981. 

M. Gross, "Les limites entre phrases libres, phrases fig~es et 
phrases ~ verbe support, Langages, Laxousse, Paris, 1988. 

A. Joshi, "How much Context-Sensitivity is necessary for 
chaxacterizing Structural Descriptions: Tree Adjoining Gram- 
nlaxs", in D. Dowty et al. eds, Natural Language Processing:i 
Psyeholinguistie, Computational and Theoretical Perspectives, 
New-York, Cambridge University Press, 1985. 

A. Joshi, L. Levy, M. Takahashi, "Tree Adjunct Grammars", 
Journal of the Computer and System Sciences, 10:1, pp.136- 
163, 1975. 

A. Joshi, "Word-order variation in Natural Language Gen- 
eration", in AAAI 87, Sixth National Con\]erenee on Artificial 
Intelligence, pp 550-555, Seattle, July 1987. 

A. Kroch, "Unbounded Dependencies and Subja~ency in a 
Tree Adjoining Grammar", in The Mathematics o~ Language, 
New-York, Benjaxnins.1986 

A. Kroch, A. Joshi, "Some Aspects of the Linguistic Rele~ 
vance of Tree Adjoining Grammar", in Technical Report CI5 
85.18, University of Pennsylvania, 1985. 

Y. Schabes, A. Abefll6, A.Joshi, "Parsing Strategies with 
'LexicMized' Grammars: Application to Tree Adjoining Gram- 
mars", in Proceedings of the l~th International Conference on 
Computational Linguistics, Budapest, 1988. 

Y. Sch~bes, A. Joshl, "An Earley-type Parsing Algorithm 
for Tree Adjoining Grammars", in Proceedings ACL'88, 1988. 

K.Vij~y-Shanker, A Study of Tree-Adjoining-Grammars, 
PhD Thesis, University of Pennsylvaaaia, Philadelphia, 1987. 
