Morphology and cross dependencies in the synthesis of personal pronouns
in Romance languages
Laurence DANLOS and Fiammetta NAMER
LADL, Université de Paris 7
2, Place Jussieu
75251 Paris Cedex 05, France
Abstract 
This paper describes some of the problems that arise in
the synthesis of personal pronouns in a system that
generates texts in Romance languages. It emphasizes
first that the morphological level has
to be taken into account early in the generation process, and
second the numerous "cross dependency" phenomena,
which arise when the synthesis of an element X
depends upon that of another element Y and the
synthesis of Y depends upon that of X. The linguistic
examples are taken from French and Italian, for
which a robust generation system has been implemented.
1 Introduction 
It is generally believed that a generation system can be 
modularized into a sequence of components, the first one 
making the "high level" decisions (i.e. the conceptual 
decisions), the following ones making the linguistic 
decisions (e.g. lexical and syntactic construction choices), 
the penultimate one performing the "low level" operations
(i.e. the syntactic operations), and the last one handling
the morphological operations. We have shown in (L.
Danlos 1985, 1987a) that the conceptual and linguistic
decisions are operations that depend on each other. 
Therefore, we designed a generation system modularized in 
the following way: a "strategic component" makes the 
conceptual and linguistic decisions simultaneously and 
gives back "clause templates" which are synthesized into 
clauses by a "syntactic component". A simplified version 
of the clause template syntax is the following one (a more 
complete version is presented in (L. Danlos 1987b)) : 
[Cl] = (:cl [subject] [verb] cplt^n (0 ≤ n ≤ 2))
[subject] = (:subject token)
[verb] = (:verb verb)
cplt = [dir-object] / [à-object] / [de-object] / [prép-object]
[dir-object] = (:dir-object token)
[à-object] = (:à-object token)
[de-object] = (:de-object token)
[prép-object] = (:prép-object [prép] (:object token))
[prép] = (:prép preposition)
The prepositional complements [à-object] and [de-object]
are complements respectively introduced by à and de in
French, a and di in Italian. They are kept apart from the
prepositional complements [prép-object] introduced by
other prepositions because they have a specific syntactic
behaviour, especially with regard to pronominalization (cf.
Section 3). An example of a clause template is :
(:cl (:subject HUM1) (:verb amare) (:dir-object HUM2))
with
HUM1 =: PERSON          HUM2 =: PERSON
        NAME : Ugo              NAME : Maria
        SEX : masc              SEX : fem
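As an illustration only, the clause template above could be held in a small data structure such as the following Python sketch. The class and field names (Token, ClauseTemplate, features, complements) are assumptions made for this example; they are not taken from the implemented system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Token:
    """A conceptual token such as HUM1 (field names are illustrative)."""
    ident: str
    concept: str
    features: tuple = ()  # (attribute, value) pairs


@dataclass
class ClauseTemplate:
    """A clause template: a subject, a verb and 0 to 2 complements."""
    subject: Token
    verb: str  # infinitive form
    complements: list = field(default_factory=list)  # (role, Token) pairs

    def __post_init__(self):
        # The simplified grammar allows between 0 and 2 complements.
        if not 0 <= len(self.complements) <= 2:
            raise ValueError("a clause template takes 0 to 2 complements")


HUM1 = Token("HUM1", "PERSON", (("NAME", "Ugo"), ("SEX", "masc")))
HUM2 = Token("HUM2", "PERSON", (("NAME", "Maria"), ("SEX", "fem")))
template = ClauseTemplate(HUM1, "amare", [("dir-object", HUM2)])
```

The template records only what has to be said; whether HUM2 surfaces as Maria, as a pronoun or as questa donna is left to the syntactic component.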
According to the context (i.e. the clause templates that
have been previously synthesized), the syntactic 
component, which handles pronominalization, produces 
one of the following Italian clauses (given that the verb 
is in the present tense) : 
Ugo ama Maria (Ugo loves Mary) 
Ugo l'ama (Ugo loves her) 
Quest'uomo l'ama (This man loves her) 
Ama questa donna (He loves this woman) 
It will be shown in Section 3 that pronominalization involves the
morphological level. The decisions concerning
pronominalization, which is a stumbling block for natural
language processing, must certainly not be made last.
Thus, the morphological level (supposedly a very
"low" level) must not be left until the very
last stage of the generation process.
The second aim of this paper is to put forward "non local 
dependencies" which are to be found when the synthesis of 
an element X depends upon that of another element Y. 
Such a dependency requires the synthesis of X to be 
carried out after that of Y, whatever the order of X and Y in 
the clause template. Moreover, cases of "cross 
dependencies" are to be found when the synthesis of X 
depends upon that of Y and when the synthesis of Y 
depends upon that of X. A cross dependency leads to 
conflicting orderings, namely synthesis of X after that of 
Y and synthesis of Y after that of X. The solution to such 
conflicting orderings is to perform a sequence of 
incomplete syntheses of X and Y. 
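The ordering problem can be made concrete with a small Python sketch. It assumes that dependencies are given as a mapping from each element to the elements it depends upon, and it uses the standard graphlib module (Python 3.9 or later); the function name synthesis_order is hypothetical. A non local dependency yields a valid order, whereas a cross dependency is a cycle for which no single order exists.

```python
from graphlib import CycleError, TopologicalSorter


def synthesis_order(depends_on):
    """Return an order in which each element follows those it depends on,
    or None when a cross dependency rules out any single order."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError:
        return None


# Non local dependency: X depends on Y, so Y is synthesized first.
print(synthesis_order({"X": {"Y"}}))
# Cross dependency: no complete order exists.
print(synthesis_order({"verb": {"dir-object"}, "dir-object": {"verb"}}))
```

When the function returns None, the elements involved have to be synthesized in several incomplete passes, as described in the text.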
2 Non local and cross dependencies 
The synthesis of the verb and direct object in French will
be taken as an illustration of non local and cross
dependencies. On the one hand, the synthesis of the verb
depends upon that of the [dir-object] for two reasons.
First, there is a switch from the auxiliary avoir to the
auxiliary être (when the verb is conjugated in a compound
tense) if the [dir-object] is synthesized as a reflexive
pronoun (which must appear before the verb) :
Ugo a détesté Marie (Ugo hated Mary)
Ugo s'est détesté (Ugo hated himself)
Second, there is agreement in gender and number between
the past participle of a verb conjugated in a compound
tense and a [dir-object] synthesized as a personal pronoun
(which must appear before the verb) :
Ugo, je l'ai détesté (Ugo, I hated him)
Marie, je l'ai détestée (Mary, I hated her)
On the other hand, the synthesis of the [dir-object]
depends upon that of the verb in the following way, which
will be explained in detail in Section 3 : determining whether the
[dir-object] has to be synthesized as a personal pronoun
may depend upon the first letter and the form of the
conjugated verb. All in all, the synthesis of the verb
depends upon that of the [dir-object] and the synthesis of
the [dir-object] depends upon that of the verb. This cross
dependency can be handled with the following sequence of
incomplete syntheses :
1) Determine if the [dir-object] must be synthesized as a
reflexive pronoun (by checking if its value is equal to the
value of the subject). If it is, mark the verb as having to
be conjugated with the auxiliary être.
2) Synthesize the verb (i.e. conjugate it) without taking 
into account a possible agreement between a past
participle and a pronominalized [dir-object]. In Step 2, the
verb is conjugated in a compound tense with the right
auxiliary thanks to Step 1. Let us mention that the
conjugation of a verb is a morphological operation.
3) Synthesize the [dir-object] if it has not been
synthesized as a reflexive pronoun in Step 1. In Step 3,
the form of the conjugated verb provided by Step 2 is used
to determine if the [dir-object] has to be synthesized as a
personal pronoun.
4) Complete the synthesis of the verb if necessary, i.e.
carry out the agreement in gender and number between a
past participle if any (information given by Step 2) and a
pronominalized [dir-object] if any (information given by
Step 3).
These four steps imply that both the direct object and the 
verb are checked over twice. Note that this is only for the 
synthesis of these two elements. The cross dependencies 
that arise from other elements imply that the direct object 
and the verb are checked over more than twice. Generally
speaking, a clause template (i.e. a tree) is gone through 
several times in the syntactic component. 
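The four steps can be sketched in Python as follows. This is a toy illustration under strong assumptions, not the implemented component: only the third person singular of the passé composé is covered, the past participle is supplied by the caller, the hypothetical flag known stands for the decision that a non reflexive object is to be pronominalized, and the agreement of Step 4 is also applied to the reflexive pronoun.

```python
from dataclasses import dataclass


@dataclass
class Referent:
    name: str
    gender: str          # "m" or "f"
    known: bool = False  # already mentioned: pronominalize (toy criterion)


def synthesize(subject, participle, obj):
    """Build a passé composé clause by going through Steps 1 to 4."""
    # Step 1: a direct object equal to the subject is reflexive,
    # which switches the auxiliary from avoir to être.
    reflexive = obj is subject
    # Step 2: conjugate the verb, without participle agreement yet.
    auxiliary = "est" if reflexive else "a"
    # Step 3: synthesize the direct object. Both auxiliary forms used
    # here start with a vowel, so se and le/la are elided.
    if reflexive:
        clitic, agrees_with = "s'", subject
    elif obj.known:
        clitic, agrees_with = "l'", obj
    else:
        clitic, agrees_with = "", None
    # Step 4: complete the verb with the gender agreement of the participle.
    if agrees_with is not None and agrees_with.gender == "f":
        participle += "e"
    words = [subject.name, clitic + auxiliary, participle]
    if not clitic:
        words.append(obj.name)  # full noun phrase after the verb
    return " ".join(words)


ugo = Referent("Ugo", "m")
marie = Referent("Marie", "f")
marie_known = Referent("Marie", "f", known=True)
```

With these definitions, synthesize(ugo, "détesté", marie) gives "Ugo a détesté Marie", synthesize(ugo, "détesté", ugo) gives "Ugo s'est détesté", and synthesize(ugo, "détesté", marie_known) gives "Ugo l'a détestée". The verb is visited in Steps 2 and 4 and the direct object in Steps 1 and 3, which is the double pass described above.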
3 Synthesis of personal pronouns 
If a token refers to the speaker(s) or the hearer(s), it must 
be synthesized as a first or second person pronoun; the 
only operation to be performed is the computation of this 
"dialogue" pronoun. Otherwise, we consider synthesizing a 
token as a third person pronoun only if it has already been
synthesized (because it occurs in a previous clause
template, for example). In other words, we do not consider
left pronominalization phenomena (T. Reinhart 1983).
Determining whether a token which does not refer to the 
speaker(s) or hearer(s) and which has already been 
synthesized has to be synthesized as a personal pronoun 
requires the following steps to be gone through: 
1) Compute the form of the foreseen pronoun (cf. 3.1);
2) Compute the list L1 of tokens that have been
synthesized in nominal phrases whose "morphological"
features (i.e. gender and number) are compatible with the
form of the foreseen pronoun (cf. 3.2).
3) Compute the sublist L2 of L1 that contains only the
elements of L1 that are syntactically compatible with the
foreseen pronoun. For example, in Mary hated her, Mary
and the personal pronoun her cannot be coreferential. The
token representing Mary is said to be syntactically
incompatible with the pronoun her.
4) Compute the sublist L3 of L2 that contains only the
elements of L2 that are semantically compatible with the
foreseen pronoun. For example, in The book is on the
table, it was published recently, the pronoun it and the
table cannot be coreferential because the direct object of
the verb publish in the active (its subject in the passive)
cannot be a piece of furniture. The token representing the
table is said to be semantically incompatible with the
pronoun it.
5) According to the number of elements in L3, and possibly
according to other considerations, decide whether the
foreseen pronoun actually has to be synthesized. At a rough
estimate, if the number of elements of L3 is one, then the 
foreseen pronoun can be synthesized since it does not lead 
to ambiguity, while it should not be synthesized if the 
number of elements in L3 is greater than one since it 
would lead to ambiguity. Yet, it is well known that 
pragmatic and structural parallelism considerations may
allow a pronoun to be unambiguous even if L3 has more
than one element (G. Hirst 1981, C. Sidner 1981, K.
McKeown 1985, L. Danlos 1987a). Step 5 takes those
considerations into account to determine whether the
foreseen pronoun actually has to be synthesized.
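Steps 2 to 5 amount to a chain of filters, which the following Python sketch illustrates. It assumes that the form of the foreseen pronoun (Step 1) is already available as a dictionary of features, that noun phrases are plain dictionaries, and that the syntactic and semantic tests are supplied as predicates; all names are hypothetical. Step 5 is reduced to the rough estimate given above, without the pragmatic and parallelism considerations.

```python
def compatible(pronoun, phrase):
    """Morphological compatibility: every feature fixed by the pronoun
    (value other than None) must match the feature of the noun phrase."""
    return all(phrase[key] == value
               for key, value in pronoun.items() if value is not None)


def decide_pronoun(pronoun, mentioned, syntactic_ok, semantic_ok):
    """Return (synthesize?, L3) for a foreseen pronoun."""
    l1 = [p for p in mentioned if compatible(pronoun, p)]  # Step 2
    l2 = [p for p in l1 if syntactic_ok(p)]                # Step 3
    l3 = [p for p in l2 if semantic_ok(p)]                 # Step 4
    return len(l3) == 1, l3                                # Step 5 (rough)


# "The book is on the table, it was published recently"
book = {"name": "book", "number": "sg", "publishable": True}
table = {"name": "table", "number": "sg", "publishable": False}
use_it, l3 = decide_pronoun(
    {"number": "sg"},              # English "it" fixes number only here
    [book, table],
    lambda p: True,                # no syntactic exclusion in this example
    lambda p: p["publishable"],    # object of publish must be publishable
)
```

Here L3 contains only the book, so the pronoun can be synthesized; without the semantic filter, L3 would contain two elements and the rough estimate would reject the pronoun.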
3.1 Computation of the form of the foreseen 
pronoun 
This computation involves the following factors : 
1) The syntactic position in which the token that could be 
synthesized as a pronoun appears. In English, it is enough 
to distinguish between the subject and complement 
positions. In French and Italian, it is necessary to 
distinguish between the [subject], [dir-object], [à-object],
[de-object], [loc-object] and [prép-object] positions: the
[subject] and [prép-object] positions generally give rise to
pronouns that are similar to the English ones¹; the other
positions may give rise to pronouns that must appear
before the verb, such pronouns being noted Ppv.
2) The person and number of the token. Person and number
are semantic information given in the definition
of the token.
3) The gender of the nominal phrase that synthesizes the
previous occurrence of the token. In French and Italian,
which have only the masculine and feminine
genders, gender is not semantic but lexical information.
Consider the token TOK1 with the following definition: 
TOK1 =: BICYCLE 
NUMBER : 1 
DEFINITE : yes 
In French, it can be synthesized as a feminine noun group
la bicyclette (the bicycle) or as a masculine noun group le
vélo (the bike). The gender of a pronoun which
synthesizes a token is generally equal to the gender of the
previous occurrence of the token :
La bicyclette est cassée. (Elle + *Il) est au garage.
(The bicycle is broken. It is at the garage.)
Le vélo est cassé. (Il + *Elle) est au garage.
(The bike is broken. It is at the garage.)
4) The human or non human nature of the token, along with the verb (in
the infinitive form) of the clause template. As an example,
consider the synthesis of an [à-object] in Italian. The
verbs dare, pensare and credere can all take a human or
non human [à-object]. The form of a pronoun
corresponding to the [à-object] of one of these three verbs
is given in the table below² :
TABLE 1
            HUMAN                       NON HUMAN
            m-s     f-s     plur        m-s     f-s     plur
dare        gli     le      loro        gli     le      loro
credere     gli     le      loro        ci      ci      ci
pensare     a lui   a lei   a loro      ci      ci      ci
In Italian as well as in French, the form of an [à-object]
pronoun can only be obtained by consulting a "lexicon-grammar"
(M. Gross 1975, 1986 ; A. Elia et alii 1981).
¹ In fact, an Italian [subject] pronoun is omitted when this
omission does not create any ambiguity. There is no room
in this paper to discuss this complex phenomenon, which
is also to be found in Spanish and Portuguese.
² The abbreviation "m-s" stands for masculine-singular, "f-s"
for feminine-singular, "plur" for plural. The pronouns
preceded by the preposition a are not placed before the
verb.
For each verb, a lexicon-grammar records all its syntactic
properties, among them those concerning
pronominalization.
5) The synthesis of the verb. In French, a [dir-object] of
the third person singular is pronominalized as le if the
previous occurrence of the token is masculine, as la if
feminine :
Ugo, je le vois souvent (Ugo, I see him often)
Marie, je la vois souvent (Mary, I see her often)
However, if the first letter of the conjugated verb is a
vowel, there is elision of le or la into l' :
Ugo, je l'ai vu et je l'entends (Ugo, I saw him and I hear
him)
Marie, je l'ai vue et je l'entends (Mary, I saw her and I
hear her)
This elision changes the computation of the
morphological antecedents of the pronoun as shown in
3.2. Therefore, it has to be taken into account when
determining if the [dir-object] has to be pronominalized.
6) The synthesis of other complements. This factor
involves several non local dependencies. For example, in
French, an [à-object] cannot be pronominalized as the Ppv
=: lui if there is a [dir-object] synthesized as te (M.
Gross 1968):
Marie, je la présenterai à Ugo --> Je la lui
présenterai
(Mary, I will introduce her to Ugo --> I will introduce
her to him)
Toi, je te présenterai à Ugo --> *Je te lui présenterai
(You, I will introduce you to Ugo --> I will introduce
you to him)
Another example in Italian: an [à-object] of the third
person singular can be pronominalized as gli if the
previous occurrence of the token is masculine, as le if
feminine (see Table 1). However, if there is a [dir-object]
synthesized as the Ppv =: lo, the pronouns gli or le
amalgamate with this Ppv and both become glie :
Diedi il libro a Maria --> Le diedi il libro
(I gave the book to Mary --> I gave the book to
her)
Diedi il libro a Ugo --> Gli diedi il libro
(I gave the book to Ugo --> I gave the book to him)
Il libro, lo diedi a (Maria + Ugo) --> Glielo diedi
(The book, I gave it to (Mary + Ugo) --> I gave it
to her/him)
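Several of the factors above can be sketched in Python as follows. The dictionary A_OBJECT is a toy stand-in for a lexicon-grammar, filled only with the values of Table 1; the function names and argument conventions are assumptions made for this example. The French function follows the rule as stated in the text (first letter of the conjugated verb) and covers only the third person singular.

```python
FRENCH_VOWELS = "aeiouyàâéèêîô"


def french_dir_object_ppv(gender, conjugated_verb):
    """Third person singular [dir-object] Ppv: le or la, elided to l'
    when the first letter of the conjugated verb is a vowel."""
    if conjugated_verb[0].lower() in FRENCH_VOWELS:
        return "l'"
    return "le" if gender == "m" else "la"


# Toy lexicon-grammar after Table 1: verb -> human? -> (m-s, f-s, plur).
A_OBJECT = {
    "dare": {True: ("gli", "le", "loro"), False: ("gli", "le", "loro")},
    "credere": {True: ("gli", "le", "loro"), False: ("ci", "ci", "ci")},
    "pensare": {True: ("a lui", "a lei", "a loro"), False: ("ci", "ci", "ci")},
}


def italian_a_object(verb, human, gender, number, dir_object_ppv=None):
    """Pronoun for an Italian [à-object], given the verb, the human or
    non human nature of the token, and a possible [dir-object] Ppv."""
    column = 2 if number == "pl" else (0 if gender == "m" else 1)
    form = A_OBJECT[verb][human][column]
    # gli and le both amalgamate with a [dir-object] Ppv lo into glielo.
    if dir_object_ppv == "lo" and form in ("gli", "le"):
        return "glie" + dir_object_ppv
    return form


print(french_dir_object_ppv("f", "vois"))     # Marie, je la vois souvent
print(french_dir_object_ppv("f", "entends"))  # Marie, je l'entends
print(italian_a_object("dare", True, "f", "sg"))        # Le diedi il libro
print(italian_a_object("dare", True, "f", "sg", "lo"))  # Glielo diedi
```

The second argument of french_dir_object_ppv and the last argument of italian_a_object are what make these computations non local: the form cannot be fixed before the verb, or the other complement, has been synthesized.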
3.2 Computation of the morphological 
antecedents of the foreseen pronoun 
A token which does not refer to the speaker(s) or
hearer(s) corresponds to a morphological antecedent of
the foreseen pronoun if it has been previously synthesized
as a nominal phrase whose morphological features (i.e.
gender and number) are compatible with the form of the
foreseen pronoun. For example, if the foreseen pronoun is
the French [dir-object] pronoun la, its morphological
antecedents are the feminine singular noun phrases;
the Italian [à-object] pronoun gli, its morphological
antecedents are the masculine singular noun phrases;
the Italian [à-object] pronoun le, its morphological
antecedents are the feminine singular noun phrases;
the Italian [à-object] pronoun glie (result of an
amalgamation of gli or le with another Ppv), its
morphological antecedents are the singular noun phrases.
In the cases mentioned above, the computation of the
morphological antecedents of the foreseen pronoun (i.e.
the computation of the list L1) only depends upon the
form of the pronoun. The computation of L1 can also
depend upon the synthesis of other elements, thereby
involving non local dependencies. For example, when the
foreseen pronoun is l', its morphological antecedents are
all the singular noun phrases if the conjugated verb does
not include a past participle as in Je l'entends (I hear
him/her/it); otherwise, its morphological antecedents are
the singular noun phrases with the gender indicated by the
past participle: in Je l'ai vu (I saw him/it), the
morphological antecedents of l' are the masculine singular
noun phrases, while in Je l'ai vue (I saw her/it), the
morphological antecedents of l' are the feminine singular
noun phrases. This is why the synthesis of the [dir-object]
depends upon that of the verb. It is an illustration of the
claim that pronominalization involves the morphological
level.
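The computation of L1 for the French [dir-object] pronouns le, la and l' can be sketched as follows in Python. The representation of noun phrases as dictionaries and the parameter participle_gender (None when the conjugated verb includes no past participle) are assumptions made for this example.

```python
def morphological_antecedents(pronoun, noun_phrases, participle_gender=None):
    """List L1 for the French [dir-object] pronouns le, la and l'."""
    if pronoun == "le":
        wanted = "m"
    elif pronoun == "la":
        wanted = "f"
    elif pronoun == "l'":
        # The elided form shows no gender itself: only a past participle,
        # if the conjugated verb has one, restricts the antecedents.
        wanted = participle_gender
    else:
        raise ValueError("unsupported pronoun: " + pronoun)
    return [np for np in noun_phrases
            if np["number"] == "sg" and wanted in (None, np["gender"])]


phrases = [
    {"text": "Ugo", "gender": "m", "number": "sg"},
    {"text": "Marie", "gender": "f", "number": "sg"},
    {"text": "les livres", "gender": "m", "number": "pl"},
]
hear = morphological_antecedents("l'", phrases)       # Je l'entends
saw = morphological_antecedents("l'", phrases, "f")   # Je l'ai vue
```

For Je l'entends both Ugo and Marie are morphological antecedents, whereas for Je l'ai vue only Marie is, so the result of conjugating the verb directly changes the size of L1.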
Conclusion 
The cross dependencies and morphological interactions 
which were presented here concern only the synthesis of
personal pronouns, putting aside the synthesis of 
sentential, subordinate and coordinated clauses. The reader 
can guess the complexity of a syntactic component for 
Romance languages. A robust French and Italian syntactic 
component has been implemented in a procedural 
Common-Lisp program. An English syntactic component 
has been implemented in a declarative formalism using 
functional descriptions (J.M. Lancel et alii 1988).

References

Danlos, L., 1985, Génération automatique de textes en
langues naturelles, Masson, Paris.

Danlos, L., 1987a, The linguistic basis of text 
generation, Cambridge University Press, Cambridge. 

Danlos, L., 1987b, A French and English Syntactic
Component for Generation, Natural Language Generation:
New results in Artificial Intelligence, Psychology and
Linguistics, Kempen G. ed., Dordrecht/Boston, Martinus
Nijhoff Publishers.

Elia, A., Martinelli, M., D'Agostino, E., 1981, Lessico e
strutture sintattiche. Introduzione alla sintassi del verbo
italiano, Liguori, Napoli.

Gross, M., 1968, Grammaire transformationnelle du
français: syntaxe du verbe, Larousse, Paris.

Gross, M., 1975, Méthodes en syntaxe, Hermann, Paris.

Gross, M., 1986, Lexicon-Grammar, The Representation of 
Compound Words, in Proceedings of Coling'86, Bonn. 

Hirst, G., 1981, Discourse oriented Anaphora resolution, 
American Journal of Computational Linguistics, vol. 7,
no 2. 

Lancel, J.M., Otani, M., Simonin, N., Danlos, L., 1988,
Sentence Parsing and Generation with a Semantic
Dictionary and a Lexicon-Grammar, in Proceedings of
Coling'88, Budapest.

McKeown, K., 1985, Text generation, Cambridge 
University Press, Cambridge. 

Reinhart, T., 1983, Anaphora and semantic 
interpretation, Croom Helm, London. 

Sidner, C., 1981, Focusing for Interpretation of Pronouns, 
American Journal of Computational Linguistics, vol. 7, 
no 4. 
