Centering theory and the Italian pronominal system 
Barbara Di Eugenio* 
Department of Computer and Information Science 
University of Pennsylvania 
Philadelphia, PA 
dieugeni@linc.cis.upenn.edu 
Abstract 
In this paper, I give an account, in terms of cen- 
tering theory \[GJW86\], of some phenomena of 
pronominalization in Italian, in particular the 
use of the null or the overt pronoun in subject 
position. After a general introduction to the 
Italian pronominal system, I will review center- 
ing, and then show how the original rules given 
in \[GJW86\] have to be extended or modified. 
Finally, I will show that centering does not ac- 
count for two phenomena: first, the functional 
role of an utterance may override the predictions 
of centering; second, a null subject can be used 
to refer to a whole discourse segment. This lat- 
ter phenomenon should ideally be explained in 
the same terms that the other phenomena in- 
volving null subject are. 
1 The Italian pronominal sys- 
tem 
In Italian, there are two pronominal systems,
characterized by a different syntactic distribution:
weak pronouns, that must always be cliticized
to the verb (e.g. la, lo, li, le - respectively
her, accusative; him, accusative; them, masculine,
accusative; them, feminine, accusative or
her, dative), and strong pronouns (lui, lei, loro -
respectively he or him; she or her; they or them).
The null subject can be considered as belonging 
to the system of weak pronouns. Notice that in 
Italian there is no neuter gender: nouns referring 
to inanimate objects are masculine or feminine. 
The weak pronouns used in this case are those of 
the corresponding gender, while, when a strong 
pronoun has to be used, paraphrase or deictics 
*This research was supported by DARPA grant no. 
N0014-85--K0018. 
are preferred. A strong pronoun for inanimate 
objects does exist - esso for masculine, essa for 
feminine, but it is not much used in current Ital- 
ian. 
Weak and strong pronouns are often in com- 
plementary distribution, as the following exam- 
ple shows - the contrast is between the use of 
the null or overt pronoun in subject position 1: 
Ex. 1 a) Quando Carlo_i ha incontrato Mario_j,
         When Carlo_i has met Mario_j,
         Ø_i/*j non gli_*i/j ha nemmeno detto "ciao".
         he_i/*j not to-him_*i/j has even said "hi".
      b) Quando Carlo_i ha incontrato Mario_j,
         When Carlo_i has met Mario_j,
         lui_*i/j non gli_i/*j ha nemmeno detto "ciao".
         he_*i/j not to-him_i/*j has even said "hi".
Notice the difference between sentences a and 
b: in a the null pronoun in subject position refers 
to Carlo and therefore gli has to refer to Mario; 
in b reference is switched. The overt pronoun 
lui in subject position requires its referent to be 
Mario, and therefore gli has to refer to Carlo. 
There are some syntactic accounts of coref- 
erence phenomena in Italian, for example Cal- 
abrese's \[Cal86\]. He starts from the observation 
that weak pronouns are used in all those con- 
texts in which there is an expected referent for 
the pronoun itself; he claims that we cannot use 
strong pronouns in place of weak ones, and vice
versa 2.
1 Ø indicates a null subject and can be translated as
an unstressed pronoun in English. In all the examples I
will be using, if a proper name ends in -o or -i, it has
a male referent; if it ends in -a, a female referent. The
translations I provide are literal and generally word by
word.
2 Actually Calabrese classifies pronouns as unstressed
/ stressed, and not as weak / strong, but I think his
terminology may lead the reader to a wrong conclusion.
In fact, while the "unstressed" pronouns can never be
stressed, the "stressed" pronouns can be, but are not
necessarily.
To formalize the concept of expected referent,
he resorts to the notion of Thema, defined as
the subject of a primary predication, where x is
a primary predicate of y iff x and y form a
constituent which is either θ-marked or \[+ INFL\].
He then says that a pronoun in position of
Thema is expected to have another Thema as
antecedent, and that if this coindexing occurs,
the pronoun must be a weak one.
Through these definitions and rules he man- 
ages to account for a wide range of data, as far 
as single sentences are concerned, but when he 
tries to extend them to discourse, their usefulness
and predictive power are not sufficient, and
sometimes they give the wrong prediction. This
is partly due to his very simplistic view of dis- 
course, which he considers as a conjunction of 
sentences. Even for those sentences in which 
this view is sufficient, the argument that coref- 
erence depends only on the syntactic structure 
of the discourse and that we cannot use a weak 
pronoun when the theory predicts that a strong 
one is expected does not hold. Consider the fol- 
lowing example: 
Ex. 2 D1) a) Ieri Carlo_i ha incontrato Mario_j.
             Yesterday Carlo_i has met Mario_j.
          b) Ø_i/*j Non gli_*i/j ha nemmeno detto "ciao".
             He_i not to-him_j has even said "hi".
      D2) a) Ieri Carlo_i ha incontrato Maria_j.
             Yesterday Carlo_i has met Maria_j.
          b) Ø_*i/j Non gli_i/*j ha nemmeno detto "ciao".
             She_j not to-him_i has even said "hi".
Calabrese's analysis correctly explains the
allowed and disallowed coreferences in D1: Mario
is not the Thema of D1.a. So, if we want to
have the subject of D1.b refer to Mario, we
cannot use a weak pronoun, but we have to use a
strong one: in fact, if we do use a null subject,
it is interpreted as referring to Carlo.
Let's now consider D2. The structure of the
two discourses is exactly the same. Therefore
the theory predicts that, if we want to refer to
Maria, who is not the Thema of D2.a, we have
to use a strong pronoun, and not a null one:
instead, D2.b is almost perfect.
The reason is that in D2.b the null subject has
two potential referents, one male and the other
female. While processing the sentence, the
possibility that the null subject refers to Carlo is
ruled out when the clitic gli, marked for masculine,
is found. In fact, gli has to refer to Carlo;
given that gli is not reflexive, it cannot corefer
with the subject, therefore the latter is forced to
refer to Maria.
This kind of disambiguation cannot be performed
in D1.b, in which the null subject has
two potential referents of the same gender.
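The contrast between D1.b and D2.b can be sketched as a
small filter. This is only an illustration under stated
assumptions, not part of Calabrese's account: the names
Entity and resolve_null_subject are hypothetical, the
candidates are assumed to be listed with the most expected
referent first, and the only linguistic facts used are that
the clitic is marked for gender and is not reflexive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    # Hypothetical record for a discourse referent.
    name: str
    gender: str  # "m" or "f"


def resolve_null_subject(candidates, clitic_gender):
    """Pick the referent of a null subject (illustrative sketch).

    candidates: referents from the previous utterance, most
        expected first (an assumption of this sketch).
    clitic_gender: gender marking of a non-reflexive clitic
        occurring in the same clause.
    """
    # The clitic can only pick up a candidate of matching gender.
    clitic_options = [c for c in candidates if c.gender == clitic_gender]
    for subject in candidates:
        # A non-reflexive clitic cannot corefer with the subject, so
        # another candidate must remain available for the clitic.
        if any(c != subject for c in clitic_options):
            return subject
    return None
```

With Carlo and Mario as candidates and a masculine clitic,
the sketch keeps the default referent (Carlo); with Carlo
and Maria, the masculine clitic leaves only Maria for the
null subject.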
I should mention that Calabrese, at the begin- 
ning of his paper, says that such features \[gen- 
der, number and person\] allow a first selection 
among the possible referents which are assigned
to the pronominal. Presumably he would use
these features as a superimposed filter to be
applied to the whole sentence after it has been
completely read or heard. 
However, this could hardly fit in a model of
how people process discourse: it is very likely 
that the normal human mode of operation is in- 
cremental \[Ste89\]. My claim is that disambigua- 
tion clues have to be taken into account as soon 
as they are available while processing a sentence. 
We will see in fact that they can help to make 
a discourse coherent or not according to their 
position in the sentence. 
Notice that the issue here is to account not
so much for the grammaticality or ungrammaticality
of a sentence, as pure syntactic accounts
do, but for more or less coherence in a discourse:
this is exactly the purpose of centering theory.
In particular, centering relates discourse coherence
to the inference load that a certain sequence
of utterances, and especially a certain
choice of referring expressions, requires on the
part of the hearer.
In the next section, I'll show how centering 
theory can be useful to explain certain uses of 
Italian pronouns in discourse, and in turn, how 
a richer pronominal system can help to refine
the rules that centering uses. 
2 Centering theory 
It is now widely accepted that discourse is di- 
vided into segments (see for example \[Web88\]); 
a discourse is coherent when its constituent seg- 
ments exhibit both local coherence - namely, co- 
herence among the utterances of each individual 
segment, and global coherence - namely, coher- 
ence among the different segments. 
Centering is an account of local coherence: it 
tries to determine the entity which an utterance 
most centrally concerns. Besides, it assesses the 
coherence of a discourse in terms of the different
moves that a speaker can make (basically, going on
to talk about the same entity or switching to 
another one), and in terms of how these moves 
are encoded, in particular as far as the choice 
of referring expressions is concerned. According 
to \[GJW86\], discourse coherence is a measure of 
the inference load a certain discourse imposes on
a hearer. Notice that the view I am taking on 
centering is as a theory of discourse production. 
From \[GJW86\], it is not very clear whether cen- 
tering concerns the production or the compre- 
hension of discourse. 
More technically, there are three moves that
a speaker can perform, for every triple of
utterances Un, Un+1, Un+2, belonging to the same
segment:
DEF. 1
Continuation: Un and Un+1 concern the same
entity; it is likely that Un+2 will concern
it too.
Retention: Un and Un+1 concern the same entity,
but it is not likely that Un+2 will concern
it.
Shifting: Un and Un+1 concern different entities.
To formalize these concepts, the theory defines
as centers those entities that serve to link
one utterance to another in the same segment;
an utterance Un typically has a single backward
looking center X (Cb), and a set of forward looking
centers (Cf's) {Y1, ..., Ym}.
X, Y1, ..., Ym are all candidates for being
Cb(Un+1) (in fact X = Yi, for some i), and
Cb(Un+1) will be constrained to belong to the
set of Cf's of Un. Both Cb(Un) and the set
of Cf's(Un) correspond to linguistically realized
NPs in Un.
The set of Cf's for a given utterance Un is par- 
tially ordered; the ordering relation is affected 
by syntactic factors. In \[GJW86\], the only syn- 
tactic element that is identified in this respect 
is the subject of Un: it is the most likely entity
to be Cb(Un+1), therefore it is the highest
ranked Cf in Un. This assumption is definitely
plausible, but it does not say anything about 
ordering among the other Cf's. For a more detailed
analysis of the factors affecting Cf's ordering,
see Kameyama's application of centering to
Japanese \[Kam85\], and for more recent work on 
this topic, \[WIC90\]. I will not address this prob- 
lem in the current paper. 
Given that the Cb corresponds to the en- 
tity that an utterance concerns, the speaker has 
some choices as far as encoding the Cb goes. In 
\[GJW86\] the following rule R1 is proposed: 
in Un+1 the speaker can use
• a single pronoun, and that is the Cb(Un+1);
• zero or more than one pronoun: then
  Cb(Un+1) is
  - Cb(Un) if Cb(Un) is realized in Un+1,
  - otherwise the highest ranked Cf(Un)
    which is realized in Un+1.
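Read from the hearer's side, rule R1 amounts to a short
decision procedure. The following is a minimal sketch of
that reading, not code from \[GJW86\]: the function name
backward_center and the representation of entities as
plain strings are my own assumptions.

```python
def backward_center(prev_cb, prev_cf, realized, pronouns):
    """Return Cb(Un+1) following rule R1 (illustrative sketch).

    prev_cb:  Cb(Un), or None for the first utterance.
    prev_cf:  Cf(Un) as a list, highest ranked first.
    realized: set of entities realized in Un+1.
    pronouns: list of entities realized as pronouns in Un+1.
    """
    # A single pronoun: its referent is the Cb.
    if len(pronouns) == 1:
        return pronouns[0]
    # Zero or several pronouns: prefer the old Cb if it is realized.
    if prev_cb is not None and prev_cb in realized:
        return prev_cb
    # Otherwise take the highest ranked Cf(Un) realized in Un+1.
    for entity in prev_cf:
        if entity in realized:
            return entity
    return None
```

The fall-through to None only covers the case, not discussed
in the rule, in which no element of Cf(Un) is realized in
Un+1.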
In order to ensure a coherent discourse, the 
speaker has to apply the following rule R2 as 
well: 
Given Cb(Un) = X, Cf(Un) = {Y1 > ... > Ym},
X = Yk, for some k, 1 ≤ k ≤ m:
if there are pairs {Yi, Yj}, with i < j, s.t. both Yi
and Yj are realized in Un+1, and if Yj is realized
with a pronoun, then Yi has to be realized with a
pronoun.
The previous rule requires that a speaker, if 
s/he chooses to use a pronoun to refer to a cer- 
tain Cf Yj, has to use a pronoun to refer to all 
the other Cfs realized in the current utterance 
and higher in the ordering than Yj.
This rule accounts for the unacceptability of 
discourses like (from \[GJW86\]) 3: 
Ex. 3 U1) John_i wanted to go for a ride yesterday.
          Cf(U1) = {John}
      U2) He_i called up Mike_j.
          Cb(U2) = John
          Cf(U2) = {John > Mike}
      U3) He_j was annoyed by John_i's call.
In U3, Mike is referred to with a pronoun; Mike
was less highly ranked than John as a Cf, therefore,
if we want to refer to John in U3, we should
also use a pronoun. The fact that in U3 the
proper name John is used makes the sequence
unacceptable: in fact substituting his for John's
results in an acceptable discourse.
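The check that rule R2 performs on Ex. 3 can be written
down directly. This is a hedged sketch of my own reading of
the rule: the function name violates_r2 and the labels
"pronoun" and "name" are hypothetical, and entities are
again represented as plain strings.

```python
def violates_r2(prev_cf, realization):
    """Tell whether Un+1 violates rule R2 (illustrative sketch).

    prev_cf:     Cf(Un) as a list, highest ranked first.
    realization: dict mapping each entity realized in Un+1 to
        "pronoun" or "name" (hypothetical labels).
    """
    # Keep the ranking of Cf(Un), restricted to what Un+1 realizes.
    realized = [e for e in prev_cf if e in realization]
    for i, higher in enumerate(realized):
        for lower in realized[i + 1:]:
            # A lower ranked Cf is pronominalized while a higher
            # ranked one is not: this is what R2 rules out.
            if (realization[lower] == "pronoun"
                    and realization[higher] != "pronoun"):
                return True
    return False
```

For U3 of Ex. 3, Cf(U2) is John > Mike; realizing Mike as a
pronoun and John as a proper name violates the rule, while
realizing both as pronouns does not.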
3 Notice that the first utterance of a discourse does not
have a Cb.
After recognizing what Cb(Un+1) is, the
hearer can derive the kind of move that the
speaker has performed in the following way:
DEF. 1'
Continuation: Cb(Un+1) = Cb(Un) and
Cb(Un+1) is the most highly ranked element
in Cf(Un+1).
Retention: Cb(Un+1) = Cb(Un) but Cb(Un+1)
is not the most highly ranked element in
Cf(Un+1).
Shifting: Cb(Un+1) ≠ Cb(Un) 4
Notice the correspondence between Def. 1
and Def. 1': the notion of Un and Un+1 concerning
the same entity corresponds to Un and
Un+1 having the same Cb. The notion of Un+2
going on to concern still the same entity corresponds
to Cb(Un+1) being the most highly
ranked Cf(Un+1).
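The hearer's classification of a move, as given in Def. 1',
reduces to two comparisons. The sketch below is only an
illustration of the definition; the function name
classify_transition and the string labels it returns are
assumptions of mine.

```python
def classify_transition(cb_prev, cb_next, cf_next):
    """Classify the move from Un to Un+1 as in Def. 1' (sketch).

    cb_prev: Cb(Un).
    cb_next: Cb(Un+1).
    cf_next: Cf(Un+1) as a list, highest ranked first.
    """
    if cb_next != cb_prev:
        return "shifting"
    # Same Cb: the move depends on whether it is also the most
    # highly ranked forward looking center of Un+1.
    if cf_next and cf_next[0] == cb_next:
        return "continuation"
    return "retention"
```

For instance, with Cb(Un) = Cb(Un+1) = John, the move is a
continuation if John is ranked first in Cf(Un+1), and a
retention if Mike is ranked above him.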
3 Centering and Italian pro- 
nouns 
I now want to recast the choices that the two 
Italian pronominal systems offer to a speaker in 
terms of centering, and, at the same time, refine 
centering itself. I will get evidence from exam- 
ples like the following 5. 
Ex. 4 U1) Maria_i voleva andare al mare.
          Maria_i wanted to go to the seaside.
      U2) Ø_i Telefono' a Giovanni_j.
          She_i called Giovanni_j up.
      U3) a) Ø_i Si arrabbio' perche' Ø_i non lo_j
             trovo' a casa.
             She_i got angry because she_i not him_j
             find at home.
          b) Ø_i/?j Si arrabbio'
             perche' Ø_j stava dormendo.
             She_i/?He_j got angry
             because he_j was sleeping.
          c) Lui_j si arrabbio' perche' Ø_j stava dormendo.
             He_j got angry because he_j was sleeping.
          d) Ø_j Si e' arrabbiato
             perche' Ø_j stava dormendo.
             He_j has gotten angry(-masc.)
             because he_j was sleeping.
4 Other versions of centering provide for two different
types of shifting \[BFP87\].
5 I am using referents of different gender, because
I want to show how gender and morphological markings
come into play when resolving reference. Notice
that these examples would not be ambiguous in English,
given that null subject is not an option available to a
speaker: the subject he/she would unambiguously pick
up its referent.
Various interesting facts come out from the 
four U3 variations 6. 
\[a\] The null subject refers to Maria, who, ac- 
cording to the rules in the previous section, 
is Cb(U3.a), and the highest ranked ele- 
ment in Cf(U3.a). U3.a thus demonstrates 
center continuation. The discourse is per- 
fectly coherent. 
\[b\] The most natural interpretation is that the 
null subject in the main clause refers to 
Maria - the null subject in the subordi- 
nate clause is forced to refer to Giovanni 
on pragmatic grounds. 
However, for this same pragmatic reason, 
on second thought the null subject in the 
main clause may be interpreted as referring
to Giovanni, but the discourse sounds
less coherent. 
\[c\] The speaker performs a felicitous center
shifting by referring to Giovanni with an 
overt pronoun, given that Giovanni was not 
Cb(U2), and not even the highest Cf(U2). 
\[d\] Contrast this utterance with \[b\]. They 
should have the same effect on the hearer, 
namely, the null subject should be inter- 
preted as referring to Maria: instead in 
\[d\] it is felicitously interpreted as referring 
to Giovanni. This happens because in \[d\] 
the verb is in the present perfect tense 7; 
the past participle agrees with the subject, 
and its masculine morphology forces the 
referent of the null subject to be Giovanni, 
and not Maria. 
It seems to me that Ex.4 and other similar 
examples point to the following generalizations: 
• typically, the speaker encodes center continuation
with a null subject. This agrees
with Kameyama's analysis of Japanese
\[Kam85\];
6 As a warning to the reader, notice that I am not
worrying about the interpretation of the null subject in
the subordinate causal clause, as it does not affect the
interpretation of the null subject in the main clause, and
it is affected by pragmatic reasons.
7 The temporal relation between the preceding discourse
and \[d\] is not right; U2 should also be in the past
perfect. However, this temporal incoherence does not
affect resolution of pronoun reference.
• he typically encodes center retention or 
shift with a stressed pronoun; 
• he can felicitously use a null subject in
cases of center retention or shift if he provides
Un+1 with syntactic features that
force the null subject to refer to a particular
referent and not to Cb(Un).
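Seen from the speaker's side, the three generalizations
above can be summarized as a small table of choices. This
is a deliberately simplified sketch: the function name
subject_forms, the labels "null" and "strong", and the
boolean forcing_features are hypothetical, and the
"typically" of the generalizations is rendered here as a
preference order rather than as a hard constraint.

```python
def subject_forms(move, forcing_features=False):
    """List felicitous subject forms, typical choice first (sketch).

    move: "continuation", "retention" or "shifting".
    forcing_features: True if the utterance carries syntactic
        features that force the null subject away from Cb(Un).
    """
    if move == "continuation":
        return ["null"]
    if move in ("retention", "shifting"):
        # A strong pronoun is the typical encoding; a null subject
        # becomes available only with forcing features.
        return ["strong", "null"] if forcing_features else ["strong"]
    raise ValueError(f"unknown move: {move}")
```

In terms of Ex. 4, U3.a corresponds to the first case, U3.c
to a shift without forcing features, and U3.d to a shift in
which the masculine past participle supplies them.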
My claim is that it is the syntactic context 
up to and including the verbal form(s) carrying 
tense and / or agreement that makes the ref- 
erence felicitous or not. Consider U3.d again: 
it is the fact that the main verb is marked for 
masculine that allows the null subject to refer 
to something different from Cb(U2). 
Analogous considerations hold for D2.b in Ex. 
2. There the clitic gli precedes the verb and
forces the null subject to refer to Maria. The 
fact that the clitic precedes the verb is crucial: 
evidence for this derives from examples involv- 
ing modal verbs and clitics. 
Ex. 5 U1) Maria_i e' arrabbiata con Giorgio_j:
          Maria_i is angry with Giorgio_j:
      U2) a) Ø_i non vuole piu' parlargli_j.
             she_i not wants any more talk-to-him_j.
          b) * Ø_j non vuole piu' parlarle_i.
             * he_j not wants any more talk-to-her_i.
          c) Ø_j non le_i vuole piu' parlare.
             he_j not to-her_i wants any more talk.
Here U2.a is perfect, with the null subject re- 
ferring to the higher Cf(U1), namely Maria. 
U2.b is incoherent: the null subject is interpreted
as referring to Maria, but when the clitic
le is found, at the end of the sentence, the hearer 
is forced to change interpretation. The effect is 
similar to a syntactic "garden path". 
U2.c is acceptable, for the very reason that the 
clitic le, that in U2.b is cliticized onto parlare, 
climbs in front of the modal verb vuole: so the
hearer is forced to exclude Maria as referent of 
the null subject. This happens early enough so 
that no "garden path" effect is registered. 
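The incremental reading of Ex. 5 can be sketched as a
left-to-right pass over the utterance. This is an
illustration under strong simplifying assumptions, not a
processing model: the function name interpret_null_subject
and the token format are hypothetical, each token is
annotated by hand with the referents it excludes as
subject, and the hearer's initial guess is taken to be the
highest ranked Cf of the previous utterance.

```python
def interpret_null_subject(tokens, cf_prev):
    """Resolve a null subject word by word (illustrative sketch).

    tokens:  list of (word, is_finite_verb, excludes) triples;
        excludes is the set of entities that the word rules out
        as referents of the subject (hand annotated).
    cf_prev: Cf(Un) as a list, highest ranked first.
    Returns (referent, garden_path).
    """
    referent = cf_prev[0]  # initial guess: highest ranked Cf(Un)
    excluded = set()
    verb_seen = False
    garden_path = False
    for word, is_finite_verb, excludes in tokens:
        excluded |= excludes
        if referent in excluded:
            # A revision after the finite verb comes too late.
            if verb_seen:
                garden_path = True
            remaining = [e for e in cf_prev if e not in excluded]
            referent = remaining[0] if remaining else None
        # Cues carried by the verb itself still count as early.
        if is_finite_verb:
            verb_seen = True
    return referent, garden_path
```

On U2.b the clitic on parlarle excludes Maria only after
vuole has been read, so the sketch reports a garden path;
on U2.c the climbed clitic le excludes her before the verb,
and no garden path is reported.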
4 Other phenomena 
The predictions presented in the previous sec- 
tion are quite reliable, but there are some cases 
that are not taken into account. 
4.1 Purpose of an utterance 
Consider the following example: 
Ex. 6 U1) Luisa_i ha lasciato suo marito_j:
          Luisa_i has left her husband_j.
      U2) Ø_*i/j picchiava i bambini e si ubriacava.
          he_j used to beat the children
          and get drunk.
In this case, Cb(U2) is Luisa's husband. U2 is fe- 
licitous, although the speaker uses a null subject 
to achieve a shift and no syntactic clue forces the 
null subject not to refer to Luisa. It looks like 
it is the function of U2, namely, explaining why 
Luisa left her husband, that licenses the use of 
a null subject in this case. 
It may even be argued that this case is out- 
side the purview of centering, which explicitly 
states that the referential phenomena accounted 
for are within a single segment: U2 may belong 
to a new segment, possibly much longer than 
what is shown here, that explains why Luisa left
her husband. 
On the other hand, it seems to me that the 
concept of local coherence is not totally depen- 
dent on having two utterances belonging to the 
same segment. The transition to another seg- 
ment may override centering predictions; never- 
theless, the referential expressions found in the 
first utterance of the new segment may need to 
be accounted for in terms of the Cf's of the last 
utterance of the previous segment. This may be 
what happens in Ex.6, if indeed U2 belongs to a 
new segment. 
4.2 Null subject referring to a whole 
discourse segment 
Reference to a whole discourse segment is gen- 
erally achieved in Italian by means of questo / 
cio', both equivalent to this, but sometimes a 
null subject is used (on this topic, see \[DiE89\]): 
Ex. 7 Questi grandi atleti sono illuminati dai 
These great athletes come under 
mass media ogni due, ogni quattro anni,
the media light every two, every four years, 
e devono conquistare una medaglia 
and they have to win a medal 
lottando contro il mondo intero 
fighting against the whole world 
per guadagnarsi l'affetto della gente. 
to gain people's affection. 
Mentre in altri sport (nel calcio soprattutto) 
In other sports (in soccer above all), 
l'amore, la celebrita', i denari 
love, fame, money 
sono quasi automatici, quasi obbligatori. 
are almost automatic, almost compulsory. 
Ø E' giusto?
Is this fair?
In the preceding example, the null subject 
in the last utterance refers to the whole pre- 
vious discourse: the fact that a null subject, 
namely, the pronoun with the least informative 
content, that should supposedly refer to an ex- 
pected referent, can be used in such a way, is a 
phenomenon that deserves explanation. 
In general, centering does not say anything
about reference to discourse segments, and in 
fact it may again be argued that clausal refer- 
ence has nothing to do with local coherence. 
This actually depends on the perspective from 
which we look at clausal reference: it is possible
that entities corresponding to discourse segments
are implicitly included in the Cf's set; or
that they are available for reference, but they
have a status different from the normal Cf's; or 
that they have a different status altogether, for 
example that they do not exist as centered en- 
tities until they are referred to for the first time 
\[Web88\].
In any of these three cases, a theory of dis- 
course coherence should at least partly address 
the problem. 
5 Conclusions 
In this paper, I have shown how the context up
to and including the verb helps in disambiguating
the reference for a null subject.
Some topics for future research have been dis- 
cussed in the previous section. Integrating the 
analysis of these phenomena with centering will 
shed some light on the whole phenomenon of 
reference. 
Centering gives us a vantage point for
looking at local coherence in discourse as embodied
by the choice of referring expressions
that a speaker uses. Languages with richer
morphological marking and agreement systems than
English can be very useful both to assess centering
and to refine its rules.
Acknowledgements. 
I would like to thank Prof. Bonnie Webber for 
her support and her comments on earlier ver- 
sions of this paper, and Prof. Aravind Joshi for
useful discussions. 

References 

\[BFP87\] Susan Brennan, Marilyn Walker
Friedman, and Carl Pollard. A centering
approach to pronouns. In Proc.
25th Meeting, Association for Computational
Linguistics, pages 155-162,
1987.

\[Cal86\] Andrea Calabrese. Pronomina:
Some properties of the Italian
pronominal system. In N. Fukui, T.
Rapoport, and E. Sagey, editors, MIT
Working Papers in Linguistics. Papers
in Theoretical Linguistics. Vol. 8,
1986.

\[DiE89\] Barbara Di Eugenio. Clausal
reference in Italian. In Proceedings Penn
Linguistics Colloquium, 1989.

\[GJW86\] B. Grosz, A. Joshi, and S. Weinstein.
Towards a computational theory of
discourse interpretation. 1986.
Unpublished manuscript.

\[Kam85\] Megumi Kameyama. Zero anaphora:
the case of Japanese. PhD thesis,
Stanford University, 1985.

\[Ste89\] M. Steedman. Grammar, interpretation
and processing from the lexicon.
In W. Marslen-Wilson, editor, Lexical
Representation and Process, MIT
Press, 1989.

\[Web88\] Bonnie Webber. Discourse deixis and
discourse processing. Technical Report
MS-CIS-88-75, Department of
Computer and Information Science,
University of Pennsylvania, 1988.

\[WIC90\] Marilyn Walker, Masayo Iida, and
Sharon Cote. Centering in Japanese
discourse. In Proc. COLING 90, 1990.
