Zero Pronoun Resolution 
in Japanese Discourse 
Based on Centering Theory 
OKUMURA Manabu 
S(:hool of Information Scicn(:e, 
,Japan Adva, nced \]ns(;ituto of 
St:tent(: an(l Technology 
Tal;sunokuchi, Ishikawa 923-12, ,1 at)all 
oku@jaist.ac, jp 
TAMURA Kouji 
Department of Softwm'c l)evch)t)m<:nt , 
Fujitsu 
10\] 5 Kamikodanaka, Nakahara-ku, 
Kawasaki 211, Japan 
k-tamura(0candy, paso. fujitsu, co. jp 
Abstract 
Recently therc have been a nmnber of 
works thai, IllOd(;1 (;ho. ZCl'O pronolllt res- 
olution wii:h t,h(: con(',el)t (:all(~d ~ccill,(}i.' 
Howcv(.w, (;he' llSCflllltess of (;ho. lncvious 
t;(;ntc.ring Kameworks has not fully c.wflu- 
al;ed with nat;urally oc('.urring (lis(:oms(~,q. 
Fm'therm(n'(~, tlm previous CCII|;CI'ilI~ |;}l(~- 
ory has bm(lh:d only th(' liht:nom(ma in 
SIl(;CC,qsiv{: silt(I)\](2 S(;Ill;CltCC,q ;l,lld h;4s I\]()|; 
adequately ad(h'cssed the way to han- 
dle comph;x s(,ntences that are t)rew> 
lent in naturally occurring discourses. In 
this paper, wc pr(:s(:nt a nt(:i;ho(l to ham- 
dh: (;ontph:x s(:nl;en(;t:s with the ccn(;(:ring 
th(:ory and d(:scril/(: our fl'mncw(nk thai; 
idendfi(:s (;h(: m~tccedcnts of z(',ro lno 
nomts in nal;urally occurring ./alm.n(:s(: 
discourses. Wc also t)r(:scnt (;lit', (:wfl- 
uation of our fl'amc.work with r('al dis- 
(:ours(:s. 
1 Introduction 
lilt many n~tural languages, (~lemenls that (:a.n be 
easily deduc(;d by tim reader ~r(~ frequently omit- 
(;ed front the. CXln'essions in discourso.s. Pmdc.u- 
la.rly in Jal)mms('. discourses, this omission o(:(;ms 
IltOl'C fi'eqllelt|;ly &ltd a ZClO pr()IlOllll is of|;O.ll llsed 
to avoid rel)(;ating ~ iiOllll t)hrasc thai; al)tmar(~d 
in 1;he previous S(Htt, CltC(kS. A ZCl'O I)l'()II()llll C}tll l)(t 
(xmsidei'cd as a noun \[)hra.se which is of an oblig- 
atory case and which is not (:xprossed but (:~m 
1)o. und(:rstood through tim c(ntt(~xt(Yoshimoto, 
1!)86). Thox(,.fore, 1;o un(h~rsl;m,t a ,\]al)mms('~ dis- 
(:oursc, it is imt)ol'l;mfl; to identify (;h('. mtL(me(lelfl;s 
of zoa'o pronouns. 
l{,(;ccntly thor( ~. have t)(~cn ~t lllllnl)(;r of works 
(;ha.t mo(lcl tim (zca'o) pronoun rt!solution with Chc 
(;on('x~.t)(, calh:d '(;()itl;('.r'(Grosz (.q; a,l., 11995; Ih'cn- 
nan el; al., 1!)87; -Walker et al., 1994; Kamo.ymna., 
I !)86). ~I'Ilc CO, Ill;(Willp~ I;h(',ory Lr'ios l;() iden(;i(y die 
;tAI|;(X',(',(1Oll|, of ;t (zero) i)rollOtllt l)y tim i(te~ t;llm; 
|;lw. ondl;y |;hal; a, S(~IltO, ll(;e ItlOS(; centrally con- 
(:(!rns(ctmi;(:r) (;ends (;o 1)(: (;Xl)rCss(:d by a (zt:r()) 
pronoun. Tim (:entering t;hoory has tit(: follow- 
ing adwmtagcs. Because it uses only the mlr- 
ftu:(', information in sent(races and does not n(:cd 
a huge amount of common sense knowledge to rc- 
solvo (zero) pronouns, it is easy to implem(:nt i|; 
on computer syst(:ms. Secondly, it; cmt bc appli(:d 
to many languages(Grosz (:1; al., 1995; Walk(!r (!t 
al., \] 994). 
hi spite of l;h(~se ndv;mt~gcs, unt'ortunat('.ly, the 
usefithmss of tim previous centering flmneworks 
has not flllly tc'stcd lmCmlSC only a small num- 
ber of construcl,ed discourses have b(;cn usc.d for 
cv~fluation. ~)~ think they should bc I;osi;o.d with 
~t corpus of naturally occurring discourses. How- 
(WOI', Stlch & |;(~S|; is II()W difficult becaus(~ the t)I'C- 
vious (:(mt, caing theory has only handled the t)tw- 
ItOtltOdtgL ilt Sll(;CC,qSivo. simph', stHltCllt;O,q and has ltOL 
ad(;quatdy addr(~,ss(~d the way to handh; c.omt)h~x 
,SCttI, CIIC(tS |~\[liLt &I'C l)t(.w~JenL it( na|;Ul'a\]\]y (/(;(:tlrring 
(lis(:ours(:s. 
lit this p~rt)('.r, w(; t)rcsc.n() a met;hod |;o hm> 
die comt)lex so.n|;(,.nc(;s with tim (;C.lfl;(~ring |;h(!ory 
and do.scribo, our Kmnework that identities tit(', a.n- 
t;(?ccd()it|;s of Z(.Wtl IIi'(lllOllltS in naturally occurring 
.\]aImnes(~ discourses. We also prcs(ml; |;ht! ewJua- 
(ion of our fi'mncwork with re~fl discourso.s. 
\]tl so(;t;i()n (;wo wc Oll(;\]itto, |;WO versiolls of 
(;he c(;n(;o.ring theory |;trot; have bccn al)plit~d to 
,Ja.tla.llO.,<-;c zero 1)l'OllOlllt r(!sohti;iolt, lit sectiou |;}lro(~ 
wo. exp1;fin how zero \[)l'()lt()lllIS in (;omp\[ex sen- 
\[,O.1IC(~S (;tilt bC h~mdled bas('d on (;\]m CtHtl;Cl'ilI~ th(> 
ory. In sc('.l;ion f()m wc (hm(:ribe a st% of the exper- 
iments that our zero pronoun rcs(flution method 
in s('~t:tion (;hre(~ is a.ppli('d i;o 1.'clJ ,\]ii.1)mt(;s(,. dis- 
COllI'SCS. 
2 Two Versions of the Centering 
Theory 
In tim (:(',ntx~ring th(:ory, (,a(:h SCII.I;CIIC( ~. \]l~toS l;W() 
StI'IICI;IlI'(Lq associated with it: a set of discom's(: 
(:nti|;i(:s called fi)rward-looking centers, Cfs, ilia\]; 
at)pear in dm sentenc(:, and a Sl)(:(:ial membt:r of 
Cfs (:alh~d tlm l/a(;kw~r(t-h)(/king ccitt(:(, Ub. Th(' 
(/I, is Lit(: dis(:ours(', cntii,y Lh;tt l;lm st:Ill;el((;(: Iltost 
c(:nt;rally concerns. A (7f may \[mconm a. Ct, later 
871 
in the discourse. The set of Cfs is ordered by 
their grammatical properties which are considered 
to reflect their degrees of salience. The centering 
theory specifies the following (heuristic) rule: 
If the Cb of the current sentence is the 
same as the Cb of the previous sentence, 
a (zero) pronoun should be used. 
There are two versions of the centering theory 
that have been applied to Japanese zero pronoun 
resolution: Kameyama's(Kameyama, 1986) and 
Walker's(Walker et al., 1994). Roughly both ver- 
sions use the following same forward center rank- 
ing for Japanese: 
Topic > Empathy > Subject > Object2 
> Object > Others, 
where Empathy is a grammatical property that in- 
dicates the speaker's position in describing a situ- 
ation. In addition to the above rule, Kameyama's 
version uses the property sharing constraint that 
two zero pronouns in adjacent sentences, which 
co-specify the same Cb, should share one of the 
grammatical properties. This constraint is used 
for ranking discourse entities in the order of pref- 
erence as the antecedent of a zero pronoun. 
Walker's version, on the other hand, uses the 
following additional rules and constraint: 
• Constraint 
For each sentence Ui: 
The center, Cb(Ui), is the highest- 
ranked element of C I (\[7/-1) that ap- 
pears in Ui. 
• Rules 
For each sentence Ui: 
1. If a certain element of C I (Ui-1) appears 
as a (zero) pronoun in Ui, then so is 
Cb(Ud. 
2. Transition states are ordered, where the 
transition state is determined based on 
two factors: whether Cb of the current 
sentence is the same as of the previous 
sentence, and whether Cb is the same 
as the highest-ranked member of C I of 
the current sentence. This transition or- 
dering is used for ranking discourse en- 
tities in the order of preference as the 
antecedent of a zero pronoun. 
Basically, when the centering algorithm is used 
for the (zero) pronoun resolution, the algorithm 
first generates all possible antecedents for (zero) 
pronouns in a sentence by enumerating all possible 
Ct, and C\] pairs for the sentence, and then filters 
and ranks these possible antecedents with the con- 
straint and rules that are mentioned above. The 
Cb of the sentence is computed as the side effect 
of performing the (zero) pronoun resolution. 
3 Processing Complex Sentences 
with the Centering Theory 
In the centering theory that we outlined in the last 
section, 'sentence', that is its basic unit of pro- 
cessing, means the simple sentence that contains 
only one predicate(verb). The centering theory, 
therefore, has not adequately addressed the way 
to handle complex sentences that contain nmltiplc 
verbs. However, it is necessary to handle complex 
sentences that are prevalent in naturally occurring 
discourses with the centering algorithms. 
We can think of (at least) two ways to handle 
complex sentences. For instance, consider process- 
ing a complex sentence of the form 'SX Conj SY,' 
where SX and SY each consists of a simple sen- 
tence and Conj is a conjunctive element(Suri and 
McCoy, 1994) 1. One can imagine processing SX 
first and then SY as if they are a linear sequence of 
simple sentences and applying the centering the- 
ory to each sentence successively and updating the 
data structures for centering. 
On the other hand, the whole sentence can be 
treated as a single unit. This approach, how- 
ever, has two problems. First, the intrasenten- 
tial ellipsis that the antecedent exists in the same 
sentence 9 cannot be handled with the centering 
theory, because the centering theory only han- 
dles the intersentential ellipsis. Therefore, the in- 
trasentential ellipsis must be dealt with separately 
from the intersentential ellipsis. Secondly, in the 
centering theory, it is unclear whether two zero 
pronouns with the same grammatical property in 
the different simple sentences (of a complex sen- 
tence) can be simultaneously handled without any 
extension to the original theory. 
Comparing these two approaches, we adopt the 
former. And if a sentence contains multiple verbs, 
we partition it into multiple simple sentences and 
apply the centering theory to a sequence of p~r- 
titioned simple sentences individually for the zero 
pronoun resolution. Using this approach, we need 
not modify the original centering algorithm dras- 
tically to handle complex senteimes. Even the in- 
trasentential ellipsis can be handled with the cen- 
tering theory, because different simple sentences 
contain the antecedent and the zero pronoun re- 
spectively, after partitioning. 
a.1 The range of search for the 
antecedent 
Since the centering theory uses only the infor- 
nmtion in the previous and current sentences, 
this might be problematic when we adopt the 
'partition' approach. \]ebr example, if the previ- 
ous sentence consists of three simple sentences, 
1In case of Japanese, it is a conjunctive 
postposition. 
2Of course, the antecedent does not exist in the 
same simple sentence. 
872 
tile firs(; simple sentence in tile previous seil- 
tence becomes tile third from the, current sen- 
tence, after partitioning. Partitioning might cause 
that, the information in the previous and current 
(post-partitioned simple) sentences does not in- 
chide even tile information in the current (pre- 
partitioned) sentence. We tilink it is inadequate, 
since the antecedents of zero pronouns often ap- 
pear in the previous (pre-partitioned) sentence. 
Therefore, it; is necessary to extend the range of 
search for (;he antecedent to more previous (post- 
partitione.d simple) sentences. 
To determine to what extent we should extend 
the range of search for the antecedent, we make 
the following investigations and e.xperiment: 
• How many simple sentences does a naturally 
occurring sentence consist of? 
• How many sentences from the current seil- 
tence do we find tim antecedent of a zero I)ro- 
noun in real discourses? 
• How does tile accuracy of tile zero pronoun 
resolution change if we vary the range of sim- 
ple sentences where the antecedent of a zero 
pronoun is searched? 
The first investigation is l)ertormed manually, 
and the result shows that 10,000 sentences of 
the review articles from the newspaper consist of 
24,332 simple sentences. Therefore, a naturally 
occurring Japanese sentence can be considered to 
consist of 2.0 2.5 simple sentences on average. 
The second investigation is performed manually 
on one of tile test discourses that are inentioned in 
the next section, and the result shows that 95% of 
the antecedents appear in the previous or current 
(pre-partitioned) sentence, This result is consis- 
tent with the larger-scale investigation that l%l- 
jisawa et al.(l%jisawa et al., :1991) made. for tile 
same purpose. Fnjisawa's investigation, on 1,087 
sentences of the scientific journal and 1,426 sen- 
tences of the review articles from the newspaper, 
showed that 87.6% of the antecedents appeared 
in the previous or current sentence and 95.1% ap- 
peared ill tile previous two sentences or current 
sentence. The third experiment is performed on 
two of the test discourses in the next section, by 
implementing two versions of the centering algo- 
rithms that are mentioned in the hast section anti 
wn'ying tim range of simple sentences where the. 
antecedent of a zero pronoun is searched from tile 
previous sentence to the previous ten sentences. 
Tile experiment shows that the accnraey improves 
until the previous 2 4 sentences are searched, but 
degrades after that. 
Totally taking into account these results, we 
determine that tile antecedents are searched in 
the previous four simple sentences. Since the an- 
tecedent tends to appear in the closer sentence to 
~he zero t)ronoun, as 141jisawa's investigation indi- 
cates, we deternfine the following forward center 
ranking among the Cfs of tile previous four simple 
Selltences: 
c} > > > c}, 
where C} ~ represents the C I of the n-th silnple 
sentence froln the current sentence. 
3.2 Taking into account the information 
of emdunetive postpositions 
Even if the antecedents are searched in the pre- 
vious tour simple sentences, simple ~partition' ap- 
proach might not yield good performance, because 
tile information of conjunctive postpositions that 
are between two adjacent simple sentences is not 
taken into account. For exmnple, consider the fol- 
lowing sentences: 
(a) Taro wa issyoukenmei benkyou 
siteita. 
(b) Jiro ga koe wo kake temo, 
kizukanakatta. 
These sentences are partitioned into the folh)wing 
simple, sentences: 
(a) Taro wa issyoukenmei benkyou siteita. 
Topic 
Taro was studying tmr(i. 
(bl)Jiro ga ( (/5 ni ) koe wo kake telno, 
Subj Conj 
Although Jiro called out to him, 
(b2X ¢:t ga ) ( (/)~ ui ) kiznkanakatta. 
he did not notice ilim. 
Here (/5 represents a zero pronoun. Applying the. 
centering algorithm to timse sentences, tile process 
becomes as follows: 
(a) : C~ = \[T,.'o\], C~ - \['?\]a 
(bl) : </5 = Taro, Cf = \[Jiro, Taro\], 6% = 
Taro 
(b2) : 4)1 = ,liT"o, (/52 = Taro, Cf = 
\[,lifo, Taro\], G, = Taro 
Therefore, the counter-intuitive interpretation 
that 'Jiro did not notice Taro' is obtained. 
Since two adjacent simple sentences in a com- 
plex sentence are combined together by the con- 
junctive postposition that indicates the relation- 
ship between theln, using the intorination of the 
conjunctive postposition might ilnprove tim per- 
tormanee of the zero pronoun resolution. 
To clarify how tile zero pronoun resolution relies 
on the intormation of conjunctive i)ostpositions, 
we pertorin the investigation whether tile noun 
phrases with the same grammatical property agree 
in two adjacent simple sentences that have a con- 
junctive postposition tmtween timm, by extracting 
sentences with conjunctive postpositions Dora the 
revie, w articles in the newspaper and enmnerating 
the agreement and disagreement. The enulnera- 
lion is performed in cases where both sentences 
haw; zero pronouns and only e.ither sentence has 
aThe first sentence in a discourse has no Ca. 
873 
a zero pronoun. Twelve main conjunctive postpo- 
sitions are investigated. The result of the investi- 
gation is quite similar to the Yoshimoto's and Mi- 
natal's investigations(Yoshimoto, 1986; Minaret, 
1974) that classify the conjunctive postpositions 
into three classes: 
,, Class A: 'nagara' ('while'), 'tart' ('and'), 
'tutu' ('while'), 'te' ('and') 4 
If two sentences have a conjunctive post;po- 
sition of class A between then:, the subject 
noun phrases tend to coincide, in both cases 
where both sentences have zero pronouns and 
only either sentence has a zero pronoun. 
• Class B: 'temo' ('although'), 'node' ('be- 
cause'), 'non)' ('although'), 'keredo' ('al- 
though'), 'ba' ('if'), 'kara' ('because'), 'to' 
('when') 
If two sentences have a conjunctive postposi- 
tion of class B between them, the antecedent 
tends to be not the subject of the other sen- 
tence, in case where only either sentence has 
the zero pronoun of the subject position. In 
case where both sentences have zero pro- 
nouns, tile agreement/disagreelnent depends 
on the context and doe.s not have any ten- 
dency. 
• Class C: 'ga' ('but') 
Tile agreement/disagreement depends on tile 
context and does not have any tendency. 
LFrom this result of the investigation, we deter- 
mine to apply to the zero pronoun resolution the 
following heuristics that are concerned with con- 
junctive postpositions. Since conjunctive postpo- 
sitions of class A have a strong preference that 
two subjects in adjacent sentences tend to coin- 
cide, instead of the centering alger)thin, we use 
this preference tbr tile zero pronoun resolution in 
the simple sentence after the conjuimtive postpo- 
sitions of class A, and try to find the antecedents 
of zero pronouns in the same position of the adja- 
cent sentence, if any. In this case also, the center 
of the current sentence is computed similarly to 
the ordinary algorithm, and the antecedent of the 
zero pronoun becomes the Cb of tile current sen- 
tence. 
In case of conjunctive postpositions of class B, 
the antecedent tends to lie not the subject of the 
other sentence if one of tile sentences has the zero 
pronoun of the subject position. We think this 
tendency in)plies that noun phrases in the sen- 
tence before the conjunctive t)ostt)ositions of class 
B tend to be not the antecedents of zero pro- 
nouns in the next sentences. Therefore, we give 
these noun phrases the least t)reference as the an- 
tecedents, although the. zero pronoun resolution is 
perforlned by the original centering algorithm. 
4In parentheses, we show the direct translation of 
conjunct;ive postpositions into English. 
Since conjunctive postpositions of class C have 
no preference for the antecedents of zero pronouns, 
the zero pronoun resolution is performed as usual. 
Consider again the following sentences: 
(a) Taro wa issyoukenmei benkyou siteita. 
Topic 
Taro was studying hard. 
(bl)liro ga ( ¢ ni) koe we lmke temo, 
Subj Conj 
Although Jiro called out to hiln, 
(b2)( 051 ga ) ( ¢2 eli ) kizukanakatta. 
he did not notice him. 
If the original centering algorithm is applied to 
each sentence uniformly, the counter-intuitive in- 
terpretation is obtained, as mentioned above. 
Taking into account the ilflbrmation of conjunc- 
tive postpositions and applying the above heuris- 
tics to the points, since (bl) and (l/2) have the 
conjunctive postposition of class B, 'temo' (%l- 
though'), between them, the noun phrases it: sen- 
tence (bl) have the least preference and the order 
of C~, ix) the sentence (bl) becomes tile opposite 
to the case of the original centering algorithm. 
Therefore, the antecedents of tile zero pronouns 
in sentence (b2) are identified as follows: 
ci = C,, -F\] 
(bi) : ¢ = T,,,.o, C, = Ji,.o\], C,, 
Taro 
(t/2) : (fit -- Taro, e/5 2 =,\]iro, C l = 
\[Taro, Jiro\], Ct, = Taro 
Here this interpretation that 'Taro (lid not notice 
Jiro' fits our intuition. 
4 Experiment and Discussion 
In the last; section, we des(:ribed our zero pronoun 
resolution method that can handle colnplex sen. 
tences based on tile centering theory. It di\[lk;rs 
Koln tile original centering algorithm in the follow- 
ing two t)oints. After partitioning COlnplex sen- 
tences into inultiple sinli)le sentences, it searches 
tile ~mtecedents in the previous four simple sen- 
tences, instead of only a previous sentence. Sec- 
ondly, it; takes into account the information of 
conjunctive postpositions that are between two 
sinlple sentences, by classifying them into three 
classes. 
In this section, we describe the experiments that 
our zero pronoun resolution method is applied to 
real Japanese discourses, to evaluate the effective- 
ness. We inlplement two versions of our zero pro- 
nouI) resolution systems which are based on two 
versions of the centering algorithms teat are inen- 
tioned in sect)oil two respectively, and ewfluate 
the Imrl'orina.nce by comparing ours with the per- 
fl)rmanee of the original ee.ntering algorithms. 
As our test, discourses, we use 275 (pre- 
partitioned) sentences from tive discourses in to- 
tal, whietl are a review article in the newspaper, 
a tblk-tale, and a novel. Before the experiments, 
874 
~lifl)le 1: The performance of l;tm systems based 
on Kameyama's algorithm \[ 
method ~cor{'e(:E~ i};corre(:t-~ a(:cura.cy \[__ 
_ k-in_I 7_ ci -\] .... I 
Table 2: The performmme, of the systems base(1 
on Walker's algorithm 
\[metho(l\[ corre(:t \[ .h~rre(:i7 \] accura(:y 1 
L ~ .... ~, in /!t A TM _1 .... d 
--~2- ~_- 21-2-- Z 72~ -30 __1--(~7.5(/0-J 
these (liscours(,s are automati(:ally l);~rtitione(l im;o 
simple sent(mc(;s and re(:(;ive(l stru(:tural rarely- 
,sis, and the positions of zero pron(mns ~tr(': a,ul;o- 
inati(:ally i(tentili(',d as missing o|)liga.tory cases of 
w'a't)s. Then, the results of this prel)rocessing m'e 
inanually (:orre(:t(',d. The zero prOllOUll whos(; &11- 
tece(h'a~t ai)pears after it, i,e., t;he catat)hori(: one, 
~11d I;he zero t)l'OllOllIl whoso, anl;ec(;(l(ml; (lots nol; 
appear in the dis(:ourse, ;~xe outsi(l('~ th(! s(:op(! of 
this t)a.t)er. Thosc zero pronom~s are 30% of till 
l;he zero pl'Oll()ltlis itl Ollr l;(!s~ (|iscolll'8(,'s. 
'i'h('. correct ~tllt(~('.(~(|(}lll;,q a.l'O manually i(h,m, ified 
b(ff'orehand agains|; each zero \[)ronollli, a.lld l;h('. 
t)(M'ormmw.(~ is (:omtmlx~(l bas(xl (m I;h(;se answers. 
The experim(ml;s are ma(te on the following l;hree 
CD, SO, S: 
l. The original c('.nl;('a'ing alg(}rit;hm l;hat uses l;hc 
i/lforlnal;ion of only a previous simple sen- 
tence 
2. 'rhe Mgoril;hm that; s(',ar(:hes l;lm anl;c(:(~d(ml;s 
in the t)revious Ibm" simple s(~lll;(~llC(~,'--; 
3. Th(', a,lgoril;hm not only s(~ar(:h(~s Llle ;i.iI~ 
tec('d(mt, s in l;lm previous l'om' simph; s('a,- 
ten(:(~s, but also takes into a(:coun(; t, he infof 
marion of (:onjun(:t;ive 1)osfiposil;ions I;h;~t; are 
1)etwee, n two siml)lc 80111;(~11(:C8 
The results of the experim(;nl;s on two versions 
of our systems arc shown it, 'l'abh', 1 and 2, where 
l;h(', columns of k:orr(',(:t' and 'in(:orred;' show t, he 
nllnlb(;rs of the (:orre(:l, &lid in(:orr(?(:l; answers th;tl; 
l;he sysl;(',m oul;puL~; r(!sp(',(:tively, a.m\[ l,hc (;ohmms 
of 'in' and 'not in' show |;he I/llIII\])(WS ()\[ C~I,S('~S 
whet(' can(li(lat(;.~ of a.nt(',ce(h;nts in the system in- 
chMc the correct a.nswer and l;\]m mind)or of cases 
where tim system (toes 11o\[, h&v(! ~h(,, corr(?cl; at~sw(;r 
as l,h(; can(li(latcs, rest)e('tiv(',ly. 
Air, hough the original ('(;ntx;ri~lg alg()rithms yield 
the performa~im(, of 60 70%, th(;y have many 
cgtsos Wh(':l'O, th(! sys|;em canno\[; gel; the corrccl; an- 
swer ~ts l;hc candidat;es(qlot in') and cannot out- 
put correct; auswers. This indicates that there &l'O. 
mm~y cases where |;ho, coil(,~(;L &IiL(?ccd(~tl|;,q do i1o|; 
nt)t)ear in the previous S(~,III;(~,IIC(~,, ~ldld implies l;he 
phmsibility of our first modific~tion l;o the origina.l 
algorithm. The improvement of the I)e.rforma.nce 
in method 2 also ilntflies tit(! plausibilil;y of our 
method. \],'lUl;h(~rlnore, taking into a(:(:OUil{; 1;he ill- 
formation of conjunctive t)ostt)ositions improve.s 
l;he \[)crformnnt:e 1)y 3 6%. Totally, (:ompnr('d 
wil;h 1;he original c('.nl;ering algorit;hms, t;}w. perfor- 
mance of our reel;hod improves by 7 \]0%. 
Since the zero 1)ronoml r(~so\[ution method l;hal; 
is t)a.sexl on the (:(~.lll;Ol'illg theory uses the results 
of the zero pronolm resolution in previous sen- 
tences, 'error~cha.ining' might; occur many times. 
\]';rror-chahfing occurs when the idenl;iticatit)n of 
& wrong anl;(~(:e(l(mt; ('.raises anol;her wrong zero 
\])FOllOtlll rt}sohll;ion sut:(;essiv(',ly, ltl (;~/,q(} ()f OllF 
system(method 3) based on Kalne, yanm.'s algo- 
rilJun, 30.2% of the ilICOIT(~Ct; ~I,1ISW0,1'.% kl2"O (hi0 I,o 
this error--chainint;. There is also the possibilil,y 
where the (;orr(;(:l; an,qwt~rs m'c Olll;t)lll; |)(x:ausc of 
Lh(! wrtillg zero llrOllOlln reso\]ui;ion in 1;11(.' previous 
senlx;nces, in case of our sysl;enl(lnci;ho(1 3) 1)ased 
on Kameyama's algorithnl, only 1.2% of th(', (:of 
red; ;mswers are due to this 'false negative.' 
As you notice fi'om the above two t~fl)les, l;hc, re 
cxisI; about 30 cases where the mfl;e,t:ed(mt;s ~tl)t)t',ar 
in more l,han live s(mtt',nces fi'oin th(~ CtllTt!tll, sitll- 
pie sentem:c and camlot; be. tound t)y our lll(~I;hod. 
YVc think these (;ase.s should not t)(.' handled simply 
t)y extending l;he search l'~tllge t'(/1' l;h(; ~mt;ec.t~dent;s, 
Iml, by utilizing tim informa, tiotl of global struc.- 
l;/tr{ ~, of discourses(Grosz and Sidncr, 1986), be- 
cause the digressive sub-discourse is inserted be- 
tw(.'en the. mfl;e(:t~dents mid the zero l)I'OtlOlltlS it, 
mosl; of these, ca.sos. 
Of ('x)urse, l,h('a'e have 1)ecn tim zero l)rOllOllll 
r(~solul;iot~ ~t)l)r(la.(:lms l;ha.t t;tke into a.(:(:ount the 
information of (:(mjunctiv('. (~l('meuts(Nakaiwa a.n(l 
\[kehara, 1992; Nakaga.wa. and Nishizaw~, 1994; 
¥oNhil\[lOl;O, \],(JS(i; SIll'i all(l NIcC()y, 1994:). t/e- 
c.mme N;,kaiw~ds(Nakaiwa. mtd lkehara., 1992)mM 
Na.kagawa's(Nakagawa and Nishizawa, 1994) a.p- 
t)ro~w.ht'.s use /,he in_forma.l;ion in a r(~stricl;(~(l (it)- 
main or ~oo fine-grainc(l grmmna.ti(:al information, 
we think they are dif\[icult {;o be tune(1 to th('. 1)road 
cov(~rag(' z(;ro l)ronoun r(;solution system. Furth(',r-- 
more, Nakagaw~L's and Y()shimol,o's(Yoslfimol;o, 
1986) a.t)t)r()a(:h(',,q are not; fully ('va.lual;(~d with rea.1 
(lis(:om's(~s. Although Nakniwa's ~q)proa(:h yM(ts 
high ,~;u(:(:(',ss; rat(; (if 93%, he uses rath(,r small t(!s{, 
sets(102 ,q(mt;enc(~s from 29 a.rt;icl(~s), and the input, 
is r(~,qtrM,(~(t Ix) th(~ first t)nragrat)hs of neWSl)a.l)(~r 
~u'ti(:h;s. 
,qm'i's work(Suri a.u(l M(:Coy, \]994) mighl; t)e 
one of tim few works that extend I;h(; (:(ml,(~rh~/, , 
I\]a.m(;w()~k 1,() hind(lie (:Oml)h'.x ,q(\]lll;(~.ll(:(.'s, although 
8'/5 
she handles only sentences of the form 'SX because 
SY,' and uses Sidner's focusing framework(Sidner, 
1983), that is different from the centering theory 
that our method is based on. Purthermore, the 
effectiveness of her work is not evaluated with real 
discourses. 
Takada's work(T&kada and Doi, 1994) might be 
the only exception that proposes the zero pronoun 
resolution method based on the centering theory 
and evaluates its effectiveness with real discourses. 
Since he handles not only zero pronouns but also 
overt pronouns, the exact comparison is difficult, 
but his approach, that is based on Kameyama's 
approach, yields the performance of 74.8% if the 
results for overt pronouns are excluded. In addi- 
tion, to handle complex sentences, he adopts the 
other approach where they are treated as a single 
unit, and admits that some problems arise because 
of this approach. 
Taking into account the information of conjunc- 
tive elements in the pronoun resolution reminds us 
of the works that use the establishment of coher- 
ence relations between clauses for pronoun reso- 
lution(Hobbs, 1979; Kehler, 1993). They try to 
establish coherence relations by the costly infer- 
ence, while we use only the surface information. 
5 Conclusion 
In this paper, we presented a simple method to 
handle complex sentences with the centering the- 
ory and described our framework that can identify 
the antecedents of zero pronouns in naturally oc- 
curring Japanese discourses. We also presented 
tile evaluation of our framework with real dis- 
courses, although the evaluation is not so large- 
scale to assert the effectiveness of our framework. 
Our simple method yielded the accuracy of 78% 
for the zero pronoun resolution. 
Since our method that is presented in this paper 
is based on the centering theory and basically uses 
only syntactic information, we plan to incorporate 
the semantic constraints that filter anomalous an- 
tecedents for zero pronouns, and take into account 
the global structure of discourses. We think the 
preliminary results of our system in this paper are 
promising since incorporating the information of 
semantic constraints and the global structure of 
discourses will improve the performance. 
References 
S.E. Brennan, M.W. Friedman, and C.J. Pollard, 
A Centering Approach to Pronouns, Proc. of 
the 25th Annual Meeting of the Association for 
Computational Linguistics, pp.155-162, 1987. 
S. l~Sjisawa, S. Masuymna, and S. Naito, A Ba- 
sic Study on Ellipsis and Anaphora in Japanese 
Sentences, IPSJ SIG Notes, Vol. 91, No. 96, in 
Japanese, 1991. 
B.J. Grosz and C.L. Sidner, Attention, Intentions, 
and the Structure of Discourse, Computational 
Linguistics, Volume 12, Number 3, 175-204, 
1986. 
B.J. Grosz, A.K. Joshi, and S. Weinstein, Center- 
ing: A Framework for Modeling tile Local Co- 
herence of Discourse, Computational Linguis- 
tics, Volume 21, Nmuber 2, pp.203-225, 1995. 
J.R. Hobbs, Coherence and Coreference, Cogni- 
tire Science, Volume 3, Number 1, pp.67-90, 
1979. 
M. Kameyama, A Property-Sharing Constraint in 
Centering, Proc. of the 2~th Annual Meeting of 
the Association for Computational Linguistics, 
pp.200-206, 1986. 
A. Kehler, The Effect of Establishing Coherence 
in Ellipsis and Anaphora Resolution, Proc. o\] 
the 31st Annual Meeting of the Association for 
Computational Linguistics, pp.62-69, 1993. 
F. Minami, Gendai Nihongo no Kouzou, 
Taishuukan, Tokyo, in Japanese, 1974. 
H. Nakagawa and S. Nishizawa, Semantics of 
Complex Sentences in Japanese, Proe. of 
the 15th International Con\]erence on Compu- 
tational Linguistics, pp.679-685, 1994. 
H. Nakaiwa and S. Ikehara, Zero Pronoun Reso- 
lution in a Japanese to English Machine Trans- 
lation System by using Verbal Semantic At- 
tributes, Proe. of the 3rd Conference on Ap- 
plied NatuTYtl Language PTveessing, pp.201-208, 
1992. 
C.L. Sidner, Focusing in tile Comprehension of 
Definite Anaphora, In M. Brady and R.C. 
Berwick, editors, Computational Models of Dis- 
course, MIT Press, pp.267-330, 1983. 
L.Z. Suri and K.F. McCoy, RAFT/RAPR and 
Centering: A Comparison and Discussion of 
Problems Related to Processing Complex Sen- 
tences, Computational Linguistics, Volume20, 
Number 2, pp.301-317, 1994. 
S. Takada and N. Doi, Centering in Japanese: 
A Step \]bwards Better Interpretation of Pro- 
nouns and Zero-Pronouns, Proe. of the 15th In- 
ternational Conference on Computational Lin- 
guistics, pp.1151-1156, 1994. 
M.A. Walker, M. Iida, and S. Cote, Japanese 
Discourse and the Process of Centering, Com- 
putational Linguistics, Volume 20, Number 2, 
pp.193-232, 1994. 
K. Yoshimoto, Study of Japanese Zero Pronouns 
in Discourse Processing, IPSJ SIG Notes, Vol. 
86, No. 52, in Japanese, 1986. 
876 
