Pronoun Resolution in Japanese Sentences 
Using Surface Expressions and Examples 
Masaki Murata Hitoshi Isahara 
Communications Research Laboratory 
588-2, Iwaoka, Nishi-ku, Kobe, 651-2401, Japan 
{murata, isahara}@crl.go.jp
TEL: +81-78-969-2181, FAX: +81-78-969-2189 
Makoto Nagao 
Kyoto University 
Sakyo, Kyoto 606-8501, Japan 
nagao@pine.kuee.kyoto-u.ac.jp
Abstract 
In this paper, we present a method of estimating ref- 
erents of demonstrative pronouns, personal pronouns, 
and zero pronouns in Japanese sentences using examples,
surface expressions, topics and foci. Unlike conventional
work which used semantic markers for semantic
constraints, we used examples for semantic constraints 
and showed in our experiments that examples are as 
useful as semantic markers. We also propose many new 
methods for estimating referents of pronouns. For example,
we use the form "X of Y" for estimating referents of
demonstrative adjectives. In addition to our new meth- 
ods, we used many conventional methods. As a result, 
experiments using these methods obtained a precision 
rate of 87% in estimating referents of demonstrative pro- 
nouns, personal pronouns, and zero pronouns for training 
sentences, and obtained a precision rate of 78% for test 
sentences. 
1 Overview 
This paper describes how to resolve the referents of pro- 
nouns: demonstrative pronouns, personal pronouns, and 
zero pronouns. Pronoun resolution is especially impor- 
tant for machine translation. For example, if the sys- 
tem cannot resolve zero pronouns 1, it cannot translate 
sentences containing them from Japanese into English. 
When the word order of sentences is changed and the 
pronominalized words are changed in translation into 
English, the system must detect the referents of the pro- 
nouns. 
A lot of work has been done in Japanese pronoun resolution
(Kameyama 86) (Yamamura et al. 92) (Walker
et al. 94) (Takada & Doi 94) (Nakaiwa & Ikehara 95). 
The main distinguishing features of our work are as fol- 
lows: 
• In conventional pronoun resolution methods, se- 
mantic markers have been used for semantic con- 
straints. On the other hand, we use examples for 
semantic constraints and show in our experiments 
that examples are as useful as semantic markers. 
This is an important result because the cost of con- 
structing the case frame using semantic markers is 
generally higher than the cost of constructing the 
case frame using examples. 
• We use examples in the form "X no Y" (Y of X) for 
estimating referents of demonstrative adjectives. 
1 Omitted noun phrases are called zero pronouns. 
Condition => {Proposal Proposal ...}
Proposal := (Possible-Antecedent Points)
Figure 1: Form of Candidate enumerating rule
Condition => (Points)
Figure 2: Form of Candidate judging rule
• We deal with the case when a demonstrative refers 
to elements that appear later. 
• We resolve a personal pronoun in a quotation by 
determining who is the speaker and who is the lis- 
tener. 
In this work, we used almost all of the promising conventional
methods and also propose new methods.
2 The Framework for Estimating the 
Referent 
Prior to the pronoun resolution process, sentences are 
transformed into a case structure by a case structure 
analyzer (Kurohashi & Nagao 94). The antecedents of 
pronouns are determined by heuristic rules from left to 
right. Using these rules, our system assigns points to 
possible antecedents, and judges that the one having the 
maximum total score is the desired antecedent. 
Heuristic rules are classified into two kinds: Candi- 
date enumerating rules and Candidate judging rules. Can- 
didate enumerating rules are used in enumerating can- 
didate antecedents and giving them points (which rep- 
resent the plausibility of being the correct antecedent). 
Candidate judging rules are used in giving points to the 
candidate antecedents selected by Candidate enumerating 
rules. These rules are shown in Figures 1 and 2. Surface 
expressions, semantic constraints, referential properties, 
etc. are written as conditions in the Condition part. Pos- 
sible antecedents are written in the Possible-Antecedent 
part. Points means the plausibility of the possible an- 
tecedent. 
An estimation of the referent is performed using the
total scores of possible antecedents given by Candidate
enumerating rules and Candidate judging rules. First, the
system applies all Candidate enumerating rules to the
anaphor and enumerates candidate antecedents having
points. Next, the system applies all Candidate judging
rules to all the candidate antecedents and sums the
scores of each candidate antecedent. Consequently,
the system judges the candidate antecedent having the
best score to be the proper antecedent. If several candidate
referents have the best score, the candidate referent
selected first in order 2 is judged to be the correct
antecedent.
Table 1: The weight in the case of topic
Surface expression                                     Example                             Weight
Pronoun/zero-pronoun ga/wa                             (John ga (subject)) shita (done)    21
Noun wa/niwa                                           John wa (subject) shita (do)        20
Table 2: The weight in the case of focus
Surface expression                                     Example                             Weight
Pronoun/zero-pronoun wo (object)/ni (to)/kara (from)   (John ni (to)) shita (done)         16
Noun ga (subject)/mo/da/nara                           John ga (subject) shita (do)        15
Noun wo (object)/ni/,/.                                John ni (object) shita (do)         14
Noun he (to)/de (in)/kara (from)                       gakkou (school) he (to) iku (go)    13
We made 50 Candidate enumerating rules and 10 Can- 
didate judging rules for analyzing demonstratives, 4 Can- 
didate enumerating rules and 6 Candidate judging rules for 
analyzing personal pronouns, and 19 Candidate enumer- 
ating rules and 4 Candidate judging rules for analyzing 
zero pronouns. Some of the rules are described in the 
following sections. 
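The two-stage scoring procedure described in this section can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `resolve`, the representation of rules as Python callables, and the toy rules and point values in the usage lines are all assumptions made here.

```python
from collections import defaultdict


def resolve(anaphor, enumerating_rules, judging_rules):
    """Return the candidate antecedent with the highest total score (or None)."""
    totals = defaultdict(int)
    order = []  # candidates in the order in which they were first proposed
    # Stage 1: each enumerating rule proposes (candidate, points) pairs.
    for rule in enumerating_rules:
        for candidate, points in rule(anaphor):
            if candidate not in totals:
                order.append(candidate)
            totals[candidate] += points
    # Stage 2: each judging rule adds points to every enumerated candidate.
    for rule in judging_rules:
        for candidate in order:
            totals[candidate] += rule(anaphor, candidate)
    if not order:
        return None
    # max() returns the first maximal element, so a tie goes to the
    # candidate that was proposed first.
    return max(order, key=lambda c: totals[c])


# Hypothetical toy rules and point values, for illustration only.
def propose_nearby_nouns(anaphor):
    return [("taroo", 18), ("konpyuuta", 17)]


def penalize_human(anaphor, candidate):
    return -10 if candidate == "taroo" else 0


best = resolve("sore", [propose_nearby_nouns], [penalize_human])
print(best)
```

With these toy values the human candidate drops to 8 points, so the non-human candidate with 17 points is selected. Breaking ties by proposal order is one simple way to model "the candidate referent selected first in order".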
3 Heuristic Rules for Demonstratives 
We made heuristic rules for demonstratives by consulting 
the papers (NLRI 81) (Hayashi 83) (Takahashi et al. 90) 
(Kinsui & Takubo 92) and by examining Japanese sen- 
tences by hand. Demonstratives have three categories: 
demonstrative pronouns, demonstrative adjectives, and 
demonstrative adverbs. In the following sections, we ex- 
plain the rules for analyzing demonstratives. 
3.1 Rule for Demonstrative Pronouns 
Rule in the case when the referent is a noun 
phrase 
Candidate enumerating rule 1 
When a pronoun is a demonstrative pronoun or "sono
(of it) / kono (of this) / ano (of that)",
{(A topic which has weight W and distance D,
W - D - 2)
(A focus which has weight W and distance D,
W - D + 4)}
This bracketed expression represents the lists of pro- 
posals in Figure 1. The definition and weight W of 
the topic and focus are shown in Tables 1 and 2. The 
distance (D) is the number of topics and foci between
the demonstrative and the possible referent. Since a
demonstrative more often refers to foci than a zero pronoun
does, we add the coefficient -2 or +4 as compared
with the heuristic rules in zero pronoun resolution. 
The score (in other words, the certainty value) of a
candidate referent depends on the weight of topics/foci
and the physical distance between the demonstrative and
the candidate referent.
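The point computation in Candidate enumerating rule 1 can be illustrated with a short sketch. The weights are those listed in Tables 1 and 2; the dictionary keys, the function name `demonstrative_points`, and the distances used in the usage lines are assumptions made here for illustration.

```python
# Weights W as listed in Tables 1 and 2; the dictionary keys are labels
# invented here for the surface-expression classes.
TOPIC_WEIGHTS = {
    "pronoun ga/wa": 21,
    "noun wa/niwa": 20,
}
FOCUS_WEIGHTS = {
    "pronoun wo/ni/kara": 16,
    "noun ga/mo/da/nara": 15,
    "noun wo/ni": 14,
    "noun he/de/kara": 13,
}


def demonstrative_points(kind, surface_class, distance):
    """Points proposed by Candidate enumerating rule 1 for one candidate."""
    if kind == "topic":
        return TOPIC_WEIGHTS[surface_class] - distance - 2  # W - D - 2
    if kind == "focus":
        return FOCUS_WEIGHTS[surface_class] - distance + 4  # W - D + 4
    raise ValueError(f"unknown kind: {kind!r}")


# Candidates for "kono doru-daka" in the example of Section 6.1;
# the distances 1, 2 and 3 are assumed here.
print(demonstrative_points("focus", "noun wo/ni", 1))       # 130-yen-dai-ni
print(demonstrative_points("focus", "noun he/de/kara", 2))  # kitai-kara
print(demonstrative_points("topic", "noun wa/niwa", 3))     # doru souba-wa
```

Under these assumed distances the three candidates receive 17, 15 and 15 points, which agrees with the points reported for them in the worked example of Section 6.1.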
Rule when the referent is a verb phrase 
Candidate enumerating rule 2 
When a pronoun is "kore/sore/are" or a demonstrative
adjective,
{(The previous sentence (or the verb phrase which is a
conditional form containing a conjunctive particle such
as "ga (but)", "daga (but)", and "keredo (but)" if the
verb phrase is in the same sentence), 15)}
2 The order is based on the order in which the rules are applied.
Table 3: Points given in the case of demonstrative
pronouns
[Table body garbled in extraction: points for each similarity level (Sim.); values not recoverable.]
Sim. = Similarity level
The following is an example of a pronoun referring to 
the verb phrase in the previous sentence. 
tengu-wa maenoban-noyouni utattari odottari shihajimeta. 
(tengu) (the previous night) (sing) (dance) (begin to do) 
(Tengus began singing and dancing just as they had done
the previous night.)
ojiisan-wa sore-wo mite, kon'nahuuni utai-hajimeta. 
(the old man) (it) (see) (as follows) (begin to sing) 
(When the old man saw this, he began to sing as follows.) (1) 
In these sentences, the demonstrative pronoun "sore (it)"
refers to the event "tengutachi-ga utattari odottari
shihajimemashita (tengus began singing and dancing just
as they had done the previous night)" 3.
Rule using the feature that demonstrative 
pronouns usually do not refer to people 
Candidate judging rule 1 
When a pronoun is a demonstrative pronoun and a can- 
didate referent has a semantic marker HUM (human), 
it is given -10. We used the Noun Semantic Marker 
Dictionary (Watanabe et al. 92) as a semantic marker 
dictionary 4 . 
Candidate judging rule 2 
When a pronoun is a demonstrative pronoun, a candi- 
date referent is given the points in Table 3 by using the 
highest semantic similarity between the candidate referent
and the codes {5200003010 5201002060 5202001020
5202006115 5241002150 5244002100} in "Bunrui Goi
Hyou (BGH)" (NLRI 64) 5 which signify human beings.
3 A tengu is a kind of monster.
4 This dictionary includes the semantic categories shown in
Table 4.
5 In BGH, each word has a number called a category number.
In an electronic version of BGH, each word has a 10-digit
Table 4: Modification of category number of "bunrui
goi hyou"
Semantic marker               Original code    Modified code
ANI (animal)                  156              511
HUM (human)                   12[0-4]          52[0-4]
ORG (organization)            12[5-8]          53[5-8]
PLA (plant)                   155              611
PAR (part of living thing)    157              621
NAT (natural)                 152              631
PRO (products)                14[0-9]          64[0-9]
LOC (location)                117,125,126      651,652,653
PHE (phenomenon)              150,151          711,712
ACT (action)                  13[3-8]          81[3-8]
MEN (mental)                  130              821
CHA (character)               11[2-58],158     83[2-58],839
REL (relation)                111              841
LIN (linguistic products)     131,132          851,852
Others                        110              861
TIM (time)                    116              a11
QUA (quantity)                119              b11
"125" and "126" are given two category numbers. 
When we calculate the semantic similarity, we use the 
modified code table in Table 4. The reason for this 
modification is that some codes in BGH (NLRI 64) are 
not suitable for semantic constraints. 
These rules use the feature that a demonstrative pro- 
noun rarely refers to people. This reduces the num- 
ber of candidates of the referent. For example, we find 
"sore (it)" in the following sentences refers to "konpyuuta 
(computer)", because "sore (it)" can refer only to
a thing which is not human and the only noun which is
near "sore (it)" and which is not human is "konpyuuta 
(computer)". 
taroo-wa saishin-no konpyuuta-wo kaimashita. 
(Taroo) (new) (computer) (buy) 
(Taroo bought a new computer.) 
jon-ni sassoku sore-wo misemashita.
(John) (at once) (it) (show)
([He] showed it at once to John.) (2)
Rule with feature that "koko" and "soko"
often refer to locations
Candidate judging rule 3 
When a pronoun is "koko (here) / soko (there) / asoko
(over there)" and a candidate referent has a semantic
marker LOC (location), the candidate referent is given
10 points.
Candidate judging rule 4 
When a pronoun is "koko/soko/asoko ", a candidate ref- 
erent is given the points in Table 5 based on the seman- 
tic similarity between the candidate referent and the 
category number. This 10-digit category number indicates 
seven levels of an is-a hierarchy. The top five levels are
expressed by the first five digits of a category number. The
sixth level is expressed by the following two digits of a cat- 
egory number. The last level is expressed by the last three 
digits of a category number. 
Table 5: Points given to demonstrative pronouns which
refer to places
[Table body garbled in extraction: points for each similarity level (Sim.); values not recoverable.]
codes {6563006010 6559005020 9113301090 9113302010
6471001030 6314020130} which signify locations in
BGH (NLRI 64).
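The similarity levels used by these rules are computed over 10-digit BGH category numbers, which encode seven hierarchy levels (five single digits, then two digits, then three digits). The following sketch assumes that the similarity level of two codes is the number of leading hierarchy levels on which they agree, with seven matching levels corresponding to "Exact"; this definition and the function names are assumptions made here, not something stated explicitly in the text.

```python
# Digit widths of the seven is-a hierarchy levels in a 10-digit code.
LEVEL_WIDTHS = (1, 1, 1, 1, 1, 2, 3)


def split_levels(code):
    """Split a 10-character category number into its seven level fields."""
    if len(code) != sum(LEVEL_WIDTHS):
        raise ValueError(f"expected a 10-character code, got {code!r}")
    fields, pos = [], 0
    for width in LEVEL_WIDTHS:
        fields.append(code[pos:pos + width])
        pos += width
    return fields


def similarity_level(code_a, code_b):
    """Number of leading hierarchy levels shared by the two codes (0 to 7)."""
    level = 0
    for a, b in zip(split_levels(code_a), split_levels(code_b)):
        if a != b:
            break
        level += 1
    return level


def highest_similarity(candidate_codes, reference_codes):
    """Highest similarity level between any candidate and any reference code."""
    return max(
        similarity_level(c, r) for c in candidate_codes for r in reference_codes
    )


print(similarity_level("6563006010", "6559005020"))
print(highest_similarity(["6563006010"], ["6563006010", "6314020130"]))
```

The first call compares two of the location codes listed above, which agree on their first two levels only; the second finds an exact match among the reference codes.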
"soko (there)" commonly refers to location. For ex- 
ample, "soko" in the following sentences refers to "baiten 
(shop)" which signifies location. 
koora-wo kaini baiten-ni hairimashita. 
(cola) (buy) (shop) (enter) 
(Taroo entered a shop to buy a cola.) 
jiroo-wa soko-de guuzen dekuwashimashita.
(Jiroo) (there) (by chance) (meet)
(Jiroo met Taroo there by chance.) (3)
Rule when "kokode" or "sokode" is used as a
conjunction
Candidate enumerating rule 3 
When a pronoun is "kokode" or "sokode", 
{(the pronoun is used as a conjunction, 11)} 
This rule is for when "kokode (here or then)" or 
"sokode (there or then)" is used as a conjunction. If 
a word that signifies location is not found near "kokode" 
or "sokode", the candidate listed by this rule has the 
highest score, and "kokode" or "sokode" is judged to be 
a conjunction. By using this rule, "sokode" in the fol- 
lowing sentences is judged to be a conjunction. 
ojiisan-wa tengu-ga kowakunakunatte-imashita.
(old man) (tengu) (lose all fear of)
(The old man lost all fear of the tengus.)
sokode ojiisan-wa kakureteita ana-kara detekimashita.
(so) (old man) (be hiding) (hole) (leave)
(So, he left the hole where he had been hiding.) (4)
This rule is necessary when the system translates
"sokode" into English: it must judge whether the word is used as a
demonstrative or as a conjunction in order to translate it as
"there" or "then."
Rule when an anaphor does not have its 
antecedent 
Candidate enumerating rule 5 
When a pronoun is a demonstrative pronoun, a demon- 
strative adverb, or a demonstrative adjective, 
{(Introduce an individual, 10)} 
This rule is used when there is no referent of a pro- 
noun in the sentences. This rule makes the system in- 
troduce a certain individual. 
3.2 Rule for Demonstrative Adjectives 
Demonstrative adjectives such as "kono (this)", "sono
(the)", "ano (that)", "kon'na (like this)", and "son'na
(like it)" are classified into two reference categories:
gentei-reference and daikou-reference.
Table 6: Points given to so-series demonstrative adjective
[Table body garbled in extraction: points for each similarity level (Sim.) up to Exact; values not recoverable.]
In a gentei-reference, although a demonstrative adjective
does not refer to an entity by itself, the phrase
"demonstrative adjective + noun phrase" refers to the
antecedent. An example is "kono ojiisan (this old man)"
in the following sentences:
ojiisan-wa tengutachi-no-maeni deteitte odori-hajimemashita
(old man) (before the tengus) (appear) (begin to dance)
(He appeared before the tengus, and began to dance.)
keredomo kono ojiisan-wa uta-mo odori-mo hetakuso-deshita
(but) (this old man) (sing) (dance) (poor)
(But the old man was a poor singer, and his dancing was
no better.) (5)
In this example, although the demonstrative "kono 
(this)" does not refer to "ojiisan (old man)" in the first 
sentence, the noun phrase "kono ojiisan (this old man)" 
refers to "ojiisan (old man)" in the first sentence. 
In a daikou-reference, the demonstrative adjective itself
refers to an entity. In this case, we can analyze "sono
(the)" in the same way as "sore-no (of it)". In the following
sentences, "sono" refers to "tengu" (tengus). It is an example
of daikou-reference.
mata karasu-no-youna kao-wo-shita tengu-mo imashita 
(also) (like crows) (with face) (tengu) (exist) 
(There were also some tengus with faces like those of crows.) 
sono kuchi-wa torino-kuchibashi-noyouni togatte-imashita 
(their mouths) (like the beaks of birds) (be pointed) 
(Their mouths were pointed like the beaks of birds.) 
(6) 
Rules for gentei-reference and daikou-reference are as 
follows: 
Candidate enumerating rule I 
When a pronoun is "demonstrative adjective + noun α",
{(the noun phrase containing a noun α, 45)
(the topic which is a subordinate of noun α and which
has weight W and distance D, W - D + 30)
(the focus which is a subordinate of noun α and which
has weight W and distance D, W - D + 30)}
The relationships between a super-ordinate word and
a subordinate word are detected by judging the last word
in the definition of the word α in the EDR Japanese word
dictionary (EDR 95a) to be the super-ordinate of the
word α.
Table 7: Examples of the form "the mouth of Noun X"
Examples of Noun X: hukuro (sack), ruporaitaa (documentary writer), iin (member),
akachan (baby), kare (he)
Because of this rule, when a pronoun is "demonstrative
adjective + noun phrase α" and there is the
same noun phrase α near it, it is judged to be a
"gentei-reference" and the noun phrase is selected as a
candidate of the referent. When there is a subordinate of
noun phrase α near it, it is also selected as a candidate of the referent.
These rules give higher points to a candidate referent
than other rules do. The following is an example of the
"demonstrative adjective + noun phrase α" referring to
the subordinate of noun phrase α.
ojiisan-wa toonoiteiku tsuru-no sugata-wo miokurimashita.
(old man) (recede) (crane) (figure) (watch) 
(The old man watched the receding figure of the crane.) 
"ano tori-wo tasukete yokatta" to iimashita. 
(that bird) (save) (glad) (say) 
("I'm glad I saved that bird," said the old man to himself.) (7) 
In this example, the underlined "ano tori (that bird)" 
refers to a subordinate "tsuru (crane)" in the previous 
sentence. 
Rules for daikou-reference of so-series
demonstrative adjective
Candidate judging rule 5 
When a pronoun is a so-series demonstrative adjective,
the system consults examples of the form "noun X no
noun Y" whose noun Y is modified by the pronoun,
and gives a candidate referent the points in Table 6
according to the similarity between the candidate referent
and noun X in "Bunrui Goi Hyou" (NLRI 64).
The Japanese Co-occurrence Dictionary (EDR 95c) is
used as a source of examples of "X no Y".
This rule is for checking the semantic constraint. (For a
daikou-reference, candidates of the referent are selected
by Candidate enumerating rule 1 in Section 3.1.)
We explain how to use the rule for "sono
(the)" in the sentences (6). First, the system gathers examples
of the form "Noun X no kuchi (mouth of Noun
X)". Table 7 shows some examples of "Noun X no kuchi
(mouth of Noun X)" in the Japanese Co-occurrence Dictionary
(EDR 95c). Next, the system checks the semantic
similarity between candidate referents and Noun X,
and judges the candidate referent having a higher similarity
to be a better candidate referent. In this example,
"tengu" is semantically similar to Noun X in that
they are both living things. Finally, the system selects
"tengu" as the proper referent.
Rules when non-so-series demonstrative has 
daikou-reference 
Candidate judging rule 6 
When a pronoun is a non-so-series demonstrative adjective,
the system consults examples of the form "Noun
X no (of) Noun Y (Y of X)" whose Noun Y is modified
by the pronoun, and gives candidate referents the
points in Table 8 according to the similarity between
the candidate referent and noun X in "Bunrui Goi
Hyou" (NLRI 64). Since a non-so-series demonstrative
adjective rarely is a daikou-reference (NLRI 81) (Yamamura
et al. 92), the number of points is smaller
than in the case of the so-series.
Table 8: Points given in the case of non-so-series
demonstrative adjective
[Table body garbled in extraction: points for each similarity level (Sim.) up to Exact; the legible point values are -30, -30, -5 and 0.]
Table 9: Results of investigating whether "kon'na
noun" (noun like this) refers to the previous or next
sentences
[Table body garbled in extraction. Columns: postpositional particle, previous sentence, next sentence. Rows: wa (topic), wa-nai, ni (indirect object), ni-mo, ni-wa, de (place), de-wa, no (possessive), sura, ga (subject), wo (object), mo (also), de-wa-nai. The per-particle counts are not recoverable; a total of 137 is legible for the previous sentence.]
Rule when a pronoun refers to a verb phrase 
Like a demonstrative pronoun, a demonstrative adjec- 
tive can refer to the meaning of the verb phrase in the 
previous sentence. This case is resolved by Candidate 
enumerating rule 2 in Section 3.1. 
Rule for "kon'na noun" (noun like this) 
"kon'na noun" can also refer to the next sentences in 
addition to a noun phrase and the previous sentences. 
ojiisan-wa odorinagara kon'na uta-wo utaimashita. 
(old man) (dance) (song like this) (sing) 
(As he danced, he sang the following song:)
"tengu tengu hachi tengu."
(tengu) (tengu) (eight tengu)
("'Tengu,' 'tengu,' eight 'tengus.'") (8)
In the above example, "kon'na uta (song like this)" refers 
to the next sentence "tengu, tengu, hachi tengu." 
But we cannot decide whether "kon'na noun" (noun 
like this) refers to the previous or next sentences only 
by the expression of "kon'na noun" (noun like this) it- 
self. To make the decision, we gathered 317 sentences 
containing "kon'na" (like this) from about 60,000 sen- 
tences in Japanese essays and editorials, and counted 
the total frequency of cases in which "kon'na" refers to 
the previous and next sentences. The results are shown 
in Table 9. This table indicates that "kon'na noun"
followed by particles other than "ga" and "wo,"
which are used when representing new information, very
often refers to the previous sentence. Therefore, in that case the
system judges that the desired antecedent is the previous
sentence. When "kon'na noun" is followed by the particle
"ga" or "wo," the proper referent is determined by
the expression in quotation marks.
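One possible reading of this decision can be sketched as follows. The sketch assumes that "ga" and "wo" are the particles that signal new information, and that, when one of them follows "kon'na noun" and a quoted expression comes next, the quoted expression is taken as the referent. The function name and the boolean flag are hypothetical.

```python
# Particles treated here as markers of new information (an assumption
# based on the discussion of Table 9).
NEW_INFORMATION_PARTICLES = {"ga", "wo"}


def konna_referent(particle, followed_by_quotation):
    """Choose where the referent of "kon'na noun" is looked for."""
    if particle in NEW_INFORMATION_PARTICLES and followed_by_quotation:
        return "next sentence"
    return "previous sentence"


# Example (8): "kon'na uta-wo" is followed by a quoted song.
print(konna_referent("wo", True))
print(konna_referent("wa", False))
```

How the system behaves when "ga" or "wo" is not followed by a quotation is not specified in the text; falling back to the previous sentence is a choice made for this sketch.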
3.3 Rule for Demonstrative Adverbs 
Rule when so-series demonstrative adverb 
refers to the previous sentences 
Candidate enumerating rule 9 
When an anaphor is a so-series demonstrative adverb 
such as "sou (so)," 
{(the previous sentences, 30)} 
The following is an example. 
"tengu tengu hachi tengu." 
(tengu) (tengu) (eight tengu)
("'Tengu,' 'tengu,' eight 'tengus.'")
sou utatta-nowa sokoni hachihiki-no tengu-ga itakara-desu.
(sing so) (there) (eight) (tengu) (exist)
(He sang so because he counted eight of them there.) (9)
"sou (so)" refers to the previous sentence "tengu tengu 
hachi tengu". 
Rule when so-series demonstrative adverb
cataphorically refers to the verb phrase
in the same sentence
Candidate enumerating rule 10
When an anaphor is "sou/soushite/sonoyouni" and is
in the subordinate clause which has a conjunctive particle
such as "ga", "daga", and "keredo" or an adjective
conjunction such as "youni",
{(the main clause, 45)}
4 Heuristic Rule for Personal Pronouns 
Candidate enumerating rule 1 
When an anaphor is a first personal pronoun, 
{(the first person (the speaker) in the context, 25)} 
Candidate enumerating rule 2 
When an anaphor is a second personal pronoun, 
{(the second person (the hearer) in the context, 25)} 
ojiisan-wa jimen-ni koshi-wo-oroshimashita.
(old man) (ground) (sit down)
(The old man sat down on the ground.)
yagate (ojiisan-wa) nemutte-shimaimashita.
(soon) (old man) (fall asleep)
(He soon fell asleep.)
Semantic Marker: HUM/ANI ga (agent) nemuru (sleep)
Example: kare (he)/inu (dog) ga (agent) nemuru (sleep)
Figure 3: How to check semantic constraint
A first or second personal pronoun is often presented
in quotations, and can be resolved by estimating the
first person (speaker) or the second person (hearer) in
advance. The estimation of the first person and the second
person is performed by regarding the ga-case (subjective)
and ni-case (objective) components of the verb
phrase representing the speaking action of the quotation
as the first and second persons, respectively. The detection
of the verb phrase representing the speaking action
is performed as follows. If the quotation is followed by a
speaking action verb phrase such as "to itta (was said),"
that verb phrase is regarded as the verb phrase representing
the speaking action. Otherwise, the last verb phrase
in the previous sentence is regarded as the verb phrase
representing the speaking action. For example, the second
personal pronoun "omaesan (you)" in the following
sentences refers to the second person "ojiisan (the old
man)" in this quotation.
"ashita, mata mairimasuyo." to,
(tomorrow) (again) (come) 
("I'll come again tomorrow,") 
ojiisan-wa yakusoku-shimashita. 
(old man) (promise) 
(promised the old man.) 
"mochiron omaesan-wo utagauwakedewanainodaga," 
(of course) (you) (don't mean to doubt) 
("Of course, we don't mean to doubt you,") 
tengu-ga ojiisan-ni iimashita. 
(tengu) (old man) (said) 
(said one of the "tengu" to the old man.) 
(10) 
The second person in the quotation is estimated to be
"ojiisan" because the ni-case component of the verb
phrase "iimashita (said)" representing the speaking action
of the quotation is "ojiisan".
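The estimation of the speaker and hearer described above can be sketched as follows. The dictionary representation of a verb phrase (with the keys `ga`, `ni` and `is_speech_verb`) and the function names are assumptions made for this illustration, not the data structures of the actual system.

```python
def choose_speech_frame(following_frame, previous_last_frame):
    """Pick the verb phrase that represents the speaking action.

    If the quotation is followed by a speaking-action verb phrase, use it;
    otherwise fall back to the last verb phrase of the previous sentence.
    """
    if following_frame is not None and following_frame.get("is_speech_verb"):
        return following_frame
    return previous_last_frame


def resolve_personal_pronoun(person, speech_frame):
    """Resolve a first (1) or second (2) personal pronoun inside a quotation."""
    if person == 1:
        return speech_frame.get("ga")  # speaker: ga-case of the speech verb
    if person == 2:
        return speech_frame.get("ni")  # hearer: ni-case of the speech verb
    return None  # third person is not resolved by this sketch


# Example (10): "tengu-ga ojiisan-ni iimashita" follows the quotation.
frame = {"verb": "iimashita", "is_speech_verb": True,
         "ga": "tengu", "ni": "ojiisan"}
speech = choose_speech_frame(frame, None)
print(resolve_personal_pronoun(2, speech))
```

In this example the second personal pronoun "omaesan (you)" is resolved to the ni-case component of "iimashita (said)".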
Candidate enumerating rule 3 
When an anaphor is a third personal pronoun, 
{(a first person, -10) (a second person, -10)}
5 Heuristic Rule for Zero Pronoun 
Rule proposing candidate referents of general 
zero pronoun 
Candidate enumerating rule 1 
When a zero pronoun is a ga-case component,
{(A topic which has weight W and distance D,
W - D * 2 + 1)
(A focus which has weight W and distance D,
W - D + 1)
(A subject of a clause coordinately connected to the
clause containing the anaphor, 25)
(A subject of a clause subordinately connected to the
clause containing the anaphor, 23)
(A subject of a main clause whose embedded clause
contains the anaphor, 22)}
Candidate enumerating rule 2 
When a zero pronoun is not a ga-case component,
{(A topic which has weight W and distance D,
W - D * 2 - 3)
(A focus which has weight W and distance D,
W - D * 2 + 1)}
Table 10: Points given by a verb-noun relationship
[Table body garbled in extraction: points for each similarity level (Sim.); values not recoverable.]
Rule using semantic relation to verb phrase 
Candidate judging rule 1 
When a candidate referent of a case component (a zero 
pronoun) does not satisfy the semantic marker of the 
case component in the case frame, it is given -5. 
Candidate judging rule 2 
A candidate referent of a case component (a zero pro- 
noun) is given the points in Table 10 by using the high- 
est semantic similarity between the candidate referent 
and examples of the case component in the case frame. 
These two rules are for checking the semantic con- 
straint between the candidate referent and the verb 
phrase which has the candidate referent in its case com- 
ponent. Candidate judging rule 1 checks semantic con- 
straints by using semantic markers. Candidate judging 
rule 2 checks semantic constraints by using examples. 
Figure 3 explains how to check semantic constraints in 
the example sentences. 
In the method using semantic markers, a candidate 
referent is the proper referent if one of the semantic 
markers belonging to the candidate referent is equal or 
subordinate to the semantic marker of the case compo- 
nent. For example, with respect to the zero pronoun in 
Figure 3, since the ga-case component in the verb "nemuru
(sleep)" has the semantic markers HUM (human
being) and ANI (animal) and since "ojiisan (old man)" 
has the semantic marker HUM, the proper referent is 
judged to be "ojiisan." 
In the example-based method, the validity of a can- 
didate referent is decided by the semantic similarity be- 
tween the candidate referent and the examples of the 
case component in the verb case frame. The higher the 
semantic similarity is, the greater the validity is. For 
example, with respect to a zero pronoun in Figure 3, 
since the examples of the ga-case are "kare (he)" and
"inu (dog)," and since "ojiisan (old man)" is semantically
similar to "kare (he)", the proper referent is "ojiisan
(old man)."
These rules, which use semantic relationships to verbs, 
are also used in the estimation of the referent of demon- 
stratives and personal pronouns. 
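The two kinds of semantic check can be contrasted in a small sketch based on the case frame of "nemuru (sleep)" in Figure 3. It is deliberately simplified: the marker check treats the marker set as flat (no "subordinate to" relation), the similarity values are hypothetical rather than computed from BGH, and the function names are invented here.

```python
def marker_penalty(candidate_markers, frame_markers):
    """Candidate judging rule 1: -5 if no marker of the candidate is accepted."""
    return 0 if set(candidate_markers) & set(frame_markers) else -5


def highest_example_similarity(candidate, examples, similarity):
    """Highest similarity between the candidate and the case-frame examples."""
    return max(similarity(candidate, example) for example in examples)


# Hypothetical similarity levels between word pairs (not taken from BGH).
TOY_SIMILARITY = {("ojiisan", "kare"): 5, ("ojiisan", "inu"): 2}


def toy_similarity(word_a, word_b):
    return TOY_SIMILARITY.get((word_a, word_b), 0)


# ga-case of "nemuru (sleep)": markers HUM/ANI, examples "kare" and "inu".
print(marker_penalty({"HUM"}, {"HUM", "ANI"}))  # "ojiisan" satisfies the frame
print(marker_penalty({"PRO"}, {"HUM", "ANI"}))  # a product noun does not
print(highest_example_similarity("ojiisan", ["kare", "inu"], toy_similarity))
```

In the example-based check, the resulting similarity level would then be converted into points through Table 10; that conversion is omitted here.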
Rule using the feature that it is difficult for 
a noun phrase to be filled in multiple case 
components of the same verb 
Candidate enumerating rule 4 
When there is "Noun X" in another case component of 
the verb which has the analyzed case component (the 
analyzed zero pronoun), {(Noun X, -20)} 
Rule using empathy 
This rule is based on empathy theory (Kameyama 86). 
When an anaphor is a ga-case zero pronoun whose verb
is followed by an auxiliary verb such as "kureru" or "kudasaru,"
the ni-case zero pronoun is analyzed first, and
it is filled with the noun phrase that has high empathy
such as the topic, and the ga-case zero pronoun is filled
with another noun phrase.
doru souba-wa kitai-kara 130-yen-dai-ni joushoushita.
(dollar) (the expectations) (130 yen) (surge)
(The dollar has since rebounded to about 130 yen because of the expectations.)
kono doru-daka-wa oushuu-tono kankei-wo gikushaku-saseteiru.
(the dollar's surge) (Europe) (relation) (strain)
(The dollar's surge is straining relations with Europe.)
Score of each candidate (points)
Rule                            the previous   new          130 yen     kitai            dorusouba
                                sentence       individual   (130 yen)   (expectations)   (dollar)
Candidate enumerating rule 2    15
Candidate enumerating rule 5                   10
Candidate enumerating rule 1                                17          15               15
Candidate judging rule 6                                    -30         -30              -30
Total score                     15             10           -13         -15              -15
Figure 4: Example of resolving demonstrative "kono (this)"
Table 11: Results
Text        demonstrative   personal pronoun   zero pronoun     total score
Training    87% (41/47)     100% (9/9)         86% (177/205)    87% (227/261)
Test        86% (42/49)     82% (9/11)         76% (159/208)    78% (210/268)
The points given in each rule are manually adjusted by using the training sentences.
Training sentences {example sentences (43 sentences), a folk tale "kobutori jiisan" (Nakao 85) (93 sentences), an essay in
"tenseijingo" (26 sentences), an editorial (26 sentences), an article in "Scientific American (in Japanese)" (16 sentences)}
Test sentences {a folk tale "tsuru no ongaeshi" (Nakao 85) (91 sentences), two essays in "tenseijingo" (50 sentences), an
editorial (30 sentences), articles in "Scientific American (in Japanese)" (13 sentences)}
6 Experiment and Discussion 
6.1 Experiment 
Before pronoun resolution, sentences were transformed 
into a case structure by a case structure analyzer (Kuro- 
hashi & Nagao 94). The errors made by the structure 
analyzer were corrected by hand. We used IPAL dictio- 
nary (IPAL 87) as a verb case frame dictionary. We put 
together the case frames of the verb phrases which were 
not contained in this dictionary by consulting a large 
amount of linguistic data. 
An example of resolving the demonstrative "kono 
(this)" is shown in Figure 4, which shows that the ref- 
erent of the noun phrase "kono dorudaka (this dollar's 
surge)" was properly judged to be the previous sentence. 
By Candidate enumerating rule 2 in Section 3, the sys- 
tem took a candidate "the previous sentence" and gave 
it 15 points. By Candidate enumerating rule 5 in Sec- 
tion 3, the system took a candidate "new individual" 
and gave it 10 points. By Candidate enumerating rule 1
in Section 3, the system took three candidates, "130
yen (130 yen)", "kitai (expectations)", and "dorusouba
(dollar)", and gave them 17, 15, and 15 points, respectively.
The system applied Candidate judging rule 6 to
them. This uses examples of "X no Y". In this case,
it used examples of "X no dorudaka (the dollar's surge
of X)". The only example noun phrase X of this form
"X no dorudaka" in the EDR co-occurrence dictionary was
"saikin (recently)". All three candidates, "130 yen (130
yen)", "kitai (expectations)", and "dorusouba (dollar)",
were low in similarity to "saikin (recently)" in "Bunrui
Goi Hyou", and were given -30 points by Table 8. The other
two candidates, "the previous sentence" and "new individual",
are not noun phrases, and so were not given
points by Candidate judging rule 6. As a result, "the previous
sentence" had the highest score and was judged to
be the proper referent. 
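The arithmetic of this worked example can be reproduced in a few lines. The point values are the ones stated in the text; the variable names and the way the -30 penalty is applied only to noun-phrase candidates are a sketch of the description above, not the system's code.

```python
# Points proposed by the Candidate enumerating rules (values from the text).
enumerated = {
    "the previous sentence": 15,  # Candidate enumerating rule 2
    "new individual": 10,         # Candidate enumerating rule 5
    "130 yen": 17,                # Candidate enumerating rule 1
    "kitai": 15,                  # Candidate enumerating rule 1
    "dorusouba": 15,              # Candidate enumerating rule 1
}
# Candidate judging rule 6 only applies to noun-phrase candidates; all three
# are dissimilar to the single example "saikin" and so receive -30.
noun_phrases = {"130 yen", "kitai", "dorusouba"}
totals = {
    candidate: points + (-30 if candidate in noun_phrases else 0)
    for candidate, points in enumerated.items()
}
best = max(totals, key=totals.get)

for candidate, total in totals.items():
    print(f"{candidate}: {total}")
print("selected:", best)
```

The totals are 15, 10, -13, -15 and -15, so "the previous sentence" is selected, as stated above.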
We show the results of our resolution of demonstra- 
tives, personal pronouns, and zero pronouns in Table 11. 
The detailed results for demonstratives are shown in Table 12.
The precision rate for zero pronouns is for the case
in which the system knows in advance whether the zero
pronoun has a referent.
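The precision rates reported in Table 11 are ratios of correctly resolved pronouns to all pronouns of a given kind. As a quick consistency check, the overall figures can be recomputed from the per-category counts; the dictionary layout and function names below are assumptions made for this sketch.

```python
# (correct, total) counts per pronoun type, as reported in Table 11.
COUNTS = {
    "training": {
        "demonstrative": (41, 47),
        "personal pronoun": (9, 9),
        "zero pronoun": (177, 205),
    },
    "test": {
        "demonstrative": (42, 49),
        "personal pronoun": (9, 11),
        "zero pronoun": (159, 208),
    },
}


def precision_percent(correct, total):
    """Precision as a whole-number percentage."""
    return round(100 * correct / total)


def overall(counts):
    """Sum the (correct, total) pairs over all pronoun types."""
    correct = sum(c for c, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return correct, total


for name, counts in COUNTS.items():
    correct, total = overall(counts)
    print(f"{name}: {precision_percent(correct, total)}% ({correct}/{total})")
```

The recomputed totals are 227/261 (87%) for the training sentences and 210/268 (78%) for the test sentences.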
6.2 Discussion 
With respect to demonstratives, the precision rate was 
over 80% even in the test sentences. This indicates that 
the rules used in this system are effective. However,
since Japanese demonstratives are classified into many
kinds, the precision may be increased by making more
detailed rules. In this work we used the feature that
"kono (this)" rarely functions as a daikou-reference.
Four cases were analyzed correctly because of this rule.
With respect to personal pronouns, since only first- and
second-person pronouns appeared in the texts used in the
experiment, almost all of the personal pronouns were
resolved correctly by estimating the first and second
persons in the quotation. The main reasons for the errors
in personal pronoun resolution were that the ni-case zero
pronoun was resolved incorrectly and that the second
person was estimated incorrectly.
Table 12: Detailed results for demonstratives

Text      Demonstrative  Demonstrative  Demonstrative  Total score
          pronoun        adjective      adverb
Training  83% (15/18)    86% (19/22)    100% (7/7)     87% (41/47)
Test      82% (14/17)    88% (23/26)    83% (5/6)      86% (42/49)

Table 13: Results of comparison between semantic marker and example-base

                            Method 1        Method 2        Method 3        Method 4        Method 5
Demonstrative     Training  87% (41/47)     83% (39/47)     87% (41/47)     83% (39/47)     79% (37/47)
                  Test      86% (42/49)     88% (43/49)     88% (43/49)     84% (41/49)     86% (42/49)
Personal pronoun  Training  100% (9/9)      100% (9/9)      100% (9/9)      100% (9/9)      89% (8/9)
                  Test      82% (9/11)      64% (7/11)      82% (9/11)      55% (6/11)      64% (7/11)
Zero pronoun      Training  86% (177/205)   83% (171/205)   86% (176/205)   82% (169/205)   66% (135/205)
                  Test      76% (159/208)   76% (158/208)   79% (164/208)   75% (155/208)   63% (131/208)

Method 1 : Using both semantic marker and example
Method 2 : Using semantic marker
Method 3 : Using example (using modified codes of Bunrui Goi Hyou)
Method 4 : Using example (using original codes of Bunrui Goi Hyou)
Method 5 : Using neither semantic marker nor example
The errors in zero pronoun resolution have several
causes: errors in the Japanese thesaurus "Bunrui Goi
Hyou", in the Noun Semantic Marker Dictionary, and in the
Case Frame Dictionary.
6.3 Comparison Experiment 
As mentioned before, we use both the example rule and the
semantic marker rule as judging rules. To check which rule
is more effective, we compared the example method with the
semantic marker method. The results are shown in Table 13.
The upper and lower rows of this table show the precision
rates for training and test sentences, respectively. The
precision of the method using examples (Method 3) was
equal or superior to that of the method using semantic
markers (Method 2) in every category. This indicates that
examples are as useful as semantic markers for semantic
constraints. Since some codes in BGH are incorrect, we
modified them. The precision using the modified codes
(Method 3) was equal to or higher than that using the
original codes (Method 4), which indicates that the code
modification is valid.
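As a rough check on this comparison, the per-category counts
for Methods 1 to 4 in Table 13 can be pooled into a single
precision figure per method. This is a sketch only: pooling
the three pronoun categories by summing correct and total
counts is an assumption about how an overall rate is formed,
and the names TABLE_13 and pooled are hypothetical.

```python
# (correct, total) per category, in the order demonstrative,
# personal pronoun, zero pronoun, transcribed from Table 13.
TABLE_13 = {
    "Method 1": {"training": [(41, 47), (9, 9), (177, 205)],
                 "test": [(42, 49), (9, 11), (159, 208)]},
    "Method 2": {"training": [(39, 47), (9, 9), (171, 205)],
                 "test": [(43, 49), (7, 11), (158, 208)]},
    "Method 3": {"training": [(41, 47), (9, 9), (176, 205)],
                 "test": [(43, 49), (9, 11), (164, 208)]},
    "Method 4": {"training": [(39, 47), (9, 9), (169, 205)],
                 "test": [(41, 49), (6, 11), (155, 208)]},
}


def pooled(pairs):
    """Sum correct and total counts over all categories."""
    correct = sum(c for c, _ in pairs)
    total = sum(t for _, t in pairs)
    return correct, total


for method, splits in TABLE_13.items():
    for split, pairs in splits.items():
        correct, total = pooled(pairs)
        rate = 100 * correct / total
        print(f"{method} {split:8s} {correct}/{total} = {rate:.1f}%")
```

Under this pooling, Method 1 gives 227/261 on the training
sentences and 210/268 on the test sentences, which round to
87% and 78%.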
7 Summary 
In this paper, we presented a method of estimating
referents of demonstrative pronouns, personal pronouns,
and zero pronouns in Japanese sentences using examples,
surface expressions, topics and foci. Unlike conventional
work, which uses semantic markers for semantic
constraints, we used examples for semantic constraints and
showed in our experiments that examples are as useful as
semantic markers. We also proposed many new methods for
estimating referents of pronouns. For example, we used the
form "X of Y" for estimating referents of demonstrative
adjectives. In addition to our new methods, we used many
conventional methods. As a result, experiments using these
methods obtained a precision rate of 87% in estimating
referents of demonstrative pronouns, personal pronouns,
and zero pronouns for training sentences, and a precision
rate of 78% for test sentences.

References 
Electronic Dictionary Research Institute, ltd.: Electronic Dictio- 
nary, Japanese Word Dictionary, Version 1.5, (in Japanese), 
1995. 
Electronic Dictionary Research Institute, ltd.: Electronic Dic- 
tionary, Japanese Cooccurrence Dictionary, Version 1.5, (in 
Japanese), 1995. 
Hayashi, S.: Daimeishi-ga Sasumono Sono Sashi-kata, (in 
Japanese), Unyou I, Asakura Japanese New Lecture 5, 
Asakura Publisher, pp: 1-45, 1983. 
Information-technology Promotion Agency, Japan: IPA Lexicon 
of the Japanese Language for computers IPAL (Basic Verbs), 
(in Japanese), 1987. 
Kameyama, M.: A Property-sharing Constraint in Centering,
Proc. of 24th Annual Meeting of ACL, pp.200-206, 1986.
Kinsui, B. and Takubo, Y.: Demonstrative, (in Japanese), Hit- 
suji Shobou, 1992. 
Kurohashi, S. and Nagao, M.: A Method of Case Structure
Analysis for Japanese Sentences based on Examples in Case
Frame Dictionary, IEICE Transactions on Information and
Systems, E77-D(2), pp.227-239, 1994.
Matsumoto, Y., Kurohashi, S., Myoki, Y., and Nagao, M.:
Japanese Morphological Analysis System JUMAN Manual
version 1.0, (in Japanese), Nagao Lab., Kyoto University, 1992.
Nakaiwa, H. and Ikehara, S.: Intrasentential Resolution of
Japanese Zero Pronouns using Pragmatic and Semantic
Constraints Viewed from Ellipsis and Inter-Event Relations,
(in Japanese), IEICE-WGNLC 95-5, pp.33-40, 1995.
Nakao, K.: The Old Man with a Wen, (in Japanese), Eiyaku
Nihon Mukashibanashi Series, Vol. 7, Nihon Eigo Kyouiku
Kyoukai, 1985.
The National Language Research Institute: Bunrui Goi Hyou, 
(in Japanese), Shuuei Publishing, 1964. 
System of "KO/SO/A": (in Japanese), The National Language 
Research Institute, 1981. 
Takada, S. and Doi, N.: Centering in Japanese: A Step Towards 
Better Interpretation of Pronouns and Zero-Pronouns, Proc. 
of 15th COLING, Vol.2, pp.1151-1156, 1994. 
Takahashi, T. et al.: Demonstrative, (in Japanese), Nihon- 
gogaku, vol. 9, Meiji Shoin, 1990. 
Walker, M., Iida, M., and Cote, S.: Japanese Discourse and the 
Process of Centering, Journal of Computational Linguistics, 
Vol.20, No.2, pp.193-232, 1994. 
Watanabe, Y., Kurohashi, S. and Nagao, M.: Construction of
semantic dictionary by IPAL dictionary and a thesaurus, (in
Japanese), Proc. of 45th Convention of IPSJ, pp.213-214,
1992.
Yamamura, T., Ohnishi, N., and Sugie, N.: A Classification 
Scheme of Anaphora in Japanese Demonstrative Pronoun, (in 
Japanese), IEICE Transactions on Information and Sys- 
tems, J75-D-II(2), pp.371-378, 1992. 
