Automatic Extraction of Rules for Anaphora Resolution of 
Japanese Zero Pronouns from Aligned Sentence Pairs 
Hiromi Nakaiwa 
NTT Communication Science Laboratories 
1-1 Hikarinooka, Yokosuka-shi, 
Kanagawa-ken, 239 JAPAN 
nakaiwa@cslab.kecl.ntt.co.jp
Abstract 
This paper proposes a method to extract 
rules for anaphora resolution of Japanese 
zero pronouns from aligned sentence pairs. 
The method focuses on the characteristics 
of Japanese and English in which both the 
language families and the distribution of 
zero pronouns are very different. In this 
method, zero pronouns in the Japanese 
sentence and the English translation equiv- 
alents of their antecedents are extracted 
from Japanese and English aligned sen- 
tence pairs. Then resolution rules for 
Japanese zero pronouns are automatically 
extracted using the pairs of Japanese zero 
pronouns and translation equivalents of 
their antecedents in English and equivalent 
word/phrase pairs which were extracted 
from the aligned sentence pairs, based on 
the syntactic and semantic structure of the 
Japanese sentence. This method was im- 
plemented in the Japanese-to-English ma- 
chine translation system, ALT-J/E. The 
evaluation showed that, for 371 zero pro- 
nouns with deictic reference in a sen- 
tence set for the evaluation of Japanese- 
to-English machine translation systems, 
the rules which were created automatically 
from Japanese and English aligned sen- 
tence pairs correctly resolved 99.2% of zero 
pronouns in a window test and 87.6% of 
zero pronouns in a blind test. 
1 Motivation 
In all natural languages, elements that can be easily
deduced by the reader are frequently omitted
from expressions in texts (Kuno, 1978). This phe- 
nomenon causes considerable problems in natural 
language processing systems. For example, in a ma- 
chine translation system, the system needs to rec- 
ognize that elements which are not present in the 
source language may become mandatory elements
in the target language. In particular, the subject and 
object are often omitted in Japanese whereas they 
are often mandatory in English. Thus, in Japanese- 
to-English machine translation systems, it is neces- 
sary to identify case elements omitted from the orig- 
inal Japanese ("zero pronouns") for their translation 
into English expressions. 
Several methods have been proposed with regard 
to this problem (Kameyama, 1986; Walker et al., 
1990; Yoshimoto, 1988; Dousaka, 1994). When con- 
sidering the application of these methods to a prac- 
tical machine translation system for which the trans- 
lation target area cannot be limited, it is not possible 
to apply them directly, both because their precision 
of resolution is low as they only use limited infor- 
mation, and because the volume of knowledge that 
must be prepared beforehand is so large. 
To overcome these kinds of problems, several 
methods to resolve zero pronouns which consider 
applications for a practical machine translation 
system with an unlimited translation target area, 
have been proposed (Nakaiwa and Ikehara, 1992; 
Nakaiwa and Ikehara, 1995; Nakaiwa and Ikehara, 
1996). These methods use categorized semantic 
and pragmatic constraints such as verbal seman- 
tic attributes (Nakaiwa et al., 1994) and types of 
modal expressions and conjunctions as a condition 
for anaphora resolution of zero pronouns and deter- 
mine antecedents of zero pronouns depending on the 
typical category of three types of semantic and prag- 
matic constraints. 
However, with these methods it is necessary to 
make resolution rules for zero pronouns by hand. 
Making robust rules with wide coverage therefore takes
a lot of time and labor, and the analysts who make these
resolution rules must be familiar with the NLP system
itself. Furthermore, the types of zero pronouns
change depending on the types of documents which 
must be analyzed. So, resolution rules must be made 
depending on the target domain of the documents. 
But, it is very difficult to make rules for every do- 
main because of the time consuming labor and the 
need for expertise. Because of these problems, a 
method to make resolution rules for zero pronouns 
effectively and efficiently is greatly needed. 
In order to acquire resolution rules for an NLP system
effectively and efficiently, various methods have
been proposed. One typical method for this purpose 
is to use a corpus for extracting resolution rules by 
analyzing each sentence in the corpus. With regard 
to the automatic extraction of resolution rules for 
zero pronouns, several methods have been proposed 
(Murata and Nagao, 1997; Nasukawa, 1996). But 
these methods only use monolingual corpora and 
they find it difficult to extract resolution rules for 
zero pronouns whose referents are normally unex- 
pressed in Japanese. Furthermore, rules can only be 
made when similar expressions to those containing 
the zero pronouns are found in the corpus. 
In order to take into account the kinds of prob- 
lems which are caused by monolingual corpora, it 
seems that a bilingual corpus consisting of pairs of 
a sentence in one language and a translation of the 
sentence is better than a monolingual corpus for the 
purpose of acquiring resolution rules for zero pro- 
nouns. This is particularly so with a bilingual cor- 
pus of Japanese and English whose language families 
are so different and in which the distribution of zero 
pronouns is also very different. This combination is 
more useful than the bilingual corpora of language 
pairs whose language families are similar. 
Several methods have been proposed with regard 
to acquiring various kinds of rules such as trans- 
lation rules, grammar rules, dictionary entries and 
so on from bilingual corpora (Dagan et al., 1991; 
Dagan and Church, 1994; Fung and Church, 1994; 
Tanaka, 1994; Yamada et al., 1995). From the 
point of view of the extraction of resolution rules 
for zero pronouns, a technique to extract zero pro- 
nouns in a sentence in one language and translation 
equivalents of their antecedents in a translation from 
aligned sentence pairs is needed. Such a technique 
has recently been proposed, and a method to ex- 
tract Japanese zero pronouns in Japanese sentences 
and translation equivalents of their antecedents in 
English sentences from aligned sentence pairs has 
been developed (Nakaiwa and Yamada, 1997). Another
technique that is needed is to make rules
automatically to resolve zero pronouns using pairs 
of equivalent words/phrases with zero pronouns and 
their antecedents from aligned sentence pairs. Sev- 
eral methods to extract rules and dictionary entries 
for machine translation have been proposed such as 
(Yamada et al., 1995). But there is currently no pro- 
posed method for extracting resolution rules for zero 
pronouns automatically using bilingual corpora. 
In this paper, I propose a widely applicable 
method to extract resolution rules for Japanese zero 
pronouns from Japanese and English aligned sen- 
tence pairs automatically using pairs of equivalent 
words/phrases and pairs of zero pronouns and their 
antecedents. 
2 Appearance of Zero Pronouns and 
Their Antecedents within 
Japanese and English Aligned 
Sentence Pairs 
In order to understand the distribution of zero pro- 
nouns with antecedents within Japanese and English 
aligned sentence pairs, in this section, I examine 
which zero pronouns in Japanese must be explicitly 
translated into English and where their translation 
equivalents in English appear, using a test set de- 
signed to evaluate the performance of Japanese-to- 
English machine translation systems (Ikehara et al., 
1994). The test set (3718 sentences) has many ex- 
amples of zero pronouns making intrasentential and 
deictic references. The sentence set was created to 
test the coverage of expressions that can be trans- 
lated by Japanese to English MT systems based on 
the varieties of Japanese expressions and the differ- 
ences between Japanese and English. The sentence 
set has approximately 500 kinds of test items. Each 
sentence has a manual translation, and almost all of 
the sentences can be translated without contextual 
information (3704 sentences out of 3718 sentences). 
An MT system can be evaluated by comparing its
output to the equivalent manual translation. Each 
sentence is expressed in natural Japanese and the 
sentence set covers many different expressions. 
This is an example of a zero pronoun in Japanese
whose referent is expressed in the English translation.
(1) (φ-ga)   hon-wo    yomi-tai
    (φ-SUBJ) book-OBJ  read-WANT-TO
    I want to read a book.
In this expression, the Japanese sentence contains 
the modal expression tai which indicates HOPE. This 
modal expression causes the default referent of the 
subject zero pronoun to be "writer" or "speaker" 
which is translated as "I" in English.
The results of the examination of zero pronouns 
and their referential elements in the functional test 
sentence set are shown in Table 1. There were a 
total of 525 zero pronouns in 463 sentences. The 
location of the referential elements can be divided 
into 2 kinds: those expressed in the same sentence, 
and those not expressed in the same sentence. The 
latter were further classified into 6 kinds. 
• The zero pronoun is not translated because the 
passive voice is used. 
• The referent is the writer or speaker, "I", or a
group, "we".
• The referent is the reader or hearer, "you". 
• The referent is human but it is not known who 
the human is. 
• The zero pronoun should be translated as "it". 
• The referent is another specific element. 
According to this study of the functional test set, 
in 371 out of 525 instances (71%) the referent was 
not expressed in the sentence. Of these, the zero 
pronouns could be left unexpressed by converting 
the translation to the passive voice in 156 instances 
(30%). The other zero pronouns, 215 instances 
(41%), referred to referents that did not appear in 
the Japanese sentence but appear in the English 
translation. This result shows that aligned sentence 
pairs will be effective for extracting zero pronouns 
and their antecedents automatically by determining 
zero pronouns in Japanese and translation equiva- 
lents of their antecedents in English. 
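The proportions quoted above follow directly from the counts in Table 1. The short Python check below is only an illustration of that arithmetic; the variable and function names are my own and are not part of the original study.

```python
# Counts as stated in this section and in Table 1.
total_zero_pronouns = 525
intrasentential = 154      # referent expressed in the same sentence
not_in_sentence = 371      # referent not expressed in the Japanese sentence
passive = 156              # left unexpressed in English via the passive voice
expressed_in_english = not_in_sentence - passive   # 215 instances


def share(count, total=total_zero_pronouns):
    """Return count as a percentage of total, rounded to the nearest integer."""
    return round(100 * count / total)


# The two location classes partition the 525 zero pronouns.
assert intrasentential + not_in_sentence == total_zero_pronouns
print(share(not_in_sentence), share(passive), share(expressed_in_english))
```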
A further examination of the English equivalents of
Japanese zero pronouns in the sentence set in Table 1
shows that the style of the English equivalents differs
depending on the type of referential element. These
characteristics can be summarized as follows:
1. Deictic referents in English (215 instances) 
These elements are often translated as personal
pronouns such as "I" or "you" or indefinite
"one".
2. Anaphoric antecedents in English (154 in- 
stances; intrasentential) 
These elements are often translated as per- 
sonal pronouns, demonstratives such as "that", 
definite noun phrases such as a noun phrase 
with definite article (e.g. "the company") or 
anaphoric "one". 
English expressions of these two types can be 
preferred candidates for translation equivalents of 
Japanese zero pronouns. 
Furthermore, according to an analysis of these 
aligned sentence pairs (Nakaiwa and Ikehara, 1995; 
Nakaiwa and Ikehara, 1996), zero pronouns in the 
corpus can be successfully resolved using three kinds 
of semantic and syntactic constraints: verbal seman- 
tic attributes, the types of modal expressions and 
conjunctions. So, for making suitable rules for re- 
solving Japanese zero pronouns in aligned sentence 
pairs, the use of these semantic and pragmatic cate- 
gories which are extracted from the syntactic and se- 
mantic structure of the Japanese sentence in aligned 
sentence pairs, will be effective. 
3 A Method for Extraction of 
Resolution Rules for Japanese 
Zero Pronouns 
This section describes a method for automatically 
extracting resolution rules for Japanese zero pro- 
nouns from Japanese and English aligned sentence 
pairs. Figure 1 shows an overview of the system. As 
shown in this figure, the Japanese and English sen- 
tences within the aligned sentence pairs are analyzed 
separately by Japanese and English syntactic and se- 
mantic parsers. Next, the system extracts the pairs 
of Japanese word/phrase and their English equiv- 
alent word/phrase, by comparing these two struc- 
tures, based on the Japanese syntactic and seman- 
tic structures and the English syntactic and seman- 
tic structures which are created by the Japanese 
and English parsers. Then, based on the character- 
istics of the translation equivalents of antecedents 
of Japanese zero pronouns in English, which were
discussed in Section 2, Japanese zero pronouns in 
the Japanese sentence and the translation equiva- 
lents of their antecedents in the English sentence 
are extracted. By using these results, based on 
the Japanese syntactic and semantic structure, the 
resolution rules for Japanese zero pronouns within 
Japanese sentences are created. In the next step, 
the resolution rules are used for the semantic and 
pragmatic analysis of the Japanese sentence by the 
Japanese syntactic and semantic parser within the 
whole rule extraction system. The same Japanese 
and English aligned sentence pairs are input into
the system and resolution rules of Japanese zero 
pronouns are again extracted. These processes are 
repeated until the system cannot extract any more 
rules for resolution of Japanese zero pronouns from 
the aligned sentence pairs. 
[Figure 1: flowchart. Legible labels: "Aligned Sentence Pairs"; "Extraction of pairs of words in Japanese and English"; "English Analyzer".]
Figure 1: Process for Automatic Extraction of Res- 
olution Rules for Japanese Zero Pronouns 
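The repeat-until-no-new-rules control flow described above can be sketched as a fixed-point loop. This is a minimal Python sketch under stated assumptions: `extract_until_fixed_point`, the `Rule` representation and `toy_extractor` are hypothetical stand-ins, and the real analysis and extraction steps of the system are not modelled.

```python
from typing import Callable, FrozenSet, Iterable, Set, Tuple

Rule = Tuple[str, str]   # hypothetical representation: (condition, antecedent)
Pair = Tuple[str, str]   # (Japanese sentence, English sentence)


def extract_until_fixed_point(
    pairs: Iterable[Pair],
    extract_rules: Callable[[Pair, FrozenSet[Rule]], Set[Rule]],
) -> Set[Rule]:
    """Re-run extraction over the same pairs until no new rule is found."""
    pairs = list(pairs)
    rules: Set[Rule] = set()
    while True:
        known = frozenset(rules)          # rules available to this pass
        new_rules: Set[Rule] = set()
        for pair in pairs:
            new_rules |= extract_rules(pair, known) - rules
        if not new_rules:                 # nothing new: stop iterating
            return rules
        rules |= new_rules


def toy_extractor(pair: Pair, known: FrozenSet[Rule]) -> Set[Rule]:
    """Stand-in extractor: the second rule is only found once the first is known."""
    found: Set[Rule] = set()
    if pair == ("J1", "E1"):
        found.add(("cond-A", "I"))
    if pair == ("J2", "E2") and ("cond-A", "I") in known:
        found.add(("cond-B", "you"))
    return found


rules = extract_until_fixed_point([("J2", "E2"), ("J1", "E1")], toy_extractor)
print(sorted(rules))
```

In the toy run, the rule for the second pair only becomes extractable on the second pass, which is the situation the repeated processing is meant to exploit.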
This method has been implemented in the
Japanese-to-English machine translation system
ALT-J/E (Ikehara et al., 1991). The system, which
is described in Figure 1, can extract English transla- 
tion equivalents of Japanese zero pronouns from the 
aligned sentence pairs. So, these results can be used 
for the extraction of translation rules for Japanese
zero pronouns to English in a Japanese to English
machine translation system.

Table 1: Distribution of zero pronouns and their referential elements

Location    Location of referential elements
of zero     Intrasentential             Deictic
pronoun     ha   ga    o   ni  misc     passive  I/we  you  human   it  misc    Total
ha           1    0    0    0     0           4     0    2      0    0     0        7
ga         119   12    0    2     8         150    85   24     30   51     0      481
o            4    0    5    1     0           0     0    0      1    7     0       18
ni           1    0    0    0     0           2     3    8      0    0     1       15
no           0    0    1    0     0           0     1    0      0    2     0        4
Total      154                              371                                   525
In the next two subsections, I describe the de- 
tails of the extraction of Japanese zero pronouns 
in Japanese sentences and translation equivalents of 
their antecedents in English sentences and the con- 
struction of resolution rules for Japanese zero pro- 
nouns. 
3.1 Extraction of Japanese Zero Pronouns 
and Their Antecedents 
The method to extract Japanese zero pronouns and 
their English equivalents consists of the following 
steps¹.
1) Analysis of Japanese and English sentences 
In this step, I use morphological, syntactic and 
semantic analyzers of Japanese in ALT-J/E for 
the analysis of Japanese sentences and Brill's 
English tagger (Brill, 1992) for the analysis of 
English sentences².
2) Extraction of the pairs of Japanese word/phrase 
and their English equivalent word/phrase. 
¹ For the details of the extraction of Japanese zero
pronouns in Japanese sentences and translation equivalents
of their antecedents in English equivalent sentences,
refer to Nakaiwa and Yamada (1997).
² As shown in Figure 1, if the English analyzer can
supplement ellipses in English ("I sing a song and ¢ play 
a piano."), then even if translation equivalents of the 
antecedents of zero pronouns in Japanese are omitted in 
English sentences, the overall system can determine their 
antecedents from the completed elements. Furthermore, 
when the English analyzer contains an anaphora res- 
olution process such as Lappin and Leass (1994), even 
if the antecedents of zero pronouns in Japanese are 
anaphoric expressions such as pronouns and definite 
noun phrases, the anaphora resolution process deter- 
mines the antecedents of anaphoric expressions in En- 
glish and the overall system can determine intersenten- 
tial and intrasentential resolution rules of Japanese zero 
pronouns by using extracted pairs of the antecedents 
of anaphoric expressions in English and their Japanese 
equivalents. But, for now, I am only using an English 
tagger as the English analyzer for the primary examina- 
tion because the extraction of pairs, Step 2, only needs 
the parts of speech of the English words. 
In this step, I use the following information³:
• bilingual dictionary for Japanese to English 
MT system, ALT-J/E 
This dictionary is used for the determina- 
tion of pairs of equivalent word phrases of 
Japanese and English. 
• English dictionary for English generation 
in ALT-J/E 
This dictionary is used when the suffix differs
(for example, the derivative "-ing") between an
English word in the bilingual dictionary entry and
the English word in the English sentence within the
aligned sentence pair.
• exclusion of function words such as prepositions,
determiners and others from the English sentence
when finding equivalent words/phrases in Japanese
This is because function words must often 
be changed depending on the types of head 
such as verb for preposition and noun for 
determiner in English. 
3) Extraction of the candidates for Japanese zero 
pronouns within the Japanese sentence 
In this step, the system extracts Japanese 
zero pronouns which are determined by syntac- 
tic and semantic analysis of Japanese within 
the syntactic and semantic structure of the 
Japanese sentence. 
4) Extraction of the candidates for translation 
equivalents of antecedents of Japanese zero pro- 
nouns within the English sentence 
In this step, the following English
words/phrases are extracted from the English
sentence as possible translation equivalents⁴:
• personal pronouns such as "I" or "you"
• "one"
• demonstratives such as "that"
• definite noun phrases such as a noun phrase
with definite article (e.g. "the company")
³ For details of this step, refer to
Yamada et al. (1996).
⁴ In this paper, I only extract these 4 types of English
words/phrases, which appeared in the test set, as
candidates; the examination of other types of English
words/phrases which can be candidates for translation
equivalents of antecedents of Japanese zero pronouns,
such as "they", "he" and "she", remains as future work.
5) Determination of zero pronouns in Japanese 
sentences and their referents in English sen- 
tences 
The pairs of Japanese words/phrases and En- 
glish equivalent words/phrases and the pairs of 
zero pronouns in a Japanese sentence and trans- 
lation equivalents of their antecedents in the 
English sentence are determined from the can- 
didates for the pairs of Japanese word/phrases 
and their English equivalent word/phrases 
which were extracted at step 2, the candidates 
for Japanese zero pronouns within the Japanese 
sentence which were extracted at step 3 and 
the candidates for translation equivalents of an- 
tecedents of Japanese zero pronouns within the 
English sentence which were extracted at step 
4. This determination is conducted based on 
how strongly related the candidates are and how 
many pairs can be extracted from these candi- 
dates. 
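Step 4 above can be illustrated with a small filter over a part-of-speech-tagged English sentence. This is only a sketch under stated assumptions: the input is taken to be a list of (word, tag) pairs with Penn Treebank-style tags (`PRP`, `DT`, `JJ`, `NN`), the word lists are illustrative, and `antecedent_candidates` is a hypothetical name rather than a function of ALT-J/E. Step 5, the pairing of candidates, is not modelled here.

```python
from typing import List, Tuple

Tagged = List[Tuple[str, str]]  # (word, POS tag); Penn Treebank-style tags are an assumption

# Illustrative word lists; the exact inventories are assumptions, not taken from ALT-J/E.
PERSONAL_PRONOUNS = {"i", "we", "you", "it"}
DEMONSTRATIVES = {"this", "that", "these", "those"}


def antecedent_candidates(tagged: Tagged) -> List[str]:
    """Collect the four candidate types listed in step 4 from a tagged sentence."""
    candidates: List[str] = []
    i = 0
    while i < len(tagged):
        word, tag = tagged[i]
        lower = word.lower()
        if lower == "the" and tag == "DT":
            # Definite noun phrase: "the" + adjectives/nouns, ending in a noun.
            j = i + 1
            while j < len(tagged) and tagged[j][1].startswith(("JJ", "NN")):
                j += 1
            if j > i + 1 and tagged[j - 1][1].startswith("NN"):
                candidates.append(" ".join(w for w, _ in tagged[i:j]))
                i = j
                continue
        elif lower in PERSONAL_PRONOUNS and tag == "PRP":
            candidates.append(word)
        elif lower == "one":
            # Over-generates: numeral uses of "one" are not filtered out here.
            candidates.append(word)
        elif lower in DEMONSTRATIVES and tag == "DT":
            candidates.append(word)
        i += 1
    return candidates


sentence = [("I", "PRP"), ("visited", "VBD"), ("the", "DT"), ("company", "NN"), (".", ".")]
print(antecedent_candidates(sentence))
```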
3.2 Construction of Resolution Rules for 
Japanese Zero Pronouns 
Using the extracted Japanese zero pronouns and 
their antecedents from Japanese and English aligned 
sentence pairs and the syntactic and semantic struc- 
tures of Japanese sentences, the system constructs 
resolution rules for Japanese zero pronouns. For 
this construction, verbal semantic attributes, the 
types of modal expressions and conjunctions within 
Japanese syntactic and semantic structure are used 
for the resolution conditions of Japanese zero pro- 
nouns. In the implementation in ALT-J/E, the rules 
are extracted using the case type of the zero pro- 
noun, the verbal semantic attributes (VSA; 107 cat- 
egories) of the verb which governs the zero pro- 
noun (Nakaiwa et al., 1994) and categorized types 
of modal expression (134 categories) in the unit sen- 
tence which contains the zero pronoun (Kawai, 1987) 
and categorized types of conjunction (56 categories) 
which are directly connected to the unit sentence. 
For example from aligned sentence pair (1), the rule 
(3) is extracted from the syntactic and semantic 
structure (2) in Figure 2. 
When zero pronouns, whose resolution conditions 
are the same but whose antecedents are different, 
occur within aligned sentence pairs, the resolution 
rule selects the referent which appears most often as 
the antecedent of the zero pronouns. For example, 
when unit sentences with zero pronouns in subject 
position and with category "A" modal expression 
occur eight times within aligned sentence pairs and 
when the extracted antecedents of the zero pronouns 
are "I" for 5 zero pronouns and "you" for 3 zero
pronouns, the resolution rule with the condition of
modal expression "A" determines the antecedents of
zero pronouns in subject position as "I".
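The selection of the most frequent antecedent per set of resolution conditions can be sketched as a simple vote count. The following Python sketch assumes that each extracted example has already been reduced to a tuple of condition labels plus an antecedent; `build_rules` and the label strings are hypothetical, and the handling of ties (first antecedent seen wins) is my assumption, since the text does not specify it.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Condition = Tuple[str, ...]   # e.g. (case type, VSA, modal category, conjunction category)


def build_rules(examples: Iterable[Tuple[Condition, str]]) -> Dict[Condition, str]:
    """Map each condition to the antecedent that occurs most often with it."""
    votes: Dict[Condition, Counter] = defaultdict(Counter)
    for condition, antecedent in examples:
        votes[condition][antecedent] += 1
    # most_common(1) returns the top-count antecedent; on ties the one seen first.
    return {cond: counts.most_common(1)[0][0] for cond, counts in votes.items()}


# The example from the text: subject zero pronouns with modal category "A",
# resolved as "I" five times and as "you" three times.
cond_a = ("subject", "modal:A")
examples = [(cond_a, "I")] * 5 + [(cond_a, "you")] * 3
print(build_rules(examples))
```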
4 Evaluation 
4.1 Evaluation Method 
The method to extract resolution rules for zero pro- 
nouns from aligned sentence pairs which was dis- 
cussed in Section 3 was evaluated by automatically 
extracting resolution rules for Japanese zero pro- 
nouns from the functional test sentence set which 
is already aligned, one sentence with one sentence. 
This evaluation was conducted using the Japanese 
to English MT system, ALT-J/E for the Japanese 
analysis and Brill's tagger for the English analysis. 
The conditions for the evaluation were as follows. 
4.1.1 Evaluation of Target Sentence Pairs 
The evaluation used Japanese and English aligned 
sentence pairs which contain zero pronouns with de- 
ictic references (371 instances) in a test set designed 
to evaluate the performance of Japanese-to-English 
machine translation systems (Ikehara et al., 1994) 
(3718 sentence pairs). 
4.1.2 Resolution Rules 
For the sentence pairs which contain zero pro- 
nouns with deictic reference, resolution rules for 
these zero pronouns were extracted by examining 
verbal semantic attributes, the types of modal ex- 
pressions and conjunctions within the syntactic and 
semantic structure of Japanese sentences (Section 
3.2). In this evaluation, the semantic constraints 
for cases by verb were not taken into consideration⁵.
4.1.3 Evaluation Parameters 
To examine the effectiveness of automatically ex- 
tracting resolution rules for Japanese zero pronouns, 
I examined the accuracy of resolution rules which 
are automatically extracted using three kinds of se- 
mantic and syntactic constraints: verbal semantic 
attributes, the types of modal expressions and con- 
junctions. As a baseline for comparison, I also exam- 
ined the accuracy of resolution by using rules which 
only consider the occurrence of antecedents in the 
same case elements such as subject and object as 
follows: 
• The most frequently occurring antecedent in the 
same case element is used for the antecedents of 
zero pronouns in the case element. 
• Antecedents in the same case element are
proportionally determined with the weight of the
occurrence of the antecedents in the case element.
For example, when the antecedents of zero pronouns
in one case element are "I" 3 times and "you" 2
times, the former rule achieves 60% (= 3/5) resolution
accuracy whilst the latter rule achieves 52% (=
3/5 * 3/5 + 2/5 * 2/5) resolution accuracy.
⁵ If the semantic constraints for cases by verb are
taken into consideration, the accuracy of resolution will
be better. But, the suitability of the semantic constraints
for cases needs to be taken into consideration
and this remains as future work.

(2) Syntactic and Semantic Structure of Japanese Sentence (1) and zero pronoun equivalent
S : u-sent-1
  tense : present, perfective aspect
  modal : tai (hope)
  VSA : Subject's human action, Subject's thinking action
  |- PRED : pred-1
  |    main verb : yomu (read)
  |- CASE : case-1
  |    case relation : objective case
  |    particle : wo
  |    |- NP : np-1
  |         |- N : hon (book)
  |- CASE : case-2
       case relation : subject
       |- NP : φ-1 <=> "I"
(3) Extracted Resolution Rule for Zero Pronouns in Japanese Sentence (1)
If
  S : u-sent-1
    modal : tai (hope)
    VSA : Subject's human action, Subject's thinking action
    |- CASE : case-a
         case relation : subjective case
         |- NP : φ-1
Then
  φ-1 = "watasi" (I)
Figure 2: An Example of Syntactic and Semantic Structure and Extracted Resolution Rule
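The expected accuracy of the two baseline rules can be computed from the antecedent counts within one case element. The Python sketch below reproduces the 60% and 52% figures of the worked example; the function names are hypothetical.

```python
from collections import Counter
from typing import Sequence


def most_frequent_accuracy(antecedents: Sequence[str]) -> float:
    """Accuracy when the most frequent antecedent is always chosen."""
    counts = Counter(antecedents)
    return max(counts.values()) / len(antecedents)


def proportional_accuracy(antecedents: Sequence[str]) -> float:
    """Expected accuracy when each antecedent is chosen with its relative frequency."""
    counts = Counter(antecedents)
    n = len(antecedents)
    # An antecedent with share p is chosen with probability p and is correct
    # with probability p, so it contributes p * p.
    return sum((c / n) ** 2 for c in counts.values())


sample = ["I"] * 3 + ["you"] * 2
print(most_frequent_accuracy(sample), round(proportional_accuracy(sample), 2))
```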
The accuracy of resolution is evaluated by the fol- 
lowing 2 tests: 
(a) window test All zero pronouns with deictic 
referents (371 instances), which were used for 
extracting resolution rules, were examined for 
their accuracy of resolution. As detailed in this 
paper, I have tried to evaluate the limitations 
to the accuracy of the rules which are extracted 
by the method. 
(b) blind test 370 zero pronouns out of 371 zero 
pronouns were used for extracting resolution 
rules and these rules were applied to the remain- 
ing zero pronoun, then the process was repeated 
371 times. By calculating the mean of the suc- 
cessfully resolved zero pronouns, the accuracy 
of resolution was examined. As shown in the 
next section, I have tried to evaluate the gen- 
erality of the rules which are extracted by the 
method. 
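The two tests correspond to evaluating on the extraction data itself and to leave-one-out evaluation. The sketch below is a toy Python illustration, not the evaluation code used here: it assumes rules are built by majority vote per condition, it counts a zero pronoun as resolved when the rule for its condition returns the observed antecedent, and the function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Example = Tuple[Tuple[str, ...], str]   # (resolution conditions, antecedent)


def build_rules(examples: List[Example]) -> Dict[Tuple[str, ...], str]:
    """Majority-vote rule per condition (assumed rule construction)."""
    votes = defaultdict(Counter)
    for condition, antecedent in examples:
        votes[condition][antecedent] += 1
    return {cond: c.most_common(1)[0][0] for cond, c in votes.items()}


def window_test(examples: List[Example]) -> float:
    """Apply rules to the same zero pronouns they were extracted from."""
    rules = build_rules(examples)
    return sum(rules.get(cond) == ante for cond, ante in examples) / len(examples)


def blind_test(examples: List[Example]) -> float:
    """Leave-one-out: extract rules from all but one zero pronoun, test on that one."""
    correct = 0
    for i, (cond, ante) in enumerate(examples):
        rules = build_rules(examples[:i] + examples[i + 1:])
        correct += rules.get(cond) == ante   # no rule for cond counts as a failure
    return correct / len(examples)


toy = (
    [(("ga", "modal:A"), "I")] * 3
    + [(("ga", "modal:B"), "you")] * 2
    + [(("ga", "modal:C"), "it")]
)
print(window_test(toy), round(blind_test(toy), 3))
```

In the toy data the single example with condition "modal:C" has no supporting rule once it is held out, which is why the blind test score falls below the window test score.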
4.1.4 Successfully Resolved Zero Pronouns 
When a rule which can determine the antecedent 
of a zero pronoun is extracted by the method, I judge 
that the zero pronoun is resolved successfully⁶.
⁶ Because the rules for the evaluation are automatically
created from aligned sentence pairs by the method
proposed in this paper, it is very difficult to extract only
the rules which are most suitable for the aligned sentence
pairs and the most general rules without using machine
learning techniques. In this paper, I only discuss how
resolution rules are automatically created and how many
zero pronouns can be resolved by using these rules. The
method to extract the best rule set remains as future
work.
4.2 Resolution Accuracy
The resolution accuracy of extracted rules is shown
in Table 2. As shown in this table, the accuracy of
rules using three kinds of conditions (modal expressions,
verbal semantic attributes and conjunctions)
is as high as 99.2% in the window test and 87.6%
even in the blind test. In contrast, resolution rules
which only consider the occurrence of antecedents
achieve a low resolution accuracy: 46.4% in the window
test and 46.1% in the blind test for resolution
rules using the referent which appears most often,
and 31.8% in the window test and 30.6% in the blind
test for resolution rules using the referent which is
proportionally determined with the weight of the
occurrence. For the 3 rule sets which are extracted
using only one condition each, the order of the resolution
accuracy is as follows: modal expression, verbal
semantic attributes and conjunctions. Furthermore,
if the modal expression and verbal semantic
attributes are used together for the creation of resolution
rules, the resolution accuracy is as high as
95.2% in the window test and 81.7% in the blind test.
This shows that the use of these two conditions for
the automatic extraction of resolution rules is most
effective for the resolution of Japanese zero pronouns
with deictic referents. These results demonstrate
that this method of automatically creating rules using
three kinds of conditions from aligned sentence
pairs can correctly resolve almost all Japanese zero
pronouns with deictic references⁷.
According to these results, the proposed method 
is effective for the automatic extraction of resolution 
rules for Japanese zero pronouns from Japanese and 
English aligned sentence pairs and by using a large 
amount of aligned sentence pairs it is possible to 
extract resolution rules for almost all Japanese zero 
pronouns. 
Table 2: Resolution accuracy for conditions of resolution
in automatically created rules

condition                               resolution accuracy
                                        window test      blind test
modal exp.                              74.9% (278)      64.2% (238)
VSA                                     70.9% (263)      52.6% (195)
conjunction                             55.0% (204)      48.8% (181)
modal exp. + VSA                        95.2% (353)      81.7% (303)
modal exp. + conjunction                90.3% (335)      79.5% (295)
VSA + conjunction                       87.9% (326)      68.2% (253)
modal exp. + VSA + conjunction          99.2% (368)      87.6% (325)
occurrence (most often)                 46.4% (172)      46.1% (171)
occurrence (proportionally selected)    31.8% (117.9)    30.6% (113.5)
5 Conclusion 
This paper proposes a powerful method for the ex- 
traction of resolution rules for Japanese zero pro- 
nouns from Japanese and English aligned sentence 
pairs. In this paper, I have only discussed Japanese 
and English language pairs. But this method can 
be applied to various kinds of language pairs such 
as Italian and English, and the effectiveness of the 
extracted rules depends on how different the two lan- 
guages are. In the future, I will examine methods for 
extracting the most effective and most general rules 
for zero pronouns resolution using machine learning 
techniques. Furthermore, I would like to realize an 
overall system with an English syntactic and seman- 
tic parser and evaluate the effect of the anaphora 
resolution and omission of ellipsis in English for the
extraction of resolution rules for zero pronouns with
intersentential and intrasentential antecedents.
⁷ This method does not use any heuristics which
were used in other methods of anaphora resolution
such as Murata and Nagao (1997) for Japanese nouns
and Lappin and Leass (1994) for English pronouns and
achieves relatively high accuracy (87.6%) even in the
blind test. So, if I also use heuristics for the resolution
of Japanese zero pronouns, the accuracy will be higher.
The examination of the combination of extracted resolution
rules and heuristics for the resolution of Japanese
zero pronouns remains as future work.
6 Acknowledgments 
I would like to thank Professor Jun'ichi Tsujii for 
helpful discussion of many of the ideas and proposals
presented here during my stay at UMIST from
September 1995 to September 1996. I am also grate- 
ful to several anonymous reviewers of ACL/EACL- 
97 workshop on anaphora for helpful comments on 
earlier drafts of the paper. 

References 
Eric Brill. 1992. A simple rule-based part of speech 
tagger. In Proc. of ANLP92, pages 152-155, ACL. 
Ido Dagan, Alon Itai and Ulrike Schwall. 1991. Two 
languages are more informative than one. In Proc. 
of 29th Annual Meeting of ACL, pages 130-137, 
ACL. 
Ido Dagan and Kenneth W Church. 1994. Termight: 
Identifying and translating technical terminology. 
In Proc. of ANLP94, pages 34-40, ACL.
Kouji Dousaka. 1994. Identifying the Referents of
Japanese Zero-Pronouns based on Pragmatic Con- 
dition Interpretation. In Trans. of IPS Japan, 
35(10):768-778. In Japanese. 
Pascale Fung and Kenneth W. Church. 1994. K- 
vec: A new approach for aligning parallel texts. 
In Proc. of COLING94, pages 1096-1102.
Satoru Ikehara, Shirai Satoshi and Kentaro Ogura. 
1994. Criteria for Evaluating the Linguistic Qual- 
ity of Japanese-to-English Machine Translation. 
In Journal of JSAI, 9(5):569-579. 
Satoru Ikehara, Shirai Satoshi, Akio Yokoo and Hiromi
Nakaiwa. 1991. Toward MT system without
Pre-Editing - Effects of New Methods in ALT-J/E -.
In Proc. of MT Summit III, pages 101-106.
Megumi Kameyama. 1986. A property-sharing constraint
in centering. In 24th Annual Meeting of
ACL, pages 200-206.
Atsuo Kawai. 1987. Modality, Tense and Aspect 
in Japanese-to-English Translation System ALT- 
J/E. In Proc. of the 34th Annual Convention IPS 
Japan, pages 1245-1246. In Japanese.
Susumu Kuno. 1978. Danwa no Bunpoo. Taishukan 
Publ. Co., Tokyo. In Japanese.
Shalom Lappin and Herbert J. Leass. 1994. An Al- 
gorithm for Pronominal Anaphora Resolution. In 
Computational Linguistics, 20(4):535-561, ACL. 
Masaaki Murata and Makoto Nagao. 1997. An Esti- 
mation of Referents of Pronouns in Japanese Sen- 
tence using Examples and Surface Expressions. In 
Journal of Natural Language Processing, 4(1):87- 
109, Association of Natural Language Processing. 
In Japanese. 
Hiromi Nakaiwa and Satoru Ikehara. 1992. Zero 
Pronoun Resolution in a Japanese-to-English Ma- 
chine Translation System by using Verbal Seman- 
tic Attributes. In Proc. of ANLP92, pages 201- 
208, ACL. 
Hiromi Nakaiwa, Akio Yokoo and Satoru Ikehara. 
1994. A System of Verbal Semantic Attributes 
Focused on the Syntactic Correspondence between 
Japanese and English. In Proc. of COLING94, 
pages 672-678. 
Hiromi Nakaiwa and Satoru Ikehara. 1995. In- 
trasentential Resolution of Japanese Zero Pro- 
nouns in a Machine Translation system using Se- 
mantic and Pragmatic Constraints. In Proc. of 
TMI95, pages 96-105. 
Hiromi Nakaiwa and Satoru Ikehara. 1996.
Anaphora Resolution of Japanese Zero Pronouns 
with Deictic Reference. In Proc. of COLING96, 
pages 812-817. 
Hiromi Nakaiwa and Setsuo Yamada. 1997. Auto- 
matic Identification of Zero Pronouns and their 
Antecedents within Aligned Sentence Pairs. In 
Proc. of the 3rd Annual Meeting of the Association 
for Natural Language Processing. In Japanese. 
Tetsuya Nasukawa. 1996. Full-text processing: im- 
proving a practical NLP system based on surface 
information within the context. In Proc. of COL- 
ING96, pages 824-829. 
Hideki Tanaka. 1994. Verbal case frame acquisition 
from a bilingual corpus: Gradual knowledge
acquisition. In Proc. of COLING94, pages 727-731.
Marilyn Walker, Masayo Iida and Sharon Cote. 
1990. Centering in Japanese Discourse. In Proc. 
of COLING90.
Setsuo Yamada, Hiromi Nakaiwa, Kentaro Ogura 
and Satoru Ikehara. 1995. A Method of Au- 
tomatically Adapting a MT System to Different 
Domains. In Proc. of TMI95, pages 303-310. 
Setsuo Yamada, Hiromi Nakaiwa and Satoru Ikehara.
1996. A New Method of Automatically
Aligning Expressions within Aligned Sentence
Pairs. In Proc. of NeMLaP2, pages 56-65.
Kei Yoshimoto. 1988. Identifying Zero Pronouns in 
Japanese Dialogue. In Proc. of COLING88, pages 
779-784. 
