Japanese-to-English Project 
PROTRAN & TWINTRAN 
Jelinek, J.~, Wilcock, G?, Nishida, 0.2, Yoshimi, T." 
Bos, M. J. W. 3, Tamura, N. a, Murakan~fi, H. 3 
1. Center for Japanese Studies, The University of Sheffield, S102tn, U.K. 
2. SHARP Corporation, 492 Minosho-cho, Yamatokouriyama, Nara, Japan 
3. Faculty of Engineering, Kobe University, Rokkodai, Nada, Kobe 657, Japan 
1 Background 
I.D.S. stands for Integrated Dictionary Systems. 
Its distinguishing feature is the integration of 
the bulk of grammar (= morphology, instruc- 
tions for syntactic analysis, transfer and gener- 
ation) !into the dictionary. 
Research on these lines, Mined at :Japanese- 
to-English Machine ~h:a.nslation, started in the 
early 60's and found practical application as a 
tool for teaching monolinguM English speakers 
to decode Japanese. Applications of this method 
to other language pairs have also taken place. 
The fDS approach to Japanese-to-English MT 
found sponsorship from the British Government 
and ICL from 1984 as part of ALVEY (IKBS 
project no.25, carried out at the University of 
Sheffield, England in cooperation with ICL and 
Kobe University, Japan). When the Japanese to 
English part of the ALVEY project was success- 
fully concluded in 1987, resulting in the creation 
of AIDTRANS, SHARP Corporation (Japan) 
concluded an agreement with the hitherto part- 
ners and took over further sponsorship of this 
research. This note is about the work carried 
out after that. 
We have since achieved a working prototype 
of the sentence-for-sentence component known 
as PROTRAN and work now continues at Kobe 
University, under SHARP sponsorship, on the 
development of a textwide component (TWIN- 
TRAN) which could run on top of the existing 
model. 
2 Sentence-for-Sentence Com- 
ponent: PROTRAN 
2.1 Linguistic Rules and the Process- 
ing System 
Our mMn task in the last year of research has 
been to reformulate the sentence-for-sentence 
Japanese-to-English system in such a way as 
to make the complete linguistic information ex- 
plicit, which are executed by a processing system 
separate from these rules. The processing sys- 
tem is all programmed in Prolog and executes 
the linguistic rules by applying a function to 
each type of ~ule. This task has largely been 
achieved by now. 
The linguistic information resides in tlhe fol- 
lowing sets of rules: 
1) Japanese-to-English Automatic Dictionary 
(at present 32000 entries), held in a rela- 
tional database with seven fields for each 
entry (combined key comprises the fields 
Entry_~ord, Translation, ~lord._class, 
Entry_code and Continuation; outside 
key are the fields Priority and Semantic_ 
category). 
2) Prioritised list of permitted juxtapositional 
links 
Morpholexical analysis is executed by a 
linear chart parser utilising the fields 
Entry_~ord, Entry_code, Continuation 
and Priority from the dictionaxy database 
and the prioritised list of permitted links. 
This yields a set of morphological word class 
50 1 
strings, each of which maps the input sen- 
tence, evaluated for their juxtapositional 
suitability and their morpholexical suitabil- 
ity as the best of all obtainable dictionary 
mappings of the input sentence. This eval- 
uation takes place in two tiers, first utilising 
the prioritised list of permitted links (to ob- 
tain morphological optimum) and then on 
the basis of the field "priority" of each en- 
try (to obtain lexicM optinrum). Only the 
overall optimum mappings are passed on for 
further processing. 
3) Morpholexical Grammar Rules (which are 
linear rewrite rules) 
A set of linear grammar rules is applied 
to produce strings of syntactic word classes 
out of the original strings of morphological 
word classes. 
4) Syntactic Analysis Rules (allowing parsing 
into trees) 
A version of Bottom-Up Parser \[4\] is then 
used to execute the linguistic rules for Syn- 
tactic Analysis, resulting in a set of all a.1- 
lowable Japanese trees for the Japanese in- 
put sentence. We now have a prioritised 
version of these rules, designed to avoid 
carrying out the complete search in favour 
of proceeding with tire best option only 
and coming back only if this option fails 
in further processing. Work is in progress 
at present, to be incorporated in TWIN- 
TRAN, to implement this stage in the fornr 
of demand-driven prioritised chart parser 
(whereby the chart parser is cmttrolled by 
an A* al.gorith.m). 
5) Sentence Pattern Transfer Rules (specifying 
case-type word order transfer) 
A set of functions applies the rules of 
Sentence Pattern Transfer (which are de- 
fined as Production Rules), moving verb- 
dependent case groups to their appropri- 
ate English word order positions, supply- 
ing default prepositions for each case group 
and creating default Subjects and/or Ob- 
jects where necessary. Unlike all the previ- 
ous stages, which, are all largely divergent, 
6) 
this process contains little divergence and 
prioritisation has not yet been introduced. 
Substitution Rules (which deal with all re- 
maiming word order transfer, which at this 
stage is Entry-Specific) 
Substitution is a set of functions executing 
rules which finalise the English word order 
down to the lowest level of trees, but the 
output remMns in the form of trees. This 
process tends to output fewer alternatives 
than have been input, as some trees are li- 
able to be eliminated. 
Generation Rules (which produce English 
word forms) 
Generation produces actual English sen- 
tences by scanning all tree node labels in 
post-order, activating Generation rules by 
node labels. It still preserves multiple user 
choices not only from amongst different 
sentence-level renderings of the input sen- 
tence but also from within sets of local al- 
ternatives within each version of the sen- 
tence. 
2.2 Points of Convergence 
It is common knowledge that. the kind of exercise 
described above is bound to entail combinatorial 
explosion at several points from (1) to (4) if no 
measures are taken to prevent it. An inseparable 
part of our method is the reliance on the so- 
called Points of Convergence to overcome this 
problem. 
A point of convergence is a point at which all 
alternatives so far listed have the same chance of 
success vis-~-vis what ma.y folk)w. A selection of 
the best alternative(s), or a ranking of these al- 
ternatives as to their relative '(goodness", may 
therefore be carried out i~t each poi~t of con- 
vergence, i.e. severM times before the end of 
sentence is reached. 
Points of convergence may be total (when 
all available a.lternatives a.t that point stand an 
equal chance of success) or partial (as between 
only some available alternatives). These points 
are found not only at the stage of linear grain- 
mar but also on the trees produced in syntactic 
2 51 
analysis. 
At each point of convergence the quantity of 
information passed on to the next process can be 
significantly reduced. The information left be- 
hind can either be dropped altogether (as is the 
case with less than optimum morphological rep- 
resentations) or be graded into ranks and wait 
in a queue on a demand-driven basis. 
3 Textwide Grammar 
A random sequence of sentences does not make 
a valid text, anymore than a random sequence of 
words makes a valid sentence (even though both 
phenomena may occur by accident). This is not 
primarily because such random sequences would 
not make a coherent sense; that would only put 
them in the same category as numerous prop- 
erly formed and perfectly official texts, which 
just happen to talk nonsense. Natural language 
is able to express nonsense, on purpose or other- 
wise. Nonsense can be grammatical and can be 
translated. Random sequences fall down mainly 
because they tend to be formally incoherent, and 
formal coherence is another word for grammar. 
There are formal rules determining text coher- 
ence and most of these rules have to do with the 
formal aspect of correferentiality. Certain struc- 
tures are formally able to refer again to some 
items (assertions, events, facts, objects or per- 
sons) that have previously been mentioned in 
the same text. Unless this "referring again", or 
correferentiality, happens quite often and suIfi- 
ciently thoroughly, the text cannot be under- 
stood as one coherent linguistic entity and may 
end up looking like a random sequence of sen- 
tences. 
The rules of grammar governing correferen- 
tiality are based on the theory of depredication 
\[2\] \[3\]. We have formulated these rules for the 
specific process of translating from Japanese to 
English. Their implementation also requires a 
fairly simple but robust semantic network, based 
entirely on only one type of semantic relation 
known as subsumption. 
We believe that TWINTI?AN will be able to 
demonstrate the functioning of textwide gram- 
mar in time for this conference. 
A processing stage which would come up with 
only one definite %ptimum" alternative at the 
very end is not yet implemented. Since the 
main prospective user is meant to be a monolin- 
gual Japanese, we envisage the need \[or interac- 
tive disambiguation based on reformulating the 
Japanese sentence in alternative Japanese ren- 
derings. 
Acknowledgement 
We wish to express our gratitude to Prof. Steven 
L. Tanimoto of University of Washington and 
Mr. Mikio Osaki, Mr. Shinobu Shiotani and Mr. 
ttitoshi Suzuki of SHARP Corporation. 

References 

Knowles, F. E.; Jelinek, J. and Wood, M. 
McG. : The ALVEY Japanese and English 
Machine Translation Project, Proceedings 
of Machine Translation Sunrmit Confer- 
ence, Tokyo 1987. 

Jelinek, Jiri : A Linguistic Aspect of Trans- 
formation Rules, in Acta Universitatis Car- 
olinae - Philologica, I (Slavica Pragensia, 
VII), Prague 1965. 

Jelinek, Jiri : Construct Classes, Prague 
Studies in Mathematical Linguistics 2, 
1966. 

Matsumoto, Y. : BUP : A Bottom-Up 
Parser Embedded in Prolog, New Genera- 
tion Computing, Vol. 1, No. 2, pp.145-158, 
1983. 
