THE JAPANESE GOVERNMENT PROJECT 
FOR MACHINE TRANSLATION 
Makoto Nagao, Jun-ichi Tsujii, and Jun-ichi Nakamura 
Department of Electrical Engineering 
Kyoto University 
Sakyou-ku, Kyoto, Japan 606 
1 OUTLINE OF THE PROJECT 
The project is funded by a grant from the Agency of 
Science and Technology through the Special Coordi- 
nation Funds for the Promotion of Science and Technol- 
ogy, and was started in fiscal 1982. The formal title of 
the project is "Research on Fast Information Services 
between Japanese and English for Scientific and Engi- 
neering Literature". The purpose is to demonstrate the 
feasibility of machine translation of abstracts of scientific 
and engineering papers between the two languages, and 
as a result, to establish a fast information exchange 
system for these papers. The project term was initially 
scheduled as three years from the fiscal year of 1982 
with a budget of about seven hundred million yen, but, 
due to the present financial pressures on the government, 
the term has been extended to four years, up to 1986. 
The project is conducted through close cooperation
among four organizations. At Kyoto University, we
have the responsibility of developing the software system 
for the core part of the machine translation process 
(grammar writing system and execution system); gram- 
mar systems for analysis, transfer and synthesis; detailed 
specification of what information is written in the word 
dictionaries (all the parts of speech in the analysis, trans- 
fer, and generation dictionaries), and the working manu- 
als for constructing these dictionaries. The Electro- 
technical Laboratories (ETL) are responsible for the 
machine translation text input and output, morphological 
analysis and synthesis, and the construction of the verb 
and adjective dictionaries based on the working manuals 
prepared at Kyoto. The Japan Information Center for 
Science and Technology (JICST) is in charge of the noun 
dictionary and the compiling of special technical terms in 
scientific and technical fields. The Research Information 
Processing System (RIPS) under the Agency of Engineering
Technology is responsible for completing the machine
translation system, including the man-machine interfaces 
to the system developed at Kyoto, which allow pre- and 
post-editing, access to grammar rules, and dictionary 
maintenance. 
The project is not primarily concerned with the devel- 
opment of a final practical system; that will be developed 
by private industry using the results of this project. 
Technical know-how is already being transferred gradu- 
ally to private enterprise through the participation in the 
project of people from industry. Software and linguistic 
data are also being transferred in part. Eventually, a complete
technology transfer will be carried out under appropriate
conditions.
The Japanese source texts being used are abstracts of 
scientific and technical papers published in the monthly 
JICST journal Current Bibliography of Science and
Technology. At present, the project is only processing 
texts in the electronics, electrical engineering, and 
computer science fields. English source texts will be 
abstracts from INSPEC in these fields. The sentence
structures used in abstracts tend to be complex compared
to ordinary sentences, with long nominal compounds, 
noun-phrase conjunctions, mathematical and physical 
formulas, long embedded sentences, and so on. The
analysis and translation of this type of sentence structure
is far more difficult than that of ordinary sentence patterns.
However, we have not included a pre-editing stage 
because we wanted to find the ultimate limitations on 
handling this type of complex sentence structure. 
Our system is based on the following concepts: 
1. The use of all available linguistic information, both 
surface and syntactic. The writing of syntactic rules
that are as detailed as possible. The development of a
grammar writing system that can accept any future level of
sophisticated linguistic theory.
2. The introduction of semantic information wherever 
necessary to enable the syntactic analysis to be as 
accurate as possible. The importance of semantic
information is not over-estimated; a well-balanced
usage of both syntax and semantics. Heavily seman- 
Copyright 1985 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that
the copies are not made for direct commercial advantage and the CL reference and this copyright notice are included on the first page. To copy
otherwise, or to republish, requires a fee and/or specific permission.
0362-613X/85/02091-111503.00
Computational Linguistics, Volume 11, Numbers 2-3, April-September 1985 91
tics-oriented analysis is very attractive and effective 
for sentences within narrow limits, but a system of 
that type cannot cope with the complicated structures 
found in descriptions of the wider world where 
semantic description becomes almost impossible. 
3. There are many exceptional linguistic phenomena that 
are more word-specific than explainable in general 
linguistic theory. The system should be able to accept 
word-specific rules. In our system, these rules are 
written into the lexical entries, with the priority given 
to these grammar rules in the analysis, transfer, and 
synthesis phases. This mechanism allows the system 
to be upgraded step by step by the accumulation of 
linguistic facts and word-specific rules in the diction- 
ary and effectively bypasses any deadlock in system 
improvement. 
4. The system must be able to produce an output with 
an imperfect sentence structure and containing 
untranslated original words rather than fail in cases 
where the analysis was imperfect. From the post- 
editor's point of view, an imperfect output is far pref- 
erable to no output at all. 
Many other concepts and methods have been devel- 
oped in our machine translation system, and these are 
explained in the sections following. This paper concen- 
trates on the main features of the Japanese to English 
translation system. The English to Japanese
system, which is also included in our national machine
translation project, is being developed, and the results will
be published shortly.
2 THE GRAMMAR WRITING SYSTEM, GRADE 
2.1 OBJECTIVES OF THE SOFTWARE SYSTEM 
In developing a machine translation system, the grammar 
rules should accurately reflect the intention of the gram- 
mar writer. This is fundamental to the achievement of a 
good grammar system. One of the basic necessities of
any machine translation system is therefore a software system
for writing the grammar, consisting of a language
for specifying the grammar rules and the accompanying
execution system.
A powerful grammar-writing language for machine translation
must fulfill the following requirements:
1. The language must allow manipulation of linguistic 
characteristics in both source and target languages. 
The linguistic structure of Japanese differs greatly 
from that of English. For instance, in Japanese, the 
restrictions on word order are not so strong, and 
some syntactic components can be omitted. A gram- 
mar writer must be able to reflect these sorts of char- 
acteristics. 
2. It is desirable that the grammar-writing language use 
the same framework for writing the grammars in the 
analysis, transfer, and synthesis phases. The grammar 
writer should not be forced to learn several different 
systems for the different translation stages. 
With these points in mind, we developed a new soft- 
ware system for machine translation comprising the 
language used to specify the grammar rules and the 
execution system. We call it GRADE (GRAmmar DEscri- 
ber). 
2.2 THE STRUCTURE OF GRADE 
The data format used to express the structure of a 
sentence during the analysis, transfer, and generation 
phases has a large influence on the design of the gram- 
mar writing language. GRADE uses an annotated tree 
structure to represent the sentence structure during the 
translation process. Grammatical rules in GRADE are 
described in the form of tree-to-tree transformations with 
each node annotated. The annotated tree in GRADE is a 
tree structure whose nodes are annotated by sets of 
property-value pairs. This tree-to-tree transformation 
gives a great power of expression to rewriting rules that 
can be used in the grammars for the analysis, transfer, 
and synthesis phases of the machine translation system. 
Annotation parts can be used to express information such 
as syntactic category, number, semantic markers, and 
other properties. They can also be used as flags to 
control rule application. 
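The annotated tree described above can be pictured with a minimal Python sketch. It is an illustration only, not GRADE's internal form (which the paper describes as LISP S-expressions). The class `Node`, the helper `find_all`, and the property names `cat`, `lex`, `sem`, `number`, and `analyzed` are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of an annotated tree: property-value pairs plus ordered children."""
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def find_all(node, **wanted):
    """Return every node (pre-order) whose properties include all of `wanted`."""
    hits = [node] if all(node.props.get(k) == v for k, v in wanted.items()) else []
    for child in node.children:
        hits.extend(find_all(child, **wanted))
    return hits


# A noun phrase annotated with a syntactic category, number, and a semantic
# marker, plus a hypothetical flag ("analyzed") of the kind that could be
# used to control rule application.
noun = Node({"cat": "N", "lex": "mizu", "sem": "liquid"})
np = Node({"cat": "NP", "number": "sg", "analyzed": True}, [noun])
verb = Node({"cat": "V", "lex": "NOMU"})
sentence = Node({"cat": "S"}, [np, verb])

print([n.props["cat"] for n in find_all(sentence, cat="NP")])
```

Because every node carries an open-ended set of pairs, syntactic, semantic, and control information can sit side by side on the same node, which is the point the paragraph above makes.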
A rewriting rule in GRADE consists of a declaration 
part and a main part. The declaration part has the follow- 
ing four components: 
• Directory entry part, containing the grammar writer's 
name, the version number of the rewriting rule, and the 
last revision date. This part is not used at execution 
time. The grammar writer can access the information 
using the HELP facility in GRADE. 
• Property definition part, where the grammar writer 
declares the property names and their possible values. 
• Variable definition part, where the grammar writer 
declares the names of the variables. 
• Matching instruction part, where the grammar writer 
specifies the mode of application of the rewriting rule to 
an annotated tree. 
The main part specifies the transformation in the 
rewriting rule, and has the following three parts: 
• Matching condition part, which describes the conditions 
for the structure of trees and the property values of 
nodes. 
• Substructure operation part, which specifies the oper- 
ations for the parts of the annotated tree that match the 
conditions written in the matching condition part. 
• Creation part, which specifies the structure and the 
property values of the transformed annotated trees. 
The matching condition part allows the grammar writ- 
er to specify not only a specific structure for an anno- 
tated tree but also structures that may repeat several 
times, structures that are optional, and structures where 
the order of the substructures is unrestricted. 
The substructure operation part specifies operations 
on the parts of the annotated tree that match in the 
matching condition part. It allows the grammar writer to 
assign a property value to a node, or to assign a variable 
to a tree or property value. The variable is declared in
the variable definition part. It also allows him to call a
subgrammar, a subgrammar network (which is explained 
below), a dictionary rule, a built-in function, or a LISP 
function. In addition, the grammar writer can specify a 
conditional operation using the IF-THEN-ELSE state- 
ment. 
The structure and the property values of the trans- 
formed annotated tree are written in the creation part. 
The transformed tree is described by node labels that are 
used in the matching condition part or the substructure 
operation part. 
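The three components of the main part can be sketched as follows. This is a simplified Python illustration, not GRADE syntax: the dict-based node representation and the names `make`, `apply_rule`, `mark_object`, and `build_vp` are hypothetical, and the sketch matches only a fixed run of daughter categories, leaving out the repeated, optional, and order-free patterns that GRADE supports.

```python
def make(cat, children=(), **props):
    """Build an annotated node as a plain dict (hypothetical representation)."""
    return {"cat": cat, "children": list(children), **props}


def apply_rule(tree, pattern, operate, create):
    """Apply one rewriting rule to the first matching run of daughters.

    pattern : categories that consecutive daughters must carry (matching condition)
    operate : function run on the matched nodes (substructure operation)
    create  : function building the replacement node (creation part)
    """
    kids = tree["children"]
    for i in range(len(kids) - len(pattern) + 1):
        window = kids[i:i + len(pattern)]
        if [k["cat"] for k in window] == pattern:
            operate(window)
            new_kids = kids[:i] + [create(window)] + kids[i + len(pattern):]
            return {**tree, "children": new_kids}
    return tree  # no match: the tree is returned unchanged


def mark_object(nodes):
    # Substructure operation: assign a property value to a matched node.
    nodes[0]["role"] = "OBJect"


def build_vp(nodes):
    # Creation part: the matched nodes become daughters of a new VP node.
    return make("VP", nodes)


# Japanese order: the noun phrase precedes the verb.
s = make("S", [make("NP", lex="mizu"), make("V", lex="NOMU")])
out = apply_rule(s, ["NP", "V"], mark_object, build_vp)
print([k["cat"] for k in out["children"]])
```

Keeping the three parts as separate arguments mirrors the separation in the rule format: the condition decides where the rule applies, the operation annotates what was matched, and the creation part builds the result from the matched nodes.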
The matching instruction part of a rewriting rule specifies
the application path through the annotated tree.
Paths through the trees are specified by combinations of 
the four basic modes: left-to-right, right-to-left, 
bottom-to-top, and top-to-bottom. 
GRADE allows the grammar writer to divide the whole 
grammar into several subgrammars and to describe the 
phases of the translation process separately. A subgram- 
mar may correspond to a grammatical unit such as the 
parsing of a simple noun phrase or the parsing of a 
simple sentence. The network of subgrammars forming 
the whole grammar allows the grammar writer to control 
the translation process in detail. If the subgrammar 
network in the analysis phase consists of the subgrammar 
for a noun phrase (SG1) and the subgrammar for a verb 
phrase (SG2) in this sequence, the GRADE executor first 
applies SG1 to the input sentence, and then applies SG2 
to the result. 
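The SG1-then-SG2 sequence can be sketched in Python as below. This covers only a linear sequence, the simplest kind of subgrammar network, and assumes a flat list of (category, content) pairs in place of a real annotated tree. The helper `group`, the category sets, and the contents of `sg1` and `sg2` are hypothetical stand-ins for real subgrammars.

```python
def run_network(subgrammars, tree):
    """Apply each subgrammar in order, feeding every result to the next one."""
    for subgrammar in subgrammars:
        tree = subgrammar(tree)
    return tree


def group(items, member_cats, new_cat):
    """Wrap each maximal run of items whose category is in member_cats."""
    out, run = [], []
    for item in items:
        if item[0] in member_cats:
            run.append(item)
        else:
            if run:
                out.append((new_cat, run))
                run = []
            out.append(item)
    if run:
        out.append((new_cat, run))
    return out


def sg1(items):
    # Hypothetical noun phrase subgrammar: adjective and noun runs become NP.
    return group(items, {"ADJ", "N"}, "NP")


def sg2(items):
    # Hypothetical verb phrase subgrammar: NP, particle, and verb become VP.
    return group(items, {"NP", "P", "V"}, "VP")


tokens = [("ADJ", "kuroi"), ("N", "kutsu"), ("P", "wo"), ("V", "HAKU")]
print(run_network([sg1, sg2], tokens))
```

SG2 only sees an NP because SG1 ran first, so the order of the sequence is itself part of the grammar.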
2.3 SOME SPECIFIC FEATURES OF GRADE 
GRADE allows a grammar writer to write word-specific 
grammar rules as a subgrammar at the word dictionary 
entry level. A subgrammar written in a dictionary entry 
is called a dictionary or lexical rule. A dictionary rule is 
specific to a particular word in the dictionary. 
A dictionary rule is called by the CALL-DIC function 
in the substructure operation part. When CALL-DIC is 
executed by an entry word and rule identifier as keys, the 
dictionary rule is retrieved and is applied to the part of 
the annotated tree specified by the grammar writer. 
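The two-key retrieval can be sketched in Python as below. The function `call_dic` only imitates the behaviour described for CALL-DIC and is not its actual signature. The table `DICTIONARY_RULES`, the rule identifier `"transfer"`, and the property names are hypothetical, and returning the node unchanged when no rule is stored is an assumption of this sketch.

```python
def shisakusuru_transfer(node):
    """Word-specific rule: paraphrase the verb as a verb plus an adverbial."""
    return {**node, "english_verb": "develop", "adverbial": "on a trial basis"}


# Hypothetical dictionary: (entry word, rule identifier) -> rule function.
DICTIONARY_RULES = {("SHISAKUSURU", "transfer"): shisakusuru_transfer}


def call_dic(entry_word, rule_id, node):
    """Retrieve the rule stored under the two keys and apply it to the node.

    The node is returned unchanged when the entry has no such rule
    (an assumption of this sketch).
    """
    rule = DICTIONARY_RULES.get((entry_word, rule_id))
    return rule(node) if rule else node


verb_node = {"cat": "V", "lex": "SHISAKUSURU"}
print(call_dic("SHISAKUSURU", "transfer", verb_node))
```

Since the rule lives with the word rather than in the general grammar, adding or correcting a word-specific rule does not require touching any other rule.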
Any grammar writing language must be able to resolve 
the syntactic and semantic ambiguities found in natural 
languages. GRADE allows the grammar writer to merge 
the results of all possible tree-to-tree transformations for 
a particular subgrammar. However, it must avoid any 
combinatorial explosion when it encounters ambiguities. 
For instance, let us take the case where a grammar 
writer writes a subgrammar to analyze the case frame of 
a verb, containing two rewriting rules; one rule is to 
construct a VP (verb phrase) from a V and NP (verb and 
noun phrase), and the other is to construct a VP from a 
V, NP, and PP (verb, noun phrase, and prepositional 
phrase). When he specifies the NONDETERMINISTIC- 
PARALLELED mode for the subgrammar, the GRADE 
executor applies both rewriting rules to the input tree, 
constructs two transformed trees, and merges them into a 
new tree whose root node has the special PARA property. 
The root node is called a para node and the subtrees 
under this node are the trees that have been transformed 
by the rewriting rules. Figure 1 shows this mode applied 
to create a para node. 
The grammar writer can select the most suitable 
subtree under the para node by applying a subgrammar
that assigns a priority value to each subtree and using a 
built-in function that orders the subtrees according to 
their values. 
A para node is treated the same as other nodes in the 
current implementation of GRADE. The grammar writer 
can use the para node as he wants, and can select a 
subtree under a para node at a later application of the 
grammar rule. 
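The formation of a para node and the later selection of a subtree can be sketched in Python. The sketch assumes a flat list of (category, content) pairs as the input tree. The function names, the example words, and the priority function (which prefers the alternative with fewer top-level constituents) are hypothetical and are not GRADE built-ins.

```python
def vp_from_v_np(items):
    """Rule 1: build VP from V and NP; a following PP stays outside the VP."""
    if [c for c, _ in items[:2]] == ["V", "NP"]:
        return [("VP", items[:2])] + items[2:]
    return None


def vp_from_v_np_pp(items):
    """Rule 2: build VP from V, NP, and PP."""
    if [c for c, _ in items[:3]] == ["V", "NP", "PP"]:
        return [("VP", items[:3])] + items[3:]
    return None


def apply_paralleled(rules, items):
    """Apply every rule to the same input and merge the successes under PARA."""
    results = [rule(items) for rule in rules]
    return ("PARA", [r for r in results if r is not None])


def order_by_priority(para, priority):
    """Order the alternatives under a para node, highest priority first."""
    _, alternatives = para
    return sorted(alternatives, key=priority, reverse=True)


items = [("V", "put"), ("NP", "the book"), ("PP", "on the table")]
para = apply_paralleled([vp_from_v_np, vp_from_v_np_pp], items)

# Hypothetical priority: prefer the reading with fewer top-level constituents.
best = order_by_priority(para, lambda alt: -len(alt))[0]
print(best)
```

Both readings stay packed under one node until some later step chooses between them, so the ambiguity does not multiply into separate full analyses.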
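[Figure 1 diagram not reproduced; recoverable content: the subgrammar SG transforms a tree with daughters V, NP, PP into a tree whose PARA root dominates two alternatives, a VP (V NP) followed by PP, and a VP (V NP PP).]
Figure 1. Example of para node formation.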
2.4 SYSTEM CONFIGURATION AND ENVIRONMENT 
The system configuration of GRADE is shown in Figure 
2. Grammar rules written in GRADE are first translated 
by the GRADE translator into internal forms, expressed 
as S-expressions in LISP. The internal forms of the gram- 
mar rules are applied to the input tree that is output by 
the morphological analysis program. The rules are 
applied by the GRADE executor, and the results are sent 
to the morphological generation program. 
The GRADE system program is written in UTILISP 
(University of Tokyo Interactive LISP) and implemented 
on a FACOM M382 computer, which can handle Chinese 
characters. The system will also run on the Symbolics 
3600. The system program contains about 10,000 lines. 
3 THE DICTIONARIES 
Because our system is based on the "transfer" approach, 
there are three separate dictionaries (for analysis, trans- 
fer, and synthesis). In this project, Japanese words are 
classified into 12 major categories (parts of speech) and 
46 subcategories according to their morpho-syntactic 
behaviour. English words are classified into 14 major 
categories and 28 subcategories. The outline of the 
dictionaries of different kinds is explained in this section. 
The details are available in the Japanese literature. 
3.1 THE JAPANESE ANALYSIS DICTIONARY 
3.1.1 ANALYSIS DICTIONARY FOR VERBS 
Some Japanese verbs are used in a wide range of circum- 
stances, each usage expressing a subtly different 
"meaning". These must all be translated into English 
differently. Distinguishing these different usages requires 
careful investigation of the context around the verb. As 
described in section 2, GRADE allows the definition of 
grammar rules that are applied only to specific lexical 
items. We use this capability to discriminate between 
verb usages. However, many verbs have only two or 
three different usages at most. We prepared a fixed 
format for the lexical coding of these verbs. Descriptions 
in this format are converted to internal representation in 
GRADE automatically by a program. For verbs that have 
a wide range of usages, where the rules need to be writ- 
ten based on a variety of heuristic information, the gram- 
mar rules can be written directly in GRADE, bypassing 
the fixed format. In the fixed format, a verb can have 
several case frames corresponding to different usages. A 
case frame in Japanese is represented as a set of triplets 
like: 
(Surface-Case-Mark   Deep-Case   Constraints-on-NP)
      <SCM>            <DC>           <CON>
SCM is a set of postpositional case particles, one of which
follows the noun phrase that fills the case. DC expresses the
deep case interpretation of the relationship between the
verb and the noun phrase. CON specifies a set of semantic
markers that the noun phrase filling the case should
have. Note that the deep case interpretation of the same
surface case particle changes depending on the verb. 
We listed 103 postpositional case particles and 33 
deep case relations in Japanese, and 32 deep case 
relations in English (Table 1), which we believe to be 
sufficient for Japanese to English translation. Figure 3 
gives a list of semantic markers used for semantic specifi- 
cation of the nouns in CON. 
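A case frame built from such triplets can be matched against noun phrases in any order, as the Python sketch below illustrates. It is a simplified greedy matcher, not the GRADE rules the project generates from the fixed format. The frame, the particles chosen for each slot, the semantic markers `human`, `liquid`, and `place`, and the noun `gakusei` are assumptions made for the example, and the deep case labels are borrowed from Table 1 purely for readability.

```python
# One hypothetical case frame: (surface case particles, deep case, semantic markers).
FRAME = [
    ({"ga", "ha"}, "AGenT", {"human"}),
    ({"wo"}, "OBJect", {"liquid"}),
]


def match_frame(frame, phrases):
    """Assign a deep case to each noun phrase, whatever order the phrases come in.

    phrases: list of (noun, particle, semantic markers of the noun).
    Returns {deep case: noun}, or None if some phrase fits no remaining slot.
    Slots may stay unfilled, since case elements can be omitted in Japanese.
    """
    remaining = list(frame)
    filled = {}
    for noun, particle, markers in phrases:
        for slot in remaining:
            scm, dc, con = slot
            if particle in scm and markers & con:
                filled[dc] = noun
                remaining.remove(slot)
                break
        else:
            return None
    return filled


print(match_frame(FRAME, [("mizu", "wo", {"liquid"}), ("gakusei", "ga", {"human"})]))
```

The particle check corresponds to SCM, the returned label to DC, and the marker intersection to CON. A verb with several usages would carry several such frames, each tried in turn.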
[Figure 2 diagram not reproduced; recoverable labels: Dictionary, Grammar, Dictionary rule, Grammar (internal form), sentential tree, executor, sentential tree.]
Figure 2. System configuration of GRADE.
Figure 3. System of semantic primitives for nouns. 
Table 1. English case labels.
 1. AGenT              17. RANge
 2. Causal-POtency     18. COmpaRison
 3. EXPeriencer        19. TOOl
 4. OBJect             20. PURpose
 5. RECipient          21. Space-FRom
 6. ORigin             22. Space-AT
 7. SOUrce             23. Space-TO
 8. GOAl               24. Space-THrough
 9. COntent            25. Time-FRom
10. PARtner            26. Time-AT
11. OPPonent           27. Time-TO
12. BENeficiary        28. DURation
13. ACCompaniment      29. CAUse
14. ROLe               30. CONdition
15. DEGree             31. RESult
16. MANner             32. COnCession
Since one case frame corresponds to one usage of a
verb, and each usage corresponds to a different
"meaning" of the verb, the lexical properties of verbs are
represented by the properties of each case frame. The
following properties are coded for each case frame.
a. Aspectual features: stative, semi-stative, durative,
resultant, transitional.
b. Volition: volitional verb, non-volitional verb.
c. Possible transformations of surface case markers:
Some auxiliary verbs that follow the verb and express
passive, or causative voice, etc., change the surface
case marking; that is, the postpositional case particles
described in SCM are changed. Which auxiliary verbs
can follow, and what transformation of surface case
markers is caused by an auxiliary verb, depend on the
verb itself, and so are marked as a lexical property of
the verb.
d. Idiomatic expressions: Information on collocation, for
example, which nouns and adverbs are often collocated
with the verb, is described in this column.
e. Lexical entry in transfer dictionary: As described
before, a verb may have more than one case frame,
each of which corresponds to one "meaning" of the
verb. The transfer dictionary contains an entry for
each meaning. Thus a single surface verb in Japanese
may correspond to several different entries in the
transfer dictionary. On the other hand, certain usages
of different surface verbs may be reduced to a single
entry in the transfer dictionary, if they are synonymous.
f. Semantic class: This property is used for semantic
classification of verbs such as "mental-action",
"physical-transfer", etc.
g. Miscellaneous properties: Several other minor properties
are coded in the current dictionary.
3.1.2 ANALYSIS DICTIONARY FOR NOUNS
The following properties are described using the fixed
noun format:
a. Subcategorization of nouns: proper noun, common
noun, action noun, adverbial noun, postpositional
noun, conjunctive noun, complementizer.
b. Semantic codes: The semantic codes shown in Figure
3 are used.
c. Information on collocation: Adjectives, nouns, etc.,
which often occur together with a noun are specified.
This information plays a role similar to the case frame
of a verb, and is effective in discriminating the different
usages and meanings of nouns.
We also have fixed formats for words in the other
morpho-syntactic classes. All words in the dictionary
have, besides the above information, the properties listed
in Table 2.
Table 2. Properties of Japanese words in the dictionary.
lexical item
word length
word stem
pronunciation
part of speech
sub-categorization of part of speech
conjugation
synonym
derivations (noun, verb, adjective, adverb)
related words
subject code
lexical entry in the transfer dictionary
semantic codes (for nouns, verbs)
thesaurus codes (for nouns)
idiomatic expressions
case frames (for verbs, adjectives)
3.2 THE JAPANESE TO ENGLISH 
TRANSFER DICTIONARY 
3.2.1 TRANSFER DICTIONARY FOR VERBS 
Different verb usages are discriminated during the analy- 
sis phase. This means that usage ambiguities are partially 
resolved before the transfer phase. However, the 
concept of "meaning" (usage) applied to a single word is 
very vague and in fact depends greatly on the language 
pairs we have for translation. For example, the verb 
NOMU in Japanese can be used in the following ways: 
Tabako-wo NOMU → smoke a cigarette
Kusuri-wo NOMU → take medicine
Mizu-wo NOMU → drink water
These three cases should be translated differently. In a 
similar way, the English verb to wear is used as: 
Wear a suit → Suutsu-wo KIRU
Wear black shoes → Kuroi Kutsu-wo HAKU
Wear spectacles → Megane-wo KAKERU
Wear a wristwatch → Udedokei-wo SURU
These four cases should be translated differently into 
Japanese. Some might claim that these verbs are highly
ambiguous and have several different meanings, but it is
intuitively more reasonable to consider that the target
language simply has more specific verbs in these cases.
In other words, discrimination in meaning at the analysis
stage is not sufficient to select the appropriate target
verb in these cases.
The verb transfer dictionary is divided into two parts: 
a word selection part and a mapping part. The word 
selection part is used to choose appropriate target verbs 
by referring to semantic markers of the case elements. 
The semantic markers currently being used appear to be 
insufficient to decide appropriate target verbs in certain 
cases. We cannot, for instance, distinguish medicine and 
cigarette with the current set of semantic markers, which 
is relevant to choosing appropriate English verbs for 
NOMU. However, we can treat such problems by specify- 
ing word selection rules in the noun transfer dictionary. 
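The interplay between the verb's word selection part and noun-specific selection rules can be sketched in Python. The two tables, the marker names `liquid` and `substance`, and the lookup order (noun-specific rule first, marker-based rule second) are assumptions made for this illustration; the paper does not give the actual entry format.

```python
# Word selection part of a hypothetical verb transfer entry:
# (semantic marker of the object, English verb).
VERB_SELECTION = {"NOMU": [("liquid", "drink")]}

# Hypothetical word selection rules stored with nouns, for the cases that
# the semantic markers cannot tell apart.
NOUN_SELECTION = {
    "kusuri": {"NOMU": "take"},
    "tabako": {"NOMU": "smoke"},
}


def select_verb(verb, noun, noun_markers):
    """Try noun-specific rules first, then the verb's marker-based rules."""
    specific = NOUN_SELECTION.get(noun, {}).get(verb)
    if specific:
        return specific
    for marker, english in VERB_SELECTION.get(verb, []):
        if marker in noun_markers:
            return english
    return None  # no rule applies; left for other rules or the post-editor


for noun, markers in [("mizu", {"liquid"}), ("kusuri", {"substance"}), ("tabako", {"substance"})]:
    print(noun, "->", select_verb("NOMU", noun, markers))
```

Here medicine and cigarette share the same assumed marker, so only the rules stored with the nouns can separate "take" from "smoke", which is the situation described in the paragraph above.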
The mapping part gives the correspondence of the 
deep cases in Japanese and English. In most cases, the 
Japanese deep case maps to the same deep case in 
English. There are, however, certain deep cases that are 
interpreted differently in the two languages. 
Sometimes a single Japanese verb cannot be translated
into a single English verb and has to be paraphrased
using a combination of a verb and another element such 
as a noun or a prepositional phrase. For example, 
SHISAKUSURU → develop (something) on a trial basis
Such linguistic expressions are also treated in the 
mapping part. 
Although many verbs in the transfer dictionary are 
coded in this fixed format and converted to lexical rules 
in GRADE by a program, we also write lexical rules 
directly for verbs that have a wide range of usages. 
3.2.2 TRANSFER DICTIONARY FOR NOUNS 
Some Japanese words that behave morpho-syntactically 
as nouns have to be translated into English words in 
other morpho-syntactic classes. Such class conversions 
should be treated in the transfer dictionary, because they 
are highly dependent on the lexical item. For example: 
(i) TAIWA-KEISHIKI-de JIKKOUSURU
(interaction) (to execute)
→ to execute interactively
(ii) PUROGURAMU-MOODO-de JIKKOUSURU
(program mode) (to execute)
→ to execute in program mode
The above two examples have exactly the same struc- 
tures in Japanese (where the noun phrases 
TAIWA-KEISHIKI-de and PUROGURAMU-MOODO-de fill 
the same deep case, "manner") but translate to different 
English structures simply because an appropriate lexical 
item exists for (i) but not for (ii). 
The fixed format for nouns includes the following 
items: 
a. Conditions on the sequence of words in the preceding 
part: 
A set of default rules that translate Japanese postpo- 
sitional case particles to English prepositions is 
provided in the transfer grammar. However, these 
default rules are often violated, because certain 
English nouns require specific prepositions. This kind 
of information is coded in this column. 
b. Conditions on the sequence of words in the succeed- 
ing part: 
Postpositions that follow the noun often give a clue to 
the morpho-syntactic class conversion. 
c. Collocation with verbs: 
Certain combinations of nouns and verbs in Japanese 
are translated into English as single verbs, and certain 
combinations of nouns and verbs such as kusuri 
(medicine) and NOMU (to smoke, to drink, to take) 
require specific translation of the verb (to take). This 
is the kind of information coded here. 
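The way noun entries override the default particle-to-preposition rules, and trigger class conversion, can be sketched in Python. The default table, the entry layout with `english`, `prepositions`, and `adverb` fields, and the function `transfer_phrase` are hypothetical; only the two example phrases and their English renderings come from the text above.

```python
# Hypothetical default mapping from Japanese case particles to English prepositions.
DEFAULT_PREPOSITION = {"de": "by", "ni": "to", "kara": "from"}

# Hypothetical noun transfer entries.
NOUN_ENTRIES = {
    "PUROGURAMU-MOODO": {"english": "program mode", "prepositions": {"de": "in"}},
    "TAIWA-KEISHIKI": {"english": "interaction", "adverb": {"de": "interactively"}},
}


def transfer_phrase(noun, particle):
    """Translate a noun plus its postposition into an English phrase."""
    entry = NOUN_ENTRIES.get(noun, {})
    # Class conversion: the noun and its particle become a single adverb.
    adverb = entry.get("adverb", {}).get(particle)
    if adverb:
        return adverb
    # A preposition required by the noun overrides the default rule.
    prep = entry.get("prepositions", {}).get(particle, DEFAULT_PREPOSITION.get(particle))
    english = entry.get("english", noun)  # unknown nouns stay untranslated
    return f"{prep} {english}" if prep else english


print("to execute", transfer_phrase("TAIWA-KEISHIKI", "de"))
print("to execute", transfer_phrase("PUROGURAMU-MOODO", "de"))
```

Both phrases enter with the same particle and the same deep case; the different English structures come entirely from what is stored with each noun.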
3.3 ENGLISH GENERATION DICTIONARY 
The format for verbs includes the following items: 
a. Components: In the transfer phase, certain Japanese 
verbs are translated into English expressions contain- 
ing not only verbs but also prepositional phrases 
and/or adverbial particles (off, up, etc.). These 
complex expressions have separate entries in the 
generation dictionary, and the structural descriptions 
for the complex expressions are given here. 
b. Verb patterns: The verb codes from the Longman 
Dictionary of Contemporary English are used to speci- 
fy the syntactic patterns a verb can take. 
c. Aspectual features: stative, transitive, process, 
completive, momentary. 
d. Voice: usually passive, can be used in passive voice, 
cannot be used in passive voice. 
e. Volition: volitional verb, non-volitional verb. 
f. Agent of to-infinitive: SUBject (I promise him to go),
OBJect (I want him to go).
g. Case frames: A case frame of a verb is expressed by a 
set of quadruplets like: 
Surface-Case Deep-Case Syntactic-Form Semantic-Code 
SC DC SF SEC 
SF is a list of numbers, each of which expresses one syntactic 
form the case element can take: 
1. noun phrase 
2. infinitive without to 
3. to-infinitive 
4. -ing 
5. that-clause 
6. wh-clause 
7. adjective 
8. -ed 
The formats for other parts of speech are described in 
the Japanese literature. 
4 JAPANESE SENTENCE ANALYSIS 
4.1 ANALYSIS STRATEGIES 
As pointed out by Wilks, semantic information cannot be 
used as constraints on single linguistic structures; it can 
be used only as preference cues to help choose the most 
feasible interpretation from among all the syntactically 
possible interpretations. We believe that many types of 
preference cues, besides semantic ones, exist in real texts, 
and these cannot be captured by CFG rules. By making 
use of various types of preference cues, our analysis 
grammar for Japanese can work almost deterministically 
to give the most preferable interpretation at the first 
output, without extensive semantic processing. 
In order to integrate heuristic rules based on various 
levels of cues into a unified analysis grammar, we have 
introduced the following principles in the analysis of 
Japanese sentences: 
1. Explicit control of rule application: Heuristic rules 
can be ordered according to their strength. 
2. Multiple relation representation: Various levels of 
information including morphological, syntactic, 
semantic, and logical are expressed in a single anno- 
tated tree and can be manipulated at any time during 
the analysis. This is required not only because many 
heuristic rules are based on heterogeneous levels of 
cues but also because the analysis grammar should be 
able to perform semantic/logical interpretation of 
sentences at the same time, and the rules for these 
phases should be written using the same framework 
as the syntactic analysis rules.
3. Lexicon-driven processing: We can write heuristic 
rules specific to a single or a limited number of words, 
such as rules concerned with collocation among 
words. These rules are strong in the sense that they 
almost always succeed. They are stored in the lexicon 
and invoked at the appropriate time during the analy- 
sis without decreasing efficiency. 
4. Explicit definition of analysis strategies: The whole 
analysis phase can be divided into steps. This makes 
the whole grammar efficient, natural, and easy to 
read. Furthermore, strategic consideration plays an 
essential role in preventing undesirable interpretations 
from being generated. 
Figure 4 shows the overall organization of our current 
analysis grammar. The main components are: 
1. Morphological Analysis 
2. Analysis of Simple Noun Phrases 
3. Analysis of Simple Sentences 
4. Analysis of Embedded Sentences (relative clauses) 
5. Analysis of Sentence Relationships 
6. Analysis of Outer Cases 
7. Contextual Processing (processing of omitted case 
elements, interpretations of ha, etc.) 
The analysis produces dependency tree structures 
showing the semantic relationships between the words in 
the input sentence. 
4.2 TYPICAL STEPS IN THE ANALYSIS GRAMMAR 
4.2.1 SIMPLE SENTENCES 
As described in section 3, the analysis dictionary for verbs
contains verb case frames that are expanded to GRADE
rules with unrestricted word order to obtain a match with
the input sentence structure. Certain verbs such as ARU,
NARU, SURU, MOTSU, etc., which have a wide range of
usages, are discriminated by directly coding subgrammars (SGs) in the
dictionary.
4.2.2 RELATIVE CLAUSES 
Relative clause constructions in Japanese express several 
different relationships between modifying clauses (rela- 
tive clauses) and their antecedents. Some relative clause 
constructions cannot be translated into English as rela- 
tive clauses. We classified Japanese relative clauses into 
four types, according to the relationship between the 
clause and its antecedent. Because these four types of
relative clauses share the same surface form,

    .......... (verb)     (noun)
    Relative Clause       Antecedent

careful processing is required to distinguish between
them. We have developed a sophisticated analysis procedure
that uses various levels of heuristic information.
4.2.3 NOUN PHRASE CONJUNCTIONS 
Noun phrase conjunctions often appear in abstracts of 
scientific and technical papers. It is important to analyze 
them correctly, especially in correctly determining the 
scope of the conjunction, because they often lead to a 
proliferation of the analysis results. We have many 
heuristic rules based on various types of information. 
Some are based on surface lexical items, some on word 
morphemes, and some on semantic information. They 
are used differently in different conjunctive structures. 
We can distinguish strong heuristic rules (that is, rules 
that almost always give correct scopes when applied) 
from others. In fact, there is some ordering of heuristic 
rules according to their strength. In GRADE we can 
define arbitrary ordering of rule applications by using 
subgrammar networks and also by ordering rewriting 
rules inside a subgrammar. This capability of being able 
to control the rule application sequence is absolutely 
necessary in integrating heuristic rules based on heter- 
ogeneous types of information into a unified set of rules. 
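The effect of ordering heuristic rules by strength can be sketched as below. This is a Python illustration under stated assumptions, not GRADE: the two rules, the feature names (`left_head`, `left_sem`, and so on), and the scope labels are hypothetical and stand in for the project's actual heuristics.

```python
def first_applicable(rules, candidate):
    """Apply rules in the given order and stop at the first one that fires."""
    for name, rule in rules:
        scope = rule(candidate)
        if scope is not None:
            return name, scope
    return None, None


# Hypothetical rules for a conjunction "A and B"; each returns a scope
# decision, or None when it has nothing to say about the candidate.
def same_head_word(c):
    # Surface lexical cue, assumed here to be the stronger rule.
    return "phrase-level" if c["left_head"] == c["right_head"] else None


def same_semantic_class(c):
    # Semantic cue, assumed here to be the weaker rule.
    return "head-level" if c["left_sem"] == c["right_sem"] else None


# Stronger rules come first, so they pre-empt weaker ones.
RULES_BY_STRENGTH = [
    ("same_head_word", same_head_word),
    ("same_semantic_class", same_semantic_class),
]

both = {"left_head": "component", "right_head": "component",
        "left_sem": "substance", "right_sem": "substance"}
sem_only = {"left_head": "voltage", "right_head": "current",
            "left_sem": "measure", "right_sem": "measure"}
neither = {"left_head": "voltage", "right_head": "building",
           "left_sem": "measure", "right_sem": "structure"}
```

When both rules would fire, only the stronger one decides the scope; the weaker rule is consulted only if every stronger rule abstains.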
4.2.4 SENTENCE RELATIONSHIPS AND 
OUTER CASE ANALYSIS 
In Japanese there are several different syntactic 
constructions corresponding to English subordinators and 
coordinators like although, in order to, and, and so on. 
The correspondence between forms of Japanese and 
English sentence constructions is not straightforward. 
Some postpositional particles in Japanese express several 
different semantic relationships between sentences, and 
therefore should be translated into different subordinators
in English according to the semantic relationships.

Figure 4. Basic flow of processing.
[Flowchart from START to END. Recoverable steps: morphological
analysis; decision on scopes of simple sentences; decision on
scopes of noun phrase conjunctions; analysis of relative
clauses; analysis of semantic relationships in noun phrases;
analysis of noun phrases; analysis of simple sentences;
decision on aspect; analysis of semantic relationships between
simple sentences; decision on tense; contextual analysis
(deleted cases, etc.); conversion of phrase structure to
dependency structure.]

The postpositional particle TAME can express either a 
"purpose-action" relationship or a "cause-effect" 
relationship. In order to resolve the ambiguity in the 
semantic relationships expressed by TAME, a set of lexi- 
cal rules is defined in the dictionary for the entry TAME. 
The rules are roughly as follows, where the sequential
form (S1, S2) is assumed:
(i) If S1 expresses a completed action or a stative
assertion, the relationship is "cause-effect".
(ii) If S1 expresses neither a completed action nor a
stative assertion, and S2 expresses a volitional action,
the relationship is "purpose-action".
Note that whether S1 expresses a completed action or not 
is determined in a preceding phase by using rules that 
utilize the aspectual features of the verbs described in the 
dictionary, and auxiliary verbs following the verb. We 
have heuristic rules for 57 postpositional particles used for
sentence conjunction, like TAME.
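The two TAME rules can be written down directly. The following Python sketch assumes that the earlier analysis phase has already reduced each sentence to boolean features; the function name, the parameter names, and the `None` result for the uncovered case are assumptions of this illustration, since the rules above are only described "roughly".

```python
def tame_relation(s1_completed, s1_stative, s2_volitional):
    """Classify the relationship expressed by TAME in the form (S1, S2).

    Rule (i):  S1 is a completed action or a stative assertion
               -> "cause-effect".
    Rule (ii): S1 is neither, and S2 is a volitional action
               -> "purpose-action".
    Returns None when neither rule applies (left undecided here).
    """
    if s1_completed or s1_stative:
        return "cause-effect"
    if s2_volitional:
        return "purpose-action"
    return None
```

Rule (i) is tested first, so a volitional S2 does not override a completed or stative S1.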
Postpositional particles that follow noun phrases and 
express case relationships are also very ambiguous in the 
sense that they express several different deep cases. 
Although the interpretations of inner case elements are 
directly given in the verb dictionary in the form of a
mapping between surface case particles and their deep
case interpretations, the outer case elements should be 
semantically interpreted by referring to the semantic 
categories of noun phrases and the verb properties. 
Lexical rules for 62 case particles have also been imple- 
mented and tested. 
5 TRANSFER AND GENERATION OF ENGLISH 
In principle we use the deep case dependency structure to 
represent a sentence semantically. Theoretically it is 
possible to assign a unique case dependency structure to 
each input sentence. In practice, however, the analysis 
phase may fail, or it may assign the wrong structure. 
Therefore, as an intermediate representation, we use a 
structure that makes it possible to annotate multiple 
possibilities as well as multiple level representation. 
Properties at a node are represented as vectors, so that 
this complex dependency structure is flexible in the sense 
that different interpretation rules can be applied to the 
structure. 
Transfer and generation rules are organized along the 
principle that "if a better rule exists, then the system uses 
it; otherwise, the system attempts to use a standard rule: 
if that fails, the system uses a default rule". The gram- 
mar involves a number of stages of application of heuris- 
tic rules. Figure 5 shows the process flow for the transfer 
and generation phases. 
To obtain a more neutral (or target-language oriented) 
structure, some heuristic rules are activated immediately 
after the standard analysis of the Japanese sentence is 
finished. We call such activation the pre-transfer loop. 
Semantic and pragmatic interpretations are done in the 
pre-transfer loop. The larger the number of heuristic 
rules applied in this loop, the better the results. 

Figure 5. Process flow for the transfer and generation phases.
[Flow: ANALYSIS -> internal representation for Japanese (with
the pre-transfer loop) -> TRANSFER -> internal representation
for English (with the post-transfer loop) -> GENERATION (tree
structure transformation) -> MORPHOLOGICAL SYNTHESIS.]

Table 3. Word selection in target language by using semantic markers.
[The Japanese verb patterns in the first column are not legible.]

Japanese pattern   Semantic marker for X/Y          English verb   English structure
-----------------  -------------------------------  -------------  -----------------
pattern 1 (X)      X: non-living substance,         form-1         form X(obj)
                      structure
                   X: social phenomena              take place     take place
                   X: action, deed, movement,       occur-1        X occur
                      reaction
                   X: standard, property, state,    arise-1        X arise
                      condition, relation
                   φ                                produce-2      produce X
pattern 2 (X, Y)   Y: non-living substance,         form-1         X form Y
                      structure
                   Y: phenomena, action             cause-1        X cause Y
                   φ                                produce-2      X produce Y
pattern 3 (X, Y)   Y: property                      improve-1      X improve Y
                   Y: measure                       increase-2     X increase Y
                   φ                                raise-1        X raise Y

Table 4. Default rule for assigning a case label of English to the Japanese postposition ni.

J-SURFACE-CASE   J-DEEP-CASE    E-DEEP-CASE        Default Preposition
---------------  -------------  -----------------  ----------------------
ni               RECipient      REC, BENeficiary   to (REC: to, BEN: for)
                 ORIgin         ORI                from
                 PARticipant    PAR                with
                 TIMe           Time-AT            in
                 ROLe           ROL                as
                 GOAl           GOA                to
                 ...            ...                ...

5.1 WORD SELECTION IN ENGLISH 
USING SEMANTIC MARKERS 
Word selection in the target language is a major problem in
machine translation, because a source-language word usually
has several possible translations. The main
principles adopted by our system are:
1. Area restriction using field codes, such as electrical 
engineering, nuclear science, medicine, and so on. 
2. Semantic codes are attached to a word in the analysis 
phase and used for the selection of the proper target 
language word or phrase. 
3. The sentence structure involving the word to be 
translated is sometimes effective in determining the 
proper word or phrase in the target language. 
Table 3 shows an example of part of the verb transfer 
dictionary. Selection of the English verb is done from 
the semantic categories of the nouns related to the verb. 
A number (i) attached to the verb, like form-1 or 
produce-2, labels the i-th usage of the verb. When 
semantic information on the nouns is not available, the
row labelled φ is applied to produce a default translation.
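The selection mechanism of Table 3 can be sketched as an ordered lookup with a default row. The Python below is an illustration only: the entry reproduces part of the first block of Table 3 as read here, the name `select_verb` and the set-based marker representation are assumptions, and the real transfer dictionary format is not shown in this paper.

```python
PHI = None  # stands for the default row labelled φ in Table 3

# One transfer-dictionary entry: (semantic markers of X, English verb).
# The grouping of markers into rows follows one reading of Table 3.
ENTRY = [
    ({"non-living substance", "structure"}, "form-1"),
    ({"social phenomena"}, "take place"),
    ({"action", "deed", "movement"}, "occur-1"),
    ({"state", "condition"}, "arise-1"),
    (PHI, "produce-2"),
]


def select_verb(entry, markers):
    """Return the verb of the first row whose markers overlap the noun's.

    `markers` is the set of semantic markers attached to the noun X.
    The φ row matches anything, so it acts as the default when no
    semantic information is available or no other row applies.
    """
    for condition, verb in entry:
        if condition is PHI or (markers & condition):
            return verb
    return None
```

For example, `select_verb(ENTRY, {"social phenomena"})` picks "take place", while an empty marker set falls through to the default "produce-2".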
The expressive power of format-oriented descriptions 
is, however, insufficient for a number of common verbs 
such as SURU 'to make, to do, to perform, ...' and NARU 
'to become, to consist of, to provide, ...'. In such cases, 
we can write the transfer rules directly in GRADE. There 
must be a constant effort to list varieties of usages with 
their corresponding English sentence structures and 
semantic conditions. 
A postposition in Japanese represents a case slot for a 
verb, but it has a variety of usages; thus determination of 
the English preposition corresponding to each Japanese 
postposition is quite difficult. It also depends on the verb 
that governs the noun phrase having that postposition. 
Table 4 illustrates part of a default table for determin- 
ing deep and surface case labels when no higher level 
rule applies. This sort of table is defined for all case 
combinations. In this way, we guarantee that at least one
translation is assigned to every input. The particular usage of
a preposition for a particular English verb is written in
the lexical entry for the verb, and the information is used
for English sentence generation.
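The interplay between the default table and verb-specific entries can be sketched as a two-level lookup. In the Python below, the default prepositions are those listed in Table 4 for ni; the verb-specific entry for "depend" and the names `VERB_PREPOSITION` and `choose_preposition` are hypothetical and serve only to show the override.

```python
# Default prepositions for the English deep cases listed in Table 4.
DEFAULT_PREPOSITION = {
    "REC": "to",
    "BEN": "for",
    "ORI": "from",
    "PAR": "with",
    "Time-AT": "in",
    "ROL": "as",
    "GOA": "to",
}

# Hypothetical lexical entries: a verb may prescribe its own preposition
# for a given deep case. The entry below is invented for illustration.
VERB_PREPOSITION = {
    "depend": {"GOA": "on"},
}


def choose_preposition(verb, deep_case):
    """Prefer the verb's own lexical entry; otherwise use the default table."""
    specific = VERB_PREPOSITION.get(verb, {})
    if deep_case in specific:
        return specific[deep_case]
    return DEFAULT_PREPOSITION.get(deep_case)
```

Any deep case covered by the default table receives some preposition even when the verb has no entry of its own, which is what guarantees at least one translation.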
Many odd structures are still left after the pre-transfer 
loop and the lexical selection, and the internal English 
representation must be adjusted further into more natural 
forms. We call this part the post-transfer loop. 
Global sentence structures are completely different in 
Japanese and English, and correspondingly the internal 
structures are also completely different. The fundamen- 
tal differences between the internal representation of 
Japanese and of English are absorbed in the pre-transfer 
loop. But before the English generation phase, some 
structural transformations are still required for cases such 
as (a) embedded sentence structures, and (b) complex 
sentence structures. These structural adjustments are 
performed in the post-transfer loop. 
The steps comprising the transfer phase are shown in 
Figure 6. 

Figure 6. Outline of the transfer phase.
[Flowchart. Recoverable box labels: apply heuristic rules (I);
simplify structure; transfer nouns, adjectives and determiners;
transfer adverbs; transfer verbs; transfer optional case,
prepositions; transfer subordination and coordination; decide
tense, aspect and modal; apply heuristic rules (II); output.]

5.2 ENGLISH SURFACE STRUCTURE GENERATION 
After transferring from the Japanese deep dependency 
structure to the English one, the structure is converted to 
a phrase structure tree with all the surface words 
attached to the tree. 
The conversion is performed top-down from the root
node of the dependency tree to the leaves. Therefore,
when a governing verb demands a noun phrase 
expression or a to-infinitive expression for its dependent 
phrase, a structural change must be made to the phrase. 
Noun-to-verb transformations and noun-to-adjective 
transformations are often required due to the difference 
in expressions between Japanese and English. This proc- 
ess moves down from the root node to all the leaf nodes. 
After this phrase structure generation process, some 
sentential transformations are performed. For example: 
• When the agent is missing, a passive transformation is 
applied. 
• When the agent and the object are both missing, the 
predicative verb is nominalized and made the subject by 
supplementing verb phrases such as is made or is 
performed. 
• When the subject phrase has a big tree, the anticipatory 
subject it is introduced. 
• In compound and complex sentences, same subject 
nouns are pronominalized. 
• Duplication of head nouns in conjunctive noun phrases 
is eliminated. For example, "uniform component and 
non-uniform component" is reduced to "uniform and 
non-uniform components". 
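The last transformation in the list, elimination of duplicated head nouns, can be sketched on plain strings. This Python illustration is deliberately naive: it treats the final word of each conjunct as the head noun and pluralizes by appending "s", both of which are simplifying assumptions; the system itself operates on phrase structure trees, not strings.

```python
def merge_conjoined_heads(left, right, conj="and"):
    """Merge two noun phrases that share the same head noun.

    The head is taken to be the last word of each phrase, and the
    merged head is pluralized by appending "s" (naive assumptions).
    Phrases with different heads, or without modifiers, are simply
    joined with the conjunction.
    """
    l_words, r_words = left.split(), right.split()
    if len(l_words) < 2 or len(r_words) < 2 or l_words[-1] != r_words[-1]:
        return f"{left} {conj} {right}"
    head = l_words[-1]
    l_mod = " ".join(l_words[:-1])
    r_mod = " ".join(r_words[:-1])
    return f"{l_mod} {conj} {r_mod} {head}s"


merged = merge_conjoined_heads("uniform component", "non-uniform component")
```

With the example from the text, `merged` is "uniform and non-uniform components".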
The large structural transformations required in
translation stem from the essential differences between
English, which is a DO-language, and Japanese, which is
a BE-language. In English, case slots such as tool, 
cause/reason, and some others often appear in the 
subject position, while in Japanese such expressions are 
never used. Transformations of this kind are incorpo- 
rated in the generation grammar as shown in Figure 7. 
They produce more natural English expressions. The 
stylistic transformation part of the process is still very 
primitive. We need to accumulate much more linguistic 
knowledge and lexical data before we can produce really 
natural English expressions. 
6 EVALUATION OF TRANSLATION QUALITY 
The following two aspects of the machine translation 
output have been adopted to evaluate translation quality. 
They are to some extent independent indicators. 
1. Intelligibility: An evaluation of the extent to which the
translated text can be understood by a native speaker 
of the target language. In Japanese to English trans- 
lation, we evaluate the extent to which an average 
British or American reader can understand the output 
without any reference made to the Japanese original. 
2. Accuracy: An evaluation of the degree to which the
translated text conveys the meaning of the original
text, that is, a measure of the difference in content
between the input and output sentences. The evaluation is
done by Japanese translators specializing in Japa- 
nese-to-English translation. 

Figure 7. An example of structural transformations in the generation phase.
[Input structure: headed by "collapse", with SUBJ "building"
and CAUSE "earthquake", generated as "The building collapsed
due to the earthquake." Transformed structure: headed by
"destroy", with CPO "earthquake" and OBJ "building", generated
as "The earthquake destroyed the buildings." CPO: causal
potency.]

6.1 INTELLIGIBILITY 
Evaluation of intelligibility is based on a scale of 1 to 5; 
the categories are described below. \[See Appendix A for 
translation examples.\] 
1. The meaning of the sentence is clear, and there are no 
questions. Grammar, word usage, and style are all 
appropriate, and no rewriting is needed. 
2. The meaning of the sentence is clear, but there are 
some problems in grammar, word usage, and/or style, 
making the overall quality less than 1. 
3. The basic thrust of the sentence is clear, but the eval- 
uator is not sure of some detailed parts because of 
grammar and word usage problems. The problems 
cannot be resolved by any set procedure; the evalu- 
ator needs the assistance of a Japanese evaluator to 
clarify the meaning of those parts in the Japanese 
original. 
4. The sentence contains many grammatical and word
usage problems, and the evaluator can only guess at
the meaning after careful study, if at all. The quickest
solution will be a retranslation of the Japanese
sentence because too many revisions would be needed.
5. The sentence cannot be understood at all. No
amount of effort will produce any meaning.
As the evaluation number increases on the above scale 
from 1 to 5, intelligibility decreases. The evaluator uses 
the above scale to evaluate the output sentence without 
any reference to the Japanese original in the first place. 
When the output sentence contains untranslated words in 
Japanese, the English translation of those words is 
provided by a Japanese rewriter before the evaluation. 
This evaluation work has been carried out to date by one 
British and one American evaluator, neither of whom has 
the ability to read or evaluate Japanese. Both evaluators 
have one year's experience in proofreading and checking 
translations of general scientific and technical literature, 
but neither has specialized knowledge in the field of elec- 
trical engineering, which has been used for the input 
material up to now. 
6.2 ACCURACY 
Accuracy is evaluated on a scale of 0 to 6; that is, seven 
categories. \[See Appendix B for translation examples.\] 
0) The content of the input sentence is faithfully 
conveyed to the output sentence. The translated 
sentence is clear to a native speaker and no rewriting 
is needed. 
1) The content of the input sentence is faithfully 
conveyed to the output sentence, and can be clearly 
understood by a native speaker, but some rewriting is 
needed. The sentence can be corrected by a native 
speaking rewriter without referring to the original 
text. No Japanese language assistance is required. 
2) The content of the input sentence is faithfully 
conveyed to the output sentence, but some changes 
are needed in word order. 
3) While the content of the input sentence is generally 
conveyed faithfully to the output sentence, there are 
some problems with things like relationships between 
phrases and expressions, and with tense, voice, 
plurals, and the positions of adverbs. There is some 
duplication of nouns in the sentence. 
4) The content of the input sentence is not adequately 
conveyed to the output sentence. Some expressions 
are missing, and there are problems with the relation- 
ships between clauses, between phrases and clauses, 
or between sentence elements. 
5) The content of the input sentence is not conveyed to 
the output sentence. Clauses and phrases are missing. 
6) The content of the input sentence is not conveyed at 
all. The output is not a proper sentence; subjects and 
predicates are missing. In noun phrases, the main 
noun (the noun positioned last in the Japanese) is 
missing, or a clause or phrase acting as a verb and 
modifying a noun is missing. 
As the evaluation number increases on the above scale 
from 0 to 6, the accuracy decreases. This part of the 
evaluation was done by four Japanese translators, each 
of whom has one or two years' experience in Japanese to
English translation. The whole evaluation process is 
monitored by a Japanese translation specialist with 
extensive experience in translation work. 
6.3 RESULTS OF EVALUATION 
We describe here the results of the evaluation of the 
translation of 1,682 sentences taken from the monthly 
JICST journal A Current Bibliography of Science and 
Technology. Of these, 791 were the ones often referred to 
for the development of the analysis grammar, and the 
remaining 891 were added as the test material this time. 
All the sentences were given to the machine translation 
system with no pre-editing. The 791 sentences forming 
the first group were originally selected out of 1000 after 
eliminating 120 that contained ungrammatical Japanese 
expressions, and a further 90 that contained long math- 
ematical or chemical formulae. The deletion of the latter 
was because, in the early stages, the analysis grammar 
that would deal with formulae had not been completed. 
The second group of 891 comprised all the sentences in the
abstracts, without any such selection.
Tables 5 and 6 present the evaluation results for intel- 
ligibility and accuracy for the two groups of abstracts. 
Table 7 gives a comparison of the two groups. As the 
system was not tuned to the sentences in the second 
group, there were many unknown grammatical structures 
and missing words in the dictionary, which made the 
evaluation result worse than the first group. 
As these tables show, when the accuracy of translation 
goes down, so too does the intelligibility. We did not 
find any examples of intelligibility being low when accu- 
racy was high, but we did find a reasonable number of 
cases where the translation accuracy was evaluated as 
low, but intelligibility was rated high. Table 8 lists typical 
sample sentences for each evaluation type. 
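The marginal figures and the asymmetry noted above can be checked directly from the cell counts of Table 6. The Python sketch below transcribes those counts; the thresholds used for "high" and "low" (accuracy 0 or 1 versus 5 or 6, intelligibility 1 or 2 versus 4 or 5) are assumptions of this illustration, not definitions given in the paper.

```python
# Table 6 cell counts for the second group.
# Rows: intelligibility 1..5; columns: accuracy 0..6, then "defective".
TABLE6 = [
    [61,   0,   7,  5,  0,  3,   7, 0],
    [ 0, 142,  22, 27,  8, 13,   9, 0],
    [ 0,   0, 138, 68, 44, 26,  17, 4],
    [ 0,   0,  10, 24, 35, 16,  37, 4],
    [ 0,   0,   0,  1,  6,  7, 149, 1],
]

# Totals and percentages per intelligibility category.
row_totals = [sum(row) for row in TABLE6]
grand_total = sum(row_totals)
row_percent = [round(100 * t / grand_total, 1) for t in row_totals]

# Low intelligibility (4 or 5) combined with high accuracy (0 or 1).
low_intel_high_acc = sum(TABLE6[i][j] for i in (3, 4) for j in (0, 1))

# High intelligibility (1 or 2) combined with low accuracy (5 or 6).
high_intel_low_acc = sum(TABLE6[i][j] for i in (0, 1) for j in (5, 6))
```

Under these thresholds the first count is zero and the second is not, which matches the observation that low accuracy can coexist with high intelligibility but not the reverse.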

Table 5. Evaluation results for the first group of 791 abstracts.
[Cross-tabulation of intelligibility (1 to 5) against accuracy
(0 to 6, plus defective output), in the same layout as Table 6.
Only fragments of the cell counts survive; the accuracy-6 column
reads 2, 4, 16, 24, 36 (total 82, 10.4%), and the marginal
percentages are those given for the first group in Table 7.]

Table 6. Evaluation results for the second group of 891 abstracts.

                                   accuracy
intelligibility     0     1     2     3     4     5     6  defective  total  % of total
1                  61     0     7     5     0     3     7      0        83      9.3
2                   0   142    22    27     8    13     9      0       221     24.8
3                   0     0   138    68    44    26    17      4       297     33.3
4                   0     0    10    24    35    16    37      4       126     14.1
5                   0     0     0     1     6     7   149      1       164     18.4
total              61   141   177   125    93    65   219      9       891
% of total        6.8  15.8  19.9  14.0  10.4   7.3  24.6    1.0

Table 7. Comparison between first and second groups for intelligibility.

intelligibility   first group (791)   second group (891)
1                 14.7%                9.3%
2                 32.7%               24.8%
3                 32.7%               33.3%
4                 12.6%               14.1%
5                  7.3%               18.4%

accuracy          first group (791)   second group (891)
0                 12.4%                6.8%
1                 23.6%               15.8%
2                 22.4%               19.9%
3                 12.4%               14.0%
4                 12.4%               10.4%
5                  5.8%                7.3%
6                 10.4%               24.6%
defective          0.6%                1.0%

Table 8. Typical sample sentences in the different evaluation categories.

Just as there are no clear and objective criteria for 
evaluating the quality of Japanese to English translations 
done by humans, standard criteria for judging the results 
of machine translation have yet to be established. The 
evaluation methods proposed in this paper are still in the 
trial stage, and much more refining and improving is still 
needed. 
The translation quality and the amount of post-editing
needed are closely related to the quality and nature of the
original text. It is quite natural to expect that simple 
sentences can be translated accurately and intelligibly. 
We need to develop some way to evaluate the degree of 
difficulty of the original text along with the translation 
evaluation. Only within this wider context can accuracy 
and intelligibility be meaningfully discussed. 
The JICST abstracts used in this project were written 
primarily with the aim of condensing as much informa- 
tion as possible into a few sentences. This means that 
there are many long sentences, many of which are not 
very correct from a linguistic point of view. This is one 
obvious factor contributing to the poor evaluation results 
shown in Tables 5 to 7. 
Evaluation of the quality of machine-translated 
sentences is closely linked to the way in which the 
machine translation output is to be used, hence to the 
ease with which post-editing can be done. Only a mini- 
mum of post-editing will be necessary to convey the 
technical meaning of the original to the specialist in a 
particular field for the purpose of information service. 
However, when the translated text is for wide circulation 
or publication (for example, technical manuals), style and 
naturalness of sentential expressions, as well as exact 
meaning, become more important. Depending on these 
situations, the yardstick for intelligibility will change as 
well. 
ACKNOWLEDGMENT 
We are deeply indebted to Professor Toyoaki Nishida
(Kyoto University), Mr. Yoshiyuki Sakamoto (ETL), 
Tsuyoshi Toriumi (JICST), and Masayuki Sato (JICST) for 
taking part in this project, and to many other people from 
private companies for their help in the development of 
the system. We are grateful to the Science and Technol- 
ogy Agency and to the Agency of Engineering Technolo- 
gy for their constant funding of and guidance for the 
project. Professors Yutaka Kusanagi, Shinobu Takamat- 
su, Makoto Hirai, and others gave us comments during 
the development of the system. 

REFERENCES 
Nagao, Makoto; Nishida, Toyoaki; and Tsujii, Jun-ichi 1984 Dealing 
with the Incompleteness of Linguistic Knowledge in Language 
Translation. In Proceedings of COLING 84, Stanford University,
California: 420-427. 
Nakamura, Jun-ichi; Tsujii, Jun-ichi; and Nagao, Makoto 1984 Gram- 
mar Writing System (GRADE) of Mu-Machine Translation Project 
and its Characteristics. In Proceedings of COLING 84, Stanford 
University, California: 338-343. 
Sakamoto, Yoshiyuki 1984 Lexicon Features for Japanese Syntactic
Analysis in Mu-Project-JE. In Proceedings of COLING 84, Stanford 
University, California: 338-343. 
Tsujii, Jun-ichi; Nakamura, Jun-ichi; and Nagao, Makoto 1984 Analy- 
sis Grammar of Japanese in the Mu-Project. In Proceedings of 
COLING 84, Stanford University, California: 338-343. 
