Learning Mechanism in Machine Translation System "PIVOT" 
Mitsugu Miura  Mikito Hirata  Nami Hoshino
C&C Systems Engineering Division
NEC Corporation 
4-12-35, Shibaura, Minato-ku, 
Tokyo, JAPAN 
Abstract 
NEC's machine translation system "PIVOT" provides
analysis editing functions. The user can interactively
correct errors in analysis results, such as dependency
and case. However, without a learning mechanism, the
user must correct similar dependency errors several
times. We discuss a learning mechanism that utilizes
the dependency and case information specified by the user.
We compare four types of matching methods by simulation
and show that non-restricted best matching is the most
effective.
1. Introduction 
In current machine translation systems, users
cannot always get correctly translated sentences at the
first translation. This is due to the low ability of
the grammar rules and the low quality of the dictionary.
Moreover, the grammar rules and the dictionary need
customization for each document of varying fields and
contents. It is very difficult to prepare beforehand
the information corresponding to various fields.
NEC has developed the machine translation system
"PIVOT" (Japanese to English / English to Japanese) as a
translation support system for business use. The trans-
lation part of PIVOT is a rule-based system and
adopts the interlingua method. PIVOT provides a special
editor so that the user can correct the analysis
results. The user can interactively select suitable
translation equivalents and can correct dependency, case
(semantic relation), and so on. In technical manual
documents, which are the main objects of machine trans-
lation, there are many expressions that appear more
than once. The analysis results of such expressions are
often the same. At present, PIVOT has a learning function
for selection of translation equivalents, but it does
not have such a mechanism for dependency and case. The
user has to correct many similar errors in dependency
and case, so a heavy burden is laid on the user. Infor-
mation given by the user can be regarded as customizing
information for the document to be translated. There-
fore, for a practical system, it is an important
issue to provide a framework that improves translation
by using correction information from the user.
There are various approaches to analyzing sentences
by using accumulated dependencies. One system auto-
matically extracts all dependencies which have no
ambiguity [5]. Another system accumulates only the
dependencies which are directly corrected by the user
[2]. In Miura et al. [4], the system accumulates all
dependencies in the sentence that are corrected
or confirmed by the user.
There are two ways of remembering the keys in the
dependency structures to be accumulated: one by the
spelling and the other by the semantic code. However,
the rough semantic code used in the current system does
not have high distinguishing ability, and often causes
bad side effects. For example, consider the following
sentences.
  He looked at the singing man with opera glasses.
  He looked at the man who is singing with the micro-
phone.
The semantic code "Instrument" is usually assigned to
both "オペラグラス (opera glasses)" and "マイク (microphone)".
Therefore, it isn't possible to fix dependency relations
such as "sing" with "microphone" and
"look" with "opera glasses".
In the process of using learning results, there is an
approach that adopts best matching by computing simi-
larity with accumulated information [1]. The example-
based approach, which translates by retrieving examples
and calculating similarity, has also been investigated. These
systems also adopt best matching [1][6][7].
This paper proposes an approach that can improve the
translation quality by interactively accumulating de-
pendency and case structures corrected by the user. In
the learning process, the syntactic head, the syntactic
dependent, and the case between them are stored in an
association database. To avoid side effects, head and
dependent words are stored in the form of spellings.
This makes it easier for the user to understand the
behavior of the system. Four types of matching methods
are examined that are used in matching between the
possible analysis structures and the association data-
base.
Section 2 describes the analysis editing function in
PIVOT/JE (Japanese to English). Section 3 explains the
learning mechanism, and the results of simulation on
actual manuals are presented in Section 4.
2. Analysis Editing Function 
The user can interactively specify the following
information related to the dependency relation by using
the analysis editing function of PIVOT/JE.
(1) Dependency (syntactic dependent and syntactic head)
(2) Case
(3) Parallel
(4) Scope
ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992, p. 693. PROC. OF COLING-92, NANTES, AUG. 23-28, 1992
(5) Sharing 
The dependency relation which the system analyzes is
displayed on the screen as shown in Figure 1. An under-
line is drawn under each Japanese phrase (a word with
a particle). A dependency is shown by the line
which connects two phrases. A thick line indicates
a dependency corrected by the user. Case is displayed
on the line of the dependency in the form of the parti-
cle which has a one-to-one correspondence with one of
the cases. A box indicates the correct case specified
by the user. The user directly corrects the above-mentioned
information by using a mouse and carries out the trans-
lation operation once again. The translation rules
control the analysis to reflect the corrections by the
user.
Figure I: Display of Analysis Result 
2.1 Dependency 
The user can correct dependency. In Figure 2, the syn-
tactic head of "user" is changed from "analyze" to "specify".
Figure 2: Example of 0ependency Correction 
2.2 Case 
Case shows the semantic relation between two phrases
which are in a dependency relation. PIVOT has more than
forty kinds of cases, such as Agent and Reason. On the
screen, particles are used to express cases.
In Figure 3, the case between "EWS4800" and "run"
is changed from "Contents" to "Place".
Figure 3: ExamPle of Case Correction 
2.3 Parallel 
The user can specify the information that two
phrases are in a parallel relation. Because the parallel
relation is one of the PIVOT cases, this function
enables the user to correct dependency and case at the
same time.
2.4 Scope 
The user can specify scope. A scope is a phrase
sequence in which only the syntactic head has a depen-
dency relation with phrases outside of it.
2.5 Sharing 
In Figure 1, "user" is the subject of "specify"
and at the same time it is the subject of
"translate". In such a case, we say "user" is
shared by "specify" and "translate".
Sharing is specified by giving more
than one syntactic head to the dependent, so
sharing is decomposed into dependency relations.
Useful information on dependency relations can also be obtained
from the user's specification of scope and so on, but
this paper discusses learning only from correction operations
for dependency and case.
3. Learning Mechanism 
The proposed learning mechanism is as follows.
3.1 Learning Process 
(1) PIVOT analyzes a source sentence. 
(2) PIVOT displays the analysis result. 
(3) A user corrects mistakes in the analysis result. 
(4) After the user finishes making corrections, PIVOT
    translates the sentence again.
(5) PIVOT asks the user whether the translation has been a
    success or not.
(6) If the translation is a success, PIVOT stores the
    analysis result together with the instruction item
    in an association database. If the translation is
    a failure, PIVOT does nothing further.
3.2 Applying Process 
(1) PIVOT analyzes a source sentence.
(2) If there is ambiguity at a certain stage of
    analysis, PIVOT retrieves data in the association
    database.
(3) PIVOT compares the possible analysis structures of 
the given sentence with the analysis results 
accumulated in the association database. 
(4) PIVOT selects the analysis structure that matches 
with the analysis results accumulated in the asso- 
ciation database. If no matching occurs, PIVOT 
selects one structure by further application of the 
analysis rules. 
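The applying process above can be sketched in Python. The function and data shapes here are illustrative assumptions for exposition, not PIVOT internals: a structure is represented as a list of (case, head, dependent) triplets, and the association database as a set of such triplets.

```python
# Sketch of the applying process (Section 3.2): on ambiguity, consult the
# association database; fall back to the analysis rules if nothing matches.

def select_structure(candidates, association_db, fallback_rules):
    """Pick the candidate analysis structure supported by learned data."""
    if len(candidates) == 1:
        return candidates[0]              # no ambiguity, no retrieval needed
    for structure in candidates:
        # a structure matches if any of its (case, head, dependent)
        # triplets was learned before
        if any(triplet in association_db for triplet in structure):
            return structure
    # step (4): no match, so select by further application of the rules
    return fallback_rules(candidates)

db = {("INS", "look", "opera glasses")}
s1 = [("AGT", "look", "he"), ("INS", "sing", "opera glasses")]
s2 = [("AGT", "look", "he"), ("INS", "look", "opera glasses")]
chosen = select_structure([s1, s2], db, lambda cs: cs[0])
print(chosen is s2)  # True
```

With no learned data the fallback would return the first candidate, mirroring step (4) of the process.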
PIVOT learns correct analysis structures related to
the user's instruction. The smallest unit of PIVOT's
analysis structure, that is, the triplet of syntactic
dependent (with particles and voice information), syn-
tactic head (with voice information), and the case
between them, combined with the instruction item, forms
the learning unit. The instruction item shows what the
correction has been made on, namely, case or dependen-
cy correction. Each learning unit is accumulated in the
association database. The database can be retrieved
with the spelling of the syntactic dependent or head as
the key. The learning unit corresponds to the following
structure.
word2 (Syntactic head)
  |
CASE1 (Case)
  |
word1 (Syntactic dependent)
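As a minimal sketch, the learning unit and its spelling-keyed retrieval described above might look as follows; the class and method names are assumptions for illustration.

```python
from collections import defaultdict

class AssociationDB:
    """Accumulates (case, head, dependent, item) learning units,
    retrievable by the spelling of the head or the dependent."""

    def __init__(self):
        self.by_word = defaultdict(list)   # spelling -> learning units

    def store(self, case, head, dependent, item):
        # "item" records what the user corrected: case or dependency
        unit = (case, head, dependent, item)
        self.by_word[head].append(unit)
        self.by_word[dependent].append(unit)

    def lookup(self, word):
        return self.by_word[word]

db = AssociationDB()
db.store("INS", "look", "opera glasses", "dependency")
print(db.lookup("look"))
# [('INS', 'look', 'opera glasses', 'dependency')]
```

Keying by spelling rather than semantic code follows the design choice argued for in Section 1.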
An example of the learning process and the applying
process is shown below. This is an example of correct-
ing dependency.
[Translation process at the first stage]
Source sentence:
  (Translation)
Possible analysis structures:

(Analysis structure 1)      (Analysis structure 2)
        look                        look
       /    \                     /  |   \
     AGT    OBJ                AGT  INS  OBJ
     /        \                /     |     \
    he        man             he  opera   man
               |                  glasses  |
              OBJ                         OBJ
               |                           |
             sing                        sing
               |
              INS
               |
        opera glasses

AGT: Agent   OBJ: Object   INS: Instrument

If there is no information in the association
database, analysis structure 1 is selected by further
application of the rules.
Translated sentence:
  He looked at the man who is singing with opera
glasses.
[Instruction by the user and the learning process]
The user corrects the analysis results.
Correction of dependency:
  The user changes the syntactic head of "opera glasses"
from "sing" to "look".
Translated sentence:
  He looked at a singing man with opera glasses.
Learning:
  PIVOT stores the correct analysis structure, with
dependency as the instruction item, in the association
database.

look
   \
   INS
     \
     opera glasses
[Applying process]
PIVOT translates another similar sentence.
Source sentence:
  (Translation)
Possible analysis structures:

(Analysis structure 1)      (Analysis structure 2)
        look                        look
       /    \                     /  |   \
     AGT    OBJ                AGT  INS  OBJ
     /        \                /     |     \
    I        woman            I   opera   woman
               |                  glasses  |
              OBJ                         OBJ
               |                           |
            laugh                       laugh
               |
              INS
               |
        opera glasses

Database retrieval:
  PIVOT retrieves information in the association
database, because there exist two possible analysis
structures.

look
   \
   INS
     \
     opera glasses

Matching:
  PIVOT succeeds in matching, and selects analysis
structure 2.
Translated sentence:
  I looked at a laughing woman with opera glasses.
3.3 Matching Methods
The learning mechanism decreases the number of the
user's instructions. The problem is to find an
effective matching method for the learning mechanism.
We made experiments on four types of matching
methods and compared the efficiency of each method.
The matching methods are:
(1) Restricted exact matching
(2) Non-restricted exact matching
(3) Restricted best matching
(4) Non-restricted best matching
Restricted exact matching is a well-known method
and is used in many fields. There is no previous
study of non-restricted exact matching. Restricted
best matching is a comparatively new method; the experiment
by Miura [4] is the first. There is no previous study of non-
restricted best matching.
3.3.1 Restricted Matching and Non-restricted Matching
In restricted matching, the item in the applying process
has to be the same as the instruction item in
learning. When the items are different, PIVOT will not
use the learned data. For example, if the instruction item
in learning is case, PIVOT will use the learned
correct analysis structure only for case selection. It
will not use the data for selection of dependency or of the
translation equivalent of each word.
In non-restricted matching, the item in the applying
process need not be the same as the instruction item
in learning. For example, if the instruction item in
learning is case, PIVOT will use this learned data for
selection of dependency and of the translation equivalent of
each word as well.
The difference between the actions of restricted
matching and non-restricted matching is described below.
Consider a sentence with two possible analysis struc-
tures.
(Analysis structure 1)      (Analysis structure 2)
        word5                      word5
       /  |  \                     /   \
  CASE1 CASE3 CASE4            CASE1  CASE4
     /    |    \                 /      \
 word1 word3 word4            word1    word4
    |                                  /   \
  CASE2                            CASE5  CASE6
    |                                /      \
 word2                            word2   word3
Assume the following analysis structure is already 
learned by correcting case. 
word4
  /
CASE5
  /
word2
Using restricted matching, the system selects struc-
ture 1 with its usual analysis procedure; in this case,
data learned by case correction cannot be used in the
selection of dependency. Using non-restricted matching,
the system selects structure 2, because the learned
pattern matches part of structure 2.
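The restriction can be reduced to a single predicate. The sketch below is illustrative, assuming each learned unit carries the instruction item it came from; the function name is not from PIVOT.

```python
# A learned unit is (case, head, dependent, item), where "item" is the
# kind of correction ("case" or "dependency") the unit was learned from.

def usable(learned_item, current_item, restricted):
    """Restricted matching only reuses data for the same instruction item;
    non-restricted matching reuses it regardless of the item."""
    return (not restricted) or learned_item == current_item

# a unit learned from a *case* correction, e.g. (CASE5, word4, word2)
learned = ("CASE5", "word4", "word2", "case")

# trying to resolve a *dependency* ambiguity:
print(usable(learned[3], "dependency", restricted=True))   # False
print(usable(learned[3], "dependency", restricted=False))  # True
```

This is exactly why, in the example above, only non-restricted matching can apply the case-learned pattern to dependency selection.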
3.3.2 Exact Matching and Best Matching
Exact matching makes matching only once, while best
matching makes matching several times. Best matching is
also called associative reasoning.
The difference in actions between the two methods is
illustrated below.
word2 (head)
  /
CASE1        Let (C1,W2,W1) stand for the learned
  /          structure as shown on the left.
word1 (dependent)
Suppose that the following data is accumulated in
the association database through dependency instruc-
tions.
(C4,W3,W7)
(C3,W3,W2)
(C3,W5,W7)
(C1,W2,W6)
(C1,W3,W1)
(C1,W5,W1)
(C2,W3,W6)
Exact matching:
[Assumption]
There are two possible syntactic heads, W7 and W3,
for W2.
[Action]
The association database is searched for the patterns
(*,W7,W2) and (*,W3,W2). (*: don't care)

Database     Search pattern   Matching
(C4,W3,W7)
(C3,W3,W2)   (*,W3,W2)        (C3==*, W3==W3, W2==W2) Success
(C3,W5,W7)
(C1,W2,W6)
(C1,W3,W1)
(C1,W5,W1)
(C2,W3,W6)

(C3,W3,W2) is selected as the correct answer.
Best matching:
[Assumption]
There are two possible syntactic heads, W7 and W5,
for W2.
[Action]
First, the association database is searched for the
patterns (*,W7,W2) and (*,W5,W2). (*: don't care)

Database     Search pattern   Matching
(C4,W3,W7)
(C3,W3,W2)   (*,W7,W2)        (C3==*, W3!=W7, W2==W2) Fail
             (*,W5,W2)        (C3==*, W3!=W5, W2==W2) Fail
(C3,W5,W7)
(C1,W2,W6)
(C1,W3,W1)
(C1,W5,W1)
(C2,W3,W6)
In this case, there is no data that exactly matches
the search patterns. However, there is data (C3,W3,W2)
that matches on the syntactic dependent. The system
retrieves more information from the database so as to
decide which of W5 and W7 is more similar to W3.
Searching the database for the patterns (*,*,W3) and (*,W3,*),
the following data is obtained.
(C4,W3,W7)
(C3,W3,W2)   Let this set of data be called
(C1,W3,W1)   "database(W3)."
(C2,W3,W6)
Searching the database for the patterns (*,*,W7) and (*,W7,*),
the following data is obtained.
(C4,W3,W7)   Let this set of data be called
(C3,W5,W7)   "database(W7)."
Searching the database for the patterns (*,*,W5) and (*,W5,*),
the following data is obtained.
(C3,W5,W7)   Let this set of data be called
(C1,W5,W1)   "database(W5)."
On the assumption that W3 is the same as W7, the
system performs exact matching between database(W3) and
database(W7). In the following, [W3] is regarded as W7.

Database(W3)   Database(W7)   Matching
(C4,[W3],W7)   (C4,W3,W7)     Fail, because [W3]==W7!=W3
(C3,[W3],W2)                  Fail
(C1,[W3],W1)                  Fail
(C2,[W3],W6)                  Fail
               (C3,W5,W7)
On the assumption that W3 is the same as W5, the
system performs exact matching between database(W3) and
database(W5). In the following, [W3] is regarded as W5.

Database(W3)   Database(W5)   Matching
(C4,[W3],W7)   (C3,W5,W7)     (C4!=C3, [W3]==W5, W7==W7) Fail
(C3,[W3],W2)                  Fail
(C1,[W3],W1)   (C1,W5,W1)     (C1==C1, [W3]==W5, W1==W1) Success
(C2,[W3],W6)                  Fail

Because the number of matches between database(W3)
and database(W5) is larger than that between data-
base(W3) and database(W7), W5 is considered to be more
similar to W3 than W7. W5 is selected as the head.
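The similarity computation in this worked example can be sketched in Python. The names are illustrative; the database is the seven-entry set above in (case, head, dependent) form.

```python
# Best-matching similarity: to decide whether W5 or W7 behaves more like
# W3, count exact matches between their database neighbourhoods.

DB = [("C4", "W3", "W7"), ("C3", "W3", "W2"), ("C3", "W5", "W7"),
      ("C1", "W2", "W6"), ("C1", "W3", "W1"), ("C1", "W5", "W1"),
      ("C2", "W3", "W6")]

def neighbourhood(word):
    # all entries in which the word appears as head or dependent
    return [(c, h, d) for (c, h, d) in DB if h == word or d == word]

def similarity(w, candidate):
    # assume w == candidate, then count exact matches of the substituted
    # triplets against the candidate's neighbourhood
    subst = [(c, candidate if h == w else h, candidate if d == w else d)
             for (c, h, d) in neighbourhood(w)]
    cand = neighbourhood(candidate)
    return sum(1 for t in subst if t in cand)

print(similarity("W3", "W7"))  # 0
print(similarity("W3", "W5"))  # 1
```

Since database(W3) matches database(W5) once but database(W7) not at all, W5 wins, reproducing the selection above.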
3.3.3 Matching Algorithm
Let PDBi(PCi,PHi,PDi,PTi) (1<=i<=n) be a possible
analysis structure, where
  PCi: Case, PHi: Head, PDi: Dependent, PTi: Item.
PDB is called the "possible analysis structures database".
Let ADBk(ACk,AHk,ADk,ATk) (1<=k<=m) be an associ-
ation database entry, where
  ACk: Case, AHk: Head, ADk: Dependent, ATk: Item.
ADB is called the "association database".
The matching algorithm for dependency selection is shown
below. For ease of understanding, all PDi's in PDB are
supposed to be the same, and most of the PCi's in PDB are
supposed to be "don't care".
First Step:
Extract all ADBk's such that PDi==ADk (1<=i<=n, 1<=k<=m)
from ADB and create SADBj(SCj,SHj,SDj,STj) (1<=j<=p),
where
  SCj: Case, SHj: Head, SDj: Dependent, STj: Item.
SADB is a subset of ADB.
If nothing is in SADB, stop the search and return fail.
Second Step:
(1) Restricted exact matching
Let WORK be an empty database.
for i=1 to n
  for j=1 to p
    if (SCj==PCi & SHj==PHi & STj==PTi)
      then add PDBi to WORK;
    endif
  end
end
return WORK;
(2) Non-restricted exact matching
Let WORK be an empty database.
for i=1 to n
  for j=1 to p
    if (SCj==PCi & SHj==PHi)
      then add PDBi to WORK;
    endif
  end
end
return WORK;
(3) Restricted best matching
Let WORK1, WORK2 be empty databases.
cnt=0;
for i=1 to n
  for j=1 to p
    if (SCj==PCi & SHj==PHi & STj==PTi)
      then add PDBi to WORK1;
    else if (SCj==PCi & SHj!=PHi & STj==PTi &
             WORK1==NULL)
      then
        /* Calculate the similarity between
           SHj and PHi. */
        extract all ADBk's such that
          AHk==SHj or ADk==SHj (1<=k<=m)
          and create database X;
        extract all ADBk's such that
          AHk==PHi or ADk==PHi (1<=k<=m)
          and create database Y;
        assume SHj==PHi and perform restricted
          exact matching between X and Y;
        let cnt1 be the number of matched
          entries between X and Y;
        if (cnt1>0 & cnt1==cnt)
          then add PDBi to WORK2;
        /* cnt is the largest number of matches
           made between X and Y, showing the
           degree of similarity between them. */
        else if (cnt1>cnt)
          then
            cnt=cnt1;
            clear WORK2;
            add PDBi to WORK2;
        endif
    endif
  end
end
if (WORK1 != NULL)
  then return WORK1;
  else return WORK2;
endif
(4)Non-restricted best matching 
The algorithm is the same as (3) except that non- 
restricted exact matching is performed between X and Y 
instead of restricted exact matching. 
In the above, if more than one entry is in WORK
or WORK1, the system will select the one most
recently stored by the user's instruction. If WORK2 has
more than one entry, one entry will be selected by
further application of the rules.
The matching algorithm for case selection is similar to
that for dependency selection.
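As a consistency check, the Second Step's restricted exact matching might be rendered in Python as follows; the tuple layout (case, head, dependent, item) is an assumption carried over from the pseudocode.

```python
# Restricted exact matching (Second Step, case (1)): keep each possible
# analysis triplet PDBi that agrees with some learned unit SADBj on
# case, head, and instruction item.

def restricted_exact(pdb, sadb):
    """pdb, sadb: lists of (case, head, dependent, item) tuples."""
    work = []
    for (pc, ph, pd, pt) in pdb:
        for (sc, sh, sd, st) in sadb:
            # restricted: the instruction item must also agree
            if sc == pc and sh == ph and st == pt:
                work.append((pc, ph, pd, pt))
    return work

pdb = [("C3", "W7", "W2", "dep"), ("C3", "W3", "W2", "dep")]
sadb = [("C3", "W3", "W2", "dep"), ("C1", "W3", "W1", "dep")]
print(restricted_exact(pdb, sadb))
# [('C3', 'W3', 'W2', 'dep')]
```

Dropping the `st == pt` conjunct yields non-restricted exact matching, exactly the difference between cases (1) and (2) of the pseudocode.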
4. Experiments 
Experiments have been made by simulation to evaluate the effect
of the learning mechanism described in Section 3.
In the experiments, the instruction items were
limited to case and dependency.
A total of 1565 sentences were collected from six
kinds of technical manuals. These sentences were trans-
lated with PIVOT/JE. Using the analysis editing func-
tion stated previously, mistakes in dependencies and
cases were corrected.
After all errors in the analysis results of the
whole text were corrected, the correction information for
case and dependency was extracted and put into a file.
A tool which simulates the learning mechanism was prepared.
After reading the file which stores the correction
information, it counts the number of corrections to be
made in each of the following cases: no application of
the learned data, application with restricted exact
matching, with restricted best matching, with
non-restricted exact matching, and with
non-restricted best matching.
The results are shown in the table and the graph
below. Each value is the sum of the estimated number of
corrections and the estimated number of the recor-
rections needed to cancel secondary effects.
Table 1: Number of corrections

                               Text 1  Text 2  Text 3  Text 4  Text 5  Text 6
Number of Sentences               220     456     713     920    1138    1565
Without Learning                  112     220     345     372     447     760
Restricted Exact Matching          81     137     236     262     301     576
Restricted Best Matching           76     127     217     243     271     414
Non-restricted Exact Matching      78     131     232     251     289     524
Non-restricted Best Matching       77     123     218     238     266     380

Graph 1: Number of corrections plotted against the number of sentences for each method.
The results, in order of effectiveness, are:
1. non-restricted best matching
2. restricted best matching
3. non-restricted exact matching
4. restricted exact matching
5. without learning
Non-restricted best matching is the most effective
among the five methods.
5. Conclusion 
This paper discussed a learning mechanism for
dependency and case corrected by the user. The learned
data is accumulated in an association database. Four
types of matching methods used in the applying
process were examined. The simulation shows that non-
restricted best matching is the most effective among
the four types.
The learning mechanism discussed above is also
effective for selection of a translation equivalent.
This mechanism will be incorporated in PIVOT, taking
over the current learning mechanism for selection of
translation equivalents.

References

1. Nagao, M.: "A Framework of a Mechanical Translation
   between Japanese and English by Analogy Principle",
   in Artificial and Human Intelligence (Elithorn &
   Banerji, Eds.), Elsevier Science Publishers, pp. 173-
   180, 1984.

2. Shirai, K., Hayashi, Y., Hirata, Y., and Kubota, J.:
   "Database Formulation and Learning Procedure for
   Kakari-Uke Dependency Analysis", Transactions of
   IPSJ, Vol. 20, No. 4 (in Japanese).

3. Stanfill, C. and Waltz, D.: "Toward Memory-Based
   Reasoning", CACM, Vol. 29, No. 12, pp. 1213-1228, 1986.

4. Miura, K., Itahashi, S., and Nishino, H.: "Japanese
   Text Analysis System with Valency Frame", WGNL 63-4,
   IPSJ, 1987 (in Japanese).

5. Inagaki, H., Kaboya, K., and Ohashi, F.: "Modification
   Analysis using Semantic Pattern", WGNL 67-5, IPSJ,
   1988 (in Japanese).

6. Sato, S.: "Memory-based Translation", WGAI 70-3,
   IPSJ, 1990 (in Japanese).

7. Sumita, E., and Iida, H.: "Experiments and Prospects
   of Example-Based Machine Translation", WGNL 82-5,
   IPSJ, 1991.
