DECOMPOSITION OF JAPANESE SENTENCES INTO NORMAL FORMS 
BASED ON HUMAN LINGUISTIC PROCESS 
Tsutomu Endo Tuneo Tamati 
Department of Information Science 
and Systems Engineering 
Oita University 
Oita 870-11, Japan 
Department of Information Systems 
Graduate School of Engineering Sciences 
Kyushu University 
Fukuoka 812, Japan 
The diversity and flexibility of language
expression forms pose awkward problems for the
machine processing of language, such as translation,
indexing and question-answering. This
paper presents a method of decomposing Japanese 
sentences appearing in the Patent Documents on 
"Pulse network", into normal forms. First, the 
linguistic information is analysed and classi- 
fied based on the human linguistic process. 
Then, predicate functions, phrase functions and 
operators are introduced as the normal forms. 
Finally, the decomposing procedure and some 
experimental results are shown. 
Introduction 
One of the most remarkable features of
natural language is the diversity and flexibility
of its expression forms. Japanese in particular
appears to have a peculiar syntactic
structure because it is an agglutinative
language. This is an awkward problem for the
machine processing of language, such as translation,
subject indexing and question-answering.
One approach to this problem is to
transform the sentences into normal forms,
if such forms exist. Proposals for such normalization have
been made for some time, but there have been
few actual attempts.1,2
The normal form must retain all the information
contained in the original sentences.
Let us now consider what information the 
sentences contain. In human linguistic process, 
the objects to be expressed are provided first,
then the cognitive structure corresponding to 
them is formed, and lastly the language expres- 
sion based on the cognitive structure is 
produced. In other words, the immediate basis 
of language expression is considered to be 
human cognitive structure. Therefore, the 
arrangement of words in sentences represents 
not only the relation among objects in the 
external world, but also the cognitions and the 
relations among them, which are relatively 
independent of the present objects. 
This paper presents a method of decomposing 
Japanese sentences into normal forms based on 
such human linguistic process. First of all, 
the linguistic information necessary for decom- 
posing process is analysed and classified from 
the above mentioned point of view. Then, 
predicate functions, phrase functions and 
operators are introduced as the normal forms. 
Two kinds of function describe the syntactic 
structure of the sentences and phrases. The 
operator describes the relationship among func- 
tions. Finally, the decomposing procedure and 
some experimental results are shown. Sample 
sentences are selected from the claim points of 
the Japanese Patent Documents on "pulse network". 
Analysis of linguistic information 
In this section, we analyse and classify 
the linguistic information necessary for decom- 
posing Japanese sentences into their normal 
forms. 
Classification of words 
From the standpoint of the linguistic process,
that is, objects, cognitions and expressions,
all words are divided into objective expressions
W1 and subjective expressions W2. W1 is the set
of expressions which reflect external objects,
namely, conceptual expressions. On the other
hand, W2 is the set of cognitive expressions
formed without the conceptual process; it directly
represents affection, judgement, desire,
will and so on. The details of the classification
of words are summarized in Table 1. We give
supplementary explanations about Table 1.
Adjective II consists of the words which are called the stem
of the adjectival verb in traditional Japanese
grammar. For inflectional words such as AAn,
Vn, TBn and JJn, we specify n as 1, 2, 3, 4,
and 5(6) according to the inflectional form, that
is, the negative, declinable word modifying, final,
noun modifying, and conditional(imperative)
form respectively.
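The inflection index n can be read as a small lookup table. The following is a minimal sketch, assuming that a category code is written as a symbol followed by a single digit (for example AA3); the function name describe_category is hypothetical and is not part of the system described here.

```python
# Inflectional forms indexed by n, as listed in the text above.
INFLECTIONAL_FORMS = {
    1: "negative",
    2: "declinable word modifying",
    3: "final",
    4: "noun modifying",
    5: "conditional",
    6: "imperative",
}


def describe_category(code: str) -> str:
    """Split a code such as 'AA3' into its symbol and inflectional form.

    Assumption: the last character of the code is the index n.
    """
    symbol, n = code[:-1], int(code[-1])
    return f"{symbol}: {INFLECTIONAL_FORMS[n]} form"


print(describe_category("AA3"))
print(describe_category("TB2"))
```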
Analysis of cognitive structure 
In order to describe the content of words and
the relations among words, we introduce the
descriptive scheme M, which consists of the
following five descriptors:
M = < O, Σ, U, Ψ, Λ >.
(1) O = {s~1(substance), a~2(attribute), r~3
(relation)}. O is the cognitive unit formed
by ideally separating and abstracting the external
objects, and is classified into three
large categories, namely, substances, attributes
and relations. The symbol ~i specifies the
variety and the abstracting level of each unit.
Thus, O is regarded as the classification of
concepts in the objective world (e.g., pulse
network).
(2) Σ = {σ1, σ2, σ3}. Σ describes the relationship
between objects from various viewpoints.
σ1 is the relationship between substance
and attribute, σ2 is the relationship
between substance and relation, and σ3 is the
various connection of the same kind of objects.
(3) U represents the active cognitions which
are relatively independent of concepts.
(4) Ψ specifies the cognitive behaviors, that is, how the
speaker cognizes the objects.
(5) Λ = {λ1(tense), λ2(anaphora)}. Λ represents
the relation between a speaker and objects.
Parts of O, Σ, U and Ψ are tabulated in
Tables 2-5 respectively.
Definition of predicate function 
In this section and the following two sections,
we define the normal forms of Japanese sentences.
Generally, a sentence expresses the property of 
an object, or the relationship among objects. 
The component which indicates such property or 
relationship, is the predicate of a sentence. 
So we introduce the function, the constants of 
which are the predicate and the case postposi- 
tions, and the variables of which are noun 
phrases just in front of case postpositions. 
This function is called predicate function and 
is expressed by 
X1 a1 X2 a2 ... Xi ai ... Xn an P
where Xi, ai and P indicate the noun phrase, the
case postposition and the predicate respectively.
[Example]
1. (SOOTI) GA (ZIZOKUHA) WO (PULSE) NI KAERU.
   X1      a1 X2         a2 X3      a3 P
(A device converts continuous wave into
pulse train.)
2. (DENATU) WO (TEIKOOKI) NI KUWAETA.
(Someone applied voltage across a resistor.) 
3. (DENRYOKU HENKA) GA TIISAI. 
(The variation in power is small.) 
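The predicate function can be pictured as a template whose constants are the predicate and the case postpositions and whose variables are the noun phrases. The following is a minimal sketch under that reading; the class name PredicateFunction and its method apply are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredicateFunction:
    """X1 a1 X2 a2 ... Xn an P with the constants fixed."""
    predicate: str        # constant P
    postpositions: tuple  # constants a1 ... an

    def apply(self, *noun_phrases: str) -> str:
        """Substitute the noun phrases X1 ... Xn for the variables."""
        if len(noun_phrases) != len(self.postpositions):
            raise ValueError("one noun phrase is needed per case postposition")
        parts = [f"({x}) {a}" for x, a in zip(noun_phrases, self.postpositions)]
        return " ".join(parts + [self.predicate])


# Example 1 above: the constants are GA, WO, NI and KAERU.
kaeru = PredicateFunction("KAERU", ("GA", "WO", "NI"))
sentence = kaeru.apply("SOOTI", "ZIZOKUHA", "PULSE")
print(sentence)
```

Keeping the postpositions inside the function, rather than with the noun phrases, mirrors the definition above, where only the noun phrases vary.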
Table 1 Classification of words
(Objective expressions W1 and subjective expressions W2)
Category                                     Symbol  Example
Noun (N)         Common noun                 NA      transistor
                 Attribute noun (dynamic)    NBA     HASSIN(oscillation)
                 Attribute noun              NBB     SEI(plus)
                 Abstract noun               NC      MONO(thing)
                 Pronoun                     ND      KORE(this)
                 Numeral                     NE      ICHI(one)
Adjective (A)    Adjective I                 AAn     OOKII(large)
                 Adjective II                AB      KYUUGEKI(rapid)
Verb (Vn)        Common verb                 VAn     KUWAERU(add)
                 Special verb                VBn     HASSIN-SURU(oscillate)
                 Abstract verb               VCn     SURU(do)
Uninflected noun modifier                    RR      ARU(certain)
Attribute adverb Attribute adverb I          DA      SIDAINI(gradually)
                 Attribute adverb II         DB      KIWAMETE(very)
Prefix                                       HH      HU(non-)
Suffix           Nominal suffix              TA      KA(-ize)
                 Verbal suffix               TBn     SASERU(make)
Special symbols                              EX      (, )
Compound word                                RE      NIMOKAKAWARAZU(in spite of)
Auxiliary verb                               JJn     YOO(will)
Postposition (X) Case postposition           XA      GA, NO, NI
                 Dependent postposition      XB      WA, MO
                 Adverbial postposition      XC      DAKE(only), NADO
                 Conjunctive postposition    XD      TO, TEMO
Conjunction                                  CC      OYOBI(and)
Assertive adverb                             YY      MOSI(if)
Punctuation      Comma                       ZA
points           Period                      ZB
Table 2 List of descriptor O
Symbol  Descriptor               Example
s       Substance
s1      Material substance       MONO(thing)
s11     Body(or Object)          TAI(body)
s111    Functional body          BUKA(load)
s1111   Circuit element          transistor, diode, condenser
s1112   Circuit device           HASSINKI(oscillator)
s1113   Device or System         KEISANKI(computer)
s112    Material                 HANDOOTAI(semiconductor)
s113    Particle                 DENSI(electron)
s114    Abstract group           TUI(pair), GUN(group)
s12     Phenomenon
s121    Wavy phenomenon          pulse, SEIGENHA(sine wave)
s122    Electric phenomenon      DENATU(voltage), DENRYOKU(power)
s123    Functional phenomenon    SINGOO(signal)
s124    Other phenomenon         ZATUON(noise)
s125    Attribute phenomenon     drift
s2      Ideal substance          KOTO(event)
s21     View points
s211    Characteristic           Zener TOKUSEI(characteristic)
s212    Type                     KATA(type), pattern
s213    Quantity                 RYOO(quantity)
s2131   Impedance                TEIKOO(resistance)
s2132   Wavy quantity            SYUUHASUU(frequency)
s214    State                    ZYOOTAI(state)
s215    Property                 SINRAISEI(reliability)
s216    Shape                    KUKEI(rectangle)
s217    Degree                   HODO(degree), level
s218    Value                    ATAI, TI(value)
s219    Content                  NAIYOO(content)
s22     Number                   0, 1, 2, 3, 4
s23     Method                   HOOHOO(way), SYUDAN(method)
s3      Space
s31     Direction                ZYUN-HOOKOO(forward direction)
s32     Place                    TEN(point)
s321    Terminal                 collector, emitter
s33     Pass                     pass, loop
s34     Boundary                 limit
s35     Part of substance        HEN(side), peak
s36     Scope                    HANI(scope)
s4      Time
s41     Temporal point           TOKI(time)
s42     Time interval            KIKAN(time period)
a       Attribute
a1      Dynamic attribute
a11     Operation                SURU(do), SASERU(make)
a111    Concrete operator        ASSYUKU(compress), control
a12     Event
a121    Change                   KAWARU(change)
a1211   Process of change        ZOOKA(increase), KOOKA(drop)
a1212   Effect of change         HOOWA(saturate), KOSYOO(trouble)
a122    Remove                   NAGARERU(flow)
a1221   Place of removal         TOORU(pass)
a123    Input/Output
a1231   Input                    KUWAERU(add), ATAERU(give)
a1232   Output                   HASSEI(generate)
a124    Continuation             ZIZOKU(continue)
a125    Movement                 DOOSA(work)
a1251   Concrete movement        KYOOSIN(resonance)
a126    Display                  HYOOZI(display, show)
a127    Start and stop           KAISI(start)
a13     Generation of relation
a131    Connection               SETUZOKU(connect)
a132    Switching                HIRAKU(open), switch
a133    Composition              KUMIAWASERU(combine)
a134    Addition                 TUKERU(attach)
a135    Separation               BUNRI(separate)
a14     Action
a141    Usage                    SIYOO, TUKAU(use)
a142    Judgement                HANBETU(discriminate)
a143    Determination            SADAMERU(settle), KIMERU(decide)
a15     Passive                  RARERU, HI
a2      Static attribute
a21     Possibility              DEKIRU(be able to)
a22     Difference of quality    OOKII(large), YOI(good)
a23     Property                 SENKEI(linear), digital
a24     Relational attribute
a241    Positional relation      TYOKURETU(series)
a242    Difference               HITOSII(equal), TIGAU(differ)
a243    Conformity               MUKU(fit)
a244    Dependency               DOKURITU(independent)
a245    Possession               MOTU(have, own)
a246    Opposition               HANTAI(opposite)
a247    Necessity                HITUYOO(necessary)
a248    Temporal relation        TUNE(always, usually)
a249    Other relation           SOOGOTEKI(complementary)
a25     Existence                ARU(exist), NAI(empty)
a26     Comparison of degree     KIWAMETE(very, much)
a27     Circumstances            ANZEN(safety), HEIKOO(balance)
a28     Rank                     SYU(main), KIZYUN(standard)
a29     Abstract situation       SIDAINI(gradually)
r       Relation
r1      Position
r11     Front and rear           MAE(in front of, before)
r12     Midway                   AIDA, KAN(between)
r2      Reference
r21     Complement               HOKA(other)
r3      Causal relation
r4      Correspondence
r5      Whole and part
r51     Part                     BUBUN(part)
r6      Arithmetic relation      HI(ratio), HANBUN(half)
However, a predicate P has a variety of expression
forms in Japanese. For example, a
verb is frequently connected with auxiliary
verbs (e.g., NAI(negative), TA(past))
or verbal suffixes (e.g., RARERU(passive),
SASERU(causative)). Therefore, we decompose
the predicate P into the objective
expression Po and the subjective expression Ps.
Then, we define the basic predicate function
as the function which consists of one of the
following four kinds of predicate PoPs.
(1) Po(Final form of verb) Ps(Zero element
of speaker's judgement),
(2) Po(Final form of adjective I) Ps(Zero
element of speaker's judgement),
(3) Po(Adjective II) Ps(Judgement
expression "DA(be)"),
(4) Po(Noun) Ps(Judgement expression
"DA(be)").
The application of the operators presented in the
next section inflects the form of Po or Ps.
Other predicate functions are defined by the
application of operators to basic predicate
functions. Thus, the predicate functions
are classified as follows.
Predicate function
  Constant function (idiomatic expression)
  Basic predicate function
  Derivative function
The predicate generally represents some 
attribute concept. Unlike substances an 
attribute does not occur alone. It arises 
accompanying substances. When we cognize an 
attribute as the concept, there exist some sub- 
stances which accompany this attribute. The 
variables corresponding to these substances are 
called obligatory variables of the predicate, 
and the case postpositions, obligatory ones aei. 
On the other hand, one substance usually accom- 
panies various kinds of attribute, and is 
related to other substances as a mediation of 
this attribute. In the predicate function, the 
variables corresponding to such attributes and 
substances are called facultative variables, and 
the case postpositions, facultative ones aoi. 
The variables of a predicate function have 
some domains of their own, that is to say, sub- 
stitutable word classes. So we specify the 
domain of variables in terms of the descriptor O. 
Also, the relationship between the predicate and
each variable is given by the descriptor Σ.
These are summarized in Table 6. 
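The domain check on a variable can be sketched as follows. This is an illustration only: it assumes that the concept symbols of descriptor O are hierarchical, so that a concept belongs to a domain when its symbol begins with the domain symbol (for example, s1111 lies under s111). The function name in_domain is hypothetical.

```python
def in_domain(concept: str, domains: tuple) -> bool:
    """Return True if the concept falls under any of the domain symbols.

    Assumption: a longer symbol such as 's1111' denotes a sub-concept of
    every symbol that is a prefix of it ('s111', 's11', 's1', 's').
    """
    return any(concept.startswith(d) for d in domains)


# Domain of the WO variable of SETUZOKU-SURU (connect): s111, s32.
wo_domain = ("s111", "s32")

print(in_domain("s1111", wo_domain))  # circuit element, e.g. transistor
print(in_domain("s122", wo_domain))   # electric phenomenon, e.g. current
```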
Definition of operators 
The operator produces a new function from
one or two functions. Operators are classified into
six groups, that is, modal(FI), nominalization
(FII), embedding(FIII), connecting(FIV),
elliptical(FV) and anaphoric(FVI) operators.
Modal operator 
The modal operators consist of the objective
expressions FI1 (e.g., abstract verb, verbal
suffix, a part of prefix) and the subjective
expressions FI2 (e.g., auxiliary verb, adverbial
postposition). FI1 applies to Po of the predicate,
and varies the mode of the attribute
which is expressed by the function. On the
other hand, FI2 applies to Ps, and varies
the mode of the judgement. Examples of
FI1 and FI2 are shown in Tables 7 and 8
respectively.

Table 3 List of descriptor Σ
Symbol  Descriptor / Example
σ1.1    Simple connection between substance and attribute
        (IMPEDANCE) GA TAKAI (The impedance is high.)
σ1.2    Object of action or operation
        (SINGOO) WO ZOOHUKU-SURU (A thing amplifies signal.)
σ1.3    Starting point in action
        (TRANSISTOR) KARA NARU (A thing consists of transistor.)
σ1.4    Finishing point in action
        (SAIDAITI) NI TASSURU (A thing amounts to maximum value.)
σ1.5    Opponent in mutual action
        (DIODE) TO KETUGOO-SURU (A thing connects with diode.)
σ1.6    Standard or reference
        (ZATUON) NI YORU (A thing depends on noise.)
σ1.7    Way or means
        (PULSE) DE HANTEN-SURU (A thing turns with pulse.)
σ1.8    Spatial positioning
        (COLLECTOR KAN) NI ARU (A thing is between collectors.)
σ1.9    Temporal positioning
        (TOKI) NI NAGARERU (A thing flows when ....)
σ1.10   Others
        (YOO) NI KETUGOO-SURU (A person connects a thing so as to ....)
σ2      Substance with the relation
        AIDA[r12*σ2*s32, r12*σ2*s42] (space or time between ...)
σ3.1    Order of substance
        DAI (2) (TRANSISTOR) (second transistor)
σ3.2    Number of substance
        (2) KO NO (TRANSISTOR) (two transistors)
σ3.3    Property of substance
        (PULSE) NO SYUUHASUU (frequency of pulse train)
σ3.4    Material of substance
        (HANDOOTAI) (DIODE) (semiconductor diode)
σ3.5    Unit of substance
        (400 NM) NO (HATYOO) (wavelength of 400 nm.)
σ3.6    Various connection among attributes
        (HEIRETU) (SETUZOKU) (parallel connection)

Table 4 List of descriptor U
Symbol  Descriptor                        Example of subjective expressions
u1      Affirmative judgement             DA, ARU (be, do)
u2      Negative judgement                NAI, NU (not)
u3      Special judgement                 WA, MO
u4      Universal judgement               WA
u5      Purpose or aim
u6      Will                              U, YOO (will, shall)
u7      Assumption                        BA, MOSI (if)
u8      Certification                     TA
u9      Inference                         U, TABUN (probably)
u10     Desire
u11     Natural judgement                 BEKI (should, have to)
u12     Instance                          NADO (... and so on)
u13     Limitation toward ideal premises  DAKE (only)
u14     Excess                            MADE (even)

Table 5 List of descriptor Ψ
Symbol  Cognitive behavior
ψ0      Cognizing object O faithfully
ψ1      Cognizing attribute a as substance
ψ2      Cognizing static attribute a dynamically
ψ3      Cognizing dynamic attribute a statically
ψ4      Cognizing causal relation of events backward
ψ5      Cognizing U as objective substance ideally
ψ6      Cognizing U as objective attribute ideally
ψ7      Cognizing one object concretely and abstractly
ψ8      Cognizing a degree of attribute as quantity
ψ9      Cognizing one object from the various view points
ψ10     Conjunctive enumeration
ψ11     Disjunctive enumeration
Nominalization operator 
The nominalization operators apply to one
predicate function and nominalize it in the
following way.
(1) fII1 : Cognizing one of the objects
expressed by the predicate function, as the
substance with attribute.
(DIODE) GA (HOODEN ZIKAN) WO HAYAMERU.
(A diode advances the time of discharge.)
→ (HOODEN ZIKAN) WO HAYAMERU DIODE
(A diode which advances the time of
discharge.)
(2) fII2 : Recognizing the concrete event
expressed by the predicate function, as
substance ideally.
(HAKEI) GA NAMARU
(The wave form is blunted.)
→ (HAKEI) GA NAMARU KOTO (or NO)
(that the wave form is blunted.)
(3) fII3 : Transforming the predicate
function into clauses which express the
time, reason, state, effect and so on.
(DENKA) WO KENSYUTU-SURU
(A thing detects the electric charge.)
→ (DENKA) WO KENSYUTU-SURU TOKI
(when a thing detects the electric
charge.)
→ (DENKA) WO KENSYUTU-SURU TAME
(in order to detect the electric
charge.)
→ (DENKA) WO KENSYUTU-SURU YOU
(so as to detect the electric charge.)
Table 6 Example of basic predicate function
Predicate P          Variable  Case postposition  Σ      Domain
ATAERU               X1        ae1  WO            σ1.2   s12
(give)               X2        ao1  NI            σ1.4   s32
KENSYUTU-SURU        X1        ao1  GA            σ1.1   s1112
(detect)             X2        ae1  WO            σ1.2   s12, s213
KOSU                 X1        ae1  GA            σ1.1   s122
(exceed)             X2        ae2  WO            σ1.8   s218
SETUZOKU-SURU        X1        ae1  WO            σ1.2   s111, s32
(connect)            X2        ae2  NI            σ1.4   s111, s32
                     X3        ao1  NI            σ3.6   a241, a27
                     X4        ao2  DE            σ1.7   s111
                     X5        ao3  DE            σ1.10  s31
TAMOTU               X1        ao1  GA            σ1.1   s1112
(keep)               X2        ae1  WO            σ1.2   s12
                     X3        ae2  NI            σ1.6   s3
DOOTUU-SURU          X1        ae1  GA            σ1.1   s1111
(conduct)            X2        ao1  DE            σ1.7   s12
ITTEI (constant)     X1        ae1  GA            σ1.1   s213, s216
OOKII (large)        X1        ae1  GA            σ1.1   s213
TUYOI                X1        ae1  GA            σ1.1   s1112
(strong)             X2        ao1  NI            σ1.6   s12
N.B. In sample sentences, a substance "human being" is not considered
explicitly, so the variable corresponding to it is omitted in
this Table.
Table 7 Example of modal operators FI1
Group  Symbol   Operator               Content  Usage
FI11   fI11^1   SURU (make)            a11      (IMPEDANCE) GA TAKAI
                                                (The impedance is high.)
                                                → (MONO) GA (IMPEDANCE) WO TAKAKU SURU
                                                (A thing makes the impedance high.)
       fI11^2   NARU (become)          a121     (IMPEDANCE) GA TAKAI
                                                → (IMPEDANCE) GA TAKAKU NARU
                                                (The impedance becomes high.)
FI12   fI12^1   RERU, RARERU           a21      (IMPEDANCE) WO TAKAMERU
                (be able to)                    (A thing increases the impedance.)
                                                → (IMPEDANCE) WO TAKAME RARERU
                                                (A thing is able to increase the impedance.)
       fI12^2   RERU, RARERU           a15      (DENATU) WO (TEIKOOKI) NI KUWAERU
                (passive)                       (A thing applies voltage across the resistor.)
                                                → (DENATU) GA (TEIKOOKI) NI KUWAE RARERU
                                                (Voltage is applied across the resistor.)
       fI12^3   SERU, SASERU,          a11      (DENATU) GA HENKA-SURU
                SIMERU (make)                   (The voltage varies.)
                                                → (SOOTI) GA (DENATU) WO HENKA SASERU
                                                (A device makes the voltage vary.)
                                                (SOOTI) GA (PULSE) WO HASSEI-SURU
                                                (A device generates the pulse train.)
                                                → (SOOTI) NI (PULSE) WO HASSEI SASERU
                                                (A thing makes a device generate the pulse
                                                train.)
FI14   fI14^1   KA (-able)             a21      SEIGYO KA (controllable), KA HOOWA (saturable)
       fI14^2   HI (-ed)               a15      HI SEIGYO (controlled), HI SOKUTEI (measured)
FI15   fI15^1   KA (-ize)              a121     KOGATA KA (making small), IC KA (integration)
       fI15^2   TEKI                   a23      ZIKAN TEKI (temporal), DENKI TEKI (electrical)
Table 8 Example of modal operators FI2
(φ denotes the zero element.)
Group  Symbol   Operator         Content  Usage
FI21   fI21^1   DA, ARU (be)     u1       (SWITCHING DOOSA) GA SEIKAKU DE ARU
                                          (The switching operation is correct.)
       fI21^2   NAI, NU (not)    u2       (KAIRO) GA (COIL) WO HUKUMA φ NAI φ.
                                          (A network does not contain a coil.)
       fI21^3   U, YOU (will)    u6       (OOKISA) WO HANTEI-SI φ YOU φ.
                                          (We will decide the size.)
       fI21^4   BESI             u11      HANTEN-SURU φ BEKI TRANSISTOR
                                          (a transistor to turn)
       fI21^5   TA               u8       (SOOTI) GA DOOSA-SI TE IRU
                                          (A device is working.)
       fI21^6   TA (past)        λ1       (TRANSISTOR) GA HANTEN-SI φ TA φ.
                                          (A transistor turned.)
FI22   fI22     HU               u2       HU ITTI (disagreement), HU KANZEN (imperfect)
FI23   fI23^1   WA               u3       (TRANSISTOR) GA (SINGOO) WO ZOOHUKU-SURU
                                          (The transistor amplifies the signal.)
                                          → (TRANSISTOR) WA (SINGOO) WO ZOOHUKU-SURU
                                          → (SINGOO) WA (TRANSISTOR) GA ZOOHUKU-SURU
                                 u4       (TRANSISTOR) WA NOODOO SOSI DA
                                          (A transistor is an active element.)
       fI23^2   MO               u3       (PULSE) WO HASSEI-SURU
                                          (A thing generates the pulse train.)
                                          → (PULSE) MO HASSEI-SURU
                                          (A thing generates the pulse train too.)
                                          → (PULSE NO HASSEI) MO SURU
                                          → (PULSE) WO HASSEI-SI MO SURU
FI24   fI24^1   NADO             u12      (SOSI) GA (HASSINKI) NADO NI TEKISURU
                                          (The element is suitable for the oscillator
                                          and other things.)
       fI24^2   DAKE             u13      (PULSE) DAKE WO HASSEI-SURU
                                          (A thing generates the pulse train only.)
                                          (PULSE) WO HASSEI-SURU DAKE DA
                                          (A thing generates only the pulse train.)
FI25   fI25^1   MOSI (if)        u7       MOSI (ZATUON) GA HASSEI-SURE BA,
                                          (If the noise is generated,)
       fI25^2   TATOE                     TATOE (SYUUKI) WO KAE TEMO, (SINPUKU) WA ITTEI DA
                                          (Even if we vary the period, the amplitude is
                                          constant.)
(NYUURYOKU SINGOO) WO HENTYOO-SURU
(A thing modulates the input signal.)
→ (NYUURYOKU SINGOO) WO HENTYOO-SI TA SINGOO
(the signal which is modulated by the
input signal.)
(4) fII4 : Cognizing only the attribute as substance.
(PULSE) WO HASSIN-SI WA SURU
(A thing generates the pulse train.)
(5) fII5 : Cognizing the event expressed by the
predicate function, as substance immediately.
(ONDO) GA HENKA-SURU
(A temperature changes.)
→ ONDO NO HENKA, or ONDO HENKA
(A change in temperature.)
The clause or noun phrase which is produced
by the application of the nominalization operator
is substituted for a variable of another predicate
function by the embedding operator fIII.
Connecting operator 
A connecting operator joins one predicate function
to another coordinately or subordinately.
Generally, it corresponds to conjunctions and
conjunctive postpositions. Some operators are
related to modal operators, attribute adverbs,
or the variety of the predicate. The connecting operators
are classified into the following six groups.
(1) Conjunctive connecting operator(fIV1)
S1 : (SYOOHI DENRYOKU) GA TIISAI
(The consumption power is small.)
S2 : (SWITCHING ZIKAN) GA MIZIKAI
(The switching time is short.)
→ S1*fIV1*S2
(SYOOHI DENRYOKU) GA TIISAKU, (SWITCHING ZIKAN)
GA MIZIKAI (The consumption power is small,
and the switching time is short.)
(2) Simultaneous conjunctive connecting
operator(fIV2)
S1 : (TRANSISTOR) WO KUDOO-SURU
(A thing drives the transistor.)
S2 : (HOOWADO) WO SEIGYO-SURU
(A thing controls the saturation rate.)
→ S1*fIV2*S2
(TRANSISTOR) WO KUDOO-SURU TO DOOZI NI
(HOOWADO) WO SEIGYO-SURU (The moment a thing
drives the transistor, it controls the
saturation rate.)
(3) Disjunctive connecting operator(fIV3)
S1 : (CONDENSER) WO SETUZOKU-SURU
(A person connects a capacitor.)
S2 : (COIL) WO IRERU
(A person inserts a coil.)
→ S1*fIV3*S2
(CONDENSER) WO SETUZOKU-SURU KA (COIL) WO
IRERU (A person connects a capacitor, or
inserts a coil.)
(4) Causal connecting operator(fIV4)
S1 : (DENRYUU) GA (SYOTEITI) WO KOSU
(The current exceeds the fixed value.)
S2 : (DENATU HENKA) GA SYOOZIRU
(The voltage changes.)
→ S1*fIV4*S2
(DENRYUU) GA (SYOTEITI) WO KOSU TO (DENATU
HENKA) GA SYOOZIRU (The voltage changes when
the current exceeds the fixed value.)
(5) Concessive connecting operator(fIV5)
S1 : (SYUUKI) WO KAERU
(A person changes the period.)
S2 : (SINPUKU) GA ITTEI-DA
(The amplitude is constant.)
→ S1*fIV5*S2
(SYUUKI) WO KAE TEMO (SINPUKU) GA ITTEI-DA
(Even if a person changes the period, the
amplitude is constant.)
(6) Modificatory operator(fIV6)
S1 : (TEIKOO) WO KAISURU
(Through the resistor)
S2 : (BASE) WO (DENGEN) NI SETUZOKU-SURU
(A person connects the base to the power
source.)
→ S1*fIV6*S2
(TEIKOO) WO KAISI TE (BASE) WO (DENGEN) NI
SETUZOKU-SURU (A person connects the base to
the power source through the resistor.)
Generally, more than one connecting operator
is applied in actual sentences. So we
define the universal connecting formula as
follows. Let fII and fIII be the nominalization
and the embedding operator respectively. An
arbitrary predicate function Ai is expressed by
Ai = Ai1*fIV1*Ai2*fIV1*...*fIV1*Aik*fIV1*...*fIV1*Aim
where Aik is
(1) Su,
(2) [Ai*fIVd*Aj] (d = 2,3,4,5,6).
Su is the basic predicate function, or the
derivative function which is produced by the
application of more than one modal operator, and
is called the unit predicate function. Moreover,
the embedding operator is sometimes applied to
Su in the following way.
Su(fIII-A~1, A~2, ..., A~i, ..., A~n)
where A~i = fII Ai.
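The universal connecting formula has a recursive shape, which can be sketched as a small tree structure. This is an illustration only: the class names Unit and Connected and the function render are hypothetical, and the embedding operator fIII is left out of the sketch.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Unit:
    """A unit predicate function Su, kept here as a plain label."""
    label: str


@dataclass(frozen=True)
class Connected:
    """Two predicate functions joined by the connecting operator fIVd."""
    left: "Formula"
    d: int  # 1 (conjunctive) to 6 (modificatory)
    right: "Formula"


Formula = Union[Unit, Connected]


def render(a: Formula) -> str:
    """Write a formula in the notation used above.

    fIV1 chains are written flat; fIV2 to fIV6 groups are bracketed.
    """
    if isinstance(a, Unit):
        return a.label
    inner = f"{render(a.left)}*fIV{a.d}*{render(a.right)}"
    return inner if a.d == 1 else f"[{inner}]"


s1, s2, s3 = Unit("S1"), Unit("S2"), Unit("S3")
formula = Connected(s1, 1, Connected(s2, 4, s3))
print(render(formula))
```

In this sketch the top level is a chain of fIV1 connections, and each bracketed group plays the role of one Aik of type (2).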
Other operators 
When one predicate function is produced by the
application of the connecting operator to two
functions, the elliptical operator omits one of the
expression forms shared by the two functions,
and the anaphoric operator replaces one of the
shared expression forms with a pronoun.
Definition of phrase function 
We introduce the phrase function in order 
to describe the structure of noun phrases or 
compound words. However, it is not easy to 
define the phrase function based on the word 
class, unlike the predicate function. So we 
classify the phrases according to their content, 
and define the phrase function based on this 
classification. An example of phrase function 
is listed in Table 9. 
G1 is the phrase connected in terms of such
relational concepts as position(r1), reference
(r2), and part(r5). G2 is the phrase formed by
cognitive behaviors(Ψ), such as enumeration(ψ10,
ψ11), cognition of one object from various
viewpoints(ψ9), concrete and abstract cognition
of one object(ψ7), and so on. G3 is the phrase
constructed in terms of the relationship(σ1)
between substance and attribute, and the various
connection(σ3) of the same kind of objects.
G4 covers the other phrases.
Decomposition process 
The new derivative functions can 
be produced by the application of the 
various operators to the basic predicate 
functions. This means that the sentences 
with complex syntactic structure correspond 
to one predicate function. Therefore, the 
normalization of sentences is the decom- 
position of the predicate function corre- 
sponding to these sentences, into a set of 
basic predicate functions, phrase functions
and operators. In this section, we
describe the decomposing procedure.4
Machine dictionary 
A machine dictionary consists of three
elementary dictionaries, that is, the word
dictionary(WD), the predicate function dictionary(PFD)
and the related concept dictionary(RCD).
WD is utilized to acquire the
basic linguistic information of each word
in input sentences. A PFD entry is given to each
candidate word for predicate, such as a verb or an
adjective, and is used to extract
the predicate function from sentences and
phrases. RCD stores the relations
between concepts, and is used not only for
the decision of the embedded phrase but also for
the analysis of phrases. Table 10 shows an
example of each dictionary.
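The role of WD and of its pointer into PFD can be sketched with ordinary dictionaries. This is a minimal sketch, not the actual dictionary format: the field names, the function names build_wlist and predicate_candidates, and the rule that predicate candidates are recognised by the first letter of the category are all assumptions made for illustration. The entries follow a reading of Table 10.

```python
# Word dictionary (WD): entry word -> basic linguistic information.
WD = {
    "DENRYUU": {"category": "NA", "code": 376, "concept": "s122", "pointer": None},
    "GA": {"category": "XA", "code": 1, "concept": None, "pointer": None},
    "HENKA": {"category": "VB", "code": 1025, "concept": "a121", "pointer": 2},
}

# Predicate function dictionary (PFD), abridged to the predicate string.
PFD = {2: "HENKA-SURU"}


def build_wlist(sentence: str) -> list:
    """Match each space-separated word with WD and collect the entries."""
    wlist = []
    for word in sentence.split():
        entry = WD.get(word)
        if entry is None:
            raise KeyError(f"{word!r} is not in the word dictionary")
        wlist.append({"word": word, **entry})
    return wlist


def predicate_candidates(wlist: list) -> list:
    """Assumed rule: verbs (V..) and adjectives (A..) are candidates."""
    return [w for w in wlist if w["category"][0] in ("V", "A")]


wlist = build_wlist("DENRYUU GA HENKA")
candidates = predicate_candidates(wlist)
print([w["word"] for w in candidates])
print(PFD[candidates[0]["pointer"]])
```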
Procedural description 
General flow of decomposition process.
The general procedural flow and the data
flow of the decomposition process are shown in
Fig.1 and Fig.2 respectively. Input Japanese
sentences spelled in Roman letters are
segmented word by word with spaces.
Each word is matched with the entry words
of WD. The word list(WLIST) is constructed
based on the information from WD. The
candidate for predicate (e.g., verb, adjective)
is found by searching WLIST from the
head of the list. Then, the modal operators
(FI11, FI12 and FI21), embedding operator
fIII and connecting operator FIV are extracted
by investigating the variety and the
inflectional form of the predicate or the
words which follow the predicate. The
extracting method of these operators is shown
in Fig.3. The extracted information is
stored in FLIST 1 and CLIST. The variables
of the predicate function are extracted by
reference to PFD. At the same time, the
modal operators FI23 and FI24 are extracted,
if any. If an obligatory variable of the
function is omitted, a word whose concept
coincides with the domain of the
variable is sought in the extracted word
string in WLIST. This is regarded as the
application of the elliptical operator.
When the embedding operator applies to the
predicate, the variety of the nominalization
operator and the embedded phrase are
Table 9 Example of phrase functions
Group  Symbol  Phrase function                 Formula        Example
G1     g101    (RYOO|ε) w KAN                  w/w*r12*P      RYOO BASE KAN (between bases)
               w1 w2 KAN                       w1/w2*r12*P    BASE COLLECTOR KAN (between base and collector)
       g103    (TAHOO NO|TA) wm                TAHOO*r22*wm   TAHOO NO KAIRO (another circuit)
       g105    w {NO|ε} wm                     w*r51*wm       THYRISTOR NO GATE (gate of thyristor)
               w {NO|ε} wm                     wm*r51*w       TRANSISTOR KAIRO (transistor circuit)
       g110    KAKU wm                         KAKU*r24*wm    KAKU DIODE (each diode)
G2     g201    wm1 (TO|OYOBI|,|ε) wm2 (TO|ε)   ψ10-wm1/wm2    TEIKOO TO DIODE (resistor and diode)
       g202    wm1 MATAWA wm2                  ψ11-wm1/wm2    TEIKOO MATAWA CONDENSER (resistor or capacitor)
       g203    w wm                            ψ9-w/wm        PULSE DENATU (pulse voltage)
       g204    w wm                            ψ7-w/wm        ZOOHUKU SAYOO (amplifying operation)
G3     g301    DAI w wm                        w-σ3.1-wm      DAI 2 TRANSISTOR (second transistor)
       g302    w {KO NO|TU NO} wm              w-σ3.2-wm      2 KO NO TRANSISTOR (two transistors)
       g306    KIZYUN wm                       KIZYUN-σ1.1-wm KIZYUN DENGEN (standard power source)
       g308    w {NO|ε} wm                     w-σ3.3-wm      PULSE NO SINPUKU (amplitude of pulse)
               w {NO|ε} wm                     wm-σ3.3-w      KOO IMPEDANCE SOSI (high impedance element)
G4     g401    {KONO|SONO} wm                  D-wm           KONO TRANSISTOR (this transistor)
N.B. wm indicates the main component of the phrase.

Table 10 Structure of machine dictionary
(a) Word dictionary (WD)
Entry word                Category  Code  Concept    Pointer
TRANSISTOR(transistor)    NA        300   s1111(ψ0)  1
GA                        XA        1
SETUZOKU(connect)         VB        1010  a131(ψ0)   1
COLLECTOR(collector)      NA        410   s321(ψ0)   2
DENRYUU(current)          NA        376   s122(ψ0)
SASERU(make)              TB3,4     24    a11(ψ0)
HENKA(change)             VB        1025  a121(ψ0)   2
HANDOOTAI(semiconductor)  NA        343   s112(ψ0)   3
HENTYOO(modulate)         VB        1018  a111(ψ0)   3
OOKIKU(large)             AA1,2     1206  a22(ψ0)    4
BEKI(should)              JJ4       32    u11
KONO(this)                RR        112
DAI(large)                HH        1206  a22(ψ0)    4

(b) Predicate function dictionary (PFD)
NO.  Number of  Designator*  Case          Number of  Domain      Character string
     variables               postposition  domains                of predicate
1    5          0            WO            2          s111, s32   SETUZOKU-SURU
                0            NI            2          s111, s32
                1            NI            2          a241, a27
                1            DE            1          s111
                1            DE            1          s31
2    2          0            GA            2          s12, s21    HENKA-SURU
                2            GA            1          s1112
3    2          0            WO            1          s12         HENTYOO-SURU
                3            --            1          s12
4    1          0            GA            1          s213        OOKII
* 0(obligatory variable), 1(facultative variable),
2(special variable due to fI11 and fI12), 3(special variable due to fII3)

(c) Related concept dictionary (RCD)
NO.  Number  Variety  Direction  Level*  Related concept
1    3       r51      ~          0       410, EMITTER**, BASE**
     1       r51                 1       s1112
2    1       r51      ~          0       300
     2       σ3.3     ~          0       376, DENATU(voltage)**
3    2       σ3.4                1       s1111, s1112
* 0(code), 1(concept)
** The code is stored in actual dictionary.
decided. The extracted information is
stored in FLIST 1, and the word strings of the
variables are stored in VLIST. These word
strings are decomposed into basic predicate
functions, nominalization operators and
phrase functions, and then stored in FLIST 2
and GLIST. The above procedure is repeated
for the other predicate candidates. Finally,
the connecting formula which indicates the
relation among predicate functions is formed
by reference to CLIST.
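The extraction of variables, including the recovery of an omitted obligatory variable, can be sketched as follows. This is an illustration under several assumptions: the concept symbols are treated as hierarchical prefixes, the designators 0 and 1 mean obligatory and facultative, and the function names in_domain and extract_variables as well as the tuple layouts are hypothetical. The concept symbols of the example words are illustrative readings of the dictionary.

```python
def in_domain(concept, domains):
    """Assumed hierarchy: 's1111' falls under the domain 's111'."""
    return any(concept.startswith(d) for d in domains)


def extract_variables(slots, arguments, earlier_words):
    """Fill the variable slots of one predicate function.

    slots:         (postposition, designator, domains) taken from PFD
    arguments:     (word, concept, postposition) found before the predicate
    earlier_words: (word, concept) available for ellipsis recovery
    """
    result = []
    remaining = list(arguments)
    for post, designator, domains in slots:
        match = next((a for a in remaining
                      if a[2] == post and in_domain(a[1], domains)), None)
        if match is not None:
            remaining.remove(match)
            result.append((post, match[0], "explicit"))
        elif designator == 0:
            # Obligatory variable omitted: search the earlier word string.
            filler = next((w for w, c in earlier_words
                           if in_domain(c, domains)), None)
            result.append((post, filler, "elliptical"))
    return result


# Slots of SETUZOKU-SURU (connect).
slots = [
    ("WO", 0, ("s111", "s32")),
    ("NI", 0, ("s111", "s32")),
    ("NI", 1, ("a241", "a27")),
    ("DE", 1, ("s111",)),
    ("DE", 1, ("s31",)),
]
arguments = [("COLLECTOR", "s321", "NI")]
earlier_words = [("TRANSISTOR", "s1111")]
variables = extract_variables(slots, arguments, earlier_words)
print(variables)
```

Facultative slots that find no argument are simply skipped, whereas an unfilled obligatory slot triggers the search that corresponds to the elliptical operator.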
Processing of phrases. First, the procedure
searches the word strings stored in VLIST for
a predicate candidate, such as a dynamic
attribute noun, the declinable-word-modifying
form of a common verb, a prefix (e.g.,
"KOO (high)", "TEI (low)", "DAI (large)", etc.)
or an adjective II. If a candidate is found,
the basic predicate function, nominalization
operator and embedded word are extracted. If
not, the phrase functions are extracted. They
are classified into three types according to
the decision method.
[Type I] Phrase functions extracted by the
features of their constants. Examples are
g101, g201, g301, and so on, in Table 9.
Their constants, such as "RYOO (both)", "KAN
(between)", "TAHOO (another)", "DAI", "KS",
etc., are given priorities based on the
strength of their connectability to variables,
and are stored in the constant list. Phrase
functions of this type are extracted according
to priority.
[Type II] Phrase functions extracted by using RCD.
Examples are g105, g308, and so on.
[Type III] Phrase functions extracted by using
the variety or level of word concepts. For
example, g203 is extracted by investigating
whether the upper concepts of both words agree
with each other, and g204 by investigating
whether the concept of the second word
Input sentences
WLIST:   Word category, Inflectional form, Code, Concept,
         Pointer to other dictionary
FLIST 1: Number of variables
         Index of variable 1, Case postposition
         Index of variable 2, Case postposition
         ...
         Modal operator
         Nominalization operator
         Elliptical operator
VLIST:   (Variable 1) Word string, Index
         (Variable 2) Word string, Index
         ...
CLIST:   Connecting operator
         Embedding operator
FLIST 2: Embedded word
         Nominalization operator
         Number of variables
         Variable (word)
         Case postposition
GLIST:   Number of phrase function
         Number of variables
         Variables (word)
         Constant
Fig.2 Data flow of decomposition process
START
Identify all words in sentences
Find a candidate for predicate
Extract the modal and connecting operators applied to the predicate
Extract the variables of the predicate function
Is the embedding operator applied to the predicate?
  YES: Extract the nominalization operator and the embedded phrase
(Extract the phrase functions)
Are there candidates for predicate remaining?
Form the connecting formula
Print the normal form of input sentences
Fig.1 Decomposing procedure of Japanese sentences
[Diagram not recoverable from the source. Legible labels: Predicate (Vn); KAISURU (through), YORU, HERU (by way of), TAISURU (toward), OOZIRU (according to), etc.; RERU (RARERU); SERU (SASERU)]
Fig.3 Extraction of modal, connecting
and embedding operators
(1) Connecting point of resistor and inductance coil
TEIKOO TO INDUCTANCE NO SETUZOKU TEN
MAIN ELEMENT = TEN
PREDICATE FUNCTION
  1 (INDUCTANCE) WO (TEIKOO) TO SETUZOKU-SURU
    N.OP. = F2.30 NOUN = TEN
(2) Output pulse with constant amplitude
ITTEI SINPUKU SYUTURYOKU PULSE
MAIN ELEMENT = PULSE
PHRASE FUNCTION
  1 G3.08 : PULSE--CC3--SINPUKU
PREDICATE FUNCTION
  1 (SINPUKU) GA ITTEI-DA
    N.OP. = F2.11 NOUN = SINPUKU
  2 (PULSE) GA SYUTURYOKU-SURU
    N.OP. = F2.12 NOUN = PULSE
(3) Voltage detecting device with high input impedance
KOO NYUURYOKU-IMPEDANCE DENATU KENSYUTU KAIRO
MAIN ELEMENT = KAIRO
PHRASE FUNCTION
  1 G3.08 : KAIRO--CC3--NYUURYOKU-IMPEDANCE
PREDICATE FUNCTION
  1 (NYUURYOKU-IMPEDANCE) GA TAKAI
    N.OP. = F2.11 NOUN = NYUURYOKU-IMPEDANCE
  2 (KAIRO) GA (DENATU) WO KENSYUTU-SURU
    N.OP. = F2.12 NOUN = KAIRO
Fig.4 Examples of phrase processing
is the upper concept of the first word.
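
The Type I and Type III decisions can be sketched in Python as follows. The constant priorities and the small concept hierarchy are invented for illustration only, since the paper gives neither the actual priority values nor the concept tree, and all function names are hypothetical. Only the decision rules follow the description: the constant with the highest priority is chosen for Type I, and upper concepts are compared for g203 and g204.

```python
# Hypothetical priorities: a larger value means stronger connectability
# of the constant to a variable.
CONSTANT_PRIORITY = {"RYOO": 3, "KAN": 2, "TAHOO": 1}

# Hypothetical concept hierarchy: concept -> its upper concept.
UPPER_CONCEPT = {
    "transistor": "element",
    "diode": "element",
    "element": "thing",
    "signal": "thing",
    "light-signal": "signal",
}


def pick_type1_constant(words):
    """Type I: return the constant with the highest priority, or None."""
    candidates = [w for w in words if w in CONSTANT_PRIORITY]
    if not candidates:
        return None
    return max(candidates, key=CONSTANT_PRIORITY.get)


def same_upper_concept(first, second):
    """g203-style test: do the upper concepts of both words agree?"""
    upper = UPPER_CONCEPT.get(first)
    return upper is not None and upper == UPPER_CONCEPT.get(second)


def is_upper_concept(first, second):
    """g204-style test: is the second concept an upper concept of the first?

    Ancestors at any level are accepted here; whether the original
    procedure looks only one level up is not stated in the paper.
    """
    node = UPPER_CONCEPT.get(first)
    while node is not None:
        if node == second:
            return True
        node = UPPER_CONCEPT.get(node)
    return False
```

For instance, `pick_type1_constant(["COLLECTOR", "KAN"])` selects "KAN", and `is_upper_concept("light-signal", "signal")` holds in this toy hierarchy.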
Experiments 
The merit of the above procedure is its combination
of top-down and bottom-up processing.
The former finds a key word in a sentence without
reference to word order. The latter analyses
the word strings based on the key word. This is
advantageous for processing Japanese sentences,
in which word order variation and
embedding appear frequently.
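
A minimal Python sketch of this combination, under simplifying assumptions: the input is already segmented into words, the predicate is recognized by membership in a small hypothetical set, and each variable is the word string immediately preceding a case postposition. The names `PREDICATES`, `CASE_POSTPOSITIONS` and `decompose` are illustrative and do not come from the original program.

```python
# Hypothetical predicate and postposition inventories (illustration only).
PREDICATES = {"KENSYUTU-SURU", "SETUZOKU-SURU", "HANTEN-SURU"}
CASE_POSTPOSITIONS = {"GA", "WO", "NI", "DE", "TO"}


def decompose(words):
    """Return the predicate and its (word string, postposition) variables."""
    # Top-down step: locate the key word (predicate) wherever it occurs.
    pred_pos = next((i for i, w in enumerate(words) if w in PREDICATES), None)
    if pred_pos is None:
        return None

    # Bottom-up step: group the words before the predicate into variables,
    # each one closed by a case postposition. Words not followed by a
    # postposition are ignored in this sketch.
    variables, current = [], []
    for w in words[:pred_pos]:
        if w in CASE_POSTPOSITIONS:
            if current:
                variables.append((" ".join(current), w))
            current = []
        else:
            current.append(w)
    return {"predicate": words[pred_pos], "variables": variables}


result = decompose("KAIRO GA DENATU WO KENSYUTU-SURU".split())
swapped = decompose("DENATU WO KAIRO GA KENSYUTU-SURU".split())
```

Because the variables are keyed by their case postpositions rather than by position, `result` and `swapped` contain the same set of variables even though the word order differs.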
The procedure was programmed in the assembly
language of the TOSBAC-40C minicomputer. The
experimental results for sentences in 30 documents
confirmed the adequacy of our procedure.
Examples of phrase and sentence processing
are shown in Figs.4 and 5.
Conclusion 
This paper has presented a method of
decomposing Japanese sentences into normal forms.
This method has the following advantages:
(1) The descriptive scheme M, which describes
word content and the relations among words, is
introduced based on the human linguistic process.
This will be useful for future language processing
that includes pragmatics.
(2) The normal forms, which consist of basic
predicate functions, phrase functions and operators,
are interpreted according to the descriptive
scheme M. This is useful for the semantic
processing of input sentences.
(3) The structure of considerably long sentences
can be described by the embedding and connecting
operators.
(4) The structural description of phrases or
compound words helps reduce the amount of
storage needed for the word dictionary.
(5) The normal forms of sentences can serve as
input data for automatic subject indexing or
abstracting of documents in an information
retrieval system.5,6
The problems left unsolved are word segmen- 
tation of input Japanese sentences, detection of 
syntactic and semantic ambiguity, and semantic 
***INPUT SENTENCE***
1 TUI NO PHOTO-TRANSISTOR NO COLLECTOR KAN NI TAGAI NI GYAKU-HEIRETU NO
HAKKOO-DIODE WO SETUZOKU SI TE , DOOTUU SURU BEKI PHOTO-TRANSISTOR WO HOZI SURU
YOO NI HIKARI-KETUGOO SASE , PHOTO-TRANSISTOR NO BASE NI KOOGO NI KUWAE RARERU
HIKARI SINGOO NI YORI HANTEN SURU FF .
(A flip-flop in which light emitting diodes connected in antiparallel are tied
across collectors of a pair of photo transistors; and in which they are photo
coupled so as to keep the state of the photo transistor to conduct; and which
turns by alternately applying photo signal to the base of photo transistor.)
***STRUCTURAL DESCRIPTION***
S1 (X1.1) WO (X1.2) NI SETUZOKU-SURU
   OP. = F1.215(TA)
   X1.1 = TAGAI NI GYAKU-HEIRETU NO HAKKOO-DIODE
      MAIN ELEMENT = HAKKOO-DIODE
      PREDICATE FUNCTION
         1 (HAKKOO-DIODE) GA (TAGAI) NI GYAKU-HEIRETU-DA
           N.OP. = F2.110(NOUN = HAKKOO-DIODE)
   X1.2 = 1 TUI NO PHOTO-TRANSISTOR NO COLLECTOR KAN
      MAIN ELEMENT = KAN
      PHRASE FUNCTION
         1 G3.03 : 2--CC2--PHOTO-TRANSISTOR
         2 G1.01 : COLLECTOR/COLLECTOR*R12*POSITION
         3 G1.05 : PHOTO-TRANSISTOR*R51*COLLECTOR
S2 (X2.1) GA DOOTUU-SURU
   OP. = F1.214(BEKI), F2.110(NOUN = X2.1)
   X2.1 = PHOTO-TRANSISTOR
S3 (X3.1) WO HOZI-SURU
   OP. = F2.300(NOUN = YOO)
   X3.1 = X2.1
S4 (X4.1) GA (X4.2) TO (X4.3) NI HIKARI-KETUGOO-SURU
   OP. = F1.124(SASERU), F5.100(PHOTO-TRANSISTOR), F5.100(HAKKOO-DIODE)
   X4.1 = PHOTO-TRANSISTOR
   X4.2 = HAKKOO-DIODE
   X4.3 = YOO
S5 (X5.1) WO (X5.2) NI (X5.3) NI KUWAERU
   OP. = F1.123(RARERU), F2.110(NOUN = X5.1)
   X5.1 = HIKARI SINGOO
      MAIN ELEMENT = SINGOO
      PHRASE FUNCTION
         1 G2.03 : AO9--HIKARI/SINGOO
   X5.2 = PHOTO-TRANSISTOR NO BASE
      MAIN ELEMENT = BASE
      PHRASE FUNCTION
         1 G1.05 : PHOTO-TRANSISTOR*R51*BASE
   X5.3 = KOOGO
S6 (X6.1) NI YORU
   X6.1 = X5.1
S7 (X7.1) GA HANTEN-SURU
   OP. = F2.110(NOUN = X7.1)
   X7.1 = FF
***CONNECTING FORMULA***
S1*F4.1*S4(F3-S3(F3-S2))*F4.1*[S6(F3-S5)*F4.6*S7]
N.B. 1 S1, S2, ... are basic predicate functions.
     2 X1.1, X1.2, X2.1, ... are variables of each function.
     3 The symbol "OP." indicates the operator applied to the predicate.
     4 The symbol "NOUN" indicates the embedded phrase or word.
     5 The predicate function "S7" is an embedded one, but it is considered to
       be an independent function in the connecting formula.
Fig.5 Example of sentences processing
description of sentences. 
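
The connecting formula in Fig.5 is regular enough to be scanned mechanically. The Python sketch below tokenizes the formula, written here as S1*F4.1*S4(F3-S3(F3-S2))*F4.1*[S6(F3-S5)*F4.6*S7], and lists which predicate function is embedded in which. Reading "Sx(F3-Sy" as "Sy is embedded in Sx" is an interpretation of the figure, not a rule stated in the text, and the function names are hypothetical.

```python
import re

# Connecting formula of Fig.5 (as read from the figure).
FORMULA = "S1*F4.1*S4(F3-S3(F3-S2))*F4.1*[S6(F3-S5)*F4.6*S7]"

# Predicate functions (S7), operators (F4.1, F3) and punctuation.
TOKEN = re.compile(r"S\d+|F\d+(?:\.\d+)?|[*\-()\[\]]")


def tokenize(formula):
    """Split a connecting formula into tokens; reject unknown symbols."""
    compact = formula.replace(" ", "")
    tokens = TOKEN.findall(compact)
    if "".join(tokens) != compact:
        raise ValueError("unexpected symbol in connecting formula")
    return tokens


def predicate_functions(formula):
    """Return the predicate function labels in order of appearance."""
    return [t for t in tokenize(formula) if t.startswith("S")]


def embedded_pairs(formula):
    """Return (outer, inner) pairs for every 'Sx(F3-Sy' pattern.

    A lookahead is used so that nested embeddings such as
    S4(F3-S3(F3-S2)) yield both (S4, S3) and (S3, S2).
    """
    compact = formula.replace(" ", "")
    return re.findall(r"(S\d+)\((?=F3-(S\d+))", compact)


print(predicate_functions(FORMULA))
print(embedded_pairs(FORMULA))
```

Under this reading, S2 is embedded in S3, S3 in S4, and S5 in S6, which agrees with the variable bindings X3.1 = X2.1 and X6.1 = X5.1 in the structural description.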

References 

1. T.Fujita, H.Tsurumaru and S.Yoshida, "Machine
Processing of Japanese--Decomposition of
Japanese Sentences into Their Normal Forms--",
Trans. IECE Japan, Vol.58-D, No.7, pp.405-
412, July 1975.

2. F.Nishida and S.Takamatsu, "A Reduction of
Restricted Japanese Sentences to Predicate
Formulas and the Information-Extraction",
Trans. IECE Japan, Vol.J59-D, No.8, pp.515-
522, Aug. 1976.

3. T.Endo and T.Tamati, "Syntax Analysis of
Japanese Text for Subject Indexing", Tech.
Report of IECE Japan, AL77-46, Oct. 1977.

4. T.Endo and T.Tamati, "On a Structural
Description of Japanese Text", Tech. Report
of IECE Japan, AL79-37, July 1979.

5. G.Salton, "The SMART Retrieval System--Experiments
in Automatic Document Processing--",
Prentice-Hall Inc., 1971.

6. F.W.Lancaster, "Vocabulary Control for
Information Retrieval", Information Resources
Press, 1972.
