Transforming a Sentence End into News Headline Style
Satoshi Ikeda and Kazuhide Yamamoto
Dept. of Electrical Engineering, Nagaoka University of Technology
ikeda@nlp.nagaokaut.ac.jp, yamamoto@fw.ipsj.or.jp
Abstract
News on electrical bulletin boards con-
sist of high density expressions. Many
sentences end with unique expressions
that consist of nouns and case parti-
cles. This paper focuses on expressions
used at the end of sentences and at-
tempts to summarize them by forming
noun or case particle endings. We sum-
marize the news sentence through pat-
tern matching approach. Our evalua-
tion illustrates that the summarizer re-
duces 2.50 characters per sentence on
average; the reduction ratio is 6%. We
also show that people perceive the cor-
rect meanings of the summarized sen-
tences with 95% accuracy.
1 Introduction
Electrical bulletin board displays the latest
news headlines which each newspaper office an-
nounced. News headlines are shorter than news-
papers’ news with laconicism because they are
summarized to transfer in limited space.
One of a characteristics of Japanese news head-
lines can be seen at sentence ends (Exp.1).
Exp.1)n�Y�x)Xw��_��
Q�{
(Countermeasure for alleged abduction is judged
after the movement of administration party.)
Although the end of sentence in Exp.1 is omit-
ted, we have no difficulty to understand the mean-
ing. We unconsciously complete the sentence by
guessing what is omitted without a mistake. With-
out unnecessary ends, these type of sentences are
short and nonredundant.
The final purpose of this work is to transform
news sentences into a news headline style. In
Japanese, sentences end with nouns or case par-
ticles are grammatically incorrect, however, this
kind of expressions are shorter than grammati-
cally perfect sentences, and hence often used to
meet the limited length. We believe many sen-
tences have semantically redundant expressions
in the end, which needs to be focused in summa-
rization.
In this paper we present a list of deletable ex-
pressions at Japanese sentence ends. We have
to carefully investigate which sentence ends are
deletable, and how to change into the headline
style. We present the concrete expressions of
deletion with examples and illustrate effects of
deletion.
2 Related Works
As the most similar work to ours, Sato et al.[6]
tries to extract paraphrasing patterns of sentence
end by preparing a lot of alignment pairs between
news sentences and their headline versions. They
compare the sentences from the ends and obtain
many correspondences between the two. How-
ever, they have no proposals on how to use these
one-to-N correspondences, i.e., the way to select
one from many candidates. Our approach is to ob-
tain many transformation patterns as well, but we
do not use aligned corpus; we use a large collec-
tion of news headlines instead and find patterns
by our thorough observation.
Wakao et al.[7] compares newscast and corre-
sponding subtitle expressions to investigate the
differences of them. One of the observation tar-
gets is sentence end, and they have shown us
some typical patterns of conversion into a short
news. This enumerates phraseologies which are
able to be cut down and investigates the frequency
of use. In news subtitles nouns or case particles
are used at the sentence end. This work is drew
upon literature [7] while we investigated in our
own right. We shaded light on the phraseologies
which do not exist literature [7] such as��	
`hq_�����	T(There seems to sur-
render.)�. We examined the phraseologies which
41
are disposed by machine. Fukushima et al. [1] cut
off the unnecessary part from literature [7].
There are investigations to summarize text
by confining the number of characters [2,3,5].
Ishizako et al.[2] cut off areas of overlap. Ohmori
et al.[5] and Mikami et al.[3] summarized text al-
together, but these investigations do not focus on
sentence ends.
3 News Headlines and Their Sentence
Ends
There is an email service that delivers Japanese
news headlines three times a day on week-
days. That is Nikkei news mail(1) provided by
NIKKEI-goo. We have been collecting them
since December 1999. Table1 shows the statistics
we have obtained.
Table 1: Statistical datum which are collected
number of mails 3365
number of stories 21127
number of sentences 40374
News headlines are more distinctive than news
stories in sentence end. Therefore, we investi-
gated part of speech on both news headlines and
newspaper(Nihon Keizai Shimbun(2)). Table2
shows the comparison.
Table 2: Occurrence ratio of POS in sentence end
occurrence ratio[%]
POS newspaper headlines
noun 23.70 55.92
(verbal noun) (5.00) (39.90)
verb 28.66 15.91
adjective 1.80 0.19
adverb 0.20 0.22
particle 1.56 8.83
(case particles) (0.34) (6.41)
auxiliary verb 38.59 18.52
symbol 5.42 0.40
In the newspaper, declinable words are respon-
sible for the majority of sentence ends. In news
headlines, there are in fact many verbal nouns in
sentence ends.
Japanese words are classified broadly into two
types; one derived from China and another origi-
nated in Japan. News headlines contain the for-
mer more than the latter because words from
China carry more information in fewer characters.
We investigated news headlines and news on a pa-
per which contained words of both Chinese and
Japanese origins. The result is shown in Table3.
In fact, news headlines preferably use the words
of Chinese origin about three times as much as
that of Japanese origin.
Table 3: Ratios of Chinese and Japanese origin
words in (a) newspaper and (b) headlines.
ratios [%]
Japanese Chinese (a) (b) a/b
_mT�
C_(to be found) 1.059 2.658 0.398
>��>(to decide) 0.622 2.184 0.285

�
�	Z(to elect) 0.210 2.643 0.079

�T�
Q�(to find out) 0.181 2.875 0.063
�a���(to order) 1.132 3.841 0.295
	\��
Ct(to say) 0.456 0.181 2.493
����*(to investigate) 6.284 53.333 0.118
total 2.712 7.271 0.373
We can imagine that a short phraseology
is preferably used when the phraseology has
the same information. We estimate that the
news headlines are high density phraseology than
newspaper.
4 Method of Summarization
In order to transform a sentence end into a
shorter one, we have conducted three kinds of
procedures:
(1) Deletion of target words at sentence end
(2) Deletion with minor transformation after the
target words
(3) Transformation of sentence end
More precisely, we have proposed conduct-
ing the following 10 procedures for transforming
Japanese sentence ends into a news headline style:
1. Cut off dictum and honorific phraseology (1)
2. Cut off���b(wo shimesu:show)�(1)
3. Change verbal noun(2)
4. Cut off�s�(naru)�(2)
5. Cut off the part which follows���Tt(akirakani)�
(2)
6. Change words of Japanese origin(2)
7. Cut off�o`�O(teshimau)�(1)
8. Cut off�qm(tatu)�(2)
9. Transform phraseology indicated the action in the fu-
ture (2)
42
10. Change to compound noun (3)
We summarized in this order, and process 3.�
9. can be switched.
4.1 Cut Off Dictum and Honorific Phraseol-
ogy
Phraseologies shown below are dictum or hon-
orific phraseology. These phraseology in sentence
end is cut off because these are not necessary to
understand the meaning.
~dictum phraseology:�ilh(datta)��pK�
(dearu)��i(da)�
~honorific phraseology:��b(masu)��pb(desu)�
4.2 Cut Off���b(wo simesu:show)�
When a sentence end is���b(wo
shimesu)�or���`h(wo shimeshita)�, this
phraseology is cut off because��b(shimesu)�
has little meaning in that sentence. The main verb
of the sentence is the verbal noun before���
b(wo shimesu)�.
4.3 Change Verbal Nouns
The expression after the verbal noun closest
to the main verb of the sentence is deleted. In
Japanese, we put a word�b�(suru)�after a
verbal noun to make a verb, but in the summary
it can be deleted since we can still understand the
usage.
When a self-sufficient word exists following a
verbal noun, we do not dispose this.
Step 1 The part following�b�(suru)�is cut.
Nominalized verbal noun to cut�b�(suru)�is
the verbal noun in this arrangement.
Step 2 When the cut part contains an estimation
phraseology�����(mirareru)�or�i�
O(daou)�, tack on�T(ka))�and finish.
Exp.2)�
���tlo�	`hq����{
��
���tlo�	T{
(He seemed to surrender in trouble with escape fund.)
Step 3 When the cut part contains a contradic-
tion phraseology�sM(nai)�or�u(nu)�,
tack on�dc(sezu)�at the sentence end and fin-
ish. When this part concurrently contains a pas-
sive phraseology���(reru)�, tack on�^�
c(sarezu)�and finish.
Step 4 When a sentence end is�noun��(wo)
�verbal noun�,��(wo)�is cut to become a
compound noun�noun�verbal noun�.
Exp.3)DT���q	�wm��t
g�	�9�Lb�{
�DT�z��q	�wm��t
g�	�9L{
(Starting this month, Japanese chess problems are seen in
ads of each station and in trains.)
Step 5 When a sentence end is�particle1
�noun�b�\q(surukoto)�particle2�
noun�,�b�\q(surukoto)�is cut. If the
particle1 is��(wo)�or�T(ka)�, this parti-
cle changes to�w(no)�.
If the cut part contains�	s�o(ha-
jimete:first)�, procedures from Step 2 is different
as follows.
Step 2 When the cut part contains�b�wx
(surunoha)�or�`hwx(shitanoha)�,�	s
�o(hajimete:first)�is tacked on before verbal
noun. When the part of cut contains�����
(mirareru)�, tack on�T(ka)�in sentence end
and finish.
Step 3 When the cut part contains�`o
(shite)�,��	s(go hatsu)�is tacked on in the
sentence end. When the term just before noun
is particle�T(ka)�, this particle�T(ka)�
changes into particle�w(no)�.
Exp.4)q_t ahwx����

HU�T�
	Z`o	s�o{
�q_t ahwx����

HU�T�
	Z�	s{
(He first acceded the interview since Karmapa Seventeenth
left China.)
Step 4 .1 When a verbal noun is�
Ct(hat-
sugen:delivery)�or�tt(genkyuu:citation)�,
�	s�o(hajimete:first)�is tacked on before the
verbal noun.
Exp.5)���	�
�U+@ttt`hwx	s�o{
����	�
�U+@t	s�ott{
(Russian troop’s cadre first adverted to retreat.)
Step 4.2 When a verbal noun is not�
Ct(hat-
sugen:delivery)�or�tt(genkyuu:citation)�,
the term before the verbal noun is checked. The
sentence end is processed the following.
~particle�w(no)�,�U(ga)��verbal noun
�particle�w(no)��verbal noun��x	s(ha
hatsu)�
~particle��(wo)�,��(mo)��verbal noun
�particle��(wo)�,��(mo)��verbal noun
~otherwise
�verbal noun��w�verbal noun��x	s(ha
hatsu)�
43
Step 5 When the cut part contains�����
(mirareru)�,�T(ka)�is tacked on in the sen-
tence end.
4.4 Cut Off�s�(naru)�
When�particle�s�(naru)�exists in a
sentence, this part and the following are cut off.
When a self-sufficient word exists in the cut part,
the meaning changes or we do not understand the
meaning.
Therefore, when a self-sufficient word exists
�particle�s�(naru)�following, the sentence
is not disposed this arrangement.
The�particle�s�(naru)�and the follow-
ing are cut off. When the particle is�t(ni)�or
�q(to)�,�t(ni)�is tacked on in the sentence
end.
Exp.6)
�
��d
��D
R�w��pzW�W�w

�Rqslh{
�
�
��d
��D
R�w��pzW�W�w

�Rt{
(The accord became the bare adoption because this arranged
after three and half months of general election ballot )
When the cut part contains a contradiction
phraseology�sM(nai)�or�u(nu)�,�s
�c(narazu)�is tacked on in the sentence end.
Exp.7)P U�loMhwTz�q�rUI
0N
ts�sTlh{
�P U�loMhwTz�q�rUI
0N
ts�c{
(Almost all detonating agents did not work because they
seemed to wet)
4.5 Cut Off the Part After���Tt�
When���Tt(akirakani:out of doubt)�ex-
ists in a sentence, the part which follows���T
t(akirakani)�is cut off. When the cut part con-
tains a self-sufficient word, the meaning changes
or we do not understand the meaning.
Then, when a self-sufficient word exists in the
sentence, the sentence is not disposed this ar-
rangement.
Step 1 The part which follows���Tt(aki-
rakani)�is cut off.
Step 2 Research the part of cut and dispose the
cut part.
~Contradiction phraseology�sM(nai)�or�u(nu)�
and passive phraseology���(reru)�exist.
��^�c(sarezu)�is tacked on in the sentence end.
~The contradiction phraseology�sM(nai)�or�u
(nu)�exists.
��dc(sezu)�is tacked on in the sentence end.
Exp.8)���x��Tt`oMsM{
����x��Ttdc{(The amount of
loss is not announced.)
Step 3 When�b�\q�(surukoto wo)�ex-
ists before���Tt(akirakani)�,�b�\q
(surukoto wo)�is cut off. When the part before
the cut is�particle�t(ni)��verbal noun�,
�t(ni)�is changed to��(e)�. When the part
before the cut part is�particle��(wo)��ver-
bal noun�,��(wo)�is changed to�w(no)�.
4.6 Change Words of Japanese Origin
When a Japanese origin word by Table3 exists
in a sentence, the part before it is cut off. Then
the Japanese origin word is replaced by Chinese
one.
When a self-sufficient word exists following
Japanese origin word, the sentence is not disposed
of this arrangement. We changed the word which
shows Table3.
Step 1 Japanese origin word and following are
cut off.
Step 2 When sentence end is�b�\q
�(surukoto wo)�, cut off�b�\q(su-
rukoto:doing)�, tack on the correspondent Chi-
nese origin word, and finish the arrangement.
Exp.9)�B����	������w^
Rt�	b�
\q�>�h{
��B����	������w^
Rt�	
�>{
(They have decided to start making an ‘instruction book on
extensive assistance for disaster’.)
Step 3 When a sentence condition is followed,
the sentence is disposed.
~A sentence end is a particle�U(ga)�and Japanese
origin word is�
�T�(wakaru:understand)�
�tack on�
Q�(hanmei:understand)�and finish.
~A sentence end is a particle�U(ga)�and Japanese
origin word is not����(siraberu:census)�
�The particle�U(ga)�is changed to a particle��
(wo)�.
~A sentence end is�U(ga)�noun�p(de)���w
(no)�noun��(wo)�
~A sentence end is a particle�x(ha)�and Japanese
origin word is�
�T�(wakaru:understand)�
�Get the former sentence back again and finish.
~Japanese origin word is����(siraberu:census)�
and the cut part contains�`oM�(siteiru)�
44
�tack on��*�(tyousa tyu:under survey)�at the
sentence end and finish.
Step 4 Chinese origin word which corresponds
Japanese origin word tacked on the sentence end.
Exp.10)!�_-�U_mTlh{
�!�_-��
C_{
(The total of 359 counterfeit coins were found.)
4.7 Cut Off�o`�O(teshimau)�
When a sentence contains�o`�O(teshi-
mau)�, we feel that the sentence is negative and
�o`�O(teshimau)�is not necessary to un-
derstand the meaning of the sentence. Thus we
cut off�o`�O(teshimau)�in the headline.
This arrangement is used not only the sentence
ends but middle of the sentence. When the term
after the cut part is�y(ba)�, we do not dis-
pose it. When the sentence end is�o`�O
(teshimau)�, change the term before�o`�O
(teshimau)�to primitive form and finish.
When�o`�O(teshimau)�exists without
the sentence end,�o`�O(teshimau)�and
the character before this phraseology is cut off.
4.8 Cut off�qm(tatsu)�
When a sentence contains�qm(tatsu)�,�q
m(tatsu)�, the part following it is cut off. When
the following part contains the self-sufficient
word, the meaning changes or we do not under-
stand the meaning.
Therefore, when a self-sufficient word exists in
the following part, the sentence is not disposed
this arrangement. When�qm(tatsu)�is a part
of idiom, the sentence is not disposed of this ar-
rangement.
Step 1�qm(tatsu)�and the following part
are cut off.
Exp.11)��������x)+���w
��	�
3w
�:tqm{
���������x)+���w
��	�
3w
�:t{
(’Top boy’ is acme in TV game retail business)
Step 2 When a contradiction phraseology�sM
(nai)�or�u(nu)�exists in the cut part,�q
hc(tatazu)�is tacked on at the sentence end.
4.9 Phraseology of Words Implying Future
When a phraseology which indicate the action
in the future such as�-h(keikaku:attempt)�
or�'(yotei:plan)�exists in the sentence,
the phraseology can changed to��(he)�in
Japanese. Therefore, the terms listed below are
the phraseology of indicated the action in the fu-
ture. When�b�(suru)�this phraseology�
exists in the sentence, this part and following are
changed to��(he)�.
�'(yotei:plan)��-h(keikaku:attempt)��M

(houshin:policy)��M�(houkou:future direction)�
When�b�(suru)�this phraseology�exists
in a sentence and the following contains a contra-
diction phraseology�sM(nai)�or�u(nu)�,
the sentence is not disposed of this arrangement.
When the following contains the�qMO(toiu)�
or�z�, the sentence is not disposed of this ar-
rangement.
�b�(suru)�this phraseology�and follow-
ing are cut off. when the sentence end is particle,
the particle is cut off.��(he)�is tacked on the
sentence end.
4.10 Change to a Compound Noun
When a sentence end is�noun�particle�
verbal noun�after the above arrangements, the
particle cut off to become a compound noun.
When the noun is neither pronoun, person name,
unique noun nor postfix for Chasen(3), this ar-
rangement is not disposed. When the particle is
�T�(kara)�,�p(de)�or��(mo)�, this
arrangement is not disposed.
We make a compound noun dictionary for The
Mainichi Newspapers(4) to check the adequacy of
compound nouns. When�noun particlet(ni)
verbal noun�and the dictionary contains�noun
�verbal noun�which is cut of�t�,�noun
�particlet(ni)�verbal noun�is changed to
�noun�verbal noun�. When the particle is not
�t(ni)�,�noun�particle�verbal noun�is
changed to�noun�verbal noun�.
Exp.12)�w	�Z
{T��
Qw�.U_mTlh{
��w	�Z
{T��
Qw�.�
C_{
��w	�Z
{T��
Qw�.
C_{
(A man’s body was found on the third floor of burned-out
site.)
5 Experiments
We implemented the proposed technique with
Perl programming language to measure the ade-
45
quacy of proposed technique. We summary with
this program. Then input sentence are all sen-
tences seen in the newspaper corpus. The number
of input sentences is 232,038, and 73,512 outputs
are somehow summarized in our method.
5.1 Summarization Ratio
We calculated a sentence ratio and number of
reduced characters in a sentence. This result of
experiment is shown in Table4. The method of
Table4 shows the section number. This Table4
shows the result which used the only one method.
The summarization ratio is 94%. In fact, this
method is reduced the 6% about one sentence.
Table 4: Summarization ratio
process 4.1 4.2 4.3 4.4 4.5
# sentence 16825 1313 37995 7510 199
summ. ratio 0.94 0.94 0.94 0.93 0.90
# reduced char. 1.60 4.00 2.56 3.12 5.41
process 4.6 4.7 4.8 4.9 total
# sentence 7194 600 197 848 72681
summ. ratio 0.96 0.89 0.92 0.87 0.94
# reduced char. 2.20 3.93 3.28 6.57 2.45
5.2 Subjective Evaluation
We also evaluated the proposed technique by
human judgment. We picked up 1,000 sentences
at random from summary sentences, and three ex-
aminees individually accounted them. The sen-
tences are measured by majority decision. As-
sessment criterion is: (1) same meaning without
context, and (2) low unnaturalness. The result is
shown in Table5. The numbers in the table de-
note the section numbers explaining the process
of transformation.
Table 5: Correctness of each process
method 4.1 4.2 4.3 4.4 4.5
# sentence 231 19 492 107 9
# correct 205 18 481 106 8
ratio 0.89 0.95 0.98 0.99 0.89
method 4.6 4.7 4.8 4.9 total
# sentence 116 21 3 13 1000
# correct 113 17 3 12 952
ratio 0.97 0.81 1 0.92 0.95
We have also computed the influence of per-
sonal difference. In this kind of subjective evalua-
tion different person may answer difference judg-
ment. We have evaluated our results in three cri-
teria: (1) at least one said correct, (2) at least two
said correct, and (3) all three said correct. This re-
sult is shown in Table6. The Table illustrates that
correctness is more than 90% in all cases.
Table 6: Correctness changes by personal differ-
ences.
≥ 1 ≥ 2 = 3
correctness 0.98 0.95 0.91
5.3 Comparison to the Human Summaries
We compare summaries of the proposed
method and by the human. We picked up 100 sen-
tences in summary sentences at random. One ex-
aminee summarized the original sentences which
corresponded the pick up the summary sentences.
We computed the summarization ratio about these
sentences. The result is shown in Table7.
Table 7: Comparison of summaries by proposed
technique and manual summary
machine human
# sentence 72727 100
summ. ratio 0.94 0.92
# reduced characters 2.45 3.87
Although the sentence ratio of machine sum-
mary is close to the manual summary’s one, num-
ber of reduced characters are approximately one
character different. This indicates that human try
to change many parts of sentence according to
the change of the sentence end, while the ma-
chine does not consider such influence. Change
of sentence end often requires transforming the
whole syntax structure, such as change of aspect
or form. We need more investigations on this is-
sue.
6 Discussions
6.1 Discussion of Erroneous Summaries
In this section we describe some erroneous
summaries by our method and discuss the rea-
sons.
Exp.13)%xfw
w���hMs�wpz*Vi�
w
�	��15.5mm
u�	Ovz
��b{
46
�*1%xfw
w���hMs�wpz*Vi�
w
�	��
u�	Ovz
{
(The face show the character like an annual ring.)
Exp.13 is error example in arrangement ‘cut
off the���b(wo shimesu:show)�‘ When
the term before���b(wo shimesu)�is the
noun, the sentence does not have main verb. The
main verb which does not exist in the sentence
is not right in Japanese. when the term before
���b(wo shimesu)�is noun, this arrange-
ment does not disposed. This kind of error is
covered. But when the noun is��Q(kan-
gae:concept)�,���(ikou:disposition)�or�_
�`(mitooshi:forecast)�, this arrangement is
correct.
Exp.14)b;	 taSw�;�/b�\q�>�h{
�*b;	 taSw�;�/�>{
(It is decided to caution the overuse to user.)
Exp.14 is the error example ‘change the word
of Japanese origin. When the cut off�b�\
q(surukoto:doing)�, the modification relation
is changed. Therefore, the modification relation
is a wrong one. When the particle��(wo)�is
changed to particle�w(no)�, this kind of error
is covered(Exp.15).
Exp.15)b;	 taSw�;�/b�\q�>�h{
�b;	 taSw�;w/�>{
Exp.16)<
�t`o`�SOq�loMh{
�*<
�t`Oq�loMh{
(He thinks that his mother killed.)
Exp.16 is an error example in ‘cut off�o`�
O(teshimau)�’. When�o`�O(tesimau)�is
cut off, it is not congruent inflected forms of�o
`�O(teshimau)�and the verb. When�o`
�O(teshimau)�is cut off, the inflected forms
must be congruent.
6.2 Verbalness/Nominalness of Verbal Noun
The sentence end is��x	s(ha hatsu:first)�
in Section4.3. There are a big differences by hu-
mans in degree of accepting this expression. We
thus change expression��x	s(ha hatsu)�into
��	s�o(hajimete:first)��. The example be-
fore changed is shown in Exp.17.
Exp.17)����Gw�U���Hw��	 q
q�b�wx	s�o{
�����Gw�U���Hw��	 qw
q�x	s{
1symbol ‘*’ indicates that the sentence is wrong.
(It is the first time that President Putin has a talk to the cap-
tain of Arab Crown)
Some people feel unnatural or wrong in this
example. But when the original sentences do
not have�	s�o(hajimete:first)�, the sum-
mary sentences are correct. The example is shown
Exp.18 without�	s�o(hajimete:first)�.
Exp.18)����Gw�U���Hw��	 q
q�b�{
�����Gw�U���Hw��	 qq�{
This example gives us no unnaturalness. We
think the verbal noun affect this. The verbal noun
represents that indicates the kind. The verbal op-
eration of verbal noun is varied by humans.
We think concretely about�q�
(kaidan:meeting)�of Exp.17and Exp.18.
First, we think that�q�(kaidan:meeting)�
is complemented the verbal noun�q�b�
(kaidan suru:have a talk)�. The predicate is
generally at sentence end in Japanese. When
the predicate does not exist in a sentence, it is
inclinable in human thought that sentence end
term is predicate. The other hand, we think
that�q�(kaidan:colloquy)�is nominal
or verbal operation in Exp.18 because�q�
(kaidan:meeing)�is not sentence end. then
when�q�(kaidan:meeting)�is nominal,
human have unnaturalness. And when�q�
(kaidan:meeting)�is verbal, human do not feel
unnatural.
We cite the error summary which sentence end
is noun other than verbal noun in this paper but
the verbal operation of noun is pertained in these
sentences. And the noun of operation verbal is
��Q(kangae:concept)�other than verbal noun.
6.3 Comparison of Machine and Manual
Summaries
We examine the machine and manual sum-
maries. Although many sentences are not much
different, some sentences have big differences for
summarization. One example is shown as fol-
lows, original sentence, its machine summary and
its manual summary respectively.
Exp.19)����
��lh����sr�K��b{
�����
��lh����sr�K�{
�����
��lh����sr�{
(There is graph used the color picture.)
Exp.19 is cut off the honorific phraseology but
the manual summary is cut off�K�(aru)�too.
47
This is shown that�K�(aru)�is dictum phrase-
ology. And the sentence end is��(mo)�. This
is often seen in the news headline. But the pro-
posed technique do not deal with them.
6.4 Summarization Failure
We examine the sentences which are not sum-
marized by the method. We picked up the 200
sentences at random and examine whether or not
it should be summarized. This results is that 9
sentences are missing. The example is shown be-
low with the supposed summary.
Exp.20)	�Z
{T���^�U�.p
C_
^�h{
�	�Z
{T���^�U�.p
C_{
(Mr. Ikemoto’s blob was found from the burned-out site.)
Exp.20 is not summarized. The reason of this
error is caused by an error of the morphological
analysis.
7 Conclusion
In order to generate short and smart style seen
in news headlines, this paper presents a method of
transforming Japanese sentence end expressions
into short style. Our observation reveals that the
end of sentence in the headlines are either nouns
or case particles in many sentences, we thus at-
tempt to summarize them as short as possible. We
have implemented the approach and evaluated in
summarization ratio and their correctness. The
results illustrates that the reduction ratio is 6%
against overall sentence length, and the sentence
is expected to be cut off 2.50 characters per sen-
tence. The length of automatic shortening is ap-
proximately the same as manual summarization.
We also confirmed that 95% of the summaries
were judged to be correct.
Acknowledgment
This work was supported in part by MEXT Grants-in-Aid
for Young Scientists (B) 16700134, and for Scientific Re-
search (A) 16200009, Japan.
Tools and language resources
(1) Nikkei news mail, NIKKEI-goo,
http://nikkeimail.goo.ne.jp/
(2) Nihon Keizai Shimbun Newspaper Corpus,
year 2000, Nihon Keizai Shimbun, Inc.
(3) Chasen, Ver.2.3.3, Matsumoto Lab, Nara In-
stitute of Science and Technology.
http://chasen.naist.jp/hiki/ChaSen/
(4) The Mainichi Newspapers Corpus, year 2000,
Mainichi Newspaper Co., Ltd.
References
[1] Takahiro Fukushima, Terumasa Ehara and Kat-
suhiko Shirai. 1999. Regulation for Reducing Num-
ber of Characters for Sentence Simplification, Pro-
ceedings of The Fifth Annual Meeting of The As-
sociation for Natural Language Processing, pp.221–
224. (in Japanese)
[2] Yuko Ishizako, Akira Kataoka, Shigeru Masuyama
and Seiichi Nakagawa. 1999. Summarization by
Reducing Overlaps and Its Application to TV News
Texts. IPSJ SIG Technical Reports 99-NL-133(7),
pages 45–52. Information Processing Society of
Japan. (in Japanese)
[3] Makoto Mikami, Shigeru Masuyama and Seiichi
Nakagawa. 1999. A Summarization Method by Re-
ducing Redundancy of Each Sentence for Mak-
ing Captions of Newscasting. Journal of Natural
Language Processing Vol.6, No.6, pp.65–81. (in
Japanese)
[4] Kiyonori Ohtake and Kazuhide Yamamoto. 2001.
Paraphrasing Honorifics. Proc. of NLPRS2001
Post-Conference Workshop on Automatic Para-
phrasing: Theories and Applications, pp. 13–20.
[5] Takefumi Oomori, Hidetaka Masuda and Hiroshi
Nakagawa. 2003. Web News Articles Summariza-
tion and its Evaluation using Articles for Mobile
Terminals, IPSJ SIG Technical Reports 2003-NL-
153(1). pages 1–8. Information Processing Society
of Japan. (in Japanese)
[6] Dai Sato, Moritaka Iwakoshi, Hidetaka Masuda
and Hiroshi Nakagawa. 2004. Extraction of Para-
phrasing Patterns from Aligned Corpora of Web
and Mobile Terminal News Articles. IPSJ SIG
Technical Reports 2004-NL-159(27). pages 193–
200. Information Processing Society of Japan. (in
Japanese)
[7] Takahiro Wakao, Terumasa Ehara and Katsuhiko
Shirai. 1997. Summarization Methods Used for
Caption in TV News Programs, IPSJ SIG Technical
Reports 97-NL-122(13). pages 83–89. Information
Processing Society of Japan. (in Japanese)
48
