Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP 2006), pages 355–363,
Sydney, July 2006. c©2006 Association for Computational Linguistics
Fully Automatic Lexicon Expansion
for Domain-oriented Sentiment Analysis
Hiroshi Kanayama Tetsuya Nasukawa
Tokyo Research Laboratory, IBM Japan, Ltd.
1623-14 Shimotsuruma, Yamato-shi, Kanagawa-ken, 242-8502 Japan
{hkana,nasukawa}@jp.ibm.com
Abstract
This paper proposes an unsupervised
lexicon building method for the detec-
tion of polar clauses, which convey pos-
itive or negative aspects in a specific
domain. The lexical entries to be ac-
quired are called polar atoms, the min-
imum human-understandable syntactic
structures that specify the polarity of
clauses. As a clue to obtain candidate
polar atoms, we use context coherency,
the tendency for same polarities to ap-
pear successively in contexts. Using
the overall density and precision of co-
herency in the corpus, the statistical
estimation picks up appropriate polar
atoms among candidates, without any
manual tuning of the threshold values.
The experimental results show that the
precision of polarity assignment with
the automatically acquired lexicon was
94% on average, and our method is ro-
bust for corpora in diverse domains and
for the size of the initial lexicon.
1 Introduction
Sentiment Analysis (SA) (Nasukawa and Yi,
2003; Yi et al., 2003) is a task to recognize
writers’ feelings as expressed in positive or
negative comments, by analyzing unreadably
large numbers of documents. Extensive syn-
tactic patterns enable us to detect sentiment
expressions and to convert them into seman-
tic structures with high precision, as reported
by Kanayama et al. (2004). From the exam-
ple Japanese sentence (1) in the digital cam-
era domain, the SA system extracts a senti-
ment representation as (2), which consists of
a predicate and an argument with positive (+)
polarity.
(1) Kono kamera-ha subarashii-to omou.
‘I think this camera is splendid.’
(2) [+] splendid(camera)
SA in general tends to focus on subjec-
tivesentimentexpressions, whichexplicitlyde-
scribe an author’s preference as in the above
example (1). Objective (or factual) expres-
sions such as in the following examples (3) and
(4) may be out of scope even though they de-
scribe desirable aspects in a specific domain.
However, when customers or corporate users
use SA system for their commercial activities,
such domain-specific expressions have a more
important role, since they convey strong or
weak points of the product more directly, and
may influence their choice to purchase a spe-
cific product, as an example.
(3) Kontorasuto-ga kukkiri-suru.
‘The contrast is sharp.’
(4) Atarashii kishu-ha zuumu-mo tsuite-iru.
‘The new model has a zoom lens, too.’
This paper addresses the Japanese ver-
sion of Domain-oriented Sentiment Analysis,
which identifies polar clauses conveying good-
ness and badness in a specific domain, in-
cluding rather objective expressions. Building
domain-dependent lexicons for many domains
is much harder work than preparing domain-
independent lexicons and syntactic patterns,
because the possible lexical entries are too
numerous, and they may differ in each do-
main. To solve this problem, we have devised
an unsupervised method to acquire domain-
dependent lexical knowledge where a user has
only to collect unannotated domain corpora.
The knowledge to be acquired is a domain-
dependent set of polar atoms. A polar atom is
a minimum syntactic structure specifying po-
larity in a predicative expression. For exam-
ple, to detect polar clauses in the sentences (3)
355
and (4)1, the following polar atoms (5) and (6)
should appear in the lexicon:
(5) [+] kukkiri-suru
‘to be sharp’
(6) [+] tsuku ← zuumu-ga
‘to have ← zoom lens-NOM’
The polar atom (5) specified the positive po-
larity of the verb kukkiri-suru. This atom can
be generally used for this verb regardless of
its arguments. In the polar atom (6), on the
other hand, the nominative case of the verb
tsuku (‘have’) is limited to a specific noun zu-
umu (‘zoom lens’), since the verb tsuku does
not hold the polarity in itself. The automatic
decision for the scopes of the atoms is one of
the major issues.
For lexical learning from unannotated cor-
pora, our method uses context coherency in
terms of polarity, an assumption that polar
clauses with the same polarity appear suc-
cessively unless the context is changed with
adversative expressions. Exploiting this ten-
dency, we can collect candidate polar atoms
with their tentative polarities as those adja-
cent to the polar clauses which have been
identified by their domain-independent polar
atoms in the initial lexicon. We use both intra-
sentential and inter-sentential contexts to ob-
tain more candidate polar atoms.
Our assumption is intuitively reasonable,
but there are many non-polar (neutral) clauses
adjacent to polar clauses. Errors in sentence
delimitation or syntactic parsing also result in
false candidate atoms. Thus, to adopt a can-
didate polar atom for the new lexicon, some
threshold values for the frequencies or ratios
are required, but they depend on the type of
the corpus, the size of the initial lexicon, etc.
Our algorithm is fully automatic in the
sense that the criteria for the adoption of po-
lar atoms are set automatically by statistical
estimation based on the distributions of co-
herency: coherent precision and coherent den-
sity. No manual tuning process is required,
so the algorithm only needs unannotated do-
main corpora and the initial lexicon. Thus
our learning method can be used not only by
the developers of the system, but also by end-
users. This feature is very helpful for users to
1The English translations are included only for con-
venience.
analyze documents in new domains.
In the next section, we review related work,
and Section 3 describes our runtime SA sys-
tem. In Section 4, our assumption for unsu-
pervised learning, context coherency and its
key metrics, coherent precision and coherent
density are discussed. Section 5 describes our
unsupervised learning method. Experimental
resultsareshowninSection6, andweconclude
in Section 7.
2 Related Work
Sentiment analysis has been extensively stud-
ied in recent years. The target of SA in this
paper is wider than in previous work. For ex-
ample, Yu and Hatzivassiloglou (2003) sepa-
rated facts from opinions and assigned polari-
ties only to opinions. In contrast, our system
detects factual polar clauses as well as senti-
ments.
Unsupervised learning for sentiment analy-
sis is also being studied. For example, Hatzi-
vassiloglou and McKeown (1997) labeled ad-
jectives as positive or negative, relying on se-
mantic orientation. Turney (2002) used col-
location with “excellent” or “poor” to obtain
positive and negative clues for document clas-
sification. In this paper, we use contextual
information which is wider than for the con-
texts they used, and address the problem of
acquiring lexical entries from the noisy clues.
Inter-sentential contexts as in our approach
were used as a clue also for subjectivity anal-
ysis (Riloff and Wiebe, 2003; Pang and Lee,
2004), which is two-fold classification into sub-
jective and objective sentences. Compared to
it, this paper solves a more difficult problem:
three-fold classification into positive, negative
and non-polar expressions using imperfect co-
herency in terms of sentiment polarity.
Learningmethodsforphrase-levelsentiment
analysis closely share an objective of our ap-
proach. Popescu and Etzioni (2005) achieved
high-precision opinion phrases extraction by
using relaxation labeling. Their method itera-
tively assigns a polarity to a phrase, relying on
semantic orientation of co-occurring words in
specific relations in a sentence, but the scope
of semantic orientation is limited to within a
sentence. Wilson et al. (2005) proposed su-
pervised learning, dividing the resources into
356
Document
to analyze a45
Sentence
Delimitation ...


Sentences
a63Proposition Detection
Propositions
Clauses
a63Polarity Assignment
+
−
Polarities
Polar Clauses
Modality
Patterns
Conjunctive
Patterns
a42
Polar
Atoms
a45
Figure 1: The flow of the clause-level SA.
prior polarity and context polarity, which are
similar to polar atoms and syntactic patterns
in this paper, respectively. Wilson et al. pre-
pared prior polarities from existing resources,
and learned the context polarities by using
prior polarities and annotated corpora. There-
fore the prerequisite data and learned data
are opposite from those in our approach. We
took the approach used in this paper because
we want to acquire more domain-dependent
knowledge, and context polarity is easier to
access in Japanese2. Our approach and their
work can complement each other.
3 Methodology of Clause-level SA
As Figure 1 illustrates, the flow of our sen-
timent analysis system involves three steps.
The first step is sentence delimitation: the in-
put document is divided into sentences. The
second step is proposition detection: proposi-
tions which can form polar clauses are identi-
fiedineachsentence. Thethirdstepis polarity
assignment: the polarity of each proposition
is examined by considering the polar atoms.
This section describes the last two processes,
which are based on a deep sentiment analy-
sis method analogous to machine translation
(Kanayama et al., 2004) (hereafter “the MT
method”).
3.1 Proposition Detection
Our basic tactic for clause-level SA is the high-
precision detection of polar clauses based on
deep syntactic analysis. ‘Clause-level’ means
that only predicative verbs and adjectives such
2For example, indirect negation such as caused by
a subject “nobody” or a modifier “seldom” is rare in
Japanese.
as in (7) are detected, and adnominal (attribu-
tive) usages of verbs and adjectives as in (8)
are ignored, because utsukushii (‘beautiful’) in
(8) does not convey a positive polarity.
(7) E-ga utsukushii.
‘The picture is beautiful.’
(8) Utsukushii hito-ni aitai.
‘I want to meet a beautiful person.’
Here we use the notion of a proposition as a
clause without modality, led by a predicative
verb or a predicative adjective. The proposi-
tions detected from a sentence are subject to
the assignment of polarities.
Basically, we detect a proposition only at
the head of a syntactic tree3. However, this
limitation reduces the recall of sentiment anal-
ysis to a very low level. In the example (7)
above, utsukushii is the head of the tree, while
those initial clauses in (9) to (11) below are
not. In order to achieve higher recall while
maintaininghighprecision, weapplytwotypes
of syntactic patterns, modality patterns and
conjunctive patterns4, to the tree structures
from the full-parsing.
(9) Sore-ha utsukushii-to omou.
‘I think it is beautiful.’
(10) Sore-ha utsukushiku-nai.
‘It is not beautiful.’
(11) Sore-ga utsukushii-to yoi.
‘I hope it is beautiful.’
Modality patterns match some auxiliary
verbs or corresponding sentence-final expres-
sions, to allow for specific kinds of modality
and negation. One of the typical patterns is
[ v to omou] (‘I think v ’)5, which allows ut-
sukushii in (9) to be a proposition. Also nega-
tion is handled with a modality pattern, such
as [ v nai] (‘not v ’). In this case a neg fea-
ture is attached to the proposition to identify
utsukushii in (10) as a negated proposition.
On the other hand, no proposition is identi-
fied in (11) due to the deliberate absence of
a pattern [ v to yoi] (‘I hope v ’). We used
a total of 103 domain-independent modality
patterns, most of which are derived from the
3This is same as the rightmost part of the sentence
since all Japanese modification is directed left to right.
4These two types of patterns correspond to auxil-
iary patterns in the MT method, and can be applied
independent of domains.
5 v denotes a verb or an adjective.
357
coordinative (roughly ‘and’)
-te, -shi, -ueni, -dakedenaku, -nominarazu
causal (roughly ‘because’)
-tame, -kara, -node
adversative (roughly ‘but’)
-ga, -kedo, -keredo, - monono, -nodaga
Table 1: Japanese conjunctions used for con-
junctive patterns.
MT method, and some patterns are manually
added for this work to achieve higher recall.
Another type of pattern is conjunctive pat-
terns, which allow multiple propositions in a
sentence. We used a total of 22 conjunctive
patterns also derived from the MT method, as
exemplified in Table 1. In such cases of coordi-
native clauses and causal clauses, both clauses
can be polar clauses. On the other hand, no
proposition is identified in a conditional clause
due to the absence of corresponding conjunc-
tive patterns.
3.2 Polarity Assignment Using Polar
Atoms
To assign a polarity to each proposition, po-
lar atoms in the lexicon are compared to the
proposition. A polar atom consists of po-
larity, verb or adjective, and optionally, its
arguments. Example (12) is a simple polar
atom, where no argument is specified. This
atom matches any proposition whose head is
utsukushii. Example (13) is a complex polar
atom, which assigns a negative polarity to any
proposition whose head is the verb kaku and
where the accusative case is miryoku.
(12) [+] utsukushii
‘to be beautiful’
(13) [−] kaku ← miryoku-wo
‘to lack ← attraction-ACC’
A polarity is assigned if there exists a polar
atom for which verb/adjective and the argu-
ments coincide with the proposition, and oth-
erwise no polarity is assigned. The opposite
polarity of the polar atom is assigned to a
proposition which has the neg feature.
We used a total of 3,275 polar atoms, most
of which are derived from an English sentiment
lexicon (Yi et al., 2003).
According to the evaluation of the MT
method (Kanayama et al., 2004), high-
precision sentiment analysis had been achieved
using the polar atoms and patterns, where the
splendid
light have-zoom
small-LCD ⊗ satisfied
⊗
high-price
a73a27
a9
Inter-sentential
Context
a54
a54
Intra-sentential
Context
Figure 2: The concept of the intra- and inter-
sentential contexts, where the polarities are
perfectly coherent. The symbol ‘⊗’ denotes
the existence of an adversative conjunction.
system never took positive sentiment for neg-
ative and vice versa, and judged positive or
negative to neutral expressions in only about
10% cases. However, the recall is too low, and
most of the lexicon is for domain-independent
expressions, and thus we need more lexical en-
tries to grasp the positive and negative aspects
in a specific domain.
4 Context Coherency
This section introduces the intra- and inter-
sententialcontextsinwhichweassume context
coherency for polarity, and describes some pre-
liminary analysis of the assumption.
4.1 Intra-sentential and
Inter-sentential Context
The identification of propositions described
in Section 3.1 clarifies our viewpoint of the
contexts. Here we consider two types of
contexts: intra-sentential context and inter-
sentential context. Figure 2 illustrates the
context coherency in a sample discourse (14),
where the polarities are perfectly coherent.
(14) Kono kamera-ha subarashii-to omou.
‘I think this camera is splendid.’
Karui-shi, zuumu-mo tsuite-iru.
‘It’s light and has a zoom lens.’
Ekishou-ga chiisai-kedo, manzoku-da.
‘Though the LCD is small, I’m satisfied.’
Tada, nedan-ga chotto takai.
‘But, the price is a little high.’
The intra-sentential context is the link be-
tween propositions in a sentence, which are
detected as coordinative or causal clauses. If
there is an adversative conjunction such as
-kedo (‘but’) in the third sentence in (14), a
flag is attached to the relation, as denoted
with ‘⊗’ in Figure 2. Though there are dif-
ferences in syntactic phenomena, this is sim-
358
shikashi (‘however’), demo (‘but’), sorenanoni
(‘even though’), tadashi (‘on condition that’),
dakedo (‘but’), gyakuni (‘on the contrary’),
tohaie (‘although’), keredomo (’however’),
ippou (‘on the other hand’)
Table 2: Inter-sentential adversative expres-
sions.
Domain Post. Sent. Len.
digital cameras 263,934 1,757,917 28.3
movies 163,993 637,054 31.5
mobile phones 155,130 609,072 25.3
cars 159,135 959,831 30.9
Table 3: The corpora from four domains
used in this paper. The “Post.” and “Sent.”
columns denote the numbers of postings and
sentences, respectively. “Len.” is the average
length of sentences (in Japanese characters).
ilar to the semantic orientation proposed by
Hatzivassiloglou and McKeown (1997).
The inter-sentential context is the link be-
tween propositions in the main clauses of pairs
of adjacent sentences in a discourse. The po-
larities are assumed to be the same in the
inter-sentential context, unless there is an ad-
versative expression as those listed in Table 2.
If no proposition is detected as in a nominal
sentence, the context is split. That is, there is
no link between the proposition of the previous
sentence and that of the next sentence.
4.2 Preliminary Study on Context
Coherency
We claim these two types of context can be
used for unsupervised learning as clues to as-
sign a tentative polarity to unknown expres-
sions. To validate our assumption, we con-
ducted preliminary observations using various
corpora.
4.2.1 Corpora
Throughout this paper we used Japanese
corpora from discussion boards in four differ-
ent domains, whose features are shown in Ta-
ble 3. All of the corpora have clues to the
boundaries of postings, so they were suitable
to identify the discourses.
4.2.2 Coherent Precision
How strong is the coherency in the con-
text proposed in Section 4.1? Using the polar
clauses detected by the SA system with the
initial lexicon, we observed the coherent pre-
cision of domain d with lexicon L, defined as:
cp(d,L) = #(Coherent)#(Coherent)+#(Conflict) (15)
where #(Coherent) and #(Conflict) are oc-
currence counts of the same and opposite po-
larities observed between two polar clauses as
observed in the discourse. As the two polar
clauses, we consider the following types:
Window. A polar clause and the nearest po-
lar clause which is found in the preceding
n sentences in the discourse.
Context. Two polar clauses in the intra-
sentential and/or inter-sentential context
described in Section 4.1. This is the view-
point of context in our method.
Table 4 shows the frequencies of coherent
pairs, conflicting pairs, and the coherent pre-
cision for half of the digital camera domain
corpus. “Baseline” is the percentage of posi-
tive clauses among the polar clauses6.
For the “Window” method, we tested for
n=0, 1, 2, and ∞. “0” means two propositions
within a sentence. Apparently, the larger the
window size, the smaller the cp value. When
the window size is “∞”, implying anywhere
within a discourse, the ratio is larger than the
baseline by only 2.7%, and thus these types
of coherency are not reliable even though the
number of clues is relatively large.
“Context” shows the coherency of the two
types of context that we considered. The
cp values are much higher than those in the
“Window” methods, because the relationships
between adjacent pairs of clauses are handled
more appropriately by considering syntactic
trees, adversative conjunctions, etc. The cp
values for inter-sentential and intra-sentential
contexts are almost the same, and thus both
contexts can be used to obtain 2.5 times more
clues for the intra-sentential context. In the
rest of this paper we will use both contexts.
We also observed the coherent precision for
each domain corpus. The results in the cen-
ter column of Table 5 indicate the number
is slightly different among corpora, but all of
them are far from perfect coherency.
6Ifthereisapolarclausewhosepolarityisunknown,
the polarity is correctly predicted with at least 57.0%
precision by assuming “positive”.
359
Model Coherent Conflict cp(d,L)
Baseline 57.0%
Window
n = 0 3,428 1,916 64.1%
n = 1 11,448 6,865 62.5%
n = 2 16,231 10,126 61.6%
n = ∞ 26,365 17,831 59.7%
Context
intra. 2,583 996 72.2%
inter. 3,987 1,533 72.2%
both 6,570 2,529 72.2%
Table 4: Coherent precision with various view-
points of contexts.
Domain cp(d,L) cd(d,L)
digital cameras 72.2% 7.23%
movies 76.7% 18.71%
mobile phones 72.9% 7.31%
cars 73.4% 7.36%
Table 5: Coherent precision and coherent den-
sity for each domain.
4.2.3 Coherent Density
Besides the conflicting cases, there are many
more cases where a polar clause does not ap-
pear in the polar context. We also observed
the coherent density of the domain d with the
lexicon L defined as:
cd(d,L) = #(Coherent)#(Polar) (16)
This indicates the ratio of polar clauses that
appear in the coherent context, among all of
the polar clauses detected by the system.
The right column of Table 5 shows the co-
herent density in each domain. The movie
domain has notably higher coherent density
than the others. This indicates the sentiment
expressions are more frequently used in the
movie domain.
The next section describes the method of
our unsupervised learning using this imperfect
context coherency.
5 Unsupervised Learning for
Acquisition of Polar Atoms
Figure 3 shows the flow of our unsupervised
learning method. First, the runtime SA sys-
tem identifies the polar clauses, and the can-
didate polar atoms are collected. Then, each
candidate atom is validated using the two met-
rics in the previous section, cp and cd, which
are calculated from all of the polar clauses
found in the domain corpus.
Domain
Corpus d
a45
Initial
Lexicon L
a42
SA
a54
Polar
Clauses
contexta45
a18a85
Candidate
Polar Atoms
f(a),p(a),n(a)
cd(d,L)
cp(d,L)
?test
a54
a82a9
a78 a63
? testa45
a14
a45∧ a45 New
Lexicon
Figure 3: The flow of the learning process.
ID Candidate Polar Atom f(a) p(a) n(a)
1* chiisai ‘to be small’ 3,014 226 227
2 shikkari-suru ‘to be firm’ 246 54 10
3 chiisai ← bodii-ga 11 4 0‘to be small ← body-NOM’
4* todoku ← mokuyou-ni 2 0 2‘to be delivered←on Thursday’
Table 6: Examples of candidate polar atoms
and their frequencies. ‘*’ denotes that it
should not be added to the lexicon. f(a), p(a),
and n(a) denote the frequency of the atom and
in positive and negative contexts, respectively.
5.1 Counts of Candidate Polar Atoms
From each proposition which does not have a
polarity, candidate polar atoms in the form of
simple atoms (just a verb or adjective) or com-
plex atoms (a verb or adjective and its right-
most argument consisting of a pair of a noun
and a postpositional) are extracted. For each
candidate polar atom a, the total appearances
f(a), and the occurrences in positive contexts
p(a) and negative contexts n(a) are counted,
based on the context of the adjacent clauses
(using the method described in Section 4.1).
If the proposition has the neg feature, the po-
larity is inverted. Table 6 shows examples of
candidate polar atoms with their frequencies.
5.2 Determination for Adding to
Lexicon
Among the located candidate polar atoms,
how can we distinguish true polar atoms,
which should be added to the lexicon, from
fake polar atoms, which should be discarded?
As shown in Section 4, both the coherent
precision (72-77%) and the coherent density
(7-19%) are so small that we cannot rely on
each single appearance of the atom in the po-
lar context. One possible approach is to set
the threshold values for frequency in a polar
context, max(p(a),n(a)) and for the ratio of
appearances in polar contexts among the to-
360
tal appearances, max(p(a),n(a))f(a) . However, the
optimum threshold values should depend on
the corpus and the initial lexicon.
In order to set general criteria, here we as-
sume that a true positive polar atom a should
have higher p(a)f(a) than its average i.e. coher-
ent density, cd(d,L+a), and also have higher
p(a)
p(a)+n(a) than its average i.e. coherent preci-
sion, cp(d,L+a) and these criteria should be
met with 90% confidence, where L+a is the
initial lexicon with a added. Assuming the bi-
nomial distribution, a candidate polar atom is
adopted as a positive polar atom7 if both (17)
and (18) are satisfied8.
q > cd(d,L),
where
p(a)summationdisplay
k=0
f(a)Ckqk(1−q)f(a)−k = 0.9
(17)
r > cp(d,L) or n(a) = 0,
where
p(a)summationdisplay
k=0
p(a)+n(a)Ckrk(1−r)p(a)+n(a)−k= 0.9
(18)
We can assume cd(d,L+a) similarequal cd(d,L), and
cp(d,L+a) similarequal cp(d,L) when L is large. We
compute the confidence interval using approx-
imation with the F-distribution (Blyth, 1986).
These criteria solve the problems in mini-
mum frequency and scope of the polar atoms
simultaneously. In the example of Table 6, the
simple atom chiisai (ID=1) is discarded be-
cause it does not meet (18), while the complex
atom chiisai ← bodii-ga (ID=3) is adopted
as a positive atom. shikkari-suru (ID=2)
is adopted as a positive simple atom, even
though 10 cases out of 64 were observed in the
negative context. On the other hand, todoku
← mokuyou-ni (ID=4) is discarded because it
does not meet (17), even though n(a)f(a) = 1.0,
i.e. always observed in negative contexts.
6 Evaluation
6.1 Evaluation by Polar Atoms
First we propose a method of evaluation of the
lexical learning.
7The criteria for the negative atoms are analogous.
8nCr notation is used here for combination (n
choose k).
Annotator B
Positive Neutral Negative
Anno- Positive 65 11 3
tator Neutral 3 72 0
A Negative 1 4 41
Table 7: Agreement of two annotators’ judg-
ments of 200 polar atoms. κ=0.83.
It is costly to make consistent and large
‘gold standards’ in multiple domains, espe-
cially in identification tasks such as clause-
level SA (cf. classification tasks). Therefore
we evaluated the learning results by asking hu-
man annotators to classify the acquired polar
atoms as positive, negative, and neutral, in-
stead of the instances of polar clauses detected
with the new lexicon. This can be done be-
cause the polar atoms themselves are informa-
tive enough to imply to humans whether the
expressions hold positive or negative meanings
in the domain.
To justify the reliability of this evaluation
method, two annotators9 evaluated 200 ran-
domly selected candidate polar atoms in the
digital camera domain. The agreement results
are shown in Table 7. The manual classifi-
cation was agreed upon in 89% of the cases
and the Kappa value was 0.83, which is high
enough to be considered consistent.
Using manual judgment of the polar atoms,
we evaluated the performance with the follow-
ing three metrics.
Type Precision. The coincidence rate of the
polarity between the acquired polar atom
and the human evaluators’ judgments. It
is always false if the evaluators judged it
as ‘neutral.’
Token Precision. The coincidence rate of
the polarity, weighted by its frequency in
the corpus. This metric emulates the pre-
cision of the detection of polar clauses
with newly acquired poler atoms, in the
runtime SA system.
Relative Recall. The estimated ratio of the
number of detected polar clauses with the
expanded lexicon to the number of de-
tected polar clauses with the initial lex-
9For each domain, we asked different annotators
who are familiar with the domain. They are not the
authors of this paper.
361
Domain # Type Token RelativePrec. Prec. Recall
digital cameras 708 65% 96.5% 1.28
movies 462 75% 94.4% 1.19
mobile phones 228 54% 92.1% 1.13
cars 487 68% 91.5% 1.18
Table 8: Evaluation results with our method.
The column ‘#’ denotes the number of polar
atoms acquired in each domain.
icon. Relative recall will be 1 when no
newpolaratomisacquired. Sincethepre-
cision was high enough, this metric can
be used for approximation of the recall,
which is hard to evaluate in extraction
tasks such as clause-/phrase-level SA.
6.2 Robustness for Different
Conditions
6.2.1 Diversity of Corpora
For each of the four domain corpora, the an-
notators evaluated 100 randomly selected po-
lar atoms which were newly acquired by our
method, to measure the precisions. Relative
recall is estimated by comparing the numbers
of detected polar clauses from randomly se-
lected 2,000 sentences, with and without the
acquired polar atoms. Table 8 shows the re-
sults. The token precision is higher than 90%
in all of the corpora, including the movie do-
main, which is considered to be difficult for SA
(Turney, 2002). This is extremely high preci-
sion for this task, because the correctness of
both the extraction and polarity assignment
was evaluated simultaneously. The relative re-
call 1.28 in the digital camera domain means
the recall is increased from 43%10 to 55%. The
difference was smaller in other domains, but
the domain-dependent polar clauses are much
informative than general ones, thus the high-
precision detection significantly enhances the
system.
To see the effects of our method, we con-
ducted a control experiment which used pre-
set criteria. To adopt the candidate atom a,
the frequency of polarity, max(p(a),n(a)) was
required to be 3 or more, and the ratio of po-
larity, max(p(a),n(a))f(a) was required to be higher
than the threshold θ. Varying θ from 0.05 to
10The human evaluation result for digital camera do-
main (Kanayama et al., 2004).
a54
∼ a45
Relative recall
Token
precision
0.5
1
1.0 1.1 1.2
star
star θ = 0.05
star θ = 0.1starstarθ = 0.3starstarstar
star
starθ = 0.8
star stardigital cameras
•
• θ = 0.05•θ = 0.1
•
• θ = 0.3•
•
••
•
• •movies
(our method)
a14a89
Figure 4: Relative recall vs. token precision
with various preset threshold values θ for the
digital camera and movie domains. The right-
most star and circle denote the performance of
our method.
0.8, we evaluated the token precision and the
relative recall in the domains of digital cam-
eras and movies. Figure 4 shows the results.
The results showed both relative recall and
token precision were lower than in our method
for every θ, in both corpora. The optimum θ
was 0.3 in the movie domain and 0.1 in the
digital camera domain. Therefore, in this pre-
set approach, a tuning process is necessary for
each domain. Our method does not require
this tuning, and thus fully automatic learning
was possible.
Unlike the normal precision-recall tradeoff,
the token precision in the movie domain got
lower when the θ is strict. This is due to the
frequent polar atoms which can be acquired
at the low ratios of the polarity. Our method
does not discard these important polar atoms.
6.2.2 Size of the Initial Lexicon
We also tested the performance while vary-
ing the size of the initial lexicon L. We pre-
pared three subsets of the initial lexicon, L0.8,
L0.5, and L0.2, removing polar atoms ran-
domly. These lexicons had 0.8, 0.5, 0.2 times
the polar atoms, respectively, compared to
L. Table 9 shows the precisions and recalls
using these lexicons for the learning process.
Though the cd values vary, the precision was
stable, which means that our method was ro-
bust even for different sizes of the lexicon. The
smaller the initial lexicon, the higher the rela-
tive recall, because the polar atoms which were
removed from L were recovered in the learning
process. This result suggests the possibility of
362
lexicon cd Token Prec. Relative Rec.
L 7.2% 96.5% 1.28
L0.8 6.1% 97.5% 1.41
L0.5 3.9% 94.2% 2.10
L0.2 3.6% 84.8% 3.55
Table 9: Evaluation results for various sizes of
the initial lexicon (the digital camera domain).
the bootstrapping method from a small initial
lexicon.
6.3 Qualitative Evaluation
As seen in the agreement study, the polar
atoms used in our study were intrinsically
meaningful to humans. This is because the
atoms are predicate-argument structures de-
rived from predicative clauses, and thus hu-
mans could imagine the meaning of a polar
atom by generating the corresponding sen-
tence in its predicative form.
In the evaluation process, some interesting
results were observed. For example, a nega-
tive atom nai ← kerare-ga (‘to be free from
vignetting’) was acquired in the digital cam-
era domain. Even the evaluator who was fa-
miliar with digital cameras did not know the
term kerare (‘vignetting’), but after looking up
the dictionary she labeled it as negative. Our
learning method could pick up such technical
terms and labeled them appropriately.
Also, there were discoveries in the error
analysis. An evaluator assigned positive to aru
← kamera-ga (‘to have camera’) in the mobile
phone domain, but the acquired polar atom
had the negative polarity. This was actually
an insight from the recent opinions that many
userswantphoneswithoutcamerafunctions11.
7 Conclusion
We proposed an unsupervised method to ac-
quire polar atoms for domain-oriented SA, and
demonstrated its high performance. The lex-
icon can be expanded automatically by us-
ing unannotated corpora, and tuning of the
threshold values is not required. Therefore
even end-users can use this approach to im-
prove the sentiment analysis. These features
allow them to do on-demand analysis of more
narrow domains, such as the domain of digital
11Perhaps because cameras tend to consume battery
power and some users don’t need them.
cameras of a specific manufacturer, or the do-
main of mobile phones from the female users’
point of view.

References
C. R. Blyth. 1986. Approximate binomial confi-
dence limits. Journal of the American Statistical
Asscoiation, 81(395):843–855.
Vasileios Hatzivassiloglou and Kathleen R. McKe-
own. 1997. Predicting the semantic orientation
of adjectives. In Proceedings of the 35th ACL
and the 8th EACL, pages 174–181.
Hiroshi Kanayama, Tetsuya Nasukawa, and Hideo
Watanabe. 2004. Deeper sentiment analysis us-
ing machine translation technology. In Proceed-
ings of the 20th COLING, pages 494–500.
Tetsuya Nasukawa and Jeonghee Yi. 2003. Senti-
ment analysis: Capturing favorability using nat-
ural language processing. In Proceedings of the
Second K-CAP, pages 70–77.
Bo Pang and Lillian Lee. 2004. A sentimental
education: Sentiment analysis using subjectiv-
ity summarization based on minimum cuts. In
Proceedings of the 42nd ACL, pages 271–278.
Ana-Maria Popescu and Oren Etzioni. 2005. Ex-
tracting product features and opinions from
reviews. In Proceedings of HLT/EMNLP-05,
pages 339–346.
Ellen Riloff and Janyee Wiebe. 2003. Learning ex-
traction patterns for subjective expressions. In
Proceedings of EMNLP-03, pages 105–112.
Peter D. Turney. 2002. Thumbs up or thumbs
down? Semantic orientation applied to unsuper-
vised classification of reviews. In Proceedings of
the 40th ACL, pages 417–424.
Theresa Wilson, Janyce Wiebe, and Paul Hoff-
mann. 2005. Recognizing contextual polarity in
phrase-level sentiment analysis. In Proceedings
of HLT/EMNLP-05, pages 347–354.
Jeonghee Yi, Tetsuya Nasukawa, Razvan Bunescu,
and Wayne Niblack. 2003. Sentiment analyzer:
Extracting sentiments about a given topic using
natural language processing techniques. In Pro-
ceedings of the Third IEEE International Con-
ference on Data Mining, pages 427–434.
Hong Yu and Vasileios Hatzivassiloglou. 2003. To-
wards answering opinion questions: Separating
facts from opinions and identifying the polarity
of opinion sentences. In Proceedings of EMNLP-
2003, pages 129–136.
