<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1071">
  <Title>Deeper Sentiment Analysis Using Machine Translation Technology</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Sentiment analysis (SA) (Nasukawa and Yi, 2003; Yi et al., 2003) is the task of identifying writers' feelings, as expressed in positive or negative comments, questions, and requests, by analyzing large numbers of documents. SA is becoming a useful tool for the commercial activities of both companies and individual consumers, because they want to sort out opinions about products, services, or brands that are scattered across online texts such as product review articles, replies to questionnaires, and messages on bulletin boards on the WWW.</Paragraph>
    <Paragraph position="1"> This paper describes a method to extract a set of sentiment units from sentences, which is the key component of SA. A sentiment unit is a tuple of a sentiment¹, a predicate, and its arguments. For example, the sentences in (1), taken from a customer's review of a digital camera, contain three sentiment units: (1a), (1b), and (1c). These units indicate that the camera has good features in its lens and recharger, and a bad feature in its price.</Paragraph>
    <Paragraph position="2"> It has excellent lens, but the price is too high.</Paragraph>
    <Paragraph position="3"> I don't think the quality of the recharger has any problem.</Paragraph>
    <Paragraph position="5"> Footnote 1: Possible values of a sentiment are 'favorable', 'unfavorable', 'question', and 'request'. In this paper the discussion is mostly focused on the first two values.</Paragraph>
    <Paragraph position="6"> The extraction of these sentiment units is not a trivial task because many syntactic and semantic operations are required. First, the structure of a predicate and its arguments may be changed from the syntactic form, as in (1a) and (1c). Modal, aspectual, and negation information must also be handled, as in (1c). Second, a sentiment unit should be constructed as the smallest possible informative unit so that it is easy to handle in the organizing processes after extraction. In (1b) the degree adverb 'too' is omitted to normalize the expression. For (1c), the predicate 'problematic' takes the argument 'recharger' instead of the head word of the noun phrase 'the quality of the recharger', because 'quality' alone is not informative enough to describe the sentiment toward an attribute of a real-world object. Moreover, disambiguation of sentiments is necessary: in (1b) the adjective 'high' has the 'unfavorable' feature, but 'high' can be treated as 'favorable' in the expression &quot;resolution is high&quot;.</Paragraph>
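    <Paragraph> As a rough illustration of the points above, a sentiment unit can be modeled as a small record whose polarity is looked up per predicate and argument and flipped under negation. This is a minimal Python sketch under our own assumptions, not the transfer-based implementation used in this paper: the lexicon entries, the make_unit helper, and the negated field are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon: polarity may depend on the argument, so the key
# is (predicate, argument); a None argument marks the default entry.
POLARITY = {
    ("excellent", None): "favorable",
    ("problematic", None): "unfavorable",
    ("high", "price"): "unfavorable",
    ("high", "resolution"): "favorable",
}

FLIP = {"favorable": "unfavorable", "unfavorable": "favorable"}


@dataclass(frozen=True)
class SentimentUnit:
    sentiment: str         # 'favorable' or 'unfavorable' in this sketch
    predicate: str         # normalized predicate, e.g. 'high' rather than 'too high'
    arguments: tuple       # informative arguments, e.g. ('recharger',)
    negated: bool = False  # True when the predicate appeared under negation


def make_unit(predicate, argument, negated=False):
    """Look up polarity for (predicate, argument) and flip it under negation."""
    polarity = POLARITY.get((predicate, argument))
    if polarity is None:
        polarity = POLARITY.get((predicate, None))
    if polarity is None:
        return None  # the toy lexicon has no sentiment for this predicate
    if negated:
        polarity = FLIP[polarity]
    return SentimentUnit(polarity, predicate, (argument,), negated)


# Hand-built inputs mirroring the camera review; no parsing is attempted here.
units = [
    make_unit("excellent", "lens"),
    make_unit("high", "price"),
    make_unit("problematic", "recharger", negated=True),
]
```

    Keying the lexicon on the predicate together with its argument is one simple way to let 'high' be unfavorable for 'price' but favorable for 'resolution'; a real system would derive the predicate, argument, and negation from a parse rather than take them as inputs.</Paragraph>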
    <Paragraph position="7"> We regard this task as translation from text to sentiment units, because we noticed that the deep language analysis techniques which are required for the extraction of sentiment units are analogous to those which have been studied for the purpose of language translation. We implemented an accurate sentiment analyzer by making use of an existing transfer-based machine translation engine (Watanabe, 1992), replacing the translation patterns and bilingual lexicons with sentiment patterns and a sentiment polarity lexicon. Although we used many techniques for deep language analysis, the system was implemented at a surprisingly low development cost because the techniques for machine translation could be reused in the architecture described in this paper.</Paragraph>
    <Paragraph position="8"> We aimed at high-precision extraction of sentiment units. In other words, our SA system attaches importance to each individual sentiment expression, rather than to the quantitative tendencies of reputation. This meets the requirement of SA users who want to know not only the overall goodness of an object, but also the breakdown of opinions. For example, when there are many positive opinions and only one negative opinion, the negative one should not be ignored because of its low percentage, but should be investigated thoroughly, since valuable knowledge is often found in such a minority opinion. Figure 1 shows an example of the SA output. The outliner organizes positive and negative opinions by topic words, and provides references to the original text.</Paragraph>
    <Paragraph position="9"> [Figure 2 caption; beginning lost in extraction] ... engine and the sentiment analyzer. Some components are shared between them, and other components are similar between MT and SA.</Paragraph>
    <Paragraph position="10"> This means that the approach for SA should shift from the rather shallow analysis techniques used for text mining (Hearst, 1999; Nasukawa and Nagano, 2001), where some errors can be treated as noise, to deep analysis techniques such as those used for machine translation (MT), where all of the syntactic and semantic phenomena must be handled.</Paragraph>
    <Paragraph position="11"> We implemented a Japanese SA system using a Japanese-to-English translation engine. Figure 2 illustrates our SA system, which utilizes an MT engine, where techniques for parsing and pattern matching on the tree structures are shared between MT and SA.</Paragraph>
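    <Paragraph> The idea of sharing one pattern-matching step between MT and SA while swapping only the patterns and lexicons can be sketched as follows. This is a deliberately simplified Python illustration under our own assumptions, not the engine of Watanabe (1992): the node format, the apply_patterns function, the two builder functions, and the romanized Japanese toy lexicons are all hypothetical.

```python
# Toy dependency node: (head word, {role: dependent word}).
# Every name and resource below is a hypothetical illustration.

def apply_patterns(node, patterns, lexicon):
    """Shared engine step: run the first pattern whose roles are all present."""
    head, deps = node
    if head not in lexicon:
        return None
    for required_roles, build in patterns:
        if all(role in deps for role in required_roles):
            return build(head, deps, lexicon)
    return None


def build_translation(head, deps, lexicon):
    # MT output: a target-language sentence built from the bilingual lexicon.
    subject = lexicon.get(deps["subject"], deps["subject"])
    return f"The {subject} is {lexicon[head]}."


def build_sentiment_unit(head, deps, lexicon):
    # SA output: a (sentiment, predicate, argument) tuple from the polarity lexicon.
    return (lexicon[head], head, deps["subject"])


# MT resources: bilingual lexicon plus translation patterns.
MT_LEXICON = {"takai": "high", "nedan": "price"}
MT_PATTERNS = [(("subject",), build_translation)]

# SA resources: polarity lexicon plus sentiment patterns.
SA_LEXICON = {"takai": "unfavorable"}
SA_PATTERNS = [(("subject",), build_sentiment_unit)]

node = ("takai", {"subject": "nedan"})  # romanized Japanese: "the price is high"
translation = apply_patterns(node, MT_PATTERNS, MT_LEXICON)
sentiment_unit = apply_patterns(node, SA_PATTERNS, SA_LEXICON)
```

    In this sketch the matching logic is identical for both tasks and only the resources differ, which is the property that makes reuse inexpensive. The toy polarity lexicon ignores the argument-dependent disambiguation discussed earlier.</Paragraph>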
    <Paragraph position="12"> Section 2 reviews previous studies of sentiment analysis. In Section 3 we define the sentiment unit to be extracted for sentiment analysis. Section 4 presents the implementation of our system, comparing its operations and resources with those used for machine translation. Our system is evaluated in Section 5. In the rest of the paper we mainly use Japanese examples because some of the operations depend on the Japanese language, but we also use English examples to express the sentiment units and some language-independent issues, for ease of understanding.</Paragraph>
  </Section>
</Paper>