<?xml version="1.0" standalone="yes"?> <Paper uid="W06-0304"> <Title>User-directed Sentiment Analysis: Visualizing the Affective Content of Documents</Title> <Section position="5" start_page="23" end_page="26" type="metho"> <SectionTitle> 3 The Approach </SectionTitle> <Paragraph position="0"> Our methodology combines a traditional lexical approach to scoring documents for affect with a mature visualization tool. We first automatically identify affect by comparing each document against a lexicon of affect-bearing words and obtain an affect score for each document. We provide a number of visual metaphors to represent the affect in the collection and a number of tools that can be used to interactively explore the affective content of the data.</Paragraph> <Section position="1" start_page="23" end_page="24" type="sub_section"> <SectionTitle> 3.1 Lexicon and Measurement </SectionTitle> <Paragraph position="0"> We use a lexicon of affect-bearing words to identify the distribution of affect in the documents.</Paragraph> <Paragraph position="1"> Our lexicon authoring system allows affect-bearing terms, and their associated strengths, to be bulk loaded, declared manually, or algorithmically suggested. In this paper, we use a lexicon derived from the General Inquirer (GI) and supplemented with lexical items derived from a semi-supervised bootstrapping task. The GI tool is a computer-assisted approach for content analyses of textual data (Stone, 1977). It includes an extensive lexicon of over 11,000 hand-coded word stems and 182 categories.</Paragraph> <Paragraph position="2"> We used this lexicon, specifically the positive and negative axes, to create a larger lexicon by bootstrapping. Lexical bootstrapping is a method used to help expand dictionaries of semantic categories (Riloff & Jones, 1999) in the context of a document set of interest. 
The approach we have adopted begins with a lexicon of affect-bearing words (POS and NEG) and a corpus.</Paragraph> <Paragraph position="3"> Each document in the corpus receives an affect score by counting the number of words from the seed lexicon that occur in the document; a separate score is given for each affect axis. Words in the corpus are then scored for affect potential by comparing their distribution of occurrence over the set of documents (using an L1 distribution metric) to the distribution of the affect-bearing words. Words whose distributions compare favorably are hypothesized to be affect-bearing. Results are then manually culled to determine whether they should in fact be included in the lexicon.</Paragraph> <Paragraph position="4"> Here we report on results using a lexicon built from 8 affect categories, comprising 4 concept pairs. Each document in the collection is compared against all 8 affect categories and receives a score for each. Scores are based on the summation of each affect axis in the document, normalized by the number of words in the document. This provides, for example, the overall proportion of positive words per document. Scores can also be calculated as the summation of each axis, normalized by the total number of affect words across all axes. This allows one to quickly estimate the balance of affect in the documents. For example, using this measurement, one could see whether a particular document contains as many positive as negative terms, or whether it is heavily skewed towards one or the other.</Paragraph> <Paragraph position="5"> While the results reported here are based on a predefined lexicon, our system does include a Lexicon Editor in which a user can manually enter their own lexicon or add strengths to lexical items. Included in the editor is a Lexicon Bootstrapping Utility that the user can use to help create a specialized lexicon of their own. This utility runs as described above.
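The scoring and bootstrapping procedures described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy lexicon, the axis names, and the function names are ours, strengths are fixed at the default of 0.5 described below, and the manual culling step is omitted.

```python
# Toy seed lexicon: each axis maps to a set of affect-bearing words.
# The words and the two-axis lexicon are illustrative; the paper uses
# 8 axes (4 concept pairs) derived from the General Inquirer.
LEXICON = {
    "positive": {"good", "great", "excellent", "love"},
    "negative": {"bad", "poor", "terrible", "hate"},
}

def affect_scores(tokens, lexicon, normalize_by="doc_length"):
    """Per-axis counts of lexicon hits, normalized either by document
    length (overall proportion of, e.g., positive words) or by the total
    number of affect words across all axes (balance of affect)."""
    counts = {axis: sum(1 for t in tokens if t in words)
              for axis, words in lexicon.items()}
    total_affect = sum(counts.values())
    denom = len(tokens) if normalize_by == "doc_length" else total_affect
    if denom == 0:
        return {axis: 0.0 for axis in counts}
    return {axis: c / denom for axis, c in counts.items()}

def l1_distance(p, q, doc_ids):
    """L1 distance between two distributions over the document set."""
    return sum(abs(p.get(d, 0.0) - q.get(d, 0.0)) for d in doc_ids)

def bootstrap_candidates(docs, seed_words, top_k=5):
    """Rank candidate words by how closely their distribution of occurrence
    over the documents matches that of the seed affect words (L1 metric).
    Low distance = plausible affect-bearing word; in the full procedure the
    ranked results would then be manually culled."""
    doc_ids = range(len(docs))

    def distribution(words):
        hits = {d: sum(1 for t in docs[d] if t in words) for d in doc_ids}
        total = sum(hits.values()) or 1
        return {d: h / total for d, h in hits.items()}

    seed_dist = distribution(seed_words)
    vocab = {t for doc in docs for t in doc} - seed_words
    ranked = sorted(vocab,
                    key=lambda w: l1_distance(distribution({w}), seed_dist,
                                              doc_ids))
    return ranked[:top_k]
```

Calling `affect_scores(review_tokens, LEXICON)` gives the per-document proportions used below, while `normalize_by="affect"` yields the positive/negative balance.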
Note that while the system supports strength values, we have not experimented with that variable here. All words for all axes have a default strength of 0.5.</Paragraph> </Section> <Section position="2" start_page="24" end_page="26" type="sub_section"> <SectionTitle> 3.2 Visualization </SectionTitle> <Paragraph position="0"> To visualize the affective content of a collection of documents, we combined a variety of visual metaphors with IN-SPIRE, a tool designed for visual analytics of documents.</Paragraph> <Paragraph position="1"> IN-SPIRE (Hetzler and Turner, 2004) is a visual analytics tool designed to facilitate rapid understanding of large textual corpora. IN-SPIRE generates a compiled document set from mathematical signatures for each document in a set. Document signatures are clustered according to common themes to enable information exploration and visualization. Information is presented to the user through several visual metaphors that expose different facets of the textual data. The central visual metaphor is a Galaxy view of the corpus that allows users to intuitively interact with thousands of documents, examining them by theme (see Figure 4, below). IN-SPIRE leverages context vectors such as LSA (Deerwester et al., 1990) for document clustering and projection. Additional analytic tools allow exploration of temporal trends, thematic distribution by source or other metadata, and query relationships and overlaps. IN-SPIRE was recently enhanced to support visual analysis of sentiment.</Paragraph> <Paragraph position="2"> In selecting metaphors to represent the affect scores of documents, we started by identifying the kinds of questions that users would want to explore. Consider, as a guiding example, a set of customer reviews for several commercial products (Hu & Liu, 2004). A user reviewing this data might be interested in a number of questions, such as:
* What is the range of affect overall?
* Which products are viewed most positively?
Most negatively?
* What is the range of affect for a particular product?
* How does the affect in the reviews deviate from the norm? Which reviews are more negative or positive than would be expected from the averages?
* How does the feedback for one product compare to that for another?
* Can we isolate the affect as it pertains to different features of the products?
In selecting a base metaphor for affect, we wanted to be able to address these kinds of questions. We wanted a metaphor that would support viewing affect axes individually as well as in pairs. In addition to representing the most common axes, negative and positive, we wanted the flexibility to portray multiple pairs, because we suspect that additional axes will help the user explore nuances of emotion in the data. For our current metaphor, we drew inspiration from the Rose plot used by Florence Nightingale (Wainer, 1997). This metaphor is appealing in that it is easily interpreted, larger scores draw more attention, and measures are shown in consistent relative locations, making it easier to compare measures across document groups. We use a modified version of this metaphor in which each axis is represented individually but is also paired with its opposite to aid direct comparisons. To this end, we vary the spacing between the rose petals to reinforce the pairing. We also use color; each pair has a common hue, with the more positive of the pair shown in a lighter shade and the more negative one in a darker shade (see Figure 1).</Paragraph> <Paragraph position="3"> To address how much the range of affect varies across a set of documents, we adapted the concept of a box plot to the rose petal. For each axis, we show the median and quartile values as shown in the figure below. The dark line indicates the median value and the color band portrays the quartiles.
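Given per-document axis scores like those computed in Section 3.1, the three numbers each box-plot petal encodes can be sketched with the standard library. This is a sketch only; the function name is ours, and IN-SPIRE's exact quartile method is not specified in the text.

```python
import statistics

def axis_summary(scores):
    """Median and quartile band for one affect axis over a group of
    documents: the dark line on a petal is the median, and the color
    band spans the first-to-third-quartile range."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, med, q3 = statistics.quantiles(scores, n=4)
    return {"q1": q1, "median": med, "q3": q3}
```

A wide gap between `q1` and `q3` corresponds to the large variation visible in a plot such as Figure 1.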
In the plot in Figure 1, for example, the scores vary considerably.</Paragraph> <Paragraph position="4"> [Figure 1 caption fragment: &quot;... quartile variation.&quot;]</Paragraph> <Paragraph position="5"> Another variation we made on the base metaphor addresses a more subtle set of questions. It may happen that the affect scores within a dataset are largely driven by document membership in particular groups. For example, in our customer data, it may be that all documents about Product A are relatively positive while those about Product B are relatively negative. A user wanting to understand customer complaints then has a subtler need: it is not sufficient to look only at the most negative documents in the dataset, because none of the Product A documents may pass this threshold. What may also help is to look at all documents that are more negative than one would expect, given the product they discuss. To carry out this calculation, we use a statistical technique to calculate the Main (or expected) affect value for each group and the Residual (or deviation) affect value for each document with respect to its group (Scheffe, 1999). To convey the Residual concept, we needed a representation of deviation from an expected value, and we wanted this portrayal to be similar to the base metaphor. We use a unit circle to portray the expected value and show deviation by drawing the appropriate rose petals either outside (larger than expected) or inside (smaller than expected) the unit circle, with the amount of color showing the degree of deviation from the expected value.</Paragraph> <Paragraph position="6"> In the figures below, the dotted circle represents the expected value. The glyph on the left shows a cluster with scores slightly higher than expected for the Positive and Cooperation affect axes.
The glyph on the right shows a cluster with scores slightly higher than expected for the Negative and Vice affect axes (Figure 2).</Paragraph> <Paragraph position="7"> [Figure 2 caption fragment: &quot;... from expected values.&quot;]</Paragraph> <Paragraph position="8"> We have incorporated several interaction capabilities for further exploration of the affect. Our analysis system allows users to group documents in numerous ways, such as by query results, by metadata (such as the product), by time frame, and by similarity in themes. A user can select one or more of these groups and see a summary of affect and its variation in those groups. In addition, the group members are clustered by their affect scores, and glyphs of the residual, or variation from expected value, are shown for each of these sub-group clusters.</Paragraph> <Paragraph position="9"> Below each rose we display a small histogram showing the number of documents represented by that glyph (see Figure 3). These histograms allow comparison of affect to cluster or group size. For example, we find that extreme affect scores are typically found in the smaller clusters, while larger ones often show more mid-range scores. As the user selects document groups or clusters, we show the proportion of documents selected.</Paragraph> <Paragraph position="10"> [Figure 3 caption fragment: &quot;... plot per cluster.&quot;]</Paragraph> <Paragraph position="11"> The interaction may also be driven from the affect side. If a given cluster of affect characteristics is selected, the user can see the themes those documents represent, how they correlate with metadata, or their time distribution.
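The Main/Residual idea above can be illustrated with a minimal one-way version that takes the group mean as the expected value. This is a sketch under that assumption; the system's actual Scheffe-style computation may differ, and the function name is ours.

```python
def main_and_residuals(groups):
    """For each named group of per-document scores on one affect axis,
    compute the Main (expected) value as the group mean and a Residual
    (deviation from expected) for each document. In the glyphs, petals
    drawn outside the unit circle correspond to positive residuals and
    petals drawn inside to negative ones."""
    result = {}
    for name, scores in groups.items():
        main = sum(scores) / len(scores)
        result[name] = {
            "main": main,
            "residuals": [s - main for s in scores],
        }
    return result
```

For the customer-review example, a Product A document can have a negative residual (more negative than expected for Product A) even when its raw score is higher than every Product B score.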
We illustrate how the affect visualization and interaction fit into a larger analysis with a brief case study.</Paragraph> </Section> </Section> <Section position="6" start_page="26" end_page="28" type="metho"> <SectionTitle> 4 Case study </SectionTitle> <Paragraph position="0"> The IN-SPIRE visualization tool is not data-specific; it is designed to explore large amounts of textual data across a variety of genres and document types (doc, xml, etc.). Many users of the system have their own data sets they wish to explore (e.g., company-internal documents), or data can be harvested directly from the web, either in a single web harvest or dynamically. The case study and dataset presented here are intended as an example only; they do not represent the full range of capabilities for exploring the affective content of datasets.</Paragraph> <Paragraph position="1"> We explore a set of customer reviews, comprising a collection of Amazon reviews for five products (Hu & Liu, 2004). While a customer may not want to explore reviews for 5 different product types at once, the dataset is realistic in that a web harvest of one review site will contain reviews of multiple products. This allows us to demonstrate how the tool enables users to focus on the data and comparisons that they are interested in exploring. The 5 products in this dataset are: [product list lost in extraction; fragment: &quot;player&quot;]. We begin by clustering the reviews based on overall thematic content. The labels are automatically generated and indicate some of the stronger theme combinations in this dataset.</Paragraph> <Paragraph position="2"> These clusters are driven largely by product vocabulary.
The two cameras cluster in the lower portion; the Zen shows up in the upper right clusters, with the phone in the middle and the Apex DVD player in the upper left and upper middle.</Paragraph> <Paragraph position="3"> [Figure 4 caption fragment: &quot;In this image, the pink dots are the Apex DVD ...&quot;] The affect measurements on these documents generate five clusters in our system, each of which is summarized with a rose plot showing affect variation. This gives us information on the overall range and distribution of affect in this data. We can select one of these plots, either to review the documents or to interact further. Selection is indicated with a green border, as shown in the upper middle plot of Figure 5.</Paragraph> <Paragraph position="4"> Figure 5. Clusters by affect, with one cluster glyph selected.</Paragraph> <Paragraph position="5"> The selected documents are relatively positive; they have higher scores in the Positive and Virtue axes and lower scores in the Negative axis. We may want to see how the documents in this affect cluster distribute over the five products. This question is answered by the correlation tool, shown in Figure 6; the positive affect cluster contains more reviews of the Zen MP3 player than of any of the other products.</Paragraph> <Paragraph position="6"> Figure 6. Products represented in one of the positive affect clusters.</Paragraph> <Paragraph position="7"> Alternatively, we could get a summary of affect per product. Figure 7 shows the affect for the Apex DVD player and the Nokia cell phone.</Paragraph> <Paragraph position="8"> While both are positive, the Apex has stronger negative ratings than the Nokia.</Paragraph> <Paragraph position="9"> [Figure 7 caption fragment: &quot;... to Apex&quot;] More detail is apparent by looking at the clusters within one or more groups and examining the deviations. Figure 8 shows the sub-clusters within the Apex group.
We include the summary for the group as a whole (directly beneath the Apex label), and then show the four sub-clusters by illustrating how they deviate from the expected value. We see that two of these tend to be more positive than expected and two are more negative than expected. [Figure 8 caption fragment: &quot;... one product (Apex)&quot;]</Paragraph> <Paragraph position="10"> Looking at the thematic distribution among the Apex documents reveals the topics that dominate its reviews (Figure 9).</Paragraph> <Paragraph position="11"> We can examine the affect across these various clusters. Figure 10 shows the comparison of the &quot;service&quot; cluster to the &quot;dvd player picture&quot; cluster. This graphic demonstrates that documents with &quot;service&quot; as a main theme tend to be much more negative, while documents with &quot;picture&quot; as a main theme are much more positive. The visualization tool includes a document viewer so that any selection of documents can be reviewed. For example, a user may be interested in why the &quot;service&quot; documents tend to be negative, in which case they can review the original reviews. The doc viewer, shown in Figure 11, can be used at any stage in the process with any number of documents selected. Individual documents can be viewed by clicking on a document title in the upper portion of the doc viewer.
We saw that while reviews for Apex are generally positive (Figure 8), reviews about Apex &quot;service&quot; tend to be much more negative than reviews about Apex &quot;picture&quot; (Figure 10).</Paragraph> </Section> </Paper>