<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1641"> <Title>Sentiment Retrieval using Generative Models</Title> <Section position="5" start_page="345" end_page="346" type="metho"> <SectionTitle> 3 A Generative Model of Sentiment </SectionTitle> <Paragraph position="0"> In this section we will provide a formal underpinning for our approach to sentiment retrieval. The approach is based on the generative paradigm: we describe a statistical process that could be viewed, hypothetically, as a source of every statement of interest to our system. We stress that this generative process is to be treated as purely hypothetical; the process is only intended to reflect those aspects of human discourse that are pertinent to the problem of retrieving affectively appropriate and topic-relevant texts in response to a query posed by our user.</Paragraph> <Paragraph position="1"> Before giving a formal specification of our model, we will provide a high-level overview of the main ideas. We are trying to model a collection of natural-language statements, some of which are relevant to a user's query. In our experiments, these statements are individual sentences, but the model can be applied to textual chunks of any length. We assume that the content of an individual statement can be modeled independently of all other statements in the collection. Each statement consists of some topic-bearing and some sentiment-bearing words. We assume that the topic-bearing words represent exchangeable samples from some underlying topic language model. Exchangeability means that the relative order of the words is irrelevant, but the words are not independent of each other, an idea often stated as the bag-of-words assumption. Similarly, sentiment-bearing words are viewed as an order-invariant 'bag', sampled from the underlying sentiment language model. 
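The hypothetical generative process sketched above can be made concrete in a few lines of Python. Everything here is illustrative: the two unigram models, their vocabularies and probabilities are invented, and the function names (sample_words, generate_statement) are ours, not the paper's.

```python
import random

# Hypothetical unigram language models (word -> probability); the
# vocabularies and probabilities are invented for illustration only.
topic_model = {"plot": 0.5, "film": 0.3, "steering": 0.2}
sentiment_model = {"unpredictable": 0.4, "good": 0.4, "bad": 0.2}

def sample_words(model, n, rng):
    """Draw n exchangeable samples from a unigram model; their
    relative order carries no information (bag-of-words)."""
    words, probs = zip(*model.items())
    return rng.choices(words, weights=probs, k=n)

def generate_statement(n_topic, n_sentiment, seed=0):
    """Generate one statement as a bag of topic words plus a bag
    of sentiment words, as in the overview above."""
    rng = random.Random(seed)
    return {"topic": sample_words(topic_model, n_topic, rng),
            "sentiment": sample_words(sentiment_model, n_sentiment, rng)}

statement = generate_statement(3, 2)
```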
We will explicitly model the dependency between the topic and sentiment language models, and will demonstrate that treating them independently leads to sub-optimal retrieval performance. When a sentiment polarity value is observed for a given statement, we will treat it as a ternary variable that influences both the topic and the sentiment language models.</Paragraph> <Paragraph position="2"> We represent a user's query as just another statement, consisting of topic and sentiment parts, subject to all the independence assumptions stated above. We will use the query to estimate the topic and sentiment language models that are representative of the user's interests. Following (Lavrenko and Croft, 2001), we will use the term relevance models to describe these models, and will use them to rank statements in order of their relevance to the query.</Paragraph> <Section position="1" start_page="345" end_page="346" type="sub_section"> <SectionTitle> 3.1 Definitions </SectionTitle> <Paragraph position="0"> We start by providing a set of definitions that will be used in the remainder of this section. The task of our model is to generate a collection of statements {w_1, ..., w_N}, where each statement w_i is a string of words drawn from a common vocabulary V. We introduce a binary variable b_{i,j} ∈ {S, T} as an indicator of whether the word in the j-th position of the i-th statement is a topic word or a sentiment word.</Paragraph> </Section> </Section> <Section position="7" start_page="346" end_page="346" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> For our purposes, b_{i,j} is either provided by a human annotator (manual annotation), or determined heuristically (automatic annotation). 
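A common heuristic for the automatic case is a seed-lexicon lookup. The sketch below is our own illustration (the seed set and function name are invented), not the paper's actual annotation procedure:

```python
# A toy heuristic for the indicator b_{i,j}: mark a word as a sentiment
# word ("S") if it appears in a small seed lexicon, otherwise as a topic
# word ("T"). The lexicon here is purely illustrative.
SENTIMENT_SEEDS = {"good", "bad", "excellent", "terrible", "unpredictable"}

def annotate(statement_words):
    """Return a list of (word, b) pairs, with b in {"S", "T"}."""
    return [(w, "S" if w.lower() in SENTIMENT_SEEDS else "T")
            for w in statement_words]
```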
</Paragraph> </Section> <Section position="8" start_page="346" end_page="346" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> The sentiment polarity x_i for a given statement is a discrete random variable with three outcomes {-1, 0, +1}, representing negative, neutral and positive polarity values, respectively. As a matter of convenience we will often denote a statement as a triple {w_i^s, w_i^t, x_i}, where w_i^s contains the sentiment words and w_i^t contains the topic words. As we mentioned above, the user's query is treated as just another statement. It will be denoted as a triple {q_s, q_t, q_x}, corresponding to sentiment words, topic keywords, and the desired polarity value. We will use p to denote a unigram language model, i.e., a function that assigns a number p(v) ∈ [0,1] to every word v in our vocabulary V, such that Σ_v p(v) = 1. The set of all possible unigram language models is the probability simplex IP.</Paragraph> </Section> <Section position="11" start_page="346" end_page="346" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> Similarly, p_x will denote a distribution over the three possible polarity values, and IP_x is the corresponding ternary probability simplex. 
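Under these definitions, an empirical unigram language model is simply a normalized frequency table, i.e., one point on the simplex IP. A minimal sketch (function name and data are invented for illustration):

```python
from collections import Counter

def unigram_model(words):
    """Maximum-likelihood unigram model: p(v) = count(v) / total.
    The result lies on the probability simplex (values sum to 1)."""
    counts = Counter(words)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

p = unigram_model(["plot", "plot", "film", "steering"])
```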
</Paragraph> </Section> <Section position="12" start_page="346" end_page="347" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> We define π : IP × IP × IP_x → [0,1] to be a measure function that assigns a probability π(p_t, p_s, p_x) to each combination of topic model, sentiment model and polarity distribution. In general this would require integration over the continuous space IP × IP × IP_x, but we can avoid integration by specifying a mass function π(·) that assigns nonzero probabilities to a finite subset of points in IP × IP × IP_x. We accomplish this by using a nonparametric estimate for π(·), the details of which are provided below.</Paragraph> <Paragraph position="1"> 3.2.1 A nonparametric generative mass function </Paragraph> <Paragraph position="2"> We use a nonparametric estimate for π(·,·,·), which makes our generative model similar to kernel-based density estimators or Parzen-window classifiers (Silverman, 1986). The primary difference is that our model operates over discrete events (strings of words), and accordingly the mass function is defined over the space of distributions, rather than directly over the data points. Our estimate relies on a collection of paired observations: for each training statement we form a smoothed topic language model and a smoothed sentiment language model, as in equation (2), where the smoothing interpolates the relative frequency of v in the topic part of the statement with the relative frequency of v in the topic part of the collection. The same definitions apply to the sentiment parameters. The resulting mass function, equation (3), assigns probability mass only to pairs of models that actually co-occur in our observations.</Paragraph> </Section> <Section position="14" start_page="347" end_page="347" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> Our model represents each statement w_i as a bag of words, or more formally an order-invariant sequence. This representation is often confused with word independence, which is a much stronger assumption. The generative model defined by equation (1) ignores the relative ordering of the words, but it does allow arbitrarily strong unordered dependencies among them. 
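The flavor of this nonparametric estimate can be reproduced in a toy sketch. We assume no smoothing and invented training statements, and consider only the topic side; the mass function places weight 1/N on each training statement's empirical model, so a pair of words receives non-zero joint probability only if some statement contains both:

```python
from collections import Counter

# Toy training collection: each statement contributes one empirical
# unigram model, receiving mass 1/N under the nonparametric estimate.
collection = [
    ["unpredictable", "plot", "plot"],
    ["unpredictable", "steering"],
    ["good", "film"],
]

def empirical_model(words):
    counts = Counter(words)
    n = len(words)
    return {v: c / n for v, c in counts.items()}

def joint_probability(word_a, word_b):
    """P(a, b) = (1/N) * sum_i p(a | w_i) * p(b | w_i): with no smoothing,
    this is non-zero only if some training statement contains both words."""
    models = [empirical_model(w) for w in collection]
    return sum(m.get(word_a, 0.0) * m.get(word_b, 0.0)
               for m in models) / len(models)
```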
To illustrate, consider the probability of observing the words 'unpredictable' and 'plot' in the same statement.</Paragraph> <Paragraph position="2"> Suppose we set the smoothing parameters λ_t = λ_s = 1 in equation (2), reducing the effects of smoothing. It should be evident that P(unpredictable, plot) will be non-zero only when the two words actually co-occur in the training data. By carefully selecting the smoothing parameters, the model can preserve dependencies between topic and sentiment words, and it is quite capable of distinguishing the positive sentiment of 'unpredictable plot' from the negative sentiment of 'unpredictable steering'. On the other hand, the model does ignore the ordering of the words, so it will not be able to differentiate the negative phrase 'gone from good to bad' from its exact opposite.</Paragraph> <Paragraph position="3"> Furthermore, our model is not well suited for modeling adjacency effects: the phrase 'unpredictable plot' is treated in the same way as two separate words, 'unpredictable' and 'plot', co-occurring in the same sentence.</Paragraph> <Paragraph position="4"> 3.3 Using the model for retrieval </Paragraph> <Paragraph position="5"> The generative model presented above can be applied to sentiment retrieval in the following fashion. We start with a collection of statements C and a query {q_s, q_t, q_x} supplied by the user. We use the machinery outlined in Section 3.2 to estimate the topic and sentiment relevance models corresponding to the user's information need, and then determine which statements in our collection most closely correspond to these models of relevance.</Paragraph> <Paragraph position="6"> The topic relevance model R_t and the sentiment relevance model R_s are estimated as follows. We assume that our query is a random sample drawn from the relevance models, and estimate the probability of each word v as in equation (4), i.e., from the probability of observing v appended to the query. Both the numerator and denominator are computed according to equation (1), with the mass function π given by equations (3) and (2). We use the notation qv to denote appending word v to the string q. 
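A toy version of the relevance-model estimate, under the same no-smoothing simplification and with invented data, can be sketched as follows. It conditions only on a topic query; the paper's actual estimate, equation (4), also smooths the statement models and conditions on the sentiment and polarity parts of the query:

```python
from collections import Counter

# Invented training statements; each contributes one empirical model
# that receives mass 1/N under the nonparametric mass function.
training = [
    ["unpredictable", "plot", "twists"],
    ["predictable", "plot"],
    ["unpredictable", "steering", "wheel"],
]

def empirical_model(words):
    counts = Counter(words)
    n = len(words)
    return {v: c / n for v, c in counts.items()}

def query_likelihood(model, query):
    """P(q | p) for a bag-of-words query under a unigram model."""
    prob = 1.0
    for w in query:
        prob *= model.get(w, 0.0)
    return prob

def relevance_model(query):
    """R(v) = P(qv) / P(q): the ratio of the probability of observing
    the query with v appended to the probability of the query alone."""
    models = [empirical_model(w) for w in training]
    denom = sum(query_likelihood(m, query) for m in models)
    vocab = {v for m in models for v in m}
    return {v: sum(query_likelihood(m, query) * m.get(v, 0.0)
                   for m in models) / denom
            for v in vocab}

R = relevance_model(["plot"])
```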
Estimation is done over the training corpus, which may or may not include numeric values of sentiment polarity.1 Once we have estimates for the topic and sentiment relevance models, we can rank testing statements w by their similarity to these models, as in equation (5): the relevance models R_t and R_s</Paragraph> <Paragraph position="7"> are given by equation (4), while the statement language models p_t and p_s are computed according to equation (2). A weighting parameter α allows us to change the balance of topic and sentiment in the final ranking formula; its value is selected empirically.</Paragraph> </Section> <Section position="15" start_page="347" end_page="347" type="metho"> <SectionTitle> 4 Sentiment Retrieval Task </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="347" end_page="347" type="sub_section"> <SectionTitle> 4.1 Task definition </SectionTitle> <Paragraph position="0"> We define two variations of the sentiment retrieval task. In one, the user supplies us with a numeric value for the desired polarity q_x. In the other, the user supplies a set of seed words q_s, reflecting the desired sentiment. The first task requires us to have polarity observations x_i in our training data, while the second does not.</Paragraph> </Section> </Section> <Section position="16" start_page="347" end_page="348" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> The output in both cases is a ranked list of topic-relevant and sentiment-relevant sentences from the test data.</Paragraph> <Paragraph position="1"> In the first task, we split our corpus into three parts: (i) the training set, which was used for estimating the relevance models R_t and R_s; (ii) the validation set, which was used for tuning the model parameters, such as α; and (iii) the testing set, from which we retrieved sentences in response to the query. 
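The ranking step can be sketched with cross-entropy as the similarity measure and a weighting parameter alpha balancing topic against sentiment. The scoring function below is a simplified stand-in for the paper's equation (5), with all models and names invented:

```python
import math

def cross_entropy(relevance_model, statement_model, epsilon=1e-10):
    """-sum_v R(v) log p(v | w); lower means the statement model is
    closer to the relevance model. epsilon guards against log(0)."""
    return -sum(r * math.log(statement_model.get(v, 0.0) + epsilon)
                for v, r in relevance_model.items())

def score(R_t, R_s, p_t, p_s, alpha=0.5):
    """Weighted topic/sentiment combination; alpha = 1 ignores the
    sentiment side, alpha = 0 ignores the topic side."""
    return -(alpha * cross_entropy(R_t, p_t)
             + (1.0 - alpha) * cross_entropy(R_s, p_s))

def rank(statements, R_t, R_s, alpha=0.5):
    """Return statements sorted by descending score."""
    return sorted(statements,
                  key=lambda w: score(R_t, R_s, w["p_t"], w["p_s"], alpha),
                  reverse=True)
```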
In the second task, we split the corpus into two parts: (i) the training set, which was used for tuning the model parameters; and (ii) the testing set, which was used for constructing R_t and R_s,</Paragraph> <Paragraph position="3"> and from which we retrieved sentences in response to queries.2 The testing set was identical in both tasks. Note that the sentiment relevance model R_s can be constructed in a topic-dependent fashion for both tasks.</Paragraph> <Section position="1" start_page="348" end_page="348" type="sub_section"> <SectionTitle> 4.2 Variations of the retrieval model </SectionTitle> <Paragraph position="0"> slm: the retrieval model as described in Section 3.3.</Paragraph> <Paragraph position="1"> lmt: the standard language modeling approach (Ponte and Croft, 1998; Song and Croft, 1999) applied to the topic keywords q_t and the topic part of the text w_t.</Paragraph> <Paragraph position="2"> lms: the standard language modeling approach applied to the sentiment keywords q_s and the sentiment part of the text w_s.</Paragraph> <Paragraph position="3"> base: the weighted linear combination of lmt and lms.</Paragraph> <Paragraph position="4"> rmt: only the topic relevance model was used for ranking, using q_t. rmt-base: the slm model with α = 1, ignoring the sentiment relevance model. rms-base: the slm model with α = 0, ignoring the topic relevance model.</Paragraph> <Paragraph position="5"> When using the automatic annotation described in Section 5.2.2, we use the whole text instead of the topic part of the text, for the reasons given in that section. This treatment is applied to the base, rmt-base, rms-base, rmt-rms, rmt-slm and slm models described in this section when using the automatic annotation. However, even in the experiments using the automatic annotation, we distinguish the lmt and rmt models, which use the topic part of the text, from the lmtf and rmtf baseline models, which use the whole text. 
rmt-rms: the rmt and rms models are treated independently.</Paragraph> <Paragraph position="6"> rmt-slm: the rmt and rms-base models are combined.</Paragraph> <Paragraph position="7"> lmtf: the standard language modeling approach using q_t for the nonsplit text, as a baseline. rmtf: the conventional relevance model used for ranking with q_t for the nonsplit text, as a baseline.</Paragraph> <Paragraph position="8"> lmtsf: the standard language modeling approach using both q_t and q_s for the nonsplit text, for reference.</Paragraph> <Paragraph position="11"> rmtsf: the conventional relevance model used for ranking with both q_t and q_s for the nonsplit text, for reference.</Paragraph> <Paragraph position="14"> Note that the relevance models are constructed using training data for the training-based task, but using test data for the seed-based task, as mentioned in Section 4.1. Therefore, in the training-based task the base model is only applied to the training data, not to the test data, while in the seed-based task it can be applied to the test data. Moreover, the lms, lmtsf and rmtsf models are based on the premise of using seed words to specify sentiments, and so they are only applicable to the seed-based task.</Paragraph> <Paragraph position="15"> In the models described in this subsection, the smoothing parameters λ_t and λ_s in equation (2) were set to Dirichlet estimates (Zhai and Lafferty, 2001) when estimating the relevance models in equation (4), and were fixed at 0.9 for ranking as in equation (5) for our experiments in Section 5. Here, the Dirichlet parameters μ_t and μ_s were selected empirically according to the tasks described in Section 4.1. The model parameter α in equation (5) was also selected empirically in the same manner. 
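The Dirichlet estimate of Zhai and Lafferty (2001) can be sketched as follows. The parameter value and data are illustrative, and the sketch assumes the background model covers the whole vocabulary; it is equivalent to interpolating the maximum-likelihood model with the background model using weight λ = n/(n + μ):

```python
from collections import Counter

def dirichlet_smoothed_model(words, background, mu=2000.0):
    """Dirichlet-smoothed unigram model:
    p(v) = (c(v) + mu * p_bg(v)) / (n + mu).
    Assumes every word of interest appears in the background model."""
    counts = Counter(words)
    n = len(words)
    return {v: (counts.get(v, 0) + mu * pv) / (n + mu)
            for v, pv in background.items()}

background = {"plot": 0.5, "film": 0.5}
p = dirichlet_smoothed_model(["plot", "plot"], background, mu=2.0)
```

With n = 2 and mu = 2, the interpolation weight is λ = n/(n + μ) = 0.5, so p(plot) = 0.5 * 1.0 + 0.5 * 0.5 = 0.75.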
The number of statements used for estimating the relevance models in equation (4) was selected empirically in the same manner as above; however, we fixed the number of terms used in the relevance models at 1000.</Paragraph> </Section> </Section> </Paper>