File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/06/e06-2031_evalu.xml
Size: 4,477 bytes
Last Modified: 2025-10-06 13:59:32
<?xml version="1.0" standalone="yes"?> <Paper uid="E06-2031"> <Title>Why Are They Excited? Identifying and Explaining Spikes in Blog Mood Levels</Title> <Section position="6" start_page="208" end_page="209" type="evalu"> <SectionTitle> 5 Experiments </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="208" end_page="208" type="sub_section"> <SectionTitle> In this section we illustrate our methods with some </SectionTitle> <Paragraph position="0"> examples and provide a preliminary analysis of their effectiveness.</Paragraph> </Section> <Section position="2" start_page="208" end_page="208" type="sub_section"> <SectionTitle> 5.1 The blog corpus </SectionTitle> <Paragraph position="0"> Our corpus consists of all public blogs published in LiveJournal during a 90-day period from July 5 to October 2, 2005, adding up to a total of 19 million blog posts. For each entry, the text of the post is indexed along with its date and time. Posts without an explicit mood indication (10M) are discarded. We applied standard preprocessing steps (stopword removal, stemming) to the text of blog posts.</Paragraph> </Section> <Section position="3" start_page="208" end_page="208" type="sub_section"> <SectionTitle> 5.2 The news corpus </SectionTitle> <Paragraph position="0"> The collection contains around 1000 news headlines that have been published in Wikinews (http://www.wikinews.org) during the period of July-September, 2005.</Paragraph> </Section> <Section position="4" start_page="208" end_page="209" type="sub_section"> <SectionTitle> 5.3 Case studies </SectionTitle> <Paragraph position="0"> We present three particular cases where an irregular behavior in a certain mood could be observed. 
We examine how accurately the overused terms describe the events that caused the spikes.</Paragraph> <Paragraph position="1"> 5.3.1 Harry Potter In July, 2005, a peak in &quot;excited&quot; was discovered; see Figure 4, where the shaded (green) area indicates the &quot;peak area.&quot; Step 1 of our peak explanation method (Section 4) reveals the following overused terms during the peak period: &quot;potter,&quot; &quot;book,&quot; &quot;excit,&quot; &quot;hbp,&quot; &quot;read,&quot; &quot;princ,&quot; &quot;midnight.&quot; Step 2 of our peak explanation method (Section 4) exploits these words to retrieve the following headline from the news collection: &quot;July 16. Harry Potter and the Half-Blood Prince released.&quot; The next example illustrates the need for careful thresholding when defining peaks (see Section 3). We show peaks in &quot;worried&quot; discovered around late August, with a 40% and an 80% threshold (figure: top, threshold 40% change; bottom, threshold 80% change). Clearly, far more peaks are identified with the lower threshold, while the peaks identified in the bottom plot (with the higher threshold) all appear to be clear peaks. The overused terms during the peak period include &quot;orlean,&quot; &quot;worri,&quot; &quot;hurrican,&quot; &quot;gas,&quot; &quot;katrina.&quot; In Step 2 of our explanation method we retrieve the following news headlines (top 5 shown only): serious flooding across affected region (Aug 26) Hurricane Katrina strikes Florida, kills seven 5.3.3 London terror attacks On July 7 a sharp spike could be observed in the &quot;sad&quot; mood; see Figure 6; the tone of the shaded area shows the degree of the peak. 
Overused terms identified for this period include &quot;london,&quot; &quot;attack,&quot; &quot;terrorist,&quot; &quot;bomb,&quot; &quot;peopl,&quot; &quot;explos.&quot; Consulting our news</Paragraph> </Section> <Section position="5" start_page="209" end_page="209" type="sub_section"> <SectionTitle> 5.4 Failure analysis </SectionTitle> <Paragraph position="0"> Evaluation of the methods described here is non-trivial. We found that our peak detection method is effective despite its simplicity. Anecdotal evidence suggests that our approach to finding explanations underlying unusual spikes and drops in mood levels is effective.</Paragraph> <Paragraph position="1"> We expect that it will break down, however, in case the underlying cause is not news related but, for instance, related to celebrations or public holidays; news sources are unlikely to cover these.</Paragraph> </Section> </Section> </Paper>
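<!--
The thresholding effect noted in Section 5.3 can be illustrated with a minimal Python sketch. It assumes, hypothetically, that a peak is a day whose observed mood level differs from an expected level by more than a fixed relative change; the actual peak definition is given in Section 3 and is not reproduced here. The function name find_peaks, its parameters, and all numbers are illustrative assumptions, not data or code from the paper.

```python
from typing import List, Sequence


def find_peaks(observed: Sequence[float],
               expected: Sequence[float],
               threshold: float) -> List[int]:
    """Return indices of days whose relative change exceeds the threshold.

    Assumption: "change" means |observed - expected| / expected, so both
    spikes and drops count. This is a stand-in for the Section 3 definition.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    peaks = []
    for day, (obs, exp) in enumerate(zip(observed, expected)):
        if exp == 0:
            continue  # relative change is undefined for a zero baseline
        if abs(obs - exp) / exp > threshold:
            peaks.append(day)
    return peaks


# Made-up daily mood levels and a flat made-up baseline.
observed = [100, 150, 95, 190, 130, 100]
expected = [100, 100, 100, 100, 100, 100]

low = find_peaks(observed, expected, 0.40)   # 40% change threshold
high = find_peaks(observed, expected, 0.80)  # 80% change threshold
print(low, high)
```

With these made-up values, every day flagged at the 80% threshold is also flagged at the 40% threshold, and the lower threshold flags additional days, which mirrors the observation that far more peaks are identified with the lower threshold.
-->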