sentiment flow through hyperlink...

Sentiment Flow Through Hyperlink Networks

Mahalia MillerStanford UniversityStanford, CA, USA.

[email protected]

Conal SathiStanford UniversityStanford, CA, USA.

[email protected]

Daniel WiesenthalStanford UniversityStanford, CA, USA.

[email protected]

Abstract

This project explores the flow of sentimentinformation through a network of hyper-linked web pages. We analyze the maintext in a large set of web pages accord-ing to several rough heuristics for sen-timent information. We then model thedata as a network in which nodes representpages and (directed) edges represent hy-perlinks between pages. Within cascadeswe identify in the graph, we examine howthe sentiment features of a node changeas a function of its position in the cas-cade structure and the sentiment featuresof its cascade brethren, essentially captur-ing the flow of sentiment through the cas-cade. As well as analyzing the flow ofindividual sentiment features, we explorethe interaction between objectivity and theother features. Our analysis yields sev-eral interesting conclusions: blog posts inwhich the author uses more objective lan-guage than usual are more likely to be cas-cade initiators, long cascades show quali-tatively different behavior than short cas-cades, we see different behavior for “firstround” and “second round” contributors todiscussions, and subjective language leadsto more subjective language at an increas-ing rate.

1 Credits

This paper is based off work for Jure Leskovec’sSocial and Information Networks CS 224w courseand Dan Jurafsky and Christopher Potts ExtractingSocial Meaning and Sentiment CS424p course atStanford University, Autumn 2010.

2 Introduction

The ever-growing amount of data available on theweb, predominantly as prose in hyperlinked blogsand social network posts, has inspired much re-search focused on understanding the interactionsbetween authors and the trends emergent in thelanguage they use. News articles and blogs on theInternet offer an opportunity for readers to rapidlyshare their thoughts and opinions on an issue bycreating a new post that hyperlinks to the originalblog post. In this way, blog authors create a di-rected graph in which blog posts represent nodesand hyperlinks represent edges. Previous work onboth sentiment and network analysis inspire sev-eral interesting areas for exploration in the over-lap of the two mostly disparate fields. Is the sen-timent of blog post cascade initiators unusual orpredictable? How does sentiment flow through acascade? Are there noticeable differences in thesentiment characteristics of blogs in long cascadesversus short cascades? How do different sentimen-tal aspects interact across the network? Do certainsentiment properties modulate the flow of otherproperties? In this paper, we propose a prelimi-nary approach to answering these questions, applyour approach to a large data set, and present signif-icant results showing clear trends with real-worldimplications.

3 Prior Work

Prior work is split in two directions: understand-ing the network characteristics of the diffusion ofinformation in static and time-varying graphs, andclassifying sentiment in the text of actual posts inisolation.

3.1 Prior Work on Cascades

Recently researchers have analyzed basic linkingpatterns in large political blog graphs. Adamis andGlance (2005) examine different linking trends in

Democratic and Republican blogs. They note a di-vided blogosphere where liberal and conservativeblogs tend to link to their own communities, ratherthan to each other. They also discover the conser-vative blogosphere to be more densely connectedand that the two communities tend to link to dif-ferent news sources. There has also been work onanalyzing patterns of cascading behavior in bloggraphs. Lescovec et al (2007) discover temporalpatterns and topological patterns. They note com-mon cascade shapes, such as stars and chains, andcascade topological properties, such as the size ofcascades and their degree distribution. We willexamine topological features when assessing howbasic topology, such as cascade shape and size, af-fects network flow of sentiment.

3.2 Previous Work on Sentiment Extractionand Propagation of Sentiments

One interesting challenge in sentiment analysis isidentifying the presence of opinion in text and, ifsuch an opinion exists, the polarity of the opin-ion. With the rise of Internet blogs and selfpub-lishing, it is becoming increasingly easy for a userto post information to either strictly inform oth-ers or to express a personal opinion on an issue.Researchers have tried various approaches. Therehas been research in using bag-of-word modelsand sentiment lexicons. Additionally, Chesley etal. (2006) employ verb classes and Wiktionary toget definitions and synonyms for adjectives. God-bole et al. (2007) use WordNet distances, and Za-farni et al. (2010) employ Normalized GoogleDistance. The two fields of information propa-gation/diffusion and sentiment analysis seem tobe largely disparate, inviting research into theirmarriage. The authors are unaware of previouswork tracking extracted sentiment propagation ina prose network. Closely related is the work ofZafarani et al. (2010), which examines data fromLive Journal, a social network where users main-tain a blog, and in which users have the option ofassigning a mood to each of their posts. The pa-per explores tracking this manually assigned sen-timent across the network by assigning each moodto a score and by defining mathematically whetherthe sentiment of one user has propagated to an-other. Our work departs the seminal ideas inZarafani et al. (2010) in that we use links betweenblog posts, rather than friendships on a social net-work to denote edges, and we examine blogs from

multiple sources. Further, we use the content ofthe post to extract sentiment ourselves, rather thanrelying on the user to explicitly describe his sen-timent. This approach is subject to the limitationsof our sentiment extraction, which we address byverifying results across two independent sentimentlexica. This paper combines ideas from the graphanalysis and sentiment analysis fields in order toanalyze the flow of sentiment in blog post net-works connected by hyperlinks. In contrast to pre-vious work that tracks links, phrases, or memesonly, our approach is to analyze the full text of apost using rough sentiment heuristics, and to trackthe flow of these extracted metrics over the linkednetwork.

4 Methodology

4.1 Data Set

We obtained data from Jure Leskovecs meme-tracker project (Leskovec et al., 2009) from themonth of August 2010, which consists of roughly1 million blog posts per day. Each post contains aurl, time stamp, and all of the urls of the posts itcites. We worked with a newer version of the datathan that referenced in Leskovec et al. (2009); thisnew version includes the full text of the webpage(in addition to key quotes), upon which we ran oursentiment analysis.

4.2 Pruning of the Data Set

Our data set initially comprised nearly 200GB ofhighly compressed data representing a full monthof blogs and web pages, which allowed the luxuryof pruning out unwieldy parts of the data set with-out encountering serious sparsity issues. We firstexamine the 15 top domains in our data, as shownin Table 1. Much of the original data stems fromnon-English domains (.fr, .jp, etc), which we trimout since our lexica are only applicable to English-language posts. As shown in Figure 1, Facebookand Twitter dominate the data set, comprising 66%of the data. These posts tend to be short, infor-mal, and linguistically noisy. Also, most of theseposts tend to be singletons (i.e. they tend to notcite other posts), so they will not be useful in ourcascade analysis. Figure 1 shows the degree dis-tribution of nodes in our network with and with-out Facebook/Twitter. The CDF without Facebookand Twitter shows a more gradual increase, indi-cating that a large number of Facebook and Twit-ter posts have a very low degree. After pruning

Blog % of Postsin Network

twitter.com 41.59 %facebook.com 24.63 %blip.tv 2.98 %t.love.com 0.68 %blog.sina.com.cn 0.46 %feeds.feedblitz.com 0.40 %zoom.nl 0.24 %islive.nl 0.16 %plaza.rakuten.co.jp 0.14 %capi-logement-neuf.fr 0.13 %photobucket.com 0.11 %bizjournals.com 0.11 %snowboard-online-shop.eu

0.10 %

blogs.msdn.com 0.09 %article.wn.com 0.07 %

Table 1: Top 15 Blogs.

Facebook, Twitter, and non-English root domains,we further prune singleton nodes (nodes with 0 in-and out-degree) from our dataset. This yields alarge, linguistically rich, highly connected graphof roughly 8 million blog posts and 15 million hy-perlinked edges.1

4.3 Sentiment Extraction

We extract the sentiment from each post by cross-referencing the words in the post (treated withoutorder, as a bag-of-words model) against two sen-timent lexica, and tracking frequency, counts, andaverages for positive and negative scores, as wellas a synthesized objectivity score. The two sen-timent lexicons we use are the Harvard Inquirer(Stone et al., 1966) and SentiWordNet 3 (Stefanoet al., 2010). By using two separate lexica whichboth purport to give information about positivityand negativity, we are able to compare and validateour results in cases where the two lexicons agreedstrongly. In the Harvard Inquirer, each word isgiven a binary score (1 or 0) for positivity, andseparately for negativity. In SentiWordNet, eachword is assigned a score on a continuous scale be-tween 0 and 1 for positivity and negativity. To

1We had to use efficient coding techniques to manage adata set of this size. Calculating sentiment features took over24 hours distributed over eight 3+ GHz processor cores. Con-structing the graph took 20+ hours. Since we could neverhold our whole data set in RAM, we had to serialize to diskin between code segments.

Blog Average WordNegativity

raphaelku.livejournal.com 9.9e-4vadim-i-z.livejournal.com 9.9e-4darkgothiclolita 1.6e-2.forumcommunity.net

Table 2: Sample negativity sentiment values.

synthesize objectivity, we assign a word a valueof 1− (SWNpositivity+SWNnegativity), assuggested by the SentiWordNet authors. Thus, foreach post, we count and average these word scoresto compute the objectivity/subjectivity and posi-tivity/negativity of each post. We average thesescores across all posts for each web domain (i.e.for nytimes.com, techcrunch.com, etc.) to createa baseline value. This baseline represents the au-thors usual manner of writing; since global aver-ages do not account for differing writing styles,we find it more intuitive to compare against thislocal baseline in order to determine whether a par-ticular post is unusual for its author. We presenta motivating example below in Table 2, in whichthe third domains average negativity is more than15 times the average negativity of the first two do-mains. Thus, for each post, we calculate all senti-ment features as a delta above its baseline. In ad-dition to creating a baseline for each node, we cal-culate a global mean, median, and standard devia-tion for each feature to get a sense of our data (thesuffix “.Inq” denotes the feature came from calcu-lations using the HarvardInquirer while the suffix“.swn” denotes the feature came from SentiWord-Net) as shown in Table3. The average word ob-

Word Feature Median Mean SDposFreq.Inq 0 0.00915 0.0187negFreq.Inq 0 0.00493 0.0147emoFreq.Inq 0.00151 0.01408 0.0246avgPos.swn 0.0139 0.01794 0.0192avgNeg.swn 0.00973 0.0145 0.0178avgObj.swn 0.973 0.967 0.0315totalWords 105 333 1147

Table 3: Statistics for sentiment values.

jectivity has the highest standard deviation, whichmakes sense given the fact that objectivity is af-fected by both positivity and negativity, and so hasmore variability.

4.4 Cascade IdentificationWe construct a network graph of our data usingthe C++ SNAP network analysis library by JureLeskovec. We model the data in the graph asfollows: each node represents a blog post withits sentiment scores, and directed edges betweennodes represent hyperlinks. An edge points fromnode u to node v where post u contains a hyperlinkthat cites post v. We are specifically interested ininformation propagation graphs (cascades). Afterconstructing the graph, we search for “cascade ini-tiators” by iterating through each node and lookedfor nodes with a non-zero out-degree and a zeroin-degree. These nodes represent posts that begina long chain of links. Once cascade initiators arefound, we search for all the nodes in each cascadeby applying a breadth-first algorithm, starting ateach cascade initiator and following the in-links.A small example cascade of depth 2 is shown inFigure 2. In this example, the cascade initiator isthe node on the left.

5 Analysis

5.1 Global AnalysisOur primary analysis focuses on the change overbaseline of sentiment features for a post as theshortest-path distance to the post from the cascadeinitiator increases. We see an interesting trendin the average positivity and negativity values fornodes at distance d away from the cascade ini-tiator, shown in Figure 3. We note several qual-ities of this plot. The cascade initiators tend tohave lower average positivity and negativity val-ues, suggesting (as we explore in more depth be-low) that posts that initiate cascades tend to bewritten in a more objective manner than is usualfor the writer. The first two posters after the initia-tor also tend to have lower average positivity andnegativity (relative to their respective baselines),but for nodes at a distance between 4 and 9 wesee the average positivity and negativity divergeto mirror each other across the baseline (x axis),with an increase in average negativity. After a dis-tance of 9, most nodes are slightly below base-line for both features. As distance increases, thetwo features begin to diverge from baseline, andafter node 200 our data becomes sparse so out-liers destroy the trends. We suggest the follow-ing explanation for these observations: posts thatare more objective than usual tend to have moreinteresting factual content for readers to discuss

(for example, an original article on networks). Assuch, they tend to be largely objective and gen-erate cascades with greater longevity. The firstfew posters after the initiator contribute interest-ing ideas, hence their low pos/neg frequencies rel-ative to baseline (“You make an interesting pointhere, which might apply to the field of NLP”).After this first round of observers arrives on thescene, the second round of commentators arrives,with a set of writers and opinions laid out beforethem: they have more to criticize, both in terms ofraw material and in terms of the interactions be-tween their predecessors (“You completely failedto understand the original article, think before youwrite, you < insult > !”).

We note that the initially high objectivity, whichthen drops and picks up again, may be an artifactof two qualitatively different types of cascades:long and short. That is, cascades that are hun-dreds of posts long may behave in some qualita-tively different manner than their shorter brethren.We break down our data to compare not onlythe whole network graph, but also variants wherewe consider only nodes participating in cascadesshorter than 10 nodes in depth or greater than 10nodes in depth, shown in Figure 4.

We see that indeed, our above observations holdfor the large cascades, and are even intensified.There is a qualitatively different trend across smallcascades: the cascade initiators have significantlyhigher average positivity and negativity than base-line, but their responders immediately have lowerthan baseline values for these metrics. We hypoth-esize that for these short cascades, the initial postcontains incitatory language that prompts a few re-sponders to feel the need to clarify by “laying outthe facts,” or to calm the author down. As averagepositivity/negativity does not necessarily allow usto measure objectivity, we also examine objectiv-ity for the whole network, as well as the small-and large-cascade subsets. This is shown in Fig-ure 5. We find that, consistent with our abovefindings, objectivity is far above baseline at thenode initiator, and decreases slightly for nodes ata distance between 4 and 9 away from the cas-cade initiator, though it remains above baselinefor most of the cascade, only flaring out at theends of long cascades. These findings are consis-tent when we examine both the small- and large-cascade subsets of our graph. Thus, over bothlarge and small cascades, we find that objectivity

stays above baseline except at the very end of thecascade. We hypothesize that the flow of causal-ity here is from above-average objectivity towardslonger cascades: that is, more objective writing in-spires cascades, while peaks of heated rhetoric andstrong sentiment quickly kill cascades.

5.2 Local Analysis

In addition to examining the sentiment values at ascale relative to their global position in a cascade,we examine sentiment values in the local contextof parent-child node relationships. Figure 6 de-picts the positivity of a post by its distance fromthe cascade initiator, taking the objectivity of itsparent into account. The blue represents nodeswhose parents have an objectivity value of 0.99(i.e. the node cites a very objective post), whereasthe red represents nodes whose parents have an ob-jectivity value of 0.89 (i.e. the node cites a verysubjective post). The rest of the color spectrum isconsistent in the plot: that is, as a node’s parentbecomes more subjective, the color on the plot be-comes warmer. We see that as the subjectivity of anode’s parent increases, the positivity of the nodebecomes harder to predict (as indicated by themuch lower correlation coefficient for the nodeswith more subjective parents). We offer the fol-lowing explanation: a more opinionated post willelicit a more powerful emotional response. (Con-trast this to a very objective post, which allowsagreement and disagreement, but is not as likelyto inspire an impassioned response.) Within thisresponse, there will be posts that strongly agreeand nodes that strongly disagree. It is very hardto predict the polarity of the response, though thedata show that a high-arousal response is com-mon. This relationship of parent-to-child objec-tivity merits further examination. We see in Fig-ure 7 that if a node’s parent is unusually subjec-tive, it is more likely that the node will also besubjective. Intuitively, this might be explained asfollows. If there is a post that is very subjective,it is very opinionated, thus, eliciting either strongagreement or strong disagreement. Thus, any postthat cites this first post will express much senti-ment in expressing this agreement or disagreementto the prior post’s stance. Moreover, the slope ofthe graph gets progressively steeper. Thus, subjec-tive language elicits more subjective language atan increasing rate. This may be the cause of flamewars in long cascades despite the cascade initiator

itself being unusually objective.

6 Conclusion

We discover that the sentiment of a blog post isaffected by its position in a cascade as well asby the objectivity of its immediate parent. Weexamine both the degree of sentiment (objectiv-ity/subjectivity) as well as explore the polarity(positivity/negativity). Our analysis yields sev-eral interesting conclusions: (1) blog posts inwhich the author uses more objective languagethan usual are more likely to be cascade initiators(especially so for longer cascades), (2) long cas-cades show qualitatively different behavior thanshort cascades, with long cascades beginning withunusually objective posts and short cascades be-ginning with unusually subjective posts that arequickly responded to objectively, (3) we see differ-ent behavior for “first round” and “second round”contributors to discussions with the former of-fering more objective feedback and the latter us-ing unusually subjective language (perhaps judg-ment), and (4) subjective language leads to moresubjective language at an increasing rate. More-over, our results hold using the corresponding sen-timent scores from either the Harvard Inquirer orSentiWordNet showing that our results are not anartifact of our lexicon.

6.1 Future Work

Our work is one of the first research projects weknow of that explores sentiment extraction andanalysis across a large-scale multi-domain net-work. Naturally, there are many avenues to pursuefor future work.

One avenue for further work is to build a morerefined sentiment extraction engine. Use of a sen-tence parser to tag the parts of speech of words inthe post would help disambiguate multiple mean-ings of a word to better extract its sentiment.For example, our sentiment lexica currently scorewords like “well” with high positivity and lownegativity, unaware of whether the word is a nounor adverb–yet these two parts of speech carry sig-nificantly different sentiment information. Thus,finding the part of speech of the word could helpto disambiguate the sense of the word. Similarly,a full sentence tree parse would help to correctlyclassify special cases where a word’s presence isnot necessarily an indication of its effect on theoverall post sentiment, such as in the case of nega-

tion. (For example, when any negation (such as“no,” “not,” etc.) precedes a word, its sentimentshould reverse, such as in the case “I am not feel-ing good about the outcome of this election.”)

Often in this paper we discuss positivity asbeing related to agreement of a previous post’sagreement and negativity as being related to dis-agreement, which is a simplifying assumption. Itwould be be interesting to actually see whetherthis is the case by looking at the content of eachpost and examining the language related to agree-ment/disagreement, in addition to positivity andnegativity, to further refine our results. This mightbe done by starting with a seed set of words thatstrongly indicate agreement and disagreement (po-tentially these two words themselves) and thenpropagating through WordNet to expand the set,with agreement/disagreement scores decreasingwith increase in distance from the seed set.

We would also like to explore more networkgraph properties in conjunction with our sentimentfindings. We have largely focused on directed re-lationships and distance from the cascade initiatorto a node on the cascade, but looking at the ef-fect of cascade tree breadth versus depth is an arearipe with possibilities. For example, we might askwhether flame-wars have a tall and narrow graphstructure, perhaps with many flips in change frombaseline (a very positive post, then a very nega-tive one), versus short, fat cascades where manysecondary authors interpret a single seminal articlein varying ways as they bridge it to their fields ofinterest, without necessarily being aware of theirpeers.

We would like to extend both our network andsentiment features to predict a node’s sentimentvalues with a classifier. As our feature set growsmore complex, it becomes difficult to manuallysort out interactions between features, and a Max-Ent classifier may help us identify interesting in-teractions for close examination. The combina-tion of a writer’s history (baseline) and more nu-anced hybrid semantic-network features (the sen-timent of a friend who links to the same post, forexample), could be a powerful technique for pre-dicting a person’s reaction to a given post. Thishas a plethora of applications: advertisement tar-geting, personalized news feeds, and political ap-proval prediction, among others.

Acknowledgments

Mahalia Miller was partially funded by the Na-tional Science Foundation and the Stanford Grad-uate Fellowship. Daniel Wiesenthal and ConalSathi were partially fueled by Starbucks.

ReferencesLada Adamic. and Natalie Glance. 2005. The political

blogosphere and the 2004 U.S. election. Workshopon Link Discovery

Paula Chesley, Bruce Vincent, Li Xu, and Rohini Sri-hari. 2006. Using verbs and adjectives to automati-cally classify blog sentiment AAAI Symposium onComputational Approaches to Analysing Weblogs,27–29.

Namrata Godbole, Manjunath Srinivasaiah, andSteven Skiena. 2007. Large-scale sentiment anal-ysis for news and blogs. Proceedings of the Interna-tional Conference on Weblogs and Social Media

Jure Leskovec, Mary McGlohen, Christos Faloutsos,Natalie Glance and Matthew Hurst. 2007. Cas-cading behavior in large blog graphs: Patterns anda model. SIAM International Conference on DataMining

Jure Leskovec, Lars Backstrom, and Jon Kleinberg.2009. Meme-tracking and the Dynamics of theNews Cycle. ACM SIGKDD Intl. Conf. on Knowl-edge Discovery and Data Mining.

Jure Leskovec. 2010. Dynamics of large networks.PhD Dissertation. Machine Learning Department,Carnegie Mellon University, Technical report CMU-ML-08-111, 2008.

B. Stefano, A. Esuli, and F. Sebastani. 2010. Senti-WordNet 3.0: An enhanced lexical resource for sen-timent analysis and opinion mining. Proceedings ofthe Seventh conference on International LanguageResources and Evaluation (LREC’10): 2200-2204.

P. Stone, D. Dunphy, M. Smith, and D. Ogilvie. 1966.The General Inquirer: A Computer Approach toContent Analysis. MIT Press, Cambridge, MA.

Reza Zafarani, William Cole, and Huan Liu. 2010.Sentiment extraction in social networks: a casestudy in LiveJournal. In Advances in Social Com-puting, 6007:413–420.

60

68

76

84

92

100

0 10 20 30 40 50

Cumulative Degree Distribution

With FB+TWithout FB+T

x = Degree of node

Figure 1: Cumulative degree distribution for node degree

Figure 2: Sample cascade from data set

-0.01125

-0.00750

-0.00375

0

0.00375

0.00750

0.01125

0.01500

1 10 100 1000

Full Graph: Freq Pos/Neg by Distance

Featu

re C

han

ge O

ver

Base

line

Distance from Cascade Initiator

Neg FreqPos Freq

Figure 3: Average positive and negative word frequency for each node over each node’s own baseline

-0.0010

-0.0005

0

0.0005

0.0010

0.0015

0.0020

0 1 2 3 4 5 6 7 8 9

Small Cascades: Avg Pos/Neg by Distance

Feat

ure

Cha

nge

over

Bas

elin

e


Avg NegAvg Pos

-0.01500

-0.01125

-0.00750

-0.00375

0

0.00375

0.00750

0.01125

1 10 100 1000

Large Cascades: Avg Pos/Neg by Distance

Feat

ure

Cha

nge

Ove

r Bas

elin

e


Avg NegAvg Pos

-0.01500

-0.01125

-0.00750

-0.00375

0

0.00375

0.00750

0.01125

1 10 100 1000

Full Graph: Avg Pos/Neg by Distance

Feat

ure

Cha

nge

Ove

r Bas

elin

e


Avg NegAvg Pos

Figure 4: Average positivity and negativity for each node over each node’s own baseline

0.9700

0.9875

1.0050

1.0225

1.0400

1 10 100 1000

Full Graph: Objectivity by Distance

Feat

ure

Cha

nge

Ove

r Bas

elin

e


0.9900

1.0025

1.0150

1.0275

1.0400

0 2.25 4.50 6.75 9.00

Small Cascades: Objectivity by Distance

Feat

ure

Cha

nge

over

Bas

elin

e


Objectivity (+1)Baseline Objectivity

0.970

0.985

1.000

1.015

1.030

1 10 100 1000

Large Cascades: Objectivity by Distance

Feat

ure

Cha

nge

Ove

r Bas

elin

e




Figure 5: Objectivity for each node over each node’s own baseline

0.960

0.975

0.990

1.005

1.020

1 10 100 1000

R² = 0.0069R² = 0.012

R² = 0.1487R² = 0.0794

Full Graph: Pos Word Frequency by Distance for Varying Parent Objectivity

Po

s W

ord

Fre

qu

en

cy


99 (N)98 (N)94 (N)89 (N)

Figure 6: Average positivity for different levels of objectivity of the parent on the shortest path

-0.30

-0.15

0

0.15

0.30

1 10 100

Full Graph: Avg Pos/Neg and Objectivity by Parent Subjectivity

Featu

re D

elta (C

hild

No

de)

Subjectivity of Parent (100- (Bucketed Objectivity))

Avg NegAvg PosObjectivity of Child

Figure 7: Average positivity, negativity, and objectivity versus the subjectivity of the parent on the short-est path

sentiment flow through hyperlink...

Documents