rise and fall patterns of information diffusion: model and implications yasuko matsubara (kyoto...

Post on 29-Mar-2015

217 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Rise and Fall Patterns of Information Diffusion:Model and ImplicationsYasuko Matsubara (Kyoto University), Yasushi Sakurai (NTT), B. Aditya Prakash (CMU), Lei Li (UCB), Christos Faloutsos (CMU)

KDD 2012 1Y. Matsubara et al.

Motivation

Social media facilitate faster diffusion of news and rumors

KDD 2012 2

Q: How do news and rumors spread in social media?

Y. Matsubara et al.

News spread in social media

MemeTracker [Leskovec et al. KDD’09] short phrases sourced from U.S. politics in 2008

KDD 2012 3

“you can put lipstick on a pig” (# of mentions in blogs)

“yes we can”

Y. Matsubara et al.

(per hour, 1 week)

News spread in social media

MemeTracker [Leskovec et al. KDD’09] short phrases sourced from U.S. politics in 2008

KDD 2012 4

“you can put lipstick on a pig” (# of mentions in blogs)

“yes we can”

Y. Matsubara et al.

Breaking news

Decay News

spread

(per hour, 1 week)

Rise and fall patterns in social media

Twitter (# of hashtags per hour)

Google trend (# of queries per week)

KDD 2012 5Y. Matsubara et al.

“#assange” “#stevejobs”

“harry potter” (2010 - 2011) “tsunami” (in 2005)

(per hour, 1week) (per hour, 1 week)

(per week, 1 year) (per week, 2 years)

Rise and fall patterns in social media

KDD 2012 6

How many patterns are there? -Earlier work claims there’re several classes• four classes on YouTube [Crane et al.

PNAS’08]• six classes on Media [Yang et al.

WSDM’11]

Y. Matsubara et al.

Rise and fall patterns in social media

KDD 2012 7

Q. How many classes are there after all?

A. Our answer is “ONE”!

We can represent all patterns by single model

Y. Matsubara et al.

Outline

KDD 2012 8

- Motivation- Problem definition- Proposed method- Experiments- Discussion – SpikeM at work- Conclusions

Y. Matsubara et al.

Problem definition

KDD 2012 9

Goal: predict/model social activity

Given:

- Network of bloggers/users

- External shock/event- Quality of the event βFind:

- How blogging activity will evolve over time

Problem 1 (What-if?)β

Y. Matsubara et al.

Problem definition

KDD 2012 10

Goal: predict/model social activity

Given:

- Behavior of spikesFind:

- Equation/model that can explain them, e.g.,

- # of potential bloggers- Strength of external

shock- Quality of the event β

Epidemic process by word-of-mouth

Problem 2 (Model design)

β

Y. Matsubara et al.

Outline

KDD 2012 11

- Motivation- Problem definition- Proposed method- Experiments- Discussion – SpikeM at work- Conclusions

Y. Matsubara et al.

Proposed method

SpikeM capture 3 properties of real spike

KDD 2012 12

1.

periodicities

Y. Matsubara et al.

Proposed method

SpikeM capture 3 properties of real spike

KDD 2012 13

2. avoid infinity

Y. Matsubara et al.

1.

periodicities

Proposed method

SpikeM capture 3 properties of real spike

KDD 2012 14

3. power-law fall

Y. Matsubara et al.

1.

periodicities2. avoid infinity

Proposed method

SpikeM capture 3 properties of real spike

KDD 2012 15

3. power-law fall

SpikeM capture behavior of real spikes

using few parameters Y. Matsubara et al.

1.

periodicities2. avoid infinity

Main idea (details)- 1. Un-informed bloggers (clique of N

bloggers/nodes)

KDD 2012 16

Time n=0

Y. Matsubara et al.

Nodes (bloggers) consist of two states

- Un-informed of rumor

- informed, and Blogged about

rumor

U

B

Main idea (details)- 1. Un-informed bloggers (clique of N

bloggers/nodes)- 2. External shock at time nb (e.g, breaking

news)

KDD 2012 17

Time n=0 Time n=nb

Y. Matsubara et al.

External shock- Event happened at time - bloggers are informed, blog about

news

Main idea (details)- 1. Un-informed bloggers (clique of N

bloggers/nodes)- 2. External shock at time nb (e.g, breaking

news)- 3. Infection (word-of-mouth effects)

KDD 2012 18

Time n=0 Time n=nb Time n=nb+1

β

Y. Matsubara et al.

Infectiveness of a blog-post- Strength of infection (quality of news)

- Decay function (how infective a blog posting is)

Main idea (details)- 1. Un-informed bloggers (clique of N

bloggers/nodes)- 2. External shock at time nb (e.g, breaking

news)- 3. Infection (word-of-mouth effects)

KDD 2012 19

Time n=0 Time n=nb Time n=nb+1

β

Y. Matsubara et al.

Infectiveness of a blog-post- Strength of infection (quality of news)

- Decay function (how infective a blog posting is)

Decay function:

Linear scale Log scale

-1.5

SpikeM-base (details)

Equations of SpikeM (base)

KDD 2012 20

- Total population of available bloggers

- Strength of infection/news- External shock at birth (time )- Background noise

Y. Matsubara et al.

Un-informed

Blogged

SpikeM - with periodicity (details)Full equation of SpikeM

KDD 2012 21Y. Matsubara et al.

Un-informed

Blogged Periodicity

12pmPeak activity 3am

Low activity

Time n

activity

Bloggers change their activity over

time(e.g., daily, weekly, yearly)

Learning parameters- Given a real time sequence

- Minimize the error (Levenberg-Marquardt (LM) fitting)

Model fitting (Details)

SpikeM consists of 7 parameters

KDD 2012 22Y. Matsubara et al.

Analysis

SpikeM matches realityexponential rise and power-raw fall

KDD 2012 23

Y. Matsubara et al.

rise fall

SpikeM vs. SI model (susceptible infected model)

Analysis

KDD 2012 24

Linear-log

Log-log

Rise-part

SpikeM: exponential SI model: exponential

Y. Matsubara et al.

rise fall

Reverse

x-axis

Analysis

KDD 2012 25

Linear-log

Log-log

Y. Matsubara et al.

rise fall

Fall-part SpikeM: power law SI model:exponential

SpikeM matches reality

Outline

KDD 2012 26

- Motivation- Problem definition- Proposed method- Experiments- Discussion – SpikeM at work- Conclusions

Y. Matsubara et al.

Experiments

We answer the following questions…

KDD 2012 27

Q1. Match real spikes- Q1-1: K-SC clusters- Q1-2: MemeTracker- Q1-3: Twitter - Q1-4: Google trend

Q2. Forecast future patterns

Y. Matsubara et al.

Q1-1 Explaining K-SC clusters

KDD 2012 28

Six patterns of K-SC [Yang et al. WSDM’11]

SpikeM can generate all patterns in K-SC Y. Matsubara et al.

Q1-2 Matching MemeTracker patterns

KDD 2012 29

MemeTracker (memes in blogs) [Leskovec et al. KDD’09]

SpikeM can fit various patterns in blog

Linear scale

Log scale

Y. Matsubara et al.

Noise-robust fitting

Outliers

Q1-3 Matching Twitter data

KDD 2012 30

Twitter data (hashtags)

SpikeM can generate various patterns in social media

Y. Matsubara et al.

Linear scale

Log scale

Q1-4 Matching Google trend data

KDD 2012 31

Volume of searches for queries on Google

SpikeM can capture various patternsY. Matsubara et al.

Q2 Tail-part forecasts

KDD 2012 32

- Given a first part of the spike - forecast the tail part

SpikeM can capture tail part (AR: fail)

Y. Matsubara et al.

Outline

KDD 2012 33

- Motivation- Problem definition- Proposed method- Experiments- Discussion – SpikeM at work- Conclusions

Y. Matsubara et al.

SpikeM at work

SpikeM is capable of various applications

KDD 2012 34

- A1. What-if

forecasting- A2. Outlier detection- A3. Reverse

engineering

Y. Matsubara et al.

A1. “What-if” forecasting

KDD 2012 35

Forecast not only tail-part, but also rise-part!

e.g., given (1) first spike,(2) release date of two sequel movies (3) access volume before the release date

? ?

(1) First spike (2) Release date

(3) Two weeks before release

Y. Matsubara et al.

A1. “What-if” forecasting

KDD 2012 36

Forecast not only tail-part, but also rise-part!

SpikeM can forecast upcoming spikes Y. Matsubara et al.

(1) First spike (2) Release date

(3) Two weeks before release

A2. Outlier detection

KDD 2012 37

Fitting result of “tsunami (Google trend)” in log-log scale

Y. Matsubara et al.

Another earthquake

One year after Indian

Ocean earthquake

Indian Ocean

earthquake

A3. Reverse engineering

KDD 2012 38

SpikeM provide an intuitive explanationPDF of parameters over 1,000 memes/hashtags

Y. Matsubara et al.

Meme

Twitter

A3. Reverse engineering

KDD 2012 39

SpikeM provide an intuitive explanationPDF of parameters over 1,000 memes/hashtags

Y. Matsubara et al.

Observation 1

Total population N is almost sameMeme

Twitter

A3. Reverse engineering

KDD 2012 40

SpikeM provide an intuitive explanationPDF of parameters over 1,000 memes/hashtags

Y. Matsubara et al.

Observation 2

Strength of first burst

(news) is

Meme

Twitter

A3. Reverse engineering

KDD 2012 41

SpikeM provide an intuitive explanationPDF of parameters over 1,000 memes/hashtags

Y. Matsubara et al.

Observation 3

Daily periodicity with phase shift

Every meme has the same periodicity without

lag

(Twitter)Daily periodicity with

more spread in(i.e., Multiple time zone)

Meme

Twitter

Outline

KDD 2012 42

- Motivation- Background- Proposed method- Experiments- Discussion – SpikeM at work- Conclusions

Y. Matsubara et al.

Conclusions

KDD 2012 43

SpikeM has following advantages:

• Unification powerIt includes earlier patterns/models

• Practicality: It Matches real datasets

• ParsimonyIt requires only 7 parameters

• Usefulness: What-if scenarios, outliers, etc.

Y. Matsubara et al.

Acknowledgements

Thanks Jaewon Yang & Jure Leskovec for the six clusters

[WSDM’11]

Funding

44KDD 2012 Y. Matsubara et al.

Thank you

KDD 2012 45Y. Matsubara et al.

Code: http://www.kecl.ntt.co.jp/csl/sirg/people/yasuko/software.html

Email: matsubara.yasuko lab.ntt.co.jp

Yasushi Sakurai

Yasuko Matsubara

B. Aditya Prakash

Lei Li Christos Faloutsos

top related