Transcript
Page 1: Spread influence on social networks

Detection of Opinion Leaders and Spread of

Influence on Social Networks

Armando Vieira

1

Page 2: Spread influence on social networks

Social network plays a fundamental role as a medium for the spread of INFLUENCE among its members Opinions, ideas, information,

innovation…

Direct Marketing takes the “word-of-mouth” effects to significantly increase profits (Facebook, Twitter, Youtube …)

2

Page 3: Spread influence on social networks

Problem Setting

Given a limited budget B for initial advertising (e.g. give away free

samples of product) estimates for influence between individuals

Goal trigger a large cascade of influence (e.g. further adoptions

of a product) Question

Which set of individuals should B target at? Application besides product marketing

spread an innovation detect stories in blogs

3

Page 4: Spread influence on social networks

What we need?

Form models of influence in social networks. Obtain data about particular network (to

estimate inter-personal influence). Devise algorithm to maximize spread of

influence.

4

Page 5: Spread influence on social networks

Outline

Models of influence Linear Threshold Independent Cascade

Influence maximization problem Algorithm Proof of performance bound Compute objective function

Experiments Data and setting Results

5

Page 6: Spread influence on social networks

Outline

Models of influence Linear Threshold Independent Cascade

Influence maximization problem Algorithm Proof of performance bound Compute objective function

Experiments Data and setting Results

6

Page 7: Spread influence on social networks

Models of Influence

First mathematical models [Schelling '70/'78, Granovetter '78]

Large body of subsequent work: [Rogers '95, Valente '95, Wasserman/Faust '94]

Two basic classes of diffusion models: threshold and cascade

General operational view: A social network is represented as a directed graph, with

each person (customer) as a node Nodes start either active or inactive An active node may trigger activation of neighboring

nodes Monotonicity assumption: active nodes never deactivate

7

Page 8: Spread influence on social networks

Outline

Models of influence Linear Threshold Independent Cascade

Influence maximization problem Algorithm Proof of performance bound Compute objective function

Experiments Data and setting Results

8

Page 9: Spread influence on social networks

Linear Threshold Model

A node v has random threshold θv ~ U[0,1]

A node v is influenced by each neighbor w according to a weight bvw such that

A node v becomes active when at least

(weighted) θv fraction of its neighbors are active

, active neighbor of

v w vw v

b

, neighbor of

1v ww v

b

9

Page 10: Spread influence on social networks

Example

Inactive Node

Active Node

Threshold

Active neighbors

vw 0.5

0.30.2

0.5

0.10.4

0.3 0.2

0.6

0.2

Stop!

U

X

10

Page 11: Spread influence on social networks

Outline

Models of influence Linear Threshold Independent Cascade

Influence maximization problem Algorithm Proof of performance bound Compute objective function

Experiments Data and setting Results

11

Page 12: Spread influence on social networks

Independent Cascade Model

When node v becomes active, it has a single chance of activating each currently inactive neighbor w.

The activation attempt succeeds with probability pvw .

12

Page 13: Spread influence on social networks

Example

vw 0.5

0.3 0.20.5

0.10.4

0.3 0.2

0.6

0.2

Inactive Node

Active Node

Newly active node

Successful attempt

Unsuccessfulattempt

Stop!

UX

13

Page 14: Spread influence on social networks

Outline

Models of influence Linear Threshold Independent Cascade

Influence maximization problem Algorithm Proof of performance bound Compute objective function

Experiments Data and setting Results

14

Page 15: Spread influence on social networks

Influence Maximization Problem

Influence of node set S: f(S) expected number of active nodes at the end, if set

S is the initial active set Problem:

Given a parameter k (budget), find a k-node set S to maximize f(S)

Constrained optimization problem with f(S) as the objective function

15

Page 16: Spread influence on social networks

Outline

Models of influence Linear Threshold Independent Cascade

Influence maximization problem Algorithm Proof of performance bound Compute objective function

Experiments Data and setting Results

16

Page 17: Spread influence on social networks

f(S): properties (to be demonstrated)

Non-negative (obviously) Monotone: Submodular:

Let N be a finite set A set function is submodular iff

(diminishing returns)

: 2Nf

, \ ,

( ) ( ) ( ) ( )

S T N v N T

f S v f S f T v f T

( ) ( )f S v f S

17

Page 18: Spread influence on social networks

Bad News

For a submodular function f, if f only takes non-negative value, and is monotone, finding a k-element set S for which f(S) is maximized is an NP-hard optimization problem[GFN77, NWF78].

It is NP-hard to determine the optimum for influence maximization for both independent cascade model and linear threshold model.

18

Page 19: Spread influence on social networks

Good News

We can use Greedy Algorithm! Start with an empty set S For k iterations:

Add node v to S that maximizes f(S +v) - f(S).

How good (bad) it is? Theorem: The greedy algorithm is a (1 – 1/e)

approximation. The resulting set S activates at least (1- 1/e) > 63%

of the number of nodes that any size-k set S could activate.

19

Page 20: Spread influence on social networks

Outline

Models of influence Linear Threshold Independent Cascade

Influence maximization problem Algorithm Proof of performance bound Compute objective function

Experiments Data and setting Results

20

Page 21: Spread influence on social networks

Key 1: Prove submodularity

, \ ,

( ) ( ) ( ) ( )

S T N v N T

f S v f S f T v f T

21

Page 22: Spread influence on social networks

Submodularity for Independent Cascade

Coins for edges are flipped during activation attempts.

0.5

0.30.5

0.10.4

0.3 0.2

0.6

0.2

22

Page 23: Spread influence on social networks

Submodularity for Independent Cascade

Coins for edges are flipped during activation attempts.

Can pre-flip all coins and reveal results immediately.

0.5

0.30.5

0.10.4

0.3 0.2

0.6

0.2

Active nodes in the end are reachable via green paths from initially targeted nodes.

Study reachability in green graphs

23

Page 24: Spread influence on social networks

Submodularity, Fixed Graph

Fix “green graph” G. g(S) are nodes reachable from S in G.

Submodularity: g(T +v) - g(T) g(S +v) - g(S) when S T.

V

S

T

g(S)

g(T)

g(v)

g(S +v) - g(S): nodes reachable from S + v, but not from S.

From the picture: g(T +v) - g(T) g(S +v) - g(S) when S T (indeed!).

24

Page 25: Spread influence on social networks

Submodularity of the Function

gG(S): nodes reachable from S in G.

Each gG(S): is submodular (previous slide). Probabilities are non-negative.

Fact: A non-negative linear combination of submodular functions is submodular

( ) Prob( ) ( )GG

f S G is green graph g S

25

Page 26: Spread influence on social networks

Submodularity for Linear Threshold

Use similar “green graph” idea. Once a graph is fixed, “reachability” argument is

identical. How do we fix a green graph now? Assume every time nodes pick their threshold

uniformly from [0-1]. Each node picks at most one incoming edge, with

probabilities proportional to edge weights. Equivalent to linear threshold model (trickier proof).

26

Page 27: Spread influence on social networks

Outline

Models of influence Linear Threshold Independent Cascade

Influence maximization problem Algorithm Proof of performance bound Compute objective function

Experiments Data and setting Results

27

Page 28: Spread influence on social networks

Key 2: Evaluating f(S)

28

Page 29: Spread influence on social networks

Evaluating ƒ(S)

How to evaluate ƒ(S)? Still an open question of how to compute

efficiently But: very good estimates by simulation

Generate green graph G’ often enough (polynomial in n; 1/ε). Apply Greedy algorithm to G’.

Achieve (1± ε)-approximation to f(S). Generalization of Nemhauser/Wolsey proof

shows: Greedy algorithm is now a (1-1/e- ε′)-approximation.

29

Page 30: Spread influence on social networks

Outline

Models of influence Linear Threshold Independent Cascade

Influence maximization problem Algorithm Proof of performance bound Compute objective function

Experiments Data and setting Results

30

Page 31: Spread influence on social networks

Experiment Data

A collaboration graph obtained from co-authorships in papers of the arXiv high-energy physics theory section

co-authorship networks arguably capture many of the key features of social networks more generally

Resulting graph: 10748 nodes, 53000 distinct edges

31

Page 32: Spread influence on social networks

Experiment Settings

Linear Threshold Model: multiplicity of edges as weights weight(v→ω) = Cvw / dv, weight(ω→v) = Cwv / dw

Independent Cascade Model: Case 1: uniform probabilities p on each edge Case 2: edge from v to ω has probability 1/ dω of

activating ω.

Simulate the process 10000 times for each targeted set, re-choosing thresholds or edge outcomes pseudo-randomly from [0, 1] every time

Compare with other 3 common heuristics (in)degree centrality, distance centrality, random nodes.32

Page 33: Spread influence on social networks

Outline

Models of influence Linear Threshold Independent Cascade

Influence maximization problem Algorithm Proof of performance bound Compute objective function

Experiments Data and setting Results

33

Page 34: Spread influence on social networks

Results: linear threshold model

34

Page 35: Spread influence on social networks

Independent Cascade Model – Case 2

Reminder: linear threshold model

35

Page 36: Spread influence on social networks

Open Questions

Study more general influence models. Find

trade-offs between generality and feasibility. Deal with negative influences. Model competing ideas. Obtain more data about how activations

occur

in real social networks.

36

Page 37: Spread influence on social networks

Thanks!

37

Page 38: Spread influence on social networks

Influence Maximization When Negative Opinions

may Emerge and PropagateAuthors: Wei Chan and lot others

Microsoft Research Technical Report 2010

38

Page 39: Spread influence on social networks

Model of Influence

Similar to independent cascade model. When node v becomes active, it has a single chance

of activating each currently inactive neighbor w. The activation attempt succeeds with probability pvw .

If node v is negative then node w also becomes negative.

If node v is positive then node w becomes positive with probability q else becomes negative.

q is the product quality factor.

39

Page 40: Spread influence on social networks

Model of Influence

Intuition: Negative opinions originate from imperfect

product/service quality. Negative node generates the negative opinions. Positive and Negative opinions are asymmetric

Negative opinions are generally much stronger.

40

Page 41: Spread influence on social networks

Towards sub-modularity

Generate a deterministic graph G’ from G by flipping coin (biased according to the edge influence probability) for each edge.

PAPv = Positive activation probability of v.

= q^{shortest path from S to v + 1} F(S, G’) = E(# positively activated nodes)

= sum(PAPv) F(S,G’) is monotone and submodular. F(S,G) = E(F(S,G’) over all possible G’)

41

Page 42: Spread influence on social networks

Other models

Different quality factor for every node Stronger negative influence probability Different propagation delays

None of them are sub-modular in nature!

42


Top Related