INFORMATION CASCADE
Priyanka Garg
OUTLINE
Information Propagation
  Virus Propagation Model
  How to model infection?
Inferring Latent Social Networks
  Inferring edge influence
  Inferring influence volume
INFORMATION PROPAGATION
How does information/infection/influence flow in the network?
Epidemiology:
  Question: Will a virus take over the network?
  Types of virus:
    Susceptible-Infected-Susceptible (SIS). Example: flu
    Susceptible-Infected-Removed (SIR). Example: chicken pox, deadly diseases
Viral Marketing:
  Once a node is infected, it remains infected.
  Question: How to select a subset of persons such that the maximum number of persons can be influenced?
HOW TO MODEL INFECTION?
Simple model: Each infected node infects its neighbor with a fixed probability.
SIS:
  A node infects its neighbor with probability b (how infectious is the virus?)
  A node recovers with probability a (how easy is it to get cured?)
  Strength of virus = b/a
  Result: If virus strength < t, then the virus will eventually become extinct.
  t = 1/(largest eigenvalue of the adjacency matrix A).
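The threshold test above can be sketched in a few lines. This is a minimal illustration, not the method used in the talk: it assumes an undirected graph with at least one edge, given as a symmetric 0/1 list-of-lists, and it approximates the largest eigenvalue with power iteration on A + I (a swapped-in technique chosen only to keep the example dependency-free). The names `largest_eigenvalue`, `epidemic_threshold` and `dies_out` are hypothetical.

```python
def largest_eigenvalue(A, iterations=200):
    """Approximate the largest eigenvalue of a symmetric, non-negative
    adjacency matrix A (list of lists) by power iteration on A + I."""
    n = len(A)
    x = [1.0] * n
    lam = 1.0
    for _ in range(iterations):
        # y = (A + I) x; the shift by I avoids oscillation on bipartite graphs.
        y = [x[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = max(abs(v) for v in y)   # tends to (largest eigenvalue of A) + 1
        x = [v / lam for v in y]
    return lam - 1.0


def epidemic_threshold(A):
    """t = 1 / (largest eigenvalue of A). Assumes A has at least one edge."""
    return 1.0 / largest_eigenvalue(A)


def dies_out(b, a, A):
    """True if the virus strength b/a is below the threshold t."""
    return b / a < epidemic_threshold(A)


# Complete graph on 4 nodes: the largest eigenvalue is 3, so t = 1/3.
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(epidemic_threshold(K4), dies_out(0.1, 0.5, K4))
```

For larger graphs a linear-algebra library would normally replace the hand-written iteration.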
HOW TO MODEL INFECTION?
Independent Contagion Model
  Each infected node i infects its neighboring node j with probability p_ij.
Threshold Model
  Each infected node i influences its neighboring node j with weight w_ij.
  Node j becomes active if ∑_{i ∈ neigh(j), i active} w_ij > th_j.
  th_j is the threshold of node j.
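Both models can be sketched as small simulations. The code below is an illustration under stated assumptions: in the independent model each newly infected node gets exactly one chance to infect each neighbor (the usual convention, not spelled out on the slide), and the threshold model is advanced one synchronous step at a time. The function names and the dictionary layout (`p[i][j]`, `w[i][j]`, `th[j]`) are hypothetical.

```python
import random


def independent_cascade(p, seeds, rng=None):
    """p[i][j] is the probability that infected node i infects neighbor j.
    Assumption: each newly infected node tries each neighbor exactly once."""
    rng = rng or random.Random(0)
    infected = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly_infected = []
        for i in frontier:
            for j, p_ij in p.get(i, {}).items():
                if j not in infected and rng.random() < p_ij:
                    infected.add(j)
                    newly_infected.append(j)
        frontier = newly_infected
    return infected


def threshold_step(w, th, active):
    """One synchronous step of the threshold model: an inactive node j
    becomes active when the summed weight w[i][j] from its active
    neighbors i exceeds its threshold th[j]."""
    newly_active = set()
    for j in th:
        if j in active:
            continue
        total = sum(w.get(i, {}).get(j, 0.0) for i in active)
        if total > th[j]:
            newly_active.add(j)
    return set(active) | newly_active
```

Calling `threshold_step` repeatedly until the active set stops growing gives the final set of active nodes.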
HOW TO MODEL INFECTION?: GENERAL CONTAGION MODEL
General language to describe information diffusion.
Model:
  S is the set of infected nodes that tried but failed to infect node v.
  A new node u becomes infected.
  The probability of node u successfully influencing node v also depends on S: p_v(u, S)
Examples:
  Node becomes active if k of its neighbors are active:
    i.e. if |S| + 1 ≥ k then p_v(u, S) = 1 else 0
  Independent Cascade: p_v(u, S) = p(u, v)
  Threshold model: if (p(S, v) + p(u, v)) > t then p_v(u, S) = 1 else 0
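The three special cases can be written as interchangeable functions of (u, S), which is what makes the general contagion model a common language for them. The sketch below makes two assumptions that are not on the slide: "k of its neighbors are active" is read as |S| + 1 ≥ k, and p(S, v) is taken to be the sum of p(s, v) over s in S. All function names are hypothetical.

```python
def k_neighbors_rule(k):
    """Node v becomes active once k of its neighbors are active.
    Assumption: read as |S| + 1 >= k, counting the new node u."""
    def p_v(u, S):
        return 1.0 if len(S) + 1 >= k else 0.0
    return p_v


def independent_rule(p, v):
    """Independent cascade: p_v(u, S) = p(u, v), regardless of S."""
    def p_v(u, S):
        return p[(u, v)]
    return p_v


def threshold_rule(p, v, t):
    """Threshold model: 1 if p(S, v) + p(u, v) > t, else 0.
    Assumption: p(S, v) is the sum of p(s, v) over s in S."""
    def p_v(u, S):
        prior = sum(p[(s, v)] for s in S)
        return 1.0 if prior + p[(u, v)] > t else 0.0
    return p_v
```

A simulator only needs to call `p_v(u, S)`; swapping the rule changes the diffusion model without touching the simulation loop.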
HOW TO MODEL INFECTION?: GENERAL CONTAGION MODEL
Can also model the diminishing returns property:
  if S ⊃ T, then Gain(S + u) < Gain(T + u)
  Gain = probability of infecting neighbor j
CHALLENGES IN USING THESE MODELS
Problem under consideration:
  Viral marketing: How to select a subset of persons such that the maximum number of persons can be influenced?
How to find the infection probabilities/weights of every edge?
INFERRING INFECTION PROBABILITIES
We know the infection times over a large number of cascades.
Train: Maximize the likelihood of node infections over all the nodes in all the cascades.
  Likelihood = ∏_c ∏_i P_{i,c}
  P_i = P(i gets infected at time t_i | infected nodes)
Independent Contagion Model
  P_i = P(at least one of the already infected nodes infects node i)
  P_i = 1 - ∏_j (1 - (probability of infection from node j to node i at time t_i))
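The likelihood above can be evaluated directly once candidate edge probabilities are given. The sketch below is illustrative only: a cascade is assumed to be a dictionary mapping node to infection time, `p[(j, i)]` is a hypothetical lookup for the probability that j infects i, "already infected" is read as "infected strictly earlier", and the earliest nodes of each cascade are skipped because they have no possible infector.

```python
import math


def infection_probability(i, cascade, p):
    """P_i = 1 - prod_j (1 - p[(j, i)]) over nodes j infected before i."""
    t_i = cascade[i]
    nobody_succeeds = 1.0
    for j, t_j in cascade.items():
        if t_j < t_i:
            nobody_succeeds *= 1.0 - p.get((j, i), 0.0)
    return 1.0 - nobody_succeeds


def log_likelihood(cascades, p):
    """Sum of log P_{i,c} over all cascades c and infected nodes i.
    Assumption: the first infected node(s) of each cascade are skipped.
    math.log raises ValueError if some P_i is 0 (no possible infector)."""
    total = 0.0
    for cascade in cascades:
        first = min(cascade.values())
        for i, t_i in cascade.items():
            if t_i == first:
                continue
            total += math.log(infection_probability(i, cascade, p))
    return total
```

Training then amounts to searching for the `p` values that maximize `log_likelihood`; the optimizer itself is outside this sketch.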
INFERRING INFECTION PROBABILITIES
Variability with time:
  Infection probabilities vary with time. Let w(t) be the distribution that captures this variability.
  The probability of node j infecting node i at time t is w(t - t_j) * A_ji, where t_j is the infection time of node j.
  Thus: P_i = 1 - ∏_j (1 - w(t_i - t_j) A_ji)
The log-likelihood maximization problem can be shown to be a convex optimization problem.
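The time-dependent infection probability can be sketched by passing w in as a function. The exponential decay used here is only a hypothetical choice of w (with rate ≤ 1 so that w(t) * A_ji stays a valid probability); the slide does not fix a particular form. `A[(j, i)]` is an assumed lookup for A_ji.

```python
import math


def infection_probability(i, times, A, w):
    """P_i = 1 - prod_j (1 - w(t_i - t_j) * A[(j, i)]),
    taken over nodes j infected strictly before node i."""
    t_i = times[i]
    nobody_succeeds = 1.0
    for j, t_j in times.items():
        if t_j < t_i:
            nobody_succeeds *= 1.0 - w(t_i - t_j) * A.get((j, i), 0.0)
    return 1.0 - nobody_succeeds


def exponential_w(rate):
    """Hypothetical w(t) = rate * exp(-rate * t); use rate <= 1."""
    return lambda dt: rate * math.exp(-rate * dt)


times = {"a": 0.0, "b": 1.0, "c": 2.0}
A = {("a", "c"): 1.0, ("b", "c"): 1.0}
print(infection_probability("c", times, A, exponential_w(0.5)))
```

With a decaying w, the more recently infected node b contributes more to P_c than the older infection at a.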
ANOTHER APPROACH: MORE DIRECT
How many nodes are infected at any time t?
The number of infected nodes at time t depends only on the nodes that were already infected.
Model:
  V(t) is the number of nodes infected at time t
  V(t+1) = ∑_{u=1..N} ∑_{l=0..L-1} M_u(t-l) I_u(l+1)
  M_u(t) = 1 if node u is infected at time t, else 0
  I_u(l) = influence of node u, l time steps after its infection (infection variability with time)
Minimize the difference between V(t) and the observed volume at every time t.
Accounting for novelty: V(t+1) = α(t) ∑_{u=1..N} ∑_{l=0..L-1} M_u(t-l) I_u(l+1)
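The volume formula can be evaluated with two nested loops. The sketch below assumes `M[u]` is a 0/1 list indexed by time and `I[u]` is a list of length L whose entry `l` stores I_u(l+1); times before 0 count as "not infected". Measuring the "difference" as a sum of squared errors is also an assumption, since the slide does not name a loss. Function names are hypothetical.

```python
def predict_volume(M, I, t, alpha=1.0):
    """V(t+1) = alpha * sum_u sum_{l=0}^{L-1} M_u(t-l) * I_u(l+1).
    alpha=1.0 gives the basic model; alpha=alpha(t) accounts for novelty."""
    total = 0.0
    for u in M:
        for l in range(len(I[u])):
            if 0 <= t - l < len(M[u]):
                total += M[u][t - l] * I[u][l]   # I[u][l] stores I_u(l+1)
    return alpha * total


def squared_error(M, I, observed):
    """Assumed loss: sum over t of (V(t+1) - observed[t+1])^2."""
    return sum(
        (predict_volume(M, I, t) - observed[t + 1]) ** 2
        for t in range(len(observed) - 1)
    )


M = {"u": [1, 0, 0], "v": [0, 1, 0]}
I = {"u": [3.0, 2.0, 1.0], "v": [4.0, 1.0, 0.0]}
print(predict_volume(M, I, t=1))
```

Fitting the model means choosing the entries of `I` that make `squared_error` as small as possible on observed volumes.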
THANK YOU
SIS
Let p_{i,t} = P(i is infected at time t)
t_{i,t} = P(i doesn't receive infection from its neighbors)
t_{i,t} = ∏_{j ∈ neigh(i)} (p_{j,t-1} (1 - b) + 1 - p_{j,t-1})
1 - p_{i,t} = P(i is healthy at t-1 and didn't receive infection)
            + P(i is infected at t-1, recovered, and didn't receive infection)
            + P(i is not infected at t-1 but got cured after infection at t)
1 - p_{i,t} = (1 - p_{i,t-1}) t_{i,t} + p_{i,t-1} a t_{i,t} + 0.5 (1 - p_{i,t-1}) (1 - t_{i,t}) a
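The probability that node i receives no infection from its neighbors (the product over neigh(i) above) is the one piece of this derivation that is straightforward to compute on its own. A minimal sketch, assuming `p_prev[j]` holds P(j is infected at time t-1) and `neighbors[i]` lists the neighbors of i; both names are hypothetical.

```python
def prob_no_infection(i, p_prev, neighbors, b):
    """Product over neighbors j of (p_j * (1 - b) + 1 - p_j):
    either j is infected but fails to transmit, or j is not infected."""
    result = 1.0
    for j in neighbors[i]:
        result *= p_prev[j] * (1.0 - b) + 1.0 - p_prev[j]
    return result


neighbors = {0: [1, 2]}
p_prev = [0.0, 1.0, 0.5]
print(prob_no_infection(0, p_prev, neighbors, b=0.5))
```

Each factor simplifies to 1 - b * p_j, so the result shrinks as neighbors become more likely to be infected or as b grows.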