Label propagation on graphs
Leonid E. Zhukov
School of Data Analysis and Artificial Intelligence
Department of Computer Science
National Research University Higher School of Economics
Structural Analysis and Visualization of Networks
Leonid E. Zhukov (HSE) Lecture 17 19.05.2015 1 / 26
Lecture outline
1 Label propagation problem
2 Collective classification
  Iterative classification
3 Semi-supervised learning
  Random walk based methods
  Graph regularization
Label propagation
Label propagation: labeling of all nodes in a graph structure
A subset of nodes is labeled with categorical/numeric/binary values
Extend the labeling to all nodes of the graph
Also known as classification in networked data, network classification, structured inference
Label propagation problem
Structure can help only if labels/values of linked nodes are correlated
Social networks show assortative mixing, a bias in favor of connections between network nodes with similar characteristics:
– homophily: similar characteristics → connections
– influence: connections → similar characteristics
Can also be applied to constructed (induced) similarity networks
Network classification
Supervised learning approach
Given graph nodes V = Vl ∪ Vu:
– nodes Vl have given labels Yl
– nodes Vu do not have labels
Need to find Yu
Labels can be binary, multi-class, or real-valued
Features (attributes) φi can be computed for every node:
– local node features (if available)
– link features (labels from neighbors, attributes from neighbors, node degrees, connectivity patterns)
Feature (design) matrix Φ = (Φl, Φu)
Network learning components
Local classifier. A locally learned model that predicts a node label from node attributes alone; uses no network information
Relational classifier. Takes into account labels and attributes of the node's neighbors; uses neighborhood network information
Collective classifier. Estimates unknown values jointly by applying the relational classifier iteratively; strongly depends on network structure
Collective classification
Algorithm: Iterative classification method
Input: graph G(V, E), labels Yl
Output: labels Y
Compute Φ^(0)
Train classifier on (Φl^(0), Yl)
Predict Yu^(0)
repeat
    Compute Φu^(t)
    Train classifier on (Φ^(t), Y^(t))
    Predict Yu^(t+1) from Φu^(t)
until Yu^(t) converges
Y ← Y^(t)
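The loop above can be sketched in code. A minimal sketch assuming numpy; the single relational feature (weighted neighbor-label counts) and the argmax "classifier" stand in for an actual trained model, and the function name is illustrative:

```python
import numpy as np

def iterative_classification(A, y, n_classes, max_iter=50):
    """Sketch of the iterative classification loop.

    A : (n, n) adjacency matrix
    y : length-n label array, -1 marks unlabeled nodes
    Returns a full label vector for all nodes.
    """
    labeled = y >= 0
    pred = y.copy()
    # fallback class for nodes with no labeled neighbors yet
    majority = np.bincount(y[labeled], minlength=n_classes).argmax()
    for _ in range(max_iter):
        new_pred = pred.copy()
        for i in np.where(~labeled)[0]:
            # relational feature Phi_i: weighted counts of neighbor labels
            counts = np.zeros(n_classes)
            for j in np.where(A[i] > 0)[0]:
                if pred[j] >= 0:
                    counts[pred[j]] += A[i, j]
            # "classifier" stand-in: pick the heaviest class
            new_pred[i] = counts.argmax() if counts.sum() > 0 else majority
        if np.array_equal(new_pred, pred):  # Y^(t) converged
            break
        pred = new_pred
    return pred
```

A real implementation would retrain a classifier on (Φ^(t), Y^(t)) each round; here the argmax over neighbor-label counts plays that role to keep the sketch self-contained.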
Iterative classification
[Illustration omitted in the transcript]
Relational classifiers
Weighted-vote relational neighbor classifier:

P(yi = c | Ni) = (1/Z) ∑_{j∈Ni} Aij P(yj = c | Nj)

Network-only Bayes classifier:

P(yi = c | Ni) = P(Ni | c) P(c) / P(Ni)

where

P(Ni | c) = (1/Z) ∏_{j∈Ni} P(yj = ỹj | yi = c)

with ỹj the observed label of neighbor j
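A minimal numpy sketch of the weighted-vote relational neighbor (wvRN) update from the first formula; the function name and the relaxation-style in-place update are illustrative choices:

```python
import numpy as np

def wvrn(A, Y, unlabeled, n_iter=100):
    """Weighted-vote relational neighbor classifier (wvRN) sketch.

    Y : (n, k) class-probability matrix; rows of labeled nodes are
        one-hot and are never overwritten.
    """
    Y = Y.copy()
    for _ in range(n_iter):
        for i in unlabeled:
            z = A[i].sum()  # normalization Z
            if z > 0:
                # P(y_i = c | N_i) = (1/Z) sum_{j in N_i} A_ij P(y_j = c)
                Y[i] = A[i] @ Y / z
    return Y
```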
Semi-supervised learning
Graph-based semi-supervised learning
Given partially labeled dataset
Data: X = Xl ∪ Xu
– small set of labeled data (Xl, Yl)
– large set of unlabeled data Xu
Similarity graph over the data points G(V, E), where every vertex vi corresponds to a data point xi
Transductive learning: learn a function that predicts labels Yu for the unlabeled input Xu
Random walk methods
Consider a random walk with absorbing states at the labeled nodes Vl
Probability yi[c] for node vi ∈ Vu to have label c:

yi[c] = ∑_{j∈Vl} p^∞_ij yj[c]

where yi[c] is a probability distribution over labels and pij = P(i → j) is the one-step transition probability matrix
If the output requires a single label per node, assign the most probable one
In matrix form:

Ŷ = P^∞ Y, where Y = (Yl, 0), Ŷ = (Yl, Ŷu)
Random walk methods
Random walk matrix: P = D⁻¹A
Random walk with absorbing states:

P = [ Pll  Plu ]   [ I    0   ]
    [ Pul  Puu ] = [ Pul  Puu ]

In the t → ∞ limit:

lim_{t→∞} P^t = [ I                       0     ]   [ I                 0 ]
                [ (∑_{n=0}^∞ Puu^n) Pul   Puu^∞ ] = [ (I − Puu)⁻¹ Pul   0 ]
Random walk methods
Matrix equation:

[ Ŷl ]   [ I                 0 ] [ Yl ]
[ Ŷu ] = [ (I − Puu)⁻¹ Pul   0 ] [ Yu ]

Solution:

Ŷl = Yl
Ŷu = (I − Puu)⁻¹ Pul Yl

(I − Puu) is non-singular for all label-connected graphs (graphs where it is always possible to reach a labeled node from any unlabeled node)
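The solution Ŷu = (I − Puu)⁻¹ Pul Yl can be computed directly with one linear solve. A minimal numpy sketch (function name is illustrative):

```python
import numpy as np

def absorbing_rw_labels(A, Yl, labeled):
    """Solve Yu = (I - Puu)^{-1} Pul Yl for the absorbing random walk.

    A       : (n, n) adjacency matrix
    Yl      : (l, k) one-hot labels of the labeled nodes
    labeled : length-n boolean mask of labeled nodes
    """
    P = A / A.sum(axis=1, keepdims=True)   # P = D^{-1} A
    u = ~labeled
    Puu = P[np.ix_(u, u)]                  # unlabeled -> unlabeled block
    Pul = P[np.ix_(u, labeled)]            # unlabeled -> labeled block
    I = np.eye(Puu.shape[0])
    return np.linalg.solve(I - Puu, Pul @ Yl)
```

On a path 0–1–2–3 with the endpoints labeled with different classes, this gives the harmonic interpolation: node 1 leans 2/3 toward node 0's class, node 2 leans 2/3 toward node 3's class.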
Label propagation
Algorithm: Label propagation, Zhu et al. 2002
Input: graph G(V, E), labels Yl
Output: labels Y
Compute Dii = ∑_j Aij
Compute P = D⁻¹A
Initialize Y^(0) = (Yl, 0), t = 0
repeat
    Y^(t+1) ← P · Y^(t)
    Yl^(t+1) ← Yl^(0)
    t ← t + 1
until Y^(t) converges
Y ← Y^(t)
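A minimal numpy sketch of this algorithm (function name and convergence test are illustrative): propagate Y ← PY, then clamp the labeled rows back to their given values each round:

```python
import numpy as np

def label_propagation(A, Y0, labeled, tol=1e-9, max_iter=1000):
    """Zhu et al. (2002) label propagation sketch.

    Y0 : (n, k) initial labels: one-hot rows for labeled nodes,
         zeros for unlabeled ones. `labeled` is a boolean mask.
    """
    P = A / A.sum(axis=1, keepdims=True)   # P = D^{-1} A
    Y = Y0.astype(float).copy()
    for _ in range(max_iter):
        Y_new = P @ Y                      # Y^(t+1) <- P Y^(t)
        Y_new[labeled] = Y0[labeled]       # clamp labeled nodes
        if np.abs(Y_new - Y).max() < tol:  # convergence check
            return Y_new
        Y = Y_new
    return Y
```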
Label propagation
Iteration:

Y^(t) = P Y^(t−1)
Yu^(t) = Pul Yl + Puu Yu^(t−1)
Yu^(t) = ∑_{n=1}^{t} Puu^{n−1} Pul Yl + Puu^t Yu^(0)

Convergence:

Yu = lim_{t→∞} Yu^(t) = (I − Puu)⁻¹ Pul Yl
Label spreading
Algorithm: Label spreading, Zhou et al. 2004
Input: graph G(V, E), labels Yl
Output: labels Y
Compute Dii = ∑_j Aij
Compute S = D^{−1/2}AD^{−1/2}
Initialize Y^(0) = (Yl, 0), t = 0
repeat
    Y^(t+1) ← αSY^(t) + (1 − α)Y^(0)
    t ← t + 1
until Y^(t) converges
Y ← Y^(t)
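A matching numpy sketch (illustrative names); for α < 1 the iteration converges, and the result can be checked against the closed-form solution Y = (1 − α)(I − αS)⁻¹Y^(0) derived on the next slide:

```python
import numpy as np

def label_spreading(A, Y0, alpha=0.5, tol=1e-9, max_iter=1000):
    """Zhou et al. (2004) label spreading sketch."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ A @ D_inv_sqrt          # S = D^{-1/2} A D^{-1/2}
    Y = Y0.astype(float).copy()
    for _ in range(max_iter):
        # keep injecting the initial labels with weight (1 - alpha)
        Y_new = alpha * (S @ Y) + (1 - alpha) * Y0
        if np.abs(Y_new - Y).max() < tol:
            return Y_new
        Y = Y_new
    return Y
```

Unlike label propagation, labeled rows are not clamped: they are pulled toward their initial values by the (1 − α)Y^(0) term, so the output labels may be softened.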
Label spreading
Iterations:

Y^(t) = αSY^(t−1) + (1 − α)Y^(0)

In the t → ∞ limit:

lim_{t→∞} Y^(t) = lim_{t→∞} (αSY^(t−1) + (1 − α)Y^(0))
Y = αSY + (1 − α)Y^(0)

Solution:

Y = (1 − α)(I − αS)⁻¹ Y^(0)
Regularization on graphs
Find a labeling Ŷ = (Ŷl, Ŷu) that is
Consistent with the initial labeling:

∑_{i∈Vl} (ŷi − yi)² = ||Ŷl − Yl||²

Consistent with the graph structure (regression function smoothness):

(1/2) ∑_{i,j∈V} Aij (ŷi − ŷj)² = Ŷᵀ(D − A)Ŷ = ŶᵀLŶ

Stable (additional regularization):

ε ∑_{i∈V} ŷi² = ε||Ŷ||²
Regularization on graphs
Minimization with respect to Ŷ: arg min_Ŷ Q(Ŷ)
Label propagation [Zhu, 2002]:

Q(Ŷ) = (1/2) ∑_{i,j∈V} Aij (ŷi − ŷj)² = ŶᵀLŶ, with fixed Ŷl = Yl

Label spreading [Zhou, 2003]:

Q(Ŷ) = (1/2) ∑_{i,j∈V} Aij (ŷi/√di − ŷj/√dj)² + μ ∑_{i∈V} (ŷi − yi)²
Q(Ŷ) = ŶᵀLŶ + μ||Ŷ − Y||²

where here L = I − S = I − D^{−1/2}AD^{−1/2} is the normalized Laplacian
Regularization on graphs
Laplacian regularization [Belkin, 2003]:

Q(Ŷ) = (1/2) ∑_{i,j∈V} Aij (ŷi − ŷj)² + μ ∑_{i∈Vl} (ŷi − yi)²
Q(Ŷ) = ŶᵀLŶ + μ||Ŷl − Yl||²

Use the eigenvectors (e1 … ep) with the smallest eigenvalues of L = D − A:

L ej = λj ej

Construct a classifier (regression function) on the eigenvectors:

Err(a) = ∑_{i∈Vl} (yi − ∑_{j=1}^{p} aj eji)²

Predict value (classify): ŷi = ∑_{j=1}^{p} aj eji, class ci = sign(ŷi)
Laplacian regularization
Algorithm: Laplacian regularization, Belkin and Niyogi, 2003
Input: graph G(V, E), labels Yl
Output: labels Y
Compute Dii = ∑_j Aij
Compute L = D − A
Compute the p eigenvectors e1 … ep with the smallest eigenvalues of L, Le = λe
Minimize over a1 … ap:
arg min_{a1…ap} ∑_{i=1}^{l} (yi − ∑_{j=1}^{p} aj eji)², solved by a = (EᵀE)⁻¹EᵀYl
Label vi by sign(∑_{j=1}^{p} aj eji)
Label propagation example
[Example figures omitted in the transcript]
References
S. A. Macskassy, F. Provost. Classification in networked data: a toolkit and a univariate case study. Journal of Machine Learning Research 8, 935-983, 2007.
Y. Bengio, O. Delalleau, N. Le Roux. Label propagation and quadratic criterion. In Semi-Supervised Learning, Eds. O. Chapelle, B. Scholkopf, A. Zien, MIT Press, 2006.
S. Bhagat, G. Cormode, S. Muthukrishnan. Node classification in social networks. In Social Network Data Analytics, Ed. C. Aggarwal, 2011, pp. 115-148.
D. Zhou, O. Bousquet, T. Lal, J. Weston, B. Scholkopf. Learning with local and global consistency. In NIPS, volume 16, 2004.
X. Zhu, Z. Ghahramani, J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In ICML, 2003.
M. Belkin, P. Niyogi, V. Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research 7, 2399-2434, 2006.