Big Data Analytics
Lucas Rego Drumond
Information Systems and Machine Learning Lab (ISMLL)
Institute of Computer Science
University of Hildesheim, Germany
Recommender Systems Part II
Lucas Rego Drumond, Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Germany
Recommender Systems Part II 1 / 31
Outline
1. Review
2. More on factorization models
   2.1 Adding bias terms
3. SVD++
4. Item Prediction
5. From Recommender Systems to Graphs
   5.1 Recommender Systems as a link prediction problem
   5.2 Link Prediction Approaches
Big Data Analytics 1. Review
Recommender Systems
Formalization
- U: set of users
- I: set of items
- Ratings data D ⊆ U × I × R

Ratings data D are typically represented as a sparse matrix R ∈ R^{|U|×|I|}, with users as rows and items as columns.
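One straightforward way to hold such a sparse matrix in code is a dict of dicts, storing only the observed entries. This is a minimal sketch (not from the slides); the ratings used are those of the example that follows:

```python
# Ratings data D stored sparsely: R[user][item] -> rating.
# Missing entries of the |U| x |I| matrix are simply absent.
R = {
    "Alice": {"Titanic": 4, "The Godfather": 2, "Once": 5},
    "Bob":   {"Matrix": 4, "The Godfather": 3},
    "John":  {"Matrix": 4, "Once": 3},
}

def rating(R, u, i):
    """Return the observed rating r_ui, or None if (u, i) is not in D."""
    return R.get(u, {}).get(i)

print(rating(R, "Alice", "Once"))   # 5
print(rating(R, "Bob", "Once"))     # None (missing entry)
```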
Example
          Titanic (t)  Matrix (m)  The Godfather (g)  Once (o)
Alice (a)      4                        2                5
Bob (b)                    4            3
John (j)                   4                             3

- Users U := {Alice, Bob, John}
- Items I := {Titanic, Matrix, The Godfather, Once}
- Ratings data D := {(Alice, Titanic, 4), (Bob, Matrix, 4), . . .}
User Based Recommender - Prediction Function
r(u, i) := r̄_u + ( Σ_{v ∈ N_u} sim(u, v) · (r_vi − r̄_v) ) / ( Σ_{v ∈ N_u} |sim(u, v)| )
where:
- r̄_u is the average rating of user u
- sim is a similarity function used to compute the neighborhood N_u
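The prediction function above can be sketched in Python. This is a minimal illustration, assuming the neighborhood N_u and the similarity function are given from outside; neighbors that have not rated item i are skipped, since their deviation term is undefined, and the demo uses a trivially constant similarity (an assumption, not a choice made in the slides):

```python
def mean_rating(R, u):
    """Average rating r̄_u of user u over the items u has rated."""
    return sum(R[u].values()) / len(R[u])

def predict_user_based(R, u, i, neighbors, sim):
    """r(u, i) := r̄_u + Σ_{v∈N_u} sim(u,v)·(r_vi − r̄_v) / Σ_{v∈N_u} |sim(u,v)|"""
    num = den = 0.0
    for v in neighbors:
        if i in R[v]:                      # only neighbors who rated i contribute
            num += sim(u, v) * (R[v][i] - mean_rating(R, v))
            den += abs(sim(u, v))
    base = mean_rating(R, u)
    return base if den == 0 else base + num / den

# Demo on the slides' example with a hypothetical constant similarity sim ≡ 1:
R = {"Alice": {"t": 4, "g": 2, "o": 5}, "Bob": {"m": 4, "g": 3}, "John": {"m": 4, "o": 3}}
pred = predict_user_based(R, "Alice", "m", ["Bob", "John"], lambda u, v: 1.0)
```

With sim ≡ 1 this reduces to Alice's mean rating plus the average deviation of her neighbors on Matrix.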
Item Based Recommender - Prediction Function
r(u, i) := r̄_i + ( Σ_{j ∈ N_i} sim(i, j) · (r_uj − r̄_j) ) / ( Σ_{j ∈ N_i} |sim(i, j)| )
where:
- r̄_i is the average rating of item i
- sim is a similarity function used to compute the neighborhood N_i
Factorization models

- Each item i ∈ I is associated with a latent feature vector q_i ∈ R^k
- Each user u ∈ U is associated with a latent feature vector p_u ∈ R^k
- Each entry in the original matrix can be estimated by

r(u, i) = p_u^T q_i = Σ_{f=1}^{k} p_{u,f} q_{i,f}
Example

          Titanic (t)  Matrix (m)  The Godfather (g)  Once (o)
Alice (a)      4                        2                5
Bob (b)                    4            3
John (j)                   4                             3

[Figure: the sparse rating matrix R (rows Alice, Bob, John; columns T, M, G, O) is approximated as the product of a user factor matrix P and the transpose of an item factor matrix Q: R ≈ P Q^T.]
Big Data Analytics 2. More on factorization models 2.1 Adding bias terms
Biased Matrix Factorization
- Specific users tend to have specific rating behavior
  - Some users may tend to give higher (or lower) ratings
- The same can be said about items
- This can be easily modeled through bias terms for users b_u and for items b_i in the prediction function:

r(u, i) = b_u + b_i + p_u^T q_i

- Additionally, a global bias g can be added:

r(u, i) = g + b_u + b_i + p_u^T q_i
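The biased prediction function is a one-liner in Python. This sketch uses invented toy numbers purely for illustration:

```python
def predict_biased_mf(g, b_u, b_i, p_u, q_i):
    """r(u, i) = g + b_u + b_i + p_u^T q_i: global, user, and item bias plus
    the dot product of the latent feature vectors."""
    return g + b_u + b_i + sum(pf * qf for pf, qf in zip(p_u, q_i))

# Toy values with k = 2; all numbers here are made up for illustration.
score = predict_biased_mf(g=3.5, b_u=0.2, b_i=-0.1, p_u=[0.8, 0.3], q_i=[1.0, -0.5])
```

Here the biases contribute 3.5 + 0.2 − 0.1 = 3.6 and the dot product 0.8 − 0.15 = 0.65, so the predicted rating is 4.25.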
Effect of the Biases
Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. IEEE Computer, 42(8):30–37, 2009.
Big Data Analytics 3. SVD++
Integrating Implicit feedback
- In many situations we have information about items that the user has consumed but did not evaluate:
  - Videos watched
  - Products bought
  - Webpages visited
  - ...
- The set of items N(u) consumed by a user u (rated or not) provides useful information about the tastes of the user
SVD++

SVD++ (Koren 2008) incorporates information about implicit feedback into the user factors. User factors are represented as:

p_u + (1 / √|N(u)|) · Σ_{j ∈ N(u)} v_j

The prediction function is then written as:

r̂_ui := b_u + b_i + q_i^T ( p_u + (1 / √|N(u)|) · Σ_{j ∈ N(u)} v_j )

where:
- v_j ∈ R^k are item latent vectors used to construct the user profile
- N(u) is the set of items consumed by the user u
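The SVD++ prediction can be sketched as follows; the factor values in the demo are invented toy numbers, not learned parameters:

```python
from math import sqrt

def predict_svdpp(b_u, b_i, q_i, p_u, v, N_u):
    """r̂_ui := b_u + b_i + q_i^T ( p_u + |N(u)|^{-1/2} · Σ_{j∈N(u)} v_j )."""
    k = len(p_u)
    norm = 1.0 / sqrt(len(N_u))
    # implicit-feedback profile: p_u plus the normalized sum of v_j over consumed items
    profile = [p_u[f] + norm * sum(v[j][f] for j in N_u) for f in range(k)]
    return b_u + b_i + sum(q_i[f] * profile[f] for f in range(k))

# Toy numbers (k = 2, two consumed items); all values invented for illustration:
v = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
score = predict_svdpp(b_u=0.0, b_i=0.0, q_i=[1.0, 1.0],
                      p_u=[0.0, 0.0], v=v, N_u=["a", "b"])
```

With p_u = 0 the profile is purely the normalized implicit-feedback sum, so the score is 2/√2 = √2.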
SVD++ Performance
Dataset: Netflix
Measure: RMSE

Model    50 factors   100 factors   200 factors
MF         0.9046       0.9025        0.9009
SVD++      0.8952       0.8924        0.8911

Source: Yehuda Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. KDD 2008.
Big Data Analytics 4. Item Prediction
Item Prediction

Which will be the next items to be consumed by a user?
Formalization

- U: set of users
- I: set of items
- Positive implicit feedback data D ⊆ U × I × {1}

We have available only N(u): the set of items each user u has interacted with.
Considerations

- We do not know whether a user has liked an item or not (how he rated it)
- The only information we have is which items the user has bought, watched, clicked, ...
- The task is to predict which items the user will interact with next
- We can assume that items already evaluated (i ∈ N(u)) are preferred over the not evaluated ones (i ∉ N(u))
Item Prediction Task

Assuming that items already evaluated are preferred over the not evaluated ones:

i >_u j iff i ∈ N(u) and j ∉ N(u)

we are given a dataset D_S ⊆ U × I × I:

D_S := {(u, i, j) | i ∈ N(u) ∧ j ∉ N(u)}

For each user, find a total order >_u over the items j ∉ N(u) that reflects the user's preferences.
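The construction of D_S can be sketched directly from its set-builder definition. The feedback sets below follow the running example (a 1 wherever a user rated an item); the names are illustrative:

```python
def bpr_triples(N, items):
    """D_S := {(u, i, j) | i ∈ N(u) ∧ j ∉ N(u)}: one triple per positive item i
    and non-consumed item j, for every user u."""
    return [(u, i, j) for u in N for i in N[u] for j in items if j not in N[u]]

# Positive implicit feedback per user, from the running example:
N = {"Alice": {"t", "g", "o"}, "Bob": {"m", "g"}, "John": {"m", "o"}}
items = {"t", "m", "g", "o"}
triples = bpr_triples(N, items)
```

Alice contributes 3 × 1 triples, Bob and John 2 × 2 each, giving 11 triples in total.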
Item Prediction Approach

- Learn a model r : U × I → R
- Sort the items according to the scores predicted by the model, such that:

i >_u j iff r(u, i) > r(u, j)

In a probabilistic setting, let Θ be the model parameters; then

p(i >_u j | Θ) := σ(y_uij)

where:
- σ(x) := 1 / (1 + e^{−x})
- y_uij := r(u, i) − r(u, j)
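The pairwise preference probability is just the logistic function applied to the score difference, as a short sketch shows:

```python
from math import exp

def sigma(x):
    """σ(x) := 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + exp(-x))

def p_prefers(r_ui, r_uj):
    """p(i >_u j | Θ) := σ(y_uij) with y_uij := r(u, i) − r(u, j)."""
    return sigma(r_ui - r_uj)
```

Equal scores give probability 0.5, and a large positive score gap pushes the probability toward 1; by construction p(i >_u j) + p(j >_u i) = 1.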
Bayesian Personalized Ranking (BPR)
The maximum a posteriori estimator:

arg max_Θ p(Θ | >_u) ∝ p(>_u | Θ) p(Θ)

Prior:

p(Θ) := N(0, Σ_Θ)
The Bayesian Personalized Ranking Optimization Criterion (BPR-Opt)

BPR-Opt := ln Π_{u ∈ U} p(Θ | >_u)
         = ln Π_{u ∈ U} p(>_u | Θ) p(Θ)
         = ln Π_{(u,i,j) ∈ D_S} σ(y_uij) p(Θ)
         = Σ_{(u,i,j) ∈ D_S} ln σ(y_uij) + ln p(Θ)
         = Σ_{(u,i,j) ∈ D_S} ln σ(y_uij) − λ ||Θ||²
Optimizing a factorization model for BPR

Model:

r(u, i) = p_u^T q_i = Σ_{f=1}^{k} p_{u,f} q_{i,f}

Loss function:

L := Σ_{(u,i,j) ∈ D_S} ln σ(y_uij) − λ ||Θ||²

Gradients (note that d ln σ(y)/dy = e^{−y} / (1 + e^{−y})):

∂BPR-Opt/∂θ = ( e^{−y_uij} / (1 + e^{−y_uij}) ) · ∂y_uij/∂θ − 2λθ

∂y_uij/∂θ = (q_if − q_jf)   if θ = p_uf
            p_uf            if θ = q_if
            −p_uf           if θ = q_jf
Stochastic Gradient Descent Algorithm

 1: procedure LearnBPR(D_S^train, λ, α, Σ)
 2:   (p_u)_{u ∈ U} ∼ N(0, Σ)
 3:   (q_i)_{i ∈ I} ∼ N(0, Σ)
 4:   repeat
 5:     for (u, i, j) ∈ D_S^train do            ▷ in a random order
 6:       for f ∈ 1, . . . , k do
 7:         p_uf ← p_uf + α ( (e^{−y_uij} / (1 + e^{−y_uij})) · (q_if − q_jf) − 2λ p_uf )
 8:         q_if ← q_if + α ( (e^{−y_uij} / (1 + e^{−y_uij})) · p_uf − 2λ q_if )
 9:         q_jf ← q_jf + α ( (e^{−y_uij} / (1 + e^{−y_uij})) · (−p_uf) − 2λ q_jf )
10:       end for
11:     end for
12:   until convergence
13:   return P, Q
14: end procedure
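A runnable sketch of LearnBPR in pure Python follows; the dataset, hyperparameter defaults, and the score helper are illustrative choices, not values from the slides. It performs gradient ascent on ln σ(y_uij), whose derivative coefficient e^{−y}/(1+e^{−y}) is computed as 1/(1+e^{y}):

```python
import math
import random

def learn_bpr(D_S, users, items, k=8, lam=0.01, alpha=0.05, epochs=200, seed=0):
    """SGD for BPR on a factorization model r(u, i) = p_u · q_i.
    Hyperparameter defaults are arbitrary choices for this sketch."""
    rng = random.Random(seed)
    P = {u: [rng.gauss(0.0, 0.1) for _ in range(k)] for u in users}
    Q = {i: [rng.gauss(0.0, 0.1) for _ in range(k)] for i in items}
    D_S = list(D_S)
    for _ in range(epochs):
        rng.shuffle(D_S)                     # visit the triples in a random order
        for u, i, j in D_S:
            y = sum(pf * (qif - qjf) for pf, qif, qjf in zip(P[u], Q[i], Q[j]))
            c = 1.0 / (1.0 + math.exp(y))    # = e^{-y} / (1 + e^{-y})
            for f in range(k):
                puf, qif, qjf = P[u][f], Q[i][f], Q[j][f]
                P[u][f] += alpha * (c * (qif - qjf) - 2 * lam * puf)
                Q[i][f] += alpha * (c * puf - 2 * lam * qif)
                Q[j][f] += alpha * (-c * puf - 2 * lam * qjf)
    return P, Q

def score(P, Q, u, i):
    """r(u, i) = p_u^T q_i."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

# Toy training triples (u, i, j): each user prefers one consumed item i over j.
D = [("u1", "a", "b"), ("u1", "a", "c"), ("u2", "b", "a"), ("u2", "b", "c")]
P, Q = learn_bpr(D, ["u1", "u2"], ["a", "b", "c"])
```

After training, the consumed items should outrank the non-consumed ones per user, even though u1 and u2 disagree about items a and b: the personalization lives in p_u.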
Big Data Analytics 5. From Recommender Systems to Graphs
Link Prediction

[Figure: a graph on vertices 1–4 with observed, scored edges y(1, 2), y(4, 2), y(3, 2); the scores y(2, 4) = ? and y(3, 1) = ? of the missing edges are to be predicted.]
Link Prediction - Formalization

Given a graph G := (V, E) where

- V is a set of vertices
- E ⊆ V × V is a set of edges

predict the most likely new edges E* ⊆ (V × V) \ E
Link Prediction - Examples
There are a lot of applications for link prediction models:

- Finding friends in social networks
- Recommender systems
- Predicting protein interactions
- Predicting links between web pages
- ...
Big Data Analytics 5. From Recommender Systems to Graphs 5.1 Recommender Systems as a link prediction problem
Recommender System Graph
          Titanic (t)  Matrix (m)  The Godfather (g)  Once (o)
Alice (a)      4                        2                5
Bob (b)                    4            3
John (j)                   4                             3

[Figure: the same data as a bipartite graph with user vertices a, b, j and item vertices t, g, o, m; each rating becomes a labeled edge: r_at = 4, r_ag = 2, r_ao = 5, r_bm = 4, r_bg = 3, r_jm = 4, r_jo = 3.]
Recommender Systems - Rating Prediction

[Figure: the same bipartite rating graph; rating prediction corresponds to predicting the label r_bo = ? of the missing edge between Bob (b) and Once (o).]
Recommender Systems - Item Prediction

          Titanic (t)  Matrix (m)  The Godfather (g)  Once (o)
Alice (a)      1            ?            1                1
Bob (b)        ?            1            1                ?
John (j)       ?            1            ?                1

[Figure: the bipartite graph with unlabeled edges; item prediction corresponds to predicting which of the missing edges (marked ?) exist.]
Big Data Analytics 5. From Recommender Systems to Graphs 5.2 Link Prediction Approaches
Link Prediction Approaches

Given a graph G := (V, E):

- Determine a scoring function s : V × V → R
- The scores should reflect the likelihood that there is a link between the two vertices
- Rank possible pairs of vertices according to their scores

Two basic streams of approaches:

- Compute the scores from graph statistics
- Learn a scoring function from the data
Link Prediction Approaches
Let k_v be the degree of node v and N(v) the set of neighbors of a node:

N(v) := {u | (u, v) ∈ E ∨ (v, u) ∈ E}

The different approaches based on graph statistics:

- Common Neighbors: s_CN(v, u) := |N(v) ∩ N(u)|
- Salton Index: s_Salton(v, u) := |N(v) ∩ N(u)| / √(k_v · k_u)
- Jaccard Index: s_Jaccard(v, u) := |N(v) ∩ N(u)| / |N(v) ∪ N(u)|
- Adamic-Adar Index: s_AA(v, u) := Σ_{z ∈ N(v) ∩ N(u)} 1 / log k_z
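The four statistics can be sketched directly in Python. The edge set in the demo is an assumption: it is read off from the degrees and neighborhoods used in the worked example on the next slide; log is the natural logarithm, matching the numbers there:

```python
from math import log, sqrt

def neighbors(E, v):
    """N(v) := {u | (u, v) ∈ E ∨ (v, u) ∈ E} for an undirected edge set E."""
    return {a if b == v else b for (a, b) in E if v in (a, b)}

def s_cn(E, v, u):                # Common Neighbors
    return len(neighbors(E, v) & neighbors(E, u))

def s_salton(E, v, u):            # Salton Index: |N(v) ∩ N(u)| / sqrt(k_v * k_u)
    return s_cn(E, v, u) / sqrt(len(neighbors(E, v)) * len(neighbors(E, u)))

def s_jaccard(E, v, u):           # Jaccard Index: intersection over union
    return s_cn(E, v, u) / len(neighbors(E, v) | neighbors(E, u))

def s_aa(E, v, u):                # Adamic-Adar: sum of 1/log(degree) over common neighbors
    return sum(1.0 / log(len(neighbors(E, z)))
               for z in neighbors(E, v) & neighbors(E, u))

# Edge set reconstructed from the worked example (an assumption of this sketch):
E = {("v1", "v2"), ("v1", "v3"), ("v1", "v5"),
     ("v2", "v5"), ("v3", "v4"), ("v4", "v5")}
```

On this graph, s_cn(E, "v1", "v4") gives 2 and s_salton(E, "v1", "v4") gives 2/√6 ≈ 0.8165, matching the example values.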
Link Prediction Approaches - Examples

[Figure: a graph on v1, . . . , v5 with edges v1–v2, v1–v3, v1–v5, v2–v5, v3–v4, v4–v5, so that N(v1) = {v2, v3, v5}, N(v2) = {v1, v5}, N(v4) = {v3, v5}, k_v3 = 2, and k_v5 = 3.]

Finding possible links for v4:

Common Neighbors: s_CN(v, u) := |N(v) ∩ N(u)|
  s_CN(v1, v4) = |{v2, v3, v5} ∩ {v3, v5}| = 2
  s_CN(v2, v4) = |{v1, v5} ∩ {v3, v5}| = 1

Salton Index: s_Salton(v, u) := |N(v) ∩ N(u)| / √(k_v · k_u)
  s_Salton(v1, v4) = 2 / √(3 · 2) = 0.8165
  s_Salton(v2, v4) = 1 / √(2 · 2) = 0.5

Jaccard Index: s_Jaccard(v, u) := |N(v) ∩ N(u)| / |N(v) ∪ N(u)|
  s_Jaccard(v1, v4) = 2 / 3 = 0.6667
  s_Jaccard(v2, v4) = 1 / 3 = 0.3333

Adamic-Adar: s_AA(v, u) := Σ_{z ∈ N(v) ∩ N(u)} 1 / log k_z
  s_AA(v1, v4) = 1/log 2 + 1/log 3 = 2.3529
  s_AA(v2, v4) = 1/log 3 = 0.9102
Link Prediction Approaches - Learning a Scoring Function

- Any item recommendation approach could be used here
- Factorization models: factorize the adjacency matrix of the graph
- Associate each vertex v with latent factors ϕ(v) ∈ R^k
- Scoring function:

s(u, v) = ϕ(u)^T ϕ(v)
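Given learned vertex factors, the scoring function is again just a dot product; the factor values below are invented purely for illustration:

```python
def s(phi, u, v):
    """s(u, v) = ϕ(u)^T ϕ(v): dot product of the vertices' latent factors."""
    return sum(a * b for a, b in zip(phi[u], phi[v]))

# Hypothetical learned factors with k = 2 (values invented for illustration):
phi = {"v1": [0.9, 0.1], "v2": [0.8, 0.2], "v3": [-0.1, 1.0]}
```

Here s(phi, "v1", "v2") = 0.74 exceeds s(phi, "v1", "v3") = 0.01, so a link v1–v2 would be ranked above v1–v3.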