Big Data Analytics
Lucas Rego Drumond
Information Systems and Machine Learning Lab (ISMLL)
Institute of Computer Science
University of Hildesheim, Germany
Recommender Systems Part II
Lucas Rego Drumond, Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Germany
Recommender Systems Part II 1 / 31
Outline
1. Review
2. More on factorization models
   2.1 Adding bias terms
3. SVD++
4. Item Prediction
5. From Recommender Systems to Graphs
   5.1 Recommender Systems as a link prediction problem
   5.2 Link Prediction Approaches
Big Data Analytics 1. Review
Recommender Systems
Formalization
- U: set of users
- I: set of items
- Ratings data D ⊆ U × I × R

Ratings data D are typically represented as a sparse matrix R ∈ R^{|U|×|I|}, with users as rows and items as columns.
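One straightforward way to hold such a sparse matrix in code is a dict of dicts, storing only the observed entries. This is a minimal sketch (not from the slides); the ratings used are those of the example that follows:

```python
# Ratings data D stored sparsely: R[user][item] -> rating.
# Missing entries of the |U| x |I| matrix are simply absent.
R = {
    "Alice": {"Titanic": 4, "The Godfather": 2, "Once": 5},
    "Bob":   {"Matrix": 4, "The Godfather": 3},
    "John":  {"Matrix": 4, "Once": 3},
}

def rating(R, u, i):
    """Return the observed rating r_ui, or None if (u, i) is not in D."""
    return R.get(u, {}).get(i)

print(rating(R, "Alice", "Once"))   # 5
print(rating(R, "Bob", "Once"))     # None (missing entry)
```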
Example
          Titanic (t)  Matrix (m)  The Godfather (g)  Once (o)
Alice (a)      4                        2                5
Bob (b)                    4            3
John (j)                   4                             3

- Users U := {Alice, Bob, John}
- Items I := {Titanic, Matrix, The Godfather, Once}
- Ratings data D := {(Alice, Titanic, 4), (Bob, Matrix, 4), . . .}
User Based Recommender - Prediction Function
r(u, i) := r̄_u + ( Σ_{v ∈ N_u} sim(u, v) · (r_vi − r̄_v) ) / ( Σ_{v ∈ N_u} |sim(u, v)| )
where:
- r̄_u is the average rating of user u
- sim is a similarity function used to compute the neighborhood N_u
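The prediction function above can be sketched in Python. This is a minimal illustration, assuming the neighborhood N_u and the similarity function are given from outside; neighbors that have not rated item i are skipped, since their deviation term is undefined, and the demo uses a trivially constant similarity (an assumption, not a choice made in the slides):

```python
def mean_rating(R, u):
    """Average rating r̄_u of user u over the items u has rated."""
    return sum(R[u].values()) / len(R[u])

def predict_user_based(R, u, i, neighbors, sim):
    """r(u, i) := r̄_u + Σ_{v∈N_u} sim(u,v)·(r_vi − r̄_v) / Σ_{v∈N_u} |sim(u,v)|"""
    num = den = 0.0
    for v in neighbors:
        if i in R[v]:                      # only neighbors who rated i contribute
            num += sim(u, v) * (R[v][i] - mean_rating(R, v))
            den += abs(sim(u, v))
    base = mean_rating(R, u)
    return base if den == 0 else base + num / den

# Demo on the slides' example with a hypothetical constant similarity sim ≡ 1:
R = {"Alice": {"t": 4, "g": 2, "o": 5}, "Bob": {"m": 4, "g": 3}, "John": {"m": 4, "o": 3}}
pred = predict_user_based(R, "Alice", "m", ["Bob", "John"], lambda u, v: 1.0)
```

With sim ≡ 1 this reduces to Alice's mean rating plus the average deviation of her neighbors on Matrix.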
Item Based Recommender - Prediction Function
r(u, i) := r̄_i + ( Σ_{j ∈ N_i} sim(i, j) · (r_uj − r̄_j) ) / ( Σ_{j ∈ N_i} |sim(i, j)| )
where:
- r̄_i is the average rating of item i
- sim is a similarity function used to compute the neighborhood N_i
Factorization models

- Each item i ∈ I is associated with a latent feature vector q_i ∈ R^k
- Each user u ∈ U is associated with a latent feature vector p_u ∈ R^k
- Each entry in the original matrix can be estimated by

r(u, i) = p_u^T q_i = Σ_{f=1}^{k} p_{u,f} q_{i,f}
Example

          Titanic (t)  Matrix (m)  The Godfather (g)  Once (o)
Alice (a)      4                        2                5
Bob (b)                    4            3
John (j)                   4                             3

[Figure: the sparse rating matrix R (rows Alice, Bob, John; columns T, M, G, O) is approximated as the product of a user factor matrix P and the transpose of an item factor matrix Q: R ≈ P Q^T.]
Big Data Analytics 2. More on factorization models 2.1 Adding bias terms
Biased Matrix Factorization
- Specific users tend to have specific rating behavior
  - Some users may tend to give higher (or lower) ratings
- The same can be said about items
- This can be easily modeled through bias terms for users b_u and for items b_i in the prediction function:

r(u, i) = b_u + b_i + p_u^T q_i

- Additionally, a global bias g can be added:

r(u, i) = g + b_u + b_i + p_u^T q_i
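The biased prediction function is a one-liner in Python. This sketch uses invented toy numbers purely for illustration:

```python
def predict_biased_mf(g, b_u, b_i, p_u, q_i):
    """r(u, i) = g + b_u + b_i + p_u^T q_i: global, user, and item bias plus
    the dot product of the latent feature vectors."""
    return g + b_u + b_i + sum(pf * qf for pf, qf in zip(p_u, q_i))

# Toy values with k = 2; all numbers here are made up for illustration.
score = predict_biased_mf(g=3.5, b_u=0.2, b_i=-0.1, p_u=[0.8, 0.3], q_i=[1.0, -0.5])
```

Here the biases contribute 3.5 + 0.2 − 0.1 = 3.6 and the dot product 0.8 − 0.15 = 0.65, so the predicted rating is 4.25.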
Effect of the Biases
Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. IEEE Computer, 42(8):30–37, 2009.
Big Data Analytics 3. SVD++
Integrating Implicit feedback
- In many situations we have information about items that the user has consumed but did not evaluate:
  - Videos watched
  - Products bought
  - Webpages visited
  - ...
- The set of items N(u) consumed by a user u (rated or not) provides useful information about the tastes of the user
SVD++

SVD++ (Koren 2008) incorporates information about implicit feedback into the user factors. User factors are represented as:

p_u + (1 / √|N(u)|) · Σ_{j ∈ N(u)} v_j

The prediction function is then written as:

r̂_ui := b_u + b_i + q_i^T ( p_u + (1 / √|N(u)|) · Σ_{j ∈ N(u)} v_j )

where:
- v_j ∈ R^k are item latent vectors used to construct the user profile
- N(u) is the set of items consumed by the user u
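The SVD++ prediction can be sketched as follows; the factor values in the demo are invented toy numbers, not learned parameters:

```python
from math import sqrt

def predict_svdpp(b_u, b_i, q_i, p_u, v, N_u):
    """r̂_ui := b_u + b_i + q_i^T ( p_u + |N(u)|^{-1/2} · Σ_{j∈N(u)} v_j )."""
    k = len(p_u)
    norm = 1.0 / sqrt(len(N_u))
    # implicit-feedback profile: p_u plus the normalized sum of v_j over consumed items
    profile = [p_u[f] + norm * sum(v[j][f] for j in N_u) for f in range(k)]
    return b_u + b_i + sum(q_i[f] * profile[f] for f in range(k))

# Toy numbers (k = 2, two consumed items); all values invented for illustration:
v = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
score = predict_svdpp(b_u=0.0, b_i=0.0, q_i=[1.0, 1.0],
                      p_u=[0.0, 0.0], v=v, N_u=["a", "b"])
```

With p_u = 0 the profile is purely the normalized implicit-feedback sum, so the score is 2/√2 = √2.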
SVD++ Performance
Dataset: Netflix
Measure: RMSE

Model    50 factors   100 factors   200 factors
MF         0.9046       0.9025        0.9009
SVD++      0.8952       0.8924        0.8911

Source: Yehuda Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. KDD 2008.
Big Data Analytics 4. Item Prediction
Item Prediction

Which will be the next items to be consumed by a user?
Formalization

- U: set of users
- I: set of items
- Positive implicit feedback data D ⊆ U × I × {1}

We have available only N(u): the set of items each user u has interacted with.
Considerations

- We do not know whether a user has liked an item or not (how he rated it)
- The only information we have is which items the user has bought, watched, clicked, ...
- The task is to predict which items the user will interact with next
- We can assume that items already evaluated (i ∈ N(u)) are preferred over the not evaluated ones (i ∉ N(u))
Item Prediction Task

Assuming that items already evaluated are preferred over the not evaluated ones:

i >_u j iff i ∈ N(u) and j ∉ N(u)

we are given a dataset D_S ⊆ U × I × I:

D_S := {(u, i, j) | i ∈ N(u) ∧ j ∉ N(u)}

For each user, find a total order >_u over the items j ∉ N(u) that reflects the user's preferences.
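The construction of D_S can be sketched directly from its set-builder definition. The feedback sets below follow the running example (a 1 wherever a user rated an item); the names are illustrative:

```python
def bpr_triples(N, items):
    """D_S := {(u, i, j) | i ∈ N(u) ∧ j ∉ N(u)}: one triple per positive item i
    and non-consumed item j, for every user u."""
    return [(u, i, j) for u in N for i in N[u] for j in items if j not in N[u]]

# Positive implicit feedback per user, from the running example:
N = {"Alice": {"t", "g", "o"}, "Bob": {"m", "g"}, "John": {"m", "o"}}
items = {"t", "m", "g", "o"}
triples = bpr_triples(N, items)
```

Alice contributes 3 × 1 triples, Bob and John 2 × 2 each, giving 11 triples in total.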
Item Prediction Approach

- Learn a model r : U × I → R
- Sort the items according to the scores predicted by the model, such that:

i >_u j iff r(u, i) > r(u, j)

In a probabilistic setting, let Θ be the model parameters; then

p(i >_u j | Θ) := σ(y_uij)

where:
- σ(x) := 1 / (1 + e^{−x})
- y_uij := r(u, i) − r(u, j)
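The pairwise preference probability is just the logistic function applied to the score difference, as a short sketch shows:

```python
from math import exp

def sigma(x):
    """σ(x) := 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + exp(-x))

def p_prefers(r_ui, r_uj):
    """p(i >_u j | Θ) := σ(y_uij) with y_uij := r(u, i) − r(u, j)."""
    return sigma(r_ui - r_uj)
```

Equal scores give probability 0.5, and a large positive score gap pushes the probability toward 1; by construction p(i >_u j) + p(j >_u i) = 1.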
Bayesian Personalized Ranking (BPR)
The maximum a posteriori estimator:

arg max_Θ p(Θ | >_u) ∝ p(>_u | Θ) p(Θ)

Prior:

p(Θ) := N(0, Σ_Θ)
The Bayesian Personalized Ranking Optimization Criterion (BPR-Opt)

BPR-Opt := ln Π_{u ∈ U} p(Θ | >_u)
         = ln Π_{u ∈ U} p(>_u | Θ) p(Θ)
         = ln Π_{(u,i,j) ∈ D_S} σ(y_uij) p(Θ)
         = Σ_{(u,i,j) ∈ D_S} ln σ(y_uij) + ln p(Θ)
         = Σ_{(u,i,j) ∈ D_S} ln σ(y_uij) − λ ||Θ||²
Optimizing a factorization model for BPR

Model:

r(u, i) = p_u^T q_i = Σ_{f=1}^{k} p_{u,f} q_{i,f}

Loss function:

L := Σ_{(u,i,j) ∈ D_S} ln σ(y_uij) − λ ||Θ||²

Gradients (note that d ln σ(y)/dy = e^{−y} / (1 + e^{−y})):

∂BPR-Opt/∂θ = ( e^{−y_uij} / (1 + e^{−y_uij}) ) · ∂y_uij/∂θ − 2λθ

∂y_uij/∂θ = (q_if − q_jf)   if θ = p_uf
            p_uf            if θ = q_if
            −p_uf           if θ = q_jf
Stochastic Gradient Descent Algorithm

 1: procedure LearnBPR(D_S^train, λ, α, Σ)
 2:   (p_u)_{u ∈ U} ∼ N(0, Σ)
 3:   (q_i)_{i ∈ I} ∼ N(0, Σ)
 4:   repeat
 5:     for (u, i, j) ∈ D_S^train do            ▷ in a random order
 6:       for f ∈ 1, . . . , k do
 7:         p_uf ← p_uf + α ( (e^{−y_uij} / (1 + e^{−y_uij})) · (q_if − q_jf) − 2λ p_uf )
 8:         q_if ← q_if + α ( (e^{−y_uij} / (1 + e^{−y_uij})) · p_uf − 2λ q_if )
 9:         q_jf ← q_jf + α ( (e^{−y_uij} / (1 + e^{−y_uij})) · (−p_uf) − 2λ q_jf )
10:       end for
11:     end for
12:   until convergence
13:   return P, Q
14: end procedure
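A runnable sketch of LearnBPR in pure Python follows; the dataset, hyperparameter defaults, and the score helper are illustrative choices, not values from the slides. It performs gradient ascent on ln σ(y_uij), whose derivative coefficient e^{−y}/(1+e^{−y}) is computed as 1/(1+e^{y}):

```python
import math
import random

def learn_bpr(D_S, users, items, k=8, lam=0.01, alpha=0.05, epochs=200, seed=0):
    """SGD for BPR on a factorization model r(u, i) = p_u · q_i.
    Hyperparameter defaults are arbitrary choices for this sketch."""
    rng = random.Random(seed)
    P = {u: [rng.gauss(0.0, 0.1) for _ in range(k)] for u in users}
    Q = {i: [rng.gauss(0.0, 0.1) for _ in range(k)] for i in items}
    D_S = list(D_S)
    for _ in range(epochs):
        rng.shuffle(D_S)                     # visit the triples in a random order
        for u, i, j in D_S:
            y = sum(pf * (qif - qjf) for pf, qif, qjf in zip(P[u], Q[i], Q[j]))
            c = 1.0 / (1.0 + math.exp(y))    # = e^{-y} / (1 + e^{-y})
            for f in range(k):
                puf, qif, qjf = P[u][f], Q[i][f], Q[j][f]
                P[u][f] += alpha * (c * (qif - qjf) - 2 * lam * puf)
                Q[i][f] += alpha * (c * puf - 2 * lam * qif)
                Q[j][f] += alpha * (-c * puf - 2 * lam * qjf)
    return P, Q

def score(P, Q, u, i):
    """r(u, i) = p_u^T q_i."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

# Toy training triples (u, i, j): each user prefers one consumed item i over j.
D = [("u1", "a", "b"), ("u1", "a", "c"), ("u2", "b", "a"), ("u2", "b", "c")]
P, Q = learn_bpr(D, ["u1", "u2"], ["a", "b", "c"])
```

After training, the consumed items should outrank the non-consumed ones per user, even though u1 and u2 disagree about items a and b: the personalization lives in p_u.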
Big Data Analytics 5. From Recommender Systems to Graphs
Link Prediction

[Figure: a graph on vertices 1–4 with observed, scored edges y(1, 2), y(4, 2), y(3, 2); the scores y(2, 4) = ? and y(3, 1) = ? of the missing edges are to be predicted.]
Link Prediction - Formalization

Given a graph G := (V, E) where

- V is a set of vertices
- E ⊆ V × V is a set of edges

predict the most likely new edges E* ⊆ (V × V) \ E
Link Prediction - Examples
There are a lot of applications for link prediction models:

- Finding friends in social networks
- Recommender systems
- Predicting protein interactions
- Predicting links between web pages
- ...
Big Data Analytics 5. From Recommender Systems to Graphs 5.1 Recommender Systems as a link prediction problem
Recommender System Graph
          Titanic (t)  Matrix (m)  The Godfather (g)  Once (o)
Alice (a)      4                        2                5
Bob (b)                    4            3
John (j)                   4                             3

[Figure: the same data as a bipartite graph with user vertices a, b, j and item vertices t, g, o, m; each rating becomes a labeled edge: r_at = 4, r_ag = 2, r_ao = 5, r_bm = 4, r_bg = 3, r_jm = 4, r_jo = 3.]
Recommender Systems - Rating Prediction

[Figure: the same bipartite rating graph; rating prediction corresponds to predicting the label r_bo = ? of the missing edge between Bob (b) and Once (o).]
Recommender Systems - Item Prediction

          Titanic (t)  Matrix (m)  The Godfather (g)  Once (o)
Alice (a)      1            ?            1                1
Bob (b)        ?            1            1                ?
John (j)       ?            1            ?                1

[Figure: the bipartite graph with unlabeled edges; item prediction corresponds to predicting which of the missing edges (marked ?) exist.]
Big Data Analytics 5. From Recommender Systems to Graphs 5.2 Link Prediction Approaches
Link Prediction Approaches

Given a graph G := (V, E):

- Determine a scoring function s : V × V → R
- The scores should reflect the likelihood that there is a link between the two vertices
- Rank possible pairs of vertices according to their scores

Two basic streams of approaches:

- Compute the scores from graph statistics
- Learn a scoring function from the data
Link Prediction Approaches
Let k_v be the degree of node v and N(v) the set of neighbors of a node:

N(v) := {u | (u, v) ∈ E ∨ (v, u) ∈ E}

The different approaches based on graph statistics:

- Common Neighbors: s_CN(v, u) := |N(v) ∩ N(u)|
- Salton Index: s_Salton(v, u) := |N(v) ∩ N(u)| / √(k_v · k_u)
- Jaccard Index: s_Jaccard(v, u) := |N(v) ∩ N(u)| / |N(v) ∪ N(u)|
- Adamic-Adar Index: s_AA(v, u) := Σ_{z ∈ N(v) ∩ N(u)} 1 / log k_z
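The four statistics can be sketched directly in Python. The edge set in the demo is an assumption: it is read off from the degrees and neighborhoods used in the worked example on the next slide; log is the natural logarithm, matching the numbers there:

```python
from math import log, sqrt

def neighbors(E, v):
    """N(v) := {u | (u, v) ∈ E ∨ (v, u) ∈ E} for an undirected edge set E."""
    return {a if b == v else b for (a, b) in E if v in (a, b)}

def s_cn(E, v, u):                # Common Neighbors
    return len(neighbors(E, v) & neighbors(E, u))

def s_salton(E, v, u):            # Salton Index: |N(v) ∩ N(u)| / sqrt(k_v * k_u)
    return s_cn(E, v, u) / sqrt(len(neighbors(E, v)) * len(neighbors(E, u)))

def s_jaccard(E, v, u):           # Jaccard Index: intersection over union
    return s_cn(E, v, u) / len(neighbors(E, v) | neighbors(E, u))

def s_aa(E, v, u):                # Adamic-Adar: sum of 1/log(degree) over common neighbors
    return sum(1.0 / log(len(neighbors(E, z)))
               for z in neighbors(E, v) & neighbors(E, u))

# Edge set reconstructed from the worked example (an assumption of this sketch):
E = {("v1", "v2"), ("v1", "v3"), ("v1", "v5"),
     ("v2", "v5"), ("v3", "v4"), ("v4", "v5")}
```

On this graph, s_cn(E, "v1", "v4") gives 2 and s_salton(E, "v1", "v4") gives 2/√6 ≈ 0.8165, matching the example values.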
Link Prediction Approaches - Examples

[Figure: a graph on v1, . . . , v5 with edges v1–v2, v1–v3, v1–v5, v2–v5, v3–v4, v4–v5, so that N(v1) = {v2, v3, v5}, N(v2) = {v1, v5}, N(v4) = {v3, v5}, k_v3 = 2, and k_v5 = 3.]

Finding possible links for v4:

Common Neighbors: s_CN(v, u) := |N(v) ∩ N(u)|
  s_CN(v1, v4) = |{v2, v3, v5} ∩ {v3, v5}| = 2
  s_CN(v2, v4) = |{v1, v5} ∩ {v3, v5}| = 1

Salton Index: s_Salton(v, u) := |N(v) ∩ N(u)| / √(k_v · k_u)
  s_Salton(v1, v4) = 2 / √(3 · 2) = 0.8165
  s_Salton(v2, v4) = 1 / √(2 · 2) = 0.5

Jaccard Index: s_Jaccard(v, u) := |N(v) ∩ N(u)| / |N(v) ∪ N(u)|
  s_Jaccard(v1, v4) = 2 / 3 = 0.6667
  s_Jaccard(v2, v4) = 1 / 3 = 0.3333

Adamic-Adar: s_AA(v, u) := Σ_{z ∈ N(v) ∩ N(u)} 1 / log k_z
  s_AA(v1, v4) = 1/log 2 + 1/log 3 = 2.3529
  s_AA(v2, v4) = 1/log 3 = 0.9102
Link Prediction Approaches - Learning a Scoring Function

- Any item recommendation approach could be used here
- Factorization models: factorize the adjacency matrix of the graph
- Associate each vertex v with latent factors ϕ(v) ∈ R^k
- Scoring function:

s(u, v) = ϕ(u)^T ϕ(v)
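Given learned vertex factors, the scoring function is again just a dot product; the factor values below are invented purely for illustration:

```python
def s(phi, u, v):
    """s(u, v) = ϕ(u)^T ϕ(v): dot product of the vertices' latent factors."""
    return sum(a * b for a, b in zip(phi[u], phi[v]))

# Hypothetical learned factors with k = 2 (values invented for illustration):
phi = {"v1": [0.9, 0.1], "v2": [0.8, 0.2], "v3": [-0.1, 1.0]}
```

Here s(phi, "v1", "v2") = 0.74 exceeds s(phi, "v1", "v3") = 0.01, so a link v1–v2 would be ranked above v1–v3.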