ICS Summer School 2016 Data Science - Week 4 Gabriella
TRANSCRIPT
![Page 1: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/1.jpg)
ICS Summer School 2016 - Data Science - Week 4
Gabriella Contardo
LIP6, University Pierre et Marie Curie, Paris, France
August 10, 2016
![Page 2: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/2.jpg)
Outline of the week
Course 1: Reminders of the learning paradigm, neural networks / multi-layer perceptron.
Course 2: Deep learning: Convolutional Neural Networks.
Course 3: Tips on deep learning - PCA, Matrix Factorization and recommender systems.
Course 4: Unsupervised learning: clustering (K-Means), EM.
Course 5: Unsupervised learning with (deep) neural networks: auto-encoders, RNN. Word embeddings.
![Page 3: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/3.jpg)
References
On-line course material for today. Thanks to:
- Patrick Gallinari - Professor at UPMC - course "Apprentissage Statistique"
- Nicolas Baskiotis - Assistant Professor at UPMC - course "ARF" (Master DAC)
- Fei-Fei Li's course at Stanford, CS231n: Convolutional Neural Networks for Visual Recognition (lectures: introduction to neural nets, backpropagation)
- Course by Y. Bengio (MLSS 2014, slides and video available online)

Also interesting (generally speaking):
- Machine Learning course by Andrew Ng on Coursera
- Lectures by Nando de Freitas (Oxford) - videos available online
![Page 4: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/4.jpg)
Outline of the day
Deep Learning: Tips on deep learning
Unsupervised Learning - Data Compression / Representation:
- PCA
- MF/NMF and recommender systems
![Page 5: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/5.jpg)
GoogLeNet
Erratum: GoogLeNet is Google's paper, not LeCun's (he is at Facebook); the name is a reference/joke to his (old, first) model "LeNet".
"Yellow layers": additional supervised layers added "on the side" of the network to help training. "Later control experiments have shown that the effect of the auxiliary networks is relatively minor (around 0.5%) and that it required only one of them to achieve the same effect." (cf. paper)
Training time: "a rough estimate suggests that the GoogLeNet network could be trained to convergence using few high-end GPUs within a week, the main limitation being the memory usage."
More info in the paper: http://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf
![Page 6: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/6.jpg)
Tips for learning deep (convolutional) networks
Data Augmentation
Modify input (pixels) without changing label
Train on transformed data
(credit Fei-Fei Li's course CS231n, Stanford)
![Page 7: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/7.jpg)
Tips for learning deep (convolutional) networks
Data Augmentation
- Flip
- Random crops/scales (+ sample on crop)
- Color jitter (e.g. contrast)
- Translation, rotation, ...
(credit Fei-Fei Li's course CS231n, Stanford)
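For images stored as numpy arrays, each of these transformations is only a few lines. A minimal sketch (the image shape, crop ratio and jitter range are illustrative assumptions, not values from the course):

```python
import numpy as np

def augment(img, rng):
    """Randomly flip, crop, and color-jitter one image.

    img: float array of shape (H, W, 3), values in [0, 1].
    """
    # Horizontal flip with probability 0.5
    if rng.random() < 0.5:
        img = img[:, ::-1, :]
    # Random crop: sample a slightly smaller window
    # (a real pipeline would resize it back to (H, W))
    h, w, _ = img.shape
    ch, cw = int(0.9 * h), int(0.9 * w)
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    img = img[top:top + ch, left:left + cw, :]
    # Color jitter: random contrast change around the image mean
    alpha = rng.uniform(0.8, 1.2)
    img = np.clip(alpha * (img - img.mean()) + img.mean(), 0.0, 1.0)
    return img

rng = np.random.default_rng(0)
batch = [augment(x, rng) for x in rng.random((8, 32, 32, 3))]
```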
![Page 8: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/8.jpg)
Tips for learning deep (convolutional) networks
Data Augmentation
- Flip
- Random crops/scales (+ sample on crop)
- Color jitter (e.g. contrast)
- Translation, rotation, ...
Straightforward for images, but what about other types of data?
(credit Fei-Fei Li's course CS231n, Stanford)
![Page 9: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/9.jpg)
Tips for learning deep (convolutional) networks
Generally:
- Training: add random noise
- Testing: marginalize over the noise
(credit Fei-Fei Li's course CS231n, Stanford)
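For generic inputs this recipe can be as simple as adding Gaussian noise during training and averaging predictions over several noisy copies at test time. A minimal sketch, assuming some predictor f and a hypothetical noise level sigma:

```python
import numpy as np

sigma = 0.1  # noise level (hypothetical choice)

def noisy(x, rng):
    # Training: perturb the input with random Gaussian noise
    return x + rng.normal(0.0, sigma, size=x.shape)

def predict_marginalized(f, x, rng, n_samples=20):
    # Testing: approximate the expectation over the noise by
    # averaging predictions on several noisy copies of x
    return np.mean([f(noisy(x, rng)) for _ in range(n_samples)], axis=0)
```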
![Page 10: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/10.jpg)
Tips for learning deep (convolutional) networks
"You need a lot of data if you want to train/use a CNN"
→ Nope! (well, not necessarily...) "Transfer" learning
(credit Fei-Fei Li's course CS231n, Stanford)
![Page 11: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/11.jpg)
Tips for learning deep (convolutional) networks
"You need a lot of data if you want to train/use a CNN"
→ Nope! (well, not necessarily...)
(credit Fei-Fei Li's course CS231n, Stanford)
![Page 12: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/12.jpg)
Tips for learning deep (convolutional) networks
Using a pre-trained CNN is the norm.
(credit Fei-Fei Li's course CS231n, Stanford)
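As an illustration of this recipe (not part of the course material), here is a hedged sketch with PyTorch/torchvision: load a model pre-trained on ImageNet, freeze the convolutional layers, and retrain only a new final classifier. The 10-class output is an assumption for the example.

```python
import torch
import torch.nn as nn
import torchvision

# Load a CNN pre-trained on ImageNet
model = torchvision.models.resnet18(pretrained=True)

# Freeze all pre-trained weights: they act as a fixed feature extractor
for param in model.parameters():
    param.requires_grad = False

# Replace the last fully-connected layer for our (small) dataset,
# here assuming 10 target classes
model.fc = nn.Linear(model.fc.in_features, 10)

# Only the new layer's parameters are optimized
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
```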
![Page 13: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/13.jpg)
Dimensionality Reduction
Problem
- More dimensions ⇒ more expressivity.
- Too many dimensions: low variance on some dimensions, noise, costs memory space and time...
- For tasks on images, text and others: too many dimensions → dimensions that are not very informative, highly correlated.
⇒ Lower the dimension, but without losing information.
Goal: find a projection $\Phi : \mathbb{R}^d \to \mathbb{R}^{d'}$ with $d' \ll d$.
Applications:
- Learning
- Visualization
- Noise reduction
(credit N. Baskiotis, UPMC)
![Page 14: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/14.jpg)
Data Compression
(credit Andrew Ng coursera)
Reduce data from 2D to 1D
![Page 15: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/15.jpg)
Data Compression
(credit Andrew Ng coursera)
Reduce data from 2D to 1D
![Page 16: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/16.jpg)
Data Compression
(credit Andrew Ng coursera)
Reduce data from 3D to 2D
[Figure: original dataset, projected dataset, and visualization of the projected dataset in 2D]
$z = [z_1, z_2]$, with $z^i = [z^i_1, z^i_2]$
![Page 17: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/17.jpg)
Data Compression
(credit http://setosa.io/ev/principal-component-analysis/)
Motivation : Visualization
![Page 18: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/18.jpg)
Data Compression
(credit http://setosa.io/ev/principal-component-analysis/)
Motivation : Visualization
17D to 1D →
17D to 2D :
![Page 19: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/19.jpg)
Dimensionality Reduction
Quiz! Suppose we apply dimensionality reduction to a dataset of m examples $\{x^1, \ldots, x^m\}$ where $x^i \in \mathbb{R}^n$. As a result, we will get:
- A lower dimensional dataset $\{z^1, \ldots, z^k\}$ of k examples, where $k \le n$
- A lower dimensional dataset $\{z^1, \ldots, z^k\}$ of k examples, where $k > n$
- A lower dimensional dataset $\{z^1, \ldots, z^m\}$ of m examples, where $z^i \in \mathbb{R}^k$ for some value of k and $k \le n$
- A lower dimensional dataset $\{z^1, \ldots, z^m\}$ of m examples, where $z^i \in \mathbb{R}^k$ for some value of k and $k > n$
credit Andrew Ng - Stanford Univ
![Page 20: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/20.jpg)
Data Compression - PCA
Principal Component Analysis (PCA) : Problem formulation
⇒ Goal: find a line onto which to project the data
(credit Andrew Ng coursera)
![Page 21: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/21.jpg)
Dimensionality Reduction - PCA
Formulation
- Reduce from 2 dimensions to 1 dimension: find a direction (a vector $u^1 \in \mathbb{R}^n$) onto which to project the data so as to minimize the projection error.
- Generic case, reduce from n dimensions to k dimensions: find k vectors $u^1, \ldots, u^k$ (directions) onto which to project the data so as to minimize the projection error.
credit Andrew Ng - Stanford Univ
![Page 22: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/22.jpg)
Data Compression - PCA
PCA is not linear regression
(credit Andrew Ng coursera)
![Page 23: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/23.jpg)
Data Compression - PCA
Quiz: Suppose you run PCA on the following dataset. Which of the following would be a reasonable vector $u^1$ onto which to project the data? (Usually $||u^1|| = 1$.)
- $u^1 = [1, 0]$
- $u^1 = [0, 1]$
- $u^1 = [1/\sqrt{2},\ 1/\sqrt{2}]$
- $u^1 = [-1/\sqrt{2},\ 1/\sqrt{2}]$
(credit Andrew Ng coursera)
![Page 24: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/24.jpg)
Dimensionality Reduction - PCA algorithm
Data preprocessing
Training set: $x^1, \ldots, x^m$.
Preprocessing: feature scaling / mean normalization.
- Compute $\mu_j = \frac{1}{m}\sum_{i=1}^m x^i_j$.
- Replace each $x^i_j$ with $x^i_j - \mu_j$: each feature now has zero mean.
- If different features are on different scales, rescale them to comparable ranges of values, e.g. $x^i_j = \frac{x^i_j - \min(x_j)}{\max(x_j) - \min(x_j)}$.
This should also be done in supervised learning! credit Andrew Ng - Stanford Univ
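In numpy this preprocessing is one line per step. A minimal sketch, assuming examples are stored as the rows of X (the slide stores them in columns):

```python
import numpy as np

X = np.random.rand(100, 5)   # toy data: m=100 examples, n=5 features

# Mean normalization: each feature gets zero mean
mu = X.mean(axis=0)
X_centered = X - mu

# Min-max rescaling so all features have comparable ranges
X_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
```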
![Page 25: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/25.jpg)
Dimensionality Reduction - PCA algorithm
Reduce data from n dimensions to k dimensions.
- Compute the covariance matrix:
$$\Sigma = \frac{1}{m}\sum_{i=1}^m (x^i)(x^i)^T$$
(Python: numpy.cov(x) if x is your matrix of examples, with examples in columns; $\Sigma \in \mathbb{R}^{n \times n}$.)
- Compute the eigenvectors of the matrix $\Sigma$:
U, S, V = numpy.linalg.svd(Sigma) or W, V = numpy.linalg.eig(Sigma)
The matrix U (or V if using eig) is in $\mathbb{R}^{n \times n}$.
- Take the first k columns of U → $U_{reduce}$.
- Finding the new representation $z^i$ of an example $x^i$: $z^i = U_{reduce}^T \, x^i$.
credit Andrew Ng - Stanford Univ
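Putting the steps together in numpy; a sketch assuming rows are examples, hence rowvar=False in numpy.cov, and an arbitrary choice of k (numpy.cov normalizes by m-1 rather than m, which does not change the eigenvectors):

```python
import numpy as np

X = np.random.rand(100, 5)        # toy data: m=100 examples, n=5 features
X = X - X.mean(axis=0)            # mean-normalize first
k = 2

Sigma = np.cov(X, rowvar=False)   # n x n covariance matrix
U, S, V = np.linalg.svd(Sigma)    # eigenvectors of Sigma in the columns of U
U_reduce = U[:, :k]               # keep the first k columns

Z = X @ U_reduce                  # new k-dimensional representations z^i (rows)
X_approx = Z @ U_reduce.T         # back-projection into R^n
```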
![Page 26: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/26.jpg)
Dimensionality Reduction - PCA algorithm
Quiz
In PCA, we obtain $z \in \mathbb{R}^k$ from $x \in \mathbb{R}^n$ as follows: $z^i = U_{reduce}^T \, x^i$. Which of the following is a correct expression for $z^i_j$?
- $z^i_j = (u^k)^T x^i$
- $z^i_j = (u^j)^T x^i_j$
- $z^i_j = (u^j)^T x^i_k$
- $z^i_j = (u^j)^T x^i$
credit Andrew Ng - Stanford Univ
![Page 27: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/27.jpg)
Dimensionality Reduction - PCA algorithm
Choosing the number of components k
- Average squared projection error: $\mathrm{error}_{projection} = \frac{1}{m}\sum_{i=1}^m ||x^i - x^i_{approx}||^2$, where $x^i_{approx} = U_{reduce}\, z^i$ is the projection of $x^i$.
- Variation in the data: $\frac{1}{m}\sum_{i=1}^m ||x^i||^2$, the average (squared) distance of the m examples to the origin.
- Choose the smallest value of k such that the ratio between the average squared projection error and the variation is small:
$$\frac{\mathrm{error}_{projection}}{\mathrm{variation}} \le 0.01$$
"99% of variance retained"
credit Andrew Ng - Stanford Univ
![Page 28: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/28.jpg)
Dimensionality Reduction - PCA algorithm
Algorithm to find k
- Compute $\Sigma$, U.
- Try PCA with k = 1.
- Compute $U_{reduce}$, $z^1, \ldots, z^m$, $x^1_{approx}, \ldots, x^m_{approx}$.
- Check whether the ratio is below 0.01.
- Increment k if the criterion is not met; stop otherwise.
Speeding up: using U, S, V = svd(Σ), where $S \in \mathbb{R}^{n \times n}$ is a diagonal matrix with entries $S_{ii}$, the ratio for a given value of k can be computed as
$$1 - \frac{\sum_{i=1}^k S_{ii}}{\sum_{i=1}^n S_{ii}}$$
⇒ No need to recompute all the $z$ and $x_{approx}$.
credit Andrew Ng - Stanford Univ
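With the singular values returned by the SVD, the search for k takes a few lines. A sketch continuing the numpy notation above:

```python
import numpy as np

def smallest_k(S, threshold=0.01):
    """Return the smallest k whose projection-error ratio is below threshold.

    S: vector of singular values of Sigma, as returned by numpy.linalg.svd.
    """
    total = S.sum()
    for k in range(1, len(S) + 1):
        ratio = 1.0 - S[:k].sum() / total   # 1 - fraction of variance retained
        if ratio <= threshold:
            return k
    return len(S)
```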
![Page 29: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/29.jpg)
Dimensionality Reduction - PCA algorithm
Quiz
We said that PCA chooses k directions $u^1, \ldots, u^k$ onto which to project the data so as to minimize the squared projection error. Another way to say the same is that PCA tries to minimize:
- $\frac{1}{m}\sum_{i=1}^m ||x^i||^2$
- $\frac{1}{m}\sum_{i=1}^m ||x^i_{approx}||^2$
- $\frac{1}{m}\sum_{i=1}^m ||x^i - x^i_{approx}||^2$
- $\frac{1}{m}\sum_{i=1}^m ||x^i + x^i_{approx}||^2$
credit Andrew Ng - Stanford Univ
![Page 30: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/30.jpg)
Dimensionality Reduction - PCA algorithm
Tips on PCA
Dataset $(x^1, y^1), \ldots, (x^m, y^m)$ with $x^i \in \mathbb{R}^{10000}$
↪ $(z^1, y^1), \ldots, (z^m, y^m)$ with $z^i \in \mathbb{R}^{1000}$
- Run PCA on the training dataset only (compute $U_{reduce}$, find k, etc.), then apply it to the train, validation and test sets.
- Fewer dimensions → less space, and downstream models with fewer parameters (e.g. neural networks).
- For visualization: k = 2 or 3.
Don't:
- Don't use PCA to prevent overfitting; use regularization instead.
- Don't run PCA before even testing on your raw data!
credit Andrew Ng - Stanford Univ
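With scikit-learn, the "fit on train, apply everywhere" rule reads as follows (a sketch assuming X_train, X_valid and X_test are already-loaded arrays; the 1000 components mirror the slide's example):

```python
from sklearn.decomposition import PCA

pca = PCA(n_components=1000)
Z_train = pca.fit_transform(X_train)   # U_reduce is estimated on the training set only
Z_valid = pca.transform(X_valid)       # ...then reused, unchanged, on validation
Z_test = pca.transform(X_test)         # ...and on test data
```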
![Page 31: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/31.jpg)
Data Compression
PCA: orthogonal basis → no redundancy
Images: made of little "basic" parts:
Can we learn a dictionary of such patches to represent data?
(credit N.Baskiotis)
![Page 32: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/32.jpg)
Matrix Factorization
(credit P.Gallinari)
Idea
- Project data vectors into a latent space of dimension k < m, the size of the original space.
- Axes in this latent space represent a new basis for data representation.
- Each original data vector will be approximated as a linear combination of k basis vectors in this new space.
![Page 33: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/33.jpg)
Matrix Factorization
(credit P.Gallinari)
[Figure: the data matrix X factorized as the product U V]
![Page 34: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/34.jpg)
Matrix Factorization
(credit P.Gallinari)
Applications
- Recommendation (User × Item matrix)
- Matrix completion
- Link prediction (Adjacency matrix)
- ...
![Page 35: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/35.jpg)
Matrix Factorization
(credit P.Gallinari)
[Figure: a data column $x_{.j}$ of X is encoded by its representation $v_{.j}$ over the basis vectors (dictionary) $u_{.1}, u_{.2}, u_{.3}$ given by the columns of U]
![Page 36: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/36.jpg)
Matrix Factorization
(credit P.Gallinari)
Interpretation: if X is a User × Item matrix,
- users and items are represented in a common representation space of size k;
- their interaction is measured by a dot product in this space.
[Figure: an entry of the original data matrix X is approximated by $u_{i.} \cdot v_{.j}$, with $u_{i.}$ the user representation and $v_{.j}$ the item representation]
![Page 37: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/37.jpg)
Matrix Factorization
(credit P.Gallinari)
Interpretation: if X is a directed graph adjacency or weight matrix (Nodes × Nodes),
[Figure: an entry of X is approximated by $u_{i.} \cdot v_{.j}$, with $u_{i.}$ the sender representation and $v_{.j}$ the receiver representation of the nodes]
![Page 38: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/38.jpg)
Recommender Systems
Predicting movie ratings
(credit Andrew Ng coursera)
![Page 39: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/39.jpg)
Recommender Systems
Collaborative Filtering
(credit Andrew Ng coursera)
![Page 40: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/40.jpg)
Data compression - Matrix Factorization
Collaborative filtering optimization objective
Minimize J with respect to the $u^i$ and $v^j$, with $i \in 1, \ldots, n_m$ (items, e.g. movies) and $j \in 1, \ldots, n_u$ (users):
$$J(u^1, \ldots, u^{n_m}, v^1, \ldots, v^{n_u}) = \frac{1}{2}\sum_{(i,j):\, r(i,j)=1} (u^i v^j - y^{ij})^2 + \frac{\lambda}{2}\sum_{i=1}^{n_m}\sum_{k=1}^{n} (u^i_k)^2 + \frac{\lambda}{2}\sum_{j=1}^{n_u}\sum_{k=1}^{n} (v^j_k)^2$$
![Page 41: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/41.jpg)
Data compression - Matrix Factorization
Algorithm
- Initialize $u^1, \ldots, u^{n_m}, v^1, \ldots, v^{n_u}$ randomly.
- Minimize $J(u^1, \ldots, u^{n_m}, v^1, \ldots, v^{n_u})$ using gradient descent, e.g.:
$$u^i_k \leftarrow u^i_k - \varepsilon \Big( \sum_{j:\, r(i,j)=1} (u^i v^j - y^{ij})\, v^j_k + \lambda u^i_k \Big)$$
$$v^j_k \leftarrow v^j_k - \varepsilon \Big( \sum_{i:\, r(i,j)=1} (u^i v^j - y^{ij})\, u^i_k + \lambda v^j_k \Big)$$
(Here: batch → all ratings for i or j are used.)
- Predicting a rating for a user j with learned representation $v^j$ and an item i with learned representation $u^i$: $u^i v^j$.
N.B.: it is also possible to use alternating gradient descent, or other optimization methods.
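A minimal numpy sketch of these batch updates, written with whole matrices for readability; the mask R plays the role of r(i, j), and the sizes, epsilon and lambda values are illustrative assumptions:

```python
import numpy as np

n_items, n_users, k = 50, 40, 5
rng = np.random.default_rng(0)

Y = rng.uniform(1, 5, size=(n_items, n_users))   # toy ratings y^{ij}
R = rng.random((n_items, n_users)) < 0.3         # mask: r(i,j)=1 where observed

U = rng.normal(0, 0.1, size=(n_items, k))        # item representations u^i (rows)
V = rng.normal(0, 0.1, size=(k, n_users))        # user representations v^j (columns)
eps, lam = 0.01, 0.1

for _ in range(200):
    E = (U @ V - Y) * R            # residuals, on observed ratings only
    gU = E @ V.T + lam * U         # gradient w.r.t. all u^i at once
    gV = U.T @ E + lam * V         # gradient w.r.t. all v^j at once
    U -= eps * gU
    V -= eps * gV

pred = U @ V                       # predicted rating for (i, j) is u^i v^j
```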
![Page 42: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/42.jpg)
Data compression - Matrix Factorization
Loss function - example
Minimize $C = ||X - UV||^2 + c(U, V)$
Constraints on U, V via c(U, V), e.g.:
- Positivity (NMF - next slides)
- Sparsity of the representations, e.g. $||V||_1$
- Over-complete dictionary U: k > n
- Symmetry
- Biases on U and V (e.g. a popularity bias for item recommendation)
- Any a priori knowledge on U and V
credit P.Gallinari UPMC
![Page 43: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/43.jpg)
Data compression - Matrix Factorization
Non-Negative Matrix Factorization
Minimize $C = ||X - UV||^2$ under the constraints $U, V \ge 0$.
The loss is convex in U and in V separately, but not jointly in (U, V).
The problem can be solved by a Lagrangian formulation, giving an iterative multiplicative algorithm:
- Initialize U, V at random (non-negative) values.
- Iterate until convergence:
$$u_{ij} \leftarrow u_{ij} \frac{(XV^T)_{ij}}{(UVV^T)_{ij}} \qquad v_{ij} \leftarrow v_{ij} \frac{(X^T U)_{ij}}{(V^T U^T U)_{ij}}$$
It can also be solved by projected gradient formulations.
The solution (U, V) is not unique: if (U, V) is a solution, then $(UD, D^{-1}V)$ for any positive diagonal matrix D is also a solution.
credit P.Gallinari UPMC
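A numpy sketch of the multiplicative algorithm on toy data (the V update is written in the standard Lee-Seung orientation, the transposed but equivalent form of the slide's formula; the small epsilon guarding against division by zero is an implementation assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((30, 20))            # non-negative data: n=30, m=20
k = 4

U = rng.random((30, k))             # random non-negative initialization
V = rng.random((k, 20))
eps = 1e-9                          # avoids division by zero

for _ in range(200):
    U *= (X @ V.T) / (U @ V @ V.T + eps)   # multiplicative update for U
    V *= (U.T @ X) / (U.T @ U @ V + eps)   # update for V (transposed form)

print(np.linalg.norm(X - U @ V))    # reconstruction error shrinks over iterations
```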
![Page 44: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/44.jpg)
Data compression - Matrix Factorization
Using NMF for Clustering
Normalize the columns of U so that each column vector has norm 1, rescaling V to compensate (so that the product UV is unchanged):
$$u_{ij} \leftarrow \frac{u_{ij}}{\sqrt{\sum_i u_{ij}^2}} \qquad v_{ij} \leftarrow v_{ij}\sqrt{\sum_i u_{ij}^2}$$
Under the constraint "U normalized", the solution (U, V) is unique.
Associate $x^i$ to cluster j if $j = \arg\max_j (v_{ij})$.
credit P.Gallinari UPMC
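Continuing the numpy sketch above (U of shape n×k, V of shape k×m), the normalization and cluster assignment read as follows:

```python
import numpy as np

norms = np.sqrt((U ** 2).sum(axis=0))   # norm of each column of U
U = U / norms                           # each column of U now has norm 1
V = V * norms[:, None]                  # rescale V so that U @ V is unchanged
clusters = V.argmax(axis=0)             # cluster of each example = index of its
                                        # largest coefficient in V
```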
![Page 45: ICS Summer School 2016 Data Science - Week 4 Gabriella](https://reader033.vdocument.in/reader033/viewer/2022052805/628f43e8f7840147f5797d08/html5/thumbnails/45.jpg)
Data compression - Matrix Factorization
Many different versions and extensions of NMF:
- Different loss functions (e.g. different constraints)
- Different algorithms
Applications:
- Clustering
- Link prediction
- Recommendation
- etc.
credit P.Gallinari UPMC