![Page 1: Distributed CUR Decomposition for Bi-Clusteringrezab/classes/cme323/S16/projects_slides/kline_shaw.pdfJune 1, 2016 {sakline, keshaw}@stanford.edu Stanford University, CME 323 Final](https://reader033.vdocument.in/reader033/viewer/2022050510/5f9ade24738380506c31db73/html5/thumbnails/1.jpg)
Distributed CUR Decomposition for Bi-Clustering
Stephen Kline, Kevin Shaw
June 1, 2016
{sakline, keshaw}@stanford.edu
Stanford University, CME 323 Final Project
CUR as an alternative to SVD – e.g. Biclustering
Complete ratings for select Users
Complete ratings for select Movies
Movie Ratings: sparse / huge
SVD – accurate but heavy, less interpretable (rotated space)
CUR – less accurate but light, more interpretable*
*As archetypal users and movies
[Figure: User-Movie Biclusters (axes: Movies, Users)]
Biclustering was originally developed in the context of DNA microarrays.
Biclustering also has potential in other areas and adds interpretability.
Source: Source Code for Biology and Medicine (April 2013) - "The non-negative matrix factorization toolbox for biological data mining"
Review of SVD: A = UΣV^T
• PRO - High accuracy
  o k singular values/vectors produce the best rank-k approximation to A
• CON - High computation/space requirements
  o In our biclustering application with MovieLens data, the distributed SVD is "roughly square" - ARPACK (vs. "tall and skinny" – A^T A trick)
[Figure: block diagram of the factorization. A is sparse / huge; the factor blocks are labeled dense / big (two blocks) and sparse / small, with two further blocks labeled dense / full.]
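The "best rank-k approximation" claim above can be illustrated with a minimal NumPy sketch. The function name `best_rank_k` and the small random test matrix are assumptions made for illustration; the slides themselves use a distributed SVD (ARPACK) on MovieLens data, which this does not reproduce.

```python
import numpy as np

def best_rank_k(A, k):
    """Return the rank-k truncated-SVD approximation of A (illustrative helper)."""
    # Thin SVD: U is m x r, s has r entries, Vt is r x n, with r = min(m, n).
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep the k largest singular values and their singular vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 6))   # small dense stand-in for a ratings matrix
A2 = best_rank_k(A, 2)

# The Frobenius error of the truncation equals the root-sum-square
# of the discarded singular values.
s = np.linalg.svd(A, compute_uv=False)
err = np.linalg.norm(A - A2)
discarded = np.sqrt(np.sum(s[2:] ** 2))
```

Here `err` and `discarded` agree up to floating-point rounding, which is one way to see why no other rank-2 matrix can do better in the Frobenius norm.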
Background on A = CUR
• CUR trades accuracy ... for computation/space savings
• C/R = cols/rows from A
• U = pseudo-inverse of W (intersection of C and R)
[Figure: block diagram of A = CUR. A: sparse / huge; C and R: sparse / big; U: dense / small, U = Pinv(intersection of C and R; call it W, very small).]
• Col/RowSelect() alg samples w/ replacement (allows duplicates)
• Pinv(W) calculated via SVD(W)
• Accuracy better for large datasets
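The bullets above can be sketched in NumPy as follows. This is a simplified illustration, not the authors' implementation: the squared-norm sampling weights, the `rcond` cutoff, and the name `cur_sketch` are assumptions, and the rescaling of sampled columns/rows used in Drineas et al. is omitted.

```python
import numpy as np

def cur_sketch(A, k, rng):
    """Sample k columns and k rows of A with replacement; set U = pinv(W)."""
    m, n = A.shape
    sq = A ** 2
    total = sq.sum()
    # Assumed sampling weights: squared column/row norms, normalized to sum to 1.
    col_p = sq.sum(axis=0) / total
    row_p = sq.sum(axis=1) / total
    # Sampling with replacement, so duplicate indices are allowed.
    cols = rng.choice(n, size=k, replace=True, p=col_p)
    rows = rng.choice(m, size=k, replace=True, p=row_p)
    C = A[:, cols]                 # m x k, actual columns of A
    R = A[rows, :]                 # k x n, actual rows of A
    W = A[np.ix_(rows, cols)]      # k x k intersection of C and R
    # pinv is computed through an SVD of W; rcond drops near-zero singular values.
    U = np.linalg.pinv(W, rcond=1e-10)
    return C, U, R

rng = np.random.default_rng(1)
# Exactly rank-2 test matrix (illustrative data only).
A = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
C, U, R = cur_sketch(A, k=10, rng=rng)
rel_err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
```

For an exactly low-rank matrix, C U R reproduces A whenever W has the same rank as A, duplicates included; on real, noisy data the product is only an approximation.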
Design Decisions for Distributed CUR
[Figure: block diagram of A = CUR. A: sparse / huge; C and R: sparse / big; U: dense / small.]
Key Design Decision: Distribute two instances of A, avoiding future all-to-all communications
Only necessary to store C and R as sets of indices into A
Compute U locally
There are multiple variations of CUR. We selected the algorithm as presented in Drineas et al., 2006, "Fast Monte Carlo Algorithms for Matrices III", which (for example) does not remove duplicate cols/rows as some others do.
Serial vs. Distributed CUR - Asymptotics

Serial
• Build C and R:
  o Generate probabilities – O(mn)
  o Create C matrix – O(mk)
  o Create R matrix – O(nk)
• Construct U
  o Compute C^T C – O(mk^2)
  o SVD of C^T C – O(k^3)
  o Compute A and B – O(k^3)
  o U = AB^T – O(k^3)

Distributed (communication cost and computation time)
• Build C and R:
  • Generate probabilities – O(mn + p) cost, O(max dense) time
    o Create 2 RDDs by Row/Col partition – O(mn) cost, all-to-all (AtoA)
    o Both instances: reduce to Row/Col sums – O(max dense) time, no communication
    o One instance: reduce Row sum to total – O(p) cost, O(log p) time
    o Broadcast total to calculate probs – O(p) cost, O(log p) time
  • Create C/R matrices
    o Locally sample k rows/cols – O(k)
    o Broadcast sample to RDDs – O(pk) cost, O(k log p) time
• Construct U
  • Same as Serial (less opportunity to distribute)
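The distributed "Generate probabilities" and "Locally sample" steps can be mimicked in plain Python, with a list of dictionaries standing in for the row-partitioned RDD. This is a sketch under stated assumptions: it is not Spark code, the helper names are hypothetical, and the use of squared entries for the row sums is an assumption (the slide says only "Row/Col sums"). The column-partitioned instance would follow the same pattern.

```python
import random

def local_row_sums(partition):
    """Per-partition work: squared sum of each locally held sparse row."""
    # partition maps row index -> {column index: value}; no communication needed.
    return {i: sum(v * v for v in row.values()) for i, row in partition.items()}

def sampling_probs(partitions):
    # Step 1: every partition reduces its own rows independently.
    partials = [local_row_sums(part) for part in partitions]
    # Step 2: reduce one subtotal per partition to a global total (p values move).
    total = sum(sum(part.values()) for part in partials)
    # Step 3: "broadcast" the total so each partition can normalize locally.
    return {i: s / total for part in partials for i, s in part.items()}

def sample_rows(probs, k, seed=0):
    """Draw k row indices with replacement, so duplicates are allowed."""
    rng = random.Random(seed)
    idx = sorted(probs)
    return rng.choices(idx, weights=[probs[i] for i in idx], k=k)

# Two partitions of a 4-row sparse matrix (illustrative data only).
partitions = [
    {0: {0: 5.0, 2: 3.0}, 1: {1: 4.0}},
    {2: {0: 1.0, 3: 2.0}, 3: {2: 5.0}},
]
probs = sampling_probs(partitions)
sample = sample_rows(probs, k=3)
```

Only the sampled indices in `sample` need to be sent back to the partitions, which matches the design note that C and R are stored as sets of indices into A.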
Biclustering: Distributed CUR vs SVD - Empirics