stochastic svd on hadoop shannon quinn (with thanks to gunnar martinsson and nathan halko of uc...

Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Upload: adelia-patrick

Post on 04-Jan-2016

213 views

Category:

Documents

0 download

Report

Download

Embed Size (px):

TRANSCRIPT

Stochastic SVDon Hadoop

Shannon Quinn(with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Lecture breakdown

• Part I–Stochastic SVD

• Part II–Distributed stochastic SVD

Page 3: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Part I: Stochastic SVD

Page 4: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Basic goal

• Matrix A–Find a low-rank approximation of A–Basic dimensionality reduction

Preconditioning

Page 5: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Basic algorithm

• INPUT: A, k, p• OUTPUT: Q1. Draw Gaussian n x (k + p) test matrix Ω2. Form product Y = AΩ3. Orthogonalize columns of Y Q

Page 6: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Basic evaluation

Page 7: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Approximating the SVD

• INPUT: Q• OUTPUT: Singular vectors U1. Form k x n matrix B = QTA2. Compute SVD of B = ÛΣVT3. Compute singular vectors U = QÛ

Page 8: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Demo

Page 9: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Empirical Results

• 1000x1000 matrix

Page 10: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Power iterations

• Affects decay of eigenvalues / singular values

Page 11: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Empirical Results

Page 12: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Empirical Results

Page 13: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Part II: Distributed SSVD

Page 14: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Algorithm Overview

QR factorization

Power iterationQR

factorization

In-core SVD

Page 15: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

SSVD Primitives

• Matrix-vector multiplication: y = Ax

• (midterm, anyone?)

Page 16: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

SSVD Primitives

• Matrix-matrix multiplication: y = ATAx

Page 17: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Matrix-matrix multiplication

• Very clever use of map/reduce• Each Mapper outputs:

Page 18: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

SSVD Primitives

• Distributed orthogonalization: Y = AΩ–Givens rotation

– Streaming QR• Sliding window

–Merge factorizations1. Merge R2. Merge QT

Page 19: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

SSVD

Page 20: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

1: Q-job

Page 21: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

2: BT-job

Page 22: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

3: ABT-job

Page 23: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

4: U-job

Page 24: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

5: V-job

Page 25: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Mahout SSVD Parameters

Page 26: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Block height

Page 27: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Power iterations

Page 28: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Comparison to Lanczos

Page 29: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Comparison to Lanczos

Page 30: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Comparison to Lanczos

Page 31: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Comparison to Lanczos

Page 32: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Datasets

• Wikipedia-all

• Wikipedia-MAX

Page 33: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

That’s SSVD!

Page 34: Stochastic SVD on Hadoop Shannon Quinn (with thanks to Gunnar Martinsson and Nathan Halko of UC Boulder, and Joel Tropp of CalTech)

Resources

• Randomized methods for computing the SVD of very large matrices– http://web.stanford.edu/group/mmds/slides2010/Martinsson.pdf

• Randomized methods for computing low-rank approximations of matrices– https://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf

• SSVD on Mahout– https://mahout.apache.org/users/algorithms/d-ssvd.html

http://web.stanford.edu/group/mmds/slides2010/Martinsson.pdf

https://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf

https://mahout.apache.org/users/algorithms/d-ssvd.html

Jonas Martinsson Blog (FeedJournal)

The New Jersey Chapter of ASHRAE Newsletter … · Refrigeration Dave Halko 856-355-4153 ... Back Row (Left To Right), ... November 2013 THERMOGRAM PAGE 10 ASHRAE, AHR Expo Return

Education Friendship Ethnocentrism Bremen · EDUCATION, FRIENDSHIP, AND ETHNOCENTRISM 5 ethnocentric than lowly educated individuals (Pettigrew and Tropp, 2011). Educational attainment

Maturing Extreme Programming Through the CMM - Jonas Martinsson

Linda R. Tropp · Curriculum Vitae December 2016 Department of Psychological and Brain Sciences e-mail: [email protected] Tobin Hall, 135 Hicks Way office: Tobin Hall, room 637

Response option order effects. Scale lengths and horizontal or vertical layout. Johan Martinsson University of Gothenburg

Implementing Safety Management Systems for Aviation into ...ijme.us/cd_08/PDF/164ent203.pdfPurdue University, Department of Aviation Technology [email protected] Abstract While the

100 Meters - Centerville Public Schools / Homepage · 100 Meters Boys Time Kyle Carlson 10.91 Craig Halko 11.1 Steven Halko 11.3 Chris Halko 11.3 Pat Halko 11.4 ... Linda Guisti 105'5"

ARCHAEOLOGICAL INVESTIGATIONS IN INDEPENDENT SAMOA – TALA …926766/FULLTEXT01.pdf · 2016. 5. 9. · INDEPENDENT SAMOA – TALA ELI OF LAUPULE MOUND AND BEYOND Helene Martinsson

“Building the Case for Coordinated School Health”: The Cuyahoga County Board of Health Coordinated School Health Initiative Presented by: Martha Halko,

Ph.D. Students: Adrianna Gillman Nathan Halko Patrick ...€¦ · Standard methods (e.g. Lanczos) require O(Tmult k) operations. The new methods are also O(Tmult k) but are more robust,

data.over-blog-kiwi.com...2012/01/05 · 2nd. time rit. f 2nd time rit. Allegro ma non troppo Allegro dim. marc. non tropp cresc

Interviewing PRESENTED BY ALI BREWER & LINDSEY TROPP WITH CREATIVE FINANCIAL STAFFING

Linda R. Tropp · Society for the Psychological Study of Social Issues, and the American Psychological Association 2019 – 2020 Chancellor’s Leadership Fellow, Office of Equity

Randomized methods for matrix computations and analysis …amath.colorado.edu/faculty/martinss/2016_PCMI/martinsson... · Randomized methods for matrix computations and data analysis

V.sblog.s3.amazonaws.com/wp-content/uploads/2011/06/Tropp...No. 10-__ RICHARD A. TROPP, ON BEHALF OF HIMSELF AND ALL OTHERS SIMILARLY SITUATED, Petitioner, V. CORPORATION OF LLOYD’S,

BTES for tomorrows district heating systems ......Fredrik Martinsson [email protected] R&D areas Project title Organisation-Project leader Budget, € n Techno-economical

Lab No 14 Isopach and Isochore Maps University of Sulaimani School of Science Department of Geology Practical Structural Geology Instructor: Halko M. Mahmoud

Bernth Martinsson / varmblod, 2017 - Dalatravet · Bernth Martinsson / varmblod, 2017 ... 242 Baleo Josselyn 6 243 Roscoe Frontline 6 244 Donna's Girl 6 ... 373 Simb Full Point 3

June 4 - 17, 2019 Names, Faces, People and Places€¦ · State Investment Sales Group for Avison Young, the firm has hired Terri Gumula and Daniel Tropp. Gumula and Tropp become

Rittal Cooling Solutions – A. Tropp - E Technologies Cooling Solutions – A. Tropp 2 Rittal Cooling Solutions Presented by Adam Tropp. ... • LCP Hybrid • Aisle Containment •

Arrowhead Project -A platform perspective Pär Erik Martinsson, ProcessIT.EU / LTU 1

Enhanced Capabilities in Wet-end Paper Machine Clothing pps.pdf · Enhanced Capabilities in Wet-end Paper Machine Clothing Mikael Danielsson, Lars Martinsson Albany International

Divided Infringement After Travel Sentry v. Tropp: Direct …media.straffordpub.com/products/divided-infringement... · 2018-05-16 · improving airline luggage inspection through

Full Service Broadband an Ericsson View of the Future Michael Martinsson Marketing Director Business Unit Networks Ericsson AB

Matrix Concentration Inequalitiesusers.cms.caltech.edu/~jtropp/books/Tro14-Introduction...Matrix Concentration Inequalities Joel A. Tropp 24 December 2014 FnTML Draft, Revised I i

01 Wireless in Tropp t 367

Linda R. Tropptropp.socialpsychology.org/cv/Tropp-CV-June-2015.pdf · 2009 Henri Tajfel Fellowship, International Graduate College on Conflict and Cooperation between Social Groups,

“That was Basically Me”: Critical Literacy, Text, and Talk. J. Wilson & T. Tropp Laman

Fast matrix computations via randomized sampling...Adrianna Gillman Yoel Shkolnisky (Yale) Nathan Halko Joel Tropp (Caltech) Patrick Young Mark Tygert (Courant) Franco Woolfe (Goldman

DMS: The Distribution of Mass within Spiral Galaxies (Thomas Martinsson) · Thomas Martinsson Unique Solutions from Gas and Stellar Kinematics Leiden Observatory Garching, March 2014

Osteologi och Arkeogenetik...Osteologi och Arkeogenetik Osteologilärare Sabine Sten, docent Gustav Malmborg, adjunkt Osteologikompetens Paul Wallin, docent Helene Martinsson-Wallin,

phdthesis Anders Martinssonpublications.lib.chalmers.se/records/fulltext/248704/...A PeterHegartyandAnders Martinsson, “On the existence of accessible paths in various models of

Sport Event Name Where Year Result Class AG Sex WEB_Records_POST/Track_REC… · Track 100 Meters Halko, Alexa Junior Nationals 2014 20.71 T34 U17 Female Track 100 Meters Dewald,

Lund Aerosol Group Div. of Nuclear Physics, Lund University, LTH Professors: Erik Swietlicki, Bengt Martinsson Senior Scientists: Göran Frank, Birgitta