programming with big data in r
TRANSCRIPT
![Page 1: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/1.jpg)
Introduction pbdR Benchmarks Demos
Programming with Big Data in R
Drew Schmidt and George Ostrouchov
http://r-pbd.org/
November 19, 2013
Drew Schmidt and George Ostrouchov Programming with Big Data in R
![Page 2: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/2.jpg)
Introduction pbdR Benchmarks Demos
The pbdR Core Team
Wei-Chen Chen1
George Ostrouchov1,2
Pragneshkumar Patel1
Drew Schmidt1
SupportThis work used resources of National Institute for Computational Sciences at the University of Tennessee,Knoxville, which is supported by the Office of Cyberinfrastructure of the U.S. National Science Foundation underAward No. ARRA-NSF-OCI-0906324 for NICS-RDAV center. This work used resources of the Newton HPCProgram at the University of Tennessee, Knoxville. This work also used resources of the Oak Ridge LeadershipComputing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S.Department of Energy under Contract No. DE-AC05-00OR22725.
1University of Tennessee. Supported in part by the project “NICS Remote Data Analysis and Visualization Center”funded by the Office of Cyberinfrastructure of the U.S. National Science Foundation under Award No.ARRA-NSF-OCI-0906324 for NICS-RDAV center.
2Oak Ridge National Laboratory. Supported in part by the project “Visual Data Exploration and Analysis ofUltra-large Climate Data” funded by U.S. DOE Office of Science under Contract No. DE-AC05-00OR22725.
Drew Schmidt and George Ostrouchov Programming with Big Data in R
![Page 3: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/3.jpg)
Introduction pbdR Benchmarks Demos
Contents
1 Introduction
2 pbdR
3 Benchmarks
4 Demos
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers
![Page 4: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/4.jpg)
Introduction pbdR Benchmarks Demos
What is R?
What is R?
lingua franca for data analytics and statisticalcomputing.
Part programming language, part dataanalysis package.
Dialect of S (Bell Labs).
Syntax designed for data.
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 1 / 30
![Page 5: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/5.jpg)
Introduction pbdR Benchmarks Demos
What is R?
Who uses R?
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 2 / 30
![Page 6: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/6.jpg)
Introduction pbdR Benchmarks Demos
What is R?
Language Paradigms
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 3 / 30
![Page 7: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/7.jpg)
Introduction pbdR Benchmarks Demos
What is R?
Why R?
1 People love it.
2 Highly extensible (CRAN ≈ 5000 actively maintainedpackages.
3 Wealth of diverse analytical methods.
4 HPC community has growing need for data analytics.
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 4 / 30
![Page 8: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/8.jpg)
Introduction pbdR Benchmarks Demos
What is R?
Problems with R
1 Slow.
2 If you don’t know what you’re doing, it’s really slow.
3 Data science community has growing data size problem.
4 Chokes on big data.
5 Parallelism: “batteries not included”
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 5 / 30
![Page 9: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/9.jpg)
Introduction pbdR Benchmarks Demos
What is R?
In spite of all this. . .
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 6 / 30
![Page 10: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/10.jpg)
Introduction pbdR Benchmarks Demos
R and HPC
R Interfaces to Native Tools
Interconnection Network
PROC+ cache
PROC+ cache
PROC+ cache
PROC+ cache
Mem Mem Mem Mem
Distributed Memory
Memory
CORE+ cache
CORE+ cache
CORE+ cache
CORE+ cache
Network
Shared Memory Local Memory
GPUor
MIC
Co-Processor
GPU: Graphical Processing Unit
MIC: Many Integrated Core
Focus on who owns what data and what communication is needed
Focus on which tasks can be parallel
Same Task on Blocks of data
SocketsMPI
HADOOP
snowRmpi
pbdMPI
snow + multicore = parallel
CUDAOpenCL
OpenACC
OpenMPOpenACC
OpenMPThreads
fork
.C.CallRcpp
OpenCLinline
multicore(fork)
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 7 / 30
![Page 11: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/11.jpg)
Introduction pbdR Benchmarks Demos
R and HPC
R Interfaces to Scalable Math Libraries
PLASMA
HiPLAR
Interconnection Network
PROC+ cache
PROC+ cache
PROC+ cache
PROC+ cache
Mem Mem Mem Mem
Distributed Memory
Memory
CORE+ cache
CORE+ cache
CORE+ cache
CORE+ cache
Network
Shared Memory Local Memory
GPUor
MIC
Co-Processor
GPU: Graphical Processing Unit
MIC: Many Integrated Core
Focus on who owns what data and what communication is needed
Focus on which tasks can be parallel
Same Task on Blocks of data
SocketsMPI
Hadoop
OpenMPThreads
fork
CUDAOpenCL
OpenACC
OpenMPOpenACC
multicore(fork) snow + multicore = parallel
ScaLAPACKPBLASBLACS
MAGMA
PETSc
Trilinos
DPLASMA
CUBLAS
HiPLARM
MKLACMLLibSci
.C.CallRcpp
OpenCLinline
pbdDMATpbdDMAT
magma
snowRmpi
pbdMPI
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 8 / 30
![Page 12: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/12.jpg)
Introduction pbdR Benchmarks Demos
Contents
2 pbdRThe pbdR Project
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers
![Page 13: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/13.jpg)
Introduction pbdR Benchmarks Demos
The pbdR Project
Programming with Big Data in R (pbdR)
Productivity, Portability, Performance
Freea R packages.
Bridging high-performance C withhigh-productivity of R
Distributed data details implicitlymanaged.
Methods have syntax identical to R.
aMPL, BSD, and GPL licensed
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 9 / 30
![Page 14: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/14.jpg)
Introduction pbdR Benchmarks Demos
The pbdR Project
pbdR Packages
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 10 / 30
![Page 15: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/15.jpg)
Introduction pbdR Benchmarks Demos
The pbdR Project
pbdR Example Syntax
1 x <- x[-1, 2:5]
2 x <- log(abs(x) + 1)
3 xtx <- t(x) %*% x
4 ans <- svd(solve(xtx))
The above runs on 1 core with R or 10,000 cores with pbdR
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 11 / 30
![Page 16: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/16.jpg)
Introduction pbdR Benchmarks Demos
The pbdR Project
Profiling with pbdPROF
1. Rebuild pbdR packages
R CMD INSTALL
pbdMPI_0.2 -1.tar.gz \
--configure -args= \
"--enable -pbdPROF"
2. Run code
mpirun -np 64 Rscript
my_script.R
3. Analyze results
1 library(pbdPROF)
2 prof <- read.prof(
"profiler_output.mpiP")
3 plot(prof)
Publication-quality graphs
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 12 / 30
![Page 17: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/17.jpg)
Introduction pbdR Benchmarks Demos
The pbdR Project
pbdR on HPC Resources
University of Tennessee
Kraken (XSEDE)
Nautilus
Darter
Newton
Oak Ridge National Lab
Titan
Lens
Chester
Sith
Other Resources
Stampede, TACC (XSEDE)
tara, UMBC
Hopper, NERSC
Edison, NERSC
If you are interested in installing pbdR: [email protected]
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 13 / 30
![Page 18: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/18.jpg)
Introduction pbdR Benchmarks Demos
The pbdR Project
Exploration of Climate Data with pbdR
Behavior of extreme precipitation across atmospheric layers of moisture, temperature, and wind
IMD
W.-C. Chen, G. Ostrouchov, D. Pugmire, Prabhat, M. Wehner. A Parallel EM Algorithm for Model-Based Clustering Applied to the Exploration of Large Spatio-Temporal Data. Technometrics (online, in print: Fall 2013).
Exploration of Climate Data with pbdR
• 2013 INCITE “AYribu.ng Changes in the Risk of Extreme Weather and Climate” Michael Wehner, PI
• High resolu0on atmospheric model, CAM 5.1 (~25 km resolu0on) • Resolves processes responsible for extreme precipita0on and storms
• R and pbdR used for scalable analy.cs and graphics • Advanced clustering capability (EM algorithm for Gaussian mixture
models)
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 14 / 30
![Page 19: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/19.jpg)
Introduction pbdR Benchmarks Demos
Contents
1 Introduction
2 pbdR
3 Benchmarks
4 Demos
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers
![Page 20: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/20.jpg)
Introduction pbdR Benchmarks Demos
Benchmarks
Non-Optimal Choices Throughout
1 Only libre software used (no MKL, ACML, etc.).
2 1 core = 1 MPI process.
3 No tuning for data distribution.
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 15 / 30
![Page 21: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/21.jpg)
Introduction pbdR Benchmarks Demos
Benchmarks
Benchmark Data
1 Measure wallclock time for covariance and linear regression.
2 Random normal N(100, 10000).
3 Local problem size of ≈ 43.4MiB.
4 Three sets: 500, 1000, and 2000 columns.
5 Several runs at different core sizes within each set.
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 16 / 30
![Page 22: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/22.jpg)
Introduction pbdR Benchmarks Demos
Covariance
Covariance Code
cov(xn×p) =1
n − 1
n∑
i=1
(xi − µx) (xi − µx)T
1 x <- ddmatrix("rnorm", nrow=n, ncol=p, mean=mean , sd=sd)
2
3 cov.x <- cov(x)
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 17 / 30
![Page 23: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/23.jpg)
Introduction pbdR Benchmarks Demos
Covariance
cov()
0.04
21.3442.68 85.35 170.7 341.41
0.04
21.3442.68 85.35 170.7 341.41
0.04
21.3442.68 85.35 170.7341.41
4
8
12
16
1 504 1008 2016 4032 80641 504 1008 2016 4032 80641 504 1008 2016 4032 8064Cores
Run
Tim
e (S
econ
ds)
Predictors 500 1000 2000
Calculating cov(x) With Fixed Local Size of ~43.4 MiB
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 18 / 30
![Page 24: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/24.jpg)
Introduction pbdR Benchmarks Demos
Linear Model Fitting
Linear Model Code
Find β such that
y = Xβ + ε
When X is full rank,
β = (XTX)−1XTy
1 x <- ddmatrix("rnorm", nrow=n, ncol=p, mean=mean , sd=sd)
2 beta_true <- ddmatrix("runif", nrow=p, ncol =1)
3
4 y <- x %*% beta_true
5
6 beta_est <- lm.fit(x=x, y=y)$coefficients
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 19 / 30
![Page 25: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/25.jpg)
Introduction pbdR Benchmarks Demos
Linear Model Fitting
lm.fit()
21.34
42.6885.35170.7 341.41
1016.09
21.3442.6885.35 170.7
341.411016.09
21.3442.68
85.35
170.7341.41
1016.09
25
50
75
100
125
504
1008
2016
4032
8064
2400
050
410
0820
1640
3280
64
2400
050
410
0820
1640
3280
64
2400
0
Cores
Run
Tim
e (S
econ
ds)
Predictors 500 1000 2000
Fitting y~x With Fixed Local Size of ~43.4 MiB
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 20 / 30
![Page 26: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/26.jpg)
Introduction pbdR Benchmarks Demos
Linear Model Fitting
Data Generation
0
200
400
600
800
1 504 1008 2016 4032 8064 24000 48000Cores
Dat
a G
ener
atio
n (G
iB p
er S
econ
d)
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 21 / 30
![Page 27: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/27.jpg)
Introduction pbdR Benchmarks Demos
Contents
4 DemospbdMPIpbdDMATRandSVD
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers
![Page 28: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/28.jpg)
Introduction pbdR Benchmarks Demos
pbdMPI
MPI Operations: The Gang’s All Here
Communicator wrangling: init(), finalize()
Rank query: comm.rank(), comm.size()
Reduction: reduce(x, op=’sum’), allreduce(x)
Gather: gather(x), allgather(x)
Broadcast: bcast(x)
Barrier: barrier()
Send/Receive: send(x), recv()
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 22 / 30
![Page 29: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/29.jpg)
Introduction pbdR Benchmarks Demos
pbdMPI
MPI Operations for R Users
Printing: comm.print(x), comm.cat(x)
RNG Seeds: comm.set.seed(diff=T)
Task Subsetting: get.jid(n)
*ply:pbdApply(X, MARGIN, FUN, ...)
pbdLapply(X, FUN, ...)
pbdSapply(X, FUN, ...)
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 23 / 30
![Page 30: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/30.jpg)
Introduction pbdR Benchmarks Demos
pbdMPI
Example 1: Monte Carlo Simulation
Sample N uniform observations (xi , yi ) in the unit square[0, 1]× [0, 1]. Then
π ≈ 4
(# Inside Circle
# Total
)= 4
(# Blue
# Blue + # Red
)
0.0
0.2
0.4
0.6
0.8
1.0
0.0 0.2 0.4 0.6 0.8 1.0
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 24 / 30
![Page 31: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/31.jpg)
Introduction pbdR Benchmarks Demos
pbdDMAT
Distributed Matrices
Most problems in data science are matrix algebra problems, so:
Distributed matrices =⇒ Bigger data
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 25 / 30
![Page 32: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/32.jpg)
Introduction pbdR Benchmarks Demos
pbdDMAT
DMAT: 2-dimensional Block-Cyclic with 6 Processors
x =
x11 x12 x13 x14 x15 x16 x17 x18 x19
x21 x22 x23 x24 x25 x26 x27 x28 x29
x31 x32 x33 x34 x35 x36 x37 x38 x39
x41 x42 x43 x44 x45 x46 x47 x48 x49
x51 x52 x53 x54 x55 x56 x57 x58 x59
x61 x62 x63 x64 x65 x66 x67 x68 x69
x71 x72 x73 x74 x75 x76 x77 x78 x79
x81 x82 x83 x84 x85 x86 x87 x88 x89
x91 x92 x93 x94 x95 x96 x97 x98 x99
9×9
Processor grid =
∣∣∣∣0 1 23 4 5
∣∣∣∣ =
∣∣∣∣(0,0) (0,1) (0,2)(1,0) (1,1) (1,2)
∣∣∣∣
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 26 / 30
![Page 33: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/33.jpg)
Introduction pbdR Benchmarks Demos
pbdDMAT
Understanding DMAT: Local View
x11 x12 x17 x18
x21 x22 x27 x28
x51 x52 x57 x58
x61 x62 x67 x68
x91 x92 x97 x98
5×4
x13 x14 x19
x23 x24 x29
x53 x54 x59
x63 x64 x69
x93 x94 x99
5×3
x15 x16
x25 x26
x55 x56
x65 x66
x95 x96
5×2
x31 x32 x37 x38
x41 x42 x47 x48
x71 x72 x77 x78
x81 x82 x87 x88
4×4
x33 x34 x39
x43 x44 x49
x73 x74 x79
x83 x84 x89
4×3
x35 x36
x45 x46
x75 x76
x85 x86
4×2
Processor grid =
∣∣∣∣0 1 23 4 5
∣∣∣∣ =
∣∣∣∣(0,0) (0,1) (0,2)(1,0) (1,1) (1,2)
∣∣∣∣
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 27 / 30
![Page 34: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/34.jpg)
Introduction pbdR Benchmarks Demos
RandSVD
Randomized SVD1
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
PROBABILISTIC ALGORITHMS FOR MATRIX APPROXIMATION 227
Prototype for Randomized SVDGiven an m × n matrix A, a target number k of singular vectors, and anexponent q (say, q = 1 or q = 2), this procedure computes an approximaterank-2k factorization UΣV ∗, where U and V are orthonormal, and Σ isnonnegative and diagonal.Stage A:1 Generate an n× 2k Gaussian test matrix Ω.2 Form Y = (AA∗)qAΩ by multiplying alternately with A and A∗.3 Construct a matrix Q whose columns form an orthonormal basis for
the range of Y .Stage B:4 Form B = Q∗A.5 Compute an SVD of the small matrix: B = UΣV ∗.6 Set U = QU .Note: The computation of Y in step 2 is vulnerable to round-off errors.When high accuracy is required, we must incorporate an orthonormalizationstep between each application of A and A∗; see Algorithm 4.4.
The theory developed in this paper provides much more detailed informationabout the performance of the proto-algorithm.
• When the singular values of A decay slightly, the error ‖A − QQ∗A‖ doesnot depend on the dimensions of the matrix (sections 10.2–10.3).
• We can reduce the size of the bracket in the error bound (1.8) by combiningthe proto-algorithm with a power iteration (section 10.4). For an example,see section 1.6 below.
• For the structured random matrices we mentioned in section 1.4.1, relatederror bounds are in force (section 11).
• We can obtain inexpensive a posteriori error estimates to verify the qualityof the approximation (section 4.3).
1.6. Example: Randomized SVD. We conclude this introduction with a shortdiscussion of how these ideas allow us to perform an approximate SVD of a large datamatrix, which is a compelling application of randomized matrix approximation [113].
The two-stage randomized method offers a natural approach to SVD compu-tations. Unfortunately, the simplest version of this scheme is inadequate in manyapplications because the singular spectrum of the input matrix may decay slowly. Toaddress this difficulty, we incorporate q steps of a power iteration, where q = 1 orq = 2 usually suffices in practice. The complete scheme appears in the box labeledPrototype for Randomized SVD. For most applications, it is important to incorporateadditional refinements, as we discuss in sections 4 and 5.
The Randomized SVD procedure requires only 2(q + 1) passes over the matrix,so it is efficient even for matrices stored out-of-core. The flop count satisfies
TrandSVD = (2q + 2) k Tmult +O(k2(m+ n)),
where Tmult is the flop count of a matrix–vector multiply with A or A∗. We have thefollowing theorem on the performance of this method in exact arithmetic, which is aconsequence of Corollary 10.10.
Theorem 1.2. Suppose that A is a real m × n matrix. Select an exponent qand a target number k of singular vectors, where 2 ≤ k ≤ 0.5minm,n. Execute the
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
244 N. HALKO, P. G. MARTINSSON, AND J. A. TROPP
Algorithm 4.3: Randomized Power IterationGiven an m× n matrix A and integers and q, this algorithm computes anm× orthonormal matrix Q whose range approximates the range of A.1 Draw an n× Gaussian random matrix Ω.2 Form the m× matrix Y = (AA∗)qAΩ via alternating application
of A and A∗.3 Construct an m× matrix Q whose columns form an orthonormal
basis for the range of Y , e.g., via the QR factorization Y = QR.Note: This procedure is vulnerable to round-off errors; see Remark 4.3. Therecommended implementation appears as Algorithm 4.4.
Algorithm 4.4: Randomized Subspace IterationGiven an m× n matrix A and integers and q, this algorithm computes anm× orthonormal matrix Q whose range approximates the range of A.1 Draw an n× standard Gaussian matrix Ω.2 Form Y0 = AΩ and compute its QR factorization Y0 = Q0R0.3 for j = 1, 2, . . . , q
4 Form Yj = A∗Qj−1 and compute its QR factorization Yj = QjRj .
5 Form Yj = AQj and compute its QR factorization Yj = QjRj .6 end7 Q = Qq.
Algorithm 4.3 targets the fixed-rank problem. To address the fixed-precisionproblem, we can incorporate the error estimators described in section 4.3 to obtainan adaptive scheme analogous with Algorithm 4.2. In situations where it is critical toachieve near-optimal approximation errors, one can increase the oversampling beyondour standard recommendation = k + 5 all the way to = 2k without changingthe scaling of the asymptotic computational cost. A supporting analysis appears inCorollary 10.10.
Remark 4.3. Unfortunately, when Algorithm 4.3 is executed in floating-pointarithmetic, rounding errors will extinguish all information pertaining to singularmodes associated with singular values that are small compared with ‖A‖. (Roughly,if machine precision is µ, then all information associated with singular values smallerthan µ1/(2q+1) ‖A‖ is lost.) This problem can easily be remedied by orthonormalizingthe columns of the sample matrix between each application of A and A∗. The result-ing scheme, summarized as Algorithm 4.4, is algebraically equivalent to Algorithm 4.3when executed in exact arithmetic [93, 125]. We recommend Algorithm 4.4 becauseits computational costs are similar to those of Algorithm 4.3, even though the formeris substantially more accurate in floating-point arithmetic.
4.6. An Accelerated Technique for General Dense Matrices. This section de-scribes a set of techniques that allow us to compute an approximate rank- factor-ization of a general dense m× n matrix in roughly O(mn log()) flops, in contrast tothe asymptotic cost O(mn) required by earlier methods. We can tailor this schemefor the real or complex case, but we focus on the conceptually simpler complex case.These algorithms were introduced in [138]; similar techniques were proposed in [119].
The first step toward this accelerated technique is to observe that the bottleneckin Algorithm 4.1 is the computation of the matrix product AΩ. When the test matrix
Serial R1 randSVD <− f u n c t i o n (A, k , q=3)2 3 ## Stage A4 Omega <− matrix(rnorm(n*2*k),5 nrow=n, ncol=2*k)6 Y <− A %∗% Omega7 Q <− qr .Q( qr (Y) )8 At <− t (A)9 f o r ( i i n 1 : q )
10 11 Y <− At %∗% Q12 Q <− qr .Q( qr (Y) )13 Y <− A %∗% Q14 Q <− qr .Q( qr (Y) )15 1617 ## Stage B18 B <− t (Q) %∗% A19 U <− La . svd (B) $u20 U <− Q %∗% U21 U[ , 1 : k ]22
1Halko N, Martinsson P-G and Tropp J A 2011 Finding structure with randomness: probabilistic algorithms for
constructing approximate matrix decompositions SIAM Rev. 53 217–88
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 28 / 30
![Page 35: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/35.jpg)
Introduction pbdR Benchmarks Demos
RandSVD
Randomized SVD
Serial R1 randSVD <− f u n c t i o n (A, k , q=3)2 3 ## Stage A4 Omega <− matrix(rnorm(n*2*k),5 nrow=n, ncol=2*k)6 Y <− A %∗% Omega7 Q <− qr .Q( qr (Y) )8 At <− t (A)9 f o r ( i i n 1 : q )
10 11 Y <− At %∗% Q12 Q <− qr .Q( qr (Y) )13 Y <− A %∗% Q14 Q <− qr .Q( qr (Y) )15 1617 ## Stage B18 B <− t (Q) %∗% A19 U <− La . svd (B) $u20 U <− Q %∗% U21 U[ , 1 : k ]22
Parallel pbdR1 randSVD <− f u n c t i o n (A, k , q=3)2 3 ## Stage A4 Omega <− ddmatrix(”rnorm”,5 nrow=n, ncol=2*k)6 Y <− A %∗% Omega7 Q <− qr .Q( qr (Y) )8 At <− t (A)9 f o r ( i i n 1 : q )
10 11 Y <− At %∗% Q12 Q <− qr .Q( qr (Y) )13 Y <− A %∗% Q14 Q <− qr .Q( qr (Y) )15 1617 ## Stage B18 B <− t (Q) %∗% A19 U <− La . svd (B) $u20 U <− Q %∗% U21 U[ , 1 : k ]22
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 29 / 30
![Page 36: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/36.jpg)
Introduction pbdR Benchmarks Demos
RandSVD
Randomized SVD on ≈ 765 MiB
1
2
4
8
16
32
64
128
1 2 4 8 16 32 64 128Cores
Spe
edup
Algorithm full randomized
30 Singular Vectors from a 100,000 by 1,000 Matrix
5
10
15
1 2 4 8 16 32 64 128Cores
Spe
edup
30 Singular Vectors from a 100,000 by 1,000 MatrixSpeedup of Randomized vs. Full SVD
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers 30 / 30
![Page 37: Programming with Big Data in R](https://reader034.vdocument.in/reader034/viewer/2022051319/627b22dafa2b614a6d51c4bc/html5/thumbnails/37.jpg)
Introduction pbdR Benchmarks Demos
RandSVD
Thanks for coming!
Questions?
http://r-pbd.org/
Be sure to come to our R BoF: Wednesday 5:30-7:00 room 404
Drew Schmidt and George Ostrouchov Elevating R to Supercomputers