Acceleration of software package "R" using GPUs
Sachinthaka Abeywardana
CSIRO.
Introduction to Graphics Processing Units (GPUs)
Introduction to GPU contd.
Introduction to R and BLAS
• R
  • Statistical package
  • Graphics
• BLAS (Basic Linear Algebra Subprograms)
  • Vector-vector addition/multiplication etc.
  • Vector-matrix addition/multiplication etc.
  • Matrix-matrix addition/multiplication etc.
• LAPACK (Linear Algebra Package)
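
The three kinds of operation listed for BLAS correspond to what BLAS calls Level 1, Level 2 and Level 3 routines. As a rough illustration only, the pure-Python sketch below mirrors one reference routine from each level (`axpy`, `gemv`, `gemm`). The nested-list storage and the simplified argument lists are choices made here for readability; this is not how R or any BLAS library implements them.

```python
# Simplified stand-ins for one routine from each BLAS level.
# Real BLAS works on flat column-major arrays; nested lists are used here for readability.

def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha * x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def gemv(alpha, A, x, beta, y):
    """Level 2 (matrix-vector): return alpha * A x + beta * y."""
    return [alpha * sum(a * xj for a, xj in zip(row, x)) + beta * yi
            for row, yi in zip(A, y)]

def gemm(alpha, A, B, beta, C):
    """Level 3 (matrix-matrix): return alpha * A B + beta * C."""
    cols_B = list(zip(*B))  # columns of B, so each output entry is a row-by-column dot product
    return [[alpha * sum(a * b for a, b in zip(row, col)) + beta * c
             for col, c in zip(cols_B, c_row)]
            for row, c_row in zip(A, C)]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
Z = [[0.0, 0.0], [0.0, 0.0]]

print(axpy(2.0, [1.0, 2.0], [10.0, 20.0]))
print(gemv(1.0, A, [1.0, 1.0], 0.0, [0.0, 0.0]))
print(gemm(1.0, A, B, 0.0, Z))
```

Level 3 routines do the most arithmetic per element of data moved, which is why matrix-matrix products are the natural first target for a GPU-backed BLAS.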
What has been done in this project
• Aim: Replace Rblas.dll with a faster BLAS library
[Diagram: R, LAPACK, BLAS; the existing BLAS is replaced by the new BLAS]
How New Rblas.dll was created
[Diagram labels: Rblas.dll, CUBLAS library, ‘C program’ wrapper, FORTRAN, Initialise]
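
R and LAPACK call BLAS through its Fortran interface, so a replacement library has to accept the same style of arguments: transpose flags, dimensions, scalars, and flat column-major arrays with leading dimensions. The sketch below shows that shape of wrapper in Python. It is an illustration under assumptions, not the project's C code: `gpu_gemm_stub` is a hypothetical placeholder for the point where a GPU library call would go, and here it simply computes the result on the CPU. Only the 'N' and 'T' flags are handled.

```python
def gpu_gemm_stub(ta, tb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
    """Hypothetical stand-in for the GPU call: C = alpha * op(A) * op(B) + beta * C, on the CPU."""
    def at(mat, ld, trans, i, j):
        # Column-major storage: element (row, col) lives at index row + col * ld.
        # With trans set, return the (i, j) entry of the transpose instead.
        return mat[j + i * ld] if trans else mat[i + j * ld]

    for j in range(n):
        for i in range(m):
            acc = 0.0
            for p in range(k):
                acc += at(a, lda, ta, i, p) * at(b, ldb, tb, p, j)
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc]

def dgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
    """Wrapper with a DGEMM-like argument list: validate the flags, then delegate."""
    flags = []
    for t in (transa, transb):
        if t.upper() not in ("N", "T"):
            raise ValueError("transpose flag must be 'N' or 'T'")
        flags.append(t.upper() == "T")
    gpu_gemm_stub(flags[0], flags[1], m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
    return c

# A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], stored column by column.
a = [1.0, 3.0, 2.0, 4.0]
b = [5.0, 7.0, 6.0, 8.0]

print(dgemm("N", "N", 2, 2, 2, 1.0, a, 2, b, 2, 0.0, [0.0] * 4, 2))  # A %*% B
print(dgemm("T", "N", 2, 2, 2, 1.0, a, 2, b, 2, 0.0, [0.0] * 4, 2))  # t(A) %*% B
```

Keeping the argument list identical to the routine being replaced is what lets the calling code stay unchanged when the library is swapped.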
Results for 1000 x 1000 Matrices

| Operation | Description | CPU average (s) | GPU average (s), single precision | GPU average (s), double precision |
|---|---|---|---|---|
| 3.2 * A %*% B + 4.1 * A | 3.2 A x B + 4.1 A | 1.9335 | 0.2375 | 0.123 |
| A%*%B | Matrix A x matrix B | 1.8855 | 0.176 | 0.092 |
| t(A)%*%B | Transpose of matrix A x matrix B | 1.9135 | 0.207 | 0.089 |
| solve(A) | Invert matrix A | 2.227 | 4.69 | 5.288 |
Improvements

| Operation | Single precision (%) | Double precision (%) |
|---|---|---|
| 3.2 * A %*% B + 4.1 * A | 814.1052632 | 1571.95122 |
| A%*%B | 1071.306818 | 2049.456522 |
| t(A)%*%B | 924.3961353 | 2150 |
| solve(A) | -210.597216 | -237.4494836 |
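
These percentages can be reproduced from the timings on the previous slide. For a speed-up, the figure is CPU time divided by GPU time, times 100. For a slowdown, it is the negated ratio of GPU time to CPU time, times 100. That convention is inferred from the numbers rather than stated on the slides, and `improvement_percent` is an illustrative name.

```python
def improvement_percent(cpu_s, gpu_s):
    """Signed percentage: positive when the GPU is faster, negative when it is slower."""
    if gpu_s <= cpu_s:
        return cpu_s / gpu_s * 100.0
    return -(gpu_s / cpu_s) * 100.0

timings = {  # operation: (CPU, GPU single precision, GPU double precision), in seconds
    "3.2 * A %*% B + 4.1 * A": (1.9335, 0.2375, 0.123),
    "A%*%B": (1.8855, 0.176, 0.092),
    "t(A)%*%B": (1.9135, 0.207, 0.089),
    "solve(A)": (2.227, 4.69, 5.288),
}

for op, (cpu, single, double) in timings.items():
    print(f"{op:26s} {improvement_percent(cpu, single):12.4f} {improvement_percent(cpu, double):12.4f}")
```

Read this way, 814% means the GPU run took roughly one eighth of the CPU time, while -210% for solve(A) means the GPU run took a little over twice as long.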
Who to Blame
A. Simply random?
B. Me???
C. Stupid Computer?
D. Memory allocation.
Nvidia GPU Architecture
Nvidia GPU Architecture contd.
Nvidia GPU Architecture contd.
CPU vs GPU calculations for matrix inversion
[Chart: time (s) against size of square matrix (one side, 0 to 4500); series: CPU, GPU; labelled values: 139.5 and 45.42]
Matrix Multiplication Timing
[Chart: time (s) against matrix size (one side, 0 to 5000); series: CPU, GPU Single Precision, GPU Double Precision]
Comparison with ATLAS Rblas
• Improvement on multiplication: A%*%B 319%
• Improvement on inverting matrix: solve(A) 281%
(source: http://www.stat.columbia.edu/~cook/movabletype/archives/2008/06/a-trick-to-spee.html)
Limitations of ATLAS:
• Latest version is for Pentium 4 only
Limitations of this Project
• Specific card
• Cost
  • GeForce GTX 280 $582 (Source: http://www.msy.com.au/Parts/PARTS.pdf)
• Precision?
  • RMS of 6.350072e-06 for inverting a 1024 x 1024 matrix for the single precision cards.
  • IEEE 754 deviations
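
The slides do not say how the RMS figure was computed. One plausible reading, shown here purely as an assumption, is the root-mean-square of the element-wise difference between a single-precision result and a double-precision reference. The sketch emulates single precision by round-tripping values through a 32-bit float with Python's standard `struct` module; `to_single` and `rms_difference` are illustrative names, and the small matrix is made-up input rather than data from the project.

```python
import math
import struct

def to_single(x):
    """Round a Python float (a double) to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def rms_difference(reference, approx):
    """Root-mean-square of element-wise differences between two equally sized matrices."""
    diffs = [r - a
             for r_row, a_row in zip(reference, approx)
             for r, a in zip(r_row, a_row)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Illustrative input only: a double-precision matrix and the same values stored as singles.
reference = [[0.1, 0.2], [0.3, 0.4]]
approx = [[to_single(v) for v in row] for row in reference]

print(rms_difference(reference, approx))
```

Storage rounding alone gives a very small error; an inversion accumulates rounding over many operations, so a larger figure for a 1024 x 1024 inverse is to be expected.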
Where can I get this from?
• https://wiki.csiro.au/confluence/display/terabyte/GPU+Accelerated+R
Where to from now?
• Implementation of more BLAS functions
• Getting rid of overhead
• Adjusting LAPACK
• Double precision to single precision and single to double conversion
• Parallel extensions (CPU)
Thank You
• Luke Domanski
• Dadong Wang
• Pascal Valotton
• Glenn Stone
• Robert Dunne
• CMIS/CSIRO staff