Universität Hamburg
MIN-Fakultät, Fachbereich Informatik
Introduction to GPU Computing
Matthis Hauschild
Universität Hamburg, Fakultät für Mathematik, Informatik und Naturwissenschaften, Fachbereich Informatik
Technische Aspekte Multimodaler Systeme
December 4, 2014
M. Hauschild - Introduction to GPU Computing 1
Table of Contents
1. Architecture of a GPU
2. General-purpose computing on GPUs
3. Applications of GPGPU
4. Performance evaluation examples
Architecture of a GPU
What is a GPU?
- Graphics processing unit
- Main GPU manufacturers:
  1. Intel
  2. AMD
  3. Nvidia
- Performance characteristics (based on the AMD Radeon R9 series, cf. [1]):
  - Manufacturing process: 28 nm
  - GPU clock speed: about 1 GHz
  - Memory: 8 GiB GDDR5
  - Memory bandwidth: 640 GiB/s
Difference between GPU and CPU[3]
- CPU: optimized for single-thread execution
- GPU: optimized for applying the same operation to many data elements in parallel
Architecture of a GPU[4]
[Figure: the Nvidia Fermi architecture]
Summary of the Nvidia Fermi architecture:
- 16 Streaming Multiprocessors (SM)
- 32 CUDA cores per SM
- 16 × 32 = 512 CUDA cores ⇒ 512 FMA operations per clock
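The core count above is just the product of the two figures in the summary. As a rough sketch, the same arithmetic can be extended to a peak throughput estimate, assuming that one fused multiply-add (FMA) counts as two floating-point operations. The function names are illustrative, and the clock rate is left as a parameter because the slides do not state one for Fermi.

```python
def total_cuda_cores(sm_count, cores_per_sm):
    """Total cores = number of SMs times cores per SM."""
    return sm_count * cores_per_sm


def peak_gflops(fma_per_clock, clock_ghz):
    """Peak GFLOP/s, counting each FMA as 2 floating-point operations.

    clock_ghz is a caller-supplied assumption; it is not given in the slides.
    """
    return 2 * fma_per_clock * clock_ghz


# Figures from the Fermi summary: 16 SMs with 32 CUDA cores each.
fermi_cores = total_cuda_cores(16, 32)
```

With a hypothetical 1.0 GHz clock, `peak_gflops(fermi_cores, 1.0)` evaluates to 1024 GFLOP/s; this is an upper bound that real workloads rarely reach.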
⇒ Great for generating graphics, but what else can be done with it?
General-purpose computing on GPUs
What is GPGPU? [5]
- General-purpose computing on graphics processing units
- Using the GPU for non-graphical computations
  - Good for data parallelism
  - Bad for instruction parallelism
- First use in LU factorization
- Became popular around 2001 with matrix multiplication
- Early work was programmed through the graphics APIs DirectX and OpenGL
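To make "data parallelism" concrete, the following minimal sketch emulates it in plain Python. Each call of the hypothetical `saxpy_item` function plays the role of one GPU work-item and touches only its own output element, so the result does not depend on execution order. The names and the SAXPY operation (`a * x[i] + y[i]`) are illustrative choices, not taken from the slides.

```python
import random


def saxpy_item(i, a, x, y, out):
    """One 'work-item': computes a single output element, independent of all others."""
    out[i] = a * x[i] + y[i]


def run_data_parallel(kernel, n, *args, shuffle=False):
    """Call the kernel once per index; a GPU would run these calls concurrently."""
    indices = list(range(n))
    if shuffle:
        random.shuffle(indices)  # emulate an arbitrary execution order
    for i in indices:
        kernel(i, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 10.0, 10.0, 10.0]
out = [0.0] * 4
run_data_parallel(saxpy_item, 4, 2.0, x, y, out, shuffle=True)
```

Because no work-item reads another's result, `out` is `[12.0, 14.0, 16.0, 18.0]` regardless of the shuffled order. Code whose steps depend on each other's results lacks this property and maps poorly to a GPU.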
GPGPU Frameworks
- Brook – One of the earliest GPU frameworks, by Stanford University
- CUDA – Proprietary Nvidia-only framework
- OpenCL – Open, vendor-neutral standard maintained by the Khronos Group
- C++ AMP – Open C++ extension by Microsoft
- OpenACC – C, C++ and Fortran extension
- ArrayFire – Wrapper for CUDA, OpenCL, etc.
Applications of GPGPU
General applications of GPGPU
Again, GPGPU can only outperform CPU computing if the same algorithm is applied to a large amount of data (data parallelism). For example:
- k-nearest neighbor
- Fast Fourier Transform
- Segmentation
- Audio processing
- CT reconstruction
- Weather forecasting
- Cryptography
- Database operations
Applications of GPGPU in Robotics[2]
For example:
- Many image processing tasks in general
- Frame transformation
- Inverse kinematic calculation
- 3D pose estimation
- Point-set registration
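Frame transformation is a good illustration of why such tasks suit a GPU: the same matrix is applied to every point, and no point depends on another. The sketch below assumes a 4x4 homogeneous transform whose last row is `[0, 0, 0, 1]`; the function names and the example translation are hypothetical.

```python
def transform_point(T, p):
    """Apply a 4x4 homogeneous transform T (nested lists, last row [0, 0, 0, 1]) to a 3D point."""
    x, y, z = p
    return tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3] for r in range(3))


def transform_points(T, points):
    """Each point is transformed independently, so each could be one GPU work-item."""
    return [transform_point(T, p) for p in points]


# Hypothetical example: a pure translation by (1, 2, 3).
T = [[1, 0, 0, 1],
     [0, 1, 0, 2],
     [0, 0, 1, 3],
     [0, 0, 0, 1]]
moved = transform_points(T, [(0, 0, 0), (1, 1, 1)])
```

Here `moved` is `[(1, 2, 3), (2, 3, 4)]`. With a point cloud of many thousands of points, the per-point work stays the same and only the number of independent work-items grows.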
Performance evaluation examples
Test 1
- Sobel operator on a real image using OpenCL
- Measurement of the achievable frames per second
- On GPU and CPU
Test 2
- Matrix multiplication of two square matrices using OpenCL
- Measurement of the time needed for the calculation
- On GPU and CPU
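Both measurements in the two tests above come down to timing a repeated operation. The following is a minimal sketch of such a measurement, not the author's benchmark harness; `measure` and its `clock` parameter are hypothetical names, and the clock is injectable only so that the function can be checked deterministically.

```python
import time


def measure(process, items, clock=time.perf_counter):
    """Return (elapsed seconds, items per second) for running process on every item."""
    start = clock()
    for item in items:
        process(item)
    elapsed = clock() - start
    return elapsed, len(items) / elapsed


# Usage with a fake clock that reports 0.0 s at the start and 2.0 s at the end.
ticks = iter([0.0, 2.0])
elapsed, rate = measure(lambda frame: None, list(range(10)), clock=lambda: next(ticks))
```

With the fake clock, ten "frames" in two seconds give `rate == 5.0` frames per second. When timing real GPU work, the device must have finished before the clock is stopped (in OpenCL, for instance, by calling `clFinish` on the command queue), otherwise only the time to enqueue the work is measured.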
Performance evaluation examples - System characteristics
- My CPU:
  - Model: AMD Phenom II X4 965
  - Clock speed: 3400 MHz
  - Misc: 4 cores, SSE3
- My GPU:
  - Model: AMD Radeon HD 6950
  - Memory: 2048 MB
  - Core clock: 800 MHz
  - Memory clock: 1250 MHz
  - Memory bandwidth: 160 GB/s
- My RAM: 8 GB
Performance evaluation examples - Test 1
The Sobel operator:
3. $s = \sqrt{dx^2 + dy^2}$
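The earlier steps of the slide did not survive in this transcript, so the sketch below assumes the standard 3x3 Sobel masks to obtain `dx` and `dy` before applying the magnitude formula above. It is a plain-Python stand-in for illustration, not the OpenCL kernel used in the test; all names are hypothetical, and border pixels are simply left at 0.

```python
import math

# Standard 3x3 Sobel masks (an assumption; the slide's own masks are not in the transcript).
KX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
KY = [[-1, -2, -1],
      [0, 0, 0],
      [1, 2, 1]]


def sobel_pixel(img, row, col):
    """Gradient magnitude s = sqrt(dx^2 + dy^2) for one interior pixel."""
    dx = 0
    dy = 0
    for i in range(3):
        for j in range(3):
            value = img[row + i - 1][col + j - 1]
            dx += KX[i][j] * value
            dy += KY[i][j] * value
    return math.sqrt(dx * dx + dy * dy)


def sobel(img):
    """Apply sobel_pixel to every interior pixel; border pixels stay 0."""
    height, width = len(img), len(img[0])
    out = [[0.0] * width for _ in range(height)]
    for row in range(1, height - 1):
        for col in range(1, width - 1):
            out[row][col] = sobel_pixel(img, row, col)
    return out


# A 4x4 grayscale image with a vertical edge between columns 1 and 2.
img = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel(img)
```

Every output pixel depends only on its own 3x3 input neighborhood, so on a GPU each call to `sobel_pixel` could run as a separate work-item. For the test image, the interior pixels next to the edge get the value 40.0.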
Performance evaluation examples - Test 2
Matrix multiplication:
[Figure: matrix multiplication, from http://www.mathematrix.de/wp-content/uploads/matrixmul2.png]
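As a reminder of what is being timed, the sketch below spells out square matrix multiplication so that the data parallelism is visible: each element of the result is an independent dot product of one row of A and one column of B. This is an illustrative plain-Python version with hypothetical names; how the OpenCL kernel in the test was organized is not stated in the slides.

```python
def matmul_element(A, B, i, j):
    """C[i][j] = sum over k of A[i][k] * B[k][j]; independent of every other element."""
    return sum(A[i][k] * B[k][j] for k in range(len(B)))


def matmul(A, B):
    """Multiply two n x n matrices given as nested lists."""
    n = len(A)
    return [[matmul_element(A, B, i, j) for j in range(n)] for i in range(n)]


C = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Here `C` is `[[19, 22], [43, 50]]`. For n x n matrices there are n² such independent elements, each costing n multiply-adds, which is why the workload grows quickly with n and suits a GPU.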
Thank you for your attention!
Matthis Hauschild
Bibliography
[1] AMD. AMD Radeon™ R9 Grafikkartenserie, 2014. http://www.amd.com/de-de/products/graphics/desktop/r9#.
[2] J. Bedkowski and A. Maslowski. GPGPU computation in mobile robot applications. Warsaw University of Technology, 2012.
[3] Nvidia. CUDA C Programming Guide, 2014. http://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf.
[4] Nvidia. NVIDIA's Next Generation CUDA Compute Architecture: Fermi, 2014. http://www.nvidia.de/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf.
[5] Wikipedia. General-purpose computing on graphics processing units, 2014. http://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units.