the creative commons attribution-noncommercial 4.0 ...swjun/courses/2019s-cs295... · the gpu...
TRANSCRIPT
![Page 1: the Creative Commons Attribution-NonCommercial 4.0 ...swjun/courses/2019S-CS295... · The GPU Teaching Kit is licensed by NVIDIA and the University of Illinois under the Creative](https://reader034.vdocument.in/reader034/viewer/2022050517/5fa0cba2b365aa608a4fe742/html5/thumbnails/1.jpg)
Accelerated Computing
GPU Teaching Kit
The GPU Teaching Kit is licensed by NVIDIA and the University of Illinois under the Creative Commons Attribution-NonCommercial 4.0 International License.
![Page 2: the Creative Commons Attribution-NonCommercial 4.0 ...swjun/courses/2019S-CS295... · The GPU Teaching Kit is licensed by NVIDIA and the University of Illinois under the Creative](https://reader034.vdocument.in/reader034/viewer/2022050517/5fa0cba2b365aa608a4fe742/html5/thumbnails/2.jpg)
14
host device
Kernel 1
Grid 1 Block (0, 0)
Block (1, 1)
Block (1, 0)
Block (0, 1)
Grid 2
Block (1,0)
Thread(0,0,0)Thread
(0,1,3)Thread(0,1,0)
Thread(0,1,1)
Thread(0,1,2)
Thread(0,0,0)
Thread(0,0,1)
Thread(0,0,2)
Thread(0,0,3)
(1,0,0) (1,0,1) (1,0,2) (1,0,3)
A Multi-Dimensional Grid Example
14
![Page 3: the Creative Commons Attribution-NonCommercial 4.0 ...swjun/courses/2019S-CS295... · The GPU Teaching Kit is licensed by NVIDIA and the University of Illinois under the Creative](https://reader034.vdocument.in/reader034/viewer/2022050517/5fa0cba2b365aa608a4fe742/html5/thumbnails/3.jpg)
32
Image Blurring
![Page 4: the Creative Commons Attribution-NonCommercial 4.0 ...swjun/courses/2019S-CS295... · The GPU Teaching Kit is licensed by NVIDIA and the University of Illinois under the Creative](https://reader034.vdocument.in/reader034/viewer/2022050517/5fa0cba2b365aa608a4fe742/html5/thumbnails/4.jpg)
33
Pixels processed by a thread block
Blurring Box
![Page 5: the Creative Commons Attribution-NonCommercial 4.0 ...swjun/courses/2019S-CS295... · The GPU Teaching Kit is licensed by NVIDIA and the University of Illinois under the Creative](https://reader034.vdocument.in/reader034/viewer/2022050517/5fa0cba2b365aa608a4fe742/html5/thumbnails/5.jpg)
19
Not all threads in a Block will follow the same control flow path.
![Page 6: the Creative Commons Attribution-NonCommercial 4.0 ...swjun/courses/2019S-CS295... · The GPU Teaching Kit is licensed by NVIDIA and the University of Illinois under the Creative](https://reader034.vdocument.in/reader034/viewer/2022050517/5fa0cba2b365aa608a4fe742/html5/thumbnails/6.jpg)
34
Image Blur as a 2D Kernel
__global__ void blurKernel(unsigned char * in, unsigned char * out, int w, int h) { int Col = blockIdx.x * blockDim.x + threadIdx.x; int Row = blockIdx.y * blockDim.y + threadIdx.y;
if (Col < w && Row < h) { ... // Rest of our kernel } }
![Page 7: the Creative Commons Attribution-NonCommercial 4.0 ...swjun/courses/2019S-CS295... · The GPU Teaching Kit is licensed by NVIDIA and the University of Illinois under the Creative](https://reader034.vdocument.in/reader034/viewer/2022050517/5fa0cba2b365aa608a4fe742/html5/thumbnails/7.jpg)
16
M
0,2M
1,1
M
0,1
M
0,0M
1,0
M
0,3M
1,2
M
1,3
M
0,2
M
0,1
M
0,0
M
0,3
M
1,1
M
1,0
M
1,2
M
1,3
M
2,1
M
2,0
M
2,2
M
2,3
M
2,1
M
2,0
M
2,2
M
2,3M
3,1
M
3,0
M
3,2
M
3,3
M
3,1
M
3,0
M
3,2
M
3,3
M
Row*Width+Col = 2*4+1 = 9 M
2
M
1
M
0
M
3
M
5
M
4
M
6
M
7
M
9
M
8
M
10
M
11
M
13
M
12
M
14
M
15
M
Row-Major Layout in C/C++
![Page 8: the Creative Commons Attribution-NonCommercial 4.0 ...swjun/courses/2019S-CS295... · The GPU Teaching Kit is licensed by NVIDIA and the University of Illinois under the Creative](https://reader034.vdocument.in/reader034/viewer/2022050517/5fa0cba2b365aa608a4fe742/html5/thumbnails/8.jpg)
35
__global__ void blurKernel(unsigned char * in, unsigned char * out, int w, int h) { int Col = blockIdx.x * blockDim.x + threadIdx.x; int Row = blockIdx.y * blockDim.y + threadIdx.y;
if (Col < w && Row < h) { int pixVal = 0; int pixels = 0;
// Get the average of the surrounding 2xBLUR_SIZE x 2xBLUR_SIZE box for(int blurRow = -BLUR_SIZE; blurRow < BLUR_SIZE+1; ++blurRow) { for(int blurCol = -BLUR_SIZE; blurCol < BLUR_SIZE+1; ++blurCol) {
int curRow = Row + blurRow; int curCol = Col + blurCol; // Verify we have a valid image pixel if(curRow > -1 && curRow < h && curCol > -1 && curCol < w) { pixVal += in[curRow * w + curCol]; pixels++; // Keep track of number of pixels in the accumulated total } } }
// Write our new pixel value out out[Row * w + Col] = (unsigned char)(pixVal / pixels); } }