Programming models for next generation of GPGPU architectures
Benedict R. Gaster
February, 2011
2 | Programming models for next generation GPGPU | February, 2011 | Public
Motivation
OPENCL™ PROGRAM STRUCTURE
[Diagram: Host C/C++ Code runs on the CPU (Platform and Runtime APIs); OpenCL™ C Device Code runs on the DEVICE (OpenCL C)]
HELLO WORLD OPENCL™ C SOURCE
__constant char hw[] = "Hello World\n";
__kernel void hello(__global char * out) {
    size_t tid = get_global_id(0);
    out[tid] = hw[tid];
}
• This is a separate source file (or string)
• Cannot directly access host data
• Compiled at runtime
HELLO WORLD - HOST PROGRAM
// create the OpenCL context on a GPU device
context = clCreateContextFromType(0, CL_DEVICE_TYPE_GPU, NULL, NULL, NULL);
// get the list of GPU devices associated with context
clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, NULL, &cb);
devices = malloc(cb);
clGetContextInfo(context, CL_CONTEXT_DEVICES, cb, devices, NULL);
// create a command-queue
cmd_queue = clCreateCommandQueue(context, devices[0], 0, NULL);
// allocate the buffer memory object that receives the kernel output
memobjs[0] = clCreateBuffer(context, CL_MEM_WRITE_ONLY,
                            sizeof(cl_char) * strlen("Hello World"), NULL, NULL);
// create the program
program = clCreateProgramWithSource(context, 1, &program_source, NULL, NULL);
// build the program
err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
// create the kernel (the name must match the __kernel function, "hello")
kernel = clCreateKernel(program, "hello", NULL);
// set the args values (argument size first, then a pointer to the value)
err = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *) &memobjs[0]);
// set work-item dimensions
global_work_size[0] = strlen("Hello World");
// execute kernel
err = clEnqueueNDRangeKernel(cmd_queue, kernel, 1, NULL, global_work_size, NULL, 0, NULL, NULL);
// read output array
err = clEnqueueReadBuffer(cmd_queue, memobjs[0], CL_TRUE, 0, strlen("Hello World") * sizeof(cl_char), dst, 0, NULL, NULL);
HELLO WORLD - HOST PROGRAM
Define platform and queues
    // create the OpenCL context on a GPU device
    context = clCreateContextFromType(0, CL_DEVICE_TYPE_GPU, NULL, NULL, NULL);
    // get the list of GPU devices associated with context
    clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, NULL, &cb);
    devices = malloc(cb);
    clGetContextInfo(context, CL_CONTEXT_DEVICES, cb, devices, NULL);
    // create a command-queue
    cmd_queue = clCreateCommandQueue(context, devices[0], 0, NULL);
Define Memory objects
    // allocate the buffer memory object that receives the kernel output
    memobjs[0] = clCreateBuffer(context, CL_MEM_WRITE_ONLY, sizeof(cl_char) * strlen("Hello World"), NULL, NULL);
Create the program
    program = clCreateProgramWithSource(context, 1, &program_source, NULL, NULL);
Build the program
    err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
Create and setup kernel
    // the name must match the __kernel function, "hello"
    kernel = clCreateKernel(program, "hello", NULL);
    // set the args values (argument size first, then a pointer to the value)
    err = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *) &memobjs[0]);
Execute the kernel
    // set work-item dimensions
    global_work_size[0] = strlen("Hello World");
    err = clEnqueueNDRangeKernel(cmd_queue, kernel, 1, NULL, global_work_size, NULL, 0, NULL, NULL);
Read results on the host
    err = clEnqueueReadBuffer(cmd_queue, memobjs[0], CL_TRUE, 0, strlen("Hello World") * sizeof(cl_char), dst, 0, NULL, NULL);
What can we learn
LEARN FROM CURRENT GENERATION ARCHITECTURE
COMMON USE CASES
In OpenCL™ we generally see:
– A single device is picked (often GPU or CL_DEVICE_TYPE_DEFAULT)
– All “kernels” in the cl_program object are used in the application
In CUDA the default for runtime mode is:
– A single device is picked (always GPU)
– All “kernels” in scope are exported to the host application for a specific translation unit, i.e. calling a kernel is syntactic and behaves similarly to static linkage.
A look into the future
NEXT GENERATION GPGPU PROGRAM STRUCTURE
[Diagram: a single body of C++0x Code runs on both the CPU (C++0x) and the DEVICE (C++0x)]
HELLO WORLD C++0X SOURCE
char hw[] = "Hello World\n";
void __attribute__(gpu) hello(
    Index<1> index,
    char * out)
{
    size_t id = index.getX();
    out[id] = hw[id];
}
int main(void)
{
    char output[100];
    parallelFor(Range<1>(length(hw)),
        [output] (Index<1> index) {
            hello(index, output);
        });
}
• A single C++0x program
• Can directly access data on host and device
• Compiled offline
What questions still need to be answered
WHAT QUESTIONS NEED TO BE ANSWERED
How close can the CPU and GPU really be?
– How does it affect the models of today:
    GPU implies throughput computing!
    CPU implies local latency hiding in branchy code!
What effect does it have on the kind of applications one can run on these Fusion systems?
How does this all fit with managed languages?
Is C++0x enough on its own?
What about languages like Haskell or other high-level models?
QUESTIONS
Trademark Attribution
AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States
and/or other jurisdictions. OpenCL is a trademark of Apple Inc. used with permission by Khronos. Other names used in this
presentation are for identification purposes only and may be trademarks of their respective owners.
©2011 Advanced Micro Devices, Inc. All rights reserved.