Programming with CUDA
WS 08/09

Lecture 1
Tue, 21 Oct, 2008

Organization

Two lectures per week
  – Tuesdays: 4pm-5pm, Ernst-Abbe-Platz 2 (room 3517)
  – Thursdays: 4pm-6pm, Carl-Zeiss-Strasse (room 125)
One exercise session per week
  – Tuesdays: 5pm-6pm, Ernst-Abbe-Platz 2 (room 3517)


People
  – Waqar Saleem (me)
    http://theinf2.informatik.uni-jena.de/People/Waqar+Saleem.html
  – Jens K. Müller
    http://theinf2.informatik.uni-jena.de/People/Jens+K_+Mueller.html
Office hours
  – Wednesdays: 2pm-4pm, Ernst-Abbe-Platz 2 (room 3311)


Reference Material
  – www.nvidia.com/object/cuda_education.html
    links to university courses on CUDA
  – www.nvidia.com/object/cuda_develop.html
    documentation, programming guide
Lecture slides will be made available after each lecture on the course website
  – http://theinf2.informatik.uni-jena.de/For+Students/CUDA.html


Two-part course

Part 1: before Christmas
  – Present and learn about CUDA
  – Form student groups
  – Groups choose/are assigned projects
Part 2: after Christmas
  – Groups work on and present their projects

The Bacardi Algorithm
(courtesy of Elena Andreeva)

Bacardi -> Bacar -> Wacar -> Waqar

GPGPU (Intro)

GPGPU

GPU
  – Graphics Processing Unit
  – Handles values of pixels displayed on screen
      Highly parallel computation
      Optimized for parallel computations

GPGPU

GP
  – General Purpose computing on GPU
  – Many non-graphics applications can be parallelized
      Can then be ported to a GPU implementation
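As a minimal sketch of what porting a parallelizable computation to the GPU can look like, the block below puts a plain CPU loop next to a CUDA kernel for the same element-wise update. The SAXPY operation, the function names (saxpy_cpu, saxpy_gpu, saxpy_max_diff) and the block size of 256 are illustrative assumptions, not course material.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// CPU version: a single thread visits the elements one after another.
void saxpy_cpu(int n, float a, const float* x, float* y) {
    for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];
}

// GPU version: the loop body becomes a kernel and each thread takes one index.
__global__ void saxpy_gpu(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];  // the last block may be only partly used
}

// Runs both versions on the same input. Returns the largest absolute
// difference, or -1.0f if a CUDA call failed (e.g. no usable device).
float saxpy_max_diff(int n) {
    size_t bytes = n * sizeof(float);
    float* x     = (float*)malloc(bytes);
    float* y_cpu = (float*)malloc(bytes);
    float* y_gpu = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { x[i] = (float)i; y_cpu[i] = y_gpu[i] = 1.0f; }

    saxpy_cpu(n, 2.0f, x, y_cpu);

    float *dx = NULL, *dy = NULL;
    bool ok = cudaMalloc((void**)&dx, bytes) == cudaSuccess &&
              cudaMalloc((void**)&dy, bytes) == cudaSuccess &&
              cudaMemcpy(dx, x, bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
              cudaMemcpy(dy, y_gpu, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 256;                         // assumed block size
        int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
        saxpy_gpu<<<blocks, threads>>>(n, 2.0f, dx, dy);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(y_gpu, dy, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    float max_diff = ok ? 0.0f : -1.0f;
    for (int i = 0; ok && i < n; ++i) {
        float d = fabsf(y_cpu[i] - y_gpu[i]);
        if (d > max_diff) max_diff = d;
    }
    cudaFree(dx); cudaFree(dy);
    free(x); free(y_cpu); free(y_gpu);
    return max_diff;
}

int main() {
    printf("max |cpu - gpu| = %f\n", saxpy_max_diff(1000));
    return 0;
}
```

The iterations of the CPU loop do not depend on each other, which is what makes the one-thread-per-element mapping possible.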

GPGPU

General info
  – http://www.gpgpu.org/
  – http://en.wikipedia.org/wiki/GPGPU
Variants
  – GPGP
  – GP²

Why GPU? (GPU vs. CPU)

Specialized for rendering
  – Highly parallel, compute-intensive application
  – Multiple cores, high memory bandwidths

Why GPU? (GPU vs. CPU)

More transistors for data processing, fewer for
  – Flow control: the same program runs on each data element
  – Data caching: one arithmetic-intensive program, many data elements
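To make "same program for each data" and "arithmetic-intensive" concrete, here is a hedged sketch in which every element goes through the same branch-free routine, with many arithmetic steps per value read from memory. The routine churn, the constant ROUNDS and the helper churn_mismatches are made up for this illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define ROUNDS 64  // arithmetic steps per element (arbitrary choice)

// The one program every element goes through: no data-dependent branches,
// many integer multiply-adds per value loaded. Unsigned arithmetic wraps the
// same way on CPU and GPU, so the results can be compared exactly.
__host__ __device__ unsigned int churn(unsigned int v) {
    for (int k = 0; k < ROUNDS; ++k) v = v * 1664525u + 1013904223u;
    return v;
}

__global__ void churn_all(const unsigned int* in, unsigned int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = churn(in[i]);  // one read, ROUNDS multiply-adds, one write
}

// Returns the number of elements where GPU and CPU disagree,
// or -1 if a CUDA call failed.
int churn_mismatches(int n) {
    size_t bytes = n * sizeof(unsigned int);
    unsigned int* h_in  = (unsigned int*)malloc(bytes);
    unsigned int* h_out = (unsigned int*)malloc(bytes);
    for (int i = 0; i < n; ++i) h_in[i] = (unsigned int)i;

    unsigned int *d_in = NULL, *d_out = NULL;
    bool ok = cudaMalloc((void**)&d_in, bytes) == cudaSuccess &&
              cudaMalloc((void**)&d_out, bytes) == cudaSuccess &&
              cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 256;  // assumed block size
        churn_all<<<(n + threads - 1) / threads, threads>>>(d_in, d_out, n);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    int bad = ok ? 0 : -1;
    for (int i = 0; ok && i < n; ++i)
        if (h_out[i] != churn(h_in[i])) ++bad;

    cudaFree(d_in); cudaFree(d_out);
    free(h_in); free(h_out);
    return bad;
}

int main() {
    printf("mismatches: %d\n", churn_mismatches(1 << 16));
    return 0;
}
```

Because each thread performs many operations per memory access and no thread takes a different path, a workload of this shape needs little cache or control logic per element.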

So, why now?

Previously
  – Needed specialized graphics APIs
  – GPU DRAM had easy read but limited write capability
Now, CUDA (Compute Unified Device Architecture)
  – Minimal extension to C
  – GPU DRAM read & write
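The following sketch illustrates both points: the comments mark the few constructs that go beyond plain C, and the kernel writes each result to a computed DRAM address that differs from the one it read (a scatter). The array-reversal task and the names reverse and reverse_ok are assumptions chosen for this example.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

/* Extension 1: __global__ marks a function that runs on the GPU. */
__global__ void reverse(const int* in, int* out, int n) {
    /* Extension 2: built-in variables identify the current thread. */
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    /* Read one address, write a different, computed address. */
    if (i < n) out[n - 1 - i] = in[i];
}

/* Returns 1 if the GPU produced the reversed array, 0 otherwise
   (including when a CUDA call failed). */
int reverse_ok(int n) {
    size_t bytes = n * sizeof(int);
    int* h_in  = (int*)malloc(bytes);
    int* h_out = (int*)malloc(bytes);
    int *d_in = NULL, *d_out = NULL;
    int ok, i;

    for (i = 0; i < n; ++i) h_in[i] = i;

    ok = cudaMalloc((void**)&d_in, bytes) == cudaSuccess &&
         cudaMalloc((void**)&d_out, bytes) == cudaSuccess &&
         cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 128;  /* assumed block size */
        /* Extension 3: <<<blocks, threads>>> launches the kernel. */
        reverse<<<(n + threads - 1) / threads, threads>>>(d_in, d_out, n);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    for (i = 0; ok && i < n; ++i)
        if (h_out[i] != n - 1 - i) ok = 0;

    cudaFree(d_in); cudaFree(d_out);
    free(h_in); free(h_out);
    return ok;
}

int main(void) {
    printf("reversed correctly: %s\n", reverse_ok(1000) ? "yes" : "no");
    return 0;
}
```

Everything else in the file is ordinary C plus calls into the CUDA runtime library.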

CUDA

Some setup issues

CUDA ready cards
  – http://www.nvidia.com/object/cuda_learn_products.html
  – CUDA ready machine available in pool
CUDA can be run in emulation mode
  – Will install CUDA in emu mode on pool PCs
  – Repeat at home on own machines
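One way to check whether a machine has a CUDA ready card is to ask the CUDA runtime which devices it sees. The sketch below assumes the CUDA toolkit is installed; count_cuda_devices is a hypothetical helper name. What is reported on a machine without a CUDA card depends on the toolkit version. Toolkits of this era offered an nvcc device-emulation option (-deviceemu), which later versions dropped.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Returns the number of devices the CUDA runtime reports,
// or 0 if the query itself fails (e.g. no driver installed).
int count_cuda_devices() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("device query failed: %s\n", cudaGetErrorString(err));
        return 0;
    }
    return count;
}

int main() {
    int n = count_cuda_devices();
    printf("CUDA devices found: %d\n", n);
    for (int d = 0; d < n; ++d) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, d) != cudaSuccess) continue;
        // Name, compute capability, memory size and multiprocessor count.
        printf("  %d: %s, compute capability %d.%d, %lu MB, %d multiprocessors\n",
               d, prop.name, prop.major, prop.minor,
               (unsigned long)(prop.totalGlobalMem / (1024 * 1024)),
               prop.multiProcessorCount);
    }
    return 0;
}
```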

Some setup issues

Sign up for pool access
  – Usernames need to be known to grant access to CUDA ready machine

All for today

Next time
  – Finalize course/people websites
  – CUDA programming model
  – Install CUDA on pool PCs
See you next week!
