graphics processing unit

06/26/2022 1 GRAPHICS PROCESSING UNIT Shashwat Shriparv [email protected] InfinitySoft

Upload: shashwat-shriparv

Post on 20-May-2015

Page 1: Graphics processing unit

04/12/2023 1

GRAPHICS PROCESSING UNIT

Shashwat Shriparv

Page 2: Graphics processing unit

Presentation Overview

Definition

Comparison with CPU

Architecture

GPU-CPU Interaction

GPU Memory

Page 3: Graphics processing unit

Why GPU?

To provide separate, dedicated graphics resources, including a graphics processor and memory.

To relieve some of the burden on the main system resources, namely the Central Processing Unit, main memory, and the system bus, which would otherwise become saturated with graphical operations and I/O requests.

Page 4: Graphics processing unit

There comes

GPU

Page 5: Graphics processing unit

What is a GPU?

A Graphics Processing Unit or GPU (also occasionally called a Visual Processing Unit or VPU) is a dedicated processor that is efficient at manipulating and displaying computer graphics.

Like the CPU (Central Processing Unit), it is a single-chip processor.

Page 6: Graphics processing unit

HOWEVER,

The abstract goal of a GPU is to represent a 3D world as realistically as possible. GPUs are therefore designed to provide additional computational power customized specifically for these 3D tasks.

Page 7: Graphics processing unit

GPU vs CPU

A GPU is tailored for highly parallel operation, while a CPU is designed to execute programs serially.

For this reason, GPUs have many parallel execution units, while CPUs have few execution units.

GPUs have significantly faster and more advanced memory interfaces, as they need to move far more data than CPUs.

GPUs have much deeper pipelines (several thousand stages vs 10-20 for CPUs).

Page 8: Graphics processing unit

BRIEF HISTORY

First-Generation GPUs
– Up to 1998; Nvidia’s TNT2, ATi’s Rage, and 3dfx’s Voodoo3; DX6 feature set.

Second-Generation GPUs
– 1999-2000; Nvidia’s GeForce256 and GeForce2, ATi’s Radeon7500, and S3’s Savage3D; T&L; OpenGL and DX7; Configurable.

Third-Generation GPUs
– 2001; GeForce3/4Ti, Radeon8500, MS’s Xbox; OpenGL ARB, DX7/8; Vertex Programmability + ASM.

Fourth-Generation GPUs
– 2002 onwards; GeForce FX family, Radeon 9700; OpenGL+extensions, DX9; Vertex/Pixel Programmability + HLSL; 0.13μ Process, 125M T/C, 200M T/S.

Fifth-Generation GPUs
– GeForce 8X; DirectX10.

Page 9: Graphics processing unit

GPU Architecture

How many processing units?

How many ALUs?

Do you need a cache?

What kind of memory?

Page 13: Graphics processing unit

GPU Architecture

How many processing units?
– Lots.

How many ALUs?
– Hundreds.

Do you need a cache?
– Sort of.

What kind of memory?
– Very fast.

Page 14: Graphics processing unit

The difference…

[Figure: side-by-side comparison, without GPU and with GPU]

Page 15: Graphics processing unit

The GPU pipeline

The GPU receives geometry information from the CPU as input and produces a picture as output.

Let’s see how that happens…

host interface → vertex processing → triangle setup → pixel processing → memory interface

Page 16: Graphics processing unit

Details………..

Page 17: Graphics processing unit

Host Interface

The host interface is the communication bridge between the CPU and the GPU.

It receives commands from the CPU and also pulls geometry information from system memory.

It outputs a stream of vertices in object space with all their associated information (texture coordinates, per-vertex color, etc.).

Page 18: Graphics processing unit

Vertex Processing

The vertex processing stage receives vertices from the host interface in object space and outputs them in screen space.

This may be a simple linear transformation, or a complex operation involving morphing effects.

No new vertices are created in this stage, and no vertices are discarded (input/output has 1:1 mapping).
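
As a rough illustration of the object-space to screen-space step, here is a minimal sketch in Python. The function names, the 4x4 row-major matrix layout, and the viewport convention (y growing downward) are assumptions made for this example, not something the slides specify; a real GPU performs this work in hardware or in a vertex program.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def to_screen(vertex, mvp, width, height):
    """Map one object-space vertex (x, y, z) to screen space (x, y, depth)."""
    x, y, z, w = mat_vec(mvp, [vertex[0], vertex[1], vertex[2], 1.0])
    # Perspective divide: clip space -> normalized device coordinates in [-1, 1].
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w
    # Viewport transform: NDC -> pixel coordinates (assumed: y grows downward).
    sx = (ndc_x + 1.0) * 0.5 * width
    sy = (1.0 - ndc_y) * 0.5 * height
    return (sx, sy, ndc_z)

def process_vertices(vertices, mvp, width, height):
    # Exactly one output per input: no vertices are created or discarded.
    return [to_screen(v, mvp, width, height) for v in vertices]

# Illustrative input: an identity transform and a single triangle.
IDENTITY = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
triangle = [(-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (0.0, 1.0, 0.0)]
screen = process_vertices(triangle, IDENTITY, 640, 480)
```

The list comprehension in `process_vertices` mirrors the 1:1 mapping: the output always has the same length as the input.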

Page 19: Graphics processing unit

Triangle Setup

In this stage geometry information becomes raster information (screen-space geometry is the input, pixels are the output).

Prior to rasterization, triangles that are backfacing or located outside the viewing frustum are rejected.
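
The rejection step can be sketched as follows. This is a simplified Python illustration under stated assumptions: counter-clockwise winding with the y axis pointing up counts as front-facing, and the frustum test is reduced to a 2D check against the screen rectangle (real hardware works against the full view volume). All function names are hypothetical.

```python
def signed_area(a, b, c):
    """Twice the signed area of the 2D triangle abc (a 2D cross product)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def is_backfacing(a, b, c):
    # Assumed convention: counter-clockwise (positive area, y up) is front-facing.
    return signed_area(a, b, c) <= 0

def is_outside_view(tri, width, height):
    # Simplified test: reject only when all three vertices lie beyond the same edge.
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    return max(xs) < 0 or min(xs) > width or max(ys) < 0 or min(ys) > height

def keep_triangle(tri, width, height):
    """Return True if the triangle should go on to rasterization."""
    return not is_backfacing(*tri) and not is_outside_view(tri, width, height)

front = [(0, 0), (10, 0), (0, 10)]          # counter-clockwise, on screen
back = [(0, 0), (0, 10), (10, 0)]           # same triangle, clockwise
offscreen = [(-30, 0), (-20, 0), (-30, 10)]  # entirely left of the screen
```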

Page 20: Graphics processing unit

Triangle Setup (cont.)

A pixel is generated if and only if its center is inside the triangle.

Every pixel generated has its attributes computed as the perspective-correct interpolation of the attributes of the three vertices that make up the triangle.
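
Both rules can be sketched with edge functions and barycentric weights. The Python below is an illustration only: the vertex layout `(x, y, w, attr)` with a single scalar attribute is an assumption of this example, pixels lying exactly on an edge are counted as inside (real hardware applies a fill rule to break such ties), and the brute-force loop over every pixel is for clarity, not speed.

```python
def edge(a, b, p):
    """Edge function: signed area term for point p against directed edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(v0, v1, v2, width, height):
    """Yield (x, y, attr) for every pixel whose center is inside the triangle.

    Each vertex is (x, y, w, attr): screen position, clip-space w, one attribute.
    """
    area = edge(v0, v1, v2)
    if area == 0:
        return  # degenerate triangle covers no pixels
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            # Barycentric weights; dividing by area handles either winding.
            b0 = edge(v1, v2, p) / area
            b1 = edge(v2, v0, p) / area
            b2 = edge(v0, v1, p) / area
            if b0 < 0 or b1 < 0 or b2 < 0:
                continue  # center is outside the triangle
            # Perspective-correct: interpolate attr/w and 1/w linearly, then divide.
            inv_w = b0 / v0[2] + b1 / v1[2] + b2 / v2[2]
            num = b0 * v0[3] / v0[2] + b1 * v1[3] / v1[2] + b2 * v2[3] / v2[2]
            yield (x, y, num / inv_w)

flat = list(rasterize((0, 0, 1, 0.0), (4, 0, 1, 1.0), (0, 4, 1, 0.0), 4, 4))
persp = list(rasterize((0, 0, 1, 0.0), (4, 0, 4, 1.0), (0, 4, 1, 0.0), 4, 4))
```

When all three `w` values are equal the result reduces to plain linear interpolation; giving one vertex a larger `w` (farther away) shrinks its influence on nearby pixels.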

Page 21: Graphics processing unit

Pixel Processing

Each pixel provided by triangle setup is fed into pixel processing as a set of attributes, which are used to compute the final color for this pixel.

The computations taking place here include texture mapping and math operations.

Page 22: Graphics processing unit

Memory Interface

Pixel colors provided by the previous stage are written to the framebuffer.

This used to be the biggest bottleneck before pixel processing took over.

Before the final write occurs, some pixels are rejected by the z-buffer. On modern GPUs, z is compressed to reduce framebuffer bandwidth (but not size).
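
The z-buffer rejection can be sketched in a few lines of Python. The convention that a smaller depth value means "closer to the viewer" and the function name `write_pixel` are assumptions of this example; z compression is not modeled.

```python
def write_pixel(framebuffer, zbuffer, x, y, color, depth):
    """Depth-test one pixel and write it only if it is nearer than the stored one."""
    # Assumed convention: smaller depth means closer to the viewer.
    if depth >= zbuffer[y][x]:
        return False  # rejected by the z-buffer: something nearer is already there
    zbuffer[y][x] = depth
    framebuffer[y][x] = color
    return True

WIDTH, HEIGHT = 4, 4
framebuffer = [[(0, 0, 0)] * WIDTH for _ in range(HEIGHT)]
zbuffer = [[float("inf")] * WIDTH for _ in range(HEIGHT)]

first = write_pixel(framebuffer, zbuffer, 1, 1, (255, 0, 0), 0.8)   # far red: written
second = write_pixel(framebuffer, zbuffer, 1, 1, (0, 0, 255), 0.3)  # nearer blue: written
third = write_pixel(framebuffer, zbuffer, 1, 1, (0, 255, 0), 0.5)   # behind blue: rejected
```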

Page 23: Graphics processing unit

Programmability in GPU pipeline

In current state-of-the-art GPUs, vertex and pixel processing are programmable.

The programmer can write programs that are executed for every vertex as well as for every pixel.

This allows fully customizable geometry and shading effects that go well beyond the generic look and feel of older 3D applications.
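
To make "a program executed for every vertex and every pixel" concrete, here is a small Python sketch. Real vertex and pixel programs are written in shading languages such as the HLSL mentioned in the history slide; the stage runners and the two example programs below are hypothetical stand-ins that only show the calling pattern.

```python
import math

def run_vertex_stage(vertices, vertex_program):
    # The user-supplied program runs once per vertex.
    return [vertex_program(v) for v in vertices]

def run_pixel_stage(pixels, pixel_program):
    # The user-supplied program runs once per pixel.
    return [pixel_program(p) for p in pixels]

# Example user-written programs (illustrative only).
def wave_vertex(v):
    """Displace each vertex vertically by a sine wave along x."""
    x, y, z = v
    return (x, y + 0.1 * math.sin(x), z)

def grayscale_pixel(p):
    """Replace an (r, g, b) color with its luma in all three channels."""
    r, g, b = p
    luma = 0.299 * r + 0.587 * g + 0.114 * b
    return (luma, luma, luma)

moved = run_vertex_stage([(0.0, 1.0, 2.0), (1.0, 1.0, 2.0)], wave_vertex)
shaded = run_pixel_stage([(1.0, 1.0, 1.0), (1.0, 0.0, 0.0)], grayscale_pixel)
```

Swapping in a different `vertex_program` or `pixel_program` changes the look of the output without touching the fixed stages between them.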

Page 24: Graphics processing unit

GPU Pipelined Architecture (simplified view)

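[Figure: simplified pipeline. The CPU sends a bit stream (…110010100100…) to the GPU, whose blocks are Vertex Setup, Vertex Shader, Rasterizer, Pixel Shader, and Texture Storage + Filtering; vertices flow in and pixels flow out to the Framebuffer.]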

Page 25: Graphics processing unit

GPU Pipelined Architecture (simplified view)

One unit can limit the speed of the pipeline…

[Figure: the same simplified pipeline: CPU, Vertex Setup, Vertex Shader, Rasterizer, Pixel Shader, Texture Storage + Filtering, Framebuffer.]
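
The bottleneck idea can be stated as a one-line rule: in steady state a pipeline runs at the rate of its slowest stage. The Python sketch below uses made-up rates purely for illustration, and it simplifies by treating every stage as processing the same kind of item, whereas the real stages handle vertices and pixels in different quantities.

```python
def pipeline_bottleneck(stage_rates):
    """Return (stage name, rate) of the slowest stage, which caps the pipeline."""
    name = min(stage_rates, key=stage_rates.get)
    return name, stage_rates[name]

# Made-up throughput figures (millions of items per second), for illustration only.
rates = {
    "vertex setup": 400,
    "vertex shader": 300,
    "rasterizer": 500,
    "pixel shader": 120,
    "framebuffer": 250,
}
limiting_stage, limiting_rate = pipeline_bottleneck(rates)
```

Speeding up any stage other than the limiting one leaves the overall rate unchanged.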

Page 26: Graphics processing unit

CPU/GPU interaction

The CPU and GPU inside the PC work in parallel with each other.

There are two “threads” going on, one for the CPU and one for the GPU, which communicate through a command buffer:

[Figure: command buffer. The CPU writes commands at one end, the GPU reads commands from the other end, and the commands in between are pending GPU commands.]

Page 27: Graphics processing unit

CPU/GPU interaction (cont)

If this command buffer is drained empty, we are CPU limited and the GPU will spin around waiting for new input. All the GPU power in the universe isn’t going to make your application faster!

If the command buffer fills up, the CPU will spin around waiting for the GPU to consume it, and we are effectively GPU limited.
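
The two limiting cases can be sketched with a bounded queue. This is a single-threaded Python model under stated assumptions: the class name, the capacity, and the command strings are invented for the example, and a real driver uses a ring buffer shared between two processors rather than a `deque`.

```python
from collections import deque

class CommandBuffer:
    """Bounded FIFO: the CPU appends commands, the GPU removes them in order."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()

    def cpu_submit(self, command):
        """Return False when the buffer is full: the CPU must wait (GPU limited)."""
        if len(self.pending) >= self.capacity:
            return False
        self.pending.append(command)
        return True

    def gpu_fetch(self):
        """Return None when the buffer is empty: the GPU idles (CPU limited)."""
        if not self.pending:
            return None
        return self.pending.popleft()

buf = CommandBuffer(capacity=2)
gpu_starved = buf.gpu_fetch() is None             # nothing submitted yet
accepted = [buf.cpu_submit(c) for c in ("draw A", "draw B", "draw C")]
next_command = buf.gpu_fetch()                    # GPU consumes the oldest command
```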

Page 28: Graphics processing unit

Synchronization issues

In the figure below, the CPU must not overwrite the data in the “yellow” block until the GPU is done with the “black” command, which references that data:

[Figure: command buffer with the CPU write position and the GPU read position; one pending command (black) references a separate data block (yellow).]
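
One common way to enforce this rule is a fence: a counter the GPU advances as it finishes commands, which the CPU checks before reusing memory. The Python below is a hypothetical single-threaded sketch of that idea; the `Fence` class and `can_overwrite` helper are invented for illustration and do not correspond to any particular graphics API.

```python
class Fence:
    """Tracks the index of the last command the GPU has finished."""

    def __init__(self):
        self.completed = 0

    def signal(self, command_index):
        # Called on behalf of the GPU when it finishes a command.
        self.completed = max(self.completed, command_index)

def can_overwrite(fence, last_command_using_data):
    """The CPU may reuse a data block only after every command referencing it is done."""
    return fence.completed >= last_command_using_data

fence = Fence()
black_command = 3                      # index of the command that references the data
before = can_overwrite(fence, black_command)
fence.signal(2)                        # GPU has finished commands up to 2
midway = can_overwrite(fence, black_command)
fence.signal(3)                        # GPU has finished the "black" command
after = can_overwrite(fence, black_command)
```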

Page 29: Graphics processing unit

Inlining data

One way to avoid these problems is to inline all data to the command buffer and avoid references to separate data:

[Figure: command buffer with the data stored inline between the GPU read position and the CPU write position.]

However, this is also bad for performance, since we may need to copy several megabytes of data instead of merely passing around a pointer.

Page 30: Graphics processing unit

GPU readbacks

The output of a GPU is a rendered image on the screen. What happens if the CPU tries to read it?

[Figure: command buffer with pending GPU commands between the GPU read position and the CPU write position.]

The GPU must be synchronized with the CPU, i.e. it must drain its entire command buffer, and the CPU must wait while this happens.

Page 31: Graphics processing unit

GPU readbacks (cont)

We lose all parallelism, since first the CPU waits for the GPU, then the GPU waits for the CPU (because the command buffer has been drained)

Both CPU and GPU performance take a nosedive

Bottom line: the image the GPU produces is for your eyes, not for the CPU (treat the CPU -> GPU highway as a one way street)
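
A back-of-the-envelope model shows why a readback hurts. The Python sketch below assumes an idealized frame in which CPU and GPU work either overlap completely or, with a readback, run strictly one after the other; the millisecond figures are made up for illustration.

```python
def frame_time_parallel(cpu_ms, gpu_ms):
    # CPU and GPU overlap, so the frame takes as long as the slower of the two.
    return max(cpu_ms, gpu_ms)

def frame_time_with_readback(cpu_ms, gpu_ms):
    # The readback serializes the work: the CPU waits for the GPU to drain,
    # then the GPU sits idle until the CPU has queued new commands.
    return cpu_ms + gpu_ms

# Made-up per-frame costs in milliseconds.
overlapped = frame_time_parallel(10, 8)
serialized = frame_time_with_readback(10, 8)
```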

Page 32: Graphics processing unit

About GPU memory…..

Page 33: Graphics processing unit

Memory Hierarchy

CPU and GPU Memory Hierarchy

CPU: Registers, Caches, Main Memory, Disk

GPU: Temporary Registers, Constant Registers, Caches, Video Memory

Page 34: Graphics processing unit

Where is GPU Data Stored?
– Vertex buffer
– Frame buffer
– Texture

[Figure: Vertex Buffer, Vertex Processor, Rasterizer, Fragment Processor, and Frame Buffer(s) in sequence, with Texture as a separate data store.]

Page 35: Graphics processing unit

CPU memory vs GPU memory

              CPU                 GPU
Registers     Read/write          Read/write
Local Mem     Read/write stack    None
Global Mem    Read/write heap     Read-only during computation. Write-only at end (to pre-computed address)
Disk          Read/write disk     None
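
The "read-only during computation, write-only at end" row describes a gather-style model: each invocation may read any input address but writes one result to an address fixed in advance. The Python below is a minimal sketch of that model; `run_kernel` and `blur3` are names invented for this example.

```python
def run_kernel(inputs, kernel):
    """Run kernel once per output element.

    The input is frozen (read-only) for the whole computation, and invocation i
    writes exactly one value to the pre-determined output slot i.
    """
    frozen = tuple(inputs)
    output = [None] * len(frozen)
    for i in range(len(frozen)):
        output[i] = kernel(frozen, i)
    return output

def blur3(data, i):
    """Gather: read the neighbors of element i (clamped at the ends) and average."""
    left = data[max(i - 1, 0)]
    right = data[min(i + 1, len(data) - 1)]
    return (left + data[i] + right) / 3.0

blurred = run_kernel([3.0, 6.0, 9.0], blur3)
```

Because no invocation writes to memory that another one reads, the iterations are independent and could run in any order, which is what lets hardware execute them in parallel.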

Page 36: Graphics processing unit

It looks like…..

Page 37: Graphics processing unit

Some applications…..

Computer generated holography using a graphics processing unit

Improve the performance of CAD tools.

Computer graphics in games

Page 38: Graphics processing unit

New…..

NVIDIA's new graphics processing unit, the GeForce 8X ULTRA, is said to represent the very latest in visual effects technologies.

Page 39: Graphics processing unit

THANK YOU

Shashwat Shriparv