Page 1: Evolution of the Programmable Graphics Pipelinecis565/Lectures2011S/Lecture2.pdf · • Faster AGP bus instead of PCI Primitive Assembly Primitive Assembly Frame Buffer Frame Buffer

Evolution of the Programmable Graphics Pipeline

Patrick Cozzi

University of Pennsylvania

CIS 565 - Spring 2011

Administrivia

• Tip: google “cis 565”
• Slides posted before each class
• Tentative assignment dates on website
• 1st assignment handed out today
  • Write concisely
  • Due start of class, one week from today
• Google group in progress
• FYI: GDC Early Registration - 01/24

Survey Results

• 15/23 – graphics experience
• Most students have usable video cards
• Lerk – don’t be scared
• I want to be a Toys R Us kid too

Survey Results

• Class interests
  • Pure architecture
  • Game rendering
  • Physical simulations
  • Animation
  • Vision algorithms
  • Image/video processing
  • …

Course Roadmap

• Graphics Pipeline (GLSL)
• GPGPU (GLSL)
  • Briefly
• GPU Computing (CUDA, OpenCL)
• Choose your own adventure
  • Student Presentation
  • Final Project
• Goal: Prepare you for your presentation and project

Agenda

• Why program the GPU?
• Graphics Review
• Evolution of the Programmable Graphics Pipeline
  • Understand the past

Why Program the GPU?

Graph from: http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/CUDA_C_Programming_Guide.pdf

Why Program the GPU?

• Compute
  • Intel Core i7 – 4 cores – 100 GFLOPS
  • NVIDIA GTX280 – 240 cores – 1 TFLOPS
• Memory Bandwidth
  • System Memory – 60 GB/s
  • NVIDIA GT200 – 150 GB/s
• Install Base
  • Over 200 million NVIDIA G80s shipped

Numbers from Programming Massively Parallel Processors.
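To put the figures above side by side, here is a small arithmetic sketch. It uses only the peak numbers quoted on the slide; the variable names are made up for illustration, and the "FLOPs per byte" figure is just peak compute divided by peak bandwidth, not a measured value.

```python
# Peak figures as quoted on the slide (approximate, peak rates only).
cpu_gflops = 100.0    # Intel Core i7, 4 cores
gpu_gflops = 1000.0   # NVIDIA GTX280, 240 cores (1 TFLOPS)
cpu_bandwidth = 60.0  # system memory, GB/s
gpu_bandwidth = 150.0 # NVIDIA GT200, GB/s

# How much faster the GPU is on each axis.
compute_ratio = gpu_gflops / cpu_gflops
bandwidth_ratio = gpu_bandwidth / cpu_bandwidth

# Peak floating-point operations available per byte of memory traffic.
cpu_flops_per_byte = cpu_gflops / cpu_bandwidth
gpu_flops_per_byte = gpu_gflops / gpu_bandwidth

print(compute_ratio, bandwidth_ratio)
print(round(cpu_flops_per_byte, 2), round(gpu_flops_per_byte, 2))
```

Compute grows by a larger factor than bandwidth in these numbers, so a kernel needs more arithmetic per byte fetched to approach the GPU's peak rate.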

NVIDIA GPU Evolution

Slide from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

Graphics Review

• Modeling
• Rendering
• Animation

Graphics Review: Modeling

• Modeling
  • Polygons vs Triangles
    • How do you store a triangle mesh?
  • Implicit Surfaces
  • Height maps
  • …
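One common answer to "How do you store a triangle mesh?" is an indexed triangle list: a vertex array plus an index array, so shared vertices are stored once. The sketch below is illustrative only; the names `positions`, `indices`, and `triangles` are not from the lecture.

```python
# Two triangles forming a unit square in the z = 0 plane.
# The shared edge (vertices 0 and 2) is stored once and referenced twice.
positions = [
    (0.0, 0.0, 0.0),  # 0
    (1.0, 0.0, 0.0),  # 1
    (1.0, 1.0, 0.0),  # 2
    (0.0, 1.0, 0.0),  # 3
]
indices = [0, 1, 2,   0, 2, 3]  # three indices per triangle


def triangles(positions, indices):
    """Yield each triangle as a tuple of three vertex positions."""
    for i in range(0, len(indices), 3):
        a, b, c = indices[i:i + 3]
        yield positions[a], positions[b], positions[c]


for tri in triangles(positions, indices):
    print(tri)
```

Without indexing, the same square would need six full vertices instead of four; the saving grows with how many triangles share each vertex.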

Triangles

Image courtesy of A K Peters, Ltd. www.virtualglobebook.com

Triangles

Image courtesy of A K Peters, Ltd. www.virtualglobebook.com. Imagery from NASA Visible Earth: visibleearth.nasa.gov.

Implicit Surfaces

Images from GPU Gems 3: http://http.developer.nvidia.com/GPUGems3/gpugems3_ch01.html
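As a minimal illustration of an implicit surface, a sphere can be described by a function whose sign tells whether a point is inside, on, or outside the surface. This is a generic textbook example, not the technique from the GPU Gems 3 chapter; `sphere` and `classify` are hypothetical names.

```python
def sphere(x, y, z, radius=1.0):
    """Implicit function f(p) = |p|^2 - r^2; the surface is where f(p) = 0."""
    return x * x + y * y + z * z - radius * radius


def classify(value, eps=1e-9):
    """Interpret the sign of an implicit function value."""
    if abs(value) <= eps:
        return "on"
    return "inside" if value < 0 else "outside"


print(classify(sphere(0.0, 0.0, 0.0)))  # center of the sphere
print(classify(sphere(1.0, 0.0, 0.0)))  # exactly one radius away
print(classify(sphere(2.0, 0.0, 0.0)))  # two radii away
```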

Height Maps

Image courtesy of A K Peters, Ltd. www.virtualglobebook.com
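A height map stores one height per grid cell, so a renderable mesh can be derived from it. The sketch below assumes a regular grid with unit spacing and builds two triangles per cell; the function names and the choice of diagonal are illustrative assumptions.

```python
def height_map_to_vertices(heights, spacing=1.0):
    """Turn a 2D grid of heights into (x, y, height) vertices, row by row."""
    return [
        (col * spacing, row * spacing, h)
        for row, line in enumerate(heights)
        for col, h in enumerate(line)
    ]


def grid_indices(rows, cols):
    """Index list with two counter-clockwise triangles per grid cell."""
    indices = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            i = r * cols + c  # lower-left vertex of this cell
            indices += [i, i + 1, i + cols,
                        i + 1, i + cols + 1, i + cols]
    return indices


heights = [[0.0, 1.0],
           [2.0, 3.0]]
print(height_map_to_vertices(heights))
print(grid_indices(2, 2))
```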

Graphics Review: Rendering

• Rendering
  • Goal: Assign color to pixels
• Two Parts
  • Visible surfaces
    • What is in front of what for a given view
  • Shading
    • Simulate the interaction of material and light to produce a pixel color

Rasterization

• What about ray tracing?

Visible Surfaces

Image courtesy of A K Peters, Ltd. www.virtualglobebook.com

Visible Surfaces

• Z-Buffer / Depth Buffer
• Fragment vs Pixel

Image courtesy of A K Peters, Ltd. www.virtualglobebook.com
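The depth-buffer idea can be sketched in a few lines: several fragments may land on the same pixel, and the one closest to the viewer wins. This is a simplified software model, assuming smaller z means closer; `resolve` and the fragment tuple layout are hypothetical.

```python
def resolve(width, height, fragments, clear_color=(0, 0, 0)):
    """Keep, per pixel, the color of the nearest fragment.

    Each fragment is (x, y, z, color); smaller z is assumed to be closer.
    """
    depth = [[float("inf")] * width for _ in range(height)]
    color = [[clear_color] * width for _ in range(height)]
    for x, y, z, rgb in fragments:
        if z < depth[y][x]:      # depth test: only closer fragments pass
            depth[y][x] = z
            color[y][x] = rgb
    return color


# Three fragments compete for pixel (0, 0); submission order does not matter.
fragments = [(0, 0, 0.8, "far"), (0, 0, 0.2, "near"), (0, 0, 0.5, "mid")]
print(resolve(1, 1, fragments))
```

This also shows the fragment-versus-pixel distinction: a pixel is one location in the frame buffer, while a fragment is a candidate contribution to it.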

Shading

Images courtesy of A K Peters, Ltd. www.virtualglobebook.com

Shading

Image from GPU Gems 3: http://http.developer.nvidia.com/GPUGems3/gpugems3_ch14.html
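As one simple instance of shading, Lambertian (diffuse) reflection scales a material color by the cosine of the angle between the surface normal and the direction to the light. This is a generic model chosen for illustration, not the method shown in the images; all names below are hypothetical.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lambert(normal, to_light, albedo):
    """Diffuse color: albedo * max(0, N . L)."""
    n = normalize(normal)
    l = normalize(to_light)
    intensity = max(0.0, dot(n, l))  # surfaces facing away receive no light
    return tuple(intensity * c for c in albedo)


print(lambert((0.0, 0.0, 1.0), (0.0, 0.0, 5.0), (1.0, 0.5, 0.25)))
print(lambert((0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (1.0, 1.0, 1.0)))
```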

Graphics Pipeline

Pipeline (diagram): Vertex Transforms → Primitive Assembly → Rasterization and Interpolation → Raster Operations → Frame Buffer

Raster Operations:

• Scissor Test
• Stencil Test
• Depth Test
• Blending
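The raster operations listed above can be sketched as a chain of per-fragment tests followed by a blend. This is a simplified model under stated assumptions: the stencil test uses equality, the depth test uses less-than, and blending is standard alpha blending; real APIs make each of these configurable. `raster_ops` and the data layout are hypothetical.

```python
def raster_ops(frag, pixel, scissor, stencil_ref):
    """Apply scissor, stencil, and depth tests, then alpha-blend.

    frag is (x, y, z, rgb, alpha); pixel is a dict with "color", "depth",
    and "stencil". Returns the pixel unchanged if any test rejects frag.
    """
    x, y, z, rgb, alpha = frag
    x0, y0, x1, y1 = scissor
    if not (x0 <= x < x1 and y0 <= y < y1):
        return pixel                      # scissor test failed
    if pixel["stencil"] != stencil_ref:
        return pixel                      # stencil test failed (assumed: equal)
    if z >= pixel["depth"]:
        return pixel                      # depth test failed (assumed: less)
    blended = tuple(alpha * s + (1.0 - alpha) * d
                    for s, d in zip(rgb, pixel["color"]))
    return {"color": blended, "depth": z, "stencil": pixel["stencil"]}


pixel = {"color": (0.0, 0.0, 1.0), "depth": 1.0, "stencil": 0}
frag = (1, 1, 0.5, (1.0, 0.0, 0.0), 0.5)  # half-transparent red
print(raster_ops(frag, pixel, scissor=(0, 0, 4, 4), stencil_ref=0))
```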

Graphics Pipeline

Images courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

Graphics Review: Animation

• Move the camera and/or agents, and re-render the scene
  • In less than 16.7 ms (60 fps)
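The per-frame time budget follows directly from the target frame rate. A minimal sketch (the function name is made up):

```python
def frame_budget_ms(fps):
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / fps


for fps in (30, 60, 120):
    print(fps, "fps ->", round(frame_budget_ms(fps), 2), "ms per frame")
```

At 60 fps this gives about 16.67 ms for everything done in a frame, including simulation, culling, and rendering.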

Evolution of the Programmable Graphics Pipeline

• Pre GPU
• Fixed function GPU
• Programmable GPU
• Unified Shader Processors

Early 90s – Pre GPU

Slide from Mike Houston: http://s09.idav.ucdavis.edu/talks/01-BPS-SIGGRAPH09-mhouston.pdf

Why GPUs?

• Exploit Parallelism
  • Pipeline parallel
  • Data-parallel
  • CPU and GPU executing in parallel
• Hardware: texture filtering, MAD, etc.
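To make "data-parallel" and "MAD" (multiply-add) concrete: the same multiply-add is applied to every element, and no element depends on another, so the iterations could run in any order or all at once. The sketch below runs sequentially on the CPU and only illustrates the pattern; `mad` and `saxpy` are illustrative names.

```python
def mad(a, b, c):
    """Multiply-add: a * b + c, a single operation on GPU hardware."""
    return a * b + c


def saxpy(alpha, xs, ys):
    """Compute alpha * x + y element-wise; every element is independent."""
    return [mad(alpha, x, y) for x, y in zip(xs, ys)]


print(saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
```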

Generation I: 3dfx Voodoo (1996)

Image from “7 years of Graphics”

• Did not do vertex transformations: these were done on the CPU
• Did do texture mapping and z-buffering

Pipeline (diagram): CPU: Vertex Transforms → PCI → GPU: Primitive Assembly → Rasterization and Interpolation → Raster Operations → Frame Buffer

Slide adapted from Suresh Venkatasubramanian and Joe Kider
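The vertex transformation work that stayed on the CPU in this generation amounts to a 4x4 matrix multiply per vertex, followed by a perspective divide. The sketch below assumes row-major matrices and column vectors; the function names are hypothetical.

```python
def transform(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))


def perspective_divide(clip):
    """Convert homogeneous clip coordinates to normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


# Translation by (1, 2, 3) applied to the point (1, 1, 1).
translate = [
    [1, 0, 0, 1],
    [0, 1, 0, 2],
    [0, 0, 1, 3],
    [0, 0, 0, 1],
]
print(transform(translate, (1, 1, 1, 1)))
print(perspective_divide((2.0, 4.0, 6.0, 2.0)))
```

Doing this for every vertex of every frame is the load that later generations moved onto the GPU.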

Aside: Mario Kart 64

Image from: http://www.gamespot.com/users/my_shoe/

• High fragment load / low vertex load

Aside: Mario Kart Wii

• High fragment load / low vertex load?

Image from: http://wii.ign.com/dor/objects/949580/mario-kart-wii/images/

Generation II: GeForce/Radeon 7500 (1998)

Slide from Suresh Venkatasubramanian and Joe Kider

• Main innovation: shifting the transformation and lighting calculations to the GPU
• Allowed multi-texturing: bump maps, light maps, and others
• Faster AGP bus instead of PCI

Pipeline (diagram): AGP → GPU: Vertex Transforms → Primitive Assembly → Rasterization and Interpolation → Raster Operations → Frame Buffer

Image from “7 years of Graphics”

Generation III: GeForce3/Radeon 8500 (2001)

Slide from Suresh Venkatasubramanian and Joe Kider

• For the first time, allowed a limited amount of programmability in the vertex pipeline
• Also allowed volume texturing and multi-sampling (for antialiasing)

Pipeline (diagram): AGP → GPU: Vertex Transforms (small vertex shaders) → Primitive Assembly → Rasterization and Interpolation → Raster Operations → Frame Buffer

Image from “7 years of Graphics”

Generation IV: Radeon 9700/GeForce FX (2002)

• The first generation of fully programmable graphics cards
• Different versions have different resource limits on fragment/vertex programs

Pipeline (diagram): AGP → Vertex Transforms (programmable vertex shader) → Primitive Assembly → Rasterization and Interpolation → Programmable Fragment Processor (with Texture Memory) → Raster Operations

Slide from Suresh Venkatasubramanian and Joe Kider

Image from “7 years of Graphics”

Generation IV.V: GeForce6/X800 (2004)

Slide adapted from Suresh Venkatasubramanian and Joe Kider

• Simultaneous rendering to multiple buffers
• True conditionals and loops
• PCIe bus
• Vertex texture fetch

Pipeline (diagram): PCIe → Vertex Transforms (programmable vertex shader, with Texture Memory) → Primitive Assembly → Rasterization and Interpolation → Programmable Fragment Processor (with Texture Memory) → Raster Operations → Frame Buffer

NVIDIA NV40 Architecture

Image from GPU Gems 2: http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter30.html

Diagram labels:

• 6 vertex shader units
• 16 fragment shader units
• Vertex Texture Fetch

Generation V: GeForce8800/HD2900 (2006)

Slide adapted from Suresh Venkatasubramanian and Joe Kider

• Ground-up GPU redesign
• Support for Direct3D 10 / OpenGL 3
• Geometry Shaders
• Stream out / transform-feedback
• Unified shader processors
• Support for General GPU programming

Pipeline stages shown (diagram, over PCIe): Input Assembler, Programmable Vertex Shader, Programmable Geometry Shader, Programmable Pixel (Fragment) Shader, Raster Operations / Output Merger

D3D 10 Pipeline

Image from David Blythe : http://download.microsoft.com/download/f/2/d/f2d5ee2c-b7ba-4cd0-9686-b6508b5479a1/direct3d10_web.pdf

Geometry Shaders: Point Sprites
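The point-sprite use case is a one-in, many-out expansion: the geometry shader receives a single point and emits the four corners of a screen-facing quad. The sketch below mimics that expansion in 2D on the CPU; it is not shader code, and the function name and triangle-strip output order are assumptions for illustration.

```python
def expand_point_sprite(center, size):
    """Expand one point into four quad corners, in triangle-strip order."""
    cx, cy = center
    h = size / 2.0
    return [
        (cx - h, cy - h),  # bottom-left
        (cx + h, cy - h),  # bottom-right
        (cx - h, cy + h),  # top-left
        (cx + h, cy + h),  # top-right
    ]


print(expand_point_sprite((0.0, 0.0), 2.0))
```

Before geometry shaders, the application typically had to submit all four vertices per sprite itself; here only the center and size need to be sent.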

Geometry Shaders

Image from David Blythe : http://download.microsoft.com/download/f/2/d/f2d5ee2c-b7ba-4cd0-9686-b6508b5479a1/direct3d10_web.pdf

NVIDIA G80 Architecture

Slide from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

Why Unify Shader Processors?

Slide from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

Unified Shader Processors

Slide from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf
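One way to see the motivation for unified shader processors is a toy load-balancing model. With dedicated units, the busier stage sets the frame time while the other stage idles; with a unified pool, all units share the combined work. The model below is highly idealized (perfect scheduling, identical units, work in arbitrary units) and borrows the 6 vertex / 16 fragment split from the NV40 slide purely as example numbers.

```python
def fixed_time(vertex_work, fragment_work, vertex_units=6, fragment_units=16):
    """Dedicated units: the slower stage determines the total time."""
    return max(vertex_work / vertex_units, fragment_work / fragment_units)


def unified_time(vertex_work, fragment_work, total_units=22):
    """Unified pool: any unit can run either kind of work."""
    return (vertex_work + fragment_work) / total_units


for name, v, f in [("vertex-heavy", 120, 16),
                   ("fragment-heavy", 6, 320),
                   ("balanced", 60, 160)]:
    print(name, round(fixed_time(v, f), 2), round(unified_time(v, f), 2))
```

In this model the unified pool is never slower than the fixed split, and it matches it only when the workload happens to fit the fixed ratio exactly.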

Terminology

Shader Model   Direct3D   OpenGL   Video card example
4              11.x       4.x      NVIDIA GeForce GTX 480, ATI Radeon HD 5870
3              10.x       3.x      NVIDIA GeForce 8800, ATI Radeon HD 2900
2              9          2.x      NVIDIA GeForce 6800, ATI Radeon X800

Shader Capabilities

Table courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

Evolution of the Programmable Graphics Pipeline

Slide from Mike Houston: http://s09.idav.ucdavis.edu/talks/01-BPS-SIGGRAPH09-mhouston.pdf

Evolution of the Programmable Graphics Pipeline

• Not covered today:
  • SM 5 / D3D 11 / GL 4
  • Tessellation shaders
    • *cough* student presentation *cough*
  • Later this semester: NVIDIA Fermi
    • Dual warp scheduler
    • Configurable L1 / shared memory
    • Double precision
    • …

New Tool: AMD System Monitor

• Released 01/04/2011
• http://support.amd.com/us/kbarticles/Pages/AMDSystemMonitor.aspx