Status – Week 242

Victor Moya

Post on 20-Dec-2015



Summary

- Current status.
- Tests.
- XBox documentation.
- Post Vertex Shader geometry.
- Rasterization.

Current Status

Basic Command Processor.
- Read/Write GPU registers.
- Read/Write GPU memory.
- GPU commands.
- No DMA/AGP data access.

Basic Memory Controller.
- 1 transaction per cycle served.
- Memory module access latency accounted.
- Transmission latency accounted.
- 3 buses (req/write + data): CP, StreamerFetch, StreamerLoader.

Current Status

Shader (Vertex Shader).
- Multithreaded.
- F/D/E/W pipeline.
- Variable execution latency.
- Dependency checking is done per full register right now; it should be component based.
- Problems with the ‘ending’ instruction (requires something to fetch after it and takes many cycles).
- No branches (support code exists but the instructions are not implemented).
- No texture access (memory).
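
The difference between full-register and component-based dependency checking can be sketched as follows. This is only an illustration under assumptions, not the simulator's code: the `Instr` struct, the 4-bit component masks and both function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical decoded instruction: one destination and one source register,
// each with a 4-bit component mask (bit 0 = x, 1 = y, 2 = z, 3 = w).
struct Instr {
    int dstReg;
    std::uint8_t writeMask;  // components written in dstReg
    int srcReg;
    std::uint8_t readMask;   // components read from srcReg (after swizzle)
};

// Full-register check: any pending write to the register blocks the reader.
bool dependsFullRegister(const Instr& pending, const Instr& next) {
    return pending.dstReg == next.srcReg;
}

// Component-based check: block only if a component that is read is also
// written by the pending instruction.
bool dependsPerComponent(const Instr& pending, const Instr& next) {
    return pending.dstReg == next.srcReg &&
           (pending.writeMask & next.readMask) != 0;
}
```

With a pending write to r0.w and a following read of r0.x only, the full-register check stalls the thread while the component-based check lets the instruction issue.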

Current Status

Streamer.
- Pipelined:
  - Hit: Fetch/OCache/Insert/Commit.
  - Miss: Fetch/OCache/IRQInsert/IRQRead/AttrLoad/Sh/Store/Commit.
- Stream and index based modes implemented.
- No pre T&L cache (should be added to Streamer Loader?).
- Supports out of order vertexes (shader or memory).
- Doesn’t support data from the AGP.

Current Status

Streamer:
- Streamer Loader pipeline should be (in hardware):
  - Insert in the IRQ.
  - Load from IRQ.
  - Setup Input: start address + address increment for each active attribute.
  - Attribute Load: request attribute to MC, increment address generators.
  - Issue to Shader.
- IRQ should be implemented with a pre T&L cache.
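
The Setup Input and Attribute Load steps above can be sketched as per-attribute address generators. This is a minimal sketch for the sequential (stream) mode only; the `AttributeStream` struct and the function names are assumptions made for the example and do not come from the simulator.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-attribute address generator: start address + increment.
struct AttributeStream {
    std::uint32_t nextAddress;  // address of the next element to request
    std::uint32_t stride;       // address increment per vertex
    bool active;
};

// Setup Input: initialise the generator of one active attribute.
AttributeStream setupInput(std::uint32_t start, std::uint32_t stride) {
    return AttributeStream{start, stride, true};
}

// Attribute Load: collect the addresses to request from the memory
// controller for one vertex, then advance every active generator.
std::vector<std::uint32_t> attributeLoad(std::vector<AttributeStream>& streams) {
    std::vector<std::uint32_t> requests;
    for (AttributeStream& s : streams) {
        if (!s.active) continue;
        requests.push_back(s.nextAddress);
        s.nextAddress += s.stride;
    }
    return requests;
}
```

In index mode the address would presumably be computed as start + index * stride instead of being incremented, which this sketch does not cover.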

Current Status

Comments:
- Currently the signal latency/bandwidth is specified with raw numbers. Alternatives:
  - Use constants. Store in a single ‘signal definition’ file for all units or in separate units (must be shared between the two boxes connected by the signal).
  - Use some kind of Architecture Description for signal delays, bandwidth, data bus width (to be used in memory transmission calculations and similar).
- Currently most units only support single issue/fetch/process. Should be ‘generalized’ to multiissue/fetch/process and parametrized.
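
One possible shape for the ‘signal definition’ alternative is a shared header of named constants, with the bus width reused in the transmission calculation. The struct, the signal name and all numeric values below are placeholders invented for the illustration; none of them are taken from the simulator.

```cpp
#include <cstdint>

// Hypothetical shared description of one signal between two boxes.
struct SignalDef {
    const char* name;
    std::uint32_t latency;    // cycles between write and read
    std::uint32_t bandwidth;  // objects accepted per cycle
    std::uint32_t busWidth;   // bytes moved per cycle, must be > 0
};

// Placeholder values: both boxes would include the same definition.
constexpr SignalDef kLoaderMemoryData{"LoaderMemoryData", 1, 1, 8};

// Cycles needed to transmit `bytes` over the bus, rounded up.
constexpr std::uint32_t transmissionCycles(const SignalDef& s, std::uint32_t bytes) {
    return (bytes + s.busWidth - 1) / s.busWidth;
}
```

Keeping latency, bandwidth and bus width in one definition means the producer and the consumer of a signal cannot disagree on them.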

Current Status

Signal Trace Analyzer -> Carlos.

Tests

OpenGL test trace:
- Used glutSolidSphere with (1, 100, 100) as parameters:
  - 100 batches.
    – 2 triangle strips (200 triangles).
    – 98 quad strips (9800 quads).
  - 20000 vertexes.
- Added a lighting shader replacing the normal model view + project matrix transformation: one green light at infinity with diffuse and specular components.
  - 10 shader instructions.
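
The vertex count follows from the batch breakdown, assuming each strip spans all 100 slices (100 triangles per triangle strip, 100 quads per quad strip). The helper names are made up for this check.

```cpp
// A triangle strip with n triangles uses n + 2 vertices.
constexpr int triangleStripVertices(int triangles) { return triangles + 2; }

// A quad strip with n quads uses 2n + 2 vertices.
constexpr int quadStripVertices(int quads) { return 2 * quads + 2; }

// 2 triangle strips of 100 triangles plus 98 quad strips of 100 quads.
constexpr int sphereTraceVertices() {
    return 2 * triangleStripVertices(100) + 98 * quadStripVertices(100);
}
```

That is 2 * 102 + 98 * 202 = 204 + 19796 = 20000 vertexes, matching the figure reported for the trace.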

Tests

Light shader:

//
// i0 Vertex Position
// i2 Vertex Normal
//
// c0 - c3 Model View-Project Matrix.
// c4 Light Direction
// c5 Light Half Vector
// c6.x Material shininess
// c7 Light ambient color
// c8 Light diffuse color
// c9 Light specular color
//
// o0 Vertex position (transformed)
// o1 Vertex color.
//

// Vertex Model View-Project transformation
dp4 o0.x, c0, i0
dp4 o0.y, c1, i0
dp4 o0.z, c2, i0
dp4 o0.w, c3, i0

// Compute diffuse and specular dot products and
// use LIT to compute lighting coefficients
dp3 r0.x, i2, c4
dp3 r0.y, i2, c5
mov r0.w, c6.x
lit r0, r0

// Accumulate color contributions
mad r1, r0.y, c8, c7
mad o1, r0.z, c9, r1

// Finish shader.
end
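
A CPU reference for the light shader is handy for checking the simulator output vertex by vertex. The sketch below mirrors the shader instruction by instruction. It assumes LIT behaves as in Direct3D vs_1_1 and ARB_vertex_program (the slides do not define it) and leaves out the exponent clamping those specifications apply; the type and function names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

using Vec4 = std::array<float, 4>;

float dot3(const Vec4& a, const Vec4& b) { return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]; }
float dot4(const Vec4& a, const Vec4& b) { return dot3(a, b) + a[3]*b[3]; }

// Assumed LIT semantics:
// x = 1, y = max(src.x, 0), z = src.x > 0 ? max(src.y, 0)^src.w : 0, w = 1.
Vec4 lit(const Vec4& s) {
    float diffuse = std::max(s[0], 0.0f);
    float specular = (s[0] > 0.0f) ? std::pow(std::max(s[1], 0.0f), s[3]) : 0.0f;
    return Vec4{{1.0f, diffuse, specular, 1.0f}};
}

struct ShadedVertex { Vec4 position; Vec4 color; };

// c[0..9] are the shader constants, i0 the position, i2 the normal.
ShadedVertex lightShader(const std::array<Vec4, 10>& c, const Vec4& i0, const Vec4& i2) {
    ShadedVertex out{};
    for (int k = 0; k < 4; ++k)
        out.position[k] = dot4(c[k], i0);                 // dp4 o0.*, ck, i0
    Vec4 r0 = {{dot3(i2, c[4]), dot3(i2, c[5]), 0.0f, c[6][0]}};  // dp3, dp3, mov
    r0 = lit(r0);                                         // lit r0, r0
    for (int k = 0; k < 4; ++k) {
        float r1 = r0[1] * c[8][k] + c[7][k];             // mad r1, r0.y, c8, c7
        out.color[k] = r0[2] * c[9][k] + r1;              // mad o1, r0.z, c9, r1
    }
    return out;
}
```

A vertex whose normal faces away from the light gets zero diffuse and specular coefficients, so its color reduces to the ambient constant c7.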

Tests

Results:
- Simulated cycles: ~350K.
- Simulation time: ~30s.
- Signal trace size: ~150MB.


Tests

Bugs:
- TraceReader::parseFP() failed to correctly read a negative number with a 0 before the decimal point.
- GPU_CLAMP was using ‘<’ and ‘>’ when it should be using ‘<=’ and ‘>=’.
- ShaderDecodeExecute was allowing the execution of the instruction in the same thread after a blocked instruction (data dependency).
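
The parseFP() code is not shown in these slides. One plausible cause of the first bug (an assumption) is attaching the sign to the integer part: for `-0.5` the integer part is `-0`, which equals `0`, so the sign is lost. The sketch below, with the hypothetical name `parseDecimal`, keeps the sign separate and applies it to the whole magnitude.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Parses [+|-]digits[.digits]; no exponent support and no error handling.
double parseDecimal(const std::string& text) {
    std::size_t pos = 0;
    bool negative = false;
    if (pos < text.size() && (text[pos] == '-' || text[pos] == '+')) {
        negative = (text[pos] == '-');
        ++pos;
    }
    double value = 0.0;
    while (pos < text.size() && std::isdigit(static_cast<unsigned char>(text[pos]))) {
        value = value * 10.0 + (text[pos] - '0');
        ++pos;
    }
    if (pos < text.size() && text[pos] == '.') {
        ++pos;
        double scale = 0.1;
        while (pos < text.size() && std::isdigit(static_cast<unsigned char>(text[pos]))) {
            value += (text[pos] - '0') * scale;
            scale *= 0.1;
            ++pos;
        }
    }
    // Apply the sign to the complete magnitude, never to the integer part alone.
    return negative ? -value : value;
}
```

Inputs such as `-0.5`, `-0.25` and `-0.0625` are the cases worth adding to a regression test for this bug.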

Tests

Changes:
- Now ShaderDecodeExecute ignores any instruction received after an end instruction.
- Added QUAD and QUADSTRIP support to the simulator (GPU.h, Rasterizer, Drawer).
- Vertex color is clamped to 0.0 – 1.0 before being sent to OpenGL (Drawer). The correct behaviour would be to clamp color attributes when they exit the shader.
- Added glNormal3f and glFrustum OpenGL functions to the TraceReader and OGLtoAGPTransaction.

Tests

Changes:
- OGLtoAGPTransaction now supports a third vertex attribute: normal.
- OGLtoAGPTransaction now supports a ‘special’ shader mode (the one used for the light test). No support for OpenGL lighting is implemented.

Tests

Further tests:
- Try to implement a sphere using icosahedron subdivision to create a triangle strip mesh to test the index stream mode.

XBox Documentation

- Interesting information about the Vertex Shader architecture and the T&L pipeline down to the Primitive Assembly Cache and the Triangle Setup.
- Includes estimated sizes and clock latencies for most of the operations.

XBox T&L pipeline (diagram)

Stages: Memory -> Pre T&L Cache -> Vertex Shader -> Post T&L Cache -> Primitive Assembly -> Triangle Setup -> Rasterization

Data labels on the arrows: cache line (raw vertex data); raw vertex; transformed and lit vertex; transformed and lit vertex; 3 transformed and lit vertices.

Other annotations: 4 KB 4-way set associative, 128 32-B cache lines; 16 – 24 entry FIFO; 200 MHz; 3 vertices.

XBOX

Differences:
- No Pre T&L cache.
- The Post T&L cache seems to be accessed by the Primitive Assembly Cache. However, we push the vertex to the Rasterizer (or whatever lies after the shader).
- Sending the shaded vertex to the primitive assembly takes multiple cycles (2+) depending on the number of attributes used by the vertex.

XBOX Vertex Shader

Registers:
- 16 input registers.
- 12 temporary registers.
- 192 constant registers.
- 1 address register.
- 11 output registers.

XBOX Vertex Shader

Instructions:
- Shader Operations:
  - 13 MAC opcodes.
  - 7 ILU (inverse logic unit) opcodes.
- 136 microcode instructions. Each instruction can:
  - Read three registers with swizzle and negation.
  - Compute one MAC op and one ILU op.
  - Write up to one output register and two temporary registers with masking.

Shader types:
- Normal vertex shaders.
- Read/write vertex shaders.
- Vertex state shaders.

XBOX Vertex Shaders

Timing:
- The cycle speed is 250 MHz.
- For normal shaders, instructions take between one-half cycle and one cycle to complete.
- For read/write and vertex state shaders, instructions take between one cycle and six cycles to complete.

XBOX Vertex Shaders

Multithreaded:
- Two copies of the vertex shader pipeline (2 VS).
- Each copy can run up to three threads (3 active threads per shader).
- Read/write vertex shaders and vertex state shaders run single threaded, on a single pipeline.

Stalling:
- Instructions take six cycles to compute their outputs.
- Bypasses: ALU, ILU and MLU bypasses.
- Three cycles latency with bypasses.
- Bypass allows swizzling and negate of the result.

Post Vertex Shader (based on the 3DLabs OpenGL2 overview).

- Primitive assembly.
- User clipping.
- Frustum clipping.
- Perspective projection.
- Viewport mapping.
- Polygon offset.
- Polygon mode.
- Shade mode.
- Culling.

Post Vertex Shader

Primitive Assembly:
- Get the three vertexes of a triangle.
- Triangles: keep the last three vertexes, generate a primitive with each new group of three vertexes.
- Triangle strip: keep the last three vertexes, generate a primitive with each new vertex (after the second).
- Triangle fan: keep the first vertex and the last two vertexes, generate a primitive with each new vertex (after the second).
- Similar with other primitives.
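
The three assembly rules above can be sketched with vertex indices in arrival order. The `Primitive` enum and `assemble` function are hypothetical names for the illustration, and the sketch ignores the winding swap that strips need on every other triangle to keep a consistent orientation.

```cpp
#include <array>
#include <vector>

enum class Primitive { Triangles, TriangleStrip, TriangleFan };
using Triangle = std::array<int, 3>;  // vertex indices in arrival order

// Returns the triangles produced after `vertexCount` vertexes have arrived.
std::vector<Triangle> assemble(Primitive mode, int vertexCount) {
    std::vector<Triangle> out;
    for (int v = 2; v < vertexCount; ++v) {  // nothing is emitted before the third vertex
        switch (mode) {
        case Primitive::Triangles:
            // One primitive per complete group of three vertexes.
            if (v % 3 == 2) out.push_back(Triangle{{v - 2, v - 1, v}});
            break;
        case Primitive::TriangleStrip:
            // Last three vertexes, one primitive per new vertex.
            out.push_back(Triangle{{v - 2, v - 1, v}});
            break;
        case Primitive::TriangleFan:
            // First vertex plus the last two, one primitive per new vertex.
            out.push_back(Triangle{{0, v - 1, v}});
            break;
        }
    }
    return out;
}
```

In hardware this corresponds to a three-entry vertex buffer whose replacement policy depends on the primitive type.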

Post Vertex Shader

User clipping:
- At least 6 user clip planes.
- Define a clip volume.
- glClipPlane(enum p, double eqn[4]).
- (p1 p2 p3 p4) (x y z w) >= 0

Frustum clipping:
- View volume.
- -w <= x <= w
- -w <= y <= w
- -w <= z <= w
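
Both inside tests are direct translations of the inequalities above. This sketch only classifies a single vertex and generates no new vertexes; the `Vec4` alias and the function names are assumptions for the example.

```cpp
#include <array>

using Vec4 = std::array<double, 4>;  // x, y, z, w

// User clip plane test: (p1 p2 p3 p4) . (x y z w) >= 0
bool insideUserPlane(const Vec4& plane, const Vec4& v) {
    return plane[0]*v[0] + plane[1]*v[1] + plane[2]*v[2] + plane[3]*v[3] >= 0.0;
}

// Frustum test in clip coordinates: -w <= x, y, z <= w
bool insideFrustum(const Vec4& v) {
    const double w = v[3];
    return -w <= v[0] && v[0] <= w &&
           -w <= v[1] && v[1] <= w &&
           -w <= v[2] && v[2] <= w;
}
```

In OpenGL the user planes are evaluated against eye coordinates and the frustum test against clip coordinates, so the two functions take vertexes from different pipeline points.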

Post Vertex Shader

Clipping:
- Clip polygon => add new vertexes => tessellate.
- Clip triangle => add new vertexes => retessellate.
- Use rasterization in homogeneous coordinates: just add more clipping edges.
- Guard Band Clipping (scissor).

Post Vertex Shader

Divide by w.

Viewport transformation.
- Scale to screen/window coordinate system.
- glViewport(x, y, w, h)
- glDepthRange(clampd n, clampd f)
- xw = (px/2)*xd + ox
- yw = (py/2)*yd + oy
- zw = [(f-n)/2]*zd + (n + f)/2
- ox = x + w/2
- oy = y + h/2
- px = w
- py = h
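
The divide by w and the viewport equations above combine into a single function. This is a small sketch with hypothetical struct and function names; it assumes wc is non-zero, which holds for vertexes that survived clipping.

```cpp
struct Viewport { double x, y, w, h; };  // glViewport(x, y, w, h)
struct DepthRange { double n, f; };      // glDepthRange(n, f)
struct Vec3 { double x, y, z; };

// Maps clip coordinates (xc, yc, zc, wc) to window coordinates.
Vec3 toWindow(double xc, double yc, double zc, double wc,
              const Viewport& vp, const DepthRange& dr) {
    // Perspective division: normalized device coordinates in [-1, 1].
    const double xd = xc / wc;
    const double yd = yc / wc;
    const double zd = zc / wc;
    // Viewport centre: ox = x + w/2, oy = y + h/2.
    const double ox = vp.x + vp.w / 2.0;
    const double oy = vp.y + vp.h / 2.0;
    return Vec3{(vp.w / 2.0) * xd + ox,
                (vp.h / 2.0) * yd + oy,
                ((dr.f - dr.n) / 2.0) * zd + (dr.n + dr.f) / 2.0};
}
```

With a 640 x 480 viewport at the origin and the default depth range (0, 1), the centre of the view volume maps to (320, 240, 0.5).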

Post Vertex Shader

Back face culling:
- Can be calculated using the area of the triangle (determinant of the three vertexes in homogeneous coordinates).
- Negative or positive area.
- Can also be used to cull zero area triangles.
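
The determinant test can be sketched as below. With w = 1 the determinant is twice the signed area of the triangle. The sketch assumes counter-clockwise triangles are front facing (the OpenGL default) and that all w values are positive; the names are hypothetical.

```cpp
struct HVertex { double x, y, w; };  // 2D homogeneous position

// Determinant of the 3x3 matrix whose rows are the three vertexes.
double signedArea2(const HVertex& a, const HVertex& b, const HVertex& c) {
    return a.x * (b.y * c.w - c.y * b.w)
         - a.y * (b.x * c.w - c.x * b.w)
         + a.w * (b.x * c.y - c.x * b.y);
}

enum class Facing { Front, Back, ZeroArea };

// Positive determinant: counter-clockwise, assumed front facing.
Facing classify(const HVertex& a, const HVertex& b, const HVertex& c) {
    const double d = signedArea2(a, b, c);
    if (d == 0.0) return Facing::ZeroArea;
    return d > 0.0 ? Facing::Front : Facing::Back;
}
```

The same determinant is needed by triangle setup, so the culling decision comes almost for free.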

Post Vertex Shader

Discard degenerate triangles:
- If two or more vertexes are the same (could be index based or a full vertex comparison) the triangle can be discarded.
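
Both variants of the degenerate test are short. The function names and the `Vertex` alias are assumptions for this sketch.

```cpp
#include <array>

using Vertex = std::array<float, 4>;  // position only, for the full comparison

// Index based: cheap, but misses distinct indices that hold equal positions.
bool degenerateByIndex(unsigned a, unsigned b, unsigned c) {
    return a == b || b == c || a == c;
}

// Full vertex comparison: catches equal positions regardless of index.
bool degenerateByValue(const Vertex& a, const Vertex& b, const Vertex& c) {
    return a == b || b == c || a == c;
}
```

The index based form only applies in index stream mode; in plain stream mode the full comparison is the only option.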

Rasterization

Alternatives:
- Scanline incremental interpolation (DDA).
- Rasterization in homogeneous coordinates.

Two phases:
- Triangle setup.
  - Set interpolation registers.
- Fragment generation.
  - Incrementally update the interpolants.
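
The two phases can be illustrated for a single attribute in screen space. Triangle setup fits the plane attr = dx*x + dy*y + c through the three vertexes, and fragment generation then adds dx per pixel along a scanline. This sketch is non-perspective-correct, performs no coverage test and assumes a non-zero triangle area; all names are hypothetical.

```cpp
#include <vector>

struct Vertex2D { double x, y, attr; };

// Interpolation registers for one attribute: attr = dx*x + dy*y + c.
struct Interpolant { double dx, dy, c; };

// Triangle setup: solve for the attribute gradients.
Interpolant setupInterpolant(const Vertex2D& a, const Vertex2D& b, const Vertex2D& c) {
    const double area2 = (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
    const double dx = ((b.attr - a.attr) * (c.y - a.y) -
                       (c.attr - a.attr) * (b.y - a.y)) / area2;
    const double dy = ((c.attr - a.attr) * (b.x - a.x) -
                       (b.attr - a.attr) * (c.x - a.x)) / area2;
    return Interpolant{dx, dy, a.attr - dx * a.x - dy * a.y};
}

// Fragment generation: evaluate once at the span start, then step by dx.
std::vector<double> interpolateSpan(const Interpolant& it, int y, int x0, int x1) {
    std::vector<double> values;
    double v = it.dx * x0 + it.dy * y + it.c;
    for (int x = x0; x <= x1; ++x) {
        values.push_back(v);
        v += it.dx;  // incremental update of the interpolant
    }
    return values;
}
```

The divisions happen once per triangle in setup, leaving one addition per attribute per fragment in the inner loop.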