Designing a Portable Shader Library for Current and Future API's
David Gosselin
3D Application Research Group
2
Outline
• Introduction & Motivation
• Demos
• Case Study: ATI’s demo shader format
  – Design Goals
  – Artist’s Interface
  – Texture Definitions
  – Vertex and Index Buffers
  – Sub-shaders and Passes
  – Vertex and Pixel Shaders
• Converting from D3D ASM to HLSL
• Example
3
Motivation
• Share the lessons learned from our shader library
  – Shaders in general
  – Moving from D3D ASM to HLSL
• Show some advantages to being shader-centric
• Show ways to include fallback paths
• Show cross-platform generalizations
• Would like to see more games use shaders and not just target the least common denominator for graphics
4
Why Shaders?
• Direction of the Industry
  – Hardware very shader driven
  – Will continue down this path
  – Modern engines need to be shader aware
• High-level shader languages (HLSL and the OpenGL Shading Language) make shaders easier to write
• Visual flexibility
• Less engine churn
5
What is a Shader Library?
• Library in the sense of a linkable .lib file or collection of source code
  – Not a collection of shaders
• Abstracts Graphics API
  – Render States
  – Constant Binding
  – Shader Binding
  – Etc.
• Manages Shaders
• Integrates with preprocessing/export
6
Engine Block Diagram
[Block diagram. Labels: Maya, 3DS Max, Material Plug-ins, Exporter, Preprocessor, Runtime Engine, Shader Library (File IO, Parsing, Shader/State Management, D3D, OGL), Shader Files]
7
What is in a Shader File?
• Contains all the state and definitions needed to render a piece of geometry.
• Includes
  – Artist Instructions
  – Preprocessing Directives
  – Texture Definitions
  – Constant / Variable Definitions
  – Stream Definitions (Vertex/Index buffers)
  – Fixed Function States (Alpha, Stencil, Z, etc.)
  – Vertex Shader
  – Pixel Shader
• Potentially multiple LOD/# light variants
8
Demos Showing the Necessity of a Shader Library
9
Shader File Design Goals
• Cross platform / cross API
• Reduce the frequency of redesigning the graphics engine
• Expandable to future shader languages and APIs
• Drive preprocessing (optimal VB/IB)
• Fallback shaders
• LOD shaders
• Ability to target a wide range of graphics hardware
• Artist interaction to runtime execution
10
Artist’s Point of View
• To the RIGHT are our Maya and 3DS MAX plug-ins
• Artists associate a shader with a material within the art tool
• Art tool plug-in pulls instructions from the file for the artists
11
Artist Notes
StartArtNotes
Supports:
- 3 Fast RT lights
- 3 Object Ambient Lights

Requirements:
- Base = RGB texture
- Bump = RGB normal map
EndArtNotes
12
Art Notes
• Describe maximum number of lights supported
• Describe textures needed
• Any special instructions
  – Artist editable variables
  – Special scene geometry
  – Etc.
13
Texture Placement
Texture tBase 2D DXT1("Base", RGB, Box)
DefTexture tBase Trilinear

Texture tBump 2D RGBX("Bump", RGB, Box)
DefTexture tBump Trilinear
14
Texture Declaration
• How to preprocess textures
  – Output format
  – Input format
  – Mipmap generation
  – Separate RGB and alpha textures

Texture tBase 2D DXT1 ("Base", RGB, KaiserGamma)
Texture tFurShellTexture 2D RGBA ("T1", RGB, Kaiser, "T2", GRAY, Kaiser)
Texture tBump 2D DXT5 ("Bump", HEIGHT, Box, "Opacity", GRAY, Box)
Texture tAnisoLookup 2D RGBA ("T5", RGB, Box, "T6", GRAY, Box)
Texture tNormCube CM RGBX ("Base3", RGB, Box)
Texture tEnvMap CMAuto // Engine generated cubemap
Texture tWater Renderable("WaterReflection")
15
Artist Editable Variables
Vector vColor(.7, .7, .7, 0.0) Editable(color)
Vector vReflectionColor (1, 1, 1, 1.0) Editable(Color)
Float vSpecularExp (16.0) Editable(Slider, 0, 512)
16
Variables
• Standard types: Float, Vector, Matrix
• Can be bound to render state
• Can be bound to engine state
• Constant
• Can be artist editable (from within tool)

Matrix wvp(MATRIX_WVP)
Vector osCamPos(CAMERA_POSITION, OBJECT_SPACE)
Vector osLightPos(LIGHT_POSITION, OBJECT_SPACE, 0)
Vector time(0.0, 0.0, 0.0, 0.0) AppUpdate("Time")
Vector furFadeScaleBias(0.5, 0.5, 0.0, 0.0)
Float furHeight(4.0) EDITABLE
17
Vertex and Index Buffers
• How to preprocess vertex/index buffers
• One IB per unique stream map

StartStream sPosNorm (Normal)
    float3 POSITION0 Position
    float3 NORMAL0 VertexNormal
EndStream

StartStream sTexCoords (Normal)
    float2 TEX0 UV0 // BaseTexU, BaseTexV
    float3 TEX1 Tangent0
EndStream

StartStream sFurFins (FurFins) // Different geometry than above 2 streams
    float3 POSITION0 Position
    float3 NORMAL0 VertexNormal
    float4 TEX0 FinFaceData0 // FinTexU, FinTexV, BaseTexUVDist, RandOffset
    float3 TEX2 FaceNormal
EndStream

StreamMap smBasePass (sPosNorm, sTexCoords)
StreamMap smShellPass (sPosNorm, sTexCoords)
StreamMap smFinsPass (sFurFins)
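One possible reading of "one IB per unique stream map" is that stream maps listing the same streams can share an index buffer. The Python sketch below groups the three stream maps from this slide on that assumption. The function name `group_stream_maps` and the dictionary layout are hypothetical and not part of the shader file format.

```python
def group_stream_maps(stream_maps):
    """Group stream-map names by the tuple of streams they reference.

    stream_maps: dict of stream-map name -> sequence of stream names.
    Returns a dict of stream tuple -> list of stream-map names in
    first-seen order. Under the assumption stated above, each key
    would need one index buffer.
    """
    groups = {}
    for name, streams in stream_maps.items():
        groups.setdefault(tuple(streams), []).append(name)
    return groups


# Stream maps as declared on the slide.
STREAM_MAPS = {
    "smBasePass": ("sPosNorm", "sTexCoords"),
    "smShellPass": ("sPosNorm", "sTexCoords"),
    "smFinsPass": ("sFurFins",),
}

index_buffer_groups = group_stream_maps(STREAM_MAPS)
for streams, names in index_buffer_groups.items():
    print(streams, "shared by", names)
```

Here the base and shell passes fall into one group and the fins pass into another, so a preprocessor following this rule would write two index buffers.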
18
Sub-Shaders
• Single shader file with multiple sub-shaders
  – Fallbacks
  – LOD
  – Split-screens
  – Different number of lights
• Controlled via a property string
• First in file (top to bottom) with unique properties that validates
• Can contain multiple passes
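The selection rule above (first sub-shader in file order, per property set, that validates) can be sketched in Python as follows. This is a rough illustration only: the `SubShader` record, the version-tuple comparison used as the validation test, and the names `ps20_path` and `ps14_path` are assumptions. The real library presumably validates against the API rather than comparing version numbers.

```python
from dataclasses import dataclass


@dataclass
class SubShader:
    name: str               # hypothetical label for illustration
    properties: frozenset   # property strings, e.g. frozenset({"Normal"})
    min_vs: tuple           # minimum vertex shader version, e.g. (2, 0)
    min_ps: tuple           # minimum pixel shader version, e.g. (2, 0)


def validates(sub, caps_vs, caps_ps):
    """Assumed validation rule: hardware versions meet both minimums."""
    return caps_vs >= sub.min_vs and caps_ps >= sub.min_ps


def select_sub_shaders(sub_shaders, caps_vs, caps_ps):
    """Pick the first validating sub-shader for each property set."""
    chosen = {}
    for sub in sub_shaders:  # file order, top to bottom
        if sub.properties not in chosen and validates(sub, caps_vs, caps_ps):
            chosen[sub.properties] = sub
    return chosen


# Two sub-shaders sharing the property "Normal": a 2.0 path, then a fallback.
SUB_SHADERS = [
    SubShader("ps20_path", frozenset({"Normal"}), (2, 0), (2, 0)),
    SubShader("ps14_path", frozenset({"Normal"}), (1, 1), (1, 4)),
]

old_hw = select_sub_shaders(SUB_SHADERS, (1, 1), (1, 4))
new_hw = select_sub_shaders(SUB_SHADERS, (3, 0), (3, 0))
print(old_hw[frozenset({"Normal"})].name)
print(new_hw[frozenset({"Normal"})].name)
```

Placing the most capable sub-shader first in the file means newer hardware never reaches the fallbacks, while older hardware skips down to the first entry it can run.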
19
Sub-Shader Example
//-----------------------------------------------------
StartShader // For vs.2.0 and ps.2.0 minimum
    Property "Normal" // Can be anything "Wire frame",
                      // "Invincible", "One Light", etc.
    StartPass
        …
    EndPass
EndShader

//-----------------------------------------------------
StartShader // For vs.1.1 and ps.1.4
    Property "Normal"
    StartPass
        …
    EndPass
    StartPass
        …
    EndPass
EndShader
20
Falling Back
• Determine which shaders validate before loading vertex buffers, index buffers and textures.
• Vertex/Index Buffers
  – May need single stream vertex buffers for older hardware
  – All potential vertex streams defined in shader
  – Load only those required for shaders which validate
  – Extra data stored on disk, but optimal for runtime
• Textures
  – Many fallback shaders won’t require all the defined textures
  – Only load textures required for validated shaders
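The "load only what validated shaders need" step can be sketched as a set union over the sub-shaders that validated. The dictionary layout, the function name `resources_to_load`, and the stream name `sSingle` are assumptions made for this illustration.

```python
def resources_to_load(validated_sub_shaders):
    """Collect the textures and streams used by validated sub-shaders.

    validated_sub_shaders: list of dicts with "textures" and "streams"
    lists (a hypothetical layout). Anything defined in the shader file
    but absent from the result can stay on disk.
    """
    textures = set()
    streams = set()
    for sub in validated_sub_shaders:
        textures.update(sub["textures"])
        streams.update(sub["streams"])
    return sorted(textures), sorted(streams)


# The file defines tBase and tBump, but only these two sub-shaders validated.
VALIDATED = [
    {"textures": ["tBase"], "streams": ["sSingle"]},  # fallback path
    {"textures": [], "streams": ["sSingle"]},         # e.g. a wireframe path
]

textures, streams = resources_to_load(VALIDATED)
print(textures, streams)
```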
21
Pass
• Each pass has a stream mapping, a unique set of render state and textures, and vertex and pixel shader code.
• Within a sub-shader, state is sticky between passes.

StartPass
    //Set Stream Map
    //Set Textures
    //Set Render State

    //Set Vertex Shader Constants
    //Vertex Shader Code

    //Set Pixel Shader Constants
    //Pixel Shader Code
EndPass
22
Common Graphics Concepts
• Graphics hardware is very similar despite differences in APIs
• Hardware is functionally identical:
  – Setting texture state
  – Vertex/index buffers, stream maps
  – Draw calls
  – Alpha blending & testing
  – Shader constant setup
  – Z state
  – Stencil state
  – Renderable textures
  – etc.
23
Common State Keywords
Clipping TRUE
Cull CW
FillMode Solid
ShadeMode Gouraud

SetTexture 0 NULL CoordIndex(0) Transform(0) Linear LODBias(0.0) Clamp(www) Border(0x00000000)
SetBlender 0 Color(SelectArg1, Diffuse, Diffuse) Alpha(SelectArg1, Diffuse, Diffuse)

Fog FALSE Table(None) Vertex(None) Color(0) Start(0.0) End(1.0) Density(1.0)

AlphaTest FALSE
Blend FALSE Src(One) Dest(Zero) Op(Add)
ColorWriteEnable (R, G, B, A)
MultiSampleAntiAlias TRUE
MultiSampleMask 0xffffffff
DitherEnable FALSE

Z TRUE Write(TRUE) Func(LessEqual) Bias(0.0) SlopeScale(0.0)

Stencil FALSE Pass(Keep) Fail(Keep) ZFail(Keep) Func(Always) Ref(0xFFFFFFFF)
StencilCCW FALSE Pass(Keep) Fail(Keep) ZFail(Keep) Func(Always)
24
Shader Languages
• DirectX fixed function
• DirectX assembly shaders
• DirectX HLSL
• OpenGL fixed function
• OpenGL assembly shaders (ARB_vertex_program, ARB_fragment_program)
• The OpenGL Shading Language (ARB_shading_language_100)
• GameCube
• PS2
25
D3D ASM Shaders
• VS/PS code embedded
• Can also come from an external file.

VsConst 0 mWvp // Matrix takes up 4 constants
VsConst 4 vTimeConst
VsConst 5 (0.0, 0.1, 2.0, 5.0)
StartVertexShader
    vs.1.1
    dcl_position v0
    dcl_color0 v5
    dcl_texcoord0 v7

    m4x4 oPos, v0, c0 // Transform position
    mov oT0, v7 // Base texture coordinates
    mov oD0, v5 // Pass vertex light to PS
EndVertexShader
26
HLSL Support Design Goals
• Build on our existing framework
• Avoid explicit constant declarations
• Allow HLSL include files to reference common functions
  – Includes can contain their own variables and textures
• Hidden from outside the shader library
27
Basic HLSL Shader

D3D ASM version:
VsConst 0 mWvp // Takes up 4 constants
VsConst 4 vTime
VsConst 5 (0.0, 0.1, 2.0, 5.0)
StartVertexShader
    vs.1.1
    dcl_position v0
    dcl_color0 v5
    dcl_texcoord0 v7

    m4x4 oPos, v0, c0 // Transform
    mov oT0, v7 // Base tex coords
    mov oD0, v5 // Vertex light
EndVertexShader

HLSL version:
Matrix mWvp(MATRIX_WVP)
Vector vTime AppUpdate("Time")

StartVertexShader(HLSL)
    float4x4 mWvp;
    float4 vTime;
    struct VS_OUTPUT
    {
        float4 Pos : POSITION;
        float4 Diffuse : COLOR0;
        float2 TCoord0 : TEXCOORD0;
    };

    VS_OUTPUT main (float4 aPosition : POSITION,
                    float4 aDiffuse : COLOR0,
                    float2 aTC0 : TEXCOORD0)
    {
        VS_OUTPUT outV = (VS_OUTPUT) 0;
        outV.Pos = mul (mWvp, aPosition); // Transform position
        outV.TCoord0 = aTC0; // Pass texture coordinates
        outV.Diffuse = aDiffuse; // Pass vertex light
        return outV;
    }
EndVertexShader

• New HLSL token
• Constants by name
28
HLSL Pixel Shader
Texture tBaseTexture 2D DXT1("Base", RGB, KaiserGamma)
StartPixelShader(HLSL)
    sampler tBaseTexture;
    struct PsInput
    {
        float2 texCoord : TEXCOORD0;
        float3 vertexLight : COLOR0;
    };
    float4 main (PsInput i) : COLOR
    {
        // Sample base texture
        float3 cBase = tex2D (tBaseTexture, i.texCoord);

        float4 o;
        o.rgb = cBase * i.vertexLight; // Final lighting
        o.a = 1.0f;
        return o;
    }
EndPixelShader
29
Matching Names
• D3DXCompileShader() returns a constant table when a shader is compiled.
• ID3DXConstantTable has a member function GetConstantDesc() which allows you to get:
  – Name
  – Register Index
  – Type/Size
• Our shader library matches names to registers to send to SetPixelShaderConstantF() and SetVertexShaderConstantF()
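The name matching described above can be sketched in Python. The constant table is modeled here as plain `(name, register_index, register_count)` tuples standing in for what the compiler reports. The function name `match_constants`, the tuple layout, and the zero-padding rule are assumptions for illustration and not the library's actual code.

```python
def match_constants(constant_table, variables):
    """Match shader-file variables to compiler-assigned registers by name.

    constant_table: list of (name, register_index, register_count).
    variables: dict of variable name -> list of floats from the shader file.
    Returns a list of (start_register, vector4_count, values) sorted by
    register. Each entry corresponds to one constant upload of
    vector4_count four-float registers.
    """
    uploads = []
    for name, register_index, register_count in constant_table:
        if name not in variables:
            continue  # the shader file declares no variable with this name
        needed = register_count * 4
        values = list(variables[name])[:needed]
        values += [0.0] * (needed - len(values))  # pad to whole registers
        uploads.append((register_index, register_count, values))
    uploads.sort(key=lambda upload: upload[0])
    return uploads


# Hypothetical compiler output for the basic vertex shader.
CONSTANT_TABLE = [("vTime", 4, 1), ("mWvp", 0, 4)]
VARIABLES = {
    "mWvp": [1.0, 0.0, 0.0, 0.0,
             0.0, 1.0, 0.0, 0.0,
             0.0, 0.0, 1.0, 0.0,
             0.0, 0.0, 0.0, 1.0],
    "vTime": [0.5],
}

uploads = match_constants(CONSTANT_TABLE, VARIABLES)
for start_register, vector4_count, values in uploads:
    print(start_register, vector4_count, values)
```

Because registers are looked up by name after compilation, the shader author no longer writes explicit `VsConst` register assignments.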
30
Handling HLSL Includes
• Didn’t use HLSL’s #include
• Variables and textures in includes need to be interpreted by shader library
• Our parser concatenates the include files with the embedded shader code
• Shader author needs no knowledge about the contents of the include file other than the function declaration
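A minimal Python sketch of the concatenation step, assuming the pieces are joined in include order followed by the embedded shader code. It also records where each output line came from, which shows why compiler line numbers stop matching the original files. The function name, the source labels, and the placeholder comment lines are hypothetical.

```python
def concatenate_sources(sources):
    """Join HLSL fragments and remember the origin of every output line.

    sources: ordered list of (label, text) pairs.
    Returns (combined_text, line_map) where line_map[i] is the
    (label, local_line_number) of combined line i + 1.
    """
    combined_lines = []
    line_map = []
    for label, text in sources:
        for local_no, line in enumerate(text.splitlines(), start=1):
            combined_lines.append(line)
            line_map.append((label, local_no))
    return "\n".join(combined_lines) + "\n", line_map


# Placeholder fragments; the real include bodies are much longer.
SOURCES = [
    ("this", "#define SI_SKINNING_MAX_BONES 40"),
    ("SiSkinning.shl",
     "float4x4 mSiWorld[SI_SKINNING_MAX_BONES];\n// ... SiSkin4x4 ..."),
    ("vertex shader", "float4x4 mVP;\n// ... main ..."),
]

combined, line_map = concatenate_sources(SOURCES)
print(combined)
print(line_map[3])  # origin of the fourth line of the combined text
```

An error reported on line 4 of the combined text maps back to line 1 of the embedded vertex shader, which is the kind of lookup a debug dump of the concatenated code makes possible by hand.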
31
Using an Include file
StartHLSL
    #define SI_SKINNING_MAX_BONES 40
EndHLSL
…
VsInclude this
VsInclude "SiSkinning.shl"
StartVertexShader(HLSL)
    float4x4 mVP;

    struct VsInput
    {
        float4 pos : POSITION0;
        float4 weights : BLENDWEIGHT0;
        int4 indices : BLENDINDICES0;
        float3 normal : NORMAL0;
    };
    …
    VsOutput main (VsInput i)
    {
        // Skin position
        float4 pos = SiSkin4x4 (i.pos, i.weights, i.indices);
        o.pos = mul (pos, mVP);
        …
    }
EndVertexShader
32
HLSL Include Example
#replicate ($i, 0, 100, 1)
    Matrix mSiWorld$i AppUpdate(world$imat)
#endreplicate

StartHLSL
    float4x4 mSiWorld[SI_SKINNING_MAX_BONES];

    float4 SiSkin4x4 (float4 aVec, float4 aWeights, float4 aIndices)
    {
        float4 vec = (float4)0;
        for (int bone = 0; bone < 4; bone++)
            vec += aWeights[bone] * mul (aVec, mSiWorld[aIndices[bone]]);
        return vec;
    }
    ...
EndHLSL
33
Concatenation of Files
[Diagram: the shader library concatenates the pieces below and passes the result to the D3DX compiler.]
– StartHLSL block from the shader file: #define SI_SKINNING_MAX_BONES 40
– StartHLSL block from "SiSkinning.shl" (VsInclude): float4x4 mSiWorld[SI_SKINNING_MAX_BONES]; and SiSkin4x4()
– Embedded StartVertexShader(HLSL) code: float4x4 mVP;, struct VsInput, and main()
34
Debugging
• Concatenated files mean compiler line numbers will not be accurate
• Added a special tag to dump out the concatenated code:
    StartPixelShader(HLSL) HLSLDebugOutput("dbg.hlsl")
• Outputs the concatenated code to the given file name
35
Textures in HLSL Includes
• Including an HLSL pixel shader also requires implicitly binding textures to texture stages
• In non-HLSL pixel shaders, we had:
    SetTexture 0 tBaseTexture Trilinear
• Since we can’t specify the stage to bind the texture to explicitly:
    DefTexture tBaseTexture Trilinear
• HLSL compiler returns a table for matching our DefTexture names with stages
36
Texture Lookup in an Include
StartArtNotes
* Anisotropic Strand Lighting map on T2 (24 bit)
EndArtNotes

Texture tSiStrandLighting 2D DXT1("T2", RGB, Box)
DefTexture tSiStrandLighting Linear Clamp(cc)

StartHLSL
    sampler tSiStrandLighting;
    struct SiStrandPair
    {
        float diffuse;
        float specular;
    };
    // Compute Wolfgang Heidrich's Anisotropic lighting
    SiStrandPair SiComputeStrandLight (float3 normal, float3 light,
                                       float3 view, float3 dirAniso)
    {
        SiStrandPair sPair;
37
Texture Lookup Continued
        float LdA = dot(light, dirAniso);
        float VdA = dot(view, dirAniso);
        float2 fnLookup = tex2D(tSiStrandLighting, (float2(LdA, VdA) * 0.5) + (float2)0.5);
        float spec = fnLookup.y * fnLookup.y;
        float diff = fnLookup.x;
        float selfShadow = saturate(dot(normal, light));

        sPair.diffuse = diff * selfShadow;
        sPair.specular = spec * selfShadow;

        return(sPair);
    }
EndHLSL
38
Default Shader
• Defines all default state for your engine within a shader file.
• You don’t need to rely on D3D’s or OpenGL’s default state since you override it with what is most useful to your app. This also helps reduce overall state change.
• Necessary for cross-API development since different APIs have different default state
• Your shaders ultimately have less redundancy since you can rely on the defaults you set.
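One way to picture how engine-defined defaults reduce state changes: each shader's desired state is the defaults overridden by whatever the shader sets, and only the settings that differ from the current device state are sent. The Python sketch below uses state names from the common state keywords, but the function `state_changes` and the dictionary model of device state are assumptions for illustration.

```python
def state_changes(device_state, defaults, overrides):
    """Return only the settings that must change on the device.

    device_state: dict of what is currently set on the device.
    defaults: dict of engine defaults taken from the default shader.
    overrides: dict of states the incoming shader sets explicitly.
    """
    desired = dict(defaults)
    desired.update(overrides)
    return {key: value for key, value in desired.items()
            if device_state.get(key) != value}


DEFAULTS = {"Cull": "CW", "Blend": "FALSE", "Z": "TRUE"}
device = dict(DEFAULTS)  # the engine applied its defaults at startup

# A shader that only enables blending touches one state.
first = state_changes(device, DEFAULTS, {"Blend": "TRUE"})
device.update(first)

# The next shader sets nothing, so only blending is restored.
second = state_changes(device, DEFAULTS, {})
device.update(second)

print(first, second)
```

A shader that is happy with the defaults lists no states at all, which is where the reduced redundancy in shader files comes from.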
39
A Full Example: Environment Mapped Bumps
StartArtNotes
Supports:
- 3 Fast RT lights
- 3 Object Ambient Lights
- Per-Pixel Specular Exponent
- Gloss Map
- Environment map cube map
- Bump map

Requirements:
- Color = Set Editable Color (vBaseColor)
- Bump = RGB normal map
- Gloss = GRAY Texture (Gloss Map)
- SpecularExp = GRAY Texture (Specular Exponent Map)
- T2 = RGB Cube Map (Environment Map)
EndArtNotes

StartMisc
    Animation Skinned(40)
    NumFastRTLights 3
EndMisc
40
Example Continued
// TEXTURES
Texture tGloss 2D GRAY("Gloss", GRAY, Box)
Texture tSpec 2D GRAY("SpecularExp", GRAY, Box)
Texture tEnv CM RGB("T2", RGB, Box)
Texture tBump 2D RGB("Bump", RGB, Box)

DefTexture tBump Trilinear
DefTexture tSpec Trilinear
DefTexture tGloss Trilinear
DefTexture tEnv Trilinear

// VARIABLES
Matrix mVP(VP)
Vector worldCamPos(CameraPosition, WorldSpace)
Vector vBaseColor(.8, .8, .8, 0) Editable(color)
41
Example Continued
// STREAMS
StartStream s1 Normal
    float3 POSITION Position
    float4 BLENDWEIGHT BlendWeight
    ubyte4 BLENDINDICES BlendIndex
    float3 NORMAL Normal
    float3 TANGENT0 Tangent("Bump")
    float3 BINORMAL0 Binormal("Bump")
    float2 TEX0 UV("Bump")
EndStream

StreamMap sm1(s1)

// Global HLSL block
StartHLSL
    #define SI_SKINNING_MAX_BONES 40
    #define NUM_OBJECT_AMBIENT_LIGHTS 3
    #define SPECULAR_K_MIN 16
    #define SPECULAR_K_MAX 256
EndHLSL
42
Example Continued
StartShader "NormalFastRT1"
    Property "Normal"
    Property "FastRT1"

    StartPass "Pass1"
        SetStreamMap sm1

        VsInclude this
        VsInclude "SiRTLight.shl"(VS)
        VsInclude "SiObjAmbLight.shl"
        VsInclude "SiSkinning.shl"
        VsInclude "SiMath.shl"(Misc)
        StartVertexShader(HLSL)
            float4x4 mVP;
            float3 worldCamPos;
43
Example Continued
            struct VsInput
            {
                float4 pos : POSITION0;
                float4 weights : BLENDWEIGHT0;
                float4 indices : BLENDINDICES0;
                float3 normal : NORMAL0;
                float3 tangent : TANGENT0;
                float3 binormal : BINORMAL0;
                float2 texCoord : TEXCOORD0;
            };

            struct VsOutput
            {
                float4 pos : POSITION0;
                float2 texCoord : TEXCOORD0;
                float3 lightVec0TS : TEXCOORD1;
                float3 lightSpacePos0 : TEXCOORD2;
                float3 viewVecTS : TEXCOORD3;
                float3 invNormal : TEXCOORD4;
                float3 invTangent : TEXCOORD5;
                float3 invBinormal : TEXCOORD6;
                float3 vertexLight : COLOR0;
            };
44
Example Continued
            VsOutput main (VsInput i)
            {
                VsOutput o;
                o.texCoord = i.texCoord;
                float3x3 mTangent = 0;

                // Skin
                float4 pos = SiSkin4x4 (i.pos, i.weights, i.indices);
                o.pos = mul (pos, mVP);
                mTangent[0] = SiSkin3x3 (i.tangent, i.weights, i.indices);
                mTangent[1] = SiSkin3x3 (i.binormal, i.weights, i.indices);
                mTangent[2] = SiSkin3x3 (i.normal, i.weights, i.indices);

                // Invert tangent space
                float3x3 mInvTangent = transpose(mTangent);
                o.invTangent = mInvTangent[0];
                o.invBinormal = mInvTangent[1];
                o.invNormal = mInvTangent[2];

                // Compute View Vector
                float3 viewVec = worldCamPos - i.pos;
                viewVec = normalize (viewVec);
                o.viewVecTS = mul (mTangent, viewVec);
45
Example Continued
                // Compute ambient lighting
                float3 vertexLight = 0.0f;
                for (int idx = 0; idx < NUM_OBJECT_AMBIENT_LIGHTS; idx++)
                {
                    vertexLight += SiComputeObjectAmbientLight (pos, mTangent[2], idx);
                }
                o.vertexLight = vertexLight;

                // Compute runtime light vectors for RT1
                o.lightSpacePos0 = SiComputeRTLightSpacePosition (pos, 0);
                float3 lightVec0 = SiComputeRTLightVectorNormalized (pos, 0);
                o.lightVec0TS = mul (mTangent, lightVec0);

                return o;
            }
        EndVertexShader
46
Example Continued
        PsInclude this
        PsInclude "SiRTLight.shl"(PS)
        PsInclude "SiMath.shl"(Misc)
        StartPixelShader(HLSL)
            sampler tBump;
            sampler tGloss;
            sampler tEnv;
            sampler tSpec;

            float4 vBaseColor;

            struct PsInput
            {
                float2 texCoord : TEXCOORD0;
                float3 lightVec0TS : TEXCOORD1;
                float3 lightSpacePos0 : TEXCOORD2;
                float3 viewVecTS : TEXCOORD3;
                float3 invNormal : TEXCOORD4;
                float3 invTangent : TEXCOORD5;
                float3 invBinormal : TEXCOORD6;
                float3 vertexLight : COLOR0;
            };
47
Example Continued
            float4 main (PsInput i) : COLOR
            {
                // Create arrays of light vectors and positions
                #define NUM_RT_LIGHTS 1
                float3 vLightVec[NUM_RT_LIGHTS] = {i.lightVec0TS};
                float3 vLightPos[NUM_RT_LIGHTS] = {i.lightSpacePos0};

                // Sample normal map
                float3 vNormal = tex2D (tBump, i.texCoord);
                vNormal = SiConvertColorToVector (vNormal);

                // Compute reflection vector
                float3 reflectionVec = SiReflect (i.viewVecTS, vNormal);

                // Sample Exponent and Gloss Map
                float exponent = tex2D (tSpec, i.texCoord);
                float gloss = tex2D (tGloss, i.texCoord);
48
Example Continued
                // Loop over runtime lights computing light contributions
                float3 diffuse = i.vertexLight * vNormal.z;
                float3 specular = 0;
                for (int idx = 0; idx < NUM_RT_LIGHTS; idx++)
                {
                    float3 colorIntensity = SiComputeRTLightColorIntensity (vLightPos[idx], idx);
                    float diffuseNdotL = SiDot3Clamp (vNormal, vLightVec[idx]);
                    diffuse += colorIntensity * diffuseNdotL;

                    float specularRdotL = SiComputeSpecular (reflectionVec, vLightVec[idx], exponent,
                                                             SPECULAR_K_MIN, SPECULAR_K_MAX);
                    specular += colorIntensity * specularRdotL;
                }
49
Example Continued
                // Rotate reflection vector to object space
                float3x3 mInvTangent = {i.invTangent, i.invBinormal, i.invNormal};
                float3 reflectionVecOS = mul (mInvTangent, reflectionVec);

                // Sample environment map
                float3 cEnv = texCUBE (tEnv, reflectionVecOS);

                // Scale env map by fresnel and add to specular contribution
                float fresnel = SiComputeFresnelApprox (vNormal, i.viewVecTS);
                float3 specularEnv = cEnv * fresnel * gloss;
                specular *= gloss;

                // Compute final color
                float4 o;
                o.rgb = (vBaseColor * diffuse) + (specular + specularEnv);
                o.a = 0.0;
                return o;
            }
        EndPixelShader
    EndPass
EndShader
50
Summary
• Why a shader library is needed
• Goals of designing a shader library
• Case study: ATI’s demo shader format
• Modifications for HLSL (it wasn’t that painful)