![Page 1: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/1.jpg)
Advanced Uses of Pixel Local Storage
Marius Bjørge
ARM Developer Day - London
Graphics Research Engineer
December 3rd 2015
![Page 2: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/2.jpg)
© ARM 2015 2
Agenda
Pixel Local Storage
Indirect lighting pipeline
![Page 3: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/3.jpg)
© ARM 2015 3
Pixel Local Storage
Exposed as EXT_shader_pixel_local_storage
Per-pixel scratch memory available to fragment shaders
Automatically discarded once a tile is fully processed
No impact on external memory bandwidth
Shader declares a view of PLS memory
Re-interpret PLS between different passes
Can have separate input and output views
Independent of framebuffer format
![Page 4: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/4.jpg)
© ARM 2015 4
Pixel Local Storage
Rendering pipeline changes
slightly when PLS is enabled
Writing to PLS bypasses blending
Note
Fragment order
Fragment tests still applies
PLS and color share the same
memory location
Tile executionMemory
Primitive setup
Rasterization
Fragment Shading
Position data
Varyings
Textures
Framebuffer Writeback
Uniforms
Blending
Tilebuffer
PLS / Color
![Page 5: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/5.jpg)
© ARM 2015 5
Deferred Shading
Popular technique on PC and console games
Very memory bandwidth intensive
Traditionally not a good fit for mobile
![Page 6: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/6.jpg)
© ARM 2015 6
Order Independent Transparency
Depth peeling
Approximate approaches
Multi-Layer Alpha Blending [Salvi
et al, 2014]
Adaptive Range
![Page 7: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/7.jpg)
© ARM 2015 7
Pixel Local Storage
OIT phaseOpaque phase
Fill gbufferLight
accumulationTonemap
Pixel Local Storage
Init OIT
R32UI R32UI R32UI R32UI
Transparent rendering
Resolve
ColorRGB10A2 RGB10A2 RG16F RG16F
At this point we change the layout of the PLS
![Page 8: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/8.jpg)
© ARM 2015 8
Sample Code
http://malideveloper.arm.com/resources/sdks/mali-opengl-es-sdk-for-
android/
![Page 9: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/9.jpg)
Indirect Lighting Pipeline
![Page 10: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/10.jpg)
© ARM 2015 10
Related Work
Efficient Rendering With Tile
Local Storage [Bjorge14]
SH Irradiance Volumes
[Tatarchuk05]
Reflective Shadow Maps
[Dachsbacher09]
Virtual Point Lights
Light Propagation Volumes
[Kaplanyan09]
![Page 11: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/11.jpg)
© ARM 2015 11
Global illumination with ARM
Enlighten remains the GI solution of choice
for mobile
Performant
High quality
Proven
PLS + Enlighten = bandwidth saving! [Bjorge14]
This R&D project is separate
Exploration of PLS capabilities
Indirect lighting as first challenging use case
Results remain an approximation
![Page 12: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/12.jpg)
© ARM 2015 12
The Idea
A grid of probes are placed throughout the scene
Albedo, normal and depth information is stored
Resolution can be quite low (64x64, 32x32 or lower)
Selectively store as either cubemaps/latlon maps or spherical harmonics
Relighting does simple deferred shading style rendering per probe
Output: SH irradiance + reflection map
![Page 13: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/13.jpg)
© ARM 2015 13
High-Level Overview
Offline Runtime
Probe data
Probe scene for indirect lighting information
Relight probes
Irradiance volume
Reflection maps
Compute SH
Dynamic lights and reflector objects
![Page 14: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/14.jpg)
Direct Lighting
![Page 15: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/15.jpg)
Indirect Lighting
![Page 16: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/16.jpg)
Reflections
![Page 17: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/17.jpg)
Final Image
![Page 18: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/18.jpg)
Lighting Only
![Page 19: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/19.jpg)
Offline
![Page 20: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/20.jpg)
© ARM 2015 20
Offline – Probe Generation
For every probe in grid
Render scene from every direction
storing albedo, normal and depth
Save to a texture atlas
Depth == 1.0
Hemisphere
![Page 21: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/21.jpg)
© ARM 2015 21
Offline – Probe Storage
High frequency
Cubemaps / latlon maps
Low frequency
Spherical harmonics
Both are used in the demo application
Cubemaps for probes that are used for
environment mapping
Spherical harmonics for the rest
Gives the best quality/performance trade-off
![Page 22: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/22.jpg)
© ARM 2015 22
Offline – Probe Storage
Cubemaps
6kb
Latlon
2kb
Spherical harmonics
40 bytes
![Page 23: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/23.jpg)
© ARM 2015 23
Offline – Meta Data
Neighbour visibility information
Reduce amount of work required when updating probes
Snap texture coordinates to reduce light leakage
Hemisphere visibility
Avoid updating probes that are never affected by global lights (such as the hemisphere
or sun light)
Local cubemap bounding box
![Page 24: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/24.jpg)
Runtime
![Page 25: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/25.jpg)
© ARM 2015 25
Runtime – Relighting
First step is to determine which probes require updating
Probe meta-data is really useful here
Dirty probes are pushed to queue for processing
![Page 26: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/26.jpg)
© ARM 2015 26
Runtime – Per Probe Pipeline
Determine contributing lights
Accumulate lightingParallel sum of
lighting data
Dynamic lights and reflector objects
Probe data
Blit to reflection map
Compute 2nd order SH
Store to 3D textures
![Page 27: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/27.jpg)
© ARM 2015 27
Runtime – Deferred Probe Relighting
Deferred shading implementation
PLS is really useful here
Sample from the baked albedo/depth and
normal maps
Possible to re-use existing deferred shading
lighting code path
Makes it easy to support custom light types
and reflector objects
Heavily batched
-X +X -Y +Y -Z +Z
-X +X -Y +Y -Z +Z
-X +X -Y +Y -Z +Z
-X +X -Y +Y -Z +Z
-X +X -Y +Y -Z +Z
Probe 0
Probe 1
Probe 2
Probe 3
Probe n
![Page 28: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/28.jpg)
© ARM 2015 28
Runtime – Compute SH
Parallel sum with spherical harmonics coefficients
Compute shader
… and/or multiple fragment reduction passes
Output formats
Pack down to separate red, green and blue 3D irradiance textures
Store in FP16 for best visual quality
![Page 29: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/29.jpg)
© ARM 2015 29
Runtime – Irradiance Volume
2nd order SH
3x 3D textures for red, green and blue
coefficients
Simple 3D texture lookup using
world-space position
![Page 30: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/30.jpg)
© ARM 2015 30
Runtime – Reflections
Output to cubemap matching size of
probes
glGenerateMipmap
Cubemap filter too expensive for runtime
![Page 31: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/31.jpg)
© ARM 2015 31
Runtime – Multiple Bounces
Temporal probe relighting
Slowly accumulate into irradiance volume
![Page 32: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/32.jpg)
Single Bounce
![Page 33: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/33.jpg)
Multiple Bounces
![Page 34: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/34.jpg)
© ARM 2015 34
Future Work
Move “everything“ to compute
Octree representation
Occlusion culling?
![Page 35: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/35.jpg)
© ARM 2015 35
For more information visit the
Mali Developer Centre:
http://malideveloper.arm.com
• Revisit this talk in PDF and audio
format post event
• Download tools and resources
![Page 36: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/36.jpg)
The trademarks featured in this presentation are registered and/or unregistered trademarks of ARM Limited (or its
subsidiaries) in the EU and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their
respective owners.
Copyright © 2015 ARM Limited
Thank you!
Questions?
![Page 37: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically](https://reader035.vdocument.in/reader035/viewer/2022071012/5fcad919f66834427b726c1a/html5/thumbnails/37.jpg)
© ARM 2015 37
References
1. http://www.ppsloan.org/publications/StupidSH36.pdf
2. http://www.crytek.com/cryengine/cryengine3/presentations/light-
propagation-volumes-in-cryengine-3
3. http://developer.amd.com/wordpress/media/2012/10/Tatarchuk_Irradi
ance_Volumes.pdf
4. http://www.vis.uni-stuttgart.de/~dachsbcn/download/rsm.pdf
5. http://www.geomerics.com/wp-content/uploads/2014/11/Efficient-
Rendering-with-Tile-Local-Storage.pdf