![Page 1: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/1.jpg)
Experiences with Streaming Construction of SAH KD Trees
Stefan Popov, Johannes Günther, Hans-Peter Seidel, Philipp Slusallek
![Page 2: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/2.jpg)
Stefan Popov Streaming Construction of KD Trees
Motivation
- Large speed-up of ray tracing lately
  - Better algorithms (packet tracing [Wald04, Reshetov05])
  - Optimized spatial index structures; best known: KD trees [Havran00]
  - Faster hardware
- Research concentrated mainly on static scenes
- Dynamic scenes
  - Building is slow for SAH-based KD trees
  - Done in a pre-processing step
![Page 3: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/3.jpg)
Dynamic Scenes Approaches
- Embed dynamics in the index structure
  - Use a two-level approach [Wald03]
  - Fuzzy KD trees [Günther06]
- Update index structure
  - Grids, BVHs and KD tree hybrids
  - Faster build/update
  - Lower traversal performance
  - No efficient approach for KD trees
- Rebuild entire KD tree
  - Need to make it fast
  - Lazy build
![Page 4: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/4.jpg)
SAH Algorithm
- Extract & sort events in advance
  - Abstract objects with AABBs
  - Events given by AABB boundaries
- Recursive top-down construction
  - Find split plane using SAH: compute minimum cost
  - Distribute objects to children by distributing the events; keep them sorted

[Figure: eight objects (1 to 8) divided by the split hyper-plane X: 68. Parent: 1, 2, 3, 4, 5, 6, 7, 8. Left: 1, 2, 3, 4. Right: 4, 5, 6, 7, 8.]
![Page 5: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/5.jpg)
SAH Cost Function
- Piecewise linear
- Discontinuities at object boundaries
- Evaluate only before opening and after closing event

[Figure: two objects (1, 2) in the X/Y plane, with the SAH cost plotted against the split position.]
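
The slides do not spell out the cost function itself. As a minimal sketch, assuming the standard surface area heuristic, the cost of a candidate split can be evaluated as below. The names `surface_area` and `sah_cost` and the constants `c_trav` and `c_isect` are illustrative assumptions, not taken from the talk.

```python
def surface_area(lo, hi):
    """Surface area of an axis-aligned box given its min and max corners."""
    dx, dy, dz = (hi[i] - lo[i] for i in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def sah_cost(lo, hi, axis, split, n_left, n_right, c_trav=1.0, c_isect=1.5):
    """Standard SAH cost of splitting the box [lo, hi] at `split` on `axis`.

    c_trav and c_isect are assumed traversal/intersection cost constants.
    """
    left_hi = list(hi)
    left_hi[axis] = split
    right_lo = list(lo)
    right_lo[axis] = split
    area = surface_area(lo, hi)
    p_left = surface_area(lo, left_hi) / area    # chance a ray hits the left child
    p_right = surface_area(right_lo, hi) / area  # chance a ray hits the right child
    return c_trav + c_isect * (p_left * n_left + p_right * n_right)


# Unit cube split in the middle of the X axis, 2 objects left, 4 right.
cost = sah_cost((0, 0, 0), (1, 1, 1), 0, 0.5, 2, 4, c_trav=1.0, c_isect=1.0)
```

With fixed object counts both child areas are linear in the split position, which is why the cost is piecewise linear and only changes slope or jumps at object boundaries.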
![Page 6: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/6.jpg)
Distribution Along the Split Axis
- Given: event list & split position
- Sweep event list and classify
  - Open event
    - Before split: label object “both”
    - After split: label object “right”
  - Close event
    - Before split: re-label object “left”
- Copy event to corresponding child’s list
  - Might have to insert new events
- Random memory access

[Figure: event list along X swept against the split plane; objects are labeled “left”, “both”, or “right” (re-label left, keep both, keep right).]
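
The labeling sweep described on this slide can be sketched roughly as follows. The event layout `(position, kind, object_id)`, the function name `classify_objects`, and the handling of events lying exactly on the split plane are assumptions made for illustration; the step that inserts new events for straddling objects is left out.

```python
def classify_objects(events, split):
    """Label every object 'left', 'both', or 'right' in one sweep.

    events: iterable of (position, kind, object_id) with kind 'open' or 'close'.
    Tie handling at position == split is an assumption of this sketch.
    """
    labels = {}
    for pos, kind, obj in events:
        if kind == "open":
            # Opens before the split may still straddle it: tentatively 'both'.
            labels[obj] = "both" if pos < split else "right"
        elif pos <= split and labels.get(obj) == "both":
            # The object also closes before the split: it lies fully on the left.
            labels[obj] = "left"
    return labels


events = sorted([
    (0.0, "open", "A"), (2.0, "close", "A"),
    (1.0, "open", "B"), (5.0, "close", "B"),
    (4.0, "open", "C"), (6.0, "close", "C"),
])
labels = classify_objects(events, split=3.0)
```

The labels are written per object rather than per event, which is where the random memory access mentioned on the slide comes from.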
![Page 7: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/7.jpg)
Distribution Along the Other Axes
- Sweep event lists. Copy event to
  - Left, if corresponding object labeled “left” or “both”
  - Right, if corresponding object labeled “right” or “both”
- Look up in object array
  - Random memory access

[Figure: events along Y copied to the left child’s list and the right child’s list according to the object labels.]
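
A small sketch of this copy step, assuming the per-object labels are already available as a dictionary and events are `(position, kind, object_id)` tuples (both are illustrative choices, as is the name `distribute_events`):

```python
def distribute_events(events, labels):
    """Copy each event to the left and/or right child list by object label.

    A stable single pass: if `events` is sorted, both outputs stay sorted.
    """
    left, right = [], []
    for event in events:
        label = labels[event[2]]  # per-object lookup (random access)
        if label != "right":      # 'left' or 'both'
            left.append(event)
        if label != "left":       # 'right' or 'both'
            right.append(event)
    return left, right


labels_y = {"A": "left", "B": "both", "C": "right"}
events_y = [
    (0.0, "open", "B"), (1.0, "open", "A"), (2.0, "close", "B"),
    (3.0, "open", "C"), (4.0, "close", "A"), (5.0, "close", "C"),
]
left_y, right_y = distribute_events(events_y, labels_y)
```

Because the pass never reorders events, no re-sorting is needed in the children, at the price of one label lookup per event.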
![Page 8: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/8.jpg)
Problems of KD Tree Construction
Random memory accesses
Expensive cost function evaluation
Initial sorting – inefficient for lazy builds
![Page 9: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/9.jpg)
Streaming Algorithm Overview
- Work with unsorted lists of AABBs
  - Avoid initial sorting
  - Sweep list once to locate initial split plane
- In a single sweep
  - Distribute objects (straightforward)
  - Determine split positions of children
- Once data fits in caches, switch to conventional build

[Figure: the parent list streamed into a left list and a right list.]
![Page 10: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/10.jpg)
SAH Cost Estimation
- Cost function typically varies only slowly
  - No need to evaluate SAH at every event
  - Use sampling!
- Naïve approach
  - For every event: check all samples, O(kN)
- How to sample efficiently?

[Figure: SAH cost over the split position, marking the minimum found by sampling and the real minimum.]
![Page 11: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/11.jpg)
Efficient Sampling
- Two step approach
  - #Objects to left of sample = #Opening events to its left
  - #Objects to right of sample = #Closing events to its right
- Count opening/closing events between samples
  - Regular sampling: index computation in O(1)
- Reconstruct left/right object counts at samples
  - Using two partial sums, from left and right
  - O(k+N)

[Figure: a row of opening and closing events with per-interval event counts and the reconstructed left/right object counts at each sample.]
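
The two steps above can be sketched as follows for one axis, assuming each object is reduced to its `(min, max)` extent and the `k` samples are spaced regularly inside `[lo, hi]`. The function name `sample_counts` and the convention that an event lying exactly on a sample counts towards the right are assumptions of this sketch.

```python
from itertools import accumulate


def sample_counts(boxes, lo, hi, k):
    """Left/right object counts at k regular samples in O(k + N).

    boxes: list of (min, max) extents along one axis, in any order.
    """
    width = (hi - lo) / (k + 1)  # k samples split [lo, hi] into k + 1 intervals
    opens = [0] * (k + 1)
    closes = [0] * (k + 1)

    def bin_of(x):
        # O(1) interval index thanks to regular sampling, clamped to the range.
        return min(k, max(0, int((x - lo) / width)))

    # Step 1: count opening/closing events per interval in one sweep.
    for a, b in boxes:
        opens[bin_of(a)] += 1
        closes[bin_of(b)] += 1

    # Step 2: partial sums from the left (opens) and from the right (closes).
    samples = [lo + (j + 1) * width for j in range(k)]
    n_left = list(accumulate(opens))[:k]
    n_right = list(accumulate(reversed(closes)))[:k][::-1]
    return samples, n_left, n_right


samples, n_left, n_right = sample_counts([(0, 2), (1, 5), (4, 6)], 0, 8, 3)
```

The input list is never sorted; each box touches two counters, and the sample counts fall out of two prefix sums over `k + 1` bins.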
![Page 12: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/12.jpg)
Refining of Samples
- SAH is the sum of two monotone functions, $C_l$ and $C_r$
- Cost between two samples $a < b$ is bounded from below: $C \ge C_{\min} = \min(C_l) + \min(C_r) = C_l(a) + C_r(b)$
- Resample areas where $C_{\min}$ < current minimum
  - Typically only few intervals need to be re-sampled (< 5%)

[Figure: $C_l$, $C_r$ and $C = C_l + C_r$ over the split position, with the current minimum marked.]
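
A minimal sketch of the refinement test, assuming $C_l$ is non-decreasing and $C_r$ is non-increasing along the axis and that both terms have already been evaluated at every sample. The name `intervals_to_refine` and the list-based layout are illustrative.

```python
def intervals_to_refine(c_left, c_right):
    """Indices j of sample intervals [j, j + 1] that may hide a lower cost.

    c_left[j] and c_right[j] are the two SAH terms at sample j.
    Inside [a, b] the cost is at least c_left(a) + c_right(b).
    """
    best = min(l + r for l, r in zip(c_left, c_right))
    return [
        j
        for j in range(len(c_left) - 1)
        if c_left[j] + c_right[j + 1] < best  # lower bound beats current minimum
    ]


# Sampled costs are 10, 7, 8, 10, so the current minimum is 7.
refine = intervals_to_refine([1, 2, 6, 9], [9, 5, 2, 1])
```

Only the intervals returned here need a second, finer sampling pass; all others provably cannot contain a better split than the current minimum.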
![Page 13: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/13.jpg)
Algorithm properties
Streaming memory accesses
SAH cost function estimated by sampling
No initial sorting required
Refining of Samples
![Page 14: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/14.jpg)
Improvements: Conventional Algorithm
- Use radix sort, O(N)
  - Fastest algorithm if data set fits into caches
- No need to order events at same position
  - Count opening/closing events instead
  - Removes one radix sort pass
- Multiple cores parallelize build
  - Most time spent in the lower tree levels
  - One sub-tree per core
![Page 15: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/15.jpg)
Results
- Speed-up of up to 50%
  - Only effective in the upper levels
  - Limited by copying of objects/events
  - The larger the scene, the higher the speed-up
  - Performance independent of triangle order
- Small decrease in traversal performance (< 2%)
  - With 1024 samples
- Multi-threading
  - 2.43x @ 4 cores (no local memory management)
![Page 16: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/16.jpg)
Future Work
- Fully multi-threaded implementation
  - Careful memory management on NUMA architectures
- Extend to other spatial index structures
  - BVHs, BKD trees, SKD trees, …

[Figure: four CPUs, each paired with its own memory.]
![Page 17: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/17.jpg)
Conclusion
- Streaming construction algorithm
  - Up to 50% speed-up
- Cost function sampling
  - Very low quality degradation
- Refining of samples
![Page 18: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/18.jpg)
Thank you!
![Page 19: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/19.jpg)
Advantages
- Sequential memory access in the upper levels
- Small data footprint in conventional build
  - Fits in caches
  - Radix sort is efficient
- Fewer computations needed for split plane position estimation
- But what about the tree cost?
![Page 20: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/20.jpg)
Memory Management
- Use two arrays and alternate them
- Object count for node n = $i_{n+1} - i_n$

[Figure: an objects array addressed through an index array ($i_n$, $i_{n+1}$); objects are sifted to the second array, addressed through its own index array ($i_m$, $i_{m+1}$, $i_{m+2}$). Diagram labels: left child’s objects, right child’s objects, left only, right only, SP x 2, object count += SP.]
![Page 21: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/21.jpg)
SAH tree cost
- Optimal KD tree for ray tracing
  - SAH based
  - Minimize average expected traversal cost of an arbitrary ray
![Page 22: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/22.jpg)
SAH computation
- Efficient computation: extract & sort events in advance
  - Compute incrementally. Keep track of objects on left/right
  - Evaluate after a close event and before an open event

[Figure: two objects (1, 2) in the X/Y plane, with the SAH cost plotted against the split position.]
![Page 23: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/23.jpg)
Alternative Multi-Threading
- (… required on NUMA architectures)
- One sub-tree per core is not suitable for the first log(#cores) levels
  - Also unsuitable for some architectures (Cell)
- Alternative
  - Bring data to cores from sequential pages
  - Gather event counts in bins at each core
  - Merge counts before actual cost evaluation

[Figure: several CPUs attached to one memory.]
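
The gather-and-merge idea can be sketched as below: each worker bins the events of its own contiguous chunk, and the per-worker bins are summed before any cost is evaluated. The names `bin_events` and `merged_bins`, the chunking scheme, and the use of a thread pool are assumptions; in CPython the threads illustrate the data flow only, not the speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def bin_events(chunk, lo, width, k):
    """Count opening/closing events of one chunk in k + 1 regular bins."""
    opens, closes = [0] * (k + 1), [0] * (k + 1)
    for a, b in chunk:
        opens[min(k, max(0, int((a - lo) / width)))] += 1
        closes[min(k, max(0, int((b - lo) / width)))] += 1
    return opens, closes


def merged_bins(boxes, lo, hi, k, workers=4):
    """Bin events per worker on contiguous chunks, then merge the counts."""
    width = (hi - lo) / (k + 1)
    size = max(1, -(-len(boxes) // workers))  # ceiling division
    chunks = [boxes[i:i + size] for i in range(0, len(boxes), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(lambda c: bin_events(c, lo, width, k), chunks))
    opens, closes = [0] * (k + 1), [0] * (k + 1)
    for part_opens, part_closes in partial:  # merge step
        for j in range(k + 1):
            opens[j] += part_opens[j]
            closes[j] += part_closes[j]
    return opens, closes


opens, closes = merged_bins([(0, 2), (1, 5), (4, 6)], 0, 8, 3, workers=2)
```

Since bin counts simply add up, the merged result is identical to a single-threaded sweep, and each worker only ever reads its own sequential slice of the input.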
![Page 24: Experiences with Streaming Construction of SAH KD Trees](https://reader034.vdocument.in/reader034/viewer/2022051821/56815b15550346895dc8c643/html5/thumbnails/24.jpg)
Extension: Multi-Threading
- Multiple cores parallelize build
  - Most time spent in the lower tree levels
  - One sub-tree per core