Experiences with Streaming Construction of SAH KD Trees Stefan Popov, Johannes Günther, Hans-Peter Seidel, Philipp Slusallek

Page 1: Experiences with Streaming Construction of SAH KD Trees

Experiences with Streaming Construction of SAH KD Trees

Stefan Popov, Johannes Günther, Hans-Peter Seidel, Philipp Slusallek

Page 2: Experiences with Streaming Construction of SAH KD Trees

Stefan Popov Streaming Construction of KD Trees

Motivation
Large speed-up of ray tracing lately
  Better algorithms (packet tracing [Wald04, Reshetov05])
  Optimized spatial index structures
    Best known: KD trees [Havran00]
  Faster hardware
Research concentrated mainly on static scenes
Dynamic scenes
  Building is slow for SAH-based KD trees
  Done in a pre-processing step

Page 3: Experiences with Streaming Construction of SAH KD Trees

Dynamic Scenes Approaches
Embed dynamics in the index structure
  Use a two-level approach [Wald03]
  Fuzzy KD trees [Günther06]
Update index structure
  Grids, BVHs and KD tree hybrids
    Faster build/update
    Lower traversal performance
  No efficient approach for KD trees
Rebuild entire KD tree
  Need to make it fast
  Lazy build

Page 4: Experiences with Streaming Construction of SAH KD Trees

SAH Algorithm
Extract & sort events in advance
  Abstract objects with AABBs
  Events given by AABB boundaries
Recursive top-down construction
  Find split plane using SAH
    Compute minimum cost
  Distribute objects to children
    By distributing the events
    Keep them sorted

[Figure: eight objects (1 to 8) divided by the split hyper-plane X: 68. Parent: 1, 2, 3, 4, 5, 6, 7, 8. Left: 1, 2, 3, 4. Right: 4, 5, 6, 7, 8.]
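The event extraction step above can be sketched as follows. This is a minimal illustration, not the authors' code: the `Event` tuple, the `extract_events` name, and the rule that a closing event sorts before an opening event at the same position are assumptions made for the example.

```python
from typing import List, NamedTuple, Sequence, Tuple

Box = Tuple[Sequence[float], Sequence[float]]  # (min corner, max corner)

CLOSE, OPEN = 0, 1  # assumed encoding; CLOSE < OPEN so closes sort first on ties


class Event(NamedTuple):
    pos: float  # AABB boundary coordinate on the chosen axis
    kind: int   # CLOSE or OPEN
    obj: int    # index of the object whose AABB produced the event


def extract_events(boxes: List[Box], axis: int) -> List[Event]:
    """Create two events per AABB on one axis and sort them by position."""
    events = []
    for i, (lo, hi) in enumerate(boxes):
        events.append(Event(lo[axis], OPEN, i))
        events.append(Event(hi[axis], CLOSE, i))
    events.sort()  # tuple order: position, then kind, then object index
    return events
```

One such list per axis is built once at the root; the recursive build then only distributes events to the children, so the lists stay sorted.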

Page 5: Experiences with Streaming Construction of SAH KD Trees

SAH Cost Function
Piecewise linear
Discontinuities at object boundaries
Evaluate only before opening and after closing events

[Figure: objects 1 and 2 on X and Y axes, with a plot of the SAH cost function along X.]
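A conventional sweep that evaluates the cost only at event positions might look like the sketch below. The cost constants `c_trav` and `c_isect`, the function names, and the handling of events that share a position (all closes are applied before the evaluation, all opens after it) are assumptions for illustration; the slides do not give these details.

```python
from math import inf
from typing import List, Optional, Sequence, Tuple


def surface_area(lo: Sequence[float], hi: Sequence[float]) -> float:
    dx, dy, dz = (hi[i] - lo[i] for i in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def sah_cost(node_lo, node_hi, axis, split, n_left, n_right,
             c_trav=1.0, c_isect=1.0) -> float:
    """SAH cost of splitting the node box at `split` on `axis`."""
    left_hi = list(node_hi)
    left_hi[axis] = split
    right_lo = list(node_lo)
    right_lo[axis] = split
    total = surface_area(node_lo, node_hi)
    p_left = surface_area(node_lo, left_hi) / total
    p_right = surface_area(right_lo, node_hi) / total
    return c_trav + c_isect * (p_left * n_left + p_right * n_right)


def best_split_by_sweep(extents: List[Tuple[float, float]],
                        node_lo, node_hi, axis) -> Tuple[float, Optional[float]]:
    """Sweep sorted events, keeping running left/right object counts."""
    events = sorted([(lo, 1) for lo, _ in extents] +
                    [(hi, 0) for _, hi in extents])  # 1 = open, 0 = close
    n_left, n_right = 0, len(extents)
    best_cost, best_pos = inf, None
    i = 0
    while i < len(events):
        pos = events[i][0]
        opens = closes = 0
        while i < len(events) and events[i][0] == pos:  # group equal positions
            if events[i][1] == 1:
                opens += 1
            else:
                closes += 1
            i += 1
        n_right -= closes  # evaluate after the closing events ...
        cost = sah_cost(node_lo, node_hi, axis, pos, n_left, n_right)
        if cost < best_cost:
            best_cost, best_pos = cost, pos
        n_left += opens    # ... and before the opening events
    return best_cost, best_pos
```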

Page 6: Experiences with Streaming Construction of SAH KD Trees

Distribution Along the Split Axis
Given: event list & split position
Sweep event list and classify
  Open event
    Before split: label object "both"
    After split: label object "right"
  Close event
    Before split: re-label object "left"
Copy event to corresponding child's list
  Might have to insert new events
Random memory access

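[Figure: event list of opening "[" and closing "]" events swept along X across the split position. Objects are labeled "both" or "right" at their opening event (keep "both", keep "right") and re-labeled "left" at a closing event before the split. Regions: Left, Both, Right.]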
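The labeling rules above can be written as a single pass over the sorted event list. In this sketch the event layout `(pos, kind, obj)`, the function name, and the treatment of events lying exactly on the split plane are assumptions; inserting the new (clipped) events for straddling objects is left out.

```python
from typing import Dict, List, Tuple

Event = Tuple[float, str, int]  # (position, "open" or "close", object index)


def classify_along_split_axis(events: List[Event], split: float) -> Dict[int, str]:
    """Label every object "left", "right" or "both" in one sweep."""
    labels: Dict[int, str] = {}
    for pos, kind, obj in events:
        if kind == "open":
            # Opens before the split may still straddle it: start as "both".
            labels[obj] = "both" if pos < split else "right"
        elif pos <= split:
            # The object also closes before the split: it is left-only.
            labels[obj] = "left"
    return labels
```

Writing `labels[obj]` is the random memory access mentioned on the slide: consecutive events usually belong to unrelated objects.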

Page 7: Experiences with Streaming Construction of SAH KD Trees

Distribution Along the Other Axes
Sweep event lists. Copy event to
  Left, if corresponding object labeled "left" or "both"
  Right, if corresponding object labeled "right" or "both"
Look up in object array
  Random memory access

[Figure: event list along Y copied into the left child's list and the right child's list according to the object labels Left, Both, Right.]
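For the remaining axes, the copy step might be sketched like this. The event layout `(pos, kind, obj)` and the `labels` list indexed by object are assumptions for the example. Because events are copied in sweep order, both child lists stay sorted without re-sorting.

```python
from typing import List, Tuple

Event = Tuple[float, str, int]  # (position, "open" or "close", object index)


def distribute_other_axis(events: List[Event],
                          labels: List[str]) -> Tuple[List[Event], List[Event]]:
    """Copy each event to the left and/or right child's list by object label."""
    left: List[Event] = []
    right: List[Event] = []
    for ev in events:
        label = labels[ev[2]]  # lookup in the object array (random access)
        if label in ("left", "both"):
            left.append(ev)
        if label in ("right", "both"):
            right.append(ev)
    return left, right
```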

Page 8: Experiences with Streaming Construction of SAH KD Trees

Problems of KD Tree Construction

Random memory accesses

Expensive cost function evaluation

Initial sorting – inefficient for lazy builds

Page 9: Experiences with Streaming Construction of SAH KD Trees

Streaming Algorithm Overview
Work with unsorted lists of AABBs
  Avoid initial sorting
Sweep list once to locate initial split plane
In a single sweep
  Distribute objects (straightforward)
  Determine split positions of children
Once data fits in caches, switch to conventional build

[Figure: parent list streamed into a left list and a right list.]

Page 10: Experiences with Streaming Construction of SAH KD Trees

SAH Cost Estimation
Cost function typically varies only slowly
  No need to evaluate SAH at every event
  Use sampling!
Naïve approach
  For every event: check all samples
  O(kN)
How to sample efficiently?

[Figure: plot of the SAH cost over the split position, marking the minimum found by sampling and the real minimum.]

Page 11: Experiences with Streaming Construction of SAH KD Trees

Efficient Sampling
Two-step approach
  #Objects to left of sample = #Opening events to its left
  #Objects to right of sample = #Closing events to its right
Count opening/closing events between samples
  Regular sampling: index computation in O(1)
Reconstruct left/right object counts at samples
  Using two partial sums, from left and from right
  O(k+N)

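[Figure: event list "[ [ ] [ ] ]" with the opening/closing event counts between samples, and the reconstructed left/right object counts at the five samples: 0 3, 1 3, 2 2, 3 1, 3 0.]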
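The two steps can be sketched as a binning pass followed by two running sums. This is an illustration under assumptions, not the authors' implementation: `k` bins are delimited by `k + 1` regularly spaced samples, and the rounding of events that fall exactly on a sample is a choice made here.

```python
import math
from typing import List, Tuple


def sample_counts(extents: List[Tuple[float, float]], lo: float, hi: float,
                  k: int) -> Tuple[List[int], List[int]]:
    """Return (n_left, n_right) at the k + 1 regular samples in [lo, hi]."""
    width = (hi - lo) / k
    opens = [0] * k   # opening events per bin
    closes = [0] * k  # closing events per bin
    for a, b in extents:  # step 1: O(N), bin index computed in O(1)
        ia = min(k - 1, max(0, int((a - lo) / width)))
        ib = min(k - 1, max(0, math.ceil((b - lo) / width) - 1))
        opens[ia] += 1
        closes[ib] += 1
    n_left = [0] * (k + 1)
    for j in range(k):  # step 2a: partial sum from the left, O(k)
        n_left[j + 1] = n_left[j] + opens[j]
    n_right = [0] * (k + 1)
    for j in range(k - 1, -1, -1):  # step 2b: partial sum from the right, O(k)
        n_right[j] = n_right[j + 1] + closes[j]
    return n_left, n_right
```

With three made-up extents whose events appear in the order `[ [ ] [ ] ]`, for example `sample_counts([(0.5, 1.6), (1.2, 2.6), (2.2, 3.5)], 0.0, 4.0, 4)`, the left counts come out as 0, 1, 2, 3, 3 and the right counts as 3, 3, 2, 1, 0.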

Page 12: Experiences with Streaming Construction of SAH KD Trees

Refining of Samples

SAH is the sum of two monotone functions, $C_l$ and $C_r$
Cost between two samples a < b is bounded from below:
  $C \ge C_{\min} = \min(C_l) + \min(C_r) = C_l(a) + C_r(b)$
Resample areas where $C_{\min}$ < current minimum
  Typically only few intervals need to be re-sampled (< 5%)

[Figure: plot of C_l, C_r and C = C_l + C_r over the split position, with the current minimum marked.]
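The bound above gives a simple test for which intervals are worth re-sampling. The sketch assumes the values of the left term (non-decreasing) and the right term (non-increasing) are already known at every sample; the function and parameter names are hypothetical.

```python
from typing import List, Sequence


def intervals_to_refine(cl: Sequence[float], cr: Sequence[float]) -> List[int]:
    """Indices j of sample intervals [j, j + 1] that may hide a lower cost.

    cl[j] and cr[j] are the left and right cost terms at sample j.
    """
    current_min = min(l + r for l, r in zip(cl, cr))
    refine = []
    for j in range(len(cl) - 1):
        lower_bound = cl[j] + cr[j + 1]  # Cl(a) + Cr(b) for a = j, b = j + 1
        if lower_bound < current_min:
            refine.append(j)
    return refine
```

Intervals whose lower bound is not below the current minimum cannot contain a better split and are skipped.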

Page 13: Experiences with Streaming Construction of SAH KD Trees

Algorithm properties

Streaming memory accesses

SAH cost function estimated by sampling

No initial sorting required

Refining of Samples

Page 14: Experiences with Streaming Construction of SAH KD Trees

Improvements: Conventional Algorithm
Use radix sort: O(N)
  Fastest algorithm if data set fits into caches
No need to order events at same position
  Count opening/closing events instead
  Removes one radix sort pass
Multiple cores parallelize build
  Most time spent in the lower tree levels
  One sub-tree per core
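As an illustration of sorting event positions with a radix sort, the sketch below uses a least-significant-digit sort with four 8-bit passes over 32-bit float keys. The slides only name radix sort; the key transformation, the pass layout, and the function names are assumptions, and positions are assumed to fit in single precision.

```python
import struct
from typing import List


def float_key(x: float) -> int:
    """Map a 32-bit float to an unsigned int with the same ordering."""
    (u,) = struct.unpack("<I", struct.pack("<f", x))
    # Negative floats: flip all bits. Non-negative floats: set the sign bit.
    return (~u & 0xFFFFFFFF) if u & 0x80000000 else (u | 0x80000000)


def radix_sort_floats(values: List[float]) -> List[float]:
    """Stable LSD radix sort: four passes, one per key byte."""
    keyed = [(float_key(v), v) for v in values]
    for shift in (0, 8, 16, 24):
        buckets: List[list] = [[] for _ in range(256)]
        for item in keyed:
            buckets[(item[0] >> shift) & 0xFF].append(item)
        keyed = [item for bucket in buckets for item in bucket]
    return [v for _, v in keyed]
```

Each pass touches every element once, so the total work is linear in the number of events.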

Page 15: Experiences with Streaming Construction of SAH KD Trees

Results

Speed-up of up to 50%
  Only effective in the upper levels
  Limited by copying of objects/events
  The larger the scene, the higher the speed-up
  Performance independent of triangle order
Small decrease in traversal performance (< 2%)
  With 1024 samples
Multi-threading
  2.43x @ 4 cores (no local memory management)

Page 16: Experiences with Streaming Construction of SAH KD Trees

Future Work

Fully multi-threaded implementation
  Careful memory management on NUMA architectures
Extend to other spatial index structures
  BVHs, BKD trees, SKD trees, …

[Figure: four CPUs, each paired with its own memory.]

Page 17: Experiences with Streaming Construction of SAH KD Trees

Conclusion

Streaming construction algorithm
  50% speed-up
Cost function sampling
  Very low quality degradation
Refining of samples

Page 18: Experiences with Streaming Construction of SAH KD Trees

Thank you!

Page 19: Experiences with Streaming Construction of SAH KD Trees

Advantages

Sequential memory access in the upper levels

Small data footprint in conventional build
  Fits in caches
  Radix sort is efficient

Fewer computations needed for split plane position estimation

But, what about the tree cost?

Page 20: Experiences with Streaming Construction of SAH KD Trees

Memory Management

Use two arrays and alternate them

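[Figure: an objects array and an index array. Objects are sifted to the second array as the left child's objects and the right child's objects. Labels: "Left only", "Right only", "SP x 2", "Object count += SP". Index array entries i_n, i_{n+1} and i_m, i_{m+1}, i_{m+2} delimit the nodes: object count for node n = i_{n+1} - i_n.]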
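One way to read the figure is sketched below, with several assumptions stated up front: SP is taken to be the number of objects straddling the split plane, each child's objects are written contiguously into the second array, and the index array stores the running end offsets. The function name and the label strings are hypothetical.

```python
from typing import Dict, List


def append_children(src: List[int], labels: Dict[int, str],
                    dst: List[int], index: List[int]) -> None:
    """Write a node's left and right child object ranges into `dst`.

    `index` starts as [0]; after the call, child n occupies
    dst[index[n]:index[n + 1]], so its object count is index[n + 1] - index[n].
    """
    left = [o for o in src if labels[o] in ("left", "both")]
    right = [o for o in src if labels[o] in ("right", "both")]
    for child in (left, right):
        dst.extend(child)       # straddling objects are stored twice
        index.append(len(dst))  # end offset of this child's range
```

On the next tree level the roles of the two arrays are swapped, so no per-node allocation is needed.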

Page 21: Experiences with Streaming Construction of SAH KD Trees

SAH tree cost
Optimal KD tree for ray tracing
  SAH based
  Minimize average expected traversal cost of an arbitrary ray
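For reference, the surface area heuristic is commonly written in the following form. The slide does not spell it out, so the symbols are assumptions introduced here: $V$ is the node's box, $V_L$ and $V_R$ are the child boxes produced by split position $p$, $N_L$ and $N_R$ are the object counts of the children, $SA(\cdot)$ is surface area, and $C_{\mathrm{trav}}$, $C_{\mathrm{isect}}$ are the traversal and intersection cost constants.

```latex
C(p) = C_{\mathrm{trav}} + C_{\mathrm{isect}}
       \left( \frac{SA(V_L)}{SA(V)}\, N_L + \frac{SA(V_R)}{SA(V)}\, N_R \right)
```

The ratios $SA(V_L)/SA(V)$ and $SA(V_R)/SA(V)$ estimate the probability that a ray hitting the node also hits the respective child.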

Page 22: Experiences with Streaming Construction of SAH KD Trees

SAH computation

Efficient computation: extract & sort events in advance
Compute incrementally
  Keep track of objects on left/right
  Evaluate after a close event, before an open event

[Figure: objects 1 and 2 on X and Y axes, with a plot of the SAH cost function along X.]

Page 23: Experiences with Streaming Construction of SAH KD Trees

Alternative Multi-Threading (required on NUMA architectures)
One sub-tree per core is not suitable for the first log(#cores) levels
  Also unsuitable for some architectures (Cell)
Alternative
  Bring data to cores from sequential pages
  Gather event counts in bins at each core
  Merge counts before actual cost evaluation

[Figure: one memory and four CPUs.]
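The gather-and-merge idea can be sketched with one bin histogram per worker. This only shows the structure: the chunking, the names, and the use of a thread pool are assumptions, and Python threads will not reproduce the speed-up of a native multi-core build.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Extent = Tuple[float, float]


def bin_chunk(chunk: List[Extent], lo: float, width: float,
              k: int) -> Tuple[List[int], List[int]]:
    """Count opening/closing events per bin for one sequential chunk."""
    opens, closes = [0] * k, [0] * k
    for a, b in chunk:
        opens[min(k - 1, max(0, int((a - lo) / width)))] += 1
        closes[min(k - 1, max(0, math.ceil((b - lo) / width) - 1))] += 1
    return opens, closes


def parallel_bin_counts(extents: List[Extent], lo: float, hi: float, k: int,
                        workers: int = 4) -> Tuple[List[int], List[int]]:
    """Bin each chunk on its own worker, then merge the per-worker counts."""
    width = (hi - lo) / k
    size = max(1, -(-len(extents) // workers))  # ceiling division
    chunks = [extents[i:i + size] for i in range(0, len(extents), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(lambda c: bin_chunk(c, lo, width, k), chunks))
    opens = [sum(p[0][j] for p in partial) for j in range(k)]
    closes = [sum(p[1][j] for p in partial) for j in range(k)]
    return opens, closes
```

Only the small per-worker histograms are merged; the object data itself is read once, sequentially, by the worker that owns the chunk.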

Page 24: Experiences with Streaming Construction of SAH KD Trees

Extension: Multi-Threading

Multiple cores parallelize build
  Most time spent in the lower tree levels
  One sub-tree per core