Fast Parallel Construction of High-Quality Bounding Volume Hierarchies
Tero Karras, Timo Aila
Ray tracing comes in many flavors
- Interactive apps: 1M–100M rays/frame
- Architecture & design: 100M–10G rays/frame
- Movie production: 10G–1T rays/frame
© Activision 2009, Game trailer by Blur Studio
Courtesy of Delta Tracing Lucasfilm Ltd.™, Digital work by ILM
Courtesy of Columbia Pictures
NVIDIA
Courtesy of Dassault Systemes
Effective performance
$$\text{effective ray tracing performance} = \frac{\text{number of rays}}{\text{rendering time}}$$
$$\text{rendering time} = \text{time to build BVH} + \frac{\text{number of rays}}{\text{ray throughput}}$$

The build-time term is the builder's “speed”; the ray-throughput term is the “quality” of the resulting BVH.
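The two formulas can be evaluated directly to see how build time and ray throughput trade off. The sketch below is only an illustration: the function names and the two example builders (their build times and throughputs) are made-up values, not measurements from the talk.

```python
def rendering_time(num_rays, build_time_s, throughput_rays_per_s):
    """Rendering time = time to build the BVH + time spent tracing rays."""
    return build_time_s + num_rays / throughput_rays_per_s


def effective_performance(num_rays, build_time_s, throughput_rays_per_s):
    """Effective ray tracing performance in rays per second."""
    return num_rays / rendering_time(num_rays, build_time_s, throughput_rays_per_s)


# Hypothetical builders: a fast low-quality one and a slow high-quality one.
FAST_BUILDER = dict(build_time_s=0.01, throughput_rays_per_s=200e6)
SLOW_BUILDER = dict(build_time_s=1.0, throughput_rays_per_s=400e6)

if __name__ == "__main__":
    for num_rays in (1e6, 1e8, 1e10):
        fast = effective_performance(num_rays, **FAST_BUILDER)
        slow = effective_performance(num_rays, **SLOW_BUILDER)
        print(f"{num_rays:.0e} rays: fast {fast / 1e6:.1f} Mrays/s, slow {slow / 1e6:.1f} Mrays/s")
```

With few rays the build time dominates and the fast builder wins; with many rays the throughput term dominates and the high-quality builder wins.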
[Figure: effective performance (Mrays/s, 0 to 450) against number of rays per frame (1M to 1T). Speed matters for interactive apps, both speed and quality matter for architecture & design, and quality matters for movie production.]
[Figure: effective performance (Mrays/s, 0 to 450) against number of rays per frame (1M to 1T), measured on SODA (2.2M tris) with an NVIDIA GTX Titan and diffuse rays. Curves are added one at a time: SBVH [Stich et al. 2009] (CPU, 4 cores), where build time dominates; HLBVH [Garanzha et al. 2011] (GPU), followed by a "???" annotation; and our method (GPU). Annotations on the final plot: 30M–500G rays/frame; 97% of SBVH.]

Best quality–speed tradeoff for a wide range of applications
Treelet restructuring

Idea
- Build a low-quality BVH
- Optimize its node topology
- Look at multiple nodes at once

Treelet
- Subset of a node’s descendants
- Grow by turning leaves into internal nodes
- Largest leaves → best results
- Valid binary tree in itself
- Leaves can represent arbitrary subtrees of the BVH

[Figure: example treelet with root R and further nodes labelled A to G, distinguishing treelet internal nodes from treelet leaves. A treelet leaf can be an actual BVH leaf or the root of an arbitrary subtree.]
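The growth rule above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Node` class, the `form_treelet` name, and the default of 7 leaves are assumptions, and "largest" is taken to mean largest AABB surface area.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality, so list.remove() targets one node
class Node:
    area: float                      # surface area of the node's AABB
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None


def form_treelet(root: Node, n: int = 7):
    """Grow a treelet from `root` until it has n leaves (or cannot grow further).

    Returns (internal_nodes, leaves). Treelet leaves may be internal BVH nodes.
    """
    if root.is_leaf:
        raise ValueError("treelet root must be an internal BVH node")
    internal: List[Node] = [root]
    leaves: List[Node] = [root.left, root.right]
    while len(leaves) < n:
        expandable = [node for node in leaves if not node.is_leaf]
        if not expandable:
            break
        # Turn the largest treelet leaf into a treelet internal node.
        largest = max(expandable, key=lambda node: node.area)
        leaves.remove(largest)
        internal.append(largest)
        leaves.extend([largest.left, largest.right])
    return internal, leaves
```

A treelet with n leaves always has n - 1 internal nodes, which is why the same nodes can be reused after restructuring.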
Restructuring
- Construct optimal binary tree for the same set of leaves
- Replace old treelet

Reuse the same nodes
- Update connectivity and AABBs
- New AABBs should be smaller

Perfectly localized operation
- Leaves and their subtrees are kept intact
- No need to look at subtree contents

[Figure: the example treelet (root R, nodes A to G) before and after restructuring.]
Processing stages

Input triangles

Triangle splitting

Initial BVH construction
- Parallel LBVH [Karras 2012]
- 60-bit Morton codes for accurate spatial partitioning
- One triangle per leaf

Optimization
- Parallel bottom-up traversal [Karras 2012]
- Restructure multiple treelets in parallel
- Strict bottom-up order → no overlap between treelets
- Rinse and repeat (3 times is plenty)

Post-processing
- Collapse subtrees into leaf nodes
- Collect triangles into linear lists
- Prepare them for Woop’s intersection test [Woop 2004]

Fast GPU ray traversal [Aila et al. 2012]

Timings for DRAGON (870K tris) on NVIDIA GTX Titan:

| Stage | No splits | 30% splits |
|---|---|---|
| Triangle splitting | n/a | 0.4 ms |
| Initial BVH construction | 5.4 ms | 6.6 ms |
| Optimization | 17.0 ms | 21.4 ms |
| Post-processing | 1.2 ms | 1.6 ms |
| Total | 23.6 ms | 30.0 ms |
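For the Morton codes used in the initial construction, one plausible reading of "60-bit" is 20 bits per axis. The sketch below assumes that, assumes coordinates already normalized to the unit cube, and assumes x is the most significant axis in each bit triple; the function name `morton60` is hypothetical.

```python
def morton60(x: float, y: float, z: float, bits: int = 20) -> int:
    """Interleave the quantized bits of x, y, z (each in [0, 1]) into one integer.

    With bits=20 the result fits in 60 bits. Axis order (x, y, z) is an assumption.
    """
    scale = 1 << bits
    # Quantize each coordinate to an integer in [0, 2^bits - 1].
    ix, iy, iz = (min(max(int(c * scale), 0), scale - 1) for c in (x, y, z))
    code = 0
    for b in range(bits - 1, -1, -1):
        code = (code << 3) | (((ix >> b) & 1) << 2) | (((iy >> b) & 1) << 1) | ((iz >> b) & 1)
    return code
```

Sorting primitives by this code orders them along a space-filling curve, so each bit of the code corresponds to one spatial median split.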
Cost model

Surface area cost model [Goldsmith and Salmon 1987], [MacDonald and Booth 1990]
- Track cost and triangle count of each subtree

Minimize SAH cost of the final BVH
- Make collapsing decisions already during optimization
- → Unified processing of leaves and internal nodes
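A small sketch of how a per-subtree (cost, triangle count) pair allows the collapsing decision to be made on the fly. It uses one common formulation of the SAH; the constants `C_INTERNAL` and `C_TRIANGLE`, the unnormalized areas, and the function names are assumptions for illustration rather than values given on the slide.

```python
C_INTERNAL = 1.2  # assumed cost of visiting an internal node
C_TRIANGLE = 1.0  # assumed cost of one ray-triangle test


def leaf_cost(area: float, tri_count: int = 1):
    """(cost, triangle count) of a leaf holding tri_count triangles."""
    return (C_TRIANGLE * area * tri_count, tri_count)


def combine(area: float, left, right):
    """Cost of a node with the given AABB area and two already-costed children.

    Returns (cost, triangle count, collapsed), where collapsed is True when
    turning the whole subtree into a single leaf is cheaper.
    """
    tris = left[1] + right[1]
    as_internal = C_INTERNAL * area + left[0] + right[0]
    as_leaf = C_TRIANGLE * area * tris
    return (min(as_internal, as_leaf), tris, as_leaf < as_internal)
```

Because every subtree carries the same pair, leaves and internal nodes can be handled by the same code path.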
Optimal restructuring

Finding the optimal node topology is NP-hard
- Naive algorithm → … ; our approach → …
- But it becomes very powerful as the treelet size n grows
- n = 7 treelet leaves is enough for high-quality results

Use fixed-size treelets
- Constant cost per treelet
- → Linear with respect to scene size
| Treelet size | Layouts | Quality vs. SBVH * |
|---|---|---|
| 4 | 15 | 78% |
| 5 | 105 | 85% |
| 6 | 945 | 88% |
| 7 | 10,395 | 97% |
| 8 | 135,135 | 98% |

\* SODA (2.2M tris)

- Layouts: number of unique ways for restructuring a given treelet
- Quality: ray tracing performance after 3 rounds of optimization
- Small treelets: almost the same thing as tree rotations [Kensler 2008]; limited options during optimization → easy to get stuck in a local optimum; quality varies a lot between scenes
- Larger treelets: can still be implemented efficiently; with this many layouts, surely one of them will take us forward; quality is consistent across scenes; further improvement is marginal
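The layout counts in the table agree with the double factorial (2n − 3)!!, which counts the binary trees that can be built over n labelled leaves when the order of children does not matter. This is an observation about the numbers rather than something stated on the slide; the function name is hypothetical.

```python
def num_layouts(n: int) -> int:
    """Number of distinct binary tree topologies over n labelled leaves: (2n - 3)!!"""
    count = 1
    for k in range(3, 2 * n - 2, 2):  # 3, 5, ..., 2n - 3
        count *= k
    return count


if __name__ == "__main__":
    for n in range(4, 9):
        print(n, num_layouts(n))
```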
Algorithm

Dynamic programming
- Solve small subproblems first
- Tabulate their solutions
- Build on them to solve larger subproblems

Subproblem: What’s the best node topology for a subset of the leaves?
    input: set of n treelet leaves
    for k = 2 to n do
        for each subset of size k do
            for each way of partitioning the leaves do
                look up subtree costs
                calculate SAH cost
            end for
            record the best solution
        end for
    end for
    reconstruct optimal topology

- Outer loops: process subsets from smallest to largest and record the optimal SAH cost for each
- Partitioning loop: exhaustive search, assigning each leaf to the left or right subtree
- Cost lookup: we already know how much the subtrees will cost
- Reconstruction: backtrack the partitioning choices
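The pseudocode can be turned into a short scalar reference implementation using bitmasks for leaf subsets. This is a sketch under assumptions: boxes are `(min, max)` tuples, `C_INTERNAL` is an assumed SAH constant, leaf costs are supplied by the caller, collapsing is left out, and each partition is visited twice (once per mirror image) for simplicity.

```python
from itertools import combinations

C_INTERNAL = 1.2  # assumed SAH cost of an internal node


def union(a, b):
    """Smallest AABB enclosing boxes a and b, each given as (min, max)."""
    return (tuple(map(min, a[0], b[0])), tuple(map(max, a[1], b[1])))


def area(box):
    dx, dy, dz = (hi - lo for lo, hi in zip(box[0], box[1]))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def optimize_treelet(leaf_boxes, leaf_costs):
    """Return (optimal SAH cost, topology as nested tuples of leaf indices)."""
    n = len(leaf_boxes)
    full = (1 << n) - 1
    # AABB of every non-empty subset, built from the subset minus its lowest leaf.
    box = {}
    for s in range(1, full + 1):
        low = s & -s
        rest = s ^ low
        leaf = leaf_boxes[low.bit_length() - 1]
        box[s] = leaf if rest == 0 else union(leaf, box[rest])
    cost = {1 << i: leaf_costs[i] for i in range(n)}
    split = {}
    for k in range(2, n + 1):                      # smallest subsets first
        for combo in combinations(range(n), k):
            s = sum(1 << i for i in combo)
            best_cost, best_part = float("inf"), 0
            p = (s - 1) & s                        # first proper sub-mask of s
            while p:
                c = cost[p] + cost[s ^ p]          # subtree costs are already tabulated
                if c < best_cost:
                    best_cost, best_part = c, p
                p = (p - 1) & s
            cost[s] = C_INTERNAL * area(box[s]) + best_cost
            split[s] = best_part

    def build(s):
        """Backtrack the recorded partitioning choices."""
        if s not in split:
            return s.bit_length() - 1              # single leaf: return its index
        p = split[s]
        return (build(p), build(s ^ p))

    return cost[full], build(full)
```

For example, four unit cubes placed at x = 0, 1, 10, 11 end up paired as {0, 1} and {2, 3}, since any other grouping creates a much larger intermediate box.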
Scalar vs. SIMD

Scalar processing
- Each thread processes one treelet
- Need many treelets in flight
- ✗ Spills to off-chip memory
- ✗ Doesn’t scale to small scenes
- ✓ Trivial to implement

SIMD processing
- 32 threads collaborate on the same treelet
- Need few treelets in flight
- ✓ Data fits in on-chip memory
- ✓ Easy to fill the entire GPU
- ✗ Need to keep all threads busy
- Parallelize over subproblems using a pre-optimized processing schedule (details in the paper)
- ✓ Possible to keep threads busy
Quality vs. speed

Spend less effort on bottom-most nodes
- Low contribution to SAH cost
- Quick convergence

Additional parameter
- Only process subtrees that are large enough
- Trade quality for speed

Double the parameter after each round
- Significant speedup
- Negligible effect on quality
Triangle splitting

Early Split Clipping [Ernst and Greiner 2007]
- Split triangle bounding boxes as a pre-process
- For a large triangle, the bounding box is not a good approximation, so split it
- Resulting boxes provide a tighter bound
- Keep going until they are small enough
- Treat each box as a separate primitive
- Triangle itself remains the same
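A minimal sketch of the pre-process described above: clip the triangle against an axis-aligned plane, bound each piece, and recurse until the boxes are small enough. Splitting at the midpoint of the longest box axis, the `max_area` threshold, and the depth limit are assumptions made for this illustration.

```python
def bounds(poly):
    """AABB of a polygon given as a list of (x, y, z) vertices."""
    return (tuple(min(c) for c in zip(*poly)), tuple(max(c) for c in zip(*poly)))


def area(box):
    dx, dy, dz = (hi - lo for lo, hi in zip(*box))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def split_polygon(poly, axis, pos):
    """Clip a convex polygon against the plane x[axis] = pos; return both parts."""
    left, right = [], []
    for i, v0 in enumerate(poly):
        v1 = poly[(i + 1) % len(poly)]
        d0, d1 = v0[axis] - pos, v1[axis] - pos
        if d0 <= 0:
            left.append(v0)
        if d0 >= 0:
            right.append(v0)
        if d0 * d1 < 0:  # edge crosses the plane: add the intersection to both sides
            t = d0 / (d0 - d1)
            x = tuple(a + t * (b - a) for a, b in zip(v0, v1))
            left.append(x)
            right.append(x)
    return left, right


def early_split_clipping(poly, max_area, depth=0, max_depth=16):
    """Return a list of AABBs that together bound the polygon more tightly."""
    box = bounds(poly)
    if area(box) <= max_area or depth >= max_depth:
        return [box]
    extents = [hi - lo for lo, hi in zip(*box)]
    axis = extents.index(max(extents))
    pos = 0.5 * (box[0][axis] + box[1][axis])
    left, right = split_polygon(poly, axis, pos)
    return (early_split_clipping(left, max_area, depth + 1, max_depth)
            + early_split_clipping(right, max_area, depth + 1, max_depth))
```

Each returned box would be treated as a separate primitive that still refers to the original, unmodified triangle.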
Shortcomings of pre-process splitting
- Can hurt ray tracing performance
- Unpredictable memory usage
- Requires manual tuning

Improve with better heuristics
- Select good split planes
- Concentrate splits where they matter
- Use a fixed split budget
Split plane selection

Reduce node overlap in the initial BVH
- Root node partitions the scene at its spatial median into a left child and a right child
- If a triangle crosses the plane, the bounding boxes will overlap
- Use the same spatial median as a split plane → no overlap
- Splitting one triangle does not help much; need to split them all to get the benefits
- Same reasoning holds on multiple levels (level 0, level 1, level 2, level 3, ...)
- Look at all spatial median planes that intersect a triangle and split it with the dominant one
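One way to find the dominant plane for a bounding box is sketched below. The assumptions are mine: coordinates are normalized to the unit cube, levels cycle through the axes in x, y, z order (level 0 is x = 0.5, level 1 is y = 0.5, and so on), and "dominant" means the lowest level. The function name is hypothetical.

```python
import math


def dominant_plane(box_min, box_max, max_depth=20):
    """Return (level, axis, position) of the most important spatial median plane
    lying strictly inside the box, or None if no plane up to max_depth crosses it."""
    best = None
    for axis in range(3):
        lo, hi = box_min[axis], box_max[axis]
        for depth in range(1, max_depth + 1):
            spacing = 1.0 / (1 << depth)
            # First multiple of `spacing` strictly greater than lo.
            pos = (math.floor(lo / spacing) + 1) * spacing
            if pos < hi:
                level = 3 * (depth - 1) + axis
                if best is None or level < best[0]:
                    best = (level, axis, pos)
                break  # coarsest crossing plane on this axis found
    return best
```

Searching from the coarsest depth means the first plane found on each axis is the one that separates the box highest up in the initial BVH.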
Algorithm

1. Allocate memory for a fixed split budget
2. Calculate a priority value for each triangle
3. Distribute the split budget among triangles, proportional to their priority values
4. Split each triangle recursively, distributing the remaining splits according to the size of the resulting AABBs
Split priority

$$\text{priority} = \left( 2^{-\text{level}} \cdot \left( A_{\text{aabb}} - A_{\text{ideal}} \right) \right)^{1/3}$$

- $2^{-\text{level}}$: crosses an important spatial median plane?
- $A_{\text{aabb}} - A_{\text{ideal}}$: has large potential for reducing surface area?
- Product: concentrate on triangles where both apply
- Exponent 1/3: …but leave something for the rest, too
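The priority formula and the proportional budget distribution can be sketched together. This is a simplified illustration: `area_ideal` is taken as an input because the slide does not define it, the allocation simply rounds down (so a few splits may go unused), and both function names are hypothetical.

```python
def split_priority(level: int, area_aabb: float, area_ideal: float) -> float:
    """priority = (2^(-level) * (A_aabb - A_ideal))^(1/3), clamped at zero."""
    return (2.0 ** (-level) * max(area_aabb - area_ideal, 0.0)) ** (1.0 / 3.0)


def distribute_budget(priorities, budget: int):
    """Give each triangle a share of the split budget proportional to its priority."""
    total = sum(priorities)
    if total <= 0:
        return [0] * len(priorities)
    return [int(budget * p / total) for p in priorities]
```

The cube root compresses the range of priorities, so high-priority triangles still get most of the budget while the rest receive a share too.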
Results

Compare against 4 CPU and 3 GPU builders
- 4-core i7 930, NVIDIA GTX Titan
- Average of 20 test scenes, multiple viewpoints
Ray tracing performance

[Bar chart: ray tracing performance of each builder relative to SweepSAH = 100%, on an axis from 0% to 140%.]

High-quality CPU builders
- SweepSAH [MacDonald]: 100%
- SBVH [Stich]: 131%
- Tree rotations [Kensler]
- Iterative reinsertion [Bittner]

Fast GPU builders
- LBVH [Karras], HLBVH [Garanzha], GridSAH [Garanzha]: 67% – 69%
- SBVH (131%) is almost 2× as fast

Our method
- TRBVH (no splits): 96% of SweepSAH
- TRBVH + 30% splits: 91% of SBVH
Effective performance

[Figure: effective performance (0% to 140%) against number of rays per frame (1M to 1T), one curve per builder.]

- Tree rotations [Kensler] and iterative reinsertion [Bittner]: not Pareto-optimal
- HLBVH [Garanzha] and GridSAH [Garanzha]: not Pareto-optimal
- Remaining candidates: SBVH [Stich], SweepSAH [MacDonald], LBVH [Karras], our method (no splits), our method (30% splits)
- 7M–60G rays/frame → our method is the best choice
- Below 7M → LBVH
- Above 60G → SBVH
Conclusion

General framework for optimizing trees
- Inherently parallel
- Approximate restructuring → larger treelets?

Practical GPU-based BVH builder
- Best choice in a large class of applications
- Adjustable quality–speed tradeoff

Will be integrated into NVIDIA OptiX
Thank you

Acknowledgements
- Samuli Laine
- Jaakko Lehtinen
- Sami Liedes
- David McAllister
- Anonymous reviewers
- Anat Grynberg and Greg Ward for CONFERENCE
- University of Utah for FAIRY
- Marko Dabrovic for SIBENIK
- Ryan Vance for BUBS
- Samuli Laine for HAIRBALL and VEGETATION
- Guillermo Leal Laguno for SANMIGUEL
- Jonathan Good for ARABIC, BABYLONIAN and ITALIAN
- Stanford Computer Graphics Laboratory for ARMADILLO, BUDDHA and DRAGON
- Cornell University for BAR
- Georgia Institute of Technology for BLADE