An Interactive Out-of-Core Framework for Visualizing Massively Complex Models

EGSR04, Norrkoeping, Sweden. Ingo Wald, MPI Informatik. Andreas Dietrich and Philipp Slusallek, Saarland University.



TRANSCRIPT

Page 1: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

INFORMATIK

EGSR04 Norrkoeping, Sweden

An Interactive Out-of-Core Framework for Visualizing Massively Complex Models

Ingo Wald, MPI Informatik

Andreas Dietrich, Philipp Slusallek, Saarland University

Page 2: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

EGSR04 Norrkoeping, Sweden, May 21, 2004

Outline

• Motivation
  • Rendering complex models
  • Our Challenge: The "Boeing 777" model
• Our Approach
  • Out-of-core ray tracing for massive models
  • Memory management scheme
  • Proxy mechanism for representing not-yet-loaded data
• Results
• Conclusion and Future Work

Page 3: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Motivation – Are there "complex models" any more?

• Today: Steeply rising graphics performance
  • Faster GPUs (100+ million tris/s)
  • "Moore's Law": Performance doubles every 1.5 years…
• But: Model complexity rising (at least) as fast
  • Higher performance spent as soon as available
    – Best example: Games...
  • CAD & VR used for ever larger engineering projects
    – Collaboration of more and more designers
    – Each of whom models "his part" at full accuracy…

Immensely complex models

Page 4: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Today's Challenge: The "Boeing 777" – 350M Triangles

Page 5: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Complex Models: Previous Work

• Brute force rasterization
  • Use fastest available graphics hardware
  • PC GFX cards today: ~100 MTri/s
    Even at theoretical peak performance: several seconds per frame
    Usually try to reduce the number of triangles to be rendered
• Mesh simplification
  • Edge collapse, vertex removal, remeshing, etc.
  • Often requires "useful" input meshes
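A rough back-of-the-envelope check of the "several seconds per frame" figure, using the 350 million triangle count quoted for the 777 and the ~100 MTri/s peak rate from this slide. The function name is illustrative only, and the estimate ignores everything except raw triangle throughput.

```python
def brute_force_frame_time(num_triangles: float, tris_per_second: float) -> float:
    """Seconds per frame if every triangle is rasterized once at the given rate."""
    return num_triangles / tris_per_second

# 350 million triangles (the 777 model) at a theoretical peak of 100 MTri/s.
seconds = brute_force_frame_time(350e6, 100e6)
print(f"{seconds:.1f} s per frame, {1.0 / seconds:.2f} fps")
```

Even under this idealized assumption the frame rate stays well below one frame per second, which is why the approaches on these slides try to reduce the number of triangles that reach the hardware.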

Page 6: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Complex Models: Previous Work

• Occlusion Culling
  • Visibility preprocessing (from region, from point, etc.)
  • Hierarchical Z-Buffer
  • Can only help if there is enough occlusion
• System solutions (MMR, GigaWalk, iWalk)
  • Build on combination of ideas
    – Visibility precomputation + occlusion culling + LODs + …
  • Problem: Individual techniques already problematic
    – Complex precomputation and data structures
    – Often suffers from artefacts (popping etc.)

Page 7: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Complex Models: Previous Work

• QSplat
  • Hierarchical point-sampled representation
  • Best for locally smooth meshes
  • Problematic for high depth complexity
• Randomized Z-Buffer
  • Randomly selects subset of triangles to be rendered
  • Best for almost-random data (tree leaves etc.)
• Several Others
  • Impostors, Textured depth meshes, …

Page 8: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Today's Challenge: The "Boeing 777"

• 350 million triangles
  • 12 GB just for vertex positions, 35-70 GB incl. BSPs
• Complex geometrical structure
  • Unstructured "soup" of triangles
    – Often self-intersecting, coplanar, and badly shaped
  • Complex interwoven parts like pipes, cables, …
  • Low degree of occlusion
• Goal: Render interactively on single PC
  • Dual-Opteron 246 (1.8 GHz) w/ 6 GB RAM (or less)
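The "12 GB just for vertex positions" figure can be reproduced with simple arithmetic. The sketch below assumes (this is not stated on the slide) that every triangle stores its own three vertices and that each coordinate is a 32-bit float; the constant names are illustrative.

```python
TRIANGLES = 350_000_000
VERTICES_PER_TRIANGLE = 3
FLOATS_PER_VERTEX = 3      # x, y, z
BYTES_PER_FLOAT = 4        # assumption: single precision, no vertex sharing

position_bytes = (
    TRIANGLES * VERTICES_PER_TRIANGLE * FLOATS_PER_VERTEX * BYTES_PER_FLOAT
)
ram_bytes = 6 * 10**9      # the 6 GB target machine from the slide

print(f"vertex positions: {position_bytes / 1e9:.1f} GB")
print(f"ratio to 6 GB RAM: {position_bytes / ram_bytes:.1f}x")
```

Under these assumptions the positions alone come to roughly twice the RAM of the target PC, before any BSP data is counted.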

Page 9: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Today's Challenge: The "Boeing 777"

Page 10: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Today's Challenge: The "Boeing 777"

Page 11: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Today's Challenge: The "Boeing 777"

Page 12: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Today's Challenge: The "Boeing 777"

Same complexity all over the model…

Page 13: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Problems in the 777

• Unorganized "soup" of triangles

Page 14: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Problems in the 777

• Complex, interwoven geometry
  • Problematic for simplification-style algorithms
  • High depth complexity

Page 15: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Problems in the 777

• Low degree of occlusion
  • Visibility Preprocessing / Occlusion Culling?
  • Even perfect occlusion culling generates millions of tris…

Page 16: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Complex Models: Previous Work

• Conclusion
  • Most previous approaches problematic for 777-style models
    – Note: Same problem with most real-world CAD models
    Need another approach…

Page 17: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Complex Models: Previous Work

• Conclusion
  • Most previous approaches problematic for 777-style models
    – Note: Same problem with most real-world CAD models
    Need another approach…
• Idea: Ray Tracing logarithmic in #triangles
  Ideal for complex models
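To illustrate what "logarithmic in #triangles" means, the sketch below computes the depth of an idealized, perfectly balanced binary hierarchy with one triangle per leaf. This is a simplifying assumption for illustration only; the cost of traversing a real BSP tree depends on how it was built and on the scene.

```python
import math

def idealized_tree_depth(num_triangles: int) -> float:
    """Depth of a perfectly balanced binary hierarchy with one triangle per leaf."""
    return math.log2(num_triangles)

for count in (1_000_000, 350_000_000):
    print(f"{count:>11,} triangles: depth of about {idealized_tree_depth(count):.1f}")
```

In this idealized picture, going from one million to 350 million triangles adds fewer than nine levels to the tree, whereas a brute-force pass over all triangles grows by a factor of 350.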

Page 18: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Basic Idea: Ray Tracing Ideal for Complex Models…

• Proof by example: Sunflowers model…
  • 1 billion triangles (3x the 777)
  • Interactive performance on OpenRT engine [Wald03]
    – Even including shadows, transparency, textures, etc.

Are 350 million still a problem?

Page 19: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Basic Idea: Ray Tracing Ideal for Complex Models…

• Caveat
  • Sunflowers uses instantiation, so it easily fits into <1 GB RAM
  • 777: individual triangles, 35-70 GB of data
• First test: On SUN SunFire 12k w/ 96 GB RAM
  • Not a problem – it just works…
• On desktop PC:
  • Typically 2 to (at most) 8 GB RAM

Need out-of-core (OOC) mechanism

Page 20: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

OOC Ray Tracing

• Pharr 1997: Memory Coherent Ray Tracing
  • Manual caching of scene geometry
  • Extensive reordering of rays to minimize disk I/O
  • But: Only for offline rendering
• Wald 2001: Interactive OOC Ray Tracing
  • Same idea as MCRT, but interactive
    – Caching on "chunks" of ~1500 triangles
  • Minimal reordering: Only to hide loading latency
    – Assumed that all missing data can be loaded every frame
      Only a few cache misses tolerable

Page 21: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

OOC Ray Tracing

• Loading all missing data every frame?
  • Lots of data required even for small camera movement
    Loading all missing data in same frame not possible
    Must tolerate that some rays must be cancelled (due to lack of data)

Need to cancel "faulting" rays
  • Need to detect which scene access will stall
    OOC Memory management
  • Need to find replacement color for cancelled ray
    Geometry Proxies

Page 22: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

OOC Memory Management

• Lessons learned from [Wald01]
  • Streaming precomputation good
  • Object-based caching on 1500-triangle blocks not good
    – Extensive replication when generating 1500-triangle blocks
    – Fragmentation (both internal and external)
    – Bad cache granularity
    – Memory management and data handling quite costly
• Better: Use tile-based caching à la Linux
  • Build large BSP on disk (streaming preprocess)
  • "mmap" into 64-bit address space
  • Let CPU and OS do I/O and address translation
    – But: Need to avoid page faults
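A minimal sketch of the memory-mapping idea, using Python's standard `mmap` module as a stand-in for the system call named on the slide. The temporary file, the tile size, and the "tile starts with its own index" layout are all hypothetical and exist only to make the example runnable; the talk does not describe the on-disk format.

```python
import mmap
import os
import struct
import tempfile

TILE_SIZE = mmap.ALLOCATIONGRANULARITY  # hypothetical tile size for this sketch

# Stand-in for the preprocessed BSP file: four tiles, each starting with its index.
with tempfile.NamedTemporaryFile(delete=False) as f:
    for tile_id in range(4):
        f.write(struct.pack("<I", tile_id).ljust(TILE_SIZE, b"\0"))
    path = f.name

with open(path, "rb") as f:
    # Map the whole file read-only; the OS pages data in lazily on first access.
    view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Reading from tile 2 only touches the pages that back that tile.
    offset = 2 * TILE_SIZE
    (value,) = struct.unpack("<I", view[offset:offset + 4])
    print(value)
    view.close()

os.remove(path)
```

The sketch shows only the mapping itself. An access to a page that is not yet resident would block until the OS has loaded it, which is the page fault the slide says must be avoided for interactive rendering.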

Page 23: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

OOC Memory Management

• Tile table: Stores which tiles are in memory
  • Organized as hash table for efficient access
    – Requires only a few kilobytes
    – Each lookup costs only a few bit operations and compares
• On "cache" miss
  • Cancel faulting ray before access, to avoid OS page fault
  • Put tile ID into request queue
  • Page in tile asynchronously in "tile fetcher thread"
• If memory is fully used
  • Asynchronously evict tiles using "second chance"

Control what is paged in and out at what time
Avoid any stalls of the rendering threads
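The interplay of tile table, request queue, and "second chance" eviction can be sketched as a toy, single-threaded model. The class and method names are hypothetical; in the system described on the slide the fetching runs asynchronously in a separate thread, which this sketch replaces with an explicit `fetch_one()` call for clarity.

```python
from collections import deque

class TileCache:
    """Toy tile table: tracks resident tiles, queues misses, evicts by second chance."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = {}        # tile_id -> referenced bit (dict is a hash table)
        self.clock = deque()      # resident tile ids in eviction order
        self.requests = deque()   # tile ids waiting for the fetcher

    def lookup(self, tile_id):
        """Return True if the tile is resident; otherwise queue it and report a miss."""
        if tile_id in self.resident:
            self.resident[tile_id] = True      # mark as recently used
            return True
        if tile_id not in self.requests:
            self.requests.append(tile_id)
        return False                           # the caller cancels the ray

    def fetch_one(self):
        """One fetcher step: make the next requested tile resident, evicting if full."""
        if not self.requests:
            return None
        tile_id = self.requests.popleft()
        while len(self.resident) >= self.capacity:
            victim = self.clock.popleft()
            if self.resident[victim]:
                self.resident[victim] = False  # second chance: clear bit, requeue
                self.clock.append(victim)
            else:
                del self.resident[victim]      # not used since last pass: evict
        self.resident[tile_id] = False
        self.clock.append(tile_id)
        return tile_id

cache = TileCache(capacity=2)
hits = [cache.lookup(t) for t in (0, 1)]   # both miss and are queued
cache.fetch_one()
cache.fetch_one()                          # tiles 0 and 1 are now resident
cache.lookup(0)                            # hit: sets the referenced bit of tile 0
cache.lookup(2)                            # miss: tile 2 is queued
cache.fetch_one()                          # tile 0 gets a second chance, tile 1 goes
print(sorted(cache.resident))              # [0, 2]
```

The key property is that `lookup` never blocks: a miss only records a request and returns, so the rendering code can cancel the ray instead of stalling on I/O.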

Page 24: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

OOC Memory Management

• So far
  • Can efficiently detect and avoid page faults
    – And asynchronously load missing data
  • Render at full accuracy once all data is available
• Performance
  • 2-3 fps @ 1280x1024
    – Single PC

Page 25: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

OOC Memory Management

• So far
  • Can efficiently detect and avoid page faults
    – And asynchronously load missing data
  • Render at full accuracy once all data is available
• Performance
  • 2-3 fps @ 1280x1024
    – Single PC
• Question: What to do with cancelled rays? (marked red here)

Page 26: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Cancelling rays

• Our approach: Shade ray using "proxies"
  • Build proxy for each subtree addressed by pointer crossing tile boundaries
  • Precomputation: Sample subtree's volume with rays
    – Record shading information (normal and material properties)
    – Store information for several discretized directions
      Similar to LightField
  • For faulting ray during rendering: Fetch corresponding proxy
    – Interpolate shading information from closest 3 directions
• Only little memory affordable for proxies
  • Usually only 28 directions per proxy
  • With discretized normal and color: 66-344 MB for entire model
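A minimal sketch of the proxy lookup for a cancelled ray, under several assumptions that are not from the talk: each proxy is reduced to a list of (direction, RGB color) samples, the three closest directions are those with the largest cosine to the ray direction, and the blend weights are the clamped cosines. The six axis-aligned directions below are a small stand-in for the 28 directions mentioned on the slide.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def shade_from_proxy(ray_dir, proxy):
    """Blend the colors stored for the 3 sampled directions closest to ray_dir.

    proxy: list of (unit_direction, rgb_color) pairs.
    Weights are clamped cosines to ray_dir (an assumption made for this sketch).
    """
    d = normalize(ray_dir)
    scored = sorted(
        ((sum(a * b for a, b in zip(d, direction)), color)
         for direction, color in proxy),
        key=lambda item: item[0],
        reverse=True,
    )[:3]
    weights = [max(cos, 0.0) for cos, _ in scored]
    total = sum(weights) or 1.0
    return tuple(
        sum(w * color[i] for w, (_, color) in zip(weights, scored)) / total
        for i in range(3)
    )

# Hypothetical proxy with six sampled directions and one color per direction.
proxy = [
    ((1, 0, 0), (1.0, 0.0, 0.0)), ((-1, 0, 0), (0.2, 0.2, 0.2)),
    ((0, 1, 0), (0.0, 1.0, 0.0)), ((0, -1, 0), (0.2, 0.2, 0.2)),
    ((0, 0, 1), (0.0, 0.0, 1.0)), ((0, 0, -1), (0.2, 0.2, 0.2)),
]
along_x = shade_from_proxy((1, 0, 0), proxy)
diagonal = shade_from_proxy((1, 1, 0), proxy)
print(along_x, diagonal)
```

A ray along a sampled direction reproduces that sample's color, while a ray between two sampled directions gets a blend of both. With so few directions per proxy the result is necessarily coarse.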

Page 27: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Proxy Quality – Overview

Outside View (2 GB footprint)

Immediately after startup (tiny fraction of data loaded)

[Image: no proxies]

Page 28: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Proxy Quality – Overview
Outside View (2GB footprint)

Immediately after startup (tiny fraction of data loaded)

[Images: no proxies | with proxies]

Page 29: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Proxy Quality – Overview
Outside View (2GB footprint)

After loading for several seconds (roughly equal amount of geometry loaded)

[Images: without proxies | with proxies]

Page 30: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Results: Proxies

• Proxy Quality
  • Not as good as expected (sampling too coarse)
• Still: Sufficient for navigating…
  • Immediate visual feedback after loading
  • … and at any time during interaction
• Artifacts quickly disappear while loading
  • Only use proxies while data still missing

Page 31: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Results: Shadows

• So far: Only concentrated on simple shading
• Ray Tracing: Can easily add shadows
  • OOC memory management scheme and proxies completely transparent to secondary rays…
  • No details here…
    – Just show effect and importance of using shadows…

Page 32: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Results: Shadows

• Ray Tracing: Can easily add shadows
  • Cost rather small (coherence: data already in cache)
  • Significantly improved „sense of depth“

Page 33: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Results: Shadows

• Ray Tracing: Can easily add shadows
  • Cost rather small (coherence: data already in cache)
  • Significantly improved „sense of depth“

Page 34: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Results: Shadows

• Ray Tracing: Can easily add shadows
  • Cost rather small (coherence: data already in cache)
  • Significantly improved „sense of depth“

Page 35: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Summary

• Proposed OOC RT for Complex Models
  • Clever memory management
  • Plus proxies as replacements for missing data
• Results
  • Fast visual feedback already during loading
  • Render full-res model once loaded
  • Achieve interactive fullscreen performance
    – 2-3 fps @ 1280x1024 on single desktop PC
    – Including support for shadows

Page 36: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Future Work

• Future work
  • Improve proxy quality
  • Cache-aware parallelization
  • Interactive lighting simulation in 777
• Acknowledgements
  • Boeing Corp
  • Our SysAdmin group

Page 37: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Questions?

Page 38: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Today's Challenge: The „Boeing 777“

Page 39: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Rendering Performance

• Use single desktop PC
  • AMD dual Opteron 1.8GHz PC
  • 6GB RAM
• Rendering Performance
  • Outside view: 2-3 fps @ 1280x1024
    – Even faster in closeups
Fullscreen performance on single PC!
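To put the frame rate into perspective, the quoted numbers can be converted into a ray budget. This is only a rough estimate under the assumption of exactly one primary ray per pixel and no secondary rays, which the slide does not state; the function name is hypothetical.

```python
def primary_rays_per_second(width, height, fps):
    """Primary rays per second, assuming one primary ray per pixel."""
    return width * height * fps

# Resolution and frame rates as quoted on the slide.
low = primary_rays_per_second(1280, 1024, 2)
high = primary_rays_per_second(1280, 1024, 3)
print(f"{low / 1e6:.1f} to {high / 1e6:.1f} million primary rays per second")
```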

Page 40: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Proxy Quality – Wheel Example

• Without Proxies

Page 41: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Proxy Quality – Wheel Example

• With Proxies

Page 42: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Proxy Quality – Overview
Outside View (2GB footprint)

After loading for several seconds vs. full-scale model

[Images: without proxies | with proxies | entire model]

Page 43: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Motivation

• Practical example: Max. model size at CGUdS
  • 2000/01: „Soda Hall“ (1.5 Mtri)
  • 2001/02: „UNC PowerPlant“ (12.5 Mtri)
  • 2002/03: „Sunflowers“ (~1,000 Mtri, instantiated)
  • 2003/04: „Boeing 777“ (350 Mtri, individual triangles)
• Today's industrial CAD models (rule of thumb)
  • One car: 10+ Mtri
  • One plane: 100+ Mtri
  • One cruise ship / factory / nuclear reactor: up to 1+ Gtri…
• Scientific computing:
  • LLNL Isosurface: 270+ time slices, 470 Mtri / slice…

Page 44: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Motivation – Are there „complex models“ any more?

• Today: Steeply rising graphics performance
  • Faster Desktop PCs (3+GHz CPUs, 2+GB RAM)
  • Faster GPUs (100+ million tris/s)
  • Better Algorithms
• Performance increase still ongoing
  • „Moore's Law“: Performance doubles every 1.5 years…
  • For GPUs: Even faster growth than for CPUs
„Affordable“ model size steeply rising
  • What was a complex model 3 years ago can today often be rendered on a laptop…
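As a small worked example, the doubling rule quoted on this slide can be set against the model sizes listed earlier in the talk (1.5 Mtri in 2000/01, 350 Mtri in 2003/04). Treat it as a back-of-the-envelope comparison only: it assumes a three-year span and compares just two data points from one group's list of largest models.

```python
def growth_factor(years, doubling_period_years=1.5):
    """Growth after `years` if performance doubles every doubling period."""
    return 2.0 ** (years / doubling_period_years)

# Assumption: 2000/01 to 2003/04 is taken as a span of 3 years.
hardware_growth = growth_factor(3.0)

# Model sizes in Mtri as quoted in the talk: Soda Hall vs. Boeing 777.
model_growth = 350.0 / 1.5

print(f"hardware: x{hardware_growth:.0f}, model size: x{model_growth:.0f}")
```

Under these assumptions, model size grew far faster than the doubling rule alone would predict for hardware over the same period.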

Page 45: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Proxy Results

• Proxy memory consumption
  • 28 directions, 2x2 bytes for normal+color: ~100 bytes
  • Full model: 66-344MB, quite affordable on 6GB PC
• Proxy Performance
  • No performance impact at all!
    – Proxy access faster than tracing the ray

Tile size and BSP    16KB tiles    64KB tiles
Deep BSPs            1.2GB         344MB
Shallow BSPs         66MB          30MB
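The per-proxy figure can be checked with a little arithmetic. The sketch below counts only the raw payload (28 directions times 2 bytes each for normal and color), which is in line with the quoted ~100 bytes. The proxy counts it derives are rough estimates implied by the table, not numbers given in the talk, and they assume 1 MB = 10^6 bytes and no per-proxy overhead.

```python
DIRECTIONS_PER_PROXY = 28
BYTES_PER_NORMAL = 2   # discretized normal, as quoted on the slide
BYTES_PER_COLOR = 2    # discretized color, as quoted on the slide
MB = 10**6             # assumption: decimal megabytes

def proxy_bytes(directions=DIRECTIONS_PER_PROXY):
    """Raw payload of one proxy, ignoring any header or padding."""
    return directions * (BYTES_PER_NORMAL + BYTES_PER_COLOR)

def implied_proxy_count(total_bytes):
    """Rough number of proxies that fit into the given total."""
    return total_bytes // proxy_bytes()

for label, total_mb in [("shallow BSPs, 16KB tiles", 66),
                        ("deep BSPs, 64KB tiles", 344)]:
    count = implied_proxy_count(total_mb * MB)
    print(f"{label}: about {count} proxies")
```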

Page 46: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

Motivation

Problem: Model complexity rises even faster!
  • Higher performance spent as soon as available
  • Best example: Games...
  • More detailed models
    – Car industry today: 2 Mtri for a steering wheel
• Faster computers also drive user demands
  • Higher accuracy (structural analysis, FEM, …)
    – Finer tessellation
• CAD/VR/DMU increasingly important in industry
  • One model edited by more and more users…
    – Each of which models at full accuracy…

Page 47: An Interactive Out-of-Core  Framework for Visualizing Massively Complex Models

OOC Memory Management

• Now: Need to detect page faults before they happen
  • If not, access to data will stall thread until data available
• Several possible options:
  • Detect using OS signals [deMarle, PGV04]
    – Very elegant solution
    – But: Can't easily cancel rays after signal was raised
  • Detect via checking memory availability („mincore“)
    – OS call too costly for every access
  • Our approach: Keep track of which data is in memory
    – Control what OS pages in and out
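The idea of tracking residency in the application, rather than asking the OS on every access, can be sketched as a small simulation. Everything here is an assumption made for illustration: the class and function names are hypothetical, the FIFO eviction is a placeholder policy, and a real system would operate on memory-mapped tiles with a separate loader thread instead of Python dictionaries.

```python
from collections import deque

class TileCache:
    """Application-side record of which tiles are currently in memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = {}      # tile_id -> tile data, in insertion order
        self.pending = deque()  # tiles requested but not yet loaded

    def lookup(self, tile_id):
        """Return the tile data, or None after queueing a load request.
        This never blocks, so a rendering thread is never stalled."""
        data = self.resident.get(tile_id)
        if data is None and tile_id not in self.pending:
            self.pending.append(tile_id)
        return data

    def service_one(self, load):
        """Loader step: fetch one pending tile, evicting the oldest
        resident tile if the cache is full (assumed FIFO policy)."""
        if not self.pending:
            return None
        tile_id = self.pending.popleft()
        if len(self.resident) >= self.capacity:
            oldest = next(iter(self.resident))
            del self.resident[oldest]
        self.resident[tile_id] = load(tile_id)
        return tile_id

def shade(cache, tile_id, proxy_color):
    """Shade from real data if resident; otherwise cancel the ray
    and fall back to the proxy while the tile loads."""
    data = cache.lookup(tile_id)
    if data is None:
        return proxy_color
    return data["color"]
```

A first call to `shade` for a missing tile returns the proxy value and queues the load. Once `service_one` has run for that tile, the same call returns the real data.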