Post on 25-Dec-2015

Data Management Techniques

Sung-Eui Yoon, KAIST

URL: http://jupiter.kaist.ac.kr/~sungeui/

Data Avalanche (or Data Explosions)

There is too much data out there!

www.cs.umd.edu/class/spring2001/cmsc838b/Project/Parija_Spacco/images/

Geometric Data Avalanche

● Massive geometric data
  ● Due to advances in modeling, simulation, and data capture techniques

● Time-varying data (4D data sets)

CAD Model: Double Eagle Oil Tanker

82 million triangles (4 gigabytes)

CAD Model: Boeing 777

Ray Tracing Boeing 777, 470 million triangles

Excerpted from SIGGRAPH course note on massive model rendering

Scanned Model: St. Matthew Model

372 million triangles (10 GB), www.cyberware.com

Possible Solutions?

● Will hardware improvements address the data avalanche?
  ● Moore’s law: the number of transistors roughly doubles every 18 months
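As a rough worked example of the doubling rule above, the sketch below computes the growth factor implied by one doubling every 18 months. The function name `moore_growth_factor` is introduced here only for illustration.

```python
def moore_growth_factor(months: float, doubling_period_months: float = 18.0) -> float:
    """Growth factor implied by one doubling every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)


# Ten years (120 months) of doubling every 18 months: 2 ** (120 / 18).
ten_year_factor = moore_growth_factor(120)
print(f"{ten_year_factor:.1f}x")  # prints "101.6x"
```

Under this rule, transistor counts grow roughly a hundredfold in a decade.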

Current Architecture Trends

[Figure: accumulated growth rate during 1999~2009 (log scale); the surviving series labels are "access speed" and "disk access speed"]

Data access time becomes the major computational bottleneck!

Four Orthogonal Approaches

● Cache-coherent layouts
● Random-accessible compressed meshes
● Cache-oblivious ray reordering
● Hybrid parallel continuous collision detection

Overview

● Cache-coherent layouts
● Random-accessible compressed meshes
● Cache-oblivious ray reordering
● Hybrid parallel continuous collision detection

Cache-Coherent Layouts of Meshes

● One-dimensional data layout of a mesh
  ● Reduce the number of cache misses
● Cache-aware or cache-oblivious layouts
  ● Minimize the number of cache misses for a specific or various cache parameters (e.g., cache block size)
[Yoon et al. SIG05, VIS06, Euro06]

[Figure: a mesh with vertices va, vb, vc, vd and its one-dimensional layout: va, vb, vd, vc]
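To make the idea of a one-dimensional layout concrete, here is a minimal sketch that scores a vertex ordering by its total edge span: the sum, over all mesh edges, of the distance between the two endpoints in the layout. This metric is only a simple stand-in and not the metric of the cited papers. The edge list is an assumption as well, since the connectivity of the four-vertex mesh in the figure is not recoverable from the transcript.

```python
# Assumed connectivity (hypothetical): two triangles (va, vb, vd) and (vb, vc, vd).
EDGES = [("va", "vb"), ("va", "vd"), ("vb", "vd"), ("vb", "vc"), ("vc", "vd")]


def total_edge_span(edges, layout):
    """Sum of |position(a) - position(b)| over all edges for a 1D vertex layout."""
    position = {vertex: index for index, vertex in enumerate(layout)}
    return sum(abs(position[a] - position[b]) for a, b in edges)


slide_layout = ["va", "vb", "vd", "vc"]  # the ordering shown on the slide
other_layout = ["va", "vc", "vb", "vd"]  # an arbitrary alternative ordering

print(total_edge_span(EDGES, slide_layout))  # prints 7
print(total_edge_span(EDGES, other_layout))  # prints 9
```

A smaller span means that vertices connected by an edge tend to sit closer together in memory, which is the intuition behind placing them in the same cache block.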

Block-based I/O Model [Aggarwal and Vitter 88]

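[Figure: memory hierarchy in which a CPU or GPU accesses a fast memory or cache, a slow memory, and a disk, with data moved between levels by block transfer. Access time labels in the figure: 1 sec, 10^-4 sec, 10^-6 sec]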
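The block-based model can be sketched as a small simulation that counts block transfers for a sequence of accesses. Everything concrete below is an assumption made for illustration: the 64 x 64 grid, the block size of 16 elements, the cache of 4 blocks, the LRU replacement policy, and the use of a Z-order (Morton) layout as a simple stand-in for a layout that does not depend on one access pattern. It is not the layout algorithm of the cited papers.

```python
from collections import OrderedDict


def count_block_transfers(addresses, block_size, cache_blocks):
    """Count block transfers into a fast memory holding `cache_blocks` blocks (LRU)."""
    cache = OrderedDict()
    transfers = 0
    for address in addresses:
        block = address // block_size
        if block in cache:
            cache.move_to_end(block)       # hit: mark block as most recently used
        else:
            transfers += 1                 # miss: one block transfer from slow memory
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return transfers


def row_major(x, y, n):
    return y * n + x


def morton(x, y, n=None, bits=16):
    """Z-order index: interleave the bits of x and y (n is unused, kept for symmetry)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


N, BLOCK, CACHE = 64, 16, 4
row_sweep = [(x, y) for y in range(N) for x in range(N)]
column_sweep = [(x, y) for x in range(N) for y in range(N)]


def transfers(layout, sweep):
    return count_block_transfers([layout(x, y, N) for x, y in sweep], BLOCK, CACHE)


for name, layout in (("row-major", row_major), ("z-order", morton)):
    print(name, transfers(layout, row_sweep), transfers(layout, column_sweep))
```

With these parameters the row-major layout needs 256 transfers for the row sweep but 4096 for the column sweep, while the Z-order layout needs 1024 for both. The toy example only illustrates why a layout that is not tuned to a single access pattern can be attractive.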

Applications

● View-dependent meshes
  ● View-dependent rendering
● Triangle meshes
  ● Isocontour extractions
● Hierarchies
  ● Ray tracing
  ● Collision detection

View-Dependent Rendering using LODs

Improving GPU vertex cache utilization

GeForce 6800 (January 2005)

Applications

● View-dependent meshes
  ● View-dependent rendering
● Triangle meshes
  ● Isocontour extractions
● Hierarchies
  ● Ray tracing
  ● Collision detection

Puget Sound, 134 M triangles

Isocontour z(x,y) = 500 m

Achieves up to 20X improvement on isocontouring
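For readers unfamiliar with isocontour extraction, the following minimal sketch finds the cells of a height grid that the contour z(x,y) = 500 crosses. The tiny height field and the function name `crossed_cells` are hypothetical; a real extractor would also compute the contour segments inside each cell.

```python
def crossed_cells(height, iso):
    """Return (x, y) of every grid cell whose corner heights straddle `iso`."""
    cells = []
    for y in range(len(height) - 1):
        for x in range(len(height[0]) - 1):
            corners = (
                height[y][x], height[y][x + 1],
                height[y + 1][x], height[y + 1][x + 1],
            )
            if min(corners) < iso <= max(corners):
                cells.append((x, y))
    return cells


# Hypothetical 3 x 3 height samples in meters.
height = [
    [400, 450, 520],
    [420, 480, 560],
    [430, 490, 495],
]
print(crossed_cells(height, 500))  # prints [(1, 0), (1, 1)]
```

Each visited cell reads four neighboring samples, so the order in which samples are stored affects how many cache misses the extraction incurs.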

Applications

● View-dependent meshes
  ● View-dependent rendering
● Triangle meshes
  ● Isocontour extractions
● Hierarchies
  ● Ray tracing
  ● Collision detection

Achieves 30% ~ 300% performance improvement

Advantages

● General
  ● Works well for various applications
● Cache-oblivious
  ● Can benefit all levels of the memory hierarchy (e.g., CPU/GPU caches, memory, and disk)
● No modification of runtime applications
  ● Only layout computation

Source code is available as a library called OpenCCL

Overview

● Cache-coherent layouts
● Random-accessible compressed meshes
● Cache-oblivious ray reordering
● Hybrid parallel continuous collision detection

Random-Accessible Compressed Data

● Compression methods of meshes and hierarchies
  ● Reduce the memory requirements
  ● Support random accesses on meshes and hierarchies
  ● Can be useful to many different applications
[Kim et al. Tech. Report 09; Kim et al., TVCG 09; Yoon and Lindstrom, VIS 07]

Hierarchical-Culling oriented Compact Meshes (HCCMeshes)

● Consists of two parts:
  ● i-HCCMeshes (in-core representation)
  ● o-HCCMeshes (out-of-core representation)

Data Access Framework

[Figure: the user sends a request to a data pool held in main memory and receives the data]

Data Access Framework - Out-of-Core Technique

[Figure: the user requests a cluster by cluster ID from main memory, which holds cached data; the data pool of clusters c0, c1, ..., cn resides on an external drive]

HCCMeshes

[Figure: data access framework with HCCMeshes. Clusters (c0 to c13, ..., cm) are stored as compressed data; a cluster requested by cluster ID passes through decompression (Decomp.) steps on the way from the external drive (o-HCCMesh) to the cached data in main memory (i-HCCMesh) and on to the user]

Supports hierarchical random access!
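The cluster-based access pattern in the figure can be sketched as a small store that keeps every cluster compressed and decompresses one only when it is requested by cluster ID. This is a simplified illustration under several assumptions: a Python dictionary stands in for the external drive, `zlib` stands in for the mesh-specific compression of HCCMeshes, the cache uses LRU replacement, and the class name `CompressedClusterStore` is hypothetical.

```python
import zlib
from collections import OrderedDict


class CompressedClusterStore:
    """Each cluster is compressed on its own, so any cluster can be decoded alone."""

    def __init__(self, clusters, cache_size):
        # Stand-in for the external drive: cluster ID -> compressed bytes.
        self._compressed = {cid: zlib.compress(data) for cid, data in clusters.items()}
        self._cache = OrderedDict()  # cached, decompressed clusters in main memory
        self._cache_size = cache_size
        self.decompressions = 0

    def get(self, cluster_id):
        if cluster_id in self._cache:
            self._cache.move_to_end(cluster_id)  # cache hit
            return self._cache[cluster_id]
        data = zlib.decompress(self._compressed[cluster_id])
        self.decompressions += 1
        self._cache[cluster_id] = data
        if len(self._cache) > self._cache_size:
            self._cache.popitem(last=False)      # evict least recently used cluster
        return data


# Six hypothetical clusters of 1 KB each.
clusters = {f"c{i}": bytes([i]) * 1024 for i in range(6)}
store = CompressedClusterStore(clusters, cache_size=2)
first = store.get("c3")  # random access: only cluster c3 is decompressed
```

Because clusters are compressed independently, a request touches only the cluster it needs, and the working set in main memory is bounded by the cache size.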


Main Benefits

● Use less memory and a smaller working set
  ● o-HCCMeshes have 20:1 compression ratios
  ● i-HCCMeshes have 6:1 compression ratios
● Improve runtime performance

Applications

● Whitted-style ray tracing
● LOD-based ray tracing
● Collision detection
● Photon mapping
● Non-photorealistic rendering

Source code is available as OpenRACM

Results


Overview

● Multi-resolution representations
● Random-accessible compressed meshes
● Cache-oblivious ray reordering
● Hybrid parallel continuous collision detection

Challenges

● Secondary rays show low ray coherence
● This results in low cache utilization
● When ray tracing massive models, expensive cache misses occur (e.g., L1/L2, main memory)

Landscape (>1000 M)

St. Matthew (372 M)

Goal

● Design an efficient algorithm for converting incoherent secondary rays into coherent ones
● Achieve a high cache coherence for these rays
● Improve the performance of ray tracing

Ray Reordering Framework

[Figure: ray generation takes camera information, ray reordering operates on a ray buffer, and ray processing produces hit points and material information; scene information is fetched through the caches (L1), main memory, and disk]

[Moon et al., under review]
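One way to picture the reordering stage is to sort the rays in the buffer so that rays starting in nearby parts of the scene are processed together. The sketch below does this with a Z-order (Morton) key computed from each ray's quantized origin. The choice of key is an assumption made for illustration, since the transcript does not describe the actual reordering measure; the function names are hypothetical, and a ray is represented as an `(origin, direction)` pair.

```python
def morton3(ix, iy, iz, bits=10):
    """Interleave the bits of three integer cell coordinates into one Z-order key."""
    code = 0
    for i in range(bits):
        code |= ((ix >> i) & 1) << (3 * i)
        code |= ((iy >> i) & 1) << (3 * i + 1)
        code |= ((iz >> i) & 1) << (3 * i + 2)
    return code


def reorder_rays(ray_buffer, scene_min, scene_max, bits=10):
    """Return the rays sorted by the Z-order key of their quantized origins."""
    cells = (1 << bits) - 1

    def key(ray):
        origin = ray[0]
        quantized = []
        for value, lo, hi in zip(origin, scene_min, scene_max):
            t = (value - lo) / (hi - lo)
            t = min(max(t, 0.0), 1.0)  # clamp origins outside the scene bounds
            quantized.append(int(t * cells))
        return morton3(*quantized, bits=bits)

    return sorted(ray_buffer, key=key)


# Four hypothetical rays in a unit-cube scene, interleaved between two regions.
ray_buffer = [
    ((0.90, 0.9, 0.9), (0.0, 0.0, -1.0)),
    ((0.10, 0.1, 0.1), (0.0, 1.0, 0.0)),
    ((0.95, 0.9, 0.9), (1.0, 0.0, 0.0)),
    ((0.12, 0.1, 0.1), (0.0, 0.0, 1.0)),
]
reordered = reorder_rays(ray_buffer, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
```

After sorting, the two rays near the origin corner are adjacent in the buffer and so are the two rays near the opposite corner, so consecutive rays are more likely to touch the same scene data.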


Applications

● Path tracing
● Photon mapping

Result – Path Tracing (Video)

● 104 M triangles (12.8 GB)
● 512*512 resolution
● 100 paths
● 8 area lights

Result – Photon Mapping

● 128 M triangles (15.7 GB)
● Cache 19% of all the data
● 4 area lights
● 13X speedup

Overview

● Multi-resolution representations
● Random-accessible compressed meshes
● Cache-oblivious ray reordering
● Hybrid parallel continuous collision detection

Collision Detection

● Collision detection is used in various fields

  ● Games, movies, scientific simulation, and robotics

[Figures from PIXAR, C. Lauterbach, and AION]

Discrete VS Continuous

Discrete collision detection (DCD)

[Figure: object positions at time step (i-1) and time step (i)]

Continuous collision detection (CCD)

[Figure: object positions at time step (i-1) and time step (i)]

Discrete collision detection (DCD)

[Figure: object positions at time step (i-1) and time step (i), marked with a "?"]

Discrete VS Continuous

                    Continuous CD    Discrete CD
Accuracy            Accurate         May miss collisions
Computation time    Expensive        Very fast
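The difference in accuracy can be shown with a minimal one-dimensional sketch: a point moves between two time steps and a thin wall lies on its path. The discrete test checks only the two sampled positions, while the continuous test checks the whole swept interval. Linear motion between the time steps is an assumption, and the function names are hypothetical.

```python
def discrete_hit(x_prev, x_curr, wall_lo, wall_hi):
    """DCD: test only the positions at time steps (i-1) and (i)."""
    return any(wall_lo <= x <= wall_hi for x in (x_prev, x_curr))


def continuous_hit(x_prev, x_curr, wall_lo, wall_hi):
    """CCD: test the interval swept between the two time steps (linear motion)."""
    lo, hi = min(x_prev, x_curr), max(x_prev, x_curr)
    return lo <= wall_hi and wall_lo <= hi


# A fast point passes through a thin wall between two time steps.
print(discrete_hit(0.0, 10.0, 4.9, 5.1))    # prints False: the collision is missed
print(continuous_hit(0.0, 10.0, 4.9, 5.1))  # prints True
```

For real triangle meshes the continuous test is far more expensive than this interval check, which is the cost side of the comparison above.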


Motivation

● Continuous collision detection
  ● Accurate, but slow for complex models
● Hardware trend
  ● CPUs and GPUs are increasing the number of cores
  ● Heterogeneous architectures
    ● Intel Larrabee architecture
● Previous approaches
  ● Utilize either multi-core CPUs or GPUs
  ● Not enough performance for interactive applications

Hybrid Parallel CCD [Kim et al. PG 09]

● Takes advantage of both:
  ● Multi-core CPU architectures
  ● GPU architectures
● Achieves interactive performance for various deforming models consisting of tens or hundreds of thousands of triangles

[Figure: CCD work distributed over several multi-core CPUs and several GPUs]
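A hybrid scheme of this kind can be pictured as a producer-consumer pipeline. In the sketch below, several CPU worker threads run a cheap culling test and push surviving pairs into a queue, and one consumer processes them in batches. The division of labor is an assumption made for illustration and is not taken from the slides: the consumer is an ordinary Python thread standing in for a GPU, and the tests are static overlaps of 2D boxes `(xmin, xmax, ymin, ymax)` rather than continuous tests on triangles. All names are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from queue import Queue


def x_overlap(a, b):
    return a[0] <= b[1] and b[0] <= a[1]


def y_overlap(a, b):
    return a[2] <= b[3] and b[2] <= a[3]


def cpu_cull(boxes, pair_chunk, candidates):
    """CPU side: cheap culling test, survivors go to the shared queue."""
    for i, j in pair_chunk:
        if x_overlap(boxes[i], boxes[j]):
            candidates.put((i, j))


def gpu_stand_in(boxes, candidates, hits, batch_size=4):
    """Stand-in for the GPU side: finish the test on candidate pairs in batches."""
    def flush(batch):
        hits.extend((i, j) for i, j in batch if y_overlap(boxes[i], boxes[j]))

    batch = []
    while True:
        pair = candidates.get()
        if pair is None:  # sentinel: all CPU workers are done
            break
        batch.append(pair)
        if len(batch) == batch_size:
            flush(batch)
            batch = []
    flush(batch)


def hybrid_collisions(boxes, cpu_workers=4):
    pairs = [(i, j) for i in range(len(boxes)) for j in range(i + 1, len(boxes))]
    chunks = [pairs[k::cpu_workers] for k in range(cpu_workers)]
    candidates, hits = Queue(), []
    consumer = threading.Thread(target=gpu_stand_in, args=(boxes, candidates, hits))
    consumer.start()
    with ThreadPoolExecutor(max_workers=cpu_workers) as pool:
        list(pool.map(lambda chunk: cpu_cull(boxes, chunk, candidates), chunks))
    candidates.put(None)
    consumer.join()
    return sorted(hits)


boxes = [(0, 2, 0, 2), (1, 3, 1, 3), (1.5, 4, 5, 6), (10, 11, 0, 1)]
colliding_pairs = hybrid_collisions(boxes)
print(colliding_pairs)  # prints [(0, 1)]
```

The queue decouples the two sides, so the consumer can work on earlier candidates while the CPU workers are still culling.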


Results

● Performance of HPCCD utilizing both CPUs and GPUs

Source code is available as a library called OpenCCD

Results


Conclusions

● Data explosion and the slow growth rate of data access speed
● Discussed four different techniques for data management
  ● Cache-coherent layouts
  ● Random-accessible compressed data
  ● Cache-oblivious ray reordering
  ● Hybrid continuous collision detection
● Applied to rendering and collision detection
  ● Observed meaningful performance improvement

Acknowledgements

● Research collaborators
  ● TaeJoon Kim, DukSu Kim, Pio Claudio, BooChang Moon, YongYoung Byun, JaePil Heo, SeungYong Lee, YongJin Kim, JaeHyuk Heo, John Kim, Peter Lindstrom, Valerio Pascucci, Dinesh Manocha
● Funding sources
  ● Microsoft Research Asia
  ● KAIST seed grant
  ● Ministry of Knowledge Economy
  ● Samsung
  ● Korea Research Foundation
