Data-driven Query Processing for Immersive Computational Turbulence Kalin Kanov Department of Computer Science Johns Hopkins University

Post on 23-Feb-2016


Page 1: Data-driven Query Processing for Immersive Computational Turbulence

Data-driven Query Processing for Immersive Computational Turbulence

Kalin Kanov
Department of Computer Science
Johns Hopkins University

Page 2: Data-driven Query Processing for Immersive Computational Turbulence

The Big Picture

Scientific disciplines have developed a computational branch
  Models without closed-form solutions are solved numerically
  This has led to an explosion of data
Simulation and analysis workloads are data-intensive
  Producing/scanning large amounts of data
Management of these data represents a significant challenge
  Storage/archiving
  Query processing
  Visualization

Page 3: Data-driven Query Processing for Immersive Computational Turbulence

Remote Immersive Analysis

Formerly, analysis was performed during the computation
  No data stored for subsequent examination
Data-intensive computing breakthroughs have allowed for new interaction with scientific numerical simulations
Turbulence Database Cluster
  Stores entire space-time evolution of the simulation
  Provides public access to world-class simulations
  Implements “immersive turbulence*” approach
Introduces new challenges

*E. Perlman, R. Burns, Y. Li, and C. Meneveau. Data exploration of turbulence simulations using a database cluster. In Supercomputing, 2007.

Page 4: Data-driven Query Processing for Immersive Computational Turbulence

Goals

Develop data-driven query processing techniques
  Reduce I/O and computation costs
  Reduce or eliminate storage overhead
  Exploit domain knowledge and structure
Provide user interfaces that are efficient and flexible
Streamline the process of data ingest

Page 5: Data-driven Query Processing for Immersive Computational Turbulence

Turbulence Database Cluster

Page 6: Data-driven Query Processing for Immersive Computational Turbulence

Processing a Batch Query

[Figure: 4 x 4 grid of data atoms, stored on disk in the order 0 to 15, with three overlapping queries]

  10 11 14 15
   8  9 12 13
   2  3  6  7
   0  1  4  5

Atoms read by each query:
  q1: 0 1 2 3 4 6 8 9 12
  q2: 9 11 12 14
  q3: 4 5 6 7

Redundant I/O
Multiple disk seeks

Page 7: Data-driven Query Processing for Immersive Computational Turbulence

I/O Streaming Evaluation Method

Linear data requirements of the computation allow for:
  Incremental evaluation
  Streaming over the data
  Concurrent evaluation of batch queries
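The idea on this slide can be illustrated with a minimal sketch. It assumes that each query is a weighted sum over data atoms (a linear kernel) and that atoms are identified by their on-disk position. The names `evaluate_batch_streaming` and `read_atom`, the unit weights, and the stand-in atom values are all hypothetical; the atom sets are the ones shown in the batch-query figure. This is not the Turbulence Database Cluster's actual implementation.

```python
from collections import defaultdict


def evaluate_batch_streaming(queries, read_atom):
    """Evaluate a batch of linear queries in one sequential pass.

    queries:   dict mapping query id -> {atom id: weight} (hypothetical format)
    read_atom: callable atom id -> float, standing in for the storage layer
    Returns (results per query, number of atom reads).
    """
    # Invert the batch: for every atom, record which queries need it.
    needed = defaultdict(list)
    for qid, atoms in queries.items():
        for atom_id, weight in atoms.items():
            needed[atom_id].append((qid, weight))

    results = {qid: 0.0 for qid in queries}
    reads = 0
    # Visit atoms in on-disk order so the scan is sequential and single-pass.
    for atom_id in sorted(needed):
        value = read_atom(atom_id)  # each atom is read exactly once
        reads += 1
        for qid, weight in needed[atom_id]:
            results[qid] += weight * value  # incremental partial sum
    return results, reads


# Atom sets taken from the batch-query figure; unit weights are an assumption.
batch = {
    "q1": {a: 1.0 for a in [0, 1, 2, 3, 4, 6, 8, 9, 12]},
    "q2": {a: 1.0 for a in [9, 11, 12, 14]},
    "q3": {a: 1.0 for a in [4, 5, 6, 7]},
}
results, reads = evaluate_batch_streaming(batch, read_atom=float)
naive_reads = sum(len(atoms) for atoms in batch.values())
print(results, reads, naive_reads)
```

For this batch the streaming pass reads 13 distinct atoms, while evaluating the three queries one after another would issue 17 reads, because atoms 4, 6, 9 and 12 are each needed by two queries.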

Page 8: Data-driven Query Processing for Immersive Computational Turbulence

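Processing a Batch Query

[Figure: 4 x 4 grid of data atoms, stored on disk in the order 0 to 15, with three overlapping queries evaluated in a single scan]

  10 11 14 15
   8  9 12 13
   2  3  6  7
   0  1  4  5

Atoms streamed in disk order: 0 1 2 3 4 5 6 7 8 9 11 12 14
Each atom is tagged with the queries that use it (q1, q2, q3); atoms 4, 6, 9 and 12 are shared by two queries.

I/O Streaming:
  Sequential I/O
  Single pass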

Page 9: Data-driven Query Processing for Immersive Computational Turbulence

Lagrange Polynomial Interpolation

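$$ f(x', y') = \sum_{j=1}^{N} l_y^{p-\frac{N}{2}+j}(y') \sum_{i=1}^{N} l_x^{n-\frac{N}{2}+i}(x') \cdot f\left(x_{n-\frac{N}{2}+i},\, y_{p-\frac{N}{2}+j}\right) $$

Lagrange coefficients: $l_x^{n-\frac{N}{2}+i}(x')$ and $l_y^{p-\frac{N}{2}+j}(y')$
Data: $f\left(x_{n-\frac{N}{2}+i},\, y_{p-\frac{N}{2}+j}\right)$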
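A small sketch may help make the double sum concrete. It assumes a regular 2D grid held in a nested list `field[j][i]`, with the query point lying between grid nodes `n` and `n + 1` in x and between `p` and `p + 1` in y. The function names and the test field are illustrative, not taken from the slides.

```python
def lagrange_weights(nodes, x):
    """Lagrange coefficients l_i(x) for the given node positions."""
    weights = []
    for i, xi in enumerate(nodes):
        w = 1.0
        for k, xk in enumerate(nodes):
            if k != i:
                w *= (x - xk) / (xi - xk)
        weights.append(w)
    return weights


def interpolate_2d(field, xs, ys, n, p, N, x, y):
    """Evaluate the N x N Lagrange stencil around grid cell (n, p) at (x, y).

    field[j][i] holds f(xs[i], ys[j]). The stencil covers the indices
    n - N/2 + i and p - N/2 + j for i, j = 1..N, as in the formula.
    """
    i0 = n - N // 2 + 1  # first x index of the stencil
    j0 = p - N // 2 + 1  # first y index of the stencil
    lx = lagrange_weights(xs[i0:i0 + N], x)
    ly = lagrange_weights(ys[j0:j0 + N], y)
    total = 0.0
    for j in range(N):
        # Inner sum over i for one row of data, then weight by l_y.
        row = sum(lx[i] * field[j0 + j][i0 + i] for i in range(N))
        total += ly[j] * row
    return total


def sample_field(x, y):
    """Illustrative polynomial field (degree 3 in x, degree 2 in y)."""
    return x ** 3 + x * y ** 2 + 2.0 * y


xs = [float(i) for i in range(8)]
ys = [float(j) for j in range(8)]
field = [[sample_field(x, y) for x in xs] for y in ys]
value = interpolate_2d(field, xs, ys, n=3, p=3, N=4, x=3.4, y=3.7)
print(value, sample_field(3.4, 3.7))
```

Because the coefficients depend only on the query position and the data values enter linearly, each data atom's contribution can be added to a running sum as soon as that atom is read.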

Page 10: Data-driven Query Processing for Immersive Computational Turbulence

Spatial Differentiation

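$$ \left.\frac{df}{dx}\right|_{x_n} = \frac{1}{12\Delta x} f(x_{n-2}) - \frac{2}{3\Delta x} f(x_{n-1}) + \frac{2}{3\Delta x} f(x_{n+1}) - \frac{1}{12\Delta x} f(x_{n+2}) $$

[Figure: grid stencil around $x_n$]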
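The stencil above is the standard fourth-order centered finite difference. A minimal sketch, assuming a uniformly spaced 1D array of samples (the function name and test function are illustrative):

```python
import math


def ddx_fourth_order(f, n, dx):
    """Fourth-order centered estimate of df/dx at grid index n.

    f is a sequence of samples on a uniform grid with spacing dx;
    n must be at least two points away from either end.
    """
    return (
        f[n - 2] / (12.0 * dx)
        - 2.0 * f[n - 1] / (3.0 * dx)
        + 2.0 * f[n + 1] / (3.0 * dx)
        - f[n + 2] / (12.0 * dx)
    )


dx = 0.01
grid = [i * dx for i in range(201)]        # points on [0, 2]
samples = [math.sin(x) for x in grid]
n = 100                                    # grid[n] is approximately 1.0
estimate = ddx_fourth_order(samples, n, dx)
print(estimate, math.cos(grid[n]))
```

Like the interpolation formula, the derivative is a fixed linear combination of neighboring data values, which is what allows it to be evaluated incrementally while streaming over the data.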

Page 11: Data-driven Query Processing for Immersive Computational Turbulence

Derivative Interpolation

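$$ \left.\frac{df}{dx}\right|_{x_n} = \frac{1}{12\Delta x} f(x_{n-2}) - \frac{2}{3\Delta x} f(x_{n-1}) + \frac{2}{3\Delta x} f(x_{n+1}) - \frac{1}{12\Delta x} f(x_{n+2}) $$

[Figure: grid stencil around $x_n$]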

Page 12: Data-driven Query Processing for Immersive Computational Turbulence

128 Workload

Over an order of magnitude improvement
Sorting leads to more sequential access
Join/Order By executes the entire batch as a join
I/O Streaming
  Each atom is read only once
  Effective cache usage

Page 13: Data-driven Query Processing for Immersive Computational Turbulence

I/O Streaming alleviates the I/O bottleneck
Computation emerges as the more costly operation

Page 14: Data-driven Query Processing for Immersive Computational Turbulence

Particle Tracking

[Diagram: predictor step]

Web Server/Mediator
  Distribute points based on $x_p(t_m)$
DB Node 1 ... DB Node N, each with:
  Computational Module: $x_p^*(t_m) = x_p(t_m) + u(x_p(t_m), t_m)\,\Delta t_p$
  Storage Layer: retrieve $u(x_p(t_m), t_m)$
Arrow labels between the mediator and the DB nodes: $x_p(t_m)$ and $x_p^*(t_m)$

Page 15: Data-driven Query Processing for Immersive Computational Turbulence

Particle Tracking

[Diagram: corrector step]

Web Server/Mediator
  Distribute points based on $x_p^*(t_m)$
DB Node 1 ... DB Node N, each with:
  Computational Module: $x_p(t_{m+1}) = \dfrac{x_p(t_m) + x_p^*(t_m) + u(x_p^*(t_m), t_{m+1})\,\Delta t_p}{2}$
  Storage Layer: retrieve $u(x_p^*(t_m), t_{m+1})$
Arrow labels between the mediator and the DB nodes: $x_p^*(t_m)$ and $x_p(t_{m+1})$
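The two particle-tracking slides together describe a predictor-corrector time step. The sketch below runs both stages in one process, assuming an analytic velocity field in place of the database lookup. The `velocity` function (solid-body rotation) and the name `advance_particle` are hypothetical, and the mediator/DB-node distribution is omitted.

```python
def velocity(x, t):
    """Hypothetical stand-in for the storage-layer lookup u(x, t).

    Solid-body rotation about the origin in 2D; t is unused here.
    """
    return (-x[1], x[0])


def advance_particle(x, t, dt, u=velocity):
    """Advance one particle from t to t + dt with a predictor-corrector step."""
    # Predictor: x* = x + u(x, t) * dt
    u_here = u(x, t)
    x_star = tuple(xi + ui * dt for xi, ui in zip(x, u_here))
    # Corrector: x(t + dt) = (x + x* + u(x*, t + dt) * dt) / 2
    u_star = u(x_star, t + dt)
    return tuple(
        0.5 * (xi + xsi + usi * dt)
        for xi, xsi, usi in zip(x, x_star, u_star)
    )


position = (1.0, 0.0)
time, dt = 0.0, 0.1
for _ in range(10):
    position = advance_particle(position, time, dt)
    time += dt
print(position)
```

Substituting the predictor into the corrector gives $x_p(t_{m+1}) = x_p(t_m) + \frac{\Delta t_p}{2}\left[u(x_p(t_m), t_m) + u(x_p^*(t_m), t_{m+1})\right]$, that is, the average of the velocities at the start point and the predicted point. Each stage needs its own velocity retrieval, which is why the slides show a separate round trip between the mediator and the DB nodes for each.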

Page 16: Data-driven Query Processing for Immersive Computational Turbulence

Summary and Future Work

Extend I/O streaming technique to different decomposable kernel computations:
  Differentiation
  Spatial interpolation
  Temporal interpolation
  Filtering and coarse-graining
Provide a flexible user interface
  Allow for different filter functions
  Allow for new kernel computations
Improve particle tracking routine
  Reduce communication between mediator and DB nodes
  Asynchronous processing
  Caching and pre-fetching

Page 17: Data-driven Query Processing for Immersive Computational Turbulence

Questions

Images courtesy of Kai Buerger