
Page 1:

The SciDAC Scientific Data Management Center: Infrastructure and Results

Arie Shoshani
Lawrence Berkeley National Laboratory
SC 2004, November 2004

Page 2:

A Typical SDM Scenario

[Diagram: a typical SDM scenario spanning four layers, the Control Flow Layer, the Applications & Software Tools Layer, the I/O System Layer, and the Storage & Network Resources Layer. A flow tier and a work tier coordinate Task A: Generate Time-Steps (Simulation Program), Task B: Move TS (DataMover), Task C: Analyze TS (Post Processing, Parallel R), and Task D: Visualize TS (Terascale Browser), built on Parallel NetCDF, HDF5 libraries, PVFS, and SRM.]

Page 3:

Dealing with Large Data Volumes: Three Stages of Data Handling

• Generate data: run simulations, dump data, capture metadata

• Post-processing of data: process all data, produce reduced datasets and summaries, generate indexes

• Analyze data: interested in a subset of the data; search over large amounts of data, produce relatively little data (for analysis, vis, …)

Stage             Data Input   Data Output   Computation
Generation        Low          High          High
Post-processing   High         Med-High      High
Analysis          Med-High     Low           Low-Med

Page 4:

I/O is the predicted bottleneck

Source: Celeste Matarazzo

Main reason: data transfer rates to disk and tape devices have not kept pace with computational capacity

Page 5:

Data Generation – technology areas

• Parallel I/O writes – to disk farms
  • Does the technology scale?
  • Reliability in the face of disk failures
  • Dealing with various file formats (NetCDF, HDF, AMR, unstructured meshes, …)

• Parallel archiving – to tertiary storage
  • Is tape striping cost-effective?
  • Reorganizing data before archiving to match predicted access patterns

• Dynamic monitoring of simulation progress
  • Tools to automate the workflow

• Compression – is the overhead prohibitive?

Page 6:

Post Processing – technology areas

• Parallel I/O reads – from disk farms
  • Data clustering to minimize arm movement
  • Parallel read synchronization
  • Does it scale linearly?

• Reading from archives – tertiary storage
  • Minimize tape mounts
  • Does tape striping help?
  • What's after tape? Large disk-farm archives?

• Feeding large volumes of data into the machine
  • Competes with write I/O
  • Needs fat I/O channels, parallel I/O

Page 7:

Analysis – technology areas

• Dynamic hot clustering of data from the archive
  • Based on repeated use (caching & replication)
  • Takes advantage of data sharing

• Indexing over data values
  • Indexes need to scale: linear search over billions of data objects, and searches over combinations of multiple data measures per mesh point
  • Take advantage of append-only data
  • Parallel indexing methods

• Analysis result streaming
  • On-the-fly monitoring and visualization
  • Suspend/resume capabilities, clean abort

Page 8:

SDM Technology: Layering of Components

• Scientific Process Automation (Workflow) Layer
• Data Mining & Analysis Layer
• Storage Efficient Access Layer
• Hardware, OS, and MSS (HPSS)

Page 9:

Data Generation

[Diagram: a Simulation Run sits in the Scientific Process Automation layer (workflow design and execution); below it, the Data Mining and Analysis layer; below that, the Storage Efficient Access layer with Parallel netCDF over MPI-IO and PVFS2; at the bottom, the OS and hardware (disks, mass store).]

Page 10:

Parallel NetCDF vs. NetCDF (ANL+NWU)

netCDF: processes P0–P3 ship their data to a single writer, which performs all I/O to the parallel file system
• Slow and cumbersome
• Data shipping
• I/O bottleneck
• Memory requirement on the writer

Parallel netCDF: processes P0–P3 write directly to the parallel file system
• Programming convenience
• Perform I/O cooperatively or collectively
• Potential parallel I/O optimizations for better performance

Page 11:

Parallel NetCDF Library Overview

• User level library

• Accept parallel requests in netCDF I/O patterns

• Parallel I/O through MPI-IO to underlying file system and storage

• Good level of abstraction for portability and optimization opportunities

[Diagram: compute nodes run Parallel netCDF over MPI-IO in user space; requests travel across the communication network to I/O servers, which manage the file system space.]
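To make the write path concrete, here is a minimal sketch of a collective Parallel netCDF write in C. It is illustrative rather than taken from any application code: the file name, the dimensions, and the variable "temp" are invented for the example, and the number of rows is assumed to divide evenly among the processes.

```c
/* Minimal Parallel netCDF sketch: every rank writes its slab of a 2-D
 * variable collectively through MPI-IO.  Compile with mpicc and link
 * against PnetCDF (-lpnetcdf).  Run with a process count dividing NX. */
#include <mpi.h>
#include <pnetcdf.h>
#include <stdlib.h>

#define NX 1024          /* global rows    */
#define NY 1024          /* global columns */

int main(int argc, char **argv)
{
    int rank, nprocs, ncid, dimids[2], varid;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* All ranks open the same file; PnetCDF forwards the I/O to MPI-IO. */
    ncmpi_create(MPI_COMM_WORLD, "output.nc", NC_CLOBBER, MPI_INFO_NULL, &ncid);
    ncmpi_def_dim(ncid, "x", NX, &dimids[0]);
    ncmpi_def_dim(ncid, "y", NY, &dimids[1]);
    ncmpi_def_var(ncid, "temp", NC_FLOAT, 2, dimids, &varid);
    ncmpi_enddef(ncid);

    /* Each rank owns a contiguous block of rows. */
    MPI_Offset start[2] = { rank * (NX / nprocs), 0 };
    MPI_Offset count[2] = { NX / nprocs, NY };
    float *slab = malloc(count[0] * count[1] * sizeof(float));
    for (MPI_Offset i = 0; i < count[0] * count[1]; i++)
        slab[i] = (float)rank;           /* placeholder data */

    /* Collective write: all ranks participate, so the MPI-IO layer can
     * merge the per-rank requests into large, well-aligned accesses.  */
    ncmpi_put_vara_float_all(ncid, varid, start, count, slab);

    ncmpi_close(ncid);
    free(slab);
    MPI_Finalize();
    return 0;
}
```

Because every rank takes part in the collective call, the library sees the global access pattern at once, which is where the optimization opportunities mentioned above come from.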

Page 12:

Parallel netCDF Performance

Page 13:

Robust Multi-file Replication

• Problem: move thousands of files robustly
  • Takes many hours
  • Needs error recovery
    • Mass storage system failures
    • Network failures
• Solution: use Storage Resource Managers (SRMs)
• Problem: too slow
• Solution (a retry-and-transfer sketch follows the diagram below):
  • Use parallel streams
  • Use concurrent transfers
  • Use large FTP windows
  • Pre-stage files from MSS

[Diagram: the DataMover issues an SRM-COPY request (thousands of files) and then SRM-GET requests one file at a time. The SRM at NCAR performs reads, staging files from the NCAR MSS into its disk cache; GridFTP GET (pull mode) carries the network transfer; the SRM at LBNL performs writes into its disk cache and archives the files. The DataMover first gets the list of files and can run from anywhere.]
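The error-recovery behaviour described above can be pictured with a small, self-contained sketch. The transfer call below is a plain local copy standing in for an SRM/GridFTP request, and the file names are hypothetical; it is an illustration of the retry loop, not DataMover code.

```c
/* Illustrative sketch of robust multi-file replication: each file is
 * retried on transient failure, so one failure never restarts the job. */
#include <stdio.h>
#include <unistd.h>

#define MAX_RETRIES 5

typedef int (*transfer_fn)(const char *src, const char *dst);

/* Retry each file with exponential backoff; return how many files failed. */
int replicate(const char *srcs[], const char *dsts[], int nfiles, transfer_fn transfer)
{
    int failed = 0;
    for (int i = 0; i < nfiles; i++) {
        int rc = -1;
        for (int attempt = 0; attempt < MAX_RETRIES && rc != 0; attempt++) {
            rc = transfer(srcs[i], dsts[i]);
            if (rc != 0) {
                /* Transient MSS or network failure: back off and retry. */
                fprintf(stderr, "retrying %s (attempt %d)\n", srcs[i], attempt + 1);
                sleep(1u << attempt);
            }
        }
        if (rc != 0) { fprintf(stderr, "giving up on %s\n", srcs[i]); failed++; }
    }
    return failed;
}

/* Stand-in transfer: a local copy; a real DataMover would issue an SRM
 * request that stages the file from MSS and pulls it with GridFTP.     */
static int copy_locally(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb"), *out = fopen(dst, "wb");
    if (!in || !out) { if (in) fclose(in); if (out) fclose(out); return -1; }
    char buf[1 << 16]; size_t n;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) fwrite(buf, 1, n, out);
    fclose(in); fclose(out);
    return 0;
}

int main(void)
{
    const char *srcs[] = { "ts_0001.nc", "ts_0002.nc" };          /* illustrative names */
    const char *dsts[] = { "/cache/ts_0001.nc", "/cache/ts_0002.nc" };
    return replicate(srcs, dsts, 2, copy_locally);
}
```

Parallel streams and concurrent transfers would be layered on top of this loop (for example, one such loop per worker); the essential point is that a transient failure costs only a retry of the affected file, not a restart of a multi-hour transfer.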

Page 14:

File tracking shows recovery from transient failures

[Chart: per-file transfer tracking over time; 45 GB moved in total.]

Page 15:

Earth Science Grid

[Diagram: the Earth Science Grid deployment spanning LBNL, LLNL, USC-ISI, NCAR, ORNL, and ANL. A Tomcat servlet engine hosts clients for the Metadata Cataloguing Services (MCS), Replica Location Services (RLS), Community Authorization Services (CAS), and MyProxy, communicating over SOAP and RMI, with GRAM gatekeepers for job submission. DRM and HRM Storage Resource Managers front the site disks, the NCAR MSS, and HPSS; data is served through gridFTP servers (including a striped gridFTP server) and an openDAPg server.]

Page 16:

Layering of Components

• Scientific Process Automation (Workflow) Layer
• Data Mining & Analysis Layer
• Storage Efficient Access Layer
• Hardware, OS, and MSS (HPSS)

Page 17:

RScaLAPACK (ORNL)

• RScaLAPACK provides a simple, intuitive R interface to ScaLAPACK routines.

• The package relieves the user of setting up the parallel environment and distributing the data before a ScaLAPACK function call.

• RScaLAPACK is developed as an add-on package to R.

• Significant speed gains are observed for some function executions.

• Submitted to CRAN (a network of 27 web sites across 15 countries holding the R distribution) in March 2003.

Page 18:

RScaLAPACK Architecture

(1) Parallel Agent (PA): orchestrates the entire parallel execution as per the user's request.

(2) Spawned processes: perform the actual parallel execution of the requested function.
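The split between the Parallel Agent and the spawned processes can be illustrated with a generic MPI sketch in C. This is not the RScaLAPACK code, and a trivial row-sum stands in for a ScaLAPACK routine: rank 0 plays the agent, distributing the user's matrix and collecting the result.

```c
/* Parallel-agent pattern in plain MPI (illustrative only): rank 0 takes
 * the user's request, scatters the data, the workers compute in parallel,
 * and rank 0 gathers the result.  Run with a process count dividing N.   */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 8            /* rows of the toy matrix */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int rows_per_rank = N / nprocs;
    double *matrix = NULL, *result = NULL;
    if (rank == 0) {                     /* the "parallel agent" */
        matrix = malloc(N * N * sizeof(double));
        result = malloc(N * sizeof(double));
        for (int i = 0; i < N * N; i++) matrix[i] = i;   /* toy input */
    }

    /* Distribute rows to the worker processes. */
    double *local = malloc(rows_per_rank * N * sizeof(double));
    MPI_Scatter(matrix, rows_per_rank * N, MPI_DOUBLE,
                local,  rows_per_rank * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* Each worker computes its piece (row sums stand in for the real
     * ScaLAPACK routine operating on its block). */
    double *local_sums = malloc(rows_per_rank * sizeof(double));
    for (int r = 0; r < rows_per_rank; r++) {
        local_sums[r] = 0.0;
        for (int c = 0; c < N; c++) local_sums[r] += local[r * N + c];
    }

    /* The agent gathers the pieces and hands the result back to the user. */
    MPI_Gather(local_sums, rows_per_rank, MPI_DOUBLE,
               result,     rows_per_rank, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == 0)
        for (int i = 0; i < N; i++) printf("row %d sum = %g\n", i, result[i]);

    free(local); free(local_sums);
    if (rank == 0) { free(matrix); free(result); }
    MPI_Finalize();
    return 0;
}
```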

Page 19:

RScaLAPACK Benchmarks

Page 20:

FastBit – Bitmap Indexing Technology (LBNL)

• Search over large spatio-temporal data
  • Combustion simulation: 1000x1000x1000 mesh with 100s of chemical species over 1000s of time steps
  • Supernova simulation: 1000x1000x1000 mesh with 10s of variables per cell over 1000s of time steps

• Common searches are partial range queries
  • Temperature > 1000 AND pressure > 10^6
  • HO2 > 10^-7 AND HO2 > 10^-6

• Features
  • Search time proportional to the number of hits
  • Index generation linear in the data values (requires only a single read pass)
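The idea behind bitmap indexing for such partial range queries can be sketched in a few lines of C. This is a deliberately simplified, uncompressed illustration (FastBit itself uses compressed, binned bitmaps, and its real API is not shown), using toy data for the two attributes from the example above.

```c
/* Illustrative bitmap-index sketch (uncompressed).  One bitmap marks the
 * cells with temperature above a threshold, another the cells with
 * pressure above a threshold; a multi-attribute query is a bitwise AND. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NCELLS 1000000
#define NWORDS ((NCELLS + 63) / 64)

static void build_bitmap(const float *val, float threshold, uint64_t *bm)
{
    for (size_t i = 0; i < NWORDS; i++) bm[i] = 0;
    for (size_t i = 0; i < NCELLS; i++)
        if (val[i] > threshold)
            bm[i / 64] |= 1ULL << (i % 64);
}

int main(void)
{
    float *temperature = malloc(NCELLS * sizeof(float));
    float *pressure    = malloc(NCELLS * sizeof(float));
    for (size_t i = 0; i < NCELLS; i++) {            /* toy data */
        temperature[i] = (float)(i % 2000);
        pressure[i]    = (float)(i * 3 % 3000000);
    }

    uint64_t *bm_t = malloc(NWORDS * sizeof(uint64_t));
    uint64_t *bm_p = malloc(NWORDS * sizeof(uint64_t));
    build_bitmap(temperature, 1000.0f, bm_t);        /* temperature > 1000 */
    build_bitmap(pressure, 1.0e6f, bm_p);            /* pressure > 10^6    */

    /* Answer "temperature > 1000 AND pressure > 10^6" from the index:
     * AND the two bitmaps and count the set bits (GCC/Clang builtin).  */
    size_t hits = 0;
    for (size_t i = 0; i < NWORDS; i++)
        hits += (size_t)__builtin_popcountll(bm_t[i] & bm_p[i]);

    printf("%zu cells satisfy both conditions\n", hits);
    free(temperature); free(pressure); free(bm_t); free(bm_p);
    return 0;
}
```

Once the per-attribute bitmaps exist, adding another condition is just another bitwise operation over the index, so the raw data never needs to be rescanned.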

Page 21:

FastBit-Based Multi-Attribute Region Finding is Theoretically Optimal

On 3D data with over 110 million points, region finding takes less than 2 seconds.

[Plots: flame-front discovery (range conditions on multiple measures) in a combustion simulation (Sandia), and the time required to identify regions in a 3D supernova simulation (LBNL); region-growing time (sec) versus number of line segments (roughly 10,000 to 410,000).]
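To give a feel for why the region-growing step can stay close to linear in the number of selected cells, here is a sketch of connected-component labeling over the cells returned by a query, using union-find with path compression on a toy grid. It illustrates the general idea only; it is not the FastBit region-finding algorithm.

```c
/* Illustrative region labeling over a 3-D mask of cells that satisfied a
 * range query (6-connectivity).  Union-find with path compression keeps
 * the cost nearly linear in the number of selected cells.               */
#include <stdio.h>
#include <stdlib.h>

static long find(long *p, long i) { while (p[i] != i) { p[i] = p[p[i]]; i = p[i]; } return i; }
static void unite(long *p, long a, long b) { a = find(p, a); b = find(p, b); if (a != b) p[b] = a; }

/* After the call, find(parent, idx) is the region label of each selected cell. */
static void label_regions(const char *mask, long *parent, int nx, int ny, int nz)
{
    long n = (long)nx * ny * nz;
    for (long i = 0; i < n; i++) parent[i] = i;
    for (int z = 0; z < nz; z++)
        for (int y = 0; y < ny; y++)
            for (int x = 0; x < nx; x++) {
                long idx = ((long)z * ny + y) * nx + x;
                if (!mask[idx]) continue;
                /* Merge with already-visited face neighbours. */
                if (x > 0 && mask[idx - 1])              unite(parent, idx - 1, idx);
                if (y > 0 && mask[idx - nx])             unite(parent, idx - nx, idx);
                if (z > 0 && mask[idx - (long)nx * ny])  unite(parent, idx - (long)nx * ny, idx);
            }
}

int main(void)
{
    int nx = 64, ny = 64, nz = 64;                  /* toy grid */
    long n = (long)nx * ny * nz;
    char *mask = malloc(n);
    long *parent = malloc(n * sizeof(long));
    for (long i = 0; i < n; i++) mask[i] = (i % 7) == 0;   /* toy selection */

    label_regions(mask, parent, nx, ny, nz);

    long nregions = 0;                               /* count region roots */
    for (long i = 0; i < n; i++)
        if (mask[i] && find(parent, i) == i) nregions++;
    printf("%ld connected regions among the selected cells\n", nregions);
    free(mask); free(parent);
    return 0;
}
```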

Page 22:

Feature Selection and Extraction: Using Generic Analysis Tools (LLNL)

Comparing climate simulations to experimental data; PCA and ICA techniques were used for accurate climate signal separation.
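As a rough illustration of the PCA half of that analysis, the leading principal component of a small multi-variable dataset can be obtained by power iteration on its covariance matrix. The sketch below uses synthetic data and is generic; it is not the LLNL climate-analysis code.

```c
/* Power-iteration sketch for the leading principal component of a small
 * data matrix (rows = samples, cols = variables).  Compile with -lm.    */
#include <math.h>
#include <stdio.h>

#define NS 200   /* samples   */
#define NV 4     /* variables */

int main(void)
{
    static double data[NS][NV], cov[NV][NV];
    double mean[NV] = {0}, v[NV], w[NV];

    /* Toy data: variables 0 and 1 are strongly correlated. */
    for (int i = 0; i < NS; i++) {
        double t = sin(0.1 * i);
        data[i][0] = t;            data[i][1] = 0.9 * t;
        data[i][2] = cos(0.3 * i); data[i][3] = 0.1 * sin(0.7 * i);
    }

    /* Covariance matrix of the mean-centred data. */
    for (int j = 0; j < NV; j++) {
        for (int i = 0; i < NS; i++) mean[j] += data[i][j];
        mean[j] /= NS;
    }
    for (int a = 0; a < NV; a++)
        for (int b = 0; b < NV; b++) {
            cov[a][b] = 0;
            for (int i = 0; i < NS; i++)
                cov[a][b] += (data[i][a] - mean[a]) * (data[i][b] - mean[b]);
            cov[a][b] /= NS - 1;
        }

    /* Power iteration: repeatedly apply the covariance matrix and
     * renormalise; v converges to the leading principal component. */
    for (int j = 0; j < NV; j++) v[j] = 1.0;
    for (int iter = 0; iter < 100; iter++) {
        double norm = 0;
        for (int a = 0; a < NV; a++) {
            w[a] = 0;
            for (int b = 0; b < NV; b++) w[a] += cov[a][b] * v[b];
            norm += w[a] * w[a];
        }
        norm = sqrt(norm);
        for (int a = 0; a < NV; a++) v[a] = w[a] / norm;
    }

    printf("leading principal component: ");
    for (int j = 0; j < NV; j++) printf("%6.3f ", v[j]);
    printf("\n");
    return 0;
}
```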

Page 23:

Layering of Components

• Scientific Process Automation (Workflow) Layer
• Data Mining & Analysis Layer
• Storage Efficient Access Layer
• Hardware, OS, and MSS (HPSS)

Page 24:

TSI Workflow Example (in conjunction with John Blondin, NCSU)

Automate data generation, transfer and visualization of a large-scale simulation at ORNL

Page 25:

TSI Workflow Example (in conjunction with John Blondin, NCSU)

Automate data generation, transfer and visualization of a large-scale simulation at ORNL

At ORNL:
1. Submit the job to the Cray at ORNL.
2. Check whether a time slice is finished.
3. Aggregate all into one large file; save to HPSS.
4. Split it into 22 files and store them in XRaid.
5. Notify the head node at NC State.

At NCSU:
6. The head node submits the scheduling to SGE.
7. SGE schedules the transfer for 22 nodes (Node-0 … Node-21).
8. Each node retrieves its file from XRaid.
9. Start EnSight to generate video files at the head node.

Page 26:

Using the Scientific Workflow Tool (Kepler), Emphasizing Dataflow (SDSC, NCSU, LLNL)

Automate data generation, transfer and visualization of a large-scale simulation at ORNL

Page 27:

TSI Workflow Example (with Doug Swesty and Eric Myra, Stony Brook)

Automate the transfer of large-scale simulation data between NERSC and Stony Brook

Page 28:

Using the Scientific Workflow Tool

Page 29:

Collaborating Application Scientists

• Matt Coleman - LLNL (Biology)

• Tony Mezzacappa – ORNL (Astrophysics)

• Ben Santer – LLNL, John Drake – ORNL (Climate)

• Doug Olson – LBNL, Wei-Ming Zhang – Kent State (HENP)

• Wendy Koegler, Jacqueline Chen – Sandia Lab (Combustion)

• Mike Papka - ANL (Astrophysics Vis)

• Mike Zingale – U of Chicago (Astrophysics)

• John Michalakes – NCAR (Climate)

• Keith Burrell - General Atomics (Fusion)

Page 30:

Re-apply technology to new applications

• Parallel NetCDF: Astrophysics → Climate

• Parallel VTK: Astrophysics → Climate

• Compressed bitmaps: HENP → Combustion → Astrophysics

• Storage Resource Managers (MSS access): HENP → Climate → Astrophysics

• Feature selection: Climate → Fusion

• Scientific workflow: Biology → Astrophysics (planned)

Page 31:

Summary

• Focus: getting technology into the hands of scientists

• Integrated framework
  • Storage Efficient Access technology
  • Data Mining and Analysis tools
  • Scientific Process (workflow) Automation

• Technology migration to new applications

• The SDM framework facilitates integration, generalization, and reuse of SDM technology