Petascale Data Intensive Computing for eScience

Alex Szalay, Maria Nieto-Santisteban, Ani Thakar, Jan Vandenberg, Alainna Wonders, Gordon Bell, Dan Fay, Tony Hey, Catherine Van Ingen, Jim Heasley


TRANSCRIPT

Page 1: Petascale  Data Intensive Computing for eScience

Petascale Data Intensive Computing for eScience

Alex Szalay, Maria Nieto-Santisteban, Ani Thakar, Jan Vandenberg, Alainna Wonders, Gordon Bell, Dan Fay, Tony Hey, Catherine Van Ingen, Jim Heasley

Page 2: Petascale  Data Intensive Computing for eScience

Gray’s Laws of Data Engineering

Jim Gray:
Scientific computing is increasingly revolving around data
Need scale-out solution for analysis
Take the analysis to the data!
Start with “20 queries”
Go from “working to working”

DISSC: Data Intensive Scalable Scientific Computing

Page 3: Petascale  Data Intensive Computing for eScience

Amdahl’s Laws

Gene Amdahl (1965): Laws for a balanced system

i. Parallelism: max speedup is (S+P)/S for serial part S and parallelizable part P
ii. One bit of IO/sec per instruction/sec (BW)
iii. One byte of memory per one instr/sec (MEM)
iv. One IO per 50,000 instructions (IO)

Modern multi-core systems move farther away from Amdahl’s Laws (Bell, Gray and Szalay 2006)

For a Blue Gene, BW=0.001 and MEM=0.12; for the JHU GrayWulf cluster, BW=0.5 and MEM=1.04.
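A compact restatement of the four ratios as formulas (a sketch following Bell, Gray and Szalay 2006; the processor count N and the fraction notation are introduced here, not taken from the slide):

    \mathrm{speedup} \;\le\; \frac{S+P}{S+P/N} \;\le\; \frac{S+P}{S} \qquad (i)
    \mathrm{BW}  \;=\; \frac{\text{IO bandwidth [bits/s]}}{\text{instruction rate [instr/s]}} \;\approx\; 1 \qquad (ii)
    \mathrm{MEM} \;=\; \frac{\text{memory size [bytes]}}{\text{instruction rate [instr/s]}} \;\approx\; 1 \qquad (iii)
    \frac{\text{instructions}}{\text{IO operation}} \;\approx\; 50{,}000 \qquad (iv)

Read this way, the GrayWulf value MEM=1.04 says the cluster holds roughly one byte of memory per instruction per second, i.e. balanced by criterion (iii), while Blue Gene's MEM=0.12 falls well short.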

Page 4: Petascale  Data Intensive Computing for eScience

Typical Amdahl Numbers

Page 5: Petascale  Data Intensive Computing for eScience

Commonalities of DISSC

Huge amounts of data, aggregates needed
◦Also we must keep raw data
◦Need for parallelism
Requests benefit from indexing
Very few predefined query patterns
◦Everything goes… search for the unknown!
◦Rapidly extract small subsets of large data sets
◦Geospatial everywhere
Limited by sequential IO
Fits DB quite well, but no need for transactions
Simulations generate even more data
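As a minimal illustration of the "rapidly extract small subsets" point, an indexed range query of the kind these workloads rely on (the Detections table and its columns are hypothetical, not the GrayWulf schema):

    -- Hypothetical detection table; the (ra, dec) index keeps subset extraction cheap.
    CREATE TABLE Detections (
        objID  bigint NOT NULL PRIMARY KEY,
        ra     float  NOT NULL,   -- right ascension [deg]
        dec    float  NOT NULL,   -- declination [deg]
        r      real   NOT NULL    -- r-band magnitude
    );
    CREATE INDEX IX_Detections_radec ON Detections (ra, dec) INCLUDE (r);

    -- Pull a small patch of sky out of a multi-billion row table via an index seek:
    SELECT objID, ra, dec, r
    FROM   Detections
    WHERE  ra  BETWEEN 180.0 AND 180.5
      AND  dec BETWEEN -1.0  AND -0.5
      AND  r < 21.0;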

Page 6: Petascale  Data Intensive Computing for eScience

Total GrayWulf Hardware

46 servers with 416 cores
1PB+ disk space
1.1TB total memory
Cost <$700K

[Hardware diagram: Tier 1, Tier 2 and Tier 3 joined by the interconnect (Infiniband 20 Gbits/s and 10 Gbits/s links); one server group with 320 CPUs, 640GB memory and 900TB disk, the other with 96 CPUs, 512GB memory and 158TB disk]


Page 7: Petascale  Data Intensive Computing for eScience

Data Layout

7.6TB database partitioned 4 ways
◦4 data files (D1..D4), 4 log files (L1..L4)
Replicated twice to each server (2x12)
◦IB copy at 400MB/s over 4 threads
Files interleaved across controllers
Only one data file per volume
All servers linked to head node
Distributed Partitioned Views

Layout on GW01 (ctrl / vol / 82P / 82Q):
1 / E / D1 / L4
1 / F / D2 / L3
1 / G / L1 / D4
1 / I / L2 / D3
2 / J / D4 / L1
2 / K / D3 / L2
2 / L / L3 / D2
2 / M / L4 / D1
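A minimal sketch of the Distributed Partitioned View idea mentioned above, assuming linked servers GW01..GW12 each holding one spatial partition of the database (server, database and column names are illustrative, not the actual GrayWulf definitions):

    -- On the head node: a view that unions the per-server partitions.
    -- Each member table carries a CHECK constraint on its partitioning key
    -- so the optimizer can prune partitions a query cannot touch.
    CREATE VIEW PhotoObjAll_DPV
    AS
    SELECT * FROM GW01.Stripe82.dbo.PhotoObjAll
    UNION ALL
    SELECT * FROM GW02.Stripe82.dbo.PhotoObjAll
    UNION ALL
    SELECT * FROM GW03.Stripe82.dbo.PhotoObjAll;
    -- ... remaining servers follow the same pattern.
    GO

    -- Queries against the view fan out to the member servers:
    SELECT COUNT_BIG(*) FROM PhotoObjAll_DPV WHERE r < 21.0;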

Page 8: Petascale  Data Intensive Computing for eScience

Software Used

Windows Server 2008 Enterprise Edition
SQL Server 2008 Enterprise RTM
SQLIO test suite
PerfMon + SQL Performance Counters
Built-in Monitoring Data Warehouse
SQL batch scripts for testing
DPV for looking at results
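The same counters PerfMon collects can also be read from inside the engine; a small sketch (the counter names are common defaults, assumed rather than taken from the original test scripts):

    -- SQL Server exposes its performance counters through a DMV:
    SELECT object_name, counter_name, cntr_value
    FROM   sys.dm_os_performance_counters
    WHERE  counter_name IN (N'Batch Requests/sec',
                            N'Page reads/sec',
                            N'Page writes/sec');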

Page 9: Petascale  Data Intensive Computing for eScience

Performance Tests

Low level SQLIO
◦Measure the “speed of light”
◦Aggregate and per volume tests (R, some W)
Simple queries
◦How does SQL Server perform on large scans
Porting a real-life astronomy problem
◦Finding time series of quasars
◦Complex workflow with billions of objects
◦Well suited for parallelism

Page 10: Petascale  Data Intensive Computing for eScience

SQLIO Aggregate (12 nodes)

[Chart: aggregate IO [MB/sec] vs. time [sec], with separate Read and Write curves]

Page 11: Petascale  Data Intensive Computing for eScience

Aggregate IO Per Volume

[Chart: aggregate IO per volume vs. time, one curve per volume (E, F, G, I, J, K, L, M)]

Page 12: Petascale  Data Intensive Computing for eScience

IO Per Disk (Node/Volume)

[Chart: IO per disk for each node (GW01–GW08, GW17–GW20), broken out by volume (E, F, G, I, J, K, L, M); annotations: “Test file on inner tracks, plus 4K block format” and “2 ctrl volume”]

Page 13: Petascale  Data Intensive Computing for eScience

Astronomy Application Data

SDSS Stripe82 (time-domain) x 24
◦300 square degrees, multiple scans (~100)
◦(7.6TB data volume) x 24 = 182.4TB
◦(851M object detections) x 24 = 20.4B objects
◦70 tables with additional info
Very little existing indexing
Precursor to similar, but much bigger data from Pan-STARRS (2009) & LSST (2014)

Page 14: Petascale  Data Intensive Computing for eScience

Simple SQL Query

[Chart: throughput [MB/s] over time for simple queries 2a, 2b and 2c; harmonic and arithmetic means of 12,109 MB/s and 12,081 MB/s]
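The scan numbers above correspond to simple aggregate queries with no selective predicate; a representative sketch (the values follow SDSS conventions, but these are not the exact test queries):

    -- Scan-bound aggregate: no useful index applies, so throughput is limited
    -- by sequential IO, which is what the ~12 GB/s figure measures.
    SELECT COUNT_BIG(*)          AS nObj,
           AVG(CAST(r AS float)) AS mean_r
    FROM   PhotoObjAll
    WHERE  type = 6;             -- type 6 = star in the SDSS schema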

Page 15: Petascale  Data Intensive Computing for eScience

Finding QSO Time-Series

Goal: Find QSO candidates in the SDSS Stripe82 data and study their temporal behavior
Unprecedented sample size (1.14M time series)!
Find matching detections (100+) from positions
Build table of detections collected/sorted by the common coadd object for fast analyses
Extract/add timing information from Field table
Original script written by Brian Yanny (FNAL) and Gordon Richards (Drexel)
Ran in 13 days in the SDSS database at FNAL

Page 16: Petascale  Data Intensive Computing for eScience

CrossMatch Workflow

[Workflow diagram: PhotoObjAll and the coadd table are filtered into zone tables (zone1, zone2), cross-matched (xmatch) into a neighbors table, then joined with Field to produce the Match table; step timings of roughly 10 min, 2 min and 1 min]
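A condensed sketch of the xmatch and join steps in the diagram, in the spirit of the zone algorithm; the zone height, match radius and the Field/mjd column names are assumptions, not the production script:

    -- zone1: individual Stripe82 detections, zone2: coadded objects, with
    -- zoneID = FLOOR((dec + 90.0) / @zoneHeight); both zone tables are
    -- assumed clustered on (zoneID, ra) so the join is a merge along ra.
    DECLARE @zoneHeight float = 0.0083333;   -- 30 arcsec zones (assumption)
    DECLARE @radius     float = 0.0002778;   -- 1 arcsec match radius (assumption)

    INSERT INTO neighbors (objID, coaddID, dist)
    SELECT z1.objID, z2.coaddID,
           SQRT(POWER((z1.ra - z2.ra) * COS(RADIANS(z2.dec)), 2)
              + POWER( z1.dec - z2.dec, 2))                        AS dist
    FROM   zone1 AS z1
    JOIN   zone2 AS z2
      ON   z2.zoneID BETWEEN z1.zoneID - 1 AND z1.zoneID + 1       -- adjacent zones
     AND   z2.ra     BETWEEN z1.ra - @radius AND z1.ra + @radius   -- ra window (Stripe82 lies near dec = 0)
    WHERE  SQRT(POWER((z1.ra - z2.ra) * COS(RADIANS(z2.dec)), 2)
              + POWER( z1.dec - z2.dec, 2)) < @radius;

    -- The matched pairs are then joined back to Field to attach timing
    -- (fieldID / mjd_r are assumed names in the spirit of the SDSS schema):
    INSERT INTO Match (coaddID, objID, mjd, dist)
    SELECT n.coaddID, n.objID, f.mjd_r, n.dist
    FROM   neighbors   AS n
    JOIN   PhotoObjAll AS p ON p.objID   = n.objID
    JOIN   Field       AS f ON f.fieldID = p.fieldID;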

Page 17: Petascale  Data Intensive Computing for eScience

Xmatch Perf Counters

Page 18: Petascale  Data Intensive Computing for eScience

Crossmatch Results

Partition the queries spatially
◦Each server gets part of the sky
Runs in ~13 minutes!
Nice scaling behavior
Resulting data indexed
Very fast posterior analysis
◦Aggregates in seconds over 0.5B detections

[Charts: frequency of the number of detections per object (1–100); time [s] vs. objects [M]]
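The "aggregates in seconds" point maps onto queries like the following sketch over the partitioned match results (Match_DPV mirrors the DPV idea from the data-layout slide; all names here are assumed):

    -- Each server aggregates its own part of the sky; the DPV merges the results.
    SELECT COUNT_BIG(*)            AS nDetections,
           COUNT(DISTINCT coaddID) AS nObjects
    FROM   Match_DPV;

    -- Distribution of light-curve lengths (the histogram shown above):
    SELECT nDet, COUNT_BIG(*) AS nObj
    FROM  (SELECT coaddID, COUNT_BIG(*) AS nDet
           FROM   Match_DPV
           GROUP  BY coaddID) AS d
    GROUP  BY nDet
    ORDER  BY nDet;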

Page 19: Petascale  Data Intensive Computing for eScience

Conclusions

Demonstrated large scale computations involving ~200TB of DB data
DB speeds close to “speed of light” (72%)
Scale-out over SQL Server cluster
Aggregate I/O over 12 nodes
◦17GB/s for raw IO, 12.5GB/s with SQL
Very cost efficient: $10K/(GB/s)
Excellent Amdahl number >0.5

Page 20: Petascale  Data Intensive Computing for eScience
Page 22: Petascale  Data Intensive Computing for eScience

Test Hardware Layout

Dell 2950 servers
◦8 cores, 16GB memory
◦2x PERC/6 disk controller
◦2x (MD1000 + 15x750GB SATA)
◦SilverStorm IB controller (20Gbits/s)
12 units = (4 per rack) x 3
1x Dell R900 (head-node)
QLogic SilverStorm 9240
◦(288 port IB switch)