A RAM-disk provisioning service for high performance data analysis

Allan Espinosa, University of Chicago

Mentors: M. Woitaszek and J. Dennis, National Center for Atmospheric Research

July 29, 2011



Outline

1 Motivation: data analysis

2 Approach and challenges

3 Implementation

4 Target applications

5 Conclusions


Motivation: data-intensive post-processing

[Diagram labels: Simulation results; Computing center; Transfer nodes; Analysis cluster; Analysis 1, Analysis 2, ..., Analysis n; Spinning disk-based parallel file system; Tape archive]

Multiple trips to disk are slow


Approach: Run analysis on RAM

Fast I/O access

tmpfs or formatted /dev/ram on a single analysis node
Problem: Restricted parallelism

NFS-exported RAM
Problem: Restricted data size

Split data over multiple nodes
Problem: Requires thorough I/O management

Lustre parallel RAM file system

Solution: Automatically-provisioned parallel file system

[Diagram labels: User client; Submit jobs; Scheduler; Control node; Transfer node; Analysis nodes; Archive node; Parallel RAM file system; Tape archive; Polynya analysis cluster; WAN; Transfer node; File system; Kraken]


Remote triggering the workflow

[Diagram: "Simulation finishes" on Kraken leads to "Trigger workflow" on Polynya. Workflow stages shown: Request space; Transfer datasets; Archive datasets; Run analysis; Trigger cleanup]


Requesting RAM-based disk space

Implementation: PBS Torque+Maui scheduler generic resource

Parameters:

amount of space

duration of allocation

1 Route to control node

2 Prepare space

3 Sleep until allocation expiration

4 Email notice before expiration

5 Clean up space

# amount of space: 25 units of the ramdisk generic resource
#PBS -W x="GRES:ramdisk@25"
# duration of allocation
#PBS -l walltime="48:00:00"
#PBS -q ramdisk_service
# allocate.sh prepares the space, cleanup.sh releases it
#PBS -l prologue=allocate.sh
#PBS -l epilogue=cleanup.sh
# hold the allocation, send the notice, then wait out the remaining 3 hours
sleep 45h
mail user@cluster ...
sleep 3h
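The request script above is driven by two parameters, the amount of space and the duration of the allocation. As a minimal sketch of how such a script could be generated, the Python below fills in those two values and splits the walltime into a sleep before the email notice and a sleep after it. The function name, the `notice_hours` parameter, and the wording of the mail message are assumptions for illustration (the slide elides the mail arguments); the directives are copied from the slide.

```python
def make_request_script(gres_units, walltime_hours, notice_hours=3,
                        queue="ramdisk_service", user="user@cluster"):
    """Return the text of a PBS job script that holds a RAM-disk allocation.

    Hypothetical helper: only the #PBS directives come from the slide.
    """
    if not 0 < notice_hours < walltime_hours:
        raise ValueError("notice_hours must be between 0 and walltime_hours")
    lines = [
        f'#PBS -W x="GRES:ramdisk@{gres_units}"',
        f'#PBS -l walltime="{walltime_hours}:00:00"',
        f"#PBS -q {queue}",
        "#PBS -l prologue=allocate.sh",
        "#PBS -l epilogue=cleanup.sh",
        # Sleep until notice_hours before the allocation expires.
        f"sleep {walltime_hours - notice_hours}h",
        # Assumed message text; the original slide shows only "mail user@cluster ...".
        f'echo "RAM-disk allocation expires in {notice_hours} hours" '
        f'| mail -s "RAM-disk expiration notice" {user}',
        # Sleep through the rest of the walltime; the epilogue then cleans up.
        f"sleep {notice_hours}h",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_request_script(25, 48))
```

With 25 units and a 48-hour walltime this reproduces the 45-hour and 3-hour sleeps shown on the slide.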


Transferring datasets

Implementation: Route request to transfer nodes

Striped GridFTP data nodes

Co-located as RAM-based disk space provider

Other administrative components:

GridFTP control channel server

Key-authenticated SSH∗

X.509-authenticated GRAM5∗

∗Remote trigger mechanism
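A striped GridFTP transfer into the allocated space is typically started with `globus-url-copy`. The sketch below only assembles the argument list; it does not run the transfer. The URLs are placeholders, the default tuning values are illustrative, and treating 1 MB as 1024 × 1024 bytes is an assumption. The flags used (`-vb`, `-stripe`, `-p`, `-tcp-bs`, `-bs`) are standard `globus-url-copy` options.

```python
MB = 1024 * 1024  # assumption: binary megabytes


def gridftp_command(src_url, dst_url, streams=16, tcp_buffer_mb=32,
                    block_mb=16, striped=True):
    """Build a globus-url-copy argument list for one file transfer."""
    args = ["globus-url-copy", "-vb"]        # -vb reports bytes and rate
    if striped:
        args.append("-stripe")               # use striped data nodes
    args += [
        "-p", str(streams),                  # parallel TCP streams
        "-tcp-bs", str(tcp_buffer_mb * MB),  # TCP buffer size in bytes
        "-bs", str(block_mb * MB),           # block size in bytes
        src_url,
        dst_url,
    ]
    return args


if __name__ == "__main__":
    # Placeholder endpoints, not real hosts.
    cmd = gridftp_command("gsiftp://source.example.org/scratch/run/file.nc",
                          "gsiftp://target.example.org/ramdisk/run/file.nc")
    print(" ".join(cmd))
```

Returning a list rather than a single string lets the caller pass it directly to `subprocess.run` without shell quoting issues.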


Example application: AMWG diagnostics

Compares CESM simulation data, observational data, reanalysis data

Parallel implementation in Swift∗

Parameters:

dataset name

number of time segments (years)

Dataset volume: 2.8 GB per year (1° data)

∗Parallel scripting engine http://www.ci.uchicago.edu/swift
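The per-year volume gives a quick way to size a space request for a given number of years. The sketch below is a back-of-the-envelope estimate only: it counts the simulation data at 2.8 GB per year, ignores observational and reanalysis data, and the 20% headroom factor is an arbitrary assumption. Whether one unit of the scheduler's generic resource corresponds to 1 GB is not stated in the slides.

```python
import math

GB_PER_YEAR = 2.8  # from the slide, 1-degree data


def simulation_volume_gb(years):
    """Simulation data volume in GB for the given number of years."""
    return years * GB_PER_YEAR


def whole_gb_to_request(years, headroom=1.2):
    """Round the volume, plus an assumed headroom factor, up to whole GB."""
    return math.ceil(simulation_volume_gb(years) * headroom)


if __name__ == "__main__":
    for years in (2, 10):
        print(years, "years:", whole_gb_to_request(years), "GB")
```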


Data movement benchmarks∗

File system    IOR-8 write†    GridFTP‡ to Polynya
                               from Frost    from Kraken
/dev/null      3,190           139           28
Lustre disk    111             113           35
tmpfs RAM      2,983           117           34
XFS RAM        2,296           125           35
Lustre RAM     2,881           134           36

GridFTP from Kraken to Frost: 216 MB/s

∗ units in MB/s
† from D. Duplyakin’s experiments
‡ 32 MB TCP buffer, 16 MB block size, 16 streams (one build of this slide lists 4 streams)
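To read the benchmark numbers side by side, the sketch below copies them into a small table and computes two things: how much faster local IOR writes are than on disk-backed Lustre, and how little the GridFTP rates vary across target file systems. The dictionary layout and function names are illustrative; the values are the ones reported on the slide.

```python
# MB/s per file system: (IOR-8 write, GridFTP from Frost, GridFTP from Kraken)
BENCHMARKS = {
    "/dev/null":   (3190, 139, 28),
    "Lustre disk": (111, 113, 35),
    "tmpfs RAM":   (2983, 117, 34),
    "XFS RAM":     (2296, 125, 35),
    "Lustre RAM":  (2881, 134, 36),
}


def write_speedup(fs, baseline="Lustre disk"):
    """Ratio of IOR-8 write rates between a file system and the baseline."""
    return BENCHMARKS[fs][0] / BENCHMARKS[baseline][0]


def gridftp_spread(column):
    """(min, max) GridFTP rate over all rows; column 1 is Frost, 2 is Kraken."""
    rates = [row[column] for row in BENCHMARKS.values()]
    return min(rates), max(rates)


if __name__ == "__main__":
    print(f"Lustre RAM vs Lustre disk write: {write_speedup('Lustre RAM'):.1f}x")
    print("GridFTP from Frost (min, max):", gridftp_spread(1))
    print("GridFTP from Kraken (min, max):", gridftp_spread(2))
```

Local writes to Lustre on RAM come out about 26 times faster than to Lustre on disk, while the rates from Kraken stay between 28 and 36 MB/s whatever the target file system.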


Application performance

Ran on 64-CPU node, 2-year time segment (8.2 GB total)

File system    Runtime (s)
Lustre disk    213
tmpfs RAM      29
XFS RAM        29
Lustre RAM     70
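The runtimes translate directly into speedups over the disk-based Lustre file system. A small sketch, using only the four runtimes reported above (names are illustrative):

```python
# AMWG runtime in seconds per file system, as reported on the slide.
RUNTIME_S = {
    "Lustre disk": 213,
    "tmpfs RAM": 29,
    "XFS RAM": 29,
    "Lustre RAM": 70,
}


def speedup_over_disk(fs):
    """Runtime on Lustre disk divided by runtime on the given file system."""
    return RUNTIME_S["Lustre disk"] / RUNTIME_S[fs]


if __name__ == "__main__":
    for fs in ("tmpfs RAM", "XFS RAM", "Lustre RAM"):
        print(f"{fs}: {speedup_over_disk(fs):.1f}x faster than Lustre disk")
```

This gives roughly 7.3x for tmpfs and XFS on RAM, and roughly 3.0x for Lustre on RAM.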


Application performance

From Frost:

[Bar chart: Data Transfer and AMWG Analysis time (s), axis from 0 to 250, for Lustre RAM, XFS RAM, tmpfs RAM, and Lustre disk]


End-to-end workflow

[Timeline chart over time (s) with rows: Request space; Transfer; Archive; Analysis 1; Analysis 2; ...; Analysis n; Cleanup]


Other use case: Interactive jobs

Automated workflow split component-wise

Each step is run manually by the user

Steps:

1 Request space

2 Transfer data to allocated space (globus-url-copy or Globus Online)

3 Run analysis on allocated space

4 Email notice before expiration

5 Clean up by deleting request job
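The manual steps above map onto a short sequence of commands. The sketch below lists them as argument lists without running anything. `qsub` and `qdel` are the standard Torque submit and delete commands and `globus-url-copy` is named on the slide. The script name, job ID, URLs, and analysis command are placeholders, and in practice the job ID is whatever `qsub` prints on submission. Step 4, the email notice, is sent by the request job itself, so it has no command here.

```python
def interactive_session_commands(request_script, job_id, src_url, dst_url,
                                 analysis_cmd):
    """Return the user-run commands for one interactive session, in order."""
    return [
        ["qsub", request_script],               # 1 request space
        ["globus-url-copy", src_url, dst_url],  # 2 transfer data
        list(analysis_cmd),                     # 3 run analysis
        ["qdel", job_id],                       # 5 clean up by deleting the job
    ]


if __name__ == "__main__":
    # All names below are hypothetical.
    for cmd in interactive_session_commands(
            "request_ramdisk.pbs", "1234.polynya",
            "gsiftp://source.example.org/scratch/run/file.nc",
            "gsiftp://target.example.org/ramdisk/run/file.nc",
            ["./run_analysis.sh", "/ramdisk/run"]):
        print(" ".join(cmd))
```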


Conclusions

End-to-end analysis platform without touching spinning disk

Access through the familiar PBS interface

Workflow automation to drive analysis

Network bandwidth is critical to performance

Future work:

Tune network for high performance data movement

Application-perspective file system scalability

Explore framework on other resources: disk, bandwidth, etc.


Questions?
