an hdf5-wrf module -a performance report

Post on 05-Jan-2016

42 Views

Category:

Documents

7 Downloads

Preview:

Click to see full reader

DESCRIPTION

An HDF5-WRF module -A performance report. MuQun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University of Illinois, Urbana-Champaign ymuqun@ncsa.uiuc.edu URL : http://hdf.ncsa.uiuc.edu/apps/WRF-ROMS/. The uniqueness about the study. - PowerPoint PPT Presentation

TRANSCRIPT

An HDF5-WRF module-A performance report

MuQun Yang, Robert E. McGrath, Mike Folk

National Center for Supercomputing ApplicationsUniversity of Illinois, Urbana-Champaign

ymuqun@ncsa.uiuc.edu

URL: http://hdf.ncsa.uiuc.edu/apps/WRF-ROMS/

The uniqueness about the study

• The HDF5 is not used to store satellite data, but the output from a sophisticated numerical weather model

• Explore the performance of parallel HDF5 in parallel computing environments

• Investigate the performance of the compression feature inside HDF5 when applying to the numerical model

From WRF tutorial

/

WRF_DATA Dim_group

TKE RHO_U

ETC…

Time

Schematic HDF5 File Structure of Sequential HDF5-WRF Output

HDF5 dataset(WRF field) HDF5 group(WRF dataset)

Solid line: HDF5 datasets or sub-groups (the arrow points to) that are members of the HDF5 parent group.

Dash line: The association of one HDF5 object to another HDF5 object through dimensional scale table.

/

WRF_DATA

Dim_group

TE RHO_U Time

Schematic HDF5 File Structure of Parallel HDF5-WRF Output

HDF5 dataset(WRF field)

Time_step1

HDF5 group(WRF dataset)

Solid line: HDF5 datasets or sub-groups (the arrow points to) that are members of the HDF5 parent group.

Time_step0

TE RHO_U

attr1

attr2

Time_stepN

Dash line: the association of one HDF5 object to another HDF5 object through dimensional table.

Wall Clock Time Used with Different Output File Size Case 1: Conus

IBM WinterHawkII (256 Processors)

0

10

20

30

40

50

60

70

80

0 5 10 15 20 25

Output File Size(GB)

Wa

ll C

loc

k T

ime

(Min

ute

)

Parallel HDF5

NetCDF

Wall Clock Time Used With Different Output File Size Case 3: Squall line

IBM Regatta (16 processors)

0

1

2

3

4

5

6

7

8

0 0.5 1 1.5 2

Output File Size(GB)

Wa

ll C

loc

k T

ime

(Min

ute

)

Parallel HDF5

NetCDF

Model Output File Size With Different CompressionsCase 3: Squall Line

IBM Regatta(16 processors)

0

500

1000

1500

2000

2500

10 30 50 70

Number of Timestep to Generate Output

Fil

e S

ize

(M

B)

No Compression

With szip

with shuffling + gzip

Wall Clock Time With Different Compressions Case 3: Squall Line

IBM Regatta(16 processors)

0

1

2

3

4

5

6

7

8

9

10

10 30 50 70

Number of Timestep to Generate Output

Wall

Clo

ck T

ime(

min

ute

)

No Compression

With szip compression

With shuffling + gzipcompression

Summary

• Parallel IO is not trivial• Effective chunking with MPI-IO inside

HDF5 library is the key to improve parallel HDF5 performance

• Szip compression can improve performance for WRF application

• Shuffling algorithm with gzip compression can further improve compression ratio

top related