NetCDF4 Performance Benchmark
Part I
Will the performance of netCDF4 be comparable with that of netCDF3?
Configurations
Dataset
  40 MB: 6 files
  1 MB: 6 files
Storage Layout
  Contiguous
  Chunked (HDF5 default cache size: 1 MB)
  Chunked (HDF5 cache size: 64 MB)
System Cache
  On: use all caches and buffers provided by the kernel
  Drop: "drop_caches" to read data from disk; "fsync" to write data to disk
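The "drop" setting can be illustrated with a minimal sketch. The helper names `write_and_sync` and `drop_page_cache` are hypothetical and are not taken from the benchmark itself. The sketch assumes a Linux system, where writing to /proc/sys/vm/drop_caches requires root privileges, so that helper is defined here but not called.

```python
import os
import tempfile


def write_and_sync(path, data):
    """Write bytes and block until the kernel has pushed them to the device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # empty Python's user-space buffer
        os.fsync(f.fileno())  # ask the kernel to flush its buffers to disk
    return os.path.getsize(path)


def drop_page_cache():
    """Linux only, needs root: evict cached file data before a timed read."""
    os.sync()  # write out dirty pages first so they can be dropped
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")  # 3 = page cache plus dentries and inodes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        n = write_and_sync(os.path.join(tmp, "sample.bin"), b"\0" * 1024)
        print(n, "bytes written and synced")
```

Timing a write only after `fsync` returns, and a read only after the cache has been dropped, is one way to keep the kernel's buffers from hiding the disk cost.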
10 cases
Case  Dataset  Storage Layout          System Cache
1     40 MB    contiguous              on
2     40 MB    contiguous              drop
3     40 MB    chunked (64 MB cache)   on
4     40 MB    chunked (64 MB cache)   drop
5     40 MB    chunked (1 MB cache)    on
6     40 MB    chunked (1 MB cache)    drop
7     1 MB     contiguous              on
8     1 MB     contiguous              drop
9     1 MB     chunked (1 MB cache)    on
10    1 MB     chunked (1 MB cache)    drop
Default Hyperslab
One big hyperslab is selected
1. Contiguous layout with cache
Dataset: ≈ 40 MB
Storage Layout: contiguous
System Cache: on
[Figure: data read rate (MB/s) by number of dimensions (1D to 6D) for HDF5 contiguous, netCDF4 contiguous, and netCDF3]
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5 contiguous, netCDF4 contiguous, and netCDF3]
2. Contiguous layout without cache
Dataset: ≈ 40 MB
Storage Layout: contiguous
System Cache: drop
[Figure: data read rate (MB/s) by number of dimensions (1D to 6D) for HDF5, netCDF4, and netCDF3]
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5, netCDF4, and netCDF3]
3. Chunked layout with cache
Dataset: ≈ 40 MB
Storage Layout: chunked (HDF5 cache size: 64 MB)
System Cache: on
[Figure: data read rate (MB/s) by number of dimensions (1D to 6D) for HDF5 chunked, netCDF4 chunked, and netCDF3]
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5 chunked, netCDF4 chunked, and netCDF3]
4. Chunked layout without cache
Dataset: ≈ 40 MB
Storage Layout: chunked (HDF5 cache size: 64 MB)
System Cache: drop
[Figure: data read rate (MB/s) by number of dimensions (1D to 6D) for HDF5, netCDF4, and netCDF3]
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5, netCDF4, and netCDF3]
5. Chunked layout with cache
Dataset: ≈ 40 MB
Storage Layout: chunked (HDF5 default cache size: 1 MB)
System Cache: on
[Figure: data read rate (MB/s) by number of dimensions (1D to 6D) for HDF5 chunked, netCDF4 chunked, and netCDF3]
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5 chunked, netCDF4 chunked, and netCDF3]
H5Pset_alloc_time(EARLY)
Dataset: ≈ 40 MB
Storage Layout: chunked (HDF5 default cache size: 1 MB)
System Cache: on
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5 chunked, netCDF4 chunked, and netCDF3]
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5 chunked, netCDF4 chunked, and netCDF3; panel labeled H5Pset_alloc_time(EARLY)]
6. Chunked layout without cache
Dataset: ≈ 40 MB
Storage Layout: chunked (HDF5 default cache size: 1 MB)
System Cache: drop
[Figure: data read rate (MB/s) by number of dimensions (1D to 6D) for HDF5, netCDF4, and netCDF3]
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5, netCDF4, and netCDF3]
7. Contiguous layout with cache
Dataset: ≈ 1 MB
Storage Layout: contiguous
System Cache: on
[Figure: data read rate (MB/s) by number of dimensions (1D to 6D) for HDF5 contiguous, netCDF4 contiguous, and netCDF3]
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5 contiguous, netCDF4 contiguous, and netCDF3]
8. Contiguous layout without cache
Dataset: ≈ 1 MB
Storage Layout: contiguous
System Cache: drop
[Figure: data read rate (MB/s) by number of dimensions (1D to 6D) for HDF5, netCDF4, and netCDF3]
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5, netCDF4, and netCDF3]
9. Chunked layout with cache
Dataset: ≈ 1 MB
Storage Layout: chunked (HDF5 default cache size: 1 MB)
System Cache: on
[Figure: data read rate (MB/s) by number of dimensions (1D to 6D) for HDF5 chunked, netCDF4 chunked, and netCDF3]
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5 chunked, netCDF4 chunked, and netCDF3]
10. Chunked layout without cache
Dataset: ≈ 1 MB
Storage Layout: chunked (HDF5 default cache size: 1 MB)
System Cache: drop
[Figure: data read rate (MB/s) by number of dimensions (1D to 6D) for HDF5, netCDF4, and netCDF3]
[Figure: data write rate (MB/s) by number of dimensions (1D to 6D) for HDF5, netCDF4, and netCDF3]
Part II
Can I get better performance with netCDF4? If yes, under what circumstances can I get better performance?
Non-contiguous Access
Logical layout for 2-dimensional arrays
[Figure: logical layout of a 2-dimensional array; dimension labels 256, 256, 16384, 16, 1, 240]
Non-contiguous Access
Physical layout
16384 non-adjacent data points
Chunk size [16384][1]
Chunk size [8192][1]
Chunk size [4096][1]
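To make the physical layout concrete, here is a minimal sketch that counts how scattered one column is on disk. It assumes a row-major array of shape [16384][256] with 4-byte elements (consistent with a dataset of about 16 MB, but not stated on the slides) and assumes the selection is a single column. The function names are hypothetical.

```python
def column_offsets(nrows, ncols, col, itemsize):
    """Byte offsets of one column's elements in a contiguous row-major array."""
    return [(row * ncols + col) * itemsize for row in range(nrows)]


def chunks_for_column(nrows, chunk_rows):
    """Number of [chunk_rows][1] chunks that hold one full column."""
    return -(-nrows // chunk_rows)  # integer ceiling division


if __name__ == "__main__":
    offsets = column_offsets(16384, 256, col=0, itemsize=4)
    stride = offsets[1] - offsets[0]
    print(len(offsets), "separate points, stride", stride, "bytes")
    for chunk_rows in (16384, 8192, 4096):
        print([chunk_rows, 1], "->", chunks_for_column(16384, chunk_rows), "chunk(s)")
```

Under these assumptions a contiguous layout needs 16384 separate accesses for one column, while a [16384][1] chunk holds the whole column in one piece.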
11. Non-contiguous Access
Dataset: ≈ 16 MB
Storage Layout: contiguous; chunked (default chunk cache)
System Cache: drop
[Figure: wall clock time to read one non-contiguous hyperslab (ms) by storage layout: netCDF3 contiguous, netCDF4 contiguous, chunked [16384][1], chunked [8192][1], chunked [4096][1]]
[Figure: wall clock time to write non-contiguous hyperslabs (s) by storage layout: netCDF3 contiguous, netCDF4 contiguous, chunked [16384][1], chunked [8192][1], chunked [4096][1]]
12. Chunked layout with cache
Dataset: ≈ 40 MB
Storage Layout: chunked (chunk cache varies)
System Cache: on
[Figure: data write rate (MB/s) by cache size for 5D dataset (1, 4, 8, 16, 32, 64 MB) for netCDF3 and netCDF4]
13. Compression
Dataset: radar data
Storage Layout: chunked (default chunk cache)
System Cache: drop
[Figure: wall clock time to read radar data (seconds) for datasets tile1, tile2, tile4; deflate compression level 1 vs. without compression]
[Figure: wall clock time to write radar data (seconds) for datasets tile1, tile2, tile4; deflate compression level 1 vs. without compression]
13. Compression
Compression ratio
Dataset  Uncompressed  Compressed  Compression Ratio
Tile1    72,132,892    3,432,559   21
Tile2    72,132,892    5,129,482   14
Tile3    72,132,892    3,069,254   23
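The ratio column can be reproduced from the two size columns. The sketch below assumes the ratio is uncompressed size divided by compressed size, truncated to a whole number. The truncation is an inference: Tile3 works out to about 23.5, which the table reports as 23.

```python
# Sizes copied from the compression ratio table (units as given there).
SIZES = {
    "Tile1": (72_132_892, 3_432_559),
    "Tile2": (72_132_892, 5_129_482),
    "Tile3": (72_132_892, 3_069_254),
}


def compression_ratios(sizes):
    """Return uncompressed // compressed for each dataset (truncated)."""
    return {name: raw // packed for name, (raw, packed) in sizes.items()}


if __name__ == "__main__":
    for name, ratio in compression_ratios(SIZES).items():
        print(name, ratio)
```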
Part III
Can netCDF4 performance be bad? How can I avoid the bad performance?
14. Chunk size
Too small a chunk size is bad
A chunk size slightly smaller than (number of elements) / N is bad
14. Chunk size
[Figure: a dataset tiled by chunks in two panels, one labeled dataset 3162 / chunk 791 and one labeled dataset 3162 / chunk 790; chunk labels run chunk0 to chunk3 in one panel and chunk0 to chunk8 in the other]
[Figure: file size (MB) by number of elements for each dimension in a chunk (8, 16, 32, 50, 128, 200)]
[Figure: data write rate (MB/s) by number of elements for each dimension in a chunk (8, 16, 32, 50, 128, 200)]
14. Chunk size
Dataset: ≈ 64 MB
Storage Layout: chunked (default chunk cache)
System Cache: drop
[Figure: file size (MB) by number of elements for each dimension in a chunk (316, 527, 791, 1054, 1581, 2400, 3162)]
14. Chunk size (more)
Dataset: ≈ 64 MB
Storage Layout: chunked (default chunk cache)
System Cache: drop
[Figure: data write rate (MB/s) by number of elements for each dimension in a chunk (316, 527, 791, 1054, 1581, 2400, 3162); annotations n, n + 1, n - 1]
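The following sketch shows why a chunk edge slightly below (number of elements) / N can be costly. It assumes a square 2-dimensional dataset with 3162 elements per dimension (taken from the diagram labels) and assumes every chunk, including the partly filled ones at the edges, is stored at full size without compression. The function names are hypothetical.

```python
def chunks_per_dim(n, edge):
    """Chunks needed to cover n elements along one dimension."""
    return -(-n // edge)  # integer ceiling division


def allocated_elements(n, edge, ndims=2):
    """Elements allocated when whole chunks cover an n**ndims dataset."""
    return (chunks_per_dim(n, edge) * edge) ** ndims


def overhead(n, edge, ndims=2):
    """Allocated elements divided by the elements actually in the dataset."""
    return allocated_elements(n, edge, ndims) / n ** ndims


if __name__ == "__main__":
    for edge in (316, 527, 790, 791, 1054, 1581, 2400, 3162):
        print(edge, chunks_per_dim(3162, edge), round(overhead(3162, edge), 3))
```

Under these assumptions an edge of 791 needs 4 chunks per dimension, while an edge of 790 needs 5, so dropping the edge by one element raises the allocated space from about 1.00 to about 1.56 times the dataset size.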
15. Many hyperslab selections
H5Pcreate()
H5Dopen()
Conclusion
The performance of netCDF4 is comparable with that of netCDF3
Improvement
  Non-contiguous access pattern
  Adjusted cache size
  Compression
Pitfall
  Small chunk size
  Many small hyperslab selections