data reduction using singular value decomposition (svd ... · data reduction using singular value...
TRANSCRIPT
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
:: Stuttgart, Germany
Data Reduction using Singular Value Decomposition (SVD) Algorithm
28th Workshop on Sustained Simulation Performance
Jing Zhang
High Performance Computing Center Stuttgart (HLRS), University of Stuttgart, Germany
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
Overview
• Motivation and aim
• SVD algorithm
• Results by applying SVD
– 2D steady-state test case (1)
– 3D transient test cases (2 - 5)
• Summary and future work
228th Workshop on Sustained Simulation Performance
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
Motivation and aim
• Steady increase in computing power
– to use more complex models
– to simulate a lot of fluid-flow problems
– to perform simulations with large amount of CFD data
• I/O is bottleneck
– subsystem slow compared to other parts of computing system
• Aim of this study
– to reduce amount of CFD data transferred from memory to disk
– by using data-reduction algorithm SVD
– to minimize impact of I/O bottleneck on computing performance
328th Workshop on Sustained Simulation Performance
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
SVD algorithm
• 𝐴 = 𝑈 × 𝛴 × 𝑉𝑇 (if 𝑚 > 𝑛)
• 𝐴 = 𝑖=1𝑛 𝑢𝑖𝜎𝑖𝑣𝑖
𝑇 ≈ 𝑖=1𝑟 𝑢𝑖𝜎𝑖𝑣𝑖
𝑇 = 𝐴𝑟 (𝑖𝑓 𝜎𝑟 ≫ 𝜎𝑟+1)
• SVD subroutine PDGESVD in ScaLAPACK library
428th Workshop on Sustained Simulation Performance
𝐴 𝑈𝛴 𝑉𝑇
= x x
𝐴 ≈
𝑉𝑟𝑇
x x𝑈𝑟
𝛴𝑟
𝐴𝑟=
𝑚
𝑛 𝑛 𝑛 𝑛
𝑛
𝑚
𝑛 𝑟 𝑟 𝑛 𝑛
𝑚
𝑟
Singular values: 𝜎1 ≥ 𝜎2 ≥ ⋯ ≥ 𝜎𝑛 ≥ 0
0
0
𝜎1
𝜎n
𝜎2…( )
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
Parameters for data-reduction algorithm SVD
• Compression ratio: 𝐶𝑅 =𝑚×𝑛
𝑚×𝑟+𝑟+𝑟×𝑛=
𝑚×𝑛
𝑟× 𝑚+1+𝑛
• Requirement of compression: r <𝑚×𝑛
𝑚+1+𝑛
• Mean Squared Error: 𝑀𝑆𝐸 = 𝑖=1𝑚 𝑗=1
𝑛 𝐴(𝑖,𝑗)−𝐴𝑟(𝑖,𝑗)2
𝑚×𝑛
• Peak Signal to Noise Ratio: 𝑃𝑆𝑁𝑅 = 10𝑙𝑜𝑔10𝑚𝑎𝑥 𝐴 𝑖,𝑗 2
𝑀𝑆𝐸
– Good quality for SVD algorithm: 𝑃𝑆𝑁𝑅 ≥ 35 dB
528th Workshop on Sustained Simulation Performance
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
2D steady-state test case 1 (𝐴76×76)
• Original matrix is square and symmetric
• Original and reconstructed matrix are similar
628th Workshop on Sustained Simulation Performance
Original Reconstructed (rank=2)
1
0.5
0
-0.5
-1
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
2D steady-state test case 1 (𝐴76×76)
• Limit on rank number 37
• Variation of CR, MSE and PSNR according to rank value up to 37
728th Workshop on Sustained Simulation Performance
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
2D steady-state test case 1 (𝐴76×76)
• Maximal absolute value from original matrix: 1.00
• Maximal absolute difference between original and reconstructed matrix (rank=2): 4×10-15
828th Workshop on Sustained Simulation Performance
Difference
2.8 × 10−15
1.5 × 10−15
2.5 × 10−16
−1.0 × 10−15
−2.3 × 10−15
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test cases
• Only one dataset Rhou2 (i.e. Rho × u2) is analysed
• Transformed data from 3D transient test cases into 2D dataset
• Data-transformation process
– Convert many HDF5 files at every time step into many single binary files using HDF5 utility h5dump
– Merge all binary files into one binary file including all data
– Reconvert the merged binary file into one 2D HDF5 file for applying SVD using HDF5 utility h5import
928th Workshop on Sustained Simulation Performance
𝑚: no. grid points
𝑛: no. time steps
𝐴
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 2 (m=363, n=8)
• Number of grid points: 𝑚 = 363 = 46656
• Number of time steps: 𝑛 = 8
• Non-dimensional simulation time: t = 0~20
1028th Workshop on Sustained Simulation Performance
𝑛 = 1 𝑛 = 2 𝑛 = 3 𝑛 = 4
𝑛 = 5 𝑛 = 6 𝑛 = 7 𝑛 = 8
rhou20.5
0.4
0.3
0.2
0.1
0
-0.1
-0.2
-0.3
-0.4
-0.5
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 2 (m=363, n=8)
• Limit on rank number 7
• Variation of CR, MSE and PSNR according to rank value (2, 4, 6)
1128th Workshop on Sustained Simulation Performance
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 2 (m=363, n=8)
• Comparison of original and reconstructed data at the last time step
• All reconstructed data show poor quality
1228th Workshop on Sustained Simulation Performance
𝑛𝑡 = 8
Original r = 2 r = 4 r = 6
rhou20.50.40.30.20.10-0.1-0.2-0.3-0.4-0.5
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 2 (m=363, n=8)
• Maximal absolute value of original data: 1.2
• Difference between original and reconstructed data at the last time step
• Three figures show similar results and difference is relative large
1328th Workshop on Sustained Simulation Performance
r = 2 r = 4 r = 6
difference0.2
0.15
0.1
0.15
0
-0.05
-0.1
-0.15
-0.2
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 3 (m=363, n=99)
• Limit on rank number 98
• Variation of CR, MSE and PSNR according to rank value up to 98
1428th Workshop on Sustained Simulation Performance
35
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 3 (m=363, n=99)
• Comparison of original and reconstructed data at the last time step
• If the rank number is increased, reconstructed data approach the original data
1528th Workshop on Sustained Simulation Performance
𝑛𝑡 = 8
Original r = 2 r = 9 r = 18
rhou20.50.40.30.20.10-0.1-0.2-0.3-0.4-0.5
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 3 (m=363, n=99)
• Maximal absolute value from original data: 1.5
• Difference between original and reconstructed data at the last time step
• If rank number is increased, difference is reduced
1628th Workshop on Sustained Simulation Performance
r = 2 r = 9 r = 18
difference0.2
0.15
0.1
0.15
0
-0.05
-0.1
-0.15
-0.2
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 4 (m=363, n=423)
• Limit on rank number 419
• Variation of CR, MSE and PSNR according to rank value up to 419
1728th Workshop on Sustained Simulation Performance
35
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 4 (m=363, n=423)
• Comparison of original and reconstructed data at the last time step
• With rank number of 18 reconstructed data are similar as original data
1828th Workshop on Sustained Simulation Performance
𝑛𝑡 = 8
Original r = 2 r = 9 r = 18
rhou20.50.40.30.20.10-0.1-0.2-0.3-0.4-0.5
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 4 (m=363, n=423)
• Maximal absolute value from original data: 1.5
• Difference between original and reconstructed data at the last time step
• Compression ratio almost 4x larger than for test case 3
1928th Workshop on Sustained Simulation Performance
r = 2 r = 9 r = 18
difference0.2
0.15
0.1
0.15
0
-0.05
-0.1
-0.15
-0.2
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 5 (m=2603, n=8)
• Number of grid points: 𝑚 = 2603 = 17576000
• Number of time steps: 𝑛 = 8
• Non-dimensional simulation time: t = 0~20
2028th Workshop on Sustained Simulation Performance
𝑛 = 1 𝑛 =2 𝑛 = 3 𝑛 = 4
𝑛 = 5 𝑛 = 6 𝑛 = 7 𝑛 = 8
rhou20.5
0.4
0.3
0.2
0.1
0
-0.1
-0.2
-0.3
-0.4
-0.5
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 5 (m=2603, n=8)
• Limit on rank number 7
• Variation of CR, MSE and PSNR according to rank value (2 and 4)
2128th Workshop on Sustained Simulation Performance
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 5 (m=2603, n=8)
• Comparison of original and reconstructed data at the last time step
• Both reconstructed data show poor quality
2228th Workshop on Sustained Simulation Performance
𝑛𝑡 = 8
Original r = 2 r = 4
rhou20.50.40.30.20.10-0.1-0.2-0.3-0.4-0.5
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
3D transient test case 5 (m=2603, n=8)
• Maximal absolute value from original data: 1.1
• Difference between original and reconstructed data at the last time step
• Difference is relative large
2328th Workshop on Sustained Simulation Performance
r = 2 r = 4
difference0.2
0.15
0.1
0.15
0
-0.05
-0.1
-0.15
-0.2
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
::
Summary and future work
• SVD has been implemented and parallel tested with 2D and 3D test cases
• Important parameters have been analysed for data-reduction algorithm SVD, e.g. CR, MSE and PSNR
• Results of applying SVD have been visualized and discussed
– 2D steady-state test case: square original matrix; with small rank no. CR is large; store less data; good quality for reconstructed data
– 3D transient test cases: same grid size and rank number; if number of time steps increased; CR increased; store less data; if PSNR value is equal to or just higher than 35 dB; good quality for results
• Future work
– SVD should be further improved with very large datasets
2428th Workshop on Sustained Simulation Performance
09.10.2018::
::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::
:: Stuttgart, Germany
Thanks for your attention!
28th Workshop on Sustained Simulation Performance
Jing Zhang
High Performance Computing Center Stuttgart (HLRS), University of Stuttgart, Germany