Claude Tadonki
Mines ParisTech – CRI – Mathématiques et Systèmes
Laboratoire de l’Accélérateur Linéaire/IN2P3/CNRS, France
2nd Workshop on Architecture and Multi-Core Applications
23rd International Symposium on Computer Architecture and High Performance Computing (SBAC PAD 2011)
October 26–29, 2011, Vitória, Espírito Santo, Brazil.
Large Scale Kronecker Product on Supercomputers C. TADONKI
The Kronecker product (definition and applications)
The Kronecker product (properties and problem formulation)
The Kronecker product (complexity and recurrence equation)
Forming the matrix first would
• require a huge amount of memory
• yield a lot of redundant multiplications, which in total would be
Using the so-called normal factorization, we can derive an optimal scheme which reduces the number of floating-point multiplications to
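The idea of multiplying by a Kronecker product without ever forming it can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes square matrices A_1, ..., A_N of orders n_1, ..., n_N stored as lists of rows, a vector x of length L = n_1 × ... × n_N, and row-major index ordering. The function names `kron_matvec` and `kron_dense` are hypothetical. Each pass applies one matrix along its own index, which corresponds to one factor of the form I ⊗ A_i ⊗ I, and uses L × n_i multiplications, so the whole sketch uses L × (n_1 + ... + n_N) multiplications and only O(L) extra memory.

```python
from math import prod


def kron_dense(A, B):
    """Explicit Kronecker product of two matrices (lists of rows).

    Only meant as a reference for checking small cases.
    """
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def kron_matvec(mats, x):
    """Compute (A_1 (x) ... (x) A_N) x without forming the big matrix.

    Assumption: every A_i is square; x has length prod(n_i).
    """
    dims = [len(A) for A in mats]
    L = prod(dims)
    if len(x) != L:
        raise ValueError("x must have length n_1 * ... * n_N")

    y = list(x)
    right = L  # becomes the product of the orders to the right of factor i
    for A, n in zip(mats, dims):
        right //= n
        left = L // (n * right)  # product of the orders to the left of factor i
        z = [0.0] * L
        # Apply A along its own index for every (left, right) index pair.
        for a in range(left):
            for c in range(right):
                base = a * n * right + c
                for r in range(n):
                    s = 0.0
                    for k in range(n):
                        s += A[r][k] * y[base + k * right]
                    z[base + r * right] = s
        y = z
    return y
```

For small inputs the result can be compared against the explicit product, for instance `kron_dense(A, B)` multiplied by `x`; for large L only the factor-by-factor routine is practical.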
The Kronecker product and its applications
Performance issues and heuristic for finding a good topology
The total (parallel) execution time depends on
• the sizes of the matrices
• the gap between the virtual topology and the physical topology
• the way the task is split among the processors (decomposition)
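To make the notion of decomposition concrete, the sketch below enumerates candidate virtual topologies for a given number of processes. It is an illustration under stated assumptions, not the heuristic of the talk: it assumes that a decomposition is a tuple (p_1, ..., p_N) whose product is the process count p, and that each p_i must divide the order n_i of the corresponding matrix. The function name `decompositions` is hypothetical, and no cost model is attached; ranking the candidates would require the performance criteria listed above.

```python
def decompositions(orders, p):
    """Yield tuples (p_1, ..., p_N) with p_1 * ... * p_N == p.

    Assumption (for illustration only): p_i must divide orders[i].
    """
    def rec(i, remaining):
        if i == len(orders):
            if remaining == 1:
                yield ()
            return
        for d in range(1, min(orders[i], remaining) + 1):
            # d must divide both the remaining process count and n_i.
            if remaining % d == 0 and orders[i] % d == 0:
                for rest in rec(i + 1, remaining // d):
                    yield (d,) + rest

    yield from rec(0, p)
```

For example, `list(decompositions([4, 6], 4))` lists the ways to arrange 4 processes over two matrices of orders 4 and 6 under the divisibility assumption.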
Performance
We consider N = 6 matrices of orders 30, 36, 32, 18, 24, 16, thus L = 159 252 480
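The arithmetic behind these sizes can be checked with a few lines. This is a sketch under assumptions: L is taken to be the product of the matrix orders, the explicit approach is counted as one dense matrix-vector product (L × L multiplications), and the factor-by-factor approach as one pass of L × n_i multiplications per matrix. The variable names are illustrative. Note that the product of the six orders as listed is 238 878 720, which is exactly 3/2 of the stated L = 159 252 480, so one of the listed orders may have been altered in transcription (an order of 20 in place of 30 would account for the difference).

```python
from math import prod

orders = [30, 36, 32, 18, 24, 16]  # orders exactly as listed in the text

L = prod(orders)            # vector length implied by the listed orders
explicit = L * L            # multiplications for a dense matrix-vector product
factored = L * sum(orders)  # multiplications for one pass per factor

# Ratio between the two counts (equals L / sum(orders)).
ratio = explicit / factored
print(L, sum(orders), round(ratio))
```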
We see that
• our heuristic yields a significant improvement compared to trivial decompositions
• we start losing scalability when the number of cores increases (com)
We then turn to a hybrid implementation
Performance of the hybrid implementation
We see that
• the hybrid implementation is better for larger numbers of cores
• for smaller numbers of cores, the SM implementation suffers from more cache misses
We need to investigate the trade-off and a better memory layout.
END & QUESTIONS