Post on 09-Aug-2020

HeAT: The Helmholtz Analytics ToolKit

Simulation and Software Technology, High-Performance Computing

Helmholtz Analytics Framework

Joint effort of all 6 Helmholtz centers

Aim: foster data analytics within Helmholtz

Systematic development of domain-specific data analysis techniques in a co-design approach between domain scientists and information experts

Common components for data analysis

Helmholtz Analytics ToolKit

Scientific data analytics library for HPC systems, built on top of PyTorch

Operates on heterogeneous hardware such as GPU/CPU systems

Allows computation on distributed systems

Distributed tensor data object: operations like basic scalar functions, linear algebra algorithms, slicing or broadcasting operations
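
The distributed tensor data object described above can be pictured as a global tensor whose rows are divided among processes, each holding only its own chunk. The following is a minimal sketch of such a row-wise split, assuming an even block distribution; `local_rows` is a hypothetical helper written for this illustration and is not part of the toolkit's API.

```python
def local_rows(n_rows: int, rank: int, size: int) -> range:
    """Rows of a global tensor owned by `rank` when split along axis 0.

    Hypothetical helper for illustration only, not a toolkit function.
    """
    base, extra = divmod(n_rows, size)
    # The first `extra` ranks take one additional row each.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


# A 10 x 3 global tensor (nested lists), split across 4 simulated processes.
global_tensor = [[float(3 * i + j) for j in range(3)] for i in range(10)]
chunks = [
    [global_tensor[i] for i in local_rows(10, rank, 4)]
    for rank in range(4)
]
```

Concatenating the chunks in rank order gives back the global tensor, which is what lets slicing and element-wise operations run on each local chunk independently.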

Contact

Dr. Charlotte Debus

[email protected]

Dr. Martin Siggel

Dr. Philipp Knechtges

German Aerospace Center, Simulation and Software Technology

Linder Höhe, 51147 Cologne, Germany

Clustering of Combustion Data

Facilitating Use Cases in their work

Bringing HPC and Machine Learning / Data Analytics closer together

k-means

SVM

NN

Ease of use

Pythonic, NumPy-like interface

Automatic Differentiation

Tensor Linear Algebra

GPU support

Distributed Parallelism (MPI)

mpi4py
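
To illustrate how distributed parallelism over MPI typically combines per-process results, here is a minimal sketch of a global mean computed from local partial sums. The processes are simulated with plain lists so the example runs without MPI; `local_stats` and `allreduce_sum` are hypothetical names, and in a real program the combining step would be a collective such as mpi4py's `comm.allreduce`.

```python
def local_stats(chunk):
    """Per-process partial result: (sum, count) of the local values."""
    return (sum(chunk), len(chunk))


def allreduce_sum(partials):
    """Stand-in for an MPI sum-allreduce (assumption: simulated, no MPI).

    Every simulated rank receives the element-wise total of all partials.
    """
    total = tuple(sum(p[k] for p in partials) for k in range(len(partials[0])))
    return [total for _ in partials]


# Three simulated ranks, each holding an uneven chunk of the data.
chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
partials = [local_stats(c) for c in chunks]
reduced = allreduce_sum(partials)

# Each rank can now compute the same global mean from (sum, count).
means = [s / n for s, n in reduced]
```

Reducing the pair (sum, count) rather than per-process means keeps the result correct when the chunks have different sizes.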

Combustion tests at DLR Institute of Space Propulsion

Super-high resolution video camera (10 000 images / second)

Clustering of images for identification of combustion phases (k-means)
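
The clustering step named above can be sketched with plain k-means (Lloyd's algorithm). This is a toy, single-process illustration and not the toolkit's implementation: the `kmeans` function, the two-value "frames", and the choice of initial centroids are all assumptions made for the example.

```python
def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm on tuples of floats; returns (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, labels


# Toy stand-ins for image feature vectors: a dark and a bright phase.
frames = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
centroids, labels = kmeans(frames, centroids=[frames[0], frames[3]])
```

With real camera data each frame would be flattened into a much longer feature vector, and frames sharing a label would be read as belonging to the same combustion phase.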

Generalizing and standardizing data analytics, machine learning and deep learning approaches for high performance computation

Guided by use cases from different scientific fields

Facilitating use cases by identifying and providing common components for data analysis:

Clustering, Uncertainty Quantification, Dimension Reduction, Feature Learning, Data Assimilation, Classification / Regression, Modelling, Optimization Techniques, Hyperparameter Optimization, Interpolation, Data Mining

Open Source: https://github.com/helmholtz-analytics

[Figure: Distributed tensor. A global tensor with entries $x_{0,0} \dots x_{N,M}$ (general entry $x_{ij}$) is composed of local PyTorch tensors, one block per process: $x_{0,0} \dots x_{n,m}$; $x_{n+1,m+1} \dots x_{2n,2m}$; ...; $x_{r \cdot n,\, r \cdot m} \dots x_{N,M}$. Panel labels: PyTorch tensor, Distributed tensor.]

Earth System Modelling, Aeronautics and Aerodynamics, Neuroscience, Research with Photons, Structural Biology

Hybrid rocket engines: paraffin-based fuel with gaseous oxidizer