© 2019 Cray Inc. Reaching Harmony in Tomorrow’s World of Supercomputing Per Nyberg VP, Artificial Intelligence


Post on 21-May-2020


TRANSCRIPT

Page 1: Reaching Harmony in Tomorrow’s World of Supercomputing · Large Hadron Collider needs • 3D convolutional GAN can generate realistic detector output >2000x faster. Ref: Dr. Federico

© 2019 Cray Inc.

Reaching Harmony in Tomorrow’s World of Supercomputing

Per Nyberg, VP, Artificial Intelligence

Page 2

Topics

• How is AI being adopted in supercomputing environments today?
• What does our future look like?
• Can we reach a state of harmony between the worlds of simulation and AI?

Page 3

The Competitive Landscape Is Changing

• 1970s: Advent of Supercomputing
• 1980s and 90s: The Haves and Have Nots
• 2000s: The Haves and the Have Mores
• Today: The Information Elite

Adapted from: “DIGITAL AMERICA: A TALE OF THE HAVES AND HAVE-MORES” McKinsey Global Institute, 2015

Page 4

How Have Supercomputers Enabled Advanced Simulation?

• Delivering socio-economic benefits through advanced weather and climate forecasts
  • Met Office supercomputers process more than 215 billion weather observations each day
  • High resolution climate modeling can better assess impact of climate change at a regional scale

• Determined the precise chemical structure of the HIV capsid
  • Protein shell that protects the virus’s genetic material and is a key to its virulence
  • Key to the development of new antiretroviral drugs
  • Requires the assembly of more than 1,300 identical proteins – in atomic-level detail

• Crop devastation by whiteflies is a major cause of hunger in East Africa
  • Understanding the DNA of the species by generating phylogenetic trees
  • With only 500 whiteflies in a genetic dataset, the possible relationships between these flies run into the octillions (10^25)

Page 5

...and what about Machine Learning?

• Crop data is key to decision makers.
  • Applying Deep Learning on satellite data, the two major crops can be distinguished with 95% accuracy just a few months after planting and well before harvest.
  • More timely estimates could be used for a variety of applications, including supply-chain logistics, commodity market future projections, and more.

• Cryo-electron microscopy (cryo-EM) provides 3D structural information of biological molecules and assemblies
  • Cryo-EM has improved structure resolution to near atomic in just the past few years
  • Critical to advancing basic biology and to characterizing drugs and drug targets for improved drug discovery.

• Development of systems for connected cars and autonomous technologies is key to enabling autonomous vehicles.
  • These advances could not have been realized without the application of deep learning to object detection in image and full motion video

Page 6

How Are Individual Domains Evolving?

[Diagram labels: SIMULATION, ARTIFICIAL INTELLIGENCE, COGNITIVE SIMULATION; Sim, ML]

• Deep CFD: “Learning on the outside”
• Machine Learning Based Physics Emulators in Numerical Weather Prediction Models: “Learning on the inside”
• Other examples shown: Object Detection; Combustion Modeling; Numerical Weather Prediction; Researching Precursors of Tropical Cyclones Using Deep Learning

Page 7

What Are Some Other Benefits of Machine Learning?

Saving Compute Cycles Through Improved Simulation Guidance

• Application of machine learning optimization techniques such as regularization and steering to Full Waveform Inversion (FWI) seismic imaging.

• Compared with manual velocity model determination and tuning, FWI with ML converges more quickly and efficiently.

When Simulation Is too Expensive

• Detailed simulation of subatomic particle interactions is essential to High Energy Physics at CERN

• Monte Carlo approach is not fast enough for the High-Luminosity Large Hadron Collider needs

• 3D convolutional GAN can generate realistic detector output >2000x faster

Ref: Dr. Federico Carminati et al, CERN

Maximizing Data Utilization

• Satellites create more data than can be assimilated

• Only a small % of available data is used today.

• “Deep learning object detection can be used to identify areas of atmospheric instability from satellite observation data, focus extraction of observations on these regions of interest.”

Ref: Jebb Stewart, NOAA, 2018 ECMWF workshop on HPC in Meteorology
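The "Maximizing Data Utilization" idea, detecting regions of interest and extracting observations only there, can be illustrated with a small sketch. It assumes detections arrive as axis-aligned boxes over a 2D grid; the function name, box format, and toy data are hypothetical and are not taken from the referenced NOAA work.

```python
def extract_regions(grid, boxes):
    """Keep only the grid cells that fall inside at least one detection box.

    Each box is (row_start, col_start, row_end, col_end), end-exclusive.
    This box format is an assumption made for illustration.
    """
    selected = {}
    for row_start, col_start, row_end, col_end in boxes:
        for r in range(row_start, row_end):
            for c in range(col_start, col_end):
                # A cell covered by several boxes is stored only once.
                selected[(r, c)] = grid[r][c]
    return selected


# Toy 10x10 field and two overlapping detections (made-up values).
grid = [[r * 10 + c for c in range(10)] for r in range(10)]
boxes = [(0, 0, 2, 2), (1, 1, 3, 3)]

selected = extract_regions(grid, boxes)
fraction_used = len(selected) / (len(grid) * len(grid[0]))
print(f"cells kept: {len(selected)}, fraction of field used: {fraction_used:.2f}")
```

In this toy case only a small fraction of the field is passed on, which is the point of the approach: downstream assimilation spends its budget on the regions a detector flagged.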

Page 8

Computational Demands of Machine Learning are Growing (fast)

Ref: OpenAI
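The charts for this slide did not survive in the transcript. As a rough illustration of what "growing fast" means, the sketch below assumes the 3.4-month doubling time reported in OpenAI's "AI and Compute" analysis, which this slide appears to cite. That figure is an assumption here, not something stated in the transcript.

```python
def growth_factor(months, doubling_time_months=3.4):
    """Multiplicative growth in training compute after `months`.

    The default doubling time is an assumed value (see lead-in).
    """
    return 2.0 ** (months / doubling_time_months)


per_year = growth_factor(12)
print(f"Compute growth over 12 months: roughly {per_year:.1f}x")
```

For comparison, passing `doubling_time_months=24` gives the growth a Moore's-law-style two-year doubling would produce over the same period.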

Page 9

A More Relevant Example: Climate Science
Credit: Prabhat, NERSC, Exascale Deep Learning for Climate Analytics

[Figure panels cited: Liu et al., ABDA’16; Racah et al., NIPS’17; Kurth et al., SC’17; Kurth et al., SC’18]

Page 10

Climate Instance Segmentation
Credit: Prabhat, NERSC, Exascale Deep Learning for Climate Analytics

Climate Data Set
• CAM5 0.25-degree output
  • 1152x768 pixels, 3-hr
  • 16 variables
  • 20 TB
• Ground Truth Labels for tropical cyclones (TC) and atmospheric rivers (AR)
  • Heuristics for detection followed by mask creation
• Formulate segmentation problem
  • 3 classes - background, TC and AR
  • High imbalance - most pixels are background
  • High variance - the shape of events changes over time and differs from one event to another
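The class imbalance noted on this slide (most pixels are background) is often handled by weighting the loss per class. The sketch below shows one common option, inverse-frequency weights, on made-up labels. It is an illustration only; the transcript does not say which weighting, if any, the cited work used.

```python
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Weight each class by total / (num_classes * count).

    Rare classes get large weights; a class with no pixels gets 0.0.
    """
    counts = Counter(labels)
    total = len(labels)
    weights = []
    for cls in range(num_classes):
        n = counts.get(cls, 0)
        weights.append(total / (num_classes * n) if n else 0.0)
    return weights


# Hypothetical flattened mask: 0 = background, 1 = TC, 2 = AR.
labels = [0] * 96 + [1] * 1 + [2] * 3
weights = inverse_frequency_weights(labels, num_classes=3)
print([round(w, 3) for w in weights])
```

These weights would then multiply the per-pixel loss so that the rare TC and AR pixels are not drowned out by background.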

Segmentation Architecture
• DeepLabv3+, 66 layers
• 43.7M parameters, 14.4 TF/sample
• 100,000 samples
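A back-of-the-envelope calculation helps put this slide's figures in perspective. The sketch assumes 4-byte (float32) values and reads "14.4 TF/sample" as 14.4 teraFLOP of work to process one sample once. Both are assumptions, not statements from the slide.

```python
WIDTH, HEIGHT = 1152, 768      # pixels per sample (from the slide)
VARIABLES = 16                 # variables per pixel (from the slide)
BYTES_PER_VALUE = 4            # assumption: float32 storage

FLOP_PER_SAMPLE = 14.4e12      # assumption: "TF/sample" means teraFLOP per sample
SAMPLES = 100_000              # from the slide


def bytes_per_sample():
    """Uncompressed size of one multi-variable sample."""
    return WIDTH * HEIGHT * VARIABLES * BYTES_PER_VALUE


def flop_per_pass():
    """Work needed to process every sample once."""
    return FLOP_PER_SAMPLE * SAMPLES


print(f"one sample: {bytes_per_sample() / 2**20:.0f} MiB")
print(f"one pass over the data: {flop_per_pass() / 1e18:.2f} exaFLOP")
```

Under these assumptions a single pass over the samples is already on the order of an exaFLOP of work, which is why this workload is presented as a supercomputing problem.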

Page 11

What Does Our Future Look Like?

The Convergence of HPC and AI

• Think You Know How Disruptive Artificial Intelligence Is? Think Again – Forbes
• Collision Course: Commercial and HPC Big Data – Scientific Computing
• Opportunities at the Intersection of high-end computing and data science – SC17 panel
• Blurring the Lines: High-End Computing and Data Science – SC17 panel
• Cray CTO On The Cambrian Compute Explosion – The Next Platform

...Not to mention that ML/DL continues to rapidly evolve

Page 12

Rapidly Evolving AI Workflows with Increasing Technology Diversity

Workflow stages
• Data Preparation (Labor & CPU Intensive): Data Acquisition, Data Processing
• Model Development (Compute (GPU & CPU) Intensive): Model Training, Model Testing
• Model Implementation (Business Process Intensive): Model Deployment, Model Inference

Compute Requirements (By Stage and Task):
• Data preparation
• Machine Learning Training
• Deep Learning Training
• Inference

Data Requirements (Size, Type, Performance):
• Large volume data stores
• High-bandwidth
• Fast IOPs
• Direct connectivity to external storage
• HDF5 ~ Key-Value Stores

Performance Requirements (By Stage and Task):
• Scaling
• Throughput
• User-productivity
• Containers
• Collaborative notebooks
• Open-source frameworks
• High productivity interpreted languages

Training and Inference Processor Landscape
• Intel, AMD, ARM, NVIDIA
• Google TPU v2
• Graphcore
• Habana
• 30+ startups…

Storage Technology Landscape
• Tape
• HDD
• SSD-SATA/SAS
• SSD-NVMe
• Flash
• On-node vs. off-node – Datawarp vs. NVMe-over-Fabrics
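The three workflow stages on this slide and their dominant resource demands can be written down as a small lookup structure. This is only a sketch of how the slide's labels relate to one another; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    tasks: tuple
    intensity: str  # dominant demand, as labelled on the slide


WORKFLOW = (
    Stage("Data Preparation",
          ("Data Acquisition", "Data Processing"),
          "Labor & CPU intensive"),
    Stage("Model Development",
          ("Model Training", "Model Testing"),
          "Compute (GPU & CPU) intensive"),
    Stage("Model Implementation",
          ("Model Deployment", "Model Inference"),
          "Business process intensive"),
)


def stage_of(task):
    """Return the workflow stage that contains the given task."""
    for stage in WORKFLOW:
        if task in stage.tasks:
            return stage
    raise KeyError(task)


print(stage_of("Model Training").intensity)
```

A scheduler or capacity planner could key off `intensity` to route each task to suitable hardware, which is the diversity the slide is pointing at.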

Page 13

Time to Insight: Performance Emphasis Will Move From Optimized “Codes” to “Workflows”

[Figure: “Time to solution” for a single job (limiting resources: # memory channels, # I/O channels; parallel I/O) contrasted with “Time to insight” for a whole workflow. Workflow labels: Data from Database 1, Data from Database 2, Job1 (Clustering), Job1’ (Deep Learning), Job2 (Simulation, Graph Analytics), P1, P2.]
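The difference between optimizing a code and optimizing a workflow can be made concrete with a small critical-path calculation. The dependencies and durations below are invented for illustration; the transcript does not preserve the slide's actual workflow graph.

```python
from functools import lru_cache

# Hypothetical step durations in hours (not taken from the slide).
DURATION_HOURS = {
    "load_db1": 1.0,
    "load_db2": 0.5,
    "job1_clustering": 2.0,
    "job1_deep_learning": 3.0,
    "job2_simulation": 4.0,
}

# Hypothetical dependencies: each step lists the steps it must wait for.
DEPENDS_ON = {
    "load_db1": (),
    "load_db2": (),
    "job1_clustering": ("load_db1",),
    "job1_deep_learning": ("job1_clustering",),
    "job2_simulation": ("load_db2", "job1_deep_learning"),
}


@lru_cache(maxsize=None)
def finish_time(step):
    """Earliest finish time of a step, assuming unlimited parallelism."""
    start = max((finish_time(dep) for dep in DEPENDS_ON[step]), default=0.0)
    return start + DURATION_HOURS[step]


time_to_solution = DURATION_HOURS["job2_simulation"]        # one optimized code
time_to_insight = max(finish_time(step) for step in DEPENDS_ON)  # whole workflow

print(f"time to solution (simulation only): {time_to_solution} h")
print(f"time to insight (entire workflow):  {time_to_insight} h")
```

In this made-up example the simulation accounts for less than half of the end-to-end time, so speeding up only the simulation code has limited effect on time to insight.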

Page 14

Development of Domain and Use-Case Specific Multi-Models

[Diagram labels, data sources: Scientific Data, Streaming Sensor Data, Public Data, 3rd Party Data, Operational Data, Enterprise Data, Social Media, Research & Demographics, Historical Data]

[Diagram labels, data shapes: Points, Lines, Grid, Surface, Mesh]

                  AI Today                          Multi-Model Future
Model             CNN, RNN, LSTM, GAN etc.          Domain-specific
Baseline          Humans, Other ML algorithms       Theory, Science
Use Case          Speech, Image interpretation      Computational Steering, Proxy models
Figure of Merit   Time-to-accuracy, Model-size      + Interpretability, Feasibility
Model Design      Transfer/Incremental Learning     Hyper-parameter optimization
Model Testing     A/B Testing on a cadence          Statistical rigor

Page 15

AI for Science: Increasing Mingling of Simulation and AI Methodologies

Domain-Specific Models
• Fluid Dynamics
• Schrödinger Equation
• Full-Wave Inversion
• Integration with user-facilities

Combinatorial Patterns
• Search for meta-stable states
• Search for particles
• Search for N-way correlations
• Search in a high-dimensional feature space

Model Repurposing
• Deep Learning for X
• Feature engineering
• Predictive modeling
• Hypothesis creation with interpretation

Workflows and Automation
• Smarter initialization for simulations
• Computational steering
• Hybrid models of theory and AI

[Chart: Emerging Use-cases; axes: Model Complexity, Computational Complexity]

Page 16

Convergence Does Not Mean Sameness.
How do Simulation and AI Workflows Compare?

SIMULATION
• Simulation codes are universal, long lasting and evolve more predictably
• Focused on making sure the code mirrors the physics
• Architecture specific optimizations are long lasting
• Tightly integrated processor-memory-interconnect & network storage

ARTIFICIAL INTELLIGENCE
• Workflows are bespoke: Use case and data-specific
• Code can be disposable
• ML-based system behaviour is not easily specified in advance - depends on dynamic qualities of the data and on various model configuration choices
• Maximize data movement -- scan/sort/stream all the data all the time

Credit: “The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction” Eric Breck et al, Google, Inc.

Page 17

Recap – What Does Our Future Look Like?

• Rapidly evolving AI Workflows with greater technology diversity
• Performance emphasis will move from optimized “Codes” to “Workflows”
• Development of domain and use-case specific multi-models
• The mingling of simulation and AI methodologies will:
  • Increase computational requirements and complexity
  • Create entirely new use-cases
• Convergence does not mean sameness:
  • The unique characteristics of simulation and AI need to be addressed
  • Harmony: Need for both simulation and AI to co-exist within the same environment

Page 18

Cray Shasta – Data-Centric Design

Designed to run multiple workloads simultaneously (simulation, AI, analytics)

Reflective of changing application/workflow landscape

Designed to run different processors and accelerators

Designed for a variety of datacenter types

Cray system software enables modularity and extensibility, and delivers a single interface for application development

Running any workflow at scale – simulation/modeling, AI, analytics

Portable and extensible to support an evolving software ecosystem

Cray Slingshot interconnect to handle the complex processing and communication of HPC and AI applications

Slingshot is a complete rethinking of the interconnect for HPC and has added novel adaptive routing, quality-of-service, and congestion management features

Slingshot also adds standard Ethernet compatibility to integrate seamlessly into your existing datacenter operations

FLEXIBILITY EXTENSIBILITY SCALABILITY

Page 19

AURORA – The 1st US Exascale System

“This system will be an excellent platform for both traditional high performance computing applications but also is being designed to be excellent for data analytics, particularly the kinds of streaming data problems we have in the DOE, where we have data coming off accelerators, detectors, telescopes and so forth,”

“This platform is designed to tackle the largest AI training and inference problems that we know about,”

“There are over a hundred AI applications being developed by the various national laboratories. As part of the Exascale Computing Project, there’s a new effort around exascale machine learning, and that activity is feeding into the requirements for Aurora, particularly the software environment; the hardware will be very excellent for training and will achieve state of the art performance on training.”

Rick Stevens, Argonne Associate Laboratory Director for Computing, Environment and Life Sciences

Page 20

Thank You.

Per Nyberg, VP, Artificial Intelligence