Heterogeneous Exascale Computing


Page 1: Heterogeneous Exascale Computing

© Copyright 2021 Xilinx

Heterogeneous Exascale Computing

Ivo Bolsens, CTO

March 25, 2021

Page 2: Heterogeneous Exascale Computing

Towards Exascale Computing

[Figure. Axis labels: Compute; TERA, PETA, EXA; 2000, 2010, 2020. Era labels: PC era, Internet, Mobile, Cloud, Digitalization. Developer labels: RTL developers, hardware-savvy SW developers, data scientists. Other labels: ASIC, Accelerator, Peer, HW/SW co-design, Democratize hardware design, Machine Learning, AI. Product labels: Virtex, Zynq, Alveo, Versal, Zen, Radeon. Compute styles: SCALAR, VECTOR, MATRIX, SPATIAL.]

Page 3: Heterogeneous Exascale Computing

Historic: CPU Only

Top of Rack Switch (TOR)

Page 4: Heterogeneous Exascale Computing

Scale-Up: CPU Host With GPU and FPGA Accelerators

Page 5: Heterogeneous Exascale Computing

Scale-Out: CPU Host With GPU, FPGA and SmartNIC

HPC Ethernet (Cray)

Page 6: Heterogeneous Exascale Computing

Scale-Up: Heterogeneous Fabric of Peers (CPU, GPU, FPGA, SSD)

SmartSSD

Communication bottleneck through CPU

Communication between peers

Page 7: Heterogeneous Exascale Computing

Scale-Up: Heterogeneous Fabric of Peers + SmartSSD

Communication bottleneck through CPU

Accelerator memory

Customizable accelerator

5th gen Samsung VNAND

Page 8: Heterogeneous Exascale Computing

Scale-Out: Compute On Data-In-Use, Data-In-Motion, Data-At-Rest

[Figure labels: Data-In-Use, Data-In-Motion, Data-At-Rest]

Page 9: Heterogeneous Exascale Computing

Scale-Out: Disaggregation of Compute and Storage

[Figure labels: FPGA, CPU, GPU, Storage, NIC. Legend: # = Workload; the numbered markers 1 and 2 identify workloads.]
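The disaggregation idea on this slide can be illustrated with a small sketch: devices live in shared pools, and each workload is composed from whatever mix it needs. This is only a toy model, not anything shown in the deck. The `ResourcePool`, `Workload` and `compose` names, the pool sizes, and the two workload requests are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """A pool of identical devices reachable over the network (hypothetical model)."""
    kind: str
    free: int

    def take(self, count: int) -> int:
        if count > self.free:
            raise RuntimeError(f"not enough {self.kind}: want {count}, have {self.free}")
        self.free -= count
        return count


@dataclass
class Workload:
    name: str
    needs: dict                      # device kind -> number of devices requested
    granted: dict = field(default_factory=dict)


def compose(workload: Workload, pools: dict) -> Workload:
    """Grant a workload devices from each pool, all-or-nothing."""
    # Check every request first so a failed placement leaves the pools untouched.
    for kind, count in workload.needs.items():
        if pools[kind].free < count:
            raise RuntimeError(f"cannot place {workload.name}: short on {kind}")
    for kind, count in workload.needs.items():
        workload.granted[kind] = pools[kind].take(count)
    return workload


# Pool sizes below are made up for illustration.
pools = {kind: ResourcePool(kind, free) for kind, free in
         {"CPU": 4, "GPU": 2, "FPGA": 2, "Storage": 4, "NIC": 2}.items()}
w1 = compose(Workload("workload-1", {"CPU": 1, "FPGA": 2, "Storage": 1}), pools)
w2 = compose(Workload("workload-2", {"CPU": 2, "GPU": 1, "NIC": 1}), pools)
```

Because compute and storage are pooled separately, workload-1 can claim both FPGAs without reserving any GPU, and a third workload asking for an FPGA would be refused until one is released.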

Page 10: Heterogeneous Exascale Computing

Data “In-Use” In Distributed Memory

[Figure label: Fabric]

Page 11: Heterogeneous Exascale Computing

Communication using message passing

Software view: Local memory & message passing
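A minimal sketch of the "local memory & message passing" view, assuming two nodes modeled as Python threads: each node computes only on its own local data, and anything it needs from its peer has to arrive as an explicit message. The node names, data values, and the use of `queue.Queue` as the interconnect are illustrative assumptions, not part of the slide.

```python
import queue
import threading


def node(name, local_memory, inbox, outbox, results):
    """Each node touches only its own local_memory; data moves via messages."""
    partial = sum(local_memory)       # compute on local data
    outbox.put((name, partial))       # send: explicit copy to the peer
    _sender, remote = inbox.get()     # receive: block until the peer's message arrives
    results[name] = partial + remote


a_to_b, b_to_a = queue.Queue(), queue.Queue()
results = {}
threads = [
    threading.Thread(target=node, args=("A", [1, 2, 3], b_to_a, a_to_b, results)),
    threading.Thread(target=node, args=("B", [10, 20], a_to_b, b_to_a, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Both nodes end up with the same global sum, but only because each one sent its partial result explicitly; neither can read the other's list directly.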

Page 12: Heterogeneous Exascale Computing

Future Programmer’s View: Unified Memory

Unified Distributed System Memory

Any To Any Shared Memory Fabric

[Figure labels: Memory, Fabric]
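By contrast with message passing, a unified memory view lets software use one flat address space while the fabric decides which device's memory actually holds each location. The sketch below is a toy model under stated assumptions: the `UnifiedMemory` class, the three node names, and the simple contiguous address layout are hypothetical and are not taken from the deck.

```python
class UnifiedMemory:
    """Toy model: one flat address space backed by several nodes' memories."""

    def __init__(self, node_names, words_per_node):
        self.words_per_node = words_per_node
        self.node_names = list(node_names)
        self.backing = {n: [0] * words_per_node for n in self.node_names}

    def locate(self, address):
        """Map a global address to (node, local offset)."""
        index, offset = divmod(address, self.words_per_node)
        if address < 0 or index >= len(self.node_names):
            raise IndexError(f"address {address} is outside the unified space")
        return self.node_names[index], offset

    def load(self, address):
        node, offset = self.locate(address)
        return self.backing[node][offset]

    def store(self, address, value):
        node, offset = self.locate(address)
        self.backing[node][offset] = value


mem = UnifiedMemory(["cpu", "gpu", "fpga"], words_per_node=4)
mem.store(9, 42)   # global address 9 lands in the third node at offset 1
```

The caller only ever names a global address; `locate` plays the role of the any-to-any fabric by routing the access to whichever node owns that location.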

Page 13: Heterogeneous Exascale Computing

Future: Shared Virtual Address Spaces

Virtual Resource Group A

Virtual Resource Group B

[Figure labels: Memory, Fabric]
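One way to picture shared virtual address spaces is as one page table per resource group: every device in a group resolves the same virtual address to the same physical location, while another group can reuse that virtual address for different memory. The following is a sketch only. `AddressSpace`, the 4096-byte page size, and the device names `gpu0` and `fpga1` are assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class AddressSpace:
    """A virtual address space shared by all devices in one resource group."""

    def __init__(self, name):
        self.name = name
        self.page_table = {}  # virtual page number -> (device, physical page number)

    def map_page(self, virtual_page, device, physical_page):
        self.page_table[virtual_page] = (device, physical_page)

    def translate(self, virtual_address):
        """Return (device, physical address) for a virtual address."""
        virtual_page, offset = divmod(virtual_address, PAGE_SIZE)
        if virtual_page not in self.page_table:
            raise KeyError(f"{self.name}: page fault at {virtual_address:#x}")
        device, physical_page = self.page_table[virtual_page]
        return device, physical_page * PAGE_SIZE + offset


group_a = AddressSpace("Virtual Resource Group A")
group_b = AddressSpace("Virtual Resource Group B")
group_a.map_page(0, "gpu0", 7)    # hypothetical device names
group_b.map_page(0, "fpga1", 3)
```

The same virtual address, for example `0x10`, then resolves to memory on `gpu0` in group A and to memory on `fpga1` in group B, which is what keeps the two groups isolated while each sees a single address space.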

Page 14: Heterogeneous Exascale Computing

Software Abstraction for Exascale Compute

Uniform System Memory

Any To Any Memory Fabric

Virtualized Address Spaces

Page 15: Heterogeneous Exascale Computing

VITIS Software Stack Enables Smart NIC|SSD + Heterogeneous Compute

[Figure labels: FRAMEWORKS; LIBRARIES; V++ device code + XRT API; XRT; OS; Shell; VIRTUALIZATION; ORCHESTRATION; SMART NIC; SMART SSD; COMPUTE AIE; COMPUTE PL]

Page 16: Heterogeneous Exascale Computing

Conclusions

Page 17: Heterogeneous Exascale Computing

Future

Exascale computing and data, driven by new workloads such as AI

Increase compute power by:
    Scale-up of compute nodes
    Scale-out of network

Enabled by accelerators, SmartNIC and SmartSSD:
    Compute on data-in-use
    Compute on data-in-motion
    Compute on data-at-rest

Software stack:
    Higher level of programming abstraction
    Runtime enabling heterogeneous compute fabrics
    Shared virtual address spaces

Page 18: Heterogeneous Exascale Computing

Thank You
