dpu for edge computing - live.nvidia-china.com
TRANSCRIPT
Dec 2020
DPU FOR EDGE COMPUTING
WELCOME TO THE DATA REVOLUTION ERA
Cognitive Revolution: 70,000 BCE
Agricultural Revolution: 10,000 BCE
Scientific Revolution: 1500 CE
Data Revolution: 2000+ CE
Yuval Noah Harari, Author of Sapiens and Homo Deus
DATA GROWTH DRIVES ADVANCED NETWORKING NEEDS
Needs a FASTER NETWORK
- Faster Compute
- Disaggregated Infrastructure
- Accelerated Compute
- Faster Storage
Needs a SECURED NETWORK
- Zero Trust
- Ubiquitous Threats
- More Regulations
Needs an ACCELERATED NETWORK
- Cloud Native Data Center architectures
- Virtualization / Containerization
- Heavy Workloads
- And … Moore’s Law Fading
DON’T KILL YOUR BUSINESS REVENUE
DPU frees up server CPU for its primary application tasks
Diagram: grids of server CPU cores in three configurations
- Bare Metal
- Virtualized & Software Defined
- Software Defined, Hardware Accelerated
Diagram legend: Money Maker Application, Software-Defined Networking, Encryption, Deep Packet Inspection, Done by Hardware (DPU)
BLUEFIELD-2 DATA PROCESSING UNIT
Data Center Infrastructure-on-a-Chip
6.9B Transistors
8 64-bit Arm CPU Cores
Dual 16-way VLIW Engine
100 Gbps IPsec
50 Gbps RegEx
100 Gbps Video Streaming
5M NVMe IOPS
BLUEFIELD-2 DPU BLOCK DIAGRAM
200 Gbps Ethernet & InfiniBand, NRZ & PAM4 modulation
Powered by ConnectX-6 Dx
Subsystem of 8 Arm A72 CPUs in a tile architecture
- 8MB L2 cache, 6MB L3 cache in 4 tiles
- Arm frequency up to 2.5GHz
Fully integrated PCIe switch, 16 bifurcated Gen 4.0 lanes
- Root Complex or End Point modes
1GbE Out-of-Band management port
16 lanes PCIe Gen3/4
BLUEFIELD-2 DELIVERS HIGHEST APPLICATION EFFICIENCY
Equivalent additional CPUs to match a single DPU
Bar chart workloads: Malware Pattern Matching, Video Streaming, IPsec Encryption, Elastic Block Storage, Cloud Overlay Networking, NG Stateful Firewall
Bar chart values: 10X, 15X, 30X, 50X, 2.5X, 150X
INTRODUCING NVIDIA DOCA
Data Center Infrastructure-on-a-Chip Architecture
SDK for BlueField DPUs
Open source APIs – DPDK, SPDK, P4
Certified reference apps & 3rd party solutions
Support for multiple OS
Diagram: DOCA software stack
- INFRASTRUCTURE APPLICATIONS: Infrastructure Management, Software-defined Storage, Software-defined Security, Software-defined Networking
- DOCA SDK: Storage (SPDK), Security (DPDK), Networking (DPDK / P4), Management, Telemetry
- ASAP2, CRYPTO, RoT, RDMA, SNAP
NVIDIA & VMWARE ENABLE HYBRID CLOUD ARCHITECTURE
Run Modern Workloads Efficiently Over New Composable, Disaggregated Infrastructure
Diagram labels: Bare Metal Linux & Windows; Isolation; Network and Security: NSX Svcs; Compute Hypervisor; Storage: VSAN Data; ESXi; Host Management; DPU; Project Monterey
BLUEFIELD-2X DATA PROCESSING UNIT
AI-Powered DPU
200 Gb/s BlueField-2 augmented by Ampere GPU
Enhances the DPU with AI capabilities
Scale out computing performance with GPUDirect and CUDA
Tighter security across the PCIe bus
Apply AI to real time network traffic
- Anomaly detection & automated response
- Traffic shaping/steering
- Dynamic security orchestration
SOFTWARE-DEFINED, HARDWARE-ACCELERATED
Diagram panels: Software Defined Security, Software Defined Storage, Software Defined Networking
Functions shown: Distributed NG Firewall, IDS/IPS, DDoS Prevention, vRouter, vSwitch, VMs & Containers, NVMe-oF, Storage Direct, Data Encryption, DeDup, Micro Segmentation, Telco/NFV, Elastic Storage, Root of Trust, Compression, NAT/Load Balancer
BLUEFIELD - A NEW RANGE OF OPPORTUNITIES
Security
- Micro-Segmentation
- Next Generation Stateless Firewall
- Zero-trust and Agent-less solutions
- DDoS and DPI applications
Storage
- Storage-as-a-Service
- Storage disaggregation with SNAP
- Storage offloads: NVMe-oF, RAID, CRC
Cloud
- Bare metal & virtualized cloud
- Enable hardware secured containers
- Offload mission critical workloads
- Isolated control and data planes
Edge Computing
- Moving data processing to the edge
- Virtualized edge gateways
- Enable micro-servers at the edge
Cloud | Security | Edge | Storage
DATA-CENTER SECURITY CHALLENGES
- Attack Surface
- New Cyber Regulation
- Attack Sophistication
- Lack of visibility & control
- The perimeter is broken
BLUEFIELD DPU SECURITY PACKAGE
Security in All Levels
Advanced L4-L7 Security
- NG Stateful firewall
- Deep Packet Inspection
- Host Introspection
Programmability & Isolation
- Functional isolation
- Security ecosystem
- Ability to run privacy & authentication algorithms
Secured Hardware (RoT)
- Secure firmware upgrade
- Secure boot
- Arm TrustZone
Crypto Accelerations
- Inline encryption: IPsec / TLS
- Storage encryption: AES-XTS
- Hardware public key acceleration
BLUEFIELD FUNCTIONAL ISOLATION
A Computer in front of a computer
Infrastructure functions fully isolated in SmartNIC
Functionality runs securely in a separate trust domain
- Enforces policies on compromised host
- Host access to SmartNIC can be blocked by hardware
ASAP2
BLUEFIELD-2 INLINE ENCRYPTION – IPSEC/TLS
Lower CPU utilization with significantly higher performance
- Encryption/decryption at 100G (IPsec) or 200G (TLS)
Inline offload at the Application level
- Removes software overhead of invoking accelerator
- Inline with other offloads (tunneling, IPsec, SR-IOV etc.)
- Independent of Host interference
Cipher: AES-GCM 128/256bit keys
Supports TLS1.2 and TLS1.3
Diagram: two CPUs exchanging encrypted data; TLS plaintext on the host side, encrypted data on the wire
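For context on what an inline TLS 1.3 offload has to keep track of, here is a minimal sketch of the per-record nonce construction defined in RFC 8446, section 5.3, which AES-GCM record protection relies on. The function name and the sample IV are illustrative assumptions, not part of any BlueField or DOCA API; the slide does not describe how the hardware implements this step.

```python
def tls13_record_nonce(static_iv: bytes, sequence_number: int) -> bytes:
    """Per-record AEAD nonce as described in RFC 8446, section 5.3.

    The 64-bit record sequence number is written in network byte order,
    left-padded with zeros to the IV length, and XORed with the static IV.
    """
    padded_seq = sequence_number.to_bytes(len(static_iv), "big")
    return bytes(a ^ b for a, b in zip(static_iv, padded_seq))


# Hypothetical 12-byte static IV (AES-GCM uses a 96-bit nonce).
example_iv = bytes(range(12))
nonce_record_0 = tls13_record_nonce(example_iv, 0)
nonce_record_1 = tls13_record_nonce(example_iv, 1)
print(nonce_record_0.hex(), nonce_record_1.hex())
```

Because every record uses a different nonce, whichever component encrypts the record (software or an inline engine) needs the current sequence number for that connection.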
BLUEFIELD-2 KTLS OFFLOAD PERFORMANCE
Up to 66% Improvement in CPU Utilization with transmit offload
Offload recovers the CPU overhead that TLS adds
Significant CPU Savings – more than 3 Xeon E5 cores can be saved
Sender CPU Utilization – 100G Single Port, TX Offload (lower is better)

                                           16      20      24      28      32      64
SW kTLS Sender CPU Utilization [%]         23.52   23.37   25.70   27.21   28.37   31.70
kTLS offload Sender CPU Utilization [%]     9.07    9.43   10.29   10.88   11.19   13.10
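The chart data above can be turned into relative savings with a few lines of arithmetic. This is only a worked example on the six data points printed on the slide; the meaning of the x-axis labels (16 to 64) is not stated in the source, and the variable names are illustrative.

```python
# Values transcribed from the slide's chart data.
x_labels = [16, 20, 24, 28, 32, 64]
sw_ktls = [23.52, 23.37, 25.70, 27.21, 28.37, 31.70]     # software kTLS, sender CPU %
hw_offload = [9.07, 9.43, 10.29, 10.88, 11.19, 13.10]    # kTLS offload, sender CPU %


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original CPU utilization removed by the offload."""
    return (before - after) / before


reductions = [relative_reduction(b, a) for b, a in zip(sw_ktls, hw_offload)]

for label, r in zip(x_labels, reductions):
    print(f"{label:>3}: {r:.1%} lower sender CPU utilization")
```

On these six columns the reduction works out to roughly 59% to 61%. The slide's headline figure of up to 66% is not reproduced by these data points and may refer to a measurement that is not shown in the chart.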
BLUEFIELD-2 NG STATEFUL FIREWALL
Static Firewall rules at wire speed programmed using OVS
Accelerated through Connection Tracking
NVIDIA ASAP2 enables seamless offload of filtering and steering
Next-generation Firewall agents can run on Arm cores
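To illustrate what "static rules plus connection tracking" means in general, here is a minimal software sketch of a stateful filter. It is a conceptual model only: the class names, the rule format (a set of allowed destination ports), and the sample addresses are assumptions for illustration, and it does not represent OVS, ASAP2, or the BlueField hardware pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiveTuple:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str

    def reply_direction(self) -> "FiveTuple":
        """The same connection as seen from the other endpoint."""
        return FiveTuple(self.dst_ip, self.src_ip,
                         self.dst_port, self.src_port, self.proto)


class StatefulFirewall:
    def __init__(self, allowed_dst_ports):
        self.allowed_dst_ports = set(allowed_dst_ports)  # static rules
        self.connections = set()                         # tracked connections

    def accept(self, pkt: FiveTuple) -> bool:
        # Packets of an already tracked connection pass in either direction.
        if pkt in self.connections or pkt.reply_direction() in self.connections:
            return True
        # A new connection must match a static rule; if it does, track it.
        if pkt.dst_port in self.allowed_dst_ports:
            self.connections.add(pkt)
            return True
        return False


fw = StatefulFirewall(allowed_dst_ports={443})
outbound = FiveTuple("10.0.0.5", "192.0.2.10", 50000, 443, "tcp")
reply = outbound.reply_direction()
unsolicited = FiveTuple("192.0.2.10", "10.0.0.5", 443, 50001, "tcp")
results = [fw.accept(outbound), fw.accept(reply), fw.accept(unsolicited)]
print(results)  # [True, True, False]
```

The reply is accepted only because the outbound packet created a tracked connection; the unsolicited packet matches neither a rule nor an existing connection.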
BLUEFIELD-2 IS THE MOST SECURE DPU
Trust Shifts to the DPU
Root-of-Trust
Stateful Firewall
Inline Crypto Accelerators
Deep Packet Inspection
Isolated Security Control Plane
Full Isolation from the Host
Diagram labels: CPU, GPU, Network Traffic, Distributed NG Firewall, IDS/IPS, DDoS Prevention, Micro Segmentation, Root of Trust
BLUEFIELD DPU STORAGE PACKAGE
Storage Security
- Data-at-rest AES-XTS encryption
- Authentication services
- Protection between users
Secret Sauce
- NVMe SNAP
- Data (De) Compression
- (De) Duplication
- Integrated data and control planes
Superior Performance
- Dual 100Gbps or single 200Gbps
- Up to 5.4M IOPS @4KB
- Lowest latency
- NVMe-oF offloads
BLUEFIELD ELASTIC-BLOCK-STORAGE SNAP
All types of storage solutions: DAS, Scale-UP, Scale-OUT, Hyperconverged and more
Emulated Interfaces on PCIe
- NVMe SNAP / virtio-blk SNAP
Remote Storage Access
- NVMe-oF & RDMA offload
- iSCSI, iSER, NFS, CEPH
Hyperconverged
- Local Storage Access
- Direct/Indirect
Diagram labels: Indirect, Direct, NVMe SNAP
DPU ACCELERATES STORAGE COMPOSABILITY
Emulates remote storage to appear as local to the host OS
Dynamically assigned storage, not bound by physical capacity
Virtualized or Bare Metal Cloud
Over-provisioning, scaled to rack/cluster
Inbox standard drivers
OS agnostic - supports legacy OSs
Disrupting Enterprise Cloud Economics
Diagram labels: Compute Platforms, Host OS, Remote Storage, Virtio-blk, NVMe, DPU SNAP Framework
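The idea of "dynamically assigned storage, not bound by physical capacity" can be pictured with a small thin-provisioning model: the volume advertises a large capacity but consumes backing space only for blocks that have been written. This is a hypothetical sketch for illustration; the `ThinVolume` class and its methods are invented here and are not part of SNAP or any NVIDIA API.

```python
class ThinVolume:
    """Hypothetical thin-provisioned volume: blocks use backing capacity only once written."""

    def __init__(self, advertised_blocks: int, block_size: int = 4096):
        self.advertised_blocks = advertised_blocks
        self.block_size = block_size
        self._blocks = {}  # logical block address -> data

    def _check(self, lba: int) -> None:
        if not 0 <= lba < self.advertised_blocks:
            raise IndexError("LBA outside the advertised capacity")

    def write(self, lba: int, data: bytes) -> None:
        self._check(lba)
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        self._blocks[lba] = data  # backing space is allocated on first write

    def read(self, lba: int) -> bytes:
        self._check(lba)
        # Blocks that were never written read back as zeros.
        return self._blocks.get(lba, bytes(self.block_size))

    @property
    def allocated_bytes(self) -> int:
        return len(self._blocks) * self.block_size


vol = ThinVolume(advertised_blocks=1_000_000)  # about 4 GB advertised
vol.write(42, b"\x01" * 4096)
print(vol.advertised_blocks * vol.block_size, vol.allocated_bytes)
```

The host sees the advertised size, while the backing pool only has to hold what was actually written, which is what makes over-provisioning across a rack or cluster possible.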
BLUEFIELD-2 ENCRYPTION DATA-AT-REST
200 Gbps encryption/decryption to/from storage media
- Transparent to users
- Protection between users
Supports AES-XTS with 256/512-bit keys for data at rest
Signature/T10-DIF and encryption can operate simultaneously
Protection between users sharing the same storage resource
FIPS compliance at the adapter level for all storage disk types
Diagram: two panels, Encryption @ Initiator and Encryption @ Target, each showing the Network Interface, PCIe Interface, Encryption/Decryption, and Signature Calculation/Verification blocks
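As background on the signature step, the following sketch shows how a T10-DIF protection tuple can be computed and checked in software. It assumes the standard layout of 8 bytes per sector (2-byte guard CRC, 2-byte application tag, 4-byte reference tag, big-endian) and the T10-DIF CRC-16 polynomial 0x8BB7. The function names are illustrative, and the slide does not describe how the adapter implements this in hardware.

```python
import struct

T10_DIF_POLY = 0x8BB7  # generator polynomial of the T10-DIF guard CRC


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x8BB7, zero initial value, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_protection_info(sector: bytes, app_tag: int, ref_tag: int) -> bytes:
    """Build the 8-byte tuple: guard (2 bytes), application tag (2), reference tag (4)."""
    return struct.pack(">HHI", crc16_t10dif(sector), app_tag, ref_tag)


def verify(sector: bytes, protection_info: bytes, expected_ref_tag: int) -> bool:
    """Check the guard CRC against the data and the reference tag against the expected value."""
    guard, _app_tag, ref_tag = struct.unpack(">HHI", protection_info)
    return guard == crc16_t10dif(sector) and ref_tag == expected_ref_tag


sector = bytes(512)                      # one 512-byte sector of zeros
pi = make_protection_info(sector, app_tag=0, ref_tag=7)
corrupted = b"\x01" + sector[1:]         # flip one bit in the first byte
print(verify(sector, pi, 7), verify(corrupted, pi, 7))
```

When signature and encryption run together, the order in which the two are applied determines whether the guard covers plaintext or ciphertext; the slide does not specify which order the adapter uses.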
ONE AND ONE MAKE THREE
Accelerate Network and Storage IOs
- vSwitch running inside BlueField-2 Arm cores
- IOs directly attached to user VMs/Containers
Secure and Segregate. Physically separated from host CPU
Offload technologies to reach highest performance
- OVS control path in Arm cores
- OVS data path accelerated by NIC hardware
- Storage offload engines: RDMA, NVMe-oF, RAID, T10-DIF
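The split between a software control path and a hardware-accelerated data path can be pictured as a flow cache: the first packet of a flow misses the fast table and is handled by the control path, which installs a rule so that later packets are forwarded without software involvement. The sketch below is a conceptual model under that assumption; the class, the MAC-to-port rule format, and the counters are hypothetical and do not reflect the actual OVS or ASAP2 interfaces.

```python
class OffloadedSwitch:
    def __init__(self, control_rules):
        self.control_rules = dict(control_rules)  # control path policy: dst MAC -> output port
        self.hw_flow_table = {}                   # data path: exact-match flow cache
        self.slow_path_hits = 0
        self.fast_path_hits = 0

    def forward(self, dst_mac: str):
        """Return the output port for a packet, or None if it is dropped."""
        if dst_mac in self.hw_flow_table:
            self.fast_path_hits += 1              # handled entirely by the data path
            return self.hw_flow_table[dst_mac]
        self.slow_path_hits += 1                  # miss: consult the control path
        port = self.control_rules.get(dst_mac)
        if port is not None:
            self.hw_flow_table[dst_mac] = port    # install the flow for later packets
        return port


sw = OffloadedSwitch({"aa:bb:cc:00:00:01": 1})
first = sw.forward("aa:bb:cc:00:00:01")    # miss, rule installed
second = sw.forward("aa:bb:cc:00:00:01")   # hit in the flow table
unknown = sw.forward("aa:bb:cc:00:00:99")  # no rule, dropped
print(first, second, unknown, sw.slow_path_hits, sw.fast_path_hits)
```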
INTRODUCING 5T FOR 5G!
▪ 5T for 5G
▪ Time-Triggered Transmission Technology for Telco
▪ Hardware timing offloads
▪ Supports ITU-T G.8275/IEEE 1588v2 PTP
▪ Highest clock accuracy
▪ Maximum Absolute Time Error less than 16 ns!
▪ ASAP2 Time Based Flow Engine
▪ Software-Defined, Hardware-Accelerated Time Bound Packet Steering
▪ Ideal for vRAN (virtual Radio Access Networks)
▪ Running on NVIDIA EGX platform over Aerial 5G SDK
▪ eCPRI windowing
▪ Hardware Offloaded, Schedule Based Packet Transmission
▪ Built-in ConnectX & BlueField offloads
▪ Best Power & Cost for Telco Edge Cloud
EGX
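For readers unfamiliar with PTP, the standard IEEE 1588 delay request-response exchange derives the clock offset and path delay from four timestamps. The sketch below shows that arithmetic; it assumes a symmetric path delay, and comparing the computed offset against the slide's 16 ns figure is only an illustrative proxy, since the slide does not say how its time error was measured. Function and constant names are hypothetical.

```python
MAX_ABS_TIME_ERROR_NS = 16  # figure quoted on the slide


def ptp_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Offset of the local clock from the master, and one-way path delay.

    t1: master sends Sync          (master clock)
    t2: local clock receives Sync  (local clock)
    t3: local clock sends Delay_Req (local clock)
    t4: master receives Delay_Req  (master clock)
    Assumes the path delay is the same in both directions.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, delay


def within_budget(offset_ns: float) -> bool:
    return abs(offset_ns) < MAX_ABS_TIME_ERROR_NS


# Example in nanoseconds: local clock runs 100 ns ahead, path delay is 500 ns.
offset_ns, delay_ns = ptp_offset_and_delay(t1=1000, t2=1600, t3=2000, t4=2400)
print(offset_ns, delay_ns, within_budget(offset_ns))  # 100.0 500.0 False
```

Hardware timestamping matters here because any jitter in capturing t1 through t4 feeds directly into the computed offset.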
A DPU SOLUTION FOR EACH CUSTOMER
Open Platform (Today)
- BlueField-2 DPU
- DOCA SDK
- Customer Apps
VMware based Solution (2021)
- BlueField-2 DPU
- VMware
Full Solution & Ecosystem (2021)
- BlueField-2 DPU
- DOCA SDK
- 3rd Party Apps
THE DATA CENTER IS THE NEW UNIT OF COMPUTING
Accelerated Disaggregated Infrastructure (ADI)
NVIDIA Networking
- Software defined, Hardware-accelerated
- DPU (data processing unit)
- DPU essential to disaggregate resources & make composable ADI
Accelerated Computing
- GPU: AI & machine learning
- GPU critical for AI & machine learning
- Every workload will become AI Accelerated