SONiC: Software for Open Networking in the Cloud Rita Hui Principal Software Manager Feb 10, 2020


Page 1

SONiC: Software for Open Networking in the Cloud

Rita Hui

Principal Software Manager

Feb 10, 2020

Page 2

Microsoft Cloud Network – A Multi-Billion Dollar Bet

Page 3

Disaggregation with SONiC

• Fast Technology Evolution
• Reduce Operational Burden
• Modular & Composable Software
• Choices of Vendors & Platforms

Page 4

[Figure: SONiC architecture diagram. Components shown: SYNCD, LLDP, RedisDB, TeamD, Utility, Platform, SWSS, Database. More apps: SNMP, DHCP, IPv6, BGP, Telemetry (some labeled "New").]

Page 5

Switch Abstraction Interface (SAI)

• Simple, consistent, and stable network application stack
• Helps consume the underlying complex, heterogeneous hardware more easily and quickly

[Figure: Network Applications sit above the Switch Abstraction Interface; the hardware underneath is depicted as speaking different languages: Hello, частный, 你好, Bonjour, नमस्ते]

Page 6

Switch Abstraction Interface

CRUD operations over extensible Entity/Attribute/Value data model

Reference data-plane behavior model supports many devices

Significant feature/partner growth since announcement in 2015

https://github.com/opencomputeproject/SAI
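
The "CRUD operations over extensible Entity/Attribute/Value data model" bullet can be pictured with a small in-memory sketch. This is not the SAI C API; `EavStore`, its method names, and the `port` attributes are hypothetical and only illustrate the shape of the model: objects are created, read, updated, and removed by key, and a new attribute can be added without changing any schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

# Hypothetical names throughout: this is NOT the SAI API, only an
# in-memory illustration of CRUD over an entity/attribute/value model.
ObjectKey = Tuple[str, int]  # (entity type, object id)


@dataclass
class EavStore:
    objects: Dict[ObjectKey, Dict[str, Any]] = field(default_factory=dict)
    next_id: int = 1

    def create(self, entity: str, **attrs: Any) -> ObjectKey:
        """Create an object of the given entity type with initial attributes."""
        key = (entity, self.next_id)
        self.next_id += 1
        self.objects[key] = dict(attrs)
        return key

    def get(self, key: ObjectKey, attr: str) -> Any:
        """Read one attribute value."""
        return self.objects[key][attr]

    def set(self, key: ObjectKey, attr: str, value: Any) -> None:
        """Update or add one attribute; no schema change is needed."""
        self.objects[key][attr] = value

    def remove(self, key: ObjectKey) -> None:
        """Delete the object together with all of its attributes."""
        del self.objects[key]


store = EavStore()
port = store.create("port", speed=40000, admin_state=True)
store.set(port, "mtu", 9100)  # attribute that was not present at creation
print(store.get(port, "speed"), store.get(port, "mtu"))
store.remove(port)
```

Because every operation goes through the same four verbs, supporting a new kind of object or attribute in this sketch means adding data, not adding functions.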

Page 7

Page 8

SONiC Containerization

Strengths of Containers

• Clean isolation
• Easy deployment
• Transactional
• Run universally

SONiC Benefits

• Serviceability
• Extensibility
• Development agility
• Cross-platform

Page 9

SONiC Containerization

[Figure: containers running on the SONiC Base Image: Routing, LLDP, SNMP, DHCP, TEAMD, SYNCD, SWSS, Utility, Platform, Database, Telemetry. Options shown for the Routing container: Quagga, FRR, GoBGP, Arista BGP.]

• Components developed in different environments
• Source code may not be available
• Enables choices on a per-component basis

Page 10

SONiC for Network Engineers and Software Engineers

CLI-style interaction enabled by scripts written in Python

Contributed largely by Network Engineers who are using SONiC

Linux bash prompt enables direct access to containers and Redis

Page 11

Build a SONiC Image

Preparation: any server with > 1 TB hard disk, Ubuntu Linux 16.04

Prerequisite: install pip and Jinja on the host build machine

Clone code repos with all Git submodules

Build image with a specific ASIC choice

make configure PLATFORM=[ASIC_VENDOR]

make all

Each module compiles the source code and generates the .deb package

• Main .deb package and zero or more derived packages

From the .deb packages:

• docker-<image(s)>.gz
• sonic-<image>.bin

[Figure: build flow. Sonic Build Module(s) (e.g. Linux, libnl, etc.) → Compile and build .deb → Build Artifacts: target/debs/module(s).deb → Docker Images: target/docker-image(s).gz → Sonic Image: target/sonic-<vendor>.bin]

Building guide on Wiki
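
The build steps on this slide can be sketched as a small dry-run script. Only the two `make` invocations come from the slide. Everything else is an assumption for illustration: the helper names `build_steps` and `run`, the `workdir` default, and the placeholder repository URL and platform value (the slide does not name the repository to clone). `--recurse-submodules` is a standard `git clone` option, used here to cover "with all Git submodules".

```python
import shlex
import subprocess
from typing import List, Optional, Tuple

Step = Tuple[Optional[str], List[str]]  # (working directory, argv)


def build_steps(repo_url: str, platform: str,
                workdir: str = "sonic-buildimage") -> List[Step]:
    """Return the clone/configure/build commands in the order the slide lists them.

    repo_url, platform and workdir are caller-supplied placeholders.
    """
    return [
        # Clone the code repo together with all Git submodules.
        (None, ["git", "clone", "--recurse-submodules", repo_url, workdir]),
        # Select the ASIC vendor, then build everything (both from the slide).
        (workdir, ["make", "configure", f"PLATFORM={platform}"]),
        (workdir, ["make", "all"]),
    ]


def run(steps: List[Step], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cwd, argv in steps:
        print(f"[{cwd or '.'}] " + " ".join(shlex.quote(a) for a in argv))
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


# Dry run with placeholder values: nothing is cloned or built.
run(build_steps("<repo-url>", "<asic-vendor>"))
```

Keeping the default as a dry run makes it easy to review the exact commands before committing a build machine to a long build; the Wiki building guide remains the authoritative reference for the real procedure.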

Page 12

Warm Boot – Subsecond Disruption

• Control plane disruption < 90 seconds
• Data plane disruption < 1 second

[Figure: warm reboot timeline with Routing, Control Plane, and Data Plane tracks: O.S. Reboot → SONiC Starts → ASIC Warm Init → State Reconciliation, via SAI state-driven API → Warm Reboot Finishes]
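
The slide names state reconciliation as the step that lets the data plane keep forwarding, but it does not describe the algorithm. The sketch below shows one generic way to reconcile, assuming state can be flattened into a key/value map: compare what the ASIC still holds with what the restarted control plane wants, and emit only the differences. The function name, the operation tuples, and the route and port values are hypothetical, and this is not presented as SONiC's implementation.

```python
from typing import Any, Dict, List, Tuple

State = Dict[str, Any]
Op = Tuple[str, str, Any]  # (operation, key, value)


def reconcile(asic_state: State, desired_state: State) -> List[Op]:
    """Return the minimal operations that turn asic_state into desired_state.

    Entries that already match produce no operation, so forwarding state
    that did not change is never touched.
    """
    ops: List[Op] = []
    for key, value in desired_state.items():
        if key not in asic_state:
            ops.append(("create", key, value))
        elif asic_state[key] != value:
            ops.append(("set", key, value))
    for key in asic_state:
        if key not in desired_state:
            ops.append(("remove", key, None))
    return ops


# State preserved in the ASIC across the reboot (illustrative values).
preserved = {"10.0.0.0/24": "Ethernet0",
             "10.0.1.0/24": "Ethernet4",
             "10.0.2.0/24": "Ethernet8"}
# State the control plane computes after it comes back up.
restored = {"10.0.0.0/24": "Ethernet0",
            "10.0.1.0/24": "Ethernet12",
            "10.0.3.0/24": "Ethernet8"}

for op in reconcile(preserved, restored):
    print(op)
```

In this example the route `10.0.0.0/24` is identical on both sides, so no operation is issued for it; only the changed, new, and stale entries are reprogrammed.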

Page 13

SONiC Is Powering Microsoft Cloud At Scale

[Figure: tiered datacenter topology with switches labeled SONiC. Tier 3 - Regional Spine: T3-1, T3-2, T3-3, T3-4. Tier 2 - Spine: T2-1-1, T2-1-2, … T2-1-8 through T2-4-1, T2-4-2, … T2-4-4. Tier 1 - Row Leaf: T1-1, T1-2, … T1-7, T1-8 per group. Tier 0 - Rack: T0-1, T0-2, … T0-20 per group. Servers attach below Tier 0.]

Page 14

Debugging Datacenter Network is a Daunting Task

DCNs are large and complex

• O(100K) low-cost devices
• Complex software stack

Network faults are unavoidable and diverse

• Packet drop, latency spike, low throughput, load imbalance, loop…

Existing tools insufficient

• Device counters, ping, traceroute…

Page 15

Example #1: Silent Packet Drop

SOS: application requests encounter timeouts!!

[Figure: requests (Req.) travel from Client through Load balancer to Server; a request ends in Timeout and is followed by Retry.]

Page 16

Counters and Traceroute Cannot Help

• Counters: no significant drops
• Exhaustive traceroute: prohibitively expensive or even infeasible

[Figure: Client, Load balancer, Server]

Page 17

Example #2: Latency Spikes

Queue size watermarks: too coarse-grained to correlate w/ affected flows

Ping & traceroute: cannot measure per-hop delay


Page 18

Azure Scale Monitoring -- EverFlow

[Figure: EverFlow architecture. Components: Guided prober, Load-balancer, Match & mirror, Collectors, Filter & aggregation, Distributed storage, EverFlow controller, Query. Cloud Network Challenge: packet drop, loop, latency spike, load imbalance. Cloud Scale Data: Azure Storage, Azure Analytics.]
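
The "Match & mirror" and "Collector" boxes suggest a simple pattern: a switch mirrors only the packets that match a rule, and mirrored packets are spread across several collectors. The slide gives no rule set or hashing scheme, so the sketch below is an assumption from start to finish: the `Packet` fields, the example rule (TCP control flags), and the CRC-based collector choice are all hypothetical and are not taken from EverFlow itself.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    sport: int
    dport: int
    tcp_flags: FrozenSet[str]  # e.g. frozenset({"SYN"})


Rule = Callable[[Packet], bool]


def is_tcp_control(pkt: Packet) -> bool:
    """Example rule (an assumption): match packets carrying SYN, FIN or RST."""
    return bool(pkt.tcp_flags & {"SYN", "FIN", "RST"})


def pick_collector(pkt: Packet, collectors: List[str]) -> str:
    """Hash the flow so every mirrored packet of a flow reaches one collector."""
    flow = f"{pkt.src}:{pkt.sport}->{pkt.dst}:{pkt.dport}".encode()
    return collectors[zlib.crc32(flow) % len(collectors)]


def match_and_mirror(pkt: Packet, rules: List[Rule],
                     collectors: List[str]) -> Optional[str]:
    """Return the collector to mirror to, or None when no rule matches."""
    if any(rule(pkt) for rule in rules):
        return pick_collector(pkt, collectors)
    return None


collectors = ["collector-0", "collector-1", "collector-2"]
syn = Packet("10.0.0.1", "10.0.0.2", 40000, 443, frozenset({"SYN"}))
data = Packet("10.0.0.1", "10.0.0.2", 40000, 443, frozenset({"ACK"}))
print(match_and_mirror(syn, [is_tcp_control], collectors))
print(match_and_mirror(data, [is_tcp_control], collectors))
```

Mirroring only matched packets keeps the mirrored volume far below the full traffic volume, and hashing on the flow lets each collector analyze a flow without coordinating with the others.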

Page 19

SONiC Keeps Evolving

Years shown: 2016, 2017, 2018, 2019, 2020

Software milestones, one group per year:

• Linux, Basic L2/L3, Containerized, Redis DB
• RDMA/QoS, IPv6, Mgmt. via Swarm, Fast Reboot (<30s)
• Streaming Telemetry, Config DB, Support Virtualization, Warm Reboot (<1s), RESTful API
• Richer Routing Stack: FRR, cRPD
• Management Framework, Dev/Test Enhancements, NAT

Hardware milestones, by growing platform count:

• 40G. ASIC: BRCM Trident 2, MLNX Spectrum, Cavium Xpliant, Centec Goldengate. 5 platforms
• 100G. ASIC: BRCM Tomahawk, Tomahawk2, Marvell Prestera, Barefoot Tofino. 16 platforms
• ARM based. ASIC: Nephos Taurus, BRCM Trident3/Tomahawk3, Helix4, Cisco Lacrosse. 31 platforms
• Chassis Support. BRCM Jericho2, Innovium Teralynx, Marvell Falcon, MLNX Spectrum II, Cisco Silicon One. 92 platforms

Page 20

Page 21

Open Invitation to the SONiC Community

• Inviting contributions in all areas
• New ideas on white/open network devices
• SAI proposals
• Hardware platform
• New features, applications and tools
• Download it, Test, Deploy!

Website: https://azure.github.io/SONiC/
Mailing list: [email protected]
https://github.com/Azure/SONiC
Wiki: https://github.com/Azure/SONiC/wiki/

Page 22

© Copyright Microsoft Corporation. All rights reserved.

Thank You