![Page 1: B4:Experience with a Globally Deployed Software Defined WAN › sites › default › files › T7... · Before SDN Google ran B4 as a single Autonomous System using BGP/ISIS protocols](https://reader035.vdocument.in/reader035/viewer/2022070806/5f04cb487e708231d40fbd1f/html5/thumbnails/1.jpg)
B4: Experience with a Globally Deployed Software Defined WAN
Why?
• To save money!
– WAN hardware and links are over-provisioned
– But this hardware is expensive!
– And Google’s traffic between DCs is increasing rapidly!
Assumptions/Insights
• Control over applications, servers, switches
• Only a few dozen datacenters
• Applications can:
– handle failures
– adapt to changing bandwidth
– indicate traffic patterns and importance through their class and priority
Implementation
• Full control over WAN routing
– WAN-scale SDN deployment
• Managing the links in a smart way
– Traffic Engineering (TE)
TAKING CONTROL OVER WAN LINKS Step-1
DECIDING WHO GETS RESOURCES Step-2
Traffic Engineering (TE)
• Goal: share bandwidth among competing applications, possibly over multiple paths.
• Google defines fair bandwidth sharing as max-min fairness.
• Basics of max-min fairness:
– No source gets a resource share larger than its demand.
– Sources with unsatisfied demands get an equal share of the resource.
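The two max-min rules above can be illustrated with a small water-filling sketch for a single shared link. This is a textbook illustration, not B4's TE algorithm; the function name `max_min_fair` and the example demands are assumptions made for this example.

```python
def max_min_fair(capacity, demands):
    """Water-filling allocation of one shared resource (illustrative only).

    demands maps a source name to its demand; returns source -> allocation.
    """
    allocation = {source: 0.0 for source in demands}
    unsatisfied = set(demands)
    remaining = float(capacity)
    while unsatisfied and remaining > 1e-12:
        share = remaining / len(unsatisfied)
        # Sources whose residual demand fits in an equal share get exactly
        # their demand (rule 1: never more than the demand).
        satisfied = {s for s in unsatisfied
                     if demands[s] - allocation[s] <= share}
        if not satisfied:
            # Everyone left wants more than an equal share, so they all
            # receive the same amount (rule 2) and the link is full.
            for s in unsatisfied:
                allocation[s] += share
            remaining = 0.0
            break
        for s in satisfied:
            remaining -= demands[s] - allocation[s]
            allocation[s] = float(demands[s])
        unsatisfied -= satisfied
    return allocation


# Hypothetical demands on a link of capacity 10.
print(max_min_fair(10, {"a": 2, "b": 4, "c": 8}))
```

Here `a` and `b` are fully satisfied, and `c`, the only source with unsatisfied demand, receives the capacity that is left.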
TE Optimization Algorithm
• Traditional solutions are expensive
• Google’s solution
– Aggregate flows into flow groups and tunnel groups
– 25x faster, and utilizes at least 99% of the bandwidth
• Three steps
– Tunnel Selection: selects tunnels for each flow group (FG)
– Tunnel Group Generation: allocates bandwidth to FGs
– Tunnel Group Quantization: changes the split ratios in each tunnel group (TG) to match the granularity supported by switch hardware.
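The quantization step can be pictured as rounding ideal split ratios to multiples of a hardware quantum. The sketch below uses a simple largest-remainder rounding; this is an assumption chosen for illustration and is not claimed to be the rounding rule B4 uses. The name `quantize_splits` and the `quanta` parameter are hypothetical.

```python
def quantize_splits(weights, quanta):
    """Round split weights to multiples of 1/quanta that still sum to 1.

    Largest-remainder rounding: an illustrative stand-in, not B4's method.
    """
    total = sum(weights)
    ideal = [w / total * quanta for w in weights]  # ideal number of quanta
    units = [int(x) for x in ideal]                # round every tunnel down
    leftover = quanta - sum(units)                 # quanta still unassigned
    # Give the leftover quanta to the tunnels that lost the most to rounding.
    by_remainder = sorted(range(len(weights)),
                          key=lambda i: ideal[i] - units[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        units[i] += 1
    return [u / quanta for u in units]


# A 70/30 split on hypothetical hardware that supports quarters only.
print(quantize_splits([7, 3], 4))  # [0.75, 0.25]
```

Coarser hardware granularity pushes the quantized split further from the ideal one, which is why this step is handled separately from bandwidth allocation.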
TE State and OpenFlow
• B4 switches operate in 3 roles:
1. Encapsulating switch initiates tunnels and splits traffic between them.
2. Transit switch forwards packets based on their outer header.
3. Decapsulating switch terminates tunnels then forwards packets using regular routes.
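The three roles can be sketched as three small functions acting on a packet. Everything here is a simplified assumption for illustration: packets are plain dicts, tunnels are named by strings, and traffic is split by hashing the flow with `zlib.crc32`. None of the names come from B4 or OpenFlow.

```python
import zlib


def encapsulate(packet, tunnels):
    """Encapsulating switch: choose a tunnel and add an outer header.

    tunnels is a list of (tunnel_id, integer_weight) pairs (hypothetical).
    """
    # Expand the weights into slots so a hash picks tunnels proportionally.
    slots = [tid for tid, weight in tunnels for _ in range(weight)]
    flow_key = f"{packet['src']}-{packet['dst']}".encode()
    outer_dst = slots[zlib.crc32(flow_key) % len(slots)]
    return {"outer_dst": outer_dst, "inner": packet}


def transit(encapsulated, forwarding_table):
    """Transit switch: forward using the outer header only."""
    return forwarding_table[encapsulated["outer_dst"]]


def decapsulate(encapsulated):
    """Decapsulating switch: strip the outer header; the inner packet
    is then forwarded using regular routes."""
    return encapsulated["inner"]


packet = {"src": "10.0.0.1", "dst": "10.1.0.9"}
wrapped = encapsulate(packet, [("T1", 3), ("T2", 1)])
print(transit(wrapped, {"T1": "port1", "T2": "port2"}))
print(decapsulate(wrapped) == packet)
```

Hashing on the flow keeps all packets of one flow on the same tunnel, while the weights control how flows are spread across tunnels.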
Using TE and shortest path together
RESULTS Step-3
Link Utilization
Link Utilization
Failures
Google conducted experiments to test the recovery time from different types of failures. Their results are summarized below:
Experience with an outage
• During a move of switches from one physical location to another, two switches were manually configured with the same ID.
• As a result, the network state never converged to the actual topology.
• The system recovered only after all traffic was stopped, buffers were emptied, and the OFCs (OpenFlow controllers) were restarted from scratch.
Backup