
Page 1: SCALE/SWITCHengines Update - Current and Possible SDN Applications

SCALE/SWITCHengines Update – Current and Possible SDN Applications

SDN Workshop, Winterthur, 20 November 2014

Simon Leinen

[email protected]

Page 2: What happened since last time?

• Status at the October 2013 SDN workshop presentation “Building Cloud – Where SDN Could Help”:

– Pilot OpenStack/Ceph cloud “BCC” existed

– New production cloud (2*2 racks) was planned

– SDN considered applicable for:

• Low-cost scalable internal fabric: whitebox ToR leaf/spine with multipath

• DC/Backbone interface without traditional $$ routers

• Tunneling towards customers for VPC (Virtual Private Cloud) offerings


Page 3: “BCC” Cluster – Still Running! (for some value of… :-)

• Built ~10-node Ceph+OpenStack cluster “BCC” (“building cloud competence”)

• Services:

– VMs for various researchers and internal testers

– File synchronization server for ~3’500 end users

of SWITCHdrive service

• Networking:

– 2*10GE per server

– 6*10GE on front-end servers, which route

– Two Brocade “ToR” switches with TRILL-based multi-chassis multipath, L2+VLANs

– 2*10GE towards backbone


Page 4: Pilot SWITCH Cloud Project SCALE

Project SCALE, May 2014 – May 2015, co-funded (50% of manpower) by the CUS P-2 program “Information scientifique – accès, traitement et sauvegarde” (“Scientific information: access, processing and preservation”)

• Two sites: UniL Géopolis, ZHdK Toni-Areal, each with

– 32 (16 compute+16 storage) 2U servers w/2*10GE each

– 2*10GE external connectivity (pending 2nd link @Toni-Areal)

– room for growth to ~20 racks

• New “SWITCHengines” IaaS offering, now in limited testing


Page 5: SCALE Networking

• Hardware: Two ToRs (Brocade ICX, 48*10GE + 6*40GE)

– External: BGP peerings (IPv4+IPv6) to backbone routers

– Internal: redundant connections to servers (802.1ad), multiple VLANs

– Interconnected with single 40GE link

– Router connections optical, all others DAC (direct-attach copper)


Page 6: SCALE Networking: Software

• OpenStack “Icehouse” (2014.1)

– Neutron (OpenStack Networking), standard/open-source components:

• ML2 (Modular Layer-2) plugin

• Open vSwitch (OVS)

• VXLAN overlay for tenant network isolation

• Setup:

– Two OpenStack regions LS/ZH (each with its own Ceph cluster)

– Single “network node” (VM!) per region for virtual L3 routing

• between tenant (virtual) networks

• between VMs and Internet
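To make this model concrete, here is a minimal sketch of how a tenant sets up a network, subnet and virtual router through python-neutronclient, the Icehouse-era client library. The credentials, endpoint URL and the external-network UUID are placeholders, not values from this deployment:

    from neutronclient.v2_0 import client

    # Hypothetical credentials/endpoint -- substitute real ones.
    neutron = client.Client(username='demo', password='secret',
                            tenant_name='demo',
                            auth_url='http://controller:5000/v2.0')

    # Tenant network + subnet; ML2/OVS backs this with a VXLAN segment.
    net = neutron.create_network({'network': {'name': 'private'}})
    subnet = neutron.create_subnet({'subnet': {
        'network_id': net['network']['id'],
        'ip_version': 4,
        'cidr': '10.0.0.0/24'}})

    # Virtual router, realized on the (single, per-region) network node;
    # it routes between tenant networks and towards the Internet.
    router = neutron.create_router({'router': {'name': 'r1'}})
    neutron.add_interface_router(router['router']['id'],
                                 {'subnet_id': subnet['subnet']['id']})
    neutron.add_gateway_router(router['router']['id'],
                               {'network_id': 'EXTERNAL-NET-UUID'})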


Pages 7–9: Neutron

[Three diagram slides illustrating the Neutron architecture; figures not reproduced in this transcript.]

Page 10: Problems: MTU (solved)

• In the default configuration, the usable MTU in tenant (overlay) networks is 14xx bytes

• So, “ping” works, but “apt-get update” hangs…

• Everybody (who uses ML2/OVS/VXLAN) has this problem!

• Default way to “fix” this is to lower client (VM) MTU

– IMHO this is hopeless – 1500 bytes is much too ingrained by now

• We increase the underlay MTU instead (to 1600) – much better
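The arithmetic behind the “14xx” figure and the 1600-byte fix, as a quick sketch (the byte counts are the standard VXLAN-over-IPv4 header sizes):

    # VXLAN-over-IPv4 overhead that eats into the underlay MTU:
    INNER_ETH, OUTER_IPV4, UDP, VXLAN = 14, 20, 8, 8
    overhead = INNER_ETH + OUTER_IPV4 + UDP + VXLAN   # = 50 bytes

    print(1500 - overhead)  # 1450: the "14xx" tenant MTU on a stock 1500-byte underlay
    print(1600 - overhead)  # 1550: a 1600-byte underlay fits full 1500-byte tenant frames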


Page 11: Problems: Performance

• Single-stream performance VM <-> Internet ~1.5Gb/s

– Should be close to 10Gb/s

• Many possible bottlenecks:

– Single network node (will get better with DVR in OpenStack Juno)

– Virtualized network node (could be un-virtualized if necessary)

– OVS overhead (will get better with new versions, e.g. zero-copy)

– VXLAN overhead (will get better with newer kernel versions, and possibly hardware support in future Ethernet adapters)
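To pin down where the loss happens, single-stream throughput can be measured end to end; below is a minimal TCP probe as a sketch (port and duration are arbitrary choices; a dedicated tool such as iperf gives more trustworthy numbers):

    # Minimal single-stream TCP throughput probe.
    # Run "python probe.py server" on one end,
    #     "python probe.py client <server-host>" on the other.
    import socket, sys, time

    CHUNK = b'\0' * (1 << 16)   # 64 KiB per send
    PORT = 5001                 # arbitrary test port

    def server():
        s = socket.socket()
        s.bind(('', PORT))
        s.listen(1)
        conn, _ = s.accept()
        total, t0 = 0, time.time()
        while True:
            data = conn.recv(1 << 16)
            if not data:            # client closed: transfer done
                break
            total += len(data)
        dt = time.time() - t0
        print('%.2f Gb/s over one stream' % (total * 8 / dt / 1e9))

    def client(host, seconds=10):
        s = socket.socket()
        s.connect((host, PORT))
        deadline = time.time() + seconds
        while time.time() < deadline:
            s.sendall(CHUNK)
        s.close()

    if __name__ == '__main__':
        if sys.argv[1] == 'server':
            server()
        else:
            client(sys.argv[2])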


Page 12: Problems: Missing features

• IPv6 for VMs

– “hopefully in Juno” (not really a Neutron issue?)

• VPC (VPN connections back into customers’ campus LANs)

– To be added in the longer term (IT departments may need this more than researchers?)

• LBaaS (OpenStack-integrated Load Balancer)

– Should be easy to add


Page 13: Move to “real” SDN controller?

• But there are so many to choose from!

– Nicira NSX

– OpenContrail (open source, uses MPLS, Juniper base)

– Midokura/MidoNet (open source since this month! EPFL ties)

– Nuage (uses MPLS, Alcatel-Lucent/TiMetra base)

– OpenDaylight (but what does it do?)

– Calico (open source, Metaswitch base, L3-based, even does IPv6)

– Plumgrid, …

– Snabb NFV! (open source, hackable, high-performance user-space)

• …or should we just wait until the market sorts this out?


Page 14: Growing the Cloud: Internal fabric

• Beyond a few racks, we need some sort of “aggregation layer” beyond the ToR. There are multiple approaches:

– Traditional: a large aggregation switch (doubled for redundancy)

– Modern: leaf/spine design <- cost-effective “commodity” kit

• How can servers make use of parallelism in the fabric?

– Smart L2 switches (TRILL, Multi-chassis LAG etc.) – vendor lock-in?

– L3 switches with hypervisor-based overlay à la Nicira NVP
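In the L3 variant, the fabric’s parallelism is used per flow: each switch hashes a flow’s 5-tuple onto one of its equal-cost uplinks (ECMP), so a single flow stays on one path (no reordering) while many flows spread across the spine. A toy sketch of the idea; real switches use vendor-specific hardware hash functions:

    # Toy ECMP next-hop selection: hash the flow 5-tuple onto N equal-cost uplinks.
    import hashlib

    def ecmp_uplink(src_ip, dst_ip, src_port, dst_port, proto, n_uplinks):
        key = ('%s|%s|%d|%d|%s' %
               (src_ip, dst_ip, src_port, dst_port, proto)).encode()
        return int(hashlib.sha1(key).hexdigest(), 16) % n_uplinks

    # Every packet of this flow picks the same one of 4 uplinks;
    # other flows land on other uplinks with roughly uniform probability.
    print(ecmp_uplink('10.0.0.5', '10.0.1.7', 43210, 80, 'tcp', 4))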


Page 15: New HW/SW options for leaf/spine

• “White-box” ToR switches sold without an OS (just ONIE, the Open Network Install Environment)

– e.g. 32-port 40GE for < CHF 10’000

• Run e.g. Cumulus Linux on them

– Could use Puppet to provision them – the same as for servers


Page 16: What Became of “Future Internet”?

“The Internet has a great future behind it” – Jon Crowcroft

Big funding drive since ~2007 in US and EU (and…)

• Clean slate / greenfield / disruptive thinking

• Radical new (or old) ideas (e.g. ICN)

• Testbeds!


Page 17: What Became of “Future Internet”? (cont.)

• In 2014:

– EU held last Future Internet Assembly

– Moving towards new horizons, e.g. 5G

• FI-PPP still running – see 5 December event @ZHAW


Page 18: What Became of “Future Internet”? (cont.)


Hypothesis:

Future Internet = Current Internet + Cloud

• as new generative platform (cf. FI-Labs)

• to save Telcos (NFV)

Future of Cloud (incl. NFV) = OpenStack
