Arne Heitmann | Sr. System Engineer EMEA ComConcult Netzwerk Forum | Koenigswinter | 19.04.2016 Architektur im Rechenzentrum - 25, 50 und 100G Architectures for the Datacenter 25, 50 and 100G


Page 1: Architektur im Rechenzentrum - 25, 50 und 100G ... · Architektur im Rechenzentrum - 25, 50 und 100G Architectures for the Datacenter ... Optical Connector types for Parallel and

Arne Heitmann | Sr. System Engineer EMEA

ComConcult Netzwerk Forum | Koenigswinter | 19.04.2016

Architektur im Rechenzentrum - 25, 50 und 100G

Architectures for the Datacenter – 25, 50 and 100G


© 2016 Mellanox Technologies 2

Agenda

Introduction – Drivers for higher speeds

Bandwidth Factors

• Silicon / Buffers

• Transceivers / Cabling

Bandwidth and Resource Optimization

• Remote Direct Memory Access (RDMA)

• Remote Direct Memory Access over Converged Ethernet (RoCE)

Possible Scenarios


Drivers for Higher Speeds

Introduction


Entering The Era of 25GbE, 50GbE And 100GbE

• Copper (Passive, Active), Optical Cables (VCSEL), Silicon Photonics
• 100GbE Adapter (10 / 25 / 40 / 50 / 56 / 100GbE), Multi Host Solution
• 32 100GbE Ports, 64 25/50GbE Ports (10 / 25 / 40 / 50 / 56 / 100GbE), Throughput of 6.4Tb/s


Demand

More Virtual Machines Per Server

Interconnect Bandwidth Determines VM Density

• 10GbE adapter card: 20 VMs
vs.
• Mellanox 40GbE adapter card: 60 VMs


Demand

The World of Bandwidth is changing

International bandwidth growth (projected 2012-2019)

Global IP traffic by type in petabytes/month

Source: Ars Technica, 2012 Source: TeleGeography/ITU


Demand

Media/Entertainment – Acceleration already happening

Data rates and storage are exploding, due to high pixel counts and frame rates

10GbE Ethernet is not going to provide the necessary BW going forward

| Description | Hres | Vres | Colour depth (bits) | Pixels RGB | FPS | RAW BW (MB/sec) | RAW BW (Gbits/sec) | 8Gb FC lanes | 16Gb FC lanes | No. 10Gb lanes | No. 40Gb lanes | No. 56Gb lanes | No. 100Gb lanes | Storage GB/sec | Strg 90 min Movie (TB) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HD Video - Low FPS | 1920 | 1080 | 16 | 3 | 30 | 373.25 | 2.99 | 1 | 1 | 1 | 1 | 1 | 1 | 0.37 | 2.02 |
| HD Video (US) | 1920 | 1080 | 16 | 3 | 50 | 622.08 | 4.98 | 1 | 1 | 1 | 1 | 1 | 1 | 0.62 | 3.36 |
| HD Video (EMEA) | 1920 | 1080 | 16 | 3 | 60 | 746.50 | 5.97 | 1 | 1 | 1 | 1 | 1 | 1 | 0.75 | 4.03 |
| 2K Video (US) | 2048 | 1080 | 16 | 3 | 50 | 663.55 | 5.31 | 1 | 1 | 1 | 1 | 1 | 1 | 0.66 | 3.58 |
| 2K Video (EMEA) | 2048 | 1080 | 16 | 3 | 60 | 796.26 | 6.37 | 2 | 1 | 1 | 1 | 1 | 1 | 0.80 | 4.30 |
| 4K UHD (Std FPS) | 3840 | 2160 | 16 | 3 | 30 | 1492.99 | 11.94 | 2 | 1 | 2 | 1 | 1 | 1 | 1.49 | 8.06 |
| 4K UHD (3D FPS) | 3840 | 2160 | 16 | 3 | 60 | 2985.98 | 23.89 | 4 | 2 | 4 | 1 | 1 | 1 | 2.99 | 16.12 |
| 4K Cinema (Std FPS) | 4096 | 2160 | 16 | 3 | 30 | 1592.52 | 12.74 | 3 | 2 | 2 | 1 | 1 | 1 | 1.59 | 8.60 |
| 4K-Full Cinema (Std FPS) | 4096 | 3112 | 16 | 3 | 30 | 2294.42 | 18.36 | 4 | 2 | 3 | 1 | 1 | 1 | 2.29 | 12.39 |
| 4K Cinema (3D FPS) | 4096 | 2160 | 16 | 3 | 60 | 3185.05 | 25.48 | 5 | 3 | 4 | 1 | 1 | 1 | 3.19 | 17.20 |
| 5K Cinema (Std FPS) | 5120 | 2700 | 16 | 3 | 30 | 2488.32 | 19.91 | 4 | 2 | 3 | 1 | 1 | 1 | 2.49 | 13.44 |
| 5K Cinema (3D FPS) | 5120 | 2700 | 16 | 3 | 60 | 4976.64 | 39.81 | 7 | 4 | 6 | 2 | 1 | 1 | 4.98 | 26.87 |
| 8K UHD (Std FPS) | 7680 | 4320 | 16 | 3 | 30 | 5971.97 | 47.78 | 8 | 4 | 7 | 2 | 1 | 1 | 5.97 | 32.25 |
| 8K UHD (3D FPS) | 7680 | 4320 | 16 | 3 | 60 | 11943.94 | 95.55 | 16 | 8 | 14 | 3 | 2 | 2 | 11.94 | 64.50 |
| Super Hi-Vision | 7680 | 4320 | 16 | 3 | 120 | 23887.87 | 191.10 | 32 | 16 | 28 | 6 | 4 | 3 | 23.89 | 128.99 |
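The raw bandwidth and storage figures in the video table appear to follow from simple arithmetic: pixels per frame, times colour depth, times three RGB channels, times frame rate. The sketch below reproduces the "HD Video - Low FPS" row under that assumption, using decimal units (1 MB = 10^6 bytes); the function names are illustrative and not from the slides.

```python
def raw_video_bandwidth(hres, vres, depth_bits, channels, fps):
    """Return (MB/s, Gbit/s) for uncompressed video, using decimal units."""
    bits_per_second = hres * vres * depth_bits * channels * fps
    return bits_per_second / 8 / 1e6, bits_per_second / 1e9


def storage_tb(mb_per_s, minutes=90):
    """Storage in TB for a clip of the given length at a constant data rate."""
    return mb_per_s * minutes * 60 / 1e6


# "HD Video - Low FPS" row: 1920 x 1080, 16 bits, 3 channels, 30 fps
mb_s, gbit_s = raw_video_bandwidth(1920, 1080, 16, 3, 30)
print(f"{mb_s:.2f} MB/s, {gbit_s:.2f} Gbit/s, {storage_tb(mb_s):.2f} TB per 90 min")
# prints: 373.25 MB/s, 2.99 Gbit/s, 2.02 TB per 90 min
```

The lane-count columns are not reproduced here, because the usable payload rate assumed per lane is not stated on the slide.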


Demand

New Storage Media Require Faster Networks

Transition to faster storage media requires faster networks

Flash SSDs move the bottleneck from the storage to the network

What does it take to saturate one 10Gb/s link?

• 24 x HDDs

• 2 x SATA SSDs

• 1 x SAS SSD

• NVMe…
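The device counts above imply a rough per-device throughput: a 10Gb/s link carries about 1250 MB/s, so dividing by the number of devices gives the rate each one has to sustain. The sketch below only does that division; the resulting rates are implied by the slide's counts, not measured values, and the names are illustrative.

```python
LINK_GBIT_S = 10  # one 10Gb/s link, as on the slide

# Device counts taken from the slide; per-device rates are implied, not measured.
devices_to_saturate = {"HDD": 24, "SATA SSD": 2, "SAS SSD": 1}


def implied_mb_per_s(link_gbit_s, device_count):
    """Per-device throughput (MB/s) at which `device_count` devices fill the link."""
    return link_gbit_s * 1000 / 8 / device_count


for name, count in devices_to_saturate.items():
    print(f"{name}: about {implied_mb_per_s(LINK_GBIT_S, count):.0f} MB/s each")
```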


Demand

Clouds: Private, Public, Hybrid

Scale up vs. Scale out

The SDDC requires more network interaction

Higher bandwidth required


Moving to 25GbE, 50GbE And 100GbE

• Compute Nodes: 10GbE to 25GbE, 150% Higher Bandwidth
• Storage Nodes: 40GbE to 50GbE, 25% Higher Bandwidth
• Network: 40GbE to 100GbE, 150% Higher Bandwidth

Same Connectors

Similar Infrastructure

Better Cost / Power


Bandwidth Factor

Silicon


Silicon - SerDes

Silicon is connected to the board

SerDes
• Serializer / Deserializer
• May work at ~10Gb/s, ~14Gb/s or ~25Gb/s
• Addresses a certain number of ports

Can be bundled:
• e.g. 4x10G for a 40Gb/s link
• e.g. 4x14G for a 56Gb/s link
• e.g. 4x25G for a 100Gb/s link
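The bundling arithmetic is simply lanes times lane rate. A minimal sketch, using nominal lane rates and ignoring encoding overhead (the function name is illustrative):

```python
def link_speed_gbps(lanes, lane_rate_gbps):
    """Nominal link speed of a port built from bundled SerDes lanes."""
    return lanes * lane_rate_gbps


# (lanes, nominal lane rate in Gb/s)
for lanes, rate in [(4, 10), (4, 14), (4, 25), (2, 25), (1, 25)]:
    print(f"{lanes} x {rate}G -> {link_speed_gbps(lanes, rate)}Gb/s")
```

The last two entries (2x25G and 1x25G) correspond to the 50GbE and 25GbE port options that a 25G-lane design makes possible.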


Silicon – Port Architecture Example

[Figure: Port Options / Port Basic Unit / ASIC. Port options shown for one basic unit: 100 GigE OR 50 GigE OR 40 GigE OR 10 GigE, built from lanes running at 25Gig, 20Gig or 10Gig; the ASIC provides n x 100Gig.]


Silicon - Many Port Configuration Options for n*25GbE

• 4x SFP28: 25GbE (1x25Gb/s) each
• 4x SFP+: 10GbE (1x10Gb/s) each
• QSFP: 40GbE (4x10Gb/s)
• 2x QSFP28: 50GbE (2x25Gb/s) each
• QSFP28: 100GbE (4x25Gb/s)

135 watts


Silicon and Buffering

How to handle different Port Speeds

Ports at same speed:

• Cut Through – fast and efficient

• Store & Forward – slow

Fast port to slow port:

• Cut Through – immediate transmission, risk of “oversubscription”

• Store & Forward – safe, buffer intense

Slow port to fast port:

• Cut Through – needs intelligent S&F, buffer “gap” between speeds

• Store & Forward – very buffer intense

Buffer should be…

… dynamically allocated

… flexibly reserved
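To make the slow-port-to-fast-port case concrete: if the egress port drains faster than the ingress port fills, transmission cannot start immediately or the egress would run out of data mid-frame. In an idealized model (constant line rates, header lookup time ignored), the egress can start once L x (1 - R_in / R_out) bytes of an L-byte frame have arrived. The sketch below is an illustration of that model only, not a description of any particular switch; all names are hypothetical.

```python
def store_and_forward_delay_us(frame_bytes, ingress_gbps):
    """Time to receive the whole frame before forwarding can start (microseconds)."""
    return frame_bytes * 8 / (ingress_gbps * 1000)


def min_bytes_before_forwarding(frame_bytes, ingress_gbps, egress_gbps):
    """Bytes to buffer before transmission starts so the egress never underruns."""
    if egress_gbps <= ingress_gbps:
        # Same speed or fast-to-slow: pure cut-through works in this idealized model.
        return 0.0
    return frame_bytes * (1 - ingress_gbps / egress_gbps)


# 1500-byte frame arriving on 25GbE and leaving on 100GbE
print(min_bytes_before_forwarding(1500, 25, 100))  # 1125.0 bytes instead of all 1500
print(store_and_forward_delay_us(1500, 25))        # 0.48 us for full store & forward
```

In the fast-to-slow direction the buffering problem is the opposite one: data arrives faster than it can leave, which is where the oversubscription risk noted above comes from.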


Enabling the Most Efficient Storage and Data Analytics Systems

Switching mode, Spectrum vs. Competition:
• Ports at Same Speed: Spectrum Cut Through, Competition Store & Forward
• Fast Port to Slow Port: Spectrum Cut Through, Competition Store & Forward
• Slow Port to Fast Port: Spectrum Intelligent S&F*, Competition Store & Forward

* Buffers minimum possible amount of packet data

Performance
• Zero Packet Loss
• 300ns Cut-Through Latency
• Non-Blocking 25/50/100GbE

Dynamic Buffer
• Dynamically allocated
• Flexible buffer reservation


Bandwidth Factor

Transceiver / Cabling


Transceiver – some Numbers

What comes next?

50G over a single Lane

• IEEE 802.3 50 Gb/s Ethernet Over a Single Lane and Next Generation 100 Gb/s and 200 Gb/s Ethernet Study Group

IEEE P802.3by 25 Gb/s Ethernet Task Force

• Standard to come

IEEE P802.3bs 400 GbE Task Force

• Adopted timeline says Standard for 2018??


Transceiver - The Evolution between 10Gb, 25Gb, 40Gb, 50Gb & 100Gb

IEEE 802.3bm


Transceiver - Pluggable Module Standards

• CFP: The CFP MSA defines hot-pluggable optical transceiver form factors to enable 40 Gbit/s and 100 Gbit/s applications. CFP modules use the 10-lane CAUI-10 electrical interface.
• CFP2: CFP2 modules use the 10-lane CAUI-10 electrical interface or the 4-lane CAUI-4 electrical interface.
• CFP4: CFP4 modules use the 4-lane CAUI-4 electrical interface.
• QSFP28: QSFP28 modules use the 4-lane CAUI-4 electrical interface.
• CPAK: Cisco has the CPAK optical module that uses the four-lane CEI-28G-VSR electrical interface.
• CXP: There are also CXP and HD module standards. CXP modules use the CAUI-10 electrical interface.


Cables - Optical Connector types for Parallel and single fiber infrastructures

MPO Optical Connectors (also called MTP or MPO/MTP)
• 12-fiber Optical Connector: 4 Transmit, 4 Receive (4 unused fibers in middle)
• Can be used for Multi-mode (SR4) or Single-mode (PSM4)

Duplex LC
• 2-fiber Optical Connector
• Typically for Single-Mode: Single-mode (LR4)


Cables - Solutions for data center applications

[Figure: Data Center Fabrics. Data rate per lane (Gb/s) versus link length (m) for copper, multi-mode fiber (OM3, OM4) and single-mode fiber, showing the ranges covered by DACs, VCSELs and Silicon Photonics.]

Direct Attach Copper
• Zero power
• Demonstrated over 8m at 100G
• Best fit 3m

Active Optical Cables
• VCSELs or SiP
• Reaches to 200m
• Best fit for 5-20m

VCSEL Transceivers
• Reaches to 100m
• Best fit for MMF

SiP Transceivers
• Reaches to 2km
• Best fit for SMF
• Parallel or WDM


Cables/Transceivers - 100GbE Products

Products:
• 100G SR4, Ethernet-Only Transceiver
• 100G Copper DAC, InfiniBand & Ethernet
• 100G AOC, InfiniBand & Ethernet

Use cases:
• For lowest-cost optical 100G switch-to-switch links.
• For Breakouts to 25G / 50G servers + storage with breakout fibers.
• For low-cost, 100G links up to 100m.
• Lowest-cost, 100G-to-Quad-25/50G Breakout cables.
• For Linking Servers + Storage to ToR Switch & NICs.


Bandwidth and Resource Optimization

RDMA / RoCE


Convergence: Eliminates Dedicated Storage Network

Storage – prio1

Management – prio2

vMotion – prio3

Networking – prio4

Web 2.0, Public & Private Clouds Converging on Fast RDMA Interconnects

Single Interconnect for Compute, Networking, Storage

RDMA: InfiniBand & Ethernet (RoCE*)

There is no Fibre Channel in the Cloud!

Converged Fabrics

56Gb/s InfiniBand

10/40Gb/s Ethernet

Compute

Networking

Storage

* RoCE: RDMA over Converged Ethernet


Solving the Storage (Synchronous) IOPs Bottleneck

[Figure: per-operation latency broken down into Software, Disk and Network components for each configuration.]

Synchronous (back to back) IOPs by configuration:
• Mechanical Disks (~6msec): 180 IOPs
• With SSDs (~0.5msec): 3000 IOPs
• With Fast Network (~0.2msec): 4300 IOPs
• With RDMA (~0.05msec): 20,000 IOPs
• With Full OS Bypass & NV-Dimm/Cache (~0.007msec): >100,000 IOPs
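For synchronous, back-to-back I/O (one request outstanding at a time), the achievable rate is bounded by the reciprocal of the per-operation latency. The sketch below applies that relation to the two fastest configurations; the slide's values for the slower configurations are rounded and do not follow this simple relation exactly, so they are left out. The function name is illustrative.

```python
def synchronous_iops(latency_ms):
    """Back-to-back (queue depth 1) operations per second for a given latency."""
    return 1000.0 / latency_ms


for label, latency_ms in [
    ("With RDMA", 0.05),
    ("With Full OS Bypass & NV-Dimm/Cache", 0.007),
]:
    print(f"{label}: about {synchronous_iops(latency_ms):,.0f} IOPs")
```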


Remote Direct Memory Access (RDMA)

Remote Direct Memory Access over Converged Ethernet (RoCE)

What is RDMA?

• Direct memory access from the memory of one computer to that of another without involving either one's operating system. This permits high-throughput, low-latency networking, bypassing the OS and freeing the processor for other tasks.

IBTA specified

Zero-copy, CPU bypass technology for data transfer

Supported over standard interconnect protocols

Allows applications to transfer data directly to the buffer of a remote application

Provides extremely low latency data transfers

Standard RDMA Protocols
• InfiniBand – up to 100Gb/s (EDR)
• RDMA-over-Converged-Ethernet (RoCE) – up to 100Gb/s

Supports diverse storage protocols


I/O Offload Frees Up CPU for Application Processing

• Without RDMA: ~53% CPU Efficiency, ~47% CPU Overhead/Idle
• With RDMA and Offload: ~88% CPU Efficiency, ~12% CPU Overhead/Idle

[Figure: CPU time split between User Space and System Space for both cases.]


RDMA – How it Works

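[Figure: RDMA over InfiniBand or Ethernet. Two hosts (Rack 1 with Application 1, Rack 2 with Application 2), each divided into User, Kernel and Hardware layers. The TCP/IP path passes Buffer 1 through OS and NIC buffers on both sides; the RDMA path moves Buffer 1 between the application buffers via the HCAs.]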


Congestion Control – The Need

Source(s) is pushing more traffic than the network can handle

• Usually due to limited bandwidth at a bottleneck, i.e. a congested link

• Can arise from other causes as well

• Situation lasts for relatively long time

Buffers fill up, latency climbs

Lossless vs. lossy network

• Lossy

- drop packets when buffer is full

- Requires drop indication mechanism – timeouts, NACKs, etc.

- Bad goodput, latency ~ buffer size/BW

• Lossless

- Stop the previous hop when buffer is full, no dropping of packets

- goodput=throughput in congested link - no wasted effort

- Congestion spreading and victim flows with long lived congestion
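The "latency ~ buffer size/BW" relation can be illustrated with a quick calculation: a packet that arrives behind a full queue has to wait until the queued bytes have drained at link speed. The 1 MB queue depth below is a hypothetical value chosen for illustration, not a figure from the slides.

```python
def queue_drain_time_us(buffer_bytes, link_gbps):
    """Time for `buffer_bytes` of queued data to drain at link speed (microseconds)."""
    return buffer_bytes * 8 / (link_gbps * 1000)


# Hypothetical 1 MB of queued data in front of a packet
for gbps in (10, 25, 100):
    print(f"{gbps}Gb/s: {queue_drain_time_us(1_000_000, gbps):.0f} us")
```

The same queue depth costs ten times less latency at 100Gb/s than at 10Gb/s, which is one reason deeper buffers are not automatically better in a lossy network.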



The Challenge – lossless traffic over lossy network

RDMA over Converged Ethernet (RoCE) and Routable RoCE require a lossless medium
• Application assumes lossless media

To provide a lossless network, several mechanisms may be used:
• Global Pause
- IEEE 802.3x standard
• Priority Flow Control (PFC)
- IEEE 802.1Qbb standard
• DSCP based PFC
- Not a standard but becoming more and more popular


Need For A Routable RoCE: RoCEv2

RoCEv1 Operates Within one L2 Network
• Many cloud applications span multiple L2 domains
• Some customers use L3 across datacenter, want RDMA across racks or IP subnets in L3 datacenter
• L2 ToR switch for intra-rack communication
• L3 (IP) router for inter-rack communication

Need for L3 Routable RDMA Protocol
• RoCEv2 Meets This Need
• IBTA collaboration defined Routable RoCE
• Small change, transparent to applications and networks

Approved by IBTA, announced Sept 16th, 2014
• Mellanox ConnectX-3 Pro supports RoCEv2 today
• ConnectX-4 supports RoCEv1/RoCEv2
• Drivers already released for Linux & Windows

[Figure: three separate L2 domains.]
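On the wire, the difference between the two versions is the encapsulation: RoCEv1 carries the InfiniBand transport directly in an Ethernet frame with EtherType 0x8915, while RoCEv2 carries it over UDP/IP with destination port 4791, which is what makes it routable. The sketch below classifies a raw frame on that basis. It is a simplified illustration that handles only untagged Ethernet and IPv4 (no VLAN tags, no IPv6); the function name is hypothetical.

```python
import struct

ROCE_V1_ETHERTYPE = 0x8915  # RoCEv1: InfiniBand transport directly over Ethernet
ROCE_V2_UDP_PORT = 4791     # RoCEv2: InfiniBand transport over UDP/IP
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
ETH_HEADER_LEN = 14


def classify_frame(frame: bytes) -> str:
    """Classify an untagged Ethernet frame as 'RoCEv1', 'RoCEv2' or 'other'."""
    if len(frame) < ETH_HEADER_LEN:
        return "other"
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ROCE_V1_ETHERTYPE:
        return "RoCEv1"
    if ethertype == ETHERTYPE_IPV4 and len(frame) >= ETH_HEADER_LEN + 20:
        ihl_bytes = (frame[ETH_HEADER_LEN] & 0x0F) * 4  # IPv4 header length
        protocol = frame[ETH_HEADER_LEN + 9]
        udp_offset = ETH_HEADER_LEN + ihl_bytes
        if protocol == IPPROTO_UDP and len(frame) >= udp_offset + 8:
            (dst_port,) = struct.unpack_from("!H", frame, udp_offset + 2)
            if dst_port == ROCE_V2_UDP_PORT:
                return "RoCEv2"
    return "other"
```

Because a RoCEv2 packet is an ordinary UDP/IP packet to the network, an L3 router can forward it between subnets without knowing anything about RDMA.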


RoCE Is an Open Standard And Routable

IBTA Collaboration on RoCE
• Steering Committee: Cray, Emulex, HP, IBM, Intel, Mellanox, Microsoft, Oracle
• RoCE specification first released in 2010
• Most widely deployed Ethernet RDMA standard
• Routable since September 2014

Standardization paves way for multi-vendor interoperable solutions

[Figure: RoCEv2 Specification; protocol stacks of InfiniBand, RoCEv1 and RoCEv2.]


Possible Scenarios


Where Interconnects are Being Used in Data Center

• DAC: Server/ToR-to-ToR, "DAC In the Rack", 3m
• AOC: ToR-Leaf/Spine, 3-50m
• SR4: for structured cabling, short reaches; Multi-Mode Optics, 3m-100m
• PSM4, WDM4: for structured cabling, long reaches; Single-Mode Optics, up to 2km
• Connectors: 8-Fiber MPO, 2-Fiber LC, Optical Patch Panel
• Breakouts: Quad 25G SFP breakout, Dual 50G Breakout, 25G SFP


Small/Medium Cloud Deployment 10GbE endpoints

[Figure: 2x 100G Spines with 2x 10/40GbE WAN access uplinks to router; in each rack 2x 10/25G ToRs in mLAG serving (48 + 48)x 10/25GbE active HA; 400G and 960G per rack; 15x Racks, full L2 solution, 1440x 10GbE ports. Link legend: 50GbE, 100GbE.]

• Pure L2 Network; full HA and no-SPoF
• Phase-1: Start with as small as 1 Rack and 2x ToR
• Phase-2: Add 2x spines (32x100G) and build up to 15x Racks in pure L2 domain
• ToR to spine uplinks with 50GbE to ensure (2+2)x link bundle; mitigate cable failure
• (4+4)x 100/40GbE or (8+8)x 50GbE ports available per rack for high performance/"fat" storage nodes
• 48+48x 10GbE for compute/hyper-converged infrastructure


Small/Medium Cloud Deployment 25GbE endpoints

[Figure: 2x 100G Spines with 2x 10/40GbE WAN access uplinks to router; in each rack 2x 25G ToRs in mLAG serving (48 + 48)x 10/25GbE active HA; 800G and 2400G per rack; 7x Racks, full L2 solution, 672x 25GbE ports. Link legend: 100GbE.]

• Pure L2 Network; full HA and no-SPoF
• Ideal for small/medium private cloud
• Phase-1: Start with as small as 1 Rack and 2x ToR
• Phase-2: Add 2x spines (32x100G) and build up to 7x Racks in pure L2 domain
• mLAG on ToRs and spines for full active-active HA
• (2+2)x 100GbE or (4+4)x 50GbE ports available per rack for high performance/"fat" storage nodes
• 48+48x 25GbE for compute/hyper-converged infrastructure
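The port counts and per-rack figures in the two small/medium deployments can be checked with a few multiplications. The sketch below assumes that the smaller per-rack number (400G or 800G) is the capacity toward the spines and the larger one (960G or 2400G) is the server-facing capacity; that reading is an interpretation of the diagram, and the function name is illustrative.

```python
def rack_summary(server_ports, server_gbps, uplink_gbps_total, racks):
    """Return (downlink Gb/s per rack, oversubscription ratio, total server ports)."""
    downlink = server_ports * server_gbps
    return downlink, downlink / uplink_gbps_total, server_ports * racks


# 10GbE endpoints: (48 + 48) ports per rack, 400G to the spines, 15 racks
print(rack_summary(96, 10, 400, 15))  # (960, 2.4, 1440)

# 25GbE endpoints: (48 + 48) ports per rack, 800G to the spines, 7 racks
print(rack_summary(96, 25, 800, 7))   # (2400, 3.0, 672)
```

Under this reading the designs are oversubscribed 2.4:1 and 3:1 respectively, a trade-off that is common at the ToR layer when not all servers transmit at line rate simultaneously.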


1024-port 1:1 100GbE, 2048-port 1:1 50GbE

[Figure: 32x Spines, 64x Leafs; each leaf with 16x 100GbE ports and 32x 50GbE uplinks.]

• 1024x 100GbE 1:1 non-blocking network
• All leaf-spine links are 50GbE
• Same Concept can be used for 2048 port 50GbE 1:1
• Can be used as Spine for 3-level networks
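The sizing of this fabric follows from giving every leaf one 50GbE uplink to every spine. The sketch below checks the endpoint count and the 1:1 ratio for the 100GbE case, and shows one way the 2048-port 50GbE variant could be built (32x 50GbE down per leaf); that second configuration is an assumption, since the slide does not spell it out. The function name is illustrative.

```python
def leaf_spine(spines, leafs, down_ports_per_leaf, down_gbps, uplink_gbps):
    """Return (endpoint ports, uplinks per leaf, oversubscription ratio) for a
    two-tier fabric in which every leaf has exactly one uplink to every spine."""
    uplinks_per_leaf = spines
    down = down_ports_per_leaf * down_gbps
    up = uplinks_per_leaf * uplink_gbps
    return leafs * down_ports_per_leaf, uplinks_per_leaf, down / up


print(leaf_spine(32, 64, 16, 100, 50))  # (1024, 32, 1.0)
print(leaf_spine(32, 64, 32, 50, 50))   # (2048, 32, 1.0)
```

Each spine then terminates 64 links of 50GbE, which matches a 32x 100GbE switch with every port split in two.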


How to split-connect 50GbE with standard optics

[Figure: Leaf-1, Leaf-2, Spine-1 and Spine-2, each showing Port 1,2, connected with MPO to 4x LC-LC splitter cables.]

• Leaf-Spine links are all 50GbE
• Constructed by splitting each 100GbE port to 2x 50GbE
• Use SR4 or PSM4 100GbE optics on each port
• Use standard MPO to 4x LC-LC splitter cables (MM for SR4 and SM for PSM4)
• Lane 1-2 is port-1
• Lane 3-4 is port-2
• Cables connected using standard LC-LC passive couplers


Thank You


References

www.ieee802.org/3/ad_hoc/bwa/BWA_Report.pdf

http://www.ieee802.org/3/50G/public/adhoc/

http://www.ethernetalliance.org/wp-content/uploads/2013/04/Ethernet-Alliance-Technology-Roadmap-FINAL.pdf

https://en.wikipedia.org/wiki/Terabit_Ethernet

http://www.ieee802.org/3/bs/

http://www.open-ethernet.com/

http://25gethernet.org/

https://community.mellanox.com/docs/DOC-1451

http://www.mellanox.com/ethernet/