Challenges of Layer 2 NID Based Architecture For vCPE/NFV Deployments

Santanu Dasgupta, Sr. Consulting Engineer – Global Service Provider Network Architecture

August 2015, SANOG 26



© 2013-2014 Cisco and/or its affiliates. All rights reserved.

Brief NFV Introduction

“Network Functions” in SP Network Architecture Landscape

[Diagram: SP network architecture landscape showing access (LTE, 2G/3G, xDSL, HFC, WiFi, small cell, Ethernet), gateways / service edge, metro, core and data center network infrastructure, subsystems and control, services, and OSS/BSS. Labelled functions include PGW, SGW, GGSN, SGSN, MME, ePDG, BNG, CMTS, PE, NAT, FW, IPSec, DPI, CGN, caching, IMS (P/I/S-CSCF), HSS, HLR, policy, Radius, DNS, DHCP, SDN controller and BGP server.]

Virtualization of “Network Functions”

• Existing hardware / appliance based Network Functions (NFs) → virtualized NFs running as VMs on an x86 server platform
• Step 1: Decouple software from underlying hardware
• Step 2: Port it as a VM/container on an x86 server platform, running as a Network Function

[Diagram: vFW, vCPE, vDPI and vLB running on hypervisors, attached to the data center switching infrastructure.]

NFV, SDN & Orchestration Together

Partial list, just a few main ones are mentioned here:

• VM / VNF lifecycle management in an end-to-end manner
• Network plumbing to orchestrate dynamic topologies
• Configuration management of the VNFs
• Integration with other DC/POD and the WAN
• OAM, assurance, analytics
• Standard APIs

[Diagram: an orchestration and SDN control function above three servers (hypervisors hosting NAT, Firewall and DPI VNFs) and storage, connected by an Ethernet switching network underlay.]

L2 NID for vCPE/NFV: Background & Context

Traditional Managed CPE with IP/MPLS L3VPN

[Diagram: two branches, each with a CE connected over Carrier Ethernet / backhaul to a PE of the MPLS core (PE, P, PE). Static / IGP / eBGP runs between CE and PE; MP-BGP runs between the PEs.]

• There are multiple genuine and perceived issues in the traditional service delivery model:
  • CPE provisioning and servicing often require a truck roll (sending engineers) → high OPEX
  • The number of features enabled on the on-premises CPE makes the solution complex to operate
  • Service delivery is not agile and lacks automation; service turn-up and changes take a lot of time
  • On-site CPEs are often expensive and not an open platform
• The industry is expecting something that is more open, agile, fully automated, flexible enough to address different market segments, and able to help operators reduce their TCO
  • This is where the discussion of an on-premises L2 NID + vCPE architecture for business VPN started in the industry, almost 2 years back

What is “L2 NID”?

• Layer 2 NID – Layer 2 Network Interface Device (example: Cisco ME 1200)
  • Some call it a Layer 2 Network Termination Device (NTD) → we will call it L2 NID in this presentation
• The L2 NID is the device that a Carrier Ethernet operator drops at the customer premises to terminate the Ethernet last mile
  • It is managed by the operator
• It has user-facing interfaces (UNI) and network-facing interfaces (NNI) – typically all Ethernet
  • It marks the demarcation point between the operator and customer networks
• The L2 NID is typically a 4 to 6 port FE/GE/10GE L2 switch with some other capabilities, such as:
  • Ethernet OAM (CFM, Y.1731) for fault & performance management, service activation (Y.1564), timing support (mobile backhaul), etc.

[Diagram: the L2 NID at the customer premises marks the demarcation; the typical Carrier Ethernet operation domain runs from the L2 NID through AGG and NPE nodes towards the PE of the MPLS core.]

L2 NID + vCPE Architecture Proposal for Managed CPE/VPN

• No need to have a Layer 3 CPE at the customer premises anymore
• Virtualize the L3 CPE and put it at the SP’s POP or Cloud / NFV Data Center
• Simplify the branch to only one device, with complex features running in the SP’s cloud, making it easier to operate → may also help to reduce cost

[Diagram: one device (L2 NID, a very simple on-premises CPE) on the customer premises; Layer 2 backhaul of branch traffic over Carrier Ethernet (AGG, NPE) to the vCPE; PE-CE routing between the vCPE and the PE of the MPLS core. The L3 CPE is virtualized, with complex features running at the SP’s premises.]


Why the L2 NID Based Alternative Looked Promising

[Diagram, traditional model: two devices on the customer premises (CPE and L2 NID); PE-CE routing over Carrier Ethernet transport (AGG, NPE) to the PE of the MPLS core.]

[Diagram, L2 NID + vCPE model: one device on the customer premises (L2 NID); Layer 2 backhaul of branch traffic over Carrier Ethernet (AGG, NPE) to the vCPE, then PE-CE routing to the PE of the MPLS core.]

Reducing the customer premises devices from two to one promised to reduce cost and complexity.

This Would Enable Agile Service Creation Too

• NFV and orchestration also enable agile service creation and turn-up
• The vCPE can be chained with a rich set of NFV, Cloud IaaS, PaaS and SaaS services (SP hosted or 3rd party)
• This is true irrespective of the on-premises CPE type – be it L2 NID or L3 CPE or anything else

[Diagram: one device (L2 NID) on the customer premises, connected over Carrier Ethernet (AGG, NPE) and the MPLS core (PE) to an NFV DC hosting vCPE, vFW and vDPI under NFV and cloud orchestration, and to a cloud DC (SP / 3rd party) offering IaaS / PaaS / SaaS.]


Challenges with the L2 NID Based Architecture, where there is no L3 CPE at the Branch

MAC Address Scale Issues for the NFV DC

[Diagram: many L2 NIDs, each with about 250 MAC addresses behind it, backhauled over Carrier Ethernet (AGG, NPE) via an L2 trunk / QinQ to the TOR switch in front of the vCPEs; L3 links run from the vCPEs to the L3 VPN PE. The TOR switch sees 1 million MAC addresses (4000 customers @ 250 MAC).]

• NFV DCs are built with a network switching underlay → servers aren’t directly connected to the NPE
• With Layer 2 backhaul of traffic from customer branches, the NFV DC switching layer will learn all customer MAC addresses
  • An example NFV DC serving 4000 customer sites with 250 MAC addresses per site means 1 million MACs
  • The switching underlay / TOR switches will now need to learn and support 1 million MAC addresses
  • This impacts the cost of the network, service scale, and convergence time upon failure due to the large table size
• This can technically be solved with an end-to-end overlay (like GRE or MPLS PW) from branch to vCPE or from NPE to vCPE → defeating the original simplicity of the proposed architecture to a major extent
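The sizing arithmetic above can be sketched in a few lines. The 4000 sites and 250 MACs per site come from the slide; the function names and the assumption that an L3 device at the branch exposes a single MAC upstream are illustrative only.

```python
# Back-of-the-envelope MAC table sizing for the NFV DC switching underlay.
# Site count and MACs per site are the slide's example numbers.

def macs_with_l2_backhaul(customer_sites: int, macs_per_site: int) -> int:
    """With pure L2 backhaul, the underlay learns every customer host MAC."""
    return customer_sites * macs_per_site

def macs_with_l3_demarcation(customer_sites: int, l3_devices_per_site: int = 1) -> int:
    """Assumption: an L3 device at the branch hides the LAN, so only its own
    WAN-facing MAC is visible to the operator's switches."""
    return customer_sites * l3_devices_per_site

sites, per_site = 4000, 250
l2_total = macs_with_l2_backhaul(sites, per_site)
l3_total = macs_with_l3_demarcation(sites)

print(l2_total)              # 1000000
print(l3_total)              # 4000
print(l2_total // l3_total)  # 250
```

Under these assumptions, the L2 backhaul design inflates the MAC table by a factor equal to the number of hosts per site.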


Security Exposure Due to Extension of the Broadcast Domain

[Diagram: L2 NIDs backhauled over Carrier Ethernet (AGG, NPE) to the TOR switch and vCPEs in the NFV DC; the customer’s L2 domain is extended to the NFV DC delivering the vCPE.]

• With no L3 CPE at the branch and the vCPE in the NFV DC, the customer’s Layer 2 domain extends all the way to the NFV DC
• For a POP with 4000 customers, that means 4000 Layer 2 domains extended into the NFV DC → the SP typically has no control over what assets are at these branches or how secure they are
• This poses a significant security risk to the SP’s infrastructure from DDoS and other attacks

Potential Risks Due to Layer 2 Loops

[Diagram: customer LANs attach to L2 NIDs; the customer LAN’s STP domain ends at the L2 NID, and the operator’s domain starts from there onwards, through Carrier Ethernet (AGG, NPE) to the TOR switch and vCPEs.]

• In an L2 NID only architecture, it is critical to demarcate the customer’s STP domain at the L2 NID
• There could be dual-homing situations where the L2 NID has to participate in the customer’s Spanning Tree domain, and which may also require some form of loop prevention mechanism on the NNI side
• Such dual-homed connectivity requirements pose risks: operational errors may impact the SP infrastructure through Layer 2 loops originating from a customer branch

High Availability Design Challenges

[Diagram: L2 NID with Layer 2 backhaul of branch traffic over Carrier Ethernet (AGG, NPE) to vCPE1 and vCPE2; PE-CE routing from the vCPEs to the PE of the MPLS core.]

• When two vCPEs are deployed for HA, they need to run HSRP / VRRP (let’s consider VRRP)
• There are multiple ways to carry the VRRP traffic between the two vCPEs, with different levels of complexity and different degrees of reliability
• Tracking the “L2 NID ↔ vCPE” connectivity becomes key for reliable bi-directional packet forwarding

High Availability Design Challenges: VRRP on Directly Connected Links

[Diagram: L2 NID backhauled over Carrier Ethernet (AGG, NPE) to vCPE1 and vCPE2, with VRRP running on a direct link between the two vCPEs.]

• Simplest way to run VRRP → but requires an L2 segment between the two NFV DCs (typically two sites)
• Less reliable, since VRRP operation is blind to connectivity failures between the vCPE and the L2 NID
• If vCPE1 is active on the VRRP segment and its connectivity to the L2 NID fails, vCPE1 may remain active → this will cause a service outage
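The failure mode in the last bullet can be illustrated with a toy model. This is not a VRRP implementation: the class, field names and priorities are hypothetical, and the election is reduced to "highest priority wins" on the direct vCPE-to-vCPE segment.

```python
from dataclasses import dataclass

@dataclass
class VCpe:
    name: str
    priority: int              # VRRP priority (hypothetical values below)
    nid_link_up: bool = True   # vCPE <-> L2 NID connectivity

def elect_master(vcpes):
    # The election only sees advertisements exchanged on the direct
    # inter-vCPE link. nid_link_up is deliberately ignored here,
    # which is exactly the blind spot described in the slide.
    return max(vcpes, key=lambda v: v.priority)

def branch_has_service(vcpes):
    # The branch is served only if the elected master can reach the L2 NID.
    return elect_master(vcpes).nid_link_up

vcpe1 = VCpe("vCPE1", priority=110)
vcpe2 = VCpe("vCPE2", priority=100)

print(branch_has_service([vcpe1, vcpe2]))   # True

vcpe1.nid_link_up = False                   # vCPE1 loses its path to the L2 NID
print(elect_master([vcpe1, vcpe2]).name)    # vCPE1 (still master)
print(branch_has_service([vcpe1, vcpe2]))   # False (outage)
```

In this simplified model, vCPE2 is healthy but never takes over, because nothing on the VRRP segment tells it that vCPE1 has lost the branch.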


High Availability Design Challenges: VRRP via the Carrier Ethernet Network

[Diagram: VRRP between vCPE1 and vCPE2 carried across the Carrier Ethernet network (NPE1, NPE2, AGG), showing potential alternate paths for the VRRP sessions.]

• To address the reliability issue, VRRP may be carried across the Carrier Ethernet network. Possible paths:
  • vCPE1 – NPE1 – NPE2 – vCPE2 → needs an L2 path, with an FRR-enabled explicit path between the NPEs to force the path
  • vCPE1 – NPE1 – AGG … – AGG – NPE2 – vCPE2
• More reliable now, since VRRP traffic will be dropped if the L2 NID to vCPE connectivity fails, but:
  • The Carrier Ethernet network must ensure VRRP packets aren’t dropped during congestion – drops would trigger a false failover
  • VRRP delay timers may not be very aggressive; the Carrier Ethernet network has to ensure minimum delay
• This is more complex to provision and operate
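To see why dropped or delayed advertisements matter, here is a minimal sketch of the backup's failover timer. It assumes the VRRPv3 timer definitions from RFC 5798 (the slides do not say which VRRP version is used), and the interval and priority values are examples only.

```python
def master_down_interval(advert_interval_s: float, backup_priority: int) -> float:
    """Seconds a backup waits without hearing the master before taking over.

    Assumed formulas (RFC 5798, VRRPv3):
      Skew_Time            = ((256 - priority) * Master_Adver_Interval) / 256
      Master_Down_Interval = 3 * Master_Adver_Interval + Skew_Time
    """
    skew_time = (256 - backup_priority) * advert_interval_s / 256
    return 3 * advert_interval_s + skew_time

# Example: 1 second advertisements, backup vCPE with priority 100.
print(master_down_interval(1.0, 100))  # 3.609375
```

Under these assumptions, roughly three consecutive advertisements lost or delayed in the Carrier Ethernet network are enough for the backup vCPE to declare the master down, which is the false failover the slide warns about. Shorter advertisement intervals shrink that window proportionally.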


High Availability Design Challenges: VRRP via the Carrier Ethernet Network + IEEE 802.1ag (CFM)

[Diagram: VRRP between vCPE1 and vCPE2 via NPE1 and NPE2, plus 802.1ag CFM sessions between the L2 NID and each vCPE.]

• The previous solution is still not end-to-end, as it does not cover the AGG to L2 NID connectivity
  • For reliable end-to-end operation, that is a key requirement
• A way to address this challenge is to use VRRP and CFM (802.1ag) together
  • CFM runs end-to-end from the L2 NID to the vCPE. When the CFM session expires due to any failure on the path → the vCPE interface goes to line protocol “down” state → VRRP traffic can no longer leave the interface → VRRP switches over to the standby
• This solves the HA issue, but brings a lot of complexity back into the network
  • We were trying to remove complexity by removing the L3 CPE from the branch → but we have now introduced different complexities
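The CFM-driven chain of events can be sketched as a small state model. The names and priorities are hypothetical, and the coupling "CFM session down means line protocol down" is taken from the slide rather than from any specific platform's behaviour.

```python
from dataclasses import dataclass

@dataclass
class TrackedVCpe:
    name: str
    priority: int
    cfm_session_up: bool = True  # 802.1ag session between L2 NID and this vCPE

    @property
    def line_protocol_up(self) -> bool:
        # Per the slide: when the CFM session expires, the vCPE interface
        # goes to line protocol "down".
        return self.cfm_session_up

def current_master(vcpes):
    # A vCPE whose interface is down cannot send VRRP advertisements,
    # so only vCPEs with line protocol up can hold or win the master role.
    candidates = [v for v in vcpes if v.line_protocol_up]
    return max(candidates, key=lambda v: v.priority, default=None)

vcpe1 = TrackedVCpe("vCPE1", priority=110)
vcpe2 = TrackedVCpe("vCPE2", priority=100)

print(current_master([vcpe1, vcpe2]).name)  # vCPE1

vcpe1.cfm_session_up = False                # any failure on the L2 NID <-> vCPE1 path
print(current_master([vcpe1, vcpe2]).name)  # vCPE2
```

The trade-off is visible even in this toy: every vCPE now depends on a per-branch CFM session in addition to VRRP, which is the added operational complexity the slide points out.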


High Availability Design Challenges: Upstream Routing from vCPE to the L3VPN PE

[Diagram: L2 NID backhauled over Carrier Ethernet (AGG, NPE1, NPE2) to vCPE1 and vCPE2, with VRRP and 802.1ag CFM sessions; PE-CE routing from the vCPEs to the PE of the MPLS core uses conditional advertisement depending on vCPE to L2 NID connectivity and VRRP status (typically will require eBGP).]

• The vCPEs need to run PE-CE routing with eBGP / IGP or static routing
• The vCPEs now need to perform conditional route advertisement to the L3 VPN PEs, depending on the reachability from vCPE to L2 NID and on VRRP status
• If vCPE1 is the preferred path for downstream traffic but has lost connectivity to the L2 NID, vCPE1 needs to make the L3VPN PE aware by advertising routes with a less preferred attribute than vCPE2
• This typically restricts the PE-CE routing protocol to eBGP and may add more complexity to the design
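A minimal sketch of the conditional advertisement logic follows. The slide only says "less preferred attribute"; using BGP MED (lower is preferred) is an assumption made here for illustration, as are the MED values and function names. It also assumes both vCPEs peer from the same AS so that the PE compares their MEDs; AS-path prepending would be another way to express the same intent.

```python
# Hypothetical MED values; lower MED is preferred by BGP best-path selection.
PREFERRED_MED = 100
DEGRADED_MED = 1000

def advertised_med(nid_reachable: bool, is_vrrp_master: bool) -> int:
    """MED a vCPE attaches to the branch prefixes it advertises to the PE."""
    healthy = nid_reachable and is_vrrp_master
    return PREFERRED_MED if healthy else DEGRADED_MED

def pe_best_path(meds_by_vcpe: dict) -> str:
    """Simplified PE decision: pick the vCPE advertising the lowest MED."""
    return min(meds_by_vcpe, key=meds_by_vcpe.get)

# Normal operation: vCPE1 is VRRP master and reaches the L2 NID.
normal = {
    "vCPE1": advertised_med(nid_reachable=True, is_vrrp_master=True),
    "vCPE2": advertised_med(nid_reachable=True, is_vrrp_master=False),
}
print(pe_best_path(normal))   # vCPE1

# vCPE1 loses the L2 NID; VRRP has switched over to vCPE2.
failed = {
    "vCPE1": advertised_med(nid_reachable=False, is_vrrp_master=False),
    "vCPE2": advertised_med(nid_reachable=True, is_vrrp_master=True),
}
print(pe_best_path(failed))   # vCPE2
```

The point of the sketch is that downstream traffic follows the same vCPE that currently serves the branch in the upstream direction, which requires the routing advertisement to react to both L2 NID reachability and VRRP state.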

Lack of L3 Capability at the Branch Will Limit Available Services

[Diagram: branch L2 NID backhauled over Carrier Ethernet (AGG, NPE) to vCPE1 and vCPE2 running VRRP, then to the PE of the MPLS core.]

• Many customers may require capabilities at the branch that need a Layer 3 device
  • Such as IPSec VPN or WAN acceleration
• Many service providers are looking to use 3G or 4G LTE as backup connectivity
  • Typical L2 NIDs do not have those interfaces → forcing another CPE for the backup
• If hierarchical & granular QoS is a requirement at the branch, this could be challenging with an L2 NID too

Conclusion of L2 NID + vCPE Architecture

• We were attempting to simplify the architecture by removing the L3 CPE from the branch in the managed CPE / VPN architecture
• But in that process, complexities of other types got injected back into the network
  • MAC address scaling issue impacting service scale, convergence, cost
  • Security exposure of the vCPE / NFV DC and the SP infrastructure due to extension of L2 domains
  • Possible Layer 2 loops due to operational errors
  • Complex design requirements to satisfy high availability → more difficult to operate
• It may create further limitations when it comes to service availability at the branch
  • Layer 3 services such as IPSec, WAN acceleration, etc. are no longer possible from the branch
  • 3G / 4G LTE on the same device
  • Hierarchical and granular QoS from the branch

So What May We Do to Approach the Problem? This Is My Recommendation :)

[Diagram: an L3 NID or L3 CPE at the branch, where the customer L2 network is demarcated; simple routing like static / IGP with the vCPE, or PE-CE routing from the branch, over Carrier Ethernet (AGG, NPE) to the PE of the MPLS core; an NFV DC with vCPE*, vFW and vDPI under NFV and cloud orchestration; a cloud DC (SP / 3rd party) offering IaaS / PaaS / SaaS. vCPE*: may not be required most of the time.]

• Keep an L3 CPE at the branch, physical or virtual → call it an L3 NID or L3 CPE or whatever you like
  • We need to demarcate the customer’s L2 network at the branch itself; that’s how networks have scaled
  • This helps avoid the MAC address scale, security & L2 loop issues. It also avoids the additional issues with VRRP design
  • Make the L3 CPE at the branch Zero Touch Provisioning (ZTP) capable – to achieve automation and agility
• If required, try to simplify the L3 CPE at the branch by reducing the footprint of “enabled features”
  • Provision complex CPE features in the NFV DC (may include advanced routing on a vCPE)
  • Have the ability to service chain the vCPE with a rich set of other functions using the NFV orchestration system → make it agile!


Thank you.

NFV – How to Build / Augment Operations Skillsets

• Most existing technologies, protocols and associated skills are equally required
• On top of that, new skills need to be acquired:
  • x86 server virtualization
  • Virtualization on Linux (and KVM/QEMU) environment
  • Cloud orchestration systems – such as OpenStack
  • Virtual switches – OVS, Netmap/VALE, Snabbswitch, vendor specific, etc.
  • SDN controllers – OpenDaylight, vendor specific
  • Device programmability and APIs – NETCONF, YANG, RESTCONF, REST APIs, OF, …
  • Service function chaining – especially NSH (Network Service Header)
  • Network-based virtual overlay transport – VXLAN, MPLSoGRE/UDP, LISP, L2TPv3, …
  • Automation tools – Puppet / Chef, etc.
  • Management, orchestration, OSS fundamentals
  • …