VS (Virtual Subnet) draft-xu-virtual-subnet-03 Xiaohu Xu <[email protected]> IETF 79, Beijing



VS (Virtual Subnet)
draft-xu-virtual-subnet-03

Xiaohu Xu <[email protected]>

IETF 79, Beijing


VS Overview

• VS aims to be a practical and scalable data center network architecture that meets the following objectives:
  – Maximize Bandwidth Utilization:
    • Use L3 routing to overcome the limitations of STP.
  – Layer-2 Connectivity Service:
    • Servers of a given service domain communicate just as if they were on a LAN or a subnet.
  – Service Domain Isolation:
    • For performance isolation and security reasons, servers of different service domains should be isolated from each other, just as if they were isolated via VLANs.
  – Broadcast Flooding Suppression:
    • Keep the scope of broadcast flooding (e.g., ARP broadcast traffic, unknown unicast traffic) as small as possible.


VS Overview (cont)

• VS provides an IP-only L2VPN service for server interconnection in data center networks, mainly by combining L3VPN with ARP proxy [RFC 925] (invented by Jon Postel).
• On the PE control plane:
  – Host routes (i.e., /32) for local CE hosts are generated automatically from learnt ARP entries.
  – Host routes for remote CE hosts are learnt by using existing L3VPN technology to distribute the local CE host routes across PEs.
  – Acting as an ARP proxy, the PE returns its own MAC in response to an ARP request that a local CE host sends for a remote CE host.
• On the PE data plane:
  – Use the L3VPN forwarding mechanism WITHOUT ANY CHANGE.
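
The control-plane steps above can be sketched in a few lines of Python. This is an illustration only: `HostRoute`, `local_host_routes` and `remote_host_routes` are hypothetical names that do not come from the draft, BGP distribution is reduced to a plain function call, and the MAC strings are placeholders. The addresses are the ones used in the unicast example of this deck.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class HostRoute:
    prefix: ipaddress.IPv4Network  # always a /32 host route
    next_hop: str                  # "Local" or the name of the remote PE
    protocol: str                  # "ARP" for local hosts, "BGP" for remote ones


def local_host_routes(arp_table):
    """Turn learnt ARP entries {ip: mac} into /32 host routes."""
    return [
        HostRoute(ipaddress.ip_network(f"{ip}/32"), "Local", "ARP")
        for ip in sorted(arp_table, key=ipaddress.ip_address)
    ]


def remote_host_routes(advertised, from_pe):
    """Re-label routes advertised by another PE as BGP routes via that PE."""
    return [HostRoute(r.prefix, from_pe, "BGP") for r in advertised]


# Assumed ARP tables of the two PEs (MAC values are placeholders).
pe1_arp = {"1.1.1.1": "MAC(A)", "1.1.1.2": "MAC(C)"}
pe2_arp = {"1.1.1.3": "MAC(B)", "1.1.1.4": "MAC(D)"}

pe1_local = local_host_routes(pe1_arp)
pe2_local = local_host_routes(pe2_arp)

# VRF Blue on PE-1: its own local routes plus what PE-2 advertised.
pe1_vrf = pe1_local + remote_host_routes(pe2_local, "PE-2")

for route in pe1_vrf:
    print(route.prefix, route.next_hop, route.protocol)
```

The resulting table for PE-1 has the same shape as the VRF table in the unicast example: local hosts appear as ARP-derived /32 routes, remote hosts as BGP routes pointing at PE-2.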


Unicast Communication Example

[Figure: Host A (1.1.1.1) and Host C (1.1.1.2) attach through a ToR switch to PE-1; Host B (1.1.1.3) and Host D (1.1.1.4) attach through a ToR switch to PE-2. Both sites belong to VPN Blue (1.1.1.0/24). PE-1 and PE-2 are connected by an MPLS/IP backbone, and each PE acts as an ARP proxy.]

VRF Blue on PE-1:

  Prefix        Next-hop   Protocol
  1.1.1.1/32    Local      ARP
  1.1.1.2/32    Local      ARP
  1.1.1.3/32    PE-2       BGP
  1.1.1.4/32    PE-2       BGP

VRF Blue on PE-2:

  Prefix        Next-hop   Protocol
  1.1.1.1/32    PE-1       BGP
  1.1.1.2/32    PE-1       BGP
  1.1.1.3/32    Local      ARP
  1.1.1.4/32    Local      ARP

ARP cache of Host A:

  IP       MAC
  IP(C)    MAC(C)
  IP(B)    MAC(PE-1)
  IP(D)    MAC(PE-1)

ARP cache of Host B:

  IP       MAC
  IP(D)    MAC(D)
  IP(A)    MAC(PE-2)
  IP(C)    MAC(PE-2)

Packet from Host A to Host B:

  Segment              Outer header                    IP header
  Host A -> PE-1       VLAN ID, MAC(A)->MAC(PE-1)      IP(A)->IP(B)
  PE-1 -> PE-2         Tunnel to PE-2, VPN Label       IP(A)->IP(B)
  PE-2 -> Host B       VLAN ID, MAC(PE-2)->MAC(B)      IP(A)->IP(B)
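
The path of the packet from Host A to Host B in this example can be traced with a small Python sketch. The tables below are copied from the figure; the function `trace` and the dictionary layout are hypothetical and only mimic the forwarding decisions, they do not build real frames.

```python
# Host A's ARP cache: proxy replies map remote hosts to PE-1's MAC.
host_a_arp = {"IP(C)": "MAC(C)", "IP(B)": "MAC(PE-1)", "IP(D)": "MAC(PE-1)"}

# VRF Blue on each PE: destination -> next hop.
vrf = {
    "PE-1": {"IP(A)": "Local", "IP(C)": "Local", "IP(B)": "PE-2", "IP(D)": "PE-2"},
    "PE-2": {"IP(A)": "PE-1", "IP(C)": "PE-1", "IP(B)": "Local", "IP(D)": "Local"},
}

# ARP entries each PE has learnt for its local CE hosts.
pe_arp = {
    "PE-1": {"IP(A)": "MAC(A)", "IP(C)": "MAC(C)"},
    "PE-2": {"IP(B)": "MAC(B)", "IP(D)": "MAC(D)"},
}


def trace(src_ip, src_mac, dst_ip, host_arp, ingress_pe):
    """Return (segment, outer header, IP header) for each leg of the path."""
    ip_hdr = f"{src_ip}->{dst_ip}"  # the IP header is carried unchanged end to end
    dst_mac = host_arp[dst_ip]
    hops = [(f"site of {ingress_pe}", f"{src_mac}->{dst_mac}", ip_hdr)]
    if dst_mac != f"MAC({ingress_pe})":
        return hops  # destination is on the same site and is reached by bridging
    egress_pe = vrf[ingress_pe][dst_ip]  # L3VPN lookup on the ingress PE
    hops.append(("backbone", f"tunnel to {egress_pe} + VPN label", ip_hdr))
    egress_mac = pe_arp[egress_pe][dst_ip]  # ARP entry on the egress PE
    hops.append((f"site of {egress_pe}", f"MAC({egress_pe})->{egress_mac}", ip_hdr))
    return hops


for hop in trace("IP(A)", "MAC(A)", "IP(B)", host_a_arp, "PE-1"):
    print(hop)
```

Tracing A to C instead yields a single leg, since Host A holds Host C's real MAC and the frame never reaches the PE's routing function.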


Local CE Host Discovery

• Local CE hosts are discovered through ARP learning.
  – The PE periodically sends unicast ARP requests to the learnt local CE hosts to keep their ARP entries from expiring.
• To ensure the PE has learnt all local CE hosts, especially after a reboot, an ARP scan should be performed at least once after rebooting:
  – Option 1 (available today):
    • The PE sends to its local site an ARP request for each IP address within the configured IP subnet in turn.
  – Option 2 (requires extensions to existing ARP):
    • The PE sends to its local site an ARP request for the limited broadcast address (i.e., 255.255.255.255) or the All-Systems multicast group address (i.e., 224.0.0.1).
    • Any CE host receiving such an ARP request should respond with an ARP reply containing its IP and MAC addresses.
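
Option 1 amounts to walking the configured subnet. The sketch below only enumerates the addresses a PE would probe; it sends nothing. The function name `arp_scan_targets` and the PE's own address 1.1.1.254 are assumptions made for the example.

```python
import ipaddress


def arp_scan_targets(subnet, own_ip):
    """List every host address in the subnet except the PE's own address."""
    net = ipaddress.ip_network(subnet)
    own = ipaddress.ip_address(own_ip)
    # hosts() skips the network and broadcast addresses of the subnet.
    return [str(ip) for ip in net.hosts() if ip != own]


targets = arp_scan_targets("1.1.1.0/24", "1.1.1.254")
print(len(targets), targets[0], targets[-1])  # 253 1.1.1.1 1.1.1.253
```

The number of requests grows with the size of the subnet, which is one reason a single-request scheme such as Option 2 is attractive for large subnets.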


ARP Reduction

• Besides ARP learning, the PE should perform the ARP proxy [RFC 925] function:
  – For an ARP request for a local CE host, discard it.
  – For an ARP request for a remote CE host, return its own MAC as an ARP reply.
  – For an ARP request for an unknown CE host (i.e., no matching VRF entry found), discard it.
• ARP broadcast traffic from CE hosts is limited to the local VPN site:
  – ARP broadcast traffic is not flooded across PEs.
  – An ARP update for a CE host (e.g., triggered by VM mobility) does not trigger any BGP update as long as that CE host is still attached to its original PE and VRF instance (e.g., VM mobility within the VPN site).
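
The three ARP proxy rules above map onto a single lookup in the VRF. A minimal sketch, assuming the VRF is modelled as a dictionary from host IP to next hop ("Local" or a remote PE name); `Action` and `handle_arp_request` are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    DISCARD = "discard"
    REPLY_WITH_OWN_MAC = "reply with own MAC"


def handle_arp_request(target_ip, vrf):
    """Decide what the PE does with an ARP request received from a local CE host."""
    next_hop = vrf.get(target_ip)
    if next_hop is None:
        return Action.DISCARD  # unknown CE host: no matching VRF entry
    if next_hop == "Local":
        return Action.DISCARD  # local CE host: it is on the same site and can answer itself
    return Action.REPLY_WITH_OWN_MAC  # remote CE host: the PE answers as ARP proxy


pe1_vrf = {"1.1.1.1": "Local", "1.1.1.2": "Local", "1.1.1.3": "PE-2", "1.1.1.4": "PE-2"}
for ip in ("1.1.1.2", "1.1.1.3", "1.1.1.9"):
    print(ip, handle_arp_request(ip, pe1_vrf).value)
```

In no branch is the request forwarded to another PE, which is what keeps ARP broadcast traffic inside the local VPN site.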


CE Multi-homing

• CE multi-homing is an important feature for redundancy and load-balancing, especially in data center networks.
  – Multiple equal-cost host routes with different BGP next-hops (i.e., remote PEs) for a given multi-homed CE host can be used to achieve maximum capacity for server interconnection.
• CE hosts can be multi-homed to PEs via intermediary bridges (e.g., ToR switches) in the following way:
  – VRRP is enabled on the PEs of a given redundancy group,
  – and only the VRRP master is delegated to act as ARP proxy and respond with its VIRTUAL MAC.
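
Both halves of the multi-homing scheme can be illustrated briefly. The sketch below is not taken from the draft: the per-flow hash in `pick_next_hop` is one common way to use equal-cost routes and is an assumption here, as are all function names. The virtual MAC format 00-00-5E-00-01-{VRID} is the one VRRP defines for IPv4.

```python
import hashlib


def vrrp_virtual_mac(vrid):
    """IPv4 VRRP virtual MAC for a virtual router ID (00-00-5E-00-01-{VRID})."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in 1..255")
    return f"00:00:5e:00:01:{vrid:02x}"


def arp_proxy_reply_mac(pe, vrrp_master, vrid):
    """Only the VRRP master answers as ARP proxy, and it uses the virtual MAC."""
    return vrrp_virtual_mac(vrid) if pe == vrrp_master else None


def pick_next_hop(src_ip, dst_ip, next_hops):
    """Spread flows over equal-cost BGP next-hops with a per-flow hash (assumed policy)."""
    digest = hashlib.sha256(f"{src_ip}->{dst_ip}".encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return sorted(next_hops)[index]


print(arp_proxy_reply_mac("PE-2", "PE-2", 1))  # 00:00:5e:00:01:01
print(arp_proxy_reply_mac("PE-3", "PE-2", 1))  # None
print(pick_next_hop("1.1.1.1", "1.1.1.3", ["PE-2", "PE-3"]))
```

Because the hash depends only on the flow, packets of one flow always take the same next-hop, while different flows can be spread across the PEs of the redundancy group.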


CE Mobility (e.g., VM Mobility)

• CE mobility within a VPN site.
  – The PE just needs to update the corresponding ARP entry.
  – No BGP update is triggered.
• CE mobility across VPN sites.
  – Upon learning via BGP a host route for a CE host that it currently holds as local, the PE should immediately send an ARP request to that host to determine whether the host is still connected to it.
    • If not, the PE should delete the corresponding ARP entry and host route for that CE host, and withdraw the corresponding BGP route it advertised before.
    • Otherwise, the case is treated as CE multi-homing.
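
The cross-site mobility check above can be written as a short decision routine. A sketch under stated assumptions: `probe` is a hypothetical callback that sends a unicast ARP request and reports whether a reply arrived, and the returned strings merely name the actions a real PE would take.

```python
def on_bgp_host_route(ip, arp_table, probe):
    """React to a host route for `ip` that was just learnt from another PE via BGP."""
    if ip not in arp_table:
        return "remote host: install BGP route"
    if probe(ip):
        # The host still answers locally, so it is attached to both PEs.
        return "multi-homed: keep local route"
    # The host no longer answers: it has moved to the advertising PE.
    del arp_table[ip]
    return "moved: delete ARP entry and host route, withdraw BGP route"


arp_table = {"1.1.1.1": "MAC(A)", "1.1.1.2": "MAC(C)"}

print(on_bgp_host_route("1.1.1.3", arp_table, probe=lambda ip: False))
print(on_bgp_host_route("1.1.1.2", arp_table, probe=lambda ip: True))
print(on_bgp_host_route("1.1.1.1", arp_table, probe=lambda ip: False))
print(sorted(arp_table))
```

After the three calls only 1.1.1.2 remains in the local ARP table: 1.1.1.3 was never local, 1.1.1.2 answered the probe, and 1.1.1.1 was removed as having moved.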


Multicast/Broadcast

• MVPN technology can be used directly, without any change, to distribute customer multicast traffic among PEs.
  – Inclusive multicast distribution tree
  – Selective multicast distribution tree
• Customer broadcast traffic can be processed as a special customer multicast group.


Comparison: IPLS vs. VS

CE reachability information distribution
  IPLS: MAC reachability advertisement via LDP
  VS:   IP reachability advertisement via BGP

ARP reduction mechanism
  IPLS: ARP cache/snooping (returns the real MAC of the requested CE)
  VS:   ARP proxy (returns the MAC of the ARP proxy)

Eliminating ARP/unknown unicast flooding across PEs
  IPLS: No
  VS:   Yes

CE multi-homing
  IPLS: Not supported
  VS:   Supported natively

MAC table capacity pressure on intermediary bridges
  IPLS: Need to learn the MACs of both local and remote CEs. Not aging out learned MAC entries worsens this pressure.
  VS:   Only need to learn local CE hosts’ MAC addresses.


Next-steps

• Any comments?


IPLS vs. VS (CE Reachability Advertisement)

• In IPLS, MAC reachability is advertised via LDP.
  – LDP sessions face a scalability challenge in a fully meshed, large data center network.
  – Adding a new PE requires configuration on all remote PEs.
• In VS, IP reachability is advertised via BGP.
  – BGP sessions scale well with the help of the route reflector mechanism.
  – Adding a new PE only requires configuration on the RRs.
• The forwarding table size on the PE is the same for both IPLS and VS.
  – Neither host routes nor MAC routes can be aggregated.
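
The scaling argument can be made concrete by counting sessions. The sketch assumes a full mesh of LDP sessions between PEs on the IPLS side and, on the VS side, that every PE peers with every route reflector while the RRs are fully meshed among themselves; the function names and the figures of 100 PEs and 2 RRs are arbitrary examples.

```python
def ldp_full_mesh_sessions(num_pes):
    """Sessions in a full mesh: one per pair of PEs."""
    return num_pes * (num_pes - 1) // 2


def bgp_rr_sessions(num_pes, num_rrs):
    """Each PE peers with each RR; the RRs are fully meshed among themselves."""
    return num_pes * num_rrs + num_rrs * (num_rrs - 1) // 2


pes, rrs = 100, 2
print(ldp_full_mesh_sessions(pes))  # 4950
print(bgp_rr_sessions(pes, rrs))    # 201

# Sessions that have to be configured when one more PE is added.
print(ldp_full_mesh_sessions(pes + 1) - ldp_full_mesh_sessions(pes))  # 100
print(bgp_rr_sessions(pes + 1, rrs) - bgp_rr_sessions(pes, rrs))      # 2
```

The session count grows quadratically with the number of PEs in the full mesh and linearly with route reflectors, and adding a PE touches every existing PE in the first case but only the RRs in the second.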


IPLS vs. VS (ARP Reduction)

• In IPLS, the ARP storm issue is not completely solved.
  – ARP packets, including even unicast ARP replies, are forwarded from attachment circuits to "multicast" PWs, and the ARP packets received from the "multicast" PWs are flooded to all CE hosts.
  – Keeping the ARP caches on different PE routers consistent is a hard problem.
• In VS, by using ARP proxy on PE routers, ARP traffic is confined to the site scope.


IPLS vs. VS (CE Multi-homing)

• IPLS prohibits connecting a common LAN or VLAN to more than one PE router.
  – That is to say, IPLS cannot support redundancy and load-balancing of PE-CE connections.
• VS can support CE multi-homing natively.


IPLS vs. VS (Intermediary Bridge’s MAC Table Size)

• In IPLS, the intermediary bridges between PEs and CEs would have to learn the MACs of all CE hosts (both local and remote).
  – An IP frame received over a unicast PW is prepended with the PE router’s own local MAC address as its source before it is transmitted on the appropriate attachment circuit. However, a packet sent from a local CE host to a remote CE host carries the MAC of the remote CE host as its destination MAC, rather than the local PE router’s MAC. Thus, flooding of unknown unicast frames on these Ethernet bridges would happen sooner or later.
  – To avoid flooding unknown unicast frames, these bridges are configured not to age out learned MAC entries.
• In VS, the intermediary bridges only need to learn the MAC addresses of local CE hosts and local PE routers.