this document describes modifications of bgp labeled ... · this document describes modifications...

198
IDR Luyuan Fang Internet Draft Deepak Bansal Intended status: Standards Track Microsoft Expires: July 26, 2016 Chandra Ramachandran Juniper Networks Fabio Chiussi Nabil Bitar Verizon Yakov Rekhter January 23, 2016 BGP-LU for HSDN Label Distribution draft-fang-idr-bgplu-for-hsdn-03 Abstract This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned network. Specifically, these procedures are suitable for building the Hierarchical SDN (HSDN) control plane for the hyper-scale Data Center (DC) and cloud networks. Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html Copyright Notice Fang et al. Expires <July 26, 2016> [Page 1]

Upload: others

Post on 25-Feb-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

IDR Luyuan FangInternet Draft Deepak BansalIntended status: Standards Track MicrosoftExpires: July 26, 2016 Chandra Ramachandran Juniper Networks Fabio Chiussi

Nabil Bitar Verizon Yakov Rekhter

January 23, 2016

BGP-LU for HSDN Label Distribution draft-fang-idr-bgplu-for-hsdn-03

Abstract

This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned network. Specifically, these procedures are suitable for building the Hierarchical SDN (HSDN) control plane for the hyper-scale Data Center (DC) and cloud networks.

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Copyright Notice

Fang et al. Expires <July 26, 2016> [Page 1]

Page 2: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Description of BGP-LU Procedures . . . . . . . . . . . . . . . 7 3.1. Partitioned-Unique Label Info Extended Community . . . . . 10 3.2 Partition-Unique Label Info Extended Community Procedures . 11 3.3 BGP Policies on UPBNs and LMS . . . . . . . . . . . . . . . 13 3.4 BGP-LU Procedures for UP0 Destinations . . . . . . . . . . . 14 3.5 Advertising labels without partition label extended community . . . . . . . . . . . . . . . . . . . . . . . . . 15 4. Route Resolution in HSDN Architecture . . . . . . . . . . . . . 16 5. Security Considerations . . . . . . . . . . . . . . . . . . . . 17 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 17 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . 17 8. Normative References . . . . . . . . . . . . . . . . . . . . . 17 9. Informative References . . . . . . . . . . . . . . . . . . . . 18 Authors’ Addresses . . . . . . . . . . . . . . . . . . . . . . . . 18

Fang et al. Expires <July 26, 2016> [Page 2]

Page 3: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

1. Introduction

This document describes modifications to BGP Labeled Unicast (BGP- LU)-based procedures for label distribution [RFC3107] in a partitioned network where a label stack is used for forwarding. Current BGP-LU procedures do not provide mechanisms for distributing and installing operator-assigned partition-scope labels.

Specifically, the modifications described in this document are suitable for label distribution in the control plane of a MPLS-based Hierarchical SDN (HSDN) Data Center (DC) and cloud network.

Hierarchical SDN (HSDN) [I-D.fang-mpls-hsdn-for-hsdc] is an architectural solution to scale a hyper-scale cloud consisting of DCs interconnected by a Data Center Interconnect (DCI) to tens of millions of physical underlay endpoints, while efficiently handling both Equal Cost Multi Path (ECMP) load-balanced traffic and any-to- any end-to-end Traffic Engineered (TE) traffic. The HSDN reference model, operation, and requirements are described in [I-D.fang-mpls-hsdn-for-hsdc].

HSDN is designed to allow the physical decoupling of control and forwarding, and have the LFIBs configured by a controller according to a full SDN approach. Such a controller-centric approach is described in [I-D.fang-mpls-hsdn-for-hsdc].

However, the HSDN control plane can also be built in a hybrid approach, using a routing or label distribution protocol to distribute the labels, together with a controller. This hybrid approach may be particularly useful during technology migration. This document specifies the use of BGP-LU for label distribution and LFIB configuration in the HSDN control plane.

In the HSDN architecture, the DC/DCI network is partitioned into hierarchical underlay partitions (UPs) such that the number of destinations in each UP does not increase beyond the limit imposed by capabilities of network nodes. Once the DC cloud has been partitioned to the desired configuration, the traffic from a source endpoint to a destination endpoint uses a stack of labels, one label per each level in the hierarchy, whose semantics indicate to the forwarding network nodes at each level which destination in its local UP should forward the packet to. The label semantics can also identify a specific path (or group of paths) in the UP, rather than simply a destination.

In other words, the label stack indirectly represents the UPs that the packet should traverse to reach the destination end device. More precisely, the outer label specifies the destination in the partition at the highest level that the packet should traverse, while the other

Fang et al. Expires <July 26, 2016> [Page 3]

Page 4: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

labels specify the destination in each partition that the packet traverse thereafter.

UP0 \ +---------+ +---------+ +---------+ +---------+ / \|UPBN1-1-1|˜˜˜|UPBN1-1-2|-----------|UPBN1-2-1|˜˜˜|UPBN1-2-2|/ +---------+ +---------+ +---------+ +---------+ ( ) ( ) ( UP1-1 ) ( UP1-2 ) ( ) ( ) +---------+ +---------+ +---------+ +---------+ |UPBN2-1-1|˜˜˜|UPBN2-1-2| |UPBN2-2-1|˜˜˜|UPBN2-2-2| +---------+ +---------+ +---------+ +---------+ ( ) ( ) ( UP2-1 ) ( UP2-2 ) ( ) ( ) +---------+ +---------+ +---------+ +---------+ | Server1 |˜˜˜| Server2 | | Server3 |˜˜˜| Server4 | +---------+ +---------+ +---------+ +---------+

Figure 1 - Example topology with 3 levels of partitioning

In the example of Figure 1, there are 3 levels in the hierarchical partitioning. The UPs are connected by a number of Underlay Partition Border Nodes (UPBNs), grouped in Underlay Partition Border Groups (UPBGs). The UPBGs are the destinations for ECMP-forwarded traffic in each partition.

Packets from Server3 to Server1 use a label stack consisting of 3 Path Labels (PLs) for forwarding.

- Top label (PL0) forwards the packet to one of the UPBN1-1 nodes, which are grouped as UPBG1-1, connecting to UP1-1, which contains Server1 (note that, by definition of HSDN forwarding, PL0 points to UPBG1-1, i.e., the destination in UP0, rather than UPBG2-1).

- Next label (PL1) forwards the packet to one of the UPBN2-1 nodes, which are grouped as UPBG2-1, connecting to UP2-1, which contains Server1 (UPNBG2-1 is a destination in UP1-1).

- Next label (PL2) forwards the packet to Server1 (which is a destination in UP2-1)

This document proposes modified BGP-LU based procedures for:

- How each UPBN learns the destinations in its UP and the operator

Fang et al. Expires <July 26, 2016> [Page 4]

Page 5: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

assigned partition unique labels that should be installed in its LFIB to forward traffic to these destinations;

- How UPBN learns the context labels used by other UPBN destinations in the partition if the DC operator implements a policy of using separate LFIBs for installing partition unique labels on UPBNs

We also introduce an associated new extended community [RFC4360] that serves the following purposes:

1. Enables a UPBN to trigger the modified BGP-LU behavior to allow distribution of partition-unique labels to UPBNs from Label Mapping Server (LMS), and

2. Identifies which LFIB partition unique labels should be installed into (if there is ambiguity due to overlapping label name spaces), and

Such extended community allows to advertise persistent labels, which can survive across BGP session restarts.

Strictly speaking, the labels advertised with the new mechanisms described in this document are not typical downstream-advertised labels, but they are more similar to upstream-advertised labels installed in context LFIBs corresponding to upstream.

It should be noted that the BGP-LU procedures specified in this document may be implemented through operator configured policy using any existing BGP community types if some conditions are met. The minor changes to the procedures and the conditions under which policy based application of an existing BGP community can be used are described in Section 3.5.

The procedures specified in the document are applicable to ECMP traffic in mpls-based HSDN DC cloud architectures.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

This document inherits the terminology defined in [I-D.fang-mpls-hsdn-for-hsdc] and additionally introduces the following terms that apply when BGP-LU based control plane is used to realize HSDN architecture.

o Border Node (BN): A border node is a node that is present in a UP.

Fang et al. Expires <July 26, 2016> [Page 5]

Page 6: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

In HSDN architecture, UPBNi is a special BN that connects UPi with UPi-1.

o Partition Label Space: Label space that is shared by all border nodes of a UP to reach a destination in the UP. For a border node, UP destinations comprise other border nodes and end devices that are present in the UP.

o Partition Labels: Operator assigned labels that belong to partition label space corresponding to a UP. The labels need not be allocated from the platform label space on the BNs but may be directly installed in the context table corresponding the UP.

o Label Mapping Server (LMS): A BGP speaker present in each UP that allocates labels for destinations in the partition and distributes the labels to border nodes through BGP-LU.

o BGP Peer Group: Collection of BGP peers for which a set of policies are applied on a BGP speaker.

o Partition-Unique Label Info Community: A new type of BGP extended community that contains the operator assigned partition unique label for the BGP destination, origin partition and border group identifier.

o Border-group Community: This community identifies a group of border nodes that interconnect two partitions and is configured as policy on the border nodes as well as the LMS. It acts as the UPBG identifier.

o Route Resolver: A single or a collection of entities that provides the MPLS label stack to reach a destination underlay end device.

Term Definition ----------- -------------------------------------------------- BGP Border Gateway Protocol BGP-LU Border Gateway Protocol Labeled Unicast BN Border Node DC Data Center DCI Data Center Interconnect ECMP Equal Cost MultiPathing FIB Forwarding Information Base HSDN Hierarchical SDN LFIB Label Forwarding Information Base LMS Label Mapping Server MPLS Multi-Protocol Label Switching SDN Software Defined Network UP Underlay Partition

Fang et al. Expires <July 26, 2016> [Page 6]

Page 7: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

UPBG Underlay Partition Border Group UPBN Underlay Partition Border Node TE Traffic Engineering

3. Description of BGP-LU Procedures

This section provides an overview of how operator assigned partition label space is used to achieve end-to-end forwarding of label stacked packets. Consider the DC network that is present in the right hand side DC in Figure 1. The diagram in Figure 2 is a part of the DCI network in Figure 1 (the partitions are arranged horizontally rather than vertically as in Figure 1). UP1 in Figure 2 denotes a level 1 UP and UP2 denotes a level 2 UP. BN1 and BN2 are UPBNs of UP1, BN3 and BN4 are UPBNs of UP2. The nodes BN5 and BN6 may be some ToR switches or Servers. The nodes BN3, BN4, BN2, and BN1 are internal to the DC/DCI network (leafs and spines).

˜˜˜˜˜˜˜˜˜ ˜˜˜˜˜˜˜˜˜ +-----+ ( ) +-----+ ( ) +-----+ | BN1 |-( )-| BN3 |-( )-| BN5 | +-----+ ( ) +-----+ ( ) +-----+ ( UP1 ) ( UP2 ) +-----+ ( ) +-----+ ( ) +-----+ | BN2 |-( )-| BN4 |-( )-| BN6 | +-----+ ( ) +-----+ ( ) +-----+ ˜˜˜˜˜˜˜˜˜ ˜˜˜˜˜˜˜˜˜

Figure 2 - Example to illustrate partition labels

If the DC network in Figure 2 ran conventional flat distributed BGP- LU control plane using router-allocated labels, when BN5 advertises itself as destination to BN3, BN3 allocates a new label (say L35) from its platform label space. If BN3 finds BN5 reachable (through say LSP35), it advertises L35 (for destination BN5) to BN1. Similarly, BN1 finds BN3 reachable (through say LSP13) and pushes two labels - bottom label is L35 and top label is the LSP13 label. In this model, BN3 stitches L35 to LSP35 that takes the packet to BN5. The same procedure runs on BN4, which allocates a label (say L45, in general different from L35) from its own platform label space for BN5 and advertises the label to BN1. This model is not suitable when end- to-end traffic from a Server behind BN1 or BN2 (not shown in the figure) to a Server behind BN5 or BN6 (not shown in the figure) needs to be forwarded using a label stack imposed by the SDN Controller with the condition that the label stack does not depend on the BN traversed to reach UP2 from UP1.

This document specifies a mechanism to implement the forwarding model

Fang et al. Expires <July 26, 2016> [Page 7]

Page 8: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

using label stacks imposed by SDN Controller but not have the limitation described in previous paragraph. The new procedures introduced in this document are explained using the above example.

1. BN5 and BN6 advertise their own loopback addresses in UP2. Assuming BN5 and BN6 do not belong to any border group, the BGP-LU advertisements from BN5 and BN6 contain NULL label. The routes will be: {Nlri: BN5, Label: NULL, Nh: BN5} {Nlri: BN6, Label: NULL, Nh: BN6}

2. BN3 and BN4 do not allocate labels for BN5 and BN6 from their own platform label space when they receive the BGP-LU advertisements. This is because BN3 and BN4 are configured to be part of a border group for UP2 destinations. Both BN3 and BN4 are configured with border group community "Border-group-2".

3. BN3 and BN4 re-advertise BN5 and BN6 as IP NLRI destinations (with BGP next-hop self) to the LMS assigned for UP2 and appends "Partition-Unique Label Info" extended community . The Partition- Unique Label Info extended community and the procedures relating to it are newly introduced in this document. Refer to Section 3.1 for the extended community format and Section 3.2 for LMS procedures. The R-bit in the extended community is set to indicate that the originator requests the receiver to assign and reflect the partition label info community with the label assigned by LMS. The routes for BN5 destination will be: {Nlri: BN5, Nh: BN3, Com: Border-group-2, Label-Ext-Comm: R} {Nlri: BN5, Nh: BN4, Com: Border-group-2, Label-Ext-Comm: R}

If the operator has set aside a BGP community value that unambiguously indicates that the next-hop (BN3 or BN4) in the BGP route requests a label to be allocated for the destination (BN5) in UP2 partition, then the newly specified Partition label info extended community may not be added to the route. Refer to Section 3.5 for details.

4. UP2 LMS processes the IP routes for BN5 and BN6, assigns labels for them (or simply reads the labels from label mapping database configured by operator) and originates a BGP-LU route containing the label assigned for the UP2 destinations. LMS may set the P-bit to indicate that the label can be persistent and can be retained for a specified time period. For the two IP routes for BN5 originated by BN3 and BN4, the BGP-LU routes originated by LMS will be: {Nlri: BN5, Label: L5, Nh: BN3, Com: Border-group-2, Label-Ext- Comm: P:UP2-context} {Nlri: BN5, Label: L5, Nh: BN4, Com: Border-group-2, Label-Ext-

Fang et al. Expires <July 26, 2016> [Page 8]

Page 9: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

Comm: P:UP2-context}

The procedures if newly specified partition label info extended community is not used are described in Section 3.5.

5. Only when BN3 and BN4 learn the BGP-LU route for BN5 advertised by LMS of UP2, they install the label route in context table corresponding to UP2-context. Note that the operator may configure BN3 and BN4 to install the operator assigned label for BN5 in main LFIB itself (instead of UP2-context). The operator may choose this option if non-overlapping labels are assigned for different UPs.

6. BN3 and BN4 do not advertise BN5 and BN6 in UP1 but only advertise their own loopback addresses. As BN3 and BN4 are configured to be part of a border group, the border group identifier advertised as community is the same in BGP-LU advertisements from BN3 and BN4. If the partitions may have overlapping label spaces, then BN3 and BN4 advertise non-NULL labels in their BGP-LU advertisements. BN3 and BN4 install the label (that gets advertised) in default LFIB and point the label entry to the context table for UP2. In such a case, the routes from BN3 and BN4 will be: {Nlri: BN3, Label: CL3, Nh: BN3, Com: Border-group-1} {Nlri: BN4, Label: CL4, Nh: BN4, Com: Border-group-1}

7. BN1 and BN2 do not allocate labels for BN3 and BN4 from their platform label space when they receive BGP-LU advertisement. BN1 and BN2 only use the BGP-LU advertisement from BN3 and BN4 for determining the labels to be pushed during forwarding. Note that if there are intermediate routers between BN1/BN2 and BN3/BN4, then the labels CL3 and CL4 advertised by BN3 and BN4 will be used by those intermediate routers for determining the labels to be pushed.

8. BN1 and BN2 re-advertise BN3 and BN4 as IP destinations (with BGP next-hop self) to the LMS assigned for UP2 and appends "Partition- Unique Label Info" extended community. The R-bit is set to indicate that the originator requests the receiver to assign and reflect the partition label info community with the label assigned by LMS. The routes for BN3 destination will be: {Nlri: BN3, Nh: BN1, Com: Border-group-1, Label-Ext-Comm: R} {Nlri: BN3, Nh: BN2, Com: Border-group-1, Label-Ext-Comm: R}

The procedures if newly specified partition label info extended community is not used are described in Section 3.5.

9. UP1 LMS processes the IP routes for BN3 and BN4, assigns labels for them (or simply reads the labels from label mapping database configured by operator) and originates a BGP-LU route containing

Fang et al. Expires <July 26, 2016> [Page 9]

Page 10: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

the label assigned for the UP1 destinations. As the group label advertisements will differ only in BGP next-hop, BGP add-path should be enabled on the peer group between LMS and BNs. LMS may set P-bit to indicate that the advertised label can be persistent and can be retained for specified time. For the two IP routes for BN3 originated by BN1 and BN2, the BGP-LU routes originated by LMS will be: {Nlri: BN3, Label: L3, Nh: BN1, Com: Border-group-1, Label-Ext- Comm: P:UP1-context} {Nlri: BN3, Label: LG2, Nh: BN1, Com: Border-group-1, Label-Ext- Comm: PG:UP1-context} {Nlri: BN3, Label: L3, Nh: BN2, Com: Border-group-1, Label-Ext- Comm: P:UP1-context} {Nlri: BN3, Label: LG2, Nh: BN2, Com: Border-group-1, Label-Ext- Comm: PG:UP1-context}

Note that there are two BGP-LU routes with same NLRI for advertising group label and so BGP add-path [I-D.ietf-idr-add-paths] should be enabled between LMS and BNs.

10. Only when BN1 and BN2 learn the BGP-LU route for BN3 advertised by LMS of UP1, they install the label route in context table that has been configured on BN1 and BN2 to contain all UP1 destinations. Note that the operator may configure BN1 and BN2 to install the operator assigned label for BN3 in main LFIB itself (instead of UP1-context). The operator may choose this option if non-overlapping labels are assigned for different UPs.

Apart from advertising partition labels to BNs, the LMSs also advertise the routes (IP routes received from BNs as well as the BGP- LU routes originated back to BNs) to Route Resolver. Resolver is logically centralized component that constructs label stacks for end- to-end traffic and it uses the routes advertised from LMSs as inputs for constructing label stacks.

The description of the procedures using the example DC network in Figure 2 provides an overview of how the LFIB states are set up for traffic entering BN1 or BN2 is forwarded to BN5 or BN6 ("downward traffic"). It should be be noted that this overview has not explained how packets from a source in a remote DC can reach BN5 or BN6. In other words, the overview has not yet explained how packets are exchanged between servers in one DC to the other DC in Figure 1. The description of how the LFIB states are setup for "upward traffic" is presented in Section 3.4.

3.1. Partitioned-Unique Label Info Extended Community

This document introduces a new extended community that enables the

Fang et al. Expires <July 26, 2016> [Page 10]

Page 11: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

originator of a BGP-LU route to convey the information specified below.

0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type=TBD | Sub-Type=TBD | Flags(1 octet)| Reserved=0 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Partition context identifier | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Partition Label Retention Period | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Flags R-bit: Set to 1 if the originator requests label

G-bit: Set to 1 if the label is a group label

P-bit: Set to 1 if the receiver can retain label for specified time even if BGP peering between LMS and BN is lost

Partition context identifier: Context table identifier to which label will be installed

Partition label retention period: Timer period in seconds that the label can be retained after the BGP peering between LMS and BN is lost. This value must be zero if P-bit is not set.

3.2 Partition-Unique Label Info Extended Community Procedures

LMS is a BGP speaker that implements the following new procedures when it receives an IP route BGP advertisement containing "Partition- Unique Label Info" extended community.

- If IGP is the routing protocol with in a UP, then LMS may be implemented as a modified Route Reflector (RR) [RFC4456] assigned for the UP.

- If eBGP runs with in a UP, then the BGP peering between LMS and each border node should be configured by operator and on the BNs the eBGP peering with LMS should be configured in a peer group separate from eBGP peering with other routers in the partition. Note that even if eBGP is in use, the LMS procedures may be considered to act as a "modified reflector" because the primary goal of LMS is to return back the partition label to BN.

- LMS is configured with all the border groups that are connected to the UP where each border group is identified by a unique value of

Fang et al. Expires <July 26, 2016> [Page 11]

Page 12: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

Border-group community.

When LMS receives an IP route advertisement whose NLRI and BGP next- hop are the same, then it executes the following procedure.

1. If the operator has already assigned a label (DstLabel) for the UP destination in the NLRI, then no action is performed.

2. If the operator has not assigned a label for the UP destination, then LMS allocates a label (DstLabel) and stores the mapping between the UP destination and the label.

3. If the IP route advertisement also contains a known Border-group community and if the operator has not assigned a label for the border group, then LMS allocates a label and stores the mapping between the Border-group and the allocated label. Let the label assigned or allocated be BGLabel. LMS also stores the NLRI to the list of nodes belonging to the Border-group community contained in the route.

4. After executing the following procedures, LMS advertises the IP route to the Route Resolver.

When LMS receives an IP route advertisement whose NLRI and BGP next- hop are different, then it executes the following procedures.

1. If the IP route advertisement does not contain "Partition-Unique Label Info" extended community, then no further action is taken. Alternatively, if the LMS is configured with a policy to interpret a BGP community configured on it as equivalent to "partition label info" extended community, then the subsequent steps may be executed (refer to Section 3.5 for details).

2. If the IP route advertisement contains "Partition-Unique Label Info" extended community but the BGP next-hop does not belong to any known Border-group community configured on the LMS, then no further action is taken.

3. If none of the above conditions is true, then the LMS executes the following procedures.

a. LMS retrieves the DstLabel label already assigned for the UP destination. LMS originates BGP-LU route with DstLabel set in the NLRI and clears the G-bit in "Partition-Unique Label Info" extended community. If the partition labels are operator assigned and is read from label mapping database, then LMS sets P-bit in the extended community flags and sets the "partition label retention period" to the value configured on LMS (default

Fang et al. Expires <July 26, 2016> [Page 12]

Page 13: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

value is 7200 seconds).

b. If the NLRI of the IP route is equal to a known Border-group community configured on the LMS, then the LMS also retrieves the BGLabel assigned for the Border-group. LMS also originates BGP-LU route with BGLabel set in the NLRI and sets the G-bit in "Partition-Unique Label Info" extended community. If the partition labels operator assigned and is read from label mapping database, then LMS sets P-bit in the extended community flags and sets the "partition label retention period" to the value configured on LMS (default value is 7200 seconds).

When the BN that originated the IP route receives the BGP-LU route "reflected" back by the LMS, it executes the following procedures.

1. BN first checks whether R-bit is cleared in "Partition-Unique Label Info" extended community. If R-bit has been reset, the label in the NLRI is installed in the context table corresponding to the "partition context identifier" present in the extended community. If "partition context identifier" is zero, then BN installs the label entry in default LFIB.

2. If P-bit is set, then BN should retain the label entry in the designated LFIB (context or default) for the time period specified in "partition label retention period" should the BGP peering with LMS is lost. After BGP peering with LMS is lost, the BN should start "label retention timer" for the labels learnt from the LMS. When the BGP peering is restored, BN should reset the "label retention timer" and re-advertise IP routes corresponding to all UP destinations it had originated before. This procedure ensures that both LMS and BNs exchange all requisite routes before reaching steady state again.

3. If P-bit is not set, then BN should delete the label entry immediately when BGP peering with LMS is lost.

4. BN should delete the label entry from the LFIB when LMS withdraws the BGP-LU route containing the "Partition-Unique Label Info" extended community.

3.3 BGP Policies on UPBNs and LMS

The BGP-LU based control plane mechanism specified in this document assumes the following set of policies be applied on various network nodes in HSDN architecture. The policy configurations required are listed below.

Fang et al. Expires <July 26, 2016> [Page 13]

Page 14: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

- Each UPBN that connects two UPs are configured with a unique Border-group to advertise membership to "border group" or UPBG. For example, in figure 1 UPBN1-1-1 and UPBN1-1-2 are configured with same Border-group community that uniquely represents the connectivity of the two BNs to UP1-1.

- Depending the routing protocol used with in a UP, each UPBN should either have iBGP or eBGP peering sessions such that all lower level UPBNs or end-devices that are connected to the UP learn each other. For example, the BNs present in UP1-1 in Figure 1 are UPBN1-1-1, UPBN1-1-2, UPBN2-1-1 and UPBN2-1-2 and each of them should learn the loopback address of the other BNs.

- Each UP should have a Label Mapping Server (LMS) that advertises to all the UPBNs the operator assigned partition labels corresponding to each UP destination. Destinations of UPi consists of all individual UPBNi+1 connected to UPi and lower level UPBGs connected to UPi. For example, destinations of UP1-1 (Figure 1) are UPBN2-1- 1, UPBN2-1-2 and UPBG2-1, and LMS-1-1 will assign and advertise three labels for UP1-1.

- Each BN in a UP should also have iBGP or eBGP peering session with LMS of the UP. For example, all BNs in UP1-1 should have eBGP peering session with LMS-1-1 if UP1-1 runs eBGP routing protocol.

- UPBNj has a policy to automatically export destinations learnt from UPBNi peer group to UPj peer group (where i=j-1). But UPBNj does not export destinations learnt from UPj peer group to UPBNi peer group. This export policy on UPBNj limits the number of BGP advertisements that any network node in UPi has to process apart from limiting the number of LFIB entries in network nodes.

3.4 BGP-LU Procedures for UP0 Destinations

It should be noted that in the example topology in Figure 2, the BNs attached to UP1 and UP2 have been specified as UP destinations for illustration purposes only. Even a remote destination can be considered as a UP destination as long as the route is leaked into the UP. In HSDN architecture, even though the BNs connected to UP0 are remote for the UPBNs from level 2 down to the leaf level, as long as the normal BGP-LU route leaking policy (specified in Section 3.3) is followed, the LMS of the level 2 (or lower level) UPs will have to assign label for BNs in UP0 (or UP0 destinations). For example, UPBN2-1-1 and UPBN2-1-2 (figure 1) will learn UPBN1-1-1, UPBN1-1-2, UPBN1-2-1 and UPBN1-2-2 because UPBN1-2-1 and UPBN1-2-2 are leaked into UP1-1.

In the DC cloud network specified in Figure 1, the following

Fang et al. Expires <July 26, 2016> [Page 14]

Page 15: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

procedures are executed to enable packets with top label PL0 reach one of UPBNs connecting to UP0. To obtain end-to-end forwarding using a three label stack in a HSDN network with two levels (i.e. Servers located in UP2-x), the LMS of all UP2-x and UP1-x are set up such that they reflect the same label (i.e. PL0 label) for every UP0 destination (BNs as well as border groups).

1. UPBN1-2-1 and UPBN1-2-2 advertise their own loopback addresses in UP0. As the UPBNs are configured to be part of a border group, the border group community is the same in BGP-LU advertisements. If the partitions may have overlapping label spaces, then UPBN1-2-1 and UPBN1-2-2 advertise non-NULL labels in their BGP-LU advertisements. BN3 and BN4 install the label (that gets advertised) in default LFIB and point the label entry to the context table for UP1-2. In such a case, the routes from BN3 and BN4 will be: {Nlri: UPBN1-2-1, Label: CL11, Nh: UPBN1-2-1, Com: Border-group-0} {Nlri: UPBN1-2-2, Label: CL12, Nh: UPBN1-2-2, Com: Border-group-0}

2. For UPBN1-1-1 and UPBN1-1-2, the routes to UPBN1-2-1 and UPBN1-2-2 are in same partition (i.e. UP0). The label assigned for UPBN1-2- 1, UPBN1-2-2 and UPBG1-2 are the same on LMS-0, LMS-1-1 and LMS-2- 1. So all BNs in the left hand side DC in Figure 1 install the same label for UPBN1-2-1, UPBN1-2-2 and UPBG1-2.

Note that as all BNs in the DC cloud install the same label for a UP0 destination, the label range on the implementations of all BNs should have common label space (among different platform label spaces on all BNs) that can be set aside for the UP0 destinations. If this is not possible, then all BNs should be configured with a separate context table for UP0 partition. The BGP-LU procedures involving the "Partition-unique label info" community supports both forms of forwarding.

3.5 Advertising labels without partition label extended community

The procedures specified in Section 3.2 may be executed on LMS and border nodes without using the newly partition label info extended community but using an existing BGP community if all the following conditions are true.

- Each partition has a separate LMS such that border nodes connecting two partitions must have separate BGP peering with LMS of the two partitions.

- Both LMS and BNs are configured with a BGP community and both LMS and BNs interpret that community as an indication from the BGP peer that the procedures specified in Section 3 of this document should be

Fang et al. Expires <July 26, 2016> [Page 15]

Page 16: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

applied. If LMS receives IP route advertisement whose NLRI and next- hop attribute are different and contains the pre-configured BGP community, then LMS should interpret the update as label request from the BGP peer for the IP destination corresponding to the NLRI. Similarly, when BN receives BGP-LU advertisement for which the BN has originated an IP route and if the BGP-LU advertisement contains the pre-configured BGP community, then BN should interpret the update as partition label advertisement from LMS for the IP destination corresponding to the NLRI.

- BNs are configured with the LFIB to which the label advertised by the LMS should be installed. In this model, LMS cannot advertise the LFIB to which the label forwarding entry should be installed.

- Both LMS and BNs are configured with label retention policy in the event of BGP peering between LMS and BNs were to fail. For example, both LMS and BNs may be configured with label retention period of 7200 seconds so that BNs can retain the LFIB entry for 7200 seconds even if BGP peering with LMS fails.

4. Route Resolution in HSDN Architecture

As a consequence of the procedures described in Section 3, Route Resolver of the network will have the knowledge of the destinations in all UPs and the UPBNs that have advertised those UP destinations. Route Resolver uses this information to construct MPLS label stack to forward the packet to desired destination End-device.

Note that the procedure specified in this Section is only for illustration purpose and hence the implementation of Resolver is free to choose a more optimal mechanism to obtain the same result. The resolution for a given DstServer or End-device IP address works as follows.

1. Resolver should have received all BGP-LU routes of all End-devices from the LMSs of all "leaf" UPs with BGP next-hop specifying the UPBN that serves the UP. The Resolver looks up the given DstServer IP address in the resolution database. If the IP address is not present, then Resolver considers the resolution as having failed.

2. If the DstServer has been advertised by LMS of a UP, then the Resolver obtains the BGP next-hop from the BGP-LU route advertisement. The BGP next-hop is the UPBN of the leaf UP. Note that there may be multiple BGP-LU routes advertising the same DstServer. Assuming the policy is to use ECMP for the traffic, the Resolver picks the BGP-LU advertisement having G-bit set in "Partition-Unique Label Info" extended community and adds the BGLabel to the resulting stack. Assuming the DstServer is located

Fang et al. Expires <July 26, 2016> [Page 16]

Page 17: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

in second level UP and LG2 is the group label, the stack will be {LG2}.

3. Resolver then looks up the UPBN in the resolution database. If the UPBN IP address is not present, then Resolver considers the resolution as having failed. If there is one or more BGP-LU route with the UPBN as the destination, then the Resolver obtains the BGP next-hop(s). Assuming the policy is to use ECMP for the traffic, the Resolver picks the BGP-LU advertisement having G-bit set in "Partition-Unique Label Info" extended community and adds the BGLabel to the resulting stack. Assuming LG1 is the group label of level 1 UPBG, the stack will be {LG1, LG2}.

4. As the resolution has reached level 1 UPBN (that is a BN in UP0), the Resolver looks up the level 1 UPBN in resolution database. There should be multiple BGP-LU routes with level 1 UPBN as destination. Assuming the policy is to use ECMP for the traffic, the Resolver picks the BGP-LU advertisement having G-bit set in "Partition-Unique Label Info" extended community and adds the BGLabel to the resulting stack. Assuming LG0 is the group label of level 0 BG, the stack will be {LG0, LG1, LG2}. At this point the resolution is considered as successful (refer to Section 3.4) and the Resolver returns the resultant label stack to the querying system.

5. Security Considerations

The procedures defined in the document does not necessitate any security considerations.

6. IANA Considerations

This document defines a new extended community type (see Section 3.1).

7. Acknowledgments

We would like to thank Kaliraj Vairavakkalai and Balaji Rajagopalan for their valuable input and feedback.

8. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3107] Rekhter, Y. and E. Rosen, "Carrying Label Information in BGP-4", RFC 3107, May 2001.

Fang et al. Expires <July 26, 2016> [Page 17]

Page 18: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP-LU for HSDN January 23, 2016

[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route Reflection: An Alternative to Full Mesh Internal BGP (IBGP)", RFC 4456, April 2006.

[RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended Communities Attribute", RFC 4360, February 2006.

9. Informative References

[I-D.fang-mpls-hsdn-for-hsdc] L. Fang, et. al., "MPLS-Based Hierarchical SDN for Hyper-Scale DC/Cloud", draft-fang- mpls-hsdn-for-hsdc-04 (work in progress), July 2015.

[I-D.ietf-idr-add-paths] D. Walton et al., "Advertisement of Multiple Paths in BGP", draft-ietf-idr-add-paths-13 (work in progress), Dec. 2015.

Authors’ Addresses

Luyuan Fang Microsoft 15590 NE 31st St. Redmond, WA 98052 Email: [email protected]

Deepak Bansal Microsoft 15590 NE 31st St. Redmond, WA 98052 Email: [email protected]

Chandra Ramachandran Juniper Networks Bangalore, India Email: [email protected]

Fabio Chiussi Seattle, Washington 98116 Email: [email protected]

Nabil Bitar Verizon 40 Sylvan Road Waltham, MA 02145 Email: [email protected]

Yakov Rekhter

Fang et al. Expires <July 26, 2016> [Page 18]

Page 19: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

IDR Working Group R. Raszuk, Ed.

Internet-Draft Bloomberg LP

Intended status: Standards Track C. Cassar

Expires: July 11, 2020 Tesla

E. Aman

Telia Company

B. Decraene

Orange

K. Wang

Juniper Networks

January 8, 2020

BGP Optimal Route Reflection (BGP-ORR)

draft-ietf-idr-bgp-optimal-route-reflection-20

Abstract

This document proposes a solution for BGP route reflectors to allow

them to choose the best path for their clients that the clients

themselves would have chosen under the same conditions, without

requiring further state or any new features to be placed on the

clients. This facilitates, for example, best exit point policy (hot

potato routing). This solution is primarily applicable in

deployments using centralized route reflectors.

The solution relies upon all route reflectors learning all paths

which are eligible for consideration. Best path selection is

performed in each route reflector based on a configured virtual

location in the IGP. The location can be the same for all clients or

different per peer/update group or per peer. Best path selection can

also be performed based on user configured policies in each route

reflector.

Status of This Memo

This Internet-Draft is submitted in full conformance with the

provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering

Task Force (IETF). Note that other groups may also distribute

working documents as Internet-Drafts. The list of current Internet-

Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months

and may be updated, replaced, or obsoleted by other documents at any

time. It is inappropriate to use Internet-Drafts as reference

material or to cite them other than as "work in progress."

Raszuk, et al. Expires July 11, 2020 [Page 1]

Page 20: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

This Internet-Draft will expire on July 11, 2020.

Copyright Notice

Copyright (c) 2020 IETF Trust and the persons identified as the

document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal

Provisions Relating to IETF Documents

(https://trustee.ietf.org/license-info) in effect on the date of

publication of this document. Please review these documents

carefully, as they describe your rights and restrictions with respect

to this document. Code Components extracted from this document must

include Simplified BSD License text as described in Section 4.e of

the Trust Legal Provisions and are provided without warranty as

described in the Simplified BSD License.

Table of Contents

1. Definitions of Terms Used in This Memo . . . . . . . . . . . 2

2. Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

3. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4

3.1. Problem Statement . . . . . . . . . . . . . . . . . . . . 4

3.2. Existing/Alternative Solutions . . . . . . . . . . . . . 5

4. Two Proposed Solutions . . . . . . . . . . . . . . . . . . . 6

4.1. Client’s Perspective IGP Based Best Path Selection . . . 7

4.1.1. Restriction when BGP next hop is BGP prefix . . . . . 8

4.2. Client’s Perspective Policy Based Best Path Selection . . 8

4.3. Solution Interactions . . . . . . . . . . . . . . . . . . 9

4.3.1. IGP and policy based optimal route refresh . . . . . 9

4.3.2. Add-paths plus IGP and policy optimal route refresh . 9

4.3.3. Likely Deployments and need for backup . . . . . . . 9

5. CPU and Memory Scalability . . . . . . . . . . . . . . . . . 10

6. Advantages and Deployment Considerations . . . . . . . . . . 10

7. Security Considerations . . . . . . . . . . . . . . . . . . . 11

8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12

9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 12

10. References . . . . . . . . . . . . . . . . . . . . . . . . . 12

10.1. Normative References . . . . . . . . . . . . . . . . . . 12

10.2. Informative References . . . . . . . . . . . . . . . . . 13

Authors’ Addresses . . . . . . . . . . . . . . . . . . . . . . . 14

1. Definitions of Terms Used in This Memo

NLRI - Network Layer Reachability Information.

RIB - Routing Information Base.

Raszuk, et al. Expires July 11, 2020 [Page 2]

Page 21: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

AS - Autonomous System number.

VRF - Virtual Routing and Forwarding instance.

PE - Provider Edge router

RR - Route Reflector

POP - Point Of Presence

L3VPN - Layer 3 Virtual Private Networks [RFC4364]

6PE - IPv6 Provider Edge Router

IGP - Interior Gateway Protocol

SPT - Shortest Path Tree

best path - the route chosen by the decision process detailed in

[RFC 4271] section 9.1.2 and its subsections

best path computation - the decision process detailed in [RFC 4271]

section 9.1.2 and its subsections

best path algorithm - the decision process detailed in [RFC 4271]

section 9.1.2 and its subsections

best path selection - the decision process detailed in [RFC 4271]

section 9.1.2 and its subsections

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",

"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and

"OPTIONAL" in this document are to be interpreted as described in BCP

14 [RFC2119] [RFC8174] when, and only when, they appear in all

capitals, as shown here.

2. Authors

Following authors substantially contributed to the current format of

the document:

Stephane Litkowski

Orange

9 rue du chene germain

Cesson Sevigne, 35512

France

[email protected]

Raszuk, et al. Expires July 11, 2020 [Page 3]

Page 22: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

Adam Chappell

Interoute Communications

31st Floor

25 Canada Square

London, E14 5LQ

United Kingdom

[email protected]

3. Introduction

There are three types of BGP deployments within Autonomous Systems

today: full mesh, confederations and route reflection. BGP route

reflection [RFC4456] is the most popular way to distribute BGP routes

between BGP speakers belonging to the same Autonomous System.

However, in some situations, this method suffers from non-optimal

path selection.

3.1. Problem Statement

[RFC4456] asserts that, because the Interior Gateway Protocol (IGP)

cost to a given point in the network will vary across routers, "the

route reflection approach may not yield the same route selection

result as that of the full IBGP mesh approach." One practical

implication of this assertion is that the deployment of route

reflection may thwart the ability to achieve hot potato routing. Hot

potato routing attempts to direct traffic to the best AS exit point

in cases where no higher priority policy dictates otherwise. As a

consequence of the route reflection method, the choice of exit point

for a route reflector and its clients will be the exit point best for

the route reflector - not necessarily the one best for the route

reflector clients.

Section 11 of [RFC4456] describes a deployment approach and a set of

constraints which, if satisfied, would result in the deployment of

route reflection yielding the same results as the iBGP full mesh

approach. This deployment approach makes route reflection compatible

with the application of hot potato routing policy. In accordance

with these design rules, route reflectors have traditionally often

been deployed in the forwarding path and carefully placed on the POP

to core boundaries.

The evolving model of intra-domain network design has enabled

deployments of route reflectors outside of the forwarding path.

Initially this model was only employed for new address families, e.g.

L3VPNs and L2VPNs, however it has been gradually extended to other

BGP address families including IPv4 and IPv6 Internet using either

Raszuk, et al. Expires July 11, 2020 [Page 4]

Page 23: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

native routing or 6PE. In such environments, hot potato routing

policy remains desirable.

Route reflectors outside of the forwarding path can be placed on the

POP to core boundaries, but they are often placed in arbitrary

locations in the core of large networks.

Such deployments suffer from a critical drawback in the context of

best path selection: A route reflector with knowledge of multiple

paths for a given prefix will typically pick its best path and only

advertise that best path to its clients. If the best path for a

prefix is selected on the basis of an IGP tie break, the path

advertised will be the exit point closest to the route reflector.

However, the clients are in a different place in the network topology

than the route reflector. In networks where the route reflectors are

not in the forwarding path, this difference will be even more acute.

In addition, there are deployment scenarios where service providers

want to have more control in choosing the exit points for clients

based on other factors, such as traffic type, traffic load, etc.

This further complicates the issue and makes it less likely for the

route reflector to select the best path from the client’s

perspective. It follows that the best path chosen by the route

reflector is not necessarily the same as the path which would have

been chosen by the client if the client had considered the same set

of candidate paths as the route reflector.

3.2. Existing/Alternative Solutions

One possible valid solution or workaround to the best path selection

problem requires sending all domain external paths from the route

reflector to all its clients. This approach suffers the significant

drawback of pushing a large amount of BGP state to all edge routers.

Many networks receive full Internet routing information in a large

number of locations. This could easily result in tens of paths for

each prefix that would need to be distributed to clients.

Notwithstanding this drawback, there are a number of reasons for

sending more than just the single best path to the clients. Improved

path diversity at the edge is a requirement for fast connectivity

restoration, and a requirement for effective BGP level load

balancing.

In practical terms, add/diverse path deployments [RFC7911] [RFC6774]

are expected to result in the distribution of 2, 3, or n (where n is

a small number) good paths rather than all domain external paths.

When the route reflector chooses one set of n paths and distributes

them to all its route reflector clients, those n paths may not be the

Raszuk, et al. Expires July 11, 2020 [Page 5]

Page 24: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

right n paths for all clients. In the context of the problem

described above, those n paths will not necessarily include the

closest exit point out of the network for each route reflector

client. The mechanisms proposed in this document are likely to be

complementary to mechanisms aimed at improving path diversity.

Another possibility to optimize exit point selection is the

implementation of distributed route reflector functionality at key

IGP locations in order to ensure that these locations see their

viewpoints respected in exit selection. Typically, however, this

requires the installation of physical nodes to implement the

reflection, and if exit policy subsequently changes, the reflector

placement and position can become inappropriate.

To counter the burden of physical installation, it is possible to

build a logical overlay of tunnels with appropriate IGP metrics in

order to simulate closeness to key locations required to implement

exit policy. There is significant complexity overhead in this

approach, however, enough so to typically make it undesirable.

Trends in control plane decoupling are causing a shift from

traditional routers to compute virtualization platforms, or even

third-party cloud platforms. As a result, without this proposal,

operators are left with a difficult choice for the distribution and

reflection of address families with significant exit diversity:

o centralized path selection, and tolerate the associated suboptimal

paths, or

o defer selection to end clients, but lose potential route scale

capacity

The latter can be a viable option, but it is clearly a decision that

needs to be made on an application and address family basis, with

strong consideration for the number of available paths per prefix

(which may even vary per prefix range, depending on peering policy,

e.g. consider bilateral peerings versus onward transit arrangements)

4. Two Proposed Solutions

This document is describes two solution to allow a route reflector to

choose the best path for its clients that the clients themselves

would have chosen had they considered the same set of candidate

paths.

For purposes of route selection, the perspective of a client can

differ from that of a route reflector or another client in two

distinct ways:

Raszuk, et al. Expires July 11, 2020 [Page 6]

Page 25: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

it can, and usually will, have a different position in the IGP

topology, and

it can have a different routing policy.

These factors correspond to the issues described earlier.

Accordingly, this document specifies two distinct modifications to

the best path algorithm, to address these two distinct factors. A

route reflector can implement either or both of the modifications in

order to allow it to choose the best path for its clients that the

clients themselves would have chosen given the same set of candidate

paths.

Both modifications rely upon all route reflectors learning all paths

that are eligible for consideration. In order to satisfy this

requirement, path diversity enhancing mechanisms such as add-path/

diverse paths may need to be deployed between route reflectors.

A significant advantage of these approaches is that the route

reflector clients do not need to run new software or hardware.

4.1. Client’s Perspective IGP Based Best Path Selection

The core of this solution is the ability for an operator to specify

on a per route reflector basis, or per peer/update group basis, or

per peer basis the virtual IGP location placement of the route

reflector. This core ability enables the route reflector to send to

a given group of clients routes with shortest distance to the next

hops from the position of the configured virtual IGP location. This

core ability provides for freedom of route reflector location, and

allows transient or permanent migration of this network control plane

function to an arbitrary location.

The choice of specific granularity (route reflector basis, peer/

update group basis, or peer peer basis) is left as an implementation

decision. An implementation is considered compliant with the

document if it supports at least one listed grouping of virtual IGP

location.

In this approach, optimal refers to the decision made during best

path selection at the IGP metric to BGP next hop comparison step.

This approach does not apply to path selection preference based on

other policy steps and provisions.

The computation of the virtual IGP location with any of the above

described granularity is outside of the scope of this document. The

operator may configure it manually, implementation may automate it

Raszuk, et al. Expires July 11, 2020 [Page 7]

Page 26: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

based on heuristics, or it can be computed centrally and configured

by an external system.

This solution does not require any BGP or IGP protocol changes, as

all required changes are contained within the route reflector

implementation.

This solution applies to NLRIs of all address families, that can be

route reflected.

4.1.1. Restriction when BGP next hop is BGP prefix

In situations where the BGP next hop is a BGP prefix itself the IGP

metric of a route used for its resolution SHOULD be the final IGP

cost to reach such next hop. Implementations which can not inform

BGP of the final IGP metric to a recursive next hop SHOULD treat such

paths as least preferred during next hop metric comparison. However

such paths SHOULD still be considered valid for best path selection.

4.2. Client’s Perspective Policy Based Best Path Selection

Optimal route reflection based on virtual IGP location could reflect

the best path to the client from IGP cost perspective. However,

there are also cases where the client might want the best path based

on factors beyond IGP cost. Examples include, but not limited to:

o Selecting the best path for the clients from a traffic engineering

perspective.

o Dedicating certain exit points for certain ingress points.

The user MAY specify and apply a general policy on the route

reflector to select a subset of exit points as the candidate exit

points for its clients. For a given client, the policy SHOULD also

allow the operator to select different candidate exit points for

different address families. Regular path selection, including

client’s perspective IGP based best path selection stated above, will

be applied to the candidate paths to select the final paths to

advertise to the clients.

Since the policy is applied on the route reflector on behalf of its

clients, the route reflector will be able to reflect only the optimal

paths to its clients. An additional advantage of this approach is

that configuration need only be done on a small number of route

reflectors, rather than on a significantly larger number of clients.

Raszuk, et al. Expires July 11, 2020 [Page 8]

Page 27: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

4.3. Solution Interactions

4.3.1. IGP and policy based optimal route refresh

Depending on the actual deployment scenarios, service providers may

configure IGP based optimal route reflection or policy based optimal

route reflection. It is also possible to configure both approaches

together. In cases where both are configured together, policy based

optimal route reflection MUST be applied first to select the

candidate paths, then IGP based optimal route reflection can be

applied on top of the candidate paths to select the final path to

advertise to the client.

The expected use case for optimal route reflection is to avoid

reflecting all paths to the client because the client either: does

not support add-paths or does not have the capacity to process all of

the paths. Typically the route reflector would just reflect a single

optimal route to the client. However, the solutions MUST NOT prevent

reflecting more than one optimal path to the client as path diversity

may be desirable for load balancing or fast restoration. In cases

where add-path and optimal route reflection are configured together,

the route reflector MUST reflect n optimal paths to a client, where n

is the add-path count.

4.3.2. Add-paths plus IGP and policy optimal route refresh

The most complicated scenario is where add-path is configured

together with both IGP based and policy based optimal route

reflection. In this scenario, the policy based optimal route

reflection MUST be applied first to select the candidate paths (from

add-path). Subsequently, IGP based optimal route reflection will be

applied on top of the candidate paths to select the best n paths to

advertise to the client.

4.3.3. Likely Deployments and need for backup

With IGP based optimal route reflection, even though the virtual IGP

location could be specified on a per route reflector basis or per

peer/update group basis or per peer basis, in reality, it’s most

likely to be specified per peer/update group basis. All clients with

the same or similar IGP location can be grouped into the same peer/

update group. A virtual IGP location is then specified for the peer/

update group. The virtual location is usually specified as the

location of one of the clients from the peer group or an ABR to the

area where clients are located. Also, one or more backup virtual

locations SHOULD be allowed to be specified for redundancy.

Implementations may wish to take advantage of peer group mechanisms

Raszuk, et al. Expires July 11, 2020 [Page 9]

Page 28: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

in order to provide for better scalability of optimal route reflector

client groups with similar properties.

5. CPU and Memory Scalability

For IGP based optimal route reflection, determining the shortest path

and associated cost between any two arbitrary points in a network

based on the IGP topology learned by a router is expected to add some

extra cost in terms of CPU resources. However, current SPF tree

generation code is implemented efficiently in a number of

implementations, and therefore this is not expected to be a major

drawback. The number of SPTs computed is expected to be of the order

of the number of clients of a route reflector whenever a topology

change is detected. Advanced optimizations like partial and

incremental SPF may also be exploited. The number of SPTs computed

is expected to be higher but comparable to some existing deployed

features such as (Remote) Loop Free Alternate which computes a (r)SPT

per IGP neighbor.

For policy based optimal route reflection, there will be some

overhead to apply the policy to select the candidate paths. This

overhead is comparable to existing BGP export policies and therefore

should be manageable.

By the nature of route reflection, the number of clients can be split

arbitrarily by the deployment of more route reflectors for a given

number of clients. While this is not expected to be necessary in

existing networks with best in class route reflectors available

today, this avenue to scaling up the route reflection infrastructure

is available.

If we consider the overall network wide cost/benefit factor, the only

alternative to achieve the same level of optimality would require

significantly increasing state on the edges of the network. This

will consume CPU and memory resources on all BGP speakers in the

network. Building this client perspective into the route reflectors

seems appropriate.

6. Advantages and Deployment Considerations

The solutions described provide a model for integrating the client

perspective into the best path computation for route reflectors.

More specifically, the choice of BGP path factors in either the IGP

cost between the client and the nexthop (rather than the IGP cost

from the route reflector to the nexthop) or other user configured

policies.

Raszuk, et al. Expires July 11, 2020 [Page 10]

Page 29: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

Implementations considered compliant with this document allow the

configuration of a logical location from which the best path will be

computed, on the basis of either a peer, a peer group, or an entire

routing instance.

These solutions can be deployed in traditional hop-by-hop forwarding

networks as well as in end-to-end tunneled environments. In networks

where there are multiple route reflectors and hop-by-hop forwarding

without encapsulation, such optimizations SHOULD be enabled in a

consistent way on all route reflectors. Otherwise, clients may

receive an inconsistent view of the network, in turn leading to

intra-domain forwarding loops.

With this approach, an ISP can effect a hot potato routing policy

even if route reflection has been moved out of the forwarding plane,

and hop-by-hop switching has been replaced by end-to-end MPLS or IP

encapsulation.

As per above, these approaches reduce the amount of state which needs

to be pushed to the edge of the network in order to perform hot

potato routing. The memory and CPU resources required at the edge of

the network to provide hot potato routing using these approaches is

lower than what would be required to achieve the same level of

optimality by pushing and retaining all available paths (potentially

10s) per each prefix at the edge.

The solutions above allow for a fast and safe transition to a BGP

control plane using centralized route reflection, without

compromising an operator’s closest exit operational principle. This

enables edge-to-edge LSP/IP encapsulation for traffic to IPv4 and

IPv6 prefixes.

Regarding the client’s IGP best-path selection, it should be self

evident that this solution does not interfere with policies enforced

above IGP tie breaking in the BGP best path algorithm.

7. Security Considerations

This document does not introduce any new BGP functionality therefor

it does not change the underlying security issues inherent in the

existing IBGP path propagation when BGP Route Reflection [RFC4456] is

used.

It however enables the deployment of base BGP Route Reflection as

described in [RFC4456] to be possible using virtual compute

environments without any negative consequence on the BGP routing path

optimality.

Raszuk, et al. Expires July 11, 2020 [Page 11]

Page 30: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

This document does not introduce requirements for any new protection

measures, but it also does not relax best operational practices for

keeping the IGP network stable or to pace rate of policy based IGP

cost to next hops such that it does not have any substantial effect

on BGP path changes and their propagation to route reflection

clients.

8. IANA Considerations

This document does not request any IANA allocations.

9. Acknowledgments

Authors would like to thank Keyur Patel, Eric Rosen, Clarence

Filsfils, Uli Bornhauser, Russ White, Jakob Heitz, Mike Shand, Jon

Mitchell, John Scudder, Jeff Haas, Martin Djernaes, Daniele

Ceccarelli, Kieran Milne, Job Snijders and Randy Bush for their

valuable input.

10. References

10.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate

Requirement Levels", BCP 14, RFC 2119,

DOI 10.17487/RFC2119, March 1997,

<https://www.rfc-editor.org/info/rfc2119>.

[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A

Border Gateway Protocol 4 (BGP-4)", RFC 4271,

DOI 10.17487/RFC4271, January 2006,

<https://www.rfc-editor.org/info/rfc4271>.

[RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended

Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,

February 2006, <https://www.rfc-editor.org/info/rfc4360>.

[RFC5492] Scudder, J. and R. Chandra, "Capabilities Advertisement

with BGP-4", RFC 5492, DOI 10.17487/RFC5492, February

2009, <https://www.rfc-editor.org/info/rfc5492>.

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC

2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,

May 2017, <https://www.rfc-editor.org/info/rfc8174>.

Raszuk, et al. Expires July 11, 2020 [Page 12]

Page 31: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

10.2. Informative References

[RFC1997] Chandra, R., Traina, P., and T. Li, "BGP Communities

Attribute", RFC 1997, DOI 10.17487/RFC1997, August 1996,

<https://www.rfc-editor.org/info/rfc1997>.

[RFC1998] Chen, E. and T. Bates, "An Application of the BGP

Community Attribute in Multi-home Routing", RFC 1998,

DOI 10.17487/RFC1998, August 1996,

<https://www.rfc-editor.org/info/rfc1998>.

[RFC4384] Meyer, D., "BGP Communities for Data Collection", BCP 114,

RFC 4384, DOI 10.17487/RFC4384, February 2006,

<https://www.rfc-editor.org/info/rfc4384>.

[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route

Reflection: An Alternative to Full Mesh Internal BGP

(IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006,

<https://www.rfc-editor.org/info/rfc4456>.

[RFC4893] Vohra, Q. and E. Chen, "BGP Support for Four-octet AS

Number Space", RFC 4893, DOI 10.17487/RFC4893, May 2007,

<https://www.rfc-editor.org/info/rfc4893>.

[RFC5283] Decraene, B., Le Roux, JL., and I. Minei, "LDP Extension

for Inter-Area Label Switched Paths (LSPs)", RFC 5283,

DOI 10.17487/RFC5283, July 2008,

<https://www.rfc-editor.org/info/rfc5283>.

[RFC5668] Rekhter, Y., Sangli, S., and D. Tappan, "4-Octet AS

Specific BGP Extended Community", RFC 5668,

DOI 10.17487/RFC5668, October 2009,

<https://www.rfc-editor.org/info/rfc5668>.

[RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework",

RFC 5714, DOI 10.17487/RFC5714, January 2010,

<https://www.rfc-editor.org/info/rfc5714>.

[RFC6774] Raszuk, R., Ed., Fernando, R., Patel, K., McPherson, D.,

and K. Kumaki, "Distribution of Diverse BGP Paths",

RFC 6774, DOI 10.17487/RFC6774, November 2012,

<https://www.rfc-editor.org/info/rfc6774>.

[RFC7911] Walton, D., Retana, A., Chen, E., and J. Scudder,

"Advertisement of Multiple Paths in BGP", RFC 7911,

DOI 10.17487/RFC7911, July 2016,

<https://www.rfc-editor.org/info/rfc7911>.

Raszuk, et al. Expires July 11, 2020 [Page 13]

Page 32: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft bgp-optimal-route-reflection January 2020

Authors’ Addresses

Robert Raszuk (editor)

Bloomberg LP

731 Lexington Ave

New York City, NY 10022

USA

Email: [email protected]

Christian Cassar

Tesla

43 Avro Way

Weybridge KT13 0XY

UK

Email: [email protected]

Erik Aman

Telia Company

Solna SE-169 94

Sweden

Email: [email protected]

Bruno Decraene

Orange

38-40 rue du General Leclerc

Issy les Moulineaux cedex 9 92794

France

Email: [email protected]

Kevin Wang

Juniper Networks

10 Technology Park Drive

Westford, MA 01886

USA

Email: [email protected]

Raszuk, et al. Expires July 11, 2020 [Page 14]

Page 33: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Inter-Domain Routing S. PrevidiInternet-Draft IndividualIntended status: Standards Track K. Talaulikar, Ed.Expires: November 17, 2019 C. Filsfils Cisco Systems, Inc. K. Patel Arrcus, Inc. S. Ray Individual Contributor J. Dong Huawei Technologies May 16, 2019

BGP-LS extensions for Segment Routing BGP Egress Peer Engineering draft-ietf-idr-bgpls-segment-routing-epe-19

Abstract

Segment Routing (SR) leverages source routing. A node steers a packet through a controlled set of instructions, called segments, by prepending the packet with an SR header. A segment can represent any instruction, topological or service-based. SR segments allow steering a flow through any topological path and service chain while maintaining per-flow state only at the ingress node of the SR domain.

This document describes an extension to BGP Link-State (BGP-LS) for advertisement of BGP Peering Segments along with their BGP peering node information so that efficient BGP Egress Peer Engineering (EPE) policies and strategies can be computed based on Segment Routing.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/.

Previdi, et al. Expires November 17, 2019 [Page 1]

Page 34: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on November 17, 2019.

Copyright Notice

Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 2. BGP Peering Segments . . . . . . . . . . . . . . . . . . . . 4 3. BGP-LS NLRI Advertisement for BGP Protocol . . . . . . . . . 5 3.1. BGP Router-ID and Member AS Number . . . . . . . . . . . 6 3.2. Mandatory BGP Node Descriptors . . . . . . . . . . . . . 6 3.3. Optional BGP Node Descriptors . . . . . . . . . . . . . . 7 4. BGP-LS Attributes for BGP Peering Segments . . . . . . . . . 7 4.1. Advertisement of the PeerNode SID . . . . . . . . . . . . 10 4.2. Advertisement of the PeerAdj SID . . . . . . . . . . . . 11 4.3. Advertisement of the PeerSet SID . . . . . . . . . . . . 12 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 5.1. New BGP-LS Protocol-ID . . . . . . . . . . . . . . . . . 13 5.2. Node Descriptors and Link Attribute TLVs . . . . . . . . 13 6. Manageability Considerations . . . . . . . . . . . . . . . . 14 7. Security Considerations . . . . . . . . . . . . . . . . . . . 15 8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 15 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 16 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 16 10.1. Normative References . . . . . . . . . . . . . . . . . . 16 10.2. Informative References . . . . . . . . . . . . . . . . . 17 Authors’ Addresses . . . . . . . . . . . . . . . . . . . . . . . 17

Previdi, et al. Expires November 17, 2019 [Page 2]

Page 35: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

1. Introduction

Segment Routing (SR) leverages source routing. A node steers a packet through a controlled set of instructions, called segments, by prepending the packet with an SR header with segment identifiers (SID). A SID can represent any instruction, topological or service- based. SR segments allows to enforce a flow through any topological path or service function while maintaining per-flow state only at the ingress node of the SR domain.

The SR architecture [RFC8402] defines three types of BGP Peering Segments that may be instantiated at a BGP node:

o Peer Node Segment (PeerNode SID) : instruction to steer to a specific peer node

o Peer Adjacency Segment (PeerAdj SID) : instruction to steer over a specific local interface towards a specific peer node

o Peer Set Segment (PeerSet SID) : instruction to load-balance to a set of specific peer nodes

SR can be directly applied to either to an MPLS dataplane (SR/MPLS) with no change on the forwarding plane or to a modified IPv6 forwarding plane (SRv6).

This document describes extensions to the BGP Link-State NLRI (BGP-LS NLRI) and the BGP-LS Attribute defined for BGP-LS [RFC7752] for advertising BGP peering segments from a BGP node along with its peering topology information (i.e., its peers, interfaces, and peering ASs) to enable computation of efficient BGP Egress Peer Engineering (BGP-EPE) policies and strategies using the SR/MPLS dataplane. The corresponding extensions for SRv6 are specified in [I-D.dawra-idr-bgpls-srv6-ext].

[I-D.ietf-spring-segment-routing-central-epe] illustrates a centralized controller-based BGP Egress Peer Engineering solution involving SR path computation using the BGP Peering Segments. This use case comprises a centralized controller that learns the BGP Peering SIDs via BGP-LS and then uses this information to program a BGP-EPE policy at any node in the domain to perform traffic steering via a specific BGP egress node to a specific EBGP peer(s) optionally also over a specific interface. The BGP-EPE policy can be realized using the SR Policy framework [I-D.ietf-spring-segment-routing-policy].

This document introduces a new BGP-LS Protocol-ID for BGP and defines new BGP-LS Node and Link Descriptor TLVs to facilitate advertising

Previdi, et al. Expires November 17, 2019 [Page 3]

Page 36: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

BGP-LS Link NLRI to represent the BGP peering topology. Further, it specifies the BGP-LS Attribute TLVs for advertisement of the BGP Peering Segments (i.e., PeerNode SID, PeerAdj SID, and PeerSet SID) to be advertised in the same BGP-LS Link NLRI.

2. BGP Peering Segments

As described in [RFC8402], a BGP-EPE enabled Egress PE node instantiates SR Segments corresponding to its attached peers. These segments are called BGP Peering Segments or BGP Peering SIDs. In case of EBGP, they enable the expression of source-routed inter- domain paths.

An ingress border router of an AS may compose a list of SIDs to steer a flow along a selected path within the AS, towards a selected egress border router C of the AS, and to a specific EBGP peer. At minimum, a BGP-EPE policy applied at an ingress PE involves two SIDs: the Node SID of the chosen egress PE and then the BGP Peering SID for the chosen egress PE peer or peering interface.

Each BGP session MUST be described by a PeerNode SID. The description of the BGP session MAY be augmented by additional PeerAdj SIDs. Finally, multiple PeerNode SIDs or PeerAdj SIDs MAY be part of the same group/set in order to group EPE resources under a common PeerSet SID. These BGP Peering SIDs and their encoding are described in detail in Section 4.

The following BGP Peering SIDs need to be instantiated on a BGP router for each of its BGP peer sessions that are enabled for Egress Peer Engineering:

o One PeerNode SID MUST be instantiated to describe the BGP peer session.

o One or more PeerAdj SID MAY be instantiated corresponding to the underlying link(s) to the directly connected BGP peer session.

o A PeerSet SID MAY be instantiated and additionally associated and shared between one or more PeerNode SIDs or PeerAdj SIDs.

While an egress point in a topology usually refers to EBGP sessions between external peers, there’s nothing in the extensions defined in this document that would prevent the use of these extensions in the context of IBGP sessions. However, unlike EBGP sessions which are generally between directly connected BGP routers which are also along the traffic forwarding path, IBGP peer sessions may be setup to BGP routers which are not in the forwarding path. As such, when the IBGP design includes sessions with route-reflectors, a BGP router SHOULD

Previdi, et al. Expires November 17, 2019 [Page 4]

Page 37: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

NOT instantiate a BGP Peering SID for those sessions to peer nodes which are not in the forwarding path since the purpose of BGP Peering SID is to steer traffic to that specific peers. Thus, the applicability for IBGP peering may be limited to only those deployments where the IBGP peer is also along the forwarding data path.

Any BGP Peering SIDs instantiated on the node are advertised via BGP- LS Link NLRI type as described in the sections below. An illustration of the BGP Peering SIDs’ allocations in a reference BGP peering topology along with the information carried in the BGP-LS Link NLRI and its corresponding BGP-LS Attribute are described in [I-D.ietf-spring-segment-routing-central-epe].

3. BGP-LS NLRI Advertisement for BGP Protocol

This section describes the BGP-LS NLRI encodings that describe the BGP peering and link connectivity between BGP routers.

This document specifies the advertisement of BGP peering topology information via BGP-LS Link NLRI type which requires use of a new BGP-LS Protocol-ID.

+-------------+----------------------------------+ | Protocol-ID | NLRI information source protocol | +-------------+----------------------------------+ | 7 | BGP | +-------------+----------------------------------+

Table 1: BGP-LS Protocol Identifier for BGP

The use of a new Protocol-ID allows separation and differentiation between the BGP-LS NLRIs carrying BGP information from the BGP-LS NLRIs carrying IGP link-state information defined in [RFC7752].

The BGP Peering information along with their Peering Segments are advertised using BGP-LS Link NLRI type with the Protocol-ID set to BGP. The BGP-LS Link NLRI type uses the Descriptor TLVs and BGP-LS Attribute TLVs as defined in [RFC7752]. In order to correctly describe BGP nodes, new TLVs are defined in this section.

[RFC7752] defines Link NLRI Type is as follows:

Previdi, et al. Expires November 17, 2019 [Page 5]

Page 38: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+ | Protocol-ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Identifier | | (64 bits) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // Local Node Descriptors // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // Remote Node Descriptors // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // Link Descriptors // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1: BGP-LS Link NLRI

Node Descriptors and Link Descriptors are defined in [RFC7752].

3.1. BGP Router-ID and Member AS Number

Two new Node Descriptors TLVs are defined in this document:

o BGP Router Identifier (BGP Router-ID):

Type: 516

Length: 4 octets

Value: 4 octet unsigned non-zero integer representing the BGP Identifier as defined in [RFC6286].

o Member-AS Number (Member-ASN)

Type: 517

Length: 4 octets

Value: 4 octet unsigned non-zero integer representing the Member-AS Number [RFC5065].

3.2. Mandatory BGP Node Descriptors

The following Node Descriptors TLVs MUST be included in BGP-LS NLRI as Local Node Descriptors when distributing BGP information:

o BGP Router-ID (TLV 516), which contains a valid BGP Identifier of the local BGP node.

Previdi, et al. Expires November 17, 2019 [Page 6]

Page 39: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

o Autonomous System Number (TLV 512) [RFC7752], which contains the ASN or AS Confederation Identifier (ASN) [RFC5065], if confederations are used, of the local BGP node.

Note that [RFC6286] (section 2.1) requires the BGP identifier (Router-ID) to be unique within an Autonomous System and non-zero. Therefore, the <ASN, BGP Router-ID> tuple is globally unique. Their use in the Node Descriptor helps map Link-State NLRIs with BGP protocol-ID to a unique BGP router in the administrative domain where BGP-LS is enabled.

The following Node Descriptors TLVs MUST be included in BGP-LS Link NLRI as Remote Node Descriptors when distributing BGP information:

o BGP Router-ID (TLV 516), which contains the valid BGP Identifier of the peer BGP node.

o Autonomous System Number (TLV 512) [RFC7752], which contains the ASN or the AS Confederation Identifier (ASN) [RFC5065], if confederations are used, of the peer BGP node.

3.3. Optional BGP Node Descriptors

The following Node Descriptors TLVs MAY be included in BGP-LS NLRI as Local Node Descriptors when distributing BGP information:

o Member-ASN (TLV 517), which contains the ASN of the confederation member (i.e., Member-AS Number), if BGP confederations are used, of the local BGP node.

o Node Descriptors as defined in [RFC7752].

The following Node Descriptors TLVs MAY be included in BGP-LS Link NLRI as Remote Node Descriptors when distributing BGP information:

o Member-ASN (TLV 517), which contains the ASN of the confederation member (i.e., Member-AS Number), if BGP confederations are used, of the peer BGP node.

o Node Descriptors as defined in [RFC7752].

4. BGP-LS Attributes for BGP Peering Segments

This section defines the BGP-LS Attributes corresponding to the following BGP Peer Segment SIDs:

Peer Node Segment Identifier (PeerNode SID)

Previdi, et al. Expires November 17, 2019 [Page 7]

Page 40: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

Peer Adjacency Segment Identifier (PeerAdj SID)

Peer Set Segment Identifier (PeerSet SID)

The following new BGP-LS Link attributes TLVs are defined for use with BGP-LS Link NLRI for advertising BGP Peering SIDs:

+----------+---------------------------+ | TLV Code | Description | | Point | | +----------+---------------------------+ | 1101 | PeerNode SID | | 1102 | PeerAdj SID | | 1103 | PeerSet SID | +----------+---------------------------+

Figure 2: BGP-LS TLV code points for BGP-EPE

PeerNode SID, PeerAdj SID, and PeerSet SID have all the same format defined here below:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Flags | Weight | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | SID/Label/Index (variable) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 3: BGP Peering SIDs TLV Format

o Type: 1101, 1102 or 1103 as listed in Figure 2.

o Length: variable. Valid values are either 7 or 8 based on the whether the encoding is done as a SID Index or a label.

o Flags: one octet of flags with the following definition:

0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |V|L|B|P| Rsvd | +-+-+-+-+-+-+-+-+

Figure 4: Peering SID TLV Flags Format

Previdi, et al. Expires November 17, 2019 [Page 8]

Page 41: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

* V-Flag: Value flag. If set, then the SID carries a label value. By default the flag is SET.

* L-Flag: Local Flag. If set, then the value/index carried by the SID has local significance. By default the flag is SET.

* B-Flag: Backup Flag. If set, the SID refers to a path that is eligible for protection using fast re-route (FRR). The computation of the backup forwarding path and its association with the BGP Peering SID forwarding entry is implementation specific. [I-D.ietf-spring-segment-routing-central-epe] section 3.6 discusses some of the possible ways of identifying backup paths for BGP Peering SIDs.

* P-Flag: Persistent Flag: If set, the SID is persistently allocated, i.e., the SID value remains consistent across router restart and session/interface flap.

* Rsvd bits: Reserved for future use and MUST be zero when originated and ignored when received.

o Weight: 1 octet. The value represents the weight of the SID for the purpose of load balancing. An example use of the weight is described in [RFC8402].

o SID/Index/Label. According to the TLV length and to the V and L flags settings, it contains either:

* A 3 octet local label where the 20 rightmost bits are used for encoding the label value. In this case, the V and L flags MUST be SET.

* A 4 octet index defining the offset in the Segment Routing Global Block (SRGB) [RFC8402] advertised by this router. In this case, the SRGB MUST be advertised using the extensions defined in [I-D.ietf-idr-bgp-ls-segment-routing-ext].

The values of the PeerNode SID, PeerAdj SID, and PeerSet SID Sub-TLVs SHOULD be persistent across router restart.

When enabled for Egress Peer Engineering, the BGP router MUST include the PeerNode SID TLV in the BGP-LS Attribute for the BGP-LS Link NLRI corresponding to its BGP peering sessions. The PeerAdj SID and PeerSet SID TLVs MAY be included in the BGP-LS Attribute for the BGP- LS Link NLRI.

Additional BGP-LS Link Attribute TLVs, as defined in [RFC7752] MAY be included with the BGP-LS Link NLRI in order to advertise the

Previdi, et al. Expires November 17, 2019 [Page 9]

Page 42: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

characteristics of the peering link. E.g., one or more interface addresses (TLV 259 or TLV 261) of the underlying link(s) over which a multi-hop BGP peering session is setup may be included in the BGP-LS Attribute along with the PeerNode SID TLV.

4.1. Advertisement of the PeerNode SID

The PeerNode SID TLV includes a SID associated with the BGP peer node that is described by a BGP-LS Link NLRI as specified in Section 3.

The PeerNode SID, at the BGP node advertising it, has the following semantics (as defined in [RFC8402]):

o SR operation: NEXT.

o Next-Hop: the connected peering node to which the segment is associated.

The PeerNode SID is advertised with a BGP-LS Link NLRI, where:

o Local Node Descriptors include:

* Local BGP Router-ID (TLV 516) of the BGP-EPE enabled egress PE.

* Local ASN (TLV 512).

o Remote Node Descriptors include:

* Peer BGP Router-ID (TLV 516) (i.e., the peer BGP ID used in the BGP session)

* Peer ASN (TLV 512).

o Link Descriptors include the addresses used by the BGP session encoded using TLVs as defined in [RFC7752]:

* IPv4 Interface Address (TLV 259) contains the BGP session IPv4 local address.

* IPv4 Neighbor Address (TLV 260) contains the BGP session IPv4 peer address.

* IPv6 Interface Address (TLV 261) contains the BGP session IPv6 local address.

* IPv6 Neighbor Address (TLV 262) contains the BGP session IPv6 peer address.

Previdi, et al. Expires November 17, 2019 [Page 10]

Page 43: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

o Link Attribute TLVs include the PeerNode SID TLV as defined in Figure 3.

4.2. Advertisement of the PeerAdj SID

The PeerAdj SID TLV includes a SID associated with the underlying link to the BGP peer node that is described by a BGP-LS Link NLRI as specified in Section 3.

The PeerAdj SID, at the BGP node advertising it, has the following semantics (as defined in [RFC8402]):

o SR operation: NEXT.

o Next-Hop: the interface peer address.

The PeerAdj SID is advertised with a BGP-LS Link NLRI, where:

o Local Node Descriptors include:

* Local BGP Router-ID (TLV 516) of the BGP-EPE enabled egress PE.

* Local ASN (TLV 512).

o Remote Node Descriptors include:

* Peer BGP Router-ID (TLV 516) (i.e., the peer BGP ID used in the BGP session).

* Peer ASN (TLV 512).

o Link Descriptors MUST include the following TLV, as defined in [RFC7752]:

* Link Local/Remote Identifiers (TLV 258) contains the 4-octet Link Local Identifier followed by the 4-octet Link Remote Identifier. The value 0 is used by default when the link remote identifier is unknown.

o Additional Link Descriptors TLVs, as defined in [RFC7752], MAY also be included to describe the addresses corresponding to the link between the BGP routers:

* IPv4 Interface Address (Sub-TLV 259) contains the address of the local interface through which the BGP session is established.

Previdi, et al. Expires November 17, 2019 [Page 11]

Page 44: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

* IPv6 Interface Address (Sub-TLV 261) contains the address of the local interface through which the BGP session is established.

* IPv4 Neighbor Address (Sub-TLV 260) contains the IPv4 address of the peer interface used by the BGP session.

* IPv6 Neighbor Address (Sub-TLV 262) contains the IPv6 address of the peer interface used by the BGP session.

o Link Attribute TLVs include the PeerAdj SID TLV as defined in Figure 3.

4.3. Advertisement of the PeerSet SID

The PeerSet SID TLV includes a SID that is shared amongst BGP peer nodes or the underlying links that are described by BGP-LS Link NLRI as specified in Section 3.

The PeerSet SID, at the BGP node advertising it, has the following semantics (as defined in [RFC8402]):

o SR operation: NEXT.

o Next-Hop: load balance across any connected interface to any peer in the associated peer set.

The PeerSet SID TLV containing the same SID value (encoded as defined in Figure 3) is included in the BGP-LS Attribute for all of the BGP- LS Link NLRI corresponding to the PeerNode or PeerAdj segments associated with the peer set.

5. IANA Considerations

This document defines:

A new Protocol-ID: BGP. The codepoint is from the "BGP-LS Protocol-IDs" registry.

Two new TLVs: BGP-Router-ID and BGP Confederation Member. The codepoints are in the "BGP-LS Node Descriptor, Link Descriptor, Prefix Descriptor, and Attribute TLVs" registry.

Three new BGP-LS Attribute TLVs: PeerNode SID, PeerAdj SID and PeerSet SID. The codepoints are in the "BGP-LS Node Descriptor, Link Descriptor, Prefix Descriptor, and Attribute TLVs" registry.

Previdi, et al. Expires November 17, 2019 [Page 12]

Page 45: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

5.1. New BGP-LS Protocol-ID

This document defines a new value in the registry "BGP-LS Protocol- IDs":

+------------------------------------------------------+ | Codepoint | Description | Status | +------------------------------------------------------+ | 7 | BGP | Early Allocation by IANA | +------------------------------------------------------+

Figure 5: BGP Protocol Codepoint

5.2. Node Descriptors and Link Attribute TLVs

This document defines 5 new TLVs in the registry "BGP-LS Node Descriptor, Link Descriptor, Prefix Descriptor, and Attribute TLVs":

o Two new node descriptor TLVs

o Three new link attribute TLVs

All the new 5 codepoints are in the same registry: "BGP-LS Node Descriptor, Link Descriptor, Prefix Descriptor, and Attribute TLVs".

The following new Node Descriptors TLVs are defined:

+-------------------------------------------------------------------+ | Codepoint | Description | Status | +-------------------------------------------------------------------+ | 516 | BGP Router-ID | Early Allocation by IANA | | 517 | BGP Confederation Member | Early Allocation by IANA | +------------+------------------------------------------------------+

Figure 6: BGP-LS Descriptor TLVs Codepoints

The following new Link Attribute TLVs are defined:

+-------------------------------------------------------------------+ | Codepoint | Description | Status | +-------------------------------------------------------------------+ | 1101 | PeerNode SID | Early Allocation by IANA | | 1102 | PeerAdj SID | Early Allocation by IANA | | 1103 | PeerSet SID | Early Allocation by IANA | +------------+------------------------------------------------------+

Figure 7: BGP-LS Attribute TLVs Codepoints

Previdi, et al. Expires November 17, 2019 [Page 13]

Page 46: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

6. Manageability Considerations

The new protocol extensions introduced in this document augment the existing IGP topology information BGP-LS distribution [RFC7752] by adding support for distribution of BGP peering topology information. As such, the Manageability Considerations section of [RFC7752] applies to these new extensions as well.

Specifically, the malformed Link-State NLRI and BGP-LS Attribute tests for syntactic checks in the Fault Management section of [RFC7752] now apply to the TLVs defined in this document. The semantic or content checking for the TLVs specified in this document and their association with the BGP-LS NLRI types or their associated BGP-LS Attributes is left to the consumer of the BGP-LS information (e.g., an application or a controller) and not the BGP protocol.

A consumer of the BGP-LS information retrieves this information from a BGP Speaker, over a BGP-LS session (refer Section 1 and 2 of [RFC7752]). The handling of semantic or content errors by the consumer would be dictated by the nature of its application usage and hence is beyond the scope of this document. It may be expected that an error detected in the NLRI descriptor TLVs would result in that specific NLRI update being unusable and hence its update to be discarded along with an error log. While an error in Attribute TLVs would result in only that specific attribute being discarded with an error log.

The operator MUST be provided with the options of configuring, enabling, and disabling the advertisement of each of the PeerNode SID, PeerAdj SID, and PeerSet SID as well as control of which information is advertised to which internal or external peer. This is not different from what is required by a BGP speaker in terms of information origination and advertisement.

BGP Peering Segments are associated with the normal BGP routing peering sessions. However, the BGP peering information along with these Peering Segments themselves are advertised via a distinct BGP- LS peering session. It is expected that this isolation as described in [RFC7752] is followed when advertising BGP peering topology information via BGP-LS.

BGP-EPE functionality enables the capability for instantiation of an SR path for traffic engineering a flow via an egress BGP router to a specific peer, bypassing the normal BGP best path routing for that flow and any routing policies implemented in BGP on that egress BGP router. As with any traffic engineering solution, the controller or application implementing the policy needs to ensure that there is no looping or mis-routing of traffic. Traffic counters corresponding to

Previdi, et al. Expires November 17, 2019 [Page 14]

Page 47: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

the MPLS label of the BGP Peering SID on the router would indicate the traffic being forwarded based on the specific EPE path. Monitoring these counters and the flows hitting the corresponding MPLS forwarding entry would help identify issues, if any, with traffic engineering over the EPE paths. Errors in the encoding or decoding of the SR information in the TLVs defined in this document may result in the unavailability of such information to a Centralized EPE Controller or incorrect information being made available to it. This may result in the controller not being able to perform the desired SR based optimization functionality or to perform it in an unexpected or inconsistent manner. The handling of such errors by applications like such a controller may be implementation specific and out of scope of this document.

7. Security Considerations

[RFC7752] defines BGP-LS NLRI to which the extensions defined in this document apply. The Security Considerations section of [RFC7752] also applies to these extensions. The procedures and new TLVs defined in this document, by themselves, do not affect the BGP-LS security model discussed in [RFC7752].

BGP-EPE enables engineering of traffic when leaving the administrative domain via an egress BGP router. Therefore precaution is necessary to ensure that the BGP peering information collected via BGP-LS is limited to specific consumers in a secure manner. Segment Routing operates within a trusted domain [RFC8402] and its security considerations also apply to BGP Peering Segments. The BGP-EPE policies are expected to be used entirely within this trusted SR domain (e.g., between multiple AS/domains within a single provider network).

The isolation of BGP-LS peering sessions is also required to ensure that BGP-LS topology information (including the newly added BGP peering topology) is not advertised to an external BGP peering session outside an administrative domain.

8. Contributors

Mach (Guoyi) Chen Huawei Technologies China

Email: [email protected]

Previdi, et al. Expires November 17, 2019 [Page 15]

Page 48: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

Acee Lindem Cisco Systems Inc. US

Email: [email protected]

9. Acknowledgements

The authors would like to thank Jakob Heitz, Howard Yang, Hannes Gredler, Peter Psenak, Arjun Sreekantiah and Bruno Decraene for their feedback and comments. Susan Hares helped in improving the clarity of the document with her substantial contributions during her shepherd’s review. The authors would also like to thank Alvaro Retana for his extensive review and comments which helped correct issues and improve the document.

10. References

10.1. Normative References

[I-D.ietf-idr-bgp-ls-segment-routing-ext] Previdi, S., Talaulikar, K., Filsfils, C., Gredler, H., and M. Chen, "BGP Link-State extensions for Segment Routing", draft-ietf-idr-bgp-ls-segment-routing-ext-14 (work in progress), May 2019.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.

[RFC5065] Traina, P., McPherson, D., and J. Scudder, "Autonomous System Confederations for BGP", RFC 5065, DOI 10.17487/RFC5065, August 2007, <https://www.rfc-editor.org/info/rfc5065>.

[RFC6286] Chen, E. and J. Yuan, "Autonomous-System-Wide Unique BGP Identifier for BGP-4", RFC 6286, DOI 10.17487/RFC6286, June 2011, <https://www.rfc-editor.org/info/rfc6286>.

[RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and S. Ray, "North-Bound Distribution of Link-State and Traffic Engineering (TE) Information Using BGP", RFC 7752, DOI 10.17487/RFC7752, March 2016, <https://www.rfc-editor.org/info/rfc7752>.

Previdi, et al. Expires November 17, 2019 [Page 16]

Page 49: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.

[RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, July 2018, <https://www.rfc-editor.org/info/rfc8402>.

10.2. Informative References

[I-D.dawra-idr-bgpls-srv6-ext] Dawra, G., Filsfils, C., Talaulikar, K., Chen, M., [email protected], d., Uttaro, J., Decraene, B., and H. Elmalky, "BGP Link State Extensions for SRv6", draft- dawra-idr-bgpls-srv6-ext-06 (work in progress), March 2019.

[I-D.ietf-spring-segment-routing-central-epe] Filsfils, C., Previdi, S., Dawra, G., Aries, E., and D. Afanasiev, "Segment Routing Centralized BGP Egress Peer Engineering", draft-ietf-spring-segment-routing-central- epe-10 (work in progress), December 2017.

[I-D.ietf-spring-segment-routing-policy] Filsfils, C., Sivabalan, S., [email protected], d., [email protected], b., and P. Mattes, "Segment Routing Policy Architecture", draft-ietf-spring-segment-routing- policy-03 (work in progress), May 2019.

Authors’ Addresses

Stefano Previdi Individual

Email: [email protected]

Ketan Talaulikar (editor) Cisco Systems, Inc. India

Email: [email protected]

Previdi, et al. Expires November 17, 2019 [Page 17]

Page 50: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft Segment Routing EPE BGP-LS Extensions May 2019

Clarence Filsfils Cisco Systems, Inc. Brussels Belgium

Email: [email protected]

Keyur Patel Arrcus, Inc.

Email: [email protected]

Saikat Ray Individual Contributor

Email: [email protected]

Jie Dong Huawei Technologies Huawei Campus, No. 156 Beiqing Rd. Beijing 100095 China

Email: [email protected]

Previdi, et al. Expires November 17, 2019 [Page 18]

Page 51: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

IDR W. Hao Q. LiangInternet Draft Huawei Technologies Ltd.Intended status: Standards Track Jim Uttaro AT&T S. Litkowski Orange Business Service S. Zhuang Huawei Technologies Ltd.Expires: November 2015 May 9, 2015

Dissemination of Flow Specification Rules for L2 VPN draft-ietf-idr-flowspec-l2vpn-01.txt

Abstract

This document defines BGP flow-spec extension for Ethernet traffic filtering in L2 VPN network. SAFI=134 in [RFC5575] is redefined for dissemination traffic filtering information in an L2VPN environment. A new subset of component types and extended community also are defined.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

Hao & Liang,et,al Expires November 9, 2015 [Page 1]

Page 52: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft EVPN Flow Spec May 2015

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction ................................................ 2 2. Layer 2 Flow Specification encoding in BGP................... 3 3. Ethernet Flow Specification encoding in BGP.................. 4 4. Ethernet Flow Specification Traffic Actions.................. 6 5. Security Considerations...................................... 8 6. IANA Considerations ......................................... 8 6.1. Normative References................................... 10 6.2. Informative References................................. 10 7. Acknowledgments ............................................ 10

1. Introduction

BGP Flow-spec is an extension to BGP that allows for the dissemination of traffic flow specification rules. It leverages the BGP Control Plane to simplify the distribution of ACLs, new filter rules can be injected to all BGP peers simultaneously without changing router configuration. The typical application of BGP Flow- spec is to automate the distribution of traffic filter lists to routers for DDOS mitigation.

RFC5575 defines a new BGP Network Layer Reachability Information (NLRI) format used to distribute traffic flow specification rules. NLRI (AFI=1, SAFI=133)is for IPv4 unicast filtering. NLRI (AFI=1, SAFI=134)is for BGP/MPLS VPN filtering. The Flow specification match part only includes L3/L4 information like source/destination prefix, protocol, ports, and etc, so traffic flows can only be selectively filtered based on L3/L4 information.

Layer 2 Virtual Private Networks L2VPNs have already been deployed in an increasing number of networks today. In L2VPN network, we also have requirement to deploy BGP Flow-spec to mitigate DDoS attack traffic. Within L2VPN network, both IP and non-IP Ethernet traffic

Hao & Liang,et,al Expires November 9, 2015 [Page 2]

Page 53: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft EVPN Flow Spec May 2015

maybe exist. For IP traffic filtering, the Flow specification rules defined in [RFC5575] which include match criteria and actions can still be used, flow specification rules received via new NLRI format apply only to traffic that belongs to the VPN instance(s) in which it is imported. For non-IP Ethernet traffic filtering, Layer 2 related information like source/destination MAC and VLAN should be considered. But the flow specification match criteria defined in RFC5575 only include layer 3 and layer 4 IP information, layer 2 Ethernet information haven’t been included.

There are different kinds of L2VPN networks like EVPN [EVPN], BGP VPLS [RFC4761], LDP VPLS [RFC4762] and border gateway protocol (BGP) auto discovery [RFC 6074]. Because the flow-spec feature relies on BGP protocol to distribute traffic filtering rules, so it can only be incrementally deployed in those L2VPN networks where BGP is used for auto discovery and/or signaling purposes such as BGP-based VPLS [4761], EVPN and LDP-based VPLS [4762] with BGP auto-discovery [6074].

This draft proposes a new subset of component types and extended community to support L2VPN flow-spec application. SAFI=134 in [RFC5575] is redefined for dissemination traffic filtering information in an L2VPN environment.

2. Layer 2 Flow Specification encoding in BGP

The [RFC5575] defines SAFI 133 and SAFI 134 for ’’dissemination of IPv4 flow specification rules’’ and ’’dissemination of VPNv4 flow specification rules’’ respectively. [draft-ietf-idr-flow-spec-v6-06] redefines the [RFC5575] SAFIs in order to make them applicable to both IPv4 and IPv6 applications. This document will further redefine the SAFI 134 in order to make them applicable to L2VPN applications.

The following changes are defined:

’’SAFI 134 for dissemination of L3VPN flow specification rules’’ to now be defined as ’’SAFI 134 for dissemination of VPN flow specification rules’’

For SAFI 134 the indication to which address family it is referring to will be recognized by AFI value (AFI=1 for VPNv4, AFI=2 VPNv6 and AFI=25 for L2VPN). Such modification is fully backwards compatible with existing implementation and production deployments.

Hao & Liang,et,al Expires November 9, 2015 [Page 3]

Page 54: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft EVPN Flow Spec May 2015

3. Ethernet Flow Specification encoding in BGP

The NLRI format for this address family consists of a fixed-length Route Distinguisher field (8 bytes) followed by a flow specification, following the encoding defined in this document. The NLRI length field shall include both the 8 bytes of the Route Distinguisher as well as the subsequent flow specification.

Flow specification rules received via this NLRI apply only to traffic that belongs to the VPN instance(s) in which it is imported. Flow rules are accepted by default, when received from remote PE routers.

Besides the component types defined in [RFC5575] and [draft-ietf- idr-flow-spec-v6-06], this document proposes the following additional component types for L2VPN Ethernet traffic filtering:

Type 14 - Source MAC

Encoding: <type (1 octet), [op, value]+>

Defines a list of {operation(op), value} pairs used to match source MAC. Values are encoded as 6-byte quantities. The operation field is encoded as ’’Numeric operator’’ defined in [RFC5575].

Type 15 - Destination MAC

Encoding: <type (1 octet), [op, value]+>

Defines a list of {operation, value} pairs used to match destination MAC. Values are encoded as 6-byte quantities.

Type 16 - Ethernet Type

Encoding: <type (1 octet), [op, value]+>

Defines a list of {operation, value} pairs used to match two-octet field. Values are encoded as 2-byte quantities.

Ethernet II framing defines the two-octet EtherType field in an Ethernet frame, preceded by destination and source MAC addresses, that identifies an upper layer protocol encapsulating the frame data.

Type 17 - DSAP(Destination Service Access Point) in LLC

Encoding: <type (1 octet), [op, value]+>

Hao & Liang,et,al Expires November 9, 2015 [Page 4]

Page 55: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft EVPN Flow Spec May 2015

Defines a list of {operation, value} pairs used to match 1-octet DSAP in the 802.2 LLC(Logical Link Control Header). Values are encoded as 1-byte quantities.

Type 18 - SSAP(Source Service Access Point) in LLC

Encoding: <type (1 octet), [op, value]+>

Defines a list of {operation, value} pairs used to match 1-octet SSAP in the 802.2 LLC. Values are encoded as 1-byte quantities.

Type 19 - Control field in LLC

Encoding: <type (1 octet), [op, value]+>

Defines a list of {operation, value} pairs used to match 1-octet control field in the 802.2 LLC. Values are encoded as 1-byte quantities.

Type 20 - SNAP

Encoding: <type (1 octet), [op, value]+>

Defines a list of {operation, value} pairs used to match 5-octet SNAP(Sub-Network Access Protocol) field. Values are encoded as 5- byte quantities.

Type 21 - VLAN ID

Encoding: <type (1 octet), [op, value]+>

Defines a list of {operation, value} pairs used to match VLAN ID. Values are encoded as 1- or 2-byte quantities.

In virtual local-area network (VLAN) stacking case, the VLAN ID is outer VLAN ID.

Type 22 - VLAN COS

Encoding: <type (1 octet), [op, value]+>

Defines a list of {operation, value} pairs used to match 3-bit VLAN COS fields [802.1p]. Values are encoded using a single byte, where the five most significant bits are zero and the three least significant bits contain the VLAN COS value.

Hao & Liang,et,al Expires November 9, 2015 [Page 5]

Page 56: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft EVPN Flow Spec May 2015

In virtual local-area network (VLAN) stacking case, the VLAN COS is outer VLAN COS.

Type 23 - Inner VLAN ID

Encoding: <type (1 octet), [op, value]+>

Defines a list of {operation, value} pairs used to match inner VLAN ID using for virtual local-area network (VLAN) stacking or Q in Q case. Values are encoded as 1- or 2-byte quantities.

In single VLAN case, the component type MUST not be used.

Type 24 - Inner VLAN COS

Encoding: <type (1 octet), [op, value]+>

Defines a list of {operation, value} pairs used to match 3-bit inner VLAN COS fields [802.1p] using for virtual local-area network (VLAN) stacking or Q in Q case. Values are encoded using a single byte, where the five most significant bits are zero and the three least significant bits contain the VLAN COS value.

In single VLAN case, the component type MUST not be used.

4. Ethernet Flow Specification Traffic Actions

+--------+--------------------+--------------------------+ | type | extended community | encoding | +--------+--------------------+--------------------------+ | 0x8006 | traffic-rate | 2-byte as#, 4-byte float | | 0x8007 | traffic-action | bitmask | | 0x8008 | redirect | 6-byte Route Target | | 0x8009 | traffic-marking | DSCP value | +--------+--------------------+--------------------------+ Besides to support the above extended communities per RFC5575, this document also proposes the following BGP extended communities specifications for Ethernet flow to extend [RFC5575]:

Hao & Liang,et,al Expires November 9, 2015 [Page 6]

Page 57: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft EVPN Flow Spec May 2015

+--------+------------------------+--------------------------+ | type | extended community | encoding | +--------+------------------------+--------------------------+ | 0x800A | VLAN-action | bitmask | | 0x800B | TPID-action | bitmask | +--------+------------------------+--------------------------+

VLAN-action: The VLAN-action extended community consists of 6 bytes which include the fields of action Flags, and two VLAN ID/COS value. The action Flags field includes PO, PU, SW, RI, RO, CI and CO Flag to indicate the action type. The two VLAN ID/COS value are carried in COS1/VLAN ID1 and COS2/VLAN ID2 field respectively.

0 15 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |PO|PU|SW|RI|RO|CI|CO| Resv | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | COS1 |R | VLAN ID1 | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | COS2 |R | VLAN ID2 | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

PO: Pop action. It indicates the outmost VLAN should be removed.

PU: Push action. It indicates a new VLAN should be added as outmost VLAN, the new VLAN is VLAN ID1.

SW: Swap action. It indicates the outer VLAN and inner VLAN should be swapped.

RI: Rewrite inner VLAN action. It indicates the inner VLAN should be replaced by a new VLAN, the new VLAN is VLAN ID1.

RO: Rewrite outer VLAN action. It indicates the outer VLAN should be replaced by a new VLAN, the new VLAN is VLAN ID2.

CI: Mapping inner COS action. It indicates the inner COS should be replaced by a new COS, the new COS is COS1.

CO: Mapping outer COS action. It indicates the outer COS should be replaced by a new COS, the new COS is COS2.

Resv: Reserved for future use.

COS1: 3 bits. COS value.

Hao & Liang,et,al Expires November 9, 2015 [Page 7]

Page 58: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft EVPN Flow Spec May 2015

COS2: 3 bits. COS value.

VLAN ID1: 12 bits. VLAN ID value.

VLAN ID2: 12 bits. VLAN ID value.

R: Reserved for future use.

TPID-action: The TPID-action extended community consists of 6 bytes which include the fields of action Flags, TPID1 and TPID2.

0 15 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |TI|TO| Resv | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | TP ID1 | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | TP ID2 | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

TI: Mapping inner TP ID action. It indicates the inner TP ID should be replaced by a new TP ID, the new TP ID is TP ID1.

TO: Mapping outer TP ID action. It indicates the outer TP ID should be replaced by a new TP ID, the new TP ID is TP ID2.

Resv: Reserved for future use.

In some cases, some more complicated actions are needed, such as SwapPop, PushSwap, etc. These actions can be represented using two catenated VLAN-actions which are carried in two VLAN-action extended communities. For example, Swap and Pop action can be represented by catenation of Swap action 1 and Pop action 2.

5. Security Considerations

No new security issues are introduced to the BGP protocol by this specification.

6. IANA Considerations

IANA is requested to rename currently defined SAFI 134 per [RFC5575] to read:

Hao & Liang,et,al Expires November 9, 2015 [Page 8]

Page 59: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft EVPN Flow Spec May 2015

134 VPN dissemination of flow specification rules

IANA is requested to create and maintain a new registry for "Flow spec L2VPN Component Types". For completeness, the types defined in [RFC5575] and [draft-ietf-idr-flow-spec-v6-06] also are listed here.

+--------+-------------------------------+--------------------------+ | type | RFC or Draft | discription | +--------+-------------------------------+--------------------------+ | 1 |RFC5575 | Destination Prefix | | 1 |draft-ietf-idr-flow-spec-v6-06 | Destination IPv6 Prefix | | 2 |RFC5575 | Source Prefix | | 2 |draft-ietf-idr-flow-spec-v6-06 | Source IPv6 Prefix | | 3 |RFC5575 | IP Protocol | | 3 |draft-ietf-idr-flow-spec-v6-06 | Next Header | | 4 |RFC5575 | Port | | 5 |RFC5575 | Destination port | | 6 |RFC5575 | Source port | | 7 |RFC5575 | ICMP type | | 8 |RFC5575 | ICMP code | | 9 |RFC5575 | TCP flags | | 10 |RFC5575 | Packet length | | 11 |RFC5575 | DSCP | | 12 |RFC5575 | Fragment | | 13 |draft-ietf-idr-flow-spec-v6-06 | Flow Label | | 14 |This draft | Source MAC | | 15 |This draft | Destination MAC | | 16 |This draft | Ethernet Type | | 17 |This draft | DSAP in LLC | | 18 |This draft | SSAP in LLC | | 19 |This draft | Control field in LLC | | 20 |This draft | SNAP | | 21 |This draft | VLAN ID | | 22 |This draft | VLAN COS | | 23 |This draft | Inner VLAN ID | | 24 |This draft | Inner VLAN COS | +--------+-------------------------------+--------------------------+ IANA is requested to update the reference for the following assignment in the "BGP Extended Communities Type - extended, transitive" registry:

Type value Name Reference

---------- ---------------------------------------- ---------

0x080A Flow spec VLAN action [this document]

Hao & Liang,et,al Expires November 9, 2015 [Page 9]

Page 60: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft EVPN Flow Spec May 2015

0x080B Flow spec TPID action [this document]

6.1. Normative References

[1] [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate

Requirement Levels", BCP 14, RFC 2119, March 1997.

[2] [RFC5575] P. Marques, N. Sheth, R. Raszuk, B. Greene, J.Mauch, D. McPherson, "Dissemination of Flow Specification Rules", RFC 5575, August 2009.

[3] [RFC4761] K. Kompella, Ed., Y. Rekhter, Ed., "Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC4761, January 2007.

[4] [RFC4762] M. Lasserre, Ed., V. Kompella, Ed., "Virtual Private LAN Service (VPLS) Using Label Distribution Protocol (LDP) Signaling", RFC4762, January 2007.

[5] [RFC6074] E. Rosen, B. Davie, V. Radoaca, "Provisioning, Auto- Discovery, and Signaling in Layer 2 Virtual Private Networks (L2VPNs)", RFC6074, January 2011.

6.2. Informative References

[1] [EVPN] Sajassi et al., "BGP MPLS Based Ethernet VPN", draft- ietf-l2vpn-evpn-07.txt, work in progress, May, 2014.

[2] [IEEE 802.1p] Javin, et.al. "IEEE 802.1p: LAN Layer 2 QoS/CoS Protocol for Traffic Prioritization", 2012-02-15

7. Acknowledgments

The authors wish to acknowledge the important contributions of Hannes Gredler, Xiaohu Xu and Lucy Yong.

Hao & Liang,et,al Expires November 9, 2015 [Page 10]

Page 61: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft EVPN Flow Spec May 2015

Authors’ Addresses

Weiguo Hao Huawei Technologies 101 Software Avenue, Nanjing 210012 China Email: [email protected]

Qiandeng Liang Huawei Technologies 101 Software Avenue, Nanjing 210012 China Email: [email protected]

Shunwan Zhuang Huawei Technologies Huawei Bld., No.156 Beiqing Rd. Beijing 100095 China Email: [email protected]

James Uttaro AT&T EMail: [email protected]

Stephane Litkowski Orange [email protected]

Hao & Liang,et,al Expires November 9, 2015 [Page 11]

Page 62: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Network Working Group S. PrevidiInternet-DraftIntended status: Standards Track K. Talaulikar, Ed.Expires: April 16, 2020 Cisco Systems, Inc. J. Dong, Ed. M. Chen Huawei Technologies H. Gredler RtBrick Inc. J. Tantsura Apstra October 14, 2019

Distribution of Traffic Engineering (TE) Policies and State using BGP-LS draft-ietf-idr-te-lsp-distribution-12

Abstract

This document describes a mechanism to collect the Traffic Engineering and Policy information that is locally available in a node and advertise it into BGP Link State (BGP-LS) updates. Such information can be used by external components for path computation, re-optimization, service placement, network visualization, etc.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

Previdi, et al. Expires April 16, 2020 [Page 1]

Page 63: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

This Internet-Draft will expire on April 16, 2020.

Copyright Notice

Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Carrying TE Policy Information in BGP . . . . . . . . . . . . 5 3. TE Policy NLRI . . . . . . . . . . . . . . . . . . . . . . . 6 4. TE Policy Descriptors . . . . . . . . . . . . . . . . . . . . 7 4.1. Tunnel Identifier (Tunnel ID) . . . . . . . . . . . . . . 8 4.2. LSP Identifier (LSP ID) . . . . . . . . . . . . . . . . . 8 4.3. IPv4/IPv6 Tunnel Head-End Address . . . . . . . . . . . . 9 4.4. IPv4/IPv6 Tunnel Tail-End Address . . . . . . . . . . . . 9 4.5. SR Policy Candidate Path Descriptor . . . . . . . . . . . 10 4.6. Local MPLS Cross Connect . . . . . . . . . . . . . . . . 11 4.6.1. MPLS Cross Connect Interface . . . . . . . . . . . . 13 4.6.2. MPLS Cross Connect FEC . . . . . . . . . . . . . . . 14 5. MPLS-TE Policy State TLV . . . . . . . . . . . . . . . . . . 15 5.1. RSVP Objects . . . . . . . . . . . . . . . . . . . . . . 16 5.2. PCEP Objects . . . . . . . . . . . . . . . . . . . . . . 17 6. SR Policy State TLVs . . . . . . . . . . . . . . . . . . . . 18 6.1. SR Binding SID . . . . . . . . . . . . . . . . . . . . . 18 6.2. SR Candidate Path State . . . . . . . . . . . . . . . . . 20 6.3. SR Candidate Path Name . . . . . . . . . . . . . . . . . 22 6.4. SR Candidate Path Constraints . . . . . . . . . . . . . . 22 6.4.1. SR Affinity Constraint . . . . . . . . . . . . . . . 24 6.4.2. SR SRLG Constraint . . . . . . . . . . . . . . . . . 25 6.4.3. SR Bandwidth Constraint . . . . . . . . . . . . . . . 26 6.4.4. SR Disjoint Group Constraint . . . . . . . . . . . . 26 6.5. SR Segment List . . . . . . . . . . . . . . . . . . . . . 28 6.6. SR Segment . . . . . . . . . . . . . . . . . . . . . . . 31 6.6.1. Segment Descriptors . . . . . . . . . . . . . . . . . 32 6.7. SR Segment List Metric . . . . . . . . . . . . . . . . . 39 7. Procedures . . . . . . . . . . . . . . . . . . . . . . . . . 41

Previdi, et al. Expires April 16, 2020 [Page 2]

Page 64: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

8. Manageability Considerations . . . . . . . . . . . . . . . . 41 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 42 9.1. BGP-LS NLRI-Types . . . . . . . . . . . . . . . . . . . . 42 9.2. BGP-LS Protocol-IDs . . . . . . . . . . . . . . . . . . . 42 9.3. BGP-LS TLVs . . . . . . . . . . . . . . . . . . . . . . . 42 9.4. BGP-LS SR Policy Protocol Origin . . . . . . . . . . . . 43 9.5. BGP-LS TE State Object Origin . . . . . . . . . . . . . . 44 9.6. BGP-LS TE State Address Family . . . . . . . . . . . . . 44 9.7. BGP-LS SR Segment Descriptors . . . . . . . . . . . . . . 44 9.8. BGP-LS Metric Type . . . . . . . . . . . . . . . . . . . 45 10. Security Considerations . . . . . . . . . . . . . . . . . . . 45 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 46 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 46 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 46 13.1. Normative References . . . . . . . . . . . . . . . . . . 46 13.2. Informative References . . . . . . . . . . . . . . . . . 48 Authors’ Addresses . . . . . . . . . . . . . . . . . . . . . . . 49

1. Introduction

In many network environments, traffic engineering (TE) policies are instantiated into various forms:

o MPLS Traffic Engineering Label Switched Paths (TE-LSPs).

o IP based tunnels (IP in IP, GRE, etc).

o Segment Routing (SR) Policies as defined in [I-D.ietf-spring-segment-routing-policy]

o Local MPLS cross-connect configuration

All this information can be grouped into the same term: TE Policies. In the rest of this document we refer to TE Policies as the set of information related to the various instantiation of polices: MPLS TE LSPs, IP tunnels (IPv4 or IPv6), SR Policies, etc.

TE Polices are generally instantiated at the head-end and are based on either local configuration or controller based programming of the node using various APIs and protocols, e.g., PCEP or BGP.

In many network environments, the configuration and state of each TE Policy that is available in the network is required by a controller which allows the network operator to optimize several functions and operations through the use of a controller aware of both topology and state information.

Previdi, et al. Expires April 16, 2020 [Page 3]

Page 65: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

One example of a controller is the stateful Path Computation Element (PCE) [RFC8231], which could provide benefits in path reoptimization. While some extensions are proposed in Path Computation Element Communication Protocol (PCEP) for the Path Computation Clients (PCCs) to report the LSP states to the PCE, this mechanism may not be applicable in a management-based PCE architecture as specified in section 5.5 of [RFC4655]. As illustrated in the figure below, the PCC is not an LSR in the routing domain, thus the head-end nodes of the TE-LSPs may not implement the PCEP protocol. In this case a general mechanism to collect the TE-LSP states from the ingress LERs is needed. This document proposes an TE Policy state collection mechanism complementary to the mechanism defined in [RFC8231].

----------- | ----- | Service | | TED |<-+-----------> Request | ----- | TED synchronization | | | | mechanism (for example, v | | | routing protocol) ------------- Request/ | v | | | Response| ----- | | NMS |<--------+> | PCE | | | | | ----- | ------------- ----------- Service | Request | v ---------- Signaling ---------- | Head-End | Protocol | Adjacent | | Node |<---------->| Node | ---------- ----------

Figure 1. Management-Based PCE Usage

In networks with composite PCE nodes as specified in section 5.1 of [RFC4655], PCE is implemented on several routers in the network, and the PCCs in the network can use the mechanism described in [RFC8231] to report the TE Policy information to the PCE nodes. An external component may also need to collect the TE Policy information from all the PCEs in the network to obtain a global view of the LSP state in the network.

In multi-area or multi-AS scenarios, each area or AS can have a child PCE to collect the TE Policies in its own domain, in addition, a parent PCE needs to collect TE Policy information from multiple child PCEs to obtain a global view of LSPs inside and across the domains involved.

Previdi, et al. Expires April 16, 2020 [Page 4]

Page 66: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

In another network scenario, a centralized controller is used for service placement. Obtaining the TE Policy state information is quite important for making appropriate service placement decisions with the purpose to both meet the application’s requirements and utilize network resources efficiently.

The Network Management System (NMS) may need to provide global visibility of the TE Policies in the network as part of the network visualization function.

BGP has been extended to distribute link-state and traffic engineering information to external components [RFC7752]. Using the same protocol to collect Traffic Engineering Policy and state information is desirable for these external components since this avoids introducing multiple protocols for network information collection. This document describes a mechanism to distribute traffic engineering policy information (MPLS, SR, IPv4 and IPv6) to external components using BGP-LS.

2. Carrying TE Policy Information in BGP

TE Policy information is advertised in BGP UPDATE messages using the MP_REACH_NLRI and MP_UNREACH_NLRI attributes [RFC4760]. The "Link- State NLRI" defined in [RFC7752] is extended to carry the TE Policy information. BGP speakers that wish to exchange TE Policy information MUST use the BGP Multiprotocol Extensions Capability Code (1) to advertise the corresponding (AFI, SAFI) pair, as specified in [RFC4760]. New TLVs carried in the Link_State Attribute defined in [RFC7752] are also defined in order to carry the attributes of a TE Policy in the subsequent sections.

The format of "Link-State NLRI" is defined in [RFC7752] as follows:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | NLRI Type | Total NLRI Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | // Link-State NLRI (variable) // | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

A new "NLRI Type" is defined for TE Policy Information as following:

o NLRI Type: TE Policy NLRI value 5.

The format of this new NLRI type is defined in Section 3 below.

Previdi, et al. Expires April 16, 2020 [Page 5]

Page 67: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

3. TE Policy NLRI

This document defines the new TE Policy NLRI-Type and its format as shown in the following figure:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+ | Protocol-ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Identifier | | (64 bits) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // Headend (Node Descriptors) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // TE Policy Descriptors (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Protocol-ID field specifies the component that owns the TE Policy state in the advertising node. The following new Protocol-IDs are defined and apply to the TE Policy NLRI:

+-------------+----------------------------------+ | Protocol-ID | NLRI information source protocol | +-------------+----------------------------------+ | 8 | RSVP-TE | | 9 | Segment Routing | +-------------+----------------------------------+

o "Identifier" is an 8 octet value as defined in [RFC7752].

o "Headend" consists of a Local Node Descriptor (TLV 256) as defined in [RFC7752].

o "TE Policy Descriptors" consists of one or more of the TLVs listed as below:

Previdi, et al. Expires April 16, 2020 [Page 6]

Page 68: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

+-----------+----------------------------------+ | Codepoint | Descriptor TLVs | +-----------+----------------------------------+ | 550 | Tunnel ID | | 551 | LSP ID | | 552 | IPv4/6 Tunnel Head-end address | | 553 | IPv4/6 Tunnel Tail-end address | | 554 | SR Policy Candidate Path | | 555 | Local MPLS Cross Connect | +-----------+----------------------------------+

The Local Node Descriptor TLV MUST include the following Node Descriptor TLVs:

o BGP Router-ID (TLV 516) [I-D.ietf-idr-bgpls-segment-routing-epe], which contains a valid BGP Identifier of the local node.

o Autonomous System Number (TLV 512) [RFC7752], which contains the ASN or AS Confederation Identifier (ASN) [RFC5065], if confederations are used, of the local node.

The Local Node Descriptor TLV SHOULD include the following Node Descriptor TLVs:

o IPv4 Router-ID of Local Node (TLV 1028) [RFC7752], which contains the IPv4 TE Router-ID of the local node when one is provisioned.

o IPv6 Router-ID of Local Node (TLV 1029) [RFC7752], which contains the IPv6 TE Router-ID of the local node when one is provisioned.

The Local Node Descriptor TLV MAY include the following Node Descriptor TLVs:

o Member-ASN (TLV 517) [I-D.ietf-idr-bgpls-segment-routing-epe], which contains the ASN of the confederation member (i.e. Member- AS Number), if BGP confederations are used, of the local node.

o Node Descriptors as defined in [RFC7752].

4. TE Policy Descriptors

This sections defines the TE Policy Descriptors TLVs which are used to describe the TE Policy being advertised by using the new BGP-LS TE Policy NLRI type defined in Section 3.

Previdi, et al. Expires April 16, 2020 [Page 7]

Page 69: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

4.1. Tunnel Identifier (Tunnel ID)

The Tunnel Identifier TLV contains the Tunnel ID defined in [RFC3209] and is used for RSVP-TE protocol TE Policies. It has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Tunnel ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 550

o Length: 2 octets.

o Tunnel ID: 2 octets as defined in [RFC3209].

4.2. LSP Identifier (LSP ID)

The LSP Identifier TLV contains the LSP ID defined in [RFC3209] and is used for RSVP-TE protocol TE Policies. It has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | LSP ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 551

o Length: 2 octets.

o LSP ID: 2 octets as defined in [RFC3209].

Previdi, et al. Expires April 16, 2020 [Page 8]

Page 70: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

4.3. IPv4/IPv6 Tunnel Head-End Address

The IPv4/IPv6 Tunnel Head-End Address TLV contains the Tunnel Head- End Address defined in [RFC3209] and is used for RSVP-TE protocol TE Policies. It has following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // IPv4/IPv6 Tunnel Head-End Address (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 552

o Length: 4 or 16 octets.

When the IPv4/IPv6 Tunnel Head-end Address TLV contains an IPv4 address, its length is 4 (octets).

When the IPv4/IPv6 Tunnel Head-end Address TLV contains an IPv6 address, its length is 16 (octets).

4.4. IPv4/IPv6 Tunnel Tail-End Address

The IPv4/IPv6 Tunnel Tail-End Address TLV contains the Tunnel Tail- End Address defined in [RFC3209] and is used for RSVP-TE protocol TE Policies. It has following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // IPv4/IPv6 Tunnel Tail-End Address (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 553

o Length: 4 or 16 octets.

When the IPv4/IPv6 Tunnel Tail-end Address TLV contains an IPv4 address, its length is 4 (octets).

Previdi, et al. Expires April 16, 2020 [Page 9]

Page 71: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

When the IPv4/IPv6 Tunnel Tail-end Address TLV contains an IPv6 address, its length is 16 (octets).

4.5. SR Policy Candidate Path Descriptor

The SR Policy Candidate Path Descriptor TLV identifies a Segment Routing Policy candidate path (CP) as defined in [I-D.ietf-spring-segment-routing-policy] and has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Protocol-origin| Flags | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Endpoint (4 or 16 octets) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Policy Color (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Originator AS Number (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Originator Address (4 or 16 octets) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Discriminator (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 554

o Length: variable (valid values are 24, 36 or 48 octets)

o Protocol-Origin : 1 octet field which identifies the protocol or component which is responsible for the instantiation of this path. Following protocol-origin codepoints are defined in this document.

+------------+---------------------------------------------------------+| Code Point | Protocol Origin |+------------+---------------------------------------------------------+| 1 | PCEP || 2 | BGP SR Policy || 3 | Local (via CLI, Yang model through NETCONF, gRPC, etc.) |+------------+---------------------------------------------------------+

Previdi, et al. Expires April 16, 2020 [Page 10]

Page 72: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

o Flags: 1 octet field with following bit positions defined. Other bits SHOULD be cleared by originator and MUST be ignored by receiver.

0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |E|O| | +-+-+-+-+-+-+-+-+

where:

* E-Flag : Indicates the encoding of endpoint as IPv6 address when set and IPv4 address when clear

* O-Flag : Indicates the encoding of originator address as IPv6 address when set and IPv4 address when clear

o Reserved : 2 octets which SHOULD be set to 0 by originator and MUST be ignored by receiver.

o Endpoint : 4 or 16 octets (as indicated by the flags) containing the address of the endpoint of the SR Policy

o Color : 4 octets that indicates the color of the SR Policy

o Originator ASN : 4 octets to carry the 4 byte encoding of the ASN of the originator. Refer [I-D.ietf-spring-segment-routing-policy] Sec 2.4 for details.

o Originator Address : 4 or 16 octets (as indicated by the flags) to carry the address of the originator. Refer [I-D.ietf-spring-segment-routing-policy] Sec 2.4 for details.

o Discriminator : 4 octets to carry the discrimator of the path. Refer [I-D.ietf-spring-segment-routing-policy] Sec 2.5 for details.

4.6. Local MPLS Cross Connect

The Local MPLS Cross Connect TLV identifies a local MPLS state in the form of incoming label and interface followed by an outgoing label and interface. Outgoing interface may appear multiple times (for multicast states). It is used with Protocol ID set to "Static Configuration" value 5 as defined in [RFC7752].

The Local MPLS Cross Connect TLV has the following format:

Previdi, et al. Expires April 16, 2020 [Page 11]

Page 73: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Incoming label (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Outgoing label (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // Sub-TLVs (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 555

o Length: variable.

o Incoming and Outgoing labels: 4 octets each.

o Sub-TLVs: following Sub-TLVs are defined:

* Interface Sub-TLV

* Forwarding Equivalent Class (FEC)

The Local MPLS Cross Connect TLV:

MUST have an incoming label.

MUST have an outgoing label.

MAY contain an Interface Sub-TLV having the I-flag set.

MUST contain at least one Interface Sub-TLV having the I-flag unset.

MAY contain multiple Interface Sub-TLV having the I-flag unset. This is the case of a multicast MPLS cross connect.

MAY contain a FEC Sub-TLV.

The following sub-TLVs are defined for the Local MPLS Cross Connect TLV:

Previdi, et al. Expires April 16, 2020 [Page 12]

Page 74: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

+-----------+----------------------------------+ | Codepoint | Descriptor TLV | +-----------+----------------------------------+ | 556 | MPLS Cross Connect Interface | | 557 | MPLS Cross Connect FEC | +-----------+----------------------------------+

These are defined in the following sub-sections.

4.6.1. MPLS Cross Connect Interface

The MPLS Cross Connect Interface sub-TLV is optional and contains the identifier of the interface (incoming or outgoing) in the form of an IPv4 address or an IPv6 address.

The MPLS Cross Connect Interface sub-TLV has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

+-+-+-+-+-+-+-+-+ | Flags | +-+-+-+-+-+-+-+-+

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Local Interface Identifier (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // Interface Address (4 or 16 octets) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 556

o Length: 9 or 21.

o Flags: 1 octet of flags defined as follows:

0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |I| | +-+-+-+-+-+-+-+-+

where:

Previdi, et al. Expires April 16, 2020 [Page 13]

Page 75: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

* I-Flag is the Interface flag. When set, the Interface Sub-TLV describes an incoming interface. If the I-flag is not set, then the Interface Sub-TLV describes an outgoing interface.

o Local Interface Identifier: a 4 octet identifier.

o Interface address: a 4 octet IPv4 address or a 16 octet IPv6 address.

4.6.2. MPLS Cross Connect FEC

The MPLS Cross Connect FEC sub-TLV is optional and contains the FEC associated to the incoming label.

The MPLS Cross Connect FEC sub-TLV has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Flags | Masklength | Prefix (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // Prefix (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 557

o Length: variable.

o Flags: 1 octet of flags defined as follows:

0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |4| | +-+-+-+-+-+-+-+-+

where:

* 4-Flag is the IPv4 flag. When set, the FEC Sub-TLV describes an IPv4 FEC. If the 4-flag is not set, then the FEC Sub-TLV describes an IPv6 FEC.

o Mask Length: 1 octet of prefix length.

Previdi, et al. Expires April 16, 2020 [Page 14]

Page 76: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

o Prefix: an IPv4 or IPv6 prefix whose mask length is given by the " Mask Length" field padded to an octet boundary.

5. MPLS-TE Policy State TLV

A new TLV called "MPLS-TE Policy State TLV", is used to describe the characteristics of the MPLS-TE Policy and it is carried in the optional non-transitive BGP Attribute "LINK_STATE Attribute" defined in [RFC7752]. These MPLS-TE Policy characteristics include the characteristics and attributes of the policy, its dataplane, explicit path, Quality of Service (QoS) parameters, route information, the protection mechanisms, etc.

The MPLS-TE Policy State TLV has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Object-origin | Address Family| RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // MPLS-TE Policy State Objects (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

MPLS-TE Policy State TLV

o Type: 1200

o Length: the total length of the MPLS-TE Policy State TLV not including Type and Length fields.

o Object-origin: identifies the component (or protocol) from which the contained object originated. This allows for objects defined in different components to be collected while avoiding the possible codepoint collisions among these components. Following object-origin codepoints are defined in this document.

Previdi, et al. Expires April 16, 2020 [Page 15]

Page 77: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

+----------+------------------+ | Code | Object | | Point | Origin | +----------+------------------+ | 1 | RSVP-TE | | 2 | PCEP | | 3 | Local/Static | +----------+------------------+

o Address Family: describes the address family used to setup the MPLS-TE policy. The following address family values are defined in this document:

+----------+------------------+ | Code | Dataplane | | Point | | +----------+------------------+ | 1 | MPLS-IPv4 | | 2 | MPLS-IPv6 | +----------+------------------+

o RESERVED: 16-bit field. SHOULD be set to 0 on transmission and MUST be ignored on receipt.

o TE Policy State Objects: Rather than replicating all these objects in this document, the semantics and encodings of the objects as defined in RSVP-TE and PCEP are reused.

The state information is carried in the "MPLS-TE Policy State Objects" with the following format as described in the sub-sections below.

5.1. RSVP Objects

RSVP-TE objects are encoded in the "MPLS-TE Policy State Objects" field of the MPLS-TE Policy State TLV and consists of MPLS TE LSP objects defined in RSVP-TE [RFC3209] [RFC3473]. Rather than replicating all MPLS TE LSP related objects in this document, the semantics and encodings of the MPLS TE LSP objects are re-used. These MPLS TE LSP objects are carried in the MPLS-TE Policy State TLV.

When carrying RSVP-TE objects, the "Object-Origin" field is set to "RSVP-TE".

The following RSVP-TE Objects are defined:

o SENDER_TSPEC and FLOW_SPEC [RFC2205]

Previdi, et al. Expires April 16, 2020 [Page 16]

Page 78: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

o SESSION_ATTRIBUTE [RFC3209]

o EXPLICIT_ROUTE Object (ERO) [RFC3209]

o ROUTE_RECORD Object (RRO) [RFC3209]

o FAST_REROUTE Object [RFC4090]

o DETOUR Object [RFC4090]

o EXCLUDE_ROUTE Object (XRO) [RFC4874]

o SECONDARY_EXPLICIT_ROUTE Object (SERO) [RFC4873]

o SECONDARY_RECORD_ROUTE (SRRO) [RFC4873]

o LSP_ATTRIBUTES Object [RFC5420]

o LSP_REQUIRED_ATTRIBUTES Object [RFC5420]

o PROTECTION Object [RFC3473][RFC4872][RFC4873]

o ASSOCIATION Object [RFC4872]

o PRIMARY_PATH_ROUTE Object [RFC4872]

o ADMIN_STATUS Object [RFC3473]

o LABEL_REQUEST Object [RFC3209][RFC3473]

For the MPLS TE LSP Objects listed above, the corresponding sub- objects are also applicable to this mechanism. Note that this list is not exhaustive, other MPLS TE LSP objects which reflect specific characteristics of the MPLS TE LSP can also be carried in the LSP state TLV.

5.2. PCEP Objects

PCEP objects are encoded in the "MPLS-TE Policy State Objects" field of the MPLS-TE Policy State TLV and consists of PCEP objects defined in [RFC5440]. Rather than replicating all MPLS TE LSP related objects in this document, the semantics and encodings of the MPLS TE LSP objects are re-used. These MPLS TE LSP objects are carried in the MPLS-TE Policy State TLV.

When carrying PCEP objects, the "Object-Origin" field is set to "PCEP".

Previdi, et al. Expires April 16, 2020 [Page 17]

Page 79: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

The following PCEP Objects are defined:

o METRIC Object [RFC5440]

o BANDWIDTH Object [RFC5440]

For the MPLS TE LSP Objects listed above, the corresponding sub- objects are also applicable to this mechanism. Note that this list is not exhaustive, other MPLS TE LSP objects which reflect specific characteristics of the MPLS TE LSP can also be carried in the TE Policy State TLV.

6. SR Policy State TLVs

Segment Routing Policy (SR Policy) architecture is specified in [I-D.ietf-spring-segment-routing-policy]. A SR Policy can comprise of one or more candidate paths (CP) of which at a given time one and only one may be active (i.e. installed in forwarding and usable for steering of traffic). Each CP in turn may have one or more SID-List of which one or more may be active; when multiple are active then traffic is load balanced over them.

This section defines the various TLVs which enable the headend to report the state of an SR Policy, its CP(s), SID-List(s) and their status. These TLVs are carried in the optional non-transitive BGP Attribute "LINK_STATE Attribute" defined in [RFC7752] and enable the same consistent form of reporting for SR Policy state irrespective of the Protocol-Origin used to provision the policy. Detailed procedure is described in Section 7 .

6.1. SR Binding SID

The SR Binding SID (BSID) is an optional TLV that provides the BSID and its attributes for the SR Policy CP. The TLV MAY also optionally contain the Provisioned BSID value for reporting when explicitly provisioned.

The TLV has the following format:

Previdi, et al. Expires April 16, 2020 [Page 18]

Page 80: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | BSID Flags | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Binding SID (4 or 16 octets) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Provisioned Binding SID (4 or 16 octets) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 1201

o Length: variable (valid values are 12 or 36 octets)

o BSID Flags: 2 octet field that indicates attribute and status of the Binding SID (BSID) associated with this CP. The following bit positions are defined and the semantics are described in detail in [I-D.ietf-spring-segment-routing-policy]. Other bits SHOULD be cleared by originator and MUST be ignored by receiver.

0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |D|B|U|L|F| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

* D-Flag : Indicates the dataplane for the BSIDs and if they are 16 octet SRv6 SID when set and are 4 octet SR/MPLS label value when clear.

* B-Flag : Indicates the allocation of the value in the BSID field when set and indicates that BSID is not allocated when clear.

* U-Flag : Indicates the provisioned BSID value is unavailable when set.

* L-Flag : Indicates the BSID value is from the Segment Routing Local Block (SRLB) of the headend node when set and is from the local dynamic label pool when clear

Previdi, et al. Expires April 16, 2020 [Page 19]

Page 81: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

* F-Flag : Indicates the BSID value is one allocated from dynamic label pool due to fallback (e.g. when specified BSID is unavailable) when set.

o RESERVED: 2 octets. SHOULD be set to 0 by originator and MUST be ignored by receiver.

o Binding SID: It indicates the operational or allocated BSID value for the CP based on the status flags.

o Provisioned BSID: It is used to report the explicitly provisioned BSID value regardless of whether it is successfully allocated or not. The field is set to value 0 when BSID has not been specified or provisioned for the CP.

The BSID fields above are 4 octet carrying the MPLS Label or 16 octets carrying the SRv6 SID based on the BSID D-flag. When carrying the MPLS Label, as shown in the figure below, the TC, S and TTL (total of 12 bits) are RESERVED and SHOULD be set to 0 by originator and MUST be ignored by the receiver.

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Label | TC |S| TTL | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

6.2. SR Candidate Path State

The SR Candidate Path (CP) State TLV provides the operational status and attributes of the SR Policy at the CP level. The TLV has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Priority | RESERVED | Flags | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Preference (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 1202

Previdi, et al. Expires April 16, 2020 [Page 20]

Page 82: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

o Length: 8 octets

o Priority : 1 octet value which indicates the priority of the CP. Refer Section 2.12 of [I-D.ietf-spring-segment-routing-policy].

o RESERVED: 1 octet. SHOULD be set to 0 by originator and MUST be ignored by receiver.

o Flags: 2 octet field that indicates attribute and status of the CP. The following bit positions are defined and the semantics are described in detail in [I-D.ietf-spring-segment-routing-policy]. Other bits SHOULD be cleared by originator and MUST be ignored by receiver.

0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |S|A|B|E|V|O|D|C|I|T| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

* S-Flag : Indicates the CP is in administrative shut state when set

* A-Flag : Indicates the CP is the active path (i.e. one provisioned in the forwarding plane) for the SR Policy when set

* B-Flag : Indicates the CP is the backup path (i.e. one identified for path protection of the active path) for the SR Policy when set

* E-Flag : Indicates that the CP has been evaluated for validity (e.g. headend may evaluate CPs based on their preferences) when set

* V-Flag : Indicates the CP has at least one valid SID-List when set. When the E-Flag is clear (i.e. the CP has not been evaluated), then this flag MUST be set to 0 by the originator and ignored by the receiver.

* O-Flag : Indicates the CP was instantiated by the headend due to an on-demand-nexthop trigger based on local template when set. Refer Section 8.5 of [I-D.ietf-spring-segment-routing-policy].

* D-Flag : Indicates the CP was delegated for computation to a PCE/controller when set

Previdi, et al. Expires April 16, 2020 [Page 21]

Page 83: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

* C-Flag : Indicates the CP was provisioned by a PCE/controller when set

* I-Flag : Indicates the CP will perform the "drop upon invalid" behavior when no other active path is available for this SR Policy and this path is the one with best preference amongst the available CPs. Refer Section 8.2 of [I-D.ietf-spring-segment-routing-policy].

* T-Flag : Indicates the CP has been marked as eligible for use as Transit Policy on the headend when set. Refer Section 8.3 of [I-D.ietf-spring-segment-routing-policy].

o Preference : 4 octet value which indicates the preference of the CP. Refer Section 2.7 of [I-D.ietf-spring-segment-routing-policy].

6.3. SR Candidate Path Name

The SR Candidate Path Name TLV is an optional TLV that is used to carry the symbolic name associated with the candidate path. The TLV has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Candidate Path Symbolic Name (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 1203

o Length: variable

o CP Name : Symbolic name for the CP. It is a string of printable ASCII characters without a NULL terminator.

6.4. SR Candidate Path Constraints

The SR Candidate Path Constraints TLV is an optional TLV that is used to report the constraints associated with the candidate path. The constraints are generally applied to a dynamic candidate path which is computed by the headend. The constraints may also be applied to an explicit path where the headend is expected to validate that the path expresses satisfies the specified constraints and the path is to

Previdi, et al. Expires April 16, 2020 [Page 22]

Page 84: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

be invalidated by the headend when the constraints are no longer met (e.g. due to topology changes).

The TLV has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Flags | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | MTID | Algorithm | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | sub-TLVs (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 1204

o Length: variable

o Flags: 2 octet field that indicates the constraints that are being applied to the CP. The following bit positions are defined and the other bits SHOULD be cleared by originator and MUST be ignored by receiver.

0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |D|P|U|A|T| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

* D-Flag : Indicates that the CP needs to use SRv6 dataplane when set and SR/MPLS dataplane when clear

* P-Flag : Indicates that the CP needs to use only protected SIDs when set

* U-Flag : Indicates that the CP needs to use only unprotected SIDs when set

* A-Flag : Indicates that the CP needs to use the SIDs belonging to the specified SR Algorithm only when set

Previdi, et al. Expires April 16, 2020 [Page 23]

Page 85: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

* T-Flag: Indicates that the CP needs to use the SIDs belonging to the specified topology only when set

o RESERVED: 2 octet. SHOULD be set to 0 by originator and MUST be ignored by receiver.

o MTID : Indicates the multi-topology identifier of the IGP topology that is preferred to be used when the path is setup. When the T-flag is set then the path is strictly useing the specified topology SIDs only.

o Algorithm : Indicates the algorithm that is preferred to be used when the path is setup. When the A-flag is set then the path is strictly using the specified algorithm SIDs only.

o RESERVED: 1 octet. SHOULD be set to 0 by originator and MUST be ignored by receiver.

o sub-TLVs: optional sub-TLVs MAY be included in this TLV to describe other constraints.

The following constraint sub-TLVs are defined for the SR CP Constraints TLV.

6.4.1. SR Affinity Constraint

The SR Affinity Constraint sub-TLV is an optional sub-TLV that is used to carry the affinity constraints [RFC2702] associated with the candidate path. The affinity is expressed in terms of Extended Admin Group (EAG) as defined in [RFC7308]. The TLV has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Excl-Any-Size | Incl-Any-Size | Incl-All-Size | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Exclude-Any EAG (optional, variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Include-Any EAG (optional, variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Include-All EAG (optional, variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

Previdi, et al. Expires April 16, 2020 [Page 24]

Page 86: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

o Type: 1208

o Length: variable, dependent on the size of the Extended Admin Group. MUST be a multiple of 4 octets.

o Exclude-Any-Size : one octet to indicate the size of Exclude-Any EAG bitmask size in multiples of 4 octets. (e.g. value 0 indicates the Exclude-Any EAG field is skipped, value 1 indicates that 4 octets of Exclude-Any EAG is included)

o Include-Any-Size : one octet to indicate the size of Include-Any EAG bitmask size in multiples of 4 octets. (e.g. value 0 indicates the Include-Any EAG field is skipped, value 1 indicates that 4 octets of Include-Any EAG is included)

o Include-All-Size : one octet to indicate the size of Include-All EAG bitmask size in multiples of 4 octets. (e.g. value 0 indicates the Include-All EAG field is skipped, value 1 indicates that 4 octets of Include-All EAG is included)

o RESERVED: 1 octet. SHOULD be set to 0 by originator and MUST be ignored by receiver.

o Exclude-Any EAG : the bitmask used to represent the affinities to be excluded from the path.

o Include-Any EAG : the bitmask used to represent the affinities to be included in the path.

o Include-All EAG : the bitmask used to represent the all affinities to be included in the path.

6.4.2. SR SRLG Constraint

The SR SRLG Constraint sub-TLV is an optional sub-TLV that is used to carry the Shared Risk Link Group (SRLG) values [RFC4202] that are to be excluded from the candidate path. The TLV has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | SRLG Values (variable, multiples of 4 octets) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

Previdi, et al. Expires April 16, 2020 [Page 25]

Page 87: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

o Type: 1209

o Length: variable, dependent on the number of SRLGs encoded. MUST be a multiple of 4 octets.

o SRLG Values : One or more SRLG values (each of 4 octets).

6.4.3. SR Bandwidth Constraint

The SR Bandwidth Constraint sub-TLV is an optional sub-TLV that is used to indicate the desired bandwidth availability that needs to be ensured for the candidate path. The TLV has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Bandwidth | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 1210

o Length: 4 octects

o Bandwidth : 4 octets which specify the desired bandwidth in unit of bytes per second in IEEE floating point format.

6.4.4. SR Disjoint Group Constraint

The SR Disjoint Group Constraint sub-TLV is an optional sub-TLV that is used to carry the disjointness constraint associated with the candidate path. The disjointness between two SR Policy Candidate Paths is expressed by associating them with the same disjoint group identifier and then specifying the type of disjointness required between their paths. The computation is expected to achieve the highest level of disjointness requested and when that is not possible then fallback to a lesser level progressively based on the levels indicated.

The TLV has the following format:

Previdi, et al. Expires April 16, 2020 [Page 26]

Page 88: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Request-Flags | Status-Flags | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Disjoint Group Identifier | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 1211

o Length: 8 octets

o Request Flags : one octet to indicate the level of disjointness requested as specified in the form of flags. The following flags are defined and the other bits SHOULD be cleared by originator and MUST be ignored by receiver.

0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |S|N|L|F|I| | +-+-+-+-+-+-+-+-+

where:

* S-Flag : Indicates that SRLG disjointness is requested

* N-Flag : Indicates that node disjointness is requested when

* L-Flag : Indicates that link disjointness is requested when

* F-Flag : Indicates that the computation may fallback to a lower level of disjointness amongst the ones requested when all cannot be achieved

* I-Flag : Indicates that the computation may fallback to the default best path (e.g. IGP path) in case of none of the desired disjointness can be achieved.

o Status Flags : one octet to indicate the level of disjointness that has been achieved by the computation as specified in the form of flags. The following flags are defined and the other bits SHOULD be cleared by originator and MUST be ignored by receiver.

Previdi, et al. Expires April 16, 2020 [Page 27]

Page 89: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |S|N|L|F|I|X| | +-+-+-+-+-+-+-+-+

where:

* S-Flag : Indicates that SRLG disjointness is achieved

* N-Flag : Indicates that node disjointness is achieved

* L-Flag : Indicates that link disjointness is achieved

* F-Flag : Indicates that the computation has fallen back to a lower level of disjointness that requested.

* I-Flag : Indicates that the computation has fallen back to the best path (e.g. IGP path) and disjointness has not been achieved

* X-Flag : Indicates that the disjointness constraint could not be achieved and hence path has been invalidated

o RESERVED: 2 octets. SHOULD be set to 0 by originator and MUST be ignored by receiver.

o Disjointness Group Identifier : 4 octet value that is the group identifier for a set of disjoint paths

6.5. SR Segment List

The SR Segment List TLV is used to report the SID-List(s) of a candidate path. The TLV has following format:

Previdi, et al. Expires April 16, 2020 [Page 28]

Page 90: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Flags | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | MTID | Algorithm | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Weight (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | sub-TLVs (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 1205

o Length: variable

o Flags: 2 octet field that indicates attribute and status of the SID-List.The following bit positions are defined and the semantics are described in detail in [I-D.ietf-spring-segment-routing-policy]. Other bits SHOULD be cleared by originator and MUST be ignored by receiver.

0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |D|E|C|V|R|F|A|T|M| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

* D-Flag : Indicates the SID-List is comprised of SRv6 SIDs when set and indicates it is comprised of SR/MPLS labels when clear.

* E-Flag : Indicates that SID-List is an explicit path when set and indicates dynamic path when clear.

* C-Flag : Indicates that SID-List has been computed for a dynamic path when set. It is always reported as set for explicit paths.

* V-Flag : Indicates the SID-List has passed verification or its verification was not required when set and failed verification when clear.

Previdi, et al. Expires April 16, 2020 [Page 29]

Page 91: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

* R-Flag : Indicates that the first Segment has been resolved when set and failed resolution when clear.

* F-Flag : Indicates that the computation for the dynamic path failed when set and succeeded (or not required in case of explicit path) when clear

* A-Flag : Indicates that all the SIDs in the SID-List belong to the specified algorithm when set.

* T-Flag : Indicates that all the SIDs in the SID-List belong to the specified topology (identified by the multi-topology ID) when set.

* M-Flag : Indicates that the SID-list has been removed from the forwarding plane due to fault detection by a monitoring mechanism (e.g. BFD) when set and indicates no fault detected or monitoring is not being done when clear.

o RESERVED: 2 octet. SHOULD be set to 0 by originator and MUST be ignored by receiver.

o MTID : 2 octet that indicates the multi-topology identifier of the IGP topology to be used when the T-flag is set.

o Algorithm: 1 octet that indicates the algorithm of the SIDs used in the SID-List when the A-flag is set.

o RESERVED: 1 octet. SHOULD be set to 0 by originator and MUST be ignored by receiver.

o Weight: 4 octet field that indicates the weight associated with the SID-List for weighted load-balancing. Refer Section 2.2 and 2.11 of [I-D.ietf-spring-segment-routing-policy].

o Sub-TLVs : variable and contains the ordered set of Segments and any other optional attributes associated with the specific SID- List.

The SR Segment sub-TLV (defined in Section 6.6) MUST be included as an ordered set of sub-TLVs within the SR Segment List TLV when the SID-List is not empty. A SID-List may be empty in certain cases (e.g. for a dynamic path) where the headend has not yet performed the computation and hence not derived the segments required for the path; in such cases, the SR Segment List TLV SHOULD NOT include any SR Segment sub-TLVs.

Previdi, et al. Expires April 16, 2020 [Page 30]

Page 92: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

6.6. SR Segment

The SR Segment sub-TLV describes a single segment in a SID-List. One or more instances of this sub-TLV in an ordered manner constitute a SID-List for a SR Policy candidate path. It is a sub-TLV of the SR Segment List TLV and has following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Segment Type | RESERVED | Flags | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | SID (4 or 16 octets) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // Segment Descriptor (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // Sub-TLVs (variable) // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 1206

o Length: variable

o Segment Type : 1 octet which indicates the type of segment (refer Section 6.6.1 for details)

o RESERVED: 1 octet. SHOULD be set to 0 by originator and MUST be ignored by receiver.

o Flags: 2 octet field that indicates attribute and status of the Segment and its SID. The following bit positions are defined and the semantics are described in detail in [I-D.ietf-spring-segment-routing-policy]. Other bits SHOULD be cleared by originator and MUST be ignored by receiver.

0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |S|E|V|R|A| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

Previdi, et al. Expires April 16, 2020 [Page 31]

Page 93: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

* S-Flag : Indicates the presence of SID value in the SID field when set and that no value is indicated when clear.

* E-Flag : Indicates the SID value is explicitly provisioned value (locally on headend or via controller/PCE) when set and is a dynamically resolved value by headend when clear

* V-Flag : Indicates the SID has passed verification or did not require verification when set and failed verification when clear.

* R-Flag : Indicates the SID has been resolved or did not require resolution (e.g. because it is not the first SID) when set and failed resolution when clear.

* A-Flag : Indicates that the Algorithm indicated in the Segment descriptor is valid when set. When clear, it indicates that the headend is unable to determine the algorithm of the SID.

o SID : 4 octet carrying the MPLS Label or 16 octets carrying the SRv6 SID based on the Segment Type. When carrying the MPLS Label, as shown in the figure below, the TC, S and TTL (total of 12 bits) are RESERVED and SHOULD be set to 0 by originator and MUST be ignored by the receiver.

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Label | TC |S| TTL | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

o Segment Descriptor : variable size Segment descriptor based on the type of segment (refer Section 6.6.1 for details)

o Sub-Sub-TLVs : variable and contains any other optional attributes associated with the specific SID-List.

Currently no Sub-Sub-TLV of the SR Segment sub-TLV is defined.

6.6.1. Segment Descriptors

[I-D.ietf-spring-segment-routing-policy] section 4 defines multiple types of segments and their description. This section defines the encoding of the Segment Descriptors for each of those Segment types to be used in the Segment sub-TLV describes previously in Section 6.6.

Previdi, et al. Expires April 16, 2020 [Page 32]

Page 94: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

The following types are currently defined:

+-------+--------------------------------------------------------------+| Type | Segment Description |+-------+--------------------------------------------------------------+| 0 | Invalid || 1 | SR-MPLS Label || 2 | SRv6 SID as IPv6 address || 3 | SR-MPLS Prefix SID as IPv4 Node Address || 4 | SR-MPLS Prefix SID as IPv6 Node Global Address || 5 | SR-MPLS Adjacency SID as IPv4 Node Address & Local || | Interface ID || 6 | SR-MPLS Adjacency SID as IPv4 Local & Remote Interface || | Addresses || 7 | SR-MPLS Adjacency SID as pair of IPv6 Global Address & || | Interface ID for Local & Remote nodes || 8 | SR-MPLS Adjacency SID as pair of IPv6 Global Addresses for || | the Local & Remote Interface || 9 | SRv6 END SID as IPv6 Node Global Address || 10 | SRv6 END.X SID as pair of IPv6 Global Address & Interface ID || | for Local & Remote nodes || 11 | SRv6 END.X SID as pair of IPv6 Global Addresses for the || | Local & Remote Interface |+-------+--------------------------------------------------------------+

6.6.1.1. Type 1: SR-MPLS Label

The Segment is SR-MPLS type and is specified simply as the label. The format of its Segment Descriptor is as follows:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+ | Algorithm | +-+-+-+-+-+-+-+-+

Where:

o Algorithm: 1 octet value that indicates the algorithm used for picking the SID. This is valid only when the A-flag has been set in the Segment TLV.

6.6.1.2. Type 2: SRv6 SID

The Segment is SRv6 type and is specified simply as the SRv6 SID address. The format of its Segment Descriptor is as follows:

Previdi, et al. Expires April 16, 2020 [Page 33]

Page 95: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+ | Algorithm | +-+-+-+-+-+-+-+-+

Where:

o Algorithm: 1 octet value that indicates the algorithm used for picking the SID. This is valid only when the A-flag has been set in the Segment TLV.

6.6.1.3. Type 3: SR-MPLS Prefix SID for IPv4

The Segment is SR-MPLS Prefix SID type and is specified as an IPv4 node address. The format of its Segment Descriptor is as follows:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+ | Algorithm | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv4 Node Address (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where:

o Algorithm: 1 octet value that indicates the algorithm used for picking the SID

o IPv4 Node Address: 4 octet value which carries the IPv4 address associated with the node

6.6.1.4. Type 4: SR-MPLS Prefix SID for IPv6

The Segment is SR-MPLS Prefix SID type and is specified as an IPv6 global address. The format of its Segment Descriptor is as follows:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+ | Algorithm | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 Node Global Address (16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where:

Previdi, et al. Expires April 16, 2020 [Page 34]

Page 96: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

o Algorithm: 1 octet value that indicates the algorithm used for picking the SID

o IPv6 Node Global Address: 16 octet value which carries the IPv6 global address associated with the node

6.6.1.5. Type 5: SR-MPLS Adjacency SID for IPv4 with Interface ID

The Segment is SR-MPLS Adjacency SID type and is specified as an IPv4 node address along with the local interface ID on that node. The format of its Segment Descriptor is as follows:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv4 Node Address (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Local Interface ID (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where:

o IPv4 Node Address: 4 octet value which carries the IPv4 address associated with the node

o Local Interface ID : 4 octet value which carries the local interface ID of the node identified by the Node Address

6.6.1.6. Type 6: SR-MPLS Adjacency SID for IPv4 with Interface Address

The Segment is SR-MPLS Adjacency SID type and is specified as a pair of IPv4 local and remote addresses. The format of its Segment Descriptor is as follows:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv4 Local Address (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv4 Remote Address (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where:

o IPv4 Local Address: 4 octet value which carries the local IPv4 address associated with the node

Previdi, et al. Expires April 16, 2020 [Page 35]

Page 97: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

o IPv4 Remote Address: 4 octet value which carries the remote IPv4 address associated with the node’s neighbor. This is optional and MAY be set to 0 when not used (e.g. when identifying point-to- point links).

6.6.1.7. Type 7: SR-MPLS Adjacency SID for IPv6 with interface ID

The Segment is SR-MPLS Adjacency SID type and is specified as a pair of IPv6 global address and interface ID for local and remote nodes. The format of its Segment Descriptor is as follows:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 Local Node Global Address (16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Local Node Interface ID (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 Remote Node Global Address (16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Remote Node Interface ID (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where:

o IPv6 Local Node Global Address: 16 octet value which carries the IPv6 global address associated with the local node

o Local Node Interface ID : 4 octet value which carries the interface ID of the local node identified by the Local Node Address

o IPv6 Remote Node Global Address: 16 octet value which carries the IPv6 global address associated with the remote node. This is optional and MAY be set to 0 when not used (e.g. when identifying point-to-point links).

o Remote Node Interface ID : 4 octet value which carries the interface ID of the remote node identified by the Remote Node Address. This is optional and MAY be set to 0 when not used (e.g. when identifying point-to-point links).

6.6.1.8. Type 8: SR-MPLS Adjacency SID for IPv6 with interface address

The Segment is SR-MPLS Adjacency SID type and is specified as a pair of IPv6 Global addresses for local and remote interface addresses. The format of its Segment Descriptor is as follows:

Previdi, et al. Expires April 16, 2020 [Page 36]

Page 98: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Global IPv6 Local Interface Address (16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Global IPv6 Remote Interface Address (16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where:

o IPv6 Local Address: 16 octet value which carries the local IPv6 address associated with the node

o IPv6 Remote Address: 16 octet value which carries the remote IPv6 address associated with the node’s neighbor

6.6.1.9. Type 9: SRv6 END SID as IPv6 Node Address

The Segment is SRv6 END SID type and is specified as an IPv6 global address. The format of its Segment Descriptor is as follows:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+ | Algorithm | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 Node Global Address (16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where:

o Algorithm: 1 octet value that indicates the algorithm used for picking the SID

o IPv6 Node Global Address: 16 octet value which carries the IPv6 global address associated with the node

6.6.1.10. Type 10: SRv6 END.X SID as interface ID

The Segment is SRv6 END.X SID type and is specified as a pair of IPv6 global address and interface ID for local and remote nodes. The format of its Segment Descriptor is as follows:

Previdi, et al. Expires April 16, 2020 [Page 37]

Page 99: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 Local Node Global Address (16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Local Node Interface ID (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 Remote Node Global Address (16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Remote Node Interface ID (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where:

o IPv6 Local Node Global Address: 16 octet value which carries the IPv6 global address associated with the local node

o Local Node Interface ID : 4 octet value which carries the interface ID of the local node identified by the Local Node Address

o IPv6 Remote Node Global Address: 16 octet value which carries the IPv6 global address associated with the remote node. This is optional and MAY be set to 0 when not used (e.g. when identifying point-to-point links).

o Remote Node Interface ID : 4 octet value which carries the interface ID of the remote node identified by the Remote Node Address. This is optional and MAY be set to 0 when not used (e.g. when identifying point-to-point links).

6.6.1.11. Type 11: SRv6 END.X SID as interface address

The Segment is SRv6 END.X SID type and is specified as a pair of IPv6 Global addresses for local and remote interface addresses. The format of its Segment Descriptor is as follows:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Global IPv6 Local Interface Address (16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Global IPv6 Remote Interface Address (16 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where:

Previdi, et al. Expires April 16, 2020 [Page 38]

Page 100: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

o IPv6 Local Address: 16 octet value which carries the local IPv6 address associated with the node

o IPv6 Remote Address: 16 octet value which carries the remote IPv6 address associated with the node’s neighbor

6.7. SR Segment List Metric

The SR Segment List Metric sub-TLV describes the metric used for computation of the SID-List. It is used to report the type of metric used in the computation of a dynamic path either on the headend or when the path computation is delegated to a PCE/controller. When the path computation is done on the headend, it is also used to report the calculated metric for the path.

It is a sub-TLV of the SR Segment List TLV and has following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Metric Type | Flags | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Metric Margin | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Metric Bound | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Metric Value | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type: 1207

o Length: 16 octets

o Metric Type : 1 octet field which identifies the type of metric used for path computation. Following metric type codepoints are defined in this document.

+------------+-----------------------------------------+ | Code Point | Metric Type | +------------+-----------------------------------------+ | 0 | IGP Metric | | 1 | Min Unidirectional Link Delay [RFC7471] | | 2 | TE Metric [RFC3630] | +------------+-----------------------------------------+

Previdi, et al. Expires April 16, 2020 [Page 39]

Page 101: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

o Flags: 1 octet field that indicates the validity of the metric fields and their semantics. The following bit positions are defined and the other bits SHOULD be cleared by originator and MUST be ignored by receiver.

0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |M|A|B|V| | +-+-+-+-+-+-+-+-+

where:

* M-Flag : Indicates that the metric margin allowed for path computation is specified when set

* A-Flag : Indicates that the metric margin is specified as an absolute value when set and is expressed as a percentage of the minimum metric when clear.

* B-Flag : Indicates that the metric bound allowed for the path is specified when set.

* V-Flag : Indicates that the metric value computed is being reported when set.

o RESERVED: 2 octets. SHOULD be set to 0 by originator and MUST be ignored by receiver.

o Metric Margin : 4 octets which indicate the metric margin value when M-flag is set. The metric margin is specified as either an absolute value or as a percentage of the minimum computed path metric based on the A-flag. The metric margin loosens the criteria for minimum metric path calculation up to the specified metric to accomodate for other factors such as bandwidth availability, minimal SID stack depth and maximizing of ECMP for the SR path computed.

o Metric Bound : 4 octects which indicate the maximum metric value that is allowed when B-flag is set. If the computed path metric crosses the specified bound value then the path is considered as invalid.

o Metric Value : 4 octets which indicate the metric value of the computed path when V-flag is set. This value is available and reported when the computation is successful and a valid path is available.

Previdi, et al. Expires April 16, 2020 [Page 40]

Page 102: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

7. Procedures

The BGP-LS advertisements for the TE Policy NLRI are originated by the headend node for the TE Policies that are instantiated on its local node.

For MPLS TE LSPs signaled via RSVP-TE, the NLRI descriptor TLVs as specified in Section 4.1, Section 4.2, Section 4.3 and Section 4.4 are used. Then the TE LSP state is encoded in the BGP-LS Attribute field as MPLS-TE Policy State TLV as described in Section 5. The RSVP-TE objects that reflect the state of the LSP are included as defined in Section 5.1. When the TE LSP is setup with the help of PCEP signaling then another MPLS-TE Policy State TLV SHOULD be used to to encode the related PCEP objects corresponding to the LSP as defined in Section 5.2.

For SR Policies, the NLRI descriptor TLV as specified in Section 4.5 is used. An SR Policy candidate path (CP) may be instantiated on the headend node via a local configuration, PCEP or BGP SR Policy signaling and this is indicated via the SR Protocol Origin. Then the SR Policy Candidate Path’s attribute and state is encoded in the BGP- LS Attribute field as SR Policy State TLVs and sub-TLVs as described in Section 6. The SR Candidate Path State TLV as defined in Section 6.2 is included to report the state of the CP. The SR BSID TLV as defined in Section 6.1 is included to report the BSID of the CP when one is either provisioned or allocated by the headend. The constraints for the SR Policy Candidate Path are reported using the SR Candidate Path Constraints TLV as described in Section 6.4.The SR Segment List TLV is included for each of the SID-List(s) associated with the CP. Each SR Segment List TLV in turn includes SR Segment sub-TLV(s) to report the segment(s) and their status. The SR Segment List Metric sub-TLV is used to report the metric values and constraints for the specific SID List.

When the SR Policy CP is setup with the help of PCEP signaling then another MPLS-TE Policy State TLV MAY be used to to encode the related PCEP objects corresponding to the LSP as defined in Section 5.2 specifically to report information and status that is not covered by the defined TLVs under Section 6. In the event of a conflict of information, the receiver MUST prefer the information originated via TLVs defined in Section 6 over the PCEP objects reported via the TE Policy State TLV.

8. Manageability Considerations

The Existing BGP operational and management procedures apply to this document. No new procedures are defined in this document. The considerations as specified in [RFC7752] apply to this document.

Previdi, et al. Expires April 16, 2020 [Page 41]

Page 103: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

In general, it is assumed that the TE Policy head-end nodes are responsible for the distribution of TE Policy state information, while other nodes, e.g. the nodes in the path of a policy, MAY report the TE Policy information (if available) when needed. For example, the border routers in the inter-domain case will also distribute LSP state information since the ingress node may not have the complete information for the end-to-end path.

9. IANA Considerations

This document requires new IANA assigned codepoints.

9.1. BGP-LS NLRI-Types

IANA maintains a registry called "Border Gateway Protocol - Link State (BGP-LS) Parameters" with a sub-registry called "BGP-LS NLRI- Types".

The following codepoints have been assigned by early allocation process by IANA:

+------+----------------------------+---------------+ | Type | NLRI Type | Reference | +------+----------------------------+---------------+ | 5 | TE Policy NLRI type | this document | +------+----------------------------+---------------+

9.2. BGP-LS Protocol-IDs

IANA maintains a registry called "Border Gateway Protocol - Link State (BGP-LS) Parameters" with a sub-registry called "BGP-LS Protocol-IDs".

The following Protocol-ID codepoints have been assigned by early allocation process by IANA:

+-------------+----------------------------------+---------------+ | Protocol-ID | NLRI information source protocol | Reference | +-------------+----------------------------------+---------------+ | 8 | RSVP-TE | this document | | 9 | Segment Routing | this document | +-------------+----------------------------------+---------------+

9.3. BGP-LS TLVs

IANA maintains a registry called "Border Gateway Protocol - Link State (BGP-LS) Parameters" with a sub-registry called "Node Anchor, Link Descriptor and Link Attribute TLVs".

Previdi, et al. Expires April 16, 2020 [Page 42]

Page 104: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

The following TLV codepoints have been assigned by early allocation process by IANA:

+----------+----------------------------------------+---------------+ | TLV Code | Description | Value defined | | Point | | in | +----------+----------------------------------------+---------------+ | 550 | Tunnel ID TLV | this document | | 551 | LSP ID TLV | this document | | 552 | IPv4/6 Tunnel Head-end address TLV | this document | | 553 | IPv4/6 Tunnel Tail-end address TLV | this document | | 554 | SR Policy CP Descriptor TLV | this document | | 555 | MPLS Local Cross Connect TLV | this document | | 556 | MPLS Cross Connect Interface TLV | this document | | 557 | MPLS Cross Connect FEC TLV | this document | | 1200 | MPLS-TE Policy State TLV | this document | | 1201 | SR BSID TLV | this document | | 1202 | SR CP State TLV | this document | | 1203 | SR CP Name TLV | this document | | 1204 | SR CP Constraints TLV | this document | | 1205 | SR Segment List TLV | this document | | 1206 | SR Segment sub-TLV | this document | | 1207 | SR Segment List Metric sub-TLV | this document | | 1208 | SR Affinity Constraint sub-TLV | this document | | 1209 | SR SRLG Constraint sub-TLV | this document | | 1210 | SR Bandwidth Constraint sub-TLV | this document | | 1211 | SR Disjoint Group Constraint sub-TLV | this document | +----------+----------------------------------------+---------------+

9.4. BGP-LS SR Policy Protocol Origin

This document requests IANA to maintain a new sub-registry under "Border Gateway Protocol - Link State (BGP-LS) Parameters". The new registry is called "SR Policy Protocol Origin" and contains the codepoints allocated to the "Protocol Origin" field defined in Section 4.5. The registry contains the following codepoints, with initial values, to be assigned by IANA:

+------------+---------------------------------------------------------+| Code Point | Protocol Origin |+------------+---------------------------------------------------------+| 1 | PCEP || 2 | BGP SR Policy || 3 | Local (via CLI, Yang model through NETCONF, gRPC, etc.) |+------------+---------------------------------------------------------+

Previdi, et al. Expires April 16, 2020 [Page 43]

Page 105: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

9.5. BGP-LS TE State Object Origin

This document requests IANA to maintain a new sub-registry under "Border Gateway Protocol - Link State (BGP-LS) Parameters". The new registry is called "TE State Path Origin" and contains the codepoints allocated to the "Object Origin" field defined in Section 5. The registry contains the following codepoints, with initial values, to be assigned by IANA:

+----------+------------------+ | Code | Object | | Point | Origin | +----------+------------------+ | 1 | RSVP-TE | | 2 | PCEP | | 3 | Local/Static | +----------+------------------+

9.6. BGP-LS TE State Address Family

This document requests IANA to maintain a new sub-registry under "Border Gateway Protocol - Link State (BGP-LS) Parameters". The new registry is called "TE State Address Family" and contains the codepoints allocated to the "Address Family" field defined in Section 5. The registry contains the following codepoints, with initial values, to be assigned by IANA:

+----------+------------------+ | Code | Address | | Point | Family | +----------+------------------+ | 1 | MPLS-IPv4 | | 2 | MPLS-IPv6 | +----------+------------------+

9.7. BGP-LS SR Segment Descriptors

This document requests IANA to maintain a new sub-registry under "Border Gateway Protocol - Link State (BGP-LS) Parameters". The new registry is called "SR Segment Descriptor Types" and contains the codepoints allocated to the "Segment Type" field defined in Section 6.6 and described in Section 6.6.1. The registry contains the following codepoints, with initial values, to be assigned by IANA:

Previdi, et al. Expires April 16, 2020 [Page 44]

Page 106: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

+-------+--------------------------------------------------------------+| Code | Segment Description || Point | |+-------+--------------------------------------------------------------+| 0 | Invalid || 1 | SR-MPLS Label || 2 | SRv6 SID as IPv6 address || 3 | SR-MPLS Prefix SID as IPv4 Node Address || 4 | SR-MPLS Prefix SID as IPv6 Node Global Address || 5 | SR-MPLS Adjacency SID as IPv4 Node Address & Local || | Interface ID || 6 | SR-MPLS Adjacency SID as IPv4 Local & Remote Interface || | Addresses || 7 | SR-MPLS Adjacency SID as pair of IPv6 Global Address & || | Interface ID for Local & Remote nodes || 8 | SR-MPLS Adjacency SID as pair of IPv6 Global Addresses for || | the Local & Remote Interface || 9 | SRv6 END SID as IPv6 Node Global Address || 10 | SRv6 END.X SID as pair of IPv6 Global Address & Interface ID || | for Local & Remote nodes || 11 | SRv6 END.X SID as pair of IPv6 Global Addresses for the || | Local & Remote Interface |+-------+--------------------------------------------------------------+

9.8. BGP-LS Metric Type

This document requests IANA to maintain a new sub-registry under "Border Gateway Protocol - Link State (BGP-LS) Parameters". The new registry is called "Metric Type" and contains the codepoints allocated to the "metric type" field defined in Section 6.7. The registry contains the following codepoints, with initial values, to be assigned by IANA:

+------------+-----------------------------------------+ | Code Point | Metric Type | +------------+-----------------------------------------+ | 0 | IGP Metric | | 1 | Min Unidirectional Link Delay [RFC7471] | | 2 | TE Metric [RFC3630] | +------------+-----------------------------------------+

10. Security Considerations

Procedures and protocol extensions defined in this document do not affect the BGP security model. See [RFC6952] for details.

Previdi, et al. Expires April 16, 2020 [Page 45]

Page 107: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

11. Contributors

The following people have substantially contributed to the editing of this document:

Clarence Filsfils Cisco Systems Email: [email protected]

12. Acknowledgements

The authors would like to thank Dhruv Dhody, Mohammed Abdul Aziz Khalid, Lou Berger, Acee Lindem, Siva Sivabalan, Arjun Sreekantiah, and Dhanendra Jain for their review and valuable comments.

13. References

13.1. Normative References

[I-D.ietf-idr-bgpls-segment-routing-epe] Previdi, S., Talaulikar, K., Filsfils, C., Patel, K., Ray, S., and J. Dong, "BGP-LS extensions for Segment Routing BGP Egress Peer Engineering", draft-ietf-idr-bgpls- segment-routing-epe-19 (work in progress), May 2019.

[I-D.ietf-spring-segment-routing-policy] Filsfils, C., Sivabalan, S., [email protected], d., [email protected], b., and P. Mattes, "Segment Routing Policy Architecture", draft-ietf-spring-segment-routing- policy-03 (work in progress), May 2019.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.

[RFC2205] Braden, R., Ed., Zhang, L., Berson, S., Herzog, S., and S. Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 Functional Specification", RFC 2205, DOI 10.17487/RFC2205, September 1997, <https://www.rfc-editor.org/info/rfc2205>.

[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001, <https://www.rfc-editor.org/info/rfc3209>.

Previdi, et al. Expires April 16, 2020 [Page 46]

Page 108: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

[RFC3473] Berger, L., Ed., "Generalized Multi-Protocol Label Switching (GMPLS) Signaling Resource ReserVation Protocol- Traffic Engineering (RSVP-TE) Extensions", RFC 3473, DOI 10.17487/RFC3473, January 2003, <https://www.rfc-editor.org/info/rfc3473>.

[RFC4090] Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090, DOI 10.17487/RFC4090, May 2005, <https://www.rfc-editor.org/info/rfc4090>.

[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, "Multiprotocol Extensions for BGP-4", RFC 4760, DOI 10.17487/RFC4760, January 2007, <https://www.rfc-editor.org/info/rfc4760>.

[RFC4872] Lang, J., Ed., Rekhter, Y., Ed., and D. Papadimitriou, Ed., "RSVP-TE Extensions in Support of End-to-End Generalized Multi-Protocol Label Switching (GMPLS) Recovery", RFC 4872, DOI 10.17487/RFC4872, May 2007, <https://www.rfc-editor.org/info/rfc4872>.

[RFC4873] Berger, L., Bryskin, I., Papadimitriou, D., and A. Farrel, "GMPLS Segment Recovery", RFC 4873, DOI 10.17487/RFC4873, May 2007, <https://www.rfc-editor.org/info/rfc4873>.

[RFC4874] Lee, CY., Farrel, A., and S. De Cnodder, "Exclude Routes - Extension to Resource ReserVation Protocol-Traffic Engineering (RSVP-TE)", RFC 4874, DOI 10.17487/RFC4874, April 2007, <https://www.rfc-editor.org/info/rfc4874>.

[RFC5420] Farrel, A., Ed., Papadimitriou, D., Vasseur, JP., and A. Ayyangarps, "Encoding of Attributes for MPLS LSP Establishment Using Resource Reservation Protocol Traffic Engineering (RSVP-TE)", RFC 5420, DOI 10.17487/RFC5420, February 2009, <https://www.rfc-editor.org/info/rfc5420>.

[RFC5440] Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation Element (PCE) Communication Protocol (PCEP)", RFC 5440, DOI 10.17487/RFC5440, March 2009, <https://www.rfc-editor.org/info/rfc5440>.

[RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and S. Ray, "North-Bound Distribution of Link-State and Traffic Engineering (TE) Information Using BGP", RFC 7752, DOI 10.17487/RFC7752, March 2016, <https://www.rfc-editor.org/info/rfc7752>.

Previdi, et al. Expires April 16, 2020 [Page 47]

Page 109: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.

13.2. Informative References

[RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O’Dell, M., and J. McManus, "Requirements for Traffic Engineering Over MPLS", RFC 2702, DOI 10.17487/RFC2702, September 1999, <https://www.rfc-editor.org/info/rfc2702>.

[RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering (TE) Extensions to OSPF Version 2", RFC 3630, DOI 10.17487/RFC3630, September 2003, <https://www.rfc-editor.org/info/rfc3630>.

[RFC4202] Kompella, K., Ed. and Y. Rekhter, Ed., "Routing Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS)", RFC 4202, DOI 10.17487/RFC4202, October 2005, <https://www.rfc-editor.org/info/rfc4202>.

[RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation Element (PCE)-Based Architecture", RFC 4655, DOI 10.17487/RFC4655, August 2006, <https://www.rfc-editor.org/info/rfc4655>.

[RFC5065] Traina, P., McPherson, D., and J. Scudder, "Autonomous System Confederations for BGP", RFC 5065, DOI 10.17487/RFC5065, August 2007, <https://www.rfc-editor.org/info/rfc5065>.

[RFC6952] Jethanandani, M., Patel, K., and L. Zheng, "Analysis of BGP, LDP, PCEP, and MSDP Issues According to the Keying and Authentication for Routing Protocols (KARP) Design Guide", RFC 6952, DOI 10.17487/RFC6952, May 2013, <https://www.rfc-editor.org/info/rfc6952>.

[RFC7308] Osborne, E., "Extended Administrative Groups in MPLS Traffic Engineering (MPLS-TE)", RFC 7308, DOI 10.17487/RFC7308, July 2014, <https://www.rfc-editor.org/info/rfc7308>.

[RFC7471] Giacalone, S., Ward, D., Drake, J., Atlas, A., and S. Previdi, "OSPF Traffic Engineering (TE) Metric Extensions", RFC 7471, DOI 10.17487/RFC7471, March 2015, <https://www.rfc-editor.org/info/rfc7471>.

Previdi, et al. Expires April 16, 2020 [Page 48]

Page 110: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft TE Policy State Distribution using BGP-LS October 2019

[RFC8231] Crabbe, E., Minei, I., Medved, J., and R. Varga, "Path Computation Element Communication Protocol (PCEP) Extensions for Stateful PCE", RFC 8231, DOI 10.17487/RFC8231, September 2017, <https://www.rfc-editor.org/info/rfc8231>.

Authors’ Addresses

Stefano Previdi

Email: [email protected]

Ketan Talaulikar (editor) Cisco Systems, Inc. India

Email: [email protected]

Jie Dong (editor) Huawei Technologies Huawei Campus, No. 156 Beiqing Rd. Beijing 100095 China

Email: [email protected]

Mach(Guoyi) Chen Huawei Technologies Huawei Campus, No. 156 Beiqing Rd. Beijing 100095 China

Email: [email protected]

Hannes Gredler RtBrick Inc.

Email: [email protected]

Jeff Tantsura Apstra

Email: [email protected]

Previdi, et al. Expires April 16, 2020 [Page 49]

Page 111: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet Engineering Task Force J. DurandInternet-Draft CISCOIntended status: Standards Track D. FreedmanExpires: September 10, 2015 Claranet March 9, 2015

Path validation toward BGP next-hop with AUTO-BFD draft-jdurand-auto-bfd-00.txt

Abstract

This document describes a solution called AUTO-BFD, that automatically initiates BFD sessions for BGP next-hops. This makes it possible to avoid blackholing scenarios when a BGP peer is not the BGP next-hop, especially on Internet eXchange Points (IXPs) when BGP Route-Servers are deployed. When they are configured with this mechanism, routers can automatically try to establish a new adjacency for every new BGP next-hop. BGP routes are then checked against the state of the BFD session for the corresponding next-hop.

Foreword

A placeholder to list general observations about this document.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [1].

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on September 10, 2015.

Durand & Freedman Expires September 10, 2015 [Page 1]

Page 112: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft AUTO-BFD March 2015

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 2. Definitions and Accronyms . . . . . . . . . . . . . . . . . . 3 3. Solution Requirements . . . . . . . . . . . . . . . . . . . . 3 4. Solution Overview . . . . . . . . . . . . . . . . . . . . . . 4 5. BFD Session Ageing . . . . . . . . . . . . . . . . . . . . . 5 6. Possible other use cases . . . . . . . . . . . . . . . . . . 6 7. Future Work . . . . . . . . . . . . . . . . . . . . . . . . . 6 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 7 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 7 10. Security Considerations . . . . . . . . . . . . . . . . . . . 7 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 7 11.1. Normative References . . . . . . . . . . . . . . . . . . 7 11.2. Informative References . . . . . . . . . . . . . . . . . 7 Appendix A. Configuration . . . . . . . . . . . . . . . . . . . 7 Authors’ Addresses . . . . . . . . . . . . . . . . . . . . . . . 8

1. Introduction

Internet eXchange Points (IXPs) implement BGP Route-Servers (RS) [5] so that connected members do not need to configure BGP peerings with every other member to exchange routes. Through a single peering with the RS, a member can receive routes of all the other members peering with the RS. The RS redistributes routes and could simply be described as a Route-Reflector for eBGP peerings.

Usually, deployed RS do not modify the BGP next-hop of exchanged routes so traffic exchanged between IXP members does not pass through the RS, which keeps only a control-plane role, exactly as for a BGP RR. The drawback is that it may happen that a peering stays up between members and route-server while there is no connectivity between members. This results in black holes for members with no

Durand & Freedman Expires September 10, 2015 [Page 2]

Page 113: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft AUTO-BFD March 2015

easy troubleshooting: usually upon such problem a member just shuts its connectivity to the IXP. This situation has happened several times on many different IXPs and many members do not want to peer with route-servers for this reason.

EBGP UP----> RS <-------EBGP UP | | | | | | ----------------IXP LAN--------------------- | | | | V | | V Member 1 <================> Member 2 BROKEN CONNECTIVITY

Figure 1

This proposal intends to solve this situation with the automation of BFD adjacency creation when new BGP next-hops are discovered. When configured with this solution, routers can automatically try to establish a new adjacency for every new BGP next-hop. BGP routes are then checked against the BFD session states for the corresponding next-hop.

2. Definitions and Accronyms

o BFD: Bidirectional Forwarding Detection protocol [3][4]

o BGP: Border Gateway Protocol [2]

o IXP: Internet eXchange Point

o RR: Route Reflector (BGP Route-Reflector)

o RS: Route Server (BGP Route-Server) [5]

3. Solution Requirements

The following requirements apply to the solution described in this document:

o The solution MUST be independent of the BGP route-server’s configuration. In other words IXP members SHOULD be able to check each other’s liveliness without anything configured on the route- server.

Durand & Freedman Expires September 10, 2015 [Page 3]

Page 114: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft AUTO-BFD March 2015

o Solution MUST NOT imply a configuration per IXP member. Each IXP member SHOULD automatically discover other members and automatically start probing when this is possible.

o The solution MUST accept situations when not all the IXP members do not implement it. In other words the implementation of this mechanism on one of the IXP member MUST NOT impact the other members.

o The solution SHOULD rely on a widely adopted host liveliness verification protocol in the context of BGP peerings. The used host liveliness mechanism MUST be negotiated between members to avoid false positives.

o The solution SHOULD be as simple as possible and SHOULD NOT require the design of a new protocol.

Based on these requirements, the following is suggested:

o BFD [3] (Bidirectional Forwarding Detection) is an appropriate liveliness verification mechanism that IXP members can use to check their mutual status. It is widely adopted in the Internet community for this use and its scalability on modern routers makes it suitable for IXPs. Also BFD integrates an initial negotiation phase that makes it appropriate to interdomain scenarios, when all routers are not configured with the same options.

o IXP members cannot easily rely on an existing protocol to announce their mutual capability to support a host liveliness protocol. Since IXP members using BGP RS do not directly establish BGP peerings between them, any use of BGP to announce BFD capability would require solution support on the BGP RS which would prevent any usage on an IXP not implementing it. Solutions based on optional transitive BGP attribute have been studied [6] and showed some complexity and privacy challenges as it could force every member to disclose topology information globally to the downstreams of other IXP members.

4. Solution Overview

The solution proposed in this document, named AUTO-BFD, is straightforward and relies on existing mechanisms. It works as follows:

o AUTO-BFD is configured on the BGP peering to the BGP RS.

o Every time a new BGP next-hop is received from this peering, AUTO- BFD triggers a new BFD session with this next-hop (or reinstates

Durand & Freedman Expires September 10, 2015 [Page 4]

Page 115: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft AUTO-BFD March 2015

an existing session, see Section 5). The operation mode for this BFD session MUST be asynchronous. Timers can be locally configured. Optional security configuration can limit the number of authorized adjacencies or trigger BFD only for next-hops in a given prefix. Other optional configuration can define the BFD timers.

o Routes coming from the AUTO-BFD enabled BGP neighbor are then checked based on the BGP next-hop and its BFD session state. Acceptance of routes is then subject to the administrative policy based on BFD session state. An example of such a policy can be found in appendix Appendix A

5. BFD Session Ageing

In order to maintain the relevance of AUTO-BFD sessions, it is required to prune sessions when they are not needed anymore. A router X may not simply prune BFD session toward Y when there are no more routes with corresponding next-hop Y as the BFD session may still be needed by the Y to accept route from X.

The following solution makes it possible to prune BFD sessions only when it is sure the remote end does not need it anymore. A per- session timer (bfd.AutoAge) is defined to determine an interval. This timer MAY derive its configuration from a global value in an implementation. The timer counts down until it expires, at which point, the relevant AUTO-BFD session is checked against next-hop information received from the AUTO-BFD configured BGP session to determine if there are still next hops received which relate to the session. Based on the information, the following occurs:

o If next-hops for the remote system are still being received, bfd.LocalDiag should be set to XX, which will set the appropriate diagnostic code "XX - Continuing AUTO-BFD" (to be assigned by IANA) in the outgoing control packet, to indicate that the next hop continues to be seen and used by this system. At this point, the timer is reset and no further action is taken until it expires next.

o If next-hops for the remote system are no longer being received, the following occurs:

* If the session is still up (bfd.Sessionstate is UP) and a control packet arrives which would not change the state, but with the flag "XX - Continuing AUTO-BFD" set, this indicates that the remote system is still receiving routes with the local system as the next hop. In this case, the session MUST remain

Durand & Freedman Expires September 10, 2015 [Page 5]

Page 116: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft AUTO-BFD March 2015

up, the timer is reset and no further action is taken until it expires next.

* If the session is still up (bfd.Sessionstate is UP) and a control packet arrives which would not change the state, but does not set the "XX - Continuing AUTO-BFD" diagnostic flag, then we must consider that the remote system is no longer receiving routes with the local system as next hop. In this case, the bfd.LocalState should be set to ’AdminDown’ and the session should be placed in an Administrative Down state.

When a session is first established (or indeed re-established as per Section 4), it is important that the bfd.LocalDiag should be set to XX to ensure that control packets start to signal this state to the remote system.

6. Possible other use cases

While the primary focus of the authors is to solve the issue met with BGP route-servers on IXPs described in section Section 1, the proposed solution may also apply to the following use cases:

o IBGP Route-Reflector (RR): the scenario described for BGP RS could also apply for IBGP RR. The solution described in this draft could be used to validate received IBGP routes against real reachability of BGP next-hop (a router in same AS in case next-hop self is used, or the EBGP next-hop announcing the route.

o Any EBGP peering: the proposed solution would enable automatic BFD auto-deployment on every EBGP peering. AUTO-BFD would automatically "harden" the peering without the need of any additional configuration.

7. Future Work

Following points may need to be documented further in next versions of this document based on comments received by the community:

o AUTO-BFD interoperability with manual BFD sessions.

o S-BFD integration

o Security considerations

Durand & Freedman Expires September 10, 2015 [Page 6]

Page 117: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft AUTO-BFD March 2015

8. Acknowledgements

The authors would like to thank the following people for their comments and support: [TBD].

9. IANA Considerations

This memo requests a code point from the registry for BFD diagnostic codes [3]: XX -- Continuing Auto-BFD

10. Security Considerations

TBD

11. References

11.1. Normative References

[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[2] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006.

[3] Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD)", RFC 5880, June 2010.

[4] Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881, June 2010.

[5] "Internet Exchange Route Server", <http://tools.ietf.org/id/ draft-ietf-idr-ix-bgp-route-server-06.txt>.

11.2. Informative References

[6] "Path validation toward BGP next-hop", <https://tools.ietf.org/id/draft-jdurand-idr-next-hop- liveliness-00.txt>.

Appendix A. Configuration

This configuration in classic Cisco IOS style will help reader understand the way AUTO-BFD can be integrated in current deployments.

Durand & Freedman Expires September 10, 2015 [Page 7]

Page 118: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft AUTO-BFD March 2015

router bgp 65001 neighbor 192.0.2.102 remote-as 65102 ! address-family ipv4 unicast neighbor 192.0.2.102 activate neighbor 192.0.2.102 auto-bfd neighbor 192.0.2.102 route-map FROM-RS in ! route-map FROM-RS permit 10 match next-hop bfd-session-state Up set local-pref 120 route-map FROM-RS deny 20 match next-hop bfd-session-state Down Init route-map FROM-RS permit 30 match next-hop bfd-session-state AdminDown set local-pref 50

Figure 2

Authors’ Addresses

Jerome Durand CISCO Systems, Inc. 11 rue Camille Desmoulins Issy-les-Moulineaux 92782 CEDEX FR

Email: [email protected]

David Freedman Claranet 21 Southampton Row, Holborn London WC1B 5HA UK

Email: [email protected]

Durand & Freedman Expires September 10, 2015 [Page 8]

Page 119: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

IDR K. PatelInternet-Draft S. PrevidiIntended status: Standards Track C. FilsfilsExpires: January 21, 2016 A. Sreekantiah Cisco Systems S. Ray Unaffiliated H. Gredler Juniper Networks July 20, 2015

Segment Routing Prefix SID extensions for BGP draft-keyupate-idr-bgp-prefix-sid-05

Abstract

Segment Routing (SR) architecture allows a node to steer a packet flow through any topological path and service chain by leveraging source routing. The ingress node prepends a SR header to a packet containing a set of "segments". Each segment represents a topological or a service-based instruction. Per-flow state is maintained only at the ingress node of the SR domain.

This document describes the BGP extension for announcing BGP Prefix Segment Identifier (BGP Prefix SID) information.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119] only when they appear in all upper case. They may also appear in lower or mixed case as English words, without any normative meaning.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any

Patel, et al. Expires January 21, 2016 [Page 1]

Page 120: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 21, 2016.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Segment Routing Documents . . . . . . . . . . . . . . . . . . 3 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 3. BGP-Prefix-SID . . . . . . . . . . . . . . . . . . . . . . . 4 3.1. MPLS Prefix Segment . . . . . . . . . . . . . . . . . . . 4 3.2. IPv6 Prefix Segment . . . . . . . . . . . . . . . . . . . 5 4. BGP-Prefix-SID Attribute . . . . . . . . . . . . . . . . . . 5 4.1. Label-Index TLV . . . . . . . . . . . . . . . . . . . . . 6 4.2. IPv6 SID . . . . . . . . . . . . . . . . . . . . . . . . 6 4.3. Originator SRGB TLV . . . . . . . . . . . . . . . . . . . 7 5. Receiving BGP-Prefix-SID Attribute . . . . . . . . . . . . . 9 5.1. MPLS Dataplane: Labeled Unicast . . . . . . . . . . . . . 9 5.2. IPv6 Dataplane . . . . . . . . . . . . . . . . . . . . . 10 6. Announcing BGP-Prefix-SID Attribute . . . . . . . . . . . . . 10 6.1. MPLS Dataplane: Labeled Unicast . . . . . . . . . . . . . 10 6.2. IPv6 Dataplane . . . . . . . . . . . . . . . . . . . . . 11 7. Error Handling of BGP-Prefix-SID Attribute . . . . . . . . . 11 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 9. Security Considerations . . . . . . . . . . . . . . . . . . . 12 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 12 11. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . 12 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 12.1. Normative References . . . . . . . . . . . . . . . . . . 12 12.2. Informative References . . . . . . . . . . . . . . . . . 13 Authors’ Addresses . . . . . . . . . . . . . . . . . . . . . . . 14

Patel, et al. Expires January 21, 2016 [Page 2]

Page 121: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

1. Segment Routing Documents

The main references for this document are the SR architecture defined in [I-D.ietf-spring-segment-routing] and the related use case illustrated in [I-D.filsfils-spring-segment-routing-msdc].

The Segment Routing Egress Peer Engineering architecture is described in [I-D.filsfils-spring-segment-routing-central-epe].

The Segment Routing Egress Peer Engineering BGPLS extensions are described in [I-D.ietf-idr-bgpls-segment-routing-epe].

2. Introduction

Segment Routing (SR) architecture leverages the source routing paradigm. A group of inter-connected nodes that use SR forms a SR domain. The ingress node of the SR domain prepends a SR header containing "segments" to an incoming packet. Each segment represents a topological instruction such as "go to prefix P following shortest path" or a service instruction (e.g.: "pass through deep packet inspection"). By inserting the desired sequence of instructions, the ingress node is able to steer a packet via any topological path and/ or service chain; per-flow state is maintained only at the ingress node of the SR domain.

Each segment is identified by a Segment Identifier (SID). As described in [I-D.ietf-spring-segment-routing], when SR is applied to the MPLS dataplane the SID consists of a label while when SR is applied to the IPv6 dataplane the SID consists of an IPv6 prefix (see [I-D.previdi-6man-segment-routing-header]).

A BGP-Prefix Segment (aka BGP-Prefix-SID), is a BGP segment attached to a BGP prefix. A BGP-Prefix-SID is always global within the SR/BGP domain and identifies an instruction to forward the packet over the ECMP-aware best-path computed by BGP to the related prefix. The BGP- Prefix-SID is the identifier of the BGP prefix segment.

This document describes the BGP extension to signal the BGP-Prefix- SID. Specifically, this document defines a new BGP attribute known as the BGP Prefix SID attribute and specifies the rules to originate, receive and handle error conditions of the new attribute.

As described in [I-D.filsfils-spring-segment-routing-msdc], the newly proposed BGP Prefix-SID attribute can be attached to prefixes from AFI/SAFI:

Multiprotocol BGP labeled IPv4/IPv6 Unicast ([RFC3107]).

Patel, et al. Expires January 21, 2016 [Page 3]

Page 122: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

Multiprotocol BGP ([RFC4760]) unlabeled IPv6 Unicast.

[I-D.filsfils-spring-segment-routing-msdc] describes use cases where the Prefix-SID is used for the above AFI/SAFI.

3. BGP-Prefix-SID

The BGP-Prefix-SID attached to a BGP prefix P represents the instruction "go to Prefix P" along its BGP bestpath (potentially ECMP-enabled).

3.1. MPLS Prefix Segment

The BGP Prefix Segment is realized on the MPLS dataplane in the following way:

According to [I-D.ietf-spring-segment-routing], each BGP speaker is configured with a label block called the Segment Routing Global Block (SRGB). The SRGB of a node is a local property and could be different on different speakers.

As described in [I-D.filsfils-spring-segment-routing-msdc] the operator assigns a globally unique "index", L_I, to a locally sourced prefix of a BGP speaker N which is advertised to all other BGP speakers in the SR domain.

The index L_I is a 32 bit offset in the SRGB. Each BGP speaker derives its local MPLS label, L, by adding L_I to the start value of its own SRGB, and programs L in its MPLS dataplane as its incoming/local label for the prefix.

If the BGP speakers are configured with the same SRGB start value, they will all program the same MPLS label for a given prefix P. This has the effect of having a single label for prefix P across all BGP speakers despite that the MPLS paradigm of "local label" is preserved and this clearly simplifies the deployment and operations of traffic engineering in BGP driven networks, as described in [I-D.filsfils-spring-segment-routing-msdc].

If the BGP speakers cannot be configured with the same SRGB, the proposed BGP Prefix-SID attribute allows the advertisement of the SRGB so each node can advertise the SRGB it’s configured with. The drawbacks of the use case where BGP speakers have different SRGBs are documented in [I-D.filsfils-spring-segment-routing-msdc].

In order to advertise the label index of a given prefix P and, optionally, the SRGB, a new extension to BGP is needed: the BGP

Patel, et al. Expires January 21, 2016 [Page 4]

Page 123: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

Prefix SID attribute. This extension is described in subsequent sections.

3.2. IPv6 Prefix Segment

As defined in [I-D.previdi-6man-segment-routing-header], in SR for the IPv6 dataplane, the SRGB consists of the set of IPv6 addresses used within the SR domain (as described in [I-D.previdi-6man-segment-routing-header]). Therefore the BGP speaker willing to process SR IPv6 packets MUST advertise an IPv6 prefix with the attached Prefix SID attribute and related SR IPv6 flag (see subsequent section).

As described in [I-D.filsfils-spring-segment-routing-msdc], when SR is used over an IPv6 dataplane, the BGP Prefix Segment is instantiated by an IPv6 prefix originated by the BGP speaker.

Each node advertises a globally unique IPv6 address representing itself in the domain. This prefix (e.g.: its loopback interface address) is advertised to all other BGP speakers in the SR domain.

Also, each node MUST advertise its support of Segment Routing for IPv6 dataplane. This is realized using the Prefix SID Attribute defined below.

4. BGP-Prefix-SID Attribute

The BGP Prefix SID attribute is an optional, transitive BGP path attribute. The attribute type code is to be assigned by IANA (suggested value: 40). The value field of the BGP-Prefix-SID attribute has the following format:

The value field of the BGP Prefix SID attribute is defined here to be a set of elements encoded as "Type/Length/Value" (i.e., a set of TLVs). Following TLVs are defined:

o Label-Index TLV

o IPv6 SID TLV

o Originator SRGB TLV

Label-Index and Originator SRGB TLVs are used only when SR is applied to the MPLS dataplane.

IPv6 SID TLV is used only when SR is applied to the IPv6 dataplane.

Patel, et al. Expires January 21, 2016 [Page 5]

Page 124: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

4.1. Label-Index TLV

The Label-Index TLV MUST be present in the Prefix-SID attribute attached to Labeled IPv4/IPv6 unicast prefixes ([RFC3107]) and has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Flags | Label Index | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Label Index | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type is 1.

o Length: is 7, the total length of the value portion of the TLV.

o RESERVED: 8 bit field. SHOULD be 0 on transmission and MUST be ignored on reception.

o Flags: 16 bits of flags. None are defined at this stage of the document. The flag field SHOULD be clear on transmission and MUST be ignored at reception.

o Label Index: 32 bit value representing the index value in the SRGB space.

4.2. IPv6 SID

The Label-Index TLV MUST be present in the Prefix-SID attribute attached to MP-BGP unlabeled IPv6 unicast prefixes ([RFC4760]) and has the following format:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | RESERVED | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Flags | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

Patel, et al. Expires January 21, 2016 [Page 6]

Page 125: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

o Type is 2.

o Length: is 3, the total length of the value portion of the TLV.

o RESERVED: 8 bit field. SHOULD be 0 on transmission and MUST be ignored on reception.

o Flags: 16 bits of flags defined as follow:

0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |S| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

* S flag: if set then it means that the BGP speaker attaching the Prefix-SID Attribute to a prefix is capable of processing the IPv6 Segment Routing Header (SRH, [I-D.previdi-6man-segment-routing-header]) for the segment corresponding to the originated IPv6 prefix. The use case leveraging the S flag is described in [I-D.filsfils-spring-segment-routing-msdc].

The other bits of the flag field SHOULD be clear on transmission an MUST be ignored at reception.

4.3. Originator SRGB TLV

The Originator SRGB TLV is an optional TLV and has the following format:

Patel, et al. Expires January 21, 2016 [Page 7]

Page 126: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | Flags | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Flags | +-+-+-+-+-+-+-+-+

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | SRGB 1 (6 octets) | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | SRGB n (6 octets) | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

where:

o Type is 3.

o Length is the total length of the value portion of the TLV: 2 + multiple of 6.

o Flags: 16 bits of flags. None are defined in this document. Flags SHOULD be clear on transmission an MUST be ignored at reception.

o SRGB: 3 octets of base followed by 3 octets of range. Note that SRGB field MAY appear multiple times.

The Originator SRGB TLV contains the SRGB of the router originating the prefix to which the BGP Prefix SID is attached and MUST be kept in the Prefix-SID Attribute unchanged during the propagation of the BGP update.

The originator SRGB describes the SRGB of the node where the BGP Prefix Segment end. It is used to build SRTE policies when different SRGB’s are used in the fabric ([I-D.filsfils-spring-segment-routing-msdc]).

The originator SRGB may only appear on Prefix-SID attribute attached to prefixes of SAFI 4 (labeled unicast, [RFC3107]).

Patel, et al. Expires January 21, 2016 [Page 8]

Page 127: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

5. Receiving BGP-Prefix-SID Attribute

A BGP speaker receiving a BGP Prefix-SID attribute from an EBGP neighbor residing outside the boundaries of the SR domain, SHOULD discard the attribute unless it is configured to accept the attribute from the EBGP neighbor. A BGP speaker MAY log an error for further analysis when discarding an attribute.

5.1. MPLS Dataplane: Labeled Unicast

The Prefix-SID attribute MUST contain the Label-Index TLV and MAY contain the Originator SRGB. A BGP Prefix-SID attribute received without a Label-Index TLV MUST be considered as "unacceptable" by the receiving speaker.

A BGP speaker may be locally configured with an SRGB=[GB_S, GB_E]. The preferred method for deriving the SRGB is a matter of local router configuration.

Given a label index L_I, we call L = L_I + GB_S as the derived label. A BGP Prefix-SID attribute is called "unacceptable" for a speaker M if the derived label value L lies outside the SRGB configured on M. Otherwise the Label Index attribute is called "acceptable" to speaker M.

The mechanisms through which a given label_index value is assigned to a given prefix are outside the scope of this document. The label- index value associated with a prefix is locally configured at the BGP router originating the prefix.

The Prefix-SID attribute MUST contain the Label-Index TLV and MAY contain the Originator SRGB TLV. A BGP Prefix-SID attribute received without a Label-Index TLV MUST be considered as "unacceptable" by the receiving speaker.

When a BGP speaker receives a path from a neighbor with an acceptable BGP Prefix-SID attribute, it SHOULD program the derived label as the local label for the prefix in its MPLS dataplane. In case of any error, a BGP speaker MUST resort to the error handling rules specified in Section 7. A BGP speaker MAY log an error for further analysis.

When a BGP speaker receives a path from a neighbor with an unacceptable BGP Prefix-SID attribute, for the purpose of label allocation, it SHOULD treat the path as if it came without a Prefix- SID attribute. A BGP speaker MAY choose to assign a local (also called dynamic) label (non-SRGB) for such a prefix. A BGP speaker MAY log an error for further analysis.

Patel, et al. Expires January 21, 2016 [Page 9]

Page 128: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

A BGP speaker receiving a prefix with a Prefix-SID attribute and a label NLRI field of implicit-null from a neighbor MUST adhere to standard behavior and program its MPLS dataplane to pop the top label when forwarding traffic to the prefix. The label NLRI defines the outbound label that MUST be used by the receiving node. The Label Index gives a hint to the receiving node on which local/incoming label the BGP speaker SHOULD use.

5.2. IPv6 Dataplane

When a SR IPv6 BGP speaker receives a IPv6 Unicast BGP Update with a prefix having the BGP Prefix SID attribute attached, it checks whether the IPv6 SID TLV is present and if the S-flag is set. If the IPv6 SID TLV is not present or if the S-flag is not set, then the Prefix-SID attribute MUST be considered as "unacceptable" by the receiving speaker.

The Originator SRGB MUST be ignored on reception.

A BGP speaker receiving a BGP Prefix-SID attribute from an EBGP neighbor residing outside the boundaries of the SR domain, SHOULD discard the attribute unless it is configured to accept the attribute from the EBGP neighbor. A BGP speaker MAY log an error for further analysis when discarding an attribute.

6. Announcing BGP-Prefix-SID Attribute

The BGP Prefix-SID attribute MAY be attached to labeled BGP prefixes (IPv4/IPv6) [RFC3107]or to IPv6 prefixes [RFC4760]. In order to prevent distribution of the BGP Prefix-SID attribute beyond its intended scope of applicability, attribute filtering MAY be deployed.

6.1. MPLS Dataplane: Labeled Unicast

A BGP speaker that originates a prefix attaches the Prefix-SID attribute when it advertises the prefix to its neighbors. The value of the Label-Index in the Label-Index TLV is determined by configuration.

A BGP speaker that originates a Prefix-SID attribute MAY optionally announce Originator SRGB TLV along with the mandatory Label-Index TLV. The content of the Originator SRGB TLV is determined by the configuration.

Since the Label-index value must be unique within an SR domain, by default an implementation SHOULD NOT advertise the BGP Prefix-SID attribute outside an Autonomous System unless it is explicitly configured to do so.

Patel, et al. Expires January 21, 2016 [Page 10]

Page 129: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

A BGP speaker that advertises a path received from one of its neighbors SHOULD advertise the Prefix-SID received with the path without modification regardless of whether the Prefix-SID was acceptable. If the path did not come with a Prefix-SID attribute, the speaker MAY attach a Prefix-SID to the path if configured to do so. The content of the TLVs present in the Prefix-SID is determined by the configuration.

In all cases, the label field of the NLRI ([RFC3107], [RFC4364]) MUST be set to the local/incoming label programmed in the MPLS dataplane for the given prefix. If the prefix is associated with one of the BGP speakers interfaces, this label is the usual MPLS label (such as the implicit or explicit NULL label).

6.2. IPv6 Dataplane

A BGP speaker that originates a prefix attaches the Prefix-SID attribute when it advertises the prefix to its neighbors. The IPv6 SID TLV MUST be present and the S-flag MUST be set.

A BGP speaker that advertises a path received from one of its neighbors SHOULD advertise the Prefix-SID received with the path without modification regardless of whether the Prefix-SID was acceptable. If the path did not come with a Prefix-SID attribute, the speaker MAY attach a Prefix-SID to the path if configured to do so. The IPv6-SID TLV MUST be present in the Prefix-SID and with the S-flag set.

7. Error Handling of BGP-Prefix-SID Attribute

When a BGP Speaker receives a BGP Update message containing a malformed BGP Prefix-SID attribute, it MUST ignore the received BGP Prefix-SID attributes and not pass it to other BGP peers. This is equivalent to the -attribute discard- action specified in [I-D.ietf-idr-error-handling]. When discarding an attribute, a BGP speaker MAY log an error for further analysis.

If the BGP Prefix-SID attribute appears more than once in an BGP Update message message, then, according to [I-D.ietf-idr-error-handling], all the occurrences of the attribute other than the first one SHALL be discarded and the BGP Update message shall continue to be processed.

When a BGP speaker receives an unacceptable Prefix-SID attribute, it MAY log an error for further analysis.

Patel, et al. Expires January 21, 2016 [Page 11]

Page 130: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

8. IANA Considerations

This document defines a new BGP path attribute known as the BGP Prefix-SID attribute. This document requests IANA to assign a new attribute code type (suggested value: 40) for BGP the Prefix-SID attribute from the BGP Path Attributes registry.

This document defines two new TLVs for BGP Prefix-SID attribute. These TLVs need to be registered with IANA. We request IANA to create a new registry for BGP Prefix-SID Attribute TLVs as follows:

Under "Border Gateway Protocol (BGP) Parameters" registry, "BGP Prefix SID attribute Types" Reference: draft-keyupate-idr-bgp-prefix- side-05 Registration Procedure(s): Values 1-254 First Come, First Served, Value 0 and 255 reserved

Value Type Reference 0 Reserved draft-keyupate-idr-bgp-prefix-side-05 1 Label-Index draft-keyupate-idr-bgp-prefix-side-05 2 IPv6 SID draft-keyupate-idr-bgp-prefix-side-05 3 Originator SRGB draft-keyupate-idr-bgp-prefix-side-05 4-254 Unassigned 255 Reserved draft-keyupate-idr-bgp-prefix-side-05

9. Security Considerations

This document introduces no new security considerations above and beyond those already specified in [RFC4271] and [RFC3107].

10. Acknowledgements

The authors would like to thanks Satya Mohanty and Acee Lindem for their contribution to this document.

11. Change Log

Initial Version: Sep 21 2014

12. References

12.1. Normative References

[I-D.ietf-idr-error-handling] Chen, E., Scudder, J., Mohapatra, P., and K. Patel, "Revised Error Handling for BGP UPDATE Messages", draft- ietf-idr-error-handling-19 (work in progress), April 2015.

Patel, et al. Expires January 21, 2016 [Page 12]

Page 131: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <http://www.rfc-editor.org/info/rfc2119>.

[RFC3107] Rekhter, Y. and E. Rosen, "Carrying Label Information in BGP-4", RFC 3107, DOI 10.17487/RFC3107, May 2001, <http://www.rfc-editor.org/info/rfc3107>.

[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, DOI 10.17487/RFC4271, January 2006, <http://www.rfc-editor.org/info/rfc4271>.

[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 2006, <http://www.rfc-editor.org/info/rfc4364>.

12.2. Informative References

[I-D.filsfils-spring-segment-routing-central-epe] Filsfils, C., Previdi, S., Patel, K., Aries, E., [email protected], s., Ginsburg, D., and D. Afanasiev, "Segment Routing Centralized Egress Peer Engineering", draft- filsfils-spring-segment-routing-central-epe-04 (work in progress), July 2015.

[I-D.filsfils-spring-segment-routing-msdc] Filsfils, C., Previdi, S., Mitchell, J., Aries, E., Lapukhov, P., Gaya, G., Afanasiev, D., Laberge, T., Nkposong, E., Nanduri, M., Uttaro, J., and S. Ray, "BGP- Prefix Segment in large-scale data centers", draft- filsfils-spring-segment-routing-msdc-02 (work in progress), July 2015.

[I-D.ietf-idr-bgpls-segment-routing-epe] Previdi, S., Filsfils, C., Ray, S., Patel, K., Dong, J., and M. Chen, "Segment Routing Egress Peer Engineering BGP- LS Extensions", draft-ietf-idr-bgpls-segment-routing- epe-00 (work in progress), June 2015.

[I-D.ietf-spring-segment-routing] Filsfils, C., Previdi, S., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing Architecture", draft-ietf- spring-segment-routing-03 (work in progress), May 2015.

Patel, et al. Expires January 21, 2016 [Page 13]

Page 132: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

[I-D.previdi-6man-segment-routing-header] Previdi, S., Filsfils, C., Field, B., and I. Leung, "IPv6 Segment Routing Header (SRH)", draft-previdi-6man-segment- routing-header-06 (work in progress), May 2015.

[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, "Multiprotocol Extensions for BGP-4", RFC 4760, DOI 10.17487/RFC4760, January 2007, <http://www.rfc-editor.org/info/rfc4760>.

Authors’ Addresses

Keyur Patel Cisco Systems 170 W. Tasman Drive San Jose, CA 95124 95134 USA

Email: [email protected]

Stefano Previdi Cisco Systems Via Del Serafico, 200 Rome 00142 Italy

Email: [email protected]

Clarence Filsfils Cisco Systems Brussels Belgium

Email: [email protected]

Arjun Sreekantiah Cisco Systems 170 W. Tasman Drive San Jose, CA 95124 95134 USA

Email: [email protected]

Patel, et al. Expires January 21, 2016 [Page 14]

Page 133: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft July 2015

Saikat Ray Unaffiliated

Email: [email protected]

Hannes Gredler Juniper Networks

Email: [email protected]

Patel, et al. Expires January 21, 2016 [Page 15]

Page 134: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Network Working Group Z. LiInternet-Draft HuaweiIntended status: Standards Track L. OuExpires: January 8, 2020 Y. Luo China Telcom Co., Ltd. S. Lu Tencent H. Chen Futurewei S. Zhuang H. Wang Huawei July 7, 2019

BGP Extensions for Routing Policy Distribution (RPD) draft-li-idr-flowspec-rpd-05

Abstract

It is hard to adjust traffic and optimize traffic paths on a traditional IP network from time to time through manual configurations. It is desirable to have an automatic mechanism for setting up routing policies, which adjust traffic and optimize traffic paths automatically. This document describes BGP Extensions for Routing Policy Distribution (BGP RPD) to support this.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

Li, et al. Expires January 8, 2020 [Page 1]

Page 135: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

This Internet-Draft will expire on January 8, 2020.

Copyright Notice

Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 3. Problem Statements . . . . . . . . . . . . . . . . . . . . . 3 3.1. Inbound Traffic Control . . . . . . . . . . . . . . . . . 3 3.2. Outbound Traffic Control . . . . . . . . . . . . . . . . 4 4. Protocol Extensions . . . . . . . . . . . . . . . . . . . . . 5 4.1. Using a New AFI and SAFI . . . . . . . . . . . . . . . . 5 4.2. BGP Wide Community . . . . . . . . . . . . . . . . . . . 6 4.2.1. New Wide Community Atoms . . . . . . . . . . . . . . 6 4.3. Capability Negotiation . . . . . . . . . . . . . . . . . 12 5. Consideration . . . . . . . . . . . . . . . . . . . . . . . . 12 5.1. Route-Policy . . . . . . . . . . . . . . . . . . . . . . 12 6. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 13 7. Security Considerations . . . . . . . . . . . . . . . . . . . 13 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 14 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 10.1. Normative References . . . . . . . . . . . . . . . . . . 15 10.2. Informative References . . . . . . . . . . . . . . . . . 16 Authors’ Addresses . . . . . . . . . . . . . . . . . . . . . . . 16

1. Introduction

It is difficult to optimize traffic paths on a traditional IP network because of:

o Heavy configuration and error prone. Traffic can only be adjusted device by device. All routers that the traffic traverses need to be configured. The configuration workload is heavy. The

Li, et al. Expires January 8, 2020 [Page 2]

Page 136: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

operation is not only time consuming but also prone to misconfiguration for Service Providers.

o Complex. The routing policies used to control network routes are complex, posing difficulties to subsequent maintenance, high maintenance skills are required.

It is desirable to have an automatic mechanism for setting up routing policies, which can simplify the routing policies configuration. This document describes extensions to BGP for Routing Policy Distribution to resolve these issues.

2. Terminology

The following terminology is used in this document.

o ACL:Access Control List

o BGP: Border Gateway Protocol

o FS: Flow Specification

o PBR:Policy-Based Routing

o RPD: Routing Policy Distribution

o VPN: Virtual Private Network

3. Problem Statements

It is obvious that providers have the requirements to adjust their business traffic from time to time because:

o Business development or network failure introduces link congestion and overload.

o Network transmission quality is decreased as the result of delay, loss and they need to adjust traffic to other paths.

o To control OPEX and CPEX, prefer the transit provider with lower price.

3.1. Inbound Traffic Control

In the scenario below, for the reasons above, the provider of AS100 saying P may wish the inbound traffic from AS200 enters AS100 through link L3 instead of the others. Since P doesn’t have any

Li, et al. Expires January 8, 2020 [Page 3]

Page 137: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

administration over AS200, so there is no way for P to modify the route selection criteria directly.

Traffic from PE1 to Prefix1 ----------------------------------->

+-----------------+ +-------------------------+ | +---------+ | L1 | +----+ +----------+| | |Speaker1 | +------------+ |IGW1| |policy || | +---------+ |** L2**| +----+ |controller|| | | ** ** | +----------+| | +---+ | **** | | | |PE1| | **** | | | +---+ | ** ** | | | +---------+ |** L3**| +----+ | | |Speaker2 | +------------+ |IGW2| AS100 | | +---------+ | L4 | +----+ | | | | | | AS200 | | | | | | ... | | | | | | +---------+ | | +----+ +-------+ | | |Speakern | | | |IGWn| |Prefix1| | | +---------+ | | +----+ +-------+ | +-----------------+ +-------------------------+

Prefix1 advertised from AS100 to AS200 <----------------------------------------

Inbound Traffic Control case

3.2. Outbound Traffic Control

In the scenario below, the provider of AS100 saying P prefers link L3 for the traffic to the destination Prefix2 among multiple exits and links. This preference can be dynamic and changed frequently because of the reasons above. So the provider P expects an efficient and convenient solution.

Li, et al. Expires January 8, 2020 [Page 4]

Page 138: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

Traffic from PE2 to Prefix2 -----------------------------------> +-------------------------+ +-----------------+ |+----------+ +----+ |L1 | +---------+ | ||policy | |IGW1| +------------+ |Speaker1 | | ||controller| +----+ |** **| +---------+ | |+----------+ |L2** ** | +-------+| | | **** | |Prefix2|| | | **** | +-------+| | |L3** ** | | | AS100 +----+ |** **| +---------+ | | |IGW2| +------------+ |Speaker2 | | | +----+ |L4 | +---------+ | | | | | |+---+ | | AS200 | ||PE2| ... | | | |+---+ | | | | +----+ | | +---------+ | | |IGWn| | | |Speakern | | | +----+ | | +---------+ | +-------------------------+ +-----------------+

Prefix2 advertised from AS200 to AS100 <----------------------------------------

Outbound Traffic Control case

4. Protocol Extensions

A solution is proposed to use a new AFI and SAFI with the BGP Wide Community for encoding a routing policy.

4.1. Using a New AFI and SAFI

A new AFI and SAFI are defined: the Routing Policy AFI whose codepoint TBD1 is to be assigned by IANA, and SAFI whose codepoint TBD2 is to be assigned by IANA.

The AFI and SAFI pair uses a new NLRI, which is defined as follows:

Li, et al. Expires January 8, 2020 [Page 5]

Page 139: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+ | NLRI Length | +-+-+-+-+-+-+-+-+ | Policy Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Distinguisher (4 octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Peer IP (4/16 octets) ˜ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where:

NLRI Length: 1 octet represents the length of NLRI.

Policy Type: 1 octet indicates the type of a policy. 1 is for export policy. 2 is for import policy.

Distinguisher: 4 octet value uniquely identifies the policy in the peer.

Peer IP: 4/16 octet value indicates an IPv4/IPv6 peer.

The NLRI containing the Routing Policy is carried in a BGP UPDATE message, which MUST contain the BGP mandatory attributes and MAY also contain some BGP optional attributes.

When receiving a BGP UPDATE message, a BGP speaker processes it only if the peer IP address in the NLRI is the IP address of the BGP speaker or 0.

The content of the Routing Policy is encoded in a BGP Wide Community.

4.2. BGP Wide Community

The BGP wide community is defined in [I-D.ietf-idr-wide-bgp-communities]. It can be used to facilitate the delivery of new network services, and be extended easily for distributing different kinds of routing policies.

4.2.1. New Wide Community Atoms

A wide community Atom is a TLV (or sub-TLV), which may be included in a BGP wide community container (or BGP wide community for short) containing some BGP Wide Community TLVs. Three BGP Wide Community TLVs are defined in [I-D.ietf-idr-wide-bgp-communities], which are BGP Wide Community Target(s) TLV, Exclude Target(s) TLV, and

Li, et al. Expires January 8, 2020 [Page 6]

Page 140: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

Parameter(s) TLV. Each of these TLVs comprises a series of Atoms, each of which is a TLV (or sub-TLV). A new wide community Atom is defined for BGP Wide Community Target(s) TLV and a few new Atoms are defined for BGP Wide Community Parameter(s) TLV. For your reference, the format of the TLV is illustrated below:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Value (variable) ˜ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Format of Wide Community Atom TLV

A RouteAttr Atom TLV (or RouteAttr TLV/sub-TLV for short) is defined and may be included in a Target TLV. It has the following format.

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type (TBD1) | Length (variable) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | sub-TLVs ˜ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Format of RouteAttr Atom TLV

The Type for RouteAttr is TBD1 (suggested value 48) to be assigned by IANA. In RouteAttr TLV, three sub-TLVs are defined: IP Prefix, AS- Path and Community sub-TLV.

An IP prefix sub-TLV gives matching criteria on IPv4 prefixes. Its format is illustrated below:

Li, et al. Expires January 8, 2020 [Page 7]

Page 141: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type (TBD2) | Length (N x 8) |M-Type | Flags | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv4 Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Mask | GeMask | LeMask |M-Type | Flags | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ˜ . . . +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv4 Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Mask | GeMask | LeMask | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Format of IPv4 Prefix sub-TLV

Type: TBD2 (suggested value 1) for IPv4 Prefix is to be assigned by IANA.

Length: N x 8, where N is the number of tuples <M-Type, Flags, IPv4 Address, Mask, GeMask, LeMask>.

M-Type: 4 bits for match types, four of which are defined:

M-Type = 0: Exact match.

M-Type = 1: Match prefix greater and equal to the given masks.

M-Type = 2: Match prefix less and equal to the given masks.

M-Type = 3: Match prefix within the range of the given masks.

Flags: 4 bits. No flags are currently defined.

IPv4 Address: 4 octets for an IPv4 address.

Mask: 1 octet for the mask length.

GeMask: 1 octet for match range, must be less than Mask or be 0.

LeMask: 1 octet for match range, must be greater than Mask or be 0.

For example, tuple <M-Type=0, Flags=0, IPv4 Address = 1.1.0.0, Mask = 22, GeMask = 0, LeMask = 0> represents an exact IP prefix match for 1.1.0.0/22.

Li, et al. Expires January 8, 2020 [Page 8]

Page 142: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

<M-Type=1, Flags=0, IPv4 Address = 16.1.0.0, Mask = 24, GeMask = 24, LeMask = 0> represents match IP prefix 1.1.0.0/24 greater-equal 24.

<M-Type=2, Flags=0, IPv4 Address = 17.1.0.0, Mask = 24, GeMask = 0, LeMask = 26> represents match IP prefix 17.1.0.0/24 less-equal 26.

<M-Type=3, Flags=0, IPv4 Address = 18.1.0.0, Mask = 24, GeMask = 24, LeMask = 32> represents match IP prefix 18.1.0.0/24 greater-equal to 24 and less-equal 32.

Similarly, an IPv6 Prefix sub-TLV represents match criteria on IPv6 prefixes. Its format is illustrated below:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type(TBD3) | Length (N x 20) |M-Type | Flags | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 Address (16 octets) ˜ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Mask | GeMask | LeMask |M-Type | Flags | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ˜ . . . +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv6 Address (16 octets ˜ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Mask | GeMask | LeMask | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Format of IPv6 Prefix sub-TLV

An AS-Path sub-TLV represents a match criteria in a regular expression string. Its format is illustrated below:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type (TBD4) | Length (Variable) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | AS-Path Regex String | : : | ˜ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Format of AS Path sub-TLV

Type: TBD4 (suggested value 2) for AS-Path is to be assigned by IANA.

Li, et al. Expires January 8, 2020 [Page 9]

Page 143: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

Length: Variable, maximum is 1024.

AS-Path Regex String: AS-Path regular expression string.

A community sub-TLV represents a list of communities to be matched all. Its format is illustrated below:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type (TBD5) | Length (N x 4 + 1) | Flags | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Community 1 Value | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ˜ . . . ˜ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Community N Value | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Format of Community sub-TLV

Type: TBD5 (suggested value 3) for Community is to be assigned by IANA.

Length: N x 4 + 1, where N is the number of communities.

Flags: 1 octet. No flags are currently defined.

In Parameter(s) TLV, two action sub-TLVs are defined: MED change sub- TLV and AS-Path change sub-TLV. When the community in the container is MATCH AND SET ATTR, the Parameter(s) TLV includes some of these sub-TLVs. When the community is MATCH AND NOT ADVERTISE, the Parameter(s) TLV’s value is empty.

A MED change sub-TLV indicates an action to change the MED. Its format is illustrated below:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type (TBD6) | Length (5) | OP | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Value | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Format of MED Change sub-TLV

Li, et al. Expires January 8, 2020 [Page 10]

Page 144: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

Type: TBD6 (suggested value 1) for MED Change is to be assigned by IANA.

Length: 5.

OP: 1 octet. Three are defined:

OP = 0: assign the Value to the existing MED.

OP = 1: add the Value to the existing MED. If the sum is greater than the maximum value for MED, assign the maximum value to MED.

OP = 2: subtract the Value from the existing MED. If the existing MED minus the Value is less than 0, assign 0 to MED.

Value: 4 octets.

An AS-Path change sub-TLV indicates an action to change the AS-Path. Its format is illustrated below:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type (TBD7) | Length (n x 5) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | AS1 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Count1 | +-+-+-+-+-+-+-+-+ ˜ . . . +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ASn | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Countn | +-+-+-+-+-+-+-+-+

Format of AS-Path Change sub-TLV

Type: TBD7 (suggested value 2) for AS-Path Change is to be assigned by IANA.

Length: n x 5.

ASi: 4 octet. An AS number.

Counti: 1 octet. ASi repeats Counti times.

Li, et al. Expires January 8, 2020 [Page 11]

Page 145: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

The sequence of AS numbers are added to the existing AS Path.

4.3. Capability Negotiation

It is necessary to negotiate the capability to support BGP Extensions for Routing Policy Distribution (RPD). The BGP RPD Capability is a new BGP capability [RFC5492]. The Capability Code for this capability is to be specified by the IANA. The Capability Length field of this capability is variable. The Capability Value field consists of one or more of the following tuples:

+--------------------------------------------------+ | Address Family Identifier (2 octets) | +--------------------------------------------------+ | Subsequent Address Family Identifier (1 octet) | +--------------------------------------------------+ | Send/Receive (1 octet) | +--------------------------------------------------+

BGP RPD Capability

The meaning and use of the fields are as follows:

Address Family Identifier (AFI): This field is the same as the one used in [RFC4760].

Subsequent Address Family Identifier (SAFI): This field is the same as the one used in [RFC4760].

Send/Receive: This field indicates whether the sender is (a) willing to receive Routing Policies from its peer (value 1), (b) would like to send Routing Policies to its peer (value 2), or (c) both (value 3) for the <AFI, SAFI>.

5. Consideration

5.1. Route-Policy

Routing policies are used to filter routes and control how routes are received and advertised. If route attributes, such as reachability, are changed, the path along which network traffic passes changes accordingly.

When advertising, receiving, and importing routes, the router implements certain policies based on actual networking requirements to filter routes and change the attributes of the routes. Routing policies serve the following purposes:

Li, et al. Expires January 8, 2020 [Page 12]

Page 146: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

o Control route advertising: Only routes that match the rules specified in a policy are advertised.

o Control route receiving: Only the required and valid routes are received. This reduces the size of the routing table and improves network security.

o Filter and control imported routes: A routing protocol may import routes discovered by other routing protocols. Only routes that satisfy certain conditions are imported to meet the requirements of the protocol.

o Modify attributes of specified routes Attributes of the routes: that are filtered by a routing policy are modified to meet the requirements of the local device.

o Configure fast reroute (FRR): If a backup next hop and a backup outbound interface are configured for the routes that match a routing policy, IP FRR, VPN FRR, and IP+VPN FRR can be implemented.

Routing policies are implemented using the following procedures:

1. Define rules: Define features of routes to which routing policies are applied. Users define a set of matching rules based on different attributes of routes, such as the destination address and the address of the router that advertises the routes.

2. Implement the rules: Apply the matching rules to routing policies for advertising, receiving, and importing routes.

6. Contributors

The following people have substantially contributed to the definition of the BGP-FS RPD and to the editing of this document:

Peng Zhou Huawei Email: [email protected]

7. Security Considerations

Protocol extensions defined in this document do not affect the BGP security other than those as discussed in the Security Considerations section of [RFC5575].

Li, et al. Expires January 8, 2020 [Page 13]

Page 147: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

8. Acknowledgements

The authors would like to thank Acee Lindem, Jeff Haas, Jie Dong, Lucy Yong, Qiandeng Liang, Zhenqiang Li for their comments to this work.

9. IANA Considerations

This document requests assigning a new AFI in the registry "Address Family Numbers" as follows:

+-----------------------+-------------------------+-------------+ | Code Point | Description | Reference | +-----------------------+-------------------------+-------------+ | TBD (36879 suggested) | Routing Policy AFI |This document| +-------------------------------------------------+-------------+

This document requests assigning a new SAFI in the registry "Subsequent Address Family Identifiers (SAFI) Parameters" as follows:

+-----------------------+-------------------------+-------------+ | Code Point | Description | Reference | +-----------------------+-------------------------+-------------+ | TBD(179 suggested) | Routing Policy SAFI |This document| +-----------------------+-------------------------+-------------+

This document defines a new registry called "Routing Policy NLRI". The allocation policy of this registry is "First Come First Served (FCFS)" according to [RFC8126].

Following code points are defined:

+-------------+-----------------------------------+-------------+ | Code Point | Description | Reference | +-------------+-----------------------------------+-------------+ | 1 | Export Policy |This document| +-------------+-----------------------------------+-------------+ | 2 | Import Policy |This document| +-------------+-----------------------------------+-------------+

This document requests assigning a code-point from the registry "BGP Community Container Atom Types" as follows:

+---------------------+------------------------------+-------------+ | TLV Code Point | Description | Reference | +---------------------+------------------------------+-------------+ | TBD1 (48 suggested) | RouteAttr Atom |This document| +---------------------+------------------------------+-------------+

Li, et al. Expires January 8, 2020 [Page 14]

Page 148: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

This document defines a new registry called "Route Attributes Sub- TLV" under RouteAttr Atom TLV. The allocation policy of this registry is "First Come First Served (FCFS)" according to [RFC8126].

Following Sub-TLV code points are defined:

+-------------+-----------------------------------+-------------+ | Code Point | Description | Reference | +-------------+-----------------------------------+-------------+ | 0 | Reserved | | +-------------+-----------------------------------+-------------+ | 1 | IP Prefix Sub-TLV |This document| +-------------+-----------------------------------+-------------+ | 2 | AS-Path Sub-TLV |This document| +-------------+-----------------------------------+-------------+ | 3 | Community Sub-TLV |This document| +-------------+-----------------------------------+-------------+ | 4 - 255 | To be assigned in FCFS | | +-------------+-----------------------------------+-------------+

This document defines a new registry called "Attribute Change Sub- TLV" under Parameter(s) TLV. The allocation policy of this registry is "First Come First Served (FCFS)" according to [RFC8126].

Following Sub-TLV code points are defined:

+-------------+-----------------------------------+-------------+ | Code Point | Description | Reference | +-------------+-----------------------------------+-------------+ | 0 | Reserved | | +-------------+-----------------------------------+-------------+ | 1 | MED Change Sub-TLV |This document| +-------------+-----------------------------------+-------------+ | 2 | AS-Path Change Sub-TLV |This document| +-------------+-----------------------------------+-------------+ | 3 - 255 | To be assigned in FCFS | | +-------------+-----------------------------------+-------------+

10. References

10.1. Normative References

[I-D.ietf-idr-wide-bgp-communities] Raszuk, R., Haas, J., Lange, A., Decraene, B., Amante, S., and P. Jakma, "BGP Community Container Attribute", draft- ietf-idr-wide-bgp-communities-05 (work in progress), July 2018.

Li, et al. Expires January 8, 2020 [Page 15]

Page 149: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

[RFC1997] Chandra, R., Traina, P., and T. Li, "BGP Communities Attribute", RFC 1997, DOI 10.17487/RFC1997, August 1996, <https://www.rfc-editor.org/info/rfc1997>.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.

[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, DOI 10.17487/RFC4271, January 2006, <https://www.rfc-editor.org/info/rfc4271>.

[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, "Multiprotocol Extensions for BGP-4", RFC 4760, DOI 10.17487/RFC4760, January 2007, <https://www.rfc-editor.org/info/rfc4760>.

[RFC5492] Scudder, J. and R. Chandra, "Capabilities Advertisement with BGP-4", RFC 5492, DOI 10.17487/RFC5492, February 2009, <https://www.rfc-editor.org/info/rfc5492>.

[RFC5575] Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J., and D. McPherson, "Dissemination of Flow Specification Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009, <https://www.rfc-editor.org/info/rfc5575>.

[RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 8126, DOI 10.17487/RFC8126, June 2017, <https://www.rfc-editor.org/info/rfc8126>.

10.2. Informative References

[I-D.ietf-idr-registered-wide-bgp-communities] Raszuk, R. and J. Haas, "Registered Wide BGP Community Values", draft-ietf-idr-registered-wide-bgp-communities-02 (work in progress), May 2016.

Authors’ Addresses

Li, et al. Expires January 8, 2020 [Page 16]

Page 150: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

Zhenbin Li Huawei Huawei Bld., No.156 Beiqing Rd. Beijing 100095 China

Email: [email protected]

Liang Ou China Telcom Co., Ltd. 109 West Zhongshan Ave,Tianhe District Guangzhou 510630 China

Email: [email protected]

Yujia Luo China Telcom Co., Ltd. 109 West Zhongshan Ave,Tianhe District Guangzhou 510630 China

Email: [email protected]

Sujian Lu Tencent Tengyun Building,Tower A ,No. 397 Tianlin Road Shanghai, Xuhui District 200233 China

Email: [email protected]

Huaimo Chen Futurewei Boston, MA USA

Email: [email protected]

Li, et al. Expires January 8, 2020 [Page 17]

Page 151: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP RPD July 2019

Shunwan Zhuang Huawei Huawei Bld., No.156 Beiqing Rd. Beijing 100095 China

Email: [email protected]

Haibo Wang Huawei Huawei Bld., No.156 Beiqing Rd. Beijing 100095 China

Email: [email protected]

Li, et al. Expires January 8, 2020 [Page 18]

Page 152: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Idr Working Group Q. LiangInternet-Draft S. HaresIntended status: Standards Track J. YouExpires: September 22, 2016 Huawei R. Raszuk Nozomi D. Ma Cisco Systems March 21, 2016

Carrying Label Information for BGP FlowSpec draft-liang-idr-bgp-flowspec-label-02

Abstract

This document specifies a method in which the label mapping information for a particular FlowSpec rule is piggybacked in the same Border Gateway Protocol (BGP) Update message that is used to distribute the FlowSpec rule. Based on the proposed method, the Label Switching Routers (LSRs) (except the ingress LSR) on the Label Switched Path (LSP) can use label to indentify the traffic matching a particular FlowSpec rule; this facilitates monitoring and traffic statistics for FlowSpec rules.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on September 22, 2016.

Liang, et al. Expires September 22, 2016 [Page 1]

Page 153: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP FlowSpec March 2016

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1.1. Background . . . . . . . . . . . . . . . . . . . . . . . 2 1.2. MPLS Flow Specification Deployment . . . . . . . . . . . 3 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 3. Overview of Proposal . . . . . . . . . . . . . . . . . . . . 3 4. Protocol Extensions . . . . . . . . . . . . . . . . . . . . . 5 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 7 6. Security considerations . . . . . . . . . . . . . . . . . . . 7 7. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 7 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 7 8.1. Normative References . . . . . . . . . . . . . . . . . . 7 8.2. Informative References . . . . . . . . . . . . . . . . . 8 Authors’ Addresses . . . . . . . . . . . . . . . . . . . . . . . 8

1. Introduction

This section provides the background for proposing a new action for BGP Flow specification that push/pops MPLS or swaps MPLS tags. For those familiar with BGP Flow specification ([RFC5575], [RFC7674], [I-D.ietf-idr-flow-spec-v6], [I-D.ietf-idr-flowspec-l2vpn], [I-D.ietf-idr-bgp-flowspec-oid] and MPLS ([RFC3107]) can skip this background section.

1.1. Background

[RFC5575] defines the flow specification (FlowSpec) that is an n-tuple consisting of several matching criteria that can be applied to IP traffic. The matching criteria can include elements such as source and destination address prefixes, IP protocol, and transport protocol port numbers. A given IP packet is said to match the defined flow if it matches all the specified criteria. [RFC5575]

Liang, et al. Expires September 22, 2016 [Page 2]

Page 154: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP FlowSpec March 2016

also defines a set of filtering actions, such as rate limit, redirect, marking, associated with each flow specification. A new Border Gateway Protocol Network Layer Reachability Information (BGP NLRI) (AFI/SAFI: 1/133 for IPv4, AFI/SAFI: 1/134 for VPNv4) encoding format is used to distribute traffic flow specifications.

[RFC3107] specifies the way in which the label mapping information for a particular route is piggybacked in the same Border Gateway Protocol Update message that is used to distribute the route itself. Label mapping information is carried as part of the Network Layer Reachability Information (NLRI) in the Multiprotocol Extensions attributes. The Network Layer Reachability Information is encoded as one or more triples of the form <length, label, prefix>. The NLRI contains a label is indicated by using Subsequent Address Family Identifier (SAFI) value 4.

[RFC4364] describes a method in which each route within a Virtual Private Network (VPN) is assigned a Multiprotocol Label Switching (MPLS) label. If the Address Family Identifier (AFI) field is set to 1, and the SAFI field is set to 128, the NLRI is an MPLS-labeled VPN- IPv4 address.

1.2. MPLS Flow Specification Deployment

In BGP VPN/MPLS networks when flow specification policy rules exist on multiple forwarding devices in the network bound with labels from one or more LSPs, only the ingress LSR (Label Switching Router) needs to identify a particular traffic flow based on the matching criteria for flow. Once the flow is match by the ingress LSR, the ingress LSR steers the packet to a corresponding LSP (Label Switched Path). Other LSRs of the LSP just need to forward the packet according to the label carried in it.

2. Terminology

This section contains definitions of terms used in this document.

Flow Specification (FlowSpec): A flow specification is an n-tuple consisting of several matching criteria that can be applied to IP traffic, including filters and actions. Each FlowSpec consists of a set of filters and a set of actions.

3. Overview of Proposal

This document proposes adding a BGP-FS action in an extended community alters the label switch path associated with a matched flow. If the match does not have a label switch path, this action is skipped.

Liang, et al. Expires September 22, 2016 [Page 3]

Page 155: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP FlowSpec March 2016

The BGP flow specification (BGP-FS) policy rule could match on the destination prefix and then utilize a BGP-FS action to adjust the label path associated with it (push/pop/swap tags.) Or a BGP-FS policy rule could match on any set of BGP-FS match conditions associated with a BGP-FS action that adjust the label switch path (push/pop/swap).

[I-D.yong-idr-flowspec-mpls-match] provides a match BGP-FS that may be used with this action to match and direct MPLS packets.

Example of Use:

Forwarding information for the traffic from IP1 to IP2 in the Routers:

PE1: in(<IP2,IP1>) --> out(Label2) ASBR1: in(Label2) --> out(Label3) ASBR2: in(Label3) --> out(Label4) PE2: in(Label4) --> out(--)

Labels allocated by flow policy process:

Label4 allocated by PE2 Label3 allocated by ASBR2 Label2 allocated by ASBR1

|<------AS1----->| |<------AS2----->| +-----+ +-----+ +-----+ +-----+ VPN 1,IP1..| PE1 |====|ASBR1|----|ASBR2|====| PE2 |..VPN1,IP2 +-----+ +-----+ +-----+ +-----+ | LDP LSP1 | | LDP LSP2 | | -------> | | -------> | |-------BGP VPN Flowspec LSP---->| (Label1) (Label2) (Label3) (Label4)

Figure 1: Usage of FlowSpec with Label

BGP-FS rule1 (locally configured):

Filters: destination ip prefix:IP2/32 source ip prefix:IP1/32

Actions: Extended Communities traffic-marking: 1 MPLS POP

Note:

Liang, et al. Expires September 22, 2016 [Page 4]

Page 156: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP FlowSpec March 2016

The following Extended Communities are added/deleted

[rule-1a] BGP-FS action MPLS POP [used on PE2] [rule-1b] BGP-FS action SWAP 4 [used on ASBR-2] [rule-1c] BGP-FS action SWAP 3 [used on ASBR-1] [rule-1d] BGP-FS action push 2 [used on PE1]

PE-2 Changes BGP-FS rule-1a to rule-1b prior to sending Clears Extended Community: BGP-FS action MPLS POP Adds Extended Community: BGP-FS action MPLS SWAP 4

ASBR-2 receives BGP-FS rule-1b (NRLI + 2 Extended Community) Installs the BGP-FS rule-1b (MPLS SWAP 4, traffic-marking) Changes BGP-FS rule-1b to rule-1c prior to sending to ASBR1 Clear Extended Community: BGP-FS action MPLS SWAP 4 Adds Extended Community: BGP-FS action MPLS SWAP 3

ASBR-1 Receives BGP-FS rule-1c (NLRI + 2 Extended Community) Installs the BGP-FS rule-1c (MPLS SWAP 3, traffic-marking Changes BGP-FS rule-1c to rule-1d prior to sending to PE-2 Clear Extended Community: BGP-FS action MPLS SWAP 3 Adds Extended Community: BGP-FS action MPLS SWAP 2

PE-1 Receives BGP-FS rule-1d (NLRI + 2 Extended Communities) Installs BGP-FS rule-1d action [MPLS SWAP 2, traffic-marking]

4. Protocol Extensions

In this document, BGP is used to distribute the FlowSpec rule bound with label(s). A new label-action is defined as BGP extended community value based on Section 7 of [RFC5575].

+--------+--------------------+--------------------------+ | type | extended community | encoding | +--------+--------------------+--------------------------+ | TBD1 | label-action | MPLS tag | +--------+--------------------+--------------------------+

Label-action is described below:

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type (TBD1 | OpCode|Reserve| order | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Label | Label | Exp |S| TTL | Stack +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Entry

Liang, et al. Expires September 22, 2016 [Page 5]

Page 157: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP FlowSpec March 2016

The use and the meaning of these fields are as follows:

Type: the same as defined in [RFC4360]

OpCode: Operation code

+------+------------------------------------------------------------+ |OpCode| Function | +------+------------------------------------------------------------+ | 0 | Push the MPLS tag | +------+------------------------------------------------------------+ | 1 | Pop the outermost MPLS tag in the packet | +------+------------------------------------------------------------+ | 2 | Swap the MPLS tag with the outermost MPLS tag in the packet| +------+------------------------------------------------------------+ | 3˜15 | Reserved | +------+------------------------------------------------------------+

When the Opcode field is set to 0, the label stack entry Should be pushed on the MPLS label stack.

When the OpCode field is set to 1, the label stack entry is invalid, and the router SHOULD pop the existing outermost MPLS tag in the packet.

When the OpCode field is set to 2, the router SHOULD swap the label stack entry with the existing outermost MPLS tag in the packet. If the packet has no MPLS tag, it just pushes the label stack entry.

The OpCode 0 or 1 may be used in some SDN networks, such as the scenario described in [I-D.filsfils-spring-segment-routing-central-epe].

The OpCode 2 can be used in traditional BGP MPLS/VPN networks.

Reserved: all zeros.

Order: A FlowSpec rule MAY include one or more ordering label- action(s). If multiple label action extended communities are associated with a BGP-FS Rule, this gives the order of this in the list. The Last action received for an order will be used.

Label: the same as defined in [RFC3032].

Bottom of Stack (S): the same as defined in [RFC3032]. It SHOULD be invalid, and set to zero by default. It MAY be modified by the forwarding router locally.

Liang, et al. Expires September 22, 2016 [Page 6]

Page 158: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP FlowSpec March 2016

Time to Live (TTL): the same as defined in[RFC3032]. It MAY be modified by the forwarding router locally.

Experimental Use (Exp): the same as defined in [RFC3032]. It MAY be modified by the forwarding router according to the local routing policy.

5. IANA Considerations

For the purpose of this work, IANA should allocate the following Extended community:

TBD1 for label-action

6. Security considerations

This extension to BGP does not change the underlying security issues inherent in the existing BGP.

7. Acknowledgement

The authors would like to thank Shunwan Zhuang, Zhenbin Li, Peng Zhou and Jeff Haas for their comments.

8. References

8.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <http://www.rfc-editor.org/info/rfc2119>.

[RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001, <http://www.rfc-editor.org/info/rfc3032>.

[RFC3107] Rekhter, Y. and E. Rosen, "Carrying Label Information in BGP-4", RFC 3107, DOI 10.17487/RFC3107, May 2001, <http://www.rfc-editor.org/info/rfc3107>.

[RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended Communities Attribute", RFC 4360, DOI 10.17487/RFC4360, February 2006, <http://www.rfc-editor.org/info/rfc4360>.

Liang, et al. Expires September 22, 2016 [Page 7]

Page 159: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP FlowSpec March 2016

[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 2006, <http://www.rfc-editor.org/info/rfc4364>.

[RFC5575] Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J., and D. McPherson, "Dissemination of Flow Specification Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009, <http://www.rfc-editor.org/info/rfc5575>.

[RFC7674] Haas, J., Ed., "Clarification of the Flowspec Redirect Extended Community", RFC 7674, DOI 10.17487/RFC7674, October 2015, <http://www.rfc-editor.org/info/rfc7674>.

8.2. Informative References

[I-D.filsfils-spring-segment-routing-central-epe] Filsfils, C., Previdi, S., Patel, K., Shaw, S., Ginsburg, D., and D. Afanasiev, "Segment Routing Centralized Egress Peer Engineering", draft-filsfils-spring-segment-routing- central-epe-05 (work in progress), August 2015.

[I-D.ietf-idr-bgp-flowspec-oid] Uttaro, J., Filsfils, C., Smith, D., Alcaide, J., and P. Mohapatra, "Revised Validation Procedure for BGP Flow Specifications", draft-ietf-idr-bgp-flowspec-oid-02 (work in progress), January 2014.

[I-D.ietf-idr-flow-spec-v6] McPherson, D., Raszuk, R., Pithawala, B., Andy, A., and S. Hares, "Dissemination of Flow Specification Rules for IPv6", draft-ietf-idr-flow-spec-v6-07 (work in progress), March 2016.

[I-D.ietf-idr-flowspec-l2vpn] Weiguo, H., Litkowski, S., and S. Zhuang, "Dissemination of Flow Specification Rules for L2 VPN", draft-ietf-idr- flowspec-l2vpn-03 (work in progress), November 2015.

[I-D.yong-idr-flowspec-mpls-match] Yong , L., Liang , Q., and J. You , "Dissemination of Flow Specification Rules for MPLS Label", March 2016.

Authors’ Addresses

Liang, et al. Expires September 22, 2016 [Page 8]

Page 160: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft BGP FlowSpec March 2016

Qiandeng Liang Huawei 101 Software Avenue, Yuhuatai District Nanjing, 210012 China

Email: [email protected]

Susan Hares Huawei 7453 Hickory Hill Saline, MI 48176 USA

Email: [email protected]

Jianjie You Huawei 101 Software Avenue, Yuhuatai District Nanjing, 210012 China

Email: [email protected]

Robert Raszuk Nozomi

Email: [email protected]

Dan Ma Cisco Systems

Email: [email protected]

Liang, et al. Expires September 22, 2016 [Page 9]

Page 161: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Network Working Group N. WuInternet-Draft S. ZhuangIntended status: Standards Track HuaweiExpires: April 2, 2016 A. Choudhary Cisco Systems Sep 30, 2015

A YANG Data Model for Flow Specification draft-wu-idr-flowspec-yang-cfg-02

Abstract

This document defines a YANG data model for Flow Specification implementations. The data model includes configuration data and state data.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 2, 2016.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of

Wu, et al. Expires April 2, 2016 [Page 1]

Page 162: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3 1.2. Flowspec Model Design . . . . . . . . . . . . . . . . . . 3 1.3. Tree Diagrams . . . . . . . . . . . . . . . . . . . . . . 3 2. Flow Specification data model . . . . . . . . . . . . . . . . 4 3. Flow Specification YANG Module . . . . . . . . . . . . . . . 9 4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 30 5. Security Considerations . . . . . . . . . . . . . . . . . . . 30 6. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 30 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 31 7.1. Normative References . . . . . . . . . . . . . . . . . . 31 7.2. Informative References . . . . . . . . . . . . . . . . . 32 Authors’ Addresses . . . . . . . . . . . . . . . . . . . . . . . 32

1. Introduction

This document defines a YANG [RFC6020] data model for the configuration and state data of Flowspec policies. Any RPC or notification definition is not part of this document. The model is based on Flowspec Policy architecture[RFC5575] and various other internet drafts [I-D.ietf-idr-flow-spec-v6][I-D.ietf-idr-bgp-flowspec-oid]. The configuration data defined in this document is encoded as flow specification rules and which can be distributed as BGP NLRI and/or applied locally to a network element. This distribution of traffic filtering rules through provider backbone can be used to filter Denial of Service attacks (DOS), maintaining some access control lists at network boundary [I-D.litkowski-idr-flowspec-interfaceset] besides other use-cases.

IP routers can classify packets based on multiple header fields. Such a classification rule may contain matching criteria to match on one or more packet header fields. Packets can be grouped to match on such packet rules defining a traffic flow. These traffic flows can be originated based on a manual policy or automatically or a combination of both. These defined traffic flows can be thus distributed.

Wu, et al. Expires April 2, 2016 [Page 2]

Page 163: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

As many vendors have different object constructs to represent the same traffic flow data, it has been tried to design this model in a very flexible, extensible and generic way to fit into most of the vendor requirements.

1.1. Terminology

The following terms are defined in [RFC6020]:

o configuration data

o data model

o module

o state data

The terminology for describing YANG data models is found in [RFC6020].

1.2. Flowspec Model Design

A flowspec policy contains one or more flowspec rules. Each flowspec rule may contain one or more flowspec components. A traffic flow matches a flowspec rule when it matches all the components present in the rule. A flow MAY match one or more flowspec rules. The order in which the flowspec rules are matched in a flowspec policy is defined by RFC 5575.

A flowspec rule contains one or more packet conditioning functions. A packet conditioning function MAY drop, limit the traffic flow rate, mark or redirect network packets to a specified routing instance. A flowspec policy can be stored as an object and used across different network device interfaces.

1.3. Tree Diagrams

A simplified graphical representation of the data model is used in this document. The meaning of the symbols in these diagrams is as follows:

o Brackets "[" and "]" enclose list keys.

o Abbreviations before data node names: "rw" means configuration data (read-write), and "ro" means state data (read-only).

o Symbols after data node names: "?" means an optional node, "!" means a presence container, and "*" denotes a list and leaf-list.

Wu, et al. Expires April 2, 2016 [Page 3]

Page 164: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

o Parentheses enclose choice and case nodes, and case nodes are also marked with a colon (":").

o Ellipsis ("...") stands for contents of subtrees that are not shown.

2. Flow Specification data model

The traffic flow specification data model has the following structure:

module: ietf-flowspec +--rw flowspec +--rw flowspec-cfg | +--rw flowspec-policy* [policy-name] | +--rw policy-name string | +--rw vrf-name? string | +--rw address-family? identityref | +--rw flowspec-rule* [rule-name] | +--rw rule-name string | +--rw flowspec-component* [component-type] | | +--rw component-type component-enum | | +--rw not-operator? boolean | | +--rw (component)? | | +--:(destination-prefix) | | | +--rw destination-prefix? inet:ip-address | | +--:(source-prefix) | | | +--rw source-prefix? inet:ip-address | | +--:(ip-protocol) | | | +--rw ip-protocol* [min max] | | | +--rw min uint8 | | | +--rw max uint8 | | +--:(port) | | | +--rw port* [min max] | | | +--rw min uint16 | | | +--rw max uint16 | | +--:(destination-port) | | | +--rw destination-port* [min max] | | | +--rw min uint16 | | | +--rw max uint16 | | +--:(source-port) | | | +--rw source-port* [min max] | | | +--rw min uint16 | | | +--rw max uint16 | | +--:(icmp-type) | | | +--rw icmp-type* [min max] | | | +--rw min uint8

Wu, et al. Expires April 2, 2016 [Page 4]

Page 165: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

| | | +--rw max uint8 | | +--:(icmp-code) | | | +--rw icmp-code* [min max] | | | +--rw min uint8 | | | +--rw max uint8 | | +--:(tcp-flags) | | | +--rw tcp-flag* [value] | | | +--rw value uint16 | | +--:(packet-length) | | | +--rw packet-length* [min max] | | | +--rw min uint16 | | | +--rw max uint16 | | +--:(dscp) | | | +--rw dscp* [min max] | | | +--rw min uint8 | | | +--rw max uint8 | | +--:(fragment) | | +--rw fragment-value | | +--rw is-fragment? boolean | | +--rw first-fragment? boolean | | +--rw last-fragment? boolean | | +--rw dont-fragment? boolean | +--rw flowspec-action* [action-type] | +--rw action-type action-type | +--rw (action)? | +--:(traffic-rate) | | +--rw rate? float | +--:(redirect) | | +--rw route-target? string | +--:(traffic-marking) | +--rw remark-dscp? dscp-type +--ro flowspec-state +--ro flowspec-rib | +--ro flowspec-entry* | +--ro index? uint32 | +--ro islocal? boolean | +--ro local-name? string | +--ro neighbor? inet:ip-address | +--ro duration? uint32 | +--ro flowspec-protocol-specific | +--ro (protocol)? | +--:(bgp) | +--ro neighbor-router-id? inet:ipv4-address | +--ro as-path? string | +--ro origin? enumeration | +--ro med? uint32 | +--ro local-preference? uint8 | +--ro community? string

Wu, et al. Expires April 2, 2016 [Page 5]

Page 166: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

| +--ro ext-community? string | +--ro preference? uint32 | +--ro originator? inet:ip-address | +--ro cluster-list? string | +--ro flowspec-component* [component-type] | | +--ro component-type component-enum | | +--ro not-operator? boolean | | +--ro (component)? | | +--:(destination-prefix) | | | +--ro destination-prefix? inet:ip-address | | +--:(source-prefix) | | | +--ro source-prefix? inet:ip-address | | +--:(ip-protocol) | | | +--ro ip-protocol* [min max] | | | +--ro min uint8 | | | +--ro max uint8 | | +--:(port) | | | +--ro port* [min max] | | | +--ro min uint16 | | | +--ro max uint16 | | +--:(destination-port) | | | +--ro destination-port* [min max] | | | +--ro min uint16 | | | +--ro max uint16 | | +--:(source-port) | | | +--ro source-port* [min max] | | | +--ro min uint16 | | | +--ro max uint16 | | +--:(icmp-type) | | | +--ro icmp-type* [min max] | | | +--ro min uint8 | | | +--ro max uint8 | | +--:(icmp-code) | | | +--ro icmp-code* [min max] | | | +--ro min uint8 | | | +--ro max uint8 | | +--:(tcp-flags) | | | +--ro tcp-flag* [value] | | | +--ro value uint16 | | +--:(packet-length) | | | +--ro packet-length* [min max] | | | +--ro min uint16 | | | +--ro max uint16 | | +--:(dscp) | | | +--ro dscp* [min max] | | | +--ro min uint8

Wu, et al. Expires April 2, 2016 [Page 6]

Page 167: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

| | | +--ro max uint8 | | +--:(fragment) | | +--ro fragment-value | | +--ro is-fragment? boolean | | +--ro first-fragment? boolean | | +--ro last-fragment? boolean | | +--ro dont-fragment? boolean | +--ro flowspec-action* [action-type] | +--ro action-type action-type | +--ro (action)? | +--:(traffic-rate) | | +--ro rate? float | +--:(redirect) | | +--ro route-target? string | +--:(traffic-marking) | +--ro remark-dscp? dscp-type +--ro flowspec-statistics +--ro flowspec-stats* +--ro vrf-name? string +--ro address-family? string +--ro flowspec-rule-stats* +--ro flowspec-component* [component-type] | +--ro component-type component-enum | +--ro not-operator? boolean | +--ro (component)? | +--:(destination-prefix) | | +--ro destination-prefix? inet:ip-address | +--:(source-prefix) | | +--ro source-prefix? inet:ip-address | +--:(ip-protocol) | | +--ro ip-protocol* [min max] | | +--ro min uint8 | | +--ro max uint8 | +--:(port) | | +--ro port* [min max] | | +--ro min uint16 | | +--ro max uint16 | +--:(destination-port) | | +--ro destination-port* [min max] | | +--ro min uint16 | | +--ro max uint16 | +--:(source-port) | | +--ro source-port* [min max] | | +--ro min uint16 | | +--ro max uint16 | +--:(icmp-type) | | +--ro icmp-type* [min max] | | +--ro min uint8

Wu, et al. Expires April 2, 2016 [Page 7]

Page 168: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

| | +--ro max uint8 | +--:(icmp-code) | | +--ro icmp-code* [min max] | | +--ro min uint8 | | +--ro max uint8 | +--:(tcp-flags) | | +--ro tcp-flag* [value] | | +--ro value uint16 | +--:(packet-length) | | +--ro packet-length* [min max] | | +--ro min uint16 | | +--ro max uint16 | +--:(dscp) | | +--ro dscp* [min max] | | +--ro min uint8 | | +--ro max uint8 | +--:(fragment) | +--ro fragment-value | +--ro is-fragment? boolean | +--ro first-fragment? boolean | +--ro last-fragment? boolean | +--ro dont-fragment? boolean +--ro flowspec-action* [action-type] | +--ro action-type action-type | +--ro (action)? | +--:(traffic-rate) | | +--ro rate? float | +--:(redirect) | | +--ro route-target? string | +--:(traffic-marking) | +--ro remark-dscp? dscp-type +--ro classified-pkts? uint64 +--ro classified-bytes? uint64 +--ro drop-pkts? uint64 +--ro drop-bytes? uint64

This data model defines the configuration and state containers for traffic flow specification. In flowspec-cfg container, there is a list of configuration containers per flow specification route, which contains the configuration for filtering rules and corresponding actions.

The data model for state of traffic flow defines two state containers. Container flowspec-rib contains the current state of flowspec route, including filtering rules and actions. In the second container, there is statistics information for traffic flow specifications. The flowspec-rules in the flowspec-statistics

Wu, et al. Expires April 2, 2016 [Page 8]

Page 169: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

container is listed per routing instance per address family. The list of flowspec-rules is in order of applied rules in the forwarding path.

3. Flow Specification YANG Module

<CODE BEGINS> file "[email protected]"

module ietf-flowspec { namespace "urn:ietf:params:xml:ns:yang:ietf-flowspec"; prefix flowspec;

import ietf-inet-types { prefix inet; }

organization "IETF IDR (Inter-Domain Routing) Working Group"; contact "WG Web: <http://tools.ietf.org/wg/idr/> WG List: <mailto:[email protected]>

WG Chair: Susan Hares <mailto:[email protected]>

WG Chair: John Scudder <mailto:[email protected]>

Editor: Shunwan Zhuang <mailto:[email protected]>

Editor: Nan Wu <mailto:[email protected]>

Editor: Aseem Choudhary <mailto:[email protected]>"; description "This module contains a collection of YANG definitions for configuring flow specification implementations.

Copyright (c) 2014 IETF Trust and the persons identified as authors of the code. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, is permitted pursuant to, and subject to the license terms contained in, the Simplified BSD License set forth in Section 4.c of the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info).";

Wu, et al. Expires April 2, 2016 [Page 9]

Page 170: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

revision 2015-09-15 { description "Initial revision."; reference "[RFC5575] Dissemination of Flow Specification Rules [draft-ietf-netmod-routing-cfg-16] A YANG Data Model for Routing Management. "; }

typedef component-enum { type enumeration { enum "destination-prefix" { value 1; description "Type 1 - Destination Prefix"; } enum "source-prefix" { value 2; description "Type 2 - Source Prefix"; } enum "ip-protocol" { value 3; description "Type 3 - IP Protocol"; } enum "port" { value 4; description "Type 4 - Port"; } enum "destination-port" { value 5; description "Type 5 - Destination port"; } enum "source-port" { value 6; description "Type 6 - Source port"; } enum "icmp-type" { value 7; description "Type 7 - ICMP type"; } enum "icmp-code" {

Wu, et al. Expires April 2, 2016 [Page 10]

Page 171: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

value 8; description "Type 8 - ICMP code"; } enum "tcp-flags" { value 9; description "Type 9 - TCP flags"; } enum "packet-length" { value 10; description "Type 10 - Packet length"; } enum "dscp" { value 11; description "Type 11 - DSCP (Diffserv Code Point)"; } enum "fragment" { value 12; description "Type 12 - Fragment"; } } description "Definition for component type."; }

typedef action-type { type enumeration { enum "traffic-rate" { value 32774; description "Carry the 2-octet id and 4-octet of rate information in IEEE floating point [IEEE.754.1985] format."; } enum "traffic-action" { value 32775; description "Consists of 6 bytes of which only the 2 least significant bits of the 6th byte (from left to right) are currently defined."; } enum "redirect" { value 32776; description "Allows the traffic to be redirected to a VRF

Wu, et al. Expires April 2, 2016 [Page 11]

Page 172: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

routing instance that lists the specified route-target in its import policy."; } enum "traffic-marking" { value 32777; description "Instructs a system to modify the DSCP bits of a transiting IP packet to the corresponding value."; } } description "Definition for action type."; }

typedef dscp-remarked { type enumeration { enum "be" { value 0; description "Default dscp (000000)"; } enum "cs1" { value 8; description "CS1(precedence 1) dscp (001000)"; } enum "af11" { value 10; description "AF11 dscp (001010)"; } enum "af12" { value 12; description "AF12 dscp (001100)"; } enum "af13" { value 14; description "AF13 dscp (001110)"; } enum "cs2" { value 16; description "CS2(precedence 2) dscp (010000)"; } enum "af21" { value 18;

Wu, et al. Expires April 2, 2016 [Page 12]

Page 173: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

description "AF21 dscp (010010)"; } enum "af22" { value 20; description "AF22 dscp (010100)"; } enum "af23" { value 22; description "AF23 dscp (010110)"; } enum "cs3" { value 24; description "CS3(precedence 3) dscp (011000)"; } enum "af31" { value 26; description "AF31 dscp (011010)"; } enum "af32" { value 28; description "AF32 dscp (011100)"; } enum "af33" { value 30; description "AF33 dscp (011110)"; } enum "cs4" { value 32; description "CS4(precedence 4) dscp (100000)"; } enum "af41" { value 34; description "AF41 dscp (100010)"; } enum "af42" { value 36; description "AF42 dscp (100100)"; }

Wu, et al. Expires April 2, 2016 [Page 13]

Page 174: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

enum "af43" { value 38; description "AF43 dscp (100110)"; } enum "cs5" { value 40; description "CS5(precedence 5) dscp (101000)"; } enum "ef" { value 46; description "EF dscp (101110)"; } enum "cs6" { value 48; description "CS6(precedence 6) dscp (110000)"; } enum "cs7" { value 56; description "CS7(precedence 7) dscp (111000)"; } } description "Definition for dscp type."; }

typedef dscp-type { type union { type dscp-remarked; type uint8; } description "Definition for dscp type."; }

typedef float { type union { type decimal64 { fraction-digits 14; range "-9999.99999999999999 .. 9999.99999999999999"; } type decimal64 { fraction-digits 13; range "-99999.9999999999999 .. 99999.9999999999999";

Wu, et al. Expires April 2, 2016 [Page 14]

Page 175: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

} type decimal64 { fraction-digits 12; range "-999999.999999999999 .. 999999.999999999999"; } type decimal64 { fraction-digits 11; range "-9999999.99999999999 .. 9999999.99999999999"; } type decimal64 { fraction-digits 10; range "-99999999.9999999999 .. 99999999.9999999999"; } type decimal64 { fraction-digits 9; range "-999999999.999999999 .. 999999999.999999999"; } type decimal64 { fraction-digits 8; range "-9999999999.99999999 .. 9999999999.99999999"; } type decimal64 { fraction-digits 7; range "-99999999999.9999999 .. 99999999999.9999999"; } type decimal64 { fraction-digits 6; range "-999999999999.999999 .. 999999999999.999999"; } type decimal64 { fraction-digits 5; range "-9999999999999.99999 .. 9999999999999.99999"; } type decimal64 { fraction-digits 4; range "-99999999999999.9999 .. 99999999999999.9999"; } type decimal64 { fraction-digits 3; range "-999999999999999.999 .. 999999999999999.999"; } type decimal64 { fraction-digits 2; range "-9999999999999999.99 .. 9999999999999999.99"; } type decimal64 { fraction-digits 1; range "-99999999999999999.9 .. 99999999999999999.9";

Wu, et al. Expires April 2, 2016 [Page 15]

Page 176: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

} } description "Definition for float point referenced from maillist."; }

identity address-family { description "Base identity from which identities describing address families are derived."; }

identity ipv4 { base address-family; description "This identity represents IPv4 address family."; }

identity ipv6 { base address-family; description "This identity represents IPv6 address family."; }

grouping tcp-flags-value { description "This group defines TCP flags values for tcp-flag component of FlowSpec."; leaf value { type uint16; description "Define the value of TCP flags."; } }

grouping dscp-value { description "This group defines dscp values for DSCP component of FlowSpec."; leaf min { type uint8; description "Define the minimum value of dscp range."; } leaf max { type uint8; must ". >= ../min" { error-message

Wu, et al. Expires April 2, 2016 [Page 16]

Page 177: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

"The max value must be greater than or equal to min value"; description "The max value must be greater than or equal to min value."; } description "Define the maximum value of dscp range."; } }

grouping ip-protocol-value { description "This group defines ip-protocol values for ip-protocol component of FlowSpec."; leaf min { type uint8; description "Define the minimum value of ip-protocol range."; } leaf max { type uint8; must ". >= ../min" { error-message "The max value must be greater than or equal to min value"; description "The max value must be greater than or equal to min value."; } description "Define the maximum value of ip-protocol range."; } }

grouping icmp-type-value { description "This group defines icmp type values for icmp-type component of FlowSpec."; leaf min { type uint8; description "Define the minimum value of icmp-type range."; } leaf max { type uint8; must ". >= ../min" { error-message

Wu, et al. Expires April 2, 2016 [Page 17]

Page 178: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

"The max value must be greater than or equal to min value"; description "The max value must be greater than or equal to min value."; } description "Define the maximum value of icmp-type range."; } }

grouping icmp-code-value { description "This group defines icmp code values for icmp-code component of FlowSpec."; leaf min { type uint8; description "Define the minimum value of icmp-code range."; } leaf max { type uint8; must ". >= ../min" { error-message "The max value must be greater than or equal to min value"; description "The max value must be greater than or equal to min value."; } description "Define the maximum value of icmp-code range."; } }

grouping packet-length-value { description "This group defines packet length values for packet-length component of FlowSpec."; leaf min { type uint16; description "Define the minimum value of packet-length range."; } leaf max { type uint16; must ". >= ../min" { error-message

Wu, et al. Expires April 2, 2016 [Page 18]

Page 179: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

"The max value must be greater than or equal to min value"; description "The max value must be greater than or equal to min value."; } description "Define the maximum value of packet-length range."; } }

grouping port-value { description "This group defines port values for port component of FlowSpec."; leaf min { type uint16; description "Define the minimum value of port range."; } leaf max { type uint16; must ". >= ../min" { error-message "The max value must be greater than or equal to min value"; description "The max value must be greater than or equal to min value."; } description "Define the maximum value of port range."; } }

grouping destination-port-value { description "This group defines destination port values for destination-port component of FlowSpec."; leaf min { type uint16; description "Define the minimum value of destination-port range."; } leaf max { type uint16; must ". >= ../min" { error-message

Wu, et al. Expires April 2, 2016 [Page 19]

Page 180: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

"The max value must be greater than or equal to min value"; description "The max value must be greater than or equal to min value."; } description "Define the maximum value of destination-port range."; } }

grouping source-port-value { description "This group defines source port values for source-port component of FlowSpec."; leaf min { type uint16; description "Define the minimum value of source-port range."; } leaf max { type uint16; must ". >= ../min" { error-message "The max value must be greater than or equal to min value"; description "The max value must be greater than or equal to min value."; } description "Define the maximum value of source-port range."; } }

grouping fragment-value { description "This group defines fragment values for Fragment component of FlowSpec."; leaf is-fragment { type boolean; description "Match if packet is a fragment "; } leaf first-fragment { type boolean; description "Match if packet is first fragment ";

Wu, et al. Expires April 2, 2016 [Page 20]

Page 181: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

} leaf last-fragment { type boolean; description "Match if packet is last fragment "; } leaf dont-fragment { type boolean; description "Match if don’t fragment bit is set in the packet"; } }

grouping flowspec-traffic-filter { description "This group defines the traffic filter rules for FlowSpec."; list flowspec-component { key "component-type"; min-elements 1; description "Define the traffic filter components into a list."; leaf component-type { type component-enum; description "Specify the type of component for this list entry."; } leaf not-operator { type boolean; description "If set to TRUE, the values or ranges specified in the component is not to be matched."; } choice component { description "Define different kinds of flowspec components involved."; case destination-prefix { leaf destination-prefix { type inet:ip-address; description "Specifies the destination address of the traffic."; } } case source-prefix { leaf source-prefix { type inet:ip-address; description "Specifies the source address of the traffic."; }

Wu, et al. Expires April 2, 2016 [Page 21]

Page 182: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

} case ip-protocol { list ip-protocol { key "min max"; uses ip-protocol-value; description "Define the minimum and maximum ranges of an ip-protocol list."; } } case port { list port { key "min max"; uses port-value; description "Define the minimum and maximum ranges of a port list."; } } case destination-port { list destination-port { key "min max"; uses destination-port-value; description "Define the minimum and maximum ranges of a destination-port list."; } } case source-port { list source-port { key "min max"; uses source-port-value; description "Define the minimum and maximum ranges of a source-port list."; } } case icmp-type { list icmp-type { key "min max"; uses icmp-type-value; description "Define the minimum and maximum ranges of an icmp-type list."; } } case icmp-code { list icmp-code {

Wu, et al. Expires April 2, 2016 [Page 22]

Page 183: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

key "min max"; uses icmp-code-value; description "Define the minimum and maximum ranges of an icmp-code list."; } } case tcp-flags { list tcp-flag { key "value"; uses tcp-flags-value; description "Defines list of tcp-flag entries."; } } case packet-length { list packet-length { key "min max"; uses packet-length-value; description "Define the minimum and maximum ranges of a packet-length list."; } } case dscp { list dscp { key "min max"; uses dscp-value; description "Define the minimum and maximum ranges of a dscp list."; } } case fragment { container fragment-value { uses fragment-value; description "Define Fragment flags value."; } } } } }

grouping flowspec-traffic-action { description "This group defines the traffic actions for FlowSpec."; list flowspec-action {

Wu, et al. Expires April 2, 2016 [Page 23]

Page 184: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

key "action-type"; description "Define the traffic actions of FlowSpec into a list."; leaf action-type { type action-type; description "Specify the type of traffic filter action."; } choice action { description "Define different kinds of traffic actions involved."; case traffic-rate { leaf rate { type float; description "Specifies the traffic rate in IEEE floating point [IEEE.754.1985] format, units being bytes per second."; } } case redirect { leaf route-target { type string { length "3..21"; } description "Allows the traffic to be redirected to a VRF routing instance that lists the specified route-target in its import policy"; } } case traffic-marking { leaf remark-dscp { type dscp-type; description "Instructs a system to modify the DSCP bits of a transiting IP packet to the corresponding value"; } } } } }

grouping flowspec-bgp-route { description "This group define extensions of FlowSpec for bgp protocol."; leaf neighbor-router-id { type inet:ipv4-address;

Wu, et al. Expires April 2, 2016 [Page 24]

Page 185: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

description "The router-id of the neighbor from whom this route received."; } leaf as-path { type string; description "Number of the AS the BGP FlowSpec route passes."; } leaf origin { type enumeration { enum "igp" { value 0; description "Designate this route originated from igp route."; } enum "egp" { value 1; description "Designate this route originated from egp route."; } enum "incomplete" { value 2; description "Designate this route originated from other route."; } } description "The Origin attribute defines the origin of a route. The Origin attribute is classified into the following types:

Interior Gateway Protocol (IGP): This attribute type has the highest priority. IGP is the Origin attribute for routes obtained through an IGP in the AS from which the routes originate. For example, the Origin attribute of the routes imported to the BGP routing table using the network command is IGP.

Exterior Gateway Protocol (EGP): This attribute type has the second highest priority. The Origin attribute of the routes obtained through EGP is EGP.

Incomplete: This attribute type has the lowest priority. Incomplete is the Origin attribute type of all routes that do not have the IGP or EGP Origin attribute. For example, the Origin attribute of the routes imported is Incomplete."; } leaf med {

Wu, et al. Expires April 2, 2016 [Page 25]

Page 186: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

type uint32; description "The Multi-Exit-Discriminator (MED) is transmitted only between two neighboring ASs."; } leaf local-preference { type uint8; description "The local preference of the BGP FlowSpec route."; } leaf community { type string; description "Community attribute of the BGP FlowSpec route."; } leaf ext-community { type string; description "Extended community attribute of the BGP FlowSpec route."; } leaf preference { type uint32; description "The preferred value of a protocol."; } leaf originator { type inet:ip-address; description "The address of the advertiser of the BGP FlowSpec route."; } leaf cluster-list { type string; description "Cluster list attribute of the BGP FlowSpec route."; } uses flowspec-traffic-filter; uses flowspec-traffic-action; }

grouping flowspec-protocol-specific { description "This group define extensions of FlowSpec per protocols."; choice protocol { description "Define specific part of FlowSpec when carried in different protocols."; case bgp { uses flowspec-bgp-route;

Wu, et al. Expires April 2, 2016 [Page 26]

Page 187: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

} } }

container flowspec { description "Container for flowspec configuration and state"; container flowspec-cfg { description "Configuration for flow specification."; list flowspec-policy { key "policy-name"; description "Configuration of a flow route list."; leaf policy-name { type string; description "The name of a flow route."; } leaf vrf-name { type string; description " Vrf-name of the flowspec-rule"; } leaf address-family { type identityref { base address-family; } description " address-family of the flowspec-rule"; } list flowspec-rule { key "rule-name"; ordered-by system; leaf rule-name { type string; description "The name of a flowspec rule."; } uses flowspec-traffic-filter; uses flowspec-traffic-action; description "Define flow specification filters."; } } } container flowspec-state { config false;

Wu, et al. Expires April 2, 2016 [Page 27]

Page 188: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

description "Operational state of flow specification."; container flowspec-rib { description "Define the operational state data for FlowSpec entries."; list flowspec-entry { description "FlowSpec entries are organized into list of routes."; leaf index { type uint32; description "Flow Specification route entry index."; } leaf islocal { type boolean; description "Locally configured Flow Specification route."; } leaf local-name { type string; description "The name of locally configured FlowSpec route."; } leaf neighbor { type inet:ip-address; description "IP address of an advertising device"; } leaf duration { type uint32; description "Route duration in seconds."; } container flowspec-protocol-specific { description "Define the specific extension for each protocol."; uses flowspec-protocol-specific; } } } container flowspec-statistics { description "Define the statistics of list of flowspec rules."; list flowspec-stats { description "Statistics of list of flowspec rules per VRF & Address-family."; leaf vrf-name {

Wu, et al. Expires April 2, 2016 [Page 28]

Page 189: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

type string; description "Vrf-name of the set of flowspec-rules"; } leaf address-family { type string; description "Address-family of the set of flowspec-rules"; } list flowspec-rule-stats { description " This defines the flowspec filter statistics of each flowspec-rules. "; uses flowspec-traffic-filter; uses flowspec-traffic-action; leaf classified-pkts { type uint64; description " Number of total packets which matched to the flowspec-filter"; } leaf classified-bytes { type uint64; description " Number of total bytes which matched to the flowspec-filter"; } leaf drop-pkts { type uint64; description " Number of total packets which got dropped"; } leaf drop-bytes { type uint64; description " Number of total bytes which got dropped"; } } } } } } }

<CODE ENDS>

Wu, et al. Expires April 2, 2016 [Page 29]

Page 190: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

4. IANA Considerations

This document registers a URI in the "IETF XML Registry" [RFC3688]. Following the format in RFC 3688, the following registration has been made.

URI: urn:ietf:params:xml:ns:yang:ietf-flowspec

Registrant Contact: The RTGWG WG of the IETF.

XML: N/A; the requested URI is an XML namespace.

This document registers a YANG module in the "YANG Module Names" registry [RFC6020].

Name: ietf-flowspec

Namespace: urn:ietf:params:xml:ns:yang:ietf-flowspec

Prefix: flowspec

Reference: RFC XXXX

5. Security Considerations

The YANG module defined in this memo is designed to be accessed via the NETCONF protocol [RFC6241]. The lowest NETCONF layer is the secure transport layer and the mandatory-to-implement secure transport is SSH [RFC6242]. The NETCONF access control model [RFC6536] provides the means to restrict access for particular NETCONF users to a pre-configured subset of all available NETCONF protocol operations and content.

There are a number of data nodes defined in the YANG module which are writable/creatable/deletable (i.e., config true, which is the default). These data nodes may be considered sensitive or vulnerable in some network environments. Write operations (e.g., <edit-config>) to these data nodes without proper protection can have a negative effect on network operations.

6. Acknowledgments

The editor of this document wishes to thank Andrew Mao for the guidance and support in coming up with this draft.

Wu, et al. Expires April 2, 2016 [Page 30]

Page 191: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

7. References

7.1. Normative References

[I-D.ietf-idr-bgp-flowspec-oid] Uttaro, J., Filsfils, C., Smith, D., Alcaide, J., and P. Mohapatra, "Revised Validation Procedure for BGP Flow Specifications", draft-ietf-idr-bgp-flowspec-oid-02 (work in progress), January 2014.

[I-D.ietf-idr-flow-spec-v6] Raszuk, R., Pithawala, B., McPherson, D., and A. Andy, "Dissemination of Flow Specification Rules for IPv6", draft-ietf-idr-flow-spec-v6-06 (work in progress), November 2014.

[I-D.litkowski-idr-flowspec-interfaceset] Litkowski, S., Simpson, A., Patel, K., and J. Haas, "Applying BGP flowspec rules on a specific interface set", draft-litkowski-idr-flowspec-interfaceset-02 (work in progress), March 2015.

[RFC3688] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688, DOI 10.17487/RFC3688, January 2004, <http://www.rfc-editor.org/info/rfc3688>.

[RFC5575] Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J., and D. McPherson, "Dissemination of Flow Specification Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009, <http://www.rfc-editor.org/info/rfc5575>.

[RFC6020] Bjorklund, M., Ed., "YANG - A Data Modeling Language for the Network Configuration Protocol (NETCONF)", RFC 6020, DOI 10.17487/RFC6020, October 2010, <http://www.rfc-editor.org/info/rfc6020>.

[RFC6241] Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed., and A. Bierman, Ed., "Network Configuration Protocol (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011, <http://www.rfc-editor.org/info/rfc6241>.

[RFC6242] Wasserman, M., "Using the NETCONF Protocol over Secure Shell (SSH)", RFC 6242, DOI 10.17487/RFC6242, June 2011, <http://www.rfc-editor.org/info/rfc6242>.

Wu, et al. Expires April 2, 2016 [Page 31]

Page 192: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft YANG for Flowspec Sep 2015

[RFC6536] Bierman, A. and M. Bjorklund, "Network Configuration Protocol (NETCONF) Access Control Model", RFC 6536, DOI 10.17487/RFC6536, March 2012, <http://www.rfc-editor.org/info/rfc6536>.

7.2. Informative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <http://www.rfc-editor.org/info/rfc2119>.

Authors’ Addresses

Nan Wu Huawei Huawei Bld., No.156 Beiqing Rd. Beijing 100095 China

Email: [email protected]

Shunwan Zhuang Huawei Huawei Bld., No.156 Beiqing Rd. Beijing 100095 China

Email: [email protected]

Aseem Choudhary Cisco Systems 170 W. Tasman Drive San Jose, CA 95134 USA

Email: [email protected]

Wu, et al. Expires April 2, 2016 [Page 32]

Page 193: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

IDR WG Sandy. ZhangInternet-Draft ZTE CorporationIntended status: Standards Track Ying. ChengExpires: January 7, 2016 China Unicom July 6, 2015

Upstream Assigned Label Collision Solution draft-zhang-idr-upstream-assigned-label-solution-00.txt

Abstract

The upstream-assigned label is used to identify the specific multicast flow. In MVPN technology RFC6513, it says: "This procedure requires each egress PE to support a separate label space for every other PE. The egress PEs creates a forwarding entry for the upstream-assigned MPLS label, allocated by the ingress PE, in this label space."

In common scenario, the "separate label space" is planned by network administrator in advance. For the stable network, the method of allocating the separate label space is easy and practicable. But when the network changes dynamically, it is very hard to allocate the separate label space in advance. The planned label space is probably insufficient. That leads to collision when PE uses the common label space to allocate upstream-assigned label. Therefore we need a flexible solution for the collision of upstream-assigned label in the dynamically changing network.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 7, 2016.

Zhang & Cheng Expires January 7, 2016 [Page 1]

Page 194: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft UPSTREAM LABEL COLLISION SOLUTION July 2015

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 2. Timestamp Attribute . . . . . . . . . . . . . . . . . . . . . 3 2.1. Applicability Restriction and Cautions . . . . . . . . . 4 2.2. Handing a malformed UALRT Attribute . . . . . . . . . . . 4 2.3. Restriction . . . . . . . . . . . . . . . . . . . . . . . 4 3. Originating and Sending the UALRT attribute . . . . . . . . . 4 4. Receiving the UALRT attribute . . . . . . . . . . . . . . . . 5 5. Security Considerations . . . . . . . . . . . . . . . . . . . 5 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 5 7. Normative References . . . . . . . . . . . . . . . . . . . . 5 Authors’ Addresses . . . . . . . . . . . . . . . . . . . . . . . 6

1. Introduction

As the development of MVPN technology, data center interconnection technology, and BIER (Bit Indexed Explicit Replication) technology, etc. Multicast is more and more used. Upstream-assigned label is used for egress PEto indicate a specific multicast flow. uppose at a L3VPN network which was used for MVPN only is expanded to support the interconnection of DC. Many VxLan which each need an upstream- assigned label to indicate lead to lack of the label space and it will comsumes a lot of manpower to adjust the label space on every PE. Adjusting the label space on every PE will be difficult and cost more manpower.

In this document, we define a new BGP attribute, named UALRT: "upstream-assigned label route’s timestamp". When the collision of upstream-assigned label occurs, all the PEs in the L3VPN, will find out those routes which are feasible with use of the common algorithm. And the PE which announces the collision upstream-assigned label will adjust the label to avoid collision.

Zhang & Cheng Expires January 7, 2016 [Page 2]

Page 195: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft UPSTREAM LABEL COLLISION SOLUTION July 2015

------- | PEa | --------------------------------- ------- | | ------- | MVPN | | PEc | ------- | | ------- | PEb | --------------------------------- -------

Figure 1: collision of upstream-assigned label in MVPN

For example, PEa and PEb send MVPN flows to PEc with a same I-PMSI. PEa send (vpn1, S1, G1) flows to PEc, PEb send (vpn1, S2, G2) flows to PEc similar. PEa and PEb advertise the two multicast routes by upstream-assigned label to let PEc distinguish the two flows. If network manager has not programmed the separate space for upstream- assigned label on every PE, PEa and PEb may advertise the same label for each route. When PEc receives the two multicast flows from the common I-PMSI tunnel, PEc can not distinguish the two flows correctly, and PEc then forward the two flows to wrong receivers.

2. Timestamp Attribute

The UALRT attribute is an optional, non-transitive BGP path attribute. The attribute type code for the UALRT attribute is TBD.

The value field of the UALRT attribute is defined here to be a set of elements encoded as "Type/Length/Value" (i.e., a set of TLVs). Each such TLV is encoded as shown in Figure 1.

0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ˜ ˜ | Value +-+-+-+-+-+-+-+-| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-| Figure 2: UALRT TLV

o Type: A single octet encoding the TLV Type. Only type 1, "UALRT TLV", is defined in this document. Use of other TLV types is outside the scope of this document.

o Length: Two octets encoding the length in octets of the TLV, including the Type and Length fields. The length is encoded as an unsigned binary integer.

o Value: A field containing eight octets of timestamp.

Zhang & Cheng Expires January 7, 2016 [Page 3]

Page 196: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft UPSTREAM LABEL COLLISION SOLUTION July 2015

This document defines only a single such TLV, the "UALRT TLV". The UALRT TLV is encoded as follows:

o Type: 1

o Length: 11

o Value: timestamp.

The value field of the UALRT TLV is always 8 octets long, and its value is interpreted as an unsigned 64-bit integer. The time clock when the label is assigned are frequently expressed as 4-octet fragment. By using an 8-octet field, we can be more accurate for high rate clock.

When an UALRT attribute is created, it SHOULD contain no more than one UALRT TLV. However, if it contains more than one UALRT TLV, only the first one is used as described the generation of routes.

2.1. Applicability Restriction and Cautions

The documents only consider the use of UALRT attribute in MVPN networks, the interconnection of data center, BIER, etc, where the PEs generate upstream-assigned label for multicast routes.

2.2. Handing a malformed UALRT Attribute

When receiving a BGP Update message which contains a malformed UALRT attribute, the attribute should be treated as if it was an unrecognized non-transitive attribute. That is, it MUST be quietly ignored and not passed along to other BGP peers.

If the UALRT attribute is received and its value is set to 0x0 or the maximum value of 0xFFFFFFFFFFFFFFFF, the attribute should be considered to be malformed and SHOULD be ignored.

2.3. Restriction

In this document we discuss the UALRT attribute should be used in intranet of network. The use of UALRT attribute across multi-ASs May be out of the scope of this document.

3. Originating and Sending the UALRT attribute

When a PE decides to generate a UALRT attribute for multicast routes with upstream-assigned label, the PE should set the value of UALRT to the time when the route is generated. The PE sends the route with UALRT attribute and other attributes to other PEs to let all the PEs

Zhang & Cheng Expires January 7, 2016 [Page 4]

Page 197: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft UPSTREAM LABEL COLLISION SOLUTION July 2015

receive the routes with UALRT attribute. The UALRT attribute MUST be transferred by RR. To avoid all the PEs to generate label in the first label space, each PE SHOULD start assign the label with a random offset.

4. Receiving the UALRT attribute

When a PE received an upstream-assigned label multicast routes, it will check if there is a BGP UALRT attribute. If the UALRT value is correct, the PE will store the route with the UALRT. Then PE will check if the label is conflict with any exist upstream-assigned label, include the lable generated by the PE or other PEs. If there is a collision, the PE MUST choose the one which timestamp was earlier as the legal route. If the timestamp of two conflict routes are equal, the PE will use the BGP peer address to judge which one is legal. The PE SHOULD choose the one which was advertised by the peer with smaller BGP peer address or larger BGP peer address.

If the other conflict label is generated by the PE itself, the PE will adjust the upstream-assigned label to other value, and the PE MUST re-advertise a route with new label and new UALRT attribute. If the other conflict route was not generated by the PE itself, the PE will do nothing until other PE re-advertise a route with new upstream-assigned label and the UALRT attribute. When the PE receives route with new upstream-assigned label and the UALRT attribute, the PE will repeat the store and conflict check function.

5. Security Considerations

For general BGP Security Considerations.

6. IANA Considerations

IANA is requested to allocate BGP path attribute for UALRT TLVs.

7. Normative References

[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

[RFC5331] Aggarwal, R., Rekhter, Y., and E. Rosen, "MPLS Upstream Label Assignment and Context-Specific Label Space", RFC 5331, August 2008.

[RFC6513] Rosen, E. and R. Aggarwal, "Multicast in MPLS/BGP IP VPNs", RFC 6513, February 2012.

Zhang & Cheng Expires January 7, 2016 [Page 5]

Page 198: This document describes modifications of BGP Labeled ... · This document describes modifications of BGP Labeled Unicast (BGP-LU) procedures for label distribution in a partitioned

Internet-Draft UPSTREAM LABEL COLLISION SOLUTION July 2015

[RFC7432] Sajassi, A., Aggarwal, R., Bitar, N., Isaac, A., Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet VPN", RFC 7432, February 2015.

Authors’ Addresses

Sandy Zhang ZTE Corporation No. 50 Software Ave, Yuhuatai Distinct Nanjing 210000 China

Phone: +86-025-88014634 Email: [email protected]

Ying Cheng China Unicom Beijing China

Phone: +86-10-68799999-7702 Email: [email protected]

Zhang & Cheng Expires January 7, 2016 [Page 6]