Network Working Group                                        Amit Bhagat
Internet-Draft                                             July 06, 2016
Intended status: Standards Track
Expires: January 06, 2017


                              BGP Overload
                      draft-bhagat-bgp-overload-01

Abstract

   This document presents a new BGP feature that allows a BGP speaker
   to participate in prefix distribution without carrying transit
   traffic.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 06, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

1. Introduction

   In IS-IS, the Overload bit is carried in the Link State PDU (LSP)
   header.  In OSPFv2, the H-bit in the Router-LSA is defined in
   [H-BIT].

   In data center networks where Clos fabrics are built solely using
   BGP, operators drain transit traffic from a router by increasing
   the MED, prepending the AS_PATH, or manipulating other BGP
   attributes.  In every case, the operator has to stay aware of the
   topology.

   This document presents a feature similar to the IS-IS Overload bit
   and (to an extent) the OSPFv2 H-bit.  The primary use case is to
   drain transit traffic from a BGP router.

2. Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3. Capability Support

   A new capability is communicated in the Capabilities Optional
   Parameter of the BGP OPEN message.  A BGP speaker SHOULD use the
   Capability Advertisement procedure [BGP-CAP] to announce support
   for the feature.  The Capability Code is to be assigned by IANA.

      Capability Code:    TBA by IANA

      Capability Length:  Variable.

      Capability Value:   Lists every AFI/SAFI pair configured on the
                          BGP speaker that supports the feature.

         +---------------------------+
         |            AFI            |
         +---------------------------+
         |           SAFI            |
         +---------------------------+

4. Multiprotocol BGP Extensions

   BGP Overload is advertised in BGP UPDATE messages using the
   MP_REACH_NLRI and MP_UNREACH_NLRI attributes.  The [AFI, SAFI] value
   pair used to identify this NLRI is:

      AFI  = 1 or 2 (1 for IPv4, 2 for IPv6)
      SAFI = to be assigned by IANA

5. NLRI Encoding

   The NLRI field in the MP_REACH_NLRI and MP_UNREACH_NLRI attributes
   is encoded as defined in Section 4 of [MP-BGP].  The structure of
   the prefix is defined as follows:

         +---------------------------+
         |   Router ID (4 octets)    |
         +---------------------------+
         |     Flags (2 octets)      |
         +---------------------------+
         |      AFI (2 octets)       |
         +---------------------------+
         |      SAFI (1 octet)       |
         +---------------------------+

   Router ID:  BGP Router ID of the BGP speaker sending the BGP UPDATE
      message.

   Flags:  Indicate to peers the reason for the BGP UPDATE message.

         +---------------------------+
         |     Reserved      | C | M |
         +---------------------------+

         M (1 bit):  Set for "Maintenance".
         C (1 bit):  Set when the Established neighbor count threshold
                     is breached.
         Reserved:   Set to 0.

   AFI:  Address Family Identifier.

   SAFI:  Subsequent Address Family Identifier.

   The AFI/SAFI pair identifies which prefix types are affected.  If
   multiple supported AFI/SAFI pairs are affected, they are all listed
   in the encoding.

6. Operation

   When a BGP speaker is required to drain traffic, the operator
   configures BGP Overload.  This triggers the BGP speaker to send a
   BGP UPDATE message carrying the Overload NLRI to all peers that
   support the feature.  The BGP UPDATE message MUST include the
   well-known community NO_ADVERTISE so that it is not propagated to
   other peers.  The advertising BGP speaker then follows up with an
   End-of-RIB marker to signal the end of the update.

   Upon receiving a BGP UPDATE message with the Overload NLRI, the
   receiving BGP speaker should examine the Adj-RIB-In for the
   advertising BGP speaker.  It should re-evaluate the prefixes that
   the "overloaded" BGP speaker advertised and run the BGP best-path
   selection process for every AFI/SAFI that the overloaded BGP speaker
   announced in the BGP UPDATE message, and it should not use any
   prefix for which the advertising BGP speaker is a transit hop.  The
   selection process, however, should still use the advertising BGP
   speaker for all prefixes that it originates.  Whether the speaker is
   the originator is determined by checking the AS_PATH attribute.  The
   receiving BGP speaker may send further BGP UPDATE messages
   downstream to withdraw prefixes that were impacted by the overloaded
   BGP speaker.

   A "global" configuration of BGP Overload should affect all AFI/SAFI
   pairs for which the BGP speaker advertised the capability.

7. Other Use Cases

7.1 Established neighbor count threshold

   Consider a Leaf router with connections to 10 Spine routers in a
   Clos fabric running BGP as the routing protocol.
   It receives exactly the same prefixes from all Spine routers.  If
   links to 5 Spine routers go down, the capacity of this Leaf router
   is reduced to 50%.  This can cause congestion because the peers
   still see the best path via the Leaf router.

   With BGP Overload, the router can monitor the "Established" neighbor
   count.  If the count drops below the threshold for a configurable
   duration, the router sends a BGP UPDATE message with the Overload
   NLRI and the "C" flag set to declare itself overloaded, so that
   peers stop using it for transit traffic.  Once the Established
   neighbor count is back above the threshold, the BGP speaker
   advertises the Overload NLRI with the "C" flag cleared.

8. IANA Considerations

   As specified in this document, IANA will assign a new Capability
   Code.  IANA will also assign a new SAFI value for use in the
   MP_REACH_NLRI and MP_UNREACH_NLRI attributes.

9. Security Considerations

   This document introduces no new security concerns to BGP or other
   specifications referenced in this document.

10. References

   [BGP-CAP]  Chandra, R. and J. Scudder, "Capabilities Advertisement
              with BGP-4", RFC 2842, May 2000.

   [H-BIT]    Patel, K. et al, "H-bit Support for OSPFv2",
              draft-ietf-ospf-ospfv2-hbit-00, October 2015.

   [MP-BGP]   Bates, T. et al, "Multiprotocol Extensions for BGP-4",
              RFC 2858, June 2000.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

11. Author's Address

   Amit Bhagat

   Email: scet.amit@gmail.com
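
   The following non-normative Python sketch illustrates the prefix
   structure of Section 5 and the AS_PATH check described in Section 6.
   It is not part of the specification and rests on several
   assumptions: the bit positions of the M and C flags are read off
   the Flags diagram with M as the least significant bit, the names
   OverloadNlri, FLAG_M, FLAG_C, and is_transit are hypothetical, the
   length octet that the MP-BGP NLRI encoding places before the prefix
   is left out, and the AS_PATH is modeled as a flat list of AS numbers
   learned over an external BGP session.

```python
import struct
from dataclasses import dataclass
from typing import Sequence

# Assumed bit positions: the draft draws "Reserved | C | M" but does
# not number the bits, so M is taken as the least significant bit.
FLAG_M = 0x0001  # set for "Maintenance"
FLAG_C = 0x0002  # set when the Established neighbor count threshold is breached

# Router ID (4 octets), Flags (2), AFI (2), SAFI (1), network byte order.
_NLRI = struct.Struct("!IHHB")


@dataclass(frozen=True)
class OverloadNlri:
    """Hypothetical in-memory form of the Section 5 prefix structure."""

    router_id: int  # BGP Router ID as a 32-bit unsigned integer
    flags: int      # combination of FLAG_M / FLAG_C; reserved bits are 0
    afi: int        # 1 for IPv4, 2 for IPv6
    safi: int       # SAFI of the affected prefixes

    def pack(self) -> bytes:
        """Serialize to the 9-octet prefix (length octet not included)."""
        return _NLRI.pack(self.router_id, self.flags, self.afi, self.safi)

    @classmethod
    def unpack(cls, data: bytes) -> "OverloadNlri":
        """Parse exactly 9 octets; struct.error is raised otherwise."""
        return cls(*_NLRI.unpack(data))


def is_transit(as_path: Sequence[int], neighbor_as: int) -> bool:
    """Return True if the overloaded neighbor is only a transit hop.

    Assumption: a route counts as originated by the neighbor when every
    AS number in the path is the neighbor's own AS (which also covers
    AS_PATH prepending). Any other AS in the path means the neighbor is
    forwarding someone else's prefix.
    """
    return any(asn != neighbor_as for asn in as_path)
```

   A receiver following Section 6 would keep routes from the
   overloaded speaker for which is_transit() returns False and exclude
   the others from best-path selection for the AFI/SAFI named in the
   NLRI.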