© 2001, Cisco Systems, Inc. All rights reserved. ISP Workshops
BGP Deployment & Scalability
Mike Pennington
Network Consulting Engineer
Cisco Systems, Denver
Basic BGP Review
Border Gateway Protocol
• Routing Protocol used to exchange routing information between networks
exterior gateway protocol
• RFC1771
work in progress to update
draft-ietf-idr-bgp4-17.txt
• Currently Version 4
• Runs over TCP
BGP
• Path Vector Protocol
• Incremental Updates
• Many options for policy enforcement
• Classless Inter Domain Routing (CIDR)
• Widely used for Internet backbone
• Autonomous systems
Path Vector Protocol
• BGP is classified as a path vector routing protocol (see RFC 1322)
A path vector protocol defines a route as a pairing between a destination and the attributes of the path to that destination.
12.6.126.0/24 207.126.96.43 1021 0 6461 7018 6337 11268 i
AS Path: 6461 7018 6337 11268
AS-Path
• Sequence of ASes a route has traversed
• Loop detection
• Apply policy
[Figure: AS 100, AS 200, AS 300, AS 400 and AS 500, with prefixes 180.10.0.0/16, 170.10.0.0/16 and 150.10.0.0/16. Routes shown with their AS paths:]
180.10.0.0/16 300 200 100
170.10.0.0/16 300 200
150.10.0.0/16 300 400
AS-Path loop detection
[Figure: AS 100, AS 200, AS 300 and AS 500, with prefixes 180.10.0.0/16, 170.10.0.0/16 and 140.10.0.0/16. Routes shown with their AS paths:]
180.10.0.0/16 300 200 100
170.10.0.0/16 300 200
140.10.0.0/16 300
140.10.0.0/16 500 300
170.10.0.0/16 500 300 200
AS 500 does not announce 180.10.0.0/16 to AS 100: it sees that the route originated in AS 100 and that AS 100 is the neighbouring AS. This is loop detection in action.
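The loop check can be sketched in a few lines of Python. This is an illustrative model rather than router code: the function name `accepts_route` and the list-of-integers path representation are assumptions made for the example. The rule it models is that a route whose AS path already contains the local AS number has looped and is not usable.

```python
def accepts_route(local_as: int, as_path: list[int]) -> bool:
    """Return True if a route with this AS path is usable by local_as."""
    # A path that already contains our own AS number has looped back to us.
    return local_as not in as_path


# AS paths as AS 500 would present them to AS 100.
# The 180.10.0.0/16 entry is hypothetical: the slide says it is never
# announced, so this shows what the path would look like if it were.
offers = {
    "180.10.0.0/16": [500, 300, 200, 100],
    "170.10.0.0/16": [500, 300, 200],
    "140.10.0.0/16": [500, 300],
}

usable = sorted(p for p, path in offers.items() if accepts_route(100, path))
print(usable)
```

Only the two prefixes whose paths do not contain 100 survive the check.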
Autonomous System (AS)
• Collection of networks with same routing policy
• Single routing protocol
• Usually under single ownership, trust and administrative control
BGP Basics
[Figure: routers A, B, C, D and E across AS 100, AS 101 and AS 102, linked by peering sessions]
BGP speakers are called peers
BGP General Operation
• Learns multiple paths via internal and external BGP speakers
• Picks the best path and installs in the forwarding table
• Policies applied by influencing the best path selection
External BGP Peering (eBGP)
• Between BGP speakers in different AS
• Should be directly connected
• Do not run an IGP between eBGP peers
[Figure: AS 100 and AS 101 with routers A, B and C; an eBGP session crosses the AS boundary]
Internal BGP Peering (iBGP)
• Topology independent
• Each iBGP speaker must peer with every other iBGP speaker in the AS
[Figure: AS 100 with routers A, B, D and E peering with each other over iBGP]
Internal BGP (iBGP)
• BGP peer within the same AS
• Not required to be directly connected
• iBGP speakers need to be fully meshed
they originate connected networks
they do not pass on prefixes learned from other iBGP speakers
BGP Attributes
What Is an Attribute?
• Describes the characteristics of prefix
• Transitive or non-transitive
• Some are mandatory
[Figure: an update carrying attributes: Next Hop, AS Path, ..., MED, ...]
AS-Path
• Sequence of ASes a route has traversed
• Loop detection
• Apply policy
[Figure: AS 100, AS 200, AS 300, AS 400 and AS 500, with prefixes 180.10.0.0/16, 170.10.0.0/16 and 150.10.0.0/16. Routes shown with their AS paths:]
180.10.0.0/16 300 200 100
170.10.0.0/16 300 200
150.10.0.0/16 300 400
Next Hop
• Next hop to reach a network
• In an eBGP session the next hop is usually an address on the network shared with the peer
[Figure: AS 100, AS 200 and AS 300 with routers A and B; link addresses 150.10.1.1 and 150.10.1.2. Routes shown with their next hops:]
150.10.0.0/16 150.10.1.1
160.10.0.0/16 150.10.1.1
Next Hop (continued)
• IGP should carry route to next hops
• Recursive route look-up
• Unlinks BGP from actual physical topology
• Allows IGP to make intelligent forwarding decision
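A minimal sketch of the recursive look-up, assuming two simplified tables held as Python dictionaries. The table contents, the interface name `Serial0` and the helper names are hypothetical; a real router keeps these in its RIB and FIB. BGP supplies a next-hop address for the destination, and a second look-up in the IGP table decides how to reach that next hop.

```python
import ipaddress


def longest_match(table, address):
    """Return the entry for the most specific prefix containing address."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, entry in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, entry)
    return None if best is None else best[1]


# Hypothetical tables for illustration only.
bgp_routes = {"160.10.0.0/16": "150.10.1.1"}  # prefix -> BGP next hop
igp_routes = {"150.10.1.0/30": "Serial0"}     # prefix -> outgoing interface


def resolve(destination):
    """First look up the BGP next hop, then resolve it through the IGP."""
    next_hop = longest_match(bgp_routes, destination)
    if next_hop is None:
        return None
    return longest_match(igp_routes, next_hop)


print(resolve("160.10.5.9"))
```

If the IGP loses its route to 150.10.1.1, `resolve` returns `None` and the BGP route is unusable, which is why the IGP should carry the next hops.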
Local Preference
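[Figure: AS 100 (160.10.0.0/16), AS 200, AS 300 and AS 400, with routers A, B, C, D and E. 160.10.0.0/16 is seen twice, with local preference 500 and with local preference 800; the entry with 800 is marked > as the best path]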
Local Preference
• Local to an AS – non-transitive
a default local preference of 100 is applied to routes heard from a neighbouring AS
• Used to influence BGP path selection
determines best path for outbound traffic
• Path with highest local preference wins
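As a rough illustration, the local preference step can be modelled as picking the maximum over the candidate paths. The dictionary layout and the `via` labels are assumptions made for the example, and the sketch covers only this one step; real best-path selection has further tie-breakers.

```python
DEFAULT_LOCAL_PREF = 100  # value used when a path carries no local preference


def best_by_local_pref(candidates):
    """Pick the candidate path with the highest local preference."""
    return max(candidates, key=lambda r: r.get("local_pref", DEFAULT_LOCAL_PREF))


# Two paths to the same prefix, using the values from the figure.
routes = [
    {"prefix": "160.10.0.0/16", "via": "path 1", "local_pref": 500},
    {"prefix": "160.10.0.0/16", "via": "path 2", "local_pref": 800},
]

best = best_by_local_pref(routes)
print(best["via"], best["local_pref"])
```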
Multi-Exit Discriminator (MED)
[Figure: AS 200 and AS 201 with routers A, B and C. 192.68.1.0/24 is shown with MED 1000 and with MED 2000]
Multi-Exit Discriminator
• Inter-AS – non-transitive
metric reset to 0 on announcement to next AS
• Used to convey the relative preference of entry points
determines best path for inbound traffic
• Comparable if paths are from same AS
• IGP metric can be conveyed as MED
MED & IGP Metric
• set metric-type internal
enable BGP to advertise a MED which corresponds to the IGP metric values
changes are monitored (and re-advertised if needed) every 600s
bgp dynamic-med-interval <secs>
Community
• BGP attribute
• Used to group destinations
• Represented as two 16-bit integers
• Each destination could be member of multiple communities
• Community attribute carried across ASes
• Useful in applying policies
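The "two 16-bit integers" representation can be sketched as packing both halves into one 32-bit value. The function names below are made up for the example; the packing itself (first integer in the high 16 bits, second in the low 16 bits) is how the pair maps onto the 32-bit attribute value.

```python
def encode_community(asn: int, value: int) -> int:
    """Pack an AS:value pair into a single 32-bit community value."""
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (asn << 16) | value


def decode_community(raw: int) -> str:
    """Render a 32-bit community value as AS:value text."""
    return f"{raw >> 16}:{raw & 0xFFFF}"


raw = encode_community(300, 1)
print(raw, decode_community(raw))
```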
Community
[Figure: AS 100, AS 200, AS 300 and AS 400 plus ISP 1 and ISP 2, with routers A, B, C, D, E and X. Routes shown with their communities:]
160.10.0.0/16 300:1
170.10.0.0/16 300:1
200.10.0.0/16 300:9
BGP Deployment Guidelines
Recommended BGP commands for everyone
• ip bgp-community new-format
• no auto-summary
• no synchronization
• bgp deterministic-med
Whatever you do, use of deterministic-med MUST be consistent in your Autonomous System.
Other serious considerations
• For public peering: filter EBGP routes inbound and outbound
Block your own address space inbound
Block RFC 1918 space (inbound and outbound)
Block DSUA space (inbound and outbound):
http://www.ietf.org/internet-drafts/draft-manning-dsua-08.txt
• Use prefix-lists for route-filtering when possible (easier to read than ACLs)
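The inbound checks above can be prototyped offline before they are turned into prefix-lists. The sketch below is a simplified model, not an IOS feature: `is_blocked` and `OWN_SPACE` are hypothetical names, 198.51.100.0/24 stands in for your own address space, and the DSUA list is left out. It rejects any prefix that falls inside RFC 1918 space or inside your own space.

```python
import ipaddress

# RFC 1918 private address space.
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Hypothetical placeholder for the operator's own address space.
OWN_SPACE = ["198.51.100.0/24"]


def is_blocked(prefix, own_space):
    """Return True if an inbound prefix should be filtered."""
    net = ipaddress.ip_network(prefix)
    blocked = RFC1918 + [ipaddress.ip_network(n) for n in own_space]
    # Block the listed networks and anything more specific inside them.
    return any(net.subnet_of(b) for b in blocked)


for candidate in ("10.1.0.0/16", "198.51.100.128/25", "203.0.113.0/24"):
    print(candidate, is_blocked(candidate, OWN_SPACE))
```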
Other serious considerations
• If you carry a default in the IGP, your BGP next-hops ALWAYS resolve, even when the real next hop is unreachable (generally not good)
• bgp bestpath compare-routerid
Restores RFC-compliant path selection; OFF by default to reduce update churn, use with discretion
• If you have a large BGP network, consider techniques in the next section
BGP Scaling Techniques
BGP Scaling Techniques
• How to scale iBGP mesh beyond a few peers?
• How to implement new policy without causing flaps and route churning?
• How to reduce the overhead on the routers?
BGP Scaling Techniques
• Dynamic reconfiguration
• Peer groups
• Route flap damping
• Route reflectors
Soft Reconfiguration
Problem:
• Hard BGP peer clear required after every policy change because the router does not store prefixes that are denied by a filter
• Hard BGP peer clearing consumes CPU and affects connectivity for all networks
Solution:
• Soft-reconfiguration
Soft Reconfiguration
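[Figure: routes received from a peer enter the BGP in process. Normal: accepted routes go to the BGP table and the rest are discarded. Soft: everything received is also kept in a BGP in table, alongside the routes received and used. The BGP out process sends routes from the BGP table to the peer.]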
Soft Reconfiguration
• New policy is activated without tearing down and restarting the peering session
• Per-neighbour basis
• Uses more memory to keep prefixes whose attributes have been changed or have not been accepted
Configuring Soft reconfiguration
router bgp 100
 neighbor 1.1.1.1 remote-as 101
 neighbor 1.1.1.1 route-map infilter in
 neighbor 1.1.1.1 soft-reconfiguration inbound
! Outbound does not need to be configured !
Then when we change the policy, we issue an exec command
clear ip bgp 1.1.1.1 soft [in | out]
Managing Policy Changes
• clear ip bgp <addr> [soft] [in|out]
<addr> may be any of the following
x.x.x.x IP address of a peer
* all peers
ASN all peers in an AS
external all external peers
peer-group <name> all peers in a peer-group
Route Refresh Capability
• Facilitates non-disruptive policy changes
• No configuration is needed
• No additional memory is used
• Requires peering routers to support “route refresh capability” – RFC2918
• clear ip bgp x.x.x.x in tells peer to resend full BGP announcement
Soft Reconfiguration vs Route Refresh
• Use Route Refresh capability if supported
find out from “show ip bgp neighbor”
uses much less memory
• Otherwise use Soft Reconfiguration
Peer Groups
Without peer groups
• iBGP neighbours receive same update
• Large iBGP mesh slow to build
• Router CPU wasted on repeat calculations
Solution – peer groups!
• Group peers with same outbound policy
• Updates are generated once per group
Peer Groups - Advantages
• Makes configuration easier
• Makes configuration less prone to error
• Makes configuration more readable
• Lower router CPU load
• iBGP mesh builds more quickly
• Members can have different inbound policy
• Can be used for eBGP neighbours too!
Configuring Peer Group
router bgp 100
 neighbor ibgp-peer peer-group
 neighbor ibgp-peer remote-as 100
 neighbor ibgp-peer update-source loopback 0
 neighbor ibgp-peer send-community
 neighbor ibgp-peer route-map outfilter out
 neighbor 1.1.1.1 peer-group ibgp-peer
 neighbor 2.2.2.2 peer-group ibgp-peer
 neighbor 2.2.2.2 route-map infilter in
 neighbor 3.3.3.3 peer-group ibgp-peer
! note how 2.2.2.2 has different inbound filter from peer-group !
Route Flap Damping
• Route flap
Going up and down of path or change in attribute
BGP WITHDRAW followed by UPDATE = 1 flap
eBGP neighbour going down/up is NOT a flap
Ripples through the entire Internet
Wastes CPU
• Damping aims to reduce scope of route flap propagation
Route Flap Damping (Continued)
• Requirements
Fast convergence for normal route changes
History predicts future behaviour
Suppress oscillating routes
Advertise stable routes
• Implementation described in RFC2439
Operation
• Add penalty (1000) for each flap
Change in attribute gets penalty of 500
• Exponentially decay penalty
half life determines decay rate
• Penalty above suppress-limit
do not advertise route to BGP peers
• Penalty decayed below reuse-limit
re-advertise route to BGP peers
penalty reset to zero when it is half of reuse-limit
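The decay step can be sketched as a plain half-life calculation. This is a simplified model, assuming the penalty halves once per half-life with continuous exponential decay in between; the function name is made up for the example and the 15-minute half-life is the default quoted elsewhere in this deck.

```python
def decayed_penalty(penalty: float, minutes: float, half_life: float = 15.0) -> float:
    """Penalty remaining after `minutes`, halving once every `half_life` minutes."""
    return penalty * 0.5 ** (minutes / half_life)


# Two flaps in quick succession give a penalty of 2 * 1000.
start = 2 * 1000

for elapsed in (0, 15, 30, 45):
    print(elapsed, decayed_penalty(start, elapsed))
```

With these numbers the penalty falls to 500 after two half-lives, which is below the default reuse-limit of 750.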
Operation
[Figure: penalty (0 to 4000) plotted against time (0 to 25), with the suppress limit and reuse limit marked, showing the periods when the network is announced, not announced and re-announced]
Operation
• Only applied to inbound announcements from eBGP peers
• Alternate paths still usable
• Controlled by:
Half-life (default 15 minutes)
reuse-limit (default 750)
suppress-limit (default 2000)
maximum suppress time (default 60 minutes)
Configuration
Fixed damping
router bgp 100
 bgp dampening [<half-life> <reuse-value> <suppress-penalty> <maximum suppress time>]
Selective and variable damping
 bgp dampening [route-map <name>]
route-map <name> permit 10
 match ip address prefix-list FLAP-LIST
 set dampening [<half-life> <reuse-value> <suppress-penalty> <maximum suppress time>]
ip prefix-list FLAP-LIST permit 192.0.2.0/24 le 32
Operation
• Care required when setting parameters
• Penalty must be less than reuse-limit at the maximum suppress time
• Maximum suppress time and half life must allow penalty to be larger than suppress limit
Configuration
• Examples - bgp dampening 30 750 3000 60
reuse-limit of 750 means maximum possible penalty is 3000 – no prefixes suppressed as penalty cannot exceed suppress-limit
• Examples - bgp dampening 30 2000 3000 60
reuse-limit of 2000 means maximum possible penalty is 8000 – suppress limit is easily reached
Configuration
• Examples - bgp dampening 15 500 2500 30
reuse-limit of 500 means maximum possible penalty is 2000 – no prefixes suppressed as penalty cannot exceed suppress-limit
• Examples - bgp dampening 15 750 3000 45
reuse-limit of 750 means maximum possible penalty is 6000 – suppress limit is easily reached
Maths!
• Maximum value of penalty is
max-penalty = reuse-limit × 2^(max-suppress-time / half-life)
• Always make sure that suppress-limit is LESS than max-penalty otherwise there will be no route damping
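The worked examples on the preceding slides are consistent with max-penalty = reuse-limit × 2^(max-suppress-time / half-life). The sketch below assumes that relationship (the function name is made up for the example) and checks each of the four `bgp dampening` examples against its suppress-limit.

```python
def max_penalty(half_life, reuse_limit, max_suppress_time):
    """Largest penalty that can still decay to reuse_limit within max_suppress_time."""
    return reuse_limit * 2 ** (max_suppress_time / half_life)


# (half-life, reuse-limit, suppress-limit, max-suppress-time) from the examples.
examples = [
    (30, 750, 3000, 60),
    (30, 2000, 3000, 60),
    (15, 500, 2500, 30),
    (15, 750, 3000, 45),
]

for half_life, reuse, suppress, max_time in examples:
    ceiling = max_penalty(half_life, reuse, max_time)
    # Damping can only trigger if the penalty is able to exceed suppress-limit.
    print(
        f"bgp dampening {half_life} {reuse} {suppress} {max_time}: "
        f"max penalty {ceiling:.0f}, damping possible: {ceiling > suppress}"
    )
```

The first and third parameter sets report that damping is not possible, matching the "no prefixes suppressed" notes on the example slides.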
Enhancements
• Selective damping based on
AS-path, Community, Prefix
• Variable damping
recommendations for ISPs
http://www.ripe.net/docs/ripe-229.html
• Flap statistics
show ip bgp neighbor <x.x.x.x> [dampened-routes | flap-statistics]
Scaling iBGP mesh
Two solutions
Route reflector – simpler to deploy and run
Confederation – more complex, corner case benefits
Avoid n(n-1)/2 iBGP mesh
13 routers: 78 iBGP sessions!
n=1000: nearly half a million iBGP sessions!
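The session counts follow directly from n(n-1)/2; a quick check in Python (the function name is chosen for the example):

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions needed to fully mesh n routers."""
    return n * (n - 1) // 2


for routers in (13, 1000):
    print(routers, full_mesh_sessions(routers))
```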
Route Reflector: Principle
[Figure: AS 100 with routers A, B and C; one router is labelled Route Reflector]
Route Reflector
[Figure: AS 100 with routers A, B and C, labelled as reflectors and clients]
• Reflector receives path from clients and non-clients
• Selects best path
• If best path is from client, reflect to other clients and non-clients
• If best path is from non-client, reflect to clients only
• Non-meshed clients
• Described in RFC2796
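The two reflection rules can be modelled as a small function. This is a sketch under stated assumptions: the peer names B to E and the `peers` dictionary are hypothetical, only iBGP peers are modelled, and eBGP neighbours and the loop-avoidance attributes are ignored.

```python
def reflect_targets(learned_from, peers):
    """Return the iBGP peers a reflector passes a best path on to.

    peers maps a peer name to "client" or "non-client";
    learned_from is the peer the best path came from.
    """
    source_kind = peers[learned_from]
    targets = []
    for name, kind in peers.items():
        if name == learned_from:
            continue  # never send the path back to where it came from
        # From a client: reflect to everyone. From a non-client: clients only.
        if source_kind == "client" or kind == "client":
            targets.append(name)
    return sorted(targets)


# Hypothetical reflector with two clients and two non-client iBGP peers.
peers = {"B": "client", "C": "client", "D": "non-client", "E": "non-client"}

print(reflect_targets("B", peers))
print(reflect_targets("D", peers))
```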
Route Reflector Topology
• Divide the backbone into multiple clusters
• At least one route reflector and a few clients per cluster
• Route reflectors are fully meshed
• Clients in a cluster could be fully meshed
• Single IGP to carry next hop and local routes
Route Reflectors: Loop Avoidance
• Originator_ID attribute
Carries the RID of the originator of the route in the local AS (created by the RR)
• Cluster_list attribute
The local cluster-id is added when the update is sent by the RR
Cluster-id is router-id (address of loopback)
Do NOT use bgp cluster-id x.x.x.x
Route Reflectors: Redundancy
• Multiple RRs can be configured in the same cluster – not advised!
All RRs in the cluster must have the same cluster-id (otherwise it is a different cluster)
• A router may be a client of RRs in different clusters
Common today in ISP networks to overlay two clusters – redundancy achieved that way
Each client has two RRs = redundancy
Route Reflector: Benefits
• Solves iBGP mesh problem
• Packet forwarding is not affected
• Normal BGP speakers co-exist
• Multiple reflectors for redundancy
• Easy migration
• Multiple levels of route reflectors
Configuring a Route Reflector
router bgp 100
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 route-reflector-client
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 route-reflector-client
 neighbor 3.3.3.3 remote-as 100
 neighbor 3.3.3.3 route-reflector-client