Post on 15-Dec-2015
Rensselaer Polytechnic Institute
Today’s Big Picture
[Figure: two Large ISPs, a Small ISP, a Dial-Up ISP, an Access Network, and three Stub networks]
Large number of diverse networks
Internet AS Map: caida.org
Autonomous System (AS)
• The Internet is not a single network
– Collection of networks controlled by different administrations
• An autonomous system is a network under a single administrative control
• An AS owns an IP prefix
• Every AS has a unique AS number
• ASes need to inter-network themselves to form a single virtual global network
– Need a common protocol for communication
Who speaks Inter-AS routing?
[Figure: routers R1, R2, R3 in AS1 and AS2, with BGP running between the ASes; legend: border router, internal router]
• Two types of routers
– Border router (Edge), Internal router (Core)
• Two border routers of different ASes will have a BGP session
Intra-AS vs Inter-AS
• An AS is a routing domain
• Within an AS:
– Can run a link-state routing protocol
– Trust other routers
– Scale of network is relatively small
• Between ASes:
– Lack of information about other AS’s network (link-state not possible)
– Crossing trust boundaries
– Link-state protocol will not scale
– Routing protocol based on route propagation
Autonomous Systems (ASes)
An autonomous system is an autonomous routing domain that has been assigned an Autonomous System Number (ASN). All parts within an AS remain connected.
RFC 1930: Guidelines for creation, selection, and registration of an Autonomous System
… the administration of an AS appears to other ASes to have a single coherent interior routing plan and presents a consistent picture of what networks are reachable through it.
IP Address Allocation and Assignment: Internet Registries
IANA: www.iana.org
• RFC 2050 - Internet Registry IP Allocation Guidelines
• RFC 1918 - Address Allocation for Private Internets
• RFC 1518 - An Architecture for IP Address Allocation with CIDR
ARIN: www.arin.org
APNIC: www.apnic.org
RIPE: www.ripe.org
• Allocate to national and local registries and ISPs
• Addresses assigned to customers by ISPs
AS Numbers (ASNs)
ASNs are 16-bit values.
64512 through 65535 are “private”
• Genuity: 1
• MIT: 3
• Harvard: 11
• UC San Diego: 7377
• AT&T: 7018, 6341, 5074, …
• UUNET: 701, 702, 284, 12199, …
• Sprint: 1239, 1240, 6211, 6242, …
• …
ASNs represent units of routing policy
Currently over 11,000 in use.
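The 16-bit range and the private block quoted above can be checked mechanically. The following is a minimal sketch, not part of the original slides; the function and constant names are hypothetical, and the private range is simply the one stated on this slide.

```python
# Range limits taken from the slide text; names are illustrative only.
MAX_16BIT_ASN = 2**16 - 1      # largest value that fits in 16 bits
PRIVATE_ASN_FIRST = 64512      # first "private" ASN according to the slide
PRIVATE_ASN_LAST = 65535       # last "private" ASN according to the slide


def is_private_asn(asn: int) -> bool:
    """Return True if a 16-bit ASN falls in the private block."""
    if not 0 <= asn <= MAX_16BIT_ASN:
        raise ValueError(f"{asn} does not fit in 16 bits")
    return PRIVATE_ASN_FIRST <= asn <= PRIVATE_ASN_LAST


if __name__ == "__main__":
    for asn in (3, 7018, 64512, 65535):
        print(asn, "private" if is_private_asn(asn) else "public")
```

A border router would typically strip or reject private ASNs before announcing routes to the global Internet; that step is not shown here.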
Nontransit vs. Transit ASes
[Figure: NET A, ISP 1, ISP 2]
Nontransit AS might be a corporate or campus network. Could be a “content provider”
NET A: Traffic NEVER flows from ISP 1 through NET A to ISP 2
Internet service providers (ISPs) have transit networks
Selective Transit
[Figure: NET A, NET B, NET C, NET D]
NET A provides transit between NET B and NET C and between NET D and NET C
NET A DOES NOT provide transit between NET D and NET B
Most transit ASes allow only selective transit: key impact of commercialization
Customers and Providers
Customer pays provider for access to the Internet
[Figure: a provider and its customer; legend: IP traffic, provider-to-customer relationship]
Customer-Provider Hierarchy
[Figure: hierarchy of providers and customers; legend: IP traffic, provider-to-customer relationship]
The Peering Relationship
[Figure: peers and their customers; legend: peer-to-peer link, provider-to-customer link, traffic allowed, traffic NOT allowed]
• Peers provide transit between their respective customers
• Peers do not provide transit between peers
• Peers (often) do not exchange $$$
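One common way to summarize the customer, provider, and peer rules is that an AS carries traffic between two neighbors only when at least one of them is its customer. The sketch below encodes that reading. The slides state only the peer cases explicitly, so the provider cases are an assumption, and the names `Rel` and `transit_allowed` are hypothetical.

```python
from enum import Enum


class Rel(Enum):
    """How a neighbor relates to the local AS."""
    CUSTOMER = "customer"
    PROVIDER = "provider"
    PEER = "peer"


def transit_allowed(src: Rel, dst: Rel) -> bool:
    """Would the local AS carry traffic from a `src` neighbor to a `dst` neighbor?

    Assumed rule: transit is provided only if someone on either side pays for it.
    """
    return src is Rel.CUSTOMER or dst is Rel.CUSTOMER


if __name__ == "__main__":
    print(transit_allowed(Rel.PEER, Rel.CUSTOMER))  # peer reaches our customer
    print(transit_allowed(Rel.PEER, Rel.PEER))      # no transit between peers
```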
BGP-4
BGP = Border Gateway Protocol
Is a Policy-Based routing protocol
Is the de facto EGP of today’s global Internet
Relatively simple protocol, but configuration is complex and the entire world can see, and be impacted by, your mistakes.
• 1989 : BGP-1 [RFC 1105]
– Replacement for EGP (1984, RFC 904)
• 1990 : BGP-2 [RFC 1163]
• 1991 : BGP-3 [RFC 1267]
• 1995 : BGP-4 [RFC 1771]
– Support for Classless Interdomain Routing (CIDR)
BGP Operations (Simplified)
Establish session on TCP port 179
Exchange all active routes
Exchange incremental updates
[Figure: BGP session between AS1 and AS2]
While connection is ALIVE, exchange route UPDATE messages
Four Types of BGP Messages
Open : Establish a peering session.
Keep Alive : Handshake at regular intervals.
Notification : Shuts down a peering session.
Update : Announcing new routes or withdrawing previously announced routes.
announcement = prefix + attribute values
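Every BGP-4 message starts with a fixed 19-byte header: a 16-byte marker, a 2-byte length, and a 1-byte type code (1 = OPEN, 2 = UPDATE, 3 = NOTIFICATION, 4 = KEEPALIVE). The following sketch, which is not from the slides, shows how the four message types could be told apart; `parse_header` is a hypothetical helper name and no message body is decoded.

```python
import struct
from typing import Tuple

# Network byte order: 16-byte marker, 16-bit length, 8-bit type.
BGP_HEADER = struct.Struct("!16sHB")
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def parse_header(data: bytes) -> Tuple[int, str]:
    """Return (total message length, message type name) from raw bytes."""
    if len(data) < BGP_HEADER.size:
        raise ValueError("a BGP header needs at least 19 bytes")
    _marker, length, type_code = BGP_HEADER.unpack_from(data)
    if type_code not in MESSAGE_TYPES:
        raise ValueError(f"unknown BGP message type {type_code}")
    return length, MESSAGE_TYPES[type_code]


if __name__ == "__main__":
    # A KEEPALIVE is just the header: all-ones marker, length 19, type 4.
    keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
    print(parse_header(keepalive))  # (19, 'KEEPALIVE')
```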
What is Routing Policy
• Policy refers to arbitrary preference among a menu of available routes (based upon routes’ attributes)
• Public description of the relationship between external BGP peers
• Can also describe internal BGP peer relationship
• E.g.:
– Who are my BGP peers
– What routes are originated by a peer, imported from each peer, exported to each peer, preferred when multiple routes exist
– What to do if no route exists?
Routing Policy Example
• AS1 originates prefix “d”
• AS1 exports “d” to AS2, AS2 imports
• AS2 exports “d” to AS3, AS3 imports
• AS3 exports “d” to AS5, AS5 imports
Routing Policy Example (cont)
AS5 also imports “d” from AS4
Which route does it prefer? Does it matter?
Consider the case where
• AS3 = Commercial Internet
• AS4 = Internet2
Import and Export Policies
• Inbound filtering controls outbound traffic
– Filters route updates received from other peers
– Filtering based on IP prefixes, AS_PATH, community
• Outbound filtering controls inbound traffic
– Forwarding a route means others may choose to reach the prefix through you
– Not forwarding a route means others must use another router to reach the prefix
• Attribute manipulation
– Import: LOCAL_PREF (manipulate trust)
– Export: AS_PATH and MEDs
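An import policy of the kind described above might be sketched as a function that either drops a received route or returns it with adjusted attributes. Everything concrete here is an assumption for illustration: the `Route` type, the /24 length limit, and the LOCAL_PREF values 100 and 200 are hypothetical, not taken from the slides or from any router implementation.

```python
import ipaddress
from dataclasses import dataclass, replace
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str                 # e.g. "192.0.2.0/24"
    as_path: Tuple[int, ...]    # most recently traversed AS first
    local_pref: int = 100       # hypothetical default


def import_policy(route: Route, my_asn: int,
                  customer_asns: FrozenSet[int]) -> Optional[Route]:
    """Filter a received route, then tweak LOCAL_PREF. None means rejected."""
    # Prefix-based filter: drop anything more specific than /24 (assumed limit).
    if ipaddress.ip_network(route.prefix).prefixlen > 24:
        return None
    # AS_PATH-based filter: drop routes that already passed through us.
    if my_asn in route.as_path:
        return None
    # Attribute manipulation: trust routes learned from customers more.
    if route.as_path and route.as_path[0] in customer_asns:
        return replace(route, local_pref=200)
    return route


if __name__ == "__main__":
    customers = frozenset({65001})
    print(import_policy(Route("192.0.2.0/24", (65001,)), 65000, customers))
    print(import_policy(Route("192.0.2.0/25", (65001,)), 65000, customers))
```

Because inbound filtering decides which routes are available for selection, it shapes where this AS sends its own outbound traffic.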
Attributes are Used to Select Best Routes
[Figure: four routes to 192.0.2.0/24, each saying “pick me!”]
Given multiple routes to the same prefix, a BGP speaker must pick at most one best route
(Note: it could reject them all!)
BGP Policy Knob: Attributes
Value  Code                       Reference
-----  -------------------------  ---------
1      ORIGIN                     [RFC1771]
2      AS_PATH                    [RFC1771]
3      NEXT_HOP                   [RFC1771]
4      MULTI_EXIT_DISC            [RFC1771]
5      LOCAL_PREF                 [RFC1771]
6      ATOMIC_AGGREGATE           [RFC1771]
7      AGGREGATOR                 [RFC1771]
8      COMMUNITY                  [RFC1997]
9      ORIGINATOR_ID              [RFC2796]
10     CLUSTER_LIST               [RFC2796]
11     DPA                        [Chen]
12     ADVERTISER                 [RFC1863]
13     RCID_PATH / CLUSTER_ID     [RFC1863]
14     MP_REACH_NLRI              [RFC2283]
15     MP_UNREACH_NLRI            [RFC2283]
16     EXTENDED COMMUNITIES       [Rosen]
...
255    reserved for development
From IANA: http://www.iana.org/assignments/bgp-parameters
We will cover a subset of these attributes
Not all attributes need to be present in every announcement
BGP Route Processing
[Figure: route processing pipeline]
Receive BGP Updates
→ Apply Import Policies (Apply Policy = filter routes & tweak attributes)
→ Best Route Selection (based on attribute values)
→ Best Route Table
→ Apply Export Policies (Apply Policy = filter routes & tweak attributes)
→ Transmit BGP Updates
Install forwarding entries for best routes in the IP Forwarding Table.
Import and Export Policies
• For inbound traffic
– Filter outbound routes
– Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection
• For outbound traffic
– Filter inbound routes
– Tweak attributes on inbound routes to influence best route selection
[Figure labels: outbound routes, inbound routes, inbound traffic, outbound traffic]
In general, an AS has more control over outbound traffic
Policy Implementation Flow
[Figure labels: Incoming, Adj RIB In, Main BGP RIB, Adj RIB Out, Outgoing, IGPs, Static & HW Info, Main RIB/FIB]
Conceptual Model of BGP Operation
• RIB: Routing Information Base
• Adj-RIB-In: Prefixes learned from neighbors. As many Adj-RIB-In as there are peers
• Loc-RIB: Prefixes selected for local use after analyzing Adj-RIB-Ins. This RIB is advertised internally.
• Adj-RIB-Out: Stores prefixes advertised to a particular neighbor. As many Adj-RIB-Out as there are neighbors
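The three kinds of RIB can be pictured as plain dictionaries. The sketch below is only a conceptual model under stated assumptions: the class name `BgpSpeaker` is hypothetical, the decision rule is reduced to "shortest AS path, ties broken by peer name" as a stand-in for the real multi-step selection, and import/export policies are omitted.

```python
from collections import defaultdict


class BgpSpeaker:
    """Toy model of Adj-RIB-In, Loc-RIB and Adj-RIB-Out."""

    def __init__(self, asn):
        self.asn = asn
        self.adj_rib_in = defaultdict(dict)    # peer -> {prefix: as_path}
        self.loc_rib = {}                      # prefix -> (peer, as_path)
        self.adj_rib_out = defaultdict(dict)   # peer -> {prefix: as_path}

    def receive(self, peer, prefix, as_path):
        """Store an announcement in that peer's Adj-RIB-In and reselect."""
        self.adj_rib_in[peer][prefix] = tuple(as_path)
        candidates = [(p, rib[prefix])
                      for p, rib in self.adj_rib_in.items() if prefix in rib]
        # Stand-in decision rule (assumption): shortest AS path wins.
        self.loc_rib[prefix] = min(candidates, key=lambda c: (len(c[1]), c[0]))

    def advertise(self, peers):
        """Fill one Adj-RIB-Out per peer from the Loc-RIB."""
        for peer in peers:
            for prefix, (learned_from, as_path) in self.loc_rib.items():
                if peer != learned_from:       # do not echo a route back
                    self.adj_rib_out[peer][prefix] = (self.asn,) + as_path


if __name__ == "__main__":
    speaker = BgpSpeaker(asn=5)
    speaker.receive("A", "192.0.2.0/24", (3, 2, 1))
    speaker.receive("B", "192.0.2.0/24", (4, 1))
    speaker.advertise(["A", "B", "C"])
    print(speaker.loc_rib)
    print(dict(speaker.adj_rib_out))
```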
Path Attributes: ORIGIN
• ORIGIN: Describes how a prefix came to BGP at the origin AS
• Prefixes are learned from a source and “injected” into BGP:
– Directly connected interfaces, manually configured static routes, dynamic IGP or EGP
• Values:
– IGP (EGP): Prefix learnt from IGP (EGP)
– INCOMPLETE: Static routes
Path Attributes: AS-PATH
• List of ASes through which the prefix announcement has passed. Each AS on the path adds its ASN to AS-PATH
• E.g.: 138.39.0.0/16 originates at AS1 and is advertised to AS3 via AS2.
• E.g.: AS-SEQUENCE seen by AS3: “200 100” (each AS prepends its ASN, so the most recent AS comes first)
• Used for loop detection and path selection
[Figure: AS1 (ASN 100) originates 138.39.0.0/16; AS2 (ASN 200); AS3 (ASN 15)]
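The two uses of AS-PATH named on this slide, building the path and detecting loops, can be sketched in a few lines. This is an illustrative sketch with hypothetical function names; paths are written with the most recently traversed AS first, matching the "ASPATH = 3 2 1" notation used elsewhere in these slides.

```python
def export_path(as_path, my_asn):
    """Prepend our own ASN before announcing the route to a neighbor AS."""
    return (my_asn,) + tuple(as_path)


def accept(as_path, my_asn):
    """Loop detection: refuse any announcement that already contains our ASN."""
    return my_asn not in as_path


if __name__ == "__main__":
    path = export_path((), 100)     # AS1 (ASN 100) originates 138.39.0.0/16
    path = export_path(path, 200)   # AS2 (ASN 200) passes it on to AS3
    print(path)                     # (200, 100)
    print(accept(path, 15))         # AS3 (ASN 15) is not on the path: True
    print(accept(path, 100))        # AS1 would see its own ASN: False
```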
Traffic Often Follows ASPATH
[Figure: AS 1 originates 135.207.0.0/16; the announcement reaches AS 4 with ASPATH = 3 2 1]
IP Packet Dest = 135.207.44.66
… But It Might Not
[Figure: AS 4, AS 3, AS 2, AS 1 as before, plus AS 5]
• AS 1 announces 135.207.0.0/16 with ASPATH = 1; AS 4 sees 135.207.0.0/16 with ASPATH = 3 2 1
• AS 5 announces 135.207.44.0/25 with ASPATH = 5
• AS 2 filters all subnets with masks longer than /24
• IP Packet Dest = 135.207.44.66
From AS 4, it may look like this packet will take path 3 2 1, but it actually takes path 3 2 5
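The packet ends up at AS 5 because forwarding uses the longest matching prefix at each hop, regardless of the AS path advertised further upstream. The sketch below illustrates this with Python's standard `ipaddress` module. The table contents are an assumption drawn from the figure: AS 2 is taken to hold both the /16 and the /25, while AS 4 holds only the /16.

```python
import ipaddress


def longest_match(table, dest):
    """Return the next hop of the most specific prefix containing dest."""
    addr = ipaddress.ip_address(dest)
    matches = [(ipaddress.ip_network(prefix), next_hop)
               for prefix, next_hop in table.items()
               if addr in ipaddress.ip_network(prefix)]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


if __name__ == "__main__":
    as4_table = {"135.207.0.0/16": "AS 3"}
    as2_table = {"135.207.0.0/16": "AS 1", "135.207.44.0/25": "AS 5"}
    print(longest_match(as4_table, "135.207.44.66"))  # AS 3
    print(longest_match(as2_table, "135.207.44.66"))  # AS 5
```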
Shorter AS-PATH Doesn’t Mean Shorter # Hops
[Figure: AS 4, AS 3, AS 2, AS 1]
BGP says that path 4 1 is better than path 3 2 1
Duh!
ASPATH Padding: Shed inbound traffic
Padding will (usually) force inbound traffic from AS 1 to take primary link
[Figure: customer AS 2 announces 192.0.2.0/24 to provider AS 1 over two links: ASPATH = 2 on the primary link, ASPATH = 2 2 2 on the backup link]
Padding May Not Shut Off All Traffic
[Figure: customer AS 2 announces 192.0.2.0/24 to provider AS 1 over the primary link with ASPATH = 2, and to provider AS 3 over the backup link with ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2]
AS 3 will send traffic on “backup” link because it prefers customer routes and local preference is considered before ASPATH length!
Padding in this way is often used as a form of load balancing
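Why padding works in one case and not the other follows from the order of comparison: LOCAL_PREF is compared first, and AS-PATH length only breaks ties. The sketch below models just those two steps of route selection. The `Route` type, the neighbor labels, and the LOCAL_PREF values 100 and 200 are hypothetical, and the remaining tie-breakers of the real decision process are left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Route:
    neighbor: str
    as_path: Tuple[int, ...]
    local_pref: int


def best(routes):
    """Highest LOCAL_PREF wins; shorter AS path is only a tie-breaker."""
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))


if __name__ == "__main__":
    # AS 1 with two links to customer AS 2: equal LOCAL_PREF, so padding decides.
    as1_view = [Route("primary link", (2,), 200),
                Route("backup link", (2, 2, 2), 200)]
    print(best(as1_view).neighbor)

    # AS 3: the padded customer route still beats the short route via AS 1.
    as3_view = [Route("customer AS 2 (backup link)", (2,) * 14, 200),
                Route("via AS 1", (1, 2), 100)]
    print(best(as3_view).neighbor)
```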
Deaggregation + Multihoming
[Figure: customer AS 2 (12.2.0.0/16) is multihomed to provider AS 1 (12.0.0.0/8) and provider AS 3; 12.2.0.0/16 is announced through AS 3]
If AS 1 does not announce the more specific prefix, then most traffic to AS 2 will go through AS 3 because it is a longer match
AS 2 is “punching a hole” in the CIDR block of AS 1 => subverts CIDR
BGP Table Growth
Thanks: Geoff Huston. http://www.telstra.net/ops/bgptable.html
Large BGP Tables Considered Harmful
• Routing tables must store best routes and alternate routes
• Burden can be large for routers with many alternate routes (route reflectors for example)
• Routers have been known to die
• Increases CPU load, especially during session reset
ASNs Growth
From: Geoff Huston. http://www.telstra.net/ops
Dealing with ASN growth…
• Make ASNs larger than 16 bits
– How about 32 bits?
– See Internet Draft: “BGP support for four-octet AS number space” (draft-ietf-idr-as4bytes-03.txt)
– Requires protocol change and wide deployment
• Change the way ASNs are used
– Allow multihomed, non-transit networks to use private ASNs
– Uses ASE (AS number Substitution on Egress)
– See Internet Draft: “Autonomous System Number Substitution on Egress” (draft-jhaas-ase-00.txt)
– Works at edge, requires protocol change (for loop prevention)
Daily Update Count
A Few Bad Apples …
Thanks to Madanlal Musuvathi for this plot. Data source: RIPE NCC
Typically, 80% of the updates are for less than 5% of the prefixes.
Most prefixes are stable most of the time. On this day, about 83% of the prefixes were not updated.
[Figure x-axis: Percent of BGP table prefixes]
Route Flap Dampening (RFC 2439)
Routes are given a penalty for changing. If penalty exceeds suppress limit, the route is dampened. When the route is not changing, its penalty decays exponentially. If the penalty goes below reuse limit, then it is announced again.
• Can dramatically reduce the number of BGP updates
• Requires additional router resources
• Applied on eBGP inbound only
Route Flap Dampening Example
route dampened for nearly 1 hour
penalty for each flap = 1000
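The penalty mechanism can be sketched as a small state machine with exponential decay. Only the per-flap penalty of 1000 comes from the slide. The suppress limit (2000), reuse limit (750), and half-life (15 minutes) are assumed values chosen for illustration, and the class name `DampenedRoute` is hypothetical.

```python
FLAP_PENALTY = 1000.0     # from the slide
SUPPRESS_LIMIT = 2000.0   # assumed for illustration
REUSE_LIMIT = 750.0       # assumed for illustration
HALF_LIFE_S = 15 * 60.0   # assumed: penalty halves every 15 minutes


class DampenedRoute:
    """Tracks the flap penalty of one route; times are in seconds."""

    def __init__(self):
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay_to(self, now):
        # Exponential decay since the last time the penalty was touched.
        self.penalty *= 0.5 ** ((now - self.last_update) / HALF_LIFE_S)
        self.last_update = now

    def flap(self, now):
        """Record a route change; suppress once the penalty exceeds the limit."""
        self._decay_to(now)
        self.penalty += FLAP_PENALTY
        if self.penalty > SUPPRESS_LIMIT:
            self.suppressed = True

    def usable(self, now):
        """Release the route once the penalty falls below the reuse limit."""
        self._decay_to(now)
        if self.suppressed and self.penalty < REUSE_LIMIT:
            self.suppressed = False
        return not self.suppressed


if __name__ == "__main__":
    route = DampenedRoute()
    for t in (0, 60, 120):          # three flaps, one minute apart
        route.flap(t)
    print(route.usable(121))        # just after the third flap
    print(route.usable(121 + 3600)) # one quiet hour later
```

With these assumed parameters the third flap pushes the penalty over the suppress limit, and the route becomes usable again only after the penalty has decayed below the reuse limit.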
How Long Does BGP Take to Adapt to Changes?
[Figure: Cumulative Percentage of Events (y-axis, 0 to 100) vs. Seconds Until Convergence (x-axis, 0 to 160); curves: Tup, Tshort, Tlong, Tdown]
From: Abha Ahuja and Craig Labovitz
Two Main Factors in Delayed Convergence
• Rate limiting timer slows everything down
• BGP can explore many alternate paths before giving up or arriving at a new path
– No global knowledge in vectoring protocols
Implementation Does Matter!
[Figure labels: stateless withdraws widely deployed; stateful withdraws widely deployed]