10940_03F8_c1NW97_US_106
Multicast Issues for Gigapop Operators
David MeyerGigapop Operators II Workshop 26 June 1998
Agenda
• Introduction
• First Some Basic Technology
Basic Host Model
Basic Router Model
Data Distribution Concepts
• What Are the Deployment Obstacles
What Are the Non-technical Issues
What Are the Technical Scaling Issues
Agenda (Cont.)
• Potential Solutions (Cisco Specific)
Multi-level RP, Anycast Clusters, MSDP
Using Directory Services
• Industry Solutions
BGMP and MASC
• Possible Deployment Scenarios
• References
Introduction—Level Set
• This presentation focuses on large-scale multicast routing for Gigapops and their customers
Note that the problem is essentially the same as the inter-domain multicast routing problem
• The problems/solutions presented are related to inter-enterprise or Gigapop deployment of IP multicast
• The current set of deployed technology is sufficient for enterprise environments
Introduction—Why Would You Want to Deploy IP Multicast?
• You don’t want the same data traversing your links many times— bandwidth saver
• You want to join and leave groups dynamically without notifying all data sources—pay-per-view
Introduction—Why Would You Want to Deploy IP Multicast?
• You want to discover a resource but don’t know who is providing it, or if you did, don’t want to configure it—expanding ring search
• Reduce startup latency for subscribers
Introduction—Why Would a Gigapop Want to Deploy IP Multicast?
• All of the previous, plus revenue potential for deploying IP multicast
• Initial applications
Radio station transmissions
Real-time stock quote service
• Future applications
Distance learning
Entertainment
Basic Host Model
• Strive to make the host model simple
When sourcing data, just send the data
Map network layer address to link layer address
Routers will figure out where receivers are and are not
When receiving data, need to perform two actions
Tell routers what group you’re interested in (via IGMP)
Tell your LAN controller to receive for link-layer mapped address
Basic Host Model
• Hosts can be receivers and not send to the group
• Hosts can send but not be receivers of the group
• Or they can be both
Basic Host Model
• There are some protocol and architectural issues
Multiple IP group addresses map into a single link-layer address
You need IP-level filtering
Hosts join groups, which means they receive traffic from all sources sending to the group
Wouldn’t it be better if hosts could say which sources they were willing to receive from?
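The many-to-one overlap above comes from the standard IP-to-Ethernet group mapping: only the low-order 23 bits of the group address are copied into the fixed 01:00:5e prefix, so 32 IP groups share each link-layer address. A minimal Python sketch (the group addresses are illustrative):

```python
def group_to_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet MAC address.

    Only the low-order 23 bits of the group address are copied into
    the fixed 01:00:5e prefix, so 32 IP groups share each MAC address.
    """
    octets = [int(o) for o in group.split(".")]
    # Low 23 bits: mask off the high bit of the second octet.
    return "01:00:5e:%02x:%02x:%02x" % (octets[1] & 0x7F, octets[2], octets[3])

# 224.1.1.1, 225.1.1.1, and 224.129.1.1 all collide at the link layer,
# which is why hosts need IP-level filtering on the full group address.
print(group_to_mac("224.1.1.1"))    # 01:00:5e:01:01:01
print(group_to_mac("225.1.1.1"))    # 01:00:5e:01:01:01
print(group_to_mac("224.129.1.1"))  # 01:00:5e:01:01:01
```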
Basic Host Model
• There are some protocol and architectural issues (continued)
You can apply access control to sources, but you can’t apply access control to receivers in a scalable way
Basic Router Model
• Since hosts can send any time to any group, routers must be prepared to receive on all link-layer group addresses
And know when to forward or drop packets
Basic Router Model
• What does a router keep track of?
Interfaces leading to receivers
Sources, when utilizing source distribution trees
Prune state, depending on the multicast routing protocol (e.g., Dense Mode)
Data Distribution Concepts
• Routers maintain state to deliver data down a distribution tree
• Source trees
Router keeps (S,G) state so packets can flow from the source to all receivers
Trades off low delay from source against router state
Data Distribution Concepts
• Shared trees
Router keeps (*,G) state so packets flow from the root of the tree to all receivers
Trades off higher delay from source against less router state
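The state tradeoff between the two tree types can be sketched as a simple count of forwarding entries; the groups and source addresses below are hypothetical:

```python
# Sketch: forwarding-state growth for source trees vs. shared trees.
# A router on a source tree keeps one (S,G) entry per active source;
# a router on a shared tree keeps a single (*,G) entry per group.

def source_tree_entries(groups: dict) -> list:
    """(S,G) state: one entry per (source, group) pair."""
    return [(s, g) for g, sources in groups.items() for s in sources]

def shared_tree_entries(groups: dict) -> list:
    """(*,G) state: one wildcard entry per group."""
    return [("*", g) for g in groups]

# Hypothetical example: 2 groups, each with 3 active sources.
groups = {
    "224.1.1.1": ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    "224.2.2.2": ["10.1.0.1", "10.1.0.2", "10.1.0.3"],
}
print(len(source_tree_entries(groups)))  # 6 entries
print(len(shared_tree_entries(groups)))  # 2 entries
```

State on a shared tree grows with the number of groups only, at the cost of the extra delay of delivering all traffic through the root.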
Data Distribution Concepts
• How is the tree built?
On demand, in response to data arrival
Dense-mode protocols (PIM-DM and DVMRP)
MOSPF
Explicit control
Sparse-mode protocols (PIM-SM and CBT)
Data Distribution Concepts
• Building distribution trees requires knowledge of where members are
Flood data to find out where members are not (dense-mode protocols)
Flood group membership information (MOSPF), and build the tree as data arrives
Send explicit joins and keep join state (sparse-mode protocols)
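A toy sketch of where forwarding state lands under the flood-and-prune versus explicit-join strategies, using a hypothetical four-router topology:

```python
# Sketch: which routers hold state under the two tree-building
# strategies. The topology and membership below are hypothetical.

topology = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}  # parent -> children
members = {"D"}  # only router D has receivers attached

def dense_mode_state(topology, members):
    """Flood everywhere, then prune: every router briefly holds state."""
    return set(topology)  # all routers are touched by the flood

def sparse_mode_state(topology, members, root="A"):
    """Explicit join: only routers on the member-to-root path hold state."""
    parent = {c: p for p, cs in topology.items() for c in cs}
    state = set()
    for m in members:
        while m != root:
            state.add(m)
            m = parent[m]
    state.add(root)
    return state

print(sorted(dense_mode_state(topology, members)))   # ['A', 'B', 'C', 'D']
print(sorted(sparse_mode_state(topology, members)))  # ['A', 'B', 'D']
```

Router C carries no state under explicit joins, since no receiver lies behind it.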
Data Distribution Concepts
• Construction of source trees requires knowledge of source locations
In dense-mode protocols you learn them when data arrives (at each depth of the tree)
Same with MOSPF
In sparse-mode protocols you learn them when data arrives on the shared tree (in leaf routers only)
Ignore since routing based on direction from RP
Pay attention if moving to source tree
Data Distribution Concepts
• To build a shared tree you need to know where the core (RP) is
Can be learned dynamically in the routing protocol (Auto-RP, PIMv2)
Can be configured in the routers
Could use a directory service
Data Distribution Concepts
• Source trees make sense for
Broadcast radio transmissions
Expanding ring search
Generic few-sources-to-many-receiver applications
High-rate, low-delay application requirements
Per source policy from a service provider’s point of view
Per source access control
Data Distribution Concepts
• Shared trees make sense for
Many low-rate sources
Applications that don’t require low delay
Consistent policy and access control across most participants in a group
When most of the source trees overlap topologically with the shared tree
Deployment Obstacles—Non-Technical Issues
• How to bill for the service
Is the service what runs on top of multicast?
Or is it the transport itself?
Do you bill based on sender or receiver, or both?
• How to control access
Should sources be rate-controlled (unlike unicast routing)?
Should receivers be able to receive at a specific rate only?
Deployment Obstacles—Non-Technical Issues
• How to make your peers fan-out instead of you (reduce the replication factor in your own network)
Closest exit versus latest entrance—all a wash
• How to avoid multicast from opening a lot of security holes
Network-wide denial of service attacks
Eavesdropping is simpler since receivers are unknown
Deployment Obstacles—Technical Issues
• Source tree state will become a problem as IP multicast gains popularity
When policy and access control per source will be the rule rather than the exception
• Group state will become a problem as IP multicast gains popularity
10,000 three member groups across the Internet
Deployment Obstacles—Technical Issues
• Hopefully we can upper bound the state in routers based on their switching capacity
Deployment Obstacles—Technical Issues
• Gigapop customers are telling us they don’t want to depend on another customer’s (or gigapop) RP
Do we connect shared trees together?
Do we have a single shared tree across domains?
Do we use source trees only for inter-domain groups?
Deployment Obstacles—Technical Issues
• Customers are telling us that the unicast and multicast topologies won’t be congruent across domains
Due to physical/topological constraints
Due to policy constraints
• We need an inter-domain routing protocol that distinguishes unicast versus multicast policy
How to Control Multicast Routing Table State in the Network?
• Fundamental problem of learning group membership
Flood and Prune: DVMRP, PIM-DM
Broadcast Membership: MOSPF, DWRs
Rendezvous Mechanism: PIM-SM, BGMP
Rendezvous Mechanism
• Why not use sparse-mode PIM?
Where to put the root of the shared tree (the RP)
Third-party RP problem
• If you did use sparse-mode PIM
Group-to-RP mappings would have to be distributed throughout the Internet
Rendezvous Mechanism
• Let’s try using sparse-mode PIM for inter-domain multicast
• Look at four possibilities
Multi-level RP
Anycast clusters
MSDP
Use directory services
Connect Shared Trees Together—Multi-Level RP
• Idea is to have a hierarchy of shared trees
Level-0 RPs are inside of domains
They propagate joins from routers to a Level-1 RP that may be in another domain
All level-0 shared trees get connected together via a Level-1 RP
If multiple Level-1 RPs, iterate up to Level-2 RPs
Connect Shared Trees Together—Multi-Level RP
• Problems
Requires PIM protocol changes
If you don’t locate the Level-0 RP at the border, intermediate PIM routers think there may be two RPs for the group
Still has the third-party problem; there is ultimately one node at the root of the hierarchy
Data has to flow all the way to the highest-level RP
Connect Shared Trees Together—Anycast Clusters
• Share the burden of being an RP among service providers
Each RP in each domain is a border router
• Build RP clusters at interconnect points (or dense-mode clouds)
• Group allocation is per cluster and not per-user or per-domain
Connect Shared Trees Together—Anycast Clusters
• Closest border router in cluster is used as the RP
• Routers in a domain will use the domain’s RP
Provided you have an RP for that group range at an interconnect point
If not, you use the closest RP at the interconnect point (could be RP in another domain)
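The closest-RP selection above falls out of unicast routing: every cluster member advertises the same anycast RP address, and each domain's routing simply delivers joins to the lowest-metric instance. A sketch, with hypothetical RP names and metrics:

```python
# Sketch: anycast RP selection. All cluster members advertise the same
# RP address; a router's unicast routing picks the instance with the
# lowest metric, so joins land on the "closest" border router.

def closest_rp(metrics: dict) -> str:
    """Pick the cluster member with the lowest unicast metric."""
    return min(metrics, key=metrics.get)

# Seen from a router inside ISP-A's domain (hypothetical metrics):
metrics_from_isp_a = {"rp-isp-a": 10, "rp-isp-b": 40, "rp-isp-c": 55}
print(closest_rp(metrics_from_isp_a))  # rp-isp-a (the domain's own RP)

# If ISP-A has no RP for this group range, the next-closest one wins:
metrics_without_own = {"rp-isp-b": 40, "rp-isp-c": 55}
print(closest_rp(metrics_without_own))  # rp-isp-b
```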
Connect Domains Together—MSDP
• If you can’t connect shared trees together easily, then don’t
• Multicast Source Discovery Protocol
Different paradigm
Rather than getting trees connected, get sources known to all trees
Sounds non-scalable, but the trick is in the implementation
Connect Domains Together—MSDP
• An RP in a domain has an MSDP peering session with an RP in another domain
Runs over TCP
Source-Active (SA) messages are sent to describe active sending sources in a domain
A logical topology is built for the sole purpose of distributing SA messages
Connect Domains Together—MSDP
• How it works
Source goes active in a PIM-SM domain
Its packets get PIM-registered to the domain’s RP
The RP sends an SA message to its MSDP peers
Other MSDP peers forward to their peers away from the originating RP
If a peer in another domain has receivers for the group the source is sending to, it joins toward the source (flood-and-join model)
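The flood-and-join step can be sketched as follows; this simplified model substitutes split-horizon forwarding for MSDP's real peer-RPF checks, and all peer names and addresses are hypothetical:

```python
# Sketch of an RP handling one MSDP Source-Active message: flood it
# onward (away from the peer it arrived on), and if the domain has
# local receivers for the group, join toward the source.

def handle_sa(sa, local_groups, peers, from_peer):
    """Process one Source-Active (source, group) announcement."""
    source, group = sa
    actions = []
    # Forward the SA to all other peers (split-horizon simplification).
    for peer in peers:
        if peer != from_peer:
            actions.append(("forward-sa", peer, sa))
    # Local (*,G) receivers exist: join the source tree across domains.
    if group in local_groups:
        actions.append(("send-join", (source, group)))
    return actions

acts = handle_sa(("10.0.0.1", "224.1.1.1"),
                 local_groups={"224.1.1.1"},
                 peers=["rp-b", "rp-c"], from_peer="rp-b")
for a in acts:
    print(a)
```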
Connect Domains Together—MSDP
• There is no shared tree across domains
Therefore each domain can depend solely on its own RP (no third-party problem)
• SA state is not stored at each MSDP peer
• You could encapsulate data in SA messages for low-rate bursty sources
• You could have SA caching peers to speed up join latency
Use Directory Services
• You can use directory services to:
Enable a single shared tree across domains
Enable use of source tree only and avoid using a single shared tree across domains
Use Directory Services
• How it works with a single shared tree across domains
Put RP in client’s domain
Optimal placement of the RP if the domain had a multicast source or receiver active
Policy for RP is consistent with policy for domain’s unicast prefixes
Use directory to find RP address for a given group
Use Directory Services
• For example
Receiver host sends IGMP report for 224.1.1.1
First-hop router performs DNS name resolution on 1.1.1.224.pim.mcast.net
An A record is returned with the IP address of the RP
First-hop router sends PIM join message towards RP
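The query-name construction in this example can be sketched as: reverse the group's octets and append the zone (pim.mcast.net here, as in the example above):

```python
def group_to_rp_query(group: str) -> str:
    """Build the DNS name a first-hop router would resolve to find the RP:
    the group's octets in reverse order, under the pim.mcast.net zone."""
    octets = group.split(".")
    return ".".join(reversed(octets)) + ".pim.mcast.net"

print(group_to_rp_query("224.1.1.1"))  # 1.1.1.224.pim.mcast.net
```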
Use Directory Services
• All routers get consistent RP addresses via DNS
• When dynamic DNS is widely deployed it will be easier to change A records
• In the meantime, use loopback addresses on routers and move them around in your domain
Use Directory Services
• When domain group allocation exists, a domain can be authoritative for a DNS zone
1.224.pim.mcast.net
128/17.1.224.pim.mcast.net
Use Directory Services
• Another approach—avoid using shared trees altogether
Build PIM-SM source trees across domains
• Put multiple A records in DNS to describe sources for the group
1.0.2.224.sources.pim.mcast.net IN CNAME dmm-home
1.0.2.224.sources.pim.mcast.net IN CNAME dino-home
dmm-home IN A 171.69.58.81
dino-home IN A 171.69.127.178
Standards Solutions
• Ultimate scalability of both routing and group allocation can be achieved using BGMP/MASC
• Use BGP4+ (MBGP) to deal with non-congruency issues
Border Gateway Multicast Protocol (BGMP)
• Use a PIM-like protocol that runs between domains (BGP equivalent for multicast)
• The protocol builds a shared tree of domains for a group
So we can use a rendezvous mechanism at the domain level
Shared tree is bi-directional
Root of shared tree of domains is at root domain
Border Gateway Multicast Protocol (BGMP)
• Runs in routers that border a multicast routing domain
• Runs over TCP like BGP
• Joins and prunes travel in domain-level hops
• Can build unidirectional source trees
• The MIGP (the domain’s multicast interior gateway protocol) tells the borders about group membership
Multicast Address Set Claim (MASC)
• How does one determine the root domain for a given group?
• Group prefixes are temporarily leased to domains
• They are allocated out of a service provider’s allocation, which in turn is obtained from an upstream provider
Multicast Address Set Claim (MASC)
• Claims for group allocation resolve collisions
• Group allocations are advertised across domains
• Lots of machinery for aggregating group allocations
Multicast Address Set Claim (MASC)
• Tradeoff between aggregation and anticipated demand for group addresses
• Group prefix allocations are not assigned to domains—they are leased
Application must be written to know that group addresses may go away
• Work in progress
Using BGP4+ (MBGP) for Non-Congruency Issues
• Multiprotocol extensions to BGP4—RFC 2283
• MBGP allows you to build a unicast RIB and multicast RIB independently with one protocol
• Can use the existing or new BGP peering topology
• MBGP carries unicast prefixes of multicast capable sources
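With separate RIBs, the multicast RPF check can consult the multicast RIB even when it disagrees with the unicast RIB. A minimal sketch, reducing longest-prefix matching to exact lookup; the prefixes and neighbor names are hypothetical:

```python
# Sketch: RPF lookup against separate unicast and multicast RIBs,
# as MBGP makes possible. Exact-match lookup stands in for real
# longest-prefix matching.

def rpf_neighbor(source_prefix, mrib, urib):
    """Prefer the multicast RIB for RPF; fall back to unicast routes."""
    if source_prefix in mrib:
        return mrib[source_prefix]
    return urib.get(source_prefix)

urib = {"171.69.0.0/16": "peer-unicast"}    # ordinary traffic path
mrib = {"171.69.0.0/16": "peer-multicast"}  # multicast-capable path

# Non-congruent topologies: joins toward the source leave via a
# different neighbor than unicast traffic would.
print(rpf_neighbor("171.69.0.0/16", mrib, urib))  # peer-multicast
print(rpf_neighbor("171.69.0.0/16", {}, urib))    # peer-unicast
```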
Possible Deployment Scenarios
• Environment
Multiple customers multicast peer at a Gigapop
• Deployment proposal
Each customer puts their own administered RP attached to the Gigapop
That RP as well as all border routers run MBGP
The interconnect runs dense-mode PIM
Each customer runs PIM-SM/DM internally
Possible Deployment Scenarios
• What about multiple interconnect points between Gigapop customers?
If multiple Gigapop customers connect at different interconnect points, they can multicast peer for any groups as long as their respective RPs are collocated on the same Gigapop (and the interconnect is Dense mode)
Possible Deployment Scenarios
• What if all RPs are not at the same interconnect point?
Use MSDP so that the RPs at one interconnect can announce their known sources to the RPs at other interconnects, which then know where to join
Possible Deployment Scenarios
• Use a group range that depends on DNS for rendezvousing or building trees
Customers decide which domains will have RPs
Customers decide which groups will use source trees and don’t have to administer RPs
Customers administer DNS databases