Networking (Nick Feamster, Georgia Tech)

Date post:27-Mar-2015
Slide 1: Networking
Nick Feamster, Georgia Tech

Slide 2: Goal of This Tutorial
Teach engineers the basics of networking and ISP operations
  Networks today
  Business models
  Operations (NOC, operators)
  Common problems
  Measurement, Monitoring, and Security

Slide 3: Today's Networks
Service provider business models
Network operations center
Network operators and engineers

Slide 4: Business Models
Increasingly commoditized (see Geoff Huston's talk at NANOG)
Status quo: Establish transit costs, bill at 95th percentile of usage
Future: differential pricing, preference for certain groups of users, applications

Slide 5: Billing for Internet Usage
95th Percentile billing
  Customer network pays for committed information rate (CIR)
  Throughput measured every 5 minutes (typically with SNMP; flow statistics also can be used for billing)
  Customer billed based on 95th percentile

Slide 6: Net Neutrality

Slide 7: Network Operations
Operators run the day-to-day operations of the network
  Adjusting to shifts in traffic, failures, etc.
  Responding to security threats
  Provisioning new customers

Slide 8: Point-of-Presence (PoP)
A cluster of routers in a single physical location
Inter-PoP links
  Long distances
  High bandwidth
Intra-PoP links
  Cables between racks or floors
  Aggregated bandwidth
[Figure label: PoP]

Slide 9: Example: Abilene Network Topology

Slide 10: Another Example Backbone

Slide 11: Internet Routing Overview
Intradomain (i.e., intra-AS) routing
Interdomain routing
[Figure labels: Georgia Tech, Comcast, Abilene, AT&T, Cogent, Autonomous Systems (ASes)]

Slide 12: Internet Routing Protocol: BGP
[Figure labels: Route Advertisement, Autonomous Systems (ASes), Session, Traffic]
[Table columns: Destination, Next-hop, AS Path; AS path entries shown: 10578..2637, 174 2637]

Slide 13: Two Flavors of BGP
External BGP (eBGP): exchanging routes between ASes
Internal BGP (iBGP): disseminating routes to external destinations among the routers within an AS
[Figure labels: eBGP, iBGP]
Question: What's the difference between IGP and iBGP?
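
The 95th percentile billing described on Slide 5 can be sketched as follows. This is a minimal illustration, not any provider's actual billing code: the nearest-rank percentile method, the function names, and the rule that the customer pays at least the CIR are all assumptions made for the example.

```python
def percentile_95(samples_mbps):
    """Return the 95th-percentile sample (nearest-rank method, an assumption)."""
    if not samples_mbps:
        raise ValueError("need at least one throughput sample")
    ordered = sorted(samples_mbps)
    # 1-based nearest rank: ceil(0.95 * n), computed with integers only.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def billable_rate(samples_mbps, cir_mbps):
    """Assumed rule: bill the larger of the CIR and the 95th percentile."""
    return max(cir_mbps, percentile_95(samples_mbps))


# One day of 5-minute samples is 288 readings. Here 8 of them are bursts,
# which fall inside the top 5% and are therefore ignored.
day = [10] * 280 + [1000] * 8
rate = billable_rate(day, cir_mbps=5)
```

The point of the scheme is visible in the example: short bursts above the usual rate do not raise the bill as long as they stay within the top 5% of samples.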
Slide 14: IPv4 Addresses: Networks of Networks
32-bit number in dotted-quad notation
www.cc.gatech.edu --- 130.207.7.36
Network (16 bits): 1000001011001111
Host (16 bits): 0000011100100100
Problem: 2^32 addresses is a lot of table entries
Solution: Routing based on network and host
is a 16-bit prefix with 2^16 IP addresses
Topological Addressing

Slide 15: Pre-1994: Classful Addressing
Address layout: Network ID, Host ID
  Class A: leading bit 0; /8 blocks (e.g., MIT)
  Class B: leading bits 10; /16 blocks (e.g., Georgia Tech)
  Class C: leading bits 110; /24 blocks (e.g., AT&T Labs)
  Class D: leading bits 1110; Multicast Addresses
  Class E: leading bits 1111; Reserved for experiments
Simple Forwarding: Address range specifies network ID length

Slide 16: Classless Interdomain Routing (CIDR)
Use two 32-bit numbers to represent a network.
Network number = IP address + Mask
IP Address: 0100000100001110 1111100000000000
Mask: 11111111 1111110000000000
Example: BellSouth Prefix:
Address no longer specifies network ID range.
New forwarding trick: Longest Prefix Match

Slide 17: Benefits of CIDR
Efficiency: Can allocate blocks of prefixes on a finer granularity
Hierarchy: Prefixes can be aggregated into supernets. (Not always done. Typically not, in fact.)
[Figure labels: Customer 1, Customer 2, AT&T, Internet]

Slide 18: Growth of IP Prefixes

Slide 19: 1994-1998: Linear Growth
About 10,000 new entries per year
In theory, less instability at the edges (why?)
Source: Geoff Huston

Slide 20: Around 2000: Fast Growth Resumes
Claim: remaining /8s will be exhausted within the next 5-10 years. T.
Hain, A Pragmatic Report on IPv4 Address Space Consumption, Cisco IPJ, September 2005

Slide 21: Fast growth resumes
Rapid growth in routing tables
Dot-Bomb Hiccup
Significant contributor: Multihoming
Source: Geoff Huston

Slide 22: The Address Allocation Process
Allocation policies of RIRs affect pressure on IPv4 address space
[Figure labels: IANA; AfriNIC, APNIC, ARIN, LACNIC, RIPE; Georgia Tech]
http://www.iana.org/assignments/ipv4-address-space

Slide 23: Common Problems
Diagnosis and troubleshooting (hence, measurement)
Traffic engineering
Security
Design and capacity planning

Slide 24: What can go wrong?
Two-thirds of the problems are caused by configuration of the routing protocol
Some downtime is very hard to protect against

Slide 25: Measurement and Monitoring

Slide 26: Passive vs. Active Measurement
Passive Measurement: Collection of packets, flow statistics of traffic that is already flowing on the network
  Packet traces
  Flow statistics
  Application-level logs
Active Measurement: Inject probing traffic to measure various characteristics
  Traceroute
  Ping
  Application-level probes (e.g., Web downloads)

Slide 27: Billing for Internet Usage
95th Percentile billing
  Customer network pays for committed information rate (CIR)
  Throughput measured every 5 minutes (typically with SNMP; flow statistics also can be used for billing)
  Customer billed based on 95th percentile

Slide 28: Passive Traffic Data Measurement
SNMP byte/packet counts: everywhere
Packet monitoring: selected locations
Flow monitoring: typically at edges (if possible)
  Direct computation of the traffic matrix
  Input to denial-of-service attack detection
Deep Packet Inspection: also at edge, where possible

Slide 29: Simple Network Management Protocol
Management Information Base (MIB)
  Information store
  Unique variables named by OIDs
  Accessed with SNMP
Specific MIBs for byte/packet counts (per link)
[Figure labels: Manager, Agent, SNMP, DB, Managed Objects]

Slide 30: SNMP (Passive)
Advantage: ubiquitous
Supported on all networking
equipment
Multiple products for polling and analyzing data
Disadvantages
  Coarse granularity
  Cannot express complex queries on the data
  Unreliable delivery of the data using UDP
Utility
  Link utilization (billing)
  Traffic matrix inference

Slide 31: Packet-level Monitoring
Passive monitoring to collect full packet contents (or at least headers)
Advantages: lots of detailed information
  Precise timing information
  Information in packet headers
Disadvantages: overhead
  Hard to keep up with high-speed links
  Often requires a separate monitoring device

Slide 32: Full Packet Capture (Passive)
Example: Georgia Tech OC3Mon
  Rack-mounted PC
  Optical splitter
  Data Acquisition and Generation (DAG) card
Source: endace.com

Slide 33: What is a flow?
Source IP address
Destination IP address
Source port
Destination port
Layer 3 protocol type
TOS byte (DSCP)
Input logical interface (ifIndex)

Slide 34: Cisco NetFlow
Basic output: Flow record
  Most common version is v5
Current version (9) is being standardized in the IETF (template-based)
  More flexible record format
  Much easier to add new flow record types
[Figure labels: Core Network; Collection and Aggregation; Collector (PC); Approximately 1500 bytes, 20-50 flow records; Sent more frequently if traffic increases]

Slide 35: Flow Record Contents
Basic information about the flow
  Source and Destination, IP address and port
  Packet and byte counts
  Start and end times
  ToS, TCP flags
plus, information related to routing
  Next-hop IP address
  Source and destination AS
  Source and destination prefix

Slide 36: Aggregating Packets into Flows
Criteria 1: Set of packets that belong together
  Source/destination IP addresses and port numbers
  Same protocol, ToS bits
  Same input/output interfaces at a router (if known)
Criteria 2: Packets that are close together in time
  Maximum inter-packet spacing (e.g., 15 sec, 30 sec)
  Example: flows 2 and 4 are different flows due to time
[Figure labels: flow 1, flow 2, flow 3, flow 4]

Slide 37: Reducing Measurement Overhead
Filtering: on interface
  destination prefix for a customer
  port number for an application (e.g., 80 for Web)
Sampling: before insertion into flow cache
  Random, deterministic, or hash-based sampling
  1-out-of-n or stratified based on packet/flow size
  Two types: packet-level and flow-level
Aggregation: after cache eviction
  packets/flows with same next-hop AS
  packets/flows destined to a particular service

Slide 38: Packet Sampling for Flow Monitoring
Packet sampling before flow creation (Sampled NetFlow)
  1-out-of-m sampling of individual packets (e.g., m=100)
  Creation of flow records over the sampled packets
Reducing overhead
  Avoid per-packet overhead on (m-1)/m packets
  Avoid creating records for a large number of small flows
Increasing overhead (in some cases)
  May split some long transfers into multiple flow records due to larger time gaps between successive packets
[Figure labels: time, not sampled, two flows, timeout]

Slide 39: Sampling: Flow-Level Sampling
Sampling of flow records evicted from flow cache
  When evicting flows from table or when analyzing flows
Stratified sampling to put weight on heavy flows
  Select all long flows and sample the short flows
Reduces the number of flow records
  Still measures the vast majority of the traffic
[Figure: Flow 1, 40 bytes; Flow 2, 15580 bytes; Flow 3, 8196 bytes; Flow 4, 5350789 bytes; Flow 5, 532 bytes; Flow 6, 7432 bytes; sample with 100% probability; sample with 0.1% probability; sample with 10% probability]

Slide 40: Two Main Approaches
Packet-level Monitoring
  Keep packet-level statistics
  Examine (and potentially, log) variety of packet-level statistics. Essentially, anything in the packet.
  Timing
Flow-level Monitoring
  Monitor packet-by-packet (though sometimes sampled)
  Keep aggregate statistics on a flow

Slide 41: Packet Capture on High-Speed Links
Example: Georgia Tech OC3Mon
  Rack-mounted PC
  Optical splitter
  Data Acquisition and Generation (DAG) card
Source: endace.com

Slide 42: Characteristics of Packet Capture
Allows inspection on every packet on 10G links
Disadvantages
  Costly
  Requires splitting optical fibers
  Must be able to filter/store data

Slide 43: Data Measurement Repositories
Abilene/Internet 2 Observatory
  Configuration examples
  SNMP data
  ISIS, BGP routing data, NetFlow traffic data
RouteViews
  BGP updates
  BGP table snapshots

Slide 44: Multihoming and Traffic Engineering

Slide 45: What is Multihoming?
The use of redundant network links for the purposes of external connectivity
Can be achieved at many layers of the protocol stack and many places in the network
  Multiple network interfaces in a PC
  An ISP with multiple upstream interfaces
Can refer to having multiple connections to
  The same ISP
  Multiple ISPs

Slide 46: Why Multihome?
Redundancy
Availability
Performance
Cost
Interdomain traffic engineering: the process by which a multihomed network configures its network to achieve these goals

Slide 47: Redundancy
Maintain connectivity in the face of:
  Physical connectivity problems (fiber cut, device failures, etc.)
  Failures in upstream ISP

Slide 48: Performance
Use multiple network links at once to achieve higher throughput than just over a single link.
Allows incoming traffic to be load-balanced.
[Figure labels: 70% of traffic, 30% of traffic]

Slide 49: Multihoming in IP Networks Today
Stub AS: no transit service for other ASes
  No need to use BGP
Multi-homed stub AS: has connectivity to multiple immediate upstream ISPs
  Need BGP
  No need for a public AS number
  No need for IP prefix allocation
Multi-homed transit AS: connectivity to multiple ASes and transit service
  Need BGP, public AS number, IP prefix allocation

Slide 50: BGP or no?
Advantages of static routing
  Cheaper/smaller routers (less true nowadays)
  Simpler to configure
Advantages of BGP
  More control of your destiny (have providers stop announcing you)
  Faster/more intelligent selection of where to send outbound packets.
  Better debugging of net problems (you can see the Internet topology now)

Slide 51: Same Provider or Multiple?
If your provider is reliable, fast, and affordable, and offers good tech support, you may want to multi-home initially to them via some backup path (slow is better than dead).
Eventually you'll want to multi-home to different providers, to avoid failure modes due to one provider's architecture decisions.

Slide 52: Multihomed Stub: One Link
Downstream ISP's routers configure default (static) routes pointing to border router.
Upstream ISP advertises reachability
[Figure labels: Upstream ISP; Multiple links between same pair of routers; Default routes to border; Stub ISP]

Slide 53: Multihomed Stub: Multiple Links
Use BGP to share load
Use private AS number (why is this OK?)
As before, upstream ISP advertises prefix
[Figure labels: Upstream ISP; Multiple links to different upstream routers; Stub ISP; Internal routing for hot potato; BGP for load balance at edge]

Slide 54: Multihomed Stub: Multiple ISPs
Many possibilities
  Load sharing
  Primary-backup
  Selective use of different ISPs
Requires BGP, public AS number, etc.
[Figure labels: Stub ISP, Upstream ISP 1, Upstream ISP 2]

Slide 55: Multihomed Transit Network
BGP everywhere
Incoming and outgoing traffic
Challenge: balancing load on intradomain and egress links, given an offered traffic load
[Figure labels: Transit ISP, ISP 1, ISP 2, ISP 3]

Slide 56: Interdomain Traffic Engineering
The process by which a network operator configures the network to achieve
  Traffic load balance
  Redundancy (primary/backup), etc.
Two tasks
  Outbound traffic control
  Inbound traffic control
Key Problems: Predictability and Scalability

Slide 57: Outbound Traffic Control
Easier to control than inbound traffic
  Destination-based routing: sender determines where the packets go
Control over next-hop AS only
  Cannot control selection of the entire path
[Figure labels: Provider 1, Provider 2, Control with local preference]

Slide 58: Outbound Traffic: Load Balancing
Control routes to provider per-prefix
  Assign local preference across destination prefixes
  Change the local preference assignments over time
Useful inputs to load balancing
  End-to-end path performance data
  Outbound traffic statistics per destination prefix
Challenge: Getting from traffic volumes to groups of prefixes that should be assigned to each link
Premise of intelligent route control products.

Slide 59: Traffic Engineering Goals
Predictability
  Ensure the BGP decision process is deterministic
  Assume that BGP updates are (relatively) stable
Limit overhead introduced by routing changes
  Minimize frequency of changes to routing policies
  Limit number of prefixes affected by changes
Limit impact on how traffic enters the network
  Avoid new routes that might change neighbor's mind
  Select route with same attributes, or at least path length

Slide 60: Managing Scale
Destination prefixes
  More than 90,000 destination prefixes
  Don't want to have per-prefix routing policies
Small fraction of prefixes contribute most of the traffic
  Focus on the small number of heavy hitters
  Define routing policies for selected prefixes
Routing choices
  About 27,000 unique routing choices
  Help in reducing the scale of the problem
Small fraction of routing choices contribute most traffic
  Focus on the very small number of routing choices
  Define routing policies on common attributes

Slide 61: Achieving Predictability
Route prediction with static analysis
  Helpful to know effects before deployment
  Static analysis can help
Topology
BGP policy configuration
eBGP routes
Offered traffic
BGP
routing model
Flow of traffic through the network

Slide 62: Challenges to Predictability
For transit ISPs: effects on incoming traffic
Lack of coordination strikes again!

Slide 63: Inter-AS Negotiation
Coordination aids predictability
  Negotiate where to send
  Inbound and outbound
  Mutual benefits
How to implement?
  What info to exchange?
  Protecting privacy?
  How to prioritize choices?
  How to prevent cheating?
[Figure labels: Hot Potato routing; Destination 1; Destination 2; multiple peering points; Provider A; Provider B]

Slide 64: Outbound: Multihoming Goals
Redundancy
  Dynamic routing will fail over to backup link
Performance
  Select provider with best performance per prefix
  Requires active probing
Cost
  Select provider per prefix over time to minimize the total financial cost

Slide 65: Inbound Traffic Control
More difficult: no control over neighbors' decisions.
Three common techniques (previously discussed)
  AS path prepending
  Communities and local preference
  Prefix splitting
How does today's paper (MONET) control inbound traffic?

Slide 66: How many links are enough?
K upstream ISPs
Not much benefit beyond 4 ISPs
Akella et al., Performance Benefits of Multihoming, SIGCOMM 2003

Slide 67: Problems with Multihoming in IPv4
Routing table growth
  Provider-based addressing
  Advertising prefix out multiple ISPs: can't aggregate
Poor control over inbound traffic
  Existing mechanisms do not allow hosts to control inbound traffic

Slide 68: Internet Routing Overview
Intradomain (i.e., intra-AS) routing
Interdomain routing
[Figure labels: Georgia Tech, Comcast, Abilene, AT&T, Cogent, Autonomous Systems (ASes)]

Slide 69: Configuration Problems: AS 7007
a glitch at a small ISP triggered a major outage in Internet access across the country. The problem started when MAI Network Services...passed bad router information from one of its customers onto Sprint.
-- news.com, April 25, 1997
[Figure labels: UUNet, Florida Internet Barn, Sprint]

Slide 70: Diagnosis and Troubleshooting
a glitch at a small ISP triggered a major outage in Internet access across the country. The problem started when MAI Network Services...passed bad router information from one of its customers onto Sprint. -- news.com, April 25, 1997
Microsoft's websites were offline for up to 23 hours...because of a [router] misconfiguration...it took nearly a day to determine what was wrong and undo the changes. -- wired.com, January 25, 2001
WorldCom Inc...suffered a widespread outage on its Internet backbone that affected roughly 20 percent of its U.S. customer base. The network problems...affected millions of computer users worldwide. A spokeswoman attributed the outage to "a route table issue." -- cnn.com, October 3, 2002
"A number of Covad customers went out from 5pm today due to, supposedly, a DDOS (distributed denial of service attack) on a key Level3 data center, which later was described as a route leak (misconfiguration)." -- dslreports.com, February 23, 2004

Slide 71: Operator Mailing List (NANOG)
Date: Mon, 18 Oct 2004 09:15:15 -0700
Subject: Level 3 US east coast "issues"
Level 3 experiencing widespread "unspecified routing issues" on the US east coast. Master ticket 1086844. Anyone have more specific information?

Date: Mon, 18 Oct 2004 12:20:34 -0400 (EDT)
Subject: Re: Level 3 US east coast "issues"
Level 3 is currently experiencing a backbone outage causing routing instability and packet loss. We are working to restore and will be sending hourly updates

Slide 72: Operator Mailing List
Note: Only includes problems openly discussed on this list.
Compare: 83 power outages, 1 fire

Slide 73: Routing Configuration
Ranking: route selection
Dissemination: internal route advertisement
Filtering: route advertisement
[Figure labels: Customer, Competitor, Primary, Backup]

Slide 74: Internet Business Model (Simplified)
Customer/Provider: One AS pays another for reachability to some set of destinations
Settlement-free Peering: Bartering. Two ASes exchange routes with one another.
Preferences implemented with local preference manipulation
[Figure labels: Provider, Peer, Customer, Destination; Pay to use, Get paid to use, Free to use]

Slide 75: Peering Contracts: Consistent Export
Rules of settlement-free peering:
  Advertise routes at all peering points
  Advertised routes must have equal AS path length
Enables hot potato routing.
[Figure labels: Sprint, AT&T, equally good routes]

Slide 76: Consistent Export
Possible Causes
  Malice/deception
  iBGP signaling partition
  Inconsistent export policy
Two different Export Policies
  neighbor route-map PEER permit 10 set prepend 123
  neighbor route-map PEER permit 10 set prepend 123 123
[Tables: Neighbor, AS, Export (values shown: 4561, 4562); Export, Clause, Prepend (export 1, clause 1: prepend 123; export 2, clause 1: prepend 123 123)]

Slide 77: Inconsistent Export in Practice
Feamster et al., BorderGuard: Detecting Cold Potatoes from Peers. ACM IMC, October 2004.

Slide 78: Blackholes
Date: Thu, 18 Jul 2002 06:05:10 -0400 (EDT)
From: Chad Oleary
Subject: Re: problems with 701
To:
We're starting to see the same issues with UUNet, again. Anyone else seeing this? Trying to reach Qwest...
traceroute to (, 30 hops max, 38 byte packets
1 esc-lp2-gw.e-solutionscorp.com ( 1.167 ms 1.163 ms 1.142 ms
2 500.Serial2-10.GW1.TPA2.ALTER.NET ( 1.097 ms 1.059 ms 1.044 ms
3 161.at-1-0-0.XL4.ATL1.ALTER.NET ( 13.839 ms 14.108 ms 16.638 ms
4 0.so-3-1-0.XL2.ATL5.ALTER.NET ( 14.370 ms 14.587 ms 14.553 ms
5 POS7-0.BR2.ATL5.ALTER.NET ( 13.928 ms 14.099 ms 14.053 ms
6 * * *
7 * * *

Slide 79: Security

Slide 80: Security: Bogon Routes
Feamster et al., An Empirical Study of Bogon Route Advertisements. ACM CCR, January 2005.
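
The two peering rules on Slide 75 (advertise routes at all peering points, with equal AS path length) can be checked mechanically. The sketch below is only an illustration of that idea, not the method used in the BorderGuard paper: the data layout, the function name, and the peering point names are hypothetical, and the prefixes are documentation ranges. AS 123 and the extra prepend mirror the route-map example on Slide 76.

```python
def inconsistent_exports(adverts, peering_points):
    """Return prefixes a peer exports inconsistently.

    adverts maps prefix -> {peering_point: AS path as a list of AS numbers}.
    A prefix is flagged if it is missing at some peering point or if its
    AS path length differs between peering points.
    """
    problems = {}
    for prefix, by_point in adverts.items():
        missing = sorted(set(peering_points) - set(by_point))
        lengths = {point: len(path) for point, path in by_point.items()}
        if missing or len(set(lengths.values())) > 1:
            problems[prefix] = {"missing_at": missing, "path_lengths": lengths}
    return problems


# Hypothetical peering points and advertisements from one peer (AS 123).
POINTS = ["atlanta", "chicago"]
example_adverts = {
    "192.0.2.0/24": {"atlanta": [123], "chicago": [123]},          # consistent
    "198.51.100.0/24": {"atlanta": [123], "chicago": [123, 123]},  # extra prepend
    "203.0.113.0/24": {"atlanta": [123]},                          # not advertised in chicago
}
flagged = inconsistent_exports(example_adverts, POINTS)
```

Comparing only path length is deliberately narrow: it matches the rule as stated on the slide, and ignores other attributes a peer could use to steer traffic.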
Slide 81: Spam, Phishing, etc.
Unsolicited commercial email
As of about August 2008, estimates indicate that about 95% of all email is spam
Common spam filtering techniques
  Content-based filters
  DNS Blacklist (DNSBL) lookups: Significant fraction of today's DNS traffic!
Can IP addresses from which spam is received be spoofed?

Slide 82: Spam and Routing

Slide 83: Worms and Botnets

Slide 84: What is a Worm?
Code that replicates and propagates across the network
  Often carries a payload
Usually spread via exploiting flaws in open services
  Viruses require user action to spread
First worm: Robert Morris, November 1988
  6-10% of all Internet hosts infected (!)
Many more since, but none on that scale until July 2001

Slide 85: Example Worm: Code Red
Initial version: July 13, 2001
Exploited known ISAPI vulnerability in Microsoft IIS Web servers
1st through 20th of each month: spread
20th through end of each month: attack
Payload: Web site defacement
Scanning: Random IP addresses
Bug: failure to seed random number generator

Slide 86: Code Red: Revisions
Released July 19, 2001
Payload: flooding attack on www.whitehouse.gov
  Attack was mounted at the IP address of the Web site
Bug: died after 20th of each month
Random number generator for IP scanning fixed

Slide 87: Code Red: Host Infection Rate
Exponential infection rate
Measured using backscatter technique

Slide 88: Designing Fast-Spreading Worms
Hit-list scanning
  Time to infect first 10k hosts dominates infection time
  Solution: Reconnaissance (stealthy scans, etc.)
Permutation scanning
  Observation: Most scanning is redundant
  Idea: Shared permutation of address space. Start scanning from own IP address. Re-randomize when another infected machine is found.
Internet-scale hit lists
  Flash worm: complete infection within 30 seconds

Slide 89: Botnets
Bots: Autonomous programs performing tasks
Plenty of benign bots
  e.g., weatherbug
Botnets: group of bots
  Typically carries malicious connotation
  Large numbers of infected machines
  Machines enlisted with infection vectors like worms (last lecture)
Available for simultaneous control by a master
Size: up to 350,000 nodes (from today's paper)

Slide 90: Rallying the Botnet
Easy to combine worm, backdoor functionality
Problem: how to learn about successfully infected machines?
Options
  Email
  Hard-coded email address

Slide 91: Botnet Control
Botnet master typically runs some IRC server on a well-known port (e.g., 6667)
Infected machine contacts botnet with pre-programmed DNS name (e.g., big-bot.de)
Dynamic DNS: allows controller to move about freely
[Figure labels: Infected Machine, Dynamic DNS, Botnet Controller (IRC server)]

Slide 92: Some Defenses

Slide 93: Idea #1: Ingress Filtering
RFC 2827: Routers install filters to drop packets from networks that are not downstream
Feasible at edges
Difficult to configure closer to network core
[Figure labels: Internet; Drop all packets with source address other than]

Slide 94: Idea #2: uRPF Checks
Unicast Reverse Path Forwarding
  Cisco: ip verify unicast reverse-path
Requires symmetric routing
Accept packet from interface only if forwarding table entry for source IP address matches ingress interface
[Figure labels: Strict Mode uRPF Enabled; A; Routing Table (Destination, Next Hop); Int. 1; Int. 2; from wrong interface]

Slide 95: Problems with uRPF
Asymmetric routing

Slide 96: S-BGP
Address-based PKI: validate signatures
Authentication of ownership for IP address blocks, AS number, an AS's identity, and a BGP router's identity
Use existing infrastructure (Internet registries etc.)
Routing origination is digitally signed
BGP updates are digitally signed
Route attestations: A new, optional, BGP transitive path attribute carries digital signatures covering the routing information in updates

Slide 97: Practical Problems with S-BGP
Requires Public-Key Infrastructure
Lots of digital signatures to calculate and verify.
  Message overhead
  CPU overhead
  Calculation expense is greatest when topology is changing
  Caching can help
Route aggregation is problematic (maybe that's OK)
Secure route withdrawals when link or node fails?
Address ownership data out of date
Deployment
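
The strict-mode uRPF rule from Slide 94 (accept a packet only if the forwarding table entry for its source address points back out the ingress interface) can be sketched in a few lines. This is a toy model under stated assumptions, not router code: the forwarding table is a plain dictionary, the interface names and prefixes are hypothetical, and the lookup is a linear longest-prefix match (the forwarding rule named on Slide 16) using Python's standard `ipaddress` module.

```python
import ipaddress


def lookup_interface(fib, address):
    """Longest-prefix match: return the interface for address, or None."""
    addr = ipaddress.ip_address(address)
    best_net, best_iface = None, None
    for prefix, interface in fib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_iface = net, interface
    return best_iface


def urpf_strict_accept(fib, source_ip, ingress_interface):
    """Strict mode: the route back to the source must use the ingress interface."""
    return lookup_interface(fib, source_ip) == ingress_interface


# Hypothetical forwarding table: prefix -> outgoing interface.
fib = {
    "192.0.2.0/24": "int1",
    "192.0.2.128/25": "int2",   # more specific route wins for .128-.255
    "198.51.100.0/24": "int2",
}
ok = urpf_strict_accept(fib, "192.0.2.10", "int1")
spoofed = urpf_strict_accept(fib, "192.0.2.10", "int2")
```

The same model shows the problem raised on Slide 95: with asymmetric routing, a legitimate packet can arrive on an interface other than the one the table would use to reach its source, and strict mode drops it.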
