
Router Internals

CS 4251: Computer Networking II

Nick Feamster

Fall 2008


Today’s Lecture

• The design of big, fast routers
• Design constraints
 – Speed
 – Size
 – Power consumption
• Components
• Algorithms
 – Lookups and packet processing (classification, etc.)
 – Packet queueing
 – Switch arbitration
 – Fairness


What’s In A Router

• Interfaces
 – Input/output of packets
• Switching fabric
 – Moving packets from input to output
• Software
 – Routing
 – Packet processing
 – Scheduling
 – Etc.


What a Router Chassis Looks Like

Cisco CRS-1
 – Dimensions: 6 ft × 2 ft × 19”
 – Capacity: 1.2 Tb/s
 – Power: 10.4 kW
 – Weight: 0.5 ton
 – Cost: $500k

Juniper M320
 – Dimensions: 3 ft × 2 ft × 17”
 – Capacity: 320 Gb/s
 – Power: 3.1 kW


What a Router Line Card Looks Like

• 1-Port OC48 (2.5 Gb/s) (for Juniper M40)
• 4-Port 10 GigE (for Cisco CRS-1)

Power: about 150 Watts

Dimensions: 21 in × 10 in × 2 in


Big, Fast Routers: Why Bother?

• Faster link bandwidths
• Increasing demands
• Larger network size (hosts, routers, users)


Summary of Routing Functionality

• Router gets packet
• Looks at packet header for destination
• Looks up forwarding table for output interface
• Modifies header (TTL, IP header checksum)
• Passes packet to output interface
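
The per-packet steps above can be sketched in Python. This is a minimal illustration, not the code of any particular router: the function names, the sample addresses, and the dictionary used as a forwarding table (an exact-match stand-in for the longest-prefix match covered later) are all assumptions.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, complemented."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward(header: bytearray, table: dict):
    """Return the output interface, or None if the packet leaves the fast path."""
    dst = bytes(header[16:20])               # destination address field
    out_if = table.get(dst)                  # hypothetical exact-match table
    if out_if is None or header[8] <= 1:     # no route, or TTL would expire
        return None
    header[8] -= 1                           # decrement TTL
    header[10:12] = b"\x00\x00"              # zero the checksum field, then recompute
    header[10:12] = struct.pack("!H", ipv4_checksum(bytes(header)))
    return out_if

# A 20-byte header: 10.0.0.1 -> 10.0.0.2, TTL 64, protocol 6 (TCP).
hdr = bytearray(struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                            bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])))
table = {bytes([10, 0, 0, 2]): "eth1"}
out = forward(hdr, table)
```

A header with a valid checksum sums to zero under `ipv4_checksum`, which gives a quick way to check the update.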


Generic Router Architecture

[Figure: each packet (data + header) passes through header processing (look up IP address, update header) and is then queued in buffer memory.]

• Address table: IP address → next hop; 1M prefixes, off-chip DRAM
• Buffer memory: 1M packets, off-chip DRAM

Question: What is the difference between this architecture and that in today’s paper?


Innovation #1: Each Line Card Has the Routing Tables

• Prevents central table from becoming a bottleneck at high speeds
• Complication: Must update forwarding tables on the fly.
 – How would a router update tables without slowing the forwarding engines?


Generic Router Architecture

[Figure: three input ports, each with its own header processing (look up IP address, update header) and address table; an interconnection fabric; three output ports, each with a buffer manager and buffer memory.]


First Generation Routers

[Figure: line interfaces (MAC) attached to a shared bus, with a central CPU, route table, and off-chip buffer memory.]

Typically <0.5 Gb/s aggregate capacity


Second Generation Routers

[Figure: line cards, each with a MAC, buffer memory, and forwarding cache, connected to a central CPU with route table and buffer memory.]

Typically <5 Gb/s aggregate capacity


Innovation #2: Switched Backplane

• Every input port has a connection to every output port
• During each timeslot, each input connected to zero or one outputs
• Advantage: Exploits parallelism
• Disadvantage: Need scheduling algorithm


Third Generation Routers

[Figure: line cards (MAC, local buffer memory, forwarding table) and a CPU card (CPU, memory, routing table) connected by a “crossbar” switched backplane.]

Typically <50 Gb/s aggregate capacity


Other Goal: Utilization

• “100% Throughput”: no packets experience head-of-line blocking

• Does the previous scheme achieve 100% throughput?

• What if the crossbar could have a “speedup”?

Key result: Given a crossbar with 2x speedup, any maximal matching can achieve 100% throughput.


Head-of-Line Blocking

[Figure: Inputs 1–3, each with a single FIFO queue, contending for Outputs 1–3.]

Problem: The packet at the front of the queue experiences contention for the output queue, blocking all packets behind it.

Maximum throughput in such a switch: 2 − √2 ≈ 58.6%
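
A small simulation can make the throughput limit concrete. The sketch below assumes a saturated N×N switch with one FIFO per input, uniformly random destinations, and a random winner among inputs contending for the same output; the function name and parameters are illustrative, not from the lecture.

```python
import random

def hol_throughput(n_ports: int, slots: int, seed: int = 0) -> float:
    """Fraction of output capacity used by a saturated FIFO input-queued switch."""
    rng = random.Random(seed)
    # hol[i] is the output wanted by the head-of-line packet at input i.
    # Queues are assumed always backlogged, so a served input gets a new packet.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(slots):
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)           # one packet per output per slot
            hol[winner] = rng.randrange(n_ports)  # next packet moves to the head
            delivered += 1                        # losers stay blocked at the head
    return delivered / (n_ports * slots)

throughput = hol_throughput(n_ports=32, slots=2000)
```

For small switches the measured value is higher (a 2×2 switch reaches about 0.75); it falls toward 2 − √2 as the port count grows.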


Combined Input-Output Queueing

• Advantages
 – Easy to build
 – 100% can be achieved with limited speedup
• Disadvantages
 – Harder to design algorithms
 – Two congestion points
 – Flow control at destination

[Figure: input interfaces and output interfaces connected by a crossbar.]


Solution: Virtual Output Queues

• Maintain N virtual queues at each input
 – one per output

[Figure: Inputs 1–3, each holding one queue per output, connected to Outputs 1–3.]


Scheduling and Fairness

• What is an appropriate definition of fairness?
 – One notion: Max-min fairness
 – Disadvantage: Compromises throughput
• Max-min fairness gives priority to low data rates/small values
• Is it guaranteed to exist?
• Is it unique?


Max-Min Fairness

• An allocation of flow rates is max-min fair if no rate x can be increased without decreasing some other rate y that is smaller than or equal to x.
• How to share equally with different resource demands
 – small users will get all they want
 – large users will evenly split the rest
• More formally, perform this procedure:
 – resource allocated to customers in order of increasing demand
 – no customer receives more than requested
 – customers with unsatisfied demands split the remaining resource


Example

• Demands: 2, 2.6, 4, 5; capacity: 10
 – Equal split: 10/4 = 2.5
 – Problem: 1st user needs only 2; excess of 0.5
• Distribute the excess among the other 3, so 0.5/3 = 0.167
 – now we have allocations of [2, 2.67, 2.67, 2.67]
 – customer #2 needs only 2.6, leaving an excess of 0.07
 – divide that in two, giving [2, 2.6, 2.7, 2.7]
• Maximizes the minimum share to each customer whose demand is not fully serviced
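
The procedure can be written as a short function. This is a sketch under the assumption of a single shared resource and known demands; the name `max_min_fair` is illustrative.

```python
def max_min_fair(demands, capacity):
    """Allocate capacity in order of increasing demand (progressive filling)."""
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for served, i in enumerate(order):
        # Equal split of what is left among customers not yet served.
        fair_share = remaining / (len(order) - served)
        alloc[i] = min(demands[i], fair_share)   # never more than requested
        remaining -= alloc[i]
    return alloc

allocation = max_min_fair([2, 2.6, 4, 5], capacity=10)
```

With the demands from the example, the result matches the allocation worked out by hand, up to floating-point rounding.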


How to Achieve Max-Min Fairness

• Take 1: Round-Robin
 – Problem: Packets may have different sizes
• Take 2: Bit-by-Bit Round Robin
 – Problem: Feasibility
• Take 3: Fair Queuing
 – Service packets according to soonest “finishing time”

Adding QoS: Add weights to the queues…
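
The finishing-time idea can be sketched for a simplified case. The code below assumes every flow is backlogged from time 0, so a packet's finish tag is just the previous tag of its flow plus size divided by weight; a real fair-queuing scheduler also tracks a virtual clock for packets that arrive later. All names are illustrative.

```python
def fair_queue_order(flows, weights=None):
    """Return (flow, size) pairs in service order.

    flows: {flow_id: [packet sizes]}, all assumed queued at time 0.
    weights: optional {flow_id: weight}; missing flows get weight 1.
    """
    weights = weights or {}
    tagged = []
    for flow_id, sizes in flows.items():
        finish = 0.0
        for seq, size in enumerate(sizes):
            # Time at which bit-by-bit round robin would finish this packet.
            finish += size / weights.get(flow_id, 1.0)
            tagged.append((finish, flow_id, seq, size))
    tagged.sort()                                 # soonest finishing time first
    return [(flow_id, size) for _, flow_id, _, size in tagged]

flows = {"A": [1000, 1000], "B": [400, 400, 400, 400]}
plain = [f for f, _ in fair_queue_order(flows)]
weighted = [f for f, _ in fair_queue_order(flows, {"A": 2.0})]
```

Giving flow A twice the weight halves its finish tags, so its large packets are served earlier than in the unweighted order.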


Router Components and Functions

• Route processor
 – Routing
 – Installing forwarding tables
 – Management
• Line cards
 – Packet processing and classification
 – Packet forwarding
• Switched bus (“Crossbar”)
 – Scheduling


Crossbar Switching

• Conceptually: N inputs, N outputs
 – Actually, inputs are also outputs
• In each timeslot, one-to-one mapping between inputs and outputs.
• Goal: Maximal matching

[Figure: traffic demands L_{11}(n), …, L_{N1}(n) at the inputs, and a bipartite match between inputs and outputs.]

Maximum weight match:

$$S^*(n) = \arg\max_{S(n)} \left( L^T(n)\, S(n) \right)$$
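
The difference between a maximal match and a maximum weight match can be shown on a small occupancy matrix. In this sketch, `L[i][j]` is assumed to be the number of packets queued at input i for output j; the greedy scheduler and the brute-force search are illustrative, not the algorithms used in any specific router.

```python
from itertools import permutations

def greedy_maximal_match(L):
    """Add the heaviest free (input, output) pair until none can be added."""
    n = len(L)
    edges = sorted(((L[i][j], i, j) for i in range(n) for j in range(n)
                    if L[i][j] > 0), reverse=True)
    used_in, used_out, match = set(), set(), {}
    for _, i, j in edges:
        if i not in used_in and j not in used_out:
            match[i] = j
            used_in.add(i)
            used_out.add(j)
    return match

def max_weight_match(L):
    """Brute force over all permutations; only practical for tiny N."""
    n = len(L)
    best = max(permutations(range(n)),
               key=lambda p: sum(L[i][p[i]] for i in range(n)))
    return {i: best[i] for i in range(n) if L[i][best[i]] > 0}

def weight(L, match):
    return sum(L[i][j] for i, j in match.items())

L = [[5, 6, 0],
     [0, 5, 0],
     [0, 0, 2]]
maximal = greedy_maximal_match(L)
maximum = max_weight_match(L)
```

Here the greedy match serves input 0 to output 1 and leaves input 1 idle, while the maximum weight match serves all three inputs.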


Processing: Fast Path vs. Slow Path

• Optimize for common case
 – BBN router: 85 instructions for fast-path code
 – Fits entirely in L1 cache
• Non-common cases handled on slow path
 – Route cache misses
 – Errors (e.g., ICMP time exceeded)
 – IP options
 – Fragmented packets
 – Multicast packets


IP Address Lookup

Challenges:

1. Longest-prefix match (not exact).

2. Tables are large and growing.

3. Lookups must be fast.


Address Tables are Large


Lookups Must be Fast

Year   Line     Rate       40B packets (Mpkt/s)
1997   OC-12    622 Mb/s   1.94
1999   OC-48    2.5 Gb/s   7.81
2001   OC-192   10 Gb/s    31.25
2003   OC-768   40 Gb/s    125

OC-768: Still pretty rare outside of research networks

Cisco CRS-1 1-Port OC-768C (Line rate: 42.1 Gb/s)
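
The packet rates for 40-byte packets follow directly from the line rates. A minimal sketch of the arithmetic, using nominal line rates and ignoring framing overhead (both assumptions):

```python
def packets_per_second(line_rate_bps: float, packet_bytes: int = 40) -> float:
    """Back-to-back packets per second at a given line rate."""
    return line_rate_bps / (packet_bytes * 8)

def time_per_packet_ns(line_rate_bps: float, packet_bytes: int = 40) -> float:
    """Time available to handle one packet, in nanoseconds."""
    return 1e9 / packets_per_second(line_rate_bps, packet_bytes)

rates_mpps = {
    "OC-12": packets_per_second(622e6) / 1e6,
    "OC-48": packets_per_second(2.5e9) / 1e6,
    "OC-192": packets_per_second(10e9) / 1e6,
    "OC-768": packets_per_second(40e9) / 1e6,
}
```

Each step up in line rate shrinks the per-packet time budget by the same factor, which is why the number of memory accesses per lookup matters so much.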


Lookup is Protocol Dependent

Protocol: MPLS, ATM, Ethernet
Mechanism: Exact match search
Techniques:
 – Direct lookup
 – Associative lookup
 – Hashing
 – Binary/Multi-way Search Trie/Tree

Protocol: IPv4, IPv6
Mechanism: Longest-prefix match search
Techniques:
 – Radix trie and variants
 – Compressed trie
 – Binary search on prefix intervals


Exact Matches, Ethernet Switches

• layer-2 addresses usually 48 bits long
• address global, not just local to link
• range/size of address not “negotiable”
• 2^48 > 10^12, therefore cannot hold all addresses in table and use direct lookup


Exact Matches, Ethernet Switches

• advantages:
 – simple
 – expected lookup time is small
• disadvantages
 – inefficient use of memory
 – non-deterministic lookup time

attractive for software-based switches, but decreasing use in hardware platforms
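
The properties listed here suggest that this slide is describing hashing, although the transcript does not say so; the sketch below rests on that assumption. It uses Python's built-in `dict` (a hash table) as an exact-match table from 48-bit addresses to ports. The class and method names are illustrative.

```python
class MacTable:
    """Exact-match lookup from 48-bit layer-2 address to output port."""

    def __init__(self):
        self._ports = {}          # hash table: only learned addresses use memory

    @staticmethod
    def _key(mac: str) -> int:
        # Normalize "00:1a:..." or "00-1A-..." to a 48-bit integer.
        return int(mac.replace(":", "").replace("-", ""), 16)

    def learn(self, mac: str, port: int) -> None:
        self._ports[self._key(mac)] = port

    def lookup(self, mac: str):
        # Expected constant time; worst case depends on collisions.
        return self._ports.get(self._key(mac))

table = MacTable()
table.learn("00:1a:2b:3c:4d:5e", 3)
port = table.lookup("00-1A-2B-3C-4D-5E")
unknown = table.lookup("ff:ff:ff:ff:ff:fe")
```

A miss returns `None`; what the switch does next (for example, flooding) is outside this sketch.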


IP Lookups find Longest Prefixes

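[Figure: prefixes drawn as ranges on the address line from 0 to 2^32 − 1.]

Prefixes: 65.0.0.0/8, 128.9.0.0/16, 128.9.16.0/21, 128.9.172.0/21, 128.9.176.0/24, 142.12.0.0/19

Destination address: 128.9.16.14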

Routing lookup: Find the longest matching prefix (aka the most specific route) among all prefixes that match the destination address.
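
The lookup on this slide can be reproduced with Python's standard `ipaddress` module. This is a linear scan for illustration only, not a fast lookup structure; `strict=False` is used so that a prefix whose host bits are not all zero is still accepted.

```python
import ipaddress

PREFIXES = [
    "65.0.0.0/8", "128.9.0.0/16", "128.9.16.0/21",
    "128.9.172.0/21", "128.9.176.0/24", "142.12.0.0/19",
]

def longest_prefix_match(address: str, prefixes):
    """Return the most specific network containing address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for text in prefixes:
        net = ipaddress.ip_network(text, strict=False)
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return best

match = longest_prefix_match("128.9.16.14", PREFIXES)
```

Both 128.9.0.0/16 and 128.9.16.0/21 contain the destination; the /21 wins because it is longer.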


IP Address Lookup

• routing tables contain (prefix, next hop) pairs
• address in packet compared to stored prefixes, starting at left
• prefix that matches largest number of address bits is desired match
• packet forwarded to specified next hop

routing table:

prefix        next hop
10*           7
01*           5
110*          3
1011*         5
0001*         0
0001 0*       1
0101 1*       7
0011 00*      2
1011 001*     3
1011 010*     5
0100 110*     6
0100 1100*    4
1011 0011*    8
1001 1000*    10
0101 1001*    9

address: 1011 0010 1000

Problem: large router may have 100,000 prefixes in its list


Longest Prefix Match Harder than Exact Match

• destination address of arriving packet does not carry information to determine length of longest matching prefix

• need to search space of all prefix lengths; as well as space of prefixes of given length


LPM in IPv4: exact match

Use 32 exact match algorithms

[Figure: the network address is matched in parallel against prefixes of length 1, length 2, …, length 32; a priority encoder picks among the matches and outputs the port.]


Address Lookup Using Tries

• prefixes “spelled” out by following path from root
• to find best prefix, spell out address in tree
• last green node marks longest matching prefix
• adding prefix easy

Prefix table:

P1   111*    H1
P2   10*     H2
P3   1010*   H3
P4   10101   H4

Lookup 10111

add P5=1110*

Trie node: next-hop-ptr (if prefix), left-ptr, right-ptr

[Figure: binary trie with nodes A–H, branches labeled 0 and 1, and prefix nodes P1–P4; adding P5 creates node I.]
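
A single-bit trie with the node layout above can be sketched as follows. Class and method names are illustrative; addresses and prefixes are given as bit strings to match the slide.

```python
class TrieNode:
    def __init__(self):
        self.next_hop = None           # set only if this node is a prefix
        self.children = [None, None]   # left-ptr (bit 0), right-ptr (bit 1)

class BinaryTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix_bits: str, next_hop: str) -> None:
        node = self.root
        for bit in prefix_bits:        # spell out the prefix from the root
            index = int(bit)
            if node.children[index] is None:
                node.children[index] = TrieNode()
            node = node.children[index]
        node.next_hop = next_hop

    def lookup(self, address_bits: str):
        node, best = self.root, self.root.next_hop
        for bit in address_bits:       # one memory access per address bit
            node = node.children[int(bit)]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop   # remember the last prefix node seen
        return best

trie = BinaryTrie()
for prefix, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
    trie.insert(prefix, hop)
result = trie.lookup("10111")
trie.insert("1110", "H5")              # adding P5 only extends one path
```

For the address 10111, the walk passes the node for 10* and then falls off the trie, so the answer is the next hop of P2.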


Single-Bit Tries: Properties

• Small memory and update times
 – Main problem is the number of memory accesses required: 32 in the worst case
• Way beyond our budget of approx 4
 – (OC48 requires 160ns lookup, or 4 accesses)


Direct Trie

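• When pipelined, one lookup per memory access
• Inefficient use of memory

[Figure: a first-level table indexed by the top 24 bits (entries 0 to 2^24 − 1, i.e. 0000…0000 to 1111…1111) and second-level tables indexed by the remaining 8 bits (entries 0 to 2^8 − 1).]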


Multi-bit Tries

Binary trie:
 – Depth = W
 – Degree = 2
 – Stride = 1 bit

Multi-ary trie:
 – Depth = W/k
 – Degree = 2^k
 – Stride = k bits


4-ary Trie (k=2)

A four-ary trie node: next-hop-ptr (if prefix), ptr00, ptr01, ptr10, ptr11

Prefix table:

P1   111*    H1
P2   10*     H2
P3   1010*   H3
P4   10101   H4

Lookup 10111

[Figure: 4-ary trie with nodes A–H, branches labeled 10 and 11, and entries P2, P3, P11, P12, P41, P42.]


Prefix Expansion with Multi-bit Tries

If stride = k bits, prefix lengths that are not a multiple of k must be expanded

E.g., k = 2:

Prefix   Expanded prefixes
0*       00*, 01*
11*      11*
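
Prefix expansion is mechanical enough to sketch in a few lines. The function names are illustrative, and the rule that an original longer prefix overrides an expanded shorter one is stated here as an assumption, since the slide does not spell it out.

```python
def expand_prefix(prefix: str, stride: int):
    """Pad a bit-string prefix up to the next multiple of the stride."""
    pad = (-len(prefix)) % stride
    if pad == 0:
        return [prefix]
    return [prefix + format(i, f"0{pad}b") for i in range(2 ** pad)]

def expand_table(table: dict, stride: int) -> dict:
    """Expand every prefix; more specific originals win on collision."""
    expanded = {}
    for prefix in sorted(table, key=len):      # shorter prefixes first
        for p in expand_prefix(prefix, stride):
            expanded[p] = table[prefix]        # longer ones overwrite later
    return expanded

example = expand_prefix("0", 2)
table = expand_table({"111": "H1", "10": "H2", "1010": "H3", "10101": "H4"}, 2)
```

Applied to the four-prefix table used on the trie slides, 111* becomes 1110* and 1111*, and 10101 becomes 101010* and 101011*.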


Leaf-Pushed Trie

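Trie node: left-ptr or next-hop, right-ptr or next-hop

Prefix table:

P1   111*    H1
P2   10*     H2
P3   1010*   H3
P4   10101   H4

[Figure: leaf-pushed binary trie with nodes A–E and G, branches labeled 0 and 1; the leaves hold P1, P2, P3, P4, with P2 appearing twice.]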


Further Optimizations: Lulea

• 3-level trie: 16-bits, 8-bits, 8-bits

• Bitmap to compress out repeated entries


PATRICIA

• PATRICIA (Practical Algorithm To Retrieve Information Coded In Alphanumeric)
 – Eliminate internal nodes with only one descendant
 – Encode bit position for determining (right) branching

Patricia tree internal node: bit-position, left-ptr, right-ptr

Prefix table:

P1   111*    H1
P2   10*     H2
P3   1010*   H3
P4   10101   H4

Bitpos 12345

Lookup 10111

[Figure: Patricia tree with internal nodes A–C and E–G labeled with bit positions (2, 3, 5), branches labeled 0 and 1, and leaves P1–P4.]


Fast IP Lookup Algorithms

• Lulea Algorithm (SIGCOMM 1997)
 – Key goal: compactly represent routing table in small memory (hopefully, within cache size), to minimize memory access
 – Use a three-level data structure
   • Cut the look-up tree at level 16 and level 24
 – Clever ways to design compact data structures to represent routing look-up info at each level
• Binary Search on Levels (SIGCOMM 1997)
 – Represent look-up tree as array of hash tables
 – Notion of “marker” to guide binary search
 – Prefix expansion to reduce size of array (thus memory accesses)


Faster LPM: Alternatives

• Content addressable memory (CAM)
 – Hardware-based route lookup
 – Input = tag, output = value
 – Requires exact match with tag
   • Multiple cycles (1 per prefix) with single CAM
   • Multiple CAMs (1 per prefix) searched in parallel
 – Ternary CAM
   • (0, 1, don’t care) values in tag match
   • Priority (i.e., longest prefix) by order of entries

Historically, this approach has not been very economical.
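
The ternary match can be modeled in software as a value/mask pair per entry, with entries stored longest prefix first so that the first hit is the longest match. This is only a behavioral sketch: real TCAM hardware compares all entries in parallel, and the names and the 5-bit example table (reused from the trie slides) are assumptions for illustration.

```python
def tcam_entry(prefix_bits: str, width: int):
    """Return (value, mask); mask bits set to 1 are 'care', 0 are 'don't care'."""
    length = len(prefix_bits)
    value = int(prefix_bits, 2) << (width - length) if length else 0
    mask = ((1 << length) - 1) << (width - length)
    return value, mask

class TernaryCam:
    def __init__(self, routes: dict, width: int):
        # Priority by order of entries: longest prefixes are stored first.
        ordered = sorted(routes.items(), key=lambda kv: -len(kv[0]))
        self.entries = [(*tcam_entry(p, width), hop) for p, hop in ordered]

    def lookup(self, address_bits: str):
        key = int(address_bits, 2)
        for value, mask, hop in self.entries:   # sequential here, parallel in hardware
            if (key & mask) == value:
                return hop
        return None

cam = TernaryCam({"111": "H1", "10": "H2", "1010": "H3", "10101": "H4"}, width=5)
hop = cam.lookup("10111")
```

Because ordering encodes priority, inserting a new prefix may require moving existing entries to keep longer prefixes ahead of shorter ones.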


Faster Lookup: Alternatives

• Caching
 – Packet trains exhibit temporal locality
 – Many packets to same destination
• Cisco Express Forwarding


IP Address Lookup: Summary

• Lookup limited by memory bandwidth.
• Lookup uses high-degree trie.


Recent Trends: Programmability

• NetFPGA: 4-port interface card, plugs into PCI bus (Stanford)
 – Customizable forwarding
 – Appearance of many virtual interfaces (with VLAN tags)
• Programmability with network processors (Washington U.)

[Figure: line cards and PEs attached to a switch.]


The Stanford Clean Slate Program http://cleanslate.stanford.edu

Experimenter’s Dream (Vendor’s Nightmare)

[Figure: a switch/router with standard network processing in hardware (hw) and software (sw); the experimenter writes experimental code on the switch/router, adding user-defined processing alongside both.]


No obvious way

• Commercial vendor won’t open software and hardware development environment
 – Complexity of support
 – Market protection and barrier to entry
• Hard to build my own
 – Prototypes are flakey
 – Software only: Too slow
 – Hardware/software: Fanout too small (need >100 ports for wiring closet)


Furthermore, we want…

• Isolation: Regular production traffic untouched
• Virtualized and programmable: Different flows processed in different ways
• Equipment we can trust in our wiring closet
• Open development environment for all researchers (e.g., Linux, Verilog, etc.)
• Flexible definitions of a flow
 – Individual application traffic
 – Aggregated flows
 – Alternatives to IP running side-by-side
 – …


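OpenFlow Switching

[Figure: an OpenFlow switch, as defined by the OpenFlow Switch specification, with a flow table in hardware (hw) and a secure channel in software (sw); a controller on a PC talks to the secure channel using the OpenFlow protocol over SSL.]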


Flow Table Entry

“Type 0” OpenFlow Switch

Each entry has three parts: Rule, Action, Stats

• Rule (header fields + mask): Switch Port, MAC src, MAC dst, Eth type, VLAN ID, IP Src, IP Dst, IP Prot, TCP sport, TCP dport
• Action:
 1. Forward packet to port(s)
 2. Encapsulate and forward to controller
 3. Drop packet
 4. Send to normal processing pipeline
• Stats: Packet + byte counters


OpenFlow “Type 1”

• Definition in progress
• Additional actions
 – Rewrite headers
 – Map to queue/class
 – Encrypt
• More flexible header
 – Allow arbitrary matching of first few bytes
• Support multiple controllers
 – Load-balancing and reliability


[Figure: a controller PC connected via OpenFlow to an OpenFlow access point and to OpenFlow-enabled commercial switches in the server room; each switch has a flow table and secure channel alongside its normal software and normal datapath.]

