Link Layer


4/19/2012

Admin

Written Assignment—Network new due date: Monday, April 23

If you are considering replacement work, please stop by to talk to me

Any feedback/suggestions on the course will be appreciated.

2

3

Recap: Internet Routing

Intradomain routing and interdomain routing

CIDR to allow flexibility in aggregation of destination addresses to improve routing scalability

Longest prefix matching to determine the next hop to a destination

Basic switching fabric design

4

Putting it Together: Example 1 (same network): A->B

Look up dest. address; find that dest. is on the same net

Hand datagram to link layer to send inside a link-layer frame

IP datagram: misc fields | 223.1.1.1 (src) | 223.1.1.3 (dst) | data

223.1.1.1

223.1.1.2

223.1.1.3

223.1.1.4 223.1.2.9

223.1.2.2

223.1.2.1

223.1.3.2223.1.3.1

223.1.3.27

A

B

forwarding table in A

  Dest. Net.    next router   Nhops
  223.1.1/24    -             1
  223.1.2/24    223.1.1.4     2
  223.1.3/24    223.1.1.4     2
  0.0.0.0/0     223.1.1.4     -
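The lookup in A's forwarding table can be sketched as longest prefix matching over the four entries on the slide. This is only a minimal sketch: the function name `next_hop` and the use of `None` for "directly attached, no next router" are assumptions made for illustration.

```python
import ipaddress

# Forwarding table in A, copied from the slide.
# None as next hop is an assumption of this sketch: it stands for
# "destination is on the same net, hand the frame to it directly".
FORWARDING_TABLE = [
    (ipaddress.ip_network("223.1.1.0/24"), None),
    (ipaddress.ip_network("223.1.2.0/24"), "223.1.1.4"),
    (ipaddress.ip_network("223.1.3.0/24"), "223.1.1.4"),
    (ipaddress.ip_network("0.0.0.0/0"), "223.1.1.4"),  # default route
]

def next_hop(dest):
    """Return the next router for dest, chosen by longest prefix matching."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, hop) for net, hop in FORWARDING_TABLE if addr in net]
    # The default route matches everything, so matches is never empty.
    net, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return hop
```

With this table, a destination on 223.1.1/24 is delivered directly, while everything else (including addresses covered only by the default route) goes to router 223.1.1.4.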

223.1.4.1

To Internet

src dst

5

Putting it Together: Example 2 (Different Networks): A-> E

Look up dest. address in forwarding table: next-hop router to dest. is 223.1.1.4

Hand datagram to link layer to send to router 223.1.1.4 inside a link-layer frame; the dest. of the link-layer frame is router 223.1.1.4

IP datagram: misc fields | 223.1.1.1 (src) | 223.1.2.3 (dst) | data

223.1.1.1

223.1.1.2

223.1.1.3

223.1.1.4 223.1.2.9

223.1.2.3

223.1.2.1

223.1.3.2223.1.3.1

223.1.3.27

A

BE

forwarding table in A

  Dest. Net.    next router   Nhops
  223.1.1/24    -             1
  223.1.2/24    223.1.1.4     2
  223.1.3/24    223.1.1.4     2
  0.0.0.0/0     223.1.1.4     -

223.1.4.1

To Internet

Summary of Network Layer

We have covered the basics of the network layer: routing and forwarding

There are multiple other topics that we did not cover
  Multicast/anycast
  QoS
  slides will be linked on the schedule page in case you need reading in the summer

6

7

Recap: The Hourglass Architecture of the Internet

Telnet  Email  FTP  WWW

TCP  UDP

IP

Ethernet  FDDI  Wireless  ADSL  Cable  DOCSIS

8

Link Layer: Introduction

Some terminology:
  hosts and routers are nodes (bridges and switches too)
  communication channels that connect adjacent nodes along a communication path are links
    wired, wireless
    dedicated, shared
  a layer-2 PDU is called a frame; it encapsulates a layer-3 PDU (datagram)

9

Link layer: Context

Data-link layer has responsibility of transferring datagram from one node to another node

Datagram may be transferred by different link protocols over different links, e.g., Ethernet on first link, frame relay on intermediate links, 802.11 on last link

transportation analogy: trip from New Haven to San Francisco
  taxi: home to Union Station
  train: Union Station to JFK
  plane: JFK to San Francisco airport
  shuttle: airport to hotel

10

Link Layer Services

Framing
  o encapsulate datagram into frame, adding header, trailer, and error detection/correction
Multiplexing/demultiplexing
  o frame headers to identify src, dest
Media access control
Forwarding/switching within a link-layer (Layer 2) domain
Reliable delivery between adjacent nodes
  o we learned how to do this already!
  o seldom used on low bit error links (fiber, some twisted pair)
  o common for wireless links: high error rates

11

Adaptors Communicating

link layer typically implemented in “adaptor” (aka NIC)
  Ethernet card, modem, 802.11 card

adapter is semi-autonomous, implementing link & physical layers

sending side:
  encapsulates datagram in a frame
  adds error checking bits, rdt, flow control, etc.

receiving side:
  looks for errors, rdt, flow control, etc.
  extracts datagram, passes to receiving node

[Figure: sending node hands a datagram to its adapter; the adapter sends a frame to the receiving node's adapter using the link layer protocol]

12

LAN/MAC/Physical Address

In most link-layer technologies, each adapter has a unique link-layer address (also called MAC address)

• used as address in datalink frames to identify the interface

• 48 bit MAC address (for most types of LANs) burned in the adapter ROM

• MAC address allocation administered by IEEE;manufacturer buys portion of MAC address space (to assure uniqueness)

13

Recall Earlier Routing Discussion

Starting at A, given IP datagram addressed to E:

look up net. address of E, find C

link layer sends datagram to C inside link-layer frame; the dest. address should be C’s MAC address

frame:     C’s MAC addr | A’s MAC addr | datagram
datagram:  A’s IP addr | E’s IP addr | IP payload

(the frame carries the frame source and dest addresses; the datagram carries the datagram source and dest addresses)

223.1.1.1

223.1.1.2

223.1.1.3

223.1.1.4 223.1.2.9

223.1.2.2

223.1.2.1

223.1.3.2223.1.3.1

223.1.3.27

A

BE

C

Question: how to determine MAC address of C knowing C’s IP address?

14

ARP: Address Resolution Protocol

Each IP node (Host, Router) on LAN has ARP table

ARP Table: IP/MAC address mappings for some LAN nodes

< IP address; MAC address; TTL >
  TTL (Time To Live): time after which address mapping will be forgotten (typically 20 min)

[yry3@cicada yry3]$ /sbin/arp
Address                  HWtype  HWaddress          Flags Mask  Iface
zoo-gatew.cs.yale.edu    ether   AA:00:04:00:20:D4  C           eth0
artemis.zoo.cs.yale.edu  ether   00:06:5B:3F:6E:21  C           eth0
lab.zoo.cs.yale.edu      ether   00:B0:D0:F3:C7:A5  C           eth0

15

ARP Protocol

ARP is “plug-and-play”: nodes create their ARP tables without intervention from net administrator

A broadcast protocol:
  source (A) broadcasts query frame, containing queried IP address
    • all machines on LAN receive ARP query
  destination D receives ARP frame, replies
    • frame sent to A’s MAC address (unicast)
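The ARP table with its TTL can be sketched as a small cache. This is an illustrative sketch, not how any particular OS implements it: the class `ArpTable`, its method names, and the injectable `clock` parameter are all hypothetical.

```python
import time

ARP_TTL_SECONDS = 20 * 60  # "typically 20 min" on the slide

class ArpTable:
    """Toy ARP table mapping IP address -> (MAC address, expiry time)."""

    def __init__(self, ttl=ARP_TTL_SECONDS, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock      # injectable so the TTL can be tested
        self.entries = {}

    def learn(self, ip, mac):
        """Record a mapping, e.g. after receiving a unicast ARP reply."""
        self.entries[ip] = (mac, self.clock() + self.ttl)

    def lookup(self, ip):
        """Return the MAC address, or None if unknown or expired.

        None means the caller would have to broadcast an ARP query.
        """
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, expires = entry
        if self.clock() >= expires:
            del self.entries[ip]  # mapping is forgotten after the TTL
            return None
        return mac
```

A miss in `lookup` corresponds to the broadcast query on the slide; `learn` corresponds to processing the reply.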

16

Comparison of IP address and MAC Address

IP address is a locator
  o address depends on network to which an interface is attached
    • NOT portable
  o introduces features (e.g., CIDR) for routing scalability

IP address needs to be globally unique (if no NAT)

MAC address is an identifier
  o dedicated to a device
    • portable
  o flat

MAC address does not need to be globally unique, but the current assignment ensures uniqueness

Outline

Admin Link layer overview Error detection

17

18

Error Detection

D = Data protected by error checking, may include header fields
ED = Error Detection bits (redundancy)

• Error detection not 100% reliable!
  • a good error detector may miss some errors, but rarely
  • larger ED field generally yields better detection

• Error detection design considers computation primitives.

19

Cyclic Redundancy Check: Background

Widely used in practice, e.g., Ethernet, DOCSIS (Cable Modem), FDDI, PKZIP, WinZip, PNG

For a given data D, consider it as a polynomial D(x)
  consider the string of 0s and 1s as the coefficients of a polynomial
    • e.g., consider string 10011 as x^4 + x + 1
  addition and subtraction are modulo 2, thus the same as XOR

Choose generator polynomial G(x) with r+1 bits, where r is called the degree of G(x)

20

Cyclic Redundancy Check: Encode

Given G(x) and data D(x), choose R(x) with r bits, such that D(x)·x^r + R(x) is exactly divisible by G(x)

The bits corresponding to D(x)·x^r + R(x) are sent to the receiver

21

Ethernet Frame Structure

Sending adapter encapsulates IP datagram (or other network layer protocol packet) in Ethernet frame

Preamble: 8 bytes
  7 bytes with pattern 10101010 followed by one byte with pattern 10101011 (why the preamble?)
Source and dest. addresses: 6 bytes each
Type: indicates the higher layer protocol, mostly IP, but others may be supported, such as Novell IPX and AppleTalk
CRC: CRC-32 checked at receiver; if an error is detected, the frame is simply dropped

Field sizes in bytes:
  Preamble 8 | Dest. address 6 | Source address 6 | Type 2 | Data 46-1500 (including padding) | CRC 4

22

Cyclic Redundancy Check: Decode

Since G(x) is global, when the receiver receives the transmission T’(x), it divides T’(x) by G(x)
  if non-zero remainder: error detected!
  if zero remainder: assumes no error

[Figure: sender encodes D with CRC(G) to get T = D(x)·x^r + R(x); the channel delivers T’; the receiver checks T’]

23

CRC: Steps and an Example

Suppose the degree of G(x) is r

Append r zeros to D(x), i.e., consider D(x)·x^r

Divide D(x)·x^r by G(x). Let R(x) denote the remainder

Send <D, R> to the receiver
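The steps above can be sketched with bit strings and XOR-based (mod-2) long division. The data 101110 and generator 1001 (degree r = 3) are illustrative values chosen for this sketch, not taken from the slide, and the function names are hypothetical.

```python
def mod2_remainder(bits, generator):
    """Remainder of bits divided by generator, using mod-2 (XOR) arithmetic."""
    work = [int(b) for b in bits]
    gen = [int(b) for b in generator]
    r = len(gen) - 1                      # degree of G(x)
    for i in range(len(work) - r):
        if work[i] == 1:                  # leading term present: subtract G
            for j, g in enumerate(gen):
                work[i + j] ^= g
    return "".join(str(b) for b in work[-r:])

def crc_encode(data, generator):
    """Append r zeros, divide by G, and send <D, R>."""
    r = len(generator) - 1
    remainder = mod2_remainder(data + "0" * r, generator)
    return data + remainder

def crc_check(received, generator):
    """Receiver side: a zero remainder means no error is detected."""
    return set(mod2_remainder(received, generator)) == {"0"}
```

For D = 101110 and G = 1001 the remainder is 011, so 101110011 is sent; flipping any single bit of that string leaves a non-zero remainder at the receiver.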

24

The Power of CRC

Let T(x) denote D(x)·x^r + R(x), and E(x) the polynomial of the error bits
  the received signal is T’(x) = T(x) + E(x)

Since T(x) is divisible by G(x), we only need to consider if E(x) is divisible by G(x)


25

Designing CRC

Detect a single-bit error: E(x) = x^i
  if G(x) contains two or more terms, E(x) is not divisible by G(x)

Detect an odd number of errors: E(x) has an odd number of terms
  lemma: if E(x) has an odd number of terms, E(x) cannot be divisible by (x+1)
    • suppose E(x) = (x+1)F(x); let x = 1: the left-hand side will be 1 (an odd number of 1s added modulo 2), while the right-hand side will be 0 (since 1+1 = 0 modulo 2)
  thus if G(x) contains x+1 as a factor, E(x) will not be divisible by G(x)

Many more errors can be detected by designing the right G(x)

26

Example G(x)

32 bits CRC:
  CRC32: x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
  used by Ethernet, FDDI, PKZIP, WinZip, and PNG

GSM phones

For more details see the link below and further links it contains: http://en.wikipedia.org/wiki/Cyclic_redundancy_check
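As a quick illustration, Python's standard `zlib.crc32` computes the CRC-32 used by PKZIP and PNG, so it can show a single-bit error being detected. The payload bytes are arbitrary test data chosen for this sketch.

```python
import zlib

frame = b"123456789"            # arbitrary payload for illustration
checksum = zlib.crc32(frame)    # CRC-32 over the payload

corrupted = bytearray(frame)
corrupted[0] ^= 0x01            # flip one bit in the first byte

# The receiver recomputes the CRC; a mismatch means an error was detected.
detected = zlib.crc32(bytes(corrupted)) != checksum
```

Because G(x) has more than one term, a single-bit error always changes the checksum, which is what `detected` reflects here.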

Outline

Admin Link layer overview Error detection/correction Link access

27

28

Multiple Access Links and Protocols

Two types of “links”:
  point-to-point
    e.g., a leased dedicated line, PPP for dial-up access
  broadcast (shared wire or medium)
    traditional Ethernet; cable networks
    802.11 wireless LAN; cellular networks
    satellite

29

Multiple Access Protocols

Single shared broadcast channel
  simultaneous transmissions by two or more nodes interfere with each other, so only one node can send successfully at a time (see CDMA later for an exception)

Multiple access protocol
  protocol that determines how nodes share the channel, i.e., determines when nodes can transmit
  communication about channel sharing must use the channel itself!

Discussion: properties of an ideal multiple access protocol.

30

Ideal Multiple Access Protocol

Broadcast channel of rate R bps

Efficiency: when only one node wants to transmit, it can send at full rate R
Rate allocation:
  simple fairness: when N nodes want to transmit, each can send at average rate R/N
  we may need more complex rate control
Decentralized:
  no special node to coordinate transmissions
  no synchronization of clocks
Simple

31

MAC Protocols: a Taxonomy

Goals
  efficient, rate control, decentralized, simple

Three broad classes:
  channel partitioning
    divide channel into smaller “pieces” (time slot, frequency, code)
  non-partitioning
    random access
      • allow collisions
    “taking-turns”
      • a token coordinates shared access to avoid collisions

32

Outline

Admin. and recap Link layer overview Error detection and correction Media access control (MAC) protocols

channel partitioning

33

Channel Partitioning: TDMA

TDMA: time division multiple access
  Access to channel in "rounds"
  Each station gets fixed length slot (length = pkt trans time) in each round
  Unused slots go idle
  Example: 6-station LAN, 1,3,4 have pkt, slots 2,5,6 idle

34

Channel Partitioning: FDMA

FDMA: frequency division multiple access
  Channel spectrum divided into frequency bands
  Each station assigned fixed frequency band
  Unused transmission time in frequency bands goes idle
  Example: 6-station LAN, 1,3,4 have pkt, frequency bands 2,5,6 idle

[Figure: frequency bands versus time for stations 1-6]

35

GSM - TDMA/FDMA

[Figure: GSM frequency bands versus time]
  downlink: 935-960 MHz, 124 channels (200 kHz each)
  uplink: 890-915 MHz, 124 channels (200 kHz each)
  GSM TDMA frame: 8 time slots (1-8), 4.615 ms
  GSM time-slot (normal burst): 577 µs, of which 546.5 µs carry the burst
  burst layout: guard space | tail (3 bits) | user data (57 bits) | S (1 bit) | training (26 bits) | S (1 bit) | user data (57 bits) | tail (3 bits) | guard space
  S: indicates data or control

36

Channel Partitioning: CDMA

CDMA (Code Division Multiple Access)
  Used mostly in wireless broadcast channels (cellular, satellite, etc.)
  A spread-spectrum technique

History: http://people.seas.harvard.edu/~jones/cscie129/nu_lectures/lecture7/hedy/lemarr.htm

37

CDMA: Encoding

All users share same frequency, but each user m has its own unique “chipping” sequence (i.e., code) c_m to encode data, i.e., code set partitioning
  e.g., c_m = 1 1 1 -1 1 -1 -1 -1

Assume original data are represented by 1 and -1

Encoded signal = (original data) modulated by (chipping sequence)
  assume c_m = 1 1 1 -1 1 -1 -1 -1
  if data is d, send d·c_m
    • if data d is 1, send c_m
    • if data d is -1, send -c_m

[Figure: user data d(t) (1, -1) multiplied by chipping sequence c(t) gives the resulting signal; t_b: bit period, t_c: chip period]

39

CDMA: Decoding

Inner-product (summation of bit-by-bit product) of encoded signal and chipping sequence if inner-product > 0, the data is 1; else -1

40

CDMA Encode/Decode

Code of user m, c_m: 1 1 1 -1 1 -1 -1 -1

- The number of bits of each chipping sequence is M

Encode

Decode
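Encoding and decoding for a single user can be sketched as follows, using the chipping sequence from the slide. The function names `encode` and `decode` are assumptions for illustration; data bits are 1 or -1 as on the earlier slide.

```python
CODE_M = [1, 1, 1, -1, 1, -1, -1, -1]   # chipping sequence c_m from the slide

def encode(data_bits, code):
    """Send d * code for each data bit d (d is 1 or -1)."""
    return [d * c for d in data_bits for c in code]

def decode(signal, code):
    """Inner product of each M-chip block with the code: > 0 means 1, else -1."""
    m = len(code)
    bits = []
    for start in range(0, len(signal), m):
        block = signal[start:start + m]
        inner = sum(s * c for s, c in zip(block, code))
        bits.append(1 if inner > 0 else -1)
    return bits
```

With M = 8 chips per bit, the inner product is +8 for a 1 and -8 for a -1, so decoding recovers the original bits.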

41

CDMA: Deal with Multiple-User Interference

Two codes c_i and c_j are orthogonal if $c_i \cdot c_j = 0$, where we use “.” to denote inner product, e.g.

  C1: 1 1 1 -1 1 -1 -1 -1
  C2: 1 -1 1 1 1 -1 1 1
  -----------------------------------------
  C1 . C2 = 1 + (-1) + 1 + (-1) + 1 + 1 + (-1) + (-1) = 0

If codes are orthogonal, multiple users can “coexist” and transmit simultaneously with minimal interference:

$$\Big(\sum_j d_j c_j\Big) \cdot c_i = d_i \,(c_i \cdot c_i)$$

Analogy: Speak in different languages!

42

CDMA: Two-Sender Interference

Code 1: 1 1 1 -1 1 -1 -1 -1
Code 2: 1 -1 1 1 1 -1 1 1
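The two-sender case can be sketched with the two codes above. The sketch assumes an idealized channel that simply adds the two senders' chips (no noise, perfectly aligned chips); the helper names are hypothetical.

```python
CODE_1 = [1, 1, 1, -1, 1, -1, -1, -1]
CODE_2 = [1, -1, 1, 1, 1, -1, 1, 1]

def inner(a, b):
    """Inner product: sum of chip-by-chip products."""
    return sum(x * y for x, y in zip(a, b))

def transmit(bit, code):
    """One data bit (1 or -1) modulated by the sender's code."""
    return [bit * c for c in code]

def channel(*signals):
    """Assumption of this sketch: simultaneous signals add chip by chip."""
    return [sum(chips) for chips in zip(*signals)]

def recover(received, code):
    """Receiver correlates the summed signal with one sender's code."""
    return 1 if inner(received, code) > 0 else -1
```

Because the codes are orthogonal, the other sender's contribution to the inner product is zero, so each receiver recovers its own sender's bit from the summed signal.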

Discussions

Advantages of channel partitioning

Problems of channel partitioning

43

44

Outline

Recap Link layer overview Error detection and correction MAC protocols

Partitioning protocols Non-partitioning MAC protocols

• Random access

45

Random Access Protocols

When a node has packets to send
  transmit at full channel data rate R
  no a priori coordination among nodes

Two or more transmitting nodes -> “collision”

Random access MAC protocol specifies:
  when to access channel?
  how to detect collisions?
  how to recover from collisions?

Examples of random access MAC protocols:
  slotted ALOHA and pure ALOHA
  CSMA and CSMA/CD, CSMA/CA

46

Slotted Aloha [Norm Abramson]

Time is divided into equal size slots (= pkt trans. time)

Node with new arriving pkt: transmit at beginning of next slot

If collision: retransmit pkt in future slots with probability p, until successful.

Success (S), Collision (C), Empty (E) slots

47

Slotted Aloha Efficiency

Q: What is the fraction of successful slots?
  suppose n stations have packets to send
  suppose each transmits in a slot with probability p

- prob. of succ. by a specific node: p (1-p)^(n-1)

- prob. of succ. by any one of the n nodes:

S(p) = n * Prob (only one transmits) = n p (1-p)^(n-1)

48

Goodput vs. Offered Load

[Figure: S = throughput = “goodput” (success rate) versus G = offered load = np (axis ticks 0.5 to 2.0), curve for Slotted Aloha]

when p n < 1, as p (or n) increases, the probability of empty slots decreases while the probability of collision is still low, thus goodput increases

when p n > 1, as p (or n) increases, the probability of empty slots does not decrease much, but the probability of collision increases, thus goodput decreases

goodput is optimal when p n = 1
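The goodput formula S(p) = n p (1-p)^(n-1) and its optimum at p n = 1 can be checked numerically. A minimal sketch; the function names are chosen here for illustration.

```python
import math

def slotted_goodput(n, p):
    """S(p) = n p (1-p)^(n-1): fraction of slots with exactly one sender."""
    return n * p * (1 - p) ** (n - 1)

def max_slotted_goodput(n):
    """Goodput at the optimum p = 1/n, i.e. offered load p n = 1."""
    return slotted_goodput(n, 1 / n)

# As n grows, the maximum goodput (1 - 1/n)^(n-1) approaches 1/e.
LIMIT = 1 / math.e
```

For n = 10, p = 0.1 gives a higher goodput than either p = 0.05 (too many empty slots) or p = 0.2 (too many collisions), and for large n the maximum approaches 1/e, about 0.37.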

49

Maximum Efficiency vs. n

[Figure: maximum efficiency versus n (n from 2 to 17), approaching 1/e = 0.37]

At best: channel used for useful transmissions 37% of the time!

50

Pure (unslotted) Aloha

Unslotted Aloha: simpler, no clock synchronization

Whenever pkt needs transmission:
  send without waiting for the beginning of a slot

Collision probability increases:
  pkt sent at t0 collides with other pkts sent in [t0-1, t0+1]

51

Pure Aloha (cont.)

Assume a node transmits with probability p in one unit of time

P(success by a given node) = P(node transmits) * P(no other node transmits in [t0-1, t0]) * P(no other node transmits in [t0, t0+1])

  = p . (1-p)^(n-1) . (1-p)^(n-1)

  = p . (1-p)^(2(n-1))

P(success by any of n nodes) = n p . (1-p)^(2(n-1))

- Bound: 1/(2e) = .18
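The 1/(2e) bound can be checked numerically. Setting the derivative of n p (1-p)^(2(n-1)) to zero gives the maximizer p = 1/(2n-1); that step is worked out for this sketch rather than stated on the slide, and the function names are hypothetical.

```python
import math

def pure_goodput(n, p):
    """P(success by any of n nodes) = n p (1-p)^(2(n-1))."""
    return n * p * (1 - p) ** (2 * (n - 1))

def best_p(n):
    """Maximizer of pure_goodput, from dS/dp = 0: 1 - p = 2(n-1)p."""
    return 1 / (2 * n - 1)

BOUND = 1 / (2 * math.e)   # about 0.18
```

For large n the maximum goodput approaches 1/(2e), half of the slotted Aloha limit.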

52

Goodput vs. Offered Load

[Figure: S = throughput = “goodput” (success rate, axis 0.1 to 0.4) versus G = offered load = Np (axis 0.5 to 2.0), curves for Pure Aloha and Slotted Aloha; protocol constrains effective channel throughput!]

53

Dynamics of (Slotted) Aloha

In reality, the number of stations backlogged is changing
  we need to study the dynamics when using a fixed transmission probability p

Assume we have a total of m stations (the machines on a LAN):
  n of them are currently backlogged, each tries with a (fixed) probability p
  the remaining m-n stations are not backlogged. They may start to generate packets with a probability p_a, where p_a is much smaller than p

54

Model

n backlogged: each transmits with prob. p

m-n unbacklogged: each transmits with prob. p_a

55

Dynamics of Aloha: Effects of Fixed Probability

[Figure: departure and arrival rate of backlogged stations versus n, the number of backlogged stations (0 to m). Curves: successful transmission rate at offered load np + (m-n)p_a, and new arrival rate (m-n)p_a. The curves cross at a desirable stable point and at an undesirable stable point; offered load = 1 is marked.]

- assume a total of m stations
- p_a << p
- success rate is the departure rate, the rate at which the backlog is reduced

Lesson: if we fix p, but n varies, we may have an undesirable stable point

56

Summary of Problems of Aloha Protocols

Problems
  slotted Aloha has better efficiency than pure Aloha, but clock synchronization is hard to achieve
  Aloha protocols have low efficiency due to collisions or empty slots
    • when offered load is optimal (p = 1/N), the goodput is only about 37%
    • when the offered load is not optimal, the goodput is even lower
  undesirable steady state at a fixed transmission rate, when the number of backlogged stations varies

Ethernet design: address the problems:
  approximate slotted Aloha without clock synchronization
  reduce the penalty of collision or empty slots
  infer optimal transmission rate

57

The Basic MAC Mechanisms of Ethernet

get a packet from upper layer;
K := 0; n := 0;  // K: control wait time; n: no. of collisions
repeat:
    wait for K * 512 bit-time;
    while (network busy) wait;
    wait for 96 bit-time after detecting no signal;
    transmit and detect collision;
    if detect collision
        stop and transmit a 48-bit jam signal;
        n ++;
        m := min(n, 10), where n is the number of collisions
        choose K randomly from {0, 1, 2, …, 2^m - 1}.
        if n < 16 goto repeat
        else give up
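The choice of K in the loop above can be sketched in isolation. This covers only the backoff step, not carrier sensing or transmission; the function names and the injectable `rng` parameter are assumptions of this sketch.

```python
import random

SLOT_BIT_TIMES = 512   # wait time is K * 512 bit-times
MAX_ATTEMPTS = 16      # give up once n reaches 16

def backoff_range(n):
    """Largest K allowed after the n-th collision: 2^min(n, 10) - 1."""
    return 2 ** min(n, 10) - 1

def backoff_bit_times(n, rng=random):
    """Pick K uniformly from {0, ..., 2^m - 1} and return the wait in bit-times."""
    k = rng.randint(0, backoff_range(n))
    return k * SLOT_BIT_TIMES
```

After the first collision K is drawn from {0, 1}, after the second from {0, 1, 2, 3}, and from the tenth collision on the range stays capped at {0, ..., 1023}.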

58

Ethernet

“Dominant” LAN technology:
  First widely used LAN technology
  Kept up with speed race: 10 Mbps, 100 Mbps, 1 Gbps, 10 Gbps

[Figure: Metcalfe’s Ethernet sketch]

Course Topics Summary

The Internet is a general-purpose, large-scale, distributed computer network

Major design features/principles
  packet switching/statistical multiplexing
  hour-glass architecture
  end-to-end principle
  decentralized architecture
    • e.g., DNS, interdomain routing
  resource allocation framework
    • optimization decomposition through duality
  adaptive control
    • e.g., AIMD, sliding window self clocking, Ethernet
  queueing modeling/performance analysis and design
  tradeoff between theoretical impossibility and practice

Evolution Driven by Technology, Infrastructure, Policy, Applications, and Understanding:
  technology
    • e.g., wireless/optical communication technologies and device miniaturization (sensors)
  infrastructure
    • e.g., cloud computing
  applications
    • e.g., content distribution, game, telepresence, sensing, grid computing, VoIP
  understanding
    • e.g., resource sharing principle, routing principles, mechanism design, optimal stochastic control (randomized access)

Complexity comes from evolution. Don’t be afraid to challenge the foundation and redesign!

61

Backup Slides

62

63

Ethernet’s Exponential Backoff:

Goal: adapt retransmission attempts to estimated current load
  compared with CSMA, 1/2^m can be considered as p
  not a static p: adjusted using exponential backoff
    • first collision: choose K from {0,1}; delay is K x 512 bit transmission times
    • after second collision: choose K from {0,1,2,3}…
    • after ten or more collisions, choose K from {0,1,2,3,4,…,1023}

Many Issues

How to make it faster

How to make it more efficient

How to make it more reliable/robust/secure

64

65

CSMA: Carrier Sense Multiple Access

CSMA: listen before transmit

Objective: approximate slotted Aloha (compared with pure Aloha)

If backlogged, wait until channel sensed idle, then transmit pkt with prob. p

human analogy: don’t interrupt others!

66

CSMA Collisions

collisions can still occur: propagation delay means two nodes may not hear each other’s transmission

Collision: entire packet transmission time wasted; still not very efficient!

[Figure: spatial layout of nodes A, B, C, D along Ethernet versus time, starting at t0]

67

CSMA/CD (Collision Detection)

Human analogy: the polite conversationalist

CSMA/CD:
  observations:
    • collisions can be detected within short time
    • if colliding transmissions are aborted, we can reduce channel wastage
  carrier sensing, deferral as in CSMA
  collision detection:
    • easy in wired LANs: measure signal strengths, compare transmitted, received signals
    • difficult in wireless LANs: receiver shuts off while transmitting

68


CSMA/CD: Collision Detection

[Figure: two space-time diagrams starting at t0; in the second, B detects the collision and aborts, and D detects the collision and aborts]

instead of wasting the whole packet transmission time, abort after detection.

69

Efficiency of CSMA/CD

Given collision detection, instead of wasting the whole packet transmission time (a slot), we waste only the time needed to detect collision.

Use a contention slot of 2T, where T is one-way propagation delay (why 2T?)

When the transmission probability p is approximately optimal (p = 1/N), we try approximately e times before each successful transmission

Packet transmission time: P/C
  P: packet size, e.g., 1000 bits
  C: link capacity, e.g., 10 Mbps

70

Efficiency of CSMA/CD

The efficiency (the percentage of useful time) is approximately

$$\frac{P/C}{P/C + 2Te} = \frac{1}{1 + 2ea} \approx \frac{1}{1 + 5a}, \quad \text{where } a = \frac{T}{P/C} = \frac{TC}{P}$$

The value of a plays a fundamental role in the efficiency of CSMA/CD protocols.

Question: you want to increase the capacity of a link layer technology (e.g., 10 Mbps Ethernet to 100 Mbps), but still want to maintain the same efficiency, what can you do?
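One way to explore the question is to compute the efficiency 1/(1 + 2ea) with a = TC/P for different parameters. A minimal sketch: P = 1000 bits and C = 10 Mbps are the slide's examples, while the one-way delay of 5 microseconds is a hypothetical value chosen only for illustration.

```python
import math

def csma_cd_efficiency(packet_bits, capacity_bps, one_way_delay_s):
    """Approximate efficiency 1 / (1 + 2 e a), with a = T C / P."""
    a = one_way_delay_s * capacity_bps / packet_bits
    return 1 / (1 + 2 * math.e * a)

T = 5e-6  # hypothetical one-way propagation delay, in seconds
base = csma_cd_efficiency(1000, 10e6, T)              # 10 Mbps
faster_only = csma_cd_efficiency(1000, 100e6, T)      # 100 Mbps, nothing else changed
shorter_link = csma_cd_efficiency(1000, 100e6, T / 10)
bigger_packet = csma_cd_efficiency(10000, 100e6, T)
```

Since the efficiency depends only on a = TC/P, raising C by 10x lowers it unless T shrinks by 10x (a shorter network) or P grows by 10x (larger packets); both keep a unchanged in this model.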

71

Summary of Problems to be Addressed

Approximate slotted Aloha

Reduce the penalty of collision or empty slots

Infer optimal transmission rate

Physical Layer

72

Internet Bandwidth Growth

Source: TeleGeograph Research

What Determines Transmission Rate?

Service: transmit a bit stream from a sender to a receiver

Encodingchannel

Decodingoutput bit stream

input bit stream

sender receiver

Question to be addressed: how much can we send through the channel ?

Basic Theory: Channel Capacity

The maximum number of bits that can be transmitted per second (bps) by a physical medium is:

$$W \log_2\left(1 + \frac{S}{N}\right)$$

where W is the frequency range and S/N is the signal-to-noise ratio. We assume Gaussian noise.

Fourier Transform

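Suppose the period of a data unit is T, so its fundamental frequency is f = 1/T; then the data unit can be represented as the sum of many harmonics (sin(), cos()) with frequencies f, 2f, 3f, 4f, …

A reasonably behaved periodic function g(t), with minimal period T, can be constructed as the sum of a series of sines and cosines:

$$g(t) = \frac{1}{2}c + \sum_{n=1}^{\infty} a_n \sin(2\pi n f t) + \sum_{n=1}^{\infty} b_n \cos(2\pi n f t)$$

where $f = 1/T$ and

$$a_n = \frac{2}{T}\int_0^T g(t)\sin(2\pi n f t)\,dt, \qquad b_n = \frac{2}{T}\int_0^T g(t)\cos(2\pi n f t)\,dt, \qquad c = \frac{2}{T}\int_0^T g(t)\,dt$$

[Figure: rms amplitudes $\sqrt{a_n^2 + b_n^2}$ of the harmonics for the char “b”]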

Signal Attenuation

The quality of signal will degrade when it travels
  loss, frequency passing

$$W \log_2\left(1 + \frac{S}{N}\right)$$

Frequency Dependent Attenuation

The received signal will be distorted even when there is no interference and the transmitted signal is a “perfect” square waveform

Example: Voltage-attenuation magnitude ratios of Category 5 cable. For example, 500 feet of cable attenuates a 10-MHz, 1-V signal to 0.32 V, which corresponds to about –9.90 dB (= 20 log10 0.32)

Example

Example: W = 3000 Hz, S/N = 4000

$$\text{max bandwidth} = 3000 \log_2(1 + 4000) \approx 36\ \text{kbps}$$

V.34 (33.6 kbps Dialup Modem)

[Figure: input bit stream -> sender modem (modulation, digital->analog) -> telephone network channel with 3 kHz bandwidth (adds white noise), with analog-to-digital quantization for transmitting through the digital telephone backbone -> ISP modem (demodulation) -> output bit stream]

Example: ADSL

Spectrum allocation: divided into a total of 256 downstream and 32 upstream tones, where each tone is a standard 4 kHz voice channel

During initial negotiation, a tone is used only if the S/N is above 6 dB (4)

$$\text{down} = 256 \times 4000 \log_2(1 + 4) \approx 2.4\ \text{Mbps}$$

$$\text{up} = 32 \times 4000 \log_2(1 + 4) \approx 297\ \text{kbps}$$
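The telephone-modem and ADSL numbers can be reproduced directly from the capacity formula W log2(1 + S/N). A small sketch; the function name `capacity_bps` is chosen here for illustration.

```python
import math

def capacity_bps(bandwidth_hz, snr):
    """Channel capacity W * log2(1 + S/N), in bits per second."""
    return bandwidth_hz * math.log2(1 + snr)

modem = capacity_bps(3000, 4000)            # telephone channel example
adsl_down = 256 * capacity_bps(4000, 4)     # 256 downstream tones at S/N = 4
adsl_up = 32 * capacity_bps(4000, 4)        # 32 upstream tones at S/N = 4
```

The results round to about 36 kbps, 2.4 Mbps, and 297 kbps, matching the figures in the examples; the ADSL values are lower bounds in the sense that every tone is assumed to sit exactly at the minimum usable S/N.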

Faster

82

The Wire: Fiber

A look at a fiber

How it works?

A graded index fiber

The Wire: Fiber

Wide spectrum at low loss: ~0.3 dB/km (cf. copper ~190 dB/km @ 100 MHz), 30-100 km without repeater

Bandwidth of a single fiber theoretical: 100-200Tbps

http://www.trnmag.com/Stories/080101/Study_shows_fiber_has_room_to_grow_080101.html

Lightweight: 33 tons of copper to transmit the same amount of information carried by ¼ pound of optical fiber

Advantages of Fibers

How to Do Switching?

Optical-Electrical-Optical Optical switch: optical micro-electro-mechanical systems

(MEMS)

Optical path One optical switch

http://www.qwest.com/largebusiness/enterprisesolutions/networkMaps/preloader.swf

Example: MEMS Optical Switch Using mirrors, e.g. Lambda Router

Implications

Fine-grained switching may not be feasible

What is the architecture of optical networks: packet switching, circuit switching, or others?

More Efficient

89

Large deployment of highly adaptive, multipoint applications

An iterative process between two sets of adaptation:
  ISP: traffic engineering to change routing to shift traffic away from higher utilized links
    • current traffic pattern -> new routing matrix
  App: direct traffic to better performing end points
    • current routing matrix -> new traffic pattern

Problem: Inefficient Interactions

ISP optimizer interacts poorly with App.

ISP Traffic Engineering+ App Latency Optimizer

- red: App adjust alone; fixed ISP routing- blue: ISP traffic engineering adapt alone; fixed App communications

The Fundamental Problem

Traditional Internet architectural feedback to application efficiency is limited:
  routing (hidden)
  rate control through coarse-grained TCP congestion feedback

To achieve better efficiency, we need explicit communications between network resource providers and applications

P4P Framework – Design Goals

Performance improvement
Scalability and extensibility: support diverse ISP objectives and applications scenarios in large networks
Privacy preservation
Ease of implementation
Open standard: any ISP, provider, applications can easily implement it

Current Status

P4P-WG

Next step
  wider integration
  IETF standard

• AT&T • Bezeq Intl • BitTorrent • CacheLogic • Cisco Systems • Grid Networks • Joost • LimeWire • Manatt • Oversi • Pando Networks • PeerApp • Telefonica Group • VeriSign • Verizon • Vuze • Univ of Washington • Yale University

• Abacast • AHT Intl • Akamai • Alcatel Lucent • CableLabs • Cablevision • Comcast • Cox Comm • Juniper Networks • Microsoft • MPAA • NBC Universal • Nokia • RawFlow • Solid State Networks • Thomson • Time Warner Cable • Turner Broadcasting

Reliability

Is the Internet Reliable?

A key design objective of the “Internet” (i.e., packet-switched networks) is robustness

Does the Internet infrastructure achieve the target reliability objective of a highly reliable system (99.999%)?

Perspective

911 Phone service (1993 NRIC report +)
  29 minutes per year per line
  99.994% availability

Std. Phone service (various sources)
  53+ minutes per line per year
  99.99+% availability

…what about the Internet?
  Various studies: about 99.5%
  Need to reduce down time by 500 times to achieve five nines; 50 times to match phone service
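The availability figures follow from simple arithmetic on downtime per year. A minimal sketch, assuming a 365-day year; the function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes, assuming a 365-day year

def availability(downtime_minutes_per_year):
    """Fraction of the year the service is up."""
    return 1 - downtime_minutes_per_year / MINUTES_PER_YEAR

def downtime_ratio(current, target):
    """How many times the downtime must shrink to go from current to target."""
    return (1 - current) / (1 - target)

phone_911 = availability(29)                        # about 99.994%
to_five_nines = downtime_ratio(0.995, 0.99999)      # Internet -> five nines
to_phone = downtime_ratio(0.995, 0.9999)            # Internet -> std. phone service
```

At 99.5% availability the downtime fraction is 0.5%, which is 500 times the 0.001% allowed by five nines and 50 times the 0.01% of standard phone service.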

Unreachable Networks: 10 days

Internet Disaster Recovery Response

Why slow response?
  the cable repairing is slow: not until 21 days after quake
  BGP is not designed to create business relationships

Objective
  a meta-BGP to facilitate discovery and creation of BGP business relationships

100

101

Backup: IP Multicast

102

IP Fragmentation & Reassembly

Network links have MTU (max. transfer size): largest possible link-level frame
  different link types, different MTUs, e.g., Ethernet MTU is 1500 bytes

Large IP datagram divided (“fragmented”)
  one datagram becomes several datagrams
  “reassembled” only at final destination
  IP header bits used to identify, order related fragments

[Figure: fragmentation: in: one large datagram, out: 3 smaller datagrams; reassembly at the destination]

103

IP Fragmentation and Reassembly

Example
  4000 byte datagram
  MTU = 1500 bytes

One large datagram becomes several smaller datagrams

  original:    length=4000  ID=x  fragflag=0  offset=0
  fragment 1:  length=1500  ID=x  fragflag=1  offset=0
  fragment 2:  length=1500  ID=x  fragflag=1  offset=1480
  fragment 3:  length=1040  ID=x  fragflag=0  offset=2960
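The fragment sizes in the example can be reproduced with a short calculation. This sketch assumes a 20-byte IP header without options and reports offsets in bytes, as the slide does (the actual header field stores the offset in units of 8 bytes); the function name `fragment` is hypothetical.

```python
IP_HEADER_BYTES = 20  # assumption: IP header without options

def fragment(total_length, mtu, header=IP_HEADER_BYTES):
    """Split a datagram of total_length bytes into fragments that fit the MTU."""
    payload = total_length - header
    # Data per fragment must be a multiple of 8 bytes (except in the last one).
    max_data = (mtu - header) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload:
        data = min(max_data, payload - offset)
        more = offset + data < payload        # fragflag=1 if more fragments follow
        fragments.append(
            {"offset": offset, "fragflag": int(more), "length": data + header}
        )
        offset += data
    return fragments
```

For the 4000-byte datagram and a 1500-byte MTU this yields lengths 1500, 1500, and 1040 at byte offsets 0, 1480, and 2960.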

104

IP Multicast: Service Model

Multicast group concept: use of indirection
  A group is identified by a location-independent logical address (class D IP address: prefix 1110)

Open group model
  Anyone can send packets to the “logical” group address
  Anyone can join a group and receive packets

Normal, best-effort delivery semantics of IP

[Figure: hosts 128.119.40.186, 128.59.16.12, 128.34.108.63, and 128.34.108.60, and multicast group 226.17.30.197]

Needed: infrastructure to deliver mcast-addressed datagrams to all hosts that have joined that multicast group

105

Multicast Across LANs

Goal: find a tree (or trees) connecting routers having local mcast group members
  source-based: different tree from each sender to the receivers
    – Distance-vector multicast routing protocol (DVMRP)
    – Protocol-independent multicast-dense mode (PIM-DM)
  shared-tree: same tree used by all group members
    – Core-Based Tree (CBT)
    – Protocol-independent multicast-sparse mode (PIM-SM)

[Figure: shared tree versus source-based trees]

106

Source Tree: Reverse Path Flooding (RPF)

A router x forwards a packet from source (S) iff it arrives via neighbor y, and y is on the shortest path from x back to S

A packet is replicated to all but the incoming interface
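The forwarding rule at router x can be sketched as a single check. The function name and arguments are hypothetical; `next_hop_to_source` stands for the neighbor on x's shortest path back to S, which x would get from its unicast routing table.

```python
def rpf_forward(arrived_from, next_hop_to_source, neighbors):
    """Return the neighbors to replicate the packet to, or [] if it is dropped.

    arrived_from: neighbor the packet came in from
    next_hop_to_source: neighbor on the shortest path from this router back to S
    neighbors: all neighbors of this router
    """
    if arrived_from != next_hop_to_source:
        return []   # not on the reverse shortest path: drop
    # Replicate to all but the incoming interface.
    return [n for n in neighbors if n != arrived_from]
```

A copy arriving from any neighbor other than the one on the reverse shortest path is dropped, which is what stops the flood from looping.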

[Figure: example topology with source S and routers x, y, z, t; links labeled with cost 1]

107

Reverse Path Forwarding: Improvement

Basic idea: forward a packet from S only on child links for S

A child link of router x for source S
  a link that has x as parent on the shortest path from the link to S
  a child x notifies its parent y (through the routing protocol) that it has selected y as its parent

[Figure: example topology with source S and routers x, y, z, t]

108

Reverse Path Forwarding: Pruning

No need to forward datagrams down subtree with no mcast group members

“prune” msgs sent upstream by router with no downstream group members

[Figure: multicast tree over routers R1-R7 from source S, with prune messages (P) sent upstream. Legend: router with attached group member; router with no attached group member; prune message; links with multicast forwarding]

109

Pruning

Prune (Source, Group) at a leaf router if no members
  send No-Membership Report (NMR) up tree

If all children of router R prune (S,G)
  propagate prune for (S,G) to its parent

What do you do when a member of a group (re)joins?
  send a Graft message to upstream parent

How to deal with failures?
  prune dropped
  flow is reinstated
  downstream routers re-prune

Note: again a soft-state approach

110

Implementation of Source Trees in the Internet

Multicast OSPF (MOSPF)
  Membership is part of the link state distribution; calculate source specific, pre-pruned trees

Reverse Path Forwarding
  Distance Vector Multicast Routing Protocol (DVMRP)
  Protocol Independent Multicast – Dense Mode (PIM-DM)
    • very similar to DVMRP

Difference: PIM uses any unicast routing algorithm to determine the path from a router to the source; DVMRP uses distance vector

Question: the state requirement of Reverse Path Forwarding

111

Building a Shared Tree

Steiner Tree: minimum cost tree connecting all routers with attached group members

A Steiner tree is not a spanning tree because you do not need to connect all nodes in the network

Problem is NP-hard
Excellent heuristics exist
Not used in practice:
  computational complexity
  information about entire network needed
  monolithic: rerun whenever a router needs to join/leave

112

Center (Core) based Shared Tree

Single delivery tree shared by all
One router identified as “center” of tree
Tree construction is receiver-based
  edge router sends unicast join-msg addressed to center router
  join-msg “processed” by intermediate routers and forwarded towards center
  join-msg either hits existing tree branch for this center, or arrives at center
  path taken by join-msg becomes new branch of tree for this router
A sender unicasts a packet to center
  The packet is distributed on the tree when it hits the tree

113

Example: M3 Joins

Group members: M1, M2

[Figure: nodes core, M1, M2, M3, and S1; the shared tree and M3's join message are marked]

Discussion: what are the properties of the constructed tree?

114

Example: M1 Sends Data

Group members: M1, M2, M3

M1 sends data

[Figure: nodes core, M1, M2, M3, and S1; legend: control (join) messages, data]

115

Shared Tree Protocols in the Internet

Core Based Tree
Protocol Independent Multicast (PIM) Sparse mode
The catch: how do you know the center?
  session announcement

116

Mbone: Tunneling

Q: How to connect “islands” of multicast routers in a “sea” of unicast routers?

mcast datagram encapsulated inside “normal” (non-multicast-addressed) datagram

normal IP datagram sent thru “tunnel” via regular IP unicast to receiving mcast router

receiving mcast router unencapsulates to get mcast datagram

physical topology logical topology
