FAST TCP Steven Low CS/EE netlab.CALTECH.edu Oct 2003.


FAST Protocols for Ultrascale Networks

netlab.caltech.edu/FAST

Internet: distributed feedback control system
TCP: adapts sending rate to congestion
AQM: feeds back congestion information

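Theory

[Figure: TCP/AQM feedback loop. TCP sources set rates $x$; forward routing $R_f(s)$ gives link rates $y$; AQM at the links sets prices $p$; backward routing $R_b'(s)$ returns end-to-end prices $q$.]

AQM: $\dot p_l = \frac{1}{c_l}\,(y_l(t) - c_l)$

TCP: [Equation: source dynamics, a $\tan^{-1}$-based update in $w_i$, $x_i(t)$, $q_i(t)$, $T_i(t)$ and $d_i$]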

Experiment

[Figure: map of research & production networks: WAN in Lab (Caltech), Calren2/Abilene, StarLight (Chicago), SURFNet (Amsterdam), CERN (Geneva).]

Multi-Gbps, 50-200 ms delay

People

Faculty: Doyle (CDS, EE, BE), Low (CS, EE), Newman (Physics), Paganini (UCLA)
Staff/Postdoc: Bunn (CACR), Jin (CS), Ravot (Physics), Singh (CACR)
Students: Choe (Postech/CIT), Hu (Williams), J. Wang (CDS), Z. Wang (UCLA), Wei (CS)
Industry: Doraiswami (Cisco), Yip (Cisco)
Partners: CERN, Internet2, CENIC, StarLight/UI, SLAC, AMPATH, Cisco

Implementation

[Figure: FAST TCP state diagram with states slow start, equilibrium, FAST recovery, FAST retransmit and timeout; labelled 155 Mb/s and 10 Gb/s.]

netlab.caltech.edu

Outline

Motivation
Network model
FAST TCP
Equilibrium
Stability
Experiments
TCP/IP

[Figure: protocol stack. Applications (WWW, Email, Napster, FTP, …); TCP/AQM; IP; Transmission (Ethernet, ATM, POS, WDM, …).]

High Energy Physics

Large global collaborations
2000 physicists from 150 institutions in >30 countries
300-400 physicists in US from >30 universities & labs

SLAC has 500 TB data by 4/2002, world's largest database

Typical file transfer ~1 TB
At 622 Mbps: ~4 hrs
At 2.5 Gbps: ~1 hr
At 10 Gbps: ~15 min
Gigantic elephants!
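The transfer times quoted for a ~1 TB file can be checked with a few lines of arithmetic. The sketch below assumes a decimal terabyte ($10^{12}$ bytes), a fully utilized link and no protocol overhead; the function name is illustrative only.

```python
def transfer_time_seconds(size_bytes: float, rate_bits_per_s: float) -> float:
    """Ideal transfer time: assumes the link is fully used and ignores overhead."""
    return size_bytes * 8 / rate_bits_per_s


TB = 1e12  # assumption: decimal terabyte

for label, rate in [("622 Mbps", 622e6), ("2.5 Gbps", 2.5e9), ("10 Gbps", 10e9)]:
    hours = transfer_time_seconds(TB, rate) / 3600
    print(f"{label}: {hours:.2f} h ({hours * 60:.0f} min)")
```

Under these assumptions the ideal times come out slightly below the rounded figures on the slide (roughly 3.6 h, 0.9 h and 13 min), which is what one would expect once real-world overhead is left out.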

LHC (Large Hadron Collider) at CERN, to open 2007
Generate data at PB ($10^{15}$ B)/sec
Filtered in realtime by a factor of $10^6$ to $10^7$
Data stored at CERN at 100 MB/sec
Many PB of data per year
To rise to Exabytes ($10^{18}$ B) in a decade
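The LHC figures are mutually consistent, as a rough calculation shows. This sketch assumes decimal units, continuous year-round operation and the upper filter factor of $10^7$; the helper names are made up for illustration.

```python
PB = 1e15  # bytes, decimal petabyte (assumption)
MB = 1e6   # bytes, decimal megabyte (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600


def filtered_rate(raw_bytes_per_s: float, reduction_factor: float) -> float:
    """Data rate left after real-time filtering by the given factor."""
    return raw_bytes_per_s / reduction_factor


def yearly_volume(bytes_per_s: float, duty_cycle: float = 1.0) -> float:
    """Bytes accumulated in one year at a steady storage rate."""
    return bytes_per_s * SECONDS_PER_YEAR * duty_cycle


stored = filtered_rate(1 * PB, 1e7)
print(f"stored rate: {stored / MB:.0f} MB/s")
print(f"per year:    {yearly_volume(stored) / PB:.2f} PB")
```

A 1 PB/s raw rate filtered by $10^7$ gives the 100 MB/s storage rate, and 100 MB/s sustained for a year is a few PB, in line with "many PB of data per year".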

HEP high speed network

… that must change

HEP Network (DataTAG)

[Figure: DataTAG network map linking Geneva, New York, STAR-TAP and STARLIGHT with the research networks SURFnet (NL), SuperJANET4 (UK), GARR-B (It), Renater (Fr), GEANT, ABILENE, ESNET and CALREN.]

Wavelength Triangle: 2.5 Gbps in 2002, 10 Gbps in 2003

Newman (Caltech)

Performance at large windows

ns-2 simulation
capacity = 155 Mbps, 622 Mbps, 2.5 Gbps, 5 Gbps, 10 Gbps; 100 ms round trip latency; 100 flows
J. Wang (Caltech, June 02)
[Figure: ns-2 results for capacities up to 10 Gbps.]

DataTAG Network: CERN (Geneva) – StarLight (Chicago) – SLAC/Level3 (Sunnyvale)
capacity = 1 Gbps; 180 ms round trip latency; 1 flow
C. Jin, D. Wei, S. Ravot, etc (Caltech, Nov 02)

Average utilization (1G)
Linux TCP, txq=100:    19%
Linux TCP, txq=10000:  27%
FAST, txq=100:         95%


Congestion Control

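~ W packets per RTT
Lost packet detected by missing ACK
Congestion signal: delay and loss

[Figure: timing diagram between Source and Destination over time. In each RTT the source sends a window of data packets 1, 2, …, W and the destination returns the corresponding ACKs.]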
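Since a source sends about W packets per RTT, its rate is roughly W/RTT, and the window needed to fill a path is the bandwidth-delay product. The sketch below illustrates the scale involved; the 1500-byte packet size is an assumption, not a figure from the slides.

```python
def rate_bps(window_pkts: float, rtt_s: float, packet_bytes: int = 1500) -> float:
    """Sending rate when `window_pkts` packets are sent every RTT."""
    return window_pkts * packet_bytes * 8 / rtt_s


def window_for_rate(target_bps: float, rtt_s: float, packet_bytes: int = 1500) -> float:
    """Window (in packets) needed to sustain `target_bps`: the bandwidth-delay product."""
    return target_bps * rtt_s / (packet_bytes * 8)


# Example: a 10 Gbps path with 100 ms round-trip time.
w = window_for_rate(10e9, 0.100)
print(f"window needed: {w:.0f} packets")
```

With these assumptions the window is on the order of tens of thousands of packets, which is the "large window" regime the performance slides are concerned with.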

Congestion control

[Figure: sources send at rates $x_i(t)$; links feed back congestion measures $p_l(t)$.]

Example congestion measure $p_l(t)$
Loss (Reno)
Queueing delay (Vegas)

TCP/AQM

Congestion control is a distributed asynchronous algorithm to share bandwidth

It has two components
TCP: adapts sending rate (window) to congestion
AQM: adjusts & feeds back congestion information

They form a distributed feedback control system
Equilibrium & stability depend on both TCP and AQM
And on delay, capacity, routing, #connections

TCP (sets $x_i(t)$): Reno, Vegas
AQM (sets $p_l(t)$): DropTail, RED, REM/PI, AVQ

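Network model

[Figure: TCP/AQM feedback loop. TCP sources $F_1, \dots, F_N$ set rates $x$; forward routing $R_f(s)$ gives link rates $y$; AQM links $G_1, \dots, G_L$ set prices $p$; backward routing $R_b'(s)$ returns end-to-end prices $q$.]

$[R_f(s)]_{li} = e^{-\tau^f_{li} s}$ if source $i$ uses link $l$ (0 otherwise)
$[R_b(s)]_{li} = e^{-\tau^b_{li} s}$ if source $i$ uses link $l$ (0 otherwise)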

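Vegas model

for every RTT
{ if W/RTTmin – W/RTT < α then W ++
  if W/RTTmin – W/RTT > α then W -- }

The difference W/RTTmin – W/RTT reflects the queue size.

$F_i$:
$\dot x_i = \frac{1}{T_i^2(t)}$ if $x_i(t)\,q_i(t) < \alpha_i d_i$
$\dot x_i = -\frac{1}{T_i^2(t)}$ if $x_i(t)\,q_i(t) > \alpha_i d_i$
$\dot x_i = 0$ else

$G_l$:
$\dot p_l = \frac{1}{c_l}\,(y_l(t) - c_l)$

$p_l$: link queueing delay
$q_i$: E2E queueing delay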
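The per-RTT Vegas rule can be sketched as a small runnable simulation. This is an illustration under simplifying assumptions, not the Vegas implementation: a single flow on a single link, a single threshold `alpha` expressed as a rate (as on the slide), and an RTT equal to the larger of the base RTT and W/capacity. All names and parameter values are hypothetical.

```python
def vegas_update(window: int, rtt: float, rtt_min: float, alpha: float) -> int:
    """One per-RTT Vegas step: compare expected and actual rate against alpha."""
    diff = window / rtt_min - window / rtt
    if diff < alpha:
        return window + 1
    if diff > alpha:
        return window - 1
    return window


def simulate_single_flow(capacity: float, base_rtt: float, alpha: float,
                         rounds: int = 500, window: int = 1) -> int:
    """Assumed link model: a queue builds once the window exceeds capacity * base_rtt."""
    for _ in range(rounds):
        rtt = max(base_rtt, window / capacity)
        window = vegas_update(window, rtt, base_rtt, alpha)
    return window


# Hypothetical numbers: 10 pkts/ms link, 10 ms base RTT, alpha = 2 pkts/ms.
final_window = simulate_single_flow(capacity=10.0, base_rtt=10.0, alpha=2.0)
print("final window:", final_window)
```

In this toy model the window settles near (capacity + alpha) × base RTT, i.e. the pipe size plus a backlog of alpha × base RTT packets, which matches the equilibrium condition $x_i q_i = \alpha_i d_i$ in the fluid model.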

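Vegas model

[Figure: TCP/AQM feedback loop. TCP sources $F_1, \dots, F_N$ set rates $x$; forward routing $R_f(s)$ gives link rates $y$; AQM links $G_1, \dots, G_L$ set prices $p$; backward routing $R_b'(s)$ returns end-to-end prices $q$.]

$F_i = \frac{1}{T_i^2(t)}\,\mathrm{sgn}\!\left(1 - \frac{x_i(t)\,q_i(t)}{\alpha_i d_i}\right)$

$G_l = \frac{y_l(t)}{c_l} - 1$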


Methodology

Protocol (Reno, Vegas, RED, REM/PI…)

$x(t+1) = F(p(t), x(t))$
$p(t+1) = G(p(t), x(t))$

Equilibrium
Performance: throughput, loss, delay
Fairness
Utility

Dynamics
Local stability
Cost of stabilization

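Model

Network
Links $l$ of capacities $c_l$

Sources $s$
$L(s)$ - links used by source $s$
$U_s(x_s)$ - utility if source rate = $x_s$

[Figure: two links in series with capacities $c_1$, $c_2$; source 1 crosses both links, source 2 uses link 1 only, source 3 uses link 2 only.]

$x_1 + x_2 \le c_1$
$x_1 + x_3 \le c_2$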
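The two capacity constraints are the rows of $Rx \le c$ for a 0/1 routing matrix $R$. A minimal sketch of that encoding follows; the helper names and the numeric values are illustrative assumptions.

```python
# R[l][s] = 1 if source s uses link l (rows: links 1-2, columns: sources 1-3).
R = [
    [1, 1, 0],  # link 1 carries x1 + x2
    [1, 0, 1],  # link 2 carries x1 + x3
]


def link_loads(routing, rates):
    """Aggregate rate y = R x on every link."""
    return [sum(r * x for r, x in zip(row, rates)) for row in routing]


def is_feasible(routing, rates, capacities, tol=1e-12):
    """True when every link load stays within its capacity."""
    return all(y <= c + tol for y, c in zip(link_loads(routing, rates), capacities))


print(link_loads(R, [1.0, 2.0, 3.0]))
print(is_feasible(R, [1.0, 2.0, 3.0], [3.0, 4.0]))
```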

Summary: duality model

Flow control problem (Kelly, Maulloo, Tan 98)
$\max_{x \ge 0} \sum_s U_s(x_s)$ subject to $Rx \le c$

TCP/AQM
Maximize utility with different utility functions

Primal-dual algorithm
$x(t+1) = F(R^T p(t),\, x(t))$   (Reno, Vegas)
$p(t+1) = G(p(t),\, Rx(t))$   (DropTail, RED, REM)

Result (L 00): $(x^*, p^*)$ primal-dual optimal iff $y_l^* \le c_l$, with equality if $p_l^* > 0$

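Example utility functions

Reno-1: $U_i(x_i) = \frac{\sqrt{3/2}}{T_i}\,\tan^{-1}\!\left(\sqrt{2/3}\; x_i T_i\right)$
Reno-2: $U_i(x_i) = \frac{1}{T_i}\,\log \frac{x_i T_i}{2 x_i T_i + 3}$
Vegas: $U_i(x_i) = \alpha_i d_i \log x_i$
General: $U_i(x_i) = (1-\alpha)^{-1}\, x_i^{1-\alpha}$ if $\alpha \ne 1$, and $\log x_i$ if $\alpha = 1$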
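In the duality model the marginal utility $U_i'(x_i)$ plays the role of the equilibrium price seen by source $i$. The sketch below checks this numerically for two of the example utilities, taking the Reno-1 utility to be $\frac{\sqrt{3/2}}{T}\tan^{-1}(\sqrt{2/3}\,xT)$ and the Vegas utility to be $\alpha d \log x$. The closed-form derivatives are worked out here for illustration, and the parameter values are arbitrary.

```python
import math


def reno1_utility(x: float, T: float) -> float:
    return math.sqrt(1.5) / T * math.atan(math.sqrt(2.0 / 3.0) * x * T)


def reno1_marginal(x: float, T: float) -> float:
    """Closed-form derivative of reno1_utility with respect to x."""
    return 3.0 / (2.0 * x * x * T * T + 3.0)


def vegas_utility(x: float, alpha: float, d: float) -> float:
    return alpha * d * math.log(x)


def numeric_derivative(f, x: float, h: float = 1e-6) -> float:
    """Central finite difference, used only as an independent check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x, T = 2.0, 0.1
print(numeric_derivative(lambda v: reno1_utility(v, T), x), reno1_marginal(x, T))
print(numeric_derivative(lambda v: vegas_utility(v, 2.0, 10.0), x), 2.0 * 10.0 / x)
```

For Vegas the marginal utility $\alpha d / x$ equated to the price $q$ gives back $x q = \alpha d$; for Reno-1 the marginal utility is a quantity between 0 and 1, consistent with a loss probability.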

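Game interpretation

Source $s$: $\max_{x_s \ge 0} \; U_s(x_s) - x_s \sum_l R_{ls}\, p_l$

Link $l$: $\max_{p_l \ge 0} \; p_l \left( \sum_s R_{ls}\, x_s - c_l \right)$

$x_s(t+1) = U_s'^{-1}\!\left( \sum_l R_{ls}\, p_l(t) \right)$

$p_l(t+1) = \left[\, p_l(t) + \gamma \left( \sum_s R_{ls}\, x_s(t) - c_l \right) \right]^+$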
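The source and link updates can be run as a dual gradient projection iteration. The sketch below is a minimal illustration, assuming log utilities (so $U_s'^{-1}(q) = 1/q$), a projected price update with a step size `gamma`, and a hypothetical rate cap `x_max` to keep rates finite when a path price is zero. The two-link, three-source topology and unit capacities are example values.

```python
def dual_gradient_projection(R, c, gamma=0.1, steps=2000, x_max=100.0):
    """Iterate source rates and link prices for U_s(x) = log x (assumed utility)."""
    num_links, num_sources = len(R), len(R[0])
    p = [1.0] * num_links
    x = [0.0] * num_sources
    for _ in range(steps):
        # Each source sees the sum of prices along its path and picks its best rate.
        q = [sum(R[l][s] * p[l] for l in range(num_links)) for s in range(num_sources)]
        x = [min(x_max, 1.0 / qs) if qs > 0 else x_max for qs in q]
        # Each link raises its price when overloaded, lowers it otherwise (never below 0).
        y = [sum(R[l][s] * x[s] for s in range(num_sources)) for l in range(num_links)]
        p = [max(0.0, p[l] + gamma * (y[l] - c[l])) for l in range(num_links)]
    return x, p


R = [[1, 1, 0],
     [1, 0, 1]]
rates, prices = dual_gradient_projection(R, c=[1.0, 1.0])
print("rates:", [round(v, 4) for v in rates])
print("prices:", [round(v, 4) for v in prices])
```

For this example the iteration approaches the proportionally fair allocation: the two-link flow gets 1/3 and each one-link flow gets 2/3, with both link prices at 3/2.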

Synchronous convergence

Theorem (L & Lapsley 99)

Provided R has full row rank & $U_s$ strictly concave:

Gradient projection algorithm of dual problem
Converges to optimal primal-dual solutions if $0 < \gamma < 2 / (\bar\alpha \bar L \bar S)$
Limit point: unique Pareto optimal Nash equilibrium

Asynchronous convergence

Sources and links update & compute at different times with different frequencies using delayed info

Theorem (L & Lapsley 99)

Converges in asynchronous environment with smaller step size $\gamma$

Equilibrium of Vegas

Network
Link queueing delays: $p_l$
Queue length: $c_l p_l$

Sources
Throughput: $x_i$
E2E queueing delay: $q_i$
Packets buffered: $x_i q_i = \alpha_i d_i$
Utility function: $U_i(x) = \alpha_i d_i \log x$
Proportional fairness
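For a single bottleneck the Vegas equilibrium has a closed form: every source satisfies $x_i q = \alpha_i d_i$ and the rates fill the link, so $q = \sum_i \alpha_i d_i / c$. The sketch below assumes exactly that single-link setting; the function name and the numbers are illustrative.

```python
def vegas_equilibrium(alpha_d, capacity):
    """Single-link Vegas equilibrium.

    alpha_d:  per-source targets alpha_i * d_i (packets each source keeps buffered)
    capacity: link capacity c (packets per unit time)
    Returns (queueing delay q, list of rates x_i, queue length c * q).
    """
    q = sum(alpha_d) / capacity          # from sum_i x_i = c and x_i * q = alpha_i * d_i
    rates = [ad / q for ad in alpha_d]   # rates are proportional to alpha_i * d_i
    return q, rates, capacity * q


q, rates, queue = vegas_equilibrium([20.0, 40.0], capacity=6.0)
print("queueing delay:", q)
print("rates:", rates)
print("queue length:", queue)
```

The queue length equals the total number of buffered packets $\sum_i \alpha_i d_i$, and the rates are weighted-proportionally fair with weights $\alpha_i d_i$.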

Validation (L. Wang, Princeton)

Source rates (pkts/ms)
#   src1          src2          src3          src4          src5
1   5.98 (6)
2   2.05 (2)      3.92 (4)
3   0.96 (0.94)   1.46 (1.49)   3.54 (3.57)
4   0.51 (0.50)   0.72 (0.73)   1.34 (1.35)   3.38 (3.39)
5   0.29 (0.29)   0.40 (0.40)   0.68 (0.67)   1.30 (1.30)   3.28 (3.34)

#   queue (pkts)   baseRTT (ms)
1   19.8 (20)      10.18 (10.18)
2   59.0 (60)      13.36 (13.51)
3   127.3 (127)    20.17 (20.28)
4   237.5 (238)    31.50 (31.50)
5   416.3 (416)    49.86 (49.80)


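Stability: Reno/RED

[Figure: TCP/AQM feedback loop. TCP sources $F_1, \dots, F_N$ set rates $x$; forward routing $R_f(s)$ gives link rates $y$; AQM links $G_1, \dots, G_L$ set prices $p$; backward routing $R_b'(s)$ returns end-to-end prices $q$.]

Theorem (Low et al, Infocom'02)
Reno/RED is locally stable if

$\frac{\rho\, c^3 \tau^3}{2 N^3}\,(c\tau + N) < \frac{\pi (1-\beta)^2}{\sqrt{4\beta^2 + \pi^2 (1-\beta)^2}}$

TCP: Small $\tau$, Small $c$, Large $N$
RED: Small $\rho$, Large delay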

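Stability: scalable control

[Figure: TCP/AQM feedback loop. TCP sources $F_1, \dots, F_N$ set rates $x$; forward routing $R_f(s)$ gives link rates $y$; AQM links $G_1, \dots, G_L$ set prices $p$; backward routing $R_b'(s)$ returns end-to-end prices $q$.]

AQM: $\dot p_l = \frac{1}{c_l}\,(y_l(t) - c_l)$

TCP: $x_i(t) = \bar x_i\, e^{-\frac{\alpha_i q_i(t)}{m_i \tau_i}}$

Theorem (Paganini, Doyle, L, CDC'01)
Provided R is full rank, feedback loop is locally stable for arbitrary delay, capacity, load and topology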

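Stability: Stabilized Vegas

[Figure: TCP/AQM feedback loop. TCP sources $F_1, \dots, F_N$ set rates $x$; forward routing $R_f(s)$ gives link rates $y$; AQM links $G_1, \dots, G_L$ set prices $p$; backward routing $R_b'(s)$ returns end-to-end prices $q$.]

AQM: $\dot p_l = \frac{1}{c_l}\,(y_l(t) - c_l)$

TCP: [Equation: stabilized Vegas source dynamics $\dot x_i$, a $\tan^{-1}$-based update in $x_i(t)$, $q_i(t)$, $T_i(t)$ and $d_i$]

Theorem (Choe & L, Infocom'03)
Provided R is full rank, feedback loop is locally stable if [condition on $\max_i$ of a function of $x_i$, $T_i$ and $a$]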

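Stability: Stabilized Vegas

[Figure: TCP/AQM feedback loop. TCP sources $F_1, \dots, F_N$ set rates $x$; forward routing $R_f(s)$ gives link rates $y$; AQM links $G_1, \dots, G_L$ set prices $p$; backward routing $R_b'(s)$ returns end-to-end prices $q$.]

AQM: $\dot p_l = \frac{1}{c_l}\,(y_l(t) - c_l)$

TCP: $\dot x_i = \frac{1}{T_i^2(t)}\,\mathrm{sgn}\!\left(1 - \frac{x_i(t)\,q_i(t)}{\alpha_i d_i}\right)$

Theorem (Choe & L, Infocom'03)
Provided R is full rank, feedback loop is locally stable if [condition on $\max_i$ of a function of $x_i$, $T_i$ and $a$]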

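Stability: FAST

[Figure: TCP/AQM feedback loop. TCP sources $F_1, \dots, F_N$ set rates $x$; forward routing $R_f(s)$ gives link rates $y$; AQM links $G_1, \dots, G_L$ set prices $p$; backward routing $R_b'(s)$ returns end-to-end prices $q$.]

AQM: $\dot p_l = \frac{1}{c_l}\,(y_l(t) - c_l)$

TCP: [Equation: source dynamics $\dot x_i$, a $\tan^{-1}$-based update in $x_i(t)$, $q_i(t)$, $T_i(t)$ and $d_i$]

Application
Stabilized TCP with current routers
Queueing delay as congestion measure has right scaling
Incremental deployment with ECN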


Window control algorithm

Theorem (Jin, Wei, L '03)
In absence of delay
Mapping from w(t) to w(t+1) is contraction
Global exponential convergence
Full utilization after finite time
Utility function: $\alpha_i \log x_i$ (proportional fairness)
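The window update itself did not survive in this transcript. The sketch below uses the update as it appears in the published FAST TCP paper, $w \leftarrow \min\{2w,\ (1-\gamma)w + \gamma(\frac{\text{baseRTT}}{\text{RTT}}\,w + \alpha)\}$, and should be read as an assumption about what the slide showed. The single-link queue model, the bisection solver and all parameter values are illustrative choices, with no feedback delay modelled.

```python
def queueing_delay(windows, base_rtts, capacity, tol=1e-12):
    """Delay q >= 0 at which sum_i w_i / (d_i + q) equals the link capacity."""
    if sum(w / d for w, d in zip(windows, base_rtts)) <= capacity:
        return 0.0  # link not saturated, no queue
    lo, hi = 0.0, sum(windows) / capacity  # total rate at q = hi is below capacity
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        rate = sum(w / (d + mid) for w, d in zip(windows, base_rtts))
        if rate > capacity:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fast_step(windows, base_rtts, capacity, alpha, gamma):
    """One synchronous window update for all flows (assumed update rule)."""
    q = queueing_delay(windows, base_rtts, capacity)
    updated = []
    for w, d in zip(windows, base_rtts):
        target = (d / (d + q)) * w + alpha
        updated.append(min(2.0 * w, (1.0 - gamma) * w + gamma * target))
    return updated


def simulate(base_rtts, capacity, alpha, gamma=0.5, steps=3000, w0=10.0):
    windows = [w0] * len(base_rtts)
    for _ in range(steps):
        windows = fast_step(windows, base_rtts, capacity, alpha, gamma)
    q = queueing_delay(windows, base_rtts, capacity)
    rates = [w / (d + q) for w, d in zip(windows, base_rtts)]
    return windows, q, rates


# Hypothetical setup: 100 pkts/ms link, three flows with 50/100/150 ms base RTT.
windows, q, rates = simulate([50.0, 100.0, 150.0], capacity=100.0, alpha=200.0)
print("queueing delay (ms):", round(q, 4))
print("rates (pkts/ms):", [round(x, 4) for x in rates])
```

In this model each flow ends up keeping `alpha` packets in the queue, so flows with equal `alpha` get equal rates regardless of base RTT and the link is fully used, which is the proportional-fairness outcome stated in the theorem.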

Network

(Sylvain Ravot, Caltech/CERN)

FAST BMPS

[Figure: BMPS versus #flows (1, 2, 7, 9, 10) for FAST on the Geneva-Sunnyvale and Baltimore-Sunnyvale paths, compared with the Internet2 Land Speed Record.]

FAST
Standard MTU
Throughput averaged over > 1hr

Aggregate throughput

#flows      Average utilization   Duration
1 flow      95%                   1hr
2 flows     92%                   1hr
7 flows     90%                   6hr
9 flows     90%                   1.1hr
10 flows    88%                   6hr

FAST
Standard MTU
Utilization averaged over > 1hr

Aggregate throughput

Average utilization
      Linux TCP (txq=100)   Linux TCP (txq=10000)   FAST
1G    19%                   27%                     95%
2G    16%                   48%                     92%

FAST
Standard MTU
Utilization averaged over 1hr

SCinet Caltech-SLAC experiments

netlab.caltech.edu/FAST

SC2002 Baltimore, Nov 2002

Acknowledgments

Prototype: C. Jin, D. Wei

Theory: D. Choe (Postech/Caltech), J. Doyle, S. Low, F. Paganini (UCLA), J. Wang, Z. Wang (UCLA)

Experiment/facilities
Caltech: J. Bunn, C. Chapman, C. Hu (Williams/Caltech), H. Newman, J. Pool, S. Ravot (Caltech/CERN), S. Singh
CERN: O. Martin, P. Moroni
Cisco: B. Aiken, V. Doraiswami, R. Sepulveda, M. Turzanski, D. Walsten, S. Yip
DataTAG: E. Martelli, J. P. Martin-Flatin
Internet2: G. Almes, S. Corbato
Level(3): P. Fernes, R. Struble
SCinet: G. Goddard, J. Patton
SLAC: G. Buhrmaster, R. Les Cottrell, C. Logg, I. Mei, W. Matthews, R. Mount, J. Navratil, J. Williams
StarLight: T. deFanti, L. Winkler

Major sponsors: ARO, CACR, Cisco, DataTAG, DoE, Lee Center, NSF

Dynamic sharing: 3 flows

[Figure: throughput traces of 3 flows joining and leaving, one panel each for FAST, Linux, HSTCP and STCP.]

Dynamic sharing on Dummynet
capacity = 800 Mbps
delay = 120 ms
3 flows
iperf throughput
Linux 2.4.x (HSTCP: UCL)

Steady throughput

Dynamic sharing: 14 flows

[Figure: throughput, loss and queue traces over 30 min, one column each for FAST, Linux, HSTCP and STCP.]

Dynamic sharing on Dummynet
capacity = 800 Mbps
delay = 120 ms
14 flows
iperf throughput
Linux 2.4.x (HSTCP: UCL)

Room for mice!


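Network model

[Figure: TCP/AQM feedback loop. TCP sources $F_1, \dots, F_N$ set rates $x$; routing $R$ gives link rates $y$; AQM links $G_1, \dots, G_L$ set prices $p$; $R^T$ returns end-to-end prices $q$.]

$x(t+1) = F(R^T p(t),\, x(t))$   (TCP: Reno, Vegas)
$p(t+1) = G(p(t),\, Rx(t))$   (AQM: DT, RED, …)

$R_{li} = 1$ if source $i$ uses link $l$   (IP routing)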

Motivation

Can TCP/IP maximize utility?

Primal: $\max_{R} \max_{x \ge 0} \sum_i U_i(x_i)$ subject to $Rx \le c$

Dual: $\min_{p \ge 0} \left( \sum_i \max_{x_i \ge 0} \left( U_i(x_i) - x_i \min_{R_i} \sum_l R_{li}\, p_l \right) + \sum_l c_l\, p_l \right)$

The inner minimization $\min_{R_i} \sum_l R_{li}\, p_l$ is shortest path routing!

TCP-AQM/IP

Theorem (Wang, et al 03)
Primal problem is NP-hard

Proof
Reduce integer partition to primal problem
Given: integers $\{c_1, \dots, c_n\}$
Find: set $A$ s.t. $\sum_{i \in A} c_i = \sum_{i \notin A} c_i$
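The integer partition problem used in the reduction can be stated concretely in code. The sketch below is a standard subset-sum dynamic program, assuming positive integers; it runs in time proportional to the sum of the inputs (pseudo-polynomial), which does not contradict NP-hardness. The function name is illustrative.

```python
def find_partition(values):
    """Return indices of a set A whose sum equals the sum of the rest, or None."""
    total = sum(values)
    if total % 2:
        return None  # an odd total can never be split evenly
    target = total // 2
    reachable = {0: []}  # achievable subset sum -> indices of one subset achieving it
    for i, v in enumerate(values):
        # Snapshot the current sums so each value is used at most once.
        for s, idx in list(reachable.items()):
            t = s + v
            if t <= target and t not in reachable:
                reachable[t] = idx + [i]
    return reachable.get(target)


c = [3, 1, 1, 2, 2, 1]
A = find_partition(c)
print("A =", A)
if A is not None:
    print(sum(c[i] for i in A), "=", sum(c) - sum(c[i] for i in A))
```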

TCP-AQM/IP

Theorem (Wang, et al 03)
Primal problem is NP-hard

Achievable utility of TCP/IP?
Stability?
Duality gap?

Conclusion: Inevitable tradeoff between achievable utility and routing stability

Ring network

[Figure: ring network with a single destination; $r$ denotes the routing.]

Single destination
Instant convergence of TCP/IP
Shortest path routing
Link cost = $a\, p_l(t) + b\, d_l$ (price term + static term)

TCP/AQM and IP alternate: $r(0) \to p_l(0) \to r(1) \to p_l(1) \to \dots$, giving the routing sequence … $r(t)$, $r(t+1)$, …

Stability: $r$ ?
Utility: $V$ ?
$r^*$: optimal routing
$V^*$: max utility

Ring network

[Figure: ring network with a single destination; $r$ denotes the routing.]

Link cost = $a\, p_l(t) + b\, d_l$

Stability: $r$ ?
Utility: $V$ ?

Theorem (Infocom 2003)
"No" duality gap
Unstable if $b = 0$: starting from any $r(0)$, subsequent $r(t)$ oscillates between 0 and 1

Theorem (Infocom 2003)
Solves primal problem asymptotically as the price term dominates the link cost: $r^* - r \to 0$, $|V^* - V| \to 0$

Theorem (Infocom 2003)
$a$ large: globally unstable
$a$ small: globally stable
$a$ medium: depends on $r(0)$

General network

[Figure: achievable utility on a random graph with 20 nodes, 200 links.]

Conclusion: Inevitable tradeoff between achievable utility and routing stability

FAST TCP: motivation, architecture, algorithms, performance. Submitted for publication, July 1, 2003

β-release: August 2003
Inquiry: fast-support@cs.caltech.edu

FAST Project Review Caltech, Oct 27-28, 2003

netlab.caltech.edu/FAST