ZGP001 (zphddef - 07/15/03)

Page 1: ZGP001  (zphddef - 07/15/03)

ZGP001 (zphddef.ppt - 07/15/03)

Performance Evaluation of URL Routing for

Content Distribution Networks

PhD defense by

Zornitza Genova Prodanoff

Committee Members:

Dr. K. J. Christensen (Major Professor)

Dr. M. Varanasi

Dr. R. Perez

Dr. Chari

Dr. Labrador

This material is based upon work funded by the National Science Foundation under grant no. 9875177

Page 2: ZGP001  (zphddef - 07/15/03)

ZGP002

Acknowledgements

I would like to thank:

My major professor Dr. Ken Christensen,

My committee: Dr. Varanasi, Dr. Perez, Dr. Chari, and Dr. Labrador

Dr. Suen for his comments at my proposal defense

My colleagues: K. Yoshigoe, A. Aslam, G. Perrera, and J. Shahbazian

My family

Page 3: ZGP001  (zphddef - 07/15/03)

• Motivation

• Problem and contributions

• URL Routing

• Improvements to URL routing

• Evaluation of URL signatures

• Evaluation of hashing for URL routing

• Summary

• List of my publications

ZGP003

Topics

New

New

New

Page 4: ZGP001  (zphddef - 07/15/03)

ZGP004

Motivation

“…2.5 Billion Hours Spent Waiting on the Web in 1998.” - John Roth, chief executive of Nortel Networks at Telecom '99

Page 5: ZGP001  (zphddef - 07/15/03)

ZGP005

Problem:

Excessive delay in the Internet caused by the inability to efficiently access distributed content in the Web

My contributions:

1) Architected a new URL router that uses HTTP redirection

2) Investigated new use of CRC32 for reducing the size of routing tables

3) Investigated a new self-adjusting hashing method for faster URL routing look-up

4) Performed the first queuing evaluation of hashing - effects of correlation discovered

Problem and contributions

Page 6: ZGP001  (zphddef - 07/15/03)

• Motivation

• Problem and contributions

• URL Routing

• Improvements to URL routing

• Evaluation of URL signatures

• Evaluation of hashing for URL routing

• Summary

• List of my publications

ZGP006

Topics

Page 7: ZGP001  (zphddef - 07/15/03)

ZGP007

• Next generation Internet - Content Distribution Networks (CDNs)
   - A CDN is an overlay network on the Internet
   - A CDN co-locates content throughout the world

• CDNs are of great commercial and research interest
   - $15 million in NSF funding for Web services research
   - Akamai is one major CDN provider

URL routing

Page 8: ZGP001  (zphddef - 07/15/03)

ZGP008

URL routing continued

[Figure: Global content distribution in a CDN. Labeled elements: clients, proxy cache, transparent cache, Internet, reverse cache, distributed server, origin site. URLs shown: http://www.some.com/page, http://334.249.2.8/page, http://214.29.2.15/page]

Page 9: ZGP001  (zphddef - 07/15/03)

ZGP009

URL routing continued

HTTP redirection in a CDN

(1) HTTP request and redirect

(2) HTTP re-request and response

[Figure: clients, proxy cache, URL router, reverse cache, distributed server, and origin site, with arrows labeled (1) and (2) for the two steps above]

Page 10: ZGP001  (zphddef - 07/15/03)

ZGP010

URL routing continued

Architecture of a new URL router

[Figure: labeled elements are a one-armed URL router, a Layer 3 switch, network links, and HTTP requests and redirects]

Routing Table

URL 1: loc 1 (state), loc 2 (state), …, loc M1 (state)

URL 2: loc 1 (state), loc 2 (state), …, loc M2 (state)

…

URL N: loc 1 (state), loc 2 (state), …, loc MN (state)
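A minimal sketch of how such a routing table could drive an HTTP redirect. The `Location` class, the `load` field standing in for "state", and the least-loaded selection rule are illustrative assumptions, not the design described in the slides.

```python
from dataclasses import dataclass

@dataclass
class Location:
    host: str    # address of a copy of the content
    load: float  # hypothetical "state" value; lower is better

def redirect(routing_table, host, path):
    """Return an HTTP 302 response pointing at one location, or None."""
    locations = routing_table.get(host + path)
    if not locations:
        return None  # URL not in the table; no redirect is issued
    best = min(locations, key=lambda loc: loc.load)  # assumed policy
    return f"HTTP/1.1 302 Found\r\nLocation: http://{best.host}{path}\r\n\r\n"

# One URL with two locations, using the addresses shown on the CDN slide.
table = {
    "www.some.com/page": [
        Location("334.249.2.8", load=0.9),
        Location("214.29.2.15", load=0.2),
    ]
}
response = redirect(table, "www.some.com", "/page")
```

The client then re-requests the page from the address in the `Location` header, which corresponds to step (2) of the redirection slide.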

Page 11: ZGP001  (zphddef - 07/15/03)

ZGP011

URL routing continued

Need to exchange routing tables (digesting)

Summary Cache [17]
– Uses Bloom filters to “merge” routing (hash) tables

A Bloom filter is probabilistic and does not support updates
- False positives occur if hashes are non-unique
- A false positive results in a “routing collision” in the context of URLs
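To make the false-positive behaviour concrete, here is a small Bloom filter sketch. The class name, the bit-array size, and the choice of taking four 32-bit words from one MD5 digest as the hash functions are assumptions made for illustration.

```python
import hashlib

class BloomFilter:
    def __init__(self, n_bits, n_hashes=4):
        assert 1 <= n_hashes <= 4  # one MD5 digest yields four 32-bit words
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, url):
        digest = hashlib.md5(url.encode()).digest()  # 16 bytes
        for i in range(self.n_hashes):
            word = int.from_bytes(digest[4 * i:4 * i + 4], "big")
            yield word % self.n_bits

    def add(self, url):
        for p in self._positions(url):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, url):
        # True may be a false positive; False is always correct.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(url))

bf = BloomFilter(n_bits=1024)
bf.add("http://www.some.com/page")
```

Because bits are only ever set, an entry cannot be removed without risking the removal of bits shared with other URLs, which is why a plain Bloom filter does not support updates.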

Page 12: ZGP001  (zphddef - 07/15/03)

ZGP012

URL routing continued

Need to do look-ups in routing tables

• Why use hashing?
   – Build routing tables as hash tables for efficient look-up

• Idea of self-adjusting hash
   – Most frequently used keys are closer to the head
      » If chained hashing: rearrange after key accesses
      » Transposition rule for lists [50], [7]
      » Move-to-front rule for lists [33]

• Review of H1 hashing [74]
   – Self-adjusting by using transposition

Page 13: ZGP001  (zphddef - 07/15/03)

ZGP013

URL routing continued

Chained resolution of hash table collision

[Figure: hash table with indices 0, 1, 2, …, m-1; keys k0, k1, k2, …, kn-1 with records r0, r1, r2, …, rn-1; records that hash to the same index are linked into a chain]

The hashing collision at index 0 causes the chain to be created

Page 14: ZGP001  (zphddef - 07/15/03)

URL routing continued

ZGP014

C1. [Create lists] For i ← 0 to m-1, set LISTi ← NULL.

C2. [Hash] Set i ← h(KEY), j ← 0.

C3. [Is there a list?] If LISTi = NULL, go to C6.

C4. [Compare] If KEY = LISTi[j], terminate.

C5. [Advance to next] If LISTi[j] ≠ NULL, set j ← j+1 and go to step C4.

C6. [Insert new key] Set LISTi[j] ← KEY.

For H1 hashing, step C4 is replaced by:

C4A. [Compare and transpose – H1 hashing] If KEY = LISTi[j], then if j ≠ 0 swap LISTi[j] with LISTi[j-1]; terminate.

H1 and Simple hashing algorithms based on [37]
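The steps above can be sketched in Python as follows. The class name, the use of Python's built-in `hash` for h(KEY), and the return value are illustrative assumptions; only the look-up, insert, and transpose logic follows the slide.

```python
class ChainedTable:
    """Chained hash table; transpose=True gives the H1 variant (step C4A)."""

    def __init__(self, m, transpose=False):
        self.m = m
        self.transpose = transpose
        self.lists = [[] for _ in range(m)]  # C1: one empty chain per index

    def access(self, key):
        """Look up key, inserting it if absent.

        Returns the chain position at which the key was found, or None
        if the key was not present and has been appended.
        """
        chain = self.lists[hash(key) % self.m]  # C2
        for j, stored in enumerate(chain):      # C4 / C5
            if stored == key:
                if self.transpose and j > 0:    # C4A: swap with predecessor
                    chain[j - 1], chain[j] = chain[j], chain[j - 1]
                return j
        chain.append(key)                       # C6
        return None

# m = 1 forces every key into one chain so the effect is easy to see.
t = ChainedTable(m=1, transpose=True)
for k in ["a", "b", "c"]:
    t.access(k)
pos = t.access("c")  # found at position 2, then moved one step forward
```

With transposition, a key moves only one position toward the head per access, so a popular key needs several accesses to reach the front of a long chain.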

Page 15: ZGP001  (zphddef - 07/15/03)

ZGP015

URL routing continued

Now begin my contributions in digesting and hashing (and evaluation thereof)

Page 16: ZGP001  (zphddef - 07/15/03)

• Motivation

• Problem and contributions

• URL routing

• Improvements to URL routing

• Evaluation of URL signatures

• Evaluation of hashing for URL routing

• Summary

• List of my publications

ZGP016

Topics

Page 17: ZGP001  (zphddef - 07/15/03)

ZGP017

Improvements to URL routing

Open problems

1) Select best source based on state (and location of client)

2) Reduce the size of the routing table to update/share

3) Perform fast routing look-ups

My problems

Page 18: ZGP001  (zphddef - 07/15/03)

ZGP018

Improvements to URL routing continued

• My idea…
   – Use CRC32 for URL signatures

• CRC32 circuitry is already part of an Ethernet adapter
   – Serial shift-register with wrapped XOR terms

• Use it to get CRC32 signatures for the URL in an HTTP request header

• Need to calculate a CRC32 over a subfield [53]
   – The subfield is the URL in an HTTP request header

Page 19: ZGP001  (zphddef - 07/15/03)

ZGP019

Improvements to URL routing continued

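Define the following:
– $P$ is the CRC32 generator polynomial
– $A_i$, $i = 1, \ldots, m$, is a polynomial (bit sequence)
– We store in a table (for all possible $M$) the remainders

$$R_M = \mathrm{Rem}\left(\frac{2^M}{P}\right),$$

where $M$ is the length of the subfield

[Figure: packet layout with packet header, subfield, and rest of packet. $A_0$ marks the packet header, $A_2$ marks the subfield, and $A_1$ spans the packet header and the subfield together]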

Page 20: ZGP001  (zphddef - 07/15/03)

ZGP020

Improvements to URL routing continued

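We have the following,

$$R_{A_0} = \mathrm{Rem}\left(\frac{A_0}{P}\right), \qquad R_{A_1} = \mathrm{Rem}\left(\frac{A_1}{P}\right)$$

Returned by adapter - from CRC32 shift register

What we want (CRC32 for subfield)

$$R_{A_2} = \mathrm{Rem}\left(\frac{A_2}{P}\right)$$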

Page 21: ZGP001  (zphddef - 07/15/03)

ZGP021

Improvements to URL routing continued

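For $R_{A_i} = \mathrm{Rem}\left(\frac{A_i}{P}\right)$ the following properties apply:

$$\mathrm{Rem}\left(\frac{\sum_{i=1}^{m} A_i}{P}\right) = \mathrm{Rem}\left(\frac{\sum_{i=1}^{m} R_{A_i}}{P}\right)$$

$$\mathrm{Rem}\left(\frac{\prod_{i=1}^{m} A_i}{P}\right) = \mathrm{Rem}\left(\frac{\prod_{i=1}^{m} R_{A_i}}{P}\right)$$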

Page 22: ZGP001  (zphddef - 07/15/03)

ZGP022

Improvements to URL routing continued

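Solve for $R_{A_2}$ as follows…

Let $A_3$ be $A_0$ shifted left $M$ bits.

Then

$$R_{A_3} = \mathrm{Rem}\left(\frac{A_0 \cdot 2^M}{P}\right) = \mathrm{Rem}\left(\frac{R_{A_0} R_M}{P}\right) \quad \text{(32-bit multiply)}$$

and

$$R_{A_2} = \mathrm{Rem}\left(\frac{A_1 + A_3}{P}\right) = \mathrm{Rem}\left(\frac{R_{A_1} + R_{A_3}}{P}\right).$$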
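The derivation can be checked numerically with plain polynomial arithmetic over GF(2), where addition is XOR. This sketch deliberately omits the initial value, final complement, and bit reflection that Ethernet's CRC32 applies; the function names and the sample header and URL bytes are assumptions for illustration.

```python
P = 0x104C11DB7  # CRC-32 generator polynomial, including the x^32 term

def rem(a, p=P):
    """Remainder of polynomial a modulo p over GF(2) (ints as bit vectors)."""
    plen = p.bit_length()
    while a.bit_length() >= plen:
        a ^= p << (a.bit_length() - plen)  # cancel the leading term
    return a

def clmul(a, b):
    """Carry-less product of two polynomials over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def subfield_remainder(r_a0, r_a1, m_bits):
    """Recover Rem(A2/P) from Rem(A0/P), Rem(A1/P) and the subfield length."""
    r_m = rem(1 << m_bits)           # table entry R_M = Rem(2^M / P)
    r_a3 = rem(clmul(r_a0, r_m))     # the 32-bit multiply, then reduce
    return rem(r_a1 ^ r_a3)          # addition over GF(2) is XOR

header, url = b"GET ", b"/page"
M = 8 * len(url)
a0 = int.from_bytes(header, "big")   # bits before the subfield
a2 = int.from_bytes(url, "big")      # the subfield itself
a1 = (a0 << M) | a2                  # header followed by subfield
recovered = subfield_remainder(rem(a0), rem(a1), M)
```

Only the two remainders and the table entry for the subfield length are needed; the subfield bits themselves are never reprocessed.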

Page 23: ZGP001  (zphddef - 07/15/03)

ZGP023

Improvements to URL routing continued

• My idea…
   – Aggressive hashing to perform fast look-up

» Self-adjusting chained collision resolution

» Fast way to do hash table look-ups

» Based on move-to-front rule for lists [33], [50]

Page 24: ZGP001  (zphddef - 07/15/03)

Improvements to URL routing continued

The new Aggressive hashing algorithm

C1. [Create lists] For i ← 0 to m-1, set LISTi ← NULL.

C2. [Hash] Set i ← h(KEY), j ← 0.

C3. [Is there a list?] If LISTi = NULL, go to C6.

C4. [Compare] If KEY = LISTi[j], terminate.

C5. [Advance to next] If LISTi[j] ≠ NULL, set j ← j+1 and go to step C4.

C6. [Insert new key] Set LISTi[j] ← KEY.

For Aggressive hashing, step C4 is replaced by:

C4B. [Compare and move-to-front – Aggressive hashing] If KEY = LISTi[j], then if j ≠ 0 set TEMP ← LISTi[j], for k ← j down to 1 set LISTi[k] ← LISTi[k-1], and set LISTi[0] ← TEMP; terminate.
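A short sketch of the move-to-front step on a single chain, written with a Python list standing in for LISTi. The function name and return convention are assumptions for illustration.

```python
def access_move_to_front(chain, key):
    """Look up key in one hash chain, moving it to the head on a hit.

    Returns the position at which the key was found (before the move),
    or None if the key was absent and has been appended.
    """
    for j, stored in enumerate(chain):
        if stored == key:
            if j > 0:
                # Shift entries 0..j-1 down by one and place the key first.
                chain.insert(0, chain.pop(j))
            return j
    chain.append(key)
    return None

chain = ["a", "b", "c"]
found_at = access_move_to_front(chain, "c")
```

Unlike transposition, a single access brings the key to the head of the chain, so the chain order reacts immediately to a change in which keys are popular.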

New

Page 25: ZGP001  (zphddef - 07/15/03)

• Motivation

• Problem and contributions

• URL routing

• Improvements to URL routing

• Evaluation of URL signatures

• Evaluation of hashing for URL routing

• Summary

• List of my publications

ZGP025

Topics

Page 26: ZGP001  (zphddef - 07/15/03)

ZGP026

Evaluation of URL signatures

Evaluation done with trace-driven simulation

Response variables:

1) Probability of false hits due to signature collisions

2) CPU time required to generate URL signatures

3) Reduction in processing and memory resources for URL look-up

Page 27: ZGP001  (zphddef - 07/15/03)

ZGP027

Evaluation of URL signatures continued

Input data used in the evaluation:

Obtained lists of URLs from 9 cache and server HTTP logs
– Access lists
– URL lists
– CRC32 lists

Unique URLs range from 70 to 2.5 million (1.5 to 146 MBytes)

Each log covers a period of months

Full URL string or CRC32 signature lists were built (generated by me)

2.1 GBytes of ASCII format raw data was used

Page 28: ZGP001  (zphddef - 07/15/03)

ZGP028

Evaluation of URL signatures continued

Input data characteristics

Access list name   Number of accesses   Number of URLs   Mean URL length (bytes)   Full URL list size (bytes)   CRC32 list size (bytes)
www.peak.org                   16,374               70                     23.93                        1,675                       280
SDMA                           41,941              153                     33.76                        5,165                       612
UVA                           318,899           45,816                     44.91                    2,057,625                   183,264
NLANR                         944,028          504,967                     58.44                   29,510,135                 2,019,868
UC Berkeley                 1,791,349          149,344                     41.87                    6,253,716                   597,376
mcs.net                     1,862,070           75,361                     29.87                    2,250,829                   301,444
hyperreal.org               4,080,590           86,338                     89.17                    7,698,337                   345,352
CA*netII                    4,642,861        2,552,045                     57.83                  147,573,556                10,208,184
USF CSEE                    8,819,454           49,029                     51.84                    2,541,483                   196,116

Page 29: ZGP001  (zphddef - 07/15/03)

ZGP029

Experiments on the performance of CRC32

• Experiment #1: Number of CRC collisions was measured
   – CRC32 generated for each URL
   – Non-unique CRC32s counted

• Experiment #2: Measured CPU time to generate CRC32 URL list
   – Software CRC generation (8-bit look-up coded in “C”)

• Experiment #3: Measured CPU time required for look-up
   – All entries from access list were looked up in URL list
   – URL list is a Simple chained hash table
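For reference, a byte-at-a-time table-driven CRC32 looks like the sketch below. It is written in Python rather than C and is not the code used in the experiments; it uses the standard reflected CRC-32 parameters, which is an assumption about the variant, and the function names are illustrative.

```python
import zlib

def make_table():
    """Build the 256-entry table for the reflected CRC-32 polynomial."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        table.append(crc)
    return table

TABLE = make_table()

def crc32_signature(url: bytes) -> int:
    """Return the 32-bit CRC of a URL, one table look-up per byte."""
    crc = 0xFFFFFFFF
    for b in url:
        crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF

sig = crc32_signature(b"http://www.some.com/page")
same_as_zlib = sig == zlib.crc32(b"http://www.some.com/page")
```

Each URL costs one table look-up, one shift, and two XORs per byte, and the resulting signature occupies 4 bytes regardless of URL length.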

Evaluation of URL signatures continued

Page 30: ZGP001  (zphddef - 07/15/03)

ZGP030

Evaluation of URL signatures continued

Access list name   Collisions measured   Calculated value   Pr[collision] measured
www.peak.org                         0                  0                0.0000000
SDMA                                 0                  0                0.0000000
UVA                                  0                  1                0.0000000
NLANR                               68                 59                0.0001347
UC Berkeley                          2                  5                0.0000134
mcs.net                              0                  1                0.0000000
hyperreal.org                        2                  2                0.0000463
CA*netII                          1558               1516                0.0006105
USF CSEE                             2                  1                0.0000408

Results for experiment #1

Measured and theoretical are close
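The slide does not say how the calculated values were obtained. As a sketch under an assumption, the estimate n² / 2³² for n unique URLs and 32-bit signatures reproduces the calculated entries for the two largest lists, and dividing measured collisions by the number of URLs reproduces the NLANR and CA*netII probabilities. Both formulas are inferred from the numbers, not taken from the source.

```python
def estimated_collisions(n_urls, signature_bits=32):
    """Assumed estimate: n^2 / 2^bits (inferred, not stated in the slides)."""
    return n_urls * n_urls / 2 ** signature_bits

def measured_probability(collisions, n_urls):
    """Assumed definition: measured collisions divided by unique URLs."""
    return collisions / n_urls

nlanr_estimate = round(estimated_collisions(504_967))
canet_estimate = round(estimated_collisions(2_552_045))
nlanr_probability = measured_probability(68, 504_967)
```

The estimate grows quadratically with the number of URLs, which is why only the two largest lists show more than a handful of collisions.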

Page 31: ZGP001  (zphddef - 07/15/03)

ZGP031

Evaluation of URL signatures continued

Results for experiment #2

Access list     Time for URL list (millisec)   Time for URL (μsec)
www.peak.org                             <10                    --
SDMA                                     <10                    --
UVA                                       40                0.8730
NLANR                                    460                0.9109
UC Berkeley                              100                0.6695
mcs.net                                   40                0.5307
hyperreal.org                            120                1.3897
CA*netII                                2390                0.9368
USF CSEE                                  40                0.8158

Time per URL string is small (μsec)

Page 32: ZGP001  (zphddef - 07/15/03)

ZGP032

Evaluation of URL signatures continued

[Figure: look-up time (sec), axis from 0 to 0.6, versus H value (10 to 22), with one curve for CRC32 URL signatures and one for full URLs]

Results for experiment #3

CRC32 URL signature is better

Page 33: ZGP001  (zphddef - 07/15/03)

ZGP033

Evaluation of URL signatures continued

Experiments for CRC32 vs. MD5-Bloom filter digesting

• Experiment #1: Measured digest size and generation CPU time
   – MD5-Bloom filter
   – CRC32
   – 32-bit checksum
   – Lempel-Ziv (LZ) compression (used pkzip25)

• Experiment #2: Measured digest size and CPU time
   – MD5-Bloom

• Experiment #3: Measured collisions
   – Control variable is URL length
   – MD5-Bloom vs. CRC32
   – URL length is a maximum of 25, 30, …, 80 bytes

Page 34: ZGP001  (zphddef - 07/15/03)

ZGP034

Evaluation of URL signatures continued

Experiments for CRC32 vs. MD5-Bloom filter digesting (continued)

• Experiment #4: Measured digest size of the hash chain method
   – Based on the number of components
   – Tree structure of 32 bits for a <depth, hash code> pair

Page 35: ZGP001  (zphddef - 07/15/03)

ZGP035

Evaluation of URL signatures continued

                        CA*net list                                    CSE list
Method (load factor)    CPU time (sec)  Size (Mbytes)  Collisions (%)  CPU time (sec)  Size (Mbytes)  Collisions (%)
MD5-Bloom (8)                    89.13           9.74            0.03            1.63           0.19            0.00
CRC32                            16.22           9.74            0.03            0.27           0.19            0.00
32-bit checksum                  14.85           9.74            0.71            0.24           0.19            0.22
LZ compression                   17.35          16.43            0.00            0.23           0.25            0.00
MD5-Bloom (8)                    89.13           9.74            0.03            1.63           0.19            0.00
MD5-Bloom (16)                   92.37          19.47            0.00            1.71           0.37            0.00
MD5-Bloom (32)                   97.40          38.94            0.00            1.84           0.75            0.00

Results for experiments #1 and #2

Similar CRC32 and Bloom filter collisions

Page 36: ZGP001  (zphddef - 07/15/03)

ZGP036

Evaluation of URL signatures continued

[Figure: collisions (%), axis ticks 0.00, 0.01, 0.10, versus URL length (25 to 75 bytes), with one curve for MD5-Bloom and one for CRC32]

Results for experiment #3

Collisions are same for CRC32 and Bloom filter

Page 37: ZGP001  (zphddef - 07/15/03)

ZGP037

Evaluation of URL signatures continued

Results from experiment #4

• Hash chaining results in digests that are on average 212% larger than CRC32 digests

Substantially larger than the other methods

Page 38: ZGP001  (zphddef - 07/15/03)

ZGP038

Evaluation of URL signatures continued

Discussion of results

• CRC32 URL signatures reduce the size of URL lists and speed up look-up in a hash table
   – Require less network bandwidth to transfer
   – Require less memory for storage in the URL router

• For CRC32 the number of collisions was found to be small

• CRC32 digests require less CPU time than MD5-Bloom filter digests and produce a similar number of collisions

Page 39: ZGP001  (zphddef - 07/15/03)

• Motivation

• Problem and contributions

• URL routing

• Improvements to URL routing

• Evaluation of URL signatures

• Evaluation of hashing for URL routing

• Summary

• List of my publications

ZGP039

Topics

Page 40: ZGP001  (zphddef - 07/15/03)

ZGP040

Evaluation of hashing for URL routing continued

Look-up time experiments:

• Experiment #1: Effect of hash table size on look-up time (NASA access list)

• Experiment #2: Effect of hash table size (in K ) on look-up time (Clark.net access list)

Page 41: ZGP001  (zphddef - 07/15/03)

ZGP041

Hash table look-up time for experiment #1

Evaluation of hashing for URL routing continued

[Figure: mean look-up time (axis from 0 to 60) versus hash table size K (8 to 13), with curves for Simple, H1, and Aggressive hashing]

For dense hash tables Aggressive is better than H1

Page 42: ZGP001  (zphddef - 07/15/03)

ZGP042

Hash table look-up time for experiment #2

Evaluation of hashing for URL routing continued

[Figure: mean look-up time (axis from 0 to 40) versus K (8 to 13), with curves for Simple, H1, and Aggressive hashing]

Similar to experiment #1 results

Page 43: ZGP001  (zphddef - 07/15/03)

ZGP043

Evaluation of hashing for URL routing continued

• Evaluation model (single server queue):
   – Arrivals are URLs to be looked up
   – The queue holds the queued URLs
   – Server is a hash table look-up

• Response variables:
   – mean queuing delay
   – drop in utilization
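A minimal sketch of such a single-server FIFO queue, where each service time would be the measured time of one hash table look-up. The function name and the way the mean queue length is computed (total time in system divided by the observation period) are assumptions for illustration, not the simulator used in this work.

```python
def mean_queue_length(arrivals, services):
    """Time-average number of requests in the system (waiting or in service).

    arrivals: non-decreasing arrival times of the look-up requests
    services: service (look-up) time of each request, in arrival order
    """
    depart = 0.0
    total_time_in_system = 0.0
    for arrive, service in zip(arrivals, services):
        start = max(arrive, depart)   # wait if the server is still busy
        depart = start + service
        total_time_in_system += depart - arrive
    horizon = depart - arrivals[0]    # first arrival to last departure
    return total_time_in_system / horizon

no_waiting = mean_queue_length([0.0, 1.0, 2.0], [1.0, 1.0, 1.0])
with_waiting = mean_queue_length([0.0, 0.0], [1.0, 1.0])
```

Feeding the same set of service times in a different order changes the result whenever long look-ups cluster together, which is the kind of effect the shuffled versus unshuffled experiments examine.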

Page 44: ZGP001  (zphddef - 07/15/03)

ZGP044

Mean queue length experiments:

•Experiment #1: Effect of hash table size (K) on queue length (L) for utilization U = 80% (Simple chain) and exponential arrivals

•Experiment #2: Effect of burstiness (Tmax) on L for U = 80% (Simple chain) and K = 8

•Experiment #3: Effect of (Tmax) on L for U = 80% and K = 8

•Experiment #4: Effect of autocorrelation (unshuffled and shuffled ordering of requests) on L for U = 80% and K = 8

•Experiment #5: Effect of autocorrelation (unshuffled and shuffled ordering of requests) on L for U = 80% (Simple chain) and K = 8

Evaluation of hashing for URL routing continued

Page 45: ZGP001  (zphddef - 07/15/03)

ZGP045

Evaluation of hashing for URL routing continued

[Figure: mean queue length L (axis from 0 to 6) versus K (8 to 13), with curves for Simple, H1, and Aggressive hashing]

Results for experiment #1

Self-adjusting methods show similar performance

Page 46: ZGP001  (zphddef - 07/15/03)

ZGP046

Evaluation of hashing for URL routing continued

[Figure: mean queue length L (axis from 0 to 40) versus Tmax (50, 100, 250, 500, 750, 1000), with curves for H1 and Aggressive hashing]

Simple hashing - value range is 5500 to 34000

Results for experiment #2

H1 shows faster increase in L

Page 47: ZGP001  (zphddef - 07/15/03)

ZGP047

Evaluation of hashing for URL routing continued

[Figure: mean queue length L (axis from 0 to 120K) versus Tmax (50, 100, 250, 500, 750, 1000), with curves for Simple, H1, and Aggressive hashing]

Results for experiment #3

H1 has orders of magnitude worse queue length

Page 48: ZGP001  (zphddef - 07/15/03)

ZGP048

Results for experiment #4

Evaluation of hashing for URL routing continued

Algorithm     unshuffled   shuffled   M/G/1
Simple              5.20       3.15    3.13
H1              29102.01       8.58    8.57
Aggressive        294.09       9.93    9.76

H1 has orders of magnitude worse queue length
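The slide does not state how the M/G/1 column was computed. For reference, the standard Pollaczek-Khinchine result for the mean number in an M/G/1 system, with utilization $\rho$ and coefficient of variation $C_s$ of the service time, is:

$$L = \rho + \frac{\rho^{2}\,(1 + C_s^{2})}{2\,(1 - \rho)}$$

As a worked check of the formula, $\rho = 0.8$ with exponential service times ($C_s = 1$) gives $L = 0.8 + \frac{0.64 \cdot 2}{0.4} = 4$. The formula assumes independent service times, so the close agreement between the shuffled and M/G/1 columns, and the gap to the unshuffled column, is consistent with correlation in look-up times driving the difference.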

Page 49: ZGP001  (zphddef - 07/15/03)

ZGP049

Results for experiment #5

Evaluation of hashing for URL routing continued

Algorithm     U (%)   unshuffled   shuffled
Simple         80.0         5.20       3.15
H1             21.7         0.43       0.36
Aggressive     12.9         0.19       0.18

Page 50: ZGP001  (zphddef - 07/15/03)

ZGP050

Discussion of results

• Aggressive hashing improves upon H1 hashing
   – Modest look-up time improvement
   – Significant improvement from a queueing perspective

• Queueing must be used for evaluating hashing algorithms

• Long-range dependence (LRD) in the look-up time of H1 results in extreme queueing delay
   – Catastrophic effects on any application

Evaluation of hashing for URL routing continued

Page 51: ZGP001  (zphddef - 07/15/03)

• Motivation

• Problem and contributions

• URL routing

• Improvements to URL routing

• Evaluation of URL signatures

• Evaluation of hashing for URL routing

• Summary

• List of my publications

ZGP051

Topics

Page 52: ZGP001  (zphddef - 07/15/03)

ZGP052

In summary, I have addressed the problem of

Excessive delay in the Internet caused by the inability to efficiently access distributed content in the Web

My work has shown that:

1) A URL router that uses HTTP redirection is feasible

2) CRC32 can be used for digesting of URL routing tables

3) Aggressive hashing improves upon existing hashing algorithms in fast look-up

4) Queueing behavior needs to be considered when evaluating hashing algorithms

Summary

Four publications have resulted

Page 53: ZGP001  (zphddef - 07/15/03)

ZGP053

List of my related publications

1. Z. Genova and K. Christensen, "Managing Routing Tables for URL Routers in Content Distribution Networks," submitted to the International Journal of Network Management in June 2003

2. Z. Genova and K. Christensen, “Efficient Summarization of URLs using CRC32 for Implementing URL Switching,” Proceedings of the 27th IEEE Conference on Local Computer Networks (LCN), pp. 343-344, November 2002

3. Z. Genova and K. Christensen, “Using Signatures to Improve URL Routing,” Proceedings of IEEE International Performance, Computing, and Communications Conference, pp. 45-52, April 2002

4. Z. Genova and K. Christensen, “Challenges in URL Switching for Implementing Globally Distributed Web Sites,” Proceedings of the Workshop on Scalable Web Services, pp. 89-94, August 2000

 

