Date post: 13-Jul-2015
Upload: aravindh-ramanan
Caching and Content Distribution Networks

Web Caching

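• As an example, we use the web to illustrate caching and other related issues.

[Figure: two request/response paths. In one, the browser talks to the web server directly; in the other, the browser's requests and responses pass through a web proxy cache on the way to the web server.]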

Web Browser Caching

• Web browsers have their own caches. When a page is downloaded from a site, it is put into the browser cache.

• This is especially useful when the back button is pressed.

• If a new copy is needed, a "refresh" can be done.

• No page stays permanently in the cache, because room is limited.

• A replacement algorithm is needed to determine which cached page should be purged.
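
The slides do not say which replacement algorithm a browser uses. As a minimal sketch, assuming a least-recently-used (LRU) policy, which is one common choice, a page cache could look like the following. The class name `LRUPageCache` and the example URLs are made up for illustration.

```python
from collections import OrderedDict


class LRUPageCache:
    """Page cache that purges the least recently used page when full."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._pages = OrderedDict()  # URL -> page body, least recent first

    def get(self, url):
        """Return the cached page, or None on a miss."""
        if url not in self._pages:
            return None
        self._pages.move_to_end(url)  # mark as most recently used
        return self._pages[url]

    def put(self, url, body):
        """Store a page, purging the least recently used one if needed."""
        self._pages[url] = body
        self._pages.move_to_end(url)
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # drop the least recent entry


cache = LRUPageCache(capacity=2)
cache.put("/a.html", "page A")
cache.put("/b.html", "page B")
cache.get("/a.html")            # /a.html becomes the most recently used
cache.put("/c.html", "page C")  # the cache is full, so /b.html is purged
```

Other policies (for example, purging the largest or the least frequently used page) would only change which entry `put` removes.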

Why Web Server Caching

• Reduce latency
  – The request does not have to go to the server.
  – The request is served from the client side, which means that network communication is avoided.
• Reduce traffic

Consistency

• What if the page changes after it has been saved in the cache?
  – The cached copy is then out of date.
  – The copy and the original are no longer consistent.

• There are different strategies for dealing with this.

Web Browser Caching

• Client pull
  – The server provides the content together with instructions on when the client should ask for a refreshed copy, or on whether the content should be cached at all.
• Server push
  – The server transmits page information to the browser.
  – The browser displays the information and leaves the connection to the server open.
  – With an open connection, the server can continue to push updated pages for display on an ongoing basis. You can close the connection by closing the page.
  – The server is in control.
• Browser caches are different from proxy caches (discussed next).
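
For client pull, the slides do not name the mechanism that carries the server's instructions. In HTTP these instructions typically travel in response headers such as `Cache-Control`. The sketch below is an illustration under that assumption: the function name `freshness_lifetime` is hypothetical, and only the `no-store`, `no-cache`, and `max-age` directives are handled.

```python
def freshness_lifetime(cache_control):
    """Return how many seconds a response may be reused without asking the
    server again, or None if the response must not be stored at all."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')
    if "no-store" in directives:
        return None  # the server forbids caching this content
    if "no-cache" in directives:
        return 0  # may be stored, but must be rechecked before every reuse
    max_age = directives.get("max-age", "")
    # Simplification: without a usable max-age, treat the copy as stale at once.
    return int(max_age) if max_age.isdigit() else 0
```

A client would call this on each response, store the page with an expiry of "now plus lifetime", and pull a fresh copy once that time has passed.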

Web Caching

• Proxy caches (also called proxy servers)
  – A proxy cache intercepts HTTP requests from clients.
    · It serves the object if the object is in its cache and its expiry date has not passed.
    · If not, it goes to the object's home server: on behalf of the user, it gets the object and possibly deposits it in its cache before returning it to the user.
  – Proxy caches are usually deployed at the edges of a network.
    · This gives wide-area bandwidth savings, improved response time, and increased availability of static web-based objects.

• A browser may have to be configured to point to the proxy server.

• Usually a proxy cache is purchased and installed by an organization.
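
The hit/miss behaviour of a proxy cache can be sketched as follows. This is a simplified model, not a real proxy: `ProxyCache`, `fake_origin`, the fixed 60-second lifetime, and the injected clock are all assumptions made so the example runs without a network.

```python
import time


class ProxyCache:
    """Serve objects from the cache while valid, else fetch from the home server."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock=time.time):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> body
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._store = {}  # url -> (body, expiry timestamp)

    def get(self, url):
        """Return (body, source), where source is 'cache' or 'origin'."""
        entry = self._store.get(url)
        now = self.clock()
        if entry is not None and now < entry[1]:
            return entry[0], "cache"
        body = self.fetch_from_origin(url)  # act on behalf of the user
        self._store[url] = (body, now + self.ttl_seconds)
        return body, "origin"


origin_hits = []


def fake_origin(url):
    """Stand-in for the object's home server."""
    origin_hits.append(url)
    return f"contents of {url}"


fake_now = [1000.0]  # controllable clock for the demonstration
proxy = ProxyCache(fake_origin, ttl_seconds=60, clock=lambda: fake_now[0])
first = proxy.get("/logo.png")   # miss: fetched from the home server
second = proxy.get("/logo.png")  # hit: served from the cache
fake_now[0] += 61
third = proxy.get("/logo.png")   # cached copy expired: fetched again
```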

Web Caching

• Not all web pages can be cached.
  – If the Last-Modified tag is present, then the page can be cached.
• A refresh is often done when
  – there is a request; and
  – the expiry time has passed.
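
The refresh rule above (refresh only when a request arrives and the expiry time has passed) can be sketched as below. The origin is simulated by `conditional_get`, which mimics an HTTP conditional request answered with 304 (unchanged) or 200 (new copy). All names, the integer timestamps, and the fixed `ttl` are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedPage:
    body: str
    last_modified: int  # time at which the origin last changed the page
    expires: int        # time after which the copy must be checked


def conditional_get(origin_page, if_modified_since):
    """Stand-in for the origin: 304 if unchanged, else 200 with the new page."""
    body, last_modified = origin_page
    if last_modified <= if_modified_since:
        return 304, None, last_modified
    return 200, body, last_modified


def serve(entry, origin_page, now, ttl):
    """Handle one request: reuse the copy, or refresh it once it has expired."""
    if now < entry.expires:
        return entry.body  # not expired: no contact with the origin
    status, body, last_modified = conditional_get(origin_page, entry.last_modified)
    if status == 200:
        entry.body, entry.last_modified = body, last_modified
    entry.expires = now + ttl
    return entry.body


entry = CachedPage(body="v1", last_modified=100, expires=200)
origin_page = ("v2", 150)  # the origin changed the page at time 150
before_expiry = serve(entry, origin_page, now=180, ttl=100)
after_expiry = serve(entry, origin_page, now=250, ttl=100)
```

Note that the request at time 180 still receives the old copy: until the expiry time passes, the cache does not notice that the origin has changed.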

Cooperative Caching

• A caching infrastructure can have multiple web proxies.
  – Proxies can be arranged in a hierarchy or in other structures.
  – Proxies can cooperate with one another to
    · answer client requests
    · propagate server notifications

• Cooperation uses a combination of HTTP and ICP (Internet Cache Protocol).
  – ICP can be used by one cache to quickly ask another cache whether it has an object.
  – HTTP is used to actually retrieve the object.
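
The division of labour between ICP and HTTP can be sketched with in-memory objects. This is only a model of the message flow: the methods `icp_query` and `http_get` stand in for real ICP and HTTP exchanges, and the `Proxy` class and the dictionary used as an origin are hypothetical.

```python
class Proxy:
    """A caching proxy that can ask sibling proxies before going to the origin."""

    def __init__(self, name, siblings=()):
        self.name = name
        self.siblings = list(siblings)
        self.objects = {}  # url -> body

    def icp_query(self, url):
        """ICP-style probe: report only whether the object is cached here."""
        return url in self.objects

    def http_get(self, url):
        """HTTP-style retrieval of an object that is cached here."""
        return self.objects[url]

    def fetch(self, url, origin):
        """Return (body, name of the place the body came from)."""
        if url in self.objects:
            return self.objects[url], self.name
        for sibling in self.siblings:
            if sibling.icp_query(url):        # cheap question first
                body = sibling.http_get(url)  # then the actual transfer
                self.objects[url] = body
                return body, sibling.name
        body = origin[url]                    # nobody nearby has it
        self.objects[url] = body
        return body, "origin"


origin = {"/a": "A", "/b": "B"}
p2 = Proxy("p2")
p2.objects["/a"] = "A"
p1 = Proxy("p1", siblings=[p2])
from_sibling = p1.fetch("/a", origin)
from_self = p1.fetch("/a", origin)
from_origin = p1.fetch("/b", origin)
```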

Problems

• Caching proxies do not serve all Internet users.

• Content providers (say, web servers) cannot rely on the existence and correct implementation of caching proxies.

• There are accounting issues with caching proxies.
  – Example: www.cnn.com needs to know the number of hits to the advertisements displayed on the web page.

Content Distribution Networks (CDN)

• Business model: a content provider such as www.cnn.com or Yahoo pays a CDN company (such as Akamai) to get its content to the requesting users with short delays.

• A CDN provides a mechanism for
  – replicating content on multiple servers in the Internet
  – providing clients with a means to determine the servers that can deliver the content fastest

Terminology

• Content: Any publicly accessible combination of text, images, applets, frames, MP3, video, Flash, virtual reality objects, etc.

• Content Provider: Any individual, organization, or company that has content that it wishes to make available to users.

• Origin Server: The content provider's server, where the content is first uploaded.

• Surrogate Server (sometimes called edge server): The content distributor's server, where the replicated content is kept.

Players

• Content provider: Yahoo, MSNBC, CNN, CBC
• H/W and S/W vendor: Cisco, Oracle-Sun
• Content distributor: Akamai
• Hosting provider: Bell

[Figure: the players are connected by arrows labeled "Sells servers", "Send content", and "Install servers".]

CDN Distribution

• Content providers are CDN customers.

Content replication

• The CDN company installs thousands of servers throughout the Internet
  – in large datacenters
  – or close to users
• The CDN replicates its customers' content.
• When a provider updates content, the CDN updates its servers.

[Figure: an origin server in North America, a CDN distribution node, and CDN servers in S. America, Europe, and Asia.]

CDN: Functional Components

• Distribution service
• Redirection service
• Accounting and billing system

CDN: Distribution Service

• The content provider determines which of its objects it wants the CDN to distribute.

• The content provider tags and then pushes this content to a CDN node, which in turn replicates and pushes the content to all its CDN servers.

CDN: Redirection

• When a browser in a user's host is instructed to retrieve a specific object (specified using a URL), how does the browser determine whether it should retrieve the object from the origin server or from one of the CDN servers?

• As an example, suppose the hostname of the content provider is www.cnn.com.

How Akamai Works

[Figure: the end user, cnn.com (content provider), the DNS root server, the Akamai global DNS server, the Akamai regional DNS server, an Akamai cluster, and a nearby Akamai cluster. Steps 1 and 2 are an HTTP exchange labeled "GET index.html"; the figure also shows the URL http://a.73.g.akamai.net/7/23/cnn.com/af/cnn.com/foo.jpg.]

CDN: Redirection

• Users get an HTML document from www.cnn.com; this could be index.html.
• The file index.html uses a modified URL for content that has been replicated.
  – Example: if the JPEG files are what has been replicated, then <img src="http://cnn.com/af/foo.jpg"> may be modified to <img src="http://a73.g.akamai.net/7/23/cnn.com/af/foo.jpg">.

• The browser needs to resolve the hostname a73.g.akamai.net for replicated content.
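
The URL modification in the example can be sketched as a small rewriting step applied to index.html. This is an illustration only: the function name `akamaize` is hypothetical, the regular expression handles just the simple double-quoted `<img src="...">` form used above, and the slides do not say how the rewriting is actually performed.

```python
import re


def akamaize(html, origin_host, cdn_host, control_path):
    """Point <img> tags for .jpg files on origin_host at the CDN host instead."""
    pattern = re.compile(
        r'(<img\s[^>]*?src=")http://' + re.escape(origin_host) + r'(/[^"]*\.jpg")'
    )
    prefix = f"http://{cdn_host}{control_path}/{origin_host}"
    # Keep the tag opening and the original path; replace only scheme and host.
    return pattern.sub(lambda m: m.group(1) + prefix + m.group(2), html)


page = '<img src="http://cnn.com/af/foo.jpg"><img src="http://cnn.com/logo.gif">'
rewritten = akamaize(page, "cnn.com", "a73.g.akamai.net", "/7/23")
```

Only the JPEG reference is rewritten; the GIF keeps its original URL and is still fetched from the origin server.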

CDN: Redirection

• What does this mean?

  <img src="http://a73.g.akamai.net/7/23/cnn.com/af/foo.jpg">

  – Host part: a73.g.akamai.net
  – Akamai control part: /7/23
  – Content URL: /af/foo.jpg
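
The breakdown above can be reproduced with the standard URL parser. The sketch assumes that the control part is always the first two path segments and that the next segment is the content provider's hostname, as in this one example; the real URL scheme may differ. The function name `split_akamai_url` is hypothetical.

```python
from urllib.parse import urlsplit


def split_akamai_url(url):
    """Split a rewritten URL into host, control part, origin host, and content URL."""
    parts = urlsplit(url)
    segments = parts.path.lstrip("/").split("/")
    return {
        "host": parts.hostname,
        "control": "/" + "/".join(segments[:2]),      # assumed: two segments
        "origin_host": segments[2],                   # assumed: provider hostname
        "content_url": "/" + "/".join(segments[3:]),
    }


pieces = split_akamai_url("http://a73.g.akamai.net/7/23/cnn.com/af/foo.jpg")
```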

CDN: Redirection

• DNS is configured so that all queries about g.akamai.net that arrive at a DNS server are sent to an authoritative DNS server for g.akamai.net.

• This authoritative DNS server is referred to as an Akamai DNS server.

How Akamai Works

[Figure: the end user, cnn.com (content provider), the DNS root server, the Akamai global DNS server, the Akamai regional DNS server, an Akamai cluster, and a nearby Akamai cluster. Steps 1 to 4 are shown; the new labels are "DNS lookup cache.cnn.com" and "ALIAS: g.akamai.net".]

CDN: Redirection

• When the Akamai DNS server (the authoritative DNS server for g.akamai.net) receives the query, it extracts the IP address of the requester. In practice this is usually the address of the client's local DNS server rather than of the browser itself.

How Akamai Works

[Figure: the end user, cnn.com (content provider), the DNS root server, the Akamai global DNS server, the Akamai regional DNS server, an Akamai cluster, and a nearby Akamai cluster. Steps 1 to 6 are shown; the new labels are "DNS lookup g.akamai.net" and "ALIAS a73.g.akamai.net".]

CDN: Redirection

• Based on the IP address and the information it has about the Internet (called a map), the Akamai DNS server returns the IP address of an Akamai regional server to the requesting browser, chosen according to policy.
  – For example, select the server that is the fewest hops away.
• The regional server may choose a surrogate server for content retrieval.
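
The "fewest hops" policy can be sketched as a lookup in a map keyed by client network. The map format, the function name `pick_regional_server`, and all addresses (taken from the ranges reserved for documentation) are assumptions; the slides do not describe what Akamai's map actually contains.

```python
import ipaddress


def pick_regional_server(client_ip, network_map):
    """Return the server with the fewest hops to the client's network, or None."""
    address = ipaddress.ip_address(client_ip)
    for network, hops_by_server in network_map.items():
        if address in ipaddress.ip_network(network):
            return min(hops_by_server, key=hops_by_server.get)
    return None  # the map has no entry for this client


# Hypothetical map: client network -> {regional server address: hop count}
network_map = {
    "203.0.113.0/24": {"198.51.100.10": 3, "198.51.100.20": 7},
    "192.0.2.0/24": {"198.51.100.10": 9, "198.51.100.20": 2},
}
```

A different policy, such as lowest measured delay or lowest load, would replace only the `min(...)` expression.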

How Akamai Works

[Figure: the end user, cnn.com (content provider), the DNS root server, the Akamai global DNS server, the Akamai regional DNS server, an Akamai cluster, and a nearby Akamai cluster. Steps 1 to 8 are shown; the new labels are "DNS a73.g.akamai.net" and "Address 1.2.3.4".]

How Akamai Works

[Figure: the end user, cnn.com (content provider), the DNS root server, the Akamai global DNS server, the Akamai regional DNS server, an Akamai cluster, and a nearby Akamai cluster. Steps 1 to 9 are shown; the new label is the HTTP request "GET /foo.jpg" with the header "Host: cache.cnn.com".]

How Akamai Works

[Figure: the end user, cnn.com (content provider), the DNS root server, the Akamai global DNS server, the Akamai regional DNS server, an Akamai cluster, and a nearby Akamai cluster. Steps up to 12 are shown; besides the HTTP request "GET /foo.jpg" with the header "Host: cache.cnn.com", the new label is "GET foo.jpg".]

CDN Redirection

• The Akamai DNS server IP address is now in the cache of the local DNS server.
  – This implies that it is not always necessary to go to the root DNS server.
• The TTL associated with the IP address of an Akamai server (surrogate) is relatively small.
  – This is done for performance reasons: a short TTL forces clients to re-resolve the name often, so the choice of surrogate can follow changes in load and network conditions.

• Akamai content distribution servers are caches.
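
The effect of the two different TTLs can be sketched with a small cache like the one a local DNS server keeps. The class `DnsCache`, the TTL values, and the address 192.0.2.53 are assumptions for illustration; 1.2.3.4 is the surrogate address used in the figure. Storing a plain address for g.akamai.net is a simplification of how delegation records are cached.

```python
class DnsCache:
    """Name-to-address cache in which every record expires after its TTL."""

    def __init__(self):
        self._records = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl, now):
        self._records[name] = (address, now + ttl)

    def get(self, name, now):
        """Return the cached address, or None if absent or expired."""
        record = self._records.get(name)
        if record is None or now >= record[1]:
            self._records.pop(name, None)
            return None
        return record[0]


local_dns = DnsCache()
# Assumed TTLs: long for the Akamai DNS server, short for the surrogate.
local_dns.put("g.akamai.net", "192.0.2.53", ttl=3600, now=0)
local_dns.put("a73.g.akamai.net", "1.2.3.4", ttl=20, now=0)
```

After 30 seconds the surrogate record has expired and must be resolved again, but the query can go straight to the cached Akamai DNS server without involving the root.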

CDN Redirection

• What if the content is not there?
  – If the requested content is not found, the surrogate asks other surrogates within a specified region for it.
  – If the requested content is still not found or is stale, a request is made to the original web site.

CDN Selection

• The tricky issue is selecting which local content server to use for a particular request.
  – We want to spread load evenly.
  – We want minimal impact if a server is added or removed.
• In Akamai, each surrogate server sends measurement results to the Network Operations Command Center (NOCC).
  – Measurement results include the number of active TCP connections, the HTTP request arrival rate, bandwidth availability, etc.

• This information is used by the Akamai DNS server.
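
The slides do not say how the two goals (even load, minimal impact when a server is added or removed) are met. One well-known technique with exactly these properties is consistent hashing, sketched below as an illustration rather than as a description of Akamai's system. The class `HashRing`, the server names, and the number of ring points per server are assumptions, and the load measurements mentioned above are not modeled.

```python
import bisect
import hashlib


def _point(key):
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hashing: each server owns many points on a ring, and a URL
    is served by the owner of the first point at or after the URL's position."""

    def __init__(self, servers, replicas=100):
        self.replicas = replicas
        self._ring = []  # sorted list of (point, server)
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            bisect.insort(self._ring, (_point(f"{server}#{i}"), server))

    def remove(self, server):
        self._ring = [(p, s) for p, s in self._ring if s != server]

    def lookup(self, url):
        if not self._ring:
            raise LookupError("no servers in the ring")
        i = bisect.bisect_left(self._ring, (_point(url), ""))
        return self._ring[i % len(self._ring)][1]  # wrap around the ring


ring = HashRing(["s1", "s2", "s3"])
urls = [f"/img/{i}.jpg" for i in range(200)]
before = {url: ring.lookup(url) for url in urls}
ring.remove("s2")
after = {url: ring.lookup(url) for url in urls}
```

Removing `s2` only remaps the URLs that `s2` was serving; every other URL stays on the server it had before, which is the "minimal impact" property.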

Accounting Mechanism

• Accounting mechanisms collect and track information related to request routing, distribution, and delivery.

• Information is gathered in real time and put into log files for each CDN component.

• These logs are sent to the Network Operations Command Center (NOCC).
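
As a minimal sketch of what accounting over such logs could look like, the function below counts hits per customer object, which is the kind of figure a content provider needs for advertisement billing. The four-field log format (timestamp, server, customer, path) and the name `hits_per_object` are invented for this example; the slides do not describe the real log format.

```python
from collections import Counter


def hits_per_object(log_lines):
    """Count deliveries per (customer, path) from whitespace-separated log lines."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip malformed lines
        _timestamp, _server, customer, path = fields
        counts[(customer, path)] += 1
    return counts


# Hypothetical log lines collected from two surrogate servers.
log_lines = [
    "1000 surrogate-1 cnn.com /af/foo.jpg",
    "1001 surrogate-2 cnn.com /af/foo.jpg",
    "1002 surrogate-1 cnn.com /af/ad.jpg",
    "garbled line",
]
hits = hits_per_object(log_lines)
```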

Full Site Delivery vs. Partial Site Delivery

• Full site delivery: All the content is delivered by the CDN (including HTML, images, and other objects).

• Partial site delivery: Only images, streaming media, and other bandwidth-intensive objects are delivered by the CDN.

Current Akamai Customers

Summary

• We have examined replication and issues related to the design and implementation of a replicated system.

• There are many choices and tradeoffs to consider.

