Date post: | 13-Jul-2015 |
Category: | Internet |
Upload: | aravindh-ramanan |
Web Caching
• As an example, we use the web to illustrate caching and other related issues.
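[Figure: with a proxy, the browser exchanges requests and responses with a web proxy cache, which in turn exchanges requests and responses with the web server. Without a proxy, the browser exchanges requests and responses with the web server directly.]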
Web Browser Caching
• Web browsers have their own caches. When a page is downloaded from a site, the web page is put into the browser cache.
• This is especially useful when the back button is pressed.
• If a new copy is needed, a "refresh" can be done.
• No page stays permanently in the cache, because room is limited.
• A replacement algorithm is needed to determine which cached page should be purged.
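The slides do not name a particular replacement algorithm. As a minimal sketch, assuming a least-recently-used (LRU) policy and a hypothetical `LRUPageCache` class keyed by URL, purging could look like this:

```python
from collections import OrderedDict


class LRUPageCache:
    """Tiny least-recently-used cache keyed by URL (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._pages = OrderedDict()  # least recently used entry comes first

    def get(self, url):
        if url not in self._pages:
            return None  # cache miss: the caller must fetch from the network
        self._pages.move_to_end(url)  # mark as most recently used
        return self._pages[url]

    def put(self, url, body):
        self._pages[url] = body
        self._pages.move_to_end(url)
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # purge the least recently used page


cache = LRUPageCache(capacity=2)
cache.put("/a", "page A")
cache.put("/b", "page B")
cache.get("/a")            # touch /a, so /b becomes the oldest entry
cache.put("/c", "page C")  # exceeds the capacity, so /b is purged
```

Real browsers weigh more than recency (object size, expiry time, and so on), so this is only one possible policy.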
Why Web Server Caching
• Reduce latency
  • The request does not have to go to the server.
  • The request is served from the client side, which means that network communication is avoided.
• Reduce traffic
Consistency
• What if the page changes after it has been saved in the cache?
  • This means that the cached copy is out of date.
  • The copy and the original are not consistent.
• There are different strategies for dealing with this.
Web Browser Caching
• Client pull
  • The server provides the content along with instructions on when the client should ask for a refreshed copy, or on whether the content should be cached at all.
• Server push
  • The server transmits page information to the browser.
  • The browser displays the information and leaves the connection to the server open.
  • With an open connection, the server can continue to push updated pages for display on an ongoing basis. Closing the page closes the connection.
  • The server is in control.
• Browser caches are different from proxy caches (discussed next).
Web Caching
• Proxy caches (also called proxy servers)
  • A proxy cache intercepts HTTP requests from clients.
    • It serves the object if the object is in its cache and the cached copy has not expired.
    • If not, it goes to the object's home server: on behalf of the user, it gets the object and possibly deposits it in its cache before returning it to the user.
    • Proxy caches are usually deployed at the edges of a network, which brings wide-area bandwidth savings, improved response time, and increased availability of static web-based objects.
  • A browser may have to be configured to point to the proxy server.
  • A proxy cache is usually purchased and installed by an organization.
Web Caching
• Not all web pages can be cached.
  • If the response carries a Last-Modified header, the page can be cached.
  • A refresh is often done when
    • there is a request; and
    • the expiry time has passed.
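The refresh rule (refresh only when a request arrives and the expiry time has passed) can be sketched as follows. The `CachedPage` record, the `serve` function, and the `fake_fetch` stand-in for the origin server are all hypothetical names introduced for illustration; times are plain numbers of seconds.

```python
from dataclasses import dataclass


@dataclass
class CachedPage:
    body: str
    fetched_at: float  # time (in seconds) at which the copy was stored
    max_age: float     # lifetime in seconds granted by the server


def is_fresh(page, now):
    """A copy is fresh while its age is below the server-granted lifetime."""
    return (now - page.fetched_at) < page.max_age


def serve(cache, url, now, fetch):
    """Return (body, source). Nothing is refreshed until a request arrives."""
    page = cache.get(url)
    if page is not None and is_fresh(page, now):
        return page.body, "cache"
    body, max_age = fetch(url)  # expired or missing: go to the server
    cache[url] = CachedPage(body, now, max_age)
    return body, "origin"


def fake_fetch(url):
    """Stand-in for the origin server: content plus a 60-second lifetime."""
    return f"content of {url}", 60.0


store = {}
first = serve(store, "/index.html", now=0.0, fetch=fake_fetch)
second = serve(store, "/index.html", now=30.0, fetch=fake_fetch)
third = serve(store, "/index.html", now=61.0, fetch=fake_fetch)
```

A real cache would typically revalidate an expired copy with a conditional request instead of always downloading it again; that step is left out here to keep the sketch short.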
Cooperative Caching
• A caching infrastructure can have multiple web proxies.
  • Proxies can be arranged in a hierarchy or in other structures.
  • Proxies can cooperate with one another to
    • answer client requests
    • propagate server notifications
• Cooperative caching uses a combination of HTTP and ICP (Internet Cache Protocol).
  • ICP can be used by one cache to quickly ask another cache whether it has an object.
  • HTTP is used to actually retrieve the object.
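The division of labour between the two protocols can be illustrated with an in-memory simulation. This is not the ICP wire format: `ProxyCache`, `icp_query`, and `http_get` are hypothetical names, and the origin server is modelled as a plain dictionary.

```python
class ProxyCache:
    """In-memory stand-in for a cooperating proxy cache (illustrative only)."""

    def __init__(self, name, siblings=None):
        self.name = name
        self.store = {}
        self.siblings = siblings or []

    def icp_query(self, url):
        """Stand-in for an ICP query: answers HIT or MISS, never the object."""
        return "HIT" if url in self.store else "MISS"

    def http_get(self, url):
        """Stand-in for the HTTP request that actually retrieves the object."""
        return self.store[url]

    def fetch(self, url, origin):
        if url in self.store:
            return self.store[url], self.name
        for sibling in self.siblings:
            if sibling.icp_query(url) == "HIT":  # cheap question first
                body = sibling.http_get(url)     # then the real transfer
                self.store[url] = body
                return body, sibling.name
        body = origin[url]  # no sibling has it: go to the origin server
        self.store[url] = body
        return body, "origin"


origin = {"/logo.png": "PNG bytes", "/news.html": "<html>news</html>"}
proxy_b = ProxyCache("proxy-b")
proxy_b.store["/logo.png"] = "PNG bytes"
proxy_a = ProxyCache("proxy-a", siblings=[proxy_b])

r1 = proxy_a.fetch("/logo.png", origin)   # served by the sibling
r2 = proxy_a.fetch("/news.html", origin)  # served by the origin
r3 = proxy_a.fetch("/logo.png", origin)   # now served locally
```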
Problems
• Caching proxies do not serve all Internet users.
• Content providers (say, web servers) cannot rely on the existence and correct implementation of caching proxies.
• Accounting issues with caching proxies:
  • Example: www.cnn.com needs to know the number of hits on the advertisements displayed on its web pages, but requests answered by a proxy never reach its servers.
Content Distribution Networks (CDN)
• Business model: a content provider such as www.cnn.com or Yahoo pays a CDN company (such as Akamai) to get its content to the requesting users with short delays.
• A CDN provides a mechanism for
  • replicating content on multiple servers in the Internet
  • providing clients with a means to determine the servers that can deliver the content fastest.
Terminology
• Content: any publicly accessible combination of text, images, applets, frames, MP3, video, Flash, virtual reality objects, etc.
• Content provider: any individual, organization, or company that has content that it wishes to make available to users.
• Origin server: the content provider's server, where the content is first uploaded.
• Surrogate server (sometimes called edge server): the content distributor's server, where the replicated content is kept.
Players
• Content Provider: Yahoo, MSNBC, CNN, CBC
• H/W and S/W Vendor: Cisco, Oracle-Sun
• Content Distributor: Akamai
• Hosting Provider: Bell
[Figure: the four players, connected by arrows labelled "sells servers", "send content", and "install servers".]
CDN Distribution
• Content providers are CDN customers.
Content replication
• The CDN company installs thousands of servers throughout the Internet
  • in large datacenters
  • or close to users.
• The CDN replicates its customers' content.
• When a provider updates content, the CDN updates its servers.
[Figure: an origin server in North America feeds a CDN distribution node, which feeds CDN servers in S. America, Europe, and Asia.]
CDN: Functional Components
• Distribution Service
• Redirection Service
• Accounting and Billing System
CDN: Distribution Service
• The content provider determines which of its objects it wants the CDN to distribute.
• The content provider tags and then pushes this content to a CDN node, which in turn replicates and pushes the content to all its CDN servers.
CDN: Redirection
• When a browser in a user's host is instructed to retrieve a specific object (specified using a URL), how does the browser determine whether it should retrieve the object from the origin server or from one of the CDN servers?
• As an example, suppose the hostname of the content provider is www.cnn.com.
How Akamai Works
[Figure: the end user requests a page from cnn.com (the content provider) over HTTP with GET index.html (steps 1 and 2). The diagram also shows the DNS root server, the Akamai global DNS server, the Akamai regional DNS server, Akamai clusters including one near the user, and the URL http://a.73.g.akamai.net/7/23/cnn.com/af/cnn.com/foo.jpg.]
CDN: Redirection
• Users get an HTML document from www.cnn.com; this could be index.html.
• The file index.html uses a modified URL for content that has been replicated.
• Example: if the JPEG files are what has been replicated, then <img src="http://cnn.com/af/foo.jpg"> may be modified as follows: <img src="http://a73.g.akamai.net/7/23/cnn.com/af/foo.jpg">
• The browser needs to resolve the a73.g.akamai.net hostname for replicated content.
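The URL modification in this example can be sketched as a small rewriting function. The host a73.g.akamai.net and the control part /7/23 are copied from the slide's example; the function name `to_cdn_url` and the rule "rewrite only JPEG files" are assumptions made for illustration, not Akamai's actual tooling.

```python
from urllib.parse import urlsplit

CDN_HOST = "a73.g.akamai.net"            # host from the slide's example
CONTROL_PART = "/7/23"                   # control part from the slide's example
REPLICATED_SUFFIXES = (".jpg", ".jpeg")  # assumption: only JPEGs are replicated


def to_cdn_url(url):
    """Rewrite a content-provider URL so that it points at the CDN host."""
    parts = urlsplit(url)
    if not parts.path.lower().endswith(REPLICATED_SUFFIXES):
        return url  # not replicated: keep pointing at the origin server
    return f"{parts.scheme}://{CDN_HOST}{CONTROL_PART}/{parts.hostname}{parts.path}"


rewritten = to_cdn_url("http://cnn.com/af/foo.jpg")
untouched = to_cdn_url("http://cnn.com/index.html")
```

The original hostname survives inside the path, which is what later lets a CDN server find the origin copy.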
CDN: Redirection
• What does this mean?
<img src="http://a73.g.akamai.net/7/23/cnn.com/af/foo.jpg">
  • Host part: a73.g.akamai.net
  • Akamai control part: /7/23
  • Content URL: /af/foo.jpg (on the content provider's host, cnn.com)
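The breakdown can be reproduced mechanically. The sketch below assumes, based only on this one example, that the control part is always the first two path segments and that the next segment is the content provider's hostname; `split_akamai_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def split_akamai_url(url):
    """Split a rewritten URL into the parts named on the slide."""
    parts = urlsplit(url)
    segments = parts.path.lstrip("/").split("/")
    return {
        "host": parts.hostname,
        "control": "/" + "/".join(segments[:2]),       # assumed: two segments
        "provider": segments[2],                       # original hostname
        "content_path": "/" + "/".join(segments[3:]),  # path on the origin
    }


info = split_akamai_url("http://a73.g.akamai.net/7/23/cnn.com/af/foo.jpg")
```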
CDN: Redirection
• DNS is configured so that all queries about g.akamai.net that arrive at a DNS server are sent to an authoritative DNS server for g.akamai.net.
• This server is referred to as an Akamai DNS server (authoritative DNS server).
How Akamai Works
[Figure: the Akamai diagram (end user, cnn.com, DNS root server, Akamai global and regional DNS servers, Akamai clusters) with steps 3 and 4 added: a DNS lookup for cache.cnn.com, answered with the alias g.akamai.net.]
CDN: Redirection
• When the Akamai DNS server receives the query, it extracts the IP address of the requester. In practice this is the address of the client's local DNS server, which stands in for the location of the requesting browser.
How Akamai Works
[Figure: the Akamai diagram (end user, cnn.com, DNS root server, Akamai global and regional DNS servers, Akamai clusters) with steps 5 and 6 added: a DNS lookup for g.akamai.net, answered with the alias a73.g.akamai.net.]
CDN: Redirection
• Based on the requester's IP address and on information that it has about the Internet (called a map), the Akamai DNS server returns the IP address of an Akamai regional server, chosen according to a policy,
  • e.g., select the server that is the fewest hops away.
• The regional server may then choose a surrogate server for content retrieval.
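The "fewest hops" policy can be sketched with a toy map. Everything in `NETWORK_MAP` is invented for illustration: the documentation-range prefixes, the server names, and the hop counts. A real map would be built from measurements.

```python
import ipaddress

# Hypothetical map: requester network -> hop count to each regional server.
NETWORK_MAP = {
    ipaddress.ip_network("198.51.100.0/24"): {"regional-east": 4, "regional-west": 9},
    ipaddress.ip_network("203.0.113.0/24"): {"regional-east": 11, "regional-west": 3},
}


def pick_regional_server(requester_ip, network_map=NETWORK_MAP):
    """Return the regional server that is the fewest hops from the requester."""
    addr = ipaddress.ip_address(requester_ip)
    for network, hops in network_map.items():
        if addr in network:
            return min(hops, key=hops.get)  # policy: fewest hops wins
    return None  # requester not covered by the map


east_choice = pick_regional_server("198.51.100.7")
west_choice = pick_regional_server("203.0.113.20")
unknown = pick_regional_server("192.0.2.1")
```

Other policies (lowest load, lowest measured delay) would only change the `min` key.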
How Akamai Works
[Figure: the Akamai diagram (end user, cnn.com, DNS root server, Akamai global and regional DNS servers, Akamai clusters) with steps 7 and 8 added: a DNS query for a73.g.akamai.net, answered with the address 1.2.3.4.]
How Akamai Works
[Figure: the Akamai diagram (end user, cnn.com, DNS root server, Akamai global and regional DNS servers, Akamai clusters) with step 9 added: the end user sends "GET /foo.jpg" with "Host: cache.cnn.com" over HTTP to the nearby Akamai cluster.]
How Akamai Works
[Figure: the Akamai diagram (end user, cnn.com, DNS root server, Akamai global and regional DNS servers, Akamai clusters) with steps 11 and 12 added, labelled "GET foo.jpg", alongside step 9 ("GET /foo.jpg", "Host: cache.cnn.com").]
CDN Redirection
• The Akamai DNS server's IP address is now in the cache of the local DNS server.
  • This implies that it is not always necessary to go to the root DNS server.
• The TTL associated with the IP address of an Akamai (surrogate) server is relatively small.
  • This is done for performance reasons: a short TTL means clients re-resolve the name often, so the choice of server can be revised frequently.
• Akamai content distribution servers are caches.
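The effect of a small TTL on a local DNS server's cache can be sketched as follows. `DnsCache` is a hypothetical class, the 20-second TTL is an arbitrary illustrative value, and 1.2.3.4 is the address shown in the slide's diagram.

```python
class DnsCache:
    """Toy resolver cache that forgets records once their TTL runs out."""

    def __init__(self):
        self._records = {}  # name -> (address, expiry time in seconds)

    def put(self, name, address, ttl, now):
        self._records[name] = (address, now + ttl)

    def get(self, name, now):
        record = self._records.get(name)
        if record is None or now >= record[1]:
            self._records.pop(name, None)  # expired: forces a new lookup
            return None
        return record[0]


local_dns = DnsCache()
local_dns.put("a73.g.akamai.net", "1.2.3.4", ttl=20, now=0)

early = local_dns.get("a73.g.akamai.net", now=10)  # still cached
late = local_dns.get("a73.g.akamai.net", now=25)   # expired, must ask again
```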
CDN Redirection
• What if the content is not there?
  • If the requested content is not found, the surrogate will ask other surrogates within a specified region for it.
• If the requested content is still not found, or is stale, a request is made to the original web site.
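The lookup order described here (local surrogate, then surrogates in the region, then the original web site) can be sketched with dictionaries standing in for the caches. The `resolve` function and the `stale` flag on each entry are hypothetical.

```python
def resolve(url, local, regional_peers, origin):
    """Look for usable content locally, then in the region, then at the origin."""
    entry = local.get(url)
    if entry and not entry["stale"]:
        return entry["body"], "local surrogate"
    for peer in regional_peers:
        entry = peer.get(url)
        if entry and not entry["stale"]:
            local[url] = dict(entry)  # keep a copy for later requests
            return entry["body"], "peer surrogate"
    body = origin[url]  # not found or stale everywhere: ask the original site
    local[url] = {"body": body, "stale": False}
    return body, "origin"


origin_site = {"/a.jpg": "A", "/b.jpg": "B", "/c.jpg": "C (new)"}
peers = [{"/a.jpg": {"body": "A", "stale": False},
          "/c.jpg": {"body": "C (old)", "stale": True}}]
local_cache = {}

from_peer = resolve("/a.jpg", local_cache, peers, origin_site)
from_origin = resolve("/b.jpg", local_cache, peers, origin_site)
stale_case = resolve("/c.jpg", local_cache, peers, origin_site)
second_hit = resolve("/a.jpg", local_cache, peers, origin_site)
```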
CDN Selection
• The tricky issue is selecting which local content server to use for a particular request.
  • We want to spread load evenly.
  • We want minimal impact if a server is added or removed.
• In Akamai, each surrogate server sends measurement results to the Network Operations Command Center (NOCC).
  • Measurement results include the number of active TCP connections, the HTTP request arrival rate, bandwidth availability, etc.
• This information is used by the Akamai DNS server.
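The slides do not say which algorithm is used to meet the two goals of even load and minimal impact when servers come and go. Consistent hashing is one well-known technique with those properties, and it is swapped in here purely as an illustration. The `HashRing` class, the server names, and the choice of 100 points per server are all assumptions.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, replicas=100):
        self.replicas = replicas  # points per server, to smooth the load
        self._ring = []           # sorted list of (position, server)
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{server}#{i}"), server))

    def remove(self, server):
        self._ring = [(pos, s) for pos, s in self._ring if s != server]

    def lookup(self, url):
        """Return the server owning the first ring point at or after the URL."""
        idx = bisect.bisect(self._ring, (_hash(url), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring


ring = HashRing(["s1", "s2", "s3"])
urls = [f"/img/{i}.jpg" for i in range(200)]
before = {u: ring.lookup(u) for u in urls}
ring.remove("s2")
after = {u: ring.lookup(u) for u in urls}
```

Only the URLs that lived on the removed server move; every other URL keeps its server. The load measurements mentioned above could be layered on top, for example by giving lightly loaded servers more points on the ring.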
Accounting Mechanism
• Accounting mechanisms collect and track information related to request routing, distribution, and delivery.
• Information is gathered in real time and put into log files for each CDN component.
• The log files are sent to the Network Operations Command Center (NOCC).
Full Site Delivery vs. Partial Site Delivery
• Full site delivery: all content is delivered by the CDN (including HTML, images, and other objects).
• Partial site delivery: only images, streaming media, and other bandwidth-intensive objects are delivered by the CDN.