Accelerated Web Content Delivery
Sanjeet Joshi
Architecture Technology Services
HCL Technologies Ltd.
© 2010, HCL Technologies Ltd.
November 2010
The author would like to thank Dr. Usha Thakur of ATS for her valuable help in
content formatting and content enhancement.
NON-DISCLOSURE OBLIGATIONS AND DISCLAIMER
The data, information or material provided herein is
confidential and proprietary to HCL and shall not be
disclosed, duplicated or used in whole or in part for any
purpose other than as approved by an authorized official of
HCL in writing. The recipient agrees to maintain complete
confidentiality of the information; data received and shall
take all reasonable precautions/steps in maintaining
confidentiality of the same, however in any event not less
than the precautions/steps taken for its own confidential
material. If you are not the intended recipient of this
information, you are not authorized to read, forward, print,
retain, copy or disseminate this document or any part of it.
Any statements in this presentation that are not historical
facts may include forward-looking statements that involve
risks and uncertainties; actual results may differ from the
forward-looking statements.
Background
In 1995, when the Internet was still in its infancy, it had about 16 million users worldwide,
compared to about 2 billion users today [1]. Over the last fifteen years not only
has the number of people using the Internet grown exponentially, but we have also witnessed
an evolution of technology standards, protocols, and information consumption patterns. The
Internet is no longer limited to desktop/laptop computers. An increasing number of people on
the go are using handheld devices to access their preferred websites. The easy access of
websites has resulted in a significant increase in Web traffic.
Today while designing a Web application or a website that is expected to generate a lot of
interest, one has to ensure that the Web application has the right design and infrastructure to
handle the extra load, failing which websites are likely to experience difficulties. For instance,
the highly popular micro-blogging website twitter.com faced stability issues for a long time after
its launch, since it was not designed to handle a large amount of traffic.
The performance of a Web application is determined by multiple factors such as design and
application architecture, quality of code and hardware infrastructure. Performance needs to be
built at every layer of the technology stack to get a solid finished product.
This paper focuses on the Web content caching aspect of website performance.
Purpose
Web caching is not a new idea. It has been in use for quite some time, and current browsers,
caching proxies, and Web servers all support it. However, Web caching is often overlooked
when designing the technology stack of a Web application.
Web content caching can be implemented by content consumers (end users) to improve their
Internet browsing experience or by content providers to reduce the load on their origin
infrastructure, as well as to give their customers a better Web surfing experience.
Caching at the content consumer’s end is handled by Web browsers such as Internet Explorer
and Firefox. This is done automatically, and end users have limited control over how and what
will be cached. Some organizations also install caching proxies to cache incoming Web content
and to apply security policies.
This paper focuses on caching solutions from a content provider’s point of view and the
various ways in which content caching can enhance a website’s performance. The paper
assumes that the reader is familiar with Web standards such as HTTP and HTML. It is technical
in nature and is targeted towards technology architects and solution designers.
[1] See http://www.internetworldstats.com/stats.htm (accessed November 2010).
Web Caching Concepts
The concept of caching has been widely used since the early days of computing and
implemented at various layers in a technology stack. For example, the processor chip layer has
a hardware cache that is used for storing the most frequently accessed instructions. Irrespective of
where a cache is used, its main function is to store the most frequently accessed data
(information or instructions) and its main goal is to improve performance by reducing
read/computation cycle times.
It is common knowledge that application level caching can be extremely beneficial in saving
multiple expensive database reads or expensive repetitive computations thereby improving the
overall application performance.
HTTP caching or Web caching goes one layer above and caches entire static Web resources
(e.g., HTML pages, CSS files etc) either at the client side (browser cache) or at the server side
(origin cache infrastructure).
Let us take a quick look at some of the common terms used with respect to caching in general
and Web caching in particular.
Origin server or origin infrastructure is the server infrastructure where Web servers or
application servers are hosted. These servers are responsible for serving fresh content upon
request.
Time to live (TTL) is the validity period of cacheable data, beyond which the data is
considered stale. It is a critical parameter because a very low TTL makes caching
ineffective and a very high TTL results in stale data being served to clients.
Cache hit occurs each time an HTTP request is served from cache.
Cache hit ratio is the percentage of all requests that result in cache hits.
A cache miss occurs when a request cannot be served from cache.
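The terms above can be tied together in a few lines of code. The following is a minimal sketch, not taken from any particular product: the class name `TTLCache`, the 60-second TTL and the request times are all assumptions made for illustration. It serves a request from the cache while the stored copy is younger than the TTL, goes back to the origin otherwise, and reports the hit ratio as a percentage.

```python
class TTLCache:
    """Toy in-memory cache that tracks hits, misses and the hit ratio."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}          # key -> (value, time the value was stored)
        self.hits = 0
        self.misses = 0

    def get(self, key, now, fetch_from_origin):
        entry = self.store.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            self.hits += 1       # fresh copy available: cache hit
            return entry[0]
        self.misses += 1         # absent or stale: cache miss
        value = fetch_from_origin(key)
        self.store[key] = (value, now)
        return value

    def hit_ratio(self):
        total = self.hits + self.misses
        return 100.0 * self.hits / total if total else 0.0


def origin(path):
    # Stand-in for the origin server (hypothetical).
    return "content of " + path


cache = TTLCache(ttl_seconds=60)
for request_time in (0, 10, 30, 70):     # seconds since the first request
    cache.get("/index.html", request_time, origin)

print(cache.hits, cache.misses, cache.hit_ratio())   # 2 2 50.0
```

The request at 70 seconds is a miss because the copy stored at time 0 has outlived its 60-second TTL, which is why the choice of TTL directly affects the hit ratio.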
Controlling Caching Behavior of Your Content
Web browsers and caching proxies depend on the HTML and HTTP headers of the delivered
content for determining if the content can be cached, and if so, for how long it can be cached.
These cache headers can be tuned to define the cache behavior of a Web application/website.
Cache Headers
HTML authors can use META tags in the <HEAD> section of an HTML page to dictate the caching
behavior of that page. However, these tags are not backed by a defined standard for caching, and
hence not all browsers or caches honor them. For example, using
<META HTTP-EQUIV="Pragma" CONTENT="no-cache"> does not guarantee that the content will never
be cached. Hence it is not advisable to rely on HTML cache headers.
A more reliable approach is to use HTTP headers. HTTP headers are created by the Web
server and sent in response to a request. The headers help the caching layer decide if the
content can be cached, for how long it can be served, and when it needs to be refreshed from
the origin server. Some important HTTP headers that control caching are as follows:
Expires: Gives the date and time after which the response is considered stale. For
example,
Expires: Sat, 06 Aug 2011 10:00:00 GMT
Cache-Control: Provides multiple directives for controlling the caching mechanism. They are
as follows:
- max-age=[seconds]: specifies the maximum time for which a resource will be
considered fresh. It serves the same purpose as Expires, but it is relative to the time of
the request rather than an absolute date.
- s-maxage=[seconds]: similar to max-age, except that it only applies to shared
(e.g., proxy) caches.
- public: marks authenticated responses as cacheable; normally, if HTTP
authentication is required, responses are automatically private.
- private: allows caches that are specific to one user (e.g., in a browser) to
store the response; shared caches (e.g., in a proxy) may not.
- no-cache: forces caches to submit each request to the origin server for
validation before releasing a cached copy. This is useful for ensuring that authentication
has been respected (in combination with public) and for maintaining freshness without
sacrificing all of the benefits of caching.
- no-store: instructs caches not to keep a copy of the representation under any
conditions.
- must-revalidate: tells caches that they must obey any freshness information
the origin server gives about a representation. HTTP allows caches to serve stale
representations under special conditions; this directive tells them not to do so.
- proxy-revalidate: similar to must-revalidate, except that it only applies to
proxy caches.
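To make the interplay of these headers concrete, here is a minimal sketch of how a cache might derive a freshness lifetime from them. It is a simplification under stated assumptions: the function names `parse_cache_control` and `freshness_lifetime` are made up for this example, heuristic freshness and revalidation are ignored, and only the precedence s-maxage, then max-age, then Expires is modelled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(headers, now, shared_cache):
    """Seconds the response may be served without contacting the origin."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return 0                      # never serve without asking the origin
    if shared_cache and "private" in cc:
        return 0                      # shared caches must not reuse it
    if shared_cache and cc.get("s-maxage") is not None:
        return int(cc["s-maxage"])    # s-maxage wins for shared caches
    if cc.get("max-age") is not None:
        return int(cc["max-age"])     # max-age overrides Expires
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return max(0, int((expires - now).total_seconds()))
    return 0


now = datetime(2011, 8, 6, 9, 0, 0, tzinfo=timezone.utc)
headers = {"Cache-Control": "public, max-age=300, s-maxage=600"}
print(freshness_lifetime(headers, now, shared_cache=True))    # 600
print(freshness_lifetime(headers, now, shared_cache=False))   # 300
```

With these headers a proxy may reuse the response for ten minutes while a browser may reuse it for five, which is how a content provider can give its own caching layer a longer lifetime than end users' browsers.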
Note: One important point to remember here is that not all types of content can be
cached. For instance, dynamic content generated using server side scripting cannot be
cached under normal conditions. However, dynamically assembled content that does not
change frequently can be cached by making those scripts return valid cache headers.
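As an illustration of the last point, the sketch below shows a dynamically assembled page that declares itself cacheable. It is written against Python's standard WSGI interface purely as an example; the page content, the path handling and the chosen lifetimes (300 and 600 seconds) are assumptions, not recommendations.

```python
def app(environ, start_response):
    """Assemble a page dynamically but mark it cacheable for a few minutes."""
    page = "<html><body>Top stories for %s</body></html>" % environ.get(
        "PATH_INFO", "/")
    body = page.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Browsers may reuse the page for 300 s, shared caches for 600 s.
        ("Cache-Control", "public, max-age=300, s-maxage=600"),
    ]
    start_response("200 OK", headers)
    return [body]


# Exercise the application directly, without starting a server.
captured = {}


def fake_start_response(status, response_headers):
    captured["status"] = status
    captured["headers"] = dict(response_headers)


response_body = b"".join(app({"PATH_INFO": "/news"}, fake_start_response))
print(captured["headers"]["Cache-Control"])
```

Any caching layer in front of such a script can then treat its output like a static resource for the declared period.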
Content Delivery Networks
Content Delivery Networks (CDNs) are established commercial solutions that
provide a Web content caching layer. These networks provide a transparent caching layer
between Web clients and the origin infrastructure, and intercept every request going to the
origin server. Typically, CDNs have their cache servers distributed around the world and use
smart algorithms to deliver cached content from the nearest (in terms of network hops)
cache location. CDNs thus take a major share of the content serving load away from the origin
infrastructure. CDNs are also used for delivering rich multimedia content
such as audio and video files.
Figure 1 illustrates where a CDN fits in the overall workflow.
[Figure 1: Positioning a CDN. Web clients send HTTP requests to the CDN, which sits in front of the origin server infrastructure.]
Although commercial CDNs deliver significant value, they are expensive to hire and may not suit
small organizations with limited budgets. They may also be a poor fit for organizations
that want more control over the caching behavior of their content. In such cases, a custom CDN
not only works out to be cheaper to implement but also gives far greater control over caching.
SQUID Proxy in Server Acceleration Mode
Squid is an open source caching proxy product licensed under the GNU GPL. It is one of the
most widely used, robust and feature-rich open source products available on the market. Squid
is used by websites such as Wikipedia.org that witness very high traffic volumes.
Squid can be installed as a proxy to improve client side Web surfing performance, apply security
and filtering mechanism and apply organizational policies by monitoring outgoing requests.
Squid can also be installed in a reverse proxy mode to improve server side content delivery
performance. This is also known as server accelerator mode. A reverse proxy is set up close to
the origin Web servers to serve incoming requests rather than outgoing requests.
[Figure 2: Squid as Reverse Proxy. Web clients reach the Squid reverse proxies over HTTP; the proxies sit in front of the origin server infrastructure.]
A reverse proxy acts as an intermediary between a Web client and the origin Web server(s). It
receives all content requests and serves valid content from its cache. If the requested
content is not available in the cache, the reverse proxy requests it from the origin server. This
reduces the TCP connection and content rendering load on the origin servers, freeing them
for other important tasks.
Some key benefits of the afore-mentioned architecture are as follows.
1. LOAD BALANCING: If the Web server infrastructure requires expensive server hardware,
Squid can be installed on a number of inexpensive commodity hardware boxes, thereby
reducing the number of expensive origin servers.
2. SECURITY: The reverse proxy layer can also provide an effective security measure because the
origin server infrastructure is hidden behind the Squid infrastructure layer. Hence any attack
on the website is limited to the Squid infrastructure, and any damage is limited to the cached
content.
3. PERFORMANCE: A correctly tuned Squid installation can provide significant performance
gains as the proxy is meant for serving cached content at very high speeds. It uses in-
memory caching for better performance. Squid also provides various cache replacement
policies that play a major role in determining the performance of a Squid server.
Squid Cache Replacement Policies
A cache replacement policy determines which cached objects are evicted to make room for new
objects that are more likely to be requested, thereby improving the cache hit ratio. This is an
important choice because it helps optimize disk and memory usage. For example, the
most popular objects should stay in the cache, while the least accessed objects
should be replaced by more popular ones.
There are various replacement policies offered by Squid. Below we provide a brief introduction
to all of them. There is no single recommended or best policy. The right policy is chosen after
studying the content and how it is accessed.
LRU (Least Recently Used)
LRU is a common and effective choice for most cache implementations. It removes the objects with
the oldest last-accessed timestamp, i.e. cached objects that have not been accessed for a long time
are the prime candidates for replacement. LRU works well when objects that are most recently
accessed have a greater likelihood of being accessed again in the near future.
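The LRU idea can be sketched in a few lines. This is an illustration of the policy only, not Squid's implementation: the class name `LRUCache`, the capacity expressed as a number of objects and the sample URLs are assumptions.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` objects, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # oldest access first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used


cache = LRUCache(capacity=2)
cache.put("/a.css", "A")
cache.put("/b.js", "B")
cache.get("/a.css")          # /a.css is now the most recently used object
cache.put("/c.png", "C")     # cache is full, so /b.js is evicted
print(list(cache.items))     # ['/a.css', '/c.png']
```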
LFUDA (Least Frequently Used with Dynamic Aging)
LFU is another commonly used policy that keeps count of object references and then removes
the least used objects.
LFUDA is a variant of LFU that uses a dynamic aging policy to accommodate shifts in the set of
popular objects. In the dynamic aging policy, the cache age factor is added to the reference
count when an object is added to the cache or an existing object is referenced again. This prevents
previously popular documents from polluting the cache.
GDSF (Greedy Dual-Size Frequency)
GDSF is an enhancement of GDS, which takes into account the size of the cached object and
the cost of retrieving it. GDSF additionally takes into account the frequency of reference. This
policy is optimized for more popular, smaller objects in order to maximize the object hit rate.
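The difference between LFUDA and GDSF is easiest to see in how each computes an object's priority. The sketch below uses the formulation commonly given for these policies, with the key being L + frequency for LFUDA and L + frequency × cost / size for GDSF, where L is the cache age (the key of the most recently evicted object). It is a simplified model rather than Squid's code; the class name `AgingCache`, the byte capacity and the sample objects are assumptions.

```python
class AgingCache:
    """Evicts the object with the lowest priority key.

    LFUDA key: L + frequency
    GDSF key:  L + frequency * cost / size
    L is the cache age: the key of the most recently evicted object.
    """

    def __init__(self, capacity_bytes, policy="LFUDA"):
        self.capacity = capacity_bytes
        self.policy = policy
        self.age = 0.0               # L, grows as objects are evicted
        self.used = 0
        self.objects = {}            # url -> {size, cost, freq, key}

    def _key(self, obj):
        if self.policy == "GDSF":
            return self.age + obj["freq"] * obj["cost"] / obj["size"]
        return self.age + obj["freq"]

    def access(self, url, size, cost=1.0):
        """Record a request; return True on a hit, False on a miss."""
        obj = self.objects.get(url)
        if obj is not None:
            obj["freq"] += 1
            obj["key"] = self._key(obj)      # re-aged with the current L
            return True
        if size > self.capacity:
            return False                     # too large to cache at all
        while self.used + size > self.capacity:
            victim = min(self.objects, key=lambda u: self.objects[u]["key"])
            self.age = self.objects[victim]["key"]
            self.used -= self.objects[victim]["size"]
            del self.objects[victim]
        obj = {"size": size, "cost": cost, "freq": 1}
        obj["key"] = self._key(obj)
        self.objects[url] = obj
        self.used += size
        return False


def run(policy):
    cache = AgingCache(capacity_bytes=100, policy=policy)
    for _ in range(3):
        cache.access("/big.mp4", size=80)
    for _ in range(2):
        cache.access("/small.css", size=10)
    cache.access("/logo.png", size=20)       # forces one eviction
    return sorted(cache.objects)


print(run("LFUDA"))   # ['/big.mp4', '/logo.png']
print(run("GDSF"))    # ['/logo.png', '/small.css']
```

In this toy run LFUDA keeps the large, frequently requested object, whereas GDSF gives up the large object in order to keep more small ones, which is the trade-off to weigh when studying how the content is accessed.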
Squid Deployment Topologies
Multiple Squid servers can be configured to work together to improve cache hit ratios or to
handle additional load. Squid caches, when installed in such a group, share either a sibling
relationship or a parent relationship. A Squid server running as a parent can have multiple sibling
nodes communicating with it, essentially forming a hierarchy. A flat topology may include Squid
servers with only sibling relationships.
If a request results in a cache miss on a sibling node, it is forwarded to the parent node. If the
parent also returns a cache miss, the parent contacts the origin server for fresh content.
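The lookup path through such a hierarchy can be sketched as follows. This is a simplified model of the flow described above, not of Squid's peering protocol: the class `CacheNode` and the node names are assumptions, and TTLs and any sibling-to-sibling queries are deliberately left out.

```python
class CacheNode:
    """One cache in a hierarchy; misses are passed up to the parent."""

    def __init__(self, name, parent=None, origin=None):
        self.name = name
        self.parent = parent
        self.origin = origin      # callable, only used by the top-level parent
        self.store = {}

    def fetch(self, url, trace):
        trace.append(self.name)
        if url in self.store:
            return self.store[url]               # cache hit at this level
        if self.parent is not None:
            body = self.parent.fetch(url, trace)  # miss: ask the parent
        else:
            trace.append("origin")
            body = self.origin(url)               # parent miss: go to origin
        self.store[url] = body
        return body


def origin(url):
    # Stand-in for the origin server (hypothetical).
    return "fresh copy of " + url


parent = CacheNode("parent", origin=origin)
sibling_a = CacheNode("sibling-a", parent=parent)
sibling_b = CacheNode("sibling-b", parent=parent)

first, second = [], []
sibling_a.fetch("/home", first)     # misses everywhere, reaches the origin
sibling_b.fetch("/home", second)    # the parent already holds a copy
print(first)     # ['sibling-a', 'parent', 'origin']
print(second)    # ['sibling-b', 'parent']
```

The second request never reaches the origin, which is how a parent cache raises the overall hit ratio of the group.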
Squid Capacity Planning
Squid's hardware requirements are generally modest. Memory is often the most important
resource. A memory shortage significantly reduces performance. Higher hit ratios are obtained
by caching more objects. Caching more objects requires more disk space. Therefore disk space
is also an important factor that needs to be considered. Fast disks and interfaces are also
beneficial in improving disk access time. SCSI performs better than ATA, and may be chosen if
the higher cost can be justified. While fast CPUs are nice, they are not critical to good
performance.
Squid allocates a small amount of memory for each cached resource (up to 24 bytes per
resource). As a rule of thumb, it requires 32 MB of RAM for each GB of disk space. So a server with
512 MB of RAM can serve a disk cache of 16 GB, while a 300 GB disk cache needs approximately
10 GB of RAM (300 × 32 MB = 9,600 MB).
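The rule of thumb is simple enough to turn into a small sizing helper. The sketch below only reproduces the arithmetic quoted above; the function names are made up for this example, and the 32 MB per GB figure is the paper's rule of thumb rather than a measured value.

```python
RAM_MB_PER_GB_DISK = 32   # rule of thumb quoted in the text


def ram_needed_mb(disk_cache_gb):
    """RAM (in MB) suggested for a disk cache of the given size."""
    return disk_cache_gb * RAM_MB_PER_GB_DISK


def disk_cache_supported_gb(ram_mb):
    """Disk cache size (in GB) that a given amount of RAM can serve."""
    return ram_mb / RAM_MB_PER_GB_DISK


print(disk_cache_supported_gb(512))           # 16.0
print(ram_needed_mb(300))                     # 9600
print(round(ram_needed_mb(300) / 1024, 1))    # 9.4
```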
Conclusion
- Using reverse proxies for Web caching is a non-intrusive way of improving content
delivery performance.
- Reverse proxy based Web caching can be implemented as a cost effective replacement
for commercial CDNs.
- A customized CDN gives better control over the caching infrastructure and helps meet
the specific performance needs of an enterprise as compared to an expensive
commercial CDN which may provide limited configuration options.
- CDNs can remove considerable load from the origin servers, thus freeing up the origin
server resources for other tasks.
Appendix A – Case Study
“Squid Implementation for a Leading Global Entertainment Content
Company”
The customer uses Akamai Edge Server Platform for improved content delivery. Edge Server
Platform’s design helps in improving content availability and reducing request response time.
This ideally translates into less Web traffic coming directly to the Web servers (origin servers)
thus improving the overall efficiency of the infrastructure and reducing infrastructure costs.
Ironically, though, it was observed that the origin servers were receiving increased Web traffic
from the Akamai edge servers themselves. A solution had to be put in place to tackle that problem
with minimal impact on existing applications and content.
Problem Context
The Akamai Edge Platform offers a robust design for highly efficient content delivery across the
globe. This is achieved by deploying several thousand servers at data centers all over the world
(edge servers) and then replicating the content to be delivered on appropriate servers. The key
then is to route all content requests from clients to the nearest (in terms of network hops)
available server, resulting in minimal response time and higher availability. Here the edge servers
act as caching proxies that request content from the origin server and then serve the cached
copy until its expiry, at which point a fresh copy is again requested from the origin server.
Akamai uses a hierarchical architecture for its edge platform to avoid thousands of edge servers
making multiple refresh requests to the origin server.
The problem is that the ‘innermost’ edge servers still need to make a refresh request to get the
new content from the origin server. This results in the origin server having to serve each of the
requests separately. This was the root cause of the problem.
[Figure 3: High-level Problem Representation. Multiple servers in the Akamai CDN each request the same document (Foo.htm) from the origin server infrastructure.]
The customer summarized the problem at hand thus:
- High traffic documents such as home pages were being requested from their origin
servers as many as 70 times within a single TTL interval. This meant that there were that
many innermost Akamai servers in the hierarchy.
- Far too many requests were being received for pages, XML documents, dynamically
generated JS, CSS etc.
The customer felt that if the above-mentioned problems were addressed, the availability of the
origin servers would rise close to 99.99%.
Solution Approaches Considered by HCL
Below is a brief summary of the approaches evaluated by the HCL team and its assessment of
those approaches.
Approach 1: Custom Solution - Application Server Side
The first approach called for intercepting incoming content refresh requests from the Akamai
servers to the origin servers, queuing and prioritizing them, and then rendering the highest
priority content.
HCL Assessment of Approach 1
- The solution was workable but complex, and many race conditions would have to be
considered before its effectiveness became known.
- The robustness and performance of such a solution were not obvious.
- The solution mandated changes to the application layer, which could have had a
cascading effect on the underlying layers.
Approach 2: Using Pre-fetch Settings Provided by Akamai
The second approach called for asynchronous content refresh. When this feature is enabled in
Akamai, the content refresh requests are sent even before the content becomes stale. Akamai
servers continue to serve the existing content even after sending refresh requests, thereby
refreshing content asynchronously.
HCL Assessment of Approach 2
- The solution seemed like a perfect fit for the problem at hand, but it would not provide
a complete solution.
- The solution would work well only when content was requested in the window between the
pre-fetch threshold and expiry. For example, if pre-fetch was set to 90%, Akamai servers would
send refresh requests to the origin after 90% of the TTL had elapsed.
- The core problem of receiving multiple requests for the same content would remain
unaddressed.
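The pre-fetch behavior described in this assessment can be modelled with a little arithmetic. The sketch below is a model of the description in this paper only, not of Akamai's actual configuration or API; the function names, the 600-second TTL and the sample ages are assumptions.

```python
def prefetch_start(ttl_seconds, prefetch_percent):
    """Age (in seconds) at which asynchronous refresh becomes eligible."""
    return ttl_seconds * prefetch_percent / 100.0


def action_for_request(age_seconds, ttl_seconds, prefetch_percent):
    """What an edge server does for a request that finds a copy of this age."""
    if age_seconds >= ttl_seconds:
        return "stale: fetch from origin before responding"
    if age_seconds >= prefetch_start(ttl_seconds, prefetch_percent):
        return "serve cached copy and refresh asynchronously"
    return "serve cached copy"


print(prefetch_start(600, 90))             # 540.0
print(action_for_request(100, 600, 90))    # serve cached copy
print(action_for_request(550, 600, 90))    # serve cached copy and refresh asynchronously
print(action_for_request(650, 600, 90))    # stale: fetch from origin before responding
```

With a 600-second TTL the asynchronous refresh can only be triggered by a request arriving between 540 and 600 seconds, and each edge server still sends its own refresh, so the number of requests reaching the origin does not fall.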
HCL’s Squid Reverse Proxy-Based Solution
The HCL solution was based on the following design principles:
1. Minimal or no changes to the application layer
2. No rework for content producers or brand owners
3. Once installed, solution should work transparently (without any other layers being aware
of its existence)
4. Solution should be repeatable/reusable
Using Squid as a Reverse Proxy
The goal of HCL’s solution was to minimize the number of requests going to the origin servers
while still serving content that was as fresh as possible.
As a first step, the HCL team proposed the installation of Squid in the reverse proxy mode on a
separate infrastructure. This introduced an additional caching layer between Akamai servers
and the origin servers. Upon setup, it cached all the relevant content and served it whenever
requested by Akamai. The team used advanced cache control settings provided by Squid (v 2.7)
to control the number of redundant requests for a single resource and to also support
asynchronous refresh.
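The paper does not list the specific Squid settings that were used, so the following is only a Python illustration of the general idea of limiting redundant requests for a single resource: when many clients ask for the same uncached URL at once, only one request is forwarded to the origin and the others wait for its result. The class name `CollapsingCache`, the simulated origin and the figure of 70 concurrent clients (echoing the number reported by the customer) are assumptions; TTL handling and asynchronous refresh are omitted.

```python
import threading
import time


class CollapsingCache:
    """Forwards at most one origin request per URL at any moment."""

    def __init__(self, fetch_from_origin):
        self.fetch = fetch_from_origin
        self.lock = threading.Lock()
        self.store = {}        # url -> cached body
        self.in_flight = {}    # url -> Event set when the fetch completes

    def get(self, url):
        with self.lock:
            if url in self.store:
                return self.store[url]
            event = self.in_flight.get(url)
            leader = event is None
            if leader:                       # first requester does the fetch
                event = threading.Event()
                self.in_flight[url] = event
        if leader:
            try:
                body = self.fetch(url)
                with self.lock:
                    self.store[url] = body
            finally:
                with self.lock:
                    del self.in_flight[url]
                event.set()                  # wake up the waiting requesters
            return body
        event.wait()                         # followers wait for the leader
        with self.lock:
            return self.store.get(url)       # None if the leader's fetch failed


origin_calls = []


def slow_origin(url):
    origin_calls.append(url)
    time.sleep(0.1)            # simulate an expensive page render
    return "fresh copy of " + url


cache = CollapsingCache(slow_origin)
results = []
threads = [
    threading.Thread(target=lambda: results.append(cache.get("/home")))
    for _ in range(70)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(len(results), len(origin_calls))   # 70 1
```

All 70 simulated clients receive the page, yet the origin renders it once, which is the effect an additional caching layer in front of the origin is meant to achieve.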
Goals Achieved
The solution proposed by the HCL team passed the rigorous performance checks with over
90% load reduction.