Page 1

Web server

Page 2

Definition

• A computer that is responsible for accepting HTTP requests from clients, which are known as Web browsers, and serving them Web pages, which are usually HTML documents and linked objects (images, etc.).

• A computer program that provides the functionality described in the first sense of the term.

Page 3

Web server programs

Basic common features

• HTTP responses to HTTP requests: every Web server program operates by accepting HTTP requests from the network and providing an HTTP response to the requester.
– The HTTP response typically consists of an HTML document, but can also be a raw text file, an image, or some other type of document. If an error is found in the client request, or occurs while trying to serve the request, the Web server has to send an error response, which may include a custom HTML or text message to better explain the problem to end users.

• Logging: Web servers usually also have the capability of logging detailed information about client requests and server responses to log files; this allows the Webmaster to collect statistics by running log analyzers on those files.
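As a rough illustration of these two basic features (answering HTTP requests and logging them), here is a minimal sketch using Python's standard http.server module; the port number and the log file name are arbitrary choices.

import logging
from http.server import BaseHTTPRequestHandler, HTTPServer

logging.basicConfig(filename="access.log", level=logging.INFO)

class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Build and send an HTTP response (status line, headers, blank line, body).
        body = b"<html><body><h1>Hello from a tiny Web server</h1></body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        # Redirect the default per-request log line into the log file.
        logging.info("%s - %s", self.client_address[0], fmt % args)

if __name__ == "__main__":
    HTTPServer(("", 8000), DemoHandler).serve_forever()

Running this and visiting http://localhost:8000/ would yield one HTML response per request and one line per request in access.log.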

Page 4

Web servers implement the following features:

• Configurability of available features by configuration files or even by an external user interface.

• Authentication: optional authorization request (user name and password) before allowing access to some or all kinds of resources.

• Handling of not only static content (file content recorded in the server's filesystem(s)) but also dynamic content, by supporting one or more related interfaces (SSI, CGI, SCGI, FastCGI, PHP, ASP, ASP.NET, server APIs such as NSAPI, ISAPI, etc.); a minimal CGI sketch appears after this feature list.

Page 5

• Module support, in order to allow the extension of server capabilities by adding or modifying software modules which are linked to the server software or that are dynamically loaded (on demand) by the core server.

• HTTPS support (by SSL or TLS) in order to allow secure (encrypted) connections to the server on the standard port 443 instead of usual port 80.

• Content compression (e.g. gzip encoding) to reduce the size of responses (to lower bandwidth usage, etc.).

Page 6

• Virtual hosting, to serve many Web sites using one IP address.

• Large file support, to be able to serve files larger than 2 GB on a 32-bit OS.

• Bandwidth throttling, to limit the speed of responses so as not to saturate the network and to be able to serve more clients.
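As a concrete, hypothetical illustration of the dynamic-content interfaces mentioned in the list above (CGI in particular): a CGI program is simply an executable that the Web server runs for each matching request; it writes response headers, a blank line, and then the body to standard output. A minimal Python sketch, assuming the server is configured to execute scripts from a cgi-bin/ directory:

#!/usr/bin/env python3
# Hypothetical CGI script, e.g. saved as cgi-bin/hello.py and made executable.
from datetime import datetime, timezone

print("Content-Type: text/html")   # response header(s)
print()                            # blank line separates headers from the body
print("<html><body>")
print(f"<p>Dynamically generated at {datetime.now(timezone.utc).isoformat()}</p>")
print("</body></html>")

The server relays whatever the script prints back to the client, which is why the same request can produce a different page each time.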

Page 7

Origin of returned content

The origin of the content sent by the server is called:

• static if it comes from an existing file lying on a filesystem;

• dynamic if it is dynamically generated by some other program or script or API called by the Web server.

• Serving static content is usually much faster (from 2 to 100 times) than serving dynamic content, especially if the latter involves data pulled from a database.

• Server API (Server Application Programming Interface): for example, the API used by PHP to interface with Web servers.

Page 8

Path translation

• Web servers usually translate the path component of a Uniform Resource Locator (URL) into a local file system resource.
– The URL path specified by the client is relative to the Web server's root directory.
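A minimal sketch of such a translation, assuming a hypothetical document root of /var/www/html; it resolves the URL path against the root directory and rejects requests that would escape it (e.g. via ".."):

from pathlib import Path

DOC_ROOT = Path("/var/www/html").resolve()   # hypothetical server root directory

def translate_path(url_path: str) -> Path:
    # Join the URL path onto the document root and normalize it.
    candidate = (DOC_ROOT / url_path.lstrip("/")).resolve()
    # Refuse anything that resolved to a location outside the root.
    if candidate != DOC_ROOT and DOC_ROOT not in candidate.parents:
        raise PermissionError("path escapes the document root")
    return candidate

# e.g. translate_path("/images/logo.png") -> /var/www/html/images/logo.png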

Page 9

Page 10

Concurrency

Page 11

Server Models

A Web server program, like any other server, can be implemented using one of these server models:

1. single process, finite state machine, and non-blocking or even asynchronous I/O;
2. multi-process, finite state machine, and non-blocking or even asynchronous I/O;
3. single process, forking a new process for each request;
4. multi-process, with adaptive pre-forking of processes;
5. single process, multithreaded;
6. multi-process, multithreaded.

Page 12

Finite state machine servers

• To minimize the context switches and to maximize the scalability, many small Web servers are implemented as a single process (or at most as a process per CPU) and a finite state machine.

• Every task is split into two or more small steps that are executed as needed (typically on demand); by keeping the internal state of each connection and by using non-blocking I/O or asynchronous I/O, it is possible to implement ultra fast Web servers, at least for serving static content.
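A minimal sketch of this single-process, event-driven style, using Python's selectors module; it is not a real HTTP implementation (every connection gets a fixed response and the per-connection state is kept trivial), but it shows the non-blocking accept/read/write loop running in one process:

import selectors, socket

RESPONSE = b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nOK"
sel = selectors.DefaultSelector()

def accept(server_sock):
    conn, _ = server_sock.accept()          # new connection: register it for reading
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle)

def handle(conn):
    if conn.recv(4096):                     # request bytes arrived ("reading" state)
        conn.sendall(RESPONSE)              # reply ("writing" state, kept trivial here)
    sel.unregister(conn)
    conn.close()

listener = socket.socket()
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("", 8080))
listener.listen()
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ, accept)

while True:                                 # the event loop: one process, many connections
    for key, _ in sel.select():
        key.data(key.fileobj)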

Page 13

Thread-based servers

• Many Web servers are multithreaded.

• This means that inside each server's process, there are two or more threads, each one able to execute its own task independently from the others.

Page 14

• When a user visits a Web site, a Web server will use a thread to serve the page to that user.

• If another user visits the site while the previous user is still being served, the Web server can serve the second visitor by using a different thread.

• Thus, the second user does not have to wait for the first visitor to be served.

• This is very important because not all users have Internet connections of the same speed.

• A slow user should not delay all other visitors from downloading a Web page.

Page 15

• Threads are often used to serve dynamic content.

• For better performance, threads used by Web servers and other Internet services are typically pooled and reused to eliminate even the small overhead associated with creating a thread.
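A minimal sketch of the thread-per-request idea using Python's standard ThreadingHTTPServer; the artificial delay stands in for a slow client or slow work, and a production server would typically draw its threads from a pre-created pool rather than spawning one per connection:

import threading, time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class SlowHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(2)   # simulate slow work; other threads keep serving meanwhile
        body = f"served by {threading.current_thread().name}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    ThreadingHTTPServer(("", 8001), SlowHandler).serve_forever()

Two clients requesting at the same time would each get a response after about two seconds, instead of the second one waiting four.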

Page 16

Process-based servers

• For reliability and security reasons, some Web servers using multiple processes (rather than multiple threads within a single process) still remain in production use, such as Apache 1.3.

• A pool of processes is used and reused; once a process has served a certain threshold of requests, it is replaced by a new one.
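This recycle-after-a-threshold idea can be sketched (purely as an illustration, not as Apache's actual implementation) with Python's multiprocessing.Pool, whose maxtasksperchild parameter replaces each worker process after it has handled a given number of tasks, so leaked memory is reclaimed when the old process exits:

import os
from multiprocessing import Pool

def handle_request(request_id):
    # Stand-in for serving one HTTP request.
    return f"request {request_id} handled by pid {os.getpid()}"

if __name__ == "__main__":
    # Four workers; each is replaced after it has handled 100 requests.
    with Pool(processes=4, maxtasksperchild=100) as pool:
        for line in pool.map(handle_request, range(10)):
            print(line)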

Page 17

• Because threads share a main process context, a crashing thread may more easily crash the whole application, and a buffer overflow can have more disastrous consequences.

• Moreover, a memory leak in system libraries that are outside the application programmer's control cannot be dealt with using threads, but can be handled with a pool of processes that have a limited lifetime (because the OS automatically frees all the memory allocated by a process when the process dies).

Page 18

• Another problem relates to the wide variety of third party libraries which might be used by an application (a PHP extension library for instance) which might not be thread safe.

• Using multiple processes also makes it possible to handle situations that benefit from privilege separation techniques (to achieve better security) and to work around OS limits, which are very often per-process.

Page 19

Mixed model servers

• To leverage the advantages of finite state machines, threads and processes, many Web servers implement a mixture of all these programming techniques, trying to use the best model for each task (e.g. serving static or dynamic content).

Page 20

Load Limits

• A Web server (program) has defined load limits: it can handle only a limited number of concurrent client connections (usually between 2 and 60,000, by default between 500 and 1,000) per IP address (and port), and it can serve only a certain maximum number of requests per second, depending on:

Page 21

– its own settings;
– the HTTP request type;
– the content origin (static or dynamic);
– whether or not the served content is cached;
– the hardware and software limits of the OS on which it is running.

• When a Web server is near to or over its limits, it becomes overloaded and thus unresponsive.

Page 22

Overload Causes

• too much legitimate Web traffic (i.e. thousands or even millions of clients hitting the Web site in a short interval of time);

• DDoS (Distributed Denial of Service) attacks;

• computer worms that sometimes cause abnormal traffic because of millions of infected computers (not coordinated among themselves);

• Internet robot (crawler) traffic that is not filtered or limited on large Web sites with very few resources (bandwidth, etc.);

Page 23

• Internet (network) slowdowns, so that client requests are served more slowly and the number of connections increases so much that server limits are reached;

• partial unavailability of Web servers (computers): this can happen because of required or urgent maintenance or upgrades, hardware or software failures, back-end (e.g. database) failures, etc.; in these cases the remaining Web servers receive too much traffic and, of course, become overloaded.

Page 24

Overload Symptoms

• The symptoms of an overloaded Web server are:
– requests are served with noticeably long delays (from 1 second to a few hundred seconds);
– 500 or 503 HTTP errors are returned to clients (sometimes unrelated 404 or even 408 errors may also be returned);
– TCP connections are refused or reset before any content is sent to clients.

Page 25

Anti Overload Techniques

• managing network traffic, by using:
– firewalls to block unwanted traffic coming from bad IP sources or having bad patterns;
– HTTP traffic managers to drop, redirect or rewrite requests having bad HTTP patterns;
– bandwidth management and traffic shaping, in order to smooth down peaks in network usage;

• deploying Web cache techniques;

• using different domain names to serve different (static and dynamic) content from separate Web servers, e.g.:
– http://images.example.com
– http://www.example.com

Page 26

• using many Web servers (programs) per computer, each one bound to its own network card and IP address;

• using many Web servers (computers) that are grouped together so that they act or are seen as one big Web server, see also: Load balancer;

• adding more HW resources (e.g. RAM, disks) to each computer;

Page 27

• tuning OS parameters for HW capabilities and usage;

• using more efficient computer programs for Web servers, etc.;

• using other workarounds, especially if dynamic content is involved.

Page 28

Software

The four most common Web (HTTP) server programs are:

1. Apache HTTP Server from the Apache Software Foundation.
2. Internet Information Services (IIS) from Microsoft.
3. Sun Java System Web Server from Sun Microsystems, formerly Sun ONE Web Server, iPlanet Web Server, and Netscape Enterprise Server.
4. Zeus Web Server from Zeus Technology.

There are thousands of different Web server programs available; many of them are specialized for particular uses and can be tailored to satisfy specific needs.

Page 29

Statistics

• The most popular Web servers, used for public Web sites, are tracked by Netcraft Web Server Survey, with details given by Netcraft Web Server Reports.

• The Apache HTTP Server Project is an effort to develop and maintain an open-source HTTP server for modern operating systems including UNIX and Windows NT. The goal of this project is to provide a secure, efficient and extensible server that provides HTTP services in sync with the current HTTP standards.

Page 30

• Apache has been the most popular Web server on the Internet since April 1996.
– The November 2005 Netcraft Web Server Survey found that more than 70% of the Web sites on the Internet are using Apache, making it more widely used than all other Web servers combined.
– The Apache HTTP Server is a project of the Apache Software Foundation.

• Another site providing statistics is SecuritySpace [1], which also provides a detailed breakdown for each version of Web server [2].

Page 31

Web Server Survey Across All Domains

Market Share Change (Total servers: 16,236,196), December 1st, 2005
(1) Servers are ordered according to their global market share.

Server(1)    November Count   November %   October Count   October %   Change
Apache         11,705,062      72.09%       11,508,481      71.95%     +0.14%
Microsoft       3,588,469      22.10%        3,561,256      22.27%     -0.17%
Zeus              123,100       0.76%          125,218       0.78%     -0.02%
Netscape           80,711       0.50%           82,783       0.52%     -0.02%
WebSTAR            65,289       0.40%           58,201       0.36%     +0.04%
WebSite            14,792       0.09%           14,888       0.09%     +0.00%
Other             658,773       4.06%          643,314       4.02%     +0.04%

Copyright © 1998-2006 E-Soft Inc. Excerpts of this report may be reproduced providing that E-Soft and the URL http://www.securityspace.com are attributed.

Page 32

Page 33

Proxy server

Page 34

Page 35

• Step 1: Client 1 requests the Web page http://www.www.com, and the request goes to the proxy server.

• Step 2: The proxy server checks whether it already has the requested data; in this case it does not, so it fetches the data from the Internet.

• Step 3: The Web server replies and sends the data back to the proxy server.

• Step 4: The proxy server saves a copy of the downloaded file in its cache and forwards the file to Client 1.

• Step 5: Client 2 requests http://www.www.com as well.

• Step 6: The proxy server checks its cache, finds the page there, and sends it to Client 2 immediately, without having to request it again from the Internet.
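The flow above can be sketched as a very small caching proxy in Python; this is only an illustration (plain http:// GET requests only, an in-memory cache that never expires, most headers ignored, and an arbitrary port of 3128):

from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import urlopen

cache = {}   # URL -> (content_type, body)

class CachingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        url = self.path                      # for a proxy request this is the full URL
        if url not in cache:                 # steps 2-4: fetch from the origin, keep a copy
            with urlopen(url) as upstream:
                body = upstream.read()
                ctype = upstream.headers.get("Content-Type", "application/octet-stream")
            cache[url] = (ctype, body)
        ctype, body = cache[url]             # steps 4 and 6: serve the (possibly cached) copy
        self.send_response(200)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    ThreadingHTTPServer(("", 3128), CachingProxy).serve_forever()

Pointing a browser's HTTP proxy setting at localhost:3128 and requesting the same URL twice would exercise the cached path (step 6) on the second request.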

Page 36

Page 37

Proxy

• An intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients.

• Requests are serviced internally or by passing them on, with possible translation, to other servers.

• A proxy MUST implement both the client and server requirements of the HTTP specification.

Page 38

• A "transparent proxy" is a proxy that does not modify the request or response beyond what is required for proxy authentication and identification.

• A "non-transparent proxy" is a proxy that modifies the request or response in order to provide some added service to the user agent, such as group annotation services, media type transformation, protocol reduction, or anonymity filtering.

• Except where either transparent or non-transparent behavior is explicitly stated, the HTTP proxy requirements apply to both types of proxies.

Page 39

Cache

• A program's local store of response messages and the subsystem that controls its message storage, retrieval, and deletion.

• A cache stores cacheable responses in order to reduce the response time and network bandwidth consumption on future, equivalent requests.

• Any client or server may include a cache, though a cache cannot be used by a server that is acting as a tunnel.

Page 40

What's a Web Cache?

Why do people use them?

• A Web cache sits between Web servers (or origin servers) and a client or many clients, and watches requests for HTML pages, images and files (collectively known as objects) come by, saving a copy for itself.

• Then, if there is another request for the same object, it will use the copy that it has, instead of asking the origin server for it again.

Page 41

Web caches

There are two main reasons that Web caches are used:

• To reduce latency: because the request is satisfied from the cache (which is closer to the client) instead of the origin server, it takes less time for the client to get the object and display it. This makes Web sites seem more responsive.

• To reduce traffic: because each object is only fetched from the server once, caching reduces the amount of bandwidth used by a client. This saves money if the client is paying by traffic, and keeps bandwidth requirements lower and more manageable.

Page 42

Kinds of Web Caches

Browser Caches

• If you examine the preferences dialog of any modern browser (like Internet Explorer or Netscape), you'll probably notice a 'cache' setting. This lets you set aside a section of your computer's hard disk to store objects that you've seen, just for you. The browser cache works according to fairly simple rules: it will check to make sure that the objects are fresh, usually once a session (that is, once in the current invocation of the browser).

• This cache is useful when a client hits the 'back' button to go to a page they've already seen. Also, if you use the same navigation images throughout your site, they'll be served from the browser cache almost instantaneously.

Page 43

Kinds of Web Caches

Proxy Caches

• Web proxy caches work on the same principle, but on a much larger scale. Proxies serve hundreds or thousands of users in the same way; large corporations and ISPs often set them up on their firewalls.

• Because proxy caches usually have a large number of users behind them, they are very good at reducing latency and traffic. That's because popular objects are requested only once, and served to a large number of clients.

• Most proxy caches are deployed by large companies or ISPs that want to reduce the amount of Internet bandwidth that they use. Because the cache is shared by a large number of users, there are a large number of shared hits (objects that are requested by a number of clients). Hit rates of 50% efficiency or greater are not uncommon. Proxy caches are a type of shared cache.

Page 44

Aren't Web Caches bad for me? Why should I help them?

• Web caching is one of the most misunderstood technologies on the Internet. Webmasters in particular fear losing control of their site, because a cache can 'hide' their users from them, making it difficult to see who's using the site.

• Unfortunately for them, even if no Web caches were used, there are too many variables on the Internet to assure that they'll be able to get an accurate picture of how users see their site. If this is a big concern for you, this document will teach you how to get the statistics you need without making your site cache-unfriendly.

• Another concern is that caches can serve content that is out of date, or stale. However, this document can show you how to configure your server to control this, while making it more cacheable.

Page 45

• On the other hand, if you plan your site well, caches can help your Web site load faster, and save load on your server and Internet link.
– The difference can be dramatic; a site that is difficult to cache may take several seconds to load, while one that takes advantage of caching can seem instantaneous in comparison.
– Users will appreciate a fast-loading site, and will visit more often.

• Think of it this way: many large Internet companies are spending millions of dollars setting up farms of servers around the world to replicate their content, in order to make it as fast to access as possible for their users.
– Caches do the same for you, and they're even closer to the end user. Best of all, you don't have to pay for them.

• The fact is that caches will be used whether you like it or not. If you don't configure your site to be cached correctly, it will be cached using whatever defaults the cache's administrator decides upon.

Page 46

How Web Caches Work

• All caches have a set of rules that they use to determine when to serve an object from the cache, if it's available.
– Some of these rules are set in the protocols (HTTP 1.0 and 1.1), and some are set by the administrator of the cache (either the user of the browser cache, or the proxy administrator).

• The most common rules that are followed for a particular request:

Page 47

• If the object's headers tell the cache not to keep the object, it won't. Also, if no validator is present, most caches will mark the object as uncacheable.

• If the object is authenticated or secure, it won't be cached.

• A cached object is considered fresh (that is, able to be sent to a client without checking with the origin server) if:
– it has an expiry time or other age-controlling directive set, and is still within the fresh period;
– a browser cache has already seen the object, and has been set to check it only once a session;
– a proxy cache has seen the object recently, and it was modified relatively long ago.

Page 48

• Fresh documents are served directly from the cache, without checking with the origin server.

• If an object is stale, the origin server will be asked to validate the object, or tell the cache whether the copy that it has is still good.

• Together, freshness and validation are the most important ways that a cache works with content.

– A fresh object will be available instantly from the cache, while a validated object will avoid sending the entire object over again if it hasn't changed.
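A simplified sketch of this freshness-or-revalidate decision (real caches follow the full HTTP rules; the header parsing here is deliberately minimal and the function names are only illustrative):

import email.utils, time

def is_fresh(stored_headers, stored_at):
    # "Fresh" = may be served without contacting the origin server.
    cc = stored_headers.get("Cache-Control", "")
    for part in cc.split(","):
        part = part.strip()
        if part.startswith("max-age="):
            return (time.time() - stored_at) < int(part.split("=", 1)[1])
    expires = stored_headers.get("Expires")
    if expires:
        return time.time() < email.utils.parsedate_to_datetime(expires).timestamp()
    return False   # no freshness information: treat the copy as stale

def revalidation_headers(stored_headers):
    # Headers for a conditional GET; a 304 reply means the cached copy is still good.
    cond = {}
    if "ETag" in stored_headers:
        cond["If-None-Match"] = stored_headers["ETag"]
    if "Last-Modified" in stored_headers:
        cond["If-Modified-Since"] = stored_headers["Last-Modified"]
    return cond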

Page 49

How (and how not) to Control Caches

• There are several tools that Web designers and Webmasters can use to fine-tune how caches will treat their sites.
– It may require getting your hands a little dirty with the server configuration, but the results are worth it.

• HTML Meta Tags vs. HTTP Headers

• HTML authors can put tags in a document's <HEAD> section that describe its attributes.
– These Meta tags are often used in the belief that they can mark a document as uncacheable, or expire it at a certain time.

Page 50

• Meta tags are easy to use, but aren't very effective.
– That's because they're usually only honored by browser caches (which actually read the HTML), not proxy caches (which almost never read the HTML in the document).
– While it may be tempting to slap a Pragma: no-cache meta tag on a home page, it won't necessarily cause it to be kept fresh, if it goes through a shared cache.

• True HTTP headers give a lot of control over how both browser caches and proxies handle your objects.
– They can't be seen in the HTML, and are usually automatically generated by the Web server.
– However, you can control them to some degree, depending on the server you use. In the following sections, you'll see what HTTP headers are interesting, and how to apply them to your site.

• If your site is hosted at an ISP or hosting farm and they don't give you the ability to set arbitrary HTTP headers (like Expires and Cache-Control), complain loudly; these are tools necessary for doing your job.

Page 51

• HTTP headers are sent by the server before the HTML, and only seen by the browser and any intermediate caches. Typical HTTP 1.1 response headers might look like this:

HTTP/1.1 200 OK
Date: Fri, 30 Oct 1998 13:19:41 GMT
Server: Apache/1.3.3 (Unix)
Cache-Control: max-age=3600, must-revalidate
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:28:12 GMT
ETag: "3e86-410-3596fbbc"
Content-Length: 1040
Content-Type: text/html

The HTML document would follow these headers, separated by a blank line.

Page 52

• Pragma HTTP Headers (and why they don't work)

• Many people believe that assigning a Pragma: no-cache HTTP header to an object will make it uncacheable.
– This is not necessarily true; the HTTP specification does not set any guidelines for Pragma response headers; instead, Pragma request headers (the headers that a browser sends to a server) are discussed.
– Although a few caches may honor this header, the majority won't, and it won't have any effect. Use the headers below instead.

Page 53

• Controlling Freshness with the Expires HTTP Header

– The Expires HTTP header is the basic means of controlling caches; it tells all caches how long the object is fresh for; after that time, caches will always check back with the origin server to see if the document has changed.
– Expires headers are supported by practically every client.

Page 54

• Most Web servers allow you to set Expires response headers in a number of ways.
– Commonly, they will allow setting an absolute time to expire, a time based on the last time the client saw the object (last access time), or a time based on the last time the document changed on your server (last modification time).

Page 55

• Expires headers are especially good for making static images (like navigation bars and buttons) cacheable.
– Because they don't change much, you can set extremely long expiry times on them, making your site appear much more responsive to your users.
– They're also useful for controlling caching of a page that is regularly changed.
– For instance, if you update a news page once a day at 6am, you can set the object to expire at that time, so caches will know when to get a fresh copy, without users having to hit 'reload'.

Page 56

• The only valid value in an Expires header is an HTTP date; anything else will most likely be interpreted as 'in the past', so that the object is uncacheable.
– Also, remember that the time in an HTTP date is Greenwich Mean Time (GMT), not local time.
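A small sketch of how a server-side script might emit an Expires header in the required GMT HTTP-date format, using Python's email.utils.formatdate; the one-hour lifetime is an arbitrary choice:

import time
from email.utils import formatdate

def expires_header(lifetime_seconds=3600):
    # usegmt=True produces e.g. "Fri, 30 Oct 1998 14:19:41 GMT"
    return "Expires: " + formatdate(time.time() + lifetime_seconds, usegmt=True)

print(expires_header())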

Page 57

Cache-Control HTTP Headers

• Although the Expires header is useful, it is still somewhat limited; there are many situations where content is cacheable, but the HTTP 1.0 protocol lacks methods of telling caches what it is, or how to work with it.

Page 58

• HTTP 1.1 introduces a new class of headers, the Cache-Control response headers, which allow Web publishers to define how pages should be handled by caches.
– They include directives to declare what should be cacheable, what may be stored by caches, modifications of the expiration mechanism, and revalidation and reload controls.

Page 59

Interesting Cache-Control response headers

• max-age=[seconds] - specifies the maximum amount of time that an object will be considered fresh. Similar to Expires, this directive allows more flexibility. [seconds] is the number of seconds from the time of the request you wish the object to be fresh for.

• s-maxage=[seconds] - similar to max-age, except that it only applies to proxy (shared) caches.

• public - marks the response as cacheable, even if it would normally be uncacheable. For instance, if your pages are authenticated, the public directive makes them cacheable.

http://www.web-caching.com/mnot_tutorial/how.html#WORK

Page 60

• no-cache - forces caches (both proxy and browser) to submit the request to the origin server for validation before releasing a cached copy, every time. This is useful to assure that authentication is respected (in combination with public), or to maintain rigid object freshness, without sacrificing all of the benefits of caching.

• must-revalidate - tells caches that they must obey any freshness information you give them about an object. HTTP allows caches to take liberties with the freshness of objects; by specifying this header, you're telling the cache that you want it to strictly follow your rules.

• proxy-revalidate - similar to must-revalidate, except that it only applies to proxy caches.
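Purely as an illustration, a few ways these directives might be combined in response headers (the values here are arbitrary):

# Hypothetical Cache-Control values such as a server-side script might set.
examples = {
    "cacheable for an hour": "max-age=3600, must-revalidate",
    "shared caches, longer": "public, max-age=60, s-maxage=600",
    "always revalidate": "no-cache",
}
for description, value in examples.items():
    print(f"{description}: Cache-Control: {value}")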

