HTTP Protocol Design 1
HTTP - timeline
- Mar 1990: CERN labs document proposing the Web
- Jan 1992: HTTP/0.9 specification
- Dec 1992: Proposal to add MIME to HTTP
- Feb 1993: UDI (Universal Document Identifier) Network
- Mar 1993: HTTP/1.0 first draft
- Jun 1993: HTML (1.0 Specification)
- Oct 1993: URL specification
- Nov 1993: HTTP/1.0 second draft
- Mar 1994: URI in WWW
- May 1996: HTTP/1.0 Informational, RFC 1945
- Jan 1997: HTTP/1.1 Proposed Standard, RFC 2068
- Jun 1999: HTTP/1.1 Draft Standard, RFC 2616
- 2001: HTTP/1.1 Formal Standard
Uniform Resource Identifier (URI)
- Identifies a resource independent of its current location or the name by which it is known
- A URI is a combination of:
  - Uniform Resource Locator (URL)
    - Several alternatives (e.g., http://, ftp://)
    - Most popular
  - Uniform Resource Name (URN)
    - Globally unique
    - Like an ISBN for a book
- URI characteristics
  - Absolute: if of the form scheme:string (scheme: file, news, http, telnet, …)
  - Relative: if no scheme
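
The absolute/relative distinction can be sketched with Python's standard `urllib.parse` module. The helper name `is_absolute` and the base URI are assumptions made for illustration only.

```python
from urllib.parse import urlsplit, urljoin

def is_absolute(uri: str) -> bool:
    """A URI is absolute when it starts with a scheme, relative otherwise."""
    return urlsplit(uri).scheme != ""

# Hypothetical base URI used only for this example.
base = "http://www.foo.com/docs/index.html"

print(is_absolute(base))               # has the "http" scheme
print(is_absolute("images/logo.png"))  # no scheme, so relative

# A relative URI only makes sense once resolved against an absolute base.
print(urljoin(base, "images/logo.png"))
```

A relative reference is resolved against the URI of the document that contains it, which is why the last call needs `base`.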
MIME and HTTP
- Original proposal
  - All resources MIME encapsulated
  - Protocols such as the Web should only handle MIME-compliant data
- Adopted
  - Classification of data formats (MIME types)
  - Formats for multipart messages
- Not adopted
  - Rich text markup mechanism (HTML used instead)
  - Addressing external documents (URLs used instead)
MIME and HTTP differences
- MIME defined for e-mail
- HTTP: high performance
- Interpretation of header fields (Content-Length)
- Limitation on line length
- HTTP is not MIME-compliant (Content-Encoding)
- Different kinds of entities
HTTP terms
- Message
  - Sequence of octets
  - Syntax: Request
    • Request-Line
    • General/Request/Entity Header(s)
    • CRLF
    • Optional Message Body
  - Syntax: Response
    • Status-Line
    • General/Response/Entity Header(s)
    • CRLF
    • Optional Message Body
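
The request syntax above can be illustrated with a small parser. This is a minimal sketch, not a complete HTTP parser: the function name `parse_request` and the sample request are assumptions, and header folding and repeated headers are ignored.

```python
def parse_request(raw: bytes):
    """Split a request into Request-Line parts, headers, and message body."""
    # The empty line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request-Line: Method SP Request-URI SP HTTP-Version
    method, target, version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body

# Hypothetical request used only for illustration.
raw = b"GET /index.html HTTP/1.0\r\nUser-Agent: demo/0.1\r\n\r\n"
print(parse_request(raw))
```

A response could be parsed the same way, with the first line split into HTTP-Version, status code, and reason phrase instead.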
HTTP terms (cont.)
- Entity
  - Representation of a resource from a request or response message
  - Includes entity headers and an optional entity body
- Resource
  - “Network data object or service that can be identified by a URI”
- User agent
HTTP/1.0 request methods
- Safe: the request only examines the state of a resource
- Idempotent: side effects of one request == those of multiple identical requests
- GET (safe, idempotent)
- HEAD
- POST (not safe, not idempotent)
- PUT (not safe, idempotent)
- DELETE
- LINK/UNLINK
HTTP/1.0 headers
- General: Date, Pragma (no-cache)
- Request: Authorization, From, If-Modified-Since, Referer, User-Agent
- Response: Location (redirects), Server, WWW-Authenticate (issues challenge)
HTTP/1.0 headers (cont.)
- Entity: Allow (valid methods), Content-Type, Content-Encoding, Content-Length, Expires, Last-Modified
HTTP/1.0 response classes?
- Derived from SMTP reply codes (yet no specific meaning)
- X00: default response
- 1XX: Informational
- 2XX: Success
  - 200 OK, 201 Created, 202 Accepted, 204 No Content
- 3XX: Redirection
  - 300 Multiple Choices, 301 Moved Permanently, 302 Moved Temporarily, 304 Not Modified
- 4XX: Client error
  - 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found
- 5XX: Server error
  - 500 Internal Server Error, 501 Not Implemented
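
The class scheme means a client can act on the first digit even when it does not recognize the full code. A minimal sketch follows; the function names `status_class` and `default_equivalent` are assumptions for illustration.

```python
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}

def status_class(code: int) -> str:
    """Map a status code to its response class via the first digit."""
    return CLASSES.get(code // 100, "Unknown")

def default_equivalent(code: int) -> int:
    """An unrecognized code is treated like the X00 code of its class."""
    return (code // 100) * 100

print(status_class(404))        # a known client error
print(default_equivalent(299))  # unknown 2XX code falls back to the class default
```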
Problems with HTTP/1.0
- Lack of control: cache duration, cache location, selection among cached variants, …
- Ambiguity of rules for proxies and caches
- Download of full resource instead of needed part
- Poor use of TCP: short Web responses
- No guarantee of full receipt for dynamically generated responses
- Depletion of IP addresses
- Inability to tailor requests and responses according to client and server preferences
- Poor level of security
- Miscellaneous
New concepts
- Hop-by-hop mechanism
  - Headers valid only for a single transport-level connection: Transfer-Encoding, Connection
  - Cannot be stored by caches or forwarded by proxies
- Transfer coding
  - Split: message vs. entity (including headers)
  - Content coding is applied to whole entity
  - Transfer coding applies to entity-body
    • Property of message, not original entity
    • TE, Transfer-Encoding
- Virtual hosting
- Semantic transparency for caching
- Support for variants of a resource
HTTP/1.1 methods
- GET, HEAD, POST
- PUT, DELETE: formalized
- OPTIONS: purpose extensibility
  - Learn about a server’s capability
  - Learn about intermediate servers in the path (Max-Forwards header == TTL)
- TRACE: purpose extensibility
  - Returns the content of the message from the receiver (Via header == records intermediaries)
- CONNECT: future use (extensibility: Upgrade header allows switch to other protocols)
New headers: General
- Old: Date, Pragma
- New:
  - Cache-Control: caching
  - Connection: hop-by-hop
  - Trailer: list of headers at end
  - Transfer-Encoding: transformation applied to message body
  - Upgrade: upgrade to other protocols
  - Via: intermediate servers
  - Warning: error notification
New headers: Request
- Response preference
  - New: Accept (charset, encoding, language), TE
- Information
  - Old: Authorization, From, Referer, User-Agent
  - New: Proxy-Authorization
- Conditional request
  - Old: If-Modified-Since
  - New: If-Match, If-None-Match, If-Unmodified-Since, If-Range
- Constraint on server
  - New: Expect, Host, Max-Forwards, Range
New headers: Response
- Redirection
  - Old: Location
- Information
  - Old: Server
  - New: Retry-After, Accept-Ranges
- Security related
  - Old: WWW-Authenticate
  - New: Proxy-Authenticate
- Caching related
  - New: ETag, Age, Vary
New headers: Entity
- Old: Allow; Content-Encoding, -Length, -Type; Expires; Last-Modified
- New: Content-Language, -Location, -MD5, -Range
Response codes: Examples
- Informational: 100 Continue, 101 Switching Protocols
- Success: 206 Partial Content, …
- Redirection: 305 Use Proxy, …
- Client errors
  - 14 new ones
  - Error codes: 400 Bad Request, 404 Not Found
  - Clarification status codes: 405 Method Not Allowed, 410 Gone
  - Using negotiation: 406 Not Acceptable, 415 Unsupported Media Type
  - Length related: 411 Length Required
  - Other features: 402 Payment Required, 417 Expectation Failed
- Server errors: 504 Gateway Timeout, …
Caching HTTP/1.0
- Control options
  - Request directive (Pragma: no-cache)
  - Modifier to GET (If-Modified-Since)
  - Response header (Expires)
- Cache busting
  - Expires header forced immediate expiry of resource
- Last-Modified typically means not dynamically generated
- Absolute clock values
Caching HTTP/1.1
- Issues:
  - Separation of cacheability and safe use of a cached copy
  - Ensure correctness (no cache should unknowingly return a stale value)
  - More control by server over cacheability
  - No absolute timestamps (no clock synchronization needed)
  - Caching of negotiated responses
- Headers: Age, Cache-Control, ETag, Vary
Cache-control HTTP/1.1 request
- no-cache: forcible revalidation
- only-if-cached: resource only from cache
- no-store: cache cannot store
- max-age: age <= value
- max-stale: expired OK but <= value
- min-fresh: remain fresh for value
- no-transform: no change of media type
- Extension: new tokens
Cache-control HTTP/1.1 response
- public: OK to cache
- private: response for specific user only
- no-store: not permitted to store
- no-cache: do not serve from cache without revalidation
- no-transform: proxy cannot change media type, etc.
- must-revalidate: cached but revalidate if stale
- proxy-revalidate: shared caches need revalidation
- max-age: response age should be <= value
- s-maxage: shared caches use value as max-age
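
How a shared cache might act on these response directives can be sketched as follows. The function names are assumptions for illustration, and the logic is simplified: it treats any `no-cache` as requiring revalidation and ignores heuristics, `Expires`, and request directives.

```python
def parse_cache_control(value: str) -> dict:
    """Parse 'public, max-age=60' into {'public': None, 'max-age': '60'}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives

def shared_cache_may_serve(response_cc: str, age_seconds: int) -> bool:
    """Decide whether a shared cache may serve a stored copy without revalidating."""
    d = parse_cache_control(response_cc)
    if "no-store" in d or "private" in d or "no-cache" in d:
        return False
    # For shared caches, s-maxage takes precedence over max-age.
    limit = d.get("s-maxage") or d.get("max-age")
    # RFC 2616 treats a response as fresh while its lifetime exceeds its age.
    return limit is not None and age_seconds < int(limit)

print(shared_cache_may_serve("public, max-age=60", 30))
print(shared_cache_may_serve("max-age=600, s-maxage=10", 30))
```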
Etag
- Opaque value
- Different versions of resources => different ETag values
- Decoupled from cache validation
- If-Match, If-None-Match
  - E.g., If-None-Match in PUT to avoid overwriting
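
A sketch of how a server might evaluate If-None-Match, covering both cache validation (GET) and the PUT example above. The function name is an assumption, entity tags are compared as plain opaque strings, and weak validators are not handled.

```python
def check_if_none_match(method, header_value, current_etag):
    """Return a status code to answer with, or None to process the request normally.

    current_etag is None when the resource does not exist yet.
    """
    if header_value is None:
        return None
    tags = [t.strip() for t in header_value.split(",")]
    matched = current_etag is not None and ("*" in tags or current_etag in tags)
    if not matched:
        return None
    # A match means "nothing new" for GET/HEAD and "precondition failed" otherwise.
    return 304 if method in ("GET", "HEAD") else 412

# Cache validation: the client's copy is still current.
print(check_if_none_match("GET", '"v1", "v2"', '"v2"'))
# PUT with "If-None-Match: *" refuses to overwrite an existing resource.
print(check_if_none_match("PUT", "*", '"v1"'))
```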
Vary
E.g., Accept-Language
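
One way to picture Vary: a cache extends its lookup key with the request headers the response names. This is a minimal sketch; `cache_key` and the example URL are assumptions, and header values are compared literally rather than normalized.

```python
def cache_key(url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the URL plus the request headers listed in Vary."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    req = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((n, req.get(n, "")) for n in names)

url = "http://www.foo.com/"  # hypothetical resource
english = cache_key(url, {"Accept-Language": "en"}, "Accept-Language")
french = cache_key(url, {"Accept-Language": "fr"}, "Accept-Language")

# Different language preferences select different cached variants.
print(english != french)
```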
Bandwidth Optimization: Factors
Resource sizes are growing
Embedded Images
More users, better connected
Multiple parallel connections
Bandwidth Optimization: Solutions
- Only transfer necessary pieces of resource
  - Range request
- Only transfer if receiver can handle response
  - Expect/continue
- Transform resource before sending
  - Compression
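
A range request can be sketched as resolving the Range header against the resource size. The function name `resolve_range` is an assumption, and only a single `bytes` range is handled; multi-range requests are treated as unsupported here.

```python
def resolve_range(header: str, size: int):
    """Return inclusive (start, end) byte offsets, or None if not satisfiable."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        return None  # this sketch handles one byte range only
    first, sep, last = spec.strip().partition("-")
    if not sep:
        return None
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the resource.
        if not last.isdigit() or int(last) == 0:
            return None
        start, end = max(size - int(last), 0), size - 1
    else:
        if not first.isdigit() or (last and not last.isdigit()):
            return None
        start = int(first)
        end = min(int(last), size - 1) if last else size - 1
    if start > end:
        return None  # also covers a start beyond the end of the resource
    return start, end

print(resolve_range("bytes=0-499", 1000))
print(resolve_range("bytes=-200", 1000))
```

A satisfiable range would be answered with 206 Partial Content and a Content-Range header such as `bytes 0-499/1000`.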
Connection management
- Problem:
  - TCP not optimized for typical short-lived HTTP message exchange
  - Use of parallel connections
- Solution:
  - Persistent connections
    - Keep-Alive
  - Pipelined connections
    - Connection header
- Problems:
  - Head of line blocking
  - Unexpected close (aborts)
Message transmission
- HTTP/1.0
  - Content-Length field
- HTTP/1.1
  - Chunked encoding (ends with zero-length chunk)
  - Response: Transfer-Encoding: chunked
  - Request: TE: trailers
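
Chunked encoding lets a sender transmit a body whose total length is not known in advance. The sketch below shows the framing only; the function names are assumptions, and trailers and chunk extensions are not produced (extensions are skipped when decoding).

```python
def encode_chunked(pieces) -> bytes:
    """Frame each piece as: hex size CRLF data CRLF, then a zero-length chunk."""
    out = b""
    for piece in pieces:
        if piece:  # an empty piece would look like the terminating chunk
            out += format(len(piece), "x").encode("ascii") + b"\r\n" + piece + b"\r\n"
    return out + b"0\r\n\r\n"

def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body; stops at the zero-length chunk."""
    body, pos = b"", 0
    while True:
        eol = data.index(b"\r\n", pos)
        size_field = data[pos:eol].split(b";")[0]  # drop any chunk extension
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            return body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF

wire = encode_chunked([b"Hello, ", b"world"])
print(wire)
print(decode_chunked(wire))
```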
Internet address conservation
- Many Web servers on a single host
- HTTP/1.0: one IP address per Web server
- HTTP/1.1: Host header
  - Host: www.foo.com
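
With the Host header, one IP address can serve several sites. A minimal sketch of the dispatch follows; the `SITES` table, its paths, and the function name are hypothetical.

```python
# Hypothetical mapping from host name to document root on one machine.
SITES = {
    "www.foo.com": "/srv/foo",
    "www.bar.com": "/srv/bar",
}

def select_site(headers: dict):
    """Return (status, document_root) for a request's headers."""
    host = headers.get("Host")
    if host is None:
        # HTTP/1.1 requests must carry Host; without it the server cannot choose.
        return 400, None
    name = host.split(":")[0].strip().lower()  # drop an optional port
    root = SITES.get(name)
    return (200, root) if root else (404, None)

print(select_site({"Host": "www.foo.com"}))
print(select_site({"Host": "WWW.BAR.COM:8080"}))
print(select_site({}))
```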
Content negotiation
- Different formats for each resource
- Client and server negotiate about preferred representation
  - Agent-driven
  - Server-driven
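
Server-driven negotiation can be sketched with Accept-Language quality values: the server picks the available variant the client rates highest. The function names are assumptions, and matching is by exact language tag only (no prefix matching, no `*` wildcard).

```python
def parse_accept(value: str):
    """Parse 'da, en-gb;q=0.8' into [('da', 1.0), ('en-gb', 0.8)]."""
    prefs = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # default quality when no q parameter is given
        for p in parts[1:]:
            if p.startswith("q="):
                try:
                    q = float(p[2:])
                except ValueError:
                    q = 0.0
        prefs.append((parts[0].lower(), q))
    return prefs

def choose_language(accept_language: str, available):
    """Return the available language with the highest q, or None."""
    best, best_q = None, 0.0
    for tag, q in parse_accept(accept_language):
        if tag in available and q > best_q:
            best, best_q = tag, q
    return best

print(choose_language("da, en-gb;q=0.8, en;q=0.7", ["en", "fr"]))
```

In agent-driven negotiation the server would instead return the list of variants (for example with 300 Multiple Choices) and leave the choice to the client.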
Proxies in HTTP/1.1: syntax
- Requirements dealing with forwarding messages
  - HTTP/1.1 vs. HTTP/1.0
  - Forward headers that are not understood
  - Treat hop-by-hop headers
  - Remove Connection header
- Requirements dealing with modifying existing headers, adding new ones
  - Add Via header
  - Do not alter the order of field values
  - Adhere to cache control directives
  - Do not modify From and Server
  - Do not alter fully qualified domain names
  - Do not generate certain headers: Content-MD5
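
Several of these forwarding rules can be sketched together: drop hop-by-hop headers (including any named in Connection), pass unknown headers through unchanged and in order, and record the proxy in Via. The function name and the proxy name `proxy1` are assumptions; the hop-by-hop set follows the list in RFC 2616.

```python
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}

def forward_headers(headers, proxy_name, received_version="1.1"):
    """headers is a list of (name, value) pairs so that order is preserved."""
    drop = set(HOP_BY_HOP)
    for name, value in headers:
        if name.lower() == "connection":
            # Headers listed in Connection are hop-by-hop for this message too.
            drop.update(t.strip().lower() for t in value.split(",") if t.strip())

    out = [(n, v) for n, v in headers if n.lower() not in drop]

    hop = f"{received_version} {proxy_name}"
    for i, (n, v) in enumerate(out):
        if n.lower() == "via":
            out[i] = (n, f"{v}, {hop}")  # append to the existing chain
            break
    else:
        out.append(("Via", hop))
    return out

incoming = [
    ("Host", "www.foo.com"),
    ("Connection", "close, X-Debug"),
    ("X-Debug", "1"),
    ("X-Unknown", "keep me"),
]
print(forward_headers(incoming, "proxy1"))
```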
Proxies in HTTP/1.1: semantic
- Caching requirements
  - See cache control
  - Obligated to send Age header
- Connection management requirements
  - HTTP/1.1 proxies may not establish persistent connections with HTTP/1.0 clients
  - Different guidelines regarding persistent connections (2 * simultaneously active users)
- Bandwidth management requirements
  - Range requests
  - Forward Expect header / 417 Expectation Failed response