Date post: | 19-Jan-2016 |
Upload: | rodney-king |
HTTP
CS587x Lecture
Department of Computer Science
Iowa State University
What to Cover
  WWW
  HTTP/1.0
    Protocol highlights
    Problems
  HTTP/1.1
    Highlights of improvement
World Wide Web (WWW)
  Core Components
    Servers
      Store files and execute remote commands
    Browsers (i.e., clients)
      Retrieve and display “pages” of content linked by hypertext
    Networks
      Send information back and forth upon request
  Problems
    How to identify an object
    How to retrieve an object
    How to interpret an object
Semantic Parts of WWW
  URI (Uniform Resource Identifier)
    protocol://hostname:port/directory/object
      http://www.cs.iastate.edu/index.html
      ftp://popeye.cs.iastate.edu/welcome.txt
      https://finance.yahoo.com/q/cq?s=ibm&d=v1
    Implementation: extend hierarchical namespace to include
      anything in a file system
      server side processing
  HTTP (Hyper Text Transfer Protocol)
    An application protocol for information sending/receiving
  HTML (Hypertext Markup Language)
    A language specification used to interpret the information received from the server
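The URI layout above can be checked with the standard java.net.URI class. This is a minimal sketch; the class name UriParts and the "|" separated output are illustrative choices, not part of the lecture.

```java
import java.net.URI;

final class UriParts {
    // Returns "scheme|host|port|path|query".
    // getPort() is -1 when the URI omits the port; getQuery() is null when there is none.
    static String describe(String text) {
        URI uri = URI.create(text);
        return uri.getScheme() + "|" + uri.getHost() + "|" + uri.getPort()
                + "|" + uri.getPath() + "|" + uri.getQuery();
    }

    public static void main(String[] args) {
        System.out.println(describe("http://www.cs.iastate.edu/index.html"));
        System.out.println(describe("https://finance.yahoo.com/q/cq?s=ibm&d=v1"));
    }
}
```

When the port is omitted, the client falls back to the default of the protocol (80 for http).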
HTTP Properties
  Request-response exchange
    Server runs over TCP, Port 80
    Client sends HTTP requests and gets responses from server
    Synchronous request/reply protocol
  Stateless
    No state is maintained by clients or servers across requests and responses
    Each pair of request and response is treated as an independent message exchange
  Resource metadata
    Information about resources is often included in web transfers and can be used in several ways
HTTP Commands
GET Transfer resource from given URL
HEAD Get resource metadata (headers) only
PUT Store/modify resource under a given URL
DELETE Remove resource
POST Provide input for a process identified by the given URL (usually used to post CGI parameters)
Response Codes of HTTP 1.0
  2xx success
  3xx redirection
  4xx client error in request
  5xx server error; can’t satisfy the request
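A client usually only needs the first digit of the code to decide what to do. The sketch below maps a code to the classes listed above; the class and method names are hypothetical.

```java
final class StatusClass {
    // Maps a numeric status code to the response class named on the slide.
    static String classify(int code) {
        switch (code / 100) {
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(200));
        System.out.println(classify(404));
    }
}
```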
Steps of Processing an HTTP Request
http://www.cs.iastate.edu/index.html
  The client
    1. Contact its local DNS to find out the IP address of www.cs.iastate.edu
    2. Initiate a TCP connection on port 80
    3. Send the GET request via the established socket
         GET /index.html HTTP/1.0
  The server
    4. Send its response containing the required file
    5. Tell TCP to terminate connection
  The browser
    6. Parse the file and display it accordingly
    7. Repeat the same steps in the presence of any embedded objects
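Steps 1 to 5 can be sketched from the client side with plain sockets. This is an illustration under simple assumptions (no redirects, no timeouts, the whole response held in memory); SimpleHttpClient and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class SimpleHttpClient {
    // Step 3: the request line followed by an empty line that ends the header section.
    static String buildRequest(String path) {
        return "GET " + path + " HTTP/1.0\r\n\r\n";
    }

    static String fetch(String host, int port, String path) throws IOException {
        InetAddress address = InetAddress.getByName(host);       // step 1: DNS lookup
        try (Socket socket = new Socket(address, port)) {        // step 2: TCP connection
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // Steps 4 and 5: read until the server closes the connection.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                response.write(buffer, 0, n);
            }
            return new String(response.toByteArray(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println(fetch(args[0], 80, "/index.html"));
        }
    }
}
```

Because an HTTP/1.0 server closes the connection after the response, end of stream marks the end of the object.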
Server Response
  HTTP/1.0 200 OK
  Content-Type: text/html
  Content-Length: 1234
  Last-Modified: Mon, 19 Nov 2001 15:31:20 GMT

  <HTML><HEAD><TITLE>CS Home Page</TITLE></HEAD>…</BODY></HTML>
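A client splits such a response into the status line, the headers, and the body that follows the first empty line. The sketch below parses only the head; ResponseHead is a hypothetical helper, and it assumes CRLF line endings and one header per line.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class ResponseHead {
    final int status;
    final Map<String, String> headers = new LinkedHashMap<>();

    ResponseHead(String response) {
        String[] lines = response.split("\r\n");
        // Status line: HTTP-version, status code, reason phrase.
        status = Integer.parseInt(lines[0].split(" ", 3)[1]);
        // Header lines run until the first empty line.
        for (int i = 1; i < lines.length && !lines[i].isEmpty(); i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
                headers.put(name, lines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        ResponseHead head = new ResponseHead(
                "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\nContent-Length: 1234\r\n\r\n<HTML>");
        System.out.println(head.status + " " + head.headers);
    }
}
```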
HTTP/1.0 Example
  Client                      Server
    Request file 1     -->
                       <--    Transfer file 1
    Request file 2     -->
                       <--    Transfer file 2
    Request file n     -->
                       <--    Transfer file n
    Finish display page
HTTP Server Implementation
    import java.net.*;

    public class WebServerDemo {
        public static void main(String[] args) throws Exception {
            ServerSocket ss = new ServerSocket(80);   // listen on the HTTP port
            for (;;) {
                // accept connection
                Socket accept = ss.accept();
                // Start a thread to process the request
                new Handler(accept).start();
            }
        }
    }
HTTP Server Implementation
    import java.io.*;
    import java.net.*;

    class Handler extends Thread {   // Handler for an HTTP request
        Socket socket;
        BufferedReader br;
        PrintWriter pw;

        public Handler(Socket _socket) { socket = _socket; }

        public void run() {
            try {
                br = new BufferedReader(new InputStreamReader(socket.getInputStream()));
                pw = new PrintWriter(new OutputStreamWriter(socket.getOutputStream()));
                String line = br.readLine();   // Read HTTP request from user
                if (line != null && line.toUpperCase().startsWith("GET")) {
                    // parse the string to find the file name
                    // locate the file and send it back
                    // :::::
                }
                // other commands: post, delete, put, etc.
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
HTTP/1.0 Caching
  CLIENT
    GET request:
      If-Modified-Since: return a “not modified” response if the resource was not modified since the specified time
    Request header:
      Pragma: no-cache: ignore all caches and get the resource directly from the server
  SERVER
    Response header:
      Expires: tell the client until when it is safe to cache the resource
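The server side of If-Modified-Since comes down to one date comparison. A minimal sketch, assuming both dates arrive in the RFC 1123 format used in the Last-Modified example earlier; ConditionalGet is a hypothetical name.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

final class ConditionalGet {
    // Returns 304 (not modified) when the resource has not changed since the
    // date the client sent, otherwise 200 so the full resource is transferred.
    static int statusFor(String ifModifiedSince, String lastModified) {
        if (ifModifiedSince == null) {
            return 200;   // unconditional GET
        }
        DateTimeFormatter http = DateTimeFormatter.RFC_1123_DATE_TIME;
        ZonedDateTime since = ZonedDateTime.parse(ifModifiedSince, http);
        ZonedDateTime modified = ZonedDateTime.parse(lastModified, http);
        return modified.isAfter(since) ? 200 : 304;
    }

    public static void main(String[] args) {
        String modified = "Mon, 19 Nov 2001 15:31:20 GMT";
        System.out.println(statusFor("Mon, 19 Nov 2001 15:31:20 GMT", modified));
        System.out.println(statusFor("Sun, 18 Nov 2001 15:31:20 GMT", modified));
    }
}
```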
Issues with HTTP/1.0
  Each resource requires a new connection
    Large number of embedded objects in a web page
    Many short-lived connections
  Serial vs. parallel connections
    Serial connection downloads one object at a time (e.g., MOSAIC), causing long latency to display a whole page
    Parallel connection (e.g., NETSCAPE) opens several connections (typically 4), contributing to network congestion
  HTTP uses TCP as the transport protocol
    TCP is not optimized for the typical short-lived connections
      Most Internet traffic fits in 10 packets (overhead: 7 out of 17)
      Too slow for small objects
      May never exit slow-start phase
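To see why a short transfer may never leave slow start, one can count round trips under an idealized model: the window starts at a given number of segments and doubles every round, with no loss, no delayed ACKs, and no threshold. The model and the names below are assumptions for illustration only.

```java
final class SlowStart {
    // Rounds needed to send `segments` when the window doubles each round.
    static int roundsToSend(int segments, int initialWindow) {
        if (initialWindow <= 0) {
            throw new IllegalArgumentException("initialWindow must be positive");
        }
        int rounds = 0;
        int window = initialWindow;
        int sent = 0;
        while (sent < segments) {
            sent += window;   // one window's worth per round trip
            window *= 2;      // exponential growth during slow start
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        System.out.println(roundsToSend(10, 1));
    }
}
```

With an initial window of one segment, a 10-segment object takes four round trips (1 + 2 + 4 + 8 segments), and the connection closes while the window is still growing.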
Highlights of HTTP/1.1
  Persistent connections
  Pipelined requests/responses
  Support for virtual hosting
  More explicit support on caching
  Internet Cache Protocol (ICP)
  Content negotiation/adaptation
  Range Request
Persistent Connections
  The basic idea is to reduce the number of TCP connections opened and closed
    Reduces TCP connection costs
    Reduces latency by avoiding multiple TCP slow-starts
    Avoids bandwidth wastage and reduces overall congestion
      A longer TCP connection knows better about networking condition (Why?)
  Proposed new GET methods (not adopted in the HTTP/1.1 specification)
    GETALL
    GETLIST
Pipelined Requests/Responses
  Buffer requests and responses to reduce the number of packets
  Multiple requests can be contained in one TCP segment
  Note: order of responses has to be maintained

  Client                      Server
    Request 1
    Request 2
    Request 3          -->
                       <--    Transfer 1
                       <--    Transfer 2
                       <--    Transfer 3
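On the client, pipelining means writing several requests back to back before reading any response. The sketch below only builds the bytes to be written in one go; the Pipeline class and its method are hypothetical, and the host name is taken from the virtual hosting example.

```java
import java.util.Arrays;
import java.util.List;

final class Pipeline {
    // Concatenates one GET request per path so that all of them can be sent
    // in a single write; responses must then be read in the same order.
    static String build(String host, List<String> paths) {
        StringBuilder out = new StringBuilder();
        for (String path : paths) {
            out.append("GET ").append(path).append(" HTTP/1.1\r\n")
               .append("Host: ").append(host).append("\r\n")
               .append("\r\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(build("www.A.com", Arrays.asList("/index.html", "/logo.gif")));
    }
}
```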
Support for Virtual Hosting
  Problem: outsourcing web content to some company
    http://www.hostmany.com/A    http://www.A.com
    http://www.hostmany.com/B    http://www.B.com
  In HTTP/1.0, a request for http://www.A.com/index.html has in its header only:
      GET /index.html HTTP/1.0
  It is not possible to run two web servers at the same IP address, because GET is ambiguous
  HTTP/1.1 addresses this by adding the “Host” header
      GET /index.html HTTP/1.1
      Host: www.A.com
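With the Host header available, one server can pick a document root per site. A minimal sketch of that lookup; the VirtualHosts class and the directory names are made up for illustration, and IPv6 literals in the Host header are not handled.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class VirtualHosts {
    private final Map<String, String> docroots = new HashMap<>();

    void add(String host, String docroot) {
        docroots.put(host.toLowerCase(Locale.ROOT), docroot);
    }

    // Returns the document root for the Host header value, or null if unknown.
    String resolve(String hostHeader) {
        if (hostHeader == null) {
            return null;   // HTTP/1.0 style request without a Host header
        }
        String host = hostHeader.trim().toLowerCase(Locale.ROOT);
        int colon = host.lastIndexOf(':');
        if (colon >= 0) {
            host = host.substring(0, colon);   // drop an optional ":port"
        }
        return docroots.get(host);
    }

    public static void main(String[] args) {
        VirtualHosts hosts = new VirtualHosts();
        hosts.add("www.A.com", "/sites/A");
        hosts.add("www.B.com", "/sites/B");
        System.out.println(hosts.resolve("www.A.com"));
    }
}
```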
Content Negotiation/Adaptation
  A resource may have more than one representation
    Different languages
    Different size of images, etc.
  Example
      GET /index.html HTTP/1.1
      Host: www.getbelix.com
      Accept-Language: en-us, fr-BE
  Two approaches
    Agent-driven: the client receives a set of alternative representations of the response, chooses the best representation, and indicates its choice in the second request
    Server-driven: the server chooses the representation based on what is available at the server, the headers in the request messages, or information about the client, such as its IP address
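Server-driven negotiation on language can be sketched as a walk over the Accept-Language list in the order the client gave it. This simplified version ignores quality values (";q=...") and language-range prefix matching; LanguageChooser is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;

final class LanguageChooser {
    // Returns the first available language the client listed, else the fallback.
    static String choose(String acceptLanguage, List<String> available, String fallback) {
        if (acceptLanguage != null) {
            for (String item : acceptLanguage.split(",")) {
                // Drop any ";q=..." parameter; this sketch ignores quality values.
                String tag = item.split(";", 2)[0].trim();
                for (String candidate : available) {
                    if (candidate.equalsIgnoreCase(tag)) {
                        return candidate;
                    }
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        List<String> available = Arrays.asList("fr-BE", "en-US");
        System.out.println(choose("en-us, fr-BE", available, "en-US"));
    }
}
```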
Range Request
  A user may want to load only some portion of content
    E.g., retrieve only the newly appended portion
    E.g., load some pages of a PDF file
      GET /bigfile.html HTTP/1.1
      Host: www.justwhatiwant.com
      Range: bytes=2000-3999

      Range: bytes=-1000    (the last 1000 bytes)
      Range: bytes=2000-    (from byte 2000 to the end)
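The three forms of the Range header resolve to concrete byte positions once the resource length is known. A sketch for a single range with the "bytes=" unit; ByteRange is a hypothetical helper and multi-range requests are rejected.

```java
import java.util.Arrays;

final class ByteRange {
    // Returns {first, last} (inclusive) or null when the range is malformed
    // or cannot be satisfied for a resource of the given length.
    static long[] resolve(String header, long length) {
        if (header == null || !header.startsWith("bytes=") || length <= 0) {
            return null;
        }
        String spec = header.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (dash < 0 || spec.indexOf(',') >= 0) {
            return null;
        }
        String first = spec.substring(0, dash).trim();
        String last = spec.substring(dash + 1).trim();
        try {
            if (first.isEmpty()) {                       // "-1000": last N bytes
                if (last.isEmpty()) return null;
                long n = Long.parseLong(last);
                if (n <= 0) return null;
                return new long[] {Math.max(0, length - n), length - 1};
            }
            long start = Long.parseLong(first);
            if (start >= length) return null;
            long end = last.isEmpty() ? length - 1       // "2000-": to the end
                                      : Math.min(Long.parseLong(last), length - 1);
            return end < start ? null : new long[] {start, end};
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(resolve("bytes=2000-3999", 10000)));
    }
}
```

A server that honors the range answers with status 206 (Partial Content) instead of 200.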
Cache-Control Request Directives
  no-cache: forcible revalidation with origin server
  only-if-cached: obtain resource only from cache
  no-store: don’t allow caches to store request/response
  max-age: response’s age should be no greater than this value
  max-stale: expired response OK but not older than stated value
  min-fresh: response should remain fresh for at least stated value
  no-transform: proxy should not change media type
Cache-Control Response Directives
  public: OK to cache response anywhere
  private: response for specific user only
  no-cache: do not serve from cache without prior revalidation
    Must revalidate regardless of when the response becomes stale
  no-store: caches are not permitted to store response, request
  no-transform: proxy should not change media type
  must-revalidate: can be cached but revalidate if stale
    A file may be associated with an age (expiration)
  proxy-revalidate: force shared user agent caches to revalidate cached response
  max-age: response’s age should be no greater than this value
  s-maxage: shared caches use value as response’s maximum age (overrides max-age)
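A cache has to split the Cache-Control header into directives before it can apply any of the rules above. The sketch below parses the header and makes a deliberately conservative freshness decision using only no-store, no-cache and max-age; the CacheControl class is hypothetical, and quoted directive values that contain commas are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class CacheControl {
    // Splits "public, max-age=3600" into {public="", max-age="3600"}.
    static Map<String, String> parse(String header) {
        Map<String, String> directives = new LinkedHashMap<>();
        for (String part : header.split(",")) {
            String item = part.trim();
            if (item.isEmpty()) continue;
            int eq = item.indexOf('=');
            String name = (eq < 0 ? item : item.substring(0, eq)).trim().toLowerCase(Locale.ROOT);
            String value = eq < 0 ? "" : item.substring(eq + 1).trim();
            directives.put(name, value);
        }
        return directives;
    }

    // True only when the stored response may be served without revalidation.
    static boolean canServeFromCache(String responseCacheControl, long ageSeconds) {
        Map<String, String> d = parse(responseCacheControl);
        if (d.containsKey("no-store") || d.containsKey("no-cache")) return false;
        String maxAge = d.get("max-age");
        if (maxAge == null) return false;   // no explicit lifetime: revalidate
        try {
            return ageSeconds < Long.parseLong(maxAge);
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("public, max-age=3600"));
        System.out.println(canServeFromCache("public, max-age=3600", 120));
    }
}
```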
Factors to Consider for Cache Replacement
Cost of storing the resource (size)
Cost of fetching the resource (size+distance)
The time since the last modification of the resource
The number of accesses to the resource in the past
The probability of the resource being accessed in the near future
  May be known a priori or based on the past access pattern
The heuristic expiration time
  If there is no server-specified expiration time, the cache decides on a heuristic expiration time.
  If no expired resources are available as candidates, then resources that are close to their expiration time are prioritized as candidates for replacement
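Of the factors above, past access pattern is the easiest to turn into a policy. The sketch below is a least-recently-used (LRU) cache built on LinkedHashMap's access ordering; it is one common policy, not the one the lecture prescribes, and it ignores size, fetch cost and expiration time.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);   // true = iterate in access order, oldest first
        this.capacity = capacity;
    }

    // Called by put(); returning true evicts the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("/a.html", "A");
        cache.put("/b.html", "B");
        cache.get("/a.html");          // /a.html becomes most recently used
        cache.put("/c.html", "C");     // evicts /b.html
        System.out.println(cache.keySet());
    }
}
```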
Summary
  HTTP 1.0
  HTTP 1.1
What we have covered so far
  HTTP      DNS
  TCP       UDP
  IP
  Ethernet  FDDI  Token  Etc.
FYI
[Figure: Internet domain survey host count worldwide. SOURCE: National Science Board, Science and Engineering Indicators-2002]
HTTP Server (1)
    import java.io.*;
    import java.net.*;
    import java.util.*;

    public class WebServerDemo {
        protected String docroot;    // Directory of HTML pages and other files
        protected int port;          // Port number of web server
        protected ServerSocket ss;   // Socket for the web server

        class Handler extends Thread {   // Handler for a HTTP request
            protected Socket socket;
            protected PrintWriter pw;
            protected BufferedOutputStream bos;
            protected BufferedReader br;
            protected File docroot;

            public Handler(Socket _socket, String _docroot) throws Exception {
                socket = _socket;
                docroot = new File(_docroot).getCanonicalFile();   // Absolute dir of the filepath
            }
HTTP Server (2)
            public void run() {
                try {
                    // Prepare our readers and writers
                    br = new BufferedReader(new InputStreamReader(socket.getInputStream()));
                    bos = new BufferedOutputStream(socket.getOutputStream());
                    pw = new PrintWriter(new OutputStreamWriter(bos));
                    String line = br.readLine();   // Read HTTP request from user
                    socket.shutdownInput();        // Shutdown any further input
                    if (line == null) {
                        socket.close();
                        return;
                    }
                    if (line.toUpperCase().startsWith("GET")) {
                        // Eliminate any trailing ? data, such as for a CGI GET request
                        StringTokenizer tokens = new StringTokenizer(line, " ?");
                        tokens.nextToken();
                        String req = tokens.nextToken();
                        String name;
                        // ... form a full filename
                        if (req.startsWith("/") || req.startsWith("\\"))
                            name = this.docroot + req;
                        else
                            name = this.docroot + File.separator + req;
                        File file = new File(name).getCanonicalFile();   // Get absolute file path
                        // Check to see if request doesn't start with our document root ....
                        if (!file.getAbsolutePath().startsWith(this.docroot.getAbsolutePath())) {
                            pw.println("HTTP/1.0 403 Forbidden");
                            pw.println();
                        }
HTTP Server (3)
                        // run() continued
                        else if (!file.canRead()) {        // No access
                            pw.println("HTTP/1.0 403 Forbidden");
                            pw.println();
                        }
                        else if (file.isDirectory()) {     // Directory, not file
                            sendDir(bos, pw, file, req);
                        }
                        else {
                            sendFile(bos, pw, file.getAbsolutePath());
                        }
                    }
                    else {                                 // Unsupported command
                        pw.println("HTTP/1.0 501 Not Implemented");
                        pw.println();
                    }
                    pw.flush();
                    bos.flush();
                } catch (Exception e) {
                    e.printStackTrace();
                }
                try {
                    socket.close();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }   // run()

            protected void sendFile(BufferedOutputStream bos, PrintWriter pw, String filename) throws Exception {
                try {
                    BufferedInputStream bis = new BufferedInputStream(new FileInputStream(filename));
                    byte[] data = new byte[10*1024];
                    int read = bis.read(data);
                    pw.println("HTTP/1.0 200 Okay");
                    pw.println();
                    pw.flush();
                    bos.flush();
                    while (read != -1) {
                        bos.write(data, 0, read);
                        read = bis.read(data);
                    }
                    bos.flush();
                } catch (Exception e) {
                    pw.flush();
                    bos.flush();
                }
            }
HTTP Server (4)
            protected void sendDir(BufferedOutputStream bos, PrintWriter pw, File dir, String req) throws Exception {
                try {
                    pw.println("HTTP/1.0 200 Okay");
                    pw.println();
                    pw.flush();
                    pw.print("<html><head><title>Directory of " + req + "</title></head><body><h1>Directory of " + req
                            + "</h1><table border=\"0\">");
                    File[] contents = dir.listFiles();
                    for (int i = 0; i < contents.length; i++) {
                        pw.print("<tr><td><a href=\"" + req + contents[i].getName());
                        if (contents[i].isDirectory()) pw.print("/");
                        pw.print("\">");
                        if (contents[i].isDirectory()) pw.print("Dir -> ");
                        pw.println(contents[i].getName() + "</a></td></tr>");
                    }
                    pw.println("</table></body></html>");
                    pw.flush();
                } catch (Exception e) {
                    pw.flush();
                    bos.flush();
                }
            }
        }

        protected void parseParams(String[] args) throws Exception {
            switch (args.length) {   // Check that a filepath has been specified and a port number
                case 1:
                case 0:
                    System.err.println("Syntax: <jvm> " + this.getClass().getName() + " docroot port");
                    System.exit(0);
                default:
                    this.docroot = args[0];
                    this.port = Integer.parseInt(args[1]);
                    break;
            }
        }
HTTP Server (5)
        public WebServerDemo(String[] args) throws Exception {
            System.out.println("Checking for parameters");
            parseParams(args);                       // Check for command line parameters
            System.out.print("Starting web server...... ");
            this.ss = new ServerSocket(this.port);   // Create a new server socket
            System.out.println("OK");

            for (;;) {                               // Forever
                Socket accept = ss.accept();         // Accept connection via server socket
                // Start a new handler instance to process the request
                new Handler(accept, docroot).start();
            }
        }

        // Start an instance of the web server
        public static void main(String[] args) throws Exception {
            WebServerDemo webServerDemo = new WebServerDemo(args);
        }
    }