Post on 15-Jan-2016
1 herbert van de sompel
CS 502 Computing Methods for Digital Libraries
Cornell University – Computer Science
Herbert Van de Sompel
herbertv@cs.cornell.edu
Lecture 4 Basic Web Concepts
2 herbert van de sompel
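Hypertext Transfer Protocol (HTTP)
[Diagram: a web browser (HTTP client) and a web server (HTTP server), each with its own IP address (IP address 1, IP address 2), communicate over a TCP/IP network. The browser sends an HTTP request, the server returns an HTTP response, and the browser renders the response.]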
3 herbert van de sompel
Transmission Control Protocol/Internet Protocol (TCP/IP)
• is the protocol suite that drives the Internet
• handles network communications between network nodes (computers, printers, webcams, … connected to the Internet)
• protocol suite:
  • TCP: reliable, connection-based communication of data between applications
  • IP: communication of data between nodes
  • UDP: connectionless communication of data between applications (no delivery guarantee)
  • ICMP: error reporting and network diagnostics
4 herbert van de sompel
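TCP/IP protocol architecture
[Diagram: the four-layer stack; the request travels down the stack on the client and up the stack on the server]
• Application layer: client sends HTTP request; server receives HTTP request
• Transport layer: TCP
• Internet layer: IP
• Network Access layer: Ethernet, …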
5 herbert van de sompel
Transmission Control Protocol (TCP)
• breaks the message up into chunks
• each chunk gets a sequence number and the IP address of the addressee
• opens a connection with the addressee (handshake)
• hands the chunks over to the IP layer
• guarantees error-free, in-order delivery of the chunks at the addressee (through the connection)
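The chunking and reordering described above can be illustrated with a toy simulation. This is only a sketch: it is not real TCP (which numbers bytes rather than chunks and also handles acknowledgements and retransmission), and the names `split_into_chunks`, `reassemble`, and the chunk size of 8 are assumptions made for the example.

```python
import random

def split_into_chunks(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Break a message into (sequence number, payload) chunks."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]

def reassemble(chunks: list[tuple[int, bytes]]) -> bytes:
    """Sort chunks by sequence number and join their payloads."""
    return b"".join(payload for _, payload in sorted(chunks))

message = b"GET / HTTP/1.1\r\nHost: no.good.com\r\n\r\n"
chunks = split_into_chunks(message, size=8)
random.shuffle(chunks)           # chunks may arrive in any order
restored = reassemble(chunks)    # the sequence numbers restore the order
```

The sequence number is what lets the receiving side rebuild the message no matter which path each chunk took.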
6 herbert van de sompel
Internet Protocol (IP)
• handles the routing of chunks towards the addressee (through routers)
• IP Addressing:
  • each node has an IP address: 157.193.101.6
  • each node can have a readable name: erlserv.rug.ac.be
  • DNS connects IP address and readable name
• IP Data Transmission:
  • sender delivers chunk to router (via lower-level protocol)
  • router delivers chunk to router or host
  • individual chunks can be delivered via different paths
  • routers decide on the path of least resistance
  • at the addressee, IP delivers the chunk to the TCP layer
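To make the addressing part concrete, the sketch below parses the example address with Python's standard `ipaddress` module and shows that a dotted address is just a 32-bit number. The `TOY_DNS` table and the `resolve` function are hypothetical stand-ins for DNS; a real program would ask a resolver instead, for instance via `socket.gethostbyname`.

```python
import ipaddress

# The example address from the slide, parsed by the standard library.
address = ipaddress.IPv4Address("157.193.101.6")
as_integer = int(address)        # an IPv4 address is a 32-bit number

# Hypothetical stand-in for DNS: maps a readable name to an IP address.
TOY_DNS = {"erlserv.rug.ac.be": "157.193.101.6"}

def resolve(name: str) -> ipaddress.IPv4Address:
    """Look up a readable name in the toy table (raises KeyError if unknown)."""
    return ipaddress.IPv4Address(TOY_DNS[name])
```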
7 herbert van de sompel
TCP/IP protocol architecture
• Application layer: HTTP, FTP, telnet
• Transport layer: TCP, UDP
• Internet layer: IP, ICMP
• Network Access layer: Ethernet, …
8 herbert van de sompel
HTTP request
[Diagram: the web browser (HTTP client) sends an HTTP request to the web server (HTTP server) at no.good.com]
method:
  GET / HTTP/1.1
header:
  Date: Wednesday, 02-Feb-99 23:04:12 GMT
  Accept-Language: en-us
  User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
  Host: no.good.com
  Connection: Keep-Alive
* a blank line *
entity-body: (empty)
9 herbert van de sompel
HTTP request
• method: method URI HTTP-version
  • methods: GET, POST, HEAD, PUT, …
  • example: GET / HTTP/1.1
• header:
  • general-header: optional, general information
    Date: Wednesday, 02-Feb-99 23:04:12 GMT
    Connection: Keep-Alive
  • request-header: about the client
    Accept-Language: en-us
    User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
  • entity-header: about the entity-body
• entity-body: what is sent to the server
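The request layout (method line, header lines, a blank line, then the entity-body) can be assembled as plain text. The sketch below is illustrative only; `build_request` is a hypothetical helper, not part of any library, and it always writes HTTP/1.1 as the version.

```python
def build_request(method: str, uri: str, headers: dict[str, str], body: str = "") -> str:
    """Assemble the method line, header lines, a blank line and the entity-body."""
    lines = [f"{method} {uri} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Lines end with CRLF; an empty line separates the header from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body

request = build_request("GET", "/", {
    "Accept-Language": "en-us",
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)",
    "Host": "no.good.com",
    "Connection": "Keep-Alive",
})
```

For a GET request the entity-body stays empty, so the message ends right after the blank line.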
10 herbert van de sompel
HTTP response
[Diagram: the web server (HTTP server) at no.good.com returns an HTTP response to the web browser (HTTP client)]
status:
  HTTP/1.1 200 OK
header:
  Date: Wednesday, 02-Feb-99 23:04:25 GMT
  Server: Apache/1.3.6 (Unix)
  Last-Modified: Sun, 01 Feb 1999 13:54:26 GMT
  ETag: "2f5cd-964-38js8"
  Content-length: 327
  Connection: close
  Content-Type: text/html
* a blank line *
entity-body:
  <title>Welcome to nogood</title><img src="/images/nogood-logo.gif">
11 herbert van de sompel
HTTP response
• status: HTTP-version Status-code Reason-phrase
  • example: HTTP/1.1 200 OK
• header:
  • general-header: optional, general information
    Date: Wednesday, 02-Feb-99 23:04:25 GMT
  • response-header: about the server
    Server: Apache/1.3.6 (Unix)
  • entity-header: about the entity-body
    Content-Type: text/html
    ETag: "2f5cd-964-38js8"
    Content-length: 327
• entity-body: what is sent to the client
  <title>Welcome to nogood</title><img src="/images/nogood-logo.gif">
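The response layout can be taken apart the same way it is put together: split at the blank line, read the status line, then the headers. The following is a minimal sketch; `parse_response` is a hypothetical helper, and the `Content-Length` value is computed from the shortened example body rather than copied from the slide.

```python
def parse_response(raw: bytes) -> tuple[str, int, str, dict, bytes]:
    """Return (version, status code, reason phrase, headers, entity-body)."""
    head, _, body = raw.partition(b"\r\n\r\n")      # the blank line ends the header
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    length = int(headers.get("content-length", len(body)))
    return version, int(code), reason, headers, body[:length]

page = b'<title>Welcome to nogood</title><img src="/images/nogood-logo.gif">'
raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: Apache/1.3.6 (Unix)\r\n"
    b"Content-Type: text/html\r\n"
    + f"Content-Length: {len(page)}\r\n".encode("ascii")
    + b"\r\n" + page
)
version, code, reason, headers, entity = parse_response(raw_response)
```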
12 herbert van de sompel
HTTP request
[Diagram: the web browser (HTTP client) sends a second HTTP request to the web server (HTTP server) at no.good.com, this time for the image referenced in the page]
GET /images/nogood-logo.gif HTTP/1.1
Date: Wednesday, 02-Feb-99 23:04:27 GMT
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
Host: no.good.com
Connection: Keep-Alive
* a blank line *
13 herbert van de sompel
HTTP response
[Diagram: the web server (HTTP server) at no.good.com returns the image to the web browser (HTTP client)]
HTTP/1.1 200 OK
Date: Wednesday, 02-Feb-99 23:04:29 GMT
Server: Apache/1.3.6 (Unix)
Last-Modified: Sun, 01 Feb 1999 08:20:00 GMT
ETag: "2f5cd-964-445e"
Content-length: 220
Connection: close
Content-Type: image/gif
* a blank line *
the GIF file
14 herbert van de sompel
Hypertext Transfer Protocol (HTTP)
[Diagram: the web browser (HTTP client) sends an HTTP request to the web server (HTTP server); the HTTP response carries a MIME type plus a file, which the browser renders]
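The whole round trip can be tried on a single machine with Python's standard `http.server` and `http.client` modules. This is a sketch under stated assumptions: the server runs on the loopback address with a port chosen by the operating system, and the `Handler` class and its fixed HTML body are made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET with a small HTML page."""

    def do_GET(self):
        body = b"<title>Welcome to nogood</title>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")   # the MIME type
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()                               # writes the blank line
        self.wfile.write(body)                           # the file

    def log_message(self, format, *args):
        pass  # keep the example quiet

server = HTTPServer(("127.0.0.1", 0), Handler)   # port 0: let the OS pick one
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: send the request, then read the status, MIME type and file.
connection = http.client.HTTPConnection("127.0.0.1", server.server_port)
connection.request("GET", "/")
response = connection.getresponse()
status = response.status
mime_type = response.getheader("Content-Type")
body = response.read()
connection.close()
server.shutdown()
server.server_close()
```

The client learns how to treat the bytes it received from the `Content-Type` header, not from the bytes themselves.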
15 herbert van de sompel
Browser
[Diagram: file + MIME type → presentation software → display]
The presentation software is selected by MIME type and can be:
• built into browser
• plug-in
• helper application
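Choosing presentation software by MIME type amounts to a table lookup. In the sketch below the `HANDLERS` table and the `choose_presentation` function are hypothetical; only `mimetypes.guess_type` is a real standard-library call, shown here because servers commonly derive the MIME type from the file extension.

```python
import mimetypes

# Hypothetical table: which kind of presentation software handles which type.
HANDLERS = {
    "text/html": "built into browser",
    "image/gif": "built into browser",
    "application/pdf": "plug-in",
}

def choose_presentation(mime_type: str) -> str:
    """Pick presentation software for a MIME type; unknown types go to a helper."""
    return HANDLERS.get(mime_type, "helper application")

# Guess the MIME type of the example image from its file extension.
guessed_type, _ = mimetypes.guess_type("/images/nogood-logo.gif")
presentation = choose_presentation(guessed_type)
```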
16 herbert van de sompel
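HTTP Proxies
[Diagram: an HTTP proxy sits between the web browser (HTTP client) and the web server (HTTP server) at no.good.com. It acts as a server towards the browser and as a client towards the web server, and it keeps a cache.]
• Reduce network traffic: caching (ETag, Last-Modified)
• IP-based authentication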
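How an ETag lets a proxy avoid re-fetching content can be simulated without any network. In real HTTP the proxy sends the cached ETag in an `If-None-Match` header and the server answers `304 Not Modified` when nothing changed. Everything else below (`ORIGIN`, `origin_server`, `CachingProxy`) is a hypothetical model for illustration, not a real proxy.

```python
# Hypothetical origin content: path -> (ETag, body).
ORIGIN = {"/": ('"2f5cd-964-38js8"', b"<title>Welcome to nogood</title>")}
origin_bodies_sent = 0   # counts how often the full body crossed the network

def origin_server(path, if_none_match=None):
    """Return (status, etag, body); a 304 response carries no body."""
    global origin_bodies_sent
    etag, body = ORIGIN[path]
    if if_none_match == etag:
        return 304, etag, b""
    origin_bodies_sent += 1
    return 200, etag, body

class CachingProxy:
    def __init__(self):
        self.cache = {}   # path -> (etag, body)

    def get(self, path):
        cached = self.cache.get(path)
        status, etag, body = origin_server(path, cached[0] if cached else None)
        if status == 304:
            return cached[1]            # the cached copy is still valid
        self.cache[path] = (etag, body)
        return body

proxy = CachingProxy()
first = proxy.get("/")    # fetched from the origin and cached
second = proxy.get("/")   # revalidated with the ETag, served from the cache
```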
17 herbert van de sompel
HTTP cookies
• The HTTP protocol is stateless: once a server has given a response to a client, it forgets about it. No session information.
• Fake state with cookies:
  • server sends a token to the client
  • client sends the token back to the server
  • server understands the meaning of the token
  • for instance: the server avoids requiring a username/password with every request by reading the authorization from the cookie
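The token exchange can be sketched with the standard `http.cookies` module. The cookie name `session`, the in-memory `SESSIONS` table, and both function names are assumptions for the example; a real server would also add expiry and security attributes to the cookie.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}   # token -> username, kept on the server

def login_response_header(username: str) -> str:
    """Server: issue a token and return the value for a Set-Cookie header."""
    token = secrets.token_hex(16)
    SESSIONS[token] = username
    cookie = SimpleCookie()
    cookie["session"] = token
    return cookie["session"].OutputString()    # 'session=<token>'

def user_from_request(cookie_header: str):
    """Server: read the token the client sent back in its Cookie header."""
    morsel = SimpleCookie(cookie_header).get("session")
    return SESSIONS.get(morsel.value) if morsel else None

set_cookie_value = login_response_header("herbert")
returned_by_client = set_cookie_value   # the client echoes name=value back
user = user_from_request(returned_by_client)
```

Only the server knows what the token means; to the client it is an opaque string.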
18 herbert van de sompel
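Dynamic content: Common Gateway Interface (CGI)
[Diagram: the web browser (HTTP client) sends an HTTP request to the web server (HTTP server) at no.good.com; the server passes the request to a program through CGI and returns the program's output as the HTTP response]
• Client interaction with non-web servers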
19 herbert van de sompel
CGI -- HTTP POST request
[Diagram: the web browser (HTTP client) sends an HTTP request to the web server (HTTP server) at no.good.com, which passes it to the program find through CGI]
POST /cgi-bin/find HTTP/1.1
Date: Wednesday, 02-Feb-99 23:04:27 GMT
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
Host: no.good.com
Connection: Keep-Alive
Content-length: 26
Content-type: application/x-www-form-urlencoded
* a blank line *
search=herbert&type=author
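With POST the form fields travel in the entity-body, and `Content-length` is the size of that body in bytes. The sketch below rebuilds the example with `urllib.parse.urlencode`; the header set is trimmed to the essentials for brevity.

```python
from urllib.parse import urlencode

form = {"search": "herbert", "type": "author"}
body = urlencode(form)                       # 'search=herbert&type=author'
content_length = len(body.encode("ascii"))   # the length is counted in bytes

request = (
    "POST /cgi-bin/find HTTP/1.1\r\n"
    "Host: no.good.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {content_length}\r\n"
    "\r\n"
    f"{body}"
)
```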
20 herbert van de sompel
CGI -- HTTP GET request
[Diagram: the web browser (HTTP client) sends an HTTP request to the web server (HTTP server) at no.good.com, which passes it to the program find through CGI]
GET /cgi-bin/find?search=herbert&type=author HTTP/1.1
Date: Wednesday, 02-Feb-99 23:04:27 GMT
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
Host: no.good.com
Connection: Keep-Alive
* a blank line *
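With GET the same fields travel in the URI, after the question mark. A short sketch using the standard `urllib.parse` functions shows how the request line splits into the program path and its parameters.

```python
from urllib.parse import parse_qs, urlsplit

request_line = "GET /cgi-bin/find?search=herbert&type=author HTTP/1.1"
method, uri, version = request_line.split(" ")

parts = urlsplit(uri)
program_path = parts.path        # the program the server should run
query = parse_qs(parts.query)    # each field maps to a list of values
```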
21 herbert van de sompel
CGI - the interface
[Diagram: the web server (HTTP server) at no.good.com passes the request to the program find through CGI]
find receives input from:
• STDIN (the entity-body of a POST request)
  search=herbert&type=author
• environment variables (about client, server, request, …)
  SERVER_NAME server.good.com
  REMOTE_HOST 157.193.101.6
  …
22 herbert van de sompel
CGI - the interface
[Diagram: the program find returns its output to the web server (HTTP server) at no.good.com through CGI]
find outputs to STDOUT:
  Content-type: text/html
  * a blank line *
  <title>Search results</title>…
• web server adds header information
• web server sends response to client
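Both halves of the interface fit in one small function: read the input from STDIN or the environment, then write a header, a blank line, and the body to STDOUT. The sketch below passes the environment and streams in as arguments so it can run without a web server. `REQUEST_METHOD`, `CONTENT_LENGTH` and `QUERY_STRING` are standard CGI variables, while the function body and its output text are invented for the example.

```python
import html
import io
from urllib.parse import parse_qs

def find(environ: dict, stdin, stdout) -> None:
    """A CGI-style program: read the request, write header and body to stdout."""
    if environ.get("REQUEST_METHOD") == "POST":
        length = int(environ.get("CONTENT_LENGTH", "0"))
        raw = stdin.read(length)                  # POST: form data arrives on STDIN
    else:
        raw = environ.get("QUERY_STRING", "")     # GET: form data is in the environment
    search = parse_qs(raw).get("search", [""])[0]
    stdout.write("Content-type: text/html\n\n")   # header, then a blank line
    stdout.write(f"<title>Search results</title>Results for {html.escape(search)}")

post_output = io.StringIO()
find({"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "26"},
     io.StringIO("search=herbert&type=author"), post_output)

get_output = io.StringIO()
find({"REQUEST_METHOD": "GET", "QUERY_STRING": "search=herbert&type=author"},
     io.StringIO(""), get_output)
```

When run as an actual CGI program, the same function would be called with `os.environ`, `sys.stdin` and `sys.stdout`.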
23 herbert van de sompel
Dynamic content: Mobile code - JavaScript
[Diagram: the web server (HTTP server) at no.good.com returns an HTTP response containing HTML and JavaScript to the web browser (HTTP client)]
• Executed by the browser
• User interface, client-side validation, …
24 herbert van de sompel
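Dynamic content: Mobile code – Java applets
[Diagram: the web server (HTTP server) at no.good.com returns an HTTP response containing Java code to the web browser (HTTP client); the applet then communicates with the program find directly]
• Executed by a virtual machine
• Interaction with the program find is not via HTTP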
25 herbert van de sompel
Want to read a bit more?
• on Web Characterization http://www.w3.org/1999/05/WCA-terms/01
• on CGI http://www.ukans.edu/~acs/docs/other/forms-intro.shtml
• on Web, TCP/IP, CGI http://www.wdvl.com/Authoring/Tools/Tutorial/index4.html
• on HTTP http://www.ietf.org/rfc/rfc1945.txt?number=1945 ; http://www.jmarshall.com/easy/http/