Lecture 9, 20-755: The Internet, Summer 1999 1
20-755: The Internet
Lecture 9: Web Services II
David O’Hallaron
School of Computer Science and
Department of Electrical and Computer Engineering
Carnegie Mellon University
Institute for eCommerce, Summer 1999
Today’s lecture
• Dynamic content background (35 min)
• Break (10 min)
• Serving dynamic content with GET and POST (40 min)
How programs run other programs
• Recall that a process is an instance of a running program.
• Suppose a process A, which is running program foo, wants to run the program bar.
• Two-step procedure:
– First, process A creates a new process B that is a clone of A.
» A and B are independent processes running concurrently on the machine.
» A is the parent, B is the child.
» Each has a unique process id (pid)
– Second, process B recognizes that it is a clone, overwrites foo with bar, and transfers control to the first instruction in bar.
How programs run other programs
• Initially, foo is running in process A with process id (pid) of 325.
[Figure: Process A (pid = 325) running foo.]
How programs run other programs
• Next, program foo running in process A clones a copy of itself.
• So now we have two identical independent processes (A and B) running the same code.
• A can wait immediately for B to complete, or do other work in the meantime.
[Figure: Process A (pid = 325) and Process B (pid = 326), both running foo.]
How programs run other programs
• The instance of foo in process B recognizes that it is a clone.
• The foo in process B replaces its own code with the code for bar.
[Figure: Process A (pid = 325) running foo; Process B (pid = 326) now running bar.]
How programs run other programs
• pid = fork()
– creates a clone of the current process.
– returns a 0 to the child process.
– returns the positive integer process ID of the child to the parent.
• exec(objfile)
– replaces the current running program with the code in the executable file objfile.
– exec never returns to the caller unless there is an error.
» e.g., if it can’t locate objfile.
How programs run other programs
# This is how program foo running in process A
# runs program bar in a new process B

# the parent executes this statement
$child_pid = fork();
die "can't fork: $!" unless defined $child_pid;

# both parent and child run the if statement
if ($child_pid == 0) {
    # Only the child executes this code
    print "I'm the child\n";
    exec("bar");
    # the child only gets to this point if the
    # exec fails
    die "can't exec bar: $!";
}
# the parent continues here
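The fork/exec pattern leaves open what the parent does after the child starts. The following is a minimal sketch, assuming the parent wants to wait for the child and look at its exit status. The helper name run_in_child is hypothetical, and the child runs the current Perl interpreter ($^X) as a stand-in for bar so that the example is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# run_in_child is a hypothetical helper, not part of the lecture code.
# It forks, execs the given command in the child, waits for the child
# in the parent, and returns the child's exit status.
sub run_in_child {
    my @cmd = @_;

    my $child_pid = fork();
    die "can't fork: $!" unless defined $child_pid;

    if ($child_pid == 0) {
        # Only the child executes this code.
        exec(@cmd) or die "can't exec $cmd[0]: $!";
    }

    # Only the parent gets here: block until the child terminates.
    waitpid($child_pid, 0);
    return $? >> 8;    # the high byte of $? holds the exit status
}

# $^X is the path of the running Perl interpreter; here it stands in for bar.
my $status = run_in_child($^X, "-e", "exit 3");
print "child exited with status $status\n";
```

A parent that prefers to do other work in the meantime could postpone the waitpid call instead of waiting immediately.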
Perl abstractions for fork and exec
• backquote operator
– $output = `foo`;
» runs the executable program foo and returns the contents of its STDOUT in variable $output.
• system command
– system("foo", $arg1, $arg2);
» runs the executable program foo with arguments $arg1 and $arg2.
» output goes to wherever STDOUT is currently going (e.g., the screen)
– system("$prog > mydate.txt");
» runs the command in $prog via the shell and redirects its output to file mydate.txt
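A short sketch of the three forms side by side, assuming a Unix-like system where the echo command is available. The file name out.txt is only an example and is deleted at the end.

```perl
use strict;
use warnings;

$| = 1;    # unbuffered STDOUT so our output and the child's stay in order

# Backquotes: run a command and capture its STDOUT in a variable.
my $output = `echo hello`;
chomp $output;                       # drop the trailing newline
print "captured: $output\n";

# system with a list: run a command with arguments, bypassing the shell.
# The child's output goes wherever our STDOUT is currently going.
my $rc = system("echo", "hello", "again");
print "system returned $rc\n";       # 0 means the command succeeded

# system with a single string: the shell interprets ">" as redirection.
system("echo redirected > out.txt") == 0
    or die "command failed: $?";

# Read the file back to confirm where the output went, then clean up.
open(my $fh, "<", "out.txt") or die "can't open out.txt: $!";
my $line = <$fh>;
close($fh);
unlink("out.txt");
print "file contained: $line";
```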
How programs pass info to the programs they create
• Command line arguments
– the exec operator can pass a list of ASCII arguments to the program that it runs
» exec("foo.pl", "dave", "ohallaron");
#!/usr/local/bin/perl5 -w
# Array @ARGV holds the arguments.
# In a numeric context, @ARGV evaluates to the number of array elements
# $0 is the name of the perl script (foo.pl)
# $ARGV[0] is the first array element (argument)
# $ARGV[1] is the second array element (argument)
if (@ARGV != 2) {
    print "usage: $0 first last\n";
    exit;
}
print "arg0 = $ARGV[0]\n";  # dave
print "arg1 = $ARGV[1]\n";  # ohallaron
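To see the parent's side of argument passing, here is a small sketch that runs a child with two arguments and reads back what the child received. It assumes a Unix-like system and a Perl recent enough to support list-form pipe opens (5.8 or later). The one-line child program is hypothetical and stands in for foo.pl.

```perl
use strict;
use warnings;

# Hypothetical one-line child program: print the number of arguments,
# a colon, and then the arguments separated by commas.
my $child_code = 'print scalar(@ARGV), ":", join(",", @ARGV)';

# List-form pipe open forks, execs the child with the given argument
# list (no shell involved), and lets us read the child's STDOUT.
open(my $from_child, "-|", $^X, "-e", $child_code, "dave", "ohallaron")
    or die "can't run child: $!";
my $got = <$from_child>;
close($from_child);

print "child saw: $got\n";
```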
How programs pass info to the programs they create
• Environment variables
– Each process maintains a set of “environment variables”
» list of ASCII (name,value) pairs.
» represent long term conditions or preferences.
– A forked process gets an exact duplicate of the parent’s environment variables.
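The inheritance rule can be checked with a small experiment. This is a sketch that assumes a Unix-like system. The variable name IPADDR and its value are example data, and the child is a Perl one-liner run through the current interpreter ($^X).

```perl
use strict;
use warnings;

# Set a variable in this (parent) process's environment.
$ENV{IPADDR} = "128.1.194.242";

# The child is forked from us, so it starts with a copy of our environment.
open(my $from_child, "-|", $^X, "-e", 'print $ENV{IPADDR}')
    or die "can't run child: $!";
my $seen = <$from_child>;
close($from_child);
print "child sees IPADDR=$seen\n";

# The copy is one-way: a change made in a child does not come back.
system($^X, "-e", '$ENV{IPADDR} = "changed"');
print "parent still has IPADDR=$ENV{IPADDR}\n";
```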
Unix shell environment variables
% printenv
PWD=/usr/droh/afs/
TERM=emacs
EMACS=t
MANPATH=/usr/man:/usr/local/man:/usr/local/apache/man:/usr/X11R6/man
PRINTER=iron
login_done=1
HOSTNAME=kittyhawk.cmcl.cs.cmu.edu
HOSTTYPE=i386_linux3
HOST=kittyhawk.cmcl.cs.cmu.edu
SHLVL=2
KRBTKFILE=/tkt/3478-030d-379b6ada
PATH=.:/usr/droh/bin:/usr/sbin:/sbin:/usr/local/apache/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/etc:/etc:/usr/X11R6/bin
USER=droh
SHELL=/usr/local/bin/tcsh
HOME=/usr/droh
Accessing environment variables from Perl
• Environment variables are stored in a special hash called “%ENV”
# sort and list the environment variables
foreach $key (sort keys %ENV) {
    print "$key=$ENV{$key}\n";
}

# add a new (key,value) pair to the environment hash
$ENV{"IPADDR"} = "128.1.194.242";

# delete a (key,value) pair from the environment hash
delete $ENV{"IPADDR"};
Serving dynamic content
• Client sends request to server.
• If request URI contains the string “/cgi-bin”, then the server assumes that the request is for dynamic content.
[Figure: the client sends the request line GET /cgi-bin/env.pl HTTP/1.1 to the server.]
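The "/cgi-bin" rule can be sketched as a small test on the request line. The helper name is_dynamic and the static URI /index.html are hypothetical, and a real server may apply a more elaborate rule.

```perl
use strict;
use warnings;

# is_dynamic is a hypothetical helper: given an HTTP request line, it
# returns 1 if the URI contains "/cgi-bin" and 0 otherwise.
sub is_dynamic {
    my ($request_line) = @_;
    my ($method, $uri, $version) = split(' ', $request_line);
    return 0 unless defined $uri;
    return index($uri, "/cgi-bin") >= 0 ? 1 : 0;
}

for my $line ("GET /cgi-bin/env.pl HTTP/1.1", "GET /index.html HTTP/1.1") {
    printf "%-30s => %s\n", $line, is_dynamic($line) ? "dynamic" : "static";
}
```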
Serving dynamic content
• The server creates a child process and runs the program identified by the URI in that process.
[Figure: the server uses fork/exec to run env.pl in a child process.]
Serving dynamic content
• The child runs and generates the dynamic content.
• The server captures the content of the child and forwards it without modification to the client.
[Figure: content flows from env.pl to the server, and from the server to the client.]
Serving dynamic content
• The child terminates.
• Server waits for the next client request.
Issues in serving dynamic content
• How does the client pass program arguments to the server?
• How does the server pass these arguments to the child?
• How does the server pass other info relevant to the request to the child?
• How does the server capture the content produced by the child?
• These issues are addressed by the Common Gateway Interface (CGI) specification.
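[Figure: the client sends a request to the server, the server creates the child env.pl, and content flows from the child through the server to the client.]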
Break time!
CGI
• Because the children are written according to the CGI spec, they are often called CGI programs.
– Because many CGI programs are written in Perl, they are often called CGI scripts.
• However, CGI really defines a simple standard for passing information between the client (browser), the server, and the child process.
add.com: THE Internet addition service!
• Ever needed to add two numbers together and you just can’t find your calculator?
• Try Dr. Dave’s addition service at add.com!
– Takes as input your name, and two numbers you want to add together.
– Returns their sum in a tasteful personalized message.
– After the IPO we’ll expand to multiplication!
Serving dynamic content with GET
• Question: How does the client pass arguments to the server?
• Answer: The arguments are appended to the URI.
• Can be encoded directly in a URL typed into a browser or a URL in an HTML link
– http://add.com/cgi-bin/add.pl?Dave+O'Hallaron&1&2
– add.pl is the program on the server that will do the addition.
– argument list starts with “?”
– arguments separated by “&”
– spaces represented by “+”
• Can also be generated by an HTML form
<form method=get action="http://add.com/cgi-bin/post.pl">
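The URL conventions for GET can be sketched as a small helper. The name build_get_url is hypothetical. It applies only the three rules listed on this slide and leaves every other character untouched.

```perl
use strict;
use warnings;

# build_get_url is a hypothetical helper that follows the conventions
# on this slide: "?" starts the argument list, "&" separates arguments,
# and "+" stands for a space.
sub build_get_url {
    my ($base, @args) = @_;
    my @encoded = map { (my $arg = $_) =~ s/ /+/g; $arg } @args;
    return $base . "?" . join("&", @encoded);
}

print build_get_url("http://add.com/cgi-bin/add.pl", "Dave O'Hallaron", 1, 2), "\n";
# prints http://add.com/cgi-bin/add.pl?Dave+O'Hallaron&1&2
```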
Serving dynamic content with GET
• URL:
– http://add.com/cgi-bin/add.pl?Dave+O'Hallaron&1&2
• Result:
Mr. Dave O'Hallaron, Welcome to add.com!
The answer is: 1 + 2 = 3
Please come again soon! Tell your friends!
Serving dynamic content with GET
• Question: How does the server pass these arguments to the child?
• Answer: In environment variable QUERY_STRING
– a single string containing everything after the “?”
– for add.com: QUERY_STRING = “Dave+O’Hallaron&1&2”
#
# Child code that parses the add.com arguments
#
$args = $ENV{QUERY_STRING};
$args =~ s/\+/ /g;    # replaces every + with " "
($name, $a1, $a2) = split(/&/, $args);
Serving dynamic content with GET
• Question: How does the server pass other info relevant to the request to the child?
• Answer: in a collection of environment variables defined by the CGI spec.
Some CGI environment variables
• General
– SERVER_SOFTWARE
– SERVER_NAME
– GATEWAY_INTERFACE (CGI version)
• Request specific
– SERVER_PORT
– REQUEST_METHOD (GET, POST, etc.)
– QUERY_STRING (contains args)
– REMOTE_HOST (domain name of client)
– REMOTE_ADDR (IP address of client)
– CONTENT_TYPE (for POST, type of data in message body, e.g., text/html)
– CONTENT_LENGTH (for POST, length of message body in bytes)
Some CGI environment variables
• In addition, the value of each request header received from the client is placed in environment variable HTTP_type, where type is the header name in upper case
– Examples:
» HTTP_ACCEPT
» HTTP_HOST
» HTTP_USER_AGENT (any “-” is changed to “_”)
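The naming rule for header variables can be sketched in a few lines. The helper name header_to_env is hypothetical.

```perl
use strict;
use warnings;

# header_to_env is a hypothetical helper: it maps an HTTP request
# header name to the CGI environment variable that carries its value
# (upper case, "-" changed to "_", prefixed with "HTTP_").
sub header_to_env {
    my ($header) = @_;
    (my $name = uc $header) =~ tr/-/_/;
    return "HTTP_$name";
}

for my $header ("Accept", "Host", "User-Agent") {
    print "$header => ", header_to_env($header), "\n";
}
```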
Serving dynamic content with GET
• Question: How does the server capture the content produced by the child?
• Answer: The child writes its content to stdout.
#
# server code that runs child and captures stdout
#

# run the child and put its dynamic content in $child_output
$child_output = `add.pl`;

# send the child’s dynamic content back to the client
$connfd->print($child_output);
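The server-side steps for GET can be combined into one sketch: split the URI at "?", publish the arguments in QUERY_STRING, run the child, and capture its stdout. The helper name run_cgi is hypothetical, the child is a stand-in Perl one-liner rather than a real CGI program, and a Unix-like system is assumed.

```perl
use strict;
use warnings;

# run_cgi is a hypothetical helper sketching the server side of a GET
# request. It returns whatever the child wrote to STDOUT.
sub run_cgi {
    my ($uri, @child_cmd) = @_;
    my ($path, $query) = split(/\?/, $uri, 2);

    # local restores the server's own environment when we return.
    local $ENV{REQUEST_METHOD} = "GET";
    local $ENV{QUERY_STRING}   = defined $query ? $query : "";

    # The child inherits the environment set above.
    open(my $from_child, "-|", @child_cmd) or die "can't run child: $!";
    local $/;                        # slurp mode: read all output at once
    my $child_output = <$from_child>;
    close($from_child);
    return $child_output;
}

# Stand-in child: a Perl one-liner that echoes QUERY_STRING. A real
# server would run the program named by $path instead.
my @child = ($^X, "-e", 'print "Content-type: text/plain\n\n$ENV{QUERY_STRING}\n"');
print run_cgi("/cgi-bin/add.pl?Dave+O'Hallaron&1&2", @child);
```

A real server would then write the captured output to the client connection without modifying it.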
Putting it all together: The CGI script for GET requests to add.com
#!/usr/local/bin/perl5
$args = $ENV{QUERY_STRING};
$args =~ s/\+/ /g;
($name, $a1, $a2) = split(/&/, $args);
print "Content-type: text/html\n\n";
print "<html><head></head><body>\n";
print "<h3>Mr. $name, Welcome to add.com!</h3>\n";
print "<b>The answer is: $a1 + $a2 = ", $a1+$a2, "</b><br>\n";
print "<p><i>Please come again soon! Tell your friends!</i>\n";
print "</body></html>\n";
Serving dynamic content with POST
• More complicated and less general than GET
• Less frequently used because of the complexity.
• Only advantage is that it provides arbitrary-length argument lists
– older browsers and servers had unnecessary limits on URI lengths in GET requests
– doesn’t seem to be a problem anymore
Serving dynamic content with POST
• Question: How does the client pass arguments to the server?
• Answer: In the message body of the HTTP request generated by a form.
– space converted to “+”
– punctuation converted to “%asciihexvalue”
» e.g., apostrophe becomes “%27”
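The two encoding rules can be sketched as follows. The helper name url_encode is hypothetical, and the sketch is stricter than real browsers, which leave a few punctuation characters unencoded.

```perl
use strict;
use warnings;

# url_encode is a hypothetical helper that applies the two rules on
# this slide to a single form value.
sub url_encode {
    my ($text) = @_;
    # Encode everything except letters, digits, and spaces as %XX,
    # where XX is the character's ASCII code in hex.
    $text =~ s/([^A-Za-z0-9 ])/sprintf("%%%02X", ord($1))/ge;
    # Then convert each space to "+".
    $text =~ s/ /+/g;
    return $text;
}

print url_encode("Dave O'Hallaron"), "\n";    # prints Dave+O%27Hallaron
```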
add.com HTML form (form.html)
<html>
<body>
<form method=post action="http://add.com/cgi-bin/post.pl">
<p>Name <input name="name" type=text SIZE="48">
<p>num1 <input name="num1" type=text SIZE="6">
<p>num2 <input name="num2" type=text SIZE="6">
<p><input type=submit> <input type=reset>
</form>
</body>
</html>
HTTP request generated by add.com form
POST /cgi-bin/post.pl HTTP/1.1
Accept: */*
Referer: http://add.com/form.html
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)
Host: add.com
Content-Length: 36
CRLF
name=Dave+O%27Hallaron&num1=1&num2=2
Serving dynamic content with POST
• Question: How does the server pass the arguments to the child?
• Answer: Arguments are passed as one line via stdin.
– CONTENT_LENGTH tells the child how many bytes to read.
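On the child's side, the POST body has to be split into fields and decoded. The following is a minimal sketch. The helper names url_decode and parse_form are hypothetical, and the body from the add.com form is hard-coded so the example runs without a server.

```perl
use strict;
use warnings;

# url_decode reverses the form encoding: "+" back to a space, and
# %XX back to the character with that hex code.
sub url_decode {
    my ($text) = @_;
    $text =~ s/\+/ /g;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# parse_form splits a form-encoded body into a hash of decoded
# (name, value) pairs.
sub parse_form {
    my ($body) = @_;
    my %form;
    for my $pair (split(/&/, $body)) {
        my ($name, $value) = split(/=/, $pair, 2);
        $form{url_decode($name)} = url_decode(defined $value ? $value : "");
    }
    return %form;
}

# In a real CGI child the body would come from STDIN:
#   read(STDIN, my $body, $ENV{CONTENT_LENGTH});
my %form = parse_form("name=Dave+O%27Hallaron&num1=1&num2=2");
print "$form{name}: $form{num1} + $form{num2} = ", $form{num1} + $form{num2}, "\n";
```

Decoding after splitting matters: an encoded "&" or "=" inside a value would otherwise be mistaken for a separator.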
Serving dynamic content with POST
• Question: How does the server pass other info relevant to the request to the child?
• Answer: As with GET, in a collection of environment variables defined by the CGI spec.
• Question: How does the server capture the content produced by the child?
• Answer: As with GET, via stdout.