
© 1999 Elliotte Rusty Harold 04/19/23

URLs, InetAddresses, and URLConnections

High Level Network Programming

Elliotte Rusty Harold

elharo@metalab.unc.edu

http://metalab.unc.edu/javafaq/slides/


We will learn how Java handles

• Internet Addresses

• URLs

• CGI

• URLConnection

• Content and Protocol handlers


I assume you

• Understand basic Java syntax and I/O

• Have a user’s view of the Internet

• Have no prior network programming experience


Applet Network Security Restrictions

• Applets may:
  – send data to the code base
  – receive data from the code base

• Applets may not:
  – send data to hosts other than the code base
  – receive data from hosts other than the code base


Some Background

• Hosts

• Internet Addresses

• Ports

• Protocols


Hosts

• Devices connected to the Internet are called hosts

• Most hosts are computers, but hosts also include routers, printers, fax machines, soda machines, bat houses, etc.


Internet addresses

• Every host on the Internet is identified by a unique, four-byte Internet Protocol (IP) address.

• This is written in dotted quad format like 199.1.32.90 where each byte is an unsigned integer between 0 and 255.

• There are about four billion unique IP addresses, but they aren’t very efficiently allocated
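
The dotted quad format can be sketched in a few lines of Java. This is only an illustration: the class name DottedQuad and the method format() are made up for this example. The one real trap is that Java bytes are signed, so each byte has to be masked to get a value between 0 and 255.

```java
class DottedQuad {

    // Formats four raw address bytes as a dotted quad such as 199.1.32.90.
    // DottedQuad and format() are hypothetical names used only for this sketch.
    static String format(byte[] address) {
        if (address.length != 4) {
            throw new IllegalArgumentException("expected exactly four bytes");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < address.length; i++) {
            if (i > 0) sb.append('.');
            // Java bytes are signed (-128 to 127); masking yields 0 to 255.
            sb.append(address[i] & 0xFF);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 199, 1, 32, 90};
        System.out.println(format(raw));
    }
}
```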


Domain Name System (DNS)

• Numeric addresses are mapped to names like "www.blackstar.com" or "star.blackstar.com" by DNS.

• Each site runs domain name server software that translates names to IP addresses and vice versa

• DNS is a distributed system


The InetAddress Class

• The java.net.InetAddress class represents an IP address.

• It converts numeric addresses to host names and host names to numeric addresses.

• It is used by other network classes like Socket and ServerSocket to identify hosts


Creating InetAddresses

• There are no public InetAddress() constructors. Arbitrary addresses may not be created.

• All addresses that are created must be checked with DNS


The getByName() factory method

public static InetAddress getByName(String host) throws UnknownHostException

InetAddress utopia, duke;
try {
  utopia = InetAddress.getByName("utopia.poly.edu");
  duke = InetAddress.getByName("128.238.2.92");
}
catch (UnknownHostException e) {
  System.err.println(e);
}


Other ways to create InetAddress objects

public static InetAddress[] getAllByName(String host) throws UnknownHostException

public static InetAddress getLocalHost() throws UnknownHostException


Getter Methods

• public boolean isMulticastAddress()
• public String getHostName()
• public byte[] getAddress()
• public String getHostAddress()
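
A minimal sketch of the getter methods in use. The class name InetAddressGetters and its describe() helper are invented for this example. It passes numeric address literals to getByName(), which only checks the format and should not need a DNS lookup. getHostName() is left out on purpose because it may trigger a reverse lookup over the network.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class InetAddressGetters {

    // Summarizes an address given as a numeric literal such as "128.238.2.92".
    // InetAddressGetters and describe() are hypothetical names for this sketch.
    static String describe(String literal) {
        try {
            InetAddress address = InetAddress.getByName(literal);
            byte[] raw = address.getAddress();  // raw bytes in network byte order
            return address.getHostAddress()
                + " (" + raw.length + " bytes, multicast="
                + address.isMulticastAddress() + ")";
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a usable address: " + literal, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("128.238.2.92"));
        System.out.println(describe("224.0.0.1"));  // inside the multicast range
    }
}
```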


Utility Methods

• public int hashCode()
• public boolean equals(Object o)
• public String toString()


Ports

• In general a host has only one Internet address

• This address is subdivided into 65,536 ports

• Ports are logical abstractions that allow one host to communicate simultaneously with many other hosts

• Many services run on well-known ports. For example, http tends to run on port 80


Protocols

• A protocol defines how two hosts talk to each other.

• The daytime protocol, RFC 867, specifies an ASCII representation for the time that's legible to humans.

• The time protocol, RFC 868, specifies a binary representation for the time that's legible to computers.

• There are thousands of protocols, standard and non-standard


IETF RFCs

• Requests For Comment

• Document how much of the Internet works

• Various status levels from obsolete to required to informational

• TCP/IP, telnet, SMTP, MIME, HTTP, and more

• http://www.faqs.org/rfc/


W3C Standards

• IETF is based on “rough consensus and running code”

• W3C tries to run ahead of implementation

• IETF is an informal organization open to participation by anyone

• W3C is a vendor consortium open only to companies


W3C Standards

• HTTP

• HTML

• XML

• RDF

• MathML

• SMIL

• P3P


URLs

• A URL, short for "Uniform Resource Locator", is a way to unambiguously identify the location of a resource on the Internet.


Example URLs

http://java.sun.com/
file:///Macintosh%20HD/Java/Docs/JDK%201.1.1%20docs/api/java.net.InetAddress.html#_top_
http://www.macintouch.com:80/newsrecent.shtml
ftp://ftp.info.apple.com/pub/
mailto:[email protected]://utopia.poly.edu
ftp://mp3:[email protected]:21000/c%3a/stuff/mp3/
http://[email protected]/
http://metalab.unc.edu/nywc/comps.phtml?category=Choral+Works


The Pieces of a URL

• the protocol, aka scheme

• the authority
  – user info
    · user name
    · password
  – host name or address
  – port

• the path, aka file

• the ref, aka section or anchor

• the query string


The java.net.URL class

• A URL object represents a URL.

• The URL class contains methods to
  – create new URLs
  – parse the different parts of a URL
  – get an input stream from a URL so you can read data from a server
  – get content from the server as a Java object


Content and Protocol Handlers

• Content and protocol handlers separate the data being downloaded from the protocol used to download it.

• The protocol handler negotiates with the server and parses any headers. It gives the content handler only the actual data of the requested resource.

• The content handler translates those bytes into a Java object like an InputStream or ImageProducer.


Finding Protocol Handlers

• When the virtual machine creates a URL object, it looks for a protocol handler that understands the protocol part of the URL such as "http" or "mailto".

• If no such handler is found, the constructor throws a MalformedURLException.
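
The handler lookup can be probed without touching the network, since constructing a URL does not open a connection. The sketch below is illustrative only: ProtocolCheck, hasHandler(), the host example.invalid and the protocol name nosuchprotocol are all made up for the example. On recent JDKs the URL constructors are deprecated but should still compile.

```java
import java.net.MalformedURLException;
import java.net.URL;

class ProtocolCheck {

    // Returns true if this virtual machine can find a protocol handler
    // for the given scheme. Hypothetical helper written for this sketch.
    static boolean hasHandler(String protocol) {
        try {
            // Only the URL object is built; no connection is made.
            new URL(protocol + "://example.invalid/");
            return true;
        } catch (MalformedURLException e) {
            // Thrown when no handler understands the protocol.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] protocols = {"http", "file", "mailto", "nosuchprotocol"};
        for (String p : protocols) {
            System.out.println(p + ": " + hasHandler(p));
        }
    }
}
```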


Supported Protocols

• The exact protocols that Java supports vary from implementation to implementation, though http and file are supported pretty much everywhere. Sun's JDK 1.1 understands ten:
  – file
  – ftp
  – gopher
  – http
  – mailto
  – appletresource
  – doc
  – netdoc
  – systemresource
  – verbatim


URL Constructors

• There are four (six in 1.2) constructors in the java.net.URL class.

public URL(String u) throws MalformedURLException

public URL(String protocol, String host, String file) throws MalformedURLException

public URL(String protocol, String host, int port, String file) throws MalformedURLException

public URL(URL context, String url) throws MalformedURLException

public URL(String protocol, String host, int port, String file, URLStreamHandler handler) throws MalformedURLException

public URL(URL context, String url, URLStreamHandler handler) throws MalformedURLException


Constructing URL Objects

• An absolute URL like http://www.poly.edu/fall97/grad.html#cs

try {
  URL u = new URL("http://www.poly.edu/fall97/grad.html#cs");
}
catch (MalformedURLException e) {}


Constructing URL Objects in Pieces

• You can also construct the URL by passing its pieces to the constructor, like this:

URL u = null;
try {
  u = new URL("http", "www.poly.edu", "/schedule/fall97/bgrad.html#cs");
}
catch (MalformedURLException e) {}


Including the Port

URL u = null;
try {
  u = new URL("http", "www.poly.edu", 8000, "/fall97/grad.html#cs");
}
catch (MalformedURLException e) {}


Relative URLs

• Many HTML files contain relative URLs.

• Consider the page http://metalab.unc.edu/javafaq/index.html

• On this page a link to "books.html" refers to http://metalab.unc.edu/javafaq/books.html.


Constructing Relative URLs

• The fourth constructor creates URLs relative to a given URL. For example,

try {
  URL u1 = new URL("http://metalab.unc.edu/index.html");
  URL u2 = new URL(u1, "books.html");
}
catch (MalformedURLException e) {}

• This is particularly useful when parsing HTML.
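
As a rough illustration of how relative links resolve, the sketch below feeds a few link texts to the URL(URL context, String url) constructor. The class RelativeUrlDemo and its resolve() helper are invented for the example. The base page is the javafaq index page mentioned earlier.

```java
import java.net.MalformedURLException;
import java.net.URL;

class RelativeUrlDemo {

    // Resolves a link found on the page at base, the way a browser would.
    // RelativeUrlDemo and resolve() are hypothetical names for this sketch.
    static String resolve(String base, String link) {
        try {
            URL context = new URL(base);
            return new URL(context, link).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String page = "http://metalab.unc.edu/javafaq/index.html";
        System.out.println(resolve(page, "books.html"));            // same directory
        System.out.println(resolve(page, "/nywc/"));                // relative to the server root
        System.out.println(resolve(page, "http://java.sun.com/"));  // absolute links ignore the context
    }
}
```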


Parsing URLs

• The java.net.URL class has five methods to split a URL into its component parts. These are:

public String getProtocol()
public String getHost()
public int getPort()
public String getFile()
public String getRef()


For example,

try {
  URL u = new URL("http://www.poly.edu/fall97/grad.html#cs");
  System.out.println("The protocol is " + u.getProtocol());
  System.out.println("The host is " + u.getHost());
  System.out.println("The port is " + u.getPort());
  System.out.println("The file is " + u.getFile());
  System.out.println("The anchor is " + u.getRef());
}
catch (MalformedURLException e) { }


Parsing URLs

• JDK 1.3 adds three more:

public String getAuthority()
public String getUserInfo()
public String getQuery()


Missing Pieces

• If a port is not explicitly specified in the URL it's set to -1. This means the default port is to be used.

• If the ref doesn't exist, it's just null, so watch out for NullPointerExceptions. Better yet, test to see that it's non-null before using it.

• If the file is left off completely, e.g. http://java.sun.com, then it's set to "/" in JDK 1.1 and 1.2. Later versions return an empty string instead, so test for both.
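
A defensive way to handle the missing pieces might look like the sketch below. UrlPieces and describe() are invented names. The fallback to getDefaultPort() is an assumption beyond these slides: that method was added to java.net.URL in a later JDK (1.4) and returns the protocol's default port.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlPieces {

    // Splits a URL into its parts, substituting safe values for missing pieces.
    // UrlPieces and describe() are hypothetical names for this sketch.
    static String describe(String spec) {
        try {
            URL u = new URL(spec);
            int port = u.getPort();
            if (port == -1) {
                port = u.getDefaultPort();  // assumption: JDK 1.4 or later
            }
            String ref = u.getRef();
            if (ref == null) {
                ref = "(no ref)";           // avoids a NullPointerException later
            }
            return u.getProtocol() + " | " + u.getHost() + " | " + port
                + " | " + u.getFile() + " | " + ref;
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("http://www.poly.edu/fall97/grad.html#cs"));
        System.out.println(describe("http://www.poly.edu:8000/fall97/grad.html"));
    }
}
```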


Reading Data from a URL

• The openStream() method connects to the server specified in the URL and returns an InputStream object fed by the data from that connection.

public final InputStream openStream() throws IOException

• Any headers that precede the actual data are stripped off before the stream is opened.

• Network connections are less reliable and slower than files. Buffer with a BufferedReader or a BufferedInputStream.


Webcat

import java.net.*;
import java.io.*;

public class Webcat {
  public static void main(String[] args) {
    for (int i = 0; i < args.length; i++) {
      try {
        URL u = new URL(args[i]);
        InputStream in = u.openStream();
        InputStreamReader isr = new InputStreamReader(in);
        BufferedReader br = new BufferedReader(isr);
        String theLine;
        while ((theLine = br.readLine()) != null) {
          System.out.println(theLine);
        }
      }
      catch (IOException e) { System.err.println(e); }
    }
  }
}


The Bug in readLine()

• What readLine() does:
  – Sees a carriage return, waits to see if next character is a line feed before returning

• What readLine() should do:
  – Sees a carriage return, returns, and throws away the next character if it's a linefeed


Webcat

import java.net.*;
import java.io.*;

public class Webcat {
  public static void main(String[] args) {
    for (int i = 0; i < args.length; i++) {
      try {
        URL u = new URL(args[i]);
        InputStream in = u.openStream();
        InputStreamReader isr = new InputStreamReader(in);
        // read() returns an int so that -1 can signal end of stream
        int c;
        while ((c = isr.read()) != -1) {
          System.out.print((char) c);
        }
      }
      catch (IOException e) { System.err.println(e); }
    }
  }
}


CGI

• Common Gateway Interface

• A lot is written about writing server side CGI. I’m going to show you client side CGI.

• We’ll need to explore HTTP a little deeper to do this


Normal web surfing uses these two steps:

– The browser requests a page
– The server sends the page

• Data flows primarily from the server to the client.


Forms

• There are times when the server needs to get data from the client rather than the other way around. The common way to do this is with a form like this one:


CGI

• The user types the requested data into the form and hits the submit button.

• The client browser then sends the data to the server using the Common Gateway Interface, CGI for short.

• CGI uses the HTTP protocol to transmit the data, either as part of the query string or as separate data following the MIME header.


GET and POST

• When the data is sent as a query string included with the file request, this is called CGI GET.

• When the data is sent as data attached to the request following the MIME header, this is called CGI POST


HTTP

• Web browsers communicate with web servers through a standard protocol known as HTTP, an acronym for HyperText Transfer Protocol.

• This protocol defines
  – how a browser requests a file from a web server
  – how a browser sends additional data along with the request (e.g. the data formats it can accept)
  – how the server sends data back to the client
  – response codes


A Typical HTTP Connection

– Client opens a socket to port 80 on the server.
– Client sends a GET request including the name and path of the file it wants and the version of the HTTP protocol it supports.
– The client sends a MIME header.
– The client sends a blank line.
– The server sends a MIME header.
– The server sends the data in the file.
– The server closes the connection.


What the client sends to the server

GET /javafaq/images/cup.gif
Connection: Keep-Alive
User-Agent: Mozilla/3.01 (Macintosh; I; PPC)
Host: www.oreilly.com:80
Accept: image/gif, image/x-xbitmap, image/jpeg, */*


MIME

• MIME is an acronym for "Multipurpose Internet Mail Extensions".

• an Internet standard defined in RFCs 2045 through 2049

• originally intended for use with email messages, but has been adopted for use in HTTP.


Browser Request MIME Header

• When the browser sends a request to a web server, it also sends a MIME header.

• MIME headers contain name-value pairs, essentially a name followed by a colon and a space, followed by a value.

Connection: Keep-Alive
User-Agent: Mozilla/3.01 (Macintosh; I; PPC)
Host: www.digitalthink.com:80
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*


Server Response MIME Header

• When a web server responds to a web browser it sends a MIME header along with the response that looks something like this:

Server: Netscape-Enterprise/2.01
Date: Sat, 02 Aug 1997 07:52:46 GMT
Accept-ranges: bytes
Last-modified: Tue, 29 Jul 1997 15:06:46 GMT
Content-length: 2810
Content-type: text/html


Query Strings

• CGI GET data is sent in URL encoded query strings

• a query string is a set of name=value pairs separated by ampersands

Author=Sadie, Julie&Title=Women Composers

• separated from rest of URL by a question mark


URL Encoding

• Alphanumeric ASCII characters (a-z, A-Z, and 0-9) and the $-_.!*'(), punctuation symbols are left unchanged.

• The space character is converted into a plus sign (+).

• Other characters (e.g. &, =, ^, #, %, {, and so on) are translated into a percent sign followed by the two hexadecimal digits corresponding to their numeric value.


For example,

• The comma is ASCII character 44 (decimal) or 2C (hex). Therefore if the comma appears as part of a URL it is encoded as %2C.

• The query string "Author=Sadie, Julie&Title=Women Composers" is encoded as:

Author=Sadie%2C+Julie&Title=Women+Composers


The URLEncoder class

• The java.net.URLEncoder class contains a single static method which encodes strings in x-www-form-urlencoded format:

URLEncoder.encode(String s)


For example,

String qs = "Author=Sadie, Julie&Title=Women Composers";
String eqs = URLEncoder.encode(qs);
System.out.println(eqs);

• This prints:

Author%3dSadie%2c+Julie%26Title%3dWomen+Composers


String eqs = "Author=" + URLEncoder.encode("Sadie, Julie");
eqs += "&";
eqs += "Title=";
eqs += URLEncoder.encode("Women Composers");

• This prints the properly encoded query string:

Author=Sadie%2c+Julie&Title=Women+Composers
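
The piece-by-piece approach generalizes to any number of fields. The sketch below is one possible helper, not part of java.net: the class QueryStringBuilder and its build() method are invented here. It assumes the two-argument URLEncoder.encode(String, String) overload from later JDKs, which takes a character encoding name. Recent JDKs emit uppercase hex digits (%2C rather than %2c); both forms mean the same thing.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryStringBuilder {

    // Encodes each name and value separately, then joins them with = and &,
    // so the separators themselves are never percent-escaped.
    // QueryStringBuilder and build() are hypothetical names for this sketch.
    static String build(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (sb.length() > 0) sb.append('&');
                sb.append(URLEncoder.encode(field.getKey(), "UTF-8"));
                sb.append('=');
                sb.append(URLEncoder.encode(field.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();  // keeps insertion order
        fields.put("Author", "Sadie, Julie");
        fields.put("Title", "Women Composers");
        System.out.println(build(fields));
    }
}
```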


The URLDecoder class

• In Java 1.2 the java.net.URLDecoder class contains a single static method which decodes strings in x-www-form-urlencoded format:

URLDecoder.decode(String s)
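
Decoding runs the same steps in reverse: split on the separators first, then decode each piece. The sketch below is a simplified parser for illustration only. QueryStringParser and parse() are invented names, and the two-argument URLDecoder.decode(String, String) overload is assumed from later JDKs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryStringParser {

    // Splits an encoded query string into decoded name/value pairs.
    // QueryStringParser and parse() are hypothetical names for this sketch.
    static Map<String, String> parse(String query) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = (eq < 0) ? pair : pair.substring(0, eq);
            String value = (eq < 0) ? "" : pair.substring(eq + 1);
            fields.put(decode(name), decode(value));
        }
        return fields;
    }

    // Turns + back into a space and %xx escapes back into characters.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);  // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("Author=Sadie%2C+Julie&Title=Women+Composers"));
    }
}
```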


GET URLs

String eqs = "Author=" + URLEncoder.encode("Sadie, Julie");
eqs += "&";
eqs += "Title=";
eqs += URLEncoder.encode("Women Composers");
try {
  URL u = new URL("http://www.superbooks.com/search.cgi?" + eqs);
  InputStream in = u.openStream();
  //...
}
catch (IOException e) {
  //...
}


URLConnections

• The java.net.URLConnection class is an abstract class that handles communication with different kinds of servers like ftp servers and web servers.

• Protocol specific subclasses of URLConnection handle different kinds of servers.

• By default, connections to HTTP URLs use the GET method.


URLConnections vs. URLs

• Can send output as well as read input

• Can post data to CGIs

• Can read headers from a connection


URLConnection five steps:

1. The URL is constructed.

2. The URL’s openConnection() method creates the URLConnection object.

3. The parameters for the connection and the request properties that the client sends to the server are set up.

4. The connect() method makes the connection to the server. (optional)

5. The response header information is read using getHeaderField().
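
The five steps can be tried without a network by pointing them at a file: URL, which is handled by a protocol-specific URLConnection subclass like any other. This is a sketch under assumptions: ConnectionSteps, fetch() and demo() are invented names, the temporary file stands in for a server resource, and readAllBytes() needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ConnectionSteps {

    // Walks through steps 2 to 5 for an already constructed URL (step 1).
    static String fetch(URL u) throws IOException {
        URLConnection uc = u.openConnection();  // step 2: create the connection object
        uc.setUseCaches(false);                 // step 3: set up connection parameters
        uc.connect();                           // step 4: connect (optional)
        int length = uc.getContentLength();     // step 5: read header information
        try (InputStream in = uc.getInputStream()) {
            String body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            return length + " bytes: " + body;
        }
    }

    // Creates a small temporary file and reads it back through a file: URL.
    static String demo() {
        try {
            Path p = Files.createTempFile("urlconnection-demo", ".txt");
            try {
                Files.write(p, "hello".getBytes(StandardCharsets.UTF_8));
                return fetch(p.toUri().toURL());  // step 1: construct the URL
            } finally {
                Files.deleteIfExists(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```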


I/O Across a URLConnection

• Data may be read from the connection in one of two ways
  – raw by using the input stream returned by getInputStream()
  – through a content handler with getContent().

• Data can be sent to the server using the output stream provided by getOutputStream().


For example,

try {
  URL u = new URL("http://www.sd99.com/");
  URLConnection uc = u.openConnection();
  uc.connect();
  InputStream in = uc.getInputStream();
  // read the data...
}
catch (IOException e) {
  //...
}


Reading Header Data

• The getHeaderField(String name) method returns the string value of a named header field.

• Names are case-insensitive.

• If the requested field is not present, null is returned.

String lm = uc.getHeaderField("Last-modified");


getHeaderFieldKey()

• The keys of the header fields are returned by the getHeaderFieldKey(int n) method.

• The first field is 1.

• If a numbered key is not found, null is returned.

• You can use this in combination with getHeaderField() to loop through the complete header


For example

String key = null;
for (int i = 1; (key = uc.getHeaderFieldKey(i)) != null; i++) {
  System.out.println(key + ": " + uc.getHeaderField(key));
}


getHeaderFieldInt() and getHeaderFieldDate()

• These are utility methods that read a named header and convert its value into an int and a long respectively.

public int getHeaderFieldInt(String name, int Default)
public long getHeaderFieldDate(String name, long Default)


• The long returned by getHeaderFieldDate() can be converted into a Date object using a Date() constructor like this:

long ms = uc.getHeaderFieldDate("Last-modified", 0);
Date lm = new Date(ms);


Six Convenience Methods

• These return the values of six particularly common header fields:

public int getContentLength()
public String getContentType()
public String getContentEncoding()
public long getExpiration()
public long getDate()
public long getLastModified()


try {
  URL u = new URL("http://www.sdexpo.com/");
  URLConnection uc = u.openConnection();
  uc.connect();
  String key = null;
  for (int n = 1; (key = uc.getHeaderFieldKey(n)) != null; n++) {
    System.out.println(key + ": " + uc.getHeaderField(key));
  }
}
catch (IOException e) {
  System.err.println(e);
}


Writing data to a URLConnection

• Similar to reading data from a URLConnection.

• First inform the URLConnection that you plan to use it for output

• Before getting the connection's input stream, get the connection's output stream and write to it.

• Commonly used to talk to CGIs that use the POST method


Eight Steps:

1. Construct the URL.

2. Call the URL’s openConnection() method to create the URLConnection object.

3. Pass true to the URLConnection’s setDoOutput() method.

4. Create the data you want to send, preferably as a byte array.

5. Call getOutputStream() to get an output stream object.

6. Write the byte array created in step 4 onto the stream.

7. Close the output stream.

8. Call getInputStream() to get an input stream object. Read from it as usual.
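The eight steps above can be sketched end to end. So that it runs without an outside CGI, this sketch assumes a throwaway echo server built with the JDK's com.sun.net.httpserver package (not covered in these slides); the /echo path and the class and method names are made up for illustration. It also uses InputStream.readAllBytes(), which needs Java 9 or later.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class PostSketch {

    // Steps 2 through 8, applied to whatever URL is passed in.
    public static String post(URL u, String form) {
        try {
            URLConnection uc = u.openConnection();                   // step 2
            uc.setDoOutput(true);                                    // step 3
            byte[] data = form.getBytes(StandardCharsets.US_ASCII);  // step 4
            OutputStream out = uc.getOutputStream();                 // step 5
            out.write(data);                                         // step 6
            out.close();                                             // step 7
            try (InputStream in = uc.getInputStream()) {             // step 8
                return new String(in.readAllBytes(), StandardCharsets.US_ASCII);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Starts a local server that echoes the request body, posts the
    // sample form data to it, and returns what came back.
    public static String demo() {
        try {
            HttpServer server =
                HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/echo", exchange -> {
                byte[] body = exchange.getRequestBody().readAllBytes();
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
                exchange.close();
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                URL u = new URL("http://127.0.0.1:" + port + "/echo"); // step 1
                return post(u, "username=Sadie%2C+Julie&realname=Women+Composers");
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Against a real CGI only post() is needed; the local server exists purely to give the sketch something to talk to.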


POST CGIs

• A typical POST request to a CGI looks like this:

POST /cgi-bin/booksearch.pl HTTP/1.0
Referer: http://www.macfaq.com/sampleform.html
User-Agent: Mozilla/3.01 (Macintosh; I; PPC)
Content-length: 60
Content-type: application/x-www-form-urlencoded
Host: utopia.poly.edu:56435

username=Sadie%2C+Julie&realname=Women+Composers


A POST request includes

• the POST line

• a MIME header which must include
  – content type
  – content length

• a blank line that signals the end of the MIME header

• the actual data of the form, encoded in x-www-form-urlencoded format.
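The form data in the last item can be produced with java.net.URLEncoder. This is a small sketch; the class name FormBodySketch and its encode() helper are made up for illustration, and the two fields are the ones from the sample request. The URLEncoder.encode(String, Charset) overload used here needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormBodySketch {

    // Joins name=value pairs with '&', encoding each name and each value.
    public static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8));
            body.append('=');
            body.append(URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", "Sadie, Julie");
        fields.put("realname", "Women Composers");

        String body = encode(fields);
        byte[] data = body.getBytes(StandardCharsets.US_ASCII);

        System.out.println(body);
        // The content length header counts the bytes of the encoded body.
        System.out.println("bytes to send: " + data.length);
    }
}
```

Spaces become plus signs and the comma becomes %2C, which is why the body is safe to send as a single line.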


• A URLConnection for an http URL will set up the request line and the MIME header for you as long as you set its doOutput field to true by invoking setDoOutput(true).

• If you also want to read from the connection, you should set doInput to true with setDoInput(true) too.


For example,

URLConnection uc = u.openConnection();
uc.setDoOutput(true);
uc.setDoInput(true);


• The request line and MIME header are sent as soon as the URLConnection connects. Then getOutputStream() returns an output stream on which you can write the x-www-form-urlencoded name-value pairs.


HttpURLConnection

• java.net.HttpURLConnection is an abstract subclass of URLConnection that provides some additional methods specific to the HTTP protocol.

• URL connection objects that are returned by an http URL will be instances of java.net.HttpURLConnection.


Recall

• a typical HTTP response from a web server begins like this:

HTTP/1.0 200 OK
Server: Netscape-Enterprise/2.01
Date: Sat, 02 Aug 1997 07:52:46 GMT
Accept-ranges: bytes
Last-modified: Tue, 29 Jul 1997 15:06:46 GMT
Content-length: 2810
Content-type: text/html


Response Codes

• The getHeaderField() and getHeaderFieldKey() methods don't return the HTTP response code.

• After you've connected, you can retrieve the numeric response code--200 in the above example--with the getResponseCode() method and the message associated with it--OK in the above example--with the getResponseMessage() method.
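A short sketch of reading the response code and message follows. It assumes a throwaway local server built with the JDK's com.sun.net.httpserver package (not covered in these slides) that answers 200 for /index.html and 404 for anything else; the paths and the class name are made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class ResponseCodeSketch {

    // Returns the numeric code and its message, for example "200 OK".
    public static String status(URL u) {
        try {
            HttpURLConnection huc = (HttpURLConnection) u.openConnection();
            try {
                return huc.getResponseCode() + " " + huc.getResponseMessage();
            } finally {
                huc.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Asks the local server for one page that exists and one that does not.
    public static String[] demo() {
        try {
            HttpServer server =
                HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                boolean found =
                    exchange.getRequestURI().getPath().equals("/index.html");
                // A length of -1 means the response has no body.
                exchange.sendResponseHeaders(found ? 200 : 404, -1);
                exchange.close();
            });
            server.start();
            try {
                String base = "http://127.0.0.1:" + server.getAddress().getPort();
                return new String[] {
                    status(new URL(base + "/index.html")),
                    status(new URL(base + "/missing.html"))
                };
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

Note that getResponseCode() reports a 404 as an ordinary return value, so it is a convenient check to make before trying to read the body.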


HTTP Protocols

• Java 1.0 only supports GET and POST requests to HTTP servers

• Java 1.1/1.2 supports GET, POST, HEAD, OPTIONS, PUT, DELETE, and TRACE.

• The request method is chosen with the setRequestMethod(String method) method.

• A java.net.ProtocolException, a subclass of IOException, is thrown if an unknown request method is specified.


getRequestMethod()

• The getRequestMethod() method returns the string form of the request method currently set for the URLConnection. GET is the default method.
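Setting and reading the request method can be tried without touching the network, since openConnection() does not connect by itself. A minimal sketch, assuming a placeholder host name and a made-up method name, FETCH, to provoke the ProtocolException:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URL;

public class RequestMethodSketch {

    // Returns true if the connection accepted the method name.
    public static boolean accepts(HttpURLConnection huc, String method) {
        try {
            huc.setRequestMethod(method);
            return true;
        } catch (ProtocolException e) {
            return false;
        }
    }

    // Creates the connection object only; nothing is sent until connect()
    // or one of the methods that reads from the server is called.
    public static HttpURLConnection open() {
        try {
            URL u = new URL("http://example.invalid/");
            return (HttpURLConnection) u.openConnection();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection huc = open();
        System.out.println("default: " + huc.getRequestMethod());
        System.out.println("HEAD accepted: " + accepts(huc, "HEAD"));
        System.out.println("FETCH accepted: " + accepts(huc, "FETCH"));
        System.out.println("now set to: " + huc.getRequestMethod());
    }
}
```

A rejected name leaves the previously set method in place, so the last line reports HEAD.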


disconnect()

• The disconnect() method of the HttpURLConnection class closes the connection to the web server.

• Needed with HTTP/1.1 Keep-Alive, which otherwise leaves the connection open for reuse after the response has been read.


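For example,

try {
  URL u = new URL("http://www.amnesty.org/");
  HttpURLConnection huc = (HttpURLConnection) u.openConnection();
  huc.setRequestMethod("PUT");
  huc.setDoOutput(true); // required before getOutputStream()
  huc.connect();
  OutputStream os = huc.getOutputStream();
  // put the data...
  os.close();
  // read the response code only after the data has been written
  int code = huc.getResponseCode();
  if (code >= 200 && code < 300) {
    // the server accepted the data
  }
  huc.disconnect();
}
catch (IOException e) {
  //...
}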


usingProxy

• The boolean usingProxy() method returns true if web connections are being funneled through a proxy server, false if they're not.


Redirect Instructions

• Most web servers can be configured to automatically redirect browsers to the new location of a page that's moved.

• To redirect browsers, a server sends a 300 level response and a Location header that specifies the new location of the requested page.


GET /~elharo/macfaq/index.html HTTP/1.0

HTTP/1.1 302 Moved Temporarily
Date: Mon, 04 Aug 1997 14:21:27 GMT
Server: Apache/1.2b7
Location: http://www.macfaq.com/macfaq/index.html
Connection: close
Content-type: text/html

<HTML><HEAD>
<TITLE>302 Moved Temporarily</TITLE>
</HEAD><BODY>
<H1>Moved Temporarily</H1>
The document has moved <A HREF="http://www.macfaq.com/macfaq/index.html">here</A>.<P>
</BODY></HTML>


• HTML is returned for browsers that don't understand redirects, but most modern browsers do not display this and jump straight to the page specified in the Location header instead.

• Because redirects can change the site a user is connecting to without their knowledge, URLConnections do not follow redirects arbitrarily.


Following Redirects

The HttpURLConnection.setFollowRedirects(true) method says that connections will follow redirect instructions from the web server. Untrusted applets are not allowed to set this.

HttpURLConnection.getFollowRedirects() returns true if redirect requests are honored, false if they're not.
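The effect of the setting can be seen by requesting a redirected page both ways. This sketch makes two assumptions beyond the slides: it uses the per-connection setInstanceFollowRedirects() method, added to HttpURLConnection after these slides were written, so that the global setting is left alone; and it redirects from a throwaway local server built with the JDK's com.sun.net.httpserver package. The /old and /new paths are made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class RedirectSketch {

    // Requests the URL and reports the response code that was finally seen.
    public static int fetch(URL u, boolean follow) {
        try {
            HttpURLConnection huc = (HttpURLConnection) u.openConnection();
            huc.setInstanceFollowRedirects(follow); // this connection only
            try {
                return huc.getResponseCode();
            } finally {
                huc.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the code seen without following, then the code seen with it.
    public static int[] demo() {
        try {
            HttpServer server =
                HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            String base = "http://127.0.0.1:" + server.getAddress().getPort();
            server.createContext("/old", exchange -> {
                // A 300 level response plus a Location header is a redirect.
                exchange.getResponseHeaders().set("Location", base + "/new");
                exchange.sendResponseHeaders(302, -1);
                exchange.close();
            });
            server.createContext("/new", exchange -> {
                exchange.sendResponseHeaders(200, -1);
                exchange.close();
            });
            server.start();
            try {
                URL old = new URL(base + "/old");
                return new int[] { fetch(old, false), fetch(old, true) };
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] codes = demo();
        System.out.println("not following: " + codes[0]);
        System.out.println("following: " + codes[1]);
    }
}
```

With redirects off, the caller sees the 302 itself and can inspect the Location header before deciding whether to go there.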


To Learn More

• Java Network Programming
  – O’Reilly & Associates, 1997
  – ISBN 1-56592-227-1

• Java I/O
  – O’Reilly & Associates, 1999
  – ISBN 1-56592-485-1

• Web Client Programming with Java
  – http://www.digitalthink.com/catalog/cs/cs308/index.html


Questions?

