Post on 22-Dec-2015
Processing XML Part II
• Parser Operations with DOM and SAX overview • XML Validation with examples
• Processing XML with SAX (locally and on the internet)
FixedFloatSwap.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>
FixedFloatSwap.dtd
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
Operation of a Tree-based Parser
[Diagram: the XML Document and XML DTD are read by a Tree-Based Parser, which checks that the document is valid and hands a Document Tree to the Application Logic.]
Tree Benefits
• Some data preparation tasks require early
access to data that is further along in the
document (e.g. we wish to extract titles to build a table of contents)
• New tree construction is easier (e.g. xslt works from a tree to convert FpML to WML)
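The table-of-contents case above can be sketched with the standard JAXP DOM API. This is only an illustration: the element name "title", the sample document and the class name TitleIndex are assumptions, not part of the original slides.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TitleIndex {

    // Parse the whole document into a tree, then collect every <title>.
    // "title" is an assumed element name used only for this sketch.
    public static List<String> titles(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // The tree is complete, so titles anywhere in the document are reachable.
        NodeList nodes = doc.getElementsByTagName("title");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc>"
            + "<section><title>Parsers</title><p>text</p></section>"
            + "<section><title>Validation</title><p>text</p></section>"
            + "</doc>";
        for (String t : titles(xml)) {
            System.out.println(t);
        }
    }
}
```

Because the tree is built before any application code runs, the titles can be listed before the body text is processed, which is the situation the first bullet describes.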
Operation of an Event Based Parser
[Diagram: the XML Document and XML DTD are read by an Event-Based Parser, which checks that the document is valid and calls back into the Application Logic.]
public void startDocument ()
public void endDocument ()
public void startElement (String name, AttributeList attrs)
public void endElement (String name)
public void characters (char buf [], int offset, int len)

public void error(SAXParseException e) throws SAXException {
    System.out.println("\n\n--Invalid document ---" + e);
}
Event-Driven Benefits
• We do not need the memory required for trees
• Parsing can be done faster with no tree construction going on
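A minimal sketch of the event-driven style: the handler below counts elements as they stream past and never builds a tree. It uses the SAX2 class DefaultHandler rather than the SAX1 HandlerBase shown elsewhere in these slides (HandlerBase is deprecated in current JDKs); the class name ElementCounter is an assumption made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ElementCounter extends DefaultHandler {

    private int count = 0;

    // Called once per start tag; only a counter is kept in memory.
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        count++;
    }

    public static int countElements(String xml) throws Exception {
        ElementCounter handler = new ElementCounter();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<FixedFloatSwap><Notional>100</Notional>"
            + "<Fixed_Rate>5</Fixed_Rate></FixedFloatSwap>";
        System.out.println(countElements(xml)); // prints 3
    }
}
```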
XML Validation
A batch validating process involves comparing the DTD against a complete document instance and producing a report containing any errors or warnings.
Software developers should consider batch validation to be analogous to program compilation, with similar errors detected.
Interactive validation involves constant comparison of the DTD against a document as it is being created.
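Batch validation can be sketched as a handler that collects every warning and error into a report instead of printing them one by one. This is an illustration under assumptions: the class name BatchValidator, the report format, and the use of an internal DTD subset (so the example needs no files) are not from the original slides.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class BatchValidator extends DefaultHandler {

    private final List<String> report = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        report.add("warning, line " + e.getLineNumber() + ": " + e.getMessage());
    }

    // Validity errors are recoverable, so parsing continues after each one.
    @Override
    public void error(SAXParseException e) {
        report.add("error, line " + e.getLineNumber() + ": " + e.getMessage());
    }

    // Returns the full report; an empty list means the document is valid.
    public static List<String> validate(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        BatchValidator handler = new BatchValidator();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.report;
    }

    public static void main(String[] args) throws Exception {
        String doctype = "<!DOCTYPE FixedFloatSwap ["
            + "<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate)>"
            + "<!ELEMENT Notional (#PCDATA)>"
            + "<!ELEMENT Fixed_Rate (#PCDATA)>"
            + "]>";
        // Fixed_Rate is missing, so the report should not be empty.
        String bad = doctype + "<FixedFloatSwap><Notional>100</Notional></FixedFloatSwap>";
        for (String line : validate(bad)) {
            System.out.println(line);
        }
    }
}
```

As with a compiler, the whole document is checked and all problems are reported together at the end of the run.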
XML Validation
The benefits of validating documents against a DTD include:
• Programmers can write extraction and manipulation filters without fear of their software ever processing unexpected input.
• Using an XML-aware word processor, authors and editors can be guided and constrained to produce conforming documents.
XML Validation Examples
XML elements may contain further, embedded elements, and the entire document must be enclosed by a single document element.
The degree to which an element’s content is organized into child elements is often termed its granularity.
Some hierarchical structures may be recursive.
The Document Type Definition (DTD) contains rules for each element allowed within a specific class of documents.
// Validate.java
import java.io.*;
import org.xml.sax.*;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;

public class Validate extends HandlerBase {

    public static boolean valid = true;

    public static void main(String argv[]) {
        if (argv.length != 1) {
            System.err.println("Usage: java Validate filename.xml");
            System.exit(1);
        }
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(new File(argv[0]), new Validate());
        } catch (Throwable t) {
            t.printStackTrace();
        }
        System.out.println("Valid document is " + valid);
        System.exit(0);
    }

    // Called by the parser for each validity error
    public void error(SAXParseException e) throws SAXException {
        System.out.println(e.toString());
        valid = false;
    }
}

We’ll run this program against several XML files with DTDs.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
XML Document
DTD
Valid document is true
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumPayments)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
XML Document
DTD
Valid document is false
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Swaps SYSTEM "FixedFloatSwap.dtd">
<Swaps>
  <FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
  </FixedFloatSwap>
  <FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
  </FixedFloatSwap>
</Swaps>
XML Document
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT Swaps (FixedFloatSwap+)>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
DTD
C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
Quantity Indicators
  ?  0 or 1 time
  +  1 or more times
  *  0 or more times
Valid document is true
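The three quantity indicators can be tried out with a small validating check. This is a sketch, not the course's code: the Note and Comment elements, the class name QuantityCheck and the use of an internal DTD subset are all assumptions made so the example runs without any files.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class QuantityCheck {

    // Hypothetical DTD: Note is optional (?), FixedFloatSwap is required and
    // repeatable (+), Comment is optional and repeatable (*).
    static final String DOCTYPE =
        "<!DOCTYPE Swaps ["
        + "<!ELEMENT Swaps (Note?, FixedFloatSwap+, Comment*)>"
        + "<!ELEMENT Note (#PCDATA)>"
        + "<!ELEMENT FixedFloatSwap (#PCDATA)>"
        + "<!ELEMENT Comment (#PCDATA)>"
        + "]>";

    // Validates the given document body against DOCTYPE.
    public static boolean isValid(String body) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        final boolean[] valid = { true };
        factory.newSAXParser().parse(
            new InputSource(new StringReader(DOCTYPE + body)),
            new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    valid[0] = false; // any validity error marks the document invalid
                }
            });
        return valid[0];
    }

    public static void main(String[] args) throws Exception {
        String swap = "<FixedFloatSwap>a</FixedFloatSwap>";
        System.out.println(isValid("<Swaps>" + swap + "</Swaps>"));            // one swap, no Note
        System.out.println(isValid("<Swaps>" + swap + swap + "</Swaps>"));     // two swaps
        System.out.println(isValid("<Swaps><Note>n</Note></Swaps>"));          // no swap at all
    }
}
```

The first two documents satisfy the content model; the third omits the element marked with +, so the parser reports a validity error.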
The locations where document text data is allowed are indicated by the keyword ‘PCDATA’ (Parsed Character Data).
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>
    <StartYear>2000</StartYear>
    <EndYear>2002</EndYear>
  </NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>
XML Document
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>

C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
org.xml.sax.SAXParseException: Element "NumYears" does not allow "StartYear" -- (#PCDATA)
org.xml.sax.SAXParseException: Element type "StartYear" is not declared.
org.xml.sax.SAXParseException: Element "NumYears" does not allow "EndYear" -- (#PCDATA)
org.xml.sax.SAXParseException: Element type "EndYear" is not declared.
Valid document is false

Output of program after being modified to display the error.
DTD
There are strict rules which must be applied when an element is allowed to contain both text and child elements.
The PCDATA keyword must be the first token in the group, and the group must be a choice group (using “|” not “,”).
The group must be optional and repeatable.
This is known as a mixed content model.
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT Mixed (emph)>
<!ELEMENT emph (#PCDATA | sub | super)*>
<!ELEMENT sub (#PCDATA)>
<!ELEMENT super (#PCDATA)>
DTD
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Mixed SYSTEM "Mixed.dtd">
<Mixed>
  <emph>H<sub>2</sub>O is water.</emph>
</Mixed>
XML Document
Valid document is true
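To see what mixed content looks like to an application, the sketch below validates the emph example and records the text and element events in order. The class name MixedContentTrace and the token format are assumptions; the DTD is placed in an internal subset so that no Mixed.dtd file is needed. Text is buffered because a SAX parser may deliver one run of characters in several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class MixedContentTrace extends DefaultHandler {

    private final List<String> tokens = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    // Emit any buffered text as one token before the next tag is recorded.
    private void flushText() {
        if (text.length() > 0) {
            tokens.add(text.toString());
            text.setLength(0);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        tokens.add("<" + qName + ">");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        tokens.add("</" + qName + ">");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    // Treat any validity error as fatal for this sketch.
    @Override
    public void error(SAXParseException e) throws SAXException {
        throw e;
    }

    public static List<String> trace(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        MixedContentTrace handler = new MixedContentTrace();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.tokens;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE emph ["
            + "<!ELEMENT emph (#PCDATA | sub | super)*>"
            + "<!ELEMENT sub (#PCDATA)>"
            + "<!ELEMENT super (#PCDATA)>"
            + "]><emph>H<sub>2</sub>O is water.</emph>";
        System.out.println(trace(xml));
    }
}
```

Text and child elements arrive interleaved, which is exactly what the choice group with the trailing * permits.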
Attributes
An attribute is associated with a particular element by the DTD and is assigned an attribute type.
The attribute type can restrict the range of values it can hold.
Example attribute types include:
  CDATA indicates a simple string of characters
  NMTOKEN indicates a word or token
  A named token group such as (left | center | right)
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>
DTD
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>
XML Document
C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
org.xml.sax.SAXParseException: Attribute value for "currency" is #REQUIRED.
Valid document is false
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>
DTD
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional currency="Pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>
XML Document
Valid document is true
DTD
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional currency="Pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>
XML Document
Valid document is true
#IMPLIED means optional

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>
<!ATTLIST FixedFloatSwap note CDATA #IMPLIED>
DTD
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap note="For your eyes only">
  <Notional currency="Pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>
XML Document
Valid document is true
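The attribute rules above (#REQUIRED, #IMPLIED and an enumerated value list) can be checked with a short validating sketch. The class name AttributeCheck is an assumption, and the DTD is cut down to the two elements that carry attributes and placed in an internal subset so the example is self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class AttributeCheck {

    // Reduced version of the slide's DTD: only the attribute rules matter here.
    static final String DOCTYPE =
        "<!DOCTYPE FixedFloatSwap ["
        + "<!ELEMENT FixedFloatSwap (Notional)>"
        + "<!ELEMENT Notional (#PCDATA)>"
        + "<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>"
        + "<!ATTLIST FixedFloatSwap note CDATA #IMPLIED>"
        + "]>";

    public static boolean isValid(String body) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        final boolean[] valid = { true };
        factory.newSAXParser().parse(
            new InputSource(new StringReader(DOCTYPE + body)),
            new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    System.out.println(e.getMessage());
                    valid[0] = false;
                }
            });
        return valid[0];
    }

    public static void main(String[] args) throws Exception {
        // Required attribute missing
        System.out.println(isValid(
            "<FixedFloatSwap><Notional>100</Notional></FixedFloatSwap>"));
        // Required attribute present, optional note omitted
        System.out.println(isValid(
            "<FixedFloatSwap><Notional currency=\"Pounds\">100</Notional></FixedFloatSwap>"));
        // Value outside the named token group
        System.out.println(isValid(
            "<FixedFloatSwap><Notional currency=\"Euros\">100</Notional></FixedFloatSwap>"));
    }
}
```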
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [
  <!ENTITY bankname "Mellon National Bank and Trust">
]>
<FixedFloatSwap>
  <Bank>&bankname;</Bank>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Bank, Notional, Fixed_Rate, NumYears, NumPayments)>
<!ELEMENT Bank (#PCDATA)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
DTD
Document using a General Entity
Validate is true
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match = "Bank">
    <WML>
      <CARD>
        <xsl:apply-templates/>
      </CARD>
    </WML>
  </xsl:template>
  <xsl:template match = "Notional | Fixed_Rate | NumYears | NumPayments">
  </xsl:template>
</xsl:stylesheet>
XSLT Program
C:\McCarthy\www\46-928\examples\sax>java -Dcom.jclark.xsl.sax.parser=com.jclark.xml.sax.CommentDriver com.jclark.xsl.sax.Driver FixedFloatSwap.xml FixedFloatSwap.xsl FixedFloatSwap.wml
C:\McCarthy\www\46-928\examples\sax>type FixedFloatSwap.wml
<?xml version="1.0" encoding="utf-8"?>
<WML><CARD>Mellon National Bank and Trust</CARD></WML>
XSLT OUTPUT
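The general entity in the example above is expanded by the parser itself, before any application (or XSLT processor) sees the text. A minimal SAX sketch of that behaviour follows; the class name EntityText is an assumption, and the external DTD reference is dropped so the example needs only the internal subset.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EntityText extends DefaultHandler {

    private boolean inBank = false;
    private final StringBuilder bank = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("Bank")) {
            inBank = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("Bank")) {
            inBank = false;
        }
    }

    // The replacement text of &bankname; arrives here as ordinary characters.
    @Override
    public void characters(char[] ch, int start, int length) {
        if (inBank) {
            bank.append(ch, start, length);
        }
    }

    public static String bankName(String xml) throws Exception {
        EntityText handler = new EntityText();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.bank.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE FixedFloatSwap ["
            + "<!ENTITY bankname \"Mellon National Bank and Trust\">"
            + "]>"
            + "<FixedFloatSwap><Bank>&bankname;</Bank></FixedFloatSwap>";
        System.out.println(bankName(xml)); // prints Mellon National Bank and Trust
    }
}
```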
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [
  <!ENTITY bankname SYSTEM "JustAFile.dat">
]>
<FixedFloatSwap>
  <Bank>&bankname;</Bank>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>
An external text entity
Mellon Bank And Trust CorporationWhen you need a friend!
XSLT Output
<?xml version="1.0" encoding="utf-8"?>
<WML><CARD>Mellon Bank And Trust CorporationWhen you need a friend!</CARD></WML>
JustAFile.dat
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>
<!ENTITY % parsedCharacterData "(#PCDATA)">
<!ELEMENT Notional %parsedCharacterData; >
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
XML Document
DTD
Internal Parameter Entities
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Bank> &bankname; </Bank>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Bank, Notional, Fixed_Rate, NumYears, NumPayments)>
<!ENTITY bankname "Mellon National Bank and Trust Corporation">
<!ELEMENT Bank (#PCDATA)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
XML Document
DTD
General Entity defined in the DTD
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
  <Note>
    <![CDATA[This is text that <b>will not be parsed for markup]]>
  </Note>
</FixedFloatSwap>

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments, Note)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
<!ELEMENT Note (#PCDATA)>
XML Document
DTD
CDATA Section
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match = "Note">
    <WML>
      <CARD>
        <xsl:apply-templates/>
      </CARD>
    </WML>
  </xsl:template>
  <xsl:template match = "Notional | Fixed_Rate | NumYears | NumPayments">
  </xsl:template>
</xsl:stylesheet>
XSLT Program
<?xml version="1.0" encoding="utf-8"?><WML><CARD>
This is text that <b>will not be parsed for markup
</CARD></WML>
XSLT Output
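The effect of a CDATA section can also be seen from Java. In this sketch (class name CdataText is an assumption, and the document is reduced to the Note element with no DTD) the DOM parser returns the section's content as plain text, with the "<b>" left as literal characters rather than being parsed as a tag.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataText {

    // Returns the text content of the root element, CDATA sections included.
    public static String rootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Note><![CDATA[This is text that <b>will not be parsed for markup]]></Note>";
        // prints This is text that <b>will not be parsed for markup
        System.out.println(rootText(xml));
    }
}
```

Without the CDATA section the same text would not be well-formed, because the <b> start tag has no matching end tag.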
DTD Components
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ORDER SYSTEM "order.dtd">
<!-- example order form -->
<ORDER SOURCE="web" CUSTOMERTYPE="consumer" CURRENCY="USD">
  <addresses>
    <address ADDTYPE="billship">
      <firstname>Kevin</firstname>
      <lastname>Dick</lastname>
      <street ORDER="1">123 Anywhere Lane</street>
      <street ORDER="2">Apt 1b</street>
      <city>Palo Alto</city>
      <state>CA</state>
      <postal>94303</postal>
      <country>USA</country>
    </address>
Order.xml
    <address ADDTYPE="bill">
      <firstname>Kevin</firstname>
      <lastname>Dick</lastname>
      <street ORDER="1">123 Not The Same Lane</street>
      <street ORDER="2">Work Place</street>
      <city>Palo Alto</city>
      <state>CA</state>
      <postal>94300</postal>
      <country>USA</country>
    </address>
  </addresses>
An order may have more than one address.
  <lineitems>
    <lineitem ID="line1">
      <product CAT="MBoard">440BX Motherboard</product>
      <quantity>1</quantity>
      <unitprice>200</unitprice>
    </lineitem>
    <lineitem ID="line2">
      <product CAT="RAM">128 MB PC-100 DIMM</product>
      <quantity>2</quantity>
      <unitprice>175</unitprice>
    </lineitem>
    <lineitem ID="line3">
      <product CAT="CDROM">40x CD-ROM</product>
      <quantity>1</quantity>
      <unitprice>50</unitprice>
    </lineitem>
  </lineitems>
Several products may be purchased.
  <payment>
    <card CARDTYPE="VISA">
      <cardholder>Kevin S. Dick</cardholder>
      <cardnumber>11111-22222-33333</cardnumber>
      <expiration>01/01</expiration>
    </card>
  </payment>
</ORDER>
The payment is with a Visa card.
Valid document is true
order.dtd
<?xml version="1.0" encoding="UTF-8"?>
<!-- Example Order form DTD adapted from XML: A Manager's Guide -->
<!-- Define an ORDER element -->
<!ELEMENT ORDER (addresses, lineitems, payment)>
<!ATTLIST ORDER
    SOURCE       (web | phone | retail)  #REQUIRED
    CUSTOMERTYPE (consumer | business)   "consumer"
    CURRENCY     CDATA                   "USD">
Define an order based on other elements.
<!ENTITY % anAddress SYSTEM "address.dtd" >
%anAddress;

<!-- Collection of Addresses -->
<!ELEMENT addresses (address+)>

<!ENTITY % aLineItem SYSTEM "lineitem.dtd" >
%aLineItem;

<!-- Collection of LineItems -->
<!ELEMENT lineitems (lineitem+)>

<!ENTITY % aPayment SYSTEM "payment.dtd" >
%aPayment;
The other elements are in their own dtd files.
External parameter entities

address.dtd
<!-- Address Structure -->
<!ELEMENT address (firstname, middlename?, lastname, street+, city, state, postal, country)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT middlename (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT postal (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ATTLIST address ADDTYPE (bill | ship | billship) "billship">
<!ATTLIST street ORDER CDATA #IMPLIED>

lineitem.dtd
<!ELEMENT lineitem (product, quantity, unitprice)>
<!ATTLIST lineitem ID ID #REQUIRED>
<!ELEMENT product (#PCDATA)>
<!ATTLIST product CAT (CDROM|MBoard|RAM) #REQUIRED>
<!ELEMENT quantity (#PCDATA)>
<!ELEMENT unitprice (#PCDATA)>

<!ELEMENT payment (card | PO)>
<!ELEMENT card (cardholder, cardnumber, expiration)>
<!ELEMENT cardholder (#PCDATA)>
<!ELEMENT cardnumber (#PCDATA)>
<!ELEMENT expiration (#PCDATA)>
<!ELEMENT PO (number, authorization*)>
<!ELEMENT number (#PCDATA)>
<!ELEMENT authorization (#PCDATA)>
<!ATTLIST card CARDTYPE (VISA|MasterCard|Amex) #REQUIRED>
payment.dtd
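The ORDER attribute list in order.dtd gives CUSTOMERTYPE and CURRENCY default values. A parser that reads the declaration supplies those defaults when the document omits the attributes, as the sketch below tries to show. The class name DefaultAttributes is an assumption, ORDER is declared EMPTY to keep the example short, and the declarations sit in an internal subset so no files are needed.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class DefaultAttributes {

    // Returns the attributes the parser reports for the root element.
    public static Map<String, String> rootAttributes(String xml) throws Exception {
        final Map<String, String> result = new LinkedHashMap<>();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                private boolean seenRoot = false;

                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    if (seenRoot) {
                        return;
                    }
                    seenRoot = true;
                    for (int i = 0; i < atts.getLength(); i++) {
                        result.put(atts.getQName(i), atts.getValue(i));
                    }
                }
            });
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE ORDER ["
            + "<!ELEMENT ORDER EMPTY>"
            + "<!ATTLIST ORDER"
            + " SOURCE (web | phone | retail) #REQUIRED"
            + " CUSTOMERTYPE (consumer | business) \"consumer\""
            + " CURRENCY CDATA \"USD\">"
            + "]>"
            + "<ORDER SOURCE=\"phone\"/>";
        // Only SOURCE is written in the document; the other two come from the DTD.
        System.out.println(rootAttributes(xml));
    }
}
```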
Processing XML with SAX
• Important interfaces and classes are found in org.xml.sax package
• We will look at the following interfaces and then study an example
  interface DocumentHandler -- reports on document events
  interface ErrorHandler -- reports on validity errors
  class HandlerBase -- implements both of the above plus two others
public interface DocumentHandler
Receive notification of general document events.
This is the main interface that most SAX applications implement: if the application needs to be informed of basic parsing events, it implements this interface and registers an instance with the SAX parser.
The parser uses the instance to report basic document-related events like the start and end of elements and character data.

void characters(char[] ch, int start, int length)
    Receive notification of character data.
void endDocument()
    Receive notification of the end of a document.
void endElement(java.lang.String name)
    Receive notification of the end of an element.
void startDocument()
    Receive notification of the beginning of a document.
void startElement(java.lang.String name, AttributeList atts)
    Receive notification of the beginning of an element.
Some methods from the DocumentHandler Interface
public interface ErrorHandler
Basic interface for SAX error handlers.
If a SAX application needs to implement customized error handling, it must implement this interface and then register an instance with the SAX parser. The parser will then report all errors and warnings through this interface.

Some methods are:
void error(SAXParseException exception)
    Receive notification of a recoverable error.
void fatalError(SAXParseException exception)
    Receive notification of a non-recoverable error.
void warning(SAXParseException exception)
    Receive notification of a warning.

public class HandlerBase
extends java.lang.Object
implements EntityResolver, DTDHandler, DocumentHandler, ErrorHandler
Default base class for handlers.
This class implements the default behaviour for four SAX interfaces: EntityResolver, DTDHandler, DocumentHandler, and ErrorHandler.
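DocumentHandler and HandlerBase belong to SAX1 and are marked deprecated in current JDKs; SAX2 offers ContentHandler and DefaultHandler in the same roles. The sketch below shows the SAX2 shape of the same callbacks. The class name Sax2Trace and the log format are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class Sax2Trace extends DefaultHandler {

    private final List<String> log = new ArrayList<>();

    @Override
    public void startDocument() {
        log.add("startDocument");
    }

    @Override
    public void endDocument() {
        log.add("endDocument");
    }

    // SAX2 passes namespace information and an Attributes object
    // where SAX1 passed only a name and an AttributeList.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        StringBuilder line = new StringBuilder("startElement " + qName);
        for (int i = 0; i < atts.getLength(); i++) {
            line.append(" ").append(atts.getQName(i)).append("=").append(atts.getValue(i));
        }
        log.add(line.toString());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        log.add("endElement " + qName);
    }

    public static List<String> trace(String xml) throws Exception {
        Sax2Trace handler = new Sax2Trace();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.log;
    }

    public static void main(String[] args) throws Exception {
        for (String line : trace("<Notional currency=\"pounds\">100</Notional>")) {
            System.out.println(line);
        }
    }
}
```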
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Bank, Notional, Fixed_Rate, NumYears, NumPayments)>
<!ELEMENT Bank (#PCDATA)>
<!ELEMENT Notional (#PCDATA)>
<!ATTLIST Notional currency (dollars | pounds) #REQUIRED>
<!ELEMENT Fixed_Rate (#PCDATA)>
<!ELEMENT NumYears (#PCDATA)>
<!ELEMENT NumPayments (#PCDATA)>
FixedFloatSwap.dtd
Input
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [
  <!ENTITY bankname "Pittsburgh National Corporation">
]>
<FixedFloatSwap>
  <Bank>&bankname;</Bank>
  <Notional currency = "pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>
FixedFloatSwap.xml
Input
// NotifyStr.java
// Adapted from XML and Java by Maruyama, Tamura and Uramoto
// IBM Tokyo Research, Addison-Wesley

import java.io.*;
import org.xml.sax.*;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
Processing
Java event-driven processing
public class NotifyStr extends HandlerBase {

    public static void main(String argv[]) {
        if (argv.length != 1) {
            System.err.println("Usage: java NotifyStr filename.xml");
            System.exit(1);
        }
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        NotifyStr myHandler = new NotifyStr();
        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(new File(argv[0]), myHandler);
        } catch (Throwable t) {
            t.printStackTrace();
        }
        System.exit(0);
    }

    public NotifyStr() {}

    public void startDocument() throws SAXException {
        System.out.println("startDocument called:");
    }

    public void endDocument() throws SAXException {
        System.out.println("endDocument called:");
    }

    public void startElement(String Name, AttributeList aMap) throws SAXException {
        System.out.println("startElement called: element name =" + Name);
        // examine the attributes
        for (int i = 0; i < aMap.getLength(); i++) {
            String attName = aMap.getName(i);
            String type = aMap.getType(i);
            String value = aMap.getValue(i);
            System.out.println(" attribute name = " + attName +
                               " type = " + type + " value = " + value);
        }
    }

    public void endElement(String name) throws SAXException {
        System.out.println("endElement is called:" + name);
    }

    public void characters(char[] ch, int start, int length) throws SAXException {
        // build String from char array
        String dataFound = new String(ch, start, length);
        System.out.println("characters called:" + dataFound);
    }

    public void error(SAXParseException e) throws SAXException {
        System.out.println("Parsing error");
        System.out.println(e.toString());
    }
}
C:\McCarthy\www\46-928\examples\sax>java NotifyStr FixedFloatSwap.xml
startDocument called:
startElement called: element name =FixedFloatSwap
startElement called: element name =Bank
characters called:Pittsburgh National Corporation
endElement is called:Bank
startElement called: element name =Notional
 attribute name = currency type = ENUMERATION value = pounds
characters called:100
endElement is called:Notional
startElement called: element name =Fixed_Rate
characters called:5
endElement is called:Fixed_Rate
startElement called: element name =NumYears
characters called:3
endElement is called:NumYears
startElement called: element name =NumPayments
characters called:6
endElement is called:NumPayments
endElement is called:FixedFloatSwap
endDocument called:
Output
Accessing the swap from Jigsaw
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap [
  <!ENTITY bankname "Pittsburgh National Corporation">
]>
<FixedFloatSwap>
  <Bank>&bankname;</Bank>
  <Notional currency = "pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>
Saved under Www/fpml/ServerSwap.xml
Servlet Code

// This servlet file is stored in WWW/Jigsaw/servlet/GetXML.java
// This servlet returns a user selected xml file from
// the Www/fpml directory and returns it to the client.

import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class GetXML extends HttpServlet {

    public void doGet(HttpServletRequest req, HttpServletResponse res)
            throws ServletException, IOException {

        String theData = "";
        // the extra path names the file; drop its leading "/"
        String extraPath = req.getPathInfo();
        extraPath = extraPath.substring(1);

        // read the file and write it to the client
        try {
            // open file and create a DataInputStream
            FileInputStream theFile =
                new FileInputStream("c:\\Jigsaw\\Jigsaw\\Jigsaw\\Www\\fpml\\" + extraPath);
            //DataInputStream dis = new DataInputStream(theFile);
            InputStreamReader is = new InputStreamReader(theFile);
            BufferedReader br = new BufferedReader(is);

            // read the file into the string theData
            String thisLine;
            while ((thisLine = br.readLine()) != null) {
                theData += thisLine + "\n";
            }
        } catch (Exception e) {
            System.err.println("Error " + e);
        }

        PrintWriter out = res.getWriter();
        out.write(theData);
        System.out.println("Wrote document to client");
        // write data to console
        System.out.println(theData);
        out.close();
    }
}
// Sax Client
import java.io.*;
import org.xml.sax.*;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;

public class JigsawNotifyStr extends HandlerBase {

    public static void main(String argv[]) {
        if (argv.length != 1) {
            System.err.println("Usage: java JigsawNotifyStr filename.xml");
            System.exit(1);
        }

        String serverString = "http://localhost:8001/servlet/getXML/";
        String fileName = argv[0];

        InputSource is = new InputSource(serverString + fileName);
        System.out.println("Got the input source");

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);

        JigsawNotifyStr myHandler = new JigsawNotifyStr();

        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(is, myHandler);
        } catch (Throwable t) {
            System.out.println("Big error");
            t.printStackTrace();
        }
        System.exit(0);
    }

    public JigsawNotifyStr() {}

    public void startDocument() throws SAXException {
        System.out.println("startDocument called:");
    }

    public void endDocument() throws SAXException {
        System.out.println("endDocument called:");
    }

    // Same as before
    //

    public void error(SAXParseException e) throws SAXException {
        // describe each error and show each error method
        System.out.println("Parsing error");
        System.out.println(e.toString());
    }
}
Being served by the servlet
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap [
  <!ENTITY bankname "Pittsburgh National Corporation">
]>
<FixedFloatSwap>
  <Bank>&bankname;</Bank>
  <Notional currency = "pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

Got the input source
startDocument called:
Parsing error
org.xml.sax.SAXParseException: Element type "FixedFloatSwap" is not declared.
startElement called: element name =FixedFloatSwap
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "Bank" is not declared.
startElement called: element name =Bank
characters called:Pittsburgh National Corporation
endElement is called:Bank
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "Notional" is not declared.
Parsing error
org.xml.sax.SAXParseException: Attribute "currency" is not declared for element "Notional".
startElement called: element name =Notional
 attribute name = currency type = CDATA value = pounds
characters called:100
endElement is called:Notional
characters called:
We have some parsing errors.
Do you see why?
Parsing error
org.xml.sax.SAXParseException: Element type "Fixed_Rate" is not declared.
startElement called: element name =Fixed_Rate
characters called:5
endElement is called:Fixed_Rate
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "NumYears" is not declared.
startElement called: element name =NumYears
characters called:3
endElement is called:NumYears
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "NumPayments" is not declared.
startElement called: element name =NumPayments
characters called:6
endElement is called:NumPayments
characters called:
endElement is called:FixedFloatSwap
endDocument called:
The InputSource Class
The SAX and DOM parsers need XML input. The “output” produced by these parsers amounts to a series of method calls (SAX) or an application programmer interface to the tree (DOM).
An InputSource object can be used to provide input to the parser.
[Diagram: an InputSource feeds the SAX or DOM parser, which delivers events (SAX) or a tree (DOM) to the application.]
So, how do we build an InputSource object?
Some InputSource constructors:
    InputSource(String pathToFile);
    InputSource(InputStream byteStream);
    InputSource(Reader characterStream);

For example:
    String text = "<a>some xml</a>";
    StringReader sr = new StringReader(text);
    InputSource is = new InputSource(sr);
        :
    myParser.parse(is);
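A runnable version of the idea above, as a sketch: the same XML text is wrapped in an InputSource twice, once from a character stream and once from a byte stream, and handed to a SAX parser. The class name InputSourceDemo and the rootName helper are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class InputSourceDemo {

    // Parses whatever the InputSource supplies and returns the root element name.
    public static String rootName(InputSource source) throws Exception {
        final String[] root = { null };
        SAXParserFactory.newInstance().newSAXParser().parse(source, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (root[0] == null) {
                    root[0] = qName; // the first start tag is the document element
                }
            }
        });
        return root[0];
    }

    public static void main(String[] args) throws Exception {
        String text = "<a>some xml</a>";

        // Character stream: the text is already decoded.
        InputSource fromReader = new InputSource(new StringReader(text));

        // Byte stream: the parser works out the encoding itself.
        InputSource fromBytes = new InputSource(
            new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));

        System.out.println(rootName(fromReader)); // prints a
        System.out.println(rootName(fromBytes));  // prints a
    }
}
```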
But what about the DTD?
public interface EntityResolver
Basic interface for resolving entities.
If a SAX application needs to implement customized handling for external entities, it must implement this interface and register an instance with the SAX parser using the parser's setEntityResolver method.
The parser will then allow the application to intercept any external entities (including the external DTD subset and external parameter entities, if any) before including them.
EntityResolver
public InputSource resolveEntity(String publicId, String systemId) {
    // Add this method to the client above. The systemId String
    // holds the path to the dtd as specified in the xml document.
    // We may now access the dtd from a servlet and return an
    // InputSource or return null and let the parser resolve the
    // external entity.
    System.out.println("Attempting to resolve " + "Public id: " + publicId +
                       " System id: " + systemId);
    return null;
}
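One way to answer "what about the DTD?" is to have resolveEntity hand the parser an InputSource for it. The sketch below keeps the DTD in a string purely for illustration (in the setting above it would come from the servlet); the class name DtdResolver, the in-memory DTD and the endsWith test on the system id are all assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class DtdResolver extends DefaultHandler {

    // Stand-in for a DTD fetched from elsewhere (for example, a servlet).
    static final String DTD =
        "<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>"
        + "<!ELEMENT Notional (#PCDATA)>"
        + "<!ELEMENT Fixed_Rate (#PCDATA)>"
        + "<!ELEMENT NumYears (#PCDATA)>"
        + "<!ELEMENT NumPayments (#PCDATA)>";

    private final List<String> errors = new ArrayList<>();

    // Intercept the request for the external DTD and supply our own copy.
    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId != null && systemId.endsWith("FixedFloatSwap.dtd")) {
            InputSource source = new InputSource(new StringReader(DTD));
            source.setSystemId(systemId);
            return source;
        }
        return null; // let the parser resolve anything else itself
    }

    @Override
    public void error(SAXParseException e) {
        errors.add(e.getMessage());
    }

    // Returns the validity errors; an empty list means the document is valid.
    public static List<String> validate(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        DtdResolver handler = new DtdResolver();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.errors;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE FixedFloatSwap SYSTEM \"FixedFloatSwap.dtd\">"
            + "<FixedFloatSwap><Notional>100</Notional><Fixed_Rate>5</Fixed_Rate>"
            + "<NumYears>3</NumYears><NumPayments>6</NumPayments></FixedFloatSwap>";
        System.out.println("Errors: " + validate(xml));
    }
}
```

Because the handler extends DefaultHandler, passing it to SAXParser.parse registers it as the entity resolver as well as the error handler, so no separate setEntityResolver call is needed in this sketch.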