Page 1: Intro to XML Monday, May 10, 1999 SD99 Copyright 1999 Elliotte Rusty Harold elharo@metalab.unc.edu

Intro to XML

Monday, May 10, 1999 SD99

Copyright 1999 Elliotte Rusty Harold

elharo@metalab.unc.edu

http://metalab.unc.edu/xml/slides/

Page 2

What is XML?

• Extensible Markup Language

• A syntax for documents

• A Meta-Markup Language

• A Structural and Semantic language, not a formatting language

• Not just for Web pages

Page 3

XML is a Meta Markup Language

• Not like HTML, troff, LaTeX

• Make up the tags you need as you need them

• The tags you create can be documented in a Document Type Definition (DTD)

• A meta syntax for domain-specific markup languages like MusicML, MathML, and CML

Page 4

XML describes structure and semantics, not formatting

• XML documents form a tree

• Element and attribute names reflect the kind of the element

• Formatting can be added with a style sheet

Page 5

A Song Description in HTML

<dt>Hot Cop
<dd> by Jacques Morali, Henri Belolo, and Victor Willis
<ul>
<li>Producer: Jacques Morali
<li>Publisher: PolyGram Records
<li>Length: 6:20
<li>Written: 1978
<li>Artist: Village People
</ul>

Page 6

A Song Description in XML

<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>
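As a rough illustration of why descriptive element names help software, the sketch below reads the song description with Python's standard xml.etree.ElementTree module. Python is not used in the slides, and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# The song description from the slide, as a string.
SONG_XML = """<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>"""

song = ET.fromstring(SONG_XML)

# Each piece of information is addressed by name, not by position.
title = song.findtext("TITLE")
composers = [c.text for c in song.findall("COMPOSER")]
year = int(song.findtext("YEAR"))

print(title, composers, year)
```

Getting the same three values out of the HTML version would mean guessing from list order and label text such as "Written:".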

Page 7

Style Sheets provide formatting

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">

  <xsl:template match="/">
    <html>
      <head><title>Song</title></head>
      <body>
        <xsl:value-of select="."/>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>

Page 8

Attaching style sheets to documents

• <?xml-stylesheet type="text/xsl" href="song1.xsl"?>

• C:\> java -Dcom.jclark.xsl.sax.parser=com.jclark.xml.sax.CommentDriver com.jclark.xsl.sax.Driver hotcop.xml song1.xsl hotcop.html

• C:\> xt hotcop.xml xtsong1.xsl hotcop.html

Page 9

Templates for Other Elements

<xsl:template match="/">
  <html>
    <head><title>Song</title></head>
    <body>
      <xsl:apply-templates/>
    </body>
  </html>
</xsl:template>

<xsl:template match="TITLE">
  <h1><xsl:value-of select="."/></h1>
</xsl:template>

Page 10

Style Sheets can be quite complex

<xsl:template match="SONG">
  <h1><xsl:value-of select="TITLE"/> by the <xsl:value-of select="ARTIST"/></h1>
  <ul>
    <li>Length: <xsl:value-of select="LENGTH"/></li>
    <li>Producer: <xsl:value-of select="PRODUCER"/></li>
    <li>Publisher: <xsl:value-of select="PUBLISHER"/></li>
    <li>Year: <xsl:value-of select="YEAR"/></li>
    <xsl:apply-templates select="COMPOSER"/>
  </ul>
</xsl:template>

Page 11

What is XML used for?

• Domain-Specific Markup Languages

• Self-Describing Data

• Interchange of Data Among Applications

• Structured and Integrated Data

Page 12

Domain-Specific Markup Languages

• Non-proprietary format

• Don’t pay for what you don’t use

Page 13

Self-Describing Data

• Much data is lost due to format problems

• XML is very simple

• XML is self-describing

• XML is well documented

Page 14

<PERSON ID="p1100" SEX="M">
  <NAME>
    <GIVEN>Judson</GIVEN>
    <SURNAME>McDaniel</SURNAME>
  </NAME>
  <BIRTH>
    <DATE>21 Feb 1834</DATE>
  </BIRTH>
  <DEATH>
    <DATE>9 Dec 1905</DATE>
  </DEATH>
</PERSON>

Page 15

Interchange of Data Among Applications

• E-commerce

• Syndication

Page 16

Structured and Integrated Data

• Can specify relationships between elements

• Can assemble data from multiple sources

Page 17

XML Applications

• A specific markup language that uses the XML meta-syntax is called an XML application

• Different XML applications have their own more constricted syntaxes and vocabularies within the broader XML syntax

• Further syntax can be layered on top of this; e.g. data typing through DCDs or other schemas

Page 18

Example XML Applications

• Web Pages

• Mathematical Equations

• Music Notation

• Vector Graphics

• Metadata

• and more…

Page 19

Mathematical Markup Language

Page 20

Channel Definition Format

<?xml version="1.0"?>
<CHANNEL HREF="http://metalab.unc.edu/xml/index.html">
  <TITLE>Cafe con Leche</TITLE>
  <ITEM HREF="http://metalab.unc.edu/xml/books.html">
    <TITLE>Books about XML</TITLE>
  </ITEM>
  <ITEM HREF="http://metalab.unc.edu/xml/tradeshows.html">
    <TITLE>Trade shows and conferences about XML</TITLE>
  </ITEM>
  <ITEM HREF="http://metalab.unc.edu/xml/lists.htm">
    <TITLE>Mailing Lists dedicated to XML</TITLE>
  </ITEM>
</CHANNEL>

Page 21

Classic Literature

• The Complete Plays of Shakespeare

• The Bible

• The Koran

• The Book of Mormon

Page 22

Vector Graphics

• Vector Markup Language (VML)

– Internet Explorer 5.0

– Microsoft Office 2000

• Scalable Vector Graphics (SVG)

Page 23

The Resource Description Framework (RDF)

• Meta-data

• Dublin Core

• Better Web searching

Page 24

An Example of RDF

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/DC/">
  <rdf:Description about="http://metalab.unc.edu/xml/">
    <dc:CREATOR>Elliotte Rusty Harold</dc:CREATOR>
    <dc:TITLE>Cafe con Leche</dc:TITLE>
  </rdf:Description>
</rdf:RDF>

Page 25

XML for XML

• XSL: The Extensible Stylesheet Language

• DCD: The Document Content Description Schema Language

• XLL: The Extensible Linking Language

Page 26

XSL: The Extensible Stylesheet Language

• XSL Transformations

• XSL Formatting Objects

Page 27

DCD: The Document Content Description Schema Language

• Data Typing in XML is Weak

• <MONTH>9</MONTH>

<DCD>
  <ElementDef Type="MONTH" Model="Data" Datatype="i1" Min="1" Max="12" />
</DCD>

Page 28

XLL: The Extensible Linking Language

• Any element can be a link

• Links can be bi-directional

• Links can be separated from the documents they connect

<footnote xlink:form="simple" href="footnote7.xml">7</footnote>

Page 29

File Formats, in-house applications, and other behind the scenes uses

• Microsoft Office 2000

• Federal Express Web API

• Netscape What’s Related

Page 30

Hello XML

<?xml version="1.0" standalone="yes"?>
<FOO>Hello XML!</FOO>

• Plain ASCII or UTF-8 text

• .xml is standard file extension

• Any standard text editor will work

Page 31

The XML Declaration

• version attribute

– required

– always has the value 1.0

• standalone attribute

– yes

– no

• encoding attribute

– UTF-8

– 8859_1

– etc.

<?xml version="1.0" standalone="yes"?>
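To show what the encoding attribute does in practice, here is a small Python sketch using the standard xml.etree.ElementTree module. The encoding name ISO-8859-1 and the sample word are assumptions chosen for the example; they are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# b"caf\xe9" is the word "cafe" with an accented e, encoded as ISO-8859-1 bytes.
latin1_doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><FOO>caf\xe9</FOO>'

# The parser reads the encoding from the declaration and decodes the bytes.
text = ET.fromstring(latin1_doc).text
print(text)
```

Without the encoding attribute the parser would assume UTF-8, and the lone 0xE9 byte would not be valid input.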

Page 32

The FOO element

• Start tag <FOO>

• Contents "Hello XML!"

• End tag </FOO>

<FOO>Hello XML!</FOO>

Page 33

greeting.xml

<?xml version="1.0" standalone="yes"?>
<GREETING>Hello XML!</GREETING>

Page 34

Style sheets

• Separate from the XML document

• Different Languages

– Cascading Style Sheets Level 1 (CSS1): Internet Explorer 5.0, Mozilla 5.0

– Cascading Style Sheets Level 2 (CSS2): Internet Explorer 5 (partial), Mozilla 5.0 (partial)

– Extensible Style Language (XSL): Internet Explorer 5.0 (older draft, buggy); LotusXSL, XT, other non-browser converters

– Document Style Semantics and Specification Language (DSSSL): Jade

Page 35

xml-stylesheet

• Style sheets are attached via an xml-stylesheet processing instruction in the prolog

<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type="text/css" href="greeting.css"?>

<GREETING>Hello XML!</GREETING>

• Can also use non-browser converters like XT, LotusXSL, and Jade

Page 36

greeting.css

GREETING {display: block; font-size: 24pt; font-weight: bold}

Page 37

greeting.xsl

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">

  <xsl:template match="/">
    <html>
      <body>
        <h1>
          <xsl:value-of select="GREETING"/>
        </h1>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>

Page 38

Attaching a style sheet to an XML document

• xml-stylesheet processing instruction after the XML declaration and before the root element

• type attribute has the value text/css or text/xsl

• href attribute is a URL to the stylesheet, possibly relative
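The three points above can be sketched in Python with the standard xml.dom.minidom module, which keeps processing instructions that appear before the root element. The helper name stylesheet_info is hypothetical, and the regular expression is a simplification that only handles double-quoted values.

```python
import re
from xml.dom import minidom

doc = minidom.parseString(
    '<?xml version="1.0" standalone="yes"?>'
    '<?xml-stylesheet type="text/css" href="greeting.css"?>'
    '<GREETING>Hello XML!</GREETING>'
)

def stylesheet_info(document):
    """Return the pseudo-attributes of the first xml-stylesheet instruction."""
    for node in document.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # The instruction's data looks like attributes but is plain text,
            # so a simple pattern pulls out name="value" pairs.
            return dict(re.findall(r'(\w+)="([^"]*)"', node.data))
    return None

info = stylesheet_info(doc)
print(info)
```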

Page 39

A larger example: Baseball statistics

• Examine the data

• Design a vocabulary for the data

• Write a style sheet

Page 40

Sample statistics

http://cbs.sportsline.com/u/baseball/mlb/stats.htm

Page 41

Organizing the Data

• XML documents are trees.

• XML elements contain other elements as well as text

• Within these limits there's more than one way to organize the data

– Hierarchically

– Relationally

– Objects

Page 42

What is the Root Element

• The League?

• The Season?

• A custom Document element?

Page 43

The Root Element

<?xml version="1.0"?>
<SEASON>
</SEASON>

• Choose SEASON for the root element

• Everything else will be a descendant of SEASON

• This is not the only possible choice

Page 44

What are the Immediate Children of the Root?

• Leagues?

• Teams?

• Players?

• Games?

Page 45

Child Elements

<?xml version="1.0"?>
<SEASON>
  <YEAR>
    1998
  </YEAR>
</SEASON>

Page 46

White space in XML is not especially significant

<?xml version="1.0"?>

<SEASON><YEAR>1998</YEAR></SEASON>
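A short Python sketch of what "not especially significant" means in practice, using the standard xml.etree.ElementTree module. The indentation in the first document is an assumption about how the previous slide was laid out.

```python
import xml.etree.ElementTree as ET

spaced = ET.fromstring("<SEASON>\n  <YEAR>\n    1998\n  </YEAR>\n</SEASON>")
compact = ET.fromstring("<SEASON><YEAR>1998</YEAR></SEASON>")

# The parser hands over the character data exactly as written...
raw_spaced = spaced.findtext("YEAR")
raw_compact = compact.findtext("YEAR")

# ...so treating the two documents as equal is the application's decision.
same_year = raw_spaced.strip() == raw_compact.strip()
print(repr(raw_spaced), repr(raw_compact), same_year)
```

The white space is still passed to the application; it is up to the program reading the data to decide that it does not matter.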

Page 47

Leagues

• Major league baseball is divided into two leagues

• Each league has

– a name

– three divisions

Page 48

Divisions

• Each division has

– name

– 4-6 teams

Page 49

Teams

• Each team has

– Name

– City

– Players

Page 50

Player Data

• Each player has

– First name

– Last name

– Position

– Statistics

Page 51

Player Batting Statistics

• G Games Played

• GS Games Started

• AB At Bats

• R Runs

• H Hits

• 2B Doubles

• 3B Triples

• HR Home Runs

• RBI Runs Batted In

• SB Stolen Bases

• CS Caught Stealing

• SH Sacrifice Hits

• SF Sacrifice Flies

• Err Errors

• PB Pitcher Balked

• BB Base on Balls (Walks)

• SO Strike Outs

• HBP Hit By Pitch

Page 52

What does a player look like

• Long names vs. short names

Page 53

The Complete 1998 Major League

• Long version

Page 54

A Style Sheet

• 1998shortstats.xml

• baseballstats.css

• <?xml-stylesheet type="text/css" href="baseballstats.css"?>

• styled1998shortstats.xml

Page 55

Cascading Style Sheets

• Partially supported by Mozilla and IE 5.0

• Full W3C Recommendation

Page 56

The Default Rule

• Not every element needs a rule

• The root element should be at least display: block

SEASON { font-size: 14pt; background-color: white; color: black; display: block}

Page 57

A style rule for the YEAR element

• Make it look like a title

YEAR { display: block; font-size: 32pt; font-weight: bold; text-align: center}

Page 58

Style Rules for Division and League Names

LEAGUE_NAME { display: block; text-align: center; font-size: 28pt; font-weight: bold}

DIVISION_NAME { display: block; text-align: center; font-size: 24pt; font-weight: bold}

Page 59

Alternate Style Rules for Division and League Names

LEAGUE_NAME, DIVISION_NAME { display: block; text-align: center; font-weight: bold}
LEAGUE_NAME {font-size: 28pt }
DIVISION_NAME {font-size: 24pt }

Page 60

Style Rules for Teams

• Team name and Team city must be one title

• Must be inline elements

• Previous and following must be block elements

TEAM_CITY { font-size: 20pt; font-weight: bold; font-style: italic}

TEAM_NAME { font-size: 20pt; font-weight: bold; font-style: italic}

TEAM, PLAYER {display: block}

Page 61

Style Rules for Players

TEAM {display: table}
TEAM_CITY {display: table-caption}
TEAM_NAME {display: table-caption}
PLAYER {display: table-row}

SURNAME, GIVEN_NAME, POSITION, GAMES, GAMES_STARTED, AT_BATS, RUNS, HITS, DOUBLES, TRIPLES, HOME_RUNS, RBI, STEALS, CAUGHT_STEALING, SACRIFICE_HITS, SACRIFICE_FLIES, ERRORS, WALKS, STRUCK_OUT, HIT_BY_PITCH {display: table-cell}

Page 62

Finished Style Sheet

SEASON {font-size: 14pt; background-color: white; color: black; display: block}
YEAR {display: block; font-size: 32pt; font-weight: bold; text-align: center}
LEAGUE_NAME {display: block; text-align: center; font-size: 28pt; font-weight: bold}
DIVISION_NAME {display: block; text-align: center; font-size: 24pt; font-weight: bold}
TEAM_CITY {font-size: 20pt; font-weight: bold; font-style: italic}
TEAM_NAME {font-size: 20pt; font-weight: bold; font-style: italic}
TEAM {display: block}
PLAYER {display: block}

Page 63

Possible Extensions

• There should be captions like "RBI" or "At Bats."

• Derived numbers like batting averages are not included.

• The titles are short. E.g. "1998" instead of "1998 Major League Baseball".

• The document is so long it's hard to read. Something similar to IE5's collapsible outline view would be nice.

• Pitcher stats should be separated from batter stats.

Page 64

Possible Solutions

• CSS Level 2

• XSL

• XSL + JavaScript

Page 65

Attributes

• name=value

• Values must be quoted

• An element may not have two attributes with the same name

<IMG SRC="cup.gif" WIDTH="89" HEIGHT="67" ALT="Cup of coffee"></IMG>

Page 66

Attributes in the Baseball Example

<SEASON YEAR="1998">
  <!--leagues go here -->
</SEASON>

Page 67

Leagues are still child elements

• Attribute names must be unique

• Leagues have sub-structure (e.g. they contain divisions, teams, players, etc.)

<SEASON YEAR="1998" LEAGUE="National" LEAGUE="American"></SEASON>
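As a quick check of the uniqueness rule, the Python sketch below feeds a parser one SEASON element that carries two attributes both spelled LEAGUE, and one that uses child elements instead. The helper is_well_formed and the child-element markup are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Two attributes with the same name on one element.
duplicate = '<SEASON YEAR="1998" LEAGUE="National" LEAGUE="American"></SEASON>'

# The same information as repeated child elements.
children = ('<SEASON YEAR="1998">'
            '<LEAGUE NAME="National"/><LEAGUE NAME="American"/>'
            '</SEASON>')

print(is_well_formed(duplicate))
print(is_well_formed(children))
```

Attribute names are case-sensitive, so the clash only occurs when the two names match exactly.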

Page 68

Team Attributes

• Divisions and teams can also have NAME attributes without any fear of confusion with the name of a league

• Thus you can use names like NAME instead of LEAGUE_NAME

• Team cities are also good as attributes

Page 69

Team Attributes

<LEAGUE NAME="American League">
  <DIVISION NAME="East">
    <TEAM NAME="Orioles" CITY="Baltimore"></TEAM>
    <TEAM NAME="Red Sox" CITY="Boston"></TEAM>
    <TEAM NAME="Yankees" CITY="New York"></TEAM>
    <TEAM NAME="Blue Jays" CITY="Toronto"></TEAM>
  </DIVISION>
</LEAGUE>

Page 70

Player Attributes

<PLAYER GIVEN_NAME="Joe" SURNAME="Girardi"
        GAMES="78" AT_BATS="254" RUNS="31" HITS="70"
        DOUBLES="11" TRIPLES="4" HOME_RUNS="3"
        RUNS_BATTED_IN="31" WALKS="14" STRUCK_OUT="38"
        STOLEN_BASES="2" CAUGHT_STEALING="4"
        SACRIFICE_FLY="1" SACRIFICE_HIT="8" HIT_BY_PITCH="2">
</PLAYER>
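Once the statistics are attributes, a program can derive numbers the document does not store, such as a batting average. The Python sketch below uses only a subset of the attributes from the slide to keep it short; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

player = ET.fromstring(
    '<PLAYER GIVEN_NAME="Joe" SURNAME="Girardi" '
    'AT_BATS="254" HITS="70"></PLAYER>'
)

# Attribute values are always strings; convert before doing arithmetic.
hits = int(player.get("HITS"))
at_bats = int(player.get("AT_BATS"))
batting_average = round(hits / at_bats, 3)

print(player.get("SURNAME"), batting_average)
```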

Page 71

Attributes and Elements

<P>
  On Tuesday
  <PLAYER GAMES="78" AT_BATS="254" RUNS="31" HITS="70"
          DOUBLES="11" TRIPLES="4" HOME_RUNS="3"
          RUNS_BATTED_IN="31" WALKS="14" STRIKE_OUTS="38"
          STOLEN_BASES="2" CAUGHT_STEALING="4"
          SACRIFICE_FLY="1" SACRIFICE_HIT="8" HIT_BY_PITCH="2">
    <FIRST_NAME>Joe</FIRST_NAME>
    <SURNAME>Girardi</SURNAME>
  </PLAYER>
  struck out twice and...
</P>

Page 72

Attributes vs. Elements

• Attributes are for meta-data; elements are for data

• Does the reader want to see the information? If yes, use element content; if no, use attributes

• Attributes are good for ID numbers, URLs, references, and other information not directly relevant to the reader

Page 73

When not to use attributes

• Attributes can't hold structure well.

• Elements allow you to include meta-meta-data (information about the information about the information).

• Not everyone always agrees on what is and isn't meta-data.

• Elements are more extensible in the face of future changes.

Page 74

Empty Tags

• End with a />

– e.g. <PLAYER/>

• Same as <PLAYER></PLAYER>
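A minimal Python check of that equivalence, using the standard xml.etree.ElementTree module (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring("<PLAYER/>")
long_form = ET.fromstring("<PLAYER></PLAYER>")

# Neither element has text or children; they serialize to the same bytes.
identical = ET.tostring(short_form) == ET.tostring(long_form)
print(identical)
```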

Page 75

The Extensible Style Language

• Partially supported by IE 5.0

• Many third party tools

• W3C Working Draft

Page 76

The Two Parts of XSL

• Transformation Language

• Formatting Objects

Page 77

Templates

<HTML>
  <HEAD>
    <TITLE>
      XSL Instructions to get the title
    </TITLE>
  </HEAD>
  <H1>XSL Instructions to get the title</H1>
  <BODY>
    XSL Instructions to get the statistics
  </BODY>
</HTML>

Page 78

XSL Instructions

• An XSL style sheet is a well-formed XML document

• XSL instructions are particular XML elements

– xsl:apply-templates

– xsl:template

– xsl:for-each

– xsl:value-of

– a few others

Page 79

An XSL style sheet

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">

  <xsl:template match="/">
    <HTML xmlns:xsl="http://www.w3.org/TR/WD-xsl">
      <HEAD>
        <TITLE>Major League Baseball Statistics</TITLE>
      </HEAD>
      <BODY>
        <H1>Major League Baseball Statistics</H1>
      </BODY>
    </HTML>
  </xsl:template>

</xsl:stylesheet>

Page 80

xsl:for-each and xsl:value-of

<xsl:template match="/">
  <HTML xmlns:xsl="http://www.w3.org/TR/WD-xsl">
    <HEAD>
      <TITLE>
        <xsl:for-each select="SEASON">
          <xsl:value-of select="@YEAR"/>
        </xsl:for-each>
        Major League Baseball Statistics
      </TITLE>
    </HEAD>
    <BODY>
      <xsl:for-each select="SEASON">
        <H1>
          <xsl:value-of select="@YEAR"/>
          Major League Baseball Statistics
        </H1>
      </xsl:for-each>
    </BODY>
  </HTML>
</xsl:template>

Page 81

xsl:for-each and xsl:value-of

<xsl:for-each select="SEASON">
  <xsl:value-of select="@YEAR"/>
</xsl:for-each>

Page 82

Namespaces and XSL

• XSL instructions are in the xsl namespace to distinguish them from output HTML elements.

• The namespace is identified by the xmlns:xsl attribute of the root element of the style sheet.

• The value of that attribute is http://www.w3.org/TR/WD-xsl.

Page 83

Leagues

Page 84

Divisions and Teams

Page 85

Players

Page 86

CSS or XSL?

• CSS has broader support

• CSS is more stable

• XSL is more powerful

Page 87

Well-formedness Rules

• Open and close all tags

• Empty tags end with />

• There is a unique root element

• Elements may not overlap

• Attribute values are quoted

• < and & are only used to start tags and entities

• Only the five predefined entity references are used

Page 88

Open and close all tags

Page 89

Empty tags end with />

• <BR/>, <HR/>, and <IMG/> instead of <BR>, <HR>, and <IMG>

• Web browsers deal inconsistently with these

• Can use <BR></BR> <HR></HR> <IMG></IMG> instead

Page 90

There is a unique root element

• One element completely contains all other elements of the document

• This is HTML in HTML files

• XML Declaration is not an element

<?xml version="1.0" standalone="yes"?>
<GREETING>Hello XML!</GREETING>

Page 91

Elements may not overlap

• If an element contains a start tag for an element, it must also contain the corresponding end tag

• Empty elements may appear anywhere

• Every non-root element has a parent element
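To make the overlap rule concrete, here is a small Python sketch that hands a parser one properly nested fragment and one overlapping fragment. The P, B, and I elements and the helper parse_error are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# I opens and closes inside B: properly nested.
nested = "<P><B><I>bold italic</I></B></P>"

# B closes while I is still open: the elements overlap.
overlapping = "<P><B><I>bold italic</B></I></P>"

def parse_error(xml_text):
    """Return the parser's error message, or None if the text is well-formed."""
    try:
        ET.fromstring(xml_text)
        return None
    except ET.ParseError as err:
        return str(err)

print(parse_error(nested))
print(parse_error(overlapping))
```

Many HTML browsers quietly tolerate the second form; an XML parser is required to reject it.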

Page 92

Attribute values are quoted

• Good:

– <A HREF="http://metalab.unc.edu/xml/">

• Bad:

– <A HREF=http://metalab.unc.edu/xml/>

Page 93

< and & are only used to start tags and entities

• Good: <H1>O'Reilly &amp; Associates</H1>

• Bad: <H1> O'Reilly & Associates</H1>

• Good: – <CODE>for (int i = 0; i &lt;= args.length; i++ ) { </CODE>

• Bad: – <CODE>for (int i = 0; i <= args.length; i++ ) { </CODE>
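When text is generated by a program, the escaping can be done mechanically. The sketch below uses escape from Python's standard xml.sax.saxutils module; wrapping the result in an H1 element mirrors the slide, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "O'Reilly & Associates"

# escape() replaces &, <, and > with &amp;, &lt;, and &gt;.
safe = escape(raw)
code = escape("for (int i = 0; i <= args.length; i++ ) {")

# The escaped text parses, and the parser gives back the original characters.
heading = ET.fromstring("<H1>" + safe + "</H1>")

print(safe)
print(code)
print(heading.text)
```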

Page 94

Only the five predefined entity references are used

• Good:

– &amp;

– &lt;

– &gt;

– &quot;

– &apos;

• Bad:

– &copy;

– &reg;

– &tm;

– &alpha;

– &eacute;

– &nbsp;

– etc.

Page 95

DTDs and Validity

• A Document Type Definition describes the elements and attributes that may appear in a document

• Validation compares a particular document against a DTD

• Well-formedness is a prerequisite for validity

Page 96

What is a DTD?

• a list of the elements, tags, attributes, and entities contained in a document, and their relationship to each other

• internal vs. external DTDs

Page 97

The importance of validation

• Ensure that data is correct before feeding it into a program

• Ensure that a format is followed

• Establish what must be supported

• Not all documents need to be valid; sometimes well-formed is enough

Page 98

A DTD for greeting.xml

• greeting.xml:

<?xml version="1.0"?>
<GREETING>Hello XML!</GREETING>

• greeting.dtd:

<!ELEMENT GREETING (#PCDATA)>

Page 99

Document Type Declarations

<?xml version="1.0"?>
<!DOCTYPE GREETING SYSTEM "greeting.dtd">

<GREETING>Hello XML!</GREETING>

• specifies the root element

• gives a URL for the DTD

Page 100

Invalid Documents

• Valid:

<GREETING>various random text but no markup</GREETING>

• Invalid: anything else, including

<GREETING>
  <sometag>various random text</sometag>
  <someEmptyTag/>
</GREETING>

– or

<GREETING>
  <GREETING>various random text</GREETING>
</GREETING>

Page 101: Validating Tools

• Command line programs like XJParse

• Online validators

– http://www.stg.brown.edu/service/xmlvalid/

– http://www.cogsci.ed.ac.uk/%7Erichard/xml-check.html

• Browsers

Page 102: Element Declarations

• Each element must be declared in a <!ELEMENT> declaration.

• A <!ELEMENT> declaration gives the name and content model of the element

• The content model uses a simple regular expression-like grammar to precisely specify what is and isn't allowed in an element

Page 103: Content Specifications

• ANY

• #PCDATA

• Sequences

• Choices

• Mixed Content

• Modifiers

• Empty

Page 104: ANY

<!ELEMENT SEASON ANY>

• A SEASON can contain any child element and/or raw text (parsed character data)

Page 105: #PCDATA

<!ELEMENT YEAR (#PCDATA)>

• Parsed Character Data; i.e. raw text, no markup

Page 106: #PCDATA

• Valid:

<YEAR>1999</YEAR>
<YEAR>99</YEAR>
<YEAR>1999 C.E.</YEAR>
<YEAR> The year of our Lord one thousand, nine hundred, and ninety-nine </YEAR>

• Invalid:

<YEAR>
  <MONTH>January</MONTH>
  <MONTH>February</MONTH>
  <MONTH>March</MONTH>
  <MONTH>April</MONTH>
  <MONTH>May</MONTH>
  <MONTH>June</MONTH>
  <MONTH>July</MONTH>
  <MONTH>August</MONTH>
  <MONTH>September</MONTH>
  <MONTH>October</MONTH>
  <MONTH>November</MONTH>
  <MONTH>December</MONTH>
</YEAR>

Page 107: Child Elements

• To declare that a LEAGUE element must have a LEAGUE_NAME child:

<!ELEMENT LEAGUE (LEAGUE_NAME)>

<!ELEMENT LEAGUE_NAME (#PCDATA)>

Page 108: Sequences

• Separate multiple required child elements with commas; e.g.

<!ELEMENT SEASON (YEAR, LEAGUE, LEAGUE)>

<!ELEMENT LEAGUE (LEAGUE_NAME, DIVISION, DIVISION, DIVISION)>

Page 109: One or More Children +

<!ELEMENT DIVISION_NAME (#PCDATA)>

<!ELEMENT DIVISION (DIVISION_NAME, TEAM+)>

Page 110: Zero or More Children *

<!ELEMENT TEAM (TEAM_CITY, TEAM_NAME, PLAYER*)>

<!ELEMENT TEAM_CITY (#PCDATA)>

<!ELEMENT TEAM_NAME (#PCDATA)>

Page 111: Zero or One Children ?

<!ELEMENT PLAYER (GIVEN_NAME, SURNAME, POSITION, GAMES, GAMES_STARTED,
  AT_BATS?, RUNS?, HITS?, DOUBLES?, TRIPLES?, HOME_RUNS?, RBI?, STEALS?,
  CAUGHT_STEALING?, SACRIFICE_HITS?, SACRIFICE_FLIES?, ERRORS?, WALKS?,
  STRUCK_OUT?, HIT_BY_PITCH?, WINS?, LOSSES?, SAVES?, COMPLETE_GAMES?,
  SHUT_OUTS?, ERA?, INNINGS?, EARNED_RUNS?, HIT_BATTER?, WILD_PITCHES?,
  BALK?, WALKED_BATTER?, STRUCK_OUT_BATTER?)>

Page 112: Finished DTD

Page 113: Choices

<!ELEMENT PAYMENT (CASH | CREDIT_CARD)>

<!ELEMENT PAYMENT (CASH | CREDIT_CARD | CHECK)>

Page 114: Grouping With Parentheses

• Parentheses combine several elements into a single element.

• A parenthesized element can be nested inside other parentheses in place of a single element.

• The parenthesized element can be suffixed with a plus sign, an asterisk, or a question mark.

<!ELEMENT dl (dt, dd)*>

<!ELEMENT ARTICLE (TITLE, (P | PHOTO | GRAPH | SIDEBAR | PULLQUOTE | SUBHEAD)*, BYLINE?)>
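Because content models are regular-expression-like, one way to build intuition is to translate a model into an actual regular expression and test sequences of child names against it. The following is only a sketch under stated assumptions: it uses Python (not part of the slides), the function names content_model_to_regex and matches are invented, and it handles element-only models with commas, |, *, +, ? and parentheses, not #PCDATA, ANY, or EMPTY. It is not how any particular validating parser works.

```python
import re


def content_model_to_regex(model: str) -> "re.Pattern[str]":
    """Translate an element-only DTD content model into a compiled regex.

    Each child name becomes a group that matches the name plus one trailing
    space, so a list of children is matched as "NAME1 NAME2 ... ".
    """
    tokens = re.findall(r"[A-Za-z_][\w.\-]*|[(),|*+?]", model)
    parts = []
    for token in tokens:
        if token == ",":
            continue  # a sequence is plain concatenation in a regex
        if token == "(":
            parts.append("(?:")
        elif token in (")", "|", "*", "+", "?"):
            parts.append(token)  # same meaning in both notations
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def matches(model: str, children: list) -> bool:
    """Check whether a list of child element names satisfies the model."""
    sequence = "".join(name + " " for name in children)
    return content_model_to_regex(model).fullmatch(sequence) is not None


ARTICLE = "(TITLE, (P | PHOTO | GRAPH | SIDEBAR | PULLQUOTE | SUBHEAD)*, BYLINE?)"

print(matches(ARTICLE, ["TITLE", "P", "PHOTO", "P", "BYLINE"]))
print(matches(ARTICLE, ["P", "TITLE"]))
print(matches("(dt, dd)*", ["dt", "dd", "dt", "dd"]))
```

The trailing space after each name keeps a short name such as P from matching the start of a longer one such as PHOTO.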

Page 115: Mixed Content

• Both #PCDATA and child elements in a choice

<!ELEMENT TEAM (#PCDATA | TEAM_CITY | TEAM_NAME | PLAYER)*>

• #PCDATA must come first

• #PCDATA cannot be used in a sequence
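To see what mixed content looks like to a program, the sketch below parses a made-up TEAM element (the sample text is an assumption, not from the slides) with Python's standard xml.etree.ElementTree. In that API, text before the first child is stored in the parent's text attribute, and text following a child is stored in that child's tail attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: character data interleaved with child elements.
team = ET.fromstring(
    "<TEAM>The <TEAM_CITY>New York</TEAM_CITY> "
    "<TEAM_NAME>Yankees</TEAM_NAME> play in the Bronx.</TEAM>"
)

# Walk the mixed content in document order: text, child, text, child, text.
pieces = [("text", team.text)]
for child in team:
    pieces.append((child.tag, child.text))
    pieces.append(("text", child.tail))

for kind, value in pieces:
    print(f"{kind}: {value!r}")
```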

Page 116: Empty elements

<!ELEMENT BR EMPTY>

<!ELEMENT IMG EMPTY>

<!ELEMENT HR EMPTY>

Page 117: Attribute Declarations

• Consider this element:

<GREETING LANGUAGE="Spanish">
  Hola!
</GREETING>

• It is declared like this:

<!ELEMENT GREETING (#PCDATA)>
<!ATTLIST GREETING LANGUAGE CDATA "English">

• <!ATTLIST Element_name Attribute_name Type Default_value>
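The effect of a default value can be observed from a program. The sketch below assumes Python's standard xml.etree.ElementTree, whose underlying Expat parser does not validate but does read declarations in the internal DTD subset; the variable names are illustrative. An element that omits LANGUAGE should be reported with the declared default.

```python
import xml.etree.ElementTree as ET

# The declarations from the slide, placed in an internal DTD subset.
DOCTYPE = (
    "<!DOCTYPE GREETING [\n"
    "  <!ELEMENT GREETING (#PCDATA)>\n"
    '  <!ATTLIST GREETING LANGUAGE CDATA "English">\n'
    "]>\n"
)

explicit = ET.fromstring(DOCTYPE + '<GREETING LANGUAGE="Spanish">Hola!</GREETING>')
defaulted = ET.fromstring(DOCTYPE + "<GREETING>Hello XML!</GREETING>")

print(explicit.get("LANGUAGE"))   # value written in the document
print(defaulted.get("LANGUAGE"))  # value supplied from the ATTLIST default
```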

Page 118: Multiple Attribute Declarations

• Consider this element:

<RECTANGLE LENGTH="70px" WIDTH="85px"/>

• With two attribute declarations:

<!ELEMENT RECTANGLE EMPTY>
<!ATTLIST RECTANGLE LENGTH CDATA "0px">
<!ATTLIST RECTANGLE WIDTH CDATA "0px">

• With one attribute declaration:

<!ATTLIST RECTANGLE LENGTH CDATA "0px"
                    WIDTH  CDATA "0px">

• Indentation is a convention, not a requirement

Page 119: Attribute Types

• CDATA

• ID

• IDREF

• IDREFS

• ENTITY

• ENTITIES

• NOTATION

• NMTOKEN

• NMTOKENS

• Enumerated

Page 120: CDATA

• Most general attribute type

• Value can be any string of text not containing a less-than sign (<), a bare ampersand (&), or the quotation mark used to delimit the value

Page 121: ID

• Value must be an XML name

– May include letters, digits, underscores, hyphens, and periods

– May not include whitespace

– May contain colons only if used for namespaces

• Value must be unique within ID type attributes in the document

• Generally the default value is #REQUIRED

Page 122: IDREF

• Value matches the ID of an element in the same document

• Used for links and the like

Page 123: IDREFS

• A list of ID values in the same document

• Separated by white space
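The ID, IDREF, and IDREFS rules can be sketched as a small check. A non-validating parser does not know attribute types, so this Python sketch simply assumes which attribute names play each role: ID, REF, and REFS are hypothetical names, as are the function check_ids and the sample ROSTER document. A validating parser would take the types from the DTD instead.

```python
import xml.etree.ElementTree as ET


def check_ids(root, id_attr="ID", idref_attr="REF", idrefs_attr="REFS"):
    """Return a list of problems: duplicate IDs and references to missing IDs."""
    problems = []
    seen = set()

    # First pass: every ID value must be unique within the document.
    for elem in root.iter():
        value = elem.get(id_attr)
        if value is None:
            continue
        if value in seen:
            problems.append(f"duplicate ID: {value}")
        seen.add(value)

    # Second pass: every IDREF (and each item of an IDREFS list) must match an ID.
    for elem in root.iter():
        refs = []
        if elem.get(idref_attr) is not None:
            refs.append(elem.get(idref_attr))
        # An IDREFS value is a whitespace-separated list of IDs.
        refs.extend(elem.get(idrefs_attr, "").split())
        for ref in refs:
            if ref not in seen:
                problems.append(f"dangling reference: {ref}")
    return problems


doc = ET.fromstring(
    "<ROSTER>"
    '<PLAYER ID="p1"/>'
    '<PLAYER ID="p2"/>'
    '<PLAYER ID="p2"/>'
    '<CAPTAIN REF="p1"/>'
    '<LINEUP REFS="p1 p2 p9"/>'
    "</ROSTER>"
)
print(check_ids(doc))
```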

Page 124: ENTITY

• Value is the name of an unparsed general entity declared in the DTD

Page 125: ENTITIES

• Value is a list of unparsed general entities declared in the DTD

• Separated by white space

Page 126: NOTATION

• Value is the name of a notation declared in the DTD

Page 127: NMTOKEN

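• Value is any legal XML name token: like an XML name, except that any name character may come first, so it may begin with a digit, hyphen, or period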

Page 128: NMTOKENS

• Value is a list of XML name tokens

• Separated by white space

Page 129: Enumerated

• Not a keyword

• Refers to a list of possible values from which one must be chosen

• Default value is generally provided explicitly

<!ATTLIST P VISIBLE (TRUE | FALSE) "TRUE">

Page 130: Attribute Default Values

• A literal string value

• One of these three keywords

– #REQUIRED

– #IMPLIED

– #FIXED

Page 131: #REQUIRED

• No default value is provided in the DTD

• Document authors must provide an attribute value for each element

<!ELEMENT IMG EMPTY>
<!ATTLIST IMG ALT CDATA #REQUIRED>
<!ATTLIST IMG WIDTH CDATA #REQUIRED>
<!ATTLIST IMG HEIGHT CDATA #REQUIRED>

Page 132: #IMPLIED

• No default value in the DTD

• Author may (but does not have to) provide a value with each element

Page 133: #FIXED

• Value is the same for all elements

• Default value must be provided in DTD

• Document author may not change default value

<!ELEMENT AUTHOR EMPTY>
<!ATTLIST AUTHOR NAME CDATA #REQUIRED>
<!ATTLIST AUTHOR EMAIL CDATA #REQUIRED>
<!ATTLIST AUTHOR EXTENSION CDATA #IMPLIED>
<!ATTLIST AUTHOR COMPANY CDATA #FIXED "TIC">
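The difference between #IMPLIED and #FIXED shows up when a document leaves both attributes out. This sketch again assumes Python's standard xml.etree.ElementTree with the AUTHOR declarations in an internal DTD subset; the attribute values in the sample element are placeholders. The parser is non-validating, so it would not complain about a missing #REQUIRED attribute, but it should still supply the #FIXED value and leave the #IMPLIED attribute absent.

```python
import xml.etree.ElementTree as ET

document = (
    "<!DOCTYPE AUTHOR [\n"
    "  <!ELEMENT AUTHOR EMPTY>\n"
    "  <!ATTLIST AUTHOR NAME      CDATA #REQUIRED>\n"
    "  <!ATTLIST AUTHOR EMAIL     CDATA #REQUIRED>\n"
    "  <!ATTLIST AUTHOR EXTENSION CDATA #IMPLIED>\n"
    '  <!ATTLIST AUTHOR COMPANY   CDATA #FIXED "TIC">\n'
    "]>\n"
    # Placeholder values; only NAME and EMAIL are written by the author.
    '<AUTHOR NAME="A. Writer" EMAIL="writer@example.com"/>'
)

author = ET.fromstring(document)
print(author.get("COMPANY"))    # supplied from the #FIXED declaration
print(author.get("EXTENSION"))  # #IMPLIED and omitted, so no value at all
```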

Page 134: Internal DTDs

<?xml version="1.0"?>
<!DOCTYPE GREETING [
  <!ELEMENT GREETING (#PCDATA)>
]>
<GREETING>Hello XML!</GREETING>

Page 135: Internal DTD Subsets

<?xml version="1.0"?>
<!DOCTYPE GREETING SYSTEM "greeting.dtd" [
  <!ELEMENT GREETING (#PCDATA)>
]>
<GREETING>Hello XML!</GREETING>

• Internal declarations override external declarations

Page 136: Programming with XML

• Java works best

• C, Perl, Python etc. can also be used

• Unicode support is the biggest issue

Page 137: SAX, the Simple API for XML

• Event based

• Programs can plug in different parsers
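"Event based" means the parser calls back into your code as it meets each piece of the document, rather than handing over a finished tree. The slides discuss SAX in a Java context; the sketch below uses Python's standard xml.sax module instead, which follows the same callback model. The class name EventLogger is made up for illustration.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Record each parser callback instead of building a tree."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # A parser may deliver one run of text in several chunks.
        self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


handler = EventLogger()
xml.sax.parseString(b"<GREETING>Hello XML!</GREETING>", handler)

for event in handler.events:
    print(event)
```

Here parseString uses the module's default parser; xml.sax.make_parser() accepts a list of parser module names, which is one way a program can ask for a different parser without changing the handler.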

Page 138: The Document Object Model (DOM)

Page 139: Additional Technologies

• Namespaces

• XLinks

• XPointers

• RDF

Page 140: Namespaces

• Attach a prefix to each element

• Prefix is connected to a unique URI by an xmlns attribute

– Uniform Resource Identifier

– URI need not point to a page

xmlns:bb="http://metalab.unc.edu/xml/baseball/"
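What matters to a namespace-aware program is the URI, not the prefix. The sketch below, assuming Python's standard xml.etree.ElementTree, parses the same made-up PLAYER element twice with different prefixes bound to the baseball URI from the slide; the element content and the prefix x are illustrative. ElementTree reports each element name as {URI}local-name.

```python
import xml.etree.ElementTree as ET

BASEBALL = "http://metalab.unc.edu/xml/baseball/"

# Same URI, two different prefixes.
doc_bb = ET.fromstring(
    f'<bb:PLAYER xmlns:bb="{BASEBALL}"><bb:SURNAME>Doe</bb:SURNAME></bb:PLAYER>'
)
doc_x = ET.fromstring(
    f'<x:PLAYER xmlns:x="{BASEBALL}"><x:SURNAME>Doe</x:SURNAME></x:PLAYER>'
)

print(doc_bb.tag)
print(doc_bb.tag == doc_x.tag)

# Lookups use the URI in braces, never the prefix.
surname = doc_bb.find(f"{{{BASEBALL}}}SURNAME")
print(surname.text)
```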

Page 141: XLinks

Page 142: XPointers

Page 143: The Resource Description Framework

Page 144: To Learn More: Books

• XML: Extensible Markup Language

– IDG Books 1998

– ISBN 0-76453-199-9

• The XML Bible

– IDG Books 1999

– ISBN 0-76453-236-7

Page 145: Questions?

