Extensible Markup Language (XML) Hamid Zarrabi-Zadeh Web Programming – Fall 2013

Outline

• Introduction

• XML Structure

• Document Type Definition (DTD)

• XHTML

• Formatting XML

– CSS Formatting

– XSLT Transformations

• JSON

What is XML?

• XML is a markup language for encoding documents in a format that is both human-readable and machine-readable

• Is designed to transport and store data

• Emphasizes simplicity, generality, and usability over the Internet

• Has strong support via Unicode for the languages of the world

XML History

• XML is based on SGML, the Standard Generalized Markup Language (ISO 8879:1986)

• Most of XML comes from SGML unchanged

• First XML specification draft published in 1996

• XML 1.0 became a W3C recommendation in 1998 (fifth edition published in 2008)

• XML 1.1 published in 2004 (revised in 2006), but is not widely implemented and is rarely used

XML Example

• A simple XML example:

<?xml version="1.0"?>
<message>
  <from>Hassan</from>
  <to>Hossein</to>
  <body>Please give me a call!</body>
</message>

XML Example

• Another example:

<?xml version="1.0"?>
<books>
  <book>
    <title>Maktub</title>
    <author>Paulo Coelho</author>
  </book>
  <book>
    <title>Never Crashed!</title>
    <author>Microsoft</author>
  </book>
</books>
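A document like this can be read with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the function name `list_books` is an assumption made for illustration and is not part of the slides.

```python
import xml.etree.ElementTree as ET

# The books document from the slide, as a string.
BOOKS_XML = """<?xml version="1.0"?>
<books>
  <book>
    <title>Maktub</title>
    <author>Paulo Coelho</author>
  </book>
  <book>
    <title>Never Crashed!</title>
    <author>Microsoft</author>
  </book>
</books>"""


def list_books(xml_text):
    """Return (title, author) pairs for every <book> directly under the root."""
    root = ET.fromstring(xml_text)
    return [(book.findtext("title"), book.findtext("author"))
            for book in root.findall("book")]


if __name__ == "__main__":
    for title, author in list_books(BOOKS_XML):
        print(f"{title} by {author}")
```

The parser gives back a tree of elements, so the application code only deals with tag names and text, not with angle brackets.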


XML versus HTML

• XML and HTML are both markup languages

• HTML is for displaying data, while XML is for describing data

• XML syntax differences

– New tags may be defined at will

– Tags may be nested to arbitrary depth

– A document may contain an optional description of its grammar

• XHTML is a version of HTML in XML

XML Markup Languages

• Lots of new markup languages have been created with XML, including:

– XHTML

– RSS for news feeds

– RDF for describing resources

– SVG for scalable vector graphics

– SMIL for describing multimedia for the web

– MathML for describing mathematical notation

XML Pros and Cons

• Pros:

– software- and hardware-independent

– simplifies sharing data between applications and transporting data between different platforms

• Cons:

– verbosity

– rather complex parsing and mapping to type systems

XML Structure


XML Tree

• Each XML document forms a tree structure that starts at the root and branches to the leaves

<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
</bookstore>
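To make the tree structure concrete, here is a small sketch that walks the bookstore document depth-first with Python's standard `xml.etree.ElementTree`. The helper names `walk` and `leaf_tags` are illustrative assumptions, not something defined in the slides.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
</bookstore>"""


def walk(element, depth=0):
    """Yield (depth, tag, attributes, text) for an element and all its descendants."""
    text = (element.text or "").strip()
    yield depth, element.tag, dict(element.attrib), text
    for child in element:
        yield from walk(child, depth + 1)


def leaf_tags(root):
    """Return the tags of all elements that have no child elements."""
    return [element.tag for element in root.iter() if len(element) == 0]


if __name__ == "__main__":
    root = ET.fromstring(BOOKSTORE_XML)
    for depth, tag, attributes, text in walk(root):
        print("  " * depth + tag, attributes, text)
```

The root (`bookstore`) sits at depth 0, `book` at depth 1, and the four leaves at depth 2.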

XML Tree Example

[Figure: tree diagram of an XML document]

XML Tags

• XML tags are similar to HTML tags, but

– They are case-sensitive

– All tags must be closed

• Like HTML tags, they must be properly nested

• All XML documents must have a single root element that contains all other elements

– This root element can have any name
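These rules are what a parser checks when it decides whether a document is well-formed. As a rough sketch, the standard Python parser can be used as a well-formedness test: it raises `ParseError` on any violation. The function name `is_well_formed` and the sample tags are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    samples = [
        "<note><to>Ali</to></note>",   # fine
        "<Note></note>",               # tags are case-sensitive
        "<p>unclosed",                 # every tag must be closed
        "<b><i>text</b></i>",          # improper nesting
        "<a/><b/>",                    # more than one root element
    ]
    for sample in samples:
        print(sample, is_well_formed(sample))
```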


XML Attributes

• XML elements can have attributes

• Attribute values must be quoted with either single or double quotes

• Attributes have limitations (use with care)

– Child elements are more flexible alternatives

<book title="Let's party!"/>

<book>
  <title>It's me</title>
  <author>Me who</author>
</book>

<film name='The "Lost"'/>
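When attribute values are generated by a program, the quote character has to be chosen (or the value escaped) so that the result stays well-formed. A minimal sketch using `quoteattr` from Python's standard `xml.sax.saxutils`; the helper `empty_element` is a hypothetical name introduced here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def empty_element(tag, name, value):
    """Build <tag name=.../> with the value quoted so the markup is well-formed."""
    # quoteattr picks single or double quotes depending on the value,
    # and escapes characters such as & and < when needed.
    return f"<{tag} {name}={quoteattr(value)}/>"


if __name__ == "__main__":
    for value in ["Let's party!", 'The "Lost"', 'Both \' and "']:
        markup = empty_element("book", "title", value)
        # Parsing the markup back should recover the original value.
        print(markup, ET.fromstring(markup).get("title") == value)
```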

Document Type Definitions (DTDs)

Document Type Definitions

• Most applications will not be able to deal with general XML documents

• Instead, they expect documents that have a specific structure

• This structure can be defined with an XML Document Type Definition (DTD)

• A DTD specifies the root node's tag name and what it contains

Valid XML

• A well-formed XML document that conforms to the rules of a DTD is called a valid XML document

<?xml version="1.0"?>
<!DOCTYPE message SYSTEM "message.dtd">
<message>
  <from>Hassan</from>
  <to>Hossein</to>
  <body>Please give me a call!</body>
</message>

DTD Example

• A simple DTD for our message example would look like this

<!DOCTYPE message [
  <!-- subject? is optional, so the message shown earlier (which has no subject) is valid -->
  <!ELEMENT message (from,to,subject?,body)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>

DTD Building Blocks

• In a DTD we can specify

Elements – tags and the text between them

Attributes – information about elements

Entities – named replacement text, including the special characters &lt;, &gt;, &amp;

PCDATA – parsed character data (parsed by the XML parser and examined for markup)

CDATA – (unparsed) character data

Elements

• There are different ways to declare an element

– Empty

– Parsed character data

– Anything

– With a specific sequence of children

<!ELEMENT br EMPTY>
<!ELEMENT p (#PCDATA)>
<!ELEMENT x ANY>
<!ELEMENT message (from,to,subject,body)>

Elements with Children

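• Child sequences can be specified using a syntax similar to regular expressions

<!ELEMENT picture (polygon*)>            <!-- zero or more polygons -->
<!ELEMENT picture (polygon+)>            <!-- one or more polygons -->
<!ELEMENT picture (polygon?)>            <!-- at most one polygon -->
<!ELEMENT polygon (point,point,point+)>  <!-- at least three points, in sequence -->
<!ELEMENT picture (polygon|image)>       <!-- exactly one polygon or one image -->
<!ELEMENT picture (polygon|image)*>      <!-- any mix of polygons and images -->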
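The similarity to regular expressions can be taken literally. The sketch below is not a DTD validator: it only handles element-only content models (no #PCDATA, EMPTY or ANY), translating one into a Python regular expression and matching it against the sequence of child tag names. The function names are assumptions made for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Simplified pattern for an element name inside a content model.
NAME = re.compile(r"[A-Za-z_][\w.\-]*")


def content_model_to_regex(model):
    """Translate e.g. "(point,point,point+)" into a regex over child tag names.

    Each name becomes a group matching that name followed by one space;
    "," (sequence) becomes plain concatenation; "|", "?", "*", "+" and
    parentheses already mean the same thing in a regular expression.
    """
    model = re.sub(r"\s+", "", model)
    pattern = NAME.sub(lambda m: "(?:" + re.escape(m.group()) + " )", model)
    return re.compile(pattern.replace(",", ""))


def children_match(element, model):
    """Check whether the child elements of `element` satisfy the content model."""
    sequence = "".join(child.tag + " " for child in element)
    return content_model_to_regex(model).fullmatch(sequence) is not None


if __name__ == "__main__":
    triangle = ET.fromstring("<polygon><point/><point/><point/></polygon>")
    segment = ET.fromstring("<polygon><point/><point/></polygon>")
    print(children_match(triangle, "(point,point,point+)"))
    print(children_match(segment, "(point,point,point+)"))
```

A real validating parser does the same kind of matching, but also checks attributes, entities and text content.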


Element Attributes

• We can also specify which attributes an element has

<!ATTLIST element-name attribute-name attribute-type default-value>

<!ATTLIST polygon boundary CDATA "black">
<!ATTLIST polygon interior CDATA "white">
<!ATTLIST polygon fill (true|false) "true">
<!ATTLIST point x CDATA "0">

Attribute Value Types

• Attribute value types can be

CDATA - The value is character data

(en1|en2|..) - The value must be one from an enumerated list

ID - The value is a unique id

IDREF - The value is the id of another element

IDREFS - The value is a list of other ids

NMTOKEN - The value is a valid XML name token

NMTOKENS - The value is a list of valid XML name tokens

ENTITY - The value is an entity

ENTITIES - The value is a list of entities

NOTATION - The value is a name of a notation

xml: - The value is a predefined xml value


Default Attribute Values

• Default attribute values can be

Value - The default value of the attribute

#REQUIRED - The attribute value must be included in the element (no default)

#IMPLIED - The attribute does not have to be included

#FIXED value - The attribute value is fixed
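The four kinds of default can be read as rules for computing the attributes an application finally sees. The sketch below models them with a plain dictionary; it does not parse `<!ATTLIST>` declarations, and the data layout (`"DEFAULT"`, `"#REQUIRED"`, ... paired with a value) is an assumption made only for this illustration.

```python
def effective_attributes(declarations, given):
    """Apply DTD-style default rules to the attributes written on an element.

    declarations: attribute name -> (kind, value), where kind is one of
                  "DEFAULT", "#REQUIRED", "#IMPLIED", "#FIXED".
    given:        attributes that actually appear in the document.
    Undeclared attributes are passed through unchecked in this sketch.
    """
    result = dict(given)
    for name, (kind, value) in declarations.items():
        if kind == "DEFAULT":
            # Use the declared value only if the document gives none.
            result.setdefault(name, value)
        elif kind == "#REQUIRED":
            if name not in given:
                raise ValueError(f"missing required attribute {name!r}")
        elif kind == "#IMPLIED":
            # Optional, no default: nothing to add.
            pass
        elif kind == "#FIXED":
            if given.get(name, value) != value:
                raise ValueError(f"attribute {name!r} is fixed to {value!r}")
            result[name] = value
        else:
            raise ValueError(f"unknown default kind {kind!r}")
    return result


if __name__ == "__main__":
    polygon_rules = {
        "boundary": ("DEFAULT", "black"),
        "fill": ("DEFAULT", "true"),
        "id": ("#REQUIRED", None),
        "note": ("#IMPLIED", None),
        "units": ("#FIXED", "px"),
    }
    print(effective_attributes(polygon_rules, {"id": "p1", "fill": "false"}))
```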


Entities

• Entities are variables used to define common text

<!ENTITY entity-name "entity-value">


<!ENTITY sut "Sharif University of Technology">

...

[in XML file:]

&sut;
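Entity references are replaced by the parser, so the application sees only the expanded text. A small sketch with Python's standard `xml.etree.ElementTree`, assuming the entity is declared in the document's internal DTD subset; the root element name `note` is made up for the example.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY sut "Sharif University of Technology">
]>
<note>Welcome to &sut;! Use &lt;tags&gt; &amp; entities.</note>"""


def expanded_text(xml_text):
    """Return the root element's text after the parser has expanded entities."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    print(expanded_text(DOC))
```

The predefined entities (`&lt;`, `&gt;`, `&amp;`) need no declaration; `&sut;` works only because the DOCTYPE declares it.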

Example – Newspaper

<!DOCTYPE newspaper [
  <!ELEMENT newspaper (article+)>
  <!ELEMENT article (headline,byline,body,notes)>
  <!ELEMENT headline (#PCDATA)>
  <!ELEMENT byline (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ELEMENT notes (#PCDATA)>
  <!ATTLIST article author CDATA #REQUIRED>
  <!ATTLIST article editor CDATA #IMPLIED>
  <!ATTLIST article date CDATA #IMPLIED>
  <!ATTLIST article edition CDATA #IMPLIED>
  <!ENTITY publisher "Sample Press">
  <!ENTITY copy "Copyright 2013 Sample Press">
]>

XML Schema

• XML Schema is an XML-based alternative to DTD

• Main differences from DTDs

– XML schemas use XML syntax

– XML schemas support data types

– XML schemas are extensible

Schema Example

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="message">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="subject" type="xs:string" minOccurs="0"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
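Because a schema is itself an XML document, it can be inspected with the same tools as any other XML. The sketch below does not validate anything (the Python standard library has no XSD validator); it only reads a schema and lists the children declared for an element. It assumes the schema is wrapped in an `xs:schema` root that binds the `xs` prefix, and the function name `declared_children` is an illustrative assumption.

```python
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

SCHEMA = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="message">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="subject" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def declared_children(schema_text, element_name):
    """Return (name, type) pairs declared in the xs:sequence of a top-level element."""
    root = ET.fromstring(schema_text)
    for element in root.findall("xs:element", NS):
        if element.get("name") == element_name:
            sequence = element.find("xs:complexType/xs:sequence", NS)
            if sequence is None:
                return []
            return [(child.get("name"), child.get("type"))
                    for child in sequence.findall("xs:element", NS)]
    raise KeyError(element_name)


if __name__ == "__main__":
    for name, type_name in declared_children(SCHEMA, "message"):
        print(name, type_name)
```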


XHTML


XHTML

• XHTML is a version of HTML that is proper XML

• XHTML 1.0 released in 2000

• Because it is XML, it is defined using a DTD

• The html tag must have an xmlns attribute

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
  </body>
</html>
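The xmlns attribute matters to XML tools: it places every element of the page in the XHTML namespace. A brief sketch with Python's standard `xml.etree.ElementTree` shows the effect; the page content (`title`, `p`) is invented for the example, and the external DTD named in the DOCTYPE is not fetched by this parser.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Hello</title></head>
  <body><p>Hi<br /></p></body>
</html>"""


def page_title(xhtml_text):
    """Return the text of <title>, looked up in the XHTML namespace."""
    root = ET.fromstring(xhtml_text)
    return root.findtext("x:head/x:title", namespaces={"x": XHTML_NS})


if __name__ == "__main__":
    root = ET.fromstring(PAGE)
    print(root.tag)           # the tag name includes the namespace URI
    print(root.find("body"))  # a lookup without the namespace finds nothing
    print(page_title(PAGE))
```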


XHTML versus HTML

• XHTML and HTML have mostly the same tags

• Main differences have to do with XML syntax

– All tags must be closed

– Empty tags must also be closed

– Elements must be properly nested

– Tag names must be lowercase

– Attribute values must be quoted

– Attributes must have values

<input type="checkbox" checked="checked" />
<input type="text" readonly="readonly" />

– The id attribute replaces the name attribute

Formatting XML


CSS Formatting

• Formatting information can be added to XML documents using CSS

• This works by adding a reference to a CSS stylesheet in the XML document header

<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="msg.css"?>
<message>
  <from>Hassan</from>
  <to>Hossein</to>
  <body>Please give me a call!</body>
</message>

CSS Example

• The :before and :after CSS pseudo-elements can be very useful here

from {
  display: block;
  padding: 10px;
}

from:before {
  content: "From: ";
  font-weight: bold;
}

XSLT Transformations

• Formatting XML with CSS is not the most common method

• W3C recommends using XSLT instead

• XSLT (Extensible Stylesheet Language Transformations) is a language for transforming XML documents into other XML documents

• To display XML on the web, we could use XSLT to convert our XML document into an XHTML document

XSLT Example

<?xml version="1.0"?>
<html xsl:version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <xsl:for-each select="messages/message">
      <div style="padding:10px; margin:10px">
        <div><b>From</b>:
          <xsl:value-of select="from"/></div>
        <div><b>To</b>:
          <xsl:value-of select="to"/></div>
        <div><xsl:value-of select="body"/></div>
      </div>
    </xsl:for-each>
  </body>
</html>
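To show what this stylesheet computes, the sketch below performs the same transformation by hand in Python. It is not XSLT and does not run the stylesheet (the Python standard library has no XSLT processor); it simply builds the equivalent XHTML tree from a `messages` document. The sample input and the function name `messages_to_xhtml` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

MESSAGES_XML = """<messages>
  <message><from>Hassan</from><to>Hossein</to><body>Please give me a call!</body></message>
  <message><from>Hossein</from><to>Hassan</to><body>OK</body></message>
</messages>"""


def q(name):
    """Qualify a tag name with the XHTML namespace."""
    return f"{{{XHTML_NS}}}{name}"


def messages_to_xhtml(xml_text):
    """Build the XHTML that the stylesheet would produce for a <messages> document."""
    source = ET.fromstring(xml_text)
    # Serialize XHTML elements without a prefix (this registration is global).
    ET.register_namespace("", XHTML_NS)
    html = ET.Element(q("html"))
    body = ET.SubElement(html, q("body"))
    # Counterpart of <xsl:for-each select="messages/message">.
    messages = source.findall("message") if source.tag == "messages" else []
    for message in messages:
        outer = ET.SubElement(body, q("div"), style="padding:10px; margin:10px")
        for label, field in (("From", "from"), ("To", "to")):
            row = ET.SubElement(outer, q("div"))
            bold = ET.SubElement(row, q("b"))
            bold.text = label
            # Counterpart of <xsl:value-of select="..."/> after the label.
            bold.tail = ": " + (message.findtext(field) or "")
        ET.SubElement(outer, q("div")).text = message.findtext("body") or ""
    return ET.tostring(html, encoding="unicode")


if __name__ == "__main__":
    print(messages_to_xhtml(MESSAGES_XML))
```

An XSLT processor does this declaratively: the stylesheet describes the output, and the processor fills in the selected values.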


JSON


What is JSON?

• JSON stands for JavaScript Object Notation

• It is a lightweight text-data interchange format, commonly used as an alternative to XML

• JSON is typically more compact than the equivalent XML, and usually faster and easier to parse

• Although JSON uses JavaScript syntax, it is still language- and platform-independent

JSON Examples

{
  "message": {
    "from": "Hassan",
    "to": "Hossein",
    "body": "Please give me a call!"
  }
}

{
  "books": [
    {"title": "Maktub", "author": "Paulo Coelho"},
    {"title": "Never Crashed!", "author": "Microsoft"}
  ]
}
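The correspondence between the two formats can be sketched in a few lines. The following converts the flat message document from the earlier slides into the JSON shape shown above, using Python's standard `json` and `xml.etree.ElementTree` modules. It assumes a root whose children contain only text; the function name `message_xml_to_json` is hypothetical.

```python
import json
import xml.etree.ElementTree as ET

MESSAGE_XML = """<message>
  <from>Hassan</from>
  <to>Hossein</to>
  <body>Please give me a call!</body>
</message>"""


def message_xml_to_json(xml_text):
    """Convert a flat XML element (children with text only) into a JSON string."""
    root = ET.fromstring(xml_text)
    data = {root.tag: {child.tag: child.text for child in root}}
    return json.dumps(data)


if __name__ == "__main__":
    text = message_xml_to_json(MESSAGE_XML)
    print(text)
    # Parsing JSON yields ordinary dictionaries, lists and strings.
    message = json.loads(text)["message"]
    print(message["from"], "->", message["to"])
```

Attributes, mixed content and repeated elements have no single obvious JSON form, which is why a general XML-to-JSON mapping needs conventions beyond this sketch.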


Summary

• XML is used to describe data

• DTDs and Schemas can be used to define valid documents

• XML can be formatted with CSS and XSLT

• XHTML is a version of HTML which is proper XML

• JSON is a good alternative to XML



