Date post: 04-Jan-2016

XML Fundamentals

Transparency No. 1

XML Fundamentals

Cheng-Chia Chen


Contents

Well-formed XML: the concrete textual representation of XML

XML Data Model: the conceptual tree model

Namespaces: how does XML avoid name conflicts?


Well-formed XML Document

An XML document is a sequence of characters.
  Each character is an atomic unit of text as specified by ISO/IEC 10646 [Unicode].
  The file is usually given a .xml file name extension.
  MIME media type: application/xml or text/xml

Ex:

<?xml version="1.0" encoding="UTF-8"?>
<student> 張得功 </student>


Characters used in XML

A character is an atomic unit of text as specified by ISO/IEC 10646 [ISO/IEC 10646].

Legal characters are tab, carriage return, line feed, and the legal graphic characters of Unicode and ISO/IEC 10646.

Character Range

[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    /* any Unicode character, excluding the surrogate blocks (#xD800-#xDFFF), #xFFFE, and #xFFFF. */

The character encoding may vary from entity to entity.

All XML processors must accept the UTF-8 and UTF-16 encodings.
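Production [2] can be turned into a small range check. The following is only a sketch in Python; the function names is_xml_char and first_illegal_char are made up for this illustration and are not part of any XML library.

```python
def is_xml_char(code_point: int) -> bool:
    """Return True if code_point matches production [2] Char."""
    return (
        code_point in (0x9, 0xA, 0xD)           # tab, line feed, carriage return
        or 0x20 <= code_point <= 0xD7FF         # up to just below the surrogates
        or 0xE000 <= code_point <= 0xFFFD       # excludes #xFFFE and #xFFFF
        or 0x10000 <= code_point <= 0x10FFFF    # supplementary planes
    )


def first_illegal_char(text: str) -> int:
    """Return the index of the first character that is not a Char, or -1."""
    for index, ch in enumerate(text):
        if not is_xml_char(ord(ch)):
            return index
    return -1
```

For instance, first_illegal_char could be run over a string before it is written into a document, since a character such as #x0 cannot appear in XML 1.0 even as a character reference.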


ASCII code

ASCII needs 7 bits of storage per character.

Codes 0-127 are used.


Whitespace

White Space:

[3] S ::= (#x20 | #x9 | #xD | #xA)+

S (white space) consists of one or more space (#x20) characters, tabs, carriage returns or line feeds.

Whitespace can be used to separate otherwise indistinguishable parts of an XML document.
  <student age="15">…</student>   (element student with attribute age)
  <studentage="15">…</studentage>  (without the space the name would be read as studentage, which is not well-formed here)


XML Declaration

<?xml version="1.0" encoding="Big5" standalone="no" ?>

Besides using a file name extension, an XML document may use an XML declaration to identify itself as an XML document.

If used, it must occur first in the document (no preceding whitespace allowed).

  version: version of the XML specification, 1.0 or 1.1
  encoding: character encoding of the document, expressed in Latin characters, e.g., UTF-8, UTF-16, iso-8859-1, …
  standalone:
    no: parsing may be affected by the external DTD subset
    yes: not affected


Elements, tags and character data

The example:

<?xml version="1.0" encoding="UTF-8" ?>
<student> 張得功 </student>

is composed of a single element named student.
  Start-tag: <student>
  End-tag: </student>
Everything between the start-tag and the end-tag is called content.
  Content encompasses the real information.
  Whitespace is part of the content, though many applications will choose to ignore it.
<student> and </student> are markup.
張得功 and its surrounding whitespace are character data.


Structure of an element

Each XML document contains one or more elements, the boundaries of which are either delimited by start-tags and end-tags, or, for empty elements, by an empty-element tag.

Each element has a type, identified by name, and may have a set of attribute specifications.
  The name used in the start-tag and the end-tag must be identical.
  Note: XML is case sensitive, so <student> != <Student>

Each attribute specification has a name and a value.

Element

[39] element ::= EmptyElemTag | STag content ETag


Element (cont’d)

Content of Elements: whatever lies between the start-tag and the end-tag is called the element's content:

[43] content ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*

[14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
     i.e., any string containing none of <, & and ]]>.

If an element has empty content, it is represented either by a start-tag immediately followed by an end-tag or by an empty-element tag.

Tags for Empty Elements

[44] EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'


Examples of empty elements

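<IMG align="left" src="http://www.w3.org/Icons/WWW/w3c_home" />

<IMG align="left" src="http://www.w3.org/Icons/WWW/w3c_home"></IMG>

1. <br></br>

2. <br/>

3. <br> </br>

Note: 1 = 2 != 3. Forms 1 and 2 both denote an empty element; form 3 is not empty because its content is a single space.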
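The difference between the three forms can be observed with a parser. This is a small sketch using Python's standard xml.etree.ElementTree module; the helper name content_of is made up for this illustration.

```python
import xml.etree.ElementTree as ET


def content_of(xml_text: str):
    """Parse a single element and return its character data (None if empty)."""
    return ET.fromstring(xml_text).text


form1 = content_of("<br></br>")   # start-tag immediately followed by end-tag
form2 = content_of("<br/>")       # empty-element tag
form3 = content_of("<br> </br>")  # content is one space character
```

ElementTree reports no text for forms 1 and 2, while form 3 carries a one-character string.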


Start tag with attribute ( in document) and end tag

<tag attributeName = "attribute-value" … >
</tag>

  tag: name (or type) of the element; the start-tag and end-tag names must match
  attributeName: name of the attribute
  attribute-value: value or values of the attribute
  quotes: single or double quotes; the opening and closing quote (' or ") must match
  Each element may contain zero or more attributes.


Attributes

Attributes attach additional information to elements.
An attribute is a name-value pair attached to an element's start-tag.
  One element can have more than one attribute.
  Name and value are separated by = and optional whitespace.
  The attribute value is enclosed in double or single quotation marks.
    <tel type="office">02-29381111</tel>
  Attribute order is not significant.
    <student age="20" gender="male"> 趙得勝 </student>
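That attribute order and the choice of quote character carry no information can be checked with a parser. A minimal sketch using Python's standard xml.etree.ElementTree; the variable names are chosen for this illustration only.

```python
import xml.etree.ElementTree as ET

# Same attributes, different order and different quote characters.
a = ET.fromstring('<student age="20" gender="male"> 趙得勝 </student>')
b = ET.fromstring("<student gender='male' age='20'> 趙得勝 </student>")

# ElementTree exposes attributes as a dictionary, so order does not matter.
same_attributes = a.attrib == b.attrib
```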


Start Tag

Start-tag

[40] STag ::= '<' Name (S Attribute)* S? '>'    [ WFC: Unique Att Spec ]

[41] Attribute ::= Name Eq AttValue

Example:

<termdef id="dt-dog" term="dog">

End-tag

[42] ETag ::= '</' Name S? '>'

Example:

Legal: </termdef> and </termdef >
Illegal: </ termdef> and < /termdef> (whitespace is allowed only after the name)


XML Names

Rules for naming elements, attributes, …

Names and Tokens
[4] NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender
[5] Name ::= (Letter | '_' | ':') (NameChar)*
[6] Names ::= Name (#x20 Name)*
[7] Nmtoken ::= (NameChar)+
[8] Nmtokens ::= Nmtoken (#x20 Nmtoken)*

Names beginning with (x|X)(m|M)(l|L) are reserved.
Name is used for naming elements, attributes, entities, etc.
Nmtoken (Nmtokens) is used for values of NMTOKEN (NMTOKENS) attributes; values of ID, IDREF and IDREFS attributes must be Names.
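The difference between Name and Nmtoken can be sketched with regular expressions. Note the assumption: the patterns below cover only ASCII letters and digits, whereas the real productions also admit many non-ASCII letters, combining characters and extenders. The function names are made up for this illustration.

```python
import re

# ASCII-only approximations of productions [5] Name and [7] Nmtoken.
ASCII_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*")
ASCII_NMTOKEN = re.compile(r"[A-Za-z0-9._:\-]+")


def is_ascii_name(s: str) -> bool:
    """True if s is a Name made of ASCII characters only."""
    return ASCII_NAME.fullmatch(s) is not None


def is_ascii_nmtoken(s: str) -> bool:
    """True if s is an Nmtoken made of ASCII characters only."""
    return ASCII_NMTOKEN.fullmatch(s) is not None


def is_reserved(s: str) -> bool:
    """True if s begins with xml in any mix of upper and lower case."""
    return s[:3].lower() == "xml"
```

A string such as 1st is an Nmtoken but not a Name, because a Name may not start with a digit.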


AttValues (attribute value literal)

AttValues are the literals that can occur as an attribute value.

[10] AttValue ::= '"' ([^<&"] | Reference)* '"'
               |  "'" ([^<&'] | Reference)* "'"

An AttValue is enclosed by double or single quotes.
It can contain entity/character references (see later slide) or any character data, excluding <, & and the enclosing quote character (' or ").


Comments

Comments may appear
  1. anywhere in a document outside other markup;
  2. within the document type declaration at places allowed by the grammar.
They are not part of the document's character data.
The string "--" (double-hyphen) must not occur within comments.

Comments
[15] Comment ::= '<!--' ( (Char - '-') | ('-' (Char - '-')) )* '-->'

Example:
  Legal: <!-- declarations for <head> & <body> -->
  Illegal: <error <!-- comments cannot appear here! --> a="aa"> ..


Processing Instructions (PIs)

Processing instructions (PIs) allow documents to contain instructions for applications.

Processing Instructions:

[16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
[17] PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))

The PI begins with a target (PITarget) used to identify the application.
The remaining part is called PIData, which must not contain the substring "?>".
The target names "XML", "xml", and so on are reserved for standardization.

Ex: <?xml-stylesheet type="text/css" href="style.css" ?>
  The xml-stylesheet target is reserved for associating style sheets (e.g., CSS or XSLT) with the document.


Processing Instruction and comment

<?PItarget ***other stuff*** ?>
  may contain any characters except the substring "?>"

<!-- this is an explanation or a comment -->
  may contain any characters except the string "--"


XML Document

[1] document ::= prolog element Misc*
    The element is called the root or document element of the document.

[22] prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?

[23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'

[27] Misc ::= Comment | PI | S


Character references

What if the character data inside an element contains < or & ?
  <expr> x+1 < z </expr>

Instead of using '<', we can use a reference to its character code (60):
  &#60;   (decimal: #60)
  &#x3c;  (hexadecimal: #x3c or #x3C; the x itself must be lower case)

Rule: if C is a character with code point dddd (decimal) or yyyy (hexadecimal), then we can represent C using &#dddd; or &#xyyyy;

Cf.: in C or Java, we use \t or \011 to represent HT (#x09), and \\ (or \x5c in C) to represent the backslash \ (#x5c).
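A parser resolves both forms of character reference to the same character. The sketch below uses Python's standard xml.etree.ElementTree; the helper char_ref is a made-up name that builds a reference from a character.

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal references to code point 60, the character <.
decimal_form = ET.fromstring("<expr> x+1 &#60; z </expr>").text
hex_form = ET.fromstring("<expr> x+1 &#x3c; z </expr>").text


def char_ref(ch: str, hexadecimal: bool = False) -> str:
    """Build the character reference for a single character."""
    code = ord(ch)
    return f"&#x{code:x};" if hexadecimal else f"&#{code};"
```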


Entity reference

Numeric codes are hard to remember.
  We can use a name to denote a character or a string.
  Such a name is called an entity.

Entity reference
  If xxx is an entity, then &xxx; is its entity reference.
  While parsing an XML document, the XML processor replaces every entity reference it encounters with the entity's replacement text.

XML predefines 5 entity references; you can define your own.
  &lt;   : the less-than sign (<)
  &amp;  : the ampersand (&)
  &gt;   : the greater-than sign (>); not needed in general
  &quot; : the straight double quotation mark (")
  &apos; : the straight single quote (')
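In practice the predefined entities are usually produced by a library rather than typed by hand. A minimal sketch with Python's standard xml.sax.saxutils.escape, which escapes &, < and > by default; the extra mapping for the two quote characters is passed explicitly here, and the variable names are chosen for this illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = 'x < y && z > "1"'

# Replace the special characters by the predefined entity references.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})

# A parser turns the references back into the original characters.
round_trip = ET.fromstring(f"<expr>{escaped}</expr>").text
```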


CDATA Section

What if my element content has a lot of special characters?
  Ex (not well-formed): <expr> x < y && z < 1 </expr>

Solution 1: <expr> x &lt; y &amp;&amp; z &lt; 1 </expr>
  Hard to read/comprehend

Solution 2: <expr><![CDATA[ x < y && z < 1 ]]></expr>
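Both solutions denote the same character data, which can be confirmed with a parser. A minimal sketch using Python's standard xml.etree.ElementTree; the variable names are chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Solution 1: entity references.
escaped = ET.fromstring("<expr> x &lt; y &amp;&amp; z &lt; 1 </expr>")

# Solution 2: a CDATA section.
cdata = ET.fromstring("<expr><![CDATA[ x < y && z < 1 ]]></expr>")
```

ElementTree does not record whether the text came from a CDATA section, so the two elements end up with identical text.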


CDATA Sections

CDATA sections may occur as part of the content of an element.
  They are used to escape blocks of text containing many special characters.
  They begin with the string "<![CDATA[" and end with the string "]]>".

CDATA Sections

[18] CDSect ::= CDStart CData CDEnd
[19] CDStart ::= '<![CDATA['
[20] CData ::= (Char* - (Char* ']]>' Char*))
[21] CDEnd ::= ']]>'

What cannot occur inside a CDATA section?
  Ans: ']]>'
Every character inside a CDATA section is recognized as a literal character, so '<' and '&' may and must occur in their literal form.
Example:
  <![CDATA[<greeting>Hello, world!</greeting>]]>


Character Data and Markup

An XML document consists of intermingled character data and markup.
Markup takes the form of start-tags, end-tags, empty-element tags, entity references, character references, comments, CDATA section delimiters, document type declarations, processing instructions, XML declarations, text declarations, and white space outside the root element.

All text that is not markup constitutes the character data of the document, i.e., it may occur in the content of an element or in the content of a CDATA section.


Character Data and Markup (cont’d)

In the content of elements, character data is any string of characters, which does not contain the start-delimiter (< and & ) of any markup.

In a CDATA section, character data is any string of characters not including the CDATA-section-close delimiter, "]]>".

To allow attribute values to contain both single and double quotes, the apostrophe or single-quote character (') may be represented as "&apos;", and the double-quote character (") as "&quot;".

Character Data :

[14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*) i.e., Any string containing none of <, & and ]]>.


Possible contents of an element

Element

[39] element ::= EmptyElemTag | STag content ETag

Content of Elements

[43] content ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*

In addition to char data and child elements, an element may contain as children also references, PIs, comments or CDATA sections.


General rules for well-formed XML Documents

Rule 1: Balanced start and end tags
  The set of tags is unlimited, but all start tags must have matching end tags.
  Example of legal XML:
    <student>
      <name> DeTsi Wang</name>
      <email> [email protected]</email>
      <age> 20 </age>
    </student>

Rule 2: There must be exactly one root element.


Rules for well-formed XML Documents

Rule 3: Proper element nesting
  All tags must be nested correctly. Like HTML, XML can intermix tags and text, but tags may not overlap each other.
  Legal XML:
    <student>
      <name> DeTsi Wang</name>
      <email> [email protected]</email>
      <age> 20 </age>
    </student>
  Illegal XML:
    <b><i>This text is bold and italic</b> and italic</i>


Rules for well-formed XML Documents

Rule 4: Attribute values must be single or double quoted.
  Legal:
    <tag attribute="value">
    <tag attribute='value'>
  Illegal:
    <font size=6>
    <font size="60'>

Rule 5: An element may not have two attributes with the same name.
  Illegal: <font size="6" size = "10"/>

Rule 6: Comments and processing instructions may not appear inside tags.
  Illegal: <font <!-- error comment --> size = "6" />

Rule 7: No unescaped < or & signs may occur in the character data of an element or in attribute values.
  Illegal: <font size="<20"> 20&3 </font>
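The rules can be tried out against any conforming parser, since a parser must reject input that is not well-formed. The following sketch uses Python's standard xml.etree.ElementTree; the helper name is_well_formed and the dictionary keys are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as a well-formed document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


checks = {
    "nested": is_well_formed("<b><i>bold and italic</i></b>"),
    "two_roots": is_well_formed("<a/><b/>"),                      # rule 2
    "overlap": is_well_formed("<b><i>bold</b> and italic</i>"),   # rule 3
    "unquoted": is_well_formed("<font size=6/>"),                 # rule 4
    "duplicate": is_well_formed('<font size="6" size="10"/>'),    # rule 5
    "comment_in_tag": is_well_formed('<font <!-- no --> size="6"/>'),  # rule 6
    "unescaped": is_well_formed('<font size="<20"> 20&3 </font>'),     # rule 7
    "escaped": is_well_formed('<font size="&lt;20"> 20&amp;3 </font>'),
}
```

Only the entries nested and escaped come out as well-formed; every other entry violates the rule named in its comment.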


An example XML document : Recipes in XML

Define our own "Recipe Markup Language".
Choose markup tags that correspond to concepts in this application domain:
  recipe, ingredient, amount, ...
There are no canonical choices:
  granularity of markup?
    simply <date>14 Jun 95</date>
    or <date><y>95</y><m>6</m><d>14</d></date>
  structuring?
  elements or attributes?
  ...


Example (1/2)

<collection>
  <description>Recipes suggested by Jane Dow</description>

  <recipe id="r117">
    <title>Rhubarb Cobbler</title>
    <date>Wed, 14 Jun 95</date>

    <ingredient name="diced rhubarb" amount="2.5" unit="cup"/>
    <ingredient name="sugar" amount="2" unit="tablespoon"/>
    <ingredient name="fairly ripe banana" amount="2"/>
    <ingredient name="cinnamon" amount="0.25" unit="teaspoon"/>
    <ingredient name="nutmeg" amount="1" unit="dash"/>

    <preparation>
      <step> Combine all and use as cobbler, pie, or crisp. </step>
    </preparation>


Example (2/2)

    <comment>
      Rhubarb Cobbler made with bananas as the main sweetener. It was delicious.
    </comment>

    <nutrition calories="170" fat="28%" carbohydrates="58%" protein="14%"/>
    <related ref="42">Garden Quiche is also yummy</related>
  </recipe>
</collection>
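Once a recipe collection is well-formed, any XML parser can read it. The sketch below uses Python's standard xml.etree.ElementTree on an abbreviated copy of the recipe above; the helper name ingredients is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the recipe collection shown above.
RECIPES = """<collection>
  <description>Recipes suggested by Jane Dow</description>
  <recipe id="r117">
    <title>Rhubarb Cobbler</title>
    <ingredient name="diced rhubarb" amount="2.5" unit="cup"/>
    <ingredient name="sugar" amount="2" unit="tablespoon"/>
    <ingredient name="fairly ripe banana" amount="2"/>
  </recipe>
</collection>"""


def ingredients(recipe):
    """Return (name, amount, unit) for each ingredient; unit may be None."""
    return [
        (item.get("name"), float(item.get("amount")), item.get("unit"))
        for item in recipe.findall("ingredient")
    ]


root = ET.fromstring(RECIPES)
recipe = root.find("recipe")
title = recipe.findtext("title")
items = ingredients(recipe)
```

The banana ingredient has no unit attribute, so its unit is reported as None; this is one consequence of choosing attributes for optional information.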


Building on the XML Notation

Defining the syntax of our recipe language: DTD, XML Schema, ...

Showing recipe documents in browsers: XPath, XSLT

Recipe collections as databases: XQuery

Building a Web-based recipe editor: HTTP, Servlets, JSP, ...

...


XML data models

An XML document may contain lots of information that not all applications need or want to use.
  E.g., do the following two elements need to be differentiated?
    <abc>abc <![CDATA[ def ]]> end</abc>
    <abc>abc def end</abc>

XML data models are abstracted views of XML documents, so that unintended information in an XML document is ignored in the model.

There is more than one XML data model:
  DOM (Document Object Model), XPath 1.0, XPath 2.0, XML Information Set, …

All use a tree structure to model an XML document, though we could also model XML documents as graphs.


XML Trees

Conceptually, an XML document is a tree structure:
  node, edge
  root, leaf
  child, parent
  sibling (ordered)
  ancestor, descendant
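The tree vocabulary maps directly onto parser data structures. Below is a sketch of a depth-first walk over the element nodes using Python's standard xml.etree.ElementTree; the function name walk and the tiny sample document are chosen for this illustration.

```python
import xml.etree.ElementTree as ET


def walk(element, depth=0):
    """Yield (depth, tag) for element and all its descendants, in document order."""
    yield depth, element.tag
    for child in element:  # children are ordered siblings
        yield from walk(child, depth + 1)


root = ET.fromstring(
    "<recipe><title/><preparation><step/></preparation></recipe>"
)
outline = list(walk(root))
```

Here recipe is the root, title and preparation are siblings, and step is a leaf whose ancestors are preparation and recipe.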


An Analogy: File Systems


Tree View of the XML Recipes


Nodes in XML Trees

Root nodes: every XML tree has one root node that represents the entire tree.

Element nodes: define hierarchical logical groupings of contents; each has a name.

Text nodes: carry the actual contents; they are leaf nodes.

Attribute nodes: unordered; each is associated with an element node and has a name and a value.

Namespace nodes: the effective namespaces associated with an element.

Comment nodes: ignorable meta-information.

Processing instruction nodes: instructions to specific processors; each has a target and a value.


Types of node in an XML tree

The tree contains nodes. Types of nodes and their possible children:
  root nodes: element (exactly 1), comment, PI
  element nodes: element, text, PI, comment, [attribute, namespace]
  text nodes: leaves
  attribute nodes: leaves
  namespace nodes: leaves
  processing instruction nodes: leaves
  comment nodes: leaves


XML Applications

Rough classification:

Data-oriented languages
  inventory, customer and employee records in a company
  regular, flat, wide trees; traditionally stored in databases

Document-oriented languages
  XHTML, DocBook, WML, XML formats of Word and OpenOffice
  loosely structured, tags ignorable, mixed content

Protocols and programming languages
  XML Schema, XSLT, WSDL, ebXML, XMI, BML

Hybrids
  patient record: billing info; notes from doctor
  article collection: ISBN, name; abstract


Example: XHTML

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Hello world!</title></head>
  <body>
    <h1>This is a heading</h1>
    This is some text.
  </body>
</html>

• XMLification of HTML

• end tags must not be omitted

• element/attribute names are all in lower case

• attribute values must be present and quoted

• decomposed into modules reusable by other applications


Example: CML

<molecule id="METHANOL">
  <atomArray>
    <stringArray builtin="id">a1 a2 a3 a4 a5 a6</stringArray>
    <stringArray builtin="elementType">C O H H H H</stringArray>
    <floatArray builtin="x3" units="pm"> -0.748 0.558 ... </floatArray>
    <floatArray builtin="y3" units="pm"> -0.015 0.420 ... </floatArray>
    <floatArray builtin="z3" units="pm"> 0.024 -0.278 ... </floatArray>
  </atomArray>
</molecule>

CML: an XML-based data-oriented language for the representation of molecules and chemical reactions.


Example: ebXML

<MultiPartyCollaboration name="DropShip">
  <BusinessPartnerRole name="Customer">
    <Performs initiatingRole='//binaryCollaboration[@name="Firm Order"]/ InitiatingRole[@name="buyer"]' />
  </BusinessPartnerRole>
  <BusinessPartnerRole name="Retailer">
    <Performs respondingRole='//binaryCollaboration[@name="Firm Order"]/ RespondingRole[@name="seller"]' />
    <Performs initiatingRole='//binaryCollaboration[...]/ InitiatingRole[@name="buyer"]' />
  </BusinessPartnerRole>
  <BusinessPartnerRole name="DropShip Vendor">
    ...
  </BusinessPartnerRole>
</MultiPartyCollaboration>

ebXML: a worldwide initiative aiming to utilize XML for the exchange of electronic business data. It has delivered many XML standards for business processes, core data components, collaboration protocol agreements, messaging, registries and repositories.


Example: ThML

<h3 class="s05" id="One.2.p0.2">Having a Humble Opinion of Self</h3>
<p class="First" id="One.2.p0.3">EVERY man naturally desires knowledge
  <note place="foot" id="One.2.p0.4">
    <p class="Footnote" id="One.2.p0.5"><added id="One.2.p0.6">
      <name id="One.2.p0.7">Aristotle</name>, Metaphysics, i. 1.
    </added></p>
  </note>; but what good is knowledge without fear of God? Indeed a
  humble rustic who serves God is better than a proud intellectual who
  neglects his soul to study the course of the stars.
  <added id="One.2.p0.8"><note place="foot" id="One.2.p0.9">
    <p class="Footnote" id="One.2.p0.10">
      Augustine, Confessions V. 4.
    </p>
  </note></added></p>

An XML-based markup language for theological texts.


XML Namespace


Motivation

Name clashes. Consider an XML language WidgetML which uses XHTML as a sublanguage for help messages:

<widget type="gadget">
  <head size="medium"/>
  <body><subwidget ref="gizmo"/></body>
  <info>
    <head><title>Description of gadget</title></head>
    <body><h1>Gadget</h1> A gadget contains a big gizmo </body>
  </info>
</widget>

The meanings of head and body depend on context!
  This complicates things for processors and might even cause ambiguities.

The solution: different namespaces for different uses of the same name.


The Idea

Assign a namespace to each set of elements/attributes (which forms an XML language):
  one set of names: widget, head, body, info, …
  another set of names (namespace http://www.w3.org/1999/xhtml): html, head, body, h1, …

Each namespace is identified and referenced by a URI.

Qualify every element/attribute name with the URI of its namespace:
  {http://www.w3.org/1999/xhtml}head

=> name = namespace URI + local part


Problems for qualifying names

A URI as part of a name would use too much space, since it is usually a long string.

Not all URIs are legal attribute/element names.
  XML names restrict the use of special characters: . : _ - are allowed; / # % … are not.

Solution: use a namespace prefix as a proxy for the namespace URI.
  xmlns:aPfx = "aURI"

Notes:
  URI = URL + URN (extended to IRI in Namespaces 1.1).
  The URI here is used only for identification; it does not have to point at anything.


Namespace declarations

Namespaces are declared by special namespace attributes (xmlns:prefix or xmlns), which associate prefixes with namespace URIs.

Example:
<foo:e1 xmlns:foo="http://www.w3.org/TR/xhtml1"> ...
  <foo:head>...</foo:head> ...
</...>

xmlns:prefix1="URI1" declares a namespace with a prefix, prefix1, and a URI, URI1.

Scope rule: lexical
  A namespace declaration has effect on the element containing the declaration as well as on all its descendants, unless it is overridden by another declaration in a nested element.

Both element and attribute names can be qualified with namespaces.

Note: the prefix is just a proxy; applications should use only the URI for identification.
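That the prefix is only a proxy can be seen in how a namespace-aware parser reports names. The sketch below uses Python's standard xml.etree.ElementTree, which exposes each qualified name in the {URI}local form; the prefix bar and the variable names are made up for this illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/TR/xhtml1"

# The same names written with two different prefixes bound to one URI.
with_foo = ET.fromstring(f'<foo:e1 xmlns:foo="{XHTML}"><foo:head/></foo:e1>')
with_bar = ET.fromstring(f'<bar:e1 xmlns:bar="{XHTML}"><bar:head/></bar:e1>')

# After parsing, only the URI and the local part remain in the tag.
root_tag = with_foo.tag
child_tag = with_foo[0].tag
```

Comparing with_foo.tag and with_bar.tag shows that an application which identifies elements by URI and local part treats both documents alike.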


The default namespace

The default namespace exists for backward compatibility and simplicity.

Declaration: xmlns="aURI"

Unprefixed element names are assigned to the default namespace aURI.

It can be disabled by xmlns="".

A default namespace declaration has no effect on attributes.


Example:

<ex xmlns="http://test.com/" att1="…"
    xmlns:s="http://test.com/" >
  <ex att1="abc" > … </ex>
  <ex xmlns="" s:att1="abc">…</ex>
</ex>

Notes
1. The 1st and 2nd <ex> belong to the same namespace (http://test.com/), but the 3rd <ex> belongs to no namespace.

2. The global attribute s:att1 belongs to the namespace http://test.com/.

3. Both unprefixed att1 attributes are local in the sense that they belong to the local namespace of ex and are different from s:att1.

4. Note the asymmetry of the default namespace on elements and attributes.
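The notes can be checked against a namespace-aware parser. A sketch using Python's standard xml.etree.ElementTree, which reports names as {URI}local and leaves names without a namespace unchanged; the attribute value x and the elided content are filled with placeholders chosen for this illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://test.com/"
DOC = f"""<ex xmlns="{NS}" att1="x" xmlns:s="{NS}">
  <ex att1="abc"/>
  <ex xmlns="" s:att1="abc"/>
</ex>"""

first = ET.fromstring(DOC)   # 1st <ex>: in the default namespace
second, third = list(first)  # 2nd <ex> inherits it, 3rd <ex> disables it
```

The unprefixed att1 keeps its bare name even though its element is in the default namespace, while s:att1 is reported with the namespace URI.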


An example: WidgetML with namespaces

<widget xmlns="http://www.widget.org"
        xmlns:xhtml="http://www.w3.org/TR/xhtml1"
        type="gadget">
  <head size="medium"/>
  <body><subwidget ref="gizmo"/></body>
  <info>
    <xhtml:head>
      <xhtml:title>Description of gadget</xhtml:title>
    </xhtml:head>
    <xhtml:body>
      <xhtml:h1>Gadget</xhtml:h1>
      A gadget contains a big gizmo
    </xhtml:body>
  </info>
</widget>

The main part of WidgetML uses the default namespace, which has the URI http://www.widget.org.

XHTML uses the namespace prefix xhtml, which is assigned the URI http://www.w3.org/TR/xhtml1.


Notes related to XML namespaces

Namespace awareness. XML languages and applications should consider Namespaces as an inherent part of XML.

Reserve colon (:) as a prefix/localpart separator and do not use it in your element/attribute names or any other names.

It is the namespace URI instead of namespace prefix that is used for identifying a namespace.

URI references which identify namespaces are considered identical only when they are exactly the same character-for-character. E.g.
  1. http://a.b.c/~wine/d
  2. http://a.B.c/%7Ewine/d
  3. http://a.b.c/%7ewine/d
  4. d (relative URI, deprecated)

All 4 URIs can be treated as equal under the URI spec (the 4th once resolved against a suitable base URI), but they are seen as different namespace URIs by XML Namespaces.

Note: Relative URIs should not be used.


Uniqueness of Attributes

Why are the <bad …> tags illegal ?

<x xmlns:n1="http://www.w3.org"

xmlns:n2="http://www.w3.org" >

<bad a="1" a="2" />

<bad n1:a="1" n2:a="2" />

</x>


Uniqueness of Attributes (cont’d)

Both <good … /> elements are legal. Why ?

<x xmlns:n1="http://www.w3.org"

xmlns="http://www.w3.org" >

<good a="1" b="2" />

<good a="1" n1:a="2" />

</x>
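The bad and good cases can be tried against a namespace-aware parser. The sketch below uses Python's standard xml.etree.ElementTree; the helper name parses is made up for this illustration. Attribute uniqueness is judged on the pair (namespace URI, local part), and unprefixed attributes are not placed in the default namespace.

```python
import xml.etree.ElementTree as ET

W3 = "http://www.w3.org"


def parses(text: str) -> bool:
    """Return True if the namespace-aware parser accepts text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Same unprefixed name twice: violates Unique Att Spec.
bad1 = f'<x xmlns:n1="{W3}" xmlns:n2="{W3}"><bad a="1" a="2"/></x>'
# Different prefixes, but both bound to the same URI: same qualified name.
bad2 = f'<x xmlns:n1="{W3}" xmlns:n2="{W3}"><bad n1:a="1" n2:a="2"/></x>'
# Different local names.
good1 = f'<x xmlns:n1="{W3}" xmlns="{W3}"><good a="1" b="2"/></x>'
# Unprefixed a is in no namespace, so it differs from n1:a.
good2 = f'<x xmlns:n1="{W3}" xmlns="{W3}"><good a="1" n1:a="2"/></x>'

good2_attributes = ET.fromstring(good2)[0].attrib
```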


Summary

XML: a notation for hierarchically structured text
  Concrete textual representation and well-formedness
  Conceptual tree model
  Namespaces


Essential Online Resources

http://www.w3.org/TR/xml11/
http://www.w3.org/TR/xml-names11
http://www.unicode.org/