Date post: | 30-May-2018 |
Upload: | neeraj-singh |
8/14/2019 Session04 XML Validation Schema
http://slidepdf.com/reader/full/session04-xml-validation-schema 1/42
© 2008 MindTree Consulting
XML Schema
Neeraj Singh
October 2009
Agenda
XML Validation
Introduction to XML Schema
Examples / Demo
XML Validation
An Introduction to XML Validation
One of the important innovations of XML is the ability to place preconditions on the data the programs read, and to do this in a
simple declarative way.
XML allows you to say
that every Order element must contain exactly one Customer element,
that each Customer element must have an id attribute that contains an
XML name token,
that every ShipTo element must contain one or more Streets, one City,
one State, and one Zip, and so forth.
Checking an XML document against this list of conditions is called
validation.
Validation is an optional step but an important one.
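As a rough illustration, the conditions listed above could be written in W3C XML Schema along the following lines. This is only a sketch: nesting ShipTo inside Order, giving Customer text content, and typing the address parts as xs:string are assumptions made for the example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order">
    <xs:complexType>
      <xs:sequence>
        <!-- No minOccurs/maxOccurs: exactly one Customer per Order. -->
        <xs:element name="Customer">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <!-- The id attribute is mandatory and must be a name token. -->
                <xs:attribute name="id" type="xs:NMTOKEN" use="required"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
        <!-- Assumption: ShipTo is a child of Order. -->
        <xs:element name="ShipTo">
          <xs:complexType>
            <xs:sequence>
              <!-- One or more Street elements, then one each of the rest. -->
              <xs:element name="Street" type="xs:string" maxOccurs="unbounded"/>
              <xs:element name="City" type="xs:string"/>
              <xs:element name="State" type="xs:string"/>
              <xs:element name="Zip" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```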
Validation
There are many reasons and opportunities to validate an XML document:
When we receive one, before importing data into a legacy system
When we have produced or hand-edited one
To test the output of an application, etc.
Validation as “firewall”
to serve as actual firewalls when we receive documents from the external world
(as is commonly the case with Web Services and other XML communications),
to provide check points when we design processes as pipelines of transformations.
Validation can take place at several levels.
Structural validation
Data validation
Schema Languages
There is more than one language in which you can express such validation conditions. Generically, these are called schema
languages, and the documents that list the constraints are called
schemas.
Different schema languages have different strengths and
weaknesses.
The document type definition (DTD) is the only schema language
built into most XML parsers and endorsed as a standard part of XML.
The W3C XML Schema Language (schemas for short, though it’s
hardly the only schema language) addresses several limitations of
DTDs.
Many other schema languages, such as RELAX NG and Schematron, have been invented that can easily
be integrated with your systems.
XML Schema
Schema definition
A schema is defined in a separate file and generally stored with the .xsd extension.
Every schema definition has a schema root element that belongs to
the http://www.w3.org/2001/XMLSchema namespace. The schema
element can also contain optional attributes.
For example:
The following example indicates that the elements used in the schema
come from the http://www.w3.org/2001/XMLSchema namespace.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- Other definitions will come here. -->
</xs:schema>
Schema Linking when document root element is from null namespace
Let's start with our first document. It must have only a "root" element, and this element can contain text only. The element is
from the null namespace. Valid document:
<root xmlns="">aaa</root>
If you want to validate this document with XML Schema, you have to associate a Schema document with it. If the root element is
from the null namespace, you use the "noNamespaceSchemaLocation"
attribute.
<root xsi:noNamespaceSchemaLocation="correct_0.xsd" xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> test
</root>
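The slides do not reproduce the contents of correct_0.xsd. A minimal schema that would accept the document above might look like the following sketch; the choice of xs:string for the text content is an assumption.

```xml
<!-- Hypothetical contents for correct_0.xsd (not taken from the slides). -->
<!-- No targetNamespace is declared, so "root" is in the null namespace. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- A simple type element: text only, no attributes or children. -->
  <xs:element name="root" type="xs:string"/>
</xs:schema>
```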
Schema Linking when document root element from some particular
namespace
Now, let's take the same document as in the previous example, but the "root" element must be from a concrete namespace, let's say
"http://foo". Valid document:
<root xmlns="http://foo" >aaa</root>
If the root element is from a particular namespace, you associate the Schema using the "schemaLocation" attribute. The first
part of this attribute is the target namespace, the second is the
URL of the Schema file.
<f:root xsi:schemaLocation="http://foo correct_0.xsd" xmlns:f="http://foo" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> test </f:root>
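On the schema side, the namespace is declared with the targetNamespace attribute of the schema element. The following is a sketch of what such a schema could contain; the actual file is not shown in the slides, and elementFormDefault="qualified" is an optional setting added here as an assumption.

```xml
<!-- Hypothetical schema for a "root" element in the http://foo namespace. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://foo"
           elementFormDefault="qualified">
  <!-- Global declarations belong to the target namespace. -->
  <xs:element name="root" type="xs:string"/>
</xs:schema>
```

The first token of xsi:schemaLocation in the instance document has to match the targetNamespace value for the processor to pair the two.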
Examples / Demo
01_FirstXMLSchema.xsd
Writing your first XML Schema and a valid XML file based on it. This
also demonstrates how to link an XML file with an XML Schema.
02_FirstNameSpace.xsd
This example demonstrates the use of namespaces: if you have an XML document that belongs to a certain namespace, how to connect it to an XML
Schema.
Schema elements
A schema file contains definitions for elements and attributes, as well as data types for elements and attributes. It is also used to
define the structure or the content model of an XML document.
Elements in a schema file can be classified as either simple or complex.
Schema elements: Simple type
A simple type element is an element that cannot contain any attributes
or child elements; it can only contain the data type specified in its
declaration. The syntax for defining a simple element is:
<xs:element name="ELEMENT_NAME" type="DATA_TYPE" default/fixed="VALUE" />
Where DATA_TYPE is one of the built-in schema data types.
Schema elements: Simple type Contd…
You can also specify default or fixed values for an element. You do this with either the default or fixed attribute and specify a value
for the attribute. Note: Specifying a fixed or default attribute is
optional.
An example of a simple type element is:
<xs:element name="Author" type="xs:string" default="Whizlabs"/>
All attributes are simple types, so they are defined in the same
way that simple elements are defined. For example:
<xs:attribute name="title" type="xs:string" />
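To contrast the two options, here is a small sketch with a fixed element value and a default attribute value. The names Publisher and lang and their values are hypothetical and do not come from the slides.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- fixed: if the element has content, it must be exactly this value. -->
  <xs:element name="Publisher" type="xs:string" fixed="ACME Press"/>
  <!-- default: the value assumed when the attribute is absent. -->
  <xs:attribute name="lang" type="xs:language" default="en"/>
</xs:schema>
```

A single declaration can carry either default or fixed, but not both.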
Schema data types
All data types in schema inherit from anyType. This includes both simple and complex data types.
You can further classify simple types into built-in primitive types and built-in derived types.
Built-in datatype hierarchy
[Figure: the complete built-in datatype hierarchy diagram from the XML Schema Datatypes Recommendation. Legend: ur types, built-in primitive types, built-in derived types, complex types; derived by restriction, derived by list, derived by extension or restriction.]
Schema elements: Complex types
Complex types are elements that either:
Contain other elements
Contain attributes
Are empty (empty elements)
Contain text together with attributes or child elements
To define a complex type in a schema, use a complexType element.
You can specify the order of occurrence and the number of times an element can occur (cardinality) by using
the order and occurrence indicators, respectively.
For example:
<xs:element name="Book">
<xs:complexType>
<xs:sequence>
<xs:element name="Name" type="xs:string" />
<xs:element name="Author" type="xs:string" maxOccurs="4"/>
<xs:element name="ID" type="xs:string"/>
<xs:element name="Price" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
In this example, the order indicator is xs:sequence, and the occurrence indicator is the maxOccurs attribute on the Author element.
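For illustration, an instance document that should be valid against the Book declaration above could look like this. The values are invented for the example; only the element names and their order come from the schema.

```xml
<Book>
  <Name>XML Basics</Name>
  <!-- Between one and four Author elements are allowed here. -->
  <Author>First Author</Author>
  <Author>Second Author</Author>
  <ID>B-001</ID>
  <Price>25.00</Price>
</Book>
```

Swapping ID and Price, or adding a fifth Author, would make the document invalid.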
Schema elements: Complex types (Mixed content)
W3C XML Schema supports mixed content through the mixed attribute of
the xs:complexType element. Consider:
<xs:element name="book">
<xs:complexType mixed="true">
<xs:all>
<xs:element name="title" type="xs:string"/>
<xs:element name="author" type="xs:string"/>
</xs:all>
<xs:attribute name="isbn" type="xs:string"/>
</xs:complexType>
</xs:element>
It will validate an XML element such as:
<book isbn="0836217462">
Funny book by
<author>Charles M. Schulz</author>.
Its title (<title>Being a Dog Is a Full-Time Job</title>) says it all !
</book>
Examples / Demo
07_ComplexType01.xsd
Your first complex type. An element can contain a mixture of elements.
Here we want the element "root" to contain the elements "aaa", "bbb", and
"ccc" in any order, so we use the "all" element.
11_EmptyElementUsingAnyType.xsd
Empty element. We want the root element to be named "AAA", to be
from the null namespace, and to be empty. The empty element is defined as a
"complexType" with a "complexContent" that is a restriction of
"anyType", but without any elements.
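The two demo files are not reproduced in the slides. The sketch below shows one way the two descriptions could be expressed, combined into a single schema for brevity; the xs:string types of the child elements are an assumption.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "root" must contain aaa, bbb and ccc once each, in any order. -->
  <xs:element name="root">
    <xs:complexType>
      <xs:all>
        <xs:element name="aaa" type="xs:string"/>
        <xs:element name="bbb" type="xs:string"/>
        <xs:element name="ccc" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
  <!-- "AAA" is empty: anyType restricted down to no content at all. -->
  <xs:element name="AAA">
    <xs:complexType>
      <xs:complexContent>
        <xs:restriction base="xs:anyType"/>
      </xs:complexContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```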
Occurrence indicators
Occurrence indicators specify the number of times an element can occur in an XML document. You specify them with the minOccurs
and maxOccurs attributes of the element in the element definition.
As the names suggest, minOccurs specifies the minimum number of
times an element can occur in an XML document while maxOccurs
specifies the maximum number of times the element can occur.
It is possible to specify that an element might occur any number of times
in an XML document. This is determined by setting the maxOccurs value
to unbounded.
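The default value for both minOccurs and maxOccurs is 1, which means
that by default an element must appear exactly one time. Attributes do not take these indicators; whether an attribute must be present is controlled by its use attribute, which defaults to optional.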
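A short sketch of the three common cases follows. The element names Chapters, Preface, Chapter, and Index are hypothetical and used only to show the attributes.

```xml
<xs:element name="Chapters">
  <xs:complexType>
    <xs:sequence>
      <!-- Optional: may be omitted, may appear at most once. -->
      <xs:element name="Preface" type="xs:string" minOccurs="0"/>
      <!-- Repeating: at least once, with no upper limit. -->
      <xs:element name="Chapter" type="xs:string" maxOccurs="unbounded"/>
      <!-- Defaults apply: exactly once. -->
      <xs:element name="Index" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```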
Order indicators
Order indicators define the order or sequence in which elements can occur in an XML document. The three types of order indicators are:
All: If All is the order indicator, then the defined elements can appear in
any order, and each can occur at most once. In XML Schema 1.0, maxOccurs for All
and for each element inside it cannot exceed 1, while minOccurs can be 0 or 1.
Sequence: If Sequence is the order indicator, then the elements must
appear in the order specified.
Choice: If Choice is the order indicator, then exactly one of the elements
specified must appear in the XML document.
Example: Occurrence and order indicators
<xs:element name="Book">
<xs:complexType>
<xs:all>
<xs:element name="Name" type="xs:string" />
<xs:element name="ID" type="xs:string"/>
<xs:element name="Authors" type="authorType"/>
<xs:element name="Price" type="priceType"/>
</xs:all>
</xs:complexType>
</xs:element>
<xs:complexType name="authorType">
<xs:sequence>
<xs:element name="Author" type="xs:string" maxOccurs="4"/>
</xs:sequence>
</xs:complexType >
<xs:complexType name="priceType">
<xs:choice>
<xs:element name="dollars" type="xs:double" />
<xs:element name="pounds" type="xs:double" />
</xs:choice>
</xs:complexType >
The xs:all indicator specifies that the Book element, if present, must contain exactly one instance of each of the following four elements, in any order: Name, ID, Authors, Price.
The xs:sequence indicator in the authorType declaration specifies that elements of this type (the Authors element) contain at least one Author element and can contain up to four Author elements.
The xs:choice indicator in the priceType declaration specifies that elements of this type (the Price element) can contain either a dollars element or a pounds element, but not both.
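As an illustration, the following instance should be accepted by the schema above. The values are made up, and the children of Book are deliberately given in a different order from the schema to show the effect of xs:all.

```xml
<Book>
  <ID>B-002</ID>
  <Name>Schema Primer</Name>
  <Price>
    <!-- xs:choice: either dollars or pounds, never both. -->
    <pounds>19.99</pounds>
  </Price>
  <Authors>
    <Author>First Author</Author>
    <Author>Second Author</Author>
  </Authors>
</Book>
```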
Restriction
A main advantage of schema is that you have the ability to control the value of XML attributes and elements.
A restriction, which applies to all of the simple data elements in a
schema, allows you to define your own data type according to the
requirements by modifying the facets available for a particular simple type.
To achieve this, use the restriction element defined in the schema
namespace.
W3C XML Schema defines 12 facets for simple data types:
enumeration, maxExclusive, minExclusive, maxInclusive, minInclusive,
maxLength, minLength, pattern, length, whiteSpace, fractionDigits,
totalDigits
Example - To restrict the length of the text node
An example that shows how to restrict the length of the text node
<xs:element name="title">
<xs:complexType>
<xs:simpleContent>
<xs:restriction base="tokenWithLangAndNote">
<xs:maxLength value="255"/>
<xs:attribute name="lang" type="xs:language"/>
<xs:attribute name="note" type="xs:token"/>
</xs:restriction>
</xs:simpleContent>
</xs:complexType>
</xs:element>
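The base type tokenWithLangAndNote is not defined anywhere in the slides. A plausible definition, given here purely as an assumption so that the restriction can be read in context, is a token with two optional attributes:

```xml
<!-- Assumed definition of the base type; not shown in the original deck. -->
<xs:complexType name="tokenWithLangAndNote">
  <xs:simpleContent>
    <!-- Text content is an xs:token, extended with two attributes. -->
    <xs:extension base="xs:token">
      <xs:attribute name="lang" type="xs:language"/>
      <xs:attribute name="note" type="xs:token"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>
```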
Example – Remove an attribute from the element
To remove the note attribute from the element title, we declare note to
be prohibited in the list of attributes in the restriction:
<xs:element name="title">
<xs:complexType>
<xs:simpleContent>
<xs:restriction base="tokenWithLangAndNote">
<xs:maxLength value="255"/>
<xs:attribute name="lang" type="xs:language"/>
<xs:attribute name="note" use="prohibited"/>
</xs:restriction>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Facets Contd…
maxInclusive - Numeric value of the data type is less than or equal to the value specified.
minInclusive - Numeric value of the data type is greater than or equal to the value specified.
<xs:simpleType name="id">
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="100"/>
</xs:restriction>
</xs:simpleType>
maxLength - Specifies the maximum
number of characters or list items
allowed in the value.
minLength - Specifies the minimum
number of characters or list items
allowed in the value.
pattern - Value of the data type is
constrained to a specific sequence of
characters that are expressed using
regular expressions.
<xs:simpleType name="nameFormat">
<xs:restriction base="xs:string">
<xs:minLength value="3"/>
<xs:maxLength value="10"/>
<xs:pattern value="[a-z][A-Z]*"/>
</xs:restriction>
</xs:simpleType>
Facets Contd…
length - Specifies the exact number of
characters or list items allowed in the value.
<xs:simpleType name="secretCode">
<xs:restriction base="xs:string">
<xs:length value="5"/>
</xs:restriction>
</xs:simpleType>
whiteSpace - Specifies the method for
handling white space. Allowed values for
the value attribute are preserve,
replace, and collapse.
<xs:simpleType name="FirstName">
<xs:restriction base="xs:string">
<xs:whiteSpace value="preserve"/>
</xs:restriction>
</xs:simpleType>
fractionDigits - Constrains the maximum number of decimal
places allowed in the value.
totalDigits - The maximum number of
digits allowed in the value.
Both facets apply only to xs:decimal and the types derived from it (such as xs:integer), not to xs:float or xs:double.
<xs:simpleType name="reducedPrice">
<xs:restriction base="xs:decimal">
<xs:totalDigits value="4"/>
<xs:fractionDigits value="2"/>
</xs:restriction>
</xs:simpleType>
Multiple Restriction using ‘Union’
The union is applied to the two embedded simple types to allow values from
both data types: the new data type accepts either a 10-digit string or a value from an enumeration with two possible values (TBD and NA).
<xs:simpleType name="isbnType">
<xs:union>
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[0-9]{10}"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType>
<xs:restriction base="xs:NMTOKEN">
<xs:enumeration value="TBD"/>
<xs:enumeration value="NA"/>
</xs:restriction>
</xs:simpleType>
</xs:union>
</xs:simpleType>
Examples / Demo
03_RestrictSimpleType01.xsd
This example restricts a simple type. Here we require the value of
the element "root" to be an integer less than 25.
04_RestrictUsingUnion01.xsd
We want the value of the element "root" to be in the range 0-100 or 300-400
(including the border values). We make a union of two intervals.
06_RestrictUnionEnum02.xsd
The element can contain a string from an enumerated set. Here we want the
element "root" to have the value "N/A" or "#REF!".
14_RestrictionOfSequence.xsd
The schema declares the type "AAA", which can contain up to two sequences
of "x" and "y" elements. Then we declare the type "BBB", which is a
restriction of the type "AAA" and can contain only one x-y sequence.
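The demo files themselves are not included in the slides. As a sketch of the second description (a value in 0-100 or 300-400), a union of two restricted integer types might be written as follows; using anonymous types inside the union is a choice made for this example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:simpleType>
      <!-- A value is valid if it matches either member type. -->
      <xs:union>
        <xs:simpleType>
          <xs:restriction base="xs:integer">
            <xs:minInclusive value="0"/>
            <xs:maxInclusive value="100"/>
          </xs:restriction>
        </xs:simpleType>
        <xs:simpleType>
          <xs:restriction base="xs:integer">
            <xs:minInclusive value="300"/>
            <xs:maxInclusive value="400"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:union>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

With this sketch, 100 and 300 are accepted while 200 is rejected, because the Inclusive facets include the border values.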
Extension
The extension element defines complex types that might derive from other complex or simple types.
If the base type is a simple type, then the complex type can only add attributes.
If the base type is a complex type, then it is possible to add attributes and
elements.
To derive from a complex type, you have to use the complexContent
element in conjunction with the base attribute of the extension element.
Extensions are particularly useful when you need to reuse complex element
definitions in other complex element definitions.
For example, it is possible to define a Name element that contains two child
elements (First and Last) and then reuse it in other complex element definitions.
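For the simple-type case, where only attributes can be added, a minimal sketch could look like this. The Price element and its currency attribute are hypothetical names chosen for the illustration.

```xml
<xs:element name="Price">
  <xs:complexType>
    <!-- simpleContent: the element still holds only a decimal value. -->
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <!-- The extension adds an attribute to the simple base type. -->
        <xs:attribute name="currency" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
```

An instance such as <Price currency="USD">12.50</Price> would then be acceptable under these assumptions.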
An example of extensions
<!--Base element definition -->
<xs:complexType name="Name">
<xs:sequence>
<xs:element name="First"/>
<xs:element name="Last"/>
</xs:sequence>
</xs:complexType>
<!-- Customer element that reuses it -->
<xs:complexType name="Customer">
<xs:complexContent>
<xs:extension base="Name">
<xs:sequence>
<xs:element name="phone" type="xs:string"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!-- Student element that reuses it -->
<xs:complexType name="Student">
<xs:complexContent>
<xs:extension base="Name">
<xs:sequence>
<xs:element name="school" type="xs:string"/>
<xs:element name="year" type="xs:string"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
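The slide defines types only. Assuming an element declaration such as <xs:element name="customer" type="Customer"/> were added to the schema (the element name is hypothetical), a matching instance might be:

```xml
<customer>
  <!-- Content inherited from the base type Name comes first. -->
  <First>Asha</First>
  <Last>Rao</Last>
  <!-- Content added by the extension follows. -->
  <phone>555-0100</phone>
</customer>
```

The ordering matters: with extension, the new sequence is appended after the content of the base type.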
Examples / Demo
12_ExtensionOfSequence.xsd
Extension of a sequence. When we extend a complexType that
contains a sequence A with a sequence B, the sequence B is
appended to sequence A.
Groups
W3C XML Schema also allows the definition of groups
of elements and attributes.
These groups are not datatypes but containers holding a
set of elements or attributes that can be used to describe
complex types.
<!-- definition of an element group -->
<xs:group name="mainBookElements">
<xs:sequence>
<xs:element name="title" type="nameType"/>
<xs:element name="author" type="nameType"/>
</xs:sequence>
</xs:group>
<!-- definition of an attribute group -->
<xs:attributeGroup name="bookAttributes">
<xs:attribute name="isbn" type="isbnType" use="required"/>
<xs:attribute name="available" type="xs:string"/>
</xs:attributeGroup>
These groups can then be referenced in the
definition of a complex type:
<xs:complexType name="bookType">
<xs:sequence>
<xs:group ref="mainBookElements"/>
<xs:element name="character"
type="characterType"
minOccurs="0"
maxOccurs="unbounded"/>
</xs:sequence>
<xs:attributeGroup ref="bookAttributes"/>
</xs:complexType>
Examples / Demo
08_AttributeGroup01.xsd
Defining a group of attributes. Let's say we want to define a group of
common attributes that will be reused. The root element is named
"root"; it must contain the "aaa" and "bbb" elements, and these elements
must have the attributes "x" and "y".
12_SequenceChoiceGroup.xsd
An element that contains two "patterns" (sequences), in any order. We
want the root element to be named "AAA", to be from the null namespace,
and to contain two patterns in any order. The first pattern is a sequence of
"BBB" and "CCC" elements; the second is a sequence of "XXX" and "YYY" elements. The "choice" element allows one of two cases: either the
sequence "myFirstSequence"-"mySecondSequence" or
"mySecondSequence"-"myFirstSequence".
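The first description could be sketched as below. The group name myAttributeGroup and the xs:integer attribute types are assumptions; the actual demo file is not shown in the slides.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- The reusable attribute group: x and y are both mandatory. -->
  <xs:attributeGroup name="myAttributeGroup">
    <xs:attribute name="x" type="xs:integer" use="required"/>
    <xs:attribute name="y" type="xs:integer" use="required"/>
  </xs:attributeGroup>
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <!-- Both children reuse the same group by reference. -->
        <xs:element name="aaa">
          <xs:complexType>
            <xs:attributeGroup ref="myAttributeGroup"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="bbb">
          <xs:complexType>
            <xs:attributeGroup ref="myAttributeGroup"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```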
List Datatypes
List datatypes are special cases in
which a structure is defined within
the content of a single attribute or
element.
IDREFS, ENTITIES, and NMTOKENS are
predefined list datatypes.
As we have seen with these three datatypes, all the list datatypes that
can be defined must be whitespace-
separated. No other separator is
accepted.
The definition of a list datatype by
reference to an existing type is done through an itemType attribute:
<xs:simpleType name="integerList">
<xs:list itemType="xs:integer"/>
</xs:simpleType>
The definition of a list datatype can
also be done by embedding an
xs:simpleType element:
<xs:simpleType name="myIntegerList">
<xs:list>
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:maxInclusive value="100"/>
</xs:restriction>
</xs:simpleType>
</xs:list>
</xs:simpleType>
This datatype can be used to define
attributes or elements that accept a
whitespace-separated list of integers
smaller than or equal to 100, such as: "1 -25000 100".
Examples / Demo
09_ListDataType01.xsd
An attribute contains a list of values. We want the "root" element to
have an attribute "xyz" that contains a list of three integers. We
define a general list (element "list") of integers and then restrict it
(element "restriction") to have an exact length (element "length") of three
items.
10_ListDataType02.xsd
An element contains a list of values. We want the "root" element to
contain a list of three integers. We define a general list (element
"list") of integers and then restrict it (element "restriction") to have an exact length (element "length") of three items.
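A sketch of the second case (the element content is a list of exactly three integers) is given below. The type names integerList and threeIntegers are hypothetical; the demo files are not reproduced in the slides.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Step 1: a general whitespace-separated list of integers. -->
  <xs:simpleType name="integerList">
    <xs:list itemType="xs:integer"/>
  </xs:simpleType>
  <!-- Step 2: restrict the list; length counts list items here. -->
  <xs:simpleType name="threeIntegers">
    <xs:restriction base="integerList">
      <xs:length value="3"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="root" type="threeIntegers"/>
</xs:schema>
```

For the attribute variant, the same threeIntegers type would be used as the type of the "xyz" attribute instead of the element.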
More Examples
Examples / Demo
Examples / Demo
15_CustomSimpleType.xsd
Definition of a custom simpleType: the temperature must be greater than
-273.15. The element "T" must contain a number greater than -273.15. We
define a custom type for temperature named "Temperature" and
require the element "T" to be of that type.
16_PatternElement.xsd
The string must contain an e-mail address. The element "A" must contain an
e-mail address. We define a custom type that checks the validity of the
address at least approximately. We use the
"pattern" element to restrict the string using regular expressions.
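Both descriptions can be sketched in one schema, as below. The base type xs:decimal, the type name Email, and the regular expression are assumptions; the pattern is a deliberately loose check and not a full e-mail grammar.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Strictly greater than -273.15, hence minExclusive. -->
  <xs:simpleType name="Temperature">
    <xs:restriction base="xs:decimal">
      <xs:minExclusive value="-273.15"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="T" type="Temperature"/>
  <!-- Rough shape check: something@something.something, no spaces. -->
  <xs:simpleType name="Email">
    <xs:restriction base="xs:string">
      <xs:pattern value="[^@\s]+@[^@\s]+\.[^@\s]+"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="A" type="Email"/>
</xs:schema>
```

XML Schema patterns always match the whole value, so no start or end anchors are needed.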
Summary
W3C XML Schema has become the de facto standard for defining the structure of an XML document and for checking the validity of
XML documents. Using schema, it is possible to define:
Elements (simple and complex)
Attributes
Facets for XML elements
The structure of a document (order indicators)
The allowable number of elements (occurrence indicators) in an XML
document
References
ibm.com/developerWorks
IBM XML certification success, Part 1:
W3schools.com
www.xml.com
XML Schema (O'Reilly)
http://www.zvon.org/xxl/XMLSchemaTutorial
Examples used in the presentation are attached here
XML-Schema-Project.zip
Questions
Thank you
XML Technology, Semester 4
SICSR Executive MBA(IT) @ MindTree, Bangalore, India
By Neeraj Singh (toneeraj(AT)gmail(DOT)com)