Irvine COP 4814
COP 4814 Florida International University
Kip Irvine
XML Schema Basics, and Defining Simple Types
Updated: 2/23/2016 Based on Goldberg, Chapters 9 & 10
XML Schema Overview
• Also known as XML Schema Definition (XSD)
• Specifies the structure of valid XML documents
  – defines a set of elements, their relationships to each other, and the attributes that they can contain
• Designed to address shortcomings of DTDs
  – has a system of data types
  – lets you define global and local elements
  – likely to replace DTDs in the future as the standard schema language
Latest info: http://w3.org/XML/Schema
Data Type Categories
• Atomic type – an XML element that contains only text
• List type – a whitespace-separated collection of atomic values
• Complex type – an XML element that contains child elements and/or attributes
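As an illustration, here is a hypothetical instance fragment with one element from each category. The id attribute and the visit_dates element are assumptions made for this sketch, not part of the textbook example.

```xml
<!-- Hypothetical fragment: one element per type category -->
<wonder id="w1">                                    <!-- complex: attribute plus child elements -->
  <name>Colossus of Rhodes</name>                   <!-- atomic: text only -->
  <visit_dates>2005-04-26 2005-09-01</visit_dates>  <!-- list: whitespace-separated items -->
</wonder>
```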
Sample XML File & Schema
<?xml version="1.0"?>
<wonder>
  <name>Colossus of Rhodes</name>
  <location>Greece</location>
  <height>107</height>
</wonder>
<?xml version="1.0"?>
<element name="wonder">
  <complexType>
    <sequence>
      <element name="name" type="string"/>
      <element name="location" type="string"/>
      <element name="height" type="string"/>
    </sequence>
  </complexType>
</element>
Incomplete schema file: the xs:schema root element and the XML Schema namespace are missing.
Complete Schema File
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="wonder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="location" type="xs:string"/>
        <xs:element name="height" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Linking the XML Document
To the XML Schema file:
<?xml version="1.0"?>
<wonder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="09-06.xsd" >
<name>Colossus of Rhodes</name>
<location>Greece</location>
<height>107</height>
</wonder>
The schema location can be a web location, network path, or local file.
Visual Studio doesn’t need the extra xsi attributes; you can just assign a value to the Schemas property of the XML file.
XML Annotations
Structured comments that can be processed by XML parsers. Can appear multiple times at the top level of the schema, and at the start of most schema components.
<xs:annotation>
<xs:documentation>
This XML Schema will be used to validate documents from the student registration system.
</xs:documentation>
</xs:annotation>
Atomic Datatypes
• Contain a single value (cannot be divided).
• Based on one of the built-in types.
  – Example: type="xs:string"
• Restrictions can be included on their ranges and character patterns, allowing you to create subtypes.
• Type categories:
  – string, integer, boolean, date, decimal, etc.
Simple Types
Some of the more common types:
<xs:element name="height" type="xs:string"/>
<xs:element name="year_built" type="xs:integer"/>
<xs:element name="cost" type="xs:decimal"/>
<xs:element name="is_standing" type="xs:boolean"/>
<xs:element name="image" type="xs:anyURI"/>
An xs:anyURI value can be a web location, network path, or local file.
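A hypothetical instance fragment showing values that would satisfy the declarations above. The wrapper element and every value are illustrative assumptions, not data from the textbook.

```xml
<!-- Illustrative values only; wrapper element and values are assumptions -->
<wonder>
  <height>107 feet</height>                       <!-- xs:string: any text -->
  <year_built>-280</year_built>                   <!-- xs:integer: whole number, sign allowed -->
  <cost>1250000.50</cost>                         <!-- xs:decimal: digits with optional fraction -->
  <is_standing>false</is_standing>                <!-- xs:boolean: true, false, 1, or 0 -->
  <image>http://example.com/colossus.jpg</image>  <!-- xs:anyURI -->
</wonder>
```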
Standard Date/Time Formats
• xs:date: yyyy-mm-dd "2005-04-26"
• xs:time: hh:mm:ss "16:21:00"
• xs:dateTime: yyyy-mm-ddThh:mm:ss "2005-04-26T16:21:00"
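A minimal sketch of how these three types might be declared, assuming a hypothetical registration element with made-up child names. The comment at the end shows instance values in the formats listed above.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "registration" and its children are hypothetical names -->
  <xs:element name="registration">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="birth_date" type="xs:date"/>
        <xs:element name="class_start" type="xs:time"/>
        <xs:element name="submitted" type="xs:dateTime"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- Matching instance values:
       birth_date   2005-04-26
       class_start  16:21:00
       submitted    2005-04-26T16:21:00 -->
</xs:schema>
```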
xs:duration
• xs:duration:
  – PnYnMnDTnHnMnS
• 3 months, 4 days, 6 hours, 17 minutes:
  – "P3M4DT6H17M"
• 90 days:
  – "P90D"
• 4 days and 6 hours:
  – "P4DT6H"
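A short sketch of a duration declaration, assuming a hypothetical loan_period element. The comment lists a few more values to show where the T separator matters: M means months before T and minutes after it.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "loan_period" is a hypothetical element name -->
  <xs:element name="loan_period" type="xs:duration"/>
  <!-- Sample values:
       P14D    14 days
       P1Y6M   1 year, 6 months
       P1M     1 month
       PT1M    1 minute (T comes before any time component)
       PT36H   36 hours -->
</xs:schema>
```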
Other Date Types
Type            Example
xs:gYear        "1965"
xs:gMonth       "--04" (April)
xs:gYearMonth   "1965-04" (April 1965)
xs:gMonthDay    "--04-26" (April 26)
xs:gDay         "---26" (26th day)
Custom Type
• Identify the XML element you wish to define
• The "base" attribute identifies an existing type.
• Example: string of at most 1024 characters:
<xs:element name="story">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:maxLength value="1024"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
this custom type is anonymous
Named Custom Type
• Alternatively, define the xs:simpleType as a direct child of xs:schema and give it a name attribute; the element then refers to the type by name:
<xs:simpleType name="story_type">
  <xs:restriction base="xs:string">
    <xs:maxLength value="1024"/>
  </xs:restriction>
</xs:simpleType>
<xs:element name="story" type="story_type"/>
Applying a Custom Type
• The same named type can be used multiple times:
<xs:element name="story" type="story_type"/>
<xs:element name="summary" type="story_type"/>
<xs:element name="another_story" type="story_type"/>
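Putting the pieces together, a complete schema might look like the sketch below. The article root element is a hypothetical name chosen for illustration, and the named type is assumed to limit strings to at most 1024 characters.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Named type: declared once, as a direct child of xs:schema -->
  <xs:simpleType name="story_type">
    <xs:restriction base="xs:string">
      <xs:maxLength value="1024"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- "article" is a hypothetical root element that reuses the type twice -->
  <xs:element name="article">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="story" type="story_type"/>
        <xs:element name="summary" type="story_type"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```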
Limiting Values to a Range
• Use xs:minInclusive and xs:maxInclusive:
<xs:element name="student_age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="10"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
Also possible: minExclusive, maxExclusive
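A sketch of the exclusive facets, assuming a hypothetical percent_score element whose value must be strictly greater than 0 and strictly less than 100:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "percent_score" is a hypothetical element name -->
  <xs:element name="percent_score">
    <xs:simpleType>
      <xs:restriction base="xs:decimal">
        <xs:minExclusive value="0"/>    <!-- 0 itself is rejected -->
        <xs:maxExclusive value="100"/>  <!-- 100 itself is rejected -->
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```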
Set of Possible Values
Use xs:enumeration
<xs:element name="student_level">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="undergraduate"/>
      <xs:enumeration value="graduate"/>
      <xs:enumeration value="unclassified"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
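Enumerated values are compared exactly, including letter case. The fragment below sketches how a validator would be expected to treat a few instance values; the examples wrapper element is hypothetical and exists only to keep the fragment well-formed.

```xml
<!-- "examples" is a hypothetical wrapper element -->
<examples>
  <student_level>graduate</student_level>  <!-- valid: listed value -->
  <student_level>Graduate</student_level>  <!-- invalid: letter case differs -->
  <student_level>doctoral</student_level>  <!-- invalid: not in the enumeration -->
</examples>
```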
Specifying an Exact Length
Example:
<xs:element name="panther_id">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:length value="7"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
Also possible: xs:minLength, xs:maxLength
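A sketch combining both facets, assuming a hypothetical user_name element that must be between 3 and 20 characters long:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "user_name" is a hypothetical element name -->
  <xs:element name="user_name">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:minLength value="3"/>   <!-- at least 3 characters -->
        <xs:maxLength value="20"/>  <!-- at most 20 characters -->
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```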
Specifying a Matching Pattern
• Uses regular expression syntax
• Suppose the account_id element must contain AB, followed by digits:
<xs:element name="account_id">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="AB\d+"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
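An XML Schema pattern must match the entire value, so no ^ or $ anchors are needed. The sketch below assumes a hypothetical course_code element made of three uppercase letters, a space, and four digits.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "course_code" is a hypothetical element name -->
  <xs:element name="course_code">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[A-Z]{3} \d{4}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <!-- COP 4814    matches
       cop 4814    does not match (lowercase letters)
       COP 48144   does not match (the whole value must fit the pattern) -->
</xs:schema>
```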
Deriving a List Type
• When the XML file contains a list of values
  – Example: dates when a student declared or changed majors:
<xs:element name="catalog_dates">
  <xs:simpleType>
    <xs:list itemType="xs:date"/>
  </xs:simpleType>
</xs:element>
this custom type is anonymous
Deriving a Named List Type
Use a named type if you plan to apply it more than once.
<xs:simpleType name="dateList">
  <xs:list itemType="xs:date"/>
</xs:simpleType>
<!-- create instances: -->
<xs:element name="catalog_dates" type="dateList"/>
<xs:element name="enrollment_dates" type="dateList"/>
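In an instance document, the items of a list type are separated by whitespace. The sketch below also derives a hypothetical shortDateList type that caps the list at three items; that type name and the sample dates are assumptions for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="dateList">
    <xs:list itemType="xs:date"/>
  </xs:simpleType>

  <!-- Hypothetical derived type: on a list, maxLength counts items, not characters -->
  <xs:simpleType name="shortDateList">
    <xs:restriction base="dateList">
      <xs:maxLength value="3"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="catalog_dates" type="shortDateList"/>
  <!-- Matching instance element:
       <catalog_dates>2014-08-25 2015-01-12 2015-08-24</catalog_dates> -->
</xs:schema>
```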
Summary
• Use built-in types when you have no particular restrictions on the values
• Use simple (atomic) derived types to control ranges and lengths
• Use list types for repeated items
• Chapter 11 explains how to create complex types.