XML IV

The Document Object Model

The Document Object Model (DOM) is a hierarchical representation of an XML document. It provides a means of accessing and manipulating XML documents, and gives access to every part of a document: the document itself, its root element, child elements, attributes, etc. The DOM also provides methods for modifying, adding to, and deleting from the original document. Without the DOM, XML would be nothing more than a storage system for data.

Generally, document models follow one of three patterns: Linear, Tree, or Object Models.

Linear Models

The simplest model, describing the document in a linear fashion. Illustration: suppose I said “Go to page 100 of a book, go to line 10, and read the 1st word”. The problem is that if I altered an earlier section of the book, every existing reference to the later sections would become invalid.

Tree Models

Describes the document in terms of a tree. Each item on the tree is referred to as a node, and terminal items are regarded as leaves. The tree has a root element and child nodes. A node’s parent is the node immediately above it in the tree, and sibling nodes share the same parent.

An advantage of the tree is that any part of the document can be reached by ‘walking the tree’.

Tree references are less sensitive to change than linear ones, but they are still somewhat sensitive: if a complete node is removed, the relationships between the remaining nodes may change.

Using the book analogy, I would perhaps say “Go to the 1st item of the 2nd paragraph of the 4th chapter”.

Object Models

The least sensitive to change. Each section of a document has a name property, so a reference by name remains valid even when the document changes.

The W3C DOM is a combination of the Tree and Object models.
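The difference between the three models can be made concrete with a small sketch. It is only an illustration under stated assumptions: it uses Python's standard-library xml.dom.minidom rather than the IE5 scripting used elsewhere in these slides, and the function names, the sample record and the inserted email element are all hypothetical.

```python
from xml.dom.minidom import parseString

XML = "<contact><name>James Blogg</name><telephone>02089443243</telephone></contact>"

def linear_ref(text):
    # Linear model: "characters 15 to 25 of the file"
    return text[15:26]

def tree_ref(text):
    # Tree model: "the text inside the 1st child of the root element"
    doc = parseString(text)
    return doc.documentElement.childNodes[0].firstChild.data

def named_ref(text):
    # Object model: "the text inside the element called name"
    doc = parseString(text)
    return doc.getElementsByTagName("name")[0].firstChild.data

# An edit near the start of the document: a new element before <name>
edited = XML.replace("<name>", "<email>jb@example.org</email><name>")

# Each reference is resolved against the original and the edited document
for ref in (linear_ref, tree_ref, named_ref):
    print(ref.__name__, "|", ref(XML), "|", ref(edited))
```

All three references find the name in the original text. After the edit, the positional reference and the "1st child" reference both point at something else, while the reference by name still resolves to James Blogg.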

Example of a Document Tree

Simple XML document:

<?xml version="1.0"?>
<!-- demonstration of a parse tree -->
<contact type="personal">
  <name>James Blogg</name>
  <telephone>02089443243</telephone>
</contact>

Document tree:

Document
  Version
  Comment
  Element (contact)
    Element (name)
      Text
    Element (telephone)
      Text
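As a rough check of the tree shape, the same document can be parsed and printed as an indented outline. This sketch assumes Python's standard-library xml.dom.minidom, and the outline( ) helper is a hypothetical name. minidom does not create a node for the XML declaration (the "Version" box), and the type attribute does not appear because attributes are not children of their element.

```python
from xml.dom.minidom import parseString

# The document is written without whitespace between tags so that
# no whitespace-only text nodes appear in the outline
XML = ('<?xml version="1.0"?>'
       '<!-- demonstration of a parse tree -->'
       '<contact type="personal">'
       '<name>James Blogg</name>'
       '<telephone>02089443243</telephone>'
       '</contact>')

def outline(node, depth=0):
    """Return one indented line per node, walking the tree depth-first."""
    lines = ["  " * depth + node.nodeName]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines

doc = parseString(XML)
print("\n".join(outline(doc)))
```

The outline lists the document node first, then the comment and the contact element as its children, with name and telephone each holding a single text node.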

The W3C DOM

The specification for the W3C DOM can be found at http://www.w3.org/TR/REC-DOM-Level-1

The specification defines its interfaces in OMG IDL (Object Management Group Interface Definition Language). OMG IDL is a language that allows applications to communicate with each other even if they are written in different programming languages.

Under the IDL, objects expose various interfaces. Each interface has a series of attributes that describe the properties of the object behind that interface. The object can also be operated on by methods that invoke functions behind the interface. These methods return some kind of value to the requesting application.

Together, the attributes and methods constitute the API of that object.

DOM Objects

An object that supports the DOM must be capable of loading an XML document into itself. It must also be able to expose all the interfaces, with the appropriate attributes and methods, as laid out in the DOM specification.

DOM Interfaces

DOMImplementation: Queries the implementing object itself, and is independent of the document loaded into it.
Document: Provides information about the document that has been loaded into the object.
Node: Everything in a document can be considered as a node of the document, i.e. elements, comments, etc. This interface contains several attributes and methods for manipulating any kind of node.
NodeList: An ordered collection of nodes that can be accessed using an index.
NamedNodeMap: A collection of nodes that can be accessed by name.
CharacterData: Deals with the text of the document.
Attr: Deals with the XML attributes of a node. However, an attribute is not represented in the tree structure, as it is considered to be a property of its element.
Element: Most nodes of an XML document will be elements. The Element interface has properties and methods for dealing with elements and their XML attributes.
Text: Deals with the text content of an element.
CDATASection, Notation, Entity, EntityReference and ProcessingInstruction: These interfaces all deal with their namesake nodes.

Before considering DOM interfaces in detail, I’ll introduce the concept of XML Data Source Objects (DSO), as we’ll use this in the examples on DOM interfaces.

Consider the following HTML file:

<html>
<head><title>XML DSO</title></head>
<body>
<xml id="isle">
<addressbook>...........</addressbook>
</xml>
<script language="JavaScript">
// In IE5 the data island is reachable through its id
var MyDoc=isle;
var rootEl=MyDoc.documentElement;
// Writes the name of the root element
document.write(rootEl.nodeName);
</script>
</body>
</html>

The example demonstrates the creation of an XML DSO or data island within an HTML file, with the use of the xml element.

Note that the xml element is a proprietary element of IE5, and will not work on other browsers.

The same xml file can be referenced with the src attribute of the xml element, i.e.

<html>
<head><title>XML DSO</title></head>
<body>
<xml id="isle" src="addressbook.xml"></xml>
<script language="JavaScript">
var MyDoc=isle;
var rootEl=MyDoc.documentElement;
document.write(rootEl.nodeName);
</script>
</body>
</html>

The Document Interface

Returns information about the document. It has the following read-only attributes:

doctype: Returns the <!DOCTYPE information in the document, with the exception of information on the DTD.

implementation: Returns the DOMImplementation object. It has a Boolean method called hasFeature( ), which takes 2 parameters: the feature to test for (HTML or XML) and the version.

documentElement: The most important attribute of the Document interface, because it returns the root element of the document. It’s an ideal starting point if you’re walking the tree to access a node.
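The three attributes can be tried out in a few lines. This is a minimal sketch, assuming Python's standard-library xml.dom.minidom as the DOM implementation. The one-line address book and its bare DOCTYPE are made up for the illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical document with a bare DOCTYPE declaration
doc = parseString("<!DOCTYPE addressbook><addressbook><contact/></addressbook>")

# doctype: the name given in the <!DOCTYPE declaration
print(doc.doctype.name)

# implementation: ask the DOM object itself what it supports
print(doc.implementation.hasFeature("XML", "1.0"))

# documentElement: the root element, the usual starting point for a walk
root = doc.documentElement
print(root.nodeName)
```

Note that hasFeature( ) is answered by the implementation and does not depend on which document has been loaded.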

Document Interface Methods

createElement( )
createDocumentFragment( )
createTextNode( )
createComment( )
createCDATASection( )
createProcessingInstruction( )
createAttribute( )
createEntityReference( )
getElementsByTagName( ): Gets a list of all the elements with the name passed to it as an argument. It returns a NodeList object. Widely used in search routines.
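A short sketch of how the create methods and getElementsByTagName( ) fit together, assuming Python's standard-library xml.dom.minidom. The empty addressbook starting document and the comment text are invented for the example. A node made by a create method belongs to the document but is not part of the tree until it is attached with a Node method such as appendChild( ).

```python
from xml.dom.minidom import parseString

doc = parseString("<addressbook/>")
root = doc.documentElement

for person in ("Tony Benn", "Mary Blair"):
    # Create the nodes first, then attach them to the tree
    contact = doc.createElement("contact")
    name = doc.createElement("name")
    name.appendChild(doc.createTextNode(person))
    contact.appendChild(name)
    root.appendChild(contact)

root.appendChild(doc.createComment("end of contacts"))

# Search by tag name: returns a NodeList in document order
names = doc.getElementsByTagName("name")
found = [n.firstChild.data for n in names]
print(found)
print(root.toxml())
```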

Accessing Nodes in DOM

There are 2 ways of accessing nodes in DOM.

Walking the tree: Start anywhere in the DOM, and use attributes of the Node interface such as parentNode, firstChild, nextSibling and previousSibling.

Accessing nodes by name: This approach uses the getElementsByTagName( ) method to get a list of elements of that name.

Example:

<html>
<head><title>XML Example</title></head>
<body>
<xml id='isle'>
<addressbook>
  <contact>
    <name>Tony Benn</name>
    <address>210 Temple road</address>
    <city>London</city>
    <postcode>NW9 0RT</postcode>
    <phone>02082049565</phone>
  </contact>
  <contact>
    <name>Mary Blair</name>
    <address>20 St James road</address>
    <city>London</city>
    <postcode>SE4 0RT</postcode>
    <phone>02072049565</phone>
  </contact>
</addressbook>
</xml>
<script>
var myDoc=isle;
// Access by name: collect every contact element
var contactList=myDoc.getElementsByTagName('contact');
for (var i=0; i<contactList.length; i++)
{
  // Walk the tree: the 1st child is name, and its next sibling is address.
  // This assumes whitespace between elements does not produce text nodes.
  document.write(contactList.item(i).firstChild.firstChild.nodeValue);
  document.write('<br\/>');
  document.write(contactList.item(i).firstChild.nextSibling.firstChild.nodeValue);
  document.write('<br\/> <br\/> ');
}
</script>
</body>
</html>

The above listing would output:

Tony Benn
210 Temple road

Mary Blair
20 St James road
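The two access styles can also be compared side by side outside the browser. The sketch below assumes Python's standard-library xml.dom.minidom and a shortened copy of the address book. The function names are hypothetical. The XML is written without whitespace between tags because minidom keeps whitespace-only text nodes, which would otherwise be returned by firstChild and nextSibling.

```python
from xml.dom.minidom import parseString

XML = ("<addressbook>"
       "<contact><name>Tony Benn</name><address>210 Temple road</address></contact>"
       "<contact><name>Mary Blair</name><address>20 St James road</address></contact>"
       "</addressbook>")

def contacts_by_walking(doc):
    """Walk the tree from the root using firstChild and nextSibling only."""
    rows = []
    contact = doc.documentElement.firstChild
    while contact is not None:
        name = contact.firstChild
        address = name.nextSibling
        rows.append((name.firstChild.nodeValue, address.firstChild.nodeValue))
        contact = contact.nextSibling
    return rows

def contacts_by_name(doc):
    """Jump straight to the contact elements with getElementsByTagName."""
    rows = []
    for contact in doc.getElementsByTagName("contact"):
        name = contact.firstChild
        rows.append((name.firstChild.nodeValue,
                     name.nextSibling.firstChild.nodeValue))
    return rows

doc = parseString(XML)
for name, address in contacts_by_name(doc):
    print(name)
    print(address)
    print()
```

Both functions return the same rows here. The walking version depends on the exact shape of the tree, whereas the name-based version would still find the contacts if they were nested more deeply.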

The Document Fragment Interface

A utility interface that creates a subsection of the document, which can subsequently be inserted into the main document. It is particularly useful when you’re creating a lot of new elements. If you created the new elements in the main document directly, the node list would have to be updated each time. With a document fragment, the node list only has to be updated once.
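The pattern can be sketched as follows, assuming Python's standard-library xml.dom.minidom and made-up contact data. The new elements are collected in a fragment, and the main document is touched by a single appendChild( ) call.

```python
from xml.dom.minidom import parseString

doc = parseString("<addressbook/>")
root = doc.documentElement

# Build the new elements off to the side, inside a fragment
fragment = doc.createDocumentFragment()
for person in ("Tony Benn", "Mary Blair", "James Blogg"):
    contact = doc.createElement("contact")
    contact.appendChild(doc.createTextNode(person))
    fragment.appendChild(contact)

before = len(root.childNodes)

# One call moves all of the fragment's children into the document
root.appendChild(fragment)

print(before, len(root.childNodes))
print(fragment.hasChildNodes())
```

Appending a fragment inserts its children rather than the fragment itself, so the fragment is empty afterwards and can be reused.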

The Node Interface

This is the key interface of DOM, as everything in a document can be considered as a node. It has the following attributes, all read-only except nodeValue:

nodeName: E.g. the name of the tag
nodeValue: Null for elements; the content of a text node
nodeType: Element, Attr, Text, etc.
parentNode
childNodes: Returns a node list of all child nodes of the element
firstChild
lastChild
previousSibling
nextSibling
attributes: Works only with element type nodes. Returns a NamedNodeMap of all the attributes.
ownerDocument
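To see what these attributes return for different kinds of node, here is a small sketch that assumes Python's standard-library xml.dom and xml.dom.minidom. In Python a DOM null shows up as None, and the one-line contact document is made up for the example.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

doc = parseString('<contact type="personal"><name>James Blogg</name></contact>')
contact = doc.documentElement
name = contact.firstChild
text = name.firstChild

# An element has a tag name and no value
print(contact.nodeName, contact.nodeType == Node.ELEMENT_NODE, contact.nodeValue)

# A text node has a fixed name and carries its content as the value
print(text.nodeName, text.nodeType == Node.TEXT_NODE, text.nodeValue)

# Relationships: name is the only child, so it has no siblings
print(name.parentNode is contact, contact.lastChild is name)
print(name.previousSibling, name.nextSibling)

# attributes is a NamedNodeMap, looked up by name
type_attr = contact.attributes.getNamedItem("type")
print(type_attr.value)

# Every node knows which document it belongs to
print(text.ownerDocument is doc)
```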

Methods of the Node Interface

All except hasChildNodes( ) and cloneNode( ) take one or more node objects as parameters. Prior to using the insertBefore( ), replaceChild( ), or appendChild( ) methods, you would have created a new node with one of the document methods or with the cloneNode( ) method.

The insertBefore( ) method: Inserts the new child node before the reference node.

var dummy;
dummy=node_object.insertBefore(new_node, reference_node);

The replaceChild( ) method: Replaces the reference node with the new node, and returns the replaced node.

The appendChild( ) method: Appends the new node to the end of the list of children of the node it is called on.

The hasChildNodes( ) method: Returns a Boolean indicating whether a node has child nodes.

The cloneNode( ) method: Makes a duplicate of the node it is called on.

Example of using node methods:

<script>
var dummy;
var myDoc=isle;
// Create the new node with a document method first
var new_element=myDoc.createElement('email');
var rootEl=myDoc.documentElement;
// appendChild( ) returns the node that was appended
dummy=rootEl.appendChild(new_element);
alert('I have just created a new element called '+dummy.nodeName);
</script>
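The remaining node methods can be exercised in the same way. This is a sketch only, assuming Python's standard-library xml.dom.minidom. The email and mobile elements are hypothetical additions to a one-contact document.

```python
from xml.dom.minidom import parseString

doc = parseString("<contact><name>Tony Benn</name><phone>02082049565</phone></contact>")
contact = doc.documentElement
name, phone = contact.childNodes

# insertBefore(new, reference): the new node goes in front of the reference node
email = doc.createElement("email")
inserted = contact.insertBefore(email, phone)

# replaceChild(new, old): returns the node that was replaced
mobile = doc.createElement("mobile")
replaced = contact.replaceChild(mobile, phone)

# cloneNode(True) copies the node together with everything below it
copy = contact.cloneNode(True)

order = [child.nodeName for child in contact.childNodes]
print(order)
print(inserted is email, replaced is phone)
print(mobile.hasChildNodes(), contact.hasChildNodes())
print(copy is contact, copy.toxml() == contact.toxml())
```

The clone is a separate node with the same content. It is not part of the tree until it is inserted with one of the methods above.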

The CharacterData Interface

Contains attributes and methods for accessing and editing strings in nodes that take a string value, e.g. text and comment nodes. It has the following attributes:

data: Returns all the text in the node as a Unicode string.
length: Returns the number of characters in the string.

It has the following methods (all self-explanatory):

substringData( )
appendData( )
insertData( )
deleteData( )
replaceData( )
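For completeness, here is a brief sketch of the five methods applied to one text node. It assumes Python's standard-library xml.dom.minidom, and the strings being inserted are made up. Offsets count characters from 0.

```python
from xml.dom.minidom import parseString

doc = parseString("<name>Tony Benn</name>")
text = doc.documentElement.firstChild   # the text node inside <name>

first_word = text.substringData(0, 4)   # 4 characters starting at offset 0
original_length = text.length

steps = []
text.appendData(", London")             # add to the end
steps.append(text.data)
text.insertData(0, "Mr ")               # insert at offset 0
steps.append(text.data)
text.deleteData(0, 3)                   # remove 3 characters from offset 0
steps.append(text.data)
text.replaceData(0, 4, "Anthony")       # replace 4 characters from offset 0
steps.append(text.data)

print(first_word, original_length)
for step in steps:
    print(step)
print(text.length)
```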

The Element Interface

Most of the nodes in a document belong to the Element or Text interfaces. However, the Node interface deals with most of the operations on elements, and the CharacterData interface handles most operations on text nodes. The majority of the Element interface’s attributes and methods are therefore concerned with managing the element’s XML attributes.

Methods:

getAttribute( ): Retrieves an attribute’s value by name
setAttribute( ): Creates an attribute, and sets its value at once
removeAttribute( )
getAttributeNode( ): When passed a name, it retrieves the Attr node.
setAttributeNode( )
removeAttributeNode( )
getElementsByTagName( ): Similar to when used in the Document interface.
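A minimal sketch of the attribute methods, assuming Python's standard-library xml.dom.minidom. The id attribute and the value "business" are invented for the example.

```python
from xml.dom.minidom import parseString

doc = parseString('<contact type="personal"/>')
contact = doc.documentElement

original = contact.getAttribute("type")      # read a value by name

contact.setAttribute("type", "business")     # overwrite an existing attribute
contact.setAttribute("id", "c1")             # create a new one and set it at once

id_node = contact.getAttributeNode("id")     # the Attr node, not just its value
id_value = id_node.value

contact.removeAttribute("id")
after_removal = contact.getAttribute("id")   # empty string when the attribute is absent

print(original, id_value, repr(after_removal))
print(contact.toxml())
```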

The Attr Interface

Represents an attribute of an element object. Attributes are not considered part of the document tree, but properties of their elements. As such, they are accessed by methods of the Element interface. It has the following attributes:

name: The name of the attribute
value: Its value
specified: A Boolean value that returns true if the attribute has been assigned a value, either in the original document or by code.
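The point that attributes sit outside the tree can be checked directly. The sketch below assumes Python's standard-library xml.dom.minidom and a made-up one-line document. It reads the name and value of an Attr node and shows that the node is neither a child of its element nor given a parent.

```python
from xml.dom.minidom import parseString

doc = parseString('<contact type="personal"><name>James Blogg</name></contact>')
contact = doc.documentElement

# Attr nodes are reached through the element, not by walking the tree
attr = contact.getAttributeNode("type")
print(attr.name, attr.value)

# The element's children do not include the attribute
children = [child.nodeName for child in contact.childNodes]
print(children)
print(attr.parentNode)

# Changing the Attr node changes what the element reports
attr.value = "business"
print(contact.getAttribute("type"))
```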

References:

XML Unleashed, Morrison, et al. Chapter 15.

Internet & World Wide Web How to Program, Deitel, Deitel and Nieto. Chapter 20.

Date post:	28-Mar-2015
Upload:	tyler-hurley