Date post: | 28-Mar-2015 |
Upload: | tyler-hurley |
XML IV
The Document Object Model
The Document Object Model (DOM) represents an XML document as a hierarchical structure and provides a means of accessing and manipulating it. It gives access to every part of the document: the document itself, its root element, child elements, attributes, and so on. The DOM also provides methods for modifying, adding to, and deleting from the original document. Without the DOM, XML would be little more than a storage format for data.
Generally, a document model can follow one of three patterns: linear, tree, or object.
Linear Models
The simplest model, describing the document in a linear fashion.
Illustration: suppose I said “Go to page 100 of a book, go to line 10, and read the 1st word”. The problem is that if an earlier section of the book is altered, every existing reference to a later section becomes invalid.
Tree Models
Describe the document as a tree in which each item is referred to as a node. Terminal items are regarded as leaves.
The tree has a root element and child nodes. A node's parent is the node immediately above it, and sibling nodes share the same parent.
An advantage of the tree is that any part of the document can be reached by ‘walking the tree’.
Tree models are less sensitive to change than linear models, but still somewhat sensitive: if a complete node is removed, the relationships between the remaining nodes may be altered.
Using the book analogy, I might say “Go to the 1st item of the 2nd paragraph of the 4th chapter”.
Object Models
The least sensitive to change. Each section of a document has a name property, so references remain valid even when the document changes.
The W3C DOM is a combination of the tree and object models.
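The difference in sensitivity to change can be sketched in a few lines of JavaScript. This is only an illustration under my own assumptions: the toy book, its section names and the two lookup functions are hypothetical and are not part of any DOM API.

```javascript
// Toy book: each section has a name and some text (hypothetical data).
const book = [
  { name: "preface", text: "Preface" },
  { name: "chapter1", text: "It was a dark night." },
  { name: "chapter2", text: "The end." },
];

// Linear-style reference: an absolute position in the document.
function byPosition(doc, index) {
  return doc[index].text;
}

// Object-style reference: look the section up by its name property.
function byName(doc, name) {
  const section = doc.find((s) => s.name === name);
  return section ? section.text : null;
}

// Both references initially point at the last chapter.
const before = { position: byPosition(book, 2), name: byName(book, "chapter2") };

// Alter an earlier part of the book by adding a foreword at the front.
book.unshift({ name: "foreword", text: "Foreword" });

// The positional reference now lands on a different section;
// the named reference still finds the same one.
const after = { position: byPosition(book, 2), name: byName(book, "chapter2") };

console.log(before);
console.log(after);
```

After the edit, the positional lookup returns the text of chapter 1, while the lookup by name is unaffected.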
Example of a Document Tree
Simple XML document:
<?xml version="1.0"?>
<!-- demonstration of a parse tree -->
<contact type="personal">
  <name>James Blogg</name>
  <telephone>02089443243</telephone>
</contact>
Document tree:
Document
  Version
  Comment
  Element (contact)
    Element (name)
      Text
    Element (telephone)
      Text
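The same parse tree can be written down as nested JavaScript objects and printed by a depth-first walk. This is a rough sketch rather than a real parser: the node( ) and outline( ) helpers are hypothetical, and the Version item simply mirrors the diagram for the contact document.

```javascript
// Hypothetical plain-object stand-in for a node of the parse tree.
function node(type, name, children = []) {
  return { type, name, children };
}

// Hand-built tree for the contact document.
const tree = node("Document", "#document", [
  node("Version", "1.0"),
  node("Comment", "#comment"),
  node("Element", "contact", [
    node("Element", "name", [node("Text", "James Blogg")]),
    node("Element", "telephone", [node("Text", "02089443243")]),
  ]),
]);

// Depth-first walk that returns one indented line per node.
function outline(n, depth = 0) {
  const line = "  ".repeat(depth) + n.type + " " + n.name;
  return [line].concat(n.children.flatMap((c) => outline(c, depth + 1)));
}

console.log(outline(tree).join("\n"));
```

Each level of indentation in the printed outline corresponds to one step down from parent to child.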
The W3C DOM
The specification for the W3C DOM can be found at http://www.w3.org/TR/REC-DOM-Level-1
Its interfaces are defined in OMG IDL (Object Management Group Interface Definition Language), a language for describing interfaces independently of any programming language, so that applications can communicate with each other even if they are written in different languages.
Under the IDL, objects expose various interfaces. Each interface has a set of attributes that describe the properties of the object behind that interface. The object can also be operated on by methods that invoke functions behind the interface; these methods generally return a value to the requesting application.
Together, the attributes and methods constitute the API of that object.
DOM Objects
An object that supports the DOM must be capable of loading an XML document into itself. It must also expose all the interfaces, with the appropriate attributes and methods, as laid out in the DOM specification.
DOM Interfaces
DOMImplementation: Used to query the object itself; it is independent of the document loaded into it.
Document: Provides information about the document that has been loaded into the object.
Node: Everything in a document can be considered a node of the document, i.e. elements, comments, etc. This interface contains attributes and methods for manipulating any kind of node.
NodeList: An ordered collection of nodes that can be accessed by index.
NamedNodeMap: A collection of nodes that can be accessed by name.
CharacterData: Deals with the text of the document.
Attr: Deals with the XML attributes of an element. An attribute is not represented in the tree structure, as it is considered to be a property of its element.
Element: Most nodes of an XML document will be elements. The Element interface has properties and methods for dealing with elements and their XML attributes.
Text: Deals with the text content of an element.
CDATASection, Notation, Entity, EntityReference and ProcessingInstruction: Each deals with its namesake node.
Before considering DOM interfaces in detail, I’ll introduce the concept of XML Data Source Objects (DSO), as we’ll use this in the examples on DOM interfaces.
Consider the following HTML file:
<html><head><title>XML DSO</title></head>
<body>
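<xml id="isle">
<addressbook>...........</addressbook>
</xml>
<script language="JavaScript">
// The data island is reached from script through its id.
var MyDoc=isle;
// documentElement is the root element of the XML held in the island.
var rootEl=MyDoc.documentElement;
document.write(rootEl.nodeName);
</script>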
</body> </html>
The example demonstrates the creation of an XML DSO or data island within an HTML file, with the use of the xml element.
Note that the xml element is a proprietary element of IE5, and will not work on other browsers.
The same XML can instead be kept in a separate file and referenced with the src attribute of the xml element, i.e.
<html><head><title>XML DSO</title></head>
<body>
<xml id="isle" src="addressbook.xml"></xml>
<script language="JavaScript">
var MyDoc=isle;
var rootEl=MyDoc.documentElement;
document.write(rootEl.nodeName);
</script>
</body> </html>
The Document Interface
Returns information about the document. It has the following read-only attributes:
doctype: Returns the document type declaration (<!DOCTYPE ...>) as a DocumentType node. In DOM Level 1 this exposes the declared name, entities and notations, but not the rest of the DTD.
implementation: Returns the DOMImplementation object that handles the document. It has a Boolean method called hasFeature( ), which takes two parameters: the feature to test for ("HTML" or "XML") and the version.
documentElement: The most important attribute of the Document interface, because it returns the root element of the document. It’s an ideal starting point if you’re walking the tree to access a node.
Document Interface Methods
createElement( )
createDocumentFragment( )
createTextNode( )
createComment( )
createCDATASection( )
createProcessingInstruction( )
createAttribute( )
createEntityReference( )
getElementsByTagName( ): Returns a NodeList of all the elements with the tag name passed to it as an argument. Widely used in search routines.
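To make the behaviour of getElementsByTagName( ) concrete, here is a minimal sketch of the search it performs: a preorder traversal that collects matching elements in document order. The el( ) helper and the elementsByTagName( ) function are hypothetical stand-ins working on plain objects, not the DOM implementation itself.

```javascript
// Minimal element stand-in; not a real DOM node.
function el(nodeName, ...childNodes) {
  return { nodeName, childNodes };
}

// Collect the descendants of `root` whose nodeName matches `tagName`,
// in preorder (document) order. "*" matches every element.
function elementsByTagName(root, tagName) {
  const found = [];
  for (const child of root.childNodes) {
    if (tagName === "*" || child.nodeName === tagName) found.push(child);
    found.push(...elementsByTagName(child, tagName));
  }
  return found;
}

const addressbook = el("addressbook",
  el("contact", el("name"), el("phone")),
  el("contact", el("name"), el("phone"))
);

console.log(elementsByTagName(addressbook, "contact").length);
console.log(elementsByTagName(addressbook, "name").length);
console.log(elementsByTagName(addressbook, "*").length);
```

Note that the search covers all descendants, not just direct children, which is why it finds the name elements nested inside each contact.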
Accessing Nodes in DOM
There are two ways of accessing nodes in the DOM.
Walking the tree: Start anywhere in the DOM and move from node to node using attributes of the Node interface such as parentNode, firstChild, nextSibling and previousSibling.
Accessing nodes by name: This approach uses the getElementsByTagName( ) method to get a list of elements of that name.
Example:
<html><head><title>XML Example</title></head>
<body>
<xml id='isle'>
<addressbook>
<contact>
<name>Tony Benn</name>
<address>210 Temple road</address>
<city>London</city>
<postcode>NW9 0RT</postcode>
<phone>02082049565</phone>
</contact>
<contact>
<name>Mary Blair</name>
<address>20 St James road</address>
<city>London</city>
<postcode>SE4 0RT</postcode>
<phone>02072049565</phone>
</contact>
</addressbook>
</xml>
<script>
var myDoc=isle;
// Access by name: get every contact element.
var contactList=myDoc.getElementsByTagName('contact');
for (var i=0; i<contactList.length; i++)
{
// Walk the tree: contact -> name -> its text.
document.write(contactList.item(i).firstChild.firstChild.nodeValue);
document.write('<br\/>');
// Walk the tree: contact -> name -> address (next sibling) -> its text.
document.write(contactList.item(i).firstChild.nextSibling.firstChild.nodeValue);
document.write('<br\/> <br\/> ');
}
</script>
</body>
</html>
The above listing would output:
Tony Benn
210 Temple road

Mary Blair
20 St James road
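Walking the tree relies only on the links between nodes, so it can be sketched without a browser. The MiniNode class and textElement( ) helper below are hypothetical and model just the navigation attributes; a real DOM node carries far more.

```javascript
// Hypothetical miniature node with only the links needed for walking.
class MiniNode {
  constructor(nodeName, nodeValue = null) {
    this.nodeName = nodeName;
    this.nodeValue = nodeValue;
    this.parentNode = null;
    this.firstChild = null;
    this.lastChild = null;
    this.previousSibling = null;
    this.nextSibling = null;
  }

  // Attach `child` as the last child and keep all the links consistent.
  appendChild(child) {
    child.parentNode = this;
    child.previousSibling = this.lastChild;
    if (this.lastChild) this.lastChild.nextSibling = child;
    else this.firstChild = child;
    this.lastChild = child;
    return child;
  }
}

// Helper: an element holding a single text node.
function textElement(name, text) {
  const e = new MiniNode(name);
  e.appendChild(new MiniNode("#text", text));
  return e;
}

const contact = new MiniNode("contact");
contact.appendChild(textElement("name", "Tony Benn"));
contact.appendChild(textElement("address", "210 Temple road"));
contact.appendChild(textElement("city", "London"));

// Walk down to the name text, then sideways to the address text.
const nameText = contact.firstChild.firstChild;
const addressText = contact.firstChild.nextSibling.firstChild;
console.log(nameText.nodeValue);
console.log(addressText.nodeValue);

// Walk back up from a text node to the enclosing contact element.
console.log(addressText.parentNode.parentNode.nodeName);

// Visit every child of contact by following nextSibling links.
const names = [];
for (let n = contact.firstChild; n !== null; n = n.nextSibling) {
  names.push(n.nodeName);
}
console.log(names.join(", "));
```

The two walks at the top follow the same path as the expressions in the listing: first child, then first child again for the name, and first child, next sibling, first child for the address.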
The Document Fragment Interface
A utility interface that creates a subsection of the document, which can subsequently be inserted into the main document. It is particularly useful when you’re creating a lot of new elements. If you created the new elements in the main document directly, the node list would have to be updated each time. With a document fragment, the node list only has to be updated once.
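The saving can be illustrated with a toy model that counts how often the main document's child list is touched. Both classes below are hypothetical, and the update counter is my own simplification of the cost described above; the one DOM behaviour relied on is that inserting a fragment moves its children into the parent and leaves the fragment empty.

```javascript
// Hypothetical fragment: a lightweight holder for nodes not yet in the document.
class Fragment {
  constructor() {
    this.childNodes = [];
  }
  appendChild(node) {
    this.childNodes.push(node);
    return node;
  }
}

// Hypothetical parent that counts how often its child list is updated.
class CountingParent {
  constructor() {
    this.childNodes = [];
    this.updates = 0;
  }
  // Append one node, or all the nodes held by a fragment, as a single update.
  appendChild(node) {
    if (node instanceof Fragment) {
      this.childNodes.push(...node.childNodes);
      node.childNodes = []; // the fragment is emptied when it is inserted
    } else {
      this.childNodes.push(node);
    }
    this.updates += 1;
    return node;
  }
}

// Adding 100 elements directly: one update per element.
const direct = new CountingParent();
for (let i = 0; i < 100; i++) direct.appendChild({ nodeName: "email" });

// Building them in a fragment first: a single update of the main list.
const batched = new CountingParent();
const fragment = new Fragment();
for (let i = 0; i < 100; i++) fragment.appendChild({ nodeName: "email" });
batched.appendChild(fragment);

console.log(direct.updates, direct.childNodes.length);
console.log(batched.updates, batched.childNodes.length);
```

Both parents end up with the same 100 children; only the number of updates to the main list differs.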
The Node Interface
This is the key interface of the DOM, as everything in a document can be considered a node. It has the following attributes, all read-only except nodeValue:
nodeName: E.g. the name of the tag
nodeValue: Null for elements; the content for a text node
nodeType: Element, Attr, Text, etc.
parentNode
childNodes: Returns a NodeList of all child nodes of the node
firstChild
lastChild
previousSibling
nextSibling
attributes: Works only with element nodes. Returns a NamedNodeMap of all the attributes.
ownerDocument
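In script, nodeType is a number rather than a name. The table below lists the codes defined by DOM Level 1; the describeNode( ) helper and the two hand-written node objects are hypothetical and only illustrate how nodeType, nodeName and nodeValue fit together.

```javascript
// Numeric nodeType codes defined by DOM Level 1.
const NODE_TYPE_NAMES = {
  1: "Element",
  2: "Attr",
  3: "Text",
  4: "CDATASection",
  5: "EntityReference",
  6: "Entity",
  7: "ProcessingInstruction",
  8: "Comment",
  9: "Document",
  10: "DocumentType",
  11: "DocumentFragment",
  12: "Notation",
};

// Summarise a node-like object using only nodeType, nodeName and nodeValue.
function describeNode(node) {
  const kind = NODE_TYPE_NAMES[node.nodeType] || "Unknown";
  const value = node.nodeValue === null ? "null" : JSON.stringify(node.nodeValue);
  return kind + " " + node.nodeName + " = " + value;
}

// Hand-written stand-ins for the two nodes of <name>Tony Benn</name>.
const nameElement = { nodeType: 1, nodeName: "name", nodeValue: null };
const nameText = { nodeType: 3, nodeName: "#text", nodeValue: "Tony Benn" };

console.log(describeNode(nameElement));
console.log(describeNode(nameText));
```

The element itself has a null nodeValue; the text it contains lives in a separate Text node, which is why the earlier listing reads firstChild.nodeValue.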
Methods of the Node Interface
All except hasChildNodes( ) and cloneNode( ) take one or more node objects as parameters.
Before using the insertBefore( ), replaceChild( ), or appendChild( ) methods, you would normally have created a new node with one of the Document methods or with the cloneNode( ) method.
The insertBefore( ) method: Inserts the new child node before the reference node and returns the inserted node.
var dummy;
dummy=node_object.insertBefore(new_node, reference_node);
The replaceChild( ) method: Replaces the reference node with the new node, and returns the replaced node.
The appendChild( ) method: Appends the new node to the end of the node's list of children.
The hasChildNodes( ) method: Returns a Boolean indicating whether the node has child nodes.
The cloneNode( ) method: Returns a duplicate of the node it is called on.
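Example of using node methods:
<script>
var dummy;
var myDoc=isle;
// Create the new element using a Document method.
var new_element=myDoc.createElement('email');
var rootEl=myDoc.documentElement;
// Attach it as the last child of the root; appendChild( ) returns the added node.
dummy=rootEl.appendChild(new_element);
alert('I have just created a new element called '+dummy.nodeName);
</script>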
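The node methods can also be tried outside the browser with a small stand-in class. MiniNode here is hypothetical and models only childNodes and parentNode; it also uses removeChild( ), a DOM Level 1 method not covered in these notes, as a helper. Treat it as a sketch of the behaviour, not as an implementation of the specification.

```javascript
// Hypothetical miniature node; only childNodes and parentNode are modelled.
class MiniNode {
  constructor(nodeName) {
    this.nodeName = nodeName;
    this.parentNode = null;
    this.childNodes = [];
  }

  hasChildNodes() {
    return this.childNodes.length > 0;
  }

  // Insert newChild before refChild; a null refChild means "at the end".
  insertBefore(newChild, refChild) {
    if (newChild.parentNode) newChild.parentNode.removeChild(newChild);
    const index = refChild === null
      ? this.childNodes.length
      : this.childNodes.indexOf(refChild);
    if (index === -1) throw new Error("reference node is not a child of this node");
    this.childNodes.splice(index, 0, newChild);
    newChild.parentNode = this;
    return newChild;
  }

  appendChild(newChild) {
    return this.insertBefore(newChild, null);
  }

  removeChild(oldChild) {
    const index = this.childNodes.indexOf(oldChild);
    if (index === -1) throw new Error("node is not a child of this node");
    this.childNodes.splice(index, 1);
    oldChild.parentNode = null;
    return oldChild;
  }

  // Put newChild where oldChild was and hand back the replaced node.
  replaceChild(newChild, oldChild) {
    this.insertBefore(newChild, oldChild);
    return this.removeChild(oldChild);
  }

  // Copy this node; with deep === true, copy its descendants as well.
  cloneNode(deep) {
    const copy = new MiniNode(this.nodeName);
    if (deep) this.childNodes.forEach((c) => copy.appendChild(c.cloneNode(true)));
    return copy;
  }
}

const contact = new MiniNode("contact");
const nameEl = contact.appendChild(new MiniNode("name"));
const phone = contact.appendChild(new MiniNode("phone"));
contact.insertBefore(new MiniNode("address"), phone);
const replaced = contact.replaceChild(new MiniNode("mobile"), phone);
const copy = contact.cloneNode(true);

console.log(contact.childNodes.map((n) => n.nodeName).join(", "));
console.log(replaced.nodeName);
console.log(copy.childNodes.length, nameEl.hasChildNodes());
```

Defining appendChild( ) in terms of insertBefore( ) with no reference node keeps the two methods consistent, which is a design choice of this sketch rather than something the notes prescribe.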
The CharacterData Interface
Contains attributes and methods for accessing and editing strings in nodes that take a string value, e.g. text and comment nodes. It has the following attributes:
data: Returns all the text in the node as a Unicode string.
length: Returns the number of characters in the string.
It has the following methods (all self-explanatory):
substringData( )
appendData( )
insertData( )
deleteData( )
replaceData( )
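For completeness, here is a sketch of what each CharacterData method does, written as string operations on a hypothetical MiniCharacterData class. The parameter order follows DOM Level 1 (offset first, then count, then the new text); the range errors a real implementation raises for invalid offsets are left out.

```javascript
// Hypothetical stand-in for a text node; offsets count characters from 0.
class MiniCharacterData {
  constructor(data) {
    this.data = data;
  }

  get length() {
    return this.data.length;
  }

  // Return `count` characters starting at `offset`, without changing data.
  substringData(offset, count) {
    return this.data.slice(offset, offset + count);
  }

  // Add text to the end of data.
  appendData(arg) {
    this.data += arg;
  }

  // Insert text at `offset`.
  insertData(offset, arg) {
    this.data = this.data.slice(0, offset) + arg + this.data.slice(offset);
  }

  // Remove `count` characters starting at `offset`.
  deleteData(offset, count) {
    this.data = this.data.slice(0, offset) + this.data.slice(offset + count);
  }

  // Replace `count` characters starting at `offset` with new text.
  replaceData(offset, count, arg) {
    this.deleteData(offset, count);
    this.insertData(offset, arg);
  }
}

const text = new MiniCharacterData("210 Temple road");
const street = text.substringData(4, 6);
text.appendData(", London");
text.insertData(0, "Flat 2, ");
text.deleteData(0, 8);
text.replaceData(4, 6, "Church");

console.log(street);
console.log(text.data);
console.log(text.length);
```

Only substringData( ) returns a value; the other four methods change the node's data in place.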
The Element Interface
Most of the nodes in a document are element or text nodes. However, the Node interface deals with most of the operations on elements, and the CharacterData interface handles most operations on text nodes.
The majority of the Element interface's attributes and methods are therefore concerned with managing the element's XML attributes.
Methods:
getAttribute( ): Retrieves an attribute’s value by name
setAttribute( ): Creates an attribute and sets its value in one step
removeAttribute( )
getAttributeNode( ): When passed a name, retrieves the Attr node
setAttributeNode( )
removeAttributeNode( )
getElementsByTagName( ): Works as it does in the Document interface.
The Attr Interface
Represents an attribute of an element object. Attributes are not considered part of the document tree, but properties of their elements. As such, they are accessed by methods of the Element interface. It has the following attributes:
name: The name of the attribute
value: Its value
specified: A Boolean value that is true if the attribute has been assigned a value, either in the original document or by code.
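How the Element methods and Attr nodes relate can be sketched with a map from attribute names to small attribute objects. MiniElement is hypothetical and covers only four of the methods; returning an empty string for a missing attribute follows the DOM Level 1 wording, though implementations may behave differently.

```javascript
// Hypothetical element that keeps its attributes outside the child tree.
class MiniElement {
  constructor(tagName) {
    this.tagName = tagName;
    this.attributes = new Map(); // attribute name -> Attr-like object
  }

  // Create the attribute if needed and set its value in one step.
  setAttribute(name, value) {
    this.attributes.set(name, { name, value: String(value), specified: true });
  }

  // Return the value as a string, or "" when the attribute is absent.
  getAttribute(name) {
    return this.attributes.has(name) ? this.attributes.get(name).value : "";
  }

  // Return the Attr-like object itself, or null when the attribute is absent.
  getAttributeNode(name) {
    return this.attributes.get(name) || null;
  }

  removeAttribute(name) {
    this.attributes.delete(name);
  }
}

// Mirrors <contact type="personal"> from the earlier example document.
const contact = new MiniElement("contact");
contact.setAttribute("type", "personal");

const typeAttr = contact.getAttributeNode("type");
console.log(contact.getAttribute("type"));
console.log(typeAttr.name, typeAttr.value, typeAttr.specified);

contact.removeAttribute("type");
console.log(contact.getAttribute("type") === "", contact.getAttributeNode("type"));
```

Keeping the attributes in a separate map, rather than among the child nodes, reflects the point made above that attributes are properties of their element and not part of the tree.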