Multilevel XML Data Model
by
Deepanwita Roy
Bachelor of Engineering, Yeshwantrao Chavan College of Engineering, 2002
Submitted in Partial Fulfillment of the
Requirements for the Degree of Master of Science in the
Department of Computer Science and Engineering
College of Engineering and Information Technology
University of South Carolina
2005
Department of Computer Science and Engineering, Director of Thesis
Department of Computer Science and Engineering, 2nd Reader
Department of Computer Science and Engineering, 3rd Reader
Dean of The Graduate School
Acknowledgments
I express my gratitude to my thesis advisor, Dr. Csilla Farkas, for her inspiration,
guidance and encouragement during this work. Her observations and comments
helped me to establish the overall direction of the research and to move forward with
the investigation in depth. It has been a pleasure working with her during my graduate
studies.
I also thank my thesis committee (Dr. Manton Matthews and Dr. Caroline Eastman) for
providing helpful suggestions during the work.
I would like to express my sincere thanks to Vaibhav Gowadia for generously sharing
his time and knowledge in my work on the implementation of the XML Access Control Model.
Last, but not least, I would like to dedicate this thesis to my parents (Dr. Abhinaba
Roy and Mrs. Kabari Roy) and Mr. Bharath Pandravada for their love, patience and
understanding. It would not have been possible to complete this work without their
sacrifices and perseverance.
Abstract
In my thesis work I have studied the existing access control models for XML data,
identified their shortcomings, and developed technical solutions to improve upon
existing work. The focus of the access control models developed so far has been on
providing read access to users. The aim of my work is to evaluate the feasibility of
using existing XML access control models to handle update operations. My hypothesis is
that the existing, syntax-based access control models do not provide sufficient support
for XML databases. In particular, they do not provide policy validation capabilities
and are vulnerable in the presence of updates. In my thesis I address the limitations
regarding the update vulnerabilities and develop technical solutions to improve upon
the existing access control models. Of the various update operations, my
focus has been on delete operations, from the perspective of illegal inferences
and data integrity. I have also applied the Multilevel Secure (MLS) access control
model to XML databases. In addition, I have implemented a proof-of-concept
prototype of the proposed technique to support MLS-XML.
Contents
Acknowledgments
Abstract
List of Figures
1 Introduction
2 Related Work
2.1 XML Access Control Models
2.2 XML Update
2.3 Mandatory Access Control Model
2.4 Limitation of existing research
3 Thesis
3.1 Definitions
3.2 Document Update Framework for Multilevel Secure XML
3.2.1 Correctness Criteria for XML Security
3.2.2 Delete Operation
3.2.3 Correctness Criteria of the algorithm
4 Implementation Of Access Control Model
4.1 Implementation details
4.2 User Interface
5 Conclusions
5.1 Future Work
Bibliography
A
List of Figures
1.1 Original XML document
3.1 Lattice Structure of Security Levels (a) Original lattice structure (b) Sublattice with a unique Deleted domain for each label
3.2 Lattice with the unique Deleted domains added
3.3 Public Level User Deletes Weather (a) Public Level View (b) Secret Level View (c) Top Secret Level View
3.4 Public Level User Deletes Data (a) Public Level View (b) Secret Level View (c) Top Secret Level View
3.5 Secret Level User Deletes water-resource (a) Public Level View (b) Secret Level View (c) Top Secret Level View
3.6 Secret Level User Deletes concrete location images (a) Public Level View (b) Secret Level View (c) Top Secret Level View
4.1 Unique identifiers generated by eXist
4.2 Framework of the eXist database
4.3 Schema of mac.xml file
4.4 Instance of mac.xml file
4.5 Converting path expression to regular expression
4.6 Administrator logs on
4.7 File loaded by the administrator
4.8 Bob logs on to the database
4.9 Bob’s view of the database
4.10 Bob submits a query
4.11 The result of the XQuery
4.12 Bob logs on to the database
Chapter 1
Introduction
Extensible Markup Language (XML) [7] is a flexible text format for describing
semi-structured and unstructured data over the Internet. It is derived from the Standard
Generalized Markup Language (SGML). SGML provides a mechanism to define
semi-structured data, but XML, a subset of SGML, does not require a strict syntax for
its documents [7]. XML specifications are defined by the World Wide Web Consortium
(W3C). The structure of an XML document can be described with the help of a Document
Type Definition (DTD) or an XML Schema (XMLS) [8].
XML is increasingly used to support web-based applications. XML applications
can be categorized as data exchange and data storage applications [13]. Electronic
data exchange applications include Open Financial Exchange and Intrusion Detection
Message Exchange, among others [21]. A large number of application areas, such as
health care [16], published data sets [5] and meteorological data, rely on
semi-structured and XML data to provide interoperation among different systems
[13] [15]. For many of these systems it has become essential to incorporate security,
to prevent malicious corruption of data and to prohibit unauthorized users from
accessing and using classified data. For example, medical records maintained by hospitals
or weather and area records maintained by meteorological departments do not
conform to a strict schema. XML can be used to store both schema-less files and
documents that conform to a predefined Schema or DTD. Owing to its flexibility, XML is
a less restrictive data storage format than relational databases.
In organizations there are groups of members with different access privileges; hence
a multilevel secure access control model needs to be developed that conforms to the
security criteria of an efficient access control system. Recently many access control
systems have been proposed for XML documents. The model proposed by Damiani
et al. [13], the XML Access Control Language (XACL) [17] policy language, and Author-X by
Elisa Bertino [1] have focused on providing read access to XML documents. However,
providing secure write access at the node level of XML documents has not been
studied sufficiently. Constraints in XML include cardinality issues, structural connectivity
and restrictions imposed by its Schema or DTD, if specified. This proposal focuses on
the specific problem of providing an efficient access control mechanism that supports
the constraints in XML documents. The update operations in XML include inserting
a node, deleting a node, and replacing or renaming a node. The update operations must
guarantee confidentiality, integrity and availability of data. My focus is
on the delete operation performed on nodes.
In the current access control models [1, 3, 4, 2, 11, 10, 9, 12, 17, 18, 22, 15] and
the current XUpdate language [14], when a node is deleted its entire subtree is deleted
along with it. This means that users at a lower security level can delete nodes at a higher
security level if those nodes are in the subtree being deleted. Such
blind deletes may lead to undesirable loss of important information. To illustrate the
problem, consider the following example. Figure 1.1 shows an XML document that
describes data received from satellite images. The data in the document is shared among
users at three different clearance levels. The levels are hierarchical: Top Secret
is the highest clearance, so a Top Secret user can also view the data visible to users
with lower clearance. The Top Secret level is followed by the Secret level and finally the
Public level. Data about weather is classified as public, and data about
images and its subtree has a higher classification level than the public level. Security
classification in the example document increases as we traverse downwards in the
XML tree. Since most access control models mentioned earlier assume this type
of document structure for read access, we
may think that the document schema is reasonably good for enforcing access control.
However, we observe that document integrity may be easily violated by the update operations
supported by existing models.
[Figure 1.1: Original XML document. Nodes: Report, Title, Place, Data, Date, Weather, Temperature, Humidity, Images, Water Resource, Man-made resources, Natural, Vegetation, Concrete Location Images, Buildings, Nuclear Power Resources. Each node carries a security label P (Public), S (Secret) or TS (Top Secret).]
There are the following options for implementing the delete operation in an XML document:
• Delete only the viewable nodes and allow fragmentation in the XML tree. This
means that the dangling subtrees get connected to the nearest parent nodes
after the deletion operation, leading to loose structural containment, and the
XML schema may be violated. Future querying of the XML tree and policy
enforcement are further difficulties faced by this approach.
• Delete the entire subtree irrespective of the security classification of the nodes
in the subtree. For example, if an employee in the weather forecasting department
decides that data older than one month must be deleted and the <data>
element is deleted from the view of the public user, <images> also gets deleted
from the view of Secret and Top Secret level users, which is an undesired effect. It
affects the integrity and availability constraints of the system. Existing XML
update implementations and access control models [17, 13] have suggested this
approach.
• Refuse the delete operation on any node that has a node with a higher security
classification in its subtree. This denial of service creates covert channels.
The above options demonstrate that the integrity requirements for secure update of
XML documents need to be maintained. In general, a secure and practical update
operation must also not create covert channels for
information flow and must not require modification of existing applications.
In this paper we present an architecture for performing XML deletes that preserves
document integrity and information confidentiality. To that end, the method must
allow deletion of a node while preserving the Schema of the XML document and
disallowing blind deletes.
The organization of the paper is as follows. In Chapter 2 we give a brief overview of
basic concepts that are used later in this paper. In Chapter 3 we present the proposed
framework for secure XML updates. In that chapter, we also present specifications
of the proposed XML Access Control model and the algorithms for secure XML
deletions (partial deletes). In Chapter 4 we go into the implementation details of the
Access Control Model in an XML database. Finally, in Chapter 5 we conclude and suggest
future research work.
Chapter 2
Related Work
This section gives a brief introduction to XML access control models, the XML update
operations and describes the limitations in existing models.
2.1 XML Access Control Models
XML access control supports security classification at all levels of granularity
(node level, association level, and file level). We can have different security classification
levels for individual documents and for elements within them. Hence, there is a strong
need for policies ensuring controlled access to and exchange of XML documents. Policies
can be enforced at the Schema or DTD level, where all the documents following the
Schema inherit the authorizations. Policies can also be specified at the individual
document level. The development of an access control system requires the definition
of the subjects and objects against which authorization is issued. A subject is specified
by a triple: uid, role-set and group-set, where role-set and group-set are sets of role
names and group names, respectively [13]. Roles and groups have a hierarchical structure; hence
all authorizations of a group or role propagate to all its subgroups or sub-roles, respectively.
An object represents an element or a set of elements in a target XML document. It can
be identified using an XPath expression [6]. Objects have an element-based hierarchical
structure. Authorizations can be either positive (permissions) or negative (denials),
depending on the application system. An authorization can be defined as applicable to an
element's attributes only (local authorization) or, recursively, to its sub-elements and
their attributes (recursive authorization) [13]. Authorizations can be specified on single
XML documents (instance-level authorization), called soft authorization [13], or
on a DTD or XML Schema (schema-level authorization), called hard authorization.
Hence an access authorization is a 5-tuple of the form <subject, object, action, sign,
type>. Issuing a policy for each element in the document may lead to a huge number
of authorizations. To limit the number of authorizations to be defined and maintained
for documents in a source, the tree structure of XML documents can be exploited to
enforce authorization propagation.
• Element to subelement: an authorization specified for an element propagates to its
subelements [1].
• Element to attribute/link: an authorization specified for an element
propagates to all its attributes and links, if defined.
• DTD to instance: an authorization
specified on a protection object at the DTD level propagates to the same protection
object in all documents recognized as valid.
Nodes of the XML tree are labeled
according to the authorization policy. The requester is then allowed to view all
the XML nodes and documents that are labeled with a positive authorization. The
view of the document is obtained by pruning all the subtrees labeled negative
[13]. The pruned document may not be valid with respect to the DTD referenced
by the original XML document. Hence a loosening transformation is applied, which
makes all the required attributes optional so that a signaling channel is avoided and
confidentiality of information is maintained.
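The labeling and pruning step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of [13]: the `Node` class, the encoding of signs as "+" and "-", the closed-policy default, and the rule that a node without an explicit sign inherits the sign of its parent are all hypothetical simplifications.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    # "+", "-", or None when no authorization is given explicitly (assumed encoding)
    sign: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def prune(node: Node, inherited: str = "-") -> Optional[Node]:
    """Return a copy of the subtree the requester may view, or None."""
    # Element-to-subelement propagation: inherit the parent's sign if unset.
    sign = node.sign if node.sign is not None else inherited
    if sign == "-":
        return None  # the whole negatively labeled subtree is pruned
    kept = []
    for child in node.children:
        pruned = prune(child, sign)
        if pruned is not None:
            kept.append(pruned)
    return Node(node.name, sign, kept)


# Hypothetical document: only <images> carries an explicit denial.
document = Node("report", "+", [
    Node("title"),
    Node("data", None, [
        Node("weather"),
        Node("images", "-", [Node("buildings")]),
    ]),
])
view = prune(document)
```

In this sketch the requester's view keeps `report`, `title`, `data` and `weather`, while `images` and everything below it are removed.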
2.2 XML Update
XUpdate makes extensive use of the expression language defined by XPath for
selecting elements for updating and for conditional processing. It is a purely descriptive
language. There are various update operations that can be performed on XML
documents. They are referred to as write authorizations on an XML document. The write
authorizations include inserting a node, deleting a node and updating a node (changing
the value of an attribute or the text of an element) [23, 17]. When
XML is stored in a relational database system (RDBMS), a mapping is made between
the XML document and the relational tables. Various approaches have been put
forward to map an XML document to relational tables, such as the Mapping Edges and
the Mapping Values approaches [20]. Whenever an update is made, it should be
reflected in all the relational tables associated with the document. Insert privileges
allow the insertion of new elements and attributes in the document; the updated
version is checked against the DTD, and the update is applied if the document remains valid
and rejected otherwise. Deletion of a node is permitted if the authorization for the subject
is positive.
2.3 Mandatory Access Control Model
Mandatory access control (MAC) secures information by assigning sensitivity labels to
information and comparing them to the level of sensitivity a user is operating at. Each
object is associated with a security classification, and subjects have security clearances.
The model has a set of access rules defined by comparing the security classification of the
requested objects with the security clearance of the subject. Each label λ(µ, {s})
consists of two parts:
1. a security level µ drawn from a total order
2. a set of categories {s}.
The dominance relation between labels is defined as follows: label λ(µ, {s}) dominates
label λ′(µ′, {s′}) iff µ ≥ µ′ and {s′} ⊆ {s}.
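The dominance relation can be written down directly. The sketch below is illustrative only: the `Label` class, the level names P, S and TS, and their numeric encoding are assumptions chosen to match the example document used in this thesis.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Assumed total order of levels: Public < Secret < Top Secret.
LEVELS = {"P": 0, "S": 1, "TS": 2}


@dataclass(frozen=True)
class Label:
    level: str
    categories: FrozenSet[str] = frozenset()


def dominates(a: Label, b: Label) -> bool:
    """a dominates b iff level(a) >= level(b) and categories(b) is a subset of categories(a)."""
    return LEVELS[a.level] >= LEVELS[b.level] and b.categories <= a.categories


secret_ab = Label("S", frozenset({"A", "B"}))
public_a = Label("P", frozenset({"A"}))
secret_a = Label("S", frozenset({"A"}))
public_b = Label("P", frozenset({"B"}))
```

Because categories are compared by set inclusion, dominance is only a partial order: `secret_a` and `public_b` above are incomparable, since neither label dominates the other.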
The Bell-LaPadula (BLP) model is a type of MAC model that deals with confidentiality
protection. The two axioms enforced by this model are:
Simple-security property:
A subject s is allowed to read an object o only if the security label of s dominates
the security label of o. This ensures no read up.
*-property:
A subject s is allowed to write an object o only if the security label of o dominates
the security label of s. This ensures no write down.
However, BLP allows blind writes, which lead to improper modification of data. For
example, a lower security user can write higher sensitivity data although he cannot
view it. Most practical implementations support writes at the same level only.
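The two BLP axioms, and the blind write they permit, can be illustrated with a short sketch. Labels are modeled here as hypothetical `(level, categories)` tuples; the function names are assumptions introduced for illustration.

```python
# Assumed total order of levels: Public < Secret < Top Secret.
LEVELS = {"P": 0, "S": 1, "TS": 2}


def dominates(a, b):
    """Label a = (level, categories) dominates label b."""
    (level_a, cats_a), (level_b, cats_b) = a, b
    return LEVELS[level_a] >= LEVELS[level_b] and cats_b <= cats_a


def can_read(subject, obj):
    """Simple-security property: no read up."""
    return dominates(subject, obj)


def can_write(subject, obj):
    """*-property: no write down."""
    return dominates(obj, subject)


def can_write_same_level(subject, obj):
    """Stricter rule used by most practical systems: write at the same level only."""
    return subject == obj


public_user = ("P", frozenset())
secret_object = ("S", frozenset())
public_object = ("P", frozenset())
```

With these definitions `public_user` cannot read `secret_object` but is allowed to write it under the *-property, which is exactly the blind write discussed above. The same-level rule rejects that write and still accepts a write to `public_object`.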
2.4 Limitation of existing research
A delete operation using XUpdate requires that if a node is deleted then its
corresponding subtree is also deleted. Current access control models adopt this approach.
Assuming that the security labels increase as we traverse down the XML tree, this
approach may result in blind writes. Although this is permitted by BLP, it is clearly
undesired, because it decreases the availability of data to higher security
users. If, instead, the delete operation is refused, this creates inference channels for the
lower security users. Hence a new access control model is needed that
permits delete operations while preventing blind writes.
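The cascading effect of a subtree delete can be reproduced with Python's standard `xml.etree.ElementTree` module. The document below is a hypothetical fragment loosely modeled on Figure 1.1; the `level` attribute is an assumed way of attaching security labels and is not part of XUpdate or of any of the cited models.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: <images> and <buildings> are Secret, the rest is Public.
doc = ET.fromstring(
    '<report level="P">'
    '<title level="P"/>'
    '<data level="P">'
    '<weather level="P"/>'
    '<images level="S"><buildings level="S"/></images>'
    '</data>'
    '</report>'
)

data = doc.find("data")

# Nodes inside the subtree that a Public user cannot even see.
lost = [e.tag for e in data.iter() if e.get("level") != "P"]

# A conventional delete removes <data> together with its entire subtree.
doc.remove(data)
remaining = [e.tag for e in doc.iter()]
```

After the delete, `lost` names the Secret nodes that disappeared as a side effect, which is the blind write this section describes.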
Chapter 3
Thesis
In this chapter, we present an MLS-XML Access Control Model. The proposed
solution is based on representing access permissions using the Mandatory Access Control
model. We discuss the basic definitions required to define our model and the
architecture of the proposed access model. We also provide the algorithm to perform secure
XML updates after the deletion operation.
3.1 Definitions
This section describes the basic concepts used in this paper and presents a formal
description of them. First, we describe the representation of XML documents and
Schema.
An XML document is a tree structure composed of properly nested element nodes.
In the textual representation of an XML document, each subtree is delimited by a pair of
start and end tags carrying the element name. Each element has zero or more child nodes,
which may include other element nodes, text nodes, and attribute nodes. Attributes
are child nodes of element nodes with the constraints that sibling attribute nodes cannot
have the same node label and that each attribute node has exactly one child node, a text node.
We now present a more formal definition of the XML document structure.
Definition 3.1.1 (XML Tree)
An XML tree is a node-labeled tree or a tree defined recursively [15] as follows:
1. The empty set {} is a tree, called the empty tree.
2. A single node {} is a tree.
3. If t1, t2, ..., tk are trees, then {n → {t1, t2, ..., tk}} is a tree. In this case we say
that {n → {t1, t2, ..., tk}} represents a tree with root node n that has outgoing
edges to subtrees t1, t2, ..., tk.
The nodes of the tree are labeled. Labels may be actual facts, node variables (corre-
sponding to any node value) or path variables (corresponding to any path). Constants
correspond to element, attribute and text nodes.
XML documents can be classified as well-formed and valid. An XML document
is well-formed if it obeys the syntax of XML. A well-formed document is valid if it
conforms to a Document Type Definition (DTD) [7] or XML Schema [8], i.e. the
document satisfies constraints on cardinality of nodes, data types, and node labels. The
purpose of a DTD is to define the legal building blocks of an XML document. It
defines the document structure with a list of legal elements. XML Schema, whose
function is similar to that of a DTD, is more extensible and richer. It is written in XML
and supports additional constraints on data types, cardinality, and namespaces.
3.2 Document Update Framework for Multilevel
Secure XML
In this section we present a framework for performing update operations in multilevel
secure XML documents. Our framework focuses on delete operations on XML nodes.
We present algorithms that maintain document integrity, avoid the need to modify
legacy applications, and present a correct view to users at each classification level.
3.2.1 Correctness Criteria for XML Security
Information security breaks down into three goals: integrity, confidentiality
and availability. Confidentiality refers to limiting access to information to a
set of authorized users. An XML document may have multiple access levels, and hence
confidentiality plays an important role in protecting classified data from unauthorized
users. To provide fine-grained access control, the security of individual nodes in an XML
tree needs to be checked to prevent leakage of information. A correct view of an
XML document T for a user s shows all nodes in T with classification less than
or equal to the clearance level of the user. Integrity means that the data has not been
changed inappropriately, whether by accident or by malicious activity. It covers the
validity of the document and the prevention of unauthorized data modification. In other
words, to preserve document integrity, users from lower levels must not be able to
modify the data available in the view of users with higher security clearance. Also,
the modified XML tree should satisfy the constraints imposed by the document schema,
such as cardinality, type, and referential integrity. Availability refers to the assurance
that the systems responsible for delivering, storing, and processing information are
accessible when needed.
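The notion of a correct view can be made concrete with a small sketch. It is an illustration under assumptions: nodes are hypothetical `(name, level, children)` tuples, only the hierarchical level is compared, the nesting of the example tree is guessed from the node names of Figure 1.1, and classification is taken to be non-decreasing from parent to child so that pruning a hidden node never hides a visible descendant.

```python
# Assumed total order of levels: Public < Secret < Top Secret.
LEVELS = {"P": 0, "S": 1, "TS": 2}


def correct_view(node, clearance):
    """Keep every node whose classification is at most the user's clearance."""
    name, level, children = node
    if LEVELS[level] > LEVELS[clearance]:
        return None
    kept = []
    for child in children:
        visible = correct_view(child, clearance)
        if visible is not None:
            kept.append(visible)
    return (name, level, kept)


def names(node):
    """Node names in document order."""
    result = [node[0]]
    for child in node[2]:
        result.extend(names(child))
    return result


# Hypothetical nesting of some of the nodes of the example document.
tree = ("Report", "P", [
    ("Title", "P", []),
    ("Data", "P", [
        ("Weather", "P", []),
        ("Images", "S", [
            ("Buildings", "S", []),
            ("Nuclear Power Resources", "TS", []),
        ]),
    ]),
])
```

Calling `names(correct_view(tree, "S"))` lists every node except the Top Secret one, while the Public view stops at `Weather`.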
3.2.2 Delete Operation
The current deletion operations do not satisfy the integrity and availability
correctness criteria of information systems, as they allow blind writes. Hence a mechanism
needs to be devised that preserves document integrity by preventing blind deletion of
XML nodes that have a higher security classification. Further, we want to prevent the
creation of disconnected XML fragments. Therefore we need to preserve the nodes
that connect the higher security XML subtrees to the lower security node.
The problem of cascading delete arises only if the node to be deleted is an
element node, as only element nodes can have subtrees containing nodes with higher
security classification under them. Attribute and text nodes do not have any child
nodes. Therefore the problem of cascading delete does not arise for them.
[Figure 3.1: Lattice Structure of Security Levels. (a) Original lattice structure, with labels (P,{}), (P,{A}), (P,{B}), (P,{A,B}), (S,{}), (S,{A}), (S,{B}), (S,{A,B}). (b) Sublattice with a unique Deleted domain for each label: {}, {DelP}, {DelPA}, {DelPB}, {DelPAB}, {DelS}, {DelSA}, {DelSB}, {DelSAB}.]
We now present our technique for updating a multilevel secure XML document upon
a delete operation. This technique is applied only if the node being deleted is an
element node. My approach makes use of the mandatory access control (MAC) model.
Each subject and object in the system is identified by a classification label consisting
of a hierarchical component and a set of domains.
This approach makes use of a unique new domain for each label in the lattice
when a deletion operation is performed.
The unique domain for each label is named [{Del} + {hierarchical component}
+ {set of domains}].
The clearance of every subject with label λs contains, along with its own domains, the
deleted domains associated with the labels that are dominated by λs. This allows subjects
with clearance λs to view the deleted objects whose classification λo satisfies λs > λo.
When a node N is deleted, a decision needs to be taken on whether the node should
be preserved and remain visible to users with a higher security clearance. To
reach this decision, the subtree of the node is traversed to check whether there exists
any node that has a higher security classification than node N. This is achieved by
running the procedure VisitSubtree. The existence of a higher security node means
that deleting the whole subtree would lead to the undesired deletion of a node to which
the current user has no access rights.
[Figure 3.2: Lattice with the unique Deleted domains added. Labels: (P,{}), (P,{A, DelP}), (P,{B, DelP}), (P,{A, B, DelPA, DelPB}), (S,{DelP}), (S,{A, DelP, DelS, DelPA}), (S,{B, DelP, DelS, DelPB}), (S,{A, B, DelP, DelS, DelPA, DelPB, DelPAB, DelSA, DelSB}).]
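The naming of the deleted domains and the extension of subject clearances can be sketched as follows. The sketch assumes the two levels P and S and the two domains A and B of Figure 3.1. The function names are hypothetical, and the rule that a subject receives the deleted domain of every label it strictly dominates is an interpretation read off the labels shown in Figure 3.2, not a statement taken from the text.

```python
from itertools import combinations

LEVELS = ["P", "S"]      # total order, lowest first (as in Figure 3.1)
DOMAINS = ["A", "B"]


def all_labels():
    """Enumerate every (level, set of domains) label of the lattice."""
    for level in LEVELS:
        for size in range(len(DOMAINS) + 1):
            for combo in combinations(DOMAINS, size):
                yield (level, frozenset(combo))


def dominates(a, b):
    return LEVELS.index(a[0]) >= LEVELS.index(b[0]) and b[1] <= a[1]


def del_domain(label):
    """Del + hierarchical component + set of domains, e.g. DelPA."""
    level, domains = label
    return "Del" + level + "".join(sorted(domains))


def subject_clearance(label):
    """Add the deleted domain of every label strictly dominated by `label`."""
    level, domains = label
    extra = {
        del_domain(other)
        for other in all_labels()
        if other != label and dominates(label, other)
    }
    return (level, frozenset(domains) | extra)
```

For example, `subject_clearance(("S", frozenset({"A"})))` yields the domains A, DelP, DelS and DelPA, which matches the label (S,{A, DelP, DelS, DelPA}) in Figure 3.2.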
Our solution to the security problems created by the delete operation in a multilevel
secure document is to intercept the delete action and modify the security labeling of
the nodes required to preserve document integrity. Algorithm 1 presents our solution in
detail. We now give an intuitive description of the algorithm.
The algorithm is divided into two procedures, which perform the following tasks.
1) Every node has a variable mark, which is set to "1" by procedure VisitSubtree
whenever the security level of the node needs to be changed.
2) The procedure
SecurityLabelling changes the security level of a node by adding the domain {Del
+ hierarchical component of the label + set of domains in that label} to the security
label of all the nodes marked by the previous procedure.
Whenever the subtree of the node needs to be preserved, we change the security
classification of the node N and its subtree to a new security classification
that includes the unique domain Deleted corresponding to that level. The present
user does not hold this domain in its clearance, but all users with a security clearance
higher than the present user do, and so can view the node. This means that the security
classification of object λo(µo, {so}) is changed to λ′o(µo, {so} ∪ {D}). The new security
classification of the node allows only subjects with clearance λs ≥ λ′o to view the object.
Thus, for the user at that level, the deletion is accepted and the deleted items are no longer
visible. The users with a security clearance that strictly dominates the
deleted object's classification (i.e. λs > λo) are still able to see this node. This partial
delete avoids covert channels and also maintains the integrity of the document.
Definition 3.2.1 Subject Clearance Update
Let λ1 = (µ1, s1), λ2 = (µ2, s2), . . . , λn = (µn, sn) be the labels such that
λs > λi (i = 1, . . . , n). Then give the clearances λ′i = (µi, si ∪ D), where i = 1, . . . , n
and D denotes the corresponding Deleted domain, to s.
Definition 3.2.2 Structural Connectivity Preservation
There exists a path to any object o in the XML tree with classification λo, which can be
accessed by a subject s with clearance λs if λs ≥ λo.
Definition 3.2.3 Minimal Node Preservation
Any given node n in the subtree rooted at the deleted node o is preserved only if its
subtree contains a node with security classification higher than λo.
Definition 3.2.4 Delete Semantics
An object o, when deleted by a subject s where λs ≤ λo, is removed along with its
subtree from the view of the subject. However, for any object o in the logical view of
s where λs ≥ λo, object o is preserved if it contains a node with a higher security
classification than λs in its subtree.
Algorithm 1 guarantees the following properties:
1. For users whose clearance dominates that of the deleted node, structural
connectivity and minimal node preservation are satisfied.
2. For users whose clearance level is the same as or lower than that of the deleted node,
the delete semantics are preserved.
Algorithm 1: Data Node Deletion Algorithm
input : XML tree X, SecurityClearance(Subject), SecurityClearance(Object)
output: Updated XML tree T
if SecurityClearance(Subject) = SecurityClearance(Object) then
    X′ ← VisitSubtree(X, N)
    /* change the security classification of the nodes by checking all the nodes marked "1" in the previous procedure */
    T ← SecurityLabelling(X′, N)
return T
Procedure SecurityLabelling(X′, N)
input : Marked XML tree X′, node N selected for deletion
output: XML tree X′ with changed security classification of appropriate nodes
Let [{Del} + {hierarchical component(node)} + {set of domains(node)}] be the domain for nodes marked deleted
Perform a post-order traversal of XML tree X′
foreach Node node in X′ do
    if node.mark = 1 then
        if SecurityClearance(node) = SecurityClearance(N) then
            Security classification of node ← (SecurityClearance(node), Domain(node) ∪ [{Del} + {hierarchical component(node)} + {set of domains(node)}])
Procedure VisitSubtree(Root, N)
input : root node of the tree being visited, node N selected for deletion
output: True, if tree contains nodes with higher security classification,
otherwise False
list ← ChildNodes(Root)
flag ← false
for i = 0 to Length(list) − 1 do
    child ← list.GetNode(i)
    child.mark ← 0
    if IsLeaf(child) then
        if SecurityClearance(child) ≤ SecurityClearance(N) then
            Delete(child)
        else
            child.mark ← 1
            flag ← true
    else
        retval ← VisitSubtree(child, N)
        if retval = true then
            flag ← true
if flag then
    Root.mark ← 1
return flag
Example 3.2.1 Let T be the XML tree shown in Figure 1.1, selected for the deletion
operation. Let Pete be a user with Public clearance, John be a user with Secret
clearance and Sam be a user with Top Secret clearance. If Pete deletes <weather>,
the subtree of <weather> does not have any node with a classification higher than the
security classification of <weather>, and hence the whole subtree is deleted. It is deleted
from the view of all users. The view generated for each user of tree T after this delete
operation is shown in Figure 3.3.
In the updated XML tree T, if Pete deletes <data>, the node has nodes in its subtree
that have a higher security classification than <data>. Hence the security classification
of the node <data> and of all nodes in its subtree that are present in the
view of Pete is changed to P{} ∪ {DelP} = P{DelP}. The views for users with different
security clearance are given in Figure 3.4. □
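The two deletes of Example 3.2.1 can be traced with a compact Python sketch of Algorithm 1. It is a hedged illustration, not the thesis prototype. The `Node` class, the function names, the comparison of hierarchical levels only, and the nesting of the example tree are assumptions. The sketch also removes inner nodes whose subtree holds nothing above the level of N, a step the pseudocode leaves implicit but that Definition 3.2.3 and Figure 3.4 suggest.

```python
from dataclasses import dataclass, field
from typing import List, Set

LEVELS = {"P": 0, "S": 1, "TS": 2}


@dataclass(eq=False)  # compare nodes by identity so list.remove drops the exact node
class Node:
    name: str
    level: str
    children: List["Node"] = field(default_factory=list)
    domains: Set[str] = field(default_factory=set)
    mark: int = 0


def visit_subtree(root: Node, n: Node) -> bool:
    """Prune what may go; mark what must stay. True if a higher node exists."""
    flag = False
    for child in list(root.children):  # copy, because children are removed
        child.mark = 0
        if not child.children:  # leaf
            if LEVELS[child.level] <= LEVELS[n.level]:
                root.children.remove(child)
            else:
                child.mark = 1
                flag = True
        elif visit_subtree(child, n):
            flag = True
        else:
            # Assumption: an inner node with nothing higher below it is dropped.
            root.children.remove(child)
    if flag:
        root.mark = 1
    return flag


def security_labelling(node: Node, n_level: str) -> None:
    """Post-order pass: add the Del domain to marked nodes at N's level."""
    for child in node.children:
        security_labelling(child, n_level)
    if node.mark == 1 and node.level == n_level:
        node.domains.add("Del" + node.level + "".join(sorted(node.domains)))


def delete_node(parent: Node, n: Node, subject_level: str) -> bool:
    if subject_level != n.level:
        return False  # Algorithm 1 acts only when the levels match
    if visit_subtree(n, n):
        security_labelling(n, n.level)  # partial delete: relabel, keep the path
    else:
        parent.children.remove(n)  # nothing higher below: plain delete
    return True


# Hypothetical nesting of some nodes of Figure 1.1.
weather = Node("Weather", "P", [Node("Temperature", "P"), Node("Humidity", "P")])
images = Node("Images", "S", [
    Node("Buildings", "S"),
    Node("Concrete Location Images", "S", [Node("Nuclear Power Resources", "TS")]),
])
data = Node("Data", "P", [Node("Date", "P"), weather, images])
report = Node("Report", "P", [Node("Title", "P"), Node("Place", "P"), data])

delete_node(data, weather, "P")  # Pete deletes <weather>: removed for everyone
delete_node(report, data, "P")   # Pete deletes <data>: relabeled with DelP
```

After both calls `data` is still attached to `report`, carries the domain `DelP`, and retains only the `Images` subtree, which corresponds to the P{DelP} label shown for Data in Figure 3.4.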
3.2.3 Correctness Criteria of the algorithm
We have discussed in the earlier sections the correctness criteria of any secure
information system. Let us discuss how the above algorithms satisfy the correctness
criteria.
[Figure 3.3: Public Level User Deletes Weather. (a) Public Level View (b) Secret Level View (c) Top Secret Level View]
[Figure 3.4: Public Level User Deletes Data. (a) Public Level View (b) Secret Level View (c) Top Secret Level View]
[Figure 3.5: Secret Level User Deletes water-resource. (a) Public Level View (b) Secret Level View (c) Top Secret Level View]
[Figure 3.6: Secret Level User Deletes concrete location images. (a) Public Level View (b) Secret Level View (c) Top Secret Level View]
The security of the document is enforced by checking the subject's access clearance
against the classification of the object. Update operations are invoked only if the
access authorizations are positive. The XML view at each level displays only the data
authorized for that level, and covert channels are avoided through partial delete.
This ensures confidentiality of data.
The integrity of the document is maintained by permitting modifications only when
they are authorized. Changing the security classification of the deleted nodes lets a
higher-level user keep a correct view at that level even though the parent of a
higher-classification node has been deleted by a lower-level user. The problem of
dangling nodes is avoided, hence the structure of the document is also maintained.
With partial delete, the higher-security nodes that formed a subtree of a lower-security
node remain available. Hence the availability criterion is met in the higher-security
view. Thus the above algorithms satisfy the three goals of information security.
Chapter 4
Implementation of Access Control Model
4.1 Implementation details
The XML Multilevel Access Control Model has been implemented on eXist, an open
source native XML database. The database and the model implementation are written
entirely in Java and may be deployed as a stand-alone server process, inside a
servlet engine, or directly embedded into an application [19].
The code was 34.5 MB when downloaded from the http://www.sourceforge.net site.
eXist uses indexing to store XML documents for efficient XQuery and XML update
operations. The indexing scheme uses the document id, node position and nesting
depth to identify nodes. A unique node identifier is assigned to each node by
traversing the tree in level order. From a given unique identifier, one can easily
determine the identifier of its parent, sibling or possible child nodes. The number
of children a node has is recomputed for every level of the tree: for nodes x and y
of a tree, size(x) = size(y) if level(x) = level(y), where size(n) is the number of
children of a node n and level(m) is the length of the path from the root node of
the tree to m. Figure 4.1 shows the unique identifiers generated by eXist for an
example XML document [19].
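The arithmetic behind such a level-order numbering can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not eXist's actual code: the class LevelOrderIds and its methods are hypothetical, the root is given identifier 1, and every node at level l is assumed to reserve fanOut[l] child slots (the rule size(x) = size(y) for nodes on the same level). Identifiers passed in are assumed to be valid.

```java
/** Simplified level-order numbering with one fan-out per level; not eXist's real code. */
class LevelOrderIds {
    private final int[] fanOut;   // fanOut[l]: child slots of every node at level l (root = level 0)
    private final long[] firstId; // firstId[l]: identifier of the first node at level l

    LevelOrderIds(int[] fanOut) {
        this.fanOut = fanOut.clone();
        this.firstId = new long[fanOut.length + 1];
        long nextId = 1;   // the root gets identifier 1
        long width = 1;    // number of (possibly virtual) nodes at the current level
        for (int l = 0; l <= fanOut.length; l++) {
            firstId[l] = nextId;
            nextId += width;
            if (l < fanOut.length) width *= fanOut[l];
        }
    }

    /** Level of a node, found by locating the identifier range that contains it. */
    int level(long id) {
        int l = 0;
        while (l + 1 < firstId.length && firstId[l + 1] <= id) l++;
        return l;
    }

    /** Identifier of the parent, or -1 for the root. */
    long parent(long id) {
        int l = level(id);
        if (l == 0) return -1;
        return firstId[l - 1] + (id - firstId[l]) / fanOut[l - 1];
    }

    /** Identifier of the first child slot; the siblings follow consecutively. */
    long firstChild(long id) {
        int l = level(id);
        if (l == fanOut.length) throw new IllegalArgumentException("node is at the deepest level");
        return firstId[l + 1] + (id - firstId[l]) * fanOut[l];
    }
}
```

Because the parent is obtained by integer arithmetic alone, no pointer from child to parent has to be stored, which is the property the later path computation relies on.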
Figure 4.1: Unique identifiers generated by Exist
Currently eXist uses four index files for native XML storage: collections.dbx
manages the collection hierarchy; dom.dbx collects nodes in a paged file and associates
unique node identifiers with the actual nodes; elements.dbx indexes elements and
attributes; and words.dbx keeps track of word occurrences and is used by the
full-text search extensions.
The framework of the access control model that is incorporated in eXist is
presented in Figure 4.2.
The access control system is based on the Mandatory Access Control (MAC) model.
Hence every user is assigned a clearance label, which consists of a hierarchical
component and a set of domains. The paths in an XML tree are likewise associated with
a classification label, which has a hierarchical component and a set of domains. The
policy file, which lists the clearance labels of the users and the classification
labels of the nodes in the XML tree, is provided by the system administrator. This
file contains a set of Subjects and Objects. Each subject has a subject-id and a
clearance label associated with it. Every object has a path, which represents a node
or a set of nodes in an XML tree, and a classification label. The schema for this
file is presented in Figure 4.3. An instance of the mac.xml file is shown in Figure 4.4.
Figure 4.2: Framework of eXist database
Figure 4.3: Schema of mac.xml file
The information in the policy file is parsed by the MACAssignmentsParser.java
file and stored in hashtables. The SubjectClearances table lists all the users
that can access the system. The ObjectClassification table lists all files present
in the collections. Each file entry holds a nested hashtable that maps each path
to its classification label. The path is divided into two parts: the main path and
the conditions, if any. For example, if the XPath is defined as /a/b/c[@name = "bob"],
then the main path is /a/b/c and the condition is [@name = "bob"]. If no condition is
specified, it is set to null. We can also convert a relative path into an absolute
path. If the policy file specifies a policy for the relative path /a//d, the
PathSatisfaction.java file converts this to the absolute path /a/b/c/d. This file
converts a path expression to a regular expression. It then checks whether the
absolute path obtained satisfies the path expression given as input. Sample code to
convert a path expression to a regular expression is shown in Figure 4.5.
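As an illustration of the two steps described above (splitting off the condition, and testing an absolute path against a path expression through a regular expression), a minimal Java sketch is given here. It is not the code of PathSatisfaction.java or of Figure 4.5. The class PathSatisfactionSketch and its methods are hypothetical, only the steps /, // and * are handled, and a condition is assumed to appear only on the last step.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; covers only '/', '//' and '*' steps of a path expression. */
class PathSatisfactionSketch {
    /** Splits a path into {mainPath, condition}; the condition is null if absent. */
    static String[] splitCondition(String xpath) {
        int i = xpath.indexOf('[');
        if (i < 0) return new String[] { xpath, null };
        return new String[] { xpath.substring(0, i), xpath.substring(i) };
    }

    /** Translates a path expression into a regular expression over absolute paths. */
    static Pattern toRegex(String pathExpr) {
        StringBuilder re = new StringBuilder("^");
        int i = 0;
        while (i < pathExpr.length()) {
            if (pathExpr.startsWith("//", i)) {
                re.append("/(?:[^/]+/)*");      // '//' allows zero or more intermediate steps
                i += 2;
            } else if (pathExpr.charAt(i) == '/') {
                re.append('/');
                i += 1;
            } else {
                int j = i;
                while (j < pathExpr.length() && pathExpr.charAt(j) != '/') j++;
                String step = pathExpr.substring(i, j);
                // '*' matches any single element name; other names are matched literally
                re.append(step.equals("*") ? "[^/]+" : Pattern.quote(step));
                i = j;
            }
        }
        return Pattern.compile(re.append('$').toString());
    }

    /** True if the absolute path satisfies the path expression. */
    static boolean satisfies(String absolutePath, String pathExpr) {
        return toRegex(pathExpr).matcher(absolutePath).matches();
    }
}
```

For the example in the text, satisfies("/a/b/c/d", "/a//d") holds, and splitCondition applied to /a/b/c[@name = "bob"] yields the main path /a/b/c and the condition [@name = "bob"].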
The SubjectClearances and ObjectClassification tables form the lattice structure
of our MAC model.
Figure 4.4: Instance of mac.xml file
Figure 4.5: Converting path expression to regular expression
When an XML file is introduced into the database, the absolute path of each node
in the XML document is derived and each node in the tree is assigned a
classification label. The absolute path of each node is compared to the paths
stored in the ObjectClassification hashtable. In this model, if no classification
label is defined for a particular path, the node inherits the classification label
of its parent and ancestors. In this way every node has exactly one classification
label. If a conflict occurs in the classification label for a node, the highest
classification label in the set of conflicting labels is assigned. This process is
called generating materialized views, and its result is stored in a cache.
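The labeling rules just described (match against the policy, inherit from the parent when nothing matches, take the highest label on conflict) can be sketched as below. This is a simplified illustration, not the thesis implementation. The class LabelAssignment is hypothetical, only the hierarchical component of a label is modeled (an enum ordered P < S < TS), policy entries are assumed to be already compiled regular expressions, paths are assumed to be listed parents first, and rootDefault is an assumed fallback for a root that no policy entry covers.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch of building the path-to-label cache (materialized view). */
class LabelAssignment {
    enum Level { P, S, TS }   // declared from lowest to highest

    static Map<String, Level> assign(List<String> absolutePaths,
                                     Map<Pattern, Level> policy,
                                     Level rootDefault) {
        Map<String, Level> labels = new LinkedHashMap<>();
        for (String path : absolutePaths) {          // parents must precede their children
            Level chosen = null;
            for (Map.Entry<Pattern, Level> rule : policy.entrySet()) {
                if (rule.getKey().matcher(path).matches()) {
                    Level candidate = rule.getValue();
                    // conflict between matching rules: keep the highest label
                    if (chosen == null || candidate.compareTo(chosen) > 0) chosen = candidate;
                }
            }
            if (chosen == null) {
                // no rule matched: inherit the label already assigned to the parent path
                String parent = path.substring(0, path.lastIndexOf('/'));
                chosen = labels.getOrDefault(parent, rootDefault);
            }
            labels.put(path, chosen);
        }
        return labels;
    }
}
```

The returned map plays the role of the cache: a later lookup by absolute path yields the single label of that node.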
An XML file is queried using the XQuery language, and the query answer is
retrieved and displayed to the user. We need to build the absolute path of each
answer node and check the security constraints before it is output by the XQuery
engine. The method that determines the absolute path of a result node from its
unique node identifier is written in XMLUtil.java in the org.exist.dom package. This
function uses the indexing scheme of eXist to reach the node's parent and continues
traversing up to the root of the tree to obtain the absolute path. Thus the absolute
path can be determined for every node in the result set of the XQuery. The
classification label for this absolute path is then obtained from the cache created
by the materialized view.
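The upward traversal can be pictured with the following minimal sketch. It does not reproduce the method in XMLUtil.java. The class AbsolutePathBuilder is hypothetical, the parent computation is passed in as a function that returns a non-positive value for the root, and element names are assumed to be available in a map keyed by node identifier.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.function.LongUnaryOperator;

/** Hypothetical sketch: rebuilds an absolute path by walking parent identifiers. */
class AbsolutePathBuilder {
    static String absolutePath(long nodeId, LongUnaryOperator parentOf, Map<Long, String> names) {
        Deque<String> steps = new ArrayDeque<>();
        // Walk from the node up to the root; parentOf yields a value <= 0 above the root.
        for (long id = nodeId; id > 0; id = parentOf.applyAsLong(id)) {
            steps.push(names.get(id));   // push to the front so the root ends up first
        }
        return "/" + String.join("/", steps);
    }
}
```

Each result node costs one step per tree level, so the security check adds work proportional to the nesting depth of the answer.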
After the absolute path of the node is retrieved, it is compared to the paths stored
in the ObjectClassification hashtable. The classification label of the user is
derived from the SubjectClearances hashtable. If the subject's label dominates the
object's classification label, the node is added to the result set; otherwise the
path is ignored. This enables node-level security in the eXist database.
4.2 User Interface
The administrator needs to log on as in Figure 4.6 because he has the privileges to
upload the mac.xml file.
Figure 4.6: Administrator logs on
The administrator uploads a file medical record.xml into the collection. The file is
parsed to create a materialized view, i.e. the absolute path is derived for each node
in the XML tree and a classification label is assigned to it. This information is
stored in the cache. The view for the administrator is as in Figure 4.7.
Now a user Bob logs on with classification label λ(TS, {Staff, General}) as in
Figure 4.8.
The database view for Bob is as given in Figure 4.9.
If Bob queries the medical record.xml file for all patient names under patient info,
he submits a query as in Figure 4.10.
Figure 4.7: File loaded by the administrator
Figure 4.8: Bob logs on to the database
Figure 4.9: Bob's view of the database
Figure 4.10: Bob submits a query
Figure 4.11: The result of the XQuery
Figure 4.12: Bob logs on to the database
We see that the classification label of medical record is λ(TS, {General}) and
hence, by propagation of policies, patient info under medical record also has the
classification label λ(TS, {General}). Since λ(TS, {General, Staff}) dominates
λ(TS, {General}), Bob is granted access, and his view of the query result is given
in Figure 4.11.
In another scenario, user Charles logs on with classification label λ(S, {General})
as in Figure 4.12. If Charles submits the same query as Bob, the classification
labels are compared. The object's classification label λ(TS, {General}) dominates
the classification label of Charles, as TS > S. Hence access is denied.
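The comparison used in the Bob and Charles scenarios is the usual lattice dominance test: one label dominates another if its hierarchical component is at least as high and its domain set contains the other's domains. A minimal Java sketch follows. The class MacLabel is hypothetical and is not taken from the eXist extension, and the three levels P < S < TS are an assumption based on the labels used in this document.

```java
import java.util.Set;

/** Hypothetical label: a hierarchical level plus a set of domains. */
class MacLabel {
    enum Level { P, S, TS }   // declared from lowest to highest

    final Level level;
    final Set<String> domains;

    MacLabel(Level level, Set<String> domains) {
        this.level = level;
        this.domains = Set.copyOf(domains);   // defensive, immutable copy
    }

    /** True if this label dominates other: level at least as high and domains a superset. */
    boolean dominates(MacLabel other) {
        return level.compareTo(other.level) >= 0 && domains.containsAll(other.domains);
    }
}
```

With Bob at (TS, {Staff, General}), Charles at (S, {General}) and the object at (TS, {General}), Bob's label dominates the object's label, so the node is returned, whereas Charles's label does not, so the node is filtered out.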
Chapter 5
Conclusions
The proposed algorithm maintains the integrity of the documents by restricting
blind writes from a lower security level user. Nodes deleted by a lower security
level user are marked with the delete marker attribute, so that a higher security
level user realizes that a lower security level user has attempted a deletion
operation on the node. It is up to the higher-level user to retain the document or
to update it according to the deletion operation.
The access control data model incorporated in eXist provides multilevel secure access
control. When an XML file is queried, only those nodes accessible to the user at
that particular classification are displayed. The granularity of access control is
the node level, hence a single XML document can have multiple access control levels.
The prototype ensures confidentiality, integrity and availability during XQuery and
XUpdate operations.
5.1 Future Work
Ontology will play a crucial role in enhancing updates of the current
document. If the relationships between nodes of the XML document can be properly
determined, then valid updates can be performed. A cohesion tree graph can be drawn
which reflects the coupling between two nodes. It may show whether a node is valid
without the presence of another node. If one node is deleted, then the other node
can be deleted as well, as it holds no meaning without the first. This graph can be
declared once during the design phase of the XML document and can be used to validate
updates on the document. In the implementation of the XML database, the update
operations discussed in our modified Mandatory Access Control Model can be included
in eXist.
Bibliography
[1] E. Bertino, M. Braun, S. Castano, E. Ferrari, and M. Mesiti. Author-X: A Java-
based system for XML data protection. In Proc. of 14th IFIP WG11.3 Working
Conference on Database Security, The Netherlands, August 2000.
[2] E. Bertino, S. Castano, and E. Ferrari. Securing XML documents with Author-X.
IEEE Internet Computing, 3, May/June 2001.
[3] E. Bertino, S. Castano, E. Ferrari, and M. Mesiti. Controlled access and dissemi-
nation of XML documents. In Proc. of 2nd ACM Workshop on Web Information
and Data Management, pages 22–27, Kansas City, 1999.
[4] E. Bertino, S. Castano, E. Ferrari, and M.Mesiti. Specifying and enforcing ac-
cess control policies for XML document sources. In World Wide Web Journal,
volume 3. Kluwer Academic Publishers, 2000.
[5] Kirk D. Borne. Xml group resources. Retrieved on April 25, 2004 from
http://xml.gsfc.nasa.gov, April 2004.
[6] World Wide Web Consortium. XML Path Language (XPath) Ver-
sion 1.0. W3C Recommendation, retrieved on November 16,1999 from
http://www.w3.org/XML/Schema, Nov 1999.
[7] World Wide Web Consortium. Extensible Markup Language 1.0
specification. W3C Recommendation, retrieved on October 6, 2000 from
http://www.w3.org/TR/2000/REC-xml-20001006, October 2000.
[8] World Wide Web Consortium. XML Schema. W3C Recommendation, retrieved
on May 2,2001 from http://www.w3.org/XML/Schema, May 2001.
[9] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, and P. Samarati. Design
and implementation of an access control processor for XML documents. In Proc.
of 9th International World Wide Web Conference, The Netherlands, 2000.
[10] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, and P. Samarati. Reg-
ulating access to semistructured information on the web. In Proc. of 16th IFIP
TC11 Annual Working Conference on Information Security: Information Secu-
rity for Global Information Infrastructures, Beijing, China, August 2000.
[11] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, and P. Samarati. XML
access control systems: A component-based approach. In Proc. of 14th IFIP
WG11.3 Working Conference on Database Security, The Netherlands, August
2000.
[12] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, and P. Samarati. Secur-
ing XML documents. In Proc. of Conference on Extending Database Technology,
Prague, March 2002.
[13] Ernesto Damiani, Sabrina De Capitani di Vimercati, Stefano Paraboschi, and
Pierangela Samarati. A fine-grained access control system for xml documents.
ACM Trans. Inf. Syst. Secur., 5(2):169–202, 2002.
[14] XML:DB Working draft. XML updates (XUpdate). Retrieved September 2004 from
http://xmldb-org.sourceforge.net/xupdate/xupdate-wd.html, September 2004.
[15] Vaibhav Gowadia and Csilla Farkas. Rdf metadata for xml access control. In
Proceedings of the 2003 ACM workshop on XML security, pages 39–48. ACM
Press, 2003.
[16] Andy Hadley and Cheryl Hutchings. 1.25 million electronic patient records in
xml at poole. In Proc. of XML Europe, Berlin, May 2001.
[17] M. Kudo and S. Hada. XML document security based on provisional authoriza-
tions. In Proc. of the 7th ACM conference on Computer and Communications
Security, Athens, Greece, November 2000.
[18] M. Kudo and S. Hada. Access control model with provisional actions. IEICE
Trans. Fundamentals, E84-A(1), 2001.
[19] Wolfgang Meier. exist: An open source native xml database. Web, Web-Services
and Database systems, March 2003.
[20] Daniela Florescu Michael Rys, Donald D.Chamberlin. Xml and relational
database management systems. In SIGMOD Conference 2005, 2005.
[21] OASIS. Cover pages. Hosted by OASIS, retrieved on April 25, 2004 from
http://xml.coverpages.org/, April 2004.
[22] A. Stoica and C. Farkas. Secure XML views. In Proc.of IFIP WG11.3 Working
Group Conference on Database and Application Security, 2002.
[23] Igor Tatarinov, Zachary G. Ives, Alon Y. Halevy, and Daniel S. Weld. Updating
xml. In Proceedings of the 2001 ACM SIGMOD international conference on
Management of data, pages 413–424. ACM Press, 2001.
Appendix A