
Multilevel XML Data Model

by

Deepanwita Roy

Bachelor of Engineering

Yeshwantrao Chavan College of Engineering, 2002

————————————————————–

Submitted in Partial Fulfillment of the

Requirements for the Degree of Master of Science in the

Department of Computer Science and Engineering

College of Engineering and Information Technology

University of South Carolina

2005

Department of Computer Science and Engineering, Director of Thesis

Department of Computer Science and Engineering, 2nd Reader

Department of Computer Science and Engineering, 3rd Reader

Dean of The Graduate School


Acknowledgments

I am grateful to my thesis advisor, Dr. Csilla Farkas, for her inspiration, guidance and encouragement during this work. Her observations and comments helped me to establish the overall direction of the research and to move forward with the investigation in depth. It has been a pleasure working with her during my graduate studies.

I also thank my thesis committee (Dr. Manton Matthews and Dr. Caroline Eastman) for providing helpful suggestions during the work.

I would like to express my sincere thanks to Vaibhav Gowadia for generously sharing his time and knowledge in my work on the implementation of the XML Access Control Model.

Last but not least, I would like to dedicate this thesis to my parents (Dr. Abhinaba Roy and Mrs. Kabari Roy) and Mr. Bharath Pandravada for their love, patience and understanding. It would not have been possible to complete this work without their sacrifices and perseverance.


Abstract

In my thesis work I have studied the existing access control models for XML data, identified their shortcomings, and developed technical solutions to improve upon existing work. The focus of the access control models developed so far has been on providing read access to users. The aim of my work is to evaluate whether existing XML access control models can handle update operations. My hypothesis is that the existing syntax-based access control models do not provide sufficient support for XML databases. In particular, they do not provide policy validation capabilities and are vulnerable in the presence of updates. In my thesis I address the limitations regarding the update vulnerabilities and develop technical solutions to improve upon the existing access control models. Of the various update operations, my focus has been on delete operations, from the perspective of illegal inferences and data integrity. I have also applied the Multilevel Secure (MLS) Access Control Model to XML databases. In addition, I have implemented a proof-of-concept prototype of the proposed technique to support MLS-XML.


Contents

Acknowledgments

Abstract

List of Figures

1 Introduction

2 Related Work

2.1 XML Access Control Models

2.2 XML Update

2.3 Mandatory Access Control Model

2.4 Limitation of existing research

3 Thesis

3.1 Definitions

3.2 Document Update Framework for Multilevel Secure XML

3.2.1 Correctness Criteria for XML Security

3.2.2 Delete Operation

3.2.3 Correctness Criteria of the algorithm

4 Implementation Of Access Control Model

4.1 Implementation details

4.2 User Interface

5 Conclusions

5.1 Future Work

Bibliography

Appendix A


List of Figures

1.1 original XML document

3.1 Lattice Structure of Security Levels (a) Original Lattice structure (b) Sublattice with the unique deleted domain for each label

3.2 Lattice with the unique deleted domains added

3.3 Public Level User Deletes Weather (a) Public Level View (b) Secret Level View (c) Top Secret Level View

3.4 Public Level User Deletes Data (a) Public Level View (b) Secret Level View (c) Top Secret Level View

3.5 Secret Level User Deletes water-resource (a) Public Level View (b) Secret Level View (c) Top Secret Level View

3.6 Secret Level User Deletes concrete location images (a) Public Level View (b) Secret Level View (c) Top Secret Level View

4.1 Unique identifiers generated by Exist

4.2 Framework of EXIST database

4.3 Schema of mac.xml file

4.4 Instance of mac.xml file

4.5 Converting path expression to regular expression

4.6 Administrator logs on

4.7 File loaded by the administrator

4.8 Bob logs on to the database

4.9 Bob’s view of the database

4.10 Bob submits a query

4.11 The result of the XQuery

4.12 Bob logs on to the database


Chapter 1

Introduction

Extensible Markup Language (XML) [7] is a flexible text format for describing semi-structured and unstructured data over the Internet. It is derived from the Standard Generalized Markup Language (SGML). SGML provides a mechanism to define semi-structured data, but XML, a subset of SGML, does not require its documents to follow a strict schema [7]. XML specifications are defined by the World Wide Web Consortium (W3C). The structure of an XML document can be described with the help of a Document Type Definition (DTD) or an XML Schema (XMLS) [8].

XML is increasingly used to support web-based applications. XML applications can be categorized as data exchange and data storage applications [13]. Electronic data exchange applications include Open Financial Exchange and Intrusion Detection Message Exchange, among others [21]. A large number of application areas, such as health care [16], published data sets [5] and meteorological data, rely on semi-structured and XML data to provide inter-operation among different systems [13] [15]. For many of these systems it has become essential to incorporate security, to prevent malicious corruption of data and to prohibit unauthorized users from accessing and using classified data. For example, medical records maintained by hospitals, or weather and area records maintained by meteorological departments, do not conform to a strict schema. XML can be used to store both schema-less files and documents that conform to a predefined Schema or DTD. Due to its flexibility, XML is a less restrictive data storage format than relational databases.

In organizations there are groups of members with different access privileges; hence a multilevel secure access control model needs to be developed that conforms to the security criteria of an efficient access control system. Recently many access control systems have been proposed for XML documents. The model proposed by Damiani et al. [13], the XML Access Control Language (XACL) [17] policy language, and Author-X by Elisa Bertino [1] have focused on providing read access to XML documents. However, providing secure write access at the node level of XML documents has not been studied sufficiently. Constraints in XML include cardinality issues, structural connectivity and restrictions imposed by the Schema or DTD, if specified. This proposal focuses on the specific problem of providing an efficient access control mechanism that supports the constraints in XML documents. The update operations in XML include inserting a node, deleting a node, and replacing or renaming a node. The update operations must guarantee confidentiality, integrity and availability of data. My area of focus is the delete operation performed on nodes.

In the current access control models [1, 3, 4, 2, 11, 10, 9, 12, 17, 18, 22, 15] and the current XUpdate language [14], when a node is deleted its entire subtree is deleted along with it. This means that users at a lower security level can delete nodes at a higher security level if those nodes are in the subtree being deleted. Such blind deletes may lead to undesirable loss of important information. To illustrate the problem, consider the following example. Figure 1.1 shows an XML document that describes data received from satellite images. The data in the document is shared among users with three different clearance levels. The levels are hierarchical: Top Secret is the highest clearance, so a Top Secret user can also view data classified at the lower levels. The Top Secret level is followed by the Secret level and finally by the Public level. Data about weather is classified as public, and data about images and its subtree has a higher classification level than the public level. Security classification in the example document increases as we traverse downwards in the XML tree. Since most of the access control models mentioned earlier assume this type of document structure for read access, we may think that the document schema is reasonably good for enforcing access control. We observe, however, that document integrity may be easily violated by the update operations supported by existing models.

[Figure 1.1: original XML document. A tree whose nodes are labeled P (Public), S (Secret) or TS (Top Secret). Node names recovered from the figure: Title, Place, Data, Date, Weather, Temperature, Humidity, Images, Report, Water Resource, Man-made resources, Natural, Vegetation, Concrete Location Images, Buildings, Nuclear Power Resources.]

There are the following options for implementing the delete operation in an XML document:

• Delete only the viewable nodes and allow fragmentation in the XML tree. This means that the dangling subtrees get connected to the nearest parent nodes after the deletion, which loosens structural containment and may violate the XML schema. Future querying of the XML tree and policy enforcement are further difficulties faced by this approach.

• Delete the entire subtree irrespective of the security classification of the nodes in the subtree. For example, if an employee in a weather forecasting department decides that data older than one month must be deleted, and the <data> element is deleted from the view of the Public user, then <images> also gets deleted from the view of Secret and Top Secret users, which is an undesired effect. It affects the integrity and availability constraints of the system. Existing XML update implementations and access control models [17, 13] have suggested this approach.

• Refuse the delete operation on any node that has a node with a higher security classification in its subtree. This denial of service creates covert channels.
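The second option can be made concrete with a small sketch. The Python below assumes a hypothetical `level` attribute on each element (P, S or TS) and element names modeled loosely on Figure 1.1; neither the attribute nor the helper `hidden_nodes_lost` comes from the thesis. It lists the nodes that a whole-subtree delete would remove without the deleting user being able to see them.

```python
import xml.etree.ElementTree as ET

# Assumed encoding: each element carries its classification in a "level" attribute.
DOC = """
<place level="P">
  <data level="P">
    <weather level="P"><temperature level="P"/></weather>
    <images level="S"><report level="S"/></images>
  </data>
</place>
"""

RANK = {"P": 0, "S": 1, "TS": 2}  # Public < Secret < Top Secret


def hidden_nodes_lost(root, target_tag, user_level):
    """Tags in the subtree of the first target_tag element that the user cannot see."""
    target = root.find(f".//{target_tag}")
    if target is None:
        return []
    return [e.tag for e in target.iter()
            if RANK[e.get("level")] > RANK[user_level]]


root = ET.fromstring(DOC)
print(hidden_nodes_lost(root, "data", "P"))     # nodes a Public user would delete blindly
print(hidden_nodes_lost(root, "weather", "P"))  # nothing is hidden under <weather>
```

A non-empty result is exactly the situation in which the second option harms availability for higher-level users, and in which the third option would have to refuse the request.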

The above options demonstrate that the integrity requirements for secure update of XML documents need to be maintained. In general, a secure and practical update operation must also not create covert channels for information flow and must not require modification of existing applications.

In this paper we present an architecture for performing XML delete that preserves document integrity and information confidentiality. The method allows a node to be deleted while preserving the Schema of the XML document and disallowing blind deletes.

The organization of the paper is as follows. In Chapter 2 we give a brief overview of basic concepts that are used later in this paper. In Chapter 3 we present the proposed framework for secure XML updates, together with the specification of the proposed XML Access Control model and the algorithms for secure XML deletions (partial deletes). In Chapter 4 we go into the implementation details of the Access Control Model in an XML database. Finally, in Chapter 5 we conclude and suggest future research work.


Chapter 2

Related Work

This chapter gives a brief introduction to XML access control models and XML update operations, and describes the limitations of existing models.

2.1 XML Access Control Models

XML access control supports security classification at all levels of granularity (node level, association level, and file level). Individual documents, and elements within them, can have different security classification levels. Hence, there is a strong need for policies that ensure controlled access to and exchange of XML documents. Policies can be enforced at the Schema or DTD level, where all the documents following the Schema inherit the authorizations. Policies can also be specified at the individual document level. The development of an access control system requires the definition of the subjects and objects against which authorizations are issued. A subject is specified by a triple (uid, role-set, group-set), where role-set and group-set are sets of role names and group names respectively [13]. Roles and groups have a hierarchical structure; hence all authorizations of a role or group propagate to its sub-roles and subgroups respectively. An object represents an element or a set of elements in a target XML document and can be identified using an XPath expression [6]. Objects have an element-based hierarchical structure. Authorizations can be either positive (permissions) or negative (denials), depending on the application system. An authorization can be defined as applicable to an element's attributes only (local authorization) or, recursively, to its sub-elements and their attributes (recursive authorization) [13]. An authorization can be specified on a single XML document (instance-level authorization), called a soft authorization [13], or on a DTD or XML Schema (schema-level authorization), called a hard authorization. Hence an access authorization is a 5-tuple of the form <subject, object, action, sign, type>. Issuing a policy for each element in the document may lead to a huge number of authorizations. To limit the number of authorizations to be defined and maintained for documents in a source, the tree structure of XML documents can be exploited to enforce authorization propagation.

• Element to subelement: an authorization specified for an element propagates to its subelements [1].

• Element to attribute/link: an authorization specified for an element propagates to all its attributes and links, if defined.

• DTD to instance: an authorization specified on a protection object at the DTD level propagates to the same protection object in all documents recognized as valid.

Nodes of the XML tree are labeled according to the authorization policy. The requester is then allowed to view all the XML nodes and documents that are labeled with a positive authorization. The view of the document is obtained by pruning all the subtrees labeled negative [13]. The pruned document may not be valid with respect to the DTD referenced by the original XML document. Hence a loosening transformation is applied, which makes all the required attributes optional, so that signaling channels are avoided and confidentiality of information is maintained.
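The labeling and pruning steps can be sketched in a few lines of Python. This is a simplified illustration, not the procedure of [13]: the `Elem` class, the `compute_view` function, the '+'/'-' encoding of signs, and the closed default policy (an unlabeled root is denied) are all assumptions made for the example. It applies element-to-subelement propagation and then prunes every subtree whose root is labeled negative.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Elem:
    name: str
    sign: Optional[str] = None          # '+' or '-' if set explicitly, else None
    children: List["Elem"] = field(default_factory=list)


def compute_view(elem: Elem, inherited: str = "-") -> Optional[Elem]:
    """Return a pruned copy of elem as seen by the requester, or None if denied."""
    sign = elem.sign or inherited        # element-to-subelement propagation
    if sign == "-":
        return None                      # prune the whole negative subtree
    kept = []
    for child in elem.children:
        view = compute_view(child, sign)
        if view is not None:
            kept.append(view)
    return Elem(elem.name, sign, kept)


doc = Elem("data", "+", [
    Elem("weather"),                               # inherits '+' from <data>
    Elem("images", "-", [Elem("report", "+")]),    # negative subtree
])
view = compute_view(doc)
print([child.name for child in view.children])
```

Because pruning removes whole negative subtrees, the positively labeled `report` element disappears together with its `images` parent in this sketch; a fuller model would need a rule for such conflicts.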

2.2 XML Update

XUpdate makes extensive use of the expression language defined by XPath for selecting elements for updating and for conditional processing. It is a purely descriptive language. Various update operations can be performed on XML documents; they are referred to as write authorizations on an XML document. The write authorizations include inserting a node, deleting a node, and updating a node (changing the value of an attribute or the text of an element) [23, 17]. When XML is stored in a relational database system (RDBMS), a mapping is defined between the XML document and the relational tables. Various approaches have been put forward to map XML documents to relational tables, such as the Mapping Edges and the Mapping Values approaches [20]. Whenever an update is made, it should be reflected in all the relational tables associated with the document. Insert privileges allow the insertion of new elements and attributes in the document. The updated version is checked against the DTD; if it is valid the update is applied, otherwise it is rejected. Deletion of a node is permitted if the authorization for the subject is positive.

2.3 Mandatory Access Control Model

Mandatory access control (MAC) secures information by assigning sensitivity labels to information and comparing them with the level of sensitivity at which a user is operating. Each object is associated with a security classification, and subjects have security clearances. Access rules are defined by comparing the security classification of the requested object with the security clearance of the subject. Each label λ(µ, {s}) is divided into two parts:

• 1. a totally ordered security level µ

• 2. a set of categories {s}.

The dominance relation between labels is defined as follows: label λ(µ, {s}) dominates label λ′(µ′, {s′}) iff µ ≥ µ′ and {s′} ⊆ {s}.
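As a minimal sketch of the dominance relation, the Python below encodes a label as a (level, categories) pair. The `Label` type, the `RANK` table and the level names P, S and TS are assumptions chosen to match the examples used elsewhere in this document.

```python
from typing import FrozenSet, NamedTuple

RANK = {"P": 0, "S": 1, "TS": 2}  # assumed total order on security levels


class Label(NamedTuple):
    level: str
    categories: FrozenSet[str]


def dominates(a: Label, b: Label) -> bool:
    """True iff a's level is at least b's and b's categories are a subset of a's."""
    return RANK[a.level] >= RANK[b.level] and b.categories <= a.categories


s_a = Label("S", frozenset({"A"}))
p_a = Label("P", frozenset({"A"}))
p_b = Label("P", frozenset({"B"}))
print(dominates(s_a, p_a))                        # higher level, same categories
print(dominates(s_a, p_b), dominates(p_b, s_a))   # neither label dominates the other
```

The last line shows why the labels form a lattice rather than a chain: two labels with unrelated category sets are incomparable.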


The Bell-LaPadula (BLP) model is a MAC model that deals with confidentiality protection. The two axioms enforced by this model are:

Simple-security property:

A subject S is allowed to read an object O only if the security label of S dominates the security label of O. This ensures no read up.

*-property:

A subject S is allowed to write an object O only if the security label of O dominates the security label of S. This ensures no write down.

However, BLP allows blind writes, which can lead to improper modification of data. For example, a lower-security user can write higher-sensitivity data although he cannot view it. Most practical implementations support writes only at the same level.
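The two BLP rules and the stricter same-level write rule can be written as small predicates. The sketch below is illustrative only: labels are assumed to be (level, frozenset of categories) pairs, and the function names are hypothetical.

```python
RANK = {"P": 0, "S": 1, "TS": 2}  # assumed total order on security levels


def dominates(a, b):
    """Labels are (level, frozenset_of_categories) pairs."""
    return RANK[a[0]] >= RANK[b[0]] and b[1] <= a[1]


def can_read(subject, obj):
    return dominates(subject, obj)      # simple-security property: no read up


def can_write(subject, obj):
    return dominates(obj, subject)      # *-property: no write down


def can_write_same_level(subject, obj):
    return subject == obj               # stricter rule that rules out blind writes


public = ("P", frozenset())
secret = ("S", frozenset())
print(can_read(public, secret))              # a Public subject cannot read Secret data
print(can_write(public, secret))             # but plain BLP lets it write there blindly
print(can_write_same_level(public, secret))  # the same-level rule refuses the write
```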

2.4 Limitation of existing research

The delete operation in XUpdate requires that if a node is deleted then its corresponding subtree is also deleted. Current access control models adopt this approach. Assuming that the security labels increase as we traverse down the XML tree, this approach may result in blind writes. Although this is permitted by BLP, it is clearly undesired, because it decreases the availability of data to higher-security users. If, on the other hand, the delete operation is refused, this creates inference channels for the lower-security users. Hence a new access control model is needed that allows delete operations while preventing blind writes.


Chapter 3

Thesis

In this chapter, we present an MLS-XML Access Control Model. The proposed solution is based on representing access permissions using the Mandatory Access Control model. We discuss the basic definitions required to define our model and the architecture of the proposed access model. We also provide the algorithm that securely updates the XML document after a deletion operation.

3.1 Definitions

This section describes the basic concepts used in this paper and presents a formal description of them. First, we describe the representation of XML documents and Schema.

An XML document is a tree structure composed of properly nested element nodes. In the textual representation of an XML document, each subtree is delimited by a pair of start and end tags bearing the element name. Each element has zero or more child nodes, which may include other element nodes, text nodes, and attribute nodes. Attributes are child nodes of element nodes, with the constraints that sibling attribute nodes cannot have the same node label and that an attribute node has exactly one child node, a text node. We now present a more formal definition of the XML document structure.

Definition 3.1.1 (XML Tree)

An XML tree is a node-labeled tree, defined recursively [15] as follows:

• 1. The empty set {} is a tree, called the empty tree.

• 2. A single node {} is a tree.

• 3. If t1, t2, ..., tk are trees, then {n → {t1, t2, ..., tk}} is a tree. In this case we say that {n → {t1, t2, ..., tk}} represents a tree with root node n that has outgoing edges to the subtrees t1, t2, ..., tk.

The nodes of the tree are labeled. Labels may be actual facts, node variables (corresponding to any node value) or path variables (corresponding to any path). Constants correspond to element, attribute and text nodes.
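The recursive definition maps naturally onto a nested data structure. The sketch below is one possible encoding, not the representation used in the thesis: each tree is a (label, subtrees) pair mirroring {n → {t1, ..., tk}}, attributes become child nodes with a single text child as described above, and the "@" prefix on attribute labels and the function name `to_tree` are assumptions.

```python
import xml.etree.ElementTree as ET


def to_tree(elem):
    """Return (label, [subtrees]) for an element, mirroring {n -> {t1, ..., tk}}."""
    # Each attribute becomes a child node with exactly one text child.
    children = [("@" + key, [(value, [])]) for key, value in elem.attrib.items()]
    # Leading text of the element becomes a text leaf (tail text is ignored here).
    if elem.text and elem.text.strip():
        children.append((elem.text.strip(), []))
    # Child elements are converted recursively.
    children.extend(to_tree(child) for child in elem)
    return (elem.tag, children)


doc = ET.fromstring('<weather unit="C"><temperature>21</temperature></weather>')
print(to_tree(doc))
```

A leaf is a node with an empty list of subtrees, which corresponds to the single-node case of the definition.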

XML documents can be classified as well-formed and valid. An XML document is well-formed if it obeys the syntax of XML. A well-formed document is valid if it conforms to a Document Type Definition (DTD) [7] or XML Schema [8], i.e. the document satisfies constraints on cardinality of nodes, data types, and node labels. The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements. XML Schema, whose function is similar to that of a DTD, is more extensible and richer. It is written in XML and supports additional constraints on data types, cardinality, and namespaces.

3.2 Document Update Framework for Multilevel Secure XML

In this section we present a framework for performing update operations in multilevel secure XML documents. Our framework focuses on delete operations on XML nodes. We present algorithms that maintain document integrity, avoid the need to modify legacy applications, and present a correct view to users at each classification level.

3.2.1 Correctness Criteria for XML Security

Information security breaks down into three goals: integrity, confidentiality and availability. Confidentiality refers to limiting access to information to a set of authorized users. An XML document may have multiple access levels, and hence confidentiality plays an important role in protecting classified data from unauthorized users. To provide fine-grained access control, the security of individual nodes in an XML tree needs to be checked to prevent leakage of information. A correct view of an XML document T for a user s shows all nodes in T with classification less than or equal to the clearance level of the user. Integrity means that the data has not been changed inappropriately, whether by accident or by malicious activity. It covers the validity of the document and the prevention of unauthorized data modification. In other words, to preserve document integrity, users from lower levels must not be able to modify the data available in the view of users with higher security clearance. Also, the modified XML tree should satisfy the constraints imposed by the document schema, such as cardinality, type, and referential integrity. Availability refers to the assurance that the systems responsible for delivering, storing, and processing information are accessible when needed.

3.2.2 Delete Operation

The current deletion operations do not satisfy the integrity and availability correctness criteria of information systems, as they allow blind writes. Hence a mechanism needs to be devised that preserves document integrity by preventing blind deletion of XML nodes that have a higher security classification. Further, we want to prevent the creation of disconnected XML fragments. Therefore we need to preserve the nodes that connect the higher-security XML subtrees to the lower-security node.

The problem of cascading delete arises only if the node to be deleted is an element node, as only element nodes can have subtrees containing nodes with a higher security classification under them. Attribute and text nodes do not have any child nodes, so for them the problem of cascading delete does not arise.

[Figure 3.1: Lattice Structure of Security Levels. (a) Original lattice structure over the labels (P,{}), (P,{A}), (P,{B}), (P,{A,B}), (S,{}), (S,{A}), (S,{B}), (S,{A,B}). (b) Sublattice with the unique deleted domain for each label: {}, {DelP}, {DelPA}, {DelPB}, {DelPAB}, {DelS}, {DelSA}, {DelSB}, {DelSAB}.]

We now present our technique for updating a multilevel secure XML document upon a delete operation. This technique is applied only if the node being deleted is an element node. My approach makes use of the mandatory access control (MAC) model. Each subject and object in the system is identified by a classification label comprising a hierarchical component and a set of domains.

This approach introduces a unique new domain for each label in the lattice, which is used when a deletion operation is performed.

The unique domain for each label is named [{Del} + {hierarchical component} + {set of domains}].

The domain set of every subject label λs also contains the deleted domains associated with the labels that are dominated by λs. This allows subjects with clearance λs to view the deleted objects whose classification λo satisfies λs > λo.
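The naming rule and the extension of subject clearances can be sketched as follows. This is an illustration under stated assumptions: the two levels P and S and the two domains A and B are taken from the lattice example, while the functions `del_domain`, `all_labels` and `extended_clearance` are hypothetical names. The sketch follows the rule as stated in the text, adding the deleted domain of every label that the subject strictly dominates.

```python
from itertools import combinations

RANK = {"P": 0, "S": 1}          # assumed levels of the example lattice
DOMAINS = ("A", "B")             # assumed ordinary domains


def all_labels():
    """Enumerate every (level, domains) label of the original lattice."""
    for level in RANK:
        for size in range(len(DOMAINS) + 1):
            for doms in combinations(DOMAINS, size):
                yield (level, frozenset(doms))


def strictly_dominates(a, b):
    return a != b and RANK[a[0]] >= RANK[b[0]] and b[1] <= a[1]


def del_domain(label):
    """Name of the unique deleted domain: 'Del' + level + sorted domains."""
    level, doms = label
    return "Del" + level + "".join(sorted(doms))


def extended_clearance(subject):
    """Add the deleted domain of every label strictly dominated by the subject."""
    level, doms = subject
    extra = {del_domain(lab) for lab in all_labels() if strictly_dominates(subject, lab)}
    return (level, doms | extra)


print(del_domain(("P", frozenset({"A", "B"}))))
print(extended_clearance(("S", frozenset({"A"}))))
```

For the subject label (S,{A}) this yields the domains A, DelP, DelS and DelPA, since (P,{}), (S,{}) and (P,{A}) are the labels it strictly dominates.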

When a node N is deleted, a decision needs to be taken on whether the node should be preserved and remain visible to users with a higher security clearance. To reach this decision, the subtree of the node is traversed to check whether there exists any node with a higher security classification than node N. This is achieved by running the procedure VisitSubtree. The existence of a higher-security node means that deleting the whole subtree would lead to the undesired deletion of a node to which the current user has no access rights.

[Figure 3.2: Lattice with unique domain deleted added. Labels shown: (P,{}), (P,{A, DelP}), (P,{B, DelP}), (P,{A, B, DelPA, DelPB}), (S,{DelP}), (S,{A, DelP, DelS, DelPA}), (S,{B, DelP, DelS, DelPB}), (S,{A, B, DelP, DelS, DelPA, DelPB, DelPAB, DelSA, DelSB}).]

Our solution to the security problems created by the delete operation in a multilevel secure document is to intercept the delete action and modify the security labels of the nodes that must be kept to preserve document integrity. Algorithm 1 presents our solution in detail. We now present an intuitive description of the algorithm.

The algorithm is divided into two procedures, which perform the following tasks. 1) Every node has a variable mark, which is set to "1" by the procedure VisitSubtree whenever the security level of the node needs to be changed. 2) The procedure SecurityLabelling changes the security level of a node by adding the domain {Del + hierarchical component of the label + set of domains in that label} to the security label of every node marked by the previous procedure.

Whenever the subtree of the node needs to be preserved, we change the security classification of the node N and its subtree to a new security classification that includes a unique domain Deleted corresponding to that level. The present user does not hold this domain in their clearance, but all users with a security clearance higher than the present user do, and can therefore still view the node. This means that the security classification of the object, λo = (µo, {so}), is changed to λ′o = (µo, {so} ∪ {D}). The new security classification allows only subjects with clearance λs ≥ λ′o to view the object. Thus, for the user at that level the deletion is accepted and the deleted items are no longer visible. Users whose security clearance strictly dominates the classification of the deleted object (i.e., λs > λo) are still able to see this node. This partial delete avoids covert channels and also maintains the integrity of the document.
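The relabelling described above can be illustrated with a minimal Python sketch. It assumes a label is a pair of a hierarchical level and a set of domains, that dominance means a higher-or-equal level together with a superset of domains, and that the Deleted domain is named by concatenating "Del", the level and the sorted domains (as in DelP or DelPA in Figure 3.2). The names Label, dominates, mark_deleted and update_clearance are hypothetical and are not taken from the implementation.

```python
from dataclasses import dataclass

LEVELS = {"P": 0, "S": 1, "TS": 2}  # Public < Secret < Top Secret


@dataclass(frozen=True)
class Label:
    level: str
    domains: frozenset = frozenset()


def dominates(a, b):
    """a dominates b: level at least as high and every domain of b held by a."""
    return LEVELS[a.level] >= LEVELS[b.level] and a.domains >= b.domains


def deleted_domain(label):
    """Name of the unique Deleted domain for a label, e.g. (P,{A}) -> 'DelPA'."""
    return "Del" + label.level + "".join(sorted(label.domains))


def mark_deleted(obj):
    """Partial delete: add the Deleted domain to the object's classification."""
    return Label(obj.level, obj.domains | {deleted_domain(obj)})


def update_clearance(subject, labels):
    """Give the subject the Deleted domain of every label it strictly dominates."""
    extra = {deleted_domain(l) for l in labels
             if dominates(subject, l) and subject != l}
    return Label(subject.level, subject.domains | extra)


pete = Label("P")
john = update_clearance(Label("S"), [Label("P")])
data = mark_deleted(Label("P"))        # becomes (P, {DelP})
print(dominates(pete, data))           # the deleting user no longer sees the node
print(dominates(john, data))           # a strictly higher user still does
```

Under these assumptions the Public user loses sight of the relabelled node, while the Secret user, whose clearance was extended with DelP, keeps it in view.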

Definition 3.2.1 Subject Clearance Update

Let λ′ be the set of labels λ1 = (µ1, s1), λ2 = (µ2, s2), . . . , λn = (µn, sn) such that λs > λi (i = 1, . . . , n). Then give the clearance λ′i = (µi, si ∪ D), where i = 1, . . . , n, to s.

Definition 3.2.2 Structural Connectivity Preservation

For any object o in the XML tree with classification λo, there exists a path to o that can be accessed by a subject s with clearance λs if λs ≥ λo.

Definition 3.2.3 Minimal Node Preservation

Any given node n in the subtree rooted at the deleted node o is preserved only if its

subtree contains a node with security classification higher than λo.

Definition 3.2.4 Delete Semantics

An object o, when deleted by a subject s where λs ≤ λo, is removed along with its subtree from the view of the subject. However, for any object o in the logical view for s where λs ≥ λo, object o is preserved if its subtree contains a node with a security classification higher than λs.


Algorithm 1 guarantees the following properties:

1. For users whose clearance dominates that of the deleted node, structural connectivity and minimal node preservation are satisfied.

2. For users whose clearance is the same as or lower than that of the deleted node, the delete semantics are preserved.

Algorithm 1: Data Node Deletion Algorithm

input : XML tree X, SecurityClearance(Subject), SecurityClearance(Object)
output: Updated XML tree T

if SecurityClearance(Subject) = SecurityClearance(Object) then
    X′ ← VisitSubtree(X, N)
    // change the security classification of the nodes by checking all the nodes marked "1" in the previous procedure
    T ← SecurityLabelling(X′, N)
return T

Procedure SecurityLabelling(X′, N)

input : Marked XML tree X′, node N selected for deletion
output: XML tree X′ with changed security classification of appropriate nodes

Let [{Del} + {hierarchical component(node)} + {set of domains(node)}] be the domain for nodes marked deleted
Perform a post-order traversal of XML tree X′
foreach node in X′ do
    if node.mark = 1 then
        if SecurityClearance(node) = SecurityClearance(N) then
            Security classification of node ← (SecurityClearance(node), Domain(node) ∪ [{Del} + {hierarchical component(node)} + {set of domains(node)}])

Procedure VisitSubtree(Root, N)

input : Root node of the tree being visited, node N selected for deletion
output: True if the tree contains nodes with a higher security classification, otherwise False

list ← ChildNodes(Root)
flag ← false
for i = 0 to Length(list) − 1 do
    child ← list.GetNode(i)
    child.mark ← 0
    if IsLeaf(child) then
        if SecurityClearance(child) ≤ SecurityClearance(N) then
            Delete(child)
        else
            child.mark ← 1
            flag ← true
    else
        retval ← VisitSubtree(child, N)
        if retval = true then
            flag ← true
if flag then
    Root.mark ← 1
return flag

Example 3.2.1 Let T be the XML tree shown in Figure 1.1, selected for the deletion operation. Let Pete be a user with Public clearance, John a user with Secret clearance, and Sam a user with Top Secret clearance. If Pete deletes <weather>, the subtree rooted at <weather> does not contain any node with a classification higher than that of <weather>, and hence the whole subtree is deleted. It is deleted from the view of all users. The views of tree T generated for each user after this delete operation are shown in Figure 3.3.

In the updated XML tree T, if Pete deletes <data>, its subtree contains nodes with a higher security classification than <data>. Hence the security classification of the node <data> and of all nodes in its subtree that are present in the view of Pete is changed to Pnull ∪ DelP. The views for users with different security clearances are given in Figure 3.4. □
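To make the interplay of the two procedures concrete, the following Python sketch follows the pseudocode of VisitSubtree and SecurityLabelling on a small in-memory tree. It is an illustration under stated assumptions rather than the thesis implementation: the Node class, the three-level ordering and the function delete_node are hypothetical, and two behaviours are our own additions (an inner node classified above N is always kept, and an inner node with nothing left to preserve is pruned).

```python
from dataclasses import dataclass, field

LEVELS = {"P": 0, "S": 1, "TS": 2}  # Public < Secret < Top Secret


@dataclass(eq=False)  # identity comparison, so list.remove() removes this exact node
class Node:
    name: str
    level: str
    domains: set = field(default_factory=set)
    children: list = field(default_factory=list)
    mark: int = 0


def higher(node, n):
    return LEVELS[node.level] > LEVELS[n.level]


def visit_subtree(root, n):
    """Mark nodes that must survive; return True if any node is classified above n."""
    flag = False
    for child in list(root.children):        # copy, because we remove while iterating
        child.mark = 0
        if not child.children:               # leaf
            if higher(child, n):
                child.mark = 1
                flag = True
            else:
                root.children.remove(child)
        else:
            keep = visit_subtree(child, n)
            if higher(child, n):             # assumption: keep inner nodes above n's level
                child.mark = 1
                keep = True
            if keep:
                flag = True
            else:                            # assumption: prune emptied inner nodes
                root.children.remove(child)
    if flag:
        root.mark = 1
    return flag


def security_labelling(root, n):
    """Post-order pass: add the Deleted domain to marked nodes at n's level."""
    for child in root.children:
        security_labelling(child, n)
    if root.mark == 1 and root.level == n.level:
        deleted = "Del" + root.level + "".join(sorted(root.domains))
        root.domains = root.domains | {deleted}


def delete_node(parent, n):
    """Partial delete if the subtree holds higher nodes, full delete otherwise."""
    if visit_subtree(n, n):
        security_labelling(n, n)
    else:
        parent.children.remove(n)
```

For a Public node data with a Public leaf and a Secret subtree, delete_node removes the Public leaf, keeps the Secret subtree and relabels data with the domain DelP; for a Public node whose subtree is entirely Public, the node is removed outright.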

Figure 3.3: Public Level User Deletes Weather (a) Public Level View (b) Secret Level View (c) Top Secret Level View

Figure 3.4: Public Level User Deletes Data (a) Public Level View (b) Secret Level View (c) Top Secret Level View

Figure 3.5: Secret Level User Deletes water-resource (a) Public Level View (b) Secret Level View (c) Top Secret Level View

Figure 3.6: Secret Level User Deletes concrete location images (a) Public Level View (b) Secret Level View (c) Top Secret Level View

3.2.3 Correctness Criteria of the algorithm

We have discussed the correctness criteria of secure information systems in the earlier sections. Let us discuss how the above algorithms satisfy these correctness criteria.

The security of the document is enforced by checking the access clearance of the subject against the classification of the object. The update operations are invoked only if the access authorizations are positive. The XML view of each level displays only the data authorized for that level, and the existence of covert channels is avoided through partial delete. This ensures confidentiality of the data.

The integrity of the document is maintained by permitting only authorized modifications. The change in the security levels of the deleted nodes lets a higher-level user keep the correct view for that level even though the parent of a higher-clearance node has been deleted by a lower-level user. The problem of dangling nodes is avoided, hence the structure of the document is also maintained.

By allowing partial delete, the higher-security nodes that formed a subtree of a lower-security node remain available. Hence the availability criterion is met in the higher-security view. Thus the above algorithms satisfy the three goals of information security.


Chapter 4

Implementation of Access Control Model

4.1 Implementation details

The implementation of the XML Multilevel Access Control Model has been done on eXist, an open source native XML database. The database and the model implementation are written entirely in Java and may be deployed as a stand-alone server process, inside a servlet engine, or directly embedded into an application [19]. The code was 34.5 MB when downloaded from the http://www.sourceforge.net site.

eXist uses indexing to store XML documents for efficient XQuery and XMLUpdate processing. This indexing uses the document id, node position and nesting depth to identify nodes. A unique node identifier is assigned to each node by traversing the tree in level order. From a given unique identifier one can easily determine the id of its parent, sibling or possible child nodes. The number of children a node has is recomputed for every level of the tree: for nodes x and y of a tree, size(x) = size(y) if level(x) = level(y), where size(n) is the number of children of a node n and level(m) is the length of the path from the root node of the tree to m. Figure 4.1 shows the unique identifiers generated by eXist for an example XML document [19].
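The arithmetic behind such a level-order numbering can be sketched as follows. This is a simplified Python illustration, not the code of eXist: it assumes the root has identifier 1, that every node on level l is given the same number of child slots fanout[l] (so that size(x) = size(y) on one level), and that identifiers are assigned consecutively level by level. The function names and the fanout list are hypothetical.

```python
def level_layout(fanout):
    """Return (first id, node count) lists per level for the given per-level fanout."""
    starts, counts = [1], [1]
    for k in fanout:
        starts.append(starts[-1] + counts[-1])
        counts.append(counts[-1] * k)
    return starts, counts


def level_of(node_id, fanout):
    """Level on which node_id lies (the root is level 0)."""
    starts, counts = level_layout(fanout)
    for level, (start, count) in enumerate(zip(starts, counts)):
        if start <= node_id < start + count:
            return level
    raise ValueError("identifier outside the numbered tree")


def parent_id(node_id, fanout):
    """Identifier of the parent, or None for the root."""
    starts, _ = level_layout(fanout)
    level = level_of(node_id, fanout)
    if level == 0:
        return None
    position = node_id - starts[level]
    return starts[level - 1] + position // fanout[level - 1]


def child_ids(node_id, fanout):
    """Identifiers of all child slots of node_id (empty on the last level)."""
    starts, _ = level_layout(fanout)
    level = level_of(node_id, fanout)
    if level >= len(fanout):
        return []
    position = node_id - starts[level]
    first = starts[level + 1] + position * fanout[level]
    return list(range(first, first + fanout[level]))


def path_to_root(node_id, fanout):
    """Identifiers from node_id up to the root, computed without touching the tree."""
    path = [node_id]
    while (node_id := parent_id(node_id, fanout)) is not None:
        path.append(node_id)
    return path
```

With fanout = [2, 3], the root 1 has children 2 and 3, node 3 has children 7, 8 and 9, and path_to_root(8, fanout) yields [8, 3, 1]. The same upward walk is what allows an absolute path to be rebuilt from a node identifier alone.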


Figure 4.1: Unique identifiers generated by eXist

Currently eXist uses four index files for native XML storage: collections.dbx manages the collection hierarchy; dom.dbx collects nodes in a paged file and associates unique node identifiers with the actual nodes; elements.dbx indexes elements and attributes; and words.dbx keeps track of word occurrences and is used by the full-text search extensions.

The framework of the access control model that is incorporated in eXist is presented in Figure 4.2.

The access control system is based on the Mandatory Access Control (MAC) model. Hence every user is assigned a clearance label, which consists of a hierarchical component and a set of domains. The paths in an XML tree are likewise associated with a classification label, which has a hierarchical component and a set of domains. The policy file, which lists the clearance labels of the users and the classification labels of the nodes in the XML tree, is provided by the system administrator. This file contains a set of Subjects and Objects. Each subject has a subject-id and a clearance label associated with it. Every object has a path that represents a node or a set of nodes in an XML tree, together with a classification label. The schema for this file is presented in Figure 4.3. An instance of the mac.xml file is shown in Figure 4.4.

Figure 4.2: Framework of eXist database

Figure 4.3: Schema of mac.xml file

The information in the policy file is parsed by MACAssignmentsParser.java and stored in hashtables. The SubjectClearances table lists all the users that can access the system. The ObjectClassification table lists all files present in the collections. Each file entry holds a nested hashtable that maps each path to its classification label. The path is divided into two parts: the main path and the conditions, if any. For example, if the XPath is defined as /a/b/c[@name = "bob"], then the main path is /a/b/c and the condition is [@name = "bob"]. If no condition is specified, it is set to null. We can also convert a relative path into an absolute path. If the policy file defines a policy for the relative path /a//d, PathSatisfaction.java converts it to the absolute path /a/b/c/d. This file converts a path expression to a regular expression. It then checks whether the absolute path obtained satisfies the path expression given as input. Sample code to convert a path expression to a regular expression is shown in Figure 4.5.
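The two path operations described here, splitting off the condition and testing an absolute path against a path expression, can be sketched in a few lines of Python. This is a hedged illustration and not the content of PathSatisfaction.java: the function names are hypothetical, only the steps /, // and * are handled, and // is assumed to match zero or more intermediate elements.

```python
import re


def split_path(xpath):
    """Split an XPath into (main path, condition); the condition is None if absent."""
    index = xpath.find("[")
    if index == -1:
        return xpath, None
    return xpath[:index], xpath[index:]


def path_to_regex(path_expr):
    """Translate a path expression using /, // and * into a compiled regular expression."""
    segments = []
    for part in path_expr.split("//"):
        steps = ["[^/]+" if step == "*" else re.escape(step)
                 for step in part.split("/")]
        segments.append("/".join(steps))
    # '//' may skip any number of intermediate element names
    return re.compile("(?:/[^/]+)*/".join(segments))


def satisfies(absolute_path, path_expr):
    """True if the absolute path is matched in full by the path expression."""
    return path_to_regex(path_expr).fullmatch(absolute_path) is not None


print(split_path('/a/b/c[@name = "bob"]'))
print(satisfies("/a/b/c/d", "/a//d"))
```

Under these assumptions the expression /a//d is satisfied by /a/b/c/d and by /a/d, but not by /a/b/c.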

The SubjectClearances and ObjectClassification tables form the lattice structure of our MAC model.

Figure 4.4: Instance of mac.xml file

Figure 4.5: Converting path expression to regular expression

When an XML file is introduced into the database, the set of absolute paths of the nodes in the XML document is derived and each node in the tree is assigned a classification label. The absolute path of each node is compared to the paths stored in the ObjectClassification hashtable. In this model, if no classification label is defined for a particular path, the node inherits the classification label of its parent or, failing that, of its ancestors. In this way every node has exactly one classification label. If a conflict occurs among the classification labels for a node, the highest classification label in the set of conflicting labels is assigned. This process is called generating materialized views, and the result is stored in a cache.
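The labelling pass can be sketched as below. This is a minimal Python illustration under several assumptions that are ours and not taken from the implementation: labels are (level, domains) pairs, the policy maps exact absolute paths to a list of labels, the "highest" of several conflicting labels is taken to be their least upper bound (highest level, union of domains), and a node with no labelled ancestor falls back to a hypothetical default label.

```python
LEVELS = {"P": 0, "S": 1, "TS": 2}  # Public < Secret < Top Secret


def join(labels):
    """Least upper bound of several (level, domains) labels."""
    level = max((lvl for lvl, _ in labels), key=LEVELS.__getitem__)
    domains = frozenset().union(*(doms for _, doms in labels))
    return level, domains


def materialize(paths, policy, default=("P", frozenset())):
    """Assign exactly one label to every absolute path.

    paths  -- absolute paths of all nodes in the document
    policy -- dict mapping a path to the list of labels specified for it
    """
    view = {}
    # shallow paths first, so a parent is always labelled before its children
    for path in sorted(paths, key=lambda p: p.count("/")):
        explicit = policy.get(path)
        if explicit:
            view[path] = join(explicit)          # resolve conflicts upwards
        else:
            parent = path.rsplit("/", 1)[0]
            view[path] = view.get(parent, default)  # inherit from the parent
    return view


policy = {
    "/r": [("P", frozenset({"General"}))],
    "/r/a": [("S", frozenset({"General"})), ("TS", frozenset({"Staff"}))],
}
cache = materialize(["/r", "/r/a", "/r/a/b", "/r/c"], policy)
print(cache["/r/a/b"])
```

Here /r/a receives the joined label of its two conflicting entries, /r/a/b inherits that label, and /r/c inherits the label of /r. The resulting dictionary plays the role of the cached materialized view.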

An XML file is queried using the XQuery language, and the query answer is retrieved and displayed to the user. We need to build the absolute path of each answer node and check the security constraints before it is output by the XQuery engine. The method that determines the absolute path of a result node from its unique node identifier is written in XMLUtil.java, in the org.exist.dom package. This function uses the indexing scheme of eXist to reach the parent of the node and then traverses up to the root of the tree to obtain the absolute path. Thus we can determine the absolute path of every node in the result set of the XQuery. The classification label for this absolute path is then obtained from the cache created by the materialized view.

After the absolute path of the node is retrieved, it is compared to the paths stored in the ObjectClassification hashtable. The clearance label of the user is obtained from the SubjectClearances hashtable. If the clearance label of the subject dominates the classification label of the object, the node is added to the result set; otherwise the path is ignored. This enables node-level security in the eXist database.

Figure 4.6: Administrator logs on

4.2 User Interface

The administrator needs to log on as in Figure 4.6, because he has the privileges to upload the mac.xml file.

The administrator uploads a file medical_record.xml to the collection. The file is parsed to create a materialized view, i.e., the absolute path is derived for each node in the XML tree and a classification label is assigned to it. This information is stored in the cache. The view for the administrator is shown in Figure 4.7.

Now a user Bob logs on with clearance label λ(TS, {Staff, General}), as in Figure 4.8.

The database view for Bob is given in Figure 4.9.

If Bob queries the medical_record.xml file and asks for all patient names under patient_info, he submits a query as in Figure 4.10.

Figure 4.7: File loaded by the administrator

Figure 4.8: Bob logs on to the database

Figure 4.9: Bob's view of the database

Figure 4.10: Bob submits a query

Figure 4.11: The result of the XQuery

Figure 4.12: Bob logs on to the database

We see that the classification label of medical_record is λ(TS, {General}), and hence, by propagation of policies, medical_record/patient_info also has the classification label λ(TS, {General}). Since λ(TS, {General, Staff}) dominates λ(TS, {General}), Bob's view of the query result is as given in Figure 4.11.

In another scenario, user Charles logs on with clearance label λ(S, {General}), as in Figure 4.12. If Charles submits the same query as Bob, the labels are compared. The classification label of the object, λ(TS, {General}), dominates the clearance label of Charles, as TS > S. Hence access is denied.
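The two scenarios can be replayed with a small Python sketch of the dominance check applied to a query result. It is illustrative only: the result path, the cache dictionary and the function names are hypothetical stand-ins for the ObjectClassification and SubjectClearances tables, and the level ordering P < S < TS is assumed.

```python
LEVELS = {"P": 0, "S": 1, "TS": 2}  # Public < Secret < Top Secret


def dominates(clearance, classification):
    """The subject may read the object if its level is at least as high
    and it holds every domain of the object."""
    (s_level, s_domains), (o_level, o_domains) = clearance, classification
    return LEVELS[s_level] >= LEVELS[o_level] and set(o_domains) <= set(s_domains)


def filter_results(result_paths, classification_cache, clearance):
    """Keep only the result nodes whose classification the clearance dominates."""
    return [path for path in result_paths
            if dominates(clearance, classification_cache[path])]


# hypothetical absolute path of a node returned by the query
results = ["/medical_record/patient_info/name"]
cache = {"/medical_record/patient_info/name": ("TS", {"General"})}

bob = ("TS", {"Staff", "General"})
charles = ("S", {"General"})

print(filter_results(results, cache, bob))      # Bob sees the node
print(filter_results(results, cache, charles))  # Charles gets an empty result
```

Bob's clearance dominates the object label on both the level and the domains, so the node is returned, whereas the level S of Charles is below TS and the node is filtered out.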


Chapter 5

Conclusions

The proposed algorithm maintains the integrity of documents by restricting blind writes from lower security level users. The nodes deleted by a lower security level user are marked with the delete marker attribute, so that a higher security level user can recognize that a lower security level user has attempted a deletion on the node. It is up to the higher-level user either to retain the document or to update it in accordance with the deletion operation.

The access control data model incorporated in eXist provides a Multilevel Secure Access Control Model. On querying an XML file, only those nodes accessible to the user at that particular classification are displayed. Access control is enforced at node-level granularity, hence a single XML document can have multiple access control levels. The prototype ensures confidentiality, integrity and availability during XQuery and XUpdate operations.

5.1 Future Work

Ontology will play a crucial role in enhancing updates of the current document. If the relationships between the nodes of the XML document can be properly determined, then valid updates can be performed. A cohesion tree graph can be drawn that reflects the coupling between two nodes. It may show whether a node is valid without the presence of another node. If one node is deleted, then the other node can be deleted as well, as it holds no meaning without the first. This graph can be declared once during the design phase of the XML document and can be used to validate updates on the document. In the implementation of the XML database, the update operations discussed in our modified Mandatory Access Control Model can be included in eXist.


Bibliography

[1] E. Bertino, M. Braun, S. Castano, E. Ferrari, and M. Mesiti. Author-X: A Java-based system for XML data protection. In Proc. of 14th IFIP WG11.3 Working Conference on Database Security, The Netherlands, August 2000.

[2] E. Bertino, S. Castano, and E. Ferrari. Securing XML documents with Author-X. IEEE Internet Computing, 3, May/June 2001.

[3] E. Bertino, S. Castano, E. Ferrari, and M. Mesiti. Controlled access and dissemination of XML documents. In Proc. of 2nd ACM Workshop on Web Information and Data Management, pages 22–27, Kansas City, 1999.

[4] E. Bertino, S. Castano, E. Ferrari, and M. Mesiti. Specifying and enforcing access control policies for XML document sources. In World Wide Web Journal, volume 3. Kluwer Academic Publishers, 2000.

[5] Kirk D. Borne. XML group resources. Retrieved on April 25, 2004 from http://xml.gsfc.nasa.gov, April 2004.

[6] World Wide Web Consortium. XML Path Language (XPath) Version 1.0. W3C Recommendation, retrieved on November 16, 1999 from http://www.w3.org/XML/Schema, Nov 1999.

[7] World Wide Web Consortium. Extensible Markup Language 1.0 specification. W3C Recommendation, retrieved on October 6, 2000 from http://www.w3.org/TR/2000/REC-xml-20001006, October 2000.


[8] World Wide Web Consortium. XML Schema. W3C Recommendation, retrieved on May 2, 2001 from http://www.w3.org/XML/Schema, May 2001.

[9] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, and P. Samarati. Design and implementation of an access control processor for XML documents. In Proc. of 9th International World Wide Web Conference, The Netherlands, 2000.

[10] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, and P. Samarati. Regulating access to semistructured information on the web. In Proc. of 16th IFIP TC11 Annual Working Conference on Information Security: Information Security for Global Information Infrastructures, Beijing, China, August 2000.

[11] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, and P. Samarati. XML access control systems: A component-based approach. In Proc. of 14th IFIP WG11.3 Working Conference on Database Security, The Netherlands, August 2000.

[12] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, and P. Samarati. Securing XML documents. In Proc. of Conference on Extending Database Technology, Prague, March 2002.

[13] Ernesto Damiani, Sabrina De Capitani di Vimercati, Stefano Paraboschi, and Pierangela Samarati. A fine-grained access control system for XML documents. ACM Trans. Inf. Syst. Secur., 5(2):169–202, 2002.

[14] XML:DB Working draft. XML updates (XUpdate). Retrieved September 2004 from http://xmldb-org.sourceforge.net/xupdate/xupdate-wd.html, September 2004.

[15] Vaibhav Gowadia and Csilla Farkas. RDF metadata for XML access control. In Proceedings of the 2003 ACM workshop on XML security, pages 39–48. ACM Press, 2003.


[16] Andy Hadley and Cheryl Hutchings. 1.25 million electronic patient records in XML at Poole. In Proc. of XML Europe, Berlin, May 2001.

[17] M. Kudo and S. Hada. XML document security based on provisional authorizations. In Proc. of the 7th ACM conference on Computer and Communications Security, Athens, Greece, November 2000.

[18] M. Kudo and S. Hada. Access control model with provisional actions. IEICE Trans. Fundamentals, E84-A(1), 2001.

[19] Wolfgang Meier. eXist: An open source native XML database. Web, Web-Services and Database systems, March 2003.

[20] Daniela Florescu Michael Rys, Donald D. Chamberlin. XML and relational database management systems. In SIGMOD Conference 2005, 2005.

[21] OASIS. Cover pages. Hosted by OASIS, retrieved on April 25, 2004 from http://xml.coverpages.org/, April 2004.

[22] A. Stoica and C. Farkas. Secure XML views. In Proc. of IFIP WG11.3 Working Group Conference on Database and Application Security, 2002.

[23] Igor Tatarinov, Zachary G. Ives, Alon Y. Halevy, and Daniel S. Weld. Updating XML. In Proceedings of the 2001 ACM SIGMOD international conference on Management of data, pages 413–424. ACM Press, 2001.


Appendix A
