SECURE CLOUD STORAGE MODEL TO PRESERVE CONFIDENTIALITY
AND INTEGRITY
SARFRAZ NAWAZ BROHI
A thesis submitted in fulfilment of the
requirements for the award of the degree of
Doctor of Software Engineering
Advanced Informatics School
Universiti Teknologi Malaysia
JANUARY 2015
To
my supportive parents,
and
beloved siblings
ACKNOWLEDGEMENT
First of all, I thank ALLAH (SWT), the God Almighty, for granting me the
health, knowledge, strength, ability, and patience to accomplish this research, and for
blessing me with sympathetic and supportive supervisors as well as family members.
I am glad to express tremendous gratitude to my supervisor Dr Suriayati
Chuprat for her compassionate character, knowledge sharing, ideas and continuous
support from the first until the last day of this study. Her sincere behaviour and
constructive feedback enabled me to achieve significant research milestones within
the required time-frame.
I would also like to thank my external supervisor Dr Jamalul-lail Ab Manan
for enriching me with innovative ideas and skills by sharing his expertise and
knowledge in the field of cloud computing security. Due to his unlimited support for
reviewing, improving and evaluating my research, I was able to publish several
high-quality research papers.
At various stages during this study, I faced several undesirable challenges
that burdened me with mental and physical stress. However, this never
stopped me from progressing, thanks to the encouragement and the moral as well as
financial support of my father, Dr Muhammad Nawaz Brohi. I am extremely thankful to
him for his understanding, kindness, belief, and trust in me.
I also wish to express my deepest appreciation to my mother for her prayers
for my success throughout this study. I will always remember my late
grandmother in my prayers. This research would never have been possible without her
wishes for my success.
ABSTRACT
Cloud Service Providers (CSPs) offer remotely located cloud storage services
to business organizations, with cost-effectiveness among their advantages. From an
industrial perspective, Amazon Simple Storage Service (S3) and Google Cloud
Storage (GCS) are the leading cloud storage services. These storages are secured
using the latest data security approaches such as cryptographic algorithms, data
auditing processes, and strict access control policies. However, organizations for
which confidentiality of information is critical are reluctant to adopt these
services due to emerging data confidentiality and integrity concerns. Malicious
attackers have violated cloud storages to steal, view, manipulate, and tamper with
clients' data. Researchers have attempted to overcome these shortcomings by
designing and developing various security models. These solutions have
limitations and require enhancement before they can be
widely accepted by CSPs to guarantee secure cloud storage services. To
solve the stated problem, this research developed an improved security solution,
namely the Secure Cloud Storage Model (SCSM), which consists of five components:
a multi-factor authentication and authorization process using Role-Based Access
Control (RBAC) with a Complex Random Security Code Generator (CRSCG); partial
homomorphic cryptography using the Rivest, Shamir and Adleman (RSA) algorithm;
Trusted Third Party (TTP) services, including a Key Management (KM) approach and a
data auditing process; implementation of 256-bit Secure Socket Layer (SSL); and a
Service Level Agreement (SLA). SCSM was implemented using Java Enterprise Edition
with the GlassFish server and deployed on a cloud computing infrastructure. The
model was evaluated using the extended Euclidean algorithm, system security
analysis, key management recommendations, a web-based testing tool, a security
scanner, and a survey. The survey results showed that 83.33% of the respondents
agreed that SCSM could be widely accepted by CSPs to offer secure cloud storage
services. The aggregate evaluation results showed that SCSM is successful in
preserving data confidentiality and integrity at remotely located cloud storages.
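The abstract names partial homomorphic cryptography based on RSA and an evaluation that uses the extended Euclidean algorithm. The sketch below is only an illustration of those two ideas in general, assuming textbook (unpadded) RSA with tiny primes. It is not the SCSM implementation, which the thesis states was built with Java Enterprise Edition, and the primes, exponent, and function names are assumptions chosen for readability. Unpadded RSA with small keys is not secure and is used here purely to make the arithmetic visible.

```python
# Toy illustration only: textbook RSA with tiny primes and no padding.
# All names and numbers are assumptions, not values taken from the thesis.

def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


def modular_inverse(e, phi):
    """Return d with (e * d) % phi == 1, using the extended Euclidean algorithm."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e and phi(n) must be coprime")
    return x % phi


p, q = 61, 53                  # small primes, for demonstration only
n = p * q                      # modulus shared by the public and private key
phi = (p - 1) * (q - 1)        # Euler's phi function of n
e = 17                         # public key exponent
d = modular_inverse(e, phi)    # private key exponent


def encrypt(m):
    return pow(m, e, n)


def decrypt(c):
    return pow(c, d, n)


# Multiplicative homomorphism: multiplying two ciphertexts modulo n yields a
# ciphertext of the product of the two plaintexts (modulo n).
m1, m2 = 7, 12
c_product = (encrypt(m1) * encrypt(m2)) % n
print(d)                       # 2753
print(decrypt(c_product))      # 84, which equals m1 * m2
```

The property holds because (m1^e)(m2^e) = (m1 m2)^e mod n, which is why RSA is described as partially homomorphic: it supports multiplication on encrypted values but not addition.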
ABSTRAK
Cloud Service Providers (CSPs) offer remotely located cloud storage services
that provide cost-effective advantages. From an industrial perspective,
Amazon Simple Storage Service (S3) and Google Cloud Storage (GCS) are the
leading cloud storage services. These storages are secure because they
use the latest data security approaches such as cryptographic algorithms,
data auditing processes, and strict access control policies. However,
organizations that prioritize the confidentiality of information are
not inclined to use these services because of concerns about data
confidentiality and integrity. Malicious attackers have violated cloud storages by
stealing, viewing, manipulating, and tampering with clients' data. Researchers
have attempted to address these problems by designing and
developing various security models. The solutions developed so far
still have certain limitations and require improvement before they can be
widely accepted by CSPs to guarantee the security of these services. To
solve the stated problem, this research developed an improved security
solution named the Secure Cloud Storage Model (SCSM). The model consists of
multi-factor authentication and an authorization process using Role-Based Access
Control (RBAC) with a Complex Random Security Code Generator (CRSCG), partial
homomorphic cryptography using the Rivest, Shamir and Adleman (RSA) algorithm,
Trusted Third Party (TTP) services, namely a Key Management (KM) approach and a
data auditing process, implementation of 256-bit Secure Socket Layer (SSL), and a
Service Level Agreement (SLA). SCSM was developed using Java Enterprise Edition
with the GlassFish server and deployed on a cloud computing
infrastructure. The model was then evaluated using the extended Euclidean algorithm,
system security analysis, key management recommendations, a web-based
testing tool, a security scanner, and a survey. The survey results
show that 83.33% of respondents agreed that SCSM can be widely accepted by
CSPs offering secure cloud storage services. The evaluation results
show that SCSM is successful in preserving data confidentiality and integrity in
remotely located cloud storage.
TABLE OF CONTENTS
CHAPTER TITLE
DECLARATION
ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
LIST OF SYMBOLS
LIST OF APPENDICES
1 INTRODUCTION
1.1 Overview
1.2 Problem Background
1.3 Problem Statement
1.4 Research Objectives
1.5 Scope of Research
1.6 Significance of Research
1.7 Contribution of Research
1.8 Thesis Organization
1.9 Summary
2 LITERATURE REVIEW
2.1 Introduction
2.2 Cloud Deployment Models
2.2.1 Public Cloud
2.2.2 Private Cloud
2.2.3 Hybrid Cloud
2.2.4 Community Cloud
2.3 Cloud Service Delivery Models
2.3.1 Software as a Service
2.3.2 Platform as a Service
2.3.3 Infrastructure as a Service
2.4 Cloud Storage Services
2.5 Cloud Storage Data Security Concerns
2.5.1 Data Confidentiality
2.5.2 Data Integrity
2.6 Data Protection Mechanisms for Cloud Storages
2.6.1 Cryptography and Key Management
2.6.2 Trusted Computing
2.6.3 Access Control Mechanisms
2.6.4 Service Level Agreement
2.6.5 Data Auditing Services
2.7 Industry Based Implementations of Cloud Storage Services
2.7.1 Amazon Simple Storage Service
2.7.2 Google Cloud Storage
2.8 Limitations of Industry Implemented Cloud Storage Services
2.8.1 Vulnerable Key Management Approach
2.8.2 Inadequate Cryptographic Support
2.8.3 Exclusion of Security Assurance in Service Level Agreements
2.8.4 Untrustworthy Data Integrity Verification Services
2.9 Confidentiality and Integrity Preserving Cloud Storage Models
2.9.1 Secure Cloud Storage Integrator for Enterprises
2.9.2 Data Confidentiality and Integrity Verification Using User Authenticator Scheme in Cloud
2.9.3 Secure Storage Services in Cloud
2.9.4 Data Confidentiality in Storage-Intensive Cloud Applications
2.9.5 Cloud Storage Integrity Checking Using Encryption Algorithm
2.10 Critical Analysis on Related Work Solutions
2.11 Contribution and Road Map of Research
2.12 Summary
3 RESEARCH METHODOLOGY
3.1 Introduction
3.2 Research Methodology
3.2.1 Literature Review
3.2.2 Analysis
3.2.3 Design
3.2.4 Implementation
3.2.5 Evaluation
3.3 Research Activities and Outcomes
3.4 Summary
4 SECURE CLOUD STORAGE MODEL
4.1 Introduction
4.2 Building Blocks of SCSM
4.3 Description and Architecture of SCSM
4.3.1 Roles and Responsibilities
4.4 Components of SCSM
4.4.1 Multi-factor Authentication and Authorization Process
4.4.1.1 Role Based Access Control
4.4.1.2 Complex Random Security Code Generator
4.4.2 Partial Homomorphic Cryptography
4.4.3 256-bit Secure Socket Layer
4.4.4 Service Level Agreement
4.4.5 Trusted Third Party Services
4.4.5.1 Key Management Approach
4.4.5.2 Data Auditing Process
4.5 Process of SCSM
4.6 Summary
5 IMPLEMENTATION OF THE SECURE CLOUD STORAGE MODEL
5.1 Introduction
5.2 Software Development Process of SCSM
5.3 Systematic Workflow of SCSM
5.3.1 Data Transfer and Retrieval
5.3.2 Encrypted Data Processing
5.3.3 Verification Metadata Generation and Secure Transfer of Parameters
5.3.4 Data Integrity Verification
5.3.5 Data Recovery
5.3.6 Private Key Retrieval and Data Downloading
5.4 Deployment of SCSM
5.5 Summary
6 EVALUATION AND RESULTS
6.1 Introduction
6.2 Evaluation Strategy of Research
6.3 Evaluation and Results of SCSM Components
6.3.1 Qualys Web-based Evaluation Methodology
6.3.1.1 SSL Certificate Inspection
6.3.1.2 Server Configuration Inspection
6.3.1.3 Final Score and Grade Assignment
6.3.2 Mathematical Evaluation
6.3.3 Compliance Evaluation
6.3.4 Security Analysis
6.3.5 Survey Based Evaluation
6.3.5.1 Structure of Survey
6.3.5.2 Survey Analysis for Multi-factor Authentication and Authorization Process
6.3.5.3 Survey Analysis for Service Level Agreement
6.4 Evaluation of SCSM using Survey and Skipfish
6.5 Benchmarking of SCSM with Industry and Academia Best Practices
6.5.1 Secure and Flexible Partial Homomorphic Cryptography
6.5.2 Security and Privacy Guaranteeing Service Level Agreement
6.5.3 Trusted, Secure and Efficient Data Auditing Service
6.5.4 Trusted and Secure Key Management Approach
6.5.5 Extremely Secure Multi-factor Authentication and Authorization Process
6.6 Summary
7 CONCLUSION AND FUTURE WORK
7.1 Introduction
7.2 Contributions and Significance
7.3 Potential Applications of SCSM
7.4 Limitations and Future Directions of Research
7.4.1 Fully Homomorphic Encryption
7.4.2 Heterogeneous Data
7.4.3 Performance
7.4.4 Multi-user Computing Environment
7.5 Summary
REFERENCES
Appendices A - C
LIST OF TABLES
TABLE NO. TITLE
1.1 Analysis of Research Problem Area
3.1 Research Activities and Outcomes
4.1 Service Level Agreement
6.1 Protocol Support Rating Guide
6.2 Key Exchange Rating Guide
6.3 Cipher Strength Rating Guide
6.4 Evaluation Criteria
6.5 Letter Grading Translation
6.6 Keys of Alice and Bob
6.7 Key Management Compliance and Auditing
6.8 Participation of the Industry Experts in Survey
6.9 Analysis of Multi-factor Authentication and Authorization Process
6.10 Analysis of Service Level Agreement
6.11 Analysis of SCSM
6.12 SCSM Benchmarking with Industry and Academia Implemented Solutions
LIST OF FIGURES
FIGURE NO. TITLE
1.1 Survey for Research Problem Area
1.2 Thesis Organization
2.1 Server Side Encryption
2.2 Encryption with Client's Key
2.3 Client Side Encryption
2.4 Data Migration Process
2.5 Authentication Process
2.6 Limitations of Amazon S3 and GCS
2.7 Cloud Storage Integrator
2.8 Preserving Data Confidentiality
2.9 Data Integrity Verification
2.10 Data Updating
2.11 TrustStore Hybrid Cloud Service
2.12 Key Management and Data Confidentiality
2.13 Cloud Storage Security using Broker
2.14 Academia Implemented Cloud Storage Models
2.15 Research Road Map
3.1 Research Methodology
4.1 Architecture of SCSM
4.2 Components of SCSM
4.3 RBAC Privileges
4.4 Access Logs Report
4.5 Process of SCSM
5.1 HTTP based Authentication
5.2 Role Mapping
5.3 Roles and Security Annotations
5.4 RSA Partial Homomorphic Cryptography
5.5 Metadata Generation
5.6 Metadata Verification
5.7 Sound Steganography
5.8 Operations of SCSM
5.9 Encryption Process
5.10 Decryption Process
5.11 Data Processing
5.12 VMD Generation and Transfer Process
5.13 VMD Decoding Process
5.14 Data Auditing Process
5.15 Auditing Report
5.16 Data Integrity Violation
5.17 Auditing Report After Violation
5.18 Data Recovery Process
5.19 Auditing Report after Data Recovery Process
5.20 Private Key Decoding Process
5.21 Data Retrieval Process
5.22 Module based Deployment Using Glassfish Server
6.1 Evaluation Strategy
6.2 Implemented SSL Certificate Details
6.3 SSL Certificate Inspection
6.4 Protocol Support
6.5 Key Exchange
6.6 Cipher Strength
6.7 SSL Evaluation Results
6.8 Results for Multi-factor Authentication and Authorization Process
6.9 Results for SLA
6.10 Results for SCSM
6.11 Skipfish Security Scanning Report
6.12 Skipfish Interactive Report
6.13 Performance Analysis of Encryption Process
6.14 Performance Analysis of Decryption Process
6.15 Performance Analysis of Data Integrity Verification Process
6.16 Security Experiment on CRSCG
7.1 Contributions, Publications and Certificates
LIST OF ABBREVIATIONS
ACL - Access Control List
ACM - Access Control Mechanism
ACP - Access Control Policy
AES - Advanced Encryption Standard
API - Application Programming Interface
AWS - Amazon Web Services
CA - Client's Admin
CAT - Computer Associates Technologies
CentOS - Community Enterprise Operating System
CRC - Cyclic Redundancy Check
CRSCG - Complex Random Security Code Generator
CSA - Cloud Security Alliance
CSP - Cloud Service Provider
CSPA - Cloud Service Provider's Admin
CSSP - Cloud Storage Service Provider
DAC - Discretionary Access Control
DBAN - Darik's Boot and Nuke
DSA - Digital Signature Algorithm
ECC - Elliptic Curve Cryptography
EJBs - Enterprise Java Beans
FHE - Fully Homomorphic Encryption
GCS - Google Cloud Storage
GFIS - German Federal Office of Information Security
HIPAA - Health Insurance Portability and Accountability Act
HMAC - Keyed-Hash Message Authentication Code
HTML - Hypertext Markup Language
HTTPS - Hypertext Transfer Protocol Secure
IaaS - Infrastructure as a Service
IM - Integrity Management
JSF - Java Server Faces
JSP - Java Server Pages
KM - Key Management
MAC - Mandatory Access Control
MITM - Man-in-the-Middle
NAS - Network Attached Storage
NIST - National Institute of Standards and Technology
NSA - National Security Agency
OS - Operating System
PaaS - Platform as a Service
PCI - Payment Card Industry
PCIDSS - Payment Card Industry Data Security Standard
RBAC - Role-based Access Control
RSA - Rivest, Shamir and Adleman
S3 - Simple Storage Service
SaaS - Software as a Service
SCSM - Secure Cloud Storage Model
SDK - Software Development Kit
SDLC - Software Development Life Cycle
SE - Software Engineering
SHA - Secure Hash Algorithm
SLA - Service Level Agreement
SMBs - Small and Medium Businesses
SMS - Short Message Service
SQL - Structured Query Language
SSE - Server Side Encryption
SSE-C - Server Side Encryption with Customer-Provided Key
SSL - Secure Socket Layer
SSO - Single Sign-On
TCG - Trusted Computing Group
TDEA - Triple Data Encryption Algorithm
TED - Trusted Extension Device
TLS - Transport Layer Security
TPM - Trusted Platform Module
TTP - Trusted Third Party
TTPA - Trusted Third Party's Admin
TVD - Trusted Virtual Domain
UML - Unified Modelling Language
VF - Virtual Firewall
VM - Virtual Machine
VMD - Verification Metadata
VPC - Virtual Private Cloud
VPS - Virtual Private Server
vTPM - Virtual Trusted Platform Module
XHTML - Extensible Hypertext Markup Language
XML - Extensible Markup Language
XSS - Cross-site Scripting
LIST OF SYMBOLS
| - Such That
d - Private Key Exponent
e - Public Key Exponent
n - Modulus for Private and Public Key
φ(n) - Euler's Phi Function
R - Random Factor
LIST OF APPENDICES
APPENDIX TITLE
A Papers published during the author's candidature
B Certificates obtained during the author's candidature
C Survey design and delivery
CHAPTER 1
INTRODUCTION
1.1 Overview
Cloud computing is an innovative method of delivering computing resources
(Tripathi and Mishra, 2011). It enables clients to run their enterprise
applications and store data on servers owned by third parties. The cloud offers
various service delivery models such as Software as a Service (SaaS), Platform as a
Service (PaaS) and Infrastructure as a Service (IaaS), which clients acquire
according to their requirements (Bouayad et al., 2012). IaaS is further divided into
three major facilities: compute, network, and storage.
This research mainly focuses on the storage sub-offering of IaaS, which is
provided to clients by well-known Cloud Service Providers (CSPs) such as Amazon
and Google (Ghosh and Ghosh, 2012). This service enables organizations to
obtain dynamic, redundant, scalable, and remotely located data storage that
can easily be scaled up or down to avoid the costly burden of under- or over-utilized
storage capacity (Jiang et al., 2013). Cloud storage services have been very useful
for Small and Medium Businesses (SMBs) that lack the capital budget to implement and
maintain their own storage infrastructure (Sun and Sha-sha, 2011; Deyan and
Hong, 2012).
However, cloud storage is now becoming a business interest for organizations of
all sizes that require resilient data availability, business continuity, and
disaster recovery solutions. For cloud storage clients, critical data are maintained
and backed up by the CSP at multiple geographically distributed locations (Zhang
and Zhang, 2011).
The remainder of this chapter is organized in eight sections. Section 1.2
describes the problem background. Section 1.3 presents the problem statement.
The objectives, scope, significance, and contribution of the research are described
in Sections 1.4, 1.5, 1.6 and 1.7, respectively. Section 1.8 illustrates and describes
the organization of the entire thesis. Section 1.9 summarizes this chapter.
1.2 Problem Background
Organizations that are required to follow well-defined data security
standards such as the Health Insurance Portability and Accountability Act (HIPAA)
and the Payment Card Industry Data Security Standard (PCIDSS) do not trust the
existing security techniques and policies offered by CSPs (Hofmann and
Woods, 2010; Bamiah et al., 2012; Shucheng et al., 2010). Because they lack control
over their confidential data while it is stored in cloud storage, clients are
concerned that malicious users might gain illegal access to their sensitive records
(Taeho et al., 2013).
This research focuses on two major issues that make organizations dealing with
confidential data reluctant to adopt cloud storage services: breaches of data
confidentiality and breaches of data integrity (Syam and
Subramanian, 2011; Gansen et al., 2010). The term data confidentiality refers to the
concept that only authorized parties or systems have the ability to access protected
information. The threat of data compromise increases in the cloud environment due
to the increased number of parties, devices and applications involved, which leads
to an increase in the number of access points.
Data integrity means that data can only be modified by authorized parties. The
concept refers to the protection of data from unauthorized deletion,
modification or fabrication (Zissis and Lekkas, 2012). To further analyze the
research problem, this research also conducted a survey of industry- and
academia-based information security analysts, data auditors, cloud computing
researchers, developers, architects and security specialists. The detailed structure
of the survey is described in Chapter 6. The following question was included in the
survey to determine the validity and impact of the problem background of this research.
Question: Organizations dealing with confidential data are reluctant to use
remotely located third party cloud storage services due to emerging data
confidentiality and integrity concerns.
The response scale offered three options: Agree, Neutral and
Disagree. The survey responses for the research problem area, shown in
Table 1.1 and Figure 1.1, justify the need to formulate a solution for the
research problem: 83.33% of respondents agreed that organizations are
reluctant to adopt cloud storage services due to emerging data confidentiality and
integrity concerns.
Table 1.1: Analysis of Research Problem Area
Answer Choices   Response Rate   Academia   Industry   Total
Agree            83.33%          14         11         25
Neutral          16.67%          2          3          5
Disagree         0%              0          0          0
Figure 1.1: Survey for Research Problem Area
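As a quick check of the figures in Table 1.1, the response rates follow directly from the totals, assuming that the 30 respondents (16 from academia and 14 from industry, summing the table columns) are the whole sample:

```latex
\[
\text{Agree} = \frac{25}{30} \times 100\% \approx 83.33\%, \qquad
\text{Neutral} = \frac{5}{30} \times 100\% \approx 16.67\%, \qquad
\text{Disagree} = \frac{0}{30} \times 100\% = 0\%
\]
```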
Past studies have shown that the confidentiality and integrity of data stored in
cloud storage can be breached by external or internal attacks (Ling et al., 2011).
External attacks are launched by outside hackers who steal clients' confidential
records. These attacks may be carried out by malicious IT personnel working for
competitors of the CSP or of the client. Their intention is to damage the brand
reputation of the CSP or to violate the clients' files. To defend against these
attacks, CSPs normally secure their physical and virtual infrastructure using various
tools and techniques that protect clients' data and systems. However, existing
solutions are not adequate to achieve this goal (Rocha and Correia, 2011). It has
also been identified that internal employees of a CSP may become malicious
(Catteddu and Hogben, 2009).
Internal attacks are carried out by malicious insiders such as disgruntled
employees of a CSP. They intentionally exceed their access privileges
to harm data confidentiality and integrity (Duncan et al., 2012).
In contrast to an external hacker, a malicious insider can attack the computing
infrastructure relatively easily and with less hacking knowledge, since insiders
have detailed knowledge of the underlying infrastructure. Without a
completely trustworthy solution for defending against insider attacks, malicious
insiders can easily obtain passwords, cryptographic keys and files, and gain access
to clients' records (Rocha et al., 2011). When clients' data confidentiality has been
breached, they may never learn of the unauthorized access, mostly due
to lack of control over their data and lack of transparency in the CSP's security
practices and policies.
The breach of data confidentiality and integrity creates a barrier of trust
between clients and CSPs. Clients need assurance that the CSP will always provide
the agreed level of service and security to protect their confidential data. Trust is
damaged when CSPs do not meet the negotiated agreements, for example by
implementing insufficient security techniques, storing data at locations that
are not permitted by law, or not complying with standards such as
HIPAA or PCIDSS (Khan and Malluhi, 2010). Trust issues are normally
mitigated by signing a legal Service Level Agreement (SLA) and granting clients
adequate control over their confidential data (Xiaoyong and Junping, 2013).
However, existing SLAs are non-negotiable and fixed by the CSP for every
client, whether an ordinary home user or a bank. These SLAs
cannot accommodate the specific requirements of organizations that are
seeking to leverage cloud storage services for storing confidential data (Asha, 2012).
1.3 Problem Statement
As discussed in the problem background, cloud storages are vulnerable to
external and internal attacks, which have undermined clients' trust in CSPs and
their willingness to move confidential data to third-party cloud storages. Existing
network security solutions are not able to overcome the threats that violate cloud
storage data confidentiality and integrity (Nirmala et al., 2013). Considering these
issues, the problem statement of this research is as follows:
How to develop a secure cloud storage model that preserves data
confidentiality and integrity as well as ensures the delivery of trusted services to the
clients by considering their data security policies?
Several research questions can be derived from the problem statement:
i. What are the existing security models that have been designed, developed
or proposed by the industry and academia researchers to overcome data
confidentiality and integrity concerns for using cloud storage services?
ii. What are the limitations of existing industry and academia implemented
cloud storage models that raise confidentiality and integrity issues which
prevent organizations dealing with sensitive data from adopting cloud
storage services?
iii. How to design a model that preserves data confidentiality and integrity at
cloud storages as well as ensures the delivery of trusted services to the
clients?
iv. How to develop a model that enables the clients to store and process their
data at cloud storages with consistent data integrity, confidentiality and
trust?
v. How to verify that the implemented cloud storage model is successful in
preserving the confidentiality and integrity of sensitive data, and ensuring
the delivery of trusted services to the clients?
1.4 Research Objectives
The aim of this research is to develop a security model that overcomes the
data confidentiality and integrity concerns for using cloud storage services as well as
for ensuring the delivery of trusted services to the clients by considering their data
security policies. The targeted aim will be achieved by completing the following
research objectives:
i. To investigate and obtain in-depth understanding of existing security
models that have been proposed by the industry and academia researchers
to overcome data confidentiality and integrity concerns for using cloud
storage services.
ii. To critically analyze as well as explain the limitations or gaps which have
been identified in the existing industry and academia implemented secure
cloud storage models.
iii. To design an improved and enhanced secure cloud storage model which
preserves data confidentiality and integrity, as well as ensures the delivery
of trusted services to the clients by considering their data security
policies.
iv. To implement and deploy a web-based prototype on a cloud computing
infrastructure which facilitates the clients to store and process their data at
cloud storages with consistent data confidentiality, integrity and trust
assurance.
v. To evaluate the developed cloud storage model in order to ensure that it
overcomes or mitigates the data confidentiality and integrity concerns,
and gains the trust of organizations dealing with sensitive data to adopt
cloud storage services.
1.5 Scope of Research
The cloud reference architecture consists of three service delivery models (SaaS,
PaaS, and IaaS) and four deployment models (Public, Private, Hybrid, and Community)
(Mell and Grance, 2011). Since cloud computing is a vast area of research, this
study focuses only on IaaS. IaaS providers offer compute, network and
storage services to clients. This research considers the security of cloud storage
that resides at the data center of a CSP. Security has several perspectives when it
comes to research and development. This research treats confidentiality and integrity
as the major security problems to be solved. It assumes
that a breach of data confidentiality and integrity will affect clients' trust in using
cloud storage services. To achieve clients' trust, data confidentiality and
integrity must be protected, and the CSP must always ensure the delivery of trusted
cloud storage services to the clients. Therefore, in this thesis, trust does not refer
to the concept of trusted computing.
However, this research assumes that users may be required to use trusted
platforms for using cloud storage services. For example, a Trusted Extension Device
(TED) and a Trusted Platform Module (TPM) can be used by clients to protect
their devices. In a cloud computing environment, system performance is also
a significant factor, but SCSM was designed and developed mainly by
considering the security requirements of organizations dealing with highly
confidential data. We believe that the identified research problem could not
be solved solely by providing encryption and data auditing approaches. Therefore,
our research scope focuses on providing a complete secure process comprising
a set of five components: a multi-factor authentication and
authorization process using Role-Based Access Control (RBAC) with a Complex
Random Security Code Generator (CRSCG); partial homomorphic cryptography;
Trusted Third Party (TTP) services, including a Key Management (KM) approach and a
data auditing process; implementation of 256-bit Secure Socket Layer (SSL); and an
SLA. This research also focuses on the deployment of the research contribution,
the Secure Cloud Storage Model (SCSM), on a cloud computing infrastructure in order
to obtain authentic evaluation results.
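To make the first component more concrete, the sketch below shows one generic way a role check can be combined with a randomly generated one-time security code. It is only an illustration of the general idea and does not reproduce the CRSCG or the RBAC privileges defined in the thesis: the role names, the permission map, the code length, and the function names are all hypothetical. It uses Python's standard `secrets` module as the source of randomness.

```python
import secrets
import string

# Hypothetical role-to-permission map; these roles and actions are
# illustrative assumptions, not the privileges defined in the thesis.
ROLE_PERMISSIONS = {
    "client_admin": {"upload", "download", "audit_report"},
    "client_user": {"upload", "download"},
    "auditor": {"audit_report"},
}

CODE_ALPHABET = string.ascii_letters + string.digits


def generate_security_code(length=8):
    """Return a random one-time code drawn from a cryptographically secure source."""
    return "".join(secrets.choice(CODE_ALPHABET) for _ in range(length))


def is_authorized(role, action, expected_code, supplied_code):
    """Allow the action only if the supplied code matches and the role grants it."""
    # compare_digest avoids leaking information through comparison timing.
    code_ok = secrets.compare_digest(expected_code, supplied_code)
    return code_ok and action in ROLE_PERMISSIONS.get(role, set())


code = generate_security_code()
print(is_authorized("client_user", "download", code, code))       # True
print(is_authorized("client_user", "audit_report", code, code))   # False
print(is_authorized("client_admin", "upload", code, "wrong"))     # False
```

In a deployed system the code would normally be delivered over a separate channel and expire after a single use; both aspects are left out here to keep the sketch short.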
1.6 Significance of Research
When the objectives of the research are successfully accomplished, the
development of SCSM can be considered a valuable contribution to the
field of cloud computing security, since it will overcome the existing data
confidentiality and integrity concerns by providing trusted and secure cloud storage
services to clients. The contribution of this research will benefit both client
organizations and CSPs. Clients will be able to adopt cost-effective storage
solutions to store their confidential data with high availability, accessibility,
secure backup and recovery. CSPs, in turn, can adopt this solution to overcome the
limitations of their existing cloud storage services and to gain clients' trust. This
research expects that adoption of cloud storage services will increase rapidly with
the successful implementation and deployment of SCSM at the industry level.
1.7 Contribution of Research
The advent of cloud computing brought enormous challenges for
software engineers in designing and developing secure cloud applications, platforms,
and infrastructures that deal with the storage of mission-critical data. In the domain
of Software Engineering (SE), information security engineers apply security
principles at each stage of the Software Development Life Cycle (SDLC), from
requirements analysis through the development and deployment phases. They are also
responsible for analyzing and testing the security of the cloud-based solutions they
develop (Zingham and Saqib, 2013). This research adopted an SE approach by analyzing
the requirements of, designing, developing and deploying secure cloud storages.
Therefore, this research contributes to the field of SE by completing those
requirements that fall under the responsibilities of information security
engineers for developing secure cloud storage services. The final contribution
produced by this research, delivered as software, introduces a novel SE approach to
developing complex cloud storage systems that preserve confidentiality and integrity.
1.8 Thesis Organization
This thesis explores an emerging area of cloud security research focusing on
data confidentiality and integrity concerns for using cloud storage services. The
complete research is organized in seven chapters. Figure 1.2 shows the flow of
thesis organization. Chapter 1 presents the significance of this research mainly by
clarifying the research problem area, scope, contributions and objectives. An in-depth
analysis of existing literature is provided in Chapter 2, which covers cloud
security techniques and models provided by various researchers to solve the existing
cloud storage security problems. Chapter 2 also covers the critical analysis on the
limitations and strengths of industry and academia implemented contributions.
Chapter 3 describes the entire research methodology used systematically for
accomplishing each research objective. Description and design of the SCSM are
provided in Chapter 4. Each component of SCSM is discussed with technical as well
as theoretical details. SCSM is designed using architecture, use-case and sequence
diagrams, in addition to the construction of an effective SLA. Chapter 5 describes the
development details of the SCSM implementation as a web-based prototype. The entire
system workflow is described using user interface snapshots. System deployment
details on a real cloud computing infrastructure are also described in Chapter 5.
The evaluation process and results for the entire process as well as each
component of SCSM are described in Chapter 6. The applications of SCSM, overall
research conclusion, limitations and future direction, are critically discussed and
justified in Chapter 7.
Figure 1.2: Thesis Organization
(Figure content: Chapter 1, Introduction; Chapter 2, Literature Review; Chapter 3, Research Methodology; Chapter 4, Secure Cloud Storage Model; Chapter 5, Implementation of the Secure Cloud Storage Model; Chapter 6, Evaluation and Results; Chapter 7, Conclusion and Future Work)
1.9 Summary
Cloud storage service is a sub-category of IaaS which is provided to
organizations for storing large amounts of data with unlimited capacity, broad
accessibility, resilient availability, disaster recovery, and cost-effectiveness features.
However, organizations dealing with confidential data are reluctant to adopt remotely
located cloud storage services due to emerging data confidentiality and integrity
concerns, which have created a barrier of trust between CSPs and clients. In order
to overcome this problem, this research aims to provide an improved
solution by designing and developing a secure model that preserves confidentiality and
integrity for cloud storage services, in line with the
research objectives. The successful implementation as well as deployment of SCSM
at the industry level will assist CSPs to adopt this solution for offering secure and
trusted cloud storage services to the business organizations.
CHAPTER 2
LITERATURE REVIEW
2.1 Introduction
The National Institute of Standards and Technology (NIST) defines cloud
computing as "a model for enabling ubiquitous, convenient, on-demand network
access to a shared pool of configurable computing resources (e.g., networks, servers,
storage, applications, and services) that can be rapidly provisioned and released
with minimal management effort or service provider interaction" (Mell and Grance,
2011). A cloud model is composed of five essential characteristics defined as
follows:
i. On-demand Self-service: Consumers can automatically provision the
computing capabilities such as server time and network storage
without human interaction.
ii. Broad Network Access: Services are provided to a large community
of users over the internet for ubiquitous and pervasive access via web-
browser.
iii. Resource Pooling: CSPs pool their resources to serve multiple
clients by developing a multi-tenant architecture using virtualization
tools and technologies.
iv. Rapid Elasticity: Clients can easily scale up and down their service
capabilities by requesting the CSP without being engaged in physical
efforts.
v. Measured Service: Usage of resources such as storage, bandwidth,
and processing, are automatically monitored, controlled, reported and
optimized to provide efficient services. (Mell and Grance, 2011)
The remainder of this chapter is organized in eleven sections. Section 2.2
describes the cloud deployment models. Section 2.3 describes the cloud service
delivery models. Section 2.4 describes cloud storage services, their concepts,
advantages and the adopting organizations. Data security concerns for using cloud
storage services, and the protection mechanisms used to overcome those concerns, are
described in Sections 2.5 and 2.6, respectively. Section 2.7 describes the leading
industry implemented cloud storage services. Section 2.8 identifies and analyses the
limitations or vulnerabilities of industry implemented cloud storage services. Section
2.9 presents the related work of various researchers who designed or developed
security models to overcome confidentiality and integrity concerns for using cloud
storage services. Section 2.10 critically analyses the reviewed research
contributions and determines their limitations. Section 2.11 describes the strengths
of related work and the complete roadmap of this research. Section 2.12 presents the
summary of this chapter.
2.2 Cloud Deployment Models
Cloud services are mainly provided via Private, Public, Hybrid, and
Community cloud deployment models. Adoption of these models depends on the
security, privacy, performance, flexibility and scalability requirements of an
organization (Keung and Kwok, 2012; Soni et al., 2013). However, if one model
does not suit the demands of an organization, it can also combine
multiple models, which is considered a hybrid cloud deployment model (Savu,
2011). Cloud deployment models are described in the following sub-sections.
2.2.1 Public Cloud
Public cloud infrastructure is managed and operated by the CSP. It is offered
to a wide range of registered users, and it is configured and hosted on the
premises of the CSP. Resources such as storage, applications or servers are provided
to clients without requiring them to build, maintain or monitor a personalized IT
infrastructure (Wang et al., 2013). They are also not required to pay for the software
licensing or the hardware purchasing cost. Using this deployment model, everything
(SaaS, PaaS and IaaS) is provided from the CSPs such as Amazon, Google and
Microsoft (Ang et al., 2011). Clients can easily access the public cloud services over
the internet using a web-browser. They are only required to pay according to the
usage of their services. Resources are highly flexible and can be easily scaled
up and down at any time with minimal interaction, which saves organizations from
the cost and complexity of managing an over- or under-utilized computing
infrastructure. Users of public cloud also enjoy the benefits of resilient availability.
For instance, Amazon guarantees 99.99% service uptime (Amazon, 2013). However,
using public cloud, clients do not gain direct control over their data or services and
there is no transparency of cloud security standards, techniques, policies or
procedures for the client to trust the public cloud services (Huaqun, 2013; Astrova
et al., 2012).
2.2.2 Private Cloud
A private cloud infrastructure is managed and operated by the client
organization, CSP or in certain cases both. Unlike a public cloud, a private cloud belongs
exclusively to a single client (Stipic and Bronzin, 2012). A business organization
in a sector such as education, banking or healthcare can use its existing in-house
infrastructure to build a cost-effective private cloud using a supporting software
stack such as Eucalyptus or OpenStack (Loewen et al., 2013; Baun and
Kunze, 2009). Alternatively, if an organization does not own an on-premises
infrastructure or they lack supporting IT staff to build a private cloud, they can also
acquire it from the CSP. Normally the private cloud offered by CSPs is the Virtual
Private Cloud (VPC).
A VPC is an on-demand configurable pool of shared computing resources
allocated within a public cloud environment. However, it is isolated from other parts
of a public cloud infrastructure using encrypted communication channels, Access
Control Mechanisms (ACMs), Trusted Virtual Domain (TVD), and Virtual Firewall
(VF) (Mishra et al., 2013). Using a private cloud, clients have a greater degree of
control over their data and services, since data reside on the premises of the client
organization and resources are controlled as well as operated by internal IT staff, but
a private cloud is less scalable compared to a public cloud or VPC. Alternatively, a VPC is
more secure than a public cloud and the client has minimal control over the acquired
space, but since it is an off-premises cloud service for the client organizations, they still
feel a lack of direct control over their personal data and computing resources (Dillon et
al., 2010).
2.2.3 Hybrid Cloud
Hybrid cloud is the combination of multiple distinct cloud models (Private,
Public or Community) to create a customized solution based on the requirements of
an organization (Mell and Grance, 2011). This provides an opportunity for the
organizations to store sensitive information and mission critical processes in a private
cloud, and non-critical information as well as processes in a public cloud or to use
different cloud models for backup and disaster recovery (Yen-Hung et al., 2013). A
hybrid cloud solution is also useful, when an organization has started a new business
and they require certain systems to serve their customers, but they do not possess an
existing IT infrastructure as well as technical IT staff to deploy, monitor and manage
their systems.
They might decide to hand over these tasks to a third party CSP by leveraging
public cloud services without investing in building an in-house infrastructure. As a
result, they will gain a cost-effective IT solution and the opportunity to focus on their
core business. However, with the passage of time, as their customer growth rate
increases, they may decide to initiate their own IT department and build an on-premises
private cloud as a long term business strategy. The organization will host its
internal systems on the on-premises infrastructure but still continue to use SaaS and
storage from a CSP; in this case its computing infrastructure will be considered a
hybrid cloud (Marston et al., 2011). The security of a hybrid cloud depends on the
types of cloud models that are combined and the approach of adoption, such as
on- or off-premises.
2.2.4 Community Cloud
A community cloud infrastructure is provisioned for exclusive use by a
specific group of organizations. Members of community cloud may belong to a
combination of hospitals, banks or universities that have a common mission, security
requirements, policy, and compliance considerations (CSA, 2011). It may be
managed and operated by one or more entities in the community, a third party CSP
or a combination of them, and it may exist on or off-premises (Jadeja and Modi,
2012). In a community cloud, organizations have ubiquitous access to shared
information. For example, using a healthcare community cloud, one hospital can
access the shared records of patients from another hospital.
A community cloud is smaller than a public cloud but larger than a private cloud,
since it is constructed by several organizations collaborating together to achieve
economies of scale (Gall et al., 2013; Sattiraju et al., 2013). Using this model,
organizations have complete control over their management responsibilities as they
can control which other organizations are permitted to join the community
infrastructure. Organizations also have enough control over their resources and they
trust the security standards used, especially when it is located at an on-premises
datacentre. Organizations are also able to manage their data sharing policies.
However, allocation of cost, responsibilities, governance, security and controlling
multiple user access points are the main challenges to be faced while acquiring a
community cloud (Sathiyapriya et al., 2013).
2.3 Cloud Service Delivery Models
CSPs offer a variety of services to clients, such as remotely located business processing
applications, integration and development tools, and flexible and scalable storage.
These services are obtained by the clients using SaaS, PaaS, and IaaS, which
are delivered via private, public, hybrid, and community deployment models with
minimal interaction as well as management responsibilities. Cloud service delivery
models are described in the following sub-sections.
2.3.1 Software as a Service
SaaS refers to web-based applications such as word-processing, human
resource and customer relationship management systems, running on the CSP's
infrastructure. These applications are accessed using a web-browser over the internet
(Junjie et al., 2009). Users of SaaS are charged according to a pay-as-you-go billing
model and they do not manage or control the underlying cloud infrastructure, apart from
having limited permissions for application configuration settings. SaaS applications
are gaining rapid popularity and they are provided by well-known companies
such as Microsoft and Salesforce. SaaS offers advantages for both clients and providers.
The key benefits for providers include rapid deployment, better user adoption, and
reduced support needs. For customers, key benefits of SaaS include lower IT cost
and faster access to new technology, functionality and upgrades (Pang Xiong and Li,
2013). However, due to clients' lack of control over their enterprise applications,
security and privacy are the top concerns preventing firms from adopting SaaS
(Yu-Hui, 2011).
2.3.2 Platform as a Service
PaaS enables clients to rapidly develop, deploy and test applications on
a cloud infrastructure using Application Programming Interfaces (APIs), libraries,
and tools, provided by the CSPs without buying or maintaining the underlying
infrastructure (Zeng and Xu, 2010). PaaS clients have full control over their
deployed applications and adequate control over the application hosting platforms.
PaaS tools can play an important role in software development stages. Nowadays,
there is a variety of PaaS tools offered by the CSPs. These include Cloud Foundry,
Azure, OpenShift, Google App Engine, AppFog, Cloudify and Heroku. PaaS
offerings provide plenty of features that will appeal to any developer working in a
cloud environment. However, there are some challenges in the adoption of PaaS,
mainly due to the required learning curve, since developers are not ready to use these
new tools alongside the traditional ones they are already familiar with (Cohen, 2013).
2.3.3 Infrastructure as a Service
IaaS is a method of delivering resources such as servers, storage, network and
operating systems (OSs) as on-demand services (Xing et al., 2012). IaaS clients are
not required to build private datacentres or maintain servers, storage or networks. They
do not manage or control the underlying cloud infrastructure, but they have
control over their OSs, storage, deployed applications and possibly limited control of
selected networking components, e.g., host firewalls (Mell and Grance, 2011; CSA,
2011). IaaS is offered by companies such as Amazon, Rackspace, HP and IBM.
Business organizations can achieve high availability, reliability and disaster recovery
solutions by leveraging IaaS. However, security policies, governance and lack of
awareness about the physical location of data remain the main barriers in adopting
IaaS (Gibson et al., 2012).
2.4 Cloud Storage Services
Cloud storage services are offered via public cloud deployment models and
they do not require setup, configuration or installation of a personalized IT
infrastructure at the client organizations. These services are remotely acquired from
a third party CSP such as Amazon or Google at any time with on-demand
unlimited capacity. For example, Amazon Simple Storage Service (S3) offers data
storage that is easily scalable without physical interaction, can shrink and grow
as per clients' requirements, and does not require capital investment. Users are
normally charged according to a pay-per-use billing model (Mazhelis et al., 2012).
Besides cost-effectiveness, ease-of-use and accessibility, cloud storage services
also provide resilient availability and disaster recovery solutions for organizations
(Ullrich et al., 2012). Typically, CSPs utilize cost-effective redundant storage
hardware which overcomes the issues of interrupted service during a planned or
accidental outage, e.g., scheduled maintenance or upgrades. Amazon's claim of
providing 99.99% uptime illustrates the resilient availability of cloud storage
services (Amazon, 2013).
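To put the 99.99% uptime figure in perspective, the short sketch below converts an availability percentage into the maximum downtime it allows over a period. The function name and the 365-day year are assumptions made for illustration; they are not taken from any Amazon SLA.

```python
def max_downtime_minutes(availability_percent: float, days: int = 365) -> float:
    """Downtime (in minutes) permitted by an availability target over `days` days."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * days * 24 * 60


if __name__ == "__main__":
    # 99.99% is the uptime figure cited in the text (Amazon, 2013).
    print(round(max_downtime_minutes(99.99), 2))
```

Under these assumptions, 99.99% availability corresponds to roughly 52.6 minutes of permitted downtime per year.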
Organizations that are not dealing with sensitive data are rapidly adopting
cloud storage services for disaster recovery solutions. Under unwanted
circumstances such as natural disasters, they can easily recover their data from the
backup storage without loss or damage to the information. Normally, CSPs create
redundant backups of the clients' data at various backup zones by considering the
local data protection law. Due to these backup zones, clients' data will always be
protected and returned intact. In case of a disaster, IT staff from a CSP can restore
the data back to the cloud from local storage, which may reside at geographically distributed
backup zones (Javaraiah, 2011). However, compared to traditional storage methods,
cloud storages pose new challenges in data security, reliability, and management.
CSPs use a number of heterogeneous storage devices which work together to provide
data storage and business functions. While using cloud storage services, clients are
not aware of the details of security controls used to protect their data (Zhang and
Zhang, 2011).
Cloud storages are not yet being adopted by organizations dealing with
confidential data due to emerging data confidentiality and integrity problems (Chirag
et al., 2012). In order to enhance adoption of cloud storage services by all types of
business organizations, there is tremendous opportunity for this research to formulate
a valuable contribution to overcome data confidentiality and integrity concerns.
2.5 Cloud Storage Data Security Concerns
Data stored in third party cloud storages might not be secure due to the
absence of services that ensure confidentiality and integrity. Although CSPs
offer cost-effective cloud storage services, the storage is a remotely located
facility, so clients cannot trust the CSP or feel satisfied that their data are always secure
in the cloud. Hence, organizations dealing with confidential data are not willing to
adopt cloud storage services until certain guarantees are achieved (Karumanchi,
2010). Information confidentiality and integrity concerns for using remotely located
public cloud storage services are described in the following sub-sections.
2.5.1 Data Confidentiality
In a cloud computing environment, confidentiality is a key concern when data
are of a sensitive nature, including information related to health, political opinions,
religious and personal beliefs. Organizations in sectors such as banking, healthcare and the
Payment Card Industry (PCI) have strict data confidentiality requirements, so they
have numerous concerns about storing their sensitive records at remotely located
third party owned cloud storages (Hofmann and Woods, 2010).
Since clients lose direct control over their data by acquiring and using cloud storage
services, new opportunities open for unauthorized parties to access their
personal records. Confidentiality of data can be breached in-transit or while it is
stored at the cloud storage. When clients send data to the cloud, it may be intercepted
through a Man-in-the-Middle (MITM) attack, or personal records can be viewed by the CSP, which
will lead to a breach of confidentiality. Encryption is recommended as the
fundamental approach to preserve data confidentiality while using cloud storages
(Karumanchi, 2010). However, implementation of improper security procedures
such as ineffective KM approaches may result in a vulnerable and insecure cipher.
Data of a cloud user must be accessed only by the authorized parties as specified in
the SLA. Likewise, the CSP and other involved organizations such as the TTP must not
exceed their privileges to view or distribute confidential data of clients to
unauthorized parties.
2.5.2 Data Integrity
Although data confidentiality at cloud storage can be preserved by
implementing encryption with effective KM approaches, these techniques
cannot guarantee that the integrity of data will always remain intact. Data integrity
can be breached when data are modified by an external hacker or a malicious insider
even when they are in cipher format. In order to defend against data integrity violations, security
approaches such as ACMs and data auditing services should be implemented. Data
owners must hire a TTP to monitor their data on cloud and to ensure that their
sensitive records are protected from integrity violations (Stamou et al., 2012). The TTP
can verify the integrity of clients' data using applications which are based on
implementations of the Digital Signature Algorithm (DSA). CSPs can also play an
integral role in sustaining a safe computing environment for storing clients'
confidential data by enforcing ACMs which are specified by the data owners
considering their data security policies (Cong et al., 2013). Clients' trust in the cloud is
breached when the SLA, terms or conditions are not followed or accomplished
(Zissis and Lekkas, 2012).
2.6 Data Protection Mechanisms for Cloud Storages
In order to overcome the issues of breaching data confidentiality and integrity
in a cloud computing environment, researchers have designed and developed security
models using various data protection mechanisms which include cryptography and
KM, trusted computing, ACMs, SLA, and data auditing services. The significance
of these data protection mechanisms to secure cloud storage services is described in
the following sub-sections.
2.6.1 Cryptography and Key Management
The protection of data against the loss and theft is a shared responsibility of
cloud customer and CSP. Nowadays, encryption is one of the strongly recommended
techniques specified in cloud SLAs (Jansen and Grance, 2011). However,
encryption alone is not enough to secure the data. There must be proper KM practices to
ensure safe and legal access to encryption keys. Keys must be protected with the
same significance as the data itself and they should be accessed only by a limited number of
authorized personnel. Proper procedures must be followed if encryption keys are
lost or stolen (CSA, 2011).
Triple Data Encryption Algorithm (TDEA) and Advanced Encryption
Standard (AES) are widely implemented symmetric algorithms used for protecting
data at cloud storages. These algorithms use a single secret key for performing
encryption and decryption processes. In asymmetric cryptography, Rivest,
Shamir and Adleman (RSA) cryptography and Elliptic Curve Cryptography (ECC)
are the most commonly used encryption techniques to protect data at cloud. Unlike symmetric
algorithms, these methods use two different keys, a public key for encryption and a private key
for decryption (Jing-Jang et al., 2011). If encryption practices are followed
accurately, data will be protected from illegal access or theft by malicious employees
of a CSP and external adversaries.
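The two-key idea behind asymmetric cryptography can be illustrated with a deliberately tiny, textbook RSA sketch. The primes, exponent and function names below are assumptions chosen for readability only; the numbers are far too small to be secure and no padding is applied, so this is not how RSA should be used in a real cloud storage system, where a vetted cryptographic library would be required.

```python
# Toy textbook RSA: illustrative only, NOT secure (tiny primes, no padding).
p, q = 61, 53                 # assumed small primes, for illustration
n = p * q                     # public modulus
phi = (p - 1) * (q - 1)       # Euler's totient of n
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent (modular inverse, Python 3.8+)


def encrypt(message: int) -> int:
    """Encrypt an integer smaller than n with the public key (e, n)."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """Decrypt with the private key (d, n)."""
    return pow(ciphertext, d, n)


if __name__ == "__main__":
    secret = 65
    assert decrypt(encrypt(secret)) == secret
```

Anyone holding the public pair (e, n) can encrypt, but only the holder of d can decrypt, which is the property the paragraph above describes.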
2.6.2 Trusted Computing
Trusted computing is a term that refers to technology and proposals for
resolving computer security problems through hardware augmentations and related
software amendments. Several well-known hardware manufacturers and software
companies, jointly known as the Trusted Computing Group (TCG), are working for
trusted computing and they have come up with significant developments to enhance
computing security. TCG developed a set of hardware and software technologies to
enable the construction of trusted platforms (Karumanchi, 2010). Trusted computing
is a promising technology to mitigate the novel security challenges existing in cloud
computing infrastructures using TCG products. The use of approaches such as TPM,
and Virtual Trusted Platform Module (vTPM) can significantly improve the security
of cloud services. For example, TPM can be used by the CSPs to secure cloud
storage servers. In order to protect client Virtual Machines (VMs) that are residing at
a remote cloud platform, each VM can be associated with a vTPM instance which
emulates the TPM functionality to extend the chain of trust from physical TPM.
Similarly, TED (a portable device containing the functionality of TPM), which was
introduced by Nepal et al. (2007), can be used by the clients for remote platform
attestation and integrity verification tasks.
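The notion of a chain of trust can be sketched with the hash-chaining ("extend") operation that TPM-style measurement registers rely on: each new measurement is folded into the previous register value, so the final value depends on every component and on the order in which they were measured. The function names and the use of SHA-256 with a 32-byte all-zero initial register are assumptions for illustration; this is a simplified model, not the API of any TPM or vTPM implementation.

```python
import hashlib
from typing import Iterable


def extend(register: bytes, component: bytes) -> bytes:
    """Fold the measurement (hash) of one component into the register."""
    measurement = hashlib.sha256(component).digest()
    return hashlib.sha256(register + measurement).digest()


def measure_chain(components: Iterable[bytes]) -> bytes:
    """Measure components in order, starting from an all-zero register."""
    register = bytes(32)
    for component in components:
        register = extend(register, component)
    return register


if __name__ == "__main__":
    boot_sequence = [b"firmware", b"hypervisor", b"guest-kernel"]  # hypothetical
    print(measure_chain(boot_sequence).hex())
```

A verifier that knows the expected components can recompute the final value and compare it with the reported one; any altered or reordered component yields a different result.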
2.6.3 Access Control Mechanisms
Access control is the process of limiting system access to only authorized
people, programs, processes or other system components. ACMs are responsible for
protecting cloud storage by limiting, denying or restricting access to a system or an
entity according to the well-defined security policies (Afoulk et al., 2012). The most
common ACMs used in a cloud computing environment include Mandatory Access
Control (MAC), Discretionary Access Control (DAC), and RBAC (Wei et al., 2012).
All these techniques are known as identity based ACMs, as users (subjects) and
resources (objects) are identified by unique names. Identification may be done directly
or through roles assigned to subjects. ACMs guarantee integrity and confidentiality
of the resources. ACMs must be implemented by the CSP or third party in
association with the cloud user (Khan, 2012).
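As a minimal sketch of identification through roles, the example below checks whether a subject may perform an action by looking up the permissions of the roles assigned to that subject. The role names, user names and permission sets are hypothetical and chosen only for illustration; they do not describe the policy model of any particular CSP.

```python
from typing import Dict, Set

# Hypothetical policy: permissions granted to each role.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "data_owner": {"read", "write", "delete"},
    "auditor": {"read"},
}

# Hypothetical assignment of roles to subjects.
USER_ROLES: Dict[str, Set[str]] = {
    "alice": {"data_owner"},
    "ttp_auditor": {"auditor"},
}


def is_allowed(user: str, action: str) -> bool:
    """Return True if any role assigned to the user permits the action."""
    roles = USER_ROLES.get(user, set())
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


if __name__ == "__main__":
    print(is_allowed("alice", "delete"))        # owner may delete
    print(is_allowed("ttp_auditor", "delete"))  # auditor may only read
```

Unknown users and unknown roles fall through to an empty permission set, so access is denied by default.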
2.6.4 Service Level Agreement
An SLA is a formal document which contains the terms and conditions for using
cloud services. It documents common understandings about priorities,
responsibilities, and guarantees. The main objective of an SLA is to reduce key areas of
potential conflict and to identify their resolution before they materialize. Each
CSP offers a different SLA structure, service offerings, negotiation opportunities and
performance levels. An SLA can be used to select a CSP on the basis of data protection,
continuity, and cost. In order to avoid unwanted situations, the client and the CSP
should involve a TTP who can monitor the provided service and undertake necessary
steps independently by considering the specified terms and conditions in cases of
service violation (Stamou et al., 2012). TTP will ensure the delivery of required
service from CSP using metrics such as throughput, response, availability, reliability,
and data security controls (Ghosh and Ghosh, 2012). However, an SLA should not
only validate and penalize the CSP but it must also involve the possible penalties
for malicious client activities (Kandukuri et al., 2009).
2.6.5 Data Auditing Services
When clients transfer their confidential data to cloud storages, they lose
physical possession over it, which raises serious concerns of data integrity protection,
and makes it a challenging task. While using cloud storage services, users should be
able to access and process their data in the same manner as they do on their
personalized systems without worrying about the requirements of verifying its
integrity. Hence, facilitating public auditability for cloud storage is a vital task to
ensure protection of data residing at third party storages. CSPs should negotiate with
clients to allow a third party auditor to check the integrity of outsourced data
(Nithiavathy, 2013). However, the auditing process should not bring new
vulnerabilities towards users' data confidentiality. The process of data verification is
very significant while using cloud storages, since there are possible
threats from external hackers and malicious insiders such as disgruntled employees
of a CSP. They may illegally or intentionally delete or modify the clients'
records. The third party auditor can periodically conduct the auditing services on
behalf of clients to ensure that their data are stored with integrity maintained. In
cases of integrity violation, the auditor should report to the client and take appropriate steps
as specified in the SLA (Cong et al., 2013).
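The periodic audit described above can be sketched as follows. As a simplifying assumption, the example uses HMAC-SHA256 tags computed over the stored (already encrypted) objects as a stand-in for the digital signatures mentioned earlier, because Python's standard library has no DSA implementation; with real signatures the auditor would verify with a public key instead of sharing a tagging key. All function and variable names are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, List


def make_tag(tag_key: bytes, data: bytes) -> str:
    """Integrity tag over an object, computed before upload."""
    return hmac.new(tag_key, data, hashlib.sha256).hexdigest()


def audit(stored: Dict[str, bytes], expected: Dict[str, str], tag_key: bytes) -> List[str]:
    """Return the names of objects that are missing or no longer match their tag."""
    violations = []
    for name, tag in expected.items():
        data = stored.get(name)
        if data is None or not hmac.compare_digest(make_tag(tag_key, data), tag):
            violations.append(name)
    return violations


if __name__ == "__main__":
    key = b"auditor-tag-key"                                 # hypothetical key
    cloud = {"a.enc": b"cipher-A", "b.enc": b"cipher-B"}      # ciphertext objects
    tags = {name: make_tag(key, blob) for name, blob in cloud.items()}
    cloud["b.enc"] = b"cipher-B-modified"                     # simulated tampering
    print(audit(cloud, tags, key))
```

Because the tags are computed over ciphertext, the auditor in this sketch never sees plaintext, which matches the requirement that auditing should not weaken confidentiality.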
2.7 Industry Based Implementations of Cloud Storage Services
The demand for adopting cloud storage services is rapidly increasing due to its
advantages such as cost-effectiveness, scalability, backup, and disaster recovery.
However, the clients dealing with mission critical data are reluctant to move their
sensitive records to external third-party owned infrastructures due to data security
concerns (Taeho et al., 2013; Syam and Subramanian, 2011; Gansen et al., 2010).
Numerous industries are developing security models for cloud storages to preserve
data confidentiality and integrity. This research focused on reviewing and analyzing
the security of cloud storage solutions developed by Amazon and Google since they
are the leading cloud storage providers, and they are quite transparent when it comes
to the discussion of their security mechanisms and approaches used for protecting
clients' data at cloud storages (Eric, 2013). The cloud security mechanisms as well
as approaches used by Amazon and Google are described in the following sub-sections.
2.7.1 Amazon Simple Storage Service
Amazon S3 is a scalable, reliable and low-latency data storage infrastructure. It
supports a simple web services interface which is used to store and retrieve unlimited
data ubiquitously and pervasively. Users of S3 are charged according to a pay-per-use
billing model, and the provided service is flexible. Amazon describes
S3 as a highly durable storage infrastructure designed for mission critical
and primary data storage (Amazon, 2014). S3 supports a user authentication process
for controlling access to confidential data. Under the data security approaches
of S3, only data owners have access to their personal resources. The clients can use
ACMs such as bucket policies and Access Control Lists (ACLs) to selectively grant
permissions to users and groups. Amazon further strengthens the authentication
process for their customers by adding an extra layer of security to the system. The
users are required to provide a six-digit single-use code in addition to their standard
username and password credentials before access is granted to their services and
resources. Customers retrieve this code from an authentication device which they
keep in their physical possession. This process is called multi-factor authentication
because two different factors are checked before the access is granted (Amazon,
2011).
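A six-digit single-use code of the kind described above can be sketched with the HMAC-based One-Time Password algorithm (HOTP, RFC 4226). This is an assumption for illustration: the text does not state which algorithm Amazon's authentication devices use, and the helper names below are hypothetical. The device and the server share a secret and a counter, and both derive the same code from them.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Compute an HOTP code (RFC 4226) from a shared secret and a counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                       # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_login(password_ok: bool, submitted_code: str, secret: bytes, counter: int) -> bool:
    """Grant access only if both factors check out (password and one-time code)."""
    expected = hotp(secret, counter)
    return password_ok and hmac.compare_digest(submitted_code, expected)


if __name__ == "__main__":
    shared_secret = b"12345678901234567890"   # test secret from RFC 4226
    print(hotp(shared_secret, 0))
```

In practice the server advances the counter after each successful login so that a code cannot be reused.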
Figure 2.1: Server Side Encryption
(Jeff, 2011)
Furthermore, data is protected in-transit during upload and download
operations, as well as at rest. Clients can protect their data in transit using 256-bit
SSL, and data at rest can be protected using Server Side Encryption (SSE) or the
Server Side Encryption with Customer-Provided Key (SSE-C). The users of S3 can
use third party libraries to encrypt the data before storing it to the cloud storage.
Selecting an encryption method is based on preferences and requirements of the
clients (Amazon, 2014). Both SSE and SSE-C are used to encrypt data at rest, and
they are based on the 256-bit AES algorithm. S3 encrypts the data when it is
written to the disk and decrypts it when a client submits a request to access it, as
shown in Figure 2.1.
Using SSE, a client will not notice any difference in accessing the encrypted
data as long as S3 authenticates the client and validates his/her access privileges.
Using SSE-C, in contrast, the client provides the encryption key to the server as
part of the request. S3 performs encryption when data is written to disk, and
performs decryption when the client accesses the objects. Therefore, clients are not
required to maintain any source code to encrypt or decrypt data manually. Their only
task is to manage the encryption keys. When a client uploads an object, S3 uses
the provided encryption key to apply AES-256 encryption to the data, as shown in
Figure 2.2.
Figure 2.2: Encryption with Client's Key
(Jeff, 2014)
S3 does not store the client-provided encryption key. Instead, it stores a
randomly salted Keyed-Hash Message Authentication Code (HMAC) value of the
encryption key in order to validate future requests. The salted HMAC value
cannot be used to derive the encryption key or to decrypt the contents of an
encrypted object; therefore, if the client loses the encryption key, the ciphertext
will be useless. As an alternative to SSE, S3 clients can use third-party libraries or
the Amazon Software Development Kit (SDK) to encrypt their data on the client side
before sending it to the cloud storage. Clients can apply any encryption algorithm to
their data. The private keys of the client and the unencrypted data are never sent to
S3. Therefore, it is mandatory for the clients to manage their keys safely. For
client-side encryption, the Amazon Web Services (AWS) SDK uses a process called
envelope encryption. Using this technique, the client provides the encryption key to
the S3 encryption module, and the module performs the entire encryption process, as
shown in Figure 2.3.
Figure 2.3: Client Side Encryption
(Jeff, 2011a)
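The salted HMAC check described above for SSE-C can be illustrated with a minimal Python sketch. How Amazon actually derives the stored value is not detailed here, so the salt length, the use of SHA-256, the argument order of the HMAC and the function names make_key_record and key_matches are all assumptions made for illustration only.

```python
import hashlib
import hmac
import os
from typing import Tuple


def make_key_record(encryption_key: bytes) -> Tuple[bytes, bytes]:
    """Return (salt, tag). Only these two values are stored, never the key.

    Assumption: the tag is HMAC-SHA256 keyed with a random 16-byte salt
    and computed over the client's encryption key.
    """
    salt = os.urandom(16)
    tag = hmac.new(salt, encryption_key, hashlib.sha256).digest()
    return salt, tag


def key_matches(candidate_key: bytes, salt: bytes, tag: bytes) -> bool:
    """Check whether a key supplied with a later request is the original one."""
    expected = hmac.new(salt, candidate_key, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, tag)
```

A server following this sketch would call make_key_record when an object is uploaded and key_matches on every later request. Because the tag is a one-way value, a lost key cannot be recovered from the stored record, which is why the ciphertext becomes useless in that case.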
Initially, during the encryption process, the S3 encryption module generates a
one-time-use 256-bit AES symmetric key known as the envelope key. The module uses
this key to encrypt the client's data, and then encrypts the envelope key using the
client's private key (Amazon, 2014). The client uploads the encrypted envelope key
along with the encrypted data to S3. During retrieval and decryption, the module
retrieves the client's encrypted data from the server along with the encrypted
envelope key. It then decrypts the envelope key using the client's private encryption
key, and finally decrypts the client's data using the envelope key. In this process,
if the encryption keys are lost, clients will not be able to decrypt their data.
Besides these encryption and decryption approaches, S3 uses integrity verification
techniques to verify the correctness of data stored in the cloud. It uses a
combination of checksums and Cyclic Redundancy Checks (CRCs) to detect data
corruption. S3 computes checksums on data at rest and repairs any corruption using
redundant data. Moreover, S3 also calculates checksums on all network traffic to
detect corruption of data packets when storing or retrieving data. The data
protection approaches of Amazon are also in compliance with its associated SLA,
which mainly specifies the service commitments regarding durability and availability
of service (Amazon, 2014).
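The envelope encryption steps described at the start of this passage can be sketched in Python as follows. The sketch shows only the key-wrapping structure. Because the Python standard library has no AES implementation, a toy keystream cipher built from SHA-256 stands in for AES-256; it is an assumption made purely for illustration and must not be used to protect real data. The function names and the dictionary layout are likewise hypothetical.

```python
import hashlib
import os


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stand-in for AES-256: XOR data with a SHA-256 counter keystream.

    Illustration only, NOT a vetted cipher. Applying it twice with the
    same key and nonce returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def envelope_encrypt(client_key: bytes, plaintext: bytes) -> dict:
    """Encrypt data under a one-time envelope key, then wrap that key."""
    envelope_key = os.urandom(32)  # one-time-use 256-bit key
    data_nonce, key_nonce = os.urandom(16), os.urandom(16)
    return {
        "data_nonce": data_nonce,
        "key_nonce": key_nonce,
        "ciphertext": _keystream_xor(envelope_key, data_nonce, plaintext),
        # Only the wrapped envelope key is uploaded, never the client key.
        "wrapped_key": _keystream_xor(client_key, key_nonce, envelope_key),
    }


def envelope_decrypt(client_key: bytes, blob: dict) -> bytes:
    """Unwrap the envelope key with the client key, then decrypt the data."""
    envelope_key = _keystream_xor(client_key, blob["key_nonce"], blob["wrapped_key"])
    return _keystream_xor(envelope_key, blob["data_nonce"], blob["ciphertext"])
```

In this sketch the uploaded blob contains the ciphertext and the wrapped envelope key, while client_key never leaves the client. Losing client_key therefore makes both the envelope key and the data unrecoverable, which matches the risk noted above.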
2.7.2 Google Cloud Storage
The data storage process can be time-consuming and costly since it includes
maintaining data servers, storage disks, firewalls, backup copies and disaster
recovery provisions. Google Cloud Storage (GCS) reduces these burdens on
individuals as well as organizations, allowing them to store, retrieve, share, and
analyze their data without worrying about maintenance, hardware and firmware
upgrades. It is massively scalable: users can store and process terabytes of data to
support big data scenarios required by scientific, financial analysis and media
applications, or they can store the small amounts of data required for light business
websites. GCS is elastic, so users can design applications for a large global
audience, and scale those applications as desired. Users pay only for what they use,
when they use it. GCS supports a simple programming interface which enables
developers to take advantage of Google's own reliable and fast network
infrastructure to perform data operations in a secure and cost-effective manner
(Google, 2012). GCS users can move their data from Amazon S3 to Google's
data storage through a migration pipeline, as shown in Figure 2.4.
Figure 2.4: Data Migration Process
(Google, 2013)
Besides these advantages, GCS also enforces strict security policies and
approaches to protect the confidential data of its users from external as well as
internal threats such as disgruntled employees. Google requires the use of a unique
user identification number for each employee. It is used to identify each person's
activity on Google's network, including any access to the data of employees or
customers. During the hiring process, an employee is assigned an identification
number by the Human Resources (HR) system and granted a default set of privileges.
At the end of a person's employment, his/her account's access to Google's network is
disabled from the HR system. Google makes widespread use of two-factor
authentication mechanisms, such as certificates and one-time password generators.
Two-factor authentication is required for accessing the production environments and
resources through Google's Single Sign-On (SSO) system. Access rights and levels are
based on an employee's job function and role using the least-privilege concept
(Google, 2012a).
GCS also uses ACLs to manage access to objects and buckets. ACLs are the
mechanisms that customers use to share objects with other users and allow them to
access their buckets as well as objects. An ACL consists of one or more entries,
where each entry grants permissions to a scope. Permissions define the actions that
can be performed against an object or bucket, for example read or write tasks,
whereas the scope defines who the permission applies to, i.e. a specific user or group
of users. When a user requests access to an object or bucket, the GCS system reads
the ACL on the object or bucket and determines whether to allow or reject the
request. If the ACL grants the user permission for the requested operation, the task
is performed. Users are thus able to share data with authorized colleagues and
partners through ACLs while controlling access to their confidential records. If the
ACL does not grant the user permission for the requested operation, the request fails
and a forbidden error or access denied message is returned (Google, 2012a).
GCS users are also able to grant access to their objects to users who do not have
GCS accounts, as these users can access the objects by authenticating via
ordinary Google accounts. For example, in order to permit a user Jane to download an
object from a bucket, the object owner is first required to grant Jane read permission
for that object and then to provide her with the resource link. When Jane opens the
link in her browser, she will be automatically prompted to sign in to her Google
account. After she is authenticated, and her browser has acquired a cookie with an
encapsulated identity token, she will be redirected to the object in the GCS
repository. GCS then verifies that Jane is allowed to read the object, and the object
is downloaded to Jane's computer (Google, 2014). This authorization process is
shown in Figure 2.5.
Figure 2.5: Authentication Process
(Google, 2014)
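The ACL evaluation described above, in which each entry pairs a permission with a scope and a request is either performed or rejected with a forbidden error, can be sketched in Python as follows. This is a simplified model under stated assumptions rather than the GCS implementation: the AclEntry type, the permission labels "READ" and "WRITE", the "group:" scope format and the use of numeric status codes 200 and 403 are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class AclEntry:
    scope: str       # who the entry applies to: a user id or a group name
    permission: str  # what is allowed: "READ" or "WRITE" (hypothetical labels)


def is_allowed(acl: Iterable[AclEntry], user: str, groups: Set[str],
               requested: str) -> bool:
    """Return True if any entry grants the requested permission to the user
    directly or to one of the groups the user belongs to."""
    for entry in acl:
        in_scope = entry.scope == user or entry.scope in groups
        if in_scope and entry.permission == requested:
            return True
    return False


def handle_request(acl: Iterable[AclEntry], user: str, groups: Set[str],
                   requested: str) -> int:
    """Perform the operation (200) or reject it with a forbidden error (403)."""
    return 200 if is_allowed(acl, user, groups, requested) else 403
```

For the example in the text, an object whose ACL contains AclEntry("jane@example.com", "READ") would let Jane download the object after she has signed in, while a write request from her, or any request from a user outside every listed scope, would be rejected.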
GCS uses Hypertext Transfer Protocol Secure (HTTPS), configured with 256-bit
SSL, to establish a secure communication channel. Information sent via HTTPS is
encrypted from the time it leaves GCS until it is received by the recipient's
computer. GCS automatically encrypts all data before it is written to disk,
at no additional charge. There is no setup or configuration required, and no need to
modify the way customers access their services. Data is automatically and
transparently decrypted when read by an authorized user. If customers require
encryption for their data, this functionality frees them from the hassle and risk of
managing personal encryption and decryption keys. GCS manages the cryptographic
keys on behalf of customers using the same hardened key management systems that
Google uses for its own encrypted data, including strict key access controls and
auditing. Each cloud storage object's data and metadata are encrypted using 128-bit
AES, and each encryption key is encrypted with a regularly rotated set of master
keys. Users of GCS are also able to encrypt their data themselves prior to sending it
for storage; in this case, users are responsible for managing their own encryption and
decryption keys (Ferreira, 2013). GCS provides a CRC header that allows clients
to verify the integrity of object contents. For non-composite objects, GCS also
provides a message digest header that allows clients to verify the integrity of
the objects, but for composite objects only the CRC is available. Integrity checks are
automatically performed on all uploads and downloads. As with S3, the service
provided to GCS customers is to be operational and available in compliance with
the SLA. If Google does not comply with the requirements of the SLA, and if
customers meet their obligations as specified in the SLA, customers are eligible to
receive service credit. Google also ensures that retired disks containing
customers' old information are subjected to a data destruction process prior to
leaving the premises (Google, 2012a).
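The client-side use of the CRC and message digest headers mentioned above can be sketched in Python as follows. The sketch rests on several assumptions: the header names "crc" and "md5" are hypothetical, MD5 is assumed as the message digest, and the standard library's CRC-32 (zlib.crc32) is used as a stand-in that may differ from the CRC variant GCS actually applies.

```python
import base64
import hashlib
import zlib
from typing import Dict


def integrity_headers(content: bytes) -> Dict[str, str]:
    """Compute base64-encoded CRC and digest values for some object content."""
    crc = zlib.crc32(content) & 0xFFFFFFFF
    return {
        "crc": base64.b64encode(crc.to_bytes(4, "big")).decode("ascii"),
        "md5": base64.b64encode(hashlib.md5(content).digest()).decode("ascii"),
    }


def verify_download(content: bytes, received: Dict[str, str]) -> bool:
    """Recompute each check the server supplied and compare with the header.

    Only the headers that are present are checked, so a composite object
    that carries just the CRC is verified by the CRC alone.
    """
    local = integrity_headers(content)
    return all(local.get(name) == value for name, value in received.items())
```

A client would call verify_download on the bytes it received together with the headers returned by the server, and would retry or report corruption when the function returns False.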
2.8 Limitations of Industry Implemented Cloud Storage Services
Although cloud storage services have several advantages for business
organizations as well as individuals, an in-depth literature review conducted in this
research identified that well-known cloud storages such as S3 and GCS have certain
vulnerabilities or limitations in terms of KM approach, cryptographic support, SLA,
and data integrity verification services. A summary of industry-implemented cloud
storage services and their limitations is provided in Figure 2.6.
Figure 2.6: Limitations of Amazon S3 and GCS
The shortcomings of existing cloud storages can raise information
confidentiality and integrity concerns; therefore, organizations do not trust these
services to store mission-critical data such as healthcare, banking or
government-related records. The common vulnerabilities of S3 and GCS are
described in the following sub-sections.
2.8.1 Vulnerable Key Management Approach
GCS and Amazon S3 automatically encrypt data before it is written to disk
using the SSE process without additional charges. There is no setup or configuration
required and no need to modify the way clients access the service. Data is
automatically and transparently decrypted when read by an authorized client, without
requiring the users to undertake the burden of managing their private keys.
CSPs protect the cryptographic keys of clients using the same hardened
KM systems that they use for their own encrypted data, including strict key access
controls and auditing. However, this process of data and key protection requires the
clients to trust the CSPs completely, because the keys are managed by them. In
certain cases, organizations can be concerned that government authorities such
as the National Security Agency (NSA) can obtain their keys from the CSP to decrypt
and illegally access or view their sensitive records (Ferreira, 2013).
In order to overcome clients' concerns, the alternative approach supported by
CSPs is to recommend that clients encrypt their data before sending it to the cloud
storage, or perform SSE-C without storing their keys at the server. This
approach also has implications in terms of KM, since clients are responsible for
managing their data encryption and decryption keys and protecting them from
compromise by an adversary and from unintentional deletion or loss. Clients are not
eager to undertake such a responsibility, as it requires managing hardware and
software, which adds to the cost of the overall service. Clients must be provided
with an effective and secure KM approach that involves a TTP, so that they can
protect their data confidentiality and integrity without being anxious about
maintaining a secure key storage service.
2.8.2 Inadequate Cryptographic Support
The data stored at S3 or GCS is encrypted using the AES algorithm. S3
supports 256-bit AES, whereas GCS supports 128-bit AES. At present, AES-256
is considered secure, but GCS has been criticized for using AES-128 on the grounds
that it is not strong enough to protect confidential data and could be
cracked by today's technology in a reasonable amount of time (Ferreira, 2013).
Besides the security concerns, the encryption techniques of S3 and GCS have
limitations in terms of functionality and usage. For example, when data is encrypted
at GCS or S3, it cannot be processed to perform computations unless it is decrypted
to its original format, so users are required to download their data each time they
perform these tasks. In other words, existing cloud storage services provide no
support for FHE, which can enable users to perform live computations on their data
while their privacy remains preserved (Murali et al., 2013). FHE is a good basis for
enhancing the security measures of untrusted systems or applications that store and
manipulate sensitive data. Although FHE has been implemented, it has not been proven
efficient for practical use in challenging cloud computing services such as S3 and
GCS (Kui et al., 2012; Wang et al., 2013; Stefania et al., 2012).
2.8.3 Exclusion of Security Assura