
Elysium PRO Titles with Abstracts 2017-18



As one important technique of fuzzy clustering in data mining and pattern recognition, the possibilistic

c-means algorithm (PCM) has been widely used in image analysis and knowledge discovery. However,

it is difficult for PCM to produce a good result for clustering big data, especially for heterogeneous data,

since it was originally designed only for small structured datasets. To tackle this problem, the paper proposes

a high-order PCM algorithm (HOPCM) for big data clustering by optimizing the objective function in

the tensor space. Further, we design a distributed HOPCM method based on MapReduce for very large

amounts of heterogeneous data. Finally, we devise a privacy-preserving HOPCM algorithm

(PPHOPCM) to protect the private data on the cloud by applying the BGV encryption scheme to HOPCM.

In PPHOPCM, the functions for updating the membership matrix and clustering centers are

approximated as polynomial functions to support the secure computing of the BGV scheme.

Experimental results indicate that PPHOPCM can effectively cluster a large number of heterogeneous

data using cloud computing without disclosure of private data.

ETPL CLD - 001

PPHOPCM: Privacy-preserving High-order Possibilistic c-Means Algorithm for Big

Data Clustering with Cloud Computing
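
For reference, the classical PCM updates that HOPCM builds on can be sketched in plain Python. This is a minimal sketch of the standard membership and center updates on ordinary vectors, not the tensor-space HOPCM or the BGV-encrypted PPHOPCM from the abstract. The function names, the fuzzifier `m`, the scale parameters `eta`, and the sample points are all illustrative assumptions.

```python
def pcm_memberships(points, centers, eta, m=2.0):
    """Typicality of point j for cluster i: 1 / (1 + (d^2 / eta_i)^(1/(m-1)))."""
    u = []
    for center, eta_i in zip(centers, eta):
        row = []
        for p in points:
            d2 = sum((a - b) ** 2 for a, b in zip(p, center))
            row.append(1.0 / (1.0 + (d2 / eta_i) ** (1.0 / (m - 1.0))))
        u.append(row)
    return u


def pcm_centers(points, u, m=2.0):
    """Each center is the mean of all points weighted by u[i][j]^m."""
    dim = len(points[0])
    centers = []
    for row in u:
        weights = [w ** m for w in row]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(dim)
        ))
    return centers


# Hypothetical toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
centers = [(0.0, 0.5), (5.0, 5.5)]
for _ in range(10):  # alternate the two updates a fixed number of times
    u = pcm_memberships(points, centers, eta=[1.0, 1.0])
    centers = pcm_centers(points, u)
```

Both updates contain divisions, which a scheme limited to additions and multiplications cannot evaluate directly; that is presumably why the abstract describes approximating them with polynomial functions.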

Cloud storage, as one of the most important services of cloud computing, helps cloud users break the

bottleneck of restricted resources and expand their storage without upgrading their devices. In order to

guarantee the security and privacy of cloud users, data are always outsourced in an encrypted form.

However, encrypted data could incur much waste of cloud storage and complicate data sharing among

authorized users. We still face challenges in encrypted data storage and management with

deduplication. Traditional deduplication schemes always focus on specific application scenarios, in

which the deduplication is completely controlled by either data owners or cloud servers. They cannot

flexibly satisfy various demands of data owners according to the level of data sensitivity. In this paper,

we propose a heterogeneous data storage management scheme, which flexibly offers both

deduplication management and access control at the same time across multiple Cloud Service Providers

(CSPs). We evaluate its performance with security analysis, comparison and implementation. The

results show its security, effectiveness and efficiency towards potential practical usage.

ETPL CLD - 002

Heterogeneous Data Storage Management with Deduplication in Cloud Computing
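
The basic idea of combining deduplication with access control can be illustrated with a toy content-addressed store. This is only a sketch under strong assumptions: it works on plaintext bytes in a single process, whereas the scheme in the abstract handles encrypted data across multiple CSPs. The class `DedupStore` and its methods are hypothetical names, not part of the paper.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical payloads are kept only once."""

    def __init__(self):
        self.blobs = {}    # digest -> payload bytes
        self.owners = {}   # digest -> set of user ids allowed to read

    def put(self, user, payload):
        digest = hashlib.sha256(payload).hexdigest()
        if digest not in self.blobs:      # first upload: store the bytes
            self.blobs[digest] = payload
        # Later uploads of the same content only extend the access list.
        self.owners.setdefault(digest, set()).add(user)
        return digest

    def get(self, user, digest):
        if user not in self.owners.get(digest, set()):
            raise PermissionError("user is not authorized for this object")
        return self.blobs[digest]


store = DedupStore()
d1 = store.put("alice", b"quarterly report")
d2 = store.put("bob", b"quarterly report")   # duplicate: no new blob stored
```

With ordinary encryption, two users' ciphertexts of the same file differ, so a digest comparison like this one no longer finds duplicates; that gap is the difficulty the abstract refers to.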


Fog computing, an extension of cloud computing services to the edge of the network to decrease latency

and network congestion, is a relatively recent research trend. Although both cloud and fog offer similar

resources and services, the latter is characterized by low latency with a wider spread and geographically

distributed nodes to support mobility and real-time interaction. In this paper, we describe the fog

computing architecture and review its different services and applications. We then discuss security and

privacy issues in fog computing, focusing on service and resource availability. Virtualization is a vital

technology in both fog and cloud computing that enables virtual machines (VMs) to coexist in a

physical server (host) to share resources. These VMs could be subject to malicious attacks or the

physical server hosting them could experience system failure, both of which result in unavailability of

services and resources. Therefore, a conceptual smart pre-copy live migration approach is presented

for VM migration. Using this approach, we can estimate the downtime after each iteration to determine

whether to proceed to the stop-and-copy stage during a system failure or an attack on a fog computing

node. This will minimize both the downtime and the migration time to guarantee resource and service

availability to the end users of fog computing. Last, future research directions are outlined.

ETPL CLD - 003

From cloud to fog computing: A review and a conceptual live VM migration

framework
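
The decision rule described above (estimate the downtime after each iteration, then decide whether to enter stop-and-copy) can be sketched with a simple model. This is a minimal sketch, assuming a constant page-dirtying rate and constant bandwidth; the function name, parameters, and numbers are illustrative and do not come from the paper.

```python
def precopy_migration(memory_mb, dirty_mb_per_s, bandwidth_mb_per_s,
                      max_downtime_s, max_rounds=30):
    """Return (pre-copy rounds, total migration time, downtime) in a toy model."""
    to_send = float(memory_mb)   # the first round copies all memory
    elapsed = 0.0
    rounds = 0
    while rounds < max_rounds:
        # Downtime if the VM were paused now and the remainder sent.
        est_downtime = to_send / bandwidth_mb_per_s
        if est_downtime <= max_downtime_s:
            break                # good enough: go to stop-and-copy
        elapsed += est_downtime  # this round takes as long as sending to_send
        # Pages dirtied while the round ran must be sent again.
        to_send = min(float(memory_mb), dirty_mb_per_s * est_downtime)
        rounds += 1
    downtime = to_send / bandwidth_mb_per_s
    return rounds, elapsed + downtime, downtime


rounds, total_time, downtime = precopy_migration(
    memory_mb=1024, dirty_mb_per_s=20, bandwidth_mb_per_s=100, max_downtime_s=0.5)
```

When the dirtying rate is at least as high as the bandwidth, the remaining data never shrinks, so the `max_rounds` cap is what forces the stop-and-copy stage in this model.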

Although electronic technologies have developed rapidly in recent years, mobile devices

such as smartphones are still comparatively weak in contrast to desktops in terms of computational

capability, storage, etc., and are not able to meet the increasing demands from mobile users. By

integrating mobile computing and cloud computing, mobile cloud computing (MCC) greatly extends

the boundary of the mobile applications, but it also inherits many challenges in cloud computing, e.g.,

data privacy and data integrity. In this paper, we leverage several cryptographic primitives such as a

new type-based proxy re-encryption to design a secure and efficient data distribution system in MCC,

which provides data privacy, data integrity, data authentication, and flexible data distribution with

access control. Compared to traditional cloud-based data storage systems, our system is a lightweight

and easily deployable solution for mobile users in MCC since no trusted third parties are involved and

each mobile user only has to keep short secret keys consisting of three group elements for all

cryptographic operations. Finally, we present extensive performance analysis and empirical studies to

demonstrate the security, scalability, and efficiency of our proposed system.

ETPL CLD - 004

Towards Secure Data Distribution Systems in Mobile Cloud Computing


In the past decade, mobile devices and applications have experienced explosive growth, and users

are expecting higher data rates and better quality services every year. In this paper, we propose several

ideas to increase the functionality and capacity of wireless networks using software-defined networking

(SDN) and cloud computing technologies. Connections between users and services in mobile networks

typically have to pass through a required set of middle boxes. The complex routing is one of the major

impetuses for the SDN paradigm, which enables flexible policy-aware routing in the next generation

mobile networks. In addition, the high costs of middle boxes and limited capabilities of mobile devices

call for revolutionary virtualization technologies enabled by cloud computing. Based on these, we

consider an online routing problem for mobile networks with SDN and cloud computing. In this

problem, connection requests are given one at a time (as in a real mobile system), and the objective is

to steer traffic flows to maximize the total amount of traffic accepted over time, subject to capacity,

budget, policy, and quality of service constraints. A fast log-competitive approximation algorithm is

developed based on time-dependent duals.

ETPL CLD - 005

Enhancing Mobile Networks with Software Defined Networking and Cloud

Computing

The growth of mobile cloud computing (MCC) is challenged by the need to adapt to the resources and

environment that are available to mobile clients while addressing the dynamic changes in network

bandwidth. Big data can be handled via MCC. In this paper, we propose a model of computation

partitioning for stateful data in the dynamic environment that will improve performance. First, we

constructed a model of stateful data streaming and investigated the method of computation partitioning

in a dynamic environment. We developed a definition of direction and calculation of the segmentation

scheme, including single frame data flow, task scheduling and executing efficiency. We also defined

the problem for a multi-frame data flow calculation segmentation decision that is optimized for

dynamic conditions and provided an analysis. Second, we proposed a computation partitioning method

for single frame data flow. We determined the data parameters of the application model, the

computation partitioning scheme, and the task and work order data stream model. We followed the

scheduling method to provide the optimal calculation for data frame execution time after computation

partitioning and the best computation partitioning method. Third, we explored a calculation

segmentation method for single frame data flow based on multi-frame data using multi-frame data

optimization adjustment and prediction of future changes in network bandwidth. We were able to

demonstrate that the calculation method for multi-frame data in a changing network bandwidth

environment is more efficient than the calculation method with the limitation of calculations for single

frame data. Finally, our research verified the effectiveness of single frame data in the application of

the data stream and analyzed the performance of the method to optimize the adjustment of multi-frame

data. We used a mobile cloud computing platform prototype system for face recognition to verify the

effectiveness of the method.

ETPL CLD - 006

Computation partitioning for mobile cloud computing in big data environment


With the exponential increase in mobile devices and the fast development of cloud computing, a

new computing paradigm called mobile cloud computing (MCC) has been put forward to overcome the limitations

of the mobile device's storage, communication, and computation. Through mobile devices, users can

enjoy various cloud computing services during their mobility. However, it is difficult to ensure security

and protect privacy due to the openness of wireless communication in the new computing paradigm.

Recently, Tsai and Lo proposed a privacy-aware authentication (PAA) scheme to solve the

identification problem in MCC services and proved that their scheme was able to resist many kinds of

existing attacks. Unfortunately, we found that Tsai and Lo's scheme cannot resist the service provider

impersonation attack, i.e., an adversary can impersonate the service provider to the user. Also, the

adversary can extract the user's real identity while executing the service provider impersonation attack.

To address the above problems, in this paper, we construct a new PAA scheme for MCC services by

using an identity-based signature scheme. Security analysis shows that the proposed PAA scheme is

able to address the serious security problems existing in Tsai and Lo's scheme and can meet security

requirements for MCC services. The performance evaluation shows that the proposed PAA scheme

has lower computation and communication costs than Tsai and Lo's PAA scheme.

ETPL CLD - 007

Efficient Privacy-Aware Authentication Scheme for Mobile Cloud Computing

Services

Resource procurement using reverse auction in Cloud computing is an interesting but complex

problem as it involves many attributes and constraints. A reverse auction is a mechanism in which a

customer prepares a call for proposals for its resource requirements and publicizes it in order to get the

attention of eligible service providers. This work proposes a multi-attribute combinatorial reverse

auction for Cloud resource procurement which considers price as well as non-price attributes such as

quality of service parameters, reputation etc. in the determination of winning service providers. For

this, the problem is formulated using an approximation algorithm and a near-optimal solution is obtained in

polynomial time. The auction mechanism allows providers to reveal true information in order to maximize

their profit. It also imposes a penalty on the providers who cheat, i.e., do not offer the agreed-upon

services. This makes the system robust as it maintains the utility of the customer. It also maintains a

healthy competition among the providers. Performance evaluation and a comparative study with some

baseline models show that the proposed method performs better.

ETPL CLD - 008

A Truthful and Fair Multi-Attribute Combinatorial Reverse Auction for Resource

Procurement in Cloud Computing


Virtual machine placement (VMP) and energy efficiency are significant topics in cloud computing

research. In this paper, evolutionary computing is applied to VMP to minimize the number of active

physical servers, so as to schedule underutilized servers to save energy. Inspired by the promising

performance of the ant colony system (ACS) algorithm for combinatorial problems, an ACS-based

approach is developed to achieve the VMP goal. Coupled with order exchange and migration (OEM)

local search techniques, the resultant algorithm is termed an OEMACS. It effectively minimizes the

number of active servers used for the assignment of virtual machines (VMs) from a global optimization

perspective through a novel strategy for pheromone deposition which guides the artificial ants towards

promising solutions that group candidate VMs together. The OEMACS is applied to a variety of VMP

problems with differing VM sizes in cloud environments of homogeneous and heterogeneous servers.

The results show that the OEMACS generally outperforms conventional heuristic and other

evolutionary based approaches, especially on VMP with bottleneck resource characteristics, and offers

significant savings of energy and more efficient use of different resources.

ETPL CLD - 009

An Energy Efficient Ant Colony System for Virtual Machine Placement in Cloud

Computing

Likely system invariants model properties that hold in operating conditions of a computing system.

Invariants may be mined offline from training datasets, or inferred during execution. Scientific work

has shown that invariants’ mining techniques support several activities, including capacity planning

and detection of failures, anomalies and violations of Service Level Agreements. However, their

practical application by operation engineers is still a challenge. We aim to fill this gap through an

empirical analysis of three major techniques for mining invariants in cloud-based utility computing

systems: clustering, association rules, and decision list. The experiments use independent datasets from

real-world systems: a Google cluster, whose traces are publicly available, and a Software-as-a-Service

platform used by various companies worldwide. We assess the techniques in two invariants’

applications, namely executions characterization and anomaly detection, using the metrics of coverage,

recall and precision. A sensitivity analysis is performed. Experimental results allow inferring practical

usage implications, showing that relatively few invariants characterize the majority of operating

conditions, that precision and recall may drop significantly when trying to achieve a large coverage,

and that techniques exhibit similar precision, though the supervised one achieves a higher recall. Finally, we

propose a general heuristic for selecting likely invariants from a dataset.

ETPL CLD - 010

Assessing Invariant Mining Techniques for Cloud-based Utility Computing Systems


With today's massive jobs spanning thousands of tasks each, cost-optimality has become more

important than ever. Modern distributed data processing paradigms can be significantly more sensitive

to cost than makespan, especially for long jobs deployed in commercial clouds. This paper posits that

minimized dollar costs cannot be achieved unless data and tasks are scheduled simultaneously. In this

paper, we introduce the problem of cost-efficient co-scheduling for highly data-intensive jobs in cloud,

such as MapReduce. We show that while the problem is polynomial in some cases, the general problem

is NP-Hard. We propose to tackle the problem by using integer programming techniques coupled with

heuristic reduction and optimization to enable a near-real-time solution. Afford Hadoop, a pluggable

co-scheduler for Hadoop, is implemented as an example of such a co-scheduler. Afford Hadoop can

save up to 48% of the overall dollar costs when compared to existing schedulers and provides

significant flexibility in fine-tuning the cost-performance trade-off.

ETPL CLD - 011

Cost-Efficient Tasks and Data Co-Scheduling with Afford Hadoop

Colocation data centers are an important type of data centers that have some unique challenges in

managing their energy consumption. Tenants in a colocation data center usually manage their servers

independently without coordination, leading to inefficiency. To address this issue, we propose a

formulation of coordinated energy management for colocation data centers. Considering the

randomness of workload arrival and electricity cost function, we formulate it as a stochastic

optimization problem, and then develop an online algorithm to solve it efficiently. Our algorithm is

based on Lyapunov optimization, which only needs to track the instantaneous values of the underlying

random factors without requiring any knowledge of the statistics or future information. Moreover,

the alternating direction method of multipliers (ADMM) is utilized to implement our algorithm in a

decentralized way, making it easy to implement in practice. We analyze the performance of our

online algorithm, proving that it is asymptotically optimal and robust to the statistics of the involved

random factors. Moreover, extensive trace-based simulations are conducted to illustrate the

effectiveness of our approach.

ETPL CLD - 012

Dynamic Multi-Tenant Coordination for Sustainable Colocation Data Centers


Energy-efficient cloud computing has recently attracted much attention, where not only performance

but also energy consumption are important metrics to be considered for designing rational resource

scheduling strategies. Most existing approaches for achieving energy-efficient computing focus on

connecting these two metrics and balancing the tradeoff between them, which, however, is inadequate

because another important factor, reliability, is not considered. In fact, both virtual machine (VM)

failures and server failures inevitably interrupt execution of a cloud service, and eventually result in

spending more time and consuming more energy on completing the cloud service. Therefore, reliability

significantly affects service performance and energy consumption, and thus they should not be handled

separately. Connecting these correlated metrics is essential for making more precise evaluation and

further for developing rational cloud resource scheduling strategies. In this paper, we present a

correlated modeling approach applying Semi-Markov models, the Laplace-Stieltjes transform (LST),

and a Bayesian approach to analyze reliability-performance (R-P) and reliability-energy (R-E) correlations

for cloud services using a retrying fault recovery mechanism. A recursive method is also proposed for

modeling the correlations for cloud services using a check-pointing fault recovery mechanism. The

proposed correlation models can be used to calculate the expected service time and energy consumption

for completing a cloud service. Moreover, the models can contribute to analyzing the expected

performance-energy tradeoff. We formulate the expected performance-energy optimization problem

by describing performance and energy consumption metrics as functions of assigned CPU frequencies.

Finally, we use a derivation approach to determine Pareto optimal solutions for the formulated

optimization problem. Illustrative examples are provided.

ETPL CLD - 013

Correlation Modelling and Resource Optimization for Cloud Service with Fault

Recovery
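
The effect of failures on expected service time under the two recovery mechanisms can be illustrated with a textbook model. This is a minimal sketch, assuming failures arrive as a Poisson process and recovery itself is instantaneous; it is not the semi-Markov or recursive model of the paper. The function names and numbers are illustrative assumptions.

```python
import math


def expected_time_retry(work_s, failure_rate):
    """Expected completion time when each failure restarts the job from scratch.

    Under the stated assumptions this is (exp(rate * work) - 1) / rate.
    """
    if failure_rate == 0:
        return float(work_s)
    return math.expm1(failure_rate * work_s) / failure_rate


def expected_time_checkpoint(work_s, failure_rate, segments, checkpoint_s=0.0):
    """Same model, with the job split into equal segments.

    A checkpoint is taken after each segment, so a failure only repeats the
    current segment (plus its checkpoint overhead).
    """
    per_segment = work_s / segments + checkpoint_s
    return segments * expected_time_retry(per_segment, failure_rate)


# Hypothetical numbers: 100 s of work, one failure per 100 s on average.
retry = expected_time_retry(100, 0.01)
checkpointed = expected_time_checkpoint(100, 0.01, segments=10)
```

In this model the retry time grows exponentially with job length, while check-pointing keeps it close to linear, at the price of the per-segment overhead `checkpoint_s`.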

In the Infrastructure as a Service (IaaS) paradigm of cloud computing, computational resources are

available for rent. Although it offers a cost efficient solution to virtual network requirements, low trust

on the rented computational resources prevents users from using it. To reduce the cost, computational

resources are shared, i.e., there exists multi-tenancy. As the communication channels and other

computational resources are shared, it creates security and privacy issues. A user may not identify a

trustworthy co-tenant as the users are anonymous. The user depends on the Cloud Provider (CP) to

assign trustworthy co-tenants. But, it is in the CP’s interest that it gets maximum utilization of its

resources. Hence, it allows maximum co-tenancy irrespective of the behaviours of users. In this paper,

we propose a robust reputation management mechanism that encourages the CPs in a federated cloud

to differentiate between good and malicious users and assign resources in such a way that they do not

share resources. We show the correctness and the efficiency of the proposed reputation management

system using analytical and experimental analysis.

ETPL CLD - 014

A Robust Reputation Management Mechanism in Federated Cloud


Outsourcing storage and computation to the cloud has become a common practice for businesses and

individuals. As the cloud is semi-trusted or susceptible to attacks, many studies suggest that the

outsourced data should be encrypted and then retrieved by using searchable symmetric encryption

(SSE) schemes. Since the cloud is not fully trusted, we doubt whether it would always process queries

correctly or not. Therefore, there is a need for users to verify their query results. Motivated by this, in

this paper, we propose a publicly verifiable dynamic searchable symmetric encryption scheme based

on the accumulation tree. We first construct an accumulation tree based on encrypted data and then

outsource both of them to the cloud. Next, during the search operation, the cloud generates the

corresponding proof according to the query result by mapping Boolean query operations to set

operations, while keeping privacy-preservation and achieving the verification requirements: freshness,

authenticity, and completeness. Finally, we extend our scheme by dividing the accumulation tree into

different small accumulation trees to make our scheme scalable. The security analysis and performance

evaluation show that the proposed scheme is secure and practical.

ETPL CLD - 015

Publicly Verifiable Boolean Query over Outsourced Encrypted Data
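
The mapping from Boolean query operations to set operations mentioned above can be shown on a plaintext keyword index. This is a minimal sketch under that assumption: the scheme in the abstract works over encrypted data and attaches accumulation-tree proofs, neither of which is modeled here. The query encoding as nested tuples and the function names are hypothetical.

```python
def build_index(documents):
    """Map each keyword to the set of document ids that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index


def evaluate(query, index, all_ids):
    """Evaluate a nested query such as ("and", "cloud", ("not", "fog"))."""
    if isinstance(query, str):
        return set(index.get(query, set()))
    op, *args = query
    results = [evaluate(arg, index, all_ids) for arg in args]
    if op == "and":
        return set.intersection(*results)   # AND -> set intersection
    if op == "or":
        return set.union(*results)          # OR  -> set union
    if op == "not":
        return all_ids - results[0]         # NOT -> set complement
    raise ValueError(f"unknown operator: {op}")


docs = {1: "cloud storage", 2: "fog cloud", 3: "fog edge"}
index = build_index(docs)
hits = evaluate(("and", "cloud", ("not", "fog")), index, set(docs))
```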

This paper addresses the problem of sharing person-specific genomic sequences without violating the

privacy of their data subjects to support large-scale biomedical research projects. The proposed method

builds on the framework proposed by Kantarcioglu et al. [1] but extends the results in a number of

ways. One improvement is that our scheme is deterministic, with zero probability of a wrong answer

(as opposed to a low probability). We also provide a new operating point in the space-time tradeoff,

by offering a scheme that is twice as fast as theirs but uses twice the storage space. This point is

motivated by the fact that storage is cheaper than computation in current cloud computing pricing

plans. Moreover, our encoding of the data makes it possible for us to handle a richer set of queries than

exact matching between the query and each sequence of the database, including: (i) counting the

number of matches between the query symbols and a sequence; (ii) logical OR matches where a query

symbol is allowed to match a subset of the alphabet thereby making it possible to handle (as a special

case) a “not equal to” requirement for a query symbol (e.g., “not a G”); (iii) support for the extended

alphabet of nucleotide base codes that encompasses ambiguities in DNA sequences (this happens on

the DNA sequence side instead of the query side); (iv) queries that specify the number of occurrences

of each kind of symbol in the specified sequence positions (e.g., two ‘A’ and four ‘C’ and one ‘G’ and

three ‘T’, occurring in any order in the query-specified sequence positions); (v) a threshold query

whose answer is ‘yes’ if the number of matches exceeds a query-specified threshold (e.g., “7 or more

matches out of the 15 query-specified positions”). (vi) For all query types we can hide the answers

from the decrypting server, so that only the client learns the answer. (vii) In all cases, the client

deterministically learns only the query's answer, except for query type (v) where we quantify the (very

small) statistical leakage to the client of the actual count.

ETPL CLD - 016

Securing Aggregate Queries for DNA Databases
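
The meaning of query types (i), (ii), (iv) and (v) can be made concrete on plaintext sequences. This is a sketch of the query semantics only, with no encryption or privacy protection; the function names and the dictionary encoding of a query are hypothetical.

```python
from collections import Counter


def count_matches(sequence, query):
    """Types (i)/(ii): query maps a position to the set of symbols accepted there.

    A one-symbol set is an exact match; a larger set is a logical OR, so
    "not a G" is written as {"A", "C", "T"}.
    """
    return sum(1 for pos, allowed in query.items() if sequence[pos] in allowed)


def composition_match(sequence, positions, required):
    """Type (iv): required symbol counts over the given positions, in any order."""
    return Counter(sequence[p] for p in positions) == Counter(required)


def threshold_match(sequence, query, threshold):
    """Type (v): 'yes' when at least `threshold` positions match."""
    return count_matches(sequence, query) >= threshold


seq = "ACGTAC"
query = {0: {"A"}, 2: {"A", "C", "T"}, 5: {"C"}}   # position 2 means "not a G"
matches = count_matches(seq, query)
```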


Service migration between data centers can reduce the network overhead within a cloud infrastructure,

thereby also improving the quality of service for the clients. Most of the algorithms in the literature

assume that the client access pattern remains stable for a sufficiently long period so as to amortize such

migrations. However, if such an assumption does not hold, these algorithms can make arbitrarily poor

migration decisions that can substantially degrade system performance. In this paper, we approach the

issue of performing service migrations for an unknown and dynamically changing client access pattern.

We propose an online algorithm that minimizes the inter-data center network traffic, taking into account the

network load of migrating a service between two data centers, as well as the fact that the client request

pattern may change “quickly”, before such a migration is amortized. We provide a rigorous

mathematical proof showing that the algorithm is 3.8-competitive for a cloud network structured as a

tree of multiple data centers. We briefly discuss how the algorithm can be modified to work on general

graph networks with an O(log|V|) probabilistic approximation of the optimal algorithm. Finally, we

present an experimental evaluation of the algorithm based on extensive simulations.

ETPL CLD - 017

Online Inter-Data center Service Migrations

Storage requirements for visual data have been increasing in recent years, following the emergence of

many highly interactive multimedia services and applications for mobile devices in both personal and

corporate scenarios. This has been a key driving factor for the adoption of cloud-based data outsourcing

solutions. However, outsourcing data storage to the Cloud also leads to new security challenges that

must be carefully addressed, especially regarding privacy. In this paper we propose a secure framework

for outsourced privacy-preserving storage and retrieval in large shared image repositories. Our

proposal is based on IES-CBIR, a novel Image Encryption Scheme that exhibits Content-Based Image

Retrieval properties. The framework enables both encrypted storage and searching using Content-

Based Image Retrieval queries while preserving privacy against honest-but-curious cloud

administrators. We have built a prototype of the proposed framework, formally analyzed and proven

its security properties, and experimentally evaluated its performance and retrieval precision. Our results

show that IES-CBIR is provably secure, allows more efficient operations than existing proposals, both

in terms of time and space complexity, and paves the way for new practical application scenarios.

ETPL CLD - 018

Practical Privacy-Preserving Content-Based Retrieval in Cloud Image Repositories


Recent news reveals a powerful attacker who breaks data confidentiality by acquiring cryptographic

keys, by means of coercion or backdoors in cryptographic software. Once the encryption key is

exposed, the only viable measure to preserve data confidentiality is to limit the attacker’s access to the

ciphertext. This may be achieved, for example, by spreading ciphertext blocks across servers in

multiple administrative domains—thus assuming that the adversary cannot compromise all of them.

Nevertheless, if data is encrypted with existing schemes, an adversary equipped with the encryption

key can still compromise a single server and decrypt the ciphertext blocks stored therein. In this paper,

we study data confidentiality against an adversary which knows the encryption key and has access to

a large fraction of the ciphertext blocks. To this end, we propose Bastion, a novel and efficient scheme

that guarantees data confidentiality even if the encryption key is leaked and the adversary has access

to almost all ciphertext blocks. We analyze the security of Bastion, and we evaluate its performance

by means of a prototype implementation. We also discuss practical insights with respect to the

integration of Bastion in commercial dispersed storage systems. Our evaluation results suggest that

Bastion is well-suited for integration in existing systems since it incurs less than 5% overhead

compared to existing semantically secure encryption modes.

ETPL CLD - 019

Securing Cloud Data under Key Exposure

As cloud computing data centers grow in size and complexity to accommodate an increasing number

of virtual machines, the scalability of monitoring and management processes becomes a major

challenge. Recent research studies show that automatically clustering virtual machines that are similar

in terms of resource usage may address the scalability issues of IaaS clouds. Existing solutions provide

high clustering accuracy at the cost of very long observation periods that are not compatible with

dynamic cloud scenarios where VMs may frequently join and leave. We propose a novel technique,

namely AGATE (Adaptive Gray Area-based TEchnique), that provides accurate clustering results for

a subset of VMs after a very short time. This result is achieved by introducing elements of fuzzy logic

into the clustering process to identify the VMs with undecided clustering assignment (the so-called

gray area), that should be monitored for longer periods. To evaluate the performance of the proposed

solution, we apply the technique to multiple case studies with real and synthetic workloads. We

demonstrate that our solution can correctly identify the behavior of a high percentage of VMs after a few

hours of observations, and significantly reduce the data required for monitoring with respect to state-

of-the-art solutions.

ETPL CLD - 020

AGATE: Adaptive Gray Area-based Technique to Cluster Virtual Machines with

Similar Behaviour
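
The gray-area idea, separating VMs with a clear cluster assignment from those that need longer monitoring, can be sketched with fuzzy memberships. This is only an illustration: it uses fuzzy c-means style memberships and a fixed margin between the two best memberships, which are assumptions and not AGATE's actual criteria. All names and numbers are hypothetical, and at least two clusters are assumed.

```python
def fuzzy_memberships(distances, m=2.0):
    """Fuzzy c-means style memberships from distances to each cluster center."""
    if any(d == 0 for d in distances):
        # A VM sitting exactly on a center belongs fully to it.
        zeros = [1.0 if d == 0 else 0.0 for d in distances]
        return [z / sum(zeros) for z in zeros]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in distances)
            for d_i in distances]


def split_gray_area(vm_distances, margin=0.2):
    """Return (decided assignments, VMs left in the gray area)."""
    decided, gray = {}, []
    for vm, distances in vm_distances.items():
        u = fuzzy_memberships(distances)
        ranked = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
        if u[ranked[0]] - u[ranked[1]] >= margin:
            decided[vm] = ranked[0]   # clear winner: assign now
        else:
            gray.append(vm)           # undecided: keep monitoring
    return decided, gray


# Hypothetical distances of two VMs to two cluster centers.
decided, gray = split_gray_area({"vm-a": [1.0, 9.0], "vm-b": [4.0, 4.5]})
```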


Cloud Storage Providers (CSPs) offer geographically distributed data stores providing several storage classes with

different prices. An important problem facing cloud users is how to exploit these storage classes to

serve an application with a time-varying workload on its objects at minimum cost. This cost consists

of residential cost (i.e., storage, Put and Get costs) and potential migration cost (i.e., network cost). To

address this problem, we first propose the optimal offline algorithm that leverages dynamic and linear

programming techniques with the assumption of available exact knowledge of workload on objects.

Due to the high time complexity of this algorithm and its requirement for a priori knowledge, we

propose two online algorithms that make a trade-off between residential and migration costs and

dynamically select storage classes across CSPs. The first online algorithm is deterministic with no need

of any knowledge of workload and incurs no more than 2γ − 1 times the minimum cost obtained by

the optimal offline algorithm, where γ is the ratio of the residential cost in the most expensive data store

to the cheapest one in either network or storage cost. The second online algorithm is randomized and

leverages the “Receding Horizon Control” (RHC) technique with the exploitation of available future

workload information for w time slots. This algorithm incurs at most 1 + γ/w times the optimal cost. The

effectiveness of the proposed algorithms is demonstrated through simulations using a workload

synthesized based on characteristics of the Facebook workload.

ETPL CLD - 021

Cost Optimization for Dynamic Replication and Migration of Data in Cloud Data

Centers
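
The trade-off between residential and migration costs in the offline setting can be illustrated with a small dynamic program. This is a minimal sketch for a single object with fully known per-slot costs; it omits replication and the linear programming part mentioned in the abstract. The function name and the cost numbers are illustrative assumptions.

```python
def min_cost_schedule(residential, migration):
    """Cheapest storage class per time slot for one object.

    residential[t][c]: cost of keeping the object in class c during slot t.
    migration[a][b]: cost of moving it from class a to class b (0 if a == b).
    Returns (total cost, list of chosen classes, one per slot).
    """
    n_classes = len(migration)
    cost = list(residential[0])
    back = []
    for t in range(1, len(residential)):
        new_cost, pointers = [], []
        for c in range(n_classes):
            # Best class to have been in during the previous slot.
            prev = min(range(n_classes), key=lambda a: cost[a] + migration[a][c])
            new_cost.append(cost[prev] + migration[prev][c] + residential[t][c])
            pointers.append(prev)
        cost = new_cost
        back.append(pointers)
    last = min(range(n_classes), key=lambda c: cost[c])
    total = cost[last]
    plan = [last]
    for pointers in reversed(back):   # walk the back-pointers to slot 0
        plan.append(pointers[plan[-1]])
    plan.reverse()
    return total, plan


# Hypothetical costs: class 0 is cheap while the object is hot, class 1 later.
residential = [[1, 5], [1, 5], [4, 1], [4, 1]]
migration = [[0, 2], [2, 0]]
total, plan = min_cost_schedule(residential, migration)
```

The program needs the costs of every future slot up front, which is the a priori knowledge the abstract cites as the motivation for the online algorithms.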

Data centers benefit cloud applications by providing high scalability and ensuring service availability.

However, virtual machine (VM) placement in data centers poses new challenges for service

provisioning. For many cloud services such as storage and video streaming, present placement

approaches are unable to support network-demanding services due to overwhelming communication

traffic and time. Therefore VM placement concerning link capacity is vital to cloud data centers. In

this paper, we define the network-aware VM placement optimization (NAVMPO) problem based on

integer linear programming. The objective function of NAVMPO problem aims to minimize

communication time for VMs of the same service type. Then we propose the service-oriented physical

machine (PM) selection (SOPMS) algorithm and link-aware VM placement (LAVMP) algorithm. The

SOPMS algorithm selects the most appropriate PM based on service-oriented architecture, and then

the LAVMP algorithm deploys the most suitable VM to the target PM with regard to the link capacity

between them. Simulation results show that the proposed placement approach significantly decreases

communication time compared to existing non-service-oriented and service-oriented VM placement

algorithms, and also improves the average utility rate of PMs with lower power consumption.

ETPL CLD - 022

Link-aware Virtual Machine Placement for Cloud Services based on Service-

Oriented Architecture


Clustering techniques have been widely adopted in many real world data analysis applications, such as

customer behavior analysis, targeted marketing, digital forensics, etc. With the explosion of data in

today’s big data era, a major trend for handling clustering over large-scale datasets is outsourcing it to

public cloud platforms. This is because cloud computing offers not only reliable services with

performance guarantees, but also savings on in-house IT infrastructures. However, as datasets used for

clustering may contain sensitive information, e.g., patient health information, commercial data, and

behavioral data, etc., directly outsourcing them to public cloud servers inevitably raises privacy

concerns.

ETPL CLD - 023

Practical Privacy-Preserving MapReduce Based K-means Clustering over

Large-scale Dataset

Cloud computing has recently emerged as an important service to manage applications efficiently over

the Internet. Various cloud providers offer pay-per-use cloud services that require Quality of Service

(QoS) management to efficiently monitor and measure the delivered services through the Internet of

Things (IoT) and thus need to follow Service Level Agreements (SLAs). However, providing

dedicated cloud services that ensure user's dynamic QoS requirements by avoiding SLA violations is

a big challenge in cloud computing. As the dynamism, heterogeneity and complexity of the cloud environment

increase rapidly, cloud systems become insecure and unmanageable. To overcome these problems,

cloud systems require self-management of services. Therefore, there is a need to develop a resource

management technique that automatically manages QoS requirements of cloud users thus helping the

cloud providers in achieving the SLAs and avoiding SLA violations. In this paper, we present an

SLA-aware autonomic resource management technique called STAR, which mainly focuses on reducing

the SLA violation rate for the efficient delivery of cloud services. The performance of the proposed

technique has been evaluated in a cloud environment. The experimental results demonstrate that

STAR is efficient in reducing the SLA violation rate and in optimizing other QoS parameters that affect

efficient cloud service delivery.

ETPL CLD - 024

STAR: SLA-aware Autonomic Management of Cloud Resources


Power management of cloud data centers has received great attention from industry and academia as

they are expensive to operate due to their high energy consumption. While hosts are the dominant

consumers of electric power, networks account for 10% to 20% of the total energy costs in a data center.

Resource overbooking is one way to reduce the usage of active hosts and networks by placing more

requests on the same amount of resources. Network resource overbooking can be facilitated by Software

Defined Networking (SDN), which can consolidate traffic and control Quality of Service (QoS)

dynamically. However, the existing approaches employ a fixed overbooking ratio to decide the amount

of resources to be allocated, which in reality may cause excessive Service Level Agreement (SLA)

violations when workloads are unpredictable. In this paper, we propose a dynamic overbooking strategy

that jointly leverages virtualization capabilities and SDN for VM and traffic consolidation. With the

dynamically changing workload, the proposed strategy allocates a more precise amount of resources to

VMs and traffic. This strategy can increase overbooking in a host and network while still providing

enough resources to minimize SLA violations. Our approach calculates the resource allocation ratio based

on the historical monitoring data from the online analysis of the host and network utilization without

any pre-knowledge of workloads. We implemented it in a large-scale simulation environment to

demonstrate the effectiveness in the context of Wikipedia workloads. Our approach saves energy

consumption in the data center while reducing SLA violations.

ETPL CLD - 025

SLA-aware and Energy-Efficient Dynamic Overbooking in SDN-based Cloud Data

Centers

Cloud applications built on service-oriented architectures generally integrate a number of component

services to fulfill certain application logic. The changing cloud environment highlights the need for

these applications to remain resilient against QoS variations of their component services so that

end-to-end quality-of-service (QoS) can be guaranteed. Runtime service adaptation is a key technique to

achieve this goal. To support timely and accurate adaptation decisions, effective and efficient QoS

prediction is needed to obtain real-time QoS information of component services. However, current

research has focused mostly on QoS prediction of working services that are being used by a cloud

application, but little on predicting QoS values of candidate services that are equally important in

determining optimal adaptation actions. In this paper, we propose an adaptive matrix factorization

(namely AMF) approach to perform online QoS prediction for candidate services. AMF is inspired

by the widely-used collaborative filtering techniques in recommender systems, but significantly

extends the conventional matrix factorization model with new techniques of data transformation, online

learning, and adaptive weights. Comprehensive experiments, as well as a case study, have been

conducted based on a real-world QoS dataset of Web services (with over 40 million QoS records). The

evaluation results demonstrate AMF’s superiority in achieving accuracy, efficiency, and robustness,

which are essential to enable optimal runtime service adaptation.

ETPL CLD - 026

Online QoS Prediction for Runtime Service Adaptation via Adaptive Matrix

Factorization
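
The online-learning idea behind the abstract above can be sketched with plain matrix factorization trained by stochastic gradient descent, one QoS observation at a time. This is an illustrative baseline only: AMF's data transformation and adaptive weights are not reproduced here, and the function names, learning rate `lr`, and regularization `reg` are assumptions made for the example.

```python
import random


def make_model(n_users, n_services, rank=2, seed=0):
    """Small random latent factors for users and services."""
    rng = random.Random(seed)
    users = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    services = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_services)]
    return users, services


def predict(users, services, u, s):
    """Predicted QoS value: inner product of the two latent vectors."""
    return sum(a * b for a, b in zip(users[u], services[s]))


def online_update(users, services, u, s, observed, lr=0.05, reg=0.01):
    """One SGD step on a single (user, service, value) record from the stream."""
    err = observed - predict(users, services, u, s)
    for k in range(len(users[u])):
        uk, sk = users[u][k], services[s][k]
        users[u][k] += lr * (err * sk - reg * uk)
        services[s][k] += lr * (err * uk - reg * sk)
    return err


# Hypothetical stream: user 0 repeatedly observes a normalized QoS of 0.8 on service 1.
users, services = make_model(n_users=2, n_services=2)
for _ in range(500):
    online_update(users, services, 0, 1, 0.8)
print(round(predict(users, services, 0, 1), 2))
```

Because each update touches only one user vector and one service vector, the model can follow a stream of records without retraining from scratch, which is the property an online predictor needs.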


Far-edge analytics refers to the enablement of data mining algorithms in far-edge mobile devices that

are part of mobile edge cloud computing (MECC) systems. Far-edge analytics enables data reduction

in mobile environments, hence reducing the data transfer rate and bandwidth utilization cost for

mobile-edge communication. In addition, far-edge analytics facilitates local knowledge availability to

enable personalized mobile data stream mining applications. Existing literature mainly addresses

classification and clustering problems in far-edge mobile devices, but the problem of frequent pattern

mining (FPM) remains unexplored. This paper presents the results of an experimental study on the

performance profiling of frequent pattern mining algorithms. We developed a real mobile application

for performance analysis and profiling of 21 FPM algorithms with various real data sets in terms of

execution time, storage complexity, sparsity, density, and data set size. According to the experimental

results, large-sized data sets with high sparsity increase computational and storage cost in far-edge

mobile devices. To address these issues, we propose a framework and discuss the relevant research

challenges for seamless execution of FPM algorithms in MECC systems.

ETPL CLD - 027

Enabling Far-Edge Analytics: Performance Profiling of Frequent Pattern Mining

Algorithms

As cloud computing becomes increasingly popular, cloud providers compete to offer the same or

similar services over the Internet. Quality of service (QoS), which describes how well a service is

performed, is an important differentiator among functionally equivalent services. It can help a firm to

satisfy and win its customers. As a result, how to assist cloud providers to promote their services and

cloud consumers to identify services that meet their QoS requirements becomes an important problem.

In this paper, we argue for QoS-based cloud service recommendation, and propose a collaborative

filtering approach using the Spearman coefficient to recommend cloud services. The approach is used

to predict both QoS ratings and rankings for cloud services. To evaluate the effectiveness of the

approach, we conduct extensive simulations. Results show that the approach can achieve more reliable

rankings, yet less accurate ratings, than a collaborative filtering approach using the Pearson coefficient.

ETPL CLD - 028

QoS Recommendation in Cloud Services
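
To make the role of the Spearman coefficient in the abstract above concrete, the following sketch computes it as the Pearson correlation of rank vectors and uses it as the similarity in a simple neighbourhood-based prediction. The prediction rule, the function names, and the sample ratings are assumptions chosen for illustration and are not taken from the paper.

```python
def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman coefficient = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def predict_rating(target, neighbours, service):
    """Target's mean plus the similarity-weighted deviation of positive neighbours."""
    mean_t = sum(target.values()) / len(target)
    num = den = 0.0
    for other in neighbours:
        common = [s for s in target if s in other]
        if service not in other or len(common) < 2:
            continue
        sim = spearman([target[s] for s in common], [other[s] for s in common])
        if sim <= 0:
            continue
        mean_o = sum(other[s] for s in common) / len(common)
        num += sim * (other[service] - mean_o)
        den += sim
    return mean_t + num / den if den else mean_t


# Hypothetical QoS ratings keyed by service name.
alice = {"a": 1, "b": 2, "c": 3}
bob = {"a": 2, "b": 4, "c": 6, "d": 8}
print(predict_rating(alice, [bob], "d"))
```

Since Spearman depends only on the ordering of the ratings, two users who rank services identically are treated as fully similar even when their rating scales differ, which is consistent with the abstract's finding of more reliable rankings but less accurate ratings.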


Distributed Denial of Service (DDoS) constitutes a major threat against cloud systems owing to the

large financial losses it incurs. This motivated the security research community to investigate numerous

detection techniques to limit such attacks’ effects. Yet, the existing solutions are still not mature enough

to satisfy a cloud-dedicated detection system’s requirements since they overlook the attacker’s wily

strategies that exploit the cloud’s elastic and multi-tenant properties, and ignore the cloud system’s

resource constraints. Motivated by this fact, we propose a two-fold solution that allows, firstly, the

hypervisor to establish credible trust relationships toward guest Virtual Machines (VMs) by

considering objective and subjective trust sources and employing Bayesian inference to aggregate

them. On top of the trust model, we design a trust-based maximin game between DDoS attackers trying

to minimize the cloud system’s detection and the hypervisor trying to maximize this minimization under

a limited budget of resources. The game solution guides the hypervisor to determine the optimal

detection load distribution among VMs in real-time that maximizes DDoS attacks’ detection.

Experimental results reveal that our solution maximizes attacks’ detection, decreases false positives

and negatives, and minimizes CPU, memory and bandwidth consumption during DDoS attacks

compared to the existing detection load distribution techniques.

ETPL CLD - 029

Optimal Load Distribution for the Detection of VM-based DDoS Attacks in the

Cloud

In current cloud computing systems, when leveraging virtualization technology, the customer’s

requested data computing or storing service is accommodated by a set of communicating virtual

machines (VMs) in a scalable and elastic manner. These VMs are placed in one or more server nodes

according to the node capacities or failure probabilities. The VM placement availability refers to the

probability that at least one set of all the customer’s requested VMs operates during the requested lifetime.

In this paper, we first study the problem of placing at most H groups of k requested VMs on a minimum

number of nodes, such that the VM placement availability is no less than a given threshold, and that the specified

communication delay and connection availability for each VM pair under the same placement group

are not violated. We consider this problem with and without Shared-Risk Node Group (SRNG) failures,

and prove this problem is NP-hard in both cases. We subsequently propose an exact Integer Nonlinear

Program (INLP) and an efficient heuristic to solve this problem. We conduct simulations to compare

the proposed algorithms with two existing heuristics in terms of performance. Finally, we study the

related reliable routing problem of establishing a connection over at most w link-disjoint paths from a

source to a destination, such that the connection availability requirement is satisfied and each path

delay is no more than a given value. We devise an exact algorithm and two heuristics to solve this NP-

hard problem, and evaluate them via simulations.

ETPL CLD - 030

Reliable Virtual Machine Placement and Routing in Clouds
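
The availability definition in the abstract above (at least one complete set of the requested VMs operates) can be worked through numerically. The sketch below assumes independent node failures and placement groups on disjoint nodes, which is a simplification: the paper also treats Shared-Risk Node Group failures, which this example does not model. The function name and the availability values are hypothetical.

```python
from math import prod


def placement_availability(groups):
    """Probability that at least one placement group is fully operational.

    groups: one list per placement group, holding the availability of each
    distinct node that hosts that group's VMs (assumed independent).
    """
    # A group works only if every node hosting it works.
    all_groups_fail = prod(1.0 - prod(nodes) for nodes in groups)
    return 1.0 - all_groups_fail


# Two hypothetical groups, each spread over two nodes with availability 0.9.
print(round(placement_availability([[0.9, 0.9], [0.9, 0.9]]), 4))  # 0.9639
```

A single group here is available with probability 0.81, so adding the second group raises availability to 0.9639 at the price of using more nodes, which is the tension the placement problem has to resolve.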


By locally solving an optimization problem and broadcasting an update message over the underlying

communication infrastructure, demand response programs based on the distributed optimization model

encourage all users to participate in the program. However, some challenging issues present

themselves, such as the existence of an ideal communication network, especially when utilizing

wireless communication, and the effects of communication channel properties, like the bit error rate,

on the overall performance of the demand response program. To address the issues, this paper first

defines a Cloud-based Demand Response (CDR) model, which is implemented as a two-tier cloud

computing platform. Then a communication model is proposed to evaluate the communication

performance of both the CDR and DDR (Distributed Demand Response) models. The present study

shows that when users are finely clustered, the channel bit error rate is high and the User Datagram

Protocol (UDP) is leveraged to broadcast the update messages, making the optimal solution

unachievable. In contrast to UDP, the Transmission Control Protocol (TCP) consumes

more bandwidth and increases the delay in the convergence time. Finally, the current work presents

a cost-effectiveness analysis which confirms that achieving higher demand response performance

incurs a higher communication cost.

ETPL CLD - 031

On the Performance of Distributed and Cloud-Based Demand Response in Smart

Grid

Remote data access control is of crucial importance in public cloud. Based on its own inclinations, the

data owner predefines the access policy. When the user satisfies the data owner’s access policy, it has

the right to access the data owner’s remote data. In order to improve flexibility and efficiency of remote

data access control, attribute-based encryption (for short, ABE) is used to realize fine-grained

access control over remote data. For the low-capacity terminals, verifiable outsourced decryption is a very

attractive technique. In the real application scenarios, the user’s attributes are usually managed by many

authorities. When some authorized users access some sensitive remote data, they hope to preserve their

identity privacy. Motivated by these two points, we propose an anonymous distributed fine-grained access

control protocol with verifiable outsourced decryption in public cloud (for short, VOD-ADAC). VOD-

ADAC is a novel concept which is proposed for the first time in the paper. By adopting the pseudonym

technique, the user’s high anonymity can be achieved by frequently changing the independent

pseudonyms at some highly social spots. This paper formalizes the system model and security model

of VOD-ADAC protocol. Then, by using hybrid encryption technique of distributed ABE and

symmetric encryption, a concrete VOD-ADAC protocol is designed from the bilinear pairings.

Through security analysis and performance analysis, our proposed VOD-ADAC protocol is provably

secure and efficient.

ETPL CLD - 032

VOD-ADAC: Anonymous Distributed Fine-Grained Access Control Protocol with

Verifiable Outsourced Decryption in Public Cloud


With mobile devices increasingly able to connect to cloud servers from anywhere, resource-constrained

devices can potentially perform offloading of computational tasks to either save local resource usage

or improve performance. It is of interest to find optimal assignments of tasks to local and remote

devices that can take into account the application-specific profile, availability of computational

resources, and link connectivity, and find a balance between energy consumption costs of mobile

devices and latency for delay-sensitive applications. We formulate an NP-hard problem to minimize

the application latency while meeting prescribed resource utilization constraints. Unlike most

existing works, which rely either on an integer programming solver or on heuristics that offer no

theoretical performance guarantees, we propose Hermes, a novel fully polynomial time approximation

scheme (FPTAS). We show that for a subset of problem instances, where the application task graphs can

be described as serial trees, Hermes provides a solution with latency no more than (1 + ε) times the

minimum while incurring complexity that is polynomial in the problem size and 1/ε. We further propose an

online algorithm to learn the unknown dynamic environment and guarantee that the performance gap

compared to the optimal strategy is bounded by a logarithmic function of time. Evaluation

using real data sets collected from several benchmarks shows that Hermes improves the

latency by 16% compared to a previously published heuristic and increases CPU computing time by

only 0.4% of overall latency.

ETPL CLD - 033

Hermes: Latency Optimal Task Assignment for Resource-constrained Mobile

Computing

Due to the complexity and volume, outsourcing ciphertexts to a cloud is deemed to be one of the most

effective approaches for big data storage and access. Nevertheless, verifying the access legitimacy of

a user and securely updating a ciphertext in the cloud based on a new access policy designated by the

data owner are two critical challenges to make cloud-based big data storage practical and effective.

Traditional approaches either completely ignore the issue of access policy update or delegate the update

to a third party authority; but in practice, access policy update is important for enhancing security and

dealing with the dynamism caused by user join and leave activities. In this paper, we propose a secure

and verifiable access control scheme based on the NTRU cryptosystem for big data storage in clouds.

We first propose a new NTRU decryption algorithm to overcome the decryption failures of the original

NTRU, and then detail our scheme and analyze its correctness, security strengths, and computational

efficiency. Our scheme allows the cloud server to efficiently update the ciphertext when a new access

policy is specified by the data owner, who is also able to validate the update to counter against cheating

behaviors of the cloud. It also enables (i) the data owner and eligible users to effectively verify the

legitimacy of a user for accessing the data, and (ii) a user to validate the information provided by other

users for correct plaintext recovery. Rigorous analysis indicates that our scheme can prevent eligible

users from cheating and resist various attacks such as the collusion attack.

ETPL CLD - 034

A Secure and Verifiable Access Control Scheme for Big Data Storage in Clouds


Live VM migration helps attain both cloud-wide load balancing and operational consolidation while

the migrating VMs remain accessible to users. To avoid periods of high-load for the involved resources,

IaaS-cloud operators assign specific time windows for such migrations to occur in an orderly manner.

Moreover, providers typically rely on share-nothing architectures to attain scalability. In this paper, we

focus on the real-time scheduling of live VM migrations in large share-nothing IaaS clouds, such that

migrations are completed on time and without adversely affecting agreed-upon SLAs. We propose a

scalable, distributed network of brokers that oversees the progress of all on-going migration operations

within the context of a provider. Brokers make use of an underlying special purpose file system,

termed MigrateFS, that is capable of both replicating and keeping in sync virtual disks while the

hypervisor live-migrates VMs (i.e., RAM and CPU state). By limiting the resources consumed during

migration, brokers implement policies to reduce SLA violations while seeking to complete all

migration tasks on time. We evaluate two such policies, one based on task prioritization and a second

that considers the financial implications set by migration deadline requirements. Using our MigrateFS

prototype operating on a real cloud, we demonstrate the feasibility of performing migrations within

time windows. By simulating large clouds, we assess the effectiveness of our proposed broker policies

in a share-nothing configuration.

ETPL CLD - 035

Live VM Migration under Time-Constraints in Share-Nothing IaaS-Clouds

In this paper, we present the QuantCloud infrastructure, designed for performing big data analytics in

modern quantitative finance. Through analyzing market observations, quantitative finance (QF)

utilizes mathematical models to search for subtle patterns and inefficiencies in financial markets to

improve prospective profits. To discover profitable signals in anticipation of volatile trading patterns

amid a global market, analytics are carried out on Exabyte-scale market metadata with a complex

process in pursuit of a microsecond or even a nanosecond of data processing advantage. This objective

motivates the development of innovative tools to address challenges for handling high volume,

velocity, and variety investment instruments. Inspired by this need, we developed QuantCloud by

employing large-scale SSD-backed datastore, various parallel processing algorithms, and portability in

Cloud computing. QuantCloud bridges the gap between model computing techniques and financial

data-driven research. The large volume of market data is structured in an SSD-backed datastore, and a

daemon reacts to provide the Data-on-Demand services. Multiple client services process user requests

in a parallel mode and query on-demand datasets from the datastore through Internet connections. We

benchmark QuantCloud performance on a 40-core, 1TB-memory computer and a 5-TB SSD-backed

datastore. We use NYSE TAQ data from the fourth quarter of 2014 as our market data. The results

indicate data-access application latency as low as 3.6 nanoseconds per message, sustained throughput

for parallel data processing as high as 74 million messages per second, and completion of 11 petabyte-

level data analytics within 53 minutes.

ETPL CLD - 036

QuantCloud: Big Data Infrastructure for Quantitative Finance on the Cloud


With the fast growth of applications of service-oriented architecture (SOA) in software engineering,

there has been a rapid increase in demand for building service-based systems (SBSs) by composing

existing Web services. Finding appropriate component services to compose is a key step in the SBS

engineering process. Existing approaches require that system engineers have detailed knowledge of

SOA techniques which is often too demanding. To address this issue, we propose KS3 (Keyword

Search for Service-based Systems), a novel approach that integrates and automates the system

planning, service discovery and service selection operations for building SBSs based on keyword

search. KS3 assists system engineers without detailed knowledge of SOA techniques in searching for

component services to build SBSs by typing a few keywords that represent the tasks of the SBSs with

quality constraints and optimisation goals for system quality, e.g., reliability, throughput and cost. KS3

offers a new paradigm for SBS engineering that can significantly save the time and effort during the

system engineering process. We conducted large-scale experiments using a real-world Web service

dataset to demonstrate the practicality, effectiveness and efficiency of KS3.

ETPL CLD - 037

Keyword Search for Building Service-Based Systems

Traffic volumes in mobile networks are rising and end-user needs are rapidly changing. Mobile

network operators need more flexibility, lower network operating costs, faster service roll-out cycles

and new revenue sources. 5G (5th Generation) and future networks aim to deliver ultra-fast and ultra-

reliable network access capable of supporting the anticipated surge in data traffic and connected nodes

in years to come. Several technologies have been developed to meet these emergent demands of future

mobile networks, among these are Software Defined Networking (SDN), Network Function

Virtualization (NFV) and cloud computing. In this paper, we discuss the security challenges these new

technologies are prone to in the context of the new telecommunication paradigm. We present a multi-

tier component based security architecture to address these challenges and secure 5G Software Defined

Mobile Network (SDMN), by handling security at different levels to protect the network and its users.

The proposed architecture contains five components, i.e., Secure Communication (SC), Policy Based

Communication (PBC), Security Information and Event Management (SIEM), Security Defined

Monitoring (SDM) and Deep Packet Inspection (DPI), for elevated security in the control

and the data planes of SDMNs. Finally, the proposed security mechanisms are validated using testbed

experiments.

ETPL CLD - 038

Enhancing Security of Software Defined Mobile Networks


With the growing amount of data, the demand for big data storage significantly increases. Through the

cloud center, data providers can conveniently share data stored in the center with others. However, one

practically important problem in big data storage is privacy. During the sharing process, data is

encrypted to be confidential and anonymous. Such operation can protect privacy from being leaked

out. To satisfy practical conditions, data transmission to multiple receivers is also considered.

Furthermore, this paper proposes the notion of pre-authentication for the first time, i.e., only users with

certain attributes that have already. The pre-authentication mechanism combines the advantages of

proxy conditional re-encryption multi-sharing mechanism with the attribute-based authentication

technique, thus achieving attributes authentication before re-encryption, and ensuring the security of

the attributes and data. Moreover, this paper finally proves that the system is secure and the proposed

pre-authentication mechanism could significantly enhance the system security level.

ETPL CLD - 039

A Pre-Authentication Approach to Proxy Re-encryption in Big Data Context

Today, cloud storage has become one of the critical services, because users can easily modify and share

data with others in the cloud. However, the integrity of shared cloud data is vulnerable to inevitable

hardware faults, software failures or human errors. To ensure the integrity of the shared data, some

schemes have been designed to allow public verifiers (i.e., third party auditors) to efficiently audit data

integrity without retrieving the entire users’ data from cloud. Unfortunately, public auditing on the

integrity of shared data may reveal data owners’ sensitive information to the third party auditor. In this

paper, we propose a new privacy-aware public auditing mechanism for shared cloud data by

constructing a homomorphic verifiable group signature. Unlike the existing solutions, our scheme

requires at least t group managers to recover a trace key cooperatively, which eliminates the abuse of

single-authority power and provides nonframeability. Moreover, our scheme ensures that group users

can trace data changes through a designated binary tree and can recover the latest correct data block

when the current data block is damaged. In addition, the formal security analysis and experimental

results indicate that our scheme is provably secure and efficient.

ETPL CLD - 040

NPP: A New Privacy-Aware Public Auditing Scheme for Cloud Data Sharing with

Group Users


Fog computing aims at extending the Cloud by bringing computational power, storage and

communication capabilities to the edge of the network, in support of the IoT. Segmentation,

distribution and adaptive deployment of functionalities over the continuum from Things to Cloud are

challenging tasks, due to the intrinsic heterogeneity, hierarchical structure and very large scale

infrastructure they will have to exploit. In this paper, we propose a simple, yet general, model to support

the QoS-aware deployment of multi-component IoT applications to Fog infrastructures. The model

describes operational systemic qualities of the available infrastructure (latency and bandwidth),

interactions among software components and Things, and business policies. Algorithms to determine

eligible deployments for an application to a Fog infrastructure are presented. A Java tool, FogTorch,

based on the proposed model has been prototyped.

ETPL CLD - 041

QoS-aware Deployment of IoT Applications through the Fog

Optimal placement and selection of service instances in a distributed heterogeneous cloud is a complex

trade-off between application requirements and resource capabilities that requires detailed information

on the service, infrastructure constraints, and the underlying IP network. In this article we first posit

that from an analysis of a snapshot of today’s centralized and regional data center infrastructure, there

is a sufficient number of candidate sites for deploying many services while meeting latency and

bandwidth constraints. We then provide quantitative arguments why both network and hardware

performance needs to be taken into account when selecting candidate sites to deploy a given service.

Finally, we propose a novel architectural solution for service-centric networking. The resulting system

exploits the availability of fine-grained execution nodes across the Internet and uses knowledge of

available computational and network resources for deploying, replicating and selecting instances to

optimize quality of experience for a wide range of services.

ETPL CLD - 042

Service-Centric Networking for Distributed Heterogeneous Clouds


Many cloud service providers (CSPs) provide data storage services with datacenters distributed

worldwide. These datacenters provide different get/put latencies and unit prices for resource utilization

and reservation. Thus, when selecting different CSPs' datacenters, cloud customers of globally

distributed applications (e.g., online social networks) face two challenges: 1) how to allocate data to

worldwide datacenters to satisfy application service level objective (SLO) requirements, including both

data retrieval latency and availability; and 2) how to allocate data and reserve resources in datacenters

belonging to different CSPs to minimize the payment cost. To handle these challenges, we first model

the cost minimization problem under SLO constraints using integer programming. Due to its

NP-hardness, we then introduce our heuristic solution, including a dominant-cost-based data allocation

algorithm and an optimal resource reservation algorithm. We further propose three enhancement

methods to reduce the payment cost and service latency: 1) coefficient-based data reallocation; 2)

multicast-based data transferring; and 3) request redirection-based congestion control. We finally

introduce an infrastructure to enable the execution of the algorithms. Our trace-driven experiments

on a supercomputing cluster and on real clouds (i.e., Amazon S3, Windows Azure Storage, and Google

Cloud Storage) show the effectiveness of our algorithms for SLO guaranteed services and customer

cost minimization.

ETPL CLD - 043

Minimum-Cost Cloud Storage Service across Multiple Cloud Providers

Cloud-supported Internet of Things (Cloud-IoT) has been broadly deployed in smart grid systems. The

IoT front-ends are responsible for data acquisition and status supervision, while the substantial amount

of data is stored and managed in the cloud server. Achieving data security and system efficiency in the

data acquisition and transmission process is both important and challenging, because power

grid-related data is sensitive and voluminous. In this paper, we present an efficient and secure data

acquisition scheme based on CP-ABE (Ciphertext Policy Attribute Based Encryption). Data acquired

from the terminals will be partitioned into blocks, and each block is encrypted with its corresponding access sub-tree

in sequence, so that data encryption and data transmission can proceed in parallel.

Furthermore, we protect the information about the access tree with a threshold secret sharing method,

which can preserve the data privacy and integrity from users with unauthorized sets of attributes.

The formal analysis demonstrates that the proposed scheme can fulfill the security requirements of the

Cloud-supported IoT in smart grid. The numerical analysis and experimental results indicate that our

scheme can effectively reduce the time cost compared with other popular approaches.

ETPL CLD - 044

Achieving Efficient and Secure Data Acquisition for Cloud-supported Internet of

Things in Smart Grid
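
The abstract above mentions a threshold secret sharing method without naming a specific construction. As a minimal sketch, assuming Shamir's (t, n) scheme over a prime field, the idea can be illustrated as follows. The modulus `PRIME` and the function names `split_secret` and `recover_secret` are hypothetical choices for this example and are not taken from the paper.

```python
import random

# Assumed field modulus for this sketch (the Mersenne prime 2^127 - 1).
PRIME = 2**127 - 1

def split_secret(secret, threshold, num_shares, rng=None):
    """Split `secret` into `num_shares` points; any `threshold` of them recover it."""
    rng = rng or random.SystemRandom()
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(1, PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = split_secret(123456789, threshold=3, num_shares=5)
recovered = recover_secret(shares[:3])
```

Any three of the five shares reconstruct the value, while fewer than three reveal nothing about it under this scheme. How the paper applies sharing to the access tree itself is not described in the abstract.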


The development of cloud computing pours great vitality into traditional wireless sensor networks

(WSNs). The integration of WSNs and cloud computing has received a lot of attention from both

academia and industry. However, collecting data from WSNs to cloud is not sustainable. Due to the

weak communication ability of WSNs, uploading big sensed data to the cloud within a limited time

becomes a bottleneck. Moreover, the limited power of sensors usually results in a short lifetime of

WSNs. To solve these problems, we propose to use multiple mobile sinks (MSs) to help with data

collection. We formulate a new problem which focuses on collecting data from WSNs to cloud within

a limited time, and this problem is proved to be NP-hard. To reduce the delivery latency caused by

unreasonable task allocation, a time adaptive schedule algorithm (TASA) for data collection via

multiple MSs is designed, with several provable properties. In TASA, a non-overlapping and adjustable

trajectory is projected for each MS. In addition, a minimum cost spanning tree (MST) based routing

method is designed to save the transmission cost. We conduct extensive simulations to evaluate the

performance of the proposed algorithm. The results show that TASA can collect the data from

WSNs to the cloud within the limited latency and optimize the energy consumption, which makes the

sensor-cloud sustainable.

ETPL CLD - 045

Sustainable and Efficient Data Collection from WSNs to Cloud
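
The abstract above refers to a minimum cost spanning tree (MST) based routing method but gives no details. The following is a minimal sketch of building such a tree with Prim's algorithm, assuming an undirected graph of sensor links with known transmission costs. The node names and cost values are hypothetical, and the paper's actual routing method may differ.

```python
import heapq

def minimum_spanning_tree(graph, root):
    """Prim's algorithm.

    graph: dict mapping node -> dict of neighbor -> link cost (undirected).
    Returns a list of (parent, child, cost) edges rooted at `root`.
    """
    visited = {root}
    frontier = [(cost, root, nbr) for nbr, cost in graph[root].items()]
    heapq.heapify(frontier)
    tree = []
    while frontier and len(visited) < len(graph):
        cost, u, v = heapq.heappop(frontier)
        if v in visited:
            continue  # this edge would close a cycle
        visited.add(v)
        tree.append((u, v, cost))
        for nbr, c in graph[v].items():
            if nbr not in visited:
                heapq.heappush(frontier, (c, v, nbr))
    return tree

# Hypothetical topology: one sink and three sensor nodes.
links = {
    "sink": {"a": 2, "b": 5},
    "a": {"sink": 2, "b": 1, "c": 4},
    "b": {"sink": 5, "a": 1, "c": 3},
    "c": {"a": 4, "b": 3},
}
tree = minimum_spanning_tree(links, "sink")
total_cost = sum(cost for _, _, cost in tree)
```

Each node then forwards data toward the sink along its parent edge, so the total link cost used for collection is the smallest that still connects every node.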

As accurate malware detection on mobile devices requires fast processing of a large number of application

traces, cloud-based malware detection can utilize the data sharing and powerful computational

resources of security servers to improve the detection performance. In this paper, we investigate the

cloud-based malware detection game, in which mobile devices offload their application traces to

security servers via base stations or access points in dynamic networks. We derive the Nash equilibrium

(NE) of the static malware detection game and present the existence condition of the NE, showing how

mobile devices share their application traces at the security server to improve the detection accuracy,

and compete for the limited radio bandwidth, the computational and communication resources of the

server. We design a malware detection scheme with Q-learning for a mobile device to derive the

optimal offloading rate without knowing the trace generation and the radio bandwidth model of other

mobile devices. The detection performance is further improved with the Dyna architecture, in which a

mobile device learns from the hypothetical experience to increase its convergence rate. We also design

a post-decision state learning-based scheme that utilizes the known radio channel model to accelerate

the reinforcement learning process in the malware detection. Simulation results show that the proposed

schemes improve the detection accuracy, reduce the detection delay and increase the utility of a mobile

device in the dynamic malware detection game, compared with the benchmark strategy.

ETPL CLD - 046

Cloud-based Malware Detection Game for Mobile Devices with Offloading
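
The abstract above describes a mobile device using Q-learning to choose its offloading rate. A minimal sketch of the standard tabular Q-learning update with epsilon-greedy selection is shown below, assuming a small set of discretized offloading rates. The state label, utility value, and the parameters `alpha`, `gamma`, and `epsilon` are hypothetical; the paper's state and utility definitions are not given in the abstract.

```python
import random

def choose_rate(q_table, state, rates, epsilon, rng):
    """Epsilon-greedy choice among the discretized offloading rates."""
    if rng.random() < epsilon:
        return rng.choice(rates)  # explore
    return max(rates, key=lambda r: q_table.get((state, r), 0.0))  # exploit

def q_update(q_table, state, rate, utility, next_state, rates,
             alpha=0.5, gamma=0.9):
    """One Q-learning step: Q <- Q + alpha * (u + gamma * max Q' - Q)."""
    best_next = max(q_table.get((next_state, r), 0.0) for r in rates)
    old = q_table.get((state, rate), 0.0)
    q_table[(state, rate)] = old + alpha * (utility + gamma * best_next - old)
    return q_table[(state, rate)]

rates = [0.0, 0.5, 1.0]   # assumed candidate offloading rates
q = {}
rng = random.Random(0)
first = q_update(q, "low_bw", 0.5, utility=1.0, next_state="low_bw", rates=rates)
second = q_update(q, "low_bw", 0.5, utility=1.0, next_state="low_bw", rates=rates)
greedy = choose_rate(q, "low_bw", rates, epsilon=0.0, rng=rng)
```

The device needs only the utility it observes after each choice, which matches the abstract's point that no model of other devices' trace generation or bandwidth is required.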


Spatial data have wide applications, e.g., location-based services, and geometric range queries (i.e.,

finding points inside geometric areas, e.g., circles or polygons) are one of the fundamental search

functions over spatial data. The rising demand for outsourcing data is moving large-scale datasets,

including large-scale spatial datasets, to public clouds. Meanwhile, due to concerns about insider

attackers and hackers on public clouds, the privacy of spatial datasets should be cautiously preserved

while querying them at the server side, especially for location-based and medical usage. In this paper,

we formalize the concept of Geometrically Searchable Encryption, and propose an efficient scheme,

named FastGeo, to protect the privacy of clients’ spatial datasets stored and queried at a public server.

With FastGeo, which is a novel two-level search for encrypted spatial data, an honest-but-curious

server can efficiently perform geometric range queries, and correctly return data points that are inside

a geometric range to a client without learning sensitive data points or this private query. FastGeo

supports arbitrary geometric areas, achieves sublinear search time, and enables dynamic updates over

encrypted spatial datasets. Our scheme is provably secure, and our experimental results on real-world

spatial datasets on a cloud platform demonstrate that FastGeo can speed up search by over 100 times.

ETPL CLD - 047

FastGeo: Efficient Geometric Range Queries on Encrypted Spatial Data

Network Function Virtualization (NFV) and Software Defined Networking (SDN) have been proposed

to increase the cost-efficiency, flexibility and innovation in network service provisioning. This is

achieved by leveraging IT virtualization techniques and combining them with programmable networks.

By doing so, NFV and SDN are able to decouple the network functionality from the physical devices

on which they are deployed. Service Function Chains (SFCs) composed out of Virtual Network

Functions (VNFs) can now be deployed on top of the virtualized infrastructure to create new value-

added services. Current NFV approaches are limited to mapping the different VNFs to the physical

substrate subject to resource capacity constraints. They do not provide the possibility to define location

requirements with a certain granularity and constraints on the colocation of VNFs and virtual edges.

Nevertheless, many scenarios can be envisioned in which a Service Provider (SP) would like to attach

placement constraints for efficiency, resilience, legislative, privacy and economic reasons. Therefore,

we propose a set of affinity and anti-affinity constraints, which can be used by SPs to define such

placement restrictions. Furthermore, a semantic SFC validation framework is proposed that allows the

Virtual Network Function Infrastructure Provider (VNFInP) to check the validity of a set of constraints

and provide feedback to the SPs. This allows the VNFInP to filter out any non-valid SFC requests

before sending them to the mapping algorithm, significantly reducing the mapping time.

ETPL CLD - 048

Semantically Enhanced Mapping Algorithm for Affinity Constrained Service

Function Chain Requests
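
The abstract above proposes validating a set of affinity and anti-affinity constraints before mapping. The sketch below is not the paper's semantic framework; it is a simplified illustration that assumes affinity means "place on the same node" (and is therefore transitive) and anti-affinity means "place on different nodes". The function name `find_conflicts` and the VNF names are hypothetical.

```python
def find_conflicts(affinity, anti_affinity):
    """Return the anti-affinity pairs that contradict the affinity constraints.

    affinity, anti_affinity: lists of (vnf_a, vnf_b) pairs.
    VNFs linked by affinity are merged into groups with union-find; an
    anti-affinity pair inside one group can never be satisfied.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in affinity:
        parent[find(a)] = find(b)

    return [(a, b) for a, b in anti_affinity if find(a) == find(b)]

affinity = [("firewall", "dpi"), ("dpi", "nat")]
valid_request = find_conflicts(affinity, [("firewall", "cache")])
invalid_request = find_conflicts(affinity, [("firewall", "nat")])
```

A request with a non-empty conflict list can be rejected with feedback before it reaches the mapping algorithm, which is the filtering benefit the abstract describes.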


In this paper, we propose a novel HYBRID Bio-Inspired algorithm for task scheduling and resource

management, since both play an important role in the cloud computing environment. Conventional

scheduling algorithms such as Round Robin, First Come First Serve, Ant Colony Optimization etc.

have been widely used in many cloud computing systems. The cloud receives clients' tasks at a rapid rate,

and the allocation of resources to these tasks should be handled in an intelligent manner. In the proposed

work, we allocate the tasks to the virtual machines in an efficient manner using a Modified Particle

Swarm Optimization algorithm; the allocation and management of resources (CPU and memory), as

demanded by the tasks, is then handled by the proposed HYBRID Bio-Inspired algorithm (Modified PSO +

Modified CSO). Experimental results demonstrate that our proposed HYBRID algorithm outperforms

peer research and benchmark algorithms (ACO, MPSO, CSO, RR and Exact algorithm based on

branch-and-bound technique) in terms of efficient utilization of the cloud resources, improved

reliability and reduced average response time.

ETPL CLD - 049

A Hybrid Bio-Inspired Algorithm for Scheduling and Resource Management in

Cloud Environment

Virtualization is an essential step before a bare-metal data center is ready for commercial use,

because it bridges the foreground interface for cloud tenants and the background resource management

on underlying infrastructures. A concept at the heart of the foreground is multi-tenancy, which deals

with logical isolation of shared virtual computing, storage, and network resources and provides

adaptive capability for heterogeneous demands from various tenants. A crucial problem in the

background is load balancing, which affects multiple issues including cost, flexibility and availability.

In this work, we propose a virtualization framework that considers these two problems simultaneously.

Our framework takes advantage of the flourishing application of distributed virtual switch (DVS), and

leverages the blooming adoption of OpenFlow protocols. First, the framework accommodates

heterogeneous network communication patterns by supporting arbitrary traffic matrices among virtual

machines (VMs) in virtual private clouds (VPCs). The only constraint on the network flows is the

bandwidth of a server’s network interface. Second, our framework achieves load balancing using an

elaborately designed link establishment algorithm. The algorithm takes the configurations of the bare-

metal data center and the dynamic network environment as inputs, and adaptively applies a globally

bounded oversubscription on every link. Our framework concentrates on the fat-tree architecture,

which is widely used in today’s data centers.

ETPL CLD - 050

A Load Balancing and Multi-tenancy Oriented Data Center Virtualization

Framework


Attribute-based encryption (ABE) has been widely used in cloud computing where a data provider

outsources his/her encrypted data to a cloud service provider, and can share the data with users

possessing specific credentials (or attributes). However, the standard ABE system does not support

secure deduplication, which is crucial for eliminating duplicate copies of identical data in order to save

storage space and network bandwidth. In this paper, we present an attribute-based storage system with

secure deduplication in a hybrid cloud setting, where a private cloud is responsible for duplicate

detection and a public cloud manages the storage. Compared with the prior data deduplication systems,

our system has two advantages. Firstly, it can be used to confidentially share data with users by

specifying access policies rather than sharing decryption keys. Secondly, it achieves the standard

notion of semantic security for data confidentiality while existing systems only achieve it by defining

a weaker security notion. In addition, we put forth a methodology to modify a ciphertext over one

access policy into ciphertexts of the same plaintext but under other access policies without revealing

the underlying plaintext.

ETPL CLD - 051

Attribute-Based Storage Supporting Secure Deduplication of Encrypted Data in

Cloud

Cloud storage enables both individuals and enterprises to share their data cost-effectively over the

Internet. However, this also brings difficult challenges to the access control of shared data since few

cloud servers can be fully trusted. Ciphertext-policy attribute-based encryption (CP-ABE) is a

promising approach that enables the data owners themselves to place fine-grained and

cryptographically-enforced access control over outsourced data. In this paper, we present secure and

cost-effective attribute-based data access control for cloud storage systems. Specifically, we construct

a multiauthority CP-ABE scheme that features: 1) the system does not need a fully trusted central

authority, and all attribute authorities independently issue secret keys for users; 2) each attribute

authority can dynamically remove any user from its domain such that those revoked users cannot access

subsequently outsourced data; 3) cloud servers can update the encrypted data from the current time

period to the next one such that the revoked users cannot access those previously available data; and

4) the update of secret keys and ciphertext is performed in a public way. We show the merits of our

scheme by comparing it with the related works, and further implement it to demonstrate its practicality.

In addition, the proposed scheme is proven secure in the random oracle model.

ETPL CLD - 052

Secure and Efficient Attribute-Based Access Control for Multiauthority Cloud

Storage