Page 1: Vasudevan, Meera,Tian, Glen,Tang, Maolin,Kozan, Erhan, & Zhang, … · 2020. 8. 19. · This may be the author’s version of a work that was submitted/accepted for publication in

This may be the author’s version of a work that was submitted/accepted for publication in the following source:

Vasudevan, Meera, Tian, Glen, Tang, Maolin, Kozan, Erhan, & Zhang, Xueying (2018) Energy-efficient application assignment in profile-based data center management through a Repairing Genetic Algorithm. Applied Soft Computing, 67, pp. 399-408.

This file was downloaded from: https://eprints.qut.edu.au/117088/

© Consult author(s) regarding copyright matters

This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a Creative Commons License (or other specified license) then refer to the Licence for details of permitted re-use. It is a condition of access that users recognise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright please provide details by email to [email protected]

License: Creative Commons: Attribution-Noncommercial-No Derivative Works 2.5

Notice: Please note that this document may not be the Version of Record (i.e. published version) of the work. Author manuscript versions (as Submitted for peer review or as Accepted for publication after peer review) can be identified by an absence of publisher branding and/or typeset appearance. If there is any doubt, please refer to the published source.

https://doi.org/10.1016/j.asoc.2018.03.016


Energy-efficient Application Assignment in Profile-based Data Center Management Through a Repairing Genetic Algorithm

Meera Vasudevan (a), Yu-Chu Tian (a,c,*), Maolin Tang (a), Erhan Kozan (b), Xueying Zhang (c)

(a) School of Electrical Engineering and Computer Science, Queensland University of Technology, GPO Box 2434, Brisbane QLD 4001.

(b) School of Mathematical Sciences, Queensland University of Technology, GPO Box 2434, Brisbane QLD 4001.

(c) College of Information Engineering, Taiyuan University of Technology, Taiyuan 030024, P. R. China

Abstract

The massive deployment of data center services and cloud computing comes with exorbitant energy costs and an excessive carbon footprint. This demands green initiatives and energy-efficient strategies for greener data centers. Assignment of an application to different virtual machines has a significant impact on both energy consumption and resource utilization in virtual resource management of a data centre. However, energy efficiency and resource utilization are conflicting in general. Thus, it is imperative to develop a scalable application assignment strategy that maintains a trade-off between energy efficiency and resource utilization. To address this problem, this paper formulates application assignment to virtual machines as a profile-driven optimization problem under constraints. Then, a Repairing Genetic Algorithm (RGA) is presented to solve the large-scale optimization problem. It enhances the penalty-based genetic algorithm by incorporating the Longest Cloudlet Fastest Processor (LCFP) heuristic, from which an initial population is generated, and an Infeasible-Solution Repairing Procedure (ISRP). The application assignment with RGA is integrated into a three-layer energy management framework for data centres. Experiments are conducted to demonstrate the effectiveness of the presented approach, e.g., 23% less energy consumption and 43% more resource utilization in comparison with the steady-state Genetic Algorithm (GA) under the investigated scenarios.

Keywords: Data center; application assignment; energy efficiency; resource scheduling; genetic algorithm

*Corresponding Author. Email address: [email protected] (Yu-Chu Tian)

Preprint submitted to Elsevier December 10, 2017

Applied Soft Computing online published on 13 March 2018. DOI: 10.1016/j.asoc.2018.03.016


1. Introduction

With the rapid development of modern information technology, data centers are escalating energy consumption and carbon footprint at an alarming pace. Overall, data centers account for 1.1% to 1.5% of the world’s total electricity consumption [1]. More than 35% of current data center operational expenses are accounted for by energy consumption, a figure projected to double in a few years. The Natural Resources Defence Council states that 91 billion kWh of electrical energy was consumed by data centers in 2013, and projects this figure to increase by 53% by the year 2020 [2]. It further reports a distinct gap in energy-efficiency initiatives between energy-efficient large data centers and numerous less energy-efficient small to medium data centers. Large data centers are run by giant providers such as Microsoft, Google, Dell, and Facebook, which account for about 5% of data center energy usage. Small to medium data centers are operated by thousands of businesses, universities and government agencies. They are the focus of our research in this paper.

The real and emerging problem of high carbon footprint and exorbitant energy costs drives the need for green and energy-efficient architectures and algorithms for data center management. Energy and cost distribution studies, e.g., Le et al. [3], have demonstrated that deploying green initiatives at data centers reduces the carbon footprint by 35% at only a 3% cost increase. Previously, research and development on large-scale distributed systems was mostly driven by performance. This has evolved into the current focus on energy-aware measures that maximize performance efficiency while minimizing energy consumption [4]. Through management of energy efficiency, the variable part of the energy consumption in a data center can be largely reduced [5]. For this purpose, a three-layer architecture has been presented [6, 7]: application assignment, Virtual Machine (VM) placement, and Physical Machine (PM) management. For application assignment to VMs, the energy-efficiency problem can be cast in an optimization framework. Thus, optimization techniques can be adopted and configured to suit various data centers.

However, assignment of an application to one of the large number of available VMs for energy optimization is challenging. It is fundamentally a combinatorial optimization problem under a number of constraints. Such an optimization problem is well known to be NP-hard, so no polynomial-time algorithm is known that solves it exactly. Due to the massive scale of the problem, solving it through exhaustive search is also not practically feasible. Thus, the development of intelligent algorithms becomes particularly attractive for application assignment in energy-efficient data center management.

Extending our preliminary work [6, 7, 8] that mainly investigated the application layer, our work in this paper presents a complete energy management system by designing a new application assignment strategy and at the same time deploying a VM placement policy. This will demonstrate actual energy savings at the physical server level. More specifically, our work in this paper formulates application assignment to VMs as a profile-driven optimization problem under constraints. However, the problem is too large, with a huge solution space, to be solved directly; a heuristic-based algorithm is required to derive a solution. This paper makes use of the Genetic Algorithm (GA) as the fundamental heuristic to solve the application assignment problem. The GA is chosen for its ability to provide a feasible solution whenever it is terminated [9]: it yields a feasible solution even on interruption of the algorithm or when the execution-time requirement is tight. The GA is modified to improve solutions by designing a Repairing Genetic Algorithm (RGA) to solve the large-scale optimization problem. The RGA distinctively incorporates two components: the Longest Cloudlet Fastest Processor (LCFP) and an Infeasible-Solution Repairing Procedure (ISRP). The LCFP generates an initial population to minimize makespan and maximize resource utilization, while the ISRP converts infeasible chromosomes into feasible application placement solutions. While a solution repairing procedure has previously been applied in VM management, it has not been investigated for placement of applications to VMs.

This paper is organized as follows. Acronyms and notations used throughout the paper are summarized in Tables 1 and 2, respectively. Section 2 reviews related work and motivates the research. This is followed by Section 3, which describes and formulates the profile-based application assignment problem. The RGA is presented in Section 4. Case studies are conducted in Section 5 to demonstrate our approach. Finally, Section 6 concludes the paper.

Table 1: Acronyms.

FFD    First-fit decreasing
GA     Genetic algorithm
IC     Instruction count
IPS    Instructions per second
LCFP   Longest cloudlet fastest processor
MIPS   Million instructions per second
PM     Physical machine
PUE    Power usage efficiency
RGA    Repairing GA
SD     Standard deviation
VM     Virtual machine

2. Related Work

Hierarchically, the energy management of data centers can be implemented at three layers: application assignment to VMs, VM placement to PMs, and PM management [6, 7]. This three-layer hierarchical architecture is shown in Figure 1. The PM management determines PM resource usage and ON/OFF operations. The VM layer is responsible for VM management, including VM placement and migration. The application layer assigns incoming applications to the previously generated VMs for considerations of resource utilization, performance efficiency, and energy consumption. For a data center, all three layers work together to determine the overall energy consumption, which is composed of fixed and variable parts. The fixed energy consumption part depends on hardware computing, storage and network elements, while the variable energy consumption part depends on the application, VM and PM resource usage [5]. Our research in this paper focuses on reducing the variable energy consumption part by developing a three-tiered profile-based application management strategy.

Table 2: Notations.

a_i                  The ith application, i ∈ I
CPU_j^max            Computing capacity of V_j in MIPS
e_ij                 Execution time of a_i on V_j
E_usage              PUE of the data center
F                    Fitness function
F_obj, F*, F_worst   Objective function, and its best and worst values
C_ij                 Energy cost of allocation
I                    The set of applications, I = {1, ..., N_app}
IC_i                 Instruction count for a_i in IPS
IC_j                 The number of instructions to be executed on V_j
J                    The set of VMs, J = {1, ..., N_vm}
K                    The set of PMs, K = {1, ..., N_pm}
mem_i                Memory requirement of a_i in Bytes
MEM_j^max            Memory capacity of V_j in Bytes
N_app                The total number of applications
N_apm                The total number of active PMs
N_pm, N_vm           The total numbers of PMs and VMs, respectively
P                    Total power consumption of the data center
P_idle               Idle server power consumption
P_peak               Peak server power consumption
r                    Range of normalization
S_k                  The kth PM, k ∈ K
T[V_j]               Total completion time of V_j
T_r[V_j]             CPU ready time of V_j
U_avg                Average CPU utilization of all VMs
V_j                  The jth VM, j ∈ J
V_tot^CPU            Total CPU requirements of all a_i on V_j
V_id                 The ID of a VM
V_tot^mem            Total memory requirements of all a_i on V_j
viol                 Violation indicator
w_1, w_2             Weights
x_ij                 Binary allocation decision variable
α                    An intermediate variable
φ                    Penalty
θ                    Makespan of the allocation

Figure 1: Three-layer architecture for data center management.

Efforts have been made to minimize the energy consumption of data centers over the past few years. The most successful measures include the deployment of workload prediction [10, 11], VM management [12, 13], and resource and application management strategies [14, 15]. The techniques developed in all these efforts contribute to energy savings in data center management. Our research in the present paper focuses on scheduling of applications as tasks to all available VMs as resources.

Existing application management schemes use various methods for energy savings. An example is an adaptive controller for dynamic resource provisioning [16]. Using an Ink Drop Spread method based on reinforcement learning, the controller nullifies the application rejection rate and minimizes energy wastage. Another example is a speed scaling model that deploys applications to resources from the cooperative game theory perspective with the objective of improving energy conservation in data centers [17]. A further example is an enhanced max-min application scheduling algorithm for cloud systems [18]. The algorithm is contingent on the expected execution time, rather than the more common completion time, as the selection criterion. More recently, an application- and platform-aware resource allocation framework has been presented for consolidated server systems [19]. The framework targets multi-core platforms and scale-up server systems. Fundamentally different from all the above-mentioned methods, our approach in this paper is profile-based, with consideration of the relatively consistent workload in a large class of data centers. More specifically, our work targets widely deployed low- to medium-density data centers operated by government agencies, corporate businesses and universities with easily recordable workload information. These data centers do not often modify VM parameters, resulting in near-habitual application processing. This motivates us to use profiles of applications, VMs and PMs for improvement of energy efficiency in data center management.

The profiling technique has been used in application management for specific objectives, e.g., workload prediction, and behavioural and performance evaluation. For example, an energy consumption model is established through profiling energy consumption with respect to computation-, data-, and communication-intensive tasks [20]. This model is later extended in the development of StressCloud for profiling the energy consumption and performance of cloud systems [21]. This profile-based energy consumption method has successfully profiled cloud application models with varying task workloads and resource allocation strategies. A static profiling technique is discussed in [22, 23] for prediction of performance degradation in relation to assignment of multiple applications to a single machine. A Bubble-Up and Bubble-Flux method is also developed to accurately predict the performance degradation incurred on allocating multiple workloads to servers for maximum utilization [24]. Vu Do et al. [25] and Ye et al. [26] have utilized performance profiles to determine resource allocation efficiency. In spite of all these achievements in developing various profiling methods, the concept of profiles has not been integrated into the decision-making process of initial and continued application management for energy efficiency of data centres, except in our previous studies [6, 7, 8]. Our work in this paper builds profiles and employs them in the decision process for management of applications and VMs. This enables the development of our penalty-based RGA to solve very large-scale application assignment problems without compromising performance efficiency.

Evolutionary algorithms such as GA have been successfully applied to application scheduling in data centers and cloud computing [27]. Application scheduling in data centers is generally NP-complete. The main advantage of using GA for NP-complete allocation problems is to allow faster convergence by searching the solution space in multiple directions. Once a feasible solution is obtained, a feasible and improved solution is always available whenever GA is terminated. This is particularly useful when the algorithm has to be terminated for whatever reason, e.g., when its execution time reaches its limit for a specific system or when it is interrupted by other processes. Accordingly, an energy-efficient, pareto-front and solution-based hybrid GA has been proposed for resource allocation in cloud systems [28]. It utilizes a crossover operator for multiple genes and a case library for initializations by identifying case similarities. The concept of a case library is similar to that of profiles. While a case library considers only allocation information, profiles are more adaptable by considering individual component information. In addition, a GA-based application and VM scheduler has been tested for cloud systems with various initial populations generated from heuristics [29]. Among various GA methods, the LCFP is shown to be the most efficient when considering a large number of processing nodes. However, no research has been found to use LCFP in application assignment for energy-efficient data center management. This motivates our work in this paper to integrate LCFP into our RGA for application assignment to VMs.

It is worth mentioning that GA has also been used in VM placement problems, which have been proven to be NP-complete. A GA-based VM placement strategy has been presented to minimize the energy consumption of servers and the communication network within data centers [30]. It is extended in [31] to significantly improve the energy and performance efficiency with an enhanced repairing GA. Different from the above-mentioned methods developed for VM placement to PMs, our work in this paper deals with application assignment to VMs. In comparison with VM placement, the application assignment problem is much larger in size and thus demands high computational efficiency. Therefore, our work in this paper makes use of the concept of profiles, develops an infeasible-solution repairing procedure, and then incorporates the procedure distinctively into application assignment to VMs.

In addition to the references reviewed above, other energy-efficient data center strategies have also been studied. For example, a reputation-guided genetic scheduling algorithm has been proposed for autonomous tasks in inter-cloud environments [15]. A GA-based repairing heuristic algorithm is presented for workflow scheduling in cloud systems [32]. A dynamic power and interference aware resource management mechanism called PIASA [14] is developed to handle different types of data-intensive application workloads in dynamic cloud environments. By using an open-source GA framework called jMetal, an energy-efficient resource allocation approach has been presented by Portaluri et al. [33]. Autonomous virtual resource management in data centers is discussed through a Markovian decision process [34]. A fractal framework is presented by Ghorbani et al. [35] for effective management of burst cloud workloads. Different from all these recent reports, our work in this paper considers the application assignment layer specifically in data center management, and makes use of the concept of profiles in our approach for efficient energy management of application assignment to VMs. It uses the information extracted from the profiles and thus is expected to give a satisfactory solution with a shorter execution time.

Profile-based application assignment to VMs has recently been investigated in our previous work. In our reports [6, 8], the problem is formulated as an optimization problem and is solved through a greedy algorithm. This is extended in [7] for static application assignment with improved problem modelling and a penalty-based steady-state GA to solve the problem. Our work in the present paper further extends our previous work [6, 7, 8] significantly in the following aspects: 1) it presents a complete energy management system with both application assignment and VM placement; 2) it incorporates an LCFP-generated initial population into the RGA; 3) it designs an ISRP for the RGA; and 4) it considers makespan as a performance metric. The ideas of RGA and ISRP have been considered in our recent work [36] on dynamic application assignment problems. However, our research here deals with large-scale static application assignment, which is fundamentally different from that studied in [36] and thus requires a different modelling framework and different strategies for solving the established model.

Before concluding the literature review, let us briefly discuss the application assignment problem in a broader sense. As application assignment to VMs can be formulated as a constrained optimization problem, one may think of many other optimization methods for solving the problem. A typical category of such methods includes (chaotic) neural-network-based optimization [37]. However, a computationally efficient neural network method has not been found for energy-efficient application assignment to VMs in data center management; more research needs to be conducted to clarify this issue. Another typical category of optimization methods is Tabu search [38]. Tabu search is a metaheuristic that employs local search methods for mathematical optimization. It is potentially useful for application assignment in energy-efficient data center management. A further category of optimization methods includes combinations of GA and neural networks [39]. Again, it is not clear how to design neural networks well for application assignment in greener data centers. Therefore, while various methods have been developed for general optimization problems, GA-based methods have proven promising over many years for virtual resource optimization in data center management. Thus, GA is investigated in this paper, with integration of the concept of profiles, for energy-efficiency of application assignment to VMs.

3. Problem Formulation

Our research problem is to design an application management strategy for energy-efficient data centers. It assigns applications to VMs with the objective of reducing energy consumption whilst maintaining a satisfactory level of resource utilization performance. Our solution incorporates the concept of utilizing profiles for energy-efficient application management of data centers [6]. The profile-based application assignment builds profiles for applications, VMs and PMs from the workload traces of a data center. The workload traces contain data related to application resource requirements, VM resource availability, application frequency, submission times and server energy consumption. The profiles used in our research have been built from the workload logs of a real data center. The workload data, including CPU and memory utilisation, are recorded every 60 minutes, along with energy consumption every 5 minutes. The name of the data center is omitted here due to the requirement of commercial confidentiality.

In order to derive actual energy savings and also to determine the effectiveness of our profile-based RGA, a three-layer energy management strategy is developed, as shown in Figure 1. It consists of the following components:

1) Application Assignment implements our profile-based RGA; and

2) VM Placement implements the First-Fit Decreasing (FFD) algorithm.
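To make the VM placement component concrete, below is a minimal sketch of FFD bin-packing of VMs onto identical PMs. It is not the paper's implementation: the single-dimensional CPU demands, the uniform PM capacity, and the helper name `ffd_placement` are illustrative assumptions.

```python
# Illustrative sketch of First-Fit Decreasing (FFD) placement of VMs onto
# PMs, as used at the VM placement layer. Capacities are one-dimensional
# CPU demands for brevity; real placement is multi-dimensional.

def ffd_placement(vm_demands, pm_capacity):
    """Place VMs (CPU demands) onto identical PMs of capacity pm_capacity.

    Returns a list of active PMs, each a list of (vm_index, demand) tuples.
    """
    # Sort VMs by decreasing demand, keeping their original indices.
    order = sorted(enumerate(vm_demands), key=lambda iv: iv[1], reverse=True)
    pms = []          # opened (active) PMs
    pm_loads = []     # current load of each opened PM
    for vm_idx, demand in order:
        # First fit: use the first open PM with enough headroom.
        for p, load in enumerate(pm_loads):
            if load + demand <= pm_capacity:
                pms[p].append((vm_idx, demand))
                pm_loads[p] += demand
                break
        else:
            # No open PM fits: activate a new PM.
            pms.append([(vm_idx, demand)])
            pm_loads.append(demand)
    return pms

placement = ffd_placement([5, 7, 3, 2, 6], pm_capacity=10)
print(len(placement))  # number of active PMs -> 3
```

Sorting by decreasing demand before first-fit is what reduces the number of active PMs, and hence the fixed power drawn by idle servers.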

Our profile-based RGA utilizes profiles to obtain the data related to energy, CPU, and memory availability and requirements in order to evaluate the energy consumption model formulated below. Unlike other approaches, profiles make it possible to determine the energy cost of allocating an application to a specific VM. They are used with the help of the recorded instruction count (IC) requirement of a profiled application, the pre-set VM CPU capacity, and the server power usage range. The RGA exploits an LCFP-generated initial population to improve the makespan of the application allocation. It also incorporates a new ISRP for application management to re-allocate the applications that violate the constraints, thus converting infeasible solutions into feasible ones.
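A hypothetical sketch of such a repair step follows, in the spirit of the ISRP but not the paper's actual procedure: applications on VMs that violate a capacity constraint are re-allocated to the first VM with sufficient spare capacity. Only the memory constraint is repaired here, and the function name `repair` is an illustrative assumption.

```python
# Hypothetical infeasible-solution repair sketch (not the paper's ISRP):
# move applications off memory-overloaded VMs to the first feasible VM.

def repair(assign, mem_req, mem_cap):
    """assign[i] = VM index of application i; repairs memory violations in place."""
    n_vm = len(mem_cap)
    load = [0.0] * n_vm
    for i, j in enumerate(assign):
        load[j] += mem_req[i]
    for i, j in enumerate(assign):
        if load[j] > mem_cap[j]:                    # V_j overloaded
            for k in range(n_vm):                   # find a feasible host
                if k != j and load[k] + mem_req[i] <= mem_cap[k]:
                    load[j] -= mem_req[i]
                    load[k] += mem_req[i]
                    assign[i] = k
                    break
    return assign
```

Repairing instead of merely penalizing keeps every chromosome in the feasible region, so selection pressure is spent on improving energy cost rather than on escaping infeasibility.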

The application assignment problem is formulated as a linear optimization problem. The assignment of an application a_i, i ∈ I, onto a VM V_j, j ∈ J, is represented by a binary decision variable x_ij, i ∈ I, j ∈ J, and incurs an energy cost of C_ij [6]:

x_ij = 1, if a_i is allocated to V_j, i ∈ I, j ∈ J; 0, otherwise.  (1)

C_ij = (P_peak / P_idle) · (IC_i / CPU_j^max),  (2)

where C_ij is the product of the peak-to-idle server power ratio and the execution time of application a_i on VM V_j. The total number of instructions to execute application a_i is given by IC_i [40].

The assignment of a set of applications to VMs is given by a constrained combinatorial optimization model as:

F_obj = min Σ_{j=1}^{N_vm} Σ_{i=1}^{N_app} C_ij · x_ij  (3)

s.t.  IC_j / T[V_j] ≤ CPU_j^max, ∀j ∈ J;  (4)

      Σ_{i=1}^{N_app} x_ij · mem_i ≤ MEM_j^max, ∀j ∈ J;  (5)

      Σ_{j=1}^{N_vm} x_ij = 1, ∀i ∈ I;  (6)

      x_ij = 0 or 1, ∀i ∈ I, j ∈ J.  (7)

The constraints in Equations (4) and (5) ensure that the allocated resources are within the total capacity of the VM. The constraint in Equation (6) restricts an application from running on more than one VM. The binary constraint on the allocation decision variable x_ij is given by Equation (7). IC_j, the number of instructions to be executed on V_j, is calculated as:

IC_j = Σ_{i=1}^{N_app} x_ij · IC_i.  (8)
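The model above can be exercised with a short sketch. The helper names (`total_cost`, `memory_feasible`) and all numeric values are illustrative assumptions, and only the memory constraint of Eq. (5) is checked; encoding each application's VM index in a list enforces Eqs. (6) and (7) by construction.

```python
# Minimal sketch of evaluating a candidate assignment against Eqs. (1)-(8).
# Data values are illustrative, not from the paper's profiles.

def total_cost(assign, ic, cpu_max, p_peak, p_idle):
    """Objective of Eq. (3): sum of C_ij over allocated pairs, with
    C_ij = (P_peak / P_idle) * (IC_i / CPU_j^max) as in Eq. (2)."""
    return sum((p_peak / p_idle) * (ic[i] / cpu_max[j])
               for i, j in enumerate(assign))

def memory_feasible(assign, mem, mem_max):
    """Constraint of Eq. (5): per-VM memory load within capacity."""
    load = [0.0] * len(mem_max)
    for i, j in enumerate(assign):
        load[j] += mem[i]
    return all(load[j] <= mem_max[j] for j in range(len(mem_max)))

# assign[i] is the VM hosting application i, so each application is placed
# exactly once: Eqs. (6)-(7) hold by construction of the encoding.
assign = [0, 1, 1]
cost = total_cost(assign, ic=[100, 200, 50], cpu_max=[10, 20],
                  p_peak=250.0, p_idle=150.0)
```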

Makespan is the total length of time required for all applications in the allocation to finish processing. The makespan θ of the allocation is calculated as:

θ = max_{j ∈ J} T[V_j],  (9)

T[V_j] = e_ij + T_r[V_j], i ∈ I, j ∈ J,  (10)

where T[V_j] is an iterative variable representing the total time required to execute all applications allocated to the VM, T_r[V_j] is the time at which the last application finished processing, and e_ij is the execution time of the next application a_i on VM V_j.


Average CPU utilization of all VMs across the data center for the time interval under consideration is given by:

U_avg = (1 / N_vm) Σ_{j=1}^{N_vm} cpu_j,  (11)

cpu_j = (IC_j / T[V_j]) · (1 / CPU_j^max),  (12)

where cpu_j of a VM V_j is the ratio of the amount of resource actually used to the maximum CPU capacity that can be used.

The total power consumption associated with a data center is:

P = N_apm · [P_idle + (E_usage − 1) · P_peak + (P_peak − P_idle) · U_avg],  (13)

where P_peak and P_idle represent the power consumed at maximum and idle server utilization, respectively. The Power Usage Efficiency (PUE) is represented by E_usage, and N_apm, 1 ≤ N_apm ≤ N_pm, is the number of active servers in the data center.
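The performance and power metrics of Eqs. (9)-(13) can be sketched together as follows. The function name `metrics` and all numeric values are illustrative assumptions, not figures from the paper.

```python
# Sketch of computing makespan, average CPU utilization and total power
# per Eqs. (9)-(13) for a given allocation.

def metrics(t_vj, ic_j, cpu_max, n_apm, p_idle, p_peak, e_usage):
    """t_vj[j] is T[V_j]; ic_j[j] is IC_j; returns (theta, U_avg, P)."""
    theta = max(t_vj)                                            # Eq. (9)
    n_vm = len(t_vj)
    cpu = [ic_j[j] / t_vj[j] / cpu_max[j] for j in range(n_vm)]  # Eq. (12)
    u_avg = sum(cpu) / n_vm                                      # Eq. (11)
    p = n_apm * (p_idle + (e_usage - 1.0) * p_peak
                 + (p_peak - p_idle) * u_avg)                    # Eq. (13)
    return theta, u_avg, p
```

Note how U_avg couples the assignment decision to power: packing instructions onto fewer, busier VMs raises u_avg but allows n_apm to shrink, which is the trade-off the optimization navigates.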

The profile-based application assignment problem has now been formally formulated as a constrained optimization problem. The size of this problem is generally large with multiple constraints, and consequently the problem is NP-hard. The following section investigates how to solve the optimization problem by designing a Repairing Genetic Algorithm (RGA).

4. Repairing Genetic Algorithm

This section aims to solve the linear optimization model in Equation (3), subject to constraints (4) to (7), for the profile-based application assignment problem in data centers. Our preliminary work in [7] developed a simple penalty-based GA for profile-based application assignment. However, due to the large data set of the optimization problem, the performance of the classic GA is degraded by imperfect, slow or no convergence [41]. Therefore, this section develops an RGA to solve the problem. The overall objective is to minimize the energy consumption and makespan whilst maximizing the resource utilization.

4.1. High-Level Description of RGA

A high-level description of the RGA is given in Algorithm 1, which is self-explanatory. In comparison with the simple GA in our preliminary study [7], RGA incorporates an LCFP-generated initial population (Lines 1 and 2) and an ISRP (Lines 8 to 10). These are discussed in detail in the following subsections.



Algorithm 1: RGA

1  Find output of solutions generated by LCFP;
2  Initialize population with LCFP output;
3  Evaluate fitness of each candidate chromosome;
4  while Termination condition is not satisfied do
5      for Each generation do
6          for Each chromosome do
7              Evaluate fitness;
8              if Chromosome is infeasible then
9                  Apply ISRP described in Algorithm 3;
10                 Evaluate fitness;
11         Parents selected using Roulette Wheel Selection;
12         for Parent chromosomes do
13             Apply uniform crossover as per Pc;
14             Mutate resulting offspring as per Pm;
15             Offspring chromosomes generated;
16         for Offspring chromosomes do
17             Evaluate fitness of new candidates;
18             Replace low-fitness chromosomes with better offspring;
19         for Each chromosome do
20             Select chromosomes for next generation;
21 Output best-fit chromosome as solution;

4.2. LCFP-Generated Initial Population

In the steady-state GA for profile-based assignment [7], the initial population is randomly generated, as in other general GA methods. For faster convergence and better solutions, the initial population can be better chosen through heuristics. The LCFP heuristics consider the computational complexity of applications and the computing power of processors to generate an initial population, which is composed of a set of chromosomes. They assign the longest application to the fastest processing VM such that the lengthier applications finish quickly, thereby minimizing the makespan.

The LCFP heuristics pursue three main steps:

1. Sort applications $a_i$, $i \in I$, in a descending order of execution time;
2. Sort VMs $V_j$, $j \in J$, in a descending order of processing power (MIPS); and
3. Pack the sorted applications onto the fastest processing VMs.
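The three steps above can be sketched as follows. The paper does not spell out the exact packing rule of step 3, so this sketch makes two assumptions: execution time is estimated as IC/IPS, and the sorted applications are dealt round-robin across the sorted VMs. It also uses 0-based VM indices, whereas the paper's genes range over 1 to $N_{vm}$.

```python
# Hypothetical LCFP seeding sketch (assumed details: execution time = IC / IPS,
# round-robin packing of sorted applications over sorted VMs).

def lcfp_seed(app_ic, vm_ips):
    """Return one chromosome: gene i is the VM index hosting application i."""
    # Step 1: applications in descending order of execution time (for a fixed
    # VM, IC / IPS is monotone in IC, so sorting by IC suffices here).
    apps = sorted(range(len(app_ic)), key=lambda i: app_ic[i], reverse=True)
    # Step 2: VMs in descending order of processing power (MIPS).
    vms = sorted(range(len(vm_ips)), key=lambda j: vm_ips[j], reverse=True)
    # Step 3: longest remaining application onto the fastest available VM.
    chrom = [0] * len(app_ic)
    for rank, i in enumerate(apps):
        chrom[i] = vms[rank % len(vms)]
    return chrom

# The longest application (index 1) lands on the fastest VM (index 1):
print(lcfp_seed([5e9, 9e9, 6e9], [1e9, 2e9, 1.5e9]))  # -> [0, 1, 2]
```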

The initial population undergoes genetic operations such as selection, mutation and crossover to produce a new population consisting of offspring chromosomes, as seen in Algorithm 2.



Algorithm 2: Genetic operators - crossover, mutation, and roulette wheel selection

1  Roulette Wheel Selection:
2  Evaluate constant Fitness_sum = the total fitness of all chromosomes in the population;
3  Let variable itSum represent the iterative fitness sum;
4  Let variable Ch represent a chromosome;
5  Generate a random number G ∈ [0, 1];
6  Set itSum ← 0;
7  for Each chromosome do
8      Evaluate probability of selection: prob_Ch ← fitness(Ch) / Fitness_sum;
9      itSum ← itSum + prob_Ch;
10     if itSum < G then
11         Select chromosome Ch to be parent p;
12 Crossover:
13 Set the length of chromosome, λ, to be Napp, the number of applications to be allocated;
14 Parents: p(1) = g^1_1, g^1_2, ..., g^1_λ and p(2) = g^2_1, g^2_2, ..., g^2_λ;
15 Let Offspring: O(3) = g^3_1, g^3_2, ..., g^3_λ and O(4) = g^4_1, g^4_2, ..., g^4_λ;
16 for the Length of chromosome do
17     Assign 0 or 1 randomly to individual genes of Mask;
18     if Mask is 0 then
19         g^3_λ ← g^2_λ; g^4_λ ← g^1_λ;
20     else
21         g^3_λ ← g^1_λ; g^4_λ ← g^2_λ;
22 Mutation:
23 Randomly select two numbers between 1 and λ, and assign them to A and B;
24 for Each child do
25     temp ← g^3_A; g^3_A ← g^4_B; g^4_B ← temp;

Figure 2 represents the working process of the genetic operators. The chromosomes are represented by value encoding, which allows individual genes to be represented as positive integers derived from the actual VM numbers. Each chromosome consists of $N_{app}$ genes, and each gene has a value ranging from 1 to $N_{vm}$. The selection operator identifies and assigns possible solution chromosomes as parents. The parent chromosomes are selected from the mating pool with the help of roulette wheel selection (Lines 1 to 11), which is a fitness-proportionate selection technique: the greater the fitness value of a chromosome, the higher its probability of being selected as a parent. The parent chromosomes are then used to generate a subsequent population.

Figure 2: Value encoding, uniform crossover using a binary mask, and mutation by selection and exchange of two genes.

Both crossover and mutation are employed as gene operators, as shown in Algorithm 2 and Figure 2. Uniform crossover (Lines 12 to 21 in Algorithm 2) is applied to the parent solutions to produce the offspring solutions. A binary crossover mask is randomly generated, where 1 and 0 indicate that the gene will be copied from the first and second parent, respectively. The mutation (Lines 22 to 25 in Algorithm 2) is carried out by selecting and exchanging two genes from the offspring chromosomes. The mutation probability is set to a low value in order to control the search space. The termination condition specifies that the cycle is repeated for each generation until a maximum number of generations is reached or an individual is found which adequately solves the problem. Every iteration of the algorithm creates a population consisting of a set of chromosomes, each of which represents a possible assignment solution.
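The operators of Algorithm 2 can be sketched as below. This is an illustrative Python rendering rather than the authors' code; chromosomes are plain lists of 0-based VM indices (the paper numbers VMs from 1), and the mutation follows Lines 22 to 25 in exchanging gene A of one offspring with gene B of the other.

```python
import random

def roulette_select(population, fitness):
    """Fitness-proportionate selection of one parent (Lines 1-11)."""
    g = random.random() * sum(fitness)      # spin the wheel once
    acc = 0.0
    for chrom, f in zip(population, fitness):
        acc += f
        if acc >= g:
            return chrom
    return population[-1]                   # guard against rounding

def uniform_crossover(p1, p2):
    """Lines 12-21: a random binary mask picks each child's gene source."""
    mask = [random.randint(0, 1) for _ in p1]
    c1 = [a if m else b for m, a, b in zip(mask, p1, p2)]
    c2 = [b if m else a for m, a, b in zip(mask, p1, p2)]
    return c1, c2

def swap_mutation(c1, c2):
    """Lines 22-25: exchange gene A of one child with gene B of the other."""
    a, b = random.randrange(len(c1)), random.randrange(len(c2))
    c1[a], c2[b] = c2[b], c1[a]
```

Whatever the mask, at every gene position the two children jointly inherit exactly the two parental genes, so uniform crossover never introduces VM indices that neither parent carried.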

The fitness function determines the quality of the solution when compared to an optimal solution. It effectively penalizes an allocation solution that violates the CPU and memory constraints discussed in Equations (4) and (5). The lower the energy cost and the penalty in terms of resource utilization efficiency, the higher the fitness. Feasible solutions have a positive fitness value, whereas infeasible solutions incur a negative fitness. The fitness function is derived as:

$$F(X) = w_1 \cdot \bar{F}_{obj} - w_2 \cdot \frac{1}{N_{vm}} \sum_{j=1}^{N_{vm}} \left( \phi_j^{cpu} + \phi_j^{mem} \right). \qquad (14)$$

The two weights $w_1$ and $w_2$ associated with the fitness function are currently set to 2 and 1, respectively. The multiplicative inverse of the objective function discussed in Equation (3) is represented by $\bar{F}_{obj}$. In order to normalise and scale the objective function $\bar{F}_{obj}$ to a range of $[1, 10]$, we use:

$$\bar{F}_{obj} = \frac{F_{worst} - F_{obj}}{F_{worst} - F^{\star}} \cdot \frac{F^{\star}}{F_{obj}} \cdot r + 1, \qquad (15)$$

where the range $r = 9$. The best (minimized) and worst objective functions are represented by $F^{\star}$ and $F_{worst}$, respectively. The penalties for CPU and memory constraint violations are derived as follows:

$$\phi_j^{cpu} = \begin{cases} 0, & \text{if } U_{avg} = 1 \\ \lambda \cdot CPU_j^{max} / IC_j, & \text{if } 0 < U_{avg} < 1 \\ 2, & \text{if } U_{avg} = 0 \end{cases} \qquad (16)$$

$$\phi_j^{mem} = \begin{cases} 2\,(1 - 1/\alpha), & \text{if } \alpha \ge 1 \\ 2, & \text{otherwise} \end{cases} \qquad (17)$$

where $\alpha$ is the ratio of the VM memory capacity to the total allocated memory:

$$\alpha = \frac{MEM_j^{max}}{\sum_{i=1}^{N} x_{ij} \cdot MEM_i}. \qquad (18)$$
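A direct transcription of Equations (14), (16) and (17) is sketched below. Treat it as illustrative rather than the authors' code: the scaling factor $\lambda$ of Equation (16) is passed in as `lam`, the scaled objective $\bar{F}_{obj}$ of Equation (15) is taken as a precomputed input, and the zero-allocation memory case (where Equation (18) is undefined) is handled with an assumed zero penalty.

```python
# Illustrative transcription of Eqs. (14), (16)-(18); not the authors' code.

def phi_cpu(u_avg, cpu_max_j, ic_j, lam):
    """Eq. (16): CPU-violation penalty for VM j."""
    if u_avg == 1:
        return 0.0
    if 0 < u_avg < 1:
        return lam * cpu_max_j / ic_j
    return 2.0                                # u_avg == 0

def phi_mem(mem_max_j, allocated_mem_j):
    """Eqs. (17)-(18): memory penalty via alpha = capacity / allocated."""
    if allocated_mem_j == 0:
        return 0.0                            # assumption: empty VM, no penalty
    alpha = mem_max_j / allocated_mem_j
    return 2.0 * (1.0 - 1.0 / alpha) if alpha >= 1 else 2.0

def fitness(f_obj_scaled, penalties, w1=2.0, w2=1.0):
    """Eq. (14): weighted scaled objective minus the mean per-VM penalty."""
    return w1 * f_obj_scaled - w2 * sum(penalties) / len(penalties)

print(phi_mem(4000, 4000))            # -> 0.0  (exactly full: no penalty)
print(phi_mem(4000, 8000))            # -> 2.0  (overcommitted memory)
print(fitness(10.0, [0.0, 0.0]))      # -> 20.0 (feasible, best scaled objective)
```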

4.3. Infeasible-Solution Repairing Procedure

The RGA incorporates an ISRP to convert infeasible chromosomes into feasible ones. An infeasible allocation of applications to VMs is characterized by a negative fitness as a result of the high penalty due to CPU and memory constraint violations. The applications allocated to infeasible VMs are re-allocated to other VMs until the violations are resolved. Consequently, the fitness becomes a positive value.

Each VM in a chromosome is linked to a data structure. As shown in Figure 3, the data structure consists of a violation indicator ($viol$), the total CPU and memory requirements of the applications allocated to the VM ($V_{tot}^{CPU}$ and $V_{tot}^{mem}$), the VM's CPU and memory capacity ($CPU_j^{max}$ and $MEM_j^{max}$), and a pointer to a linked list that records the application id and the CPU and memory requirement of each application allocated to the VM. The violation indicator is set to 0 if the application allocations do not violate the CPU and memory constraints, and to 1 otherwise:

$$viol = \begin{cases} 0, & \text{if resource constraints are not violated} \\ 1, & \text{otherwise.} \end{cases} \qquad (19)$$

The total CPU and memory requirements of the applications on VM $V_j$ are derived by:

$$V_{tot}^{CPU} = \frac{IC_j}{T[V_j]}, \qquad V_{tot}^{mem} = \sum_{i=1}^{N_{app}} x_{ij} \cdot mem_i. \qquad (20)$$
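The record of Figure 3 can be sketched with Python dataclasses. This is an assumed rendering with field names adapted from the paper's symbols; the violation indicator of Equation (19) is derived on the fly from the stored totals.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AppNode:
    """One entry of the linked list: an application allocated to the VM."""
    app_id: int
    a_cpu: float                      # CPU requirement
    a_mem: float                      # memory requirement
    next: Optional["AppNode"] = None

@dataclass
class VMRecord:
    """Per-VM record sketched after Figure 3."""
    cpu_max: float                    # CPU capacity
    mem_max: float                    # memory capacity
    v_cpu_tot: float = 0.0            # Eq. (20): total CPU demand of hosted apps
    v_mem_tot: float = 0.0            # Eq. (20): total memory demand
    pointer: Optional[AppNode] = None # head of the application list

    @property
    def viol(self) -> int:            # Eq. (19): violation indicator
        return int(self.v_cpu_tot > self.cpu_max or self.v_mem_tot > self.mem_max)

vm = VMRecord(cpu_max=2e9, mem_max=5000)
vm.pointer = AppNode(app_id=1, a_cpu=1e9, a_mem=6000)
vm.v_cpu_tot, vm.v_mem_tot = 1e9, 6000
print(vm.viol)   # -> 1 (memory constraint violated)
```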



Figure 3: Data structure used in the Infeasible-Solution Repairing Procedure.

With the help of the violation indicator, the data structure shown in Figure 3 is used to identify the VMs that violate the constraints. A constraint violation makes the chromosome infeasible. Once the violating VM $V[j]$ is identified, the CPU and memory availability of the next VM $V[j']$ is calculated. If the resource availability is greater than the resource requirement of the first application in the linked list of $V[j]$, the application is re-allocated to the new VM $V[j']$. The process is repeated until the violations are fixed and the chromosome becomes feasible. The working process of the ISRP is presented in Algorithm 3, which is self-explanatory.

After the applications are allocated to the VMs using RGA, we have:

$$V_j = [V_1, V_2, V_3, \ldots, V_{N_{vm}}], \qquad S_k = [S_1, S_2, S_3, \ldots, S_{N_{pm}}], \qquad (21)$$

where $V_j$ and $S_k$ represent the VMs and PM servers, respectively. The FFD algorithm packs the VMs using as few servers as possible. It sorts the PMs in a decreasing order of resource capacity. Each active VM is placed onto the first server with adequate space remaining. All active VMs are eventually packed onto PM servers.
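The FFD step described above can be sketched as follows, using a single aggregated capacity per server for brevity (the paper tracks CPU and memory; extending the fit test to both dimensions is mechanical). This is an assumed rendering, not the authors' implementation.

```python
# Illustrative FFD sketch with a single capacity dimension per server.

def ffd_place(vm_sizes, server_caps):
    """Map each VM index to a server index, activating as few servers as possible."""
    order = sorted(range(len(server_caps)),
                   key=lambda k: server_caps[k], reverse=True)   # biggest servers first
    remaining = {k: server_caps[k] for k in order}
    placement = {}
    # Decreasing part of FFD: place the largest VMs first.
    for j, size in sorted(enumerate(vm_sizes), key=lambda p: p[1], reverse=True):
        for k in order:                    # first fit over the sorted servers
            if remaining[k] >= size:
                remaining[k] -= size
                placement[j] = k
                break
        else:
            raise ValueError(f"VM {j} does not fit on any server")
    return placement

p = ffd_place([5, 3, 4], [8, 10])
print(p)                               # VMs 0 and 2 share the 10-unit server
print(len(set(p.values())))            # -> 2 active servers
```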

5. Case Studies

This section conducts experiments to demonstrate our profile-based RGA. It begins with an introduction to the experimental design, followed by a discussion of the evaluation criteria. Then, the experimental results are presented.

5.1. Experimental Design

Profiles are created for every application, VM and PM from real data center workload logs. For building the profiles, the workload logs are collected over a period of seven days (from the 12th to the 19th of May, 2014). They include information about CPU, memory and energy utilization. The length of each application is determined by its IC. The computing capacity of each VM is in Instructions Per Second (IPS). Table 3 depicts the application and VM parameter settings.



Algorithm 3: ISRP

1  for j = 1 to Nvm do
2      if V[j].violation = 1 then
3          select = V[j].pointer;
4          V[j].pointer = V[j].pointer.next;
5          fixed = false;
6          j′ = j + 1;
7          availCPU = V[j′].V^CPU_tot − V[j′].CPU^max_j′;
8          availmem = V[j′].V^mem_tot − V[j′].MEM^max_j′;
9          while j′ ≠ j do
10             if (availCPU ≥ select.a^CPU) & (availmem ≥ select.a^mem) then
11                 select.next = V[j′].pointer;
12                 V[j′].pointer = select;
13                 V[j′].V^CPU_tot += select.a^CPU;
14                 V[j′].V^mem_tot += select.a^mem;
15                 fixed = true;
16             Set j′ = j′ + 1;
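The repair loop of Algorithm 3 can be approximated with the sketch below. It is a deliberate simplification, not the authors' code: Python lists stand in for the linked lists, both resource dimensions are checked, and spare capacity is computed as capacity minus current usage (Lines 7 and 8 of Algorithm 3 print the subtraction the other way around). The helper names are hypothetical.

```python
# Simplified ISRP-style repair: move applications off overloaded VMs to the
# next VM with enough spare CPU and memory; illustrative only.

def repair(alloc, demand, cap):
    """alloc: per-VM lists of app ids; demand: app id -> (cpu, mem);
    cap: per-VM (cpu_max, mem_max). Repairs alloc in place and returns it."""
    n = len(alloc)

    def used(j):
        return tuple(sum(demand[a][d] for a in alloc[j]) for d in (0, 1))

    for j in range(n):
        while alloc[j] and any(u > c for u, c in zip(used(j), cap[j])):
            app = alloc[j][0]                 # first application on the VM
            for k in range(j + 1, j + n):     # scan the other VMs, wrapping
                t = k % n
                spare = [c - u for u, c in zip(used(t), cap[t])]
                if spare[0] >= demand[app][0] and spare[1] >= demand[app][1]:
                    alloc[t].append(alloc[j].pop(0))
                    break
            else:
                break                         # no VM can host it; give up
    return alloc

# VM 0 (capacity 4/4) holds two (3, 3) apps; one is moved to VM 1:
print(repair([[0, 1], []], {0: (3, 3), 1: (3, 3)}, [(4, 4), (10, 10)]))
# -> [[1], [0]]
```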

Table 3: Parameter settings for applications and VMs.

Parameter   Value
IC          [5, 10] × 10^9 instructions
IPS         [1, 2] × 10^9 instructions/sec
Memory      [1000, 5000] Bytes
Ppeak       350 W
Pidle       150 W
Eusage      2

In our case studies, a data center with 100 PM servers that host up to 1000 VMs is considered. For our evaluation, 11 different problem test sets are considered, where the number of applications ranges from 20 to 5000 with a corresponding number of VMs as seen in Table 4. For the first five test sets, the number of VMs is kept constant while the number of applications varies.

The three-layer energy management system shown in Figure 1 is implemented in our case studies for evaluation of the overall energy consumption. At the application management layer, the profile-based application placement is designed with our profile-based RGA incorporating an LCFP-generated initial population and the ISRP algorithm. RGA is carried out with a pre-set population size of 200 individuals in each generation. It is terminated when there is no change in the average and maximum fitness values of strings for 10 generations. The maximum number of generations is set to 200. The probabilities for the crossover and mutation operations are configured to be 0.75



Table 4: Problem test sets (Npm = 100).

Test set   Nvm    Napp
1          10     20
2          10     40
3          10     60
4          10     80
5          10     100
6          50     200
7          100    500
8          250    1500
9          500    2500
10         750    3500
11         1000   5000

and 0.02, respectively. At the VM management layer, the widely used FFD algorithm is implemented for VM placement to PM servers.

5.2. Benchmark and Evaluation Criteria

In order to evaluate the quality and efficiency of the solutions for profile-based application assignment to VMs, the RGA presented in this paper is compared with the steady-state GA reported previously [7] as a benchmark method. Also, FFD is implemented for VM placement to PMs. This forms RGA-FFD for our RGA, and GA-FFD for the steady-state GA benchmark, respectively.

The evaluation criteria for testing RGA include the following:

1) Scalability;
2) Energy efficiency and computing efficiency:
   (a) Energy consumption;
   (b) Solution time; and
   (c) Statistical t-test analysis;
3) Quality of solutions:
   (a) VM resource utilization;
   (b) Convergence; and
   (c) Makespan performance with respect to initial population.

5.3. The Scalability of RGA

The high scalability of the RGA is demonstrated by solving the profile-based assignment problem for problem sizes ranging from 200 to 5,000,000. Figure 4 shows the algorithm solution time with respect to the problem size. As the test problem size [$N_{vm} \cdot N_{app}$] increases, the solution time of RGA increases nearly linearly, which characterizes RGA's good scalability. For test sets 1 to 5, where the number of VMs is a constant of 10, the elapsed times for the algorithm solution are small.



Figure 4: The scalability of RGA.

5.4. Energy Efficiency and Computing Efficiency

In order to calculate the actual energy consumption and computing time performance for both GA and RGA, FFD is implemented as the policy for VM placement to PMs. The results of energy efficiency and computing efficiency are tabulated in Table 5.

Table 5: Energy efficiency and computing efficiency of GA and RGA incorporating FFD (SD: standard deviation). For each of the test sets, the results are derived from 30 runs.

Test  Nvm    Npm        Total energy of all active servers (Wh)   Daily energy (KWh)   Solution time (sec)
set          required   GA       SD     RGA      SD               GA       RGA          GA       RGA
1     10     1          219      35.8   194      13.9             5.26     4.66         0.8      0.6
2     10     1          247      41.7   222      14.5             5.92     5.32         2.8      2.3
3     10     2          409      42.1   315      28.8             9.82     7.57         6.0      4.7
4     10     2          425      46.3   337      30.1             10.19    8.09         6.4      5.2
5     10     2          432      50.5   398      34.9             10.37    9.56         15       12
6     50     5          1422     45.3   1308     39.4             34.12    31.38        162      134
7     100    11         2737     32.4   2249     38.7             65.70    53.97        1298     804
8     250    27         7027     69.9   5875     45.1             168.6    141.0        2293     1406
9     500    62         17127    63.9   13407    52.5             411.0    321.8        5127     2387
10    750    75         21093    53.9   17461    35.9             506.2    419.2        8004     3689
11    1000   100        27591    57.7   24605    58.4             662.2    590.5        13070    5999

The first observation from Table 5 is that RGA-FFD gives smaller energy consumption than GA-FFD for all test sets. The energy savings of RGA-FFD in comparison with GA-FFD range from 7.8% up to 23% for the considered test sets. For Test set 11, which represents a realistic size of a small data center, both GA and RGA are used to allocate 5000 applications to 1000 VMs, which are then placed by FFD onto 100 active PMs. The resulting daily energy consumption of the data center is 662.59 KWh for GA-FFD and 590.52 KWh for RGA-FFD, indicating a daily energy saving of about 72 KWh.

The second observation from Table 5 is that the energy consumption results from RGA-FFD show smaller average standard deviations than those from GA-FFD. This implies that RGA is more stable than GA in deriving the results. This conclusion is drawn from 30 runs for each of the test sets.

For computing efficiency, as the problem size increases, GA-FFD takes up to twice as much time as RGA-FFD does to solve the problem, as clearly shown in the average solution times in Table 5. This means that RGA converges faster with better solutions than GA.

To demonstrate the confidence level of the experimental results, a paired t-test is conducted for the two independent methods, GA and RGA, for each test set. As GA and RGA are stochastic in nature, both of them are individually run 30 times for each test set. The null hypothesis is that there is no difference between the GA and RGA methods. The confidence interval is set at 95% and a two-tailed hypothesis is assumed. The t-stat values are recorded in Table 6. The two-tailed P-value is less than 0.0001 and is extremely statistically significant. The results show that the difference between the two methods is significant, and thus the null hypothesis is rejected.
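The paired t statistic reported in Table 6 can be reproduced from first principles as sketched below. The numbers in the example are made up for illustration and are not the paper's 30-run measurements.

```python
import math

def paired_t(xs, ys):
    """Return (t statistic, degrees of freedom) for paired samples xs, ys."""
    d = [x - y for x, y in zip(xs, ys)]
    n = len(d)
    mean = sum(d) / n
    var = sum((v - mean) ** 2 for v in d) / (n - 1)   # sample variance of diffs
    se = math.sqrt(var / n)                           # standard error of the mean
    return mean / se, n - 1

# Toy example: positive differences give a positive t; the Table 6 values are
# negative because the larger GA energies are subtracted the other way around.
t, df = paired_t([2.0, 4.0, 6.0], [1.0, 1.0, 1.0])
print(round(t, 3), df)   # -> 2.598 2
```

With 30 runs per method the degrees of freedom are 29, matching the df column of Table 6, and the critical value at the 95% two-tailed level is the tabulated 2.045.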

Table 6: T-test of the solutions by GA and RGA.

Test set   T-Value   std. error   t-crit   df
1          -3.66     6.866        2.045    29
2          -3.15     7.888        2.045    29
3          -9.98     9.416        2.045    29
4          -8.04     10.901       2.045    29
5          -2.43     13.960       2.045    29
6          -11.25    10.139       2.045    29
7          -29.12    9.943        2.045    29
8          -56.87    16.726       2.045    29
9          -62.08    12.753       2.045    29
10         -60.11    11.841       2.045    29
11         -54.17    17.145       2.045    29

5.5. The Quality of Solutions

The quality of solutions is determined by measuring resource utilization, convergence, and makespan performance due to the LCFP-generated initial population.

VM Resource Utilization. The results of VM resource utilization for both GA and RGA are shown in Figure 5. It is clearly seen from Figure 5 that RGA uses VM resources more efficiently than GA. Overall, RGA gives solutions that are 10.42% to 42.86% more resource efficient than those of GA. The highest average VM resource utilization achieved by RGA is 70%, while this metric is only 56% for GA. The initial steep incline in Figure 5 results from keeping the number of VMs constant at 10 whilst increasing the number of applications from 20 to 100 for Test sets 1 to 5. This implies that for Test set 5, the VM resources are under maximum utilization due to an average of approximately ten applications assigned per VM. The drop in the figure reflects the increase in the number of VMs in Test set 6. The resource utilization then increases linearly with the gradual increase in both the number of VMs and the number of applications.

Figure 5: Resource utilization efficiency.

Convergence. Table 7 tabulates the best results in terms of energy and the corresponding genetic iteration for both GA and RGA. It is clearly seen from Table 7 that RGA gives better solutions with far fewer iterations. With the increase in the problem size from Test sets 1 to 6, the numbers of iterations from RGA are over three quarters fewer than those from GA. From Test sets 7 to 11, GA has reached the pre-set maximum number of iterations of 200, while RGA achieves better solutions with less than half of the pre-set number of iterations. For Test set 11, RGA uses 93 iterations to derive a better solution than GA with 200 iterations. All these results show faster convergence of RGA than GA.

Table 7: Comparisons of GA and RGA with regard to convergence.

Test set   GA - Best Result          RGA - Best Result
           Energy (Wh)  Iterations   Energy (Wh)  Iterations
1          191          47           175          4
2          205          94           190          13
3          372          132          280          18
4          399          159          300          23
5          417          172          349          25
6          1398         197          1294         39
7          2719         200          2212         44
8          6998         200          5839         57
9          17098        200          13379        62
10         21038        200          17439        81
11         27554        200          24584        93

Makespan Performance due to Initial Population. Makespan is the maximum completion time of all applications allocated to a VM. In our preliminary work [7], GA is applied with a random initial population. This section uses both a random initial population and an LCFP-generated initial population for RGA. The overall average makespan for GA with a random initial population is denoted by GA-Rand. Similarly, RGA-Rand and RGA-LCFP denote RGA with random and LCFP-seeded initial populations, respectively. Figure 6 depicts the makespan performance of GA-Rand, RGA-Rand and RGA-LCFP. With the increase of problem size, the steep initial incline of all three strategies represents the solutions of Test sets 1-5, where the number of VMs is constant whilst the number of applications varies from 20 to 100. It is seen from the figure that both RGA-Rand and RGA-LCFP provide better solutions than GA-Rand does. However, RGA-LCFP outperforms RGA-Rand in the sense that it generates better solutions with faster convergence. This is because the LCFP strategy seeds the initial populations by assigning the longest application to the fastest processing VM, which in turn minimizes the makespan and maximizes the CPU utilization.

Figure 6: Makespan performance due to initial population.

5.6. Summary of Case Studies

The results from our case studies have demonstrated the effectiveness of the presented RGA incorporating an LCFP-generated initial population and the ISRP algorithm. First of all, our RGA scales well, with a nearly linear increase in computing time with respect to the problem size. Secondly, in comparison with the steady-state GA, our approach gives much improved energy efficiency and computing time performance in application assignment. Our t-test results have confirmed that the results from our RGA and the steady-state GA are significantly different. Furthermore, the quality of our RGA solutions is higher than that of the steady-state GA, as characterized by better VM utilization, faster convergence, and improved makespan performance.

6. Conclusion

Data centers play a major role in today's information and Internet services, leading to significant increases in the energy costs of powering them. In this paper, a profile-based approach has been presented for energy-efficient application assignment to VMs with consideration of resource utilization. It includes a formal modelling step that formulates the problem as a constrained optimization problem, and an RGA to solve it. The RGA incorporates an LCFP-generated initial population and an ISRP algorithm, and has been implemented at the top application management layer of our three-layer energy management system. In addition, for a complete energy management system, FFD has been implemented at the middle VM management layer. Experiments have been conducted to demonstrate the effectiveness and efficiency of the profile-based RGA. For the investigated scenarios, RGA has shown up to 23% less energy consumption and up to 43% more resource utilization in comparison with the steady-state GA. The better solutions are achieved with faster convergence and shorter computing time than GA. Therefore, the profile-based RGA is a promising tool for energy-efficient application assignment to VMs in data centers.

Acknowledgement

This work is supported in part by the Australian Research Council (ARC) under the Discovery Projects Scheme (grant no. PDP170103305), and the Science and Technology Department of Shanxi Provincial Government under the International Collaboration Grants Scheme (grant no. 2015081007) and the Special Talents Projects Grant Scheme (grant no. 201605D211021).

References

[1] J. Koomey, Growth in data center electricity use 2005 to 2010, Tech. rep.,516

Analytics Press, Oakland, California, USA (1 Aug 2011).517

[2] J. Whitney, P. Delforge, Scaling Up Energy Efficiency Across the Data Cen-518

ter Industry: Evaluating Key Drivers and Barriers (Issue Paper), Natural519

Resources Defense Council (NRDC), August 2014.520

[3] K. Le, R. Bianchini, M. Martonosi, T. Nguyen, Cost- and energy-aware load521

distribution across data centers, in: Proceedings of HotPower, Montana,522

USA, 2009, pp. 1–5.523

22

Page 24: Vasudevan, Meera,Tian, Glen,Tang, Maolin,Kozan, Erhan, & Zhang, … · 2020. 8. 19. · This may be the author’s version of a work that was submitted/accepted for publication in

[4] A. Greenberg, J. Hamilton, D. A. Maltz, P. Patel, The cost of a cloud:524

research problems in data center networks, SIGCOMM Computer Com-525

munication Review 39 (1) (2008) 68–73.526

[5] A.-C. Orgerie, M. D. d. Assuncao, L. Lefevre, A survey on techniques for527

improving the energy efficiency of large-scale distributed systems, ACM528

Computing Surveys 46 (4) (2014) 47:1–47:31.529

[6] M. Vasudevan, Y.-C. Tian, M. Tang, E. Kozan, Profiling : an application530

assignment approach for green data centers, in: Proceedings of the IEEE531

40th Annual Conference of the Industrial Electronics Society, Dallas, TX,532

USA, 29 Oct - 1 Nov 2014, pp. 5400–5406.533

[7] M. Vasudevan, Y.-C. Tian, M. Tang, E. Kozan, J. Gao, Using genetic534

algorithm in profile-based assignment of applications to virtual machines for535

greener data centers, in: Neural Information Processing: 22nd International536

Conference ICONIP’2015, Vol. 9490 of Lecture Notes in Computer Science,537

2015, pp. 182–189.538

[8] M. Vasudevan, Y.-C. Tian, M. Tang, , E. Kozan, Profile-based applica-539

tion assignment for greener and more energy-efficient data centers, Future540

Generation Computer Systems 67 (2017) 94–108.541

[9] P. Bajpai, M. Kumar, Genetic algorithm an approach to solve global op-542

timization problems, Indian Journal of Computer Science and Engineering543

1 (3) (2010) 199–206.544

[10] Z. Abbasi, T. Mukherjee, G. Varsamopoulos, S. K. S. Gupta, Dahm: A545

green and dynamic web application hosting manager across geographically546

distributed data centers, Journal of Emerging Technologies in Computing547

Systems 8 (4) (2012) 34:1–34:22.548

[11] Z. Abbasi, G. Varsamopoulos, S. K. S. Gupta, Tacoma: Server and work-549

load management in internet data centers considering cooling-computing550

power trade-off and energy proportionality, ACM Transactions on Archi-551

tecture and Code Optimization 9 (2) (2012) 11:1–11:37.552

[12] K. H. Park, W. Hwang, H. Seok, C. Kim, D.-j. Shin, D. J. Kim, M. K.553

Maeng, S. M. Kim, Mn-mate: Elastic resource management of manycores554

and a hybrid memory hierarchy for a cloud node, Journal of Emerging555

Technologies in Computing Systems 12 (1) (2015) 5:1–5:25.556

[13] C. Canali, R. Lancellotti, Automatic parameter tuning for class-based vir-557

tual machine placement in cloud infrastructures, in: Proceedings of the558

23rd International Conference on Software, Telecommunications and Com-559

puter Networks (SoftCOM), 2015, pp. 290–294.560

[14] A. M. Sampaio, J. G. Barbosa, R. Prodan, PIASA: A power and interfer-561

ence aware resource management strategy for heterogeneous workloads in562

23

Page 25: Vasudevan, Meera,Tian, Glen,Tang, Maolin,Kozan, Erhan, & Zhang, … · 2020. 8. 19. · This may be the author’s version of a work that was submitted/accepted for publication in

cloud data centers, Simulation Modelling Practice and Theory 57 (2015)563

142 – 160.564

[15] F. Pop, C. Dobre, V. Cristea, N. Bessis, F. Xhafa, L. Barolli, Reputation-565

guided evolutionary scheduling algorithm for independent tasks in inter-566

clouds environments, International Journal of Web and Grid Services 11 (1)567

(2015) 4–20.568

[16] F. Bahrpeyma, A. Zakerolhoseini, H. Haghighi, Using IDS fitted q to de-569

velop a real-time adaptive controller for dynamic resource provisioning in570

cloud’s virtualized environment, Applied Soft Computing 26 (2015) 285 –571

298.572

[17] K. Han, X. Cai, Speed-scaling-based job/tasks deployment for energy-573

efficient datacenters in cloud computing, in: Proceedings of the Second574

International Conference on Innovative Computing and Cloud Computing,575

New York, NY, USA, 2013, pp. 154:154–154:157.576

[18] U. Bhoi, P. N. Ramanuj, Enhanced max-min task scheduling algorithm577

in cloud computing, International Journal of Application or Innovation in578

Engineering and Management (IJAIEM) (2013) 259–264.579

[19] P. Tembey, A. Gavrilovska, K. Schwan, Merlin: Application- and platform-580

aware resource allocation in consolidated server systems, in: Proceedings581

of the ACM Symposium on Cloud Computing, New York, NY, USA, 2014,582

pp. 14:1–14:14.583

[20] F. Chen, J. Grundy, Y. Yang, J.-G. Schneider, Q. He, Experimental analy-584

sis of task-based energy consumption in cloud computing systems, in: Pro-585

ceedings of the 4th ACM/SPEC International Conference on Performance586

Engineering, New York, NY, USA, 2013, pp. 295–306.587

[21] F. Chen, J. Grundy, J.-G. Schneider, Y. Yang, Q. He, Automated analysis588

of performance and energy consumption for cloud applications, in: Pro-589

ceedings of the 5th ACM/SPEC International Conference on Performance590

Engineering, ICPE ’14, New York, NY, USA, 2014, pp. 39–50.591

[22] J. Mars, L. Tang, K. Skadron, M. Soffa, R. Hundt, Increasing utilization592

in modern warehouse-scale computers using bubble-up, IEEE Micro 32 (3)593

(2012) 88–99.594

[23] J. Mars, L. Tang, R. Hundt, K. Skadron, M. L. Soffa, Bubble-up: Increasing595

utilization in modern warehouse scale computers via sensible co-locations,596

in: Proceedings of the 44th Annual IEEE/ACM International Symposium597

on Microarchitecture, MICRO-44, New York, NY, USA, 2011, pp. 248–259.598

[24] H. Yang, A. Breslow, J. Mars, L. Tang, Bubble-flux: Precise online599

qos management for increased utilization in warehouse scale computers,600

SIGARCH Compututer Architecture News 41 (3) (2013) 607–618.601

24

Page 26: Vasudevan, Meera,Tian, Glen,Tang, Maolin,Kozan, Erhan, & Zhang, … · 2020. 8. 19. · This may be the author’s version of a work that was submitted/accepted for publication in

[25] A. V. Do, J. Chen, C. Wang, Y. C. Lee, A. Zomaya, B. B. Zhou, Profiling applications for virtual machine placement in clouds, in: Proceedings of the IEEE 4th International Conference on Cloud Computing, Washington, DC, USA, 2011, pp. 660–667.

[26] K. Ye, Z. Wu, C. Wang, B. B. Zhou, W. Si, X. Jiang, A. Zomaya, Profiling-based workload consolidation and migration in virtualized data centers, IEEE Transactions on Parallel and Distributed Systems 26 (3) (2015) 878–890.

[27] M. Guzek, J. E. Pecero, B. Dorronsoro, P. Bouvry, Multi-objective evolutionary algorithms for energy-aware scheduling on distributed computing systems, Applied Soft Computing 24 (2014) 432–446.

[28] F. Tao, Y. Feng, L. Zhang, T. Liao, CLPS-GA: A case library and Pareto solution-based hybrid genetic algorithm for energy-aware cloud service scheduling, Applied Soft Computing 19 (2014) 264–279.

[29] S. Sindhu, S. Mukherjee, A genetic algorithm based scheduler for cloud environment, in: Proceedings of the 4th International Conference on Computer and Communication Technology (ICCCT), 2013, pp. 23–27.

[30] G. Wu, M. Tang, Y.-C. Tian, W. Li, Energy-efficient virtual machine placement in data centers by genetic algorithm, in: T. Huang, Z. Zeng, C. Li, C. Leung (Eds.), Neural Information Processing, Vol. 7665 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2012, pp. 315–323.

[31] M. Tang, S. Pan, A hybrid genetic algorithm for the energy-efficient virtual machine placement problem in data centers, Neural Processing Letters 41 (2) (2015) 211–221.

[32] A. Ghorbannia Delavar, Y. Aryan, HSGA: A hybrid heuristic algorithm for workflow scheduling in cloud systems, Cluster Computing 17 (1) (2014) 129–137.

[33] G. Portaluri, S. Giordano, D. Kliazovich, B. Dorronsoro, A power efficient genetic algorithm for resource allocation in cloud computing data centers, in: Proceedings of the 3rd IEEE International Conference on Cloud Networking (CloudNet), 2014, pp. 58–63.

[34] L. Chen, H. Shen, K. Sapra, Distributed autonomous virtual resource management in datacenters using finite-Markov decision process, in: Proceedings of the ACM Symposium on Cloud Computing, New York, NY, USA, 2014, pp. 24:1–24:13.

[35] M. Ghorbani, Y. Wang, Y. Xue, M. Pedram, P. Bogdan, Prediction and control of bursty cloud workloads: A fractal framework, in: Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, New York, NY, USA, 2014, pp. 12:1–12:9.


[36] M. Vasudevan, Y.-C. Tian, M. Tang, E. Kozan, W. Zhang, Profile-based dynamic application assignment with a repairing genetic algorithm for greener data centers, Journal of Supercomputing 73 (9) (2017) 3977–3998.

[37] L. P. Wang, S. Li, F. Tian, X. Fu, A noisy chaotic neural network for solving combinatorial optimization problems: Stochastic chaotic simulated annealing, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 34 (5) (2004) 2119–2125.

[38] F. Larumbe, B. Sanso, A tabu search algorithm for the location of data centers and software components in green cloud computing networks, IEEE Transactions on Cloud Computing 1 (2013) 22–35.

[39] X. J. Fu, L. P. Wang, Rule extraction by genetic algorithms based on a simplified RBF neural network, in: Proceedings of the 2001 Congress on Evolutionary Computation, 2001, pp. 753–758.

[40] R. D. Kent, T. W. Sands (Eds.), High Performance Computing Systems and Applications, Vol. 727, Springer US, 2003.

[41] K. Zhu, H. Song, L. Liu, J. Gao, G. Cheng, Hybrid genetic algorithm for cloud computing applications, in: Proceedings of the IEEE Asia-Pacific Services Computing Conference (APSCC), 2011, pp. 182–187.
