
Proceedings of 2014 RAECS UIET Panjab University Chandigarh, 06 - 08 March, 2014

978-1-4799-2291-8/14/$31.00 ©2014 IEEE

Bi-Criteria Priority Based Particle Swarm Optimization Workflow Scheduling Algorithm for Cloud

Amandeep Verma
University Institute of Engineering and Technology, Panjab University, Chandigarh, India
[email protected]

Sakshi Kaushal
University Institute of Engineering and Technology, Panjab University, Chandigarh, India
[email protected]

Abstract— Cloud Computing is based upon a market-oriented business model in which users access cloud services through the Internet and pay only for what they use. Large-scale scientific applications are often expressed as workflows. Workflow tasks should be scheduled efficiently so that both the execution time and the cost incurred by using a set of heterogeneous cloud resources are minimized. In this paper, we propose Bi-Criteria Priority based Particle Swarm Optimization (BPSO) to schedule workflow tasks over the available cloud resources so as to minimize the execution cost and the execution time under given deadline and budget constraints. The proposed algorithm is evaluated by simulation with four different real-world workflow applications and compared with Budget Constrained Heterogeneous Earliest Finish Time (BHEFT) and standard PSO. The simulation results show that our scheduling algorithm significantly decreases the execution cost of the schedule as compared to BHEFT and PSO under the same deadline and budget constraints and the same pricing model.

Keywords— Scheduling; Workflow; Directed Acyclic Graph (DAG); HEFT; PSO; Priority.

I. INTRODUCTION

Cloud Computing is currently a booming area in the Information Technology domain. It is a paradigm that provides dynamically scalable services on demand over the Internet through virtualization of hardware and software [1]. It delivers hardware such as computing, network bandwidth, and storage, as well as software applications, as services [2]. It is based on a market-oriented business model where users are charged for cloud services on a pay-as-you-go basis, like conventional utilities (e.g. water, electricity, and telephony) in everyday life [3].

Scheduling of workflows involves massive computation and communication costs. These workflows, especially those from scientific areas such as astronomy and biology, present a strong case for using the cloud for their execution. Scheduling is the process of mapping inter-dependent tasks onto the available resources such that the workflow application completes its execution within the user's specified Quality of Service (QoS) constraints [4]. In grid workflow management systems, workflow scheduling algorithms basically attempt to minimise the execution time without considering the cost of accessing resources. In cloud computing, however, faster resources are usually more expensive than slower ones. Therefore, workflow scheduling in the cloud requires both the time and cost constraints specified by the user to be satisfied [5]. The time constraint ensures that the workflow is executed within the given deadline, and the cost constraint ensures that the user's budget is not overshot. A good heuristic tries to balance both these values and still obtain a near-optimal solution.

In this paper, we propose the Bi-Criteria Priority based Particle Swarm Optimization (BPSO) algorithm to schedule workflow applications on cloud resources, optimizing the execution cost of running the workflow while also minimizing the total execution time (i.e. the sum of processing time and data transmission time) under the given budget and deadline constraints. The remainder of the paper is organized as follows: Section II presents related work in the area of workflow scheduling. The problem description is presented in Section III. Particle Swarm Optimization (PSO) and the proposed BPSO are discussed in Sections IV and V respectively. The proposed BPSO is evaluated and compared with BHEFT in Section VI, and Section VII concludes the paper.

II. RELATED WORK

Scheduling of workflows is an NP-complete problem [6]. Many heuristic algorithms, such as Minimum Completion Time, Sufferage, Min-min, and Max-min, are used as candidates for best-effort scheduling strategies [7, 8]. Nowadays, meta-heuristic techniques such as Genetic Algorithms, Particle Swarm Optimization, and Ant Colony Optimization have been gaining popularity. This is due to the fact that they are easy to implement, converge quickly, and give an approximate solution in much less time than traditional methods [9-13].

However, only a few works in the past consider bi-objective criteria (mainly time and cost) for scheduling workflow tasks over cloud resources. Zheng W. and Sakellariou R. [14] proposed the Budget and Deadline Constrained extension of HEFT (BHEFT), which produces a BDC plan to check whether a workflow request should be accepted or not. From the literature review, we found that meta-heuristic algorithms are currently gaining popularity. We therefore use Particle Swarm Optimization to schedule workflow applications on cloud resources, focusing on minimizing the execution cost and time under the user's specified deadline and budget constraints.

III. PROBLEM DESCRIPTION

A workflow application is modelled by a Directed Acyclic Graph (DAG), defined by a tuple G(T, E), where T is the set of n tasks {t1, t2, ..., tn} and E is a set of e edges representing the dependencies. Each ti ∈ T represents a task in the application, and each edge (ti, tj) ∈ E represents a precedence constraint, such that the execution of tj ∈ T cannot start before ti ∈ T finishes its execution [15]. If (ti, tj) ∈ E, then ti is the parent of tj and tj is the child of ti. A task with no parent is known as an entry task, and a task with no children is known as an exit task. The information associated with each task ti is: the service type (yi) that the task wants to use, and the task size (zi) in Millions of Instructions (MI).
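To make the model concrete, the following is a minimal Java sketch of this task and DAG representation (Java being the language used for the paper's simulator). The class and field names (Task, Workflow, edgeTime) are our own illustrations, not taken from the authors' code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Task {
    final int id;
    final int serviceType;  // y_i: index of the service type the task uses
    final double sizeMI;    // z_i: task size in Millions of Instructions
    final List<Task> children = new ArrayList<>();  // succ(t_i)

    Task(int id, int serviceType, double sizeMI) {
        this.id = id;
        this.serviceType = serviceType;
        this.sizeMI = sizeMI;
    }
}

class Workflow {
    final List<Task> tasks = new ArrayList<>();
    // d_ij: data transmission time attached to edge (t_i, t_j)
    final Map<Long, Double> edgeTime = new HashMap<>();

    void addEdge(Task parent, Task child, double dij) {
        parent.children.add(child);
        edgeTime.put(((long) parent.id << 32) | child.id, dij);
    }
}
```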

There is a group of service types S = {S0, S1, ...} and a set of heterogeneous resources that are fully interconnected. Different resources may have different processing power, expressed in MIPS (Millions of Instructions per Second). For each resource rp, a power ratio αp is calculated to depict its processing power. [Equation (1) is not legible in this transcript.]

It is assumed that a resource rp is able to provide all the service types. For each service type Sx, a parameter βx gives its standard execution time, which is used to estimate the execution time of a task that uses this service type. The execution time of a task ti on a resource rp is calculated by the following equation:

ET(i,p) = (zi * βx) / MIPS of rp ..........(2)

and the execution cost EC(i,p) is given by:

EC(i,p) = µp * ET(i,p) ..........(3)

where µp is the unit price of resource rp, assumed to be µp = αp(1 + αp)/2. Moreover, all resources are assumed to be in the same physical region, so data storage and data transmission costs are taken as zero. Only the time to transmit data between two dependent tasks that are mapped to different resources is considered in the experiments.
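The sketch below evaluates equations (2) and (3) for a task-resource pair. Since equation (1) is not legible in this transcript, the power ratio αp is computed under an assumed definition (MIPS normalized against the slowest resource); the µp pricing formula is the one stated above.

```java
class Resource {
    final double mips;      // processing power in MIPS
    final double alpha;     // power ratio alpha_p (equation (1); assumed form here)
    final double unitPrice; // mu_p = alpha_p * (1 + alpha_p) / 2, as in Section III

    Resource(double mips, double slowestMips) {
        this.mips = mips;
        this.alpha = mips / slowestMips;  // ASSUMPTION: normalized against the slowest resource
        this.unitPrice = alpha * (1 + alpha) / 2;
    }

    // ET(i,p) = (z_i * beta_x) / MIPS of r_p  -- equation (2)
    double execTime(double taskSizeMI, double beta) {
        return taskSizeMI * beta / mips;
    }

    // EC(i,p) = mu_p * ET(i,p)  -- equation (3)
    double execCost(double taskSizeMI, double beta) {
        return unitPrice * execTime(taskSizeMI, beta);
    }
}
```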

IV. PARTICLE SWARM OPTIMIZATION (PSO)

Particle Swarm Optimization is an evolutionary technique introduced in 1995 by Kennedy and Eberhart. In this technique, a swarm of individuals, known as particles, flows through the swarm space. Each particle has a fitness value indicating its performance, which is problem specific, and a velocity which directs its flight. Each particle represents a candidate solution to the optimization problem [16]. The position of a particle at any given time is influenced by both its own best position, called pbest [16], and the position of the best particle in its neighborhood, referred to as gbest [16]; that is, the experience of neighboring particles is taken into account during optimization. A particle's status in the search space is characterized by two elements, its velocity and its position, which are updated in every generation. The velocity vector is updated using the following equation [16]:

vi(t+1) = ω * vi(t) + c1 * r1 * (pbesti − xi(t)) + c2 * r2 * (gbest − xi(t)) ..........(4)

where ω is the inertia weight, c1 is the cognitive coefficient based on the particle's own experience, c2 is the social coefficient based on the swarm's experience, and r1, r2 are random variables uniformly distributed in (0, 1). The position vector is updated using the following equation:

xi(t+1) = xi(t) + vi(t+1) ..........(5)

where xi is the position of the particle and vi is the velocity of the particle.
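Below is a minimal Java sketch of the update rules (4) and (5), with the velocity clamped to [Vmin, Vmax] as in the experimental settings of Section VI. How continuous positions are discretized to resource indices is an implementation choice the paper does not spell out.

```java
import java.util.Random;

class PsoStep {
    private static final Random RNG = new Random();

    /** One generation: equation (4) for velocity, equation (5) for position. */
    static void update(double[] x, double[] v, double[] pbest, double[] gbest,
                       double w, double c1, double c2, double vMin, double vMax) {
        for (int d = 0; d < x.length; d++) {
            double r1 = RNG.nextDouble();
            double r2 = RNG.nextDouble();
            v[d] = w * v[d]
                 + c1 * r1 * (pbest[d] - x[d])   // cognitive term
                 + c2 * r2 * (gbest[d] - x[d]);  // social term
            v[d] = Math.max(vMin, Math.min(vMax, v[d]));  // clamp to [Vmin, Vmax]
            x[d] += v[d];
        }
    }
}
```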

A. Particle

The PSO algorithm defines particles and optimizes them based on a fitness function. A particle may be defined as containing a list of tasks and the resources or instances onto which they are mapped. If there are four tasks to be mapped onto three resources, a particle may be represented as in Table I.

TABLE I. PSO PARTICLE

            Task 1   Task 2   Task 3   Task 4
Particle 1     1        3        2        2
Particle 2     2        1        3        3
Particle 3     3        2        1        1

B. Fitness Function

To determine the effectiveness of a schedule, a fitness function is defined. The function allows the performance of one schedule to be compared with another. As the objective is to minimize time and cost, the fitness function must incorporate both of these parameters. If the deadline is met, the fitness of a particle is defined as:

Fitness = cost ..........(6)
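A hedged sketch of the fitness evaluation (6) follows, for a particle encoded as in Table I (index = task, value = resource). The paper states only the deadline-met case; the penalty for a deadline miss below is our assumption, added so the function is total.

```java
class Fitness {
    static final double PENALTY = 1e9;  // ASSUMPTION: large penalty for missed deadlines

    /**
     * particle[t] is the (0-based) resource index task t is mapped to.
     * execCost[t][r] holds EC(t, r) from equation (3); makespan is the
     * schedule's finish time, computed elsewhere.
     */
    static double evaluate(int[] particle, double[][] execCost,
                           double makespan, double deadline) {
        double cost = 0.0;
        for (int t = 0; t < particle.length; t++) {
            cost += execCost[t][particle[t]];
        }
        // Equation (6): when the deadline is met, fitness is simply the cost.
        return makespan <= deadline ? cost : cost + PENALTY;
    }
}
```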

V. PROPOSED HEURISTIC (BPSO)

Our proposed heuristic, BPSO, is based on the PSO discussed in Section IV. In BPSO, the workflow tasks are executed in order of their priority, which is assigned using the following method.

Input: A DAG G with budget B and deadline D
Output: A cost-optimized schedule

1. Calculate the blevel of all tasks of the workflow using equation (7).
2. Sort the tasks in descending order of blevel.
3. for k := 1 to n (where n is the number of tasks in the workflow)
   a. Select the kth task from the sorted list.
   b. Assign it to the first available free machine.
4. Initialize PSO by inserting the schedule created in step 3 as the first particle.
5. Create the rest of the particles by assigning all the tasks randomly over the different available machines.
6. do:
   a. Update velocity and position using equations (4) and (5) respectively.
   b. Evaluate fitness using equation (6).
   c. Update pbest.
   d. Update gbest.
   while the termination criterion is not met.
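A sketch of the swarm seeding in steps 1-5: the first particle follows the b-level order ("first available free machine" is approximated here by round-robin, an assumption on our part), and the remaining particles are random mappings.

```java
import java.util.Random;

class SwarmInit {
    /**
     * tasksByBlevelDesc: task ids sorted in descending b-level order (steps 1-2).
     * Returns swarmSize particles; particle[t] = machine index for task t.
     */
    static int[][] initSwarm(int[] tasksByBlevelDesc, int numMachines,
                             int swarmSize, Random rng) {
        int n = tasksByBlevelDesc.length;
        int[][] swarm = new int[swarmSize][n];
        // Steps 3-4: the first particle assigns tasks in priority order;
        // "first available free machine" is approximated by round-robin.
        for (int k = 0; k < n; k++) {
            swarm[0][tasksByBlevelDesc[k]] = k % numMachines;
        }
        // Step 5: the remaining particles map every task to a random machine.
        for (int p = 1; p < swarmSize; p++) {
            for (int t = 0; t < n; t++) {
                swarm[p][t] = rng.nextInt(numMachines);
            }
        }
        return swarm;
    }
}
```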

A. Priority Assignment

The priority of a task specifies its order of execution. In our proposed heuristic (BPSO), the priorities of all tasks are computed using the bottom level (blevel), which is the same as defined in HEFT [17] and is given by equation (7):

blevel(ti) = wi + max over tj ∈ succ(ti) of (dij + blevel(tj)) ..........(7)

where wi is the average execution time of the task on the different computing machines, succ(ti) includes all the child tasks of ti, and dij is the data transmission time from task ti to task tj. If a node has no children, its blevel is equal to the average execution time of the task on the different computing machines.

An Illustrative Example: Consider a DAG with 9 tasks as shown in Figure 1. Each edge weight of the DAG represents the data transmission time between the tasks. Table II shows the expected completion times and priorities of the tasks on three different machines. The blevel of every task is calculated using equation (7), and the tasks are then sorted in descending order of blevel. The tasks are sent to the different machines according to this order of execution to complete the workflow application. Figure 2 shows the schedule generated according to the blevel of the DAG.

TABLE II: EXECUTION TIME AND B-LEVEL OF TASKS

Task   M1   M2   M3   blevel   Order of execution according to blevel
T1      3    5    1     20     1
T2      2    3    1     18     2
T3      3    5    1     16     3
T4      2    3    1     11     5
T5      2    3    1     14     4
T6      2    3    1      7     7
T7      2    3    1      7     8
T8      4    6    2     11     6
T9      3    5    1      3     9
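A memoized Java sketch of equation (7) follows, reusing the Task/Workflow types sketched in Section III; wAvg[i] holds task i's average execution time over the machines.

```java
import java.util.HashMap;
import java.util.Map;

class BLevel {
    /** Equation (7), memoized: blevel(ti) = wi + max over children of (dij + blevel(tj)). */
    static double blevel(Task t, double[] wAvg, Workflow wf, Map<Integer, Double> memo) {
        Double cached = memo.get(t.id);
        if (cached != null) return cached;
        double maxChild = 0.0;  // exit task: blevel equals its average execution time
        for (Task child : t.children) {
            double dij = wf.edgeTime.get(((long) t.id << 32) | child.id);
            maxChild = Math.max(maxChild, dij + blevel(child, wAvg, wf, memo));
        }
        double result = wAvg[t.id] + maxChild;
        memo.put(t.id, result);
        return result;
    }
}
```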

Figure 1: A Sample DAG

Figure 2: Schedule according to b-level (M1: T1, T5, T6; M2: T2, T4, T7; M3: T3, T8, T9)

Figure 3 outlines the BPSO algorithm.

Figure 3: The BPSO Heuristic

VI. PERFORMANCE EVALUATION

In this section, we present our simulation of the proposed algorithm BPSO. To evaluate the workflow scheduling algorithm, we used four synthetic workflows based on realistic workflows from diverse scientific applications:

• Montage: Astronomy
• Genome: Biology
• LIGO: Gravitational physics
• SIPHT: Biology



The detailed characterization of each workflow, including its structure and its data and computational requirements, can be found in [18]. Figure 4 shows the approximate structure of each workflow.

Figure 4: Structure of various workflows [18]: (a) Montage, (b) Genome, (c) SIPHT, (d) LIGO

For our experiments, we developed a simulation program in Java for a cloud environment consisting of one data-center. The data-center includes six resources with different processing speeds and hence different pricing. There are four service types, with standard execution times of 1.0, 1.5, 2.5, and 3.0 respectively. The processor speeds of the resources are selected randomly in the range of 1000-6000 MIPS. The power ratio αp and the price µp of each resource are calculated using equation (1) and the pricing formula µp = αp(1 + αp)/2 respectively. The average bandwidth between the resources is assumed to be 20 Mbps.

A. Performance Metric

The performance metric chosen for the comparison is the Normalized Schedule Cost (NSC). The NSC of a schedule is calculated as:

NSC = total schedule cost / Cc ..........(8)

where Cc is the execution cost of the same workflow when all tasks are executed on the fastest service, according to their precedence constraints. Reasonable values for the deadline D and budget B are generated as: Deadline D = LBD + k1 * (UBD − LBD), where LBD = MHEFT (the makespan of HEFT), UBD = 5 * MHEFT, and k1 is a deadline ratio in the range 0 to 1. Budget B = LCB + k2 * (UCB − LCB), where LCB is the lowest cost, obtained by mapping each task to the cheapest service, UCB is the highest cost, obtained conversely, and k2 is a budget ratio in the range 1.0 to 3.0.
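A small sketch of the deadline/budget generation and the NSC ratio (8); MHEFT, LCB, and UCB are assumed to be computed elsewhere as described, and the ratio form of (8) is reconstructed from the surrounding text.

```java
class Constraints {
    // Deadline D = LBD + k1 * (UBD - LBD), with LBD = MHEFT, UBD = 5 * MHEFT, k1 in [0, 1].
    static double deadline(double mHeft, double k1) {
        double lbd = mHeft, ubd = 5 * mHeft;
        return lbd + k1 * (ubd - lbd);
    }

    // Budget B = LCB + k2 * (UCB - LCB), with k2 the budget ratio (1.0 to 3.0 in the paper).
    static double budget(double lcb, double ucb, double k2) {
        return lcb + k2 * (ucb - lcb);
    }

    // Equation (8): schedule cost normalized by Cc, the all-fastest-service cost.
    static double nsc(double scheduleCost, double fastestServiceCost) {
        return scheduleCost / fastestServiceCost;
    }
}
```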

For PSO, the following parameters are set:

Parameter                  Value
Particle Size              20
Iterations                 50
Inertia Weight             0.9
Vmax, Vmin                 3, -3
Acceleration Coefficients  2.0

B. Experiment Results

BPSO is compared with BHEFT [14] (without considering the existing load of a resource) and with standard PSO. As PSO is a stochastic algorithm, and in BHEFT the services are assigned randomly to the different tasks of a workflow, both algorithms are run 50 times and the average NSC is used for the comparison. Figure 5 shows the average NSC of scheduling the different workflows with BPSO and BHEFT for three different values of k1 (0.2, 0.6, and 1.0) and three different values of k2 (1.5, 2.0, and 2.5), nine combinations in total. It shows that, in all cases, BPSO outperforms BHEFT by significantly reducing the execution cost of the schedule under the same deadline and budget constraints and the same pricing model. BPSO also generates schedules that are cheaper than those created by standard PSO. Compared with BHEFT, the cost of BPSO is reduced by 30% for Montage, 55% for Genome, 39% for SIPHT, and 20% for LIGO.

Figure 5: Average NSC of the different workflows: (a) Montage, 25 nodes; (b) Genome, 24 nodes; (c) SIPHT, 29 nodes; (d) LIGO, 30 nodes

VII. CONCLUSION AND FUTURE WORK

In this paper, we have presented Bi-Criteria Priority based Particle Swarm Optimization (BPSO) to schedule workflow applications on cloud resources, minimizing the execution cost while meeting the deadline and budget constraints for delivering the result. Each workflow task is assigned a priority using its bottom level. These priorities are then used to initialize the PSO. The proposed algorithm is evaluated with synthetic workflows that are based on realistic workflows with different structures and sizes. The proposed algorithm is compared with BHEFT (without considering the existing load of resources) and with standard PSO under the same deadline and budget constraints and pricing model. The simulation results show that our proposed algorithm has promising performance compared to BHEFT and PSO. In future, we intend to improve this work by considering the existing load of resources while mapping a task to a particular resource, and to compare it with BHEFT (with load) and other existing dynamic heuristic techniques in the literature.

ACKNOWLEDGMENT

We would like to thank all the reviewers for their valuable comments.

REFERENCES

[1] Foster, I., Zhao, Y., Raicu, I., and Lu, S., "Cloud computing and grid computing 360-degree compared", In: Proceedings of the Grid Computing Environments Workshop, Austin, pp: 1-10, Nov. 16, 2008.
[2] Gabriel, M., Wolfgang, G., and Calvin, J. R., "Hybrid computing—where HPC meets grid and cloud computing", Journal of Future Generation Computer Systems, vol. 27, no. 5, pp: 440-453, May 2011.
[3] Verma, A., and Kaushal, S., "Cloud computing security issues and challenges: a survey", In: Proceedings of the International Conference on Advances in Computing and Communications, Part IV, Kochi, India, Communications in Computer and Information Science, vol. 193, Springer, pp: 445-454, July 22-24, 2011.
[4] Taylor, I., Deelman, E., Gannon, D., and Shields, M., "Workflows for e-Science: Scientific Workflows for Grids", 1st Edition, Springer.
[5] Pandey, S., "Scheduling and management of data intensive application workflows in grid and cloud computing environments", PhD Thesis, University of Melbourne, Australia, 2011.
[6] Yu, J., and Buyya, R., "A taxonomy of workflow management systems for grid computing", Journal of Grid Computing, vol. 3, no. 1-2, pp: 171-200, Jan 2005.
[7] Yu, J., and Buyya, R., "Workflow scheduling algorithms for grid computing", In: Xhafa, F., and Abraham, A. (eds), Metaheuristics for Scheduling in Distributed Computing Environments, Springer, Berlin, ISBN: 978-3-540-69260-7, 2008.
[8] Ke, L., Hai, J., Jinjun, C., Xiao, L., and Dong, Y., "A compromised time-cost scheduling algorithm in SwinDeW-C for instance-intensive cost-constrained workflows on cloud computing platform", Journal of High Performance Computing Applications, pp: 1-16, 2010.
[9] Kiruthiga, G., and Senthilkumar, M., "A survey about dynamic tasks scheduling in heterogeneous processors using hybrid particle swarm optimization", International Journal of Computer Science and Technology, vol. 2, no. 3, pp: 85-289, Mar. 2011.
[10] Pandey, S., Linlin, W., Siddeswara, M. G., and Buyya, R., "A particle swarm optimization-based heuristic for scheduling workflow applications in cloud computing environments", In: Proceedings of the International Conference on Advanced Information Networking and Applications, Perth, WA, pp: 400-407, April 20-23, 2010.
[11] Zhangjun, W., Xiao, L., Zhiwei, N., Dong, Y., and Yun, Y., "A market-oriented hierarchical scheduling strategy in cloud workflow systems", Journal of Supercomputing, vol. 63, no. 1, pp: 256-293, Jan 2013.
[12] Verma, A., and Kaushal, S., "Deadline constraint heuristic based genetic algorithm for workflow scheduling in cloud", Journal of Grid and Utility Computing. (Accepted)
[13] Verma, A., and Kaushal, S., "Budget constraint priority based genetic algorithm for workflow scheduling in cloud", In: Proceedings of the IET International Conference on Recent Trends in Information, Telecommunication and Computing, India, pp: 8-14, Aug. 01-02, 2013.
[14] Zheng, W., and Sakellariou, R., "Budget-deadline constrained workflow planning for admission control in market-oriented environments", In: Proceedings of the 8th International Conference on Economics of Grids, Clouds, Systems, and Services, LNCS, Springer-Verlag, Berlin Heidelberg, pp: 105-119, 2012. DOI: 10.1007/978-3-642-28675-9_8.
[15] Verma, A., and Kaushal, S., "Deadline and budget distribution based cost-time optimization workflow scheduling algorithm for cloud", In: IJCA Proceedings of the International Conference on Recent Advances and Future Trends in IT, Patiala, India, pp: 1-4, April 2012.
[16] Kennedy, J., and Eberhart, R., "Particle swarm optimization", In: Proceedings of the International Conference on Neural Networks, Perth, WA, pp: 1942-1948, Nov. 27 - Dec. 01, 1995.
[17] Topcuoglu, H., Hariri, S., and Wu, M.Y., "Performance-effective and low-complexity task scheduling for heterogeneous computing", IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp: 260-274, Mar 2002.
[18] Bharathi, S., Chervenak, A., Deelman, E., Mehta, G., Su, M.H., and Vahi, K., "Characterization of scientific workflows", In: Workshop on Workflows in Support of Large-Scale Science, CA, USA, pp: 1-10, Nov. 17, 2008.

