[IEEE 2010 IEEE International Conference on Virtual Environments, Human-Computer Interfaces and...

Page 1: [IEEE 2010 IEEE International Conference on Virtual Environments, Human-Computer Interfaces and Measurement Systems (VECIMS) - Taranto, Italy (2010.09.6-2010.09.8)] 2010 IEEE International

Optimizing the Utilization of Virtual Resources in Cloud Environment

Sunirmal Khatua
Department of Computer Science and Engineering
University of Calcutta, India
Email: [email protected]

Anirban Ghosh
School of Mobile Computing and Communication
Jadavpur University, India
Email: [email protected]

Nandini Mukherjee
Department of Computer Science and Engineering
Jadavpur University, India
Email: [email protected]

Abstract—One of the key factors behind successful deployment of Cloud for on-demand services is the optimal utilization of its virtual resources. A poorly managed cloud application may incur a huge cost, even more than the cost of physical deployment. The most important issues in Cloud are scalability and availability. A highly scalable deployment may lead to poor resource utilization, whereas a deployment with low scalability may lead to unavailability of services. This paper proposes an architecture for optimal utilization of such resources considering both scalability and availability. The proposed architecture, named the Monitoring & Optimizing Virtual Resources (MOVR) architecture, manages and optimizes the usage of the resources required by a cloud application, considering auto deployment, auto scaling and auto recovery of the provisioned resources for the application.

I. INTRODUCTION

Virtualization [1] [2] is one of the key technologies that make modern computers powerful and versatile. It is now possible to create a large number of virtual machines within a single physical machine. Each of these virtual machines can run a different operating system. This trend is opposite to clustering technology, where we create a powerful computational resource from a number of small computational units. Generally, clustering is used when a single machine cannot support a workload. On the other hand, virtualization is used when the resources of a physical machine are underutilized by the workload. A combination of the two technologies, through creation of a cluster of virtual machines [3], is also possible.

In high performance computing research, optimal resource utilization has always been considered one of the major challenges. Certain resource utilization problems arise even in simple environments, like individual workstations. They are further aggravated in complex environments, particularly in those which are dynamic and heterogeneous. Some solutions have been proposed in the literature for traditional environments like clusters. These solutions are mainly based on techniques for dynamic load balancing. However, not much work has been carried out for virtual environments in this regard.

One major requirement for a solution to optimal resource utilization is monitoring the resource health. On the basis of the resource health information, virtual resources for an application can be increased or decreased. In this paper we propose an architecture for Monitoring and Optimizing Virtual Resources (MOVR) which can be built on top of some virtualization technologies. We also demonstrate how the architecture can be deployed for optimal use of the resources in a virtual environment.

The paper is organized as follows: In Section II we discuss the concepts of virtualization and cloud computing. A brief discussion of related work is presented in Section III. The proposed MOVR architecture is described in Section IV. Section V summarizes the preliminary implementation and evaluation of the proposed architecture. Section VII concludes with directions for future work.

II. VIRTUALIZATION AND CLOUD COMPUTING

In this section we briefly describe virtualization technologies and the implementation of Cloud computing paradigms.

A. Virtualization

Virtualization is a software technology that is rapidly changing the way people compute. The technology requires specialized hardware for its proper implementation. Recent hardware is mature enough to use virtualization technology and provide the illusion of many low-end Virtual Machines (VMs), each running a separate operating system instance. Intel Virtualization Technology (Intel VT) and AMD Virtualization (AMD-V) are the two major initiatives to make virtualization popular.

Virtualization can be applied at the server level, the OS level or the application level. In server level virtualization, the underlying physical hardware is emulated by virtualization software. Such virtualization software is called a Hypervisor or Virtual Machine Monitor (VMM) [4] [5]. Some popular hypervisors include Xen, VMware ESX, Kernel-based Virtual Machine (KVM), Sun xVM, Microsoft Hyper-V, oVirt and RTS Hypervisor. In the case of OS virtualization, the virtualization platform runs on top of a host operating system that relays and coordinates the resource requests from various VMs. This type of virtualization is very lightweight and requires almost no system or application overhead. Application virtualization, on the other hand, includes software technology allowing applications to run on many different operating systems and hardware platforms. This layer of technology makes it possible to restart an application in case of a failure, start another instance of an application if the application is not meeting service level objectives, or provide workload balancing among multiple instances of an application to achieve high levels of scalability. Application level virtualizations include Application Streaming, Virtual Desktop Infrastructure (VDI) and others.

978-1-4244-5905-6/10/$26.00 ©2010 IEEE

B. Cloud Computing

Buyya et al. in their paper [21] define cloud computing as: “A Cloud is a type of parallel and distributed system consisting of a collection of interconnected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources based on service-level agreements established through negotiation between the service provider and consumers”.

Cloud computing systems fundamentally provide access to large pools of data and computational resources through a variety of interfaces. These types of systems offer a new programming target for scalable application developers. The following three aspects characterize Cloud computing [22]:

i) The illusion of infinite computing resources available on demand, thereby eliminating the need for Cloud computing users to plan far ahead for provisioning.

ii) The elimination of an up-front commitment by Cloud users, thereby allowing companies to start small and increase hardware resources only when there is an increase in their needs.

iii) The ability to pay for use of computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and release them as needed, thereby rewarding conservation by letting machines and storage go when they are no longer useful.

Cloud computing allows an application to use an indefinite number of resources on demand, and this dream has come true with the advent of virtualization technology. Achieving elasticity and the illusion of infinite capacity requires each of these resources to be virtualized, so as to hide how they are multiplexed and shared.

Cloud computing incorporates Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) and includes many other technology trends related to cluster computing, grid computing and utility computing [6]. These services are generally delivered through data centers with different levels of virtualization technologies. There are basically two broad categories of clouds. In the first category, computing instances are created and provided on demand. Amazon’s EC2 service [7] is an example of this category. The second category provides computing capacity on demand. One example of this category is Google’s MapReduce Application [8]. In the first case, scalability is achieved by scaling the computing instances, whereas in the second case, scalability is achieved by scaling the computing capacity through aggregating the provisioned computing instances. Cloud computing is often used with the utility computing pay-per-use model [9]. In such cases it offers reduced capital expenditure, a low entry barrier and scaling up on demand. Thus, the key issues of a standard cloud based application are managing and providing virtualized resources, dynamic scalability of the virtualized resources and incorporating pay-per-use (utility) services. In this paper we focus on the issues of dynamic provisioning and dynamic scaling of virtualized resources. The other issues are assumed to be taken care of by the cloud provider, such as Amazon EC2 or any private cloud built with tools like Eucalyptus [10].

III. PRIOR WORK

We have already mentioned that not much work has been done in the field of optimization of virtual resources. Afkham Azeez, in his paper [11], focuses on deploying highly scalable web service applications on Amazon EC2. He concentrates on high availability, using a Well Known Address (WKA) based membership technique to overcome the shortcomings of Amazon EC2’s multicasting feature. Amazon has also released a mechanism called Auto Scaling [12] that automatically scales the client’s EC2 capacity up or down according to conditions defined by the client, with the help of Amazon CloudWatch [12]. However, the above systems fail to provide a generalized framework for optimal use of the resources used by a cloud based application. This is because in both cases an infrastructure centric optimization was adopted. Therefore, the owner of a cloud application cannot integrate their own optimization rules within the system. Moreover, an application is not allowed to use resources from different infrastructure providers. In contrast, MOVR - the architecture proposed in this paper, adopts an application centric optimization mechanism. The MOVR architecture focuses on establishing a generalized framework which is independent of the Cloud Providers and Cloud Applications. The major goal of this framework is achieving a cost effective, fault tolerant and auto scalable cloud application deployment.

IV. MOVR ARCHITECTURE

The on-demand availability of resources and the pay-per-use model of the Cloud Computing paradigm add a new dimension to the possibility of optimizing virtual resources. MOVR focuses on the optimization of such resources used for a web application deployed over the Cloud. A typical web application generally uses an n-tier architecture that consists of a database-tier, a business-tier, a web-tier and a combination of multi domain tiers. While running such applications on a cluster, load balancing techniques can be used to distribute the loads among the resources within any particular tier. However, while running in a virtual environment, the number of requests for certain services varies over time and the pattern is not known in advance. Therefore, it may not be possible to determine the amount of resources to be provisioned at a particular point of time, and this uncertainty often leads to over-provisioning or under-provisioning of the resources. If extra resources are added to support peak hour requests, these resources may remain underutilized most of the time. On the other hand, if some of the requests are dropped during peak hours because of non-availability of resources, the owners of the applications may lose some of their customers. The MOVR architecture provides a generalized framework for deploying a web application over the cloud and auto scaling the provisioned virtual resources. The layers of the architecture and its functioning are described in the next two subsections.

Fig. 1. The MOVR Architecture

A. Layers of MOVR Architecture

The MOVR architecture consists of three layers: the Infrastructure layer, the Optimization layer and the Application layer. The components that span the three layers include virtual resources, a Provisioner, a Deployer, a Controller, a Monitor, a Policy-Arbitration-QoS module, and an Application Repository.

The Infrastructure layer provides the resource pool for the deployed application. Resources can be provisioned from public providers like Amazon EC2, Flexiscale, GoGrid [16] [17] etc., or from private providers that allow a deployer to use his own infrastructure. One can establish a private cloud with tools like Eucalyptus.

The Optimization layer is responsible for optimizing the resources during first time deployment as well as during the lifetime of the deployed application through a feedback system [23]. The Deployer module in this layer uses the Application Repository, Policy-Arbitration-QoS and Provisioner modules to provision the resources in an optimal way, as described in later sections. The Policy module maintains various strategies and priorities for scheduling the resources. In the Arbitration module, some of the strategies are overruled manually by experts. Thus the strategies in the Policy module consider long term goals, whereas the strategies in the Arbitration module consider only local goals. The QoS module defines the reliability, response time, security and integrity requirements for the deployed application. For example, the security strategies defined in the QoS module determine which tier (database tier, application tier etc.) of the application should be deployed in the private infrastructure. Similarly, reliability in QoS determines how many backup nodes need to be maintained for the application. The Provisioner module, along with the Unified Provider API, provisions the resources required by the application. It uses the delegate design pattern to handle the heterogeneous request and response formats of different providers. The Deployer module decides the resource requirements using information from the Policy-Arbitration-QoS module and uses the Application Repository and Provisioner modules to deploy the application onto the scheduled resources. The Monitor and the Controller modules, along with the Unified Event API, implement the feedback control system to optimize the resources during the lifetime of a deployed application. The feedback control system relies on an event generation mechanism; along with each event it provides the information required for optimization. Once an event is received, the Controller module uses a mechanism similar to that of the Deployer module to reprovision the resources.
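The delegate pattern mentioned for the Provisioner can be illustrated with a minimal Python sketch. The paper does not show the Unified Provider API, so the class names, the `launch` method and the fake delegates below are hypothetical stand-ins; no real provider call is made.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


class ProviderDelegate(Protocol):
    """Hypothetical per-provider delegate hiding provider-specific formats."""

    def launch(self, image: str, size: str) -> str: ...


@dataclass
class FakePublicDelegate:
    """Stand-in for a public provider such as EC2 (no real API call)."""
    launched: int = 0

    def launch(self, image: str, size: str) -> str:
        self.launched += 1
        return f"public-{image}-{size}-{self.launched}"


@dataclass
class FakePrivateDelegate:
    """Stand-in for a private cloud such as one built with Eucalyptus."""
    launched: int = 0

    def launch(self, image: str, size: str) -> str:
        self.launched += 1
        return f"private-{image}-{size}-{self.launched}"


class Provisioner:
    """Offers one uniform call and forwards it to the chosen delegate."""

    def __init__(self, delegates: Dict[str, ProviderDelegate]) -> None:
        self._delegates = delegates

    def provision(self, provider: str, image: str, size: str) -> str:
        # The caller never sees provider-specific request/response formats.
        return self._delegates[provider].launch(image, size)
```

With this shape, supporting an additional provider would only require one more delegate; the Deployer and Controller would keep calling `provision` unchanged.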

The Application layer allows an end user to deploy any web application within the Cloud without modifying its code. The Unified Client API is used to support heterogeneous application formats, and the Application Repository module is used to store application data along with application metadata. The application data stores the application in the form of scripts, EAR files, WAR files etc. The application metadata stores information about the application data itself, e.g. how many tiers are needed by the application, whether a tier needs to be implemented as a cluster, or the mapping between application data and tiers.

B. Functioning of MOVR Architecture

MOVR works in two phases, namely the Event Generation phase and the Optimization phase. In the Event Generation phase, events are generated to identify any change in the state of the deployed application, e.g. the load on an application (such as the number of requests) increases or decreases. Once such an event is received, in the Optimization phase the resource requirements for the application are analyzed and resources are reprovisioned according to the requirements. We propose to define an application in MOVR formally as the following 7-tuple:

A = (T, R, E, W, Rm, Em, Wm), where

T is the set of tiers, {t}
R is the set of resources, {r}
E is the set of events, {e}
W is the set of workflows, {w}
Rm : T → R
Em : T → E | R → E
Wm : E → W

Every web application consists of a number of tiers (T) (even a standalone server can be considered to be based on a single tier). Every tier uses specific resources (R) in order to implement the functionality of that tier (e.g. a database tier may use MySQL nodes as its resources). MOVR uses a service stack maintained by each provider and provides a uniform interface to the user while provisioning the resources (R) for a particular tier (T). Provisioning is based on the application metadata (application definition) provided by the user. Two other entities are used by MOVR for the application definition. These are events (E) and workflows (W). Events are defined as changes in the state of a resource or a tier. Examples of events include server died, service died, a tier overloaded or underloaded etc. A workflow is a sequence of actions performed by MOVR to fine tune the resources for an application at the occurrence of a specific event. A workflow may also be complex, where one workflow kicks off (initiates) other workflows, either synchronously or asynchronously, to achieve the final goal. The last three elements of the tuple provide three mappings: Resource Mapping - Rm, Event Mapping - Em and Workflow Mapping - Wm. Rm defines the relationships among tiers and resources. MOVR allows a resource to be shared by tiers of various applications, thereby optimizing the resources. In the current implementation of MOVR, security aspects have not been considered; therefore, we assume that the applications belong to the same user. Em defines relationships either among tiers and events or among resources and events. For example, tier overloaded or underloaded events can be defined at the tier level, whereas server died or service died may be defined at the resource level. Wm defines the relationships among events and workflows that allow MOVR to kick off a specific workflow at a particular event.
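To make the formal definition concrete, the tuple can be written down as a small data structure. This is only a sketch under stated assumptions: the field names are invented, and Rm and Em are modelled as mappings to sets so that a tier may use several resources and a source may raise several events.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Application:
    """Hypothetical encoding of A = (T, R, E, W, Rm, Em, Wm)."""
    tiers: Set[str]            # T
    resources: Set[str]        # R
    events: Set[str]           # E
    workflows: Set[str]        # W
    rm: Dict[str, Set[str]]    # Rm: tier -> resources (assumed set-valued)
    em: Dict[str, Set[str]]    # Em: tier or resource -> events
    wm: Dict[str, str]         # Wm: event -> workflow

    def validate(self) -> None:
        """Check that every mapping only refers to declared elements."""
        for tier, res in self.rm.items():
            if tier not in self.tiers or not res <= self.resources:
                raise ValueError(f"invalid Rm entry for {tier!r}")
        for source, evs in self.em.items():
            known_source = source in self.tiers or source in self.resources
            if not known_source or not evs <= self.events:
                raise ValueError(f"invalid Em entry for {source!r}")
        for event, workflow in self.wm.items():
            if event not in self.events or workflow not in self.workflows:
                raise ValueError(f"invalid Wm entry for {event!r}")


# Example built from the cases named in the text: a database tier backed by
# MySQL nodes, a tier-level event and a resource-level event.
example = Application(
    tiers={"database"},
    resources={"mysql-node"},
    events={"tier-overloaded", "server-died"},
    workflows={"scale-up", "recovery"},
    rm={"database": {"mysql-node"}},
    em={"database": {"tier-overloaded"}, "mysql-node": {"server-died"}},
    wm={"tier-overloaded": "scale-up", "server-died": "recovery"},
)
example.validate()
```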

The two phases of MOVR are described in detail in the next two subsections.

1) Event Generation Phase: An application in MOVR can have a number of states. Initially it is in the new state. Once it is copied to the Application Repository it is in the inactive state. When the application is deployed properly within the provisioned resources it enters the active state. Similarly, when the load on an active application is too high or too low (explained later) it changes its state to overloaded or underloaded. The load on an application can be measured in a number of ways, such as the number of requests per unit time to the application, or the CPU or memory load on the underlying resources. Whenever an application changes its state, an event is generated based on the current state. In this section we discuss the various Event Generation Schemes that can be adopted within MOVR.
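The application states described above can be sketched as a small state machine. The state names come from the text; the set of allowed transitions (for example, returning from overloaded to active) and the event naming are assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    NEW = "new"
    INACTIVE = "inactive"
    ACTIVE = "active"
    OVERLOADED = "overloaded"
    UNDERLOADED = "underloaded"


# Assumed transition table; the paper names the states but does not list
# every legal transition.
ALLOWED = {
    State.NEW: {State.INACTIVE},
    State.INACTIVE: {State.ACTIVE},
    State.ACTIVE: {State.OVERLOADED, State.UNDERLOADED, State.INACTIVE},
    State.OVERLOADED: {State.ACTIVE},
    State.UNDERLOADED: {State.ACTIVE},
}


def transition(current: State, target: State) -> str:
    """Validate a state change and return the name of the generated event."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    # Hypothetical naming scheme: the event is named after the state entered.
    return f"app-{target.value}"
```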

The simplest one is the threshold based event generation scheme. Here the user defines threshold values in the application metadata for the overloaded and underloaded states. The feedback control system changes the application state whenever the measured load on the application crosses a predefined threshold. Accordingly, an event is generated. Ranges for the threshold values of the overloaded and underloaded states can be defined statically or proportionally, as discussed in [25]. In proportional thresholding, the target range is changed dynamically depending on the current measured values.
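A static version of the threshold check might look like the following sketch. The function name, the event names and the choice of strict comparisons are assumptions; proportional thresholding [25] is not covered.

```python
from typing import Optional


def threshold_event(load: float, low: float, high: float) -> Optional[str]:
    """Return an event name when the load leaves the [low, high] range."""
    if low >= high:
        raise ValueError("low threshold must be below high threshold")
    if load > high:
        return "overloaded"
    if load < low:
        return "underloaded"
    return None  # load is within the target range, no event
```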

The second event generation scheme is the prediction based event generation scheme. Generally, the load on an application can be characterized by four factors, namely Trend (T), Seasonality (S), Randomness (R) and Cyclicality (C). T identifies the overall slope of the load on the application, whereas S and C determine the peaks at specific points of time on a short term and a long term basis respectively. R is the unforeseen factor showing the variation in load. With significant effects of T, S, R and C in the load, the simple threshold based scheme mentioned above may not perform well. In such situations, we propose to use the following n-step ahead load prediction scheme for better prediction of the future load.

i) Smooth the history data Xh(t) to generate Xs(t) using a Savitzky-Golay filter [24].

ii) Find the AR(p) [26] parameters ϕ1, ϕ2, · · ·, ϕp from Xs(t), Xs(t − 1), · · ·, Xs(t − w − 1), where w is the prediction-window size.

iii) Using Xs(t), Xs(t − 1), · · ·, Xs(t − w − 1) and the AR(p) parameters ϕ1, ϕ2, · · ·, ϕp, predict Xp(t + 1), Xp(t + 2), · · ·, Xp(t + n).

iv) If any of Xp(t + 1), Xp(t + 2), · · ·, Xp(t + n) exceeds the predefined threshold, then trigger an event.
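The four steps above can be sketched in plain Python. Several choices here are assumptions, not the authors' implementation: smoothing uses the standard 5-point quadratic Savitzky-Golay coefficients and leaves the two samples at each edge untouched, the AR(p) model has no mean term and is fitted by ordinary least squares, and the window is taken as the last w smoothed samples.

```python
from typing import List, Sequence

# Standard 5-point quadratic/cubic Savitzky-Golay smoothing coefficients.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def smooth(xs: Sequence[float]) -> List[float]:
    """Step i: smooth interior points; the two edge points stay as-is."""
    out = list(xs)
    for i in range(2, len(xs) - 2):
        out[i] = sum(c * xs[i + k - 2] for k, c in enumerate(SG5))
    return out


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting (fails if singular)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ar(xs: Sequence[float], p: int) -> List[float]:
    """Step ii: least squares fit of x[t] ~ sum_k phi[k] * x[t-1-k]."""
    rows = [[xs[t - 1 - k] for k in range(p)] for t in range(p, len(xs))]
    targets = [xs[t] for t in range(p, len(xs))]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    atb = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    return solve(ata, atb)


def predict(xs: Sequence[float], phi: Sequence[float], n: int) -> List[float]:
    """Step iii: iterate the AR model n steps past the end of xs."""
    hist = list(xs)
    for _ in range(n):
        hist.append(sum(c * hist[-1 - k] for k, c in enumerate(phi)))
    return hist[-n:]


def should_trigger(history: Sequence[float], p: int, w: int, n: int,
                   threshold: float) -> bool:
    """Step iv: trigger when any predicted value exceeds the threshold."""
    window = smooth(history)[-w:]
    phi = fit_ar(window, p)
    return any(x > threshold for x in predict(window, phi, n))
```

In practice a library routine such as SciPy's `savgol_filter` and a statistics package for the AR fit would likely replace the hand-written helpers; they are spelled out here only to keep the sketch self-contained.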

The third event generation scheme is the request based event generation scheme. This scheme adopts the technique described by Amazon in their article [13]. Here every request to the application is queued in an Amazon SQS request queue. Two metrics, namely Length of the Queue (LoQ) and Time in the Queue (TiQ), determine the load on the application. LoQ is the number of messages in the queue and TiQ is the time elapsed between en-queuing and de-queuing of a particular request. MOVR triggers an event whenever LoQ or TiQ exceeds the predefined optimal LoQ or TiQ.
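The LoQ and TiQ checks can be illustrated with an in-memory queue. This sketch does not use Amazon SQS; the class, its method names and the explicit timestamps are hypothetical and chosen to keep the example deterministic.

```python
from collections import deque
from typing import Deque, Tuple


class RequestQueue:
    """In-memory stand-in for the request queue, tracking LoQ and TiQ."""

    def __init__(self, max_loq: int, max_tiq: float) -> None:
        self.max_loq = max_loq      # predefined optimal queue length
        self.max_tiq = max_tiq      # predefined optimal time in queue (s)
        self._items: Deque[Tuple[str, float]] = deque()

    def enqueue(self, request: str, now: float) -> None:
        self._items.append((request, now))

    def loq_exceeded(self) -> bool:
        """LoQ check: number of queued messages above the limit."""
        return len(self._items) > self.max_loq

    def dequeue(self, now: float) -> Tuple[str, bool]:
        """Pop the oldest request and report whether its TiQ was too long."""
        request, enqueued_at = self._items.popleft()
        return request, (now - enqueued_at) > self.max_tiq
```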

The last event generation scheme used by MOVR is the schedule based event generation scheme. This scheme is easy to implement; however, it can be used only if the Randomness (R) characteristic of the application load, as described in the prediction based event generation scheme, tends to zero, so that one can manually determine the pattern of the peaks in the load on an application. In such cases, one can schedule the event at a derived point of time. MOVR will trigger events at those scheduled points automatically without analyzing any resource health data. As the scheme does not need any analysis of historical data, it is much more efficient than the other event generation schemes. However, it may fail to optimize if the schedule points are determined incorrectly.
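A schedule based trigger reduces to a lookup of schedule points that fall between two polling instants. The sketch below assumes a daily schedule expressed as time-of-day values and a poller that does not cross midnight; the function and event names are hypothetical.

```python
from datetime import time
from typing import List, Sequence, Tuple


def due_events(schedule: Sequence[Tuple[time, str]],
               previous: time, current: time) -> List[str]:
    """Return events scheduled in the interval (previous, current]."""
    return [event for at, event in schedule if previous < at <= current]


# Hypothetical daily schedule derived from a known load pattern.
daily = [(time(9, 0), "scale-up"), (time(18, 0), "scale-down")]
```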

Once an event is generated, the feedback control system feeds the event back to the Controller (as in Fig. 1) through the Unified Event API. There are three sources of events to the Controller. The first one (the arrow marked 1 in Fig. 1) generates events for the first three event generation schemes. The second one (the arrow marked 2 in Fig. 1) generates schedule based events or instant manual events. For example, during the first time deployment of an application, one can instantly trigger a startup event to change the application state from inactive to active. Similarly, a shutdown event should be triggered to stop the application and change its state from active to inactive. The third source of events falls within a different domain. For example, the application owner may have their own event generation scheme and monitoring subsystem that directly feeds events into the Controller.

2) Optimization Phase: After receiving an event, the Controller initiates the optimization phase. MOVR already maintains a list of workflows along with the mapping between events and workflows for every application (as described in the formal definition of the application). Besides, the received events contain certain information which helps to decide the optimization procedure. Based on this information, MOVR generates a workflow sequence containing a list of target workflows along with the information on synchronous and asynchronous invocation. MOVR then kicks off (starts) the target workflows according to the generated workflow sequence. Thus, the key to this phase is the implementation of the various optimization workflows.
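The dispatch step can be sketched as a function that walks a workflow sequence and runs each entry either inline or on a worker thread. The paper's implementation uses a workflow engine; the types, the boolean flag for synchronous invocation and the result ordering below are assumptions for illustration only.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Workflow = Callable[[Dict[str, str]], str]
# Each entry pairs a workflow with a flag: True means invoke synchronously.
Sequence = List[Tuple[Workflow, bool]]


def run_sequence(sequence: Sequence, event: Dict[str, str]) -> List[str]:
    """Run synchronous workflows in order; submit the others to a pool."""
    results: List[str] = []
    pending: List[Future] = []
    with ThreadPoolExecutor() as pool:
        for workflow, synchronous in sequence:
            if synchronous:
                results.append(workflow(event))
            else:
                pending.append(pool.submit(workflow, event))
        # Asynchronous results are collected last, in submission order.
        results.extend(f.result() for f in pending)
    return results


def handle(event: Dict[str, str], wm: Dict[str, Sequence]) -> List[str]:
    """Look up the workflow sequence for the event name (Wm) and run it."""
    return run_sequence(wm[event["name"]], event)
```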

Along with autoscaling, MOVR also provides a cost effective fault tolerance mechanism to the deployed application. Thus the various events generated by MOVR can roughly be classified into two categories - scaling events and recovery events. Accordingly, workflows can also be categorized as recovery workflows and scaling workflows. The recovery workflows are easy to implement compared to the scaling workflows. This is because a recovery workflow only needs to restart a service or launch a new VM to replace a failed one. These workflows provide a cost effective fault tolerance mechanism to the deployed application. Scaling workflows, however, require the implementation of much more complex actions. It is possible to scale the resources either vertically or horizontally. With vertical scaling, the capacity of a particular resource is increased or decreased, whereas with horizontal scaling, similar resources are added to or removed from a particular tier. In the vertical scaling workflow, once the Controller receives a scaling event, it retrieves the IP and AMI (Amazon Machine Image) information from the event. It then decides the instance type [14] of the VM to be launched based on the direction and amount of scaling. Once the new VM is launched, the Controller migrates the ongoing sessions to the new VM. All the EBS (Elastic Block Store) [15] volumes are also moved to the new VM. Finally the global IP (Elastic IP) is assigned to the new VM and the old VM is terminated.

During the workflow execution, the session migration action must be carried out synchronously with the instance launching action. However, the actions related to associating EBS volumes and associating the global IP can be invoked asynchronously once the new VM is launched. The terminating instance action must also be invoked synchronously with the obtaining EBS list action. Considering all these actions and their dependencies, MOVR forms a workflow sequence and invokes the actions accordingly.
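One way to derive a valid action order from such dependencies is a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (3.9 or later). The action names and the exact dependency edges are one possible reading of the text, not the authors' workflow definition; in particular, making termination wait for every other action is an assumption.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Maps each action to the actions that must finish before it starts.
# Assumed reading of the vertical scaling workflow described above.
DEPENDENCIES: Dict[str, Set[str]] = {
    "migrate_sessions": {"launch_instance"},
    "associate_ebs": {"launch_instance", "obtain_ebs_list"},
    "associate_elastic_ip": {"launch_instance"},
    "terminate_old_instance": {
        "migrate_sessions", "associate_ebs", "associate_elastic_ip",
    },
}


def vertical_scaling_order() -> List[str]:
    """Return one action order that respects every dependency."""
    return list(TopologicalSorter(DEPENDENCIES).static_order())
```

Actions with no mutual dependency, such as associating the EBS volumes and the Elastic IP, may appear in either order and could be run concurrently.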

The horizontal scaling workflow is easier to implement than the vertical scaling workflow. It requires the tier or resources to be deployed as a load balanced cluster [20]. Once the Controller receives a scaling event, it decides the amount and direction of scaling based on the information retrieved from the event. Accordingly, it either launches or terminates some of the VMs and reconfigures the corresponding cluster.
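The amount and direction of horizontal scaling can be expressed as a signed VM count. The sizing rule below (load divided by a per-VM capacity, rounded up) is an assumption for illustration; the paper does not state how the amount is computed.

```python
import math


def scaling_delta(current_vms: int, load: float, capacity_per_vm: float,
                  min_vms: int = 1) -> int:
    """Positive result: VMs to launch. Negative result: VMs to terminate."""
    if capacity_per_vm <= 0:
        raise ValueError("capacity_per_vm must be positive")
    needed = max(min_vms, math.ceil(load / capacity_per_vm))
    return needed - current_vms
```

For instance, with two VMs that each handle 100 requests per unit time, a load of 450 yields a delta of 3, while four VMs under a load of 120 yield a delta of -2.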

V. IMPLEMENTATION

The proposed MOVR architecture has been implemented with Xen as the virtualization technology, Amazon EC2 as the public cloud provider, Eucalyptus as the private cloud provider and Zabbix as the monitoring and feedback control system. We have tested the implementation with a PHP-based online collaboration system, Php-Collab. The Php-Collab application is deployed with three tiers (T), namely the web tier, the application tier and the database tier (refer to the formal definition in Section IV B). The web tier is implemented with Apache load balancer nodes as the resources (R); they balance the load across the application tier. The application tier is implemented with php-collab nodes and the database tier with MySQL cluster nodes as the resources. We define two events (E), app-tier-overloaded and app-tier-underused, for the application tier and a single event, app-tier-server-died, for application tier nodes. The events app-tier-overloaded and app-tier-underused map to the scale up and scale down workflows respectively. The event app-tier-server-died maps to the recovery workflow. The scale up and recovery workflows launch a new VM in the application tier, configure php-collab in the new VM and reconfigure the load balancer in the web tier. The scale down workflow just reconfigures the load balancer in the web tier. The Open Symphony Workflow Engine [19] has been used to implement the workflows for the various events. The state of the Php-Collab application during scale up is shown in Fig. 2.

Fig. 2. Php-collab Application during Scale Up
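The mapping from events to workflows described above can be sketched as a simple dispatch table in Python. The event names are taken from the text. The action names and the function workflow_for are hypothetical, and the actual workflows are defined in the Open Symphony Workflow Engine rather than in Python.

```python
# Actions shared by the scale up and recovery workflows (names are
# illustrative, not identifiers from the MOVR implementation).
LAUNCH_AND_JOIN = [
    "launch_vm_in_app_tier",
    "configure_php_collab",
    "reconfigure_load_balancer",
]

# Event name -> (workflow name, ordered list of actions).
EVENT_WORKFLOWS = {
    "app-tier-overloaded": ("scale-up", LAUNCH_AND_JOIN),
    "app-tier-underused": ("scale-down", ["reconfigure_load_balancer"]),
    "app-tier-server-died": ("recovery", LAUNCH_AND_JOIN),
}


def workflow_for(event_name):
    """Look up the workflow registered for an event."""
    try:
        return EVENT_WORKFLOWS[event_name]
    except KeyError:
        raise ValueError(
            f"no workflow registered for event {event_name!r}"
        ) from None
```

Keeping the mapping in a table means a new event only needs a new entry, which matches the separation between events and workflows used in the architecture.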

Fig. 3. MOVR performance graph for php-collab application

In order to evaluate the MOVR architecture with the above-mentioned php-collab application, we ran our implementation on a desktop PC with an Intel Core 2 Duo 2 GHz processor, 1 GB RAM and a standard dial-up network connection. The result of our evaluation is shown graphically in Fig. 3. We compare the time taken by our implementation with the time spent in the infrastructure provider for various events.

In Fig. 3, Provider Time is the time the Controller waits for the Provider to return. MOVR Time is the remaining time needed to complete the workflow. Since the various actions for scaling down and service recovery events are invoked asynchronously with respect to the provider actions, the Provider Time is small: just the time required to invoke the corresponding web service. The performance data shows the overhead of the automation relative to the time spent in the infrastructure layer, and it shows that this overhead is reasonable even for a large, complex application like Php-Collab.
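The relation between the two quantities plotted in Fig. 3 can be written down as a short Python sketch. The function name split_workflow_time and the movr_share ratio are assumptions introduced here for illustration. The paper reports the two times graphically and does not define such a ratio.

```python
def split_workflow_time(total_seconds, provider_seconds):
    """Split a workflow's total time into Provider Time and MOVR Time.

    Provider Time is the time spent waiting for the provider to return.
    MOVR Time is the remainder of the workflow's total time.
    """
    if provider_seconds < 0 or provider_seconds > total_seconds:
        raise ValueError("provider time must lie between 0 and the total time")
    movr_seconds = total_seconds - provider_seconds
    # Share of the total attributable to MOVR itself (0.0 for an empty run).
    movr_share = movr_seconds / total_seconds if total_seconds else 0.0
    return {
        "provider": provider_seconds,
        "movr": movr_seconds,
        "movr_share": movr_share,
    }
```

For example, with made-up inputs of 8 seconds in total and 2 seconds of provider time (not measurements from Fig. 3), the MOVR Time would be 6 seconds.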

VI. FUTURE WORK

Successful implementation of the architecture described in this paper depends on appropriate alert message correlation and on parallelism in the implementation of the various workflows. In future, further investigation will be carried out into the implementation of these workflows, in particular for vertical scaling. Also, the current implementation considers Amazon EC2 as the cloud provider. In a future implementation of the MOVR architecture, a hybrid cloud will be used with proper security considerations.

VII. CONCLUSION

We are not far away from the day when most servers will be physically installed only at data centers, and most medium-scale and startup organizations will consist only of desktops and will access or deploy their large-scale applications on virtual servers rented from those data centers. In such a virtual environment, it will become critically important to monitor and optimize the use of virtual resources and thereby reduce cost while providing scalability and availability to the deployed applications.

MOVR provides a generalized framework for managing the resources provisioned for a deployed application and automatically optimizes the cost, considering the loads and failures of the various resources. MOVR utilizes the on-demand availability of the cloud environment and the flexibility of workflow-based programming to implement its functionality.

REFERENCES

[1] A. Desai, Definitive Guide To Virtual Platform Management, Realtime Publishers, 2007.

[2] G. Sheild, Selecting the Right Virtualization Solution, Realtime Publishers, 2007.

[3] I. Foster, et al, Virtual Clusters for Grid Communities, CCGrid 2006, Singapore.

[4] A. Awadallah and M. Rosenblum, The vMatrix: A network of virtual machine monitors for dynamic content distribution, In Proceedings of the 7th International Workshop on Web Content Caching and Distribution (WCW 2002), Aug. 2002.

[5] P. Barham, et al, Xen and the Art of Virtualization, In Proceedings of the 19th ACM Symposium on Operating Systems Principles, 2003.

[6] D. Abramson, R. Buyya and J. Giddy, A computational economy for grid computing and its implementation in the Nimrod-G resource broker, Future Generation Computer Systems 18, 8 (2002), 1061-1074.

[7] S. Garfinkel, An Evaluation of Amazon's Grid Computing Services: EC2, S3 and SQS, Tech. Rep. TR-08-07, Harvard University, August 2007.

[8] J. Dean and S. Ghemawat, MapReduce: Simplified data processing on large clusters, In Communications of the ACM, 51(1):107-113, 2008.

[9] I. M. Llorente, R. S. Montero, E. Huedo and K. Leal, A Grid Infrastructure for Utility Computing, In WETICE '06, 15th IEEE International Workshops on Infrastructure for Collaborative Enterprises.

[10] R. Wolsky et al, Eucalyptus: A Technical Report on an Elastic Utility Computing Architecture Linking Your Programs to Useful Systems, Tech. Rep. 2008-10, University of California, Santa Barbara, October 2008.

[11] A. Azeez, Autoscaling Web Services on Amazon EC2, Available from: http://people.apache.org/~azeez/autoscaling-web-services-azeez.pdf

[12] Amazon EC2 Auto Scaling, Cloud Watch and Elastic Load Balancing. March 2009. Available from: http://aws.amazon.com/autoscaling/, http://aws.amazon.com/cloudwatch/, http://aws.amazon.com/elasticloadbalancing/

[13] Auto-scaling Amazon EC2 with Amazon SQS. May 2008. Available from: http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1464

[14] Amazon EC2 Instance Types. April 2008. Available from: http://aws.amazon.com/ec2/instance-types/

[15] Amazon Elastic Block Storage. April 2008. Available from: http://aws.amazon.com/ebs/

[16] Flexiscale Public Cloud Provider. Available from: http://www.flexiscale.com

[17] GoGrid Public Cloud Provider. Available from: http://www.gogrid.com

[18] K. P. Stoilova and T. A. Stoilova, Evolution of Workflow Management Systems, In ICEST, 2006.

[19] Open Symphony Workflow Management System. Available from: http://www.opensymphony.com/osworkflow/

[20] Approaches to load balanced clusters. Available from: http://onjava.com/onjava/2001/09/26/load.html

[21] R. Buyya et al, Modeling and Simulation of Scalable Cloud Computing Environments and the CloudSim Toolkit: Challenges and Opportunities, In Proceedings of the 7th High Performance Computing and Simulation (HPCS 2009) Conference, Germany, June 21-24, 2009.

[22] A. Weiss, Computing in the Clouds, In netWorker, Dec. 2007, 11(4):16-25.

[23] P. Padala et al, Adaptive control of virtualized resources in utility computing environments, In Proceedings of EuroSys, 2007.

[24] A. Savitzky and M. J. E. Golay, Smoothing and Differentiation of Data by Simplified Least Squares Procedures, In Analytical Chemistry, 36(8):1627-1639.

[25] H. C. Lim et al, Automated control in cloud computing: challenges and opportunities, In Proceedings of the 1st Workshop on Automated Control for Datacenters and Clouds, Spain, June 19, 2009.

[26] T. C. Mills, Time Series Techniques for Economists, Cambridge University Press, 1990.

