
Compass: Cost of Migration-aware Placement in Storage Systems

Akshat Verma, Upendra Sharma, Rohit Jain, Koustuv Dasgupta
IBM India Research Lab
[email protected] [email protected] [email protected] [email protected]

Abstract—We investigate methodologies for placement and migration of logical data stores in virtualized storage systems leading to optimum system configuration in a dynamic workload scenario. The aim is to optimize the tradeoff between the performance or operational cost improvement resulting from changes in store placement, and the cost imposed by the involved data migration step. We propose a unified economic utility based framework in which the tradeoff can be formulated as a utility maximization problem where the utility of a configuration is defined as the difference between the benefit of a configuration and the cost of moving to the configuration.

We present a storage management middleware framework and architecture, Compass, that allows systems designers to plug in different placement as well as migration techniques for estimation of the utilities associated with different configurations. The biggest obstacle in optimizing the placement benefit and migration cost tradeoff is the exponential number of possible configurations that one may have to evaluate. We present algorithms that explore the configuration space efficiently and compute a candidate set of configurations that optimize this cost-benefit tradeoff. Our algorithms have many desirable properties, including local optimality. Comprehensive experimental studies demonstrate the efficacy of the proposed framework and exploration algorithms, as our algorithms outperform migration cost-oblivious placement strategies by up to 40% on real OLTP traces for many settings.

I. INTRODUCTION

Storage systems management for guaranteeing application I/O performance is an important and challenging problem faced by datacenter architects today. For most workloads, disk access latency continues to be several orders of magnitude larger than the computation time. Hard as the problem was with direct-attached storage, consolidation and virtualization of storage resources in a shared Storage Area Network have made the problem of performance management even harder, since the storage resources are now shared between competing workloads from different applications. In this paper, we propose Compass: an integrated performance management architecture and methodology for storage systems that takes advantage of the virtualization layer's capability of migrating data between storage devices transparently. The proposed solution is aimed at handling performance problems in a virtualized storage environment that arise from factors such as changes in workload intensity and pattern over time. Most existing techniques for solving storage performance problems rely on selective throttling of I/O streams but do not consider changing the data placement on storage devices, since the needed data migration is considered to be disruptive. We propose a technique based on reconfiguration involving a change in the placement of data, where the reconfiguration choices are evaluated for both the potential improvement in performance and the negative impact due to data migration.

The central idea of Compass is that by employing frequent but limited local reconfigurations, the system can respond to frequent changes in the incident workload without large-scale disruptions. In contrast, a large-scale reconfiguration is very expensive in terms of performance disruption and hence can be undertaken only infrequently, leaving the system unable to take advantage of short-term workload variations. We consider an economic utility based framework, in which the utility of a candidate new configuration is evaluated in terms of (a) the cost of data migration on account of the opportunity cost of not serving some requests during migration, and (b) the expected benefit from better service quality in the new configuration. A configuration that is close to the present configuration and optimizes this trade-off is chosen as the new configuration. The fact that the evaluation is sensitive to the migration cost makes this technique effective. The proposed technique also provides a continuum in performance management between techniques that rely on workload throttling [13], [4], [18] and techniques that rely only on optimizing placement [11], [2]. Unlike the former, it can solve problems caused by load imbalance to some extent, and by considering the cost of migration it can avoid taking placement decisions that worsen the situation.

The motivation for frequent reconfiguration comes from a number of factors. In a storage service provider setting where a number of workloads are consolidated on a few systems, system reconfiguration could be performed to shift resources between negatively correlated workloads to satisfy the peak requirements of individual workloads, while provisioning resources only for average requirements. The negative correlation could arise from time-zone differences, from the diurnal nature of different workloads, or just from the changing popularity of different data items in a large dataset. A number of studies on I/O workloads have also suggested the self-similar nature of I/O traffic [8], [9]. The implication is that the traffic is bursty over a wide range of time scales. Again, migration cost aware system reconfiguration employed frequently can provide much better performance than coarse-grained reconfigurations carried out after long intervals. Used in conjunction with online migration execution techniques such as [5] that minimize performance impact, Compass can be a very effective technique.

A. Framework and Contribution

We consider the problem of dynamic resource reconfiguration in storage systems that provide statistical QoS guarantees. QoS guarantees are provided on a request stream, defined as an aggregation of I/O requests from an application. The logical aggregation of all data accessed by the requests of a stream is referred to as a store, which is backed by physical space on storage devices. Typically, QoS guarantees, more formally referred to as service level agreements (SLAs), specify an upper bound on the stream request rate and also an obligation on the system to serve a minimum fraction of these requests within a specified latency bound. In this setup, each request is associated with a reward value that the system accrues when the request is serviced within its latency bound. Reconfiguration, accomplished by migrating stores between storage devices, imposes a cost on the system since it consumes resources that could otherwise have been used for servicing application requests.

We provide an economic utility based framework using which both the expected benefits of the new configuration and the migration cost can be computed. This provides a basis for evaluating the configuration choices taking into account both the expected benefit and the cost of migration. This is in contrast to store placement techniques that only consider the expected benefit of a configuration, regardless of the cost of migration or the expected period during which the workload remains stable. We consider two commonly used placement techniques and show that these techniques perform poorly when used for reconfiguring the system in response to small and short-lived variations in workloads. We propose new algorithms that efficiently explore the configuration space in the neighborhood of configurations given by existing placement techniques to find candidate configurations that optimize the cost-benefit tradeoff. We believe Compass can be a powerful technique for performance management of storage systems. In our experiments, Compass shows significant improvement (up to 40% for a large number of settings) over the approach of using existing placement techniques for reconfigurations.

B. Related Work

Much previous work in the area of storage systems management has focused on automated mechanisms for offline near-optimal storage system design, but only a few of these address the issue of handling dynamic workload changes. Anderson et al. proposed Ergastulum [2] for the design of storage systems using heuristics based on best-fit bin packing with randomization to search the large design space. They also proposed Hippodrome [3] for automated storage system configuration, which uses an iterative approach to determine the storage configuration for a given set of workloads. However, the change in configuration between the iterations is carried out without regard to the adverse impact of migration on system performance.

Other approaches such as Facade [13] and Sleds [4] rely exclusively on throttling workloads to enforce fairness in resource arbitration. Chameleon [18] improves on these approaches by automatically inferring and refining workload and device models at run-time. These techniques are designed to work in scenarios where the total load on the system exceeds the system capacity, but they do not address the problems caused by load imbalance resulting from inefficient placement. Thus, these approaches can be used in conjunction with, but not in lieu of, migration-based approaches.

Scheuermann et al. [17] propose a disk cooling mechanism that uses frequent migrations to ensure that the heat (request rate) on all disks is equal while minimizing the total amount of data being moved. Their approach is most similar to our work, as they take the cost of migration into account while searching for new placements by considering the size of the data being moved. However, their approach is tied strictly to one benefit function (heat, i.e., load balancing) and cannot be applied to other benefit functions like throughput maximization or response time minimization. Further, since the migration cost is integrated with the placement algorithm, the technique cannot be used in conjunction with other placement algorithms. These are exactly the problems that we solve in this work. Our framework is designed to work with any choice of placement algorithm and migration methodology.

The application of reward maximization as a tool in a service provider setting for resource allocation problems has been used by several researchers. Most of these address allocation of resources for various request classes with QoS guarantees so that resources are optimally utilized, thereby maximizing the profit of the providers. Among these, Liu et al. [12] proposed a multi-class queuing network model for the resource allocation problem in an offline setting. Verma et al. [21] address the problem of admission control for profit maximization of networked service providers. Dasgupta et al. [5] use a reward maximization approach for admission control and scheduling of migration requests that compete with application workload. However, none of these formulations are able to capture the tradeoff between the benefit of a new configuration and the cost of migrating to it.

II. COMPASS: MODEL AND ARCHITECTURE

The placement of stores on the disks of a storage subsystem determines the throughput as well as the response time experienced by the application workload. An optimal placement/allocation scheme allocates the stores to the different disks in a manner such that some suitably defined benefit function (e.g., response time averaged over all requests, disk throughput, number of requests that are served within a deadline) is maximized. If the objective of the placement is to minimize a particular metric (e.g., response time), the benefit may be defined as the inverse of the metric to transform the minimization version of the placement problem to a (benefit) maximization problem. Hence, the placement problem can always be expressed as a maximization problem with respect to some suitably defined Benefit function. The Benefit function can similarly be defined for the problem of placing stores on RAID arrays. Although we talk about the placement problem only in the context of disks in this work, our central idea is applicable to RAID arrays as well.

In real deployment, the optimal placement of stores on disks (one that maximizes benefit) may change with change in workload or storage resources (e.g., addition/deletion of disks), and the stores may have to be periodically reallocated to different disks. In order to achieve the new optimal configuration, some stores are migrated from one disk to another. This migration, which may be frequent, imposes additional workload on the storage resources and may lead to degraded performance (and consequently revenue loss) for the application workload. Further, since any particular placement/allocation of stores to disks remains optimal only for the limited period when the workload remains stable, the additional benefit accrued due to the new configuration within that period should outweigh the revenue loss due to the degraded performance of the application workload during the migration phase for the migration to be useful.

[Fig. 1. Compass: An architecture for Performance Management. The figure shows the Compass virtualization management middleware (Monitoring Engine, Traffic Predictor, Placement Generator, Placement Explorer, Storage System Modeler, Config Utility Calculator, Migration Cost Calculator, Placement Orchestrator, and Migration Plan Manager, together with the current configuration, SLAs, traffic info, workload performance numbers, and new configuration) layered above a Virtualization Engine that mediates application I/O to the storage systems.]

Hence, the best placement at any given moment is not the placement that maximizes the benefit but one that optimizes a net utility function capturing this cost-benefit tradeoff, i.e., a placement that has both a high benefit and a low cost of migration from the previous placement. We now formalize the above notion mathematically. Given an initial allocation A_I of N stores to M disks, a benefit function B(A_i) (denoting the benefit accrued per unit time) defined on all possible store allocations A_i (1 ≤ i ≤ M^N), a revenue loss (or migration cost) function C(A_i, A_f), and a period T for which the new allocation would hold, the profit-maximizing allocation is given by

    arg max_{A_i, 1 ≤ i ≤ M^N}   T · (B(A_i) − B(A_I)) − C(A_I, A_i)        (1)

i.e., the optimal allocation maximizes the benefit obtained by the new configuration while minimizing the revenue loss due to the impact of migration on the application workload. The goal of this work is to explore such cost-benefit optimizing placements.
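The selection in Eqn. 1 can be made concrete in a few lines. The sketch below is illustrative only, not the paper's implementation: `benefit` and `migration_cost` are hypothetical stand-ins for the utility and cost estimators, and allocations are treated as opaque values.

```python
def best_allocation(candidates, current, benefit, migration_cost, T):
    """Return the allocation maximizing T*(B(A) - B(A_I)) - C(A_I, A) per Eqn. 1.

    The current allocation always has net utility 0 (no benefit change, no
    migration), so it is kept unless some candidate strictly improves on it.
    """
    best, best_utility = current, 0.0
    base = benefit(current)
    for cand in candidates:
        utility = T * (benefit(cand) - base) - migration_cost(current, cand)
        if utility > best_utility:
            best, best_utility = cand, utility
    return best
```

Note that a longer stable period T tilts the choice toward high-benefit but costly reconfigurations, while a short T favors staying put.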

A. Framework for evaluating the tradeoff between Benefit and Cost of Configuration

We now present the Compass performance management middleware framework that solves the optimization problem posed in Eqn. 1.

The key question while designing an architecture to find the cost-benefit tradeoff optimizing placement is whether to enhance the placement method to become migration cost-aware, or to separate the migration cost from the benefit of placement and evaluate the tradeoff in a separate higher-level orchestration component. The advantage of unifying migration with the placement methodology is that we directly get the cost-benefit optimizing placement. However, one would have to redesign the placement strategies so that they incorporate the migration cost, which may or may not be possible for all placement algorithms. Further, since there is no placement algorithm that works best in all workload scenarios, one would have to individually redesign the placement algorithm most suitable for one's setting.

On the other hand, the design choice of separating migration cost from the benefit of placement makes the framework easily extensible, as new placement algorithms or migration methods can be plugged in and used directly. The obstacle in this approach, however, is finding candidate cost-benefit optimizing placements, as placement algorithms only provide a single optimal configuration that maximizes benefit oblivious of cost. The only other placement available to us is the previous placement, which has zero migration cost. Hence, in this framework, one has to explore the configuration space using these two extreme points (one with zero cost of migration and the other with the maximum benefit).

In Compass, we have decided on the latter choice for the plug-and-play capability it offers, allowing us to use existing placement methodologies directly. In order to generate candidate cost-benefit optimizing placements, we present algorithms that efficiently explore the configuration space using the two extreme points. The main components of the Compass architecture are shown in Fig. 1. The architecture assumes that there is a Virtualization Engine between the consumers of storage and the physical devices providing the storage. The Virtualization Engine maps the logical view of storage as seen by consumers to the storage provided by physical devices. This indirection affords the Virtualization Engine the capability to move data between storage devices while keeping the logical view of data consistent.

A Workload Monitoring engine monitors the current workload on the storage subsystem as well as the performance seen by the workload in terms of response time and throughput. The performance measurements are passed to the main control module, which we call the Placement Orchestrator. At regular intervals, or when SLA violations reach a certain threshold, the Orchestrator triggers a configuration evaluation. As a first step towards exploring a new configuration, the Orchestrator invokes the Traffic Predictor for the predicted workload in the short-term future. The Traffic Predictor uses time-series analysis based short-term prediction [10] for estimating the request arrival rate for each stream. To compute other workload parameters, the Traffic Predictor uses a simple history-based model where the weight of a measurement decays exponentially with time.
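As a rough sketch of such an exponentially decaying history model (the text does not give a formula, so the decay factor `alpha` and the recurrence below are assumptions):

```python
def decayed_estimate(measurements, alpha=0.5):
    """Exponentially weighted average; `measurements` are ordered oldest first.

    Implements the recurrence est = alpha*new + (1 - alpha)*est, so a
    measurement taken n steps ago carries weight proportional to (1 - alpha)**n.
    """
    est = None
    for m in measurements:
        est = m if est is None else alpha * m + (1 - alpha) * est
    return est
```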

The Orchestrator then obtains the benefit-maximizing placement for this new predicted workload from the Placement Generator and invokes the Placement Explorer component for any intermediate placements that may optimize the cost-benefit tradeoff. The Placement Explorer uses the current benefit-maximizing and cost-minimizing placements (the previous placement) as extreme points to generate intermediate placements. At this juncture, the Orchestrator has a list of eligible placements and uses the Config Utility Calculator and Migration Cost Calculator to compute the benefit of each placement and the cost of moving to the new placement, respectively. The Config Utility Calculator and the Migration Cost Calculator, in turn, use a Storage System Modeler to aid them in their calculations by efficiently providing an estimate of the performance numbers for a given workload and placement. The Orchestrator then selects the placement that best optimizes the cost-benefit tradeoff (Eqn. 1). Once a new configuration is selected, a Migration Plan Manager creates and executes a migration plan to migrate the stores according to some migration methodology. The Virtualization Engine does the actual task of migration in accordance with the migration plan, thus completing the loop.

Note that both the Config Utility Calculator and Migration Cost Calculator rely on the existence of either an analytical model of the storage devices or a simulator to determine the expected utility. Admittedly, this is a challenging task, but a number of models have been developed by researchers in the past with some degree of success. Analytical models, though hard to construct, have been shown to successfully model disks and disk arrays with controllers [19]. Other models, such as table-based lookups with interpolation between table entry configurations [1], that are less accurate but more easily adaptable to device changes have also been proposed and can be used.

B. Store Placement for Benefit Maximization

The benefit of a placement A_h, in a service provider setting, denotes the revenue earned by the provider for serving requests in the placement A_h, where the revenue earned from a request (or set of requests) is calculated based on pre-specified Service Level Agreements (SLAs). Common SLAs tend to reward providers for maximizing the number of requests served and minimizing the response time of the served requests. In a single-enterprise setting, the aim of the storage administrator may be to maximize total disk throughput and/or minimize response time without any explicit agreements in place. However, both these settings employ a common notion of optimizing certain objective functions, and we capture these objectives with the aid of a general Benefit function.

The Compass methodology does not depend on the actual benefit function used, which is just an input in finding the placement that optimizes the cost-benefit tradeoff. The choice of Benefit function is dictated rather by the choice of placement algorithms. In this work, we consider the commonly used placement strategies proposed in [11], [2]; accordingly, for an allocation A_h that places N stores on M disks, under the assumption that all requests have the same reward, the placement objectives are captured by the following benefit function (Eqn. 2).

    B(A_h) = λ_{A_h} · L / δ(A_h) = Σ_{j=1}^{N} λ_j · L / δ_j        (2)

where λ_{A_h} is the number of requests served, L is a baseline response time, δ(A_h) is the average response time in the allocation A_h, λ_j is the request rate of stream j, and δ_j is the average response time of the requests of stream j in the allocation A_h. The parameter L is a constant and may denote a baseline response time, which could be the target response time for the specific application. In a service provider setting with differentiated rewards r_j for each stream j, based on the negotiated SLA, the benefit of a served request depends on the reward of the request as well. Hence, we extend the benefit function to a multi-reward setting in the following manner.

    B(A_h) = Σ_{j=1}^{N} λ_j · r_j · L / δ_j        (3)

This benefit function captures both facets of most service level agreements: rewarding placements for (i) maximizing throughput and (ii) minimizing response time.
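Under an assumed representation of streams as (rate, reward, response time) tuples, Eqn. 3 translates directly into code; setting every reward r_j to 1 recovers the single-reward form of Eqn. 2.

```python
def benefit(streams, L):
    """B(A_h) per Eqn. 3: the sum over streams of lambda_j * r_j * L / delta_j.

    streams: iterable of (rate, reward, avg_response_time) tuples, where
    rate is the request rate lambda_j, reward is r_j, and
    avg_response_time is delta_j under the allocation being evaluated.
    """
    return sum(rate * reward * L / delta for rate, reward, delta in streams)
```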

We now discuss some of the common placement strategies used in practice and examine the intuition behind them. The placement of stores on a storage subsystem to optimize some specific objective function has been investigated by many researchers. Lee et al. [11] place files on parallel disks with the aim of minimizing the average response time while ensuring that the load is balanced across the disks. We refer to this as the LSV algorithm. Ergastulum strives to find a storage system with the minimum cost and then balances the load across the disks for that storage system. Garg et al. [7] allocate streams to servers in a web farm in order to minimize the average response time of all the stores. Verma et al. [20] solve the same problem for store allocation on parallel disks. Our notion of the Benefit of a store allocation is able to capture all these diverse objective functions. The benefit of an allocation increases with increase in throughput and decreases with increase in response time; hence, the above benefit metric (Eqn. 3) is rich enough to capture all these settings. To investigate the cost-benefit tradeoff, we will revisit some placement algorithms, namely LSV and Ergastulum. We will use insights from these algorithms in designing our intermediate selection methodology.

III. EXPLORING THE CONFIGURATION SPACE

The key insight behind the framework presented above is that there might be certain configurations that do not lead to the highest benefit but achieve a better cost-benefit tradeoff, since the migration load needed to reach the particular configuration may be very low. We now describe methods to search for configurations that optimize this tradeoff.

An obvious way to find such configurations is to perform a local random search for allocations near the benefit-maximizing configuration. However, the number of such configurations may be large, and exploring all of them may be prohibitively expensive. Hence, instead of a local random search, we conduct a more informed search using two extreme points: the previous configuration and the new benefit-maximizing configuration. Note that the former represents the cost minimizing placement while the latter represents the benefit maximizing placement. We now describe the insight behind our informed search method that uses the two extreme points.

The reason that the configuration with the highest benefit may not optimize the cost-benefit tradeoff is that the placement methods strive to maximize the benefit oblivious of the earlier configuration and the resultant migration cost. To take an example, LSV sorts the streams based on the expected service time (E(S)) of the requests of a stream and assigns them to the disks in this order such that all disks have equal load. Hence, if the load for a particular stream changes, a disk may no longer have balanced load, and that imbalance may have to be distributed across the disks. Since the disks should maintain a sorted order for E(S), moving to the new allocation would require moving most of the imbalance through all the disks (Fig. 2).

[Fig. 2. Chained Migration in Ordered Placements: reduced load on one disk leading to a chain of migrations involving all disks. The diagram shows the initial and final balanced LSV allocations across Disk 1, Disk 2, ..., Disk M, with data moved along flows Flow 1, Flow 2, ..., Flow M−1, and Flows 1 and 2 short-circuited.]

One may observe in the above example (Fig. 2) that the total migration load could be O(MδL) for an imbalance of δL. To verify, observe that each disk D_k (k > 1) receives stores with load of δL/M + δL(M−k)/M and transfers stores with load of δL(M−k)/M. Summing up over all the disks, the total data transferred equals δL(M−1)/2. We capture this notion of a chain of migrations that arises as a result of load variations by the term chained migration: a migration that requires some stores S_{i,j} placed on disk D_i to be moved to D_j, and some stores S_{j,k} placed on D_j to be moved to D_k. However, if the stores S_{i,j} and S_{j,k} have similar statistical properties, then a transfer of S_{i,j} directly to D_k may lead to a configuration with approximately the same benefit but at a much reduced migration cost. Our basic strategy in selecting candidate placements is to explore placements that are intermediate between the previous and the new benefit-maximizing placement, but do not involve chained migrations from the previous placement.
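The chained-migration total derived above can be checked numerically; the helper below is purely illustrative, summing each disk's forwarded load δL(M−k)/M over the M disks.

```python
def chained_migration_load(M, delta_L):
    """Total data moved in a chained migration over M disks.

    Disk k (1 <= k <= M) forwards load delta_L * (M - k) / M, and the sum
    over all disks is delta_L * (M - 1) / 2, matching the text.
    """
    return sum(delta_L * (M - k) / M for k in range(1, M + 1))
```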

A. Short-circuiting Chained Migrations

The core idea we use for generating intermediate placements that may optimize the cost-benefit tradeoff is what we call flow short-circuiting. For a set of stores $S_{i,j}$ and $S_{j,k}$ involved in a chained migration, flow short-circuiting replaces the two flows, from disk $D_i$ to $D_j$ and from $D_j$ to $D_k$, with a single flow from $D_i$ to $D_k$. In this process, we also need to ensure that the load remains balanced after the replacement. We name this process flow short-circuiting since the load that was initially flowing from $D_i$ to $D_k$ via $D_j$ now flows directly from $D_i$ to $D_k$. However, in order to preserve the load-balanced condition, if the total loads of $S_{i,j}$ and $S_{j,k}$ are not the same, we may not be able to short-circuit the complete flow. In such a scenario, we short-circuit a flow of load equal to the minimum of the loads of $S_{i,j}$ and $S_{j,k}$. In the example of Fig. 2, the flows Flow 1 and Flow 2 are short-circuited, resulting in a reduction in the total migration load equal to the load generated by Flow 2, the smaller of the two flows. The same process can be repeated until no chains are left. The details of the short-circuiting algorithm are presented in Fig. 3. In order to preserve the load-balanced condition, we compute the total loads of the $S_{i,j}$ and $S_{j,k}$ stores and short-circuit a load equal to the minimum of the two, thus ensuring that the loads of all three disks $D_i$, $D_j$ and $D_k$ are preserved across a short-circuit.

function shortCircuitFlow($S_{i,j}$, $S_{j,k}$)
    Compute the net load inflow IN from $D_i$ to $D_j$
    Compute the net load outflow OUT from $D_j$ to $D_k$
    If (IN < OUT)
        In the outflow of $D_i$, change the target disk of stores $S_{i,j}$ from $D_j$ to $D_k$.
        Identify a subset $S'_{j,k}$ of $S_{j,k}$ such that the load of $S'_{j,k}$ equals the load of $S_{i,j}$.
        In the outflow of $D_j$, remove the stores $S'_{j,k}$.
    Else
        In the outflow of $D_j$, remove the stores $S_{j,k}$.
        Identify a subset $S'_{i,j}$ of $S_{i,j}$ such that the load of $S'_{i,j}$ equals the load of $S_{j,k}$.
        In the outflow of $D_i$, change the target disk of stores $S'_{i,j}$ from $D_j$ to $D_k$.
end shortCircuitFlow

Fig. 3. Flow Short-Circuiting Algorithm for stores $S_{i,j}$ and $S_{j,k}$
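A minimal Python sketch of the short-circuiting step (our own illustration; it tracks only aggregate loads per flow, eliding the selection of a concrete store subset of matching load):

```python
def short_circuit_flow(flows, i, j, k):
    """Short-circuit the chained migration D_i -> D_j -> D_k:
    move min(IN, OUT) units of load to a direct D_i -> D_k flow,
    leaving the net load of all three disks unchanged."""
    IN, OUT = flows[(i, j)], flows[(j, k)]
    moved = min(IN, OUT)                    # load that bypasses D_j
    flows[(i, j)] = IN - moved
    flows[(j, k)] = OUT - moved
    flows[(i, k)] = flows.get((i, k), 0) + moved
    return flows

# Chain D1 -> D2 -> D3 with 10 units entering D2 and 6 leaving it:
f = short_circuit_flow({(1, 2): 10, (2, 3): 6}, 1, 2, 3)
# Net load of every disk is unchanged; 6 units now skip D2 entirely.
assert f == {(1, 2): 4, (2, 3): 0, (1, 3): 6}
```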

Our intermediate-selection methodology essentially consists of (i) computing the migrations required to move from the previous to the new benefit-maximizing placement, (ii) identifying chains among them, and (iii) short-circuiting the chains one at a time. We reduce the migration cost in each step by reducing the number of disks that exchange stores (i.e., both receive and transfer stores) and are part of some chained migration. Given an initial migration flow that takes us from an initial to a final allocation, every short-circuit leads to a migration flow that is one step farther from the final placement and one step closer to the initial allocation. Moreover, every short-circuit reduces the migration cost by removing one chained migration. However, the number of such chained migrations could be large (up to a maximum of ${}^{N}C_{3}$), and the order in which we short-circuit these chained migrations determines the intermediate states that are selected. We next enhance this basic methodology: we specify the order in which these chained migrations are short-circuited and show certain desirable properties of the proposed order for some common placement schemes.

B. Chained Migration Ordering for Sorting-Based Placement Algorithms

We now present a method to explore the placement space and find candidate allocations that may optimize the benefit and migration cost tradeoff for placement schemes that sort the streams based on some stream parameter. For ease of elucidation, we consider only the LSV allocation, while observing that the same scheme is applicable to other placement schemes that sort streams based on any other stream parameter. The LSV scheme sorts streams based on the $E(S)$ of the requests of the stream and assigns them to the disks in this order such that all disks have equal load. Hence, if the load for a particular stream changes, the disk on which it is placed may no longer have balanced load, and this imbalance may flow across all the disks.

One may note that the LSV scheme derives its performance by isolating streams with large $E(S)$ from streams with small $E(S)$. Hence, while short-circuiting flows, we try to preserve the sorted order of streams as much as possible, thus isolating the streams with large request

function exploreSorted
    for i = 0 to M − 1
        find the chained migration ($D_i$, $D_j$, $D_k$) such that $E_{S_{i,j}}(S) - E_{S_{j,k}}(S)$ is minimized
        shortCircuitFlow($S_{i,j}$, $S_{j,k}$)
    end for
end exploreSorted

Fig. 4. Candidate Allocation Finding Algorithm for Sorted Order Placements

sizes from those with small request sizes. Hence, in the example of Fig. 2 with M − 2 chained migrations (and, analogously, M − 2 disks that both receive and transfer stores), we select a disk $D_j$ such that the $E(S)$ of the stores $S_{j-1,j}$ transferred to the disk and the $E(S)$ of the stores $S_{j,j+1}$ transferred out of the disk are most similar. Hence, after short-circuiting the chained migration, the intermediate placement obtained has the least deviation per unit load short-circuited from the final placement obtained by LSV. In this sense, the chained migration selection methodology is locally optimal. We describe the details of the selection methodology in Fig. 4.

The algorithm exploreSorted computes a set of up to M − 1 intermediates, where each successive intermediate is one more step away from the final placement returned by the LSV algorithm. Hence, the set of intermediates provides a sequence of steps that takes us from the new allocation to the old allocation, where each step reduces the cost of migration. The number of intermediates selected is bounded by min{M, N}, and hence the method efficiently provides a small set of intermediates, out of the exponentially many possible ones, one of which may optimize the cost-benefit tradeoff. Moreover, we have the following local optimality result for the intermediates returned by the above algorithm.

Lemma 1: In every iteration of exploreSorted that goes from k disk exchanges to k − 1 disk exchanges, the new allocation minimizes the benefit lost per unit load transfer saved amongst all allocations that have k − 1 disk exchanges.

C. Intermediate Placement Exploration for General Placement Schemes

We now detail a method to explore the allocation space for intermediate placements when the initial and final placements do not have any sorted total order on the streams assigned to the various disks. Observe that in such a placement scheme, where there is no sorted order between disks, as the heat on one disk increases, the new placement may have disks exchanging streams with more than one disk, and the disks exchanging streams may be arbitrarily ordered.

A direct implication of such a scenario is that the number of chained migrations is no longer bounded by M, the number of disks. Instead, in a worst-case scenario, one can verify that the number of chained migrations may be as high as $O(M^3)$. Further, the order in which we short-circuit chained migrations is not clear, as there is no ordering that the placement scheme follows. However, one may note that the response time of a disk depends on the aggregated properties of the streams placed on that disk. Hence, if we can ensure that the aggregated stream parameters of the stores placed on a disk before and after the short-circuit are similar, then the intermediate placement obtained by the short-circuit will have a benefit similar to that of the final placement given by the strategy.

function exploreAll
    While there exists at least one chained migration
        find the chained migration ($D_i$, $D_j$, $D_k$) such that $E_{S_{i,j}}(S) - E_{S_{j,k}}(S)$ is minimized
        shortCircuitFlow($S_{i,j}$, $S_{j,k}$)
    end while
end exploreAll

Fig. 5. Candidate Allocation Finding Algorithm for General Placement Schemes

We design the intermediate-exploration algorithm exploreAll (Fig. 5) based on these insights. The algorithm takes as input a set of flows that migrate from the original placement to a final placement. It short-circuits the chained migration that leads to the least variation from the current workload on each disk, and terminates when it cannot find any chained migration to short-circuit. The following lemma bounds the number of iterations the algorithm may execute.

Lemma 2: The number of intermediates selected by the exploreAll algorithm from an initial allocation $A_i$ to a final allocation $A_f$ is bounded by $\min\{N_s, M^2\}$, where $N_s$ is the total number of streams participating in the flow $M_{i-f}$.
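The exploreAll loop can be sketched in Python. This is our own simplified rendering under stated assumptions: each flow carries only an aggregate load and a single mean service time E(S), store-level bookkeeping is elided, and a chain that would return load to its source disk is simply dropped (the load stays put).

```python
def explore_all(flows):
    """Greedy chain short-circuiting (exploreAll sketch).
    `flows` is a list of dicts {'src', 'dst', 'es', 'load'}.
    Returns the intermediate flow-sets, closest-to-final first."""
    intermediates = []
    while True:
        # A chained migration is a flow into a disk followed by one out.
        chains = [(a, b) for a in flows for b in flows
                  if a['dst'] == b['src'] and a['load'] > 0 and b['load'] > 0]
        if not chains:
            return intermediates
        # Short-circuit the chain whose two flows have the closest E(S).
        a, b = min(chains, key=lambda c: abs(c[0]['es'] - c[1]['es']))
        moved = min(a['load'], b['load'])
        a['load'] -= moved
        b['load'] -= moved
        if a['src'] != b['dst']:          # drop self-loops: load stays put
            flows.append({'src': a['src'], 'dst': b['dst'],
                          'es': a['es'], 'load': moved})
        intermediates.append([dict(f) for f in flows if f['load'] > 0])

# Chain D1 -> D2 -> D3: 6 of the 10 units inbound to D2 bypass it.
steps = explore_all([{'src': 1, 'dst': 2, 'es': 1.0, 'load': 10},
                     {'src': 2, 'dst': 3, 'es': 1.2, 'load': 6}])
assert len(steps) == 1
assert {(f['src'], f['dst'], f['load']) for f in steps[-1]} == {(1, 2, 4), (1, 3, 6)}
```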

IV. ESTIMATING THE COST (C) OF MIGRATION

We have looked at placements that reduce the cost of migration by reducing the migration load, thus implicitly assuming that the cost of migration is directly related to the migration load. We now formalize the notion of the Cost (C) of a migration more precisely and show how to estimate this cost for some popular migration methodologies.

A. Cost of Migration by Whole Store methodology

This commonly used migration methodology tries to complete the migration as quickly as possible, and is usually referred to as Whole Store migration. Whole Store migration is not rate-controlled, and almost all application requests will miss their QoS requirements while migration is in progress. Hence, the migration of a store of size $B_m$ on a disk that can support a migration throughput of $C_m$ would reject all requests for $B_m/C_m$ time. Let the migration from a configuration $A_0$ to $A_1$ be represented as a set $M_{0-1} = \{(S_j, D_k, D_l), \ldots\}$, where each entry $(S_j, D_k, D_l)$ in $M_{0-1}$ represents the migration of a set of stores $S_j$ from disk $D_k$ to $D_l$. We have the following lemma for the revenue loss due to migration for the Whole Store migration methodology.

Lemma 3: The revenue loss due to migration from $A_0$ to $A_1$ by the Whole Store migration methodology is given by

$$\sum_{(S_j, D_k, D_l) \in M_{0-1}} \left( \frac{B_j}{C^r_k}\, \lambda_k \int_{R^k_o}^{\infty} c^k_r\, p^k_r\, r\, dr \;+\; \frac{B_j}{C^w_l}\, \lambda_l \int_{R^l_o}^{\infty} c^l_r\, p^l_r\, r\, dr \right) \qquad (4)$$

where, for any given disk $D_k$, $C^r_k$ and $C^w_k$ are the maximum read throughput and write throughput, respectively, supported by $D_k$, $c^k_r$ is the expected capacity used by requests with reward $r$, $p^k_r$ is the probability that a request has reward $r$, $\lambda_k$ is the expected number of requests present at any given time, and $R^k_o$ is such that $\lambda_k \int_{R^k_o}^{\infty} c^k_r\, p^k_r\, dr = C_k$, i.e., $R^k_o$ is the reward of the lowest-priority request that would have been served if there were no migration.
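One summand of Eq. (4) can be illustrated with a discrete reward distribution. The sketch below is our own illustration, not the paper's implementation: symbol names follow the lemma, the numbers are arbitrary, and the integral is replaced by a sum over discrete reward levels.

```python
def whole_store_loss(B_j, C_read, lam, rewards, probs, caps, R_o):
    """Revenue lost on the source disk during Whole Store migration:
    the disk is unavailable for B_j / C_read time, during which the
    reward-weighted load of the requests that would have been served
    (reward >= R_o) is lost.  Discrete analogue of the first term of
    Eq. (4); the target-disk (write) term is symmetric."""
    outage = B_j / C_read
    lost_rate = sum(c * p * r for r, p, c in zip(rewards, probs, caps)
                    if r >= R_o)
    return outage * lam * lost_rate

# 400 MB store, 40 MB/s read throughput -> 10 s outage.
loss = whole_store_loss(B_j=400, C_read=40, lam=5,
                        rewards=[1, 2, 4], probs=[0.5, 0.3, 0.2],
                        caps=[1, 1, 1], R_o=2)
assert abs(loss - 70.0) < 1e-9
```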

B. Cost of Migration by QoSMig methodology

Dasgupta et al. [5] propose an adaptive rate-controlled migration methodology, QoSMig, that optimizes the tradeoff between the migration utility and the impact on client traffic. QoSMig uses the long-term forecast to compute a migration rate that is sufficient to complete the migration within the deadline. Further, it varies the rate of migration adaptively as the client traffic changes: when the client traffic has a large number of high priority requests, migration is throttled below the baseline migration rate, and it is increased when the client traffic has few high priority requests.

The central idea behind the QoSMig methodology is to assign a reward $R_m$ to migration requests that ensures that migration completes within the deadline, while allowing the rate of migration to be increased or decreased as the arrival rate of high-reward requests decreases or increases. For the migration of a store of size $B_m$ within a deadline $T$ on a disk with capacity $C$, $R_m$ is computed as

$$\lambda \int_{R_m}^{\infty} c_r\, p_r\, dr \le C - C_m, \qquad (5)$$

where $C_m = B_m/T$, $c_r$ is the expected capacity used by requests with reward $r$, and $p_r$ is the probability that a request has reward $r$. For further details of the methodology and its correctness, we refer the reader to [5]. We have the following lemma for the revenue loss due to migration when the QoSMig methodology is used for migrating stores.

Lemma 4: The revenue loss due to migration $M_{0-1}$ from an allocation $A_0$ to $A_1$ by the QoSMig migration methodology with a deadline $T$ is given by

$$\sum_{(S_j, D_k, D_l) \in M_{0-1}} \left( T\, \lambda_k \int_{0}^{R^k_m} c^k_r\, p^k_r\, r\, dr \;+\; T\, \lambda_l \int_{0}^{R^l_m} c^l_r\, p^l_r\, r\, dr \right) \qquad (6)$$

where, for any given disk $D_k$, $c^k_r$ is the expected capacity used by requests with reward $r$ on $D_k$, $\lambda_k$ is the expected number of requests present at any given time, $R^k_m$ is the reward assigned to migration requests by the QoSMig methodology, and $p^k_r$ is the probability that a request of a stream placed on the disk has reward $r$.
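Equations (5) and (6) can be illustrated together with a discrete reward distribution. The sketch below is our own illustration, not the QoSMig implementation: all names and numbers are ours, and the continuous integrals are replaced by sums over discrete reward levels.

```python
def qosmig_reward(B_m, T, C, lam, rewards, probs, caps):
    """Discrete analogue of Eq. (5): the smallest reward level R_m such
    that client requests with reward >= R_m still leave capacity for
    the baseline migration rate C_m = B_m / T.  Returns None if even
    the top reward level cannot accommodate the deadline."""
    C_m = B_m / T
    for R_m in sorted(rewards):
        demand = lam * sum(c * p for r, p, c in zip(rewards, probs, caps)
                           if r >= R_m)
        if demand <= C - C_m:
            return R_m
    return None

def qosmig_loss(T, lam, rewards, probs, caps, R_m):
    """Discrete analogue of one summand of Eq. (6): reward-weighted
    load of the requests (reward below R_m) that yield to migration
    over the deadline T."""
    return T * lam * sum(c * p * r
                         for r, p, c in zip(rewards, probs, caps)
                         if r < R_m)

# Illustrative numbers: 3 reward levels, unit capacity per request.
rewards, probs, caps = [1, 2, 4], [0.5, 0.3, 0.2], [1, 1, 1]
R_m = qosmig_reward(B_m=100, T=10, C=20, lam=20,
                    rewards=rewards, probs=probs, caps=caps)
assert R_m == 2                      # reward-1 traffic yields to migration
assert abs(qosmig_loss(10, 20, rewards, probs, caps, R_m) - 100.0) < 1e-9
```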

V. SIMULATION STUDY

We have conducted a large number of experiments to assess the usefulness of a framework that optimizes the tradeoff between the benefit of a new placement and the cost of migration incurred in reaching the new placement. Our experiments also evaluate the effectiveness of our intermediate selection methodologies in exploring the configuration space to reach an allocation that optimizes this tradeoff.

We compare our intermediate selection methodology against methodologies that do not take either the migration cost or the benefit of a new placement into account. We provide a brief description of these methodologies below.

• Static Placement: In this scheme, the cost of migration is assumed to be large. Hence, at the start of the experiment, an optimal placement is computed based on the forecasted traffic, and the same placement continues through the run of the experiment.

• Benefit-based Placement: This is the scheme commonly employed in practice, where a new benefit-maximizing placement is computed periodically as the (forecasted) traffic changes. If the benefit of the new allocation is more than the benefit of the old allocation, the methodology migrates the data to the new placement.

A. Simulation Setup

Our experimental testbed is modeled along the lines of the Compass framework described in Sec. II-A. For the experimental evaluation, the Storage System Modeler component is built on top of the Disksim simulation environment [6]. Disksim has been used in a large number of experimental studies and simulates the behavior of a modern disk very closely. To work in as realistic a setting as possible, we chose the disk model of the Seagate Cheetah4LP disk [6], which has been validated by Disksim against the real disk for a wide variety of workloads and simulates its behavior very closely.

In order to study the performance of the various methodologies in a realistic scenario, we used field traces made available by the Storage Performance Council [16]. We used 2 different OLTP traces and identified stores in them. Since the traces used by us were collected on a disk array that uses disks different from the Cheetah4LP, as a first step, we scaled the traces so that the disks operate with an average response time close to 100ms, which is a reasonable value for such a workload. We then split the original traces into individual traces for each store (each request in the trace contains an application identifier) and fed the traces through our experimental testbed.

TABLE I
EXPERIMENTAL SETTINGS

           IOPS      Disks   Streams   Mig Deadline   Store Size
Baseline   135       4       28        1200s          0.4 GB
Range      110-170   4-16    24-40     800-1800s      0.1-0.8 GB

As a baseline setting, we used 4 disks and 28 streams for the experiments. The configuration was evaluated every hour, with the migration deadline set to 20 minutes. We kept the store size at 10% of the disk size in the baseline setting. We then changed all the above parameters from the baseline, one at a time, to investigate their effect on the performance of the competing methodologies. Table I lists the range of experimental parameters.

B. Results

We first study the behavior of the various algorithms in Fig. 6 as the trace is played with time. We investigate how the Net Utility (defined as Benefit − Cost) of a configuration varies with time for the three competing methodologies when we use LSV as the benefit-maximizing placement and QoSMig as the migration methodology. The Benefit is calculated using Eqn. 2 with L as 100ms, whereas the cost is calculated by summing up the expected benefit of all requests that could not be served because of migration. Since we had chosen a good operating point, all methodologies were able to serve all requests, and hence the difference in benefit is only a result of the difference in response time achieved by the methodologies.

The results show that the placement strategy proposed in Compass is able to explore good intermediates and select

Fig. 6. Benefit of various Placement Strategies and Intermediate allocations with time, using LSV placement and the QoSMig Migration methodology

Fig. 7. (a) Response Time and (b) Migration Cost of different Placement Strategies with time, using LSV placement and the QoSMig Migration methodology

them appropriately. An interesting observation is that all the intermediates selected by our algorithm (shown as unconnected points in Fig. 6) have a Net Utility greater than that of the LSV allocation. Further, when the LSV allocation and the initial allocation are very different, we outperform the other algorithms by a more significant margin. This is a result of the fact that when the old and the new allocations are different, the explored intermediates are more varied and show a much wider variation in Net Utility. To understand this behavior better, we look at the response time (Fig. 7(a)) of the allocations selected by the different strategies and observe

Fig. 8. (a) Net Utility and (b) Response Time Comparison with increasing number of disks

that Compass is able to achieve a response time close to LSV with a very low cost of migration (Fig. 7(b)). On the other hand, Static placement has a highly variable response time leading to low benefit, whereas Benefit-based placement has to incur a high migration cost.

We performed the same experiments with an increased number of disks, while compressing the traces, so that the load on the disks remains the same. We found that as the number of disks increases, there is a marginal increase in the performance improvement of Compass over the other methodologies (Fig. 8). This can be attributed to the fact that Compass now has more chained migrations to potentially short-circuit and explores a much richer set of intermediates. Hence, as the number of disks increases from 4 to 16, Compass outperforms Static by a margin of 30%, up from the 20% seen in the 4-disk scenario. For lack of space, for the other sets of experiments, we report our observations only for the 4-disk scenario, while noting that the performance improvement of Compass increases with the number of disks.

We studied the performance of the various techniques for different combinations of benefit-maximizing placement strategies and migration methodologies with variation in request rate. We observed that LSV as a placement algorithm and QoSMig as a migration methodology are more predictable than Ergastulum placement and Whole Store migration. This is because of the deterministic nature of LSV and the adaptive nature of QoSMig. For lack of space, we

Fig. 9. Net Utility of different Placement Strategies with Change in Request Rate using LSV allocations and QoSMig

report our results only for this combination. As stated earlier, we vary the request rate by compressing or expanding the trace. An obvious manifestation of the scaling is that the total utility achieved by all methods falls with an increase in request rate, as the same requests are compressed together, leading to an increase in average response time. Fig. 9 shows the performance of the various strategies with LSV as the benefit-maximizing placement algorithm and QoSMig as the migration methodology. The results clearly demonstrate the superiority of Compass, as it achieves a significantly higher Net Utility compared to both Static placement and the Benefit-based (vanilla LSV) placement.

Fig. 10(a) studies the behavior of the various strategies with change in the Reconfiguration period. A large reconfiguration period smoothens out the short-term workload variations, and a methodology that does not use frequent migrations may still perform reasonably well. The results validate this intuition, as the Static placement strategy has the least Net Utility at a small reconfiguration period but starts to improve as the Reconfiguration period increases, eventually outperforming LSV and even approaching our Compass methodology. We also study the performance of the various algorithms with change in the Migration deadline (Fig. 10(b)). As the migration deadline is increased, the duration for which we get any additional benefit due to improved placement is reduced. Hence, both algorithms that migrate data exhibit a fall in Net Utility with an increase in migration deadline, whereas Static placement shows no performance change with change in migration deadline.

In Fig. 11(a), we study the behavior of the competing strategies as the average number of streams per disk is varied from 6 to 10. One may observe that having a large number of streams per disk leads to a more balanced allocation, as fragmentation problems are less pronounced. Further, the requirement for frequent migrations is low, as the large number of streams on a disk may smoothen workload variations on the disk. Both these intuitions are validated by our study, as the performance of the Compass algorithm becomes similar to that of Static placement as the number of streams is increased. However, as the number of streams increases, the number of variations in the sorted order of streams with time increases quadratically. Hence, a cost-oblivious algorithm may resort to large-scale migrations, which may improve the benefit only marginally. Hence, the Net Utility of the Benefit-based algorithm falls with increase

Fig. 10. Net Utility of different Placement Strategies with change in (a) Reconfiguration Period and (b) Migration deadline for LSV allocation and QoSMig Migration methodology

in the number of streams.

We have found in most of our experiments that Static placement outperforms Benefit-based placement, and a possible reason for that could be that we have stores with very low temperatures (temperature is defined as the request rate to the store per unit space used). Hence, we now vary the size (space used) of each store and study the behavior (Fig. 11(b)). It is natural to expect that as the store size decreases, the cost of migration decreases and, as a result, both the Benefit-based and Compass methods pay a lower migration cost; hence, their performance improves with decrease in store size. On the other hand, the Net Utility of Static should not change with variation in store size. We found both these intuitions to be validated by our results (Fig. 11(b)).

Our experiments conclusively establish the superiority of Compass over existing methodologies under a wide variety of workload settings. Compass is especially effective for mid-sized stores under moderate to heavy load.

VI. DISCUSSION

We have presented Compass, a methodology for performance management of storage systems that optimizes the tradeoff between the cost of migration and the expected improvement in performance in the configuration resulting from migration. This central idea has been shown to be effective in addressing performance problems resulting from load imbalance between various storage subsystems. We

Fig. 11. Net Utility of different Placement Strategies with change in (a) Streams per disk and (b) Store size for LSV allocation and QoSMig Migration methodology

have also presented algorithms for efficient search of configurations that optimize the tradeoff, in the neighborhood of the configurations given by placement strategies. Compass is aimed at handling performance problems in the medium time frame (on the order of a couple of hours) resulting from workload variations, which are not addressed by load-throttling based techniques that work on a much shorter time frame.

We now discuss the computational overhead of using Compass as opposed to using a benefit-based placement algorithm like LSV directly. For a benefit-based (or vanilla) placement strategy that does not take into account the cost of migration, the computational overhead is bounded by the running time of the placement algorithm ($T_A$). In Compass, we additionally explore the configuration space for intermediates, and for each of these intermediates, the benefit of the intermediate and the cost of moving to it are computed. These computations are based on mathematical models of the underlying disk subsystem (we use Disksim in our experimental study) and are hence very fast compared to the duration of actual migrations. Further, even a vanilla placement strategy uses the same models for generating its new placement and hence would suffer equally if the disk model were very detailed and benefit computation took a long time.

Hence, the only pitfall that the Compass strategy may suffer from is evaluating a large number of intermediates. However, as we have shown (Lemma 2), the number of such intermediates is bounded by the number of disks $M$ for sorting-based placement strategies and by $M^2$ for any general placement strategy. Combining this with the fact that data migration is a very time-consuming operation compared to the computation of simple mathematical functions, the overhead of Compass turns out to be insignificant and transparent to the user in our experiments.

Our work opens up many promising areas to explore. Avenues for future work include allowing migration at granularities smaller than the whole store. Current virtualization technology allows migration at the level of an individual RAID stripe, but the interplay of factors such as request locality that can affect performance needs to be carefully evaluated. We also want to evaluate the efficacy of the proposed approach when used with other placement algorithms.

REFERENCES

[1] E. Anderson. Simple Table-based Modeling of Storage Devices. HP Laboratories SSP Technical Report HPL-SSP-2001-4, 2001.
[2] E. Anderson, M. Kallahalla, S. Spence, R. Swaminathan, and Q. Wang. Ergastulum: An Approach to Solving the Workload and Device Configuration Problem. HP Labs technical memo HPL-SSP-2001-05, 2001.
[3] E. Anderson, M. Hobbs, K. Keeton, and S. Spence. Hippodrome: Running Circles Around Storage Administration. In USENIX FAST, 2002.
[4] D. D. Chambliss, G. A. Alvarez, P. Pandey, D. Jadav, J. Xu, R. Menon, and T. P. Lee. Performance Virtualization for Large-Scale Storage Systems. In Proc. of 22nd SRDS, 2003.
[5] K. Dasgupta, R. Ghosal, R. Jain, U. Sharma, and A. Verma. QoSMig: Adaptive Rate-Controlled Migration of Bulk Data in Storage Systems. In Proc. IEEE ICDE, 2005.
[6] DiskSim Simulation Environment, at http://www.pdl.cmu.edu/Disksim/.
[7] R. Garg, P. N. Shahabuddin, and A. Verma. Optimal Assignment of Streams in Server Farms. IBM Technical Report, 2003.
[8] S. D. Gribble, G. S. Manku, D. Rosseli, E. A. Brewer, T. J. Gibson, and E. L. Miller. Self-Similarity in File-System Traffic. In ACM SIGMETRICS, 1998, pp. 141-150.
[9] W. Hsu and A. J. Smith. Characteristics of IO Traffic in Personal Computer and Server Workloads. IBM Systems Journal, 42(2), 2003.
[10] A. K. Iyengar, M. S. Squillante, and L. Zhang. Analysis and Characterization of Large-Scale Web Server Access Patterns and Performance. In Proc. ACM World Wide Web Conference, 1999.
[11] L. W. Lee, P. Scheuermann, and R. Vingralek. File Assignment in Parallel I/O Systems with Minimal Variance of Service Time. IEEE Transactions on Computers, 49(2), 2000.
[12] Z. Liu, M. S. Squillante, and J. L. Wolf. On Maximizing Service-Level Agreement Profits. In Proc. ACM Conf. on Electronic Commerce, 2001.
[13] C. R. Lumb, A. Merchant, and G. A. Alvarez. Facade: Virtual Storage Devices with Performance Guarantees. In Proc. of USENIX FAST, 2003.
[14] IBM SAN Volume Controller, at http://www.ibm.com/storage.
[15] P. Marbach and R. Berry. Downlink Resource Allocation and Pricing for Wireless Networks. In Proc. IEEE Infocom, 2002.
[16] Storage Trace Repository, at http://traces.cs.umass.edu/storage/.
[17] P. Scheuermann, G. Weikum, and P. Zabback. Data Partitioning and Load Balancing in Parallel Disk Systems. VLDB Journal, 7(1):48-66, 1998.
[18] S. Uttamchandani, L. Yin, G. A. Alvarez, J. Palmer, and G. Agha. CHAMELEON: A Self-Evolving, Fully-Adaptive Resource Arbitrator for Storage Systems. In USENIX Annual Technical Conference, 2005.
[19] E. Varki, A. Merchant, J. Xu, and X. Qiu. Issues and Challenges in the Performance Analysis of Real Disk Arrays. IEEE Transactions on Parallel and Distributed Systems (TPDS), 15(6):559-574, June 2004.
[20] A. Verma and A. Anand. On Store Placement for Response Time Minimization in Parallel Disks. In IEEE ICDCS, 2006.
[21] A. Verma and S. Ghosal. Admission Control for Profit Maximization of Networked Service Providers. In Proc. Int'l World Wide Web Conference, 2003.


Recommended