
An IO Scheduling Algorithm to Improve Performance of Flash-Based Solid State Disks

Mehrnoosh Sameki, Amirali Shambayati, Hossein Asadi
Sharif University of Technology

Tehran, Iran

Abstract—Since the emergence of solid state devices into the storage scene, improvements in capacity and price have brought them to the point where they are becoming a viable alternative to traditional magnetic storage media. Current file systems and device-level I/O schedulers are optimized for rotational magnetic hard disk drives.

In order to improve the efficiency of hard disk utilization, an Operating System (OS) reschedules IO requests to examine more read and write requests in one disk rotation. The pattern of the reordered IO requests is modified from its original sequential sequence to a random sequence. However, a random sequence of IO requests may impose a large performance and endurance overhead on solid state devices. Since solid state devices have drastically different properties and structures than hard disks, we need to revisit some design aspects of file systems and scheduling algorithms used in the I/O subsystem.

In this paper, we first extract Linux IO access traces at the block I/O layer of the file system. Then, using the disk IO traces, we investigate the current approaches to I/O scheduling. Our results reveal that the current schedulers may not be ideally suited for solid-state devices. We also propose an SSD-aware scheduler, which can improve the performance of the disk subsystem.

The proposed IO scheduler has been implemented in the Linux kernel and evaluated using the DiskSim simulator. The results show that the throughput of the proposed scheduler is improved by 56%, 59.6%, 170.6%, and 47.3% in comparison with the Noop, Anticipatory, CFQ, and Deadline schedulers, respectively.

I. INTRODUCTION

Recently, Solid State Disks (SSDs) have become widely used in modern storage systems, laptops, personal computers, and embedded applications. Owing to the limitations of traditional magnetic disks, they are increasingly being replaced by flash-based SSDs. In magnetic disks, also called Hard Disk Drives (HDDs), access to a sector of the disk can be done only by rotating the disk platter and moving the disk arm to the target sector [1]. The mechanical movements of HDDs significantly increase read and write latencies. SSDs, however, are non-mechanical devices. This feature is the source of many advantages, including their resistance to shock and temperature variation, their reliability, durability, and low access time. Moreover, SSDs offer other advantages such as high read performance, light weight, small size, and low power consumption [2], [3]. Some of these features, including light weight, small size, and shock resistance, make SSDs suitable for mobile applications, while features such as durability, low access time, and high reliability make them promising for high-end applications such as high-performance servers and data storage systems.

Flash-based SSDs are typically composed of flash memory chips. A flash chip consists of planes, which can be accessed in parallel. Each plane is composed of blocks, which in turn consist of pages [4]. Read and write accesses are performed at page granularity, while erase operations are performed on a block basis. Despite the many merits offered by SSDs, they suffer from two major shortcomings.

First, in order to write new data into a NAND flash block, the previous charge representing the old data must be removed from the entire block, and the block must be reprogrammed according to the bit pattern of the new data. Thus, a data block should be erased before it is overwritten by new data. This limitation, called erase-before-write, leads to considerable latency for write operations and can affect the overall disk performance [5]. The second major shortcoming of flash-based SSDs is their limited endurance. This means that a flash block wears out after a limited number of erase operations; after a certain number of program/erase cycles, a cell becomes unusable and can no longer store new data. To address the issues of limited lifetime and erase-before-write, two different solutions can be followed. One solution is at the device level, by taking advantage of a software layer called the Flash Translation Layer (FTL) [5]. The main goal of the FTL is to hide the latency caused by the erase-before-write operation from the application or user perspective. The FTL maps logical addresses requested by applications to physical addresses on a NAND flash array. The efficiency of write performance in SSDs considerably depends on the efficiency of FTL mapping algorithms [4].

Another way to alleviate SSD shortcomings is to design appropriate Input/Output (I/O) scheduling algorithms at the Operating System (OS) level. I/O scheduling algorithms are designed and embedded in operating systems to decide on the sequence and the order of requests to be submitted to the disk subsystem. Current IO schedulers reorder IO requests to reduce response time by minimizing the arm movement of HDDs. To achieve this objective, traditional IO schedulers respond to incoming requests in such a way that the address of a candidate request is the nearest to the address of the previously serviced request. Thus, the arm has to move less in order to reach the target sector, which reduces the response time and power consumption [6], [7], [8], [9], [10]. The structure of SSDs is, however, entirely different from HDDs, since they contain no mechanical parts. Consequently, HDD-based IO scheduling algorithms are not appropriate for SSDs.

Recently, a few studies have proposed SSD-friendly I/O scheduling algorithms at the OS level by classifying incoming requests based on their logical addresses [11], [12], [13], [14], [15], [?], [16]. Among them, a request bundling scheme has been proposed in [11], designed for block-mapping FTLs. This scheme allocates bundles with a specific predefined size and examines newly arrived requests to determine to which bundle they belong. Dispatching a request is delayed until its corresponding bundle is full. Once a bundle becomes full, all requests inside it are dispatched to the disk subsystem. As a result, each SSD block is erased only once for a group of requests within a similar block. This scheme, however, can significantly increase the queue time associated with each IO request and changes the order of requests dispatched to the disk subsystem. Significant disordering of I/O requests can lead to I/O starvation or unfairness between I/O requests.

In this paper, we present an IO scheduler for SSDs which can significantly improve the throughput of SSDs. Our proposed SSD-aware IO scheduler dispatches IO requests considering the internal structure of SSDs. It reorders requests by their logical block addresses to produce more sequential IO accesses. By increasing the temporal locality of dispatched requests, we exploit the parallelism among flash chips; in other words, the flash chips can respond to incoming requests independently and in parallel. As a result, throughput is enhanced dramatically. We have implemented the proposed IO scheduler inside the Linux 2.6.32.29 kernel and selected a variety of IO-intensive application programs to measure its efficiency. We have achieved a 52.3% throughput improvement for Postmark and 1.18% for Bonnie++.

The rest of the paper is organized as follows. Section II provides background knowledge, including the main characteristics of flash memory storage, the Linux IO schedulers, and previously proposed SSD-friendly IO scheduling algorithms. Section III discusses the motivation for our work, and Section IV presents our proposed Linux IO scheduler for SSDs. Section V describes the experiments and tools used to evaluate the proposed design. Section VI presents the experimental results associated with our proposed IO scheduler, and Section VII reviews related work.

II. BACKGROUND

In this section, we briefly describe the characteristics of NAND flash memory and the functionality of the FTL. We then present the architecture of a typical SSD and discuss the current Linux IO schedulers and their characteristics.

A. NAND Flash Memory

NAND flash memory is a non-volatile storage device which is used to permanently hold user data in its cells. A NAND flash chip consists of erasable sections called blocks. Typically, a block consists of 64 or 128 pages. Each page can be accessed separately, and it is the unit of read and write operations inside flash memory. Data overwriting is not allowed in flash memory: to write new data to a page containing previous data, an erase operation must first be performed to remove the previous data, and only then can the new data be written to that page. The erase operation is performed on a block basis. The erase time may adversely affect the functionality of flash memory and cause considerable latency [16].
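To make the cost of this constraint concrete, the following C sketch models a NAND block array purely in memory; the array, sizes, and helper names are illustrative only, not a real driver interface. It shows what an in-place page update entails: the whole block must be copied out, erased, and reprogrammed.

#include <stdint.h>
#include <string.h>

#define PAGE_SIZE        2048   /* illustrative page size in bytes */
#define PAGES_PER_BLOCK    64   /* a typical value (cf. Table II) */
#define NUM_BLOCKS          8   /* illustrative number of blocks */

/* Tiny in-memory model of a NAND block array, for illustration only. */
static uint8_t nand[NUM_BLOCKS][PAGES_PER_BLOCK][PAGE_SIZE];

static void nand_erase_block(uint32_t block)
{
    memset(nand[block], 0xFF, sizeof nand[block]);   /* erased flash reads back as all 1s */
}

static void nand_program_page(uint32_t block, uint32_t page, const uint8_t *buf)
{
    memcpy(nand[block][page], buf, PAGE_SIZE);       /* programming is page-granular */
}

/* Overwriting a single page forces a whole-block erase-before-write cycle:
 * save the block, modify the target page, erase the block, rewrite every page. */
static void update_page_in_place(uint32_t block, uint32_t page, const uint8_t *new_data)
{
    static uint8_t shadow[PAGES_PER_BLOCK][PAGE_SIZE];

    memcpy(shadow, nand[block], sizeof shadow);      /* copy out the old block contents */
    memcpy(shadow[page], new_data, PAGE_SIZE);       /* apply the update to one page */

    nand_erase_block(block);                         /* erase-before-write */
    for (uint32_t p = 0; p < PAGES_PER_BLOCK; p++)
        nand_program_page(block, p, shadow[p]);      /* reprogram the whole block */
}

The FTL described next exists precisely to avoid paying this cost on every update, by redirecting writes to already-erased pages instead of rewriting blocks in place.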

To alleviate this latency or hide it from the user's perspective, a software layer called the Flash Translation Layer (FTL) is implemented inside the SSD. The FTL allows the file system to perform read/write transactions on flash memory without any knowledge of the characteristics and structure of the flash memory. In other words, the file system is able to treat the flash memory as a traditional magnetic disk.

Since each block inside the flash memory has a limited lifetime, it is essential to distribute write and erase operations evenly across blocks, so that no block wears out while others remain unused. To achieve this goal, the FTL uses an internal mapping table, translating each logical address issued by the file system to a physical address on the flash memory. Different mapping schemes have been proposed in the literature, including page-mapping FTL, block-mapping FTL, and hybrid-mapping FTL [4].

The page-mapping FTL maintains a mapping table which holds the mapping information between logical page addresses and their corresponding physical page addresses. When updating data, this scheme writes the new data to a new page and updates the mapping table. The page-mapping scheme is fast, but it requires a large memory space for its mapping table. Furthermore, after numerous updates, each block may contain many invalid pages, which requires efficient garbage collection algorithms to be invoked.

The block-mapping FTL is another scheme in which the mapping table holds the mapping between logical block addresses and physical ones. When a request to a logical page address is received, the address is split into two separate numbers: a logical block address and a logical offset within that block. These two numbers are used to allocate an empty block with an empty page within it, helping the file system accomplish its operation. This scheme requires a comparatively smaller memory to maintain its mapping table. However, the block-mapping scheme also has an important disadvantage: it incurs a large copy overhead, because updating a single page within a block requires all other valid pages in the block to be copied to a newly allocated block.
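The following minimal C sketch contrasts the two lookups just described; the structures and names are hypothetical and only illustrate how a page-mapping table resolves every logical page directly, while a block-mapping table resolves a logical block number plus an in-block offset.

#include <stdint.h>

#define PAGES_PER_BLOCK 64   /* matches the configuration in Table II */

/* Page-mapping FTL: one table entry per logical page. */
struct page_mapping_ftl {
    uint32_t *page_map;              /* logical page number -> physical page number */
};

static uint32_t page_map_lookup(const struct page_mapping_ftl *ftl, uint32_t logical_page)
{
    return ftl->page_map[logical_page];      /* direct, fine-grained lookup */
}

/* Block-mapping FTL: one table entry per logical block. */
struct block_mapping_ftl {
    uint32_t *block_map;             /* logical block number -> physical block number */
};

static uint32_t block_map_lookup(const struct block_mapping_ftl *ftl, uint32_t logical_page)
{
    uint32_t logical_block = logical_page / PAGES_PER_BLOCK;   /* which logical block */
    uint32_t offset        = logical_page % PAGES_PER_BLOCK;   /* page offset inside it */

    /* The offset is preserved, so updating one page drags the rest of the block along. */
    return ftl->block_map[logical_block] * PAGES_PER_BLOCK + offset;
}

The page-mapping table needs one entry per page (large RAM footprint, flexible placement), whereas the block-mapping table needs one entry per block (small footprint, but the fixed in-block offset is what causes the copy overhead described above).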

The hybrid-mapping FTL combines the block-mapping and page-mapping schemes. It uses a number of log blocks, also called log buffers. This scheme allocates log blocks and writes new data to them instead of to the original data blocks. When a log block becomes full, a merge operation is performed between the log block and the corresponding data block. The hybrid-mapping scheme is more efficient than the previous schemes.


Figure 1. Structure of Solid State Disk Drive [17]

B. Solid State Disks

SSDs are storage devices which use flash memory to provide the system with a non-volatile memory space in which data can be stored permanently. By using flash memory, SSDs are capable of holding data even in the absence of a power source. A typical SSD, as shown in Figure 1, consists of NAND flash memory, a controller, an SDRAM used as cache memory, and a communication interface [17]. The controller, which typically employs an embedded processor, can significantly influence the disk performance. The functionalities of the SSD controller include balancing the lifetime of flash chips, implementing wear-leveling algorithms, handling flash errors, and managing unused blocks within the flash memories.

All incoming requests to an SSD are first looked up in the cache memory to determine whether the requested data is present. If the requested data is found in the cache, there is no need to access the flash subsystem; instead, the data is served from the cache. Finally, the communication interface of an SSD provides the communication link between the host computer and the SSD.

C. Linux IO Schedulers

The main role of an IO scheduler is to decide when and which request should be dispatched to the disk. This decision has a significant impact on the overall system performance: it helps the system respond to different processes fairly and reduces the time each request waits to be dispatched to the disk [1]. Figure 2 presents the IO scheduler in the IO subsystem hierarchy. In this section, we elaborate on the current IO schedulers used in the Linux operating system and their different characteristics. The most common IO schedulers used in current operating systems are Noop, Deadline, Anticipatory, and CFQ; they are described next.

The Noop scheduler serves requests on a First Come First Served basis. This scheduler does not change the order of requests arriving in the scheduler queue. However, consecutive requests in the incoming order may be merged within the scheduler queue to form a larger request.

Figure 2. IO Subsystem Hierarchy

The Deadline scheduler sorts requests based on their sector number; i.e., requests are organized in an order based on their physical location on the disk. It also maintains a deadline list which holds the deadlines by which requests should be dispatched. All requests are normally served based on the sector-sorted list. However, once a request is found in the deadline list whose deadline is about to expire, that request is served immediately.
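As a rough illustration of this selection policy (a simplified sketch with hypothetical types, not the actual Linux implementation), the next request is normally the next one in sector order, unless the earliest deadline in the queue has already expired:

#include <stddef.h>
#include <stdint.h>

struct io_request {
    uint64_t sector;      /* starting sector, used for the sector-sorted order */
    uint64_t deadline;    /* time by which the request should be dispatched */
};

/* Pick the next request from a queue of n pending requests. */
static const struct io_request *
pick_next(const struct io_request *reqs, size_t n, uint64_t now, uint64_t last_sector)
{
    const struct io_request *next_in_order = NULL;   /* next request in ascending sector order */
    const struct io_request *earliest      = NULL;   /* request with the earliest deadline */

    for (size_t i = 0; i < n; i++) {
        if (!earliest || reqs[i].deadline < earliest->deadline)
            earliest = &reqs[i];

        if (reqs[i].sector >= last_sector &&
            (!next_in_order || reqs[i].sector < next_in_order->sector))
            next_in_order = &reqs[i];
    }

    if (n == 0)
        return NULL;

    /* An expired deadline takes priority over the sector-sorted sweep. */
    if (earliest->deadline <= now)
        return earliest;

    /* Otherwise continue the sweep; wrap around when it runs past the last sector. */
    return next_in_order ? next_in_order : earliest;
}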

Anticipatory is an extended version of Deadline. It acts the same as Deadline, except that after serving a read request it keeps the disk idle for a short time and waits to receive a request near the previously dispatched one. In other words, it anticipates that a request physically close to the previous one will be sent to the disk during this idle time. This strategy can improve IO performance considerably [18].

CFQ (Completely Fair Queuing) distributes requests into separate queues based on their priority and starts serving them from the highest-priority queue. It allocates a time slice to each queue and tries to serve its requests within that time slice; if they are not finished during that period, they are added back to the end of the queues. In other words, this algorithm works in a round-robin fashion.

III. MOTIVATION

In [11], a bundling IO scheduling algorithm named IRBW is designed for the block-mapping scheme. Its main purpose is to decrease the number of merges required in the FTL. The authors attempt to reach this goal by postponing the dispatch of requests belonging to an SSD block until enough requests have been received to rewrite the whole block. With this method, it is not necessary to merge the log block with the original block; the FTL simply turns the log block into a new data block. On the other hand, to cover a region's entire address range, a sequence of sequential write requests must be issued by the workload, which may leave previously arrived requests waiting for a long time before being dispatched. Consequently, this issue increases the queue time and, correspondingly, the response time.

IV. PROPOSED MODEL

Figure 3. Page-mapping FTL maps the pages associated with the victim bundle requests to physical pages inside different flash chips in order to increase parallelism.

With respect to the IRBW IO scheduling algorithm, which has been designed for SSDs with block-mapping FTLs, we have proposed a request bundling scheme better suited to page-mapping FTLs which attempts to remedy IRBW's shortcomings.

Our idea is inspired by IRBW in its categorization of write requests into related address ranges, called regions, with a significant difference: write requests belonging to a region do not have to wait until the whole region's address range is covered by incoming requests.

Our method, called CBFS (Coldest Bundling Based on Filling Speed), is designed for page-mapping FTLs, which map logical page addresses to physical ones. Therefore, it does not matter if the pages included in a dispatched bundle are not sequential. At each decision time for dispatching requests, we select the bundle of requests belonging to the region which is least likely to be filled further. We call it the coldest bundle, and we use the following method to recognize it. We vary the region size from 4KB to 2MB and investigate for which region size the highest throughput and the least deterioration of queue time are achieved. To avoid starvation of write requests in the queue, and the consequent increase in queue time, our algorithm selects one of the bundles as a victim each time the IO scheduler's queue is surveyed. We choose the bundle with the least expectation of becoming substantially filled in the near future; this bundle is named the coldest bundle. Our algorithm therefore needs a policy to determine how cold a bundle is. The policy is to collect bundle statistics over recent invocations, for instance how much data has been added to each bundle recently. Our algorithm updates the bundles' statistics every n queue surveys, where n is a tunable parameter, and then calculates the average size of the requests recently added to each bundle. We set the survey parameter to 5 in our experiments. Since the bundle with the smallest average is likely the coldest one, it is the best choice to select as the victim. Our main reason for dispatching the coldest bundle is that this bundle has the least expectation of becoming filled in the near future; keeping it in the bundles' queue is therefore just an overhead for the system, whereas keeping hot bundles increases their chance of becoming further filled and dispatched in later surveys.

FillingSpeed = (sum of the sizes of incoming requests) / (number of surveys)    (1)

Coldest bundle = the bundle with the minimum FillingSpeed    (2)

Figure 4 shows the general idea of our IO scheduling algorithm. Each request is examined and its corresponding bundle is determined. Whenever a bundle needs to be dispatched, the algorithm chooses the bundle with the least filling speed as the victim and dispatches all of the requests inside the victim bundle to the disk subsystem.
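A minimal sketch of this selection logic is shown below; the data structures, constants, and function names are our own illustration (not the kernel patch itself), assuming one bundle per region and the filling-speed bookkeeping of Equations 1 and 2.

#include <stddef.h>
#include <stdint.h>

#define NUM_BUNDLES   64      /* illustrative number of regions/bundles */
#define SURVEY_PERIOD  5      /* the survey parameter "n"; 5 is the value used in the experiments */

struct bundle {
    uint64_t bytes_since_stats;   /* data added to this bundle since statistics were last refreshed */
    double   filling_speed;       /* average data added per survey (Equation 1) */
};

static struct bundle bundles[NUM_BUNDLES];
static unsigned      survey_count;

/* Called when a write request is queued: map it to its region's bundle
 * (relevant bundle = request address / bundle size, cf. Figure 4) and account its size. */
static void account_request(uint64_t req_addr, uint32_t req_size, uint64_t region_size)
{
    size_t b = (size_t)((req_addr / region_size) % NUM_BUNDLES);
    bundles[b].bytes_since_stats += req_size;
}

/* Called on every survey of the scheduler queue.  Every SURVEY_PERIOD surveys the
 * filling speeds are refreshed; the victim is always the coldest bundle, i.e. the
 * one with the minimum filling speed (Equation 2). */
static size_t pick_victim_bundle(void)
{
    survey_count++;
    if (survey_count % SURVEY_PERIOD == 0) {
        for (size_t b = 0; b < NUM_BUNDLES; b++) {
            bundles[b].filling_speed = (double)bundles[b].bytes_since_stats / SURVEY_PERIOD;
            bundles[b].bytes_since_stats = 0;
        }
    }

    size_t coldest = 0;
    for (size_t b = 1; b < NUM_BUNDLES; b++)
        if (bundles[b].filling_speed < bundles[coldest].filling_speed)
            coldest = b;

    return coldest;   /* all requests in this bundle are then dispatched together */
}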

The main merit of our algorithm is that it takes advantage of the parallelism among the flash chips inside SSDs. Each time, we dispatch a large amount of data to the SSD simultaneously, and thus the page-mapping FTL distributes the dispatched requests among separate flash chips. As a result, the service times of the requests overlap with each other, and throughput is raised greatly. We do not apply any policy to read requests, since SSDs have a reasonable response time when handling read requests, and devising a method to schedule their dispatch times could increase the requests' queue time. Moreover, to avoid starvation of write requests, read requests are given priority over write requests only once.

Figure 3 demonstrates how CBFS schedules the requests' dispatch times in order to exploit flash-chip parallelism. By dispatching a large bundle of requests, we enhance the temporal locality among the requests' dispatch times, enabling the FTL to map the logical addresses of the requests to physical addresses in different flash chips; hence, write requests can be serviced concurrently.


Figure 4. RB: Relevant Bundle = Req Address / Bundle Size, FS: Filling Speed

V. IMPLEMENTATION

We implemented our algorithm inside the Linux kernel 2.6.32.29 and used the QEMU emulator for kernel debugging. First, we extracted Linux IO access traces, especially those access requests that come from the file system and swap system calls. To accomplish this, we used a block IO layer tracing tool named blktrace, which can observe what is going on inside the operating system's block IO layer; the tool extracts detailed information about IO transactions from the block IO layer and exposes it to the user. We extracted the IO requests for various workloads and replayed them on different IO schedulers, including the default Linux IO schedulers and our proposed one. We used the btreplay tool to reproduce the patterns of the captured IO transactions produced by the distinct workloads; thus, btreplay enabled us to evaluate different IO scheduling algorithms with exactly the same IO patterns. In addition, we extracted only the IO requests of the "Merge" and "Dispatch" types, because these types properly represent the IO scheduler's functionality. Moreover, we used the DiskSim simulator, which is a trace-driven simulator.

We also modified the DiskSim simulator's code to add the throughput and the overlapping rate of incoming requests to its output file. We define the throughput according to Equation 3, in which Ri refers to request i and Size(Ri) denotes the corresponding request size.

Throughput = Σ Size(Ri) / (Total Simulation Time − Idle Time)    (3)

The overlapping rate between two successive requests equals the difference between the second request's start time and the first request's end time. Finally, we modified DiskSim to report the average of the overlapping rates. Using this parameter, we obtain a criterion to investigate how well our algorithm exploits the parallelism among flash chips.
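For clarity, the sketch below computes both metrics from a completed-request trace; the record layout and field names are assumptions for illustration, not DiskSim's actual output format.

#include <stdio.h>
#include <stddef.h>

struct completed_request {
    double size_kb;     /* Size(Ri): size of request i */
    double start_time;  /* service start time */
    double end_time;    /* service end time */
};

/* Throughput per Equation 3: total transferred size divided by busy time. */
static double throughput(const struct completed_request *r, size_t n,
                         double total_sim_time, double idle_time)
{
    double total_size = 0.0;
    for (size_t i = 0; i < n; i++)
        total_size += r[i].size_kb;
    return total_size / (total_sim_time - idle_time);
}

/* Overlapping rate between two successive requests: the second request's start
 * time minus the first request's end time (negative values mean the service
 * times overlap).  The reported metric is the average over the trace. */
static double average_overlapping_rate(const struct completed_request *r, size_t n)
{
    if (n < 2)
        return 0.0;
    double sum = 0.0;
    for (size_t i = 1; i < n; i++)
        sum += r[i].start_time - r[i - 1].end_time;
    return sum / (double)(n - 1);
}

int main(void)
{
    /* Tiny illustrative trace: two partially overlapping requests. */
    struct completed_request trace[] = {
        { 8.0, 0.0, 2.0 },
        { 4.0, 1.5, 3.0 },
    };
    printf("throughput = %.3f KB per time unit\n", throughput(trace, 2, 3.0, 0.0));
    printf("avg overlapping rate = %.3f\n", average_overlapping_rate(trace, 2));
    return 0;
}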

VI. RESULTS

We used two different workloads to evaluate our IO scheduling algorithm's performance. The first workload is Postmark, which is designed to simulate the behavior of mail servers; running this benchmark executes four types of transactions: files are created, deleted, read, and appended. The second workload is Bonnie++, which performs a number of simple tests of hard drive and file system performance. Table I presents the characteristics of the used traces in detail.

Table I
IO TRACE CHARACTERISTICS

Parameter                         Postmark     Bonnie++
Read Requests (%)                 0            0.0085
Write Requests (%)                100          99.9915
Average Request Size (KB)         7.8257725    4.3516445
Average Read Request Size (KB)    0            4
Average Write Request Size (KB)   7.8257725    4.3516745

Table II shows the system configuration on which our experiments have been done.

Table II
SSD CONFIGURATION

Case                 Description
SSD Size             2 GB
SSD Block Size       512 Sectors
SSD Page Size        8 Sectors
SSD Sector Size      512 Bytes
Pages per Block      64
Blocks per Plane     128
Planes per Element   8

We ran the mentioned benchmarks and collected their IO transactions with blktrace. Afterwards, we fed the simulated SSD's input with blktrace's output and compared our IO scheduling algorithm with the default Linux IO schedulers based on their throughput. As described in the previous section, we used a survey time parameter to decide on the coldest bundle and set it equal to 5. We also varied the region size from 256KB to 2MB to investigate which value leads to more exploitation of parallelism. Figures 5 and 6 show the throughput results for the Postmark and Bonnie++ benchmarks, respectively. As can be seen, we achieved a 52.3% improvement for Postmark and 1.18% for Bonnie++ given a region size equal to 1MB.

Figure 5. Throughput of Postmark Benchmark

Figure 6. Throughput of Bonnie++ Benchmark

Additionally, we evaluated the average SSD response time to investigate how CBFS affects this parameter. Figures 7 and 8 show the response times of the Postmark and Bonnie++ benchmarks. Since the average size of requests dispatched by CBFS is larger than that of the default Linux IO schedulers, we also considered another parameter, named average normalized response time, calculated by dividing the average response time by the average request size. Figures 9 and 10 show that the average normalized response time is slightly improved by the CBFS algorithm. In addition, other configurations of our algorithm show better results in comparison with the default I/O schedulers for both the Postmark and Bonnie++ benchmarks.

Figure 7. Response Time of Postmark Benchmark

We believe that the main reason for our improvement in throughput is a better exploitation of the parallelism that exists among the flash chips. To verify this claim, we extracted the overlapping rate of the IO requests' service times for each IO scheduling algorithm for the Postmark benchmark. According to Figure 11, the pattern of the overlapping rates matches the pattern of the throughput results. Given the definition of the overlapping rate, the lower it is, the more overlap exists among the service times of different IO requests.

Figure 8. Response Time of Bonnie++ Benchmark

Figure 9. Normalized Response Time of Postmark Benchmark

Considering the fact that CBFS holds IO requests in order to build a large bundle out of them before dispatching, it might at first sight be expected that CBFS would increase the IO scheduler's queue time. Hence, we evaluated this parameter, and the results in Figures 12 and 13 demonstrate that there is no dramatic deterioration in CBFS's queue time; in fact, CBFS has a better queue time than CFQ and Deadline.

To verify that CBFS's improvement in throughput is due to the parallelism available inside the SSD, we repeated our tests for other configurations of the simulated SSD. We changed the blocks per plane, pages per plane, and number of flash chips parameters. Our results demonstrate that the same pattern holds for the new configurations as well.

VII. RELATED WORK

Two different approaches have been presented in the literature to alleviate the issues of the high erase-before-write latency and the fast aging of flash memory in SSDs. The first approach is to implement buffer management algorithms in the FTL software. The other approach is at the operating system layer, by modifying the algorithms of current IO schedulers to make them suitable for SSDs.

Figure 10. Normalized Response Time of Bonnie++ Benchmark

Figure 11. Overlapping Rate for Postmark Benchmark

Figure 12. Average Queue Time for Postmark Benchmark

A. Buffer Management Algorithms

In this approach, a RAM buffer embedded inside the SSD performs write buffering and manipulates write requests before they are dispatched to the disk subsystem. In other words, since the write latency is much higher than the read latency in SSDs, buffer management algorithms aim to reduce the number of writes in order to increase the overall IO performance. Many different buffer management schemes have been proposed and investigated; the most relevant algorithms are discussed here.

Figure 13. Average Queue Time for Bonnie++ Benchmark

Among buffer management algorithms, CFLRU and LRU-WSR apply a clean-first scheme, which means that they first evict clean pages from the buffer. CFLRU uses a page list based on the LRU algorithm and divides it into two regions: a working region, which contains recently referenced pages, and a clean-first region, from which victim candidates are selected [19]. LRU-WSR, on the other hand, considers some pages as cold and delays flushing dirty pages which are not cold. Each page has a cold flag, which is set by default and cleared when the page is referenced [20].
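As a hedged illustration of the clean-first idea shared by CFLRU and LRU-WSR (this is not the published pseudocode; the names and structure are hypothetical), a victim is searched from the LRU end, preferring clean pages so that dirty pages, which are expensive to flush to flash, stay in the buffer longer:

#include <stdbool.h>
#include <stddef.h>

struct buffered_page {
    bool dirty;    /* must be written to flash before it can be evicted */
    bool cold;     /* LRU-WSR-style cold flag, cleared when the page is referenced */
};

/*
 * Clean-first victim selection over an LRU-ordered array (index 0 = least
 * recently used).  Within the first `window` entries, which play the role of
 * CFLRU's clean-first region, prefer a clean page; otherwise fall back.
 */
static size_t pick_victim(const struct buffered_page *pages, size_t n, size_t window)
{
    size_t limit = window < n ? window : n;

    for (size_t i = 0; i < limit; i++)
        if (!pages[i].dirty)
            return i;        /* evicting a clean page costs no flash write */

    /* No clean page in the window: prefer a dirty page already marked cold
     * (the LRU-WSR idea), otherwise evict the true LRU page. */
    for (size_t i = 0; i < limit; i++)
        if (pages[i].cold)
            return i;
    return 0;
}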

Jo et al. proposed a buffer management algorithm called Flash-Aware Buffer management (FAB) to minimize the number of write and erase operations and to reduce the number of garbage collector invocations caused by overwriting hot pages. When the buffer becomes full, FAB selects a victim block by choosing the block with the maximum number of pages, and then flushes the pages inside the victim block. To increase the chance of a switch merge, the pages with the most contiguously written data are selected as victims. This selection also increases the hit ratio of hot pages within the buffer, since pages containing short writes stay inside the buffer for a longer period of time [21].

In [5], Kim proposed a new write buffer management scheme called Block Padding Least Recently Used (BPLRU) to enhance the random write performance of flash storage. A RAM buffer inside the SSD tries to present an appropriate write pattern using RAM buffering techniques. The main advantage of this method is that it can be easily applied to current SSDs without any changes to their FTL. BPLRU uses three key techniques: block-level LRU, page padding, and LRU compensation.

Zhao et al. proposed another buffer management algorithm called Buffer Padding Cold and Large Cluster First (BPCLC). This algorithm uses a new padding technique called partial block padding. Reading unmodified pages from flash memory can waste a lot of power and cause latency that adversely affects IO performance; therefore, this algorithm does not read all of the unmodified pages from flash memory to make the victim cluster full. It only pads the cluster until it becomes a fully sequential write sequence, i.e., one containing sequentially ordered pages with offsets from 0 to the largest page offset inside the cluster. This method is effective because it increases the possibility of a switch or partial merge. Furthermore, it can turn the high cost of poor flash random writes into the low cost of sequential writes [22].

B. IO Scheduling Algorithms

One of the most closely related works is the bundling algorithm, in which a new IO scheduling algorithm has been proposed that makes the IO scheduler suitable for interacting with solid-state disks [11].

Its main idea is based on the Flash Translation Layer schemes of flash-based SSDs, which are used to manage the read and write transactions of SSDs. The main purpose of the FTL is to compensate for the flaws of flash technology, namely limited write cycles and the inability to overwrite data, which lead to low write performance.

As discussed earlier, there are two main approaches to designing FTLs: page mapping and block mapping.

Kang et al. also proposed a new SSD IO scheduler, called the STB scheduler, based on the IRBW scheduler. It adds a timing parameter to the bundling algorithm, which decreases the response time and queuing time associated with each IO request; in other words, it prevents the scheduler from indefinitely holding back requests whose total size is less than the logical block size. This feature decreases the overall system response time and increases performance. Furthermore, it divides requests into two types: synchronous requests, which may block other requests from being dispatched to the disk subsystem, and asynchronous requests. Another feature of the STB scheduler is that it prioritizes synchronous requests so that they do not block other requests from being dispatched. In short, it proposes time-out bundling, which prevents requests from waiting for other requests indefinitely, and selective bundling, which prevents performance-critical requests from being blocked [15].

[12] and [13] extracted important performance parameters of solid state disks to find an efficient I/O pattern for different solid state disks based on these parameters. [12] also proposed a new I/O scheduling scheme and tuned the file system block size and maximum request size according to the extracted information about solid state drives. [14] proposed a new I/O scheduler for improving I/O performance over multi-bank flash memory storage systems by taking advantage of the parallelism of multiple tasks. [?] concentrated on how to service IO requests in a fair manner; the scheme uses this policy to prevent the long execution time of write requests from postponing the execution of read requests. It allocates time slices for request execution, prioritizes read requests in its fairness management, and also tries to exploit the parallelism inside the SSD. In effect, the authors proposed an alternative to the CFQ IO scheduler which is more efficient for SSDs. [16] proposed Griffin, a hybrid storage design which uses HDDs as a write cache for MLC-based SSDs in order to reduce writes to the SSD.

REFERENCES

[1] Abraham Silberschatz, Peter Baer Galvin, and Greg Gagne. Operating System Concepts (8th ed.). Wiley, 2008.

[2] M. Moshayedi and P. Wilkinson. Enterprise SSDs, 2008.

[3] Jae-Hong Kim, Dawoon Jung, Jin-Soo Kim, and Jaehyuk Huh. A methodology for extracting performance parameters in solid state disks (SSDs). In Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS '09), IEEE International Symposium on, pages 1–10, September 2009.

[4] J. Shin, Z. Xia, N. Xu, R. Gao, X. Cai, S. Maeng, and F. Hsu. FTL design exploration in reconfigurable high-performance SSD for server applications, 2009.

[5] K. Hyojun and A. Seongjun. BPLRU: a buffer management scheme for improving random writes in flash storage. In Proceedings of the 6th USENIX Conference on File and Storage Technologies, number 16, pages 1–14. USENIX Association, 2008.

[6] D. Jacobson and J. Wilkes. Disk scheduling algorithms based on rotational position. Technical report.

[7] B. L. Worthington, G. R. Ganger, and Y. N. Patt. Scheduling algorithms for modern disk drives. In Proceedings of the 1994 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems (SIGMETRICS '94), pages 241–251, New York, NY, USA, 1994. ACM.

[8] P. M. Chen and D. A. Patterson. A new approach to I/O performance evaluation: self-scaling I/O benchmarks, predicted I/O performance. ACM Trans. Comput. Syst., 12(4):308–339, 1994.

[9] A. Gulati, A. Merchant, and P. J. Varman. mClock: handling throughput variability for hypervisor IO scheduling. In Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, pages 1–7, 2010.

[10] F. I. Popovici, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau. Robust, portable I/O scheduling with the disk mimic. In USENIX Annual Technical Conference, General Track, pages 297–310, 2003.

[11] Jaeho Kim, Yongseok Oh, Eunsam Kim, Jongmoo Choi, Donghee Lee, and Sam H. Noh. Disk schedulers for solid state drives. In Proceedings of the 7th ACM International Conference on Embedded Software (EMSOFT).

[12] Byeungkeun Ko, Youngjoo Kim, and Taeseok Kim. Performance improvement of I/O subsystems exploiting the characteristics of solid state drives. In Proceedings of the 2011 International Conference on Computational Science and Its Applications, Volume Part III (ICCSA '11), pages 528–539, Berlin, Heidelberg, 2011. Springer-Verlag.

[13] J. Kim, S. Seo, D. Jung, and J. Huh. Parameter-aware I/O management for solid state disks (SSDs). IEEE Transactions on Computers, 2011.

[14] Z. Xiaosong, Z. Pei, and L. Guohui. An efficient I/O scheduler over multi-bank flash memory storage systems. In IEEE 2nd International Conference on Software Engineering and Service Science (ICSESS), pages 59–62, 2011.

[15] Seungyup Kang, Hyunchan Park, and Chuck Yoo. Performance enhancement of I/O scheduler for solid state devices. In IEEE International Conference on Consumer Electronics (ICCE), 2011.

[16] Gokul Soundararajan, Vijayan Prabhakaran, Mahesh Balakrishnan, and Ted Wobber. Extending SSD lifetimes with disk-based write caches. In Proceedings of the 8th USENIX Conference on File and Storage Technologies (FAST '10), pages 8–8, Berkeley, CA, USA, 2010. USENIX Association.

[17] N. Agrawal, V. Prabhakaran, T. Wobber, J. D. Davis, M. Manasse, and R. Panigrahy. Design tradeoffs for SSD performance. In Proceedings of the USENIX Annual Technical Conference (USENIX '08), pages 57–70, Boston, MA, June 2008.

[18] S. Iyer and P. Druschel. Anticipatory scheduling: a disk scheduling framework to overcome deceptive idleness in synchronous I/O. In Symposium on Operating Systems Principles, pages 117–130, 2001.

[19] Seon-yeong Park, Dawoon Jung, Jeong-uk Kang, Jin-soo Kim, and Joonwon Lee. CFLRU: a replacement algorithm for flash memory. In Proceedings of the 2006 International Conference on Compilers, Architecture and Synthesis for Embedded Systems, pages 234–241. ACM, 2006.

[20] J. Hoyoung, S. Hyoki, P. Sungmin, K. Sooyong, and C. Jaehyuk. LRU-WSR: integration of LRU and writes sequence reordering for flash memory. IEEE Transactions on Consumer Electronics, 54:1215–1223, 2008.

[21] J. Heeseung, K. Jeong-uk, P. Seon-yeong, K. Jin-soo, and L. Joonwon. FAB: flash-aware buffer management policy for portable media players. IEEE Transactions on Consumer Electronics, 52(2):485–493, 2006.

[22] H. Zhao, P. Jin, P. Yang, and L. Yue. BPCLC: an efficient write buffer management scheme for flash-based solid state disks. International Journal of Digital Content Technology and its Applications, 4(6):123–133, September 2010.

