Storage availability in large scale data centers

Date post: 19-Aug-2015
Upload: marybabu10

INTELLIGENT DATA OUTSOURCING
An Improved Storage Availability in Large Scale Data Centers

OUTLINE

● Introduction

● Existing System

● Proposed System

● Analysis

● Conclusion

● References

INTRODUCTION

Big data

● Broad term for large datasets

● Business technology for modern enterprises

● Accuracy leads to reduced risk

● Storage is a critical component

● Stored in disks

STORAGE SYSTEM TASKS

● High priority foreground tasks

● Low priority background tasks

[Diagram: storage system tasks split into foreground tasks and background tasks]

LOW PRIORITY BACKGROUND TASKS

● RAID Reconstruction

● RAID Resynchronisation

● Disk Scrubbing

INEFFICIENCIES OF EXISTING SYSTEM

● Time consuming

● Data loss

● Inefficient Storage Availability

● Failure-induced optimization

● Does not exploit the predictable nature

● Passive

INTELLIGENT DATA OUTSOURCING (IDO)

● Dynamically captures data popularity

● Exploits temporal and spatial access locality

● Balances background task workflow against user I/O requests

● Portable

OPTIMIZATION SCHEME

EXISTING SYSTEM

● Reactive Optimization

● Request Based

● Exploits Temporal Locality

PROPOSED SYSTEM

● Proactive Optimization

● Zone Based

● Exploits both temporal and spatial locality

OPTIMIZATION

Reactive Optimization

● Starts after the crash

● Passive

Proactive Optimization

● Starts before the crash

● Active

ACCESS LOCALITY

Temporal locality

● Repeated access to the same data within a short time

● Request Based Optimization

Spatial locality

● Clustered data access within small storage areas

● Zone Based Optimization

[Diagram: access locality splits into temporal locality and spatial locality]
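The two kinds of locality can be measured on an access trace. The sketch below is only an illustration: the function names, the sliding `window` of recent accesses, and the fixed `zone_size` are assumptions, not definitions taken from the slides.

```python
def temporal_locality(trace, window):
    """Fraction of accesses that repeat an offset seen in the previous `window` accesses."""
    if not trace:
        return 0.0
    hits = 0
    for i, off in enumerate(trace):
        if off in trace[max(0, i - window):i]:
            hits += 1
    return hits / len(trace)


def spatial_locality(trace, window, zone_size):
    """Fraction of accesses that land in the same zone as a recent access to a different offset."""
    if not trace:
        return 0.0
    hits = 0
    for i, off in enumerate(trace):
        recent = trace[max(0, i - window):i]
        if any(r != off and r // zone_size == off // zone_size for r in recent):
            hits += 1
    return hits / len(trace)


trace = [0, 0, 1, 5000, 2, 0]  # hypothetical block offsets
print(temporal_locality(trace, window=2))
print(spatial_locality(trace, window=2, zone_size=1024))
```

A request based scheme only benefits from the first measure, while a zone based scheme can also benefit from the second.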

CHARACTERISTICS OF IDO

● Proactive Zone Based Optimization

● Temporal Locality

● Spatial Locality

● User I/O

● Background I/O

DESIGN OF IDO

THREE MAIN OBJECTIVES

● Improving the storage availability

● Improving the I/O performance

● Providing high portability

IDO ARCHITECTURE

[Figure: IDO architecture]

IDO FUNCTIONAL MODULES

Modules: Hot Zone Identifier, Data Migrator, Request Distributor, Task Predictor, Data Reclaimer

● Hot Zone Identification

● Task Prediction

● Request Distribution

● Data Migration

● Data Reclamation

KEY DATA STRUCTURES

ZONE_TABLE

● Num

● Popularity

● Flag

D_MAP

● D_offset

● S_offset

● Len
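The two structures could be modelled roughly as follows. This is a minimal sketch: the field meanings are inferred from the names on the slide (D_offset as the offset on the degraded RAID set, S_offset as the offset on the surrogate RAID set), and the `translate` helper and the initial flag value `00` are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ZoneEntry:
    """One row of ZONE_TABLE."""
    num: int             # Num: zone number
    popularity: int = 0  # Popularity: how often the zone has been accessed
    flag: int = 0b00     # Flag: migration state (initial value 00 is assumed)


@dataclass
class DMapEntry:
    """One row of D_MAP (field meanings are inferred from their names)."""
    d_offset: int  # D_offset: offset on the degraded RAID set
    s_offset: int  # S_offset: offset on the surrogate RAID set
    length: int    # Len: length of the mapped extent

    def translate(self, offset: int) -> Optional[int]:
        """Return the surrogate offset for `offset`, or None if this entry does not cover it."""
        if self.d_offset <= offset < self.d_offset + self.length:
            return self.s_offset + (offset - self.d_offset)
        return None


entry = DMapEntry(d_offset=100, s_offset=0, length=50)
print(entry.translate(120))  # offset 120 lies 20 units into the mapped extent
```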

HOT DATA IDENTIFICATION

THREE DESIGN ISSUES

● By exploiting the spatial locality of workloads

● By exploiting the temporal locality of requests

● By implementing intelligent modules and data structures
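One way to combine the two kinds of locality is to count accesses per zone (spatial) and periodically age the counts so that recent accesses weigh more (temporal). The sketch below assumes this approach; the class name, the `decay` factor, and the zone size are hypothetical and not taken from the slides.

```python
from collections import defaultdict


class HotZoneIdentifier:
    """Tracks per-zone popularity and reports the hottest zones (illustrative only)."""

    def __init__(self, zone_size=1024, decay=0.5):
        self.zone_size = zone_size   # blocks per zone (assumed value)
        self.decay = decay           # aging factor applied by age() (assumed)
        self.popularity = defaultdict(float)

    def record(self, offset):
        """Spatial locality: every access is credited to the zone that contains it."""
        self.popularity[offset // self.zone_size] += 1.0

    def age(self):
        """Temporal locality: scale down old counts so recent accesses dominate."""
        for zone in self.popularity:
            self.popularity[zone] *= self.decay

    def hot_zones(self, top_k):
        """Return up to top_k zone numbers, most popular first (ties broken by zone number)."""
        ranked = sorted(self.popularity, key=lambda z: (-self.popularity[z], z))
        return ranked[:top_k]


ident = HotZoneIdentifier(zone_size=1024)
for off in [0, 10, 20, 5000, 2048, 2050, 30]:
    ident.record(off)
print(ident.hot_zones(2))
```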

PROACTIVE DATA MIGRATION

● Hot zone identified

● Task Predictor detects task

● Data Migrated

● Flag set to 01

● RAID Reconstructed

● Flag set to 10

● RAID Reclaimed

● Corresponding D_map deleted
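The sequence above reads like a small state machine driven by the zone flag. The following sketch assumes that reading: the initial flag `00`, the reset to `00` after reclamation, and a D_map keyed by zone number are simplifying assumptions, and all function names are hypothetical.

```python
NOT_MIGRATED = 0b00   # assumed initial state, not stated on the slide
MIGRATED = 0b01       # set after the hot zone's data has been migrated
RECONSTRUCTED = 0b10  # set after RAID reconstruction has finished


class Zone:
    def __init__(self, num):
        self.num = num
        self.flag = NOT_MIGRATED


def migrate(zone, d_map, surrogate_offset):
    """Copy a hot zone to the surrogate RAID set before the background task starts."""
    d_map[zone.num] = surrogate_offset
    zone.flag = MIGRATED


def mark_reconstructed(zone):
    """Record that reconstruction has completed for a migrated zone."""
    if zone.flag != MIGRATED:
        raise ValueError("zone was not migrated")
    zone.flag = RECONSTRUCTED


def reclaim(zone, d_map):
    """Reclaim the zone and delete its D_map entry."""
    if zone.flag != RECONSTRUCTED:
        raise ValueError("zone has not been reconstructed")
    del d_map[zone.num]
    zone.flag = NOT_MIGRATED  # assumption: the flag returns to its initial value


d_map = {}
zone = Zone(7)
migrate(zone, d_map, surrogate_offset=4096)
mark_reconstructed(zone)
reclaim(zone, d_map)
print(zone.flag, d_map)
```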

IMPROVED STORAGE AVAILABILITY FOR I/O

I/O read request

● IDO determines target data zone

● Read request issued to degraded/surrogate RAID set

● Popularity updated

● IDO checks D_map

I/O write request

● Checks D_map for write request hits

● D_map updated

● Sequentially written to surrogate RAID set
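The read and write paths above might be sketched as follows. This is an assumption-laden illustration: the class and method names are hypothetical, the D_map is kept at single-block granularity, the D_map lookup is placed before the read is issued, and a write hit is assumed to overwrite the already mapped surrogate location.

```python
class RequestDistributor:
    """Routes user I/O to the degraded or surrogate RAID set (illustrative only)."""

    def __init__(self, zone_size=1024):
        self.zone_size = zone_size
        self.popularity = {}     # zone number -> access count
        self.d_map = {}          # original offset -> surrogate offset
        self.next_surrogate = 0  # append pointer for sequential surrogate writes

    def read(self, offset):
        """Update zone popularity, then serve from the surrogate set on a D_map hit."""
        zone = offset // self.zone_size
        self.popularity[zone] = self.popularity.get(zone, 0) + 1
        if offset in self.d_map:
            return ("surrogate", self.d_map[offset])
        return ("degraded", offset)

    def write(self, offset):
        """Redirect the write to the surrogate set, allocating sequentially on a miss."""
        if offset not in self.d_map:
            self.d_map[offset] = self.next_surrogate
            self.next_surrogate += 1
        return ("surrogate", self.d_map[offset])


dist = RequestDistributor()
print(dist.read(5))     # not yet redirected
print(dist.write(5))    # first write allocates a surrogate slot
print(dist.read(5))     # later reads follow the D_map
```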

DATA CONSISTENCY

TWO ASPECTS CONSIDERED

● Key data structures

● Redirected write data on surrogate RAID

ANALYSIS

Overhead Analysis

● Performance Overhead

● Memory Overhead
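The memory overhead comes from keeping the zone table and the D_map in memory, so a back-of-the-envelope estimate is just entry count times entry size. The sketch below shows that arithmetic only; the capacity, zone size, and entry sizes are hypothetical values chosen for illustration, not figures from the slides.

```python
def zone_table_bytes(capacity_bytes, zone_bytes, entry_bytes):
    """Memory for one zone-table entry per zone, rounding the zone count up."""
    num_zones = -(-capacity_bytes // zone_bytes)  # ceiling division
    return num_zones * entry_bytes


def d_map_bytes(num_entries, entry_bytes):
    """Memory for the D_map, one entry per redirected extent."""
    return num_entries * entry_bytes


# Hypothetical example: 1 TiB array, 1 MiB zones, 8-byte zone-table entries.
table = zone_table_bytes(2**40, 2**20, 8)
print(table // 2**20, "MiB for the zone table")
```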

CONCLUSION

● Proactive optimisation accelerates low priority background tasks

● Zone Based approach boosts the performance of low priority background tasks

● Designed and implemented a proactive zone based optimisation to outsource data

REFERENCES

● S. Wu, H. Jiang, D. Feng, L. Tian, and B. Mao. Proactive Data Migration for Improved Storage Availability in Large-Scale Data Centers. IEEE Transactions on Computers, 2015.

● S. Wu, H. Jiang, D. Feng, L. Tian, and B. Mao. Improving Availability of RAID-Structured Storage Systems by Workload Outsourcing. IEEE Transactions on Computers, 2011.

● S. Wu, B. Mao, D. Feng, and J. Chen. Availability-Aware Cache Management with Improved RAID Reconstruction Performance. In CSE’10, Dec. 2010.

● L. Xiang, Y. Xu, John C. S. Lui, and Q. Chang. Optimal Recovery of Single Disk Failure in RDP Code Storage Systems. In SIGMETRICS’10, Jun. 2010.

THANK YOU!!!
