Date posted: 12-Jan-2015
Category: Technology
Upload: interop-mumbai-2009
© Copyright 2009 EMC Corporation. All rights reserved.
Next Generation Backup and Recovery With Data Deduplication
Driving Down the Cost and Risk
Venkatesh K. Iyer
Head – India & SAARC, Backup, Recovery & Archival Solutions
Agenda
• Current Market Scenario
• Backup and Recovery Challenges
• Why is Data De-duplication so hot?
• What is Data Deduplication?
• Next Gen Backup and Recovery Architecture
2009 – Market Conditions
Challenging times with global economic recession in 2009
Economic environment is a leading indicator of tech spending
• 71% of CIOs anticipate flat or declining IT spending budgets
• IT budgets in developed countries set to decline by 5%
• TOTAL IT SPENDING IS THE LOWEST IN THE HISTORY OF THE SURVEY
• “The current environment has moved virtualisation toward the top of the priority list for CIOs”
• “TCO reductions will be a key driver of the acceleration in server virtualisation deployments as CIOs are forced to cut capital spending and rein in management, administrative, and power/cooling cost”
Source: Goldman Sachs IT Spending Survey, Nov 2008, and Merrill Lynch CIO Survey, Oct 2008
Industry Challenges & CIOs’ Concerns
• Current Economic Crisis
• Cut Operating Costs
• Continued Information Growth (10X growth over next 5 years, per EMC/IDC white paper)
Key Backup and Recovery Themes:
• ROI and TCO are #1 on CIO minds
• The data protection market continues to evolve:
– Operational Savings through Automation and Integration
– Improvements in Service Levels (RPOs / RTOs) and IT Compliance
– Decreasing Reliance on Tape through B2D and Data Deduplication
• Traditional data protection methodologies don’t map well to virtualized servers
A perfect storm is brewing for a fundamental re-architecture of data protection environments in organizations
Today’s Backup and Recovery Challenges
Massive Data Growth
Shift to Virtual
Compliance
Costs
Complexity
Hierarchy of Data Reduction Types for Backup
• Regular Storage Array: 1:1
• LZ Compression: ~2:1
• Single Instance Storage: ~3:1
• Fixed Block: ~3:1
• Data ‘Dedupe’: ~20:1 to 500:1
Technique labels shown in the figure: File Level; Fixed Blocks, Snapshots; Variable Segments; Whitespace Reduction
Data Deduplication Significantly Reduces: Power, Heat, Cooling, Management, Bandwidth
Gartner Dedupe Prediction: The Market is HUGE
• By 2012, deduplication will be applied to 75% of backups
• Key Findings:
– Production deployments of deduplication for backups have progressed at an unusually high rate for such a recent technology; however, Gartner estimates that less than 5% of backups today use deduplication techniques.
• Market Implications:
– Gartner views this technology as transformational because it radically decreases the economics of disk-based backup and recovery………too compelling to ignore.
• Recommendations:
– There are several different implementations of deduplication, and some vendors have only recently released this technology and have a few dozen customers, others have been shipping it for several years and have more than 1,000 customers.
– …… ensure that your organization is comfortable with the robustness and maturity of the vendor's approach.
• Analysis by: Dave Russell
Why So Much Interest in Data Deduplication?
• Backup & Archive processes have been overwhelmed by information growth
• Primary storage efficiency has become a necessity to cope with massive growth
• ROI drives the compelling appeal of dedupe:
– Reduced Storage Capacities
– Lower Infrastructure Costs
– Improved SLAs
– Efficient Replication for DR
Survey results:
• Deduplication as one of the top 10 technology considerations: 59% rate it very important
• Deploying deduplication: 24% in use; 55% evaluating or in near- to long-term plan; 21% not in plan
Source: TheInfoPro Wave 11 Storage Study, 2008
Why so much Interest in Data De-Duplication?
• Data De-duplication – One of the hottest emerging segments within the storage and data protection market – Why?
– Network Bandwidth utilisation – Efficiently Move Data
– Massive reduction in Storage requirements – Efficiently Store Data
– Security – Data protection in transit
– Improving efficiencies in virtualised environments
• The market is under duress: backup and restore have not kept pace with enterprise growth
• Companies are looking to protect more data: increasing desktop volumes, mobile employees, remote offices, and data growth of circa 50%+
• Data is retained at the back end for longer periods, for internal reasons or external regulations, creating a need to archive
• Tape is not ideal for backup and restore; the industry is moving towards backup-to-disk
• De-duplication market opportunity: $1B by 2009
• It’s here to stay; it’s based on compression, which has been around for 20+ years
Deduplication 101
• Dedupe - storing only unique ‘chunks’ of data (blocks, objects, files)
– Uses identification & comparison algorithms, content addressing, indexing or cataloging
– Unique “chunks” are reconstituted to original format from the de-duplicated state
• Compression
– Minimizes empty space within files, but does not eliminate redundant data
– Compression is employed in conjunction with other dedupe processes
[Figure: Data Sets 1, 2 and 3 passing through de-duplication]
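The two ideas on this slide, storing only unique chunks under a content address and reconstituting the original from them, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any EMC product is implemented: the `ChunkStore` class, the 8-byte fixed chunk size, the choice of SHA-256 as the content address, and the sample data sets are all hypothetical. Compression is applied to each unique chunk only to show it working in conjunction with dedupe; at this toy size it does not save space.

```python
import hashlib
import zlib

CHUNK_SIZE = 8  # bytes; tiny on purpose so the example is easy to follow (assumption)


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept once, keyed by its hash."""

    def __init__(self):
        self.chunks = {}  # SHA-256 digest -> zlib-compressed chunk

    def put(self, data):
        """Split data into fixed-size chunks, store unseen ones, return the recipe."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            key = hashlib.sha256(chunk).digest()
            if key not in self.chunks:  # identification and comparison via the index
                self.chunks[key] = zlib.compress(chunk)
            recipe.append(key)
        return recipe

    def get(self, recipe):
        """Reconstitute the original bytes from the de-duplicated state."""
        return b"".join(zlib.decompress(self.chunks[key]) for key in recipe)


# Hypothetical data sets: the first two are identical, the third differs in one chunk.
data_sets = [b"invoice-2009-01-", b"invoice-2009-01-", b"invoice-2009-02-"]

store = ChunkStore()
recipes = [store.put(data) for data in data_sets]
restored = [store.get(recipe) for recipe in recipes]

print("chunks written by the three data sets:", sum(len(r) for r in recipes))
print("unique chunks actually stored:", len(store.chunks))
```

Each data set is two chunks long, so six chunks are presented to the store but only three distinct ones are kept, and every data set can still be rebuilt from its recipe.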
Target- and Source-based Data De-duplication
De-duplication at Target (EMC Disk Library)
• Moves ~200 percent of primary data weekly
• Up to 50 times reduction in backup storage
• Backups are typically restored from full and incremental images
• De-dupe device viewed as file system and/or virtual tape library target for traditional backup software
De-duplication at Source (EMC Avamar)
• Moves ~2 percent of primary data weekly
• Up to 50 times reduction in backup storage
• Up to 500 times less daily network impact
• Up to 10 times faster daily full backups
• Fast, daily full backups, single-step recovery
• Next-generation backup and recovery
There are strong use cases for both technologies, but only source-based de-duplication reduces daily network bandwidth requirements and decreases client resource utilization during backups.
Data De-Duplication: How it Works
Only unique data segments are backed up
• First Instance (segments A B C D): unique data stored on disk, available for immediate recovery
• Duplicate Instance (segments A B C D): data already backed up, so only a unique ID pointer is stored (20 bytes)
• Modified Instance (segments B C D E): new data segment E identified and backed up
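The first, duplicate and modified instances on this slide can be walked through in a short Python sketch. It is an illustration under assumptions, not the product's algorithm: the slide does not name a hash function, and SHA-1 is used here only because its digest happens to be 20 bytes, matching the pointer size quoted on the slide. The `back_up` helper and the segment payloads are hypothetical.

```python
import hashlib

store = {}  # 20-byte segment ID -> segment data already on disk


def back_up(segments):
    """Back up (name, payload) pairs; return (names stored as new data, names stored as pointers)."""
    new_data, pointers = [], []
    for name, payload in segments:
        segment_id = hashlib.sha1(payload).digest()  # 20-byte ID (assumed hash, see lead-in)
        if segment_id in store:
            pointers.append(name)        # already backed up: keep only the ID pointer
        else:
            store[segment_id] = payload  # unique segment: write the data itself
            new_data.append(name)
    return new_data, pointers


# Hypothetical payloads for the five segments named on the slide.
payload = {name: (name * 4).encode() for name in "ABCDE"}

first_instance = [(n, payload[n]) for n in "ABCD"]
duplicate_instance = [(n, payload[n]) for n in "ABCD"]
modified_instance = [(n, payload[n]) for n in "BCDE"]

result_first = back_up(first_instance)
result_duplicate = back_up(duplicate_instance)
result_modified = back_up(modified_instance)

for label, (new_data, pointers) in [("first", result_first),
                                    ("duplicate", result_duplicate),
                                    ("modified", result_modified)]:
    print(label, "new:", new_data, "pointers:", pointers)
```

The first instance writes all four segments, the duplicate writes none, and the modified instance writes only segment E while B, C and D are recorded as pointers.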
Potential Impact of Data De-duplication on a Backup
Backup scheme | RAW Data | Total Capacity Stored Over 12 Weeks
Daily incremental, weekly full | 3 MB | 12 Fulls = 36 MB
Daily full | 3 MB | 84 Fulls = 252 MB
De-duplicated backup | 3 MB | 84 Fulls = 1.25 MB
Data de-duplication reduces Backup to Disk capacity requirements
File 1 = 1 MB: segments A B C D
File 2 = 1 MB: segments A B C D
File 3 = 1 MB: segments B C D E
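The figures on this slide can be re-derived with a little arithmetic, sketched here in Python. The sketch relies on assumptions the slide implies but does not state: each 1 MB file is made of four equal segments, the three files do not change over the 12 weeks, the incrementals in the first scheme are not counted (the slide counts only the 12 fulls), and pointer overhead is ignored. The variable names are illustrative.

```python
FILE_SIZE_MB = 1.0
SEGMENTS_PER_FILE = 4  # assumption: four equal segments per 1 MB file
WEEKS = 12

# Segment layout of the three files as drawn on the slide.
files = {"File 1": "ABCD", "File 2": "ABCD", "File 3": "BCDE"}

segment_mb = FILE_SIZE_MB / SEGMENTS_PER_FILE  # 0.25 MB per segment
raw_mb = FILE_SIZE_MB * len(files)             # 3 MB of raw data

weekly_full_mb = raw_mb * WEEKS                # 12 fulls
daily_full_mb = raw_mb * WEEKS * 7             # 84 fulls

# With de-duplication only the distinct segments A, B, C, D, E are ever stored,
# however many fulls are taken (assuming the data does not change).
unique_segments = set("".join(files.values()))
dedup_mb = len(unique_segments) * segment_mb

print("weekly full:", weekly_full_mb, "MB")
print("daily full:", daily_full_mb, "MB")
print("de-duplicated:", dedup_mb, "MB")
print("reduction vs daily full: about", round(daily_full_mb / dedup_mb), ": 1")
```

Five unique segments of 0.25 MB give the 1.25 MB shown for the de-duplicated backup, against 36 MB and 252 MB for the two traditional schemes.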
Backup and Recovery Use Cases
Next Generation Backup and Archive
Source (Avamar)
• Virtualized Environments: Relieves backup bottlenecks, enables greater server consolidation ratios
• Remote / Branch Offices: Protects ROBOs with highest WAN efficiency and with consistent DC policies
• Edge Devices: Protects enterprise desktops / laptops with low device overhead
Target (Data Domain + DL4000)
• 3rd Party Backup: Heterogeneous target for existing backup applications
• Datacenter NAS / SAN: Enterprise infrastructure support
• High Transaction Apps: High-change rate, large data sets
Next Generation Backup, Recovery and Archive - Take the Steps
Better Protection and Compliance. Less Cost.
Best-Practice Business Benefits
Archive
• Reduce the size of backup
• Free valuable primary storage capacity
• Assure compliance, remove exposure
• Reduce eDiscovery expenses
Backup
• Reduce time, bandwidth and infrastructure
• Streamline D/R operations, infrastructure
• Expedite disaster recovery
• Eliminate remote office backup infrastructure
Manage
• Expedite application recovery
• Reduce backup management overhead
• Streamline problem detection, resolution
• Lower backup management costs
Assess
• Avoid time and expense of developing expertise
• Identify maximum investment/benefit strategies
Alignment Attributes
Backup Service Tiering / Catalogue
Specification | Tier 1 | Tier 2 | Tier 3
Product | RecoverPoint | EDL | Avamar
Proposed technology | CDP to Disk | Backup to disk | Source Dedup to disk
Deduplication | None | Optional at Target | Integrated at Source
Backup Time | Instant | Normal | Fastest
LAN/CPU/DISK Impact | None | HIGH | Minimal
Retention on Disk (Typical) | < 2 Days | 4 Weeks | Weeks / Months
Data shredding compliance | No | Yes | Yes
Verify Quality of Backup Data automatically | None | Protocol | Daily
Ability to encrypt backup data | None | None | Integrated
How long for a replicated copy | Real Time | 24-48 hours | 30-90 Minutes
Amount of data loss (Disaster Recovery Pt Obj, RPO) | Last Transaction | >= 24 hours | < 24 hours
Ability to recover data | 100% | 100% | 100%
Length of time data is retained on disk | 2 days | 3 Weeks | 12 Weeks
Best use Case | DB | DB/Emails | Files/NAS/VMware/RO
Cost per TB | K / TB | K / TB | K / TB
Rows with values in extraction order (column assignment not recoverable):
• Amount of data loss (Operational Recovery Pt Obj): Last Transaction; Last Backup; Last Backup; Last Backup
• 25% Data Restore: < 2 hours; < 24 hours; < 24 hours; < 48 hours
• Backup Performance: High; Highest
Specification groups named on the slide: Architecture Considerations / Impact; Data Integrity Checks; Encryption; Retention & Disposition; Replication for DR; Operational Recovery Pt Obj; Recoverability; Retention period; Backup Performance; Concept; Disaster Recovery Pt Obj (RPO); Disaster Recovery Time Obj (RTO)
Remaining values for the Tape (LTO4 Tape) and Archive (Centera, Long term Retention) columns, in extraction order: None; High; Tape; None; Medium; 1-2 days; 95%; N/A; Years; No; 2-3 Days; OS; HIGH; Medium; None; Optional; Archive; Disk; Integrated; NA; 0; 100%; Years; Last Replication; Years; K / TB; Minimal; Yes; Daily; 30-90 Min.; NA; Application
Next Generation Data Protection Architecture
[Diagram labels: CDP; Archive; Source De-Duplication; Mail archive & retrieval; Database Archive; 70-80% data reduction through shortcutting; Centralized Data Protection Management; Source De-Duplicated Backup; SAN Backup; Typically 80% of Data; Typically 20% of Data; Professional Services]
Next Generation Data Protection with EMC
[Diagram labels: EmailXtender / DiskXtender + Centera; Source De-Duplication; Mail archive & retrieval; Database Backup; 70-80% data reduction through shortcutting; EMC DPA; EMC | Avamar; EMC | Networker; EMC | Recoverpoint; Typically 80% of Data; Typically 20% of Data; Avamar; Centera; Tape; EDL; Data Domain; RecoverPoint; EMC Professional Services]
Email: [email protected]