Date posted: 14-Jun-2015
Category: Technology
Uploaded by: mainlines-knowledge-center
How to Know if Virtual Tape Technology is Right for You
Chris Dedham
Business Continuity Specialist and Senior Storage Solutions Architect
Mainline Information Systems
Review of Backup Topologies - Backup directly to Tape
[Diagram: Database, Host, Disk, Tape. The disk storage contains the production data. Data is backed up directly to tape. Tape duplication for DR.]
Review of Backup Topologies - Backup directly to disk, then moved to tape [D2D2T]
[Diagram: Database, Host, Disk, Tape. The disk storage contains the production data. Data is backed up directly to disk. Data is flushed from disk to tape. Tape duplication for DR.]
Review of Backup Topologies - Backup directly to disk, and stored on disk [D2D2T]
[Diagram: Database, Host, Disk, Disk or VTL, Tape. The disk storage contains the production data. Data is backed up directly to disk and stored on disk. Tape copy for DR and archiving.]
Review of Backup Topologies - Backup directly to VTL, and replicated to VTL
[Diagram: Database, Host, Disk, VTL, VTL. The disk storage contains the production data. Data is backed up directly to disk on a VTL, and stored on the VTL. De-duplicate the data and replicate it to another VTL over the WAN. “Truly Tape Less”]
Review of Backup Topologies - Backup directly to VTL, and replicated to VTL
[Diagram: Database, Host, Disk, VTL, VTL, Tape. The disk storage contains the production data. Data is backed up directly to disk on a VTL, and stored on the VTL. De-duplicate the data and replicate it to another VTL over the WAN. “Truly Tape Less”. Tape for archiving.]
Why are customers interested in VTLs?
• A desire to get away from real tape
  – Tape drive and tape automation break-fix issues
  – Eliminate tape handling (operational / security)
  – Doing something different than the status quo
• Increase backup throughput
  – More tape drives (aka mount points) for greater parallelism
  – LAN-free to disk
  – Helps with small files and NDMP
  – Less administrative overhead than large disk pools
Why are customers interested in VTLs? (cont)
• Faster restores
  – No tape mount and load times (disk seeks instead)
  – Tape volume fragmentation is greatly reduced
• Increase TSM admin processing throughput
  – Reduced disk pool migrations
  – Faster DR copies to real tape
  – Faster reclamation processing
  – Additional drives (aka mount points)
Why are customers interested in VTLs? (cont)
• Replication for Disaster Recovery
  – Tape handling is costly operationally and can create security issues
  – Data De-Duplication can greatly reduce the amount of data that must be transferred over the WAN
  – WAN costs can be 50% to 70% of the TCO of a DR solution
• Data De-Duplication can reduce the cost of disk
  – Making 30 TB of disk look like 300 TB
Some facts about VTLs, however
• Real tape can be faster than virtual tape
  – High-speed tape can be faster than virtual tape, depending on the workload
  – Real tape can store data at a lower cost per GB
  – Storing inactive data on real tape can have better environmental characteristics, such as lower power and cooling needs
• Virtual tape may require “straight disk” and real tape to handle all of the workload
The special sauce in VTLs is intelligent compression (aka Data De-Duplication)
• All VTLs can do standard LZ compression like in tape drives
  – LZ compression reduces space within files
  – Typically a 2:1 compression ratio
• Some VTLs have Data De-Duplication
  – Data De-Dupe eliminates redundant data, such as repeated copies of the same files
  – 7:1 or greater De-Duplication ratios can be obtained
“Your mileage may vary”
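To make the difference between the two techniques concrete, here is a minimal Python sketch. It uses `zlib` (DEFLATE, an LZ77-based scheme) as a stand-in for tape-drive LZ compression, and whole-file SHA-256 comparison as a stand-in for de-duplication. Both choices, the seven-backup scenario, and the sample data are assumptions for illustration, not a description of any particular VTL.

```python
import hashlib
import random
import zlib

rng = random.Random(0)

# One "file" of incompressible bytes, standing in for already-compressed data.
payload = bytes(rng.getrandbits(8) for _ in range(100_000))

# Seven nightly backups of the same unchanged file (assumed scenario).
backups = [payload] * 7
logical_size = sum(len(b) for b in backups)

# LZ-style compression works inside each file, so every copy is
# compressed on its own and all seven results are stored.
lz_size = sum(len(zlib.compress(b)) for b in backups)

# De-duplication keeps a single copy per unique content signature.
unique = {hashlib.sha256(b).digest(): b for b in backups}
dedup_size = sum(len(b) for b in unique.values())

lz_ratio = logical_size / lz_size
dedup_ratio = logical_size / dedup_size
print(f"LZ only: {lz_ratio:.2f}:1, de-dup: {dedup_ratio:.0f}:1")
```

The 7:1 result here comes purely from the contrived input (seven identical copies). Real ratios depend on how much the data changes between backups.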
There are two methods for Data De-Duplication
• In-Line Data De-Duplication
  – The incoming backup data is de-duped before it lands on the VTL storage
  – The IBM TS7650G (Diligent) and Data Domain use this method
• Post-Processing Data De-Duplication
  – The incoming backup data is de-duped after it lands on the VTL storage
  – The FalconStor VTL and Sepaton VTL use this method
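The two methods differ mainly in when the de-dupe work happens and how much disk the backup touches first. The following Python sketch illustrates that under simplifying assumptions: fixed 4-byte chunks, SHA-256 signatures, and hypothetical function names (`ingest_inline`, `ingest_post_process`, `post_process`). None of it reflects how the named products are actually implemented.

```python
import hashlib

def chunks(data: bytes, size: int = 4):
    """Split data into fixed-size chunks (size chosen only for the demo)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def ingest_inline(stream: bytes, store: dict) -> int:
    """De-dupe before data lands; return bytes written to disk."""
    written = 0
    for chunk in chunks(stream):
        key = hashlib.sha256(chunk).digest()
        if key not in store:
            store[key] = chunk
            written += len(chunk)
    return written

def ingest_post_process(stream: bytes, landing: list) -> int:
    """Land the raw backup first; return bytes written to disk."""
    landing.append(stream)
    return len(stream)

def post_process(landing: list, store: dict) -> int:
    """Later pass: de-dupe landed data, then free the landing area.

    Returns the number of bytes reclaimed.
    """
    landed = sum(len(s) for s in landing)
    kept = sum(ingest_inline(s, store) for s in landing)
    landing.clear()
    return landed - kept

backup = b"AAAABBBBAAAACCCCAAAABBBB"  # 6 chunks, 3 of them unique

inline_store = {}
inline_written = ingest_inline(backup, inline_store)

landing, post_store = [], {}
post_written = ingest_post_process(backup, landing)
reclaimed = post_process(landing, post_store)

print(inline_written, post_written, reclaimed)
```

Both paths end with the same stored chunks. In this sketch the in-line path does its hashing during the backup, while the post-processing path needs landing capacity for the full raw backup until the later pass runs.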
Three Basic Approaches
Talked about today in the industry:
1. Hash-based de-duplication
  – Sometimes referred to as a Content Addressable Storage approach
2. Content aware
  – Assumes the best candidate to de-dupe against is an object with the same properties (name, etc.)
3. HyperFactor
  – A different approach based on an agnostic view of data
How Hash Data De-duplication (DDD) Works
1. Data chunks are evaluated to determine a unique signature for each
2. Signature values are compared to identify all duplicates
3. Duplicate data chunks are replaced with pointers to a single stored chunk, saving storage space
[Diagram: a stream of chunks labeled A, B and C, with repeats, is written to the Data Store over several frames; in the final frame the repeated chunks appear in lowercase.]
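The three steps of hash-based de-duplication can be sketched in a few lines of Python. This is an illustrative sketch only: the use of SHA-256, the in-memory dictionary as the data store, and the function names `deduplicate` and `restore` are assumptions, and the chunk sequence is loosely modeled on the A/B/C blocks in the slide diagram.

```python
import hashlib

def deduplicate(chunks):
    """Return (data_store, pointers) for a sequence of byte chunks."""
    data_store = {}   # signature -> the single stored copy of a chunk
    pointers = []     # one signature per incoming chunk, in arrival order
    for chunk in chunks:
        # Step 1: compute a signature for the chunk.
        signature = hashlib.sha256(chunk).hexdigest()
        # Step 2: compare against signatures already seen.
        if signature not in data_store:
            data_store[signature] = chunk
        # Step 3: record a pointer instead of storing the chunk again.
        pointers.append(signature)
    return data_store, pointers

def restore(data_store, pointers):
    """Rebuild the original chunk sequence by following the pointers."""
    return [data_store[p] for p in pointers]

incoming = [b"C", b"A", b"B", b"C", b"A", b"A", b"B", b"B", b"A"]
data_store, pointers = deduplicate(incoming)

print(len(incoming), "chunks in,", len(data_store), "chunks stored")
```

Nine chunks arrive but only three are stored, and `restore(data_store, pointers)` returns the original sequence.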
Identification of Redundant Chunks
• A unique identifier is determined for each chunk
• Identifiers are typically calculated using a hash function that outputs a digest based on the data in each chunk
  – MD5
  – SHA
• For each chunk, the identifier is compared against an index of identifiers to determine whether that chunk is already in the data store
• Selection of hash function involves tradeoffs between
  – Processing time to compute hash values
  – Index space required to store hash values
  – Risk of false matches
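A rough way to see these three tradeoffs side by side is to measure them for a few hash functions from Python's `hashlib`. The 8 KiB chunk size and 100 TB of unique data are assumed figures chosen for illustration. The index estimate counts digest bytes only, ignoring pointers and other overhead. The false-match figure is the standard birthday-bound approximation for accidental collisions, and it assumes the hash behaves like a random function.

```python
import hashlib
import time

CHUNK_SIZE = 8 * 1024        # assumed chunk size: 8 KiB
STORE_BYTES = 100 * 10**12   # assumed unique data in the store: 100 TB

chunk = bytes(CHUNK_SIZE)
num_chunks = STORE_BYTES // CHUNK_SIZE

report = {}
for name in ("md5", "sha1", "sha256"):
    digest_bytes = hashlib.new(name).digest_size

    # Processing time: hash 2,000 chunks and time it.
    start = time.perf_counter()
    for _ in range(2_000):
        hashlib.new(name, chunk).digest()
    elapsed = time.perf_counter() - start

    report[name] = {
        "digest_bytes": digest_bytes,
        # Index space: one digest per stored chunk.
        "index_gb": num_chunks * digest_bytes / 10**9,
        # Risk of false matches: birthday bound, about n^2 / 2^(bits + 1).
        "collision_bound": num_chunks**2 / 2 ** (8 * digest_bytes + 1),
        "seconds_per_2000_chunks": elapsed,
    }

for name, row in report.items():
    print(name, row)
```

Longer digests cost more index space and usually more CPU time, but lower the chance of two different chunks sharing an identifier. The bound does not cover deliberately crafted collisions, which are known to exist for MD5 and SHA-1.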
Deduplication Ratios
• Used to indicate compression achieved by deduplication
• If deduplication reduces 500 TB of data to 100 TB, the ratio is 5:1
• Deduplication vendors claim ratios in the range 20:1 to 400:1
• Ratios reflect design tradeoffs involving performance and compression
• Actual compression ratios will be highly dependent on other variables
  – Data from each source: redundancy, change rate, retention
  – Number of data sources and redundancy of data among those sources
  – Backup methodology: incremental forever, full+incremental, full+differential
  – Whether data encryption occurs prior to deduplication
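The ratio arithmetic is simple enough to check directly. The short Python sketch below reproduces the 500 TB to 100 TB example and converts the ratios quoted above into the fraction of disk space saved. The helper names `dedup_ratio` and `space_saved` are hypothetical.

```python
def dedup_ratio(logical_tb: float, stored_tb: float) -> float:
    """Ratio of data backed up to data actually stored."""
    return logical_tb / stored_tb

def space_saved(ratio: float) -> float:
    """Fraction of disk space saved at a given N:1 ratio."""
    return 1 - 1 / ratio

# The example from the slide: 500 TB reduced to 100 TB is 5:1.
assert dedup_ratio(500, 100) == 5.0

for ratio in (5, 20, 400):
    print(f"{ratio}:1 -> {space_saved(ratio):.2%} saved")
# 5:1 -> 80.00% saved
# 20:1 -> 95.00% saved
# 400:1 -> 99.75% saved
```

Going from 20:1 to 400:1 adds less than five percentage points of space savings, which is worth keeping in mind when comparing vendor claims.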
Data deduplication is the key to using more disk more cost-effectively!
Thank you
Knowledge is POWER at: Mainline’s Knowledge Center
www.mainline.com/kc
866.490.MAIN (6246)