© 2013 IBM Corporation
Replication for Business Continuity, Disaster Recovery and High Availability
Tony Pearson – IBM Master Inventor and Senior Managing Consultant
March 2013
Everyone Knows: Downtime is Bad!
• Lost brand equity
• Loss of goodwill and trust
• Lost loyalty
• Lost revenue and market share
• Lost productivity
• Causes:
2013: continued acceleration of changes in today’s business world….
[Figure: five stages of collaboration plotted against trust, among core business, subsidiary/JV, customer, partner/channel and supplier/outsourcer.]
1. Isolated Operations
2. Select ‘Trusted Partners’
3. Extended Value Chain
4. Industry-Centric Value Web
5. Cross-Industry Value Coalition
Metcalfe’s Law: the value of a network increases in proportion to the square of the number of people on it
The “Business Process” is the Unit of Recovery
[Figure: business processes A through G (business layer) depend on Applications 1, 2 and 3 (application layer) and on the infrastructure layer. Labels shown: analytics report, management reports, decision point, http://xyz.xml, MQseries, WebSphere, SQL, db2.]
1. An error occurs on a storage device that correspondingly corrupts a database
2. The error impacts the ability of two or more applications to share critical data
3. The loss of both applications affects two distinctly different business processes
IT Business Continuity must recover at the business process level
Overlap of valid data protection techniques
IT Data Protection combines three overlapping techniques:
1. High Availability: fault-tolerant, failure-resistant streamlined infrastructure with affordable cost foundation
2. Continuous Operations: non-disruptive backups and system maintenance coupled with continuous availability of applications
3. Disaster Recovery: protection against unplanned outages such as disasters through reliable, predictable recovery
Goals:
• Protection of critical business data
• Operations continue after a disaster
• Costs are predictable and manageable
• Recovery is predictable and reliable
Timeline of a Disaster Recovery
[Figure: timeline from production, through the outage, to resumed production at the recovery site. Layers recovered: management control, physical facilities, telecom network, operating system, data, applications.]
• Recovery Point Objective (RPO): how much data must be recreated?
• Assess
• Execute hardware, O/S, data integrity recovery (operations staff, network staff). This marks the Recovery Time Objective (RTO) of hardware data integrity.
• Software transaction integrity recovery (applications staff). This marks the Recovery Time Objective (RTO) of transaction integrity.
• Now we're done!
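The quantities on this timeline can be worked through with a small sketch. It assumes both recovery times are measured from the moment of the outage, which is one reading of the figure; the function name and all timestamps are hypothetical and only illustrate the arithmetic.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_recovery_point, outage, data_recovered, transactions_recovered):
    """Return (data loss window, hardware/data recovery time, transaction recovery time).

    Assumption: both recovery times are measured from the outage itself.
    """
    if not (last_recovery_point <= outage <= data_recovered <= transactions_recovered):
        raise ValueError("timestamps must follow the order of the timeline")
    data_loss_window = outage - last_recovery_point      # compare against the RPO
    hardware_data_time = data_recovered - outage         # compare against the RTO of hardware data integrity
    transaction_time = transactions_recovered - outage   # compare against the RTO of transaction integrity
    return data_loss_window, hardware_data_time, transaction_time


# Hypothetical incident: last good copy at 02:00, outage at 09:30,
# data restored at 15:30, applications consistent again at 19:00.
loss, hw_time, tx_time = recovery_metrics(
    datetime(2013, 3, 1, 2, 0),
    datetime(2013, 3, 1, 9, 30),
    datetime(2013, 3, 1, 15, 30),
    datetime(2013, 3, 1, 19, 0),
)
print("Data to recreate covers:", loss)
print("Hardware/data integrity recovered after:", hw_time)
print("Transaction integrity recovered after:", tx_time)
```

In this hypothetical case, 7.5 hours of data must be recreated, data integrity is restored 6 hours after the outage, and the applications staff finish 9.5 hours after the outage.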
Technology drives the Recovery Point Objective (RPO)
[Figure: Recovery Point and Recovery Time scales, each marked in seconds, minutes, hours, days and weeks.]
For example:
• Synchronous replication / HA
• Asynchronous replication
• Periodic replication
• Tape backup
Automation drives Recovery Time Objective (RTO)
• Recovery Time includes:
  – Fault detection
  – Recovering data
  – Bringing applications back online
  – Network access
[Figure: Recovery Point and Recovery Time scales, each marked in seconds, minutes, hours, days and weeks.]
For example:
• End to end automated clustering
• Storage automation
• Manual tape restore
Business Continuity Tiers
Balancing recovery time objective with cost / value
[Figure: cost / value plotted against Recovery Time Objective (guidelines only): 15 min., 1-4 hr., 4-8 hr., 8-12 hr., 12-16 hr., 24 hr., days. The chart spans recovery from a disk image through recovery from tape copy.]
BC Tier 7 – Server or storage replication with end-to-end automated server recovery
BC Tier 6 – Real-time continuous data replication, server or storage
BC Tier 5 – Application/database integration to Backup/Restore
BC Tier 4 – Point in Time replication to Backup/Restore
BC Tier 3 – VTL, Data De-Dup, Remote vault
BC Tier 2 – Tape libraries + Automation
BC Tier 1 – Restore from Tape
Ideal World for High Availability and Business Continuity (HA/BC)
[Figure: HA/BC program lifecycle. Phases shown: Business Prioritization, Integration into IT, Strategy Design, Manage.]
• Risk assessment: risks, vulnerabilities and threats
• Business impact analysis: impacts of outage
• RTO/RPO
• Program assessment: current capability
  – Maturity Model
  – Measure ROI
  – Roadmap for Program
• Program design: crisis team, business resumption, disaster recovery, high availability
• Implement
• Program validation: estimated recovery time
• Resilience Program Management: Awareness, Regular Validation, Change Management, Quarterly Management Briefings
Scope: 1. People 2. Processes 3. Plans 4. Strategies 5. Networks 6. Platforms 7. Facilities
High Availability design: Database and Software design; High Availability Servers; Storage, Data Replication
Business processes drive strategies and they are integral to the Continuity of Business Operations. A company cannot be resilient without having strategies for alternate workspace, staff members, call centers and communications channels.
Source: IBM STG, IBM Global Services
The role of the basic “Data Strategy” for HA/BC purposes
• Define major data types “good enough”
  – i.e. by major application, by business line…
  – An ongoing journey
• For each data type:
  – Usage
  – Performance and measurement
  – Security
  – Availability
  – Criticality
  – Organizational role
  – Who manages
  – What standards for this data
    • What type of storage deployed on
    • What database
    • What virtualization
• Be pragmatic
  – Create a basic, “good enough” data strategy for HA/BC purposes
• Acquire tools that help you know your data
Data Strategy Defined
[Figure: hierarchy of Business Strategies, IT Strategy, Data Strategy, Enterprise IT Architecture and IT Infrastructure, with People, Process, Structure, Data and Technology as elements.]
You have to know your data and have a basic strategy for it
A basic data strategy tells you how to categorize your data. It looks something like this (step by step):
[Figure: virtualized storage pyramid, from mission critical at the top to lower cost at the base.]
• Mission-critical data
  – Mission-critical data that is the highest priority data
  – Priority = uptime, with high value justification
• Subset of data that is either mission-critical or supports mission-critical
  – Data that supports business lines
  – Balanced priorities = uptime and cost/value
• Knowledge of user and application data
  – All data, whether active or not…
  – Which eventually needs to be archived, retained
  – Priority = cost
It is not easy to know and categorize your data, but it is the only possible foundation
Then, your basic data strategy allows you to scope your HA/BC, something like this:
[Figure: virtualized storage pyramid, from mission critical at the top to lower cost at the base.]
• Continuous Availability (CA)
  – Finally, create the mission-critical subset with highest level of recovery
  – RTO = near continuous, RPO = small as possible (Tier 7)
  – Priority = uptime, with high value justification
• Rapid Data Recovery (RDR)
  – Then create separate storage pools as required
  – RTO = minutes, to (approx. range): 2 to 6 hours
  – BC Tiers 4, 5 and 6
  – Balanced priorities = uptime and cost/value
• Backup/Restore (B/R)
  – Virtualize, optimize cost, lay recovery capability foundation
  – Provide universal 24 hour - 12 hour (approx.) recovery capability
  – Address requirements for archival, compliance, green energy
  – Priority = cost
Know and categorize your data: this is where virtualization is the enabler
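As a rough illustration of how the three segments above (CA, RDR, B/R) could be used to sort applications by their required recovery time, here is a minimal sketch. The cut-off values are assumptions chosen for the example, since the slide gives only approximate ranges, and the function and constant names are hypothetical.

```python
# Assumed cut-offs: the slide only gives approximate ranges, so these
# two thresholds are illustrative choices, not IBM guidance.
CA_MAX_MINUTES = 5          # "near continuous" (assumption)
RDR_MAX_MINUTES = 6 * 60    # upper end of the "2 to 6 hours" range


def recovery_segment(rto_minutes):
    """Map a required RTO in minutes to CA, RDR or B/R."""
    if rto_minutes < 0:
        raise ValueError("RTO cannot be negative")
    if rto_minutes <= CA_MAX_MINUTES:
        return "CA"    # Continuous Availability (Tier 7)
    if rto_minutes <= RDR_MAX_MINUTES:
        return "RDR"   # Rapid Data Recovery (Tiers 4, 5 and 6)
    return "B/R"       # Backup/Restore


# Hypothetical applications and their required RTOs in minutes.
requirements = {"order entry": 2, "data warehouse": 240, "file archive": 24 * 60}
for name, rto in requirements.items():
    print(f"{name}: {recovery_segment(rto)}")
```

A real classification would also weigh RPO, cost and business value, as the bullets above indicate; RTO alone is used here to keep the sketch short.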
Rule of Thumb for continuous replication bandwidth
• Rule of Thumb: every 1 TB of mirrored disk storage generates about this much MB/sec of writes:
  – OLTP: 1-2 MB/sec of write bandwidth
  – Sequential/batch: 6-7 MB/sec of write bandwidth
• Expect minimum 2.5x this to handle peaks
• Expect normal data compression to be about 2:1
• Example: you have 10 TB of disk to mirror:
  – OLTP: 10-20 MB/sec
  – Batch/sequential: 60-70 MB/sec
• ROT: one OC3 line = 15 MB/sec raw effective transfer rate
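The rule of thumb above can be turned into a small calculator. This is a sketch under one assumption: that the 2.5x peak factor and the 2:1 compression ratio are simply multiplied and divided into the average write rate, which the slide implies but does not state. The function and constant names are hypothetical.

```python
import math

# Figures taken from the rule of thumb: MB/sec of writes per mirrored TB.
WRITE_MB_PER_SEC_PER_TB = {"oltp": (1.0, 2.0), "batch": (6.0, 7.0)}
PEAK_FACTOR = 2.5          # "expect minimum 2.5x this to handle peaks"
COMPRESSION_RATIO = 2.0    # "normal data compression to be about 2:1"
OC3_MB_PER_SEC = 15.0      # "one OC3 line = 15 MB/sec"


def peak_link_bandwidth(mirrored_tb, workload):
    """Return (low, high) MB/sec needed on the link at peak, after compression.

    Assumption: peak factor and compression are applied as straight multipliers.
    """
    low, high = WRITE_MB_PER_SEC_PER_TB[workload]
    scale = mirrored_tb * PEAK_FACTOR / COMPRESSION_RATIO
    return low * scale, high * scale


def oc3_lines_needed(mb_per_sec):
    """Whole OC3 lines needed to carry the given rate."""
    return math.ceil(mb_per_sec / OC3_MB_PER_SEC)


for workload in ("oltp", "batch"):
    low, high = peak_link_bandwidth(10, workload)
    print(f"10 TB {workload}: {low}-{high} MB/sec at peak, "
          f"{oc3_lines_needed(low)}-{oc3_lines_needed(high)} OC3 lines")
```

For the 10 TB example this gives 12.5 to 25 MB/sec for OLTP (1 to 2 OC3 lines) and 75 to 87.5 MB/sec for batch (5 to 6 OC3 lines). Treat the result as a first estimate to be checked against measured write rates.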
Short distance synchronous mirroring: 2 site
Short distance may not meet DR requirements
Ability to utilize server capacity in both sites for single instance of application data
Potential/ability for non-disruptive failover
Hardware solution gives data consistency between multiple servers/applications and single management point
Additional copy of data might be provided for testing or testing may be done by regular switch of sites
Long Distance Mirroring: 2 site
Longer distance to meet regulatory requirements and protect against regional events
Ability to utilize server capacity in both sites for applications with separate/independent data
Disruptive failover and less potential to use DR solution for continuous availability
Asynchronous replication more likely due to performance requirements
Additional copy of data more likely to be provided for testing
Sync versus Async
[Figure: server I/O to a primary (P) and secondary (S) volume for each mode, with the steps numbered 1 to 4.]
Metro Mirror: synchronous, <300 km
• Write to primary volume
• The primary site initiates an I/O to the secondary site to transfer the data
• Secondary indicates to the primary that the write is complete
• Primary acknowledges to the host application that the write is complete
• Round-trip latency added to each write I/O
Global Mirror: asynchronous (any distance)
• Write to primary volume
• The primary site acknowledges to the host application that the write is complete
Some later time:
• The primary site initiates an I/O to the secondary site to transfer the data
• Secondary indicates to the primary that the write is complete
• Primary and secondary bitmap updated that data is in sync
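To see why distance matters for synchronous mirroring, the round-trip latency added to each write can be estimated. This sketch assumes light travels through fiber at roughly 200,000 km/sec (about 200 km per millisecond) and ignores switch, protocol and equipment delays, so real figures will be higher. The function names and the 0.5 ms local write time are hypothetical.

```python
# Approximate propagation speed in optical fiber (assumption): 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def round_trip_ms(distance_km):
    """Propagation delay for one round trip between the two sites."""
    return 2 * distance_km / FIBER_KM_PER_MS


def sync_write_ms(local_write_ms, distance_km):
    """Synchronous: the host waits for the secondary to confirm the write."""
    return local_write_ms + round_trip_ms(distance_km)


def async_write_ms(local_write_ms, distance_km):
    """Asynchronous: the host is acknowledged once the primary has the write."""
    return local_write_ms


for km in (10, 100, 300):
    print(f"{km} km: sync {sync_write_ms(0.5, km)} ms, async {async_write_ms(0.5, km)} ms")
```

Under these assumptions a 300 km link adds about 3 ms to every synchronous write, while the asynchronous write time seen by the host does not depend on distance. The trade-off is that asynchronous replication leaves some data not yet copied to the secondary at any moment, which is what sets its recovery point.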
3-Site Configurations
[Figure: three 3-site layouts.]
• Campus: Local-1, Local-2, Remote-3
• Local-1, Bunker-2, Remote-3
• Local-1, Remote-2, Remote-3
Tivoli Storage Manager: an integrated, end-to-end data protection and unified recovery management solution
[Figure: remote offices, the data center and DR operations connected over a WAN. Components shown: FastBack, FastBack for Workstations, TSM Server, TSM VE, TSM Clients, TDPs, FlashCopy Manager, ProtecTIER, Information Archive, tiers of storage, archive / off site, cloud gateway and cloud storage. Protected systems shown: critical application servers, VMware servers, applications, file servers and mobile offices.]
Centralized Administration:
• Install / Upgrade
• Monitoring
• Reporting
• Configuration
• Set Policies
• Execute Backup / Restore
“TSM is the grand-daddy of unified recovery management” --Lauren Whitehouse, Enterprise Strategy Group
Summary
• Understand today’s best practices
  – for IT High Availability and Business Continuity
• Strategies for:
  – Requirements, design, implementation
  – In-house vs. out-sourcing
• Step by step methodology
  – Essential role of virtualization
  – IBM technologies for replication and replication management
Resources and Information
• IBM Redbook: Business Continuity Planning Guide
  – http://www.redbooks.ibm.com/abstracts/sg246547.html
  – In particular, chapters 3, 6, 7
About the Speaker
Mr. Tony Pearson
Master Inventor, Senior Managing Consultant
IBM System Storage™

Tony Pearson is a Master Inventor and Senior Managing Consultant for the IBM System Storage™ product line. Tony joined IBM Corporation in 1986 in Tucson, Arizona, USA, and has lived there ever since. In his current role, Tony presents briefings on storage topics covering the entire System Storage product line, Tivoli storage software products, and topics related to Cloud Computing. He interacts with clients, speaks at conferences and events, and leads client workshops to help clients with strategic planning for IBM’s integrated set of storage management software, hardware, and virtualization products.

Tony writes the “Inside System Storage” blog, which is read by hundreds of clients, IBM sales reps and IBM Business Partners every week. This blog was rated one of the top 10 blogs for the IT storage industry by “Networking World” magazine, and the #1 most read IBM blog on IBM’s developerWorks. The blog has been published in a series of books, Inside System Storage: Volume I through V.

Over the past years, Tony has worked in development, marketing and customer care positions for various storage hardware and software products. Tony has a Bachelor of Science degree in Software Engineering, and a Master of Science degree in Electrical Engineering, both from the University of Arizona. Tony holds 19 IBM patents for inventions on storage hardware and software products.

9000 S. Rita Road
Bldg 9032 Room 1238
Tucson, AZ 85744
+1 520-799-4309 (Office)
Additional Resources
Email: [email protected]
Twitter: http://twitter.com/az99Øtony
Blog: http://ibm.co/brAeZØ
Books: http://www.lulu.com/spotlight/99Ø_tony
IBM Expert Network: http://www.slideshare.net/az99Øtony
Trademarks and disclaimers
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. UNIX is a registered trademark of The Open Group in the United States and other countries. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.
Other product and service names might be trademarks of IBM or other companies. Information is provided "AS IS" without warranty of any kind.
The customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer.
Information concerning non-IBM products was obtained from a supplier of these products, published announcement material, or other publicly available sources and does not constitute an endorsement of such products by IBM. Sources for non-IBM list prices and performance numbers are taken from publicly available information, including vendor announcements and vendor worldwide homepages. IBM has not tested these products and cannot confirm the accuracy of performance, capability, or any other claims related to non-IBM products. Questions on the capability of non-IBM products should be addressed to the supplier of those products.
All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Some information addresses anticipated future capabilities. Such information is not intended as a definitive statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future products. Such commitments are only made in IBM product announcements. The information is presented here to communicate IBM's current investment and development activities as a good faith effort to help with our customers' future planning.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput or performance improvements equivalent to the ratios stated here.
Prices are suggested U.S. list prices and are subject to change without notice. Starting price may not include a hard drive, operating system or other features. Contact your IBM representative or Business Partner for the most current pricing in your geography.
Photographs shown may be engineering prototypes. Changes may be incorporated in production models.
© IBM Corporation 2013. All rights reserved.
References in this document to IBM products or services do not imply that IBM intends to make them available in every country.
Trademarks of International Business Machines Corporation in the United States, other countries, or both can be found on the World Wide Web at http://www.ibm.com/legal/copytrade.shtml. ZSP03490-USEN-00