© 2009 EMC Corporation. All rights reserved.
Introduction to Business Continuity
Module 3.1
Introduction to Business Continuity
After completing this module, you will be able to:
Define Business Continuity and Information Availability
Detail impact of information unavailability
Define BC measurement and terminologies
Describe BC planning process
Detail BC technology solutions
What is Business Continuity
Business Continuity is preparing for, responding to, and recovering from an application outage that adversely affects business operations
Business Continuity solutions address unavailability and degraded application performance
BC is an integrated, enterprise-wide process and set of activities to ensure “information availability”
What is Information Availability (IA)
IA refers to the ability of an infrastructure to function according to business expectations during its specified time of operation
IA can be defined in terms of three parameters:
– Accessibility: Information should be accessible at the right place and to the right user
– Reliability: Information should be reliable and correct
– Timeliness: Information must be available whenever required
Causes of Information Unavailability
Disaster (<1% of Occurrences)
– Natural or man-made: flood, fire, earthquake
– Contaminated building
Unplanned Outages (20%)
– Failure
– Database corruption
– Component failure
– Human error
Planned Outages (80%)
– Competing workloads: backup, reporting
– Data warehouse extracts
– Application and data restore
Impact of Downtime
Know the downtime costs (per hour, day, two days...)
Lost Productivity
• Number of employees impacted (x hours out * hourly rate)
Lost Revenue
• Direct loss
• Compensatory payments
• Lost future revenue
• Billing losses
• Investment losses
Damaged Reputation
• Customers
• Suppliers
• Financial markets
• Banks
• Business partners
Financial Performance
• Revenue recognition
• Cash flow
• Lost discounts (A/P)
• Payment guarantees
• Credit rating
• Stock price
Other Expenses
• Temporary employees, equipment rental, overtime costs, extra shipping costs, travel expenses...
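The lost-productivity item on this slide (employees impacted, multiplied by hours out and hourly rate) can be worked through as a small calculation. This is a minimal sketch; the function name and the sample figures are hypothetical and only illustrate the arithmetic.

```python
def lost_productivity_cost(employees_impacted: int,
                           hours_out: float,
                           hourly_rate: float) -> float:
    """Cost of idle staff time: employees x hours out x hourly rate."""
    return employees_impacted * hours_out * hourly_rate


# Hypothetical figures, for illustration only.
cost = lost_productivity_cost(employees_impacted=200, hours_out=4, hourly_rate=50.0)
print(f"Lost productivity: {cost:,.2f}")
```

The other cost categories on the slide (lost revenue, other expenses, and so on) would be added to this figure to estimate the total cost of an outage.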
Measuring Information Availability
MTBF (Mean Time Between Failures): Average time available for a system or component to perform its normal operations between failures
MTTR (Mean Time To Repair): Average time required to repair a failed component
IA = MTBF / (MTBF + MTTR) or IA = uptime / (uptime + downtime)
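[Figure: timeline between two incidents. MTTR (time to repair, or ‘downtime’) covers detection, diagnosis, repair, recovery, and restoration, with the intervals labeled detection elapsed time, response time, repair time, and recovery time. MTBF (time between failures, or ‘uptime’) is the other span marked on the timeline.]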
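The formula IA = MTBF / (MTBF + MTTR) can be checked with a short calculation. The sketch below assumes both values are given in hours; the function name and the sample numbers are illustrative, not taken from the slides.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Information availability as a fraction: MTBF / (MTBF + MTTR)."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical component: fails on average every 999 hours, takes 1 hour to repair.
ia = availability(mtbf_hours=999.0, mttr_hours=1.0)
print(f"IA = {ia:.3%}")
```

Because MTTR sits in the denominator, shortening repair time raises availability just as lengthening the time between failures does.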
Availability Measurement – Levels of ‘9s’ Availability
% Uptime     % Downtime   Downtime per Year   Downtime per Week
98%          2%           7.3 days            3 hrs 22 min
99%          1%           3.65 days           1 hr 41 min
99.8%        0.2%         17 hrs 31 min       20 min 10 sec
99.9%        0.1%         8 hrs 45 min        10 min 5 sec
99.99%       0.01%        52.5 min            1 min
99.999%      0.001%       5.25 min            6 sec
99.9999%     0.0001%      31.5 sec            0.6 sec
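The table values follow directly from the downtime percentage applied to the length of the period. The sketch below reproduces that arithmetic, assuming a 365-day year and a 7-day week; the names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24   # assumes a non-leap year
HOURS_PER_WEEK = 7 * 24


def allowed_downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Hours of downtime permitted in a period at the given uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * period_hours


for pct in (98.0, 99.0, 99.9, 99.999):
    per_year = allowed_downtime_hours(pct, HOURS_PER_YEAR)
    per_week = allowed_downtime_hours(pct, HOURS_PER_WEEK)
    print(f"{pct}% uptime: {per_year:.2f} h/year, {per_week * 60:.2f} min/week")
```

For example, 99% uptime allows 1% of 8,760 hours, which is 87.6 hours or 3.65 days per year, matching the table.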
BC Terminologies
Disaster recovery
– Coordinated process of restoring systems, data, and infrastructure required to support ongoing business operations in the event of a disaster
– Restoring a previous copy of data and applying logs to that copy to bring it to a known point of consistency
– Generally implies use of backup technology
Disaster restart
– Process of restarting from disaster using mirrored consistent copies of data and applications
– Generally implies use of replication technologies
BC Terminologies (Cont.)
Recovery Point Objective (RPO)
Point in time to which systems and data must be recovered after an outage
Amount of data loss that a business can endure
Recovery Time Objective (RTO)
Time within which systems, applications, or functions must be recovered after an outage
Amount of downtime that a business can endure and survive
[Figure: recovery-point objective and recovery-time objective scales, each marked in seconds, minutes, hours, days, and weeks. Technologies shown on the RPO scale: Tape Backup, Periodic Replication, Asynchronous Replication, Synchronous Replication. Technologies shown on the RTO scale: Tape Restore, Disk Restore, Manual Migration, Global Cluster.]
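As a rough illustration of how RPO and RTO constrain a protection scheme, the sketch below checks a plan against both objectives. It assumes, as a simplification, that worst-case data loss equals the interval between successive copies; `ProtectionPlan`, `meets_objectives`, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProtectionPlan:
    copy_interval_hours: float  # time between successive backups or replica updates
    restore_hours: float        # estimated time to bring the service back


def meets_objectives(plan: ProtectionPlan, rpo_hours: float, rto_hours: float) -> bool:
    """True if the plan satisfies both the RPO and the RTO."""
    # Simplifying assumption: everything written since the last copy is lost.
    worst_case_data_loss_hours = plan.copy_interval_hours
    return worst_case_data_loss_hours <= rpo_hours and plan.restore_hours <= rto_hours


# Hypothetical nightly backup that takes 8 hours to restore.
nightly_backup = ProtectionPlan(copy_interval_hours=24.0, restore_hours=8.0)
print(meets_objectives(nightly_backup, rpo_hours=24.0, rto_hours=12.0))
print(meets_objectives(nightly_backup, rpo_hours=4.0, rto_hours=12.0))
```

A tighter RPO forces more frequent copies, while a tighter RTO forces a faster restore or restart path; the two objectives are set independently.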
Business Continuity Planning (BCP) Process
Identifying the critical business functions
Collecting data on various business processes within those functions
Business Impact Analysis (BIA) – Risk Analysis
Assessing, prioritizing, mitigating, and managing risk
Designing and developing contingency plans and disaster recovery plan (DR Plan)
Testing, training and maintenance
BC Technology Solutions
The following solutions and supporting technologies enable business continuity and uninterrupted data availability:
– Resolving single points of failure
– Multi-pathing software
– Backup and replication
   Backup recovery
   Local replication
   Remote replication
Resolving Single Points of Failure
[Figure: data center with redundancy at each layer. Labels: Client, IP, Redundant Network, Clustered Servers, Heartbeat Connection, Redundant Ports, Redundant Paths, FC Switches, Redundant FC Switches, Storage Array, Redundant Arrays, Remote Site, Storage Array.]
Multi-pathing Software
Configuration of multiple paths increases data availability
Even with multiple paths, if a path fails, I/O will not be rerouted unless the system recognizes that it has an alternate path
Multi-pathing software helps to recognize and utilize alternate I/O paths to data
Multi-pathing software also provides load balancing
Load balancing improves I/O performance and data path utilization
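The two roles described above, failover to an alternate path and load balancing across paths, can be sketched as a simple round-robin selector that skips failed paths. This is a conceptual illustration only, not how any particular multi-pathing product is implemented; `MultipathDevice` and the path names are hypothetical.

```python
class MultipathDevice:
    """Round-robin path selection that skips paths marked as failed."""

    def __init__(self, paths):
        self.alive = {path: True for path in paths}  # path name -> is usable
        self._next = 0                               # index of the next path to try

    def mark_failed(self, path):
        self.alive[path] = False

    def mark_restored(self, path):
        self.alive[path] = True

    def select_path(self):
        """Return the next live path, rotating to spread I/O across paths."""
        paths = list(self.alive)
        for offset in range(len(paths)):
            candidate = paths[(self._next + offset) % len(paths)]
            if self.alive[candidate]:
                self._next = (self._next + offset + 1) % len(paths)
                return candidate
        raise RuntimeError("no live path to device")


device = MultipathDevice(["hba0", "hba1"])
print([device.select_path() for _ in range(3)])  # alternates between both paths
device.mark_failed("hba0")
print([device.select_path() for _ in range(2)])  # only the surviving path is used
```

With both paths healthy, I/O alternates between them; once a path is marked failed, requests continue on the remaining path without the caller changing anything.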
Backup and Replication
Local Replication
– Data from the production devices is copied to replica devices within the same array
– The replicas can then be used for restore operations in the event of data corruption or other events
Remote Replication
– Data from the production devices is copied to replica devices on a remote array
– In the event of a failure, applications can continue to run from the target device
Backup/Restore
– Backup to tape has been a predominant method to ensure business continuity
– Frequency of backup depends on RPO/RTO requirements
Module Summary
Key points covered in this module:
Importance of Business Continuity
Types of outages and their impact on businesses
Information availability measurements
Definitions of disaster recovery and restart, RPO and RTO
Business Continuity technology solutions overview
Concept in Practice – EMC PowerPath
Host-based software
Resides between application and SCSI device driver
Provides intelligent I/O path management
Transparent to the application
Automatic detection and recovery from host-to-array path failures
[Figure: on the server, host application(s) sit above PowerPath, which sits above multiple SCSI drivers and SCSI controllers; these connect through the storage network to LUNs on the storage.]
Check Your Knowledge
Which concerns do business continuity solutions address?
“Availability is expressed in terms of 9s.” Explain the relevance of the use of 9s for availability, using examples.
What is the difference between RPO and RTO?
What is the difference between Disaster Recovery and Disaster Restart?
Provide examples of planned and unplanned downtime in the context of data center operations.
What are some of the Single Points of Failure in a typical data center environment?