Melbourne: 79-81 Coppin St, Richmond VIC 3121. P: +61 3 8420 0100, F: +61 3 8420 0101
Sydney: Level 2, 414 Kent St, NSW 2000. P: +61 2 9028 0100

www.td.com.au

Business Continuity - IT Disaster Recovery Discussion Paper

Version V2.0R

Wednesday, 5 September 2012

Commercial in Confidence

Business Continuity - IT Disaster Recovery Discussion Paper - V2.0R Page 2 of 46

Commercial-in-Confidence

© Thomas Duryea Consulting

Document Control

Document Control Information

Title: Business Continuity - IT Disaster Recovery Discussion Paper

File Name: BC DR Discussion Paper v2.0.docm

Version: V2.0R

Status: Released

Release Date: Wednesday, 5 September 2012

Revision History

Version # Description Revised by Date

V2.0 Added ISO 22301 comments David Danher 4th September 2012

Reviews and Authorisation

Technical Assurance

David Danher has prepared this document as the Thomas Duryea Principal Consultant

David Danher Principal Consultant

Date: Wednesday, 5 September 2012

Table of Contents

1. Overview
2. Business Continuity Management Overview
2.1 BCM Policy and Programme Management
2.2 Understanding the Organisation
2.3 Determining BCM Strategy
2.4 Developing and Implementing a BCM Response
2.5 Exercising, Maintaining and Reviewing BCM
2.6 Embedding BCM in the Organisation’s Culture
2.7 Key Processes
2.7.1 Awareness Training
2.7.2 Business Impact Analysis (BIA)
2.7.2.1 MTPD
2.7.2.2 MTDL
2.7.3 Continuity Requirements Analysis (CRA)
2.7.4 Evaluating Threats through Risk Assessment
2.7.5 Strategy
2.7.5.1 RTO
2.7.5.2 RPO
2.7.6 Identifying and Selecting Tactical Responses
2.7.7 Consolidating Resource Levels
2.7.8 Plans
2.7.9 Exercising
2.7.10 Maintenance
2.7.11 Reviewing & Auditing
3. Business Continuity Standards
3.1 Management Lifecycle - The Plan-Do-Check-Act (PDCA) cycle
4. Disaster Recovery Programme Overview
4.1 DR Programme Management
4.2 Understanding the Organisation
4.3 Determining DR Strategy
4.4 Developing and Implementing a DR Response
4.5 Exercising, Maintaining and Reviewing DR
4.6 Embedding DR in the Organisation’s Culture
4.7 Key Processes
4.7.1 Awareness Training
4.7.2 DR Policy
4.7.3 DR Framework
4.7.4 Business Impact Analysis (BIA)
4.7.4.1 Application Impact Rating
4.7.4.2 MTPD
4.7.4.3 RTO
4.7.4.4 MTDL
4.7.4.5 RPO
4.7.5 Threats and Vulnerability Assessment
4.7.6 Solution Architecture Design
4.7.7 Implementing the Selected Technical Solution
4.7.8 Plans
4.7.9 Exercising
4.7.10 Maintenance
4.7.11 Reviewing & Auditing
4.8 Amendments for ISO 22301
4.8.1 Lifecycle
4.8.2 Processes
5. The DR Business Impact Analysis Process
5.1 Pre workshop Activities
5.1.1 Application Template Build
5.1.2 Consequence Table Build
5.1.3 Analysis Timeframes
5.1.4 BIA Workbook Build
5.2 Workshop
5.2.1 Impact Rating Assignment
5.2.2 Maximum Tolerable Period of Disruption (MTPD) Assignment
5.2.3 Recovery Time Objective (RTO) Assignment
5.2.4 Maximum Tolerable Data Loss (MTDL) Assignment
5.2.5 Recovery Point Objective (RPO) Assignment
5.2.6 Business Continuity Workarounds Acknowledgement
5.2.7 Disaster Recovery Solutions Acknowledgement
5.3 Threat and Vulnerability Assessment
5.4 Post Workshop Analysis and Reporting
5.4.1 Calculation of the Original Total Impact Rating
5.4.2 Calculation of the Revised Total Impact Rating
5.4.3 Common Denominators
5.4.4 Analysis
5.4.5 Reporting
6. Potential Issues When Deploying a Technical Solution Only
6.1 Misalignment with Business Needs
6.2 Technical Solution Cost
6.3 Service Availability
6.4 Exercising
6.5 Evolution
7. Appendix A - Thomas Duryea Consulting
7.1 Company Overview
7.2 Awards
7.3 Partners and Accreditations
7.4 Thomas Duryea – the Disaster Recovery Programme Specialists
7.5 About the Author

List of Figures

Figure 1 BCM Lifecycle
Figure 2: The business continuity management life cycle
Figure 3 BCM Umbrella
Figure 4 Risk, Emergency response, Incident BC and DR management
Figure 5 The relationship between BCM standards and the DR methodology
Figure 6 DR Lifecycle
Figure 7 DR Processes
Figure 8 Solution Architecture Processes
Figure 9 DR Lifecycle using ISO 22301
Figure 10 DR Processes using ISO 22301
Figure 11 BIA Process Map
Figure 12 Application Template Sample
Figure 13 Consequence Table from Standards Australia HB 436 2004 - Risk Management Guidelines
Figure 14 Sample Consequence Table
Figure 15 Sample BIA Worksheet
Figure 16 Likelihood Scale
Figure 17 Likelihood of Incidents
Figure 18 Chart samples

List of Tables

Table 1: BC standards
Table 2: The PDCA BC Management Model

1. Overview

In my day-to-day duties I often find that the terms “Business Continuity” (BC) and “IT Disaster Recovery” (DR) are used interchangeably, which I believe stems from a basic misunderstanding of each discipline and of how the two interact with each other.

This discussion paper has been created to correct misunderstandings that may exist.

Chapter 2 contains extracts from the BCI GPG 2010 [1] and explains Business Continuity.

Chapter 3 contains a quick overview of Business Continuity Standards.

Chapter 4 contains extracts from the relevant Standards and Good Practice Guidelines [2] along with commentary from the author to explain IT Disaster Recovery.

Chapter 5 contains the methodology deployed by Thomas Duryea Consulting (TD) to undertake an IT DR Business Impact Analysis (BIA).

Chapter 6 contains some commentary on issues associated with deploying a technical DR solution without the support of business processes.

[1] Business Continuity Institute (BCI) Good Practice Guidelines (GPG) 2010 (published March 2010)

[2] Business Continuity Institute (BCI)
Good Practice Guidelines (GPG) 2010 (published March 2010)
Good Practice Guidelines (GPG) 2010 (published March 2008)

Australian National Audit Office
Business Continuity Management – Keeping the Wheels in Motion 2000
Business Continuity Management Good Practice Guide June 2009

Standards Australia
HB 221:2003 Business Continuity Management
HB 221:2004 Business Continuity Management
HB 292-2006 A Practitioners Guide to BCM
HB 293-2006 Executive Guide to BCM
AS NZS ISO IEC 17799-2001 Information Technology - Code of Practice for Information Security Management (Clause 14)
AS NZS ISO IEC 27002-2006 Information technology - Security techniques - Code of practice for Information Security Management (Clause 14)

British Standards
BS25999-1 2006 Business Continuity Management - Part 1 Code of Practice
BS25999-2 2007 Business Continuity Management - Part 2 Specification

2. Business Continuity Management Overview

Before discussing Disaster Recovery, the reader needs a high-level understanding of what Business Continuity Management (BCM) is, because BCM provides the context for the disaster recovery discussion.

The remainder of this chapter (Chapter 2) contains extracts from the BCI GPG 2010 [3].

Business Continuity Management (BCM) is a holistic process that identifies potential threats to an organisation and the impacts to business operations that those threats, if realised, might cause. It provides a framework for building organisational resilience with the capability for an effective response that safeguards the interests of key stakeholders, reputation, brand and value-creating activities.

Figure 1 BCM Lifecycle

2.1 BCM Policy and Programme Management

The BCM Policy is the key document that sets out the scope and governance of the BCM programme, and reflects the reasons why BCM is being implemented. It provides the context in which the required capabilities will be implemented, and identifies the principles to which the organisation aspires and against which its performance can be audited.

A BCM programme needs to reflect the organisation’s strategy, objectives and culture to ensure that the programme is relevant, effective and appropriate.

The purpose of setting the scope is to ensure clarity of what areas of the organisation are included within the BCM programme, defined by identifying which products and services fall within it.

[3] Business Continuity Institute (BCI) Good Practice Guidelines (GPG) 2010 (published March 2010)

2.2 Understanding the Organisation

“Understanding the Organisation” is the professional practice within the BCM Lifecycle that reviews an organisation in terms of what its objectives are, how it works functionally and the constraints of the environment in which it operates. The information collected makes it possible to determine how best to prepare an organisation to be able to manage disruptions which might otherwise seriously or fatally damage it.

This sets the scope of the Business Impact Analysis (BIA), Continuity Requirements Analysis (CRA) and Evaluating Threats stages.

2.3 Determining BCM Strategy

“Determining Business Continuity Strategy” is the professional practice within the BCM Lifecycle that determines which BCM strategies will meet the BCM Policy and organisational requirements and selects tactical responses from available options.

‘Determining Business Continuity Strategy’ uses the information obtained from the analysis in the ‘Understanding the Organisation’ stage of the BCM process to identify and select recovery and continuity options. This will enable the organisation’s activities to become operational following an interruption, before the organisation’s continued survival is threatened by their loss. It consists of three elements:

Identifying and Selecting Strategies

Identifying and Selecting Tactical Responses from Available Options

Consolidating Resource Levels

2.4 Developing and Implementing a BCM Response

“Developing and Implementing a BCM Response” is the professional practice within the BCM Lifecycle that implements agreed strategies through the process of developing a set of Business Continuity Plans. The aim of the various plan(s) covered in this stage is to identify, as far as possible, the actions and the resources which are needed to enable the organisation to manage an interruption whatever its cause, back to a position where normal business processes can resume. The key requirements for an effective response are:

A clear procedure for the escalation and control of an incident (incident response structure)

Communication with stakeholders

Plans to resume interrupted activities

2.5 Exercising, Maintaining and Reviewing BCM

“Exercising, Maintaining and Reviewing BCM” is the professional practice within the BCM Lifecycle that seeks to ensure continuous improvement is achieved through ongoing and scheduled actions.

Most organisations exist in a dynamic environment and are subject to changes in people, processes, market, risk, environment, geography and business strategy. To ensure that their BCM capability continues to reflect the nature, scale and complexity of the organisation it supports, it must be current, accurate, complete, exercised and understood by all stakeholders and participants.

The purpose of the Exercise Programme is to ensure that over a period of time:

All information in plans is verified

All plans are rehearsed

All relevant personnel (including deputies) are exercised
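The coverage goal of the Exercise Programme can be tracked with a simple register. The following is a minimal sketch only: the register layout, the function name `overdue_items` and the twelve-month period are assumptions made for illustration and are not taken from the BCI GPG.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional


def overdue_items(last_exercised: Dict[str, Optional[date]],
                  today: date,
                  period: timedelta) -> List[str]:
    """Return the plans or people not exercised within the given period.

    An entry of None means the item has never been exercised.
    """
    cutoff = today - period
    return sorted(
        name
        for name, when in last_exercised.items()
        if when is None or when < cutoff
    )


# Hypothetical register: plan or person -> date of last exercise.
register = {
    "IT DR Plan": date(2012, 3, 1),
    "Incident Management Plan": date(2010, 6, 15),
    "Deputy incident manager": None,
}

# Assumed review period of twelve months.
due = overdue_items(register, today=date(2012, 9, 5), period=timedelta(days=365))
print(due)
```

Any item returned by the function has fallen outside the chosen period and would be a candidate for the next scheduled exercise.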

There are several ways to review a BCM programme, which include self-assessment (first party), internal audit (second party) and external audit (third party).

2.6 Embedding BCM in the Organisation’s Culture

The successful establishment of BCM within the organisation’s culture is dependent upon its integration with the organisation’s strategic and day-to-day management as well as its alignment with business priorities.

This is not unique to BCM. Other disciplines such as Quality, Health and Safety, Environmental Services, IT Service Management and Information Security have similar demands placed upon them, and consequently have used the same ISO approved management system model.

2.7 Key Processes

This section details the key processes (but not all of the processes) required to implement BCM. It serves to highlight to the reader what each process accomplishes without delving into the explicit details of the process [4].

2.7.1 Awareness Training

Awareness training is both an initial and an integral process of the BCM programme. It helps an organisation explain the process of BCM and build enthusiasm for the programme across all levels of the organisation, and then maintain that enthusiasm over time for the ongoing maintenance of the programme. The awareness training process should be constructed from activities described elsewhere in this Guide. These could include:

A desktop exercise with senior managers to demonstrate what would happen in the absence of an incident response structure and procedures

Presentations on the impact of recent local incidents

Questionnaires or interviews to determine the current state of readiness within the organisation

2.7.2 Business Impact Analysis (BIA)

The Business Impact Analysis (BIA) is the foundation on which the whole BCM process is built. It identifies, quantifies and qualifies the business impacts of a loss, interruption or disruption of business activities on an organisation and provides the data from which appropriate continuity strategies can be determined. A BIA can be used to identify the timescale and extent of the impact of a disruption at several levels in an organisation. For example, to examine the effect of:

Strategic: The loss of the ability to deliver each product or service – to assist in deciding the scope of the BCM programme

Tactical: An interruption to the internal and external activities that would disrupt the delivery of products and services – to provide the information for selection of continuity options and their resource requirements

Operational: A disruption of a business area’s activities – to assist the preparation of a detailed plan for the department

The BIA provides the MTPD and the MTDL.

[4] Refer to the BCI GPG 2010 should additional information be required.

2.7.2.1 MTPD

Maximum Tolerable Period of Disruption (MTPD) – this is the duration after which an organisation’s viability (either financially or through loss of reputation) will be irreparably damaged if delivery of a particular product or service cannot be resumed. Factors that could be considered in estimating the MTPD include:

The impact on staff or public well-being

The impact of breaches of statutory duties or regulatory requirements

Damage to reputation

Damage to financial viability

Deterioration of product or service quality

Environmental damage

Other factors specific to the organisation

2.7.2.2 MTDL

The Maximum Tolerable Data Loss (MTDL) is the loss of currency of data (electronic and other) from which an organisation would be unable to recover its operational capability.

2.7.3 Continuity Requirements Analysis (CRA)

The Continuity Requirements Analysis (CRA) collects information on the resources required to resume and continue the business activities to support the organisation’s objectives and obligations. This step is usually undertaken at the same time as the BIA information is being gathered. Its purpose is to:

Provide the resource information from which an appropriate recovery strategy can be determined and/or recommended

Identify resource requirements resulting from activity dependencies that exist both internally and externally

2.7.4 Evaluating Threats through Risk Assessment

The purpose of evaluating threats is to identify measures that can be put in place to reduce the likelihood of interruption to the organisation’s most urgent activities and to reduce the impact, should the risk be realised. The process of evaluating threats uses risk assessment techniques to identify unacceptable concentrations of risk to activities, and single points of failure, and identifies measures that can be put in place to lower the likelihood of disruption to them. This allows mitigation measures to be targeted at the most urgent activities within the organisation, thus improving the likely return on investment and minimising the impact during a disruption.

2.7.5 Strategy

Determining Business Continuity Strategy uses the information obtained from the analysis in the BIA and CRA processes (described above) to identify and select recovery and continuity options. This will enable the organisation’s activities to become operational following an interruption, before the organisation’s continued survival is threatened by their loss. It consists of three elements:

1. Identifying and Selecting Strategies

2. Identifying and Selecting Tactical Responses from Available Options

3. Consolidating Resource Levels

An up to date BIA and CRA will provide the MTPD and MTDL for each product and service in the scope of the BCM programme. It will also quantify the recovery requirements for the activities that support the delivery of the products and services. The RTO and RPO parameters for each product and service are determined in the strategy. This leads to the selection of the most appropriate BCM strategies. The organisation needs to select BCM strategies that will enable it to protect the continued delivery of its products and services. This section covers the identification and selection of these strategies.

2.7.5.1 RTO

The target time for resuming the delivery of a product or service following its disruption is known as its Recovery Time Objective (RTO).

2.7.5.2 RPO

The age or value of the lost data could make resumed operations impossible. The target time for the worst case data loss in planning terms is known as its Recovery Point Objective (RPO).
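The relationship between the four parameters can be illustrated with a short consistency check. This is a sketch under stated assumptions: it assumes that the RTO chosen in the strategy should be shorter than the MTPD from the BIA, and that the RPO should not exceed the MTDL, with both data-loss values expressed as an age of data. The class `ContinuityTargets`, the function `check_targets` and the sample figures are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass(frozen=True)
class ContinuityTargets:
    """Continuity parameters for one product or service (hypothetical structure)."""
    name: str
    mtpd: timedelta  # Maximum Tolerable Period of Disruption, from the BIA
    mtdl: timedelta  # Maximum Tolerable Data Loss, from the BIA, as an age of data
    rto: timedelta   # Recovery Time Objective, set in the strategy
    rpo: timedelta   # Recovery Point Objective, set in the strategy


def check_targets(t: ContinuityTargets) -> List[str]:
    """Return a list of problems; an empty list means the targets are consistent."""
    problems = []
    # Assumption: recovery must complete before the MTPD is reached.
    if t.rto >= t.mtpd:
        problems.append(f"{t.name}: RTO {t.rto} is not shorter than MTPD {t.mtpd}")
    # Assumption: planned worst-case data loss must stay within the MTDL.
    if t.rpo > t.mtdl:
        problems.append(f"{t.name}: RPO {t.rpo} exceeds MTDL {t.mtdl}")
    return problems


# Illustrative figures only.
payroll = ContinuityTargets("Payroll", mtpd=timedelta(days=5), mtdl=timedelta(hours=24),
                            rto=timedelta(days=2), rpo=timedelta(hours=4))
billing = ContinuityTargets("Billing", mtpd=timedelta(hours=8), mtdl=timedelta(hours=1),
                            rto=timedelta(hours=12), rpo=timedelta(hours=4))

for targets in (payroll, billing):
    for problem in check_targets(targets):
        print(problem)
```

In this example the payroll targets pass, while the billing targets are flagged twice because both objectives are looser than the tolerances the BIA would have established.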

2.7.6 Identifying and Selecting Tactical Responses

Once the strategy has been decided, appropriate tactical continuity options need to be selected for each activity that supports the delivery of the organisation’s products and services. The tactics selected for each activity will need to cover the requirements in the relevant areas of:

People (skills and knowledge)

Premises (buildings and facilities)

Resources

Information technology (IT)

Telecommunications

Non electronic (paper) information

Equipment

Suppliers (products and services supplied by third parties)

For manufacturing organisations, particular attention will also need to be given to:

Production processes

Materials, logistics and inventory

Power and utilities

2.7.7 Consolidating Resource Levels

The purpose of consolidating resource levels is to:

Ensure that the selected tactics are consistent across the organisation

Ensure that the selected tactics do not conflict with one another (e.g. that different activities are not planning to use the same internal resource for recovery)

Determine how best to source external requirements (e.g. third party recovery sites)

Assist in determining the number and structure of the Business Continuity Plans

Having selected appropriate tactical continuity options for each important and urgent activity, the resource requirements of the tactics need to be consolidated.
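The conflict check described above (different activities planning to use the same internal resource) can be sketched in a few lines of Python. This is a minimal illustration only: the activity names, resource names and the function find_resource_conflicts are hypothetical and are not taken from any standard.

```python
from collections import defaultdict


def find_resource_conflicts(tactics):
    """Return internal resources claimed by more than one activity.

    `tactics` maps an activity name to the set of internal resources that
    its selected recovery tactic relies on (all names are hypothetical).
    """
    claims = defaultdict(list)
    for activity, resources in tactics.items():
        for resource in resources:
            claims[resource].append(activity)
    # A resource is in conflict when two or more activities plan to use it.
    return {r: sorted(a) for r, a in claims.items() if len(a) > 1}


# Hypothetical consolidated view of the tactics selected per activity.
tactics = {
    "Payroll": {"Meeting room A", "Spare laptops"},
    "Customer service": {"Meeting room A"},
    "Finance": {"Branch office desks"},
}

conflicts = find_resource_conflicts(tactics)
print(conflicts)
```

A real consolidation would also need to consider when each resource is needed and for how long, since two activities can share a resource if their recovery windows do not overlap.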

2.7.8 Plans

The key requirements for an effective response are:

A clear procedure for the escalation and control of an incident (incident response structure)

Communication with stakeholders

Plans to resume interrupted activities

The term Business Continuity Plan (BCP) can be defined as: “A documented collection of procedures and information that have been developed, compiled and maintained in readiness for use in an incident, to enable an organisation to continue to deliver its important and urgent activities, at an acceptable pre-defined level.”

There are other terms in common usage, all of which are specialist forms of the BCP. Although clearly within the generic definition above, Emergency Response Plans and Incident Management Plans are managed separately from BCP in some organisations. In some organisations, ICT (Information and Communication Technology) departments still refer to their plans as Disaster Recovery Plans.

Other names for specialist plans include:

Crisis Management Plan

Media Response Plan

Product Recall Plan

Pandemic Plan

Continuity of Operations Plan

2.7.9 Exercising

The purpose of the Exercise Programme is to ensure that over a period of time:

All information in plans is verified

All plans are rehearsed

All relevant personnel (including deputies) are exercised

Business Continuity Management (BCM) capability cannot be considered reliable until it has been exercised. An Exercise Programme should focus on maximizing business benefits while minimizing business disruption. A planned Exercise Programme is required to ensure that all aspects of the plans and personnel have been exercised over a period of time, avoiding disruption to the whole business.

Exercising can take various forms, including technical tests, desktop walkthroughs and full live rehearsals. No matter how well designed a BCM Strategy or Business Continuity Plan (BCP) is, a series of robust and realistic exercises will identify issues and assumptions that require attention.

Time and resources spent exercising BCPs are crucial parts of the overall process as they develop competence, instil confidence and impart knowledge that are essential in times of crisis. Validating technical recovery capabilities is an important part of an exercise programme but an equally key element is the role of people. The programme should ensure that their skill levels, knowledge of their role, management capability and decision-making are exercised in a safe environment.

2.7.10 Maintenance

The BCM Maintenance Programme ensures that the organisation remains ready to manage incidents despite the constant changes that all organisations experience. To be effective, the BCM Maintenance Programme should be embedded within the organisation’s normal management processes rather than be a separate structure that can be ignored or forgotten.

An effective change management process is a prerequisite of maintenance of the BCM programme. Many of the issues that show up in tests and exercises are the result of internal changes within the organisation – staff, locations or technology.

2.7.11 Reviewing & Auditing

There are several ways to review a BCM programme, which include self-assessment (first party), internal audit (second party) and external audit (third party). A formal BCM Audit process ensures that an organisation has an effective Business Continuity programme.

BCM Audit has five key functions:

1. To validate compliance with the organisation’s BCM policies and standards

2. To review the organisation’s BCM solutions

3. To validate the organisation’s range of BCM plans

4. To verify that appropriate exercise and maintenance activities are taking place

5. To highlight deficiencies and issues, and ensure their resolution

Auditing is designed to verify that the process has been followed correctly, not that the solutions adopted are necessarily correct. The audit or review should be conducted against a BCM Policy and appropriate standards identified by it.

The audit should be conducted on a regular basis as defined by the organisation’s audit and governance policies. For BCM, it is recommended that the period between audits should not exceed two years. In the interim, self-auditing, or ‘Performance Monitoring’ may be carried out more frequently, by the owners of the plans.

3. Business Continuity Standards

The previous chapter (Chapter 2) contained extracts from the BCI GPG 2010. It is imperative to note that the BCI GPG 2010 reflects the contents of the BS25999 British Standard. Part 1 of this standard was released in 2006 and Part 2 was released in 2007. Most BC/DR practitioners take their lead from these documents.

It should be noted however that current BCM standards are undergoing transition with the International Standard ISO 22301 being released in May 2012. The standard it replaces (BS25999 – Part 2) is not due to be retired until November 2012.

Development of the supporting document, ISO 22313 “Business continuity management systems – Guidelines”, to replace BS25999 – Part 1 is continuing. The timeframe for completion of ISO 22313 is not certain, but a practical estimate suggests May 2013.

As we are in a period of change between the two standards, the remainder of this document will utilise the BS25999 standard (unless explicitly stated) as the basis for discussion.

Australian organisations may elect to utilise one or all of the Australian Standards but these are now dated and overshadowed with the emergence of the new ISO standards. While the Australian standards differ in style to the British or International standards, the content can be aligned with either.

Australian Government organisations tend to utilise the “Business Continuity Management Good Practice Guide June 2009” published by the Australian National Audit Office (ANAO), which supports the Auditor-General.

Table 1: BC standards

Source Documents

International Standards and Best Practice Guidelines

BS ISO 22301 2012 Societal security – Business continuity management systems – Requirements (Supersedes BS25999 -2 2007)

ISO IEC 24762-2008 Information technology – Security techniques – Guidelines for information and communications technology disaster recovery services

British Standards

BS25999-1 2006 Business Continuity Management – Part 1 Code of Practice

BS25999-2 2007 Business Continuity Management – Part 2 Specification (to be withdrawn 1 November 2012)

Standards Australia

HB 221:2004 Business Continuity Management

HB 292-2006 A Practitioners Guide to BCM

HB 293-2006 Executive Guide to BCM

AS NZS ISO IEC 27002-2006 Information Technology – Code of Practice for Information Security Management (Clause 14)

3.1 Management Lifecycle - The Plan-Do-Check-Act (PDCA) cycle

Management systems standards – such as ISO 9001:2000 (Quality Management Systems), ISO 14001:2004 (Environmental Management Systems), ISO/IEC 27001:2005 (Information Security Management Systems) and ISO/IEC 20000:2005 (ICT Service Management) – support the Plan-Do-Check-Act (PDCA) cycle in establishing, implementing, operating, monitoring, reviewing, maintaining and improving the effectiveness of an organisation’s processes.

Current BCM standards also apply the PDCA cycle to an organisation’s BCM system.

Figure 2: The business continuity management life cycle

Figure 2 illustrates the current BCM management models.

The BS25999 standard (Part 1 and Part 2, the latter of which is being retired) is highlighted on the left.

The new ISO 22301 standard (which has replaced BS25999 Part 2) is highlighted on the right.

In the middle is the Plan, Do, Check, Act (PDCA) model utilised by management systems. Its purpose is to illustrate how the BS25999 and the ISO 22301 management models utilise the PDCA model.

Plan (Establish)

Establish business continuity policy, objectives, targets, controls, processes and procedures relevant to managing risk and improving business continuity to deliver results that align with an organisation’s overall policies and objectives

Do (Implement and Operate)

Implement and operate the business continuity policy, controls, processes and procedures

Check (Monitor and Review)

Monitor and review performance against business continuity objectives and policy, report the results to management for review, and determine and authorise actions for remediation and improvement

Act (Maintain and Improve)

Maintain and improve the BCM system by taking corrective actions, based on the results of management review and reappraising the scope of the BCMS and Business Continuity policy and objectives

Table 2: The PDCA BC Management Model

4. Disaster Recovery Programme Overview

There are no standards or good practice guidelines for Disaster Recovery. This is an interesting thought as DR existed before BC; the BC discipline grew out of the DR discipline and as previously discussed a number of standards and Good Practices were developed for BC.

There are, however, a number of BC Standards and Good Practice Guidelines that refer to Information Technology Disaster Recovery (IT Disaster Recovery or IT DR) as a component of BCM.

From these, we can construct and deliver methodologies and processes similar to those of our BC programme, so that our DR programme is aligned with our BC programme. This is extremely important for organisations that have a heavy reliance on IT and have identified IT as a major consolidated resource that needs to be recovered urgently after an incident.

The remainder of this chapter (chapter 4) contains extracts from the relevant Standards and Good Practice Guidelines and commentary from the author.

Figure 3 BCM Umbrella

As previously discussed, Business Continuity describes the processes and procedures an organisation puts in place to ensure that essential functions can continue during and after a disaster. Business Continuity planning seeks to prevent interruption of mission-critical services, and to re-establish full functionality as swiftly and smoothly as possible. We can see from Figure 3 that IT Disaster Recovery is one of the components of the Business Continuity Programme. It comprises the policies and procedures that enable an organisation to restart IT operations that support the essential business functions after a disaster.

Figure 4 Risk, Emergency response, Incident BC and DR management

“.....Business continuity management is an essential component of good governance. It supports and sustains the organisation’s business strategy, goals and objectives in the face of disruptive events. There are a number of interrelated activities that work together to prevent and manage a significant business disruption event. These include:

Business Continuity Management

Incorporating IT Disaster Recovery

Risk Management

Emergency Response Management

Incident Management

The integration of these activities is a success factor for building organisational resilience. These activities provide the tactical, strategic and operational response to a business disruption. Figure 4 depicts the relationship between these key concepts.....

....IT disaster recovery is a term used to describe the operational response associated with the recovery of technology-based resources. Typically, these include computerised information processing systems and telecommunications. IT disaster recovery involves defining the overall strategy for recovering these resources and the activities required to implement the strategy, including timelines for recovering each specific technology component as required by the business. The availability of appropriately skilled personnel and sourcing of specialist equipment in the event of a business disruption are two areas requiring particular attention, as business areas may make incorrect assumptions regarding these. IT disaster recovery is a part of an entity's business continuity strategy....”5

5Australian National Audit Office Business Continuity Management Good Practice Guide June 2009

Figure 5 The relationship between BCM standards and the DR methodology

Figure 5 above demonstrates that the BCM standards have a similar approach, with slight differences in terminology and the structure of the flowchart in which the key elements are aligned. The methodology taken for disaster recovery is also shown to highlight that it aligns with business continuity standards.

Figure 6 DR Lifecycle

The DR Life cycle is depicted in Figure 6 above and explained in the next few paragraphs.

4.1 DR Programme Management

Being a component of the BCM, the DR programme also needs to reflect the organisation’s strategy, objectives and culture to ensure that the programme is relevant, effective and appropriate.

The purpose of setting the scope is to ensure clarity about which areas of the organisation are included within the DR programme. The scope is generally defined as “a total loss of the server room or the loss of a single or multiple critical applications or components within the server room where an alternate site and resources are required for recovery of the IT Applications”.

The DR Policy is the key document that sets out the scope and governance of the DR programme, and reflects the reasons why DR is being implemented. It provides the context in which the required capabilities will be implemented, and identifies the principles to which the organisation aspires and against which its performance can be audited.

4.2 Understanding the Organisation

“Understanding the Organisation” is the component within the DR lifecycle that reviews an organisation in terms of what its objectives are, how it works functionally and the constraints of the environment in which it operates in relation to its reliance on the IT infrastructure. The information collected makes it possible to determine how best to prepare an organisation to be able to manage IT related disruptions which might otherwise seriously or fatally damage it.

This sets the scope of the Business Impact Analysis (BIA) and Evaluating Threats stages.

4.3 Determining DR Strategy

“Determining DR Strategy” is the component within the DR Lifecycle that determines which technical solutions will meet the organisational requirements and selects appropriate solutions from available options.

‘Determining DR Strategy’ uses the information obtained from the analysis in the ‘Understanding the Organisation’ stage of the DR process to identify and select recovery options. This will enable the organisation’s activities to become operational following an interruption, before the organisation’s continued survival is threatened by their IT loss.

4.4 Developing and Implementing a DR Response

“Developing and Implementing a DR Response” is the component within the DR Lifecycle that implements the agreed technical solution and develops the Disaster Recovery Plan. The aim of the plan is to document the actions and the resources which are needed to enable the organisation to recover the IT infrastructure to a position where normal business processes can resume.

4.5 Exercising, Maintaining and Reviewing DR

“Exercising, Maintaining and Reviewing DR” is the component within the DR Lifecycle that seeks to ensure continuous improvement is achieved through ongoing and scheduled actions.

Most organisations exist in a dynamic environment and are subject to changes in people, processes, market, risk, environment, geography and business strategy. To ensure that their DR capability continues to reflect the nature, scale and complexity of the organisation it supports, it must be current, accurate, complete, exercised and understood by all stakeholders and participants.

The purpose of the Exercise Programme is to ensure that over a period of time:

All information in plans is verified

All plans are rehearsed

All relevant personnel (including deputies) are exercised

There are several ways to review a DR programme, which include self-assessment (first party), internal audit (second party) and external audit (third party).

4.6 Embedding DR in the Organisation’s Culture

The successful establishment of DR within the organisation’s culture is dependent upon its integration with the organisation’s strategic and day-to-day management as well as its alignment with business priorities.

This is not unique to DR. Other disciplines such as BCM, Quality, Health and Safety, Environmental Services, IT Service Management and Information Security have similar demands placed upon them, and consequently have used the same ISO approved management system model.

4.7 Key Processes

This section details the key (but not all) processes required to implement DR. It serves to highlight to the reader what each process accomplishes without delving into the explicit details of the process.

Figure 7 DR Processes

Benefits of a correctly implemented DR programme are:

Confidence DR meets Business needs

Confidence it will work when needed

Proven and exercised

Compliance with all Standards and Best Practices

Auditable by External auditors

4.7.1 Awareness Training

Awareness training is both an initial and an integral process of the DR programme. It helps an organisation to explain the process of DR and so obtain enthusiasm for the programme across all levels of the organisation, and to maintain that enthusiasm over time for the ongoing maintenance of the programme. The awareness training process should be constructed from activities including:

Formal or Informal education

A desktop exercise with senior managers to demonstrate what would happen in the absence of an incident response structure and procedures

Presentations on the impact of recent local incidents

Questionnaires or interviews to determine the current state of readiness within the organisation

4.7.2 DR Policy

The DR policy document defines the “must dos” without defining the processes to accomplish this. It aims to minimize the damage or loss as the result of an unplanned incident and to ensure the rapid return to service and availability of all key IT capabilities. It:

Defines DR Incident

Scope

Physical Limitation

Defines Governance and Responsibilities

Executive

IT Management Team

Business Systems Managers

4.7.3 DR Framework

The DR framework uses the contents of the policy document and describes the process that an organisation undertakes in the establishment and ongoing management of the DR programme. This document is used to explain to internal staff and interested external third parties the DR processes an organisation undertakes. It details:

The Disaster Recovery (DR) Programme

Common Terminology

Governance

Roles and Responsibilities

Incidents

Definitions

Lifecycle

Management and Recovery

Planning Steps

Continuous Improvement

Quality Assurance

Culture Building

Plan Development

Format

Content

4.7.4 Business Impact Analysis (BIA)

The Business Impact Analysis (BIA) is an operational level BIA and the foundation on which the whole DR process is built. It identifies, quantifies and qualifies the business impacts of a loss, interruption or disruption of IT applications on an organisation and provides the data from which appropriate technical solutions can be determined.

The BIA provides the Application Impact Rating, MTPD, RTO, MTDL and RPO, and is discussed in detail in Chapter 5.

Experience has shown that the information collected within the BC BIA is not always sufficient for the DR BIA. This is because the BC BIA is often undertaken at the strategic level. Often:

The relationship between applications and their dependencies is not captured.

At the strategic level, only the major application(s) for a business function are considered

The data loss requirements are not considered

The restart order of applications in supporting the various business functions may differ in priority to the business functions themselves.

4.7.4.1 Application Impact Rating

Although “Application Impact Rating” is not technically a BC or DR term, it is alluded to in the HB221 Standard as the “Total Business Process Rating (Overall)” when presenting the “Business Impact Analysis Worksheet” template.

The application impact rating is an indication of the application’s importance to an organisation measured over time using standard risk management techniques.

It allows an organisation to assign priorities for financial investment for recovery and determines the applications’ order of recovery.

An application with a high rating would potentially necessitate a larger financial investment than an application with a lower rating and would be recovered prior to an application with a lower rating to ensure the service it delivers to an organisation is returned prior to the application with the lower rating.

Factors that could be considered in determining the Application Impact rating include:

The impact on staff or public well-being

The impact of breaches of statutory duties or regulatory requirements

Damage to reputation

Damage to financial viability

Deterioration of product or service quality

Environmental damage

Other factors specific to the organisation
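As a rough illustration of how factor scores could be turned into a rating and a recovery order, consider the Python sketch below. The 1 to 5 scoring scale, the rule of taking the worst factor score as the overall rating, and the application names are all assumptions made for this example; they are not prescribed by HB221 or any other standard.

```python
def application_impact_rating(factor_scores):
    """Overall rating = the worst (highest) score across all factors.

    Taking the maximum is an assumption made for this sketch; an
    organisation may prefer a weighted sum or its own risk matrix.
    """
    return max(factor_scores.values())


# Hypothetical scores from 1 (negligible) to 5 (severe) for each factor.
applications = {
    "Email": {"wellbeing": 1, "regulatory": 2, "reputation": 3, "financial": 2},
    "Payments": {"wellbeing": 1, "regulatory": 5, "reputation": 4, "financial": 5},
    "Intranet": {"wellbeing": 1, "regulatory": 1, "reputation": 1, "financial": 1},
}

ratings = {name: application_impact_rating(s) for name, s in applications.items()}

# Higher-rated applications are recovered before lower-rated ones.
recovery_order = sorted(ratings, key=lambda name: ratings[name], reverse=True)
print(recovery_order)
```

In practice the scores would be assessed at several points in time after the disruption (for example 4 hours, 1 day and 1 week), since the rating is measured over time.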

4.7.4.2 MTPD

The Maximum Tolerable Period of Disruption (MTPD) – this is the duration after which an organisation’s viability (either financially or through loss of reputation) will be irreparably damaged if the application cannot be restored. Factors that could be considered in estimating the MTPD are the same as those considered for the Application Impact Rating.

4.7.4.3 RTO

The required recovery time for resuming the application following its disruption is known as its Recovery Time Objective (RTO).

4.7.4.4 MTDL

The Maximum Tolerable Data Loss (MTDL) – this is the amount of data loss beyond which the data, even if restored, would be of no value to the organisation.

4.7.4.5 RPO

The Recovery Point Objective (RPO) is the amount of data loss (measured in time) for the application that the organisation is prepared to accept.
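The four parameters above can be held together per application and checked for consistency. The sketch below assumes, as the definitions suggest, that a target (RTO, RPO) should not exceed the corresponding tolerable limit (MTPD, MTDL). The class name, function name and sample figures are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryTargets:
    application: str
    mtpd: timedelta  # Maximum Tolerable Period of Disruption
    rto: timedelta   # Recovery Time Objective
    mtdl: timedelta  # Maximum Tolerable Data Loss
    rpo: timedelta   # Recovery Point Objective


def check_targets(t):
    """Return a list of problems; an empty list means the targets are consistent."""
    problems = []
    if t.rto > t.mtpd:
        problems.append(f"{t.application}: RTO exceeds MTPD")
    if t.rpo > t.mtdl:
        problems.append(f"{t.application}: RPO exceeds MTDL")
    return problems


# Hypothetical BIA results for two applications.
ok = RecoveryTargets("Payments", timedelta(hours=24), timedelta(hours=4),
                     timedelta(hours=8), timedelta(hours=1))
bad = RecoveryTargets("Intranet", timedelta(hours=48), timedelta(hours=72),
                      timedelta(hours=24), timedelta(hours=12))

print(check_targets(ok))
print(check_targets(bad))
```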

4.7.5 Threats and Vulnerability Assessment

The purpose of evaluating threats and vulnerabilities in relation to IT is to identify measures that can be put in place to reduce the likelihood or severity of interruption to the organisation’s IT infrastructure. The process of evaluating threats uses risk assessment techniques to identify unacceptable concentrations of risks to IT infrastructure and identifies measures that can be put in place to lower the likelihood of disruption to them.

4.7.6 Solution Architecture Design

Thomas Duryea Solution Architects use the information obtained from the analysis in the BIA and the threat and vulnerability processes (described above) to identify and select recovery and availability options to enable the organisation’s IT infrastructure to be restored. It consists of a number of elements.

Figure 8 Solution Architecture Processes

This leads to the selection of the most appropriate DR technical solution.

The business requirements may be met by a number of technical solutions and/or options added to solutions. From the available options the most cost effective solution that reduces the risk to an acceptable level to Management is selected.
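The selection rule can be illustrated with a short Python sketch. It assumes that “an acceptable level” of risk can be approximated as meeting the RTO and RPO agreed with the business, which is a simplification; the option names, costs and capabilities are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    annual_cost: float
    recovery_hours: float   # time the option needs to restore service
    data_loss_hours: float  # worst-case data loss the option allows


def select_solution(options, rto_hours, rpo_hours):
    """Return the cheapest option meeting both targets, or None if none does."""
    suitable = [o for o in options
                if o.recovery_hours <= rto_hours and o.data_loss_hours <= rpo_hours]
    return min(suitable, key=lambda o: o.annual_cost, default=None)


# Hypothetical candidate solutions.
options = [
    Option("Tape restore at alternate site", 20_000, 72, 24),
    Option("Nightly replicated standby", 60_000, 8, 24),
    Option("Synchronous replication", 250_000, 1, 0),
]

choice = select_solution(options, rto_hours=12, rpo_hours=24)
print(choice.name if choice else "No option meets the requirements")
```

When no option meets the requirements, the outcome is a decision for Management: either fund a more capable solution or accept revised recovery requirements.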

4.7.7 Implementing the Selected Technical Solution

Thomas Duryea engineers build, implement, configure and document the selected technical solution.

4.7.8 Plans

The key requirements for an effective response are:

A clear procedure for the escalation and control of an incident (incident response structure)

Communication with stakeholders

Plans to resume interrupted activities

The term Disaster Recovery Plan (DRP) can be defined as: “A documented collection of procedures and information that have been developed, compiled and maintained in readiness for use in an incident, to enable an organisation to recover its IT infrastructure to continue to deliver its important and urgent activities, at an acceptable pre-defined level.”

4.7.9 Exercising

The purpose of the Exercise Programme is to ensure that over a period of time:

All information in the DRP is verified

The DRP is rehearsed

All relevant personnel (including deputies) are exercised

Disaster Recovery capability and readiness cannot be considered reliable until it has been exercised. An Exercise Programme should focus on maximizing business benefits while minimizing business disruption. A planned Exercise Programme is required to ensure that all aspects of the plans and personnel have been exercised over a period of time, avoiding disruption to the whole business.

Exercising can take various forms, including call outs, desktop walkthroughs, rehearsals and full live rehearsals. No matter how well designed a DR Strategy or Disaster Recovery Plan is, a series of robust and realistic exercises will identify issues and assumptions that require attention.

Time and resources spent exercising DRPs are crucial parts of the overall process as they develop competence, instil confidence and impart knowledge that are essential in times of crisis. Validating technical recovery capabilities is an important part of an exercise programme but an equally key element is the role of people. The programme should ensure that their skill levels, knowledge of their role, management capability and decision-making are exercised in a safe environment.

4.7.10 Maintenance

The DR Maintenance Programme ensures that the organisation remains ready to manage incidents despite the constant changes that all organisations experience. To be effective, the DR Maintenance Programme should be embedded within the organisation’s normal management processes rather than be a separate structure that can be ignored or forgotten.

Effective change management and project management processes are a prerequisite of maintenance of the DR programme. Many of the issues that show up in tests and exercises are the result of internal changes within the organisation – staff, locations or technology.

4.7.11 Reviewing & Auditing

Best practice demands that reviews of the DR programme be undertaken at least annually. The review begins at the Awareness Training phase and works through all processes up to and including Exercising. This ensures that any unforeseen or overlooked changes in the IT infrastructure, business processes or recovery requirements over the past 12 months that have not been “captured” via the change or project management processes are identified and remedial activities applied.

A formal DR Audit process ensures that an organisation has an effective Disaster Recovery programme. Auditing is designed to verify that the process has been followed correctly, not that the technical solution is necessarily correct. Audits should be conducted on a regular basis as defined by the organisation’s audit and governance policies.

4.8 Amendments for ISO 22301

4.8.1 Lifecycle

Figure 9 DR Lifecycle using ISO 22301

The DR Lifecycle utilising the ISO 22301 standard is depicted in Figure 9 above. The major differences between this and the one described previously utilising the BS25999 standard are:

Operation (Do) includes the “old”

Organisation’s DR Strategies

Developing & Implementing the Organisations DR Solution and Plan

Exercising

Performance, Evaluation (Check) includes the “old”

Maintain and Review

Performance, Evaluation (Check) adds

Audit

Improvement adds a formal Continuous Improvement (CI) process via programme review. This in effect eliminates the assumption of the older standard that if the DR processes were undertaken regularly, a DR culture would be embedded into the organisation.

4.8.2 Processes

Figure 10 DR Processes using ISO 22301

The new standard incorporates all of the existing processes and adds a few more.

Policy and Framework remain as they were under “Context”

Business Impact Analysis, Threat and Vulnerability Assessment and Solution Architecture remain as they were but are now under “Operation” rather than “Strategy”.

Technical Solution, Recovery Plans and Exercised Plans remain as they were but are now under “Operation” rather than “Developing and Implementing a DR Solution” and “Exercising, Maintaining and Reviewing”

Maintain and Review under “Exercising, Maintaining and Reviewing” is replaced by Management Review, Internal Audit and Monitoring, Measurement, Analysis & Evaluation under “Performance Evaluation”

Non-Conformity and Continuous Improvement are added processes under “Improvement”

5. The DR Business Impact Analysis Process

The Business Impact Analysis (BIA) provides the business-defined recovery requirements from which we can determine the application restart order, the recovery timeframes and the amount of data loss the business can tolerate for each application. Without undertaking the BIA, an organisation risks:

Overspending on the technical solution

Underspending on the technical solution

Never knowing if business recovery requirements are being met

Figure 11 BIA Process Map

Figure 11 depicts the process undertaken to complete the BIA. We start at the “blue box”, determining the Consequence table. This is followed by the “green box”, completing the data collection. Next comes the “red box”, completing the threat and vulnerability assessment, and lastly the “purple box”, which completes the analysis and provides the BIA report.

5.1 Pre workshop Activities

5.1.1 Application Template Build

TD provides an application template containing placeholders for sites, application groups, applications, application dependencies, and IT and business owners. Clients complete this template, providing TD with a list of all of their applications and their dependencies, sorted by business group and site.

An application dependency is another application that is required by the first application to enable full business functionality to resume.

For example, there is no use in recovering a business application without also recovering Active Directory, as Active Directory provides the mechanism to log on to the business application, allowing the business to resume the business functions the application provides. Without Active Directory, the application would be unable to provide any business functionality, and therefore Active Directory is a dependency for the application.
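The dependency idea above can be sketched in a few lines of Python. This is a minimal illustration, not part of TD's template: the function names `required_for` and `recovery_sequence` are hypothetical, and the dependency map simply restates the entries of the sample application template (Figure 12).

```python
from graphlib import TopologicalSorter

# Dependencies restated from the sample application template (Figure 12).
DEPENDENCIES = {
    "Acc Pac": {"SQL", "Active Directory", "Internet Gateway"},
    "Exchange": {"Active Directory", "Internet Gateway"},
    "SQL": {"Active Directory"},
    "Active Directory": set(),
    "Internet Gateway": {"Active Directory"},
}


def required_for(app, deps=DEPENDENCIES):
    """Return every application needed, directly or indirectly, by app."""
    needed, stack = set(), [app]
    while stack:
        for dep in deps.get(stack.pop(), ()):
            if dep not in needed:
                needed.add(dep)
                stack.append(dep)
    return needed


def recovery_sequence(app, deps=DEPENDENCIES):
    """List app and its dependencies so each dependency precedes its dependants."""
    scope = required_for(app, deps) | {app}
    # TopologicalSorter expects a mapping of node -> nodes that must come first.
    graph = {name: deps.get(name, set()) & scope for name in scope}
    return list(TopologicalSorter(graph).static_order())


print(recovery_sequence("Acc Pac"))
```

For Acc Pac the sequence starts with Active Directory and ends with Acc Pac itself; the relative order of SQL and Internet Gateway is not fixed, because neither depends on the other.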

Figure 12 Application Template Sample

| Site | Application Group | Applications | Application Description/Comments | Dependencies |
| --- | --- | --- | --- | --- |
| Prod | Business | Acc Pac | Accounting | SQL; Active Directory; Internet Gateway |
| | | Exchange | Email | Active Directory; Internet Gateway |
| | Infrastructure | SQL | Database | Active Directory |
| | | Active Directory | Logon | |
| | | Internet Gateway | Internet | Active Directory |

5.1.2 Consequence Table Build

Consequences are defined as the outcome or impact of an event. One event can have more than one consequence; consequences can range from positive to negative and can be expressed qualitatively or quantitatively. For the purpose of business risk management, consequences are considered in relation to the achievement of objectives across a range of stakeholders and assets, e.g. environmental damage, loss or increase of market or profits, or regulations that increase or decrease competitiveness.

If an organisation does not have an existing Consequence table, one will have to be built.

Figure 13 Consequence Table from Standards Australia HB 436 2004 - Risk Management Guidelines

Figure 14 Sample Consequence Table

5.1.3 Analysis Timeframes

To determine when the loss of an individual application will cause critical issues for an organisation, timeframes need to be established against which the consequences of the loss can be measured.

Common timeframes are 1 or 4 hours, 1 day, 1 week and 1 month. Some organisations prefer to add 3-4 days and 2 weeks into the mix. The maximum timeframe is the point at which the business feels that an incident with a catastrophic impact would prevent the business from remaining viable.

5.1.4 BIA Workbook Build

TD uses all of the information provided above to build the BIA workbook, an example of which is shown in Figure 15.

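Figure 15 Sample BIA Worksheet

Impact ratings. Each Business Impact cell lists the ratings in the order Stakeholders / Product / Image (Intangible Impacts) / Financial (Tangible Impacts).

| Site | Application Group | Applications | Application Description/Comments | Dependencies | Business Impact 1 hour | Business Impact 1 day | Business Impact 1 week | Business Impact 1 month | Total Impact Rating |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Prod | Business | Acc Pac | Accounting | | 1 / 1 / 1 / 1 | 2 / 1 / 2 / 2 | 2 / 1 / 2 / 3 | 3 / 3 / 3 / 4 | 37 |
| | | | | SQL | 2 / 2 / 2 / 1 | 3 / 2 / 3 / 2 | 4 / 2 / 3 / 2 | 4 / 2 / 3 / 2 | 45 |
| | | | | Active Directory | 3 / 3 / 3 / 3 | 4 / 4 / 3 / 5 | 5 / 4 / 5 / 5 | 5 / 4 / 5 / 5 | 81 |
| | | | | Internet Gateway | 2 / 2 / 2 / 2 | 3 / 3 / 2 / 4 | 4 / 3 / 4 / 4 | 4 / 3 / 4 / 4 | 61 |
| | | Exchange | Email | | 2 / 2 / 2 / 1 | 3 / 2 / 3 / 2 | 4 / 2 / 3 / 2 | 4 / 2 / 3 / 4 | 47 |
| | | | | Active Directory | 3 / 3 / 3 / 3 | 4 / 4 / 3 / 5 | 5 / 4 / 5 / 5 | 5 / 4 / 5 / 5 | 81 |
| | | | | Internet Gateway | 2 / 2 / 2 / 2 | 3 / 3 / 2 / 4 | 4 / 3 / 4 / 4 | 4 / 3 / 4 / 4 | 61 |
| | Infrastructure | SQL | Database | | 2 / 2 / 2 / 1 | 3 / 2 / 3 / 2 | 4 / 2 / 3 / 2 | 4 / 2 / 3 / 2 | 45 |
| | | | | Active Directory | 3 / 3 / 3 / 3 | 4 / 4 / 3 / 5 | 5 / 4 / 5 / 5 | 5 / 4 / 5 / 5 | 81 |
| | | Active Directory | Logon | | 3 / 3 / 3 / 3 | 4 / 4 / 3 / 5 | 5 / 4 / 5 / 5 | 5 / 4 / 5 / 5 | 81 |
| | | Internet Gateway | Internet | | 2 / 2 / 2 / 2 | 3 / 3 / 2 / 4 | 4 / 3 / 4 / 4 | 4 / 3 / 4 / 4 | 61 |
| | | | | Active Directory | 3 / 3 / 3 / 3 | 4 / 4 / 3 / 5 | 5 / 4 / 5 / 5 | 5 / 4 / 5 / 5 | 81 |

MTPD, RTO, MTDL & RPO and Mitigation Strategies. The worksheet records each value under Minutes, Hours or Days; dependency rows show the unit recorded, and application rows show the value as entered.

| Applications | Dependencies | MTPD | Required RTO | Current RTO | MTDL | Required RPO | Current RPO | Viable Work Around | Time to Implement Work Around | DRP in Place | DRP Exercise Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Acc Pac | | 14 | 7 | 30 | 7 | 1 | 7 | n | | n | |
| | SQL | 21 days | 14 days | 30 days | 2 days | 1 day | 2 days | n | 0 | n | 0 |
| | Active Directory | 1 day | 12 hours | 30 days | 1 day | 1 day | 2 days | n | 0 | n | 0 |
| | Internet Gateway | 7 days | 1 day | 30 days | n/a | n/a | n/a | n | 0 | n | 0 |
| Exchange | | 3 | 1 | 30 | 2 | 12 | 1 | n | | n | |
| | Active Directory | 1 day | 12 hours | 30 days | 1 day | 1 day | 2 days | n | 0 | n | 0 |
| | Internet Gateway | 7 days | 1 day | 30 days | n/a | n/a | n/a | n | 0 | n | 0 |
| SQL | | 21 | 14 | 30 | 2 | 1 | 2 | n | | n | |
| | Active Directory | 1 day | 12 hours | 30 days | 1 day | 1 day | 2 days | n | 0 | n | 0 |
| Active Directory | | 1 | 12 | 30 | 1 | 1 | 2 | n | | n | |
| Internet Gateway | | 7 | 1 | 30 | n/a | n/a | n/a | n | | n | |
| | Active Directory | 1 day | 12 hours | 30 days | 1 day | 1 day | 2 days | n | 0 | n | 0 |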

5.2 Workshop

Workshops are conducted with representation from IT and the business. The purpose of the workshops is to obtain consensus on the organisation-wide risk appetite and recovery requirements in relation to IT infrastructure.

5.2.1 Impact Rating Assignment

Using the Consequence Table, Impact Ratings are assigned to each Application and Dependency. As an example, they can be based on assessments of the following impacts if the Application were unable to function for any reason:

Intangible Impacts: Safety, Environment, Property Damage, Asset Management, Business Objective, Reputation and Image

Tangible Impacts: Legal, Financial

The consequences are considered for different timeframes. In Figure 15, consequences are considered at intervals of 1 hour, 1 day, 1 week and 1 month.

5.2.2 Maximum Tolerable Period of Disruption (MTPD) Assignment

Business representatives determine the value of the Maximum Tolerable Period of Disruption (MTPD) on an application by application basis.

5.2.3 Recovery Time Objective (RTO) Assignment

Business representatives determine the value of the Recovery Time Objectives (RTO) on an application by application basis.

IT representatives report the current Recovery Time Objective (RTO) on an application by application basis, i.e. the time it currently takes to recover the application.

Quite often, especially if an organisation does not have an existing DR technical solution or its DR technical solution has not been reviewed for some time, this is an awakening for the business that investment in DR is required, as its recovery time requirements are not being met.

5.2.4 Maximum Tolerable Data Loss (MTDL) Assignment

Business representatives determine the value of the Maximum Tolerable Data Loss (MTDL) on an application by application basis.

5.2.5 Recovery Point Objective (RPO) Assignment

Business representatives determine the value of the Recovery Point Objectives (RPO) on an application by application basis.

IT representatives report the current Recovery Point Objective (RPO) on an application by application basis, i.e. the amount of data that would currently be lost when the application is recovered.

The business requirement and the current capability are often the same; however, if tape-based restores are being utilised, the business may not realise that serialised recoveries will increase the RTO that can actually be achieved.

5.2.6 Business Continuity Workarounds Acknowledgement

Noted in the DR BIA is the existence of a viable workaround. A viable workaround is a documented and exercised BCP, linking the DR BIA back into BCM.

Quite often, this is the second awakening for the business that investment in DR and BC is required as often they realise that they do not have any or sufficient BC workarounds in case the application is unavailable.

5.2.7 Disaster Recovery Solutions Acknowledgement

Noted in the DR BIA is the existence of a DR Plan for the application and, if one does exist, when it was last exercised.

Quite often, this is the third awakening for the business that investment in DR is required as often they realise that they do not have any or sufficient DR capability that has been exercised within the last 12 months.

Quite often we find DRPs that have not been exercised for many years, and the business has evolved to such a degree that the DRP would not support current business requirements. This is often caused by organisations not undertaking their DR programmes, believing instead that a one-off technical solution is the sole requirement.

5.3 Threat and Vulnerability Assessment

Based upon an organisation’s local knowledge, the author’s personal experience and industry-recognised methodology, we can presume a number of potential threats and their likelihood, and therefore make some recommendations with regard to the logical and physical security of the current premises.

Figure 16 below, taken from the Australian Standard “HB 436:2004 Risk Management Guidelines”, provides a scale of likelihoods against which we can measure potential events.

Figure 16 Likelihood Scale

| Level | Descriptor | Description | Indicative Frequency (expected to occur) |
| --- | --- | --- | --- |
| A | Almost certain | The event will occur on an annual basis | Once a year or more frequently |
| B | Likely | The event has occurred several times or more in your career | Once every three years |
| C | Possible | The event might occur once in your career | Once every ten years |
| D | Unlikely | The event does occur somewhere from time to time | Once every thirty years |
| E | Rare | Heard of something like this occurring elsewhere | Once every 100 years |
| F | Very rare | Have never heard of this happening | One in 1000 years |
| G | Almost incredible | Theoretically possible but not expected to occur | One in 10 000 years |

Using the likelihood scale and a list of potential events sourced from the Business Continuity Institute (BCI) “Good Practice Guidelines 2005” document, we can predict what events may occur and how often they may occur.

Figure 17 maps possible events to their likelihood.

Figure 17 Likelihood of Incidents

Any event depicted by a coloured bar above the red line is either possible, likely or almost certain to occur. Any event depicted by a coloured bar below the red line is either unlikely, rare, very rare or almost incredible.

We may be concerned with those events above the red line and may need to reduce the impact or the likelihood of these events as part of the DR technical solution.
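The red-line test can be sketched as a simple filter. This is an illustration only: the level letters and descriptors come from the likelihood scale (Figure 16), while the function name `events_of_concern` and the ratings given to the sample events are hypothetical, since the values charted in Figure 17 are not reproduced in the text.

```python
# Level letters and descriptors from the likelihood scale (Figure 16).
LIKELIHOOD = {
    "A": "Almost certain",
    "B": "Likely",
    "C": "Possible",
    "D": "Unlikely",
    "E": "Rare",
    "F": "Very rare",
    "G": "Almost incredible",
}

# Levels described in the text as sitting above the red line.
ABOVE_RED_LINE = {"A", "B", "C"}


def events_of_concern(event_levels):
    """Return the events rated possible or more likely, most likely first."""
    for event, level in event_levels.items():
        if level not in LIKELIHOOD:
            raise ValueError(f"unknown likelihood level {level!r} for {event!r}")
    flagged = [(level, event) for event, level in event_levels.items()
               if level in ABOVE_RED_LINE]
    return [event for level, event in sorted(flagged)]


# Hypothetical ratings, for illustration only.
sample = {
    "IT systems failure - hardware": "A",
    "Environmental - Earthquake": "F",
    "Utility failure - electricity": "B",
    "Data corruption": "C",
}
print(events_of_concern(sample))
```

Events returned by the filter are the candidates for likelihood or impact reduction in the DR technical solution.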

5.4 Post Workshop Analysis and Reporting

5.4.1 Calculation of the Original Total Impact Rating

For each time interval that an organisation considers, a unique value is calculated for each application based on the highest risk numerical value of the tangibles and intangibles from the Consequence Table.

For each timeframe:

The maximum value of the intangibles is added to the maximum value of the tangibles to arrive at a unique value.

Multiplication factors are applied to the total value of each timeframe, with the earliest timeframe having the largest multiplication factor. This ensures that if an application has the same value for two or more consecutive timeframes, the application is afforded a higher value for the earlier timeframes.

The resulting values are added to arrive at the total impact rating.
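The calculation above can be sketched as follows. The paper does not state the multiplication factors; the factors 4, 3, 2 and 1 used here are an assumption, chosen because they reproduce the totals shown in the sample worksheet (Figure 15), for example 37 for Acc Pac and 81 for Active Directory. The function name `total_impact_rating` and the data layout are hypothetical.

```python
# Assumed multiplication factors (not stated in the paper); the earliest
# timeframe carries the largest factor, as the text requires.
FACTORS = {"1 hour": 4, "1 day": 3, "1 week": 2, "1 month": 1}


def total_impact_rating(ratings, factors=FACTORS):
    """Sum, over all timeframes, factor * (max intangible + max tangible)."""
    total = 0
    for timeframe, factor in factors.items():
        intangible = max(ratings[timeframe]["intangible"])
        tangible = max(ratings[timeframe]["tangible"])
        total += factor * (intangible + tangible)
    return total


# Ratings for Acc Pac from the sample worksheet:
# intangible = Stakeholders, Product, Image; tangible = Financial.
acc_pac = {
    "1 hour": {"intangible": [1, 1, 1], "tangible": [1]},
    "1 day": {"intangible": [2, 1, 2], "tangible": [2]},
    "1 week": {"intangible": [2, 1, 2], "tangible": [3]},
    "1 month": {"intangible": [3, 3, 3], "tangible": [4]},
}
print(total_impact_rating(acc_pac))  # 4*2 + 3*4 + 2*5 + 1*7 = 37
```

An organisation using different timeframes or factors would only need to change the `FACTORS` mapping.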

Figure 17 chart labels

Likelihood scale: Almost certain (once a year or more frequently); Likely (once every three years); Possible (once every ten years); Unlikely (once every thirty years); Rare (once every 100 years); Very rare (one in 1000 years); Almost incredible (one in 10 000 years).

Events charted: Bomb threat or damage (partial or total site destruction); Compromised IT security; Building management systems failure; Change Implementation Failure; Compromised physical security; Damage or loss of electronic records; Data corruption; Denial of Access/Evacuation; Environmental - Earthquake; Environmental - Flooding; Environmental - Lightning Strikes; Environmental - Natural Disasters; Environmental - Cyclones; Environmental fire threat or damage; Human - accidental; Human - deliberate; IT systems failure - hardware; IT systems failure - software; Loss of key staff (temporary or permanent); Loss of Location; Responding to Demonstrations & Civil Disturbances; Threatening Calls; Unauthorised access; Utility failure - gas; Utility failure - electricity; Utility failure - telecommunications; Utility failure - water; Virus attacks.

5.4.2 Calculation of the Revised Total Impact Rating

The original total impact ratings for each application and its dependencies are examined.

Where the original total impact rating for an application dependency is less than the original total impact rating for the application, the dependency’s rating is increased to that of the application, producing a higher Revised Total Impact Rating for the application dependency.

Where the total impact rating for the application dependency is greater than that of the application, no amendment is made and the Revised Total Impact Rating is unchanged.

If an application appears multiple times, the highest Revised Total Impact Rating is used.

This method ensures that all required components (dependencies) of an application are afforded at least the same level of priority as the application.
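The revision rule can be sketched as below. This is a minimal reading of the rule as written: each dependency is compared with the original rating of the application that requires it, and the highest result is kept. It does not propagate ratings through chains of dependencies, which the text does not describe. The function name `revised_ratings` and the "Payroll" figures are hypothetical.

```python
def revised_ratings(original, dependencies):
    """Raise each dependency's rating to at least that of any application needing it.

    original:     {name: original total impact rating}
    dependencies: {application: [names of its dependencies]}
    """
    revised = dict(original)
    for app, deps in dependencies.items():
        for dep in deps:
            # Keeping the maximum also covers a dependency that appears
            # under several applications.
            revised[dep] = max(revised[dep], original[app])
    return revised


# Hypothetical figures, for illustration only.
original = {"Payroll": 70, "SQL": 45, "Active Directory": 81}
dependencies = {"Payroll": ["SQL", "Active Directory"], "SQL": ["Active Directory"]}
print(revised_ratings(original, dependencies))
```

Here SQL is raised from 45 to 70 because Payroll depends on it, while Active Directory keeps its higher rating of 81.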

5.4.3 Common Denominators

MTPD, RTO, MTDL and RPO values for each application are converted to a common denominator of a day (during the workshop, the values could have been compiled in minutes, hours or days).

This allows for simpler charting of values presented in the BIA report.
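The conversion is simple arithmetic. The sketch below assumes a 24-hour calendar day rather than a business day, which the paper does not specify; the function name `to_days` is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60  # assumes a 24-hour calendar day
HOURS_PER_DAY = 24


def to_days(minutes=0, hours=0, days=0):
    """Express a value captured in minutes, hours and/or days as days."""
    return days + hours / HOURS_PER_DAY + minutes / MINUTES_PER_DAY


print(to_days(hours=12))  # 0.5
print(to_days(days=21))   # 21.0
```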

5.4.4 Analysis

Collected data is stored in the BIA analysis workbook and a number of charts created to facilitate analysis.

Analysis of the collected data is undertaken to:

Highlight the risk impact of each of the applications

Identify the MTPD, RTO, MTDL and RPO for each application

Identify the MTPD for each application dependency

Identify where an application dependency’s MTPD may cause issues when compared to the application’s MTPD

Verify that the RTO for each application is within the MTPD

Identify where the RTO is greater than the MTPD

Identify discrepancies between the required RTO and the current RTO

Identify the RTO for each application dependency

Identify where an application dependency’s RTO may cause issues when compared to the application’s RTO

Identify the MTDL for each application dependency

Identify where an application dependency’s MTDL may cause issues when compared to the application’s MTDL

Verify that the RPO for each application is within the MTDL

Identify where the RPO is greater than the MTDL

Identify discrepancies between the required RPO and the current RPO

Identify the RPO for each Application Dependency

Identify where an application dependency’s RPO may cause issues when compared to the application’s RPO
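Several of the checks listed above are direct comparisons and can be sketched in code. The class `Objectives`, the two function names and the sample figures are hypothetical. The dependency check assumes that "may cause issues" means the dependency's value is larger (looser) than the application's, which is one plausible reading rather than a rule stated in the paper. All values are in days.

```python
from dataclasses import dataclass


@dataclass
class Objectives:
    """Recovery values for one application or dependency, in days."""
    mtpd: float
    required_rto: float
    current_rto: float
    mtdl: float
    required_rpo: float
    current_rpo: float


def application_gaps(o):
    """Return the checks that this application fails."""
    gaps = []
    if o.required_rto > o.mtpd:
        gaps.append("required RTO exceeds MTPD")
    if o.current_rto > o.required_rto:
        gaps.append("current RTO exceeds required RTO")
    if o.required_rpo > o.mtdl:
        gaps.append("required RPO exceeds MTDL")
    if o.current_rpo > o.required_rpo:
        gaps.append("current RPO exceeds required RPO")
    return gaps


def dependency_gaps(app, dep):
    """Flag dependency values that are looser than the application's."""
    gaps = []
    for field in ("mtpd", "required_rto", "mtdl", "required_rpo"):
        if getattr(dep, field) > getattr(app, field):
            gaps.append(f"dependency {field} exceeds application {field}")
    return gaps


# Hypothetical figures, for illustration only.
app = Objectives(mtpd=14, required_rto=7, current_rto=30,
                 mtdl=7, required_rpo=1, current_rpo=7)
dep = Objectives(mtpd=21, required_rto=14, current_rto=30,
                 mtdl=2, required_rpo=1, current_rpo=2)
print(application_gaps(app))
print(dependency_gaps(app, dep))
```

In this example the application's current RTO and RPO both exceed what the business requires, and the dependency's MTPD and required RTO are looser than those of the application that relies on it.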

5.4.5 Reporting

Figure 18 provides a glimpse of some of the charts presented in the BIA report, along with commentary explaining what the charts have highlighted.

The report also highlights the applications whose impact rating was amended. The amended ratings are used not only to group applications into high, medium and low risk, but also to provide a discrete restart order within each grouping.

Finally, the report will contain a high-level design brief for the solution architects to work from.
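The grouping and restart order described in this section could be derived along the following lines. The band boundaries (70 and 45) are invented for illustration; the paper does not say where the high, medium and low groupings begin. The function names are hypothetical, and the sample ratings are the totals from the sample worksheet (Figure 15), where no rating needed to be revised.

```python
# Hypothetical band boundaries: (name, minimum rating), highest band first.
BANDS = [("High", 70), ("Medium", 45), ("Low", 0)]


def band_for(rating, bands=BANDS):
    """Return the first band whose minimum the rating meets."""
    for name, minimum in bands:
        if rating >= minimum:
            return name
    raise ValueError(f"rating {rating} is below every band")


def restart_order(revised, bands=BANDS):
    """Group applications by band, highest rating first within each band."""
    groups = {name: [] for name, _ in bands}
    # Ties are broken alphabetically so the order is repeatable.
    for app, rating in sorted(revised.items(), key=lambda item: (-item[1], item[0])):
        groups[band_for(rating, bands)].append(app)
    return groups


ratings = {
    "Acc Pac": 37,
    "Exchange": 47,
    "SQL": 45,
    "Active Directory": 81,
    "Internet Gateway": 61,
}
print(restart_order(ratings))
```

With these assumed boundaries, Active Directory falls in the high grouping and restarts first, which is consistent with it being a dependency of every other application in the sample.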

Figure 18 Chart samples

6. Potential Issues When Deploying a Technical Solution Only

We have noted in the previous chapters that there are processes and methodologies that need to be implemented to ensure the DR technical solution is the correct one.

Many organisations, and indeed vendors, today take the uneducated or misinformed view that they only need to buy some hardware and some software, place these in another location and move data from the production site to the DR site, and they have DR.

Their main argument centres on the increase in technology capability in recent years, which has not only seen a dramatic fall in price but a significant gain in functionality. They argue that they can exceed business requirements with the new technology.

Unknowns of an incorrectly implemented DR programme are:

Lack of confidence DR meets business needs

Lack of confidence it will work when needed

It is not proven and exercised

Lack of compliance with all standards and best practices

Would fail if audited by external auditors

Let’s examine some of the issues an organisation will face if it only puts in a hardware solution.

6.1 Misalignment with Business Needs

If we have not asked the business what they want for disaster recovery, how can we possibly know with confidence we are meeting their need?

Are we exceeding their required recovery time and data loss requirements?

Are we failing to meet their required recovery time and data loss requirements?

We may never know until an incident occurs whether we have met or failed to meet business needs, and once an incident occurs it is too late to find out.

6.2 Technical Solution Cost

If we have failed to ask the business what they want for disaster recovery, how can we possibly know with confidence we have purchased the correctly costed technical solution?

Chances are that if we have exceeded the business’s required recovery time and/or data loss requirements, we have bought a more technically capable solution than we needed, at a larger cost than we really required.

On the other hand, if we have not met the business’s required recovery time and/or data loss requirements, we have a technically inferior solution, bought at a lesser cost, that does not meet business needs.

We may never know until an incident occurs whether we have purchased sufficient equipment to meet business requirements, and once an incident occurs it is too late to find out.

6.3 Service Availability

If we have not determined the order of recovery of individual applications for disaster recovery, how can we possibly know with confidence that we are meeting service availability targets to the business, allowing them to meet their service availability targets to our clients?

Without a pre-defined business order of recovery, competing business factions will demand discrete applications be recovered within the same timeframe, compromising recovery of all applications in a timely manner.

We may never know until an incident occurs whether we have met or failed to meet business service availability targets, allowing or preventing the business from meeting their service targets with our clients, and once an incident occurs it is too late to find out.

6.4 Exercising

If we fail to obtain business acceptance and/or buy-in of the DR process we would be extremely fortunate to obtain their willing participation in an exercise.

Failing to obtain everyone’s co-operation in a DR exercise places the success of the exercise at risk and increases the risk to the business that, should an incident occur, some business personnel will not be able to assist in the recovery of their application.

A core activity of business representatives in an exercise, or in the event of a DR declaration, is the validation of the application’s data prior to it being commissioned for use.

We may never know until an incident occurs whether we have sufficiently trained personnel, and once an incident occurs it is too late to find out.

6.5 Evolution

If we fail to acknowledge that business evolves and business focus changes, we will in effect have purchased a “white elephant” with a technical solution that will eventually fail to meet business recovery needs.

The provision of the DR technical solution is not a one-off project but should be supported with change and project management processes in addition to annual reviews.

Anecdotal evidence suggests that organisations that purchase a DR technical solution without applying the DR programme business methodology tend to disregard the need for evolving the DR solution. This means that over time the DR solution becomes obsolete.

We may never know until an incident occurs whether we have sufficient DR technical capability or obsolete DR infrastructure, and once an incident occurs it is too late to find out.

7. Appendix A - Thomas Duryea Consulting

7.1 Company Overview

Founded in 2000, Thomas Duryea Consulting (TD) designs, plans, implements and manages innovative, world class IT infrastructure solutions from the keyboard to the cloud.

Working closely with our clients as their strategic infrastructure partner, we utilise best practice and rigorous, proven methodologies to both recommend and implement best-fit IT infrastructure solutions. Our portfolio of solutions and services will help you solve your organisational challenges while maximising your return on ICT investments.

Our ongoing success is a testament to our people’s abilities and their shared commitment to our core mission and values. At TD we pride ourselves on technical excellence and as such we attract, develop and retain exceptional consultants, architects, engineers and project managers.

We maintain an interest and thought leadership position in the Business Continuity (BC)/Disaster Recovery (DR) community and are up to date with the latest standards and good practice guidelines governing correct programme implementations.

We rigorously test and evaluate vendor technologies and choose to specialise in these to be able to ensure our clients continuously receive successful project outcomes based on business requirements. We can assure our clients that we are technically excellent in these vendor technologies and can successfully demonstrate repeatable business outcomes across clients in all industries. Further to this we have a strong Managed Services division which provides ongoing support in these chosen vendor technologies to our strong client base.

7.2 Awards

Thomas Duryea has been recognised for many years as a leading IT Infrastructure Solution provider, and as such is proud to acknowledge the below awards from both industry and vendors.

2008 VMware Partner of the Year

2008 BRW ‘Fast 100’ [6th place]

2008 BRW ANZ ‘Fastest Growing Private Business’

2009 BRW ‘Fast 100’ [7th place]

2009 Symantec Growth Partner of the year

2010 BRW ‘Fast 100’ [21st place]

2010 Symantec Enterprise Value Partner of the year

2011 EMC Unified Storage Partner of the year

2011 EMC VIC/Tas Partner of the year

7.3 Partners and Accreditations

Industry Accreditations

Member of the Business Continuity Forum and SNIA

Vendor Accreditations

CommVault Platinum Partner

NetApp Platinum Partner

EMC Velocity Signature Partner

2011 Unified Storage Partner of the Year

Cisco UCS Partner

VMware Premier Partner

2005 - 2009 Highest Performing Partner

Symantec Platinum Partner

2011 Specialised Service Partner of the year

2010 Symantec Enterprise Value Partner of the year

Microsoft Gold Systems Management Certified

Microsoft Silver Desktop Management Certified

Citrix Gold Partner

7.4 Thomas Duryea – the Disaster Recovery Programme Specialists

Many organisations have developed a technical DR solution but struggle to successfully manage it or maintain its currency with business requirements. Others, for various reasons, simply have not been in a position to implement a DR solution. Thomas Duryea has provided a methodology to these clients that enables them to implement and manage their DR solution.

We are able to assist clients to:

Implement policy

Implement framework

Undertake BIA

Design and implement technical solution

Write declaration and recovery plans

Exercise recovery plans

Conduct annual reviews of the DR programme

TD is a Silver Supplier Member of the Continuity Forum, which is the Australian & New Zealand BC/DR User Group. We have been called upon to assist the BC/DR community to understand the core components of the DR programme and in particular the correct processes in undertaking the BIA and have presented at the Continuity Forum Expos and Business Continuity Institute (BCI) forums and summits.

TD has extensive experience in delivering tailored DR programmes for many clients across many industry sectors. It is common practice within the BC/DR community not to name clients without their approval, so only the industry sectors are listed here:

Aged Care

Construction

Education

Employment

Exploration

Financial

Health

Manufacturing

Not for profit

Telecommunications

Transport

Sporting

Utilities

We have delivered a Collaborative DR Training Programme to a local government peak body, effectively training all Victorian councils and providing them with documentation and process templates. Based upon the success of this training, approximately 30% of councils engaged TD for assistance in implementing services to complement the work they were undertaking themselves.

7.5 About the Author

David Danher is Thomas Duryea’s Principal Consultant and has over 30 years’ experience in the IT industry. He began his career as a mainframe expert in disaster recovery, configuration, storage and operations. For the last 13 years he has been a Consultant within the disaster recovery and business continuity disciplines.

As Thomas Duryea’s Principal Consultant, David is responsible for the presales and delivery of disaster recovery consultation, programmes and training for clients spanning multiple industry sectors. He has provided training to the local government sector, established DR programmes for clients in the health and local government sectors and established the world’s first 4-way mainframe replication solution. He is currently engaged by clients in the sports, HR, manufacturing, health and education sectors to assist them in developing DR programmes.

Prior to Thomas Duryea Consulting, David worked as an independent consultant for storage vendors, designing and implementing DR solutions and storage upgrades across Australia, China, Taiwan, Hong Kong and the USA. During this period he relocated a data centre from Auckland to Melbourne within the standard 4 hour maintenance window, upgraded a major medical facility’s storage infrastructure without any downtime and implemented DR infrastructure for a Wall St financial institution, which was activated following the events of 9/11.

David has also worked for both the NAB and CBA as a mainframe computer operator, storage administrator and disaster recovery technician. His experience, wealth of knowledge and creative problem solving skills make him the consummate professional. David is a professional member of the Business Continuity Institute.

