Date post: 20-Mar-2020
Disaster Recovery Planning for MySQL & MariaDB Bart Oles Severalnines

Copyright 2017 Severalnines AB


Free to download

Initial 30 days Enterprise trial

Converts into free Community Edition

Enterprise / paid versions available


Automation & Management

Deployment (Free Community)

● Deploy a Cluster in Minutes

○ On-Prem

○ Cloud (AWS/Azure/Google) - paid

Monitoring (Free Community)

● Systems View with 1 sec Resolution

● DB / OS stats & Performance Advisors

● Configurable Dashboards

● Query Analyzer

● Real-time / historical

Management (Paid Features)

● Backup Management

● Upgrades & Patching

● Security & Compliance

● Operational Reports

● Automatic Recovery & Repair

● Performance Management

● Automatic Performance Advisors


Supported Databases


Our Customers

Agenda

Business Considerations for Disaster Recovery

○ Is 100% uptime possible

○ Analyzing risk

○ Assessing business impact

Defining Disaster Recovery

○ Outage Timeline

○ RTO

○ RPO

○ RTO + RPO = ?

Disaster Recovery Tiers

○ No offsite data

○ Database with no Hot Site

○ Database with Hot Site

○ Asynchronous replication to Hot Site

○ Synchronous replication to Hot Site


Copyright 2018 Severalnines AB

Business Considerations for Disaster Recovery

What is Disaster Recovery?

● Failures

○ Operational (power, network, IT systems)

○ Natural (hurricane, flood, fire, earthquake)

○ Human caused (operator error, malicious activity, terrorism)

● Drivers

○ How fast can we get up and running

○ What data have we lost

○ How can we reduce risk

Policies, tools & procedures that ensure your data is secure and protected in case of an outage or serious catastrophe

Uptime Guarantees - Why Compromise?

The Small Print

“We Offer 100% Availability, But We Exclude…”

● Planned outages

○ e.g., server or network maintenance

● Failure of network, power or facilities delivered by an upstream provider

● DOS attacks, hacker activity or other malicious events

● Acts of God

○ e.g., weather related - hurricane, flood

Low Downtime Comes at a Cost
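
To make the price of each extra "nine" concrete, the sketch below converts an availability target into the downtime it allows per year. The function name and the 365-day year are assumptions chosen for illustration; they do not come from the slides.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Print the budget for a few commonly quoted targets.
    for target in (99.0, 99.9, 99.99, 99.999, 100.0):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> {minutes:.1f} minutes/year")
```

A 99.9% target leaves roughly 525.6 minutes (a little under nine hours) per year, and every additional nine divides that budget by ten, which is why the last fractions of a percent tend to be the most expensive.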

Analyzing Risk

Assessing Business Impact


Defining Disaster Recovery

Stockholm to Oslo Train Breaks Down After One Hour


Outage Timeline

Recovery Time Objective (RTO)

Recovery Point Objective (RPO)

RTO + RPO = 0 ?
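
The slides give only the titles here, so the following is a minimal sketch using the usual definitions: RPO bounds how much recent data may be lost, measured in time, and RTO bounds how long the service may stay down. The `Outage` class and its field names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Outage:
    last_recoverable_point: datetime  # newest data that can still be restored
    failure_time: datetime            # moment the outage began
    service_restored: datetime        # moment the service was usable again


def data_loss(outage: Outage) -> timedelta:
    """Window of changes that cannot be recovered (compare against RPO)."""
    return outage.failure_time - outage.last_recoverable_point


def downtime(outage: Outage) -> timedelta:
    """Time the service was unavailable (compare against RTO)."""
    return outage.service_restored - outage.failure_time


def meets_objectives(outage: Outage, rpo: timedelta, rto: timedelta) -> bool:
    """True only if both the data-loss and the downtime objectives were met."""
    return data_loss(outage) <= rpo and downtime(outage) <= rto
```

For example, a nightly backup at 02:00, a failure at 09:30 and a restore completed at 11:00 gives 7.5 hours of lost data and 1.5 hours of downtime. That outcome satisfies an RPO of 24 hours and an RTO of 4 hours, but not an RPO of 1 hour.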

Database Replication

Storage Clustering

Data Integrity

Security

Load Balancing

Network Bonding

File Replication

DB Clustering

“Everything Fails, All the Time”

Werner Vogels


Disaster Recovery Tiers

Cost of Disaster Recovery

Matching Disaster Recovery Plans to the Business

Backup with No Hot Site

● Physical vs Logical backup

○ High impact on RTO

● Combine Full & Incremental

○ PITR-compatible to reduce RPO

● Schrödinger’s backup

○ “The condition of any backup is unknown until a restore is attempted”

● Encryption

● Keep a copy of latest backup in active site
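
Combining full and incremental backups means a restore has to pick the right chain: the newest full backup taken before the target time, followed by every incremental taken after it. The sketch below shows one way to select that chain. The `Backup` record and `restore_chain` function are hypothetical and do not correspond to any particular backup tool; the remaining gap up to the exact target time would be closed by replaying binary logs.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Backup:
    name: str
    kind: str              # "full" or "incremental"
    finished_at: datetime


def restore_chain(backups: List[Backup], target: datetime) -> List[Backup]:
    """Return the backups to restore, in order, to get as close to target as possible."""
    # Only backups completed at or before the target time are usable.
    usable = sorted(
        (b for b in backups if b.finished_at <= target),
        key=lambda b: b.finished_at,
    )
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise LookupError("no full backup taken before the target time")
    base = fulls[-1]
    # Incrementals only make sense on top of the full backup they follow.
    incrementals = [
        b for b in usable
        if b.kind == "incremental" and b.finished_at > base.finished_at
    ]
    return [base] + incrementals
```

Running such a selection and the resulting restore on a schedule is also a practical answer to the Schrödinger's backup problem, since the chain is only known to be good once it has been restored.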

Backup Retention

● Local Server

○ Up to 1 week

● Local Datacenter

○ Up to 2 weeks

● Remote Datacenter

○ Up to 4 weeks

○ Plus keep monthly backups & annual backups as required
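
The retention tiers above can be expressed as a small policy table. This is a sketch under stated assumptions: the location keys and the `long_term` flag are hypothetical names, and keeping monthly and annual backups in the remote datacenter is one reading of the slide rather than something it states outright.

```python
from datetime import timedelta
from typing import List

# Maximum age of a backup at each location, taken from the tiers above.
RETENTION = {
    "local_server": timedelta(weeks=1),
    "local_datacenter": timedelta(weeks=2),
    "remote_datacenter": timedelta(weeks=4),
}


def locations_to_keep(age: timedelta, long_term: bool = False) -> List[str]:
    """Return the locations that should still hold a backup of the given age.

    long_term marks a monthly or annual backup, which is assumed here to be
    kept in the remote datacenter regardless of age.
    """
    keep = [location for location, limit in RETENTION.items() if age <= limit]
    if long_term and "remote_datacenter" not in keep:
        keep.append("remote_datacenter")
    return keep
```

A three-day-old backup is kept everywhere, a ten-day-old one only in the two datacenters, and a forty-day-old one nowhere unless it is flagged as a monthly or annual copy.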

Backup with Hot Site

● We can reinstall DBs and apps from scratch and restore data

● Recovery time predictable

● In case of AWS, pre-configured AMIs can be used to quickly provision the application environment

Asynchronous Replication to Hot Site

● Low RTO

○ ‘Almost current’ data enables fast failover

● Low RPO

● Add a delayed slave to guard against operator error

● Backup still important
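
A delayed slave only protects against operator error if replication is stopped before the bad statement reaches it. In MySQL and MariaDB the delay itself is set with the `MASTER_DELAY` option of `CHANGE MASTER TO`. The sketch below captures just the timing rule; the function name is hypothetical, and it assumes the replica applies changes exactly `delay` after the primary, with no additional lag.

```python
from datetime import datetime, timedelta


def replica_still_clean(error_at: datetime, stopped_at: datetime, delay: timedelta) -> bool:
    """True if replication was stopped before the delayed replica applied the bad change.

    error_at:   when the erroneous statement ran on the primary
    stopped_at: when replication on the delayed replica was stopped
    delay:      configured replication delay of the replica
    """
    if stopped_at < error_at:
        raise ValueError("replication cannot be stopped before the error happened")
    # The replica applies the bad change at error_at + delay.
    return stopped_at - error_at < delay
```

With a one-hour delay, an accidental `DROP TABLE` noticed after 20 minutes can still be avoided on the replica, while one noticed after 90 minutes has already been applied and has to be recovered from backup instead.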

Synchronous Replication to Hot Site

● Highest tier of DR

○ Minimal RPO and RTO

● Data on primary site and hot sites have same transactional state

○ Failover instantaneous and automatic

● Failure detection time is the main contributor to RTO

● 3 sites to avoid network partitioning
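
The reason for three sites can be shown with a simple majority rule of the kind used by quorum-based clusters. This is an illustrative sketch with hypothetical function names and equal weight per site; it is not the exact algorithm of any particular product.

```python
from typing import List


def has_quorum(reachable_sites: int, total_sites: int) -> bool:
    """A partition may keep accepting writes only with a strict majority of sites."""
    return 2 * reachable_sites > total_sites


def partitions_with_quorum(partition_sizes: List[int]) -> List[int]:
    """Given the sizes of the groups a network split produced, return those with quorum."""
    total = sum(partition_sizes)
    return [size for size in partition_sizes if has_quorum(size, total)]
```

With two sites, a split produces two groups of one site each, and neither holds a majority, so the whole cluster stops accepting writes. With three sites, losing the link to any one site leaves a group of two that still holds a majority and can continue.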

Synchronous Replication to Hot Site - Database Proxy


Concluding Remarks

Reality Check

Disaster Recovery Planning & Testing

Source: https://www.zetta.net/resource/state-disaster-recovery-2016

Source: https://uptimeinstitute.com/about-ui/press-releases/uptime-institute-annual-survey-results-enterprise-owned-data-centers-still-primary-compute-venue

Geographically Distributed Datacenters on the increase

Uptime Institute Data Center Industry Survey (2017)

Failover the New Normal

● Failover used to be a complex procedure

○ Required a lot of staff

○ Required availability of VPs / technology heads

● In modern distributed infrastructure, design for failure

● Considerations

○ How many sites?

○ How to route users to sites?

○ What goes into a failover?

Source: https://severalnines.com/blog/database-tco-calculating-total-cost-ownership-mysql-management

Calculating Database TCO - Colocation

Source: https://severalnines.com/blog/database-tco-calculating-total-cost-ownership-mysql-management

Calculating Database TCO - Cloud

Additional Resources

Free Disaster Recovery Whitepaper

severalnines.com/resources/whitepapers

Additional Resources

Free Database Backup Whitepaper

severalnines.com/resources/whitepapers

Additional Resources

● Calculating Database TCO

○ https://severalnines.com/blog/database-tco-calculating-total-cost-ownership-mysql-management

● Multi-DC setups for MySQL & MariaDB

○ https://severalnines.com/blog/multiple-data-center-setups-using-galera-cluster-mysql-or-mariadb

● Contact us: [email protected]

