High Availability - How to get 99.99% service availability - Designing clusters (DOs & DON'Ts)

Date post: 12-Jan-2015
Upload: barcamp-saigon
Description:
Presentation at BarcampSaigon 2013 - RMIT 7th July Presenter: Lukas Rypl
Transcript
Page 1: High Availability - How to get 99.99% service availability - Designing clusters (DOs & DON'Ts)

High Availability

Lukas Rypl

Twitter: @LukasRypl

7th July 2013

Agenda

Intro - what and why (3 mins)

Describing Requirements (10 mins)

Solutions (7 mins)

Q&A

Lukas Rypl (BarCamp Saigon) High Availability 7th July 2013 2 / 17

Customers Talking about Requirements

fault tolerance

fail-over solution

high availability

disaster recovery

geographic redundancy

cluster

What You Need to Know

Why do they need it?

Budget

Use Only One, Some or All

Active-Active (Master-Master)

Active-Standby (Master-Slave)

Operations:

read/write

read-only

none

RPO and RTO

RTO - Recovery Time Objective

RPO - Recovery Point Objective

Beyond Redundancy: How Geographic Redundancy Can Improve Service Availability and Reliability of Computer-based Systems by Eric Bauer, Randee Adams, Daniel Eustace
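The two objectives can be checked against a single incident: the RPO bounds how much recent data may be lost (the time between the last recoverable point and the failure), and the RTO bounds how long the service may stay down. The sketch below is only an illustration of that arithmetic; the `Incident` type, its field names and the example timestamps are assumptions made for this example, not something taken from the slides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical record of one outage (all names are illustrative)."""
    last_recoverable_point: datetime  # e.g. last replicated write or backup
    failure: datetime                 # moment the service went down
    service_restored: datetime        # moment the service was usable again


def data_loss_window(incident: Incident) -> timedelta:
    """Time span of data lost: compare this against the RPO."""
    return incident.failure - incident.last_recoverable_point


def recovery_time(incident: Incident) -> timedelta:
    """Time the service was unavailable: compare this against the RTO."""
    return incident.service_restored - incident.failure


def meets_objectives(incident: Incident, rpo: timedelta, rto: timedelta) -> bool:
    """True if the incident stayed within both objectives."""
    return data_loss_window(incident) <= rpo and recovery_time(incident) <= rto


if __name__ == "__main__":
    # Example values are made up for illustration.
    incident = Incident(
        last_recoverable_point=datetime(2013, 7, 7, 2, 0),
        failure=datetime(2013, 7, 7, 2, 10),
        service_restored=datetime(2013, 7, 7, 2, 40),
    )
    print("data loss window:", data_loss_window(incident))
    print("recovery time:", recovery_time(incident))
    print("within RPO 15m / RTO 1h:",
          meets_objectives(incident, timedelta(minutes=15), timedelta(hours=1)))
```

Asking the customer for these two numbers separately is useful because they drive different parts of the design: the RPO mostly constrains replication and backup frequency, while the RTO mostly constrains failover and restore procedures.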

Availability

24x7

SLA       Weekly    Monthly   Yearly
99%       1h 40m    7h 12m    3 days 15 hrs
99.9%     10m       43m 12s   8h 45m
99.99%    1m        4m 19s    52m 33s
99.999%   6s        26s       5m 15s
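The downtime figures in the table follow directly from the SLA percentage: allowed downtime = period length x (1 - SLA / 100). A minimal sketch of that calculation, assuming a 7-day week, a 30-day month and a 365-day year (these period lengths reproduce the slide's numbers up to rounding; the function names are chosen for this example):

```python
from datetime import timedelta

# Assumed period lengths: 7-day week, 30-day month, 365-day year.
PERIODS = {
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=30),
    "yearly": timedelta(days=365),
}


def allowed_downtime(sla_percent: float, period: timedelta) -> timedelta:
    """Return the maximum downtime that still meets the SLA over the period."""
    unavailable_fraction = 1 - sla_percent / 100
    return timedelta(seconds=period.total_seconds() * unavailable_fraction)


def downtime_table(sla_levels=(99.0, 99.9, 99.99, 99.999)):
    """Map each SLA level to its downtime budget per period."""
    return {
        sla: {name: allowed_downtime(sla, length) for name, length in PERIODS.items()}
        for sla in sla_levels
    }


if __name__ == "__main__":
    for sla, budgets in downtime_table().items():
        cells = ", ".join(f"{name}: {budget}" for name, budget in budgets.items())
        print(f"{sla}% -> {cells}")
```

Note that the budget covers every kind of downtime on a 24x7 service, so planned maintenance counts against it unless the SLA explicitly excludes it.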

What You Need to Know

How to handle network partitioning?

Manual or automatic failover and recovery?

Local or geographical redundancy?

Connection parameters (BW, RTT, L2/L3)

Other goals - load balancing?
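One common answer to the network partitioning question is a majority quorum: after a split, only the side that can still see more than half of the cluster nodes keeps running the service, so two isolated halves can never both accept writes. The sketch below illustrates that rule only; it is not how any particular cluster manager implements it, and the function names are assumptions made for this example.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """True if this partition sees a strict majority of the cluster.

    reachable_nodes counts the local node plus every peer it can still reach.
    """
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    if not 0 <= reachable_nodes <= cluster_size:
        raise ValueError("reachable_nodes must be between 0 and cluster_size")
    return reachable_nodes > cluster_size // 2


def partitions_allowed_to_run(cluster_size: int, partition_sizes: list[int]) -> list[bool]:
    """For each partition after a split, decide whether it may keep serving."""
    if sum(partition_sizes) > cluster_size:
        raise ValueError("partitions contain more nodes than the cluster")
    return [has_quorum(size, cluster_size) for size in partition_sizes]


if __name__ == "__main__":
    # A 3-node cluster split 2 + 1: only the 2-node side keeps quorum.
    print(partitions_allowed_to_run(3, [2, 1]))
    # A 2-node cluster split 1 + 1: neither side has a majority.
    print(partitions_allowed_to_run(2, [1, 1]))
```

The two-node case shows why an even split is awkward: neither side has a majority, so a plain majority rule stops the service on both sides, and something else (a third vote or fencing of the peer) is needed to decide who continues.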

Solutions - Layers

High Availability and Disaster Recovery: Concepts, Design, Implementation by Klaus Schmidt

Solutions - Hardware

disks: RAID 1, 10, 5, ... (controller with BBWC)

power supply

network interface: teaming/bonding

out-of-band management (HP iLO, Dell DRAC, IBM RSA)
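The reason for duplicating disks, power supplies and network interfaces can be shown with a simple availability estimate: components that must all work multiply their availabilities, while redundant copies fail only if every copy fails. The sketch below is a rough model under a strong assumption, namely that failures are independent, which real hardware often violates (shared power feed, disks from the same batch). The function names and the example figures are chosen for illustration only.

```python
import math
from typing import Iterable


def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of redundant copies where one working copy is enough.

    Assumes independent failures: the group is down only if all copies are down.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1 - (1 - component_availability) ** copies


def serial_availability(availabilities: Iterable[float]) -> float:
    """Availability of a chain in which every component must be up."""
    return math.prod(availabilities)


if __name__ == "__main__":
    # Illustrative figures only: a single 99% component versus a redundant pair.
    single = 0.99
    pair = parallel_availability(single, 2)
    print(f"one component:  {single:.4%}")
    print(f"redundant pair: {pair:.4%}")
    # A server that needs its disk group, power and network to all be up.
    print(f"chain of three pairs: {serial_availability([pair, pair, pair]):.4%}")
```

Under these assumptions a pair of 99% components reaches roughly 99.99%, but chaining several such pairs lowers the total again, which is why every layer has to be considered.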

Solutions - Data

snapshots

filesystem: DRBD (block-level replication)

NAS, SAN

CAP (Brewer’s) theorem

http://blog.rizzif.com/2011/08/31/intro-to-nosql/

Solutions - Databases

MySQL - Master-Master replication

PostgreSQL - Master-Slave, 3rd party for Master-Master

Oracle RAC

Solutions - CRM (Cluster Resource Manager)

corosync + pacemaker

resources (application, IP address, ...)

rules (where, when, what requires)

master-slave

Partitioning not allowed

STONITH required

www.clusterlabs.org

Q & A
