Building a High-Availability PostgreSQL Cluster at ARIN

Post on 19-Oct-2014



description

Through a long and intense period of research, implementation, and testing, ARIN completed the migration from Oracle to PostgreSQL late last year. Learn more at: http://teamarin.net/2014/04/01/building-high-availability-postgresql-cluster-arin/

transcript

Building a High-Availability PostgreSQL Cluster

Presenter: Devon Mizelle, System Administrator

Co-Author: Steven Bambling, System Administrator

What is ARIN?

• Regional Internet registry for Canada, US, and parts of the Caribbean
• Distributes IPv4 & IPv6 addresses and Autonomous System Numbers (Internet number resources) in the region
• Provides authoritative WHOIS services for number resources in the region


ARIN’s Internal Data


Requirements


Why Not Slony or pgpool-II?

• Slony replaces PostgreSQL’s built-in replication
– Why do this?
– Why not let PostgreSQL handle it?
• pgpool-II is not ACID-compliant
– Doesn’t confirm writes to multiple nodes


Our solution

• CMAN / Corosync
– Red Hat + open-source solution for cross-node communication
• Pacemaker
– Red Hat and Novell’s solution for service management and fencing
• Both under active development by ClusterLabs


CMAN / Corosync

• Provides a messaging framework between nodes
• Handles a heartbeat between nodes
– “Are you up and available?”
– Does not provide ‘status’ of service, Pacemaker does
• Pacemaker uses Corosync to send messages between nodes
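The heartbeat idea on this slide can be illustrated with a minimal sketch. This is a toy model only, not Corosync's actual protocol: the `HeartbeatTracker` class, the node names, and the 3-second timeout are all hypothetical. It records when each node last answered and reports which nodes have gone quiet. As on the slide, it says nothing about the state of any service on the node.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatTracker:
    """Toy membership tracker (illustrative; NOT how Corosync is implemented)."""

    timeout: float  # seconds of silence before a node is treated as down
    last_seen: dict = field(default_factory=dict)

    def beat(self, node: str, now: float) -> None:
        # Record that `node` answered "are you up and available?" at time `now`.
        self.last_seen[node] = now

    def alive(self, now: float) -> list:
        # Nodes heard from within the timeout window, sorted by name.
        return sorted(n for n, t in self.last_seen.items() if now - t <= self.timeout)

    def missing(self, now: float) -> list:
        # Nodes whose last heartbeat is older than the timeout, sorted by name.
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


tracker = HeartbeatTracker(timeout=3.0)
for name in ("node-a", "node-b", "node-c"):  # hypothetical node names
    tracker.beat(name, now=0.0)
tracker.beat("node-a", now=4.0)
tracker.beat("node-b", now=4.0)  # node-c stays silent

print(tracker.alive(now=5.0))    # ['node-a', 'node-b']
print(tracker.missing(now=5.0))  # ['node-c']
```

Timestamps are passed in explicitly rather than read from the clock so the behaviour is easy to follow and to test.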


CMAN / Corosync


About Pacemaker

• Developed / maintained by Red Hat and Novell
• Scalable – Anywhere from a two-node to a 16-node setup
• Scriptable – Resource scripts can be written in any language
– Monitoring – Watches out for service state changes
– Fencing – Disables a box and switches roles when failures occur
• Shareable database between nodes about status of services / nodes
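Since resource scripts can be written in any language, here is a rough sketch of the core of one in Python. It is not ARIN's script: the `service_is_up` probe and the `handle` function are hypothetical placeholders. The exit codes follow the OCF resource agent convention that Pacemaker interprets, and a real agent would also need to implement actions such as `meta-data`.

```python
# Exit codes from the OCF resource agent convention.
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


def service_is_up() -> bool:
    # Hypothetical placeholder probe; a real agent would check a PID file,
    # a socket, or run a test query against the service.
    return False


def handle(action: str, probe=service_is_up) -> int:
    """Map a requested action to an OCF exit code."""
    if action == "monitor":
        # Pacemaker calls monitor periodically to detect state changes.
        return OCF_SUCCESS if probe() else OCF_NOT_RUNNING
    if action in ("start", "stop"):
        # A real agent would start or stop the service here.
        return OCF_SUCCESS
    return OCF_ERR_UNIMPLEMENTED


print(handle("monitor"))                      # 7: the placeholder probe reports "down"
print(handle("monitor", probe=lambda: True))  # 0
```

In a deployed agent the returned code would become the process exit status, for example via `sys.exit(handle(sys.argv[1]))`.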


Pacemaker


(Diagram: Master, Sync, and Async nodes)

Other Pacemaker Resources


• Fencing
• IP addresses

How does it all tie together?

From the bottom up…

Pacemaker


(Diagram labels: client “vip”, replication “vip”, Master, Sync, Async, App)

Event Scenario


(Diagram: the Master fails, marked with an X; node labels before and after the event: Master, Sync, Async)
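One plausible reading of this slide is that when the Master fails, the remaining nodes each move up a role. The sketch below models that reading only. The promotion policy, the `reassign_roles` function, and the `db1`..`db3` node names are assumptions for illustration, not something the slides confirm, and in the real cluster the role switch is carried out by Pacemaker's resource agent rather than by code like this.

```python
# Roles in promotion order (assumed policy: survivors move up toward "master").
RANK = ["master", "sync", "async"]


def reassign_roles(roles: dict, failed: str) -> dict:
    """Return the roles of the surviving nodes after `failed` goes down."""
    # Keep the survivors in their current promotion order.
    survivors = sorted(
        (node for node in roles if node != failed),
        key=lambda node: RANK.index(roles[node]),
    )
    # The best-ranked survivor becomes master, the next one sync, and so on.
    return {node: RANK[min(i, len(RANK) - 1)] for i, node in enumerate(survivors)}


before = {"db1": "master", "db2": "sync", "db3": "async"}  # hypothetical nodes
print(reassign_roles(before, failed="db1"))  # {'db2': 'master', 'db3': 'sync'}
print(reassign_roles(before, failed="db3"))  # {'db1': 'master', 'db2': 'sync'}
```

Promoting the synchronous standby first reflects the fact that it is the node guaranteed to hold every confirmed write.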

PostgreSQL

• Still in charge of replicating data
• The state of the service and how it starts is controlled by Pacemaker


Layout

(Diagram: a Client and three nodes, one Master and two Slaves, each running cman)

Using Tools to Look Deeper

Introspection…

# crm_mon -i 1 -Arf

# crm_mon -i 1 -Arf (cont.)

Questions? Devon Mizelle