Hosted PostgreSQL

Mike Fowler

● Senior Site Reliability Engineer in the Public Cloud Practice of claranet

● Background in Software Engineering, Systems Engineering, System & Database Administration

● Contributed to several open source projects (YAWL, PostgreSQL & Terraform)

● Been using PostgreSQL since 7.4

About Me

● Hosted PostgreSQL

● Overview of public cloud hosting options

● Database migration strategies

Overview

● Your database somewhere else

● A managed service

– Some providers offer full DBA support

– Cloud providers give only the infrastructure

● Typically provisioned through an API or GUI

– i.e. a self-service environment

What is hosted PostgreSQL?

● Reduces adoption costs

● Installation & configuration is already done

– Generally sane defaults, some tuning often required

● Needn’t worry about physical servers

● Opex instead of Capex

● Most routine DBA tasks are done for you

● Easier to grow

Benefits of Hosted PostgreSQL

● Less control

● Latency

● Some features are disabled

● Migrating existing databases is hard

● Potential for vendor lock-in

● Resource limits

Drawbacks of Hosted PostgreSQL

● We’ll look only at Public Cloud offerings

● Current major offerings

– Amazon Relational Database Service (RDS)

– Heroku

● Future major offerings

– Amazon Aurora

– Google Cloud SQL

– Microsoft Azure

Hosting Options

● PostgreSQL 9.3.12 – 9.6.2 supported

● Numerous instance types

– Costs range from $0.018 to $4.97 per hour

– Select from 1 vCPU up to 32 vCPUs, all 64-bit

– Memory ranges from 1GB to 244GB

● Flexible storage options

– Choose between SSD or Provisioned IOPS

– Up to 6TB with up to 30,000 IOPS

Amazon RDS

● High availability multi-availability zone option

– Synchronous replica

– Automatic failover (~2 minutes)

● Up to 5 read-only replicas (asynchronous replication)

● Configurable automatic backups with PITR

● Monthly uptime percentage of 99.95 per instance

– Allows for approximately 22 minutes of downtime per month

Amazon RDS
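
The downtime figure quoted for a 99.95% monthly uptime commitment can be checked with a little arithmetic. The sketch below is only an illustration: the function name and the 30-day default month are assumptions, not part of any provider's SLA wording.

```python
def allowed_downtime_minutes(uptime_percent: float, days_in_month: int = 30) -> float:
    """Minutes of downtime per month permitted by a monthly uptime percentage.

    Hypothetical helper for illustration; the month length is an assumption.
    """
    minutes_in_month = days_in_month * 24 * 60
    return minutes_in_month * (100.0 - uptime_percent) / 100.0


# 30-day and 31-day months, rounded to one decimal place
print(round(allowed_downtime_minutes(99.95), 1))
print(round(allowed_downtime_minutes(99.95, days_in_month=31), 1))
```

Depending on the length of the month this gives roughly 21.6 to 22.3 minutes, which is where the "approximately 22 minutes" comes from.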

● Supports PostgreSQL 9.3, 9.4, 9.5 & 9.6

● Simpler pricing based on choice of tier ($0-8.5k pcm)

● Tier dictates resource limits

– Maximum number of rows (Hobby only)

– Cache size (1GB - 240GB)

– Storage limit (64GB - 1TB)

– Connection limit (120 - 500)

– Rollback (4 days – 1 week)

Heroku

● Fork & Follow

● Some of your data may end up in the US

– Logs (can be blocked at creation time)

– Snapshots & Dataclips

● Not possible to replicate out

– No permission for Bucardo, Londiste & Slony

– Remote slave is prohibited

– Only way is dump & restore

Heroku

● Currently in open preview

– Largely free to use but no SLA

● Compatible with PostgreSQL 9.6

● Up to 2x throughput of conventional PostgreSQL

● Up to 16 read replicas with sub-10ms replica lag

● Auto-growing filesystem up to 64TB

– Filesystem is shared between 3 availability zones

Amazon Aurora

● Currently in Beta (no SLA)

● Only supports PostgreSQL 9.6

● Only available in Iowa, no replication support

● Poised to be a serious rival to RDS

– Billing per minute

– Automatic scaling of filesystem

– Similar variety of instance types

● Minimal extensions but includes PostGIS

Google Cloud SQL

● Currently in preview (no SLA)

● Supports PostgreSQL 9.5 & 9.6

● Replication is seamless

– Automated failover

– PITR

● Selectable compute units

● Supports some extensions including PostGIS

Microsoft Azure

● Dump & Restore

● Replication failover

● PITR + Logical decoding

Migration Strategies

● Simplest strategy

– Perceived as low risk for data loss

– Fewer “moving parts”

● Just a pg_dump & pg_restore

● Downtime is a function of database size

Dump & Restore
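
As a rough illustration of the dump & restore approach, the sketch below only builds the two command lines rather than running them. The connection strings, the dump path and the helper names are hypothetical; the custom archive format and `--no-owner` are choices made for this example (hosted services often do not grant superuser rights), not something the slides prescribe.

```python
import shlex
from typing import List


def pg_dump_command(source_dsn: str, dump_path: str) -> List[str]:
    """Build a pg_dump invocation that writes a custom-format archive."""
    return [
        "pg_dump",
        "--format=custom",        # custom format allows parallel restore later
        f"--file={dump_path}",
        f"--dbname={source_dsn}",
    ]


def pg_restore_command(target_dsn: str, dump_path: str, jobs: int = 4) -> List[str]:
    """Build a pg_restore invocation that loads the archive into the target."""
    return [
        "pg_restore",
        "--no-owner",             # assumption: original role names may not exist on the target
        f"--jobs={jobs}",         # parallel workers, supported for custom-format archives
        f"--dbname={target_dsn}",
        dump_path,
    ]


# Hypothetical endpoints, shown only to print the resulting commands
dump = pg_dump_command("postgresql://app@old-host/appdb", "/tmp/appdb.dump")
restore = pg_restore_command("postgresql://app@cloud-host/appdb", "/tmp/appdb.dump")
for command in (dump, restore):
    print(" ".join(shlex.quote(part) for part in command))
```

The application has to stay offline (or read-only) from the start of the dump until the restore finishes, which is why downtime grows with database size.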

● Move historic data ahead of time

– Opportunity to clear out unused data

– Consider introducing partitions

● Consider moving the dump closer to the target

– e.g. Upload to EC2 instance in the same region as the RDS instance and run pg_restore from there

● Over-provision resources

– Gives higher throughput during data load

– Downscale once operational

Strategies to Minimise Downtime

● No one supports external masters!

● Trigger-based replication failover

– Slony, Londiste & Bucardo

● Can be used on almost any version of PostgreSQL

● Some restrictions apply

– DDL is not supported

– Rows must be uniquely identifiable

Replication Failover
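
Because trigger-based replication needs every row to be uniquely identifiable, it can help to list the tables that lack a primary key before configuring anything. The following is a minimal sketch, assuming a DB-API style cursor (for example from psycopg2) connected to the source database; the function and constant names are hypothetical, and the query only looks at primary key constraints, not at unique indexes.

```python
# Catalog query: ordinary tables in user schemas with no primary key constraint
TABLES_WITHOUT_PK_SQL = """
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (
        SELECT 1
        FROM pg_constraint con
        WHERE con.conrelid = c.oid
          AND con.contype = 'p'
  )
ORDER BY n.nspname, c.relname
"""


def tables_without_primary_key(cursor):
    """Return 'schema.table' names for tables that have no primary key.

    `cursor` is assumed to be a DB-API cursor on the source database.
    """
    cursor.execute(TABLES_WITHOUT_PK_SQL)
    return [f"{schema}.{table}" for schema, table in cursor.fetchall()]
```

Any table this returns would need a key added, or would have to be migrated some other way, before it can join the replication set.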

● Presents some risk to production environment

– Initial overhead of replicating each table

    ● Gradually add tables to the configuration to spread the load

– Per-transaction overhead

    ● Write latency to remote slave

    ● Heavy write workload could lead to high replication lag

● This also works to replicate out of RDS but not Heroku

Replication Failover

● Most involved approach, least downtime

● Combines point-in-time recovery with the changes captured by logical decoding to create a replica

● Need to be running at least PostgreSQL 9.4 with WAL level logical and have WAL archiving configured

● DDL not supported, still need unique rows

● Recommend barman for managing WAL: http://www.pgbarman.org/

● Recommend decoder_raw as logical decoding plugin: github.com/michaelpq/pg_plugins/tree/master/decoder_raw

PITR & Logical decoding

1. Create a logical replication slot

SELECT * FROM pg_create_logical_replication_slot ('logical_slot', 'decoder_raw');

2. Note the transaction ID (catalog_xmin)

SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'logical_slot';


3. Perform a barman backup

$ barman backup master

4. Perform a barman PITR

$ barman recover --target-xid (catalog_xmin - 1) master latest <destination_directory>

5. Start database and verify correct recovery


6. Perform pg_dump on the read-only barman node

7. Restore to public cloud

8. Read output of logical decoding and write to cloud
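
The final step, reading the logical decoding output and writing it to the cloud database, might look something like the sketch below. It assumes two DB-API cursors using the `%s` parameter style (as psycopg2 does), and that decoder_raw emits each change as a plain SQL statement in the `data` column; the function and constant names are hypothetical.

```python
# pg_logical_slot_get_changes returns (lsn, xid, data) and consumes the changes it returns
GET_CHANGES_SQL = "SELECT data FROM pg_logical_slot_get_changes(%s, NULL, NULL)"


def apply_decoded_changes(source_cursor, target_cursor, slot_name="logical_slot"):
    """Replay decoded changes from the source slot against the target database.

    Returns the number of statements applied. Committing on the target
    connection is left to the caller.
    """
    source_cursor.execute(GET_CHANGES_SQL, (slot_name,))
    applied = 0
    for (statement,) in source_cursor.fetchall():
        if not statement.strip():
            continue  # skip empty output rows, if any
        target_cursor.execute(statement)
        applied += 1
    return applied
```

Calling this in a loop until it returns 0 would drain the backlog before cut-over. Since the get function consumes changes as it reads them, a more careful version might use `pg_logical_slot_peek_changes` first and only consume once the target has committed.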


● Hosted PostgreSQL gives you high performance PostgreSQL without the hassle of hardware, maintenance and configuration

● Opex instead of Capex

● Consider the limitations of your intended platform

● There are multiple options for migration

Summary

Date post:	21-Jan-2018
Upload:	mike-fowler