
PostgreSQL Scaling And Failover

Date post: 10-May-2015
Upload: john-paulett
Description:
Overview of PostgreSQL scaling and high availability options.
Page 1: PostgreSQL Scaling And Failover

PostgreSQL

John Paulett

October 26, 2009

High Availability & Scaling


Overview

Scaling Overview
– Horizontal & Vertical Options

High Availability Overview

Other Options

Suggested Architecture

Hardware Discussion


What are we trying to solve?

Survive server failure?
– Support an uptime SLA (e.g. 99.9999%)?

Application scaling?
– Support additional application demand


→ Many options, each optimized for different constraints
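
To make the SLA figure concrete, the short sketch below converts an uptime percentage into the downtime it allows per year. The function name and the 365-day year are assumptions for illustration; they are not from the slides.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def allowed_downtime_seconds(uptime_percent: float) -> float:
    """Return the yearly downtime budget, in seconds, for a given uptime SLA."""
    return SECONDS_PER_YEAR * (1 - uptime_percent / 100)


for sla in (99.9, 99.99, 99.9999):
    seconds = allowed_downtime_seconds(sla)
    print(f"{sla}% uptime allows about {seconds:,.0f} seconds of downtime per year")
```

At 99.9999% the budget is roughly half a minute per year, which is less time than a manual failover usually takes. That is why the target SLA drives the choice among the options that follow.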


Scaling Overview


How To Scale

Horizontal Scaling
– “Google” approach
– Distribute load across multiple servers
– Requires appropriate application architecture

Vertical Scaling
– “Big Iron” approach
– Single, massive machine (lots of fast processors, RAM, & hard drives)


Horizontal DB Scaling

Load Balancing
– Distribute operations to multiple servers

Partitioning
– Cut up the data (horizontal) or tables (vertical) and put them on separate servers
– aka “sharding”


Basic Problem when Load Balancing

Difficult to maintain consistent state between servers (remember ACID), especially when dealing with writes

4 PostgreSQL Load Balancing Methods:
– Master-Slave Replication
– Statement-Based Replication Middleware
– Asynchronous Multimaster Replication
– Synchronous Multimaster Replication


Master-Slave Replication

Master handles writes, slaves handle reads

Asynchronous replication
– Possible data loss on master failure

Slony-I
– Does not automatically propagate schema changes
– Does not offer single connection point
– Requires separate solution for master failures
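
Without a single connection point, the application has to decide where each statement goes. The sketch below shows one naive way to do that. The class, the keyword list, and the server names are hypothetical, and none of this is part of Slony-I.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and rotate reads across the slaves."""

    WRITE_KEYWORDS = ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP", "TRUNCATE")

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master for reads when no slave is configured.
        self._read_cycle = itertools.cycle(slaves or [master])

    def route(self, sql):
        words = sql.split()
        first_word = words[0].upper() if words else ""
        if first_word in self.WRITE_KEYWORDS:
            return self.master
        return next(self._read_cycle)


router = ReadWriteRouter("db-master", ["db-slave-1", "db-slave-2"])
print(router.route("INSERT INTO users (name) VALUES ('ann')"))
print(router.route("SELECT * FROM users"))
print(router.route("SELECT count(*) FROM users"))
```

Looking only at the first keyword is deliberately simplistic: it misclassifies statements such as SELECT ... FOR UPDATE or calls to functions that write. Because replication is asynchronous, a read routed to a slave may also return slightly stale data.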


Statement-Based Replication Middleware

Intercept SQL queries, send writes to all servers, reads to any server

Possible issues using random(), CURRENT_TIMESTAMP, & sequences

pgpool-II
– Connection Pooling, Replication, Load Balancing, Parallel Queries, Failover
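
The random() problem can be reproduced in a few lines. In the sketch below, two in-memory SQLite databases stand in for two PostgreSQL servers (an assumption so the example runs without a server), and the loop over them plays the role of the middleware. It is not pgpool-II's implementation.

```python
import random
import sqlite3

# Two in-memory SQLite databases stand in for two PostgreSQL replicas.
replicas = [sqlite3.connect(":memory:") for _ in range(2)]
for db in replicas:
    db.execute("CREATE TABLE tokens (value INTEGER)")

# Naive middleware: forward the same SQL text to every replica.
# Each server evaluates random() on its own, so the stored values diverge.
for db in replicas:
    db.execute("INSERT INTO tokens VALUES (random())")

# Safer: evaluate the non-deterministic value once, then send it as a parameter.
token = random.getrandbits(32)
for db in replicas:
    db.execute("INSERT INTO tokens VALUES (?)", (token,))

rows = [db.execute("SELECT value FROM tokens ORDER BY rowid").fetchall() for db in replicas]
print("random() rows match:   ", rows[0][0] == rows[1][0])
print("parameter rows match:  ", rows[0][1] == rows[1][1])
```

The same reasoning applies to CURRENT_TIMESTAMP and to sequences: any value a server computes for itself has to be computed once and then replicated as data, not recomputed on every node.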


pgpool-II


Synchronous Multimaster Replication

Writes & reads on any server

Not implemented in PostgreSQL, but application code can mimic via two-phase commit
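
A minimal sketch of the two-phase commit idea follows. The Participant class is an in-memory stand-in for a database server; against real PostgreSQL servers the three methods would correspond to the PREPARE TRANSACTION, COMMIT PREPARED and ROLLBACK PREPARED statements. All class and function names here are hypothetical.

```python
class Participant:
    """In-memory stand-in for one database server in a two-phase commit."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.committed = []
        self._prepared = {}

    def prepare(self, txn_id, change):
        # Phase 1: promise to commit (PREPARE TRANSACTION in PostgreSQL).
        if not self.healthy:
            return False
        self._prepared[txn_id] = change
        return True

    def commit(self, txn_id):
        # Phase 2: make the prepared change permanent (COMMIT PREPARED).
        self.committed.append(self._prepared.pop(txn_id))

    def rollback(self, txn_id):
        # Abort path (ROLLBACK PREPARED); a no-op if nothing was prepared.
        self._prepared.pop(txn_id, None)


def two_phase_commit(participants, txn_id, change):
    """Apply `change` on every participant or on none of them."""
    prepared = []
    for p in participants:
        if p.prepare(txn_id, change):
            prepared.append(p)
        else:
            for q in prepared:
                q.rollback(txn_id)
            return False
    for p in prepared:
        p.commit(txn_id)
    return True


a, b = Participant("a"), Participant("b", healthy=False)
print(two_phase_commit([a, b], "txn-1", "INSERT ..."))  # b cannot prepare
print(a.committed)
```

The sketch leaves out the hard part: a coordinator that crashes between the two phases leaves prepared transactions behind, and something must later commit or roll them back.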


Load Balancing Issue

Scaling writes breaks down at a certain point


Partitioning

Requires heavy application modification

Performing queries across partitions is problematic (not possible)

PL/Proxy can help
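
The application change that partitioning demands is mostly routing: every query must first work out which server holds the row. The sketch below does this with a stable hash in application code. The shard names and function names are hypothetical, and this is not PL/Proxy's interface (PL/Proxy performs comparable routing inside the database).

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical server names


def shard_for(key, shards=SHARDS):
    """Map a partitioning key to one shard using a hash that is stable across processes."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:4], "big") % len(shards)]


def fan_out(query, shards=SHARDS):
    """A cross-partition query must visit every shard; merging results is left to the caller."""
    return [(shard, query) for shard in shards]


print(shard_for("user:42"))
print(fan_out("SELECT count(*) FROM users"))
```

Single-key lookups stay cheap, while anything that spans keys (joins, aggregates, uniqueness checks) turns into a fan-out plus application-side merging. Note also that with a plain modulo, changing the number of shards remaps most keys.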


Vertical DB Scaling

“Buying a bigger box is quick(ish). Redesigning software is not.”
– Cal Henderson, Flickr

37signals Basecamp upgraded to 128 GB DB server: “don’t need to pay the complexity tax yet”
– David Heinemeier Hansson, Ruby on Rails


Sites Running on Single DB

Stack Overflow
– MS SQL, 48GB RAM, RAID 1 OS, RAID 10 for data

37signals Basecamp
– MySQL, 128GB RAM, Dell R710 or Dell 2950


High Availability Overview


High Availability

Application still up even after node failure
– (Also try to prevent failure with appropriate hardware)

PostgreSQL High Availability Options
– pgpool-II
– Shared Disk Failover
– File System Replication
– Warm Standby with Point-In-Time Recovery (PITR)

Often still need heartbeat application


Shared Disk Failover

Use single disk array to hold database's data files
– Network Attached Storage (NAS)
– Network File System (NFS)

Disk array is single point of failure

Need heartbeat to bring 2nd server online


File System Replication

File system is mirrored to another computer

DRBD
– Linux filesystem replication

Need heartbeat to bring 2nd server online


Point in Time Recovery

“Log shipping”
– Write Ahead Logs sent to and replayed on standby
– Included in PostgreSQL 8.0+
– Asynchronous: potential loss of data

Warm Standby
– Standby's hardware should be very similar to primary's
– Need heartbeat to bring 2nd server online
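
The data-loss window of asynchronous log shipping can be shown with a toy model. In the sketch below, the primary ships only completed log segments, so whatever sits in the current, partly filled segment is lost if the primary dies. The classes and the three-record segment size are assumptions for illustration; real WAL segments are files, 16 MB each by default.

```python
class Standby:
    def __init__(self):
        self.data = []

    def replay(self, segment):
        self.data.extend(segment)


class Primary:
    def __init__(self, segment_size=3):
        self.segment_size = segment_size
        self.wal = []      # every committed change, in order
        self.shipped = 0   # number of WAL records already sent to the standby

    def commit(self, change, standby):
        self.wal.append(change)
        # Ship only complete segments; a partly filled segment stays on the primary.
        while len(self.wal) - self.shipped >= self.segment_size:
            segment = self.wal[self.shipped:self.shipped + self.segment_size]
            standby.replay(segment)
            self.shipped += self.segment_size


primary, standby = Primary(segment_size=3), Standby()
for i in range(1, 8):
    primary.commit(f"txn-{i}", standby)

# If the primary fails now, only the shipped segments exist on the standby.
lost = primary.wal[len(standby.data):]
print("replayed on standby:", standby.data)
print("lost on failover:   ", lost)
```

Here seven transactions commit on the primary, six reach the standby, and the seventh is lost. On a lightly loaded server a segment can take a long time to fill, which is why a timeout that forces a segment switch (archive_timeout in PostgreSQL) is commonly used to bound the window.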


Heartbeat

“STONITH” (Shoot The Other Node In The Head)
– Prevent multiple nodes thinking they are the master

Linux-HA
– Creates cluster, takes nodes out when they fail
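
The ordering that STONITH enforces (fence first, promote second) can be sketched as follows. The monitor class, its timeout, and the fence callback are hypothetical, and this is not how Linux-HA is configured or implemented. Timestamps are passed in explicitly so the example is deterministic.

```python
class FailoverMonitor:
    """Promote the standby only after the primary misses heartbeats and has been fenced."""

    def __init__(self, timeout, fence):
        self.timeout = timeout   # seconds of silence before the primary is presumed dead
        self.fence = fence       # callable that powers off or isolates the old primary
        self.last_seen = 0.0
        self.active = "primary"

    def heartbeat(self, now):
        self.last_seen = now

    def check(self, now):
        if self.active == "primary" and now - self.last_seen > self.timeout:
            # STONITH: make sure the old primary cannot accept writes before the
            # standby takes over, so two nodes never both act as master.
            if self.fence():
                self.active = "standby"
        return self.active


def power_off_primary():
    print("fencing old primary")
    return True


monitor = FailoverMonitor(timeout=5, fence=power_off_primary)
monitor.heartbeat(now=0)
print(monitor.check(now=3))   # still within the timeout
print(monitor.check(now=6))   # heartbeat missed: fence, then promote
```

If fencing fails, the monitor refuses to promote. Staying down is the safer outcome, because a missed heartbeat may only mean a broken network link and not a dead primary.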


Additional Options


Tune PostgreSQL
– Defaults designed to “run anywhere”
– pgbench, VACUUM/ANALYZE

Tune Queries
– EXPLAIN

Caching (avoid the database)
– memcached
– Ehcache
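
A cache in front of the database usually follows the pattern sketched below: look in the cache, query the database only on a miss, and invalidate on writes. A plain dictionary stands in for memcached or Ehcache here, and the class and parameter names are hypothetical.

```python
import time


class CacheAside:
    """Cache-aside lookup: check the cache first, fall back to the database on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}   # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                       # cache hit: no database query
        value = self._load(key)                   # cache miss: query the database
        self._entries[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key):
        """Call after a write so the next read fetches fresh data."""
        self._entries.pop(key, None)


def slow_query(user_id):
    print(f"querying database for {user_id}")
    return {"id": user_id}


cache = CacheAside(slow_query, ttl_seconds=30)
cache.get(1)   # miss: hits the database
cache.get(1)   # hit: served from the cache
```

The expiry time and the explicit invalidation are the two knobs that trade freshness against database load.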


Radical Additional Options

“NoSQL” database
– CouchDB, MongoDB, HBase, Cassandra, Redis
– Document store
– Map/Reduce querying


Suggested Architecture


Current Production Setup

DB and Web server on same machine

No failover


Suggested Architecture

2 nice machines

Point in Time Recovery with Heartbeat

Tune PostgreSQL

Monitor & improve slow queries

Add in Ehcache as we touch code

→ Leave horizontal scaling for another day


Initial Architecture

High Availability


Future Architecture

Scale up application servers horizontally as needed

Improve DB Hardware


Hardware Options

PostgreSQL typically constrained by RAM & Disk IO, not processor

64-bit, as much memory as possible

Data Array
– RAID 10 with 4 drives (not RAID 5), 15k RPM

Separate OS Drive / Array
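
The preference for RAID 10 over RAID 5 can be illustrated with the usual rule-of-thumb write penalties: 2 physical writes per logical write for RAID 10, 4 I/Os for RAID 5. The per-drive size and IOPS figures below are assumptions for illustration, not measurements, and controller caches change the picture in practice.

```python
def raid_summary(level, drives, drive_gb, drive_iops):
    """Estimate usable capacity and random-write IOPS using rule-of-thumb write penalties."""
    if level == "RAID 10":
        usable_drives, write_penalty = drives / 2, 2   # every write goes to both mirrors
    elif level == "RAID 5":
        usable_drives, write_penalty = drives - 1, 4   # read data + parity, write data + parity
    else:
        raise ValueError(f"unsupported level: {level}")
    return {
        "usable_gb": usable_drives * drive_gb,
        "write_iops": drives * drive_iops / write_penalty,
    }


# Assumed figures: four 146 GB drives at about 175 random IOPS each.
for level in ("RAID 10", "RAID 5"):
    print(level, raid_summary(level, drives=4, drive_gb=146, drive_iops=175))
```

Under these assumptions RAID 5 offers more usable space from the same four drives but about half the random-write throughput, which matters for a write-heavy, disk-bound database.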


Dell R710

Processor: Xeon

4x 15k HD in RAID10

24GB (3x 8GB) RAM (up to 6x 16GB)

=$6,905


Other Considerations

Should have Test environment mimic Production
– Same database setup
– Provides environment for experimentation

Can host multiple DBs on single cluster


References

http://37signals.com/svn/posts/1509-mr-moore-gets-to-punt-on-sharding

http://37signals.com/svn/posts/1819-basecamp-now-with-more-vroom

http://anchor.com.au/hosting/dedicated/Tuning_PostgreSQL_on_your_Dedicated_Server

http://blogs.amd.co.at/robe/2009/05/testing-postgresql-replication-solutions-log-shipping-with-pg-standby.html

http://blog.stackoverflow.com/2009/01/new-stack-overflow-servers-ready/

http://developer.postgresql.org/pgdocs/postgres/high-availability.html

http://developer.postgresql.org/pgdocs/postgres/pgbench.html

https://developer.skype.com/SkypeGarage/DbProjects/PlProxy

http://wiki.postgresql.org/wiki/Performance_Optimization

http://www.postgresql.org/docs/8.4/static/warm-standby.html

http://www.postgresql.org/files/documentation/books/aw_pgsql/hw_performance/

http://www.slony.info/


Additional Links

http://ehcache.org/

http://highscalability.com/skype-plans-postgresql-scale-1-billion-users

http://www.25hoursaday.com/weblog/2009/01/16/BuildingScalableDatabasesProsAndConsOfVariousDatabaseShardingSchemes.aspx

http://www.danga.com/memcached/

http://www.mysqlperformanceblog.com/2009/08/06/why-you-dont-want-to-shard/

http://www.slideshare.net/iamcal/scalable-web-architectures-common-patterns-and-approaches-web-20-expo-nyc-presentation
