Date post: | 10-Apr-2015 |
Category: |
Documents |
Upload: | robert-hodges |
View: | 3,248 times |
Download: | 0 times |
© Continuent 2008
Getting Smart about the NewWorld of PostgreSQL Replication
PostgreSQL Conference West 2008Robert Hodges – CTO Continuent, Inc.
2 © Continuent PostgreSQL West Conference 2008
Agenda
/ Introductions/ Why Are We Here?/ How Does Database Replication Work?/ What Kinds of Replication Does PostgreSQL Have
Available?/ Summary
3 © Continuent PostgreSQL West Conference 2008
About Continuent
/ Company• The leading provider of scale-out and replication solutions for
open source databases
/ Solutions• Tungsten Enterprise Multi-Master, aka uni/cluster• Tungsten Enterprise Master/Slave
/ Value• Help customers build highly capable applications with low cost
databases and commodity hardware/ Open Source
• Tungsten Replicator - Master/slave replication• Tungsten Connector - Proxy for PostgreSQL and MySQL• Sequoia - Generic middleware clustering for JDBC databases• Hedera - Group communications adapters• Bristlecone - Performance benchmarks for database clusters
4 © Continuent PostgreSQL West Conference 2008
Why Are We Here?
5 © Continuent PostgreSQL West Conference 2008
Possible Reasons
/ You hunger for enlightenment concerning themysteries of data replication
/ You thirst for instruction on the ways ofreplication in the PostgreSQL world
/ The other talks looked really boring
6 © Continuent PostgreSQL West Conference 2008
DEEP QUESTIONS about Replication
/ How does it work?/ Does it work over a WAN?/ Where’s multi-master??!/ Do I have to change my application?/ What do I have to change?/ How does it affect performance?/ Does it scale reads?/ Does it scale writes?/ What happens when a database fails?/ What happens when the network fails?/ What’s “split brain?” That sounds really bad.
7 © Continuent PostgreSQL West Conference 2008
How Does DatabaseReplication Work?
8 © Continuent PostgreSQL West Conference 2008
What is Database Replication?
/ Database replication copies updates automaticallyfrom one database to another
/ The biggest problem in replication is serializingupdates so copies are identical
/ Replication is incredibly useful/ …But also confusing because there are so many
ways to do it/ …And because there are some non-obvious gotchas
9 © Continuent PostgreSQL West Conference 2008
Let Me Count the Ways: Bits or SQL?
/ Fun fact: every self-respecting DBMS can apply atleast two kinds of changes
/ Log records -- Databases apply them automaticallyduring recovery
/ SQL statements -- Clients send SQL to make changes
Physical ReplicationReplicate log records/events tocreate bit-for-bit copy
Logical ReplicationReplicate SQL to createequivalent data
Transparent, high performance,hard to open replicas
Flexible, fewer/differentrestrictions, replicas support reads
10 © Continuent PostgreSQL West Conference 2008
Replicate Statements or Rows?
/ SQL updates other than DDL can be represented intwo different ways
/ Statements -- What the client said/ Row updates -- What the client actually did
Statement ReplicationReplicate changes as SQLstatements
Row ReplicationReplicate changes other thanDDL as row updates
DDL, only way some DBMS canreplicate
Flexible, fewer weird exceptions
11 © Continuent PostgreSQL West Conference 2008
Use Triggers or Read the Log?
/ Another Fun Fact: There are three ways to capturechanges in databases
/ Triggers - For impatient app developers/ Recovery log - For patient developers outside server/ Events - For very patient developers inside server
Log-BasedReplication
Read changes fromrecovery log
Trigger-BasedReplication
Replicate changes asrow updates
Event-BasedReplicationCapture “events”
within server
Quick but lots ofbaggage
High performance butexpensive to implement
Requires server re-engineering
12 © Continuent PostgreSQL West Conference 2008
Replicate Now or a Little Later?
/ Replicating is like buying a car--there are lots of waysto pay for it
/ $0 down - Pay later; hope nothing goes wrong/ Down payment - Pay some so less goes wrong later/ Cash - Pay up front and it’s yours forever
AsynchronousReplication
Commit now,replicate later
Semi-SynchronousReplication
Replicate to at leastone other database
SynchronousReplication
Replicate fully toall other databases
Lose data but robustagainst network failure
Trade-off data loss vs.partition handling
Network fails --> youstop
13 © Continuent PostgreSQL West Conference 2008
Multi-Master or Master/Slave or…?
OK, now it gets confusing! Should I…/ Update one database and let it serialize all changes?/ Update any database with global serialization?/ Update any database and replicate without global
serialization?
Master/Slave
Single masterserializes and
replicates
Multi-Master
Multiple masterswith globalserialization
Master-Master
Multiple masterswith no globalserialization
Fast serialization,SPOF, no split brain
Good scaling but reallyhard to implement
Convenient for WAN buthard for applications
14 © Continuent PostgreSQL West Conference 2008
Master/Slave Replication is Way Flexible!
MasterMaster SlaveSlave
Master-Slave Pair forbasic availability
MasterMaster SlaveSlave
Fan-Out forread scaling SlaveSlave
SlaveSlave MasterMaster
Fan-In for datawarehouseloading
MasterMaster
MasterMasterMaster/Master/SlaveSlave
Chaining to off-load master or goacross slow link
SlaveSlave
MasterMaster MasterMasterCircular Replicationfor cross-siteoperation
15 © Continuent PostgreSQL West Conference 2008
What Kinds of ReplicationDoes PostgreSQL Support?
16 © Continuent PostgreSQL West Conference 2008
An Amazing Number of Ways to Replicate!
Most of the “active” replication methods
Log-based master/slaveMammothReplicator
Heterogeneous log-basedmaster/slave
TungstenReplicator
Master-master replicationBucardoCertification-based multi-masterPostgres RClient-driver-based multi-masterHA-JDBCMiddleware-based multi-masterPg-Pool IIMiddleware-based multi-masterSequoiaTrigger-based master/slaveLondistTrigger-based master/slaveSLONY
DescriptionNameWAL shipping to standby instanceWarm Standby
17 © Continuent PostgreSQL West Conference 2008
Warm Standby
/ Warm standby--your friend for basic availability/ Replicates the write-ahead log (WAL) to standby
server/ Key Characteristics
• Physical replication - Copies WAL blocks to standbyhost in permanent recovery mode
• Asynchronous - Applications do not wait• Log-based - Reads completed WAL files
/ So…• It’s completely transparent• It’s very fast• It does not have serialization or split brain problems
18 © Continuent PostgreSQL West Conference 2008
Warm Standby Implementation
WALWALSegmentsSegments
PostgreSQLPostgreSQL
Master InstanceMaster Instance
pg_xlogspg_xlogsDirectoryDirectory
ArchivedArchivedWALWAL
SegmentsSegments
ArchiveArchiveDirectoryDirectory
PostgreSQLPostgreSQL
StandbyStandby
19 © Continuent PostgreSQL West Conference 2008
Warm Standby Self Defense
/ Warm standby helps with availability, not scaling/ It works over a WAN for disaster recovery/ You cannot open the standby or otherwise use it in
any way without doing a failover/ You are virtually guaranteed to lose data if your
system is busy and you don’t replicate pg_xlogs/ Stupid warm standby tricks you might want to know
• Creating throw-away reporting databases with ZFS• Avoiding data loss using DRBD
/ You should also know about the following:• pgstandby - Makes standby setup very easy• Ucarp and Heartbeat - Automates failover• Other tools?
20 © Continuent PostgreSQL West Conference 2008
PostgreSQL 8.4 Will Be Better!
/ Well maybe…Here’s the plan/ WAL streaming - Copy log records in real time/ Read-only standby - Open standby for reads/ Synchronous and asynchronous modes
• Async - Don’t wait for standby acknowledgement• Syncnet - Wait for acknowledgement that event is received• Syncdisk - Wait for standby to process fully
/ Monitoring hooks to allow Heartbeat, Ucarp, etc. tointegrate easily and manage failover
/ Unclear how much will make it into the 8.4 release
21 © Continuent PostgreSQL West Conference 2008
SLONY
/ SLONY--Table to table replication for PostgreSQL• HA and disaster recovery• Database upgrades• Workload scaling• Geographic distribution of data
/ Key Characteristics• Logical replication - Replicates using SQL• Asynchronous - Replication occurs after commits• Trigger-based - Triggers on all replicated tables• Row-based - Separate mechanismc for DDL
/ So…• It’s flexible• It’s reasonably fast• No split-brain• It handles any DML• Replicas open for reads
22 © Continuent PostgreSQL West Conference 2008
Master / Slave Replication (SLONY)
MasterDB
Backups
Client Updates
Slave
Slave
Read-OnlyApps
LegendReplication:Client Access:
Slon Daemon
Slon Daemon
Slon Daemon
SLONY Cluster
23 © Continuent PostgreSQL West Conference 2008
SLONY Self Defense
/ Clean failover design; “flip” between master & slave/ Handles sequences/ Flexible topologies like chaining and fan-out/ Permits complex schemes where some tables are
replicated and others are not/ Complex administration/ Does not automatically replicate DDL/ If you plan to use SLONY, you will need to study its
habits carefully
24 © Continuent PostgreSQL West Conference 2008
Sequoia Middleware Clustering
/ Multi-master replication using middleware/ Key Characteristics
• Logical replication - Replicates SQL• Statement-based - Distributes SQL statements send
by clients• Synchronous - Updates all databases at once• Multi-master - All nodes update at once
/ So…• No master SPOF• Automatic scaling of reads• Excellent design for daylight maintenance• Automatic scaling of reads without any application
changes
25 © Continuent PostgreSQL West Conference 2008
DB DB ServerServer
Java Clients
Sequoia Architecture
Client ApplicationsNative Clients
(PHP, Perl, C, Ruby, etc.)
Controller – Virtual DBs
JDBC Driver JDBC Driver
Tungsten Connector
Controller – Virtual DBs
Driver Driver
Group Comm
Java ClientsSequoia JDBC Driver
Sequoia JDBC Driver
DB DB ServerServer
DB DB ServerServer
DB DB ServerServer
26 © Continuent PostgreSQL West Conference 2008
Sequoia Self-Defense
/ Middleware gets multi-master without changingdatabase internals
/ Read scaling fully transparent/ Updates slow/not fully transparent
• SQL statements like nextval() hard to serialize• To make it faster, relax ordering = no serialization!!
/ Middleware replication subject to distributeddeadlock
/ Does not support WAN• If your “ping” is 10ms, max updates/sec = 1000/10 = 100• Metropolitan networks are possible if ping < 1-2ms
/ If a database fails or network partitions, you reloadfrom a backup
27 © Continuent PostgreSQL West Conference 2008
Postgres-R Clustering
/ In-core multi-master replication using certification toensure global serialization
/ Key Characteristics• Physical replication - Replicates change sets (events)• Synchronous - All databases in sync• Multi-master - Connect to any node to update• Event replication - Copies change sets
/ So…• Good update performance• Write scaling!• Excellent read scaling (linear)• No master SPOF
28 © Continuent PostgreSQL West Conference 2008
Postgres-R Replication using Certification
Client
submitupdate
Tx: 2Update foo set total = 2.3 where id = 25
Tx: 1Update foo set total = 2.5 where id = 25
Abort!
Client
submitupdate
Apply
Group Communications(Totally ordered logical queue)
29 © Continuent PostgreSQL West Conference 2008
Postgres-R Self-Defense
/ Postgres-R approach scales writes!/ Read scaling is fully transparent/ Postgres-R subject to aborts on hot spots
• It gets bad fast on “small” data sets or structures like queues/ Sequences are problematic/ DDL changes can lock the cluster/ Provisioning new nodes is complex/ Does not work over a WAN
• (See the Sequoia slides for reason why)
/ Very sensitive to group communication performance/ Nobody has ever gotten the certification approach to
work! This is vaporware!
30 © Continuent PostgreSQL West Conference 2008
Or Is It???
31 © Continuent PostgreSQL West Conference 2008
Tungsten Replicator
/ Heterogeneous master/slave replication/ Key Characteristics
• Logical replication - Replicates using SQL• Asynchronous - Replication occurs after commits• Log-based - If we can read the log, we do• Row- *or* statement-based - Handles DDL
/ So…• It’s flexible• It’s reasonably fast• No split-brain• Handles DML and DDL• Replicas open for reads
32 © Continuent PostgreSQL West Conference 2008
Tungsten ReplicatorMaster
Replicator
Extractor
TransactionHistory Log
SlaveReplicator
Applier
TransactionHistory Log
BinlogsBinlogsPostgreSQLPostgreSQL
““Brand XBrand X””DatabaseDatabase
Master DBMaster DB
Rows +Rows +StatementsStatements
Slave DBSlave DB
JDBC SQLJDBC SQLRequestsRequests
Platform-Platform-IndependentIndependentSQL EventsSQL Events
Log RecordsLog Records
[Filters] [Filters]
33 © Continuent PostgreSQL West Conference 2008
Tungsten Replicator
/ Support for writing into PostgreSQL on the way/ Nice heterogeneous features
• Fungible row update event with standard datatypes• Event filtering - dropping based on conditions• Transformations like changing db name, dropping columns, etc.
/ Supports fan-in, fan-out, chaining/ Oracle replication available as commercial extension/ Easy to set up and use/ We can’t read PostgreSQL logs -- that will be a
while…
34 © Continuent PostgreSQL West Conference 2008
In Summary… A New Day for PostgreSQLReplication
35 © Continuent PostgreSQL West Conference 2008
Questions?
Robert HodgesCTOContinuent, Inc.
http://www.continuent.comhttp://community.continuent.com