© MariaDB Corporation Ab, 05/10/2014
Replication with MariaDB
Achieving Scalability and High Availability using MariaDB Replication and MariaDB Galera Cluster
Anders Karlsson, Principal Sales Engineer
Agenda
• About Anders Karlsson
• What is replication good for?
• How does MariaDB Replication work?
• MariaDB Replication enhancements
• Intro to MariaDB Galera Cluster
• Questions? Answers?
Who is Anders Karlsson?
• Principal Sales Engineer at MariaDB
• Former Database Architect at Recorded Future, Sales Engineer and Consultant with Oracle, Informix, TimesTen, MySQL / Sun / Oracle etc.
• I have been in the RDBMS business for some 30 years
• I have also worked as a Tech Support Engineer, Porting Engineer and in many other roles
• Outside SkySQL I build websites (www.papablues.com), develop Open Source software (MyQuery, mycleaner, OraMySQL etc.), am a keen photographer, have an affection for English Real Ales and a great interest in computer history
• I am a proud father of two gorgeous and smart twins
So, what is MariaDB replication used for?
• High Availability
• Scale-out
• Backup servers
• Disaster Recovery
• Reporting servers
Replication for High-Availability
• MariaDB Replication is asynchronous, which is an issue for HA: at failover, the Slave may still be behind the Master
• Failover has to wait for the Slave to catch up
• Using MHA makes the process somewhat automatic and less problematic
Replication for Scale-out
• Scale-out is exactly what MariaDB Replication was intended to do from the start
• Use one Master for writes and one or more read-only Slaves
• Slaves put little load on the Master
• Easy to use and set up
• Often combined with a Load Balancer
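As a rough illustration of the read/write split described above, here is a minimal Python sketch. It is a toy model, not a real load balancer: the class name ReplicationRouter, the server names and the rule "SELECT and SHOW go to a Slave" are all assumptions made for this example.

```python
import itertools


class ReplicationRouter:
    """Toy router: writes go to the master, reads rotate over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master for reads when no slave is configured.
        self._readers = itertools.cycle(slaves or [master])

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        first_word = sql.lstrip().split(None, 1)[0].upper()
        if first_word in ("SELECT", "SHOW"):
            return next(self._readers)
        return self.master


router = ReplicationRouter("master", ["slave1", "slave2"])
targets = [router.route(q) for q in (
    "SELECT * FROM orders",
    "UPDATE orders SET customer_id = 57 WHERE order_id = 19",
    "SELECT COUNT(*) FROM orders",
    "SELECT 1",
)]
print(targets)
```

Keep in mind that replication is asynchronous, so a read routed to a Slave may not yet see a write that was just sent to the Master.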
Replication for Backup servers
• The old-school approach is cold disk backups, online backups or mysqldump backups at regular intervals
• These days, backups are often taken on asynchronous replicas, sometimes combined with the backup methods above (e.g. a cold backup of a Slave)
Replication for Disaster Recovery
• Async replication is great for replication between Data Centers
• Disaster Recovery usually has less stringent requirements than, say, Backup replication. A DR site should not have to be used more than very seldom
[Diagram: Data Center and Disaster Recovery Site]
Replication for Reports
• Using Replication for Report servers is slightly different
• Report servers may have a slightly different schema / indexing and are probably tuned differently
So, how does MariaDB Replication work?
• The replication is Asynchronous, or rather, data is pulled by the Slave from the Master as fast as possible
• To know what to replicate to the slave, the Master keeps an ordered log, the bin-log, of statements to replicate
• The Slave keeps track of the position in the Master bin-log either by a file/position pair or by a global transaction ID
• The Slave reads the Master bin-log and copies it to a Relay-log on the Slave, which holds the statements to be applied
• This is done by the Slave IO-thread
• Then the Slave SQL-thread reads the Relay-log and applies the statements to the Slave database
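The flow above (bin-log on the Master, IO-thread copying to the Relay-log, then applying on the Slave) can be sketched as a small Python toy model. This is not how the server is implemented: the class and method names are made up for this example, and a "statement" is simplified to a key/value change.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.binlog = []          # ordered list of (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))


class Slave:
    def __init__(self):
        self.data = {}
        self.relay_log = []
        self.master_pos = 0       # how far into the master's binlog we have read
        self.applied_pos = 0      # how far into the relay log we have applied

    def io_thread_step(self, master):
        """Pull new binlog events from the master into the relay log."""
        new_events = master.binlog[self.master_pos:]
        self.relay_log.extend(new_events)
        self.master_pos += len(new_events)

    def sql_thread_step(self):
        """Apply relay-log events that have not been applied yet, in order."""
        for key, value in self.relay_log[self.applied_pos:]:
            self.data[key] = value
        self.applied_pos = len(self.relay_log)


master, slave = Master(), Slave()
master.write("order:19", 57)
lag_before = slave.data != master.data   # asynchronous: the slave is behind
slave.io_thread_step(master)
slave.sql_thread_step()
in_sync = slave.data == master.data
```

The Master never calls the Slave here; the Slave pulls at its own pace, which is why it can lag behind.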
[Diagram: Master (Bin log) → Slave (Relay log)]
Simple and effective, right? Well...
• There are a few things to note
There is nothing special with a Slave!
• Yes, this is true, and this is different from some other replication systems
• A slave runs the exact same software as the master
• The only difference is how the Master and Slave are configured!
A Slave can be a Master also!
• This is sometimes useful when you have many Slaves on one Master
[Diagram: Master (Bin log) → Intermediate Master (Relay log, Bin log) → Slave (Relay log)]
Replication can be statement based or row based
• Statement based means that the SQL text of the statement is replicated
  • UPDATE orders SET customer_id = 57 WHERE order_id = 19
• Row based means that each row change is replicated as a binary entity which is idempotent
• There is also a MIXED format, which is a mix of the two above
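To see why the format matters, here is a small Python toy model of a statement whose result depends on the server it runs on. Everything in it is an assumption for illustration: local_token() stands in for a non-deterministic SQL function, and a "row event" is simplified to a copy of the changed row.

```python
import itertools


class Server:
    def __init__(self, name):
        self.name = name
        self.rows = {19: {"customer_id": 1, "token": None}}
        self._counter = itertools.count(1)

    def local_token(self):
        # Stand-in for a non-deterministic SQL function: the value
        # depends on which server evaluates it.
        return f"{self.name}-{next(self._counter)}"


def run_statement(server, order_id):
    """Model of: UPDATE orders SET token = <non-deterministic> WHERE order_id = ..."""
    server.rows[order_id]["token"] = server.local_token()
    return dict(server.rows[order_id])   # row image after the change


master = Server("master")
stmt_slave = Server("slave")
row_slave = Server("slave")

after_image = run_statement(master, 19)

# Statement based: the slave re-executes the same statement text.
run_statement(stmt_slave, 19)

# Row based: the slave stores the row image produced on the master.
row_slave.rows[19] = dict(after_image)
```

With the statement replayed, the Slave ends up with its own token; with the row image, it ends up with the Master's. MIXED logs statements by default and switches to row events for statements the server considers unsafe to replay.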
Replication can optionally be semi-synchronous
• Semi-sync is an option to MariaDB Replication
• Semi-sync means that the Master waits until a committed transaction has been received by at least one Slave before the commit is acknowledged to the client
• If there is a failure in replication, then the Master can fall back to normal Asynchronous replication
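A minimal Python sketch of the semi-sync idea, under simplifying assumptions: a Slave is modelled as a function that returns True once it has received the transaction, a missing acknowledgement stands in for a timeout, and the names commit and required_acks are hypothetical.

```python
def commit(transaction, slaves, required_acks=1):
    """Return the mode in which the commit completed: 'semi-sync' or 'async'."""
    acks = 0
    for slave in slaves:
        if slave(transaction):     # True means the slave has received the event
            acks += 1
        if acks >= required_acks:
            return "semi-sync"
    # Not enough acknowledgements (modelled here instead of a real timeout):
    # the master falls back to plain asynchronous replication.
    return "async"


def healthy_slave(relay_log):
    def receive(transaction):
        relay_log.append(transaction)
        return True
    return receive


def unreachable_slave(transaction):
    return False


relay = []
mode_ok = commit("trx-1", [unreachable_slave, healthy_slave(relay)])
mode_fallback = commit("trx-2", [unreachable_slave])
```

Note that an acknowledgement only says the transaction was received, not that it has been applied on the Slave.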
Replication just replicates! (Surprise!)
• What this means is that there is nothing in the replication system that guarantees that the Slave and the Master are identical, beyond replicating DML
• If you send DML directly to a Slave, then later replicated changes might have a different effect than on the Master, or might fail
• But if you know what you are doing, this is quite alright!
A slave might be a master to the Master
• Two or more masters can be in a ring in a Multi-Master setup
[Diagram: two Masters, each with a Bin log and a Relay log, replicating from each other]
Without Semi-sync replication, the Master doesn’t care about the Slaves
• The Master just writes to the Bin-log, nothing else
• The Master also purges the bin-logs as needed
• There is no real “cluster” setup where the nodes know about each other and act in a particular manner
MariaDB 10 – One Slave / Many Masters
• As the Master only has to manage the Bin-Log, you can have more than one Slave on one Master
• Before MariaDB 10, a Slave could only be a Slave of one Master
• With MariaDB 10, a Slave can replicate from more than 1 Master
One Slave for Many Masters
• A Data Warehouse that gets data from multiple sources
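A Slave with several Masters has to track one replication position per Master. The Python toy model below sketches that bookkeeping; the class MultiSourceSlave, the connection names "erp" and "crm" and the key/value events are assumptions for this example only.

```python
class MultiSourceSlave:
    """Toy slave that replicates from several masters, one position each."""

    def __init__(self):
        self.positions = {}   # connection name -> position in that master's binlog
        self.data = {}

    def pull(self, name, binlog):
        """Apply everything new from one master's binlog."""
        start = self.positions.get(name, 0)
        for key, value in binlog[start:]:
            self.data[key] = value
        self.positions[name] = len(binlog)


erp_binlog = [("erp.invoice:1", 100), ("erp.invoice:2", 250)]
crm_binlog = [("crm.customer:57", "Acme")]

warehouse = MultiSourceSlave()
warehouse.pull("erp", erp_binlog)
warehouse.pull("crm", crm_binlog)

erp_binlog.append(("erp.invoice:3", 75))
warehouse.pull("erp", erp_binlog)     # only the new event is applied
```

The sources write to different keys here. If two Masters changed the same rows, nothing in this model (or in replication itself, as noted earlier) would reconcile them.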
[Diagram: ERP System and CRM System → Data Warehouse]
The serialized Binlog – Slave lag
• The Binlog logs operations from multiple threads in order in one log
• This means that the log also has to be applied serially
• This in turn means that the SQL thread is single threaded
• And this means that the Slave can be slower than a Master that executes many threads in parallel, which means Slave lag
The serialized Binlog – The fix
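• MariaDB 10 can apply replicated changes using multiple parallel Slave threads
• This is based on the Binlog Group commit functionality: transactions that were group committed together on the Master can be applied in parallel on the Slave
• Application-by-application parallelism is also available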
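The group commit idea can be sketched in Python as follows. This is a toy model under stated assumptions: each Relay-log event carries a hypothetical group id marking which transactions committed together on the Master, a transaction is a single key/value change, and the function name apply_in_parallel is made up.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def apply_in_parallel(relay_log, database, workers=4):
    """relay_log: list of (group_id, key, value) tuples in binlog order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _, group in groupby(relay_log, key=lambda event: event[0]):
            # Transactions from one commit group may run concurrently ...
            futures = [pool.submit(database.__setitem__, key, value)
                       for _, key, value in group]
            # ... but the whole group must finish before the next one starts.
            for future in futures:
                future.result()
    return database


relay_log = [
    (1, "a", 1), (1, "b", 2),   # committed together on the master
    (2, "a", 3),                # later group, touches "a" again
    (3, "c", 4),
]
result = apply_in_parallel(relay_log, {})
```

Transactions that committed in the same group on the Master did not block each other there, which is why it is safe to apply them side by side; ordering is only enforced between groups.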
MariaDB Galera Cluster – HA Replication
• For true High Availability, we want all servers in a consistent state
• This requires Synchronous replication
• But HA also requires:
  • Controlled Failover
  • Cluster status information and management
  • Adding and removing cluster nodes
• To the rescue: MariaDB Galera Cluster
MariaDB Galera Cluster in short
• MariaDB Galera Cluster is separate from and does not depend on MariaDB Replication
• MariaDB Galera Cluster is a complete HA Cluster setup
• Based on and requires InnoDB
• Synchronous replication
• Multi-master with conflict detection
• All nodes are “Cluster aware”
• Adding and removing nodes is built in
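One way to picture multi-master conflict detection is a "first committer wins" check at commit time. The Python sketch below is a simplified model of that idea, not Galera's implementation; the class CertifyingCluster, the sequence numbers and the row keys are assumptions for this example.

```python
class CertifyingCluster:
    """Toy model: a commit fails if a row it wrote changed after it started."""

    def __init__(self):
        self.seqno = 0          # position in the cluster-wide commit order
        self.last_write = {}    # row key -> seqno of the last accepted write

    def begin(self):
        """Start a transaction; remember the commit order position it saw."""
        return self.seqno

    def commit(self, snapshot, keys):
        """Accept the write set unless a newer write touched the same rows."""
        if any(self.last_write.get(key, 0) > snapshot for key in keys):
            return False        # conflict: this transaction must roll back
        self.seqno += 1
        for key in keys:
            self.last_write[key] = self.seqno
        return True


cluster = CertifyingCluster()
trx_on_node_a = cluster.begin()
trx_on_node_b = cluster.begin()      # started concurrently on another node

first = cluster.commit(trx_on_node_a, {"orders:19"})
second = cluster.commit(trx_on_node_b, {"orders:19"})   # same row, lost the race
third = cluster.commit(cluster.begin(), {"orders:19"})  # started after the first
```

For the application this means a commit can fail on any node and has to be retried, even though every node accepts writes.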
MariaDB Galera Cluster setup
[Diagram: MariaDB nodes, each with a WSREP Library]
• MariaDB 10.0 version includes WSREP API
• MariaDB 10.1 has this as standard; for 5.5 and 10.0, patched versions are available
• WSREP API installed as a MariaDB Plugin
• WSREP does replication, failover and management
MariaDB Enterprise
• MariaDB Enterprise is the MariaDB Subscription offered by MariaDB
• Includes monitoring and management software
• Includes world-class support
• Includes Certified Binaries
• And much more!
• MariaDB Enterprise Cluster is MariaDB Enterprise for MariaDB Galera Cluster
Questions? Answers?
The question is not “What is the answer?”, the question is “What is the question?”.
Henri Poincaré