Scaling to a Billion Database Queries with MySQL Cluster
Bernd Ocklin, MySQL Cluster
Program Agenda
• Databases Are Exciting Again!!!
• Overview of MySQL Cluster
• MySQL Cluster - What’s New
• How is it used?
40% DATA GROWTH PER YEAR
5.9BN MOBILE SUBS IN 2011
1 BILLION iOS & ANDROID APPS DOWNLOADED PER WEEK
370K CALL MINUTES EVERY 60 SECONDS
$1TR BY 2014
$700BN IN 2011
2.1BN USERS
8X DATA GROWTH IN 5 YRS
70+ NEW DOMAINS EVERY 60 SECONDS
250m TWEETS PER DAY
850M USERS
20M APPS PER DAY
1 TR VIDEO PLAYBACKS
EXTREME WRITE SCALABILITY REAL TIME USER EXPERIENCE
ROCK SOLID RELIABILITY RAPID SERVICE INNOVATION
Driving new Database Requirements
No Trade-Offs: Cellular Network
HLR / HSS
Billing, AuC, VLR
AuC, Call
Routing, Billing
Location Updates
Pre & Post Paid
• Massive volumes of write traffic
• <3ms database response
• Downtime & lost transactions = lost $
MySQL Cluster in Action: http://bit.ly/oRI5tF
EXTREME WRITE SCALABILITY REAL TIME USER EXPERIENCE
ROCK SOLID RELIABILITY ELIMINATE BARRIERS TO ENTRY
No Trade-Offs
Transactional Integrity
Complex Queries
Standards & Skillsets
MySQL Cluster – Users & Applications
Extreme Scalability, Availability and Affordability
http://www.mysql.com/customers/cluster/
• Web
  – High volume OLTP
  – eCommerce
  – User Profile Management
  – Session Management & Caching
  – Content Management
  – On-Line Gaming
• Telecoms
  – Subscriber Databases (HLR / HSS)
  – Service Delivery Platforms
  – VAS: VoIP, IPTV & VoD
  – Mobile Content Delivery
  – Mobile Payments
  – LTE Access
Key Benefits
• Scaling Reads & Writes
  – Auto-sharding + Multi-master
  – Transactional, ACID-compliant relational database
• 99.999% Availability
  – Shared-nothing design, no Single Point of Failure
  – On-Line operations: Scale, Upgrade Schema, etc.
• Real-Time Responsiveness
  – High-load, real-time performance
  – Predictable low latency, bounded access times
• SQL & NoSQL APIs
  – Complex, relational queries + Key/Value Access
  – MySQL, Memcached, C++, Java, JPA, HTTP / REST
• Low TCO, Open platform
  – GPL & Commercial editions
  – Commodity hardware, management & monitoring tools
Basic architectures: 3-tier
SQL, JDBC, ADO, ...
Data, Indexes
Data access (e.g. SQL engine)
Front-End, Application Logic
NDB API
LDAP, REST/JSON, ClusterJ, memcached, native, SQL, JDBC, ADO, ...
All services share the same data view
MySQL Cluster Data Nodes
C++ example
NdbOperation *op = trx->getNdbOperation(myTable);
op->insertTuple();              // start an insert on myTable
op->equal("key", i);            // set the primary key value
op->setValue("value", &value);  // set a non-key column
trx->execute( NdbTransaction::Commit );
Java example
Character newCharacter = session.newInstance(Character.class);
newCharacter.setName("Yoda");
newCharacter.setAttributes("Force");
session.persist(newCharacter);
SQL example (requires MySQL Server)
mysql> INSERT INTO Characters (Name, Attributes) VALUES ("Yoda", "Force");
High performance and Scalability
Cluster is
• Distributed
• Event Driven
• Asynchronous
• Parallel
• Non-locking
Your friends
• Disks (life-saver)
• CPU cache
• RAM
• Many cores
Your enemies
• Disks (slow fsync)
• Network latency
• Heap allocation
• NUMA
• Context switching
Use your friends
Disks (your job saver) – Log your data to disk (asynchronously)
CPU cache – Align to it
RAM– Preallocate!
Many cores– Distribute to cores (have a model that supports this)
Avoid your enemies
Disks– Reduce fsyncs
– no swapping
Network latency– Reduce network round trips
Slow heap allocation – Pre-allocate all memory, avoid allocating at run time
NUMA– Disable it
Context switching– Lock to cores
– Get network interrupts out of your way
MySQL Cluster– A distributed hash table
MySQL Cluster Data Nodes
17 Yoda
143 Albert
12 Bernd
42 Ernest
17 Yoda
42 Ernest
12 Bernd
143 Albert
md5() % <no of nodes>
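The `md5() % <no of nodes>` rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `node_for_key`, the string encoding of the key and the node count of 4 are assumptions made for the example, not the code path the data nodes actually run.

```python
import hashlib


def node_for_key(key, node_count):
    """Map a primary-key value to a node index via md5(key) % node_count."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % node_count


# The four rows from the slide, spread over an assumed 4 data nodes.
rows = {17: "Yoda", 143: "Albert", 12: "Bernd", 42: "Ernest"}
placement = {key: node_for_key(key, 4) for key in rows}
```

Because the mapping depends only on the key and the node count, any client can compute the location of a row without asking a central directory.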
Best Practice : Primary Keys
• ALWAYS DEFINE A PRIMARY KEY ON THE TABLE!
• A hidden PRIMARY KEY is added if no PK is specified, BUT this is NOT recommended
  – The hidden primary key is, for example, not replicated (between Clusters)!
  – There are problems in this area, so avoid the problems!
• So always, at least, have id BIGINT AUTO_INCREMENT PRIMARY KEY
  – Even if you don't “need” it for your applications
Auto-Sharding (distribution)
[ {id: 12, name: Bernd}, {id: 143, name: Albert}, {id: 42, name: Ernest}, …, {id: 17, name: Yoda}]
Application
MySQL Cluster Data Nodes
{id: 17, … } {id: 143, … } {id: 42, … } {id: 12, … }
Auto-Sharding (distribution)– Application “knows“ the data location
find({id: 12})
Application
MySQL Cluster Data Nodes
{id: 12, name: Bernd}
• Transparent to the application and data access layer
  – No need for application-layer sharding logic: built into the API & kernel
• Partitioning based on hashing all or part of the primary key
  – Each node stores primary fragment for 1 partition and back-up fragment for another
• Transparency maintained during failover, upgrades and scale-out
• No need to limit application to single-shard transactions
Auto-Sharding
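The rule that each node holds the primary fragment of one partition and the back-up fragment of another can be sketched as follows. This is a simplified model under stated assumptions: two-node node groups, fragment names F1 to F4 as used in the diagrams of this deck, and a hypothetical helper `place_fragments`; the real placement logic lives inside the data nodes.

```python
def place_fragments(node_groups):
    """Return {node: {"primary": [...], "backup": [...]}} for 2-node groups."""
    group_count = len(node_groups)
    layout = {}
    for g, (node_a, node_b) in enumerate(node_groups):
        first = f"F{g + 1}"                 # e.g. F1 for node group 1
        second = f"F{g + 1 + group_count}"  # e.g. F3 for node group 1
        # Each node is primary for one fragment and back-up for its partner's.
        layout[node_a] = {"primary": [first], "backup": [second]}
        layout[node_b] = {"primary": [second], "backup": [first]}
    return layout


layout = place_fragments([("Node 1", "Node 2"), ("Node 3", "Node 4")])
```

With this layout, losing one node of a group still leaves a full copy of both of that group's fragments on its partner.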
Adding High Availability– Introducing Node Groups
[ {id: 12, name: Bernd}, {id: 143, name: Albert}, {id: 42, name: Ernest}, …, {id: 17, name: Yoda}]
Application
MySQL Cluster Data Nodes
{id: 17, … } {id: 143, … } {id: 42, … } {id: 12, … }
Adding High Availability- Synchronous Replication
[ {id: 12, name: Bernd}, {id: 143, name: Albert}, {id: 42, name: Ernest}, …, {id: 17, name: Yoda}]
Application
{id: 17, … } {id: 143, … } {id: 42, … } {id: 12, … }
MySQL Cluster Data Nodes
Adding High Availability– Synchronous Replication
17 Yoda
42 Ernest
12 Bernd
143 Albert
12 Bernd
143 Albert
17 Yoda
42 Ernest
Handling Scheduled Maintenance
On-Line Operations
• Scale the cluster (add & remove nodes on-line)
• Repartition tables
• Upgrade / patch servers & OS
• Upgrade / patch MySQL Cluster
• Back-Up
• Evolve the schema on-line, in real-time
Adding disk durability
{id: 17, … }
In-memory tables: data kept in memory but complemented by logging to disk.
Disk-based tables: data kept on disk but cached in memory.
Memory
Disk
Logging to disk is decoupled from transaction writing.
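A minimal sketch of what "decoupled" means here, assuming a single process and a plain Python list standing in for the disk log. The class name `InMemoryTable` and its methods are hypothetical; the sketch does not model the replication between data nodes that protects a row before it reaches disk.

```python
import queue
import threading


class InMemoryTable:
    """Commits update memory at once; a background thread writes the log."""

    def __init__(self, log_sink):
        self.rows = {}
        self._pending = queue.Queue()
        self._log_sink = log_sink  # hypothetical stand-in for a disk log
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def commit(self, key, value):
        self.rows[key] = value            # visible immediately, no disk wait
        self._pending.put((key, value))   # logging happens later

    def _drain(self):
        while True:
            record = self._pending.get()
            self._log_sink.append(record)  # a real system would write and fsync here
            self._pending.task_done()

    def flush(self):
        self._pending.join()  # block until every queued record is logged


log = []
table = InMemoryTable(log)
table.commit(17, "Yoda")
table.flush()
```

The caller of `commit` never waits for the log write, which is the point of the slide: transaction latency is not tied to disk latency.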
Shared Nothing
No shared components. Cheap commodity hardware.
Proper SAN acceptable but expensive.
SQL, JDBC, ADO, ...
Adding High Availability– Extreme resilience
Application
MySQL Cluster Data Nodes
Service continuing
Event driven & asynchronous
find(17, {});
TC LQH ACC TUP
{id: 17, name: Albert}
{id: 17} {name: Albert}
Doing things in parallel
find(17, {});
TC LQH ACC TUP
{id: 17, name: Albert}
{id: 17} {name: Albert}
x16
TC TC TC TC LQH LQH LQH LQH
x16
Doing things in parallel
• Primary key reads can be directed to the correct shard at the API (application) level
– No resources wasted by running the same operation on all nodes
• Each data node can handle up to 16 operations in parallel
• One data node can fully utilize up to 51 physical CPU cores
Basic Deployments: Start scaling with 2 Data Nodes
SQL, JDBC, ADO, ...
Data, Indexes
Front-End, Application Logic, Data Access
● Shared Data View
● Automated Load Balancing
Basic Deployments: Continue Scaling
SQL, JDBC, ADO, ...
Data, Indexes
Data access (e.g. SQL engine)
Front-End, Application Logic
Automated Load Balancing
OS, App or Hardware Load Balancing
SQL, JDBC, ADO, ...
Basic Deployments: Scale on 3 levels
SQL, JDBC, ADO, ...
Data, Indexes
Data access (e.g. SQL engine)
Front-End, Application Logic
SQL, JDBC, ADO, ...
Typical Deployments: 2 Servers
SQL, JDBC, ADO, ...
Application
Web Server
MySQL Server
Data Nodes
Server 1 Server 2
Typical Deployments: 4 Servers
SQL, JDBC, ADO, ...
Application
Web Server
MySQL Server
Data Nodes
Server 1 Server 2
Server 3 Server 4
Typical Deployments: Geo Replication
SQL, JDBC, ADO, ...
Server 1 Server 2
Server 3 Server 4
Server 1 Server 2
Server 3 Server 4
• 8 x Commodity Intel Servers
  – 2 x 6-core processors, 2.93GHz
  – x5670 processors (24 threads in total)
  – 48GB RAM
  – Linux
• Infiniband networking
• flexAsynch benchmark
  – C++ NoSQL API (NDB API)
[Chart: flexAsynch throughput in million operations per minute. Series: READS (peak label 1,056, axis 0 to 1,200) and UPDATE (peak label 109, axis 0 to 120). X-axis: 2, 4 and 8 data nodes.]
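To put the chart's "million per minute" figures into per-second terms, the conversion below uses the two peak labels, 1,056 million reads and 109 million updates per minute. Attributing both peaks to the 8-node configuration is an assumption based on the axis labels, and the per-node figure assumes an even spread across nodes.

```python
READS_PER_MINUTE = 1_056_000_000    # 1,056 million (chart peak label)
UPDATES_PER_MINUTE = 109_000_000    # 109 million (chart peak label)
DATA_NODES = 8                      # assumed configuration for both peaks


def per_second(per_minute):
    """Convert a per-minute rate to a per-second rate."""
    return per_minute / 60


reads_per_second = per_second(READS_PER_MINUTE)              # 17.6 million
updates_per_second = per_second(UPDATES_PER_MINUTE)          # about 1.82 million
reads_per_node_per_second = reads_per_second / DATA_NODES    # 2.2 million
```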
Adaptive Query Localization
Scaling Distributed Joins
• Perform Complex Queries across Shards
  – JOINs pushed down to data nodes
  – Executed in parallel
  – Returns single result set to MySQL
• Opens Up New Use-Cases
  – Real-time analytics
  – Recommendations engines
  – Analyze click-streams
mysqld
Data Nodes
mysqld
AQL
70x More
Performance
DON’T COMPROMISE FUNCTIONALITY TO SCALE-OUT !!
Data Nodes
MySQL Cluster 7.2 AQL Test Query
Web-Based Content Management System
Copyright 2011 Oracle Corporation 49
DataNode1
DataNode2
MySQLServer
Web-Based CMS
Must Analyze tables for best results
mysql> ANALYZE TABLE <tab-name>;
87.23 seconds
1.26 seconds
70x More
Performance
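The "70x" headline follows from the two timings on this slide. A quick check, assuming 87.23 seconds is the run without Adaptive Query Localization and 1.26 seconds the run with it:

```python
WITHOUT_AQL_SECONDS = 87.23  # assumed: query time before join push-down
WITH_AQL_SECONDS = 1.26      # assumed: query time with join push-down

speedup = WITHOUT_AQL_SECONDS / WITH_AQL_SECONDS  # a little over 69x
```

The exact ratio is just above 69, which the slide rounds to 70x.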
Memcached Key-Value API
• Persistent, Scalable, HA Back-End to memcached
  – No application changes: re-uses standard memcached clients & libraries
• Consolidate Caching & Database Tiers
  – Eliminate cache invalidation
  – Simpler re-use of data across services
  – Improved service levels
• Flexible Deployment
  – Schema or Schema-less storage
New NoSQL Access
Schema-Free apps
• Rapid application evolution
• New types of data constantly added
• No time to get schema extended
• Missing skills to extend schema
• Initially roll out to just a few users
• Constantly adding to live system
Cluster & Memcached – Schema-Free
<town:maidenhead,SL6>
key value
<town:maidenhead,SL6>
key value
generic table
Application view
SQL view
Cluster & Memcached – Configured Schema
<town:maidenhead,SL6>
prefix key value
<town:maidenhead,SL6>
key value
Config tables map.zip
Application view
SQL view
SQL & NoSQL
• SQL: Complex, relational queries
• Memcached: Key-Value web services
• Java: Enterprise Apps
• NDB API: Real-time services
Mix & Match
Data Nodes
NDB API
Clients
Native memcached HTTP/REST
JDBC / ODBC
PHP / PERL
Python / Ruby
MySQL 5.5 Server Integration
• Configure storage engine per-table
• Choose the right tool for the job
  – InnoDB: Foreign Keys, XA Transactions, Large Rows
  – MySQL Cluster: HA, High Write Rates, Real-Time
• Reduces Complexity, Simplifies DevOps
• Take advantage of MySQL 5.5
  – 3x higher performance
  – Improved partitioning, diagnostics, availability, etc.
Multi-Site Clustering
• Split data nodes across data centers
  – Synchronous replication and auto-failover between sites
  – Improved heartbeating to handle network partitions
• Extends HA Options
  – Active/Active with no need for conflict handling
Node Group 2
Node Group 1
Data Node 1
Data Node 3
Data Node 2
Data Node 4
Synchronous Replication
Synchronous Replication
Active/Active Geographic Replication
• Replicating complete clusters across data centers
  – DR & data locality
  – No passive resources
• Simplified Active / Active Replication
  – Eliminates requirement for application & schema changes
  – Transaction-level rollback
Geographic Replication
The existence, content and timing of future releases described here are included for information only and may be changed at Oracle's discretion. October 3rd, 2011
Simplified Provisioning & Maintenance
User Privilege Consolidation
Monitoring & Recovery
High Availability Operation
Automated Management
Reducing TCO and creating a more agile, highly available database environment
MySQL Cluster Manager
How Does MySQL Cluster Manager Help?
Example: Initiating upgrade from MySQL Cluster 7.0 to 7.2
Before MySQL Cluster Manager
• 1 x preliminary check of cluster state
• 8 x ssh commands per server
• 8 x per-process stop commands
• 4 x scp of configuration files (2 x mgmd & 2 x mysqld)
• 8 x per-process start commands
• 8 x checks for started and re-joined processes
• 8 x process completion verifications
• 1 x verify completion of the whole cluster.
• Excludes manual editing of each configuration file.
Total: 46 commands - 2.5 hours of attended operation
With MySQL Cluster Manager
upgrade cluster --package=7.1 mycluster;
Total: 1 Command - Unattended Operation
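The total of 46 manual commands can be checked by adding up the per-step counts listed for the manual upgrade. The dictionary below simply restates those counts; its name and layout are illustrative.

```python
# Per-step command counts for the manual upgrade, as listed on the slide.
manual_steps = {
    "preliminary check of cluster state": 1,
    "ssh commands": 8,
    "per-process stop commands": 8,
    "scp of configuration files": 4,
    "per-process start commands": 8,
    "checks for started and re-joined processes": 8,
    "process completion verifications": 8,
    "verify completion of the whole cluster": 1,
}

total_manual = sum(manual_steps.values())
commands_with_mcm = 1  # the single "upgrade cluster" command
```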
• Results
  – Reduces the overhead and complexity of managing database clusters
  – Reduces the risk of downtime resulting from administrator error
  – Automates best practices in database cluster management
Bootstrap single host Cluster
1. Download MCM from edelivery.oracle.com:
   • Package including Cluster
2. Unzip
3. Run agent, define, create & start Cluster!
$> bin\mcmd --bootstrap
MySQL Cluster Manager 1.1.2 started
Connect to MySQL Cluster Manager by running "D:\Andrew\Documents\MySQL\mcm\bin\mcm" -a NOVA:1862
Configuring default cluster 'mycluster'...
Starting default cluster 'mycluster'...
Cluster 'mycluster' started successfully
ndb_mgmd NOVA:1186
ndbd NOVA
ndbd NOVA
mysqld NOVA:3306
mysqld NOVA:3307
ndbapi *
Connect to the database by running "D:\Andrew\Documents\MySQL\mcm\cluster\bin\mysql" -h NOVA -P 3306 -u root
• Connect to Cluster & start using database
To bootstrap with Cluster 7.2 replace contents of mcm/cluster directory
http://www.clusterdb.com/mysql-cluster/mysql-cluster-manager-1-1-2-creating-a-cluster-is-now-trivial
Evaluate MySQL Cluster CGE
30-Day Trial
• Navigate to http://edelivery.oracle.com/ and step through (selecting “MySQL Database” as the Product Pack)
• Select MySQL Cluster Manager
When to Consider MySQL Cluster
• What are the consequences of downtime or failing to meet performance requirements?
• How much effort and $ is spent in developing and managing HA in your applications?
• Are you considering sharding your database to scale write performance? How does that impact your application and developers?
• Do your services need to be real-time?
• Will your services have unpredictable scalability demands, especially for writes?
• Do you want the flexibility to manage your data with more than just SQL?
Where would I not Use MySQL Cluster?
• “Hot” data sets >3TB
  – Replicate cold data to InnoDB
• Long running transactions
• Large rows, without using BLOBs
• Foreign Keys
  – Can use triggers to emulate:
  – http://dev.mysql.com/tech-resources/articles/mysql-enforcing-foreign-keys.html
• Full table scans
• Savepoints
• Geo-Spatial indexes
• InnoDB storage engine would be the right choice
MySQL Cluster Evaluation Guide
http://mysql.com/why-mysql/white-papers/mysql_cluster_eval_guide.php
MySQL Cluster in Action
Web Reference Architectures
[Diagram: web reference architecture with four workloads (Session Management, eCommerce, Content Management, Analytics). Components shown: MySQL Servers in front of MySQL Cluster Data Nodes (Node Groups 1 and 2, fragments F1 to F4), MySQL Masters with Slaves 1 to N, Data Refinery, Memcache / Application Servers, Distributed Storage, XOR.]
• 4 x Data Nodes: 6k page hits per second
• Each page hit generating 8 – 12 database operations
Whitepaper: http://www.mysql.com/why-mysql/white-papers/mysql_wp_high-availability_webrefarchs.php
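The two figures quoted for the 4-data-node configuration (6k page hits per second, 8 to 12 database operations per hit) imply a database operation rate that is easy to work out. The per-node split below assumes the load is spread evenly across the data nodes, which the slide does not state.

```python
PAGE_HITS_PER_SECOND = 6_000
OPS_PER_HIT = (8, 12)   # lower and upper bound from the slide
DATA_NODES = 4

# Total database operations per second: 48,000 to 72,000.
low, high = (PAGE_HITS_PER_SECOND * n for n in OPS_PER_HIT)

# Assumed even spread: 12,000 to 18,000 operations per second per node.
per_node = (low // DATA_NODES, high // DATA_NODES)
```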
COMPANY OVERVIEW
• Leading provider of communications platforms, solutions & services
• €15.2bn Revenues (2009), 77k employees across 130 countries
CHALLENGES / OPPORTUNITIES
• Converged services driving migration to next generation HLR / HSS systems
• New IMS platforms for Unified Communications
• Reduce cost per subscriber and accelerate time to value
SOLUTIONS
• MySQL Cluster Carrier Grade Edition
• MySQL Support & Consulting Services
CUSTOMER PERSPECTIVE
“MySQL Cluster won the performance test hands-down, and it fitted our needs perfectly. We evaluated shared-disk clustered databases, but the cost would have been at least 10x more.”
-- François Leygues, Systems Manager
RESULTS
• Scale out on standard ATCA hardware to support 60m+ subscribers on a single platform
• Low latency, high throughput with 99.999%+ availability
• Enabled customers to reduce cost per subscriber and improve margins
• Delivered data management solution at 10x less cost than alternatives
http://www.mysql.com/why-mysql/case-studies/mysql-alcatel-casestudy.php
“Since deploying MySQL Cluster as our eCommerce database, we have had continuous uptime with linear scalability enabling us to exceed our most stringent SLAs”
-- Sean Collier, CIO & COO, Shopatron Inc
Shopatron: eCommerce Platform
• Applications
  – Ecommerce back-end, user authentication, order data & fulfilment, payment data & inventory tracking. Supports several thousand queries per second
• Key business benefits
  – Scale quickly and at low cost to meet demand
  – Self-healing architecture, reducing TCO
• Why MySQL?
  – Low cost scalability
  – High read and write throughput
  – Extreme availability
http://www.mysql.com/why-mysql/case-studies/mysql_cs_shopatron.php
COMPANY OVERVIEW
• Pyro provide comms technology solutions in Core Network, OSS/BSS & VAS
• Deployed in 120+ networks worldwide
• Cell C, one of the largest mobile operators in South Africa
• 560 roaming partners in 186 countries
CHALLENGES / OPPORTUNITIES
• FIFA 2010 world cup opens up network services to millions of mobile subscribers
• International roaming SDP to support up to 7m roaming subscribers per day
• Offer local pricing with home network functionality
• Minimize cost and time to market
SOLUTIONS
• MySQL Cluster 7.1 & Services
CUSTOMER PERSPECTIVE
“MySQL Cluster 7.1 gave us the perfect combination of extreme levels of transaction throughput, low latency & carrier-grade availability. We also reduced TCO by being able to scale out on commodity server blades and eliminate costly shared storage”
-- Phani Naik, Head of Technology at Pyro Group
RESULTS
• Supported subscriber and traffic volumes
• Delivered continuous availability
• Implemented in 25% of the time of typical SDP solutions
• Choice in deployment platforms to eliminate vendor lock-in (migrated from Microsoft)
COMPANY OVERVIEW
• Leading telecoms provider across Europe and Asia. Largest Nordic provider
• 184m subscribers (Q2, 2010)
CHALLENGES / OPPORTUNITIES
• Extend OSS & BSS platforms for new mobile services and evolution to LTE
• OSS: IP Management & AAA
• BSS: Subscriber Data Management & Customer Support
SOLUTIONS
• MySQL Cluster
• MySQL Support Services
CUSTOMER PERSPECTIVE
“Telenor has been using MySQL for fixed IP management since 2003 and are extremely satisfied with its speed, availability and flexibility. Now we also support mobile and LTE IP management with our solution. Telenor has found MySQL Cluster to be the best performing database in the world for our applications.”
-- Peter Eriksson, Manager, Network Provisioning
RESULTS
• Launch new services with no downtime, due to on-line operations of MySQL Cluster
• Consolidated database supports Subscriber Data Management initiatives
• MySQL Cluster selected due to 99.999% availability, real time performance and linear scalability on commodity hardware
COMPANY OVERVIEW
• UK-based retail and wholesale ISP & Hosting Services
• 2010 awards for best home broadband and customer service
• Acquired by BT in 2007
CHALLENGES / OPPORTUNITIES
• Enter market for wholesale services, demanding more stringent SLAs
• Re-architect AAA systems for data integrity & continuous availability to support billing systems
• Consolidate data for ease of reporting and operating efficiency
• Fast time to market
SOLUTIONS
• MySQL Cluster
• MySQL Server with InnoDB
CUSTOMER PERSPECTIVE
“Since deploying our latest AAA platform, the MySQL environment has delivered continuous uptime, enabling us to exceed our most stringent SLAs”
-- Geoff Mitchell, Network Engineer
RESULTS
• Continuous system availability, exceeding wholesale SLAs
• 2x faster time to market for new services
• Agility and scale by separating database from applications
• Improved management & infrastructure efficiency through database consolidation
COMPANY OVERVIEW
• Division of Docudesk
• Deliver Document Management SaaS
CHALLENGES / OPPORTUNITIES
• Provide a single repository for customers to manage, archive, and distribute documents
• Implement scalable, fault tolerant, real time data management back-end
• PHP session state cached for in-service personalization
• Store document meta-data, text (as BLOBs), ACL, job queues and billing data
• Data volumes growing at 2% per day
SOLUTION
• MySQL Cluster deployed on EC2
USER PERSPECTIVE
“MySQL Cluster exceeds our requirements for low latency, high throughput performance with continuous availability, in a single solution that minimizes complexity and overall cost.”
-- Casey Brown, Manager of Dev & DBA Services, Docudesk
RESULTS
• Successfully deployed document management solution, eliminating paper trails from legal processes
• Integrate caching and database into one layer, reducing complexity & cost
• Support workload with 50:50 read/write ratio
• Low latency for real-time user experience and document time-stamping
• Continuous database availability
Getting Started
Learn More
Evaluate MySQL Cluster 7.2
Bootstrap a Cluster!
Scaling Web Databases Guide
www.mysql.com/cluster/
Download Today
http://www.mysql.com/downloads/cluster/
Download, No Obligation
https://edelivery.oracle.com/
Summary
Scale Web Services with Carrier-Grade Availability
Don’t Trade Functionality for Scale
Try it out Today!
Multi-threaded Data Node Extensions
• Scaling out on commodity hardware is the standard way to increase performance
  – Add more data nodes and API nodes as required
• MySQL Cluster 7.2 increases the ability to also scale-up each data node
  – Increases maximum number of utilised threads from 8 to 59
  – Can deliver aX single thread performance with bX cores
[Diagram: Application Nodes connected to Node Group 1 (Node 1, Node 2) and Node Group 2 (Node 3, Node 4).]
Multi-threaded Data Node Extensions
• Threads (post GA!):
  – recv: <= 8 Receive threads
  – tc: <= 24 Transaction Coordinator threads
  – ldm: <= 16 Local Query Handler threads
  – send: <= 8 Send threads
  – main: 1 Main thread
  – rep: 1 Replication thread
  – io: 1 I/O thread
• Engineering guidelines provided to find the best configuration: ZXZX
[Diagram: Application Nodes connected to Data Node 1, whose threads are recv, tc, ldm, send, main, rep and io.]
Multi-threaded Data Node Extensions
ThreadConfig := <entry> [ ,<entry> ] +
entry := <type>={ [<param> ]+ }
param := count = N | cpubind = L | cpuset = L
type := ldm | main | recv | rep | maint | send | tc | io
Example:
ThreadConfig=ldm={count=2,cpubind=1,2},ldm={count=2,cpuset=6-9},main={cpubind=12},rep={cpubind=11}
• Note that extra send, recv & tc threads will be part of post-GA maintenance release.
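To make the ThreadConfig grammar concrete, here is a small Python sketch that splits a value such as the slide's example into thread types and their parameters. It is a reading aid, not the data node's own parser: the function name `parse_thread_config` is hypothetical, values are kept as strings, and no validation of thread types or CPU lists is attempted.

```python
import re

# One entry looks like: type={param=value,param=value}
ENTRY = re.compile(r"(\w+)=\{([^}]*)\}")
# Split only on commas that start a new "name=" pair, so that a CPU list
# such as cpubind=1,2 stays in one piece.
PARAM_SPLIT = re.compile(r",(?=\w+=)")


def parse_thread_config(value):
    """Parse a ThreadConfig value into a list of (type, params) tuples."""
    entries = []
    for thread_type, body in ENTRY.findall(value):
        params = {}
        for item in PARAM_SPLIT.split(body):
            if item:
                name, _, setting = item.partition("=")
                params[name.strip()] = setting.strip()
        entries.append((thread_type, params))
    return entries


example = ("ldm={count=2,cpubind=1,2},ldm={count=2,cpuset=6-9},"
           "main={cpubind=12},rep={cpubind=11}")
parsed = parse_thread_config(example)
```

A list of tuples is used rather than a dictionary because the same type (here `ldm`) may appear in more than one entry.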
NoSQL with Memcached
• Flexible:
  – Deployment options
  – Multiple Clusters
  – Simultaneous SQL Access
  – Can still cache in Memcached server
  – Flat key-value store or map to multiple tables/columns
set maidenhead 0 0 3
SL6
STORED
get maidenhead
VALUE maidenhead 0 3
SL6
END
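The session above uses the standard memcached text protocol, where the `set` line carries key, flags, expiry time and the byte length of the value. The helpers below build those requests as bytes so the framing is explicit; the function names are illustrative, and sending the bytes to a server is left out.

```python
def set_command(key, value, flags=0, exptime=0):
    """Build a memcached text-protocol 'set' request as bytes."""
    data = value.encode("utf-8")
    # Header: set <key> <flags> <exptime> <bytes>, then the data block.
    header = f"set {key} {flags} {exptime} {len(data)}\r\n".encode("utf-8")
    return header + data + b"\r\n"


def get_command(key):
    """Build a memcached text-protocol 'get' request as bytes."""
    return f"get {key}\r\n".encode("utf-8")


request = set_command("maidenhead", "SL6")
```

The trailing `3` in `set maidenhead 0 0 3` is therefore the length of `SL6`, and the `0 3` in the `VALUE` reply repeats the flags and the length.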
Multi-Site Clustering – changes to STONITH algorithm
• When heartbeat not received, all data nodes will be asked to ping all other data nodes
• Each node establishes its list of ‘suspect’ data nodes from whom they don’t receive a ping response within ConnectCheckIntervalDelay msecs
• If second period of ConnectCheckIntervalDelay passes without a ping response then each data node will send a Fail report to all data nodes naming its suspected node(s)
• On receipt of a Fail message from a suspect node, the receiving node will consider the originating node as failed rather than the requested target
• Leaves each side of the temporarily partitioned network with a viable set of data nodes and arbitration is used to select the surviving side if there is no longer a clear majority
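The rule for handling Fail reports can be written as a small decision function. This is a sketch of the rule as described in the bullets above, not of the data node implementation: the function name, the node numbering and the handling of a report from a non-suspect sender (accept it as given) are assumptions.

```python
def nodes_to_fail(my_suspects, report_from, reported_targets):
    """Decide which nodes to treat as failed after receiving a Fail report."""
    if report_from in my_suspects:
        # The sender is itself unreachable from here: fail the sender,
        # not the nodes it named.
        return {report_from}
    # Assumed behaviour for a sender that is not suspect: accept its report.
    return set(reported_targets)


# Nodes 1 and 2 sit in site A, nodes 3 and 4 in site B; the WAN link is cut.
suspects_seen_by_node_1 = {3, 4}
decision = nodes_to_fail(suspects_seen_by_node_1, report_from=3, reported_targets=[1, 2])
```

In this example node 1 marks node 3 as failed instead of acting on node 3's claim that nodes 1 and 2 are down, so site A keeps a viable set of nodes.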
Multi-Site Clustering – WAN engineering recommendations based on user experience
• (Obviously) the longer the latency between sites, the higher the impact to performance
• Target latency should be <= 10 ms; 20 ms acceptable
  – Test with 1000 byte packet, under load
• Bandwidth requirements dependent on traffic but aim for 1 Gbps+ (100 Mbps for low traffic Cluster)
• Simplest WAN topology possible (fewer points of failure/failover latency)
• Typical WAN failover times should be short enough not to trigger STONITH in Cluster
Geographic Replication – what’s changed in conflict resolution
• Reflecting GCI (Global Checkpoint Index) removes requirement for applications to maintain timestamp field in each potentially conflicting table
  – One of the two masters acts as the ‘primary’ and monitors all received replication events from the ‘secondary’ (including its own ‘reflected GCI’) to establish when changes were not applied in the same order on primary and secondary Clusters
• Primary will then overwrite all conflicting transactions (or optionally just the conflicting rows) on the secondary – as well as subsequent transactions influenced by the conflict
• To use, set the function in mysql.ndb_replication to NDB$EPOCH() or NDB$EPOCH_TRANS()
• Overview & worked example: http://bit.ly/activeactive
• Gory details: http://bit.ly/refcgci
How to Push Privilege Data into Data Nodes
mysql> SOURCE /usr/local/mysql/share/mysql/ndb_dist_priv.sql;
mysql> CALL mysql.mysql_cluster_move_privileges();
mysql> SHOW CREATE TABLE mysql.user\G
*************************** 1. row ***************************
       Table: user
Create Table: CREATE TABLE `user` (
  `Host` char(60) COLLATE utf8_bin NOT NULL DEFAULT '',
  ....
  ....
) ENGINE=ndbcluster DEFAULT CHARSET=utf8 COLLATE=utf8_bin COMMENT='Users and global privileges'
•Fully worked example: http://www.clusterdb.com/mysql-cluster/sharing-user-credentials-between-mysql-servers-with-cluster/ (http://bit.ly/userpriv)
Data Node 1
F1
F2
Data Node2
F2
F1
Application Nodes
Data Nodes
Cluster Mgmt
Scale the Cluster: Capacity & Performance
Data Nodes
Node Group 1
F1
F3
F3
F1
Node 1
Node 2
Node Group 2
F2
F4
F4
F2
Node 3
Node 4
Application Nodes
Cluster Mgmt
Cluster Mgmt
Scale the Cluster: Capacity & Performance
On-Line Scaling & Maintenance
• Can also update schema on-line
• Upgrade hardware & software with no downtime
• Perform back-ups on-line
1. New node group added
2. Data is re-partitioned
3. Redundant data is deleted
4. Distribution is switched to share load with new node group
Only MySQL Can…..
blend the agility & innovation of the web….
….with the trust & capability of the network.
No Trade-Offs: eCommerce
• Integrated Service Provider platform
  – eCommerce
  – Payment processing
  – Fulfillment
• Supports 1k+ manufacturers & 18k retail partners
• Requirements
  – Scaling, On-Demand
  – HA: failures & on-line upgrades
  – High batch & real time loads
  – Low TCO: capex and opex
http://mysql.com/customers/view/?id=1080
No Trade-Offs: Flight Control
• US Navy aircraft carriers
• Consolidated flight operations management system
  – Maintenance records
  – Fuel loads
  – Weather conditions
  – Flight deck plans
• Requirements
  – No Single Points of Failure
  – Complete redundancy
  – Small footprint, harsh environment
• 4 x MySQL Cluster nodes, Linux and Windows
MySQL User Conference Session: http://bit.ly/ogeid3
Creating & running your first Cluster- the “manual” way (without MCM)
• Up & running in 10-15 minutes using Quick Start guides from http://dev.mysql.com/downloads/cluster/
• Versions for Linux, Windows & Solaris
ACID Compliant Relational Database
• SQL & NoSQL interfaces
Write-Scalable & Real-Time
• Distributed, multi-master, auto-sharding, optimized in-memory structures & indices
99.999% Availability
• Shared-nothing, integrated clustering & sub-second recovery, local & geographic replication, on-line operations
Low Barriers to Entry
• Open-source, elastic, multiple APIs, management tools, commodity hardware