Copyright © 2015, Oracle and/or its affiliates. All rights reserved.
OpenWorld 2015
200 Million QPS on Commodity Hardware
Getting Started with MySQL Cluster 7.4
Frazer Clement, MySQL Cluster Technical Lead
Bernd Ocklin, Director, MySQL Cluster Engineering
October 26th, 2015
1. Users, Features and Releases
2. Design for Availability and Scale
3. Performance, getting to 200M queries/second
4. How to get started with MySQL Cluster
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Keynote: Monday, 4.00-6.00 pm, YBCA Theater
State of the Dolphin
• Rich Mason, SVP & General Manager MySQL GBU, Oracle
• Tomas Ulin, VP MySQL Engineering, Oracle
Customer Experiences
• Hari Tatrakal, Director of Database Services, Live Nation
• Olaniyi Oshinowo, MySQL & Open Source Technologies Leader, GE
• Ernie Souhrada & Rob Wultsch, Database Engineers, Pinterest
MySQL Cluster content @ OpenWorld
• Fully Elastic Real-Time Services with MySQL Cluster. Conference Session. Bernd Ocklin, Oracle. Tuesday 11am, Moscone South, 262
• MySQL Server and MySQL Cluster at India’s Financial Inclusion Gateway Service. Conference Session. NEC et al. Tuesday 5.15pm, Moscone South, 250
• Get Started with MySQL Cluster. Hands On Lab. Benedita Vasconcelos, Oracle. Thursday 9.30am, Hotel Nikko - Peninsula
MySQL Community Reception @ OpenWorld
Celebrate, Have Fun and Mingle with Oracle’s MySQL Engineers & Your Peers
• Tuesday, October 27th, 7 pm
• Jillian’s at Metreon: 175 Fourth Street, San Francisco, CA 94103. At the corner of Howard and 4th St., only a 2-min walk from Moscone Center (same place as last year)
Join us!
MySQL Cluster deployments
Web
- High volume OLTP
- eCommerce
- On-Line Gaming
- Digital Marketing
- User Profile Management
- Session Management & Caching
Telecoms
- Service Delivery Platforms
- VAS: VoIP, IPTV & VoD
- Mobile Content Delivery
- Mobile Payments
Other
- Online gaming: AAA + profile management
- Payment fraud detection
- Many more, some unknown
- DBMS research
MySQL Cluster highlights
High Throughput Reads & Writes: Distributed, parallel architecture. Transactional, ACID-compliant relational database.
Carrier-Grade Availability: Shared-nothing design, synchronous data replication. Sub-second failover & self-healing recovery.
Real-Time Responsiveness: Data structures optimized for RAM. Real-time extensions. Predictable low latency, bounded access times.
On-Line, Linear Scalability: Incrementally scale out, scale up and scale on-line. Linearly scale with distribution awareness.
Low TCO, Open platform: GPL & Commercial editions, scale on COTS. Flexible APIs: SQL, C++, Java, OpenJPA, LDAP & HTTP.
SQL: Joins, Foreign Keys, Transactions, Row locks, Triggers, Views, Stored procedures, Blobs, keyless tables, newSQL, MySQL compatible... connectors for most languages, ORMs etc.
NoSQL: Full C++ API for best control and performance (MySQLD SE built on top). Other APIs: Java, JPA, Node.js, Memcache...
HA: 99.999% uptime systems (five nines), no single point of failure (SPOF), heartbeating, cluster membership, automatic failover + recovery, automatic client failover, transactional DDL, CP, async replication, advanced exception logging...
Performance and parallelism: High throughput, low bounded latency (200M read tx/s). Batching, optimised protocols, intra and inter query parallelism, pushed parallel filters, pushed parallel joins, non-blocking event driven multithreaded...
HA, High performance, Relational, Transactional, Distributed, Parallel, SQL, NoSQL, Shared-Nothing, Commodity ...
Scalability: Scale out nodegroups or stateless API clients online, scale up data nodes and clients online with multithreading, scale up hardware online.
Replication: Synchronous two phase commit internally, transactional HA async replication between clusters, conflict detection + resolution...
Storage: Data transparently distributed and balanced by hash. Indexed columns in memory, others on disk or memory. Secondary unique and ordered indexes. Redundant Redo logs and periodic checkpoints...
Manageability: Online add + drop (index, column), online consistent backup, online upgrade, online OS or hardware upgrade, consolidated cluster logs, C management API for tooling...
Shared nothing, Commodity: No need for shared storage, in-memory data uses disk frugally, TCP over Ethernet / Infiniband etc., no special layer 2 requirements. Open source.
MySQL Cluster Releases
Timeline: 2012, 2013, 2014, 2015
7.2: Distributed parallel joins, Multi-TC, Active-Active, Memcached, MySQL Server 5.5
7.3: Foreign keys, Client lib performance, node.js, MySQL Server 5.6
7.4: Restart performance, Active-Active, Internal reporting, MySQL Server 5.6
Regular fixes and improvements
MySQL Cluster is built on top of and tracks GA MySQL Server releases, gaining their features, optimisations and bug fixes.
MySQL Cluster 7.4
- Active-Active replication enhancements
- System restart and maintenance activities parallelised
- Improved observability and manageability
- Performance optimisations in the data node kernel
More detail and download links at dev.mysql.com
MySQL Cluster Architecture
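[Diagram: Table 1 is split into fragments F1 to F6, stored on Data Nodes arranged in three node groups. Node Group 1 holds F1 and F4, Node Group 2 holds F2 and F5, Node Group 3 holds F3 and F6, with each fragment present on both nodes of its group. Application Nodes (REST, JPA, ...) talk to the Data Nodes over the NdbApi protocol; two Cluster Mgmt nodes are also shown.]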
Tables and indices are horizontally partitioned, distributed across the node groups and replicated within each node group. Application Nodes, including MySQLD, use NdbApi to perform transactional operations and queries on data.
Most Application Nodes are themselves servers for various client protocols.
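As a rough illustration of the partitioning described above, the sketch below hashes a row's primary key to one of six fragments and deals the fragments round-robin across three node groups, matching the layout in the diagram (F1 and F4 in Node Group 1, and so on). The hash function (MD5 via hashlib) and all helper names are assumptions made for this example; this is not the NDB implementation.

```python
import hashlib

NUM_FRAGMENTS = 6      # F1..F6, as in the diagram
NUM_NODE_GROUPS = 3    # Node Group 1..3


def fragment_for_key(primary_key):
    """Return the 1-based fragment number for a primary key (illustrative hash)."""
    digest = hashlib.md5(primary_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_FRAGMENTS + 1


def node_group_for_fragment(fragment):
    """Deal fragments round-robin: F1,F4 -> NG1; F2,F5 -> NG2; F3,F6 -> NG3."""
    return (fragment - 1) % NUM_NODE_GROUPS + 1


def locate(primary_key):
    """Return (fragment, node group) for a primary key."""
    fragment = fragment_for_key(primary_key)
    return fragment, node_group_for_fragment(fragment)


if __name__ == "__main__":
    for key in ("user:1", "user:2", "user:3"):
        fragment, group = locate(key)
        print(f"{key} -> F{fragment} in Node Group {group}")
```

Because the mapping depends only on the key, any application node can work out where a row lives without consulting a central directory.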
MySQL Cluster Architecture for Availability
Redundancy for availability
- All nodes in each nodegroup store the same data
- Can survive data node failures so long as one node per nodegroup is available
- Load balanced, synchronous 2PC, heartbeating, automatic failover, recovery
MySQL Cluster is a CP system in that consistency is favoured over availability. Async replication between clusters gives AP properties.
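The survival rule above (at least one live node per node group) can be sketched as a simple check. The node ids and the function name are hypothetical, and the sketch deliberately ignores network partitions and arbitration; it only models which data remains reachable.

```python
def cluster_can_serve(node_groups, failed_nodes):
    """Return True if every node group still has at least one live data node."""
    failed = set(failed_nodes)
    return all(
        any(node not in failed for node in group)
        for group in node_groups
    )


# Three node groups of two data nodes each (hypothetical node ids).
NODE_GROUPS = [(1, 2), (3, 4), (5, 6)]

if __name__ == "__main__":
    # Half the data nodes fail, but one survives in each group.
    print(cluster_can_serve(NODE_GROUPS, failed_nodes=[1, 3, 5]))
    # Only two nodes fail, but they are the whole of one group.
    print(cluster_can_serve(NODE_GROUPS, failed_nodes=[1, 2]))
```

The two calls show that what matters is which nodes fail, not how many: losing three nodes spread across the groups is survivable, losing both nodes of one group is not.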
Redundancy for availability
- Two (or more) management servers
- Used for configuration, node startup/shutdown, triggering backups, logging + 'split-brain' arbitration
- Not critical: not involved in transaction processing / querying
Management nodes act as lightweight arbitrators, avoiding the cost of odd-sized data node quorums to cope with single failures.
Redundancy for availability
- API nodes are stateless and consistent, and can use n + m sparing with simple front end load balancing
- NdbApi automatically balances, fails over and fails back on data node failures
- The network must have no SPOF either: no single failure takes out more than one cluster member
Availability also comes from support for online operations: schema changes, hardware and OS upgrades, software upgrades, cluster scaling.
MySQL Cluster Architecture for Scale Out
Performance + Capacity: online scale out of the back end by adding whole node groups (read + write scaling).
Data Nodes can be added online, while transactions and queries are running. Existing data is rebalanced across all nodegroups.
Performance + HA: online scale out of the front end / API nodes.
Application Nodes can be added and removed online; all have equal, consistent access to the data stored by the data nodes.
MySQL Cluster Architecture for Scale Up
[Diagram of a data node (ndbmtd) and its threads: Main thread, Replication thread, LDM instances (shared nothing), TC instances (shared nothing), Send threads, Receive threads, IO threads, Connect threads, Watchdog. A callout marks the request processing threads.]
TC = Transaction coordinator. LDM = Local data manager (Table + Index partitions).
TC and LDM threads do most of the work and must be well fed by the Send + Receive threads.
Generally no more than one request processing thread per [HT] core.
Configurable parallelism within a Data node
[Diagram of an application node: clients connect over a client protocol (mysql, memcached, ldap...). Inside the node, many* threads handle protocol decoding, business logic / state machines and the database / persistence layer, then make NdbApi calls into libndbclient, which holds several API connections to the cluster ('Protocol 6').]
Example application nodes: Mysqld, Memcached, Node.js*, Java, Slapd...
- Can scale the number of threads to meet demand
- Can scale the number of NdbApi connections to avoid bottlenecks
MySQL Cluster Performance
Scale Out
- Distributed efficiency: protocol design, optimisation, packing, multiplexing. Data distribution awareness. Locality: pushed down filtering and joining
- Coordination avoidance: non blocking reads, parallel commit
- Balance: hash partitioning
Scale Up
- Local efficiency: OS call amortisation, non blocking execution, cache friendly data structures, lock free shared data structures, local data structures, multi granularity pools
See MySQL Connect 2012 session 'Breakthrough performance with MySQL Cluster'
Optimisations build in layers, from lower volume, more complex, bigger footprint (top) to higher volume, simpler, smaller footprint (bottom):
- SQL joins, aggregates: MySQL Server SQL optimisations, distributed parallel filter + join
- SQL R/W of multi rows: batching hints, distribution awareness, read removal
- NoSQL R/W of multi rows: optimised 2PC, async APIs
- NoSQL R/W of single rows: low level efficiency, coordination avoidance
7.2
- Feb 2012: 1 billion NoSQL reads per minute
- Jul 2012: 1 billion writes per minute
7.3
- Jun 2013: 8.5x better performance per NdbApi connection
7.4
- Feb 2015: 200 million NoSQL reads per second, 2.5 million SQL statements per second, 50% better Sysbench read performance
- 7.2, Data node: multiple Transaction Coordinator (TC) threads
- 7.3, NdbApi: connection thread contention reduction
- 7.4, Data node: Scan + PK lookup optimisations, Send + Recv optimisations
Regular improvements compound over releases
NoSQL Bulk benchmarks
- Getting to millions of requests per second on a distributed system is often a matter of efficient multiplexing and demultiplexing of individual requests
- Modern hardware is very capable and so it is important to keep out of the way, avoiding context switches, threads, lock contention, small messages, extra hops, and unnecessary communication or coordination.
- Many small requests must be gathered together and handled in bulk, without adversely affecting latency or application semantics.
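The idea of gathering many small requests and handling them in bulk can be sketched as below: requests are grouped per destination and sent as a few larger messages. The function name, the batch-size limit and the request shape are all assumptions for illustration, not part of NdbApi.

```python
from collections import defaultdict


def batch_by_destination(requests, max_batch=64):
    """Group (destination, payload) requests into per-destination batches."""
    pending = defaultdict(list)
    batches = []
    for destination, payload in requests:
        pending[destination].append(payload)
        if len(pending[destination]) == max_batch:
            # A full batch is sent immediately as one message.
            batches.append((destination, pending.pop(destination)))
    # Flush whatever is left so that no request waits indefinitely.
    for destination, payloads in pending.items():
        batches.append((destination, payloads))
    return batches


if __name__ == "__main__":
    # 1000 small reads spread over 4 hypothetical data nodes.
    requests = [(key % 4, f"read row {key}") for key in range(1000)]
    batches = batch_by_destination(requests)
    print(len(requests), "requests sent as", len(batches), "messages")
```

The final flush stands in for the latency concern in the last bullet: a real system would also bound how long a partly filled batch may wait before it is sent.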
NoSQL benchmark tool flexAsynch
- Delivered as part of source distribution
- Multithreaded C++ NdbApi application
- Uses the asynchronous features of NdbApi which allow a single thread to participate in multiple concurrent database transactions.
- Row operations using the full primary key
- Can make use of NdbApi Distribution Awareness hints to minimise communication
- Parameters: Number of API connections, Number of threads, Number of parallel transactions per thread, Number of rows per transaction, Number of columns, Size of each column, Lockmode, Distribution Awareness, Thread partitioning …
- Unlike e.g. MySQLD / Memcached, has no upstream clients to serve, so simpler
Details: http://mikaelronstrom.blogspot.co.uk/2013/11/how-to-make-efficient-scalable-key.html
MySQL Cluster 200 million NoSQL reads/s
- 72 API client machines running flexAsynch
- 32 Data node machines running ndbmtd
- 1 Management node
- 100 bytes data / read
- 19 GB/s aggregate data read rate
- 6.4 M reads/s per data node
- 612 MB/s data node read rate
- 2.86 M reads/s per client
- 272 MB/s read per client
- 216 NdbApi connections
- 18,432 client threads
- > 10 million concurrent reads
- 384 TC threads
- 384 LDM threads
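The headline numbers can be cross-checked from the per-machine figures quoted in the deck. The short script below recomputes them; treating the deck's MB and GB as binary units (MiB, GiB) is an assumption made here because it brings the byte rates closest to the quoted values, and the function name is hypothetical.

```python
# Figures quoted in the deck.
CLIENT_MACHINES = 72
DATA_NODES = 32
BYTES_PER_READ = 100
READS_PER_DATA_NODE = 6.4e6      # reads/s
READS_PER_CLIENT = 2.86e6        # reads/s
CONNECTIONS_PER_CLIENT = 3
THREADS_PER_CLIENT = 256
TC_THREADS_PER_DATA_NODE = 12
LDM_THREADS_PER_DATA_NODE = 12

MIB = 2 ** 20  # assumption: the deck's MB and GB are binary units
GIB = 2 ** 30


def derived_figures():
    """Recompute the aggregate numbers from the per-machine figures."""
    total_reads = DATA_NODES * READS_PER_DATA_NODE
    return {
        "total_reads_per_s": total_reads,
        "total_reads_per_s_from_clients": CLIENT_MACHINES * READS_PER_CLIENT,
        "aggregate_gib_per_s": total_reads * BYTES_PER_READ / GIB,
        "data_node_mib_per_s": READS_PER_DATA_NODE * BYTES_PER_READ / MIB,
        "client_mib_per_s": READS_PER_CLIENT * BYTES_PER_READ / MIB,
        "ndbapi_connections": CLIENT_MACHINES * CONNECTIONS_PER_CLIENT,
        "client_threads": CLIENT_MACHINES * THREADS_PER_CLIENT,
        "tc_threads": DATA_NODES * TC_THREADS_PER_DATA_NODE,
        "ldm_threads": DATA_NODES * LDM_THREADS_PER_DATA_NODE,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:,.2f}")
```

Both routes to the total (per data node and per client) give roughly 205 million reads/s, and the thread and connection counts reproduce the quoted 216, 18,432 and 384 exactly.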
[Diagram: flexAsynch clients and ndbmtd data nodes connected through 'The Infiniband Cloud(TM)']
- flexAsynch: 72 x 256 threads, 72 x 3 API connections, 10 million concurrent reads
- ndbmtd: 32 x 12 TC + LDM threads, > 100 GB data
[Diagram sequence: flexAsynch clients reaching ndbmtd data nodes through 'The Infiniband Cloud(TM)', built up over four slides]
- Not distribution aware: extra hop to data
- Distribution aware: minimal hops
- Batching of requests
- Partitioned client threads
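The effect of distribution awareness on hop count can be illustrated with a toy model: a read goes first to a coordinating data node, and needs a second hop whenever that node does not hold the row. The cluster size, the placement rule and the function names are assumptions for this sketch, and replicas are ignored.

```python
import random

NUM_DATA_NODES = 8  # hypothetical cluster size


def owner_of(key):
    """Data node that stores the row (illustrative placement rule)."""
    return key % NUM_DATA_NODES


def hops(key, distribution_aware, rng):
    """Hops from the client to the row: 1 if sent to the owner, otherwise 2."""
    if distribution_aware:
        coordinator = owner_of(key)                  # start where the data lives
    else:
        coordinator = rng.randrange(NUM_DATA_NODES)  # any node may coordinate
    return 1 if coordinator == owner_of(key) else 2


def average_hops(num_reads, distribution_aware, seed=0):
    """Average hop count over num_reads reads of consecutive keys."""
    rng = random.Random(seed)
    total = sum(hops(key, distribution_aware, rng) for key in range(num_reads))
    return total / num_reads


if __name__ == "__main__":
    print("distribution aware:", average_hops(10_000, True))
    print("not distribution aware:", average_hops(10_000, False))
```

In this model the unaware case only avoids the extra hop by luck, so the penalty grows as more data nodes are added, while the aware case stays at one hop per read.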
Intel hardware lab (Thanks!)
105 machines, each with 28 cores (56 HT threads)
- 2 sockets, Intel Xeon 'Haswell' E5-2697 v3 processors
- Each socket: 14 cores (28 HT threads), 2.6GHz base, 3.6GHz turbo, 35MB LLC
- 64GB DDR4 memory
- Infiniband + Gig Ethernet
56 Gbps switched Infiniband network, ~1 Tbps bisection bandwidth
Software configuration
Data nodes:
- 12 LDM threads (non-HT)
- 12 TC threads (HT)
- 2 Send threads (non-HT)
- 8 Receive threads (HT)
- MaxSendDelay config
API nodes:
- 3 NdbApi connections per client machine
- 256 flexAsynch threads per client machine
Scripts: https://dev.mysql.com/downloads/benchmarks.html
MySQL Cluster NoSQL Scale Out
[Chart: Data node throughput scaling. Million NoSQL reads/s (axis 0 to 250) as the number of data nodes scales (axis 0 to 35). Near-linear scaling, 92% efficiency at 32 nodes.]
[Chart: API connection scaling. Million NoSQL reads/s (axis 0 to 180) as API connections scale (axis 0 to 180) @ 24 data nodes. API node scaling saturates the data nodes with Infiniband interrupts.]
Infiniband adapters were configured for latency rather than throughput, but benchmarks reached within 10% of maximum throughput in any case.
Getting started with MySQL Cluster
- Try Cluster at OOW! Benedita's Hands-on Lab on Thursday morning
- Getting started video on YouTube: https://www.youtube.com/watch?v=4OixfzhOJoA
- QuickStart whitepaper: http://downloads.mysql.com/tutorials/cluster/mysql_wp_cluster_quickstart.pdf
- MySQL Cluster 'Getting Started' page: https://www.mysql.com/products/cluster/start.html
- education.oracle.com MySQL Cluster courses
Tips
- Start small and simple
  - Minimal nodes + configuration
  - (< 10M concurrent reads!)
  - Start on localhost to rule out firewall issues
- Get it up and running, then add complexity
- Experiment with mysql / mysqld, node failures, applications
- Consider using MySQL Cluster Manager (https://edelivery.oracle.com)
- Ask for help: forums.mysql.com
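To make "minimal nodes + configuration" on localhost concrete, the sketch below generates the text of a small cluster configuration file with one management node, two data nodes and one mysqld slot. HostName, DataDir, NodeId and NoOfReplicas are MySQL Cluster configuration parameters; the data directory path, the node ids and the helper name are placeholders chosen for this example, so treat the QuickStart whitepaper as the authoritative reference.

```python
def minimal_localhost_config(data_dir):
    """Return config.ini text for a minimal single-host cluster (illustrative)."""
    lines = [
        "[ndb_mgmd]",
        "HostName=localhost",
        f"DataDir={data_dir}",
        "NodeId=1",
        "",
        "[ndbd default]",
        "NoOfReplicas=2",          # two data nodes form one node group
        f"DataDir={data_dir}",
        "",
    ]
    for node_id in (3, 4):         # two data nodes, both on localhost
        lines += ["[ndbd]", "HostName=localhost", f"NodeId={node_id}", ""]
    lines += ["[mysqld]", "NodeId=50"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder path: substitute a directory that exists on your machine.
    print(minimal_localhost_config("/tmp/my_cluster/ndb_data"))
```

Keeping every node on localhost matches the tip about ruling out firewall issues before adding hosts.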
[Diagram: 'My laptop' running a minimal cluster, with fragments F1 and F4 held on two data nodes]
Keep Learning with Oracle University (education.oracle.com)
- Formats: Classroom Training, Learning Subscription, Live Virtual Class, Training On Demand
- Topics: Cloud, Technology, Applications, Industries
MySQL Cluster Performance Gains
Synchronous API
- Operation definition and execution are separated.
- Single user thread can define a batch of operations, then execute them together, with only one API ↔ DB round trip
- A transaction can contain one or more batches of operations.
- 1 user thread : 1 executing transaction
Asynchronous API adds:
- Single user thread can define, execute and wait for the results of multiple independent transactions.
- 1 user thread : n executing transactions
The async API allows the number of client threads to be reduced, giving efficiency gains.
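A back-of-envelope model shows why batching and the asynchronous API matter. It assumes a user thread is limited only by the API ↔ DB round-trip time; the round-trip time, batch size and transaction count below are hypothetical inputs, not measurements from the benchmark.

```python
import math


def round_trips(num_operations, batch_size):
    """API <-> DB round trips needed when operations execute in batches."""
    return math.ceil(num_operations / batch_size)


def ops_per_second_per_thread(rtt_seconds, ops_per_batch, concurrent_transactions):
    """Idealised throughput of one user thread bound only by round-trip time."""
    return concurrent_transactions * ops_per_batch / rtt_seconds


if __name__ == "__main__":
    rtt = 100e-6  # hypothetical 100 microsecond round trip

    print("round trips for 1000 ops, unbatched:", round_trips(1000, 1))
    print("round trips for 1000 ops, batches of 10:", round_trips(1000, 10))

    # Synchronous API: one executing transaction per thread.
    print("sync, 1 op per trip:", ops_per_second_per_thread(rtt, 1, 1))
    print("sync, 10 ops per batch:", ops_per_second_per_thread(rtt, 10, 1))
    # Asynchronous API: n executing transactions per thread.
    print("async, 50 txns x 10 ops:", ops_per_second_per_thread(rtt, 10, 50))
```

Under this model the same target throughput needs far fewer threads once each thread keeps several transactions in flight, which is the efficiency gain the slide refers to.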