Presented by,
MySQL AB® & O’Reilly Media, Inc.
Ten Thousand Tables Can’t Be Wrong…
www.sciencelogic.com
Richard Chart – Co-Founder, ScienceLogic, LLC
This Presentation is about…
An unusual approach to application scaling using MySQL
The design choices we made early in product design, and how they look a few years later
A few hard earned lessons in what to look for as an application grows beyond its roots
…Covering
The decision path that led to using MySQL in this way
The growth path that got us to current scale Along the way I will talk about:
What we discovered about MySQL write performanceWhile real world monitoring is important… understanding how to interpret the results is even more so
In the beginning… …or at least the beginning for ScienceLogic:
2003 Product concept: IT Management Appliance A series of choices - fundamental technology for
each layer:HardwareOperating SystemApplication ArchitectureDatabase technology & architecture
Sweet Layers – Shirley Shelton
Technology Choices
Hardware/OS/architecture – for another day
Database Technology: MySQL1. Performance
2. Stability
3. Experience
4. Support (increasingly important)
Scale
Starting point: single appliance supporting 200 devices
Goal: extensible architecture with - very deep host and application monitoring1,000+ devices, each with 100+ management points
Current reality –several times the scale of the original goal
Future: The fundamentals are sound: next generation of the product moves up another order of magnitude
Database Architecture
The challenge:Wide range of monitored nodes (100…1,000+)Wide range of data points (ping…hundreds of HW, OS and application metrics)Distributed collection (WAN: latency, security concerns)Commercial product: MySQL dual licensing comes into play
Database Architecture
Data Characteristics:90+% WritesACID not importantResilient to loss
(Remaining data is not invalidated by gaps)Data elements valuable by themselvesData elements much more valuable when
relationships added
The MyISAM Fit
Very fast writes when no lock contention Simple data handling Lower license cost that InnoDB (important only
for those of us shipping commercial products) Not transactional – and we don’t care!
Slower to stuff with data More expensive (for us commercial folks)
Not InnoDB (all the opposite reasons)
What about memory tables? Limited applicable areas in EM7: most of the data has to
live on disk for reporting weeks or months in future
No measurable benefit over MyISAM in transient data areas where we could use them in EM7
Because we take advantage of MyISAM cached indexes and the required data is in the index
…but the application continues to evolve, we will use them in future if the right situation occurs
Scaling without lock contention
The ace in the holeDynamically created tablesNo more than one thread writing to a table at onceSeparate thread consolidates data for reporting across devices
This approaches the sharding architecture used in highly scalable web sites, but with core data stored centrally
Multiple Threads / Multiple Collectors
Dynamic Databases and TablesStats Databases
Dynamic_app_18 Dynamic_app_20 Dynamic_app_21 Dynamic_app_22
Stat_1
Stat_12
Stat_13
Stat_14
Stat_1
Stat_253
Stat_300
Stat_301
Stat_302
Dynamic_app_27 Dynamic_app_28
Stat_56
Stat_59
Stat_16
Stat_18
Stat_180
Stat_181
Stat_77
Stat_79
Stat_80
Stat_81
Stat_82
Stat_89
Stat_500
Stat_550
Stat_551
Stat_552
Stat_553
Stat_554
Stat_555
State_600
Stats_601
Stats_602
Stats_650
Stats_651
Stats_640
Stats_642
Stats_649
Stat_15
Stat_120
Stat_121
Stat_250
Dynamic Table Creation NO_TABLE = 1146 # MySQL error code
try: db.execute(“INSERT INTO dynamic_app_%s.stat_%s VALUES (10, ’sample data’)”, (app, device))
except MySQLdb.Error, e: if e.args[0] == NO_TABLE: db.execute(“CREATE DATABASE IF NOT EXISTS dynamic_app_%s”, (app)) db.execute(“CREATE TABLE IF NOT EXISTS dynamic_app_%s.stat_%s LIKE dynamic_app_0.stat_0”,(app, device))
Growth Curve
0
50
100
150
200
250
Mth
2
Mth
4
Mth
6
Mth
8
Mth
10
Mth
12
Collec tors
Devic es / 10
Tables / 100
How far does this go?
So Far: 20,000+ tables 2,200+ queries per
second 5 billion rows 93% writes
Next limit is how quickly data can be stuffed onto disk
The Database Platform 4 x Intel Xeon Dual Core 7140M,
16MB Cache, 3.4GHz, 800MHz FSB
16GB (16 x 1GB) 400MHz Single Ranked DIMMs
Hardware RAID Controller 10 x 146GB 15k RPM SAS Drives
(RAID10) Linux 2.6 kernel MySQL 5.0.x
Managing Performance With Growth
As usage rates escalate, things that once were fine become an issue…eg:Query Cache entries purged due to too many entries, or too many changes to underlying tablesLock contentionSort data set size causing increased
created_tmp_disk_tables rather than created_tmp_tables
Open Files & Tables
Critical when scaling this way show global status like 'open%' Open Files (Linux)
/etc/security/limits.confmysql hard nofile 20480
mysql soft nofile 20480
/etc/my.cnf[safe_mysqld]
open_files_limit = 20480
[mysqld]
table_cache=8192
Misc MyISAM Tuning Helpers
concurrent_insert = 21 (Default) Enables concurrent insert for MyISAM tables that don't have holes
2 Enables concurrent inserts for all MyISAM tables, even those that have holes. For a table with a hole, new rows are inserted at the end of the table if it is in use by another thread. Otherwise, MySQL acquires a normal write lock and inserts the row into the hole.
myisam_recover = QUICK, BACKUP
Very important to measure DB stats over time show global status like ‘opened_tables';Point in time counter useless in its own right…very valuable with 5 minute poll and graphed deltas
Some stats must be combined to be usefulPercentage of requests waiting for locks (deltas) (table_locks_waited/(table_locks_waited + table_locks_immediate) * 100
Trends
Monitoring MySQL Enterprise Monitor
Very worthwhile tool – take advantage of it if you subscribe to MySQL Enterprise support
Other ToolsEM7 monitors databases (as well as servers, routers, firewalls, etc. etc.), so of course we use that…
EM7 Example
Monitoring Caveat! Some monitored changes are obviously bad…
eg increase in created_tmp_disk_tables
Some monitors are misleading…
…what’s going on here?
What really happened:
Misconfigured clients caused the CPU load
created_tmp_tables unconnected with the CPU load
EXPLAIN showed small row sets being sorted in memory tmp tables
Summarizing & Pruning Data Keeping the source statistics tables small is key
for ongoing performance – Summarized data for reporting
Infrequent writes, regular reads - MyISAM fine here also in most casesIn EM7 we summarize hourly, daily, monthly etc.Retention periods configurable
DELETEs suck MySQL performance…can use a deleted row marker If you can do a purge in your app, you’re goldenIn EM7 we schedule DELETEs for a nightly quiet time
Summarizing Data With Dynamic Stored Procedure USE dynamic_app_data_43; DROP PROCEDURE IF EXISTS dynamic_app_43.dynamictest; DELIMITER // CREATE PROCEDURE dynamic_app_43.dynamictest(n INTEGER) BEGIN SET @s = ""; SET @s = CONCAT(@s," INSERT INTO app_crunched (did,object,ind,year,month, date,average, total, poll_count)"); SET @s = CONCAT(@s," SELECT
",n,",object,ind,YEAR(date),MONTH(date),date,0,0,0"); SET @s = CONCAT(@s," FROM stat_", n); SET @s = CONCAT(@s," WHERE crunched = 0"); #SELECT @s; PREPARE exe FROM @s; EXECUTE exe; END; // DELIMITER ; CALL dynamictest(5);
What’s The Downside?
Reporting tools not good with dynamic databases and tables (eg Crystal Reports)
80 : 20 rule: Above a certain size of implementation, some tables just have to use row locking (with EM7, 1,500 devices, 10,000 tables, and 0.1% need to be InnoDB… so should be the 99.9 : 0.1 rule)
Backup and data maintenance complexity of multiple engines
So… MyISAM all the way?
Well, no. In our application, at larger sites, around a
dozen tables need the characteristics of InnoDB, or around 0.1%
Be selective in the storage engine choice, consider relative merits for each part of the application