8/14/2019 OSCON2007: Landscape of trx engines
http://slidepdf.com/reader/full/oscon2007-landscape-of-trx-engines 1/48
Landscape of Open Source Transactional Storage Engines
Peter Zaitsev
Vadim Tkachenko
http://MySQLPerformanceBlog.com

About us
- Founders of Percona Ltd
- MySQL performance-focused consulting
- Authors of http://www.MySQLPerformanceBlog.com
- Worked for MySQL AB for years
- Peter was lead of the "High Performance Group", Vadim his right hand
- Long-time MySQL users on a number of projects we were personally involved in

MySQL pluggable architecture

MySQL Transactional Engines
- BDB - Legacy storage engine, removed in 5.1; not tested
- InnoDB - "Most popular" (the only commonly used) storage engine, by Innobase Oy
- SolidDB - Storage engine from Solid Information Technology
- PBXT - Storage engine by SNAP Innovation (Paul McCullagh)
- Falcon - New storage engine by MySQL AB; project led by Jim Starkey
- NDB - MySQL Cluster is a whole other beast and is not covered

InnoDB
- http://www.innodb.com/
- Mature storage engine; development was started by Heikki Tuuri over 10 years ago
- Heikki was looking for ways to improve the performance of traditional databases
- Acquired by Oracle at the end of 2005
- The only transactional storage engine available in the official MySQL 5.0 release

solidDB
- http://www.solidtech.com/solidDBforMySQL/
- Open sourced in 2006
- Existing storage engine technology "integrated" with MySQL
- Focused on reliability and multiprocessor scalability
- Currently shipped as production ready

PrimeBase XT (PBXT)
- http://www.primebase.com/xt/
- Written mainly by Paul McCullagh since 2005
- Not a port of an existing storage engine to MySQL but written from scratch
- Makes a number of unusual design decisions
- Only 50% transactional
- Focused on efficient BLOB storage
    - http://www.blobstreaming.org/

Falcon
- http://dev.mysql.com/doc/falcon/en/index.html
- Based on the "Netfrastructure" engine by Jim Starkey
- Purchased by MySQL AB in early 2006
- "Lightweight Design"
- Focused on the transactional needs of Web applications and efficient use of large amounts of memory

Design and Behavior

InnoDB design
- MVCC and very efficient row-level locks
- Clustering by primary key, writes to the same pages
- Non-compressed secondary indexes with transaction info
- Single tablespace or tablespace per table
- Pessimistic locking
- Instant deadlock detection
- Fuzzy checkpointing
- "DoubleWrite" for partial page write protection

InnoDB
- DEADLOCK detection:
    Session 1: BEGIN;
    Session 2: BEGIN;
    Session 1: UPDATE test SET name='random1-1' WHERE id=1;
    Session 2: UPDATE test SET name='random2-1' WHERE id=2;
    Session 1: UPDATE test SET name='random1-2' WHERE id=2;
    Session 2: UPDATE test SET name='random2-2' WHERE id=1;
- InnoDB detects the deadlock (error 1213) instantly in the second session
- Pessimistic locking: UPDATE of the same row in two concurrent transactions - the second transaction waits for COMMIT/ROLLBACK in the first
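
The two-session scenario above can be replayed against a toy lock manager. This is a minimal sketch and not InnoDB's actual implementation: `LockManager`, `lock_row` and `DeadlockError` are hypothetical names, and only exclusive row locks are modelled. It looks for a cycle in the wait-for graph at the moment a session would start waiting, which is the sense of "instant" detection used here.

```python
class DeadlockError(Exception):
    """Raised in the session whose lock request would close a wait cycle."""


class LockManager:
    def __init__(self):
        self.owner = {}      # row id -> session holding the exclusive lock
        self.waits_for = {}  # session -> session it is blocked on

    def lock_row(self, session, row):
        holder = self.owner.get(row)
        if holder is None or holder == session:
            self.owner[row] = session
            return "granted"
        # Follow the wait-for chain starting at the holder. If it leads
        # back to the requester, waiting could never end: fail right away.
        seen = set()
        current = holder
        while current is not None and current not in seen:
            if current == session:
                raise DeadlockError(f"{session} would deadlock on row {row}")
            seen.add(current)
            current = self.waits_for.get(current)
        self.waits_for[session] = holder
        return "waiting"


locks = LockManager()
events = [
    locks.lock_row("session1", 1),  # UPDATE ... WHERE id=1
    locks.lock_row("session2", 2),  # UPDATE ... WHERE id=2
    locks.lock_row("session1", 2),  # blocks behind session2
]
try:
    locks.lock_row("session2", 1)   # would close the cycle
except DeadlockError:
    events.append("deadlock")
print(events)
```

A timeout-based engine would skip the cycle check and simply let the waiting session give up after a fixed interval.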

InnoDB Strengths
- Powerful MVCC
- Good performance on a wide range of workloads
- Great stability
- Great data protection
- Primary key clustering allows a lot of optimizations
- Transaction info in secondary indexes allows fast index-only scans
- Adaptive hash indexes and other advanced techniques

InnoDB Weaknesses
- Slow development pace in recent years
- Still has scalability issues with multiple CPUs
- Unscalable auto-increment and broken group commit are taking very long to fix
- Large footprint, especially for secondary indexes
    - It turned out to be not so large in our comparison
- Still messy integration with MySQL
    - How do you see how much space is free in an InnoDB tablespace?

SolidDB Design
- MVCC and row-level locking
- Clustering by primary key
- New data stored in new pages
- "Bonsai Tree" used for multi-versioning
- OPTIMISTIC or PESSIMISTIC locking, specified at table level
- Online backup (not usable for slave creation)
- Highly available synchronous replication promised soon

solidDB - PESSIMISTIC
- DEADLOCK - detected in the first session after 20 sec of waiting
- Timeout-based deadlock detection
- UPDATE two rows - the second session waits on the first

solidDB - OPTIMISTIC
- DEADLOCK - detected in the second session immediately, but with error 1205 (Lock wait timeout exceeded)
- UPDATE two concurrent rows:
    SESSION 1: BEGIN;
    SESSION 2: BEGIN;
    SESSION 1: UPDATE test SET name = 'rnd' WHERE id=2;
    SESSION 2: UPDATE test SET name = 'rnd' WHERE id=2;
- In Session 2 we got:
    ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
- This is OK for OPTIMISTIC engines, but may cause trouble in Web applications.
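
One way a Web application can cope with such errors is to re-run the whole transaction. The following is a minimal sketch under stated assumptions: `TransientDBError` and `run_with_retry` are hypothetical names standing in for whatever exception type and helper a real driver and application would use, and the retryable set simply contains the error codes reported on these slides.

```python
import time

# Error codes seen on these slides: 1213 deadlock, 1205 lock wait timeout,
# 1020 record has changed since last read.
RETRYABLE = {1213, 1205, 1020}


class TransientDBError(Exception):
    """Hypothetical stand-in for a driver exception carrying an error code."""

    def __init__(self, code):
        super().__init__(f"error {code}")
        self.code = code


def run_with_retry(transaction, attempts=3, delay=0.0):
    """Call transaction() until it succeeds or attempts are exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except TransientDBError as exc:
            if exc.code not in RETRYABLE or attempt == attempts:
                raise
            time.sleep(delay * attempt)  # simple linear back-off


calls = []


def flaky_update():
    # Fails once with a lock wait timeout, then succeeds.
    calls.append(1)
    if len(calls) == 1:
        raise TransientDBError(1205)
    return "committed"


result = run_with_retry(flaky_update)
print(result, len(calls))
```

The callable has to contain the complete transaction, from BEGIN to COMMIT, because the engine has already rolled back or rejected the earlier attempt.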

SolidDB Strengths and Weaknesses
- Production usage is too limited to really tell
- Of the storage engines reviewed, the most similar in design to InnoDB
- Choice of optimistic vs pessimistic locking is nice for some applications
- No instant deadlock detection
- So far available as a special download only (not even a plugin)

PBXT Design
- MVCC with row-level locking
- "Per Database" transactions
- No real durability yet, weak crash recovery
- OPTIMISTIC locking
- Write once, write sequentially to log
- Never update in place
- Data cache + key cache
- Efficient BLOB handling

PBXT
- DEADLOCK detected in the second session, error 1213
- UPDATE two concurrent rows - optimistic, second session:
    ERROR 1020 (HY000): Record has changed since last read in table 'test2'
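
The "record has changed" behaviour is what a version check at write time produces. The sketch below illustrates optimistic concurrency control in general and does not describe PBXT internals: `VersionedRow` and `RecordChanged` are hypothetical names, and a single integer version stands in for whatever an engine actually tracks.

```python
class RecordChanged(Exception):
    """Mimics 'Record has changed since last read' (error 1020)."""


class VersionedRow:
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        # A transaction remembers which version it saw.
        return self.value, self.version

    def write(self, new_value, seen_version):
        # No lock was taken at read time; the conflict shows up only now.
        if seen_version != self.version:
            raise RecordChanged("record has changed since last read")
        self.value = new_value
        self.version += 1


row = VersionedRow("initial")
_, seen_by_s1 = row.read()
_, seen_by_s2 = row.read()
row.write("from session 1", seen_by_s1)      # first writer wins
try:
    row.write("from session 2", seen_by_s2)  # stale version, rejected
    outcome = "written"
except RecordChanged:
    outcome = "rejected"
print(row.value, outcome)
```

Unlike the pessimistic case, the second session never waits; it learns about the conflict as an error and has to decide whether to retry.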

PBXT Strengths and Weaknesses
- Not yet commonly used in production (we tried but hit too many bugs)
- Very good performance for some workloads
- Efficient storage, close to MyISAM
- Focused on efficient BLOB handling, with extra features like BLOB streaming
- Still mainly a one-man project
- Large to-do list, a lot needs to be done, including recovery
- Potentially large purging overhead

Falcon Design
- MVCC, row-level locking (in practice, not in theory)
- PESSIMISTIC locking
- Not clustered by primary key
- Row cache (caches only the rows you need)
- "Optimal" index traversal
- "Data compression" - NULLs, empty strings
- Always needs to read row data (because of index structure)

Falcon
- DEADLOCK, in Session 2:
    ERROR 1020 (HY000): Record has changed since last read in table 'test2'
- Ann Harrison says Falcon checks for cycles in the lock graph periodically rather than instantly on row lock wait
- UPDATE: the second session waits

Falcon Strengths and Weaknesses
- Still alpha with many bugs - too early to judge
- Very active support from MySQL AB
- Fast development pace - bugs are being fixed quickly, major performance improvements during the last 3 months
- Good integration with MySQL, e.g. tables for performance data
- No primary key clustering or covering index support
- Different design decisions can complicate migration from InnoDB (though logical behavior has become closer)

There are lies, big lies, and there are
Benchmarks

Benchmarks - things to note
- Benchmarks may not be relevant to the performance of your application
- The early versions of Falcon and PBXT we tried may change their performance properties before production
- There is not much experience out there in tuning Falcon, PBXT and Solid with MySQL, as they are barely used in production
- We ran fewer benchmarks than we wanted - we spent a lot of time fighting/reporting bugs and checking fixes

Benchmarks
- Read-only on a typical table for a web application
- DBT2 - TPC-C emulation
- Dell DVD Store - emulation of an e-commerce site
- Sysbench - OLTP transactions
- Sqlbench - small data set, single user, typical query patterns

Box
- Dell PowerEdge 2950
- CentOS release 4.5
- 4 CPU
    model name : Intel(R) Xeon(R) CPU 5148 @ 2.33GHz
    stepping   : 6
    cpu MHz    : 2327.529
    cache size : 4096 KB
- 16 GB of RAM
- RAID 10 (6 10K RPM 3.5" SAS hard drives)

MySQL Versions
- Yes, this means the MySQL version, not only the storage engine, affects performance, but we could not get all storage engines working with the same MySQL version.
- InnoDB and PBXT: 5.1.19
- Falcon: 6.0.1-alpha, bk tree from 10-Jul
- SolidDB: 5.0.41-0073

Engines parameters
- 12 GB of RAM for buffers
- InnoDB
    --innodb_buffer_pool_size=12G
    --innodb_flush_method=O_DIRECT
    --innodb-log-file-size=100M
- SolidDB
    --soliddb-cache-size=12G
- Falcon
    --falcon_min_record_memory=2G
    --falcon_max_record_memory=4G
    --falcon_page_cache_size=8G
- PBXT
    pbxt_index_cache_size=8G
    pbxt_record_cache_size=4G
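
A quick check that each engine was given the same 12 GB cache budget. This is a sketch under one explicit assumption: `falcon_min_record_memory` is treated as a lower bound inside the record cache rather than an additional allocation, so only the maximum record memory and the page cache are summed for Falcon. `parse_size` is a hypothetical helper written for this check.

```python
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30}


def parse_size(text):
    """Convert a size such as '12G' or '100M' to bytes."""
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


# Cache settings from the slide. falcon_min_record_memory is left out on
# the assumption that it is not a separate allocation.
caches = {
    "InnoDB": ["12G"],         # innodb_buffer_pool_size
    "SolidDB": ["12G"],        # soliddb-cache-size
    "Falcon": ["4G", "8G"],    # max record memory + page cache
    "PBXT": ["8G", "4G"],      # index cache + record cache
}

totals_gb = {
    engine: sum(parse_size(size) for size in sizes) // UNITS["G"]
    for engine, sizes in caches.items()
}
print(totals_gb)
```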

DBT2 Configuration Details
- DBT2
    - http://osdldbt.sourceforge.net/
- 10 concurrent users (about 2 for each CPU core and disk)
- "Zero Delay" to fully load the MySQL server
- In the 400W configuration, available memory was reduced to 4 GB by locking 12 GB of memory, to make the workload IO bound
- Buffer sizes were reduced to 2 GB

DBT2 - 10 warehouses
- 10 warehouses, 10 clients (data size ~700 MB)
- Result in New Order Transactions Per Minute (NOTPM), more is better
- PBXT crashed
- Old version of Falcon had ~1100 NOTPM
    - Great improvement!

[Chart: NOTPM by engine]
Engine  | NOTPM
InnoDB  | 17744
SolidDB |  6097
Falcon  |  8209
PBXT    | crashed
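
The chart values can be put in proportion with a few lines of arithmetic. This sketch assumes the three plotted values (17744, 6097, 8209) belong to InnoDB, SolidDB and Falcon in legend order, and uses the approximate ~1100 NOTPM quoted for the old Falcon build.

```python
notpm = {"InnoDB": 17744, "SolidDB": 6097, "Falcon": 8209}
old_falcon_notpm = 1100  # approximate figure quoted on the slide

# Throughput of each engine as a fraction of InnoDB's.
relative = {engine: round(value / notpm["InnoDB"], 2) for engine, value in notpm.items()}

# How much faster the current Falcon build is than the old one.
falcon_gain = round(notpm["Falcon"] / old_falcon_notpm, 1)

print(relative)
print(falcon_gain)
```

Under these assumptions Falcon reaches a little under half of InnoDB's throughput, while having improved roughly 7.5 times over its earlier build.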

DBT2 - 400 warehouses
- Data size ~29 GB
- SolidDB crashed after 336 mins
- Did not disable logs on SolidDB, to keep things comparable

[Chart: Load time, min]
Engine | Load time, min
InnoDB |  63
PBXT   |  40
Falcon | 136

DBT2, 400W, Data size
[Chart: Size of loaded data, MB]
Engine  | Size, MB
InnoDB  | 38266
SolidDB | 42191
PBXT    | 41770
Falcon  | 30726
- Surprisingly large size for PBXT
- SolidDB - tables were loaded into MyISAM and then converted to SolidDB
    - It was crashing otherwise

DBT2, 400W, Results
[Chart: NOTPM by engine]
Engine  | NOTPM
InnoDB  | 1105
SolidDB |  495
Falcon  |  178
- PBXT crashed
- Result in New Order Transactions Per Minute, more is better

Dell DVD Store
- Data size: Medium, 1 GB
    2,000,000 customers
    100,000 products
- Falcon - crashed
- PBXT - a lot of errors
- Result in new orders per minute, more is better

[Chart: orders per minute]
Engine  | Orders per minute
InnoDB  | 17589
SolidDB |  7594

sysbench
- Older Falcon used in this test. The new one crashes :(
- A couple of READ-ONLY queries against a typical table for Web applications - user account info:
    CREATE TABLE IF NOT EXISTS sbtest (
      id int(10) unsigned NOT NULL auto_increment,
      name varchar(64) NOT NULL default '',
      email varchar(64) NOT NULL default '',
      password varchar(64) NOT NULL default '',
      dob date default NULL,
      address varchar(128) NOT NULL default '',
      city varchar(64) NOT NULL default '',
      state_id tinyint(3) unsigned NOT NULL default '0',
      zip varchar(8) NOT NULL default '',
      country_id smallint(5) unsigned NOT NULL default '0',
      PRIMARY KEY (id),
      KEY `country_id` (country_id,state_id,city)
    )

sysbench, read by primary key
[Chart: queries/sec vs. clients (1, 4, 16, 64, 128, 256) for InnoDB, Falcon, SolidDB, PBXT]
- SELECT name FROM sbtest WHERE id=?
- InnoDB and Solid have a sweet spot, being clustered by PK

sysbench, read by index
[Chart: queries/sec vs. clients (1, 4, 16, 64, 128, 256) for InnoDB, Falcon, SolidDB, PBXT]
- SELECT name FROM sbtest WHERE country_id=?
- PBXT excels
- Falcon comes next

sysbench, read by covered index
[Chart: queries/sec vs. clients (1, 4, 16, 64, 128, 256) for InnoDB, Falcon, SolidDB, PBXT]
- SELECT state_id FROM sbtest WHERE country_id=?
- PBXT is still best
- Falcon can't use a covered index
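
Whether a query can be answered from an index alone comes down to a set comparison. The sketch below is a simplification: `is_covered` is a hypothetical helper, it ignores engine-specific details such as primary key columns being appended to secondary indexes, and it says nothing about whether a given engine can actually exploit coverage.

```python
def is_covered(index_columns, select_columns, where_columns):
    """True when every referenced column is available from the index."""
    needed = set(select_columns) | set(where_columns)
    return needed <= set(index_columns)


# KEY `country_id` (country_id, state_id, city) from the sbtest table
country_idx = ["country_id", "state_id", "city"]

# SELECT state_id FROM sbtest WHERE country_id=?
covered = is_covered(country_idx, ["state_id"], ["country_id"])

# SELECT name FROM sbtest WHERE country_id=?
not_covered = is_covered(country_idx, ["name"], ["country_id"])

print(covered, not_covered)
```

The first query touches only indexed columns, so an engine with index-only reads never visits the row; the second needs `name` and must fetch the row in every engine.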

sysbench, read by index, LIMIT 20
[Chart: queries/sec vs. clients (1, 4, 16, 64, 128, 256) for InnoDB, Falcon, SolidDB, PBXT]
- SELECT name FROM sbtest WHERE country_id=? LIMIT 20
- Falcon does not optimize LIMIT
- InnoDB scales poorly
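
Why LIMIT handling matters can be shown by counting fetched rows. This sketch only illustrates the general effect of stopping a scan early versus collecting all matches first; it makes no claim about how any of the engines is implemented, and `index_scan` with its in-memory row list is a hypothetical stand-in for an index range scan.

```python
from itertools import islice


def index_scan(rows, country_id, counter):
    """Yield matching rows one at a time, counting every row fetched."""
    for row in rows:
        if row["country_id"] == country_id:
            counter["fetched"] += 1
            yield row


rows = [{"id": i, "country_id": i % 2, "name": f"user{i}"} for i in range(1000)]

# Early termination: stop pulling from the scan after 20 matches.
early = {"fetched": 0}
first_20 = list(islice(index_scan(rows, 1, early), 20))

# No LIMIT optimization: collect every match, then keep 20.
full = {"fetched": 0}
all_then_cut = list(index_scan(rows, 1, full))[:20]

print(early["fetched"], full["fetched"], first_20 == all_then_cut)
```

Both approaches return the same 20 rows, but the second one does work proportional to the number of matches rather than to the limit.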

Sysbench OLTP
- Data size: 100,000,000 rows, ~25 GB
- Uniform distribution
- I/O-bound load
- Read/write transactions
- Reduced available memory by locking 12 GB out of 16 GB

Sysbench OLTP, time to load data
- Using multi-value INSERTs rather than LOAD DATA INFILE
- Solid and Falcon are even slower than InnoDB, which is known to be slow compared to MyISAM for data load

[Chart: load time, sec]
Engine  | Load time, sec
InnoDB  | 1930
SolidDB | 3364
PBXT    | 1237
Falcon  | 2880
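
A multi-value INSERT packs many rows into one statement. The sketch below shows one way a loader might build such statements; it is not the code used for this benchmark. `multi_value_insert` and `batches` are hypothetical helpers, the `%s` placeholder style is assumed to match the driver in use, and table and column names are inserted without escaping, so they must come from trusted input.

```python
def multi_value_insert(table, columns, rows):
    """Build one INSERT statement carrying many rows, with %s placeholders."""
    if not rows:
        raise ValueError("rows must not be empty")
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([one_row] * len(rows))
    )
    # Flatten the rows so the parameters line up with the placeholders.
    params = [value for row in rows for value in row]
    return sql, params


def batches(rows, size):
    """Split rows into chunks so each statement stays a manageable size."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


rows = [(1, "ann"), (2, "bob"), (3, "cy")]
statements = [
    multi_value_insert("sbtest", ["id", "name"], batch)
    for batch in batches(rows, 2)
]
print(statements[0][0])
```

Each statement costs one round trip and one parse regardless of how many rows it carries, which is the reason for preferring it over single-row INSERTs.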

Sysbench OLTP, Datasize
[Chart: Datasize, varchar vs char]
Engine  | char, GB | varchar, GB
InnoDB  | 22.51    |  9.6
SolidDB | 26.44    | 14.8
PBXT    | 23.03    | 23
Falcon  |  8.71    |  8.71
- Comparison of storage size for char and varchar columns in the table
- Falcon uses dynamic-length rows anyway
- PBXT surprisingly has the same huge size in both cases
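
The char versus varchar comparison is easier to read as a percentage saved. This sketch assumes the eight chart values pair up per engine as char 22.51, 26.44, 23.03, 8.71 GB and varchar 9.6, 14.8, 23, 8.71 GB, in the order InnoDB, SolidDB, PBXT, Falcon.

```python
# (char size, varchar size) in GB per engine, as read from the chart.
size_gb = {
    "InnoDB": (22.51, 9.6),
    "SolidDB": (26.44, 14.8),
    "PBXT": (23.03, 23.0),
    "Falcon": (8.71, 8.71),
}

# Space saved by switching from char to varchar, in whole percent.
saving_pct = {
    engine: round(100 * (char - varchar) / char)
    for engine, (char, varchar) in size_gb.items()
}
print(saving_pct)
```

On that reading, InnoDB and SolidDB shrink by roughly half with varchar, while PBXT and Falcon are unaffected by the column type.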

Sysbench OLTP, results
[Chart: I/O bound, transactions/sec vs. clients]
Engine  | 1 client | 4 clients | 64 clients
InnoDB  | 12.77    | 30.14     | 46.24
SolidDB | 10.62    | 22.33     | 26.11
PBXT    |  3.87    | 10.3      | 19.06
Falcon  |  4.86    |  5.8      |  5.71
- Memory limited to 4 GB, 2 GB for buffers
- InnoDB and SolidDB benefit from clustering by primary key
- All but Falcon scale well for the IO-bound workload with this number of hard drives
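
The scaling claim can be quantified as the throughput ratio between 64 clients and 1 client. This sketch assumes the twelve chart values group per engine in the order InnoDB, SolidDB, PBXT, Falcon, each with results for 1, 4 and 64 clients.

```python
# transactions/sec by number of clients, as read from the I/O-bound chart
tps = {
    "InnoDB": {1: 12.77, 4: 30.14, 64: 46.24},
    "SolidDB": {1: 10.62, 4: 22.33, 64: 26.11},
    "PBXT": {1: 3.87, 4: 10.3, 64: 19.06},
    "Falcon": {1: 4.86, 4: 5.8, 64: 5.71},
}

# Speed-up from 1 client to 64 clients.
scaling = {engine: round(r[64] / r[1], 1) for engine, r in tps.items()}
print(scaling)
```

On that reading Falcon gains about 20% from added concurrency, while the other engines gain between 2.5 and almost 5 times.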

Selected sqlbench results
- Single operation repeated N times, total time in secs, less is better

Operation                            | innodb_ | pbxt_fa | soliddb
alter_table_add (100)                |    8.00 |    3.00 |   32.00
count (100)                          |   12.00 |    8.00 |   28.00
count_distinct (1000)                |    6.00 |    8.00 |   74.00
count_distinct_2 (1000)              |   11.00 |   11.00 |   16.00
count_group_on_key_parts (1000)      |    7.00 |   10.00 |   83.00
count_on_key (50100)                 |   70.00 |   94.00 |  210.00
delete_all_many_keys (1)             |   17.00 |    2.00 |   28.00
insert (350768)                      |    6.00 |    5.00 |   21.00
outer_join (10)                      |   14.00 |    7.00 |   61.00
select_key2_return_prim (200000)     |   30.00 |   29.00 |   25.00
select_many_fields (2000)            |    8.00 |    6.00 |    5.00
update_big (10)                      |   18.00 |   56.00 |  727.00
update_of_key_big (501)              |   19.00 |    6.00 |  165.00
update_of_primary_key_many_keys (256 |   44.00 |   17.00 |   55.00
update_with_key_prefix (100000)      |   19.00 |    8.00 |   10.00

Conclusion
- All reviewed storage engines except InnoDB are currently too unstable for production use. SolidDB comes closest.
- InnoDB is still the winner in the majority of tests
- Falcon has severe issues with LIMIT optimization and IO-bound scalability
- PBXT and Falcon win in certain tests
- SolidDB is currently an outsider in terms of performance
- Need to revisit when production versions of all storage engines are ready

The End
- Thanks for coming!
- Slides will be published at http://www.mysqlperformanceblog.com/
- Feel free to approach us with your questions
- MySQL performance optimization consulting available
    - http://www.mysqlperformanceblog.com/mysql-consulting/

Sysbench OLTP, results, char
- Data size comparable with memory size

[Chart: CPU bound, transactions/sec; x-axis values 1, 4, 64]
Engine  | 1     | 4     | 64
InnoDB  | 18.75 | 36.71 | 29.36
SolidDB | 13.81 | 25.11 | 34.77
PBXT    |  8.87 | 17.51 | 29.1
Falcon  | 15.15 | 20.4  | 17.27