The Fast, the Slow and the Ugly:
A Magento Cache Showdown
Colin Mollenhour [email protected]
github.com/colinmollenhour
@colinmollenhour
• Began working with Magento in early 2009 at Business Services & Solutions, LLC
• Small team based in Knoxville, TN
• Serves a small number of clients in which we own an interest
• Highly distributed team of coders
• Integrates with third-party solution providers
• We're hiring engineers!
Life Without Cache?!
• Config.xml + core_config_data
• Block output
• Zend_* components (DDL, Locale, Date, etc..)
• Remote API data
• Expensive queries
• Custom uses
Cache Without Tags
What would happen without tagging?

With tags: invalidation event occurs → match tagged cache keys → invalidate layouts, blocks, etc.
No tags == either no invalidation, or cache thrashing.
Zend_Cache_Backend_?
Out-of-the-box
• Files
• Database
• Memcached
• APC/Xcache/eAccelerator
• ZendServer_Disk/ShMem
• Sqlite
• TwoLevels combinations
Alternative
• Redis
• Indexed Files
• Simplified TwoLevels
• and more..
Special Needs
• Tags support: File, Database, Sqlite (not Memcached, APC, Xcache, eAccel, Zend*)
• Cluster-friendly: Database, Memcached (not File, APC, Xcache, eAccel, Zend*, Sqlite)
• High-concurrency: all but Sqlite
• Persistent: File, Database, Sqlite, ZendServer_Disk (not Memcached, APC, Xcache, eAccel, ZendServer_ShMem)
Problems with TwoLevels
• Redundant:
  – Hits: fast, or fast + slow
  – Misses: fast + slow
  – Writes: fast + slow
  – Cleans: fast + slow
• Buggy:
  – Synchronization
  – Priorities
  – Expire times
TwoLevels: the fast backend stores DATA; the slow backend stores DATA + TAGS.
TwoLevels: Simplified
• No data written to slow backend
– Varien’s database backend does the same since CE 1.5
• Removes features
– Priority/filling
– Write-through
– Read-through
http://goo.gl/o92Zr
TwoLevels (simplified): the fast backend stores DATA; the slow backend stores only TAGS.
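The simplified scheme can be sketched in a few lines. This is an illustrative Python model (class and method names are mine, not Zend's or Magento's), with plain dicts standing in for the Memcached and database backends:

```python
# Sketch of the "simplified TwoLevels" idea: the fast backend keeps the
# data, the slow backend keeps only the tag index, so cache hits never
# touch the slow store and no data is duplicated.

class SimplifiedTwoLevels:
    def __init__(self):
        self.fast = {}        # id -> data (would be Memcached)
        self.tag_index = {}   # tag -> set of ids (would be the database)

    def save(self, id, data, tags):
        self.fast[id] = data                 # data goes to fast only
        for tag in tags:
            self.tag_index.setdefault(tag, set()).add(id)

    def load(self, id):
        return self.fast.get(id)             # hits never query slow

    def clean(self, tags):
        for tag in tags:
            for id in self.tag_index.pop(tag, set()):
                self.fast.pop(id, None)      # invalidate via the tag index
```

A clean by tag only consults the slow backend's index, then deletes the matching ids from the fast backend.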
Redis ("remote dictionary server", http://redis.io)
• Advanced key-value store
• In-memory and on disk
• Automatic key expiration
• Can specify maxmemory
• Master/Slave(s) replication
• Strings (set, get, incr, append, …)
• Lists (push, pop, trim, index, sort, …)
• Hashes (set, get, del, keys, vals, …)
• Sets (add, rem, union, inter, diff, …)
• Sorted Sets (rank, score, union, sum, …)
• Transactions
• Publish/Subscribe
• Lua Scripting
Cm_Cache_Backend_Redis
loadCache()
• HGET('zc:k:'.$id, 'd')
saveCache()
• HMSET($id, {$data, $tags, $time})
• EXPIRE($id, $lifetime)
• SADD('zc:tags', $tags)
• SADD('zc:ti:'.$tag, $id) (for each tag)
cleanCache($tags)
• $ids = SUNION($tags)
• DEL($ids)
• DEL('zc:ti:'.$tag) (for each tag)
• SREM('zc:tags', $tags)
removeCache()
• $tags = HGET('zc:k:'.$id, 't')
• DEL($id)
• SREM('zc:ti:'.$tag, $id) (for each tag)
http://goo.gl/1ThM8
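The operations above can be simulated against an in-memory store to show how the key scheme fits together. This is a hypothetical Python model, not the real backend; the 'zc:k:', 'zc:ti:' and 'zc:tags' prefixes follow the slide:

```python
# In-memory simulation of the Cm_Cache_Backend_Redis key scheme:
# each record is a hash ('zc:k:<id>'), each tag has a set of member
# ids ('zc:ti:<tag>'), and 'zc:tags' tracks all known tags.

class FakeRedisTagCache:
    def __init__(self):
        self.hashes = {}                 # 'zc:k:<id>' -> {'d': data, 't': tags}
        self.sets = {'zc:tags': set()}   # tag index sets

    def save(self, id, data, tags):
        # HMSET, SADD zc:tags, SADD zc:ti:<tag> for each tag
        self.hashes['zc:k:' + id] = {'d': data, 't': set(tags)}
        self.sets['zc:tags'].update(tags)
        for tag in tags:
            self.sets.setdefault('zc:ti:' + tag, set()).add(id)

    def load(self, id):
        # HGET zc:k:<id> d
        h = self.hashes.get('zc:k:' + id)
        return h['d'] if h else None

    def clean(self, tags):
        # SUNION the tag sets, DEL the matching records and tag indexes
        ids = set().union(*(self.sets.get('zc:ti:' + t, set()) for t in tags))
        for id in ids:
            self.hashes.pop('zc:k:' + id, None)
        for t in tags:
            self.sets.pop('zc:ti:' + t, None)
        self.sets['zc:tags'].difference_update(tags)
        return ids
```

Because all lookups are O(1) hash/set operations, cleans scale with the number of tagged ids rather than the total number of keys.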
Cm_Cache_Backend_File
• Writes tags in append-only mode
• Randomly compacts large tag files
• Locks tag files for safe operation
• Fixes broken subdirectory distribution
• Unit tested
http://goo.gl/WWyM4
Testing Methodology
Generate random data → generate random ops → load cache data → start N clients → sum results from each client

Generate Test Data Once:
• 32-byte keys/tags, Base64 encoded
• Random data size
• Random tags per key
• N clients, X ops per client
• 1 in 1000 chance for clean
• 1 in 1000 chance for save
Repeatable Execution:
• Bash script cleans cache and loads pre-generated cache data
• Read pre-generated ops into memory
• Output reads, writes, cleans
• Awk script sums reads, writes, cleans
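The data-generation step might look roughly like this in Python. Parameter names mirror the benchmark's CLI flags, but the functions themselves are an illustrative sketch, not the actual shell/cache-benchmark.php code:

```python
import base64
import random

def gen_token(rng):
    # 32-character Base64 token (24 raw bytes -> 32 Base64 chars),
    # matching the "32-byte keys/tags, Base64 encoded" bullet.
    return base64.b64encode(rng.randbytes(24)).decode()

def gen_dataset(num_keys=20000, num_tags=5000, min_tags=1, max_tags=10,
                max_rec_size=32768, seed=1):
    rng = random.Random(seed)  # fixed seed => repeatable test data
    tags = [gen_token(rng) for _ in range(num_tags)]
    data = {}
    for _ in range(num_keys):
        n = rng.randint(min_tags, min(max_tags, len(tags)))
        data[gen_token(rng)] = {
            'size': rng.randint(1, max_rec_size),  # random record size
            'tags': rng.sample(tags, n),           # random tags per key
        }
    return data
```

Seeding the generator once means every client replays the identical workload, so runs against different backends are directly comparable.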
Choosing Parameters

$ php shell/cache-benchmark.php analyze

(chart: Data Size distribution)

$ php shell/cache-benchmark.php init \
  --name basic --clients 4 --ops 30000 --seed 1 \
  --keys 20000 --tags 5000 --min-tags 1 --max-tags 10 --max-rec-size 32768
$ bash var/cachebench/basic/run.sh
Choosing Parameters

(charts: Keys per Tag and Tags per Key distributions)

$ php shell/cache-benchmark.php analyze
$ php shell/cache-benchmark.php init \
  --name basic --clients 4 --ops 30000 --seed 1 \
  --keys 20000 --tags 5000 --min-tags 1 --max-tags 10 --max-rec-size 32768
$ bash var/cachebench/basic/run.sh
Benchmarks!!
Machine Specs
• Dual Quad Core Xeon E5620 (2.4 Ghz Gulftown)
• 12Gb RAM
• 2x 250Gb SATA in RAID 1
• Debian 6.0
• dotdeb.org packages
• Magento CE 1.6.2.0
Backends Tested
• Files
• Database
• Memcache + Files
• Memcache + Database
• Memcache + Redis*
• Redis (lzf compression)
• Cm_Cache_Backend_File
• * = simplified two-levels
Basic Comparison (ops/sec)

              reads    writes   cleans
files         20,672   7,415    1.3
database      6,517    988      986
memc-files    16,494   2,877    1.2
memc-db       17,109   2,103    1,470
memc-redis    15,360   2,370    259
memc-redis*   17,320   2,593    1,263
redis-lzf     14,957   4,371    6,173
cm-files      52,008   4,198    3,391
Concurrency Scaling (Number of Clients)
(charts: Reads, Writes, and Cleans in thousands of ops/sec at 2, 8, 16, 32, and 64 concurrent clients; backends: files, database, memc-db, memc-redis*, redis-lzf, cm-files)
Capacity Scaling (Keys and Tags)
(charts: Reads, Writes, and Cleans in thousands of ops/sec vs. thousands of keys/tags; backends: files, database, memc-db, memc-redis*, redis-lzf, cm-files)
The Ugly
Files:
• Unbearably slow clean
• tmpfs is moot: negligible clean gains, no read gains
Database:
• Poor read latency
• Capacity doesn't scale
• Would contend with Magento queries

Redis: Pros & Cons
Redis: Gotchas & Tips
maxmemory:
• Recommend 'volatile-lru' policy
• LRU algorithm evicts one of N (default=3) random keys
Compression:
• Saves ~69% with gzip, ~50% with lzf
• Use Google's "snappy" or PECL's "lzf" library for best performance
• Set `compress_threshold` lower to compress more often
phpredis native extension:
• Only use db 0 with phpredis due to a reconnect bug
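The compress_threshold idea can be illustrated with a small sketch. This uses Python's zlib (gzip-style compression) and a made-up 'gz:'/'raw:' prefix convention; the real backend's stored format may differ:

```python
import zlib

# Illustrative compress_threshold policy: only records at or above the
# threshold get compressed, and a prefix tells the loader whether to
# decompress. Lowering COMPRESS_THRESHOLD compresses more records.
COMPRESS_THRESHOLD = 1024  # bytes

def encode(data: bytes) -> bytes:
    if len(data) >= COMPRESS_THRESHOLD:
        return b'gz:' + zlib.compress(data)
    return b'raw:' + data

def decode(blob: bytes) -> bytes:
    if blob.startswith(b'gz:'):
        return zlib.decompress(blob[3:])
    return blob[4:]  # strip 'raw:'
```

Skipping compression for tiny records avoids paying CPU for negligible space savings, which is why a tunable threshold exists at all.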
Redis Compression Options (thousands of ops/sec)

         gzip-1k  gzip-4k  gzip-8k  gzip-16k  lzf-1k  lzf-4k  lzf-8k  lzf-16k  none
reads    33.76    34.24    33.73    32.37     31.15   30.39   31.33   33.03    31.00
writes   7.46     7.53     7.50     7.60      8.62    8.84    8.64    9.08     8.29
cleans   15.48    16.02    15.25    14.04     14.74   14.12   14.55   14.38    11.34
Cm_Cache_Backend_File: Pros & Cons
Memcache + ?
Redis (w/ simplified TwoLevels):
• Slightly better performance
• Very low resource utilization
• No effect on the SQL database
Database:
• Contention with Magento queries
• Lots of overhead: tag indexes, durable writes
Recommendations
Small (single web node)
• Cm_Cache_Backend_File
• Cm_Cache_Backend_Redis
Medium (1-5 web nodes)
• Cm_Cache_Backend_Redis
Large (6+ web nodes)
• Memcached + Redis (w/ simplified TwoLevels)
More possibilities
github.com/AntonStoeckl/Zend_Cache_Backend_Mongo
• Blazing fast
• MongoDB's memory usage can't be capped
• No automatic online compaction so long-term use may require maintenance
Sharding with Redis to overcome single-threaded problem
• 1 shard for tags, N shards for data
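One way the "1 shard for tags, N shards for data" split might be wired up, as a hypothetical Python sketch (the host names and the hashing choice are assumptions for illustration, not part of any published implementation):

```python
import hashlib

# Tag-index keys must all live on one shard so SUNION/SADD across tags
# still work; data keys hash across the remaining shards, spreading
# load that a single-threaded Redis instance would otherwise absorb alone.
NUM_DATA_SHARDS = 4
TAG_SHARD = 'redis-tags:6379'  # assumed host name, for illustration
DATA_SHARDS = ['redis-data%d:6379' % i for i in range(NUM_DATA_SHARDS)]

def shard_for(key: str) -> str:
    if key.startswith('zc:ti:') or key == 'zc:tags':
        return TAG_SHARD  # keep the whole tag index together
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return DATA_SHARDS[h % NUM_DATA_SHARDS]
```

The same key always maps to the same shard, so reads and writes stay consistent without any coordination between shards.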
Magento Session Handlers
Files (php implementation)
• Good for single servers
Database (SQL)
• Lacks locking mechanism
Memcache
• Not persistent
eAccelerator
• Not persistent
• Lacks locking mechanism
• Contends with opcode cache
Cm_RedisSession (http://goo.gl/D2Dyw)
• Persistent
• Cluster-friendly
• No garbage collection needed
• Optimistic locking
• Compression supported
• Includes online migration script
Optimistic Locking
• Increment 'lock'
• If lock == 1, take lock
• Else if timeout exceeded, break lock
• Fetch session data

Common case read:
• HINCRBY
• HMSET, EXPIRE
• HGET(id, 'data')
Common case write:
• HGET(id, 'pid')
• HMSET, EXPIRE
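The lock flow above can be sketched in Python against an in-memory dict standing in for Redis hashes (BREAK_AFTER, the field names, and the polling loop are illustrative; Cm_RedisSession's real logic is more involved):

```python
import time

# In-memory model of the HINCRBY-based optimistic lock: incrementing
# the 'lock' field atomically both tests and claims the lock, and a
# stale lock is broken after a timeout.
BREAK_AFTER = 30  # seconds before a stale lock may be broken (assumed)

sessions = {}  # session_id -> {'lock': int, 'pid': str, 'data': str}

def acquire_lock(session_id, pid, timeout=BREAK_AFTER, poll=0.1):
    sess = sessions.setdefault(session_id, {'lock': 0, 'pid': '', 'data': ''})
    waited = 0.0
    while True:
        sess['lock'] += 1          # HINCRBY <id> lock 1
        if sess['lock'] == 1:      # nobody held it: lock is ours
            sess['pid'] = pid      # HMSET pid, so waiters can identify us
            return True
        if waited >= timeout:      # holder looks dead: break the lock
            sess['lock'] = 1
            sess['pid'] = pid
            return True
        time.sleep(poll)           # lock held: wait and retry
        waited += poll

def write_and_release(session_id, data):
    sess = sessions[session_id]
    sess['data'] = data            # HMSET data (EXPIRE would go here)
    sess['lock'] = 0               # release for the next request
```

Because the increment itself is the test, two concurrent requests can never both see lock == 1, which is what makes the scheme safe without a separate compare-and-set.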