
Benchmarking at Parse

Transcript
Page 1: Benchmarking at Parse

Advanced Benchmarking at Parse

Travis Redman Parse + Facebook

Page 2: Benchmarking at Parse

Parse?

• Parse is a backend service for mobile apps

• Data Storage

• Server-side code

• Push Notifications

• Analytics

• … all by dropping an SDK into your app

Page 3: Benchmarking at Parse

Parse Stats

• Parse has 400,000 apps

• Rapidly growing MongoDB deployment with:

• 500 databases

• 2.5M collections

• 8M indexes

• 50T storage (excluding replication)

• We have all kinds of workloads!

Page 4: Benchmarking at Parse

Variety is Fun

• We support just about any kind of workload you can imagine

• Games, social networking, events, travel, music, etc

• Apps that are read heavy or write heavy

• Heavy push users (time sensitive notifications)

• Apps that store large objects

• Apps that use us for backups

• Inefficient queries

Page 5: Benchmarking at Parse

2.6 - Why Upgrade?

• General desire to stay current, precursor for 2.8 and pluggable storage engines

• Specific features in 2.6

• Background indexing on secondaries

• Index intersection

• Query plan summary logging

Page 6: Benchmarking at Parse

Upgrading is Scary

• In the early days, we just upgraded

• Put a new version on a secondary

• ???

• Upgrade primaries

• ???

• Fix bugs as we find them - LIVE!

Page 7: Benchmarking at Parse

Upgrading

• We’re too big now to cowboy it up

• Upgrading blindly is a potential catastrophe

• In particular, we want to avoid:

• Significant performance regressions

• Unexpected bugs that break customer apps

Page 8: Benchmarking at Parse

Benchmarking

• We know that:

• Benchmarking can detect performance regressions between versions

• Tools and sample workloads (sysbench, YCSB, …) already exist

• MongoDB runs its own benchmarks

• Our workload is complex - we want more confidence

Page 9: Benchmarking at Parse

A Customized Approach

• Why not test with production workloads?

• Flashback: https://github.com/ParsePlatform/flashback

• Record - python tool to record ops

• Replay - go tool to play back ops

Page 10: Benchmarking at Parse

Record

• Record leverages mongo’s profiling and oplog

• Profiling is enabled on all DBs

• Inserts are collected from the oplog

• All other ops taken from profile db

• Ops are recorded for a specified time period (24h) and then merged

• Produces a JSON file of ops to feed the replay tool
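
As a rough illustration of that recording step (the real Record tool is the Python script in the Flashback repo; the database name and filters below are made up):

# Sketch of the recording idea, not the actual Flashback tool:
# profile every op, then pull non-insert ops from system.profile and
# inserts from the oplog. The "appdata1" database name is illustrative.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["appdata1"]

# Profile level 2 records every operation on this database.
db.command("profile", 2)

# Queries, updates, removes, and findAndModify ops land in system.profile.
profiled_ops = list(db["system.profile"].find({"op": {"$ne": "insert"}}))

# Inserts are reconstructed from the primary's oplog instead.
oplog = client["local"]["oplog.rs"]
inserts = list(oplog.find({"op": "i", "ns": {"$regex": "^appdata1\\."}}))

print(len(profiled_ops), "profiled ops,", len(inserts), "inserts recorded")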

Page 11: Benchmarking at Parse

Recording

Page 12: Benchmarking at Parse

Base Snapshot

• Need to replay prod ops on prod data

• It’s best to play back ops on a consistent copy of the data, otherwise:

• inserts are duplicate key errors

• deletes are no-ops

• queries don’t return the right data

• Using EBS snapshots, we grab a copy of the db during the recording

• Discard ops before the snapshot
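
Concretely, discarding those ops is just a timestamp filter over the merged recording; a small sketch (the one-op-per-line layout and the "ts" field name are assumptions, not necessarily Flashback's exact format):

# Drop recorded ops that predate the base snapshot so the replay starts
# from a consistent data set. File layout and field name are assumptions.
import json

SNAPSHOT_TS = 1404871200  # epoch seconds when the EBS snapshot was taken

with open("merged_ops.json") as src, open("replay_ops.json", "w") as dst:
    for line in src:               # assume one JSON op document per line
        op = json.loads(line)
        if op["ts"] >= SNAPSHOT_TS:
            dst.write(line)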

Page 13: Benchmarking at Parse

Recording Timeline

Page 14: Benchmarking at Parse

Base Snapshot

• Snapshot is restored to our benchmark server(s)

• EBS volume has to be “warmed” because snapshot blocks are not instantiated

• Multi TB volumes can take a few hours to warm

• After warming we create an LVM snapshot

• We can “rewind” (merge) after each playback, iterating faster
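
The rewind trick is plain LVM; roughly, in Python-wrapped shell form (volume group, LV, and mount point names are made up):

# Sketch of the LVM snapshot/rewind cycle around each playback run.
# VG/LV/mount names are illustrative; in practice these are shell commands.
import subprocess

def take_snapshot():
    subprocess.check_call(["lvcreate", "--snapshot", "--size", "200G",
                           "--name", "mongo_base", "/dev/vg_data/mongo"])

def rewind():
    # Merging the snapshot reverts the origin LV to its state at snapshot
    # time; mongod must be stopped and the volume unmounted first.
    subprocess.check_call(["umount", "/var/lib/mongodb"])
    subprocess.check_call(["lvconvert", "--merge", "/dev/vg_data/mongo_base"])
    subprocess.check_call(["mount", "/dev/vg_data/mongo", "/var/lib/mongodb"])
    take_snapshot()  # the merged snapshot is consumed, so re-create it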

Page 15: Benchmarking at Parse

Playback

1. Freeze the LVM volume

2. Start the version of mongo being tested

3. Adjust replay parameters

• # workers

• # ops

• timestamp to start at (when base snapshot was taken)

4. Go!

5. Client-side results are logged to file, server-side collected from monitoring tools
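
The per-op latencies in that client-side log are what the P99/MAX numbers later in the deck are built from; a small sketch of that aggregation (the "op_type elapsed_ms" line format is assumed for illustration):

# Compute P99 and MAX latency per op type from a replay results log.
# The "op_type elapsed_ms" line format is an assumption for illustration.
from collections import defaultdict

latencies = defaultdict(list)            # op type -> latencies in ms

with open("replay_results.log") as f:
    for line in f:
        op_type, elapsed_ms = line.split()
        latencies[op_type].append(float(elapsed_ms))

for op_type, values in sorted(latencies.items()):
    values.sort()
    p99 = values[max(int(len(values) * 0.99) - 1, 0)]
    print(f"{op_type}: p99={p99:.2f}ms max={values[-1]:.2f}ms")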

Page 16: Benchmarking at Parse

Playback

Page 17: Benchmarking at Parse

Our Workload

• 24h of ops collected

• 10M ops at a time, as fast as possible

• 10 workers

• No warming of the replica set

• LVM snapshot reset, mongod restarted for each version

• Rinse and repeat for multiple replica sets

Page 18: Benchmarking at Parse

Our Results

2.4.10

3061.96 ops/sec (avg)

Page 19: Benchmarking at Parse

Results

2.6.3

2062.69 ops/sec (avg)

Page 20: Benchmarking at Parse

Results

• 33% loss in throughput

• A second workload showed a 75% drop in throughput

• 3669.73 ops/sec vs 975.64 ops/sec

• Ouch! What do we do next?

Page 21: Benchmarking at Parse

Replay Data

         2.4.10 P99   2.4.10 MAX   2.6.3 P99   2.6.3 MAX
query    18.45ms      20953ms      19.21ms     60001ms
insert   23.5ms       6290ms       50.29ms     48837ms
update   21.87ms      3835ms       21.79ms     48776ms
FAM      21.99ms      6159ms       24.91ms     49254ms

Page 22: Benchmarking at Parse

Replay Data

Page 23: Benchmarking at Parse

Bug Hunt!

• Old fashioned troubleshooting begins

• Began isolating query patterns and collections with high max times

• Reproduced issue, confirmed slowness in 2.6

• Lots of documentation and log gathering, including extremely verbose QLOG

• Started an investigation with the Mongo team that ran for several weeks

Page 24: Benchmarking at Parse

What we found

• Basically, new query planner in 2.6 meets Parse auto-indexer

• We create lots of indexes automatically

• More indexes to score and potentially race

• Increased likelihood of running into query planner bugs

Page 25: Benchmarking at Parse

Example 1

Remove op on “Installation”

{ "installationId": {"$ne": ? }, "appIdentifier": "?", "deviceToken": “?”}

• 9M documents

• installationId is UUID, unique value

• "installationId": {"$ne": ? } matches most documents

• deviceToken is a unique token identifying the device

Page 26: Benchmarking at Parse

{ "installationId": {"$ne": ? }, "appIdentifier": "?", "deviceToken": “?”}

• Three candidate indexes:

{installationId: 1, deviceToken: 1} {deviceToken: 1, installationId: 1} {deviceToken: 1}

• The second and third indexes are clearly better candidates for this query, since the device token is a simple point lookup.

• Mongo bug where the work required to skip keys was not factored into the plan ranking, causing the inefficient plan to sometimes tie

• Since it's a remove op, it held the write lock for the DB

• Fixed in: https://jira.mongodb.org/browse/SERVER-14311
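
A quick way to see which plan wins for this shape, and to pin the selective index while a fix is pending, is explain() plus hint(); an illustrative pymongo sketch (database name and values are placeholders, not the Parse tooling):

# Inspect the winning plan for the problematic filter, then force the
# point-lookup index with a hint. Placeholder values; illustrative only.
from pymongo import MongoClient

coll = MongoClient()["appdata1"]["Installation"]
query = {
    "installationId": {"$ne": "11111111-2222-3333-4444-555555555555"},
    "appIdentifier": "com.example.app",
    "deviceToken": "abc123",
}

# Which index did the optimizer pick for this shape?
print(coll.find(query).explain())

# Pin the index that starts with the unique deviceToken point lookup.
for doc in coll.find(query).hint([("deviceToken", 1), ("installationId", 1)]):
    print(doc["_id"])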

Page 27: Benchmarking at Parse

Example 2

Query on “Activity”:

{ $or: [ { _p_project: "?" }, { _p_newProject: "?" } ], acl: { $in: [ "a", "b", "c" ] } }

• 25M documents

• _p_project and _p_newProject are pointers to unique IDs of other objects

• acl matches most documents

• Four candidate indexes for this query

{ _p_newProject: 1 } { _p_project: 1 } { _p_project: 1, _created_at: 1 } { acl: 1 }

Page 28: Benchmarking at Parse

{ $or: [ { _p_project: "?" }, { _p_newProject: "?" } ], acl: { $in: [ "a", "b", "c" ] } }

• Query Planner would race multiple plans using indexes

• Due to a bug, one of the raced indexes would do a full index scan (acl)

• Index scan was non-yielding, tying up the lock until it had completed

• Parse query killer job kills non-yielding queries after 45s

• The query planner would fail to cache a plan, so the race would be re-run on the next query with the same pattern

• Fixed: https://jira.mongodb.org/browse/SERVER-15152
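
The query killer sweeps server-side for stuck operations; for comparison, 2.6 also added a per-operation cap, maxTimeMS, that a client can set itself (names and values below are illustrative):

# maxTimeMS (new in 2.6) makes the server abort a single operation after
# a time budget; unlike the Parse sweeper, the client sets it per query.
from pymongo import MongoClient
from pymongo.errors import ExecutionTimeout

coll = MongoClient()["appdata1"]["Activity"]
query = {
    "$or": [{"_p_project": "xyz"}, {"_p_newProject": "xyz"}],
    "acl": {"$in": ["a", "b", "c"]},
}

try:
    docs = list(coll.find(query).max_time_ms(45000))   # 45s, like the killer
except ExecutionTimeout:
    print("query exceeded 45s and was killed by the server")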

Page 29: Benchmarking at Parse

Example 3

Query on "Activity": { $or: [ { _p_project: "?" }, { _p_newProject: "?" } ], acl: { $in: [ "a", "b", "c" ] } } (same as previous example)

• Usually fast, but occasionally saw high nscanned and query time > 60s

• Since there were indexes on all fields in the AND condition, this was a candidate for index intersection

• planSummary: IXSCAN { _p_project: 1 }, IXSCAN { _p_newProject: 1 }, IXSCAN { acl: 1.0 }

• acl was not selective, but _p_project and _p_newProject would sometimes match 0 documents during the race

• The intersection-based query plan would get cached, making subsequent queries slow

• Fixed in https://jira.mongodb.org/browse/SERVER-14961
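
Because the pain only started once the bad intersection plan was cached, the plan cache commands that shipped in 2.6 are handy for diagnosing and recovering from this; a sketch (database name and values are placeholders):

# Inspect and evict the cached plan for the problematic query shape using
# the stock 2.6 plan cache commands. Placeholder database/values.
from pymongo import MongoClient

db = MongoClient()["appdata1"]
query = {
    "$or": [{"_p_project": "xyz"}, {"_p_newProject": "xyz"}],
    "acl": {"$in": ["a", "b", "c"]},
}

# List the plans cached for this shape on the Activity collection.
print(db.command("planCacheListPlans", "Activity", query=query))

# Clear the cached plan so the next query re-plans from scratch.
db.command("planCacheClear", "Activity", query=query)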

Page 30: Benchmarking at Parse

Success?

2.6.5

4443.10 ops/sec (vs 3061.96 in 2.4.10)

Page 31: Benchmarking at Parse

Comparison

         2.4.10 P99   2.4.10 MAX   2.6.4 P99   2.6.4 MAX   2.6.5 P99   2.6.5 MAX
query    18 ms        20,953 ms    19 ms       60,001 ms   10 ms       4,352 ms
insert   23 ms        6,290 ms     50 ms       48,837 ms   24 ms       2,225 ms
update   22 ms        3,835 ms     21 ms       48,776 ms   23 ms       4,535 ms
FAM      22 ms        6,159 ms     24 ms       49,254 ms   23 ms       4,353 ms

Page 32: Benchmarking at Parse

More Results

                 2.4.10          2.6.5
Ops:10M W:10     3061 ops/sec    4443 ops/sec
Ops:10M W:250    10666 ops/sec   12248 ops/sec
Ops:20M W:1000   11735 ops/sec   14335 ops/sec

Page 33: Benchmarking at Parse

What now?

• 2.6 has a green light on performance

• Working through functionality testing

• Unit/integration testing catching majority of issues

• Bonus: Flashback error log helping us to identify problems not caught by tests

Page 34: Benchmarking at Parse

Wrap Up

• Benchmarking with something representative of your production workload is worth the time

• Saved us from discovering slowness in production, followed by inevitable and painful rollbacks

• Using actual production data is even better

• Helped us avoid new bugs

• Learned a lot about our own service (indexing algorithms need some work)

• Initial work can be reused to efficiently test future versions

Page 35: Benchmarking at Parse

Questions?

• Flashback: https://github.com/ParsePlatform/flashback

• Links to bugs:

• https://jira.mongodb.org/browse/SERVER-14311

• https://jira.mongodb.org/browse/SERVER-15152

• https://jira.mongodb.org/browse/SERVER-14961

