@tomtheguvnor
Monufacture: Effortless test data for MongoDB
Tom Leach @tomtheguvnor | github.com/tleach
Monufacture: The Spookiest Way To Create Test Data for MongoDB
Tom Leach @tomtheguvnor
A little about me…
A little about GameChanger…
gc.com/careers
GameChanger & MongoDB
• Migrated to MongoDB 1.2 in 2009
• 12 TB of data (including 1,714,971,688 plays)
• Split across 10 shards
• 35 nodes in total
• Recently migrated all of this from EC2 Classic to VPC (5 mins downtime)
“Testing MongoDB-dependent code”
• Unit tests (“classicist” not “mockist”)
• Integration tests
• Functional tests
• System tests
Why is testing this important?
• MongoDB trades off many of the consistency guarantees provided by a traditional RDBMS in favor of availability
• The onus is on the application developer to implement:
  • Validation, defaults, sequences
  • Concurrency handling
  • Denormalization
  • Referential integrity
• These are not simple problems to solve
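To make the first item concrete, here is a minimal sketch of validation and defaults applied in application code before a write. The function name `prepare_team`, the default values, and the specific rules are assumptions for illustration; only the field names come from the team example used later in this deck.

```python
def prepare_team(doc):
    """Apply defaults, then validate, before the document is written.

    Hypothetical helper: MongoDB itself would accept any of these shapes,
    so the application has to enforce them.
    """
    # Defaults (assumed values): fields the caller may omit.
    prepared = {'players': [], 'zip_code': None}
    prepared.update(doc)

    # Validation (assumed rules): reject documents the rest of the code cannot handle.
    if not isinstance(prepared.get('name'), str) or not prepared['name']:
        raise ValueError('team requires a non-empty name')
    for player in prepared['players']:
        if 'first_name' not in player or 'last_name' not in player:
            raise ValueError('player requires first_name and last_name')
    return prepared


team = prepare_team({'name': 'Giants'})
```

Every rule like this is code that can be wrong, which is why it needs tests and therefore test data.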
Why is Mongo test data hard?
• Documents have arbitrarily complex nested structures
• Understanding what a valid document looks like is not obvious
• The DB does not really help us out
Typical approaches (we tried all of these)
Option 1: Static, “real-looking” database
• Can run either manual or (scary) automated tests
• “Easy” to get started (clone production)
• Data is inventory which requires maintenance
• Non-deterministic
• Failures tend to have indirect causality and are time-consuming to diagnose
• Portability is hard
Option 2: Fixtures
• Deterministic
• Brittle, non-obvious inter-test dependencies
• Static and painful to override
• Time consuming to maintain (inventory)
• Tend to bloat and slow tests
Option 3: Generate test data in tests
• Deterministic
• Localized causality*
• Boilerplate (especially in Mongo) obfuscates intent
• Expensive to create and maintain (schema changes)
• Overhead dis-incentivizes developers from writing tests
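A minimal sketch of what the boilerplate problem can look like in practice. `FakeCollection` is a hypothetical in-memory stand-in for a MongoDB collection so the example runs without a server, and `add_player` is an assumed implementation of the code under test; neither comes from the GameChanger codebase.

```python
import copy


class FakeCollection:
    """Hypothetical in-memory stand-in for a MongoDB collection."""

    def __init__(self):
        self.docs = {}

    def insert(self, doc):
        # Store a copy keyed by _id, replacing any existing document.
        self.docs[doc['_id']] = copy.deepcopy(doc)

    def find_one(self, _id):
        return copy.deepcopy(self.docs.get(_id))


def add_player(collection, team_id, first_name, last_name):
    """Assumed code under test: append a player to a team's roster."""
    team = collection.find_one(team_id)
    team['players'].append({'first_name': first_name, 'last_name': last_name})
    collection.insert(team)


def test_add_player_inline():
    teams = FakeCollection()
    # Setup the test does not care about, but a valid team document needs.
    teams.insert({
        '_id': 1,
        'name': 'Test Team',
        'zip_code': 10007,
        'players': [
            {'first_name': 'A', 'last_name': 'One', 'number': 1},
            {'first_name': 'B', 'last_name': 'Two', 'number': 2},
        ],
    })
    # The one line that expresses the intent of the test.
    add_player(teams, 1, 'Madison', 'Bumgarner')
    players = teams.find_one(1)['players']
    assert {'first_name': 'Madison', 'last_name': 'Bumgarner'} in players
    return players
```

Most of the test body is document construction, and every schema change has to be repeated in every test that builds a team this way.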
What were we solving for?
• Simple, readable, maintainable, concise tests
• Good coverage
• Tests are fast to write
• Developers who want to write tests because they are a help not a hindrance
• A cultural shift that scales
[Diagram: data model relating Team, Player and User. Relationships are labelled “follows”, “family of” and “rosters”, with multiplicities of 0..* and 1.]
What does a good test data API look like?
def test_add_player(self):
    team = create('team')
    add_player(team, 'Madison', 'Bumgarner')
    self.assertIn(
        {'first_name': 'Madison', 'last_name': 'Bumgarner'},
        db.team.find_one(team['_id'])['players'])
Introducing Monufacture
[Diagram: application stack showing Mongo, pymongo, the business logic modelling layer, and the front end / HTTP API / worker, with tests and monufacture alongside.]
Declare factories centrally:

with factory('team', db.team):
    fragment('player', {
        'first_name': text(),
        'last_name': text(),
        'number': sequence()
    })
    default({
        'name': text(spaces=True),
        'zip_code': 10007,
        'players': list_of(embed('player'), 5)
    })

Use factories in tests:

from monufacture import create

def test_add_player(self):
    team = create('team')
    add_player(team, 'Madison', 'Bumgarner')
    self.assertIn(
        {'first_name': 'Madison', 'last_name': 'Bumgarner'},
        db.team.find_one(team['_id'])['players'])
Demo
monufacture introduced
–Gabriel Khaselev, GameChanger Engineer, Millennial
“Dude, you gotta check out Monufacture, it’s so dope.”
github.com/gamechanger/monufacture
Send us:
• Feedback
• Issues
• Pull Requests
Also check out Mongothon (github.com/gamechanger/mongothon) - our Mongo modeling library with built-in validation, field defaults and events