
PyConUK2013 - Validated documents on MongoDB with Ming

Date post: 27-Jan-2015
Upload: amol
Description:
Ming is a SQLAlchemy-inspired object-document mapper (ODM) for MongoDB, developed at SourceForge and also used by the TurboGears2 web framework to provide MongoDB support. After a short introduction to the basic Ming layer, we will cover the Ming Object Document Mapper layer to show how to take advantage of its Unit Of Work to avoid performing incomplete changes and to express relations between collections. The last part of the talk will show how to use Ming to perform lazy migration of data when your schema changes, and how to drop below the ODM layer to achieve maximum speed.
Transcript
Page 1: PyConUK2013 - Validated documents on MongoDB with Ming

VALIDATED DOCUMENTS ON MONGODB WITH MING

Alessandro Molina @__amol__

[email protected]

Page 2: PyConUK2013 - Validated documents on MongoDB with Ming

Who am I

● CTO @ Axant.it, mostly a Python company (with some iOS and Android)

● TurboGears development team member

● Contributions to Ming project ODM layer

● Really happy to be here at PyConUK!

○ I thought I would have crashed my car driving on the wrong side!

Page 3: PyConUK2013 - Validated documents on MongoDB with Ming

MongoDB Models

● Schema free

○ It looks like you don’t have a schema, but your code depends on properties that need to be there.

● SubDocuments

○ You know that a blog post contains a list of comments, but what is a comment?

● Relations

○ You don’t have joins and foreign keys, but you still need to express relationships

Page 4: PyConUK2013 - Validated documents on MongoDB with Ming

What’s Ming?

● MongoDB toolkit

○ Validation layer on pymongo

○ Manages schema migrations

○ In Memory MongoDB

○ ODM on top of all of those

● Born at sourceforge.net

● Supported by the TurboGears community

[Diagram: layer stack, bottom to top: MongoDB, PyMongo, Ming, Ming.ODM]

Page 5: PyConUK2013 - Validated documents on MongoDB with Ming

Getting Started with the ODM

● Ming.ODM looks like SQLAlchemy

● UnitOfWork

○ Avoid half-saved changes in case of crashes

○ Flush all your changes at once

● IdentityMap

○ Same DB objects are the same object in memory

● Supports Relations

● Supports events (after_insert, before_update, …)
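
The IdentityMap bullet can be illustrated without Ming at all. What follows is a minimal standard-library sketch of the idea, not Ming's implementation; `Page`, `IdentityMap` and `load` are hypothetical names chosen for this example.

```python
class Page:
    """Hypothetical in-memory object standing in for a mapped document."""

    def __init__(self, _id, title):
        self._id = _id
        self.title = title


class IdentityMap:
    """Keeps exactly one Python object per (class, _id) pair."""

    def __init__(self):
        self._objects = {}

    def load(self, cls, raw):
        key = (cls, raw["_id"])
        # Build the object only the first time this document is seen.
        if key not in self._objects:
            self._objects[key] = cls(**raw)
        return self._objects[key]


imap = IdentityMap()
first = imap.load(Page, {"_id": 1, "title": "FirstPage"})
second = imap.load(Page, {"_id": 1, "title": "FirstPage"})
same_object = first is second
```

Loading the same `_id` twice hands back the same instance, so a change made through one reference is visible through the other.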

Page 6: PyConUK2013 - Validated documents on MongoDB with Ming

Declaring Schema with the ODM

    from ming import schema
    from ming.odm import (MappedClass, FieldProperty,
                          ForeignIdProperty, RelationProperty)

    # DBSession is assumed to be an ODM session configured elsewhere.

    class WikiPage(MappedClass):
        # Metadata for the collection,
        # like its name, indexes, session, ...
        class __mongometa__:
            session = DBSession
            name = 'wiki_page'
            unique_indexes = [('title',)]

        _id = FieldProperty(schema.ObjectId)
        title = FieldProperty(schema.String)
        text = FieldProperty(schema.String)

        # Ming automatically generates
        # the relationship query
        comments = RelationProperty('WikiComment')

    class WikiComment(MappedClass):
        class __mongometa__:
            session = DBSession
            name = 'wiki_comment'

        _id = FieldProperty(schema.ObjectId)
        text = FieldProperty(schema.String, if_missing='')

        # Provides an actual relation point
        # between comments and pages
        page_id = ForeignIdProperty('WikiPage')

● Declarative interface for models

● Supports polymorphic models

Page 7: PyConUK2013 - Validated documents on MongoDB with Ming

Querying the ODM

    from pymongo import DESCENDING

    wp = WikiPage.query.get(title='FirstPage')

    # Identity map prevents duplicates
    wp2 = WikiPage.query.get(title='FirstPage')
    assert wp is wp2

    # manually fetching related comments
    comments = WikiComment.query.find(dict(page_id=wp._id)).all()
    # or
    comments = wp.comments

    # gets last 5 wikipages in natural order
    wps = WikiPage.query.find().sort('$natural', DESCENDING).limit(5).all()

● Query language tries to be natural for both SQLAlchemy and MongoDB users

Page 8: PyConUK2013 - Validated documents on MongoDB with Ming

The Unit Of Work

● Flush or Clear the pending changes

● Avoid mixing UOW and atomic operations

● UnitOfWork as a cache

    wp = WikiPage(title='FirstPage', text='This is my first page')
    DBSession.flush()

    wp.title = "TITLE 2"
    DBSession.update(WikiPage, {'_id': wp._id}, {'$set': {'title': "TITLE 3"}})
    DBSession.flush()  # wp.title will be TITLE 2, not TITLE 3

    wp2 = DBSession.get(WikiPage, wp._id)
    # wp2 lookup won't query the database again
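
To see why mixing the Unit of Work with atomic operations is discouraged, here is a minimal standard-library sketch in which a plain dict stands in for the collection. `UnitOfWork`, `save`, `flush` and `clear` are hypothetical names for this illustration and do not mirror Ming's internals.

```python
class UnitOfWork:
    """Collects pending changes and writes them to the store in one step."""

    def __init__(self, store):
        self.store = store      # dict standing in for a MongoDB collection
        self.pending = {}       # _id -> document waiting to be written

    def save(self, doc):
        # Track a copy so later flushes write the state seen at save time.
        self.pending[doc["_id"]] = dict(doc)

    def flush(self):
        # Write every tracked document, then forget the pending set.
        self.store.update(self.pending)
        self.pending.clear()

    def clear(self):
        # Drop tracked changes without writing anything.
        self.pending.clear()


store = {}
uow = UnitOfWork(store)
uow.save({"_id": 1, "title": "FirstPage"})
written_before_flush = 1 in store
uow.flush()

uow.save({"_id": 1, "title": "TITLE 2"})   # tracked change, not yet written
store[1]["title"] = "TITLE 3"              # direct write bypassing the unit of work
uow.flush()                                # tracked state overwrites the direct write
final_title = store[1]["title"]
```

The direct write is lost because the flush replaces the stored document with the tracked one, which is the same effect the slide describes for `TITLE 2` and `TITLE 3`.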

Page 9: PyConUK2013 - Validated documents on MongoDB with Ming

How Validation works

● Ming documents are validated at certain points in their life cycle

○ When saving the document to the database

○ When loading it from the database

○ Additionally, validation is performed when the document is created through the ODM layer or using the .make() method

■ Happens before they get saved for real
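
The validation points listed above can be sketched with a toy validator. This is a simplified standard-library illustration, not Ming's schema engine; `Invalid`, `validate` and the `WIKI_COMMENT` schema layout (field name mapped to a type and a default for missing values) are assumptions made for the example.

```python
class Invalid(Exception):
    """Raised when a document does not match the schema."""


def validate(schema, doc):
    """Return a validated copy of doc, or raise Invalid."""
    unknown = set(doc) - set(schema)
    if unknown:
        raise Invalid("unexpected fields: %s" % sorted(unknown))
    result = {}
    for name, (expected_type, if_missing) in schema.items():
        # Missing fields fall back to the declared default.
        value = doc.get(name, if_missing)
        if value is not None and not isinstance(value, expected_type):
            raise Invalid("%s must be %s" % (name, expected_type.__name__))
        result[name] = value
    return result


# field name -> (expected type, value used when the field is missing)
WIKI_COMMENT = {"text": (str, ""), "page_id": (int, None)}

made = validate(WIKI_COMMENT, {"page_id": 7})                   # at creation time
loaded = validate(WIKI_COMMENT, {"text": "hi", "page_id": 7})   # at load time

try:
    validate(WIKI_COMMENT, {"text": 42})
    rejected = False
except Invalid:
    rejected = True
```

Running the same function at creation, load and save time is what lets a bad value be caught before it reaches the database.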

Page 10: PyConUK2013 - Validated documents on MongoDB with Ming

Cost of Validation

● MongoDB is famous for its speed, but validation has a cost

○ MongoDB documents can contain many subdocuments

○ Each subdocument must be validated by Ming

○ Can even contain lists of multiple subdocuments

Page 11: PyConUK2013 - Validated documents on MongoDB with Ming

Cost of Validation benchmark

    # With validation (s is assumed to be an alias for ming.schema)
    class User(MappedClass):
        # ...
        friends = FieldProperty([dict(fbuser=s.String,
                                      photo=s.String,
                                      name=s.String)], if_missing=[])

    >>> timeit.timeit('User.query.find().all()', number=20000)
    31.97218942642212

    # Without validation
    class User(MappedClass):
        # ...
        friends = FieldProperty(s.Anything, if_missing=[])

    >>> timeit.timeit('User.query.find().all()', number=20000)
    23.391359090805054

    # Avoiding the field at query time
    >>> timeit.timeit('User.query.find({}, fields=("_id","name")).all()', number=20000)
    21.58667516708374
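
The shape of this benchmark can be reproduced without a database. The sketch below is a standard-library stand-in that compares a per-subdocument check with a pass-through, in the spirit of the typed `friends` field versus `s.Anything`. `validate_friends` and `accept_anything` are hypothetical helpers, and the timings it prints depend on the machine and say nothing about Ming's actual numbers.

```python
import timeit

FRIEND_FIELDS = ("fbuser", "photo", "name")


def validate_friends(friends):
    """Check every subdocument in the list, field by field."""
    validated = []
    for friend in friends:
        item = {}
        for field in FRIEND_FIELDS:
            value = friend.get(field)
            if value is not None and not isinstance(value, str):
                raise ValueError("%s must be a string" % field)
            item[field] = value
        validated.append(item)
    return validated


def accept_anything(friends):
    """Pass the value through untouched, like an unvalidated field."""
    return friends


friends = [{"fbuser": str(i), "photo": "p%d.png" % i, "name": "user%d" % i}
           for i in range(200)]

with_validation = timeit.timeit(lambda: validate_friends(friends), number=200)
without_validation = timeit.timeit(lambda: accept_anything(friends), number=200)
print("validated: %.4fs, unvalidated: %.4fs" % (with_validation, without_validation))
```

The work grows with the number of subdocuments times the number of fields, which is why long lists of subdocuments are where validation cost shows up first.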

Page 12: PyConUK2013 - Validated documents on MongoDB with Ming

Only query what you need

● The previous benchmark explains why it is good to query only for the fields you need to process the current request

● All the fields you don’t query for will still be available in the object, with a None value
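
The behaviour described here, where fields left out of the query come back as None, can be sketched with plain dicts. `project`, `to_object` and `FIELDS` are hypothetical names for this illustration; Ming performs the equivalent step inside the ODM.

```python
FIELDS = ("_id", "name", "friends")


def project(raw, fields):
    """Keep only the requested fields of a raw document."""
    return {key: raw[key] for key in fields if key in raw}


def to_object(raw):
    """Build the full field set, filling fields that were not fetched with None."""
    return {field: raw.get(field) for field in FIELDS}


stored = {"_id": 1, "name": "amol", "friends": [{"name": "x"}] * 3}
user = to_object(project(stored, ("_id", "name")))
```

One consequence worth keeping in mind: a field that is None because it was not fetched looks the same as a field that is really empty, so code that writes the object back should avoid treating that None as data.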

Page 13: PyConUK2013 - Validated documents on MongoDB with Ming

Evolving the Schema

● Migrations are performed lazily as the objects are loaded from the database

● Simple schema evolutions:

○ New field: It will just be None for old entities.

○ Removed: Declare it as ming.schema.Deprecated

○ Changed Type: Declare it as ming.schema.Migrate

● Complex schema evolutions:

○ Add a migration function in __mongometa__

Page 14: PyConUK2013 - Validated documents on MongoDB with Ming

Complex migrations with Ming

    from ming import Document, Field, schema

    class OldWikiPage(Document):
        _id = Field(schema.ObjectId)
        title = Field(str)
        text = Field(str, if_missing='')
        metadata = Field(dict(tags=[str], categories=[str]))

    class WikiPage(Document):
        class __mongometa__:
            session = DBSession
            name = 'wiki_page'
            version_of = OldWikiPage

            # Turns a document in the old shape into the new one
            def migrate(data):
                result = dict(data, version=1,
                              tags=data['metadata']['tags'],
                              categories=data['metadata']['categories'])
                del result['metadata']
                return result

        version = Field(1, required=True)
        # … more fields ...
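
The `migrate` function from this slide is plain Python, so its effect can be checked on a sample dict without a database. In the sketch below, `load` is a hypothetical helper that decides whether to migrate by looking at the `version` key; that check is an assumption made for the example, and Ming's own dispatch between the old and new schema may work differently.

```python
def migrate(data):
    """Same transformation as the slide: flatten 'metadata' and stamp version 1."""
    result = dict(data, version=1,
                  tags=data['metadata']['tags'],
                  categories=data['metadata']['categories'])
    del result['metadata']
    return result


def load(raw):
    """Migrate lazily: only documents still in the old shape are transformed."""
    if raw.get('version') == 1:
        return raw
    return migrate(raw)


old_doc = {'_id': 1, 'title': 'FirstPage', 'text': '',
           'metadata': {'tags': ['wiki'], 'categories': ['docs']}}
new_doc = load(old_doc)   # old shape, gets migrated
again = load(new_doc)     # already migrated, returned untouched
```

Because `migrate` builds a new dict, the stored old document is not modified until the migrated version is saved back.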

Page 15: PyConUK2013 - Validated documents on MongoDB with Ming

Testing MongoDB

● Ming makes testing easy

○ Your models can be directly imported from tests

○ Just bind the session to a DataStore created in your test suite

● Ming provides MongoInMemory

○ much like sqlite://:memory:

● Implements 90% of MongoDB, including JavaScript execution with SpiderMonkey

Page 16: PyConUK2013 - Validated documents on MongoDB with Ming

Ming for Web Applications

● Ming can be integrated in any WSGI framework through ming.odm.middleware.MingMiddleware

○ Automatically disposes open sessions at the end of requests

○ Automatically provides session flushing

○ Automatically clears the session in case of exceptions
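
The three behaviours listed for the middleware (flush, clear on exceptions, dispose at the end of the request) follow a common WSGI pattern. The sketch below shows that pattern with the standard library only. It is not the source of `MingMiddleware`; `FakeSession` and `SessionMiddleware` are hypothetical names used to make the control flow visible.

```python
class FakeSession:
    """Hypothetical session that records which calls it received."""

    def __init__(self):
        self.calls = []

    def flush(self):
        self.calls.append("flush")

    def clear(self):
        self.calls.append("clear")

    def close(self):
        self.calls.append("close")


class SessionMiddleware:
    """Flush on success, clear on error, always close at the end of the request."""

    def __init__(self, app, session):
        self.app = app
        self.session = session

    def __call__(self, environ, start_response):
        try:
            # Consume the response so pending changes exist before the flush.
            body = list(self.app(environ, start_response))
            self.session.flush()
            return body
        except Exception:
            self.session.clear()
            raise
        finally:
            self.session.close()


def ok_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"saved"]


def failing_app(environ, start_response):
    raise RuntimeError("boom")


good = FakeSession()
body = SessionMiddleware(ok_app, good)({}, lambda status, headers: None)

bad = FakeSession()
try:
    SessionMiddleware(failing_app, bad)({}, lambda status, headers: None)
except RuntimeError:
    pass
```

Keeping the flush inside the `try` means a failing flush also clears the session, so a half-applied request never leaves tracked changes behind for the next one.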

Page 17: PyConUK2013 - Validated documents on MongoDB with Ming

Ming with TurboGears

● Provides builtin support for Ming

○ $ gearbox quickstart --ming projectname

● Ready made test suite with fixtures on MIM

● Facilities to debug and benchmark Ming queries through the DebugBar

● TurboGears Admin automatically generates CRUD from Ming models

Page 18: PyConUK2013 - Validated documents on MongoDB with Ming

Debugging MongoDB

● TurboGears DebugBar has builtin support for MongoDB

○ Executed queries logging and results

○ Queries timing

○ Syntax prettifier and highlight for Map-Reduce and $where JavaScript code

○ Queries tracking on logs for performance reporting of webservices

Page 19: PyConUK2013 - Validated documents on MongoDB with Ming

DebugBar in action

Page 20: PyConUK2013 - Validated documents on MongoDB with Ming

Ming without learning MongoDB

● The transition from SQL/relational solutions to MongoDB can be scary the first time.

● You can use Sprox to lower the learning cost for simple applications

○ Sprox is the library that empowers TurboGears Admin to automatically generate pages from SQLA or Ming

Page 21: PyConUK2013 - Validated documents on MongoDB with Ming

Sprox ORM abstractions

● ORMProvider provides an abstraction over the ORM

● ORMProviderSelector automatically detects the provider to use from a model.

● Mix those together and you have a database-independent layer with automatic storage backend detection.

Page 22: PyConUK2013 - Validated documents on MongoDB with Ming

Hands on Sprox

● Provider.query(self, entity, **kwargs) → get all objects of a collection

● Provider.get_obj(self, entity, params) → get an object

● Provider.update(self, entity, params) → update an object

● Provider.create(self, entity, params) → create a new object

    # Sprox (Ming or SQLAlchemy)
    count, transactions = provider.query(MoneyTransfer)

    transactions = DBSession.query(MoneyTransfer).all()  # SQLAlchemy
    transactions = MoneyTransfer.query.find().all()      # Ming
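
To make the provider idea concrete, here is a toy in-memory provider that follows the four method names and argument lists shown on this slide. It is a standard-library sketch and not Sprox code: `InMemoryProvider` is hypothetical, `MoneyTransfer` is only a placeholder class, and looking objects up by an `_id` key inside `params` is an assumption made for the example.

```python
class InMemoryProvider:
    """Hypothetical provider following the method names listed on the slide."""

    def __init__(self):
        self._rows = {}      # entity -> {_id: dict}
        self._next_id = 1

    def create(self, entity, params):
        obj = dict(params, _id=self._next_id)
        self._rows.setdefault(entity, {})[obj["_id"]] = obj
        self._next_id += 1
        return obj

    def get_obj(self, entity, params):
        return self._rows[entity][params["_id"]]

    def update(self, entity, params):
        obj = self.get_obj(entity, params)
        obj.update(params)
        return obj

    def query(self, entity, **kwargs):
        # Returns (count, objects), matching the unpacking used on the slide.
        objs = list(self._rows.get(entity, {}).values())
        return len(objs), objs


class MoneyTransfer:
    """Placeholder entity; only used as a lookup key here."""


provider = InMemoryProvider()
created = provider.create(MoneyTransfer, {"amount": 10})
provider.update(MoneyTransfer, {"_id": created["_id"], "amount": 25})
count, transactions = provider.query(MoneyTransfer)
```

Calling code only ever touches `provider`, so swapping the storage backend means swapping the provider object and nothing else.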

Page 23: PyConUK2013 - Validated documents on MongoDB with Ming

Questions?

