Date post: | 27-Jan-2015 |
Category: | Technology |
Upload: | amol |
Who am I
● CTO @ Axant.it, mostly Python company
(with some iOS and Android)
● TurboGears development team member
● Contributions to Ming project ODM layer
● Really happy to be here at PyConUK!
○ I thought I would crash my car driving on
the wrong side!
MongoDB Models
● Schema free
○ It looks like you don’t have a schema, but your
code depends on properties that need to be there.
● SubDocuments
○ You know that a blog post contains a list of
comments, but what is a comment?
● Relations
○ You don’t have joins and foreign keys, but you still
need to express relationships
What’s Ming?
● MongoDB toolkit
○ Validation layer on pymongo
○ Manages schema migrations
○ In Memory MongoDB
○ ODM on top of all of those
● Born at sourceforge.net
● Supported by TurboGears
community
MongoDB → PyMongo → Ming → Ming.ODM (each layer builds on the one before it)
Getting Started with the ODM
● Ming.ODM looks like SQLAlchemy
● UnitOfWork
○ Avoid half-saved changes in case of crashes
○ Flush all your changes at once
● IdentityMap
○ Same DB objects are the same object in memory
● Supports Relations
● Supports events (after_insert, before_update, …)
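The IdentityMap and UnitOfWork ideas above can be sketched in plain Python. This is a conceptual toy, not Ming's implementation: the `Session` class, its `mark_dirty` method, and the dict standing in for a database are all hypothetical names chosen for illustration.

```python
class Session:
    """Toy session: an identity map plus a unit of work over a dict 'database'."""

    def __init__(self, db):
        self.db = db              # maps _id -> stored document (a dict)
        self.identity_map = {}    # maps _id -> the single in-memory object
        self.dirty = set()        # ids of objects with unflushed changes

    def get(self, _id):
        # Identity map: return the already-loaded object if there is one.
        if _id not in self.identity_map:
            self.identity_map[_id] = dict(self.db[_id])
        return self.identity_map[_id]

    def mark_dirty(self, _id):
        # Remember that this object has changes waiting to be written.
        self.dirty.add(_id)

    def flush(self):
        # Unit of work: write every pending change in one step.
        for _id in self.dirty:
            self.db[_id] = dict(self.identity_map[_id])
        self.dirty.clear()

    def clear(self):
        # Drop pending changes and cached objects without writing anything.
        self.dirty.clear()
        self.identity_map.clear()


db = {1: {"title": "FirstPage"}}
session = Session(db)

page = session.get(1)
same_page = session.get(1)          # served from the identity map
page["title"] = "Renamed"
session.mark_dirty(1)
title_before_flush = db[1]["title"]  # the "database" is untouched so far
session.flush()
title_after_flush = db[1]["title"]   # the change is written on flush
```

Nothing reaches the stored data until `flush()` is called, which is what makes it possible to discard half-finished changes with `clear()` instead.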
Declaring Schema with the ODM
from ming import schema
from ming.odm import MappedClass, FieldProperty, ForeignIdProperty, RelationProperty

class WikiPage(MappedClass):
    # Metadata for the collection
    # like its name, indexes, session, ...
    class __mongometa__:
        session = DBSession
        name = 'wiki_page'
        unique_indexes = [('title',)]

    _id = FieldProperty(schema.ObjectId)
    title = FieldProperty(schema.String)
    text = FieldProperty(schema.String)

    # Ming automatically generates
    # the relationship query
    comments = RelationProperty('WikiComment')

class WikiComment(MappedClass):
    class __mongometa__:
        session = DBSession
        name = 'wiki_comment'

    _id = FieldProperty(schema.ObjectId)
    text = FieldProperty(schema.String, if_missing='')

    # Provides an actual relation point
    # between comments and pages
    page_id = ForeignIdProperty('WikiPage')
● Declarative interface for models
● Supports polymorphic models
Querying the ODM
from pymongo import DESCENDING

wp = WikiPage.query.get(title='FirstPage')

# Identity map prevents duplicates
wp2 = WikiPage.query.get(title='FirstPage')
assert wp is wp2

# manually fetching related comments
comments = WikiComment.query.find(dict(page_id=wp._id)).all()
# or
comments = wp.comments

# gets last 5 wikipages in natural order
wps = WikiPage.query.find().sort('$natural', DESCENDING).limit(5).all()
● Query language tries to be natural for both
SQLAlchemy and MongoDB users
The Unit Of Work
● Flush or Clear the pending changes
● Avoid mixing UOW and atomic operations
● UnitOfWork as a cache
wp = WikiPage(title='FirstPage', text='This is my first page')
DBSession.flush()

wp.title = "TITLE 2"
DBSession.update(WikiPage, {'_id': wp._id}, {'$set': {'title': "TITLE 3"}})
DBSession.flush()  # wp.title will be TITLE 2, not TITLE 3

wp2 = DBSession.get(WikiPage, wp._id)
# wp2 lookup won’t query the database again
How Validation works
● Ming documents are validated at certain
points in their life cycle
○ When saving the document to the database
○ When loading it from the database.
○ Additionally, validation is performed when the
document is created through the ODM layer or
using the .make() method
■ Happens before they get saved for real
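As a rough illustration of what a validation pass does when a document is created or loaded, here is a minimal plain-Python sketch. It is not Ming's validator: the `validate` function and the `(type, default)` schema format are assumptions made for this example, loosely mirroring the `if_missing` option shown in the model declarations.

```python
MISSING = object()  # sentinel so that None can still be a legitimate value


def validate(schema, document):
    """Return a validated copy of `document`; raise ValueError on a type mismatch.

    `schema` maps field name -> (expected type, default used when missing).
    Fields that are not declared in the schema are dropped from the result.
    """
    result = {}
    for name, (expected_type, if_missing) in schema.items():
        value = document.get(name, MISSING)
        if value is MISSING:
            value = if_missing
        elif not isinstance(value, expected_type):
            raise ValueError(
                f"{name!r} should be {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        result[name] = value
    return result


comment_schema = {"text": (str, ""), "votes": (int, 0)}

# A document missing a field gets the declared default.
loaded = validate(comment_schema, {"text": "Nice page"})

# A document with a wrongly typed field is rejected.
try:
    validate(comment_schema, {"text": 42})
    rejected = False
except ValueError:
    rejected = True
```

Running the same check both on creation and on load is what lets code rely on properties being present even though the database itself enforces no schema.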
Cost of Validation
● MongoDB is famous for its speed, but
validation has a cost
○ MongoDB documents can contain many
subdocuments
○ Each subdocument must be validated by ming
○ Can even contain lists of multiple subdocuments
Cost of Validation benchmark
# With Validation
class User(MappedClass):
    # ...
    friends = FieldProperty([dict(fbuser=s.String,
                                  photo=s.String,
                                  name=s.String)], if_missing=[])

>>> timeit.timeit('User.query.find().all()', number=20000)
31.97218942642212

# Without Validation
class User(MappedClass):
    # ...
    friends = FieldProperty(s.Anything, if_missing=[])

>>> timeit.timeit('User.query.find().all()', number=20000)
23.391359090805054

# Avoiding the field at query time
>>> timeit.timeit('User.query.find({}, fields=("_id","name")).all()', number=20000)
21.58667516708374
Only query what you need
● Previous benchmark explains why it is
good to query only for fields you need to
process the current request
● Fields you don’t query for are still available
on the object, with a value of None
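The behaviour described above can be pictured with a small plain-Python sketch. It only imitates the idea; `project`, `load_user`, and `FIELDS` are hypothetical helpers, not Ming or PyMongo calls.

```python
FIELDS = ("_id", "name", "friends")  # fields declared on the hypothetical model


def project(document, fields):
    # Imitates a query that asks the database for a subset of fields only.
    return {key: value for key, value in document.items() if key in fields}


def load_user(raw):
    # Every declared field exists on the loaded object; unfetched ones are None.
    return {field: raw.get(field) for field in FIELDS}


stored = {"_id": 1, "name": "amol", "friends": [{"name": "x"}]}
partial = load_user(project(stored, ("_id", "name")))
```

Here `partial["friends"]` is `None` because the large `friends` list was never fetched, so it was also never validated.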
Evolving the Schema
● Migrations are performed lazily as the
objects are loaded from the database
● Simple schema evolutions:
○ New field: It will just be None for old entities.
○ Removed: Declare it as ming.schema.Deprecated
○ Changed Type: Declare it as ming.schema.Migrate
● Complex schema evolutions:
○ Add a migration function in __mongometa__
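The three simple evolutions listed above can be imitated in plain Python to show what "lazy" means here: the stored document is only upgraded when it is loaded. This is a conceptual sketch, not how Ming implements `Deprecated` or `Migrate`; `upgrade_on_load` and the field names `summary`, `legacy_flag`, and `tags` are invented for the example.

```python
def upgrade_on_load(raw):
    """Return an upgraded copy of a stored document, leaving `raw` untouched."""
    doc = dict(raw)

    # New field: old entities simply get None.
    doc.setdefault("summary", None)

    # Removed field: ignore whatever old entities still carry.
    doc.pop("legacy_flag", None)

    # Changed type: old entities stored tags as a comma-separated string.
    if isinstance(doc.get("tags"), str):
        doc["tags"] = doc["tags"].split(",")

    return doc


stored = {"title": "FirstPage", "legacy_flag": True, "tags": "wiki,docs"}
upgraded = upgrade_on_load(stored)
```

Because the upgrade happens per document at load time, no bulk migration script has to run before the new code is deployed.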
Complex migrations with Ming
class OldWikiPage(Document):
    _id = Field(schema.ObjectId)
    title = Field(str)
    text = Field(str, if_missing='')
    metadata = Field(dict(tags=[str], categories=[str]))

class WikiPage(Document):
    class __mongometa__:
        session = DBSession
        name = 'wiki_page'
        version_of = OldWikiPage

        def migrate(data):
            result = dict(data, version=1,
                          tags=data['metadata']['tags'],
                          categories=data['metadata']['categories'])
            del result['metadata']
            return result

    version = Field(1, required=True)
    # … more fields ...
Testing MongoDB
● Ming makes testing easy
○ Your models can be directly imported from tests
○ Just bind the session to a DataStore created in
your test suite
● Ming provides MongoInMemory
○ much like sqlite://:memory:
● Implements 90% of mongodb, including
javascript execution with spidermonkey
Ming for Web Applications
● Ming can be integrated in any WSGI
framework through
ming.odm.middleware.MingMiddleware
○ Automatically disposes open sessions at the end
of requests
○ Automatically provides session flushing
○ Automatically clears the session in case of
exceptions
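The request life cycle described above (flush on success, clear on exceptions, dispose at the end) can be sketched as a generic WSGI middleware. This is not the source of `MingMiddleware`; `FlushingMiddleware` and `RecordingSession` are hypothetical classes, and the session is assumed to expose `flush()`, `clear()`, and `close()`.

```python
class FlushingMiddleware:
    """Wrap a WSGI app so session changes are flushed or discarded per request."""

    def __init__(self, app, session):
        self.app = app
        self.session = session

    def __call__(self, environ, start_response):
        try:
            body = list(self.app(environ, start_response))
            self.session.flush()      # request succeeded: write pending changes
            return body
        except Exception:
            self.session.clear()      # request failed: drop pending changes
            raise
        finally:
            self.session.close()      # always dispose of the session


class RecordingSession:
    """Stand-in session that only records which methods were called."""

    def __init__(self):
        self.calls = []

    def flush(self):
        self.calls.append("flush")

    def clear(self):
        self.calls.append("clear")

    def close(self):
        self.calls.append("close")


def ok_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"saved"]


def failing_app(environ, start_response):
    raise RuntimeError("boom")


def ignore_start_response(status, headers):
    return None


ok_session = RecordingSession()
ok_body = FlushingMiddleware(ok_app, ok_session)({}, ignore_start_response)

failed_session = RecordingSession()
try:
    FlushingMiddleware(failing_app, failed_session)({}, ignore_start_response)
except RuntimeError:
    pass
```

With this shape, application code never calls `flush()` itself, which keeps half-saved changes out of the database when a request raises.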
Ming with TurboGears
● Provides builtin support for ming
○ $ gearbox quickstart --ming projectname
● Ready made test suite with fixtures on MIM
● Facilities to debug and benchmark Ming
queries through the DebugBar
● TurboGears Admin automatically
generates CRUD from Ming models
Debugging MongoDB
● TurboGears debugbar has builtin support
for MongoDB
○ Executed queries logging and results
○ Queries timing
○ Syntax prettifier and highlight for Map-Reduce and
$where javascript code
○ Queries tracking on logs for performance
reporting of webservices
DebugBar in action
Ming without learning MongoDB
● The transition from SQL/relational solutions
to MongoDB can be scary the first time.
● You can use Sprox to lower the learning
cost for simple applications
○ Sprox is the library that powers the TurboGears
Admin, automatically generating pages from
SQLA or Ming models
Sprox ORM abstractions
● ORMProvider, provides an abstraction over
the ORM
● ORMProviderSelector, automatically
detects the provider to use from a model.
● Mix those together and you have a db
independent layer with automatic storage
backend detection.
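The provider/selector split can be illustrated with a toy version in plain Python. None of these classes are Sprox's own: `MongoLikeProvider`, `SQLLikeProvider`, `ProviderSelector`, and the `collection`/`table` attributes used for detection are assumptions made for this sketch.

```python
class MongoLikeProvider:
    def query(self, entity):
        docs = entity.collection          # hypothetical document-style storage
        return len(docs), list(docs)


class SQLLikeProvider:
    def query(self, entity):
        rows = entity.table               # hypothetical table-style storage
        return len(rows), list(rows)


class ProviderSelector:
    """Pick a provider by inspecting the model, so callers stay backend-agnostic."""

    def get_provider(self, entity):
        if hasattr(entity, "collection"):
            return MongoLikeProvider()
        if hasattr(entity, "table"):
            return SQLLikeProvider()
        raise TypeError(f"no provider available for {entity.__name__}")


class MoneyTransfer:
    collection = [{"amount": 10}, {"amount": 25}]


class Invoice:
    table = [("INV-1", 100)]


selector = ProviderSelector()

# The calling code is identical whichever backend the model uses.
count, transactions = selector.get_provider(MoneyTransfer).query(MoneyTransfer)
invoice_count, invoices = selector.get_provider(Invoice).query(Invoice)
```

The caller only ever sees the `(count, results)` pair, which is the property that lets generic CRUD code run unchanged on either storage backend.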
Hands on Sprox
● Provider.query(self, entity, **kwargs) → get all objects of a collection
● Provider.get_obj(self, entity, params) → get an object
● Provider.update(self, entity, params) → update an object
● Provider.create(self, entity, params) → create a new object
# Sprox (Ming or SQLAlchemy)
count, transactions = provider.query(MoneyTransfer)

transactions = DBSession.query(MoneyTransfer).all()  # SQLAlchemy
transactions = MoneyTransfer.query.find().all()  # Ming
Questions?