moma-django overview --> Django + MongoDB: building a custom ORM layer

Date post: 01-Nov-2014
Upload: gadi-oren
Video of the talk: http://www.youtube.com/watch?v=cxQKTDLjb-w
Moma-Django Overview Django Boston meetup, 02-27-2014
Transcript
Page 1: moma-django overview --> Django + MongoDB: building a custom ORM layer

Moma-Django Overview

Django Boston meetup, 02-27-2014

Page 2: moma-django overview --> Django + MongoDB: building a custom ORM layer

Django + MongoDB: building a custom ORM layer

Overview of the talk:

moma-django is a MongoDB manager for Django. It provides native Django ORM support for MongoDB documents, including the query API and the admin interface. It was developed as part of two commercial products and released as open source. In the talk we will review the motivation behind its development and its features, and go through 2-3 examples of how to use some of them: migrating an existing model, advanced queries and the admin interface. If time permits we will discuss unit testing and South migrations.

Page 3: moma-django overview --> Django + MongoDB: building a custom ORM layer

Who are we?

Company: Cloudoscope.com

What we do:

– Cloudoscope’s product enables IT vendors to automate the pre-sales process by collecting and analyzing prospect IT performance

– Previous product - Lucidel: B2C marketing analytics based on website data

– Data intensive projects / sites, NoSQL, analytics focus (as a way of funding)

Gadi Oren: @gadioren, gadioren

Page 4: moma-django overview --> Django + MongoDB: building a custom ORM layer

Why moma-django?

Certain problems can be addressed well with NoSQL

The team wants to experiment with a NoSQL database

HOWEVER:

A lot of code needs to be rewritten

The team has to learn a new API

Some of the tools and procedures no longer function and should be replaced

– Admin interface

– Unit testing environment

Some of the data needs to be somewhat de-normalized*

Page 5: moma-django overview --> Django + MongoDB: building a custom ORM layer

Why moma-django? (our example)

Needed a very efficient way of processing timeseries

The timeseries were constantly growing

We required very detailed search/slice/dice capabilities to find the timeseries to be processed

Some of the data was optional (e.g. demographics information was never complete)

Document size, content and structure varied widely

However, we have a small distributed team and we did not want to create a massive project

We started experimenting using a stub Manager doing small iterations, adding functionality as we needed over nine months

Page 6: moma-django overview --> Django + MongoDB: building a custom ORM layer

Other packages

PyMongo – a dependency for moma-django

MongoEngine – somewhat similar concepts in terms of models

Non relational versions of Django

Page 7: moma-django overview --> Django + MongoDB: building a custom ORM layer

“Native” - advantages

Django packages and plugins (e.g. Admin functionality)

Using similar code conventions

Easier to bring in new team members

Use the same unit testing frameworks (e.g. Jenkins)

Simple experimentation and migration path

Page 8: moma-django overview --> Django + MongoDB: building a custom ORM layer

Let’s make it interactive

Questions Anyone??? (Example Application)

Small question asking application

Allows voting and adding images

Implemented as a Django application over MongoDB, using moma-django

Register and login at http://momadjango.org

Ask away!

Page 9: moma-django overview --> Django + MongoDB: building a custom ORM layer

Migrating an existing model

# Before: standard Django models
class TstBook(models.Model):
    name = models.CharField(max_length=64)
    publish_date = MongoDateTimeField()
    author = models.ForeignKey('testing.TstAuthor')

    class Meta:
        unique_together = ['name', 'author']

class TstAuthor(models.Model):
    first_name = models.CharField(max_length=32)
    last_name = models.CharField(max_length=32)

# After: the same models derived from MongoModel
class TstBook(MongoModel):
    name = models.CharField(max_length=64)
    publish_date = MongoDateTimeField()
    author = models.ForeignKey('testing.TstAuthor')

    class Meta:
        unique_together = ['name', 'author']

class TstAuthor(MongoModel):
    first_name = models.CharField(max_length=32)
    last_name = models.CharField(max_length=32)

# Run the MongoDB handler after syncdb
models.signals.post_syncdb.connect(post_syncdb_mongo_handler)

Page 10: moma-django overview --> Django + MongoDB: building a custom ORM layer

Migrating an existing model (2)

Syncdb:

Add objects

Page 11: moma-django overview --> Django + MongoDB: building a custom ORM layer

Migrating an existing model (2)

Syncdb:

Add objects

>>> TstBook(name="Good night half moon", publish_date=datetime.datetime(2014, 2, 20), author=TstAuthor.objects.get(first_name="Gadi")).save()

Page 12: moma-django overview --> Django + MongoDB: building a custom ORM layer

Migrating an existing model (3)

Breaching uniqueness: try to save the same object again:

Page 13: moma-django overview --> Django + MongoDB: building a custom ORM layer

Migrating an existing model (4)

In Mongo: content, indexes

Admin

class Meta:
    unique_together = ['name', 'author']
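The slide points at the MongoDB indexes that result from `unique_together`. As a rough illustration only (not moma-django's actual code), the sketch below shows how such a constraint could be turned into the key list and options that pymongo's `create_index` accepts. The function name `unique_together_to_index` and the `<name>_id` column convention for foreign keys are assumptions.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def unique_together_to_index(fields, foreign_keys=()):
    """Build a (keys, options) pair in the shape pymongo's create_index takes."""
    keys = []
    for name in fields:
        # Assumption: foreign keys are stored under Django's "<name>_id" column.
        column = name + "_id" if name in foreign_keys else name
        keys.append((column, ASCENDING))
    return keys, {"unique": True}


keys, options = unique_together_to_index(["name", "author"], foreign_keys={"author"})
print(keys, options)
```

With a real connection, the pair would be passed on as `collection.create_index(keys, **options)`.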

Page 14: moma-django overview --> Django + MongoDB: building a custom ORM layer

New field types

MongoIDField – Internal. Used to hold the MongoDB object ID

MongoDateTimeField – Used for Datetime

ValuesField – Used to represent a list of objects of any type

StringListField – Used for a list of strings

DictionaryField – Used as a dictionary

Current limitation: nested structures have limited support

Page 15: moma-django overview --> Django + MongoDB: building a custom ORM layer

Queries and update – 1: bulk insert

# The record below is written in mongo shell notation (ObjectId, NumberLong, ISODate)
records.append({
    "_id" : ObjectId("502abdabf7f16836f100285a"),
    "time_on_site" : 290,
    "user_id" : 1154449631,
    "account_id" : NumberLong(5),
    "campaign" : "(not set)",
    "first_visit_date" : ISODate("2012-07-30T17:10:06Z"),
    "referral_path" : "(not set)",
    "source" : "google",
    "exit_page_path" : "/some-analysis/lion-king/",
    "landing_page_path" : "(not set)",
    "keyword" : "wikipedia lion king",
    "date" : ISODate("2012-07-30T00:00:00Z"),
    "visit_count" : 1,
    "page_views" : 3,
    "visit_id" : "false---------------1154449631.1343668206",
    "goal_values" : { },
    "goal_starts" : { },
    "demographics" : { },
    "goal_completions" : { },
    "location" : { "cr" : "United States", "rg" : "California", "ct" : "Pasadena" },
})

# Remove the existing visits of these accounts, then insert the new batch in one call
UniqueVisit.objects.filter(account__in=self.list_of_accounts).delete()

UniqueVisit.objects.bulk_insert(records)

Page 16: moma-django overview --> Django + MongoDB: building a custom ORM layer

Queries and update – 2: examples

from datetime import datetime
from django.utils import timezone  # assumed source of timezone.utc

def ISODate(timestr):
    # Parse a mongo-style ISO timestamp into a timezone-aware datetime (UTC)
    res = datetime.strptime(timestr, "%Y-%m-%dT%H:%M:%SZ")
    res = res.replace(tzinfo=timezone.utc)
    return res

# Datetime
qs = UniqueVisit.objects.filter(
    first_visit_date__lte=ISODate("2012-07-30T12:29:05Z"))
self.assertEqual(qs.query.spec, dict(  # pymongo expression
    {'first_visit_date': {'$lte': datetime(2012, 7, 30, 12, 29, 5, tzinfo=timezone.utc)}}))

# Multiple conditions
qs = UniqueVisit.objects.filter(
    first_visit_date__lte=ISODate("2012-07-30T12:29:05Z"),
    time_on_site__gt=10,
    page_views__gt=2)
self.assertEqual(qs.query.spec, dict(  # pymongo expression
    {'time_on_site': {'$gt': 10.0},
     'page_views': {'$gt': 2},
     'first_visit_date': {'$lte': datetime(2012, 7, 30, 12, 29, 5, tzinfo=timezone.utc)}}))
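To make the mapping in these assertions concrete, here is a minimal stand-alone sketch of how Django-style keyword lookups could be translated into a pymongo query spec. It is not moma-django's implementation: the `lookups_to_spec` name and the operator table are assumptions, and it uses the standard library's `datetime.timezone` instead of Django's.

```python
from datetime import datetime, timezone

# Django-style lookup suffixes and the MongoDB operators they would map to.
OPERATORS = {"gt": "$gt", "gte": "$gte", "lt": "$lt", "lte": "$lte", "ne": "$ne"}


def lookups_to_spec(**lookups):
    """Translate keyword lookups such as page_views__gt=2 into a query spec."""
    spec = {}
    for key, value in lookups.items():
        field, _, suffix = key.rpartition("__")
        if field and suffix in OPERATORS:
            spec.setdefault(field, {})[OPERATORS[suffix]] = value
        else:
            # No recognised suffix: plain equality on the full key.
            spec[key] = value
    return spec


spec = lookups_to_spec(
    first_visit_date__lte=datetime(2012, 7, 30, 12, 29, 5, tzinfo=timezone.utc),
    time_on_site__gt=10,
    page_views__gt=2,
)
print(spec)
```

Unlike the slide's output, this sketch does not coerce values to the model field's type (the slide shows `time_on_site` compared against `10.0`).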

Page 17: moma-django overview --> Django + MongoDB: building a custom ORM layer

Queries and update– 3: examples

# Different query optimizations
qs = UniqueVisit.objects.filter(
    Q(time_on_site=10) | Q(time_on_site=25) | Q(time_on_site=275))
self.assertEqual(qs.query.spec, dict(  # pymongo expression
    {'time_on_site': {'$in': [10.0, 25.0, 275.0]}}))

# Multiple or Q expressions
qs = UniqueVisit.objects.filter(
    Q(time_on_site=10) | Q(time_on_site=25) | Q(time_on_site=275) | Q(source='bing'))
self.assertEqual(qs.query.spec, dict(  # pymongo expression
    {'$or': [{'time_on_site': 10.0},
             {'time_on_site': 25.0},
             {'time_on_site': 275.0},
             {'source': 'bing'}]}))

# Negate Q
qs = UniqueVisit.objects.filter(
    ~Q(first_visit_date=ISODate("2012-07-30T12:29:05Z")))
self.assertEqual(qs.query.spec, dict(  # pymongo expression
    {'first_visit_date': {'$ne': datetime(2012, 7, 30, 12, 29, 5, tzinfo=timezone.utc)}}))
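The optimization shown above (OR-ed equalities on one field become `$in`, mixed fields fall back to `$or`) can be sketched in plain Python. This is an illustrative guess at the rule, not moma-django's code, and `or_to_spec` and `negate` are hypothetical helper names.

```python
def or_to_spec(conditions):
    """conditions: list of (field, value) equality pairs joined by OR."""
    fields = {field for field, _ in conditions}
    if len(fields) == 1:
        # Every alternative tests the same field, so a single $in is enough.
        field = fields.pop()
        return {field: {"$in": [value for _, value in conditions]}}
    # Different fields are involved: keep one clause per alternative.
    return {"$or": [{field: value} for field, value in conditions]}


def negate(field, value):
    """Negated equality, as in ~Q(field=value)."""
    return {field: {"$ne": value}}


same_field = or_to_spec([("time_on_site", 10.0), ("time_on_site", 25.0), ("time_on_site", 275.0)])
mixed = or_to_spec([("time_on_site", 10.0), ("source", "bing")])
print(same_field)
print(mixed)
```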

Page 18: moma-django overview --> Django + MongoDB: building a custom ORM layer

Queries – 4: extensions beyond standard Django

# Dot notation
qs = UniqueVisit.objects.filter(location__rg__exact="New York")
self.assertEqual(qs.query.spec, dict(  # pymongo expression
    {'location.rg': 'New York'}))

# Check key existence
qs = UniqueVisit.objects.filter(demographics__age__exists="true")
self.assertEqual(qs.query.spec, dict(  # pymongo expression
    {'demographics.age': {'$exists': 'true'}}))

# variable type
qs = UniqueVisit.objects.filter(landing_page_path__type=int)
self.assertEqual(qs.query.spec, dict(  # pymongo expression
    {'landing_page_path': {'$type': 16}}))
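The three extensions above share one idea: double underscores become dots in the MongoDB path, and a trailing `exists` or `type` selects an operator. A minimal sketch under that assumption follows. `extended_lookup_to_spec` and the `BSON_TYPES` table are hypothetical names; the numeric codes are the standard BSON type codes.

```python
# BSON type codes for a few Python types (1 = double, 2 = string, 16 = 32-bit integer).
BSON_TYPES = {float: 1, str: 2, int: 16}


def extended_lookup_to_spec(key, value):
    """Translate one lookup such as location__rg__exact into a query spec."""
    *path, last = key.split("__")
    if path and last == "exists":
        return {".".join(path): {"$exists": value}}
    if path and last == "type":
        return {".".join(path): {"$type": BSON_TYPES[value]}}
    if path and last == "exact":
        return {".".join(path): value}
    # No special suffix: the whole key is a (possibly nested) field path.
    return {".".join(path + [last]): value}


print(extended_lookup_to_spec("location__rg__exact", "New York"))
print(extended_lookup_to_spec("demographics__age__exists", "true"))
print(extended_lookup_to_spec("landing_page_path__type", int))
```

The slides pass the string "true" to `exists`; MongoDB documents `$exists` as taking a boolean, so `True` would be the more conventional value.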

Page 19: moma-django overview --> Django + MongoDB: building a custom ORM layer

Queries - by the structure of documents

>>> # How many documents in the DB?
>>> UniqueVisit.objects.all().count()
20
>>> # For how many documents in the DB do we have age information?
>>> UniqueVisit.objects.filter(demographics__age__exists="true").count()
7
>>> # For how many documents in the DB do we have gender information?
>>> UniqueVisit.objects.filter(demographics__gender__exists="true").count()
3
>>> # For how many documents in the DB do we have gender and age information?
>>> UniqueVisit.objects.filter(demographics__age__exists="true", demographics__gender__exists="true").count()
1
>>>
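For readers without a MongoDB instance at hand, the same "query by structure" idea can be tried on plain dictionaries. The sketch below uses made-up sample documents (not the data from the talk) and a hypothetical `has_path` helper that mimics what an existence check on a dotted path does.

```python
def has_path(document, dotted_path):
    """Return True if every key along the dotted path exists in the document."""
    current = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return False
        current = current[part]
    return True


# Illustrative documents only; demographics is incomplete on purpose.
visits = [
    {"demographics": {"age": 34}},
    {"demographics": {"gender": "f"}},
    {"demographics": {"age": 51, "gender": "m"}},
    {"demographics": {}},
]

with_age = sum(has_path(v, "demographics.age") for v in visits)
with_both = sum(
    has_path(v, "demographics.age") and has_path(v, "demographics.gender")
    for v in visits
)
print(with_age, with_both)
```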

Page 20: moma-django overview --> Django + MongoDB: building a custom ORM layer

Manipulating documents payload

# Store an image: get the image from the "POST" upload form (snippet)
docfile = request.FILES['docfile']
question_id = form.cleaned_data['question_id']
docfile_name = docfile.name
docfile_name_changed = _replace_dots(docfile.name)
question = Question.objects.get(id=question_id)

# Store meta-data
question.docs.update({docfile_name_changed: docfile.content_type})
question.image.update({
    docfile_name_changed + '_url': '/static/display/s_' + docfile_name,
    docfile_name_changed + '_name': docfile_name,
    docfile_name_changed + '_content_type': docfile.content_type})

# Store the actual image binary block (small scale implementation)
file_read = docfile.file.read()  # Note – this is a naïve implementation!
file_data = base64.b64encode(file_read)
question.image.update({docfile_name_changed + '_data': file_data})
question.save()

# Model
class Question(MongoModel):
    user = models.ForeignKey(User)
    date = MongoDateTimeField(db_index=True)
    question = models.CharField(max_length=256)

    docs = DictionaryField(models.CharField())
    image = DictionaryField(models.TextField())
    audio = DictionaryField()
    other = DictionaryField()

    vote_ids = ValuesField(models.IntegerField())

    def __unicode__(self):
        return u'%s[%s %s]' % (self.question, self.date, self.user, )

    class Meta:
        unique_together = ['user', 'question',]
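The snippet relies on a `_replace_dots` helper that the slide does not show. The sketch below is one guess at what it might do, on the assumption that dots are removed because MongoDB treats "." in a key as a path separator. `replace_dots`, `image_payload` and the underscore replacement are all hypothetical.

```python
import base64


def replace_dots(name, replacement="_"):
    # Assumption: dots are swapped out so the file name can be used as a dictionary key.
    return name.replace(".", replacement)


def image_payload(filename, content_type, raw_bytes):
    """Build the dictionary entries stored for one uploaded image."""
    key = replace_dots(filename)
    return {
        key + "_url": "/static/display/s_" + filename,
        key + "_name": filename,
        key + "_content_type": content_type,
        # Base64 keeps the binary block storable as text (naive, small files only).
        key + "_data": base64.b64encode(raw_bytes).decode("ascii"),
    }


payload = image_payload("cat.png", "image/png", b"\x89PNG")
print(sorted(payload))
```

Storing the bytes inline like this only suits small files, as the slide itself notes.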

Page 21: moma-django overview --> Django + MongoDB: building a custom ORM layer

Admin interface

Page 22: moma-django overview --> Django + MongoDB: building a custom ORM layer

So – what’s next?

Github: https://github.com/gadio/moma-django

If you want to contribute – please contact (forking is also an option)

Contact: gadi.oren.1 at gmail.com or gadi at Cloudoscope.com

Page 23: moma-django overview --> Django + MongoDB: building a custom ORM layer

Backup

Page 24: moma-django overview --> Django + MongoDB: building a custom ORM layer

South

Dealing with apps with mixed models

South to disregard the model

# Enabling South for the non conventional mongo model
from south.modelsinspector import add_introspection_rules

add_introspection_rules(
    [
        (
            (MongoIdField, MongoDateTimeField, DictionaryField),
            [],
            {
                "max_length": ["max_length", {"default": None}],
            },
        ),
    ],
    ["^moma_django.fields.*"],
)
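The last argument is a list of regular expressions naming the field classes the rules apply to. Assuming South applies them with `re.match` to each field's full dotted class path, the small check below shows which paths the pattern from the slide would cover. `is_covered` is a hypothetical helper, not part of South.

```python
import re

INTROSPECTION_PATTERNS = ["^moma_django.fields.*"]


def is_covered(field_path, patterns=INTROSPECTION_PATTERNS):
    """Return True if any pattern matches the start of the dotted class path."""
    return any(re.match(pattern, field_path) for pattern in patterns)


print(is_covered("moma_django.fields.MongoDateTimeField"))
print(is_covered("django.db.models.fields.CharField"))
```

Because the dots in the pattern are unescaped they match any character; a stricter variant would be `r"^moma_django\.fields\."`.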

Page 25: moma-django overview --> Django + MongoDB: building a custom ORM layer

Unit testing

The model name is defined in settings.py

In a unit testing run, a new mongo DB schema is created

– MONGO_COLLECTION prefixed with “test_” (e.g. test_momaexample)

MONGO_HOST = 'localhost'
MONGO_PORT = 27017
MONGO_COLLECTION = 'momaexample'
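The naming rule for test runs is simple enough to state as code. This is only a sketch of the convention described on the slide. `database_name_for_tests` is a hypothetical helper, and the guard against double-prefixing is an added assumption.

```python
MONGO_COLLECTION = "momaexample"  # value from the settings shown above


def database_name_for_tests(configured_name, prefix="test_"):
    """Return the name used during a unit testing run."""
    if configured_name.startswith(prefix):
        # Assumption: an already-prefixed name is left alone.
        return configured_name
    return prefix + configured_name


print(database_name_for_tests(MONGO_COLLECTION))
```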

Page 26: moma-django overview --> Django + MongoDB: building a custom ORM layer

Moma-django on google…

