MongoATL: How Sourceforge is Using MongoDB

Date post: 15-Jan-2015
Upload: rick-copeland
Transcript
Page 1: MongoATL: How Sourceforge is Using MongoDB

SourceForge | Slashdot | ThinkGeek | Ohloh | freshmeat Geeknet, page 1

How SourceForge is Using MongoDB

Rick Copeland, @rick446

[email protected]

Page 2: MongoATL: How Sourceforge is Using MongoDB

SF.net “BlackOps”: FossFor.us

User Editable!

Web 2.0! (ish)

Not Ugly!

Page 3: MongoATL: How Sourceforge is Using MongoDB

Moving to NoSQL

- FossFor.us used CouchDB (NoSQL)
  - “Just adding new fields was trivial, and was happening all the time” – Mark Ramm
- Scaling up to the level of SF.net needs research
  - CouchDB
  - MongoDB
  - Tokyo Cabinet/Tyrant
  - Cassandra
  - ... and others

Page 4: MongoATL: How Sourceforge is Using MongoDB

Rewriting “Consume”

- Most traffic on SF.net hits 3 types of pages:
  - Project Summary
  - File Browser
  - Download
- Pages are read-mostly, with infrequent updates from the “Develop” side of sf.net
- Original goal was 1 MongoDB document per project
  - Release data was later split out because some projects have lots of releases
- Periodic updates via RSS and AMQP from “Develop”
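
The move from one document per project to separately stored release data can be sketched with plain Python dicts. This is only an illustration: the slides do not show the real schema, so every field name here (`summary`, `releases`, `project_id`, and so on) and the helper `split_project` are hypothetical.

```python
# Hypothetical field names throughout; the slides do not show the real schema.

def split_project(doc):
    """Split an all-in-one project document into a slim project document
    plus one document per release, each linked back by the project's _id."""
    project = {key: value for key, value in doc.items() if key != "releases"}
    releases = [
        dict(release, project_id=doc["_id"])
        for release in doc.get("releases", [])
    ]
    return project, releases


all_in_one = {
    "_id": "myproject",
    "summary": "An example project",
    "releases": [
        {"version": "1.0", "files": ["myproject-1.0.tar.gz"]},
        {"version": "1.1", "files": ["myproject-1.1.tar.gz"]},
    ],
}

project, releases = split_project(all_in_one)
```

With this shape, a read-mostly summary page only needs the slim project document, and a project with many releases no longer inflates it.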

Page 5: MongoATL: How Sourceforge is Using MongoDB

Deployment Architecture

[Diagram; components shown:]

- Load Balancer / Proxy
- Master DB Server: MongoDB Master
- 4 web servers, each running Apache mod_wsgi / TG 2.0 with a MongoDB Slave
- Gobble Server
- Develop

Page 6: MongoATL: How Sourceforge is Using MongoDB

Deployment Architecture (revised)

[Diagram; components shown:]

- Load Balancer / Proxy
- Master DB Server: MongoDB Master
- 4 web servers, each running Apache mod_wsgi / TG 2.0
- Gobble Server
- Develop

Scalability is good

Single-node performance is good, too

Page 7: MongoATL: How Sourceforge is Using MongoDB

SF.net Downloads

- Allow non-sf.net projects to use the SourceForge mirror network
- Stats calculated in Hadoop and stored/served from MongoDB
- Same deployment architecture as Consume (4 web, 1 db)

Page 8: MongoATL: How Sourceforge is Using MongoDB

Allura (SF.net “beta” devtools)

Rewrite developer tools with new architecture

Wiki, Tracker, Discussions, Git, Hg, SVN, with more to come

Single MongoDB replica set manually sharded by project

Release early & often
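
The slides do not say how projects are assigned under manual sharding, so the following is only one possible scheme, sketched under stated assumptions: the application, not the database, decides which database holds a project's data. The database names, the `overrides` table, and `database_for` are all hypothetical.

```python
import zlib

# Hypothetical database names; the slides do not describe the real layout.
DATABASES = ["projects-0", "projects-1", "projects-2", "projects-3"]


def database_for(project_shortname, overrides=None):
    """Return the name of the database that holds a project's data.

    An explicit override wins (useful for moving a busy project by hand);
    otherwise a stable hash of the shortname picks a database.
    """
    overrides = overrides or {}
    if project_shortname in overrides:
        return overrides[project_shortname]
    # crc32 is stable across processes, unlike Python's built-in hash().
    bucket = zlib.crc32(project_shortname.encode("utf-8")) % len(DATABASES)
    return DATABASES[bucket]
```

Because the routing lives in application code, every tool has to call the same function before touching project data.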

Page 9: MongoATL: How Sourceforge is Using MongoDB

What We Liked

- Performance, performance, performance
  - Easily handle 90% of SF.net traffic from 1 DB server, 4 web servers
- Schemaless server allows fast schema evolution in development, making many migrations unnecessary
- Replication is easy, making scalability and backups easy
  - Keep a “backup slave” running
  - Kill the backup slave, copy off the database, bring the slave back up
  - Automatic re-sync with master
- Query language
  - You mean I can have performance without map-reduce?
- GridFS

Page 10: MongoATL: How Sourceforge is Using MongoDB

Pitfalls

- Too-large documents
  - Store less per document
  - Return only a few fields
- Ignoring indexing
  - Watch your server log; bad queries show up there
- Ignoring your data’s schema
- Using many databases when one will do
- Using too many queries
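
Two of the fixes above, returning only a few fields and avoiding too many queries, can be sketched in one function. This is a minimal sketch, assuming a PyMongo-style collection object that offers `find(filter, projection)`; the function name and the `shortname` and `summary` fields are hypothetical.

```python
def fetch_summaries(collection, shortnames):
    """Fetch several projects in ONE query and return only two fields.

    `collection` is assumed to behave like a PyMongo collection.
    """
    # One $in query instead of one query per project.
    query = {"shortname": {"$in": list(shortnames)}}
    # The projection asks the server to send back only these fields
    # (plus _id), which keeps large documents off the wire.
    projection = {"shortname": 1, "summary": 1}
    return list(collection.find(query, projection))
```

Any field the filter uses, here `shortname`, would also need an index for the query to stay out of the slow-query log.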

Page 11: MongoATL: How Sourceforge is Using MongoDB

Ming – an “Object-Document Mapper?”

- Your data has a schema
  - Your database can define and enforce it
  - It can live in your application (as with MongoDB)
  - Nice to have the schema defined in one place in the code
- Sometimes you need a “migration”
  - Changing the structure/meaning of fields
  - Adding indexes
  - Sometimes lazy, sometimes eager
- Queuing up all your updates can be handy
- Python dicts are nice; objects are nicer
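
The difference between a lazy and an eager migration can be sketched on plain dicts. This is not Ming's migration API; it is a standalone illustration, and the `schema_version`, `author`, and `authors` fields are hypothetical.

```python
def migrate(doc):
    """Upgrade one document to the current layout.

    Hypothetical change: version 1 stored a single 'author' string,
    version 2 stores a list under 'authors'.
    """
    if doc.get("schema_version", 1) >= 2:
        return doc  # already current
    upgraded = dict(doc)  # leave the caller's document untouched
    upgraded["authors"] = [upgraded.pop("author")] if "author" in upgraded else []
    upgraded["schema_version"] = 2
    return upgraded


def load_lazy(raw_doc):
    """Lazy: upgrade a document only when the application reads it."""
    return migrate(raw_doc)


def migrate_eager(docs):
    """Eager: upgrade every document in one pass, for example at deploy time."""
    return [migrate(doc) for doc in docs]
```

The lazy form spreads the cost over normal reads but leaves old-layout documents in the database until they are touched; the eager form pays the whole cost once.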

Page 12: MongoATL: How Sourceforge is Using MongoDB

Ming Concepts

- Inspired by SQLAlchemy
- Group of classes to which you map your collections
- Each class defines its schema, including indexes
- Convenience methods for loading/saving objects and ensuring indexes are created
- Migrations
- Unit of Work – great for web applications
- MIM – “Mongo in Memory”, nice for unit tests
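
The Unit of Work idea, queuing up changes and writing them out together, can be sketched in a few lines. This is a standalone illustration of the pattern, not Ming's implementation; the `UnitOfWork` class and its `save` callback are hypothetical.

```python
class UnitOfWork:
    """Collect new and modified objects, then persist them in one flush."""

    def __init__(self, save):
        self._save = save  # callable that persists one document
        self._dirty = {}   # id(obj) -> obj; each object is kept once, in order

    def register(self, obj):
        """Mark an object as needing to be saved."""
        self._dirty[id(obj)] = obj

    def flush(self):
        """Write every pending object, then forget them."""
        for obj in self._dirty.values():
            self._save(obj)
        self._dirty.clear()

    def clear(self):
        """Discard pending changes, for example when a request fails."""
        self._dirty.clear()


saved = []
uow = UnitOfWork(saved.append)

page = {"title": "Home", "text": ""}
uow.register(page)
page["text"] = "Welcome"
uow.register(page)  # registering twice still results in one write
uow.flush()
```

In a web application the flush typically happens once at the end of a successful request, and `clear` runs instead when the request raises an error.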

Page 13: MongoATL: How Sourceforge is Using MongoDB

Ming Example

from ming import schema
from ming.orm import MappedClass
from ming.orm import (FieldProperty, ForeignIdProperty,
                      RelationProperty)

# `session` is assumed to be an ORM session created elsewhere.

class WikiPage(MappedClass):

    class __mongometa__:
        session = session
        name = 'wiki_page'   # name of the MongoDB collection

    _id = FieldProperty(schema.ObjectId)
    title = FieldProperty(str)
    text = FieldProperty(str)
    # 'WikiComment' is a mapped class defined elsewhere.
    comments = RelationProperty('WikiComment')

MappedClass.compile_all()  # Lets Ming know about the mapping

Page 14: MongoATL: How Sourceforge is Using MongoDB

Open Source

- Ming: http://sf.net/projects/merciless/ (MIT License)
- Allura: http://sf.net/p/allura/ (Apache License)

Page 15: MongoATL: How Sourceforge is Using MongoDB

Future Work

- mongos
- New Allura Tools
- Migrating legacy SF.net projects to Allura
- Stats all in MongoDB rather than Hadoop?
- Better APIs to access your project data

Page 16: MongoATL: How Sourceforge is Using MongoDB

Questions?

Page 17: MongoATL: How Sourceforge is Using MongoDB

Rick Copeland, @rick446

[email protected]
