EMBRACING CONSTRAINTS WITH COUCHDB
David Zülke
David Zuelke
http://en.wikipedia.org/wiki/File:München_Panorama.JPG
Founder
Lead Developer
A DISCLAIMER FIRST
Before You All Figure This Out Yourselves...
NEIN NEIN NEIN NEIN (NO NO NO NO)
DAS IST BETRUG (THIS IS A SCAM)
This talk is not really about embracing constraints
I’ll tell you what it’s really about when we’re finished
I’ll also apologize to you for lying at that point
(it’s always easier to apologize than to ask for permission)
COUCHDB IN THREE SLIDES
Full Of DIS IS SRS BSNS Bullet Points
COUCHDB STORES DOCUMENTS
• CouchDB stores documents with arbitrary keys and values
• Each document is identified by an ID and has a revision
• Documents can have file attachments
• Stored as JSON, so it’s easy to interface with
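As a rough illustration of the bullets above, a document might look like the following sketch. The fields `type`, `title`, and `tags` and all values are made up for this example; `_id`, `_rev`, and `_attachments` are the field names CouchDB reserves, and the revision string is a placeholder rather than a real revision.

```javascript
// A hypothetical "talk" document. Only _id, _rev and _attachments are
// CouchDB-reserved fields; everything else is arbitrary application data.
const doc = {
  _id: "talk-couchdb-constraints",  // document ID (chosen by us in this sketch)
  _rev: "1-placeholder",            // revision, normally assigned by CouchDB
  type: "talk",
  title: "Embracing Constraints With CouchDB",
  tags: ["couchdb", "mapreduce"],
  _attachments: {
    // Attachment stub: metadata only, no file content inlined.
    "slides.pdf": { content_type: "application/pdf", length: 0, stub: true }
  }
};

// Documents travel over the wire as JSON, so a round trip is lossless.
const wire = JSON.stringify(doc);
const parsed = JSON.parse(wire);
console.log(parsed.tags.join(", "));
```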
COUCHDB SPEAKS HTTP
• CouchDB uses HTTP to communicate with clients & servers
• That means scalability
• That means a lot of kick ass stuff totally for free
• Caching
• Load Balancing
• Content Negotiation
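Because everything is plain HTTP, document operations map onto ordinary verbs and URLs. The sketch below only describes the requests and does not send them; the server address, the database name `talks`, and the helper `describeRequest` are assumptions for illustration.

```javascript
// Describe (but do not send) the HTTP requests for common document operations.
// BASE is an assumed local server and a hypothetical database named "talks".
const BASE = "http://localhost:5984/talks";

function describeRequest(op, id, rev) {
  const url = BASE + "/" + encodeURIComponent(id);
  switch (op) {
    case "read":
      // A known revision can be sent as an ETag for a conditional GET,
      // which is what lets ordinary HTTP caches sit in front of CouchDB.
      return { method: "GET", url, headers: rev ? { "If-None-Match": '"' + rev + '"' } : {} };
    case "write":
      return { method: "PUT", url, headers: { "Content-Type": "application/json" } };
    case "delete":
      return { method: "DELETE", url: url + "?rev=" + encodeURIComponent(rev), headers: {} };
    default:
      throw new Error("unknown operation: " + op);
  }
}

console.log(describeRequest("read", "intro", "1-abc"));
```

Any HTTP client, proxy, or load balancer that understands these verbs and headers can work with the database without special drivers.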
COUCHDB USES MVCC
• Multiversion Concurrency Control
• When updating, you must supply a revision number
• Your change will be rejected if the revision is not the latest
• All writes are serialized
• No need for locks, but puts some responsibility on developers
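The revision check can be imitated with a small in-memory toy. This is a sketch of the idea only, not CouchDB's implementation: `ToyStore` is a hypothetical class, and its integer revisions stand in for real revision strings.

```javascript
// In-memory toy store that mimics the revision check; not CouchDB's code.
class ToyStore {
  constructor() { this.docs = new Map(); }

  // Returns { ok, rev } on success or { error: "conflict" } on a stale revision.
  put(id, body, rev) {
    const current = this.docs.get(id);
    const currentRev = current ? current.rev : undefined;
    if (rev !== currentRev) {
      return { error: "conflict" };  // CouchDB answers 409 Conflict in this case
    }
    const nextRev = (current ? current.rev : 0) + 1;  // toy integer revisions
    this.docs.set(id, { rev: nextRev, body });
    return { ok: true, rev: nextRev };
  }
}

const store = new ToyStore();
const created = store.put("talk", { title: "v1" });              // new doc: no rev needed
const updated = store.put("talk", { title: "v2" }, created.rev); // latest rev: accepted
const stale = store.put("talk", { title: "v3" }, created.rev);   // outdated rev: rejected
console.log(created, updated, stale);
```

The rejected writer is the one who has to re-read the document and decide what to do, which is the responsibility the last bullet refers to.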
SOME DETAILS
An In-Depth Look At What Makes CouchDB Different
CAP
consistency X
availability
partition tolerance
“So, CouchDB does not have consistency of CAP?”
“Booh, that means my data will be inconsistent. Fail!”
psssshhh
YOUR MOM IS INCONSISTENT
CouchDB is eventually consistent
When replicating, conflicting revisions will be marked as such
These conflicts can then be resolved (users, daemons,...)
and everything will be fine \o/
which brings us to...
REPLICATION
• You can do Master-Master replication
• Conflicts are detected and marked automatically
• Conflicts are supposed to be resolved by applications
• Or by users, who usually know best what to do!
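What "resolved by applications" could mean in practice is sketched below. CouchDB itself only picks a winning revision and marks the others as conflicts; the merge strategy here (union of tags) and the helper `resolveConflict` are assumptions for illustration, as are the sample revisions.

```javascript
// Hypothetical application-level resolver: merges two conflicting revisions
// of a "talk" document by keeping the union of their tags.
function resolveConflict(winner, loser) {
  const tags = Array.from(
    new Set([...(winner.tags || []), ...(loser.tags || [])])
  ).sort();
  return {
    merged: { ...winner, tags },                    // to be written on top of the winner
    toDelete: { _id: loser._id, _rev: loser._rev }  // losing revision to remove afterwards
  };
}

// Two made-up revisions that diverged on different replicas.
const a = { _id: "talk", _rev: "2-aaa", tags: ["couchdb", "http"] };
const b = { _id: "talk", _rev: "2-bbb", tags: ["couchdb", "mvcc"] };
const result = resolveConflict(a, b);
console.log(result.merged.tags, result.toDelete);
```

A different application might instead show both versions to the user and let them pick, as the last bullet suggests.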
CouchDB is Ground Computing
Imagine a world where every computer runs CouchDB
Ubuntu One already does, to sync bookmarks etc!
MAP/REDUCE
BASIC PRINCIPLE: MAPPER
• The Mapper reads records and emits <key, value> pairs
• Example: Apache access.log
• Each line is a record
• Extract client IP address and number of bytes transferred
• Emit IP address as key, number of bytes as value
• For hourly rotating logs, the job can be split across 24 nodes*
* In practice, it’s a lot smarter than that
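The mapper described above could be sketched like this. It assumes the log uses the Common Log Format; the function name `mapLine`, the regular expression, and the sample lines are illustrative and not taken from the talk.

```javascript
// Mapper sketch: one access.log line in, one <ip, bytes> pair out.
// Assumes Common Log Format; the sample lines below are made up.
function mapLine(line, emit) {
  const match = /^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (\d+|-)/.exec(line);
  if (!match) return;  // skip lines we cannot parse
  const bytes = match[2] === "-" ? 0 : Number(match[2]);
  emit(match[1], bytes);  // key: client IP, value: bytes transferred
}

const pairs = [];
const emit = (key, value) => pairs.push([key, value]);
[
  '212.122.174.13 - - [10/Oct/2010:13:55:36 +0200] "GET /index.html HTTP/1.1" 200 18271',
  '74.119.8.111 - - [10/Oct/2010:13:55:40 +0200] "GET /logo.png HTTP/1.1" 200 91272',
  'not a log line'
].forEach((line) => mapLine(line, emit));
console.log(pairs);
```

Because each line is handled independently, different log files can be mapped on different machines without coordination.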
BASIC PRINCIPLE: REDUCER
• A Reducer is given a key and all values for this specific key
• Even if there are many Mappers on many computers, the results are aggregated before they are handed to Reducers
• Example: Apache access.log
• The Reducer is called once for each client IP (that’s our key), with a list of values (transferred bytes)
• We simply sum up the bytes to get the total traffic per IP!
EXAMPLE OF MAPPED INPUT
IP Bytes
212.122.174.13 18271
212.122.174.13 191726
212.122.174.13 198
74.119.8.111 91272
74.119.8.111 8371
212.122.174.13 43
REDUCER WILL RECEIVE THIS
IP Bytes
212.122.174.13 18271
212.122.174.13 191726
212.122.174.13 198
212.122.174.13 43
74.119.8.111 91272
74.119.8.111 8371
AFTER REDUCTION
IP Bytes
212.122.174.13 210238
74.119.8.111 99643
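The three tables above can be reproduced with a few lines of code. This is a single-process sketch of the grouping and reducing steps; `groupByKey` and `reduceBytes` are names chosen for this example.

```javascript
// The mapped pairs from the "mapped input" table.
const mapped = [
  ["212.122.174.13", 18271],
  ["212.122.174.13", 191726],
  ["212.122.174.13", 198],
  ["74.119.8.111", 91272],
  ["74.119.8.111", 8371],
  ["212.122.174.13", 43]
];

// Group values by key, as the framework does between map and reduce.
function groupByKey(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Reducer: called once per key with all of that key's values.
function reduceBytes(key, values) {
  return values.reduce((total, n) => total + n, 0);
}

const totals = {};
for (const [ip, values] of groupByKey(mapped)) {
  totals[ip] = reduceBytes(ip, values);
}
console.log(totals);
```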
COUCHDB INCREMENTAL MAPREDUCE
THE KEY DIFFERENCE
• Maps and Reduces are incremental:
• If one document changes, only that one document needs:
• mapping
• reduction
• Then a few new reduce runs are performed to compute the final result
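The incremental idea can be illustrated with a toy that caches map output per document and one partial reduction per bucket of documents. This is only a sketch under simplifying assumptions: `ToyView`, the fixed `BUCKET_SIZE`, and the position-based indexing are invented here and do not reflect how CouchDB actually stores its view index.

```javascript
// Toy incremental view: caches each document's map output and one partial
// reduction per fixed-size bucket of documents. Not CouchDB's real index.
const BUCKET_SIZE = 2;
const sum = (values) => values.reduce((a, b) => a + b, 0);

class ToyView {
  constructor(mapFn) {
    this.mapFn = mapFn;
    this.mapped = [];    // per-document emitted values
    this.partials = [];  // per-bucket reduced value
    this.mapCalls = 0;   // counts map invocations, for demonstration
  }

  // Insert or update the document at position `index`.
  update(index, doc) {
    this.mapCalls += 1;
    this.mapped[index] = this.mapFn(doc);  // re-map only this document
    const bucket = Math.floor(index / BUCKET_SIZE);
    const start = bucket * BUCKET_SIZE;
    const values = this.mapped.slice(start, start + BUCKET_SIZE).flat();
    this.partials[bucket] = sum(values);   // re-reduce only its bucket
  }

  // Final result: a reduce run over the cached partial reductions.
  total() { return sum(this.partials); }
}

// Count tag occurrences: each tag emits the value 1.
const view = new ToyView((doc) => (doc.tags || []).map(() => 1));
[["a", "b"], ["a"], ["c"], []].forEach((tags, i) => view.update(i, { tags }));
const before = view.total();
view.update(1, { tags: ["a", "b", "c"] });  // one document changes
const after = view.total();
console.log(before, after, view.mapCalls);
```

Changing one document costs one map call and one bucket re-reduction, plus the final pass over the partial results; the other documents are never touched again.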
MAPPER: DOCS BY TAGS
function(doc) {
  if (doc.type == 'talk') {
    (doc.tags || []).forEach(function(tag) {
      emit(tag, doc);
    });
  }
}
MAPREDUCE: COUNT TAGS
function(doc) {
  if (doc.type == 'talk') {
    (doc.tags || []).forEach(function(tag) {
      emit(tag, 1);
    });
  }
}

function(key, values) {
  return sum(values);
}
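To try the tag-counting view outside of CouchDB, the functions can be run against a few sample documents with stand-ins for `emit` and `sum`, which CouchDB normally provides to view functions. The sample documents and the grouping loop are assumptions made for this sketch.

```javascript
// Stand-ins for the helpers CouchDB provides to view functions.
const rows = [];
function emit(key, value) { rows.push({ key, value }); }
function sum(values) { return values.reduce((a, b) => a + b, 0); }

// The map and reduce functions from the slide, given names so they can be called.
function map(doc) {
  if (doc.type == 'talk') {
    (doc.tags || []).forEach(function(tag) { emit(tag, 1); });
  }
}
function reduce(key, values) { return sum(values); }

// Made-up sample documents.
const docs = [
  { type: 'talk', tags: ['couchdb', 'nosql'] },
  { type: 'talk', tags: ['couchdb'] },
  { type: 'talk' },                        // no tags: emits nothing
  { type: 'speaker', tags: ['couchdb'] }   // wrong type: ignored
];
docs.forEach(map);

// Group rows by key, then reduce each group (like querying with group=true).
const counts = {};
for (const { key } of rows) {
  if (!(key in counts)) {
    counts[key] = reduce(key, rows.filter((r) => r.key === key).map((r) => r.value));
  }
}
console.log(counts);
```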
LUCENE INTEGRATION
Full Control Over What Is Indexed, And How
COUCHAPP
Python Tool For Development And Deployment
DEMO TIME
Let’s Relax On The Couch
The End
FURTHER READING
• http://books.couchdb.org/
• http://couchdb.apache.org/
• http://github.com/couchapp/couchapp
• http://github.com/rnewson/couchdb-lucene/
• http://janl.github.com/couchdbx/
• http://j.mp/oqbQs (E4X in CouchDB for XML parsing)
DID YOU SEE THE HEAD FAKE?
This Talk Was Not About Embracing Constraints
It Was About Embracing Awesomeness
Questions?
THANK YOU!
This was http://joind.in/1651
by @dzuelke