Drop acid

Post on 14-May-2015

description

Session on NoSQL Databases and MongoDB. I stole the title from someone who deserves credit, but unfortunately, can't remember who. I blame the acid.

transcript

NoSQL - Death to Relational Databases

Mike Feltman, F1 Technologies

Agenda
• The NoSQL Movement
• MongoDB Discussion & Demo
• Discussion

The NoSQL Movement
NoSQL Databases:
• Non-relational
• Less ACID
• More BASE
• CAP Trading
• Highly Scalable
• Highly Performant

NoSQL = Not Only SQL

Less ACID
• Atomic
– Basically means it supports transactions
• Consistent
– Has hard constraints & rejects non-conforming data
• Isolated
– No peeking at incomplete commits
• Durable
– Once a commit is finished, it lasts forever.

More BASE
• Basically Available
• Soft-state
• Eventually consistent

CAP Trading
• Consistency (client perceives set of operations completed)
• Availability (operations terminate with an expected result)
• Partition tolerance (operations will complete, even if a required resource is unavailable)
• Only 2 of the 3 are possible at once in distributed systems. – Eric Brewer

The NoSQL Movement
Why:
• SQL is tedious and difficult
• Strongly typed schemas are inflexible and painful to maintain
• Inadequate performance of RDBMS on huge data stores
• Poor scalability of RDBMS
• Poor replication support

Types of NoSQL Databases
• Document Stores
• Graph
• Key/Value Store
• Object Database
• Tabular

Major Players
• MongoDB (10gen)
• CouchDB (Apache)
• Cassandra (Apache – formerly Facebook)
• BigTable (Google)
• Berkeley DB (Oracle)
• Dynamo (Amazon)
• MObStor (Yahoo)
• Haystack (Facebook)
• Voldemort (LinkedIn)
• HBase/Hadoop (Apache & Microsoft)

MongoDB
Combining the best features of document databases, key-value stores, and RDBMSes.
• Scalable
• High-Performance
• Open Source
• Schema-free
• Document Oriented

MongoDB Features
• Document-oriented storage (BSON)
• Dynamic Queries
• Full index support (including embedded objects & arrays)
• Fast, in-place updates
• Efficient Blob storage
• Replication
• Auto-sharding
• MapReduce
• Driver support for many languages
• Cross-Platform
• Admin Tools

Document Oriented Storage

• Data is stored in BSON
– Binary-encoded serialization of JSON-like documents.
– Lightweight, traversable & efficient
– Supports embedded objects & arrays
– Document = Record

{
  firstName: "Nicklas",
  lastName: "Lidstrom",
  team: "Red Wings",
  stanleyCups: [1997, 1998, 2002, 2008],
  norrisTrophies: [2001, 2002, 2003, 2006, 2007, 2008]
}

Dynamic Queries
• No indexes required to find data.
• RDBMSes all support this as well.

Examples
• All records:
db.players.find({})
• All Red Wings:
db.players.find({"team": "Red Wings"})
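As a rough illustration of what the two queries above ask for, the sketch below filters an in-memory array in plain JavaScript. The `matches` and `find` helpers and the second player record are hypothetical; they only mimic top-level equality matching and are not how MongoDB itself is implemented.

```javascript
// Hypothetical in-memory stand-in for the players collection.
const players = [
  { firstName: "Nicklas", lastName: "Lidstrom", team: "Red Wings" },
  { firstName: "Example", lastName: "Player", team: "Other Team" }, // made-up record
];

// True when every key in the query equals the same key in the document.
// Only top-level equality is handled; real queries support far more.
function matches(doc, query) {
  return Object.keys(query).every((key) => doc[key] === query[key]);
}

// Scans every document, which is what has to happen when no index applies.
function find(collection, query) {
  return collection.filter((doc) => matches(doc, query));
}

const all = find(players, {});                         // empty query matches everything
const redWings = find(players, { team: "Red Wings" }); // equality on one key
console.log(all.length, redWings.map((p) => p.lastName));
```

An empty query object has no keys to check, so every document matches, which mirrors `find({})` returning all records.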

Index Support
• B-Tree format
• Default index on PK
• Supports unique, compound, document indexes (indexes on nested documents) and multikey indexes (allows indexing of arrays of values)
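The idea of indexing arrays of values can be sketched in plain JavaScript. This is only a conceptual model: it uses a `Map` rather than a B-Tree, and `buildMultikeyIndex` and the second record are hypothetical names and data invented for the example.

```javascript
// Hypothetical collection; the second record is made up for illustration.
const indexedPlayers = [
  { lastName: "Lidstrom", stanleyCups: [1997, 1998, 2002, 2008] },
  { lastName: "Player", stanleyCups: [2008] },
];

// Builds a Map from each array element to the documents containing it.
// One document produces one index entry per array element.
function buildMultikeyIndex(collection, field) {
  const index = new Map();
  for (const doc of collection) {
    for (const value of doc[field] || []) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc);
    }
  }
  return index;
}

const cupIndex = buildMultikeyIndex(indexedPlayers, "stanleyCups");
const winners2008 = (cupIndex.get(2008) || []).map((p) => p.lastName);
const winners1997 = (cupIndex.get(1997) || []).map((p) => p.lastName);
console.log(winners2008, winners1997);
```

A lookup by a single year then goes straight to the matching documents instead of scanning every array in the collection.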

Fast in-place updates
• Updates are made to existing documents within a collection.
• Many “NoSQL” databases (such as CouchDB) do not support updates and instead store versions of records.
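The difference between the two approaches can be sketched with two small in-memory stores. Everything here is hypothetical (the helper names, the `retired` field, and the `Map`-based stores); it models the contrast described above, not the internals of MongoDB or CouchDB.

```javascript
// In-place style: the stored document itself is modified.
function updateInPlace(store, id, changes) {
  Object.assign(store.get(id), changes);
}

// Versioned style: each update appends a new revision; older ones are kept.
function updateVersioned(store, id, changes) {
  const revisions = store.get(id);
  const latest = revisions[revisions.length - 1];
  revisions.push({ ...latest, ...changes });
}

const inPlaceStore = new Map([[1, { lastName: "Lidstrom", team: "Red Wings" }]]);
const versionedStore = new Map([[1, [{ lastName: "Lidstrom", team: "Red Wings" }]]]);

updateInPlace(inPlaceStore, 1, { retired: true });     // hypothetical field
updateVersioned(versionedStore, 1, { retired: true }); // same change, new revision

console.log(inPlaceStore.get(1), versionedStore.get(1).length);
```

After one update the in-place store still holds a single object, while the versioned store holds two revisions of the same record.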

Efficient Blob Storage
• Blob = Binary Large Object
• Up to 4MB within document
• GridFS specification is followed for larger items and external files

Replication
• Enhanced master-slave configuration
– One server active for writes at a time.
– Provides failover and redundancy
– Implemented with Replica Pairs
  • When master fails, slave takes over
  • When slave fails, control reverts to master
• Limited master-master

Auto-Sharding
• Sharding:
– Breaking database down into “shards” and spreading those across distributed/commodity servers.
– Highly scalable approach for increased throughput and performance of high-transaction, large database applications.
– MongoDB manages data storage and retrieval behind the scenes.
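The routing idea behind sharding can be sketched as follows. This uses simple hash-modulo placement, which is an assumption chosen for brevity and not a description of how MongoDB assigns data to shards; `hashKey`, `ShardedCollection`, and the placeholder player names are all hypothetical.

```javascript
// Simple deterministic string hash (illustrative only).
function hashKey(key) {
  let hash = 0;
  for (const ch of String(key)) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep as unsigned 32-bit
  }
  return hash;
}

// Each "shard" is just an array standing in for a separate server.
class ShardedCollection {
  constructor(shardCount, shardKey) {
    this.shardKey = shardKey;
    this.shards = Array.from({ length: shardCount }, () => []);
  }

  // The same key value always routes to the same shard.
  shardFor(value) {
    return hashKey(value) % this.shards.length;
  }

  insert(doc) {
    this.shards[this.shardFor(doc[this.shardKey])].push(doc);
  }

  // A lookup by shard key touches only one shard.
  findByKey(value) {
    return this.shards[this.shardFor(value)].filter(
      (doc) => doc[this.shardKey] === value
    );
  }
}

const sharded = new ShardedCollection(3, "lastName");
["Lidstrom", "PlayerA", "PlayerB", "PlayerC"].forEach((lastName) =>
  sharded.insert({ lastName })
);
const found = sharded.findByKey("Lidstrom");
const total = sharded.shards.reduce((n, shard) => n + shard.length, 0);
console.log(found, total);
```

The caller only ever calls `insert` and `findByKey`; which shard holds a document stays hidden, which is the "behind the scenes" part.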

MapReduce

• Term comes from Google.
– Patented framework for processing huge datasets on certain kinds of distributable problems using a large number of servers.
– MongoDB applies it to single server instances as well.
• Useful for batch operations
• Aggregation: NoSQL answer to GROUP BY
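To show why MapReduce plays the role of GROUP BY, here is a minimal single-process sketch in plain JavaScript that counts players per team. The `mapReduce`, `mapPlayer`, and `reduceCounts` functions and the placeholder records are hypothetical and do not use MongoDB's own mapReduce command.

```javascript
// Hypothetical collection; PlayerA and PlayerB are made-up records.
const roster = [
  { lastName: "Lidstrom", team: "Red Wings" },
  { lastName: "PlayerA", team: "Red Wings" },
  { lastName: "PlayerB", team: "Other Team" },
];

// Map step: emit one [key, value] pair per document.
function mapPlayer(doc) {
  return [doc.team, 1];
}

// Reduce step: combine all values emitted for one key.
function reduceCounts(key, values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// Groups emitted values by key, then reduces each group.
function mapReduce(collection, map, reduce) {
  const groups = new Map();
  for (const doc of collection) {
    const [key, value] = map(doc);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduce(key, values);
  }
  return result;
}

const playersPerTeam = mapReduce(roster, mapPlayer, reduceCounts);
console.log(playersPerTeam);
```

This corresponds roughly to `SELECT team, COUNT(*) FROM players GROUP BY team` in SQL.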

Drivers
• .NET (C#)
• JavaScript
• Python
• PHP
• Ruby
• Java
• C++
• Perl
• JVM
– Clojure
– Groovy
– Scala

Cross-Platform
• 32 bit & 64 bit versions available for:
– Windows
– OS X
– Linux
– Solaris

Admin Tools
• Command Shell
• Simple limited REST (http) Interface
• Mongostat
• Mongosniff (Unix only – use tcpdump on Windows)
• Backup & Restore

MongoDB Terminology
Traditional RDBMS | MongoDB
Database | Database
Table | Collection
Record | Document
Field | Key

Demo!
• Start the server (if it’s not running).
C:\mongodb\bin\mongod
• Start the shell
C:\mongodb\bin\mongo

The MongoDB Shell

Database Commands
• Open Database: use (database name)
• Create Database: use (database name)

How it works
• Focused on documents
– Document = sequence of key-value pairs in BSON
  • Value can be another document
  • Additional types vs. JSON, e.g. dates, regexp
• Messages (passed over TCP/IP) are in BSON; drivers convert code to BSON
• Memory mapped storage engine (MMSE) – all disk access takes place through MMSE
• Query Optimizer:
– find({x: 10, y: "foo"})
– Launches multiple simultaneous queries based on indexes & table scan. Stops when one finishes and remembers which one was the fastest for future similar queries. Can use the hint option to specify which index to use.
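The optimizer behaviour described above can be approximated in a small simulation. This sketch makes several assumptions: plans run one after another instead of simultaneously, "fastest" is measured by how many documents each plan examines, and the names `tableScan`, `indexOnX`, `planCache`, and `findWithOptimizer` are hypothetical.

```javascript
// Hypothetical plans: each returns the matches plus how many documents
// it had to examine (a stand-in for elapsed time).
function tableScan(docs, query) {
  const results = docs.filter((d) => d.x === query.x && d.y === query.y);
  return { results, examined: docs.length };
}

function indexOnX(docs, query) {
  // Pretend an index on x narrows the search to documents with that x.
  const candidates = docs.filter((d) => d.x === query.x);
  const results = candidates.filter((d) => d.y === query.y);
  return { results, examined: candidates.length };
}

const plans = { tableScan, indexOnX };
const planCache = new Map(); // query shape -> name of the winning plan

// The shape ignores values, so similar queries share one cache entry.
function queryShape(query) {
  return Object.keys(query).sort().join(",");
}

function findWithOptimizer(docs, query, hint) {
  const shape = queryShape(query);
  const chosen = hint || planCache.get(shape);
  if (chosen) {
    return { plan: chosen, ...plans[chosen](docs, query) };
  }
  // No hint and no cached winner: try every plan and keep the cheapest.
  let best = null;
  for (const [name, plan] of Object.entries(plans)) {
    const outcome = plan(docs, query);
    if (best === null || outcome.examined < best.examined) {
      best = { plan: name, ...outcome };
    }
  }
  planCache.set(shape, best.plan);
  return best;
}

const sampleDocs = [
  { x: 10, y: "foo" },
  { x: 10, y: "bar" },
  { x: 3, y: "foo" },
];
const firstRun = findWithOptimizer(sampleDocs, { x: 10, y: "foo" });
const hinted = findWithOptimizer(sampleDocs, { x: 10, y: "foo" }, "tableScan");
console.log(firstRun.plan, planCache.get("x,y"), hinted.plan);
```

The third argument plays the role of the hint option: it bypasses both the trial run and the cached winner.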

Why?
• Applications where schema gets in the way
• Performance
• Scalability
• RAD
• More natural fit with OO Languages

Resources
• www.mongodb.org