Date posted: 24-Jun-2015
Category: Technology
Uploaded by: mongodb
Common MongoDB Use Cases
Hannes Magnusson
Software Engineer, MongoDB
#MongoDB
NoSQL and MongoDB
NoSQL Features
• Flexible data models: lists, embedded objects, sparse data, semi-structured data, agile development
• High data throughput: reads and writes
• Big data: aggregate data size, number of objects
• Low latency: millisecond latency for reads and writes
• Cloud computing: runs everywhere, no special hardware
• Commodity hardware: Ethernet, local data storage
• JSON-based, dynamic schemas
• Replica sets to scale reads
• Sharding to scale writes
• 1000s of shards in a single DB
• Data partitioning
• Designed for "typical" OS and local file system
• Scale-out to overcome hardware limitations
• In-memory cache
• Scale-out working set
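A minimal sketch (plain Node.js, no database) of what "dynamic schemas" means in practice: documents in the same collection can carry different fields, including lists and embedded objects. All field names here are illustrative, not from the deck.

```javascript
// Two documents with different shapes can live in the same collection:
// embedded objects, lists, and sparse fields need no table migration.
const products = [
  { _id: 1, name: "laptop", specs: { cpu: "i7", ram_gb: 16 }, tags: ["electronics"] },
  { _id: 2, name: "t-shirt", sizes: ["S", "M", "L"] } // sparse: no specs, no tags
];

// Readers tolerate missing fields instead of relying on a fixed schema.
const withSpecs = products.filter(doc => doc.specs !== undefined);
```

The trade-off is that schema discipline moves from the database into application code, which must handle absent fields gracefully.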
Use Cases
High Volume Data Feeds
• Machine-generated data: more machines, sensors and data; variably structured
• Securities data: high-frequency trading; daily closing prices
• Social media / general public: multiple data sources, each changing its format constantly; e.g. student scores, ISP logs
High Volume Data Feeds
Many data sources feed the cluster in parallel:
• Asynchronous writes
• Flexible document model adapts to changes in sensor format
• Write to memory with periodic disk flush
• Scale writes over multiple shards
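The ingestion pattern above can be sketched without a real database: variably shaped sensor documents are accepted into an in-memory buffer and written out in batches, mimicking write-to-memory with periodic disk flush. The class and field names are assumptions for illustration.

```javascript
// Hypothetical buffer: accept documents of any shape, flush in batches.
class FeedBuffer {
  constructor(flushSize) {
    this.flushSize = flushSize;
    this.buffer = [];   // in-memory staging area
    this.flushed = [];  // stands in for batches persisted to disk/DB
  }
  write(doc) {
    this.buffer.push(doc);
    if (this.buffer.length >= this.flushSize) this.flush();
  }
  flush() {
    if (this.buffer.length) {
      this.flushed.push(this.buffer);
      this.buffer = [];
    }
  }
}

const feed = new FeedBuffer(2);
feed.write({ sensor: "s1", temp_c: 21.5 });             // one shape
feed.write({ sensor: "s2", rpm: 900, vibration: 0.2 }); // another shape, same feed
feed.write({ sensor: "s1", temp_c: 21.7 });
feed.flush(); // final partial batch
```

In a real deployment the flush step would be a bulk insert spread across shards; the point here is only the batching shape of the write path.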
Operational Intelligence
• Ad targeting: large volume of users; very strict latency requirements; sentiment analysis
• Real-time dashboards: expose data to millions of customers; reports on large volumes of data; reports that update in real time
• Social media monitoring: join the conversation; catered games; customized surveys
Operational Intelligence
Dashboards and APIs
• Low-latency reads
• Parallelize queries across replicas and shards
• In-database aggregation
• Flexible schema adapts to changing input data
• Same cluster can collect, store and report on data
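To make "in-database aggregation" concrete, here is the kind of computation a `$group` stage performs, implemented over a plain array so it runs anywhere. The collection and field names are illustrative, not from the deck.

```javascript
// Sample page-view events; in MongoDB these would be documents in a collection.
const events = [
  { page: "/home", ms: 12 },
  { page: "/home", ms: 18 },
  { page: "/buy",  ms: 30 }
];

// Same idea as:
//   db.events.aggregate([
//     { $group: { _id: "$page", hits: { $sum: 1 }, avg_ms: { $avg: "$ms" } } }
//   ])
const byPage = {};
for (const e of events) {
  const g = byPage[e.page] ?? (byPage[e.page] = { hits: 0, total_ms: 0 });
  g.hits += 1;
  g.total_ms += e.ms;
}
const report = Object.entries(byPage).map(([page, g]) =>
  ({ page, hits: g.hits, avg_ms: g.total_ms / g.hits }));
```

Running the grouping inside the database avoids shipping every raw event to the dashboard tier, which is what keeps real-time reports cheap.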
{
  cookie_id: "1234512413243",
  advertiser: {
    apple: {
      actions: [
        { impression: "ad1", time: 123 },
        { impression: "ad2", time: 232 },
        { click: "ad2", time: 235 },
        { add_to_cart: "laptop", sku: "asdf23f", time: 254 },
        { purchase: "laptop", time: 354 }
      ],
      …
    }
  }
}
Behavioural Profiles
Funnel: (1) see ad → (2) see ad → (3) click → (4) convert
• Rich profiles collecting multiple complex actions
• Scale out to support high throughput of activities tracked
• Dynamic schemas make it easy to add new types of actions
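Appending a funnel event to a profile shaped like the cookie document above mirrors a MongoDB `$push` update on the embedded actions array. The helper function is a hypothetical sketch, not a driver API.

```javascript
// Profile shaped like the deck's sample document.
const profile = {
  cookie_id: "1234512413243",
  advertiser: { apple: { actions: [ { impression: "ad1", time: 123 } ] } }
};

function recordAction(p, advertiser, action) {
  // Same idea as:
  //   db.profiles.updateOne({ cookie_id: p.cookie_id },
  //     { $push: { ["advertiser." + advertiser + ".actions"]: action } })
  p.advertiser[advertiser].actions.push(action);
}

recordAction(profile, "apple", { click: "ad1", time: 130 });
```

Because actions are plain embedded documents, a new action type (say, `add_to_cart` with a `sku`) needs no schema change, which is the dynamic-schemas point above.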
Metadata
• Product catalogue: diverse product portfolio; complex querying and filtering; multi-faceted product attributes
• Data analysis: data mining; call records; insurance claims
• Biometric data: retina scans; fingerprints
Metadata
{ ISBN: "00e8da9b", type: "Book", country: "Egypt", title: "Ancient Egypt" }
{ type: "Artifact", medium: "Ceramic", country: "Egypt", year: "3000 BC" }
• Flexible data model for similar but different objects
• Indexing and rich query API for easy searching and sorting

db.archives.find({ "country": "Egypt" });
db.archives.find({ "type": "Artifact" });

• Indexing techniques that fit your data model
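The two queries above can be traced against the sample archive documents with a tiny predicate that mimics `db.archives.find()` over a plain array; this is a sketch of the matching behavior, not the driver API.

```javascript
// The deck's two archive documents.
const archives = [
  { ISBN: "00e8da9b", type: "Book", country: "Egypt", title: "Ancient Egypt" },
  { type: "Artifact", medium: "Ceramic", country: "Egypt", year: "3000 BC" }
];

// Exact-match query semantics: every key in the query must equal the
// corresponding document field.
const find = (coll, query) =>
  coll.filter(doc => Object.entries(query).every(([k, v]) => doc[k] === v));

const egyptian = find(archives, { country: "Egypt" });  // matches both documents
const artifacts = find(archives, { type: "Artifact" }); // matches the second only
```

Note that the query on `type` works even though the first document lacks fields like `medium`: with a flexible data model, queries simply don't match documents missing the field.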
Content Management
• News site: comments and user-generated content; personalization of content and layout
• Multi-device rendering: generate layout on the fly; no need to cache static pages
• Sharing: store large objects; simpler modeling of metadata
Content Management
{ camera: "Nikon D4",
  location: [ -122.418333, 37.775 ] }

{ camera: "Canon 5D MkII",
  people: [ "Jim", "Carol" ],
  taken_on: ISODate("2012-03-07T18:32:35.002Z") }

{ origin: "facebook.com/photos/xwdf23fsdf",
  license: "Creative Commons CC0",
  size: { dimensions: [ 124, 52 ], units: "pixels" } }
• Flexible data model for similar but different objects
• Horizontal scalability for large data sets
• Geospatial indexing for location-based searches
• GridFS for large object storage
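GridFS's storage idea can be sketched in a few lines: a large binary is split into fixed-size chunks, each stored as its own document keyed back to the file. Real GridFS defaults to 255 KB chunks; the tiny 8-byte chunk size and field names below are for illustration only.

```javascript
// Split a binary payload into chunk documents, GridFS-style:
// each chunk records its parent file id and its sequence number n.
function toChunks(fileId, data, chunkSize) {
  const chunks = [];
  for (let i = 0; i < data.length; i += chunkSize) {
    chunks.push({ files_id: fileId, n: chunks.length, data: data.slice(i, i + chunkSize) });
  }
  return chunks;
}

const bytes = Buffer.from("a large photo payload"); // 21 bytes
const chunks = toChunks("photo1", bytes, 8);        // 8 + 8 + 5 bytes
```

Storing chunks as ordinary documents is what lets large objects ride on the same replication and sharding machinery as the rest of the data.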
Is MongoDB a good fit for my use case?
Is there an ideal use case?
Application, and why MongoDB might be a good fit:
• Large number of objects to store: sharding lets you split objects across multiple servers.
• High write/read throughput and data distribution: sharding plus replication lets you scale read and write traffic across multiple servers, multiple tenants, or data centers.
• Low-latency access: the memory-mapped storage engine caches documents in RAM, enabling in-memory operations; data locality of documents significantly improves latency over join-based approaches.
• Variable data in objects: dynamic schemas and the JSON data model enable flexible data storage without sparse tables or complex joins, and provide an intuitive query language.
• Cloud-based deployment: sharding and replication let you work around hardware limitations in the cloud.