
Introduction to NoSQL

Date post: 06-Apr-2017
Upload: joe-drumgoole
Page 1: Introduction to NoSQL
Page 2: Introduction to NoSQL

Introduction to NoSQL
Joe Drumgoole
Director, Solutions Architecture EMEA, MongoDB
@jdrumgoole

Page 3: Introduction to NoSQL

Remember

Page 4: Introduction to NoSQL

Who Are These Guys?

Page 5: Introduction to NoSQL

The relational model : 1970

Page 6: Introduction to NoSQL

The Internet - 1971

Page 7: Introduction to NoSQL

The Internet 2014

Page 8: Introduction to NoSQL

Moore’s Law

Page 9: Introduction to NoSQL

Limits to Moore’s Law

• Great for chip/transistor based technologies
• But has had little impact on disks
• Disks are still dog slow
• Like really, really, really slow
• Slower than molasses on a cold day in Alaska?
• Even slower than that
• How slow…

Page 10: Introduction to NoSQL

How Slow is a Disk*

CPU
L1 Cache: 0.5 seconds
L2 Cache: 7 seconds
Main Memory: 1.5 minutes
Spinning Disks: 7 months

* http://norvig.com/21-days.html#answers
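Human-scale figures like these come from stretching every nanosecond to one second. The sketch below shows that arithmetic. The nanosecond values are approximate, illustrative assumptions of the kind listed on the cited page, not numbers taken from the slide, and `humanize` and `human_scale` are hypothetical helper names.

```python
# Approximate latencies in nanoseconds. These values are assumptions for
# illustration only; real hardware varies widely.
LATENCY_NS = {
    "L1 cache fetch": 0.5,
    "L2 cache fetch": 7,
    "Main memory fetch": 100,
    "Read 1 MB sequentially from disk": 20_000_000,
}


def humanize(seconds: float) -> str:
    """Render a duration using the largest unit that fits."""
    for unit, size in (("months", 30 * 86400), ("days", 86400), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.1f} seconds"


def human_scale(ns: float) -> str:
    """Pretend that one nanosecond lasts one second."""
    return humanize(float(ns))


if __name__ == "__main__":
    for name, ns in LATENCY_NS.items():
        print(f"{name:35} {human_scale(ns)}")
```

With these assumed inputs the cache fetches come out in seconds, the memory fetch in minutes, and the disk read in months, which is the same shape as the slide's rounded figures.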

Page 11: Introduction to NoSQL

Virtualization

[Diagram: a hypervisor dividing one machine's CPU and RAM among guest operating systems (Windows 7, RedHat Linux, Windows Server, Ubuntu Linux, Windows Vista, Windows 8, CentOS Linux), each running its own apps]

Page 12: Introduction to NoSQL

Now Imagine Lots

Page 13: Introduction to NoSQL

Cloud Computing

• Takes virtualization to the extreme
• Amazon has over 500,000 physical servers
• Microsoft has over 1 million physical servers
• Google has over 2.3 million physical servers
• Anyone with a credit card can start hundreds of servers in a matter of minutes

Page 14: Introduction to NoSQL

Traditional Data Approaches

• Filter – Store – Distribute
  – Encyclopedias
  – Newspapers
  – Libraries
  – Banking
• Why?
  – Storage caps
  – Bandwidth caps

Page 15: Introduction to NoSQL

The AWS Disruption - 2006

• Cost per GB/Month for Stable Storage
  – From ~$5 per GB down to .15 cent per GB
• Unlimited Storage
• Purchased in GB chunks
• Pay only for what you use

Page 16: Introduction to NoSQL

What did this mean?

• No CAPEX
• No Data Centre
• Availability of Google Scale
• Utility Pricing

Filter-Store-Distribute
Store-Filter-Distribute

Page 17: Introduction to NoSQL

Store Everything Is A Challenge

• SQL Databases – depend on a pre-filter
  – Assume monolithic memory
  – Assume single disk farm
  – Hard to partition
  – Based on 1970’s storage assumptions

Page 18: Introduction to NoSQL

It makes development hard

Relational Database
Object Relational Mapping
Application

Code
XML Config
DB Schema

Page 19: Introduction to NoSQL

NoSQL Scales Better

[Two charts compared side by side ("Vs."), each plotting Price against Scale]

Page 20: Introduction to NoSQL

Why NoSQL

Velocity
Variety
Volume

Page 21: Introduction to NoSQL

Unlock Your Big Data

Page 22: Introduction to NoSQL

What is NoSQL?

• Goal of a Database
  – Data Durability
  – Consistent performance
  – Graceful degradation under load
• Big Data Workloads require distributed computing
  – Transactions
  – Joins

Page 23: Introduction to NoSQL

Distributed Computing

[Diagram: a Master node above five compute nodes, each with its own disk]

Page 24: Introduction to NoSQL

CAP Theorem

Consistency

Partition Tolerance

Availability

Page 25: Introduction to NoSQL

CAP Theorem

• Consistency
  – All nodes see the same data at the same time
• Availability
  – A guarantee that every request receives a response about whether it succeeded or failed
• Partition tolerance
  – The system continues to operate despite arbitrary partitioning due to network failures
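The trade-off between these three properties can be sketched with a toy model. The code below is an illustration under stated assumptions, not a description of how any real database behaves: `TwoNodeStore` is a hypothetical class holding two replicas, and the "CP" and "AP" modes are simplified labels for "refuse writes during a partition" and "accept writes and let replicas diverge".

```python
class TwoNodeStore:
    """Two replicas joined by a network link that can be cut (toy model)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = {"a": {}, "b": {}}
        self.partitioned = False  # set to True to simulate a network failure

    def write(self, node, key, value):
        """Return True if the write was accepted, False if it was refused."""
        if not self.partitioned:
            # Healthy network: every replica receives the write.
            for data in self.replicas.values():
                data[key] = value
            return True
        if self.mode == "CP":
            # Stay consistent by giving up availability.
            return False
        # AP: stay available, accept the write locally, replicas now differ.
        self.replicas[node][key] = value
        return True

    def read(self, node, key):
        return self.replicas[node].get(key)


if __name__ == "__main__":
    for mode in ("CP", "AP"):
        store = TwoNodeStore(mode)
        store.write("a", "x", 1)
        store.partitioned = True
        accepted = store.write("a", "x", 2)
        print(mode, accepted, store.read("a", "x"), store.read("b", "x"))
```

In the AP case the two replicas hold different values once the link is cut, and something has to reconcile them after the partition heals.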

Page 26: Introduction to NoSQL

CAP – MongoDB Style

Consistency

Partition Tolerance

Availability

Page 27: Introduction to NoSQL

CAP – I Want Availability As Well

Consistency

Partition Tolerance

Availability

Page 28: Introduction to NoSQL

Be Careful What You Wish For

• Even Google thinks Eventual Consistency Sucks
• Reconciliation is a problem

Page 29: Introduction to NoSQL

But What About CA?

Consistency

Partition Tolerance

Availability

Page 30: Introduction to NoSQL

Three Common Types of NoSQL

• Key Value Stores
• Column Stores
• Document Stores
• Note
  – Lots of Hybrids
  – Lots of NewSQL vendors
  – Some niche Graph Stores

Page 31: Introduction to NoSQL

Key Value Stores

• Maps keys to values
• Single Index
• Very fast
• Think Memcache on steroids
• Great for shopping carts, user profiles
• Inefficient to do aggregate queries, “all the carts worth $100 or more”
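A minimal sketch of why lookups are fast and aggregates are not, using a plain Python dict as a stand-in for a key value store. The cart keys, users, and totals are made-up sample data, and `get_cart` and `carts_worth_at_least` are hypothetical helper names.

```python
# A dict stands in for the store: a single index, keyed by cart id.
carts = {
    "cart:1001": {"user": "joe", "total": 42.50},
    "cart:1002": {"user": "peter", "total": 130.00},
    "cart:1003": {"user": "phil", "total": 99.99},
}


def get_cart(key):
    """Lookup by key: one hash probe, however many carts exist."""
    return carts.get(key)


def carts_worth_at_least(minimum):
    """Aggregate query: no index on 'total', so every value must be scanned."""
    return sorted(key for key, cart in carts.items() if cart["total"] >= minimum)


if __name__ == "__main__":
    print(get_cart("cart:1002"))
    print(carts_worth_at_least(100))
```

The first function touches one entry; the second touches all of them, which is the cost the last bullet refers to.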

Page 32: Introduction to NoSQL

Column Stores

• Store data as columns rather than rows
• Efficient to do column ordered operations
• Not so great at row based queries
• A quick recap…

Page 33: Introduction to NoSQL

Relational/Row Order Databases

• Materialise storage as rows

ID  Name     Salary  Start Date
1   Joe D    $24000  1/Jun/1970
2   Peter J  $28000  1/Feb/1972
3   Phil G   $23000  1/Jan/1973

1 Joe D 2400 1/Jun/1970 1 Joe D 2400 1/Jun/1970 1 Joe D 2400 1/Jun/1970

Page 34: Introduction to NoSQL

Column Databases

• Materialise data as columns

ID  Name     Salary  Start Date
1   Joe D    $24000  1/Jun/1970
2   Peter J  $28000  1/Feb/1972
3   Phil G   $23000  1/Jan/1973

1           2           3
Joe D       Peter J     Phil G
24000       28000       23000
1/Jun/1970  1/Feb/1972  1/Jan/1973

Page 35: Introduction to NoSQL

Pros and Cons

• Relational : Good For
  – Queries that return small subsets of rows
  – Queries that use a large subset of row data
  – E.g. find all employee data for employees with salary > 12000
• Column : Good For
  – Queries that require just a column of data
  – Queries that require a small subset of row data
  – E.g. give me the total salary outlay for all staff
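The two layouts can be sketched in a few lines of Python using the employee table from the earlier slides. This is only an in-memory illustration of the access patterns; real engines add storage formats, compression, and indexes. The function names are hypothetical.

```python
# Row order: each record is stored together.
rows = [
    (1, "Joe D", 24000, "1/Jun/1970"),
    (2, "Peter J", 28000, "1/Feb/1972"),
    (3, "Phil G", 23000, "1/Jan/1973"),
]

# Column order: each attribute is stored together.
ids, names, salaries, start_dates = (list(column) for column in zip(*rows))
columns = {"id": ids, "name": names, "salary": salaries, "start_date": start_dates}


def employees_earning_over(threshold):
    """Row-friendly query: returns whole records for a subset of rows."""
    return [row for row in rows if row[2] > threshold]


def total_salary():
    """Column-friendly query: reads one column and ignores the rest."""
    return sum(columns["salary"])


if __name__ == "__main__":
    print(employees_earning_over(25000))
    print(total_salary())
```

The row query needs every field of the matching records, so keeping records together helps. The salary total needs one field of every record, so keeping that field together helps.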

Page 36: Introduction to NoSQL

Document Databases

• Store JavaScript Documents
  – JSON = JavaScript Object Notation
  – An associative array
  – Key value pairs
  – Values can be documents or arrays
  – Arrays can contain documents
• Data is implicitly denormalised

Page 37: Introduction to NoSQL

Documents are easier

Relational

Document DB

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}

Page 38: Introduction to NoSQL

Document DBs are full featured

MongoDB

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}

Rich Queries
• Find Paul’s cars
• Find everybody who owns a car built between 1970 and 1980

Geospatial
• Find all of the car owners in London

Text Search
• Find all the cars described as having leather seats

Aggregation
• What’s the average value of Paul’s car collection

Map Reduce
• For each make and model of car, how many exist in the world?
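To make a few of these queries concrete, here is a sketch in plain Python over in-memory documents. It is a stand-in for the idea, not MongoDB's query API. The first document is the one from the slide with the elided fields left out; the second owner is a hypothetical addition so the queries have something to filter; all function names are assumptions.

```python
people = [
    {
        "first_name": "Paul",
        "surname": "Miller",
        "city": "London",
        "location": [45.123, 47.232],
        "cars": [
            {"model": "Bentley", "year": 1973, "value": 100000},
            {"model": "Rolls Royce", "year": 1965, "value": 330000},
        ],
    },
    {
        # Hypothetical second owner, added for illustration only.
        "first_name": "Ann",
        "surname": "Byrne",
        "city": "Dublin",
        "location": [53.35, -6.26],
        "cars": [{"model": "Mini", "year": 1978, "value": 15000}],
    },
]


def cars_of(first_name):
    """Rich query: the nested cars array travels with its owner."""
    return [car for p in people if p["first_name"] == first_name for car in p["cars"]]


def owners_with_car_built_between(start, end):
    """Rich query: match on a field inside an array of sub-documents."""
    return [
        p["first_name"]
        for p in people
        if any(start <= car["year"] <= end for car in p["cars"])
    ]


def average_car_value(first_name):
    """Aggregation: average over one owner's embedded cars (None if no cars)."""
    values = [car["value"] for car in cars_of(first_name)]
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    print([car["model"] for car in cars_of("Paul")])
    print(owners_with_car_built_between(1970, 1980))
    print(average_car_value("Paul"))
```

Because the cars are embedded in the owner's document, none of these queries needs a join.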

Page 39: Introduction to NoSQL

Where Does Hadoop Fit?

• Hadoop is a Map/Reduce Framework
• Used to partition computation on large datasets
• Used where you need to analyse most of the data
• E.g.
  – Count all the links on all the web pages in Ireland
  – Calculate the overnight interest on every account
  – Analyse the recommendations based on yesterday’s purchases
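The link-counting example can be sketched as a single-process map, shuffle, and reduce in Python. This shows the shape of the computation only; it does not use Hadoop's API, and a real framework would run the map and reduce steps in parallel across machines. The page contents and function names are made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical crawl results: URL -> HTML.
pages = {
    "a.ie": '<a href="/x">x</a> <a href="/y">y</a>',
    "b.ie": '<a href="/z">z</a>',
    "c.ie": "no links here",
}


def map_page(url, html):
    """Map step: emit (url, 1) for every anchor tag found on one page."""
    return [(url, 1) for _ in re.findall(r"<a\s", html)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: collapse each key's values into one total."""
    return key, sum(values)


def count_links(pages):
    pairs = [pair for url, html in pages.items() for pair in map_page(url, html)]
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    per_page = count_links(pages)
    print(per_page, sum(per_page.values()))
```

Each page is mapped independently of the others, which is what lets the work be partitioned across a large dataset.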

Page 40: Introduction to NoSQL

A Mature NoSQL Model

Low Latency, High Performance
General Purpose NoSQL Database
Hadoop Analytics

Front End | Middle Tier | Back End

Page 41: Introduction to NoSQL

Conclusions

• Great technical transition of our generation
• Everyone will have a NoSQL deployment
• Right now it sits alongside Relational
• In the future it will replace Relational
• It’s all Open Source
• Ask me about it after

