
NoSQL overview #phptostart turin 11.07.2011

Date post: 15-May-2015
Upload: david-funaro
Description:
Overview of NoSQL databases
Transcript
Page 1: NoSQL overview #phptostart turin 11.07.2011

NoSQL

David Funaro

Torino, 11 luglio 2011

PHP.TO.START

Page 2: NoSQL overview #phptostart turin 11.07.2011

About me

• Software engineer

• PHP developer (2002)

• Symfony Framework developer (2009)

• Mobile developer (iOS / Symbian)

• Senior developer @ dnsee

• Founder of the PHP User Group Rome

• Open Source contributor

Page 3: NoSQL overview #phptostart turin 11.07.2011

Database - logical model

• RDBMS

• NoSQL

• Other

Page 4: NoSQL overview #phptostart turin 11.07.2011

Relational DB

• Born in the 1970s

• SQL, relational algebra and set theory

• Excellent for management applications (accounting, reservations, staff management)

Page 5: NoSQL overview #phptostart turin 11.07.2011

ACID

Transactions behave correctly only if the database satisfies these four properties:

• Atomicity

• Consistency

• Isolation

• Durability
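
Atomicity, the first of the four properties, can be illustrated with a minimal sketch. It assumes Python's standard sqlite3 module and a hypothetical accounts table (neither appears in the slides): a transfer whose second step fails leaves no trace of its first step.

```python
import sqlite3

# In-memory database with a hypothetical "accounts" table (not from the slides).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates are applied or neither is."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit to dst was rolled back too.
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After both calls the balances reflect only the first transfer: the credit to bob in the failed transfer is undone together with the rejected debit.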


Page 8: NoSQL overview #phptostart turin 11.07.2011

Database - logical model

• RDBMS

• NoSQL: Key Value, Document Oriented, Column Oriented, Graph DB


Page 10: NoSQL overview #phptostart turin 11.07.2011

NoSQL !=

Not Only SQL

One size fits all

Page 11: NoSQL overview #phptostart turin 11.07.2011

Historical Intro

The concept of the “non-relational database” is older than the relational model, but it has been revived and improved.

The technology comes back.


Page 15: NoSQL overview #phptostart turin 11.07.2011

New Requirements

Mid-1990s

With the new internet-based systems, the consistency and security of data are no longer enough.

The new need is high availability.

Page 16: NoSQL overview #phptostart turin 11.07.2011

Google

• distributed storage system

• data sizes that scale up to petabytes

Wide applicability

Scalability

High performance

High availability

Page 17: NoSQL overview #phptostart turin 11.07.2011

Google BigTable

• Web indexing

• Google Earth

• Google Finance

• Orkut

• Custom Search

• Google Docs

Column-Oriented DB

Page 18: NoSQL overview #phptostart turin 11.07.2011

Amazon

• Relational model doesn’t fit requirements

• Tens of thousands of servers around the world

• Tens of millions of customers

High reliability, high scale

Page 19: NoSQL overview #phptostart turin 11.07.2011

Amazon Dynamo

• High Reliability

• High Scale

Key-Value Store Database

Page 20: NoSQL overview #phptostart turin 11.07.2011

New Trends

Page 21: NoSQL overview #phptostart turin 11.07.2011

Web Company

• Startup with explosive growth:

• DBMS open source

• v1.0: a single node, which soon becomes inadequate

• next version:

• Horizontal Partitioning (sharding)

• implement the node routing inside the application logic
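
Routing inside the application logic can be sketched as follows. This is only an illustration, assuming hash-modulo routing and hypothetical node names; the slides do not say which routing scheme such a company would use.

```python
import hashlib

# Hypothetical node names; a real deployment would hold connection settings here.
NODES = ["db-node-0", "db-node-1", "db-node-2"]


def shard_for(key, nodes=NODES):
    """Pick the node that stores `key` using a stable hash modulo the node count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# The application, not the database, decides where each row lives.
node_for_user = shard_for("user_13")

# Adding a fourth node changes the modulus, so many keys map to a new node.
keys = [f"user_{i}" for i in range(1000)]
bigger_cluster = NODES + ["db-node-3"]
moved = sum(1 for k in keys if shard_for(k) != shard_for(k, bigger_cluster))
```

With modulo routing, growing from three to four nodes relocates about three quarters of the keys on average, which is one reason redistributing data in a hand-sharded setup is painful.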

Page 22: NoSQL overview #phptostart turin 11.07.2011

Web Company

• Re-implement inter-node query

• Handle inter-node transaction

• Node failures become increasingly likely: less reliability, less availability

• Restructuring and redistributing “hot” data becomes hard

Page 23: NoSQL overview #phptostart turin 11.07.2011

Solution

• Scalability: very simple operations, but on many nodes

• Performance: low latency

• Productivity

• Flexibility (data structure)

• Ability to distribute data across many nodes

These are the needs of a web application.

Page 24: NoSQL overview #phptostart turin 11.07.2011

Compromise

• Giving up SQL

• Less strict transactions

Page 25: NoSQL overview #phptostart turin 11.07.2011

Query Language

Give up a standard query language such as SQL and adopt the query language of the selected product:

• SQL-like

• map-reduce

• SPARQL

• ...
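
As an illustration of the map-reduce style of querying, here is a minimal single-process sketch in plain Python. The user documents and the helper names are hypothetical, and a real product would run the map and reduce steps across many nodes.

```python
from collections import defaultdict


def map_phase(docs, mapper):
    """Apply the mapper to every document and collect its (key, value) pairs."""
    for doc in docs:
        yield from mapper(doc)


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each group of values into a single result per key."""
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical documents, used only for this example.
users = [
    {"name": "david", "city": "Rome"},
    {"name": "ale", "city": "Rome"},
    {"name": "marco", "city": "Turin"},
]


def emit_city(doc):
    yield doc["city"], 1


def count(key, values):
    return sum(values)


users_per_city = reduce_phase(shuffle(map_phase(users, emit_city)), count)
```

The query is roughly what `SELECT city, COUNT(*) ... GROUP BY city` expresses in SQL, written as two small functions instead of a declarative statement.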

Page 26: NoSQL overview #phptostart turin 11.07.2011

CAP Theorem (2000)

Eric Brewer

• Consistency

• Availability

• Partition Tolerance

It is impossible to have all three at the same time in a distributed system. You have to choose two.

Page 27: NoSQL overview #phptostart turin 11.07.2011

Consistency

[Figure: nodes N1, N2, N4, N5 and N6, each shown at time tk]

• Strong: after the update completes, any subsequent access will return the updated value.

• Weak: the system does not guarantee that subsequent accesses will return the updated value.

• Eventual: the storage system guarantees that, if no new updates are made to the object, eventually (after the inconsistency window closes) all accesses will return the last updated value.
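
The inconsistency window can be made concrete with a toy simulation. This is a sketch under simplifying assumptions (three in-process replicas, writes acknowledged after one replica, propagation triggered by hand) and does not model any specific product.

```python
class EventuallyConsistentStore:
    """Toy model: a write lands on one replica and reaches the others later."""

    def __init__(self, n_replicas=3):
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = []  # (replica_index, key, value) updates not yet applied

    def write(self, key, value):
        # The write is acknowledged as soon as the first replica has it.
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica_index):
        return self.replicas[replica_index].get(key)

    def propagate(self):
        """Apply all queued updates, closing the inconsistency window."""
        for index, key, value in self.pending:
            self.replicas[index][key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("name", "david")

stale = store.read("name", replica_index=2)    # inside the window: not updated yet
store.propagate()
fresh = store.read("name", replica_index=2)    # after the window closes
```

A strongly consistent store would not acknowledge the write until every replica, or a quorum that reads must also consult, had applied it.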


Page 29: NoSQL overview #phptostart turin 11.07.2011

Facebook Cassandra

• Key-Value store

• Data model: taken from BigTable

• Infrastructure: taken from Amazon Dynamo

• Eventual Consistency

• High Availability

Page 30: NoSQL overview #phptostart turin 11.07.2011

Search for the Best Solution

Just find the right way to manage your data set

Page 31: NoSQL overview #phptostart turin 11.07.2011

Technology Focus

[Figure: context, purpose, cost of implementation]


Page 36: NoSQL overview #phptostart turin 11.07.2011

Choose the right bike to climb the mountain

Know the available tools

Page 37: NoSQL overview #phptostart turin 11.07.2011

NOSql Families

Page 38: NoSQL overview #phptostart turin 11.07.2011

Key Value Store

One Key -> One Value

It is like a hash (an associative array).

The database knows the type of the key (integer, float, ...), but nothing about the value.

Very fast.

‘name’ => ‘david’ (key => value)
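
A minimal in-memory sketch of the idea follows, with a hypothetical KeyValueStore class that is not the API of any product named in this talk. The store looks keys up in a hash table and treats every value as opaque bytes.

```python
class KeyValueStore:
    """Toy key-value store: one key maps to one opaque value."""

    def __init__(self):
        self._data = {}  # a hash table, as the slide suggests

    def set(self, key, value):
        # The store never inspects the value; it only requires raw bytes.
        if not isinstance(value, bytes):
            raise TypeError("value must be bytes")
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        """Remove the key and report whether it existed."""
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.set("name", b"david")
value = store.get("name")
missing = store.get("surname")
```

Because the value is opaque, any query other than "give me the value for this key" has to be implemented by the application.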

Page 39: NoSQL overview #phptostart turin 11.07.2011

Key Value Store

• Redis

• Memcached

• Dynamo

• Voldemort

• Performance: high

• Scalability: high

• Flexibility: high

• Complexity: none

• Functionality: variable (none)

Page 40: NoSQL overview #phptostart turin 11.07.2011

Document Oriented

• key -> document

• structured document

• schema-less

user_13 => { name: ‘david’, surname: ‘funaro’, age: ’18’, mail: { home: ‘[email protected]’, office: ‘[email protected]’ } }

(key => document)
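
Unlike a key-value store, a document store can look inside the value. A minimal sketch, assuming a hypothetical DocumentStore class, dotted field paths, and placeholder e-mail addresses (the addresses on the slide are not recoverable):

```python
def get_path(document, path):
    """Follow a dotted path such as 'mail.home' through nested dictionaries."""
    current = document
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


class DocumentStore:
    """Toy document store: key -> structured, schema-less document."""

    def __init__(self):
        self._docs = {}

    def put(self, key, document):
        self._docs[key] = document

    def get(self, key):
        return self._docs.get(key)

    def find(self, path, value):
        """Return the keys of all documents whose field at `path` equals `value`."""
        return sorted(k for k, d in self._docs.items() if get_path(d, path) == value)


store = DocumentStore()
store.put("user_13", {
    "name": "david",
    "surname": "funaro",
    "mail": {"home": "home@example.com", "office": "office@example.com"},  # placeholders
})
store.put("user_14", {"name": "marco"})  # no mail field at all: schema-less

by_mail = store.find("mail.home", "home@example.com")
by_name = store.find("name", "marco")
```

The second document omits the mail field entirely, and the query over the nested field still works.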

Page 41: NoSQL overview #phptostart turin 11.07.2011

Document Oriented

• Performance: high

• Scalability: variable (high)

• Flexibility: high

• Complexity: low

• Functionality: variable (low)

Page 42: NoSQL overview #phptostart turin 11.07.2011

Graph DB

• Composed of vertices and edges

• Vertices are connected by edges

• An edge has a label and a direction

• Edges and vertices have properties
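
The four points above describe what is often called a property graph. A minimal sketch in Python, with hypothetical class names and sample data loosely based on the names used in this talk:

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    vertex_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str   # edges are directed: source -> target
    target: str
    label: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.vertices = {}
        self.edges = []

    def add_vertex(self, vertex_id, **properties):
        self.vertices[vertex_id] = Vertex(vertex_id, properties)

    def add_edge(self, source, target, label, **properties):
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, target, label, properties))

    def out_neighbors(self, vertex_id, label):
        """Targets of the outgoing edges of `vertex_id` that carry `label`."""
        return [e.target for e in self.edges
                if e.source == vertex_id and e.label == label]


graph = PropertyGraph()
graph.add_vertex("user_1", name="David", surname="Funaro")
graph.add_vertex("user_2")
graph.add_vertex("user_3")
graph.add_vertex("dnsee")
graph.add_edge("user_1", "user_2", "friend")
graph.add_edge("user_1", "user_3", "friend")
graph.add_edge("user_1", "dnsee", "work", role="senior developer")

friends = graph.out_neighbors("user_1", "friend")
```

The scan over all edges keeps the sketch short; a real graph database stores each vertex's edges next to the vertex so that traversal does not depend on the total number of edges.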

Page 43: NoSQL overview #phptostart turin 11.07.2011

Graph DB

[Figure: example graph with the vertices User_1, User_2, User_3, David, Funaro and dnsee, connected by edges labeled friend, name, surname and work]

Page 44: NoSQL overview #phptostart turin 11.07.2011

Graph DB

• Neo4j

• OrientDB

• InfoGrid

• VertexDB

• Performance: variable

• Scalability: variable

• Flexibility: high

• Complexity: high

• Functionality: graph theory

Page 45: NoSQL overview #phptostart turin 11.07.2011

Why NoSQL

Some example cases


Page 47: NoSQL overview #phptostart turin 11.07.2011

A Graph in an RDBMS

Users

id  name    salary
1   ale     200
2   marco   230
3   david   340
4   sergio  349
5   andre   200

Followee

id_1  id_2
2     4
3     1
3     4
3     2
1     5
5     3
5     2

Handled as B-trees

Page 48: NoSQL overview #phptostart turin 11.07.2011

A Graph in an RDBMS

• Look up david’s id: O(log N), where N = number of users

• Look up his K followees: O(log N)

• Get their names: O(K log N)
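
The three lookups correspond to a two-join query. The sketch below uses Python's sqlite3 module with the rows read from the Users and Followee tables of the previous slide. Two things are assumptions: that a row (id_1, id_2) means "id_1 follows id_2", and the index names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary INTEGER);
    CREATE TABLE Followee (id_1 INTEGER REFERENCES Users(id),
                           id_2 INTEGER REFERENCES Users(id));
    -- B-tree indexes: each lookup through them costs O(log N)
    CREATE INDEX idx_users_name ON Users(name);
    CREATE INDEX idx_followee_id_1 ON Followee(id_1);
""")
conn.executemany("INSERT INTO Users VALUES (?, ?, ?)", [
    (1, "ale", 200), (2, "marco", 230), (3, "david", 340),
    (4, "sergio", 349), (5, "andre", 200),
])
# Assumption: (id_1, id_2) means "user id_1 follows user id_2".
conn.executemany("INSERT INTO Followee VALUES (?, ?)", [
    (2, 4), (3, 1), (3, 4), (3, 2), (1, 5), (5, 3), (5, 2),
])


def followee_names(conn, name):
    """Names of the users followed by `name`, via one index lookup per step."""
    rows = conn.execute(
        """
        SELECT followed.name
        FROM Users AS follower
        JOIN Followee ON Followee.id_1 = follower.id
        JOIN Users AS followed ON followed.id = Followee.id_2
        WHERE follower.name = ?
        ORDER BY followed.name
        """,
        (name,),
    )
    return [row[0] for row in rows]


david_follows = followee_names(conn, "david")
```

Each of the K followees needs its own index lookup in Users to fetch the name, which is where the K log N term comes from.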

Page 49: NoSQL overview #phptostart turin 11.07.2011

Graph DB

[Figure: graph with the vertices Marco, Sergio, Andrea, Ale and David]

• Look up David: O(log N)

• Look up his followees: O(K)
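
The same question asked of a graph structure can be sketched with direct references between vertices. The class and function names are hypothetical, the follow relationships mirror the relational example under the same direction assumption, and the start vertex is found through a Python dictionary rather than the index a real graph database would use.

```python
class Person:
    def __init__(self, name):
        self.name = name
        self.followees = []  # direct references to other Person objects


people = {name: Person(name) for name in ["Ale", "Marco", "David", "Sergio", "Andrea"]}

# Assumed direction: (a, b) means "a follows b".
follows = [
    ("Marco", "Sergio"), ("David", "Ale"), ("David", "Sergio"), ("David", "Marco"),
    ("Ale", "Andrea"), ("Andrea", "David"), ("Andrea", "Marco"),
]
for follower, followed in follows:
    people[follower].followees.append(people[followed])


def followee_names(people, name):
    """One lookup to find the start vertex, then K pointer hops to its followees."""
    start = people[name]
    return [person.name for person in start.followees]


david_follows = followee_names(people, "David")
```

Once the start vertex is found, reaching each followee is a single pointer dereference, so the cost of the traversal depends on K and not on the total number of users.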

Page 50: NoSQL overview #phptostart turin 11.07.2011

Benchmark

• 1 million vertices

• 4 million edges

• Scale-free topology

• Postgres vs Neo4j

• Both hash and B-tree indexes

Depth  RDBMS     Graph
1      100ms     30ms
2      1000ms    500ms
3      10000ms   3000ms
4      100000ms  50000ms
5      N/A       100000ms

http://markorodriguez.com/2011/02/18/mysql-vs-neo4j-on-a-large-scale-graph-traversal/

Page 51: NoSQL overview #phptostart turin 11.07.2011

Schema

RDBMS

CREATE TABLE `pma_bookmark` (
  `id` int(11) NOT NULL auto_increment,
  `name` varchar(255) NOT NULL default '',
  `surname` varchar(255) NOT NULL default '',
  `mobile` varchar(255) NOT NULL default '',
  `url` text NOT NULL,
  ...
  `name` varchar(255) NOT NULL default '',
  ...
  `telex` varchar(255) NOT NULL default '',
  `fax` varchar(255) NOT NULL default '',
  `office` text NOT NULL,
  PRIMARY KEY (`id`)
);

NoSQL - Document-oriented

Schema-less

Page 52: NoSQL overview #phptostart turin 11.07.2011

Schema 2

id  name        surname  mobile  url              ...   telex  office   telex   ...
1   david       funaro   3548    davidfunaro.com  null  null   3548631  null    null
2   alessandro  nadalin  3257    null             null  null   32458    5456    null
3   marco       rossi    3548    null             null  null   null     515648  null

Too many values set to NULL

user: { name: david, surname: funaro, mobile: 3454, url: davidfunaro.com, office: 3423423 }

user: { name: alessandro, surname: nadalin, mobile: 6262, office: 342343, telex: 3434 }

user: { name: marco, surname: rossi, telex: 3434 }

Each document has only the required fields
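
The difference can be sketched by turning sparse rows into documents. The rows below are a simplified version of the table (six columns, values taken from the slide), and the helper names are hypothetical.

```python
# Simplified rows: every row carries every column, with None standing in for NULL.
rows = [
    {"name": "david", "surname": "funaro", "mobile": "3548",
     "url": "davidfunaro.com", "office": "3548631", "telex": None},
    {"name": "alessandro", "surname": "nadalin", "mobile": "3257",
     "url": None, "office": "32458", "telex": "5456"},
    {"name": "marco", "surname": "rossi", "mobile": "3548",
     "url": None, "office": None, "telex": "515648"},
]


def to_document(row):
    """Keep only the fields that actually have a value."""
    return {column: value for column, value in row.items() if value is not None}


def count_nulls(rows):
    """Number of cells that exist in the table only to hold NULL."""
    return sum(1 for row in rows for value in row.values() if value is None)


documents = [to_document(row) for row in rows]
null_cells = count_nulls(rows)
```

With only six columns the saving is small; with the dozens of optional columns hinted at by the "..." in the table, most cells of a fixed schema would be NULL.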

Page 53: NoSQL overview #phptostart turin 11.07.2011

Schema less

• Flexibility in handling the fields of the data model

• The model can grow easily

Page 54: NoSQL overview #phptostart turin 11.07.2011

Performance

====== SET ======
  100007 requests completed in 0.88 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

====== GET ======
  100000 requests completed in 1.23 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

http://redis.io/topics/benchmarks

http://research.yahoo.com/files/ycsb-v4.pdf
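
The figures above are easier to compare as throughput. A small worked calculation, using only the request counts and durations quoted in the benchmark output (the function name is just for this example):

```python
def requests_per_second(requests, seconds):
    """Average throughput over the whole benchmark run."""
    return requests / seconds


set_rps = requests_per_second(100007, 0.88)
get_rps = requests_per_second(100000, 1.23)

print(f"SET: {set_rps:,.0f} requests/s")
print(f"GET: {get_rps:,.0f} requests/s")
```

That is roughly 114,000 SET and 81,000 GET operations per second for this run, with 50 parallel clients and a 3-byte payload.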

Page 55: NoSQL overview #phptostart turin 11.07.2011

NOSql for PHP

✓Redis

✓MongoDB

✓CouchDB

✓Cassandra

✓Memcached

✴OrientDB

Page 56: NoSQL overview #phptostart turin 11.07.2011

OrientDB library for PHP

https://github.com/congow/Orient

A set of tools to use and manage any OrientDB instance from PHP.

Orient includes:

• the HTTP protocol binding

• the query builder

• the data mapper (Object Graph Mapper)

Page 57: NoSQL overview #phptostart turin 11.07.2011

Thanks

• David Funaro

• http://davidfunaro.com

• @ingdavidino

[email protected]

Page 58: NoSQL overview #phptostart turin 11.07.2011

credits

http://www.slideshare.net/ClaudioMartella/presentation-7398682?from=ss_embed

http://www.slideshare.net/harrikauhanen/nosql-3376398

http://www.slideshare.net/ingdavidino/cmf-a-pain-in-the-f-phpday-05142011

http://it.wikipedia.org/wiki/Modello_relazionale

http://www.slideshare.net/gabriele.lana/nosql-7405964

http://blog.indigenidigitali.com/l-ecosistema-nosql/

http://www.dia.uniroma3.it/~torlone/bd2/noSQL-1.pdf

http://nosql-database.org/

