OrientDB - the 2nd generation of (Multi-Model) NoSQL

Post on 16-Jul-2015

Luigi Dell’Aquila

Director of Consulting

Orient Technologies LTD

Twitter: @ldellaquila

http://www.orientdb.com

OrientDB - the 2nd generation of (Multi-Model) NoSQL

And why GraphDBs are the starting point of this revolution

“90% of the data

in the world today

has been created

in the last two years alone.”

- IBM

Welcome to Big Data

Just Data

Order #134 (Order)
John (Provider)
Commodore Amiga 1200 (Product)
Frank (Customer)
Monitor 40” (Product)
Mouse (Product)
Bruno (Provider)

Data by itself has little value,

it’s the relationship

between data that gives it

incredible value

Relationships give data “meaning”

Order #134 (Order)
John (Provider)
Commodore Amiga 1200 (Product)
Frank (Customer)
Monitor 40” (Product)
Mouse (Product)
Bruno (Provider)

Edge labels in the diagram: Sells, Has, Makes

Top NoSQL categories

Key/Value Databases

Document Databases

Graph Databases

Column Databases

Why do most NoSQL products

avoid

managing relationships?

Customer
ID  Name
10  John
11  John
24  Mike
28  Mike

CustomerAddress
ID  Address
10  24
10  33
32  44

Address
ID  Location
24  Milan
33  London
18  Paris
18  Madrid
44  Moscow

Is this

familiar?
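The three tables above can be reproduced with Python's built-in sqlite3 module to show what crossing the relationship looks like in SQL. This is only a sketch: it assumes that CustomerAddress.ID holds the customer ID and that CustomerAddress.Address refers to Address.ID, which the slide does not state explicitly.

```python
import sqlite3

# In-memory database holding the three tables from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (id INTEGER, name TEXT);
    CREATE TABLE CustomerAddress (id INTEGER, address INTEGER);
    CREATE TABLE Address (id INTEGER, location TEXT);
""")
conn.executemany("INSERT INTO Customer VALUES (?, ?)",
                 [(10, "John"), (11, "John"), (24, "Mike"), (28, "Mike")])
conn.executemany("INSERT INTO CustomerAddress VALUES (?, ?)",
                 [(10, 24), (10, 33), (32, 44)])
conn.executemany("INSERT INTO Address VALUES (?, ?)",
                 [(24, "Milan"), (33, "London"), (18, "Paris"),
                  (18, "Madrid"), (44, "Moscow")])

# Going from a customer to its locations takes two JOINs
# (assumed column meaning: CustomerAddress.id -> Customer.id,
#  CustomerAddress.address -> Address.id).
rows = conn.execute("""
    SELECT c.id, c.name, a.location
    FROM Customer c
    JOIN CustomerAddress ca ON ca.id = c.id
    JOIN Address a ON a.id = ca.address
    ORDER BY a.location
""").fetchall()
print(rows)
```

Only customer 10 has matching rows in both other tables, so the result contains two rows for John.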

What’s wrong with JOIN?

Imagine an Address Book

where we want to find Luigi’s phone number

Index Lookup: how does it work?

Index algorithms are all similar and based on balanced trees

A-Z
  A-L
    A-D
      A-B
      C-D
    E-L
      E-G
        E-F
        G
      H-L
        H-J
        K-L  <- Luigi
  M-Z
    M-R
    S-Z

Found! This lookup took 5 steps. With millions of indexed records the tree gets deeper (depth grows as O(log N)), and every lookup pays that cost again.
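The address book descent can be sketched as a binary search that counts its steps. Binary search over a sorted list stands in here for a balanced tree; real database indexes (B-trees, for example) use wider nodes, so they have fewer levels, but the step count still grows with the logarithm of the number of keys. The function name lookup_steps is made up for this illustration.

```python
import math

def lookup_steps(sorted_keys, target):
    """Binary search over sorted keys; returns (found, number_of_steps)."""
    lo, hi = 0, len(sorted_keys) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] == target:
            return True, steps
        if sorted_keys[mid] < target:
            lo = mid + 1   # keep searching the upper half
        else:
            hi = mid - 1   # keep searching the lower half
    return False, steps

for n in (1_000, 1_000_000):
    keys = list(range(n))
    found, steps = lookup_steps(keys, n - 1)
    # The step count never exceeds ceil(log2(n + 1)).
    print(n, found, steps, math.ceil(math.log2(n + 1)))
```

Going from a thousand to a million keys roughly doubles the number of steps. The cost per lookup grows slowly, but it is paid on every lookup.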

Joins Kill Performance

Joins are executed every time

you cross relationships

Querying millions of records and joining 3-4 tables could generate billions of combinations

This is why the database

query performance

suffers as the database

increases in size
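To make the "combinations" concrete, here is a naive nested-loop join over the small Customer, CustomerAddress and Address tables shown earlier. It is a worst-case sketch: real engines use indexes and query planners and rarely evaluate the full product. It also assumes that CustomerAddress.ID is the customer ID and that CustomerAddress.Address refers to Address.ID.

```python
from itertools import product

customer = [(10, "John"), (11, "John"), (24, "Mike"), (28, "Mike")]
customer_address = [(10, 24), (10, 33), (32, 44)]
address = [(24, "Milan"), (33, "London"), (18, "Paris"),
           (18, "Madrid"), (44, "Moscow")]

combinations = 0
matches = []
# Naive join: test every (customer, link, address) triple.
for c, ca, a in product(customer, customer_address, address):
    combinations += 1
    if c[0] == ca[0] and ca[1] == a[0]:
        matches.append((c[1], a[1]))

print(combinations, matches)

# Same arithmetic for three tables of one million rows each.
worst_case = 1_000_000 ** 3
print(worst_case)
```

Even on these tiny tables, 4 x 3 x 5 = 60 combinations are examined to produce two matching rows.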

O(Log N)

RDBMS performance on traversal

In a world that’s becoming

more connected, we need a

better way to store data and

manage relationships

Read: Data is important, but relationships are even more fundamental today

“A graph database is any

storage system

that provides

index-free adjacency”

- Marko Rodriguez (author of TinkerPop Blueprints)

Every developer knows

the Relational Model,

but who knows the

Graph one?

Back to school:

Graph Theory crash course

Basic Graph

Luigi --Visited--> Lyon

Vertices and Edges can have properties

Edges are directed

Property Graph Model*

Luigi (company: OrientTechnologies) --Visited (on: 2015)--> Lyon (people: 500,000)

Vertices and Edges can have properties

* https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model

1-N and N-M Relationships

An Edge connects only 2 vertices

Use multiple edges to represent 1-N and N-M relationships
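The property graph model described above fits in a few lines of Python. This is a minimal in-memory sketch, not the Blueprints API: the Vertex, Edge and connect names are invented for illustration, and the second city (Rome) is a hypothetical addition used only to show a 1-N relationship built from multiple edges.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # identity-based equality, since vertices and edges reference each other
class Vertex:
    name: str
    properties: dict = field(default_factory=dict)
    out_edges: list = field(default_factory=list)
    in_edges: list = field(default_factory=list)

@dataclass(eq=False)
class Edge:
    label: str
    source: Vertex   # an edge is directed: source -> target
    target: Vertex   # and connects exactly two vertices
    properties: dict = field(default_factory=dict)

def connect(source, label, target, **properties):
    """Create one directed edge and register it on both vertices."""
    edge = Edge(label, source, target, properties)
    source.out_edges.append(edge)
    target.in_edges.append(edge)
    return edge

luigi = Vertex("Luigi", {"company": "OrientTechnologies"})
lyon = Vertex("Lyon", {"people": 500_000})
rome = Vertex("Rome")  # hypothetical second city, not in the slides

visited = connect(luigi, "Visited", lyon, on=2015)
connect(luigi, "Visited", rome)  # 1-N: one more edge per extra target

destinations = [e.target.name for e in luigi.out_edges if e.label == "Visited"]
print(destinations)
```

An N-M relationship works the same way: several vertices each hold several edges, and no join table is needed.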

Congrats! This is your diploma in

«Graph Theory»

Graph theory is so simple, yet so powerful

How does a true*

Graph Database

manage relationships?

*a “Graph” layer on top of a DBMS doesn’t qualify as a true GraphDB

Each element in the Graph has its own immutable Record ID

Luigi (Vertex) #13:55 --(Edge) #22:11--> Lyon (Vertex) #15:99

Connections use persistent pointers

A Graph Database creates the

relationship just once

(when the edge is created)

VS

RDBMS computes the

relationship every time

you query a database
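A rough way to picture "the relationship is created just once" is a store where every record sits at a fixed position and links are saved as Record IDs. The sketch below is an illustration only, not OrientDB's storage engine: the cluster lists, the slot count and the field names (out_Visited, in_Visited, out, in) are assumptions, and so is the mapping of #13:55 to Luigi, #15:99 to Lyon and #22:11 to the edge.

```python
from typing import NamedTuple

class RID(NamedTuple):
    cluster: int
    position: int

# Hypothetical storage: one fixed-size list of slots per cluster.
SLOTS = 100
clusters = {13: [None] * SLOTS, 15: [None] * SLOTS, 22: [None] * SLOTS}

def store(rid, record):
    clusters[rid.cluster][rid.position] = record

def load(rid):
    """Direct positional access: no search through the other records."""
    return clusters[rid.cluster][rid.position]

luigi, lyon, visited = RID(13, 55), RID(15, 99), RID(22, 11)

# The pointers are written once, when the edge is created.
store(luigi, {"name": "Luigi", "out_Visited": [visited]})
store(lyon, {"name": "Lyon", "in_Visited": [visited]})
store(visited, {"label": "Visited", "out": luigi, "in": lyon})

def traverse_out(vertex_rid, label):
    """Follow stored pointers: vertex -> edge record -> target vertex."""
    names = []
    for edge_rid in load(vertex_rid).get("out_" + label, []):
        edge = load(edge_rid)
        names.append(load(edge["in"])["name"])
    return names

print(traverse_out(luigi, "Visited"))
```

Each hop is a fixed number of positional reads, so its cost does not depend on how many other records the clusters hold.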

When you move from an RDBMS to a Graph Database, each traversal step goes from O(log N) to near O(1)

With a Graph Database, the

traversing time is

not affected by database size!

This is huge in the BigData age

Graph Databases Easily Manage Complex Relationships

No costs to traverse relationships:

• Recommendation engines

• Social Applications

• Spatial Apps

• Master Data Management

• Information Clustering

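[Figure: example graph connecting John, genres (Thriller, Comedy), movies (Pulp Fiction, Mr Bean), theaters (Theater A, Theater B, Theater C) and cities (NYC, San José); one edge is labeled "Lives in"]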

GraphDB Database Quadrant

[Chart: Relationship Complexity (vertical axis) vs. Data Complexity (horizontal axis), positioning Relational, Key Value, Column, Graph and Document databases]

These were 1st generation NoSQL

products, where each tool was

only good at a few use cases

1st Generation NoSQL: Scenario

[Diagram: an Application connected to Oracle (RDBMS, Primary DB), Redis or Memcache (Key/Value), MongoDB (DocDB) and Neo4j (GraphDB), with ETL (E) components between the stores]

1st Generation NoSQL: Fact

In > 90% of use cases,

NoSQL products are

used as a second DBMS

1st Generation NoSQL: Problems

- No standard across NoSQL products
- Multiple vendors = multiple skills
- ETL + synchronization code is costly to write and maintain
- Performance and Reliability are hard to predict

2nd Generation NoSQL

is

Multi-Model

What’s a Multi-Model DBMS?

Graph, Document, Object, Key/Value

Multi-Model represents the intersection of multiple models in just one product

- Just one product to learn and maintain
- Just one vendor relationship to manage
- No ETL, no synchronization required
- Performance and Reliability are easy to test from the beginning

Multi-Model domain schema

Vertex classes:
- Actor (name: string, surname: string)
- Customer (inherits from Actor)
- Provider (inherits from Actor)
- Product (name: string, qty: int)
- Order (number: int, date: datetime)

Edge classes:
- Sells (price: decimal)
- Has (price: decimal)
- Makes

Legend: Vertex, Edge, Inherits

Vertices and Edges are Documents

{
  "@rid": "12:382",
  "@class": "Customer",
  "name": "Frank",
  "surname": "Raggio",
  "phone": "+39 33123212",
  "details": {
    "city": "London",
    "tags": "millennial"
  }
}

General purpose solution:

• JSON

• Schema-less

• Schema-full

• Schema-hybrid

• Nested documents

• Rich indexing and querying

• Developer friendly
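The difference between schema-less, schema-full and schema-hybrid in the list above can be sketched with a small validator run over the Customer document from the slide. This is not OrientDB's validation logic: the CUSTOMER_SCHEMA dictionary, the validate function and its strict flag are hypothetical names chosen for the illustration.

```python
import json

document = json.loads("""
{
  "@rid": "12:382",
  "@class": "Customer",
  "name": "Frank",
  "surname": "Raggio",
  "phone": "+39 33123212",
  "details": {"city": "London", "tags": "millennial"}
}
""")

# Hypothetical declared properties for the Customer class.
CUSTOMER_SCHEMA = {"name": str, "surname": str}

def validate(doc, schema, strict=False):
    """Check declared properties; with strict=True also reject undeclared ones."""
    errors = []
    for prop, expected in schema.items():
        if prop not in doc:
            errors.append(f"missing mandatory property: {prop}")
        elif not isinstance(doc[prop], expected):
            errors.append(f"{prop} should be {expected.__name__}")
    if strict:
        for prop in doc:
            # Metadata fields such as @rid and @class are always allowed.
            if prop not in schema and not prop.startswith("@"):
                errors.append(f"undeclared property: {prop}")
    return errors

print(validate(document, {}))                            # schema-less: anything goes
print(validate(document, CUSTOMER_SCHEMA))               # schema-hybrid: extras allowed
print(validate(document, CUSTOMER_SCHEMA, strict=True))  # schema-full: extras rejected
```

In the hybrid case the declared fields are checked, while the nested "details" document and the "phone" field are accepted as free-form data.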

Polymorphic queries

SELECT * FROM Customer  ->  Frank (Customer)
SELECT * FROM Provider  ->  John (Provider), Bruno (Provider)
SELECT * FROM Actor     ->  Frank (Customer), John (Provider), Bruno (Provider)
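Polymorphic queries behave much like an isinstance check in an object-oriented language: selecting from a class also returns records of its subclasses. The sketch below mirrors the three queries with plain Python classes. The select_from helper is a made-up name, and the class hierarchy (Customer and Provider extending Actor) is assumed from the schema slide.

```python
class Actor:
    def __init__(self, name):
        self.name = name

class Customer(Actor):
    pass

class Provider(Actor):
    pass

records = [Provider("John"), Customer("Frank"), Provider("Bruno")]

def select_from(cls):
    """Return the names of all records of cls or of any subclass of cls."""
    return [r.name for r in records if isinstance(r, cls)]

print(select_from(Customer))
print(select_from(Provider))
print(select_from(Actor))
```

Selecting from Actor returns all three records, because both Customer and Provider are Actors.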

Multi-Model complex domains schema

Vertex classes: Band, Genre, Account, MusicTaste, Location

Edge classes: Likes, Performs, Plays

Legend: Vertex, Edge, Inherits

Multi-Model complex domains

Vertices:
- Snow Patrol (Band)
- John (Account)
- Frank (Account)
- Indie (Genre)
- Rock (Genre)
- 123, 1st Street, Austin, TX (Location)

Edges:
- Performs (April 7, 2015, 9pm-11.30pm)
- Likes (four edges)
- Plays

Multi-Model Database Quadrant

[Chart: Relationship Complexity (vertical axis) vs. Data Complexity (horizontal axis), positioning Relational, Key Value, Column, Graph, Document and Multi-Model databases]

Multi-Model Solutions

There are a few DBMSs that claim

to be Multi-Model, but they do not

have a true Graph Engine.

The “Graph” is only a layer on top

of the engine.

Under the hood they do JOINs,

which means traversal time is

affected by database size.

Meet OrientDB

The First Ever Multi-Model

Database Combining Flexibility

of Documents with

Connectedness of Graphs

With a true Graph, Document,

Key/Value and Object Oriented engine

OrientDB features

DEMO

API & Standards

• Support for TinkerPop standard for Graph DB: Gremlin language and Blueprints API
• SQL + extensions for graphs
• JDBC driver to connect any BI tool
• HTTP/JSON support
• Drivers in Java, Node.js, Python, PHP, .NET, Perl, C/C++ and more

Availability and Integrity

• Atomic, Consistent, Isolated and Durable (ACID)

multi-statement transactions

[Diagram: two Master Nodes, each serving clients (C), linked by Multi-master Replication]

Scalability and Performance

• Multi-Master Replication, Sharding and Auto-Discovery to Simplify Ops

• 200k+ TPS on Commodity Hardware

[Diagram: two Master Nodes serving clients (C), plus an Auto-Discovered Node joining the cluster]

Some numbers

A Bright Future

Graph DBMS increased their popularity by 500% within the last 2 years

Document DBMS are the 3rd fastest growing category

Some of Our Customers

Get Started for Free

OrientDB Community Edition is FREE

for any purpose (Apache 2 license)

Udemy Getting Started Training is

★★★★★ and Free

http://www.orientechnologies.com/getting-started

OrientDB Enterprise is Free for

Development

Thank you!

Luigi Dell’Aquila

@ldellaquila

http://www.orientdb.com