Leveraging relations at scale with Neo4j

Date post: 27-Jan-2015
Upload: alberto-perdomo
Description:
An introduction to graphs, graph databases, Neo4j and the Cypher Query language
Transcript
Page 1: Leveraging relations at scale with Neo4j

@albertoperdomo

Leveraging relations at scale with Neo4j

Madrid.rb - July 2013

Alberto Perdomo, GrapheneDB

Page 2: Leveraging relations at scale with Neo4j

About me

๏Co-founder of Aentos

๏Ruby developer

๏GrapheneDB: Neo4j as a Service

Page 4: Leveraging relations at scale with Neo4j

Leonhard Euler, 1736

Page 5: Leveraging relations at scale with Neo4j

Königsberg Bridge Problem

Page 6: Leveraging relations at scale with Neo4j

Euler’s Technique

Page 7: Leveraging relations at scale with Neo4j

Euler’s Technique

Page 8: Leveraging relations at scale with Neo4j

Königsberg Problem Graph

Page 9: Leveraging relations at scale with Neo4j

A little Graph Theory

Page 10: Leveraging relations at scale with Neo4j

The Math

G = (V, E)

V: the set of vertices (nodes); E: the set of edges (relations) between them
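As a rough Ruby sketch of G = (V, E), the Königsberg problem can be written as a set of vertices (land masses) and a list of edges (bridges). The labels A to D and the helper names `degrees` and `euler_walk_possible?` are assumptions for illustration; the slides only show the picture. Euler's observation was that a walk crossing every bridge exactly once requires zero or two vertices of odd degree.

```ruby
require 'set'

# Vertices: the four land masses (A..D are illustrative labels, A is the island).
VERTICES = Set[:A, :B, :C, :D]

# Edges: the seven bridges as unordered pairs. It is a multigraph, so pairs repeat.
EDGES = [
  [:A, :B], [:A, :B],
  [:A, :C], [:A, :C],
  [:A, :D], [:B, :D], [:C, :D]
].freeze

# Degree of a vertex = number of bridge ends touching it.
def degrees(vertices, edges)
  counts = vertices.to_h { |v| [v, 0] }
  edges.each do |a, b|
    counts[a] += 1
    counts[b] += 1
  end
  counts
end

# In a connected graph, an Euler walk needs 0 or 2 vertices of odd degree.
def euler_walk_possible?(vertices, edges)
  odd = degrees(vertices, edges).values.count(&:odd?)
  odd == 0 || odd == 2
end

puts euler_walk_possible?(VERTICES, EDGES)
```

All four land masses have odd degree (5, 3, 3, 3), so the check fails, which matches Euler's negative answer.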

Page 12: Leveraging relations at scale with Neo4j

Undirected Graph

[Figure: three nodes A, B, C (Adam, Michael, John) connected by undirected edges]

Example: Facebook Friendships

Page 13: Leveraging relations at scale with Neo4j

Directed Graph

[Figure: three nodes A, B, C (Adam, Michael, John) connected by directed edges]

Example: Twitter follows

Page 14: Leveraging relations at scale with Neo4j

Weighted Graph

[Figure: Adam and John each connected to "Star Wars: Episode IV" by weighted edges (0.6 and 0.8)]

Example: Movie Ratings

Page 15: Leveraging relations at scale with Neo4j

Labeled Graph

[Figure: nodes A (Adam, labeled user), B (Michael, labeled user) and C (LA Lakers, labeled fan_page), connected by edges labeled friend_of and fan_of]

Example: Facebook friendships + fan pages

Page 16: Leveraging relations at scale with Neo4j

Property Graph

[Figure: three nodes A, B, C carrying properties, joined by the relations "directed", "wrote" and "rated: 0.6"]

๏Type: Cast Member, Name: George Lucas, Born_At: 1944-05-14

๏Type: Movie, Title: Star Wars Episode IV - A New Hope, Release: 1977

๏Type: User, Name: Adam, Age: 34, Country: USA

Example: IMDB
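A minimal in-memory sketch of the property graph idea in Ruby: both nodes and relations carry key/value properties. The `Node` and `Relation` structs and the `outgoing` helper are hypothetical names, and the direction of each relation (Lucas to the movie, Adam to the movie) is an assumption read off the slide's labels.

```ruby
# Nodes and relations both carry a hash of properties.
Node = Struct.new(:id, :properties)
Relation = Struct.new(:type, :from, :to, :properties)

LUCAS = Node.new(1, { type: 'Cast Member', name: 'George Lucas', born_at: '1944-05-14' })
MOVIE = Node.new(2, { type: 'Movie', title: 'Star Wars Episode IV - A New Hope', release: 1977 })
ADAM  = Node.new(3, { type: 'User', name: 'Adam', age: 34, country: 'USA' })

# Relation directions are assumed for the sake of the example.
RELATIONS = [
  Relation.new(:directed, LUCAS, MOVIE, {}),
  Relation.new(:wrote,    LUCAS, MOVIE, {}),
  Relation.new(:rated,    ADAM,  MOVIE, { rating: 0.6 })
].freeze

# Outgoing relations of a node, optionally filtered by relation type.
def outgoing(relations, node, type = nil)
  relations.select { |r| r.from.equal?(node) && (type.nil? || r.type == type) }
end

outgoing(RELATIONS, ADAM, :rated).each do |r|
  puts "#{r.from.properties[:name]} rated #{r.to.properties[:title]}: #{r.properties[:rating]}"
end
```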

Page 18: Leveraging relations at scale with Neo4j

Graph DB: Definition

๏Uses graph as primary data structure

๏Property graph: store data as nodes, relations and properties

Page 19: Leveraging relations at scale with Neo4j

Graph DBs vs other DBs

Page 20: Leveraging relations at scale with Neo4j

A graph can be modeled with almost any technology

Page 21: Leveraging relations at scale with Neo4j

MySQL vs Neo4j

๏1M users

๏Friends of friends for 1K users

Depth   MySQL execution time (s)   Neo4j execution time (s)
2       0.016                      0.010
3       30.267                     0.168
4       1,543.505                  1.359
5       Not finished in 1 hour     2.132

http://www.neotechnology.com/how-much-faster-is-a-graph-database-really/, http://www.manning.com/partner/

Page 22: Leveraging relations at scale with Neo4j

Conventional DBs

๏ Index lookup to find out adjacent nodes

๏Lookup cost depends on the total number of vertices and edges in the DB (global)

Page 23: Leveraging relations at scale with Neo4j

Graph DB: Definition

๏Any system that provides index-free adjacency [1]

๏Linear cost to retrieve adjacent nodes: depends on the number of local neighbours

[1] http://www.slideshare.net/slidarko/problemsolving-using-graph-traversals-searching-scoring-ranking-and-recommendation

Page 24: Leveraging relations at scale with Neo4j

As the DB grows, the cost of a local step remains the same
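The contrast between a global lookup and index-free adjacency can be sketched in Ruby as follows. This is an illustration of the idea only, not how MySQL or Neo4j store data; `neighbours_via_table`, `GraphNode` and `reachable_within` are hypothetical names.

```ruby
# (a) Conventional style: one global edge table. Finding neighbours means
# searching the whole table (an index helps, but still grows with the table).
def neighbours_via_table(edge_table, id)
  edge_table.select { |from, _to| from == id }.map { |_from, to| to }
end

# (b) Index-free adjacency: each node holds direct references to its neighbours,
# so one step costs only as much as that node's own neighbour list.
class GraphNode
  attr_reader :id, :neighbours

  def initialize(id)
    @id = id
    @neighbours = []
  end
end

# Breadth-first expansion to a fixed depth, touching only local neighbours.
# Returns the ids of all nodes reachable within `depth` hops.
def reachable_within(start, depth)
  seen = { start.id => true }
  frontier = [start]
  depth.times do
    frontier = frontier.flat_map(&:neighbours).reject { |n| seen[n.id] }.uniq
    frontier.each { |n| seen[n.id] = true }
  end
  seen.keys - [start.id]
end
```

Adding unrelated nodes elsewhere in the graph makes `neighbours_via_table` scan more rows, while `reachable_within` does the same amount of work as before.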

Page 25: Leveraging relations at scale with Neo4j

Modeling connected data is natural

Page 26: Leveraging relations at scale with Neo4j

When to use a graph?

Page 27: Leveraging relations at scale with Neo4j

High density of relations

Page 28: Leveraging relations at scale with Neo4j

A search engine for relations

Page 29: Leveraging relations at scale with Neo4j

Graph analysis

๏Recommend vertices to user x

๏Search for y given x

๏Score x given its local neighbourhood

๏Rank x relative to y

Page 31: Leveraging relations at scale with Neo4j

Social Graph

A representation of the relationship between people and other people

Page 32: Leveraging relations at scale with Neo4j

Social Graph

๏Facebook

๏Twitter

๏LinkedIn

“Since you have many friends in common, you might know fellow X.”
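The "friends in common" suggestion can be sketched in plain Ruby as a two-hop walk over a friendship graph. The data is made up (Sara is not in the slides) and `suggestions` is a hypothetical helper, not an API of any of the sites listed.

```ruby
# Undirected friendships as an adjacency hash (illustrative data).
FRIENDS = {
  'Adam'    => %w[Michael John],
  'Michael' => %w[Adam John Sara],
  'John'    => %w[Adam Michael Sara],
  'Sara'    => %w[Michael John]
}.freeze

# Suggest people two hops away, ranked by the number of friends in common.
# Returns [name, common_friend_count] pairs, best match first.
def suggestions(friends, person)
  direct = friends.fetch(person, [])
  counts = Hash.new(0)
  direct.each do |friend|
    friends.fetch(friend, []).each do |fof|
      next if fof == person || direct.include?(fof)
      counts[fof] += 1
    end
  end
  counts.sort_by { |name, common| [-common, name] }
end

suggestions(FRIENDS, 'Adam').each do |name, common|
  puts "You might know #{name} (#{common} friends in common)"
end
```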

Page 33: Leveraging relations at scale with Neo4j

Interest Graph

A representation of the relationship between people and things

Page 34: Leveraging relations at scale with Neo4j

Interest Graph

๏Pinterest

๏ Instagram

๏Quora

๏Spotify

“A lot of people who, like you, like x also like y.”

Page 37: Leveraging relations at scale with Neo4j

e-commerce

Upselling

Page 38: Leveraging relations at scale with Neo4j

Recommendations

[Figure: many users bought both the Star Wars I DVD and the Star Wars Trilogy DVD Pack; a user is looking at one of the two products]

“Customers who bought a, also bought b”
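A small Ruby sketch of "customers who bought a also bought b" over a user-to-product graph. The purchase data is invented (including the extra product) and `also_bought` is a hypothetical helper.

```ruby
# Who bought what: a bipartite user -> product graph (made-up data).
PURCHASES = {
  'user1' => ['Star Wars I DVD', 'Star Wars Trilogy DVD Pack'],
  'user2' => ['Star Wars I DVD', 'Star Wars Trilogy DVD Pack'],
  'user3' => ['Star Wars I DVD', 'Indiana Jones DVD'],
  'user4' => ['Indiana Jones DVD']
}.freeze

# Products bought by the same users who bought `product`, most frequent first.
def also_bought(purchases, product)
  counts = Hash.new(0)
  purchases.each_value do |basket|
    next unless basket.include?(product)
    (basket - [product]).each { |other| counts[other] += 1 }
  end
  counts.sort_by { |name, n| [-n, name] }.map(&:first)
end

puts also_bought(PURCHASES, 'Star Wars I DVD').first
```

The walk is product to buyers to their other products, which is the same two-hop shape as the friends-of-friends case.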

Page 39: Leveraging relations at scale with Neo4j

Rank x

๏Rank nodes based on their neighbourhood/network

๏Klout, PageRank
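To make "rank nodes based on their neighbourhood" concrete, here is a simplified PageRank power iteration in Ruby. It is a teaching sketch under stated assumptions (a tiny made-up link graph, no dangling nodes, a fixed iteration count), not the production algorithm of any of the services named above.

```ruby
# Directed links: node => nodes it points to (made-up graph).
LINKS = { a: [:b, :c], b: [:c], c: [:a] }.freeze

# Power iteration for PageRank. Assumes every node has at least one
# outgoing link, which keeps the sketch short.
def pagerank(links, damping: 0.85, iterations: 50)
  n = links.size
  ranks = links.keys.to_h { |k| [k, 1.0 / n] }
  iterations.times do
    # Every node starts each round with the "random jump" share.
    fresh = links.keys.to_h { |k| [k, (1.0 - damping) / n] }
    # Each node splits its damped rank evenly among the nodes it links to.
    links.each do |node, targets|
      share = damping * ranks[node] / targets.size
      targets.each { |t| fresh[t] += share }
    end
    ranks = fresh
  end
  ranks
end

pagerank(LINKS).sort_by { |_, rank| -rank }.each do |node, rank|
  puts format('%s %.3f', node, rank)
end
```

Node c ends up ranked highest because both a and b link to it; b is lowest because it only receives half of a's rank.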

Page 40: Leveraging relations at scale with Neo4j

Geospatial problems

๏Travelling Salesman

๏Route for delivery of parcels

๏Optimize route for duration, distance, traffic flow, etc.

๏Need not be a physical path; example: connecting people

Page 41: Leveraging relations at scale with Neo4j

Recognize patterns

๏Fraud detection

๏Debt compensation systems

๏Text analysis

๏Chain of exchanges

Page 42: Leveraging relations at scale with Neo4j

Visualize connected data

Page 43: Leveraging relations at scale with Neo4j

Your domain model determines what you can do

Page 44: Leveraging relations at scale with Neo4j

Chances are high that your data is a graph

Page 45: Leveraging relations at scale with Neo4j

The Neo4j Graph Database

Page 48: Leveraging relations at scale with Neo4j

Then Add Complexity

Page 49: Leveraging relations at scale with Neo4j

Process, Tips

๏Model facts as nodes

๏Use relations to model relations between facts

๏Refactor freely: Neo4j is schema-less!

Page 50: Leveraging relations at scale with Neo4j

Neo4j

๏Graph DB written in Java

๏ Java API + HTTP/REST + Embedded

๏Full ACID

๏Built-in indexing (or roll your own)

๏Scale: 32B nodes, 32B relations

Page 51: Leveraging relations at scale with Neo4j

The Cypher Query Language

Page 52: Leveraging relations at scale with Neo4j

Cypher

๏Neo4j’s graph query language

๏Declarative pattern matching

๏ “SQL for graphs”

๏ASCII art

Page 55: Leveraging relations at scale with Neo4j

Basic Syntax

A B

(a) --> (b)

Page 56: Leveraging relations at scale with Neo4j

Basic Syntax

START a=node(*)
MATCH (a)-->(b)
RETURN a, b;

Page 58: Leveraging relations at scale with Neo4j

Relations

(a) -[:ACTED_IN]-> (b)

[Figure: node A connected to node B by an ACTED_IN relation]

Page 59: Leveraging relations at scale with Neo4j

Syntax

START a=node(*)
MATCH (a)-[:ACTED_IN]->(b)
RETURN a.name, b.title;

Page 60: Leveraging relations at scale with Neo4j

Syntax

START a=node(*)
MATCH (a)-[r:ACTED_IN]->(b)
RETURN a.name, r.roles, b.title;

Page 61: Leveraging relations at scale with Neo4j

Syntax

(a) --> (b) <-- (c)

A B C

Page 62: Leveraging relations at scale with Neo4j

Syntax

START a=node(*)
MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d)
RETURN a.name, m.title, d.name;
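To show what the two-relation pattern matches, here is a plain Ruby approximation over a list of relation triples. It only mimics the result shape of the Cypher query; it is not how Neo4j evaluates patterns. The sample data and the `actor_movie_director` helper are assumptions for illustration.

```ruby
# Relations as [start, type, end] triples (sample data).
TRIPLES = [
  ['Harrison Ford', :ACTED_IN, 'Star Wars'],
  ['Mark Hamill',   :ACTED_IN, 'Star Wars'],
  ['George Lucas',  :DIRECTED, 'Star Wars'],
  ['Harrison Ford', :ACTED_IN, 'Blade Runner'],
  ['Ridley Scott',  :DIRECTED, 'Blade Runner']
].freeze

# Rough equivalent of: MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d)
# Returns [actor, movie, director] rows, one per matching pair of relations.
def actor_movie_director(triples)
  acted    = triples.select { |_, type, _| type == :ACTED_IN }
  directed = triples.select { |_, type, _| type == :DIRECTED }
  acted.flat_map do |actor, _, movie|
    directed.select { |_, _, m| m == movie }
            .map { |director, _, _| [actor, movie, director] }
  end
end

actor_movie_director(TRIPLES).each { |row| puts row.join(' | ') }
```

Both relations must point at the same middle node m, which is why each row pairs an actor with the director of the same movie.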

Page 63: Leveraging relations at scale with Neo4j

Sort & Limit

START a=node(*)
MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d)
RETURN a.name, d.name, count(m) AS movie_count
ORDER BY movie_count DESC
LIMIT 5;

Page 64: Leveraging relations at scale with Neo4j

Starting point: All nodes

START n=node(*)
RETURN n;

Page 65: Leveraging relations at scale with Neo4j

Starting point: Where

START n=node(*)
WHERE has(n.name) AND n.name = "George Lucas"
RETURN n;

Page 66: Leveraging relations at scale with Neo4j

Starting point: Auto Index

START n=node:node_auto_index(name="George Lucas")
RETURN n;

Page 67: Leveraging relations at scale with Neo4j

Starting point: multiple nodes

START lucas=node:node_auto_index(name="George Lucas"),
      ford=node:node_auto_index(name="Harrison Ford")
MATCH (lucas)-[:DIRECTED]->(m)<-[:ACTED_IN]-(ford)
RETURN m.title;

Page 68: Leveraging relations at scale with Neo4j

Multiple relations

MATCH (a)-[:ACTED_IN|DIRECTED]->()

Page 69: Leveraging relations at scale with Neo4j

Constraints with comparison

START a=node:node_auto_index(name="Alberto Perdomo")
MATCH (a)-[:KNOWS]->(b)
WHERE b.born < a.born
RETURN b.name;

Page 70: Leveraging relations at scale with Neo4j

Constraints with patterns

MATCH (alberto)-[:KNOWS*2]->(fof)
WHERE NOT((ferblape)-[:KNOWS]-(fof))

Page 71: Leveraging relations at scale with Neo4j

Variable length paths

MATCH (alberto)-[:KNOWS*2]->(fof)

Page 72: Leveraging relations at scale with Neo4j

Aggregation

๏count(x)

๏min(x)

๏max(x)

๏avg(x)

๏ collect(x)

๏ filter(x)

Page 73: Leveraging relations at scale with Neo4j

Updating the graph

๏Create, Set, Delete nodes

๏Create, Set, Delete relations

Page 74: Leveraging relations at scale with Neo4j

Neo4j: More features

Page 75: Leveraging relations at scale with Neo4j

Built-in Graph Algos

๏ shortest path

๏allSimplePaths

๏allPaths

๏dijkstra
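For readers unfamiliar with the last item, here is a compact Ruby sketch of Dijkstra's algorithm on a small weighted graph. It illustrates the technique only and is not Neo4j's built-in implementation; the graph data and the `shortest_path` helper are made up.

```ruby
# Weighted directed graph: node => { neighbour => cost } (made-up data).
ROADS = {
  a: { b: 7, c: 2 },
  b: { d: 1 },
  c: { b: 3, d: 8 },
  d: {}
}.freeze

# Dijkstra's algorithm using a linear scan to pick the closest unvisited node.
# Fine for small graphs; a priority queue would scale better.
# Returns [path, total_cost], or nil when the target is unreachable.
def shortest_path(graph, source, target)
  dist = Hash.new(Float::INFINITY)
  dist[source] = 0
  prev = {}
  unvisited = graph.keys
  until unvisited.empty?
    current = unvisited.min_by { |node| dist[node] }
    break if dist[current] == Float::INFINITY || current == target
    unvisited.delete(current)
    graph[current].each do |neighbour, cost|
      candidate = dist[current] + cost
      next unless candidate < dist[neighbour]
      dist[neighbour] = candidate
      prev[neighbour] = current
    end
  end
  return nil if dist[target] == Float::INFINITY

  # Walk the predecessor links back from the target to rebuild the path.
  path = [target]
  path.unshift(prev[path.first]) while prev.key?(path.first)
  [path, dist[target]]
end

path, cost = shortest_path(ROADS, :a, :d)
puts "#{path.join(' -> ')} (cost #{cost})"
```

The direct-looking routes a-b-d (cost 8) and a-c-d (cost 10) both lose to a-c-b-d (cost 6), which is the kind of result a weighted search is meant to surface.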

Page 76: Leveraging relations at scale with Neo4j

Extending Neo4j: Plugins

๏Plugins provide extra API endpoints to run external code, packaged as JAR files

๏Neo4j-Spatial

๏Neo4j-Sparql

Page 77: Leveraging relations at scale with Neo4j

Neo4j from Ruby

๏ Neography

๏ Wrapper around REST API

๏ Neo4j.rb:

๏ Language binding for JRuby

๏ ActiveModel, Mixins

๏ Embedded Neo4j w/ GPL license (not only?)

๏ Other?

Page 78: Leveraging relations at scale with Neo4j

Neography

require 'neography'

# Connect to the Neo4j REST API (a local server by default)
@neo = Neography::Rest.new

# Node creation:
node1 = @neo.create_node("age" => 31, "name" => "Max")
node2 = @neo.create_node("age" => 33, "name" => "Roel")

# Node properties:
@neo.set_node_properties(node1, {"weight" => 200})

# Relationships between nodes:
@neo.create_relationship("coding_buddies", node1, node2)

# Get node relationships:
@neo.get_node_relationships(node2, "in", "coding_buddies")

# Use indexes:
@neo.add_node_to_index("people", "name", "max", node1)
@neo.get_node_index("people", "name", "max")

# Cypher queries:
@neo.execute_query("start n=node(0) return n")

Page 79: Leveraging relations at scale with Neo4j

Neo4j.rb ActiveModel

class User < Neo4j::Rails::Model
  attr_accessor :password
  attr_accessible :email, :password, :password_confirmation
  after_save :encrypt_password

  email_regex = /\A[\w+\-.]+@[a-z\d\-.]+\.[a-z]+\z/i

  # add an exact lucene index on the email property
  property :email, index: :exact

  has_one(:avatar).to(Avatar)

  accepts_nested_attributes_for :avatar, allow_destroy: true
end

Page 80: Leveraging relations at scale with Neo4j

Neo4j.rb Mixin

class Person
  include Neo4j::NodeMixin
  property :name, index: :exact
  property :city

  has_n :friends
  has_one :address
end

Page 81: Leveraging relations at scale with Neo4j

Neo4j Licensing

๏ Community: GPL

๏ Advanced: Commercial + AGPL

  ๏ Monitoring

  ๏ Support

๏ Enterprise: Commercial + AGPL

  ๏ Monitoring + HA clustering + Online backups

  ๏ Support

Page 82: Leveraging relations at scale with Neo4j

Getting Started

๏www.neo4j.org/learn/try

๏www.neo4j.org/download

๏download -> unpack -> start

๏http://localhost:7474

Page 83: Leveraging relations at scale with Neo4j

Built-in Web Admin

๏Stats

๏Console & browser

๏Indexes

Page 84: Leveraging relations at scale with Neo4j

Neo4j Resources

๏ Code & Issues: github.com/neo4j/neo4j

๏ Resources: www.neo4j.org/learn

๏ Mailing List: groups.google.com/forum/#!forum/neo4j

๏ Questions: stackoverflow.com/questions/tagged/neo4j

๏ Meetups: www.neo4j.org/participate/events/meetups

Free download: http://graphdatabases.com/

Page 85: Leveraging relations at scale with Neo4j

GrapheneDB: Neo4j as a Service

