Date posted: 27-Jan-2015
[email protected] | @albertoperdomo
Leveraging relations at scale with Neo4j
Madrid.rb - July 2013
Alberto Perdomo, GrapheneDB
About me
๏Co-founder of Aentos
๏Ruby developer
๏GrapheneDB: Neo4j as a Service
A little Graph Theory
Undirected Graph
[Diagram: nodes A, B and C joined by undirected edges; node names Adam, Michael and John]
Example: Facebook friendships
Weighted Graph
[Diagram: nodes A, B and C; Adam and John linked to Star Wars: Episode IV by edges with weights 0.6 and 0.8]
Example: Movie ratings
Labeled Graph
[Diagram: nodes A, B and C; Adam and Michael carry the label user, LA Lakers the label fan_page; edges are labeled friend_of and fan_of]
Example: Facebook friendships + fan pages
Property Graph
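[Diagram: nodes A, B and C with properties, joined by relations]
๏Node: Type: Cast Member, Name: George Lucas, Born_At: 1944-05-14
๏Node: Type: Movie, Title: Star Wars Episode IV - A New Hope, Release: 1977
๏Node: Type: User, Name: Adam, Age: 34, Country: USA
๏Relations: rated (0.6), directed, wrote
Example: IMDB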
Graph DB: Definition
๏Uses graph as primary data structure
๏Property graph: store data as nodes, relations and properties
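As a rough illustration of the property graph model, here is a minimal sketch in plain Ruby in which both nodes and relations carry a property hash. `Node`, `Relation` and `ratings_for` are hypothetical names made up for this sketch (they are not a Neo4j API), and the direction of the `directed` and `rated` relations is an assumption based on the IMDB example.

```ruby
# Illustrative only: Node and Relation are hypothetical structs, not part of Neo4j.
Node = Struct.new(:id, :properties)
Relation = Struct.new(:type, :from, :to, :properties)

LUCAS = Node.new(1, { type: "Cast Member", name: "George Lucas" })
MOVIE = Node.new(2, { type: "Movie", title: "Star Wars Episode IV - A New Hope", release: 1977 })
ADAM  = Node.new(3, { type: "User", name: "Adam", age: 34, country: "USA" })

# Assumed directions: Lucas directed the movie, Adam rated it 0.6.
RELATIONS = [
  Relation.new(:directed, LUCAS, MOVIE, {}),
  Relation.new(:rated, ADAM, MOVIE, { rating: 0.6 })
].freeze

# Relations are first-class records: they can be filtered by type
# and they hold their own properties (here, the rating).
def ratings_for(relations, title)
  relations
    .select { |r| r.type == :rated && r.to.properties[:title] == title }
    .map { |r| [r.from.properties[:name], r.properties[:rating]] }
end

p ratings_for(RELATIONS, "Star Wars Episode IV - A New Hope")
```

Keeping the rating on the relation rather than on either node is the point of the model: the value belongs to the connection between Adam and the movie.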
MySQL vs Neo4j
๏1M users
๏Friends of friends for 1K users
Depth | Execution time, MySQL (s) | Execution time, Neo4j (s)
2 | 0.016 | 0.010
3 | 30.267 | 0.168
4 | 1,543.505 | 1.359
5 | Not finished in 1 hour | 2.132
http://www.neotechnology.com/how-much-faster-is-a-graph-database-really/, http://www.manning.com/partner/
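The benchmark counts friends of friends at increasing depth. As a rough illustration of what such a query computes (not of how either database executes it), here is a breadth-first traversal over an in-memory adjacency hash. The `friends_within` method and the sample data are made up for this sketch.

```ruby
require "set"

# Returns everyone reachable from `start` in at most `depth` hops,
# excluding `start` itself. `graph` maps a person to an array of friends.
def friends_within(graph, start, depth)
  visited = Set[start]
  frontier = [start]
  depth.times do
    # Expand one hop, keeping only people not seen at a shallower depth.
    frontier = frontier.flat_map { |person| graph.fetch(person, []) }
                       .reject { |person| visited.include?(person) }
                       .uniq
    visited.merge(frontier)
  end
  visited.delete(start).to_a
end

# Hypothetical sample data.
FRIENDS = {
  "Adam"    => ["Michael"],
  "Michael" => ["Adam", "John"],
  "John"    => ["Michael", "Sara"],
  "Sara"    => ["John"]
}.freeze

p friends_within(FRIENDS, "Adam", 2)
```

Each extra level of depth multiplies the number of people to expand, which is why the cost of a single neighbour lookup dominates at depth 4 and 5.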
Conventional DBs
๏Finding adjacent nodes requires an index lookup
๏Lookup cost depends on the total number of vertices and edges in the DB (global)
Graph DB: Definition
๏Any system that provides index-free adjacency [1]
๏Linear cost to retrieve adjacent nodes: depends on the number of local neighbours
[1] http://www.slideshare.net/slidarko/problemsolving-using-graph-traversals-searching-scoring-ranking-and-recommendation
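To make the contrast between global lookups and index-free adjacency concrete, here is a toy comparison in plain Ruby. The first lookup searches a shared edge list, standing in for any global structure whose cost grows with the whole dataset (a real relational index would be a tree lookup rather than a linear scan). The second follows references stored on the node itself. All names are hypothetical, and this is not a description of Neo4j internals.

```ruby
# Global approach: edges live in one shared list, so finding neighbours
# means searching a structure whose size depends on the whole graph.
EDGES = [["A", "B"], ["A", "C"], ["B", "C"]].freeze

def neighbours_via_global_lookup(edges, node)
  edges.select { |from, _to| from == node }.map { |_from, to| to }
end

# Index-free adjacency: each node keeps direct references to its
# neighbours, so the cost depends only on that node's own degree.
class GraphNode
  attr_reader :name, :neighbours

  def initialize(name)
    @name = name
    @neighbours = []
  end

  def connect(other)
    @neighbours << other
    self
  end
end

# Builds the same three-node graph as EDGES and returns node A.
def build_sample
  a, b, c = %w[A B C].map { |n| GraphNode.new(n) }
  a.connect(b).connect(c)
  b.connect(c)
  a
end

p neighbours_via_global_lookup(EDGES, "A") # searches every edge
p build_sample.neighbours.map(&:name)      # touches only A's own edges
```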
Graph analysis
๏Recommend vertices to user x
๏Search for y given x
๏Score x given its local neighbourhood
๏Rank x relative to y
Social Graph
A representation of the relationship between people and
other people
Social Graph
“Since you have many friends in
common, you might know fellow X.”
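A suggestion like this boils down to counting mutual friends two hops away. The sketch below does that over a plain Ruby hash; `suggest_friends` and the sample data are hypothetical, and a graph database would express the same idea as a traversal or pattern match.

```ruby
# Suggests people who are not yet friends with `person`, ranked by the
# number of friends they have in common. `graph` maps a person to friends.
def suggest_friends(graph, person)
  direct = graph.fetch(person, [])
  counts = Hash.new(0)
  direct.each do |friend|
    graph.fetch(friend, []).each do |candidate|
      # Skip the person themselves and anyone they already know.
      next if candidate == person || direct.include?(candidate)
      counts[candidate] += 1
    end
  end
  # Most mutual friends first; ties broken alphabetically.
  counts.sort_by { |name, common| [-common, name] }
end

# Hypothetical sample data.
SOCIAL = {
  "Adam"    => ["Michael", "John"],
  "Michael" => ["Adam", "John", "Sara"],
  "John"    => ["Adam", "Michael", "Sara", "Tom"],
  "Sara"    => ["Michael", "John"],
  "Tom"     => ["John"]
}.freeze

p suggest_friends(SOCIAL, "Adam")
```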
Interest Graph
A representation of the relationship between people
and things
Interest Graph
๏Quora
๏Spotify
“A lot of people who, like you, like x also like y.”
Pinterest Interest Graph
http://engineering.pinterest.com/post/55272557617/building-a-follower-model-from-scratch
Recommendations
[Diagram: nodes A, B and C; many users are linked by "bought" edges to Star Wars I DVD and Star Wars Trilogy DVD Pack; a user is linked by a "looking at" edge to one of the products]
“Customers who bought a, also bought b”
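A co-purchase recommendation of this kind can be sketched as a two-hop walk: from the product to the customers who bought it, then on to the other products those customers bought. The plain Ruby below is a minimal illustration; `also_bought`, the customer names and the "Popcorn Maker" product are made up for the example.

```ruby
# `purchases` maps a customer to the list of products they bought.
# Returns other products bought by customers who bought `product`,
# most frequently co-purchased first.
def also_bought(purchases, product)
  counts = Hash.new(0)
  purchases.each_value do |basket|
    next unless basket.include?(product)
    (basket - [product]).uniq.each { |other| counts[other] += 1 }
  end
  counts.sort_by { |name, n| [-n, name] }.map(&:first)
end

# Hypothetical sample data.
PURCHASES = {
  "user1" => ["Star Wars I DVD", "Star Wars Trilogy DVD Pack"],
  "user2" => ["Star Wars I DVD", "Star Wars Trilogy DVD Pack", "Popcorn Maker"],
  "user3" => ["Popcorn Maker"]
}.freeze

p also_bought(PURCHASES, "Star Wars I DVD")
```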
Rank x
๏Rank nodes based on their neighbourhood/network
๏Klout, PageRank
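As an illustration of ranking a node by its network, here is a simplified PageRank computed by power iteration in plain Ruby. It is a sketch under stated assumptions: the damping factor of 0.85 and the fixed iteration count are conventional defaults chosen here, dangling nodes spread their rank evenly, and the `pagerank` method and sample links are hypothetical. It does not reflect how Klout or any production system computes scores.

```ruby
# Simplified PageRank by power iteration. `links` maps each node to the
# nodes it points at; every node must appear as a key. Dangling nodes
# (no outgoing links) spread their rank evenly over all nodes.
def pagerank(links, damping: 0.85, iterations: 50)
  nodes = links.keys
  n = nodes.size.to_f
  ranks = nodes.to_h { |node| [node, 1.0 / n] }

  iterations.times do
    dangling = nodes.select { |node| links[node].empty? }.sum { |node| ranks[node] }
    incoming = Hash.new(0.0)
    links.each do |node, targets|
      # Each node splits its current rank evenly among its targets.
      targets.each { |target| incoming[target] += ranks[node] / targets.size }
    end
    ranks = nodes.to_h do |node|
      [node, (1 - damping) / n + damping * (incoming[node] + dangling / n)]
    end
  end
  ranks
end

# Hypothetical sample: A and B both link to C, C links back to A.
LINKS = {
  "A" => ["C"],
  "B" => ["C"],
  "C" => ["A"]
}.freeze

ranks = pagerank(LINKS)
p ranks.max_by { |_node, rank| rank }.first
```

In this sample C ends up on top because it receives links from both other nodes, while B, which nobody links to, keeps only the baseline share.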
Geospatial problems
๏Travelling Salesman
๏Route for delivery of parcels
๏Optimize route for duration, distance, traffic flow, etc.
๏Need not be a physical path; example: connecting people
Recognize patterns
๏Fraud detection
๏Debt compensation systems
๏Text analysis
๏Chain of exchanges
Process, Tips
๏Model facts as nodes
๏Use relations to model relations between facts
๏Refactor: schema-less!
Neo4j
๏Graph DB written in Java
๏ Java API + HTTP/REST + Embedded
๏Full ACID
๏Built-in indexing (or roll your own)
๏Scale: 32B nodes, 32B relations
Cypher
๏Neo4j’s graph query language
๏Declarative pattern matching
๏ “SQL for graphs”
๏ASCII art
Syntax
START a=node(*)
MATCH (a)-[:ACTED_IN]->(b)
RETURN a.name, b.title;
Syntax
START a=node(*)
MATCH (a)-[r:ACTED_IN]->(b)
RETURN a.name, r.roles, b.title;
Syntax
START a=node(*)
MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d)
RETURN a.name, m.title, d.name;
Sort & Limit
START a=node(*)
MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d)
RETURN a.name, d.name, count(*) AS count
ORDER BY count DESC
LIMIT 5;
Starting point: Where
START n=node(*)
WHERE has(n.name) AND n.name = "George Lucas"
RETURN n;
Starting point: Auto Index
START n=node:node_auto_index(name="George Lucas")
RETURN n;
Starting point: multiple nodes
START lucas=node:node_auto_index(name="George Lucas"),
      ford=node:node_auto_index(name="Harrison Ford")
MATCH (lucas)-[:DIRECTED]->(m)<-[:ACTED_IN]-(ford)
RETURN m.title;
Constraints with comparison
START a=node:node_auto_index(name="Alberto Perdomo")
MATCH (a)-[:KNOWS]->(b)
WHERE b.born < a.born
RETURN a.name;
Constraints with patterns
MATCH (alberto)-[:KNOWS*2]->(fof)
WHERE NOT((alberto)-[:KNOWS]-(fof))
Aggregation
๏count(x)
๏min(x)
๏max(x)
๏avg(x)
๏ collect(x)
๏ filter(x)
Updating the graph
๏Create, Set, Delete nodes
๏Create, Set, Delete relations
Built-in Graph Algos
๏ shortest path
๏allSimplePaths
๏allPaths
๏dijkstra
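To show what a weighted shortest-path query computes, here is a compact Dijkstra sketch in plain Ruby. It is not Neo4j's built-in implementation: the `dijkstra` method, the graph layout and the sample road data are hypothetical, and a linear scan replaces the usual priority queue to keep the code short.

```ruby
# Dijkstra's algorithm over a weighted graph given as
# { node => { neighbour => cost } }; every node must appear as a key.
# Returns [total_cost, path], or nil when `goal` is unreachable.
def dijkstra(graph, source, goal)
  dist = { source => 0 }
  prev = {}
  unvisited = graph.keys

  until unvisited.empty?
    # Pick the closest node that has been reached but not yet settled.
    current = unvisited.select { |n| dist.key?(n) }.min_by { |n| dist[n] }
    break if current.nil?
    unvisited.delete(current)
    break if current == goal

    graph[current].each do |neighbour, cost|
      candidate = dist[current] + cost
      if !dist.key?(neighbour) || candidate < dist[neighbour]
        dist[neighbour] = candidate
        prev[neighbour] = current
      end
    end
  end

  return nil unless dist.key?(goal)

  # Walk the predecessor links back from the goal to rebuild the path.
  path = [goal]
  path.unshift(prev[path.first]) while prev.key?(path.first)
  [dist[goal], path]
end

# Hypothetical delivery network with travel costs.
ROADS = {
  "Depot"  => { "North" => 4, "South" => 1 },
  "South"  => { "North" => 2, "Client" => 7 },
  "North"  => { "Client" => 3 },
  "Client" => {}
}.freeze

p dijkstra(ROADS, "Depot", "Client")
```

The costs could stand for duration, distance or traffic flow; only the edge weights change, not the algorithm.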
Extending Neo4j: Plugins
๏Provides extra API endpoints to run external code. JAR files.
๏Neo4j-Spatial
๏Neo4j-Sparql
Neo4j from Ruby
๏Neography
  ๏Wrapper around REST API
๏Neo4j.rb
  ๏Language binding for JRuby
  ๏ActiveModel, Mixins
  ๏Embedded Neo4j w/ GPL license (not only?)
๏Other?
Neography
require 'neography'
@neo = Neography::Rest.new

# Node creation:
node1 = @neo.create_node("age" => 31, "name" => "Max")
node2 = @neo.create_node("age" => 33, "name" => "Roel")

# Node properties:
@neo.set_node_properties(node1, {"weight" => 200})

# Relationships between nodes:
@neo.create_relationship("coding_buddies", node1, node2)

# Get node relationships:
@neo.get_node_relationships(node2, "in", "coding_buddies")

# Use indexes:
@neo.add_node_to_index("people", "name", "max", node1)
@neo.get_node_index("people", "name", "max")

# Cypher queries:
@neo.execute_query("start n=node(0) return n")
Neo4j.rb ActiveModel
class User < Neo4j::Rails::Model
  attr_accessor :password
  attr_accessible :email, :password, :password_confirmation
  after_save :encrypt_password

  email_regex = /\A[\w+\-.]+@[a-z\d\-.]+\.[a-z]+\z/i

  # add an exact lucene index on the email property
  property :email, index: :exact

  has_one(:avatar).to(Avator)

  accepts_nested_attributes_for :avatar, allow_destroy: true
end
Neo4j.rb Mixin
class Person
  include Neo4j::NodeMixin
  property :name, index: :exact
  property :city

  has_n :friends
  has_one :address
end
Neo4j Licensing
๏Community: GPL
๏Advanced: Commercial + AGPL
  ๏Monitoring
  ๏Support
๏Enterprise: Commercial + AGPL
  ๏Monitoring + HA clustering + Online backups
  ๏Support
Getting Started
๏www.neo4j.org/learn/try
๏www.neo4j.org/download
๏download -> unpack -> start
๏http://localhost:7474
Neo4j Resources
๏ Code & Issues: github.com/neo4j/neo4j
๏ Resources: www.neo4j.org/learn
๏ Mailing List: groups.google.com/forum/#!forum/neo4j
๏ Questions: stackoverflow.com/questions/tagged/neo4j
๏ Meetups: www.neo4j.org/participate/events/meetups
Free download: http://graphdatabases.com/