Betting the Company on a Graph Database - Aseem Kishore @ GraphConnect Boston 2013

Post on 27-Jan-2015

BETTING THE COMPANY

(LITERALLY) ON A

GRAPH DATABASE

TIPS, TRICKS, AND LESSONS LEARNED

Aseem Kishore

Jan 2013

START user=node(1), other=node(2)
MATCH (user) -[r1:has|wants]-> (thing) <-[r2:has|wants]- (other)
WHERE TYPE(r1) <> TYPE(r2)
RETURN TYPE(r1), TYPE(r2), thing
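Read as plain code, the query above looks for things that one user has and the other wants, or the other way around. A minimal in-memory sketch of that matching logic follows; the edge list, the names `alice` and `bob`, and the function `complementary_matches` are all hypothetical stand-ins, not part of Neo4j or the original deck.

```python
from collections import defaultdict

def complementary_matches(edges, user, other):
    """Return (type_for_user, type_for_other, thing) triples where both people
    point at the same thing through different relationship types."""
    allowed = {"has", "wants"}
    by_person = defaultdict(list)
    for person, rel_type, thing in edges:
        if rel_type in allowed:
            by_person[person].append((rel_type, thing))

    matches = []
    for r1, thing in by_person[user]:
        for r2, other_thing in by_person[other]:
            # Same thing, different relationship type (the WHERE clause).
            if thing == other_thing and r1 != r2:
                matches.append((r1, r2, thing))
    return matches

# Hypothetical sample data: (person, relationship type, thing).
edges = [
    ("alice", "has", "bike"),
    ("bob", "wants", "bike"),
    ("alice", "wants", "camera"),
    ("bob", "has", "camera"),
    ("alice", "has", "tent"),
    ("bob", "has", "tent"),
]
matches = complementary_matches(edges, "alice", "bob")
```

The tent is skipped because both people relate to it through the same type, which is what `TYPE(r1) <> TYPE(r2)` filters out.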

Daniel Gasienica

@gasi

SO…

JUST WHAT IS A

GRAPH DATABASE?

# adjacency list:
nodes = List<Node>
neighbors = Map<Node, List<Node>>
neighbors[node1].add(node2)

# adjacency matrix:
nodes = List<Node>
connections = Map<Node, Map<Node, bool>>
connections[node1][node2] = true

“By definition, a graph database is any storage system that provides index-free adjacency.”

“This means that every element contains a direct pointer to its adjacent element and no index lookups are necessary.”
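One way to picture index-free adjacency is a node object that holds direct references to its own relationships, so finding neighbors never consults a global lookup table. The sketch below is an in-memory illustration only; the `Node` and `Relationship` classes and `neighbors_of` are hypothetical names and say nothing about how Neo4j lays out its store files.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.outgoing = []  # direct references to Relationship objects


class Relationship:
    def __init__(self, start, rel_type, end):
        self.start = start
        self.type = rel_type
        self.end = end
        # Register on the start node so traversal needs no index lookup.
        start.outgoing.append(self)


def neighbors_of(node, rel_type=None):
    """Follow the node's own pointers; optionally filter by relationship type."""
    return [r.end for r in node.outgoing if rel_type is None or r.type == rel_type]


user = Node("user")
thing = Node("thing")
friend = Node("friend")
Relationship(user, "has", thing)
Relationship(user, "knows", friend)

owned = neighbors_of(user, "has")
```

The cost of `neighbors_of` depends on how many relationships the node itself has, not on the total size of the graph.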

QUERYING

1. Start somewhere

2. Traverse elsewhere

QUERYING IN NEO4J

1. Start somewhere

Root node

ID directly (file offset)

Lucene index

2. Traverse elsewhere

Traversal APIs

Cypher patterns

Built-in graph algos (Dijkstra, A*, etc.)
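"Start somewhere, traverse elsewhere" is also the shape of the classic shortest-path algorithms named above. As an illustration, here is a minimal Dijkstra sketch over a plain dictionary graph; it is a textbook version written for this transcript, not Neo4j's built-in implementation, and the graph data is made up.

```python
import heapq

def dijkstra(graph, start, goal):
    """graph maps node -> list of (neighbor, weight) pairs.
    Returns (total_cost, path), or (inf, []) if the goal is unreachable."""
    queue = [(0, start, [start])]
    best = {}  # cheapest cost at which each node has been settled
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in best and best[node] <= cost:
            continue  # already settled via a cheaper or equal route
        best[node] = cost
        for neighbor, weight in graph.get(node, []):
            heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []

# Hypothetical weighted graph.
graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 5)],
    "c": [("d", 1)],
    "d": [],
}
cost, path = dijkstra(graph, "a", "d")
```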

NEO4J USAGE

Embedded mode (Java API)

Server mode (REST API)

Cypher query language (both)

OUR USAGE

NODE.JS

+

REST API

+

CYPHER
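To make the REST-plus-Cypher combination concrete, the sketch below builds (but does not send) an HTTP request carrying a parameterized Cypher query. It is written in Python rather than Node.js purely for illustration. It assumes the legacy `/db/data/cypher` endpoint of Neo4j servers from that era and the default `http://localhost:7474` address; the helper name `build_cypher_request` and the sample query are hypothetical.

```python
import json
import urllib.request

def build_cypher_request(query, params=None, base_url="http://localhost:7474"):
    """Build a POST request for the (assumed) legacy Cypher REST endpoint."""
    body = json.dumps({"query": query, "params": params or {}}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/db/data/cypher",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )

request = build_cypher_request(
    "START user=node({id}) MATCH (user) -[:has]-> (thing) RETURN thing",
    {"id": 1},
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request),
# and requires a running server that still exposes this endpoint.
```

Passing values through `params` instead of concatenating them into the query string keeps the query text constant across calls.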

NEO4J EDITIONS

Community edition: Single instance, offline backup

Advanced edition: Meh

Enterprise edition: Multi-instance cluster! Online backup!

NEO4J SCALING

Master-slave replication

Cache-based sharding

Feature-based polyglot'ing

64B (billion) limit on nodes, rels, props

But can be easily upped; just flipping some bits

100 props/node (high) ⇒ 640M nodes
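The node estimate on this slide is a single division. The sketch below spells it out, assuming "64B" means 64 billion property records and that properties are the first limit a property-heavy graph would hit; the constant names are illustrative.

```python
# Assumption: "64B" is read as 64 billion records per store.
RECORD_LIMIT = 64 * 10**9
# The slide's deliberately high estimate of properties per node.
PROPS_PER_NODE = 100

# Nodes that fit before the property store runs out of record IDs.
max_nodes = RECORD_LIMIT // PROPS_PER_NODE
```

With fewer properties per node the ceiling rises proportionally, up to the node limit itself.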

OKAY...

LET'S TALK ABOUT

WHAT WE LEARNED

Unique, expressive relationship types

Cache stats where possible

First-class objects ⇒ nodes, not rels

Capture history through event nodes

Connected data ⇒ nodes, not props

Maintain linked lists for O(1) queries
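Several of these lessons combine naturally: history kept as event nodes, a linked list threaded through them, and a cached counter instead of a count query. The in-memory sketch below shows the idea under stated assumptions; in a graph database the `Event` objects would be nodes and the `prev` and `latest_event` pointers would be relationships. All class and field names are hypothetical.

```python
class Event:
    """A first-class history record, modeled as its own object (node)."""
    def __init__(self, kind, payload, prev=None):
        self.kind = kind
        self.payload = payload
        self.prev = prev  # link to the next-older event


class User:
    def __init__(self, name):
        self.name = name
        self.latest_event = None  # head of the linked list: O(1) to reach
        self.event_count = 0      # cached stat, updated on write

    def record(self, kind, payload):
        # Push the new event onto the head of the list and bump the cache.
        self.latest_event = Event(kind, payload, prev=self.latest_event)
        self.event_count += 1
        return self.latest_event

    def recent(self, limit):
        """Walk at most `limit` links from the head, newest first."""
        out, node = [], self.latest_event
        while node is not None and len(out) < limit:
            out.append(node)
            node = node.prev
        return out


user = User("aseem")
user.record("signup", {})
user.record("added_thing", {"thing": "bike"})
user.record("added_thing", {"thing": "camera"})
latest_two = [e.payload for e in user.recent(2)]
```

Fetching the newest event or the total count touches one object regardless of how long the history is; the price is extra bookkeeping on every write.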

NEO4J ROADMAP

Overhaul of indexing API

Relationship type grouping

Socket and/or binary protocol

Automatic sharding?

THANKS!

TWITTER: @ASEEMK

GITHUB: @ASEEMK

EMAIL: ASEEM.KISHORE@GMAIL.COM

Questions?