New opportunities for connected data: Neo4j, the graph database

Post on 26-Jan-2015


Cédric Fauvet
Cedric.fauvet@neotechnology.com

French-speaking Twitter: @Neo4jFr

Confidential - Neo Technology, Inc.


Agenda

• Graph theory
• About Neo Technology
• Use cases
• Market vision
• The Neo4j technology
• Cypher, Neo4j's "SQL"

The graph theory

Year 840: The knight's tour problem

The Arab mathematician and chess master al-Adli ar-Rumi solved the problem.

Year 1735: The seven bridges of Königsberg problem

How can one cross every bridge exactly once?

Leonhard Euler, Swiss mathematician
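Euler's answer to the bridge question can be checked with a few lines of code. The sketch below is illustrative only: the land-mass labels A to D and the function names are assumptions, not part of the slides. It relies on the standard result that a connected multigraph has a walk using every edge exactly once only if zero or two nodes have an odd number of edges.

```python
from collections import Counter

# The seven bridges as edges between four land masses.
# The labels A-D are illustrative; A is the central island.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges, sorted by name."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def has_euler_walk(edges):
    """True if a walk crossing every edge exactly once can exist.

    Assumes the graph is connected (true for Königsberg); connectivity
    is not checked here.
    """
    return len(odd_degree_nodes(edges)) in (0, 2)


print(odd_degree_nodes(BRIDGES))
print(has_euler_walk(BRIDGES))
```

All four land masses have an odd number of bridges, so no such walk exists.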

2013: Today's questions

• Collaboration
• Configuration management
• Geo mapping
• Molecular interactions (biology)
• Impact analysis
• Master Data Management
• Product management
• Recommendation
• Social


Neo Technology (Neo4j) Corporate Overview

• Neo4j founded in 2000
• Headquartered in Palo Alto, California
• Engineering headquarters in Malmö, Sweden
• Employees based in France, Germany, UK, Sweden, US, and Malaysia
• 24/7 support on a global basis
• 100,000+ users
• F500 customers such as Adobe, Cisco, Deutsche Telekom, Telenor, Deutsche Post, SFR, Lockheed Martin, and others
• SI partners such as Accenture and dozens of local SI boutiques
• Technology partners such as VMware, Informatica and Microsoft
• Leader in the graph database arena
• Mission: Help the world make sense of data


Case study: Social network (Viadeo)

Company
- Worldwide company
- 45 million users, growing by 30,000 each day
- Owner of the social networks ApnaCircle (India) and Tianji (China)

Problem
Viadeo integrated Neo4j as its backend database to store all of its users and relationships. When the network expanded beyond what its traditional MySQL database could handle, Viadeo experienced performance and storage issues, and the database could not keep up with the rate at which the company was growing.

Solution
By integrating Neo4j, Viadeo significantly accelerated its system in two ways: Neo4j requires less storage space and less time to restructure the graph.

Benefits & time frame
- Real-time recommendations with Neo4j
- Project time frame: 8 weeks


Case study: Web/ISV - social collaboration (Adobe)

Company
- Worldwide leader in networking for the Internet

Problem definition
- Massive amounts of data tied to members, user groups, member content, etc., all interconnected
- Need to infer collaborative relationships based on user-generated content

Solution
- Clustered Neo4j Enterprise architecture
- Part of a larger infrastructure solution
- Multi-region AWS deployment
- Neo4j selected in competition with a custom solution and Oracle

Benefits & time frame
- Highly flexible data analysis
- Sub-second results for large, densely connected datasets
- User experience as a competitive advantage
- 12-month project


Case study: Telco (Telenor)

Company
- Leading telco provider in the Nordics

Problem definition
- Need: a reliable access control administration system for 5 million customers, subscriptions and agreements
- Complex dependencies between groups, companies, individuals, accounts, products, subscriptions, services and agreements
- Broad and deep graphs (master customers with thousands of customers, subscriptions and agreements)

Solution
- Neo4j Enterprise solution
- Embedded + HA
- Replacing a 10-year-old Oracle, Berkeley DB and mainframe environment

Benefits & time frame
- Flexible and dynamic architecture
- Exceptional performance
- Low cost compared to alternatives
- Extensible data model supports new applications and features


Case study: Sales account management (Cisco)

Company
- Worldwide leader in network infrastructure
- Large sales organization

Problem definition
- Intricate rules governing ownership of sales accounts
- Complex rules for sales commissions
- Queries complicated to structure with an RDBMS
- Oracle performance not good enough for online account management

Solution
- 2x highly available Neo4j clusters
- One live cluster and one backup / hot spare cluster at a different datacenter
- Total: 6 embedded Enterprise Neo4j DBs

Benefits & time frame
- Real-time overview of sales accounts and owners
- The ability to model complex rules for account ownership
- Direct computation of commissions through the entire sales organization
- More than 12 months of development and rollout

Use case: What's in common?

[Diagram: graph of the nodes Alice, Bob, ACME, ACME EMEA, Retail Co. and FooBar Inc., connected by "Sales Rep", "Worked For" and "Sold To" relationships]

Use case: What's the best path?

[Diagram: graph of the nodes Retail Co., ACME, Bob, Steve, Jane, Liza, Pauline and William, with the roles "Sales Rep", "VP" and "CMO"]

Use case: Pattern matching

Fraud detection
- Match
- No match

Use case: Graph navigation

Impact analysis
- Start node
- Follow the relationships
- Evaluate each node
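The three impact-analysis steps (start node, follow the relationships, evaluate each node) can be sketched as a breadth-first traversal. This is a plain-Python illustration rather than Neo4j code. The infrastructure graph and the names DEPENDENTS and impacted_by are invented for the example.

```python
from collections import deque

# Hypothetical dependency graph: each key lists the components that
# depend directly on it. All names are made up for illustration.
DEPENDENTS = {
    "router-1": ["switch-1", "switch-2"],
    "switch-1": ["server-a"],
    "switch-2": ["server-a", "server-b"],
    "server-a": ["billing-app"],
    "server-b": [],
    "billing-app": [],
}


def impacted_by(graph, start):
    """Return every node reachable from `start`, in breadth-first order."""
    seen = {start}           # step 1: the start node
    queue = deque([start])
    impacted = []
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):  # step 2: follow relationships
            if neighbour not in seen:          # step 3: evaluate each node once
                seen.add(neighbour)
                impacted.append(neighbour)
                queue.append(neighbour)
    return impacted


print(impacted_by(DEPENDENTS, "switch-2"))
```

Here the "evaluation" simply records each node; a real analysis would apply whatever rule decides whether a node is affected.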


size * connectivity = complexity


Neo4j

Tackles complex data:
– Large
– Densely connected
– Semi-structured

Neo4j characteristics

• Fully ACID transactions
– Including XA-compliant distributed two-phase commits
• High availability / read scaling
– Master-slave replication with master failover
• In-memory speeds
– Warm caches while maintaining full ACID
• Query languages and APIs
– Cypher
– Java APIs
– JDBC
– REST API
– Ruby


Cypher, Neo4j's "SQL"

Based on ASCII art:
() --> ()

Each node has an identifier:
(A) --> (B)

Relationships:
A -[:LOVES]-> B

You can traverse the graph:
A --> B --> C

You can dynamically traverse the graph:
A -[*]-> B

The friend-of-friend query:

START john=node:node_auto_index(name = 'John')
MATCH john-[:friend]->()-[:friend]->fof
RETURN john, fof
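To make the shape of this pattern concrete, here is a minimal plain-Python sketch of the same two-hop lookup. It only illustrates what the MATCH clause describes and is not how Neo4j executes the query. The FRIEND data and the function name friends_of_friends are assumptions made for the example.

```python
# Hypothetical directed "friend" relationships; all names are illustrative.
FRIEND = {
    "John": ["Sara", "Joe"],
    "Sara": ["Maria", "Joe"],
    "Joe": ["Steve"],
    "Maria": [],
    "Steve": [],
}


def friends_of_friends(friend, name):
    """Follow two outgoing friend hops from `name`.

    Mirrors john-[:friend]->()-[:friend]->fof: one result per two-hop
    path, so a person reachable by several paths appears several times.
    """
    return [
        fof
        for middle in friend.get(name, [])   # first hop: john -> ()
        for fof in friend.get(middle, [])    # second hop: () -> fof
    ]


print(friends_of_friends(FRIEND, "John"))
```

Note that Joe appears in the result even though he is also a direct friend of John; the pattern only constrains the path, not who sits at its end.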

Thank you

Let's move forward together!

Cédric Fauvet, your contact in France and Switzerland

E-mail: Cedric.fauvet@neotechnology.com
French-speaking Twitter: @Neo4jFr
French-speaking community: meetup.com/graphdb-france