
BI, Business Intelligence, and Graphs

Date post: 20-Aug-2015
Upload: cedric-fauvet

Neo Technology, Inc Confidential

BI, Business Intelligence, and Graphs

Philip Rathle, Sr Dir Products

[email protected]
http://twitter.com/prathle

Graphs and Thought

The Brain

Structure: Neurons connected by synapses.

Processing: Signals relayed between neurons through synapses.

Human Thinking

Structured / Unstructured (Creative)

Both forms involve processing connections

Graph Applications in Business


Early Adopters of Graph Tech

Evolution of Web Search: Survival of the Fittest

• Pre-1999: WWW Indexing. Discrete Data

• 1999-2012: Google Invents PageRank. Connected Data (Simple)

• 2012-?: Google Knowledge Graph, Facebook Graph Search. Connected Data (Rich)

Evolution of Online Recruiting: Survival of the Fittest

• 2010-11: Resume Searching & Scoring. Aggregated Data

• 2011-12: Social Job Search. Connected Data

Gartner’s “5 Graphs”: Consumer Web Giants Depend on Five Graphs

• Social Graph

• Interest Graph

• Payment Graph

• Intent Graph

• Mobile Graph

Ref: http://www.gartner.com/id=2081316


Graph Buzz!

Neo4j Adoption Snapshot*

Select Commercial Customers (Community Users Not Included)

Core Industries: Web / ISV; Finance & Insurance; Communications; Logistics; Life Sciences; Media & Publishing; Education, Not-for-Profit; Government, Aerospace, Gaming, Other

Use Cases: Network Management; MDM; Social; Geo; Authorization & Access Control; Content Management; Recommendations; Fraud Detection, Other

[Figure: grid of customer logos by industry and use case, including Accenture]

The Graph Technology Ecosystem

Key Graph Analytic Technologies

Data Storage & Processing:

• Graph Databases

• Graph Compute Engines

Programming:

• Graph-Centric APIs & Languages

• Graph Algorithms

Tools:

• Visualization Tools & Libraries

• Other

Typical Graph BI Environment

[Diagram components: Application; Other Databases; ETL; Neo4j Cluster (Data Storage & Business Rules Execution); Reporting; Graph Dashboards & Ad-hoc Analysis; Graph Visualization (End User: ad-hoc visual navigation & discovery); ETL; Bulk Analytic Infrastructure, e.g. Graph Compute Engine (Graph Mining & Aggregation); Data Scientist (Ad-Hoc Analysis)]

What is a Graph Database?

A graph database is an online (“real-time”) database management system with CRUD methods that expose a graph data model.

Two important properties:

• Native graph processing, including index-free adjacency [1] to facilitate traversals

• Native graph storage engine, i.e. written from the ground up to be optimized for managing graph data

[1] See Rodriguez, M.A., Neubauer, P., “The Graph Traversal Pattern,” 2010 (http://arxiv.org/abs/1004.1001)
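Index-free adjacency can be illustrated with a minimal Python sketch. It is not how Neo4j is implemented; the `Node` class and `traverse` function are hypothetical names chosen for illustration. The point it shows is that each node holds direct references to its neighbours, so a traversal follows references instead of consulting a global index at each hop.

```python
from collections import deque


class Node:
    """Hypothetical node that stores direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.outgoing = []  # list of (relationship_type, target Node)

    def connect(self, rel_type, target):
        self.outgoing.append((rel_type, target))


def traverse(start, max_depth):
    """Breadth-first traversal that only follows object references."""
    seen = {start.name}
    order = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        order.append(node.name)
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for _rel_type, neighbour in node.outgoing:
            if neighbour.name not in seen:
                seen.add(neighbour.name)
                queue.append((neighbour, depth + 1))
    return order


# Illustrative data only: a chain a -> b -> c -> d.
a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
a.connect("KNOWS", b)
b.connect("KNOWS", c)
c.connect("KNOWS", d)
reachable = traverse(a, max_depth=2)
```

The cost of each hop depends on the number of relationships on the current node, not on the total size of the graph, which is the property the slide attributes to native graph processing.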

Overview of Popular Graph Data Models

• Property Graph

  • Description: A “directed, labeled, attributed, multi-graph” [1] which exposes three building blocks: nodes, typed relationships, and key-value properties on both nodes and relationships

  • Vendors: Neo4j, OrientDB, InfiniteGraph, Dex

• RDF Triples

  • Description: URI-centered subject-predicate-object triples as pioneered by the semantic web movement [2]

  • Vendors: AllegroGraph, Sesame

• HyperGraph

  • Description: A generalized graph where a relationship can connect an arbitrary number of nodes (compared to the more common binary graph models) [3]

  • Vendors: HyperGraphDB, TrinityDB

[1] Rodriguez, M.A., Neubauer, P., “Constructions from Dots and Lines,” 2010, http://arxiv.org/abs/1006.2361
[2] W3C, “The Resource Description Framework (RDF),” 2004, http://www.w3.org/RDF/
[3] Wikipedia, http://en.wikipedia.org/wiki/Hypergraph
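The three data models can be contrasted with a small Python sketch. All class names, property values, and the example.org URIs below are hypothetical and serve only to show the shape of each model; none of them come from a vendor API.

```python
from dataclasses import dataclass, field


@dataclass
class PGNode:
    """Property graph node: an id plus key-value properties."""
    node_id: int
    properties: dict = field(default_factory=dict)


@dataclass
class PGRelationship:
    """Property graph relationship: typed, directed, with its own properties."""
    rel_type: str
    start: int
    end: int
    properties: dict = field(default_factory=dict)


# Property graph: properties live on both nodes and relationships.
philip = PGNode(1, {"name": "Philip"})
friend = PGNode(2, {"name": "Friend"})
knows = PGRelationship("IS_FRIEND_OF", philip.node_id, friend.node_id,
                       {"since": 2010})  # illustrative value

# RDF: every fact is a (subject, predicate, object) triple of URIs or literals.
triples = [
    ("http://example.org/person/1",
     "http://example.org/isFriendOf",
     "http://example.org/person/2"),
]

# Hypergraph: one relationship may connect any number of nodes.
hyperedge = {"type": "MEETING", "members": frozenset({1, 2, 3})}
```

In the property graph the relationship carries its own attributes, whereas in plain triples an attribute on a relationship needs extra statements, and the hyperedge shows a single relationship spanning three nodes.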

Graph Compute Engine

Processing platforms that enable graph global computational algorithms to be run against large data sets

[Diagram components: System(s) of Record; Data extraction, transformation, and load; Graph Compute Engine; In-Memory Processing; Graph Mining Engine (Working Storage)]

Graph Global Queries

What is the max/min/avg. number of connections per node? (aka “Degree Distribution”)
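A global query of this kind touches every node, which is why it suits a compute engine more than a single traversal. The following Python sketch computes the three statistics over an edge list; the function name `degree_stats` and the sample edges are assumptions made for illustration.

```python
from collections import Counter


def degree_stats(edges):
    """Return max, min, and average number of connections per node.

    Each edge is an (a, b) pair and counts once for each endpoint.
    Nodes without any edge are not represented in an edge list and are
    therefore not counted here.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    values = list(degree.values())
    return {
        "max": max(values),
        "min": min(values),
        "avg": sum(values) / len(values),
    }


# Illustrative data only.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
stats = degree_stats(edges)
```

Here node `a` has three connections, `b` and `c` have two each, and `d` has one, so the average is 8 / 4 = 2.0.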

What Can You Do with a Graph Database?

Example: Facebook Graph Search


For the Facebook Graph Question:

What sushi restaurants in NYC do my friends like?

What the Graph Looks Like:

What sushi restaurants in NYC do my friends like?

What the Cypher Query Looks Like:

What sushi restaurants in NYC do my friends like?

START me=node:person(name = 'Philip'),
      location=node:location(location='New York'),
      cuisine=node:cuisine(cuisine='Sushi')
MATCH (me)-[:IS_FRIEND_OF]->(friend)-[:LIKES]->(restaurant)-[:LOCATED_IN]->(location),
      (restaurant)-[:SERVES]->(cuisine)
RETURN restaurant
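To make the pattern in the query easier to follow, here is a small in-memory Python sketch of the same match. It is not a Neo4j client and uses no database; the edge list, the friend and restaurant names, and the helper functions are all hypothetical. Only the relationship types and the three lookup values come from the query.

```python
# Illustrative data: (source, relationship type, target) edges.
edges = [
    ("Philip", "IS_FRIEND_OF", "Ana"),
    ("Philip", "IS_FRIEND_OF", "Ben"),
    ("Ana", "LIKES", "Sushi Place"),
    ("Ben", "LIKES", "Pizza Place"),
    ("Ben", "LIKES", "Sushi Elsewhere"),
    ("Sushi Place", "LOCATED_IN", "New York"),
    ("Pizza Place", "LOCATED_IN", "New York"),
    ("Sushi Elsewhere", "LOCATED_IN", "Boston"),
    ("Sushi Place", "SERVES", "Sushi"),
    ("Pizza Place", "SERVES", "Pizza"),
    ("Sushi Elsewhere", "SERVES", "Sushi"),
]


def targets(source, rel_type):
    """All nodes reached from `source` over one outgoing `rel_type` edge."""
    return {t for s, r, t in edges if s == source and r == rel_type}


def restaurants_friends_like(me, location, cuisine):
    """Follow me -> friend -> restaurant, then check location and cuisine."""
    result = set()
    for friend in targets(me, "IS_FRIEND_OF"):
        for restaurant in targets(friend, "LIKES"):
            if (location in targets(restaurant, "LOCATED_IN")
                    and cuisine in targets(restaurant, "SERVES")):
                result.add(restaurant)
    return sorted(result)


matches = restaurants_friends_like("Philip", "New York", "Sushi")
```

One difference from the query: this sketch collects results in a set, so a restaurant liked by several friends appears once, whereas the Cypher query as written returns one row per matching path.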

What the Search Looks Like:

What sushi restaurants in NYC do my friends like?

What Other Graph Searches Look Like

What drugs will bind to protein X and not interact with drug Y?


Graph Dashboards

Social Network Analysis

Fraud Detection & Money Laundering

Service Assurance & Network Failure Analysis

Industry Example: 5 Graphs of Communications

Graphs in Communications

#1: The Network Graph

[Diagram: a Service, Routers, Switches, Fiber Links, and an Oceanfloor Cable, connected by DEPENDS_ON and LINKED relationships]

Cell Signal Analysis

“What if” Downtime Analysis (Service-to-Infrastructure Mapping)

Network Inventory & Cost Accounting
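A “what if” downtime analysis over DEPENDS_ON relationships can be sketched in Python as a reverse traversal: starting from a failed component, collect everything that depends on it directly or indirectly. The function name, the component names, and the dependency list are hypothetical, and the sketch assumes every dependency is a hard one (no redundancy or failover is modelled).

```python
from collections import defaultdict, deque


def impacted_by(failed, depends_on):
    """Return every item that transitively depends on `failed`.

    `depends_on` is a list of (item, dependency) pairs, read as
    "item DEPENDS_ON dependency".
    """
    dependents = defaultdict(set)
    for item, dependency in depends_on:
        dependents[dependency].add(item)

    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for item in dependents[current]:
            if item not in impacted:
                impacted.add(item)
                queue.append(item)
    return impacted


# Illustrative topology only.
depends_on = [
    ("Service", "Router A"),
    ("Router A", "Switch A"),
    ("Switch A", "Fiber Link 1"),
    ("Router B", "Switch B"),
    ("Switch B", "Fiber Link 2"),
]
affected = impacted_by("Fiber Link 1", depends_on)
```

With this data, losing Fiber Link 1 affects Switch A, Router A, and the Service, while losing Fiber Link 2 leaves the Service untouched.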

Graphs in Communications

#2: The Social Graph

Mobile apps, Collaboration, Social Recommendations, and more...

Graphs in Communications

#3: The Call Graph

Plan & Feature Recommendations, Assess Churn Risk

Graphs in Communications

#4: Master Data Graph

Organizational Hierarchy Management, Resource Authorization

Ref: http://www.slideshare.net/verheughe/how-nosql-paid-off-for-telenor

Graphs in Communications

#5: The Help Desk Graph

Online Recommendations for Case Avoidance

Emergent Graph in Other Industries (Actual Neo4j Graphs)

Entitlements & Identity Management

Network Asset Management

Network Cell Analysis

Geo Routing (Public Transport)

BioInformatics

Insurance Risk Analysis

Emergent Graph in Other Industries (Actual Neo4j Graphs)

Web Browsing

Portfolio Analytics

Mobile Social Application

Gene Sequencing

Selected Case Studies

Industry: Web/ISV, Communications
Use case: Network Management
Global (U.S., France)

Background

• World’s largest provider of IT infrastructure, software & services

• HP’s Unified Correlation Analyzer (UCA) application is a key application inside HP’s OSS Assurance portfolio

• Carrier-class resource & service management, problem determination, root cause & service impact analysis

• Helps communications operators manage large, complex and fast-changing networks

Business problem

• Use network topology information to identify the root causes of problems on the network

• Simplify alarm handling by human operators

• Automate handling of certain types of alarms

• Help operators respond rapidly to network issues

• Filter/group/eliminate redundant Network Management System alarms by event correlation

Solution & Benefits

• Accelerated product development time

• Extremely fast querying of network topology

• Graph representation a perfect domain fit

• 24x7 carrier-grade reliability with Neo4j HA clustering

• Met objective in under 6 months

Industry: Logistics
Use case: Parcel Routing

Background

• One of the world’s largest logistics carriers

• Projected to outgrow capacity of old system

• New parcel routing system

  • Single source of truth for entire network

  • B2C & B2B parcel tracking

  • Real-time routing: up to 5M parcels per day

Business problem

• 24x7 availability, year round

• Peak loads of 2500+ parcels per second

• Complex and diverse software stack

• Need predictable performance & linear scalability

• Daily changes to logistics network: route from any point, to any point

Solution & Benefits

• Neo4j provides the ideal domain fit: a logistics network is a graph

• Extreme availability & performance with Neo4j clustering

• Hugely simplified queries, vs. relational for complex routing

• Flexible data model can reflect real-world data variance much better than relational

• “Whiteboard friendly” model easy to understand
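To illustrate why routing “from any point, to any point” maps naturally onto a graph, here is a Python sketch that finds the cheapest route with Dijkstra’s algorithm. The slides do not say which routing algorithm the carrier uses, so Dijkstra is an assumption made for illustration, as are the function name, the depot and hub names, and the costs.

```python
import heapq


def cheapest_route(links, origin, destination):
    """Return (total_cost, path) for the cheapest route, or None if unreachable.

    `links` is a list of (a, b, cost) tuples treated as two-way connections.
    """
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    best = {origin: 0}
    heap = [(0, origin, [origin])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == destination:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper one was already processed
        for neighbour, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(heap, (new_cost, neighbour, path + [neighbour]))
    return None


# Illustrative network only.
links = [
    ("Depot A", "Hub 1", 4),
    ("Hub 1", "Hub 2", 3),
    ("Depot A", "Hub 2", 9),
    ("Hub 2", "Depot B", 2),
]
route = cheapest_route(links, "Depot A", "Depot B")
```

A daily change to the network is then just an edit to the list of links; the routing logic itself does not change.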

Industry: Online Job Search
Use case: Social / Recommendations
Sausalito, CA

Background

• Online jobs and career community, providing anonymized inside information to job seekers

Business problem

• Wanted to leverage known fact that most jobs are found through personal & professional connections

• Needed to rely on an existing source of social network data. Facebook was the ideal choice.

• End users needed to get instant gratification

• Aiming to have the best job search service, in a very competitive market

Solution & Benefits

• First-to-market with a product that let users find jobs through their network of Facebook friends

• Job recommendations served real-time from Neo4j

• Individual Facebook graphs imported real-time into Neo4j

• Glassdoor now stores > 50% of the entire Facebook social graph

• Neo4j cluster has grown seamlessly, with new instances being brought online as graph size and load have increased

[Diagram: Person nodes connected to each other by KNOWS relationships and to Company nodes by WORKS_AT relationships]
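The KNOWS / WORKS_AT model in this case study lends itself to a simple recommendation traversal: find the companies where people in a user’s network work. The Python sketch below is an in-memory illustration, not Glassdoor’s implementation; the function name, the `max_hops` parameter, and all people and company names are hypothetical.

```python
from collections import defaultdict, deque


def companies_via_network(person, knows, works_at, max_hops=2):
    """Map each company to the contacts within `max_hops` who work there."""
    neighbours = defaultdict(set)
    for a, b in knows:  # KNOWS is treated as mutual
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Breadth-first search to find everyone within max_hops of `person`.
    distance = {person: 0}
    queue = deque([person])
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue
        for other in neighbours[current]:
            if other not in distance:
                distance[other] = distance[current] + 1
                queue.append(other)

    contacts = defaultdict(list)
    for employee, company in works_at:
        if employee != person and employee in distance:
            contacts[company].append(employee)
    return {company: sorted(people) for company, people in contacts.items()}


# Illustrative data only.
knows = [("Ann", "Bob"), ("Bob", "Cid"), ("Cid", "Dee")]
works_at = [("Bob", "Acme"), ("Cid", "Acme"), ("Dee", "Globex")]
suggestions = companies_via_network("Ann", knows, works_at)
```

With the default of two hops, Ann reaches Bob and Cid but not Dee, so only Acme is suggested; raising `max_hops` to 3 adds Globex.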

Cisco.com

Industry: Communications
Use case: Recommendations
San Jose, CA

Background

• Cisco.com provides Support Services to its customers, including businesses

• Needed real-time recommendations, to encourage use of online knowledge base

• Cisco had been successfully using Neo4j for its internal master data management solution

• Identified a strong fit for online recommendations

Business problem

• Call center volumes needed to be lowered by improving the efficacy of online self service

• Leverage large amounts of knowledge stored in service cases, solutions, articles, forums, etc.

• Problem resolution times, as well as support costs, needed to be lowered

Solution & Benefits

• Cases, solutions, articles, etc. continuously scraped for cross-reference links, and represented in Neo4j

• Real-time reading recommendations via Neo4j

• Neo4j Enterprise with HA cluster

• The result: customers obtain help faster, with decreased reliance on customer support

[Diagram: Support Case, Solution, Knowledge Base Article, and Message nodes linked to one another]

Interactive Television Programming

Industry: Communications
Use case: Social gaming
Frankfurt, Germany

Background

• Europe’s largest communications company

• Provider of mobile & land telephone lines to consumers and businesses, as well as internet services, television, and other services

Business problem

• The Fanorakel application allows fans to have an interactive experience while watching sports

• Fans can vote for referee decisions and interact with other fans watching the game

• Highly connected dataset with real-time updates

• Queries need to be served real-time on rapidly changing data

• One technical challenge is to handle the very high spikes of activity during popular games

Solution & Benefits

• Interactive, social offering gives fans a way to experience the game more closely

• Increased customer stickiness for Deutsche Telekom

• A completely new channel for reaching customers with information, promotions, and ads

• Clear competitive advantage


Reasons for Choosing a Graph Database

1. Order-of-magnitude improvements in query performance for complex, connected data

2. Drastically accelerated application development cycles

3. Maintainability and extensibility of the data model

4. Maturity and reliability of the product

Questions?

Thank you!

To learn more:

Cédric Fauvet, your contact in France

E-mail: [email protected]
Twitter: @Neo4jFr
French-speaking community: meetup.com/graphdb-france

