Date post: | 27-Jan-2015 |
Category: |
Technology |
Upload: | infinitegraph |
Agenda
• The NoSQL Landscape
• InfiniteGraph
• Solving what problems and how?
Copyright © InfiniteGraph
Some NoSQL Notes
• NoSQL = Not Only SQL
• NoSQL is requirements driven
• NoSQL = open source?
• NoSQL = cloud computing?
Company Confidential
The NoSQL Landscape
[Figure: NoSQL Landscape, showing four database categories against axes labelled Complexity and Performance]
• Key Value Stores
– Voldemort (LinkedIn)
– Dynamo (Amazon)
• BigTable Clones
– Cassandra (Facebook)
– HBase (Apache/Hadoop)
– Hypertable
• Document Databases
– CouchDB (Apache)
– MongoDB
• Graph Databases
– Neo4j
– HyperGraphDB
– AllegroGraph
– Sones
– InfiniteGraph
Application areas shown for graph databases: Social Network Analysis, Intelligence Community
Graph Databases
• A graph database is used to trace relationships among entities, most commonly people, to any depth. Its characteristics are:
– Very simple, fixed schema
– Very complex data relationships
– Used to support complex associations among like entities
[Figure: Meta-Model Instance Example (simplified). Nodes (John Jones, Jane Jones-Smith, Nancy Jones, Paul Jones, Doris Smith, Jim Smith, Jeff Smith) are connected by Edges; Attribute(s) are shown on the Jeff Smith node.]
InfiniteGraph
A business unit of Objectivity
• In the business of distributed data management for over 10 years
• Solving graph data problems for over 8 years
• Focusing on the emerging requirements of graph data for cloud and on-premise distributed systems
Graphs are everywhere
• Enterprise and government 2.0, bio-engineering, gene sequencing, drug development...
• LinkedIn, Facebook...
• Social network analytics, social CRM...
• Network analysis, complex BoM, predictive and real-time ISR, fraud detection and response...
Graph Databases – What’s So Different?
Darren Wood, Chief Architect, InfiniteGraph
Graph Databases
• Key technical attributes
• How InfiniteGraph addresses these
• Query and navigation
• Challenges/Requirements of Distribution
• Practical applications
Graph Databases
• Optimized around data relationships
– Relationships as first class citizens
– Super fast navigation between entities
– Rich/flexible annotation of connections
• Small focused API (typically not SQL)
– Natively work with concepts of Vertex/Edge
– SQL has no concept of “navigation”
– Most attempts based in SQL are convoluted
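The Vertex/Edge idea above can be sketched in a few lines of Python. This is only an illustrative toy model, not the InfiniteGraph API: the class names `Vertex` and `Edge` and the helpers `connect` and `neighbours` are hypothetical. It shows edges as first-class objects that carry their own annotations, and navigation as simply following stored references.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    name: str
    # Edges leaving this vertex; excluded from repr/eq to avoid cycles.
    out_edges: list = field(default_factory=list, repr=False, compare=False)


@dataclass
class Edge:
    kind: str                                   # relationship type, e.g. "Called"
    source: Vertex
    target: Vertex
    attrs: dict = field(default_factory=dict)   # free-form annotation


def connect(source, target, kind, **attrs):
    """Create an edge object and attach it to its source vertex."""
    edge = Edge(kind, source, target, attrs)
    source.out_edges.append(edge)
    return edge


def neighbours(vertex, kind=None):
    """Navigate one hop by following stored edges; no join or table scan."""
    return [e.target for e in vertex.out_edges if kind is None or e.kind == kind]


bob, carlos, charlie = Vertex("Bob"), Vertex("Carlos"), Vertex("Charlie")
connect(bob, carlos, "Called", time="13:20", duration=25)
connect(bob, charlie, "Called", time="17:10", duration=15)
connect(carlos, charlie, "Paid", amount=100000)

called = [v.name for v in neighbours(bob, "Called")]
```

Because each edge is an object in its own right, adding a new kind of annotation (a duration, an amount) does not require a schema change, which is the flexibility the slide refers to.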
Physical Storage Comparison
Rows/Columns/Tables

Meetings
P1     Place   Time     P2
Alice  Denver  5-27-10  Bob

Calls
From  Time   Duration  To
Bob   13:20  25        Carlos
Bob   17:10  15        Charlie

Payments
From    Date     Amount  To
Carlos  5-12-10  100000  Charlie

Relationship/Graph Optimized

Alice --Met 5-27-10--> Bob
Bob --Called 13:20--> Carlos
Bob --Called 17:10--> Charlie
Carlos --Paid 100000--> Charlie
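The two layouts can be compared with a small Python sketch. The table and column names come from the slide; the function names and the in-memory representation are assumptions made for illustration and say nothing about how either kind of database stores data on disk.

```python
from collections import defaultdict

# The three tables from the slide, as lists of row dictionaries.
meetings = [{"P1": "Alice", "Place": "Denver", "Time": "5-27-10", "P2": "Bob"}]
calls = [
    {"From": "Bob", "Time": "13:20", "Duration": 25, "To": "Carlos"},
    {"From": "Bob", "Time": "17:10", "Duration": 15, "To": "Charlie"},
]
payments = [{"From": "Carlos", "Date": "5-12-10", "Amount": 100000, "To": "Charlie"}]


def contacts_from_tables(person):
    """Row-oriented lookup: every table has to be scanned for the person."""
    found = [r["P2"] for r in meetings if r["P1"] == person]
    found += [r["To"] for r in calls if r["From"] == person]
    found += [r["To"] for r in payments if r["From"] == person]
    return found


# Relationship-oriented layout: each person keeps their own outgoing edges.
graph = defaultdict(list)
for r in meetings:
    graph[r["P1"]].append(("Met", r["Time"], r["P2"]))
for r in calls:
    graph[r["From"]].append(("Called", r["Time"], r["To"]))
for r in payments:
    graph[r["From"]].append(("Paid", r["Amount"], r["To"]))


def contacts_from_graph(person):
    """Graph lookup: one dictionary access, then read the stored edges."""
    return [target for _kind, _label, target in graph.get(person, [])]
```

Both functions return the same contacts; the difference is that the table version touches every table on every hop, while the graph version only reads the edges attached to the vertex being visited.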
Query and Navigation
• Queries – but not as you know them
• More like a rules-based search and discovery
• Asynchronous Results
[Figure: Alice Meets Bob; Bob Calls Carlos; Carlos Pays Charlie; Bob Calls Charlie]
“Find all paths between Alice and Charlie”
“Find all paths between Alice and Charlie – within 2 degrees”
“Find all paths between Alice and Charlie – events in May 2010”
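The three example queries can be approximated with one small path finder in Python. This is a sketch under stated assumptions, not InfiniteGraph's query API: `find_paths`, `max_hops` and `edge_ok` are hypothetical names, "within 2 degrees" is read as "at most two edges", and the two call dates are invented for illustration because the slides give only times for the calls. A generator stands in for asynchronous results, yielding each path as soon as it is found.

```python
from datetime import date

# (source, relationship, target, date of the event)
EDGES = [
    ("Alice", "Meets", "Bob", date(2010, 5, 27)),
    ("Bob", "Calls", "Carlos", date(2010, 5, 28)),    # hypothetical date
    ("Bob", "Calls", "Charlie", date(2010, 6, 2)),    # hypothetical date
    ("Carlos", "Pays", "Charlie", date(2010, 5, 12)),
]


def find_paths(start, goal, max_hops=None, edge_ok=lambda edge: True):
    """Lazily yield every simple path from start to goal as a list of edges.

    max_hops bounds the number of edges on a path; edge_ok is a rule that
    every edge on the path must satisfy.
    """
    stack = [(start, [], {start})]
    while stack:
        node, path, seen = stack.pop()
        if node == goal and path:
            yield path
            continue
        if max_hops is not None and len(path) >= max_hops:
            continue
        for edge in EDGES:
            src, _kind, dst, _when = edge
            if src == node and dst not in seen and edge_ok(edge):
                stack.append((dst, path + [edge], seen | {dst}))


def names(path):
    """Render a path as the sequence of people it visits."""
    return [path[0][0]] + [edge[2] for edge in path]


all_paths = sorted(names(p) for p in find_paths("Alice", "Charlie"))
within_two = [names(p) for p in find_paths("Alice", "Charlie", max_hops=2)]
in_may_2010 = [
    names(p)
    for p in find_paths(
        "Alice", "Charlie",
        edge_ok=lambda e: (e[3].year, e[3].month) == (2010, 5),
    )
]
```

The hop limit and the edge rule prune the search while it runs rather than filtering finished results, which is closer to the "rules-based search and discovery" described on the slide than to a conventional declarative query.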
Management of Large Data Graphs
• Graphs grow quickly
– Billions of phone calls / day in US
– Emails, social media events, IP Traffic
– Financial transactions
• Some analytics require navigation of large sections of the graph
• Each step (often) depends on the last
• Must distribute data and go parallel
Graph Partitioning
• Graph partitioning is not as simple
• Graph operations are rarely partition bound
• Graphs are ‘alive’
• Repartitioning is expensive
• Partitions must co-operate
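Why graph operations are rarely partition bound can be seen by counting the edges that cross partition boundaries. The sketch below assumes a naive hash placement, chosen only for illustration; the slides do not say which placement strategy InfiniteGraph uses, and `partition_of` and `cut_edges` are hypothetical helper names.

```python
import zlib

EDGES = [
    ("Alice", "Bob"),
    ("Bob", "Carlos"),
    ("Bob", "Charlie"),
    ("Carlos", "Charlie"),
]


def partition_of(vertex, n_partitions):
    """Assign a vertex to a partition with a stable hash (CRC32)."""
    return zlib.crc32(vertex.encode("utf-8")) % n_partitions


def cut_edges(edges, n_partitions):
    """Return the edges whose endpoints land in different partitions."""
    return [
        (a, b) for a, b in edges
        if partition_of(a, n_partitions) != partition_of(b, n_partitions)
    ]


# Every cut edge is a hop that forces partitions to co-operate.
crossing = cut_edges(EDGES, n_partitions=2)
```

Hash placement ignores the shape of the graph, so connected vertices are often separated. Placing them together reduces the cut, but as the graph changes the best placement changes too, which is why repartitioning cost matters.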
Graph Partitioning – Reality!
[Figure: Application(s) sit on top of a Distributed API, which spans Partition 1, Partition 2, Partition 3, ... Partition n, each with its own Processor]
Distributed Graph Must Haves
• High performance distributed persistence
• Ability to deal with remote data reads (fast)
• Intelligent local cache of subgraphs
• Distributed navigation processing
• Distributed, multi-source concurrent ingest
• Write modes supporting both strict and eventual consistency
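One of these requirements, a local cache of subgraphs in front of remote reads, can be illustrated with a least-recently-used cache. This is a minimal sketch with hypothetical classes (`RemoteStore`, `CachedNavigator`); it simulates remote partitions with a dictionary and does not reflect how InfiniteGraph's cache is actually built.

```python
from collections import OrderedDict


class RemoteStore:
    """Stand-in for partitions on other machines; counts simulated fetches."""

    def __init__(self, adjacency):
        self.adjacency = adjacency
        self.reads = 0

    def fetch(self, vertex):
        self.reads += 1
        return list(self.adjacency.get(vertex, []))


class CachedNavigator:
    """Keeps recently used adjacency lists locally, evicting the oldest."""

    def __init__(self, store, capacity=2):
        self.store = store
        self.capacity = capacity
        self.cache = OrderedDict()

    def neighbours(self, vertex):
        if vertex in self.cache:
            self.cache.move_to_end(vertex)   # mark as most recently used
            return self.cache[vertex]
        result = self.store.fetch(vertex)    # the slow, remote path
        self.cache[vertex] = result
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # drop the least recently used
        return result


store = RemoteStore(
    {"Alice": ["Bob"], "Bob": ["Carlos", "Charlie"], "Carlos": ["Charlie"]}
)
nav = CachedNavigator(store, capacity=2)
nav.neighbours("Bob")
nav.neighbours("Bob")      # served from the local cache
reads_after_repeat = store.reads
```

Navigation tends to revisit well-connected vertices, so even a small cache can remove many remote reads; a production cache would also need invalidation when another writer changes the cached subgraph.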
Practical Applications
Graph Analysis (Algorithms)
• Social Networks
– Most connected participants
– Influencers
– Important Syndicates or Sub-networks
• Central figures in crime organisations
• Business Intelligence
– Discovering Knowledge Assets
– Complex analytics
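"Most connected participants" corresponds to the simplest graph measure, degree centrality: count the edges touching each vertex. The sketch below uses the four people from the earlier example as an undirected edge list; the function name `degree_ranking` is hypothetical, and real analyses of influencers usually need richer measures than degree alone.

```python
from collections import Counter

EDGES = [
    ("Alice", "Bob"),
    ("Bob", "Carlos"),
    ("Bob", "Charlie"),
    ("Carlos", "Charlie"),
]


def degree_ranking(edges):
    """Rank vertices by degree (number of incident edges), highest first."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name so ties are deterministic.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


ranking = degree_ranking(EDGES)
most_connected = ranking[0][0]
```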
Graph Analysis (Patterns)
• Crime (again)
– Recognize common patterns of activity
– Complex chains of interaction
• Security
– Recognize attack/threat patterns
– Auditing / log analytics
• Targeting Advertising
– To specific browsing patterns
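Recognizing a "chain of interaction" amounts to matching a sequence of relationship types against the graph. The following sketch, with a hypothetical `match_chain` helper, looks for the Met, Called, Paid chain from the earlier Alice/Bob/Carlos/Charlie example; it is a naive illustration of the idea rather than a production pattern matcher.

```python
EDGES = [
    ("Alice", "Met", "Bob"),
    ("Bob", "Called", "Carlos"),
    ("Bob", "Called", "Charlie"),
    ("Carlos", "Paid", "Charlie"),
]


def match_chain(edges, pattern):
    """Return every vertex chain whose consecutive edges have the given kinds."""
    # Start with one partial chain per vertex that has an outgoing edge.
    chains = [[src] for src in sorted({src for src, _kind, _dst in edges})]
    for wanted in pattern:
        # Extend each partial chain by every edge of the wanted kind.
        chains = [
            chain + [dst]
            for chain in chains
            for src, kind, dst in edges
            if src == chain[-1] and kind == wanted
        ]
    return chains


matches = match_chain(EDGES, ["Met", "Called", "Paid"])
```

The same function works for attack or browsing patterns if the edge kinds are log events or page visits instead of meetings and calls.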
Many, Many More!
• Spatial data
• Defence / Situational Awareness
• Sciences
• Health Care
• Genealogy
• Logistics
• Tracking