Date posted: 31-Jul-2015
Category: Software
Uploaded by: russell-spitzer
Company Confidential © 2015 Aeris Communications, Inc. All Rights Reserved. © 2015 DataStax, All Rights Reserved.
“ideal to store time series data”
“Apache Cassandra has never failed us.”
Startup Program
ToastrBox
Analytics
Search
In-memory
Visual Admin
Security
Certified Cassandra
IoT requires performance and reliability
Your System
Send Heating Coil Repair Man
Send 10% Off Bread Coupon
Offer Upgrade Suggestions
Integrate with your SaaS (Spread as a Service)
Your System: FAULT
App Down, Customers Lose Interest
Your System: SLOW
Send Heating Coil Repair Man: three months after they get a competitor's toaster
Offer Upgrade Suggestions: that are already out of date
Send 10% Off Bread Coupon: they've already restocked on bread
Integrate with your SaaS (Spread as a Service): toast got spread a long time ago
Scale
Nodes    Client writes per second
50       174,373
100      366,828
150      537,172
300      1,099,837
Source: http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
Horizontal scale
Ring Architecture, Peer to Peer Communication
No Masters, No Slaves
Token Range Mapping Data To Nodes
[Diagram: a ring of two nodes (A, B) grows to four nodes (A, B, C, D); the token ranges A, B, C and D are divided among the nodes]
Availability
"During Hurricane Sandy, we lost an entire data center. Completely. Lost. It.
Our data in Cassandra never went offline."
Peer-to-peer architecture
[Diagram: a Client connected to a Cassandra Cluster of four nodes, A, B, C and D]
Client has a holistic view
Cluster cluster = Cluster.builder().addContactPoint("192.168.0.1").build();
Partition Keys are Hashed to a Token Range
DeviceID: 102349
Divided data responsibility across cluster
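The hashing step above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class name `TokenRingSketch`, the 32-bit token space, and the use of CRC32 as the hash are all hypothetical stand-ins chosen for this example, not what Cassandra itself uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical sketch of a token ring; not the Cassandra implementation.
final class TokenRingSketch {
    // token -> node name; a node owns every token up to and including its own.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String name, long token) {
        ring.put(token, name);
    }

    // Assumption: CRC32 stands in for the real partitioner's hash function.
    static long tokenFor(String partitionKey) {
        CRC32 crc = new CRC32();
        crc.update(partitionKey.getBytes(StandardCharsets.UTF_8));
        return crc.getValue(); // always in the range 0 .. 2^32 - 1
    }

    // First node whose token is >= the given token, wrapping to the start of the ring.
    String ownerOfToken(long token) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(token);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    String ownerOf(String partitionKey) {
        return ownerOfToken(tokenFor(partitionKey));
    }

    // Four nodes splitting the 32-bit token space into equal ranges.
    static TokenRingSketch fourNodeRing() {
        TokenRingSketch r = new TokenRingSketch();
        r.addNode("A", (1L << 30) - 1);
        r.addNode("B", (1L << 31) - 1);
        r.addNode("C", 3 * (1L << 30) - 1);
        r.addNode("D", (1L << 32) - 1);
        return r;
    }

    public static void main(String[] args) {
        TokenRingSketch ring = fourNodeRing();
        System.out.println("DeviceID 102349 is owned by node " + ring.ownerOf("102349"));
    }
}
```

Because the owner is computed from the key alone, any client that knows the ring layout can work out which node to contact without asking a master.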
Controlling fault tolerance: replication factor
Server - Replication: How many copies of each piece of data should exist in the cluster?
ReplicationFactor=3
Replication Strategies can span data centers! Survive whole AWS Region Failure!
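[Diagram: with ReplicationFactor=3 on a four-node ring, each node stores three of the four token ranges (ACD, ABC, ABD, BCD)]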
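One simple way to picture replica placement is to start at the node that owns a token range and walk clockwise around the ring until enough copies are placed. The sketch below assumes that single-data-center scheme; the class and method names are hypothetical, and placement across data centers is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: place replicas by walking the ring clockwise from the owner.
final class ReplicaPlacementSketch {
    static List<String> replicasFor(List<String> ringOrder, int ownerIndex, int replicationFactor) {
        // There cannot be more distinct replicas than nodes.
        int copies = Math.min(replicationFactor, ringOrder.size());
        List<String> replicas = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            replicas.add(ringOrder.get((ownerIndex + i) % ringOrder.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        List<String> ring = Arrays.asList("A", "B", "C", "D");
        for (int i = 0; i < ring.size(); i++) {
            System.out.println("Range " + ring.get(i) + " -> replicas " + replicasFor(ring, i, 3));
        }
    }
}
```

With four nodes and ReplicationFactor=3, every node ends up holding three of the four ranges, so any single node can fail without a range becoming unavailable.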
[Diagram: the same replica layout with ReplicationFactor=3, shown in two data centers, US-West and US-East]
Controlling fault tolerance: tunable consistency
Client - Consistency Level: How many replicas should we check before acknowledgement?
CL = One: Successful Toast Made!
CL = Quorum: Toaster Burst Into Flames!
Higher Consistency Levels Let Us Make Sure Events Are Persisted
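The arithmetic behind the consistency levels is small enough to write down. A quorum is a majority of the replicas, that is, floor(RF / 2) + 1. The sketch below covers only the three levels One, Quorum and All, and its class and method names are hypothetical.

```java
// Hypothetical sketch of how many replica acknowledgements each level needs.
final class ConsistencySketch {
    enum Level { ONE, QUORUM, ALL }

    static int requiredAcks(Level level, int replicationFactor) {
        switch (level) {
            case ONE:
                return 1;
            case QUORUM:
                return replicationFactor / 2 + 1; // integer division gives the floor
            default:
                return replicationFactor; // ALL
        }
    }

    // A request can be acknowledged only if enough replicas are reachable.
    static boolean canSatisfy(Level level, int replicationFactor, int liveReplicas) {
        return liveReplicas >= requiredAcks(level, replicationFactor);
    }

    public static void main(String[] args) {
        for (Level level : Level.values()) {
            System.out.println(level + " with RF=3 needs " + requiredAcks(level, 3) + " replica(s)");
        }
    }
}
```

With ReplicationFactor=3, a routine event can be written at One and still succeed with two replicas down, while a critical event written at Quorum is acknowledged only once two replicas have it.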
Performance
[Benchmark chart; axis ticks 1, 2, 4, 8 and 0 to 160,000. Source: http://www.datastax.com/apache-cassandra-leads-nosql-benchmark]
Unparalleled durable performance
[Diagram: each incoming write is appended to the Commit Log on disk and added to a Memtable in memory; Memtables are later flushed to SSTables on disk]
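The write path in the diagram can be mimicked with in-memory collections. This is a toy model under loud assumptions: the commit log is a plain list rather than a file, an SSTable is an unmodifiable sorted map, the flush threshold counts rows instead of bytes, and clearing the log on flush is a simplification. The class name `WritePathSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical toy model of the write path: commit log, memtable, flush to SSTable.
final class WritePathSketch {
    final List<String> commitLog = new ArrayList<>();                   // stands in for the append-only log on disk
    TreeMap<String, String> memtable = new TreeMap<>();                 // sorted in-memory buffer
    final List<SortedMap<String, String>> sstables = new ArrayList<>(); // immutable, sorted "files"
    final int flushThreshold;

    WritePathSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void write(String key, String value) {
        commitLog.add(key + "=" + value); // 1. sequential append for durability
        memtable.put(key, value);         // 2. in-memory update, kept sorted by key
        if (memtable.size() >= flushThreshold) {
            flush();
        }
    }

    void flush() {
        if (memtable.isEmpty()) {
            return;
        }
        sstables.add(Collections.unmodifiableSortedMap(memtable)); // never modified after the flush
        memtable = new TreeMap<>();
        commitLog.clear(); // simplification: flushed rows no longer need replaying
    }

    public static void main(String[] args) {
        WritePathSketch node = new WritePathSketch(2);
        node.write("t1", "Idle");
        node.write("t2", "Toasting");
        System.out.println("SSTables on disk: " + node.sstables);
    }
}
```

Both steps of a write are cheap, a sequential append and an in-memory insert, which is why the write path does not have to wait on random disk IO.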
Reading data is fast but limited by disk IO
[Diagram: a Replica answers a read by combining rows from its Memtables in memory and its SSTables on disk; when versions conflict, the last write wins (LWW)]
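Last write wins (LWW) can be illustrated as a merge over every place a value might live. The sketch below assumes each stored cell carries a write timestamp and that the read simply keeps the newest one. `ReadMergeSketch` and `Cell` are hypothetical names, and real reads involve more machinery than this.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a last-write-wins merge across a memtable and SSTables.
final class ReadMergeSketch {
    static final class Cell {
        final String value;
        final long writeTimestamp;

        Cell(String value, long writeTimestamp) {
            this.value = value;
            this.writeTimestamp = writeTimestamp;
        }
    }

    // Check every source and keep the cell with the highest write timestamp.
    static Cell read(String key, List<Map<String, Cell>> sources) {
        Cell newest = null;
        for (Map<String, Cell> source : sources) {
            Cell candidate = source.get(key);
            if (candidate != null && (newest == null || candidate.writeTimestamp > newest.writeTimestamp)) {
                newest = candidate;
            }
        }
        return newest; // null when no source has the key
    }

    public static void main(String[] args) {
        Map<String, Cell> sstable = new HashMap<>();
        sstable.put("toaster-1", new Cell("Toasting", 2L));
        Map<String, Cell> memtable = new HashMap<>();
        memtable.put("toaster-1", new Cell("Toast Success!", 3L));
        System.out.println(read("toaster-1", Arrays.asList(sstable, memtable)).value);
    }
}
```

The more SSTables a read has to consult, the more disk IO it costs, which is the limit the slide title refers to.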
Data modeling for time series
Things Generating Events
Store Events ordered by TimeUUID
t1 t2 t3 t4 t5 t6 t7 t8 t9
Data ends up stored in time order on disk (one SSTable holding t1 to t10, the next t11 to t20)
Additional tables with rollups, aggregates, etc.
With data stored sequentially by time, time-based queries become extremely fast!
Cassandra data modeling
-- toasterID is the partition key (pk); eventTime is the clustering key (ck)
CREATE TABLE example (toasterID UUID, eventTime TIMEUUID, event TEXT, PRIMARY KEY (toasterID, eventTime));
Whole partition available on each replica
Data ordered within Partition by Clustering Key
Partition Key | Idle  | Toasting | Toasting | Toast Success! | Idle
              | 12:00 | 12:01    | 12:02    | 12:03          | 12:04
Stored as Multiple SSTables, Each Internally Ordered
Easy to Search Ranges of Clustering Key
Difficult to Search Ranges of Partition Key
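Why clustering-key ranges are cheap can be shown with a sorted map per partition. The sketch below is an in-memory analogy, not driver code: `TimeSeriesSketch` is a hypothetical name, the partition key is a string, and the clustering key is a plain `long` (minutes since midnight) instead of a TIMEUUID.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory analogy of a partition ordered by its clustering key.
final class TimeSeriesSketch {
    // partition key (toaster id) -> rows sorted by clustering key (event time)
    private final Map<String, TreeMap<Long, String>> partitions = new HashMap<>();

    void insert(String toasterId, long eventTime, String event) {
        partitions.computeIfAbsent(toasterId, id -> new TreeMap<>()).put(eventTime, event);
    }

    // A time range inside one partition is one contiguous slice of sorted rows.
    SortedMap<Long, String> eventsBetween(String toasterId, long fromInclusive, long toExclusive) {
        TreeMap<Long, String> rows = partitions.get(toasterId);
        if (rows == null) {
            return new TreeMap<>();
        }
        return rows.subMap(fromInclusive, toExclusive);
    }

    public static void main(String[] args) {
        TimeSeriesSketch table = new TimeSeriesSketch();
        table.insert("toaster-1", 720L, "Idle");           // 12:00
        table.insert("toaster-1", 721L, "Toasting");       // 12:01
        table.insert("toaster-1", 722L, "Toasting");       // 12:02
        table.insert("toaster-1", 723L, "Toast Success!"); // 12:03
        table.insert("toaster-1", 724L, "Idle");           // 12:04
        System.out.println(table.eventsBetween("toaster-1", 721L, 724L));
    }
}
```

The partitions themselves sit in a hash map with no useful order, which mirrors why a range over partition keys is hard: hashing scatters neighbouring keys across the ring.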
DataStax Spark-Cassandra connector
[Diagram labels: Events, Receiver, DStream, Batch, RDD]
https://github.com/datastax/spark-cassandra-connector
Streaming data direct to Cassandra
It's easier than ever to connect your incoming event data with Cassandra