Intuit: Reporting from the Trenches: Using Cassandra Effectively

Post on 13-Apr-2017


Rekha Joshi, Intuit, Inc.

Reporting From Trenches: Using Cassandra Effectively!

Who Am I?

Staff Engineer at Intuit Inc.

O'Reilly Certified Apache Cassandra Professional

Good Software?

And a Truly Successful Software?

All This Data!!!!!!!!!!

Can I Lift This Alone?

Need For Speed

Cassandra, Who?

Cassandra is a Java-based NoSQL, linearly scalable, best-in-class tunable performance, fault-tolerant, distributed, masterless, time-series database.

Dynamo (Amazon)

Bigtable (Google)

Cassandra

Inherits data distribution (from Dynamo)

Inherits data model (from Bigtable)

Masterless Architecture

Linear Scalability

Tunable Consistency/Performance

Application

Query Access Patterns

(diagram arrows, each labeled "influencing")

Cassandra: The Hybrid Kid has the Edge!
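
The slides name tunable consistency without showing the arithmetic behind it. As a minimal sketch, assuming the usual rule of thumb that a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number written exceeds the replication factor, the trade-off can be worked out like this. The function names are hypothetical helpers for illustration, not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority (QUORUM) for this replication factor."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # QUORUM writes + QUORUM reads overlap on at least one replica.
    print("quorum for RF=3:", q)
    print("QUORUM/QUORUM overlaps:", reads_see_latest_write(rf, q, q))
    # ONE/ONE is faster but gives no overlap guarantee.
    print("ONE/ONE overlaps:", reads_see_latest_write(rf, 1, 1))
```

Lower consistency levels buy latency and availability at the cost of that overlap, which is the "tunable" part of the slide.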

Intuit And Cassandra

Cassandra = Intuit Technology Standard of Choice for NoSQL Distributed Database

Intuit On Mission

Personalized AB Testing Platform

Advanced Security

Analytics Options

Advanced Tools

Cassandra And DataStax Enterprise

Your Worries?

Fantasy And Engineering Fantasy

Application live on internal network

Blank Slate

Application live on AWS

Security approved

Data Security, Encryption

System happy, load tested, multiple releases, customers happy

Learnings – How? Why?

Successful Mini Peak Traffic, Paranoid Monitoring

Application releases use cases, Refactoring Data Model

Excellent Peak Tax season!!!

Timeline: Oct Start, Oct End, Nov, Dec, Jan, Feb, Mar, Apr

Trusting -> Paranoid -> Seasoned

Garbage Collection Issue

Clock Issue

Understand the Node Ring

nodetool status

nodetool ring

nodetool info

nodetool cfstats

nodetool tpstats
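
For the "paranoid monitoring" theme, one option is to scan `nodetool status` output for nodes that are not Up/Normal. The sketch below is an assumption-laden illustration: the sample text is invented for the example (not captured from a real cluster), the column layout can differ between Cassandra versions, and `parse_status` and `nodes_needing_attention` are hypothetical helper names. It relies only on the two-letter status/state code and the address that follows it.

```python
import re

# Illustrative sample only; not output captured from a real cluster.
SAMPLE_STATUS = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns    Host ID  Rack
UN  10.0.0.1   1.2 GB   256     33.4%   id-1     rack1
DN  10.0.0.2   1.1 GB   256     33.1%   id-2     rack1
UJ  10.0.0.3   0.3 GB   256     33.5%   id-3     rack2
"""

# First letter: U(p)/D(own); second: N(ormal)/L(eaving)/J(oining)/M(oving).
NODE_LINE = re.compile(r"^([UD])([NLJM])\s+(\S+)")


def parse_status(text):
    """Return one dict per node line found in nodetool-status-style text."""
    nodes = []
    for line in text.splitlines():
        match = NODE_LINE.match(line)
        if match:
            nodes.append({
                "up": match.group(1) == "U",
                "state": match.group(2),
                "address": match.group(3),
            })
    return nodes


def nodes_needing_attention(text):
    """Addresses of nodes that are down or not in the Normal state."""
    return [n["address"] for n in parse_status(text)
            if not n["up"] or n["state"] != "N"]


if __name__ == "__main__":
    print(nodes_needing_attention(SAMPLE_STATUS))
```

Against a live cluster, the text could come from something like `subprocess.run(["nodetool", "status"], capture_output=True, text=True).stdout`, assuming `nodetool` is on the PATH of the machine running the check.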

Repeat after me: Cassandra is a Java-based NoSQL, linearly scalable, best-in-class tunable performance, fault-tolerant, distributed, masterless, time-series database.

What If A Node Goes Down?
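
A toy model can make the answer concrete. The sketch below is a simplification, not Cassandra's implementation: it assumes one token per node (no virtual nodes), uses MD5 as a stand-in for the default Murmur3 partitioner, and places replicas on the next nodes clockwise around the ring in the style of SimpleStrategy. The `Ring` class and its methods are hypothetical names for this illustration.

```python
import hashlib
from bisect import bisect_left


def token(value: str) -> int:
    # Stand-in hash for illustration; Cassandra's default partitioner is Murmur3.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class Ring:
    """Toy masterless ring: one token per node, replicas walk clockwise."""

    def __init__(self, nodes, replication_factor):
        self.rf = replication_factor
        self.ring = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, partition_key):
        """Nodes holding a copy of the partition, starting at its owner."""
        # Owner is the first node whose token is >= the key's token (wrapping).
        start = bisect_left(self.tokens, token(partition_key)) % len(self.ring)
        count = min(self.rf, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def can_serve(self, partition_key, down, required):
        """True if enough replicas are still alive to meet the consistency level."""
        alive = [n for n in self.replicas(partition_key) if n not in down]
        return len(alive) >= required


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
    owners = ring.replicas("user:42")
    print("replicas:", owners)
    # With RF=3, QUORUM needs 2 replicas, so losing one replica is tolerated.
    print("one down, QUORUM ok:", ring.can_serve("user:42", {owners[0]}, 2))
    print("two down, QUORUM ok:", ring.can_serve("user:42", set(owners[:2]), 2))
    print("two down, ONE ok:", ring.can_serve("user:42", set(owners[:2]), 1))
```

Because there is no master, any surviving replica can answer; whether a request succeeds depends only on how many replicas the chosen consistency level requires.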

Tuning The Application

Refactor data model

Revisit the usage access patterns

Paranoid Monitoring

Repeat after me: Cassandra is a Java-based NoSQL, linearly scalable, best-in-class tunable performance, fault-tolerant, distributed, masterless, time-series database.

Tuning For Reads

Tuning For Writes

Tuning The System

Little-Talked-About Aspect Of The Pareto Principle!

Heavy Lifting? Easy!

Thank You!

https://www.linkedin.com/in/rekhajoshm

https://twitter.com/rekhajoshm