Date post: | 15-Jan-2015 |
Category: | Technology |
Upload: | shimik |
Introduction to Cassandra
Shimi Kiviti
@shimi_k
Motivation
Scaling
How do you scale your database?
● reads
● writes
Influential Papers
● Bigtable: A distributed storage system for structured data, 2006
● Dynamo: Amazon's highly available key-value store, 2007
Cassandra:
● Partitioning and replication: from Dynamo
● Log-structured column families: from Bigtable
Cassandra Highlights
● Symmetric - all nodes are exactly the same
  ○ No single point of failure
  ○ Linearly scalable
  ○ Ease of administration
● High availability with multiple datacenters
● Consistency vs Latency
● Read/Write anywhere
● Flexible Schema
● Column TTL
● Distributed Counters
DHT - Distributed Hash Table
DHT
● O(1) node lookup
● Explicit replication
● Linear Scalability
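The lookup and replication bullets above can be sketched as a token ring. This is only an illustration: `Ring`, `replicas`, the node names and the small integer tokens are hypothetical, and the "walk clockwise to the next nodes" placement is a simplified stand-in for a real replication strategy. The lookup is a single hop because every node is assumed to hold the full ring locally.

```scala
import scala.collection.immutable.TreeMap

// Hypothetical ring: maps each node's token to the node name.
final case class Ring(tokens: TreeMap[Int, String]) {
  // Replicas for a key token: the first node whose token is >= the key's token,
  // then the following distinct nodes clockwise, wrapping around the ring.
  def replicas(keyToken: Int, n: Int): List[String] = {
    val clockwise = tokens.iteratorFrom(keyToken) ++ tokens.iterator
    clockwise.map(_._2).toList.distinct.take(n)
  }
}

object RingDemo {
  val ring = Ring(TreeMap(0 -> "node-a", 25 -> "node-b", 50 -> "node-c", 75 -> "node-d"))

  // Key token 60 falls between node-c (50) and node-d (75), so node-d owns it.
  val owners: List[String] = ring.replicas(keyToken = 60, n = 3)
}
```

Adding a node only inserts one more token into the ring, which is the intuition behind the linear scalability bullet.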
Consistency
N = replication factor
R = number of replicas a read blocks for (R <= N)
W = number of replicas a write blocks for (W <= N)
Quorum = floor(N/2) + 1

When W + R > N, every read overlaps at least one replica that saw the latest write, so reads are fully consistent. Examples:
● W = 1, R = N
● W = N, R = 1
● W = Quorum, R = Quorum
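The arithmetic above is easy to check in code. The following is a minimal sketch; the object and method names are made up for illustration and are not part of any Cassandra client.

```scala
object Consistency {
  // Quorum is a strict majority of the N replicas (integer division rounds down).
  def quorum(n: Int): Int = n / 2 + 1

  // Reads are guaranteed to see the latest successful write when W + R > N.
  def isFullyConsistent(n: Int, w: Int, r: Int): Boolean = {
    require(w >= 1 && w <= n && r >= 1 && r <= n, "W and R must be between 1 and N")
    w + r > n
  }
}
```

With N = 3, quorum reads and quorum writes (2 + 2 > 3) are consistent, while W = 1 with R = 1 is not.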
Consistency Level
● Every request defines consistency level
  ○ Any
  ○ One
  ○ Two
  ○ Three
  ○ Quorum
  ○ Local Quorum
  ○ Each Quorum
  ○ All
Data Model
● Keyspace ~ schema
● ColumnFamilies ~ table
● Rows
● Columns
Column Family
Key1 Column Column Column
Key2 Column Column
Column Family
ColumnFamily: {
  TOK: { chen: 1, ronen: 7 }
  CityPath: { yuval: 5 }
}
Super Column Family
ColumnFamily: {
  Key: {
    super1: { name: value, name: value }
    super2: { name: value }
  }
}
Key
  Super1: Column Column Column
  Super2: Column Column Column
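One way to picture the two layouts above is as nested maps: a column family maps a row key to a sorted map of columns, and a super column family adds one more level. The sketch below mirrors the TOK/CityPath example; the type aliases and the `get` helper are illustrative only, and values are simplified to `Long` and `String`.

```scala
import scala.collection.immutable.SortedMap

object DataModel {
  type Row = SortedMap[String, Long]               // column name -> value, sorted by name
  type ColumnFamily = Map[String, Row]             // row key -> row
  type SuperRow = SortedMap[String, SortedMap[String, String]] // super column -> columns

  val cf: ColumnFamily = Map(
    "TOK"      -> SortedMap("chen" -> 1L, "ronen" -> 7L),
    "CityPath" -> SortedMap("yuval" -> 5L)
  )

  val superRow: SuperRow = SortedMap(
    "super1" -> SortedMap("name1" -> "value1", "name2" -> "value2"),
    "super2" -> SortedMap("name1" -> "value1")
  )

  // Rows need not share the same columns, so a lookup can miss at either level.
  def get(family: ColumnFamily, key: String, column: String): Option[Long] =
    family.get(key).flatMap(_.get(column))
}
```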
Write
● Any node
● Partitioner
● Commit log, memtable
● Wait for W responses
Write
● No reads
● No seeks
● Sequential disk access
● Atomic within a column family
● Fast
● Always writeable (hinted hand-off)
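The "no reads, no seeks" property follows from the shape of the write path: append to a log, then update an in-memory table. The sketch below models a single node with immutable values. All names (`Node`, `Mutation`, `Cell`) are hypothetical, and real details such as flushing the memtable to an SSTable, replication and hinted hand-off are left out.

```scala
import scala.collection.immutable.SortedMap

object WritePath {
  final case class Cell(value: String, timestamp: Long)
  final case class Mutation(key: String, column: String, cell: Cell)

  // Hypothetical single-node state: an append-only commit log plus a sorted in-memory table.
  final case class Node(
      commitLog: Vector[Mutation],
      memtable: SortedMap[(String, String), Cell]
  ) {
    // A write appends to the log and updates the memtable; nothing is read from disk.
    // If the memtable already holds a newer cell, that cell is kept.
    def write(m: Mutation): Node = {
      val slot = (m.key, m.column)
      val winner = memtable.get(slot).filter(_.timestamp > m.cell.timestamp).getOrElse(m.cell)
      Node(commitLog :+ m, memtable.updated(slot, winner))
    }
  }

  val empty: Node = Node(Vector.empty, SortedMap.empty[(String, String), Cell])
}
```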
Read
● Choose any node
● Partitioner
● Wait for R responses
● Tunable read repair in the background
Read
A read may have to consult multiple SSTables, so reads are slower than writes
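Why several SSTables cost more than one can be seen in a small sketch: each table may hold a fragment of the same row, and the read has to merge them, keeping the newest cell per column. `mergeRow`, `Cell` and `Table` are illustrative names, and the "highest timestamp wins" rule is the only reconciliation modelled here.

```scala
object ReadPath {
  final case class Cell(value: String, timestamp: Long)
  type Table = Map[String, Cell] // column name -> cell, for a single row

  // Merge a row's fragments from the memtable and any number of SSTables:
  // for each column, the cell with the highest timestamp wins.
  def mergeRow(fragments: Seq[Table]): Table =
    fragments
      .flatMap(_.toSeq)
      .groupBy(_._1)
      .map { case (name, cells) => name -> cells.map(_._2).maxBy(_.timestamp) }
}
```

The more fragments a row is spread across, the more work each read does, which is the cost the write path avoids.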
Cache
● There is no need to use memcached
● There is an internal configurable cache
  ○ Key cache
  ○ Row cache
Sorting
When you perform a get, the result is sorted
● Rows are sorted according to the partitioner
● Columns in a row are sorted according to the type of the column name
Partitioner
● RandomPartitioner - Uses hash values as tokens. Useful for distributing the load across all nodes. If you use it, set the node tokens manually
● OrderPreservingPartitioner - You can get sorted rows, but it will cost you an evenly balanced cluster
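Setting tokens manually usually means spacing them evenly around the ring. A minimal sketch of that calculation follows, assuming the RandomPartitioner token range of 0 to 2^127; the `Tokens` object and `initialTokens` helper are hypothetical names for illustration.

```scala
object Tokens {
  // Assumed size of the RandomPartitioner token space.
  val RingSize: BigInt = BigInt(2).pow(127)

  // Evenly spaced initial tokens: node i gets i * 2^127 / nodeCount.
  def initialTokens(nodeCount: Int): Seq[BigInt] = {
    require(nodeCount > 0, "nodeCount must be positive")
    (0 until nodeCount).map(i => RingSize * i / nodeCount)
  }
}
```

With hashed keys and evenly spaced tokens, each node owns roughly the same share of rows, which is the balance an order-preserving partitioner gives up.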
Column Types
Available types:
● Bytes
● UTF8
● Ascii
● Long
● Date
● UUID
● Composite - <Type1>:<Type2>
Column Types
Examples:
Sort 1: Long vs UTF8
Long    UTF8
8       10
9       8
10      9

Sort 2: Composite (UTF8:Long) vs UTF8
Composite   UTF8
dan:8       dan:10
dan:10      dan:8
shimi:1     shimi:1
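The two orderings in each example come purely from the comparator. The sketch below reproduces them with plain Scala collections, using `Long`, `String` and a `(String, Long)` tuple as stand-ins for the Long, UTF8 and Composite column types; it does not touch Cassandra itself.

```scala
object ComparatorDemo {
  // Numeric comparison: 8 < 9 < 10.
  val asLong: List[Long] = List(10L, 8L, 9L).sorted

  // Text comparison goes character by character, so "10" sorts before "8".
  val asUtf8: List[String] = List("8", "9", "10").sorted

  // Composite stand-in: compare the name as text, then the number numerically.
  val asComposite: List[(String, Long)] =
    List(("dan", 10L), ("shimi", 1L), ("dan", 8L)).sorted

  // The same names flattened into plain text: "dan:10" sorts before "dan:8".
  val compositeAsUtf8: List[String] = List("dan:8", "dan:10", "shimi:1").sorted
}
```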
Clients
● Thrift - Cassandra driver level interface
● CQL - Cassandra query language (SQL like)
● High level clients:
  ○ Python
  ○ Java
  ○ Scala
  ○ Clojure
  ○ .Net
  ○ Ruby
  ○ PHP
  ○ Perl
  ○ C++
  ○ Haskell
Cascal - Scala client
Insert column:
session.insert("app" \ "users" \ "shimi" \ "passwd" \ "mypass")
val key = "app" \ "users" \ "shimi"
session.insert(key \ "email" \ "shimi.k@...")
Get column value:
val pass = session.get(key \ "passwd")
Cascal
Get multiple columns:
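// All columns of the row
val row = session.list(key)
// Columns whose names fall in a range
val colRange = session.list(key, RangePredicate("email", "passwd"))
// Columns selected by name
val colsByName = session.list(key, ColumnPredicate(List("passwd", "email")))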
Cascal
Get multiple rows:
val family = "app" \ "users"
// Rows whose keys fall in a range
val rowRange = session.list(family, RangePredicate("dan", "shimi"))
// Rows selected by key
val rowsByKey = session.list(family, KeyPredicate("dan", "shimi"))
Cascal
Remove column:
session.remove("app" \ "users" \ "shimi" \ "passwd")
Remove row:
session.remove("app" \ "users" \ "shimi")
Batch operations:
// A Scala list built with :: must end in Nil
val deleteCols = Delete(key, ColumnPredicate("age" :: "sex" :: Nil))
val insertEmail = Insert(key \ "email" \ "shimi.k@...")
session.batch(insertEmail :: deleteCols :: Nil)
Guidelines
● Keep together the data you query together
● Think about your use case and how you will fetch your data
● Don't try to normalize your data
● You can't beat the disk
● Be ready to get your hands dirty
● There is no single solution for everything. Consider using different solutions together
The End
Useful links:
● Cassandra, http://cassandra.apache.org/
● Wiki, http://wiki.apache.org/cassandra/
● Cassandra mailing list
● IRC
● Bigtable, http://labs.google.com/papers/bigtable.html
● Dynamo, http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
● Cascal, https://github.com/shimi/cascal