
Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Date post: 22-Jan-2018
Upload: scylladb
Transcript
Page 1: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Migration To Scylla From Cassandra

Senior Solutions Architect, ScyllaDB

Alexander Sicular

Page 2: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Alexander "Sasha" Sicular

● Over 16 years at Columbia University, the last seven as Director of Medical Informatics, working in clinical informatics and building EMRs, billing, data integration, and research systems.

● With extensive experience in relational, non-relational, and distributed databases, Alexander helps customers get the most out of Scylla as a Senior Solutions Architect at ScyllaDB.

Page 3: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Agenda

+ Compatibility
+ DB Migration 101
  + Offline migration
  + Live migration
+ Migration From Cassandra to Scylla
+ Migration Tools
+ Best Practices

Page 4: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Compatibility

Page 5: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Scylla Compatibility

+ SSTable file format (compatible with Cassandra 2.1)

+ Configuration file format (compatible with Cassandra 2.1)

+ CQL language (CQL version 3.3.1)

+ CQL native protocol (CQL version 3.3.1)

+ JMX management protocol (compatible with Cassandra 2.1)

+ Management command line (nodetool from C* 3.0)

+ All Drivers (Java, C++, Python, Node, Ruby, Go…)

Page 6: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

DB Migration 101

Page 7: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

DB Migration Steps

+ Schema Migration
+ Migrating Historical Data (Forklifting)
+ Migrating Live Data (Dual Writes)
+ Validation (Offline and/or Dual Reads)*
+ Fade out old DB

* Optional step

Page 8: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Offline Migration From DB-OLD to DB-NEW

[Timeline diagram. Phases along the time axis: Migrate Schema, Forklifting Historical Data, Validation*, DBs in Sync, Fade out DB-OLD. Activity bars: Read / Write to DB-OLD, Down Time, Write to DB-NEW, Read from DB-NEW.]

Page 9: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Live Migration From DB-OLD to DB-NEW

[Timeline diagram. Phases along the time axis: Migrate Schema, Forklifting Historical Data, Validation*, DBs in Sync, Fade out DB-OLD. Activity bars: Read from DB-OLD, Read from DB-NEW, Dual Reads*, Write to DB-OLD, Write to DB-NEW, Dual Writes.]
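The live-migration timeline can be read as a table of which database receives writes and reads in each phase. The sketch below is one possible reading and is not taken from the slides: the `Phase` names, the `TARGETS` table, and the choice to keep writing to DB-OLD until it is faded out are assumptions for illustration.

```python
from enum import Enum


class Phase(Enum):
    MIGRATE_SCHEMA = "migrate schema"
    FORKLIFT = "forklifting historical data"
    VALIDATION = "validation (optional)"
    IN_SYNC = "DBs in sync"
    OLD_FADED_OUT = "DB-OLD faded out"


# phase -> (databases written to, databases read from).
# This mapping is an assumed interpretation of the timeline.
TARGETS = {
    Phase.MIGRATE_SCHEMA: ({"old"}, {"old"}),
    Phase.FORKLIFT: ({"old", "new"}, {"old"}),          # dual writes on
    Phase.VALIDATION: ({"old", "new"}, {"old", "new"}),  # dual reads on
    Phase.IN_SYNC: ({"old", "new"}, {"new"}),
    Phase.OLD_FADED_OUT: ({"new"}, {"new"}),
}


def targets(phase):
    """Return (write_targets, read_targets) for a migration phase."""
    return TARGETS[phase]


if __name__ == "__main__":
    for phase in Phase:
        writes, reads = targets(phase)
        print(phase.value, "| write:", sorted(writes), "| read:", sorted(reads))
```

Keeping DB-OLD in the write set until the last phase is what would make a rollback to the origin DB possible at any earlier point.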

Page 10: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Migration Tools

Page 11: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Migrating a Multi-DC Cluster

[Diagram: two clusters, one with DC A, DC B, and DC C and one with DC A and DC B. Labels: SSTableLoader, SSTables, CQL, Internal communication.]

If every Cassandra DC holds the same data, loading the SSTables from one DC is sufficient.

Dual writes need to be implemented in all regions.

The number of DCs and their replication factors (RF) do not have to be preserved.
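Because the DC count and RF need not match, the keyspace on the Scylla side can be created with its own replication map instead of the one exported from Cassandra. A minimal sketch that builds such a statement follows; the helper name, the keyspace name, and the DC names are hypothetical, and the DC names must match whatever the target cluster actually reports.

```python
def create_keyspace_cql(keyspace, rf_by_dc):
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    rf_by_dc maps a data center name to its replication factor.
    """
    if not rf_by_dc:
        raise ValueError("at least one data center is required")
    options = ", ".join(
        "'{}': {}".format(dc, int(rf)) for dc, rf in sorted(rf_by_dc.items())
    )
    return (
        "CREATE KEYSPACE IF NOT EXISTS {} WITH replication = "
        "{{'class': 'NetworkTopologyStrategy', {}}}".format(keyspace, options)
    )


if __name__ == "__main__":
    # Hypothetical target layout: two DCs, whatever the source cluster had.
    print(create_keyspace_cql("my_ks", {"DC_A": 3, "DC_B": 3}))
```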

Page 12: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Migrate Schema

+ Use DESCRIBE to export each Cassandra keyspace, table, and UDT (not including system tables)
+ Cassandra
  + cqlsh -e "DESC SCHEMA" > schema.cql
+ Scylla
  + cqlsh --file 'schema.cql'
+ When migrating from Cassandra 3.x, some schema updates are required
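The export and import steps can be scripted so that the exported file is reviewed between them. The sketch below only wraps the two `cqlsh` invocations from the slide; the function names and host arguments are hypothetical, and it assumes `cqlsh` is on the PATH and needs no credentials.

```python
import subprocess


def export_schema_cmd(cassandra_host):
    """cqlsh command that prints the schema of the source cluster."""
    return ["cqlsh", "-e", "DESC SCHEMA", cassandra_host]


def import_schema_cmd(scylla_host, schema_path="schema.cql"):
    """cqlsh command that replays a schema file on the target cluster."""
    return ["cqlsh", "--file", schema_path, scylla_host]


def export_schema(cassandra_host, schema_path="schema.cql"):
    """Write the source schema to schema_path."""
    with open(schema_path, "w") as out:
        subprocess.run(export_schema_cmd(cassandra_host), stdout=out, check=True)


def import_schema(scylla_host, schema_path="schema.cql"):
    """Apply schema_path to the target cluster.

    When the source is Cassandra 3.x, edit the file by hand first.
    """
    subprocess.run(import_schema_cmd(scylla_host, schema_path), check=True)
```

Keeping export and import as separate calls leaves room for the manual schema updates mentioned for Cassandra 3.x.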

Page 13: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Dual Write

+ Update the application logic to send each write to both clusters (Cassandra and Scylla) in parallel
+ Recommendations:
  + Compare the results and log inconsistencies, if any
  + Use client-side timestamps
  + Create knobs for each DB writer, allowing you to stop/start writing to each DB at runtime
+ Rolling application logic upgrade for zero downtime
+ Dual Read can follow the same logic

[Diagram: Client connected to both clusters over CQL]
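The recommendations above (one client-side timestamp, per-writer knobs, logging of failures) might be combined roughly as follows. This is a sketch under assumptions: `DualWriter` and its method names are hypothetical, the sessions are any objects exposing `execute_async()` that returns a future with `result()`, and each statement is assumed to end in `USING TIMESTAMP ?` so the timestamp is the last bound parameter.

```python
import time


class DualWriter:
    """Send each write to every enabled session with one shared timestamp."""

    def __init__(self, sessions, statements):
        # sessions: name -> session-like object exposing execute_async()
        # statements: name -> statement prepared on that session
        self.sessions = sessions
        self.statements = statements
        # Runtime knobs: flip a flag to stop or start writing to one DB.
        self.enabled = {name: True for name in sessions}

    def set_enabled(self, name, flag):
        self.enabled[name] = bool(flag)

    def write(self, params):
        """Write params to all enabled DBs; return name -> success flag."""
        timestamp = int(time.time() * 1000000)  # microseconds, set by the client
        bound = list(params) + [timestamp]
        futures = {}
        for name, session in self.sessions.items():
            if self.enabled[name]:
                futures[name] = session.execute_async(self.statements[name], bound)
        outcome = {}
        for name, future in futures.items():
            try:
                future.result()  # wait for this write to finish
                outcome[name] = True
            except Exception:
                outcome[name] = False
        return outcome
```

Sending the same timestamp to both clusters means a retried or reordered write resolves the same way on each side. A caller would log any `False` entry in the returned mapping.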

Page 14: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Dual Writes

Use two different cluster sessions.

import cassandra.cluster

# connect to cluster 1 (IP_C1 is a list of contact-point addresses)
db1 = cassandra.cluster.Cluster(IP_C1).connect()

# connect to cluster 2 (IP_C2 likewise)
db2 = cassandra.cluster.Cluster(IP_C2).connect()

Page 15: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Dual Writes

Two prepared statements, one for each DB session.

# insert statement with explicit TIMESTAMP
# (keyspace.table, c1, and c2 are placeholder names)
insert_statement = ("INSERT INTO keyspace.table (c1,c2) "
                    "VALUES (?,?) USING TIMESTAMP ?")

# prepared statements
prepared_statement_1 = db1.prepare(insert_statement)
prepared_statement_2 = db2.prepare(insert_statement)

Page 16: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Dual Writes

Create sample values, execute async insert statements.

import random
import time
import uuid

# random values; explicitly set a write time in microseconds
values = [random.randrange(0, 1000), str(uuid.uuid4()), int(time.time() * 1000000)]

# build a list of pending inserts (futures)
inserts = []

# send the insert to the 1st session
inserts.append(db1.execute_async(prepared_statement_1, values))

# send the same insert to the 2nd session
inserts.append(db2.execute_async(prepared_statement_2, values))

Page 17: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Dual Writes

Wait for the results; log the results and the values in an array.

# loop over futures and record success (1) or failure (0)
results = []
for future in inserts:
    try:
        future.result()
        results.append(1)
    except Exception:
        results.append(0)

# keep the written values after the two status flags
results.append(values)

Page 18: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Dual Writes

Check for failures in either write.

# did we have failures? (log is an application-defined helper)
if results[0] == 0:
    # do something
    log('Write to cluster 1 failed')
if results[1] == 0:
    # do something
    log('Write to cluster 2 failed')

Page 19: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Forklifting Historical Data

+ Install Scylla's sstableloader on the Cassandra nodes or on intermediate servers
+ Create a snapshot of each Cassandra node
+ Run sstableloader from each Cassandra node:
  sstableloader -x -d [Scylla IP] .../[ks]/[table]
  Or, from intermediate servers, using a mount of the Cassandra filesystem:
  sstableloader -x -d [Scylla IP] .../[mount point], with the path in /[ks]/[table] format
+ Watch for the effect on the Cassandra nodes, and use throttling (-t) to limit the loader throughput

[Diagram labels: SSTableLoader, SSTables, CQL]
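Running the loader once per table directory is easy to script. The sketch below builds the command line from the flags shown on the slide (`-x`, `-d`, `-t`); the function names and example paths are hypothetical, and the throttle value is passed through unchanged, so its units are whatever the installed sstableloader expects.

```python
import subprocess


def sstableloader_cmd(scylla_ip, table_dir, throttle=None):
    """Build an sstableloader command for one [ks]/[table] directory."""
    cmd = ["sstableloader", "-x", "-d", scylla_ip]
    if throttle is not None:
        cmd += ["-t", str(throttle)]  # limit loader throughput
    cmd.append(table_dir)
    return cmd


def load_tables(scylla_ip, table_dirs, throttle=None, run=subprocess.run):
    """Run the loader once per table directory, stopping at the first failure.

    run is injectable so the loop can be exercised without the real tool.
    """
    for table_dir in table_dirs:
        run(sstableloader_cmd(scylla_ip, table_dir, throttle), check=True)
```

Loading one table at a time also fits an iterative migration, since each table can be validated before the next one is loaded.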

Page 20: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Best Practices

Page 21: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Best Practices

+ Clean up the origin database in advance. Don't waste time on old data!
+ More data = longer migration time
+ Migrate and validate iteratively: for example, one table, one region, or one user prefix at a time. After validation, keep that dataset or delete it and restart
+ At any point: verify and validate. You can always roll back to the origin DB for any reason

Page 22: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

Best Practices… Continued

+ Make sure to have a monitoring stack in place for both DBs and the application during the entire migration
+ Validate the process by sampling data at different points
+ Before fading out the origin DB, make sure there are no live connections to it
+ Make sure all relevant users are aware of the process and its limitations (don't update your schema!)
+ Get Scylla involved. We want to help!
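Validation by sampling could look like the sketch below: pick a random subset of keys, read each from both databases, and report the keys whose rows differ. The function and parameter names are hypothetical; `read_old` and `read_new` stand for whatever the application uses to fetch the rows for one key from each cluster.

```python
import random


def sample_mismatches(read_old, read_new, keys, sample_size, seed=None):
    """Return the sampled keys whose rows differ between the two DBs.

    read_old / read_new: callables taking a key and returning an iterable
    of rows (each row convertible to a tuple).
    """
    keys = list(keys)
    rng = random.Random(seed)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    mismatches = []
    for key in sample:
        old_rows = [tuple(row) for row in read_old(key)]
        new_rows = [tuple(row) for row in read_new(key)]
        if old_rows != new_rows:
            mismatches.append(key)
    return mismatches
```

An empty result on a sample does not prove the databases are identical; it only fails to find a difference, which is why sampling is repeated at different points in the migration.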

Page 23: Scylla Summit 2017: Migrating to Scylla From Cassandra and Others With No Downtime

THANK YOU!

Any questions?

Please stay in touch: @siculars

