
Deep Dive into High Availability and Disaster Recovery Features in Couchbase Server 4.0: Couchbase...

Date post: 26-Jul-2015
Upload: couchbase
DEEP DIVE INTO HIGH AVAILABILITY & DISASTER RECOVERY FEATURES IN COUCHBASE SERVER Anil Kumar, Senior Product Manager Couchbase
Transcript

1. DEEP DIVE INTO HIGH AVAILABILITY & DISASTER RECOVERY FEATURES IN COUCHBASE SERVER
Anil Kumar, Senior Product Manager, Couchbase

2. About Me
- Anil Kumar, Sr. Product Manager, Couchbase
- [email protected]
- @anilkumar1129
2015 Couchbase Inc.

3. Next 40 minutes
Part I - High Availability
- Single-node architecture
- Local data redundancy
- Rebalance and failover
- Node recovery
Part II - Disaster Recovery
- Business continuity for mission-critical applications
- Geo-redundancy
- Backup-restore for the worst case scenario
Demo
Q & A

4. Part I - High Availability

5. Couchbase Server Single-Node Architecture
- Single-node type is the foundation for a high-availability architecture
- No single point of failure (SPOF)
- Easy scalability
[Diagram: Couchbase Server 1, 2, and 3, each showing Cluster Manager, Managed Cache, Storage (shards), Data Service, Index Service, Query Service]

6. Intra-Cluster Replication - Data Redundancy
- RAM-to-RAM replication
- Max of 4 copies of data in a cluster
- Bandwidth optimized through deduplication
Intra-cluster replication is the process of replicating data on multiple servers within a cluster in order to provide data redundancy.

7. Write Operation - Data Redundancy
- Caching based on memcached: App gets an ACK when a write is successfully in RAM
  - Or RAM + Replicated
  - Or RAM + Persisted
  - Or RAM + Replicated + Persisted
- DCP-based replication: writes are queued to other nodes
- Couchstore-based storage: writes are queued for storage
[Diagram: Application Server, Managed Cache, Disk, DCP, Indexer, DOC 1]
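The four acknowledgement points listed on the Write Operation slide can be modeled in a few lines. The following is a minimal Python sketch, not the Couchbase SDK API: the names `WriteState`, `ACK_MODES`, and `can_ack` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteState:
    """Where a single write has landed so far (illustrative model only)."""
    in_ram: bool = False
    replicated: bool = False
    persisted: bool = False


# Hypothetical mode names for the four acknowledgement options on the slide.
ACK_MODES = {
    "ram": WriteState(in_ram=True),
    "ram+replicated": WriteState(in_ram=True, replicated=True),
    "ram+persisted": WriteState(in_ram=True, persisted=True),
    "ram+replicated+persisted": WriteState(
        in_ram=True, replicated=True, persisted=True
    ),
}


def can_ack(mode: str, observed: WriteState) -> bool:
    """Return True once every condition required by `mode` has been observed."""
    required = ACK_MODES[mode]
    return (
        (observed.in_ram or not required.in_ram)
        and (observed.replicated or not required.replicated)
        and (observed.persisted or not required.persisted)
    )


if __name__ == "__main__":
    in_memory_only = WriteState(in_ram=True)
    print(can_ack("ram", in_memory_only))            # True
    print(can_ack("ram+persisted", in_memory_only))  # False
```

In this model the application chooses the mode per write: the default returns as soon as the write is in RAM, while the stricter modes wait for replication, persistence, or both.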
8. Database Change Protocol - Data Redundancy
- DCP is a new streaming replication protocol in Couchbase Server 3.0
- High-performance, stream-based protocol
- Better resume-ability after blips and failures
- Ordering
- Consistent
[Diagram: Intra-Cluster Replication, Cross Datacenter Replication, Incremental Rebalance, Incremental Backup & Restore, External streams for Change Data Capture (CDC) in future, Incremental Map/Reduce Views, Global Secondary Indexes, Connectors (Kafka, Sqoop, Spark)]

9. Auto-Tuning Shared Thread Pool - Durability
- Efficient auto-tuning engine
  - Detect and allocate threads based on HW resources
  - Pool threads for best resource utilization
- Improved latency across the board
  - Faster Rebalance
  - Faster Node Reactivation
  - Faster Durability with Writes & PersistTo

10. Rebalance Operation - Data Availability
- Rebalance redistributes data partitions around a cluster
  - When adding nodes
  - When removing nodes
  - When nodes have failed over
- Aim is to bring a cluster back to optimal health
- Data partitions are moved between nodes automatically
- Rebalance happens on an active cluster
  - Allows you to expand/shrink without pausing your application
  - Client libraries automatically handle the rebalance and redistribute their requests accordingly

11. Failover Operation - Fault Tolerance
- Failover automatically switches over to the replicas for a given database
  - Gracefully under node maintenance
  - Immediately under auto-failover
- Can be triggered manually through the Admin UI/REST/CLI
- Automatic failover in case of unplanned outages or system failures
  - Can be configured through the Admin UI/REST/CLI
  - Constraints in place to avoid split-brain and false positives
    - 30-second delay, multiple heartbeat pings
    - Clusters >= 3 nodes
    - Only one node down at a time
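The auto-failover constraints on the Failover Operation slide (a 30-second delay, clusters of at least 3 nodes, only one node down at a time) can be expressed as a small decision function. This is a simplified sketch built only from those bullet points, not the cluster manager's actual algorithm; the function name `pick_auto_failover` and its parameters are hypothetical.

```python
from typing import Dict, Optional


def pick_auto_failover(
    cluster_size: int,
    unresponsive_for: Dict[str, float],
    timeout_s: float = 30.0,
) -> Optional[str]:
    """Return the node to fail over automatically, or None.

    `unresponsive_for` maps each currently unresponsive node to the number
    of seconds it has been missing heartbeats (an assumed input format).
    """
    # Constraint: clusters of at least 3 nodes.
    if cluster_size < 3:
        return None
    # Constraint: only one node down at a time. Several nodes going silent
    # together may indicate a network partition, so nothing is failed over.
    if len(unresponsive_for) != 1:
        return None
    node, seconds = next(iter(unresponsive_for.items()))
    # Constraint: wait out the delay before acting, to avoid false positives.
    return node if seconds >= timeout_s else None


if __name__ == "__main__":
    print(pick_auto_failover(3, {"server3": 45.0}))
    print(pick_auto_failover(5, {"server2": 45.0, "server3": 40.0}))
```

The first call names `server3` for failover; the second returns `None` because two nodes are unresponsive at once, which leaves the decision to an administrator.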
12. Automatic Failover in Action
- App servers accessing shards
- Requests to Server 3 fail
- Cluster detects server failure
- Promotes replicas of shards to active
- Updates cluster map
- Requests for docs now go to appropriate server
- Typically a rebalance would follow
[Diagram: Servers 1 to 5, each holding active and replica shards; Couchbase client libraries with cluster maps]

13. Node Recovery - Bring Cluster Back to Capacity
A failed node can be added back to the cluster:
- Full recovery: add back the failed node as a fresh node
- Delta node recovery: add back the failed node incrementally into the cluster, without having to rebuild the full node

14. Rack-Zone Awareness - Rack-Zone Availability
- Grouping of servers into server groups so that each group is on a physically separate rack
- Ensures that replica data partitions are not on the same rack as the primary partitions
- Servers 1, 2, 3 on Rack 1
- Servers 4, 5, 6 on Rack 2
- Servers 7, 8, 9 on Rack 3
- Cluster has 2 replicas (3 copies of data)
- This is a balanced configuration

15. Couchbase Server - MDS Architecture (NEW in 4.0)
What is Multi-Dimensional Scalability?
- MDS is the architecture that enables independent scaling of data, query, and indexing workloads.
- It also provides isolation of services for minimized interference.
- Independent zones for the query service, index service, and data service.
[Diagram: Couchbase cluster, node1 to node8, with Index Service, Query Service, and Data Service]

16. Demo!!!

17. Part II - Disaster Recovery

18. [No text on slide]
19. Cross Datacenter Replication (XDCR)
Unidirectional Replication
- Hot spare and disaster recovery
- Development and testing copies
Bidirectional Replication
- Datacenter locality
- Multiple active masters

20. Cross Datacenter Replication (XDCR) using DCP
- Continuously replicates data from the source cluster to remote clusters that can be spread across geographies
- Supports unidirectional and bidirectional operations
- Applications can read and write from both clusters (active-active replication)
- Automatically handles node addition and removal
- Simplified administration via Admin UI, REST APIs, and CLI
- Pause-and-resume of XDCR streams
- (NEW in 4.0) Filtering of data on replication streams

21. XDCR - Memory-based Using DCP
[Diagram: Application Server, Managed Cache, Disk, Indexer, DOC 1; Intra-Cluster Replication and Cross Datacenter Replication]

22. Backup & Restore - Oops Case
- The cbbackup tool provides backup for a running cluster
  - Entire cluster across all buckets
  - Single node across all buckets
  - Single node, single bucket
  - Supports remote or local access
- Incremental backups
  - Differential or cumulative
  - Only backs up data that has changed since the last backup
  - Minimize resource and time consumption during backups
  - Enables more frequent backups
- Restore cluster to point in time of a differential or cumulative backup

23. Demo!!!

24. Deeper Dive into Architecture
THUR @1.00 - Architecture Track
Deep Dive into Cluster Manager in Couchbase Server 4.0
Dave Finlay, Senior Director of Development, Couchbase

25. Deeper Dive into Architecture
THUR @10.30 - Architecture Track
Multi-Dimensional Scaling: A New Architecture for Scaling Big Data Application
Anil Kumar, Senior Product Manager, Couchbase
26. Best Practices
THUR @5.15 - Operations Track
Best Practices: Enabling HA - DR for Mission Critical Production Systems
Kirk Kirkconnell, Senior Solutions Engineer, Couchbase

27. Thank you.

28. Get Started with Couchbase Server 4.0: www.couchbase.com/beta
Get Trained on Couchbase: training.couchbase.com
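The difference between the differential and cumulative incremental backups named on the Backup & Restore slide can be illustrated with a small model. This is a sketch under stated assumptions, not how cbbackup is implemented: mutations are assumed to carry increasing sequence numbers, a differential backup is taken to cover changes since the most recent backup of any kind, and a cumulative backup to cover changes since the most recent full backup. The function `backup_range` and the mode strings are hypothetical.

```python
from typing import List, Tuple

# One past backup: (kind, sequence number reached when it was taken).
Backup = Tuple[str, int]


def backup_range(
    history: List[Backup], kind: str, current_seqno: int
) -> Tuple[int, int]:
    """Return the (start, end) sequence range the next backup would copy.

    `history` is assumed to be in chronological order.
    """
    if kind == "full":
        # A full backup copies everything, regardless of history.
        return (0, current_seqno)

    full_points = [seqno for k, seqno in history if k == "full"]
    if not full_points:
        raise ValueError("an incremental backup needs an earlier full backup")

    if kind == "cumulative":
        # Everything changed since the most recent full backup.
        return (full_points[-1], current_seqno)
    if kind == "differential":
        # Only what changed since the most recent backup of any kind.
        return (history[-1][1], current_seqno)
    raise ValueError(f"unknown backup kind: {kind}")


if __name__ == "__main__":
    history = [("full", 100), ("differential", 150)]
    print(backup_range(history, "differential", 200))  # (150, 200)
    print(backup_range(history, "cumulative", 200))    # (100, 200)
```

Under these assumptions a differential backup is the smallest to take but a restore needs the whole chain back to the full backup, while a cumulative backup is larger but needs only the full backup plus itself.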

