Webinar - Scale your Application Globally Using Couchbase and XDCR

Post on 31-May-2015


description

Couchbase has the ability to replicate your data across datacenters, offering a truly high-performance experience to a worldwide audience. Replication also provides resilience in the face of infrastructure failures. In this webinar you will see:

• An overview of how cross datacenter replication (XDCR) works in Couchbase Server

• How you can use this feature to reduce risk in the face of infrastructure failures

• A live demo of XDCR setup in Couchbase

transcript

Scale your Application Globally using Couchbase & XDCR

Ilam Siva, Senior Product Manager

Couchbase Open Source Project

• Leading NoSQL database project focused on distributed database technology and surrounding ecosystem

• Supports both key-value and document-oriented use cases

• All components are available under the Apache 2.0 Public License

• Obtained as packaged software in both enterprise and community editions.


Couchbase Server

• Easy Scalability: Grow cluster without application changes, without downtime, with a single click

• Consistent High Performance: Consistent sub-millisecond read and write response times with consistent high throughput

• Always On 24x365: No downtime for software upgrades, hardware maintenance, etc.

• Flexible Data Model: JSON document model with no fixed schema

Additional Couchbase Server Features

Built-in clustering – All nodes equal

Data replication with auto-failover

Zero-downtime maintenance

Built-in managed cache

Append-only storage layer

Online compaction

Monitoring and admin API & UI

SDK for a variety of languages

Single Node: Couchbase Server Architecture

[Diagram: a single node comprises a Data Manager (data access ports 11210 / 11211, object-managed cache, storage engine, query engine with query API on port 8092) and a Cluster Manager on Erlang/OTP (REST management API / Web UI over http, admin console on port 8091; replication, rebalance, shard state manager).]

Hash Partitioning


Basic Operation

• Docs distributed evenly across servers

• Each server stores both active and replica docs (only one server is active for a given doc at a time)

• Client library provides app with simple interface to database

• Cluster map provides a map of which server each doc is on; the app never needs to know

• App reads, writes, updates docs

• Multiple app servers can access same document at same time

[Diagram: a Couchbase Server cluster of three servers with user-configured replica count = 1. Each server holds a set of active docs and a set of replica docs. Two app servers, each with a Couchbase client library and cluster map, read, write, and update docs on the servers holding the active copies.]
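The basic operation described above (docs hashed evenly across servers, with a cluster map telling the client library where each doc lives) can be sketched as follows. This is a simplified illustration, not the hashing scheme the Couchbase client libraries actually use: the server names, the count of 1024 partitions, the plain CRC32-modulo hash, and the round-robin placement are all assumptions made for the example.

```python
import zlib

NUM_VBUCKETS = 1024  # assumption: illustrative partition count
SERVERS = ["server1", "server2", "server3"]  # hypothetical node names


def build_cluster_map(servers, num_vbuckets=NUM_VBUCKETS):
    """Assign each partition an active server and one replica server."""
    cluster_map = {}
    for vb in range(num_vbuckets):
        active = servers[vb % len(servers)]
        replica = servers[(vb + 1) % len(servers)]  # replica count = 1
        cluster_map[vb] = (active, replica)
    return cluster_map


def vbucket_for_key(key, num_vbuckets=NUM_VBUCKETS):
    """Hash a document key to a partition (simplified: CRC32 modulo count)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def locate(key, cluster_map):
    """Return (active_server, replica_server) for a document key."""
    return cluster_map[vbucket_for_key(key, len(cluster_map))]


cluster_map = build_cluster_map(SERVERS)
for key in ("doc1", "doc2", "doc3"):
    active, replica = locate(key, cluster_map)
    print(f"{key}: active on {active}, replica on {replica}")
```

The application only ever calls something like `locate` through the client library, which is why it never needs to know which server holds a doc.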

XDCR: Cross Datacenter Replication

[Map: US, Europe, and Asia data centers. Image: http://blog.groosy.com/wp-content/uploads/2011/10/internet-map.jpg]

Cross Datacenter Replication – The basics

• Replicate your Couchbase data across clusters

• Clusters may be spread across geographies

• Configured on a per-bucket basis

• Supports unidirectional and bidirectional operation

• Application can read from and write to both clusters (active-active replication)

• Replication throughput scales out linearly

• Different from intra-cluster replication

[Diagram: intra-cluster replication compared with cross datacenter replication (XDCR)]

Single node - Couchbase Write Operation with XDCR

[Diagram: an app server writes Doc 1 to a Couchbase Server node. The doc enters the managed cache, then the replication queue (to other node) and the disk queue (to disk); the XDCR engine sends it to the other cluster.]

Internal Data Flow

1. Document written to managed cache

2. Document added to intra-cluster replication queue

3. Document added to disk queue

4. XDCR push replicates to other clusters
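The four steps above can be modeled with a toy node object. This is only a sketch of the data flow as listed on the slide; the class, attribute, and method names are hypothetical, and de-duplication and the real networking are left out.

```python
from collections import deque


class Node:
    """Toy model of one node's write path (all names are hypothetical)."""

    def __init__(self):
        self.managed_cache = {}
        self.replication_queue = deque()  # intra-cluster replication
        self.disk_queue = deque()
        self.disk = {}
        self.xdcr_outbox = []  # what the XDCR engine would push

    def write(self, key, value):
        self.managed_cache[key] = value              # step 1: managed cache
        self.replication_queue.append((key, value))  # step 2: to other node
        self.disk_queue.append((key, value))         # step 3: disk queue

    def flush(self):
        """Persist queued docs, then hand them to XDCR (step 4)."""
        while self.disk_queue:
            key, value = self.disk_queue.popleft()
            self.disk[key] = value
            self.xdcr_outbox.append((key, value))


node = Node()
node.write("doc1", {"name": "example"})
node.flush()
print(node.xdcr_outbox)
```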

XDCR in action

[Diagram: two Couchbase Server clusters, one in the NYC data center and one in the SF data center. Each cluster has three servers (Server 1 to Server 3) holding active docs in RAM and on disk.]

XDCR Architecture

Bucket-level XDCR

[Diagram: Cluster 1 and Cluster 2 each contain Bucket A, Bucket B, and Bucket C; replication is set up between individual buckets.]

Continuous Reliable Replication

• All data mutations replicated to destination cluster

• Multiple streams round-robin across vBuckets in parallel (32 default)

• Automatic resume after network disruption

Cluster Topology Aware

• Automatically handles node addition and removal in source and destination clusters

Efficient

• Couchbase Server de-duplicates writes to disk

  • With multiple updates to the same document, only the last version is written to disk

  • Only this last change written to disk is passed to XDCR

• Document revisions are compared between clusters prior to transfer
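The two efficiency points above can be illustrated with a short sketch: first collapse repeated updates so only the last version of each document is kept, then skip any document whose revision is not newer than what the destination already holds. The `rev` field, the function names, and the simple "greater than" comparison are assumptions of this example rather than the actual XDCR protocol.

```python
def deduplicate(disk_queue):
    """Keep only the last queued version of each document key."""
    latest = {}
    for key, doc in disk_queue:
        latest[key] = doc  # later entries overwrite earlier ones
    return latest


def docs_to_send(persisted, remote_revs):
    """Skip docs whose revision is not newer than the remote copy."""
    return {
        key: doc
        for key, doc in persisted.items()
        if doc["rev"] > remote_revs.get(key, 0)
    }


queue = [
    ("doc1", {"rev": 1, "body": "a"}),
    ("doc1", {"rev": 2, "body": "b"}),
    ("doc1", {"rev": 3, "body": "c"}),
    ("doc2", {"rev": 1, "body": "x"}),
]
persisted = deduplicate(queue)
outgoing = docs_to_send(persisted, remote_revs={"doc2": 1})
print(outgoing)  # only doc1 at rev 3 is left to transfer
```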

Active-Active Conflict Resolution

• Couchbase Server provides strong consistency at the document level within a cluster

• XDCR provides eventual consistency across clusters

• If a document is mutated on both clusters, both clusters will pick the same “winner”

• In case of conflict, document with the most updates will be considered the “winner”

[Diagram: Doc 1 on DC1 and Doc 1 on DC2 are compared, and one of them is marked as the winner]
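A minimal sketch of the rule above, in which the document with the most updates wins and both clusters reach the same answer independently. The slide only states the "most updates" rule; the `cas` field used here to break ties is an assumption of this example, as are all the names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocMeta:
    rev: int  # number of updates applied to the document
    cas: int  # assumed tie-breaker, hypothetical for this sketch


def pick_winner(a: DocMeta, b: DocMeta) -> DocMeta:
    """Return the same winner regardless of argument order."""
    return max(a, b, key=lambda m: (m.rev, m.cas))


dc1 = DocMeta(rev=5, cas=100)
dc2 = DocMeta(rev=3, cas=900)
print(pick_winner(dc1, dc2))  # as evaluated on one cluster
print(pick_winner(dc2, dc1))  # as evaluated on the other cluster
```

Because the comparison depends only on the metadata and not on which cluster runs it, neither side needs to coordinate with the other to agree on the result.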

Configuration and Monitoring

STEP 1: Define Remote Cluster

STEP 2: Start Replication

Monitor Ongoing Replications

Detailed Replication Progress

• Source Cluster

• Destination Cluster
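The two setup steps can also be driven through the REST management API instead of the web UI. The sketch below only builds the two requests and prints them; it does not contact a server. The endpoint paths and form fields follow the Couchbase Server REST API as documented for the 2.x releases and should be checked against the version in use. The host names, credentials, cluster reference name, and bucket names are placeholders.

```python
from urllib.parse import urlencode

SOURCE = "http://source-host:8091"  # hypothetical source cluster node


def remote_cluster_request(name, hostname, username, password):
    """Step 1: request that registers the destination cluster."""
    body = urlencode({
        "name": name,
        "hostname": hostname,
        "username": username,
        "password": password,
    })
    return "POST", f"{SOURCE}/pools/default/remoteClusters", body


def create_replication_request(from_bucket, to_cluster, to_bucket):
    """Step 2: request that starts continuous replication for one bucket."""
    body = urlencode({
        "fromBucket": from_bucket,
        "toCluster": to_cluster,
        "toBucket": to_bucket,
        "replicationType": "continuous",
    })
    return "POST", f"{SOURCE}/controller/createReplication", body


requests_to_send = (
    remote_cluster_request("dc2", "dest-host:8091", "Administrator", "password"),
    create_replication_request("default", "dc2", "default"),
)
for method, url, body in requests_to_send:
    print(method, url, body)
```

For bidirectional replication, the same two requests would be issued a second time with the source and destination roles swapped.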

Demo!

XDCR Topologies

Unidirectional

• Hot spare / Disaster Recovery

• Development/Testing copies

Bidirectional

• Multiple Active Masters

• Disaster Recovery

• Datacenter Locality

Chain

Data aggregation

Data propagation

XDCR in the Cloud

• Server naming: the optimal configuration uses a DNS name that resolves to the internal address for intra-cluster communication and to the public address for inter-cluster communication

• Security: XDCR traffic is not encrypted, so plan your topology accordingly. Consider 3rd-party Amazon VPN solutions.

Use Cases

Scale your data globally

• Data closer to your users is faster for your users

Disaster Recovery

• Ensure 24x7x365 data availability even if an entire data center goes down

Development and Testing

• Test code changes with actual production data without interrupting your production cluster

• Give developers local databases with real data that are easy to dispose of and recreate

[Diagram: Test and Dev, Staging, Production]

Impact of XDCR on the cluster

Your clusters need to be sized for XDCR

• XDCR is CPU intensive

  • Configure the number of parallel streams based on your CPU capacity

  • Release 2.2 is a tremendous improvement in this regard; the XDCR version 2 protocol in the 2.2 release is strongly recommended for high performance and optimized resource usage

• You are doubling your I/O usage

  • I/O capacity needs to be sized correctly

• You will need more memory, particularly for bidirectional XDCR

  • Memory capacity needs to be sized correctly

Q & A

Thank you