Introduction to DRBD

Posted on 28-Jan-2015

Talk about DRBD given at the Sudoers Barcelona October meeting.

Sudoers Barcelona
October 2013

Alba Ferrer

What is it?

Distributed Replicated Block Device

Software-based, shared-nothing replicated storage solution mirroring the contents of block devices

• In real time
• Transparently
• Synchronously/asynchronously

Kernel module

User space admin tools

• drbdsetup
  • Used to configure the kernel module
  • All parameters are passed on the command line
• drbdmeta
  • Create/dump/restore/modify DRBD metadata (more on this later)
• drbdadm
  • High-level frontend for drbdsetup/drbdmeta
  • Reads from /etc/drbd.conf
  • Has a dry-run option (-d)

Resources

• A particular replicated storage device
  • Resource name
  • DRBD device: virtual block device (major=147). The associated block device is always /dev/drbdm, where m is the device minor number
  • Disk configuration: local copy of the data
  • Network configuration: comms with peer

Configuration

Per resource (/etc/drbd.d/mysql.res):

    resource mysql {
      device minor 0;   # /dev/drbd0
      disk /dev/sdb;
      meta-disk internal;

      on alice {
        address 192.168.133.111:7000;
      }
      on bob {
        address 192.168.133.112:7000;
      }

      syncer {
        rate 10M;       # static resync rate of 10 MByte/s
      }
    }

Configuration

Global (/etc/drbd.d/global_common.conf):

    global {
      usage-count yes;
    }

    common {
      protocol C;
      disk {
        on-io-error detach;
      }
      syncer {
        al-extents 3833;
      }
    }

Resource roles

• Primary: read and write ops
• Secondary: receives updates from primary, disallows any other access.
• Promotion: from secondary to primary
    drbdadm primary all
• Demotion: from primary to secondary
    drbdadm secondary all

Modes

• Single-primary
• Dual-primary (>= 8.0)
• Replication modes:
  • Protocol A: asynchronous
  • Protocol B: memory synchronous
  • Protocol C: synchronous
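The three replication protocols differ in when a write on the primary is reported as complete. The following is a minimal sketch of that rule as the DRBD documentation describes it, not DRBD code: the WriteState class and the write_is_complete function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class WriteState:
    """Hypothetical snapshot of how far one write has progressed."""
    local_disk_done: bool   # write finished on the local backing disk
    in_send_buffer: bool    # replication packet placed in the local TCP send buffer
    peer_received: bool     # packet arrived in the peer's memory
    peer_disk_done: bool    # peer confirmed the write on its own disk


def write_is_complete(protocol: str, s: WriteState) -> bool:
    """Return True when the primary may report the write as complete."""
    if protocol == "A":   # asynchronous
        return s.local_disk_done and s.in_send_buffer
    if protocol == "B":   # memory synchronous
        return s.local_disk_done and s.peer_received
    if protocol == "C":   # synchronous
        return s.local_disk_done and s.peer_disk_done
    raise ValueError(f"unknown protocol: {protocol!r}")


# A write that reached the peer's memory but not yet the peer's disk.
state = WriteState(local_disk_done=True, in_send_buffer=True,
                   peer_received=True, peer_disk_done=False)
for proto in "ABC":
    print(proto, write_is_complete(proto, state))
```

With this state, protocols A and B already consider the write complete, while protocol C still waits for the peer's disk.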

Features: efficient synchronization

• Synchronization != replication
• Inconsistent remote dataset during sync
  • Useless until sync completes
  • Service in active node unaffected
• Synchronization and replication happen at the same time
• Several successive writes to the same block on the active node are synchronized with a single write op
• Linear access to blocks
• Configurable sync rate
• Checksum-based synchronization

Features: data verification

• On-line device verification
  • Block-by-block data integrity check between nodes
• Replication traffic integrity checking
  • End-to-end message integrity checking using cryptographic message digest algorithms

Features: disk

• Support for disk flushes
• Disk error handling strategies
  • Passing
  • Masking
  • DIY
• Deal with outdated data
  • DRBD won't promote an outdated resource -> fencing

Features: replication

• Three-way replication
• Long distance replication with DRBD Proxy
  • Not free
• Truck based replication

Split-brain

Split brain is a situation where, due to temporary failure of all network links between cluster nodes, and possibly due to intervention by a cluster management software or human error, both nodes switched to the primary role while disconnected.

• Configurable notifications
• Automatic recovery methods
  • Discard modifications on 'younger' primary.
  • Discard modifications on 'older' primary.
  • Discard modifications on primary with fewer changes.
  • Graceful recovery if one primary had no changes.
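The automatic recovery methods can be read as rules for picking the node whose changes are thrown away. The sketch below is only an illustration under stated assumptions: the Primary class, its fields and pick_victim are hypothetical, and the mapping of the four methods to the policy names discard-younger-primary, discard-older-primary, discard-least-changes and discard-zero-changes is an assumption based on the drbd.conf after-sb-0pri option.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Primary:
    """Hypothetical view of one node after a split brain."""
    name: str
    promoted_at: float    # assumed timestamp of the promotion to primary
    changed_blocks: int   # blocks modified while disconnected


def pick_victim(policy: str, a: Primary, b: Primary) -> Optional[str]:
    """Return the node whose modifications are discarded, or None if undecided."""
    if policy == "discard-younger-primary":
        # The node that became primary last loses its changes.
        return max(a, b, key=lambda n: n.promoted_at).name
    if policy == "discard-older-primary":
        # The node that became primary first loses its changes.
        return min(a, b, key=lambda n: n.promoted_at).name
    if policy == "discard-least-changes":
        return min(a, b, key=lambda n: n.changed_blocks).name
    if policy == "discard-zero-changes":
        # Only resolves the case where a node changed nothing.
        # Simplification: if both changed nothing, the first one is returned.
        idle = [n for n in (a, b) if n.changed_blocks == 0]
        return idle[0].name if idle else None
    raise ValueError(f"unknown policy: {policy!r}")


alice = Primary("alice", promoted_at=100.0, changed_blocks=40)
bob = Primary("bob", promoted_at=250.0, changed_blocks=3)
print(pick_victim("discard-younger-primary", alice, bob))  # bob
print(pick_victim("discard-least-changes", alice, bob))    # bob
print(pick_victim("discard-zero-changes", alice, bob))     # None
```

Returning None for the last policy reflects that it only helps when one side made no changes; otherwise the split brain still has to be resolved by other means.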

Metadata

• DRBD keeps various pieces of information about the data in a dedicated area
  • The size of the DRBD device
  • The generation identifier
  • The activity log
  • The quick-sync bitmap
• Can be stored internally or externally
• Size

    root@bob:~ # blockdev --getsz /dev/drbd0
    8388280

    ceil(8388280 / 2^18) * 8 + 72 = 328 sectors
    328 sectors = about 0.16 MB
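The size estimate above can be reproduced with a few lines of Python. This is a sketch of the same arithmetic, assuming 512-byte sectors and rounding the division up to a whole number; the function name metadata_sectors is made up for this example.

```python
SECTOR_BYTES = 512  # assumption: blockdev --getsz reports 512-byte sectors


def metadata_sectors(device_sectors: int) -> int:
    """Estimate DRBD metadata size in sectors: ceil(Cs / 2^18) * 8 + 72."""
    chunks = -(-device_sectors // 2**18)  # integer ceiling division
    return chunks * 8 + 72


sectors = metadata_sectors(8388280)
mib = sectors * SECTOR_BYTES / 2**20
print(f"{sectors} sectors = {mib:.2f} MiB")  # 328 sectors = 0.16 MiB
```

The rounding matters: 8388280 / 2^18 is just under 32, and only rounding it up to 32 gives the 328 sectors shown on the slide.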

What it’s not/What it can’t do

• It’s not a backup system
• It can’t add features to upper layers
  • DRBD cannot auto-detect file system corruption
  • DRBD cannot add active-active clustering capability to file systems like ext3 or XFS.

Limitations

• Only two nodes
  • Stacked resources
  • Version 9
• There is no automatic failover.
• Promotion/demotion is manual.
• Needs a CRM to be useful

PACEMAKER FTW

How it works

    root@alice:/etc/drbd.d # cat /proc/drbd
    version: 8.3.13 (api:88/proto:86-96)
    GIT-hash: 234a142f7cf5bb21ffa1e95afa4f31608089c8b8 build by buildsystem@linbit, 2012-09-12 14:27:28
     0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
        ns:152 nr:4 dw:156 dr:4017 al:5 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
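The status line for minor 0 carries the connection state (cs), the local/peer roles (ro) and the local/peer disk states (ds). A small parser can pull those out for monitoring. This is a sketch that assumes the 8.3-style line format shown in the output above; parse_status and is_healthy are hypothetical helper names.

```python
import re

# The resource line copied from the /proc/drbd output above.
STATUS = " 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----"


def parse_status(line: str) -> dict:
    """Extract connection state, roles and disk states from one resource line."""
    fields = dict(re.findall(r"(\w+):(\S+)", line))
    local_role, peer_role = fields["ro"].split("/")
    local_disk, peer_disk = fields["ds"].split("/")
    return {
        "connection": fields["cs"],
        "local_role": local_role,
        "peer_role": peer_role,
        "local_disk": local_disk,
        "peer_disk": peer_disk,
    }


def is_healthy(status: dict) -> bool:
    """Connected with both copies up to date."""
    return (status["connection"] == "Connected"
            and status["local_disk"] == "UpToDate"
            and status["peer_disk"] == "UpToDate")


info = parse_status(STATUS)
print(info["local_role"], info["peer_role"], is_healthy(info))
```

For the line above this reports alice as Primary, bob as Secondary, and a healthy, fully synchronized resource.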

More info

• drbd.org

• www.drbd.org/home/mailinglists

• www.linbit.com