Revealing the secret behind a stress-free database migration

transcript

Arald den BraberPatrick van Tongeren

About us

Avans University of Applied Sciences

Arald den Braber

System Administrator / Developer / Teacheracbjl.denbraber@avans.nl

Patrick van Tongeren

Functional designerphjf.vantongeren@avans.nl

About our institution

• 3 locations in the Netherlands

• 28.000 students

• 2600 employees

• 20 schools

• Blackboard since 2001

• 7500 active courses

• 3 system Administrators

• 20 e-Learning coaches

• Selfhosted

Bruxelles

Amsterdam

Our current Blackboard infrastructure

Storage(CIFS)

Load Balancer

http://bb.avans.nl

Private network

Firewall

DZ37 DZ38DZ36DZ35DZ60(BEP)

HTTPS traffic

Trafic to/from database, data storage and between nodes (ActiveMQ)

Database cluster

What we are going to learn today?

• Options for implementing high availability

• “By failing to prepare, you are preparing to fail.” ― Benjamin Franklin Therefor: good preparations are the key to a successful migration

• Tuning the database server

• A few things you have to know about Blackboards configuration

Goal of our migration

• Replace old hardware from mid 2010

• Change the database passwords

• And……

Goal of our migration

• Risk analysis audit learned us that we had one single point of failure in our Blackboard infrastructure

Our disaster recovery plan and high availability goes something like this…

Our old Blackboard infrastructure

Storage(CIFS)

Load Balancer

http://bb.avans.nl

Private network

Firewall

DZ37 DZ38DZ36DZ35DZ60(BEP)

HTTPS traffic

Trafic to/from database, data storage and between nodes (ActiveMQ)

Database server

FH20 !!

Database server

How to select your best high availability solution

• Depends on your needs– Required availability– Recovery time (RTO)– Data loss tolerance (RPO)– Cost

• SQL Server 2012 has several options– Database mirroring (obsolete in future versions)– Replication (not designed for high availability)– Backup and restore– Logshipping– AlwaysOn Failover Cluster Instances (FCI)– AlwaysOn Availability Groups

Backup and restore

• Appropriate for disaster recovery, but not for high availability

• Must always be present

• Pros– Simple– Already present

• Cons– Can take long after disaster before up and running

Database backups

Log backups

Database server Backup share

Logshipping

• Uses the transaction log backup

• User-configurable interval

• Pros– Simple– Does not require Enterprise edition (potentially cheaper to implement)– Logshipping operates at the database level

• Cons– Data loss from last backup– Manual failover– No easy way to failback to the primary server

Log shipping Log shipping

Primary database server

Secondary database server

Backup share

• Shared-storage solution

• Pros– Connect to a virtual name– Allows for OS and SQL upgrades

with minimal user impact

• Cons– Failover takes place at a SQL instance level (not database level)– Dependent upon Windows Clustering – can be complex to setup

AlwaysOn Failover Cluster Instances (FCI)

Clients

Databasestorage

ActiveDatabaseServer 1

PassiveDatabaseServer 2

Windows Server Failover Cluster

Primary Data Center Disaster RecoveryData Center

Primary Secondary Secondary

Availability Group

Synchronous/Asynchronous

Fileshare Witness

Synchronous

AlwaysOn Availability Groups

• Supports a failover environment for a discrete set of user databases

• Pros– Connect to a virtual

name– Mirrors at a database

level (not whole instance)– Supports asynchronous (high performance) and synchronous

(high security) communication options

• Cons– Storage– Only available on SQL Server Enterprise Edition– Dependent upon Windows Clustering – can be complex to setup

Summary

High Availability and Disaster Recovery

SQL Server Solution

Potential Data Loss

Potential Recovery Time

Automatic Failover

Readable Secondaries

AlwaysOn Availability Group synchronous-commit

Zero Seconds Yes 0 - 2

AlwaysOn Availability Groupasynchronous-commit

Seconds Minutes No 0 - 4

AlwaysOn Failover Cluster Instance NA Seconds-to-minutes

Yes NA

Log Shipping Minutes Minutes-to-hours

No Not duringa restore

Backup, Copy, Restore Hours Hours-to-days

No Not duringa restore

Our goal

• Primary goals– Redundancy of the database– Reduce downtime to a maximum of 1 hour

• Secondary goals– Reduce overhead for the primary server– Relationship Costs versus benefit

(Don’t invest too much in something we wish we never have to use)

Our solution

AlwaysOn Availability Groups• Servers

– One physical server acting as primary role (384 GB memory) – One virtual server acting as secondary role (8 Gb memory)

• Configuration– Asynchronous commits– No automatic failover– Readable secondary– Backup secondary

Failover

• Steps to perform– Add resources to our secondary server (restart)– Manually Failover (wizard)

• Estimated duration– Normally within 15 minutes

• Advice– Test failover several times– Familiar with steps for failover – Do a failover in production once a year

• Planned failover– Maintenance– Change asynchronous to synchronous commits first

Database best practice

• Memory– Set a fixed maximum memory usage

• MAXDOP setting– To limit the number of processors to use in parallel plan execution

• Patching– Hotfixes for AlwaysOn

• Use backup compression– Reduce the amount of storage– Decrease backup and restore times

Migration

• Preparation– Make a roadmap– Use scripts

• Test– Use test servers– Execute the roadmap several

times– Know the timeframe

• Production– Last test execution on new production environment

Reduced complexity at migration moment

• The complexity of the actual migration was reduced to these simple steps:– “Refresh” data in the new database server– Duplicate data to standby server– Update the Blackboard configuration

• Using a script• Update the bb-config.properties file

• The servers, SQL server configuration and cluster setup where all prepared in advance and ready to be used

DZ35, 36, 37, 38 and 60

Old server

New database cluster

The migration

The actual migration

• As simple as taking all steps in the roadmap

• Performed at home

• Communication using a Whatsapp group

• On schedule

• No surprises

A funny note Be aware – configuration stored at various locations

• While testing we ran into the situation in which– The new database server was logging authentication failures from the

test application server– The test application server successfully started connecting to the old

database server

App server

Old database server

New database server

Login failedPartly continue using

Mmmm If the new server doesn’t want me I will just

continue to use my old partner

Log file

This was caused by Blackboard ‘s hidden configuration

• bbadmin database – BB_INSTANCE: Database server name & passwords

• CMS database– XY_FILE_SYSTEMS: Database server name & passwords

How to avoid this?

• Make sure the application server can’t see the old database

• This can be done by– Tweaking name resolving on the app server so the old database server

name resolves to 127.0.0.1 (localhost). We used this solution during the test phase of our project

– Take the databases offline on the old server after making the final backup

A few final words

• Good preparations help to reduce complexity and as a result stress during the actual moment of a migration

• The roadmap we created during our project can also be used when you need to clone your environment.

Migration summary

• Making a roadmap and testing has been the key to our success. It taught us:– Which steps we had to take– How long these steps take input for the final roadmap– About the secret locations of the Blackboard configuration

• Testing also helps you not to forget important steps

Questions

?? ? ?

Revealing the secret behind a stress-free database migration

Education