Data Guard – Fast-Start Failover

DOAG Regio Stuttgart – 18.05.2006

Torsten Rosenwald

[email protected]

Oracle 10g High Availability - Data Guard and Maximum Availability 2 © 2006

Agenda

> Introduction

> Concept / Architecture

> Flashback & Reinstate

> Fast-Start Failover

> Core Messages

The Data Guard Manager

> Data Guard is part of Enterprise Edition (no separate option); it includes all functionality needed to manage standby databases

> The Data Guard Broker framework (applications and language) facilitates the following Data Guard tasks
  » Setup and configuration
  » Monitoring and control of redo log transport and log apply services
  » Core operating tasks (switchover, failover, reinstate, fast-start failover, etc.)

> Standard Edition doesn't contain Data Guard features such as
  » Automated log transport
  » Managed recovery mode

> There is a Trivadis package implementing the basic Data Guard functionality ☺

Data Guard Broker Framework

[Diagram: the primary database at the primary site and the standby database at the standby site. Labels on the primary side: online log files, local archiving. Labels on the standby side: standby log files, remote archiving, archived log files. Log transport connects the two sites. Two apply paths are labelled: log apply (if no standby redo log files are configured) and real-time log apply. A Data Guard monitor runs at each site, driven from the Oracle Data Guard GUI or command-line interface.]

Data Loss Protection Modes

> Maximum Performance: lowest performance impact on the primary database, asynchronous redo transfer

> Maximum Availability: highest possible level of data protection without compromising the availability of the primary database, synchronous redo transfer when the standby database is up

> Maximum Protection: this protection mode ensures that the primary database and at least one standby database are always synchronous

DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MAXAVAILABILITY;

Why Flashback?

> ASM is nice…
  » but the real value is leveraged together with RAC

> Flashback Database is nice…
  » but the real value is leveraged in a Data Guard environment
  » Almost all the really cool features provided with 10g Data Guard need Flashback Database!

Concept: Flashback Database

[Diagram: SCN timeline (33557, 34382, 37121, 41389, 52891, …) with an incremental level 0 backup. Two ways to reach an earlier SCN are shown: restore followed by recovery, and flashback using the flashback logs.]

Concept of Database Reinstate (1)

What about the former primary database after a failover?

> Prior to 10g
  » Back up the new primary database
  » Duplicate from that backup
  » Recreate the standby database

> From 10g
  » Reinstate the database

[Diagram: failover from LION (old primary database) to the standby database; LION is left behind as the former primary database.]

Concept of Database Reinstate (2)

[Diagram: LION (old primary database) and TIGER (standby database).
  FAILOVER TO 'THEDB_BERLIN' makes the standby database the primary database and leaves the old primary as the former primary database.
  REINSTATE DATABASE 'THEDB_BONN' then turns the former primary database into a standby database.]

Physical Standby: Startup Behavior 10g versus 9i

[Diagram: startup sequence of a physical standby database, shown next to a plain "startup":
  startup nomount
  alter database mount;
  alter database open;
  recover managed standby database disconnect;
Markers show where "dmon 10g takes over" and where "dmon 9i takes over".]

Physical Standby – Startup Issue (1)

> What is the biggest problem in a cluster?
  » Split brain!

> What is the biggest problem in a Data Guard environment?
  » More than one primary!

> How can this happen?
  » The former primary becomes available again after the standby has been activated

[Diagram: two databases; one labelled "PRIM DB", the other "PRIM DB?".]

Physical Standby – Startup Issue (2)

> Automatic database startup as part of system startup
  » Two primary databases are possible after the standby has been activated

> Manual database startup after system boot
  » No issue after activation of the standby, because manual intervention is necessary anyway
  » Requires additional attention after every system startup

> Is there a better solution?

Physical Standby – Activation Issue

> Main criticism of standby databases: too much manual action

> Manual intervention is required for a failover
  » Administrative checks are needed beforehand to validate the status of the standby database, e.g. whether all redo has been applied
  » More downtime

> Manual intervention is required to recreate a new standby database
  » No HA until the setup of the new standby is finished

> How can this be addressed? Fast-Start Failover

Concept

1. Observed Data Guard environment

2. Fast-Start Failover (automatic)

3. Reinstate (automatic)

[Diagram: the database roles in three rows: Primary / Standby, Primary / Primary, Primary / Standby.]

When is a Fast-Start Failover triggered?

> Primary site failure
  » Server crash or server shutdown (without database shutdown)

> Primary database failure
  » Instance failure (of the last running instance in the case of RAC)
  » Shutdown abort (but not shutdown normal or immediate)
  » A data file is taken offline

> Network failure (special case)
  » The documentation of when automatic activation will and will not happen is extensive. Read it and test carefully. We will show one case.

Network Failure (1)

[Diagram: primary DB and standby DB connected by log transport, together with the observer.]

select fs_failover_status, fs_failover_observer_present
from v$database;   -- on primary site

FS_FAILOVER_STATUS   FS_FAILOVER_OBSERVER_PRESENT
-------------------- ----------------------------
SYNCHRONIZED         NO

Network Failure (2)

[Diagram: standby DB and observer; log transport from the primary is interrupted, the primary database is STALLED, and the FAILOVER starts.]

select fs_failover_status, fs_failover_observer_present
from v$database;   -- on primary site

FS_FAILOVER_STATUS   FS_FAILOVER_OBSERVER_PRESENT
-------------------- ----------------------------
STALLED              NO

Network Failure (3)

[Diagram: the observer and the new primary DB; the old primary database is still STALLED.]

select fs_failover_status, fs_failover_observer_present
from v$database;   -- on new primary site

FS_FAILOVER_STATUS   FS_FAILOVER_OBSERVER_PRESENT
-------------------- ----------------------------
REINSTATE REQUIRED   YES

Network Failure (4)

[Diagram: the observer and the database reinstate; the old primary comes back as the new standby DB, and log transport runs between the new primary DB and the new standby DB.]

select fs_failover_status, fs_failover_observer_present
from v$database;   -- on primary site and standby site

FS_FAILOVER_STATUS   FS_FAILOVER_OBSERVER_PRESENT
-------------------- ----------------------------
SYNCHRONIZED         YES
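
The four network-failure slides can be read as a short sequence of states. The sketch below is a toy model in Python, not Oracle's implementation: the status strings are the FS_FAILOVER_STATUS values shown on the slides (reported first by the old primary, then by the new primary), while the event names and the TRANSITIONS table are hypothetical and cover only this one scenario.

```python
# Toy model of the network-failure walk-through on the slides.
# Status values are the FS_FAILOVER_STATUS strings shown above; the event
# names are invented for illustration and are not Oracle terminology.

TRANSITIONS = {
    ("SYNCHRONIZED", "primary_isolated"): "STALLED",
    ("STALLED", "failover_completed"): "REINSTATE REQUIRED",
    ("REINSTATE REQUIRED", "old_primary_reinstated"): "SYNCHRONIZED",
}


def next_status(status, event):
    """Return the status after `event`, or raise if the toy model has no such step."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"no transition from {status!r} on {event!r}") from None


def walk(events, start="SYNCHRONIZED"):
    """Apply the events in order and return every status seen, including the start."""
    seen = [start]
    for event in events:
        seen.append(next_status(seen[-1], event))
    return seen


if __name__ == "__main__":
    steps = ["primary_isolated", "failover_completed", "old_primary_reinstated"]
    for status in walk(steps):
        print(status)
```

The model ends where it started: once the old primary has been reinstated as a standby, the configuration is SYNCHRONIZED again and ready for the next failover.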

Observer location?

[Diagram: main computing center with the primary DB and standby computing center with the standby DB, connected by log transport and by the public LAN. The numbers 1 to 5 mark candidate observer locations, discussed on the following slides.]

Observer location …

Options 1 and 2: on the primary / standby DB machine

> Not really an option

> Reason
  » No protection against a system crash!

> Consequence
  » An additional observer machine is necessary!

Observer location …

Option 3 (Primary / Standby)

> Close to the standby database (same fire protection area)

> Advantages
  » The scenario of a complete failure of the main computing center is addressed

> Disadvantages
  » The primary database depends heavily on the network
  » Unnecessary failover events are therefore possible

Observer location …

Option 4 (Primary / Standby)

> Close to the primary database (same fire protection area)

> Advantages
  » Fast-start failover works in the most important error situations
    - Instance crash, media failure, database file offline …
  » The primary database is less dependent on the network
  » No unnecessary activations due to networking issues

> Disadvantages
  » Loss of the whole main computing center is not addressed

Observer location …

Option 5 (Primary / Standby)

> Third computing center / somewhere within the public LAN

> Advantages
  » Basically all error scenarios are addressed

> Disadvantages
  » The observer is separated from a network point of view
    - The observer itself is therefore more dependent on the network
  » Most companies do not operate three computing centers
  » Running the observer on some PC in the public LAN means reduced availability

Observer location …

> Consequences
  » Fast-start failover is not an appropriate solution for surviving the loss of a whole computing center. It is not a failover cluster!
  » After a switchover, setup 4 turns into setup 3 and vice versa. Exception: the observer is moved as well
  » In many real-life situations (no third computing center), option 4 will be the best choice (trade-off)
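
The trade-off between options 3, 4 and 5 can be made concrete with a small model. The sketch below assumes a simplified rule: an automatic failover happens only when the observer and the standby have both lost the primary and can still reach each other. It also assumes that the standby itself stays up and that a network failure cuts the main computing center off from everything else. The Scenario fields, the site names and the function names are hypothetical; this is not how the broker is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    main_site_up: bool = True         # main computing center still running
    primary_instance_up: bool = True  # primary database instance still running
    main_isolated: bool = False       # network between main center and the rest is down


# Observer placement per option number used on the slides.
OBSERVER_SITE = {3: "standby", 4: "main", 5: "third"}

SCENARIOS = (
    Scenario("primary instance crash", primary_instance_up=False),
    Scenario("main computing center lost", main_site_up=False),
    Scenario("network to main center down", main_isolated=True),
)


def failover_happens(observer_site, s):
    """Simplified rule: observer and standby must both have lost the primary
    while still being able to reach each other."""
    primary_serving = s.main_site_up and s.primary_instance_up
    observer_alive = observer_site != "main" or s.main_site_up
    observer_sees_primary = primary_serving and (
        observer_site == "main" or not s.main_isolated
    )
    standby_sees_primary = primary_serving and not s.main_isolated
    observer_sees_standby = observer_site != "main" or not s.main_isolated
    return (
        observer_alive
        and observer_sees_standby
        and not observer_sees_primary
        and not standby_sees_primary
    )


def failover_needed(s):
    """A failover is only needed when the primary is really gone."""
    return not (s.main_site_up and s.primary_instance_up)


if __name__ == "__main__":
    for option, site in OBSERVER_SITE.items():
        for s in SCENARIOS:
            if not failover_happens(site, s):
                verdict = "no failover"
            elif failover_needed(s):
                verdict = "failover"
            else:
                verdict = "failover (unnecessary)"
            print(f"option {option}, {s.name}: {verdict}")
```

In this model options 3 and 5 also fail over when only the network is down, which is the unnecessary activation the slides warn about. Option 4 avoids that, but it cannot react to the loss of the main computing center because the observer is lost with it.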

Observer location – The compromise

[Diagram: main computing center with the primary DB, standby computing center with the standby DB, log transport between them, the public LAN, and the observer.]

Observer - Requirements

> Observer machine and configuration

> Special entry in the Data Guard Broker configuration

> Maximum Availability mode (mandatory)
  » but: special startup behaviour
  » but: primary stalls in certain situations

> Flashback Database must be enabled

Observer - Additional Data Guard Configuration

> Not much to configure, but much to describe (see manual)

> Fast-Start Failover is a feature of Oracle Data Guard and can't run without a Data Guard Broker configuration!

edit database 'THEDB_BONN' set property FastStartFailoverTarget = 'THEDB_BERLIN';

edit database 'THEDB_BERLIN' set property FastStartFailoverTarget = 'THEDB_BONN';

edit configuration set property FastStartFailoverThreshold = 15;

enable fast_start failover;

Observer – Configuration

» Start of the observer

connect sys@THEDB_BONN
start observer

» Better: write a shell script that runs the observer in the background, because "start observer" does not return; use the -logfile option

dgmgrl -logfile /u00/app/oracle/local/dba/log/observer.log sys@THEDB_BONN "start observer"

» The observer creates a binary file, fsfo.dat, in the working directory from which it is started. With the FILE parameter you can change the file name, but not the location

start observer file=fsfo_<DG_configuration_name>.dat
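
A minimal sketch of such a background start, written in Python rather than as a shell script. It only wraps the dgmgrl call shown on the slide; the function names are hypothetical, the connect string is passed through exactly as on the slide, and how the password is supplied without an interactive prompt is not covered here.

```python
import subprocess


def observer_command(connect, logfile, state_file=None):
    """Build the dgmgrl argument list for starting the observer."""
    start = "start observer"
    if state_file is not None:
        # FILE= form as shown on the slide; it renames the observer's data file.
        start += f" file={state_file}"
    return ["dgmgrl", "-logfile", logfile, connect, start]


def start_observer_in_background(connect, logfile, state_file=None):
    """Launch dgmgrl detached from the calling terminal and return the process handle."""
    return subprocess.Popen(
        observer_command(connect, logfile, state_file),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX: keep running after the launching shell exits
    )


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try anywhere.
    print(" ".join(observer_command(
        "sys@THEDB_BONN",
        "/u00/app/oracle/local/dba/log/observer.log",
    )))
```

Because "start observer" does not return, the launcher hands the process to the background and relies on the log file for diagnostics instead of waiting for output.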

Demo: Fast Start Failover

1. Configure Fast-Start Failover

2. Start the observer, connected to the primary

3. Shutdown abort on the primary database THEDB_BONN

4. Wait until the fast-start failover to THEDB_BERLIN occurs

5. Restart the old primary THEDB_BONN

6. Verify that the observer reinstates database THEDB_BONN

Conclusion (1)

+ Prevention of "split brain" due to accidental startup of the former primary database

+ Reduced downtime through automatic activation of the standby database

+ It is a small step for the DBA, but a giant leap from an availability point of view

+ It is easy to configure

+ The necessary checks are done automatically before a failover is started

Conclusion (2)

+ A failover solution without a shared disk system

+ with additional advantages (enhanced data availability)

+ and even reduced failover time compared to an HA cluster

− Many technical prerequisites (Flashback Database, special Maximum Availability mode)

− No automatic failover to a second standby database possible

Data Guard and Maximum Availability - Core messages…

> Data Guard 10g
  » Flashback makes the difference

> Fast-Start Failover
  » Protection from two primary databases due to an inadvertent restart of the failed primary database
  » Rather easy implementation / configuration
  » Reinstate database – even automatically
  » Very short failover times
