
Backup And Recovery Requirements

Date post: 31-Jan-2016
Upload: kesia
Page 1: Backup And Recovery Requirements

Backup And Recovery Requirements

• Routine backups must have minimal impact on the development environment
  • VOBs must be locked for a minimal amount of time during backup
• Routine backups must capture relevant data in a way that can be quickly and accurately recovered
  • Data validation is required before backing up data to tape
  • All relevant data must be backed up at the same time (i.e. registry, configuration, VOB storage)


Page 2: Backup And Recovery Requirements

Backup And Recovery Requirements (Continued)

• Recovery time must minimize impact on developers. A typical VOB server has 80 to 90 VOBs and 100 - 200 GB of storage (hub servers have 130 - 160 VOBs)
• Typical recovery scenario (takes a week):
  • Restore data from backup media, i.e. tape (days!)
  • Data validation on restored data, i.e. checkvob and dbcheck (days! 10-15 GB VOBs with 3-4 GB databases)
  • Sync replicas to get changes since the last backup (this alone takes about 8 - 12 hours)
  • Reset client machines (rebooting required?)
• Minimize downtime during recovery: it needs to be minutes or hours, not days or weeks


Page 3: Backup And Recovery Requirements

Warm High Availability (WHA) Configuration

Aspects of WHA implementation:
• Using SAN technology
• Snapshot to minimize VOB locks
• Specialized ClearCase configuration
• Currently only on VOB servers; View servers could be implemented the same way

Now some details!


Page 4: Backup And Recovery Requirements

WHA Configuration (Continued)

• Using SAN technology
  • Any server can dynamically control any storage device, allowing for quick fail-over of VOB servers
  • Use of a “shadow” disk as the initial backup medium
• Snapshot to minimize VOB locks
  • Minimizes VOB lock times to less than 2 minutes
• Specialized ClearCase configuration
  • Allows fail-over to a new server with no required changes to the ClearCase registry and configuration

More details later!


Page 5: Backup And Recovery Requirements

WHA Configuration (Continued)

• Hardware configuration
• SAN configuration
• ClearCase configuration


Page 6: Backup And Recovery Requirements

WHA Configuration (Continued)

Hardware configuration
• Unix Solaris servers
• SAN storage appliance: currently about 5 - 6 TB of ClearCase storage (VOBs and Views) in San Diego
• Each VOB server has primary disk storage plus 2 “shadow images” of the VOB storage (3 copies on disk)
• Large servers: 16 GB RAM, 4 CPUs, GB network and 2GB interface to storage device
• We have implemented WHA on all our VOB servers, large and small


Page 7: Backup And Recovery Requirements

WHA Configuration (Continued)

SAN configuration
• Many-to-many connectivity between servers and storage locations
• Dynamic control of storage locations
• Accommodates snapshots and shadow images (where dbcheck is run)
• Using 2 shadow images, one day apart
  • The oldest one has successfully passed dbcheck and is being, or has been, dumped to tape
  • The newest one is undergoing dbcheck
  • There is always a validated copy of all necessary data on disk for restoration


Page 8: Backup And Recovery Requirements

WHA Configuration (Continued)

ClearCase configuration
• Currently using ClearCase 4.2
• When implementing a recovery, NO ClearCase configuration changes are required (i.e. registry)
• Back up ALL relevant data at the same time
  • VOB data and /var/adm/atria are located on the same disk location
• DNS alias used instead of real host name for ClearCase license server
• Use logical vs. physical VOB storage locations when registering
• DNS alias used for VOB servers (VOB server can change by moving the alias)


Page 9: Backup And Recovery Requirements

WHA Configuration (Continued)

ClearCase configuration (continued)
• Use logical vs. physical VOB storage locations when registering: the path to the VOB storage must be the same, independent of host and storage location
• Create links to VOB storage, example:
  • /local/mnt (this mount point always exists and is always shared)
  • Use links to create the logical-to-physical mapping; unique logical paths are needed for all VOB storage within the same region

    /local/mnt/VOBSA    /net/dnsalias/local/mnt2/vobs
    /local/mnt/VOBSB    /net/dnsalias/local/mnt3/vobs
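Creating one of these links could look like the sketch below. It assumes the left-hand path in the example is the logical name and the right-hand path is the physical storage it points to; make_vob_link is an illustrative helper name, not a ClearCase command.

```shell
#!/bin/sh
# make_vob_link PHYSICAL LOGICAL
# Create (or repoint) LOGICAL as a symbolic link to PHYSICAL.
# Hypothetical helper: both paths are supplied by the caller.
make_vob_link() {
    physical=$1
    logical=$2
    # Only an old link may be replaced, never a real file or directory.
    if [ -e "$logical" ] && [ ! -L "$logical" ]; then
        echo "refusing to replace non-link: $logical" >&2
        return 1
    fi
    rm -f "$logical"
    ln -s "$physical" "$logical"
}
```

For the first mapping above this would be called as `make_vob_link /net/dnsalias/local/mnt2/vobs /local/mnt/VOBSA`. Refusing to overwrite anything that is not already a link is a safety choice of this sketch, not something the slides require.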


Page 10: Backup And Recovery Requirements

WHA Configuration (Continued)

ClearCase configuration (continued)
• Once links are created, register and tag (mkvob, mkreplica…). Must use the fully qualified method:
    -host <dns alias of VOB server>
    -hpath <the linked path, not physical path>
    -gpath <the global and linked path>
• Never use the real host name or real physical path!!
• To switch servers: restore data, move host alias, create links, stop and start ClearCase
• The clients and view servers must reacquire the new VOB storage mount points, so restart ClearCase or reboot the clients
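As a rough illustration of the fully qualified form, the sketch below only prints a mkvob command line so it can be reviewed before anyone runs it. The helper name print_mkvob_cmd is made up for this example, and the exact option set should be checked against the mkvob reference page for your ClearCase release.

```shell
#!/bin/sh
# print_mkvob_cmd TAG ALIAS LOGICAL_PATH
# Print (do not run) a mkvob command that registers the VOB under the DNS
# alias and the logical (linked) path. Hypothetical helper for illustration.
print_mkvob_cmd() {
    tag=$1
    alias_host=$2
    logical=$3
    # Global path = /net/<alias> + logical path, as in the deck's examples.
    gpath="/net/$alias_host$logical"
    echo "cleartool mkvob -tag $tag -host $alias_host -hpath $logical -gpath $gpath $gpath"
}
```

Using the /vobs/bsc values from this deck, `print_mkvob_cmd /vobs/bsc edbvobA /local/mnt/VOBS/bsc.vob` prints a command in which neither the real host name (cyclone) nor the physical path (/local/mnt2/...) appears.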


Page 11: Backup And Recovery Requirements

WHA Configuration (Continued)

ClearCase configuration (continued)
Example -- /vobs/bsc
• Host name is cyclone and VOB storage location:
    /local/mnt2/vobs/bsc.vob (physical)
    /local/mnt/VOBS/bsc.vob (logical)
• DNS alias cyclone == edbvobA
• Register and tag /vobs/bsc to the DNS alias and logical link instead of the physical storage location:

    /net/edbvobA/local/mnt/VOBS/bsc.vob
    -VS-
    /net/cyclone/local/mnt2/vobs/bsc.vob


Page 12: Backup And Recovery Requirements

WHA Configuration (Continued)

ClearCase configuration (cont)

Example of lsvob (2 VOB servers, 3 storage locations):

* /vobs/mgw/msf_erab   /net/mother/local/mnt/VOBSA/mgw/msf_erab.vob        public
* /vobs/mgw/msf_eedn   /net/mother/local/mnt/VOBSA/mgw/msf_eedn.vob        public
* /vobs/mgw/msf_etm    /net/mother/local/mnt/VOBSA/mgw/msf_etm.vob         public
* /vobs/cello/ose      /net/mother/local/mnt/VOBSC/cello/ose.vob           public
* /vobs/ewu/perl       /net/stepmother/local/mnt/VOBSB/ewu/perl.vob        public
* /vobs/ewu/freeware   /net/stepmother/local/mnt/VOBSB/ewu/freeware.vob    pu
* /vobs/stre/det       /net/stepmother/local/mnt/VOBSB/stre/det.vob        public


Page 13: Backup And Recovery Requirements

WHA Configuration (Continued)

ClearCase configuration (continued)
• DNS alias used for VOB servers (VOB server can change by moving the alias)
• The registered path and host are always the same no matter what physical host is the VOB server!
• Always use the alias, for MultiSite as well. Machines can come and go but the VOB server host name is always the same
• There is both a Rational and a SUN white paper documenting this configuration and setup!

http://www.rational.com/media/partners/sun/Ericsson_final.pdf


Page 14: Backup And Recovery Requirements

Backup Process

• All setup is completed and WHA implemented
• Lock VOBs (less than 2 minutes)
• We use SUN Instant Image™ to snapshot the VOB storage partition
  • Both VOB storage and /var/adm/atria are located here (we also have trigger scripts and …)
• Snapshot is to shadow1
  • Another disk partition, could be a totally different disk
• Shadow2 passed data validation with “dbcheck” yesterday and is being dumped to tape
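The lock, snapshot, unlock sequence above can be sketched roughly as follows. SNAPSHOT_CMD is a placeholder for whatever snapshot command the storage layer provides (the slides do not show the Instant Image invocation), and snapshot_vobs is an illustrative name; only `cleartool lock` and `cleartool unlock` on `vob:` selectors are real ClearCase calls here.

```shell
#!/bin/sh
# Both variables can be overridden by the caller.
# SNAPSHOT_CMD is a placeholder; it defaults to a command that fails so
# that nothing is reported as snapshotted by accident.
CLEARTOOL=${CLEARTOOL:-/usr/atria/bin/cleartool}
SNAPSHOT_CMD=${SNAPSHOT_CMD:-/bin/false}

# snapshot_vobs TAG...
# Lock the listed VOBs, take one snapshot, then unlock them again even if
# locking or the snapshot failed. Returns non-zero on any failure.
snapshot_vobs() {
    status=0
    for tag in "$@"; do
        $CLEARTOOL lock "vob:$tag" || { status=1; break; }
    done
    if [ "$status" -eq 0 ]; then
        $SNAPSHOT_CMD || status=$?
    fi
    # Unlock everything; unlocking a VOB that never got locked only
    # produces an error message.
    for tag in "$@"; do
        $CLEARTOOL unlock "vob:$tag"
    done
    return "$status"
}
```

Keeping the snapshot as the only step between lock and unlock is what keeps the lock window short; dbcheck and the tape dump run later against the shadow copy.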


Page 15: Backup And Recovery Requirements

Backup Process (Continued)

• Once the backup to shadow1 is complete, “dbcheck” is started for data validation
• Once data validation is successful, and it's a new backup day, shadow1 becomes shadow2, shadow2 becomes shadow1, and it starts all over
• If an error is found during dbcheck we take immediate corrective action: we keep the validated copy on disk (shadow2) while we check out the production data
• There is ALWAYS a “good copy” on the shadow2 disk!
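The role swap can be written down as a tiny decision function. This is a sketch of the rule as described above, with made-up disk names and an illustrative function name; it only prints the role assignment for the next backup day.

```shell
#!/bin/sh
# rotate_shadows NEWEST OLDEST DBCHECK_STATUS
# NEWEST holds today's snapshot (role shadow1); OLDEST holds the last
# validated copy (role shadow2). DBCHECK_STATUS is dbcheck's exit status.
# Prints the roles for the next backup day.
rotate_shadows() {
    newest=$1
    oldest=$2
    dbcheck_status=$3
    if [ "$dbcheck_status" -eq 0 ]; then
        # Today's copy is now validated: it becomes shadow2, and the old
        # shadow2 becomes the target of the next snapshot.
        echo "shadow1=$oldest shadow2=$newest"
    else
        # Validation failed: the roles stay put so the validated copy is
        # never overwritten.
        echo "shadow1=$newest shadow2=$oldest"
        return 1
    fi
}
```

For example, `rotate_shadows diskA diskB 0` prints `shadow1=diskB shadow2=diskA`, while a non-zero dbcheck status leaves the assignment unchanged and returns a failure.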


Page 16: Backup And Recovery Requirements

Recovery Process

Typical recovery scenario:
• Get another server or fix the broken one. You have to give it the same server hostname or change the ClearCase registry information!
• Restore data from backup tape (100 - 200 GB, 2+ days)
• Do data validation, checkvob and dbcheck (2+ days)
• Restore replica (MultiSite users) for 80+ VOBs; this takes at least 8 - 12 hours
• Clean up clients. Typically a crash means NFS/MVFS is messed up: REBOOT!
• Is that it? I wish it was! Developers can't work!

WHA recovery scenario?
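To see why the typical scenario eats most of a week, the lower bounds quoted above can simply be added up. The function name is illustrative; the figures are the minimums from the slide (2 days, 2 days, 8 hours).

```shell
#!/bin/sh
# min_recovery_hours
# Sum the minimum durations quoted on the slide, in hours.
min_recovery_hours() {
    restore_h=$((2 * 24))     # tape restore: 2+ days
    validate_h=$((2 * 24))    # checkvob and dbcheck: 2+ days
    replica_h=8               # restore replica for 80+ VOBs: at least 8 hours
    echo $((restore_h + validate_h + replica_h))
}

min_recovery_hours
```

This prints 104, i.e. more than four days of elapsed time before server replacement and client cleanup are even counted.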


Page 17: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - get another server or fix broken one
• ANY server can act as the new VOB server. Of course, using an existing VOB/View server would degrade performance
• Get VOBs on-line and back in service as fast as possible; WHA means I can “cut over” to another server again later!


Page 18: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - get another server or fix broken one (cont)
• STEPS (same for any WHA cut-over):
  • Move the DNS alias to the new server
  • Create the links (links for /var/adm/atria and VOB physical storage locations from /local/mnt/VOBS?)
• Since /var/adm/atria was backed up with the VOB storage, they are in sync
• Just turn ClearCase off and on, and: NEW VOB SERVER!


Page 19: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Restore data from backup tape, 100 - 200 GB: Not 2+ days
• We don't go to tape, unless we've had a real disaster!
• We don't do a “restore”; we have 2 copies on disk!
• Use shadow1 if data validation is complete or the confidence level is high; shadow2 is only 24-48 hrs old
• Mount the shadow disk to the new VOB server (SAN makes this easy)


Page 20: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Restore data from backup tape (cont)
• Create the links to the VOB physical storage location
• Much faster than transferring 100 - 200 GB of data from tape!
• 15 minutes MAX!


Page 21: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Do data validation, checkvob and dbcheck: Not 2+ days
• Takes a “very” long time (100-200 GB of VOBs, some with 4-6 GB databases)
• Checkvob and dbcheck are run on all servers monthly
• Daily successful dbcheck runs on shadow disk: high confidence


Page 22: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Do data validation, checkvob and dbcheck (cont)
• If shadow1 has completed dbchecks, use it; if not, use shadow2
• NO time is spent on data validation during recovery because it was done during the backup phase!
• We would like checkvob and other data validation utilities that can be run on off-line VOBs!
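The choice between the two shadow copies is a one-line rule. The sketch below assumes the backup job leaves a flag file once dbcheck has finished cleanly on shadow1; that flag file and the function name are hypothetical, since the slides do not say how completion is recorded.

```shell
#!/bin/sh
# choose_shadow MARKER_FILE
# MARKER_FILE is a hypothetical flag file created by the backup job once
# dbcheck has finished cleanly on shadow1.
choose_shadow() {
    if [ -f "$1" ]; then
        echo shadow1    # newest copy, already validated
    else
        echo shadow2    # older copy, validated on the previous backup day
    fi
}
```

Either way the copy that gets mounted has already passed dbcheck, which is why no validation time is charged to the recovery itself.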


Page 23: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Restore replica
• MultiSite is heavily used, with internal syncing every 30 minutes, so changes checked in since the shadow image snapshot will be available in another replica!
• Get the changes since the snapshot from another replica
• By default, restorereplica wants to sync with ALL replicas (NOT all 30-40 we have)

**CAREFUL**


Page 24: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Restore replica (continued)
• Lots of VOBs (80+): this will still take at least 8 - 12 hours, even when restoring against only 2-4 replicas
• Must get update packets (that have changes since the backup) from other replicas
• See example of commands on next slides!


Page 25: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Restore replica (continued)
Example of commands:

    mt restorereplica    (default requires updates from all replicas)
  OR
    mt restorereplica replica:ewuhub_bscng_aim replica:ewucth_bscng_aim replica:ewubo_bscng_aim

** MUST INCLUDE THE REPLICA THAT WAS LAST EXPORTED TO JUST BEFORE THE CRASH!! NEED TO AVOID DIVERGENCE IN THE VOB REPLICAS!

* Check via lsepoch; make sure the replica with a record of the most changes that took place in the restored replica is included! (mt lsepoch ewuhub_bscng_aim@/vobs/bscng/aim)


Page 26: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Restore replica (continued)

**WARNING - POSSIBLE DIVERGENCE**

** MUST INCLUDE THE REPLICA THAT WAS LAST EXPORTED TO JUST BEFORE THE CRASH!! NEED TO AVOID DIVERGENCE IN THE VOB REPLICAS!

• Check for the latest replica sync'd to:
  • lsepoch
  • lshistory


Page 27: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Restore replica (continued)
• Check via lsepoch; make sure the replica with a record of the most changes that took place in the restored replica is included!
• With ClearCase 4.X you can use -actual to query remote replicas


Page 28: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Restore replica (continued)

Check via lsepoch. EXAMPLE: restored replica is ewucello_bscng_aim

    mt lsepoch -actual ewuhub_bscng_aim@/vobs/bscng/aim
    oid:834d7251.f24c11d4.a4df.00:01:80:b8:c7:b4=450831 (ewucello_bscng_aim)

    mt lsepoch -actual ewucth_bscng_aim@/vobs/bscng/aim
    oid:834d7251.f24c11d4.a4df.00:01:80:b8:c7:b4=450745 (ewucello_bscng_aim)
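Comparing the numbers by eye gets tedious with more than a couple of replicas. A small helper can pick the highest one, assuming you have already reduced the lsepoch output to "replica epoch" pairs (that reduction is not shown, and the function name is illustrative). The sample values below read the two epochs in the example as 450831 and 450745.

```shell
#!/bin/sh
# latest_epoch_replica
# Read "replica-name epoch" pairs on stdin and print the replica that
# records the highest epoch number for the restored replica.
# On a tie, the first replica listed wins.
latest_epoch_replica() {
    awk '$2 + 0 > max { max = $2 + 0; name = $1 }
         END { if (name != "") print name }'
}

printf '%s\n' 'ewuhub_bscng_aim 450831' 'ewucth_bscng_aim 450745' |
    latest_epoch_replica
```

With these two rows the helper prints ewuhub_bscng_aim, the replica that has seen the most changes from the restored replica and therefore has to be named in the restorereplica command.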


Page 29: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Restore replica (continued)

Example of commands to find the last replica exported to. This is not trivial; you have to check each replica you have been syncing with.

Example: mt lsreplica -invob /vobs/nmis

Replicas (14): boulder_nmis, bscclassic_nmis, cbssw_nmis, edbbsc_nmis, edbbsm_nmis, edbspe_nmis, edbtetra_nmis, ewubo_nmis, ewucth_nmis, ewuhub_nmis, ewustre_nmis, ramstest_nmis, servicenet_nmis, streit2_nmis

These replicas are the only ones the restored replica syncs with: boulder_nmis, bscclassic_nmis, ewubo_nmis, ewucth_nmis, ewuhub_nmis


Page 30: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Restore replica (continued)
Example (cont): /vobs/nmis (must do lshistory at each remote replica site!)

    cleartool lshistory replica:boulder_nmis
    cleartool lshistory replica:bscclassic_nmis
    cleartool lshistory replica:ewubo_nmis
    cleartool lshistory replica:ewucth_nmis
    cleartool lshistory replica:ewuhub_nmis

Example results:

    12-Jun.15:55 root import sync from replica "bscclassic_nmis" to replica "ewuhub_nmis"

Review the output of the above commands to see which was the last replica to be sent an export sync packet
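Typing one lshistory per replica is easy to get wrong, so a loop helps. This sketch just wraps the command shown above; replica_history is an illustrative name, and the loop still has to be run at each remote site, from inside the VOB (or with a VOB selector appended to each replica name).

```shell
#!/bin/sh
# CLEARTOOL can be overridden, e.g. with "echo cleartool" for a dry run.
CLEARTOOL=${CLEARTOOL:-/usr/atria/bin/cleartool}

# replica_history NAME...
# Print the history of each named replica object under a heading, so the
# sync events can be compared side by side.
replica_history() {
    for name in "$@"; do
        echo "== replica:$name"
        $CLEARTOOL lshistory "replica:$name"
    done
}
```

For the example VOB this would be called as `replica_history boulder_nmis bscclassic_nmis ewubo_nmis ewucth_nmis ewuhub_nmis`.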


Page 31: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Restore replica (continued)
• Now run the restorereplica command with the appropriate replica(s) identified! (We use ALL replicas we sync with, but not replicas we never sync with.)

    mt restorereplica replica:boulder_nmis replica:bscclassic_nmis \
        replica:ewubo_nmis replica:ewucth_nmis replica:ewuhub_nmis

• Now send export packets to those replicas and have them send packets with changes back. The VOB is locked until the replica you are restoring gets update packets from each!
• Once all changes have been processed by the restored replica, you can unlock the VOBs and go to the next step
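Because a bare restorereplica waits on every replica, it can help to build the command from an explicit list and refuse an empty one. The sketch below only prints the command; restorereplica_cmd is an illustrative name, and it assumes the "mt" in these slides is a local alias for multitool.

```shell
#!/bin/sh
# restorereplica_cmd NAME...
# Print a multitool restorereplica command naming only the listed
# replicas. An empty list is refused, because the default would require
# updates from ALL replicas.
restorereplica_cmd() {
    if [ "$#" -eq 0 ]; then
        echo "name at least one replica" >&2
        return 1
    fi
    cmd="multitool restorereplica"
    for name in "$@"; do
        cmd="$cmd replica:$name"
    done
    echo "$cmd"
}
```

The printed line can be checked against the lsepoch and lshistory findings before it is pasted into a shell.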


Page 32: Backup And Recovery Requirements

Recovery Process (Continued)

WHA recovery scenario - Clean up clients
• Typically a crash means NFS/MVFS is messed up
• The easiest way to get clients and servers working properly is to REBOOT
• To try to clean up clients without a reboot, see the basic script on the next page


Page 33: Backup And Recovery Requirements

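Recovery Process (Continued)

WHA recovery scenario - Clean up clients (continued)
Script:

    #!/bin/sh -x

    # Kill every process that is using the view root
    /usr/sbin/fuser -uck /view

    # Kill every process that is using a VOB mount point
    for VOB in `/usr/atria/bin/cleartool lsvob -s`
    do
        /usr/sbin/fuser -uck $VOB > /dev/null 2>&1
    done

    # Unmount all VOBs
    /usr/atria/bin/cleartool umount -all > /dev/null 2>&1

    # Unmount what is still mounted under local/mnt, skipping local disks
    for MNT in `df | grep local/mnt | grep -v "/dev/dsk" | cut -f1 -d "("`
    do
        umount $MNT > /dev/null 2>&1
    done

    # Remove whatever is left under /vobs, then stop ClearCase
    rm -r /vobs/*

    /etc/init.d/atria stop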


Page 34: Backup And Recovery Requirements

Recovery Process (Continued)

WHA restore completed! But developers can't work!
• Build issues: need error handling in build scripts
• VOBs and Views may have been created or deleted since the backup:
  • Created since backup: storage exists without an entry in the registry
  • Deleted since backup: registry entry exists without storage
• FIRST, MAKE SURE ALL VOB AND VIEW SERVER PROCESSES HAVE BEEN KILLED. This eliminates lots of potential problems (stop and restart ClearCase on all systems)
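Both kinds of mismatch can be found by comparing two plain lists: the storage paths the registry knows about and the storage directories actually on disk. How those lists are produced is up to you and is not shown here; the function name and the output labels are inventions of this sketch.

```shell
#!/bin/sh
# compare_registry REGISTERED_LIST ONDISK_LIST
# Each file holds one storage path per line.
#   unregistered: storage on disk with no registry entry (created since backup)
#   missing:      registry entry with no storage on disk (deleted since backup)
compare_registry() {
    reg_sorted=$(mktemp) || return 1
    disk_sorted=$(mktemp) || return 1
    sort -u "$1" > "$reg_sorted"
    sort -u "$2" > "$disk_sorted"
    # comm needs sorted input: -13 keeps lines unique to the second file,
    # -23 keeps lines unique to the first.
    comm -13 "$reg_sorted" "$disk_sorted" | sed 's/^/unregistered: /'
    comm -23 "$reg_sorted" "$disk_sorted" | sed 's/^/missing: /'
    rm -f "$reg_sorted" "$disk_sorted"
}
```

An empty result means the registry and the restored storage agree. Anything listed needs a manual decision, since the slides do not prescribe how each case should be repaired.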


Page 35: Backup And Recovery Requirements

Recovery Process (Continued)

Build issues
• Case #1: VOBs that have been restored HAVE references to DOs
  • DOs physically exist in VOB (no problem)
  • DOs exist in view (ref count = 1) (again no problem)
  • DO references exist in VOBs, but the DO data DOES NOT exist anymore (maybe removed since backup by rmview or rmdo)
• Case #2: VOBs that have been restored DO NOT have references to DOs that exist
  • DOs exist in a single view, reference count == 1, with a reference in the view but not in the VOBs
  • DOs were promoted so references exist in multiple views (ref count > 1), but not in the VOBs


Page 36: Backup And Recovery Requirements

Recovery Process (Continued)

Build issues – Case #1
VOBs that have been restored HAVE references to DO’s
  • DO references exist in VOBs, but the DO data DOES NOT exist anymore
  • maybe removed since backup by rmview or rmdo

Page 37: Backup And Recovery Requirements

Recovery Process (Continued)

Build issues – Case #1 (continued)
Since DO pointers exist in the restored VOB, these DO’s are considered during configuration lookup of builds.
Results in Warnings! But it does rebuild the DO’s

clearmake -C sun -f /vobs/wds/build/include/Makefile.if -e
clearmake: Warning: Unable to evaluate derived object "libimc.a.1@@07-Nov.19:10.220156" in VOB directory "/vobs/bscng/ccl/imc/imc_if/lib.sp750@@"

** recoverview does NOT clean this up, you just keep getting warnings! We created a script to clean this up, but you might be able to just ignore the messages!
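
The cleanup script mentioned above is not shown in the slides. As a rough starting point only, the warnings themselves can be turned into a list of candidate DO pathnames for review. The function name is hypothetical, and the sketch assumes each warning arrives on a single line and that joining the VOB directory with the DO name gives a usable pathname.

```shell
#!/bin/sh
# Hypothetical helper, not the presenters' script.
# Reads clearmake output on stdin and prints one candidate DO pathname
# (directory/name@@DO-ID) per "Unable to evaluate derived object" warning.
stale_do_paths() {
    sed -n 's|.*Unable to evaluate derived object "\([^"]*\)" in VOB directory "\([^"]*\)@@".*|\2/\1|p' |
        sort -u
}

# Demonstration using the warning text from the slide.
stale_do_paths <<'EOF'
clearmake: Warning: Unable to evaluate derived object "libimc.a.1@@07-Nov.19:10.220156" in VOB directory "/vobs/bscng/ccl/imc/imc_if/lib.sp750@@"
EOF
```

The list is meant to be inspected by hand before anything is removed, for example with cleartool rmdo.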

Page 38: Backup And Recovery Requirements

Recovery Process (Continued)

Build issues – Case #1 (continued)
If view has been deleted, ERROR message will be generated (scripts need error handling)

>>> (clearmake): Build evaluating lib1.a
>>> (clearmake): Build evaluating one.o
No candidate in current view for "one.o"
>>> (clearmake): Shopping for DO named "one.o" in VOB directory "/vobs/stre/do_test/.@@"
>>> (clearmake): Evaluating heap derived object "one.o@@05-Jun.12:24.74"
>>> clearmake: Error: Unable to find view by uuid:5b997e3d.78b711d6.ad2c.00:01:80:b6:87:eb, last known at "lime:/tmp/do3.vws".
>>> clearmake: Error: Unable to contact View - ClearCase object not found
>>> clearmake: Warning: View "lime:/tmp/do3.vws" unavailable - This process will not contact the view again for 60 minutes. NOTE: Other processes may try to contact the view.
>>> clearmake: Warning: Unable to evaluate derived object "one.o@@05-Jun.12:24.74" in VOB directory "/vobs/stre/do_test/.@@"
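
The error handling that build scripts need could look something like the following sketch. It assumes the build output has been captured to a log and that the "clearmake: Error:" and "clearmake: Warning:" prefixes shown above are what distinguish the two cases. Both function names are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for a build wrapper script.
# Counts clearmake diagnostics in a build log read from stdin.
count_clearmake_diagnostics() {
    awk '/clearmake: Error:/   { e++ }
         /clearmake: Warning:/ { w++ }
         END { printf "errors=%d warnings=%d\n", e, w }'
}

# Succeeds only when the log on stdin contains no clearmake errors.
# Warnings alone (stale DO references) do not fail the check.
build_log_ok() {
    ! grep -q 'clearmake: Error:'
}

# Demonstration with two lines taken from the output above.
count_clearmake_diagnostics <<'EOF'
>>> clearmake: Error: Unable to contact View - ClearCase object not found
>>> clearmake: Warning: Unable to evaluate derived object "one.o@@05-Jun.12:24.74" in VOB directory "/vobs/stre/do_test/.@@"
EOF
```

Treating warnings as non-fatal matches the earlier observation that clearmake still rebuilds the DO's. Whether errors about deleted views should stop a build is a local policy choice.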

Page 39: Backup And Recovery Requirements

Recovery Process (Continued)

Build issues – Case #2
VOBs that have been restored DO NOT have references to DO’s that exist
  • DO’s exist in a single view, reference count == 1, reference in the view but not the VOBs
  • DO’s were promoted so references exist in multiple views (ref count > 1) – but not in the VOBs

Page 40: Backup And Recovery Requirements

Recovery Process (Continued)

Build issues – Case #2 (continued)
  • DO’s exist in a single view, reference count == 1, reference in the view but not the VOBs
  • DO’s were promoted so references exist in multiple views (ref count > 1) – but not in the VOBs

Recoverview can be used to clean this up. It needs to be run in each view with a problem. It moves stranded DO’s to the view’s .s/lost+found:

recoverview -vob <vob uuid> -tag <view tag>
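
Because recoverview has to be run once per affected view, a small loop can help. The sketch below only prints the commands unless RUN=1 is set. The function name, the RUN switch, and the placeholder UUID and view tags are all assumptions for illustration, and it invokes recoverview through cleartool.

```shell
#!/bin/sh
# Hypothetical wrapper: run recoverview for one VOB across several views.
# Usage: recover_views VOB_UUID VIEW_TAG...
# By default the commands are only printed; set RUN=1 to execute them.
recover_views() {
    vob_uuid=$1
    shift
    for tag in "$@"; do
        if [ "${RUN:-0}" = 1 ]; then
            cleartool recoverview -vob "$vob_uuid" -tag "$tag" ||
                echo "recoverview failed for view $tag" >&2
        else
            echo "cleartool recoverview -vob $vob_uuid -tag $tag"
        fi
    done
}

# Dry run with placeholder values (not real identifiers).
recover_views VOB-UUID-PLACEHOLDER viewtag1 viewtag2
```

Printing first makes it easy to check the list of views before anything is moved to lost+found.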

Page 41: Backup And Recovery Requirements

Recovery Process (Continued)

Build issues – Case #2.2 (continued)
DO’s promoted so references exist in multiple views (ref count > 1) – but not in the VOBs
*careful, view server processes have not been terminated!

lime /vobs/stre/do_test 53 ct setview do2
lime /vobs/stre/do_test 51 ct ls -l
view private object    .cmake.state
version                Makefile@@/main/1      Rule: element * /main/LATEST
derived object         four.o [no config record]
derived object         lib1.a [no config record]
dir version            lost+found@@/main/0    Rule: element * /main/LATEST
derived object         one.o [no config record]
derived object         three.o [no config record]
derived object         two.o [no config record]

Page 42: Backup And Recovery Requirements

Recovery Process (Continued)

Build issues – Case #2.2 (continued)
DO’s promoted so references exist in multiple views (ref count > 1) – but not in the VOBs
*view server processes have been terminated!

lime /vobs/stre/do_test 52 ct ls
.cmake.state
Makefile@@/main/1      Rule: /main/LATEST
cleartool: Error: Trouble looking up element "four.o" in directory ".".
cleartool: Error: Trouble looking up element "lib1.a" in directory ".".
lost+found@@/main/0    Rule: /main/LATEST
cleartool: Error: Trouble looking up element "one.o" in directory ".".
cleartool: Error: Trouble looking up element "three.o" in directory ".".
cleartool: Error: Trouble looking up element "two.o" in directory ".".

Page 43: Backup And Recovery Requirements

Recovery Process (Continued)

Build issues – Case #2.2 (continued)
DO’s promoted so references exist in multiple views (ref count > 1) – but not in the VOBs
*view server processes have been terminated!

> ls -l
./one.o: No such file or directory
./two.o: No such file or directory
./three.o: No such file or directory
./four.o: No such file or directory
./lib1.a: No such file or directory

Page 44: Backup And Recovery Requirements

Recovery Process (Continued)

Build issues – Case #2.2 (continued)
With proper shutdown of the view server process, ClearCase automatically purges the references and enters log messages in /var/adm/atria/view_log:

06/12/02 10:54:44 view_server(24163): Warning: Cover object mother:/local/mnt2/workspace/vobs/stre/do_test.vbs:336e07d7.7e2b11d6.b659.00:01:80:b6:87:eb for 0x8000000a not found in VOB: ClearCase object not found
06/12/02 10:54:44 view_server(24163): Warning: Cover object mother:/local/mnt2/workspace/vobs/stre/do_test.vbs:336e07df.7e2b11d6.b659.00:01:80:b6:87:eb for 0x80000007 not found in VOB: ClearCase object not found
06/12/02 10:54:44 view_server(24163): Warning: Cover object
06/12/02 10:54:53 view_server(24163): Warning: Vob stale 0x8000000d: Purging

Page 45: Backup And Recovery Requirements

Recovery Process (Continued)

VOBs and Views may have been created or deleted since the backup:
VOBs or Views created since backup - storage exists without entry in registry
VOBs or Views deleted since backup - registry entry exists without storage
At least the registry is in sync with the data that was restored
  • ClearCase configuration and VOB storage on same device, gets backed up at the same time!

Page 46: Backup And Recovery Requirements

Recovery Process (Continued)

VOBs and Views may have been created or deleted since the backup (continued):
You can use rgy_check to help clean this up
  /usr/atria/etc/rgy_check -views | -vobs
It helps if you have standard storage locations for VOBs and Views, so you know where to look
Sometimes you just need to wait for users to complain! Remember those “error/warning” messages!
Views are supposed to be temporary working space, right?
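
With standard storage locations, the two mismatch cases (registry entry without storage, storage without registry entry) can also be listed by comparing two plain path lists. This is a sketch that complements rgy_check rather than replacing it. The function name and both input files are assumptions: one file holds the storage paths known to the registry, the other the storage directories actually found on disk, however those lists are produced at your site.

```shell
#!/bin/sh
# Hypothetical helper: compare registered storage paths with paths on disk.
#   $1 = file of storage paths known to the registry (one per line)
#   $2 = file of storage directories found on disk   (one per line)
report_registry_mismatches() {
    reg=$(mktemp) && disk=$(mktemp) || return 1
    sort -u "$1" > "$reg"
    sort -u "$2" > "$disk"
    # comm -23 keeps lines only in the first file, comm -13 only in the second.
    comm -23 "$reg" "$disk" | sed 's/^/registered, no storage: /'
    comm -13 "$reg" "$disk" | sed 's/^/storage, not registered: /'
    rm -f "$reg" "$disk"
}
```

The first group corresponds to VOBs or Views deleted since the backup, the second to those created since the backup.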

