Everything You Need to Know About Oracle Exadata Backup and Recovery: Best Practices
Andrew Babb, Consulting Member of Technical Staff, Oracle
Donna Cooksey, Principal Product Manager, Oracle
Harpreet Singh, Vice President, Database Management, Fidelity Investments
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. 2
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Program Agenda
Evolving IT Infrastructure
Recovery, Recovery, Recovery
Architecting Your Backup Infrastructure
Customer Case Study – Fidelity Investments
New Modern Cloud Paradigm
Summary and Q & A
Evolution of Data Protection
Business Requirements Meeting IT Head-on
• IT consumers are increasingly involved in technology decisions
– The flexible, fast-moving opportunities of the “3rd Platform” translate to more IT initiatives being driven by Lines of Business (LOBs)
– Applications, storage, servers … even data protection?
• Technology in stealth mode makes sound data protection even more important!
Greater Complexity Causing More Data Center Downtime: http://www.datacenterdynamics.com/focus/archive/2012/09/greater-complexity-causing-more-data-center-downtime-0
Critical Databases Get Poor Protection Today
What Business Wants
• Never lose business data
• Keep critical apps available
What IT Wants
• Private and public cloud solution
• Ensured end-to-end protection
What Business Gets
• Data loss on restore, typically full day
• End-user slowdown during backup
What IT Gets
• Sprawl of non-scalable solutions
• Uncertain protection, poor visibility
Primary Causes of Downtime
2012 IOUG Survey – Enterprise Data and The Cost of Downtime*
Human Error
Storage Failure
Application Errors
Network Outages
Server Failure
Recovery plan / Training / Oversight
Interoperability / Scalability / Performance
Failover / Fallback capabilities
System Monitoring
Unplanned Downtime
*http://www.oracle.com/us/products/database/2012-ioug-db-survey-1695554.pdf
Bad News Travels Faster Than Good
What is The Cost of Downtime?
NASDAQ HALTS TRADING FOR THREE HOURS: http://www.businessinsider.com/nasdaq-options-market-halted-2013-8
What are Your Recovery Requirements?
Four Key Points to Define
• Recovery Point Objective (RPO)
• Retention Period
• Recovery Time Objective (RTO)
• Disaster Recovery (onsite/offsite)
Group Databases Into Protection Tiers
Basic Grouping Strategy - Example
Category | Gold | Silver | Bronze
RTO | Seconds | < 6 hours | Up to 24 hours
RPO | Current | Up to 3 hours* | Up to 6 hours*
Backup Retention: Critical Restores | Up to one week | One day | One day norm / not critical
Backup Retention: Retention | 7 years | 6 months | 1 month
DR / Long-term | Two sites for one week | Offsite copy within 3 days | No specific DR requirement
*Stay tuned to the new paradigm.
Exadata Environments
Common Restore Scenarios / Planning
The criticality and workloads of typical Exadata databases make recovery strategies especially important:
– Batch load / NOLOGGING operation went south
– Long-term, periodic archival backups (keep forever / until)
– Application patches and upgrades
– Backing out a bad transaction
Oracle Recovery Strategies
Complementary and Integrated Technologies
Category | Technology / Solution | Recovery Time Objective (RTO) | Recovery Point Objective (RPO)
Physical Data Protection | Recovery Manager (RMAN), Oracle Secure Backup (OSB) | Days/Hours | As of last backup
Physical Data Protection | Data Guard or Active Data Guard | Minutes/Seconds | Current
Logical Data Protection | Flashback Technologies | Hours/Minutes | Minutes
Recovery Analysis | Data Recovery Advisor (DRA): minimizes time for problem identification & recovery planning | Optimized | Optimized
Oracle Logical Data Protection Technologies
Complements Physical Data Protection Strategy
• Flashback Technologies are a suite of logical error investigation and correction capabilities built into the Oracle database:
– Error investigation: Flashback query, version query and transaction query
– Error correction: Flashback database, table, drop and transaction
• Flashback Database operates on physical data blocks and is similar in effect to point-in-time recovery; the other Flashback features operate at the logical level
– It is the only Flashback feature that must be explicitly enabled by the user, because it generates logs
• In applicable scenarios, Flashback features are more efficient than media recovery
Flashback Technologies Should Be Part of ALL Recovery Plans!
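Because Flashback Database is the one feature here that has to be switched on, a minimal sketch of enabling it from an RMAN session may help. It assumes a fast recovery area is already configured and the database runs in ARCHIVELOG mode; the 48-hour target is borrowed from the case study later in this deck and is only an illustration. On releases before 11g Release 2 the database would likely need to be mounted rather than open for the second command.

```rman
# Sketch only: run from RMAN connected to the target database.
# Assumes DB_RECOVERY_FILE_DEST and DB_RECOVERY_FILE_DEST_SIZE are already set.

# Ask the database to keep enough flashback logs for roughly 48 hours.
# The value is in minutes (2880 = 48 x 60) and is a target, not a guarantee.
SQL 'ALTER SYSTEM SET DB_FLASHBACK_RETENTION_TARGET=2880';

# Turn on flashback logging for the whole database.
SQL 'ALTER DATABASE FLASHBACK ON';
```

Flashback logs are written to the fast recovery area, so its size should be checked against the expected change rate before relying on the retention target.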
Restore Points
What They Are and Why Use Them
• A restore point is a user-defined name assigned to an SCN or specific point in time – a user-friendly “bookmark”
FLASHBACK DATABASE TO RESTORE POINT 'before_upgrade';
• User-defined restore point names may be used as aliases for an SCN with the following supported commands:
– RECOVER DATABASE and FLASHBACK DATABASE commands in RMAN
– FLASHBACK TABLE in SQL
• There are two types of restore points – Normal and Guaranteed
– Guaranteed restore points must be explicitly deleted by the user
– Normal restore points age out of the control file
• For archival backups, use the PRESERVE keyword to retain the restore point until backup expiration
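The restore point lifecycle described above can be sketched as an RMAN session. The name before_upgrade comes from the slide; everything else is an illustrative assumption, and the steps are meant to be run one phase at a time rather than as a single script. On a RAC database the other instances would typically need to be stopped before the flashback step, so check the documentation for your release.

```rman
# Phase 1, before the change: bookmark the current SCN.
# GUARANTEE makes the database keep the flashback logs needed to return here.
SQL 'CREATE RESTORE POINT before_upgrade GUARANTEE FLASHBACK DATABASE';

# ... run the upgrade or application deployment ...

# Phase 2, only if the change must be backed out:
# the database has to be mounted, not open, to be flashed back.
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
FLASHBACK DATABASE TO RESTORE POINT 'before_upgrade';
ALTER DATABASE OPEN RESETLOGS;

# Phase 3, cleanup: guaranteed restore points never age out,
# so drop the bookmark once it is no longer needed.
SQL 'DROP RESTORE POINT before_upgrade';
```

Leaving a guaranteed restore point in place indefinitely lets flashback logs accumulate in the fast recovery area, which is why the cleanup step matters.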
Flashback Database vs. Point-in-Time Recovery
Different Approaches and Multiple Use Cases

Flashback Database: rewinds the database to an SCN
Advantages
• Significantly faster than point-in-time recovery - no restore and only limited redo needed
• Useful during database upgrades and application deployments, and an efficient alternative to rebuilding a failed primary database after a Data Guard failover
• Provides continuous data protection
• Compatible with restore points
Disadvantages
• Requires Flashback logs and associated storage
• Works at whole database level only
• Flashback logging has some (minimal) overhead on database server

Traditional Point-in-Time Recovery: restores then recovers the database to an SCN
Advantages
• Works at the database or tablespace level
• No additional logs necessary beyond redo
• Compatible with restore points
Disadvantages
• Time consuming, especially for larger databases
• Database is down until fully recovered
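For contrast with the single FLASHBACK DATABASE command, a traditional point-in-time recovery to the same bookmark might look like the sketch below. It assumes a usable backup taken before the restore point exists and that all archived logs up to that point are available; the restore point name is reused from the earlier slide purely for illustration.

```rman
# Sketch only: database point-in-time recovery (DBPITR) to a restore point.
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;

RUN {
  # Stop recovery at the bookmark instead of the end of the redo stream.
  SET UNTIL RESTORE POINT 'before_upgrade';
  # Copy every data file back from backup (the slow part on large databases).
  RESTORE DATABASE;
  # Apply incremental backups and archived redo up to the bookmark.
  RECOVER DATABASE;
}

# An incomplete recovery always ends with a RESETLOGS open.
ALTER DATABASE OPEN RESETLOGS;
```

The RESTORE step reads the whole database from backup, which is where the time difference against Flashback Database comes from.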
Data Recovery Advisor (DRA)
Reduces Downtime by Eliminating Confusion!
• Oracle Database tool that automatically diagnoses data failures, presents repair options, and executes repairs at the user's request
• Determines failures based on symptoms
– Failure information is recorded in the Automatic Diagnostic Repository (ADR)
– Flags problems before the user discovers them, via automated health monitoring
• Intelligently determines recovery strategies
– Aggregates failures for efficient recovery, presents only feasible recovery options and indicates any data loss for each option
• Can automatically perform selected recovery steps
• Accessed via RMAN or EM
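When DRA is accessed through RMAN, the diagnose, advise and repair flow maps onto four commands. This is a minimal sketch of a typical session, not output from a real failure; whether DRA is available for a given configuration (RAC in particular) depends on the release, so treat that as something to verify.

```rman
# 1. Show the failures DRA has recorded, with priority and status.
LIST FAILURE;

# 2. Ask for repair options; must be run in the same session before REPAIR.
ADVISE FAILURE;

# 3. Display the generated repair script without executing it.
REPAIR FAILURE PREVIEW;

# 4. Execute the repair (RMAN asks for confirmation unless NOPROMPT is added).
REPAIR FAILURE;
```

Running the preview first keeps the final decision with the DBA, which matches the slide's point that repairs are executed at the user's request.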
How Good is Your Backup Infrastructure?
You Never Know – Unless You Periodically Test It!
1. Document a recovery plan for database and object level recovery
2. Perform regular recovery tests for various recovery scenarios:
   1. Full database
   2. Objects
   3. Control file
3. Refresh test environments with RMAN
4. If hardware isn’t available to perform full database recovery tests, use RMAN RESTORE VALIDATE
Job Security Tip # 1 – Successful recovery is all that matters!
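The RESTORE VALIDATE option mentioned in step 4 can be sketched as follows. These commands read the backups and check that they are usable, but write no data files, so they need no spare hardware. They confirm that the backup pieces can be read, not that a full recovery procedure works end to end.

```rman
# Check that the backups needed to restore the whole database are readable.
RESTORE DATABASE VALIDATE;

# Check the control file and server parameter file backups.
RESTORE CONTROLFILE VALIDATE;
RESTORE SPFILE VALIDATE;

# Check the archived log backups that recovery would need.
RESTORE ARCHIVELOG ALL VALIDATE;
```

A periodic full restore to a test system remains the stronger test whenever hardware allows it.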
Architecting Your Backup Infrastructure
RMAN Traditional Backup Strategies

Full Backup
• Two types of RMAN full backups:
– Image copy: disk only; same size as the database less temp files
– Backupset: disk or tape; smaller than an image copy full; can be compressed and/or encrypted by RMAN
• A full backup consumes more overhead on the production server and takes more time than an incremental backup
• Restoration may be faster than from an incremental

Full / Incremental Schedule
• Backupset backups – disk or tape
• Typical schedule – weekly full with daily incremental backups
• Typical retention:
– Days to weeks – on disk
– Weeks to years – on tape
– Full and corresponding incremental backups should be treated as a group
• Reduces backup window and overhead on servers
• Ideal with low-medium change rate, e.g. <20%
• Database must be in archived log mode
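A weekly full with daily incremental schedule could be expressed with the commands below. This is a sketch: the scheduling itself (cron, Enterprise Manager, or another scheduler) is outside RMAN, and the 14-day recovery window is an assumed value, not one from the slides. Note that the weekly backup is taken as a level 0 incremental, because a plain full backup cannot serve as the base for later level 1 backups.

```rman
# One-time setup: keep whatever is needed to recover to any point
# in the last 14 days (assumed value).
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 14 DAYS;

# Weekly job: the base of the cycle, plus the archived logs.
BACKUP INCREMENTAL LEVEL 0 DATABASE PLUS ARCHIVELOG;

# Daily job: only blocks changed since the most recent level 0 or level 1.
BACKUP INCREMENTAL LEVEL 1 DATABASE PLUS ARCHIVELOG;

# After either job: remove backups the retention policy no longer needs.
DELETE NOPROMPT OBSOLETE;
```

Because the retention policy works on what is needed for recovery, the level 0 backup and its dependent level 1 backups become obsolete together, which is the "treated as a group" behaviour the slide asks for.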
RMAN Incremental Forever Strategy
Incrementally Updated Backups
• Oracle Database 10g Release 2 Enterprise Edition and later
• Incremental forever after initial full image copy
• Full image copy is rolled forward on user-defined schedule
– Roll-forward / merge does incur overhead on server
– Offers SWITCH TO COPY capability
• Typical retention – one to seven days
• Back up full or incremental to tape
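The incrementally updated backup cycle is usually written as one repeating RMAN job. The sketch below assumes image copies go to the configured disk destination (for example the fast recovery area); the tag name incr_update is a hypothetical label used to tie the copy and its incrementals together.

```rman
# Run the same block on every cycle (for example, daily).
RUN {
  # Roll the existing image copy forward with the previous level 1 backup.
  # On the first run there is nothing to apply, so this step does no work.
  RECOVER COPY OF DATABASE WITH TAG 'incr_update';

  # Take a new level 1 backup for the next roll-forward.
  # On the first run, with no copy yet, RMAN creates the full image copy instead.
  BACKUP INCREMENTAL LEVEL 1
    FOR RECOVER OF COPY WITH TAG 'incr_update'
    DATABASE;
}

# Variant: keep the copy lagging 7 days behind to widen the recovery window.
# RECOVER COPY OF DATABASE WITH TAG 'incr_update' UNTIL TIME 'SYSDATE-7';
```

In an emergency, SWITCH DATABASE TO COPY points the database at the image copy instead of restoring files, which is the SWITCH TO COPY capability noted on the slide.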
Processing Offloaded From Database Nodes
Incremental Backup Scans Occur on Exadata Storage Cells
• Block Change Tracking (BCT) enables fast incremental backups
– RMAN tracks 32 KB data file sections that include one or more changed blocks
– During an incremental backup, RMAN scans these 32 KB file sections to determine which blocks have changed
• Only these changed blocks are included in the incremental backup
Note: For an incremental backup without Block Change Tracking (BCT) enabled, all database blocks are scanned to determine what has changed
• Database Server: scan of blocks occurs on the database server
• Exadata: scan of blocks is offloaded to the Exadata Storage Cells
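Enabling Block Change Tracking is a one-time command. The sketch below issues it from RMAN and assumes Oracle Managed Files is in use (DB_CREATE_FILE_DEST is set, as is common with ASM on Exadata); the +DATA disk group in the commented variant is an assumed name.

```rman
# Enable block change tracking; with Oracle Managed Files the tracking
# file is created automatically in DB_CREATE_FILE_DEST.
SQL 'ALTER DATABASE ENABLE BLOCK CHANGE TRACKING';

# Variant without Oracle Managed Files: name the location explicitly.
# Inside the double-quoted string, single quotes are doubled.
# SQL "ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE ''+DATA''";
```

The current state can be checked from SQL*Plus by querying V$BLOCK_CHANGE_TRACKING. The first level 0 backup taken after enabling still reads every block, so the benefit shows up from the following level 1 backup onward.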
Backup of Compressed Data
Effects on Sizing and Processes
• Compressed data remains compressed in the backup
– This data will not benefit from further compression during the backup (e.g. RMAN backup or tape drive compression)
– Deduplication software cannot deduplicate compressed data
• Compressed data types shown: HCC data, OLTP compressed tables, SecureFiles compressed/deduplicated
• RMAN backup compression is effective on non-compressed database files
• Avoid using RMAN backup compression on HCC tablespaces by separating the backups as shown below:
# Leave the HCC tablespace out of whole-database backups
CONFIGURE EXCLUDE FOR TABLESPACE historical_data;
CONFIGURE COMPRESSION ALGORITHM 'low';
# Back up the HCC tablespace on its own, without RMAN compression
BACKUP TABLESPACE historical_data;
# Back up everything else with RMAN compression
BACKUP AS COMPRESSED BACKUPSET database;
• Restore is no different than if the backups had not been separated
Protecting Exadata Operating System Files
• On the Exadata Storage Cells, the internal USB stick provides the backup
• On the Exadata database nodes, back up the operating system (OS) files in the same manner as with any other database server
Please refer to the documentation for more information: http://wd0338.oracle.com/archive/cd_ns/E13877_01/doc/doc.112/e13874/maintenance.htm#CHDIDGAI
Exadata Backup Targets
Considerations - Performance and Cost Trade-offs
Highest Performance
• Exadata Storage: 20 – 25 TB / hour; all Exadata smart features
• Exadata Storage Expansion Rack: 27 TB / hour; fastest backup and restore; ILM; historical archive; second DATA2 disk group
High Performance and Added Flexibility
• ZFS Storage Appliance (ZFS/SA): 13 TB / hour; backups of database & non-database files; snapshots; clones
• StorageTek Tape Library: 9 TB / hour*; backup of database and non-database files; offsite backups; vaulting
Cost – Varies with hardware configuration
Note*: Backup rate limited by number of tape drives – 8 x T10000C drives
Oracle-Integrated Backup to Disk and/or Tape
Multi-media Strategy: Disk-to-Disk-to-Tape (D2D2T)
Diagram: Exadata (Fast Recovery Area), RMAN disk backup on ZFS Storage Appliance (ZFSSA), backup to StorageTek Tape Library
Backup to tape:
BACKUP RECOVERY AREA;
BACKUP BACKUPSET ALL;
• Fast Recovery Area should reside on Exadata storage – slower storage could degrade production database performance
– Online redo, archived logs, Flashback logs, controlfile
Expanding Exadata Environments
Connectivity Considerations
Diagram: Exadata systems, each with its own FRA, sending RMAN disk backups to a ZFS Storage Appliance (ZFSSA) over InfiniBand and 10 Gigabit Ethernet
• What happens when a 2nd Exadata is added?
– The two Exadatas MUST be configured with different InfiniBand subnets.
• What about a 3rd Exadata?
– The 3rd Exadata would be connected via 10 Gigabit Ethernet.
Refer to the MAA white paper: http://www.oracle.com/technetwork/database/features/availability/maa-wp-dbm-zfs-backup-1593252.pdf
Customer Case Study – Fidelity Investments
Oracle OpenWorld
Exadata Backups
September 24, 2013
Harpreet Singh Vice President, Database Management Fidelity Investments
Transition To Exadata – A Huge Success!
Challenges with traditional infrastructure
• 300TB of storage with over 60% annual growth rate
• Performance challenges
• Cost reduction pressures
• Need to make failover/recovery more robust
Benefits gained with Exadata
• 42x performance gains for reporting & 40% for OLTP
• Reduced storage by 30% using compression
• Consolidated physical servers from 10 to 4
• Reduced direct/indirect chargebacks by 30%
• Significantly improved failover, backup & recovery strategy
Pre-Exadata Backup Challenges
• Over 60% annual data growth rate
• Business needs growing and becoming more complex
• Expensive software/hardware licenses
• Costly to keep backups on disk
• Backups hurting database performance
• Complicated recovery with “no-logging”
• Concerns around non-logical DR software
Fundamental Data Protection Strategy
1st Line of Defense
• Flashback: 48 hours
• Data deletion
• Logical corruption
• User errors
2nd Line of Defense
• Disk Backup: 24 hours
• Application
• System
3rd Line of Defense
• Standby Database (DR)
• Building/site, region
• HW failure
Last Line of Defense
• Tape: 35 days
• Offsite
• Multi-site failures
Flashback
• Oracle Flashback Database
• Primary and Standby Sites
Retention Period: 48 Hours
Restore Time: < 1 Hour
Space Used: 300GB
Pros
• Faster recovery
• Data recovery from tables, schema, or entire database
• Roll database back and forth repeatedly within the flashback window for complex data restore
Cons
• Same location as production – no protection from storage failure
• No protection from physical corruption
Disk Backup
• Exadata Fast Recovery Area
• Incrementally Updated
Retention Period: 24 Hours
Backup Rate: 1.2 TB/hour
Restore Rate: 1 TB/hour
Type: RMAN Online Daily, Normal Redundancy
Pros
• Protects against physical/logical database corruption
• Faster backup and restore
• Minimal overhead to the production database
Cons
• Shorter protection window (24 hours)
• Same location as production, so no protection against a site disaster or catastrophic storage failure
Standby Database
• Data Guard
• Asynchronous
• No Delay Apply
• 48 Hour Flashback Database setup
• 700 miles between Primary and Standby sites
Pros
• Great for any data recovery when combined with Flashback Database
• Complete data protection if primary site is lost
• Protection from physical corruption
• Can be turned into snapshot standby database temporarily and used for QA/Dev database refreshes through RMAN
Cons
• Resources (another set of servers/storage)
Tape Backup
Retention Period: 35 Days (Offsite)
Channels: 2-4
Nodes: 1
Backup Rate: 1TB/hour (2 channels)
Restore Rate: 800GB/hour (2 channels)
RTO: 3 Days
Type: RMAN, CommVault
Archived Redo Logs Retention: 3 Days on disk
Archived Redo Logs Backup: Every 30 minutes
Pros
• Longer term offsite retention than disk and standby
• Media is relatively cheap
Cons
• Slower backup and restore than disk
• Media is less reliable
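The archived log handling on this slide (back up every 30 minutes, keep 3 days on disk) could be approximated with two scheduled RMAN commands. This is a sketch, not Fidelity's actual job: it assumes tape (SBT) channels for the media manager are already configured, and the scheduler that runs each command is outside RMAN.

```rman
# Every 30 minutes: send to tape any archived log that has no tape backup yet.
BACKUP DEVICE TYPE sbt ARCHIVELOG ALL NOT BACKED UP 1 TIMES;

# Daily: free disk space by removing archived logs older than 3 days,
# but only those that already have at least one backup on tape.
DELETE NOPROMPT ARCHIVELOG UNTIL TIME 'SYSDATE-3'
  BACKED UP 1 TIMES TO DEVICE TYPE sbt;
```

Tying the delete to an existing tape backup means a failed tape job delays cleanup rather than losing redo.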
Planning a Comprehensive Backup Strategy
Determine disk backup strategy
• Consider full backups once a week with daily incremental
Develop tape backup process
• Implement Oracle suggested RMAN backup strategy as it is great protection against data loss
Test different restore processes
• At least annually
Consolidate tape backup system
• Should be centrally managed
Implementation Recommendations
Optimal performance
• Configure Exadata backup over InfiniBand for better throughput
• Configure number of channels based on database size and SLAs
• Use one RMAN channel per tape drive for better throughput
• Enable block change tracking for fast RMAN incremental backups
Data protection and disaster recovery
• Back up archived logs every 30 minutes for better data protection
• Encrypt the data before writing to tape for data security
• Set up Flashback on both primary and standby databases
• Utilize Data Guard broker
Monitoring
• Use Oracle Enterprise Manager to monitor:
– Disk backup
– Tape backup
– Data Guard
– Flashback
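Two of these recommendations, one channel per tape drive and encryption before writing to tape, map onto persistent RMAN settings. The sketch below assumes two tape drives (matching the two channels in the case study figures) and that an encryption wallet or keystore is already configured and open; backup encryption may also carry licensing requirements, so treat that line as something to confirm for your environment.

```rman
# One RMAN channel per tape drive: two drives assumed here.
CONFIGURE DEVICE TYPE sbt PARALLELISM 2;

# Encrypt backups of all database files (transparent mode, wallet required).
CONFIGURE ENCRYPTION FOR DATABASE ON;

# Review the resulting persistent configuration.
SHOW ALL;
```

More channels than tape drives tends to cause channels to wait on each other rather than improving throughput, which is why the slide pairs them one to one.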
Summary
• Have clear and well communicated recovery SLAs
• Build your strategy around the business needs
• Revisit a well-documented, multi-level strategy periodically
• Be conservative and prepare for the worst
• Test
• Practice
The New Modern Cloud Paradigm
Oracle Database Backup Logging Recovery Appliance
Please refer to Oracle.com for additional information: http://www.oracle.com/us/corporate/features/database-backup-logging-recovery-appliance/index.html
Announced at Oracle OpenWorld 2013
Oracle Technologies Mitigate Downtime
Complexities Are Inherent in IT – Know IT and PLAN for IT!
Oracle Technologies: Flashback Technologies, RMAN, Enterprise Manager, Active Data Guard, Oracle Secure Backup
• Validated, reliable backup you know can be recovered
• Oracle Engineered Solutions eliminate interoperability, patching and upgrade risks
• Policy-based data protection management
• Failover, fallback and/or disaster recovery
• Quickly review and/or correct user errors
• System monitoring
Key Takeaways
Exadata Backup and Recovery
• RMAN backup / recovery on Exadata is the same as other platforms – just faster!
• Oracle data protection technologies meet diverse RTO / RPO and budget requirements
• Database consolidation and data protection are ideally suited to the Exadata platform
Who Better to Back Up Oracle Than Oracle?
Resources
OTN HA Portal: http://www.oracle.com/goto/availability
Maximum Availability Architecture (MAA): http://www.oracle.com/goto/maa
MAA Blogs: http://blogs.oracle.com/maa
Exadata on OTN: http://www.oracle.com/technetwork/database/exadata/index.html
Oracle HA Customer Success Stories on OTN: http://www.oracle.com/technetwork/database/features/ha-casestudies-098033.html