Network Appliance™ Best Practices for Oracle®
TECHNICAL REPORT
Revision 1.0, October 30, 2004

Eric Barrett, Technical Global Advisor
Bikash R. Choudhury, Technical Global Advisor
Bruce Clarke, Consulting Systems Engineer
Sunil Mahale, Technical Marketing
Ravi Animi, Technical Marketing
Blaine McFadden, Technical Marketing
Ed Hsu, Systems Engineer
Christopher Slater, Database Consulting Systems Engineer
Michael Tatum, Database Consulting Systems Engineer

Network Appliance, a pioneer and industry leader in data storage technology, helps organizations understand and meet complex technical challenges with advanced storage solutions and global data management strategies.

Table of Contents

Introduction
1. Network Appliance System Configuration
   1.1. Appliance Network Settings
      1.1.1. Ethernet—Gigabit Ethernet, Autonegotiation, and Full Duplex
   1.2. Volume Setup and Options
      1.2.1. Databases
      1.2.2. Volume Size
      1.2.3. Oracle Optimal Flexible Architecture (OFA) on NetApp Storage
      1.2.4. Best Practices for Control and Log Files
   1.3. RAID Group Size
   1.4. Snapshot and SnapRestore
   1.5. Snap Reserve
   1.6. System Options
      1.6.1. The minra Option
      1.6.2. File Access Time Update
      1.6.3. NFS Settings
2. Operating Systems
   2.1. Linux
      2.1.1. Linux—Recommended Versions
      2.1.2. Linux—Kernel Patches
      2.1.3. Linux—OS Settings
      2.1.4. Linux Networking—Full Duplex and Autonegotiation
      2.1.5. Linux Networking—Gigabit Ethernet Network Adapters
      2.1.6. Linux Networking—Jumbo Frames with GbE
      2.1.7. Linux NFS Protocol—Mount Options
      2.1.8. iSCSI Initiators for Linux
      2.1.9. FC-AL Initiators for Linux
   2.2. Sun Solaris Operating Systems
      2.2.1. Solaris—Recommended Versions
      2.2.2. Solaris—Kernel Patches
      2.2.3. Solaris—OS Settings
      2.2.4. Solaris Networking—Full Duplex and Autonegotiation
      2.2.5. Solaris Networking—Gigabit Ethernet Network Adapters
      2.2.6. Solaris Networking—Jumbo Frames with GbE
      2.2.7. Solaris Networking—Improving Network Performance
      2.2.8. Solaris IP Multipathing (IPMP)
      2.2.9. Solaris NFS Protocol—Mount Options
      2.2.10. iSCSI Initiators for Solaris
      2.2.11. Fibre Channel SAN for Solaris
   2.3. Microsoft Windows Operating Systems
      2.3.1. Windows Operating System—Recommended Versions
      2.3.2. Windows Operating System—Service Packs
      2.3.3. Windows Operating System—Registry Settings
      2.3.4. Windows Networking—Autonegotiation and Full Duplex
      2.3.5. Windows Networking—Gigabit Ethernet Network Adapters
      2.3.6. Windows Networking—Jumbo Frames with GbE
      2.3.7. iSCSI Initiators for Windows
      2.3.8. FC-AL Initiators for Windows
3. Oracle Database Settings
   3.1. DISK_ASYNCH_IO
   3.2. DB_FILE_MULTIBLOCK_READ_COUNT
   3.3. DB_BLOCK_SIZE
   3.4. DBWR_IO_SLAVES and DB_WRITER_PROCESSES
   3.5. DB_BLOCK_LRU_LATCHES
4. Backup, Restore, and Disaster Recovery
   4.1. How to Back Up Data from a NetApp System
   4.2. Creating Online Backups Using Snapshot Copies
   4.3. Recovering Individual Files from a Snapshot Copy
   4.4. Recovering Data Using SnapRestore
   4.5. Consolidating Backups with SnapMirror
   4.6. Creating a Disaster Recovery Site with SnapMirror
   4.7. Creating Nearline Backups with SnapVault
   4.8. NDMP and Native Tape Backup and Recovery
   4.9. Using Tape Devices with NetApp Systems
   4.10. Supported Third-Party Backup Tools
   4.11. Backup and Recovery Best Practices
      4.11.1. SnapVault and Database Backups
References
Revision History

Introduction

Thousands of Network Appliance (NetApp) customers have successfully deployed Oracle Databases on NetApp filers for their mission- and business-critical applications. NetApp and Oracle have worked over the past several years to validate Oracle products on NetApp filers and a range of server platforms. NetApp and Oracle support have established a joint escalations team that works hand in hand to resolve customer support issues in a timely manner. In the process, the team discovered that most escalations are due to failure to follow established best practices when deploying Oracle Databases with NetApp filers.

This document describes best practices for running Oracle Databases on NetApp filers with system platforms such as Solaris™, HP/UX, AIX, Linux®, and Windows®. These practices were developed through the interaction of technical personnel from NetApp, Oracle, and joint customer sites. This guide assumes a basic understanding of the technology and operation of NetApp products and presents options and recommendations for planning, deployment, and operation of NetApp products to maximize their effective use.

1. Network Appliance System Configuration

1.1. Appliance Network Settings

When configuring network interfaces for new systems, it's best to run the setup command to automatically bring up the interfaces and update the /etc/rc file and /etc/hosts file. The setup command will require a reboot to take effect. However, if a system is in production and cannot be rebooted, network interfaces can be configured with the ifconfig command. If a NIC is currently online and needs to be reconfigured, it must first be brought down. To minimize downtime on that interface, a series of commands can be entered on a single command line separated by the semicolon (;) symbol.

Example:

filer> ifconfig e0 down; ifconfig e0 `hostname`-e0 mediatype auto netmask 255.255.255.0 partner e0

When configuring or reconfiguring NICs or VIFs in a cluster, it is imperative to include the appropriate partner <interface> name or VIF name in the configuration of the cluster partner’s NIC or VIF to ensure fault tolerance in the event of cluster takeover. Please consult your NetApp support representative for assistance. A NIC or VIF being used by a database should not be reconfigured while the database is active. Doing so can result in a database crash.
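For a persistent configuration, an equivalent entry can be placed in the filer's /etc/rc file. The following is a minimal sketch for one node of a clustered pair; the interface name, netmask, and partner interface are illustrative and must be adapted to the actual environment:

# /etc/rc excerpt on one filer: bring up e0 and declare the cluster partner's e0
ifconfig e0 `hostname`-e0 mediatype auto netmask 255.255.255.0 partner e0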

1.1.1. Ethernet—Gigabit Ethernet, Autonegotiation, and Full Duplex

Any database using NetApp storage should utilize Gigabit Ethernet on both the filer and database server. NetApp Gigabit II, III, and IV cards are designed to autonegotiate interface configurations and are able to intelligently self-configure themselves if the autonegotiation process fails. For this reason, NetApp recommends that Gigabit Ethernet links on clients, switches, and NetApp systems be left in their default autonegotiation state, unless no link is established, performance is poor, or other conditions arise that might warrant further troubleshooting.

Flow control should by default be set to "full" on the filer in its /etc/rc file, by including the following entry (assuming the Ethernet interface is e5):

ifconfig e5 flowcontrol full

If the output of the ifstat -a command does not show full flow control, then the switch port will also have to be configured to support it. (The ifconfig command on the filer will always show the requested setting; ifstat shows what flow control was actually negotiated with the switch.)

1.2. Volume Setup and Options

1.2.1. Databases

There is currently no empirical data to suggest that splitting a database into multiple physical volumes enhances or degrades performance. Therefore, the decision on how to structure the volumes used to store a database should be driven by backup, restore, and mirroring requirements.

A single database instance should not be hosted on multiple unclustered filers, because a database with sections on multiple filers makes maintenance that requires filer downtime—even for short periods—hard to schedule and increases the impact of downtime. If a single database instance must be spread across several separate filers for performance, care should be taken during planning so that the impact of filer maintenance or backup can be minimized. Segmenting the database so the portions on a specific filer can periodically be taken offline is recommended whenever feasible.

1.2.2. Volume Size

While the maximum supported volume size on a NetApp system is 8TB, NetApp discourages customers from configuring individual volumes larger than 3TB. NetApp recommends that the size of a volume be limited to 3TB or smaller for the following reasons:

Reduced per volume backup time
Individual grouping of Snapshot™ copies, qtrees, etc.
Improved security and manageability through data separation
Reduced risk from administrative mistakes, hardware failures, etc.

Very small volumes (two or three disks) will have limited performance. NetApp recommends that no fewer than 10 data disks be configured in volumes that require high data throughput.
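As a rough sketch of such a configuration (the volume name and disk count are illustrative, and exact syntax can vary by Data ONTAP release), a traditional volume with enough spindles for a high-throughput database could be created with:

filer> vol create oradata 14

Data ONTAP divides the 14 disks into RAID groups according to the volume's RAID group size; with the default group size of eight (see section 1.3), this yields two RAID groups, each with its own parity disk.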

1.2.3. Oracle Optimal Flexible Architecture (OFA) on NetApp Storage

Distribute files on multiple volumes on physically separate disks to achieve I/O load balancing:

Separate out high I/O Oracle files from system files for better response times
Ease backup and recovery for Oracle data and log files by putting them in separate logical volumes
Ensure fast recovery from a crash to minimize downtime
Maintain logical separation of Oracle components to ease maintenance and administration

The OFA architecture works well with a multiple Oracle home (MOH) layout.

[Figure: multiple Oracle home layout. ORACLE_BASE contains ORACLE_HOME (home1) and ORACLE_HOME (home2), each with its own /DBS, /LOG, and /Admin directories.]

Filer            Server                   OFA
/vol/vol0        (filer root volume)
/vol/oracle      /var/opt/oracle          $ORACLE_HOME (Oracle libraries)
/vol/oradata     /var/opt/oracle/dbs      $ORACLE_HOME/dbs (data files)
/vol/oralog      /var/opt/oracle/log      $ORACLE_HOME/log (log files)

1.2.4. Best Practices for Control and Log Files

Online Redo Log Files

Multiplex your log files. To do that:

1. Create a minimum of two online redo log groups, each with three members.
2. Put the first online redo log group on one volume and the next on another volume.

Oracle writes redo for each committed transaction to all members of the current online redo log group. When that group fills up, it switches to the next group, and so on; when the last group is full, it cycles back to the first. Each log switch triggers a checkpoint, and the filled redo log group is copied to the archived log files when the database runs in ARCHIVELOG mode.

Redo Grp 1: $ORACLE_HOME/Redo_Grp1 (on filer volume /vol/oracle)
Redo Grp 2: $ORACLE_HOME/Redo_Grp2 (on filer volume /vol/oralog)

Archived Log Files

1. Set your init parameter, LOG_ARCHIVE_DEST, to a directory in the log volume, such as $ORACLE_HOME/log/ArchiveLog (on filer volume /vol/oralog).

Control Files

Multiplex your control files. To do that:

1. Set your init parameter, CONTROL_FILES, to point to destinations on at least two different filer volumes:

Dest 1: $ORACLE_HOME/Control_File1 (on filer volume /vol/oracle)
Dest 2: $ORACLE_HOME/log/Control_File2 (on filer volume /vol/oralog)

The resulting layout, with each server path NFS-mounted from the filer:

Filer            Server                                   Contents
/vol/vol0        (filer root volume)
/vol/oracle      /var/opt/oracle ($ORACLE_HOME)           /Binaries; /Redo_Grp1 (Redo Log Members 1-3); /Control_File1 (Control File 1)
/vol/oradata     /var/opt/oracle/dbs ($ORACLE_HOME/dbs)   /Data Files
/vol/oralog      /var/opt/oracle/log ($ORACLE_HOME/log)   /Redo_Grp2 (Redo Log Members 1-3); /Control_File2 (Control File 2)
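A minimal init.ora sketch reflecting this layout (the control file names are illustrative and not part of the original layout):

# Archived redo logs go to the log volume (/vol/oralog)
log_archive_dest = /var/opt/oracle/log/ArchiveLog
# Multiplexed control files on two different filer volumes
control_files = (/var/opt/oracle/Control_File1/control01.ctl, /var/opt/oracle/log/Control_File2/control02.ctl)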

OFA Layout on NetApp Filer

/                      [Root partition on Oracle DB Server machine]
/var/opt/oracle        [Oracle Home (Binaries) file system]
    /Binaries
    /Redo_Grp1
        /Redo Log Member1
        /Redo Log Member2
        /Redo Log Member3
    /Control_File1
        /Control File 1
/var/opt/oracle/dbs    [Data File file system]
    /Data Files
/var/opt/oracle/log    [Log File file system]
    /Redo_Grp2
        /Redo Log Member1
        /Redo Log Member2
        /Redo Log Member3
    /Control_File2
        /Control File 2

[Figure: database clients reach the DB server over 10/100 Ethernet using Oracle® Net (SQL*Net); the DB server, acting as an NFS client, mounts the NetApp® filer volumes /vol/vol0, /vol/oracle, /vol/oradata, and /vol/oralog as its mount points over Gigabit Ethernet using NFS over TCP/IP.]

1.3. RAID Group Size

When reconstruction rate (the time required to rebuild a disk after a failure) is an important factor, smaller RAID groups should be used. Network Appliance recommends using the default RAID group size of eight disks for most applications. Larger RAID group sizes increase the impact from disk reconstruction due to:

Increased number of reads required
Increased RAID resources required
An extended period during which I/O performance is impacted (reconstruction in a larger RAID group takes longer; therefore I/O performance is compromised for a longer period)

These factors will result in a larger performance impact to normal user workloads and/or slower reconstruction rates. Larger RAID groups also increase the possibility that a double disk failure will lead to data loss. (The larger the RAID group, the greater the chance that two disks will fail at the same time in the same group.)

With the release of Data ONTAP™ 6.5, double-parity RAID, or RAID-DP™, was introduced. With RAID-DP, each RAID group is allocated an additional parity disk. Given this additional protection, the likelihood of data loss due to a double disk failure has been nearly eliminated, and therefore larger RAID group sizes can be supported. With Data ONTAP 6.5 or later, RAID group sizes of up to 14 disks can be safely configured using RAID-DP; however, we recommend the default RAID group size of 16 for RAID-DP.
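For illustration only (the volume name and disk count are assumptions, and flags can vary by Data ONTAP release), a RAID-DP volume using the default group size of 16 might be created as follows:

filer> vol create oradata -t raid_dp -r 16 16

Here -t selects the RAID type and -r the RAID group size; the 16 disks form a single RAID-DP group with two parity disks and 14 data disks.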

1.4. Snapshot and SnapRestore®

NetApp strongly recommends using Snapshot and SnapRestore for Oracle Database backup and restore operations. Snapshot provides a point-in-time copy of the entire database in seconds without incurring any performance penalty, while SnapRestore can instantly restore an entire database to a point in time in the past.

In order for Snapshot copies to be effectively used with Oracle Databases, they must be coordinated with the Oracle hot backup facility. For this reason, NetApp recommends that automatic Snapshot copies be turned off on volumes that are storing data files for an Oracle Database. To turn off automatic Snapshot copies on a volume, issue the following command:

vol options <volname> nosnap on

If you want to make the ".snapshot" directory invisible to clients, issue the following command:

vol options <volname> nosnapdir on

With automatic Snapshot copies disabled, regular Snapshot copies are created as part of the Oracle backup process when the database is in a consistent state. For additional information on using Snapshot and SnapRestore to back up/restore an Oracle Database, see [5].
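A minimal sketch of such coordination (the volume, Snapshot, and tablespace names are illustrative, and rsh access from the database server to the filer is assumed; see [5] for the complete procedure):

# Put the tablespace(s) into hot backup mode
sqlplus "/ as sysdba" <<EOF
alter tablespace users begin backup;
EOF
# Take the Snapshot copy on the filer
rsh filer1 snap create oradata ora_hotbackup
# End hot backup mode and archive the current redo log
sqlplus "/ as sysdba" <<EOF
alter tablespace users end backup;
alter system archive log current;
EOF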

1.5. Snap Reserve

Setting the snap reserve on a volume sets aside part of the volume for the exclusive use of Snapshot copies. Note: Snapshot copies may consume more space than allocated with snap reserve, but user files may not consume the reserved space.

To see the snap reserve size on a volume, issue this command:

snap reserve

To set the volume snap reserve size (the default is 20%), issue this command:

snap reserve <volume> <percentage>

Do not use a percent sign (%) when specifying the percentage. The snap reserve should be adjusted to reserve slightly more space than the Snapshot copies of a volume consume at their peak. The peak Snapshot copy size can be determined by monitoring a system over a period of a few days when activity is high. The snap reserve may be changed at any time. Don't raise the snap reserve to a level that exceeds free space on the volume; otherwise client machines may abruptly run out of storage space.

NetApp recommends that you observe the amount of snap reserve being consumed by Snapshot copies frequently. Do not allow the amount of space consumed to exceed the snap reserve. If the snap reserve is exceeded, consider increasing the percentage of the snap reserve or deleting Snapshot copies until the amount of space consumed is less than 100%. NetApp DataFabric® Manager (DFM) can aid in this monitoring.
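For example, to reserve 25% of a volume named oradata for Snapshot copies (both the volume name and the percentage are illustrative):

filer> snap reserve oradata 25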

1.6. System Options

1.6.1. The minra Option

When the minra option is enabled, it minimizes the number of blocks that are prefetched for each read operation. By default, minra is turned off, and the system performs aggressive read ahead on each volume. The effect of read ahead on performance is dependent on the I/O characteristics of the application. If data is being accessed sequentially, as when a database performs full table and index scans, read ahead will increase I/O performance. If data access is completely random, read ahead should be disabled, since it may decrease performance by prefetching disk blocks that are never used, thereby wasting system resources. The following command is used to enable minra on a volume and turn read ahead off:

vol options <volname> minra on


Generally, the read ahead operation is beneficial to databases, and the minra option should be left alone. However, NetApp recommends experimenting with the minra option to observe the performance impact, since it is not always possible to determine how much of an application’s activity is sequential versus random. This option is transparent to client access and can be changed at will without disrupting client I/O. Be sure to allow two to three minutes for the cache on the appliance to adjust to the new minra setting before looking for a change in performance.

1.6.2. File Access Time Update

Another option that can improve performance is disabling file access time updates. If an application does not require or depend upon maintaining accurate access times for files, this option can be disabled. Use this option only if the application generates heavy read I/O traffic. The following command is used to disable file access time updates:

vol options <volname> no_atime_update on

1.6.3. NFS Settings

NetApp recommends the use of TCP as the data transport mechanism with the current NFS V3.0 client software on the host. If it isn't possible to use NFS V3.0 on the client, then it may be necessary to use UDP as the data transport mechanism. When UDP is configured as the data transport mechanism, the following NFS option should be configured on the NetApp system:

options nfs.udp.xfersize 32768

This sets the NFS transfer size to the maximum. There is no penalty for setting this value to the maximum of 32,768. However, if xfersize is set to a small value and an I/O request exceeds that value, the I/O request is broken up into smaller chunks, resulting in degraded performance.

2. Operating Systems

2.1. Linux

For additional information about getting the most from Linux and NetApp technologies, see [6].


2.1.1. Linux—Recommended Versions

The various Linux operating systems are based on the underlying kernel. With all the distributions available, it is important to focus on the kernel to understand features and compatibility.

Kernel 2.2

The NFS client in kernels later than 2.2.19 supports NFS V2 and NFS V3 and supports NFS over both UDP and TCP. Clients in earlier releases of the 2.2 branch did not support NFS over TCP. This NFS client is not as stable, nor does it perform as well as the NFS client in the 2.4 branch. However, it should work well enough for most small to moderate workloads. TCP support is less mature than UDP support in the 2.2 branch, so using UDP may provide higher reliability when using the 2.2 branch of the Linux kernel.

Kernel 2.4

The NFS client in this kernel has many improvements over the 2.2 client, most of which address performance and stability problems. The NFS client in kernels later than 2.4.16 has significant changes to help improve performance and stability. There have been recent controversial changes in the 2.4 branch that have prevented distributors from adopting late releases of the branch. Although there were significant improvements to the NFS client in 2.4.15, Torvalds also replaced parts of the VM subsystem, making the 2.4.15, 2.4.16, and 2.4.17 kernels unstable for heavy workloads. Many recent releases from Red Hat and SuSE include the 2.4.18 kernel.

The use of 2.4 kernels on hardware with more than 896MB of memory should include a special kernel compile option known as CONFIG_HIGHMEM, which is required to access and use memory above 896MB. The Linux NFS client has a known problem in these configurations in which an application or the whole client system can hang at random. This issue has been addressed in the 2.4.20 kernel, but still haunts kernels contained in distributions from Red Hat and SuSE that are based on earlier kernels.

NetApp has tested many kernel distributions, and those based on 2.4.18 are currently recommended. Recommended distributions include Red Hat Enterprise Linux Advanced Server 2.1 (based on 2.4.9), as well as SuSE 7.2 and SLES 8. Work is under way to test the newer Linux distributions such as RHEL 3 and SLES 9. At this time not enough testing has been completed to recommend their use. This section will be revisited in the future with further recommendations.

Manufacturer   Version               Tested   Recommended
Red Hat        Advanced Server 2.1   Yes      Yes
SUSE           7.2                   Yes      Yes
SUSE           SLES 8                Yes      Yes

2.1.2. Linux—Kernel Patches

In all circumstances, the kernel patches recommended by Oracle for the particular database product being run should be applied first. In general, those recommendations will not conflict with the ones here, but if a conflict does arise, check with Oracle or NetApp customer support for resolution before proceeding.

The uncached I/O patch was introduced in Red Hat Advanced Server 2.1, update 3, with kernel errata e35 and up. It is mandatory to use uncached I/O when running Oracle9i™ RAC with NetApp filers in a NAS environment. Uncached I/O does not cache data in the Linux file system buffer cache during read/write operations for volumes mounted with the "noac" mount option. To enable uncached I/O, add the following entry to the /etc/modules.conf file and reboot the cluster nodes:

options nfs nfs_uncached_io=1

The volumes used for storing Oracle Database files should still be mounted with the "noac" mount option for Oracle9i RAC databases. The uncached I/O patch has been developed by Red Hat and tested by Oracle, NetApp, and Red Hat.

2.1.3. Linux—OS Settings

2.1.3.1. Enlarging a Client's Transport Socket Buffers

Enlarging the transport socket buffers that Linux uses for NFS traffic helps reduce resource contention on the client, reduces performance variance, and improves maximum data and operation throughput. In future releases of the client, the following procedure will not be necessary, as the client will automatically choose an optimal socket buffer size.

1. Become root on the client.
2. cd /proc/sys/net/core
3. echo 262143 > rmem_max
4. echo 262143 > wmem_max
5. echo 262143 > rmem_default
6. echo 262143 > wmem_default
7. Remount the NFS file systems on the client.

This is especially useful for NFS over UDP and when using Gigabit Ethernet. Consider adding this to a system startup script that runs before the system mounts NFS file systems. The recommended size (262,143 bytes) is the largest safe socket buffer size NetApp has tested. On clients with 16MB of memory or less, leave the default socket buffer size setting to conserve memory. Red Hat distributions after 7.2 contain a file called /etc/sysctl.conf where changes such as this can be added so they will be executed after every system reboot. Add these lines to the /etc/sysctl.conf file on these Red Hat systems:

net.core.rmem_max = 262143
net.core.wmem_max = 262143
net.core.rmem_default = 262143
net.core.wmem_default = 262143

2.1.3.2. Other TCP Enhancements

The following settings can help reduce the amount of work clients and filers do when running NFS over TCP:

echo 0 > /proc/sys/net/ipv4/tcp_sack
echo 0 > /proc/sys/net/ipv4/tcp_timestamps

These operations disable optional features of TCP to save a little processing time and network bandwidth. When building kernels, be sure that CONFIG_SYNCOOKIES is disabled. SYN cookies slow down TCP connections by adding extra processing on both ends of the socket. Some Linux distributors provide kernels with SYN cookies enabled. Linux 2.2 and 2.4 kernels support large TCP windows (RFC 1323) by default. No modification is required to enable large TCP windows.
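As with the socket buffer sizes above, these settings can be made persistent across reboots on distributions that support /etc/sysctl.conf; a minimal sketch:

net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0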

2.1.4. Linux Networking—Full Duplex and Autonegotiation

Most network interface cards use autonegotiation to obtain the fastest settings allowed by the card and the switch port to which it attaches. Sometimes, chipset incompatibilities may result in constant renegotiation or negotiating half duplex or a slow speed. When diagnosing a network problem, be sure the Ethernet settings are as expected before looking for other problems. Avoid hard coding the settings to solve autonegotiation problems, because it only masks a deeper problem. Switch and card vendors should be able to help resolve these problems.

2.1.5. Linux Networking—Gigabit Ethernet Network Adapters

If Linux servers are using high-performance networking (gigabit or faster), provide enough CPU and memory bandwidth to handle the interrupt and data rate. The NFS client software and the gigabit driver reduce the resources available to the application, so make sure resources are adequate. Most gigabit cards that support 64-bit PCI or better should provide good performance. Any database using NetApp storage should utilize Gigabit Ethernet on both the filer and database server to achieve optimal performance. NetApp has found that the following Gigabit Ethernet cards work well with Linux:

SysKonnect. The SysKonnect SK-98XX series cards work very well with Linux and support single- and dual-fiber and copper interfaces for better performance and availability. A mature driver for this card exists in the 2.4 kernel source distribution.

Broadcom. Many cards and switches use this chipset, including the ubiquitous 3Com solutions. This provides a high probability of compatibility between network switches and Linux clients. The driver software for this chipset appeared in the 2.4.19 Linux kernel and is included in Red Hat distributions with earlier 2.4 kernels. Be sure the chipset firmware is up to date.

AceNIC Tigon II. Several cards, such as the NetGear GA620T, use this chipset, but none are still being manufactured. A mature and actively maintained driver for this chipset exists in the kernel source distribution.

Intel® EEPro/1000. This appears to be the fastest gigabit card available for systems based on Intel, but the card's driver software is included only in recent kernel source distributions (2.4.20 and later) and may be somewhat unstable. The card's driver software for earlier kernels can be found on the Intel Web site. There are reports that the jumbo frame MTU for Intel cards is only 8998 bytes, not the standard 9000 bytes.

2.1.6. Linux Networking—Jumbo Frames with GbE

All of the cards described above support the jumbo frames option of Gigabit Ethernet. Using jumbo frames can improve performance in environments where Linux NFS clients and NetApp systems are together on an unrouted network. Be sure to consult the command reference for each switch to make sure it is capable of handling jumbo frames. There are some known problems in Linux drivers and the networking layer when using the maximum frame size (9000 bytes). If unexpected performance slowdowns occur when using jumbo frames, try reducing the MTU to 8960 bytes.
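A minimal sketch of enabling jumbo frames end to end (interface names are illustrative, and the switch ports in the path must also be configured for jumbo frames):

# On the Linux client
ifconfig eth1 mtu 9000 up
# On the filer (Data ONTAP ifconfig uses the mtusize keyword)
ifconfig e5 mtusize 9000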


2.1.7. Linux NFS Protocol—Mount Options

Version. Use NFS V3 whenever practical. Specify the mount option "vers=3" when mounting file systems from a NetApp system. (Be sure that the NFS V3 protocol is also enabled on the filer.)

Hard. The "hard" mount option is the default on Linux and is mandatory to ensure data integrity. Using the "soft" option keeps the NFS client from hanging during server and network outages, but it may expose applications to silent data corruption, even when file systems are mounted read-only.

Intr. As an alternative to soft mounting, consider using the "intr" option, which allows users and applications to interrupt the NFS client when it is hung waiting for a server or network to recover. On Linux, interrupting applications or mount commands is not always successful, so rebooting may be necessary to recover when a mount hangs because a server is not available. Do not use this option for databases on Linux; databases should use the nointr option.

nointr. When running applications such as databases that depend on end-to-end data integrity, use "hard,nointr." If a system call is interrupted and the "intr" option is in effect, the system call can return error codes the application is not expecting, resulting in incorrect behavior or data corruption. Use of "intr" carries a small risk of unexpected database crashes with associated potential loss of transactions. Given this small but real risk, NetApp recommends the use of "nointr" for Linux NFS mounts for mission-critical database environments. Less critical environments such as development and test can use "intr" with only a small risk of problems.

NetApp recommends the use of TCP for NFS. There are rare cases in which NFS over UDP on noisy or very busy networks can result in data corruption. Oracle has certified Oracle9i RAC on NFS over TCP running on Red Hat Advanced Server 2.1.

rsize/wsize. In Linux, the "rsize" and "wsize" mount options have additional semantics compared with the same options as implemented in other operating systems. Normally these options determine how large a network read or write operation can be before the client breaks it into smaller operations. Low rsize and wsize values are appropriate when adverse network conditions prevent NFS from working with higher values or when NFS must share a low-bandwidth link with interactive data streams. By default, NFS client implementations choose the largest rsize and wsize values a server supports. However, if rsize and wsize are not explicitly set when mounting an NFS file system on a Red Hat host, the default value for both is a modest 4096 bytes. Red Hat chose this default because it allows the Linux NFS client to work without adjustment in most environments. Usually, on clean high-performance networks or with NFS over TCP, overall NFS performance can be improved by explicitly increasing these values. With NFS over TCP, setting rsize and wsize to 32kB usually provides good performance by allowing a single RPC to transmit or receive a large amount of data.

It is very important to note that the capabilities of the Linux NFS server are different from the capabilities of the Linux NFS client. As of the 2.4.19 kernel release, the Linux NFS server does not support NFS over TCP and does not support rsize and wsize larger than 8kB. The Linux NFS client, however, supports NFS over both UDP and TCP and rsize and wsize up to 32kB. Some online documentation is confusing when it refers to features that "Linux NFS" supports. Usually such documentation refers to the Linux NFS server, not the client.

fg vs. bg. Consider using the "bg" option if a client system needs to be available even if it cannot mount some servers. This option causes mount requests to put themselves in the background automatically if a mount cannot complete immediately. When a client starts up and a server is not available, the client waits for the server to become available by default. The default behavior results in waiting for a very long time before giving up. The "fg" option is useful when mount requests must be serialized during system initialization. For example, a system must mount /usr before proceeding with multiuser boot. When /usr or other critical file systems are mounted from an NFS server, the fg option should be specified.


Because the boot process can complete without all the file systems being available, care should be taken to ensure that required file systems are present before starting the Oracle Database processes.

nosuid. The "nosuid" mount option can be used to improve security. This option causes the client to disable the special bits on files and directories. The Linux man page for the mount command recommends also disabling or removing the suidperl command when using this option.

actimeo/nocto. Due to the requirements of the NFS protocol, clients must check back with the server at intervals to be sure cached attribute information is still valid. The attribute cache timeout interval can be lengthened with the "actimeo" mount option to reduce the rate at which the client tries to revalidate its attribute cache. With the 2.4.19 kernel release, the "nocto" mount option can also be used to reduce the revalidation rate even further, at the expense of cache coherency among multiple clients.


timeo. The units specified for the timeout option are in tenths of a second, which causes confusion for many users. This option controls RPC retransmission timeouts. By default, the client retransmits an unanswered UDP RPC request after 0.6 seconds (timeo=6). In general, it is not necessary to change the retransmission timeout, but in some cases, a shorter retransmission timeout for NFS over UDP may improve latencies due to packet losses. As of kernel 2.4.20, an estimation algorithm that adjusts the timeout for optimal performance governs the UDP retransmission timeout for some types of RPC requests. The TCP network protocol contains its own timeout and retransmission mechanism. The RPC client depends on this mechanism for recovering from the loss of RPC requests and thus uses a much longer timeout setting for NFS over TCP by default. Due to a bug in the mount command, the default retransmission timeout value on Linux for NFS over TCP is six seconds, unlike other NFS client implementations. To obtain standard behavior, you may wish to specify "timeo=600" explicitly when mounting via TCP. Using a short retransmission timeout with NFS over TCP does not have performance benefits and may introduce the risk of data corruption.

sync. The Linux NFS client delays application writes to combine them into larger, more efficiently processed requests. The "sync" option guarantees that a client immediately pushes every write system call an application makes to servers. This is useful when an application must guarantee that data is safe on disk before it continues. Frequently such applications already use the O_SYNC open flag or invoke the flush system call when needed. Thus, the sync mount option is often not necessary. Oracle Database software specifies O_DSYNC when it opens files, so the use of the "sync" option is not required in an Oracle environment.


Noac. The "noac" mount option prevents an NFS client from caching file attributes. This means that every file operation on the client that requires file attribute information results in a GETATTR operation to retrieve a file's attribute information from the server. Note that noac also causes a client to process all writes to that file system synchronously, just as the sync mount option does. Disabling attribute caching is only one part of noac; it also guarantees that data modifications are visible on the server so that other clients using noac can detect them immediately. Thus noac is shorthand for "actimeo=0,sync." When the noac option is in effect, clients still cache file data as long as they detect that a file has not changed on the server. This allows a client to keep very close track of files on a server so it can discover changes made by other clients quickly. This option is normally not used, but it is important when an application that depends on single-system behavior is deployed across several clients. Noac generates a very large number of GETATTR operations and sends write operations synchronously. Both of these add significant protocol overhead. The noac mount option trades off single-client performance for client cache coherency.

With uncached I/O the number of GETATTR calls is reduced during reads, and data is not cached in the NFS client cache on reads and writes. Uncached I/O is available with Red Hat Advanced Server 2.1, update 3, kernel 2.4.9-e35 and up. Uncached I/O works on file systems mounted with the "noac" mount option. Only applications that require tight cache coherency among multiple clients require that file systems be mounted with the noac mount option.

nolock. For some servers or applications, it is necessary to prevent the Linux NFS client from sending network lock manager requests. Use the "nolock" mount option to prevent the Linux NFS client from notifying the server's lock manager when an application locks a file. Note, however, that the client still uses more restrictive write-back semantics when a file lock is in effect. The client always flushes all pending writes whenever an application locks or unlocks a file.

NetApp recommended mount options for an Oracle single-instance database on Linux:
rw,bg,vers=3,tcp,hard,nointr,timeo=600,rsize=32768,wsize=32768

NetApp recommended mount options for Oracle9i RAC on Linux (without direct I/O support, e.g., RHEL 2.1, Update 3):
a) The uncached I/O patch for RHEL 2.1 is released in Update 3 (e35)
b) Add this entry to the /etc/modules.conf file: options nfs nfs_uncached_io=1
c) Use the "noac" NFS client mount option
d) Complete mount options: rw,bg,vers=3,tcp,hard,nointr,timeo=600,rsize=32768,wsize=32768,noac

NetApp recommended mount options for Oracle9i RAC on Linux (with direct I/O support, e.g., RHEL 3.0):
a) Apply the direct I/O patch for RHEL 3.0 Update 2 (obtained from the Oracle MetaLink site, patch 2448994)
b) Enable the Oracle init.ora parameter: filesystemio_options=directio
c) Use the "actimeo=0" NFS client mount option
d) Complete mount options: rw,bg,vers=3,tcp,hard,nointr,timeo=600,rsize=32768,wsize=32768,actimeo=0

NetApp recommended mount options for Oracle10g™ RAC on Linux (with direct I/O support, e.g., RHEL 3.0):
a) Direct I/O support is built into 10g RAC and RHEL 3.0 Update 2
b) Enable the Oracle init.ora parameter: filesystemio_options=directio
c) Use the "actimeo=0" NFS client mount option
d) Complete mount options: rw,bg,vers=3,tcp,hard,nointr,timeo=600,rsize=32768,wsize=32768,actimeo=0
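As an illustrative /etc/fstab entry for a single-instance database volume (the filer hostname, export path, and mount point are assumptions):

filer1:/vol/oradata  /var/opt/oracle/dbs  nfs  rw,bg,vers=3,tcp,hard,nointr,timeo=600,rsize=32768,wsize=32768  0 0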

2.1.8. iSCSI Initiators for Linux

iSCSI support for Linux is just now becoming available in a number of different forms. Both hardware and software initiators are starting to appear but have not reached a level of adoption to merit a great deal of attention. Testing is insufficient to recommend any best practices at this time. This section will be revisited in the future for any recommendations or best practices for running Oracle Databases on Linux with iSCSI initiators.

2.1.9. FC-AL Initiators for Linux

NetApp supports Fibre Channel storage access for Oracle Databases running on a Linux host. Connections to NetApp storage can be made through a Fibre Channel switch (SAN) or direct-attached. NetApp currently supports Red Hat Enterprise Linux 2.1 and SuSE Enterprise Server 8 on a Linux host with NetApp storage running Data ONTAP 6.4.1 and up. For more information about system requirements and installation, refer to [7]. NetApp recommends using Fibre Channel SANs for Oracle Databases on Linux where there is an existing investment in Fibre Channel infrastructure or the sustained throughput requirement for the database server is greater than 1Gb per second (~110MB per second).

2.2. Sun™ Solaris Operating Systems

2.2.1. Solaris—Recommended Versions

Manufacturer   Version       Tested     Recommended
Sun            Solaris 2.6   Obsolete   No
Sun            Solaris 2.7   Yes        No
Sun            Solaris 2.8   Yes        Yes
Sun            Solaris 2.9   No         Yes

NetApp recommends the use of Solaris 2.9 or Solaris 2.8 for optimal server performance.

2.2.2. Solaris—Kernel Patches

Sun patches are frequently updated, so any list is almost immediately obsolete. The patch levels listed are considered a minimally acceptable level for a particular patch; later revisions will contain the desired fixes but may introduce unexpected issues. NetApp recommends installing the latest revision of each Sun patch. However, report any problems encountered and back out the patch to the revision specified below to see if the problem is resolved.

These recommendations are in addition to, not a replacement for, the Solaris patch recommendations included in the Oracle installation or release notes.

List of desired Solaris 8 patches as of January 21, 2004:

108813-16  SunOS 5.8: Sun Gigabit Ethernet 3.0
108806-17  SunOS 5.8: Sun Quad FastEthernet qfe driver
108528-27  SunOS 5.8: kernel update patch
108727-26  SunOS 5.8: /kernel/fs/nfs and /kernel/fs/sparcv9/nfs patch (108727-25 addresses Solaris NFS client caching [wcc] bug 4407669: VERY important performance patch)
111883-23  SunOS 5.8: Sun GigaSwift Ethernet 1.0 driver patch

List of desired Solaris 9 patches as of January 21, 2004:

112817-16  SunOS 5.9: Sun GigaSwift Ethernet 1.0 driver patch
113318-10  SunOS 5.9: /kernel/fs/nfs and /kernel/fs/sparcv9/nfs patch (addresses Solaris NFS client caching [wcc] bug 4407669: VERY important performance patch)
113459-02  SunOS 5.9: udp patch
112233-11  SunOS 5.9: kernel patch
112854-02  SunOS 5.9: icmp patch
112975-03  SunOS 5.9: patch /kernel/sys/kaio
112904-09  SunOS 5.9: kernel/drv/ip patch; obsoletes 112902-12
112764-06  SunOS 5.9: Sun Quad FastEthernet qfe driver

Failure to install the patches listed above can result in database crashes and/or slow performance. They must be installed. Please note that the "Sun EAGAIN bug"—SUN Alert 41862, referenced in patch 108727—can result in Oracle Database crashes accompanied by this error message:

SVR4 Error 11: Resource temporarily unavailable

The patches listed here may have other dependencies that are not listed. Read all installation instructions for each patch to ensure that any dependent or related patches are also installed.

2.2.3. Solaris—OS Settings

There are a variety of Solaris settings that a system administrator or database administrator can use to get the most performance, availability, and simplicity out of a Sun and NetApp environment.


Solaris file descriptors:

rlim_fd_cur. "Soft" limit on the number of file descriptors (and sockets) that a single process can have open.
rlim_fd_max. "Hard" limit on the number of file descriptors (and sockets) that a single process can have open.

Setting these values to 1024 is STRONGLY recommended to avoid database crashes resulting from Solaris resource deprivation.
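A minimal sketch of the corresponding /etc/system entries (a reboot is required for them to take effect):

set rlim_fd_cur=1024
set rlim_fd_max=1024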

Solaris kernel "maxusers" setting:

The Solaris kernel parameter "maxusers" controls the allocation of several major kernel resources, such as the maximum size of the process table and the maximum number of processes per user.

2.2.4. Solaris Networking—Full Duplex and Autonegotiation

The settings in this section only apply to back-to-back connections between NetApp and Sun without connecting through a switch. Solaris GbE cards must have autonegotiation forced off and transmit flow control forced on. This is true for the Sun "ge" cards and is assumed to still be the case with the newer Sun "ce" cards. NetApp recommends disabling autonegotiation, forcing the flow control settings, and forcing full duplex.


2.2.5. Solaris Networking—Gigabit Ethernet Network Adapters

Sun provides Gigabit Ethernet cards in both PCI and SBUS configurations. The PCI cards deliver higher performance than the SBUS versions. NetApp recommends the use of the PCI cards wherever possible. Any database using NetApp storage should utilize Gigabit Ethernet on both the filer and database server to achieve optimal performance. SysKonnect is a third-party NIC vendor that provides Gigabit Ethernet cards; the PCI versions have proven to deliver high performance.

Sun servers with Gigabit Ethernet interfaces should ensure that they are running with full flow control (some require setting both "send" and "receive" to ON individually). On a Sun server, set Gigabit flow control by adding the following lines to a startup script (such as one in /etc/rc2.d/S99*) or modify these entries if they already exist:

ndd -set /dev/ge instance 0
ndd -set /dev/ge ge_adv_pauseRX 1
ndd -set /dev/ge ge_adv_pauseTX 1
ndd -set /dev/ge ge_intr_mode 1
ndd -set /dev/ge ge_put_cfg 0

Note: The instance may be other than 0 if there is more than one Gigabit Ethernet interface on the system. Repeat for each instance that is connected to NetApp storage. For servers using /etc/system, add these lines:

set ge:ge_adv_pauseRX=1
set ge:ge_adv_pauseTX=1
set ge:ge_intr_mode=1
set ge:ge_put_cfg=0

Note that placing these settings in /etc/system changes every Gigabit interface on the Sun server. Switches and other attached devices should be configured accordingly.

2.2.6. Solaris Networking—Jumbo Frames with GbE

Sun Gigabit Ethernet cards do NOT support jumbo frames.


SysKonnect provides SK-98xx cards that do support jumbo frames. To enable jumbo frames, execute the following steps:

1. Edit /kernel/drv/skge.conf and uncomment this line:

JumboFrames_Inst0="On";

2. Edit /etc/rcS.d/S50skge and add this line:

ifconfig skge0 mtu 9000

3. Reboot.

If using jumbo frames with a SysKonnect NIC, use a switch that supports jumbo frames and enable jumbo frame support on the NIC on the NetApp system.

2.2.7. Solaris Networking—Improving Network Performance

Adjusting the following settings can have a beneficial effect on network performance. Most of these settings can be displayed using the Solaris "ndd" command and set by either using "ndd" or editing the /etc/system file. Example settings are shown after this list.

/dev/udp udp_recv_hiwat. Determines the maximum value of the UDP receive buffer. This is the amount of buffer space allocated for UDP received data. The default value is 8192 (8kB). It should be set to 65,535 (64kB).

/dev/udp udp_xmit_hiwat. Determines the maximum value of the UDP transmit buffer. This is the amount of buffer space allocated for UDP transmit data. The default value is 8192 (8kB). It should be set to 65,535 (64kB).

/dev/tcp tcp_recv_hiwat. Determines the maximum value of the TCP receive buffer. This is the amount of buffer space allocated for TCP receive data. The default value is 8192 (8kB). It should be set to 65,535 (64kB).

/dev/tcp tcp_xmit_hiwat. Determines the maximum value of the TCP transmit buffer. This is the amount of buffer space allocated for TCP transmit data. The default value is 8192 (8kB). It should be set to 65,535 (64kB).

/dev/ge adv_pauseTX 1. Forces transmit flow control for the Gigabit Ethernet adapter. Transmit flow control provides a means for the transmitter to govern the amount of data sent; "0" is the default for Solaris, unless it becomes enabled as a result of autonegotiation between the NICs. NetApp strongly recommends that transmit flow control be enabled. Setting this value to 1 helps avoid dropped packets or retransmits, because this setting forces the NIC card to perform flow control. If the NIC gets overwhelmed with data, it will signal the sender to pause. It may sometimes be beneficial to set this parameter to 0 to determine if the sender (the NetApp system) is overwhelming the client. Recommended settings were described in section 2.2.5 of this document.

/dev/ge adv_pauseRX 1. Forces receive flow control for the Gigabit Ethernet adapter. Receive flow control provides a means for the receiver to govern the amount of data received. A setting of "1" is the default for Solaris.

/dev/ge adv_1000fdx_cap 1. Forces full duplex for the Gigabit Ethernet adapter. Full duplex allows data to be transmitted and received simultaneously. This should be enabled on both the Solaris server and the NetApp system. A duplex mismatch can result in network errors and database failure.

sq_max_size. Sets the maximum number of messages allowed for each IP queue (STREAMS synchronized queue). Increasing this value improves network performance. A safe value for this parameter is 25 for each 64MB of physical memory in a Solaris system, up to a maximum value of 100. The parameter can be optimized by starting at 25 and incrementing by 10 until network performance reaches a peak.

Nstrpush. Determines the maximum number of modules that can be pushed onto a stream and should be set to 9.

Ncsize. Determines the size of the DNLC (directory name lookup cache). The DNLC stores lookup information for files in the NFS-mounted volume. A cache miss may require a disk I/O to read the directory when traversing the pathname components to get to a file. Cache hit rates can significantly affect NFS performance; getattr, setattr, and lookup usually represent greater than 50% of all NFS calls. If the requested information isn't in the cache, the request will generate a disk operation that results in a performance penalty as significant as that of a read or write request. The only limit to the size of the DNLC cache is available kernel memory. Each DNLC entry uses about 50 bytes of extra kernel memory. Network Appliance recommends that ncsize be set to 8000.

nfs:nfs3_max_threads. The maximum number of threads that the NFS V3 client can use. The recommended value is 24.

nfs:nfs3_nra. The read-ahead count for the NFS V3 client. The recommended value is 10.

nfs:nfs_max_threads. The maximum number of threads that the NFS V2 client can use. The recommended value is 24.

nfs:nfs_nra. The read-ahead count for the NFS V2 client. The recommended value is 10.
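A sketch of applying these recommendations (the values follow the guidance above; an sq_max_size of 100 assumes at least 256MB of physical memory, and all settings should be validated for the specific system):

# Set at runtime with ndd (add to a startup script to persist across reboots)
ndd -set /dev/udp udp_recv_hiwat 65535
ndd -set /dev/udp udp_xmit_hiwat 65535
ndd -set /dev/tcp tcp_recv_hiwat 65535
ndd -set /dev/tcp tcp_xmit_hiwat 65535

# Equivalent /etc/system entries for the kernel and NFS V3 client tunables (reboot required)
set sq_max_size=100
set nstrpush=9
set ncsize=8000
set nfs:nfs3_max_threads=24
set nfs:nfs3_nra=10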

2.2.8. Solaris IP Multipathing (IPMP)

Solaris has a facility that allows the use of multiple IP connections in a configuration similar to a NetApp virtual interface (VIF). In some circumstances, use of this feature can be beneficial. IPMP can be configured either in a failover configuration or in a load-sharing configuration.

The failover configuration is fairly self-explanatory and straightforward to set up. Two interfaces are allocated to a single IP address, with one interface on standby (referred to in the Solaris documentation as "deprecated") and one interface active. If the active link goes down, Solaris transparently moves the traffic to the second interface. Since this is done within the Solaris kernel, applications utilizing the interface are unaware and unaffected when the switch is made. NetApp has tested the failover configuration of Solaris IPMP and recommends its use where failover is required, the interfaces are available, and standard trunking (e.g., Cisco EtherChannel) capabilities are not available.

The load-sharing configuration utilizes a trick wherein the outbound traffic to separate IP addresses is split across interfaces, but all outbound traffic contains the return address of the primary interface. Where a large amount of writing to a filer is occurring, this configuration sometimes yields improved performance. Because all traffic back into the Sun returns on the primary interface, heavy read I/O is not accelerated at all. Furthermore, the mechanism that Solaris uses to detect failure and trigger failover to the surviving NIC is incompatible with NetApp cluster solutions. NetApp recommends against the use of IPMP in a load-sharing configuration due to its current incompatibility with NetApp cluster technology, its limited ability to improve read I/O performance, and its complexity and associated inherent risks.


2.2.9. Solaris NFS Protocol—Mount Options

Getting the right NFS mount options can have a significant impact on both the performance and the reliability of the I/O subsystem. Below are a few tips to aid in choosing the right options. Mount options are set either manually, when a file system is mounted on the Solaris system, or, more typically, specified in /etc/vfstab for mounts that occur automatically at boot time. The latter is strongly preferred, since it ensures that a system that reboots for any reason will return to a known state without operator intervention.

To specify mount options:

1. Edit /etc/vfstab.
2. For each NFS mount participating in a high-speed I/O infrastructure, make sure the mount options specify NFS version 3 over TCP with transfer sizes of 32kB:

…hard,bg,intr,vers=3,proto=tcp,rsize=32768,wsize=32768,…

Note: These values are the default NFS settings for Solaris 8 and 9. Specifying them is not actually required but is recommended for clarity.

hard. The "soft" option should never be used with databases; it may result in incomplete writes to data files and database file connectivity problems. The "hard" option specifies that I/O requests will retry forever if they fail on the first attempt. This forces applications doing I/O over NFS to hang until the required data files are accessible. This is especially important where redundant networks and servers (e.g., NetApp clusters) are employed.

bg. Specifies that the mount should move into the background if the NetApp system is not available, allowing the Solaris boot process to complete. Because the boot process can complete without all the file systems being available, care should be taken to ensure that required file systems are present before starting the Oracle Database processes.

intr. Allows operations waiting on an NFS operation to be interrupted. This is desirable for the rare circumstances in which applications utilizing a failed NFS mount need to be stopped so that they can be reconfigured and restarted. If this option is not used and an NFS connection mounted with the "hard" option fails and does not recover, the only way to recover the Solaris server is to reboot it.

rsize/wsize. Determine the NFS request size for reads and writes. The values of these parameters should match the values of nfs.udp.xfersize and nfs.tcp.xfersize on the NetApp system. A value of 32,768 (32kB) has been shown to maximize database performance in the environment of NetApp and Solaris.


In all circumstances, the NFS read/write size should be the same as or greater than the Oracle block size. For example, specifying a DB_FILE_MULTIBLOCK_READ_COUNT of 4 multiplied by a database block size of 8kB results in a read buffer size (rsize) of 32kB. NetApp recommends that DB_FILE_MULTIBLOCK_READ_COUNT be set from 1 to 4 for an OLTP database and from 16 to 32 for DSS.

vers. Sets the NFS version to be used. Version 3 yields optimal database performance with Solaris.

proto. Tells Solaris to use either TCP or UDP for the connection. Previously UDP gave better performance but was restricted to very reliable connections. TCP has more overhead but handles errors and flow control better. If maximum performance is required and the network connection between the Sun and the NetApp system is short, reliable, and all one speed (no speed matching within the Ethernet switch), UDP can be used. In general, it is safer to use TCP. In recent versions of Solaris (2.8 and 2.9) the performance difference is negligible.

forcedirectio. A new option introduced with Solaris 8. It allows the application to bypass the Solaris kernel cache, which is optimal for Oracle. This option should only be used with volumes containing data files. It should never be used to mount volumes containing executables; using it with a volume containing Oracle executables will prevent all executables stored on that volume from being started. If programs that normally run suddenly won't start and immediately core dump, check to see if they reside on a volume mounted with "forcedirectio."

The introduction of forced direct I/O with Solaris 8 is a tremendous benefit. Direct I/O bypasses the Solaris file system cache, which Oracle does not use. When a block of data is read from disk, it is read directly into the Oracle buffer cache and not into the file system cache. Without direct I/O, a block of data is read into the file system cache and then into the Oracle buffer cache, double-buffering the data and wasting memory space and CPU cycles. Using system monitoring and memory statistics tools, NetApp has observed that without direct I/O enabled on NFS-mounted file systems, large numbers of file system pages are paged in. This adds system overhead in context switches, and system CPU utilization increases. With direct I/O enabled, file system page-ins and CPU utilization are reduced. Depending on the workload, a significant increase can be observed in overall system performance; in some cases the increase has been more than 20%.


Direct I/O for NFS is new in Solaris 8, although it was introduced for UFS in Solaris 2.6. Direct I/O should only be used on mountpoints that house Oracle Database files, not on nondatabase files or Oracle executables, and not when doing normal file I/O operations such as "dd." Normal file I/O operations benefit from caching at the file system level. A single volume can be mounted more than once, so it is possible to have certain operations utilize the advantages of "forcedirectio" while others don't. However, this can create confusion, so care should be taken.

NetApp recommends the use of "forcedirectio" on selected volumes where the I/O pattern associated with the files under that mountpoint does not lend itself to NFS client caching. In general these will be data files with access patterns that are mostly random, as well as any online redo log files and archive log files. The forcedirectio option should not be used for mountpoints that contain executable files, such as the ORACLE_HOME directory, because doing so will prevent those programs from executing properly.

NetApp recommended mount options for an Oracle single-instance database on Solaris:

rw,bg,vers=3,proto=tcp,hard,intr,rsize=32768,wsize=32768,forcedirectio

NetApp recommended mount options for Oracle9i RAC on Solaris:

rw,bg,vers=3,proto=tcp,hard,intr,rsize=32768,wsize=32768,forcedirectio,noac

Multiple Mountpoints

To achieve the highest performance, transactional OLTP databases benefit from configuring multiple mountpoints on the database server and distributing the load across these mountpoints. The performance improvement is generally from 2% to 9%. This is a very simple change to make, so any improvement justifies the effort. To accomplish this, create another mountpoint to the same file system on the NetApp filer, then either rename the data files in the database (using the ALTER DATABASE RENAME FILE command) or create symbolic links from the old mountpoint to the new mountpoint.
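For illustration only, here is one way the multiple-mountpoint approach might look. The filer name (filer1), volume, mountpoints, and data file name are hypothetical, and the rename must be performed with the database mounted but not open (or with the affected tablespace offline).

Two /etc/vfstab entries pointing at the same filer volume (each entry is one line):

filer1:/vol/oradata  -  /u02/oradata1  nfs  -  yes  rw,bg,hard,intr,vers=3,proto=tcp,rsize=32768,wsize=32768,forcedirectio
filer1:/vol/oradata  -  /u02/oradata2  nfs  -  yes  rw,bg,hard,intr,vers=3,proto=tcp,rsize=32768,wsize=32768,forcedirectio

Repoint selected data files at the second mountpoint from SQL*Plus:

SQL> ALTER DATABASE RENAME FILE '/u02/oradata1/users01.dbf' TO '/u02/oradata2/users01.dbf';

Because both mountpoints reference the same filer volume, the rename changes only the path Oracle uses; no data is moved.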

2.2.10. iSCSI Initiators for Solaris

Currently, NetApp does not support iSCSI initiators on Solaris. This section will be updated when iSCSI initiators for Solaris become available.


2.2.11. Fibre Channel SAN for Solaris

NetApp introduced the industry's first unified storage appliance capable of serving data in either NAS or SAN configurations. NetApp provides Fibre Channel SAN solutions for all platforms, including Solaris, Windows, Linux, HP-UX, and AIX. The NetApp Fibre Channel SAN solution provides the same manageability framework and feature-rich functionality that have benefited our NAS customers for years. Customers can choose either NAS or FC SAN for Solaris, depending on the workload and the current environment.

For FC SAN configurations, it is highly recommended to use the latest SAN host attach kit 1.2 for Solaris. The kit includes the Fibre Channel HBA, drivers, firmware, utilities, and documentation. For installation and configuration, refer to the documentation shipped with the attach kit. NetApp has validated the FC SAN solution for Solaris in an Oracle environment. Refer to the Oracle integration guide with NetApp FC SAN in a Solaris environment ([8]) for more details. For performing backup and recovery of an Oracle Database in a SAN environment, refer to [9].

NetApp recommends using Fibre Channel SAN with Oracle Databases on Solaris where there is an existing investment in Fibre Channel infrastructure or where the sustained throughput requirement for the database server is more than 1Gb per second (~110MB per second).

2.3. Microsoft® Windows Operating Systems

2.3.1. Windows Operating System—Recommended Versions

Microsoft Windows NT® 4.0, Windows 2000 Server and Advanced Server, Windows 2003 Server

2.3.2. Windows Operating System—Service Packs

Microsoft Windows NT 4.0: Apply Service Pack 5
Microsoft Windows 2000: SP2 or SP3
Microsoft Windows 2000 AS: SP2 or SP3
Microsoft Windows 2003: Standard or Enterprise

2.3.3. Windows Operating System—Registry Settings

The following changes to the registry will improve the performance and reliability of Windows. Make the following changes and reboot the server:


The /3GB switch should not be present in C:\boot.ini.

\\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters\MaxMpxCt
Datatype: DWORD
Value: Match the value of the cifs.max_mpx option set on the NetApp system

\\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpWindowSize
Datatype: DWORD
Value: 64240 (0xFAF0)

The following table explains these items and offers tuning suggestions:

Item             Description
MaxMpxCt         The maximum number of outstanding requests a Windows client can have against a NetApp system. This must match cifs.max_mpx. Look at the Performance Monitor redirector/current item; if it is constantly running at the current value of MaxMpxCt, increase this value.
TcpWindowSize    The TCP receive window size, which governs how much data can be in flight on the network. This value should be set to 64,240 (0xFAF0).
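For illustration only, the two registry values above could be captured in a .reg file similar to the following sketch. The MaxMpxCt value shown (0xFD = 253) is a placeholder and must be set to match the cifs.max_mpx option on the NetApp system; TcpWindowSize is shown as 0xFAF0 (64,240).

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters]
"MaxMpxCt"=dword:000000fd

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"TcpWindowSize"=dword:0000faf0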

2.3.4. Windows Networking—Autonegotiation and Full Duplex

Go to Control Panel -> Network -> Services tab -> Server and click the Properties button. Set the server optimization option to maximize throughput for network applications.

NetApp recommends that customers use either iSCSI or Fibre Channel to run their Oracle Database on Windows.

2.3.5. Windows Networking—Gigabit Ethernet Network Adapters

Any database using NetApp storage should utilize Gigabit Ethernet on both the filer and database server to achieve optimal performance.


NetApp has tested the Intel PRO/1000 F Server Adapter. The following settings can be tuned on this adapter. Each setting should be tested and optimized as necessary to achieve optimal performance.

Item                                  Description
Coalesce buffers = 32                 The number of buffers available for transmit acceleration.
Flow control = receive pause frame    The flow control method used. This should match the setting for the Gigabit Ethernet adapter on the NetApp system.
Jumbo frames = disable                Enabling this setting allows larger Ethernet packets to be transmitted. NetApp filers support jumbo frames in Data ONTAP 6.1 and later releases.
Receive descriptors = 32              The number of receive buffers and descriptors that the driver allocates for receiving packets.
Transmit descriptors = 32             The number of transmit buffers and descriptors that the driver allocates for sending packets.

2.3.6. Windows Networking—Jumbo Frames with GbE

Note: Be very careful when using jumbo frames with Microsoft Windows 2000. If jumbo frames are enabled on the filer and on the Windows server running Oracle, and authentication is done through a Windows 2000 domain, then authentication traffic may leave through the jumbo-frame-enabled interface toward a domain controller that is typically not configured to use jumbo frames. This can result in long delays or errors in authentication when using CIFS.

2.3.7. iSCSI Initiators for Windows

NetApp recommends using either the Microsoft iSCSI initiator or the Network Appliance iSCSI host attach kit 2.0 for Windows over a high-speed dedicated Gigabit Ethernet network on platforms such as Windows 2000, Windows 2000 AS, and Windows 2003 with Oracle Databases.

For platforms such as Windows NT, which does not have iSCSI support, NetApp supports CIFS for Oracle Database and application storage. It is recommended to upgrade to Windows 2000 or later and use an iSCSI initiator (either software or hardware). NetApp currently supports Microsoft initiator 1.02 and 1.03, available from www.microsoft.com.

2.3.8. FC-AL Initiators for Windows

Network Appliance supports Fibre Channel SAN on Windows for use with Oracle Databases. NetApp recommends using Fibre Channel SAN with Oracle Databases on Windows where there is an existing investment in Fibre Channel infrastructure. NetApp also recommends considering Fibre Channel SAN solutions for Windows when the sustained throughput requirement for the Oracle Database server is more than 1Gb per second (~110MB per second).

3. Oracle Database Settings

This section describes settings that are made to the Oracle Database application, usually through parameters contained in the "init.ora" file. The reader is assumed to have existing knowledge of how to set these parameters correctly and an idea of their effect. The settings described here are the ones most frequently tuned when using NetApp storage with Oracle Databases.

3.1. DISK_ASYNCH_IO

Enables or disables Oracle asynchronous I/O. Asynchronous I/O allows processes to proceed with the next operation without having to wait for an issued write operation to complete, improving system performance by minimizing idle time. This setting may improve performance depending on the database environment. If the DISK_ASYNCH_IO parameter is set to FALSE, then DB_WRITER_PROCESSES and DB_BLOCK_LRU_LATCHES (Oracle versions prior to 9i) or DBWR_IO_SLAVES should be used to compensate, as described below. The calculation looks like this:

DB_WRITER_PROCESSES = 2 * number of CPUs

Recent performance findings on Solaris 8 patched to 108813-11 or later and on Solaris 9 have shown that setting:

DISK_ASYNCH_IO = TRUE
DB_WRITER_PROCESSES = 1

can result in better performance than when DISK_ASYNCH_IO was set to FALSE. NetApp recommends enabling asynchronous I/O (DISK_ASYNCH_IO = TRUE) on Solaris 2.8 and above.
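An init.ora fragment reflecting the Solaris guidance above might look like the following sketch; the values are illustrative and should be validated against your own workload.

# Solaris 8 (patched to 108813-11 or later) or Solaris 9 host with asynchronous I/O enabled
disk_asynch_io = TRUE
db_writer_processes = 1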


3.2. DB_FILE_MULTIBLOCK_READ_COUNT

Determines the maximum number of database blocks read in one I/O operation during a full table scan. The number of database bytes read is calculated by multiplying DB_BLOCK_SIZE * DB_FILE_MULTIBLOCK_READ_COUNT. The setting of this parameter can reduce the number of I/O calls required for a full table scan, thus improving performance. Increasing this value may improve performance for databases that perform many full table scans but degrade performance for OLTP databases, where full table scans are seldom (if ever) performed. Setting this number to a multiple of the NFS read/write size specified in the mount will limit the amount of fragmentation that occurs in the I/O subsystem. Be aware that this parameter is specified in database blocks and the NFS setting is in bytes, so adjust as required. As an example, specifying a DB_FILE_MULTIBLOCK_READ_COUNT of 4 multiplied by a DB_BLOCK_SIZE of 8kB results in a read buffer size of 32kB. NetApp recommends that DB_FILE_MULTIBLOCK_READ_COUNT be set from 1 to 4 for an OLTP database and from 16 to 32 for DSS.

3.3. DB_BLOCK_SIZE

For best database performance, DB_BLOCK_SIZE should be a multiple of the OS block size. For example, if the Solaris page size is 4096:

DB_BLOCK_SIZE = 4096 * n

The NFS rsize and wsize options specified when the file system is mounted should also be a multiple of this value. Under no circumstances should it be smaller. For example, if the Oracle DB_BLOCK_SIZE is set to 16kB, the NFS read and write size parameters (rsize and wsize) should be set to either 16kB or 32kB, never to 8kB or 4kB.
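For example, an OLTP-oriented init.ora fragment consistent with the 32kB NFS transfer size discussed earlier might read as follows (illustrative values only):

# 8kB blocks x 4-block multiblock reads = 32kB, matching rsize/wsize = 32768
db_block_size = 8192
db_file_multiblock_read_count = 4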

3.4. DBWR_IO_SLAVES and DB_WRITER_PROCESSES

DB_WRITER_PROCESSES is useful for systems that modify data heavily. It specifies the initial number of database writer processes for an instance. If DBWR_IO_SLAVES is used, only one database writer process will be allowed, regardless of the setting for DB_WRITER_PROCESSES. Multiple DBWRs and DBWR I/O slaves cannot coexist. It is recommended that one or the other be used to compensate for the performance loss resulting from disabling DISK_ASYNCH_IO. Metalink note 97291.1 provides guidelines on usage.


The first rule of thumb is to always enable DISK_ASYNCH_IO if it is supported on the OS platform. Next, check whether it is supported for NFS or only for block access (FC/iSCSI). If it is supported for NFS, consider enabling async I/O at both the Oracle level and the OS level and measure the performance gain. If performance is acceptable, use async I/O for NFS. If async I/O is not supported for NFS, or if the performance is not acceptable, consider enabling either multiple DBWRs or DBWR I/O slaves instead (as noted above, the two cannot coexist). NetApp recommends that DBWR_IO_SLAVES be used for single-CPU systems and that DB_WRITER_PROCESSES be used for systems having multiple CPUs.
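The following init.ora sketch shows the two alternatives when asynchronous I/O is disabled; the slave and writer counts are illustrative assumptions, not prescribed values.

# Single-CPU server: compensate for disk_asynch_io = FALSE with I/O slaves
disk_asynch_io = FALSE
dbwr_io_slaves = 4

# Multi-CPU server: use multiple writer processes instead (e.g., 2 * number of CPUs)
# disk_asynch_io = FALSE
# db_writer_processes = 4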

3.5. DB_BLOCK_LRU_LATCHES

The number of DBWRs cannot exceed the value of the DB_BLOCK_LRU_LATCHES parameter:

DB_BLOCK_LRU_LATCHES = DB_WRITER_PROCESSES

Starting with Oracle9i, DB_BLOCK_LRU_LATCHES is obsolete and need not be set.

4. Backup, Restore, and Disaster Recovery

For additional information about strategies for designing backup, restore, and disaster recovery architectures, see [10], [11], and [12].

For additional information about implementing instantaneous backup and recovery of an Oracle Database running on UNIX®, see [5] and [9].

4.1. How to Back Up Data from a NetApp System

Data that is stored on a NetApp system can be backed up to online storage, nearline storage, or tape. The protocol used to access data while a backup is occurring must always be considered. When NFS and CIFS are used to access data, Snapshot and SnapMirror® can be used and will always result in consistent copies of the file system; they must still be coordinated with the state of the Oracle Database to ensure database consistency. With the Fibre Channel or iSCSI protocols, Snapshot copies and SnapMirror commands must always be coordinated with the server: the file system on the server must be blocked and all data flushed to the filer before invoking the Snapshot command.

Data can be backed up within the same NetApp filer, to another NetApp filer, to a NearStore® system, or to a tape storage device. Tape storage devices can be directly attached to an appliance, or they can be attached to an Ethernet or Fibre Channel network, and the appliance can be backed up over the network to the tape device. Possible methods for backing up data on NetApp systems include:

- Use automated Snapshot copies to create online backups
- Use scripts on the server that rsh to the NetApp system to invoke Snapshot copies to create online backups
- Use SnapMirror to mirror data to another filer or NearStore system
- Use SnapVault® to vault data to another NetApp filer or NearStore system
- Use server operating system–level commands to copy data to create backups
- Use NDMP commands to back up data to a NetApp filer or NearStore system
- Use NDMP commands to back up data to a tape storage device
- Use third-party backup tools to back up the filer or NearStore system to tape or other storage devices

4.2. Creating Online Backups Using Snapshot Copies

NetApp Snapshot technology makes extremely efficient use of storage by storing only the block-level changes between successive Snapshot copies. Since the Snapshot process is virtually instantaneous, backups are fast and simple. Snapshot copies can be scheduled automatically, they can be called from a script running on a server, or they can be created via SnapDrive™ or SnapManager®.

Data ONTAP includes a scheduler to automate Snapshot backups. Use automatic Snapshot copies to back up nonapplication data, such as home directories.

Database and other application data should be backed up when the application is in its backup mode. For Oracle Databases this means placing the database tablespaces into hot backup mode prior to creating a Snapshot copy. NetApp has several technical reports that contain details on backing up an Oracle Database.

For additional information on determining data protection requirements, see [13].

NetApp recommends using Snapshot copies for performing cold or hot backup of Oracle Databases. No performance penalty is incurred for creating a Snapshot copy. It is recommended to turn off the automatic Snapshot scheduler and coordinate the Snapshot copies with the state of the Oracle Database. For more information on integrating Snapshot technology with Oracle Database backup, refer to [5] and [9].
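As a minimal sketch of such coordination, a hot backup driven from the database server might look like the following; the filer name ("descent"), volume ("oracle"), Snapshot name, and the begin.sql/end.sql scripts (which issue ALTER TABLESPACE ... BEGIN/END BACKUP) are assumptions for illustration.

#!/bin/csh -f
# Place the tablespaces in hot backup mode.
$ORACLE_HOME/bin/sqlplus system/oracle @begin.sql
# Create a Snapshot copy of the data file volume on the filer.
rsh -l root descent snap create oracle oracle_hotbackup
# Take the tablespaces out of hot backup mode.
$ORACLE_HOME/bin/sqlplus system/oracle @end.sql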

4.3. Recovering Individual Files from a Snapshot Copy

Individual files and directories can be recovered from a Snapshot copy by using native commands on the server, such as the UNIX "cp" command, or dragging and dropping in Microsoft Windows. Data can also be recovered using the single-file SnapRestore command. Use the method that works most quickly.

4.4. Recovering Data Using SnapRestore

SnapRestore quickly restores a file system to an earlier state preserved by a Snapshot copy. SnapRestore can be used to recover an entire volume of data or individual files within that volume.

When using SnapRestore to restore a volume of data, the data on that volume should belong to a single application. Otherwise operation of other applications may be adversely affected.

The single-file option of SnapRestore allows individual files to be selected for restore without restoring all of the data on a volume. Be aware that the file being restored using single-file SnapRestore cannot exist anywhere in the active file system. If it does, the appliance will silently turn the single-file SnapRestore into a copy operation. This may result in the single-file SnapRestore taking much longer than expected (normally the command executes in a fraction of a second), and it also requires that sufficient free space exist in the active file system.

NetApp recommends using SnapRestore to instantaneously restore an Oracle Database. SnapRestore can restore an entire volume to a point in time in the past or can restore a single file. It is advantageous to use SnapRestore at the volume level, as the entire volume can be restored in minutes, which reduces downtime while performing Oracle Database recovery. If using SnapRestore at the volume level, it is recommended to store the Oracle log files, archive log files, and copies of control files on a separate volume from the main data file volume and to use SnapRestore only on the volume containing the Oracle data files. For more information on using SnapRestore for Oracle Database restores, refer to [5] and [9].
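For illustration, the corresponding Data ONTAP commands might look like the following sketch; the filer name, volume, Snapshot name, and file path are placeholders.

descent> snap restore -t vol -s nightly.0 oracle
descent> snap restore -t file -s nightly.0 /vol/oracle/users01.dbf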

4.5. Consolidating Backups with SnapMirror

SnapMirror mirrors data from a single volume or qtree to one or more remote NetApp systems simultaneously. It continually updates the mirrored data to keep it current and available.

SnapMirror is an especially useful tool to deal with shrinking backup windows on primary systems. SnapMirror can be used to continuously mirror data from primary storage systems to dedicated nearline storage systems. Backup operations are transferred to systems where tape backups can run all day long without interrupting the primary storage. Since backup operations are not occurring on production systems, backup windows are no longer a concern.

4.6. Creating a Disaster Recovery Site with SnapMirror

SnapMirror continually updates mirrored data to keep it current and available, and it is the correct tool for creating disaster recovery sites. Volumes can be mirrored asynchronously or synchronously to systems at a disaster recovery facility. Application servers should be mirrored to this facility as well.

In the event that the DR facility needs to be made operational, applications can be switched over to the servers at the DR site and all application traffic directed to these servers until the primary site is recovered. Once the primary site is back online, SnapMirror can be used to transfer the data efficiently back to the production filers. After the production site takes over normal application operation again, SnapMirror transfers to the DR facility can resume without requiring a second baseline transfer. For more information on using SnapMirror for DR in an Oracle environment, refer to [14].
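A minimal command sketch of this flow follows, assuming a production filer "descent" with volume "oracle" and a DR filer "drfiler" with a destination volume "oracle_dr" (all names hypothetical).

On the DR filer, restrict the destination volume and start the baseline transfer:

drfiler> vol restrict oracle_dr
drfiler> snapmirror initialize -S descent:oracle drfiler:oracle_dr

If the DR site must be activated, make the mirror writable:

drfiler> snapmirror break oracle_dr

When the primary site returns, resynchronize the changes back without a new baseline transfer:

descent> snapmirror resync -S drfiler:oracle_dr descent:oracle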

4.7. Creating Nearline Backups with SnapVault

SnapVault provides a centralized disk-based backup solution for heterogeneous storage environments. Storing backup data in multiple Snapshot copies on a SnapVault secondary storage system allows enterprises to keep weeks of backups online for faster restoration. SnapVault also gives users the power to choose which data gets backed up, the frequency of backup, and how long the backup copies are retained.

SnapVault software builds on the asynchronous, block-level incremental transfer technology of SnapMirror with the addition of archival technology. This allows data to be backed up via Snapshot copies on a filer and transferred on a scheduled basis to a destination filer or NearStore appliance. These Snapshot copies can be retained on the destination system for many weeks or even months, allowing recovery operations to the original filer to occur nearly instantaneously. For additional references on data protection strategies using SnapVault, refer to [10], [11], and [13].

4.8. NDMP and Native Tape Backup and Recovery

The Network Data Management Protocol (NDMP) is an open standard for centralized control of enterprise-wide data management. The NDMP architecture allows backup application vendors to control native backup and recovery facilities in NetApp appliances and other file servers by providing a common interface between backup applications and file servers.

NDMP separates the control and data flow of a backup or recovery operation into separate conversations. This allows for greater flexibility in configuring the environment used to protect the data on NetApp systems. Since the conversations are separate, they can originate from different locations, as well as be directed to different locations, resulting in extremely flexible NDMP-based topologies. Available NDMP topologies are discussed in detail in [15].

If an operator does not specify an existing Snapshot copy when performing a native or NDMP backup operation, Data ONTAP will create one before proceeding. This Snapshot copy will be deleted when the backup completes. When a file system contains FCP data, a Snapshot copy that was created at a point in time when the data was consistent should always be specified. As mentioned earlier, this is ideally done in a script by quiescing an application or placing it in hot backup mode before creating the Snapshot copy. After Snapshot copy creation, normal application operation can resume, and tape backup of the Snapshot copy can occur at any convenient time.

When attaching an appliance to a Fibre Channel SAN for tape backup, it is necessary to first ensure that NetApp certifies the hardware and software in use. A complete list of certified configurations is available on the Network Appliance data protection portal. Redundant links to Fibre Channel switches and tape libraries are not currently supported by NetApp in a Fibre Channel tape SAN.


Furthermore, a separate host bus adapter must be used in the filer for tape backup. This adapter must be attached to a separate Fibre Channel switch that contains only filers, NearStore appliances, and certified tape libraries and tape drives. The backup server must either communicate with the tape library via NDMP or have library robotic control attached directly to the backup server.

4.9. Using Tape Devices with NetApp Systems

NetApp filers and NearStore systems support backup and recovery from local, Fibre Channel, and Gigabit Ethernet SAN-attached tape devices. Support for most existing tape drives is included, as well as a method for tape vendors to dynamically add support for new devices. In addition, the RMT protocol is fully supported, allowing backup and recovery to any capable system. Backup images are written using a derivative of the BSD dump stream format, allowing full file system backups as well as nine levels of differential backups.

4.10. Supported Third-Party Backup Tools

NetApp has partnered with the following vendors to support NDMP-based backup solutions for data stored on NetApp systems:

Atempo® Time Navigator www.atempo.com

Legato® NetWorker® www.legato.com

BakBone® NetVault® www.bakbone.com

SyncSort® Backup Express www.syncsort.com

CommVault® Galaxy www.commvault.com

VERITAS® NetBackup™ www.veritas.com

Computer Associates™ BrightStor™ Enterprise Backup www.ca.com

Workstation Solutions Quick Restore www.worksta.com

4.11. Backup and Recovery Best Practices

This section combines the NetApp data protection technologies and products described above into a set of best practices for performing Oracle hot backups (online backups) for backup, recovery, and archival purposes using primary storage (filers with high-performance Fibre Channel disk drives) and nearline storage (NearStore systems with low-cost, high-capacity ATA and SATA disk drives). This combination of primary storage for production databases and nearline disk-based storage for backups of the active data set improves performance and lowers the cost of operations. Periodically moving data from primary to nearline storage increases free space and improves performance, while generating considerable cost savings.

Note: If NetApp NearStore nearline storage is not part of your backup strategy, refer to [5] for information on Oracle backup and recovery on filers based on Snapshot technology. The remainder of this section assumes both filers and NearStore systems are in use.

4.11.1. SnapVault and Database Backups

Oracle Databases can be backed up while they continue to run and provide service, but they must first be put into a special hot backup mode. Certain actions must be taken before and after a Snapshot copy is created on a database volume. Since these are the same steps taken for any other backup method, many database administrators probably already have scripts that perform these functions. While SnapVault Snapshot schedules can be coordinated with appropriate database actions by synchronizing clocks on the filer and database server, it is easier to detect potential problems if the database backup script creates the Snapshot copies using the SnapVault snap create command.

In this example, a consistent image of the database is created every hour, keeping the most recent five hours of Snapshot copies (the last five copies). One Snapshot version is retained per day for a week, and one weekly version is retained at the end of each week. On the SnapVault secondary, a similar number of SnapVault Snapshot copies is retained.

Procedure for performing Oracle hot backups with SnapVault:

1. Set up the NearStore system to talk to the filer.
2. Set up the schedule for the number of Snapshot copies to retain on each of the storage devices, using the script-enabled SnapVault schedule on both the filer and NearStore systems.
3. Start the SnapVault process between the filer and NearStore system.
4. Create shell scripts that drive Snapshot copies through SnapVault on the filer and NearStore to perform Oracle hot backups.
5. Create a cron-based schedule script on the host to drive the hot backup scripts for Snapshot copies driven by SnapVault, as described above.


Step 1: Set up the NearStore system to talk to the filer.


The example in this subsection assumes that the primary filer for database storage is named "descent" and the NearStore appliance for database archival is named "rook." The following steps configure SnapVault on both systems:

1. License SnapVault and enable it on the filer, “descent”:

descent> license add ABCDEFG
descent> options snapvault.enable on
descent> options snapvault.access host=rook

2. License SnapVault and enable it on the NearStore appliance, “rook”:

rook> license add ABCDEFG
rook> options snapvault.enable on
rook> options snapvault.access host=descent

3. Create a volume for use as a SnapVault destination on the NearStore appliance, “rook”:

rook> vol create vault -r 10 10
rook> snap reserve vault 0

Step 2: Set up schedules (disable automatic Snapshot copies) on filer and NearStore system.

1. Disable the normal Snapshot schedule on the filer and the NearStore system, which will be replaced by SnapVault Snapshot schedules:

descent> snap sched oracle 0 0 0
rook> snap sched vault 0 0 0

2. Set up a SnapVault Snapshot schedule to be script driven on the filer, descent, for the "oracle" volume. The "-" in the schedule specification means the copies are not created automatically; the command also specifies how many of the named Snapshot copies to retain.

descent> snapvault snap sched oracle sv_hourly 5@-

This schedule creates a Snapshot copy called sv_hourly and retains the most recent five copies, but it does not specify when to create the Snapshot copies. That is done by a cron script, described later in this procedure.


descent> snapvault snap sched oracle sv_daily 1@-


Similarly, this schedule creates a Snapshot copy called sv_daily and retains only the most recent copy. It does not specify when to create the Snapshot copy.

descent> snapvault snap sched oracle sv_weekly 1@-

This schedule creates a Snapshot copy called sv_weekly and retains only the most recent copy. It does not specify when to create the Snapshot copy.

3. Set up the SnapVault Snapshot schedule to be script driven on the NearStore appliance, rook, for the SnapVault destination volume, “vault.” This schedule also specifies how many of the named Snapshot copies to retain.

rook> snapvault snap sched vault sv_hourly 5@-

This schedule creates a Snapshot copy called sv_hourly and retains the most recent five copies, but does not specify when to create the Snapshot copies. That is done by a cron script, described later in this procedure.

rook> snapvault snap sched vault sv_daily 1@-

Similarly, this schedule creates a Snapshot copy called sv_daily and retains only the most recent copy. It does not specify when to create the Snapshot copy.

rook> snapvault snap sched vault sv_weekly 1@-

This schedule creates a Snapshot copy called sv_weekly and retains only the most recent copy. It does not specify when to create the Snapshot copy.

Step 3: Start the SnapVault process between filer and NearStore appliance.

At this point, the schedules have been configured on both the primary and secondary systems, and SnapVault is enabled and running. However, SnapVault does not know which volumes or qtrees to back up or where to store them on the secondary. Snapshot copies will be created on the primary, but no data will be transferred to the secondary. To provide SnapVault with this information, use the SnapVault start command on the secondary:


rook> snapvault start -S descent:/vol/oracle/- /vol/vault/oracle

Step 4: Create the Oracle hot backup script enabled by SnapVault.

Here is the sample script defined in "/home/oracle/snapvault/sv-dohot-daily.sh":

#!/bin/csh -f
# Place all of the critical tablespaces in hot backup mode.
$ORACLE_HOME/bin/sqlplus system/oracle @begin.sql
# Create a new SnapVault Snapshot copy of the database volume on the primary filer.
rsh -l root descent snapvault snap create oracle sv_daily
# Simultaneously 'push' the primary filer Snapshot copy to the secondary NearStore system.
rsh -l root rook snapvault snap create vault sv_daily
# Remove all affected tablespaces from hot backup mode.
$ORACLE_HOME/bin/sqlplus system/oracle @end.sql

Note that the "@begin.sql" and "@end.sql" scripts contain SQL commands to put the database's tablespaces into hot backup mode (begin.sql) and then to take them out of hot backup mode (end.sql).

Step 5: Use a cron script to drive the Oracle hot backup scripts from step 4.

A scheduling application such as cron on UNIX systems or the task scheduler on Windows systems is used to create an sv_hourly Snapshot copy at the top of each hour, an sv_daily Snapshot copy at 2:00 a.m. every day except Saturday, and an sv_weekly Snapshot copy at 2:00 a.m. on Saturdays, matching the cron entries below.

Sample cron script:

# sample cron script with multiple entries for Oracle hot backup
# using SnapVault, NetApp filer (descent), and NetApp NearStore (rook)
# Hourly Snapshot copy/SnapVault at the top of each hour
0 * * * * /home/oracle/snapvault/sv-dohot-hourly.sh
# Daily Snapshot copy/SnapVault at 2:00 a.m. every day except Saturdays
0 2 * * 0-5 /home/oracle/snapvault/sv-dohot-daily.sh
# Weekly Snapshot copy/SnapVault at 2:00 a.m. every Saturday
0 2 * * 6 /home/oracle/snapvault/sv-dohot-weekly.sh

In step 4 above, there is a sample script for daily backups, "sv-dohot-daily.sh." The hourly and weekly scripts are identical to the daily script, except that the Snapshot copy name is different (sv_hourly and sv_weekly, respectively).

References

1. Power and System Requirements for Network Appliance Filers: http://now.netapp.com/NOW/knowledge/docs/hardware/hardware_index.shtml
2. DS14 Disk Shelf Installation Guide, page 37: http://now.netapp.com/NOW/knowledge/docs/hardware/filer/ds14hwg.pdf
3. Installation Tips Regarding Power Supplies and System Weight: http://now.netapp.com/NOW/knowledge/docs/hardware/filer/warn_fly.pdf
4. Definition of FCS and GA Terms: http://now.netapp.com/NOW/download/defs/ontap.shtml
5. Oracle9i for UNIX: Backup and Recovery Using a NetApp Filer: www.netapp.com/tech_library/3130.html
6. Using the Linux NFS Client with Network Appliance Filers: Getting the Best from Linux and Network Appliance Technologies: www.netapp.com/tech_library/3183.html
7. Installation and Setup Guide 1.0 for Fibre Channel Protocol on Linux: http://now.netapp.com/NOW/knowledge/docs/hba/fcp_linux/fcp_linux10/pdfs/install.pdf
8. Oracle9i for UNIX: Integrating with a NetApp Filer in a SAN Environment: www.netapp.com/tech_library/3207.html
9. Oracle9i for UNIX: Backup and Recovery Using a NetApp Filer in a SAN Environment: www.netapp.com/tech_library/3210.html


10. Data Protection Strategies for Network Appliance Filers: www.netapp.com/tech_library/3066.html
11. Data Protection Solutions Overview: www.netapp.com/tech_library/3131.html
12. Simplify Application Availability and Disaster Recovery: www.netapp.com/partners/docs/oracleworld.pdf
13. SnapVault Deployment and Configuration: www.netapp.com/tech_library/3240.html
14. Oracle8i™ for UNIX: Providing Disaster Recovery with NetApp SnapMirror Technology: www.netapp.com/tech_library/3057.html
15. NDMPCopy Reference: http://now.netapp.com/NOW/knowledge/docs/ontap/rel632/html/ontap/dpg/ndmp11.htm#1270498

Revision History

Version   Date              Comments
1.0       October 30, 2004  Creation date

© 2005 Network Appliance, Inc. All rights reserved. Specifications subject to change without notice. NetApp, the Network Appliance logo, DataFabric, NearStore, SnapManager, SnapMirror, SnapRestore, and SnapVault are registered trademarks and Network Appliance, Data ONTAP, RAID-DP, SnapDrive, and Snapshot are trademarks of Network Appliance, Inc. in the U.S. and other countries. Intel is a registered trademark of Intel Corporation. Solaris and Sun are trademarks of Sun Microsystems, Inc. Linux is a registered trademark of Linus Torvalds. Microsoft, Windows, and Windows NT are registered trademarks of Microsoft Corporation. Oracle is a registered trademark and Oracle8i, Oracle9i, and Oracle10g are trademarks of Oracle Corporation. UNIX is a registered trademark of The Open Group. All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such.
