
Veritas Cluster Server Application Note: Dynamic Reconfiguration for Oracle Servers

Solaris

6.0 Platform Release 1

April 2012

VCS Application Note: Dynamic Reconfiguration for Oracle Servers

The software described in this book is furnished under a license agreement and may be used only in accordance with the terms of the agreement.

6.0 PR1

6.0PR1.0

Legal Notice

Copyright © 2012 Symantec Corporation. All rights reserved.

Symantec, the Symantec logo, Veritas, Veritas Storage Foundation, CommandCentral, NetBackup, Enterprise Vault, and LiveUpdate are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.

The product described in this document is distributed under licenses restricting its use, copying, distribution, and decompilation/reverse engineering. No part of this document may be reproduced in any form by any means without prior written authorization of Symantec Corporation and its licensors, if any.

THE DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID. SYMANTEC CORPORATION SHALL NOT BE LIABLE FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES IN CONNECTION WITH THE FURNISHING, PERFORMANCE, OR USE OF THIS DOCUMENTATION. THE INFORMATION CONTAINED IN THIS DOCUMENTATION IS SUBJECT TO CHANGE WITHOUT NOTICE.

The Licensed Software and Documentation are deemed to be commercial computer software as defined in FAR 12.212 and subject to restricted rights as defined in FAR Section 52.227-19 "Commercial Computer Software - Restricted Rights" and DFARS 227.7202, "Rights in Commercial Computer Software or Commercial Computer Software Documentation", as applicable, and any successor regulations. Any use, modification, reproduction release, performance, display or disclosure of the Licensed Software and Documentation by the U.S. Government shall be solely in accordance with the terms of this Agreement.

Symantec Corporation
350 Ellis Street
Mountain View, CA 94043

http://www.symantec.com

Technical Support

Symantec Technical Support maintains support centers globally. Technical Support’s primary role is to respond to specific queries about product features and functionality. The Technical Support group also creates content for our online Knowledge Base. The Technical Support group works collaboratively with the other functional areas within Symantec to answer your questions in a timely fashion. For example, the Technical Support group works with Product Engineering and Symantec Security Response to provide alerting services and virus definition updates.

Symantec’s support offerings include the following:

■ A range of support options that give you the flexibility to select the right amount of service for any size organization

■ Telephone and/or Web-based support that provides rapid response and up-to-the-minute information

■ Upgrade assurance that delivers software upgrades

■ Global support purchased on a regional business hours or 24 hours a day, 7 days a week basis

■ Premium service offerings that include Account Management Services

For information about Symantec’s support offerings, you can visit our Web site at the following URL:

www.symantec.com/business/support/index.jsp

All support services will be delivered in accordance with your support agreement and the then-current enterprise technical support policy.

Contacting Technical Support

Customers with a current support agreement may access Technical Support information at the following URL:

www.symantec.com/business/support/contact_techsupp_static.jsp

Before contacting Technical Support, make sure you have satisfied the system requirements that are listed in your product documentation. Also, you should be at the computer on which the problem occurred, in case it is necessary to replicate the problem.

When you contact Technical Support, please have the following information available:

■ Product release level

■ Hardware information

■ Available memory, disk space, and NIC information

■ Operating system

■ Version and patch level

■ Network topology

■ Router, gateway, and IP address information

■ Problem description:

■ Error messages and log files

■ Troubleshooting that was performed before contacting Symantec

■ Recent software configuration changes and network changes

Licensing and registration

If your Symantec product requires registration or a license key, access our technical support Web page at the following URL:

www.symantec.com/business/support/

Customer service

Customer service information is available at the following URL:

www.symantec.com/business/support/

Customer Service is available to assist with non-technical questions, such as the following types of issues:

■ Questions regarding product licensing or serialization

■ Product registration updates, such as address or name changes

■ General product information (features, language availability, local dealers)

■ Latest information about product updates and upgrades

■ Information about upgrade assurance and support contracts

■ Information about the Symantec Buying Programs

■ Advice about Symantec's technical support options

■ Nontechnical presales questions

■ Issues that are related to CD-ROMs or manuals

Documentation

Your feedback on product documentation is important to us. Send suggestions for improvements and reports on errors or omissions. Include the title and document version (located on the second page), and chapter and section titles of the text on which you are reporting. Send feedback to:

[email protected]

Support agreement resources

If you want to contact Symantec regarding an existing support agreement, please contact the support agreement administration team for your region as follows:

[email protected] and Japan

[email protected], Middle-East, and Africa

[email protected] America and Latin America

Dynamic reconfiguration of Oracle servers

This document includes the following topics:

■ Overview: Dynamic reconfiguration in a VCS environment

■ Supported software and hardware

■ Preparing to perform dynamic reconfiguration

■ Scenarios requiring a VCS shutdown

■ Stopping and starting VCS

■ Performing dynamic reconfiguration on Oracle SunFire (s6800; e12K/15K/e25K)

■ Performing dynamic reconfiguration on Oracle SunEnterprise 10K

■ Replacing an online Host Bus Adapter (HBA) on an M5000 server

Overview: Dynamic reconfiguration in a VCS environment

This application note describes how to perform Dynamic Reconfiguration operations on VCS clustered system domains of the Oracle™ servers.

The dynamic reconfiguration operations typically include configuring and unconfiguring CPU/memory boards to and from domains and configuring and unconfiguring I/O boards in a domain. These operations allow switching boards from one domain to another or permit removing a board or card to upgrade or replace it. Dynamic reconfiguration operations can be performed while the operating environment continues to run.

However, a dynamic reconfiguration operation performed on a CPU/memory board that has permanent memory requires that the system domain be temporarily suspended. In this case, VCS must be stopped. Do not use the following procedures to dynamically reconfigure a system board containing a VCS private heartbeat link. If you need to do so, you must stop VCS before proceeding.

For a dynamic reconfiguration operation performed on an I/O board, ensure that all devices that belong to the I/O board are released, that is, they are not in use by any application modules.

For users of Veritas Storage Foundation for Oracle RAC, it is necessary to stop the Oracle RAC instance within the domain being reconfigured if VCS must be stopped. This permits communications among other RAC instances to occur while the instance in the one domain is temporarily stopped.

See “Scenarios requiring a VCS shutdown” on page 9.

See “Stopping and starting VCS” on page 11.

Boards with I/O controllers can be dynamically reconfigured as long as you use VxVM with the Dynamic Multi-Pathing (DMP) feature to manage the shared storage.

The Solaris dynamic reconfiguration utility enables you to reconfigure the resources of system boards so that the boards can be replaced without system downtime.

In such cases, before you can physically remove a board, you must “detach” it, or reconfigure it such that its resources can be disabled and removed from the domain configuration. Likewise, after you have physically replaced a board in a domain, you must “attach” it, or reconfigure it into the domain.

The Oracle documentation for dynamic reconfiguration contains comprehensive descriptions of procedures and commands. To avoid damaging system boards and components, you should be familiar with the procedures for their removal and replacement.

Note: Currently, VCS does not support using dynamic reconfiguration in clusters where I/O controllers and storage use Multiplexed I/O (MPxIO).

Supported software and hardware

Following is a list of supported software and hardware requirements:


Supported software

■ Solaris 10, update 8 and later

■ Veritas Cluster Server version: 6.0

■ Veritas Volume Manager (VxVM), as supported by the VCS version

■ Veritas File System, as supported by the VCS version

Note: For the latest information on supported software, refer to the Veritas Cluster Server release notes.

Supported hardware

■ Oracle SunFire/Enterprise servers (s6800, e12K/15K, e10K, e25K)

Preparing to perform dynamic reconfiguration

Make sure that you determine which devices on the system board will be impacted by the dynamic reconfiguration operations and determine how to mitigate the impact.

To be dynamically reconfigured, the boards must satisfy the following conditions:

■ Critical resources on the boards must be redundant. For example, boards for which CPUs and memory are redundant can be reconfigured after their function has been replaced and their activity stopped. A CPU board that contains the only CPU in a domain cannot be moved.

■ A memory board containing permanent memory, such as the OpenBoot™ PROM or kernel memory, can be moved after the memory has been moved to another board. Dynamic reconfiguration on boards with permanent memory requires VCS to be shut down.

■ Disk drives must be accessible via alternate pathways. The Dynamic Multi-Pathing (DMP) feature can provide alternate paths. Before moving a host bus adapter (HBA), switch all the card’s functions to an alternate card. An HBA that controls sole access to an active drive cannot be moved.

■ Activity on a PCI card must be stopped before the card is removed.

Scenarios requiring a VCS shutdown

It is necessary to stop VCS and unconfigure GAB and LLT in certain circumstances.

VCS must be shut down under the following circumstances:


■ When performing dynamic reconfiguration on a system board (CPU/Memoryboard) with permanent memory.

■ When the I/O board requiring reconfiguration contains all the private networklinks used by the domain.

■ When the I/O board contains the only public network links used by the domain.

■ When the I/O board contains all of the paths to a storage device.

The necessity of performing a VCS shutdown can be reduced by some device layout planning before clustering the domains.
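The four conditions above amount to a simple "any of these is true" check. The following is a minimal sketch only; the function name `needs_vcs_shutdown` and the yes/no argument convention are illustrative assumptions, not part of VCS.

```shell
# Hypothetical helper: decide whether VCS must be stopped before
# dynamic reconfiguration. Each argument is "yes" or "no", in the
# order the conditions are listed in this note:
#   $1  the CPU/memory board has permanent memory
#   $2  the I/O board carries all private network links of the domain
#   $3  the I/O board carries the only public network links
#   $4  the I/O board carries all paths to a storage device
needs_vcs_shutdown() {
    for answer in "$@"; do
        if [ "$answer" = "yes" ]; then
            return 0    # at least one condition holds: stop VCS first
        fi
    done
    return 1            # no condition holds: VCS can keep running
}
```

For example, `needs_vcs_shutdown no no no yes` succeeds (exit status 0) because the board holds every path to a storage device.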

CPU/Memory boards with permanent memory

If the CPU/memory board to be removed contains permanent memory, the operating system’s function must be suspended to permit dynamic reconfiguration to occur. In such a case, VCS must be stopped.

However, you do not need to stop VCS when you are performing dynamic reconfiguration on a board that does not contain permanent memory. Typically, in a domain with multiple CPU/memory boards, one board has permanent memory, while the others do not. When you are performing dynamic reconfiguration to add a new board to the domain, the existing functions in the domain are not affected by the dynamic addition of a new CPU/memory board.

Note: If you must reconfigure multiple boards and a board with permanent memory is among them, reconfigure the board with permanent memory last. This sequence ensures minimum VCS downtime.


To determine if the CPU/memory board has permanent memory

1 Log in to the domain as domain administrator.

2 List the boards with permanent memory in the domain by entering the following command:

# cfgadm -av | grep permanent

SB2::memory connected configured ok base address 0x1e000000000, 16777216 KBytes total, 2001200 KBytes permanent

The output in the example shows that SB2 contains permanent memory. Before this board can be dynamically reconfigured, VCS must be stopped.

See “Stopping and starting VCS” on page 11.

Other CPU/memory boards in the domain do not contain permanent memory and may be dynamically reconfigured without stopping VCS.
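If the check has to be scripted, the board names can be pulled out of the `cfgadm -av` output. The sketch below assumes that each memory attachment point is reported on a single line in the form shown in the example; the function name `permanent_memory_boards` is hypothetical. On Solaris 10, `nawk` or `/usr/xpg4/bin/awk` may be needed in place of `awk`.

```shell
# Hypothetical helper: read `cfgadm -av` output on stdin and print the
# Ap_Id of every board that reports a non-zero amount of permanent memory.
# Assumes one line per attachment point, for example:
#   SB2::memory connected configured ok base address 0x1e000000000, 16777216 KBytes total, 2001200 KBytes permanent
permanent_memory_boards() {
    awk '/KBytes permanent/ {
        split($1, ap, "::")               # "SB2::memory" -> ap[1] = "SB2"
        for (i = 3; i <= NF; i++)
            if ($i == "permanent" && $(i - 2) + 0 > 0)
                print ap[1]
    }'
}

# Usage on a live domain (not run here):
#   cfgadm -av | permanent_memory_boards
```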

Stopping and starting VCS

This section contains the procedures for stopping VCS if it is required for dynamic reconfiguration and the procedures for starting VCS if it has been stopped for dynamic reconfiguration.

■ See “Stopping VCS in a standard environment” on page 11.

■ See “Restarting VCS in a standard environment” on page 13.

■ See “Stopping VCS in Veritas SF for Oracle RAC environment” on page 14.

■ See “Restarting VCS in Veritas SF for Oracle RAC environment” on page 17.

Stopping VCS in a standard environment

When you dynamically reconfigure CPU/Memory boards and I/O boards, it may be necessary, in some circumstances, to stop VCS in the domain.

Applications running on clusters of three or more domains remain highly available on two or more domains if VCS operation must be stopped on one domain. In a cluster of two domains, the applications running during reconfiguration are not highly available when VCS must be stopped on one of the domains.

If you are running Veritas SF for Oracle RAC, see “Stopping VCS in Veritas SF for Oracle RAC environment.”


To stop VCS in a standard environment

1 Log in as administrator to the domain (wildcat, for example) you are reconfiguring.

2 List the VCS service groups to determine which are online on the domain.

# hagrp -list

3 If you can switch the service groups running on the domain to another domain (cheetah, for example), switch the service groups.

# hagrp -switch service_grp_name -to cheetah

Verify that the service groups are offline on wildcat.

# hastatus

Stop VCS on wildcat.

# hastop -local

4 If you cannot switch the online service groups to another system, freeze each of them for the duration of dynamic reconfiguration.

Make the VCS configuration writable.

# haconf -makerw

Freeze each of the service groups persistently.

# hagrp -freeze service_grp_name -persistent

Verify the groups are frozen.

# hagrp -display | grep Frozen

Make the configuration read-only.

# haconf -dump -makero

Stop VCS.

# hastop -local -force

5 Unconfigure GAB.

# /sbin/gabconfig -U

6 Unconfigure LLT.

# /sbin/lltconfig -U

Answer “y” to confirm that you want to stop LLT.


7 Stop GAB and LLT modules if required.

For Solaris 10:

# svcadm disable -t system/gab

# svcadm disable -t system/llt

8 Remove the GAB and LLT modules from the kernel.

Determine the IDs of the GAB and LLT modules:

# modinfo | egrep "gab|llt"

305 78531900 30e 305 1 gab

292 78493850 30e 292 1 llt

Unload the GAB and LLT modules based on their module IDs:

# modunload -i 305

# modunload -i 292

9 You can begin performing dynamic reconfiguration.
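The module IDs in step 8 differ from system to system, so they have to be read from the `modinfo` output each time. A minimal sketch of that lookup follows; it assumes the module name is the sixth column, as in the example output, and the function name `gab_llt_unload_commands` is hypothetical. It only prints the commands and does not run them.

```shell
# Hypothetical helper: read `modinfo` output on stdin and print the
# modunload commands for GAB and LLT, GAB first, as in the procedure.
# Assumes the column layout shown in this note:
#   Id Loadaddr Size Info Rev ModuleName
gab_llt_unload_commands() {
    awk '$6 == "gab" { gab = $1 }
         $6 == "llt" { llt = $1 }
         END {
             if (gab != "") print "modunload -i " gab
             if (llt != "") print "modunload -i " llt
         }'
}

# Usage on a live domain (not run here):
#   modinfo | gab_llt_unload_commands
```

Printing the commands rather than executing them leaves room to review the IDs before any module is unloaded.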

Restarting VCS in a standard environment

If you are ready to restart VCS in the domain where you are performing dynamic reconfiguration, use the following procedure. If you are running Veritas SF for Oracle RAC, and are ready to restart VCS, see “Restarting VCS in Veritas SF for Oracle RAC environment.”

To restart LLT, GAB, and VCS

1 Restart LLT.

For Solaris 10:

# svcadm enable system/llt

2 Restart GAB.

For Solaris 10:

# svcadm enable system/gab


3 Start VCS.

# hastart

4 Verify GAB and VCS are started.

# /sbin/gabconfig -a

GAB Port Memberships

================================================

Port a gen 4a1c0001 membership 012

Port h gen g8ty0002 membership 012
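The membership check in step 4 can also be done by a script instead of by eye. The sketch below assumes the `Port <letter> gen <id> membership <nodes>` line format shown above; the function names `gab_ports` and `has_gab_ports` are hypothetical.

```shell
# Hypothetical helpers: read `gabconfig -a` output on stdin.

# Print the letters of all ports that show a membership, e.g. "ah".
gab_ports() {
    awk '$1 == "Port" && $5 == "membership" { printf "%s", $2 }
         END { print "" }'
}

# Succeed only if every port letter in $1 (e.g. "ah") shows a membership.
has_gab_ports() {
    required=$1
    present=$(gab_ports)
    for p in $(printf '%s\n' "$required" | sed 's/./& /g'); do
        case $present in
            *"$p"*) ;;            # port is present, keep checking
            *) return 1 ;;        # a required port is missing
        esac
    done
    return 0
}

# Usage on a live domain (not run here):
#   /sbin/gabconfig -a | has_gab_ports ah && echo "GAB and VCS are up"
```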

To bring service groups online

1 Determine which service groups are frozen.

# hagrp -display | grep Frozen

2 Make the configuration writable.

# haconf -makerw

3 Unfreeze the frozen service groups.

# hagrp -unfreeze service_grp_name -persistent

4 Make the configuration read-only.

# haconf -dump -makero
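When several groups were frozen, the names can be collected from the `hagrp -display` output and turned into unfreeze commands. This is a sketch under the assumption that the output uses the four columns Group, Attribute, System, Value and that a persistently frozen group shows `Frozen` with the value 1; the group names in the comments and the function name `frozen_groups` are illustrative only.

```shell
# Hypothetical helper: read `hagrp -display` output on stdin and print
# the name of each persistently frozen service group.
# Assumed line format (Group Attribute System Value), for example:
#   oragrp Frozen  global 1
#   oragrp TFrozen global 0
frozen_groups() {
    awk '$2 == "Frozen" && $4 == 1 { print $1 }'
}

# Usage on a live domain (not run here); prints the commands for review:
#   hagrp -display | frozen_groups | while read grp; do
#       echo "hagrp -unfreeze $grp -persistent"
#   done
```

The exact match on `Frozen` keeps the temporary-freeze attribute `TFrozen` out of the result.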

Stopping VCS in Veritas SF for Oracle RAC environment

If you must stop VCS on a domain where Veritas SF for Oracle RAC is running, the Oracle RAC application on the domain being reconfigured must be brought offline. In addition, the GAB, LLT, LMX, and VXFEN modules must be unconfigured. Performing these steps ensures that other instances do not attempt to communicate with the stopped instance, which could cause the application to hang when the instance does not respond.

To stop VCS in a Veritas SF for Oracle RAC environment

1 Log in as administrator to the domain being reconfigured (wildcat, forexample).

2 List the configured VCS service groups and see which are online in the domain:

# hagrp -list

3 Based on the output of step 2, take each online service group offline in the domain wildcat. Use the following command:

# hagrp -offline service_grp_name -sys wildcat


4 Stop VCS.

# hastop -local

In addition to port h, this command stops the CVM drivers using ports v and w.

5 If any CFS file systems outside of VCS control are mounted, unmount them.

6 Stop and unconfigure the drivers required by DBE/AC:

# cd /opt/VRTSvcs/rac

# ./uload_drv

Unloading qlog

Unloading odm

Unloading fdd

Unloading vxportal

Unloading vxfs

7 Unconfigure the I/O fencing and VCSMM drivers, which use ports b and o, respectively:

# /sbin/vxfenconfig -U

# /sbin/vcsmmconfig -U

8 Unconfigure the LMX driver:

# /sbin/lmxconfig -U

9 Verify that the drivers using ports h, v, w, f, q, d, b, and o are stopped. These ports should not show memberships when you use the gabconfig -a command:

# gabconfig -a

GAB Port Memberships

============================================================

Port a gen 4a1c0001 membership 01


10 Stop cluster fencing, VCSMM, LMX, ODM, and GAB modules if required.

For Solaris 10:

# svcadm disable -t system/vxfen

# svcadm disable -t system/vcsmm

# svcadm disable -t system/lmx

# svcadm disable -t system/vxodm

# svcadm disable -t system/gab

11 Unload the VCSMM, I/O fencing, and LMX modules.

Determine the module IDs for VCSMM, I/O fencing, and LMX:

# modinfo | egrep "lmx|vxfen|vcsmm"

237 783e4000 25497 237 1 vcsmm (VERITAS Membership Manager)

238 78440000 263df 238 1 vxfen (VERITAS I/O Fencing)

239 7845a000 12b1e 239 1 lmx (LLT Mux 3.5B2)

Unload the VCSMM, I/O fencing, and LMX modules based on their module IDs:

# modunload -i 237

# modunload -i 238

# modunload -i 239

12 Unconfigure GAB.

# /sbin/gabconfig -U

13 Unconfigure LLT.

# /sbin/lltconfig -U


14 Remove the GAB and LLT modules from the kernel.

Determine the IDs of the GAB and LLT modules:

# modinfo | egrep "gab|llt"

305 78531900 30e 305 1 gab

292 78493850 30e 292 1 llt

Unload the GAB and LLT modules based on their module IDs:

# modunload -i 305

# modunload -i 292

15 You can begin performing dynamic reconfiguration.

Restarting VCS in Veritas SF for Oracle RAC environment

If you used the procedure described in “Stopping VCS in Veritas SF for Oracle RAC environment” before dynamically reconfiguring a CPU/memory board, use the following procedures to restart VCS and bring the service groups on the domain online.

To restart LLT, GAB, VCS, and DBE/AC processes

1 Restart LLT.

For Solaris 10:

# svcadm enable system/llt

2 Restart GAB.

For Solaris 10:

# svcadm enable system/gab

3 Restart the LMX driver.

For Solaris 10:

# svcadm enable system/lmx

4 Restart the VCSMM driver.

For Solaris 10:

# svcadm enable system/vcsmm

5 Restart the VXFEN driver.

For Solaris 10:

# svcadm enable system/vxfen


6 Restart the ODM driver.

For Solaris 10:

# svcadm enable system/vxodm

7 Start VCS.

# hastart

8 Verify that the CVM service group is online.

# hagrp -state cvm

9 Verify the GAB memberships required for DBE/AC for Oracle9i RAC are configured.

# /sbin/gabconfig -a

GAB Port Memberships

============================================================

Port a gen 4a1c0001 membership 012

Port b gen g8ty0002 membership 012

Port d gen 40100001 membership 012

Port f gen f1990002 membership 012

Port h gen g8ty0002 membership 012

Port o gen f1100002 membership 012

Port q gen 28d10002 membership 012

Port v gen 1fc60002 membership 012

Port w gen 15ba0002 membership 012

10 Bring online the service groups that were taken offline in step 3 of “To stop VCS in a Veritas SF for Oracle RAC environment.”

# hagrp -online service_grp_name -sys wildcat

Performing dynamic reconfiguration on Oracle SunFire (s6800; e12K/15K/e25K)

You may dynamically reconfigure CPU/memory boards, I/O boards, and PCI cards on I/O boards for Oracle SunFire s6800/e12K/e15K/e25K.

■ See “Performing dynamic reconfiguration on a CPU/memory board” on page 19.

■ See “Performing dynamic reconfiguration on PCI cards on I/O boards” on page 25.

■ See “Performing dynamic reconfiguration on I/O boards” on page 28.


Performing dynamic reconfiguration on a CPU/memory board

You may want to remove a CPU/memory board that is malfunctioning, or you may want to reconfigure a board from one domain to another where it is needed more.

To reassign a board from one domain to another, you must unconfigure it from one domain and reassign it to another domain. This can be done without physically removing the board from its slot. To replace a board, however, you must unconfigure it from one domain, physically remove it, add its replacement board, and reconfigure it to the domain.

Use the following procedures to dynamically reconfigure a CPU/memory board.

To determine the status of the board you are reconfiguring

1 If necessary, log in as the administrator to the domain containing the CPU/memory board.

2 Determine the attachment point of the board you are removing:

# cfgadm

Ap_Id Type Receptacle Occupant Cond

.

N0.SB2 CPU connected configured ok

.

3 Make sure you have checked whether the board has permanent memory.

See “To determine if the CPU/memory board has permanent memory” on page 11.

■ If the board in the domain you want to dynamically reconfigure contains permanent memory, be sure you have first stopped VCS using the procedures described in “Stopping and starting VCS” on page 11.

■ See “Stopping VCS in a standard environment” on page 11.

■ See “Restarting VCS in a standard environment” on page 13.

■ See “Stopping VCS in Veritas SF for Oracle RAC environment” on page 14.

■ See “Restarting VCS in Veritas SF for Oracle RAC environment” on page 17.

■ If the board you want to reconfigure does not contain permanent memory, you can proceed to dynamically reconfigure it.


To unbind processes bound to CPU on the board

1 To determine if any processes are bound to a CPU, enter:

# pbind -q

2 If a process is bound to the board, the output indicates the process ID and the ID number of the CPU.

process id 650: 0

3 If you see no output or see output showing no processes bound to a CPU on the board you are reconfiguring, perform the steps in “To unconfigure the board.”

4 Unbind all processes bound to the CPU on the board. For example, enter:

# pbind -u 650

5 Rebind the processes to a processor on another board, if necessary. For example, bind process 650 to the processor with ID 9, which is on another board, using the command:

# pbind -b 9 650

6 If you attempt to unconfigure a board with processes bound to it, you receivea message that resembles:

cfgadm: Hardware specific failure: unconfigure SB15: Failed to

off-line:dr@0:SB15::cpu3
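The unbind steps above can be narrowed to the CPUs that sit on the board being removed. The sketch below filters `pbind -q` output by a list of CPU IDs; it assumes the `process id <pid>: <cpu>` line format shown in step 2, and the function name `pids_bound_to` is hypothetical. On Solaris 10, `nawk` or `/usr/xpg4/bin/awk` may be needed because the helper uses `-v` and `sub`.

```shell
# Hypothetical helper: read `pbind -q` output on stdin and print the PID
# of every process bound to one of the CPU IDs given as arguments.
# Assumed line format, as shown in this note:
#   process id 650: 0
pids_bound_to() {
    cpus=" $* "                                   # e.g. " 0 1 2 3 "
    awk -v cpus="$cpus" '$1 == "process" && $2 == "id" {
        pid = $3
        sub(/:$/, "", pid)                        # "650:" -> "650"
        if (index(cpus, " " $4 " ") > 0) print pid
    }'
}

# Usage on a live domain (not run here), for a board holding CPUs 0-3:
#   pbind -q | pids_bound_to 0 1 2 3 | while read pid; do
#       echo "pbind -u $pid"
#   done
```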


To unconfigure the board

1 Unconfigure and disconnect the board:

# cfgadm -v -c disconnect SB2

2 If the board does not contain permanent memory, the command’s output resembles the following with slight variations for each server:

request delete capacity (4 cpus)

request delete capacity (2097152 pages)

request delete capacity SB2 done

request offline SUNW_cpu/cpu448

request offline SUNW_cpu/cpu449

request offline SUNW_cpu/cpu450

request offline SUNW_cpu/cpu451

request offline SUNW_cpu/cpu448 done

request offline SUNW_cpu/cpu449 done

request offline SUNW_cpu/cpu450 done

request offline SUNW_cpu/cpu451 done

unconfigure SB2

unconfigure SB2 done

notify remove SUNW_cpu/cpu448

notify remove SUNW_cpu/cpu449

notify remove SUNW_cpu/cpu450

notify remove SUNW_cpu/cpu451

notify remove SUNW_cpu/cpu448 done

notify remove SUNW_cpu/cpu449 done

notify remove SUNW_cpu/cpu450 done

notify remove SUNW_cpu/cpu451 done

disconnect SB2

disconnect SB2 done

poweroff SB2

poweroff SB2 done

unassign SB2 skipped

Skip to 4.


3 If the board has permanent memory, the system prompts you to proceed:

System may be temporarily suspended; proceed (yes/no)?

If the answer is “yes,” dynamic reconfiguration proceeds. The system is suspended during reconfiguration. When the system resumes operation on another board, the board you are reconfiguring is disconnected. If the disconnect operation succeeds, the output resembles the following with slight variations for different servers:

request suspend SUNW_OS

request suspend SUNW_OS done

request delete capacity (2097152 pages)

request delete capacity SB15 done

request offline SUNW_cpu/cpu480

request offline SUNW_cpu/cpu481

request offline SUNW_cpu/cpu482

request offline SUNW_cpu/cpu483

request offline SUNW_cpu/cpu480 done

request offline SUNW_cpu/cpu481 done

request offline SUNW_cpu/cpu482 done

request offline SUNW_cpu/cpu483 done

unconfigure SB15

unconfigure SB15 done

notify remove SUNW_cpu/cpu480

notify remove SUNW_cpu/cpu481

notify remove SUNW_cpu/cpu482

notify remove SUNW_cpu/cpu483

notify remove SUNW_cpu/cpu480 done

notify remove SUNW_cpu/cpu481 done

notify remove SUNW_cpu/cpu482 done

notify remove SUNW_cpu/cpu483 done

disconnect SB15

disconnect SB15 done

poweroff SB15

Skip to 4.

Note: If there are real-time processes running on the board you are unconfiguring, the disconnect operation may not succeed. You must stop these processes in the appropriate manner before continuing with dynamic reconfiguration.


4 If the board has real-time processes that must be stopped, the dynamic reconfiguration operation fails, indicating the PID of those processes that are running. There may be slight variations in output for different Oracle Sun Enterprise servers.

For example:

.

.

notify remove SUNW_cpu/cpu481 done

notify remove SUNW_cpu/cpu482 done

notify remove SUNW_cpu/cpu483 done

cfgadm: Hardware specific failure: unconfigure SB15: Cannot quiesce realtime thread: 621

5 To determine the name of the processes, use the command:

# ps -ef | grep PID

6 Stop the process in the appropriate manner. For example, the processes in our example must be stopped using the kill command:

# kill -9 PID

7 Retry the command in step 1.

8 To verify the board is disconnected and unconfigured, use the cfgadm command:

# cfgadm

Ap_Id Type Receptacle Occupant Cond

.

N0.SB2 CPU disconnected unconfigured unknown

.

Now you can remove the board from the slot, or reassign it to another domain.

Note: Do not remove the board until you have verified it is disconnected.

9 If you are replacing the board immediately, see “To add a board to a domain.” Otherwise, return the cluster to operation without replacing the disconnected CPU/memory board using the procedure in the following section.
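Because a board must not be pulled until it shows as disconnected, the check in step 8 is a natural candidate for a scripted guard. The following sketch assumes the five-column `cfgadm` listing shown above (Ap_Id, Type, Receptacle, Occupant, Condition); the function name `board_is_disconnected` is hypothetical.

```shell
# Hypothetical helper: read `cfgadm` output on stdin and succeed only if
# the attachment point named in $1 is both disconnected and unconfigured.
# Assumed columns: Ap_Id Type Receptacle Occupant Condition
board_is_disconnected() {
    awk -v id="$1" '
        $1 == id {
            found = 1
            ok = ($3 == "disconnected" && $4 == "unconfigured")
        }
        END {
            if (found && ok) exit 0
            exit 1      # board not listed, or still connected/configured
        }'
}

# Usage on a live domain (not run here):
#   cfgadm | board_is_disconnected N0.SB2 && echo "safe to remove N0.SB2"
```

The helper fails when the Ap_Id is not listed at all, so a mistyped board name is not mistaken for a safe state.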


To add a board to a domain

1 Log in as administrator to the domain where you plan to add or configure the boards.

2 If you are adding a new or a replacement board to a domain (for example, wildcat), verify the state of the slot that will contain the board.

To be configured with a new board, the slot must have the following states and condition:

■ Receptacle state: empty

■ Occupant state: unconfigured

■ Condition: unknown

Verify this by using the cfgadm command to list the slots, as in the following example. In the wildcat domain, slot SB2 is to contain the CPU board:


3 Use the cfgadm command to connect and configure a CPU or memory board:

cfgadm -v -c configure SBx

For example:

# cfgadm -v -c configure SB2

assign SB2

assign SB2 done

poweron SB2

poweron SB2 done

test SB2

test SB2 done

connect SB2

connect SB2 done

configure SB2

configure SB2 done

notify online SUNW_cpu/cpu448

notify online SUNW_cpu/cpu449

notify online SUNW_cpu/cpu450

notify online SUNW_cpu/cpu451

notify add capacity (4 cpus)

notify add capacity (2097152 pages)

notify add capacity SB2 done

4 Verify the new board has been connected and configured using the cfgadm command. For example:

# cfgadm

Ap_Id Type Receptacle Occupant Cond

.

SB2 CPU connected configured ok

Performing dynamic reconfiguration on PCI cards on I/O boards

A card containing an HBA can be removed and replaced on an I/O board. If a failed HBA has been used with other adapters on separate cards in a Dynamic Multi-Pathing (DMP) configuration, I/O can proceed through the alternate path and VCS need not be stopped.


To determine the status of the card you are unconfiguring

1 Log in to the domain as the administrator. For the following example, the I/O board is in the wildcat domain.

2 Check the status of the boards. Use the cfgadm command.

cougar# cfgadm

The output resembles:

For Solaris 10:

Ap_Id Type Receptacle Occupant Condition

IO4 HPCI connected configured ok

IO4_C3V0 fibre/hp connected configured ok

IO4_C3V1 pci-pci/hp connected configured ok

IO4_C5V0 pci-pci/hp connected configured ok

IO4_C5V1 fibre/hp connected configured ok

SB7 CPU connected configured ok

SB8 CPU connected configured ok

c0 scsi-bus connected configured unknown

c1 scsi-bus connected unconfigured unknown

c2 fc connected unconfigured unknown

c3 fc connected unconfigured unknown

c4 fc-fabric connected configured unknown

c5 fc connected unconfigured unknown

cougar# uname -a

SunOS cougar 5.10 Generic_118833-17 sun4u sparc SUNW,Sun-Fire-15000

cougar#

In the case of Solaris 10, the reporting of I/O board slot names makes it somewhat easier to discover the relationship between physical and logical devices because slots on the I/O boards are also numbered using the C[35]V[01] notation.
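The C[35]V[01] slot notation lends itself to pattern matching when looking for candidate HBA slots in a long listing. The sketch below assumes the `cfgadm` columns shown above and treats a type beginning with `fibre` as a Fibre Channel card, which is an interpretation of the example output rather than a documented rule; the function name `fibre_slots` is hypothetical.

```shell
# Hypothetical helper: read `cfgadm` output on stdin and print the I/O
# board slots (IO<n>_C[35]V[01]) whose Type column starts with "fibre".
fibre_slots() {
    awk '$1 ~ /^IO[0-9]+_C[35]V[01]$/ && $2 ~ /^fibre/ { print $1 }'
}

# Usage on a live domain (not run here):
#   cfgadm | fibre_slots
```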

To remove a PCI card

1 Disable the controllers on the I/O system card using the vxdmpadm command:

# vxdmpadm disable ctlr=c3

If the card has more than one controller, repeat this command for each controller on the card.

2 Disconnect the card:

# cfgadm -v -c disconnect pcisch1:sg8slot0


3 Check the states and the condition of the card using the cfgadm command:

# cfgadm

The disconnected card must have the following states and condition:

■ Receptacle state: disconnected

■ Occupant state: unconfigured

■ Condition: unknown

4 Remove the disconnected card only if it is powered off.
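The state check in step 3 can be scripted. The following is a minimal sketch, not part of the original procedure. It assumes a POSIX shell and awk (on Solaris 10 the legacy /bin/sh and /usr/bin/awk may not accept this syntax, so ksh with nawk or /usr/xpg4/bin/awk is assumed) and the five-column cfgadm layout shown in this note. The function names ap_state and card_removable are illustrative.

```shell
# ap_state: print "receptacle occupant condition" for one attachment point.
# Reads a cfgadm-style listing on stdin; the Ap_Id to look up is $1.
# Assumes the columns Ap_Id, Type, Receptacle, Occupant, Condition.
ap_state() {
    awk -v id="$1" '
        $1 == id && NF >= 5 { print $3, $4, $5; found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# card_removable: succeed only if the attachment point is reported as
# disconnected, unconfigured, and unknown (the states required in step 3).
card_removable() {
    [ "$(ap_state "$1")" = "disconnected unconfigured unknown" ]
}
```

For example, `cfgadm | card_removable pcisch1:sg8slot0 && echo "card may be removed"` prints the message only when all three fields match.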

To add a card

1 Verify that the slot you selected can accept a device, such as a PCI card.

To accept a device, the slot must have the following states and condition:

■ Receptacle state: empty or disconnected

■ Occupant state: unconfigured

■ Condition: unknown

Verify this by using the cfgadm command to list all of the system boards, as in the following example:

The output resembles:

For Solaris 10:

cougar# cfgadm

Ap_Id Type Receptacle Occupant Condition

IO4 HPCI connected configured ok

IO4_C3V0 fibre/hp connected configured ok

IO4_C3V1 pci-pci/hp connected configured ok

IO4_C5V0 pci-pci/hp connected configured ok

IO4_C5V1 fibre/hp connected configured ok

SB7 CPU connected configured ok

SB8 CPU connected configured ok

c0 scsi-bus connected configured unknown

c1 scsi-bus connected unconfigured unknown

c2 fc connected unconfigured unknown

c3 fc connected unconfigured unknown

c4 fc-fabric connected configured unknown

c5 fc connected unconfigured unknown

cougar# uname -a

SunOS cougar 5.10 Generic_118833-17 sun4u sparc


SUNW,Sun-Fire-15000

cougar#

On Solaris 10, the reporting of I/O board slot names makes it somewhat easier to discover the relationship between physical and logical devices, because slots on the I/O boards are also numbered using the C[35]V[01] notation.

2 Add the replacement PCI card to the empty card slot.

3 To configure the new card, use the cfgadm command. For example:

For s6800:

# cfgadm -c configure pcisch1:sg8slot0

For e12K/15K:

# cfgadm -c configure pcisch1:e15b1slot0

After the system configures and tests the board, it displays a message in the domain console log indicating the configuration of the components.

4 Check the states and the condition of the board using the cfgadm command; it must be “connected,” “configured,” and “ok.”

5 Enable the controller for the HBA:

# vxdmpadm enable ctlr=c3

Note: This command succeeds if the controller is accessible to the domain and I/O can be performed on it.

Performing dynamic reconfiguration on I/O boards

Under certain circumstances, you must stop VCS on the domain where you are reconfiguring a board.

See “Scenarios requiring a VCS shutdown” on page 9.

For s6800:

In the following scenario, a cluster consists of the wildcat and the leopard domains. The cluster is running service groups on the wildcat domain, which includes I/O boards N0.IB8 and N0.IB6. N0.IB8 requires dynamic reconfiguration because of a malfunctioning component. The domain leopard includes I/O boards IO14 and IO15. The disk controllers and NICs are labeled in the following diagrams.

[Figure: Domain wildcat. I/O boards N0.IB8 and N0.IB6 carry NICs qfe0 through qfe7 (two private links plus public links) and disk controllers c1 and c2.]

[Figure: Domain leopard. I/O boards IO15 and IO14 carry the ce NICs (private links and public links) and disk controllers c9 and c8.]

For e12K/15K/25K: In the following scenario, a cluster consists of the leopard and the S6800f0 domains. The cluster is running service groups on the leopard domain, which includes I/O boards IO14 and IO15. IO15 requires dynamic reconfiguration because of a malfunctioning component. The domain S6800f0 includes I/O boards IB8 and IB6. The disk controllers and NICs are labeled in the following diagrams.

[Figure: Domain leopard. I/O boards IO15 and IO14 carry the ce NICs (private links and public links), disk controllers c9 and c8, and the SCSI and boot SCSI connections.]

[Figure: Domain S6800f0. I/O boards N0.IB8 and N0.IB6 carry NICs qfe0 through qfe7 (two private links plus public links) and disk controllers c1 and c2.]

The highlights of the procedure to dynamically reconfigure the I/O boards (the N0.IB8 board in the wildcat domain for s6800 and the IO15 board in the leopard domain for e12K/15K/25K) include:

■ Disabling all the active controllers on the board.

■ Disabling all the NIC devices used for private communications on the board

■ Disabling all the NIC devices used for public communications on the board

■ Disabling the IO board and removing it

■ Adding the replacement IO board

■ Enabling the replacement board

■ Enabling the public NIC devices

■ Enabling the private NIC devices

■ Enabling the active controllers


To verify the status of the cluster before dynamic reconfiguration

1 Use the VCS command hastatus -sum to verify the current state of the service groups in the cluster. Use the command before reconfiguring the I/O board and after reconfiguration to verify the cluster’s state. The output is as follows, with slight variations for the different Oracle servers.

-- SYSTEM STATE

-- System State Frozen

A leopard RUNNING 0

A s6800f0 RUNNING 0

-- GROUP STATE

-- Group System Probed AutoDisabled State

B ServiceGroupA leopard Y N ONLINE

B ServiceGroupA s6800f0 Y N OFFLINE

B cvm leopard Y N ONLINE

B cvm s6800f0 Y N ONLINE


2 For s6800: By using the cfgadm -lv command, you can show the I/O boards and cards in the wildcat domain. For example:

# cfgadm -lv

In the output (not shown), the board N0.IB8 is reported to be connected, configured, and ok. In addition, the condition of each of the slots on N0.IB8 is reported.

For e12K/15K: By using the cfgadm -al command, you can show the I/O boards and cards in the leopard domain. For example:

# cfgadm -al

Ap_Id Type Receptacle Occupant Condition

IO14 HPCI connected configured ok

IO14::pci0 io connected configured ok

IO14::pci1 io connected configured ok

IO14::pci2 io connected configured ok

IO14::pci3 io connected configured ok

IO15 HPCI connected configured ok

IO15::pci0 io connected configured ok

IO15::pci1 io connected configured ok

IO15::pci2 io connected configured ok

IO15::pci3 io connected configured ok

SB14 CPU connected configured ok

SB14::cpu0 cpu connected configured ok

.

.

.

pcisch1:e14b1slot0 fibre/hp connected configured ok

pcisch2:e14b1slot3 pci-pci/hp connected configured ok

pcisch3:e14b1slot2 ethernet/hp connected configured ok

pcisch4:e15b1slot1 pci-pci/hp connected configured ok

pcisch5:e15b1slot0 fibre/hp connected configured ok

pcisch6:e15b1slot3 pci-pci/hp connected configured ok

pcisch7:e15b1slot2 ethernet/hp connected configured ok
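To compare the cluster state before and after reconfiguration, the hastatus -sum output from step 1 can be reduced to the list of groups that are online on a given system. This is a minimal sketch under the assumption that group rows begin with "B" and carry the columns shown in the sample output (Group, System, Probed, AutoDisabled, State). The function name online_groups is illustrative, and a POSIX shell and awk are assumed.

```shell
# online_groups: list service groups reported ONLINE on one system.
# Reads "hastatus -sum" output on stdin; $1 is the system name.
# Assumes group rows look like: B <group> <system> <probed> <autodisabled> <state>
online_groups() {
    awk -v sys="$1" '$1 == "B" && $3 == sys && $6 == "ONLINE" { print $2 }'
}
```

Saving `hastatus -sum | online_groups leopard` to a file before the reconfiguration and comparing it with the same output afterward shows whether any group changed state.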


To determine the controllers on a board

1 Use the command vxdmpadm listctlr all to determine all controllers in the domain. For example, on the leopard domain:

# vxdmpadm listctlr all

CTLR-NAME ENCLR-TYPE STATE ENCLR-NAME

=====================================================

c0 Disk ENABLED Disk

c9 HDS9960 ENABLED HDS99600

c8 HDS9960 ENABLED HDS99600


2 To determine which controllers are on a specific board, for example IO15, use the following commands to display information about the disks in the domain, their controllers, and the location of the controllers on the I/O boards.

Use the command cfgadm -lv, which provides a verbose listing of all boards in the domain. In the output, you can see the device slots listed for the board IO15.

# cfgadm -lv

In the following example (not all output is shown), the listing might contain lines that resemble:

.

pcisch4:e15b1slot1 . . .

/devices/pci@1fc,700000:e15b1slot1

pcisch5:e15b1slot0 . . .

/devices/pci@1fc,600000:e15b1slot0

pcisch6:e15b1slot3 . . .

/devices/pci@1fd,700000:e15b1slot3

pcisch7:e15b1slot2 . . .

/devices/pci@1fd,600000:e15b1slot2

.

The listing indicates that the device labeled pci@1fc is used by slots 0 and 1 of board 15, and the device labeled pci@1fd is used by slots 3 and 2.

Using the format command in the domain, you can list the disk devices. The listing may be lengthy, but in the output, the controller, indicated by “c#” in the first two characters of the device name, corresponds to a device that is listed in the output of the previous command. For example:

# format

c0t0d0 <SUN18G ..... /pci@1dc,700000/pci@1.. .....

c8t0d0 <HITACHI-OPEN ....

/pci@1dc,600000/fibre-channel ...

.

c9t0d0 <HITACHI-OPEN ....

/pci@1fc,600000/fibre-channel ...

A comparison of the output of the previous two commands shows that board 15 slot 0 contains the controller c9.


3 As an alternative to using the format command, you can also use the following procedure to determine which storage controllers are impacted by dynamic reconfiguration on a given slot or I/O board for e25K on Solaris 10.

Verify which I/O controllers are impacted by dynamic reconfiguration on the board IO4 on Solaris 10 (cougar) by using the following command:

cougar# cfgadm -s "cols=ap_id:physid" | grep IO4

IO4 /devices/pseudo/dr@0:IO4

IO4_C3V0 /devices/pci@9c,600000:IO4_C3V0

IO4_C3V1 /devices/pci@9d,600000:IO4_C3V1

IO4_C5V0 /devices/pci@9c,700000:IO4_C5V0

IO4_C5V1 /devices/pci@9d,700000:IO4_C5V1

The -s parameter is used to limit output to the ap_id and physical id columns.

Note the pci@... component of each physical ID, then run grep again with a pattern that matches those components, such as pci@9[cd],[67]:

cougar# cfgadm -s "cols=ap_id:physid" | grep pci@9[cd],[67]

IO4_C3V0 /devices/pci@9c,600000:IO4_C3V0

IO4_C3V1 /devices/pci@9d,600000:IO4_C3V1

IO4_C5V0 /devices/pci@9c,700000:IO4_C5V0

IO4_C5V1 /devices/pci@9d,700000:IO4_C5V1

c0 /devices/pci@9c,700000/pci@1/scsi@2:scsi

c1 /devices/pci@9c,700000/pci@1/scsi@2,1:scsi

c2 /devices/pci@9c,600000/SUNW,qlc@1,1/fp@0,0:fc

c3 /devices/pci@9c,600000/SUNW,qlc@1/fp@0,0:fc

c4 /devices/pci@9d,700000/SUNW,qlc@1/fp@0,0:fc

c5 /devices/pci@9d,700000/SUNW,qlc@1,1/fp@0,0:fc

c0 and c1 are located on IO4_C5V0, c2 and c3 are on IO4_C3V0, and c4 and c5 are on IO4_C5V1.

On Solaris 9, the procedure is almost the same:

jaguar# cfgadm -s "cols=ap_id:physid" | grep e17

e17 corresponds to the IO board #17

pcisch4:e17b1slot1 /devices/pci@23c,700000:e17b1slot1

pcisch5:e17b1slot0 /devices/pci@23c,600000:e17b1slot0

pcisch6:e17b1slot3 /devices/pci@23d,700000:e17b1slot3

pcisch7:e17b1slot2 /devices/pci@23d,600000:e17b1slot2

jaguar# cfgadm -s "cols=ap_id:physid" | grep pci@23[cd],[67]


c4 /devices/pci@23c,700000/pci@1/scsi@2:scsi

c5 /devices/pci@23c,700000/pci@1/scsi@2,1:scsi

c6 /devices/pci@23d,700000/SUNW,qlc@1/fp@0,0:fc

pcisch4:e17b1slot1 /devices/pci@23c,700000:e17b1slot1

pcisch5:e17b1slot0 /devices/pci@23c,600000:e17b1slot0

pcisch6:e17b1slot3 /devices/pci@23d,700000:e17b1slot3

pcisch7:e17b1slot2 /devices/pci@23d,600000:e17b1slot2

c4 and c5 are on e17b1slot1, and c6 is on e17b1slot3.
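The manual comparison in step 3 (matching the pci@ component of each controller path against the slot paths) can be automated. The following is a minimal sketch, assuming the two-column ap_id and physical ID output of cfgadm -s "cols=ap_id:physid" as shown above, and a POSIX shell and awk. The function name ctlr_slots is illustrative.

```shell
# ctlr_slots: print "<controller> <slot>" pairs.
# Reads the output of: cfgadm -s "cols=ap_id:physid"
# A slot has a physical ID of the form /devices/<root>:<slot>;
# a controller has a longer path that starts with /devices/<root>/.
ctlr_slots() {
    awk '
    {
        phys = $2
        sub(/^\/devices\//, "", phys)        # drop the /devices/ prefix
        n = split(phys, part, "/")
        root = part[1]
        if (n == 1 && index(root, ":") > 0) {
            split(root, kv, ":")             # kv[1] is the PCI root, e.g. pci@9c,700000
            slot[kv[1]] = $1
        } else if (n > 1) {
            ctlr[$1] = root                  # remember which PCI root the device sits under
            order[++count] = $1
        }
    }
    END {
        for (i = 1; i <= count; i++) {
            c = order[i]
            if (ctlr[c] in slot) print c, slot[ctlr[c]]
        }
    }'
}
```

Run as `cfgadm -s "cols=ap_id:physid" | ctlr_slots`. Entries whose root does not match any slot, such as the pseudo device for the board itself, are skipped.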

To determine the network interfaces on the board

◆ Verify which network interfaces correspond to which slot on the I/O board (each I/O board can carry up to four PCI cards) by using the grep command to search /etc/path_to_inst for the pci identifiers.

For e25K on Solaris 10:

IO4_C3V0 /devices/pci@9c,600000:IO4_C3V0

IO4_C3V1 /devices/pci@9d,600000:IO4_C3V1

IO4_C5V0 /devices/pci@9c,700000:IO4_C5V0

IO4_C5V1 /devices/pci@9d,700000:IO4_C5V1

cougar# grep pci@9[cd],[67] /etc/path_to_inst |grep network

"/pci@9c,700000/network@3,1" 0 "eri"

"/pci@9c,700000/pci@1/network@0" 0 "ce"

"/pci@9c,700000/pci@1/network@1" 1 "ce"

"/pci@9d,600000/pci@1/network@0" 2 "ce"

IO4_C5V0 contains eri0, ce0, and ce1. IO4_C3V1 contains ce2.

cougar#
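The interface names in the summary above are formed by joining the driver name and the instance number from each /etc/path_to_inst entry. A minimal sketch of that step follows, assuming the three-field entry format shown in the grep output (quoted path, instance, quoted driver) and a POSIX shell and awk. The function name nics_by_root is illustrative.

```shell
# nics_by_root: print "<pci-root> <interface>" for each network entry.
# Reads /etc/path_to_inst lines on stdin, for example:
#   "/pci@9c,700000/pci@1/network@0" 0 "ce"
nics_by_root() {
    awk '$1 ~ /network@/ {
        path = $1; drv = $3
        gsub(/"/, "", path)           # strip the surrounding quotes
        gsub(/"/, "", drv)
        split(path, part, "/")        # path starts with "/", so part[2] is the PCI root
        print part[2], drv $2         # driver name followed by instance number
    }'
}
```

Run as `nics_by_root < /etc/path_to_inst`, then match each PCI root against the slot listing from cfgadm to find the slot that carries the interface.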


To disable the controllers on the board

1 Disable the active controllers on the I/O system card using the vxdmpadm command.

vxdmpadm disable ctlr=ctlr

For s6800:

# vxdmpadm disable ctlr=c2

For e12K/15K:

# vxdmpadm disable ctlr=c9

2 Using the vxdmpadm command, verify that the controller is disabled. The output for all Oracle servers (s6800 and e12K/15K/25K) will be similar except for minor differences.

# vxdmpadm listctlr all

For s6800: In this example, the only controller on board N0.IB8 is c2.

CTLR-NAME ENCLR-TYPE STATE ENCLR-NAME

=====================================================

c0 Disk ENABLED Disk

c2 HDS9960 DISABLED HDS99600

c1 HDS9960 ENABLED HDS99600

For e12K/15K: In this example, the only controller on board IO15 is c9.

CTLR-NAME ENCLR-TYPE STATE ENCLR-NAME

=====================================================

c0 Disk ENABLED Disk

c9 HDS9960 DISABLED HDS99600

c8 HDS9960 ENABLED HDS99600

3 If a card has more than one controller, repeat this command for each controller on the card to be reconfigured.
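When a card carries several controllers, the disable and verify steps can be wrapped in a loop. The following is a minimal sketch that uses only the two vxdmpadm invocations shown in this procedure. It assumes a POSIX shell and awk and the four-column listctlr layout shown above. The function name disable_ctlrs is illustrative.

```shell
# disable_ctlrs: disable each named controller, then confirm that
# "vxdmpadm listctlr all" reports it as DISABLED.
# Usage: disable_ctlrs c2 [c3 ...]
disable_ctlrs() {
    for c in "$@"; do
        vxdmpadm disable ctlr="$c" || return 1
        # STATE is the third column of the listctlr output
        state=$(vxdmpadm listctlr all | awk -v c="$c" '$1 == c { print $3 }')
        if [ "$state" != "DISABLED" ]; then
            echo "controller $c is in state '${state:-unknown}', expected DISABLED" >&2
            return 1
        fi
        echo "$c DISABLED"
    done
}
```

The function stops at the first controller that cannot be disabled or verified, so the board is not disconnected while a path is still active.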


To list the status of the private network links and to disable them

1 Enter the command lltstat -nv:

The output resembles:

For s6800:

LLT node information:

Node State Links

* 0 wildcat OPEN 2

1 leopard OPEN 2

2 CONNWAIT 0

.

.

31 CONNWAIT 0

The output shows that both domains have two links for private communication. Both links are “OPEN,” that is, operational.

For e12K/15K:

LLT node information:

Node State Links

0 s6800f0 OPEN 2

* 1 leopard OPEN 2

2 CONNWAIT 0

.

.

31 CONNWAIT 0

The output shows that both domains have two links for private communication. Both links are “OPEN,” that is, operational.


2 Display the /etc/llttab file using the following command:

# cat /etc/llttab

For s6800:

set-node wildcat

set-cluster 13

link qfe4 /dev/qfe:4 - ether - -

link qfe0 /dev/qfe:0 - ether - -

The devices qfe0 and qfe4 are shown as the private network links.

For e12K/15K:

set-node leopard

set-cluster 13

link ce3 /dev/ce:3 - ether - -

link ce8 /dev/ce:8 - ether - -

The devices ce3 and ce8 are shown as the private network links.


3 Disable the private network link device.

For example, for s6800 the private network link device is qfe4, on I/O board N0.IB8.

# /sbin/lltconfig -u qfe4

For example, for e12K/15K the private network link device is ce8, on I/O board IO15.

# /sbin/lltconfig -u ce8

4 Check the status of the private network links:

# lltstat -nv

For s6800:

LLT node information:

Node State Links

* 0 wildcat OPEN 2

1 leopard OPEN 1

2 CONNWAIT 0

.

.

.

31 CONNWAIT 0

For e12K/15K:

LLT node information:

Node State Links

0 s6800f0 OPEN 1

* 1 leopard OPEN 2

2 CONNWAIT 0

.

.

.

31 CONNWAIT 0
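The link counts in the lltstat -nv output can be extracted per node, which makes it easy to confirm that a link went down after lltconfig -u and came back later. This is a minimal sketch, assuming the node-row layout shown above, where the local node's row is prefixed with "*" as a separate field. The function name llt_links is illustrative, and a POSIX shell and awk are assumed.

```shell
# llt_links: print the number of links reported for one node.
# Reads "lltstat -nv" output on stdin; $1 is the node name.
# Rows look like "* 1 leopard OPEN 2" (local node) or "0 s6800f0 OPEN 1".
llt_links() {
    awk -v node="$1" '
        {
            if ($1 == "*") { name = $3; links = $5 }   # local node: fields shift by one
            else           { name = $2; links = $4 }
        }
        name == node { print links; found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `lltstat -nv | llt_links s6800f0` would print 1 for the e12K/15K sample output in step 4.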


To list the status of the public NICs and to disable them

1 Use the command ifconfig -a.

For s6800: For example, qfe3 (on board N0.IB6) and qfe7 (on board N0.IB8), the NICs used for the public network connections, are operational.

# ifconfig -a

lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1

inet 127.0.0.1 netmask ff000000

ge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2

inet 10.182.65.99 netmask fffff000 broadcast 10.182.79.255

ether 0:3:ba:8:ec:40

qfe3: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3

inet 10.182.66.143 netmask ffffff00 broadcast 10.255.255.255

groupname mn1

ether 0:3:ba:8:ec:40

qfe7: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 4

inet 10.182.66.144 netmask ffffff00 broadcast 10.255.255.255

groupname mn1

ether 0:3:ba:8:ec:40

2 For s6800: To disable the device qfe7 on board N0.IB8, use the commands:

# ifconfig qfe7 down

# ifconfig qfe7 unplumb

For e12K/15K: To disable the device ce5 on board IO15, use the command:

# ifconfig ce5 down

3 For s6800: Use the ifconfig -a command to verify that qfe7 is down. No information about qfe7 should appear in the output.

For e12K/15K/25K: Use the ifconfig -a command to verify that ce5 is down. No information about ce5 should appear in the output.

# ifconfig -a


To disable and remove the IO board

1 When the controllers and network interface cards are disabled, disconnect the board:

For s6800:

# cfgadm -c disconnect N0.IB8

For e12K/15K:

# cfgadm -c disconnect IO15

Note: The -f option is recommended only when a normal disconnect attempt fails and there is no clear way to make the command succeed gracefully.

2 Use the cfgadm command to check the status of the I/O board:

# cfgadm -al

For s6800: In the output, the fields Receptacle, Occupant, and Condition for N0.IB8 show disconnected, unconfigured, and unknown respectively.

The I/O board may be physically removed at this time. Before adding the new board to the wildcat domain, you must test it in another spare domain.

For e12K/15K:

Ap_Id Type Receptacle Occupant Condition

IO14 HPCI connected configured ok

IO14::pci0 io connected configured ok

IO14::pci1 io connected configured ok

IO14::pci2 io connected configured ok

IO14::pci3 io connected configured ok

IO15 HPCI disconnected unconfigured unknown

SB14 CPU connected configured ok

SB14::cpu0 cpu connected configured ok

.

.

The I/O board, IO15, may be physically removed at this time.


To add the new IO board

1 Physically add the board, connecting all necessary cables, and configure it:

For s6800:

# cfgadm -c configure N0.IB8

For e12K/15K:

# cfgadm -c configure IO15

Note: Make sure that the output of the cfgadm command shows the slot where the new board is to be added. The status of the slot should be disconnected, unconfigured, and unknown.

2 Run the cfgadm -al command to verify that the board has been configured; the board should be connected, configured, and ok. If you have stopped VCS, you may skip step 3 through step 6.

3 Reconfigure the network interface cards on the new board:

For s6800:

# ifconfig qfe7 plumb

# ifconfig qfe7 up

For e12K/15K:

# ifconfig ce5 plumb

4 Run the command ifconfig -a to verify that the NICs are up and running.

5 Reconfigure LLT to reestablish the private network links:

For s6800:

# /sbin/lltconfig -t qfe4 -d /dev/qfe:4

For e12K/15K:

# /sbin/lltconfig -t ce8 -d /dev/ce:8

6 Verify that the private network links are restored using the command lltstat -nv:

# /sbin/lltstat -nv


7 For s6800: Enable the controller c2 on board N0.IB8 using the vxdmpadm command:

# vxdmpadm enable ctlr=c2

For e12K/15K: Enable the controller c9 on board IO15 using the vxdmpadm command:

# vxdmpadm enable ctlr=c9

8 Verify that the controller is up and running:

# vxdmpadm listctlr all

If you have stopped VCS before reconfiguring the I/O board, restart it. See “Stopping and starting VCS” on page 11.

Performing dynamic reconfiguration on Oracle SunEnterprise 10K

The system board in a domain may contain I/O controllers, CPUs, or memory.

Boards with I/O controllers can be dynamically reconfigured as long as you use VxVM with the Dynamic Multi-Pathing (DMP) feature to manage the shared storage.

■ See “Detaching and attaching I/O system boards” on page 47.

■ See “Detaching I/O system boards with DMP enabled” on page 48.

■ See “Attaching I/O system boards with DMP enabled” on page 50.

■ See “Detaching CPU/memory boards” on page 51.

■ See “Attaching CPU/Memory boards” on page 52.

■ See “Using VM without DMP enabled” on page 53.

Preparing environment for dynamic reconfiguration

Before performing dynamic reconfiguration operations on a domain, you must first set the appropriate environment variable.


To enable the kernel cage variable for dynamic reconfiguration

1 In the Solaris 8 operating environment, you must set the system(4) variable, kernel_cage_enable, to 1 (enabled). By default, this variable is set to zero (kernel cage disabled), which prevents dynamic reconfiguration Detach operations.

2 Edit the file /etc/system so that kernel_cage_enable equals 1.

.

set kernel_cage_enable=1

.

3 Reboot the domain. To verify that the kernel cage is enabled, check the file /var/adm/messages.

4 Look for the message:

NOTICE: DR Kernel Cage is ENABLED
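Both checks in this procedure (the setting in /etc/system and the boot-time notice in /var/adm/messages) can be scripted. The following is a minimal sketch, assuming a POSIX shell and awk, and assuming that comment lines in /etc/system begin with an asterisk. The function names cage_enabled_in and cage_confirmed_in are illustrative.

```shell
# cage_enabled_in: succeed if the last uncommented assignment of
# kernel_cage_enable in the given file sets it to 1.
# $1 defaults to /etc/system.
cage_enabled_in() {
    awk '
        /^[ \t]*\*/ { next }                           # skip comment lines
        /^[ \t]*set[ \t]+kernel_cage_enable[ \t]*=/ {
            v = $0
            sub(/^[^=]*=[ \t]*/, "", v)                # keep only the value
            sub(/[ \t]*$/, "", v)
            ok = (v == "1")
        }
        END { exit (ok ? 0 : 1) }
    ' "${1:-/etc/system}"
}

# cage_confirmed_in: succeed if the given log contains the boot-time notice.
# $1 defaults to /var/adm/messages.
cage_confirmed_in() {
    grep "DR Kernel Cage is ENABLED" "${1:-/var/adm/messages}" >/dev/null
}
```

Running `cage_enabled_in && cage_confirmed_in` after the reboot confirms that the setting is present and that the kernel reported the cage as enabled.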

Detaching and attaching I/O system boards

In the configuration shown below, VCS runs on Domains A and B with service groups online on Domain A. Shared storage consists of a VxVM disk group with Dynamic Multi-Pathing (DMP) enabled. Dynamic Reconfiguration of I/O boards depends on DMP being configured for the storage.

[Figure: Domain A (system boards SB1 through SB4) and Domain B (system boards SB9 through SB12). Each domain has CPU/memory boards and I/O boards; the disk controllers are labeled c1 and c2 in Domain A and c3 and c4 in Domain B, and both domains connect to the shared I/O storage.]

In the example, the system board SB3, which has a disk controller, is to be removed, repaired, and replaced. The administrator disables the controller, and the disk controller on SB1 automatically takes over because of the DMP functionality. Using dynamic reconfiguration commands, the administrator can detach, or remove, the board from Domain A’s configuration. When this is complete, the board can be physically removed.

Replacing the board (a controller board in this case) involves physically installing it and reconnecting it to the shared storage. Reconfiguring the board requires using dynamic reconfiguration commands to “attach” it to the domain, after which the controller can be re-enabled.

Detaching I/O system boards with DMP enabled

Make sure the kernel_cage_enable variable is set.

See “Preparing environment for dynamic reconfiguration” on page 46.


To detach an I/O board with DMP enabled

1 Freeze the VCS service groups running on the domain where you intend to perform dynamic reconfiguration operations. Freezing the service groups prevents them from being taken offline or failed over. Repeat the following command for each service group:

# hagrp -freeze ser_grp_name

2 Connect to the SSP server and log in to the domain whose system board requires Dynamic Reconfiguration.

ssp:D1% echo $SUNW_HOSTNAME

3 Enter the dr(1M) shell:

ssp:D1% dr

4 To verify the board is an I/O board, enter:

dr> drshow sb# IO

If the display lists the disks connected to the controller, the system board is an I/O board.

5 If the system board is an I/O board, open another window and log in as root to the domain you are currently reconfiguring.

6 Disable the controller on the I/O system board:

# vxdmpadm disable ctlr=ctlr#

7 In the window where you are running dynamic reconfiguration, start detaching the I/O board by entering:

dr> drain sb#

8 Monitor the progress of the drain operation by entering:

dr> drshow sb# drain

9 When you see the message:

Percent Complete= 100% (0 KBytes remaining)

complete the detach operation:

dr> complete_detach sb#

10 To verify that the board is no longer configured, type the following command:

dr> drshow sb#

The detached board should not appear in the detailed listing.


11 Exit the dr shell:

dr> exit

12 If the board is not to be immediately replaced, unfreeze any frozen service groups:

# hagrp -unfreeze ser_grp_name

Repeat for each service group.
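The completion message in step 9 can be checked mechanically if the drshow output is captured to a file or a terminal log. The following is a minimal sketch that only parses text of the form shown above ("Percent Complete= 100% (0 KBytes remaining)"). How the output is captured from the dr shell is left open, and the function name drain_percent is illustrative. A POSIX shell and awk are assumed.

```shell
# drain_percent: print the most recent "Percent Complete" value found on stdin.
# Exits non-zero if no progress line is present.
drain_percent() {
    awk '
        /Percent Complete=/ {
            pct = $0
            sub(/.*Percent Complete=[ \t]*/, "", pct)   # drop everything before the number
            sub(/%.*/, "", pct)                         # drop the percent sign and the rest
            last = pct
        }
        END { if (last != "") print last; else exit 1 }
    '
}
```

A value of 100 indicates that the drain has finished and that the complete_detach step can follow.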

Attaching I/O system boards with DMP enabled

You can attach a system I/O board using the following procedure:

To attach I/O system boards with DMP enabled

1 Freeze the VCS service groups running on the domain where you intend to attach a system board. Repeat the following command for each service group:

# hagrp -freeze ser_grp_name

2 After physically replacing a previously removed I/O board, make sure it is connected to the shared storage.

3 From the SSP server, enter the dr(1M) shell:

ssp:D1% dr

4 Follow the Oracle procedure to attach the system board, described here briefly:

dr> init_attach sb#

Complete the attach operation:

dr> complete_attach sb#

5 Verify that the dynamic reconfiguration attach operation has succeeded. Type:

dr> drshow sb#

The new system board should show in the list of configured boards.

6 Exit the dr shell.

dr> exit

7 Log in as root to the domain where you are adding the system board. Enable the controller by entering:

# vxdmpadm enable ctlr=ctlr#


8 When you have successfully attached and enabled the system I/O board, unfreeze any frozen service groups:

# hagrp -unfreeze ser_grp_name

Repeat for each service group.

9 Verify that VCS is still up and running.
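Because the freeze and unfreeze commands must be repeated for each service group, a small loop avoids missing one. The following is a minimal sketch that uses only the hagrp -freeze and hagrp -unfreeze invocations shown in this procedure. The function name for_each_group is illustrative, the group names are supplied by the administrator, and a POSIX shell is assumed.

```shell
# for_each_group: run "hagrp <action> <group>" for every named group.
# $1 is -freeze or -unfreeze; the remaining arguments are group names.
for_each_group() {
    action=$1
    shift
    case "$action" in
        -freeze|-unfreeze) ;;
        *) echo "usage: for_each_group -freeze|-unfreeze group..." >&2
           return 2 ;;
    esac
    for g in "$@"; do
        # Stop at the first failure so the problem is visible immediately.
        hagrp "$action" "$g" || { echo "hagrp $action $g failed" >&2; return 1; }
    done
}
```

For example, `for_each_group -freeze ServiceGroupA cvm` before the attach operation and `for_each_group -unfreeze ServiceGroupA cvm` afterward. If a freeze fails partway through, the groups frozen earlier remain frozen and need to be unfrozen by hand.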

Detaching CPU/memory boards

Use the following procedure if no I/O devices on the system board are used.

Make sure the kernel_cage_enable variable is set.

See “Preparing environment for dynamic reconfiguration” on page 46.

To detach CPU/memory boards

1 Freeze the VCS service groups running on the domain where you intend to detach a CPU/Memory board. Freezing the service groups prevents them from being taken offline or failed over. Repeat the following command for each service group:

# hagrp -freeze ser_grp_name

2 Connect to the SSP server and log in to the domain whose system board requires Dynamic Reconfiguration.

ssp:D1% echo $SUNW_HOSTNAME

3 Enter the dr(1M) shell:

ssp:D1% dr

4 In the window where you are running dynamic reconfiguration, start detaching the CPU/memory board by entering:

dr> drain sb#

5 Monitor the progress of the drain operation by entering:

dr> drshow sb# drain

6 When you see the message

Percent Complete= 100% (0 KBytes remaining)

complete the detach operation:

dr> complete_detach sb#


7 To verify that the board is no longer configured, type the following command:

dr> drshow sb#

The detached board should not appear in the detailed listing.

8 Exit the dr shell:

dr> exit

9 If the board is not to be immediately replaced, unfreeze any frozen service groups:

# hagrp -unfreeze ser_grp_name

10 Repeat for each service group.

Attaching CPU/Memory boards

Use the following procedure if none of the I/O devices on the system board are used.

To attach a CPU/Memory board

1 Freeze the VCS service groups running on the domain where you intend to attach a system board. Repeat the following command for each service group:

# hagrp -freeze ser_grp_name

2 Physically replace the CPU/Memory board.

3 From the SSP server, enter the dr(1M) shell:

ssp:D1% dr

4 Follow the Oracle procedure to attach the system board, described here briefly:

dr> init_attach sb#

Complete the attach operation:

dr> complete_attach sb#

5 Verify that the dynamic reconfiguration attach operation has succeeded. Type:

dr> drshow sb#

The new system board should show in the list of configured boards.

6 Exit the dr shell.

dr> exit


7 When you have successfully attached the CPU/Memory board, unfreeze any frozen service groups:

# hagrp -unfreeze ser_grp_name

Repeat for each service group.

8 Verify that VCS is still up and running.

Using VM without DMP enabled

If you have the Volume Manager DMP feature disabled for some or all of the disks in the shared storage, and you must perform dynamic reconfiguration operations within the cluster, we recommend using the VCS DiskReservation agent to guard against data corruption. In the event of a “split-brain” condition, that is, when two processors in a cluster can simultaneously write to the shared storage, the DiskReservation agent ensures that only one processor has access to the storage at one time. See the VCS Bundled Agents Reference Guide for information on configuring the DiskReservation agent.

Replacing an online Host Bus Adapter (HBA) on an M5000 server

This section contains the procedure to replace an online Host Bus Adapter (HBA) when DMP is managing multi-pathing in a Cluster File System (CFS) cluster. The HBA World Wide Port Name (WWPN) changes when the HBA is replaced.

Following are the prerequisites to replace an online Host Bus Adapter (HBA):

■ A single node or two or more node CFS or RAC cluster.

■ I/O running on CFS file system.

■ An M5000 server with at least two HBAs in separate PCIe slots and the recommended Solaris patch level for HBA replacement.

Following is the procedure to hot-swap an online Host Bus Adapter on an M5000 server:


To replace an online Host Bus Adapter (HBA) on an M5000 server

1 Identify the HBAs on the M5000 server using the following command:

For an Emulex HBA:

# /usr/platform/sun4u/sbin/prtdiag -v | grep emlx

For a QLogic HBA:

# /usr/platform/sun4u/sbin/prtdiag -v | grep qlc

00 PCIe 0 2, fc20, 10df 119, 0, 0 okay 4,

4 SUNW,emlxs-pci10df,fc20 LPe 11002-S

/pci@0,600000/pci@0/pci@9/SUNW,emlxs@0

00 PCIe 0 2, fc20, 10df 119, 0, 1 okay 4,

4 SUNW,emlxs-pci10df,fc20 LPe 11002-S

/pci@0,600000/pci@0/pci@9/SUNW,emlxs@0,1

00 PCIe 3 2, fc20, 10df 2, 0, 0 okay 4,

4 SUNW,emlxs-pci10df,fc20 LPe 11002-S

/pci@3,700000/SUNW,emlxs@0

00 PCIe 3 2, fc20, 10df 2, 0, 1 okay 4,

4 SUNW,emlxs-pci10df,fc20 LPe 11002-S

/pci@3,700000/SUNW,emlxs@0,1


2 Identify the HBA that you want to replace, and its WWPN(s), using the cfgadm command.

To identify the HBA:

# cfgadm -al | grep -i fibre

iou#0-pci#1 fibre/hp connected configured ok

iou#0-pci#4 fibre/hp connected configured ok

To list all HBA ports:

# luxadm -e port

/devices/pci@0,600000/pci@0/pci@9/SUNW,emlxs@0/fp@0,0:devctl

NOT CONNECTED

/devices/pci@0,600000/pci@0/pci@9/SUNW,emlxs@0,1/fp@0,0:devctl

CONNECTED

/devices/pci@3,700000/SUNW,emlxs@0/fp@0,0:devctl

NOT CONNECTED

/devices/pci@3,700000/SUNW,emlxs@0,1/fp@0,0:devctl

CONNECTED

Select the HBA port and dump its port map to get the WWPN:

# luxadm -e dump_map /devices/pci@0,600000/pci@0/pci@9/SUNW,emlxs@0,1/fp@0,0:devctl

0 304700 0 203600a0b847900c 200600a0b847900c 0x0 (Disk device)

1 30a800 0 20220002ac00065f 2ff70002ac00065f 0x0 (Disk device)

2 30a900 0 21220002ac00065f 2ff70002ac00065f 0x0 (Disk device)

3 560500 0 10000000c97c3c2f 20000000c97c3c2f 0x1f (Unknown Type)

4 560700 0 10000000c97c9557 20000000c97c9557 0x1f (Unknown Type)

5 560b00 0 10000000c97c34b5 20000000c97c34b5 0x1f (Unknown Type)

6 560900 0 10000000c973149f 20000000c973149f 0x1f (Unknown Type,Host Bus Adapter)

Alternatively, you can run the Solaris fcinfo hba-port command to get the WWPN(s) for the HBA ports.
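
The lookups in this step can also be scripted. The following is a minimal sketch rather than part of the documented procedure. It assumes that luxadm -e port prints each device path and its connection state on a single line and that luxadm -e dump_map prints one entry per line. The function names connected_ports and hba_wwpn are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not part of luxadm); both read command output
# on standard input.

# Print the device path of every HBA port reported as CONNECTED.
# Assumes "<path> CONNECTED" or "<path> NOT CONNECTED" on one line.
connected_ports() {
    awk 'NF >= 2 && $NF == "CONNECTED" && $(NF-1) != "NOT" { print $1 }'
}

# Print the port WWN (fourth column) of the dump_map entry that is
# flagged as the Host Bus Adapter itself.
hba_wwpn() {
    awk 'NF >= 6 && /Host Bus Adapter/ { print $4 }'
}
```

For example, luxadm -e port | connected_ports lists the connected ports. Piping the dump_map output for one of those ports into hba_wwpn prints the port WWN of the HBA entry, which is 10000000c973149f in the sample above.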

3 Ensure you have a compatible spare HBA for hot-swap.

4 Stop the I/O operations on the HBA port(s) and disable the DMP subpath(s) for the HBA that you want to replace.

# vxdmpadm disable ctlr=<ctlr#>

5 Dynamically unconfigure the HBA in the PCIe slot using the cfgadm command.

# cfgadm -c unconfigure iou#0-pci#1

Check the console messages to determine whether the cfgadm command succeeded.

If the cfgadm command is unsuccessful, troubleshoot the failure using the server hardware documentation. Check the Solaris 10 patch level recommended for dynamic reconfiguration operations and contact Oracle support for further assistance.

console messages

Oct 24 16:21:44 m5000sb0 pcihp: NOTICE: pcihp (pxb_plx2): card is removed from the slot iou#0-pci#1
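
The console check can be scripted against the system log. The following is a minimal sketch under two assumptions: each pcihp notice is logged on a single line in the format shown above, and the log text is supplied on standard input. The function name last_slot_event is hypothetical. On Solaris 10, /usr/bin/awk may not accept the -v option; nawk should accept the same program.

```shell
#!/bin/sh
# Hypothetical helper: report the most recent pcihp hot-plug event
# ("removed" or "inserted") logged for one slot, or "none" if the log
# holds no such event. Reads log text on standard input.
#   $1  slot name as it appears in the log, for example iou#0-pci#1
last_slot_event() {
    awk -v slot="$1" '
        /pcihp/ {
            # Match the slot as a whole field so that iou#0-pci#1
            # does not also match iou#0-pci#10.
            hit = 0
            for (i = 1; i <= NF; i++) if ($i == slot) hit = 1
            if (hit && /card is removed/)  event = "removed"
            if (hit && /card is inserted/) event = "inserted"
        }
        END { if (event == "") event = "none"; print event }
    '
}
```

For example, last_slot_event 'iou#0-pci#1' < /var/adm/messages should print removed after a successful unconfigure operation. The same helper covers the insertion notice that is logged when the card is configured again.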

6 Verify that the HBA that you unconfigured in step 5 no longer appears in the configuration using the following command:

# cfgadm -al | grep -i fibre

iou#0-pci#4 fibre/hp connected configured ok

7 Mark the fiber cable(s).

8 Remove the fiber cable(s) and the HBA that you must replace.

Note: You can refer to the HBA replacement procedures in SPARC Enterprise M4000/M5000/M8000/M9000 Servers Dynamic Reconfiguration (DR) User's Guide for more information.

9 Replace it with a new compatible HBA of similar type in the same slot.

The reinserted card shows up as follows:

console messages

iou#0-pci#1 unknown disconnected unconfigured unknown

10 Run the following command to bring the replaced HBA back into the configuration.

# cfgadm -c configure iou#0-pci#1

console messages

Oct 24 16:21:57 m5000sb0 pcihp: NOTICE: pcihp (pxb_plx2): card is inserted in the slot iou#0-pci#1 (pci dev 0)

11 Verify that the reinserted HBA is in the configuration using the cfgadm command:

# cfgadm -al | grep -i fibre

iou#0-pci#1 fibre/hp connected configured ok <====

iou#0-pci#4 fibre/hp connected configured ok
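
The verification in steps 6 and 11 can be reduced to one lookup. The following is a minimal sketch that assumes the five-column cfgadm -al layout shown above (attachment point, type, receptacle, occupant, condition). The function name slot_status is hypothetical. On Solaris 10, nawk may be needed in place of awk for the -v option.

```shell
#!/bin/sh
# Hypothetical helper: print "receptacle occupant condition" for one
# attachment point found in cfgadm -al output on standard input, or
# "absent" when the attachment point is not listed.
#   $1  attachment point, for example iou#0-pci#1
slot_status() {
    awk -v slot="$1" '
        $1 == slot { print $3, $4, $5; found = 1 }
        END { if (!found) print "absent" }
    '
}
```

For example, cfgadm -al | grep -i fibre | slot_status 'iou#0-pci#1' is expected to print absent after the unconfigure operation in step 5 and connected configured ok after the configure operation in step 10.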

12 Modify fabric zoning to include the replaced HBA WWPN(s).

13 Enable LUN security on storage for the new WWPN(s).

14 Perform an operating system device scan to re-discover the LUNs using the cfgadm command:

# cfgadm -c configure c3

15 Clean up the device tree for old LUNs.

# devfsadm -Cv

Note: Sometimes HBA replacement may create new devices. Perform cleanup operations for the LUN only when new devices are created.

16 If VxVM / Dynamic Multi-Pathing (DMP) does not show a ghost path for the removed HBA path, enable the path using the vxdmpadm command. This performs the device scan for the subpath(s) of that HBA.

# vxdmpadm enable ctlr=<ctlr#>

17 Verify that I/O operations are scheduled on that path.

If I/O operations are running correctly on all paths, then the dynamic HBAreplacement operation is complete.
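
Taken together, the command-line steps of the procedure can be summarized as a dry-run checklist. The sketch below only prints the commands in order and executes nothing. The function name hba_swap_plan and its three arguments are hypothetical, and the sketch assumes that vxdmpadm takes the controller as ctlr=<name>. Manual steps appear as comment lines.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print, in order, the commands used in
# the HBA replacement procedure. Nothing is executed.
#   $1  PCIe slot attachment point, for example iou#0-pci#1
#   $2  DMP controller name of the HBA (assumed example: c3)
#   $3  controller attachment point to rescan for LUNs, for example c3
hba_swap_plan() {
    slot=$1 ctlr=$2 fabric=$3
    cat <<EOF
vxdmpadm disable ctlr=$ctlr
cfgadm -c unconfigure $slot
cfgadm -al | grep -i fibre
# manual: mark and remove the fiber cable(s), replace the HBA in the same slot
cfgadm -c configure $slot
cfgadm -al | grep -i fibre
# manual: update fabric zoning and LUN security for the new WWPN(s)
cfgadm -c configure $fabric
devfsadm -Cv
vxdmpadm enable ctlr=$ctlr
EOF
}
```

For example, hba_swap_plan 'iou#0-pci#1' c3 c3 prints the sequence for the slot used in this procedure. The devfsadm cleanup and the final vxdmpadm enable command are conditional, as described in steps 15 and 16.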
